
I have 1 parquet data file with schema;

  • id integer
  • model binary

This file was created with PySpark and contains a model identifier along with the model serialized to bytes using Python's pickle library.
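For context, the bytes stored in the binary column are produced along these lines (a minimal sketch using a plain dict as a stand-in for a real model object; the variable names are assumptions, not the asker's code):

```python
import pickle

# Stand-in for a trained model; any picklable Python object works the same way.
model = {"weights": [0.1, 0.2, 0.3], "bias": 0.5}

# Serialize the model to bytes -- this is what lands in the Parquet binary column.
model_bytes = pickle.dumps(model)

# A row as PySpark would write it: (id, model) with the model as raw bytes.
row = (1, model_bytes)

# Reading it back: deserialize the bytes into the original object.
restored = pickle.loads(row[1])
print(restored == model)  # True
```

Hive only needs to pass those bytes through unchanged; deserialization happens back in Python.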

Is it possible to create a Hive external table over this Parquet file and read it back with a SELECT? Assume the Hive external table has exactly the same schema:

CREATE EXTERNAL TABLE default.t_model
(
  id INT
  , model BINARY
)
STORED AS PARQUET
LOCATION 'hdfs_path';

I've done each step above but always get an empty result set. Should I use a Hive UDF to load the binary column? Or should I try another data type for the Parquet binary column, such as an array type?

I'd appreciate any answers, thanks.


1 Answer


It turns out I shouldn't use a partitioned table without running the MSCK REPAIR TABLE command first. With the Hive BINARY data type everything works fine.
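For a partitioned layout, the sequence would look roughly like this (a sketch only; the `dt` partition column is hypothetical and not part of the original schema):

```sql
-- Partitioned variant of the table; the `dt` partition column is an assumption.
CREATE EXTERNAL TABLE default.t_model
(
  id INT
  , model BINARY
)
PARTITIONED BY (dt STRING)
STORED AS PARQUET
LOCATION 'hdfs_path';

-- Register partition directories that already exist on HDFS,
-- otherwise SELECT returns an empty result set.
MSCK REPAIR TABLE default.t_model;

SELECT id, model FROM default.t_model LIMIT 10;
```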