I have some Parquet files that were created by Spark converting AVRO files to Parquet. These files contain columns of different data types such as decimal, int, string, and boolean. But when I read a file using dd.read_parquet with the pyarrow engine, everything except the int columns is loaded as object dtype, which causes issues in arithmetic calculations. I also tried reading the decimal columns as float, but that loses precision. Any idea how to read the values without losing precision?
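
Roughly what I'm doing (a simplified sketch; "deduct.parquet" is a placeholder for my actual file):

import dask.dataframe as dd

# read the Spark-written Parquet file with the pyarrow engine
df = dd.read_parquet("deduct.parquet", engine="pyarrow")

# DEDUCT, PERCENT, MIN_DEDUCT and MAX_DEDUCT come back as object dtype
# (they hold decimal.Decimal values), and that is what breaks my arithmetic
print(df.dtypes)

# casting to float64 works but loses precision for decimal(20,2) / decimal(11,10)
# df["DEDUCT"] = df["DEDUCT"].astype("float64")
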
Schema for the parquet file
<pyarrow._parquet.FileMetaData object at >
created_by: parquet-mr version 1.10.1 (build a89df8f9932b6ef6633d06069e50c9b7970bebd1)
num_columns: 7
num_rows: 1
num_row_groups: 1
format_version: 1.0
serialized_size: 4376
ID: string
CODE: string
CURRENCY: string
DEDUCT: decimal(20, 2)
PERCENT: decimal(11, 10)
MIN_DEDUCT: decimal(20, 2)
MAX_DEDUCT: decimal(20, 2)
metadata
org.apache.spark.sql.parquet.row.metadata:
{"type":"struct","fields":[
  {"name":"ID","type":"string","nullable":true,"metadata":{}},
  {"name":"CODE","type":"string","nullable":true,"metadata":{}},
  {"name":"CURRENCY","type":"string","nullable":true,"metadata":{}},
  {"name":"DEDUCT","type":"decimal(20,2)","nullable":true,"metadata":{}},
  {"name":"PERCENT","type":"decimal(11,10)","nullable":true,"metadata":{}},
  {"name":"MIN_DEDUCT","type":"decimal(20,2)","nullable":true,"metadata":{}},
  {"name":"MAX_DEDUCT","type":"decimal(20,2)","nullable":true,"metadata":{}}
]}