I am having a parquet file with the below structure
column_name_old |
---|
First |
Second |
I am crawling this file to a table in AWS glue, however in the schema table I want the table structure as below without changing anything in parquet files
column_name_new |
---|
First |
Second |
I tried updating table structure using boto3
col_list = js['Table']['StorageDescriptor']['Columns']
for x in col_list:
if isinstance(x, dict):
x.update({'Name': x['Name'].replace('column_name_old', 'column_name_new')})
And it works as I can see the table structure updated in Glue catalog, but when I query the table using the new column name I don't get any data as it seems the mapping between the table structure and partition files is lost.
Is this approach even possible or I must change the parquet files itself? If it's possible what I am doing wrong?