6
votes

I have uploaded the MySQL csv file / MYSQL zip file of all tables in Amazon S3 bucket. Now I want to link the Amazon Athena with S3 bucket file. But when I write the schema for different tables is showing the same result for the select query of each table. I have search a lot but not able to understand the exact /correct way to do this.

I want to create/update different table schema in Athena with the help of one csv /sql zip file from S3 bucket.

1

1 Answers

5
votes

Amazon Athena will look in a defined directory for data. All data files within that directory will be treated as containing data for a given table.

You use a CREATE TABLE command to both define the schema and direct Athena to the directory, eg:

CREATE EXTERNAL TABLE test1 (
  f1 string,
  s2 string
) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES ("separatorChar" = ",", "escapeChar" = "\\") 
LOCATION 's3://my-bucket/data-directory/'

You will need to run a CREATE EXTERNAL TABLE command for each table, and the data for each table should be in a separate directory. The CSV files can be in ZIP format (which makes it faster and cheaper to query).

As an alternative to writing these table definitions yourself, you can create a crawler in AWS Glue. Point the crawler to the data directory, supply a name and the crawler will examine the data files and create a table definition that matches the files.