2
votes

I have to insert records into Redshift table on a periodic basis. So I chose the strategy where I will copy s3 data into stage table and then append the stage table data into actual table using alter append command. In my case.

Stage table - URL_DATA_TEMP

Actual Table - URL_METADATA

now both tables I created using the same command and only changing the table name. ex:

CREATE TABLE _360_model.URL_METADATA
(
URL VARCHAR(8000),
URL_MD5 VARCHAR(300),
INDEX VARCHAR(200),
ASSET_TYPE VARCHAR(200)
);

CREATE TABLE _360_model.URL_DATA_TEMP
(
URL VARCHAR(8000),
URL_MD5 VARCHAR(300),
INDEX VARCHAR(200),
ASSET_TYPE VARCHAR(200)
);

Still when I try to use following append command, it complains.

alter table _360_model.URL_METADATA append from _360_model.URL_DATA_TEMP;

error:  Columns don't match.
code:      8001
context:   Column "asset_type" has different attributes in the source table 
and the target table. Columns with the same name must have the same 
attributes in both tables.

I am not able to understand when I used almost same command to create table, how can column structure can be different.

2
is column name INDEX spoiling ? INDEX is reserved word, but might be taken with quotes or so? - Raghavendra Kedlaya
Have you checked that the column coding, distribution keys etc. are identical? Try running select trim(tablename) as table, "column", trim(type) as type, encoding, distkey, sortkey, "notnull" from pg_table_def where tablename in ('URL_METADATA', 'URL_DATA_TEMP'); to compare them. - Nathan Griffiths
Your queries work fine for me, you should check that you are creating the tables in the same way as you've mentioned here. - Yankee

2 Answers

3
votes

This may be from column compression differences. On the copy command, are you using the option "COMPUPDATE ON" as shown below? If so, it is likely your staging table's column compression options are different than your target table.

copy <table name> from 's3://<data location>'
CREDENTIALS <creds>
region <region>
GZIP
CSV
IGNOREHEADER 1
TIMEFORMAT 'auto' manifest
COMPUPDATE ON;

I ran into a similar issue and did the following.

1) Load data with a copy command and turn COMPUPDATE ON

See: http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-load.html#copy-compupdate

2) Afer the load is complete, use the query below to see the automatic compression applied.

select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = '<table name>';

See: http://docs.aws.amazon.com/redshift/latest/dg/t_Compressing_data_on_disk.html

3) Recreate your target table using the same compression as the staging table.

create table <target table name> (
    <column name> <type> encode <encoding>,
    ...
)

See: http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html

Other table attributes such as sortkey will need to match in both your staging and target tables.

0
votes

I'm not able to recreate the issue with the SQL you've provided (using psql).

Try running your SQL with psql to see if your tool is altering the submitted SQL in some way.