0
votes

I am working in snowflake and got stuck very badly when I need unique columns in my final result set .

For e.g I have Table A with below information :

+---------+----------|
| user_id | product  |
+---------+----------|
| 1       | 1        |
| 1       | 2        |
| 2       | 1        |
| 2       | 2        |
| 3       | 2        |
+---------+----------|

TABLE B with below information :

+---------+---------- |
| user_id | Week      |
+---------+-----------|
| 1       | 2020-01-02|
| 1       | 2020-01-02|
| 2       | 2020-01-03|
| 2       | 2020-01-03|
| 3       | 2020-01-04|
+---------+-----------|

when I am writing the query

Select * from a join b on a.user_id =b.userid

I am getting the user_id two times because I am using the "*" but I need User_Id only one time with output as :

+---------+----------|---------- |
| user_id | product  | Week      |
+---------+----------|-----------|
| 1       | 1        | 2020-01-02|
| 1       | 2        | 2020-01-02|
| 2       | 1        | 2020-01-03|
| 2       | 2        | 2020-01-03|
| 3       | 2        | 2020-01-04|
+---------+----------|-----------|

But I am getting this:

+---------+----------|---------- |---------+
| user_id | product  | Week      | user_id |
+---------+----------|-----------|---------+
| 1       | 1        | 2020-01-02| 1       |
| 1       | 2        | 2020-01-02| 1       |
| 2       | 1        | 2020-01-03| 2       |
| 2       | 2        | 2020-01-03| 2       |
| 3       | 2        | 2020-01-04| 3       |
+---------+----------|-----------|---------+

The above is just and example ,in actual scenario i have 355 columns in Table A and 410 columns in Table B.So i can't do

Select a.column_name_1,a.column_name_2 ....a.column_name_355,b.column_name_1,a.column_name_2 ....a.column_name_355 
from table a join table b on a.uder_id = b.userid .

I have tried this but it is not working :

set name = (select listagg(column_name,',') as name from (
select row_number() over(order by column_name) column_id,column_name  from information_schema.columns
where table_schema='DEV'
and table_name ='table_a'
and column_name!='WEEK')
);

select $name from 
"DEV_DB"."DEV"."table_a" a inner join
"DEV_DB"."DEV"."table_b" b on a.WEEK = b.WEEK

Can anyone please help me !!

3
Replace the * with the columns you need.Larnu
@Larnu i can't because there are 355 columns in table a ,which is tedious job to write individual column namesabssab
There no magic "give me the columns I want" syntax, you need to tell the RDBMS what ones those are by writing it out. If it's "tedious" perhaps the real fact is that you don't need all those columns. Why do you have so many in the tables in the first place?Larnu
It is in _raw_area and i need to prepare the staging area by joining two tables. I was trying to do something like this : set name = (select listagg(column_name,',') as name from ( select row_number() over(order by column_name) column_id,column_name from information_schema.columns where table_schema='DEV' and table_name ='table_a' and column_name!='WEEK') ); select $name from "DEV_DB"."DEV"."table_a" a inner join "DEV_DB"."DEV"."table_b" b on a.WEEK = b.WEEKabssab

3 Answers

0
votes

here is what you need ,if I understand correctly

select t.user_id , product , Week  
from table1 
join ( select user_id , max(week) week
       from tableB
       group by user_id
) t
on t.user_id = table1.user_id
0
votes

Using JOIN USING:

JOIN

USING( <column_list> )

A list of columns in common between the two tables being joined; these columns are used as the join columns. The columns must have the same name and meaning in each of the tables being joined.

To use the USING clause properly, the projection list (the list of columns and other expressions after the SELECT keyword) should be “*”. This allows the server to return the key_column exactly once, which is the standard way to use the USING clause.

CREATE TABLE tab1(user_id INT, product_id INT);
CREATE TABLE tab2(user_id INT, week INT);

-- output 4 columns
SELECT * FROM tab1 INNER JOIN tab2 ON tab1.user_id = tab2.user_id;
--USER_ID   PRODUCT_ID  USER_ID WEEK

-- output 3 columns
SELECT * FROM tab1 INNER JOIN tab2 USING(user_id);
-- USER_ID  PRODUCT_ID  WEEK
0
votes

You can use the INFORMATION_SCHEMA views to help build your query instead of writing it all by hand for that many columns.

Example:

SELECT '"' || COLUMN_NAME || '" AS ' || '"TableA_' || COLUMN_NAME || '",'
FROM INFORMATION_SCHEMA."COLUMNS" 
WHERE TABLE_NAME = 'TableA'
ORDER BY ORDINAL_POSITION 

outputs similar to:

"col1" as "TableA_col1",
"col2" as "TableA_col2",
etc..

Then just wrap the column list in a SELECT and FROM TableA, do the same for TableB and add your JOIN condition.