2
votes

What is wrong with my SQL code?

I tried to eliminate duplicate rows following this answer

But I keep getting the following error:

near "(": syntax error: SELECT rn = ROW_NUMBER() OVER (

Here is the SQL code:

SELECT 
    rn = ROW_NUMBER() OVER (PARTITION BY s.stop_id, s.stop_name 
                            ORDER BY s.stop_id, s.stop_name)
FROM stops s

I read somewhere that it has to do with SQL versions or the usage of sqlite3 ??

Here some additional information to the problem:

I have the beginning table:

table_beginning =
[some_text0 , some_text1 , some_text2 ,  some_text3   ]
[ bla_1     ,   monday   ,    red     , bla_something ]
[ bla_77    ,   tuesday  ,   green    , bla_other     ]
[ bla_99    ,   monday   ,    blue    , bla_anything  ]
[ bla_00    ,  wednesday ,    red     , bla_maybe     ]
[ bla_55    ,   monday   ,   violet   , bla_more      ]
[ bal_66    ,   monday   ,    red     , bla_kind      ]
[ bal_22    ,   monday   ,    red     , bla_stuff     ]

I would like to end up with a table that has no doublicates concerning col_2 and col_3 (not caring whatever col_1 and col_4 are !!)

Such as:

table_final1 =
[some_text0 , some_text1 , some_text2 ,  some_text3   ]
[ bla_1     ,   monday   ,    red     , bla_something ]
[ bla_77    ,   tuesday  ,   green    , bla_other     ]
[ bla_99    ,   monday   ,    blue    , bla_anything  ]
[ bla_00    ,  wednesday ,    red     , bla_maybe     ]
[ bla_55    ,   monday   ,   violet   , bla_more      ]

(--> which ones of the rows that are kicked out does not matter. The accepted final table could therefore also look like:

table_final2 =
[some_text0 , some_text1 , some_text2 ,  some_text3   ]
[ bla_77    ,   tuesday  ,   green    , bla_other     ]
[ bla_99    ,   monday   ,    blue    , bla_anything  ]
[ bla_00    ,  wednesday ,    red     , bla_maybe     ]
[ bla_55    ,   monday   ,   violet   , bla_more      ]
[ bal_66    ,   monday   ,    red     , bla_kind      ]

or

table_final3 =
[some_text0 , some_text1 , some_text2 ,  some_text3   ]
[ bla_77    ,   tuesday  ,   green    , bla_other     ]
[ bla_99    ,   monday   ,    blue    , bla_anything  ]
[ bla_00    ,  wednesday ,    red     , bla_maybe     ]
[ bla_55    ,   monday   ,   violet   , bla_more      ]
[ bal_22    ,   monday   ,    red     , bla_stuff     ]

All that matters is that col_2 and col_3 have never the same two entries !

As you can see some_text1 = monday AND some_text2 = red exists now only once !! (eliminating doublicates from the point of view only looking at col_2 and col_3)

As of content in col_1 and col_4 - I don't care at all what is in there ! I'm only concerned about what is inside col_2 and col_3 (not having any doublicates there !)

One solution:

I figured out a way (but maybe there is a more elegant one??)

CREATE TABLE table_intermediate AS
  SELECT DISTINCT col_2, col_3
FROM table_beginning;

--> This creates an intermediate table --> with the DISTINCT keyword this does eliminate doublicates --> (disadvantage, I loose col_1 and col_4 information completely)

Maybe there is a solution where I can keep information on col_1 and col_4 ??? (again, which of the col_1 or col_4 I do not care !)

1
SQLite does not support window functions. See this related post: stackoverflow.com/q/49069627/460557Jorge Campos
just use distinct: select distinct yourcolumns from.... know that distinct is for all columns...Jorge Campos
Add some sample data and from it the desired result in your question, right now it is a clear example of an XY problem you said what you want instead of explain what you need.Jorge Campos
@Jorge Campos: (sorry for the delay in response - timeshift ;)) - I added more information. I also think that your suggestion to use DISTINCT brought me forward - remains the question if there is a solution where I don't "loose" col_1 and col_4 information with the DISTINCT keyword ?iKK
Ah - maybe Gordon Linoff's solution is it ?? (I just saw it coming in...)iKK

1 Answers

2
votes

In SQLite, you can typically use rowid:

select s.*
from stops s
where s.rowid = (select min(s2.rowid)
                 from stops s2
                 where s2.stop_id = s.stop_id and s2.stop_name = s.stop_name
                );

I'm not sure if this is what you really need. But this seems to be what you want to do with row_number(). If this doesn't hit the spot, then ask another question with sample data and desired results.