1
votes

I have a table tablename with columns col1-col10. Not every row has col4 populated, but every row has col1, col2, col3 populated. I want to get all {col1, col2, col3} tuples for when col4 satisfies a condition, then get all rows that match the tuple {col1, col2, col3} from tablename.

I have this query:

select t.*
from mytable t
where exists (
    select 1
    from mytable t1
    where 
        t1.col1 = t.col1 
        and t1.col2 = t.col2 
        and t1.col3 = t.col3 
        and t1.col4 >= 1000
)
LIMIT 1000

The size of the table is very large so I have to add the limit. Due to the limit, for some {col1, col2, col3} not getting all rows in the result dataset. I want to then get all rows that match the tuple {col1, col2, col3} from tablename.

I don't mind have less {col1, col2, col3} tuples in my result, but I want complete information for the ones I do have.

How can I achieve that?

2
Very large? It's all relative, isn't it?Strawberry
In this context, I mean I cannot do without adding a limit of some sort. Moreover, since I only care about getting complete information about some (about 100) unique {col1, col2, col3}user1745995
Sqlite and mysql are two different database products with different implenentations of sql. Remived the fonflicting product tags. Pls add the one back that you actually use!Shadow
Why did you add a LIMIT clause?The Impaler
@TheImpaler just want all rows for about 100 {col1, col2, col3} tuplesuser1745995

2 Answers

1
votes

You don't mention which database but the following query should run faster. You can do:

select t.*
from t
join (
  select distinct col1, col2, col3 
  from t
  where col4 >= 1000
  limit 100
) x on t.col1 = x.col1 and t.col2 = x.col2 and t.col3 = x.col3;

With the following indexes, the query should become faster:

create index ix1 on t (col4, col1, col2, col3);

create index ix2 on t (col1, col2, col3);
0
votes

A much more efficient approach is to use window functions:

select t.*
from (select t.*,
             sum(case when col4 > 1000 then 1 else 0 end) over (partition by col1, col2, col3) as cnt_matches
      from mytable t
     ) t
where cnt_matches > 0;