0
votes

I'm trying to query a couple of tables with a JOIN using a subquery to narrow down the returned results. When I use a WHERE clause in the subquery which references a value of a column from the outer query I get an error #1054 unknown column in where clause.

I'd like to query the Master table with mode=2 and get relevant records from the Transaction table. If Master.mode=2 and Master.type=1, then get all transactions who's id's match but with type=2 and vice versa. In short, get records from Transaction who's IDs match, but with opposite Type values. In case of more than 1 record, then get the record with the highest 'Amount' from Transaction table.

Master table
+----+-----+-------+-------+
|id  |mode |type   |other  |
+----+-----+-------+-------+
|1   |1    |1      |11111  |
|2   |2    |2      |22222  |
|3   |2    |1      |33333  |
|4   |1    |2      |44444  |
+----+-----+-------+-------+

Transaction table
+----+-------+-------+
|id  |type   |amount |
+----+-------+-------+
|1   |1      |1000   |
|1   |2      |1000   |
|2   |1      |2000   |
|2   |2      |3000   |
|3   |1      |500    |
|3   |1      |5000   |
|3   |1      |3000   |
|3   |2      |4000   |
|3   |2      |2000   |
|4   |1      |1000   |
|4   |2      |2000   |
|1   |1      |3000   |
+----+-------+-------+

Expected Result
+----+-----+-------+-------+-------+
|id  |mode |type   |other  |amount |
+----+-----+-------+-------+-------+
|2   |2    |2      |22222  |2000   |
|3   |2    |1      |33333  |4000   |
+----+-----+-------+-------+-------+

My query
SELECT t1.*, t2.Amount
FROM master AS t1
    INNER JOIN (
            SELECT t3.*
                FROM transactions AS t3
                WHERE t3.id=t1.id AND t3.Type=(CASE WHEN t1.Type=1 THEN 2 ELSE 1 END)
                ORDER BY t3.Amount DESC
                LIMIT 1
        ) AS t2 ON t1.id=t2.id
WHERE t1.Mode=2 AND t2.Type=(CASE WHEN t1.Type=1 THEN 2 ELSE 1 END)

I get an error #1054 - Unknown column 't1.id' in 'where clause'

1
Was that a question, or a status report? MySQL doesn't support correlated subquery as an inline view (called a derived table in the MySQL venacular.) t2 is in the FROM clause. It's a rowsource for the outer query. MySQL materializes the view as a derived table before the outer query runs. Values from the outer query are not available when the view is materialized.spencer7593
@spencer7593 Thanks! Any ideas how I could achieve my intended result? I had prepared my query based on similar answers I had seen on SO but it seems my logic is incorrect. Can you help with the corrected logic?Nitin Mistry
Looks like we could use a correlated subquery in the SELECT list.spencer7593

1 Answers

1
votes

To achieve the expected result. we could start with a query like this:

 SELECT m.id
      , m.mode
      , m.type
      , m.other
   FROM master m
  WHERE m.mode = 2
  ORDER BY ...

And for each row returned by that query, it looks like we want to get a single value. We can use a correlated subquery to get the value, for example:

 SELECT t.amount
   FROM transactions t
  WHERE t.id   = m.id 
    AND t.type = IF(m.type=2,1,2)
  ORDER BY t.amount DESC
  LIMIT 1

The correlated subquery returns a single column, and returns no more than one row, so we can include that in the SELECT list of the original query.

Something like this:

 SELECT m.id
      , m.mode
      , m.type
      , m.other
      , ( SELECT t.amount
            FROM transactions t
           WHERE t.id   = m.id 
             AND t.type = IF(m.type=2,1,2)
           ORDER BY t.amount DESC
           LIMIT 1
        ) AS `Amount`
   FROM master m
  WHERE m.mode = 2 
  ORDER BY ... 

The way this works is that the outer query (query from master) is executed. For each row returned, the expressions in the SELECT list are evaluated. Evaluating expressions like m.id, m.mode, m.type are pretty straightforward, and each returns a scalar value which gets put into the resultset.

Evaluating the correlated subquery works the same way. It's a little more complicated... MySQL executes the subquery, using the values of the current m row, and returns a single value, which gets put into the resultset.

NOTE:

If the correlated subquery doesn't return a row, a NULL gets put into the resultset.

If the specification is to return only rows with a non-NULL value in the amount column, we can add a HAVING clause

 WHERE m.mode = 2 
HAVING `Amount` IS NOT NULL
 ORDER BY ... 

Also note:

Because that correlated subquery is executed for each row returned by the outer query, for a lot of rows from master, thats a lot of subquery executions, which can drag down performance for large sets.

That makes it very important to have a suitable index available for subquery. Ideally, a covering index ...

ON transaction (id, type, amount)