So I have this HIVE Query:
SELECT p.id
FROM posts p JOIN
comments c ON c.postid = p.id JOIN
users u ON u.id = p.owneruserid JOIN
badges b ON b.userid = u.id CROSS JOIN
postlinks l
WHERE l.relatedpostid > p.id
AND u.upvotes IN (SELECT MAX(upvotes) AS uv FROM users uu WHERE uu.creationdate > p.creationdate);
and it gives me this Error:
Error while compiling statement: FAILED: SemanticException [Error 10249]: line 21:58 Unsupported SubQuery Expression 'creationdate': SubQuery expression refers to both Parent and SubQuery expressions and is not a valid join condition.
But I dont see how my SubQuery expression refers to both parent and SubQuery?!?
I specifically use user uu
to distinguish from user u
from the parent query, and p.creationdate has to be related to the parent query in my case.
As stated in this post already HIVE comes with some restrictions when it comes to Subqueries:
- These subqueries are only supported on the right-hand side of an expression.
- IN/NOT IN subqueries may only select a single column.
- EXISTS/NOT EXISTS must have one or more correlated predicates.
- References to the parent query are only supported in the WHERE clause of the subquery.
But I dont see how I am breaking any of these restrictions.
What am I doing wrong?
Any help is appreciated.
EDIT:
The data used, is an excerpt of stackexchange data. So post referrs to a stackexchange post etc..
Tables:
badges(id,class,date,name,tagbased,userid)
comments(id,creationdate,postid,score,text,userdisplayname,userid)
postlinks(id,creationdate,linktypeid,postid,relatedpostid)
posts(id,acceptedanswerid,answercount,body,closeddate,commentcount,communityowneddate,creationdate,favoritecount,lastactivitydate,lasteditdate,lasteditordisplayname,lasteditoruserid,ownerdisplayname,owneruserid,parentid,posttypeid,score,tags,title,viewcount)
tags(id,count,excerptpostid,tagname,wikipostid)
users(id,aboutme,accountid,creationdate,displayname,downvotes,lastaccessdate,location,profileimageurl,reputation,upvotes,views,websiteurl)
votes(id,bountyamount,creationdate,postid,userid,votetypeid)
I am mainly interested in how I could write
SELECT p.id FROM posts p JOIN users u ON u.id = p.owneruserid
WHERE u.upvotes IN
(SELECT MAX(upvotes) FROM users uu WHERE uu.creationdate > p.creationdate)
so that it does not give me this error.