0
votes

So I have this HIVE Query:

SELECT p.id 
FROM posts p JOIN
comments c ON c.postid = p.id JOIN
users u ON u.id = p.owneruserid JOIN
badges b ON b.userid = u.id CROSS JOIN
postlinks l 
WHERE l.relatedpostid > p.id
AND u.upvotes IN (SELECT MAX(upvotes) AS uv FROM users uu WHERE uu.creationdate > p.creationdate);

and it gives me this Error:

Error while compiling statement: FAILED: SemanticException [Error 10249]: line 21:58 Unsupported SubQuery Expression 'creationdate': SubQuery expression refers to both Parent and SubQuery expressions and is not a valid join condition.

But I dont see how my SubQuery expression refers to both parent and SubQuery?!?

I specifically use user uu to distinguish from user u from the parent query, and p.creationdate has to be related to the parent query in my case.

As stated in this post already HIVE comes with some restrictions when it comes to Subqueries:

  1. These subqueries are only supported on the right-hand side of an expression.
  2. IN/NOT IN subqueries may only select a single column.
  3. EXISTS/NOT EXISTS must have one or more correlated predicates.
  4. References to the parent query are only supported in the WHERE clause of the subquery.

But I dont see how I am breaking any of these restrictions.

What am I doing wrong?

Any help is appreciated.

EDIT:

The data used, is an excerpt of stackexchange data. So post referrs to a stackexchange post etc..

Tables:

badges(id,class,date,name,tagbased,userid)

comments(id,creationdate,postid,score,text,userdisplayname,userid)

postlinks(id,creationdate,linktypeid,postid,relatedpostid)

posts(id,acceptedanswerid,answercount,body,closeddate,commentcount,communityowneddate,creationdate,favoritecount,lastactivitydate,lasteditdate,lasteditordisplayname,lasteditoruserid,ownerdisplayname,owneruserid,parentid,posttypeid,score,tags,title,viewcount)

tags(id,count,excerptpostid,tagname,wikipostid)

users(id,aboutme,accountid,creationdate,displayname,downvotes,lastaccessdate,location,profileimageurl,reputation,upvotes,views,websiteurl)

votes(id,bountyamount,creationdate,postid,userid,votetypeid)

I am mainly interested in how I could write

SELECT p.id FROM posts p JOIN users u ON u.id = p.owneruserid
WHERE u.upvotes IN 
(SELECT MAX(upvotes) FROM users uu WHERE uu.creationdate > p.creationdate) 

so that it does not give me this error.

1
I dont think thats related to my problemDigitalJedi

1 Answers

0
votes

If you want the list of restrictions, you need to look at the documentation. The relevant restriction is:

Subquery predicates that reference a column in a parent query must use the equals (=) predicate operator.

There is no equality clause in the subquery. As I mentioned in the original question, you should ask a question with sample data, desired results, and an explanation of what the code should be doing.

Both of the troublesome comparisons in your query are highly unusual. Normally, such comparison would have an equality comparison. The query is doing something, and there might be a better way to accomplish what you want. It is just unclear what you want.