0
votes

I'm trying to query over a specific formatted date:

I have this query:

SELECT 
    REGEXP_REPLACE(datewithoutdash,
    '^(\\d{2})(\\d{2})(\\d{2})(.*)$','20\\1-\\2-\\3') 
    datewithdash 
     FROM table1 WHERE datewithdash < "2016-11-10";

Why I can't use where clause over the new variable?

I'm getting this error:

FAILED: SemanticException [Error 10004]: Line 26:14 Invalid table alias or column reference 'datewithdash': (possible column names are: ...)

2

2 Answers

3
votes

Hive doesn't know about the aliased column names in a select clause when its evaluating a where clause in the same query. Unfortunately you either have to nest it, or duplicate the transformation function into the where clause:

SELECT 
    REGEXP_REPLACE(datewithoutdash,
    '^(\\d{2})(\\d{2})(\\d{2})(.*)$','20\\1-\\2-\\3') as datewithdash 
FROM 
     table1 
WHERE 
    REGEXP_REPLACE(datewithoutdash,
    '^(\\d{2})(\\d{2})(\\d{2})(.*)$','20\\1-\\2-\\3')  < "2016-11-10";

OR

select * from (
    SELECT 
        REGEXP_REPLACE(datewithoutdash,
        '^(\\d{2})(\\d{2})(\\d{2})(.*)$','20\\1-\\2-\\3') as datewithdash 
    FROM 
         table1 
    ) a
WHERE 
    datewithdash  < "2016-11-10";

Another note - that function is pretty nasty - you could probably use a build in hive function like:

to_date(unix_timestamp(datewithoutdash,'yyMMdd'))

instead - it might be clearer.

0
votes

Hive can not recognize aliases. You will need to repeat the whole expression again.