3
votes

I'm looking to SELECT rows in Hive that have a special character [a-zA-Z0-9] in a column.

I'm not quite sure how to construct the WHERE clause but based upon other threads I think it should look something like:

SELECT DISTINCT user_name
FROM user_info
WHERE user_name like regexp_extract('%[^a-zA-Z\d\s:]%')

but Hive provided this error:

Error while compiling statement: FAILED: SemanticException [Error 10014]: Line 3:21 Wrong arguments ''%[^a-zA-Z\d\s:]%'': No matching method for class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with (string). Possible choices: FUNC(string, string) FUNC(string, string, int)

How should I set up the regexp_extract clause?

Thanks

1

1 Answers

5
votes
SELECT DISTINCT user_name
FROM user_info
WHERE user_name rlike '[^a-zA-Z\\d\\s:]'

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringOperators

hive> select 'Dudu Markovitz: 123' rlike '[^a-zA-Z\\d\\s:]';
OK
false
hive> select 'Dudu Markovitz: @123' rlike '[^a-zA-Z\\d\\s:]';
OK
true