14
votes

How can I execute lengthy, multiline Hive Queries in Spark SQL? Like query below:

val sqlContext = new HiveContext (sc)
val result = sqlContext.sql ("
 select ...
 from ...
");
5
please improve your post, nobody wants to see screenshots of codemtoto

5 Answers

19
votes

Use """ instead, so for example

val results = sqlContext.sql ("""
     select ....
     from ....
""");

or, if you want to format code, use:

val results = sqlContext.sql ("""
     |select ....
     |from ....
""".stripMargin);
4
votes

You can use triple-quotes at the start/end of the SQL code or a backslash at the end of each line.

val results = sqlContext.sql ("""
    create table enta.scd_fullfilled_entitlement as
    select *
    from my_table
    """);

results = sqlContext.sql (" \
    create table enta.scd_fullfilled_entitlement as \
    select * \
    from my_table \
    ")
0
votes

It is worth noting that the length is not the issue, just the writing. For this you can use """ as Gaweda suggested or simply use a string variable, e.g. by building it with string builder. For example:

val selectElements = Seq("a","b","c")
val builder = StringBuilder.newBuilder
builder.append("select ")
builder.append(selectElements.mkString(","))
builder.append(" where d<10")
val results = sqlContext.sql(builder.toString())
0
votes

In addition to the above ways, you can use the below-mentioned way as well:

val results = sqlContext.sql("select .... " +
" from .... " +
" where .... " +
" group by ....
");
0
votes
val query = """(SELECT
      a.AcctBranchName,
    c.CustomerNum,
    c.SourceCustomerId,
    a.SourceAccountId,
    a.AccountNum,
    c.FullName,
    c.LastName,
    c.BirthDate,
    a.Balance,
    case when [RollOverStatus] = 'Y' then 'Yes' Else 'No' end as RollOverStatus
    FROM
    v_Account AS a left join v_Customer AS c
      ON c.CustomerID = a.CustomerID AND c.Businessdate = a.Businessdate
    WHERE
    a.Category = 'Deposit' AND
    c.Businessdate= '2018-11-28'  AND
    isnull(a.Classification,'N/A') IN ('Contractual Account','Non-Term Deposit','Term Deposit')
    AND IsActive = 'Yes' ) tmp """