
I'm trying to pass an argument to spark-shell. For example, I want today's date as a variable inside the Scala code.

    val conf = new SparkConf().setAppName("test").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val df = sqlContext.read.format("csv").load("./"+date+".csv")

My test.scala looks like the above, and I need to get the variable 'date' from the terminal. The solution that I've found is

    $spark-shell -i <(echo val date = 2019-11-30 ; cat test3.scala)

However, this doesn't work: spark-shell starts, but nothing gets executed after it starts running. I'm new to Scala and have only used Python before. In Python, this can be done with the argparse library, and I'd like the Scala code to accept arguments the way argparse does.

Thanks in advance. Also, I don't want to use sbt; I just want to use spark-shell.
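As an aside, part of the problem with the echoed line is how Scala parses it: without quotes, 2019-11-30 is an arithmetic expression, not a date. A minimal illustration:

    // Without quotes, Scala tokenizes this as Int subtraction
    val notADate = 2019 - 11 - 30   // Int = 1978
    // A quoted literal keeps it as a date string
    val date = "2019-11-30"         // String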


1 Answer


There are two ways that you could do this.

  1. Compute the current date directly in Scala, using the java.time API (a little bit of Java inside the Scala code).

    import java.time.LocalDateTime
    import java.time.format.DateTimeFormatter
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").format(LocalDateTime.now)
    
  2. Pass the argument date in your spark-submit job.

    def main(args: Array[String]): Unit = {
      val thisArg = args(0)
      print(thisArg)
    }
    

    Then you can call your job as spark-submit .... valueArg, and the value will be available inside main as args(0).
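Tying option 1 back to the question's code: the pattern above produces a full timestamp, while the file name only needs the date part. A sketch using a date-only pattern (reusing the sqlContext from the question's setup):

    import java.time.LocalDate
    import java.time.format.DateTimeFormatter

    // Format today's date to match the file-name pattern in the question
    val date = LocalDate.now.format(DateTimeFormatter.ofPattern("yyyy-MM-dd"))
    val df = sqlContext.read.format("csv").load("./" + date + ".csv")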
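For option 2, a fuller sketch of what the submitted job could look like (the object name DailyJob and the jar name below are hypothetical; SparkSession is used in place of the question's SQLContext):

    import org.apache.spark.sql.SparkSession

    object DailyJob {
      def main(args: Array[String]): Unit = {
        val date = args(0)  // e.g. "2019-11-30", passed on the command line
        val spark = SparkSession.builder.appName("test").getOrCreate()
        val df = spark.read.format("csv").load("./" + date + ".csv")
        df.show()
      }
    }

Invoked as, for example: spark-submit --class DailyJob daily-job.jar 2019-11-30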