Spark-Csv Write quotemode not working

Question

I am trying to write a DataFrame as a CSV file using Spark-CSV (https://github.com/databricks/spark-csv).

I am using the command below:

res1.write.option("quoteMode", "NONE").format("com.databricks.spark.csv").save("File")

But my CSV file is always written as:

"London"
"Copenhagen"
"Moscow"

instead of

London
Copenhagen
Moscow

Comment (Gaurav Shah): quoteMode was never supported; see issues.apache.org/jira/browse/SPARK-26968

chaotic3quilibrium · Accepted Answer · 2017-03-30T23:05:02

Yes. To turn off the default escaping of the double quote character (") with the backslash character (\), add an .option() call with just the right parameters after .write. The goal of the option() call is to change which character the csv() method treats as the "quote" character. Instead of the default double quote character ("), set it to the Unicode NUL character ("\u0000"), which won't ever occur within a well-formed JSON document.

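val dataFrame =
  spark.sql("SELECT * FROM some_table_with_a_json_column")
// The write returns Unit, so unitEmitCsv carries no useful value
val unitEmitCsv =
  dataFrame
    .write
    .option("header", true)
    .option("delimiter", "\t") // tab-separated output
    .option("quote", "\u0000") // magic is happening here: NUL replaces " as the quote character
    .csv("/FileStore/temp.tsv")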

This was only one of several lessons I learned attempting to work with Apache Spark and emitting .csv files. For more information and context on this, please see the blog post I wrote titled "Example Apache Spark ETL Pipeline Integrating a SaaS".
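
For anyone who wants to try the same option on data like the question's, here is a minimal, self-contained sketch. It assumes Spark 2.x or later with the built-in CSV writer and a local master. The object name QuoteOffSketch, the helper writeWithoutQuoting, and the output path /tmp/cities_no_quotes are hypothetical names chosen for illustration and are not part of any library.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object QuoteOffSketch {

  // Hypothetical helper: writes df as CSV with the quote character set to
  // Unicode NUL, so double quotes are no longer treated as special.
  def writeWithoutQuoting(df: DataFrame, path: String): Unit =
    df.write
      .mode("overwrite")           // replace any previous output at this path
      .option("quote", "\u0000")   // same option as in the answer above
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("quote-off-sketch")
      .master("local[1]")          // assumption: local run for experimentation
      .getOrCreate()
    import spark.implicits._

    // City names from the question plus one JSON-like value with embedded quotes
    val cities = Seq(
      "London",
      "Copenhagen",
      "Moscow",
      """{"city":"London"}"""
    ).toDF("city")

    val outputPath = "/tmp/cities_no_quotes" // hypothetical location
    writeWithoutQuoting(cities, outputPath)

    // Read the part files back as plain text to inspect exactly what was written
    spark.read.textFile(outputPath).collect().foreach(println)

    spark.stop()
  }
}
```

One caveat with this approach: once the quote character is NUL, a value that contains the delimiter may end up wrapped in NUL characters rather than double quotes, so it is best suited to data where the delimiter cannot appear inside a field.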
