How to Convert .csv file to RDD<Vector>?

Question

I have a CSV file with 9000+ records in the following format:

 id,Category1,Category2

How do I convert this CSV file to an RDD<Vector> so that I can find similar columns using columnSimilarities in Apache Spark with Java?

Phash Phash · Accepted Answer · 2019-08-09T14:13:44

As far as I have read, a Spark Vector is backed by a double[] of values, so the ID and category fields have to be parsed into doubles before you fill the Vector. Start by reading the file:

List<String> lines = Files.readAllLines(Paths.get("myfile.csv"), Charset.defaultCharset());

Then you can iterate over the lines, parse the values in each line, create a Vector from them, and add each Vector to the RDD.
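The parsing step could look roughly like the sketch below. It uses only the Java standard library, and it assumes that the first line is a header and that every field (including Category1 and Category2) is numeric; the class name CsvRows and its parse method are hypothetical helpers, not part of Spark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spark): turns CSV lines into numeric rows.
final class CsvRows {

    // Assumes the first line is a header and every remaining field is numeric.
    static List<double[]> parse(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {   // start at 1 to skip the header
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;                          // ignore blank lines
            }
            String[] fields = line.split(",");
            double[] values = new double[fields.length];
            for (int j = 0; j < fields.length; j++) {
                // Throws NumberFormatException if a field is not numeric.
                values[j] = Double.parseDouble(fields[j].trim());
            }
            rows.add(values);
        }
        return rows;
    }
}
```

With Spark on the classpath, each double[] can then be wrapped with Vectors.dense(row) from org.apache.spark.mllib.linalg, the resulting list passed to JavaSparkContext.parallelize(...), and .rdd() called on the JavaRDD<Vector> to get the RDD<Vector> that the RowMatrix constructor takes; columnSimilarities() is a method on RowMatrix. If the id column should not count as a feature, drop it from each row before building the Vector. Non-numeric categories would need to be encoded as numbers first.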