I am working on a Spring Java project and integrating Apache Spark and Cassandra using the DataStax connector.

I have autowired sparkSession, and the lines of code below seem to work.

// Read options for the Cassandra data source: keyspace and table name
Map<String, String> configMap = new HashMap<>();
configMap.put("keyspace", "key1");
configMap.put("table", tableName.toLowerCase());

// Load the table as a Dataset<Row> and print it
Dataset<Row> ds = sparkSession.read()
        .format("org.apache.spark.sql.cassandra")
        .options(configMap)
        .load();
ds.show();

But this always gives me only 20 records. I want to select all the records of the table. Can someone tell me how to do this?

Thanks in advance.

1 Answer

show() outputs 20 records by default, although you can pass an argument to specify how many rows you need. But show() is usually used only to briefly examine the data, especially when working interactively.
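As a rough sketch of passing that argument: Dataset has a show(int numRows, boolean truncate) overload, so you can ask for more rows and turn off cell truncation. The class and method names below (ShowRows, showRows, showAll) are made up for illustration. Note that showAll calls count() first, which scans the whole table, so it is only reasonable for small tables.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class ShowRows {
    // Prints up to maxRows rows without truncating long cell values.
    static void showRows(Dataset<Row> ds, int maxRows) {
        ds.show(maxRows, false);
    }

    // Prints every row and returns the row count.
    // count() runs a full scan, so avoid this on large tables.
    static long showAll(Dataset<Row> ds) {
        long total = ds.count();
        ds.show((int) Math.min(total, Integer.MAX_VALUE), false);
        return total;
    }
}
```

With the ds from the question, ShowRows.showRows(ds, 100) would print the first 100 rows instead of 20.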

In your case, everything really depends on what you want to do with the data. You have already loaded the data successfully using the load function; after that you can use the normal Spark functions: select, filter, groupBy, etc.
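For example, here is a minimal sketch of working with the loaded Dataset instead of printing it. The column names "category" and "amount" are assumptions for illustration, since the question does not show the table schema, and the class and method names are hypothetical as well.

```java
import static org.apache.spark.sql.functions.col;

import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class DatasetOps {
    // Counts rows with a positive "amount" per "category".
    // Both column names are hypothetical; use the ones from your table.
    static Dataset<Row> countPerCategory(Dataset<Row> ds) {
        return ds.select(col("category"), col("amount"))
                 .filter(col("amount").gt(0))
                 .groupBy(col("category"))
                 .count();
    }

    // Brings all rows to the driver as a Java list.
    // Only safe when the result fits into driver memory.
    static List<Row> toLocalList(Dataset<Row> ds) {
        return ds.collectAsList();
    }
}
```

If you really need every record on the driver side, collectAsList() returns them all, but for a large table it is usually better to keep the processing in Spark and collect only the aggregated result.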

P.S. You can find here more examples of using the Spark Cassandra Connector (SCC) from Java, although it's more cumbersome than using Scala. I also recommend making sure that you're using SCC 2.5.0 or higher, because of the many new features there.