If I create an IgniteRDD from a cache with 10M entries in my Spark job, will all 10M entries be loaded into my Spark context? My code is below for reference.
SparkConf conf = new SparkConf().setAppName("IgniteSparkIntgr").setMaster("local");
JavaSparkContext context = new JavaSparkContext(conf);
// Connect to Ignite using the Spring XML configuration file.
JavaIgniteContext<Integer, Subscriber> igniteCxt = new JavaIgniteContext<Integer, Subscriber>(context, "example-ignite.xml");
JavaIgniteRDD<Integer, Subscriber> cache = igniteCxt.fromCache("subscriberCache");
// Run a SQL query against the cache; only ids 12 through 15 are requested.
DataFrame query_res = cache.sql("select id, lastName, company from Subscriber where id between ? and ?", 12, 15);
// loadInput is a helper that is not shown here.
DataFrame input = loadInput(context);
DataFrame joined_df = input.join(query_res, input.col("id").equalTo(query_res.col("ID")));
System.out.println(joined_df.count());
In the code above, subscriberCache holds more than 10M entries. At any point in this code, will all 10M Subscriber objects be loaded into the Spark job's JVM, or is only the query result loaded?
FYI: Ignite is running in a separate JVM.