I am learning about Apache Spark and HDFS. I understand both of them for the most part, but one thing still confuses me.

My question is: are the DataNodes in an HDFS cluster the same machines as the executor nodes in a Spark cluster? In other words, do the HDFS DataNodes run computations on the data they store, or is the data read from the DataNodes and sent over the network to separate executor nodes where it is processed? I have put a small sketch of what I mean at the end of this question.

Please let me know if you would like me to clarify anything! Any help would be much appreciated!
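For concreteness, here is a minimal sketch of the kind of thing I am asking about. The path hdfs:///user/taylor/example.txt is just a placeholder, and I am assuming that RDD.preferredLocations reflects the HDFS block locations the scheduler considers; please correct me if that assumption is wrong:

```scala
import org.apache.spark.sql.SparkSession

object LocalityCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("LocalityCheck").getOrCreate()
    val sc = spark.sparkContext

    // Placeholder HDFS path; substitute a real file on your cluster.
    val rdd = sc.textFile("hdfs:///user/taylor/example.txt")

    // Each HDFS block becomes (roughly) one partition. preferredLocations
    // reports the hosts holding that block's replicas, which I believe is
    // what the scheduler uses when deciding where to run each task.
    rdd.partitions.foreach { p =>
      println(s"partition ${p.index} prefers: ${rdd.preferredLocations(p).mkString(", ")}")
    }

    spark.stop()
  }
}
```

If locality works the way I think it does, I would expect the printed hosts to match the DataNodes that `hdfs fsck /user/taylor/example.txt -files -blocks -locations` reports for that file, but only when executors happen to be running on those same machines.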
Thank you,
Taylor