Spark has many constructs:
- application
- driver
- executor
- worker node
- partition
- DataFrame/RDD
- shuffle
- job
- stage
- task
- input file
- output file
- core
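
To make the question concrete, here is how I currently map those terms onto code. This is a minimal sketch with hypothetical paths, column names, and sizing, not a real app:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch (paths, column name, and sizing are hypothetical)
// annotating where each construct shows up in an ordinary batch app.
object ConstructsDemo {
  def main(args: Array[String]): Unit = {
    // One SparkSession = one *application*; the JVM running main() is the *driver*.
    // *Executors* are JVMs launched on *worker nodes*; each is given some *cores*.
    val spark = SparkSession.builder()
      .appName("constructs-demo")
      .config("spark.executor.instances", "4") // hypothetical sizing
      .config("spark.executor.cores", "2")
      .getOrCreate()

    // Reading an *input file* produces a *DataFrame* (an RDD underneath),
    // split into *partitions*.
    val df = spark.read.option("header", "true").csv("hdfs:///data/in.csv")

    // groupBy requires a *shuffle*, which marks a *stage* boundary.
    // (Assumes the hypothetical input has a "key" column.)
    val counts = df.groupBy("key").count()

    // The write is an action: it triggers a *job*, the job is split into
    // *stages*, and each stage runs one *task* per partition. The result
    // lands as *output files* (roughly one per partition).
    counts.write.parquet("hdfs:///data/out")

    spark.stop()
  }
}
```
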
Is there any diagram showing the relationships between them? For example:
Each worker node can have zero to many executors; at least one worker must have at least one executor, but some workers may have none.
partition to task is 1-to-many over a job's lifetime (each stage runs its own task per partition, so a partition's data is processed by one task per stage)
executor to core is 1-to-many (an executor can be given several cores and runs one task per core at a time), etc.
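
And here is a rough sketch of the partition/task arithmetic as I understand it (the partition counts are made up):

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the partition -> task arithmetic, with made-up numbers.
object CardinalityDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("cardinality-demo").getOrCreate()
    val sc = spark.sparkContext

    // A stage runs exactly one task per partition of its RDD...
    val rdd = sc.parallelize(1 to 1000, 12) // 12 partitions
    println(rdd.getNumPartitions) // 12 -> the first stage will have 12 tasks

    // ...and a shuffle starts a new stage whose task count follows the
    // child partitioning, so one partition lineage spawns a task per stage.
    val shuffled = rdd.map(x => (x % 3, x)).reduceByKey(_ + _, 3) // 3 partitions
    println(shuffled.getNumPartitions) // 3 -> 3 tasks in the post-shuffle stage

    shuffled.count() // the action submits one job: two stages, 12 + 3 tasks

    spark.stop()
  }
}
```

With the 4 executors × 2 cores sizing from the first snippet, at most 8 of the 12 first-stage tasks would run concurrently and the rest would queue. Is that the right way to read these cardinalities?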