0 votes

I'm a beginner in Spark. I have a scenario where multiple sources of data arrive at different points in time for an analysis. Can two Spark jobs use a single HDFS/S3 store at the same time? One job would write the latest data to S3/HDFS, and the other would read it, along with input data from another source, for analysis.

2
Your title says: "Can 2 Spark jobs use a single HDFS/S3 storage simultaneously?" but your description references multiple sources. Is your question about accessing one data source from two jobs, or [something else]? - Matt Andruff

2 Answers

0 votes

Yes, you can write to and read from the same data source. Data only becomes visible to readers once a write is complete (this holds for both HDFS and S3).
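A minimal sketch of that pattern in PySpark, assuming a placeholder bucket my-bucket, Parquet files, and a join key named id (all illustrative, not from the question):

    from pyspark.sql import SparkSession

    # Writer job: lands the latest batch at a shared S3 path.
    spark = SparkSession.builder.appName("writer-job").getOrCreate()

    latest = spark.read.json("s3a://my-bucket/incoming/latest.json")
    # Appending new files means a concurrent reader sees either the old
    # file listing or the new one, never a partially written file.
    latest.write.mode("append").parquet("s3a://my-bucket/shared/latest/")

    # Reader job (run separately): joins the shared data with a second source.
    spark = SparkSession.builder.appName("reader-job").getOrCreate()

    shared = spark.read.parquet("s3a://my-bucket/shared/latest/")
    other = spark.read.parquet("hdfs:///data/other_source/")
    shared.join(other, on="id", how="inner") \
        .write.mode("overwrite").parquet("s3a://my-bucket/output/analysis/")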

0 votes

To use both file systems in one job, include the protocol (URI scheme) in each file path.

e.g. spark.read.load("s3a://bucket/file") and/or df.write.save("hdfs:///tmp/data")
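Put together, a single job can read from one store and write to the other; the bucket and paths below are placeholders:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("mixed-fs").getOrCreate()

    # Read from S3 via the s3a connector, write the result to HDFS.
    # load/save use the default source format (Parquet) when none is given.
    df = spark.read.load("s3a://bucket/file")
    df.write.save("hdfs:///tmp/data")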

Alternatively, you can use S3 directly in place of HDFS by setting fs.defaultFS to your bucket.
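A minimal sketch of that setting, assuming a placeholder bucket my-bucket (credentials and the hadoop-aws dependency are out of scope here):

    from pyspark.sql import SparkSession

    # spark.hadoop.* entries are passed through to the Hadoop configuration,
    # so this sets fs.defaultFS for the job.
    spark = (
        SparkSession.builder
        .appName("s3-default-fs")
        .config("spark.hadoop.fs.defaultFS", "s3a://my-bucket")
        .getOrCreate()
    )

    # With no scheme in the path, it now resolves against the S3 bucket,
    # i.e. s3a://my-bucket/shared/latest/
    df = spark.read.parquet("/shared/latest/")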