
The Pig temp files are taking up 600 GB of space on my cloud, and my code is still running. Can they be deleted?

It will kill the running job; your job will fail if you do so. - Rijul

1 Answer


No, you should not do that. As I understand it, Pig's temp files are intermediate mapper output that the reducer will use in the next step. Instead of deleting them, you can compress the intermediate files to save space.

With gzip you get better compression (96-99%), but at the cost of about a 4% slowdown.

-Dpig.tmpfilecompression=true

-Dpig.tmpfilecompression.codec=gz
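For example, here is a minimal sketch of one way to apply both flags. It assumes you launch jobs with the stock `bin/pig` wrapper script, which picks up extra JVM options from the `PIG_OPTS` environment variable; `myscript.pig` is a placeholder name, not something from the question.

```shell
# Append the two compression properties to whatever PIG_OPTS already holds,
# so every Pig run started from this shell session uses them.
export PIG_OPTS="${PIG_OPTS:-} -Dpig.tmpfilecompression=true -Dpig.tmpfilecompression.codec=gz"

# Then run the script as usual (placeholder script name):
#   pig myscript.pig
#
# Equivalent one-off form, passing the flags directly on the command line:
#   pig -Dpig.tmpfilecompression=true -Dpig.tmpfilecompression.codec=gz myscript.pig
```

Note that this only affects runs started with these options. It will not shrink temp files already written by the job that is running now.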

You can find more tuning options in the Pig documentation:

https://pig.apache.org/docs/r0.16.0/perf.html#compression