After reading some articles about Whole-Stage Code Generation, I understand that Spark generates optimized code at runtime to turn a query plan into an optimized execution plan:
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-sql-whole-stage-codegen.html
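To make the question concrete, here is how I have been looking at what whole-stage codegen actually produces (`debugCodegen` comes from Spark's `org.apache.spark.sql.execution.debug` package; the toy query itself is just an example I made up):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.debug._

val spark = SparkSession.builder()
  .appName("codegen-inspection")
  .master("local[*]")
  .getOrCreate()

// A toy query; any query made of codegen-supported operators works.
val df = spark.range(0L, 1000000L)
  .selectExpr("id", "id * 2 AS doubled")
  .filter("doubled % 3 = 0")

// Prints the Java source that whole-stage codegen produced for each
// WholeStageCodegen subtree. Spark compiles this source to JVM bytecode
// at runtime; the JVM's JIT takes it from there.
df.debugCodegen()
```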
Now my next question: even after these bytecode-level optimizations, the conversion of those bytecode instructions to machine code instructions could still be a bottleneck, because that step is done by the JIT compiler alone at runtime, and for the JIT to compile a method it first has to see enough runs of it.
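What I mean by "the JIT has to have enough runs": the generated classes start out interpreted and are only compiled to machine code once HotSpot considers them hot. One way I can think of to observe this is with standard HotSpot flags (nothing Spark-specific; on a cluster they would go on the executor JVMs, where the generated code actually runs):

```scala
import org.apache.spark.sql.SparkSession

// Plain HotSpot diagnostic flags, passed through Spark's executor options,
// to watch the JIT compile the generated classes (they show up under names
// like GeneratedIteratorForCodegenStage1 in recent Spark versions).
val spark = SparkSession.builder()
  .appName("jit-observability")
  .config("spark.executor.extraJavaOptions",
    "-XX:+PrintCompilation " +          // log every method the JIT compiles
    "-XX:+UnlockDiagnosticVMOptions " +
    "-XX:+PrintInlining")               // log inlining decisions (diagnostic)
  .getOrCreate()
```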
So does Spark do anything about the dynamic/runtime conversion of the optimized bytecode (the output of whole-stage code generation) to machine code, or does it rely on the JIT to compile those bytecode instructions to machine code? Because if it relies on the JIT, there are certain uncertainties involved.
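To illustrate the kind of uncertainty I mean: beyond warm-up, if the many classes that codegen emits fill the JVM's code cache, HotSpot can stop JIT-compiling and fall back to the interpreter. A sketch of working around that (both flags are standard HotSpot options; 512m is an arbitrary example value, not a recommendation):

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: give the code cache headroom for the generated classes,
// and have the JVM print code-cache usage at exit.
val spark = SparkSession.builder()
  .appName("code-cache-headroom")
  .config("spark.executor.extraJavaOptions",
    "-XX:ReservedCodeCacheSize=512m -XX:+PrintCodeCache")
  .getOrCreate()
```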