I have two clusters, each running a different version of Hadoop. I am working on a POC where I need to understand how YARN makes it possible to run multiple applications simultaneously, which the classic MapReduce framework could not do.
Hadoop Classic: I have a wordcount.jar file and ran it on a single cluster (2 mappers and 2 reducers). I started two jobs in parallel; whichever one started first got both mappers and ran to completion, and only then did the second job start. This is the expected behavior.
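To make the classic behavior I am describing concrete, here is a toy model of it in Python. It is only a sketch of my mental picture, not Hadoop's actual scheduler: the function name `fifo_slot_schedule`, the job names, and the task counts are all made up for illustration, and it assumes strict FIFO where only the job at the head of the queue is given map slots.

```python
from collections import deque

def fifo_slot_schedule(map_slots, jobs):
    """Toy model (hypothetical, not Hadoop code).

    jobs: list of (name, number_of_map_tasks) in submission order.
    Each task is assumed to take exactly one time step.
    Returns {job_name: first time step in which the job got a slot}.
    """
    pending = deque([name, tasks] for name, tasks in jobs)
    first_step = {}
    step = 0
    while pending:
        # Strict FIFO assumption: only the head job is served in this step.
        name, remaining = pending[0]
        first_step.setdefault(name, step)
        running = min(map_slots, remaining)
        pending[0][1] -= running
        if pending[0][1] == 0:
            pending.popleft()
        step += 1
    return first_step

# Two jobs with 4 map tasks each (made-up numbers) on 2 map slots.
result = fifo_slot_schedule(2, [("job1", 4), ("job2", 4)])
print(result)
```

In this model the second job only gets its first slot after the first job has finished all of its map tasks, which matches what I saw on the classic cluster.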
Hadoop YARN: The same wordcount.jar on a different cluster (4 cores, so 4 machines in total). Since YARN does not pre-assign mapper and reducer slots, any core can be used as a mapper or a reducer. Here too I submitted two jobs in parallel. Expected behavior: both jobs should start with 2 mappers each, or with whatever allocation the ResourceManager assigns, but at least both jobs should start.
Reality: one job starts with 3 mappers and 1 reducer. The second job waits until the first has completed.
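The gap between what I expected and what I observed can be pictured with a small Python sketch. This is a toy model under my own assumptions, not YARN's real scheduling code: it treats the cluster as a pool of 4 identical containers (matching the 3 mappers + 1 reducer I saw), ignores the ApplicationMaster container and memory-based container sizing, and the policy names "fifo" and "fair" are labels I made up for the sketch, not YARN configuration values. I am not claiming this is how my cluster is actually configured.

```python
def allocate(total_containers, requests, policy):
    """Toy model (hypothetical, not YARN code).

    requests: list of (job, containers_wanted) in submission order.
    Returns {job: containers_granted}.
    """
    wanted = dict(requests)
    grants = {job: 0 for job, _ in requests}
    free = total_containers
    if policy == "fifo":
        # The earliest job takes everything it asks for before the next is considered.
        for job, want in requests:
            give = min(free, want)
            grants[job] = give
            free -= give
    elif policy == "fair":
        # Hand out one container at a time, round-robin across the jobs.
        while free > 0 and any(grants[j] < wanted[j] for j in grants):
            for job, _ in requests:
                if free > 0 and grants[job] < wanted[job]:
                    grants[job] += 1
                    free -= 1
    else:
        raise ValueError("unknown policy: " + policy)
    return grants

jobs = [("job1", 4), ("job2", 4)]      # made-up container demands
observed = allocate(4, jobs, "fifo")   # what I see: job2 gets nothing
expected = allocate(4, jobs, "fair")   # what I expected: both jobs start
print(observed, expected)
```

The "fifo" case reproduces what I see (the first job holds all 4 containers and the second waits), while the "fair" case is the split I was expecting.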
Can someone please help me understand this behavior? Also, is this kind of parallelism better demonstrated on a multi-node cluster?