One of the major tenets of Hadoop is moving the computation to the data.
If you set the replication factor approximately equal to the number of datanodes, every node holds a local copy of every block, so any machine can process that data without pulling it over the network.
However, as you mention, namenode overhead matters: more files or replicas mean more block metadata to track, which slows requests. Extra replicas can also saturate the network in an unhealthy cluster while re-replication traffic is flowing. I've never seen anything higher than 5, and that was only for the company's most critical data; everything else was left at 2 replicas.
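For reference, the cluster-wide default lives in `hdfs-site.xml` via the standard `dfs.replication` property (the path shown is just an illustrative example):

```xml
<!-- hdfs-site.xml: default replication factor for newly created files -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```

You can also override it per path after the fact with `hdfs dfs -setrep`, e.g. `hdfs dfs -setrep -w 5 /critical/data`, so only the truly hot datasets pay the extra storage and namenode cost.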
The execution engine doesn't matter too much beyond Tez/Spark outperforming MapReduce in most cases. What matters more is the size of your files and the format they are stored in - that will be a major driver of execution performance.
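Both knobs are one-liners in Hive; a sketch (table and column names are hypothetical, but `hive.execution.engine` and `STORED AS ORC` are standard Hive syntax):

```sql
-- Switch the execution engine for the session (mr, tez, or spark)
SET hive.execution.engine=tez;

-- A columnar, splittable format like ORC usually beats plain text files
CREATE TABLE events_orc (id BIGINT, payload STRING)
STORED AS ORC;
```

Columnar formats such as ORC or Parquet let the engine skip irrelevant columns and prune stripes/row groups, which often dwarfs the difference between execution engines.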