Most of the Kubernetes components are split by responsibility and workload assignment is no different. We could define the workload assignment process as Scheduling and Execution.
The Scheduler as the name suggests will be responsible for the Scheduling step, The process can be briefly described as, "get a list of pods, if it is not scheduled to a node, assign it to one node with capacity to run the pod". There is a nice blog post from Julia Evan here explaining Schedulers.
And Kubelet is responsible for the Execution of pods scheduled to it's node. It will get a list of POD Definitions allocated to it's node, make sure they are running with the right configuration, if not running start then.
With that in mind, the scenario you described will have the behavior expected, the POD will not be scheduled, because you don't have a node with capacity available for the POD.
Resource Balancing is mainly decided at scheduling level, a nice way to see it is when you add a new node to the cluster, if there are no PODs pending allocation, the node will not receive any pods. A brief of the logic used to Resource balancing can be seen on this PR
The solutions,
Kubernetes ships with a default scheduler. If the default scheduler does not suit your needs you can implement your own scheduler as described here. The idea would be implement and extension for the Scheduler to ReSchedule PODs already running when the cluster has capacity but not well distributed to allocated the new load.
Another option is use tools created for scenarios like this, the Descheduler is one, it will monitor the cluster and evict pods from nodes to make the scheduler re-allocate the PODs with a better balance. There is a nice blog post here describing these scenarios.
PS:
Keep in mind that the total memory of a node is not allocatable, depending on which provider you use, the capacity allocatable will be much lower than the total, take a look on this SO: Cannot create a deployment that requests more than 2Gi memory