On most designs the store buffer doesn't directly send invalidate requests, and it usually isn't even snooped¹ by external requests. That is, it sits on the private, core side of the coherence domain and so doesn't need to participate in coherence. Instead, the store buffer ultimately interacts with the first level of the caching subsystem, which is itself responsible for the various parts of the MESI protocol.
How that interaction works exactly depends on the design, of course. A simple design may process only one store at a time: it takes the oldest store, the one at the head of the store buffer, performs the RFO (request for ownership) for that address, and when that completes moves on to the next element. A more sophisticated design might send RFOs for several "upcoming" stores in the store buffer in an attempt to exploit more MLP (memory-level parallelism). The exact mechanism isn't clear to me on x86: stores to L2 seem to perform quite poorly in some scenarios, but I'm pretty sure a bunch of store misses to RAM will perform much better than if they were handled serially.
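To make the difference between the two designs concrete, here is a rough toy model, not a description of any real CPU. It assumes stores commit strictly in order, each commit takes one cycle once the line is owned, every store targets a different line, and an owned line is never lost before its store commits. The function name `drain_cycles` and the `window` parameter (how many of the oldest uncommitted stores may have an RFO in flight) are made up for illustration.

```python
def drain_cycles(latencies, window):
    """Cycles needed to drain a store buffer in program order.

    latencies[i] is the hypothetical RFO latency for store i.
    window is the assumed number of oldest uncommitted stores that may
    have an RFO outstanding at once (1 models the simple serial design).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    commit = []  # commit[i] = cycle at which store i has committed
    for i, lat in enumerate(latencies):
        # The RFO for store i can be issued once store i - window has
        # committed, i.e. once store i is among the `window` oldest entries.
        issue = commit[i - window] if i >= window else 0
        owned = issue + lat  # cycle at which the line is held exclusively
        prev = commit[i - 1] if i > 0 else 0
        # In-order commit: wait for ownership and for the previous store,
        # then spend one cycle writing into the L1.
        commit.append(max(owned, prev) + 1)
    return commit[-1] if commit else 0


if __name__ == "__main__":
    misses = [100] * 8  # eight store misses, 100 cycles each (made-up numbers)
    for w in (1, 4, 8):
        print(f"window={w}: {drain_cycles(misses, w)} cycles")
```

With these made-up numbers the serial design pays the full miss latency for every store, while the overlapped designs pay it roughly once per batch of `window` stores, which is the MLP benefit described above.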
¹ There are some exceptions. For example, simultaneous multithreading (hyperthreading on x86) involves two logical cores sharing all levels of cache, so they cannot rely on the normal cache coherency mechanisms between themselves, and such a design may require store buffer snoops.