In short: No. At least not in a useful way.
Longer: Writing to flash is slow not because the CPU is slow, but because the flash program and erase operations themselves take time, so DMA isn't any faster. You can theoretically use DMA to write to flash (it is a bit tricky and has quite a few pitfalls), but you won't gain any speed. The reason is that the CPU is stalled whenever it tries to access flash memory while it is being written. It therefore cannot execute code from flash while the DMA writes to it, so no time is gained.
The manual says:
Any attempt to read the Flash memory on STM32F4xx while it is being written or erased, causes the bus to stall. Read operations are processed correctly once the program operation has completed. This means that code or data fetches cannot be performed while a write/erase operation is ongoing.
That means executing code from any flash area while the DMA writes to flash is not possible; the CPU is stalled for that period. You could, however, complicate things even further and copy parts of your code to RAM, then execute them from there to work around this.
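To give an idea of what the RAM workaround looks like, here is a minimal sketch of a busy-wait routine that can be placed in RAM. It assumes GCC and a linker script that places a section named .RamFunc in RAM and copies it there at startup (ST-generated linker scripts usually do this, but check yours). The RAMFUNC macro and the function name flash_wait_idle are made up for this example.

```c
#include <stdint.h>

/* On an ARM target, put the function into the .RamFunc section
 * (assumed to be located in RAM by the linker script). */
#if defined(__GNUC__) && defined(__arm__)
#define RAMFUNC __attribute__((section(".RamFunc"), noinline))
#else
#define RAMFUNC /* host build: ordinary function, for testing only */
#endif

/* Poll a flash status register until the busy bit clears.
 * Returns 0 once the flash is idle, -1 if max_polls is exhausted. */
RAMFUNC int flash_wait_idle(volatile const uint32_t *status_reg,
                            uint32_t busy_mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*status_reg & busy_mask) == 0u) {
            return 0;
        }
    }
    return -1;
}
```

On the target, with the CMSIS device header included, the call would be something like flash_wait_idle(&FLASH->SR, FLASH_SR_BSY, 1000000u). Keep in mind that everything that runs during the write, including interrupt handlers and the vector table, has to live in RAM as well; otherwise the first fetch from flash stalls the CPU again.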
Some of the F4 series have 4 KB of battery-backed RAM (backup SRAM). Use that; it's much faster and much simpler to handle.
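A rough sketch of how the backup SRAM could be used, assuming the CMSIS device header (stm32f4xx.h) and a part that actually has the 4 KB backup SRAM. TARGET_STM32F4 is a hypothetical macro you would define in your project; without it, the code falls back to a plain array so the helpers can be tried on a PC. The helper names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BACKUP_SRAM_SIZE 4096u   /* 4 KB; check the datasheet of your part */

#ifdef TARGET_STM32F4            /* hypothetical project-defined macro */
#include "stm32f4xx.h"           /* CMSIS device header */

static void backup_sram_enable(void)
{
    RCC->APB1ENR |= RCC_APB1ENR_PWREN;      /* clock the power controller */
    (void)RCC->APB1ENR;                     /* read back so the clock is active */
    PWR->CR      |= PWR_CR_DBP;             /* allow writes to the backup domain */
    RCC->AHB1ENR |= RCC_AHB1ENR_BKPSRAMEN;  /* clock the backup SRAM interface */
    (void)RCC->AHB1ENR;
    PWR->CSR     |= PWR_CSR_BRE;            /* backup regulator keeps data on VBAT */
    while ((PWR->CSR & PWR_CSR_BRR) == 0u) {
        /* wait until the backup regulator is ready */
    }
}
#define BACKUP_SRAM_PTR ((uint8_t *)BKPSRAM_BASE)
#else
/* Host build for testing only: an ordinary array stands in for the SRAM. */
static uint8_t host_backup_sram[BACKUP_SRAM_SIZE];
static void backup_sram_enable(void) { }
#define BACKUP_SRAM_PTR (host_backup_sram)
#endif

/* Copy len bytes into the backup SRAM; returns 0, or -1 if out of range. */
static int backup_sram_write(size_t offset, const void *src, size_t len)
{
    if (offset > BACKUP_SRAM_SIZE || len > BACKUP_SRAM_SIZE - offset) {
        return -1;
    }
    memcpy(BACKUP_SRAM_PTR + offset, src, len);
    return 0;
}

/* Copy len bytes out of the backup SRAM; returns 0, or -1 if out of range. */
static int backup_sram_read(size_t offset, void *dst, size_t len)
{
    if (offset > BACKUP_SRAM_SIZE || len > BACKUP_SRAM_SIZE - offset) {
        return -1;
    }
    memcpy(dst, BACKUP_SRAM_PTR + offset, len);
    return 0;
}
```

Call backup_sram_enable() once after reset, then read and write like ordinary RAM. There is no erase step and no write stall, which is why this is so much simpler than flash.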