Firstly, I would like to refer you to Paul R's wonderful post on what each FFT bin means with respect to the frequency content of your signal - How do I obtain the frequencies of each value in a FFT?
Essentially, it doesn't matter how long the signal is. What matters is how many points you choose for the FFT and that both signals have the same sampling rate. If both conditions hold, you can compare the frequency distributions of the two signals correctly. Remember, the FFT is a frequency decomposition algorithm: it decomposes your signal into a sum of sinusoids (complex exponentials, to be precise), so we are measuring how much of each frequency the signal contains. Viewed this way, the length of the signal is of little consequence.
To use a rather simple example, suppose we played a 1 kHz tone for 3 seconds and another 1 kHz tone for 10 seconds. It shouldn't matter how long each one was played: the frequency decomposition of either signal consists of only one component, namely the one at 1 kHz. As such, you can certainly compare the two signals regardless of their lengths - we only look at the frequency content of each signal.
To go a bit further on this, recall Paul R's post and suppose we use a 1024-point FFT with a sampling rate of 44.1 kHz. Note that we don't care how long the signal is. The mapping from each FFT bin number to the frequency it represents can then be summarized as follows:
0: 0 * 44100 / 1024 = 0.0 Hz
1: 1 * 44100 / 1024 = 43.1 Hz
2: 2 * 44100 / 1024 = 86.1 Hz
3: 3 * 44100 / 1024 = 129.2 Hz
4: ...
5: ...
...
511: 511 * 44100 / 1024 = 22006.9 Hz
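As you can see, bin N/2 - 1 = 511 is the last bin below the Nyquist frequency; the Nyquist frequency itself, 44100 / 2 = 22050 Hz, falls on bin N/2 = 512. Also, notice that for a real-valued signal you only need the first half of the FFT (bins 0 to N/2) to be able to reconstruct your data. The remaining bins (513 - 1023) correspond to negative frequencies and are complex conjugates of the first half. This symmetry is a property of the discrete Fourier transform of a real signal rather than of any particular FFT algorithm (the Cooley-Tukey FFT algorithm is simply a fast way of computing that transform).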
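The bin-to-frequency table above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `bin_frequencies` is my own for illustration, and it assumes a real-valued signal so that only bins 0 to N/2 are listed.

```python
def bin_frequencies(n_fft, sample_rate):
    """Centre frequency in Hz of bins 0..n_fft/2 (hypothetical helper)."""
    # Bin k sits at k * sample_rate / n_fft; the signal length never appears.
    return [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]


freqs = bin_frequencies(1024, 44100)

print(round(freqs[1], 1))    # 43.1
print(round(freqs[3], 1))    # 129.2
print(round(freqs[511], 1))  # 22006.9
print(freqs[512])            # 22050.0, the Nyquist frequency
```

Note that the only inputs are the FFT size and the sampling rate, which is why two signals analysed with the same values share the same frequency axis.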
The reason I wanted to point out this bin-to-frequency mapping is that when we decompose your signal into its frequency components, as long as the sampling frequency and the number of FFT points are the same, you will get a frequency breakdown that follows the table above, no matter how long the signal is. When you decompose the signal into the frequency domain, we simply measure how much of a particular frequency we see in your signal, and the bin that frequency lands in is independent of the signal's length. Therefore, when you compare the two signals, you get a frequency decomposition like the one above for each of them, and you can compare their frequency content bin by bin.
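To make this concrete, here is a small sketch that analyses two 1 kHz tones of different durations with the same sampling rate and the same number of FFT points. Everything in it is an assumption chosen for illustration: the helper names, the small values (8 kHz sampling rate, 64 points) that keep a plain DFT loop fast, and the choice to truncate or zero-pad each signal to exactly N samples, which is how I would expect an N-point FFT call to treat an input of a different length.

```python
import cmath
import math


def tone(freq_hz, duration_s, sample_rate):
    """Generate a sine tone as a list of samples (illustrative helper)."""
    n = int(round(duration_s * sample_rate))
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def dft_magnitudes(signal, n_fft):
    """Magnitudes of bins 0..n_fft/2 of an n_fft-point DFT."""
    # Assumption: truncate or zero-pad so exactly n_fft samples are analysed.
    x = list(signal[:n_fft]) + [0.0] * max(0, n_fft - len(signal))
    mags = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n in range(n_fft))
        mags.append(abs(acc))
    return mags


def peak_frequency(signal, n_fft, sample_rate):
    """Frequency in Hz of the bin with the largest magnitude."""
    mags = dft_magnitudes(signal, n_fft)
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / n_fft


SAMPLE_RATE = 8000  # Hz, same for both signals
N_FFT = 64          # same number of FFT points for both signals

short_tone = tone(1000, 0.004, SAMPLE_RATE)  # 32 samples
long_tone = tone(1000, 0.05, SAMPLE_RATE)    # 400 samples

print(peak_frequency(short_tone, N_FFT, SAMPLE_RATE))  # 1000.0
print(peak_frequency(long_tone, N_FFT, SAMPLE_RATE))   # 1000.0
```

Both tones peak in the same bin even though one is more than ten times longer than the other. Only the location of the peak is compared here; the absolute magnitudes depend on how many samples of each signal fall inside the N-point window.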
tl;dr
If you don't care to read the above, the answer is no, you don't need to normalize the lengths of the signals being measured. You just need to make sure that the sampling frequency and the number of FFT points used to decompose each signal are the same for both signals.