While you can keep training a Word2Vec model with newer examples, unless the old examples are also re-presented in an interleaved fashion, those new examples may not make the model any better – no matter how well you adjust the alpha.
That's because while training on the new examples, the model is only being nudged to get better at predicting their words, in those new contexts. If there are words missing from the new texts, their word-vectors remain unadjusted as the rest of the model drifts. Even to the extent the same words repeat, their new contexts will presumably be different in some important ways – or else why keep training with new data? – which incrementally dilutes or obsoletes all influence of the older training.
There's even a term for the tendency (though far from a certainty) of neural networks to worsen when presented with new data: catastrophic forgetting.
So, the most supportable policy is to re-train with all relevant data mixed together, to be sure it all has equal influence. If you're improvising some other shortcut, you're in experimental territory, and there's little reliable documentation or published work that can make strong suggestions about the relative balance of learning-rates/epoch-counts/etc. Any possible answers would also depend very heavily on the relative sizes of the corpora and vocabularies, both initially and in any subsequent updates, and on how important factors like vector-stability-over-time, or relative-quality-of-different-vectors, are to your specific project. So there'd be no one answer – just what tends to work in your particular setup.
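As a minimal sketch of that mixed-retraining policy – assuming your corpora are simple in-memory lists of token lists (the data here is purely illustrative) – you'd combine and shuffle everything before a fresh training run:

```python
import random

# Illustrative corpora: lists of tokenized sentences (hypothetical data)
old_corpus = [["cat", "sat", "mat"], ["dog", "ran", "fast"]]
new_corpus = [["cat", "chased", "mouse"], ["bird", "flew", "away"]]

# Combine everything, then shuffle so old and new examples interleave
# and exert roughly equal influence during every training pass.
combined = old_corpus + new_corpus
random.seed(42)  # seeded only so this sketch is reproducible
random.shuffle(combined)

# `combined` would then be the corpus for a fresh Word2Vec training run.
print(len(combined))  # 4
```

For corpora too large for memory, the same idea applies with a streaming iterator over a pre-shuffled combined file.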
(There's an experimental feature in gensim Word2Vec – some internal model properties whose names end in _lockf, for 'lock factor'. They match 1-for-1 with the word-vectors, and for any slot where the lock-factor is set to 0.0 (rather than the default 1.0), that word-vector ignores training updates. In this way, you can essentially 'freeze' some words – such as those you're confident won't be improved by more training – while letting others still update. This might help with drift/forgetting issues during updates, but the questions of relative quality and correct alpha/epochs are still murky, requiring project-by-project experimentation.)
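The gating idea behind that lock-factor can be shown with a tiny pure-Python sketch – this is not gensim's actual internals (where the arrays and exact attribute names vary by version), just the concept of a per-word multiplier on each update:

```python
# Toy illustration of the 'lock factor' idea: each word-vector's
# training update is scaled by a per-word multiplier. A factor of
# 0.0 freezes the vector; 1.0 lets it update fully.
vectors = {
    "frozen_word": [1.0, 2.0],
    "live_word":   [1.0, 2.0],
}
lockf = {"frozen_word": 0.0, "live_word": 1.0}

def apply_update(word, gradient, alpha=0.1):
    """Nudge a word's vector by alpha * gradient, gated by its lock-factor."""
    scale = alpha * lockf[word]
    vectors[word] = [v + scale * g for v, g in zip(vectors[word], gradient)]

apply_update("frozen_word", [1.0, 1.0])  # no effect: lock-factor is 0.0
apply_update("live_word", [1.0, 1.0])    # moves by 0.1 in each dimension

print(vectors["frozen_word"])  # [1.0, 2.0]
print(vectors["live_word"])    # [1.1, 2.1]
```

In real training the same multiplier is applied inside the backpropagation step, so a 0.0 slot is never moved no matter how many more epochs run.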
Specifically with regard to your numbered questions:
(1) Each call to train() will do the specified epochs number of passes over the data, and smoothly manage the learning-rate from the model's configured starting alpha down to min_alpha (unless you override those with extra parameters to train()).
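That linear schedule can be sketched as follows – a simplification of what train() manages internally, with a made-up helper function whose parameter names merely mirror gensim's alpha/min_alpha:

```python
def effective_alpha(progress, alpha=0.025, min_alpha=0.0001):
    """Linearly interpolate from `alpha` down to `min_alpha` as training
    `progress` goes from 0.0 (start) to 1.0 (end of all requested epochs)."""
    return alpha - progress * (alpha - min_alpha)

# At the start of training the full starting rate applies; halfway through
# all epochs it's the midpoint; by the very end it has decayed to min_alpha.
print(effective_alpha(0.0))  # 0.025
print(effective_alpha(0.5))
print(effective_alpha(1.0))
```

Note the decay is over the whole train() call – all epochs together – not restarted per epoch, which is one reason calling train() multiple times yourself (rather than once with a larger epochs value) can mismanage the schedule.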
(2) As above, there's no established rule-of-thumb because the issue is complicated, incremental training in this style isn't guaranteed to help, and even where it might help it'd depend highly on non-generalizable project specifics.
(3) If a second call to train() causes no changes to vectors, there may be something wrong with your corpus-iterator. Enable logging to at least the INFO level and make sure train() is taking the time, and showing the incremental progress, that indicates real model updates are happening.
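Enabling that logging takes one call to Python's standard logging module, as in:

```python
import logging

# Show INFO-level progress messages (epoch progress, example/word counts,
# effective alpha) so you can confirm train() is doing real work.
# force=True (Python 3.8+) replaces any handlers already configured.
logging.basicConfig(
    format="%(asctime)s : %(levelname)s : %(message)s",
    level=logging.INFO,
    force=True,
)
```

With this in place before model creation and train(), a healthy run emits a steady stream of progress lines; a train() call that returns instantly with no such output usually means the corpus iterator yielded nothing.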