I haven't seen any way to do this, and given the way hierarchical-softmax (HS) outputs work, there's no obviously correct way to turn the output nodes' activation levels into a precise per-word likelihood estimate. Note that:
* the `predict_output_word()` method that (sort of) simulates a negative-sampling prediction doesn't even try to handle HS mode (see the snippet after this list)
* during training, neither HS nor negative-sampling mode makes exact predictions – they just nudge the outputs to be closer to what the current training example would require
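For illustration of that first point, a minimal sketch, assuming gensim 3.x parameter names (`size` became `vector_size` in 4.x); the toy `sentences` corpus is a hypothetical stand-in for real training data:

```python
from gensim.models import Word2Vec

# hypothetical toy corpus, stand-in for real training data
sentences = [['the', 'quick', 'brown', 'fox'],
             ['jumps', 'over', 'the', 'lazy', 'dog']]

# negative-sampling mode (hs=0, negative>0): predict_output_word() works
ns_model = Word2Vec(sentences, size=50, min_count=1, hs=0, negative=5)
print(ns_model.predict_output_word(['quick', 'brown'], topn=3))

# HS mode (hs=1, negative=0): the same call raises a RuntimeError,
# since the method is only implemented for negative sampling
hs_model = Word2Vec(sentences, size=50, min_count=1, hs=1, negative=0)
# hs_model.predict_output_word(['quick', 'brown'], topn=3)  # -> RuntimeError
```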
To the extent you could calculate all output node activations for a given context, then check the values at each word's unique chain of HS internal nodes for how close they are to "being predicted", you could potentially synthesize relative scores for each word – some measure of how far the values are from a "certain" output of that word. But whether and how each node's deviation should contribute to that score, and how that score might be indicative of an interpretable likelihood, is unclear.
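To make that concrete, here's a rough, unofficial sketch of one such scoring scheme – my own hypothetical helper `hs_word_scores()`, not a gensim API. It sums the log-sigmoids along each word's Huffman-code path (oriented by the code bits), using a CBOW-style mean of the context vectors as the hidden layer. It reaches into gensim 3.x internals (`model.wv.vocab[word].code`/`.point` and `model.trainables.syn1`), which live elsewhere in other gensim versions:

```python
import numpy as np

def hs_word_scores(model, context_words, topn=10):
    """Rank vocabulary words by log path-probability under the HS tree.

    Unofficial sketch assuming gensim 3.x internals: each vocab entry's
    .code (Huffman bits) and .point (internal-node indices), plus the
    internal-node weights in model.trainables.syn1.
    """
    known = [w for w in context_words if w in model.wv.vocab]
    if not known:
        return []
    # CBOW-style hidden layer: mean of the context words' input vectors.
    l1 = np.mean([model.wv.vectors[model.wv.vocab[w].index] for w in known], axis=0)

    scores = {}
    for word, vocab in model.wv.vocab.items():
        # Dot products between the hidden layer and each internal node
        # along this word's Huffman path.
        dots = model.trainables.syn1[vocab.point].dot(l1)
        # Code bit 0 targets sigmoid(dot), bit 1 targets sigmoid(-dot).
        signs = 1.0 - 2.0 * np.asarray(vocab.code, dtype=np.float64)
        # log sigmoid(x) = -log(1 + e^(-x)), summed over the word's path.
        scores[word] = float(-np.logaddexp(0.0, -signs * dots).sum())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:topn]
```

In principle, the per-path products over a full binary tree do form a normalized distribution over the vocabulary, but how faithfully the trained node weights make that distribution interpretable is exactly the uncertainty described above.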
There could also be issues because of the way HS codes are assigned strictly by word-frequency – so 'neighbor' words sharing mostly-the-same encoding may be very different semantically. (There were some hints in the original word2vec.c code that it could potentially be beneficial to assign HS-encodings by clustering related words to have similar codings, rather than by strict frequency, but I've seen little practice of that since.)
I would suggest sticking to negative-sampling if interpretable predictions are important. (But also remember: word2vec isn't mainly used for predictions; it just uses the training-attempts-at-prediction to bootstrap a vector-arrangement that turns out to be useful for other tasks.)