
I am having trouble understanding how to use the method baumWelch in the package HMM.

According to the documentation we first have to initialize our hidden markov model

hmm = initHMM(c("A","B"), c("L","R"), transProbs=matrix(c(.8,.2,.2,.8),2),
              emissionProbs=matrix(c(.6,.4,.4,.6),2))

This means that the states are named "A" and "B", the emissions are named "L" and "R", and we have specified the transition and emission probabilities as well.

So far so good, but now the tutorial creates the list of observations:

observations = sample(c("L","R"),prob=c(.8,.2),size=100,replace=TRUE)

This is a one-dimensional vector serving as the list of observations. According to the description in Rabiner's classic paper, the forward and backward probabilities are calculated on a sequence of observations such as the variable observations in the code above, i.e. we need a matrix of such observation sequences in order to even remotely train anything. How do I do that here?

EDIT:

In the example above the emissions are "L" and "R". An observation sequence O should be built from those emissions, e.g. O = LRRRLLLRR, etc. Suppose its length is t.

An observation sequence of this sort is used in the forward algorithm, which, given a full hidden Markov model and an observation sequence O, generates a matrix of dimensions n x t, where n is the number of states in our HMM. The (i, j)-th element of such a matrix is interpreted as the probability of having generated the first j elements of the observation sequence and being in state i at time j.
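In the HMM package itself this matrix can be computed with forward(); a minimal sketch using the model from the question (note that the package returns log probabilities):

```r
library(HMM)

# Model from the question: 2 states, 2 emission symbols
hmm <- initHMM(c("A","B"), c("L","R"),
               transProbs=matrix(c(.8,.2,.2,.8),2),
               emissionProbs=matrix(c(.6,.4,.4,.6),2))

# A single observation sequence O of length t = 9
O <- c("L","R","R","R","L","L","L","R","R")

# forward() returns an n x t matrix of log forward probabilities:
# entry [i, j] = log P(first j observations, state i at time j)
f <- forward(hmm, O)
dim(f)  # 2 x 9: one row per state, one column per time step
```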

Now the forward algorithm, as well as the backward algorithm and several others, is used in the Baum-Welch training algorithm.

I believe that the input to Baum-Welch should be a list of observation sequences and NOT a single sequence of emissions. In my version, the dimensions of the matrix input should be t times m, where t is the length of an observation sequence and m is the number of such sequences.

How do I understand the input here? Is it that we have a hundred observation sequences of length 1? How do I provide such a matrix to the baumWelch method in HMM?
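For reference, the single-sequence call that the package documentation shows looks like this (putting the pieces above together):

```r
library(HMM)

hmm <- initHMM(c("A","B"), c("L","R"),
               transProbs=matrix(c(.8,.2,.2,.8),2),
               emissionProbs=matrix(c(.6,.4,.4,.6),2))

# One sequence of 100 emissions, as in the tutorial
observations <- sample(c("L","R"), prob=c(.8,.2), size=100, replace=TRUE)

# baumWelch re-estimates the model from this one sequence;
# $hmm holds the fitted model, $difference the convergence trace
bw <- baumWelch(hmm, observations, maxIterations=100)
print(bw$hmm$transProbs)
```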

If you don't understand the principles of statistical methods, you should probably ask at Cross Validated or Data Science. This really isn't a specific programming question that's appropriate for Stack Overflow. - MrFlick
I understand the principles. In fact, I should have rephrased it better: it's not that I think the observations should be a matrix, it is an immutable fact that they should be. What I am asking here is whether anyone has worked with the package and can clarify how to understand this particular example and how to code the correct thing. - Vahagn Tumanyan
It still isn't clear to me what exactly your question is. You don't understand why they create a vector of observations rather than a matrix? What would the dimensions of your expected matrix be, and what would its contents be? What part of the documentation are you referring to exactly? - MrFlick
Yes, I don't understand why there is a vector of emissions and not a vector of vectors, each of which is a sequence of observations. - Vahagn Tumanyan

2 Answers

1
votes

As I understand it, the baumWelch function in the HMM package only accepts a single sequence of observations as its second argument. If your training data consists of more than one sequence, you could try the aphid package, which supports both Baum-Welch and Viterbi model training on lists of multiple sequences. Disclaimer: I authored the package, partly because I was running into the same issues.
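A minimal sketch of what the multi-sequence input would look like: one list element per sequence. The training call at the end is commented out because I'm recalling the aphid API from memory; check ?aphid::train for the exact signature and argument names before relying on it:

```r
# Multiple training sequences go in a list, one element per sequence
# (my understanding of the input shape the aphid package works with)
seqs <- list(sample(c("L","R"), size=100, prob=c(.8,.2), replace=TRUE),
             sample(c("L","R"), size=100, prob=c(.8,.2), replace=TRUE),
             sample(c("L","R"), size=100, prob=c(.8,.2), replace=TRUE))

# Hypothetical training call (verify against the aphid documentation):
# library(aphid)
# fit <- train(initial_model, seqs, method = "BaumWelch")
```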

0
votes

Although this was asked a while ago, I found this question when I was confused by the same thing.

Quoting Ghahramani from An Introduction to Hidden Markov Models and Bayesian Networks, page 7:

"If the observation vector includes all the variables in the Bayesian network, then each term in the log-likelihood function further factors as..." see equation here

Essentially, from what I understand, because each node in the network only depends on its parent, you don't need to separate the training vectors out into a matrix and can use one full training vector instead. I suppose the only issue would be at the boundary where a new training vector begins, but you can overcome this by creating a special start state "*" which is always the first state in a new sequence.
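A sketch of that idea, assuming we mark each sequence boundary with a dedicated delimiter symbol (the symbol "#" and the code below are my own illustration, not from the paper):

```r
# Three separate training sequences over the emissions "L"/"R"
seqs <- list(c("L","R","R","L"),
             c("R","R","L"),
             c("L","L","R"))

# Concatenate them into one training vector, prefixing each sequence
# with a delimiter "#" that only the special start state "*" can emit
training <- unlist(lapply(seqs, function(s) c("#", s)))
# training: "#" "L" "R" "R" "L" "#" "R" "R" "L" "#" "L" "L" "R"

# The model would then include "*" among its states and "#" among its
# emissions, with "*" emitting "#" with probability 1, so every "#"
# forces the chain back through the start state.
```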