0
votes

I have a (non-symmetric) probability matrix, and an observed vector of integer outcomes. I would like to find a vector that maximises the probability of the outcomes, given the transition matrix. Simply, I am trying to estimate a distribution of particles at sea given their ultimate distribution on land, and a matrix of probabilities of a particle released from a given point in the ocean ending up at a given point on the land.

The vector that I want to find is subject to the constraint that all components must be between 0-1, and the sum of the components must equal 1. I am trying to figure out the best optimisation approach for the problem.

My transition matrix and data set are quite large, but I have created a smaller one here:

I used a simulated known at- sea distribution of msim<-c(.3,.2,.1,.3,.1,0) and a simulated probability matrix (t) to come up with an estimated coastal matrix (Datasim2), as follows:

t<-matrix (c(0,.1,.1,.1,.1,.2,0,.1,0,0,.3,0,0,0,0,.4,.1,.3,0,.1,0,.1,.4,0,0,0,.1,0,.1,.1),
nrow=5,ncol=6, byrow=T)
rownames(t)<-c("C1","C2","C3","C4","C5") ### locations on land
colnames(t)<-c("S1","S2","S3","S4","S5","S6") ### locations at sea

Datasim<-as.numeric (round((t %*% msim)*500))

Datasim2<-c(rep("C1",95), rep("C2",35), rep("C3",90),rep("C4",15),rep("C5",30))
M <-c(0.1,0.1,0.1,0.1,0.1,0.1) ## starting M

I started with a straightforward function as follows:

EstimateSource3<-function(M,Data,T){

EstEndProbsall<-M%*%T
  TotalLkhd<-rep(NA, times=dim(Data)[1])

  for (j in 1:dim(Data)[1]){

ObsEstEndLkhd<-0
    ObsEstEndLkhd<-1-EstEndProbsall[1,]  ## likelihood of particle NOT ending up at locations other than the location of interest

      IndexC<-which(colnames(EstEndProbsall)==Data$LocationCode[j], arr.ind=T) ## likelihood of ending up at location of interest

      ObsEstEndLkhd[IndexC]<-EstEndProbsall[IndexC]

      #Total likelihood
      TotalLkhd[j]<-sum(log(ObsEstEndLkhd)) 
  }

  SumTotalLkhd<-sum(TotalLkhd)


  return(SumTotalLkhd)
}

DistributionEstimate <- optim(par = M, fn = EstimateSource3, Data = Datasim2, T=t, 
control = list(fnscale = -1, trace=5, maxit=500), lower = 0, upper = 1)

To constrain the sum to 1, I tried using a few of the suggestions posted here:How to set parameters' sum to 1 in constrained optimization

e.g. adding M<-M/sum(M) or SumTotalLkhd<-SumTotalLkhd-(10*pwr) to the body of the function, but neither yielded anything like msim, and in fact, the 2nd solution came up with the error “L-BFGS-B needs finite values of 'fn'”

I thought perhaps the quadprog package might be of some help, but I don’t think I have a symmetric positive definite matrix…

Thanks in advance for your help!

2
Please explain the math of your problem. You know the transition matrix T and distribution on land L and you are looking for most likely distribution at sea S? Is the relationship something like L' = S' T? This looks like a matrix equation to me, nothing to optimize here. You ask for "most likely" solution. How do you introduce randomness into the process? - Ott Toomet
Thanks for that suggestion. Yes, we know the transition matrix (T) and land distribution (Datasim2), and looking for sea distribution (M). I am attempting to solve for this using matrix equations, but am running into problems trying to get a pseudoinverse of T. I tried invT <- ginv(T) but for some reason this does not work; (invT %*% T does not yield an identity matrix. I cannot figure out why! Note that I have transposed the transition matrix in order for the matrix equation to work properly. - Alexandra

2 Answers

0
votes

What about that: Let D = distribution at land, M = at sea, T the transition matrix. You know D, T, you want to calculate M. You have

D' = M' T

hence D' T' = M' (T T')

and accordingly D'T'(T T')^(-1) = M'

Basically you solve it as when doing linear regression (seems SO does not support math notation: ' is transpose, ^(-1) is ordinary matrix inverse.)

Alternatively, D may be counts of particles, and now you can ask questions like: what is the most likely distribution of particles at sea. That needs a different approach though.

0
votes

Well, I have never done such models but think along the following lines. Let M be of length 3 and D of length 2, and T is hence 3x2. We know T and we observe D_1 particles at location 1 and D_2 particles at location 2.

What is the likelihood that you observe one particle at location D_1? It is Pr(D = 1) = M_1 T_11 + M_2 T_21 + M_3 T_32. Analogously, Pr(D = 2) = M_1 T_12 + M_2 T_22 + M_3 T_32. Now you can easily write the log-likelihood of observing D_1 and D_2 particles at locations 1 and 2. The code might look like this:

loglik <- function(M) {
   if(M[1] < 0 | M[1] > 1)
      return(NA)
   if(M[2] < 0 | M[2] > 1)
      return(NA)
   M3 <- 1 - M[1] - M[2]
   if(M3 < 0 | M3 > 1)
      return(NA)
   D[1]*log(T[1,1]*M[1] + T[2,1]*M[2] + T[3,1]*M3) +
       D[2]*log(T[1,2]*M[1] + T[2,2]*M[2] + T[3,2]*M3)
}
T <- matrix(c(0.1,0.2,0.3,0.9,0.8,0.7), 3, 2)
D <- c(100,200)
library(maxLik)
m <- maxLik(loglik, start=c(0.4,0.4), method="BFGS")
summary(m)

I get the answer (0, 0.2, 0.8) when I estimate it but standard errors are very large.

As I told, I have never done it so I don't know it it makes sense.