This problem appeared as a project in the **edX** course Computational Probability and Inference, **MITx – 6.008.1x. Here ****is** the definition of the problem:

**Problem Definition**

Your pet robot is stuck on Mars somewhere on a grid of width 12 and height 8. You would like to figure out where your robot is over time so that you can rescue it.

## MODEL

The robot’s position at time i is given by random variable Z_i, which takes on a value in {0,1,…,11}×{0,1,…,7}. For example, if Z_2=(5,4), then this means that at time step 2, the robot is in column 5, row 4. Luckily, the robot is quite predictable. At each time step, it makes one of five actions: it stays put, goes left, goes up, goes right, or goes down. But the action it chooses depends on its previous action.

In particular, if its previous action was a movement, it moves in the same direction with probability 0.9 and stays put with probability 0.1. If its previous action was to stay put, it stays again (w.p. 0.2), or moves in any direction (each w.p. 0.2). For example, if the robot’s previous action was ‘up’, then with probability 0.1, the robot’s next action will be ‘stay’, and with probability 0.9, the robot’s next action will be ‘up’.

We can visually represent these transitions with the following transition diagram (a transition diagram, while graphical and with probabilities is *not* the same as a probabilistic graphical model; each node in a probabilistic graphical model represents a random variable whereas in a transition diagram, each node is *not* a random variable but is instead an actual state that a random variable can take on, and the directed edges specify transition probabilities to get to other states):

There’s a catch: We need to treat the boundary of the grid differently. For instance, at the top of the grid, the robot can’t go any higher. We’ll re-normalize the probabilities of its remaining actions at the top so that they sum to 1. Such boundary cases suggest that the transition probabilities depend on the robot’s current location *and* its previous action. Thus, we model the robot’s hidden state Xi at time i to consist of both its location Z_i and the action A_i it last took to reach Z_i, i.e., X_i=(Z_i,A_i) as depicted below:

Unfortunately, we will not have access to directly observing the robot’s hidden state X_i. Instead, we have access to a noisy sensor that puts a uniform distribution on valid grid positions within 1 grid cell of the robot’s current true position. Also, this noisy sensor only gives a guess as to the robot’s current position and tells us nothing about what actions your robot has taken. In other words, at time i, we observe Y_i, which takes on a value in {0,1,…,11}×{0,1,…,7} as depicted below:

Lastly, we shall assume that initially the robot is in any of the grid locations with equal probability and always starts with its action set to ‘stay’.

**(a)** For every time step i, you want to compute a distribution which tells us where the robot was at time i. Implement the forward-backward algorithm to compute the marginal distribution p_X_i|Y_0,…,Y_{n−1}(⋅|y_0,…,y_{n−1}) for each i.

**(b)** Some of the observations were lost when they were transmitted from Mars back to Earth. Modify forward_backward so that it can handle missing observations, i.e., where at some time steps, there is no observation. In a list of observations, a missing observation is stored as None rather than the usual tuple (x, y).

**(c)** Now, you want to know which *sequence* of states the robot was most likely to have visited. Implement the Viterbi algorithm for finding the MAP estimate of the entire sequence of hidden states given a sequence of observations.

**(d)** Implement a modified version of the Viterbi algorithm that outputs the second-best solution to the MAP problem rather than the best solution.

**Forward and Backward message passing as Sum Product and Viterbi decoding as Max Product Algorithms**

Here are the **Viterbi** decoding outputs for the most likely hidden states (**actual location** and **action** of the robot for a few of the test cases):

Here are the **Viterbi** decoding outputs for the **2nd most most likely path **for the robot:

The same problem is solved with R and the below figure shows the animation for the hidden states estimated with the Viterbi Decoding:

**Green rectangle**: actual **robot** location (**hidden state**)

**Red rectangle**: noisy sensor estimates for the robot location (**observed state**)

**Blue Rectangle**: HMM correction (estimated hidden states with **Viterbi**)