The following problem appeared as the final project in the edX course **MITx: 6.008.1x Computational Probability and Inference**.

### Coconut Oil Price Movements

You really want coconut oil, but you don’t know which of your *four nearby markets* to buy it from. You decide to do a detailed study on how these four markets relate, beginning with how coconut oil prices change across them.

The data provided consists of 2.8 years of daily coconut oil price movement data quantized to +1 (price went up in the past day), 0 (price stayed exactly the same in the past day), and -1 (price went down in the past day). Each line of the data file looks something like the following:

```
10/9/2012,-1,1,0,1
```

Note that there are five items: the date (month/day/year) followed by price movements for markets 0, 1, 2, and 3.

Let’s suppose for the purposes of this project that the price movements on different days are reasonably modeled to be i.i.d.

**(a)** Implement the Chow-Liu algorithm for *learning a tree* for these four markets.

Learning the tree PGM involves the following steps:

(1) **Compute the empirical distributions**: First we need to compute the empirical distribution for each node potential (here we have 4 nodes corresponding to 4 markets) and the joint distribution for each edge potential, before we can compute the *mutual information* between the random variables representing the nodes in the next step.
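Step (1) can be sketched as follows. This is a minimal illustration (the function and variable names are my own, not from the course starter code), assuming the quantized price movements have been loaded into a NumPy array with one row per day and one column per market:

```python
import numpy as np
from collections import Counter
from itertools import combinations

def empirical_distributions(X):
    """Empirical node marginals and pairwise joint distributions.

    X: (num_days, num_markets) array with entries in {-1, 0, +1}.
    Returns ({node: {value: prob}}, {(i, j): {(vi, vj): prob}}).
    """
    n, m = X.shape
    # marginal distribution of each market's price movement
    node_marginals = {
        i: {v: c / n for v, c in Counter(X[:, i]).items()}
        for i in range(m)
    }
    # joint distribution for every unordered pair of markets
    pair_joints = {
        (i, j): {vw: c / n for vw, c in Counter(zip(X[:, i], X[:, j])).items()}
        for i, j in combinations(range(m), 2)
    }
    return node_marginals, pair_joints
```

With i.i.d. days, these relative frequencies are exactly the maximum-likelihood estimates of the marginal and pairwise distributions.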

(2) **Tree Structure Learning**: Once the empirical distributions are computed, we can run the *Chow-Liu* algorithm to learn the tree structure, as shown below.

As can be seen from the *mutual information matrix* and from the Chow-Liu algorithm animation, the edges are selected in decreasing order of the corresponding mutual information.
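A compact sketch of the Chow-Liu step, assuming the empirical distributions are stored as the nested dictionaries produced above (helper names are illustrative): compute the pairwise mutual information, then greedily grow a maximum-weight spanning tree, Kruskal-style, with a union-find over the nodes:

```python
from math import log

def mutual_information(joint, px, py):
    """I(X;Y) = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) )."""
    return sum(pxy * log(pxy / (px[x] * py[y]))
               for (x, y), pxy in joint.items() if pxy > 0)

def chow_liu(node_marginals, pair_joints):
    """Maximum-weight spanning tree over mutual-information edge weights."""
    weights = sorted(
        ((mutual_information(pair_joints[(i, j)],
                             node_marginals[i], node_marginals[j]), i, j)
         for (i, j) in pair_joints),
        reverse=True)                      # largest MI first
    parent = {v: v for v in node_marginals}  # union-find forest
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    edges = []
    for _, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:                       # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

Because mutual information is always non-negative, picking the heaviest acyclic edges in order yields the tree that maximizes the sum of edge mutual informations, which is exactly the Chow-Liu maximum-likelihood tree.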

**(b)** Once the Chow-Liu tree is learned, the next step is of course *learning the parameters* for this specific tree.

(1) **Parameter learning**: We need to find the maximum-likelihood estimates (MLE) of the parameters: the node potentials, and the conditional probability tables (CPTs) serving as edge potentials, both computed from the empirical (conditional) distributions of the observations, as shown below.
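A minimal sketch of the MLE parameter-learning step, under one common parameterization of a tree model: pick a root, use its empirical marginal as the node potential, and use the empirical conditionals p(child | parent) as CPT edge potentials. The function and variable names here are my own:

```python
import numpy as np
from collections import Counter

def learn_parameters(X, edges, root=0):
    """MLE parameters for a rooted tree: root marginal + CPTs p(child|parent).

    X: (num_days, num_markets) array; edges: undirected tree edges.
    """
    n = X.shape[0]
    # build adjacency lists so we can orient the tree away from the root
    adj = {}
    for i, j in edges:
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    node_pot = {root: {v: c / n for v, c in Counter(X[:, root]).items()}}
    cpts = {}
    visited, frontier = {root}, [root]
    while frontier:                        # DFS from the root
        u = frontier.pop()
        for v in adj.get(u, []):
            if v in visited:
                continue
            visited.add(v)
            frontier.append(v)
            # p(v | u): joint counts normalized by the parent's counts
            joint = Counter(zip(X[:, u], X[:, v]))
            parent_counts = Counter(X[:, u])
            child_vals = set(X[:, v])
            cpts[(u, v)] = {
                pu: {cv: joint[(pu, cv)] / cnt for cv in child_vals}
                for pu, cnt in parent_counts.items()}
    return node_pot, cpts
```

Each CPT row is an empirical conditional distribution, which is the MLE under the i.i.d. assumption; note the printed edge-potential dictionary below stores both orientations of each edge.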

The observations for the 4 markets:

Here are the parameters for the *node potentials* learnt from the data with MLE:

The parameters for the *edge potentials* learnt from the data with MLE (in Python nested-dictionary format):

```python
{(0, 1): {0: {0: 0.8597997138769671, 1: 0.0715307582260372, -1: 0.06866952789699571},
          1: {0: 0.6608187134502924, 1: 0.29239766081871343, -1: 0.04678362573099415},
         -1: {0: 0.7697368421052632, 1: 0.05921052631578947, -1: 0.17105263157894737}},
 (3, 0): {0: {0: 0.5135908440629471, 1: 0.2982456140350877, -1: 0.28289473684210525},
          1: {0: 0.2503576537911302, 1: 0.5555555555555556, -1: 0.08552631578947369},
         -1: {0: 0.23605150214592274, 1: 0.14619883040935672, -1: 0.631578947368421}},
 (0, 2): {0: {0: 0.8497854077253219, 1: 0.08583690987124463, -1: 0.06437768240343347},
          1: {0: 0.6257309941520468, 1: 0.3391812865497076, -1: 0.03508771929824561},
         -1: {0: 0.6578947368421053, 1: 0.039473684210526314, -1: 0.3026315789473684}},
 (2, 0): {0: {0: 0.8497854077253219, 1: 0.6257309941520468, -1: 0.6578947368421053},
          1: {0: 0.08583690987124463, 1: 0.3391812865497076, -1: 0.039473684210526314},
         -1: {0: 0.06437768240343347, 1: 0.03508771929824561, -1: 0.3026315789473684}},
 (1, 0): {0: {0: 0.8597997138769671, 1: 0.6608187134502924, -1: 0.7697368421052632},
          1: {0: 0.0715307582260372, 1: 0.29239766081871343, -1: 0.05921052631578947},
         -1: {0: 0.06866952789699571, 1: 0.04678362573099415, -1: 0.17105263157894737}},
 (0, 3): {0: {0: 0.5135908440629471, 1: 0.2503576537911302, -1: 0.23605150214592274},
          1: {0: 0.2982456140350877, 1: 0.5555555555555556, -1: 0.14619883040935672},
         -1: {0: 0.28289473684210525, 1: 0.08552631578947369, -1: 0.631578947368421}}}
```

**(c)** After learning both the *tree* and the *parameters*, of course this means that we now have a *graphical model*, for which we can solve *specific inference* tasks. First, let’s not worry about incorporating observations yet. Implement the function `sum_product` for running the general Sum-Product algorithm for a given tree-structured graphical model.

The *sum-product algorithm* on trees for inferring the *marginals* is shown below:

The following figure illustrates the different phases of the sum-product algorithm on trees, learning the *marginals* at the nodes on a different test dataset, the *blue-green tree*.

**(d)** We want to be able to answer questions like “what if we see in the last day that the price moved up for markets 1 and 2, but we didn’t get to observe what happened with markets 0 and 3”? What can we say about the posterior marginals for markets 0 and 3? This amounts to incorporating the observations that markets 1 and 2 each have observed values of +1. Fill in the general function `compute_marginals_given_observations` that lets us incorporate any subset of observations for the four markets and produces the corresponding posterior marginals.

The *posterior marginals* computed at the nodes with the *sum-product algorithm*, using the new observations, are shown below:
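One standard way to incorporate observations (and a plausible reading of what `compute_marginals_given_observations` should do, though the starter code's interface may differ) is to clamp each observed node's potential, zeroing out every value inconsistent with the observation, and then rerun sum-product on the modified potentials; the normalized marginals that come out are the posterior marginals. A minimal sketch of the clamping step, with names of my own choosing:

```python
def condition_node_potentials(node_pot, observations):
    """Clamp observed nodes to their observed values.

    node_pot: {node: {value: phi}}; observations: {node: observed_value}.
    Returns a new dict where each observed node keeps weight only on its
    observed value; unobserved nodes are left unchanged.
    """
    clamped = {}
    for u, table in node_pot.items():
        if u in observations:
            clamped[u] = {v: (phi if v == observations[u] else 0.0)
                          for v, phi in table.items()}
        else:
            clamped[u] = dict(table)   # copy so the original is untouched
    return clamped
```

For the question above, passing `{1: 1, 2: 1}` as the observations and feeding the clamped potentials to the sum-product routine yields the posterior marginals at markets 0 and 3.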