Inference Algorithms in Probabilistic Graphical Models: implementing the Sum-Product and Chow-Liu algorithms for tree learning in Python

The following problem appeared as the final project in the edX course MITx: 6.008.1x Computational Probability and Inference.

Coconut Oil Price Movements

You really want coconut oil, but you don’t know which of your four nearby markets to buy it from. You decide to do a detailed study of how these four markets relate, beginning with how coconut oil prices change across them.

The data provided consists of 2.8 years of daily coconut oil price movement data quantized to +1 (price went up in the past day), 0 (price stayed exactly the same in the past day), and -1 (price went down in the past day). Each line of the data file looks something like the following:

10/9/2012,-1,1,0,1

Note that there are five items: the date (month/day/year) followed by price movements for markets 0, 1, 2, and 3.

Let’s suppose for the purposes of this project that the price movements on different days are reasonably modeled as i.i.d.

(a) Implement the Chow-Liu algorithm for learning a tree for these four markets.

Learning the tree PGM involves the following steps:

(1) Compute the empirical distributions: First we need to compute the empirical distribution for each node potential (here we have 4 nodes, corresponding to the 4 markets) and the empirical joint distribution for each edge potential, before we can compute the mutual information between the random variables representing the nodes in the next step.
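Below is a minimal sketch of this step, assuming the data file layout from the sample line above; the function names (load_price_movements, empirical_distributions) are mine, not the course’s starter code.

from collections import Counter
from itertools import combinations

def load_price_movements(filename):
    """Parse lines like '10/9/2012,-1,1,0,1' into 4-tuples of ints."""
    data = []
    with open(filename) as f:
        for line in f:
            fields = line.strip().split(',')
            data.append(tuple(int(x) for x in fields[1:]))  # drop the date
    return data

def empirical_distributions(data):
    """Empirical node marginals p[i][x] and pairwise joints p2[(i, j)][(x, y)]."""
    n = len(data)
    node_counts = {i: Counter() for i in range(4)}
    pair_counts = {(i, j): Counter() for i, j in combinations(range(4), 2)}
    for row in data:
        for i in range(4):
            node_counts[i][row[i]] += 1
        for i, j in combinations(range(4), 2):
            pair_counts[(i, j)][(row[i], row[j])] += 1
    # normalize the counts into probabilities
    p = {i: {x: c / n for x, c in cnt.items()} for i, cnt in node_counts.items()}
    p2 = {ij: {xy: c / n for xy, c in cnt.items()} for ij, cnt in pair_counts.items()}
    return p, p2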

(2) Tree Structure Learning: Once the empirical distributions are computed, we can run the Chow-Liu algorithm to learn the tree structure, as shown below.

[Figure: Chow-Liu algorithm animation]
As can be seen from the mutual information matrix and from the Chow-Liu algorithm animation, the edges are selected in decreasing order of the corresponding mutual information.

[Figure: mutual information matrix]
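A hedged sketch of this step, building on the empirical distributions computed above: estimate the pairwise mutual information and then greedily keep the highest-MI edges that do not close a cycle (a Kruskal-style maximum spanning tree), which is exactly what the animation depicts.

import math
from itertools import combinations

def mutual_information(p_i, p_j, p_ij):
    """I(X; Y) = sum_{x, y} p(x, y) * log(p(x, y) / (p(x) * p(y)))."""
    return sum(pxy * math.log(pxy / (p_i[x] * p_j[y]))
               for (x, y), pxy in p_ij.items() if pxy > 0)

def chow_liu_tree(p, p2, num_nodes=4):
    """Chow-Liu: maximum spanning tree over pairwise mutual information."""
    mi = {(i, j): mutual_information(p[i], p[j], p2[(i, j)])
          for i, j in combinations(range(num_nodes), 2)}
    parent = list(range(num_nodes))            # union-find forest

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]      # path halving
            u = parent[u]
        return u

    edges = []
    for (i, j) in sorted(mi, key=mi.get, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:                           # keep the edge iff it adds no cycle
            parent[ri] = rj
            edges.append((i, j))
        if len(edges) == num_nodes - 1:
            break
    return edges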

(b) Once we have learnt the Chow-Liu tree, the next step is learning the parameters for this specific tree.

(1) Parameter learning: We need to find the MLE of the parameters, namely the node potentials and the conditional probability tables (CPTs) serving as edge potentials, estimated from the observations via the empirical (conditional) distributions, as shown below.

[Figure: MLE parameter estimation]
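A sketch of the MLE computation, producing parameters in the nested-dictionary format printed further below; storing the transpose of each CPT under the reversed edge key is an assumption I make so the potential can be looked up in either direction, and the code assumes every price movement value occurs at least once in the data.

def mle_parameters(p, p2, edges):
    """Node potentials = empirical marginals; for each tree edge (i, j),
    the edge potential is the empirical CPT p(x_j | x_i) = p(x_i, x_j) / p(x_i),
    with its transpose stored under (j, i)."""
    node_potentials = {i: dict(dist) for i, dist in p.items()}
    edge_potentials = {}
    for (i, j) in edges:
        joint = (p2[(i, j)] if (i, j) in p2
                 else {(a, b): v for (b, a), v in p2[(j, i)].items()})
        fwd, rev = {}, {}
        for (xi, xj), pxy in joint.items():
            fwd.setdefault(xi, {})[xj] = pxy / p[i][xi]   # p(x_j | x_i)
            rev.setdefault(xj, {})[xi] = pxy / p[i][xi]   # transpose of the same table
        edge_potentials[(i, j)] = fwd
        edge_potentials[(j, i)] = rev
    return node_potentials, edge_potentials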

The observations for the 4 markets:
[Figure: observations for the 4 markets]

Here are the parameters for the node potentials learnt from the data with MLE:
[Figure: node potentials learnt with MLE]

The parameters for the edge potentials learnt from the data with MLE are shown below (as a Python nested dictionary). For each tree edge (i, j), the table under key (i, j) stores the rows of the CPT p(x_j | x_i), and the table under the reversed key (j, i) stores its transpose:

{(0, 1): { 0: { 0: 0.8597997138769671,  1: 0.0715307582260372,   -1: 0.06866952789699571},
           1: { 0: 0.6608187134502924,  1: 0.29239766081871343,  -1: 0.04678362573099415},
          -1: { 0: 0.7697368421052632,  1: 0.05921052631578947,  -1: 0.17105263157894737}},
 (3, 0): { 0: { 0: 0.5135908440629471,  1: 0.2982456140350877,   -1: 0.28289473684210525},
           1: { 0: 0.2503576537911302,  1: 0.5555555555555556,   -1: 0.08552631578947369},
          -1: { 0: 0.23605150214592274, 1: 0.14619883040935672,  -1: 0.631578947368421}},
 (0, 2): { 0: { 0: 0.8497854077253219,  1: 0.08583690987124463,  -1: 0.06437768240343347},
           1: { 0: 0.6257309941520468,  1: 0.3391812865497076,   -1: 0.03508771929824561},
          -1: { 0: 0.6578947368421053,  1: 0.039473684210526314, -1: 0.3026315789473684}},
 (2, 0): { 0: { 0: 0.8497854077253219,  1: 0.6257309941520468,   -1: 0.6578947368421053},
           1: { 0: 0.08583690987124463, 1: 0.3391812865497076,   -1: 0.039473684210526314},
          -1: { 0: 0.06437768240343347, 1: 0.03508771929824561,  -1: 0.3026315789473684}},
 (1, 0): { 0: { 0: 0.8597997138769671,  1: 0.6608187134502924,   -1: 0.7697368421052632},
           1: { 0: 0.0715307582260372,  1: 0.29239766081871343,  -1: 0.05921052631578947},
          -1: { 0: 0.06866952789699571, 1: 0.04678362573099415,  -1: 0.17105263157894737}},
 (0, 3): { 0: { 0: 0.5135908440629471,  1: 0.2503576537911302,   -1: 0.23605150214592274},
           1: { 0: 0.2982456140350877,  1: 0.5555555555555556,   -1: 0.14619883040935672},
          -1: { 0: 0.28289473684210525, 1: 0.08552631578947369,  -1: 0.631578947368421}}}
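A quick sanity check of that convention (my reading of the numbers above, not something stated in the project): every row of a forward table should sum to 1, and the reversed key should hold the transpose. Assuming the learned star around market 0:

for (i, j) in [(0, 1), (0, 2), (0, 3)]:            # the learned tree edges
    fwd, rev = edge_potentials[(i, j)], edge_potentials[(j, i)]
    assert all(abs(sum(row.values()) - 1.0) < 1e-6 for row in fwd.values())
    assert all(fwd[xi][xj] == rev[xj][xi] for xi in fwd for xj in fwd[xi])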

(c) Having learnt both the tree and the parameters, we now have a complete graphical model for which we can solve specific inference tasks. First, let’s not worry about incorporating observations yet. Implement the function sum_product for running the general Sum-Product algorithm on a given tree-structured graphical model.

The sum-product algorithm for inferring the node marginals on a tree is shown below:

[Figure: the sum-product algorithm on trees]
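Below is a minimal, self-contained sketch of sum_product, written against the nested-dictionary potentials format used above; the two-pass (leaves-to-root, then root-to-leaves) message schedule is the standard one, but the exact signature expected by the project is an assumption.

def sum_product(nodes, edges, node_potentials, edge_potentials):
    """Exact marginals on a tree PGM via two-pass message passing.

    node_potentials[i][x] = phi_i(x);
    edge_potentials[(i, j)][xi][xj] = psi_ij(xi, xj), stored in both orders
    as in the nested dictionary above."""
    nodes = list(nodes)
    neighbors = {i: set() for i in nodes}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)

    # Root the tree anywhere; record parent pointers and a parents-first order.
    root = nodes[0]
    parent, order, stack = {root: None}, [], [root]
    while stack:
        i = stack.pop()
        order.append(i)
        for k in neighbors[i]:
            if k != parent[i]:
                parent[k] = i
                stack.append(k)

    messages = {}

    def send(i, j):
        """m_{i->j}(xj) = sum_xi phi_i(xi) psi_ij(xi, xj) prod_{k != j} m_{k->i}(xi)."""
        msg = {}
        for xj in node_potentials[j]:
            total = 0.0
            for xi, phi in node_potentials[i].items():
                v = phi * edge_potentials[(i, j)][xi][xj]
                for k in neighbors[i]:
                    if k != j:
                        v *= messages[(k, i)][xi]
                total += v
            msg[xj] = total
        z = sum(msg.values())                  # normalize to avoid underflow
        return {x: m / z for x, m in msg.items()}

    for i in reversed(order):                  # upward pass: leaves -> root
        if parent[i] is not None:
            messages[(i, parent[i])] = send(i, parent[i])
    for i in order:                            # downward pass: root -> leaves
        for k in neighbors[i]:
            if k != parent[i]:
                messages[(i, k)] = send(i, k)

    # Belief at each node: phi_i times all incoming messages, normalized.
    marginals = {}
    for i in nodes:
        belief = dict(node_potentials[i])
        for k in neighbors[i]:
            for x in belief:
                belief[x] *= messages[(k, i)][x]
        z = sum(belief.values())
        marginals[i] = {x: b / z for x, b in belief.items()}
    return marginals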

The following figure illustrates the different phases of the sum-product algorithm computing the node marginals on a different test dataset, the blue-green tree.
[Figure: phases of the sum-product algorithm on the blue-green tree]

(d) We want to be able to answer questions like “what if we see in the last day that the price moved up for markets 1 and 2, but we didn’t get to observe what happened with markets 0 and 3”? What can we say about the posterior marginals for markets 0 and 3? This amounts to incorporating the observations that markets 1 and 2 each have an observed value of +1. Fill in the general function compute_marginals_given_observations that lets us incorporate any subset of observations for the four markets and produces the corresponding posterior marginals.
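A hedged sketch of compute_marginals_given_observations, assuming the sum_product sketch above: conditioning on an observed value amounts to zeroing out every other entry of that node’s potential and re-running sum-product, after which the normalized beliefs are the posterior marginals. The exact signature is again an assumption.

def compute_marginals_given_observations(nodes, edges, node_potentials,
                                         edge_potentials, observations):
    """observations maps a subset of nodes to their observed values,
    e.g. {1: 1, 2: 1} for 'markets 1 and 2 each moved up'."""
    clamped = {}
    for i in nodes:
        if i in observations:
            # indicator potential: keep mass only on the observed value
            clamped[i] = {x: (phi if x == observations[i] else 0.0)
                          for x, phi in node_potentials[i].items()}
        else:
            clamped[i] = dict(node_potentials[i])
    return sum_product(nodes, edges, clamped, edge_potentials)

# For the question above, using the learned star around market 0:
posteriors = compute_marginals_given_observations(
    range(4), [(0, 1), (0, 2), (0, 3)],
    node_potentials, edge_potentials, {1: 1, 2: 1})
print(posteriors[0], posteriors[3])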

The posterior marginals computed at the nodes with the sum-product algorithm, using the new observations, are shown below:
[Figure: posterior marginals given the observations]
