Three numeric variables were selected: **lifeexpectancy**, **employrate**, and **internetuserate**. First, a *subset* of the dataset was created, since we were primarily interested in the countries with *lifeexpectancy* less than or equal to 70. All of the exploratory analysis was done on this subset.
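The subsetting step can be sketched in pandas as follows. This is a minimal illustration, not the original analysis code: the toy rows, the `country` column, and the names `data` and `sub` are assumptions, and the blank strings stand in for whatever missing-value markers the real file uses.

```python
import pandas as pd

# Toy rows standing in for the real dataset; all values are made up.
data = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "lifeexpectancy": ["55.2", "70.0", "81.3", " ", "64.8"],
    "employrate": ["61.5", "48.9", "57.2", "72.4", " "],
    "internetuserate": ["3.1", "40.2", "85.0", "12.7", "9.9"],
})

# Convert the three selected variables to numeric; unparseable entries become NaN.
cols = ["lifeexpectancy", "employrate", "internetuserate"]
for col in cols:
    data[col] = pd.to_numeric(data[col], errors="coerce")

# Keep only the countries with lifeexpectancy <= 70.
# Rows with a missing lifeexpectancy fail the comparison and are dropped.
sub = data[data["lifeexpectancy"] <= 70].copy()
print(sub)
```

Taking a `.copy()` of the filtered frame lets later steps add grouped columns to `sub` without pandas warning about modifying a slice of `data`.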

Since the selected variables are continuous, each variable was binned (grouped) into a few bins: the *employrate* and *lifeexpectancy* variables were binned into 4 equal-depth groups, whereas the *internetuserate* variable was binned into 3 equal-width groups. For each of the variables, the frequency tables were computed and the *missing* values were coded out. We tried to understand how *lifeexpectancy* varied across the countries in different groups of *employrate* and *internetuserate*.
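The two binning schemes can be sketched with `pd.qcut` (equal-depth, i.e. quantile-based) and `pd.cut` (equal-width). The data, the group labels, and the `_grp` column names below are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up values; one employrate is missing on purpose.
sub = pd.DataFrame({
    "employrate": [45.0, 50.0, 55.0, 58.0, 62.0, 66.0, 71.0, 80.0, np.nan],
    "internetuserate": [0.0, 2.0, 5.0, 9.0, 14.0, 20.0, 25.0, 30.0, 12.0],
})

# Equal-depth binning: 4 groups holding (roughly) the same number of countries.
sub["employrate_grp"] = pd.qcut(
    sub["employrate"], q=4, labels=["Q1", "Q2", "Q3", "Q4"]
)

# Equal-width binning: 3 groups spanning equal ranges of the variable.
sub["internetuserate_grp"] = pd.cut(
    sub["internetuserate"], bins=3, labels=["low", "medium", "high"]
)

# Frequency tables; dropna=False keeps the missing values visible as their own row.
emp_counts = sub["employrate_grp"].value_counts(sort=False, dropna=False)
net_counts = sub["internetuserate_grp"].value_counts(sort=False, dropna=False)
print(emp_counts)
print(net_counts)
```

Equal-depth bins give balanced group sizes even for a skewed variable, while equal-width bins keep the cut points evenly spaced at the cost of uneven counts.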

The *frequency distributions* of the managed (grouped) variables are shown below. As can be seen, the number of countries with (low) *employrate* less than 56 is 19, whereas the number of countries with (high) *employrate* greater than 70 is around 18. The table also shows that 3 countries in the subset have NaN values for *employrate*.

The heatmap of the crosstab below shows that there is a relatively high number of countries with low *internetuserate* but high *employrate*.
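A cross-tabulation of two grouped variables can be computed with `pd.crosstab`. The sketch below uses made-up group labels and hypothetical column names; it builds only the table of counts, which could then be passed to a plotting function such as `seaborn.heatmap` to draw the heatmap.

```python
import pandas as pd

# Made-up grouped data: one row per country.
sub = pd.DataFrame({
    "employrate_grp": ["low", "low", "high", "high", "high", "mid"],
    "internetuserate_grp": ["low", "high", "low", "low", "low", "low"],
})

# Rows: employrate groups; columns: internetuserate groups; cells: country counts.
ct = pd.crosstab(sub["employrate_grp"], sub["internetuserate_grp"])
print(ct)
```

In this toy table the largest cell is the combination of high *employrate* and low *internetuserate*, which is the kind of pattern a heatmap makes easy to spot.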

The scatter plot of *employrate* vs. *internetuserate* above shows that the countries with *lifeexpectancy* > 60 have a higher *internetuserate* in general. The box plots below show that *lifeexpectancy* is highest on average for the countries in the second-lowest *employrate* group, but for those in the highest *internetuserate* group.
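The group-wise comparison behind the box plots can also be checked numerically. The following is a minimal sketch with made-up values and hypothetical group labels; it uses the mean as the summary, which is an assumption, since box plots themselves mark the median.

```python
import pandas as pd

# Made-up values: life expectancy per country, with its employrate group.
sub = pd.DataFrame({
    "lifeexpectancy": [50.0, 54.0, 62.0, 66.0, 58.0, 60.0],
    "employrate_grp": ["Q1", "Q1", "Q2", "Q2", "Q3", "Q3"],
})

# Mean life expectancy within each employrate group.
group_means = sub.groupby("employrate_grp")["lifeexpectancy"].mean()

# Label of the group with the highest mean.
best_group = group_means.idxmax()
print(group_means)
print(best_group)
```

Replacing `.mean()` with `.median()` gives the value drawn as the center line of each box.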