SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India in 2010 (Simulation in Python & R)

This appeared as a project in the edX course DelftX: MathMod1x Mathematical Modelling Basics and the project report can be found here. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Summary

In this report, the spread of the pandemic influenza A (H1N1) that had an outbreak in Kolkata, West Bengal, India, 2010 is going to be simulated. The basic epidemic SIR model will be used, it describes three populations: a susceptible population, an infected population, and a recovered population and assumes the total population (sum of these 3 populations) as fixed over the period of study.

There are two parameters for this model: namely the attack rate (β) per infected person per day through contacts and the recovery rate (α). Initially there will be a small number of infected persons in the population. Now the following questions are to be answered with the simulation / analysis:

1. Whether the number of infected persons increase substantially, producing an epidemic, or the flue will fizzle out.
2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?
3. How long will the epidemic last?

The Euler’s method will be primarily used to solve the system of differential equations for the SIR model and compute the equilibrium points (along with some
analytic solution attempts for a few simplified special cases). Here are the
conclusions obtained from the simulations:

1. When the recovery rate α is ≈ 0 or very very low compared to the attack rate β, the flu will turn out to be an epidemic and the entire population will be infected first (the higher β is the quicker the epidemic outbreak).

2. To be more precise, when the initial susceptible population S(0) is greater than the inverse of the basic reproduction number 1/ R0 = α / β, a proper epidemic will break out.

3. When the initial susceptible population S(0) is less than the inverse of the basic reproduction number 1/R0=α/β, then a proper epidemic will never break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end in the equilibrium point with 0 infected population, how fast it reaches the equilibrium depends upon the recovery rate (the higher α is the quicker the infection removal).

6. The time to reach the equilibrium can be computed using Euler’s method, it depends on the parameters α (the higher the quicker) and β (the higher the quicker) and the initial infected populated size I(0) (the higher the quicker).

lv.png


Introduction

In 2010, the pandemic influenza A (H1N1) had an outbreak in Kolkata, West Bengal, India. An increased number of cases with influenza like illness (ILI) were reported in Greater Kolkata Metropolitan Area (GKMA) during July and August 2010, as stated in [3]. The main motivation for this research project will be to understand the spread of the pandemic, compute the equilibrium points and find the impact of the initial values of the infected rate and the attack / recovery rate parameters on the spread of the epidemic, using simulations using the basic epidemic SIR model.

The Euler’s method will be primarily used to solve the system of differential equations
for SIR model and compute the equilibrium points. First a few simplified special cases will be considered and both analytic and numerical methods (with Euler method) will be used to compute the equilibrium points. Then the solution for the generic model will be found. As described in [6], the SIR model can also be effectively used (in a broader
context) to model the propagation of computer virus in computer networks, particularly for the networks with Erdos-Renyi type random graph topology.


SIR Epidemic Model

The SIR model is an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. One of the basic one strain SIR models is Kermack-McKendrick Model. The Kermack-McKendrick Model is used to explain the rapid rise and fall in the number of infective patients observed in epidemics. It assumes that the population size is fixed (i.e., no births, no deaths due to disease nor by natural causes), incubation period of the infectious agent
is instantaneous, and duration of infectivity is the same as the length of the disease. It also assumes a completely homogeneous population with no age, spatial, or social structure.

The following figure 2.1 shows an electron microscope image of the re-assorted H1N1 influenza virus photographed at the CDC Influenza Laboratory. The viruses are 80 − 120 nm in diameter [1].

virus.png

2.1 Basic Mathematical Model

The starting model for an epidemic is the so-called SIR model, where S stands for susceptible population, the people that can be infected. I is the already infected population, the people that are contagious, and R stands for the recovered population, people who are not contagious any more.

2.1.1 Differential Equations

The SIR model can be defined using the following ordinary differential equations 2.1:

dif1

• The terms dS/dtdI/dtdR/dt in the differential equations indicate the rate of change of the susceptible population size, the infected population size and the recovered population size, respectively.

• The terms β and α indicate the attack rate (number of susceptible persons get infected per day) and the recovery rate of the flu (inverse of the number of days a person remains infected), respectively.

• High value of α means a person will be infected by the flu for less number of days and high value of β means that the epidemic will spread quickly.

• Also, as can be seen from below, from the differential equations it can be shown that the population (S + I + R) is assumed to be constant.

dif2.png

2.1.2 Collected data, units and values for the constants

• As can be seen from the following figure 2.2, the focus of this analysis will be limited to  the population in Kolkata Metropolitan Corporation (KMC, XII) area where the population can be assumed to be ≈ 4.5 million or 4500 thousands, as per [7].

Units

– All population (S, I, R) units will be in thousands persons (so that total population
N = 4500).
– As can be derived from the differential equations 2.1, the unit of β will be in
10^(−6) /persons/day (β = 25 will mean 25 persons in a million gets infected by
susceptible-infected contact per infected person per day).
– Similarly, the units of α will be in 10^(−3) / day (α = 167 will mean 167 × 10^(−3) /
day gets recovered from the flu per day).

• The attack rate is 20-29/100000 and the number of days infected (i.e. the inverse of recovery rate) = 5−7 days on average (with a few exceptions), as per [3].

• Typical values for β and α can be assumed to be 25 /person / day and 10^3/6 ≈ 167 / day, respectively.

f1.jpg

2.2 Simplified Model 1 (with α = 0)

• At first a simplified model is is created assuming that α = 0 (/ day) and that R = 0, so once infected, a person stays contagious for ever. Because S(t) + I(t) + R(t) = S(t) + I(t) = N is constant (since population size N is fixed), S(t) can be eliminated and a single differential equation in just I(t) is obtained as shown in the equation below 2.2.

f2

• Also, let the (fixed) population size N = 4500 = S(0) + I(0), (in thousand persons), initially the number of persons infected = I(0) = 1 (in thousand persons) and number of persons susceptible S(0) = N −I(0) = 4499 (in thousand persons), respectively. Let β = 25 × 10^(−6) /persons / day) to start with.

2.2.1 Analytic Solution

• The analytic solution can be found by following the steps shown in the Appendix A and the final solution is shown in the below equations 2.3:

f3.png
• The following figure 2.3 shows the logistic (bounded) growth in I(t) (in thousands persons) w.r.t. the time (in days) for different values of attack rate β×10^(−6) (/ person / day). As expected, the higher the attack rate, the quicker all the persons in the population become infected.

f4.png

2.2.2 Finding the equilibrium points for I

• The equilibrium points are the points where the rate of change in I is zero, the points that satisfy the following equation

f5

• Considering a small neighborhood of the equilibrium point at I = 0, it can be seen from the figure 2.4 that whenever I > 0, dI/dt > 0, so I increases and goes away from the equilibrium point.

• Hence, the equilibrium point at I = 0 is unstable.

• At I = N = 4500 (in thousand persons) it is a stable equilibrium. As can be seen from the following figure 2.4, in a small neighborhood of the equilibrium point at I = 4500, it always increases / decreases towards the equilibrium point.

• In a small  ε > 0 neighborhood at I = 4500 (in thousand persons),

1. dI/dt > 0, so I increases when I <= 4500 − ε .
2. dI/dt > 0, so I decreases when I >= 4500 +  ε .

• The same can be observed from the direction fields from the figure 2.5.

• Hence, the equilibrium at I = 4500 is stable.

f6.png

f7.png

2.2.3 Numerical Solution with the Euler’s Method

• The algorithm (taken from the course slides) shown in the following figure 2.6 will be used for numerical computation of the (equilibrium) solution using the Euler’s method.

f8.png

• As can be seen from the figure 2.6, then the infection at the next timestep can be (linearly) approximated (iteratively) by the summation of the the infection current timestep with the product of the difference in timestep and the derivative of the infection evaluated at the current timestep.

2.2.3.1 Finding the right step size (with β = 25 × 10^(−6)/person/day)

• In order to decide the best step size for the Euler method, first the Euler’s method is run with different step sizes as shown in the figure 2.7.

• As can be seen from the following table 2.1 and the figure 2.7, the largest differences in the value of I (with two consecutive step sizes) occurs around 78 days:

f9.png

f10.png
• As can be seen from the table in the Appendix B, the first time when the error becomes less than 1 person (in thousands) is with the step size 1/512 , hence this step size will be used for the Euler method.

2.2.3.2 Computing the (stable) equilibrium point

• Now, this timestep will be used to solve the problem to find the equilibrium time teq (in days). Find teq such that N − I(teq) < ε = 10^(−6) , the solution obtained is teq = 272.33398 days ≈ 273 days.

• Now, from the analytic solution 2.3 and the following figure 2.8, it can be verified that the teq solution that the Euler’s method obtained is pretty accurate (to the ε tolerance).

f11.png

2.2.3.3 Results with β = 29 × 10^(−6) / person / day, I(0) = 1 person

• Following the same iterations as above, the steepest error is obtained at t = 67 days in this case, as shown in the figure 2.9.

• The first time when error becomes less than one person for t = 67 days with the Euler ‘s method is with step size 1/512 again.

• The solution obtained is teq = 234.76953125 days ≈ 235 days, so the equilibrium (when all the population becomes infected) is obtained earlier as expected, since the attack rate β is higher.

f12.png

2.2.3.4 Results with β = 25 × 10^(−6) / person / day, with different initial values for infected persons (I(0))

• Following the same iterations as above, the equilibrium point is computed using the Euler’s method with different values of initial infected population I(0), as shown in the figure 2.10.

• The solutions obtained are teq = 272.33, 258.02, 251.85, 248.23, 245.66, 245.66 days for I(0) = 1, 5, 10, 15, 20 days, respectively. So the equilibrium is obtained earlier when the initial infected population size is higher, as expected.

f13.png

2.3 Simplified Model 2 (with β = 0)

• Next, yet another simplified model is considered by assuming that β = 0 and that α > 0, so the flu can no more infect anyone (susceptible, if any, possibly because everyone got infected), an infected person recovers from flu with rate α. This situation can be described again with a single differential equation in just I(t) as shown in the equation below 2.4.

f14

• Also, let the the entire population be infected, N = 4500 = I(0), (in thousand persons), initially the number of persons susceptible = S(0) = 0, respectively. Let α = 167 × 10^(−3)   (/ day) to start with.

2.3.1 Analytic Solution

• The analytic solution can be found by following the steps shown in the below equations 2.5:

f15

• The following figure 2.11 shows the exponential decay in I(t) (in thousand persons)  w.r.t. the time (in days) for different values of recovery rate α × 10^(−3)  (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population get rid of the infection.

f16.png

• Now, I(t) + R(t) = N (since S(t) = 0 forever, since no more infection) and I(0) = N, combining with the above analytic solution I(t) = I(0).exp(−αt) = N.exp(−αt), the following equation is obtained:
f17

• The following figure 2.12 shows the growth in R(t) (in thousand persons) w.r.t. the time (in days) for different values of recovery rate α × 10^(−3) (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population move to the removed state.

f18.png

2.3.2 Numerical Solution with the Euler’s Method

2.3.2.1 Solution with α = 167 × 10−3 / day

• Following the same iterations as above, the steepest error is obtained at t = 6 in this case, as shown in the figure 2.16.

• The first time when error becomes less than one person for t = 67 with the Euler’s method is with step size 1/256 .

• The solution obtained with the Euler’s method is 133.076171875 days ≈ 133 days to remove the infection from population with 10^(−6) tolerance. From the analytic solution,
I(133) = N.exp(−αt) = 1.016478E−06, similar result is obtained.

2.3.2.2 Results

The following figure 2.16 shows the solutions obtained with different step sizes using the Euler’s method.

f19.png

2.4 Generic Model (with α, β > 0)

First, the numeric solution will be attempted for the generic model (using the Euler’s method) and then some analytic insights will be derived for the generic model.

2.4.1 Numerical Solution with the Euler’s  Method

• The following algorithm 2.14 shown in the next figure is going to be used to obtain the solution using Euler method (the basic program for Euler’s method, adapted to include three dependent variables and three differential equations).

f20.png

• As can be seen from the figure 2.14, first the vector X(0) is formed by combining the three variables S, I, R at timestep 0. Then value of the vector at the next timestep can be (linearly) approximated (iteratively) by the (vector) summation of the vector value at the current timestep with the product of the difference in timestep and the derivative of the
vector evaluated at the current timestep.

2.4.1.1 Equilibrium points

• At the equilibrium point,

f21

There will be no infected person at the equilibrium point (infection should get removed).

• As can be seen from the following figure 2.15 also, I = 0 is an equilibrium point, which is quite expected, since in the equilibrium all the infected population will move to the removed state.

• Also, at every point the invariant S + I + R = N holds.

• In this particular case shown in figure 2.15, the susceptible population S also becomes 0 at equilibrium (since all the population got infected initially, all of them need to move to removed state) and R = N = 4500 (in thousand persons).

f22.png

2.4.1.2 Results with Euler method

• As explained in the previous sections, the same iterative method is to find the right stepsize for the Euler method. The minimum of the two stepsizes determined is
∆t = 1/512 day and again this stepsize is going to be used for the Euler’s method.

• The following figures show the solutions obtained with different values of α, β with the initial infected population size I(0) = 1 (in thousand persons). Higher values for the parameter β obtained from the literature are used for simulation, since β = 25 × 10^(−6) /person /day is too small (with the results not interesting) for the growth of the epidemic using the Euler’s method (at least till ∆t = 1/ 2^15), after which the iterative Euler’s
method becomes very slow).

• As can be seen, from the figures 2.16, 2.17 and 2.19, at equilibrium, I becomes zero.

• The solution (number of days to reach equilibrium) obtained at α = 167×10^(−3) /day and β = 25×10^(−5) /person /day is teq = 143.35546875 ≈ 144 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.16.

• The solution (number of days to reach equilibrium) obtained at α = 167 × 10^(−3) /day and β = 5 × 10^(−5) /person /day is teq ≈ 542 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.17.

• Hence, higher the β value, the equilibrium is reached much earlier.

• The solution obtained at α = 500 × 10^(−3) /day and β = 25 × 10^(−5) /person /day is
teq ≈ 78 days with I(0) = 1 (in thousand persons), the corresponding figure is figure 2.19.

• Hence, higher the α value, the equilibrium is reached earlier.

• The solution obtained at α = 167×10^(−3) /day and β = 25×10^(−5) /person /day is
teq = 140 days with I(0) = 10. Hence, as expected, higher the number of initial infected population size, quicker the equilibrium is reached.

• At equilibrium, S does not necessarily become close to zero, since sometimes the entire  population may not get infected ever, as shown in the figure 2.17, where at equilibrium the susceptible population is non-zero.

f23f24f25f26

• As can be seen from the phase planes from following figure 2.21, at equilibrium, the infected population becomes 0.

f27.png

2.4.2 Analytic Solution and Insights

2.4.2.1 Basic Reproduction Number (R0)

The basic reproduction number (also called basic reproduction ratio) is defined by
R0 = β / α (unit is /day). As explained in [2], this ratio is derived as the expected number of new infections (these new infections are sometimes called secondary infections) from a single infection in a population where all subjects are susceptible. How the dynamics of the system depends on R0 will be discussed next.

2.4.2.2 The dynamics of the system as a function of R0

• By dividing the first equation by the third in 2.1, as done in [2], the following equation is obtained:

f28.png

• Now, at t → ∞, the equilibrium must have been already reached and all infections must have been removed, so that lim (t→∞) I(t) = 0.

• Also, let R∞ = lim (t→∞) R(t).

• Then from the above equation 2.7, R∞ = N − S(0).exp(R0.(R∞−R(0)))
.
• As explained in [2], the above equation shows that at the end of an epidemic, unless
S(0) = 0, not all individuals of the population have recovered, so some must remain  susceptible.

• This means that the end of an epidemic is caused by the decline in the number of infected individuals rather than an absolute lack of susceptible subjects [2].

• The role of the basic reproduction number is extremely important, as explained in [2]. From the differential equation, the following equation can be obtained:

f29

S(t) > 1/R0 ⇒ dI(t)/dt > 0 ⇒ there will be a proper epidemic outbreak with an increase of the number of the infectious (which can reach a considerable fraction of the population).

S(t) < 1 R0 ⇒ dI(t) dt < 0 ⇒ independently from the initial size of the susceptible population the disease can never cause a proper epidemic outbreak.

• As can be seen from the following figures 2.21 and 2.22 (from the simulation results obtained with Euler method), when S(0) > 1/R0 , there is a peak in the infection curve, indicating a proper epidemic outbreak.

• Also, from the figures 2.21 and 2.22, when S(0) > 1/R0 , the higher the the gap between S(0) and 1/R0 , the higher the peak is (the more people get infected) and the quicker the peak is attained.

• Again, from the figure 2.22, when 4490 = S(0) < 1/R0 = 5000, it never causes a proper epidemic outbreak .

f30

f31.png

 

• Again, by dividing the second equation by the first in 2.1, the following equation is obtained:

f32.png

f33.png

 

• As can be noticed from the above figure 2.23 that because the formulas differ only by an additive constant, these curves are all vertical translations of each other.

• The line I(t) = 0 consists of equilibrium points.

• Starting out at a point on one of these curves with I(t) > 0, as time goes on one needs to travel along the curve to the left (because dS/dt < 0), eventually approaching at some positive value of S(t).

• This must happen since on any of these curves, as I(t) → ∞, as S(t) → 0, from equation 2.8.

• So the answer to question (2) is that the epidemic will end as with approaching some positive value and thus there must always be some susceptibles left over.

• As can be seen from the following figure 2.24 (from the simulation results obtained with the Euler’s method), when S(0) > 1/R0 , lesser the the gap between S(0) and 1/R0 , the higher the population remains susceptible at equilibrium (or at t → ∞).

f34.png

Conclusions

In this report, the spread of the pandemic influenza A (H1N1) that had an outbreak in Kolkata, West Bengal, India, 2010 was simulated using the basic epidemic SIR model.Initially there will be a small number of infected persons in the population, most of the population had susceptible persons (still not infected but prone to be infected) and zero removed persons. Given the initial values of the variables and the parameter (attack and recovery rates of the flu) values, the following questions were attempted to be answered with the simulation / analysis:

1. Whether the number of infected persons increase substantially, producing an epidemic, or the flue will fizzle out.

2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?

3. How long will the epidemic last?
The following conclusions are obtained after running the simulations with
different values of the parameters and the initial values of the variables:

1. When the recovery rate α is ≈ 0 or very very low compared to the attack rate β (so that R0 = β / α >> 1) and I(0) > 1, the flu will turn out to be an epidemic and the entire population will be infected first (the higher β is the quicker the epidemic break out).

2. To be more precise, when the initial susceptible population S(0) is greater than the inverse of the basic reproduction number 1/R0 = α / β, a proper epidemic will break out.

3. When the initial susceptible population S(0) is less than the inverse of the basic reproduction number 1/R0 = α/β, then a proper epidemic will never break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end in the equilibrium point with 0 infected population, how fast it reaches the equilibrium depends upon the recovery rate (the higher α is the quicker the infection removal).

6. The time to reach the equilibrium can be computed using Euler method, it depends on the parameters α (the higher the quicker) and β (the higher the quicker) and the initial infected populated size I(0) (the higher the quicker).

7. Scope of improvement: The SIR model could be extended to The Classic Endemic Model [5] where the birth and the death rates are also considered for the population (this will be particularly useful when a disease takes a long time to reach the equilibrium state).

f35.png

f36.png

f37.png

 

Advertisements

One thought on “SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India in 2010 (Simulation in Python & R)

  1. Pingback: Sandipan Dey: SIR Epidemic model for influenza A (H1N1): Modeling the outbreak of the pandemic in Kolkata, West Bengal, India in 2010 (Simulation in Python & R) | Adrian Tudor Web Designer and Programmer

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s