Some Statistics Concepts: Probability integral transformations

  • This article discusses the probability integral transformation and some of its applications.

Probability Integral Transformation

  • First let’s convince ourselves that a continuous random variable transformed by its own CDF always has a U(0,1) distribution. The proof is shown in the figure below.
    prob_integral.png
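The same argument can be written out in a few lines. This is a sketch under the simplifying assumption that F_X is continuous and strictly increasing, so that the inverse F_X^{-1} exists. With Y = F_X(X):

```latex
\begin{aligned}
F_Y(y) &= P(Y \le y) = P\big(F_X(X) \le y\big) \\
       &= P\big(X \le F_X^{-1}(y)\big) \\
       &= F_X\big(F_X^{-1}(y)\big) = y, \qquad 0 < y < 1,
\end{aligned}
```

which is exactly the CDF of a U(0,1) random variable.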
  • Let’s draw n=1000 i.i.d. samples from X∼Exp(λ=5) using the R function rexp and then transform the variable by its own CDF F_X(x)=1−exp(−λx), so that Y=F_X(X). The figure below shows that the empirical distribution function F_Y matches the CDF of U(0,1), i.e. Y∼U(0,1).
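    n <- 1000
    lambda <- 5
    x <- rexp(n, lambda)           # samples from Exp(lambda)
    y <- 1 - exp(-lambda*x)        # transform X by its own CDF
    y_sorted <- sort(y)
    F_y <- seq_along(y_sorted)/n   # empirical CDF of Y at the sorted sample points
    par(mfrow=c(1,3))
    hist(x, col='blue')
    # plot F_y against the sorted y values (not against the index), so the x-axis really is y
    plot(y_sorted, F_y, col='green', pch=19, xlab='y', ylab=expression('F'['Y']), main=expression(paste('Y=1-e'^ paste('-',lambda,'X'))))
    plot(ecdf(y), col='red', pch=19, xlab='y', ylab='ECDF(y)')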

    cdf.png
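As a numerical complement to the plots, one possible check is a Kolmogorov-Smirnov test of the transformed sample against U(0,1), using base R's ks.test. This is only a sketch and not part of the original analysis; the seed and the variable name ks are illustrative assumptions.

```r
set.seed(42)                  # assumed seed, only for reproducibility
n <- 1000
lambda <- 5
x <- rexp(n, lambda)          # X ~ Exp(lambda)
y <- 1 - exp(-lambda * x)     # Y = F_X(X)
ks <- ks.test(y, "punif")     # compare the sample of Y with the U(0,1) CDF
print(ks$statistic)           # largest gap between the ECDF of y and the U(0,1) CDF
print(ks$p.value)
```

A small test statistic (and a p-value that is not close to zero) is consistent with Y∼U(0,1).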

  • Application

    • Now suppose that all we have is a function that can generate i.i.d. samples from the X∼U(0,1) distribution, and we want to use it to generate i.i.d. samples from some other distribution, say Exp(λ=5), Geom(p=0.3) or Laplace(μ=0,b=4).
    • We can use a U(0,1) random variable transformed by the inverse CDF corresponding to the other distribution to get a random variable with that CDF.
    • Let’s use the above facts to draw n=1000 samples from a Y∼Exp(λ=5) distribution, using only samples drawn from X∼U(0,1) with the R function runif and the probability integral transform.
      • First draw n samples from X∼U(0,1).
      • Transform them with the inverse CDF of the exponential distribution: Y=F_Y^{-1}(X)=−(1/λ)ln(1−X).
      • Now Y has the CDF F_Y(y)=1−exp(−λy), i.e. Y∼Exp(λ).
    • Let’s compare the histograms of the samples drawn from Y∼Exp(λ=5) using the probability integral transform and using the R function rexp. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
      exp
      p1animation.gif
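The exponential case can be sketched in R as follows. This is a minimal illustration, not necessarily the code behind the figure: the seed and the names u, y_pit and y_ref are assumptions, and quantiles are compared in place of histograms to keep the example short.

```r
set.seed(1)                          # assumed seed, for reproducibility only
n <- 10000
lambda <- 5
u <- runif(n)                        # X ~ U(0,1)
y_pit <- -log(1 - u) / lambda        # Y = F^{-1}(X) for Exp(lambda)
y_ref <- rexp(n, lambda)             # reference draws from R's built-in sampler

# Sample quantiles from both methods next to the theoretical quantiles
probs <- c(0.25, 0.5, 0.75, 0.95)
print(rbind(pit    = quantile(y_pit, probs),
            rexp   = quantile(y_ref, probs),
            theory = qexp(probs, lambda)))

# Rough timings; exact numbers depend on the machine
print(system.time(-log(1 - runif(n)) / lambda))
print(system.time(rexp(n, lambda)))
```

Both sample means should be close to the theoretical mean 1/λ = 0.2.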
    • Similarly, let’s use the probability integral transform to draw samples from Y∼Geom(p=0.1) using only X∼U(0,1), transformed with the inverse CDF
      ln(1−x)/ln(1−p) of the geometric distribution (rounded up to the next integer), since the geometric distribution
      has the CDF 1−(1−p)^x, and then compare with the ones drawn using the R function rgeom. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
      geom.png
      p2
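A possible R sketch for the geometric case is given below. One detail matters when comparing with rgeom: the CDF 1−(1−p)^x describes the number of trials up to and including the first success (support 1, 2, ...), whereas R's rgeom returns the number of failures before the first success (support 0, 1, ...). The sketch therefore subtracts 1 before comparing. The seed and the variable names are illustrative assumptions.

```r
set.seed(1)                                      # assumed seed, for reproducibility only
n <- 10000
p <- 0.1
u <- runif(n)                                    # X ~ U(0,1)
# Inverse CDF, rounded up: number of trials until the first success (1, 2, ...)
trials <- ceiling(log1p(-u) / log(1 - p))        # log1p(-u) is log(1 - u)
y_pit <- trials - 1                              # failures before first success, as in rgeom
y_ref <- rgeom(n, p)                             # reference draws from R's built-in sampler

print(c(pit = mean(y_pit), rgeom = mean(y_ref), theory = (1 - p) / p))
# Estimated P(Y = 0) from both samples next to the exact value
print(c(pit = mean(y_pit == 0), rgeom = mean(y_ref == 0), theory = dgeom(0, p)))
```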
    • Again, let’s use the probability integral transform to draw samples from Y∼Laplace(μ=0,b=4) using only X∼U(0,1), transformed with the following inverse CDF:
      F^{-1}(u) = μ − b·sgn(u − 1/2)·ln(1 − 2|u − 1/2|)
      since the Laplace distribution has the following CDF:
      F(x) = 1/2 + (1/2)·sgn(x − μ)·(1 − exp(−|x − μ|/b))
      Then compare with the ones drawn using the R function rlaplace. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
      laplace
      p3
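A sketch of the Laplace case in R follows. Since rlaplace is not part of base R (it comes from an add-on package, and the text does not say which one), this example checks the samples against the theoretical quantiles, mean and variance instead. The helper name qlaplace_pit is hypothetical and introduced only for this illustration.

```r
set.seed(1)                                   # assumed seed, for reproducibility only
n <- 10000
mu <- 0
b <- 4

# Hypothetical helper: inverse CDF of the Laplace(mu, b) distribution
qlaplace_pit <- function(u, mu, b) {
  mu - b * sign(u - 0.5) * log(1 - 2 * abs(u - 0.5))
}

y_pit <- qlaplace_pit(runif(n), mu, b)        # transform U(0,1) draws

# Laplace(mu, b) has mean mu and variance 2 * b^2
print(c(mean = mean(y_pit), var = var(y_pit), theory_var = 2 * b^2))

# Sample quantiles next to the theoretical quantiles
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
print(rbind(sample = quantile(y_pit, probs),
            theory = qlaplace_pit(probs, mu, b)))
```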
    • Finally, let’s say we don’t have the function rnorm but only the inverse CDF qnorm, and we want to sample from a normal distribution with a given mean and variance using the probability integral transform.
    • Then let’s compare the histogram with the one generated using rnorm. As can be seen, they look almost exactly the same. Also, the times taken to draw 10000 such samples are quite comparable.
      norm
      p4
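The normal case can be sketched as below. The mean and standard deviation are arbitrary values assumed for illustration, as are the seed and the variable names; note that qnorm takes the standard deviation, not the variance.

```r
set.seed(1)                                       # assumed seed, for reproducibility only
n <- 10000
mu <- 2                                           # assumed mean, for illustration
sigma <- 3                                        # assumed standard deviation, for illustration
y_pit <- qnorm(runif(n), mean = mu, sd = sigma)   # inverse CDF applied to U(0,1) draws
y_ref <- rnorm(n, mean = mu, sd = sigma)          # reference draws from rnorm

print(c(mean_pit = mean(y_pit), sd_pit = sd(y_pit)))
print(c(mean_ref = mean(y_ref), sd_ref = sd(y_ref)))

# Rough timings; exact numbers depend on the machine
print(system.time(qnorm(runif(n), mean = mu, sd = sigma)))
print(system.time(rnorm(n, mean = mu, sd = sigma)))
```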