Some Statistics Concepts: Probability integral transformations

• This article discusses the probability integral transformation and some of its applications.

Probability Integral Transformation

• First, let’s convince ourselves that a continuous random variable transformed by its own CDF will always have a U(0,1) distribution. It can be proved as shown in the figure below.
• Let’s draw n=1000 i.i.d. samples from X ∼ Exp(λ=5) using the R function rexp and then transform the variable by its own CDF F_X(x) = 1 − exp(−λx), so that Y = F_X(X). We can see from the figure below that the empirical F_Y matches the distribution function of Y ∼ U(0,1).
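n <- 1000
lambda <- 5
x <- rexp(n, lambda)             # i.i.d. draws from Exp(lambda)
y <- 1 - exp(-lambda * x)        # transform X by its own CDF: Y = F_X(X)
y_sorted <- sort(y)
F_y <- seq_along(y_sorted) / n   # empirical CDF of Y at the sorted sample points
par(mfrow = c(1, 3))
hist(x, col = 'blue')
# plot F_Y against the values of y (not against the sample index)
plot(y_sorted, F_y, col = 'green', pch = 19, xlab = 'y', ylab = expression('F'['Y']), main = expression(paste('Y=1-e'^ paste('-', lambda, 'X'))))
plot(ecdf(y), col = 'red', pch = 19, xlab = 'y', ylab = 'ECDF(y)')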
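
The claim that Y = F_X(X) is uniform can also be checked with a short derivation. The sketch below assumes that F_X is continuous and strictly increasing, so that the inverse F_X^{-1} exists. The result holds for any continuous CDF, but the argument is simplest under this assumption. For any y in (0,1):

```latex
\begin{aligned}
F_Y(y) &= P(Y \le y) \\
       &= P\big(F_X(X) \le y\big) \\
       &= P\big(X \le F_X^{-1}(y)\big) \\
       &= F_X\big(F_X^{-1}(y)\big) \\
       &= y .
\end{aligned}
```

Since F_Y(y) = y on (0,1) is exactly the CDF of U(0,1), Y ∼ U(0,1).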

Application

• Now, suppose that we have at our disposal only a function that can generate i.i.d. samples from an X ∼ U(0,1) distribution, and we want to use it to generate i.i.d. samples from some other distribution (let’s say Y ∼ Exp(λ=5), Y ∼ Geom(p=0.3) or Y ∼ Laplace(μ=0, b=4)).
• We can use a U(0,1) random variable transformed by the inverse CDF corresponding to the other distribution to get a random variable with that CDF.
• Let’s use the above facts to draw n=1000 samples from a Y ∼ Exp(λ=5) distribution with the probability integral transform, using just the samples drawn from X ∼ U(0,1) with the R function runif.
• First draw n samples from X ∼ U(0,1).
• Transform X with the inverse CDF of Exp(λ): Y = F⁻¹(X) = −(1/λ) ln(1 − X).
• Now Y has the CDF F_Y(y) = 1 − exp(−λy), i.e. Y ∼ Exp(λ).
• Let’s compare the histogram of the samples drawn from Y ∼ Exp(λ=5) using the probability integral transform with that of the samples drawn using the R function rexp. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
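
The steps above can be sketched in R as follows. This is a minimal illustration rather than the original code behind the figure: the seed, the sample size of 10000 and the variable names y_pit and y_ref are assumptions made here.

```r
# Inverse-transform sampling for Exp(lambda).
# The seed and the names y_pit / y_ref are illustrative assumptions.
set.seed(42)
n <- 10000
lambda <- 5

u <- runif(n)                     # step 1: draw from U(0,1)
y_pit <- -log(1 - u) / lambda     # step 2: apply the inverse CDF of Exp(lambda)
y_ref <- rexp(n, rate = lambda)   # built-in sampler, for comparison only

# Both sample means should be close to the theoretical mean 1/lambda
print(c(pit = mean(y_pit), rexp = mean(y_ref), theory = 1 / lambda))

# Compare a few sample quantiles with the exact quantiles from qexp
probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
print(rbind(pit   = quantile(y_pit, probs),
            rexp  = quantile(y_ref, probs),
            exact = qexp(probs, rate = lambda)))
```

Calling hist(y_pit) and hist(y_ref) side by side (for example after par(mfrow=c(1,2))) gives the visual comparison described above.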

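• Similarly, let’s use the probability integral transform to draw samples from Y ∼ Geom(p=0.1) using only X ∼ U(0,1), transformed with the inverse CDF ln(1−x)/ln(1−p) of the geometric distribution, since the geometric distribution has the CDF 1 − (1−p)^x. Because the geometric distribution is discrete, the transformed value has to be rounded to an integer: rounding up gives the number of trials up to and including the first success (support 1, 2, ...), which is the convention this CDF corresponds to, whereas rgeom counts the failures before the first success (support 0, 1, ...), which corresponds to rounding down. Then compare with the samples drawn using the R function rgeom. As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.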
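
A minimal sketch of the geometric case in R follows. It assumes the rgeom convention (number of failures before the first success, support 0, 1, 2, ...), which is why the continuous inverse ln(1−u)/ln(1−p) is rounded down with floor. Using ceiling instead would give the number of trials, with support 1, 2, .... The seed and variable names are illustrative assumptions.

```r
# Inverse-transform sampling for Geom(p), using the rgeom convention
# (failures before the first success). Seed and names are illustrative.
set.seed(42)
n <- 10000
p <- 0.1

u <- runif(n)
y_pit <- floor(log(1 - u) / log(1 - p))   # round the continuous inverse down
y_ref <- rgeom(n, prob = p)               # built-in sampler, for comparison only

# Theoretical mean under this convention is (1 - p) / p
print(c(pit = mean(y_pit), rgeom = mean(y_ref), theory = (1 - p) / p))

# Empirical P(Y = k) for the first few k against the exact pmf from dgeom
k <- 0:5
print(rbind(pit   = as.numeric(table(factor(y_pit, levels = k))) / n,
            rgeom = as.numeric(table(factor(y_ref, levels = k))) / n,
            exact = dgeom(k, prob = p)))
```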

• Again, let’s use the probability integral transform to draw samples from Y ∼ Laplace(μ=0, b=4) using only X ∼ U(0,1), transformed with the following inverse CDF

F⁻¹(x) = μ − b sgn(x − 1/2) ln(1 − 2|x − 1/2|)

since the Laplace distribution has the following CDF

F(y) = 1/2 + (1/2) sgn(y − μ) (1 − exp(−|y − μ|/b))

Then compare with the samples drawn using the R function rlaplace (available from add-on packages rather than base R). As expected, the histograms look almost exactly the same, as can be seen from the following figure. Also, the times taken to draw 10000 such samples are quite comparable.
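
The Laplace case can be sketched in base R as below. Since rlaplace is not part of base R, this sketch does not call it; it checks the samples against the theoretical moments (mean μ, variance 2b²) and against a hand-written Laplace CDF. The helper plaplace_manual, the seed and the variable names are hypothetical and introduced only for this illustration.

```r
# Inverse-transform sampling for Laplace(mu, b) using base R only.
# plaplace_manual is a hypothetical helper written for this sketch.
set.seed(42)
n <- 10000
mu <- 0
b <- 4

u <- runif(n)
# Laplace inverse CDF: mu - b * sgn(u - 1/2) * log(1 - 2 * |u - 1/2|)
y_pit <- mu - b * sign(u - 0.5) * log(1 - 2 * abs(u - 0.5))

# Laplace CDF, written out by hand
plaplace_manual <- function(x, mu, b) {
  0.5 + 0.5 * sign(x - mu) * (1 - exp(-abs(x - mu) / b))
}

# Sample moments against the theoretical mean mu and variance 2 * b^2
print(c(mean = mean(y_pit), var = var(y_pit), theory_var = 2 * b^2))

# Kolmogorov-Smirnov test of the samples against the Laplace CDF
print(ks.test(y_pit, plaplace_manual, mu = mu, b = b))
```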

• Finally, let’s say we don’t have the function rnorm but only the inverse CDF (ICDF) qnorm, and we want to sample from a normal distribution with a given mean and variance using the probability integral transform.
• Then let’s compare the histogram with the one generated using rnorm. As can be seen, they look almost exactly the same. Also, the times taken to draw 10000 such samples are quite comparable.
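
A minimal sketch of the normal case follows. The text does not fix the mean and variance, so the values mu = 2 and sigma = 3 (variance 9) are hypothetical, as are the seed and variable names. Note that qnorm takes the standard deviation, not the variance.

```r
# Sampling from N(mu, sigma^2) with only runif and qnorm.
# mu, sigma and the seed are hypothetical values chosen for illustration.
set.seed(42)
n <- 10000
mu <- 2
sigma <- 3   # standard deviation, i.e. variance 9

u <- runif(n)
y_pit <- qnorm(u, mean = mu, sd = sigma)   # inverse CDF applied to U(0,1) draws
y_ref <- rnorm(n, mean = mu, sd = sigma)   # built-in sampler, for comparison only

# Sample mean and standard deviation of both sets of draws
print(rbind(pit   = c(mean = mean(y_pit), sd = sd(y_pit)),
            rnorm = c(mean = mean(y_ref), sd = sd(y_ref))))

# Rough timing of both approaches (results depend on the machine)
print(system.time(qnorm(runif(n), mean = mu, sd = sigma)))
print(system.time(rnorm(n, mean = mu, sd = sigma)))
```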