Probability

Suppose we consider the cholesterol level of a person chosen at random from a certain age group, or the height of an adult female chosen at random, or the lifetime of a randomly chosen battery of a certain type. Such quantities are called continuous random variables because their values actually range over an interval of real numbers, although they might be measured or recorded only to the nearest integer. A random variable is usually denoted by X. We assume that the probability that a continuous random variable takes any particular value a equals 0, that is, $$${P}{\left({X}={a}\right)}={0}$$$.

Every continuous random variable has a probability density function (PDF) f. A probability density function has the following properties:

  1. $$${f{{\left({x}\right)}}}\ge{0}$$$ for all x (probabilities cannot be negative).
  2. $$${\int_{{-\infty}}^{{\infty}}}{f{{\left({x}\right)}}}{d}{x}={1}$$$ (probabilities of all possible outcomes must add up to 1).

So, to calculate the probability that the random variable X lies in the interval [a,b], we need to evaluate the following integral: $$${P}{\left({a}\le{X}\le{b}\right)}={\int_{{a}}^{{b}}}{f{{\left({x}\right)}}}{d}{x}$$$.
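The two properties and the interval formula can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper `integrate` (a midpoint rule) and the example density f(x)=2x on [0,1] are assumptions chosen for illustration and do not appear elsewhere in this note.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def density(x):
    # Hypothetical example density: f(x) = 2x on [0, 1], zero elsewhere.
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# Property 2: the total area under the density should be 1.
total = integrate(density, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area under the density between 0.25 and 0.5.
prob = integrate(density, 0.25, 0.5)

print(total, prob)
```

For this density the exact value is $$$P(0.25\le X\le 0.5)=0.5^2-0.25^2=0.1875$$$, which the numerical result should match closely.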

Example 1. Let $$${f{{\left({x}\right)}}}={\left\{\begin{array}{c}{A}{{e}}^{{-{c}{x}}}{\quad\text{if}\quad}{x}\ge{0}\\{0}{\quad\text{if}\quad}{x}<{0}\\ \end{array}\right.}$$$.

Find the constant A (assuming $$${c}>{0}$$$) such that f(x) is a probability density function.

Let c=0.2. Find the probability that X is less than 3, the probability that X lies in the interval [4,6], and the probability that X is greater than 7.

Solution.

Clearly, $$${f{{\left({x}\right)}}}\ge{0}$$$ as long as $$${A}\ge{0}$$$, so we need to find A such that $$${\int_{{-\infty}}^{{\infty}}}{f{{\left({x}\right)}}}{d}{x}={1}$$$.

Note that since $$${f{{\left({x}\right)}}}={0}$$$ for $$${x}<{0}$$$, we can change the limits of integration to $$${0}$$$ and $$$\infty$$$.

This integral is improper, so we use the definition of an improper integral:

$$${\int_{{-\infty}}^{{\infty}}}{f{{\left({x}\right)}}}{d}{x}={\int_{{0}}^{{\infty}}}{A}{{e}}^{{-{c}{x}}}{d}{x}={A}\lim_{{{t}\to\infty}}{\int_{{0}}^{{t}}}{{e}}^{{-{c}{x}}}{d}{x}=-\frac{{A}}{{c}}\lim_{{{t}\to\infty}}{{e}}^{{-{c}{x}}}{{\mid}_{{0}}^{{t}}}=$$$

$$$=-\frac{{A}}{{c}}\lim_{{{t}\to\infty}}{\left({{e}}^{{-{c}{t}}}-{1}\right)}=-\frac{{A}}{{c}}{\left(-{1}\right)}=\frac{{A}}{{c}}$$$.

This value must equal 1; therefore, $$${A}={c}$$$.
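As a sanity check on $$${A}={c}$$$, the integral over $$$[0,t]$$$ can be approximated for growing t and watched as it approaches 1. This is a rough sketch under stated assumptions: the value c=0.2, the cut-off points for t, and the helper `midpoint` are choices made here for illustration.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


c = 0.2   # assumed rate, any c > 0 behaves the same way
A = c     # the constant found in the solution


def pdf(x):
    return A * math.exp(-c * x) if x >= 0 else 0.0


# Partial areas over [0, t]; they should approach 1 as t grows.
areas = {t: midpoint(pdf, 0.0, t) for t in (10.0, 50.0, 200.0)}
for t, area in areas.items():
    print(f"t = {t:6.1f}  area = {area:.8f}")
```

The exact partial area is $$$1-e^{-ct}$$$, so the printed values should increase toward 1 without exceeding it.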

$$${f{{\left({x}\right)}}}={\left\{\begin{array}{c}{c}{{e}}^{{-{c}{x}}}{\quad\text{if}\quad}{x}\ge{0}\\{0}{\quad\text{if}\quad}{x}<{0}\\ \end{array}\right.}$$$ is the probability density function of an exponentially distributed random variable.

If c=0.2 then $$${f{{\left({x}\right)}}}={\left\{\begin{array}{c}{0.2}{{e}}^{{-{0.2}{x}}}{\quad\text{if}\quad}{x}\ge{0}\\{0}{\quad\text{if}\quad}{x}<{0}\\ \end{array}\right.}$$$.

First of all, note that there is no difference between $$${P}{\left({X}\le{3}\right)}$$$ and $$${P}{\left({X}<{3}\right)}$$$, because $$${P}{\left({X}={3}\right)}={0}$$$: $$${P}{\left({X}\le{3}\right)}={P}{\left({X}<{3}\right)}+{P}{\left({X}={3}\right)}={P}{\left({X}<{3}\right)}+{0}={P}{\left({X}<{3}\right)}$$$.

$$${P}{\left({X}\le{3}\right)}={\int_{{-\infty}}^{{3}}}{f{{\left({x}\right)}}}{d}{x}={\int_{{0}}^{{3}}}{0.2}{{e}}^{{-{0.2}{x}}}{d}{x}=-{{e}}^{{-{0.2}{x}}}{{\mid}_{{0}}^{{3}}}={1}-{{e}}^{{-{0.6}}}\approx{0.4512}$$$.

$$${P}{\left({4}\le{X}\le{6}\right)}={\int_{{4}}^{{6}}}{f{{\left({x}\right)}}}{d}{x}={\int_{{4}}^{{6}}}{0.2}{{e}}^{{-{0.2}{x}}}{d}{x}=-{{e}}^{{-{0.2}{x}}}{{\mid}_{{4}}^{{6}}}={{e}}^{{-{0.8}}}-{{e}}^{{-{1.2}}}\approx$$$

$$$\approx{0.148135}$$$.

$$${P}{\left({X}\ge{7}\right)}={\int_{{7}}^{{\infty}}}{0.2}{{e}}^{{-{0.2}{x}}}{d}{x}={0.2}\lim_{{{t}\to\infty}}{\int_{{7}}^{{t}}}{{e}}^{{-{0.2}{x}}}{d}{x}=-\lim_{{{t}\to\infty}}{{e}}^{{-{0.2}{x}}}{{\mid}_{{7}}^{{t}}}=\lim_{{{t}\to\infty}}{\left({{e}}^{{-{1.4}}}-{{e}}^{{-{0.2}{t}}}\right)}=$$$

$$$={{e}}^{{-{1.4}}}\approx{0.2466}$$$.
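The three probabilities can be reproduced with a few lines of Python. This is only a sketch: the function name `prob_between` is hypothetical, and it simply evaluates the antiderivative $$$-e^{-cx}$$$ at the endpoints, as in the calculations above.

```python
import math

c = 0.2  # rate used in Example 1


def prob_between(a, b, rate=c):
    """P(a <= X <= b) for 0 <= a <= b: integral of rate*e^(-rate*x) from a to b."""
    return math.exp(-rate * a) - math.exp(-rate * b)


p_less_3 = prob_between(0.0, 3.0)   # 1 - e^(-0.6)
p_4_to_6 = prob_between(4.0, 6.0)   # e^(-0.8) - e^(-1.2)
p_ge_7 = math.exp(-c * 7.0)         # upper limit at infinity contributes 0

print(round(p_less_3, 4), round(p_4_to_6, 6), round(p_ge_7, 4))
```

Note that the three events do not cover all outcomes (the intervals (3,4) and (6,7) are left out), so these probabilities are not expected to add up to 1.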

The mean of any probability density function is defined to be $$$\mu={\int_{{-\infty}}^{{\infty}}}{x}{f{{\left({x}\right)}}}{d}{x}$$$.

The expression for the mean resembles an integral we have seen before in the Moments and Centers of Mass note. If R is the region that lies under the graph of f, we know that the x-coordinate of the centroid of R is $$$\frac{{{\int_{{-\infty}}^{{\infty}}}{x}{f{{\left({x}\right)}}}{d}{x}}}{{{\int_{{-\infty}}^{{\infty}}}{f{{\left({x}\right)}}}{d}{x}}}=\frac{{{\int_{{-\infty}}^{{\infty}}}{x}{f{{\left({x}\right)}}}{d}{x}}}{{{1}}}={\int_{{-\infty}}^{{\infty}}}{x}{f{{\left({x}\right)}}}{d}{x}=\mu$$$.

So a thin plate in the shape of R balances at a point on the vertical line $$${x}=\mu$$$.

Example 2. Find the mean of an exponentially distributed random variable.

We know that an exponentially distributed random variable has the PDF $$${f{{\left({x}\right)}}}={\left\{\begin{array}{c}{c}{{e}}^{{-{c}{x}}}{\quad\text{if}\quad}{x}\ge{0}\\{0}{\quad\text{if}\quad}{x}<{0}\\ \end{array}\right.}$$$.

Therefore, $$$\mu={\int_{{-\infty}}^{{\infty}}}{x}{f{{\left({x}\right)}}}{d}{x}={\int_{{0}}^{{\infty}}}{x}{c}{{e}}^{{-{c}{x}}}{d}{x}=\lim_{{{t}\to\infty}}{\int_{{0}}^{{t}}}{x}{c}{{e}}^{{-{c}{x}}}{d}{x}$$$.

At this stage we use integration by parts: let $$${u}={x}$$$ and $$${d}{v}={c}{{e}}^{{-{c}{x}}}{d}{x}$$$; then $$${d}{u}={d}{x}$$$ and $$${v}=\int{c}{{e}}^{{-{c}{x}}}{d}{x}=-{{e}}^{{-{c}{x}}}$$$.

Now, we have $$$\lim_{{{t}\to\infty}}{\int_{{0}}^{{t}}}{x}{c}{{e}}^{{-{c}{x}}}{d}{x}=\lim_{{{t}\to\infty}}{\left({x}{\left(-{{e}}^{{-{c}{x}}}\right)}{{\mid}_{{0}}^{{t}}}-{\int_{{0}}^{{t}}}{\left(-{{e}}^{{-{c}{x}}}\right)}{d}{x}\right)}=\lim_{{{t}\to\infty}}{\left(-{t}{{e}}^{{-{c}{t}}}-\frac{{1}}{{c}}{{e}}^{{-{c}{x}}}{{\mid}_{{0}}^{{t}}}\right)}=$$$

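$$$=\lim_{{{t}\to\infty}}{\left(-{t}{{e}}^{{-{c}{t}}}-\frac{{1}}{{c}}{{e}}^{{-{c}{t}}}+\frac{{1}}{{c}}\right)}=\frac{{1}}{{c}}$$$.

Here both $$${{e}}^{{-{c}{t}}}$$$ and $$${t}{{e}}^{{-{c}{t}}}$$$ tend to 0 as $$${t}\to\infty$$$ because $$${c}>{0}$$$; for the second, L'Hôpital's rule gives $$$\lim_{{{t}\to\infty}}\frac{{t}}{{{e}}^{{{c}{t}}}}=\lim_{{{t}\to\infty}}\frac{{1}}{{{c}{{e}}^{{{c}{t}}}}}={0}$$$.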

Therefore, $$$\mu=\frac{{1}}{{c}}$$$ and the PDF can be rewritten as $$${f{{\left({x}\right)}}}={\left\{\begin{array}{c}\frac{{1}}{\mu}{{e}}^{{-\frac{{x}}{\mu}}}{\quad\text{if}\quad}{x}\ge{0}\\{0}{\quad\text{if}\quad}{x}<{0}\\ \end{array}\right.}$$$.
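The result $$$\mu=\frac{1}{c}$$$ can be checked numerically by integrating $$$x f(x)$$$ directly. The sketch below assumes c=0.2 (so the expected mean is 5), truncates the infinite upper limit at 200 where the remaining tail is negligible, and uses a hypothetical `midpoint` helper.

```python
import math


def midpoint(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


c = 0.2  # assumed rate; the derivation predicts mean = 1/c = 5


def x_times_pdf(x):
    # Integrand of the mean: x * c * e^(-c x) for x >= 0.
    return x * c * math.exp(-c * x)


# Truncate the improper integral at 200; the tail beyond is about 205*e^(-40).
mean = midpoint(x_times_pdf, 0.0, 200.0)
print(mean, 1.0 / c)
```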