In probability theory and statistics, a probability distribution is a mathematical function that gives the probabilities of the possible outcomes of an experiment. In other words, it describes a random phenomenon in terms of the probabilities of events. These distributions generally fall into two classes:

- Discrete probability distributions – discrete data can take only certain values, e.g. when you roll a die the possible outcomes are 1, 2, 3, 4, 5 or 6, never 1.5 or 2.45.
- Continuous probability distributions – continuous data can take any value in a given range, e.g. a child’s height can be 100 cm, 107.5 cm or 123 cm.

A probability distribution whose sample space is a set of real numbers is called one-dimensional (univariate), while one whose sample space is a vector space is called multidimensional (multivariate). A univariate distribution gives the probabilities of a single random variable taking different values; a multivariate (joint) distribution gives the probabilities of a random vector – a list of two or more random variables – taking different combinations of values.
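
As a minimal sketch of this difference, assuming NumPy (the variable names are illustrative, not from any particular library convention):

```python
import numpy as np

rng = np.random.default_rng(0)

# Univariate: one random variable -> a 1-D array of draws.
univariate = rng.normal(loc=0.0, scale=1.0, size=1000)

# Multivariate: a random vector of two correlated variables.
mean = [0.0, 0.0]
cov = [[1.0, 0.8], [0.8, 1.0]]  # covariance matrix
multivariate = rng.multivariate_normal(mean, cov, size=1000)

print(univariate.shape)    # (1000,)
print(multivariate.shape)  # (1000, 2)
```

Each row of `multivariate` is one draw of the random vector, i.e. one combination of values of the two variables.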

The most common probability distributions for machine learning:

- Bernoulli distribution
- Uniform distribution
- Binomial distribution
- Normal distribution
- Poisson distribution
- Exponential distribution

##### Bernoulli distribution

The Bernoulli distribution has only two possible outcomes, 1 (success) and 0 (failure), and a single trial. A random variable X with a Bernoulli distribution takes the value 1 with success probability p and the value 0 with failure probability q = 1 − p. The Bernoulli distribution is a special case of the binomial distribution in which a single experiment is conducted, so the number of trials is 1. Thus, the Bernoulli distribution describes events with exactly two outcomes.

```
import warnings
warnings.simplefilter("ignore", UserWarning)
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.stats import bernoulli
import seaborn as sb

# Draw 1000 Bernoulli samples with success probability p = 0.6
# (named data_bern so the bernoulli distribution object is not shadowed)
data_bern = bernoulli.rvs(size=1000, p=0.6)
ax = sb.distplot(data_bern, kde=True, color='g', hist_kws={"linewidth": 25, 'alpha': 1})
ax.set(xlabel='Bernoulli', ylabel='Frequency')
```
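
The defining probabilities can also be checked directly against scipy's `bernoulli` object (a small sketch, assuming `scipy.stats`):

```python
from scipy.stats import bernoulli

p = 0.6
# P(X = 1) = p and P(X = 0) = 1 - p
print(bernoulli.pmf(1, p))  # ~0.6
print(bernoulli.pmf(0, p))  # ~0.4
# Mean p and variance p * (1 - p)
print(bernoulli.mean(p), bernoulli.var(p))
```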

##### Uniform distribution

The discrete uniform distribution is one in which all elements of a finite set are equally probable. It is the theoretical model for a casino roulette wheel or the first card of a well-shuffled deck.

When you roll a die, the scores range from 1 to 6, and each score is equally likely; this is the basis of the uniform distribution. Unlike the Bernoulli distribution, a uniform distribution can have any number of possible outcomes, all equally probable.

```
import numpy as np

# 1000 samples from a continuous uniform distribution on [-1, 0)
s = np.random.uniform(-1, 0, 1000)
count, bins, ignored = plt.hist(s, 15, density=True, color='y')
# The theoretical density is constant (equal to 1) over the interval
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
```
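
The die example above is a *discrete* uniform distribution, while the plot samples a continuous one; a sketch of the discrete case, using scipy's `randint` and NumPy's generator API, could look like this:

```python
import numpy as np
from scipy.stats import randint

# Discrete uniform over {1, ..., 6}; note that high is exclusive in scipy's randint
die = randint(low=1, high=7)
print(die.pmf(3))  # every face has the same probability, 1/6

# Simulated rolls; each face should appear roughly 1/6 of the time
rng = np.random.default_rng(42)
rolls = rng.integers(1, 7, size=6000)
```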

##### Binomial distribution

The binomial distribution describes the number of successes in a series of independent experiments (yes/no trials) with the same probability of success. Each trial has only two possible outcomes, success and failure. For a fair coin toss, the probability of heads is p = 0.5, so the probability of failure is easily calculated as q = 1 − p = 0.5. A distribution where only two outcomes are possible per trial, such as success or failure, profit or loss, win or lose, and where the probability of success and failure is the same for all trials, is called a binomial distribution.

```
from scipy.stats import binom

# 1000 samples of the number of successes in n = 20 trials with p = 0.8
binomial = binom.rvs(n=20, p=0.8, loc=0, size=1000)
ax = sb.distplot(binomial, kde=True, color='m', hist_kws={"linewidth": 25, 'alpha': 1})
ax.set(xlabel='Binomial', ylabel='Frequency')
```
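
The individual binomial probabilities can be computed with `binom.pmf`; a small sketch for a fair coin:

```python
from scipy.stats import binom

n, p = 20, 0.5  # 20 fair coin tosses
# Probability of exactly 10 heads
print(binom.pmf(10, n, p))
# The probabilities of all possible outcomes sum to 1
print(binom.pmf(range(n + 1), n, p).sum())
# Mean n*p and variance n*p*(1 - p)
print(binom.mean(n, p), binom.var(n, p))
```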

##### Normal distribution

The normal distribution describes the behavior of a great many real-world quantities (hence the name “normal”). A sum of many small independent random variables often turns out to be approximately normally distributed, which contributes to its widespread use. A distribution is known as a normal distribution if it has the following characteristics:

- The mean, median and mode of the distribution coincide.
- The distribution curve is bell-shaped and symmetric about the line x = μ.
- The total area under the curve is 1.
- Exactly half of the values lie to the left of the mean and the other half to the right.

The normal distribution differs significantly from the binomial distribution. However, as the number of trials approaches infinity, their shapes become similar.
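
That convergence can be illustrated numerically: for large n, the binomial probabilities are close to the density of a normal distribution with matching mean and standard deviation (a sketch, assuming `scipy.stats`):

```python
import numpy as np
from scipy.stats import binom, norm

n, p = 1000, 0.5
mu = n * p
sigma = np.sqrt(n * p * (1 - p))  # standard deviation of the binomial

k = np.arange(450, 551)
pmf = binom.pmf(k, n, p)                # binomial probabilities
pdf = norm.pdf(k, loc=mu, scale=sigma)  # matching normal density

# For large n the two curves nearly coincide
print(np.max(np.abs(pmf - pdf)))
```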

```
# 1000 samples from a normal distribution with mean 0.5 and std 0.1
mu, sigma = 0.5, 0.1
s = np.random.normal(mu, sigma, 1000)
count, bins, ignored = plt.hist(s, 20, density=True, color='lightblue')
# Overlay the theoretical normal density
plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp(-(bins - mu)**2 / (2 * sigma**2)), linewidth=3, color='r')
plt.show()
```

##### Poisson distribution

The Poisson distribution describes the number of events occurring in a fixed interval of time or space, when the events occur independently at a constant average rate. The count can be any non-negative integer. Examples of the Poisson distribution:

- Number of emergency calls registered at the hospital during one day.
- Number of thefts reported in the area during the day.
- Number of customers arriving at the salon per hour.
- Number of suicides reported in the city.
- The number of printing errors on each page of the book.

The Poisson distribution is applicable in situations where events occur at random moments in time and space, and our interest lies only in the number of occurrences of the event.

A distribution is called a Poisson distribution when the following assumptions hold:

- The occurrence of one event does not affect the occurrence of any other event (events are independent).
- The average rate of events is constant over time.
- The probability of two or more events occurring in a very small interval approaches zero as the interval shrinks.

```
from scipy.stats import poisson

# 10000 samples from a Poisson distribution with mean mu = 4
# (named data_poisson so the poisson distribution object is not shadowed)
data_poisson = poisson.rvs(mu=4, size=10000)
ax = sb.distplot(data_poisson, kde=True, color='orange', hist_kws={"linewidth": 25, 'alpha': 1})
ax.set(xlabel='Poisson', ylabel='Frequency')
```
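
The Poisson probabilities follow the formula P(X = k) = μ^k e^(−μ) / k!, and a distinctive property is that mean and variance are both equal to μ; a quick check with `scipy.stats`:

```python
import math
from scipy.stats import poisson

mu = 4  # average number of events per interval
# P(X = k) = mu**k * exp(-mu) / k!
print(poisson.pmf(2, mu), mu**2 * math.exp(-mu) / math.factorial(2))
# Mean and variance of a Poisson distribution are both equal to mu
print(poisson.mean(mu), poisson.var(mu))
```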

##### Exponential distribution

The exponential distribution is widely used in survival analysis.

Some examples include:

- Time until the next metro train arrives
- Time between arrivals at a gas station
- Service life of an air conditioner

```
# Density of the exponential distribution with rate lam = 0.5
lam = 0.5
x = np.arange(0, 20, 0.1)
y = lam * np.exp(-lam * x)
plt.plot(x, y)
plt.xlabel('x')
plt.show()
```

##### Relations between distributions

Relation between the Bernoulli and binomial distributions

- The Bernoulli distribution is a special case of the binomial distribution with a single trial (n = 1).
- Both the Bernoulli and binomial distributions have only two possible outcomes – success and failure.
- Both the Bernoulli and binomial distributions assume independent trials.

Relation between the Poisson and binomial distributions

The Poisson distribution is a limiting case of the binomial distribution under the following conditions:

- The number of trials is infinitely large, n → ∞.
- The probability of success in each trial is the same and infinitely small, p → 0.
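
This limit can be observed numerically by holding the mean n·p = λ fixed while n grows (a sketch, assuming `scipy.stats`):

```python
from scipy.stats import binom, poisson

lam, k = 4.0, 3
for n in (10, 100, 10000):
    p = lam / n  # n grows, p shrinks, n * p = lam stays fixed
    # The binomial pmf converges to the Poisson pmf with parameter lam
    print(n, binom.pmf(k, n, p), poisson.pmf(k, lam))
```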

Relationship between the normal and binomial distributions, and the normal and Poisson distributions:

The normal distribution is another limiting form of the binomial distribution under the following conditions:

- The number of trials is infinitely large, n → ∞.
- Neither p nor q is infinitely small.

The normal distribution is also the limiting case of the Poisson distribution as the parameter λ → ∞.
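
The Poisson-to-normal limit can also be checked numerically: for large λ, the Poisson probabilities approach the density of a normal distribution with mean λ and variance λ (a sketch, assuming `scipy.stats`):

```python
import numpy as np
from scipy.stats import norm, poisson

lam = 1000.0
k = np.arange(900, 1101)
pmf = poisson.pmf(k, lam)
pdf = norm.pdf(k, loc=lam, scale=np.sqrt(lam))  # mean lam, variance lam

# The maximum pointwise gap shrinks as lam grows
print(np.max(np.abs(pmf - pdf)))
```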

The relationship between the exponential and Poisson distributions:

If the times between random events follow an exponential distribution with rate λ, then the total number of events in a time period of length t follows the Poisson distribution with parameter λt.
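
A simulation makes this concrete: generate exponential interarrival times, count how many events fall inside a window of length t, and check that the counts behave like Poisson(λt), whose mean and variance both equal λt (a sketch using NumPy; the sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t = 2.0, 5.0  # event rate and observation window, so lam * t = 10

counts = []
for _ in range(5000):
    gaps = rng.exponential(scale=1 / lam, size=50)  # interarrival times
    arrivals = np.cumsum(gaps)                      # event times
    counts.append(np.sum(arrivals <= t))            # events within [0, t]

counts = np.asarray(counts)
# Counts should be approximately Poisson(lam * t): mean and variance near 10
print(counts.mean(), counts.var())
```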