# Random numbers

## Uniform and normal distributions

To generate a random number drawn from the standard uniform distribution, we can use command `rand` from package `Random`:

```
using Random
rand()
```

```
0.22686673754119435
```
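Each run of `rand` produces different numbers. For reproducibility, the global generator can be seeded with `Random.seed!` (a minimal sketch; the seed value is arbitrary):

```julia
using Random

Random.seed!(42)   # fix the state of the global RNG
a = rand()
Random.seed!(42)   # the same seed reproduces the same draw
b = rand()
println(a == b)    # true
```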

To get many random numbers arranged in an array, we provide the dimensions:

```
rand(3, 4)
```

```
3×4 Matrix{Float64}:
0.320465 0.447005 0.560518 0.990463
0.189469 0.514505 0.499002 0.00913538
0.976965 0.881017 0.449687 0.623803
```

The same command can be used to draw from the *discrete* uniform distribution.
To this end, as the first argument we provide the *list of elements* as an array or a range.
All the elements are drawn with equal probabilities:

```
@show rand([.2 .3 .5]);
@show rand(100:130);
```

```
rand([0.2 0.3 0.5]) = 0.5
rand(100:130) = 128
```

An array of size \(S\times K\times N\) with elements drawn from `list_of_elements` can be generated with `rand(list_of_elements, dim1, dim2, ...)`, *e.g.*

```
rand(["a" "b" "c"], 4,3,2)
```

```
4×3×2 Array{String, 3}:
[:, :, 1] =
"a" "b" "a"
"a" "b" "a"
"a" "c" "a"
"a" "c" "b"
[:, :, 2] =
"c" "c" "a"
"a" "b" "b"
"a" "a" "b"
"a" "c" "a"
```

Analogously, you can draw numbers from the standard normal distribution \(\mathcal{N}(\mu=0,\sigma^2=1)\) by using `randn`:

```
randn(2,3)
```

```
2×3 Matrix{Float64}:
-0.803837 -0.891013 -1.40048
-0.640025 -0.110768 0.361461
```

### Multivariate normal distribution

Draw a sample of \(10000\) observations from the multivariate normal distribution

\[\mathbf{y}\sim \mathcal{N}\left(\begin{pmatrix}1\\2\end{pmatrix},\ \begin{pmatrix}0.75 & 0.1\\ 0.1 & 0.25\end{pmatrix}\right).\]

We know that if \(x\sim \mathcal{N}(\mu, \Sigma)\) and \(y=\alpha+Bx,\) then \(y\sim\mathcal{N}(\alpha+B \mu, B \Sigma B').\) Given \(x\sim \mathcal{N}(0, \mathbf{I}),\) we can produce \(y\) by the transformation \(y=\left(\substack{1\\2}\right)+Bx,\) where \(B\) is the lower triangular matrix of the Cholesky decomposition of the target covariance matrix \(\begin{pmatrix}0.75 & 0.1\\ 0.1 & 0.25\end{pmatrix}.\)

Using this property, to get \(\mathbf{y}\) we first draw two random variables \(x_1\) and \(x_2\) independently from the standard normal distribution:

```
X = randn(10000, 2); #2 columns; each column is a r.v.
```

```
10000×2 Matrix{Float64}:
0.614505 -0.668119
1.03135 -0.0948743
-0.29912 -0.100329
-1.04933 0.719724
0.693856 0.756485
-0.0218002 -0.0695528
-0.926921 -2.09854
0.595044 0.623781
1.78484 1.07401
0.380712 -1.4075
0.43448 -0.866373
-0.584113 -0.775322
-0.214707 -1.65594
⋮
0.269616 1.08181
-0.0072384 0.12458
1.16377 1.91824
-0.0882252 -0.392798
0.163183 -0.595299
0.1951 1.25567
-1.37787 0.375443
0.263264 0.43212
1.23051 0.724854
0.541672 0.0221197
-0.8804 -0.0077747
-0.79008 -2.21579
```

As we can see, the mean and variance of each variable are around \(0\) and \(1,\) respectively. The two variables are independent, so their covariance is close to \(0.\)

```
using Statistics
println("mean(x₁)=$(mean(X[:,1]))")
println("var(x₁)=$(var(X[:,1]))")
println("mean(x₂)=$(mean(X[:,2]))")
println("var(x₂)=$(var(X[:,2]))")
println("cov(x₁, x₂)=$(cov(X[:,1], X[:,2]))")
```

```
mean(x₁)=0.011790572458630223
var(x₁)=1.017881472956483
mean(x₂)=0.00869419286907865
var(x₂)=0.9779904586024419
cov(x₁, x₂)=-0.004860308248876938
```

In our example, the covariance matrix of \(\mathbf{y}\) is given by:

```
BΣBt = [.75 .1;
.1 .25]
```

```
2×2 Matrix{Float64}:
0.75 0.1
0.1 0.25
```

To conduct the Cholesky decomposition we use command `cholesky` from package `LinearAlgebra`.
Without getting too much into details, it creates a structure whose elements can be accessed by using `.` after the created object, in the object-oriented style.
To get the lower triangular matrix of the decomposition we call `cholesky(BΣBt).L`.

```
using LinearAlgebra
B = cholesky(BΣBt).L
```

```
2×2 LowerTriangular{Float64, Matrix{Float64}}:
0.866025 ⋅
0.11547 0.486484
```

Indeed, our new matrix \(B\) is the one we are looking for, since \(BB'\) recovers the covariance matrix:

```
B*B'
```

```
2×2 Matrix{Float64}:
0.75 0.1
0.1 0.25
```

To get \(\mathbf{y},\) for each pair \((x_1, x_2)\) we need to perform the following operation: \(\left(\substack{y_1\\y_2}\right) =\left(\substack{1\\2}\right)+B\left(\substack{x_1\\x_2}\right).\)
We can use `map` for this:

```
y_map = map((x₁, x₂) -> begin
                x_vec = [x₁, x₂]
                y₁, y₂ = [1, 2] + B*x_vec
                return [y₁ y₂]
            end,
            X[:,1], X[:,2])
```

```
10000-element Vector{Matrix{Float64}}:
[1.53217655115301 1.7459274847477118]
[1.8931791461680745 2.07293574116509]
[0.7409543704848185 1.916652193182746]
[0.09125707985321885 2.228968606594767]
[1.6008968037577533 2.4481374946126477]
[0.9811204469598399 1.963646380409594]
[0.19726301281148062 0.8720627526760132]
[1.5153230167250609 2.372169057107908]
[2.5457150109368563 2.7285827782992755]
[1.329706157059598 1.3592360754313089]
[1.376270955025524 1.628692827112217]
[0.4941436989058252 1.5553709014876684]
[0.8140586580676017 1.1696216776795643]
⋮
[1.2334940938277101 2.557414163254594]
[0.9937313599170234 2.059770309415118]
[2.0078514083441 3.067571893908056]
[0.9235947662915548 1.7987225969622231]
[1.1413205515461562 1.7292395134391374]
[1.1689612526831619 2.6333912202223053]
[-0.19326890795427332 2.023544627712599]
[1.2279933103053142 2.2406186240020043]
[2.0656503068252374 2.494716337140241]
[1.469101416967663 2.0733077324878817]
[0.2375514392985837 1.8945579256179952]
[0.31577041485013124 0.8308227480454937]
```

The returned object is a vector of \(1\times 2\) matrices \([y_1\;\; y_2]\). For convenience, we may want to extract those elements and arrange \((y_1, y_2)\) in a two-column array:

```
Y = Array{Float64}(undef, 10000,2);
Y[:, 1] = map(x-> x[1] ,y_map);
Y[:, 2] = map(x-> x[2] ,y_map);
```

As we can see, the means and covariances are approximately equal to the parameters of the distribution from which we wanted to draw:

```
println("mean(y₁)=$(mean(Y[:,1]))")
println("var(y₁)=$(var(Y[:,1]))")
println("mean(y₂)=$(mean(Y[:,2]))")
println("var(y₂)=$(var(Y[:,2]))")
println("cov(y₁, y₂) = $(cov(Y[:,1], Y[:,2]))")
```

```
mean(y₁)=1.0102109352743347
var(y₁)=0.763411104717362
mean(y₂)=2.005591043620997
var(y₂)=0.24448344570534644
cov(y₁, y₂) = 0.09974046303320667
```

Instead of using `map`, we could perform simple matrix operations.
The result is exactly the same:

```
[1 2] .+ X*B'
```

```
10000×2 Matrix{Float64}:
1.53218 1.74593
1.89318 2.07294
0.740954 1.91665
0.0912571 2.22897
1.6009 2.44814
0.98112 1.96365
0.197263 0.872063
1.51532 2.37217
2.54572 2.72858
1.32971 1.35924
1.37627 1.62869
0.494144 1.55537
0.814059 1.16962
⋮
1.23349 2.55741
0.993731 2.05977
2.00785 3.06757
0.923595 1.79872
1.14132 1.72924
1.16896 2.63339
-0.193269 2.02354
1.22799 2.24062
2.06565 2.49472
1.4691 2.07331
0.237551 1.89456
0.31577 0.830823
```
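As a side note, the same construction is available off the shelf. A sketch, assuming package `Distributions.jl` is installed (the variable name `Y_alt` is illustrative): `MvNormal` wraps the same Cholesky-based transformation internally.

```julia
using Distributions

# Target distribution: mean (1, 2), covariance [.75 .1; .1 .25]
d = MvNormal([1.0, 2.0], [0.75 0.1; 0.1 0.25])

# rand(d, n) returns a 2×n matrix; transpose to get one row per observation
Y_alt = rand(d, 10000)'
```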

## Simulating Markov chains

Suppose that the income of a single household follows a Markov chain with the transition probability matrix

\[P = \begin{pmatrix}0.8 & 0.1 & 0.1\\ 0.1 & 0.8 & 0.1\\ 0.1 & 0.1 & 0.8\end{pmatrix}\]

and income states equal to \(\mathbf{y} = [e^{-\frac{1}{2}}, 1 , e^{\frac{1}{2}}].\)

Simulate the dynamics of the income in the horizon of \(100\) periods. The initial income is equal to \(y_0=1.\)

Command `rand` allows sampling from a finite set of elements where each element has equal probability.
A similar thing can be done using `sample` from package `StatsBase`:

```
using StatsBase
sample( exp.([-.5, 0, .5]) )
```

As in the previous example, the probability is the same across elements:

```
draws = [sample(exp.([-.5, 0, .5])) for i ∈ 1:10000] # sample(exp.([-.5, 0, .5]), 10000) will give the same result
countmap( draws )
```

```
Dict{Float64, Int64} with 3 entries:
1.64872 => 3356
1.0 => 3380
0.606531 => 3264
```

If we want to draw elements with different probabilities, we have to provide weights:

```
draws = [sample( exp.([-.5, 0, .5]), Weights( [.8, .1, .1] ) ) for i ∈ 1:10000]
countmap( draws )
```

```
Dict{Float64, Int64} with 3 entries:
1.64872 => 956
1.0 => 968
0.606531 => 8076
```

In the considered example, the probabilities of the states in the next period depend on the current state. The transition matrix is given by:

```
P = [.8 .1 .1;
.1 .8 .1;
.1 .1 .8]
```

```
3×3 Matrix{Float64}:
0.8 0.1 0.1
0.1 0.8 0.1
0.1 0.1 0.8
```

To simulate the income dynamics we need to build a loop in which `Weights` are taken from the row of the transition matrix associated with the current state:

```
using StatsBase
y = exp.([-.5, 0, .5]);           # income states
history = Array{Int}(undef, 101); # index (!) of the current state
history[1] = 2;                   # start from y₀ = 1
for τ ∈ 2:length(history)
    history[τ] = sample(1:length(y),
                        Weights( P[history[τ-1], :] ))
end
```

Eventually, we have one simulated path of the Markov chain:

```
using Plots
plot(y[history])
ylabel!("Income")
xlabel!("Time")
```
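The plot shows a single realization. As a longer-run sanity check (a sketch; the variable names are illustrative): this \(P\) is symmetric and doubly stochastic, so its stationary distribution is uniform, and each state should be visited roughly one third of the time in a long simulation.

```julia
using StatsBase

P = [.8 .1 .1;
     .1 .8 .1;
     .1 .1 .8]

T = 100_000
s = Vector{Int}(undef, T)        # state indices over time
s[1] = 2                         # start from the middle state
for τ ∈ 2:T
    s[τ] = sample(1:3, Weights(P[s[τ-1], :]))
end

freq = counts(s, 1:3) ./ T       # empirical state frequencies, each ≈ 1/3
```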

## Random numbers generated through a quantile function

### Continuous cdfs

Suppose that we want to draw random numbers according to a distribution given by the following cumulative distribution function:

\[F(x) = \begin{cases} 0 & \text{for } x < 0\\ x^2 & \text{for } 0\le x\le 1\\ 1 & \text{for } x > 1 \end{cases}\]

The function has different definitions on three intervals, \((-\infty, 0), [0,1], (1,+\infty)\).
Using a similar approach to the one used in creating the CRRA utility function, we can use the ternary operator `?:` twice to create \(F(x)\) in `Julia`:

```
F(x) = 0≤x≤1 ? x^2 :
( x< 0 ? 0 : 1)
```

```
F (generic function with 1 method)
```

As we can see, \(F(x)\) is continuous and strictly increasing for \(x\in[0,1].\)

```
using Plots
plot(F, -.1, 1.2)
```

Notice that for any continuous cdf the values \(F(x)\) are uniformly distributed: each value \(p\in[0,1]\) is equally likely. To draw random numbers for a random variable distributed according to \(F(x),\) we can use the quantile function (also called the inverse cdf). This function maps a quantile \(p\) to the threshold value \(x\) below which random draws fall \(p\) percent of the time. Hence, draws from the uniform distribution can represent quantiles of the distribution given by \(F(x),\) and the quantile function projects those draws onto actual values \(x.\)

For monotonically increasing and continuous cdf’s, this is simply the inverse function of the cdf. In our case, it is:

\[F^{-1}(p) = \sqrt{p},\]

where \(p\in[0,1].\)
In `Julia` we can define this function in the following way:

```
invF(p) = 0≤p≤1 ? √(p) : @error "Forbidden argument: p∉[0,1]"
```

```
invF (generic function with 1 method)
```

```
plot(invF, 0, 1)
xlabel!("p"); ylabel!("x")
```

Now we can generate some numbers from the uniform distribution and then, using `invF`, project those draws onto the actual values of \(x\). We do this exercise for \(10,000\) draws and for \(20\) draws:

```
draws = rand(10000);
draws_x = invF.(draws);
```

```
draws_very_few = rand(20);
draws_x_very_few = invF.(draws_very_few);
```
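A quick sanity check, using only the standard library (the variable name `x_check` is illustrative): with \(F(x)=x^2\), the density is \(f(x)=2x\) on \([0,1]\), so the mean of the draws should be near \(\int_0^1 2x^2\,dx = 2/3\).

```julia
using Statistics

x_check = sqrt.(rand(100_000))   # same inverse-cdf transformation as invF
println(mean(x_check))           # ≈ 0.667
```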

First, let’s plot the empirical cdf’s of numbers drawn from the uniform distribution and compare them with the true cdf (which for the standard uniform distribution is equal to \(G(x)=x\)).
To get empirical cdf’s, I will use `ecdf` from package `StatsBase`.
The output of `ecdf` is a *function* that returns the implied values of the cdf for *all* values (not only the drawn ones).

```
using StatsBase
draws_ecdf = ecdf(draws);
draws_very_few_ecdf = ecdf(draws_very_few);
```

```
draws_ecdf(.5)
```

```
0.4986
```

```
draws_very_few_ecdf.( [.2 .4 .6] )
```

```
1×3 Matrix{Float64}:
0.3 0.6 0.75
```

```
using Plots
using Pipe
@pipe range(0,1, length=100) |> plot(_, draws_ecdf.(_), label="Empirical CDF (N=10,000)",
legend = :topleft)
@pipe range(0,1, length=100) |> plot!(_, draws_very_few_ecdf.(_), label ="Empirical CDF (N=20)")
@pipe range(0,1, length=100) |> plot!(_, map( x-> x, _), label ="True CDF of the uniform dist.")
title!("Draws of quantiles vs. cdf of the uniform dist.")
xlabel!("p")
ylabel!("cdf")
```

As can be seen, the empirical cdf with 10,000 draws resembles the true cdf of the uniform distribution very closely, while the discrepancy is much bigger for the simulation with only 20 observations.

Now let’s compare the empirical cdf of implied \(x\) for 10,000 and 20 draws with the true cdf, the quadratic function from our example.

```
using StatsBase
cdf_x = ecdf(draws_x);
cdf_x_very_few = ecdf(draws_x_very_few);
```

```
using Pipe
@pipe range(0,1, length=100) |> plot(_, cdf_x.(_), label="Empirical CDF (N=10000)",
legend=:topleft)
@pipe range(0,1, length=100) |> plot!(_, cdf_x_very_few.(_), label ="Empirical CDF (N=20)")
@pipe range(0,1, length=100) |> plot!(_, F.(_), label ="True CDF")
```

Our exercise shows that by using the quantile function and a random number generator for the uniform distribution, we can draw numbers from any distribution that has a continuous cdf and a closed-form quantile function.

### Continuous cdf but no closed-form solution for the quantile function

Suppose that we want to draw random numbers according to a distribution given by the following cumulative distribution function:

\[G(x) = \begin{cases} 0 & \text{for } x < 0\\ \frac{x^2 + x\sqrt{x+3}}{3} & \text{for } 0\le x\le 1\\ 1 & \text{for } x > 1 \end{cases}\]

Function \(G(x)\) is a continuous cdf just like in our previous example:

```
G(x) = 0≤x≤1 ? ( x^2 + x * √(x+3) )/3 :
( x< 0 ? 0 : 1)
plot(G, -.1, 1.1)
```

Similarly to \(F(x)\), \(G(x)\) is a continuous cdf. In contrast to our previous example, it is not easy to derive the inverse of \(G(x)\) in closed form. Recall that for a given quantile \(p^*,\) we want to find the \(x\) that satisfies \(p^* = G(x)\). Therefore, the quantile function is defined as:

\[G^{-1}(p^*) = \{x : G(x) = p^*\}.\]

This issue can be addressed numerically. We can run a root finding procedure for the function:

\[f(x) = p^* - G(x),\]

where \(p^*\) is the parameter.
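The root-finding idea can be sketched with a standard-library-only bisection, exploiting that \(G\) is increasing on \([0,1]\) (the helper names `G_` and `quantile_bisect` are illustrative, not from the text):

```julia
# cdf from the example, written as a plain function
G_(x) = x < 0 ? 0.0 : (x > 1 ? 1.0 : (x^2 + x*sqrt(x + 3))/3)

# Bisection: shrink [lo, hi] around the x where G_(x) = p
function quantile_bisect(p; lo=0.0, hi=1.0, tol=1e-10)
    while hi - lo > tol
        mid = (lo + hi)/2
        G_(mid) < p ? (lo = mid) : (hi = mid)
    end
    return (lo + hi)/2
end

println(quantile_bisect(0.5))   # ≈ 0.6005, the median
```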

Let’s illustrate this by finding the median of the distribution given by \(G(x).\)
For this, we can use function `find_zero` from package `Roots`:

```
using Roots
median = find_zero( x -> .5 - G(x), .1) #2nd argument: initial guess for $x$
```

```
0.6004852578317297
```


For the distribution given by \(G(x)\), the median is equal to around \(0.6.\) We can illustrate this by using a plot:

```
plot(G, -.1, 1.1, label="cdf",
legend=:topleft)
plot!([-.1, median], [.5, .5], label="value of p*")
plot!([median, median], [0, .5], label="root of the cdf given p*")
xlabel!("x")
ylabel!("p")
```

Note

More generally, we can define function `invG(p)`, which for a given input `p` conducts the root finding procedure `find_zero( x -> p - G(x), .1)`.

```
invG(p) = find_zero( x -> p - G(x), .1)
```

```
invG (generic function with 1 method)
```

```
draws = rand(10000);
draws_G = invG.(draws);
draws_very_few = rand(20);
draws_G_very_few = invG.(draws_very_few);
```

```
ecdf_G = ecdf(draws_G);
ecdf_G_very_few = ecdf(draws_G_very_few);
```

```
@pipe range(0,1, length=100) |> plot(_, ecdf_G.(_), label="Empirical CDF (N=10000)",
legend=:topleft)
@pipe range(0,1, length=100) |> plot!(_, ecdf_G_very_few.(_), label ="Empirical CDF (N=20)")
@pipe range(0,1, length=100) |> plot!(_, G.(_), label ="True CDF")
```

### Cdfs with discrete jumps

Suppose that we want to draw random numbers according to a distribution given by the following cumulative distribution function:

\[H(x) = \begin{cases} 0 & \text{for } x < 0\\ x^3 & \text{for } 0\le x < 0.5\\ x^2 & \text{for } 0.5\le x\le 1\\ 1 & \text{for } x > 1 \end{cases}\]

Just like before, we can define the cdf using the `?:` ternary operator:

```
H(x) = 0≤x≤1 ? ( x <.5 ? x^3 : x^2) :
( x < 0 ? 0 : 1)
```

```
H (generic function with 1 method)
```

```
plot(H, -.1, 1.1)
```

In the considered function, there is a discrete jump at \(x=0.5,\) which means that there is a probability mass at this point. In our case, the probability that the drawn number is exactly \(0.5\) equals \(\left(0.5\right)^2-\left(0.5\right)^3=0.125.\)

For distributions with discrete jumps, we need to use a generalized version of the quantile function:

\[Q(p) = \inf\{x : F(x) \ge p\}.\]

Consequently, the (generalized) quantile function for the discussed example is as follows:

\[H^{-1}(p) = \begin{cases} \sqrt[3]{p} & \text{for } 0\le p < 0.125\\ 0.5 & \text{for } 0.125\le p\le 0.25\\ \sqrt{p} & \text{for } 0.25 < p\le 1 \end{cases}\]

```
invH(p) = 0≤p≤1 ? ( p<.125 ? ∛(p) :
.125 ≤ p ≤ .25 ? .5 :
√(p) ) :
@error "Forbidden argument: p∉[0,1]"
```

```
invH (generic function with 1 method)
```

All other steps will be exactly the same as in the initial example with a continuous cdf.

```
draws = rand(10000);
draws_H = invH.(draws);
draws_very_few = rand(20);
draws_H_very_few = invH.(draws_very_few);
```

```
ecdf_H = ecdf(draws_H);
ecdf_H_very_few = ecdf(draws_H_very_few);
```

```
@pipe range(0,1, length=100) |> plot(_, ecdf_H.(_), label="Empirical CDF (N=10000)",
legend=:topleft)
@pipe range(0,1, length=100) |> plot!(_, ecdf_H_very_few.(_), label ="Empirical CDF (N=20)")
@pipe range(0,1, length=1000) |> plot!(_, H.(_), label ="True CDF")
```

As we can see, the procedure (with a sufficient number of draws) successfully replicates the discrete jump at \(0.5.\)
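The probability mass can also be checked directly: the share of draws landing exactly on \(0.5\) should be close to the jump size \(0.125\). A standard-library-only sketch (the names `invH_` and `draws_check` are illustrative):

```julia
using Statistics

# Generalized quantile function from the example
invH_(p) = p < .125 ? cbrt(p) : (p ≤ .25 ? 0.5 : sqrt(p))

draws_check = invH_.(rand(100_000))
println(mean(draws_check .== 0.5))   # ≈ 0.125
```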