Broadcasting: arrays meet scalars
Contents
Broadcasting: arrays meet scalars¶
In the section describing arrays, we faced one seemingly bizzare obstacle in operating with objects of different data structures.
In our discussed example we tried to add a scalar number, \(1,\) to a matrix \(\mathbb{A}\) by running \(\mathbb{A}+1.\)
In Julia
it is not possible.
While trying running A+1
you get an error message.
We addressed this problem by using function repeat
.
Here I present a much more recommended approach and to this end I will use function broadcast
.
Broadcasting applies a certain function f
over the sequence of elements of the provided input objects.
In our example, we want to broadcast function +()
over elements of A
and 1
:
A = [1 2; 3 4] #our matrix
output = broadcast(+, A , 1)
2×2 Matrix{Int64}:
2 3
4 5
The broadcasting procedure adds A[1,1]
and 1
and assings the result to element output[1,1]
. Next, it adds A[2,1]
and 1
and assings the result to element output[2,1]
. In total, the operation is conducted four times and as a result you get a new object output
of size \(2\times 2.\)
Broadcasting allows to use not only arrays and scalars but also arrays of different sizes in a very similar way:
broadcast(+, A , [1 2])
2×2 Matrix{Int64}:
2 4
4 6
Dot (.
) notation for broadcasting¶
The number of input objects depends on the number of required inputs of the broadcasted function. To illustrate this, consider a following example:
Suppose that we want to create an exponential grid of \(n\) elements from \(k_1\) to \(k_n.\) In a grid of this type, the elements are defined as follows:
where \(d=\frac{\ln k_n - \ln k_1}{n-1}.\)
The grid on the logarithmic scale \((\ln k_i)_{i=1}^n\) can be computed as follows:
k₁ = 1;
kₙ = 12;
n = 1000;
d = (kₙ-k₁)/(n-1);
ln_k = collect( range(log(k₁), stop=log(kₙ), length=n) );
Equipped with \((\ln k_i)_{i=1}^n\), we would like to recover the grid with true values of \((k_i)_{i=1}^n.\)
We can do it by exponentiating each element of ln_k
.
broadcast
might be very useful for this:
k = broadcast(exp, ln_k)
1000-element Vector{Float64}:
1.0
1.0024904901749636
1.004987182891239
1.007490093596194
1.0099992377756684
1.0125146309540696
1.0150362886944675
1.0175642265986928
1.0200984603074312
1.0226390055003225
1.0251858778960556
1.0277390932524673
1.0302986673666386
⋮
11.676115152945892
11.705194403016046
11.734346074672795
11.76357034828139
11.792867404656281
11.822237425062225
11.851680591215429
11.881197085284661
11.910787089892368
11.940450788115827
11.97018836348827
12.0
Procedure broadcast
has a convenient short dot (.
) notation.
Instead of using broadcast(f, As)
, we can do exactly the same thing by adding .
at the end of function f
, i.e. f.(As)
.
In our example it will be:
exp.(ln_k)
1000-element Vector{Float64}:
1.0
1.0024904901749636
1.004987182891239
1.007490093596194
1.0099992377756684
1.0125146309540696
1.0150362886944675
1.0175642265986928
1.0200984603074312
1.0226390055003225
1.0251858778960556
1.0277390932524673
1.0302986673666386
⋮
11.676115152945892
11.705194403016046
11.734346074672795
11.76357034828139
11.792867404656281
11.822237425062225
11.851680591215429
11.881197085284661
11.910787089892368
11.940450788115827
11.97018836348827
12.0
Arithmetic operations such as +
, -
, ^
, /
, *
can be broadcasted by adding .
in front of operators:
A .+ [1 2]
2×2 Matrix{Int64}:
2 4
4 6
Don’t use repeat
for what can be done with broadcast
¶
In section on arrays, I discussed a method involving repeat
instead of broadcast
.
Now, I will show how inefficient it is in comparison to broadcasting.
For this, I am going compare both approaches in adding \(1\) to each element of an array of size \(100,000\times 10,000.\)
In the exercise macro @btime
from BenchmarkTools
is used to measure the execution time of both methods.
using BenchmarkTools
X = repeat(collect(1:100), 1000, 10000);
@show size(X)
size(X) = (100000, 10000)
(100000, 10000)
@btime X .+ 1;
@btime X + repeat([1], 100000, 10000);
15.541 s (7 allocations: 14.90 GiB)
The broadcasted (execution time: 3.308 s) function is almost five times faster than using repeat
(execution time: 15.541 s).
The main reason for this is that repeat([1], 100000, 10000)
generates a new large object, while in broadcast
this step is skipped.
Updating objects: new assignments vs. elementwise updates¶
Suppose that you have a large array of dimension \(10,000\times 10,000\) and you would like to increase the value of each element by \(1.\) Below you can find three alternative and seemingly very similar implementations. Macro @time
measure the execution time.
A = repeat(1:10000, 10000)
B = repeat(1:10000, 10000)
@time B = B .+ 1;
@time A .+= 1;
0.567747 seconds (4 allocations: 762.940 MiB, 18.89% gc time)
0.123577 seconds (2 allocations: 64 bytes)
As you can see B = B .+ 1;
is around 5 times slower than A .+= 1;
. The reason for this is that in B = B .+ 1;
, Julia
creates a copy of B
, adds \(1\) and then assigns the output to matrix B
again. In A .+= 1;
the operations (assignment and adding) are performed elementwise and no copy of A
is created in the operation.
Too many dots? @.
is the answer!¶
Suppose that we have vector \(\mathbf{x} = \left(1,2,3,4,5\right)'\) and we would like to perform the following operation for each element of the vector:
Using broadcast
, this can be done in the following way:
x = 1:5;
broadcast( /, broadcast( - , broadcast(^, x, 2), 1), 3)
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
As can be seen, a relatively simple operation is written in a very complicated way making the code extremely illegible.
With .
notation it can be slightly improved:
(x.^2 .- 1)./3
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
The code looks much better now but still the number of .
’s might be overwhelming for some people.
This issue can be addressed by applying @.
at the beginning of the line.
This macro converts all functions in the line to broadcasted functions (actually @.
can be put in any place of the line. The scope of the macro is from symbol @.
to the the end of the current line).
In our example we will have:
@. (x^2 - 1)/3
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
Some caveats on thoughtless usage of @.
¶
Suppose that we have vector \(\mathbf{x} = \left(1,2,3,4,5\right)'\) and we would like to compute the variance of its elements by using a textbook formula:
To avoid multiple .
’s, we might be tempted to use @.
notation.
However, the result can be quite surprising:
@. sum( (x - sum(x)/length(x))^2 )/(length(x) - 1)
5-element Vector{Float64}:
NaN
NaN
NaN
NaN
NaN
To understand what happens, recall that macro @.
broadcasts all functions in the line.
In particular, it means that sum
and lenght
are broadcasted too.
This is not what we want.
sum.(x)
computes for each element \(s\) the sum of all elements, i.e. \(\sum_{i=s}^s x_i,\) which is nothing else than \(x_s\) itself.
In a similar fashion, length.(x)
computes for each element \(s\) the length of this element, which is always \(1.\)
As a result we have:
@show sum.(x);
@show length.(x);
sum.(x) = [1, 2, 3, 4, 5]
length.(x) = [1, 1, 1, 1, 1]
To compute the variance we need to use .
notation only for -()
and ^()
:
sum( (x .- sum(x)/length(x)).^2 )/(length(x) - 1)
2.5
broadcast
vs. map
¶
to be written
@show map(+, A, 1);
@show broadcast(+, A, 1);
map(+, A, 1) = [2]
broadcast(+, A, 1) = [2 3; 4 5]