Broadcasting: arrays meet scalars
Contents
Broadcasting: arrays meet scalars¶
In the section describing arrays, we faced one seemingly bizzare obstacle in operating with objects of different data structures.
In our discussed example we tried to add a scalar number, 1, to a matrix A by running A+1.
In Julia it is not possible.
While trying running A+1 you get an error message.
We addressed this problem by using function repeat.
Here I present a much more recommended approach and to this end I will use function broadcast.
Broadcasting applies a certain function f over the sequence of elements of the provided input objects.
In our example, we want to broadcast function +() over elements of A and 1:
A = [1 2; 3 4] #our matrix
output = broadcast(+, A , 1)
2×2 Matrix{Int64}:
2 3
4 5
The broadcasting procedure adds A[1,1] and 1 and assings the result to element output[1,1]. Next, it adds A[2,1] and 1 and assings the result to element output[2,1]. In total, the operation is conducted four times and as a result you get a new object output of size 2×2.
Broadcasting allows to use not only arrays and scalars but also arrays of different sizes in a very similar way:
broadcast(+, A , [1 2])
2×2 Matrix{Int64}:
2 4
4 6
Dot (.) notation for broadcasting¶
The number of input objects depends on the number of required inputs of the broadcasted function. To illustrate this, consider a following example:
Example 3
Suppose that we want to create an exponential grid of n elements from k1 to kn. In a grid of this type, the elements are defined as follows:
where d=lnkn−lnk1n−1.
The grid on the logarithmic scale (lnki)ni=1 can be computed as follows:
k₁ = 1;
kₙ = 12;
n = 1000;
d = (kₙ-k₁)/(n-1);
ln_k = collect( range(log(k₁), stop=log(kₙ), length=n) );
Equipped with (lnki)ni=1, we would like to recover the grid with true values of (ki)ni=1.
We can do it by exponentiating each element of ln_k.
broadcast might be very useful for this:
k = broadcast(exp, ln_k)
1000-element Vector{Float64}:
1.0
1.0024904901749636
1.004987182891239
1.007490093596194
1.0099992377756684
1.0125146309540696
1.0150362886944675
1.0175642265986928
1.0200984603074312
1.0226390055003225
1.0251858778960556
1.0277390932524673
1.0302986673666386
⋮
11.676115152945892
11.705194403016046
11.734346074672795
11.76357034828139
11.792867404656281
11.822237425062225
11.851680591215429
11.881197085284661
11.910787089892368
11.940450788115827
11.97018836348827
12.0
Procedure broadcast has a convenient short dot (.) notation.
Instead of using broadcast(f, As), we can do exactly the same thing by adding . at the end of function f, i.e. f.(As).
In our example it will be:
exp.(ln_k)
1000-element Vector{Float64}:
1.0
1.0024904901749636
1.004987182891239
1.007490093596194
1.0099992377756684
1.0125146309540696
1.0150362886944675
1.0175642265986928
1.0200984603074312
1.0226390055003225
1.0251858778960556
1.0277390932524673
1.0302986673666386
⋮
11.676115152945892
11.705194403016046
11.734346074672795
11.76357034828139
11.792867404656281
11.822237425062225
11.851680591215429
11.881197085284661
11.910787089892368
11.940450788115827
11.97018836348827
12.0
Arithmetic operations such as +, -, ^, /, * can be broadcasted by adding . in front of operators:
A .+ [1 2]
2×2 Matrix{Int64}:
2 4
4 6
Don’t use repeat for what can be done with broadcast¶
In section on arrays, I discussed a method involving repeat instead of broadcast.
Now, I will show how inefficient it is in comparison to broadcasting.
For this, I am going compare both approaches in adding 1 to each element of an array of size 100,000×10,000.
In the exercise macro @btime from BenchmarkTools is used to measure the execution time of both methods.
using BenchmarkTools
X = repeat(collect(1:100), 1000, 10000);
@show size(X)
size(X) = (100000, 10000)
(100000, 10000)
@btime X .+ 1;
@btime X + repeat([1], 100000, 10000);
15.541 s (7 allocations: 14.90 GiB)
The broadcasted (execution time: 3.308 s) function is almost five times faster than using repeat (execution time: 15.541 s).
The main reason for this is that repeat([1], 100000, 10000) generates a new large object, while in broadcast this step is skipped.
Updating objects: new assignments vs. elementwise updates¶
Suppose that you have a large array of dimension 10,000×10,000 and you would like to increase the value of each element by 1. Below you can find three alternative and seemingly very similar implementations. Macro @time measure the execution time.
A = repeat(1:10000, 10000)
B = repeat(1:10000, 10000)
@time B = B .+ 1;
@time A .+= 1;
0.567747 seconds (4 allocations: 762.940 MiB, 18.89% gc time)
0.123577 seconds (2 allocations: 64 bytes)
As you can see B = B .+ 1; is around 5 times slower than A .+= 1;. The reason for this is that in B = B .+ 1;, Julia creates a copy of B, adds 1 and then assigns the output to matrix B again. In A .+= 1; the operations (assignment and adding) are performed elementwise and no copy of A is created in the operation.
Too many dots? @. is the answer!¶
Example 4
Suppose that we have vector x=(1,2,3,4,5)′ and we would like to perform the following operation for each element of the vector:
Using broadcast, this can be done in the following way:
x = 1:5;
broadcast( /, broadcast( - , broadcast(^, x, 2), 1), 3)
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
As can be seen, a relatively simple operation is written in a very complicated way making the code extremely illegible.
With . notation it can be slightly improved:
(x.^2 .- 1)./3
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
The code looks much better now but still the number of .’s might be overwhelming for some people.
This issue can be addressed by applying @. at the beginning of the line.
This macro converts all functions in the line to broadcasted functions (actually @. can be put in any place of the line. The scope of the macro is from symbol @. to the the end of the current line).
In our example we will have:
@. (x^2 - 1)/3
5-element Vector{Float64}:
0.0
1.0
2.6666666666666665
5.0
8.0
Some caveats on thoughtless usage of @.¶
Example 5
Suppose that we have vector x=(1,2,3,4,5)′ and we would like to compute the variance of its elements by using a textbook formula:
To avoid multiple .’s, we might be tempted to use @. notation.
However, the result can be quite surprising:
@. sum( (x - sum(x)/length(x))^2 )/(length(x) - 1)
5-element Vector{Float64}:
NaN
NaN
NaN
NaN
NaN
To understand what happens, recall that macro @. broadcasts all functions in the line.
In particular, it means that sum and lenght are broadcasted too.
This is not what we want.
sum.(x) computes for each element s the sum of all elements, i.e. ∑si=sxi, which is nothing else than xs itself.
In a similar fashion, length.(x) computes for each element s the length of this element, which is always 1.
As a result we have:
@show sum.(x);
@show length.(x);
sum.(x) = [1, 2, 3, 4, 5]
length.(x) = [1, 1, 1, 1, 1]
To compute the variance we need to use . notation only for -() and ^():
sum( (x .- sum(x)/length(x)).^2 )/(length(x) - 1)
2.5
broadcast vs. map¶
to be written
@show map(+, A, 1);
@show broadcast(+, A, 1);
map(+, A, 1) = [2]
broadcast(+, A, 1) = [2 3; 4 5]