Broadcasting: arrays meet scalars

In the section describing arrays, we faced one seemingly bizzare obstacle in operating with objects of different data structures. In our discussed example we tried to add a scalar number, 1, to a matrix A by running A+1. In Julia it is not possible. While trying running A+1 you get an error message. We addressed this problem by using function repeat. Here I present a much more recommended approach and to this end I will use function broadcast.

Broadcasting applies a certain function f over the sequence of elements of the provided input objects. In our example, we want to broadcast function +() over elements of A and 1:

A = [1 2; 3 4] #our matrix

output = broadcast(+, A , 1)
2×2 Matrix{Int64}:
 2  3
 4  5

The broadcasting procedure adds A[1,1] and 1 and assings the result to element output[1,1]. Next, it adds A[2,1] and 1 and assings the result to element output[2,1]. In total, the operation is conducted four times and as a result you get a new object output of size 2×2.

Broadcasting allows to use not only arrays and scalars but also arrays of different sizes in a very similar way:

broadcast(+, A , [1 2])
2×2 Matrix{Int64}:
 2  4
 4  6

Dot (.) notation for broadcasting

The number of input objects depends on the number of required inputs of the broadcasted function. To illustrate this, consider a following example:

Example 3

Suppose that we want to create an exponential grid of n elements from k1 to kn. In a grid of this type, the elements are defined as follows:

i{1,,n}lnki=lnk1+(i1)d,

where d=lnknlnk1n1.

The grid on the logarithmic scale (lnki)ni=1 can be computed as follows:

k₁    =  1;
kₙ    = 12;
n     = 1000;
d     = (kₙ-k₁)/(n-1);
ln_k  = collect( range(log(k₁), stop=log(kₙ), length=n) );

Equipped with (lnki)ni=1, we would like to recover the grid with true values of (ki)ni=1. We can do it by exponentiating each element of ln_k. broadcast might be very useful for this:

k = broadcast(exp, ln_k)
1000-element Vector{Float64}:
  1.0
  1.0024904901749636
  1.004987182891239
  1.007490093596194
  1.0099992377756684
  1.0125146309540696
  1.0150362886944675
  1.0175642265986928
  1.0200984603074312
  1.0226390055003225
  1.0251858778960556
  1.0277390932524673
  1.0302986673666386
  ⋮
 11.676115152945892
 11.705194403016046
 11.734346074672795
 11.76357034828139
 11.792867404656281
 11.822237425062225
 11.851680591215429
 11.881197085284661
 11.910787089892368
 11.940450788115827
 11.97018836348827
 12.0

Procedure broadcast has a convenient short dot (.) notation. Instead of using broadcast(f, As), we can do exactly the same thing by adding . at the end of function f, i.e. f.(As). In our example it will be:

exp.(ln_k)
1000-element Vector{Float64}:
  1.0
  1.0024904901749636
  1.004987182891239
  1.007490093596194
  1.0099992377756684
  1.0125146309540696
  1.0150362886944675
  1.0175642265986928
  1.0200984603074312
  1.0226390055003225
  1.0251858778960556
  1.0277390932524673
  1.0302986673666386
  ⋮
 11.676115152945892
 11.705194403016046
 11.734346074672795
 11.76357034828139
 11.792867404656281
 11.822237425062225
 11.851680591215429
 11.881197085284661
 11.910787089892368
 11.940450788115827
 11.97018836348827
 12.0

Arithmetic operations such as +, -, ^, /, * can be broadcasted by adding . in front of operators:

A .+ [1 2]
2×2 Matrix{Int64}:
 2  4
 4  6

Don’t use repeat for what can be done with broadcast

In section on arrays, I discussed a method involving repeat instead of broadcast. Now, I will show how inefficient it is in comparison to broadcasting. For this, I am going compare both approaches in adding 1 to each element of an array of size 100,000×10,000. In the exercise macro @btime from BenchmarkTools is used to measure the execution time of both methods.

using BenchmarkTools

X = repeat(collect(1:100), 1000, 10000);
@show size(X)
size(X) = (100000, 10000)
(100000, 10000)
@btime X .+ 1;
@btime X + repeat([1], 100000, 10000);
  15.541 s (7 allocations: 14.90 GiB)

The broadcasted (execution time: 3.308 s) function is almost five times faster than using repeat (execution time: 15.541 s). The main reason for this is that repeat([1], 100000, 10000) generates a new large object, while in broadcast this step is skipped.

Updating objects: new assignments vs. elementwise updates

Suppose that you have a large array of dimension 10,000×10,000 and you would like to increase the value of each element by 1. Below you can find three alternative and seemingly very similar implementations. Macro @time measure the execution time.

A = repeat(1:10000, 10000)
B = repeat(1:10000, 10000)



@time B  = B .+ 1;
@time A .+= 1;
  0.567747 seconds (4 allocations: 762.940 MiB, 18.89% gc time)
  0.123577 seconds (2 allocations: 64 bytes)

As you can see B  = B .+ 1; is around 5 times slower than A .+= 1;. The reason for this is that in B  = B .+ 1;, Julia creates a copy of B, adds 1 and then assigns the output to matrix B again. In A .+= 1; the operations (assignment and adding) are performed elementwise and no copy of A is created in the operation.

Too many dots? @. is the answer!

Example 4

Suppose that we have vector x=(1,2,3,4,5) and we would like to perform the following operation for each element of the vector:

i{1,,5}xi213.

Using broadcast, this can be done in the following way:

x = 1:5;

broadcast( /, broadcast( - , broadcast(^, x, 2), 1), 3)
5-element Vector{Float64}:
 0.0
 1.0
 2.6666666666666665
 5.0
 8.0

As can be seen, a relatively simple operation is written in a very complicated way making the code extremely illegible. With . notation it can be slightly improved:

(x.^2 .- 1)./3
5-element Vector{Float64}:
 0.0
 1.0
 2.6666666666666665
 5.0
 8.0

The code looks much better now but still the number of .’s might be overwhelming for some people. This issue can be addressed by applying @. at the beginning of the line. This macro converts all functions in the line to broadcasted functions (actually @. can be put in any place of the line. The scope of the macro is from symbol @. to the the end of the current line). In our example we will have:

@. (x^2 - 1)/3
5-element Vector{Float64}:
 0.0
 1.0
 2.6666666666666665
 5.0
 8.0

Some caveats on thoughtless usage of @.

Example 5

Suppose that we have vector x=(1,2,3,4,5) and we would like to compute the variance of its elements by using a textbook formula:

Var(x)=ni=1(xini=1xin)2n1.

To avoid multiple .’s, we might be tempted to use @. notation. However, the result can be quite surprising:

@. sum( (x - sum(x)/length(x))^2 )/(length(x) - 1)
5-element Vector{Float64}:
 NaN
 NaN
 NaN
 NaN
 NaN

To understand what happens, recall that macro @. broadcasts all functions in the line. In particular, it means that sum and lenght are broadcasted too. This is not what we want. sum.(x) computes for each element s the sum of all elements, i.e. si=sxi, which is nothing else than xs itself. In a similar fashion, length.(x) computes for each element s the length of this element, which is always 1. As a result we have:

@show sum.(x);
@show length.(x);
sum.(x) = [1, 2, 3, 4, 5]
length.(x) = [1, 1, 1, 1, 1]

To compute the variance we need to use . notation only for -() and ^():

sum( (x .- sum(x)/length(x)).^2 )/(length(x) - 1)
2.5

broadcast vs. map

to be written

@show map(+, A, 1);
@show broadcast(+, A, 1);
map(+, A, 1) = [2]
broadcast(+, A, 1) = [2 3; 4 5]