Collections

Author

Marie-Hélène Burle

Values can be stored in collections. This workshop introduces tuples, dictionaries, sets, and arrays in Julia.

Tuples

Tuples are immutable, indexable, and possibly heterogeneous collections of elements. The order of elements matters.

# Possibly heterogeneous (values can be of different types)
typeof((2, 'a', 1.0, "test"))
Tuple{Int64, Char, Float64, String}
# Indexable (note that indexing in Julia starts at 1)
x = (2, 'a', 1.0, "test");
x[3]
1.0
# Immutable (they cannot be modified)
# So this returns an error
x[3] = 8
MethodError: no method matching setindex!(::Tuple{Int64, Char, Float64, String}, ::Int64, ::Int64)
The function `setindex!` exists, but no method is defined for this combination of argument types.
Stacktrace:
 [1] top-level scope
   @ ~/parvus/prog/mint/julia/intro_collections.qmd:32
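
Because a tuple cannot be modified in place, a common workaround is to build a new tuple from the old one. The following is only a minimal sketch reusing the tuple above; the name y is illustrative:

```julia
x = (2, 'a', 1.0, "test")

# Splat the parts to keep around the new value into a fresh tuple
y = (x[1:2]..., 8, x[4:end]...)

# x is unchanged; y holds the replaced element
println(x)
println(y)
```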

Named tuples

Tuples can have named components:

typeof((a=2, b='a', c=1.0, d="test"))
@NamedTuple{a::Int64, b::Char, c::Float64, d::String}
x = (a=2, b='a', c=1.0, d="test");
x.c
1.0
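
As a short sketch of what else can be done with the named tuple above: components remain reachable by position, names and values can be retrieved separately, and merge returns a new named tuple rather than modifying the existing one. The name y is illustrative:

```julia
x = (a=2, b='a', c=1.0, d="test")

# Components are still reachable by position
x[3]                    # same element as x.c

# Names and values can be retrieved separately
keys(x)                 # (:a, :b, :c, :d)
values(x)               # (2, 'a', 1.0, "test")

# Named tuples are immutable too: merge returns a new one with c replaced
y = merge(x, (c=2.5,))
```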

Dictionaries

Julia also has dictionaries, which are associative collections of key/value pairs:

x = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Roger"

"Name", "Age", and "Index" are the keys; "Roger", 52, and 0.3 are the values.

The => operator creates a Pair, just as calling Pair directly does:

p = "foo" => 7
"foo" => 7
q = Pair("bar", 8)
"bar" => 8

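A pair can be taken apart again, and pairs are what a dictionary is built from. A minimal sketch using the two pairs above (the name d is illustrative):

```julia
p = "foo" => 7
q = Pair("bar", 8)

# A pair stores its two halves in the fields first and second
p.first      # "foo"
p.second     # 7

# Both spellings create the same kind of object
typeof(p) == typeof(q)

# Pairs can be passed directly to Dict
d = Dict(p, q)
```
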
Dictionaries can be heterogeneous (as in this example) and the order of the entries doesn’t matter. They are indexed by key:

x["Name"]
"Roger"

And mutable (they can be modified):

x["Name"] = "Alex";
x
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Alex"

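A few other common dictionary operations, sketched on the same example. The "City" key and its value are made up for illustration and are not part of the original data:

```julia
x = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)

# Check for a key before using it
haskey(x, "Age")            # true

# get returns a fallback instead of throwing a KeyError when the key is missing
get(x, "City", "unknown")   # "unknown"

# Add a new entry (illustrative value), then remove an existing one
x["City"] = "Vancouver"
delete!(x, "Index")

# Iterate over the key/value pairs (the order is not guaranteed)
for (k, v) in x
    println(k, " => ", v)
end
```
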
Sets

Sets are collections without duplicates. The order of elements doesn’t matter.

set1 = Set([9, 4, 8, 2, 7, 8])
Set{Int64} with 5 elements:
  4
  7
  2
  9
  8

Notice how this is a set of 5 (and not 6) elements: the duplicate 8 was dropped.

set2 = Set([10, 2, 3])
Set{Int64} with 3 elements:
  2
  10
  3

You can combine sets with the usual set operations:

# The union is the set of elements that are in one OR the other set
union(set1, set2)
Set{Int64} with 7 elements:
  4
  7
  2
  10
  9
  8
  3
# The intersection is the set of elements that are in one AND the other set
intersect(set1, set2)
Set{Int64} with 1 element:
  2
# The set difference (setdiff) is the set of elements that are in the first set but not in the second
# Note that the order of the arguments matters here
setdiff(set1, set2)
Set{Int64} with 4 elements:
  4
  7
  9
  8

Sets can be heterogeneous:

Set(["test", 9, :a])
Set{Any} with 3 elements:
  :a
  "test"
  9
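
Besides the operations above, sets support membership and subset tests, and they are mutable. A minimal sketch reusing set1 and set2 as defined earlier:

```julia
set1 = Set([9, 4, 8, 2, 7, 8])
set2 = Set([10, 2, 3])

# Membership test
4 in set1                     # true
4 in set2                     # false

# Subset test
issubset(Set([2, 4]), set1)   # true

# Sets are mutable: push! adds an element, and adding a duplicate has no effect
push!(set2, 11)
push!(set2, 11)
length(set2)                  # 4
```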

Arrays

Vectors

One-dimensional arrays in Julia are called vectors.

Vectors of one element

[3]
1-element Vector{Int64}:
 3
[3.4]
1-element Vector{Float64}:
 3.4
["Hello, World!"]
1-element Vector{String}:
 "Hello, World!"

Vectors of multiple elements

[3, 4]
2-element Vector{Int64}:
 3
 4

Two-dimensional arrays

[3 4]
1×2 Matrix{Int64}:
 3  4
[[1, 3] [1, 2]]
2×2 Matrix{Int64}:
 1  1
 3  2

Syntax subtleties

These 3 syntaxes are equivalent:

[2 4 8]
1×3 Matrix{Int64}:
 2  4  8
hcat(2, 4, 8)
1×3 Matrix{Int64}:
 2  4  8
cat(2, 4, 8, dims=2)
1×3 Matrix{Int64}:
 2  4  8

These 4 syntaxes are equivalent:

[2
 4
 8]
3-element Vector{Int64}:
 2
 4
 8
[2; 4; 8]
3-element Vector{Int64}:
 2
 4
 8
vcat(2, 4, 8)
3-element Vector{Int64}:
 2
 4
 8
cat(2, 4, 8, dims=1)
3-element Vector{Int64}:
 2
 4
 8

Elements separated by semicolons or newlines are concatenated vertically.
Those separated by commas are not concatenated: each one becomes an element of a vector.
Elements separated by spaces or tabs are concatenated horizontally.
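
The three rules above can be checked by looking at the size of the result. This is only a small sketch with scalar elements; the name m is illustrative:

```julia
# Spaces build a row: a 1×3 matrix
size([2 4 8])        # (1, 3)

# Semicolons build a column: a 3-element vector
size([2; 4; 8])      # (3,)

# Commas list the elements: for scalars, this is also a 3-element vector
size([2, 4, 8])      # (3,)

# Spaces and semicolons combine to write a matrix row by row
m = [1 2; 3 4]
size(m)              # (2, 2)
```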

Your turn:

Compare the outputs of the following:

[1:2; 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2
 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2, 3:4]
2-element Vector{UnitRange{Int64}}:
 1:2
 3:4
[1:2 3:4]
2×2 Matrix{Int64}:
 1  3
 2  4

Arrays and types

In Julia, arrays can be heterogeneous:

[3, "hello"]
2-element Vector{Any}:
 3
  "hello"

This is possible because every type in Julia is a subtype of Any: when the elements have no more specific type in common, the array gets the element type Any.
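
The element type Julia picks for an array literal can be inspected with eltype. A short sketch (the name v is illustrative):

```julia
# Mixed integers and floats are promoted to a common concrete type
eltype([1, 2.0])         # Float64

# With no common concrete type, the element type falls back to Any
eltype([3, "hello"])     # Any

# The element type can also be set explicitly in front of the brackets
v = Float64[1, 2, 3]
eltype(v)                # Float64

# Every type is a subtype of Any
Int64 <: Any             # true
String <: Any            # true
```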

Initializing arrays

Below are examples of some of the functions that create and initialize arrays:

rand(2, 3, 4)
2×3×4 Array{Float64, 3}:
[:, :, 1] =
 0.993828   0.882674  0.156826
 0.0488372  0.384544  0.38646

[:, :, 2] =
 0.947572  0.211583   0.245329
 0.200717  0.0671244  0.11207

[:, :, 3] =
 0.920057  0.535439  0.758574
 0.633244  0.701944  0.922863

[:, :, 4] =
 0.0622371  0.996037  0.495487
 0.623987   0.402097  0.226669
rand(Int64, 2, 3, 4)
2×3×4 Array{Int64, 3}:
[:, :, 1] =
 -1659704258553532985  5092626041193358980  -6744239143480713967
  5676627516220431085   859766954670435512   2040841354476197273

[:, :, 2] =
 -8675906160027700335  -6349630634794011755  -735734371108161759
   138637581457749019  -3183745125470321952  8635673206110177569

[:, :, 3] =
  -26801811691569741   7032665316102188392   372415012557512763
 8384772822022222227  -8182155477413591973  7840102709525219400

[:, :, 4] =
  7791255455207374125  1527910740453680593  -8625969480346811118
 -2584660519127820262  4492350654429620582  -3802575830933417107
zeros(Int64, 2, 5)
2×5 Matrix{Int64}:
 0  0  0  0  0
 0  0  0  0  0
ones(2, 5)
2×5 Matrix{Float64}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
reshape([1, 2, 4, 2], (2, 2))
2×2 Matrix{Int64}:
 1  4
 2  2
fill("test", (2, 2))
2×2 Matrix{String}:
 "test"  "test"
 "test"  "test"

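A few more ways to create arrays, given as a sketch rather than a complete list. The names r, f, u, and s are illustrative:

```julia
# Turn a range into a vector
r = collect(1:2:9)                # [1, 3, 5, 7, 9]

# fill also works with numbers and a single dimension
f = fill(0.5, 3)                  # [0.5, 0.5, 0.5]

# Allocate without initializing: the contents are arbitrary until assigned
u = Array{Float64}(undef, 2, 3)
size(u)                           # (2, 3)

# Create an uninitialized array with the same shape and element type as u
s = similar(u)
size(s) == size(u)
```
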
Broadcasting

To apply a function to each element of a collection rather than to the collection as a whole, Julia uses broadcasting.

a = [-3, 2, -5]
3-element Vector{Int64}:
 -3
  2
 -5
abs(a)
LoadError: MethodError: no method matching abs(::Vector{Int64})

This doesn’t work because the function abs is defined for individual numbers, not for arrays.

By broadcasting abs, you apply it to each element of a:

broadcast(abs, a)
3-element Vector{Int64}:
 3
 2
 5

The dot notation is equivalent:

abs.(a)
3-element Vector{Int64}:
 3
 2
 5

The dot can also be applied to the pipe operator, to unary and binary operators, etc. For example, with the pipe:

a .|> abs
3-element Vector{Int64}:
 3
 2
 5
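
The same dot works with binary operators and nested calls. A minimal sketch reusing the vector a from above; the @. macro is a shorthand that dots every call and operator in an expression:

```julia
a = [-3, 2, -5]

# Dotted binary operators work element by element
a .+ 1              # [-2, 3, -4]
a .^ 2              # [9, 4, 25]
a .* a              # [9, 4, 25]

# Dotted calls can be nested
r = sqrt.(abs.(a))

# The @. macro adds a dot to every call and operator in the expression
b = @. abs(a) + 1   # [4, 3, 6]
```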

Your turn:

Try to understand the difference between the following 2 expressions:

abs.(a) == a .|> abs
true
abs.(a) .== a .|> abs
3-element BitVector:
 1
 1
 1

Hint: in arrays of Booleans, 0 and 1 are displayed as a short form of false and true.

Comprehensions

Julia has an array comprehension syntax similar to Python’s:

[ 3i + j for i=1:10, j=3 ]
10-element Vector{Int64}:
  6
  9
 12
 15
 18
 21
 24
 27
 30
 33

Indexing

As in other mathematically oriented languages such as R, Julia starts indexing at 1.

Indexing is done with square brackets:

a = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4
a[1, 1]
1
a[1, :]
2-element Vector{Int64}:
 1
 2
a[:, 1]
2-element Vector{Int64}:
 1
 3
# Here, we are indexing a tuple
(2, 4, 1.0, "test")[2]
4
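
Square brackets also accept the keyword end, ranges, a single linear index, and Boolean masks. A minimal sketch on a separate matrix m (an illustrative name):

```julia
m = [10 20 30; 40 50 60]

# end refers to the last index along a dimension
m[end, end]          # 60

# A range selects several elements at once
m[2, 2:3]            # [50, 60]

# A single index walks the array column by column
m[4]                 # 50

# A Boolean mask keeps the elements where the condition holds
m[m .> 35]           # [40, 50, 60]
```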

Your turn:

Index the element on the 3rd row and 2nd column of b:

b = ["wrong" "wrong" "wrong"; "wrong" "wrong" "wrong"; "wrong" "you got it" "wrong"]
3×3 Matrix{String}:
 "wrong"  "wrong"       "wrong"
 "wrong"  "wrong"       "wrong"
 "wrong"  "you got it"  "wrong"

Your turn:

a = [1 2; 3 4]
a[1, 1]
a[1, :]

How can I get the second column?
How can I get the tuple (2, 4)? (A tuple is an immutable, ordered collection of elements.)

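As in Python, arrays are mutable and are passed by sharing: assigning an array to a new name or passing it to a function does not copy it. Modifying an element changes the array in place: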

a = [1, 2, 3];
a[1] = 0;
a
3-element Vector{Int64}:
 0
 2
 3

Passing by sharing avoids unnecessary copies of arrays, but it also means that a function can modify an array passed to it.
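
The effect of sharing becomes visible when an array is modified through a function or a second name. The sketch below uses a hypothetical helper, zero_first!, which is not part of Julia; the trailing ! follows the Julia naming convention for functions that modify their arguments:

```julia
# Hypothetical helper: sets the first element of v to 0 in place
function zero_first!(v)
    v[1] = 0
    return v
end

a = [1, 2, 3]
zero_first!(a)
a                    # [0, 2, 3]: the caller's array was modified

# Assignment does not copy either: b is another name for the same array
b = a
b[2] = 0
a                    # [0, 0, 3]

# copy creates an independent array
c = copy(a)
c[3] = 0
a                    # still [0, 0, 3]
```

When the original array must stay untouched, passing copy(a) instead of a is one simple option.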