Collections

Author

Marie-Hélène Burle

Values can be stored in collections. This workshop introduces tuples, dictionaries, sets, and arrays in Julia.

Tuples

Tuples are immutable, indexable, and possibly heterogeneous collections of elements. The order of elements matters.

# Possibly heterogeneous (values can be of different types)
typeof((2, 'a', 1.0, "test"))
Tuple{Int64, Char, Float64, String}
# Indexable (note that indexing in Julia starts at 1)
x = (2, 'a', 1.0, "test");
x[3]
1.0
# Immutable (they cannot be modified)
# So this returns an error
x[3] = 8
LoadError: MethodError: no method matching setindex!(::Tuple{Int64, Char, Float64, String}, ::Int64, ::Int64)
MethodError: no method matching setindex!(::Tuple{Int64, Char, Float64, String}, ::Int64, ::Int64)

Stacktrace:
 [1] top-level scope
   @ In[4]:3

Named tuples

Tuples can have named components:

typeof((a=2, b='a', c=1.0, d="test"))
@NamedTuple{a::Int64, b::Char, c::Float64, d::String}
x = (a=2, b='a', c=1.0, d="test");
x.c
1.0
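
A few more operations on tuples and named tuples, as a minimal sketch. The names point, person, and older are illustrative and not part of the workshop:

```julia
# Illustrative names only: point, person, older
point = (3, 4.5)
px, py = point                # destructuring: px is 3, py is 4.5

person = (name="Roger", age=52)
keys(person)                  # (:name, :age)
person[:age] == person.age    # a field can be reached by symbol or with a dot

# Named tuples are immutable, but merge returns a new one with updated fields
older = merge(person, (age=53,))
```

Because merge builds a new named tuple, person itself is left unchanged.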

Dictionaries

Julia also has dictionaries, which are associative collections of key/value pairs:

x = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Roger"

"Name", "Age", and "Index" are the keys; "Roger", 52, and 0.3 are the values.

The => operator constructs a Pair, exactly as a call to the Pair constructor does:

p = "foo" => 7
"foo" => 7
q = Pair("bar", 8)
"bar" => 8

Dictionaries can be heterogeneous (as in this example), and the order of their entries doesn’t matter. They are indexed by key:

x["Name"]
"Roger"

And mutable (they can be modified):

x["Name"] = "Alex";
x
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Alex"

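Some common dictionary operations can be sketched as follows. The "City" key and its value are made up for illustration; the functions used (haskey, get, delete!) are part of Julia's Base:

```julia
x = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)

haskey(x, "Age")             # true
get(x, "City", "unknown")    # returns the default because "City" is not a key yet
x["City"] = "Vancouver"      # assigning to a new key adds an entry (made-up value)
delete!(x, "Index")          # removes the entry in place

# Iterating yields key/value pairs, in no guaranteed order
for (key, value) in x
    println(key, ": ", value)
end
```

Using get with a default avoids the KeyError that x["City"] would throw for a missing key.
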
Sets

Sets are collections without duplicates. The order of elements doesn’t matter.

set1 = Set([9, 4, 8, 2, 7, 8])
Set{Int64} with 5 elements:
  4
  7
  2
  9
  8

Notice how this is a set of 5 (and not 6) elements: the duplicate 8 was dropped.

set2 = Set([10, 2, 3])
Set{Int64} with 3 elements:
  2
  10
  3

You can combine and compare sets:

# The union is the set of elements that are in one OR the other set
union(set1, set2)
Set{Int64} with 7 elements:
  4
  7
  2
  10
  9
  8
  3
# The intersection is the set of elements that are in one AND the other set
intersect(set1, set2)
Set{Int64} with 1 element:
  2
# The set difference (setdiff) is the set of elements that are in the first set but not in the second
# Note that the order matters here
setdiff(set1, set2)
Set{Int64} with 4 elements:
  4
  7
  9
  8

Sets can be heterogeneous:

Set(["test", 9, :a])
Set{Any} with 3 elements:
  :a
  "test"
  9
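
Membership tests and in-place additions are also common with sets. A short sketch, reusing set1 from above:

```julia
set1 = Set([9, 4, 8, 2, 7, 8])

4 in set1                      # true
issubset(Set([2, 4]), set1)    # true
push!(set1, 4)                 # no effect: 4 is already in the set
push!(set1, 1)                 # adds a new element
length(set1)                   # 6
```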

Arrays

Vectors

One-dimensional arrays in Julia are called vectors.

Vectors of one element

[3]
1-element Vector{Int64}:
 3
[3.4]
1-element Vector{Float64}:
 3.4
["Hello, World!"]
1-element Vector{String}:
 "Hello, World!"

Vectors of multiple elements

[3, 4]
2-element Vector{Int64}:
 3
 4

Two-dimensional arrays

[3 4]
1×2 Matrix{Int64}:
 3  4
[[1, 3] [1, 2]]
2×2 Matrix{Int64}:
 1  1
 3  2

Syntax subtleties

These 3 syntaxes are equivalent:

[2 4 8]
1×3 Matrix{Int64}:
 2  4  8
hcat(2, 4, 8)
1×3 Matrix{Int64}:
 2  4  8
cat(2, 4, 8, dims=2)
1×3 Matrix{Int64}:
 2  4  8

These 4 syntaxes are equivalent:

[2
 4
 8]
3-element Vector{Int64}:
 2
 4
 8
[2; 4; 8]
3-element Vector{Int64}:
 2
 4
 8
vcat(2, 4, 8)
3-element Vector{Int64}:
 2
 4
 8
cat(2, 4, 8, dims=1)
3-element Vector{Int64}:
 2
 4
 8

Elements separated by semicolons or line breaks are concatenated vertically.
Elements separated by commas are not concatenated: each one becomes an element of a vector.
Elements separated by spaces or tabs are concatenated horizontally.
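
The rules above can be combined in a single literal. A small sketch (the variable name m is illustrative):

```julia
# Spaces build each row, the semicolon stacks the rows vertically
m = [1 2 3; 4 5 6]

size(m)                               # (2, 3)
m == vcat([1 2 3], [4 5 6])           # true: stacking two 1×3 rows
m == hcat([1, 4], [2, 5], [3, 6])     # true: joining three columns side by side
```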

Your turn:

Compare the outputs of the following:

[1:2; 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2
 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2, 3:4]
2-element Vector{UnitRange{Int64}}:
 1:2
 3:4
[1:2 3:4]
2×2 Matrix{Int64}:
 1  3
 2  4

Arrays and types

In Julia, arrays can be heterogeneous:

[3, "hello"]
2-element Vector{Any}:
 3
  "hello"

This is possible because every type in Julia is a subtype of Any. When the elements have no more specific type in common, Julia falls back to an array with element type Any.
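
To see how the element type is chosen, and how to set it explicitly, here is a short sketch (the names v and w are illustrative):

```julia
eltype([3, "hello"])    # Any
eltype([1, 2.5])        # Float64: the integer is promoted to a float

v = Float64[1, 2, 3]    # a type prefix fixes the element type
eltype(v)               # Float64

w = Any[1, 2]           # explicitly request a Vector{Any}
push!(w, "three")       # allowed, since a Vector{Any} accepts values of any type
```

Arrays with a concrete element type such as Float64 are generally more efficient than arrays of Any, so heterogeneous arrays are best kept for cases that need them.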

Initializing arrays

Below are examples of some of the functions that initialize arrays:

rand(2, 3, 4)
2×3×4 Array{Float64, 3}:
[:, :, 1] =
 0.70497   0.452224  0.210217
 0.152121  0.808499  0.748643

[:, :, 2] =
 0.964218  0.533504  0.295138
 0.530122  0.705078  0.448783

[:, :, 3] =
 0.101024  0.702216  0.351094
 0.451474  0.643441  0.193529

[:, :, 4] =
 0.365804  0.593161  0.213761
 0.908817  0.669264  0.160509
rand(Int64, 2, 3, 4)
2×3×4 Array{Int64, 3}:
[:, :, 1] =
  -808940715765468093  -1584927078315600374  -5301199987516324173
 -6596392331988765638  -6192885842242193678   1889096344742778536

[:, :, 2] =
 -1263311441971715837   -398863679696473412    425946792632171343
 -8887634749817674030  -6532441838130849674  -5790650878322099032

[:, :, 3] =
 -3504949663976209361  -7056126120819890696   9014204101180695865
  5444915959299197671   7453311557699154449  -7332672815187269775

[:, :, 4] =
 -1027239353605623832  8546329529560148599  5006263260814316361
  3614836023227257818  -380255779183739001  9031894209972587885
zeros(Int64, 2, 5)
2×5 Matrix{Int64}:
 0  0  0  0  0
 0  0  0  0  0
ones(2, 5)
2×5 Matrix{Float64}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
reshape([1, 2, 4, 2], (2, 2))
2×2 Matrix{Int64}:
 1  4
 2  2
fill("test", (2, 2))
2×2 Matrix{String}:
 "test"  "test"
 "test"  "test"

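A few other ways to create arrays, as a sketch that complements the functions above (the names r, u, and s are illustrative):

```julia
collect(1:2:9)                        # materialize a range: [1, 3, 5, 7, 9]
r = collect(range(0, 1, length=5))    # 5 evenly spaced values from 0 to 1

u = Vector{Float64}(undef, 3)         # uninitialized: the contents are arbitrary
s = similar(zeros(Int64, 2, 5))       # same element type and size, uninitialized

falses(2, 2)                          # 2×2 BitMatrix filled with false
```

Uninitialized arrays are only useful when every element is overwritten before being read.
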
Broadcasting

To apply a function to each element of a collection rather than to the collection as a whole, Julia uses broadcasting.

a = [-3, 2, -5]
3-element Vector{Int64}:
 -3
  2
 -5
abs(a)
LoadError: MethodError: no method matching abs(::Vector{Int64})

This doesn’t work because the function abs is defined for single numbers, not for arrays.

By broadcasting abs, you apply it to each element of a:

broadcast(abs, a)
3-element Vector{Int64}:
 3
 2
 5

The dot notation is equivalent:

abs.(a)
3-element Vector{Int64}:
 3
 2
 5

The dot can also be applied to the pipe operator, to unary and binary operators, etc.

a .|> abs
3-element Vector{Int64}:
 3
 2
 5
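
Dotted operators and the @. macro follow the same idea. A brief sketch, reusing a from above (r is an illustrative name):

```julia
a = [-3, 2, -5]

a .+ 1                   # [-2, 3, -4]
a .* a                   # [9, 4, 25]
a .> 0                   # element-wise comparison: false, true, false
.!(a .> 0)               # dotted unary operator: true, false, true

# @. adds a dot to every call and operator in the expression
r = @. abs(a) * 2 + 1    # [7, 5, 11]
```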

Your turn:

Try to understand the difference between the following 2 expressions:

abs.(a) == a .|> abs
true
abs.(a) .== a .|> abs
3-element BitVector:
 1
 1
 1

Hint: 0/1 are short-form notations for false/true in arrays of Booleans.

Comprehensions

Julia has an array comprehension syntax similar to Python’s:

[ 3i + j for i=1:10, j=3 ]
10-element Vector{Int64}:
  6
  9
 12
 15
 18
 21
 24
 27
 30
 33
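
In the example above, j takes a single value, so the result is a vector. A sketch of a few other forms, with illustrative names (table, squares, total, lookup):

```julia
# Two ranges produce a matrix: rows follow i, columns follow j
table = [i * j for i in 1:3, j in 1:2]         # 3×2 Matrix: [1 2; 2 4; 3 6]

# A trailing if keeps only the elements that pass the test
squares = [i^2 for i in 1:10 if iseven(i)]     # [4, 16, 36, 64, 100]

# Without brackets, the same syntax gives a generator (no intermediate array)
total = sum(i^2 for i in 1:3)                  # 14

# Generators can also feed other constructors such as Dict
lookup = Dict(i => i^2 for i in 1:3)
```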

Indexing

As in other mathematically oriented languages such as R, Julia starts indexing at 1.

Indexing is done with square brackets:

a = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4
a[1, 1]
1
a[1, :]
2-element Vector{Int64}:
 1
 2
a[:, 1]
2-element Vector{Int64}:
 1
 3
# Here, we are indexing a tuple
(2, 4, 1.0, "test")[2]
4
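
Indices can also be ranges, the keyword end, vectors of positions, or Boolean masks. A short sketch on a vector (v is an illustrative name):

```julia
v = [10, 20, 30, 40, 50]

v[end]            # last element: 50
v[2:4]            # a range of positions: [20, 30, 40]
v[end-1:end]      # the last two elements: [40, 50]
v[[1, 5]]         # a vector of positions: [10, 50]
v[v .> 25]        # a Boolean mask: [30, 40, 50]
```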

Your turn:

Index the element on the 3rd row and 2nd column of b:

b = ["wrong" "wrong" "wrong"; "wrong" "wrong" "wrong"; "wrong" "you got it" "wrong"]
3×3 Matrix{String}:
 "wrong"  "wrong"       "wrong"
 "wrong"  "wrong"       "wrong"
 "wrong"  "you got it"  "wrong"

Your turn:

a = [1 2; 3 4]
a[1, 1]
a[1, :]

How can I get the second column?
How can I get the tuple (2, 4)? (Recall that a tuple is an immutable, ordered collection of elements.)

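As in Python, arrays are passed by sharing by default. Assigning an array to a second variable does not copy it, so a change made through one name is visible through the other:

a = [1, 2, 3];
b = a;
b[1] = 0;
a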
3-element Vector{Int64}:
 0
 2
 3

This avoids unwanted copies of arrays. When an independent array is needed, copy creates one.
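
The same sharing applies to function arguments. A minimal sketch, assuming a hypothetical helper zero_first! written only for this illustration:

```julia
# zero_first! is a hypothetical helper, not a Julia built-in
function zero_first!(v)
    v[1] = 0      # modifies the caller's array, since v is not a copy
    return v
end

a = [1, 2, 3]
b = a             # b refers to the same array as a
c = copy(a)       # c is an independent copy

zero_first!(a)

a                 # [0, 2, 3]
b                 # [0, 2, 3]: the change is visible through b
c                 # [1, 2, 3]: the copy is unaffected
b === a           # true: same object
c === a           # false
```

By convention, Julia functions that modify their arguments have names ending in !, as in push! and delete!.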