Collections

Author

Marie-Hélène Burle

Values can be stored in collections. This workshop introduces tuples, dictionaries, sets, and arrays in Julia.

Tuples

Tuples are immutable, indexable, and possibly heterogeneous collections of elements. The order of elements matters.

# Possibly heterogeneous (values can be of different types)
typeof((2, 'a', 1.0, "test"))
Tuple{Int64, Char, Float64, String}
# Indexable (note that indexing in Julia starts at 1)
x = (2, 'a', 1.0, "test");
x[3]
1.0
# Immutable (they cannot be modified)
# So this returns an error
x[3] = 8
LoadError: MethodError: no method matching setindex!(::Tuple{Int64, Char, Float64, String}, ::Int64, ::Int64)
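
Because tuples are immutable, a "modified" tuple has to be built as a new value. The following is a minimal sketch of that idea, together with destructuring. The variable names (t, t2, n, c) are only illustrative.

```julia
t = (2, 'a', 1.0, "test")

# Build a new tuple that reuses the old elements except the third one
t2 = (t[1], t[2], 8, t[4])

# The original tuple is unchanged
t[3]     # 1.0
t2[3]    # 8

# Destructuring assigns the first elements of a tuple to variables
n, c = t
n        # 2
c        # 'a'

length(t)    # 4
```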

Named tuples

Tuples can have named components:

typeof((a=2, b='a', c=1.0, d="test"))
@NamedTuple{a::Int64, b::Char, c::Float64, d::String}
x = (a=2, b='a', c=1.0, d="test");
x.c
1.0
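
As a short sketch of what else can be done with named tuples, the example below accesses components in several ways and derives a new named tuple with merge. The names nt and nt2 are chosen for illustration only.

```julia
nt = (a=2, b='a', c=1.0, d="test")

# Components can be accessed by name, by symbol, or by position
nt.c       # 1.0
nt[:c]     # 1.0
nt[3]      # 1.0

# Names and values are available as plain tuples
keys(nt)      # (:a, :b, :c, :d)
values(nt)    # (2, 'a', 1.0, "test")

# Named tuples are immutable too: merge returns a new named tuple
nt2 = merge(nt, (c=8,))
nt2.c      # 8
```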

Dictionaries

Julia also has dictionaries, which are associative collections of key/value pairs:

x = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Roger"

"Name", "Age", and "Index" are the keys; "Roger", 52, and 0.3 are the values.

The => operator constructs a Pair, exactly as calling Pair does:

p = "foo" => 7
"foo" => 7
q = Pair("bar", 8)
"bar" => 8

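To make the link between pairs and dictionaries concrete, here is a small sketch. It assumes nothing beyond Base Julia, and the variable names are illustrative.

```julia
p = "foo" => 7

# A Pair stores its two components in the fields first and second
p.first     # "foo"
p.second    # 7

# Both forms build equal pairs
p == Pair("foo", 7)    # true

# A dictionary can be built from a collection of pairs
d = Dict([p, Pair("bar", 8)])
d["bar"]    # 8
```
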
Dictionaries can be heterogeneous (as in this example) and the order doesn’t matter. They are indexable by key:

x["Name"]
"Roger"

And mutable (they can be modified):

x["Name"] = "Alex";
x
Dict{String, Any} with 3 entries:
  "Index" => 0.3
  "Age"   => 52
  "Name"  => "Alex"

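A few other common dictionary operations are sketched below. The "City" key and its value are made up for this example and are not part of the workshop data.

```julia
d = Dict("Name"=>"Roger", "Age"=>52, "Index"=>0.3)

# Check whether a key exists before indexing (a missing key throws a KeyError)
haskey(d, "Age")     # true
haskey(d, "City")    # false

# get returns a default value when the key is missing
get(d, "City", "unknown")    # "unknown"

# Assigning to a new key adds an entry; delete! removes one
d["City"] = "Vancouver"
delete!(d, "Index")
length(d)    # 3

# Iterate over key/value pairs (the iteration order is not guaranteed)
for (k, v) in d
    println(k, " => ", v)
end
```
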
Sets

Sets are collections without duplicates. The order of elements doesn’t matter.

set1 = Set([9, 4, 8, 2, 7, 8])
Set{Int64} with 5 elements:
  4
  7
  2
  9
  8

Notice how this is a set of 5 (and not 6) elements: the duplicated 8 didn’t matter.

set2 = Set([10, 2, 3])
Set{Int64} with 3 elements:
  2
  10
  3

You can combine sets with the usual set operations:

# The union is the set of elements that are in one OR the other set
union(set1, set2)
Set{Int64} with 7 elements:
  4
  7
  2
  10
  9
  8
  3
# The intersection is the set of elements that are in both sets
intersect(set1, set2)
Set{Int64} with 1 element:
  2
# The set difference is the set of elements that are in the first set but not in the second
# Note that the order of the arguments matters here
setdiff(set1, set2)
Set{Int64} with 4 elements:
  4
  7
  9
  8

Sets can be heterogeneous:

Set(["test", 9, :a])
Set{Any} with 3 elements:
  :a
  "test"
  9
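
Beyond union, intersect, and setdiff, sets support membership tests and in-place updates. The sketch below shows a few of these operations from Base Julia; the variable name s is illustrative.

```julia
s = Set([9, 4, 8, 2, 7])

# Membership test
4 in s     # true
5 in s     # false

# push! adds an element; adding an existing element changes nothing
push!(s, 5)
push!(s, 5)
length(s)    # 6

# Subset test
issubset(Set([2, 4]), s)    # true

# Symmetric difference: elements that are in exactly one of the two sets
sd = symdiff(Set([1, 2]), Set([2, 3]))
```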

Arrays

Vectors

One-dimensional arrays in Julia are called vectors.

Vectors of one element

[3]
1-element Vector{Int64}:
 3
[3.4]
1-element Vector{Float64}:
 3.4
["Hello, World!"]
1-element Vector{String}:
 "Hello, World!"

Vectors of multiple elements

[3, 4]
2-element Vector{Int64}:
 3
 4

Two-dimensional arrays

[3 4]
1×2 Matrix{Int64}:
 3  4
[[1, 3] [1, 2]]
2×2 Matrix{Int64}:
 1  1
 3  2

Syntax subtleties

These 3 syntaxes are equivalent:

[2 4 8]
1×3 Matrix{Int64}:
 2  4  8
hcat(2, 4, 8)
1×3 Matrix{Int64}:
 2  4  8
cat(2, 4, 8, dims=2)
1×3 Matrix{Int64}:
 2  4  8

These 4 syntaxes are equivalent:

[2
 4
 8]
3-element Vector{Int64}:
 2
 4
 8
[2; 4; 8]
3-element Vector{Int64}:
 2
 4
 8
vcat(2, 4, 8)
3-element Vector{Int64}:
 2
 4
 8
cat(2, 4, 8, dims=1)
3-element Vector{Int64}:
 2
 4
 8

Elements separated by semicolons or newlines are concatenated vertically.
Elements separated by commas are not concatenated: each one becomes an element of a vector.
Elements separated by spaces or tabs are concatenated horizontally.
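
One way to check these rules is to look at the size of the resulting arrays. The following is a minimal sketch; the name m is illustrative.

```julia
# Spaces build a 1×3 matrix (two dimensions)
size([2 4 8])      # (1, 3)

# Semicolons build a 3-element vector (one dimension)
size([2; 4; 8])    # (3,)

# Commas also build a vector, without concatenating the elements
size([2, 4, 8])    # (3,)

# Spaces and semicolons can be combined to write a matrix row by row
m = [1 2; 3 4]
size(m)            # (2, 2)

# A row matrix and a vector with the same values are not equal
[2 4 8] == [2, 4, 8]    # false
```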

Your turn:

Compare the outputs of the following:

[1:2; 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2
 3:4]
4-element Vector{Int64}:
 1
 2
 3
 4
[1:2, 3:4]
2-element Vector{UnitRange{Int64}}:
 1:2
 3:4
[1:2 3:4]
2×2 Matrix{Int64}:
 1  3
 2  4

Arrays and types

In Julia, arrays can be heterogeneous:

[3, "hello"]
2-element Vector{Any}:
 3
  "hello"

This is possible because every type in Julia is a subtype of Any: when the elements have no more specific common type, the element type of the array is Any.
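
The element type of an array can be inspected with eltype. The sketch below shows how Julia picks that type and how it can be set explicitly; the variable names v and w are illustrative.

```julia
# Elements of different numeric types are promoted to a common type
eltype([1, 2.5])          # Float64

# Without a common concrete type, the element type falls back to Any
eltype([3, "hello"])      # Any

# The element type can be set explicitly in front of the brackets
v = Float64[1, 2]
eltype(v)                 # Float64

w = Any[1, 2]
eltype(w)                 # Any

# Every type is a subtype of Any
Int64 <: Any              # true
String <: Any             # true
```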

Initializing arrays

Below are examples of some of the functions that initialize arrays:

rand(2, 3, 4)
2×3×4 Array{Float64, 3}:
[:, :, 1] =
 0.676981  0.00192985  0.460161
 0.579801  0.0571031   0.19217

[:, :, 2] =
 0.709137  0.355586  0.717515
 0.334442  0.768498  0.93754

[:, :, 3] =
 0.110039  0.468733  0.764542
 0.708841  0.418923  0.102156

[:, :, 4] =
 0.1735    0.92587   0.822419
 0.122246  0.749059  0.52207
rand(Int64, 2, 3, 4)
2×3×4 Array{Int64, 3}:
[:, :, 1] =
 7539227344717627596  -3475288731017273925  3435963957489459227
 7076298999511187079  -4056353322580659761  -837262113887699001

[:, :, 2] =
 -7562670463192357073  -8927025020788172752   6417337822872556077
  3692648043801976038   3471765935100455283  -6192652305627405865

[:, :, 3] =
  7031151667336399214   5088860813990033390   5115764253454872856
 -6119814429800991191  -8609024759557032284  -1204795858698859213

[:, :, 4] =
 -2961801139235269924   5392657125003334687   1122551460163567839
  1768937120468164288  -3560940733318201168  -6602997138296377453
zeros(Int64, 2, 5)
2×5 Matrix{Int64}:
 0  0  0  0  0
 0  0  0  0  0
ones(2, 5)
2×5 Matrix{Float64}:
 1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  1.0  1.0
reshape([1, 2, 4, 2], (2, 2))
2×2 Matrix{Int64}:
 1  4
 2  2
fill("test", (2, 2))
2×2 Matrix{String}:
 "test"  "test"
 "test"  "test"

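The reshape output above follows from the fact that Julia stores arrays in column-major order. The sketch below illustrates this, along with a few other initializers from Base Julia. The variable names are illustrative.

```julia
# reshape fills the new array column by column (column-major order)
m = reshape(1:6, 2, 3)
m[2, 1]    # 2
m[1, 2]    # 3

# collect turns a range into a vector
r = collect(1:2:9)    # [1, 3, 5, 7, 9]

# similar allocates an uninitialized array with the same shape and element type
z = zeros(Int64, 2, 5)
s = similar(z)
size(s)      # (2, 5)
eltype(s)    # Int64

# trues and falses create Boolean arrays
t = trues(2, 2)
```
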
Broadcasting

To apply a function to each element of a collection rather than to the collection as a whole, Julia uses broadcasting.

a = [-3, 2, -5]
3-element Vector{Int64}:
 -3
  2
 -5
abs(a)
LoadError: MethodError: no method matching abs(::Vector{Int64})

This doesn’t work because abs is defined for numbers, not for arrays.

By broadcasting abs, you apply it to each element of a:

broadcast(abs, a)
3-element Vector{Int64}:
 3
 2
 5

The dot notation is equivalent:

abs.(a)
3-element Vector{Int64}:
 3
 2
 5

The dot can also be applied to the pipe operator, to unary and binary operators, etc.

a .|> abs
3-element Vector{Int64}:
 3
 2
 5
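
The following sketch shows the dot applied to binary and unary operators, as well as the @. macro from Base Julia, which dots every call and operator in an expression.

```julia
a = [-3, 2, -5]

# Binary operators are broadcast by prefixing them with a dot
a .+ 1     # [-2, 3, -4]
a .^ 2     # [9, 4, 25]
a .> 0     # Boolean values: false, true, false

# Unary operators can be dotted as well
.!(a .> 0)    # true, false, true

# Nested dot calls are fused into a single loop
r = sqrt.(abs.(a) .+ 1)    # square roots of 4, 3, and 6

# The @. macro adds a dot to every call and operator in an expression
@. abs(a) + 1    # [4, 3, 6]
```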

Your turn:

Try to understand the difference between the following 2 expressions:

abs.(a) == a .|> abs
true
abs.(a) .== a .|> abs
3-element BitVector:
 1
 1
 1

Hint: 0/1 are short-form notations for false/true in arrays of Booleans.

Comprehensions

Julia has an array comprehension syntax similar to Python’s:

[ 3i + j for i=1:10, j=3 ]
10-element Vector{Int64}:
  6
  9
 12
 15
 18
 21
 24
 27
 30
 33
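
Comprehensions can also iterate over several ranges, filter elements, or be used without brackets as generators. A minimal sketch:

```julia
# Two ranges produce a matrix: i indexes rows, j indexes columns
m = [i * j for i=1:3, j=1:2]
size(m)    # (3, 2)

# A condition filters the elements (the result is always a vector)
sq = [x^2 for x in 1:10 if iseven(x)]    # [4, 16, 36, 64, 100]

# Without the brackets, the same syntax creates a lazy generator
total = sum(x^2 for x in 1:3)    # 14
```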

Indexing

As in other mathematically oriented languages such as R, Julia starts indexing at 1.

Indexing is done with square brackets:

a = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4
a[1, 1]
1
a[1, :]
2-element Vector{Int64}:
 1
 2
a[:, 1]
2-element Vector{Int64}:
 1
 3
# Here, we are indexing a tuple
(2, 4, 1.0, "test")[2]
4
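
A few more indexing forms are sketched below: the end keyword, linear indexing, ranges, and Boolean indexing. The vector v is made up for this example.

```julia
a = [1 2; 3 4]

# end stands for the last index along a dimension
a[end, 1]      # 3
a[end, end]    # 4

# A single index walks through the matrix column by column
a[2]    # 3
a[3]    # 2

# Ranges select several elements at once
v = [10, 20, 30, 40]
v[2:3]          # [20, 30]
v[end-1:end]    # [30, 40]

# A Boolean array selects the elements where it is true
v[v .> 25]      # [30, 40]
```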

Your turn:

Index the element on the 3rd row and 2nd column of b:

b = ["wrong" "wrong" "wrong"; "wrong" "wrong" "wrong"; "wrong" "you got it" "wrong"]
3×3 Matrix{String}:
 "wrong"  "wrong"       "wrong"
 "wrong"  "wrong"       "wrong"
 "wrong"  "you got it"  "wrong"

Your turn:

a = [1 2; 3 4]
a[1, 1]
a[1, :]

How can I get the second column?
How can I get the tuple (2, 4)? (a tuple is an immutable, ordered collection of elements)

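Arrays are mutable: their elements can be reassigned in place. As in Python, arrays are also passed by sharing by default: assigning an array to another variable or passing it to a function does not copy it.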

a = [1, 2, 3];
a[1] = 0;
a
3-element Vector{Int64}:
 0
 2
 3

Passing by sharing avoids unwanted copies of arrays. Call copy when an independent array is needed.
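
To see sharing in action, the sketch below binds a second name to the same array, then contrasts it with copy. The function set_last! is a hypothetical helper written only for this example.

```julia
a = [1, 2, 3]

# b is another name for the same array, not a copy
b = a
b[1] = 0
a[1]       # 0
a === b    # true

# copy creates an independent array
c = copy(a)
c[2] = 100
a[2]       # 2

# A function that mutates its argument changes the caller's array
# (set_last! is a hypothetical helper written for this example)
function set_last!(v, value)
    v[end] = value
    return v
end

set_last!(a, 9)
a    # [0, 2, 9]
```

By convention, Julia functions that modify their arguments have names ending in !.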