# Collections

Author

Marie-Hélène Burle

Values can be stored in collections. This section introduces lists, strings, arrays, tuples, sets, and dictionaries in Python.

## Lists

Lists are declared in square brackets:

``````l = [2, 1, 3]
l``````
``[2, 1, 3]``
``type(l)``
``list``

They are mutable:

``````l.append(0)
l``````
``[2, 1, 3, 0]``

Lists are ordered:

``['b', 'a'] == ['a', 'b']``
``False``

They can have repeated values:

``['a', 'a', 'a', 't']``
``['a', 'a', 'a', 't']``

Lists can be homogeneous:

``['b', 'a', 'x', 'e']``
``['b', 'a', 'x', 'e']``
``type('b') == type('a') == type('x') == type('e')``
``True``

or heterogeneous:

``[3, 'some string', 2.9, 'z']``
``[3, 'some string', 2.9, 'z']``
``type(3) == type('some string') == type(2.9) == type('z')``
``False``

They can even be nested:

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``
``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``

The length of a list is the number of items it contains and can be obtained with the function `len`:

``len([3, ['b', 'e', 3.9, ['some string', 9.9]], 8])``
``3``

To extract an item from a list, you index it:

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][0]``
``3``

Python starts indexing at `0`, so what we tend to think of as the “first” element of a list is for Python the “zeroth” element.

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][1]``
``['b', 'e', 3.9, ['some string', 9.9]]``
``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][2]``
``8``
``````# Of course you can't extract items that don't exist
[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][3]``````
``IndexError: list index out of range``

You can index from the end of the list with negative values (here you start at `-1` for the last element):

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][-1]``
``8``

How could you extract the string `'some string'` from the list `[3, ['b', 'e', 3.9, ['some string', 9.9]], 8]`?

You can also slice a list:

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8][0:1]``
``[3]``

Notice how slicing returns a list.

Notice also how the left index is included but the right index excluded.

If you omit the first index, the slice starts at the beginning of the list:

``[1, 2, 3, 4, 5, 6, 7, 8, 9][:6]``
``[1, 2, 3, 4, 5, 6]``

If you omit the second index, the slice goes to the end of the list:

``[1, 2, 3, 4, 5, 6, 7, 8, 9][6:]``
``[7, 8, 9]``

When slicing, you can specify the stride:

``[1, 2, 3, 4, 5, 6, 7, 8, 9][2:7:2]``
``[3, 5, 7]``

The default stride is `1`:

``[1, 2, 3, 4, 5, 6, 7, 8, 9][2:7] == [1, 2, 3, 4, 5, 6, 7, 8, 9][2:7:1]``
``True``

You can reverse the order of a list with a `-1` stride applied on the whole list:

``[1, 2, 3, 4, 5, 6, 7, 8, 9][::-1]``
``[9, 8, 7, 6, 5, 4, 3, 2, 1]``

You can test whether an item is in a list:

``3 in [3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``
``True``
``9 in [3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``
``False``

or not in a list:

``3 not in [3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``
``False``

You can get the index (position) of an item inside a list:

``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8].index(3)``
``0``

Note that this only returns the index of the first occurrence:

``[3, 3, ['b', 'e', 3.9, ['some string', 9.9]], 8].index(3)``
``0``

Lists are mutable (they can be modified). For instance, you can replace items in a list by other items:

``````L = [3, ['b', 'e', 3.9, ['some string', 9.9]], 8]
L``````
``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8]``
``````L[1] = 2
L``````
``[3, 2, 8]``

You can delete items from a list using their indices with `list.pop`:

``````L.pop(2)
L``````
``[3, 2]``

Here, because we are using `list.pop`, `2` represents the index (the 3rd item).

or with `del`:

``````del L[0]
L``````
``[2]``

Notice how a list can have a single item:

``len(L)``
``1``

It is then called a “singleton list”.

You can also delete items from a list using their values with `list.remove`:

``````L.remove(2)
L``````
``[]``

Here, because we are using `list.remove`, `2` is the value `2`.
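
The three ways of deleting items differ in what they take and what they return. The following sketch compares them side by side on a hypothetical list `nums` (not one of the lists used above):

```python
# Hypothetical list used only for this comparison
nums = [10, 20, 30, 20, 40]

# pop removes by index and returns the removed item
popped = nums.pop(0)          # nums is now [20, 30, 20, 40]

# remove deletes the first matching value and returns None
nums.remove(20)               # nums is now [30, 20, 40]

# del removes by index (or slice) and returns nothing
del nums[-1]                  # nums is now [30, 20]

# remove raises ValueError when the value is absent
missing = False
try:
    nums.remove(99)
except ValueError:
    missing = True

print(popped, nums, missing)
```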

Notice how a list can even be empty:

``len(L)``
``0``

You can actually initialise empty lists:

``````M = []
type(M)``````
``list``

You can add items to a list. One at a time:

``````L.append(7)
L``````
``[7]``

And if you want to add multiple items at once?

``````# This doesn't work...
L.append(3, 6, 9)``````
``TypeError: list.append() takes exactly one argument (3 given)``
``````# This doesn't work either (that's not what we wanted)
L.append([3, 6, 9])
L``````
``[7, [3, 6, 9]]``

Fix this mistake we just made and remove the nested list `[3, 6, 9]`.

One option is:

``del L[1]``

To add multiple values to a list (and not a nested list), you need to use `list.extend`:

``````L.extend([3, 6, 9])
L``````
``[7, 3, 6, 9]``

If you don’t want to add an item at the end of a list, you can use `list.insert(<index>, <object>)`:

``````L.insert(3, 'test')
L``````
``[7, 3, 6, 'test', 9]``

Let’s have the following list:

``L = [7, [3, 6, 9], 3, 'test', 6, 9]``

Insert the string `'nested'` in the zeroth position of the nested list `[3, 6, 9]` in `L`.

You can sort a homogeneous list:

``````L = [3, 9, 10, 0]
L.sort()
L``````
``[0, 3, 9, 10]``
``````L = ['some string', 'b', 'a']
L.sort()
L``````
``['a', 'b', 'some string']``

Lists whose items cannot be compared with each other (as is often the case for heterogeneous lists) cannot be sorted:

``````L = [3, ['b', 'e', 3.9, ['some string', 9.9]], 8]
L.sort()``````
``TypeError: '<' not supported between instances of 'list' and 'int'``
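
Note that `list.sort` modifies the list in place and returns `None`. As a complement, the built-in `sorted` returns a new list, and both accept `reverse` and `key` arguments. The sketch below uses illustrative variable names; sorting a mixed list by the string form of its items is only one possible workaround, not a general recommendation:

```python
values = [3, 9, 10, 0]

# sorted returns a new list and leaves the original untouched
ascending = sorted(values)
descending = sorted(values, reverse=True)

# A key function controls what is compared; here, the length of each string
words = ['some string', 'b', 'aa']
by_length = sorted(words, key=len)

# Comparing the string form of each item makes a mixed list sortable
mixed = [3, 'a', 2.5]
as_text = sorted(mixed, key=str)

print(values, ascending, descending, by_length, as_text)
```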

You can also get the min and max value of homogeneous lists:

``min([3, 9, 10, 0])``
``0``
``max(['some string', 'b', 'a'])``
``'some string'``

For heterogeneous lists, this also doesn’t work:

``min([3, ['b', 'e', 3.9, ['some string', 9.9]], 8])``
``TypeError: '<' not supported between instances of 'list' and 'int'``

Lists can be concatenated with `+`:

``L + [3, 6, 9]``
``[3, ['b', 'e', 3.9, ['some string', 9.9]], 8, 3, 6, 9]``

or repeated with `*`:

``L * 3``
``````[3,
['b', 'e', 3.9, ['some string', 9.9]],
8,
3,
['b', 'e', 3.9, ['some string', 9.9]],
8,
3,
['b', 'e', 3.9, ['some string', 9.9]],
8]``````
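
One thing to keep in mind with `*`: when a list contains nested lists, repetition repeats references to the same inner list rather than making copies. The following sketch (with made-up names `row`, `grid`, and `independent`) illustrates the effect and one common way around it:

```python
# Repeating a list that contains a nested list repeats references, not copies
row = [0, 0]
grid = [row] * 3

# All three entries are the same object, so one change shows up everywhere
grid[0][0] = 1

# A list comprehension builds independent inner lists instead
independent = [[0, 0] for _ in range(3)]
independent[0][0] = 1

print(grid)         # [[1, 0], [1, 0], [1, 0]]
print(independent)  # [[1, 0], [0, 0], [0, 0]]
```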

To sum up, lists are declared in square brackets. They are mutable, ordered (thus indexable), and possibly heterogeneous collections of values.

## Strings

Strings behave (a little) like lists of characters in that they have a length (the number of characters):

``````S = 'This is a string.'
len(S)``````
``17``

They have a min and a max:

``min(S)``
``' '``
``max(S)``
``'t'``

You can index them:

``S[3]``
``'s'``

Slice them:

``S[10:16]``
``'string'``

Reverse the order of the string `S`.

They can also be concatenated with `+`:

``````T = 'This is another string.'
print(S + ' ' + T)``````
``This is a string. This is another string.``

or repeated with `*`:

``print(S * 3)``
``This is a string.This is a string.This is a string.``
``print((S + ' ') * 3)``
``This is a string. This is a string. This is a string. ``

This is where the similarities stop, however: strings are immutable, so methods such as `list.sort`, `list.append`, etc. will not work on them.
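
As a rough illustration of that difference, the sketch below shows that item assignment fails on a string, and how you can still get a sorted or modified version by building a new string. The variable names are illustrative:

```python
text = 'This is a string.'

# Strings are immutable: item assignment raises TypeError
assignment_failed = False
try:
    text[0] = 't'
except TypeError:
    assignment_failed = True

# sorted works on any iterable and returns a list of characters
letters = sorted('bca')             # ['a', 'b', 'c']

# str.join turns a list of characters back into a string
rebuilt = ''.join(letters)          # 'abc'

# A modified copy can be built with slicing and concatenation
lowered_first = 't' + text[1:]      # 'this is a string.'
```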

## Arrays

Python comes with a built-in array module. When you need arrays for storing and retrieving data, this module is perfectly suitable and extremely lightweight. This tutorial covers the syntax in detail.

Whenever you plan on performing calculations on your data however (which is the vast majority of cases), you should instead use the NumPy package, covered in another section.
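
To give an idea of what the built-in module looks like, here is a minimal sketch using `array.array` from the standard library. The type code `'i'` (signed integer) and the values are arbitrary choices for the example:

```python
from array import array

# 'i' is the type code for signed integers; all items must share that type
a = array('i', [2, 1, 3])

# Arrays support many list-like operations
a.append(0)
first = a[0]
as_list = a.tolist()

# Unlike lists, arrays reject values of another type
rejected = False
try:
    a.append('x')
except TypeError:
    rejected = True

print(first, as_list, rejected)
```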

## Tuples

Tuples are defined with parentheses:

``````t = (3, 1, 4, 2)
t``````
``(3, 1, 4, 2)``
``type(t)``
``tuple``

Tuples are ordered:

``(2, 3) == (3, 2)``
``False``

This means that they are indexable and sliceable:

``(2, 4, 6)[2]``
``6``
``(2, 4, 6)[::-1]``
``(6, 4, 2)``

They can be nested:

``type((3, 1, (0, 2)))``
``tuple``
``len((3, 1, (0, 2)))``
``3``
``max((3, 1, 2))``
``3``

They can be heterogeneous:

``type(('string', 2, True))``
``tuple``

You can create empty tuples:

``type(())``
``tuple``

You can also create singleton tuples, but the syntax is a bit odd:

``````# This is not a tuple...
type((1))``````
``int``
``````# This is the weird way to define a singleton tuple
type((1,))``````
``tuple``

However, the big difference with lists is that tuples are immutable:

``````T = (2, 5)
T[0] = 8``````
``TypeError: 'tuple' object does not support item assignment``

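Tuples also support packing and unpacking, which lets you assign several variables at once or swap their values without a temporary variable: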

``````a, b = 1, 2
a, b``````
``(1, 2)``
``````a, b = b, a
a, b``````
``(2, 1)``
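
The assignments above rely on tuple packing and unpacking. The same mechanism is what lets a function return several values, and a starred name can collect the remaining items. This is a small sketch; the function `min_max` is hypothetical and only serves the example:

```python
# A function that "returns several values" actually returns one tuple
def min_max(values):
    return min(values), max(values)

result = min_max([3, 9, 10, 0])      # (0, 10)
lowest, highest = result             # unpacking into two names

# A starred name collects the remaining items into a list
first, *rest = (1, 2, 3, 4)          # first is 1, rest is [2, 3, 4]

print(result, lowest, highest, first, rest)
```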

Tuples are declared in parentheses. They are immutable, ordered (thus indexable), and possibly heterogeneous collections of values.

## Sets

Sets are declared in curly braces:

``````s = {3, 2, 5}
s``````
``{2, 3, 5}``
``type(s)``
``set``

Sets are unordered:

``{2, 4, 1} == {4, 2, 1}``
``True``

Consequently, it makes no sense to index a set.

Sets can be heterogeneous:

``````S = {2, 'a', 'string'}
isinstance(S, set)``````
``True``
``type(2) == type('a') == type('string')``
``False``

There are no duplicates in a set:

``{2, 2, 'a', 2, 'string', 'a'}``
``{2, 'a', 'string'}``

You can define an empty set, but only with the `set` function (because empty curly braces define a dictionary):

``````t = set()
t``````
``set()``
``len(t)``
``0``
``type(t)``
``set``

Since strings are iterables, you can use `set` to get a set of the unique characters:

``set('abba')``
``{'a', 'b'}``

How could you create a set with the single element `'abba'` in it?

Sets are declared in curly braces. They are mutable, unordered (thus non-indexable), possibly heterogeneous collections of unique values.
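
The mutability of sets is not demonstrated above, so here is a short sketch of how items can be added or removed, along with the usual set operations. The sets `a` and `b` are made up for the illustration:

```python
s = {3, 2, 5}

# Sets are mutable: add and discard change them in place
s.add(7)
s.add(3)          # already present, so nothing changes
s.discard(2)      # no error even if the value is absent

# Classic set operations
a = {1, 2, 3}
b = {3, 4}
union = a | b                 # {1, 2, 3, 4}
intersection = a & b          # {3}
difference = a - b            # {1, 2}

# Membership tests work as for lists
has_five = 5 in s
```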

## Dictionaries

Dictionaries are declared in curly braces. They associate values to keys:

``````d = {'key1': 'value1', 'key2': 'value2'}
d``````
``{'key1': 'value1', 'key2': 'value2'}``
``type(d)``
``dict``

Dictionaries preserve insertion order (since Python 3.7), but order is ignored when comparing them:

``{'a': 1, 'b': 2} == {'b': 2, 'a': 1}``
``True``

The pairs cannot be indexed by position. However, you can access values in a dictionary from their keys, and view its items, values, and keys:

``````D = {'c': 1, 'a': 3, 'b': 2}
D['b']``````
``2``
``D.get('b')``
``2``
``D.items()``
``dict_items([('c', 1), ('a', 3), ('b', 2)])``
``D.values()``
``dict_values([1, 3, 2])``
``D.keys()``
``dict_keys(['c', 'a', 'b'])``
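
The two ways of accessing a value behave differently when the key is missing. The sketch below reuses the dictionary `D` from above; the key `'z'` and the default `0` are arbitrary choices for the example:

```python
D = {'c': 1, 'a': 3, 'b': 2}

# Indexing with a missing key raises KeyError
missing = False
try:
    D['z']
except KeyError:
    missing = True

# get returns None, or a default of your choice, instead of raising
nothing = D.get('z')
fallback = D.get('z', 0)

# Membership tests look at keys, not values
has_a = 'a' in D

# Iterating over items yields (key, value) pairs
pairs = [(key, value) for key, value in D.items()]
```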

To return a sorted list of keys:

``sorted(D)``
``['a', 'b', 'c']``

You can create empty dictionaries:

``````E = {}
type(E)``````
``dict``

Dictionaries are mutable, so you can add, remove, or replace items.

Let’s add an item to our empty dictionary `E`:

``````E['author'] = 'Proust'
E``````
``{'author': 'Proust'}``

``````E['title'] = 'In search of lost time'
E``````
``{'author': 'Proust', 'title': 'In search of lost time'}``

We can modify one:

``````E['author'] = 'Marcel Proust'
E``````
``{'author': 'Marcel Proust', 'title': 'In search of lost time'}``

Add a third item to E with the number of volumes.

We can also remove items:

``````E.pop('author')
E``````
``{'title': 'In search of lost time'}``

Another method to remove items:

``````del E['title']
E``````
``{}``

Dictionaries are declared in curly braces. They are mutable collections of key/value pairs that are accessed by key rather than by position. They play the role of an associative array.

## Conversion between collections

From tuple to list:

``list((3, 8, 1))``
``[3, 8, 1]``

From tuple to set:

``set((3, 2, 3, 3))``
``{2, 3}``

From list to tuple:

``tuple([3, 1, 4])``
``(3, 1, 4)``

From list to set:

``set(['a', 2, 4])``
``{2, 4, 'a'}``

From set to tuple:

``tuple({2, 3})``
``(2, 3)``

From set to list:

``list({2, 3})``
``[2, 3]``
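
Dictionaries can take part in conversions too. As a sketch (the variable names and values are illustrative), `dict` accepts any iterable of key/value pairs, and converting a dictionary to a list keeps only its keys unless you go through `items()`:

```python
# dict accepts an iterable of (key, value) pairs
pairs = [('a', 1), ('b', 2)]
from_pairs = dict(pairs)                 # {'a': 1, 'b': 2}

# zip pairs up two sequences, which is handy to build a dictionary
keys = ['x', 'y']
values = [10, 20]
from_zip = dict(zip(keys, values))       # {'x': 10, 'y': 20}

# Converting a dictionary to a list keeps only the keys
key_list = list(from_pairs)              # ['a', 'b']

# Use items() to keep the pairs
pair_list = list(from_pairs.items())     # [('a', 1), ('b', 2)]
```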

## Collections module

Python has a built-in collections module providing additional data structures: deque, defaultdict, namedtuple, OrderedDict, Counter, ChainMap, UserDict, UserList, and UserString.
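
To give a flavour of the module, here is a brief sketch of four of these structures. The names `groups`, `queue`, and `Point` are hypothetical and only serve the example:

```python
from collections import Counter, defaultdict, deque, namedtuple

# Counter tallies hashable items
counts = Counter('abba')

# defaultdict creates a default value the first time a key is accessed
groups = defaultdict(list)
groups['even'].append(2)
groups['odd'].append(3)

# deque supports fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
last = queue.pop()

# namedtuple gives tuple fields readable names
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

print(counts['a'], dict(groups), list(queue), last, p.x, p.y)
```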