A Better Way to Learn Python

Collections

Python has a list of Container Datatypes that make working with collections easy.

Most things in Python can be done with the built-in container types (dict, list, set, tuple, etc.). In this chapter, we will review the basic container types and then move on to the more specialized container types in the collections module.

Lists

A list in Python is an ordered collection of objects, also known as a sequence. Lists are indexed starting at 0 and can hold objects of any type. They are similar to strings in that you can access their items using negative indexes as well as slices (ranges). Lists are mutable and allow you to add, remove, or replace items.

# Creating an empty list
list1 = []
list2 = list()  # list() also accepts any iterable (a tuple, a string, a range, ...) and converts it into a list

# Creating a list
list1 = [10, 20, 30]

# Lists can store heterogeneous types
list2 = ['a', 5.5, 10]

A list can take any number of elements separated by commas, and each may belong to a different type: integer, float, string, etc. Lists will feel familiar if you have experience with other high-level languages.
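The indexing, slicing, and mutability described above can be sketched as follows. The `letters` list is made-up sample data, not part of the original tutorial.

```python
# Hypothetical sample data, used only for illustration.
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]        # indexes start at 0
last = letters[-1]        # negative indexes count from the end
middle = letters[1:4]     # a slice returns a new list (the end index is excluded)

letters[0] = 'z'          # replace an item in place
letters.append('f')       # add an item to the end
letters.remove('c')       # remove the first item equal to 'c'

print(first, last, middle)
# a e ['b', 'c', 'd']
print(letters)
# ['z', 'b', 'd', 'e', 'f']
```

Note that `middle` was taken before the list was modified, so it still holds the original values.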

Take some time to get familiar with Tools for Working with Lists

Learn more about arrays and collections and manipulation tools like bisect.
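As a small taste of what bisect offers, here is a minimal sketch that keeps a list sorted while inserting into it. The `scores` list is an assumed example; bisect only gives correct results when the list is already sorted.

```python
import bisect

# Hypothetical sample data; the list must already be sorted.
scores = [10, 20, 30, 40]

# Find the index at which 30 would be inserted to keep the list sorted.
position = bisect.bisect_left(scores, 30)

# Insert 25 at the correct position without re-sorting the whole list.
bisect.insort(scores, 25)

print(position)
# 2
print(scores)
# [10, 20, 25, 30, 40]
```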

List Methods

Get familiar with all the fun things we can do with a list. More on Lists

Review the list of methods that can be used with a list type.

Let's see these list methods in action:

>>> fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
>>> fruits.count('apple')
2
>>> fruits.count('tangerine')
0
>>> fruits.index('banana')
3
>>> fruits.index('banana', 4)  # Find the next banana, starting at position 4
6
>>> fruits.reverse()
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']
>>> fruits.append('grape')
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange', 'grape']
>>> fruits.sort()
>>> fruits
['apple', 'apple', 'banana', 'banana', 'grape', 'kiwi', 'orange', 'pear']
>>> fruits.pop()
'pear'
>>> del fruits[0]          # remove an item from the list given its index instead of its value
>>> fruits
['apple', 'banana', 'banana', 'grape', 'kiwi', 'orange']
>>> del fruits[2:4]        # remove a slice from the list 
>>> fruits
['apple', 'banana', 'kiwi', 'orange']
>>> del fruits[:]          # clear the entire list
>>> fruits
[]

List Comprehensions

Python Documentation on List Comprehensions

Comprehensions provide a concise way to create lists. Mastering them can be tricky because they introduce a new syntax to solve an old problem. Their purpose is to transform one list (or any iterable) into another list. A comprehension is essentially a filter followed by a map.

Let's take a look at a simple example. We will use a range here as well:

>>> compList = [n for n in range(10)]   # avoid naming the loop variable iter, which shadows the built-in iter()
>>> print(compList)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Let's bump up the complexity in this next example:

>>> listofColors = ['Red', 'Green', 'Blue', 'White', 'Black']
>>> firstLetters = [ color[0] for color in listofColors ]
>>> print(firstLetters)
['R', 'G', 'B', 'W', 'B']

You can combine several for clauses and add conditions to control what ends up in the resulting list:

>>> print([ x+y for x in 'foo' for y in 'bar' ])
['fb', 'fa', 'fr', 'ob', 'oa', 'or', 'ob', 'oa', 'or']
>>> print([ x+y for x in 'foo' for y in 'bar' if x != 'f' and y != 'a' ])
['ob', 'or', 'ob', 'or']

The key to using comprehensions is recognizing when the pattern applies. A good place to start is to look for loops that append new items to a list, with or without conditions:

numbers = [1, 2, 3, 4, 5, 6]

doubled_evens = []
for n in numbers:
    if n % 2 == 0:
        doubled_evens.append(n * 2)

print(doubled_evens)
# [4, 8, 12]

Using the comprehension pattern it turns into this:

numbers = [1, 2, 3, 4, 5, 6]
doubled_evens = [ n * 2 for n in numbers if n % 2 == 0]

print(doubled_evens)
# [4, 8, 12]
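The "filter followed by a map" description can be made concrete by writing the same transformation with the built-in filter() and map() functions. This is only a sketch for comparison; the name `doubled_evens_via_map` is introduced here for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Step 1 (filter): keep only the even numbers.
evens = filter(lambda n: n % 2 == 0, numbers)

# Step 2 (map): double each number that survived the filter.
doubled_evens_via_map = list(map(lambda n: n * 2, evens))

# The comprehension expresses both steps in a single expression.
doubled_evens = [n * 2 for n in numbers if n % 2 == 0]

print(doubled_evens_via_map == doubled_evens)
# True
```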

Comprehensions with Nested Loops

Flattening a nested structure (such as a list of lists) can be done using comprehensions as well. We can take this for loop that flattens a document into a list of words:

document = ['It was the best of times.', 'It was the worst of times.']

wordList = []
for row in document:
    for word in row.split():
        wordList.append(word)

print(wordList)
# ['It', 'was', 'the', 'best', 'of', 'times.', 'It', 'was', 'the', 'worst', 'of', 'times.']

And turn it into a list comprehension that does the same thing:

document = ['It was the best of times.', 'It was the worst of times.']
wordList = [ word for row in document for word in row.split() ]

print(wordList)
# ['It', 'was', 'the', 'best', 'of', 'times.', 'It', 'was', 'the', 'worst', 'of', 'times.']

Notice that the for clauses in the comprehension appear in the same top-down order as in the nested loop, not in reverse.

If you find a long one-line comprehension difficult to read, Python allows line breaks inside the brackets:

document = ['It was the best of times.', 'It was the worst of times.']
wordList = [
    word
    for row in document
    for word in row.split()
]

print(wordList)
# ['It', 'was', 'the', 'best', 'of', 'times.', 'It', 'was', 'the', 'worst', 'of', 'times.']

Dictionaries

Get familiar with Mapping Types with the Python Docs

Dictionaries are collections of key:value pairs, with the pairs separated by commas. A dictionary maps hashable keys to arbitrary objects, and dictionaries themselves are mutable. Values that are not hashable, such as lists, may not be used as keys.

There are several ways to create a dictionary:

# These are all equal to { "one": 1, "two": 2, "three": 3}
>>> a = dict(one=1, two=2, three=3)
>>> 
>>> b = {'one': 1, 'two': 2, 'three': 3}
>>> 
>>> c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
>>> 
>>> d = dict([('two', 2), ('one', 1), ('three', 3)])
>>> 
>>> e = dict({'three': 3, 'one': 1, 'two': 2})
>>> 
>>> a == b == c == d == e
True

It's important to remember that when you build a dictionary with dict() and the same key appears both in the positional argument and as a keyword argument, the value from the keyword argument replaces the value from the positional argument. As of Python 3.7, dictionaries are guaranteed to preserve insertion order (CPython 3.6 already did so as an implementation detail). Older versions didn't guarantee that the order would be maintained.
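Here is a minimal sketch of a keyword argument replacing a value that came from the positional argument. The `base` and `merged` names are made up for this example.

```python
# Hypothetical sample data.
base = {'one': 1, 'two': 2}

# 'two' is present in the positional argument and also given as a keyword,
# so the keyword value (22) wins. 'three' is simply added.
merged = dict(base, two=22, three=3)

print(merged)
# {'one': 1, 'two': 22, 'three': 3}
print(base)
# {'one': 1, 'two': 2}
```

The original `base` dictionary is left unchanged because dict() builds a new dictionary.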

ChainMap

If you need to link several dictionaries together, use a ChainMap.
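A common use is layering settings, as in this sketch. The `defaults` and `overrides` dictionaries are assumed examples; lookups search the mappings from left to right, and writes only affect the first mapping.

```python
from collections import ChainMap

# Hypothetical sample data.
defaults = {'color': 'blue', 'size': 'medium'}
overrides = {'color': 'red'}

# Earlier mappings take priority when a key appears in more than one.
settings = ChainMap(overrides, defaults)

print(settings['color'])
# red
print(settings['size'])
# medium

# Assignments go to the first mapping only; defaults stays untouched.
settings['size'] = 'large'
print(overrides)
# {'color': 'red', 'size': 'large'}
print(defaults)
# {'color': 'blue', 'size': 'medium'}
```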

Counter

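The Counter object works very much like a dictionary (it is a dict subclass). It's designed around keeping tallies. The key is the item to be counted and the value is the count. On top of the normal dictionary behavior, it adds tally-oriented features such as a default count of 0 for missing keys and the most_common() method.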

>>> from collections import Counter
>>> pitchCounts = Counter({'Noah': 120, 'Max': 200, 'Stephen': 160})
>>> pitchCounts['Noah']
120
>>> pitchCounts['Foo']     # Accessing a key that doesn't exist returns 0 instead of raising an error
0
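A Counter can also build the tally for you from any iterable. The sketch below uses a made-up list of pitch types to show counting and then updating the counts.

```python
from collections import Counter

# Hypothetical sample data.
pitches = ['fastball', 'curve', 'fastball', 'slider', 'fastball', 'curve']

# Passing an iterable counts how many times each item occurs.
tally = Counter(pitches)
print(tally)
# Counter({'fastball': 3, 'curve': 2, 'slider': 1})

# update() adds to the existing counts instead of replacing them.
tally.update(['slider', 'slider'])
print(tally['slider'])
# 3
print(tally['changeup'])   # never seen, so the count is 0
# 0
```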

most_common() is a very popular method. It returns a list of (element, count) tuples, ordered from the most common element to the least. If you pass an integer as the first argument, it will return that many results. By default, it returns the counts of all elements.

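>>> from collections import Counter
>>> from random import randrange
>>> numbers = [ randrange(10) for n in range(1000) ]          # Create a list of 1000 random numbers from 0-9
>>> Counter(numbers).most_common()                            # The data is random, so your counts will differ
[(1, 108), (9, 105), (8, 104), (4, 102), (5, 102), (6, 97), (2, 97), (3, 97), (7, 97), (0, 91)]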
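Because the example above uses random data, here is a small deterministic sketch of limiting the number of results. The `votes` list is assumed sample data.

```python
from collections import Counter

# Hypothetical sample data.
votes = ['red', 'blue', 'red', 'green', 'blue', 'red']
counts = Counter(votes)

top_two = counts.most_common(2)    # only the two most common elements
everything = counts.most_common()  # no argument: all elements

print(top_two)
# [('red', 3), ('blue', 2)]
print(everything)
# [('red', 3), ('blue', 2), ('green', 1)]
```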

Deque

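A deque, also known as a 'double-ended queue', is the better choice when you need to add or remove items at the start or end of a sequence. Use it rather than a dictionary or a list, since removing from the start of a list requires shifting every remaining item.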

from collections import deque

custQue = deque([f'customer{x}' for x in range(3)])
print(custQue)
# deque(['customer0', 'customer1', 'customer2'])

custQue.popleft()
# 'customer0'

custQue.pop()
# 'customer2'

print(custQue)
# deque(['customer1'])
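A deque is just as convenient for adding items at either end. The sketch below also shows the optional maxlen argument, which keeps only the most recent items; the names `queue` and `recent` are made up for this example.

```python
from collections import deque

queue = deque(['customer1'])
queue.append('customer2')      # add to the right end
queue.appendleft('vip')        # add to the left end
print(queue)
# deque(['vip', 'customer1', 'customer2'])

# A bounded deque silently drops the oldest item once it is full.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)
# deque([2, 3, 4], maxlen=3)
```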

Tuple

Review the small section on Immutable Sequence Types

A tuple is an immutable sequence of values separated by commas. Tuples are like lists, but with a few key differences. The major one is that you can't reassign an item in a tuple. Lists are written with brackets [] and tuples with parentheses (). Because they are immutable, tuples whose items are all hashable can be stored in set and frozenset instances.

foo = ('Test', 10)
print(foo[0])
# Test
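The following sketch shows what immutability means in practice: item assignment fails, but the tuple can be used as a set member or a dictionary key. The names `point`, `seen`, and `lookup` are assumptions made for this example.

```python
point = ('Test', 10)

# Tuples do not support item assignment.
try:
    point[0] = 'Changed'
except TypeError as error:
    print(type(error).__name__)
# TypeError

# A tuple of hashable items can be stored in a set ...
seen = {point, ('Other', 20)}
print(point in seen)
# True

# ... or used as a dictionary key.
lookup = {point: 'first'}
print(lookup[('Test', 10)])
# first
```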

collections.namedtuple()

Think of a namedtuple as a tuple whose fields also have names, somewhat like an immutable dictionary. They are great to use for simple tasks. With namedtuples, you don't have to use the integer index to access a member; you can use its field name instead.

>>> from collections import namedtuple
>>> sportsCar = namedtuple('Corvette', 'year make submodel')
>>> vette = sportsCar(year='2019', make='Chevrolet', submodel='ZR1')
>>> 
>>> print(vette)
Corvette(year='2019', make='Chevrolet', submodel='ZR1')
>>> print(vette.submodel)
ZR1

We can use the _make method which accepts an iterable and produces the namedtuple based on those values:

>>> from collections import namedtuple
>>> c = [30, 45]
>>> coordinates = namedtuple('Coordinates', ['x', 'y'])
>>> coordinates._make(c)
Coordinates(x=30, y=45)
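A few other namedtuple behaviors are worth a quick sketch: index access still works, fields are read-only, and the _replace and _asdict methods produce a modified copy and a dictionary view. The class name `Coordinates` mirrors the example above; the variables `point` and `moved` are made up here.

```python
from collections import namedtuple

Coordinates = namedtuple('Coordinates', ['x', 'y'])
point = Coordinates(x=30, y=45)

# Access by integer index and by field name are equivalent.
print(point[0], point.x)
# 30 30

# Fields cannot be reassigned, because a namedtuple is immutable.
try:
    point.x = 99
except AttributeError:
    print('read-only')
# read-only

# _replace returns a new namedtuple with the given fields changed.
moved = point._replace(x=99)
print(moved)
# Coordinates(x=99, y=45)

# _asdict converts the fields to a mapping (wrapped in dict() for a plain dict).
print(dict(point._asdict()))
# {'x': 30, 'y': 45}
```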

Sets

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and mathematical operations such as intersection, union, and difference. The set type is mutable, and its contents can be changed by using methods like add() and remove(). The frozenset type is immutable and hashable.
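The uses listed above can be sketched as follows. All of the names and values are made-up sample data, and sorted() is used when printing because sets have no defined order.

```python
# Removing duplicates and testing membership.
colors = ['red', 'blue', 'red', 'green', 'blue']
unique = set(colors)
print(sorted(unique))
# ['blue', 'green', 'red']
print('red' in unique)
# True

# Mathematical set operations.
a = {1, 2, 3, 4}
b = {3, 4, 5}
print(sorted(a & b))   # intersection
# [3, 4]
print(sorted(a | b))   # union
# [1, 2, 3, 4, 5]
print(sorted(a - b))   # difference
# [1, 2]

# Sets are mutable ...
a.add(10)
a.remove(1)
print(sorted(a))
# [2, 3, 4, 10]

# ... while a frozenset is immutable and hashable, so it can live inside another set.
groups = {frozenset(b)}
print(frozenset({3, 4, 5}) in groups)
# True
```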

Next, let's review some details about conditional statements and loops to add some definition to what you have seen in this tutorial so far.