The purpose of this notebook is to give me a quick memory refresh of some key Python concepts. It is more of a mindmap than a thorough documentation/tutorial of the Python language.

Language Notebook

Built-in functions

  • any(iterable): Return True if bool(x) is True for any x in the iterable.
  • all(iterable): Return True if bool(x) is True for all values x in the iterable.
  • dir([object]): If called without an argument, return the names in the current scope. Else, return an alphabetized list of names comprising (some of) the attributes of the given object, and of attributes reachable from it.
  • map(function, iterable): Return an iterator that applies function to every item of iterable, yielding the results.
  • filter(function, iterable): Return an iterator yielding those items of iterable for which function(item) is true.
  • zip(*iterables): Make an iterator that aggregates elements from each of the iterables
  • sum(iterable)
  • ord(char): Return the Unicode code point of a single character (equal to its ASCII code for ASCII characters)
  • chr(code): Inverse of ord()
  • Full list here
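
A short sketch of several of the built-ins above in action. The sample values are made up purely for illustration.

```python
nums = [3, 0, 7]

has_truthy = any(nums)    # True: 3 and 7 are truthy
all_truthy = all(nums)    # False: 0 is falsy

# map() and filter() return lazy iterators, so wrap them in list() to see the values
doubled = list(map(lambda x: x * 2, nums))        # [6, 0, 14]
non_zero = list(filter(lambda x: x != 0, nums))   # [3, 7]
pairs = list(zip("abc", nums))                    # [('a', 3), ('b', 0), ('c', 7)]

total = sum(nums)         # 10
code_point = ord("a")     # 97
letter = chr(code_point)  # 'a'
```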

Types and Data Structures

Sequence types
  • Mutable
    • list()
      • .sort() :: in-place sort
    • bytearray()
      • Mutable counterpart to bytes()
      • List of methods here
    • collections.deque()
      • A generalization of stacks and queues
      • Appends and pops from either side occur at O(1)
      • Though list() objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v)
  • Immutable
    • tuple()
      • Sequences typically used to store collections of heterogeneous data
      • Create using:
        • tuple() or tuple(iterable)
        • (), (1, 2), 1, 2, 3
        • 1, or (1,) for singleton tuple
    • range()
      • Sequence of numbers
    • str()
      • Sequence of Unicode code points
      • List of methods here
    • bytes()
      • Sequence of single bytes
    • collections.namedtuple()
      • Can be used wherever regular tuples are used and they add the ability to access fields by name instead of position index
      • Point = namedtuple('Point', ['x', 'y'])
      • p = Point(11, y=22)
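
A minimal sketch of the two collections types mentioned above, deque and namedtuple. The values are illustrative only.

```python
from collections import deque, namedtuple

# deque: appends and pops are O(1) at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)        # deque([0, 1, 2, 3])
queue.append(4)            # deque([0, 1, 2, 3, 4])
first = queue.popleft()    # 0
last = queue.pop()         # 4

# namedtuple: fields are accessible by name or by position
Point = namedtuple("Point", ["x", "y"])
p = Point(11, y=22)
total = p.x + p[1]         # 33
x, y = p                   # unpacks like a regular tuple
```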
Common Sequence Operations

They all support a number of operations:

  • x in s or x not in s
    • In str, bytes and bytearray sequences you can test for subsequences, like "gg" in "eggs"
  • s + t (concatenation)
  • s * n or n * s (concatenate n copies of s)
  • s[i]
  • s[i:j]
  • s[i:j:k]
  • len(s)
  • min(s)
  • max(s)
  • s.index(x) (index of first occurrence of x)
  • s.count(x) (total number of occurrences of x)
Mutable Sequence Operations
  • s[i] = x
  • s[i:j] = t :: t must be an iterable
  • s[i:j:k] = t :: t must be an iterable
  • s.append(x)
  • s.clear() :: same as del s[:]
  • s.copy() :: create shallow copy (same as s[:])
  • s.extend(t) or s += t :: extend s with contents of t
  • s *= n :: update s with its contents repeated n times
  • s.insert(i, x) :: insert x into s at index i
  • s.pop(i)
  • s.remove(x)
  • s.reverse() :: reverse in place
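
The slice-assignment forms are the least obvious of the operations above, so here is a small sketch on a list with made-up contents.

```python
s = [0, 1, 2, 3, 4, 5]

# Plain slice assignment may change the length of the list
s[1:3] = ["a", "b", "c"]
# s is now [0, 'a', 'b', 'c', 3, 4, 5]

# Extended slice assignment: t must have exactly as many items as the slice
s[::2] = [None] * 4
# s is now [None, 'a', None, 'c', None, 4, None]

shallow = s.copy()   # new list object holding the same elements
s.extend([8, 9])
popped = s.pop()     # 9: pop() without an index removes the last item
s.reverse()          # in place, returns None
```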

Set Types

Unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.

  • Mutable
    • set() :: can also be created with {1, 2}
      • add(item)
      • remove(item) :: removes item. raises KeyError if item is not contained
      • discard(item) :: remove item if present
      • pop() :: remove an arbitrary element. raises KeyError if set is empty
      • clear()
  • Immutable
    • frozenset()
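
A quick sketch of the set operations mentioned above, using the operator forms. The sets are arbitrary examples.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b            # {1, 2, 3, 4, 5}
intersection = a & b     # {3, 4}
difference = a - b       # {1, 2}
symmetric = a ^ b        # {1, 2, 5}

unique = set([1, 1, 2, 3, 3])   # duplicates removed: {1, 2, 3}

# frozenset is hashable, so it can be used as a dict key (a plain set cannot)
groups = {frozenset({1, 2}): "low", frozenset({3, 4}): "high"}
label = groups[frozenset({2, 1})]   # 'low': element order is irrelevant
```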

Mapping Types

A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects.

There is currently only one standard mapping type, the dictionary dict()

  • collections.ChainMap(*maps)
    • A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.
  • collections.Counter(iterable-or-mapping)
    • A Counter is a dict subclass for counting hashable objects.
    • It is a collection where elements are stored as dictionary keys and their counts are stored as dictionary values.
  • collections.OrderedDict()
    • Less relevant since Python 3.7, where dict() is guaranteed to preserve insertion order
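
A small sketch of ChainMap and Counter. The dictionaries `defaults` and `overrides` are hypothetical names chosen for the example.

```python
from collections import ChainMap, Counter

defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}

# Lookups search the maps from left to right; nothing is copied
settings = ChainMap(overrides, defaults)
user = settings["user"]      # 'admin' (found in overrides)
color = settings["color"]    # 'red' (falls through to defaults)

counts = Counter("abracadabra")
top_two = counts.most_common(2)   # [('a', 5), ('b', 2)]
missing = counts["z"]             # 0: missing keys count as zero
```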

List comprehensions


""" Simple example """
squares = [ x**2 for x in range(10) ]
print(squares)

""" for .. for .. if example """
combinations = [ (x,y) for x in [1,2,3] for y in [3,1,4] if x != y ]
print(combinations)

""" Nested comprehension example """

matrix = [
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
]

transpose = [[row[i] for row in matrix] for i in range(4)]
print(transpose)

""" Example with zip, which is simpler """

transp = list(zip(*matrix))
print(transp)

Enumeration, Iterators and Generators

enumerate()

The enumerate(iterable, start=0) function returns an iterator of (i, value) tuples, where i is a counter that begins at start and increases by one for each value of the iterable. Example:

>>> for i,v in enumerate(['a', 'b', 'c']):
...  print(i, v)
0 a
1 b
2 c

Iterator

  • An object representing a stream of data.
  • Must implement __next__()
  • Repeated calls to the iterator’s __next__() method (or passing the iterator object to the built-in next() function) return successive items in the stream. When no more data are available a StopIteration exception is raised.

Iterator objects are required to have an __iter__() method that returns the iterator object itself, so every iterator is also iterable.

Iterable

  • An object capable of returning its members one at a time. Examples are list, str, tuple, dict and file objects.
  • Iterable classes must implement __iter__(self) or __getitem__(self, key)
    • __iter__() should return a new iterator
    • For __getitem__(), it should:
      • accept integers and slice objects
      • Raise TypeError if key is of inappropriate type
      • Raise IndexError if key is of a value outside the set of indexes
  • Iterables can be used in a for loop and in places where a sequence is needed (zip(), map() etc.)
  • When an iterable object is passed as an argument to the built-in iter(), it returns an iterator for the object.
  • The iterator returned by iter() is good for one pass over the set of values.
  • When dealing with iterables you usually don’t have to call iter() yourself. For example, the for statement does that automatically for you, creating an unnamed variable to hold the iterator for the duration of the loop.
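
To make the protocol concrete, here is a sketch that drives an iterator by hand, which is roughly what a for loop does behind the scenes. The list contents are arbitrary.

```python
letters = ["a", "b"]

it = iter(letters)     # ask the iterable for an iterator
first = next(it)       # 'a'
second = next(it)      # 'b'

try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True   # the stream has no more items

# Iterators are iterable too: iter() on an iterator returns the same object
same_object = iter(it) is it

# Roughly equivalent to: for x in letters: collected.append(x)
collected = []
it = iter(letters)
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    collected.append(x)
```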

__iter__()

Called when an iterator is required for a container. It should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container.

Iterator implementation

import random

class Item:
    """Example collection container that provides an iterator"""

    def __init__(self, n):
        """ Create list of n random ints"""

        self.n = n
        self.items = [random.randint(0,n) for i in range(n)]

    def __iter__(self):
        return ItemIter(self.items)

    def loop(self):
        """ Generator-based iterator"""

        for index in range(len(self.items)):
            yield self.items[index]

class ItemIter:
    """ Iterator over the items of an 'Item' container; implements the Iterator API"""

    def __init__(self, items):
        self.items = items
        self.pos = 0

    def __next__(self):
        """ Implement iterator API method: __next__"""

        if self.pos >= len(self.items):
            raise StopIteration

        item = self.items[self.pos]
        self.pos += 1
        return item

    def __iter__(self):
        """ Implement iterator API method: __iter__"""

        return self

Generators

Generators are iterators, but you can iterate over them only once. The reason is that they do not store all the values in memory, they generate the values on the fly.

You use them by iterating over them, either with a for or by passing them to any function or construct that iterates

They are written like regular functions but use the yield or yield from statement whenever they want to return data. They can be created like:

def reverse(data):
  for i in range(len(data)-1, -1, -1):
    yield data[i]

for char in reverse('golf'):
  print(char)
  • yield from allows a generator to delegate part of its operations to another generator. For simple iterators, it essentially is a shortened form of for item in iterable: yield item (replaced with yield from iterable)

Anything that can be done with generators can also be done with class-based iterators. What makes generators so compact is that the __iter__() and __next__() methods are created automatically.
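
A minimal sketch of yield from delegation and of the single-pass behaviour described above. The generator names `countdown` and `countdown_then_up` are made up for this example.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

def countdown_then_up(n):
    """Delegate first to another generator, then to a plain iterable."""
    yield from countdown(n)
    yield from range(1, n + 1)

gen = countdown_then_up(3)
values = list(gen)        # [3, 2, 1, 1, 2, 3]
second_pass = list(gen)   # []: the generator is already exhausted
```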

Generator expressions

They are similar to list comprehensions, but with parentheses instead of square brackets. These expressions are designed for situations where the generator is used right away by an enclosing function

s1 = sum(i*i for i in range(10)) # generator, sum of squares
s2 = sum([ i*i for i in range(10) ]) # same with list comprehension
print(s1)
print(s2)

* and ** Operators

What is the * operator?

In a function call, the * operator unpacks the elements of a list or tuple into separate positional arguments.

For example, range() expects separate start and stop arguments, which can be supplied from a list:

args = [3,6]
print(list(range(*args)))

In a similar fashion to *, the ** operator delivers keyword arguments from a dictionary.

def add3(term1, term2, term3):
    return term1 + term2 + term3

d = {
        "term1": 1,
        "term2": 2,
        "term3": 6
}
print(add3(**d))

Lambda expressions

Small anonymous functions can be created with the lambda keyword. They can be used wherever function objects are required. They are syntactically restricted to a single expression.

def make_incrementor(n):
    return lambda x: x + n

f = make_incrementor(2)
print(f(10))
print(f(15))

Functional Programming Modules

itertools

Full list: https://docs.python.org/3/library/itertools.html

  • Infinite iterators
    • count()
      • Example: count(10) --> 10 11 12 13 14 ...
    • cycle()
      • Example: cycle('ABCD') --> A B C D A B C D ...
    • repeat()
      • Example: repeat(10, 3) --> 10 10 10
  • Iterators terminating on the shortest input sequence
    • accumulate()
      • Example: accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    • chain()
      • Example: chain('ABC', 'DEF') --> A B C D E F
    • chain.from_iterable()
      • Example: chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    • compress()
      • Example: compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    • zip_longest()
      • Example: zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
  • Combinatoric iterators
    • product()
      • Example: cartesian product, equivalent to a nested for-loop
    • permutations()
      • Example: r-length tuples, all possible orderings, no repeated elements
    • combinations()
      • Example: r-length tuples, in sorted order, no repeated elements
    • combinations_with_replacement()
      • Example: r-length tuples, in sorted order, with repeated elements
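
A few of the iterators above, materialised with list() so the results are visible. Note that islice() is not in the list above; it is another itertools function, used here only to cut the infinite count() short.

```python
from itertools import (accumulate, chain, combinations, count,
                       islice, permutations, product)

first_three = list(islice(count(10), 3))          # [10, 11, 12]
running_sum = list(accumulate([1, 2, 3, 4, 5]))   # [1, 3, 6, 10, 15]
joined = list(chain("AB", "CD"))                  # ['A', 'B', 'C', 'D']

# Cartesian product, like a nested for loop
pairs = list(product("AB", [1, 2]))
# [('A', 1), ('A', 2), ('B', 1), ('B', 2)]

# All orderings of length 2
orderings = list(permutations("ABC", 2))
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

# Order-insensitive selections of length 2
subsets = list(combinations("ABC", 2))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```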

functools

Full list: https://docs.python.org/3/library/functools.html

  • cached_property(fn)
  • cmp_to_key(fn)
  • lru_cache(fn)
  • reduce(fn, iterable[,initializer])
  • wraps() decorator
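
A brief sketch of lru_cache, cmp_to_key and reduce. The function names `fib` and `by_length` are hypothetical and exist only for this example.

```python
import functools

@functools.lru_cache(maxsize=None)
def fib(n):
    """Memoized Fibonacci: each n is computed only once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib_30 = fib(30)                 # 832040
hits = fib.cache_info().hits     # how many calls were answered from the cache

def by_length(a, b):
    """Old-style comparison function: negative, zero or positive result."""
    return len(a) - len(b)

# cmp_to_key adapts a comparison function for use as a sort key
words = sorted(["ccc", "a", "bb"], key=functools.cmp_to_key(by_length))
# ['a', 'bb', 'ccc']

# reduce folds the iterable into a single value; 10 is the initializer
total = functools.reduce(lambda acc, x: acc + x, [1, 2, 3], 10)   # 16
```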

Map, Reduce, Filter with lambdas

map() and filter() return iterators, so we convert their results to a list with list(). functools.reduce() returns a single accumulated value, so no conversion is needed.

from functools import reduce

nums = list(range(1,10))

squared = list(map(lambda x: x * x, nums))
filtered = list(filter(lambda x: x > 5, nums))
product = reduce(lambda x, y: x * y, nums)

Magic Methods

See https://rszalski.github.io/magicmethods/ for the full list

  • Construction and Initialization
    • __new__(cls, [...])
    • __init__(self, [...])
    • __del__(self)
  • Comparison
    • __eq__(self, other)
    • __ne__(self, other)
    • __lt__(self, other)
    • __gt__(self, other)
    • __le__(self, other)
    • __ge__(self, other)
  • Unary operators and functions
    • __pos__(self) (+some_object)
    • __neg__(self)
    • __abs__(self)
    • __invert__(self)
    • __round__(self, n)
    • __floor__(self)
  • Normal arithmetic operators
    • __add__(self, other)
    • __sub__(self, other)
    • __mul__(self, other)
    • __truediv__(self, other) (the / operator; this was __div__ in Python 2)
    • __floordiv__(self, other)
  • Reflected arithmetic operators
    • Same as the normal equivalents, except that they perform the operation with other as the first operand and self as the second, rather than the other way around.
    • For the reflected operators to be called, the object on the left-hand side of the operator (other in the example) must not define the non-reflected version of the operation, or its definition must return NotImplemented
    • __radd__(self, other)
    • __rsub__(self, other)
    • __rmul__(self, other)
    • __rtruediv__(self, other)
  • Type conversion magic methods
    • __int__(self)
    • __long__(self) (Python 2 only)
    • __float__(self)
  • Representing your Classes
    • __str__(self)
    • __repr__(self)
    • __format__(self)
    • __hash__(self)
    • __sizeof__(self)
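
A sketch of how a handful of these methods fit together, including a reflected operator. The `Vector` class is a hypothetical example, not part of any library.

```python
class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

    def __eq__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        return NotImplemented

    def __mul__(self, scalar):
        if isinstance(scalar, (int, float)):
            return Vector(self.x * scalar, self.y * scalar)
        return NotImplemented

    def __rmul__(self, scalar):
        # Reached for `2 * v` once int.__mul__ has returned NotImplemented
        return self * scalar

    def __neg__(self):
        return Vector(-self.x, -self.y)

v = Vector(1, 2)
summed = v + Vector(3, 4)   # Vector(4, 6)
scaled = 2 * v              # Vector(2, 4), via __rmul__
negated = -v                # Vector(-1, -2)
```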

Annotations

When annotating, assignment is optional.

  • Variable annotations are usually used for type hints: count: int = 0

  • Function annotations

def sum_two_numbers(a: int, b: int) -> int:
    return a + b

Function arguments

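There are two kinds of arguments:

  • Keyword arguments: preceded by an identifier (e.g. name=) in a function call, or passed as values in a dictionary preceded by **. Example:

  complex(real=3, imag=5)
  complex(**{'real': 3, 'imag': 5})
  • Positional arguments: any argument that is not a keyword argument. They can be passed directly or as elements of an iterable preceded by *. Example:

  complex(3, 5)
  complex(*[3, 5])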

Dictionary views

The objects returned from dict.keys(), dict.values(), and dict.items() are called dictionary views.

To force the dictionary view to become a full list use list(dictview)
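
One property worth remembering is that views are dynamic: they reflect later changes to the dictionary, whereas a list made from a view is a snapshot. A small sketch with a made-up dictionary:

```python
d = {"a": 1, "b": 2}

keys = d.keys()               # a live view, not a copy
before = list(keys)           # snapshot: ['a', 'b']

d["c"] = 3
after = list(keys)            # ['a', 'b', 'c']: the view saw the new key

# Keys views also support set operations
common = keys & {"a", "z"}    # {'a'}
```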

If statements

x = 0

if x > 0:
    pass
elif x < 0:
    pass
else:
    pass

Looping and mutating strategies

""" Strategy:  Iterate over a copy """
users = {}
for user, status in users.copy().items():
    if status == 'inactive':
        del users[user]

""" Strategy:  Create a new collection """
active_users = {}
for user, status in users.items():
    if status == 'active':
        active_users[user] = status

Exceptions

  • Errors detected during execution are called exceptions and are not unconditionally fatal
try:
    x = int('test')
except ValueError:
    print('Not a valid number')

""" Exception raising and defining """
class B(Exception):
    pass
class C(B):
    pass

class D(C):
    def __str__(self):
        return "Error def"

for cls in [B, C, D]:
    try:
        raise cls("Exception text")
    except D as err:
        print("D {0}".format(err))
    except C as err:
        print("C {0}".format(err))
    except B:
        print("B")
    finally:
        pass
  • Predefined clean up actions
    • Python calls this context management: the with statement guarantees that the file is closed once the block exits, even if an exception is raised
        with open("./list-comprehension.py") as f:
            for line in f:
                print(line, end='')
      

del statement

You can delete items from a list like so:

- del a[0]
- del a[2:4]

Sequences

These are the tuple, list and range data types

For tuples:

- Tuple packing: t = 1,"test", 123
- Tuple unpacking: a, b, c = t
    - Unpacking works with any sequence (list, range or tuple)

Sets

  • To create an empty set use set(), not {}, which will create a dictionary
  • 1 in {1,2,3}  # fast membership checking
  • Set comprehensions work
    • {x for x in 'abracadabra' if x not in 'abc'}
    • Note that {x: x for x in 'abracadabra'} creates a dict, not a set

Dictionaries

> tel = { 'sape': 4139, 'guido': 4127, 'jack': 4098 }
> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
> {x: x**2 for x in (2, 4, 6)}

Looping

for key, value in {'a': 1, 'b': 2}.items():
    print(key, value)

for i, v in enumerate(['a', 'b', 'c']):
    print(i, v)

Scopes and namespaces

def scope_test():
    def do_local():
        spam = "local spam"

    def do_nonlocal():
        nonlocal spam
        spam = "nonlocal spam"

    def do_global():
        global spam
        spam = "global spam"

    spam = "test spam"
    do_local()
    print("After local assignment:", spam)
    do_nonlocal()
    print("After nonlocal assignment:", spam)
    do_global()
    print("After global assignment:", spam)

scope_test()
print("In global scope:", spam)
The output of the example code is:

After local assignment: test spam
After nonlocal assignment: nonlocal spam
After global assignment: nonlocal spam
In global scope: global spam

I/O

import json
print(json.dumps({'kostas': 1, 'lekkas':2, 'age': 34}))

print("{:2.3}".format(1/3))
""" Use json.dump(x, f) to write to file """
""" Use x = json.load(f) to load from file """

"""
The following strategy iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty.
"""
import fileinput
for line in fileinput.input():
    process(line)

Coroutines

async def read_data(db):
    """ native coroutine """
    pass

async def read_data2(db):
    data = await db.fetch('SELECT ...')
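
A runnable sketch of awaiting a coroutine. `FakeDB` is a hypothetical stand-in for a real database client (no actual driver API is implied), and the query string is arbitrary.

```python
import asyncio

class FakeDB:
    """Hypothetical stand-in for an async database client."""

    async def fetch(self, query):
        await asyncio.sleep(0)      # hand control back to the event loop
        return [("row", query)]

async def read_rows(db):
    # await suspends this coroutine until fetch() completes
    data = await db.fetch("SELECT 1")
    return data

# asyncio.run() creates an event loop, runs the coroutine and closes the loop
result = asyncio.run(read_rows(FakeDB()))
```

Calling read_rows() without await or asyncio.run() only creates a coroutine object; nothing runs until it is scheduled.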

Decorators

  • The following example is from https://github.com/chiphuyen/python-is-cool/blob/master/cool-python-tips.ipynb

Defining a timeit decorator:

import functools
import time

def timeit(fn):
    # *args and **kwargs are to support positional and named arguments of fn
    @functools.wraps(fn)
    def get_time(*args, **kwargs):
        start = time.time()
        output = fn(*args, **kwargs)
        print(f"Time taken in {fn.__name__}: {time.time() - start:.7f}")
        return output  # make sure that the decorator returns the output of fn
    return get_time
  • functools.wraps() is a convenience function for invoking functools.update_wrapper() as a function decorator. It copies attributes of the original function onto the wrapper function, such as __name__, __module__, __annotations__, __qualname__ and __doc__. Without functools.wraps(), the decorated function loses its docstring and its name becomes that of the wrapper function.

Adding the decorator(s):


@functools.lru_cache()
def fib_helper(n):
    if n < 2:
        return n
    return fib_helper(n - 1) + fib_helper(n - 2)

@timeit
def fib(n):
    return fib_helper(n)
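
To see what functools.wraps() buys you, here is a side-by-side sketch of two otherwise identical decorators. The names `plain` and `preserving` are made up for this comparison.

```python
import functools

def plain(fn):
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

def preserving(fn):
    @functools.wraps(fn)          # copy __name__, __doc__, etc. from fn
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@plain
def greet():
    """Say hello."""
    return "hello"

@preserving
def greet_wrapped():
    """Say hello."""
    return "hello"

name_lost = greet.__name__           # 'wrapper'
doc_lost = greet.__doc__             # None
name_kept = greet_wrapped.__name__   # 'greet_wrapped'
doc_kept = greet_wrapped.__doc__     # 'Say hello.'
```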

Classes / OOP

We have the following conventions with regards to naming variables and methods:

  • _var: Hint that the method/variable is intended for internal use. Not enforced.
  • var_: Sometimes used when the most fitting name is already taken by a keyword
  • __var: The Python interpreter rewrites the attribute name (to _ClassName__var) in order to avoid naming conflicts in subclasses. This is called name mangling.
  • __var__: Perhaps surprisingly, names with both leading and trailing double underscores are not name-mangled. This convention is reserved for special use in the language; such methods are also called dunder (i.e. double underscore) methods.
  • _: Single underscore is sometimes used as a name to indicate that a variable/result is temporary or insignificant.
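
A small sketch of name mangling in practice. The classes `Base` and `Child` are hypothetical.

```python
class Base:
    def __init__(self):
        self._internal = 1     # convention only, still freely accessible
        self.__private = 2     # stored as _Base__private

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__private = 3     # stored as _Child__private, so no clash with Base

c = Child()
attrs = sorted(vars(c))
# ['_Base__private', '_Child__private', '_internal']

base_value = c._Base__private      # 2: mangling avoids clashes, it does not hide
child_value = c._Child__private    # 3
```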

Other notes:

  • The class Node(object): syntax is only needed in Python 2.x, in 3 the (object) part is the implicit default so it’s not needed.
  • isinstance(object, class) : Check if object is instance of class
  • issubclass(class, class_or_tuple): Check if class is subclass of another class

Class, instance and static methods

  • instance methods
    • Regular methods, defined like def method(self[,args])
    • Can access attributes and methods of the same object through self
    • Can even modify class state through self.__class__
  • class methods
    • Decorated with @classmethod
    • Defined like def clsmethod(cls[,args]). The cls parameter points to the class
    • They can’t modify object instance state, only class state
    • Can be used as object factories, e.g. return cls([args])
  • static methods
    • Decorated with @staticmethod
    • Defined like def method([args]), with no self or cls parameter
    • Can neither modify object state nor class state
    • Primarily a way to namespace methods
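
The three kinds of method side by side, as a sketch. The `Pizza` class and its members are hypothetical.

```python
import math

class Pizza:
    count = 0                            # class state, shared by all instances

    def __init__(self, toppings):
        self.toppings = list(toppings)   # instance state
        self.__class__.count += 1        # instance code can reach class state

    def add(self, topping):
        """Instance method: works on one object through self."""
        self.toppings.append(topping)
        return self

    @classmethod
    def margherita(cls):
        """Class method used as a factory: cls is the class itself."""
        return cls(["mozzarella", "tomato"])

    @staticmethod
    def area(radius):
        """Static method: no access to self or cls, just namespaced."""
        return math.pi * radius ** 2

p = Pizza.margherita().add("basil")
toppings = p.toppings        # ['mozzarella', 'tomato', 'basil']
made = Pizza.count           # 1
unit_area = Pizza.area(1)    # math.pi
```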

super()

  • Allows you to call methods of the superclass
    • e.g. in the subclass’s __init__, super().__init__([args])
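
A minimal sketch of super() in both __init__ and an overridden method. The `Animal` and `Dog` classes are hypothetical.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is an animal"

class Dog(Animal):
    def __init__(self, name, breed):
        super().__init__(name)      # let the superclass set up self.name
        self.breed = breed

    def describe(self):
        # Extend, rather than replace, the inherited behaviour
        return super().describe() + f" ({self.breed})"

d = Dog("Rex", "beagle")
text = d.describe()    # 'Rex is an animal (beagle)'
```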

setattr() and getattr()

These built-in functions set and get attributes of objects:

  • getattr(object, name[, default])
  • setattr(object, name, value)

Can be useful to minimize repetitions and perform update actions in bulk, like:

class Character:
    __slots__ = (
            "strength",
            "dexterity"
    )

    def __init__(self):
        for i in self.__slots__:
            setattr(self, i, 0)
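
A further sketch showing the default argument of getattr() and bulk assignment from a dictionary. The `Config` class and its attribute names are hypothetical.

```python
class Config:
    pass

cfg = Config()

# Bulk-assign attributes from a mapping
for name, value in {"host": "localhost", "port": 8080}.items():
    setattr(cfg, name, value)

port = getattr(cfg, "port")             # 8080
timeout = getattr(cfg, "timeout", 30)   # 30: the default avoids AttributeError
has_host = hasattr(cfg, "host")         # True
```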

Misc

  • floor division is available with //, like 11 // 4 == 2
  • import string for a collection of string constants, like string.ascii_lowercase
  • recordclass.recordclass is basically a mutable collections.namedtuple

Packages

  • A package is a collection of modules
  • An __init__.py file is required to make Python treat directories containing the file as packages. __init__.py can be empty, execute initialization code or set the __all__ variable.
    • __all__ is a list of the module names to import when from package import * is executed
  • Intra-package references:
    • from . import echo
    • from .. import formats
    • from ..filters import equalizer
  • setup.py
    • This file is the build script for setuptools. It tells setuptools about your package (such as the name and version) as well as which code files to include.
    • Mostly used to build and distribute a package

Third party modules

scipy

Scipy is an ecosystem with a few popular core packages:

  • NumPy
    • NumPy is the fundamental package for scientific computing with Python. It contains among other things:
      • a powerful N-dimensional array object
      • sophisticated (broadcasting) functions
      • tools for integrating C/C++ and Fortran code
      • useful linear algebra, Fourier transform, and random number capabilities
  • Core SciPy Library
    • Linear Algebra
      • scipy.linalg contains all the functions in numpy.linalg, plus some more advanced ones not contained in numpy.linalg.
    • Optimization
    • Integration
    • Signal processing & Fourier transforms
    • Graph routines
    • Statistics
    • etc.
  • Matplotlib: 2D plotting
  • SymPy: symbolic mathematics
  • pandas: Data structures & analysis
    • Feature engineering
  • IPython: interactive console, a core component of Jupyter

seaborn

  • Plotting library e.g. for barplots
  • What is the difference from matplotlib?

Seaborn is a Python visualization library based on matplotlib. It provides a high-level, dataset-oriented interface for creating attractive statistical graphics. The plotting functions in seaborn understand pandas objects and leverage pandas grouping operations internally to support concise specification of complex visualizations. Seaborn also goes beyond matplotlib and pandas with the option to perform statistical estimation while plotting, aggregating across observations and visualizing the fit of statistical models to emphasize patterns in a dataset.

ML

  • TensorFlow
  • PyTorch
  • Keras
  • Z3
  • OR-Tools
  • SciKit
  • Bokeh