APS4000 Introduction to Honours Computing

Introduction to Python

Basics

Introduction

Python is an object-oriented, interpreted scripting language. There is also a recent Nature article at http://www.nature.com/news/programming-pick-up-python-1.16833. For documentation and tutorials, I generally recommend the original online Python documentation at https://docs.python.org/3/. Of course, you can always “google” for specific recipes. Python allows you to work interactively, to write routines you can call, and also to write standalone applications. There are modules for almost anything. Python is used for applications ranging from running web servers to supercomputing. It also allows you to interface to other languages, e.g., FORTRAN (f2py), C/C++, R, ... You can use Python for scripted text processing, data analysis (numpy), and plotting/visualization (matplotlib). The latter two packages we will use in the second part of this course. Here we will first focus on a basic introduction to Python 3. This introduction is not comprehensive; it is just meant to give you an idea of the power of Python and of what you might be able to do, even if you have to look up the details later. It will not replace actually reading the manual and tutorial, which I highly recommend.

Python is object-oriented - in Python everything is an object. Everything. Object-oriented programming combines data and code, allowing data encapsulation, and includes inheritance and polymorphism.

In this crash course - and this will crash any person’s ability to absorb it all in just 3h - I will give an overview so that you have seen what you may be able to use in Python. It is meant to inspire you to model your research projects and ideas in Python, and to find the right data structures and organization for them, so that later you may remember having seen things that you could use. I do not expect that you remember everything by heart. I need to look up and try out things continuously myself.

Starting Python

In this course we want to use Python 3. At the time of this writing, the current version is Python 3.5.1. For simpler editing, we use IPython. The current version is 4.1.1. In the following the shell prompt (Bash) is displayed as

~>

To start IPython we use

~> ipython

you should then see a message like

Python 3.5.1 (default, Dec 29 2015, 23:29:32)
Type "copyright", "credits" or "license" for more information.
IPython 4.1.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]:

On some systems you may have to type

~> ipython3

Now you are ready to use Python. There are also fancier shells, e.g., the IPython notebook, which you would start using

~> ipython notebook

This should pop up a new tab in your web browser.[1](#install_notes) Then you press the New Notebook button on the top right.

Very useful later in this course - and indispensable for you later when developing python codes - will be to automatically reload modules when you modify them.

You can do this manually, every time you start IPython, by first loading the autoreload extension, then activating it to reload all loaded modules automatically when changed.

In [2]: %load_ext autoreload
In [3]: %autoreload 2

Instructions on how to set this up by default so it is done automatically every time you start IPython can be found at, e.g., https://www.reddit.com/r/Python/comments/rsfsi/tutorial_spend_30_seconds_setting_up/.

Executing commands

On the IPython / Python shell you type <enter> to execute commands. In the IPython notebook you have to type <shift>+<enter> to execute a cell; just <enter> starts a new line, allowing you to execute a block of commands at once. For the purpose of reproducibility in other Python shells, I usually do not use that here.


1: For me this required installation of pyzmq and jinja2.

In [ ]:
%load_ext autoreload
In [ ]:
%autoreload 2

Basic Data Types

Scalar data types

Integer

Just the number. Can be arbitrarily large. Must not contain ".". Example:

In [ ]:
2
In [ ]:
2000000000000000000000000000000000000000000000000000000000000000

Integer constants can also be specified in other bases, e.g.,

In [ ]:
0x32
In [ ]:
0o32
In [ ]:
0b101
Floating Point

Floating point number with finite precision. Use "." to separate the fractional part. Use "e" to separate the exponent. Example:

In [ ]:
2.
In [ ]:
-2.e23
In [ ]:
2.0001e23
In [ ]:
2.0000000000000000000000000000000001e23

Precision is system-dependent; typical is IEEE-754 8-byte binary floating point (about 15 decimal digits of precision, decimal exponent up to about $\pm300$). Classes for (internal) decimal representation exist as well.
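
A quick way to check these limits on your own machine is the sys.float_info structure of the standard library; the decimal module provides a decimal representation. The following is only a minimal sketch, and the variable names are chosen just for illustration:

```python
import sys
from decimal import Decimal

# Properties of the floating point type on this machine
digits = sys.float_info.dig      # decimal digits that are reliably preserved
largest = sys.float_info.max     # largest representable finite float
print(digits, largest)

# Binary floating point cannot represent 0.1 exactly ...
binary_equal = (0.1 + 0.2 == 0.3)
# ... whereas Decimal does its arithmetic in base 10
decimal_equal = (Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))
print(binary_equal, decimal_equal)
```

On a typical IEEE-754 system the first comparison is False and the second is True.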

Complex Numbers

Use "j" for imaginary part.

In [ ]:
3 -4j
Operators

The usual: +, -, *, /, % (modulo), ** (power). Special: // (floor division, i.e., the result is rounded down to a whole number). Combining an integer with a float will result in a float. Examples:

In [ ]:
7 // 3
In [ ]:
7 / 3
In [ ]:
7. // 3
In [ ]:
10**400 + 1.

There are also bit-wise binary operations on integers using & (and), | (or), ^ (xor), ~ (not), << (left shift), >> (right shift):

In [ ]:
7 & 3
In [ ]:
7 | 8
In [ ]:
7 ^ 3
In [ ]:
~3
In [ ]:
3 << 2
Truth and Logical Operations

The logical values are True and False. The logical operations are or, and, and not:

In [ ]:
True or False
In [ ]:
True and False
In [ ]:
not True
Comparison Operators

These include <, >, <=, >=, == (equality, in contrast to assignment), != (not equal), is (object identity), and is not (negated object identity).

In [ ]:
(1 > 3) or (3 == 4)
Nothing

... and there is the None object - we will use it later.

In [ ]:
None

... and the Ellipsis object

In [ ]:
Ellipsis
Strings

Sequence of characters enclosed by matching single or double quotation marks. Multi-line strings can be defined using triple quotation marks.

In [ ]:
'abc 123'
In [ ]:
"""abc
123"""

On a regular (I)Python(3) prompt you would have seen a line starting with ....: for continuation:

In [30]: """abc
....: 123"""

Here, the special characters like new line (\n) start with a \. You can add them manually:

In [ ]:
'abc\n123'

To input special characters into a string without interpretation, use a raw string, which has an r in front of the string:

In [ ]:
r'abc\n123'

In the output, the backslash itself is represented by a double backslash. To see the difference, we can use the print function

In [ ]:
print('abc\n123')
In [ ]:
print(r'abc\n123')

You can also add strings or replicate strings

In [ ]:
'abc' + '123'
In [ ]:
'12' * 12

There is a whole variety of other string methods to be discussed later or in the manual. Note that in Python 3 there are also the string-like data types bytes, bytearray, and memoryview, but these are for more advanced use cases - and to keep competent python programmers employed.

In [ ]:
bytearray(12)
In [ ]:
b'1234'
Indexing and Slicing

Slicing and indexing generally can be applied to all ordered data that can be indexed. It is done with the "[_]" operator, where "_" stands for an argument.

For strings, you can get individual characters (index is base-0)

In [ ]:
'abc123'[2]

or sub-strings (called "slicing" - last index excluded!). The basic syntax is "start:stop[:step]". Default step size is 1 and omitted start/stop values run to the end of the structure (string). Negative values count from the back, with -1 referring to the last element.

In [ ]:
'abc123'[2:4]
In [ ]:
'abc123'[2::2]
In [ ]:
'abc123'[::-2]
In [ ]:
'abc123'[:-1]
In [ ]:
'abc123'[-2:0:-1]

This is the default behaviour of slicing, but in principle each object can define how it reacts to indexing and slicing, and so can objects (classes) you define yourself, by defining the necessary methods. More later.
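
As a preview of how an object can define its own reaction to "[_]", here is a minimal sketch using the special method __getitem__. The class Evens is purely hypothetical and only handles non-negative indices and positive step sizes:

```python
class Evens(object):
    """Virtual sequence of the even numbers 0, 2, 4, ..."""
    def __getitem__(self, key):
        if isinstance(key, slice):
            # a slice object carries start, stop, and step
            start = key.start or 0
            step = key.step or 1
            if key.stop is None:
                raise IndexError('open-ended slice of an infinite sequence')
            return [2 * i for i in range(start, key.stop, step)]
        return 2 * key

e = Evens()
single = e[3]          # plain index
part = e[1:4]          # slice with start and stop
strided = e[0:10:3]    # slice with step
print(single, part, strided)
```

Python translates the "start:stop:step" syntax into a slice object and hands it to __getitem__; what happens with it is up to the class.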

Organizing Data - Variables

You can also assign values to variables.

In [ ]:
a = 12
In [ ]:
print(a)

Variable names can contain letters, underscores, and numbers, but must not start with a number. Variable names starting with one or two underscores usually have special meaning. Variable names are case sensitive.

Note: A variable is a name for (pointer to) an object. Assignment to an existing variable changes where it points to, not the object that it points to.

In [ ]:
a = 12
In [ ]:
b = a
In [ ]:
a = 13
In [ ]:
b

Variables do not need to be defined and given a type in advance; the type comes with the object it points to.

In [ ]:
a = 12
In [ ]:
a = 'abc'
In [ ]:
a

In-place assignment operators are shorthand for the written-out expression, e.g., i += 4 for i = i + 4. This way you can apparently add even to non-mutable objects like strings. Operators comprise the usual suspects, +=, -=, *=, /=, %=, **=, &=, |=, ^=, ...

In [ ]:
i = 3
In [ ]:
i += 4.
In [ ]:
i
In [ ]:
s = 'abc'
In [ ]:
s += 'd'
In [ ]:
s

In the string case, "s" is instead now pointing to a new string object. Later we will see for numpy that mutable objects can modify this behavior.

Organizing Data - Containers

Lists

List of elements in square brackets

In [ ]:
[1, 2, 3]
In [ ]:
[1, 2, 3] + [4, 5, 6]
In [ ]:
[1,2,3] * 4
In [ ]:
a = [1, 2, 3]
In [ ]:
a += [4]
In [ ]:
a
In [ ]:
a += [4, 5]
In [ ]:
a

List elements as assignment targets

In [ ]:
a = [1, 2, 3]
In [ ]:
a[1] = 4
In [ ]:
a
In [ ]:
a[1] = 'c'
In [ ]:
a
In [ ]:
a[0] = a
In [ ]:
a
In [ ]:
a[0][0][0][1][0]

That is, list entries can be any kind of object, even itself, and lists are mutable. In contrast, strings are not mutable.

In [ ]:
s = '123'
In [ ]:
s[2] = 'a'

You can even replace ranges by ranges

In [ ]:
a = [1, 2, 3]
In [ ]:
a[0:2] = [4, 5, 6]
In [ ]:
a

But note that assignment to elements is different

In [ ]:
a = [1, 2, 3]
In [ ]:
a[1] = [1, 2, 3]
In [ ]:
a

and that ranges cannot be replaced by scalars

In [ ]:
a[1:2] = 1
In [ ]:
a[1:2] = [1]
In [ ]:
a

There are also specific list functions and methods, e.g., append, len, index, count, min, max, copy, insert, clear, remove, pop, reverse, sort, ... and the sorted function

In [ ]:
a = [1, 3, 2]
In [ ]:
a.sort()

In the example above, the "()" stands for a function call, here without any parameters.

In [ ]:
print(a)
In [ ]:
sorted([3, 5, 2, 4])
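
A few more of the list methods and functions listed above in action. This is just an illustrative sketch; the comments show the state of the list after each step:

```python
a = [3, 1, 2]
a.append(5)          # add at the end                  -> [3, 1, 2, 5]
a.insert(1, 4)       # insert before index 1           -> [3, 4, 1, 2, 5]
last = a.pop()       # remove and return last element  -> [3, 4, 1, 2]
a.remove(4)          # remove first occurrence of 4    -> [3, 1, 2]
b = sorted(a)        # new sorted list, a is unchanged
a.reverse()          # reverse in place                -> [2, 1, 3]
where = a.index(3)   # position of the value 3
print(a, b, last, where, len(a), min(a), max(a))
```

Note the difference between the sort method, which modifies the list in place and returns None, and the sorted function, which returns a new list.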

Empty lists:

In [ ]:
[]
In [ ]:
list()

You can test whether an element is in the list

In [ ]:
2 in [1, 2, 3]
In [ ]:
2 not in [1, 2, 3]

Deleting elements

In [ ]:
a = [1, 2, 3]
In [ ]:
del a[1]
In [ ]:
a
Tuples

Like lists, but not mutable. Generated by the comma operator; enclosed in round brackets where it would otherwise be ambiguous.

In [ ]:
a = (1, 'a', [1])
In [ ]:
a[2]

Empty and 1-element tuples:

In [ ]:
a = ()
In [ ]:
a
In [ ]:
a = (1,)
In [ ]:
a

Except operations that change the tuple, most list operations work for tuples as well.

In [ ]:
(1,) * 4
In [ ]:
1 in (1, 2, 3)
Dictionaries

Dictionaries are very efficient to organize data that cannot be indexed easily. They consist of pairs of key and value; the key has to be a “hashable” (i.e., usually non-mutable) object like a number, string, or tuple. (Truth values, None, and Ellipsis are fine as well.) Dictionaries are not sorted; up to Python 3.5 they are not ordered either, whereas since Python 3.7 they preserve insertion order. Create with curly braces or the 'dict' function.

In [ ]:
a = {'a': 3, True: 4, 5: 7, (1,2): 4}
In [ ]:
a
In [ ]:
b = dict(a=7, c=11)
In [ ]:
b

Note that in the last example the keywords are converted to strings.
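
To illustrate what “hashable” means for keys, the following small sketch tries to use a list as a key, which fails, while a tuple works. The variable names are arbitrary:

```python
d = {}
d[(1, 2)] = 'tuple key'       # tuples are hashable
d[3.5] = 'float key'          # so are numbers

try:
    d[[1, 2]] = 'list key'    # lists are mutable, hence not hashable
except TypeError as e:
    message = str(e)
    print('Error:', message)

# iterate over key/value pairs
for key, value in d.items():
    print(key, '->', value)
```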

You can test whether a key is present

In [ ]:
'c' in b

You can combine dictionaries using update

In [ ]:
a.update(b)
In [ ]:
a

and access elements using the indexing syntax or the get method

In [ ]:
a[True]
In [ ]:
a.get(True)

Get a default value if key is not defined

In [ ]:
a.get(False, 0)

remove elements (element returned)

In [ ]:
a.pop(True)
In [ ]:
a

set default values for undefined entries

In [ ]:
a.setdefault(False, 0)
In [ ]:
a

New values can be easily added by assignment; if the key does not yet exist it is added, if it does exist, it is overwritten.

In [ ]:
a = {}
In [ ]:
a['a'] = 1
In [ ]:
a['b'] = 2
In [ ]:
a
In [ ]:
a['a'] = 3
In [ ]:
a

Determine number of elements

In [ ]:
len(a)

Deleting elements

In [ ]:
a = dict(a=2,b=3)
In [ ]:
a
In [ ]:
del a['b']
In [ ]:
a

There are variations like OrderedDict (in the collections module), etc., as well. Have a look at the Python Standard Library.

Sets

Sort of like dictionaries, but only keys, no values. Provides many useful set operations.

In [ ]:
{1, 2, 3}
In [ ]:
{1, 2, 3} | {3, 4, 5}
In [ ]:
{1, 'a', (1, 2)}

Define empty set

In [ ]:
set()

Set operations include <= (subset), < (proper subset), >= (superset), > (proper superset), | (union), & (intersection), - (difference), ^ (symmetric difference), ... There is also a hashable (non-mutable) set variation, frozenset.
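
A short sketch of these set operations, with the expected results as comments:

```python
a = {1, 2, 3}
b = {3, 4, 5}

union = a | b                 # {1, 2, 3, 4, 5}
intersection = a & b          # {3}
difference = a - b            # {1, 2}
symmetric = a ^ b             # {1, 2, 4, 5}
is_subset = {1, 2} <= a       # True
is_proper = a < a             # False: a set is no proper subset of itself

# a frozenset is hashable, so it can serve as a dictionary key
d = {frozenset(a): 'first'}
print(union, intersection, difference, symmetric, is_subset, is_proper, d)
```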

Converting Between Types

Dictionary to list

In [ ]:
b = dict(a = 7, c = 11)
In [ ]:
list(b)
In [ ]:
list(b.items())
In [ ]:
list(b.values())
In [ ]:
list(b.keys())

Conversions between list, set, and tuple are trivial

In [ ]:
list(tuple(set([1, 2, 3, 2, 1])))
Object Identity

You can see whether objects are identical using the is operator. Or == on the object ID, which you can get with the id function.

In [ ]:
x = [1,2,3]
In [ ]:
y = x
In [ ]:
x is y
In [ ]:
id(x) == id(y)
In [ ]:
id(x)
In [ ]:
y = x + [4]
In [ ]:
x is y
In [ ]:
y
In [ ]:
y = x
In [ ]:
y += [4]
In [ ]:
x
In [ ]:
x is y
In [ ]:
x = (1,2,3)
In [ ]:
id(x)
In [ ]:
x += (4,)
In [ ]:
id(x)
In [ ]:
x
In [ ]:
i = 1.
In [ ]:
id(i)
In [ ]:
i += 1
In [ ]:
id(i)

So, we see that the in-place operators behave differently for mutable objects than for immutable objects. This is special behaviour of the mutable objects! Recall that strings and numbers are also immutable objects.

Modules and Code Organization

Besides interactive use, code can be organized in modules - python source files with extension ".py". Many of the functions of the standard library also "live" in modules. Modules are imported using the "import" statement. As an example, let’s look at mathematical functions:

Mathematical Functions

These sit in the module math. Functions generally are called with round brackets. To use them, we first need to import this module

In [ ]:
import math
In [ ]:
math.sin(12)
In [ ]:
math.factorial(12)

Modules are objects. In the above example, the "." is used to access the module's sin function.

We can also import names for direct use

In [ ]:
from math import cos
In [ ]:
cos(12)

or even be lazy

In [ ]:
from math import atan2 as a
a(2,3)

What is in the math module?

In [ ]:
? math
Type:        module
String form: <module 'math' from '/home/alex/Python/lib/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so'>
File:        ~/Python/lib/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
Docstring:
This module is always available.  It provides access to the
mathematical functions defined by the C standard.
In [ ]:
dir(math)
In [ ]:
math.__dict__

Making your Own Module

Use your favourite editor to edit the file test1.py.

As a matter of style, we start the module - or any python object - with a "doc string": a first line with a brief description, then a blank line, then the extended description over as many lines and paragraphs as needed. Our code might be

"""
Module for python tests.

This module contains a selection of my python and learning codes.
It will change a lot over time.
"""
import math

a = math.cos(12)

and we can use this in IPython:

In [ ]:
import test1
In [ ]:
test1.a
In [ ]:
test1.__doc__
In [ ]:
? test1
Type:        module
String form: <module 'test1' from '/home/alex/Class/ASP4000-python-2016/test1.py'>
File:        ~/Class/ASP4000-python-2016/test1.py
Docstring:
Module for python tests.

This module contains a selection of my python and learning codes.
It will change a lot over time.

You can also add comments. Everything after a # symbol is treated as a comment (unless inside a string). Comments can be at the end of a line or take up an entire line. Usually they should follow the indentation of their scope.

# I am a comment

Function is Form

In python, code blocks are started by a colon at the end of a statement, then indentation (four white spaces) is used. The code block ends when de-indented. Lines can be continued with a backslash at the end of the line, or when a bracket (round, square, curly) encloses an expression.
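
The following sketch shows nested blocks and both kinds of line continuation. The function name total is only made up for this illustration:

```python
def total(values):
    result = 0
    for v in values:          # the colon starts a block ...
        if v < 0:
            continue          # ... nested blocks are indented further
        result += v
    return result             # de-indented: back in the function body

# explicit continuation with a backslash at the end of the line
x = 1 + 2 + \
    3

# implicit continuation inside brackets
y = total([1, -2,
           3, 4])
print(x, y)
```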

Here, let’s define our first function, using the def statement, which we save in test2.py

def f(x):
    y = x + 3
    return 2*y
In [ ]:
import test2
In [ ]:
test2.f(3)

We can also return more than one value. If there is no return statement, the default return value is None. I will save this in test3.py

def f(x):
    """My test function"""
    y = x + 3
    return 2 * y, 3 * y
In [ ]:
import test3
In [ ]:
test3.f(3)
In [ ]:
a, b = test3.f(4)
In [ ]:
a
In [ ]:
b

and we can also access the function doc string

In [ ]:
? test3.f
Signature: test3.f(x)
Docstring: My test function
File:      ~/Class/ASP4000-python-2016/test3.py
Type:      function
In [ ]:
test3.f.__doc__

Assignment to formal parameters will not overwrite the actual parameters; it just defines what object the name points to inside the function. Variables defined inside the function are local and not visible outside.

def f(x):
    x = x + 3
    y = x + 4
    return y
In [ ]:
import test4
In [ ]:
x = 5
In [ ]:
test4.f(x)
In [ ]:
x
In [ ]:
test4.f.y

... but we can tell python to make a variable global (to the module) or nonlocal (allowing assignment to a variable of the enclosing scope)

def f(x):
    global y
    z = 0
    def g(y):
        nonlocal z
        z = 4 * x + y
        return z + 1
    y = g(2 * x) + z
    return y + 1
In [ ]:
import test5
In [ ]:
test5.f(2)
In [ ]:
test5.y

In places where python expects code but you don’t want to do anything, you can use the pass statement. Can also be used for prototyping.

def f(x):
    pass

Functions do not have to return anything, they can just perform a task, e.g.,

def f(x):
    print('The value is:', x)

That is, they can act just like a subroutine in FORTRAN. As stated before, if there is no return value, they return None by default. If the return value is not used for anything, it is just ignored in codes (on the console it would be printed), even if the function does return a value.

Functions are Objects Too

In [ ]:
f = test5.f
In [ ]:
f(2)
In [ ]:
import math
In [ ]:
c = math.cos
In [ ]:
c(4 * math.pi)
In [ ]:
x = [test5.f, test5.f(2)]
In [ ]:
x
In [ ]:
x[0](3)

... and we can pass them as arguments

def f(g, x):
    return 2 * g(x)
In [ ]:
import test6
In [ ]:
test6.f(math.cos, 0)

... or return them ...

def f(n):
    def g(x):
        return x**n
    return g
In [ ]:
import test7
In [ ]:
h = test7.f(3)
In [ ]:
h
In [ ]:
h(2)

Note the (<locals>) above, indicating that a "closure" is included - the function encapsulates the environment in which it was defined.
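
Each call of the outer function creates a new, independent environment. A minimal sketch with a hypothetical make_counter function (not one of the course files) shows that two closures do not share their state:

```python
def make_counter(start=0):
    count = start
    def counter():
        nonlocal count        # assign to the variable of the enclosing scope
        count += 1
        return count
    return counter

c1 = make_counter()
c2 = make_counter(10)
values = [c1(), c1(), c2(), c1()]
print(values)
print(c1.__qualname__)        # the qualified name contains '<locals>'
```

Here c1 and c2 each keep their own count, which lives on after make_counter has returned.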

Instead of having to reload test by hand all the time, we can instruct IPython to do this automatically for us whenever the code has changed:

In [ ]:
%aimport test

But for the purpose of documentation in this course I had to create new files.

IPython Notebook

The IPython Notebook and (I)Python also allow you to define functions, etc., interactively; however, some things like modules do not work. For most cases I do not recommend this work mode. Here is an example, nevertheless:

In [ ]:
def f(x):
    x = x**3
    return x
In [ ]:
f(5)

Anonymous “Lambda” Functions

You can also define anonymous functions

In [ ]:
lambda x,y: x*y
In [ ]:
f = _
In [ ]:
f(2,3)

The second input also shows you how to capture previous output. You can also use the %hist magic IPython function to list previous input. Or you can use Out[#] where # is the number of the output you refer to.

These anonymous functions are particularly useful if a function needs to be passed as argument:

def f(g):
    def h(x):
        return g(g(x))
    return h
In [ ]:
import test8
In [ ]:
h = test8.f(lambda x: x*2)
In [ ]:
h(3)
In [ ]:
h('abc')

Function Arguments

... can have default values, and can be passed by keyword, or just by position. In either case, all positional arguments need to precede keyword arguments. Keyword arguments can be in arbitrary order, but there must not be a conflict with positional arguments when calling the function.

def f(x, y = 2):
    return x**y
In [ ]:
import test9
In [ ]:
test9.f(2, 3)
In [ ]:
test9.f(y=2, x=3)
In [ ]:
test9.f(2, y=3)

You can collect extra/remaining positional and keyword arguments in a tuple or dictionary using "*args" and/or "**kwargs" after the last positional or keyword argument, respectively. The same syntax can be used to pass such parameters from lists/tuples or dictionaries to functions. First Python will match all positional arguments, then match the keywords.

def f(x, *args, y = 2, **kwargs):
    return x, args, y, kwargs
In [ ]:
import test10
In [ ]:
test10.f(1)
In [ ]:
test10.f(1,2,3)
In [ ]:
test10.f(1, z = 4)
In [ ]:
args = [3, 4, 5]
In [ ]:
kwargs = dict(a = 3, b = 4, y = 5)
In [ ]:
test10.f(1, 2, *args, z = 3, **kwargs)
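
To make the unpacking direction explicit, here is a self-contained sketch. The function show mirrors the signature used above, and the names numbers and options are arbitrary:

```python
def show(x, *args, y=2, **kwargs):
    return x, args, y, kwargs

numbers = [1, 2, 3]
options = {'y': 5, 'z': 7}

# * unpacks a sequence into positional arguments,
# ** unpacks a dictionary into keyword arguments
result = show(*numbers, **options)
print(result)
```

The first list item is matched to x, the remaining ones are collected in args, the key y is matched to the keyword parameter, and the unknown key z ends up in kwargs.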

Note: Keyword parameter defaults are evaluated at the time of function definition only.

Flow Control Structures

So far we can only do rather boring things. We want to control program flow.

If Statement

The full statement includes a required leading if, followed by arbitrarily many elif clauses and an optional final else clause catching the remaining cases.

def f(x):
    if x < 0:
        return -2 * x
    elif x == 0:
        return 0
    elif (x < 1) or (x > 5):
        x = x**0.5
        return x
    else:
        return 1

There is no "case" statement as in many other languages (a match statement was only added in Python 3.10). But there is an “inline” version of the if statement:

In [ ]:
print('a') if 3 > 4 else print('b')

While Statement

This will continue a loop as long as a given condition is fulfilled. An else clause is executed when the condition tests False, even if this happens the very first time. A break statement terminates the loop without executing the else clause, and a continue statement skips the rest of the loop body and immediately continues with the next test of the condition.

def f(x):
    y = 0
    while x > 5:
        x -= 1
        y += x
        if x > 10:
            break
        if x == 7:
            continue
    else:
        return y
    return -1
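
A typical use of the else clause is a search loop: break signals "found", the else clause handles "not found". The function below is a hypothetical example, meant for integers n >= 2:

```python
def smallest_divisor(n):
    """Return the smallest divisor > 1 of n, or None if n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            break             # found a divisor: the else clause is skipped
        d += 1
    else:
        return None           # loop ended because the condition became False
    return d

print(smallest_divisor(15), smallest_divisor(13), smallest_divisor(4))
```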

For Statement

This allows you to have a set number of iterations or to iterate over the members of a container, etc. The else, continue, and break statements are used as in the while loop. The iterated item is assigned to a variable.

def f(x, n = 7):
    y = 0
    for i in range(1, n):
        y += x**i
    return y

and it can come from a list, dictionary, ...

def f(x):
    y = 0
    for i in [2, 4, 7]:
        y += x**i
    return y

or using enumerate to return 2-tuples of the index and the item

def f(x):
    d = dict()
    for i,z in enumerate(x):
        d[z] = i
    return d
In [ ]:
import test15
In [ ]:
test15.f('abc')

Another key example is the zip function

def f(x, y):
    d = dict()
    # zip pairs up the items of x and y; it stops at the shorter input
    for a, b in zip(x, y):
        d[a] = b
    return d
In [ ]:
import test16
In [ ]:
test16.f('abc', '1234')

A much larger selection can be found in the itertools module.
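
A few examples of what itertools offers. This is only a small selection, picked for illustration:

```python
import itertools

# all ordered pairs from two sequences
pairs = list(itertools.product('ab', [1, 2]))
# chain several iterables into one
chained = list(itertools.chain([1, 2], (3,), 'x'))
# take the first three items of an infinite counter
first = list(itertools.islice(itertools.count(10, 5), 3))
# all 2-element combinations
combos = list(itertools.combinations([1, 2, 3], 2))
print(pairs, chained, first, combos)
```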

Mutants - Modifying Function Arguments

At this point I would like to note that while you can’t change what the actual parameter (from the function call) points to, the object itself that it points to, if mutable, can be changed:

def f(x):
    x = x + [1]
    return x
In [ ]:
import test17
In [ ]:
x = [1,2,3]
In [ ]:
test17.f(x)
In [ ]:
x

but

def f(x):
    x += [1]
In [ ]:
import test18
In [ ]:
x = [1, 2, 3]
In [ ]:
test18.f(x)
In [ ]:
x

Unless absolutely intended, this can be hard to debug and should be avoided if you can. A very popular programming style (and languages designed around it) is called functional programming: Functions only return values but do not modify their arguments. It is already somewhat hard to violate this in python because you can’t pass parameters by reference (as in C or FORTRAN), but in some cases you sort of can mimic this if you really want. Be careful about this when modifying mutable containers that were passed as arguments to functions!

As said, except in cases where you really know what you are doing, it is the most efficient way to do your task, and it is well documented: Avoid it. Especially for the API and user interfaces to your code. It is much better to use function return values or only modify objects if the API explicitly calls for and indicates this.

This can be particularly troublesome for keyword parameters with defaults, as mentioned above:

def f(x = []):
    x +=  [1]
    return x
In [ ]:
import test19
In [ ]:
test19.f()
In [ ]:
test19.f()
In [ ]:
test19.f()

Ooops ...
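
A common way around this is to use None as the default and to create the list inside the function. A minimal sketch of this idiom:

```python
def f(x=None):
    if x is None:
        x = []                # a new list is created on every call
    x = x + [1]               # build a new list; the argument is not modified
    return x

first = f()
second = f()
mine = [1, 2]
third = f(mine)
print(first, second, third, mine)
```

Now every call without arguments starts from a fresh empty list, and a list passed in by the caller is left untouched.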

Iterators and Comprehensions

A key design of Python 3 is lazy data evaluation. Items will be produced as needed, and the same mechanism can be used to iterate over items. The range object is such an example. You can define your own iterators or iterator functions using the yield statement.
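
As a sketch of such an iterator function: a function containing yield returns a generator object when called, and its body only runs as items are requested. The name countdown is made up for this example:

```python
def countdown(n):
    """Generator function: yields n, n-1, ..., 1."""
    while n > 0:
        yield n               # hand out one value, then pause here
        n -= 1

values = []
for i in countdown(3):        # the body runs step by step, on demand
    values.append(i)

total = sum(countdown(4))     # any function that accepts an iterable works
print(values, total)
```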

Very useful are comprehensions, sort of an inline version of the for loop.

In [ ]:
[i*2 for i in range(10) if i % 2 == 0]

and can be nested ...

In [ ]:
[i*j for i in range(5) for j in range(5) if i % 2 == 0 if j % 3 == 0]

and for a rainy day, there are generator objects from comprehensions as well ...

In [ ]:
(2*i for i in range(5))
In [ ]:
x = (2*i for i in range(5))
In [ ]:
list(x)
In [ ]:
list(x)

Iterators can be exhausted.

In practice, the next function returns the next item from an iterator and raises a StopIteration exception when there are no more items.
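
This can be tried by hand. In the following sketch, iter creates an iterator from a list, and the optional second argument of next supplies a default value instead of raising the exception:

```python
it = iter([10, 20])
first = next(it)
second = next(it)
try:
    next(it)                  # nothing left
    exhausted = False
except StopIteration:
    exhausted = True
fallback = next(it, None)     # default value instead of an exception
print(first, second, exhausted, fallback)
```

A for loop does essentially the same behind the scenes and ends quietly when StopIteration is raised.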

Advanced Python

Classes - The Heart of Python

In python every object also has a type. The type of an object is its class. You can use the type function to find out about the type of an object.

In [ ]:
type((1,))

You can define your own class by deriving from an existing class - which can be one of your own classes -, hence inheriting its methods and attributes, but add more features and specialization. If you start a new class hierarchy from scratch, you usually would start by inheriting from object. But you can also inherit from multiple classes, merging their features.

class MyClass(object):
    """ My test class"""
    pass

You can now create an object of this type, i.e., an instance of this new class by "calling" it.

In [ ]:
import test20
In [ ]:
o = test20.MyClass()
In [ ]:
o

This is not yet very exciting. The most important first step is to initialize the object. This is done by the __init__ method, to which the arguments of the object creation are passed. Additionally, a first argument self is passed, which is a reference to the current instance of this class and can be used to access the class’s attributes and methods. Note that you can read the class's attributes using self, but by default an assignment through self does not overwrite them; it creates an attribute on the instance instead. The __init__ method must not return a value.

class MyClass(object):
    """ My test class"""
    def __init__(self, x):
        """initialize my object"""
        self.x = x
        self.y = x + 1
In [ ]:
import test21
In [ ]:
o = test21.MyClass(2)
In [ ]:
o.x
In [ ]:
o.y

You could also set things by hand on your object ...

In [ ]:
o.z = 4
In [ ]:
o.__dict__

... but this would be much more painful if you have to do it by hand everywhere this object is used, especially for more involved setup cases.

You can also define constants and do computations in the class body, and of course define any number of your own functions. You can even define function names by assignment, e.g., "__iadd__ = __add__". You can define how your object reacts to operators, how it is printed, what its length is, how it reacts to indexing ([]) or being called (() - call).
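
To make these special methods concrete, here is a minimal sketch of a hypothetical 2D vector class (not part of the course files) that defines its reaction to +, len, indexing, calling, and printing:

```python
class Vector(object):
    """Minimal 2D vector to illustrate special methods."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):         # v + w
        return Vector(self.x + other.x, self.y + other.y)
    def __len__(self):                # len(v)
        return 2
    def __getitem__(self, i):         # v[i]
        return (self.x, self.y)[i]
    def __call__(self, factor):       # v(factor): scaled copy
        return Vector(factor * self.x, factor * self.y)
    def __repr__(self):               # how it is printed
        return 'Vector({}, {})'.format(self.x, self.y)

v = Vector(1, 2) + Vector(3, 4)
print(v, len(v), v[0], v(2))
```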

As an example, let’s have a class that stores temperature:

class Temperature(object):
    unit = 'K'
    def __init__(self, T = 0):
        self.set(T)
    def set(self, T):
        if isinstance(T, Temperature):
            T = T._TK
        assert T >= 0
        self._TK = T # temperature in Kelvin
    def get(self):
        return self._TK
    def e(self):
        return 7.5657e-15 * self._TK**4
    def absolute(self):
        return self._TK
    def __str__(self):
        return '{} {}'.format(self.get(), self.unit)
    __repr__ = __str__

class Celsius(Temperature):
    unit = 'C'
    offset = 273.15
    def set(self, T):
        super().set(T + self.offset)
    def get(self):
        return super().get() - self.offset
In [ ]:
import test22
In [ ]:
t = test22.Temperature(3)
In [ ]:
t
In [ ]:
t.e()
In [ ]:
t = test22.Celsius(3)
In [ ]:
t
In [ ]:
t.absolute()

Exercise:

  1. Add a class that deals with temperature in Kelvin.
  2. Add a function that computes the energy flux for black body radiation.
  3. Add a functionality so two temperature objects can be added. (Consider: what happens if you add two different object types? What should the resulting object type be?)

An example of a class of which instances behave like a function:

class F(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        return x**self.n
In [ ]:
import test23
In [ ]:
f = test23.F(3)
In [ ]:
f(5)

You can also have "computed" properties of your object by defining the __getattr__ method - but this is very advanced python.

class X(object):
    def __init__(self, x, v):
        assert isinstance(x, str)
        self._x = x
        self._v = v
    def __getattr__(self, x):
        if x == self._x:
            return self._v
        raise AttributeError()
In [ ]:
import test24
In [ ]:
x = test24.X('s', 3)
In [ ]:
x.s

Naming and Formatting Conventions

  • use doc strings
  • CamelCase for classes
  • runinnames for functions and methods
  • names starting with underscore for object private methods and attributes
  • use white space before and after operators and after comma.
  • use 4 spaces for indentation; not less (or more) and no tabulators.

Making Executable Scripts

There are four things that you need.

  1. A proper operating system (not Windows) - well, can be done there as well I suppose.
  2. Make the python file (script) executable. On Linux we would do this on the shell
    chmod u+x test.py
    
    Usually this is the last step
  3. Add a “shebang” (#!) at the beginning of the script. Typically I use
    #! /usr/bin/env python3
    
    This tells it what interpreter to use to execute your script. What you need to have there may vary from system to system. Alternatively, you can call your script later using
    python3 test.py
    
    The file is not required to retain the extension .py, or you can make a symbolic link (this is what I tend to do).
  4. Execute specific script code if the module is called as main program. In this case the variable __name__ contains the string "__main__", so you start your script code block using the line
    if __name__ == "__main__":
    

Another useful thing is to access the parameters passed to the script. These can be found in sys.argv - you need to import sys of course to use this. Note that sys.argv[0] is the program name. Note that the parameters are strings.

#! /usr/bin/env python3

"""my test script"""

import sys

a = 4

def f(x):
    print(x * a)

if __name__ == "__main__":
    f(sys.argv[1])

and on the shell

~/>./test25.py 34
34343434

A very useful package for dealing with input parameters for scripts is argparse. Having this available is what made me switch from Python 2 to Python 3.
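
As a rough sketch of how the script above might look with argparse: the option name --count and the helper function parse are assumptions made for this illustration. An explicit argument list is passed here so that the example also runs outside a script:

```python
import argparse

def parse(argv=None):
    parser = argparse.ArgumentParser(description='my test script')
    parser.add_argument('text', help='text to repeat')
    parser.add_argument('-n', '--count', type=int, default=4,
                        help='number of repetitions')
    # with argv=None, argparse reads sys.argv[1:]
    return parser.parse_args(argv)

args = parse(['34', '-n', '2'])
result = args.text * args.count
print(result)
```

In contrast to raw sys.argv, argparse converts the types for you (type=int), supplies defaults, and generates a help message for the -h option.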

When Things Fail - Exceptions

Exceptions are a regular means of "out of band" communication in python. Some things are easiest done this way and some not really in any other way. Use them.

The code block to be monitored is started with a try statement; exceptions are “caught” by one or several except clauses, either generally or specialized for specific exceptions. The else clause is executed if nothing fails, and the finally clause is executed in any case - failure or not.

def f(x):
    y = 0
    try:
        for x in range(x, 3):
            y = 1 / x
    except ZeroDivisionError as e:
        print('Error:', e)
    except Exception as e:
        print('Unexpected error:', e)
    else:
        print('all went fine')
    finally:
        print(y)
In [ ]:
import test26
In [ ]:
test26.f(2)
In [ ]:
test26.f(-2)
In [ ]:
test26.f(-2.5)

You can also use just "except:" to catch all exceptions if you don’t care about the exception object itself, or leave off the "as ..." part if you don’t need the exception info.

def f(x):
    y = 0
    try:
        for x in range(x, 3):
            y = 1 / x
    except ZeroDivisionError:
        print('Error!')
    except:
        print('Unexpected error!')
    else:
        print('all went fine')
    finally:
        print(y)

Input and Output

Format Strings

Strings have a format method. It allows you to replace "placeholders" - in curly braces - by arguments. They can be matched by index (zero-based) or by keyword. A detailed description is found at https://docs.python.org/3.4/library/string.html. The layout of a placeholder is to first give what is to be formatted - either by number or by name - followed by a colon, then the format specification. If no number or name is supplied, numbering is automatic.

Examples:

In [ ]:
'The Winner is {:} on {:}'.format('Mr. X', 'best movie')
In [ ]:
'The Winner is {act:>10s} for A${prize:05d}'.format(prize=1000, act='Jim')

Another very useful string method is join to combine a list of strings.

In [ ]:
'/'.join(('home','alex','xxx'))
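
The matching of placeholders by index described above is not covered by the examples so far. The following is a small sketch with made-up strings. It also shows that join places the separator only between the items.

```python
# placeholders matched by zero-based index; an index may be reused
s1 = '{1} beats {0}, but {0} beats {2}'.format('rock', 'paper', 'scissors')

# index or keyword first, then a colon, then the format specification
s2 = '{0:>6.2f}|{name:<5s}|{1:03d}'.format(3.14159, 7, name='pi')

# join puts the separator between the items, not at the ends
s3 = ', '.join(['a', 'b', 'c'])

print(s1)
print(s2)
print(s3)
```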

File I/O

It is very useful to consistently use the routines in os and os.path to manage paths. More recently the module pathlib was added, but I do not have experience with it and find it sort of clunky. You may also need some routines from the sys module.

The open routine opens a file and returns a file object. To close the file, call its close method. Always close your files when done. Resources are finite. https://docs.python.org/3/library/functions.html#open

The routine takes a file name and an open “mode”. Useful modes are t for "text", b for "binary", U for "universal newline" mode (deprecated, and removed in Python 3.11), r for read, w for write (truncates an existing file), a for append, and x for exclusive creation; an added + opens the file for "updating", i.e., reading and writing (w+ still truncates).

Use the write method to write to a file, read to read the (entire) file, or readline to read a single line.

def f():
    f = open('xxx.txt', 'wt')
    f.write('123\n345\n5677')
    f.close()

Text files can be iterated over - each iteration yields one line:

def f():
    f = open('xxx.txt', 'rt')
    for i, l in enumerate(f):
        print('{:05d} {}'.format(i, l.strip()))
    f.close()

Here the strip method of the string gets rid of leading and trailing white space, including the trailing newline (\n).
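
To illustrate some of the modes and methods listed above, here is a small sketch that writes, appends, and reads back a file. The temporary directory is an assumption made here so that no existing file is touched.

```python
import os
import tempfile

# work in a temporary directory so no existing file is overwritten
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, 'xxx.txt')

f = open(filename, 'wt')      # w: create the file or truncate it
f.write('123\n345\n')
f.close()

f = open(filename, 'at')      # a: append to the end
f.write('5677\n')
f.close()

f = open(filename, 'rt')
first = f.readline()          # one line, including the newline
rest = f.read()               # everything that is left
f.close()

try:
    open(filename, 'xt')      # x: refuses to open an existing file
except FileExistsError:
    exclusive_failed = True
else:
    exclusive_failed = False

print(repr(first), repr(rest), exclusive_failed)
```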

We may deal with binary files later.

Resource Management - With Statement

The with statement allows you to manage resources, for example, to close files automatically - even if an exception occurs inside the block.

def f():
    with open('xxx.txt', 'rt') as f:
        for i, l in enumerate(f):
            print('{:05d} {}'.format(i, l.strip()))
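
The following sketch checks the claim that the file is closed even when an exception occurs. The file content and the variable names handle and failed are made up for this illustration.

```python
import os
import tempfile

filename = os.path.join(tempfile.mkdtemp(), 'xxx.txt')
with open(filename, 'wt') as f:
    f.write('123\nabc\n')

handle = None
failed = False
try:
    with open(filename, 'rt') as f:
        handle = f            # keep a reference for inspection afterwards
        for line in f:
            int(line)         # raises ValueError for the line 'abc'
except ValueError:
    failed = True

# the with statement has closed the file although the block failed
print(failed, handle.closed)
```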

Decorators

Decorators start with the @ symbol and are essentially functions that return modified “decorated” functions. You may see this sometimes.

def mul5(f):
    def g(*args, **kwargs):
        print('function was decorated.')
        return f(*args, **kwargs) * 5
    return g

@mul5
def h(x):
    return x**2
In [ ]:
import test31
In [ ]:
test31.h(3)

The key point is that the decorator takes a function and returns another function.
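
To make this explicit, the @ line is just a shorthand for calling the decorator by hand and rebinding the name. A minimal sketch (the function name square is made up here):

```python
def mul5(f):
    def g(*args, **kwargs):
        return f(*args, **kwargs) * 5
    return g

def square(x):
    return x**2

# this is what writing "@mul5" above "def square" would do
decorated = mul5(square)

@mul5
def h(x):
    return x**2

print(square(3), decorated(3), h(3))
```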

The decorator can also take parameters. In that case the outer function takes the parameters and returns the actual wrapping function.

def multiply(x):
    def wrap(f):
        def g(*args, **kwargs):
            print('function was decorated:', x)
            return f(*args, **kwargs) * x
        return g
    return wrap

@multiply(5)
def h(x):
    return x**2
In [ ]:
import test32
In [ ]:
test32.h(3)

Or you can design it as a class whose __init__ method takes the parameters and whose __call__ method wraps the function.

class Multiply(object):
    def __init__(self, factor):
        self._factor = factor
    def __call__(self, f):
        def g(*args, **kwargs):
            print('function was decorated:', self._factor)
            return f(*args, **kwargs) * self._factor
        return g

@Multiply(5)
def h(x):
    return x**2

Properties

Properties allow you to provide an interface to internal data of your object. More generally, you can control attribute access using the descriptor methods __get__ and __set__.

class Temperature(object):
    def __init__(self, T = 0):
        self.T = T
    @property
    def T(self):
        """T in K"""
        return self._TK
    @T.setter
    def T(self, T):
        self._TK = T
    def e(self):
        """energy density in cgs"""
        return 7.5657e-15 * self._TK**4

class Celsius(Temperature):
    offset = 273.15
    @property
    def T(self):
        """T in C"""
        return self._TK - self.offset
    @T.setter
    def T(self, T):
        self._TK = T + self.offset
In [ ]:
import test34
In [ ]:
t = test34.Celsius()
In [ ]:
t.T
In [ ]:
t.T = 100
In [ ]:
t._TK
In [ ]:
t.T

Using the property function you can also define properties that can only be set but not read.

class Temperature(object):
    def __init__(self, T = 0):
        self.T = T
    @property
    def T(self):
        """T in K"""
        return self._TK
    @T.setter
    def T(self, T):
        self._TK = T
    def e(self):
        """energy density in cgs"""
        return 7.5657e-15 * self._TK**4

class Celsius(Temperature):
    offset = 273.15
    @property
    def T(self):
        """T in C"""
        return self._TK - self.offset
    @T.setter
    def T(self, T):
        self._TK = T + self.offset
    def _set_TF(self, T):
        self.T = (T - 32) * 5 / 9  # Fahrenheit to Celsius
    TF = property(fset = _set_TF)
In [ ]:
import test35
In [ ]:
t = test35.Celsius()
In [ ]:
t.TF = -20
In [ ]:
t.T
In [ ]:
t.TF
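
The __get__ and __set__ methods mentioned at the start of this section are what properties are built on. As a rough sketch only, a hand-written descriptor could look like this; the classes Positive and Star are hypothetical examples, not part of the course files.

```python
class Positive(object):
    """Descriptor that only accepts positive values (hypothetical)."""
    def __init__(self, name):
        self._name = '_' + name   # instance attribute that stores the value
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self           # accessed on the class, not an instance
        return getattr(obj, self._name)
    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError('value must be positive')
        setattr(obj, self._name, value)

class Star(object):
    mass = Positive('mass')       # class attribute managed by the descriptor
    def __init__(self, mass):
        self.mass = mass          # goes through Positive.__set__

s = Star(2.0)
try:
    s.mass = -1                   # rejected by the descriptor
except ValueError as e:
    print('Error:', e)
print(s.mass)
```

Unlike a property, the same descriptor class can be reused for several attributes and in several classes.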

Regular Expressions

Knowing how to use regular expressions will make your life much easier. Yes, it requires some work to get started. Python offers some powerful tools to use them for your text processing, e.g., extracting data from text files or web pages. The manual is at https://docs.python.org/3.4/library/re.html. The main use is to match items that have certain patterns.

Key tokens are ( and ) to “capture” strings you want to extract, . to match any character (except a newline, by default), ? to match the previous item zero or one times, * to match the previous item any number of times, + to match the previous item at least once, ^ for beginning of string/line, $ for end of string/line, [ and ] to define a set of characters, etc. You also have special tokens like \d matching a digit, etc.

In [ ]:
import re
In [ ]:
re.findall('([123]+)', '1234ghgs7sjj4399')
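
A few more of the tokens listed above in action. This is only a sketch with made-up input strings. Raw strings (r'...') are used so that Python does not interpret the backslashes itself.

```python
import re

text = 'run12.dat run7.dat notes.txt'

# ( ) capture the part to extract, \d+ matches one or more digits,
# \. matches a literal dot
numbers = re.findall(r'run(\d+)\.dat', text)

# ^ and $ anchor the pattern at the start and end of the string,
# " ?" makes the space before K optional
m = re.match(r'^(\w+) = (\d+) ?K$', 'T = 5777 K')

# [ ] define character sets, * allows any number of repetitions (also zero)
words = re.findall(r'[a-z]+[0-9]*', 'ab12 cd x9')

print(numbers, m.groups(), words)
```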

Here is an example script I have used to renumber all the input/output prompts from IPython to be consecutive in the script.

#! /usr/bin/env python3

import sys, re, os

def format(infile):
    outfile = infile + '.tmp'
    # raw strings keep the backslashes for the regular expression engine
    In  = re.compile(r'^(In \[)[0-9]+(\]:.*)')
    Out = re.compile(r'^(Out\[)[0-9]+(\]:.*)')
    Search = (In, Out)
    count = 0
    with open(infile, 'rt') as f, open(outfile, 'xt') as g:
        for line in f:
            for prompt in Search:
                m = prompt.findall(line)
                if len(m) == 0:
                    continue
                if prompt is In:
                    count += 1
                line = prompt.sub(r'\g<1>{:d}\g<2>'.format(count),
                                  line)
            g.write(line)
    os.remove(infile)
    os.rename(outfile, infile)

if __name__ == "__main__":
    format(sys.argv[1])

NumPy

A very widely used convention is to import numpy as np:

In [ ]:
import numpy as np

If you start IPython with the --pylab flag, it will do this automatically. In your scripts, however, you still have to do it by hand.

In the IPython notebook we can do the same with the %pylab macro:

In [ ]:
%pylab

NumPy provides a multi-dimensional array class, ndarray, optimized for numerical data processing. It allows various data types - it defines its own data types, called "dtypes", and even has record arrays. I recommend the very good online documentation at http://www.numpy.org/.

There is too much to tell about NumPy to fit into the time allocated for this course. One of the key features is that you can do indexing not just in slice notation, but also by lists of indices or multi-dimensional constructs, or by arrays of truth values. The latter is extremely useful to avoid loops with if statements.

Key advice: Use “advanced slicing” to replace loops with if statements! Such loops are very slow. Should a loop really not be possible to avoid, you can easily write FORTRAN extension modules using f2py.
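
As a minimal sketch of this advice, here is a loop with an if statement next to the equivalent indexing by an array of truth values, plus indexing by a list of indices. The array contents are made up for this illustration.

```python
import numpy as np

x = np.arange(10)

# loop with an if statement (slow for large arrays)
y_loop = x.copy()
for i in range(len(y_loop)):
    if y_loop[i] % 2 == 1:
        y_loop[i] = 0

# the same using an array of truth values as index
y_mask = x.copy()
y_mask[y_mask % 2 == 1] = 0

# indexing by a list of indices; an index may appear more than once
picked = x[[1, 3, 3, 7]]

print(y_loop, y_mask, picked)
```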

Examples:

In [ ]:
x = np.arange(12).reshape(3, -1)
In [ ]:
x.shape
In [ ]:
x[0,2]
In [ ]:
x[2]
In [ ]:
x
In [ ]:
ii = x % 2 == 1
In [ ]:
ii
In [ ]:
x[ii] = 0
In [ ]:
x
In [ ]:
y = np.array([1, 2, 3, 4])
In [ ]:
y
In [ ]:
x *= y
In [ ]:
x
In [ ]:
y = np.array([1,2,3])
In [ ]:
x *= y[:, np.newaxis]
In [ ]:
x

The example above shows that NumPy by default tries to match the last dimension and automatically expands the others. To change that, you can add “extra axes” using np.newaxis.
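
The matching of dimensions can be seen more directly from the shapes involved. A small sketch, assuming an array of ones so that the effect of each factor is easy to read off:

```python
import numpy as np

x = np.ones((3, 4))
row = np.array([1, 2, 3, 4])     # shape (4,) matches the last dimension
col = np.array([1, 2, 3])        # shape (3,) does not match the last dimension

a = x * row                      # row multiplies each of the 3 rows
b = x * col[:, np.newaxis]       # shape (3, 1): col multiplies each column

try:
    x * col                      # shapes (3, 4) and (3,) cannot be matched
except ValueError:
    mismatch = True
else:
    mismatch = False

print(a.shape, b.shape, col[:, np.newaxis].shape, mismatch)
```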

Telling you all about NumPy would likely be an entire course by itself.

Matplotlib

Matplotlib is a frequently used Python package for plotting. To use it interactively on the IPython shell, use the --pylab flag when starting IPython, or use the %pylab macro in the IPython notebook. Much of the interactive plotting interface is quite similar to MATLAB, so it may not seem all that strange. Matplotlib is fully object-oriented, and so are the graphics it produces: objects can be modified, even interacted with, e.g., react to users clicking on them (advanced programming, I have never used this).

Whereas you can do things interactively on the shell for development - I do that, in part - I highly recommend that you put the finalized plotting code inside a script, possibly in a function, but preferably in an object. Even if you just use the __init__ method for all the plotting at first - later you can delegate some of the tasks, e.g., setups you frequently use, into separate routines, maybe in a base class ("MyPlot") from which you derive specialized plots. You can store properties of the plot, the figure object, axes objects, etc. in the class and later access them.

The key is that, having plots in scripts, you can easily modify things and rerun the script, or if your data or model has changed, you can just rerun the script - especially for last minute changes when your thesis is due (advisor says: "Change the color of that line!").

There is a vast variety of different plots and recipes to make them. A prominent gallery with code examples can be found at http://matplotlib.org/gallery.html. Personally, I don’t think this is always done very well, but it is a good start. I highly recommend you have a look at the Beginner’s and Advanced guides (http://matplotlib.org/devdocs/contents.html).

As an example, let’s make a simple line plot:

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os.path

projectpath = os.path.expanduser('~/xxx')

class MyPlot(object):
    def __init__(self, func = lambda x: np.sin(x)**2):
        f = plt.figure()
        ax = f.add_subplot(1, 1, 1)
        x = np.linspace(1, 10, 100)
        y = func(x)
        ax.plot(x, y,
                color = 'r',
                lw = 3,
                label = 'Model A')
        ndata = 10
        x = np.random.choice(x, ndata)
        y = func(x) + np.random.rand(ndata) * 0.1
        ax.plot(x, y,
                color = 'b',
                marker = '+',
                linestyle = 'None',
                markersize = 12,
                markeredgewidth = 2,
                label = 'Data')
        ax.set_xscale('log')
        ax.set_xlabel(r'$x\,(\mathrm{cm})$')
        ax.set_ylabel(r'$\sin^2\left(x\right)$')
        ax.legend(loc='best')
        f.tight_layout()
        plt.show()
        self.f = f

    def save(self, filename):
        self.f.savefig(os.path.join(projectpath, filename))
In [ ]:
import test37
In [ ]:
p = test37.MyPlot()
In [ ]:
p.save('xxx.pdf')

Matplotlib will automatically choose the output file format based on the file extension.
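
The suggestion made earlier, to move frequently used setup into a base class and derive specialized plots from it, could be sketched as follows. The class names BasePlot and SinePlot and the method name draw are made up for this illustration. The non-interactive Agg backend and the temporary output directory are assumptions so that the sketch runs without opening a window.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')            # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

class BasePlot(object):
    """Common setup shared by all plots (hypothetical base class)."""
    def __init__(self):
        self.f = plt.figure()
        self.ax = self.f.add_subplot(1, 1, 1)
        self.draw()              # supplied by the derived class
        self.f.tight_layout()
    def draw(self):
        raise NotImplementedError()
    def save(self, filename):
        self.f.savefig(filename)

class SinePlot(BasePlot):
    """Specialized plot: only the drawing itself is defined here."""
    def draw(self):
        x = np.linspace(0, 10, 100)
        self.ax.plot(x, np.sin(x)**2, color='r', lw=3, label='Model A')
        self.ax.set_xlabel('x')
        self.ax.legend(loc='best')

p = SinePlot()
outfile = os.path.join(tempfile.mkdtemp(), 'sine.png')
p.save(outfile)
```

The figure and axes objects stay accessible as p.f and p.ax, so a plot can still be adjusted after it has been created.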

Instead of having plots open in separate windows, you can also have them embedded into the notebook, as you may be used to from other notebook tools. Do this with the %matplotlib macro:

In [ ]:
%matplotlib inline
In [ ]:
plot([0,1])

You can disable this again using %matplotlib without arguments.

In [ ]:
%matplotlib
In [ ]:
plot([0,1])

The End.

Please feel free to contact me if you have further questions or need some help with tough python problems.