Python is an object-oriented, interpreted scripting language. There is also a recent Nature article at http://www.nature.com/news/programming-pick-up-python-1.16833. For documentation and tutorials, I generally recommend the original online Python documentation at https://docs.python.org/3/. Of course, you can always “google” for specific recipes. Python allows you to work interactively and to write routines you can call, but also to write standalone applications. There are modules for almost anything. Python is used for applications ranging from running web servers to supercomputing. It also allows you to interface to other languages, e.g., FORTRAN (f2py), C/C++, R, ... You can use Python for scripted text processing, data analysis (numpy), and plotting/visualization (matplotlib). We will use the latter two packages in the second part of this course. Here we will first focus on a basic introduction to Python 3. This introduction is not comprehensive; it is just meant to give you an idea of the power of Python and of what you might be able to do, though you may have to look up the details later. It will not replace actually reading the manual and tutorial, which I highly recommend.
Python is object-oriented – in Python everything is an object. Everything. Object-oriented programming combines data and code, allowing data encapsulation, and includes inheritance and polymorphism.
In this crash course - and this will crash any person’s ability to absorb it all in just 3h - I will give an overview, so that you have seen what you may be able to use in Python, to inspire you to model your research projects and ideas in Python and to find the right data structures and organization for them. Later you may then remember having seen things that you could use. I do not expect that you remember everything by heart. I need to look up and try out things continuously myself.
In this course we want to use Python 3. At the time of this writing, the current version is Python 3.5.1. For simpler editing, we use IPython. The current version is 4.1.1. In the following the shell prompt (Bash) is displayed as
~>
To start IPython we use
~> ipython
you should then see a message like
Python 3.5.1 (default, Dec 29 2015, 23:29:32)
Type "copyright", "credits" or "license" for more information.
IPython 4.1.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython’s features.
%quickref -> Quick reference.
help -> Python’s own help system.
object? -> Details about ’object’, use ’object??’ for extra details.
In [1]:
On some systems you may have to type
~> ipython3
Now you are ready to use Python. There are also fancier shells, e.g., the IPython notebook, which you would start using
~> ipython notebook
This should pop up a new tab in your web browser.[1](#install_notes) Then you press the New Notebook button on the top right.
Very useful later in this course - and indispensable for you later when developing Python codes - will be to automatically reload modules when you modify them.
You can do this manually, every time you start IPython, by first loading the autoreload extension and then activating it to reload all loaded modules automatically when they change.
In [2]: %load_ext autoreload
In [3]: %autoreload 2
Instructions on how to set this up by default so it is done automatically every time you start IPython can be found at, e.g., https://www.reddit.com/r/Python/comments/rsfsi/tutorial_spend_30_seconds_setting_up/.
On the IPython / Python shell you type <enter> to execute commands. In the IPython notebook you have to type <shift>+<enter> to execute a cell; just <enter> starts a new line, allowing you to execute a block of commands at once. For the purpose of reproducibility in other Python shells, I usually do not use that here.
1: For me this required installation of pyzmq and jinja2.
2
2000000000000000000000000000000000000000000000000000000000000000
Integer constants can also be specified in other bases, e.g.,
0x32
0o32
0b101
Floating point numbers have finite precision. Use "." to separate the fractional part and "e" to separate the exponent. Examples:
2.
-2.e23
2.0001e23
2.0000000000000000000000000000000001e23
Precision is system-dependent; typical is IEEE-754 8-byte binary floating point (about 15 significant decimal digits, decimal exponent up to roughly $\pm300$). Classes for (internal) decimal representation exist as well.
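To make the finite precision tangible, here is a minimal sketch. It assumes the usual IEEE-754 doubles and uses the standard-library decimal module as one example of a decimal representation; the variable names are only illustrative.

```python
import sys
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly.
total = 0.1 + 0.2
is_exact = (total == 0.3)          # False on IEEE-754 doubles

# sys.float_info reports the limits of the float type on this system.
digits = sys.float_info.dig        # decimal digits that survive a round trip
largest = sys.float_info.max

# Decimal stores numbers in base 10, so this sum is exact.
exact_total = Decimal('0.1') + Decimal('0.2')

print(is_exact, digits, largest, exact_total)
```

For most numerical work the binary float is what you want; the decimal classes are slower and mainly matter where exact decimal fractions are required.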
Use "j" for the imaginary part of a complex number.
3 -4j
##### Operators
The usual: +, -, *, /, % (modulo), ** (power). Special: // (integer division). Combining an integer with a float will result in a float. Examples:
7 // 3
7 / 3
7. // 3
10**400 + 1.
There are also bit-wise binary operations on integers using & (and), | (or), ^ (xor), ~ (not), << (left shift), >> (right shift):
7 & 3
7 | 8
7 ^ 3
~3
3 << 2
The logical values are True and False. Logical operations include or, not, and and:
True or False
True and False
not True
Comparison operators include <, >, <=, >=, == (equality, in contrast to assignment), != (not equal), is (object identity), and is not (negated object identity).
(1 > 3) or (3 == 4)
... and there is the None object, which we will use later.
None
... and the Ellipsis object, which can also be written as "..."
Ellipsis
Strings are sequences of characters enclosed by matching single or double quotation marks. Multi-line strings can be defined using triple quotation marks.
'abc 123'
"""abc
123"""
On a regular (I)Python(3) prompt you would have seen a line starting with ....: for continuation:
In [30]: """abc
....: 123"""
Here, special characters like the new line (\n) start with a \. You can add them manually:
'abc\n123'
To input special characters into a string without interpretation, use a raw string, which has an r in front of the string:
r'abc\n123'
Here the backslash itself is represented by a double backslash. To see the difference, we can use the print function
print('abc\n123')
print(r'abc\n123')
You can also add strings or replicate strings
'abc' + '123'
'12' * 12
There is a whole variety of other string methods to be discussed later or found in the manual. Note that in Python 3 there are also the string-like data types bytes, bytearray, and memoryview, but these are for more advanced use cases - and to keep competent Python programmers employed.
bytearray(12)
b'1234'
Slicing and indexing generally can be applied to all ordered data that can be indexed. It is done with the "[_]" operator, where "_" stands for an argument.
For strings, you can get individual characters (index is base-0)
'abc123'[2]
or sub-strings (called "slicing" - last index excluded!). The basic syntax is "start:stop[:step]". The default step size is 1, and omitted start/stop values run to the end of the structure (string). Negative values count from the back, with -1 referring to the last element.
'abc123'[2:4]
'abc123'[2::2]
'abc123'[::-2]
'abc123'[:-1]
'abc123'[-2:0:-1]
This is the default behaviour of slicing, but in principle each object can define how it wants to react to this, and so can objects (classes) you define yourself, by defining the necessary methods. More later.
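As a preview of how an object can define its own reaction to indexing and slicing, here is a minimal sketch using the __getitem__ special method. The class Squares is a hypothetical example made up for this illustration; classes themselves are introduced further below.

```python
class Squares(object):
    """Hypothetical container: item i is i**2 for 0 <= i < n."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n
    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices() resolves defaults and negative values
            return [i**2 for i in range(*key.indices(self.n))]
        if key < 0:
            key += self.n
        if not 0 <= key < self.n:
            raise IndexError('index out of range')
        return key**2

s = Squares(6)
print(s[2], s[-1], s[1:4], s[::-2])
```

The object never stores its items; it only decides what to return when it is indexed with a number or a slice.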
You can also assign values to variables.
a = 12
print(a)
Variable names can contain letters, underscores, and numbers but must not start with a number. Variable names starting with one or two underscores usually have special meaning. Variable names are case sensitive.
Note: A variable is a name for (a pointer to) an object. Assignment to an existing variable changes where the name points to, not the object that it pointed to.
a = 12
b = a
a = 13
b
Variables do not need to be defined and given a type in advance; the type comes with the object it points to.
a = 12
a = 'abc'
a
In-place assignment operators are shorthand for the written-out expression. This way you can apparently add even to non-mutable objects like strings. Operators comprise the usual suspects, +=, -=, *=, /=, %=, **=, &=, |=, ^=, ...
i = 3
i += 4.
i
s = 'abc'
s += 'd'
s
In the string case, "s" is instead now pointing to a new string object. Later we will see for numpy that mutable objects can modify this behavior.
[1, 2, 3]
[1, 2, 3] + [4, 5, 6]
[1,2,3] * 4
a = [1, 2, 3]
a += [4]
a
a += [4, 5]
a
List elements as assignment targets
a = [1, 2, 3]
a[1] = 4
a
a[1] = 'c'
a
a[0] = a
a
a[0][0][0][1][0]
That is, list entries can be any kind of object, even itself, and lists are mutable. In contrast, strings are not mutable.
s = '123'
s[2] = 'a'
You can even replace ranges by ranges
a = [1, 2, 3]
a[0:2] = [4, 5, 6]
a
But note that assignment to elements is different
a = [1, 2, 3]
a[1] = [1, 2, 3]
a
and that ranges cannot be replaced by scalars
a[1:2] = 1
a[1:2] = [1]
a
There are also specific list functions and methods, e.g., append, len, index, count, min, max, copy, insert, clear, remove, pop, reverse, sort, ... and the sorted function
a = [1, 3, 2]
a.sort()
In the example above, the "()" stands for a function call, here without any parameters.
print(a)
sorted([3, 5, 2, 4])
Empty lists:
[]
list()
You can test whether an element is in the list
2 in [1, 2, 3]
2 not in [1, 2, 3]
Deleting elements
a = [1, 2, 3]
del a[1]
a
Tuples are like lists, but not mutable. They are generated by the comma operator, enclosed in round brackets where the expression would otherwise be ambiguous.
a = (1, 'a', [1])
a[2]
Empty and 1-element tuples:
a = ()
a
a = (1,)
a
Except operations that change the tuple, most list operations work for tuples as well.
(1,) * 4
1 in (1, 2, 3)
Dictionaries are very efficient for organizing data that cannot be indexed easily. A dictionary consists of pairs of key and value; the key has to be a “hashable” (i.e., usually non-mutable) object like a number, string, or tuple. (Truth values, None, and Ellipsis are fine as well.) In the Python 3.5 used here, dictionaries are neither ordered nor sorted (since Python 3.7 they preserve insertion order). Create them with curly braces or the 'dict' function.
a = {'a': 3, True: 4, 5: 7, (1,2): 4}
a
b = dict(a=7, c=11)
b
Note that in the last example the keywords are converted to strings.
You can test whether a key is present
'c' in b
You can combine dictionaries using update
a.update(b)
a
and access elements using the indexing syntax or the get method
a[True]
a.get(True)
Get a default value if key is not defined
a.get(False, 0)
remove elements (element returned)
a.pop(True)
a
set default values for undefined entries
a.setdefault(False, 0)
a
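The get and setdefault methods shown above are handy for building up dictionaries on the fly. The following is a minimal sketch with made-up data; it uses a for loop and enumerate, which are introduced further below.

```python
text = 'abracadabra'

# Count characters: get() supplies 0 for keys not seen before.
counts = {}
for ch in text:
    counts[ch] = counts.get(ch, 0) + 1

# Group positions by character: setdefault() inserts an empty list once.
positions = {}
for i, ch in enumerate(text):
    positions.setdefault(ch, []).append(i)

print(counts)
print(positions['a'])
```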
New values can easily be added by assignment; if the key does not yet exist it is added, if it does exist, its value is overwritten.
a = {}
a['a'] = 1
a['b'] = 2
a
a['a'] = 3
a
Determine number of elements
len(a)
Deleting elements
a = dict(a=2,b=3)
a
del a['b']
a
There are variant classes like OrderedDict, etc., as well. Have a look at the Python Standard Library.
Sets are sort of like dictionaries, but with only keys, no values. They provide many useful set operations.
{1, 2, 3}
{1, 2, 3} | {3, 4, 5}
{1, 'a', (1, 2)}
Define empty set
set()
Set operations include <= (subset), < (proper subset), >= (superset), > (proper superset), | (union), & (intersection), - (difference), ^ (symmetric difference), ...
There is also a hashable (non-mutable) set variant, frozenset.
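A short sketch of the set operators listed above, with made-up example sets, including a frozenset used as a dictionary key (possible because it is hashable):

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b
common = a & b
only_a = a - b
either = a ^ b
is_subset = {3, 4} <= a

# A frozenset is hashable, so it can serve as a dictionary key.
labels = {frozenset({1, 2}): 'pair'}
print(union, common, only_a, either, is_subset, labels[frozenset({2, 1})])
```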
Dictionary to list
b = dict(a = 7, c = 11)
list(b)
list(b.items())
list(b.values())
list(b.keys())
Conversions between list, set, and tuple are trivial
list(tuple(set([1, 2, 3, 2, 1])))
You can see whether objects are identical using the is operator, or using == on the object IDs, which you can get with the id function.
x = [1,2,3]
y = x
x is y
id(x) == id(y)
id(x)
y = x + [4]
x is y
y
y = x
y += [4]
x
x is y
x = (1,2,3)
id(x)
x += (4,)
id(x)
x
i = 1.
id(i)
i += 1
id(i)
So we see that the in-place operators behave differently for mutable objects than for immutable objects. This is special behaviour of the mutable objects! Recall that strings and numbers are also immutable objects.
Besides interactive use, code can be organized in modules - Python source files with extension ".py". Many of the functions of the standard library also "live" in modules. Modules are imported using the "import" statement. As an example, let’s look at mathematical functions: These sit in the module math. Functions generally are called with round brackets. To use them, we first need to import this module
import math
math.sin(12)
math.factorial(12)
Modules are objects. In the above example, the "." is used to access the module's sin function.
We can also import names for direct use
from math import cos
cos(12)
or even be lazy
from math import atan2 as a
a(2,3)
What all is in the math module?
? math
Type: module
String form: <module 'math' from '/home/alex/Python/lib/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so'>
File: ~/Python/lib/python3.5/lib-dynload/math.cpython-35m-x86_64-linux-gnu.so
Docstring:
This module is always available. It provides access to the
mathematical functions defined by the C standard.
dir(math)
math.__dict__
Use your favourite editor to edit the file test1.py.
As a matter of style, we start the module - as any Python object - with a "doc string": on the first line a brief description, then a blank line, then the extended description on as many lines and paragraphs as needed. Our code might be
"""
Module for python tests.

This module contains a selection of my python and learning codes.
It will change a lot over time.
"""
import math
a = math.cos(12)
and we can use this in IPython:
import test1
test1.a
test1.__doc__
? test1
Type: module
String form: <module 'test1' from '/home/alex/Class/ASP4000-python-2016/test1.py'>
File: ~/Class/ASP4000-python-2016/test1.py
Docstring:
Module for python tests.

This module contains a selection of my python and learning codes.
It will change a lot over time.
You can also add comments. Everything after a # symbol is treated as a comment (unless it is inside a string). Comments can be at the end of a line or take up the entire line. They usually should follow the indentation of their scope
# I am a comment
In Python, code blocks are started by a colon at the end of a statement, followed by indented lines (four white spaces by convention). The code block ends when the code is de-indented. Lines can be continued with a backslash at the end of the line, or when a bracket (round, square, curly) encloses an expression.
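The block and continuation rules can be seen in a small sketch. The function total is a hypothetical example; it already uses def, for, and if, which are explained in the following sections.

```python
def total(values):
    # The colon opens a block; the indented lines form the function body.
    result = 0
    for v in values:
        if v > 0:
            result += v
    return result            # de-indenting would end the block

# Continuation with a backslash at the end of the line ...
x = 1 + 2 + \
    3 + 4

# ... or implicitly inside brackets (usually preferred).
y = total([1, -2,
           3, -4,
           5])
print(x, y)
```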
Here, let’s define our first function, using the def statement, which we save in test2.py
def f(x):
    y = x + 3
    return 2*y
import test2
test2.f(3)
We can also return more than one value. If there is no return statement, the default return value is None. I will save this in test3.py
def f(x):
    """My test function"""
    y = x + 3
    return 2 * y, 3 * y
import test3
test3.f(3)
a, b = test3.f(4)
a
b
and we can also access the function doc string
? test3.f
Signature: test3.f(x)
Docstring: My test function
File: ~/Class/ASP4000-python-2016/test3.py
Type: function
test3.f.__doc__
Assignment to a formal parameter will not overwrite the actual parameter; it just defines to what object the name points inside the function. Variables defined inside the function are local and not visible outside.
def f(x):
    x = x + 3
    y = x + 4
    return y
import test4
x = 5
test4.f(x)
x
test4.f.y
... but we can tell Python to make a variable global (to the module) and to make variables nonlocal (allowing assignment in the enclosing scope)
def f(x):
    global y
    z = 0
    def g(y):
        nonlocal z
        z = 4 * x + y
        return z + 1
    y = g(2 * x) + z
    return y + 1
import test5
test5.f(2)
test5.y
In places where Python expects code but you don’t want to do anything, you can use the pass statement. It can also be used for prototyping.
def f(x):
    pass
Functions do not have to return anything; they can just perform a task, e.g.,
def f(x):
    print('The value is:', x)
That is, they can act just like a subroutine in FORTRAN. As stated before, if there is no return value, they return None by default. If the return value is not used for anything, it is just ignored in codes (on the console it would be printed), even if the function does return a value.
Functions are objects, too: they can be assigned to variables and stored in containers
f = test5.f
f(2)
import math
c = math.cos
c(4 * math.pi)
x = [test5.f, test5.f(2)]
x
x[0](3)
... and we can pass them as arguments
def f(g, x):
    return 2 * g(x)
import test6
test6.f(math.cos, 0)
... or return them ...
def f(n):
    def g(x):
        return x**n
    return g
import test7
h = test7.f(3)
h
h(2)
Note the (<locals>) above, indicating that a "closure" is included - the function encapsulates the environment in which it was defined.
Instead of having to reload test by hand all the time, we can instruct IPython to do this automatically for us whenever the code has changed:
%aimport test
But for the purpose of this course I had to create new files for documentation.
The IPython Notebook and (I)Python also allow you to define functions, etc., interactively; however, some things like modules do not work. For most cases I do not recommend this work mode. Here is an example, nevertheless:
def f(x):
    x = x**3
    return x
f(5)
You can also define anonymous functions
lambda x,y: x*y
f = _
f(2,3)
The second input also shows you how to capture previous output. You can also use the %hist magic IPython function to list previous input. Or you can use Out[#], where # is the number of the output you refer to.
These anonymous functions are particularly useful if a function needs to be passed as argument:
def f(g):
    def h(x):
        return g(g(x))
    return h
import test8
h = test8.f(lambda x: x*2)
h(3)
h('abc')
Function arguments can have default values and can be passed by keyword or just by position. In either case, all positional arguments need to precede keyword arguments. Keyword arguments can be in arbitrary order, but there must not be a conflict with positional arguments when calling the function.
def f(x, y = 2):
    return x**y
import test9
test9.f(2, 3)
test9.f(y=2, x=3)
test9.f(2, y=3)
You can collect extra/remaining positional and keyword arguments in a tuple or dictionary using "*args" and/or "**kwargs" after the last positional or keyword argument, respectively. The same syntax can be used to pass such parameters from lists/tuples or dictionaries to functions. First Python will match all positional arguments, then match the keywords.
def f(x, *args, y = 2, **kwargs):
    return x, args, y, kwargs
import test10
test10.f(1)
test10.f(1,2,3)
test10.f(1, z = 4)
args = dict(a = 3, b = 4, y = 5)
kwargs = dict(a = 3, b = 4, y = 5)
test10.f(1, 2, *args, z = 3, **kwargs)
Note: Keyword parameter defaults are evaluated at the time of function definition only.
So far we can only do rather boring things. We want to control program flow.
The full if statement includes a required leading if, followed by arbitrarily many elif clauses and an optional final else clause catching the remaining cases.
def f(x):
    if x < 0:
        return -2 * x
    elif x == 0:
        return 0
    elif (x < 1) or (x > 5):
        x = x**0.5
        return x
    else:
        return 1
There is no "case" statement as in many other languages (Python 3.10 later added a match statement). But there is an “inline” version of the if statement:
print('a') if 3 > 4 else print('b')
A while loop continues as long as a given condition is fulfilled. An else clause is executed when the condition tests False, even if this is just the first time. break allows you to terminate the loop without executing the else clause, and a continue statement skips the rest of the loop body and immediately continues with the next iteration test.
def f(x):
    y = 0
    while x > 5:
        x -= 1
        y += x
        if x > 10:
            break
        if x == 7:
            continue
    else:
        return y
    return -1
A for loop allows you to have a set number of iterations or to iterate over the members of a container, etc. else, continue, and break are used in a similar way. The iterated item is assigned to a variable.
def f(x, n = 7):
    y = 0
    for i in range(1, n):
        y += x**i
    return y
and it can come from a list, dictionary, ...
def f(x):
    y = 0
    for i in [2, 4, 7]:
        y += x**i
    return y
or using enumerate to return 2-tuples of the index and the item
def f(x):
    d = dict()
    for i,z in enumerate(x):
        d[z] = i
    return d
import test15
test15.f('abc')
Another key example is the zip function
def f(x, y):
    d = dict()
    # zip pairs up items from both inputs and stops at the shorter one
    for k, v in zip(x, y):
        d[k] = v
    return d
import test16
test16.f('abc', '1234')
A much larger selection can be found in the itertools module.
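To give a flavour of itertools, here is a minimal sketch using three of its functions (product, chain, and islice with count); the inputs are made-up examples.

```python
import itertools

# All pairs from two sequences (like a nested for loop).
pairs = list(itertools.product('ab', [1, 2]))

# Chain several iterables into one.
chained = list(itertools.chain([1, 2], (3,), 'x'))

# Take the first items of an endless counter.
first_evens = list(itertools.islice(itertools.count(0, 2), 4))

print(pairs, chained, first_evens)
```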
At this point I would like to note that while you can’t change what the caller's variable (the actual parameter of the function call) points to, the object itself that it points to, if mutable, can be changed:
def f(x):
    x = x + [1]
    return x
import test17
x = [1,2,3]
test17.f(x)
x
but
def f(x):
    x += [1]
import test18
x = [1, 2, 3]
test18.f(x)
x
Unless absolutely intended, this can be hard to debug and should be avoided if you can. A very popular programming style (with languages designed around it) is called functional programming: functions only return values but do not modify their arguments. It is already somewhat hard to violate this in Python because you can’t pass parameters by reference (as in C or FORTRAN), but in some cases you can sort of mimic this if you really want. Be careful about this when modifying mutable containers that were passed as arguments to functions!
As said, except in cases where you really know what you are doing, it is the most efficient way to do your task, and it is well documented: avoid it. Especially for the API and user interfaces of your code. It is much better to use function return values, or to only modify objects if the API explicitly calls for and indicates this.
This can be particularly troublesome for keyword parameters with defaults, as mentioned above:
def f(x = []):
    x += [1]
    return x
import test19
test19.f()
test19.f()
test19.f()
Ooops ...
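One common way around the shared default shown above is to use None as the default and create the list inside the function. This is a sketch of that idiom, not the only possible fix:

```python
def f(x=None):
    # Create a fresh list on every call instead of sharing one default.
    if x is None:
        x = []
    x = x + [1]     # builds a new list; the caller's object is untouched
    return x

print(f(), f(), f([5]))
```

Because the default is now the immutable None, repeated calls no longer accumulate entries, and a list passed in by the caller is not modified either.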
A key design of Python 3 is lazy data evaluation. Items will be produced as needed, and the same mechanism can be used to iterate over items. The range object is such an example. You can define your own iterators or iterator functions using the yield statement.
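A minimal sketch of both ideas: a range that is never expanded into a list, and a generator function using yield. The function countdown is a hypothetical example; the sketch assumes a 64-bit system for the large range.

```python
def countdown(n):
    """Hypothetical generator: yields n, n-1, ..., 1 one at a time."""
    while n > 0:
        yield n       # execution pauses here until the next item is requested
        n -= 1

lazy = range(10**12)          # no list of 10**12 numbers is built
gen = countdown(3)
items = list(gen)
print(lazy[10**11], items)
```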
Very useful are comprehensions, sort of an inline version of the for loop.
[i*2 for i in range(10) if i % 2 == 0]
and can be nested ...
[i*j for i in range(5) for j in range(5) if i % 2 == 0 if j % 3 == 0]
and for a rainy day, there are generator objects from comprehensions as well ...
(2*i for i in range(5))
x = (2*i for i in range(5))
list(x)
list(x)
Iterators can be exhausted.
In practice, the next function returns the next item from an iterator and raises a StopIteration exception when there are no more items.
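The behaviour of next can be tried out directly. This small sketch uses a try/except block, which is explained in the section on exceptions further below:

```python
it = iter([10, 20])

first = next(it)
second = next(it)
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# A second argument to next() is returned instead of raising.
fallback = next(it, None)
print(first, second, exhausted, fallback)
```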
type((1,))
You can define your own class by deriving from an existing class - which can be one of your own classes - hence inheriting its methods and attributes, but adding more features and specialization. If you start a new class hierarchy from scratch, you usually would start by inheriting from object. But you can also inherit from multiple classes, merging their features.
```python
class MyClass(object):
    """ My test class"""
    pass
```
You can now create an object of this type, i.e., an instance of this new class, by "calling" it.
import test20
o = test20.MyClass()
o
This is not yet very exciting. The most important first step is to initialize the object. This is done by the __init__ method, to which the arguments of the object creation are passed. Additionally, a first argument self is passed, which is a reference to the current instance of this class and can be used to access the class’s attributes and methods. Note that you can read the class's attributes through self, but assigning through self by default does not overwrite them (it sets an attribute on the instance instead). The __init__ method must not return a value.
class MyClass(object):
    """ My test class"""
    def __init__(self, x):
        """initialize my object"""
        self.x = x
        self.y = x + 1
import test21
o = test21.MyClass(2)
o.x
o.y
You could also set things by hand on your object ...
o.z = 4
o.__dict__
... but this would be much more painful if you have to do it by hand everywhere this object is used, especially for more involved setup cases.
You can also define constants and do computations in the class body, and of course define any number of your own functions. You can even define function names by assignment, e.g., "__iadd__ = __add__". You can define how your object reacts to operators, how it is printed, what its length is, and how it reacts to indexing ([]) or being called (() - __call__).
As an example, let’s have a class that stores temperature:
class Temperature(object):
    unit = 'K'
    def __init__(self, T = 0):
        self.set(T)
    def set(self, T):
        if isinstance(T, Temperature):
            T = T._TK
        assert T >= 0
        self._TK = T # temperature in Kelvin
    def get(self):
        return self._TK
    def e(self):
        return 7.5657e-15 * self._TK**4
    def absolute(self):
        return self._TK
    def __str__(self):
        return '{} {}'.format(self.get(), self.unit)
    __repr__ = __str__

class Celsius(Temperature):
    unit = 'C'
    offset = 273.15
    def set(self, T):
        super().set(T + self.offset)
    def get(self):
        return super().get() - self.offset
import test22
t = test22.Temperature(3)
t
t.e()
t = test22.Celsius(3)
t
t.absolute()
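The special methods mentioned before the temperature example (operators, length, indexing, printing, and defining a name by assignment) can be sketched with a small class. Vec2 and its attributes are hypothetical names made up for this illustration.

```python
class Vec2(object):
    """Hypothetical 2-vector illustrating a few special methods."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)
    __iadd__ = __add__          # function name defined by assignment
    def __len__(self):
        return 2
    def __getitem__(self, i):
        return (self.x, self.y)[i]
    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

v = Vec2(1, 2) + Vec2(3, 4)
v += Vec2(1, 1)
print(v, len(v), v[0], v[-1])
```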
Exercise:
An example of a class of which instances behave like a function:
class F(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        return x**self.n
import test23
f = test23.F(3)
f(5)
You can also have "computed" properties of your object by defining the __getattr__ method - but this is very advanced Python.
class X(object):
    def __init__(self, x, v):
        assert isinstance(x, str)
        self._x = x
        self._v = v
    def __getattr__(self, x):
        if x == self._x:
            return self._v
        raise AttributeError()
import test24
x = test24.X('s', 3)
x.s
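There are four things that you need to know to run Python code as a script.
Make the script file executable:
chmod u+x test.py
Put a "shebang" line at the top of the file that names the interpreter:
#! /usr/bin/env python3
Alternatively, pass the file to the interpreter explicitly:
python3 test.py
When the file is run as a script, __name__ contains the string "__main__", so you start your script code block using the line
if __name__ == "__main__":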
Other useful things are to access the parameters passed to the script. These can be found in sys.argv - you need to import sys, of course, to use this. Note that sys.argv[0] is the program name. Note that the parameters are strings.
#! /usr/bin/env python3
"""my test script"""
import sys
a = 4
def f(x):
    print(x * a)
if __name__ == "__main__":
    f(sys.argv[1])
and on the shell
~> ./test25.py 34
34343434
A very useful package for dealing with input parameters for scripts is argparse. Having this available is what made me switch from Python 2 to Python 3.
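As a minimal sketch of how argparse could replace the manual sys.argv handling above: the option name --factor and the helper build_parser are made up for this illustration. In contrast to the raw sys.argv strings, type=int converts the argument to a number.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='my test script')
    # type=int converts the string from the command line for us
    parser.add_argument('value', type=int, help='number to multiply')
    parser.add_argument('-f', '--factor', type=int, default=4,
                        help='multiplier (default: 4)')
    return parser

# Passing a list stands in for sys.argv[1:] so the sketch runs anywhere.
args = build_parser().parse_args(['34', '--factor', '2'])
print(args.value * args.factor)
```

In a real script you would call parse_args() without arguments, which reads sys.argv; argparse then also provides a --help option automatically.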
Exceptions are a regular means of "out of band" communication in python. Some things are easiest done this way and some not really in any other way. Use them.
The code block to be monitored is started with a try statement; exceptions are “caught” by one or several except clauses - general or specialized for specific exceptions; the else clause is executed if nothing fails, and the finally clause is executed in any case - failure or not.
def f(x):
    y = 0
    try:
        for x in range(x, 3):
            y = 1 / x
    except ZeroDivisionError as e:
        print('Error:', e)
    except Exception as e:
        print('Unexpected error:', e)
    else:
        print('all went fine')
    finally:
        print(y)
import test26
test26.f(2)
test26.f(-2)
test26.f(-2.5)
You can also use just "except:" to catch all exceptions if you don’t care about the exception object itself, or leave off the "as ..." part if you don’t need the exception info.
def f(x):
    y = 0
    try:
        for x in range(x, 3):
            y = 1 / x
    except ZeroDivisionError:
        print('Error!')
    except:
        print('Unexpected error!')
    else:
        print('all went fine')
    finally:
        print(y)
Strings have a format method. It allows you to replace "placeholders" - in curly braces - by arguments. They can be matched by index (zero-based) or by keyword. A detailed description is found at https://docs.python.org/3.4/library/string.html. The layout of these placeholders is to first give what is to be formatted - either by number or by name - followed by a colon, then the format string. If no number or name is supplied, numbering is automatic.
Examples:
'The Winner is {:} on {:}'.format('Mr. X', 'best movie')
'The Winner is {act:>10s} for A${prize:05d}'.format(prize=1000, act='Jim')
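A further sketch of the placeholder layout (name or index, colon, format string), with made-up values: alignment and width for strings, fixed precision for floats, zero padding for integers, and reuse of positional fields by index.

```python
row = '{name:<8s}|{value:>10.3f}|{count:05d}'.format(
    name='alpha', value=3.14159, count=42)

# Positional fields may be reused and reordered by index.
swapped = '{1} before {0}, then {1} again'.format('a', 'b')

print(row)
print(swapped)
```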
Another very useful string method is join to combine a list of strings.
'/'.join(('home','alex','xxx'))
It is very useful to consistently use the routines in os and os.path to manage paths. More recently the module pathlib was added, but I do not have experience with it and find it sort of clunky. You may also need some routines from the sys module.
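A minimal sketch of a few os.path routines; the path components are made up and no file is actually touched:

```python
import os.path

path = os.path.join('home', 'alex', 'data', 'run01.txt')

directory, filename = os.path.split(path)
stem, extension = os.path.splitext(filename)

# Build a sibling file name without manual string surgery.
backup = os.path.join(directory, stem + '.bak')
print(path, directory, filename, stem, extension, backup)
```

Using join and split instead of adding '/' by hand keeps the code independent of the path separator of the operating system.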
The open routine opens a file and returns a file object. To close the file, call its close method. Always close your files when done. Resources are finite. https://docs.python.org/3/library/functions.html#open
The routine takes a file name and an open “mode”. Useful modes are t for "text", b for binary, U for "universal new line" mode (deprecated; removed in Python 3.11), r for read, w for write, a for append, x for exclusive creation; and an added + opens it for "updating" (may truncate).
Use the write routine to write to a file, read to read (the entire) file, or read a single line using readline.
def f():
    f = open('xxx.txt', 'wt')
    f.write('123\n345\n5677')
    f.close()
Text files can be iterated over - each iteration yields one line:
def f():
    f = open('xxx.txt', 'rt')
    for i, l in enumerate(f):
        print('{:05d} {}'.format(i, l.strip()))
    f.close()
Here the strip method of the string gets rid of leading/trailing white space and the trailing newline (\n).
We may deal with binary files later.
The with statement allows you to manage resources, for example, to close them automatically and to deal with exceptions (the resources are closed even then).
def f():
    with open('xxx.txt', 'rt') as f:
        for i, l in enumerate(f):
            print('{:05d} {}'.format(i, l.strip()))
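What the with statement does behind the scenes can be sketched with a class that defines the __enter__ and __exit__ special methods. Resource is a hypothetical stand-in for something like a file; it only records what happens to it.

```python
class Resource(object):
    """Hypothetical resource that records what happened to it."""
    def __init__(self):
        self.log = []
    def __enter__(self):
        self.log.append('open')
        return self                 # bound to the name after "as"
    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append('close')    # runs even if the block raised
        return False                # do not swallow the exception

r = Resource()
try:
    with r as res:
        res.log.append('work')
        raise ValueError('something failed')
except ValueError:
    r.log.append('handled')
print(r.log)
```

The 'close' entry appears before 'handled': the resource is released as soon as the block is left, and the exception then propagates as usual.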
Decorators start with the @ symbol and are essentially functions that return modified, “decorated” functions. You may see this sometimes.
def mul5(f):
    def g(*args, **kwargs):
        print('function was decorated.')
        return f(*args, **kwargs) * 5
    return g
@mul5
def h(x):
    return x**2
import test31
test31.h(3)
The key point is that the decorator takes a function and returns another function.
The decorator itself can also be a function - can have parameters - that returns the actual wrapping function.
def multiply(x):
    def wrap(f):
        def g(*args, **kwargs):
            print('function was decorated:', x)
            return f(*args, **kwargs) * x
        return g
    return wrap
@multiply(5)
def h(x):
    return x**2
import test32
test32.h(3)
Or you can design it as a class whose __init__ method takes the parameters and whose __call__ method wraps the function.
class Multiply(object):
    def __init__(self, factor):
        self._factor = factor
    def __call__(self, f):
        def g(*args, **kwargs):
            print('function was decorated:', self._factor)
            return f(*args, **kwargs) * self._factor
        return g
@Multiply(5)
def h(x):
    return x**2
Properties allow you to provide an interface to internal data of your object. More generally, you can control object access using __get__ and __set__.
class Temperature(object):
    def __init__(self, T = 0):
        self.T = T
    @property
    def T(self):
        """T in K"""
        return self._TK
    @T.setter
    def T(self, T):
        self._TK = T
    def e(self):
        """energy density in cgs"""
        return 7.5657e-15 * self._TK**4
class Celsius(Temperature):
    offset = 273.15
    @property
    def T(self):
        """T in C"""
        return self._TK - self.offset
    @T.setter
    def T(self, T):
        self._TK = T + self.offset
import test34
t = test34.Celsius()
t.T
t = test34.Celsius()
t.T
t.T = 100
t._TK
t.T
Using the property function you can also define properties that can only be set but not read.
class Temperature(object):
    def __init__(self, T = 0):
        self.T = T
    @property
    def T(self):
        """T in K"""
        return self._TK
    @T.setter
    def T(self, T):
        self._TK = T
    def e(self):
        """energy density in cgs"""
        return 7.5657e-15 * self._TK**4
class Celsius(Temperature):
    offset = 273.15
    @property
    def T(self):
        """T in C"""
        return self._TK - self.offset
    @T.setter
    def T(self, T):
        self._TK = T + self.offset
    def _set_TF(self, T):
        # convert Fahrenheit to Celsius
        self.T = (T - 32) * 5 / 9
    TF = property(fset = _set_TF)
import test35
t = test35.Celsius()
t.TF = -20
t.T
t.TF
Knowing how to use regular expressions will make your life much easier. Yes, it requires some work to get started. Python offers some powerful tools to use them for your text processing, e.g., extracting data from text files or web pages. The manual is at https://docs.python.org/3.4/library/re.html. The main use is to match items that have certain patterns.
Key tokens are ( and ) to “capture” strings you want to extract, . to match anything, ? to match the previous item zero or one times, * to match the previous item any number of times, + to match the previous item at least once, ^ for beginning of string/line, $ for end of string/line, [ and ] to define a set of characters, etc. You also have special tokens like \d matching a digit, etc.
import re
re.findall('([123]+)', '1234ghgs7sjj4399')
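As a slightly larger sketch of extracting data from text with capture groups, the following parses lines of the form "name = number unit". The input text and the pattern are made up for this illustration; the pattern is deliberately simple and will not cover every number format.

```python
import re

text = 'T = 273.15 K\nP = 1.0e5 Pa\nname = test'

# ^ and $ match at line boundaries because of re.MULTILINE;
# the groups capture a name, a number, and a unit.
pattern = re.compile(r'^(\w+) = ([-+]?\d+\.?\d*(?:[eE][-+]?\d+)?) (\w+)$',
                     re.MULTILINE)

records = [(name, float(value), unit)
           for name, value, unit in pattern.findall(text)]
print(records)
```

The last line of the input does not match because it contains no number, so it is silently skipped. Writing the pattern as a raw string avoids having to double the backslashes.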
An example is a script I have used to renumber all the input/output prompts from IPython to be consecutive in the script.
#! /usr/bin/env python3
import sys, re, os
def format(infile):
    outfile = infile + '.tmp'
    # raw strings keep the backslashes for the regular expression engine
    In = re.compile(r'^(In \[)[0-9]+(\]:.*)')
    Out = re.compile(r'^(Out\[)[0-9]+(\]:.*)')
    Search = (In, Out)
    count = 0
    with open(infile, 'rt') as f, open(outfile, 'xt') as g:
        for line in f:
            for prompt in Search:
                m = prompt.findall(line)
                if len(m) == 0:
                    continue
                if prompt is In:
                    count += 1
                line = prompt.sub(r'\g<1>{:d}\g<2>'.format(count),
                                  line)
            g.write(line)
    os.remove(infile)
    os.rename(outfile, infile)
if __name__ == "__main__":
    format(sys.argv[1])
A very widely used convention is to import numpy as np:
import numpy as np
If you start IPython with the --pylab flag, it will do this automatically; in your scripts, however, you still have to do it by hand.
In the IPython notebook we can do the same with the %pylab magic command:
%pylab
NumPy provides a multi-dimensional array class, ndarray, optimized for numerical data processing. It allows various data types - it defines its own data types, called "dtypes", and even has record arrays. I recommend the very good online documentation at http://www.numpy.org/.
There is too much to tell about NumPy to fit into the time allocated for this course. One of the key features is that you can index arrays not just with slice notation, but also with lists of indices, with multi-dimensional constructs, or with arrays of truth values. The latter is extremely useful to avoid loops with if statements.
Key advice: Use “advanced slicing” to replace loops with if statements! Such loops are very slow in Python. Should it really not be possible to avoid them, you can easily write FORTRAN extension modules using f2py.
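As a small sketch of this advice, the following compares an explicit loop with an if statement to the equivalent indexing with an array of truth values. The variable names slow, fast, and rows are chosen for illustration only.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

# slow: explicit Python loops with an if statement
slow = x.copy()
for i in range(slow.shape[0]):
    for j in range(slow.shape[1]):
        if slow[i, j] % 2 == 1:
            slow[i, j] = 0

# fast: index with an array of truth values instead
fast = x.copy()
fast[fast % 2 == 1] = 0

# indexing with a list of indices: pick rows 0 and 2
rows = x[[0, 2]]
```

Both versions give the same result; for large arrays the second form is usually much faster because the loop runs inside NumPy rather than in the Python interpreter.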
Examples:
x = np.arange(12).reshape(3, -1)
x.shape
x[0,2]
x[2]
x
ii = x % 2 == 1
ii
x[ii] = 0
x
y = np.array([1, 2, 3, 4])
y
x *= y
x
y = np.array([1,2,3])
x *= y[:, np.newaxis]
x
The example above shows that NumPy by default tries to match the last dimension and automatically expands the others (this is called “broadcasting”). To change that, you can add “extra axes” using np.newaxis.
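The matching of dimensions can be seen more directly by looking at the shapes involved. This is only a sketch; the array names row and col are made up for illustration.

```python
import numpy as np

x = np.ones((3, 4))

row = np.array([1, 2, 3, 4])   # shape (4,): matches the last axis of x
col = np.array([10, 20, 30])   # shape (3,): matches the first axis of x

# row is matched against the last axis and applied to every row of x
a = x * row

# np.newaxis turns col into shape (3, 1), which is then
# expanded along the last axis
b = x * col[:, np.newaxis]

# without the extra axis, shapes (3, 4) and (3,) do not match
try:
    x * col
    failed = False
except ValueError:
    failed = True
```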
Telling you all about NumPy would likely be an entire course by itself.
Matplotlib is a frequently used Python package for plotting. To use it interactively on the IPython shell, use the --pylab flag when starting IPython, or use the %pylab magic command in the IPython notebook. Much of the interactive plotting interface is quite similar to MATLAB, so it may not seem all that strange. Matplotlib is fully object-oriented, and so are the graphics it produces: Objects can be modified, even interacted with, e.g., react to users clicking on them (advanced programming, I have never used).
Whereas you can do things interactively on the shell for development - I do that, in part - I highly recommend that you put the finalized plotting code inside a script, possibly in a function, but preferably in a class. Even if you just use the __init__ method for all the plotting at first, you can later delegate some of the tasks, e.g., setups you frequently use, into separate routines, maybe in a base class ("MyPlot") from which you derive specialized plots. You can store properties of the plot, the figure object, axes objects, etc. in the instance and later access them.
The key is that with plots in scripts you can easily modify things and re-run the script, for example when your data or model has changed - especially for last-minute changes when your thesis is due (advisor says: "Change the color of that line!").
There is a vast variety of different plots and recipes to make them. A prominent gallery with code examples can be found at http://matplotlib.org/gallery.html. Personally, I don’t think this is always done very well, but it is a good start. I highly recommend you have a look at the Beginner’s and Advanced guides (http://matplotlib.org/devdocs/contents.html).
As an example, let’s make a simple line plot:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os.path
projectpath = os.path.expanduser('~/xxx')
class MyPlot(object):
    def __init__(self, func = lambda x: np.sin(x)**2):
        f = plt.figure()
        ax = f.add_subplot(1, 1, 1)
        x = np.linspace(1, 10, 100)
        y = func(x)
        ax.plot(x, y,
                color = 'r',
                lw = 3,
                label = 'Model A')
        ndata = 10
        x = np.random.choice(x, ndata)
        y = func(x) + np.random.rand(ndata) * 0.1
        ax.plot(x, y,
                color = 'b',
                marker = '+',
                linestyle = 'None',
                markersize = 12,
                markeredgewidth = 2,
                label = 'Data')
        ax.set_xscale('log')
        ax.set_xlabel(r'$x\,(\mathrm{cm})$')
        ax.set_ylabel(r'$\sin^2\left(x\right)$')
        ax.legend(loc='best')
        f.tight_layout()
        plt.show()
        self.f = f
    def save(self, filename):
        self.f.savefig(os.path.join(projectpath, filename))
import test37
p = test37.MyPlot()
p.save('xxx.pdf')
Matplotlib automatically chooses the output file format based on the file extension.
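The idea mentioned earlier, a base class that holds the common setup and from which specialized plots are derived, could look roughly like the following sketch. The names BasePlot, SinePlot, and draw are hypothetical, and the non-interactive 'Agg' backend is an assumption made here so that the sketch runs without opening a window.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np


class BasePlot(object):
    """Hypothetical base class with the setup shared by all plots."""

    def __init__(self):
        # store figure and axes so they can be accessed later
        self.f = plt.figure()
        self.ax = self.f.add_subplot(1, 1, 1)
        self.draw()
        self.f.tight_layout()

    def draw(self):
        # derived classes do the actual plotting here
        raise NotImplementedError()

    def save(self, filename):
        self.f.savefig(filename)


class SinePlot(BasePlot):
    """Specialized plot; only the drawing itself is defined here."""

    def draw(self):
        x = np.linspace(1, 10, 100)
        self.ax.plot(x, np.sin(x)**2, color='r', lw=3, label='Model A')
        self.ax.set_xlabel('x')
        self.ax.legend(loc='best')


p = SinePlot()
```

Calling p.save('xxx.pdf') would then write the figure to a file. Frequently used settings, such as fonts or figure sizes, can go into the base class so that every derived plot picks them up.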
Instead of having plots open in separate windows, you can also have them embedded into the notebook, as you may be used to from other notebook tools. Do this with the %matplotlib magic command:
%matplotlib inline
plot([0,1])
You can disable this again using %matplotlib without arguments.
%matplotlib
plot([0,1])
The End.
Please feel free to contact me if you have further questions or need some help with tough python problems.