# Functions, Libraries, and the File System

## Where We Just Were

• Python has:
• The usual primitive data types (numbers, strings, Booleans)
• The usual ways to combine statements (loops and conditionals)
• Lists for storing collections of data
• This lecture describes:
• Functions, for reusing code, and making it easier to understand
• Modules, for building libraries
• Four commonly-used libraries

## Defining Functions

• Define a new function using `def`
• Argument names follow in parentheses
• No types
```
# Define function.
def ave(x, y):
    return (x + y) / 2.0

# Use function.
print ave(20, 30)
```
```
25.0
```
• Exit at any time with `return`
```
def sign(x):
    if x < 0:
        return -1
    if x == 0:
        return 0
    return 1

for i in [-17.0, 33.3, 0.0]:
    print i, sign(i)
```
```
-17.0 -1
33.3 1
0.0 0
```
• Functions with lots of `return` statements tend to be hard to understand
• Every function returns something
• `return` on its own is the same as `return None`
• Functions without `return` statements also return `None`
• `list.sort` and `list.reverse` are examples
```
def double_elt(x):
    for i in range(len(x)):
        x[i] = x[i] * 2

values = [3, 'xyz', [9]]
print "values before call:", values
result = double_elt(values)
print "result of call:", result
print "values after call:", values
```
```
values before call: [3, 'xyz', [9]]
result of call: None
values after call: [6, 'xyzxyz', [9, 9]]
```
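• Forgetting that `list.sort` returns `None` is a common slip, so a minimal sketch may help. The list `scores` and its values are made up for illustration; `sorted` is the built-in that returns a new list instead of changing its argument. (`print` is written with parentheses so the sketch also runs under Python 3.)

```python
# Hypothetical data, used only for illustration.
scores = [30, 10, 20]

# list.sort changes the list in place and returns None.
result = scores.sort()
print(result)    # None
print(scores)    # [10, 20, 30]

# The built-in sorted() leaves its argument alone and returns a new list.
original = [30, 10, 20]
ordered = sorted(original)
print(original)  # [30, 10, 20]
print(ordered)   # [10, 20, 30]
```

• Writing `scores = scores.sort()` would therefore replace the list with `None`.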

## Scope

• Variables created in a function are local to it
• Fresh copies created for each call
• Values discarded when the function returns
• As always, variables must be defined before they can be used
• Python manages variables using a call stack
```
# Global variable called 'x'.
x = 123

# Function that defines a local variable called 'x'.
def f(arg):
    x = arg
    print "in call, x is", x

# Call the function to prove that it uses its local 'x'.
print "before call, global x is", x
f(999)
print "after call, global x is", x
```
```
before call, global x is 123
in call, x is 999
after call, global x is 123
```
•  Figure 8.1: Call Stack (a)
•  Figure 8.2: Call Stack (b)
•  Figure 8.3: Call Stack (c)
• Each time a function is called, Python creates new variables, and puts them on top of the stack
• When a variable is referenced, Python looks at the top stack frame, then at global variables
• Unlike some languages, does not search every frame in the stack
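• The last point can be checked with a small sketch. The function names `outer` and `inner` and the variable `hidden` are hypothetical; the idea is that `inner` cannot see a variable that lives only in its caller's stack frame.

```python
def outer():
    hidden = "local to outer"   # lives only in outer's stack frame
    return inner()

def inner():
    # 'hidden' is neither in inner's frame nor a global variable,
    # so the lookup fails even though outer is still on the stack.
    try:
        return hidden
    except NameError:
        return "inner cannot see outer's variable"

print(outer())   # inner cannot see outer's variable
```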

## Parameter Passing Rules

• Python copies references, not the objects they refer to, when passing arguments to functions
• Since Booleans, numbers, and strings are immutable, a function can't change its caller's values of those types
• But if you pass a list to a function, the function operates on the original list, not a copy
```
def mutate(x, y):
    x = 0
    y[0] = "modified"

single = 1
triple = [1, 2, 3]

print "before call, single is", single, "and triple is", triple
mutate(single, triple)
print "after call, single is", single, "and triple is", triple
```
```
before call, single is 1 and triple is [1, 2, 3]
after call, single is 1 and triple is ['modified', 2, 3]
```
•  Figure 8.4: Argument Passing (a)
•  Figure 8.5: Argument Passing (b)
•  Figure 8.6: Argument Passing (c)
•  Figure 8.7: Argument Passing (d)
• If you want to pass a copy of a list into `mutate`, use `mutate(single, triple[:])`
• `triple[:]` is the same as `triple[0:len(triple)]`
• …which is a slice of `triple` that includes the entire list…
• …and slicing creates a new list
•  Figure 8.8: Passing a Slice
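• A short sketch of the difference between passing a slice and passing the list itself. It reuses `mutate` from above; the names `copy` and `after_copy_call` are introduced only for this illustration.

```python
def mutate(x, y):
    x = 0                # rebinds the local name only
    y[0] = "modified"    # changes the list that y refers to

single = 1
triple = [1, 2, 3]

# A full slice has the same contents but is a different object.
copy = triple[:]
print(copy == triple)    # True
print(copy is triple)    # False

# Pass a slice: the function changes the temporary copy, not the original.
mutate(single, triple[:])
after_copy_call = list(triple)
print(after_copy_call)   # [1, 2, 3]

# Pass the list itself: the function changes the caller's list.
mutate(single, triple)
print(triple)            # ['modified', 2, 3]
```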

## Default Parameter Values

• You can specify default values for parameters when defining a function
• Just “assign” some constant to the parameter in the definition
• The parameters actually passed when the function is called are matched up left to right
• So put the parameters the user is least likely to want to change at the end of the parameter list
```
def total(values, start=0, end=None):

    # If no values given, total is zero.
    if not values:
        return 0

    # If no end specified, use the entire sequence.
    if end is None:
        end = len(values)

    # Calculate.
    result = 0
    for i in range(start, end):
        result += values[i]
    return result
```
```
numbers being added   [10, 20, 30, 40, 50, 60]
total(numbers, 0, 3)  60
total(numbers, 4)     110
total(numbers)        210
```
• All arguments with defaults must come after all arguments without them
• Otherwise, matching values to parameters would be ambiguous
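```
# Illegal: 'end' has no default, but follows 'start', which does.
# Python reports a SyntaxError for this definition.
def total(values, start=0, end):

    # If no values given, total is zero.
    if not values:
        return 0

    # Calculate.
    result = 0
    for i in range(start, end):
        result += values[i]
    return result
```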
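• One way to see the ordering rule for defaults without crashing a program is to hand each definition to the built-in `compile` as text. This is only a sketch: the helper `is_legal` and the strings `good` and `bad` are hypothetical names introduced here.

```python
# Source text for two definitions; only the parameter order differs.
good = "def total(values, start=0, end=None):\n    pass\n"
bad = "def total(values, start=0, end):\n    pass\n"

def is_legal(source):
    """Return True if Python can compile the source text."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(is_legal(good))   # True
print(is_legal(bad))    # False
```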

## Extra Arguments

• If a function has been defined the right way, you can call it with extra arguments
• Last parameter in definition must have `*` in front of its name
• Doesn't matter what it's called, but most people use `*extra`
• Any “leftover” arguments are put in a tuple and passed via this parameter
• Remember, a tuple is an immutable (unmodifiable) list
• Can only be one `*` argument per function
• If there were two or more, how would Python know which extra values to put in which?
```
def show(first, *extra):
    print first, extra

show(10)
show(10, 20, 30)
show(10, 'text', ['a', 'list'])
```
```
10 ()
10 (20, 30)
10 ('text', ['a', 'list'])
```
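• A slightly more practical sketch of the same mechanism: a function that requires at least one value but accepts any number more. The name `largest` is made up for this example.

```python
def largest(first, *extra):
    """Return the largest of one or more values."""
    result = first
    for value in extra:      # extra is a tuple, possibly empty
        if value > result:
            result = value
    return result

print(largest(10))           # 10
print(largest(10, 30, 20))   # 30
```

• Making `first` an ordinary parameter means that calling `largest()` with no arguments at all is an error, which is usually what is wanted.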

## Functions Are Objects

• Once it has been translated into instructions, a function is just another object
• Happens to be an object you can call, just as strings and lists happen to be objects you can index
• `def` is just a shorthand for “create a function, and assign it to a variable”
```
Pi = 3.14

def circum(r):
    return 2 * Pi * r

for x in [1.0, 2.0, 3.0]:
    print x, circum(x)
```
•  Figure 8.9: Functions As Objects
• This means you can:
• Redefine functions (just as you can reassign values to variables)
• Create aliases for functions
• Pass functions as parameters
• Store functions in lists
• Example: apply a function to each value in a list
```
# Apply a function to each value in a list.
def applyToList(func, values):
    result = []
    for x in values:
        y = func(x)
        result.append(y)
    return result

# A sample function (calculates 1/3 of value).
def oneThird(a):
    return a/3.0

# Test.
values = [0.0, 0.5, 0.75, 0.875, 0.9375]
output = applyToList(oneThird, values)
for i in range(len(values)):
    print values[i], "=>", output[i]
```
```
0.0 => 0.0
0.5 => 0.166666666667
0.75 => 0.25
0.875 => 0.291666666667
0.9375 => 0.3125
```
• Example: apply several functions to a single value
```
# Apply each function in a list to a value.
def applyEach(functions, value):
    result = []
    for f in functions:
        y = f(value)
        result.append(y)
    return result

# One half.
def oneHalf(a):
    return a/2.0

# One third.
def oneThird(a):
    return a/3.0

# One quarter.
def oneQuarter(a):
    return a/4.0

# Test.
functions = [oneHalf, oneThird, oneQuarter]
output = applyEach(functions, 0.25)
for i in range(len(functions)):
    print functions[i].__name__, output[i]
```
```
oneHalf 0.125
oneThird 0.0833333333333
oneQuarter 0.0625
```
• Note: every function has an attribute called `__name__`, which is the name it was originally defined under
• Handy when debugging
```
# An example function.
def original(left, right):
    return (left + right) / 2.0

# Create an alias for it.
alias = original

# Call them directly.
print "calling original:", original.__name__, original(3, 4)
print "calling alias:", alias.__name__, alias(3, 4)

# Get a little fancy.
print "Calling in loop..."
for f in [original, alias]:
    print f.__name__, f(5, 8)
```
```
calling original: original 3.5
calling alias: original 3.5
Calling in loop...
original 6.5
original 6.5
```
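• The list above also mentions redefining functions. A minimal sketch of what that means, using the hypothetical names `greet` and `old_greet`: a second `def` simply assigns a new function object to the same variable, while any alias keeps referring to the old one.

```python
def greet():
    return "hello"

# Create an alias while 'greet' still refers to the first function.
old_greet = greet

# Redefining 'greet' assigns a new function object to the same name.
def greet():
    return "goodbye"

print(greet())              # goodbye
print(old_greet())          # hello
print(old_greet.__name__)   # greet
```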

## Creating Modules

• Every Python file is automatically also a module (or library)
• If the file is called `xyz.py`, load it using `import xyz`
• The statements in the module are executed as it is loaded
• Assignment and `def` are statements
• You can use conditionals, loops, and anything else, too
• Refer to things in a module using `module.thing`
• Put this in `mylib.py`
```
# A constant.
value = 123

# Remember, modules are executed as they're loaded...
if value < 200:
    size = 'small'
else:
    size = 'large'

# A function.
def printVersion():
    print 'Stuff Version 2.2'
```
• Use it by putting this in `program.py`
```
# Load the contents of mylib.py.
import mylib

# Define our own version of value.
value = "something else entirely"

# Show mylib's values.
print 'mylib.value', mylib.value
print 'mylib.size', mylib.size

# Show our own values.
print 'own value:', value

# Call function.
mylib.printVersion()
```
• When `program.py` runs, it prints this
```
mylib.value 123
mylib.size small
own value: something else entirely
Stuff Version 2.2
```
• Notice that both `mylib.py` and `program.py` define `value`
• Each module is its own global scope
• When something in `mylib.py` references `value`, it always gets its own `value`
• You can also use:
• `import mylib as m`, then call `m.printVersion()`
• `from mylib import printVersion`, then call `printVersion()`
• Question: what would happen if you did `from mylib import value`?
• `from mylib import *` imports everything from `mylib`

## The Math Library

• Python is a small language with a large library
• Much of the standard library is just Python interfaces to standard C libraries
• Example: the `math` library
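| Type | Name | Purpose | Example | Result |
| --- | --- | --- | --- | --- |
| Constant | `e` | Constant | `e` | `2.71828…` |
| Constant | `pi` | Constant | `pi` | `3.14159…` |
| Function | `ceil` | Ceiling | `ceil(2.5)` | `3.0` |
| Function | `floor` | Floor | `floor(-2.5)` | `-3.0` |
| Function | `exp` | Exponential | `exp(1.0)` | `2.71828…` |
| Function | `log` | Logarithm | `log(4.0)` | `1.38629…` |
| Function | `log` | Logarithm (base given as second argument) | `log(4.0, 2.0)` | `2.0` |
| Function | `log10` | Base-10 logarithm | `log10(4.0)` | `0.60205…` |
| Function | `pow` | Power | `pow(2.5, 2.0)` | `6.25` |
| Function | `sqrt` | Square root | `sqrt(9.0)` | `3.0` |
| Function | `cos` | Cosine | `cos(pi)` | `-1.0` |
| Function | `asin` | Arc sine | `asin(-1.0)` | `-1.5707…` |
| Function | `hypot` | Euclidean norm, √(x² + y²) | `hypot(2, 3)` | `3.60555…` |
| Function | `degrees` | Convert from radians to degrees | `degrees(pi)` | `180.0` |
| Function | `radians` | Convert from degrees to radians | `radians(45)` | `0.78539…` |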

Table 8.1: The Python Standard Math Library

• All the other trigonometric functions (`tan`, `atan`, etc.) are also there
• `abs` is built in to the language: don't have to import anything to use it
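• A brief sketch that exercises a few entries from the table. The triangle with legs 3 and 4 is a made-up example; everything else is taken from the `math` module as listed above.

```python
import math

# Hypothetical right triangle with legs of length 3 and 4.
print(math.hypot(3.0, 4.0))        # 5.0

# Trigonometric functions work in radians, so convert from degrees first.
half_turn = math.radians(180.0)
print(math.cos(half_turn))         # -1.0
print(math.degrees(math.pi))       # 180.0

# log accepts an optional base as its second argument.
print(math.log(4.0, 2.0))          # 2.0

# abs is a built-in function, so it is not looked up in math.
print(abs(-2.5))                   # 2.5
```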

## Times

• Can't delete old files, or check how long your program has been running, without some notion of time
• Terminology:
• The epoch is the moment from which time is measured. On Unix, the epoch is midnight UTC on January 1, 1970; time is measured in seconds since then.
• Coordinated Universal Time, or UTC, is the official time standard for planet Earth. It is the successor to Greenwich Mean Time (GMT).
• Local time is what the clock on your wall shows. It is defined to be UTC, plus or minus a constant offset determined by your time zone, possibly modified by daylight saving time.
• A leap year is a year that has an extra day (to keep the calendar in sync with Earth's orbit around the sun)
• A leap second is an extra second added to a day to keep UTC in sync with Earth's rotation
• Leap seconds are added one at a time; the value 61 allowed for `tm_sec` below is kept only for historical reasons.
• A time structure is an object with nine fields, representing a time in a more comprehensible format.
• `tm_year`: year (e.g., 2005)
• `tm_mon`: month (integer in the range [1, 12])
• `tm_mday`: day of the month (integer in the range [1, 31])
• `tm_hour`: hour of the day (integer in the range [0, 23])
• `tm_min`: minute of the hour (integer in the range [0, 59])
• `tm_sec`: second of the minute (integer in the range [0, 61])
• `tm_wday`: week day (integer in the range [0, 6], with Monday being 0)
• `tm_yday`: day of the year (integer in the range [1, 366])
• `tm_isdst`: daylight savings time flag
• Contents of `time` module (presented out of order so that they'll make more sense):
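• A minimal sketch of how the terms above show up in practice. It uses only `time.time`, `time.gmtime`, and `time.strftime` from the standard `time` module; the timestamp `1112109573` is an illustrative value from March 2005, and the summing loop is just filler work to time.

```python
import time

# time.time() returns seconds since the epoch as a floating-point number.
started = time.time()
total = sum(range(100000))          # some work to time
elapsed = time.time() - started
print("elapsed seconds: %.6f" % elapsed)

# time.gmtime() converts seconds since the epoch to a time structure in UTC.
epoch = time.gmtime(0)
print(time.strftime("%Y-%m-%d %H:%M:%S", epoch))   # 1970-01-01 00:00:00

# An illustrative timestamp from March 2005.
stamp = time.gmtime(1112109573)
print(time.strftime("%Y-%m-%d %H:%M:%S", stamp))   # 2005-03-29 15:19:33
print(stamp.tm_wday)                                # 1 (Tuesday; Monday is 0)
print(stamp.tm_yday)                                # 88
```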

## Manipulating Pathnames

• The `os` module has a submodule called `os.path`
• Manipulate pathnames (filenames and directory names) correctly and efficiently
• Do not write your own functions for this—the rules are trickier than you think
• Used at least as often as `os` itself
• Contents:
| Type | Name | Purpose | Example | Result |
| --- | --- | --- | --- | --- |
| Function | `abspath` | Create normalized absolute pathnames. | `os.path.abspath('../jeevan/bin/script.py')` | `/home/jeevan/bin/script.py` (if executed in `/home/gvwilson`) |
| Function | `basename` | Return the last portion of a path (i.e., the filename, or the last directory name). | `os.path.basename('/tmp/scratch/junk.data')` | `junk.data` |
| Function | `dirname` | Return all but the last portion of a path. | `os.path.dirname('/tmp/scratch/junk.data')` | `/tmp/scratch` |
| Function | `exists` | Return `True` if a pathname refers to an existing file or directory. | `os.path.exists('./scribble.txt')` | `True` if there is a file called `scribble.txt` in the current working directory, `False` otherwise. |
| Function | `getatime` | Get the last access time of a file or directory (like `os.stat`). | `os.path.getatime('.')` | `1112109573` (which means that the current directory was last read or written at 10:19:33 EST on March 29, 2005). |
| Function | `getmtime` | Get the last modification time of a file or directory (like `os.stat`). | `os.path.getmtime('.')` | `1112109502` (which means that the current directory was last modified 71 seconds before the time shown above). |
| Function | `getsize` | Get the size of something in bytes (like `os.stat`). | `os.path.getsize('py03.swc')` | `29662` |
| Function | `isabs` | `True` if its argument is an absolute pathname. | `os.path.isabs('tmp/data.txt')` | `False` |
| Function | `isfile` | `True` if its argument identifies an existing file. | `os.path.isfile('tmp/data.txt')` | `True` if a file called `./tmp/data.txt` exists, and `False` otherwise. |
| Function | `isdir` | `True` if its argument identifies an existing directory. | `os.path.isdir('tmp')` | `True` if the current directory has a subdirectory called `tmp`. |
| Function | `join` | Join pathname fragments to create a full pathname. | `os.path.join('/tmp', 'scratch', 'data.txt')` | `"/tmp/scratch/data.txt"` |
| Function | `normpath` | Normalize a pathname (i.e., remove redundant slashes, uses of `.` and `..`, etc.). | `os.path.normpath('tmp/scratch/../other/file.txt')` | `"tmp/other/file.txt"` |
| Function | `split` | Return both of the values returned by `os.path.dirname` and `os.path.basename`. | `os.path.split('/tmp/scratch.dat')` | `('/tmp', 'scratch.dat')` |
| Function | `splitext` | Split a path into two pieces `root` and `ext`, such that `ext` is the last piece beginning with a `"."`. | `os.path.splitext('/tmp/scratch.dat')` | `('/tmp/scratch', '.dat')` |

Table 8.5: Python's Pathname Library
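• A short sketch that combines several of these functions to build the name of a backup file next to an existing path. The path and the `.bak` extension are made up for illustration, none of these calls touch the disk, and the results shown assume Unix-style pathnames.

```python
import os.path

# Build a pathname from fragments instead of concatenating strings.
path = os.path.join("/tmp", "scratch", "data.txt")
print(path)                  # /tmp/scratch/data.txt

# Take it apart again.
directory, filename = os.path.split(path)
root, ext = os.path.splitext(filename)
print(directory)             # /tmp/scratch
print(filename)              # data.txt
print(root)                  # data
print(ext)                   # .txt

# Reassemble with a different (hypothetical) extension.
backup = os.path.join(directory, root + ".bak")
print(backup)                # /tmp/scratch/data.bak

print(os.path.isabs(path))   # True
print(os.path.normpath("tmp/scratch/../other/file.txt"))   # tmp/other/file.txt
```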

## Knowing Where You Are

• Python assigns the module's name to the special variable `__name__`
• `"__main__"` when the file is run from the command line
• The module's name when it is loaded by something else
• Often used to draw attention to the main body of a large program
• Or to include self-tests in a module
• If the module is loaded by something else, its name isn't `"__main__"`, so the tests aren't run
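• A minimal sketch of the self-test idiom. It reuses `ave` from earlier in the lecture; `run_self_tests` is a hypothetical helper name, and the file is assumed to be saved as a module of its own.

```python
def ave(x, y):
    return (x + y) / 2.0

def run_self_tests():
    """Return the number of failed checks (hypothetical helper)."""
    failures = 0
    if ave(20, 30) != 25.0:
        failures += 1
    if ave(-1, 1) != 0.0:
        failures += 1
    return failures

# Runs only when this file is executed directly, not when it is imported.
if __name__ == "__main__":
    print("self-test failures: %d" % run_self_tests())
```

• Importing the file from another program defines `ave` and `run_self_tests` but prints nothing.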

• `Daily Python URL` and `Doctor Dobb's Journal Python-URL` collect Python-related news in one place
• `Python Cookbook` (and the print version, [Martelli 2005] ) are recipes ranging in complexity from nearly trivial to fiendishly subtle

## Exercises

Exercise 8.1:

Write a function that takes two strings called `text` and `fragment` as arguments, and returns the number of times `fragment` appears in the second half of `text`. Your function must not create a copy of the second half of `text`. (Hint: read the documentation for `string.count`.)

Exercise 8.2:

What does the Python keyword `global` do? What are some reasons not to write code that uses it?

Exercise 8.3:

Consider the following sample of code and its output:

```
def settings(first, **rest):
    print 'first is', first
    print 'rest is'
    for (name, value) in rest.items():
        print '...', name, value
    print

settings(1)
settings(1, two=2, three="THREE")
```
```
first is 1
rest is

first is 1
rest is
... two 2
... three THREE

```

What does the variable `rest` do? What does the double asterisk `**` in front of its name mean? How does it compare to the example with `*extra` (with a single asterisk) in the lecture?

Exercise 8.4:

Python allows you to import all the functions and variables in a module at once, making them local names. For example, if the module is called `values`, and contains a variable called `Threshold` and a function called `limit`, then after the statement `from values import *`, you can refer directly to `Threshold` and `limit`, rather than having to use `values.Threshold` or `values.limit`. Explain why this is generally considered a bad thing to do, even though it reduces the amount programmers have to type.

Exercise 8.5:

`sys.stdin`, `sys.stdout`, and `sys.stderr` are variables, which means that you can assign to them. For example, if you want to change where `print` sends its output, you can do this:

```
import sys

print 'this goes to stdout'
temp = sys.stdout
sys.stdout = open('temporary.txt', 'w')
print 'this goes to temporary.txt'
sys.stdout = temp
```

Do you think this is a good programming practice? When and why do you think its use might be justified?

Exercise 8.6:

`os.stat(path)` returns an object whose members describe various properties of the file or directory identified by `path`. Using this, write a function that will determine whether or not a file is more than one year old.

Exercise 8.7:

Write a Python program that takes as its arguments two years (such as 1997 and 2007), and prints out the number of days between the 15th of each month from January of the first year until December of the last year.

Exercise 8.8:

Write a simple version of `which` in Python. Your program should check each directory on the caller's path (in order) to find an executable program that has the name given to it on the command line.