Lesson 6: Functions and dictionaries

Today's lesson covers defining your own functions, plus
a fantastically useful data type called a dictionary.

===================== Functions ===========================

The idea of a function is simple: it's a little piece of code that
does some specific task.  You can use functions to group your code so
it's more readable, or to separate tasks that you need to call
repeatedly or from different places.

Here's how you define a function:

def hello() :
    print "Hello, world!"

The keyword def says you're about to define a function; everything
inside it is indented. You can call it from other parts of your
program just by saying: hello()

That function just printed something. But fairly often, you'll want
a function to calculate some value and return it.

def count_words(s) :
    return len(s.split())    # return the number of words in the string s

In that case, you'd call it something like this:

mystring = "Here is a long line with a bunch of words in it"
print "The string contains", count_words(mystring), "words"

When you call count_words(mystring), Python takes mystring (the long
string I just defined) and passes it to the count_words function, but
inside count_words, the string will be called s, not mystring. The
function does whatever it needs to do, calculates its result, then
uses return to pass the result back to whatever code called it.
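
To make the renaming concrete, here's a small sketch. It defines its own
count_words that returns the number of words, and the variable names
title and greeting are made up for illustration. Each print is written
with parentheses around a single value, so it should behave the same
in Python 2 and Python 3.

```python
def count_words(s):
    # Inside the function the string is always called s,
    # no matter what the caller named it.
    return len(s.split())

title = "Functions and dictionaries"
greeting = "Hello, world!"

title_count = count_words(title)        # here s is the title string
greeting_count = count_words(greeting)  # here s is the greeting string

print(title_count)     # 3
print(greeting_count)  # 2
```

The caller never needs to know that the function calls its argument s.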

s in count_words is called an "argument". You can pass any number of
arguments to a function. For instance, to make a web URL:

def make_url(scheme, domain, path) :
    return scheme + "://" + domain + "/" + path

>>> make_url("http", "mailman.linuxchix.org", "pipermail/courses/")
'http://mailman.linuxchix.org/pipermail/courses/'

=============== Returning multiple values: tuples ===============

You can return any Python type, not just numbers -- strings, lists,
whatever. You can even return more than one value:

def count_lines_and_words_in_file(filename) :
    lines = 0
    words = 0
    file = open(filename)
    for line in file :
        lines += 1
        words += len(line.split())
    file.close()    # close the file once we're done reading it
    return (lines, words)

You would call it like this:

(lines, words) = count_lines_and_words_in_file("my_file.txt")

You don't have to use the parentheses -- you can say
    return lines, words
and
lines, words = count_lines_and_words_in_file("my_file.txt")

but I think the parentheses make it a little clearer what's happening.
I'm only mentioning it because you'll see it both ways in Python programs.
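
If you want to convince yourself the two spellings do the same thing,
here's a quick sketch. The two min_and_max functions and the temps list
are hypothetical, invented just for this comparison.

```python
def min_and_max_with_parens(numbers):
    return (min(numbers), max(numbers))

def min_and_max_without_parens(numbers):
    return min(numbers), max(numbers)

temps = [61, 58, 72, 66]

# Either calling style works with either function.
(low, high) = min_and_max_with_parens(temps)
low2, high2 = min_and_max_without_parens(temps)

print(low == low2 and high == high2)   # True
```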

A parenthesized thing like (lines, words) or (42, "foo") is called a
"tuple" in Python.  A tuple is like a list in some ways -- you could say
>>> my_tuple = count_lines_and_words_in_file("/etc/hosts")
>>> my_tuple
(17, 34)
>>> len(my_tuple)
2
>>> my_tuple[0]
17
>>> count_lines_and_words_in_file("/etc/hosts")[0]
17

Looks just like a list, right? So how is it different?
You can't change anything inside it or add things to it once it's created.
So tuples are mostly useful only for passing to or returning from functions.
And it's not important to remember the name "tuple", just the concept of
returning multiple values. You'll see that a lot in Python programs.
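
Here's a small sketch of that difference, using made-up values. It uses
try and except, which may be new to you: they just catch the error
Python raises so the program can keep going.

```python
my_list = [17, 34]
my_tuple = (17, 34)

my_list[0] = 18          # fine: you can change an item in a list
my_list.append(51)       # fine: you can add to a list

try:
    my_tuple[0] = 18     # not allowed: Python raises a TypeError
except TypeError:
    tuple_changed = False
else:
    tuple_changed = True

print(my_list)        # [18, 34, 51]
print(tuple_changed)  # False
print(my_tuple)       # (17, 34)
```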

===================== Dictionaries ===========================

Python has one more "collection of stuff" data type that's worth
knowing about: dictionaries.

Sometimes when you're writing a program, you don't necessarily know
how many things you're going to need to store, or what order they
should go in. All you know is that you need to associate names with
values. For that, you'd use a dictionary, defined with curly braces, { }.

Like if you wanted to associate names of organizations with their URLs:

urls = { "LinuxChix"     : "http://linuxchix.org",
         "Ubuntu Women"  : "http://ubuntu-women.org/",
         "Debian Women"  : "http://women.debian.org/",
         "Geek Feminism" : "http://geekfeminism.wikia.com/"
       }

(Note: this is one of the few times in Python where indentation isn't
important. I've indented the lines to look more readable, but you
could put them all on one line, or with or without indentation.)

Then you could say

print "Get more information on LinuxChix at", urls["LinuxChix"]

It looks like you're indexing a list -- note the square brackets,
["LinuxChix"] -- but with a string inside them, not a number.

The index values inside the brackets, like "LinuxChix", are called keys;
the other part, the URLs, are called values.  You can get a list of all
keys with keys(), and you can use keys() to loop over a dictionary:

for org in urls.keys() :
    print "Get more info on", org, "at:", urls[org]

By the way, you may notice I didn't put a comma after the last item
(geekfeminism) in the dictionary. Some languages are picky about that,
but Python is flexible: you can include a comma there or not.
If you're maintaining a long list of values in a big dictionary,
sometimes it's easier to have all the lines look the same and have
them all have commas.
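
A quick way to see that the trailing comma makes no difference is to
build the same dictionary both ways and compare. The names without_comma
and with_comma are just for this sketch.

```python
without_comma = { "LinuxChix"    : "http://linuxchix.org",
                  "Debian Women" : "http://women.debian.org/"
                }

with_comma = { "LinuxChix"    : "http://linuxchix.org",
               "Debian Women" : "http://women.debian.org/",
             }

print(without_comma == with_comma)   # True
print(len(with_comma))               # 2
```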

You can also add stuff to a dictionary after it's made,
or start with an empty dictionary and add stuff to it:

urls = {}     # an empty dictionary
while True :
    org = raw_input("What's your organization name? (q to quit) ")
    if org == 'q' :
        break
    url = raw_input("What's your URL? ")
    urls[org] = url

print urls

Dictionaries are great, and you'll find all sorts of uses for them.
I was confused about dictionaries for a long time, maybe because you
use {} to create them but then you index them with [] like a list,
and I could never remember which was which. I wish I'd started using
dictionaries a lot sooner, so try not to be afraid of using them.
Just remember (or write down somewhere):

[] is an empty list
{} is an empty dictionary
() is an empty tuple (but you'll mostly see it in function return values)
-- but they're all indexed with [] once they're created.
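
Here's that cheat sheet as a sketch you can run. The variable names are
made up for illustration; the point is that the three types are created
with different brackets but all read with [].

```python
my_list = ["LinuxChix", "Ubuntu Women"]            # created with []
my_dict = {"LinuxChix": "http://linuxchix.org"}    # created with {}
my_tuple = ("LinuxChix", "http://linuxchix.org")   # created with ()

first_org = my_list[0]        # a list is indexed by number
url = my_dict["LinuxChix"]    # a dictionary is indexed by key
tuple_url = my_tuple[1]       # a tuple is indexed by number

print(first_org)
print(url)
print(tuple_url)

# The empty versions all have length zero.
print(len([]) + len({}) + len(()))
```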

===================== Homework ===========================

1. Write a function that takes a dictionary as argument, chooses a random
   key from that dictionary, and returns the key and its value as a tuple.

   In other words, if you used that urls dictionary I defined above,
   you would pass urls as the argument to the function, and the function
   might return something like ("LinuxChix", "http://linuxchix.org").

2. Use the function you just defined to make a flashcard program.
   The flashcards can be on any subject you want (or a mix of
   subjects).  Make a dictionary of questions and answers -- the keys
   are the questions, the answers are the values. Then use your
   function to pick a random question/answer pair, print the question
   and wait for the user to hit return. Then print the answer.
   (The user can keep track of whether she got it right.)

   By the way, if you ever need to use strings with non-ASCII characters
   -- like if you're making flashcards for a language besides English --
   you can specify an encoding with a comment near the top of your program
   (e.g. right after the shebang line, if any) like this:
# -*- coding: utf-8 -*-

3. Change your flashcard program so that the user has to type the answer,
   and you compare it against the right answer and keep track of how
   many were answered right or wrong.

   Note: you may find this is kind of a pain, because if you make any
   typos or add extra spaces or anything you don't get credit for a
   right answer. If you find this to be a problem, do you have any ideas
   for ways you could make it more flexible?