Exception handling in Python not only provides a way to respond to errors, it also lets you structure your code elegantly, reduces the amount of code you write, and sometimes speeds up your script. Being Pythonic means knowing how to use exceptions to their full extent. I've learned about exceptions from many different sources, each having a particular insight, but I haven't found one write-up that covers it all, so this entry is an attempt at that.
This post is dedicated to exception handling; I'll do a future post on creating and raising exceptions.
Traditional programming languages tended to treat errors as a catastrophic situation. An error usually meant that the script couldn't possibly continue, and would probably instantly crash. Therefore, the job of the programmer was to do everything in their power to avoid errors.
That meant that the programmer needed to check for everything that could go wrong before calling a function: Does the file exist? Is the network up? Etc. And if something unexpected went wrong, then the script crashed anyway.
The process of checking for every possible error condition ahead of time is known as Look Before You Leap, or LBYL.
Exception handling provided a more robust mechanism for dealing with errors. Rather than instantly crashing, the error would result in an exception being raised (or thrown, as many languages describe it), allowing the programmer to catch the exception and choose how to respond to an error situation.
However, most programmers have retained that original fear of something going wrong, and are still trained to do everything they can to keep an exception from being raised. Exceptions are still thought of as just a way to respond to catastrophic errors.
A huge benefit of exceptions in any language is that they solve the age-old problem of combining return values and error codes. Before exceptions, if an error occurred that didn't crash the script, then the function needed a way to indicate the error.
Typically, a function would return a number or string or whatever if everything had worked properly, or it would return 0 if an error occurred.
Unfortunately, for some functions, 0 could be a valid return value, so in that case they had to find a different value. Perhaps -1 would indicate an error. But what if -1 was a valid return value? Then maybe 999,999 would indicate an error... so the programmer was stuck reading through the documentation for each function to determine what return value indicated an error.
Exceptions solved this, because if nothing went wrong, then a valid value was returned. If something did go wrong, an exception was thrown, which was handled separately from the return value.
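Here is a small sketch of that difference. The sensor names, the READINGS data and both functions are made up for illustration; the point is only that an error code can collide with a real value, while an exception cannot.

```python
# Hypothetical readings; None marks a sensor that failed to report.
READINGS = {'freezer': -1, 'fridge': 4, 'oven': None}

def read_temp_with_code(sensor):
    """Old style: return -1 on error. But -1 is also a real temperature!"""
    value = READINGS.get(sensor)
    if value is None:
        return -1
    return value

def read_temp(sensor):
    """Exception style: errors travel separately from the return value."""
    value = READINGS[sensor]          # raises KeyError for unknown sensors
    if value is None:
        raise ValueError(f'no reading from {sensor}')
    return value

# With an error code the caller cannot tell these two cases apart.
print(read_temp_with_code('freezer'))   # a real reading of -1
print(read_temp_with_code('oven'))      # an error, also reported as -1

# With exceptions the two cases are unambiguous.
for sensor in ('freezer', 'oven'):
    try:
        print(f'{sensor}: {read_temp(sensor)}')
    except ValueError as e:
        print(f'{sensor}: error ({e})')
```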
Suppose you are going to open a file to read data. If the file you are expecting doesn't exist, an exception is raised and your script crashes:
myfile = open('input.txt')
Exception handling allows you to decide what happens when an error occurs, by putting the potentially problematic code in a try block.
In Python we don't use the throw and catch terminology -- instead you try an operation ("There is no try, only do", said a wise yet non-Pythonic alien) and if something goes wrong an exception is raised and you can choose to handle it.
Here is an example of simple exception handling, with a try block and an except block:
try:
    print('Trying to open the input file...')
    myfile = open('input.txt')
except:
    print('Something went wrong! Oh well, continuing on and hoping for the best...')
print('Continuing...')
You can have as much code as you like in a try block. If an exception is raised by any of the code in that block, Python will then look for an exception handler for the exception that was raised. An exception handling block starts with except and specifies which exceptions it handles. In this case, because the except block doesn't specify any particular exceptions, it matches any exception that is raised.
The example above is not particularly robust exception handling (probably we couldn't really continue on at this point), but illustrates the point that you have complete control over what happens when an exception is raised, including the ability to simply ignore it.
It turns out there are many cases where you don't actually care if an exception occurs, but you don't want your script to crash. For example, if you call os.mkdir() to create a directory, it raises an exception if the directory already exists:
import os
os.mkdir('/tmp')
Since you were trying to create the directory, the fact that it already exists is not something you'd consider an error, and you just want to move silently on. You can do this using pass, which is a keyword that means "this code block intentionally left blank":
try:
    os.mkdir('/tmp')
except:
    pass
print("Continuing...")
In this case, the exception occurred, it was ignored, and you move on.
When you do want to do something in response to an exception, you usually need to get information about what exception occurred. Let's say you are getting a value from a dictionary and there is no entry for the key you specify:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1}
rhino_count = zoo['rhinos']
print(f'There are {rhino_count} rhinos in our zoo!')
Once you realize that it's possible for this code to raise a KeyError exception, you can create a specific exception handler for that case by specifying the exception name as part of the except statement:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1}
try:
    rhino_count = zoo['rhinos']
    print(f'There are {rhino_count} rhinos in our zoo!')
except KeyError:
    print('There are NO rhinos in our zoo!')
The except KeyError: statement means this exception handler will only be called if it's an exception of type KeyError. Any other exception type will still crash the script. For example, say you do have rhinos in the zoo, but mis-typed the rhino_count variable name in print():
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1, 'rhinos': 2}
try:
    rhino_count = zoo['rhinos']
    print(f'There are {rhoni_count} rhinos in our zoo!') # Error: typo in variable name
except KeyError:
    print('There are NO rhinos in our zoo!')
Since the exception was of type NameError, your exception handler didn't match it and the script crashed. You could do a specific NameError handler, but you also have the option of handling any exception that comes along without having to know what it will be ahead of time, by creating an exception handler for type Exception:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1, 'rhinos': 2}
try:
    rhino_count = zoo['rhinos']
    print(f'There are {rhoni_count} rhinos in our zoo!')
except KeyError:
    print('There are NO rhinos in our zoo!')
except Exception as e:
    print(f'Got exception of type {type(e)}: {e}')
Almost every exception you will run into is a subclass of Exception, so a handler for Exception matches nearly any exception (the few that it doesn't match, such as KeyboardInterrupt and SystemExit, derive directly from BaseException instead). For this reason, you should always put this handler last, since Python will stop checking the exception handlers once one matches. The except Exception as e: statement assigns the exception object, which carries information about what went wrong, to the specified variable name. I'm not a fan of one-letter variable names, but it's very common to assign exceptions to the variable name e so I'm doing so here.
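If you want to check where an exception sits in the hierarchy, issubclass() will tell you. The snippet below is just a quick sketch; the caught_by variable is only there to show which handler ran, and the handlers are deliberately in the wrong order to show why the broad one should go last.

```python
# Ordinary errors derive from Exception...
print(issubclass(KeyError, Exception))           # True
print(issubclass(NameError, Exception))          # True

# ...but a few special exceptions derive directly from BaseException,
# so "except Exception" does not catch them.
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(SystemExit, Exception))         # False

# KeyError is also a LookupError, so a broader handler placed first
# shadows the more specific handler placed after it.
try:
    {}['rhinos']
except LookupError:
    caught_by = 'LookupError'
except KeyError:
    caught_by = 'KeyError'   # never reached
print(caught_by)             # LookupError
```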
It's up to you how you want to respond to an unexpected exception. You might log information about it to a file, or give the user a message, then you can either continue on or you can choose to let the script crash. To do the latter, simply re-raise the exception like so:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1, 'rhinos': 2}
try:
    rhino_count = zoo['rhinos']
    print(f'There are {rhoni_count} rhinos in our zoo!')
except KeyError:
    print('There are NO rhinos in our zoo!')
except Exception as e:
    print(f'Got exception of type {type(e)}: {e}')
    print("Not sure what happened, so it's not safe to continue -- crashing the script!")
    raise e
In this case, you print a message to the user and then crash the script using raise e. The raise statement is like returning a value from a function, except it instead raises the specified exception (within an exception handler you can raise without specifying an exception and it will automatically raise the same exception that was received, but I recommend the clarity of specifying the exception object).
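For comparison, here is what the bare raise form looks like. This is only a sketch; count_rhinos() and the reraised variable are names I made up for the demonstration.

```python
def count_rhinos(zoo):
    try:
        return zoo['rhinos']
    except KeyError:
        print('No rhinos entry; passing the problem up to the caller')
        raise   # bare raise: re-raises the KeyError currently being handled

try:
    count_rhinos({'penguins': 5})
except KeyError as e:
    reraised = e   # keep a reference; e itself is cleared after the handler
    print(f'Caller saw {type(e).__name__}: {e}')
```

Either form gets the exception to the caller; which one reads better is largely a matter of taste.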
If you want to handle multiple exceptions in the same handler, you can use a tuple (note the parentheses) to specify the exceptions like so:
try:
    animals = {}
    num = animals['rhino']
except (KeyError, ValueError) as e:
    print(f'{type(e)}: {e}')
As you can see, you are in complete control of what happens when an exception is raised. But the story doesn't end there. Now we get to the two most misunderstood features of Python exception handling, else and finally blocks.
An else block is an optional feature of exception handling that allows you to have a block of code that is only run if no exceptions occurred. Why would you use this? Let's say you are handling exceptions rather than letting the script crash. Here is a situation you can get into if an exception occurs:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1}
try:
    rhino_count = zoo['rhinos'] # Error: There is no 'rhinos' key
except:
    print("Error!")
print(f"# of rhinos: {rhino_count}") # Error: Attempting to use a variable that doesn't exist
print("Continuing...")
In this case, an exception occurred, which kept rhino_count from getting created, but the exception handling kept the script from crashing. I then tried to reference rhino_count, which caused another exception that did crash the script. A classic way of handling this is to create the variable ahead of time and set it to False or zero, then check the value before using it, like so:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1}
rhino_count = False # Initializing the variable to a known state.
try:
    rhino_count = zoo['rhinos'] # Error: There is no 'rhinos' key
except:
    print("Error!")
if rhino_count is not False:
    print(f"# of rhinos: {rhino_count}")
print("Continuing...")
This approach works, but adds unnecessary code and complexity. The else block achieves the same with less code, because if you reach the else block, you know everything is as expected, so you don't have to initialize or check values before using them:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1}
try:
    rhino_count = zoo['rhinos']
except:
    print("Error!")
else:
    print(f"No exceptions occurred, # of rhinos: {rhino_count}")
print("Continuing...")
Here's what happens when there are no exceptions:
zoo = {'penguins': 5, 'lions': 12, 'zebras': 1, 'rhinos': 2}
try:
    rhino_count = zoo['rhinos']
except:
    print("Error!")
else:
    print(f"No exceptions occurred, # of rhinos: {rhino_count}")
print("Continuing...")
The benefit of an else block is that you can have code that relies on everything going as expected, and can skip that code if something does go wrong. It's completely optional, but is helpful for organizing your code more elegantly.
A finally block contains code that will run before the exception handling is exited no matter what happens. That is, whether or not an exception occurs, the code in the finally block will be run. I will emphasize this multiple times because people often take a bit to absorb what this means.
The finally block is usually used for cleanup that needs to occur whether or not an exception occurred.
I'll start with the aspect of exception handling that I learned most recently: You can use a finally block without needing to provide an exception handler. Here is what happens when no exception occurs:
big_data = 'a' * 100000
try:
    print(len(big_data))
finally:
    print("Cleaning up!")
    del(big_data)
print("Continuing...")
In this case the try block was executed, there were no exceptions, and the finally block was called before moving on.
If you don't have an exception handler, then any exception will crash the script, but the finally block still gets called before that happens:
big_data = 'a' * 100000
try:
    print(len(bug_data)) # Error: Typo in variable name
finally:
    print("Cleaning up!")
    del(big_data)
print("Continuing...")
If you do have an exception handler, then that code will be executed and then the finally block will be executed:
big_data = 'a' * 100000
try:
    print(len(bug_data)) # Error: Typo in variable name
except:
    print("Error!")
finally:
    print("Cleaning up!")
    del(big_data)
print("Continuing...")
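The finally block will be called even if the script is quit with sys.exit() or Ctrl-C. Short of the Python process itself being killed outright (for example with os._exit() or a kill signal), it is called no matter what happens.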
At this point in classes, a student inevitably asks, "Will the finally block be called if..."
Instructor: (Deep breath) "Yes. NO MATTER WHAT HAPPENS."
Got that?
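To make the instructor's point concrete, here is a small sketch of a finally block running even when the code asks Python to quit. sys.exit() works by raising SystemExit, so the cleanup still happens on the way out. The events list and quit_early() are made-up names, and the outer handler exists only so the demo can keep running long enough to show the result.

```python
import sys

events = []

def quit_early():
    try:
        events.append('working')
        sys.exit(1)                      # raises SystemExit
    finally:
        events.append('cleaning up')     # still runs on the way out

try:
    quit_early()
except SystemExit as e:
    # Caught here only so this demo can print what happened.
    events.append(f'exit requested with code {e.code}')

print(events)
# ['working', 'cleaning up', 'exit requested with code 1']
```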
I felt all good about my knowledge of exceptions, then I ran into a problem that seemed to render the whole thing pointless. If I got an exception while opening a file, and I was using the finally block to make sure the file was closed, I ran into this:
try:
    myfile = open('bogus_path.txt')
except:
    pass
finally:
    myfile.close()
Because the file was never opened, my attempt to close it failed and my finally block triggered an exception and crashed the script.
This flummoxed me for a while and gave me a dim view of exceptions. Now, presumably there's some convenient way to test whether a variable is defined or not, but before I got to that point I had a revelation: This is what exception handling is all about!
If some code might raise an exception but you don't care if it does, just use an empty exception handler within the finally block, like so:
try:
    myfile = open('bogus_path.txt')
except:
    pass
finally:
    # myfile may or may not exist, so use an exception handler.
    try:
        myfile.close()
    except:
        pass
print("Continuing...")
Now if the file was opened it gets closed, if it wasn't we just move on. This is a perfect expression of the way exception handling solves many problems elegantly.
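For files specifically, another common way to get the same guarantee is the with statement, which closes the file for you once the block is left, whether or not an exception occurred. The sketch below is an alternative to the nested try, not a replacement for understanding it; read_text() and the scratch file names are hypothetical.

```python
import os
import tempfile

def read_text(path):
    """Return the file's contents, or None if the file does not exist."""
    try:
        with open(path) as myfile:      # closed automatically on exit
            return myfile.read()
    except FileNotFoundError:
        return None                     # never opened, so nothing to close

# Set up an empty scratch directory with one real file in it.
folder = tempfile.mkdtemp()
real_path = os.path.join(folder, 'input.txt')
with open(real_path, 'w') as f:
    f.write('5 penguins')

print(read_text(real_path))                                # 5 penguins
print(read_text(os.path.join(folder, 'bogus_path.txt')))   # None
```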
By knowing what the else and finally blocks do, you've become an elite Python programmer who knows something even many experienced folks don't.
That covers the features of exception handling. Now let's talk about the philosophy.
I covered LBYL (Look Before You Leap) above. Now I want to talk about the best part of Python exception handling, the thing that really makes your code Pythonic:
EAFP. It's Easier to Ask Forgiveness than Permission.
When you think about it, because so much code in the world follows an LBYL model, our logic tends to be inverted. First we do a whole bunch of error checking, and only if everything is good do we then actually do the thing we want to do. This buries the actual logic and makes the less interesting part of the code the most prominent. Here's a typical example:
import os
import json
def load_project_data(path):
    if os.path.exists(path):
        data_dict = json.load(open(path))
    else:
        print(f"{path} doesn't exist.")
        return
    if 'project name' in data_dict:
        print(f"Project loaded: {data_dict['project name']}")
    else:
        print("Project data badly formatted, missing project name")
        return
    return data_dict
data = load_project_data('data.json')
The key to EAFP is to stop worrying about getting exceptions. Start with what you actually want to do, and handle errors after if need be. While we could handle specific exceptions, often you can get away with a generic exception handler:
import os
import json
def load_project_data(path):
    try:
        data_dict = json.load(open(path))
        print(f"Project loaded: {data_dict['project name']}")
        return data_dict
    except Exception as e:
        print(f"Error while loading {path}: {e}")
data = load_project_data('bogus_data.json')
Less code, and the logic is up front.
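If you do want the same tailored messages the LBYL version gave, the specific-handler variant might look something like this sketch. The choice of which exceptions to name is my assumption about what can go wrong here (a missing file, invalid JSON, a missing key), and the scratch files exist only to exercise each case.

```python
import json
import os
import tempfile

def load_project_data(path):
    try:
        with open(path) as f:
            data_dict = json.load(f)
        print(f"Project loaded: {data_dict['project name']}")
        return data_dict
    except FileNotFoundError:
        print(f"{path} doesn't exist.")
    except json.JSONDecodeError as e:
        print(f"{path} is not valid JSON: {e}")
    except KeyError:
        print("Project data badly formatted, missing project name")

# Scratch files to trigger the different outcomes.
folder = tempfile.mkdtemp()
good = os.path.join(folder, 'good.json')
with open(good, 'w') as f:
    json.dump({'project name': 'Zoo'}, f)
bad = os.path.join(folder, 'bad.json')
with open(bad, 'w') as f:
    f.write('{not json')

print(load_project_data(good))                               # the loaded dict
print(load_project_data(bad))                                # None
print(load_project_data(os.path.join(folder, 'nope.json')))  # None
```

The logic still comes first; the error handling just gets more specific underneath it.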
Even better and more Pythonic, let the person calling your function deal with any exceptions:
import os
import json
def load_project_data(path):
    data_dict = json.load(open(path))
    print(f"Project loaded: {data_dict['project name']}")
    return data_dict
try:
    data = load_project_data('bogus_data.json')
except Exception as e:
    print(f"Failed to load data: {e}")
else:
    print(data)
Exceptions aren't just for errors! They can be a normal part of structuring code.
As discussed above, exceptions allow for separating errors from return values. They also allow for returning a different kind of value when there is no error.
Python strings demonstrate an example of the old and new way of doing this. Strings have a find() method that returns the index where the substring was found:
loc = 'Monty Python'.find('P')
print(loc)
If the substring isn't found, find() returns the special value -1:
loc = 'Monty Python'.find('a')
print(loc)
This means you have to look up what -1 means and then check for that specific value, which once again pushes the error checking to the forefront as well as not being the most intuitive code:
loc = 'Monty Python'.find('a')
if loc != -1:
    print(loc)
else:
    print("Substring not found")
Strings also have an index() method that takes the Pythonic approach and simply raises an exception if the substring is not found:
loc = 'Monty Python'.index('a')
print(loc)
It might seem extreme to raise an exception just because a substring wasn't found, but that's the point: Exceptions are not always catastrophic, they can be a normal part of control flow:
try:
    loc = 'Monty Python'.index('a')
    print(loc)
except:
    print('Substring not found')
Sometimes EAFP is the only way.
As mentioned above, os.mkdir() creates a directory, but raises an exception if the directory already exists.
Any guesses what's wrong with this attempt to use LBYL to avoid an exception if the path already exists?
import os
if not os.path.exists('/tmp'):
    os.mkdir('/tmp')
Seems straightforward -- we check to see if the path already exists, and if it doesn't we then proceed to call os.mkdir().
Add a + to your grade if you spotted the problem: A tiny bit of time passes between the call to os.path.exists() and the subsequent call to os.mkdir(). In that tiny bit of time, someone (maybe even another copy of your script) might have created the /tmp path, in which case you'll get an exception even though you tried to avoid it with the existence check.
This is called a race condition, and it's a terrible bug to have to find, because 99.9999% of the time everything is going to work fine, but 0.0001% of the time it will fail, and you'll tear your hair out trying to figure out why it never happens on your system but does happen for some of your users.
The only absolutely reliable solution to this race condition is the EAFP approach: Just try making the directory, and if it fails with a FileExistsError exception (a subclass of OSError), then you know the path already exists at that instant and can move on. Catching the more general OSError would also hide unrelated problems such as a permissions error:
import os
try:
    os.mkdir('/tmp')
except FileExistsError:
    pass
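The standard library also wraps this pattern up for you: os.makedirs() accepts an exist_ok argument that suppresses the error when the directory is already there. A minimal sketch, using a scratch directory so it is safe to run anywhere (the reports name is arbitrary):

```python
import os
import tempfile

target = os.path.join(tempfile.mkdtemp(), 'reports')

# exist_ok=True tells makedirs() not to raise if the directory already exists.
os.makedirs(target, exist_ok=True)
os.makedirs(target, exist_ok=True)   # second call is a harmless no-op

print(os.path.isdir(target))         # True
```

Note that it still raises FileExistsError if the path exists but is a regular file rather than a directory, which is usually what you want.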
I love Python exception handling, and I'm using the techniques discussed above more and more in my code as I fully realize the benefits.
But I have found one little problem popping up with some regularity: If I'm using exception handling to ignore the exception I expect, and then I'm adding some new code to the try block, sometimes the code won't work and I can't figure out why. Then it turns out that an exception was being raised in the new code but was being ignored, so I couldn't see the problem. For example:
import os
try:
    # Original code
    os.mkdir('/tmp')
    # New code that fails because the file doesn't exist but I don't see the exception.
    myfile = open('/tmp/stuff.txt')
except:
    pass
To deal with this, when adding new code I've gotten into the habit of printing out what exception occurred, so I can see problems:
import os
try:
    # Original code
    os.mkdir('/tmp')
    # New code that fails because the file doesn't exist but I don't see the exception.
    myfile = open('/tmp/stuff.txt')
except Exception as e:
    print(f'DEBUG: Got exception of type: {type(e)}: {e}')
    pass
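Another option that may help, sketched below, is to keep the try block as small as possible and name only the exception you mean to ignore, so that anything raised by new code has nowhere to hide. The scratch directory and the surfaced variable are just for the demo, and the second handler is there only so the demo can print the exception instead of crashing.

```python
import os
import tempfile

folder = tempfile.mkdtemp()          # scratch directory that already exists

try:
    os.mkdir(folder)                 # expected to fail: it already exists
except FileExistsError:
    pass                             # the one exception we mean to ignore

# New code lives outside that try block, so its failure is not swallowed.
try:
    myfile = open(os.path.join(folder, 'stuff.txt'))
except FileNotFoundError as e:
    surfaced = e                     # caught only so the demo can report it
    print(f'Now visible: {type(e).__name__}')
```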
My knowledge of the Pythonic approach to exceptions has come from a variety of sources. Here are a couple that stand out (if you know of other good discussions I should link to, please ping me!):
I’m on Python 3.7, working on a project where my colleague is on 3.6. This has quickly revealed an issue with dictionaries in 3.7 that causes serious bugs if you’re not aware of the problem and careful about handling it.
Dictionaries are the Python class for storing key/value pairs, one of the most useful and fundamental data types, especially in Python.
Since the beginning of recorded programming history, dictionaries have been unordered, meaning they store their data in an effectively random order:
zoo = {'birds': 10, 'lions': 1, 'zebras': 5, }
print(zoo)
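To get a feel for how a dictionary decided where to store things before 3.7, you can use the hash() function, which returns the number a dictionary uses to choose where each key is stored (for strings, this number changes from one run of Python to the next unless the PYTHONHASHSEED environment variable is set):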
print(f'zebras: {hash("zebras")}')
print(f'lions: {hash("lions")}')
print(f'birds: {hash("birds")}')
This is unintuitive, but by storing data this way, dictionaries are extremely fast to search. A list, by contrast, can be very slow to search, because Python searches a list by starting at the first item and going through the list until it finds the desired item; for a large list this can take a long time. A dictionary, on the other hand, calls the hash() function for the item, gets that number, and effectively jumps right to the spot where it's stored.
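You can see the difference for yourself with the timeit module. This is a rough sketch rather than a careful benchmark: the size and repeat count are arbitrary, and the actual timings will vary from machine to machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_dict = dict.fromkeys(as_list)      # same items, stored as dictionary keys
target = n - 1                        # worst case for the list: the last item

# Time 100 membership checks against each container.
list_time = timeit.timeit(lambda: target in as_list, number=100)
dict_time = timeit.timeit(lambda: target in as_dict, number=100)

print(f'list: {list_time:.5f}s  dict: {dict_time:.5f}s')
```

On a typical machine the list lookup should come out far slower, since each check walks the entire list.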
When teaching dictionaries to new programmers (and sometimes even to experienced programmers) I've had to spend a lot of time explaining why and how they work the way they do, and imploring people to remember to never rely on the order of items in a dictionary.
As of Python 3.7, dictionaries are guaranteed to remember the order in which their items were inserted. This is quite a nice feature, and I find myself now using dictionaries even more than I did before, because they act like an ordered list, yet are still extremely fast to search. It also makes them more intuitive and removes one potential source of bugs, where people would write code assuming the dictionary was ordered when it wasn’t.
For the last week I've been running into this. I'm working on a script that will be used for a project kickoff, and which makes heavy use of dictionaries being ordered to make the output usable.
Everything works great for me, but when my colleague on 3.6 runs the script, everything is in scrambled order, rendering the output almost useless. This is a very easy bug to run into unless you are also testing your script on a pre-3.7 version of Python.
Based on this experience, I've decided that for the foreseeable future I'm not going to use the built-in dictionary. Instead, the collections module has an OrderedDict class that acts just like a dictionary, but remembers the order no matter what version you are running (even Python 2.7 has this class).
You can't use the dictionary literal ({}) for an OrderedDict, you have to use the class name, after which everything else is the same:
from collections import OrderedDict
mydict = OrderedDict()
mydict['birds'] = 5
mydict['lions'] = 1
mydict['zebras'] = 5
print(mydict)
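If you'd rather create the whole thing in one go, you can hand OrderedDict a list of (key, value) pairs, as in the sketch below. A list has a definite order on every Python version, whereas passing a dictionary literal or keyword arguments would bring back the very ordering problem you are trying to avoid on older versions.

```python
from collections import OrderedDict

# A list of (key, value) pairs keeps its order on every Python version.
mydict = OrderedDict([('birds', 5), ('lions', 1), ('zebras', 5)])
mydict['rhinos'] = 2          # new keys go on the end

print(list(mydict))           # ['birds', 'lions', 'zebras', 'rhinos']
```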
If you take this approach you still need to be careful: I've found myself unthinkingly using a literal to create a dictionary and having the problem pop up yet again.
Even though I like having my dictionaries be ordered, this kind of change is disruptive enough that it seems like something that should go into a .0 release rather than a dot release, though I suppose that wouldn't really change this impact. In any case, now that this change has happened, be cautious with your use of dictionaries!
I love dir(), I hate dir().
One of the first things I teach students, whether new or experienced programmers, is the triad of built-in functions critical for figuring things out in Python:
type()
dir()
help()
Of these, dir() is the function I use more than any other, whether in class or while doing my own programming. It's absolutely critical. And yet it annoys the hell out of me.
dir is short for directory, and provides a list of all the names associated with an object in Python, which are the attributes and methods you can use with that object.
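Since dir() hands back an ordinary list of strings, you can poke at it like any other list. A quick sketch (the private and public variable names are mine, and the exact counts depend on your Python version):

```python
names = dir(list)

print(type(names))        # dir() returns an ordinary list...
print(type(names[0]))     # ...of strings

# Split the names by whether they start with an underscore.
private = [n for n in names if n.startswith('_')]
public = [n for n in names if not n.startswith('_')]

print(f'{len(private)} underscored names, {len(public)} public names')
print(public)
```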
Here's the output of dir() when called on list, one of the early objects I talk about in class:
print(dir(list))
What's wrong with that for students? Let me count the ways...
Eventually I got tired of spending several minutes explaining all these problems to students, and over the years I developed my own version of dir() to address the issues. Seeing me use it, students started asking if they could use it in their own coding. To my surprise, I started finding myself using it when doing development, and I found myself discovering things about objects that I'd never noticed before.
Here's the output of mydir() for a list:
from mydir import mydir
mydir(list)
Which do you prefer?
This makes things so much nicer for students (and for their instructors...)
If you want to see the private (underscored) items, you can do so:
mydir(list, private=True)
Notice that classes and attributes are separated out from methods.
If you'd like to make use of this, you can get it from github (I may submit it to pypi as a module in the future):
I intend to make some additions:
Suggestions and patches welcome!