How to use the Python file system

This article was sent to us by: Scott B. Reed at 01052010

Working with files involves one of two things: basic I/O, and working with the filesystem (for example, naming, creating, moving, or referring to files). The second is a bit tricky, because different operating systems have different filesystem conventions. It would be easy enough to learn how to perform basic file I/O without learning all the features Python provides to simplify cross-platform filesystem interaction, but I wouldn't recommend it. Instead, read at least the first part of this article. It will give you the tools you need to refer to files in a manner that doesn't depend on your particular operating system. Then, when you use the basic I/O operations, you can open the relevant files in this manner.

All operating systems refer to files and directories with strings naming a given file or directory. Strings used in this manner are usually called pathnames (or sometimes just paths), which is the word we'll use for them. The fact that pathnames are strings introduces possible complications into working with them. Python does a good job of providing functions that help avoid these complications, but to use these functions effectively, you need to understand the underlying problems. This article discusses these details.

Pathname semantics are very similar across operating systems, because the filesystem on almost all of them is modeled as a tree structure, with a disk being the root, and folders, subfolders, and so forth being branches, subbranches, and so on. This means that most operating systems refer to a specific file in fundamentally the same manner: with a pathname that specifies the path to follow from the root of the filesystem tree (the disk) to the file in question. (Characterizing the root as a hard disk is an oversimplification, but it's close enough to the truth to serve for this article.) This pathname consists of a series of folders to descend into in order to get to the desired file. Different operating systems have different conventions regarding the precise syntax of pathnames.
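To make the tree picture concrete, here is a small sketch that splits a pathname into the steps it describes, from the root down to the file. It uses the standard library's pathlib module, which this article doesn't otherwise cover, and the two paths are only illustrative. The "pure" path classes work on the string alone, so the example runs on any operating system and never touches the disk.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Each pathname is a walk from the root of the tree down to one file.
# .parts lists the steps of that walk in order.
unix_parts = PurePosixPath("/data/myfile").parts
windows_parts = PureWindowsPath(r"C:\data\myfile").parts

print(unix_parts)     # ('/', 'data', 'myfile')
print(windows_parts)  # ('C:\\', 'data', 'myfile')
```

The first element is the root and the remaining elements are the folders and the file beneath it. Only the spelling of the root and of the separator differs between the two systems.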

One difference is the separator character: a Linux/UNIX pathname separates sequential file or directory names with /, whereas a Windows pathname separates them with \. In addition, the UNIX filesystem has a single root (referred to by a / character as the very first character in a pathname), whereas the Windows filesystem has a separate root for each drive, labeled A:\, B:\, C:\, and so forth (with C: usually being the main drive). Because of these differences, files have different pathname representations on different operating systems. For example, a file called C:\data\myfile in MS Windows might be called /data/myfile on UNIX and on the Macintosh. Python provides functions and constants that allow you to perform common pathname manipulations without worrying about such syntactic details. With a little care, you can write your Python programs so that they run correctly no matter what the underlying filesystem happens to be.

These operating systems allow two different types of pathnames. Absolute pathnames specify the exact location of a file in a filesystem, without any ambiguity; they do this by listing the entire path to that file, starting from the root of the filesystem. Relative pathnames specify the position of a file relative to some other point in the filesystem. That other point isn't specified in the relative pathname itself; instead, the absolute starting point is provided by the context in which the pathname is used.

Relative paths need context to anchor them, and this is typically provided in one of two ways. The simplest is to append the relative path to an existing absolute path, producing a new absolute path. For example, we might have a relative Windows path, Start Menu\Programs\Explorer, and an absolute path, C:\Documents and Settings\Administrator. By appending the two, we have a new absolute path, C:\Documents and Settings\Administrator\Start Menu\Programs\Explorer, which refers to a specific file in the filesystem. By appending the same relative path to a different absolute path (say, C:\Documents and Settings\kmcdonald), we produce a path that refers to the Explorer program in a different user's (kmcdonald's) profile directory.
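The appending described above can be sketched with the join and isabs functions of os.path. On Windows, os.path is the ntpath module, and on UNIX-like systems it is posixpath. The sketch imports both directly, which is an assumption made only so the Windows example gives the same result on any machine; in a real program you would normally just call os.path.join and let Python pick the right flavor.

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on UNIX-like systems

absolute = r"C:\Documents and Settings\Administrator"
relative = r"Start Menu\Programs\Explorer"

# isabs tells the two kinds of pathname apart.
print(ntpath.isabs(absolute))  # True
print(ntpath.isabs(relative))  # False

# join appends the relative path and supplies the separator itself.
joined = ntpath.join(absolute, relative)
print(joined)  # C:\Documents and Settings\Administrator\Start Menu\Programs\Explorer

# The same relative path anchored at a different absolute path.
other = ntpath.join(r"C:\Documents and Settings\kmcdonald", relative)
print(other)   # C:\Documents and Settings\kmcdonald\Start Menu\Programs\Explorer

# The UNIX flavor does the same job with / as the separator.
unix_joined = posixpath.join("/data", "myfile")
print(unix_joined)  # /data/myfile
```

Because join inserts the separator for you, the same call written with os.path.join produces the correct form on whichever system the program runs on.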

The second way in which a relative path may obtain a context is via an implicit reference to the current working directory, which is the particular directory where a Python program considers itself to be at any point during its execution. Python commands may implicitly make use of the current working directory when they're given a relative path as an argument. For example, if you call os.listdir(path) with a relative path argument, the anchor for that relative path is the current working directory, and the result is a list of the filenames in the directory whose path is formed by appending the relative path argument to the current working directory.

Whenever you edit a document on a computer, you have a concept of where you are in that computer's file structure, because you're in the same directory (folder) as the file you're working on. Similarly, whenever Python is running, it has a concept of where in the directory structure it is at any moment. This is important because the program may ask for a list of files stored in the current directory. The directory that a Python program is in is called the current working directory for that program.
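Here is a minimal sketch of how os.listdir anchors a relative path at the current working directory. It builds a throwaway directory with tempfile so that it can run anywhere; the names reports, a.txt, and b.txt are made up for the illustration. It also uses os.getcwd and os.chdir, which report and change the current working directory.

```python
import os
import tempfile


def list_relative_and_absolute(subdir_name="reports"):
    """Return (listing via a relative path, listing via the absolute path)."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        # Build scratch/reports/ holding two empty files (hypothetical names).
        os.mkdir(os.path.join(scratch, subdir_name))
        for name in ("a.txt", "b.txt"):
            open(os.path.join(scratch, subdir_name, name), "w").close()

        os.chdir(scratch)  # the scratch directory is now the current working directory
        try:
            # Relative argument: anchored implicitly at the current working directory.
            relative_listing = sorted(os.listdir(subdir_name))
            # The same directory spelled out as an absolute path.
            absolute_path = os.path.join(os.getcwd(), subdir_name)
            absolute_listing = sorted(os.listdir(absolute_path))
        finally:
            os.chdir(original_cwd)  # step back out before the scratch directory is removed
    return relative_listing, absolute_listing


relative_listing, absolute_listing = list_relative_and_absolute()
print(relative_listing)  # ['a.txt', 'b.txt']
print(absolute_listing)  # ['a.txt', 'b.txt']
```

Both calls list the same directory. The relative form works only because Python silently supplies the current working directory as the starting point.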

The current working directory may be different from the directory the program resides in. To see this in action, start Python and call os.getcwd() (get current working directory) to find out what Python's initial current working directory is. It will probably be either the directory the Python program itself resides in or the directory you were in when you started Python. On my Linux machine, the result is /home/vceder, which is my home directory. Note that os.getcwd is used as a zero-argument function call, which emphasizes that the value it returns isn't a constant but changes as you issue commands that change the current working directory. On Windows machines, you'll see extra backslashes in the displayed path. This is because Windows uses backslashes as its path separators, and in Python strings a backslash has a special meaning unless it is itself escaped with another backslash.

The constant os.curdir holds whatever string your system uses as the same-directory indicator. On both UNIX and Windows, this is a single dot; but to keep your programs portable, you should always use os.curdir instead of typing just the dot. This string is a relative path, meaning that os.listdir will append it to the path for the current working directory, giving the same path. A call to os.listdir(os.curdir) therefore returns a list of all the files and folders inside the current working directory.

The os.chdir function changes the current working directory: Python moves into the folder specified as its argument. After moving into a folder this way, another call to os.listdir(os.curdir) returns a list of the files in that folder, because os.curdir is then taken relative to the new current working directory. Many Python filesystem operations use the current working directory in this manner.
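The following sketch puts os.getcwd, os.curdir, os.listdir, and os.chdir together. So that it runs the same way on any machine, it works inside a temporary directory created with tempfile; the subdirectory folder and the file notes.txt are hypothetical names invented for the example.

```python
import os
import tempfile


def demo_chdir():
    """Show how os.chdir changes what os.listdir(os.curdir) returns."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        # Build scratch/folder/notes.txt (hypothetical names).
        os.mkdir(os.path.join(scratch, "folder"))
        open(os.path.join(scratch, "folder", "notes.txt"), "w").close()

        os.chdir(scratch)
        try:
            before = os.listdir(os.curdir)          # contents of scratch
            os.chdir("folder")                      # relative to the current working directory
            moved_into = os.path.basename(os.getcwd())
            after = os.listdir(os.curdir)           # contents of scratch/folder
        finally:
            os.chdir(original_cwd)                  # restore where we started
    return before, moved_into, after


before, moved_into, after = demo_chdir()
print(before)      # ['folder']
print(moved_into)  # folder
print(after)       # ['notes.txt']
```

The argument os.curdir is the same in both listdir calls. Only the current working directory changed between them, and that alone changed the result.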
