
Python’s os and pathlib modules can be confusing. According to its documentation, os provides “miscellaneous operating system interfaces” while its os.path sub-module provides “common pathname manipulations”. The pathlib module, on the other hand, provides “object-oriented filesystem paths”.

What’s not clear at all is that there is actually a huge amount of overlap between these three modules! This page will take a look at some of these similarities and demonstrate functions and methods that have the same or similar behaviour.

To start with, import the two main modules into your script:

import os
from pathlib import Path

1 Working with Paths

A path - either to a file or to a folder - can be created in Python as a string:

# Create a path
print('Example Directory/Sub-Directory')
## Example Directory/Sub-Directory

However, different OSs have different standards for their paths: Windows uses back slashes while macOS and Linux use forward slashes, for example. So, a safer way to create paths is to either use os.path.join() from the path sub-module of the os module or Path() from the pathlib module:

# Create a path
print(os.path.join('Example Directory - os', 'Sub-Directory'))
## Example Directory - os/Sub-Directory
# Create a path
print(Path('Example Directory - pathlib', 'Sub-Directory'))
## Example Directory - pathlib/Sub-Directory

Both options concatenate strings in such a way as to make them valid paths on the OS they are run on. On a Windows machine they would appear as follows:

## Example Directory - os\Sub-Directory
## Example Directory - pathlib\Sub-Directory

The difference between os.path.join() and pathlib.Path() is that the first produces a string whereas the second produces a path object. Path objects have slightly more functionality that is useful when creating and handling file paths and, in general, should be used when working with paths.

If you want something to be a string, make it a string. If you want something to be a path make it a path. Using good-practice conventions such as this will help ensure that your code does what you expect it to do and will help anyone who reads your code in figuring out what you meant it to do.
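
As a small illustration of moving between the two representations, the sketch below builds the same relative path as a string and as a path object and converts between them. The variable names are only for illustration; the `/` operator is part of pathlib itself.

```python
import os
from pathlib import Path

# Build the same relative path three ways
joined = os.path.join('Example Directory', 'Sub-Directory')  # a string
from_args = Path('Example Directory', 'Sub-Directory')       # a path object
from_slash = Path('Example Directory') / 'Sub-Directory'     # a path object

# The two path objects are equal, however they were built
print(from_args == from_slash)
## True
# str() turns a path object into a string; Path() turns a string into a path object
print(str(from_args) == joined)
## True
print(Path(joined) == from_args)
## True
```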

1.1 Splitting Up a Path

The opposite of what we did above - creating a path from individual strings - would be to break a full path up into its constituent parts. This can’t be done directly with the os module - it has an os.path.split() function but this only splits off the last part - but we can treat the path as a string (which it is) and use the .split() method instead. Bear in mind, however, that this isn’t OS agnostic because the slashes won’t be the right way round on Windows machines. The pathlib module has a better option: the .parts attribute:

# Split a path up into parts
path = os.path.join('Example Directory', 'Sub-Directory', 'Example File.txt')
print(path.split('/'))
## ['Example Directory', 'Sub-Directory', 'Example File.txt']
# Split a path up into parts
path = Path('Example Directory', 'Sub-Directory', 'Example File.txt')
print(path.parts)
## ('Example Directory', 'Sub-Directory', 'Example File.txt')
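
If you do need to split a string path, one possible way to keep it OS agnostic is to split on os.sep (the separator of the OS the script is running on) rather than a hard-coded slash. The sketch below also shows what os.path.split() returns, and uses pathlib's PurePosixPath and PureWindowsPath classes to show that .parts gives the same result for both styles of path. The variable names are illustrative only.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

path = os.path.join('Example Directory', 'Sub-Directory', 'Example File.txt')
# Split on the current OS's separator instead of a hard-coded '/'
parts = path.split(os.sep)
# os.path.split() only separates the last component from the rest
head, tail = os.path.split(path)

# Pure paths handle a given OS's style without touching the filesystem
posix_parts = PurePosixPath('Example Directory/Sub-Directory/Example File.txt').parts
windows_parts = PureWindowsPath(r'Example Directory\Sub-Directory\Example File.txt').parts

print(parts)
## ['Example Directory', 'Sub-Directory', 'Example File.txt']
print(tail)
## Example File.txt
print(posix_parts == windows_parts)
## True
```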

1.2 Replacing Elements in a Path

A path object has methods that will replace the file name (including the extension), the file stem (the file name not including the extension) and/or the extension. Note that the .with_stem() method requires Python 3.9 or later:

# Replace the file name
print(path.with_name('Example Script.py'))
## Example Directory/Sub-Directory/Example Script.py
# Replace the file stem
print(path.with_stem('Example Script'))
## Example Directory/Sub-Directory/Example Script.txt
# Replace the file extension
print(path.with_suffix('.py'))
## Example Directory/Sub-Directory/Example File.py
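
The same elements can also be read rather than replaced. As a rough sketch, the attributes of a path object are shown next to their closest os.path counterparts; the variable names are illustrative only.

```python
import os
from pathlib import Path

# Read the elements of a path object
path = Path('Example Directory', 'Sub-Directory', 'Example File.txt')
name, stem, suffix, parent = path.name, path.stem, path.suffix, path.parent

# Read the same elements from a string path
path_str = os.path.join('Example Directory', 'Sub-Directory', 'Example File.txt')
basename = os.path.basename(path_str)
root, ext = os.path.splitext(basename)
dirname = os.path.dirname(path_str)

print(name, '|', stem, '|', suffix)
## Example File.txt | Example File | .txt
print(basename, '|', root, '|', ext)
## Example File.txt | Example File | .txt
```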

2 Creating Files and Folders

Directories (folders) can be created as follows:

# Create directories
path_os = os.path.join('Example Directory - os', 'Sub-Directory')
os.makedirs(path_os, exist_ok=True)

# Create directories
path_pathlib = Path('Example Directory - pathlib', 'Sub-Directory')
path_pathlib.mkdir(parents=True, exist_ok=True)

The exist_ok option ensures that you don’t get an error if the directory already exists, while the parents option ensures that any missing intermediate directories are created instead of causing an error (os.makedirs() always creates them).
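
To see what these two options protect against, the following sketch leaves each one out in turn. It works inside a temporary directory (an assumption made here so that the example cleans up after itself) and the variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, 'Example Directory', 'Sub-Directory')

    # Without parents=True, a missing intermediate directory is an error
    try:
        target.mkdir()
        without_parents = 'created'
    except FileNotFoundError:
        without_parents = 'parent missing'

    target.mkdir(parents=True)

    # Without exist_ok=True, an existing directory is an error
    try:
        target.mkdir(parents=True)
        without_exist_ok = 'created'
    except FileExistsError:
        without_exist_ok = 'already exists'

    # With both options the call succeeds whatever the starting state
    target.mkdir(parents=True, exist_ok=True)
    is_directory = target.is_dir()

print(without_parents, '|', without_exist_ok, '|', is_directory)
## parent missing | already exists | True
```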

Files can be created with either the os or pathlib module but in general it’s better practice to use base Python instead:

2.1 Using os

The os.mknod() function will make a node (a file), os.open() will open it in a particular mode (e.g. read and write mode), os.write() will write a bytestring (note: not just a string!) to it and os.close() will close it:

path = os.path.join('Example Directory - os', 'Example File 1.txt')
# Create file if it does not exist
os.mknod(path)
# Open file in read and write mode
fd = os.open(path, os.O_RDWR)
# Write to file
os.write(fd, b'Hi, mom!')
# Close file
os.close(fd)

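Note that os.mknod() will only work if the file does not already exist. It is also only available on Unix-like systems, and on some of these (macOS, for example) it may require elevated privileges.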
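
One possible alternative that stays within the os module is to let os.open() create the file itself by combining flags. This is a sketch rather than a recommendation; it works inside a temporary directory so that it cleans up after itself, and the variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Example File 1.txt')
    # O_CREAT creates the file; O_EXCL makes the call fail if it already exists
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags)
    os.write(fd, b'Hi, mom!')
    os.close(fd)

    # A second exclusive creation fails because the file now exists
    try:
        os.close(os.open(path, flags))
        second_attempt = 'created'
    except FileExistsError:
        second_attempt = 'already exists'

    # Read the content back as bytes
    with open(path, 'rb') as f:
        content = f.read()

print(second_attempt, '|', content)
## already exists | b'Hi, mom!'
```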

2.2 Using pathlib

The pathlib module is simpler: you just need to use the .touch() and .write_text() methods:

path = Path('Example Directory - pathlib', 'Example File 1.txt')
# Create file
path.touch()
# Write to file
path.write_text('Hi, mom!')

By default, the .touch() method will work even if the file already exists. This behaviour can be changed by using the exist_ok=False option.
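
The sketch below illustrates both behaviours. It also shows that .write_text() will create the file if it is missing, so the .touch() call is optional when you are about to write anyway. It runs inside a temporary directory and the variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, 'Example File 1.txt')
    # .write_text() creates the file if needed
    path.write_text('Hi, mom!')
    # Touching an existing file is fine by default and leaves its content alone
    path.touch()
    content = path.read_text()

    # With exist_ok=False, touching an existing file raises an error
    try:
        path.touch(exist_ok=False)
        strict_touch = 'ok'
    except FileExistsError:
        strict_touch = 'already exists'

print(content, '|', strict_touch)
## Hi, mom! | already exists
```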

2.3 Using Base Python

As it happens, you can create and write to files in base Python (without using any modules) using open(), .write() and .close():

# Create and write to files
paths = [
    os.path.join(path_os, 'Example File 2.txt'),
    Path(path_pathlib, 'Example File 2.txt'),
]
for path in paths:
    # Open file in 'write' mode, erasing content if the file exists
    f = open(path, 'w')
    # Write to a file
    f.write('Hi, mom')
    # Close the file
    f.close()

The above code will work even if the files already exist. If we opened a file in x-mode (exclusive creation) instead of w-mode (write) then it would fail if the file already existed.
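
As a sketch of the difference between the two modes, the example below writes a file in w-mode and then tries to create it again in x-mode. It uses a with statement, which closes the file automatically, in place of an explicit .close() call. It runs inside a temporary directory and the variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, 'Example File 2.txt')

    # w-mode creates the file, or erases its content if it already exists
    with open(path, 'w') as f:
        f.write('Hi, mom')

    # x-mode refuses to open a file that already exists
    try:
        with open(path, 'x') as f:
            f.write('This will not be written')
        exclusive_attempt = 'created'
    except FileExistsError:
        exclusive_attempt = 'already exists'

    content = path.read_text()

print(exclusive_attempt, '|', content)
## already exists | Hi, mom
```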

3 Listing Elements in a Folder

The os module has two ways to list all the files and folders in a directory: listdir() will return the elements as a list of strings while scandir() will return an iterator of “DirEntry” objects which contain additional metadata about the elements. The pathlib module has the .iterdir() method, which yields path objects:

# List all files and folders in a directory
print(os.listdir('Example Directory - os'))
## ['Example File 1.txt', 'Sub-Directory']
# List all files and folders in a directory
print([x for x in os.scandir('Example Directory - os')])
## [<DirEntry 'Example File 1.txt'>, <DirEntry 'Sub-Directory'>]
# List all files and folders in a directory
print([x for x in Path('Example Directory - pathlib').iterdir()])
## [PosixPath('Example Directory - pathlib/Example File 1.txt'), PosixPath('Example Directory - pathlib/Sub-Directory')]

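No distinction is made between folders and files in the output. In this example the files can be recognised by their extensions, but that is only a naming convention: files can lack extensions and folder names can contain dots.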
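
If you need to tell the two apart while listing, the “DirEntry” objects from os.scandir() can report their own type. The following is a minimal sketch that sets up its own file and folder inside a temporary directory; the `kinds` dictionary is illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Set up one folder and one file to list
    os.mkdir(os.path.join(tmp, 'Sub-Directory'))
    with open(os.path.join(tmp, 'Example File 1.txt'), 'w') as f:
        f.write('Hi, mom!')

    # os.scandir() can be used as a context manager, which closes it when done
    with os.scandir(tmp) as entries:
        kinds = {entry.name: 'folder' if entry.is_dir() else 'file' for entry in entries}

print(sorted(kinds.items()))
## [('Example File 1.txt', 'file'), ('Sub-Directory', 'folder')]
```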

3.1 Sorting

Note that these results will not necessarily be in alphabetical order. To sort them, either use the sorted() function or the .sort() method:

print(sorted(os.listdir('Example Directory - os')))
## ['Example File 1.txt', 'Sub-Directory']
ls = os.listdir('Example Directory - os')
ls.sort()

print(ls)
## ['Example File 1.txt', 'Sub-Directory']

3.2 Skipping Elements

Often when you list the elements in a directory you won’t be interested in everything that is inside it. You’ll usually want to skip over hidden files and the Python script itself. Possibly, you will only be interested in files instead of folders or vice versa. Here’s a snippet that will only return non-hidden files in a folder:

directory = 'Example Directory - os'
# List all files and folders in a directory
files = os.listdir(directory)
# Remove this file from the list
if (file := os.path.basename(__file__)) in files:
    files.remove(file)
# Remove hidden files from the list
files = [f for f in files if not f.startswith('.')]
# Remove directories from the list
files = [f for f in files if not os.path.isdir(Path(directory, f))]
# Sort the files
files = sorted(files)

print(files)
## ['Example File 1.txt']
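
A rough pathlib equivalent of the snippet above might look like the following. The `list_visible_files()` helper is a hypothetical name introduced here for illustration, it does not remove the running script from the results, and the example sets up its own files inside a temporary directory.

```python
import tempfile
from pathlib import Path


def list_visible_files(directory):
    """Return the sorted names of the non-hidden files in a directory."""
    return sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and not p.name.startswith('.')
    )


with tempfile.TemporaryDirectory() as tmp:
    # Set up a folder, a visible file and a hidden file
    Path(tmp, 'Sub-Directory').mkdir()
    Path(tmp, 'Example File 1.txt').touch()
    Path(tmp, '.hidden').touch()
    files = list_visible_files(tmp)

print(files)
## ['Example File 1.txt']
```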

4 Walking Through a Filesystem

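Instead of listing all the files and folders in one directory only, you can ‘walk’ through an entire directory tree and list all the files and folders in each sub-directory. The os module provides os.walk() for this, and path objects have an equivalent .walk() method from Python 3.12 onwards: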

# Walk through all files and folders in a folder
for dirpath, dirnames, filenames in os.walk('Example Directory - os'):
    # Sort the dirnames and filenames to ensure you search them in order
    dirnames.sort()  # or `dirnames[:] = sorted(dirnames)`
    filenames.sort()  # or `filenames[:] = sorted(filenames)`
    # Print all files
    for filename in filenames:
        # Skip hidden files
        if filename.startswith('.'):
            continue
        # Skip this file
        if filename == os.path.basename(__file__):
            continue
        print(os.path.join(dirpath, filename))
## Example Directory - os/Example File 1.txt
## Example Directory - os/Sub-Directory/Example File 2.txt
# Walk through all files and folders in a folder
for dirpath, dirnames, filenames in Path('Example Directory - pathlib').walk():
    # Sort the dirnames and filenames to ensure you search them in order
    dirnames.sort()  # or `dirnames[:] = sorted(dirnames)`
    filenames.sort()  # or `filenames[:] = sorted(filenames)`
    # Print all files
    for filename in filenames:
        # Skip hidden files
        if filename.startswith('.'):
            continue
        # Skip this file
        if filename == Path(__file__).name:
            continue
        print(Path(dirpath, filename))
## Example Directory - pathlib/Example File 1.txt
## Example Directory - pathlib/Sub-Directory/Example File 2.txt
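
Because the walk is top-down by default, editing dirnames in place also controls which sub-directories are visited next. The sketch below uses this to skip hidden folders entirely. It builds its own small directory tree inside a temporary directory and the variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Set up a visible folder and a hidden folder, each containing one file
    for folder in ('Sub-Directory', '.hidden'):
        os.mkdir(os.path.join(tmp, folder))
        with open(os.path.join(tmp, folder, 'Example File 2.txt'), 'w') as f:
            f.write('Hi, mom')

    found = []
    for dirpath, dirnames, filenames in os.walk(tmp):
        # Assigning to dirnames[:] edits the list in place, so hidden folders are never entered
        dirnames[:] = sorted(d for d in dirnames if not d.startswith('.'))
        for filename in sorted(filenames):
            # Store each path relative to the starting folder
            found.append(os.path.relpath(os.path.join(dirpath, filename), tmp))

print(found == [os.path.join('Sub-Directory', 'Example File 2.txt')])
## True
```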

5 Checking Existence

For both modules, the exists() function/method can be used to check if a directory or file exists.

For directories:

# Check if a directory exists
path = os.path.join('Example Directory - os', 'Sub-Directory')
print(os.path.exists(path))
## True
# Check if a directory exists
path = Path('Example Directory - pathlib', 'Sub-Directory')
print(path.exists())
## True

For files:

# Check if a file exists
path = os.path.join('Example Directory - os', 'Example File 1.txt')
print(os.path.exists(path))
## True
# Check if a file exists
path = Path('Example Directory - pathlib', 'Example File 1.txt')
print(path.exists())
## True

6 Checking Type

As for checking whether something is a file or a folder, string objects (such as are returned by os.listdir()) can be used with the os.path.isdir() and os.path.isfile() functions. Path objects (such as are returned by Path.iterdir()) and “DirEntry” objects (which are returned by os.scandir()) have the .is_dir() and .is_file() methods but can also be used with the os.path.isdir() and os.path.isfile() functions:

For directories:

# For each item in a directory, is it a folder?
path = 'Example Directory - os'
_ = [print(x, os.path.isdir(os.path.join(path, x))) for x in os.listdir(path)]
## Example File 1.txt False
## Sub-Directory True
# For each item in a directory, is it a folder?
_ = [print(x, x.is_dir()) for x in os.scandir(path)]
## <DirEntry 'Example File 1.txt'> False
## <DirEntry 'Sub-Directory'> True
# For each item in a directory, is it a folder?
_ = [print(x, x.is_dir()) for x in Path('Example Directory - pathlib').iterdir()]
## Example Directory - pathlib/Example File 1.txt False
## Example Directory - pathlib/Sub-Directory True

For files:

# For each item in a directory, is it a file?
path = 'Example Directory - os'
_ = [print(x, os.path.isfile(os.path.join(path, x))) for x in os.listdir(path)]
## Example File 1.txt True
## Sub-Directory False
# For each item in a directory, is it a file?
_ = [print(x, os.path.isfile(x)) for x in os.scandir(path)]
## <DirEntry 'Example File 1.txt'> True
## <DirEntry 'Sub-Directory'> False
# For each item in a directory, is it a file?
_ = [print(x, x.is_file()) for x in Path('Example Directory - pathlib').iterdir()]
## Example Directory - pathlib/Example File 1.txt True
## Example Directory - pathlib/Sub-Directory False

6.1 Sanitising Lists of Files and Folders

Using a function such as os.listdir(path) will return everything that is in the directory specified by path (if path is omitted this will be the current working directory, which is not necessarily the directory that the script is in). This will include hidden files and possibly the script itself, and often you will only want to know the files in a folder, not the directories. Here’s some code that will sanitise the list of a directory’s contents by:

  • Removing the name of the script itself
  • Removing hidden files
  • Removing directories
  • Only including files of a certain type
  • Sorting alphabetically

# Create a list with the names of the files and folders in the current working directory
contents = os.listdir()
# Remove this file, if it is in the list
if (file := os.path.basename(__file__)) in contents:
    contents.remove(file)
# Remove hidden files
contents = [x for x in contents if not x.startswith('.')]
# Remove directories
contents = [x for x in contents if not os.path.isdir(x)]
# Only include CSV files
contents = [x for x in contents if os.path.splitext(x)[1] == '.csv']
# Sort
files = sorted(contents)

The above code could be tweaked as needed. For example, the following edit will return directories instead of files:

# Only include directories
contents = [x for x in contents if os.path.isdir(x)]
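
If this sanitising is needed in several places, it could be wrapped in a function that takes the directory and the extension as arguments. The `list_files()` helper below is a hypothetical name introduced here for illustration, built on os.scandir(); it does not remove the running script, and the example sets up its own files inside a temporary directory.

```python
import os
import tempfile


def list_files(directory='.', extension='.csv'):
    """Return the sorted names of non-hidden files with the given extension."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.name for entry in entries
            if entry.is_file()
            and not entry.name.startswith('.')
            and os.path.splitext(entry.name)[1] == extension
        )


with tempfile.TemporaryDirectory() as tmp:
    # A folder whose name ends in .csv should not be returned
    os.mkdir(os.path.join(tmp, 'folder.csv'))
    # Create some empty files, including a hidden one and one of another type
    for name in ('b.csv', 'a.csv', 'notes.txt', '.hidden.csv'):
        with open(os.path.join(tmp, name), 'w'):
            pass
    files = list_files(tmp)

print(files)
## ['a.csv', 'b.csv']
```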

7 Renaming Files

The os.rename() function and the Path.rename() method can be used for this:

# Change the name of one of the files
old_name = 'Example Directory - os/Example File 1.txt'
new_name = 'Example Directory - os/Example File 1 - Renamed.txt'
os.rename(old_name, new_name)

# Change the name back
os.rename(new_name, old_name)
# Change the name of one of the files
old_name = 'Example Directory - pathlib/Example File 1.txt'
new_name = 'Example Directory - pathlib/Example File 1 - Renamed.txt'
Path(old_name).rename(new_name)

# Change the name back
Path(new_name).rename(old_name)
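
One small convenience of the pathlib version is that, since Python 3.8, .rename() returns a path object pointing at the new name, which combines neatly with .with_name(). A minimal sketch, run inside a temporary directory with illustrative variable names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    old_path = Path(tmp, 'Example File 1.txt')
    old_path.write_text('Hi, mom!')

    # .rename() returns a path object for the renamed file
    new_path = old_path.rename(old_path.with_name('Example File 1 - Renamed.txt'))

    old_exists = old_path.exists()
    new_exists = new_path.exists()

print(new_path.name, '|', old_exists, '|', new_exists)
## Example File 1 - Renamed.txt | False | True
```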

8 Another Option: os.system()

The os.system() function runs a piece of terminal/command line code. This is very powerful; it means that, essentially, a Python script can do anything a shell script can do because it can execute code that can be run on a terminal!
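
For comparison, the standard library's subprocess module offers another way to run external commands, and it can capture their output as well. This page does not otherwise use it, so treat the following as a sketch only: to stay portable, the command being run here is the Python interpreter itself (via sys.executable) instead of a shell command.

```python
import subprocess
import sys

# Run a command without going through a shell
result = subprocess.run(
    [sys.executable, '-c', 'print("Hi, mom!")'],
    capture_output=True,  # collect what the command prints
    text=True,  # decode the output as a string, not bytes
)
exit_code = result.returncode
output = result.stdout.strip()

print(exit_code, '|', output)
## 0 | Hi, mom!
```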

8.1 Listing Elements in a Folder

The ls shell command (available on macOS and Linux) will list the files and folders within a folder, so it can be used like os.listdir() or Path.iterdir():

# List all files and folders in a directory
os.system('ls "Example Directory - os"')
## Example File 1.txt
## Sub-Directory

8.2 Renaming and Moving Files

Renaming and moving files (on the same filesystem these operations are the same to your computer - moving a file is just renaming part of its path!) can be done with the mv shell command as opposed to os.rename() or Path.rename():

# Change the name of a file
old_name = 'Example Directory - os/Example File 1.txt'
new_name = 'Example Directory - os/Example File 1 - Renamed.txt'
os.system(f'mv "{old_name}" "{new_name}"')

# Change the name back
os.system(f'mv "{new_name}" "{old_name}"')

8.3 Comparing Contents

A useful command that will compare the contents of two different directories is diff:

# Compare the contents of two directories
dir1 = 'Example Directory - os'
dir2 = 'Example Directory - pathlib'
os.system(f'diff "{dir1}" "{dir2}"')
## Common subdirectories: Example Directory - os/Sub-Directory and Example Directory - pathlib/Sub-Directory
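
A similar comparison can be sketched without leaving Python by using the standard library's filecmp module, which this page does not otherwise cover. The example builds two small directories inside a temporary directory, mirroring the names used above; the extra file is added purely to give the comparison something to report.

```python
import filecmp
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    dir1 = Path(tmp, 'Example Directory - os')
    dir2 = Path(tmp, 'Example Directory - pathlib')
    # Give both directories a sub-directory and an identical file
    for directory in (dir1, dir2):
        Path(directory, 'Sub-Directory').mkdir(parents=True)
        Path(directory, 'Example File 1.txt').write_text('Hi, mom!')
    # Add a file that only exists in the second directory
    Path(dir2, 'Extra File.txt').write_text('Hi, mom!')

    # Compare the two directories (top level only)
    comparison = filecmp.dircmp(dir1, dir2)
    common_dirs = comparison.common_dirs
    same_files = comparison.same_files
    only_in_dir2 = comparison.right_only

print(common_dirs, '|', same_files, '|', only_in_dir2)
## ['Sub-Directory'] | ['Example File 1.txt'] | ['Extra File.txt']
```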

9 Deleting Files and Folders

There is the os.remove() function and the Path.unlink() method for deleting files:

# Delete files
os.remove('Example Directory - os/Example File 1.txt')
os.remove('Example Directory - os/Sub-Directory/Example File 2.txt')
# Delete files
Path('Example Directory - pathlib/Example File 1.txt').unlink()
Path('Example Directory - pathlib/Sub-Directory/Example File 2.txt').unlink()

Don’t confuse this os.remove() function with the .remove() method which removes an item from a list:

ls = ['A', 'B', 'C']
ls.remove('A')

print(ls)
## ['B', 'C']

For directories, both modules use rmdir() and both will raise errors if the directories do not exist or are not empty:

# Delete directories
os.rmdir('Example Directory - os/Sub-Directory')
os.rmdir('Example Directory - os')
# Delete directories
Path('Example Directory - pathlib/Sub-Directory').rmdir()
Path('Example Directory - pathlib').rmdir()
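
The order of deletion matters, as the following sketch shows: the directory can only be removed once it is empty. It also uses the missing_ok=True option of .unlink() (Python 3.8 or later), which makes deleting an absent file a no-op. It runs inside a temporary directory and the variable names are illustrative only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp, 'Example Directory')
    directory.mkdir()
    file = Path(directory, 'Example File 1.txt')
    file.write_text('Hi, mom!')

    # A directory that still contains a file cannot be removed
    try:
        directory.rmdir()
        first_attempt = 'removed'
    except OSError:
        first_attempt = 'not empty'

    # Delete the file first, then the directory
    file.unlink()
    # Deleting the file again would normally fail; missing_ok=True suppresses the error
    file.unlink(missing_ok=True)
    directory.rmdir()
    still_exists = directory.exists()

print(first_attempt, '|', still_exists)
## not empty | False
```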
