How To Use The Python os Module walk() Method With Example

The os.walk(top, topdown=True, onerror=None, followlinks=False) function in Python's standard os module is a convenient and powerful way to traverse a directory tree. It visits the tree rooted at top either top-down or bottom-up, and for each directory it yields a 3-tuple (dir-path, dir-names, file-names).

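The elements of the yielded tuple are described below. dir-path : The path of the directory currently being visited, dir-names : A list of the names of the subdirectories in dir-path, excluding (“.” and “..”), file-names : A list of the names of the non-directory files in dir-path. The names in both lists are bare names without a path component, so join them with dir-path using os.path.join() to get a full path.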
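As a small illustration of how the three tuple elements fit together, the sketch below collects the full path of every file under a directory. The helper name list_file_paths is hypothetical and is not part of the os module.

```python
import os


def list_file_paths(top):
    """Return the full path of every file found under top (hypothetical helper)."""
    paths = []
    for dir_path, dir_names, file_names in os.walk(top):
        for file_name in file_names:
            # file_names holds bare names, so join each one with dir_path.
            paths.append(os.path.join(dir_path, file_name))
    return paths


if __name__ == "__main__":
    for path in list_file_paths("."):
        print(path)
```

The order of the returned paths depends on how the operating system lists each directory, so sort the result if you need a stable order.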

If you set the optional parameter topdown = True or do not specify it, the directory traversal is performed in a top-down manner, that is, from the parent directory to the subdirectory step by step.

If you set topdown = False, the directory is traversed in a bottom-up manner, that is, the tuple for each subdirectory is yielded before the tuple for its parent directory.
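To see the difference between the two orders, the following sketch builds a temporary tree with a single parent/child chain and records the order in which the directories are visited. The helper name visit_order and the directory names are made up for this illustration.

```python
import os
import tempfile


def visit_order(top, topdown):
    """Return directory paths relative to top, in the order os.walk() yields them."""
    return [os.path.relpath(dir_path, top)
            for dir_path, _dirs, _files in os.walk(top, topdown=topdown)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        # Build top/parent/child so there is exactly one path down the tree.
        os.makedirs(os.path.join(top, "parent", "child"))
        print("top-down :", visit_order(top, topdown=True))
        print("bottom-up:", visit_order(top, topdown=False))
```

With topdown=True the list starts at "." and ends at the deepest directory; with topdown=False the same directories appear in the reverse order.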

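If the optional parameter onerror is specified, it must be a function that accepts a single argument, an OSError instance. By default, os.walk() silently ignores errors raised while listing a directory. When onerror is set, the function is called for each such error, so you can report it and let the walk continue.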

You can raise an exception in the onerror function to stop os.walk() from continuing. In general, this parameter specifies how errors are handled during the walk.
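The two ways of using onerror can be sketched as follows. This is only one possible arrangement; the names walk_collecting_errors and raise_error are hypothetical helpers written for this example.

```python
import os


def walk_collecting_errors(top):
    """Walk top and return (directories visited, errors reported by os.walk())."""
    errors = []
    visited = []
    # errors.append is called with an OSError each time a directory cannot be listed,
    # and the walk then carries on with the remaining directories.
    for dir_path, _dirs, _files in os.walk(top, onerror=errors.append):
        visited.append(dir_path)
    return visited, errors


def raise_error(error):
    """Handler that aborts the walk by re-raising the error it receives."""
    raise error


if __name__ == "__main__":
    visited, errors = walk_collecting_errors(".")
    print(len(visited), "directories visited,", len(errors), "errors")
    for error in errors:
        print("could not list:", error.filename)
```

Passing onerror=raise_error instead makes the first listing error propagate to the caller, which stops the traversal.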

By default, os.walk() does not descend into symbolic links that point to directories. If the optional parameter followlinks = True is set, those links are followed.

Note that this may cause infinite recursion, because a symbolic link may point to one of its own parent directories, and os.walk() does not keep track of the directories it has already visited.
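One way to guard against such loops is to remember the real path of every directory already seen and remove repeated ones from dir-names before os.walk() descends into them. The sketch below is a minimal version of that idea, not a feature of os.walk() itself; the function name walk_following_links is hypothetical, and it relies on the default top-down order, in which edits made in place to the dir-names list control which subdirectories are visited.

```python
import os


def walk_following_links(top):
    """Yield os.walk() tuples with followlinks=True, skipping directories already seen."""
    # Track directories by their resolved path so a link and its target count as one.
    seen = {os.path.realpath(top)}
    for dir_path, dir_names, file_names in os.walk(top, followlinks=True):
        unvisited = []
        for name in dir_names:
            real_path = os.path.realpath(os.path.join(dir_path, name))
            if real_path not in seen:
                seen.add(real_path)
                unvisited.append(name)
        # Slice assignment edits the list in place, so os.walk() only
        # descends into the names that remain.
        dir_names[:] = unvisited
        yield dir_path, dir_names, file_names


if __name__ == "__main__":
    for dir_path, dir_names, file_names in walk_following_links("."):
        print(dir_path, len(file_names), "files")
```

A directory that is reachable both directly and through a link is visited only once, under whichever path is encountered first.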

Below is the example code. The example Python file takes the parent directory to traverse as a command-line argument. If no directory is given on the command line, it traverses the current working directory.

OsWalk.py

# import python os, sys module first.
import os, sys

# define a function which receives a directory name and walks through the directory.
def os_walk(p_dir_name):

    # walk through the directory. os.walk() ignores listing errors by default,
    # so pass onerror=print to print each OSError instead of hiding it.
    for root, dirs, files in os.walk(p_dir_name, onerror=print):

        # print out the directory name
        print("---directory---", root, "-"*10)

        # print out the sub directories.
        for dir_name in dirs:
            print("---<DIR>---", dir_name)

        # print out the files.
        for file_name in files:
            print("\t\t", file_name)
        

if __name__ == '__main__':
    
    # set default parent directory name to current directory.
    p_dir_name = '.'
    
    # get system command parameters length.
    arg_len = len(sys.argv)
    
    # if at least one command line argument was provided.
    if arg_len > 1:
        # get first command line parameter.
        p_dir_name = sys.argv[1]
    
    # invoke the os_walk method to walk through the parent directory.
    os_walk(p_dir_name)

When you run the above example with the command python OsWalk.py, you will get console output like the following. The order in which the names are listed may differ on your system.

---directory--- . ----------
---<DIR>--- csv2excel
---<DIR>--- csv2json
         FileOperateExample.py
         CheckFileExistExample.py
         OsWalk.py
         read_text_file_objgraph.png
         csv_coding_language.csv
         PDFExtract.py
         GetTextFileWordsCount.py
         CSVReadWriteExample.py
         ProfileMemoryExample.py
---directory--- ./csv2excel ----------
         employee_info_new.xlsx
         CSVExcelConvertionExample.py
         employee_info_new.csv
         employee_info.csv
---directory--- ./csv2json ----------
         json_user_info.json
         new_csv_user_info.csv
         CSVJSONConvertionExample.py
         csv_user_info.csv