Python File Operation Example

Python has a built-in open() function for reading and writing files. Working with a file takes three steps: open the file, operate on it, and close it. open() returns a file object, which is usually assigned to a variable (the file handle). The basic syntax is file = open(filename, mode).

filename: a string containing the name of the file you want to access, usually a file path. mode: a string that selects how the file is opened. The default is read-only text mode (r). In Python, any object that provides read() and write() methods can be treated as a file-like object. File objects returned by open() are closed with the close() method and can be managed by the with context manager.
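To illustrate the idea of an object that offers read() and write() without being a file on disk, here is a minimal sketch using io.StringIO from the standard library. The variable names are only illustrative.

```python
import io

# io.StringIO is an in-memory text buffer that behaves like a file object.
buffer = io.StringIO()
buffer.write("I love Python.\n")

# Move back to the start before reading what was written.
buffer.seek(0)
content = buffer.read()

# It is closed the same way as a file opened with open().
buffer.close()

print(repr(content))
```

Code written against read() and write() can usually accept either kind of object, which is handy in tests.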

The example below shows how to open a file, write text to it, and close it.

# Open a file in write mode.
f = open("/tmp/foo.txt", "w")

# Write text to the file.
f.write("I love Python.\n")

# Close the opened file object.
f.close()

1. Open File Mode.

  1. r : Read-only mode, the default. If the file does not exist, an error is raised; if it exists, it is read normally.
  2. w : Write-only mode. If the file does not exist, a new file is created; if it exists, its contents are emptied before writing.
  3. a : Append mode. If the file does not exist, a new file is created; if it exists, data is appended at the end of the file.
  4. x : Exclusive creation mode. If the file already exists, a FileExistsError is raised; otherwise a new file is created for writing. It is safer than w mode because it never overwrites an existing file.
  5. b : Binary mode, combined with another mode as in rb, wb and ab. It is usually used for binary files such as pictures and videos. Content is read and written as bytes, so you get a bytes object instead of a string. open() does not accept an encoding argument in binary mode, so you must encode strings to bytes before writing and decode bytes after reading yourself.
    # Create a string object.
    >>> text = 'i love python'
    >>> 
    # Open a file with write mode.
    >>> file = open('test.txt','w')
    >>> 
    # Write the string to the file.
    >>> file.write(text)
    13
    >>> 
    # Close the file to flush data to disk.
    >>> file.close()

    # Create a string object.
    >>> text = 'i love python'
    >>> 
    # Get a bytes object from the string.
    >>> byteObj = bytes(text, encoding='utf-8')
    >>> 
    # Open the file with wb (write binary) mode.
    >>> file = open('test.txt','wb')
    >>> 
    # Writing a str to a binary-mode file raises an exception.
    >>> file.write(text)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: a bytes-like object is required, not 'str'
    
    # Write the bytes object to the file instead.
    >>> file.write(byteObj)
    13
    >>> file.close()
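  6. + : Read-write mode, combined with another mode. r+ reads and writes an existing file without emptying it, w+ empties the file first, and a+ reads and appends.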
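The difference between the x, a and w modes can be seen in a short experiment. This is a minimal sketch that works in a temporary directory; the file name modes_demo.txt and the variable names are hypothetical.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    # "x" creates the file and fails if it already exists.
    with open(path, "x") as f:
        f.write("first line\n")

    try:
        open(path, "x")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # "a" keeps the existing content and adds to the end.
    with open(path, "a") as f:
        f.write("second line\n")
    with open(path, "r") as f:
        after_append = f.read()

    # "w" empties the file before writing.
    with open(path, "w") as f:
        f.write("replaced\n")
    with open(path, "r") as f:
        after_write = f.read()

print(second_create_failed)
print(repr(after_append))
print(repr(after_write))
```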

2. File Object Operation.

Whenever we open a file with open(), a file object is returned. This object has many built-in methods.

2.1 read(size).

Reads up to size characters (or bytes in binary mode) and returns them as a string or bytes object. size is an optional integer that specifies the amount of data to read. When size is omitted or negative, the entire content of the file is read and returned. If the file is large, do not use read() to load all of it into memory at once; instead, call read(size) repeatedly, for example read(512), until it returns an empty result.

# Open a file.
file = open("test.txt", "r")

# Read all file content as string.
content = file.read()

# Print file content string.
print(content)

# Close the file.
file.close()
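The advice to call read(size) repeatedly on large files can be sketched as a small generator. This is one possible way to do it, not the only one; read_in_chunks, big.txt and the default chunk size of 512 are illustrative choices.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=512):
    """Yield the file's content in pieces of at most chunk_size characters."""
    with open(path, "r") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # An empty string means the end of the file.
                break
            yield chunk


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "big.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("x" * 1300)

    # Only one chunk is held in memory at a time.
    chunk_lengths = [len(chunk) for chunk in read_in_chunks(path)]

print(chunk_lengths)
```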

2.2 readline().

Reads one line from the file object, including the trailing newline character '\n'. If an empty string is returned, the end of the file has been reached. This method is usually used to read and process one line at a time. It only moves forward: unless you reposition the file pointer with seek(), lines that have already been read are not read again.

# Open a file in read only mode.
file = open("test.txt", "r")

# Read one line in the file content.
line = file.readline()

# Print the one line string.
print(line)

# Close the file.
file.close()
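Because readline() returns an empty string only at the end of the file, it can drive a loop. The sketch below, with a hypothetical file lines.txt, also shows that a blank line in the file comes back as '\n' rather than as an empty string, so it does not end the loop early.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lines.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("first\n\nthird\n")

    lines_seen = []
    with open(path, "r") as f:
        while True:
            line = f.readline()
            if line == "":  # Empty string: end of file reached.
                break
            lines_seen.append(line)

print(lines_seen)
```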

2.3 readlines().

Reads all the lines in the file into a list, one line per element, and returns the list. Because readlines() loads the whole file into memory at once, it is risky for large files. Its advantage is that every line is kept in the list and can be accessed at will.

# Open file in read only mode.
file = open("test.txt", "r")

# Read all file content lines into a list.
file_content = file.readlines()

# Print the file content list.
print(file_content)

# Close the file.
file.close()

2.4 Traverse a file with a for loop.

In practice, we more often use the file object as an iterator to traverse its content. This approach is simple and does not need to read the whole file at once, but it offers less control over how much is read. Like the readline() method, it only moves forward, not backward.

# Open a file in read only mode.
file = open("test.txt", "r")

# Traverse the file content in for loop.
for line in file:
    print(line, end='')

# Close the file.
file.close()

2.5 Read file data method comparison.

  1. If the file is small, read() is the most convenient way to read all the data at once.
  2. If the file size cannot be determined, call read(size) repeatedly to limit memory use.
  3. If it is a configuration file, readlines() is the most convenient.
  4. In the general case, iterating with a for loop is usually the better choice: it is simple and reads the file lazily.

2.6 write().

The write() method writes string data (text mode) or bytes data (binary mode) to a file and returns the number of characters or bytes written. It can be called many times. Writes are buffered in memory first and are not necessarily written to the hard disk immediately. The buffer is flushed when it fills up, when flush() is called, or when close() is called. If you want to push the changes out of Python's buffer immediately, call flush(). Note that flush() does not by itself force the operating system to write the data to the physical disk; os.fsync() can be used for that.

# Open a file with write mode.
file = open("test.txt", "w")

# Write string to the file.
file.write("I love python.\n")

# Close the file, which flushes the buffered text to the hard disk.
file.close()
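The buffering behaviour can be observed by checking the file size while the file is still open. This is a rough sketch with a hypothetical file name; it pairs flush() with os.fsync(), which asks the operating system to write its own cache to the disk.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "buffered.txt")  # hypothetical file name

    writer = open(path, "w")
    writer.write("I love python.\n")

    # Push Python's buffer to the operating system.
    writer.flush()
    # Ask the operating system to write its own cache to the disk.
    os.fsync(writer.fileno())

    # The data is now in the file although it has not been closed yet.
    size_after_flush = os.path.getsize(path)

    writer.close()

print(size_after_flush)
```

The exact size depends on the platform's newline convention, so the sketch prints it rather than assuming a value.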

2.7 tell().

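Returns the current position of the file read-write pointer. In binary mode this is the number of bytes from the beginning of the file; in text mode it is an opaque number that is only meaningful when passed back to seek().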

2.8 seek().

If you want to change the position of the file pointer, use the seek(offset, from_what) method. seek() is often used together with the tell() method. In text mode, only seeks relative to the beginning of the file are allowed (apart from seek(0, 2), which jumps to the end), so the relative seeks described here require binary mode.

  1. offset : The number of bytes to move.
  2. from_what : If it is 0, the offset is counted from the beginning of the file; if it is 1, from the current position of the file read-write pointer; if it is 2, from the end of the file. The default value is 0. The official documentation calls this parameter whence.
  3. seek(x,0) : Move to x bytes after the beginning of the file.
  4. seek(x,1) : Move x bytes forward from the current position.
  5. seek(-x,2) : Move to x bytes before the end of the file.

>>> file = open("test.txt", "wb+")
>>> 
>>> file.write(b"abcdefg")
7
>>> file.tell()
7
>>> file.seek(5)
5
>>> file.read(2)
b'fg'
>>> file.seek(-1,2)
6
>>> file.read(1)
b'g'
>>> 
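In text mode, tell() and seek() are still useful as a bookmark: a value returned by tell() can be passed back to seek() to reread from that point. The following sketch uses a hypothetical file bookmark.txt and also shows that a nonzero seek relative to the current position is rejected in text mode.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bookmark.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("line one\nline two\n")

    with open(path, "r", encoding="utf-8") as f:
        f.readline()            # Skip the first line.
        bookmark = f.tell()     # Remember where the second line starts.
        first_read = f.readline()

        f.seek(bookmark)        # Jump back to the bookmark.
        second_read = f.readline()

        # Nonzero offsets relative to the current position need binary mode.
        try:
            f.seek(-1, 1)
            relative_seek_allowed = True
        except io.UnsupportedOperation:
            relative_seek_allowed = False

print(repr(first_read), repr(second_read), relative_seek_allowed)
```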

2.9 close().

Closes the file object. After processing a file, call the close() method to close it and release system resources. If you try to read from or write to the file object after it is closed, a ValueError is raised. If you forget to call close(), data still held in the buffer may not be written, so only part of the data may reach the disk and the rest may be lost.
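When close() is called by hand, a try/finally block is one common way to make sure it runs even if writing fails. The sketch below, with a hypothetical file name, also shows the closed attribute and the error raised when a closed file is used.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "close_demo.txt")  # hypothetical file name

    f = open(path, "w")
    try:
        f.write("data\n")
    finally:
        # Runs whether or not the write above succeeded.
        f.close()

    is_closed = f.closed

    # Any further I/O on the closed file object raises ValueError.
    try:
        f.write("more")
        error_message = None
    except ValueError as exc:
        error_message = str(exc)

print(is_closed)
print(error_message)
```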

3. File Data Encoding.

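When no encoding is given, the default text encoding used by open() depends on the platform and the Python version. To read a file in a different encoding, pass the encoding parameter to the open() function, for example, to read GBK encoded files.

# Read a GBK-encoded file.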
>>> file = open('test_gbk.txt', 'r', encoding='gbk')
>>> file.read()

When a file contains bytes that are not valid in the given encoding, a UnicodeDecodeError may be raised. In this case, you can provide the errors parameter, which specifies how decoding errors are handled.

# errors='ignore' skips the bytes that cannot be decoded.
>>> file = open('test_gbk.txt', 'r', encoding='gbk', errors='ignore')
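The effect of the errors parameter can be tried on a small file that mixes valid GBK text with one invalid byte. This is a minimal sketch; the file name mixed.txt and the sample bytes are made up for the demonstration, and errors='replace' is shown next to 'ignore' as an alternative that leaves a visible marker.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mixed.txt")  # hypothetical file name

    # Two Chinese characters encoded as GBK, one invalid byte, then ASCII.
    valid_part = "\u4e2d\u6587".encode("gbk")
    with open(path, "wb") as f:
        f.write(valid_part + b"\xff" + b"ok")

    # The default error handling is strict and raises an exception.
    try:
        with open(path, "r", encoding="gbk") as f:
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

    # 'ignore' drops the bad byte silently.
    with open(path, "r", encoding="gbk", errors="ignore") as f:
        ignored = f.read()

    # 'replace' substitutes the Unicode replacement character U+FFFD.
    with open(path, "r", encoding="gbk", errors="replace") as f:
        replaced = f.read()

print(strict_failed)
print(ascii(ignored))
print(ascii(replaced))
```

Silently ignoring errors loses data, so 'replace' is often easier to debug because the damaged spot stays visible.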

4. With Keyword.

The with keyword is Python's context manager mechanism. It ensures that the file is closed properly, even if an exception is raised inside the block. Under its management, there is no need to call close() explicitly.

# Open a file in write mode using the with keyword.
with open('test.txt', 'w') as file:
    # Write data to the file.
    file.write('Hello, world!')

The with statement supports opening multiple files at the same time.

# Open test1.txt file in read mode, and open test2.txt file in write mode.
with open('test1.txt') as obj1, open('test2.txt','w') as obj2:
    # Read data from test1.txt file.
    data = obj1.read()
    # Write data to the test2.txt file. 
    obj2.write(data)
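To check that with really closes the file when something goes wrong inside the block, the following sketch raises an error on purpose. The file name and the RuntimeError are hypothetical and only serve the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_demo.txt")  # hypothetical file name

    try:
        with open(path, "w") as f:
            f.write("partial\n")
            # Simulate a failure in the middle of the block.
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    # The context manager closed the file despite the exception.
    closed_after_error = f.closed

    # Closing also flushed the text written before the failure.
    with open(path, "r") as f2:
        content = f2.read()

print(closed_after_error)
print(repr(content))
```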