Reading and writing files is the most common IO operation. Python has built-in functions for reading and writing files, and their usage is compatible with C. The ability to read and write files on disk is provided by the operating system; modern operating systems do not allow ordinary programs to operate on disks directly. So, to read or write a file, a program asks the operating system to open a file object (usually called a file descriptor) and then, through the interface provided by the operating system, reads data from that file object (reading a file) or writes data to it (writing a file).
1. Read File Example.
To open a file object in file read mode, use Python’s built-in
open() function, passing in the file name and the mode identifier.
>>> f = open('/Users/jerry/hello.txt', 'r')
'r' stands for read.
If the file does not exist, the
open() function raises a FileNotFoundError (a subclass of OSError, for which IOError is an alias in Python 3) with an error code and a detailed message telling you that the file does not exist.
>>> f = open('/Users/jerry/hellooo.txt', 'r')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/Users/jerry/hellooo.txt'
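A program usually wants to handle a missing file instead of stopping with a traceback. The following is a minimal sketch of catching the error; the temporary directory and the names missing_path and error_number are assumptions made for this illustration only.

```python
import errno
import os
import tempfile

# Hypothetical path inside a fresh, empty temporary directory,
# so the file is guaranteed not to exist.
missing_path = os.path.join(tempfile.mkdtemp(), "hellooo.txt")

error_number = None
try:
    f = open(missing_path, "r")
except FileNotFoundError as exc:
    # exc.errno holds the operating system error code (ENOENT),
    # exc.filename holds the path that could not be opened.
    error_number = exc.errno
    print("Cannot open", exc.filename, "-", exc.strerror)
else:
    # Only reached if the file unexpectedly exists.
    f.close()
```

The error code in exc.errno corresponds to the "[Errno 2]" shown in the traceback above.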
If the file opens successfully, calling the
read() method reads the entire contents of the file at once. Python reads the content into memory and returns it as a string object.
>>> f.read()
'Hello, world!'
The final step is to call the
close() method to close the file. A file must be closed after use, because an open file object occupies operating system resources, and the number of files the operating system can have open at the same time is limited.
Since an IOError can occur while a file is being read or written,
f.close() may never be called if an error occurs. To make sure that the file is closed correctly whether or not something goes wrong, we can use
try... finally:
f = None  # so that the finally block works even if open() fails
try:
    f = open('/Document/python/test.txt', 'r')
    print(f.read())
finally:
    if f:
        f.close()
But it’s too verbose to do this every time, so Python introduced the
with statement to automatically call the
close() method for us.
with open('/Document/python/abc.txt', 'r') as f:
    print(f.read())
This is the same as the previous
try... finally code block, but the code is simpler, and you don't have to call the
f.close() method yourself.
Calling read() reads the entire contents of the file at once, so if the file is 10 gigabytes in size, the program will run out of memory. To be safe, you can call the
read(size) method repeatedly, reading up to size characters at a time (or size bytes in binary mode). In addition, a call to
readline() reads one line at a time, and a call to
readlines() reads all lines at once and returns a list in which each item is one line of text.
If the file is small, calling
read() once is the easiest way to get all of the content. If you cannot determine the file size, it is safer to call
read(size) repeatedly. If it is a configuration file, it is most convenient to call readlines():
for line in f.readlines():
    print(line.strip()) # Remove '\n' at the line tail.
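The advice above about read(size) and line-by-line reading can be sketched as follows. The sample file, the name sample_path, and the tiny chunk_size are assumptions chosen so that the example is self-contained; real code would read much larger chunks.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\nthird line\n")

# Read the file in fixed-size chunks. read(size) returns an empty
# string once the end of the file has been reached.
chunk_size = 8  # deliberately tiny for illustration
chunks = []
with open(sample_path, "r", encoding="utf-8") as f:
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)

# Iterating over the file object yields one line at a time,
# without building a list of all lines first.
lines = []
with open(sample_path, "r", encoding="utf-8") as f:
    for line in f:
        lines.append(line.strip())

print(len(chunks), lines)
```

Both loops keep only a small part of the file in memory at any moment, which is the point of avoiding a single read() call on a large file.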
1.1 File-like Object
Objects such as those returned by the
open() function, which have a
read() method, are called file-like objects in Python. Besides files, such an object can be an in-memory byte stream, a network stream, a custom stream, and so on. File-like objects are not required to inherit from a particular class; they only need to have a read() method.
StringIO is a file-like object created in memory, often used as a temporary buffer.
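As a small illustration of the idea, the sketch below passes a StringIO to a function that only expects an object with a read() method. The helper count_lines is a hypothetical name invented for this example.

```python
from io import StringIO

def count_lines(stream):
    """Count the lines of any object that has a read() method."""
    return len(stream.read().splitlines())

# Create an in-memory text stream with initial content and read it like a file.
buffer = StringIO("Hello\nworld\n")
line_count = count_lines(buffer)
print(line_count)  # 2

# A StringIO can also be written to; getvalue() returns everything written so far.
out = StringIO()
out.write("Hello, ")
out.write("world!")
text = out.getvalue()
print(text)  # Hello, world!
```

The same count_lines function would also accept a file object returned by open(), since it relies only on read().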
1.2 Binary File
The examples above all read text files, and they assume the files are UTF-8 encoded. To read binary files, such as images or video, open the file in 'rb' mode.
>>> f = open('/Document/Images/test.jpg', 'rb')
>>> f.read()
b'\xdd\xf8\xee\xf1\xff\x18Exif\xff\xee...' # Hexadecimal bytes
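To see what binary mode returns without needing an image on disk, here is a minimal sketch that first writes a few arbitrary bytes to a temporary file and then reads them back. The file name sample.bin and the payload bytes are assumptions for this illustration.

```python
import os
import tempfile

binary_path = os.path.join(tempfile.mkdtemp(), "sample.bin")
payload = bytes([0x00, 0x01, 0xFE, 0xFF])  # arbitrary sample bytes

# Write the raw bytes, then read them back in binary mode.
with open(binary_path, "wb") as f:
    f.write(payload)

with open(binary_path, "rb") as f:
    data = f.read()

print(type(data))   # the result is a bytes object, not a str
print(data.hex())   # hexadecimal view of the content
print(data[0])      # indexing a bytes object gives an integer
```

In binary mode no decoding takes place, so the bytes read are exactly the bytes stored in the file.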
1.3 Character Encoding
To read a text file that is not UTF-8 encoded, pass the
encoding parameter to the
open() function. For example, to read a GBK-encoded file:
>>> f = open('/Document/text/gbk_test.txt', 'r', encoding='gbk')
>>> f.read()
'开发'
You may encounter a
UnicodeDecodeError when you come across files that are not encoded properly, because the text file may contain some illegally encoded characters. For this case, the
open() function also has an
errors parameter, which indicates what to do if an encoding error is encountered. The easiest way is to just ignore the error.
>>> f = open('/Document/dev/gbk.txt', 'r', encoding='gbk', errors='ignore')
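The effect of the errors parameter can be sketched with a deliberately damaged file. The file name, the stray byte 0xFF, and the variable names below are assumptions made for this illustration; 'replace' is another standard value of errors, shown here next to 'ignore' for comparison.

```python
import os
import tempfile

# Build a GBK file that contains one byte (0xFF) that is not valid GBK.
broken_path = os.path.join(tempfile.mkdtemp(), "gbk_broken.txt")
with open(broken_path, "wb") as f:
    f.write("开发".encode("gbk") + b"\xff" + b"!")

# Default behaviour (errors='strict'): decoding fails.
strict_failed = False
try:
    with open(broken_path, "r", encoding="gbk") as f:
        f.read()
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore' silently drops the undecodable byte.
with open(broken_path, "r", encoding="gbk", errors="ignore") as f:
    ignored = f.read()

# errors='replace' substitutes U+FFFD so the damage stays visible.
with open(broken_path, "r", encoding="gbk", errors="replace") as f:
    replaced = f.read()

print(strict_failed, ignored, replaced)
```

Ignoring errors is convenient, but it loses data without any notice, so 'replace' can be the better choice when you want to see where a file was damaged.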
2. Write File Example.
Writing a file works the same way as reading a file, except that when you call the
open() function, you pass in the mode identifier
'w' to write text or
'wb' to write binary data.
>>> f = open('/Users/jerry/hello_world.txt', 'w')
>>> f.write('Hello, world!')
13
>>> f.close()
You can call
write() repeatedly to write to a file, but be sure to call
f.close() to close the file afterwards. When we write to a file, the data is not written to disk immediately; it is buffered in memory and written out later. Only when the
close() method is called is all buffered data guaranteed to be flushed. If you forget to call
close(), some of the data that has not yet been written may be lost. So it is safer to use the with statement:
with open('/Users/jerry/hello_world.txt', 'w') as f:
    f.write('Hello, world!')
To write a string to a text file in a specific encoding, pass the encoding parameter to the
open() function; the
open() function will automatically convert the string to the specified encoding.
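A minimal sketch of writing with an explicit encoding follows. The temporary file and the names encoded_path, raw, and round_trip are assumptions for this illustration.

```python
import os
import tempfile

encoded_path = os.path.join(tempfile.mkdtemp(), "gbk_out.txt")
text = "开发"

# The string is converted to GBK bytes as it is written.
with open(encoded_path, "w", encoding="gbk") as f:
    f.write(text)

# Reading in binary mode shows the bytes that were actually stored.
with open(encoded_path, "rb") as f:
    raw = f.read()

# Reading with the same encoding gives back the original string.
with open(encoded_path, "r", encoding="gbk") as f:
    round_trip = f.read()

print(raw == text.encode("gbk"), round_trip)
```

The file must later be read with the same encoding it was written in, otherwise the text will be decoded incorrectly or raise a UnicodeDecodeError.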
When you write a file in
'w' mode and the file already exists, its contents are overwritten (equivalent to deleting the file and then writing a new one). What if we want to append text to the end of the file? You can pass in
'a' mode to append text to the end of the file.
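The difference between the two modes can be sketched as follows. The temporary file and the name log_path are assumptions for this illustration.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "log.txt")

# 'w' creates the file (or truncates it if it already exists).
with open(log_path, "w") as f:
    f.write("first\n")

# Opening with 'w' again discards the previous content.
with open(log_path, "w") as f:
    f.write("second\n")

# 'a' keeps the existing content and adds to the end.
with open(log_path, "a") as f:
    f.write("third\n")

with open(log_path, "r") as f:
    content = f.read()

print(content)
```

After these steps the file contains only the "second" and "third" lines, because the second 'w' open removed "first" before anything was appended.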