In Python, concurrent programs are usually built with multi-threading or multi-processing. Because of the GIL, CPU-intensive tasks are usually handled with multiple processes, while IO-intensive tasks can be handled with threads: a thread releases the GIL while it waits on IO, so other threads can run in the meantime and the program appears to run concurrently.
For IO-intensive tasks there is one more option: the coroutine, a form of “concurrency” that runs inside a single thread. Compared with multi-threading, a big advantage of coroutines is that they avoid the cost of switching between threads, which can make them more efficient. asyncio is the asynchronous IO module in Python's standard library, and it is the basic module for working with coroutines.
Unlike thread switching, coroutine switching is controlled by the program itself rather than by the operating system, so a switch costs little more than a function call. Because the coroutines run in the same thread and switch only at explicit points, they usually do not need the locks that multi-threaded code uses to protect shared data. For IO-intensive workloads this often makes coroutines noticeably more efficient than threads.
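As a rough illustration of single-threaded concurrency with asyncio, the sketch below runs three coroutines whose waits overlap. The names fetch and main and the delays are made up for this example, and asyncio.sleep only stands in for a real IO operation. It assumes Python 3.7 or later, where asyncio.run is available.

```python
import asyncio
import time


async def fetch(name, delay):
    # asyncio.sleep stands in for a real IO wait (network, disk, ...).
    # At this await the coroutine hands control back to the event loop.
    await asyncio.sleep(delay)
    return f"{name} finished after {delay}s"


async def main():
    # gather() schedules all three coroutines on the same event loop,
    # so their waits overlap inside a single thread.
    return await asyncio.gather(
        fetch("task-a", 0.3),
        fetch("task-b", 0.2),
        fetch("task-c", 0.1),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

# Results come back in the order the coroutines were passed to gather().
for line in results:
    print(line)
print(f"total: {elapsed:.2f}s")
```

The total time should be close to the longest single delay rather than the sum of all three, because the coroutines wait at the same time.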
Since coroutines run in a single thread, how can they use a multi-core CPU? The simplest approach is to combine multiple processes with coroutines: each process runs on its own core and drives its own set of coroutines, which makes use of all cores while keeping the efficiency of coroutines.
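One possible way to combine processes and coroutines is sketched below, assuming Python 3.7 or later. The functions handle, handle_batch and run_batch are hypothetical names for this example, and the batches are placeholder data. Each worker process starts its own event loop and runs one batch of coroutines.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def handle(item):
    # Placeholder for an IO-bound step; a real program would await a
    # network or disk operation here.
    await asyncio.sleep(0.05)
    return item * 2


async def handle_batch(batch):
    # All items of one batch are awaited concurrently in one thread.
    return await asyncio.gather(*(handle(item) for item in batch))


def run_batch(batch):
    # Entry point executed in a worker process: each process starts its
    # own event loop and drives its batch of coroutines to completion.
    return asyncio.run(handle_batch(batch))


if __name__ == "__main__":
    batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    with ProcessPoolExecutor(max_workers=3) as pool:
        # map() sends one batch to each worker and keeps the input order.
        for result in pool.map(run_batch, batches):
            print(result)
```

The `if __name__ == "__main__":` guard matters here because some platforms start worker processes by importing the script again.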
If the concept of the coroutine is still unclear, a simple way to think about it is as follows.
- Process / Thread: A facility provided by the operating system for running tasks concurrently. Scheduling is done by the operating system.
- Coroutine: A technique for scheduling tasks inside a single thread, where the code itself decides when one task pauses and another resumes. Scheduling relies on the flow control written by the programmer.
In the early days, Python provided the yield keyword to create generators. Any function that contains yield is a generator function, and calling it returns a generator object.
When a generator function runs and reaches a yield, it pauses there and returns the value of the expression after yield (None if there is no expression). The function stays paused until next(generator_object) is called on the generator object again, and then continues from the yield where it last paused.
Here is an example. Open a terminal, run python3 to enter the Python interactive console, and then enter the code below.
    $ python3
    Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 16:52:21)
    [Clang 6.0 (clang-600.0.57)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> # Define a generator function that takes an integer n.
    >>> def test_func(n):
    ...     x, y = 0, 1
    ...     i = 0
    ...     while i < n:
    ...         # Execution pauses here and y is handed back to the caller.
    ...         yield y
    ...         x, y = y, x + y
    ...         i += 1
    ...
    >>> # Calling test_func(5) returns a generator object; the body has not run yet.
    >>> func_object = test_func(5)
    >>> # Each next() call resumes the generator and returns the next value of y.
    >>> next(func_object)
    1
    >>> next(func_object)
    1
    >>> next(func_object)
    2
    >>> next(func_object)
    3
    >>> next(func_object)
    5
    >>> # The loop has finished, so one more call raises StopIteration.
    >>> next(func_object)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
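In everyday code, next() is rarely called by hand. The short sketch below reuses test_func from the session above and shows that a for loop or list() calls next() internally and treats StopIteration as the normal end of the sequence.

```python
def test_func(n):
    x, y = 0, 1
    i = 0
    while i < n:
        yield y
        x, y = y, x + y
        i += 1


# A for loop calls next() behind the scenes and stops cleanly
# when the generator raises StopIteration.
for value in test_func(5):
    print(value)

# list() drains the generator the same way.
values = list(test_func(5))
print(values)  # [1, 1, 2, 3, 5]
```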
In the following example, two workers A and B take turns doing their tasks. It runs in a single thread and uses the yield keyword to imitate what would otherwise be done with multiple threads.
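    #!/usr/bin/env python
    # -*- coding:utf-8 -*-
    import time


    def task1():
        while True:
            # Pause here and hand control (and a message) back to worker B.
            yield "Worker A tired, let B work a while."
            time.sleep(1)
            print("Worker A do some work for a while...")


    def task2(t, rounds=3):
        # Advance worker A to its first yield.
        next(t)
        # A fixed number of rounds lets the loop end, so t.close() is reached.
        for _ in range(rounds):
            print("-----------------------------------")
            print("Worker B working for some time......")
            time.sleep(2)
            print("Worker B tired, let worker A do some work......")
            # Resume worker A; it runs until its next yield.
            ret = t.send(None)
            print(ret)
        t.close()


    if __name__ == '__main__':
        t = task1()
        task2(t)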
By itself, yield can only pause a function and return a value, which is not enough for a coroutine. Later, Python added the send() method to generators so that they can receive values from outside, and this turned generators into real coroutines.
Every generator has a send() method that sends data to the yield expression inside the generator. With it, yield is no longer used only in the form yield statement_expression; it can also appear in a variable assignment, for example variable_name = yield statement_expression. This form does two things at once: it pauses the function and returns a value, and when the external send() method delivers a value, it resumes the function and assigns that value to the variable.
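    # Define a generator function that assigns the result of yield to a variable.
    def simple_coroutine():
        print('-> Start coroutine.')
        b = 10
        # Pause here, return b to the caller, and wait for send().
        a = yield b
        print("-> Coroutine received a's value:", a)


    test_coroutine = simple_coroutine()
    # Advance to the first yield; ret is the yielded value 10.
    ret = next(test_coroutine)
    print(ret)
    try:
        test_coroutine.send(10)
    except StopIteration:
        pass  # raised because the function body has finished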
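A slightly larger sketch of the same send() pattern is a running average. The averager function below is a hypothetical example, not part of any library: each send() delivers a new value, and the coroutine yields the updated average back.

```python
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        # Pause here, hand the current average to the caller, and wait
        # for the next value delivered through send().
        value = yield average
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)            # advance to the first yield; returns None
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(60))  # 30.0
avg.close()
```

Because the local variables total and count survive between calls, the coroutine keeps its state without needing a class or global variables.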
A coroutine is always in one of the following four states. To get the current state, import the inspect module and call inspect.getgeneratorstate(…), which returns one of the strings below.
- ‘GEN_CREATED’ : Waiting for the coroutine to start.
- ‘GEN_RUNNING’ : The coroutine is currently being executed by the interpreter.
- ‘GEN_SUSPENDED’ : The coroutine is paused at the yield expression.
- ‘GEN_CLOSED’ : The coroutine is stopped and closed.
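The four states can be observed with a small sketch like the one below. The echo function and the states list are made up for this example. GEN_RUNNING is only visible while the body is executing, so it is recorded from inside the generator.

```python
from inspect import getgeneratorstate

states = []


def echo():
    # Recorded while the body is executing, so this is GEN_RUNNING.
    states.append(getgeneratorstate(gen))
    received = yield "ready"
    print("got:", received)


gen = echo()
states.append(getgeneratorstate(gen))  # GEN_CREATED: body not started

next(gen)                              # run up to the first yield
states.append(getgeneratorstate(gen))  # GEN_SUSPENDED: paused at yield

try:
    gen.send("hello")                  # body finishes, StopIteration is raised
except StopIteration:
    pass
states.append(getgeneratorstate(gen))  # GEN_CLOSED: nothing left to run

print(states)
```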
Because the argument of send() becomes the value of the paused yield expression, send() can be called with a value only when the coroutine is paused at a yield.
However, if the coroutine has not been activated yet (its state is ‘GEN_CREATED’) and you send it a value other than None, a TypeError is raised. Therefore, you should always call next(my_coroutine) first (or my_coroutine.send(None)) to activate the coroutine. This step is called pre-activation.
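The sketch below shows both sides of pre-activation: the TypeError raised when a value is sent too early, and one common way to avoid forgetting the activation step. The primed decorator and the collector coroutine are hypothetical helpers written for this example, not standard library functions.

```python
from functools import wraps


def primed(func):
    # Hypothetical helper: creates the generator and advances it to its
    # first yield so callers can use send() right away.
    @wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return wrapper


def collector():
    items = []
    while True:
        # Yield the list collected so far and wait for the next item.
        item = yield items
        items.append(item)


# Without pre-activation, sending a non-None value fails.
raw = collector()
try:
    raw.send("a")
except TypeError as exc:
    print("TypeError:", exc)

# With pre-activation done by the wrapper, send() works on the first call.
ready = primed(collector)()
print(ready.send("a"))  # ['a']
print(ready.send("b"))  # ['a', 'b']
```

Wrapping the activation in a decorator is a design choice: it keeps the next() call in one place, at the cost of hiding the first yielded value from the caller.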