Python Module Itertools Example

The Python standard library module itertools provides many convenient and flexible iterator functions. Knowing them well can greatly improve your productivity.

1. Infinite Iterator.

1.1 itertools.count

Creates an iterator that yields evenly spaced values, beginning at start and advancing by step. If the arguments are omitted, it counts 0, 1, 2, and so on. Below is sample source code.

count(start=0, step=1)
: import itertools

: for n in itertools.count():
...:     if 100000 < n < 100010:
...:         print(n)
...:     if n > 1000000:
...:         break
...:
100001
100002
100003
100004
100005
100006
100007
100008
100009
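
Because count() never stops on its own, it is usually paired with something that bounds it. The following is a minimal Python 3 sketch of that idea; the list of labels is made up for illustration.

```python
from itertools import count, islice

# count(start, step) is infinite, so islice() is used to take only 5 values.
evens_from_ten = list(islice(count(10, 2), 5))
print(evens_from_ten)  # [10, 12, 14, 16, 18]

# Numbering items of unknown length: zip() stops when labels runs out.
labels = ["alpha", "beta", "gamma"]  # illustrative data
numbered = list(zip(count(1), labels))
print(numbered)  # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```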

1.2 itertools.cycle

Creates an iterator that repeats the items of the given iterable indefinitely.

cycle(iterable)
: count = 0

: for c in itertools.cycle("AB"):
...:     if count > 4:
...:         break
...:     print(c)
...:     count += 1
...:
A
B
A
B
A
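
One typical use of cycle is round-robin assignment. The sketch below (Python 3) assumes hypothetical worker and task names; zip() ends the otherwise infinite cycle when the tasks run out.

```python
from itertools import cycle

workers = ["w1", "w2"]              # hypothetical worker names
tasks = ["a", "b", "c", "d", "e"]   # hypothetical task names

# zip() stops at the shorter input (tasks), so no manual counter is needed.
assignment = list(zip(tasks, cycle(workers)))
print(assignment)
# [('a', 'w1'), ('b', 'w2'), ('c', 'w1'), ('d', 'w2'), ('e', 'w1')]
```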

1.3 itertools.repeat

Creates an iterator that returns object over and over. If times is provided, the object is repeated that many times; otherwise it is returned indefinitely.

repeat(object [,times])
: for x in itertools.repeat("hello world", 5):
...:     print(x)
...:
hello world
hello world
hello world
hello world
hello world
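
repeat is also handy for supplying a constant argument to map(). This is a small Python 3 sketch, where map() stops as soon as the shortest input is exhausted.

```python
from itertools import repeat

# pow(x, 2) for each x; repeat(2) supplies the constant exponent.
squares = list(map(pow, range(5), repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]

# With times given, the iterator is finite.
three = list(repeat("x", 3))
print(three)  # ['x', 'x', 'x']
```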

2. Functional Tool.

itertools.ifilter, itertools.imap, itertools.izip.

These Python 2 functions behave like the built-in functions filter(), map(), and zip(), except that they return an iterator rather than a list. They were removed in Python 3 because the built-in functions themselves now return iterators. Note that reduce() was never part of itertools: it is a built-in function in Python 2 and lives in functools in Python 3.
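
As a rough illustration of the Python 3 behavior, the sketch below checks that the built-ins return lazy iterators and shows where reduce() is imported from. The sample numbers are arbitrary.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

evens = filter(lambda x: x % 2 == 0, numbers)
doubled = map(lambda x: x * 2, numbers)
pairs = zip(numbers, "abcde")

# Each result is a lazy, one-shot iterator: iter(obj) returns obj itself.
is_lazy = all(iter(obj) is obj for obj in (evens, doubled, pairs))
print(is_lazy)  # True

evens_list = list(evens)
doubled_list = list(doubled)
print(evens_list)    # [2, 4]
print(doubled_list)  # [2, 4, 6, 8, 10]
print(next(pairs))   # (1, 'a')

# reduce() is not in itertools; in Python 3 it is imported from functools.
total = reduce(lambda acc, x: acc + x, numbers)
print(total)  # 15
```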

2.1 itertools.ifilterfalse

ifilterfalse(function or None, sequence)

In Python 3 it is:

filterfalse(function or None, sequence)

The opposite of filter: only the items in the sequence for which function(item) is false are yielded.

: for elem in itertools.ifilterfalse(lambda x: x > 5, [2, 3, 5, 6, 7]):
....:     print(elem)
....:
2
3
5
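
For Python 3 readers, the same idea with filterfalse looks roughly like this. The helper is_large is a name introduced here for illustration only.

```python
from itertools import filterfalse

def is_large(x):
    """Illustrative predicate: true for values above 5."""
    return x > 5

values = [2, 3, 5, 6, 7]

large = list(filter(is_large, values))       # items where the predicate is true
small = list(filterfalse(is_large, values))  # items where it is false
print(large)  # [6, 7]
print(small)  # [2, 3, 5]

# With None as the predicate, filterfalse keeps the falsy items.
falsy = list(filterfalse(None, [0, 1, "", "a", None]))
print(falsy)  # [0, '', None]
```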

2.2 itertools.izip_longest

izip_longest(iter1 [,iter2 [...]], [fillvalue=None])

In Python 3 it is:

zip_longest(iter1 [,iter2 [...]], [fillvalue=None])

Similar to zip, but iteration continues until the longest iterable is exhausted, and fillvalue is used to fill in the missing values of the shorter iterables.

: for item in itertools.izip_longest('abcd', '12', fillvalue='-'):
....:     print(item)
....:
('a', '1')
('b', '2')
('c', '-')
('d', '-')
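
The difference from the plain zip() is easiest to see side by side. A short Python 3 sketch using the same inputs as above:

```python
from itertools import zip_longest

# zip_longest pads the shorter input with fillvalue.
padded = list(zip_longest("abcd", "12", fillvalue="-"))
print(padded)  # [('a', '1'), ('b', '2'), ('c', '-'), ('d', '-')]

# The built-in zip stops at the shortest input instead.
truncated = list(zip("abcd", "12"))
print(truncated)  # [('a', '1'), ('b', '2')]
```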

2.3 itertools.starmap

starmap(function, sequence)

Calls function with each element of the sequence unpacked as its argument list, that is, function(*item), and returns an iterator over the results. This only works if every item produced by the iterable can itself be unpacked into the function's arguments.

: seq = [(0, 5), (1, 6), (2, 7), (3, 3), (3, 8), (4, 9)]

: for item in itertools.starmap(lambda x, y: (x, y, x*y), seq):
...:     print("%d * %d = %d" % item)
...:
0 * 5 = 0
1 * 6 = 6
2 * 7 = 14
3 * 3 = 9
3 * 8 = 24
4 * 9 = 36
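
starmap(f, seq) can be read as a lazy version of [f(*args) for args in seq]. A minimal Python 3 sketch with arbitrary sample pairs:

```python
from itertools import starmap

seq = [(2, 5), (3, 2), (10, 3)]  # (base, exponent) pairs, illustrative data

powers = list(starmap(pow, seq))
print(powers)  # [32, 9, 1000]

# The equivalent list comprehension unpacks each tuple explicitly.
same = [pow(*args) for args in seq]
print(powers == same)  # True
```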

2.4 itertools.dropwhile

dropwhile(predicate, iterable)

Creates an iterator that discards items from the iterable as long as predicate(item) is true. Once the predicate returns false for the first time, that item and all subsequent items are yielded, and the predicate is not checked again.

: for item in itertools.dropwhile(lambda x: x < 1, [-1, 0, 1, 2, 3, 4, 1, -2]):
...:     print(item)
...:
1
2
3
4
1
-2
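
A practical case is skipping a leading block of comment lines. The sketch below (Python 3) uses made-up lines; note that a later comment line is kept, because the predicate is no longer consulted once it has failed.

```python
from itertools import dropwhile

# Illustrative input: two header comments, then data with a stray comment.
lines = ["# comment", "# another", "data 1", "# not dropped", "data 2"]

body = list(dropwhile(lambda line: line.startswith("#"), lines))
print(body)  # ['data 1', '# not dropped', 'data 2']
```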

2.5 itertools.takewhile

takewhile(predicate, iterable)

It’s the opposite of dropwhile. Creates an iterator that yields items from the iterable as long as predicate(item) is true; as soon as the predicate returns false, the iteration stops.

: for item in itertools.takewhile(lambda x: x < 2, [-1, 0, 1, 2, 3, 4, 1, -2]):
....:     print(item)
....:
-1
0
1
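
takewhile should not be confused with filter: it stops at the first failure instead of testing every item. A short Python 3 comparison on the same data:

```python
from itertools import takewhile

data = [-1, 0, 1, 2, 3, 4, 1, -2]

# Stops at the first item (2) for which the predicate is false.
taken = list(takewhile(lambda x: x < 2, data))
print(taken)  # [-1, 0, 1]

# filter() tests every item, so the trailing 1 and -2 are kept as well.
filtered = list(filter(lambda x: x < 2, data))
print(filtered)  # [-1, 0, 1, 1, -2]
```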

3. Combination Tools.

3.1 itertools.chain

chain(*iterables)

Concatenates several iterables into one longer iterator.

: for c in itertools.chain('ABC', 'XYZ'):
...:     print(c)
...:
A
B
C
X
Y
Z
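
When the iterables are already collected in a single container, chain.from_iterable avoids unpacking them with *. A minimal Python 3 sketch with sample data:

```python
from itertools import chain

# Flatten one level of nesting; empty sublists are simply skipped.
nested = [[1, 2], [3], [], [4, 5]]
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]

# chain accepts any mix of iterable types.
mixed = list(chain("AB", [1, 2], (3.0,)))
print(mixed)  # ['A', 'B', 1, 2, 3.0]
```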

3.2 itertools.product

product(*iterables, repeat=1)

Creates an iterator over the Cartesian product of the input iterables. The repeat parameter specifies how many times the whole sequence of input iterables is repeated, so product(A, B, repeat=2) is the same as product(A, B, A, B).

: for elem in itertools.product((1, 2), ('a', 'b')):
...:     print(elem)
...:
(1, 'a')
(1, 'b')
(2, 'a')
(2, 'b')

: for elem in itertools.product((1, 2), ('a', 'b'), repeat=2):
...:     print(elem)
...:
(1, 'a', 1, 'a')
(1, 'a', 1, 'b')
(1, 'a', 2, 'a')
(1, 'a', 2, 'b')
(1, 'b', 1, 'a')
... (16 tuples in total; the remaining 11 are omitted here)
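
The repeat parameter is often used with a single iterable, for example to enumerate all fixed-length strings over an alphabet. A small Python 3 sketch, which also checks the equivalence with writing the inputs out twice:

```python
from itertools import product

# All 3-character strings over the alphabet "01".
bits = ["".join(p) for p in product("01", repeat=3)]
print(bits)
# ['000', '001', '010', '011', '100', '101', '110', '111']

# repeat=2 is the same as passing the inputs twice.
a, b = (1, 2), ("a", "b")
repeated = list(product(a, b, repeat=2))
explicit = list(product(a, b, a, b))
print(repeated == explicit, len(repeated))  # True 16
```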

3.3 itertools.permutations

permutations(iterable[, r])

Returns an iterator over all possible orderings of r elements taken from the iterable, each as a tuple. If r is not specified, it defaults to the length of the iterable, so full-length permutations are generated.

: for elem in itertools.permutations('abc', 2):
...:     print(elem)
...:
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')

: for elem in itertools.permutations('abc'):
...:     print(elem)
...:
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
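
The number of results is n! / (n - r)!, and elements are treated as distinct by position rather than by value. A minimal Python 3 sketch of both points, using arbitrary sample strings:

```python
from itertools import permutations
from math import factorial

items = "abcd"
n, r = len(items), 2

perms = list(permutations(items, r))
expected = factorial(n) // factorial(n - r)
print(len(perms), expected)  # 12 12

# Equal values at different positions still produce separate tuples.
dupes = list(permutations("aab", 2))
print(len(dupes), len(set(dupes)))  # 6 3
```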

3.4 itertools.combinations

combinations(iterable, r)

Similar to permutations, but order does not matter: if the iterable is "abc" and r is 2, ab and ba count as the same combination, and only ab is returned.

: for elem in itertools.combinations('abc', 2):
....:     print(elem)
....:
('a', 'b')
('a', 'c')
('b', 'c')
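
The number of combinations is the binomial coefficient C(n, r). The sketch below assumes Python 3.8 or later, where math.comb is available; the names in the list are invented.

```python
from itertools import combinations
from math import comb  # assumption: Python 3.8+

team = ["ann", "bob", "cy", "dee"]  # illustrative names

pairs = list(combinations(team, 2))
print(pairs[0])                 # ('ann', 'bob')
print(len(pairs), comb(4, 2))   # 6 6
```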

3.5 itertools.combinations_with_replacement

combinations_with_replacement(iterable, r)

Similar to combinations, but an element may be chosen more than once: if the iterable is "abc" and r is 2, the result additionally contains aa, bb, and cc.

: for elem in itertools.combinations_with_replacement('abc', 2):
....:     print(elem)
....:
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'b')
('b', 'c')
('c', 'c')
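
As a quick illustration, the unordered outcomes of rolling two dice can be enumerated this way. This is a Python 3 sketch; the dice scenario is only an example.

```python
from itertools import combinations_with_replacement

faces = range(1, 7)  # faces of a six-sided die

# Unordered outcomes of two dice: (1, 2) and (2, 1) count once.
rolls = list(combinations_with_replacement(faces, 2))
print(len(rolls))           # 21
print(rolls[0], rolls[-1])  # (1, 1) (6, 6)
```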

4. Other Tools.

4.1 itertools.compress

compress(data, selectors)

Works like boolean-mask selection: an element of data is kept only when the element at the same position in selectors is true; otherwise it is dropped. Iteration stops when either data or selectors is exhausted.

: list(itertools.compress('abcdef', [1, 1, 0, 1, 0, 1]))
: ['a', 'b', 'd', 'f']

: list(itertools.compress('abcdef', [True, False, True]))
: ['a', 'c']
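
The selectors do not have to be a literal list; they can be computed from a second sequence. A minimal Python 3 sketch with made-up names and scores:

```python
from itertools import compress

names = ["ann", "bob", "cy", "dee"]   # illustrative data
scores = [82, 47, 91, 60]             # illustrative data

# Keep each name whose score at the same position is at least 60.
passed = list(compress(names, (s >= 60 for s in scores)))
print(passed)  # ['ann', 'cy', 'dee']
```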

4.2 itertools.groupby

groupby(iterable[, keyfunc])

Groups consecutive items of the iterable. keyfunc computes a key for each item, and consecutive items with the same key fall into one group. If keyfunc is not specified, items are grouped by their own value, so only runs of identical adjacent items are grouped. The result is an iterator of (key, sub-iterator) pairs.

: for key, value_iter in itertools.groupby('aaabbbaaccd'):
....:     print(key, list(value_iter))
....:
a ['a', 'a', 'a']
b ['b', 'b', 'b']
a ['a', 'a']
c ['c', 'c']
d ['d']

: data = ['a', 'bb', 'cc', 'ddd', 'eee', 'f']

: for key, value_iter in itertools.groupby(data, len):
....:     print(key, list(value_iter))
....:
1 ['a']
2 ['bb', 'cc']
3 ['ddd', 'eee']
1 ['f']

Note: Sort the data by the same key before grouping, because groupby only compares adjacent elements. The second example shows this: 'a' and 'f' both have length 1 but are not adjacent, so they end up in separate groups.
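
The sort-then-group pattern can be sketched as follows in Python 3, reusing the data from the second example. Because sorted() is stable, items keep their original order inside each group.

```python
from itertools import groupby

data = ['a', 'bb', 'cc', 'ddd', 'eee', 'f']

# Sort by the grouping key first so that equal keys become adjacent.
ordered = sorted(data, key=len)
grouped = {key: list(group) for key, group in groupby(ordered, key=len)}
print(grouped)
# {1: ['a', 'f'], 2: ['bb', 'cc'], 3: ['ddd', 'eee']}
```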

4.3 itertools.islice

islice(iterable, [start,] stop [, step])

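Returns an iterator over a slice of the iterable, like list slicing but lazy. start is the first index (default 0), stop is the index at which to stop (exclusive), and step is the stride (default 1); start and step are optional. Negative values are not supported.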

: list(itertools.islice([10, 6, 2, 8, 1, 3, 9], 5))
: [10, 6, 2, 8, 1]


: list(itertools.islice(itertools.count(), 3, 10))
: [3, 4, 5, 6, 7, 8, 9]
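
Two further points are worth seeing in code: the step argument, and the fact that islice consumes the underlying iterator. A short Python 3 sketch:

```python
from itertools import count, islice

# start=0, stop=20, step=3 on an infinite counter.
every_third = list(islice(count(), 0, 20, 3))
print(every_third)  # [0, 3, 6, 9, 12, 15, 18]

# islice advances the source iterator; the rest is still available afterwards.
letters = iter("abcdefg")
head = list(islice(letters, 3))
rest = list(letters)
print(head)  # ['a', 'b', 'c']
print(rest)  # ['d', 'e', 'f', 'g']
```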

4.4 itertools.tee

tee(iterable, n=2)

Creates n independent iterators from a single iterable and returns them as a tuple.

: itertools.tee("abcedf")
: (<itertools.tee at 0x7fed7b8f59e0>, <itertools.tee at 0x7fed7b8f56c8>)

: iter1, iter2 = itertools.tee("abcedf")

: list(iter1)
: ['a', 'b', 'c', 'e', 'd', 'f']

: list(iter2)
: ['a', 'b', 'c', 'e', 'd', 'f']
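
A common use of tee is to walk over the same stream at two offsets. The pairwise helper below is a sketch written for this article (Python 3), not a function quoted from the original text. After calling tee, the original iterator should no longer be used directly, since advancing it would not be seen by the copies.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one item
    return zip(first, second)

pairs = list(pairwise([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```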

: itertools.tee("abcedf", 3)
:
(<itertools.tee at 0x7fed7b8f5cf8>,
<itertools.tee at 0x7fed7b8f5cb0>,
<itertools.tee at 0x7fed7b8f5b00>)