= range(1000)
rg print(rg, type(rg))
range(0, 1000) <class 'range'>
range
revisitedWhat does range
return?
# Tip: you can separate digits with _ for readability (Python ignores the _)
for i in range(1_000_000_000):
print(i)
if i == 3:
break
We can see that it is iterable, perhaps a list or tuple?
How much memory does a list of 1 billion integers take?
(4 bytes/int) * 1B = 4GB of RAM?!
Keep in mind, we’re only using one of these integers at a time!
This wasteful behavior was indeed the case in Python 2, range
had to allocate all of the needed memory at once.
Fortunately in Python 3, a range is lazily-evaluated, but how?
= range(1000)
rg print(rg, type(rg))
range(0, 1000) <class 'range'>
Generators are a special type of object that is iterable, but not a collection.
A generator object yields a single result at a time, which means we can use far less memory in most cases, especially when we won’t use the entire iterable.
Today we’ll see two ways to write generators: generator functions, and generator expressions.
When we introduced comprehensions we mentioned that they only made sense to use if you are planning to use the entire iterable.
# this allocates a list of 100 integers
= [2**x for x in range(100)]
powers_of_two
# this is a wasteful way to use a list comprehension, only one value is needed at a time
for num in powers_of_two:
print(num)
if num > 1_000_000:
break
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
There is one more expression that uses the comprehension-style syntax, and that is the generator expression.
# parenthesis () instead of square brackets [], creates a generator expression
= (2**x for x in range(100))
powers_of_two_gen
# what is this type?
print(powers_of_two_gen)
print(type(powers_of_two_gen))
# iteration works as you'd expect
for num in powers_of_two_gen:
print(num)
if num > 1_000_000:
break
<generator object <genexpr> at 0x1086460c0>
<class 'generator'>
1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
The generator expression creates a generator object, which, like range
, is iterable, but not a collection.
Iteration in Python is achieved via a protocol, one or more special methods that describe how a given type can be used in a certain context.
We’ll explore these protocols in more detail later in the course.
Iteration depends on two functions: iter()
and next()
.
iter()
takes an iterable and returns an iterator, a kind of object we usually don’t interact with directly.
Any iterable can be passed into next()
, which will return the next value, or raise a StopIteration
error if the iterable is exhausted.
for i in iterable:
print(i)
# can be rewritten as
= iter(iterable)
iterator while True:
try:
= next(iterator)
i print(i)
except StopIteration:
# this break only runs when next runs out
break
# we usually don't see the iterator!
While we usually don’t see the iterator, a generator is both iterable and iterator, so we can use next
on them:
= (3**n for n in range(5))
power_of_three print(next(power_of_three))
print(next(power_of_three))
print(next(power_of_three))
print(next(power_of_three))
print(next(power_of_three))
# one more call would raise an error!
try:
print(next(power_of_three))
except StopIteration:
print("done!")
1
3
9
27
81
done!
While generator expressions are one way to create simple generators, they have the same limitations that comprehensions have: they can have a single expression for each input, and use a single conditional.
We can also create generators by writing generator functions. These are special functions that return the generator type.
Note that this is the same type as the generators we’ve already introduced, just a different way to create them.
A function that contains the yield
keyword, is automatically a generator function. Let’s rewrite one of our earlier examples as a function:
def powers_of_2(limit):
for n in range(limit):
yield 2**n
Calling this function returns a generator:
= powers_of_2(10)
p2g print(p2g)
print(type(p2g))
<generator object powers_of_2 at 0x108645560>
<class 'generator'>
We can iterate over it with a for loop:
for x in p2g:
print(x)
1
2
4
8
16
32
64
128
256
512
All generators can only be consumed once, so an attempt to call next()
leads to an error:
try:
print(next(p2g))
except Exception as e:
print(repr(e))
StopIteration()
Generator functions do give us an option that expressions did not, we can create a fresh generator by calling the function again:
= powers_of_2(10)
g1 = powers_of_2(10)
g2
# g1 and g2 are independent, we can move one forward,
# and it doesn't affect the other
print(" g1", next(g1))
print(" g1", next(g1))
print(" g1", next(g1))
print(" g2", next(g2))
print(" g2", next(g2))
print(" g1", next(g1))
g1 1
g1 2
g1 4
g2 1
g2 2
g1 8
Generator functions can contain multiple yield
, each yield
pauses execution of the generator. The internal state of the function remains intact, unlike return.
def simple_generator_func():
print("start")
yield 1
print("still running")
yield 2
print("one more")
yield 3
= simple_generator_func()
g
print("got (first)", next(g))
print("--before loop--")
for x in g:
print("got", x)
start
got (first) 1
--before loop--
still running
got 2
one more
got 3
Generators do not need to ever actually exit, let’s create an abstraction for our powers-of-n generators:
def powers_of_n(n):
= 0
i while True:
yield n ** i
+= 1 i
= powers_of_n(5)
pow5
print(next(pow5))
print(next(pow5))
print(next(pow5))
print(next(pow5))
print(next(pow5))
1
5
25
125
625
This code works because we’re only asking for one more iteration of the loop, each time we call next
the code in the function runs until it hits the next yield
.
We can also use this in a loop if we’re careful to break!:
for x in powers_of_n(7):
print(x)
# it is up to the calling loop to break, without this we run forever!
if x > 10000:
break
1
7
49
343
2401
16807
range
# there is a third optional parameter to range, the *step*
for x in range(3, 10, 2):
print(x)
3
5
7
9
def new_range(lower_bound, upper_bound, step=1):
pass