… and how we might fix it.

This is NOT a hate post. Let me say that again: this is NOT a hate post! Python is a fine language, and it was my first programming language. Like the initial stages of falling in love with someone, you can’t see the flaws; the same was true with Python :) But as time goes by and the initial limerence fades, you start seeing the flaws. Nobody is free of flaws, and neither are programming languages (not even Lisp).

The pythonic way of doing map and filter

List comprehensions are the pythonic way of doing this.

def inc(x):
    return x + 1

>>> list(map(inc, range(10)))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# pythonic way
>>> [inc(i) for i in range(10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def is_even(x):
    return x % 2 == 0

>>> list(filter(is_even, range(10)))
[0, 2, 4, 6, 8]

# pythonic way
>>> [i for i in range(10) if is_even(i)]
[0, 2, 4, 6, 8]

There are several benefits to this: list comprehensions are highly optimized, and they avoid some of the pitfalls of map and filter.

TL;DR: if you don’t care about using map and filter, use list comprehensions.

But since this post is about map and filter, let’s focus on that.

What is a value?

To understand this post you need to understand what values are. In simple terms, a value is an entity that can be compared, has a magnitude, and is immutable.

How can you tell if something is a value? Easy, just ask whether it is immutable.

Is a string a value? yes, it’s immutable.

Is a tuple a value? yes, it’s immutable.

Is a list a value? no, it’s mutable.

Is an iterator a value? hmm… this one is hard, let’s try it out.

>>> a = iter((1,2,3))
>>> next(a)
1
>>> next(a)
2
>>> next(a)
3

Each time you call next on a, the return value changes, so I’d say it’s not a value.
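
Equality backs this up: values that are equal compare equal, but two iterators over the same data do not.

>>> (1, 2, 3) == (1, 2, 3)
True
>>> iter((1, 2, 3)) == iter((1, 2, 3))
False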

Programming with values is very beneficial; see Rich Hickey’s talk, The Value of Values.

Problem #1: map and filter return an iterator.

>>> res = map(inc, range(10))
# let's check if it worked 
>>> list(res)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# let's filter all even integers from res
>>> list(filter(is_even, res))
[]

That is unexpected. If you’re an experienced Pythonista, you probably know what went wrong, and this is expected (because you have learned to expect it).

Here’s why it is wrong to expect this. If we used list comprehensions, we wouldn’t run into this.

>>> res = [inc(i) for i in range(10)]

# let's check if it worked 
>>> res
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# let's filter all even integers from res
>>> [i for i in res if is_even(i)]
[2, 4, 6, 8, 10]

# res is a plain list, so unless you mutate
# it directly, you can keep using it.

I was simplifying a bit earlier: map and filter return an iterator, and an iterator can only be consumed once. list(res) exhausts the iterator, so res is empty afterwards.

>>> res = map(inc, range(10))

# res is an iterator here
>>> list(res)  
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# list(res) exhausts the iterator 
# so you're filtering an empty iterator here
# so you get an empty list
>>> list(filter(is_even, res))
[]

A workaround

You can immediately realize the iterator and store that result.

>>> res = list(map(inc, range(10)))

>>> res
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# works fine!
>>> list(filter(is_even,res))
[2, 4, 6, 8, 10]

But you lose the laziness of map and filter, and it’s inconvenient to wrap every call in list(map(...)).
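
To be fair, here is what that laziness buys you: you can map over an unbounded sequence, something an eager list(map(...)) could never do. (count and islice are from the standard library’s itertools.)

>>> from itertools import count, islice

# count() is an infinite sequence starting from 0;
# eagerly evaluating map(inc, count()) would never terminate
>>> list(islice(map(inc, count()), 5))
[1, 2, 3, 4, 5]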

Problem #2: map and filter are lazy.

>>> filter(is_even, [1,2,3])        
<filter object at 0x0000018B347B0EB0>

When you call filter here, you’re creating a filter object; you are not computing the result. The result is only computed when it’s absolutely required. This is laziness, and it’s quite common in functional programming. Here’s why it’s a problem in Python.

>>> a = [1,2,3,4]
>>> res = filter(is_even, a)
>>> a.append(10)
>>> a.append(12)

What do you think the result of the filtering will be? What would you get if you did list(res)? I want you to think long and hard.

Here’s the answer

>>> list(res)
[2, 4, 10, 12]

Most people would’ve guessed the answer correctly, but that’s not the tough part. I want you to think about what you meant when you wrote this. What did you really mean?

>>> res = filter(is_even, a)

I definitely meant: filter the value of a, which was [1,2,3,4]. This can result in bugs that are hard to track down, and more importantly, it makes your code hard to reason about.

Remember this, always -

Laziness coupled with mutability is a recipe for setting your hair on fire.

There is a reason that most functional languages have immutability baked in. You can only postpone evaluation of an expression when you can guarantee that the arguments to the expression will mean the same thing EVERY TIME.

In this case, the result of filter(is_even, a) depends on when we realize the iterator. It is dependent on time.

>>> a = [1,2,3,4]
>>> res = filter(is_even,a) 
>>> a.append(10)
>>> a.append(12)
>>> a.append(14) 
>>> a.append(16) 
>>> list(res)
[2, 4, 10, 12, 14, 16]

It’s exactly the same line of code, but the result changes. Here’s another way to think of it.

Your future actions affect the results of your past actions. We are essentially changing the past, which makes it extremely difficult to reason about your code.

I’ll quickly show you a Clojure example (don’t worry, it looks almost like Python).

user=> (def a [1,2,3,4]) ; equivalent to a = [1,2,3,4]
#'user/a
user=> (def res (filter even? a)) ; even? = is_even
#'user/res
user=> (def a (concat a [10])) ; concat returns a new sequence; a is rebound, not mutated
#'user/a
user=> (def a (concat a [12])) 
#'user/a
user=> res  
(2 4) ; isn't this what you expected?

user=> a ; proof that a is something else
(1 2 3 4 10 12)

filter is lazy in Clojure, and yet you get the correct result, i.e. the filtering of [1,2,3,4], not [1,2,3,4,10,12].

You can’t change the past. You can see why time-travel might be a bad idea 😂
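
Interestingly, you can get the Clojure behaviour out of Python’s own lazy filter by making the input immutable and rebinding instead of mutating, which pins the blame squarely on mutation:

>>> a = (1, 2, 3, 4)          # a tuple, i.e. a value
>>> res = filter(is_even, a)  # python's built-in lazy filter
>>> a = a + (10, 12)          # rebinds a; the original tuple is untouched
>>> list(res)
[2, 4]
>>> a
(1, 2, 3, 4, 10, 12)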

Just to remind you: list comprehensions solve these problems.
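
A comprehension evaluates on the spot, so your future actions can’t reach back into it:

>>> a = [1, 2, 3, 4]
>>> res = [i for i in a if is_even(i)]  # evaluated right now
>>> a.append(10)
>>> res
[2, 4]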

How can we fix map and filter?

To fix the problems, we must:

  1. return a value, not an iterator
  2. eliminate laziness or make sure mutability doesn’t affect the return value.

Fixing problem #1 is as easy as returning a list or a tuple. Fixing problem #2 is harder: if we want the return value to be unaffected by mutation while keeping some laziness, we need to deepcopy the input iterable.

Here’s one way.

from copy import deepcopy

class filter:
    def __init__(self, fn, iterable):
        self.fn = fn
        # snapshot the input so later mutations can't affect the result
        self.iterable = deepcopy(iterable)

    def __iter__(self):
        # __iter__ must return an iterator, not a list
        return (i for i in self.iterable if self.fn(i))
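
With the snapshot in place, mutating the original list no longer changes the result, and the result can be iterated more than once:

>>> a = [1, 2, 3, 4]
>>> res = filter(is_even, a)  # a is deep-copied here
>>> a.append(10)
>>> list(res)
[2, 4]
>>> list(res)
[2, 4]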

But laziness isn’t only about delaying computation; it is also about computing results only when they are required.

user=> (take 10 (map inc (range)))
(1 2 3 4 5 6 7 8 9 10)

Here (range) returns an infinite sequence starting from 0. Since map is lazy, it only computes the first ten elements.

So the deepcopy in my filter implementation means that my implementation isn’t fully lazy. The only upside to this implementation is when the filtering function is expensive.
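
To make that upside concrete, here’s a minimal sketch with a hypothetical slow_is_even standing in for an expensive predicate: building the filter costs nothing, and you only pay for the elements you actually consume.

import time

def slow_is_even(x):
    time.sleep(1)  # stand-in for an expensive check
    return x % 2 == 0

res = filter(slow_is_even, range(100))  # constructing it is instant
next(iter(res))  # only slow_is_even(0) runs: ~1 second, not ~100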

Using eager evaluation

I think the most pragmatic solution would be to eagerly evaluate map and filter.

def map(fn, *iterables):
    # eager: the whole list is built up front, so the result is a value
    return [fn(*i) for i in zip(*iterables)]

def filter(fn, iterable):
    # eager: evaluated immediately, so later mutation of iterable can't change it
    return [i for i in iterable if fn(i)]

The benefit of this is that it can work as a drop-in replacement for Python’s default map and filter, and if the iterable is hashable we can even add an lru_cache to these functions. But lists are the most commonly used containers and they’re not hashable, so maybe not that big of a benefit?
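
To sketch the caching idea (cached_map and cached_filter are hypothetical names of mine, and this assumes tuple inputs):

from functools import lru_cache

@lru_cache(maxsize=None)
def cached_map(fn, iterable):
    # both arguments must be hashable: a named function and a tuple work
    return tuple(fn(i) for i in iterable)

@lru_cache(maxsize=None)
def cached_filter(fn, iterable):
    return tuple(i for i in iterable if fn(i))

>>> cached_filter(is_even, (1, 2, 3, 4))
(2, 4)

Returning a tuple rather than a list matters here: the cache hands back the same object on every hit, and a tuple (a value!) can be shared safely.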

Why should you use this over a list comprehension?

This is a matter of personal opinion: after doing functional programming for a long time, I find maps and filters more readable than list comprehensions. They are more concise, and using lambda functions in list comprehensions can be confusing; see this stack overflow answer.

A compromise

There might be some rare cases where the user wants to iterate over an infinite sequence, or a huge one, where laziness is necessary. In that case we can define a lazymap and a lazyfilter. In my opinion, making the default case eager and forcing the user to explicitly reach for the lazy version when needed is better; it would reduce surprises when a novice uses map and filter.

Can we do better than Python’s default lazy implementation?

Indeed we can.

class lazymap:
    def __init__(self, fn, *iterables):
        self.fn = fn
        self.iterables = iterables

    def __iter__(self):
        # a fresh generator on every call to iter()
        return (self.fn(*i) for i in zip(*self.iterables))

class lazyfilter:
    def __init__(self, fn, iterable):
        self.fn = fn
        self.iterable = iterable

    def __iter__(self):
        return (i for i in self.iterable if self.fn(i))

Here’s why it is better. Let’s define take.

# taken from functionali
from typing import Iterable, Tuple

def take(n: int, iterable: Iterable) -> Tuple:
    """Returns the first n elements of iterable.
    Returns an empty tuple if iterable is empty.
    >>> take(3, [1,2,3,4,5])
    (1, 2, 3)
    """
    it = iter(iterable)
    accumulator = []
    i = 1
    while i <= n:
        try:
            accumulator.append(next(it))
            i += 1
        except StopIteration:
            break
    return tuple(accumulator)

Now let’s see an example with the default Python implementation.

>>> res = map(inc, range(100))
>>> take(5, res)                 
(1, 2, 3, 4, 5)
>>> take(5, res)
(6, 7, 8, 9, 10)

You don’t get the same result, even though it looks like you’re evaluating the same expression.

Here’s the same thing with lazymap.

>>> res = lazymap(inc, range(100))

>>> take(5, res)
(1, 2, 3, 4, 5)
>>> take(5, res)
(1, 2, 3, 4, 5)
>>> take(5, res)
(1, 2, 3, 4, 5)
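
The trick is that take calls iter() on its argument, and lazymap’s __iter__ builds a fresh generator on every call, so each take starts over from the beginning:

>>> res = lazymap(inc, range(100))
>>> it1, it2 = iter(res), iter(res)  # two independent generators
>>> next(it1), next(it1), next(it2)
(1, 2, 1)

(This relies on the stored iterable, range(100) here, being re-iterable; if you hand lazymap an iterator instead, it will still be exhausted after one pass.)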

You always get the same result, just like you would in Clojure or any other functional programming language.

user=> (def res (map inc (range 100)))
#'user/res
user=> (take 5 res)
(1 2 3 4 5)
user=> (take 5 res)
(1 2 3 4 5)

Shameless self-plug

If you liked the ideas in this post and would like to use them, functionali implements these functions.

Could this be a PEP/PR?

If you’re a member of the core Python team and think the default Python implementation of map and filter could be changed to something similar to lazymap and lazyfilter, please let me know; I’d love to make a PR/PEP. As far as I can tell, there should be no breaking changes. Do reach out to me on twitter :)

Closing thoughts.

There is an overarching theme in this post: how mutability makes things harder to reason about. A lot of languages are mutable by default because they lean towards object-oriented programming. Even though Python supports multiple paradigms, it heavily favors OOP and an imperative style. The design decisions behind Python were driven by a desire to make code readable, and an imperative/OOP style is more readable. The underlying assumption was that code that is easy to read is easy to reason about. That is definitely true; you can’t reason about code written in brainfuck. But readability isn’t the only factor that makes code easy to reason about.

This is very readable but hard to reason about.

>>> a = [1,2,3,4]
>>> res = filter(is_even,a) 
>>> a.append(10)
>>> a.append(12)
>>> a.append(14) 
>>> a.append(16) 
>>> list(res)
[2, 4, 10, 12, 14, 16]

Besides readability, immutability is another major factor that makes code easy to reason about.

Thanks to Jacob Zimmerman for pointing out a mistake in my lazymap and lazyfilter implementation, see this issue.