
Using filter to simplify Python

Written by Matthew Yeager
6-minute read (700 words)
Published: Wed Sep 04 2019

You've probably already written the pattern that filter is here to help with. You have a list of items and, based on some criteria, you want to apply logic to only a few of them. It might look like this:

foods = [
    {'name': 'apple', 'is_fruit': True, 'color': 'red'},
    {'name': 'banana', 'is_fruit': True, 'color': 'yellow'},
    {'name': 'cucumber', 'is_fruit': False, 'color': 'green'},
    {'name': 'date', 'is_fruit': True, 'color': 'purple'},
    {'name': 'eggplant', 'is_fruit': False, 'color': 'purple'},
    {'name': 'fig', 'is_fruit': True, 'color': 'purple'},
]
purple_fruit_count = 0
for f in foods:
    if f['is_fruit'] and f['color'] == 'purple':
        purple_fruit_count += 1

purple_fruit_count

> 2

We wanted to go through all the items to only run logic (counting foods) on a subset of the list (purple fruit). However, when you first look at this code it might take some time to understand what is being achieved and how. What exactly is this combination of filtering trying to accomplish? If the intent is not stated, how can it be updated or corrected?

Functional Python with filter

Another way to look at this problem is from a functional point of view. This means using functions that do not change their inputs. A function like filter leaves the original list untouched and returns a new iterable of references to the items that meet your criteria.

is_purple_fruit = lambda f: f['is_fruit'] and f['color'] == 'purple'
purple_fruits = filter(is_purple_fruit, foods)

len(list(purple_fruits))

> 2

I would argue the readability has been much improved. We could use is_purple_fruit with a for-loop or list comprehension, but even the syntax around looping over the items has been simplified with the use of filter.
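For comparison, here is a minimal sketch of the same count written three ways around a named predicate. The trimmed foods list is only there to keep the snippet self-contained, and the variable names for the three counts are illustrative.

```python
foods = [
    {'name': 'apple', 'is_fruit': True, 'color': 'red'},
    {'name': 'date', 'is_fruit': True, 'color': 'purple'},
    {'name': 'eggplant', 'is_fruit': False, 'color': 'purple'},
    {'name': 'fig', 'is_fruit': True, 'color': 'purple'},
]


def is_purple_fruit(food):
    """Return True for foods that are both a fruit and purple."""
    return food['is_fruit'] and food['color'] == 'purple'


# 1. Explicit for-loop: the looping and the test are interleaved.
loop_count = 0
for food in foods:
    if is_purple_fruit(food):
        loop_count += 1

# 2. List comprehension: shorter, but still spells out the iteration.
comprehension_count = len([food for food in foods if is_purple_fruit(food)])

# 3. filter: only the test and the collection are named.
filter_count = len(list(filter(is_purple_fruit, foods)))
```

All three produce the same count; the difference is how much looping syntax the reader has to scan past to find the test.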

Another benefit of using filter is the requirement to pass a function. Although you could be passing an anonymous function (lambda) into filter directly, using named functions allows you to more easily start building a library of common filtering behaviors that encapsulate complex logic for reuse.

Imagine your application should only deal with foods that currently exist, not those that have gone extinct. Someone implementing their own version of is_purple_fruit might forget to check for active foods. Or worse, there might not be a simple is_active flag at all, only start and end dates that need to be compared.

Functional Python benefits

In Python 3, filter returns a lazy iterator, which means that instead of evaluating the entire list up front, items are produced on demand. To get the full list above, we wrapped the filter object in list. When you think about it, you might not need every item right from the start. Lazy iteration avoids building an intermediate list, which can save memory and time when you only consume part of the result. Check out our comparison of list comprehensions, map + lambda, and generators in "How fast are list comprehensions?"
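To make the on-demand behavior concrete, here is a small sketch that records every value the test function is asked about. The is_even predicate and the calls list are illustrative names, not part of the earlier examples.

```python
calls = []


def is_even(n):
    calls.append(n)  # record every value the predicate is asked about
    return n % 2 == 0


numbers = range(1, 1_000_001)

evens = filter(is_even, numbers)  # nothing has been tested yet
calls_before = len(calls)

first_even = next(evens)  # pull just one matching item
calls_after_first = len(calls)
```

Creating the filter object runs no tests at all, and asking for the first match only tests values until one passes. The rest of the million numbers are never touched unless you keep iterating.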


Simple, testable functions

is_food_name_even = lambda f: len(f['name']) % 2 == 0
even_food_names = filter(is_food_name_even, foods)

list(even_food_names)

> [{'name': 'banana', 'is_fruit': True, 'color': 'yellow'},
   {'name': 'cucumber', 'is_fruit': False, 'color': 'green'},
   {'name': 'date', 'is_fruit': True, 'color': 'purple'},
   {'name': 'eggplant', 'is_fruit': False, 'color': 'purple'}]

Providing membership functions (is_purple_fruit, is_food_name_even) lets others focus on their own task instead of getting bogged down re-implementing common filtering logic.
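Because each membership function takes one item and returns a boolean, it can be tested without building a whole list. A minimal sketch using plain assert statements follows; the test function names are made up for illustration, and the same functions could be collected by a test runner such as pytest.

```python
def is_food_name_even(food):
    """Return True when the food's name has an even number of characters."""
    return len(food['name']) % 2 == 0


def test_even_length_name_passes():
    assert is_food_name_even({'name': 'date'})  # 4 characters


def test_odd_length_name_fails():
    assert not is_food_name_even({'name': 'fig'})  # 3 characters


def test_empty_name_counts_as_even():
    assert is_food_name_even({'name': ''})  # 0 characters


# Run the checks directly; an AssertionError would signal a failure.
test_even_length_name_passes()
test_odd_length_name_fails()
test_empty_name_counts_as_even()
```

Each test only needs the keys the predicate actually reads, which keeps the fixtures tiny.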

Now, you might not have a simple for-loop + if combination. You may have a series of conditions or several if-else ladders for which filter no longer makes sense. This might be a symptom of a function doing too much work, where isolating each aspect and testing every possible branch becomes a nightmare.
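One possible way to untangle such a function, sketched under the assumption that the conditions are independent of each other, is to give each condition its own small predicate and chain the filter calls. The is_fruit, is_purple, and has_short_name helpers are hypothetical names introduced for this example.

```python
foods = [
    {'name': 'apple', 'is_fruit': True, 'color': 'red'},
    {'name': 'banana', 'is_fruit': True, 'color': 'yellow'},
    {'name': 'date', 'is_fruit': True, 'color': 'purple'},
    {'name': 'eggplant', 'is_fruit': False, 'color': 'purple'},
    {'name': 'fig', 'is_fruit': True, 'color': 'purple'},
]


def is_fruit(food):
    return food['is_fruit']


def is_purple(food):
    return food['color'] == 'purple'


def has_short_name(food):
    return len(food['name']) <= 4


# Each step narrows the previous one; nothing runs until the list is built.
fruits = filter(is_fruit, foods)
purple_fruits = filter(is_purple, fruits)
short_purple_fruits = filter(has_short_name, purple_fruits)

result_names = [food['name'] for food in short_purple_fruits]
```

Each predicate can now be tested on its own, and the chain reads as a list of steps rather than a nest of branches.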

You can also produce the items that failed the test, using itertools.filterfalse:

from itertools import filterfalse

# Keep the foods whose names do NOT have an even number of characters
odd_food_names = filterfalse(is_food_name_even, foods)
list(odd_food_names)

> [{'name': 'apple', 'is_fruit': True, 'color': 'red'},
   {'name': 'fig', 'is_fruit': True, 'color': 'purple'}]

Python's built-in filter

filter lets you express code as a sequence of clear steps. Using filter provides a readable description of which well-named test is being applied to a collection of items.

Additionally, because filter returns a lazy iterator, you can save time and memory when you do not need the whole result at once.
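As a rough illustration of the memory side, the sketch below compares the size of a filter object with the size of the fully built list. Note that sys.getsizeof measures only the container itself, not the items it refers to, so treat the numbers as an indication rather than a benchmark. The is_even predicate and the input range are illustrative.

```python
import sys


def is_even(n):
    return n % 2 == 0


numbers = range(100_000)

lazy_evens = filter(is_even, numbers)         # no results stored yet
eager_evens = list(filter(is_even, numbers))  # every result held in memory

lazy_size = sys.getsizeof(lazy_evens)
eager_size = sys.getsizeof(eager_evens)

print(f"filter object: {lazy_size} bytes")
print(f"full list:     {eager_size} bytes")
```

The filter object stays the same small size no matter how long the input is, while the list grows with the number of matching items.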

Start trying out Python's built-in filter function when you come across the loop + single if statement pattern.





All Rights Reserved
© 2019 Beyond Python