# When to use Numpy

Written by Matthew Yeager
Published: Fri Aug 23 2019
When should I start using Python's Numpy?

`numpy` is a Python library optimized for vector and matrix-based algorithms. Built with the scientific community in mind, `numpy` trades some of the flexibility of Python's built-in data structures for significant performance gains. In practice, this means you might want to consider using `numpy`'s array over Python's list for data analysis.

`numpy`'s open-source repository showcases over 21,000 commits from 800+ contributors. `numpy` and the core data science packages around it have been downloaded and used by millions of data scientists, IT professionals, and business leaders.

```
# Adding elements together
import numpy as np

element_count = 100000
a = range(element_count)
b = range(element_count)
c = range(element_count)

def add_with_comprehension():
    """Add using list comprehension along with
    built-in functions sum and zip.
    """
    return [sum(x) for x in zip(a, b, c)]

def add_functional():
    """Functional approach. The body of this variant is an
    assumption: map over zip is used as a stand-in.
    """
    return list(map(sum, zip(a, b, c)))

np_a = np.arange(element_count)
np_b = np.arange(element_count)
np_c = np.arange(element_count)

def add_with_numpy():
    """Numpy approach leveraging the arrays'
    memory footprint and optimized codebase.
    """
    return np_a + np_b + np_c

# Reported timings for the three approaches, in order:
# > 24.5 ms ± 146 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# > 17 ms ± 25.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# > 127 µs ± 152 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

The pure Python approaches take tens of milliseconds (24.5 ms and 17 ms), while our `numpy` solution needs only about 127 microseconds, which is more than 100 times faster!
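The timings above look like output from IPython's `%timeit`. If you would rather reproduce the comparison in a plain script, here is a minimal sketch using the standard library's `timeit` module. The function names and the `runs` count are hypothetical choices for illustration, and the numbers you see will depend on your machine.

```python
import timeit

import numpy as np

element_count = 100_000
a = range(element_count)
b = range(element_count)
c = range(element_count)
np_a = np.arange(element_count)
np_b = np.arange(element_count)
np_c = np.arange(element_count)


def add_with_lists():
    # Pure Python: build a new list one element at a time
    return [sum(x) for x in zip(a, b, c)]


def add_with_numpy():
    # numpy: the element-wise loop runs in compiled code over typed arrays
    return np_a + np_b + np_c


# Both approaches should agree before their speed is compared
assert add_with_lists() == add_with_numpy().tolist()

runs = 10  # hypothetical repeat count, chosen to keep the script quick
list_seconds = timeit.timeit(add_with_lists, number=runs) / runs
numpy_seconds = timeit.timeit(add_with_numpy, number=runs) / runs

print(f"list comprehension: {list_seconds * 1e3:.2f} ms per call")
print(f"numpy arrays:       {numpy_seconds * 1e3:.3f} ms per call")
```

Checking that both functions return the same values first is a cheap way to make sure you are timing equivalent work.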

If you are interested, we have more content on performance testing. Check out a review on the runtime of list comprehension, functional coding, and generators.

`numpy` makes sense in a few different workflows and can be an advantage for you as a student, researcher, or software developer. There are several factors that might make `numpy` the right choice for you.

You could probably write your application in pure Python (what can't be built? :)). However, consider the moment when it's time to present and share your work. `numpy`'s matrix-based approach lets a wide range of professionals understand the operations being performed, and its vector-based thinking is similar to classical research environments like MATLAB or R.

With `numpy` you'll find many industry-standard algorithms that are already battle-tested and named clearly enough to tell your readers their purpose. Sure, you could build these functions yourself in Python, but then their maintenance and lifelong support become a liability for you and your team. Let's look at a few sample functions.

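```
# Polynomial fitting with Numpy
import numpy as np

# `data` (with X and Y columns) and `x_values` are assumed to be defined already

# Calculate coefficients for 3rd order fit
coefficients = np.polyfit(data.X, data.Y, 3)

# Make a function with the calculated coefficients
coefficient_1d = np.poly1d(coefficients)

# The poly1d object accepts a whole sequence, so no Python loop is needed;
# the result is a numpy array
fitted_values = coefficient_1d(x_values)
```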

Here we take a series of values, fit a 3rd-order polynomial to them, and compute the fitted values. This built-in support for linear algebra allows for fast and accurate operations alongside easy-to-read code.
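To try the fitting steps end to end, here is a self-contained sketch. The sample data is hypothetical: it is generated from a known cubic (the `true_coefficients` values are made up for illustration) so that the fit can be checked against the answer.

```python
import numpy as np

# Hypothetical sample data: a known cubic sampled at 50 points
x_values = np.linspace(-3.0, 3.0, 50)
true_coefficients = [0.5, -1.0, 2.0, 4.0]  # 0.5x^3 - x^2 + 2x + 4
y_values = np.polyval(true_coefficients, x_values)

# Least-squares fit of a 3rd-order polynomial
coefficients = np.polyfit(x_values, y_values, 3)

# Wrap the coefficients in a callable polynomial
fitted_polynomial = np.poly1d(coefficients)

# The polynomial accepts a whole array at once
fitted_values = fitted_polynomial(x_values)

# With noise-free data the fit should recover the original coefficients
print(np.allclose(coefficients, true_coefficients))  # True
print(np.allclose(fitted_values, y_values))          # True
```

With real, noisy measurements the recovered coefficients will only approximate the underlying curve, so it is worth plotting the fitted values against your data before trusting them.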

## Matrix and vector-based support

`numpy`'s array is the centerpiece that enables fast, set-based computation. Instead of writing nested loops to access each item, analysts can use familiar vector-based mathematics. These operations have been optimized in compiled code to a degree that pure Python can't match. This combination of Python's explicit, easy-to-read syntax and high-level mathematics has allowed for industry-wide adoption and support.
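As a small illustration of the difference, here is a sketch of a matrix-vector product written both ways. The `matrix` and `vector` values are made up for the example.

```python
import numpy as np

# Hypothetical 2x3 matrix and length-3 vector
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
vector = [10.0, 20.0, 30.0]

# Nested loops: visit every item by hand
loop_result = []
for row in matrix:
    total = 0.0
    for value, weight in zip(row, vector):
        total += value * weight
    loop_result.append(total)

# Vector-based version: one expression, the loops run inside numpy
np_result = np.array(matrix) @ np.array(vector)

print(loop_result)         # [140.0, 320.0]
print(np_result.tolist())  # [140.0, 320.0]
```

The `@` operator reads much like the mathematical notation, which is a large part of why the vector-based version is easier to share with others.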

`numpy` has been established as the backbone for many machine learning and matrix-based computation libraries. For instance, the Open Source Computer Vision Library (opencv.org) offers more than 2500 optimized algorithms spanning a variety of topics, and its Python interface passes images around as `numpy` arrays, which helps it deliver the level of performance those algorithms require!

## Speed and performance with real arrays

Although the idea of lists and arrays might seem interchangeable in Python, there are key differences that set arrays apart when performing many mathematical operations. Python `list`s provide a great deal of flexibility in storing a variety of data types and objects in a single list while also providing simple appending and deleting.

Arrays within `numpy` hold values of the same datatype in a contiguous block of memory. This means that, as opposed to Python lists, the values are stored physically next to each other, allowing for quick retrieval without needing to look up where each value may be stored.
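You can inspect this layout directly. The following sketch builds a small array with an explicitly chosen `int64` datatype (an assumption made so the byte counts are the same on every platform) and prints the attributes that describe how it sits in memory.

```python
import numpy as np

# Six 64-bit integers stored back to back
values = np.arange(6, dtype=np.int64)

print(values.dtype)                  # int64
print(values.itemsize)               # 8 bytes per element
print(values.nbytes)                 # 48 bytes for the whole array
print(values.strides)                # (8,) -> the next element is 8 bytes away
print(values.flags["C_CONTIGUOUS"])  # True

# A Python list, by contrast, can hold objects of any type
mixed_list = [1, "two", 3.0]
print([type(item).__name__ for item in mixed_list])  # ['int', 'str', 'float']
```

Because every element has the same size, `numpy` can find any element by simple arithmetic on its position instead of following a reference to a separate object.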

Because the arrays are typed, `numpy` can use specialized functions for each datatype (float-based ones, for example) instead of spending time checking, casting, and converting values. Those checks and conversions add significant overhead when performed over and over across a matrix or vector.
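A short sketch can make the role of the datatype more concrete. The arrays below are made-up examples: the first two show values being converted once, when the array is created, and the last one uses `dtype=object` to show what happens when `numpy` has to fall back to handling each element as a regular Python object.

```python
import numpy as np

# A typed array: every element is a 64-bit float
typed = np.array([1.5, 2.5, 3.5])
print(typed.dtype)        # float64
print((typed * 2).dtype)  # float64, no per-element conversion needed

# Mixed ints and floats are converted once, when the array is created
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# An object array keeps the original Python objects, so each
# operation is dispatched element by element, as in a list
objects = np.array([1, 2.5, "3"], dtype=object)
print(objects.dtype)           # object
print((objects * 2).tolist())  # [2, 5.0, '33']
```

Object arrays give up most of the speed advantage, so it is worth checking `dtype` when a `numpy` computation is slower than you expected.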

## Is it time to use numpy?

- Are you trying to share knowledge with others?
- Do you have matrix and vector-based computations?
- Are you concerned with the runtime of your application?
- Would you be able to leverage the existing data analysis algorithms available within `numpy`?

I think you only really need to say yes to one of these questions before it becomes valuable to start integrating `numpy` into your work. The time saved, not only in running your code but also in leveraging a robust numerical framework, will make the move worthwhile.

Thanks for reading! If you've made it this far then you are probably interested in the material that we will be producing. We have an idea of what we believe will be most valuable to our readers, but hearing from you directly would be even better.

Send us an email at questions@beyondpython.com or reach out to us on Twitter @BeyondPython.

If you have a topic that you are struggling with, a file that you can't seem to work with, or even a dataset that just seems impossible to wrangle, then please let us know. We want to provide you with useful and practical information so you can start using Python today.