
Boolean Operations in NumPy


Hi Learning Enthusiasts! In this article we learn about Boolean operations in NumPy. Boolean operations let us compare, count and filter the values of an array based on conditions, which makes them a foundation of data manipulation in NumPy. Now, let’s learn about Boolean operations in NumPy.

To learn more about NumPy’s functions and operations, you can also read these articles:

  1. Universal Functions in NumPy
  2. Basics of NumPy Part – 1
  3. Basics of NumPy Part – 2

Topics to be covered in this article are as follows:

  1. Comparison Operators
  2. Filtering values using Universal Functions
  3. Boolean Operators

Comparison operators for Boolean operations

NumPy supports the standard comparison operators that you know from mathematics and from Python itself. When applied to an array, they work element by element and return True or False for each element, depending on our condition.

“Greater Than” operator: >

To check whether value_1 is greater than value_2, we write:
value_1 > value_2

Now, let’s apply the same logic to an array and see at which locations the values are actually greater than a given scalar value.

Let’s begin with creating a NumPy array.

In [1]:
import numpy as np
# Base array
x = np.arange(0,10,2)
x
Out[1]:
array([0, 2, 4, 6, 8])
In [2]:
# Checking at which locations values are greater than given scalar value
x > 3
Out[2]:
array([False, False,  True,  True,  True])

As you can see in the above example, we get True or False depending on our condition. That is, it returns True for locations holding a value greater than 3 and False otherwise.

It tells us an important fact: comparison operators always return Boolean values.

Now, let’s try out other operations too.

“Less Than” operator: <

To check whether value_1 is less than value_2, we write:
value_1 < value_2

In [3]:
# Checking at which locations values are smaller than given scalar value
x < 3
Out[3]:
array([ True,  True, False, False, False])

“Equal To” operator: ==

To check whether value_1 is equal to value_2, we write:
value_1 == value_2

In [4]:
# Checking at which locations values are equal to given scalar value
x == 4
Out[4]:
array([False, False,  True, False, False])

“Greater Than or Equal To” operator: >=

To check whether value_1 is greater than or equal to value_2, we write:
value_1 >= value_2

In [5]:
# Checking at which locations values are greater than or equal to given scalar value
x >= 6
Out[5]:
array([False, False, False,  True,  True])

“Less Than or Equal To” operator: <=

To check whether value_1 is less than or equal to value_2, we write:
value_1 <= value_2

In [6]:
# Checking at which locations values are less than or equal to given scalar value
x <= 6
Out[6]:
array([ True,  True,  True,  True, False])

“Not Equal To” operator: !=

To check whether value_1 is not equal to value_2, we write:
value_1 != value_2

In [7]:
# Checking at which locations values are not equal to given scalar value
x != 4
Out[7]:
array([ True,  True, False,  True,  True])

That covers all our Comparison Operators.
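
As a compact recap, the sketch below runs all six operators on the same array and checks each one against its NumPy function form (np.greater, np.less and so on). The dictionary name checks is only an illustrative choice and is not part of NumPy.

```python
import numpy as np

x = np.arange(0, 10, 2)          # array([0, 2, 4, 6, 8])

# Each operator has an equivalent NumPy function that gives the same result
checks = {
    ">":  (x > 3,  np.greater(x, 3)),
    "<":  (x < 3,  np.less(x, 3)),
    "==": (x == 4, np.equal(x, 4)),
    ">=": (x >= 6, np.greater_equal(x, 6)),
    "<=": (x <= 6, np.less_equal(x, 6)),
    "!=": (x != 4, np.not_equal(x, 4)),
}

for symbol, (by_operator, by_function) in checks.items():
    # Both forms return a Boolean array with the same shape as x
    assert by_operator.dtype == bool
    assert np.array_equal(by_operator, by_function)
    print(symbol, by_operator)
```

Whichever form you choose, the result is a Boolean array with one entry per element of x.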

Filtering values using Universal Functions

Let’s move on to our next topic: filtering values from a given set of numbers or an array using Universal Functions. To learn more about Universal Functions, please check the article Universal Functions in NumPy.

To count the total number of values greater than a given value in an array, you can use the function np.count_nonzero(). That is, we count the number of True values left after performing a comparison on the array.

Let’s count the number of values greater than 3 in our array x.

In [8]:
x
Out[8]:
array([0, 2, 4, 6, 8])
In [9]:
# Compare values greater than 3
y = x > 3
y
Out[9]:
array([False, False,  True,  True,  True])
In [10]:
# Count values in array greater than 3
np.count_nonzero(y)
Out[10]:
3

As you can see in the above example, we did this in two steps: first we compared the values, and then we passed that result to np.count_nonzero(). You can also do this in a single step, as shown below.

In [11]:
# Count values in array greater than 3
np.count_nonzero(x > 3)
Out[11]:
3

Count Number of Values Greater than 5 using np.sum()

This can be achieved in a similar manner. The only change is that we use a different function, np.sum().

In [12]:
# Count values greater than 5 by summing the Boolean array
np.sum(x > 5)
Out[12]:
2

It clearly indicates that we have 2 values greater than 5.

Why didn’t it calculate the sum of the values?

Because we are not applying np.sum() to the numerical values but to an array of Boolean values, where each True counts as 1 and each False counts as 0.
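
To make this concrete, here is a small sketch that converts the Boolean array to integers and compares np.sum() with np.count_nonzero(). It also shows how the same idea can count values per row through the axis argument. The 2D array grid is a hypothetical example that does not appear elsewhere in this article.

```python
import numpy as np

x = np.arange(0, 10, 2)                     # array([0, 2, 4, 6, 8])
mask = x > 5                                # array([False, False, False, True, True])

# Viewing the mask as integers shows what np.sum() is adding up
mask_as_int = mask.astype(int)              # array([0, 0, 0, 1, 1])
count_by_sum = np.sum(mask)                 # 2
count_by_nonzero = np.count_nonzero(mask)   # 2

# Hypothetical 2D array, used only to illustrate counting per row
grid = np.array([[1, 4, 7],
                 [2, 3, 1],
                 [6, 8, 9]])
per_row = np.sum(grid > 3, axis=1)          # array([2, 0, 3])

print(mask_as_int, count_by_sum, count_by_nonzero, per_row)
```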

Now, let’s try calculating the sum of the values themselves.

Calculate Sum of Values Greater than 5

We will use the same function, np.sum(). To access the value at each of those locations, we will use the format x[x > 5].

What we have done here is:

  1. Applied a comparison operator to get an array of Boolean values, with True at the locations where the condition is satisfied.
  2. Passed that Boolean array as an index to our array x to access the values at the True locations.

Let’s go through the same steps in an example.

In [13]:
x
Out[13]:
array([0, 2, 4, 6, 8])
In [14]:
# Location of Values Greater than 5
x > 5
Out[14]:
array([False, False, False,  True,  True])
In [15]:
# Fetching values at the locations where the condition is True
x[ x > 5 ]
Out[15]:
array([6, 8])
In [16]:
# Calculate Sum of All Values Greater than 5
np.sum(x[ x > 5 ])
Out[16]:
14

Are ALL Values Greater than 5?

For this we will use the function np.all(). It returns True only if all the values passed to it are True.

In [17]:
# Are ALL Values Greater than 5
np.all(x > 5)
Out[17]:
False

We received False because not all values are greater than 5: the array also contains 0, 2 and 4.

Is ANY Value Greater than 5?

For this we will use the function np.any(). It returns True if at least one of the values passed to it is True.

In [18]:
# Any values Greater than 5
np.any(x > 5)
Out[18]:
True

It returned True because we have two values (6 and 8) that are greater than 5; a single one would have been enough.

Similarly, you can use any comparison operator and pass its output to the above functions to get the desired result.
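
For instance, the following sketch combines np.all() and np.any() with a few other comparison operators on the first array from this article. The variable names are chosen only for illustration.

```python
import numpy as np

x = np.arange(0, 10, 2)            # array([0, 2, 4, 6, 8])

all_non_negative = np.all(x >= 0)  # True: every value is 0 or larger
none_equal_5 = np.all(x != 5)      # True: 5 does not occur in x
any_equal_4 = np.any(x == 4)       # True: 4 occurs once
any_negative = np.any(x < 0)       # False: there are no negative values

print(all_non_negative, none_equal_5, any_equal_4, any_negative)
```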

Boolean Operators

What if you want to calculate the sum of values greater than 3 and less than 20? What if you want to calculate the sum of values which are either smaller than 3 or greater than 20?

These problems can be solved by combining multiple Comparison Operations using Boolean Operators.

There are 4 Boolean operators in general:

  1. & : bitwise AND operator, function np.bitwise_and
  2. | : bitwise OR operator, function np.bitwise_or
  3. ^ : bitwise XOR operator, function np.bitwise_xor
  4. ~ : bitwise NOT operator, function np.bitwise_not

In this article we will use only the symbolic form of these Universal Functions, that is, the operators themselves, as they are handier to use and faster to type.
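
Before applying them to comparisons, here is a minimal sketch of what each of the four operators does on two small Boolean arrays, with a check that the symbol and the function form agree. The arrays a and b are made up for illustration.

```python
import numpy as np

# Two small Boolean arrays covering every True/False combination
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

and_result = a & b   # True only where both are True
or_result = a | b    # True where at least one is True
xor_result = a ^ b   # True where exactly one is True
not_result = ~a      # flips every value of a

# Each symbol gives the same result as its function form
assert np.array_equal(and_result, np.bitwise_and(a, b))
assert np.array_equal(or_result, np.bitwise_or(a, b))
assert np.array_equal(xor_result, np.bitwise_xor(a, b))
assert np.array_equal(not_result, np.bitwise_not(a))

print(and_result, or_result, xor_result, not_result)
```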

Let’s create a sample array first

In [19]:
x = np.arange(1, 40)
x
Out[19]:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39])

Calculate sum of all values which are smaller than 5 or greater than 38

For this we will use the | (bitwise OR) operator.

First, let’s get the result without a Boolean operator, which takes several steps.

In [20]:
# values smaller than 5
y_small = x[x < 5]
print("-- Values Smaller than 5")
print(y_small)

# values greater than 38
y_big = x[x > 38]
print("\n-- Values Greater than 38")
print(y_big)

print("\n--Calculating Sum of Smaller values")
z_small = np.sum(y_small)
print(z_small)

print("\n--Calculating Sum of Larger values")
z_big = np.sum(y_big)
print(z_big)

print("\n--Sum of desired values")
print(z_small + z_big)
-- Values Smaller than 5
[1 2 3 4]

-- Values Greater than 38
[39]

--Calculating Sum of Smaller values
10

--Calculating Sum of Larger values
39

--Sum of desired values
49

Using a Boolean operator, we can obtain the same result in a single line of code.

In [21]:
# Calculate sum of all values which are smaller than 5 or greater than 38
np.sum(x[ (x < 5) | (x > 38)])
Out[21]:
49

So, you can clearly see the advantage of a Boolean operator over the step-by-step procedure.
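
Note the parentheses around each comparison in the one-line version. In Python, | binds more tightly than < and >, so leaving them out changes the meaning of the expression. The sketch below shows what happens in that case; the flag name failed is only for illustration.

```python
import numpy as np

x = np.arange(1, 40)

# With parentheses: each comparison is evaluated first, then combined
with_parens = np.sum(x[(x < 5) | (x > 38)])   # 49

# Without parentheses Python reads this as x < (5 | x) > 38,
# a chained comparison that needs a single True/False from a whole array
try:
    x[x < 5 | x > 38]
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)

print(with_parens, failed)
```

For the same reason, Python’s and, or and not keywords cannot be used to combine Boolean arrays element by element; use &, | and ~ instead.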

Similarly, you can combine any number of conditions and calculate the desired value.

For example, suppose we add one more condition so that the value 2 is excluded from the sum.

In [22]:
np.sum(x[ ((x < 5) | (x > 38)) & (x != 2)])
Out[22]:
47
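
The two questions posed at the start of this section can be answered in the same way. This is a short sketch using the array from In [19]; the variable names are chosen only for illustration.

```python
import numpy as np

x = np.arange(1, 40)   # values 1 to 39

# Sum of values greater than 3 AND less than 20 (that is, 4 to 19)
sum_between = np.sum(x[(x > 3) & (x < 20)])

# Sum of values smaller than 3 OR greater than 20 (that is, 1, 2 and 21 to 39)
sum_outside = np.sum(x[(x < 3) | (x > 20)])

print(sum_between)   # 184
print(sum_outside)   # 573
```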

That covers the current article. In the next article we will cover more about NumPy and its operations.

Stay tuned! Keep Learning!

You can also watch all our tutorials on YouTube @ ML for Analytics

 
