lexsort() – Indirect Sort in NumPy

Hi, enthusiastic learners! So far we have been sorting a single given array in a straightforward manner. What if we are given two or more arrays and we want to sort the main array so that, when a tie happens among its elements, a second array breaks the tie, and so on? NumPy provides an easy solution for this: lexsort(), the indirect sort in NumPy. We call it an indirect sort because it returns the indices that would sort the data rather than the sorted values themselves, and it can use the values of several arrays as sort keys. To learn more about sorting in NumPy, check our article Sort Arrays in NumPy.

lexsort() syntax

lexsort(keys, axis=-1)

keys: the arrays (or columns) that we pass in to sort by. The last key in the sequence is the primary sort key. If a 2-D array is passed, each row is treated as a key and the last row is the primary sort key.

axis: the axis to be indirectly sorted. It is optional; by default, the sort is over the last axis.

OUTPUT: an array of indices that sort the keys along the specified axis.
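As a minimal sketch of the 2-D case described above, the snippet below passes both keys as rows of a single array and compares the result with the tuple form. The array values and variable names are made up for illustration.

```python
import numpy as np

# Two keys stored as the rows of one 2-D array (illustrative values).
# lexsort treats each row as a key; the LAST row is the primary key.
keys = np.array([
    [9, 8, 7, 6],   # secondary key (tie-breaker)
    [2, 1, 2, 1],   # primary key
])

order_from_2d = np.lexsort(keys)

# The same call written with a tuple of 1-D arrays, primary key last.
order_from_tuple = np.lexsort((keys[0], keys[1]))

print(order_from_2d)                                      # [3 1 2 0]
print(np.array_equal(order_from_2d, order_from_tuple))    # True
```

Both forms give the same indices, so which one to use is mostly a matter of how the data is already stored.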

For example, suppose we are given two arrays of numbers, A and B, and we want to sort them so that the values of A are ordered first, with B used as a tie-breaker.

By tie-breaker we mean this: if the same value appears in A more than once, we need to decide which occurrence comes first. We decide that by comparing the values at the same positions in B; the position with the smaller value in B comes first.

Note: Both arrays, or all N arrays if there are more, should be of the same size.

In [7]:
import numpy as np

A = np.array([4, 1, 3, 4, 4, 4])
A
Out[7]:
array([4, 1, 3, 4, 4, 4])
In [8]:
B = np.array([2, 3, 1, 5, 3, 1])
B
Out[8]:
array([2, 3, 1, 5, 3, 1])

To sort by array A first, with B as the tie-breaker, we use the following syntax:

lexsort((B, A)): please note that we put the primary key last, not at the beginning.

In [10]:
result_indices = np.lexsort((B, A))
result_indices
Out[10]:
array([1, 2, 5, 0, 4, 3])

Running the above command gives us the indices, in the order in which the values of all the arrays should be placed.

You may have noticed that position 5 is placed before position 0. The reason is simple: array A holds the value 4 at both positions, so it is a tie. To choose which position comes first, we check the values of array B at those same positions: B has 1 at position 5 and 2 at position 0. Thus, position 5 is chosen first.

Let’s look at the values of both arrays in this order to get a better understanding of it.
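Because the primary key goes last, swapping the order of the keys changes the result. The short sketch below reuses the same A and B to show both orderings side by side; the names a_primary and b_primary are only illustrative.

```python
import numpy as np

A = np.array([4, 1, 3, 4, 4, 4])
B = np.array([2, 3, 1, 5, 3, 1])

# A is listed last, so A is the primary key and B breaks ties.
a_primary = np.lexsort((B, A))

# B is listed last, so B is the primary key and A breaks ties.
b_primary = np.lexsort((A, B))

print(a_primary)  # [1 2 5 0 4 3]
print(b_primary)  # [2 5 0 1 4 3]
```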

In [12]:
# printing values of both arrays as pairs

[(A[index], B[index]) for index in result_indices]
Out[12]:
[(1, 3), (3, 1), (4, 1), (4, 2), (4, 3), (4, 5)]
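The same indices can also be used to reorder each array directly with NumPy's integer-array indexing. The sketch below does that and, as a sanity check, compares the result with Python's built-in sorted() applied to (A, B) pairs; the variable names are illustrative.

```python
import numpy as np

A = np.array([4, 1, 3, 4, 4, 4])
B = np.array([2, 3, 1, 5, 3, 1])

order = np.lexsort((B, A))

# Indexing with the index array reorders each array in one step.
A_sorted = A[order]
B_sorted = B[order]
print(A_sorted)  # [1 3 4 4 4 4]
print(B_sorted)  # [3 1 1 2 3 5]

# Cross-check: sorting (A, B) tuples in plain Python gives the same order.
pairs = list(zip(A_sorted.tolist(), B_sorted.tolist()))
print(pairs == sorted(zip(A.tolist(), B.tolist())))  # True
```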

One of the most common places where it can be used is sorting name data, that is, sorting records by fields such as first_name and last_name.
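Here is a small sketch of the name-sorting case, assuming we want to order people by last name and use the first name as the tie-breaker. The names are invented sample data, and the choice of last_name as the primary key is an assumption made for the example.

```python
import numpy as np

# Hypothetical sample data, one entry per person.
first_name = np.array(["Asha", "Ravi", "Meera", "Ravi", "Asha"])
last_name = np.array(["Verma", "Singh", "Singh", "Gupta", "Singh"])

# last_name is listed last, so it is the primary key;
# first_name breaks ties between people sharing a last name.
order = np.lexsort((first_name, last_name))
print(order)  # [3 4 2 1 0]

for i in order:
    print(last_name[i], first_name[i])
# Gupta Ravi
# Singh Asha
# Singh Meera
# Singh Ravi
# Verma Asha
```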

Stay tuned & keep learning! In our next article we will learn more about sorting techniques in NumPy.
