Basics of NumPy Arrays Part – 1

Hello Enthusiastic Learners! It's time to learn the basics of NumPy arrays. Arrays sit at the base of most data processing: data in a computer is ultimately stored as numbers, and arrays are a natural way to hold and process them. When it comes to efficient data processing and data manipulation, NumPy is one of the key libraries. Even the library “Pandas” is built around NumPy, so a good grasp of the basics of NumPy arrays is well worth having. For a broader introduction, visit our article Introduction to NumPy.

Some of the things that we will cover in this article are:

  • Array Attributes – Shape, size, data types & more.
  • Slicing of Arrays – Accessing sub-arrays from larger arrays
  • Array Indexing – Accessing array elements

NumPy Array Attributes

The attributes of an array describe its size, its number of dimensions, the size of each dimension & the data type of the values stored in the array.

We will learn all this with the help of a few examples.

Before we jump to creating arrays of random numbers, a little about the random number generator “seed”.
SEED – if we want the same sequence of random numbers to be generated on every run, we provide a value called the “SEED”. Keeping it the same for every run makes sure that the same sequence is generated.
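
The effect of the seed can be checked with a small sketch like the one below. The variable names first_run and second_run are only illustrative and are not used elsewhere in this article.

```python
import numpy as np

# Seed the generator, then draw 10 integers below 5
np.random.seed(10)
first_run = np.random.randint(5, size=10)

# Re-seed with the SAME value and draw again
np.random.seed(10)
second_run = np.random.randint(5, size=10)

# Both draws contain exactly the same sequence
print(np.array_equal(first_run, second_run))  # True
```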

Now, we will create a few arrays.

Importing libraries

In [1]:
import numpy as np

# seed for reproducing same random number sequence
np.random.seed(10)

# 1 dimensional Array
a1 = np.random.randint(5, size=10)

# it will create a 1 dimensional array of 10 random integers from 0 to 4 (5 is excluded)
a1
Out[1]:
array([1, 4, 0, 1, 3, 4, 1, 0, 1, 2])
In [2]:
# 2 dimensional Array
a2 = np.random.randint(5, size=(2,10))

# it will create a 2 dimensional array having 2 rows of 10 values each, all below 5
a2
Out[2]:
array([[0, 1, 0, 2, 0, 4, 3, 0, 4, 3],
       [0, 3, 2, 1, 0, 4, 1, 3, 3, 1]])
In [3]:
# 3 dimensional Array
a3 = np.random.randint(5, size=(3, 2, 10))

# it will create a 3 dimensional array made of 3 blocks (2 dimensional arrays)
# each block has 2 rows with 10 values in each row, all below 5
a3
Out[3]:
array([[[4, 1, 4, 1, 1, 4, 3, 2, 0, 3],
        [4, 2, 0, 1, 2, 0, 0, 3, 1, 3]],

       [[4, 1, 4, 2, 0, 0, 4, 4, 0, 0],
        [2, 4, 2, 0, 0, 2, 3, 0, 4, 4]],

       [[0, 1, 1, 4, 0, 2, 1, 3, 1, 2],
        [0, 1, 1, 0, 2, 3, 0, 4, 2, 0]]])

Fetching attributes of arrays

Let’s fetch the attributes of the above arrays:

  • ndim – number of dimensions in each array
  • shape – size of each dimension
  • size – total number of elements in array
  • dtype – data type of array elements (in the current case it is int64, as randint generates integer values; on some platforms or NumPy versions the default integer type may be int32)
In [4]:
print("a1.ndim \t= '" + str(a1.ndim) + "' dimension(s) in array")
print("a1.shape \t= '" + str(a1.shape) + "' -- single row with 10 elements each")
print("a1.size \t= '" + str(a1.size) + "' -- total elements in array")
print("a1.dtype \t= '" + str(a1.dtype) + "' type data in array")
a1.ndim 	= '1' dimension(s) in array
a1.shape 	= '(10,)' -- single row with 10 elements each
a1.size 	= '10' -- total elements in array
a1.dtype 	= 'int64' type data in array
In [5]:
print("a2.ndim \t= '" + str(a2.ndim) + "' dimension(s) in array")
print("a2.shape \t= '" + str(a2.shape) + "' 2 rows with 10 elements each")
print("a2.size \t= '" + str(a2.size) + "' -- total elements in array")
print("a2.dtype \t= '" + str(a2.dtype) + "' type data in array")
a2.ndim 	= '2' dimension(s) in array
a2.shape 	= '(2, 10)' 2 rows with 10 elements each
a2.size 	= '20' -- total elements in array
a2.dtype 	= 'int64' type data in array
In [6]:
print("a3.ndim \t= '" + str(a3.ndim) + "' dimension(s) in array")
print("a3.shape \t= '" + str(a3.shape) + "' 3 blocks, 2 rows in each block with 10 elements each")
print("a3.size \t= '" + str(a3.size) + "' -- total elements in array")
print("a3.dtype \t= '" + str(a3.dtype) + "' type data in array")
a3.ndim 	= '3' dimension(s) in array
a3.shape 	= '(3, 2, 10)' 3 blocks, 2 rows in each block with 10 elements each
a3.size 	= '60' -- total elements in array
a3.dtype 	= 'int64' type data in array
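
These attributes are tied together: ndim is the length of shape, and size is the product of the entries of shape. Here is a minimal sketch using a zero-filled array (the name demo is only illustrative) with the same shape as a3:

```python
import numpy as np

# Zero-filled array with the same shape as a3; the values do not matter here
demo = np.zeros((3, 2, 10), dtype=np.int64)

# ndim is simply the number of entries in shape
print(demo.ndim, len(demo.shape))  # 3 3

# size is the product of all entries in shape: 3 * 2 * 10
product_of_shape = int(np.prod(demo.shape))
print(demo.size, product_of_shape)  # 60 60
```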

Fetching size of array

If you want to get the size of an array & of its elements in bytes, you can use the following attributes:

  • itemsize – size of each element of array in bytes
  • nbytes – size of complete array in bytes
In [7]:
print("Size of each item in bytes \t= " + str(a3.itemsize))
print("Size of array in bytes \t = " + str(a3.nbytes))
Size of each item in bytes 	= 8
Size of array in bytes 	 = 480
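
The two values are related by nbytes = size × itemsize. The sketch below sets the data type explicitly, so the numbers do not depend on the platform default. The names a64 and a32 are only illustrative.

```python
import numpy as np

# Same shape as a3, with an explicit 64-bit integer type
a64 = np.zeros((3, 2, 10), dtype=np.int64)
# Same data converted to 32-bit integers
a32 = a64.astype(np.int32)

print(a64.itemsize, a64.nbytes)  # 8 480
print(a32.itemsize, a32.nbytes)  # 4 240

# nbytes is always size * itemsize
print(a64.nbytes == a64.size * a64.itemsize)  # True
```

A smaller data type halves the memory used here, which can matter for large arrays.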

Accessing array elements (Array Indexing)

Here is our 1 dimensional array:

In [8]:
a1
Out[8]:
array([1, 4, 0, 1, 3, 4, 1, 0, 1, 2])

To fetch the first element of the array, use the following command:

In [9]:
a1[0]
Out[9]:
1

Get the 5th element by using the code below:

In [10]:
a1[4]
Out[10]:
3

From what we have seen so far, indexing starts at 0, so to access the nth element we use array[n-1].

One can use negative values to index an array from the end.

In order to access the last element of the array we can use the following:

In [11]:
a1[-1]
Out[11]:
2

For fetching the 5th element from the end:

In [12]:
a1[-5]
Out[12]:
4
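
A negative index -k points at the same element as the positive index len(array) - k. The sketch below rebuilds the a1 values shown above as a literal array (named a here, purely for illustration), so it does not rely on the random generator:

```python
import numpy as np

# Same values as a1 above, written out explicitly
a = np.array([1, 4, 0, 1, 3, 4, 1, 0, 1, 2])
n = len(a)

# a[-k] and a[n - k] refer to the same element
print(a[-1], a[n - 1])  # 2 2
print(a[-5], a[n - 5])  # 4 4
```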

For a multi-dimensional array, we access values by giving one index per dimension, separated by a comma (,).
Here is our 2 dimensional array:

In [13]:
a2
Out[13]:
array([[0, 1, 0, 2, 0, 4, 3, 0, 4, 3],
       [0, 3, 2, 1, 0, 4, 1, 3, 3, 1]])

Below code fetches the 1st element of the 1st row:

In [14]:
a2[0,0]
Out[14]:
0

Accessing 3rd element from 2nd row of array:

In [15]:
a2[1,2]
Out[15]:
2

Similarly, you can index a 2 dimensional array from the end as well.

For getting the 2nd last element from the 2nd row of the array:

In [16]:
a2[-1, -2]
Out[16]:
3

To access 2nd last element from 1st row of array:

In [17]:
a2[-2, -2]
Out[17]:
4
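
To connect these forms, here is a small sketch that rebuilds the a2 values shown above as a literal array (named grid here, purely for illustration). It shows that the comma form [row, column] gives the same value as chained indexing [row][column], and that a single index returns a whole row.

```python
import numpy as np

# Same values as a2 above, written out explicitly
grid = np.array([[0, 1, 0, 2, 0, 4, 3, 0, 4, 3],
                 [0, 3, 2, 1, 0, 4, 1, 3, 3, 1]])

# [row, column] and [row][column] reach the same element
print(grid[1, 2], grid[1][2])  # 2 2

# Negative and positive indices can address the same position
print(grid[-1, -2], grid[1, 8])  # 3 3

# A single index returns the complete row as a 1 dimensional array
first_row = grid[0]
print(first_row.shape)  # (10,)
```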

Slicing of Arrays

We will now access sub-arrays from larger arrays. It is very important to learn how to access a subset of an array's elements, because in many situations, such as sampling data, we need to fetch a smaller piece of the given data.

The syntax for slicing arrays is as follows:

x[ start : stop : step ]

  • start – Starting position, from which we will start accessing values from the larger array. Default value start = 0.
  • stop – Position at which we stop accessing values from the larger array; the element at this position is not included. Default value stop = size of dimension.
  • step – as discussed in our article “Introduction to NumPy“, step is the gap between the positions of consecutive selected values, so step = 2 picks every second value. Default value step = 1.
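
As a quick check of these defaults, the sketch below compares a slice with everything omitted against one with start, stop and step written out. It also uses Python's built-in slice object, which is what the colon syntax stands for. The variable names are only illustrative.

```python
import numpy as np

values = np.arange(11)

# Omitting everything is the same as start=0, stop=size of dimension, step=1
all_default = values[::]
all_explicit = values[0:values.shape[0]:1]
print(np.array_equal(all_default, all_explicit))  # True

# The colon syntax is shorthand for a slice object: values[:4]
first_four = values[slice(None, 4)]
print(first_four)  # [0 1 2 3]
```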

Slicing on 1-Dimensional Array

Let’s proceed with creating an array.

In [18]:
x = np.arange(11)
x
Out[18]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

Access first 4 elements

In [19]:
x[ : 4]
Out[19]:
array([0, 1, 2, 3])

Note: Element at “stop” position is not included

Further, one can access all elements after the 4th element

In [20]:
x[ 4 : ]
Out[20]:
array([ 4,  5,  6,  7,  8,  9, 10])

Fetch the 5th to 8th elements (index 4 up to, but not including, index 8)

In [21]:
x[ 4 : 8 ]
Out[21]:
array([4, 5, 6, 7])

All elements skipping 2 values each time, like 0, 3, 6, 9. For this we will use step = 3

In [22]:
x[ : : 3 ]
Out[22]:
array([0, 3, 6, 9])

Every second element (skipping 1 value each time), starting from the 4th element.

In [23]:
x[ 3 : : 2 ]
Out[23]:
array([3, 5, 7, 9])

Using negative steps

Yes, negative STEP values are also allowed. In this case the default values of start & stop are swapped, so the slice runs from the end of the array towards the beginning and we receive the elements in reverse order.

Every second element (skipping 1 value each time), starting from index 8 (the 9th element) and moving backwards to the start.

In [24]:
x[ 8 : : -2 ]
Out[24]:
array([8, 6, 4, 2, 0])

You can also use this technique to reverse an array.

In [25]:
x[ : : -1]
Out[25]:
array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1,  0])
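
With a negative step the stop position is still excluded, and the default start becomes the last index. A minimal sketch (the name seq is only illustrative):

```python
import numpy as np

seq = np.arange(11)

# Walk backwards from index 8 and stop BEFORE index 2
partial = seq[8:2:-2]
print(partial)  # [8 6 4]

# With step = -1 the default start is the last index (10 here)
print(np.array_equal(seq[::-1], seq[10::-1]))  # True
```

Note that the stop is best left empty when the slice should run all the way to the first element, because writing -1 there would be read as "the last element".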

Slicing on Multi-Dimensional Array

Let’s proceed with creating an array.

In [26]:
x = np.arange(20).reshape(4,5)
x
Out[26]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

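reshape(rows, columns)

reshape is used to rearrange the elements of our array into the number of rows & columns passed to the reshape() function. The total number of elements must stay the same: here 4 rows × 5 columns = 20 elements.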
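
A reshape only works when the new shape holds exactly as many elements as the original array. The sketch below (names flat and table are only illustrative) shows a valid reshape and one that NumPy rejects with a ValueError:

```python
import numpy as np

flat = np.arange(20)

# 4 * 5 = 20, so this reshape is valid
table = flat.reshape(4, 5)
print(table.shape)  # (4, 5)

# 3 * 5 = 15 does not match 20 elements, so NumPy raises ValueError
reshape_failed = False
try:
    flat.reshape(3, 5)
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```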

First of all, let’s fetch the first 2 rows and the first 4 columns

In [27]:
x[ : 2, : 4 ]
Out[27]:
array([[0, 1, 2, 3],
       [5, 6, 7, 8]])

Also, we can fetch the first 2 rows and the first 4 columns with STEP=2 on column values.

In [28]:
x[ : 2, : 4 : 2]
Out[28]:
array([[0, 2],
       [5, 7]])
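
A slice can also be combined with a plain integer index to pull out one complete row or column. This is a small sketch on the same 4 × 5 layout; the names table, second_column and third_row are only illustrative.

```python
import numpy as np

table = np.arange(20).reshape(4, 5)

# All rows, column at index 1 -> a 1 dimensional array
second_column = table[:, 1]
print(second_column.tolist())  # [1, 6, 11, 16]

# Row at index 2, all columns
third_row = table[2, :]
print(third_row.tolist())  # [10, 11, 12, 13, 14]
```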

Reversing a multi-dimensional array (both rows and columns)

In [29]:
x[: : -1, : : -1]
Out[29]:
array([[19, 18, 17, 16, 15],
       [14, 13, 12, 11, 10],
       [ 9,  8,  7,  6,  5],
       [ 4,  3,  2,  1,  0]])
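
Each dimension has its own slice, so the rows and the columns can also be reversed independently. A minimal sketch on the same 4 × 5 layout (names are only illustrative):

```python
import numpy as np

table = np.arange(20).reshape(4, 5)

# Reverse only the order of the rows
rows_reversed = table[::-1]
print(rows_reversed[0].tolist())  # [15, 16, 17, 18, 19]

# Reverse only the order of the columns inside every row
cols_reversed = table[:, ::-1]
print(cols_reversed[0].tolist())  # [4, 3, 2, 1, 0]

# Doing both matches the full reversal table[::-1, ::-1]
both = rows_reversed[:, ::-1]
print(np.array_equal(both, table[::-1, ::-1]))  # True
```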

So far, we have covered many powerful functionalities of NumPy arrays. Practice them and also play around with steps and multi-dimensional arrays. Further, we will be providing more in-depth knowledge in our next article “Basics of Numpy Array Part-2”.

Stay tuned. Keep Learning! Also, for video tutorials check our YouTube channel ML for Analytics
