Basics of NumPy Arrays Part – 2

Hello Enthusiastic Learners! It's time to learn the basics of NumPy arrays. Arrays sit at the base of numerical computing: everything in a computer is ultimately stored as numbers, and much of that data is processed in the form of arrays. When it comes to efficient data processing and data manipulation, NumPy is one of the key libraries. In our previous article, Basics of NumPy Arrays Part – 1, we studied array attributes, array indexing and slicing of arrays; it is a good idea to cover those as well. NumPy arrays play a crucial role in Python, especially for number handling, so it is worth having a good grasp of the basics.

Topics to be covered

We will be covering the following topics in this article:

  • Data types in NumPy
  • Create Copy of Arrays
  • Reshaping Arrays
  • Array Concatenation
  • Split Arrays

Data Types in NumPy

All values in a given NumPy array share a single data type, such as int or float. It is therefore important to know NumPy's data types well, as they play a key role in data manipulation.
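
As a minimal sketch of what "a single data type per array" means in practice, the snippet below mixes integers and a float and then writes a float into an integer array. The variable names are only illustrative.

```python
import numpy as np

# Mixing ints and a float: NumPy upcasts every element to one common type
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)   # float64
print(mixed)         # [1.  2.  3.5]

# Writing a float into an integer array keeps the integer dtype,
# so the fractional part is dropped
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(ints)          # [9 2 3]
```

The second case is easy to miss: the array's dtype decides what can be stored, not the value being assigned.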

Here is a list of the most commonly used data types supported by NumPy:

  • bool_ — Boolean (True or False) stored as a byte
  • uint8 — Unsigned integer (0 to 255)
  • uint16 — Unsigned integer (0 to 65535)
  • uint32 — Unsigned integer (0 to 4294967295)
  • uint64 — Unsigned integer (0 to 18446744073709551615)
  • int_ — Default integer type
  • intc — Identical to C int (generally int32 or int64)
  • intp — Integer used for indexing
  • int8 — Byte (-128 to 127)
  • int16 — Integer (-32768 to 32767)
  • int32 — Integer (-2147483648 to 2147483647)
  • int64 — Integer (-9223372036854775808 to 9223372036854775807)
  • float32 — Single precision float
  • float64/ float_ — Double precision float
  • complex64 — Complex number, represented by two 32-bit floats
  • complex128/ complex_ — Complex number, represented by two 64-bit floats
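
The ranges listed above do not need to be memorised. As a small sketch, NumPy's np.iinfo and the itemsize attribute of a dtype can report them; the loop and the ranges dictionary below are only illustrative.

```python
import numpy as np

# Collect (min, max) for a few integer types using np.iinfo
ranges = {}
for int_type in (np.int8, np.uint8, np.int16, np.uint16):
    info = np.iinfo(int_type)
    ranges[int_type.__name__] = (info.min, info.max)
    print(int_type.__name__, info.min, info.max)

# itemsize gives the number of bytes used per element
print(np.dtype(np.bool_).itemsize)       # 1
print(np.dtype(np.float32).itemsize)     # 4
print(np.dtype(np.complex128).itemsize)  # 16
```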

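These data types can be used when creating a NumPy array by passing the dtype argument, as in the following examples. Note that the float_ and complex_ aliases were removed in NumPy 2.0, so float64 and complex128 are the portable spellings.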

Importing libraries

In [1]:
import numpy as np

# Create an array containing 10 elements of type "int64"
np.arange(10, dtype='int64')
Out[1]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Creating arrays

In [2]:
# Create an array containing 10 elements of type "float64"
# Note: older NumPy versions also offer the alias float_, which NumPy 2.0 removed
np.arange(10, dtype='float64')
Out[2]:
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
In [3]:
# Create an array containing 10 elements of type "complex64"
np.arange(10, dtype='complex64')
Out[3]:
array([0.+0.j, 1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 5.+0.j, 6.+0.j, 7.+0.j,
       8.+0.j, 9.+0.j], dtype=complex64)
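
The examples above pass the data type as a string. As a hedged sketch, the same type can also be given as a NumPy type object, and an existing array can be converted with astype(); the variable names here are illustrative.

```python
import numpy as np

# The dtype can be given as a string or as a NumPy type object
a = np.arange(5, dtype="float64")
b = np.arange(5, dtype=np.float64)
print(a.dtype == b.dtype)   # True

# astype() returns a new array converted to the requested type;
# converting float to int drops the fractional part
as_int = np.array([1.9, -2.7, 3.2]).astype("int32")
print(as_int)         # [ 1 -2  3]
print(as_int.dtype)   # int32
```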

Creating copy of arrays

You may be wondering why we need to talk about copying one array to another. Can we not just write array_new = array_old? In practice that does not give us a copy, because this syntax only creates a second name that refers to the existing array in memory.

That is, the new variable points to the same array as the old one; it is not a copy. If we change values through the new name, the change shows up in the old array as well.

For example:

In [4]:
old_array = np.arange(10)
old_array
Out[4]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Now let’s assign it to a new name with new_array = old_array

In [5]:
new_array = old_array
print("Value of NEW Array after simply using ' new_array = old_array ' ")
print(new_array)

# Update 1st element of new array to 20
new_array[0] = 20
print("\nValue of NEW Array after updating its 1st element to 20")
print(new_array)

print("\nValue of OLD Array after updating value of NEW Array")
print(old_array)
Value of NEW Array after simply using ' new_array = old_array ' 
[0 1 2 3 4 5 6 7 8 9]

Value of NEW Array after updating its 1st element to 20
[20  1  2  3  4  5  6  7  8  9]

Value of OLD Array after updating value of NEW Array
[20  1  2  3  4  5  6  7  8  9]

Important point

From the output above we can clearly see that the original array also changed when we modified its assumed copy. Plain assignment therefore cannot be used when we need an independent copy for data manipulation.
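
One way to confirm that plain assignment did not create a second array is sketched below. It uses Python's is operator and np.shares_memory, and reuses the names from the example above.

```python
import numpy as np

old_array = np.arange(10)
new_array = old_array   # plain assignment: a second name, not a second array

# Both names refer to the very same object in memory
print(new_array is old_array)                    # True
print(np.shares_memory(new_array, old_array))    # True
```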

Thus, to create a copy of an array we use the copy() method. It creates an explicit copy of the existing array, so changes made to the new array have no effect on the original.

We will use the following syntax to create a copy: new_array = old_array.copy()

For example:

In [6]:
old_array = np.arange(10)
old_array
Out[6]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [7]:
new_array = old_array.copy()
print("Value of NEW Array after using COPY() function ' new_array = old_array.copy() ' ")
print(new_array)

# Update 1st element of new array to 20
new_array[0] = 20
print("\nValue of NEW Array after updating its 1st element to 20")
print(new_array)

print("\nValue of OLD Array after updating value of NEW Array")
print(old_array)
Value of NEW Array after using COPY() function ' new_array = old_array.copy() ' 
[0 1 2 3 4 5 6 7 8 9]

Value of NEW Array after updating its 1st element to 20
[20  1  2  3  4  5  6  7  8  9]

Value of OLD Array after updating value of NEW Array
[0 1 2 3 4 5 6 7 8 9]

From the example above, it is clear that copy() gives us an independent copy of a NumPy array, and that changes made to the new array have no effect on the original. Thus, in projects that involve a lot of data manipulation, it is wise to use copy() whenever the original data must stay intact.
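
A related case, shown here as a small sketch rather than as part of the original example: a slice of an array is also a view on the same data, so the same precaution applies. The names base, view and independent are illustrative.

```python
import numpy as np

base = np.arange(10)

view = base[:3]                 # a slice is a view on the same memory
independent = base[:3].copy()   # an explicit copy owns its own memory

view[0] = 99          # changes base as well
independent[1] = -1   # leaves base untouched

print(base)                                  # [99  1  2  3  4  5  6  7  8  9]
print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False
```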

Please note, the same approach applies in Pandas: Series and DataFrame objects also provide a copy() method.

Reshaping Arrays

The reshape() method of NumPy arrays is very useful for changing the shape of an array without changing its data.

Let’s understand its usage with examples. Begin with creating a base array.

In [8]:
# Our base array
x = np.arange(9)
print("Base Array:")
print(x)

print("Size of array = " + str(x.size))
Base Array:
[0 1 2 3 4 5 6 7 8]
Size of array = 9

It is a one-dimensional array with 9 elements.
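
To check how many dimensions an array really has, one can inspect its shape and ndim attributes. The sketch below compares the base array with an explicit single-row version; the name row is illustrative.

```python
import numpy as np

x = np.arange(9)
print(x.shape, x.ndim)       # (9,) 1

# An explicit "1 row x 9 columns" array is two-dimensional
row = x.reshape(1, 9)
print(row.shape, row.ndim)   # (1, 9) 2
```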

To convert it to a 3×3 array you could use the following code.

In [9]:
# Convert the 1-D array of 9 elements to a 3 x 3 array
x_3x3 = x.reshape(3,3)
print("Updated Array:")
print(x_3x3)
print("Size of array = " + str(x_3x3.size))
Updated Array:
[[0 1 2]
 [3 4 5]
 [6 7 8]]
Size of array = 9

Let’s create a 3×2 array and then reshape it to a 2×3 array

In [10]:
# Create a 3x2 array
x_3x2 = np.arange(6).reshape(3,2)
print("Updated Array:")
print(x_3x2)
print("Size of array = " + str(x_3x2.size))
Updated Array:
[[0 1]
 [2 3]
 [4 5]]
Size of array = 6
In [11]:
# Convert 3x2 array to 2x3 array
x_2x3 = x_3x2.reshape(2,3)
print("Updated Array:")
print(x_2x3)
print("Size of array = " + str(x_2x3.size))
Updated Array:
[[0 1 2]
 [3 4 5]]
Size of array = 6

NOTE: The size (total number of elements) of the new array must be the same as that of the original array; otherwise reshape() raises an error. So choose the dimensions of the new array carefully.
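
As a minimal sketch of that rule, the snippet below tries a reshape whose size does not match and catches the resulting ValueError. It also shows that one dimension may be given as -1, in which case NumPy infers it from the array size. The names failed and auto are illustrative.

```python
import numpy as np

x = np.arange(6)

# 4 x 2 would need 8 elements, but x only has 6
failed = False
try:
    x.reshape(4, 2)
except ValueError as err:
    failed = True
    print("reshape failed:", err)

# -1 asks NumPy to work out that dimension from the total size
auto = x.reshape(2, -1)
print(auto.shape)   # (2, 3)
```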

Convert a 2×3 array to single row array, that is, 1×6 array.

In [12]:
# Convert 2x3 array to single row array
x_1x6 = x_2x3.reshape(1,6)
print("Updated Array:")
print(x_1x6)
print("Size of array = " + str(x_1x6.size))
Updated Array:
[[0 1 2 3 4 5]]
Size of array = 6

Concatenate array

We will now join (concatenate) two or more arrays into a single array.

To do so we can use the following functions:

  • concatenate()
  • vstack()
  • hstack()

Joining 2 or more arrays using concatenate()

In [13]:
# 1st Array
x = np.arange(4)
print("1st Array 'X'")
print(x)

# 2nd Array
y = np.arange(4,8)
print("2nd Array 'Y'")
print(y)

# Concatenate both Arrays
conc_x_y = np.concatenate( [ x , y ] )

print("\nConcatenated 2 arrays")
print(conc_x_y)
1st Array 'X'
[0 1 2 3]
2nd Array 'Y'
[4 5 6 7]

Concatenated 2 arrays
[0 1 2 3 4 5 6 7]
  • Remember to use square brackets [ ] in the concatenate() function: the arrays are passed together as a single list
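
The reason for the brackets, sketched below under the assumption that x and y are the arrays from the example above: concatenate() expects one sequence of arrays as its first argument (a list or a tuple both work), and treats a second positional argument as the axis. The flag raised is illustrative.

```python
import numpy as np

x = np.arange(4)
y = np.arange(4, 8)

# A list and a tuple of arrays give the same result
from_list = np.concatenate([x, y])
from_tuple = np.concatenate((x, y))
print(np.array_equal(from_list, from_tuple))   # True

# Without the brackets, y is interpreted as the axis argument and the call fails
raised = False
try:
    np.concatenate(x, y)
except (TypeError, ValueError) as err:
    raised = True
    print("concatenate failed:", err)
```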

Joining arrays

We can also join more than two arrays in a single go.

In [14]:
# 3rd Array
z = np.arange(8,12)
print("3rd Array 'Z'")
print(z)

# Concatenate all three arrays
conc_x_y_z = np.concatenate( [ x, y, z ] )

print("\nConcatenated 3 arrays")
print(conc_x_y_z)
3rd Array 'Z'
[ 8  9 10 11]

Concatenated 3 arrays
[ 0  1  2  3  4  5  6  7  8  9 10 11]

What to do if we are given 2-dimensional array?

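You can use the same concatenate() function; however, keep an eye on the shapes of the arrays. By default the arrays are joined along the first axis (rows), so they must match in every other dimension.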

In [15]:
# 1st 2-D array
x_2 = np.arange(8).reshape(2,4)
print("1st 2-D Array")
print(x_2)

# 2nd 2-D array
y_2 = np.arange(8,16).reshape(2,4)
print("\n2nd 2-D Array")
print(y_2)

# Concatenate 2-D arrays
conc_x_y_2 = np.concatenate( [ x_2, y_2 ] )
print("\n Concatenated 2-D Arrays")
print(conc_x_y_2)
1st 2-D Array
[[0 1 2 3]
 [4 5 6 7]]

2nd 2-D Array
[[ 8  9 10 11]
 [12 13 14 15]]

 Concatenated 2-D Arrays
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
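
The example above joins the arrays along the first axis, which is the default. As a short sketch, passing axis=1 to concatenate() joins the same two arrays side by side instead; the name side_by_side is illustrative.

```python
import numpy as np

x_2 = np.arange(8).reshape(2, 4)
y_2 = np.arange(8, 16).reshape(2, 4)

# axis=1 joins along the columns, so the row count must match
side_by_side = np.concatenate([x_2, y_2], axis=1)
print(side_by_side)
# [[ 0  1  2  3  8  9 10 11]
#  [ 4  5  6  7 12 13 14 15]]
print(side_by_side.shape)   # (2, 8)
```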

Concatenate Arrays of Different Dimensions

If we want to join 2 arrays whose dimensions differ, we can use the following functions:

  • vstack() — to join 2 arrays vertically
  • hstack() — to join 2 arrays horizontally

Joining arrays Vertically using vstack()

In [16]:
## 1st Array
x = np.array([1,2])
print("1st Array 'X'")
print(x)

# 2nd Array
y = np.array([[4,5],
              [6,7]])
print("\n2nd Array 'Y'")
print(y)

# Concatenate both Arrays Vertically
conc_x_y = np.vstack( [ x , y ] )

print("\nConcatenated 2 arrays Vertically")
print(conc_x_y)
1st Array 'X'
[1 2]

2nd Array 'Y'
[[4 5]
 [6 7]]

Concatenated 2 arrays Vertically
[[1 2]
 [4 5]
 [6 7]]

Joining arrays Horizontally using hstack()

In [17]:
## 1st Array
x = np.array([[1],
              [2]])
print("1st Array 'X'")
print(x)

# 2nd Array
y = np.array([[4,5],
              [6,7]])
print("\n2nd Array 'Y'")
print(y)

# Concatenate both Arrays Horizontally
conc_x_y = np.hstack( [ x , y ] )

print("\nConcatenated 2 arrays Horizontally")
print(conc_x_y)
1st Array 'X'
[[1]
 [2]]

2nd Array 'Y'
[[4 5]
 [6 7]]

Concatenated 2 arrays Horizontally
[[1 4 5]
 [2 6 7]]
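
To see why vstack() and hstack() are convenient, the sketch below first shows that concatenate() rejects a 1-D array mixed with a 2-D array, and then reproduces both stacking results with concatenate() and an explicit axis. The names mixed_failed, stacked, manual and col are illustrative.

```python
import numpy as np

x = np.array([1, 2])
y = np.array([[4, 5],
              [6, 7]])

# concatenate() needs all inputs to have the same number of dimensions
mixed_failed = False
try:
    np.concatenate([x, y])
except ValueError as err:
    mixed_failed = True
    print("concatenate failed:", err)

# vstack() treats the 1-D array as a single row before joining
stacked = np.vstack([x, y])
manual = np.concatenate([x.reshape(1, 2), y], axis=0)
print(np.array_equal(stacked, manual))   # True

# hstack() on 2-D arrays matches concatenate() along axis=1
col = np.array([[1],
                [2]])
print(np.array_equal(np.hstack([col, y]),
                     np.concatenate([col, y], axis=1)))   # True
```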

Split Arrays

We will now split bigger arrays into smaller arrays using the following functions:

  • split()
  • vsplit()
  • hsplit()

Please note, all these functions take the positions (index values) at which we want to split an array.

In [18]:
x = np.arange(10)
x
Out[18]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
  • Suppose we want to split our array in 2 parts at position 4 (at index 4 we have value = 4 in array ‘x’, and it becomes the first element of the second part)

split()

split() can be used in the following way; note that n split positions produce n + 1 arrays

arr_1, arr_2, arr_3 … , arr_(n+1) = np.split(arr_original, [ pos_1, pos_2, pos_3 … , pos_n ] )

In [19]:
# Split array in 2 parts at position 4
x1, x2 = np.split(x, [4])

print('New array after splitting')
print(x1, "\n")
print(x2)
New array after splitting
[0 1 2 3] 

[4 5 6 7 8 9]
  • Splitting array in 4 parts at 3 different positions.
In [20]:
# Split array in 4 parts at positions 2, 5, 9
x1, x2, x3, x4 = np.split(x, [2, 5, 9])

print('New array after splitting')
print(x1, "\n")
print(x2, "\n")
print(x3, "\n")
print(x4)
New array after splitting
[0 1] 

[2 3 4] 

[5 6 7 8] 

[9]
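
Besides a list of positions, split() also accepts a single integer meaning "this many equal parts". The sketch below shows that form, the error raised when an equal division is impossible, and np.array_split as a more forgiving alternative. The names halves, uneven_failed and parts are illustrative.

```python
import numpy as np

x = np.arange(10)

# An integer asks for that many equal-sized parts
halves = np.split(x, 2)
print([h.tolist() for h in halves])   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

# 10 elements cannot be divided into 3 equal parts
uneven_failed = False
try:
    np.split(x, 3)
except ValueError as err:
    uneven_failed = True
    print("split failed:", err)

# array_split() allows parts of unequal size
parts = np.array_split(x, 3)
print([p.tolist() for p in parts])    # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```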

Vertical Split

In [21]:
# Base Array
x = np.arange(16).reshape((4, 4))
x
Out[21]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
In [22]:
# Vertical split before the 3rd row (row index 2)

x1v, x2v = np.vsplit(x,[2]) 

print('New array after splitting')
print(x1v, "\n")
print(x2v)
New array after splitting
[[0 1 2 3]
 [4 5 6 7]] 

[[ 8  9 10 11]
 [12 13 14 15]]

Horizontal Split

In [23]:
# Horizontal split before the 4th column (column index 3)

x1h, x2h = np.hsplit(x,[3]) 

print('New array after splitting')
print(x1h, "\n")
print(x2h)
New array after splitting
[[ 0  1  2]
 [ 4  5  6]
 [ 8  9 10]
 [12 13 14]] 

[[ 3]
 [ 7]
 [11]
 [15]]
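
As a brief sketch, vsplit() and hsplit() give the same results as split() with axis=0 and axis=1 respectively. The variable names ending in _s are illustrative.

```python
import numpy as np

x = np.arange(16).reshape((4, 4))

# vsplit() splits along rows, like split() with axis=0
top, bottom = np.vsplit(x, [2])
top_s, bottom_s = np.split(x, [2], axis=0)
print(np.array_equal(top, top_s), np.array_equal(bottom, bottom_s))   # True True

# hsplit() splits along columns, like split() with axis=1
left, right = np.hsplit(x, [3])
left_s, right_s = np.split(x, [3], axis=1)
print(np.array_equal(left, left_s), np.array_equal(right, right_s))   # True True
```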

We have covered the basics of NumPy arrays in this article. In the next article we will cover performance enhancement in Python using universal functions in NumPy.

Stay tuned! To get the latest updates, subscribe to our blog by registering with your email. Also, check out our YouTube channel, ML for Analytics, which covers many more topics in the form of video tutorials.
