NumPy: Machine Learning
NumPy - Numerical Python
NumPy stands for Numerical Python. It is one of the most important Python libraries for students who want to learn data science, machine learning, and scientific computing.
If Python lists are a basic notebook, NumPy arrays are a proper calculator. They are faster, more compact, and much better for mathematical operations.
In simple words, NumPy helps us work with:
- numbers in large amounts
- arrays and matrices
- mathematical operations
- data processing for machine learning
Why NumPy is Important
Before NumPy, doing numerical work in Python was possible, but not efficient. NumPy makes the work easier and faster because its core operations run in compiled C code and work on whole arrays at once, instead of one element at a time in a Python loop.
Main benefits
- It is faster than normal Python lists for numerical tasks.
- It supports multidimensional arrays.
- It provides many built-in math functions.
- It is the foundation of many data science libraries like Pandas, SciPy, Matplotlib, and Scikit-learn.
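To make the speed claim concrete, here is a small sketch that squares 100,000 numbers with a list comprehension and with a NumPy array, then times both. The array size, the repeat count, and the function names are assumptions chosen for illustration, and the exact timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000
values_list = list(range(n))
# int64 is requested explicitly so the squares cannot overflow on any platform
values_array = np.arange(n, dtype=np.int64)


def square_with_list():
    # One Python-level multiplication per element
    return [x * x for x in values_list]


def square_with_numpy():
    # One vectorized multiplication over the whole array
    return values_array * values_array


# Both approaches should produce the same numbers
same_result = square_with_list() == square_with_numpy().tolist()
print("same result:", same_result)

list_time = timeit.timeit(square_with_list, number=20)
numpy_time = timeit.timeit(square_with_numpy, number=20)
print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```

On most machines the NumPy version should finish noticeably sooner, but treat the numbers as a rough comparison rather than a benchmark.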
Installing NumPy
You can install NumPy using pip:
pip install numpy
Then import it in Python like this:
import numpy as np
We usually write np because it is short and standard.
What is an Array?
An array is a collection of values stored together.
Python list example
numbers = [1, 2, 3, 4, 5]
NumPy array example
import numpy as np
numbers = np.array([1, 2, 3, 4, 5])
print(numbers)
Output:
[1 2 3 4 5]
Why Use NumPy Array Instead of List?
Python lists can hold values of different data types, while all elements of a NumPy array share a single data type (its dtype).
Example with list
data = [1, 2, 3.5, "hello"]
Example with NumPy array
import numpy as np
data = np.array([1, 2, 3, 4])
print(data.dtype)
Output:
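int64
Note: the exact integer type depends on the platform. On Windows with NumPy versions before 2.0, for example, the default is int32.
Because every element has the same type and size, NumPy can store the values in a compact block of memory and process them more efficiently.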
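The sketch below shows what happens to the data type when the input values are mixed, and how to request a type explicitly with the dtype argument. The variable names are just for illustration.

```python
import numpy as np

# Integers and a float together: everything is stored as float
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# Ask for a specific type with the dtype argument
as_float32 = np.array([1, 2, 3], dtype=np.float32)
print(as_float32.dtype)   # float32

# Numbers and text together: everything is stored as a string
with_text = np.array([1, 2, 3.5, "hello"])
print(with_text.dtype.kind)  # U (Unicode string)
```

So NumPy does not reject mixed input. It converts every value to one common type, which is usually not what you want for numerical work.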
Creating NumPy Arrays
1. One-dimensional array
import numpy as np
arr = np.array([10, 20, 30, 40])
print(arr)
2. Two-dimensional array
import numpy as np
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(matrix)
3. Three-dimensional array
import numpy as np
cube = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]]
])
print(cube)
Important Array Properties
NumPy arrays have useful properties that help us understand the data.
Shape
Shape tells us the length of the array along each dimension. For a 2D array, that is the number of rows and columns.
import numpy as np
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(matrix.shape)
Output:
(2, 3)
This means the array has 2 rows and 3 columns.
Size
Size tells us the total number of elements.
print(matrix.size)
Output:
6
Data type
print(matrix.dtype)
This shows the type of values stored in the array.
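The same properties work for arrays with more than two dimensions. As a small sketch, here they are on a 3D array, together with ndim, an attribute not covered above that gives the number of dimensions.

```python
import numpy as np

cube = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]]
])

print(cube.ndim)   # 3 -> number of dimensions
print(cube.shape)  # (2, 2, 2) -> length along each dimension
print(cube.size)   # 8 -> total number of elements
```

Note that size is always the product of the numbers in shape (here 2 * 2 * 2 = 8).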
Indexing and Slicing
Just like lists, NumPy arrays support indexing and slicing.
One-dimensional slicing
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])
print(arr[1:4])
print(arr[-1])
Output:
10
[20 30 40]
50
Two-dimensional indexing
import numpy as np
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
print(matrix[1, 2])
print(matrix[0, :])
print(matrix[:, 1])
Output:
6
[1 2 3]
[2 5 8]
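Slices can also be combined on both axes at once to cut out a smaller block, and a step can be added just like with lists. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

every_second = arr[::2]        # start to end, step 2
top_right = matrix[0:2, 1:3]   # rows 0-1, columns 1-2
last_value = matrix[-1, -1]    # last row, last column

print(every_second)  # values 10, 30, 50
print(top_right)     # rows [2, 3] and [5, 6]
print(last_value)    # 9
```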
Common NumPy Operations
NumPy makes mathematical operations very easy.
Addition
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)
Output:
[5 7 9]
Subtraction
print(b - a)
Output:
[3 3 3]
Multiplication
print(a * b)
Output:
[ 4 10 18]
Division
print(b / a)
Output:
[4.  2.5 2. ]
Scalar Operations
You can also combine a single number (a scalar) with every element of an array.
import numpy as np
arr = np.array([1, 2, 3])
print(arr + 10)
print(arr * 2)
Output:
[11 12 13]
[2 4 6]
This is one of the best features of NumPy. It avoids writing loops again and again.
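To see what is being avoided, here is a sketch that adds 10 to every element twice: once with an explicit loop over a list, and once with a single NumPy expression.

```python
import numpy as np

arr = np.array([1, 2, 3])

# Loop version: handle one element at a time
looped = []
for value in arr:
    looped.append(int(value) + 10)

# NumPy version: one expression for the whole array
vectorized = arr + 10

print(looped)      # [11, 12, 13]
print(vectorized)  # [11 12 13]
```

Both give the same numbers, but the NumPy version is shorter and does the work in compiled code.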
Reshaping Arrays
Reshaping means changing the shape of an array without changing its data. The total number of elements must stay the same.
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped = arr.reshape(2, 3)
print(reshaped)
Output:
[[1 2 3]
 [4 5 6]]
This is very useful when working with data for machine learning models.
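Two details are worth trying out: passing -1 lets NumPy work out one dimension for you, and asking for a shape that does not fit the number of elements raises a ValueError. A minimal sketch:

```python
import numpy as np

arr = np.arange(1, 7)  # [1 2 3 4 5 6]

# -1 means "work out this dimension from the others"
auto = arr.reshape(3, -1)
print(auto.shape)  # (3, 2)

# 6 elements cannot fill a 4 x 2 array, so NumPy raises an error
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("Cannot reshape:", error)
```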
Useful NumPy Functions
zeros
Creates an array filled with zeros.
import numpy as np
print(np.zeros((2, 3)))
ones
Creates an array filled with ones.
print(np.ones((3, 2)))
arange
Creates values in a range.
print(np.arange(0, 10, 2))
Output:
[0 2 4 6 8]
linspace
Creates evenly spaced numbers between two values.
print(np.linspace(0, 1, 5))
Output:
[0.   0.25 0.5  0.75 1.  ]
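arange and linspace look similar, so here is a small side-by-side sketch of the difference: arange takes a step and leaves out the stop value, while linspace takes a count and includes the stop value by default. It also shows that zeros creates floats unless you pass a dtype.

```python
import numpy as np

steps = np.arange(0, 1, 0.25)   # step of 0.25, stop value 1 is excluded
points = np.linspace(0, 1, 5)   # 5 values, stop value 1 is included

print(len(steps), len(points))  # 4 5

float_zeros = np.zeros((2, 3))           # default dtype is float64
int_zeros = np.zeros((2, 3), dtype=int)  # integer zeros on request
print(float_zeros.dtype)                 # float64
```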
Aggregation Functions
These functions help us summarize data.
import numpy as np
arr = np.array([10, 20, 30, 40])
print(np.sum(arr))
print(np.mean(arr))
print(np.max(arr))
print(np.min(arr))
Output:
100
25.0
40
10
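On a 2D array, the same functions accept an axis argument, so you can summarize each column or each row separately. A short sketch:

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

column_sums = np.sum(matrix, axis=0)   # add down the rows: one value per column
row_sums = np.sum(matrix, axis=1)      # add across the columns: one value per row
row_means = np.mean(matrix, axis=1)

print(column_sums)  # [5 7 9]
print(row_sums)     # [ 6 15]
print(row_means)    # [2. 5.]
```

Without axis, the functions use every element, so np.sum(matrix) gives 21.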
Random Numbers
NumPy also has a random module, which is very useful in machine learning and simulations.
import numpy as np
print(np.random.randint(1, 10, 5))
This gives 5 random integers from 1 to 9. The upper bound (10) is not included.
You can also create random floats between 0 (included) and 1 (not included):
print(np.random.rand(2, 2))
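For new code, NumPy's documentation recommends the Generator interface created with np.random.default_rng. The sketch below produces the same kinds of values with it, and uses a seed so the results can be repeated. The seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

random_ints = rng.integers(1, 10, size=5)  # integers from 1 to 9
random_floats = rng.random((2, 2))         # floats in [0, 1)
print(random_ints)
print(random_floats)

# A second generator with the same seed gives the same numbers
rng_again = np.random.default_rng(seed=42)
repeated_ints = rng_again.integers(1, 10, size=5)
print(np.array_equal(random_ints, repeated_ints))  # True
```

Fixing the seed is handy in machine learning experiments, because it lets you rerun the code and get the same random values.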
Broadcasting
Broadcasting means NumPy can perform operations on arrays of different shapes when the shapes are compatible. The smaller one (here, a single number) is stretched to match the larger one.
import numpy as np
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(matrix + 10)
Output:
[[11 12 13]
 [14 15 16]]
This saves a lot of time and code.
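Broadcasting is not limited to single numbers. As a sketch, the code below adds a row of 3 values to every row of a 2 x 3 matrix, adds a column of 2 values to every column, and shows that shapes which do not match raise a ValueError.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

row = np.array([10, 20, 30])       # shape (3,): added to each row
column = np.array([[100], [200]])  # shape (2, 1): added to each column

plus_row = matrix + row
plus_column = matrix + column
print(plus_row)     # rows [11, 22, 33] and [14, 25, 36]
print(plus_column)  # rows [101, 102, 103] and [204, 205, 206]

# Shapes (2, 3) and (2,) are not compatible
try:
    matrix + np.array([1, 2])
    broadcast_failed = False
except ValueError as error:
    broadcast_failed = True
    print("Cannot broadcast:", error)
```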
Matrix Operations
NumPy is especially strong for matrix and linear algebra operations.
Dot product
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.dot(a, b))
Output:
32
Matrix multiplication
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
print(np.matmul(matrix1, matrix2))
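A common beginner mistake is to use * for matrix multiplication. The sketch below compares np.matmul, the @ operator (which does the same thing for these 2D arrays), and element-wise multiplication with *.

```python
import numpy as np

matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

product = np.matmul(matrix1, matrix2)  # rows of matrix1 times columns of matrix2
with_operator = matrix1 @ matrix2      # same result, shorter to write
elementwise = matrix1 * matrix2        # NOT matrix multiplication

print(product)      # rows [19, 22] and [43, 50]
print(elementwise)  # rows [5, 12] and [21, 32]
```

For example, the top-left value of the product is 1*5 + 2*7 = 19, while the top-left value of the element-wise result is just 1*5 = 5.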
Copy vs View
This is an important concept for students.
A view shares memory with the original array, while a copy has its own separate data. If two arrays share memory, changing one also changes the other.
import numpy as np
arr = np.array([1, 2, 3])
copy_arr = arr.copy()
copy_arr[0] = 100
print(arr)
print(copy_arr)
Output:
[1 2 3]
[100   2   3]
The original array stays safe when you use .copy().
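The other half of the story is the view. Basic slicing returns a view, so changing the slice also changes the original array. Here is a small sketch, using np.shares_memory to check whether two arrays share data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# A slice is a view: it shares memory with arr
view = arr[1:4]
view[0] = 100
print(arr)                          # the 2 in arr has become 100
print(np.shares_memory(arr, view))  # True

# A copy of the slice is independent
safe = arr[1:4].copy()
safe[0] = -1
print(arr)                          # arr is not changed this time
print(np.shares_memory(arr, safe))  # False
```

If you plan to modify a slice and still need the original data, call .copy() on it first.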
Simple Real-Life Use Case
Suppose you have the marks of several students and want to calculate the average quickly.
import numpy as np
marks = np.array([78, 85, 91, 66, 88])
print("Total:", np.sum(marks))
print("Average:", np.mean(marks))
print("Highest:", np.max(marks))
This kind of work is common in data analysis and machine learning preprocessing.
Why NumPy Matters in Machine Learning
Machine learning uses a lot of numbers, matrices, and mathematical operations. NumPy makes this possible in a clean and fast way.
It is used for:
- storing training data
- preprocessing datasets
- matrix calculations
- creating random values for experiments
- supporting libraries like Pandas and Scikit-learn
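Several of the uses listed above can be combined in one tiny sketch: store data in a matrix, preprocess it by standardizing each column, create random weights, and compute predictions with a matrix calculation. The data values, the variable names, and the seed are made up for illustration, and this is not a real trained model.

```python
import numpy as np

# Hypothetical training data: 3 samples (rows), 2 features (columns)
X = np.array([
    [1.0, 200.0],
    [2.0, 300.0],
    [3.0, 400.0]
])

# Preprocessing: give every column mean 0 and standard deviation 1
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Random starting weights, one per feature (seed fixed for repeatability)
rng = np.random.default_rng(seed=0)
weights = rng.normal(size=2)

# Matrix calculation: one prediction per sample
predictions = X_scaled @ weights

print(X_scaled.mean(axis=0))  # both values very close to 0
print(predictions.shape)      # (3,)
```

Notice that the standardizing step relies on broadcasting: the column means and standard deviations have shape (2,) and are applied to every row of X.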
Summary
NumPy is a powerful Python library for numerical computing. As a student, you should learn it early because it makes data handling easier, faster, and more practical.
If Python is the programming language, NumPy is one of the main tools that makes it useful for data science and machine learning.
Quick Practice
Try writing these by yourself:
- Create a NumPy array of 5 numbers.
- Print its shape and data type.
- Add 10 to every element.
- Find the sum and average.
- Reshape a 1D array into a 2D array.