NumPy Basics: Arrays, Operations, and Broadcasting
NumPy is the foundation of scientific Python. Here's what data science exams test — arrays, broadcasting, and vectorised operations.
Creating arrays
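NumPy arrays hold elements of a single dtype, so numerical operations on them run in compiled code rather than in a Python loop. That makes them faster than Python lists for numerical work and is what enables vectorised operations.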
import numpy as np
# From Python list
arr = np.array([1, 2, 3, 4, 5])
# Built-in constructors
np.zeros((3, 4)) # 3x4 array of zeros
np.ones((2, 3)) # 2x3 array of ones
np.eye(3) # 3x3 identity matrix
np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
np.linspace(0, 1, 5) # [0, 0.25, 0.5, 0.75, 1.0]
np.random.randn(3, 3) # 3x3 random normal values
# Array properties
arr.shape # (5,)
arr.dtype # dtype('int64') on most platforms (the default integer type is platform-dependent)
arr.ndim # 1
arr.size # 5 (total elements)
Array operations and vectorisation
NumPy operations apply element-wise to entire arrays — no explicit loops needed.
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
# Element-wise operations
a + b # [11, 22, 33, 44]
a * b # [10, 40, 90, 160]
a ** 2 # [1, 4, 9, 16]
np.sqrt(a) # [1, 1.41, 1.73, 2.0]
# Aggregate operations
a.sum() # 10
a.mean() # 2.5
a.std() # standard deviation
a.min() # 1
a.argmax() # 3 (index of max value)
# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
A @ B # matrix multiplication
np.dot(A, B) # same result
A.T # transpose
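To make "no explicit loops needed" concrete, here is a minimal sketch that computes the same result twice: once with a Python loop and once with a vectorised expression. The variable names `looped` and `vectorised` are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Explicit Python loop: one element at a time
looped = np.empty_like(a)
for i in range(len(a)):
    looped[i] = a[i] * b[i] + 1

# Vectorised form: the same arithmetic applied to whole arrays
vectorised = a * b + 1

print(looped)                                # [ 11  41  91 161]
print(np.array_equal(looped, vectorised))    # True
```

Both forms give the same values; the vectorised one is shorter and pushes the loop into NumPy's compiled code.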
Indexing and slicing
NumPy supports advanced indexing beyond standard Python slicing.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr[0, 1] # 2 (row 0, col 1)
arr[1, :] # [4, 5, 6] (all of row 1)
arr[:, 2] # [3, 6, 9] (all of col 2)
arr[0:2, 1:3] # [[2,3],[5,6]] (subarray)
# Boolean indexing
a = np.array([1, -2, 3, -4, 5])
a[a > 0] # [1, 3, 5]
a[a < 0] = 0 # replace negatives with 0
# Fancy indexing
indices = np.array([0, 2, 4])
a[indices] # [1, 3, 5]
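The two advanced forms differ in what is passed as the index. The short sketch below, with illustrative names such as `mask` and `picked`, shows that a boolean index is an array of True/False values, while a fancy index is an array of integer positions.

```python
import numpy as np

a = np.array([1, -2, 3, -4, 5])

# Boolean indexing: the index is a same-length array of True/False
mask = a > 0
print(mask)          # [ True False  True False  True]
positives = a[mask]
print(positives)     # [1 3 5]

# Conditions combine with & and |; each comparison needs its own parentheses
mid_range = a[(a > -3) & (a < 4)]
print(mid_range)     # [ 1 -2  3]

# Fancy indexing: the index is an array (or list) of integer positions
picked = a[np.array([0, 2])]
print(picked)        # [1 3]
```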
Broadcasting
Broadcasting allows operations between arrays of different shapes, following strict rules.
# Rule: compare shapes from the right; each pair of dimensions must be equal, or one of them must be 1
a = np.array([[1, 2, 3],
              [4, 5, 6]]) # shape (2, 3)
b = np.array([10, 20, 30]) # shape (3,)
# b is broadcast to (2, 3)
a + b # [[11,22,33],[14,25,36]]
# Scalar is broadcast to any shape
a * 2 # [[2,4,6],[8,10,12]]
# Column vector
c = np.array([[100], [200]]) # shape (2, 1)
# c is broadcast to (2, 3)
a + c # [[101,102,103],[204,205,206]]
Exam tip
Broadcasting is the most-tested NumPy concept. The key rule: shapes are compared from the right — sizes must match OR one must be 1. Also know the difference between a[a > 0] (boolean indexing) and a[[0, 2]] (fancy indexing).
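The right-to-left rule can be sketched as a small function. `broadcast_shape` is a hypothetical helper written for illustration, not part of NumPy; it only mirrors the rule described above, and the last lines check it against what NumPy actually does.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the right-to-left rule to two shapes."""
    result = []
    # Walk both shapes from the last axis; a missing axis counts as size 1
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"incompatible sizes {dim_a} and {dim_b}")
        # A size-1 axis stretches to match the other one
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))

print(broadcast_shape((2, 3), (3,)))     # (2, 3)
print(broadcast_shape((2, 3), (2, 1)))   # (2, 3)

# (2, 3) with (2,) fails: the rightmost sizes are 3 and 2
try:
    broadcast_shape((2, 3), (2,))
except ValueError as err:
    print(err)                           # incompatible sizes 3 and 2

# Reshaping the length-2 array into a column, shape (2, 1), makes it compatible
a = np.zeros((2, 3))
col = np.array([100, 200]).reshape(2, 1)
print((a + col).shape)                   # (2, 3)
```

A common exam trap is the (2, 3) with (2,) case: the sizes look related, but the comparison starts at the last axis, so the array has to be reshaped to (2, 1) first.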