# Efficient Matrix Creation Using NumPy: A Comprehensive Guide

Written on

Today, we will delve into a vital area of numerical computing—constructing matrices efficiently in *NumPy*. Matrix operations form the backbone of numerous scientific and engineering applications and are essential in machine learning algorithms. While NumPy offers a powerful and convenient platform for matrix operations, matrix construction can still be a complex task. In this article, we will investigate how to create intricate matrices using *NumPy's* vectorization capabilities, thereby circumventing the slow process of iterating through rows and columns. By the conclusion of this article, you'll possess a solid understanding of efficient matrix creation in *NumPy* and how to apply this knowledge across various applications. Let’s get started!

## Our Sample Matrix

Imagine we need to create a complex (N+1)×(N+1) matrix D in Python with the following values:

The entries are defined as follows:

and

If you’re curious about what this matrix represents, it is known as the Chebyshev differentiation matrix, a spectral representation of the derivative operator on a Chebyshev grid. We will demonstrate its application at the end of this article.

A naive way to construct the matrix would involve creating an empty 2D array and looping through its rows and columns:

# The naive way. Don't do this! import numpy as np

N = 30 D = np.zeros((N+1, N+1))

for i in range(N+1):

for j in range(N+1):

...

However, as you might be aware, using for loops in Python is generally inefficient. Therefore, it's advisable to utilize *NumPy's* vectorization features instead. When dealing with a complex matrix like this, where you have a formula based on row and column indices, we will employ some *NumPy* techniques to achieve our goal.

## Efficient Matrix Construction

Let’s begin with the vector *x*. For clarity and to keep the output manageable, we'll set N=3.

import numpy as np

N = 3

x = np.cos(np.pi * np.arange(N+1) / N)

The output will be:

array([ 1. , 0.5, -0.5, -1. ])

Here, *NumPy* treats x as a row vector, though we will later confirm that this is not entirely accurate. Now, let’s define a matrix X using the indices:

How do we accomplish this in *NumPy*? We expect that *NumPy* behaves intuitively. Thus, we should be able to subtract x from its transpose and hope that it automatically expands the vectors into a matrix. Let's test this:

x - x.T

The output will show:

array([0., 0., 0., 0.])

This indicates that it remains a vector, but why? Let’s check the shape of x:

x.shape

This returns (4,), which means it is not a row vector; had it been, the shape would have been (1, 4). Similarly, it’s not a column vector either, as that would yield a shape of (4, 1). Hence, we should remember that constructing a "vector" in *NumPy* does not create a true row or column vector. To achieve this, we need to specify it explicitly, perhaps using reshape:

x = x.reshape(1, N+1) # turn x into a row vector

Now, let’s confirm the shape:

x.shape

This will yield:

(1, 4)

Alternatively, we can turn x into a column vector:

x = x.reshape(N+1, 1) # turn x into a column vector

The resulting shape would now be:

array([[ 1. ],

[ 0.5],

[-0.5],

[-1. ]])

Now we can perform the expected inflation:

X = x - x.T

The output will be:

array([[ 0. , 0.5, 1.5, 2. ],

[-0.5, 0. , 1. , 1.5],

[-1.5, -1. , 0. , 0.5],

[-2. , -1.5, -0.5, 0. ]])

Next, we can proceed to create the off-diagonal entries of our matrix D:

The goal is to create two additional matrices C and S with the following entries:

and

This will allow us to express:

We define:

c = np.ones_like(x) c[0, 0] = c[0, -1] = 2

This will yield:

array([[2.],

[1.],

[1.],

[1.]])

Using ones_like, we avoid needing to reshape to get a row vector, as the _like function inherits the shape from x, which we have already defined as a column vector.

Now, let’s check the shape of c:

c.shape

This returns:

(4, 1)

Next, we define:

C = c / c.T

The output will be:

array([[1. , 2. , 2. , 2. ],

[0.5, 1. , 1. , 1. ],

[0.5, 1. , 1. , 1. ],

[0.5, 1. , 1. , 1. ]])

Now we will create the sign matrix:

j = np.arange(N+1).reshape(N+1, 1)

S = (-1)**j * (-1)**j.T

The output will look like this:

array([[ 1, -1, 1, -1],

[-1, 1, -1, 1],

[ 1, -1, 1, -1],

[-1, 1, -1, 1]])

Now we’re prepared to define the off-diagonal entries of D. However, we must be cautious to select only the off-diagonal entries. If we simply write:

D = C * S / X

It will throw an error because we’ve also selected the diagonal entries, and the *x? - x?* in the denominator for i=j is zero. Thus, we must utilize a mask to extract the off-diagonal entries. It’s simpler to create a mask for the diagonal entries first and then invert it.

mask_diag = np.eye(N+1, dtype=bool)

This will yield:

array([[ True, False, False, False],

[False, True, False, False],

[False, False, True, False],

[False, False, False, True]])

Now we can invert it:

mask_off = ~mask_diag

This will result in:

array([[False, True, True, True],

[ True, False, True, True],

[ True, True, False, True],

[ True, True, True, False]])

Next, we initialize our matrix D:

D = np.zeros((N+1, N+1)) D[mask_off] = C[mask_off] * S[mask_off] / X[mask_off]

The output will be:

array([[ 0. , -4. , 1.33333333, -1. ],

[ 1. , 0. , -1. , 0.66666667],

[-0.33333333, 1. , 0. , -2. ],

[ 0.25 , -0.66666667, 2. , 0. ]])

Now for the diagonal entries:

By applying the diagonal mask to D, we obtain a 1D array that references all diagonal entries (currently all zero). Therefore, we also need a 1D array for the right-hand side. However, x is currently 2D since we turned it into a column vector of shape (4, 1). Thus, we need to revert this step using reshape(-1), and we must only use the values with indices 1 through N-1:

x = x.reshape(-1) # turns x back into shape (N,)

diag = np.zeros(N+1) diag[1:-1] = - x[1:-1] / (2*(1-x[1:-1]**2))

We then assign the diagonal entries:

D[mask_diag] = diag

The resulting matrix will look like this:

array([[ 0. , -4. , 1.33333333, -1. ],

[ 1. , -0.33333333, -1. , 0.66666667],

[-0.33333333, 1. , 0.33333333, -2. ],

[ 0.25 , -0.66666667, 2. , 0. ]])

Finally, we need to assign values to the two entries in the upper left and lower right corners:

D[0, 0] = 2*(N**2 + 1) / 6 D[N, N] = -D[0, 0]

And we have completed our matrix:

array([[ 3.33333333, -4. , 1.33333333, -1. ],

[ 1. , -0.33333333, -1. , 0.66666667],

[-0.33333333, 1. , 0.33333333, -2. ],

[ 0.25 , -0.66666667, 2. , -3.33333333]])

Let’s encapsulate this entire process in a function:

def cheb(N):

x = np.cos(np.pi * np.arange(N+1) / N)

x = x.reshape(1, N+1)

X = x - x.T

c = np.ones_like(x)

c[0, 0] = c[0, -1] = 2

C = c / c.T

j = np.arange(N+1).reshape(N+1, 1)

S = (-1)**j * (-1)**j.T

mask_diag = np.eye(N+1, dtype=bool)

mask_off = ~mask_diag

D = np.zeros((N+1, N+1))

D[mask_off] = C[mask_off] * S[mask_off] / X[mask_off]

x = x.reshape(-1) # turns x back into shape (N,)

diag = np.zeros(N+1)

diag[1:-1] = -x[1:-1] / (2*(1-x[1:-1]**2))

D[mask_diag] = diag

D[0, 0] = (2*N**2 + 1) / 6

D[N, N] = -D[0, 0]

return D.T

## Example Application

Now, let’s utilize the matrix function we just created. As mentioned earlier, this matrix serves as a representation of a derivative operation on a Chebyshev grid. We will place a function on such a grid and compare its exact derivative with the one computed using the Chebyshev matrix. To see how the performance scales, we will do this for different numbers of grid points:

import matplotlib.pyplot as plt

Ns = np.logspace(1, 2, 30, dtype=int) errs = []

for N in Ns:

x = np.cos(np.arange(N+1)*np.pi/N)

f = np.exp(-x**2) + x

df_dx_exact = -2*x*np.exp(-x**2) + 1

D = cheb(N)

df_dx = np.dot(D, f)

err = np.max(np.abs(df_dx - df_dx_exact))

errs.append(err)

plt.loglog(Ns, errs, '-o') plt.grid() plt.xlabel('N') plt.ylabel('max. error in derivative');

With just 20 grid points, we achieve a precision of 10^-12! This is what is known as spectral accuracy!

## Conclusion

In conclusion, we explored efficient methods for constructing complex matrices in *NumPy*, leveraging its vectorization features to optimize performance. By utilizing various *NumPy* techniques and understanding the behavior of its arrays, we successfully built the Chebyshev differentiation matrix, a valuable tool for numerical differentiation and solving differential equations. We hope this article has provided insightful knowledge for those looking to enhance their coding efficiency and matrix construction skills. Until next time, happy coding!

*If you found this article beneficial, consider becoming a Medium member for unlimited access to all articles. By registering through this link, you can also support me as a writer.*