1. Matrices and vectors

What is a matrix? It sounds like a fancy term, but it’s really nothing special. It’s just a tool for holding a large set of numbers, so that we’re spared the pain of writing them all out every time. It also contributes to environmental protection, since we save space on the paper.

This is a matrix, plain and simple.

Now instead of writing eight numbers, we’ll just write A to represent them. Surely saves paper.

OK, I admit the paper-saving reason seems silly. Seriously, though, a matrix is just a rectangular array of numbers.

Dimension of a matrix: number of rows × number of columns. A has 4 rows and 2 columns, so it is a 4×2 matrix.

An entry of a matrix is just the number at a given position.

A_ij = “i,j entry” = the number in the i^th row, j^th column.

For example, A₁₁ = 1402, A₃₂ = 1437.
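As a rough sketch of these definitions in Python, a matrix can be stored as a list of rows. Only A₁₁ = 1402 and A₃₂ = 1437 come from the text above; the other six entries are made-up placeholders, and the helpers `dimension` and `entry` are hypothetical names introduced here. Note that Python counts from 0 while the math notation counts from 1.

```python
# A 4x2 matrix stored as a list of rows.
# Only the entries 1402 (row 1, column 1) and 1437 (row 3, column 2)
# come from the text; the rest are placeholders for illustration.
A = [
    [1402, 10],
    [20, 30],
    [40, 1437],
    [50, 60],
]


def dimension(matrix):
    """Return (number of rows, number of columns)."""
    return len(matrix), len(matrix[0])


def entry(matrix, i, j):
    """Return the i,j entry using 1-based indices, as in the math notation."""
    return matrix[i - 1][j - 1]


rows, cols = dimension(A)
print(f"A is {rows} x {cols}")  # A is 4 x 2
print(entry(A, 1, 1))           # 1402
print(entry(A, 3, 2))           # 1437
```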

What about vectors? A vector is an n×1 matrix. In other words, a vector has only one column.

For example, see this vector.

A vector with n elements is also called an n-dimensional vector; this one has 4 elements, so it is 4-dimensional.

y_i = i^th element. For example, y₂ = 232.

By convention, uppercase letters refer to matrices, e.g. A, B, X.

Lowercase letters refer to vectors, e.g. a, b, y.

2. Addition and scalar multiplication

Adding matrices just means adding up the corresponding entries. See this example.

Addition is legal only if every entry has a counterpart in the other matrix. In other words, only matrices of the same dimension can be added.
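Here is a minimal sketch of that rule in plain Python, assuming matrices are stored as lists of rows. The function name `add` and the numbers are made up for illustration.

```python
def add(A, B):
    """Add two matrices (lists of rows) entry by entry."""
    same_shape = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_shape:
        raise ValueError("matrices must have the same dimension")
    return [
        [a + b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(A, B)
    ]


# Two made-up 3x2 matrices.
A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 1], [2, 5], [0, 1]]
total = add(A, B)
print(total)  # [[5, 1], [4, 10], [3, 2]]

# A 3x2 matrix cannot be added to a 2x2 matrix.
try:
    add(A, [[1, 2], [3, 4]])
except ValueError as err:
    print("illegal addition:", err)
```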

Scalar multiplication works just as it does for real numbers: multiply every entry of the matrix by the scalar. See this example.
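A small sketch of scalar multiplication, again with matrices as lists of rows. The name `scale` and the numbers are assumptions for illustration.

```python
def scale(c, A):
    """Multiply every entry of matrix A by the scalar c."""
    return [[c * a for a in row] for row in A]


A = [[1, 0], [2, 5], [3, 1]]
tripled = scale(3, A)
print(tripled)  # [[3, 0], [6, 15], [9, 3]]

# Dividing by 4 is the same as multiplying by 1/4.
B = [[4, 0], [6, 3]]
quartered = scale(1 / 4, B)
print(quartered)  # [[1.0, 0.0], [1.5, 0.75]]
```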

3. Matrix-vector multiplication

First let’s see an example in action.

What’s happening? Let’s break it down.

Here is the rule: to get y_i, multiply the entries of A’s i^th row with the corresponding elements of vector x, and add them up.

Matrix-vector multiplication is legal only if the number of columns of A matches the number of elements, i.e. rows, of the vector. The resulting vector has as many rows as A.

A picture should clear things up.
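The rule and the dimension check can also be sketched in a few lines of Python. The function name `mat_vec` and the numbers are made up here; this is just one way to write it.

```python
def mat_vec(A, x):
    """Multiply an m x n matrix A (list of rows) by an n-element vector x."""
    for row in A:
        if len(row) != len(x):
            raise ValueError("columns of A must match elements of x")
    # y_i = A_i1 * x_1 + A_i2 * x_2 + ... + A_in * x_n
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[1, 3], [4, 0], [2, 1]]  # 3 x 2
x = [1, 5]                    # 2 elements
y = mat_vec(A, x)
print(y)  # [16, 4, 7]
```

The result has 3 elements, one per row of A.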

Why do we want to do this matrix-vector multiplication? Well, it makes life easier when operating on lots of data. Suppose we have a function f(x) = -40 + 0.25x, which predicts the price of a house from its size x. Now we have many houses, and we want to use this function to estimate each price. What shall we do? We could write a for-loop. It’s straightforward. However, a nicer way is to build a matrix with one row (1, x) per house and multiply it by the vector of coefficients (-40, 0.25).

With matrix-vector multiplication, the code is simpler, and with an optimized linear algebra library it usually runs faster too. The benefit is more evident when operating on a large quantity of data, say a million houses.
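To make the construction concrete, here is a sketch that computes the prices both ways. The house sizes are made up, and `mat_vec` is a hand-written helper so the arithmetic stays visible; in practice you would hand the multiplication to a linear algebra library, which is where the speed comes from.

```python
def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


sizes = [2104, 1416, 1534, 852]  # made-up house sizes

# Version 1: a plain for-loop over f(x) = -40 + 0.25x.
loop_prices = []
for size in sizes:
    loop_prices.append(-40 + 0.25 * size)

# Version 2: one row (1, size) per house, times the coefficients.
X = [[1, size] for size in sizes]  # 4 x 2 matrix
coefficients = [-40, 0.25]         # 2-element vector
prices = mat_vec(X, coefficients)

print(prices)                 # [486.0, 314.0, 343.5, 173.0]
print(prices == loop_prices)  # True
```

The column of 1s is what lets the constant term -40 ride along in the same multiplication.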

4. Matrix-matrix multiplication

This is an extension of matrix-vector multiplication. See an example below.

It should be clear now. The rule is: the i^th column of the matrix C is obtained by multiplying A with the i^th column of B. In short, we’re just breaking B into its column vectors and applying matrix-vector multiplication to each.
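The column-by-column rule translates almost word for word into Python. This is only a sketch; the names `mat_vec`, `column` and `mat_mul` and the numbers are made up for illustration.

```python
def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def column(B, j):
    """Return the j-th column of B (0-based) as a vector."""
    return [row[j] for row in B]


def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o), one column of B at a time."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    # Each column of C is A times the matching column of B.
    result_columns = [mat_vec(A, column(B, j)) for j in range(len(B[0]))]
    # Turn the list of columns back into a list of rows.
    return [list(row) for row in zip(*result_columns)]


A = [[1, 3], [2, 5]]
B = [[0, 1], [3, 2]]
C = mat_mul(A, B)
print(C)  # [[9, 7], [15, 12]]
```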

Same question: why do we want to do it? Let’s continue with our house price function. Now instead of one function, we have three, each with slightly different coefficients. We want to see the results of all three and compare them with one another. Coding this directly is painful. Matrix-matrix multiplication saves the day: put each function’s coefficients in its own column of a matrix, and a single multiplication gives all the predictions.

With an optimized linear algebra library, this runs blazingly fast.
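Here is a sketch of the three-function comparison. Only f(x) = -40 + 0.25x comes from the text; the other two functions, g(x) = 200 + 0.125x and h(x) = -150 + 0.5x, and the house sizes are made up for illustration. `mat_mul` is a hand-written helper, not a library call.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o); both are lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]


sizes = [2104, 1416, 1534, 852]    # made-up house sizes
X = [[1, size] for size in sizes]  # 4 x 2: one row per house

# 2 x 3: one column per function (constant on top, slope below).
# Column 1 is f from the text; columns 2 and 3 are hypothetical.
coefficients = [
    [-40, 200, -150],
    [0.25, 0.125, 0.5],
]

predictions = mat_mul(X, coefficients)  # 4 x 3
for size, row in zip(sizes, predictions):
    print(size, row)
# 2104 [486.0, 463.0, 902.0]
# 1416 [314.0, 377.0, 558.0]
# 1534 [343.5, 391.75, 617.0]
# 852 [173.0, 306.5, 276.0]
```

Each row of the result is one house, and each column is one function, so the three predictions sit side by side for comparison.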

5. Matrix multiplication properties

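Matrix multiplication is not commutative, i.e. in general A×B ≠ B×A.

This should be clear if you consider dimensions: if A is m×n and B is n×m, then A×B is m×m while B×A is n×n. Even when both products have the same dimension, their entries usually differ.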
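A quick check with made-up numbers shows both effects: two square matrices whose products differ, and two rectangular ones whose products don’t even have the same dimension. The helper `mat_mul` is a hand-written sketch.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o); both are lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]


# Same dimension, different entries.
A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # [[2, 0], [0, 0]]
print(BA)  # [[0, 0], [2, 2]]

# Different dimensions altogether.
P = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
Q = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
PQ = mat_mul(P, Q)
QP = mat_mul(Q, P)
print(len(PQ), "x", len(PQ[0]))  # 2 x 2
print(len(QP), "x", len(QP[0]))  # 3 x 3
```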

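Matrix multiplication is associative, i.e. (A×B)×C = A×(B×C).

Try thinking about it in terms of dimensions. Let’s say A is m×n, B is n×o, and C is o×p. If we first multiply A and B we get an m×o matrix, and multiplying that by C gives an m×p matrix. If we first multiply B and C we get an n×p matrix, and multiplying A by that gives an m×p matrix. Either way the result has the same dimension. The dimension check alone doesn’t prove that the entries agree, but they do.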
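A numeric spot check with made-up matrices (2×3, 3×2 and 2×1) shows both groupings landing on the same result. One example is not a proof, of course; `mat_mul` is a hand-written sketch.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o); both are lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]


A = [[1, 2, 0], [0, 1, 3]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 4]]  # 3 x 2
C = [[2], [1]]                # 2 x 1

left = mat_mul(mat_mul(A, B), C)   # (A x B) x C
right = mat_mul(A, mat_mul(B, C))  # A x (B x C)
print(left)   # [[12], [17]]
print(right)  # [[12], [17]]
```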

Now about the identity matrix, which is a special matrix.

For real numbers, x*1 = 1*x = x. 1 is the identity.

If we borrow the concept for matrices, we want a matrix I that satisfies A*I = I*A = A.

This can actually happen, if you match the dimensions correctly: for an m×n matrix A, the I on the right must be n×n and the I on the left must be m×m.

    1 0 0
    0 1 0
    0 0 1

This is a 3×3 identity matrix. Try it with any legal multiplication and see the result. The key is 1s along the diagonal from upper left to lower right, and 0s everywhere else.

We usually denote the n×n identity matrix by I_n×n.
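To try it out, here is a small sketch that builds an identity matrix of any size and multiplies a made-up 2×3 matrix by it on both sides. The names `identity` and `mat_mul` are hypothetical helpers written for this example.

```python
def mat_mul(A, B):
    """Multiply A (m x n) by B (n x o); both are lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]


def identity(n):
    """Build the n x n identity: 1 on the diagonal, 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A = [[1, 2, 3], [4, 5, 6]]  # 2 x 3
right = mat_mul(A, identity(3))  # A times the 3 x 3 identity
left = mat_mul(identity(2), A)   # the 2 x 2 identity times A
print(right == A)  # True
print(left == A)   # True
```

Because A is not square, the identity on the right is 3×3 and the one on the left is 2×2.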