Progress Log 108 (R): Create and Name Matrices
I’m taking an edX course entitled “Introduction to R for Data Science” and all of the concepts described below come from that course.
- Vector: 1D array of data elements.
- Matrix: 2D array of data elements.
- Rows and columns.
- Can only contain one atomic vector type.
Create a matrix
To build a matrix, you use the matrix() function. You can choose to specify the number of rows or the number of columns. For example, passing a vector with the values 1 to 6 and setting the nrow argument to 2 creates a 2 by 3 matrix.
R sees that the input vector has length 6 and that there have to be 2 rows. It then infers that you’ll probably want 3 columns, such that the number of matrix elements matches the number of input vector elements.
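The call described above can be sketched as follows. The variable name m_by_rows is only an illustrative choice, not something taken from the course.

```r
# Build a 2-by-3 matrix from the values 1 to 6 by fixing the number of rows.
# (m_by_rows is a hypothetical name used for illustration.)
m_by_rows <- matrix(1:6, nrow = 2)
print(m_by_rows)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# R inferred the number of columns from the vector length.
dim(m_by_rows)  # 2 3
```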
You could just as well specify ncol instead of nrow. In this case, R infers the number of rows automatically.
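A minimal sketch of the same matrix built by fixing the number of columns instead (m_by_cols is a hypothetical name):

```r
# Same values, but this time the number of columns is given
# and R works out that 2 rows are needed.
m_by_cols <- matrix(1:6, ncol = 3)
print(m_by_cols)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
```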
In both cases, R takes the vector containing the values 1 to 6 and fills the matrix column by column.
If you prefer to fill up the matrix in a row-wise fashion, such that 1, 2, and 3 are in the first row, you can set the byrow argument of matrix() to TRUE.
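A short sketch of row-wise filling, assuming the same values 1 to 6 and 2 rows as before (m_rowwise is a hypothetical name):

```r
# byrow = TRUE fills the matrix one row at a time instead of column by column.
m_rowwise <- matrix(1:6, nrow = 2, byrow = TRUE)
print(m_rowwise)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
```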
Create a matrix: recycling
Remember how R did recycling when you were subsetting vectors using logical vectors? The same thing happens when you pass the matrix function a vector that is too short to fill up the entire matrix.
Suppose you pass a vector containing the values 1 to 3 to the matrix function and you explicitly say you want a matrix with 2 rows and 3 columns. R fills up the matrix column by column and simply repeats the vector.
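The recycling case described above might look like this (m_recycled is a hypothetical name):

```r
# A 3-element vector is recycled to fill a 2-by-3 matrix (6 elements).
# The fill order is column by column: 1, 2, 3, 1, 2, 3.
m_recycled <- matrix(1:3, nrow = 2, ncol = 3)
print(m_recycled)
#      [,1] [,2] [,3]
# [1,]    1    3    2
# [2,]    2    1    3
```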
If you try to fill up the matrix with a vector whose length does not fit evenly into the matrix, for example, when you want to put a 4-element vector in a 6-element matrix, R generates a warning message.
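As a rough illustration of the warning case, the call below asks for a 2 by 3 matrix from a 4-element vector. R still returns a matrix by recycling as far as it can, but it emits a warning while doing so. The name m_partial is hypothetical.

```r
# 4 values do not fit evenly into 6 cells, so this call emits a warning.
# The matrix is still filled column by column: 1, 2, 3, 4, 1, 2.
m_partial <- matrix(1:4, nrow = 2, ncol = 3)
print(m_partial)
#      [,1] [,2] [,3]
# [1,]    1    3    1
# [2,]    2    4    2
```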
Apart from the matrix() function, there is another easy way to create matrices that is more intuitive in some cases. You can paste vectors together using the cbind() and rbind() functions.
cbind(), short for column bind, takes the vectors you pass to it and sticks them together as if they were columns of a matrix.
rbind(), short for row bind, does the same thing but takes the input as rows and makes a matrix out of them.
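A small sketch of both bind functions applied to the same pair of vectors. The input 1:3 and the names by_columns and by_rows are illustrative assumptions.

```r
# cbind() treats each vector as a column: the result is 3 rows by 2 columns.
by_columns <- cbind(1:3, 1:3)
print(by_columns)
#      [,1] [,2]
# [1,]    1    1
# [2,]    2    2
# [3,]    3    3

# rbind() treats each vector as a row: the result is 2 rows by 3 columns.
by_rows <- rbind(1:3, 1:3)
print(by_rows)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    1    2    3
```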
These bind functions can also handle matrices, so you can use them to paste another row or another column to an existing matrix.
Suppose you have a matrix m containing the elements 1 to 6. If you want to add another row to it, containing the values 7, 8, 9, you could simply call rbind() with m and the new vector.
You can do a similar thing with cbind() to add a column.
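Here is one way this could look. The text does not say how m was filled, so the row-wise 2 by 3 layout below is an assumption, as are the new column values 10 and 11 and the names m_with_row and m_with_col.

```r
# Assumed starting point: a 2-by-3 matrix holding 1 to 6, filled row-wise.
m <- matrix(1:6, byrow = TRUE, nrow = 2)

# Add a third row containing 7, 8, 9.
m_with_row <- rbind(m, 7:9)
print(m_with_row)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

# Add a fourth column; the new vector needs one value per existing row.
m_with_col <- cbind(m, c(10, 11))
print(m_with_col)
#      [,1] [,2] [,3] [,4]
# [1,]    1    2    3   10
# [2,]    4    5    6   11
```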
Naming a matrix:
In the case of naming vectors, you can simply use the names() function, but in the case of matrices, you can assign names to both columns and rows. That is why R provides the rownames() and colnames() functions.
Retaking the matrix m from before, we can set the row names just the same way as we named vectors, but this time with the rownames() function. Setting the column names with colnames() and a vector of length 3 gives us a fully named matrix.
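A minimal sketch of naming after creation. The matrix layout and the labels "row1", "col1", and so on are placeholders chosen for illustration.

```r
# Assumed 2-by-3 matrix, as in the earlier examples.
m <- matrix(1:6, byrow = TRUE, nrow = 2)

# One name per row, one name per column (placeholder labels).
rownames(m) <- c("row1", "row2")
colnames(m) <- c("col1", "col2", "col3")
print(m)
#      col1 col2 col3
# row1    1    2    3
# row2    4    5    6

# Names can then be used for subsetting.
m["row2", "col3"]  # 6
```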
Naming a matrix
Just as with vectors, there are also one-liner ways of naming a matrix while you’re building it. You can use the dimnames argument of the matrix function for this.
You will need to specify a list which has a vector of row names as the first element and column names as the second element.
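The one-liner form might be sketched like this, again with placeholder labels and a hypothetical variable name m_named:

```r
# dimnames takes a list: row names first, column names second.
m_named <- matrix(1:6, byrow = TRUE, nrow = 2,
                  dimnames = list(c("row1", "row2"),
                                  c("col1", "col2", "col3")))
print(m_named)
#      col1 col2 col3
# row1    1    2    3
# row2    4    5    6
```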
Matrices are just an extension of vectors. This means that they can only contain a single atomic vector type. If you try to store different types in a matrix, coercion automatically takes place.
Consider two matrices, one containing numerics, the other one containing characters.
Let’s now try to bind these two matrices together in a column-wise fashion using cbind().
The numeric matrix elements are coerced to characters, so you end up with a matrix that is composed only of characters.
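To make the coercion concrete, here is a small sketch with two 2 by 2 matrices. The contents and the names num, char, and combined are illustrative assumptions.

```r
# One numeric matrix and one character matrix (illustrative values).
num  <- matrix(1:4, ncol = 2)
char <- matrix(c("a", "b", "c", "d"), ncol = 2)

# Binding them column-wise forces a single type: everything becomes character.
combined <- cbind(num, char)
print(combined)
#      [,1] [,2] [,3] [,4]
# [1,] "1"  "3"  "a"  "c"
# [2,] "2"  "4"  "b"  "d"

is.character(combined)  # TRUE
```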
To have a multi-dimensional data structure that contains elements of different types, you’ll have to use lists or, more specifically, data frames.
In R, the function rowSums() conveniently calculates the totals for each row of a matrix. This function creates a new vector:
sum_of_rows_vector <- rowSums(my_matrix)
R also has the function colSums(), which calculates the totals for each column of a matrix:
sum_of_cols_vector <- colSums(my_matrix)
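The two lines above assume that my_matrix already exists. A self-contained sketch, using an assumed 2 by 3 matrix with the values 1 to 6, could look like this:

```r
# Assumed example data: the values 1 to 6, filled row-wise.
my_matrix <- matrix(1:6, nrow = 2, byrow = TRUE)

# One total per row: 1 + 2 + 3 and 4 + 5 + 6.
sum_of_rows_vector <- rowSums(my_matrix)
print(sum_of_rows_vector)  # 6 15

# One total per column: 1 + 4, 2 + 5, 3 + 6.
sum_of_cols_vector <- colSums(my_matrix)
print(sum_of_cols_vector)  # 5 7 9
```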