Progress Log 109 (R): Matrix Subsetting
I’m taking an edX course entitled “Introduction to R for Data Science” and all of the concepts described below come from that course.
Just as for vectors, there are situations in which you want to select single elements or entire parts of a matrix to continue your analysis with.
Have a look at this matrix containing some random numbers:
If you want to select a single element from this matrix, you’ll have to specify both the row and column of the element of interest.
Suppose we want to select the number 3, located in the first row and third column. We type
m[1, 3]. The first index refers to the row, the second one refers to the column.
Likewise, to select the number 15, at row 3 and column 2, we write m[3, 2].
Notice that the results are single values, so vectors of length 1.
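To make the two calls above concrete, here is a minimal sketch. The matrix is a stand-in: the values quoted in this log (8, 5, 1, 12, 15, 3, 9, 14) sit where the text places them, while the 3 x 4 shape and the remaining entries (11, 6, 7, 2) are my own assumptions.

```r
# Stand-in matrix, filled column by column (R's default).
# Assumed layout:
#      [,1] [,2] [,3] [,4]
# [1,]    8    1    3    6
# [2,]    5   12    9    7
# [3,]   11   15   14    2
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)

first  <- m[1, 3]  # row 1, column 3 -> 3
second <- m[3, 2]  # row 3, column 2 -> 15

print(first)
print(second)
print(length(first))  # 1: a single value is a vector of length 1
```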
Subset column or row
What if you want to select an entire row or column from this matrix? You can do this by leaving out one of the indices between the square brackets. To select a row, leave out the second index (keep the comma). To select a column, leave out the first index.
Notice here the result is not a matrix anymore. It’s a vector, but this time one that contains more than one element.
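A short sketch of row and column selection, using the same stand-in matrix as an assumption (only some of its values come from this log; the rest are made up for illustration):

```r
# Stand-in 3 x 4 matrix; shape and some entries are assumptions.
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)

row3 <- m[3, ]  # entire third row    -> 11 15 14 2
col3 <- m[, 3]  # entire third column -> 3 9 14

print(row3)
print(col3)
print(is.matrix(row3))  # FALSE: the result is a plain vector
```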
The third and fourth examples show what happens if you leave out the comma that separates the row and column indices. When you pass a single index to subset a matrix, R simply goes through the matrix column by column, from left to right. The first element is then 8, and the second is 5. This means that if you pass
m[4], you should get 1, and if you pass
m[9], you should get 14.
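Here is a sketch of single-index subsetting. It assumes the matrix has exactly 3 rows, so that the second column starts at position 4. The stand-in entries 11, 6, 7 and 2 are not from the original matrix.

```r
# Stand-in 3 x 4 matrix; shape and some entries are assumptions.
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)

# Without a comma, R walks down column 1, then column 2, and so on.
first_two <- m[1:2]  # 8 5
fourth    <- m[4]    # top of column 2 -> 1
ninth     <- m[9]    # bottom of column 3 -> 14

print(first_two)
print(fourth)
print(ninth)
```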
Subset multiple elements
Like in vector subsetting, you can also select multiple elements in matrix subsetting.
For the first example, you want to select 12 and 9, which are in the same row but in different columns. The result is a vector, because one dimension suffices.
However, with this syntax you cannot pick out elements that do not share a row or a column. In the second example, we are trying to select just 1 and 9. The call we used did not produce the desired result. Instead, R returns the submatrix that spans rows 1 and 2 and columns 2 and 3.
In the third example, we are creating a submatrix that contains the elements in rows 1 and 3 and in columns 1, 3 and 4.
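The three cases can be sketched as follows. The calls are my reconstruction from the descriptions above, and the matrix is a stand-in: I am assuming 12 sits in row 2, column 2, and the entries 11, 6, 7 and 2 are invented.

```r
# Stand-in 3 x 4 matrix; shape and some entries are assumptions.
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)

same_row <- m[2, c(2, 3)]           # 12 and 9 as a vector
block    <- m[c(1, 2), c(2, 3)]     # 2 x 2 submatrix, not just 1 and 9
sub      <- m[c(1, 3), c(1, 3, 4)]  # rows 1 and 3, columns 1, 3 and 4

print(same_row)
print(block)
print(sub)
```

The second call returns all four elements where the chosen rows and columns cross (1, 3, 12 and 9 here), which is why it cannot isolate 1 and 9 on their own.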
Subset by name
Remember that we can subset vectors by name and with logicals? The same applies to matrices. Let’s look at subsetting by name first.
In the first example, I am showing subsetting by index for comparative purposes.
In the second example, I replaced the indices with the names of the row and column where the
9 is found. It is in row
r2 and column
In the third example, I used a combination of row index and name of column to get the same answer as the first two examples.
Finally, I selected the row index and then selected the columns
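A sketch of the four cases. The row name r2 comes from the text above, but the other row names, the column names a to d, and the columns picked in the last call are assumptions of mine, as are some entries of the stand-in matrix.

```r
# Stand-in 3 x 4 matrix; shape and some entries are assumptions.
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)
rownames(m) <- c("r1", "r2", "r3")    # "r1" and "r3" are assumed
colnames(m) <- c("a", "b", "c", "d")  # column names are assumed

by_index <- m[2, 3]            # 9
by_name  <- m["r2", "c"]       # 9
mixed    <- m[2, "c"]          # 9
last_two <- m[3, c("c", "d")]  # named vector: c = 14, d = 2

print(by_index)
print(by_name)
print(mixed)
print(last_two)
```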
Subset with logical vector
Finally, you can also use logical vectors for subsetting matrices. Again, the same rules apply: rows and columns corresponding to a
TRUE are kept, while those corresponding to a
FALSE are left out. To select the same elements as the previous call, you can use the first example:
The second example shows what happens when you pass a vector of length 2 to select columns: the column selection vector gets recycled to
FALSE, TRUE, FALSE, TRUE. The third example is there for comparison.
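A sketch of logical subsetting and recycling. The row selection (third row only) and the "previous call" it mirrors are assumptions, and the matrix is a stand-in with some invented entries.

```r
# Stand-in 3 x 4 matrix; shape and some entries are assumptions.
m <- matrix(c(8, 5, 11, 1, 12, 15, 3, 9, 14, 6, 7, 2), nrow = 3)

# Keep row 3 and columns 3 and 4 (same elements as m[3, c(3, 4)]).
full <- m[c(FALSE, FALSE, TRUE), c(FALSE, FALSE, TRUE, TRUE)]  # 14 2

# A length-2 column vector is recycled to FALSE, TRUE, FALSE, TRUE.
recycled <- m[c(FALSE, FALSE, TRUE), c(FALSE, TRUE)]           # 15 2

# Writing the recycled vector out in full gives the same result.
spelled_out <- m[c(FALSE, FALSE, TRUE), c(FALSE, TRUE, FALSE, TRUE)]

print(full)
print(recycled)
print(identical(recycled, spelled_out))  # TRUE
```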