Two wrongs make a right
Students make a lot of mistakes when doing their maths, but sometimes they will make two mistakes in such a way that their final answer is still correct. This happened last week with one student quite spectacularly, because his doubly wrong method of doing a particular problem always produces the correct answer.
Let me explain: the Maths 1A students are currently learning about vectors in R^n, and one of their assignment questions gives them several lists of vectors and asks them to decide if each list is linearly independent or linearly dependent.
What they are supposed to do (based on what they are shown in the lectures) is put the vectors into a matrix as columns, then do row operations until they can tell where the pivots will be. If every column has a pivot, then the vectors are linearly independent; if there is a column with no pivot, then the vectors are linearly dependent. Check out this example to see how it works (it has a little extra for later on).
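If you want to play with the procedure yourself, here is a rough sketch of it in Python using sympy (my own illustration, not the worked example from the lectures; the vectors are made up):

```python
# Sketch of the columns-and-pivots method, using sympy's rref.
from sympy import Matrix

v1 = [1, 2, 3]
v2 = [4, 5, 6]
v3 = [5, 7, 9]          # v3 = v1 + v2, so this list is linearly dependent

# Put the vectors in as COLUMNS: one column per vector.
A = Matrix([v1, v2, v3]).T

reduced, pivot_cols = A.rref()
print(reduced)          # the reduced matrix
print(pivot_cols)       # (0, 1): only two of the three columns have pivots

if len(pivot_cols) == A.cols:
    print("every column has a pivot: linearly independent")
else:
    print("some column has no pivot: linearly dependent")
```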
The reason for this method goes right back to the definition of linear independence: the vectors v1, ..., vr are linearly independent when the equation x1v1 + ... + xrvr = 0 has only the trivial solution x1 = 0, ..., xr = 0. If there are any solutions where any of the xi's are not zero, then they are linearly dependent.
If you write out the vectors in full using their coordinates, then the first coordinates give you one linear equation in the xi's, and each successive coordinate produces another separate equation. So solving the vector equation is the same as solving n linear equations. And the students know how to do this: you put the coefficients of the equations into a matrix with one row per equation and do row operations. The final matrix represents a reduced set of equations, and if you can get to a stage where each row has a 1 in one spot and 0's in the others, then you've got x1 = 0, ..., xr = 0. This would mean your original vectors had to be independent. If any one variable doesn't have a pivot in its column, then you can let it be any value at all and still find a solution. This is called a free variable. Since it can be anything, it could be something other than zero, and so your vectors will be linearly dependent.
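For example (with the same made-up vectors as above), asking whether (1,2,3), (4,5,6) and (5,7,9) are independent means solving x1(1,2,3) + x2(4,5,6) + x3(5,7,9) = (0,0,0). The first coordinates give the equation x1 + 4x2 + 5x3 = 0, the second coordinates give 2x1 + 5x2 + 7x3 = 0, and the third give 3x1 + 6x2 + 9x3 = 0. Writing these three equations as the rows of a matrix puts the coordinates of each vector down a column, which is exactly the vectors-as-columns matrix from before.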
So this is what the student was supposed to do. However, what he actually did was put the vectors into a matrix as rows, do row operations, and then look to see if there was a row of zeros. If he got a row of zeros, then he said the vectors were linearly dependent, and if he didn't, then he said the vectors were linearly independent.
His first mistake was to put the vectors in as rows. I am often repeating the mantra "vectors are columns, equations are rows" to students to remind them that for most situations, vectors really ought to be columns. We usually multiply matrices on the left of a vector (Ax), which only makes sense if the vector is a column, and this arrangement also corresponds exactly to taking a specified linear combination of the columns of A. Finally, in this specific situation, the matching coordinates of the vectors together make an equation, and equations are definitely rows, so the matching coordinates have to line up in rows. That is, each vector's coordinates have to run down a column (if we are to be solving equations, anyway).
His second and much more fundamental mistake was to think that a row of zeros meant there had to be a free variable. (I know he was thinking this because he actually said it to me.) This is wrong because if a matrix does represent a set of equations, then whether you get rows of zeros at the end is actually not strongly related to whether there are free variables. Firstly, a row of zeros with a nonzero number in the answer position (the right-hand side) indicates no solution at all, regardless of what the rest of the matrix is doing. Secondly, if there's a pivot in every column, then there will be a unique solution regardless of how many rows of zeros there are at the bottom. Whether you get infinitely many solutions is all about the pivots in the columns, not about the zeros in the rows!
It's not surprising that the student had this misconception because many high school students only ever see square matrices, and a row of zeros in a square matrix prevents there being a pivot in one of the columns and so does in fact indicate a free variable. But it doesn't apply for non-square matrices, especially ones with more rows than columns.
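Here is a quick sympy check of that point (again with a made-up system): three equations in two unknowns, where the reduced matrix has a row of zeros even though there are no free variables at all.

```python
# More equations than unknowns: a zero row appears in the reduced form,
# yet every column has a pivot, so there are no free variables and the
# only solution is the trivial one.
from sympy import Matrix

# Coefficient matrix of x + 2y = 0, 3x + 4y = 0, 4x + 6y = 0
A = Matrix([[1, 2],
            [3, 4],
            [4, 6]])

reduced, pivot_cols = A.rref()
print(reduced)       # the last row is all zeros
print(pivot_cols)    # (0, 1): both columns have pivots, so no free variables
```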
I explained all of this to him and he was happy with what I said. But then he frowned and asked, "Then why did I still get the answers right?" Why indeed?
If my student had put his vectors in as columns and looked for zero rows, he would have gotten his answers wrong. This is because with the vectors as columns, the rows are equations and the aim is to solve the equations and look for free variables. Since zero rows do not always force free variables, he would be looking for the wrong thing and would have been wrong a lot of the time.
If my student had put his vectors in as rows and looked for columns without pivots, he would also have gotten his answers wrong. This is because pivots are supposed to refer to variables in equations, whereas this matrix wouldn't actually represent equations at all, let alone ones related to the definition of linear independence.
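To see both of these failures concretely, take two genuinely independent vectors like (1,0,2) and (0,1,3) (my own made-up example): each of those mixed-up combinations would wrongly report them as dependent.

```python
# Two linearly independent vectors in R^3.
from sympy import Matrix

v1 = [1, 0, 2]
v2 = [0, 1, 3]

# Wrong combination 1: vectors as columns, then look for rows of zeros.
cols = Matrix([v1, v2]).T     # a 3x2 matrix
print(cols.rref()[0])         # has a zero row, yet the vectors ARE independent

# Wrong combination 2: vectors as rows, then look for columns without pivots.
rows = Matrix([v1, v2])       # a 2x3 matrix
print(rows.rref()[1])         # (0, 1): only 2 pivots for 3 columns, yet still independent
```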
However, he did neither of these combinations and instead put the vectors in as rows and looked for rows of zeros, and he got his answer right every time! The key to resolving this paradox is to figure out what the row operations represent when the rows of your matrix are vectors rather than equations.
Row operations basically perform linear combinations of rows. So when you do a sequence of row operations, any row produced along the way is a linear combination of the original rows. So if you are able to produce a row of zeros, then some linear combination of the original vectors really does make the zero vector! And it can't be the trivial combination either, because every row operation is reversible, so the combination recorded for each row of the final matrix always has at least one nonzero coefficient. Hence the original vectors must have been linearly dependent. Amazing, huh?
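Here is the student's method in the same sympy sketch (same made-up vectors as earlier): vectors in as rows, row reduce, look for a row of zeros.

```python
# The student's method: vectors as ROWS, row reduce, look for a row of zeros.
from sympy import Matrix

v1 = [1, 2, 3]
v2 = [4, 5, 6]
v3 = [5, 7, 9]          # v3 = v1 + v2

A = Matrix([v1, v2, v3])            # one row per vector
reduced = A.rref(pivots=False)
print(reduced)                      # the last row is all zeros

zero_row = any(all(entry == 0 for entry in reduced.row(i)) for i in range(reduced.rows))
print("linearly dependent" if zero_row else "linearly independent")
```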
An important question arises: what if you wanted to know what the actual linear combination was? With the vectors-as-columns approach, you are literally solving to find the xi's, and so at the end if you pick a nonzero value for the free variables, you can find the rest of them and there you have it!
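In the sympy sketch, that solving step is exactly a nullspace computation: each basis vector of the nullspace is a nonzero choice of the xi's (again, just my own way of showing it).

```python
# With the vectors as columns, any nonzero solution of Ax = 0 gives the
# coefficients of a dependence relation directly.
from sympy import Matrix

A = Matrix([[1, 4, 5],
            [2, 5, 7],
            [3, 6, 9]])      # columns are (1,2,3), (4,5,6), (5,7,9)

for x in A.nullspace():      # one basis vector per free variable
    print(x.T)               # (-1, -1, 1): -1*(1,2,3) - 1*(4,5,6) + 1*(5,7,9) = (0,0,0)
```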
With the vectors-as-rows approach, the linear combinations you do are recorded in the row operations. We need a way to keep track of the linear combinations / row operations we have done. One way to do this is to reason that if the original vectors were the standard basis, then whatever final vector we got would tell us what linear combination we had done (for example, (1,3,-2) = 1(1,0,0) + 3(0,1,0) - 2(0,0,1)). So why don't we do the same operations on the standard basis as we do on the original matrix?
What this means is that if you place an identity matrix next to your original matrix and do the same row operations on both, then whatever vector is next to the row of zeros when it happens will be the linear combination of the original rows that produced the zero vector! Check out the same example as before but by this new method.
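Here is that trick in the same sympy sketch: glue an identity matrix onto the right of the vectors-as-rows matrix, row reduce the whole thing, and read the combination off any row whose left part has gone to zero. (The full rref may rescale the recorded combination along the way, but it is still a genuine dependence relation.)

```python
# Track the row operations by carrying an identity matrix alongside.
from sympy import Matrix, eye

v1 = [1, 2, 3]
v2 = [4, 5, 6]
v3 = [5, 7, 9]                          # v3 = v1 + v2

A = Matrix([v1, v2, v3])                # vectors as rows
aug = Matrix.hstack(A, eye(A.rows))     # the block matrix [ A | I ]
reduced = aug.rref(pivots=False)

n = A.cols
for i in range(reduced.rows):
    if all(entry == 0 for entry in reduced[i, :n]):
        print(reduced[i, n:])           # the combination (1, 1, -1): v1 + v2 - v3 = (0,0,0)
```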
Isn't it amazing the things you can learn by being wrong first!
This comment was left on the original blog post:
Stephen Wade 20 May 2014:
It’s funny how convention comes into play and that you could prove that either approach works. I had to check quickly, and I think this is right: if you put vectors as rows in an m x n matrix A, then dim ker A^t > 0 means you have linear dependence. Using the rank theorem, dim ker A^t = m – dim col A = number of rows – number of rows with pivots = number of rows of zeros in the rref of A. So if the number of rows of zeros in the rref of A > 0, you should have linear dependence. Cool 🙂