Research Ideas

Homework 5, Problem 5: Gaussian Elimination
[15 extra points; individual or pair]

Submission: Submit your hw5pr5.py file to Canvas

In this problem you will write a series of functions which, taken together, will implement a linear-equation solver based on a technique called Gaussian Elimination. Gaussian elimination is another name for the process of using row operations in order to bring a matrix to reduced-row-echelon form.

It turns out that this is simply a natural end result from writing a few short functions -- the row operations -- that alter 2d arrays.

Your main function, gauss(), should contain an array named A. Upon startup, the array A should have the following 3x4 default value:

A = [ [2.0,4.0,5.0,34.0], [2.0,7.0,4.0,36.0], [3.0,5.0,4.0,35.0] ]

Please use this default value -- it will be convenient for testing. In addition, one of the menu options will allow the user to change this matrix.

Then, the program should offer the user the following menu:

(1) Enter the size and values of an array

(2) Print the array

(3) Multiply an array row by a constant

(4) Add one row into another

(5) Add a multiple of one row to another

(6) Solve!

(7) Invert! [This is extra...]

(9) Quit

Which choice would you like?

The bulk of this problem's work lies in implementing the different options on this menu. In doing so, you should implement the following functions:

gauss

gauss() should print out the above menu, handle the user's choice, and loop until the user quits.

enterValues

enterValues() should ask the user for a list of lists that represents a 2d matrix. It will be up to the user to input valid data (and you may assume that we will test your code only on valid inputs). By using input, this method becomes quite straightforward! Be sure to return the matrix that is entered. For example, here the user's input is in blue:

>>> enterValues()

>>> Please type/paste a 2d list of lists: [ [1,1,1,6], [0,2,-1,3], [4,0,10,42] ]

[ [1,1,1,6], [0,2,-1,3], [4,0,10,42] ]

printArray

printArray(A) should take in a 2d array A and print it out. The key line to print a single value neatly formatted is the following:

print "%8.3f " % (A[row][col]) ,

A bit of explanation: this is the formatted printing that was introduced in the relativistic physics-simulation problem a few weeks ago. The string at left is the "format string." It says to use 8 spaces total, 3 of which should be after the decimal point, and the f indicates that it expects a floating-point number. Fortunately, it will take an integer, as well, so there's no need to convert (yet). The % sign indicates that the values to be printed are to follow -- those values to be printed are always in a tuple, which is a read-only list that uses parentheses instead of square brackets. Then, the tuple of the current array element appears, followed by a comma, which tells Python not to go to the next line.

That's a lot of stuff in one line of code, but there's no need to worry about it all at once. However, because the above line prints only one element, it will need to be inside a nested for loop, in which the outer loop iterates over the rows and the inner loop iterates over the columns. You'll need a plain print at the appropriate place, in order to place each row on a separate line.

Printing all the time

It's useful to save yourself the trouble of entering menu option #2 each time you would like to see the array. So, in addition to having option #2, you might consider calling printArray just before printing the menu each time through your primary while loop.

multRow

multRow(A, r, m) should multiply all of the elements in row r of the input array A by the value m. It should not change anything else in the array, nor should it ask for inputs -- nor should it print anything. The prompts for input and all output printing should be in the main function, gauss. Also, note that we start counting at 0, with row 0 being the topmost row, row 1 being the next row, and so on. See the complete run, below, for a detailed example on input and output.

addMofRowSIntoRowD

addMofRowSIntoRowD(A, m, rs, rd) should add each element of row rs, multiplied by m, into row rd. As in the prior two helper functions, row rs should remain completely unchanged; row rd will most likely change, and this method should not change anything else, ask for inputs, or print anything.

solve

The following is a pretty extensive description of how solve should run. At the end of this description is a summary about how you might organize for loops to implement Gaussian Elimination.

solve(A) should use the previous row-operation methods in order to solve the system of equations represented by the 2d array. First, if the number of columns is not exactly one greater than the number of rows, this method should simply print a message and return.

However, when the number of columns is one greater than the number of rows, this method should diagonalize the left-hand side of the array so that every diagonal entry is 1 and every off-diagonal entry is 0. You should use only row operations (the row methods defined above) to do this.

Note that the right-hand column will not, in general consist of zeros and ones -- rather, it will hold the solution to the original set of equations. Here is a sample "before" and "after" array:

Before:

2.000 4.000 5.000 34.000

2.000 7.000 4.000 36.000

3.000 5.000 4.000 35.000

Or, for easy pasting into Python: [ [2.0, 4.0, 5.0, 34.0], [2.0, 7.0, 4.0, 36.0], [3.0, 5.0, 4.0, 35.0] ]

After calling solve:

1.000 0.000 0.000 3.000

0.000 1.000 0.000 2.000

-0.000 -0.000 1.000 4.000

Note the negative zeros -- this is not a problem at all.

To make things more concrete, here is a step-by-step example of how solve should work. Basically, solve uses the two methods multRow and addMofRowSIntoRowD repeatedly in order to diagonalize all but the right column of the array:

Suppose we have three linear equations in three unknowns (the variables x, y, and z):

2x + 4y + 5z = 34

2x + 7y + 4z = 36

3x + 5y + 4z = 35

Our goal is to do a series of operations such that the coefficients along the diagonal all end up 1, and all the other coefficients end up 0. When we have reached this configuration, we will be able to read off the values of the variables directly. Obviously, the operations we perform must be such that they do not disturb the meaning of the equations.

Throughout this example, we'll always show all the variables and coefficients, even when the coefficient is 0 or 1, just to make things clear. Also, in each step we'll draw a pointer in front of each equation that has changed since the previous step.

The first step is to divide the zeroth equation by 2, which is the coefficient of x in that equation. That will make the coefficient of x in the zeroth equation 1:

--> 1x + 2y + 2.5z = 17

2x + 7y + 4z = 36

3x + 5y + 4z = 35

Now, notice that if we add negative two times the new zeroth equation to the first equation, the coefficient of x in the first equation will become 0. To the same end we add negative three times the zeroth equation to the second equation.

1x + 2y + 2.5z = 17

--> 0x + 3y + -1z = 2

--> 0x + -1y + -3.5z = -16

Next we repeat the process, dividing the first equation by 3 so that its y coefficient becomes 1:

1x + 2y + 2.5z = 17

--> 0x + 1y + -0.33z = 0.66

0x + -1y + -3.5z = -16

Then we add negative two times the first row to the zeroth row, and we add one first row to the second row. This will get the coefficient of y to be 0 in the zeroth and second equations (notice that since we have zeroed out the coefficients of x in all but the first equation, we won't mess up the x coefficients by doing this):

--> 1x + 0y + 3.16z = 15.66

0x + 1y + -0.33z = 0.66

--> 0x + 0y + -3.83z = -15.33

Finally, we divide the second equation by -3.83 to get its z coefficient to be 1, and add -3.16 and 0.33 times the result from the zeroth and first equations, respectively, to obtain:

--> 1x + 0y + 0z = 3

--> 0x + 1y + 0z = 2

--> 0x + 0y + 1z = 4

Thus, the solution of these equations is:

x = 3

y = 2

z = 4

Now, the thing to notice is that the variables play no role in this process; only the coefficients matter! If we take the variables out, all we are doing to solve a system of 3 equations in 3 unknowns is manipulating a 3 by 4 array of coefficients.

Important Notes:

In general Gaussian Elimination can be tricky as the input may not be well behaved. In particular, you may find that you don't actually have n distinct equations because one may be a linear combination of some of the others. If this happens there is not enough information to solve the system completely, and Gaussian elimination will lead one of the rows of the array to be all zeroes. This in turn will likely lead to a division by zero error. You need not worry about this and related problems. Just implement the naive algorithm; we will only test it on well behaved input.
Similarly, you may have a well-behaved problem, but such that the rows need swapping (because one of the pivot values is 0). Again, we will not test data for which this happens.
You may find your results include the value -0 in some places where you expect the value 0. This is ok, and is the result of the finite precision of computer arithmetic.

What does all that mean about solve ?

In sum, you will want to run an outer loop over the columns and an inner loop over the rows:

for col in range(len(A)):

# do the appropriate thing here

for row in range(len(A)):

# do the appropriate thing here when row != col

Extra Credit (up to +5 points):

For optional extra credit of up to +5 points, add another option to the menu:

(7) Inverse!

and when that option is chosen, your program should:

Check if the array (matrix) is currently square. That is, if the number of rows equals the number of columns. If not, you should print a warning message and return to the menu -- only square matrices have inverses.
If the array is square, you should compute its inverse and replace the current array with its inverse. How? One way to do this is to create an array with the same number of rows and twice as many columns as described at either of these URLs:

o http://www.moshe-online.com/tutor/matrix/matrix_inverse.html provides a slightly simpler explanation if you're not familiar with matrices, inverses, etc.

o http://www.richland.cc.il.us/james/lecture/m116/matrices/inverses.html is a bit more complicated, but explains why the technique works.

You can then use the row operations for which you've already created methods in order to find the inverse. Finally, be sure to replace the original matrix with its inverse.

As with the solve method, you can assume that the matrix will have an inverse (it will not be singular or otherwise poorly behaved).

Example run of gauss()

You may want to check that your functions work on the examples within this run:

>>> gauss()

(1) Enter the values in the existing array

(2) Print the array

(3) Multiply an array row

(4) Add one row into another

(5) Add a multiple of one row to another

(6) Solve!

(7) Invert! [This is extra credit]

(9) Quit

Choose one: 2

The array is now

2.000 4.000 5.000 34.000

2.000 7.000 4.000 36.000

3.000 5.000 4.000 35.000

(1) Enter the values in the existing array

(2) Print the array

(3) Multiply an array row

(4) Add one row into another

(5) Add a multiple of one row to another

(6) Solve!

(7) Invert! [This is extra credit]

(9) Quit

Choose one: 3

Which row? 0

What multiple? 2

The array is now

4.000 8.000 10.000 68.000

2.000 7.000 4.000 36.000

3.000 5.000 4.000 35.000

[[ MENU ]]

Choose one: 4

Which is the source (unchanged) row? 2

Which is the destination (changed) row? 1

The array is now

4.000 8.000 10.000 68.000

5.000 12.000 8.000 71.000

3.000 5.000 4.000 35.000

[[ MENU ]]

Choose one: 5

Which is the source (unchanged) row? 1

What multiple of that row? 0.5

Which is the destination (changed) row? 2

The array is now

4.000 8.000 10.000 68.000

5.000 12.000 8.000 71.000

5.500 11.000 8.000 70.500

[[ MENU ]]

Choose one: 6

The array is now

1.000 0.000 0.000 3.000

0.000 1.000 0.000 2.000

-0.000 -0.000 1.000 4.000

[[ MENU ]]

Choose one: 1

Type or paste a 2d matrix: [ [ 3,5,0 ], [ 1,2,1 ], [ -1,1,7 ] ]

The array is now

3.000 5.000 0.000

1.000 2.000 1.000

-1.000 1.000 7.000

[[ MENU ]]

Choose one: 7

The array is now

-13.000 35.000 -5.000

8.000 -21.000 3.000

-3.000 8.000 -1.000

[[ MENU ]]

Choose one: 9

If you have gotten to this point, you have completed problem 5! You should submit your hw5pr5.py file at Canvas .

Links

Lab 5

Homework 5