ITERATIVE SOLUTION BY RELAXATION
We return to the equations (Q) for u on the grid:
{4 u(i,j) - u(i-1,j) - u (i+1,j) - u(i,j-1) - u(i,j+1)} / h**2 = f(i,j), (Q)
u(i,j) = g(i,j) on the boundary.
Actually we have also learned how to write this as a matrix vector equation
(Au)(K) = {4u(K) - u(K-1) - u(K+1) - u(K-n+1) - u(K+n-1)} / h**2 = f(K). (R)
and since K is just another "name" for i,j we can instead write the equations as: (Au)(i,j) = f(i.j), i,j in interior u(i,j) = g(i,j), on boundary.
I will often use the form Au = f to describe the equations. Usually I will not have in mind the horrible matrix A given above, but rather the simple definition of the operator A as:
(Au)(i,j) = {4 u(i,j) - u(i-1,j) - u (i+1,j) - u(i,j-1) - u(i,j+1)} / h**2
(which of course is exactly equivalent to the matrix). Mathematically we say that A is an operator on the space of grid functions u(i,j).
Relaxation methods start by rewriting equation (Q) by taking all terms except u(i,j) to the other side and "solving" for u(i,j):
u(i,j) = 1/4 * (h**2 *f(i,j) + u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))
This identity is satisfied by the exact solution U - in fact it is exactly equivalent to the equation (Q).. An approximate solution ought to nearly satisfy it. This is the basis for an iterative solution method.
We will solve the equation iteratively. Starting with an initial guess u(i,j) we will define a better solution unew by the above equation:
unew(i,j) = 1/4 * (h**2 *f(i,j) + u(i,j-1) + u(i,j+1) + u(i-1,j) + u(i+1,j))
again applied at interior points (i,j) of GR, and
unew(i,j) = g(i,j) on boundary of GR.
Hopefullyl, unew is a better approximate than u. We then repeat the procedure. We copy unew into u, and regard it as a new "guess" and use the above formula to create yet another improvement unew. In practice u will tend to get steadily better (converge) or get steadily worse (diverge). For many important problems one can show it will always converge no matter what initial guess is taken - even the guess u(i,j) = 0 everywhere!
One can actually do even better by taking the new approximation to be a linear combination of unew and u:
v(i,j) = s*unew(i,j) + (1-s)*u(i,j), for some s
By experiment one can try to find a better value of s than s=1 (which would just be unew; obviously s=0 is not interesting since then v=u and so the solution cannot get better but just remains the same).
How can we tell when the solution is good enough? If we knew the exact answer U(i,j) to the numerical equations (Q) then the error vector e = U-u would be a good measure of how close we were, and we could decide to terminate iterations if the error was small enough. A convenient test is to stop iterating if norm_vector(e) < epsilon, a predefined small number such as .000001.
Unfortunately we dont know the exact answer U - in fact we are trying to find it! So the next best thing is to use the discrepancy in Au=f. For the exact answer, AU=f and so f-Au = 0. We call the vector r = f-Au, the residual. The residiual is 0 if u is the solution and is small if u is near the solution. Furthermore the residual can be easily calculated - it only requires that we can compute Au and that we know the right-side f; it does not require knowledge of the exact answer. Au is simple - just the some of 5 numbers.
The residual is a vector, and we usually measure its size by norm_vector(residual), which is a scalar, and therefore easier to manage.
This leads to the following iterative scheme
Choose initial u: perhaps u = 0, and a value for s: perhaps s = .8 .
loop unew(i,j) = 1/4 * (h*..... as above) u(i,j) = s*unew(i,j) + (1-s)*u(i,j) r(i,j) = (4u(i,j) - u(i,j-1) - u(i,j+1) - u(i-1,j) - u(i+1,j) - h**2*f(i,j)) residual = sqrt(sum over interior of GR of r(i,j)**2) if (residual < small) break loop end loop
What is the cost of this operation? Lets analyse the cost of one iteration.
Computing unew requires 4 adds and a multiply at each point i,j. Computing u requires an add and two multiplies at each i,j Computing r requires 5 subtracts and a multiply at each point i,j Computing residual = norm_vector(r) requires 1 add and 1 multiply at each i,j Total: 16 operations per point i,j, or a total of 16N operations Contrast with the N**3 needed by Gaussian Elimination. This is great news (so far).
How about space? The only space needed is to store u, unew, r and f - 4 vectors of length N each so the total memory needed is 4N - as opposed to N**2 for Gaussian Elimination. Again great news.
Now for the bad news! While the cost for an iteration is very cheap, the solution will actually only improve very slightly. It turns out that you need about N**2 iterations to reduce the error (or residual) by a factor 10. Thus the cost of getting an answer correct to say 6 digits might be 6N**3 - even worse than Gauss Elimination.
However, AT LEAST WE CAN RUN THIS ALGORITHM!! Gauss elimination required so much memory - N**2 - that it would not fit in the machine at all. This algorithm will fit in any reasoonable computer, and will grind out a solution eventually if you have enough patience.
Our upcoming goal will be to improve relaxation to make it converge faster so that ultimately maybe 6 iterations, rather than N**2 will be needed.
ERRORS AND RESIDUALS:
Here are some comments on measuring (or estimating) the error in an iterative solution process. Two concepts of error are most often used and are called residual and error respectively.
1. RESIDUALS
We are asked to solve: Lu = F where Lu(i,j) = 4u(i,j) - u(i,j-1) - u(i,j+1) - u(i-1,j) - u(i+1,j) F(i,j) = h**2 f(i,j)
Given any function u, it is natural to define the error - actually this is usually called the residual so I will call it r rather than e - as:
r = F - Lu
That way, if u is the exact answer then Lu = f so that r = 0. Conversely if r = 0 then u is the exact answer. If r is small it indicates that u is close to the exact answer. Therefore r is a reasonable measure of the error. r is of course a vector, so we measure it by RM = max |r(i,j)| or R2 = sqrt(sum of r(i,j)**2) both of which are scalars and therefore easier to work with
Obviously if R2 or RM are small, then ALL r(i,j) are small, and so u is close to exact.
2. ERRORS
Usually the name "error" is used to describe the difference between a guess u and the exact answer, U. Thus we might define:
e = U - u e(i,j) = U(i,j) - u(i,j)
Again, in order to have a simple measure one works with a scalar such as:
EM = max |e(i,j)| or E2 = sqrt(sum of r(i,j)**2)
Obviously if EM or E2 are small, then all e(i,j) are small, and so u is close to U.
3. RELATIONSHIP BETWEEN ERROR AND RESIDUAL.
The best measure of error is certainly e - it IS the error. However it is also impossible to compute it - unless you already know the exact answer U to the problem: in which case you would not be trying to solve it!
Therefore in practice we often fall back on the residual r, which we can compute as F - Au, since F, the right-hand side, is of course known.
The error e and residual r are closely related. In fact Ae = A(U-u) = AU - Au = F - Au = r
So in fact the error satisfies the equation Ae = r. Clearly if r is small so is e and vice versa.
4. ESTMATING SUCCESSIVE ERRORS
As I pointed out in the above, it is impossible to compute the error e = U-u of an iterate u unless you know the exact answer U - in which case you ought not to be iterating!
There are three approaches that are used to overcome this problem:
1. Use the residual: r = F-Au as a measure of the error, as I proposed in above
2. Estimate the error using two successive values of u, and u':
Let u' = Iteration(u)
where Iteration(u) stands for the new iteration values.
Now define as usual e = U-u e' = U-u' There is no way to compute e or e' since we dont know U usually. However we can compute d = u'-u Now d = u'-u = u' - U + U - u = e - e' and so while we cannot compute the error e or e', we can compute e - e'
Suppose we believe that our method is converging with a convergence factor c so that (by definition) the error decreases by c at each iteration: Then |e'| < c|e| and so |d| = |e - e'| >= |e| - |e'| >= |e| - c|e| = (1-c)|e|
Turning this around we therefore have:
|e| <= |d|/(1-c)
Therefore we have succeeded in estimating the uncalculatable e in terms of the calculatable d, although we need to be sure the method converges to make use of this. If you know the method has a convergence rate c <= .5, then we have 1 - c >= .5 and therefore |e| <= 2*d which says that to all intents and purposes d measures the error.
In practice one usually computes d or rather norm(d) = max |d(i,j| or norm(d) = sqrt (sum of d(i,j)**2)
and then regards d as meassuring the error. This is wrong if the method diverges, but of course then you will see that d is getting bigger!
3. In special cases, such as the code test cases where you start from a known solution U(i,j), it is possible to compute the error directly:
e(i,j) = U(i,j) - u(i,j)
So you really have three different ways to estimate error.
5. SPECIAL CASE OF RHS = 0:
There is one situation where it is actually possible to exactly measure the error at each iteration (not just the residual r).
This is the case where we are solving Au=f with f==0. In that case the exact solution is obviously U==0.
As a result, at any iteration we can compute the error: e = u-U = u ie the error is just u itself.
For the study of convergence, this case is therefore the nicest to work with. Of course the idea is to start with a nonzero u - perhaps a random u, or even just a simple function.
A second advantage of the above case is that in solving Au=f, f!=0, round-off errors become important assoon as the error gets fairly small. Basicaly Au is large, f is large, but r=f-Au is getting small, and being the difference of two large things, it becomes impossible to accurately calculate it.
This will show up in your tests as follows:
error starts out decreasing nicely; convergence rate away from 1 (say .6)
after some iterations
error decrease slows; convergence rate settles at a value near 1, say .98
after MANY iterations:
error becomes quite small compare to f; convergence rate now jumps to 1 or more at this point the solution just oscillates around, and it is pointless to continue - unless in higher precision.
So I strongly reccomend you test convergence aspects using f = 0.
COMMENT ON DISCRETE VS CONTINUUM:
In case 3, you need to be careful. I will illustrate with an example. Suppose we consider the equation: Lu = -d2u/dx**2 -d2u/dy**2 = -4, 0 <=x,y <=1 U(x,y) = g(x,y) == x**2 + y**2, on boundary
By design, the function U(x,y) = x**2 + y**2 is the exact solution (just stick it in and verify!).
Numerically we discretize the abobe equation and then solve what results. However we are solving a DISCRETE equation which is not exactly Lu = -4. We might call it Du = -4 where Du = (4u(i,j) - u(i,j-1) - u(i,j+1) - u(i-1,j) - u(i+1,j)) / h**2
where h is the grid spacing, and we dont know the exact answer to this discrete equation. It will certainly be close to U(x,y) = x**2 + y*2, but except in a real fluke, it wont be exactly this.
Our numerical procedure solves a discrete problem. To get a solution that is exactly known, you need to start with a discrete function, V, and compute the corresponding DISCRETE RHS for it:
i.e. chose numbers V(i,j) on a grid, and compute F(i,j) = DV(i,j) and define the boundary numbers g(i,j) = V(i,j) on the boundary.
Now solve Du = F on the grid u = g on the boundary of the grid.
By construction, the result will be u(i,j) = V(i,j), so you know the exact answer.
######
Lets return to our example and see what it leads to:
We had U(x,y) = x**2 + y**2, 0 <= x, y <= 1
Take the obvious (n+1)*(n+1) square grid: x(i) = i*hx, y(j) = j*hy, 0 <= i, j <= n, hx = hy = h = 1/n .
Then V(i,j) = U(x(i),y(j)) = (i**2 + j**2) * h**2
F(i,j) = DV(i,j) = [ (4(i**2 + j**2) - i**2 - (j-1)**2 -i**2 - (j+1)**2 - (i-1)**2 - j**2 - (i+1)**2 - j**2 ) * h**2 ] / h**2
= -4
So it is a fluke - the discrete derivative Du of V is exactly -4, not just approximately 4. This is a one in a million situation.
The boundary values are of course exact too: g(i,j) = V(i,j) on boundary = U(x(i),y(j))**2 by definition of V.
So in fact it is corect to use U(x,y) as the exact solution.
But in general you need to be more careful, and carefully construct the discrete derivative as above, not assume it is -4 or whateverthe continuum derivative was. In only of U is a quadratic function of x,y will you find that DU and LU agree.
The JACOBI, SOR and GAUSS_SEIDEL METHODS
The iteration scheme introduced in previous notes is a special case of a general iterative method for solving linear equations. This is as an alternative to solving the equations by Gaussian Elimination, which as we have seen tends to be very expensive.
Consider the general system: Au = F, where a is NxN matrix, u and F are vectors of length N.
We can write this in component form as: Sum_over_j a(i,j) u(j) = f(i), for each 0 <= i < N and where the sum is the sum over 0 <= j < N.
In the ith equation we could seperate out the term involving u(i) on the left:
Sum a(i,j) u(j) = a(i,i)u(i) + Sum' a(i,j)u(j)
where we have introduced the notation Sum' to denote the sum over all j EXCEPT j = i.
Now take the Sum' term to the right, to get:
a(i,i)u(i) = f(i) - Sum' a(i.j)u(j)
or u(i) = {f(i) - Sum' a(i,j)u(j)}/a(i,i)
Any solution of the equation Au=F must satisfy this.
Now supose we have a vector u(i) which is NOT a solution of the equation. We could define a new vector v(i) using the above relation:
Define: v(i) = {f(i) - Sum' a(i,j)u(j)}/a(i,i), 0 <= i < N.
If u is the the exact solution then from the above we have v = u. If u is close to the exact solution, then we might hope that v would be too.
In fact, under fairly broad circumstances, even if u is not close to the exact solution, v will be a bit closer.
This leads to the following iterative solution method:
Jacobi Iteration:
1. Choose an initial guess u0: a vector u0(i) Often, one chooses u0(i) = 0 for want of a better idea. This will certainly be a bad guess in general.
2. Define an improved solution u1 by the above formula for v:
v(i) = {f(i) - Sum' a(i,j)u0(j)}/a(i,i), 0 <= i < N. u1(i) = v(i)
3. Now continue this process, creating u2(i), u3(i), ...., each one based on the previous. For example: v(i) = {f(i) - Sum' a(i,j)u5(j)}/a(i,i), 0 <= i < N. u6(i) = v(i)
Then if the vectors u0, u1, u2, ... converge to some vector U, it is easy to show that U must be a solution of the equation. In praxctice we might repeat the process 100 times, and then u100 might be a very good approximation to U.
Implementation detail: Note that we dont ever need the vectors u0, u1, ... once we have computed the next vector u1, u2, ..... This allows us to store all of the u0, u1, u2, ... in the same array - lets call it u. Then the algorithm reduces to:
Do i = 0,N-1 u(i) = 0.0 End Do
Do Iter = 1,Num_Iter
Do i = 0,N-1 v(i) = {f(i) - Sum' a(i,j)u(j)}/a(i,i), 0 <= i < N. End Do
Do i = 0,N-1 u(i) = v(i) End Do
End Do
It is very important to store the new values into v(i) and only at the end, copy v(i) back to u(i). Otherwise when we would change u(i) immediately, and then in calculating v(i+1) we would be using the NEW value of u(i), not the old one.
Convergence:
Since such methods may not converge, it is important to monitor convergence. There are two ways of doing this as discussed in previous notes: residual or an error estimate.
Residual: Compute r = f-Au. Define ||r|| = norm(r) where norm is a suitable norm. Good choices for the norm would be: norm(r) = sqrt ( Sum (r(i))**2) or norm(r) = max (|r(i)|, 0 <=i <N.
Now print out ||r|| at each iteration. If |||r|| does not converge to 0 then the method is not converging.
We can define the residual convergence rate as: cr = ||current residual|| / || last residual|| and convergence will require that this ratio be less than 1.
Error: Estimate E = U - u, where U is the exact solution. Because we dont usually know U, the best we can do is to assume it is converging, in which case v will be closer to U than u was - so we could estimate the error as: e = u - ulast Again we would compute ||e|| using one of the norms above. Again we can define the convergence rate as: ce = ||current error|| / || last error||
A good computation of an iterative method will print out at each iteration: ||r||, cr, ||e||, ce allowing the user to monitor convergence.
Frequently one adds code to stop when ||e|| is sufficiently small, and cetainly to stop if the method is diverging - ie ce or cr bigger than 1 for several successive iterations.
Parallelization: The above method parallelizes well. Nothing special to say here.
SOR Method:
The Jacobi method may often be improved by mixing the new solution with a fraction of the old one. Choose a fraction s:
v(i) = {f(i) - Sum' a(i,j)u0(j)}/a(i,i), 0 <= i < N. u1(i) = s*v(i) + (1-s)*u0(i)
and repeat this for all iterations. Again, if we store all u's in one array we can write this as: v(i) = {f(i) - Sum' a(i,j)u0(j)}/a(i,i), 0 <= i < N. u(i) = s*v(i) + (1-s)*u(i)
Of course the case s=1 is exactly Jacobi, while s=0 would mean we kept using only the old value - so it could not converge. Good values for s can be found by experiment (part of the assignment).
One again we would compute convergence rates as above. We would say s was a good choice if the convergence rate for that s is small compared to that for another value of s.
Again this method parallelizes well.
Gauss-Seidel Method:
Instead of storing the new values into a new array v in Jacobi, we could instead have stored them AS THERY ARE COMPUTED into the array u, over-writing the old values there. Then when we compute the next u(i), we will be using the new u(j) for j<i, and the old u(j) for j>i. If we really believe the method converges then the new u are "better" than the old ones, so this should converge fadter than Jacobi. This Gauss-Seidel method leads to the iteration:
u(i) = {f(i) - Sum' a(i,j)u(j)}/a(i,i), 0 <= i < N. or as a program:
Do i = 0,N-1 u(i) = 0.0 End Do
Do Iter = 1,Num_Iter
Do i = 0,N-1 u(i) = {f(i) - Sum' a(i,j)u(j)}/a(i,i), 0 <= i < N. End Do
End Do
The results will now depend on which u(i) are computed first - ie on the order of the u(i). For example if we looped from N-1 down to 0 instead of from 0 up to N-1, we would get a different result (both good).
Once again we compute convergence rates etc as before.
Parallelization of this method is however a real problem. On a parallel machine, different processors will grab blocks of indices i and will compute the u(i) - in parallel. As aresult the new values being coputed in processor P are not known to procesor Q when he simultaneously computes new values. In the extreme case where there are N processors (vector length), ALL processors compute a single new value SIMULTANEOUSLY - and as a result ca use only the old values of u(i). Thus in this extreme case, Gauss-Seidel is identical to Jacobi. --