
Fully Implicit Implementation

The above implementation performed satisfactorily. However, since it is subject to the CFL restriction on the time step, we decided to develop a fully implicit version to overcome this limitation.

An alternative approach to the Block Domain Decomposition is the following: each process, working on one of the blocks, calculates its part of the tri-diagonal system, and the system is then solved using the distributed tri-diagonal solver (Johnsson et al., 1985; Sjögreen, 1998). The solver works in the following four steps:

1. Forward elimination locally (parallel).
2. Backward elimination locally (parallel).
3. Solve for boundary points (sequential). This is the critical step, where the contention happens. Each process cannot continue computing its local part of the solution until it receives the data produced and sent by the task working on the preceding block in the row. The task assigned to the first block in the row has no predecessor, so it breaks the chain of waits by sending its data, which allows the forward elimination to complete. In the backward elimination sweep that follows, each process waits for the data produced in the following block of the row. The task working on the last block of the row now breaks the chain of waits by sending its data, which allows the backward elimination to complete.
4. Solve for interior points (parallel).
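
The dependency chain described in the sequential step can be illustrated with a minimal sketch. It is not the distributed solver of Johnsson et al. or Sjögreen: it is a plain Thomas (forward elimination, back substitution) sweep over one row, cut into blocks and run serially, with no message passing. The names `forward_sweep`, `backward_sweep`, `solve_row_in_blocks` and the `carry` value are hypothetical and chosen only for this illustration. The sketch shows only why each block must wait for its predecessor in the forward sweep and for its successor in the backward sweep.

```python
def forward_sweep(a, b, c, d, carry):
    """Forward elimination on one block of a tri-diagonal system.

    a, b, c are the sub-, main and super-diagonal of the block, d its
    right-hand side. carry holds the modified coefficients of the last
    row of the preceding block (None for the first block in the row).
    """
    prev_c, prev_d = carry if carry is not None else (0.0, 0.0)
    cp, dp = [], []
    for i in range(len(b)):
        denom = b[i] - a[i] * prev_c
        prev_c = c[i] / denom
        prev_d = (d[i] - a[i] * prev_d) / denom
        cp.append(prev_c)
        dp.append(prev_d)
    # The last pair is what the next block in the row has to wait for.
    return cp, dp, (prev_c, prev_d)


def backward_sweep(cp, dp, x_next):
    """Back substitution on one block.

    x_next is the first unknown of the following block (0.0 for the
    last block in the row, which therefore can start immediately).
    """
    x = [0.0] * len(dp)
    for i in range(len(dp) - 1, -1, -1):
        x[i] = dp[i] - cp[i] * x_next
        x_next = x[i]
    return x


def solve_row_in_blocks(a, b, c, d, block_size):
    """Solve one row block by block, mimicking the order of the waits."""
    n = len(b)
    bounds = [(s, min(s + block_size, n)) for s in range(0, n, block_size)]

    # Forward sweep: block k cannot start before block k-1 has finished.
    carry, partial = None, []
    for lo, hi in bounds:
        cp, dp, carry = forward_sweep(a[lo:hi], b[lo:hi], c[lo:hi], d[lo:hi], carry)
        partial.append((cp, dp))

    # Backward sweep: block k cannot start before block k+1 has finished.
    x_next, pieces = 0.0, []
    for cp, dp in reversed(partial):
        x = backward_sweep(cp, dp, x_next)
        x_next = x[0]
        pieces.append(x)
    return [v for x in reversed(pieces) for v in x]


if __name__ == "__main__":
    n = 8
    a = [0.0] + [-1.0] * (n - 1)   # sub-diagonal (first entry unused)
    b = [4.0] * n                  # main diagonal
    c = [-1.0] * (n - 1) + [0.0]   # super-diagonal (last entry unused)
    d = [float(i + 1) for i in range(n)]
    print(solve_row_in_blocks(a, b, c, d, block_size=3))
```

In this sketch the two loops over the blocks are inherently ordered, so with one process per block only one process is active at a time during these sweeps. This is the contention referred to above.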

The enhancements introduced in the fully implicit code are offset by the high contention between processes in the tri-diagonal solver described above. As a result, both implementations give similar speedup (the ratio of serial to parallel run time).
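
For reference, the speedup mentioned here can be written as follows. The symbols $S_P$, $T_1$ and $T_P$ are notation assumed for this note: $T_1$ is the run time of the serial code and $T_P$ the run time of the parallel code on $P$ processes.

```latex
S_P = \frac{T_1}{T_P}
```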


Elias Kaplan M.Sc.
1998-07-22