### TridiagLU - Parallel Direct Solver for Tridiagonal Systems

This code is a work-in-progress. The Git commit messages in the repository show the latest progress like bugfixes, addition of features, etc.

#### Download

With Bitbucket account and SSH keys set up:

###### git clone git@bitbucket.org:deboghosh/tridiaglu.git

Otherwise:

###### git clone https://bitbucket.org/deboghosh/tridiaglu.git

#### Overview

This is my implementation of a direct solver (LU decomposition) for non-periodic tridiagonal systems of equations in serial and parallel (MPI). It can solve a single system as well as a stack of independent systems. The latter is useful, for example, for compact finite-difference schemes applied to multi-dimensional problems. The following functions are available:-

**(**

*tridiagLU**a,b,c,x,n,ns,r,m*) (Parallel tridiagonal solver)

**(**

*tridiagLUGS**a,b,c,x,n,ns,r,m*) (Tridiagonal solver based on a gather-and-solve strategy)

**(**

*tridiagIterJacobi**a,b,c,x,n,ns,r,m*) (Iterative tridiagonal solver using the Jacobi method)

A **test code** is provided in */src/Test* to check the tridiagonal solvers and test the walltimes. It runs a series of tests - solution of an identity matrix, upper-triangular tridiagonal matrix and a full tridiagonal matrix with random entries and right-hand sides to check for accuracy. It measures the walltimes by solving a set of systems repeatedly.

#### Algorithm / References

* tridiagLU()* solves the parallel tridiagonal system by rearranging the equations such that the last points of each sub-domain are grouped at the end. This results in a parallel elimination of the interior points, followed by the inversion of a reduced system of size

*nproc-1*, whose elements are distributed as one element in each process except the root. This reduced system can either be solved by gathering on one processor, solving (in serial) and scattering back -

**, or iteratively by**

*tridiagLUGS()***. Dr. Paul Fischer explained the algorithm and guided me in implementing this.**

*tridiagIterJacobi()***Here**is a rough sketch of the algorithm.

* tridiagLUGS()* Each of the "ns" systems is gathered on one processor, solved in serial, and the solution scattered back. The parallelism lies in solving the "ns" different systems on multiple processors (i.e., each processor solves ~ns/nproc number of systems in serial).

* tridiagIterJacobi()* uses Jacobi iterations to solve the system. See

**here**.

#### Scalability

There are some scalability plots available **here**.

#### Function Arguments

**a**(Size: [0,ns-1]x[0,n-1], Type: double*) - subdiagonal entries

**b**(Size: [0,ns-1]x[0,n-1], Type: double*) - diagonal entries

**c**(Size: [0,ns-1]x[0,n-1], Type: double*) - superdiagonal entries

**x**(Size: [0,ns-1]x[0,n-1], Type: double*) - right-hand side (solution)

**n**(Type: int) - local size of the system

**ns**(Type: int) - number of systems to solve

**r**(Type: TridiagLU*) - structure containing the parameters and runtimes (see README or include/tridiagLU.h for explanation)

**m**(Type: MPIContext*) - MPI Communicator (NULL for serial)

Return value (int): 0 (successful solve), -1 (singular system)

Notes:-

#### Compiling

The following steps should compile the code (the default prefix is the source folder
itself, so the binary will be installed in *tridiaglu/bin*):

###### autoreconf -i

This will generate the required files for:###### [CFLAGS="..."] ./configure [--with-mpidir=/path/to/mpi] [--prefix=/install/dir]

###### make

###### make install

** A serial version can be compiled using the "-Dserial" compile flag.

** If MPI is unavailable or the configure script is unable to detect MPI, then a serial version of the code will get compiled.

#### Testing

Compiling the code results in compiling the tridiagonal solver functions as well as a code to test their performance and walltimes. The installed executable runs this test code.

###### [/path/to/mpi/bin/]mpiexec -n $NPROC /path/to/TRIDIAGLU

The run directory must contain a file "

**input**" containing three integers: size of the system, number of systems and number of solves (for walltime measurements).

The errors reported should be machine-zero.

#### Using

The test code demonstrates how to call these functions, including setting up the arrays *a*, *b*, *c* and *x* as well as the other arguments. To use these functions in another code, do either of the following:

Copy the header file * include/tridiagLU.h* to the include folder of the code calling these functions. Copy all the source files in

*src/TridiagLU/*to the the code's src directory and include them while compiling.

*OR*

Place

*where the compiler can find it (or use the compiler flag*

**include/tridiagLU.h***while compiling) and include*

**"-I/path/to/tridiagLU.h"***while linking.*

**$build_dir/src/TridiagLU/libTridiagLU.a**