README
May 7, 2014
Date: 8/15/2012
T3PIO Library: TACC's Terrific Tool for Parallel I/O
----------------------------------------------------
Robert McLay
mclay@tacc.utexas.edu
T3PIO is a library that improves the parallel performance of MPI I/O. Because Parallel HDF5 uses MPI I/O, the library improves its performance as well. T3PIO queries the Lustre filesystem so that the application's I/O settings match the filesystem.
The parallel I/O performance depends on three parameters:
1) Number of writers
2) Number of stripes
3) Stripe size
This library focuses on writing files. The number of stripes can be controlled only when an application creates a file. The library extracts data from the Lustre filesystem and from your parallel program to set all three parameters to improve performance.
Using the library is straightforward. Only one extra call to the T3PIO library is needed to improve your performance.
Using T3PIO with HDF5:
In Fortran, it looks like this (a complete example can be found in cubeTestHDF5/writer.F90, lines 52-140):
subroutine hdf5_writer(....)
   use hdf5
   use t3pio
   integer globalSize            ! Estimate of GlobalSize of file in (MB)
   integer info                  ! MPI Info object
   integer comm                  ! MPI Communicator
   integer(hid_t) :: plist_id    ! Property list identifier
   character*(256) :: dir        ! directory of output file
   ...

   comm = MPI_COMM_WORLD

   ! Initialize info object.
   call MPI_Info_create(info, ierr)

   ! use library to fill info with nwriters, stripe
   dir = "./"
   call t3pio_set_info(comm, info, dir, ierr,   &   ! <--- T3PIO library call
                       GLOBAL_SIZE=globalSize)
   call H5open_f(ierr)
   call H5Pcreate_f(H5P_FILE_ACCESS_F,plist_id,ierr)
   call H5Pset_fapl_mpio_f(plist_id, comm, info, ierr)
   call H5Fcreate_f(fileName, H5F_ACC_TRUNC_F, file_id, ierr, access_prp = plist_id)
Essentially, you make the normal calls to create an HDF5 file. The only addition is a call to "t3pio_set_info" with the communicator, an info object, and the directory where the file is to be written. Optionally, you can specify an estimate of the global size of the file. If you do, the stripe size will increase to improve performance.
In C/C++ it is very similar (a complete example can be found in unstructTestHDF5/h5writer.C, lines 71-95). Note that the first three arguments to t3pio_set_info (the communicator, the info object, and the directory) are required. All other arguments are optional.
#include "t3pio.h"
#include "hdf5.h"
void hdf5_writer(....)
{
    MPI_Info    info = MPI_INFO_NULL;
    const char *dir;
    hid_t       plist_id;
    ...

    MPI_Info_create(&info);
    dir = "./";
    ierr = t3pio_set_info(comm, info, dir,     /* <-- T3PIO call */
                          T3PIO_GLOBAL_SIZE, globalSize);
    plist_id = H5Pcreate(H5P_FILE_ACCESS);
    ierr = H5Pset_fapl_mpio(plist_id, comm, info);
    file_id = H5Fcreate(fileName, H5F_ACC_TRUNC,
                        H5P_DEFAULT, plist_id);
    ...
}
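Both examples pass an estimate of the global file size in MB. As a rough sketch of how such an estimate might be computed before the T3PIO call, the helpers below total the bytes for a grid and convert them to whole megabytes. The names grid_bytes and bytes_to_mb are hypothetical and not part of T3PIO, and the layout (one fixed-size value per grid point per variable) is an assumption made only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not part of T3PIO): total bytes written for a grid
 * of npoints points, nvars variables per point, value_size bytes per value. */
uint64_t grid_bytes(uint64_t npoints, uint64_t nvars, uint64_t value_size)
{
    return npoints * nvars * value_size;
}

/* Hypothetical helper: convert bytes to whole MB (1 MB = 1024*1024 bytes),
 * rounding up so that a small non-empty file is never reported as 0 MB. */
int bytes_to_mb(uint64_t nbytes)
{
    const uint64_t mb = 1024ULL * 1024ULL;
    return (int)((nbytes + mb - 1) / mb);
}
```

For a 512 x 512 x 512 grid holding one 8-byte variable, bytes_to_mb(grid_bytes(512ULL*512*512, 1, 8)) gives 1024, which could then be passed as globalSize. Rounding up is a choice made in this sketch; truncating integer division is equally reasonable for an estimate.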
Using T3PIO with MPI I/O:
Using T3PIO with MPI I/O is essentially the same as with HDF5. In Fortran 90 it would look like (see writer.F90 lines 300-317):
subroutine hdf5_writer(....)
   use t3pio
   integer info                  ! MPI Info object
   integer comm                  ! MPI Communicator
   integer iTotalSz              ! File size in MB.
   integer filehandle            ! MPI file handle
   character*(256) :: dir        ! directory of output file

   call MPI_Info_create(info,ierr)

   iTotalSz = totalSz / (1024*1024)
   dir = "./"
   call t3pio_set_info(MPI_COMM_WORLD, info, dir, ierr,  &
                       global_size = iTotalSz)

   call MPI_File_open(p % comm, fn, MPI_MODE_CREATE+MPI_MODE_RDWR, &
                      info, filehandle, ierr)
   ...
The call to the T3PIO library is the same whether you use MPI I/O or HDF5.
Installing T3PIO:
The library can be downloaded from GitHub:
$ git clone git://github.com/TACC/t3pio.git
To configure the package do:
$ FC=mpif90 F77=mpif77 CC=mpicc CXX=mpicxx ./configure [options]
You must specify the four MPI compilers. The options are:
--prefix=/path/to/install/dir
    Path where the library and test programs will be installed.
--with-phdf5=/path/to/parallel/hdf5/dir
    The library does not require Parallel HDF5, but the test programs do.
    Set this to the directory where Parallel HDF5 is installed. The
    "include" and "lib" directories are expected directly below it.
--with-node-memory=N (where N is in MB)
    Optional. The amount of memory on the compute nodes, in MB. Setting it
    avoids having the library run "free -m" to determine this quantity.
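If you want to supply N yourself, the total reported by "free -m" on a compute node is a natural source. The sketch below shows one way to pull that number out of the command's output. It assumes the usual layout in which the line beginning with "Mem:" carries the total as its first numeric column. The function parse_free_total_mb is a hypothetical helper and is not part of T3PIO or its configure script.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given the text printed by "free -m", return the
 * total memory in MB from the "Mem:" line, or -1 if it is not found. */
long parse_free_total_mb(const char *text)
{
    const char *line = text;
    while (line != NULL && *line != '\0') {
        long total;
        if (strncmp(line, "Mem:", 4) == 0 &&
            sscanf(line + 4, "%ld", &total) == 1)
            return total;
        line = strchr(line, '\n');   /* advance to the next line */
        if (line != NULL)
            line++;
    }
    return -1;
}
```

The returned value is what would be passed as N to --with-node-memory.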
After configure do:
$ make; make install
This builds and installs the library and the two test programs, cubeTest and unstructTest.
Test Program (1): cubeTest
To see the options for the structured grid test code, run:
$ cubeTest -H
cubeTest version 1.0
Usage: cubeTest [options]
options:
  -v            : version
  -H            : This message and quit
  --help        : This message and quit
  --dim num     : number of dimension (2, or 3)
  -g num        : number of points in each direction globally (no default)
  -l num        : number of points in each direction locally (default = 5)
  -f num        : number of stripes per writer (default = 2)
  --stripes num : Allow no more than num stripes (file system limit by default)
  --h5chunk     : use HDF5 with chunks
  --h5slab      : use HDF5 with slab (default)
  --romio       : use MPI I/O
  --numvar num  : number of variables 1 to 9 (default = 1)
Test Program (2) : unstructTest
To see the options for the unstructured grid test code, run:
$ unstructTest -h
Usage:
./unstructTest [options]
Options:
  -h, -?    : Print Usage
  -v        : Print Version
  -C        : use h5 chunk
  -S        : use h5 slab (default)
  -l num    : local size is num (default=10)
  -f factor : number of stripes per writer (default=2)
  -s num    : maximum number of stripes
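
Both test programs accept a stripes-per-writer factor (-f) and an upper bound on the number of stripes (--stripes or -s). One plausible reading of how these two options combine is sketched below. This is an assumption made for illustration, not T3PIO's actual algorithm, and pick_stripe_count is a hypothetical name.

```c
/* Hypothetical illustration (not T3PIO's implementation): request
 * `factor` stripes for each writer, but never exceed `max_stripes`
 * and always use at least one stripe. */
int pick_stripe_count(int nwriters, int factor, int max_stripes)
{
    int stripes = nwriters * factor;
    if (stripes > max_stripes)
        stripes = max_stripes;
    if (stripes < 1)
        stripes = 1;
    return stripes;
}
```

Under this reading, 16 writers with the default factor of 2 and a cap of 160 would ask for 32 stripes, while 128 writers would be limited to 160.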