Previous: FFTW MPI Transposes, Up: Distributed-memory FFTW with MPI


6.8 FFTW MPI Wisdom

FFTW's “wisdom” facility (see Words of Wisdom-Saving Plans) can be used to save MPI plans as well as to save uniprocessor plans. However, for MPI there are several unavoidable complications.

First, the MPI standard does not guarantee that every process can perform file I/O (at least, not using C stdio routines)—in general, we may only assume that process 0 is capable of I/O.[1] So, if we want to export the wisdom from a single process to a file, we must first export the wisdom to a string, then send it to process 0, then write it to a file.
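
Done by hand, that dance might look something like the following sketch, which writes the wisdom known to some other process source to a file. (This is essentially what the fftw_mpi_gather_wisdom function described below does for you. The function name write_wisdom_of, the message tags, and the use of blocking MPI_Send/MPI_Recv are purely illustrative; source is assumed to be nonzero, and the usual <fftw3-mpi.h>, <stdlib.h>, and <string.h> headers are assumed.)

     void write_wisdom_of(int source, const char *filename)
     {
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == source) { /* export wisdom and send it to process 0 */
               char *s = fftw_export_wisdom_to_string(); /* must be free()d */
               int len = strlen(s) + 1;
               MPI_Send(&len, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
               MPI_Send(s, len, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
               free(s);
          }
          else if (rank == 0) { /* receive the wisdom and write it out */
               int len;
               char *s;
               MPI_Recv(&len, 1, MPI_INT, source, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               s = malloc(len);
               MPI_Recv(s, len, MPI_CHAR, source, 1, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               fftw_import_wisdom_from_string(s);
               fftw_export_wisdom_to_filename(filename);
               free(s);
          }
     }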

Second, in principle we may want to have separate wisdom for every process, since in general the processes may run on different hardware even for a single MPI program. However, in practice FFTW's MPI code is designed for the case of homogeneous hardware (see Load balancing), and in this case it is convenient to use the same wisdom for every process. Thus, we need a mechanism to synchronize the wisdom.

To address both of these problems, FFTW provides the following two functions:

     void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
     void fftw_mpi_gather_wisdom(MPI_Comm comm);

Given a communicator comm, fftw_mpi_broadcast_wisdom will broadcast the wisdom from process 0 to all other processes. Conversely, fftw_mpi_gather_wisdom will collect wisdom from all processes onto process 0. (If the plans created for the same problem by different processes are not the same, fftw_mpi_gather_wisdom will arbitrarily choose one of the plans.) Both of these functions may result in suboptimal plans for different processes if the processes are running on non-identical hardware. Both of these functions are collective calls, which means that they must be executed by all processes in the communicator.

So, for example, a typical code snippet to import wisdom from a file and use it on all processes would be:

     {
         int rank;
     
         fftw_mpi_init();
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     }

(Note that we must call fftw_mpi_init before importing any wisdom that might contain MPI plans.) Similarly, a typical code snippet to export wisdom from all processes to a file is:

     {
         int rank;
     
         fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
     }
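
Putting the two snippets together, the lifetime of wisdom in a typical MPI program might look something like the following sketch. (The file name mywisdom, the transform size, and the choice of fftw_mpi_plan_dft_2d are arbitrary illustrations, not requirements.)

     #include <fftw3-mpi.h>
     
     int main(int argc, char **argv)
     {
          const ptrdiff_t N0 = 128, N1 = 128;
          int rank;
          ptrdiff_t alloc_local, local_n0, local_0_start;
          fftw_complex *data;
          fftw_plan plan;
     
          MPI_Init(&argc, &argv);
          fftw_mpi_init();
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     
          /* import any existing wisdom on process 0 and share it */
          if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
          fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     
          /* create and use plans as usual; planning benefits from the wisdom */
          alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                               &local_n0, &local_0_start);
          data = fftw_alloc_complex(alloc_local);
          plan = fftw_mpi_plan_dft_2d(N0, N1, data, data, MPI_COMM_WORLD,
                                      FFTW_FORWARD, FFTW_MEASURE);
          /* ... initialize data and call fftw_execute(plan) ... */
     
          /* save any newly accumulated wisdom before shutting down */
          fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
          if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
     
          fftw_destroy_plan(plan);
          fftw_free(data);
          MPI_Finalize();
          return 0;
     }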

Footnotes

[1] In fact, even this assumption is not technically guaranteed by the standard, although it seems to be universal in actual MPI implementations and is widely assumed by MPI-using software. Technically, you need to query the MPI_IO attribute of MPI_COMM_WORLD with MPI_Attr_get. If this attribute is MPI_PROC_NULL, no I/O is possible. If it is MPI_ANY_SOURCE, any process can perform I/O. Otherwise, it is the rank of a process that can perform I/O ... but since it is not guaranteed to yield the same rank on all processes, you have to do an MPI_Allreduce of some kind if you want all processes to agree about which is going to do I/O. And even then, the standard only guarantees that this process can perform output, but not input. See e.g. Parallel Programming with MPI by P. S. Pacheco, section 8.1.3. Needless to say, in our experience virtually no MPI programmers worry about this.
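
For the truly cautious, the check described above might look something like the following sketch (using the deprecated MPI_Attr_get interface named in the text; MPI_Comm_get_attr is the modern spelling, INT_MAX requires <limits.h>, and the io_rank name and the min-reduction heuristic are just one way to get everyone to agree):

     /* Sketch: return a rank that (hopefully) all processes agree can
        perform output, or -1 if no process reports I/O capability. */
     int io_rank(void)
     {
          int *attr, flag, rank, candidate, agreed;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Attr_get(MPI_COMM_WORLD, MPI_IO, &attr, &flag);
          if (!flag || *attr == MPI_PROC_NULL)
               candidate = INT_MAX;   /* this process knows of no I/O-capable rank */
          else if (*attr == MPI_ANY_SOURCE)
               candidate = rank;      /* any process will do, so offer ourselves */
          else
               candidate = *attr;     /* a specific rank, possibly different per process */
          MPI_Allreduce(&candidate, &agreed, 1, MPI_INT, MPI_MIN, MPI_COMM_WORLD);
          return agreed == INT_MAX ? -1 : agreed;
     }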