In this chapter, we outline the process for updating codes designed for the older FFTW 2 interface to work with FFTW 3. The interface for FFTW 3 is not backwards-compatible with the interface for FFTW 2 and earlier versions; codes written to use those versions will fail to link with FFTW 3. Nor is it possible to write “compatibility wrappers” to bridge the gap (at least not efficiently), because FFTW 3 has different semantics from previous versions. However, upgrading should be a straightforward process because the data formats are identical and the overall style of planning/execution is essentially the same.
Unlike FFTW 2, there are no separate header files for real and complex
transforms (or even for different precisions) in FFTW 3; all interfaces
are defined in the <fftw3.h>
header file.
The main difference in data types is that fftw_complex
in FFTW 2
was defined as a struct
with macros c_re
and c_im
for accessing the real/imaginary parts. (This is binary-compatible with
FFTW 3 on any machine except perhaps for some older Crays in single
precision.) The equivalent macros for FFTW 3 are:
#define c_re(c) ((c)[0]) #define c_im(c) ((c)[1])
This does not work if you are using the C99 complex type, however,
unless you insert a double*
typecast into the above macros
(see Complex numbers).
Also, FFTW 2 had an fftw_real
typedef that was an alias for
double
(in double precision). In FFTW 3 you should just use
double
(or whatever precision you are employing).
The major difference between FFTW 2 and FFTW 3 is in the planning/execution division of labor. In FFTW 2, plans were found for a given transform size and type, and then could be applied to any arrays and for any multiplicity/stride parameters. In FFTW 3, you specify the particular arrays, stride parameters, etcetera when creating the plan, and the plan is then executed for those arrays (unless the guru interface is used) and those parameters only. (FFTW 2 had “specific planner” routines that planned for a particular array and stride, but the plan could still be used for other arrays and strides.) That is, much of the information that was formerly specified at execution time is now specified at planning time.
Like FFTW 2's specific planner routines, the FFTW 3 planner overwrites
the input/output arrays unless you use FFTW_ESTIMATE
.
FFTW 2 had separate data types fftw_plan
, fftwnd_plan
,
rfftw_plan
, and rfftwnd_plan
for complex and real one- and
multi-dimensional transforms, and each type had its own ‘destroy’
function. In FFTW 3, all plans are of type fftw_plan
and all are
destroyed by fftw_destroy_plan(plan)
.
Where you formerly used fftw_create_plan
and fftw_one
to
plan and compute a single 1d transform, you would now use
fftw_plan_dft_1d
to plan the transform. If you used the generic
fftw
function to execute the transform with multiplicity
(howmany
) and stride parameters, you would now use the advanced
interface fftw_plan_many_dft
to specify those parameters. The
plans are now executed with fftw_execute(plan)
, which takes all
of its parameters (including the input/output arrays) from the plan.
In-place transforms no longer interpret their output argument as scratch
space, nor is there an FFTW_IN_PLACE
flag. You simply pass the
same pointer for both the input and output arguments. (Previously, the
output ostride
and odist
parameters were ignored for
in-place transforms; now, if they are specified via the advanced
interface, they are significant even in the in-place case, although they
should normally equal the corresponding input parameters.)
The FFTW_ESTIMATE
and FFTW_MEASURE
flags have the same
meaning as before, although the planning time will differ. You may also
consider using FFTW_PATIENT
, which is like FFTW_MEASURE
except that it takes more time in order to consider a wider variety of
algorithms.
For multi-dimensional complex DFTs, instead of fftwnd_create_plan
(or fftw2d_create_plan
or fftw3d_create_plan
), followed by
fftwnd_one
, you would use fftw_plan_dft
(or
fftw_plan_dft_2d
or fftw_plan_dft_3d
). followed by
fftw_execute
. If you used fftwnd
to to specify strides
etcetera, you would instead specify these via fftw_plan_many_dft
.
The analogues to rfftw_create_plan
and rfftw_one
with
FFTW_REAL_TO_COMPLEX
or FFTW_COMPLEX_TO_REAL
directions
are fftw_plan_r2r_1d
with kind FFTW_R2HC
or
FFTW_HC2R
, followed by fftw_execute
. The stride etcetera
arguments of rfftw
are now in fftw_plan_many_r2r
.
Instead of rfftwnd_create_plan
(or rfftw2d_create_plan
or
rfftw3d_create_plan
) followed by
rfftwnd_one_real_to_complex
or
rfftwnd_one_complex_to_real
, you now use fftw_plan_dft_r2c
(or fftw_plan_dft_r2c_2d
or fftw_plan_dft_r2c_3d
) or
fftw_plan_dft_c2r
(or fftw_plan_dft_c2r_2d
or
fftw_plan_dft_c2r_3d
), respectively, followed by
fftw_execute
. As usual, the strides etcetera of
rfftwnd_real_to_complex
or rfftwnd_complex_to_real
are no
specified in the advanced planner routines,
fftw_plan_many_dft_r2c
or fftw_plan_many_dft_c2r
.
In FFTW 2, you had to supply the FFTW_USE_WISDOM
flag in order to
use wisdom; in FFTW 3, wisdom is always used. (You could simulate the
FFTW 2 wisdom-less behavior by calling fftw_forget_wisdom
after
every planner call.)
The FFTW 3 wisdom import/export routines are almost the same as before (although the storage format is entirely different). There is one significant difference, however. In FFTW 2, the import routines would never read past the end of the wisdom, so you could store extra data beyond the wisdom in the same file, for example. In FFTW 3, the file-import routine may read up to a few hundred bytes past the end of the wisdom, so you cannot store other data just beyond it.1
Wisdom has been enhanced by additional humility in FFTW 3: whereas FFTW 2 would re-use wisdom for a given transform size regardless of the stride etc., in FFTW 3 wisdom is only used with the strides etc. for which it was created. Unfortunately, this means FFTW 3 has to create new plans from scratch more often than FFTW 2 (in FFTW 2, planning e.g. one transform of size 1024 also created wisdom for all smaller powers of 2, but this no longer occurs).
FFTW 3 also has the new routine fftw_import_system_wisdom
to
import wisdom from a standard system-wide location.
In FFTW 3, we recommend allocating your arrays with fftw_malloc
and deallocating them with fftw_free
; this is not required, but
allows optimal performance when SIMD acceleration is used. (Those two
functions actually existed in FFTW 2, and worked the same way, but were
not documented.)
In FFTW 2, there were fftw_malloc_hook
and fftw_free_hook
functions that allowed the user to replace FFTW's memory-allocation
routines (e.g. to implement different error-handling, since by default
FFTW prints an error message and calls exit
to abort the program
if malloc
returns NULL
). These hooks are not supported in
FFTW 3; those few users who require this functionality can just
directly modify the memory-allocation routines in FFTW (they are defined
in kernel/alloc.c
).
In FFTW 2, the subroutine names were obtained by replacing ‘fftw_’ with ‘fftw_f77’; in FFTW 3, you replace ‘fftw_’ with ‘dfftw_’ (or ‘sfftw_’ or ‘lfftw_’, depending upon the precision).
In FFTW 3, we have begun recommending that you always declare the type
used to store plans as integer*8
. (Too many people didn't notice
our instruction to switch from integer
to integer*8
for
64-bit machines.)
In FFTW 3, we provide a fftw3.f
“header file” to include in
your code (and which is officially installed on Unix systems). (In FFTW
2, we supplied a fftw_f77.i
file, but it was not installed.)
Otherwise, the C-Fortran interface relationship is much the same as it was before (e.g. return values become initial parameters, and multi-dimensional arrays are in column-major order). Unlike FFTW 2, we do provide some support for wisdom import/export in Fortran (see Wisdom of Fortran?).
Like FFTW 2, only the execution routines are thread-safe. All planner
routines, etcetera, should be called by only a single thread at a time
(see Thread safety). Unlike FFTW 2, there is no special
FFTW_THREADSAFE
flag for the planner to allow a given plan to be
usable by multiple threads in parallel; this is now the case by default.
The multi-threaded version of FFTW 2 required you to pass the number of
threads each time you execute the transform. The number of threads is
now stored in the plan, and is specified before the planner is called by
fftw_plan_with_nthreads
. The threads initialization routine used
to be called fftw_threads_init
and would return zero on success;
the new routine is called fftw_init_threads
and returns zero on
failure. See Multi-threaded FFTW.
There is no separate threads header file in FFTW 3; all the function
prototypes are in <fftw3.h>
. However, you still have to link to
a separate library (-lfftw3_threads -lfftw3 -lm
on Unix), as well as
to the threading library (e.g. POSIX threads on Unix).
[1] We do our own buffering because GNU libc I/O routines are horribly slow for single-character I/O, apparently for thread-safety reasons (whether you are using threads or not).