Up Up Up Last Update: Dec. 5th 1995

MagFEM Convolution routines

Package description

This document describes the usage of the convolution package which is used in the MagFEM-library. In short: Differences from other convolution packages:
Most packages (e.g. the routines in the standard Cray library) are for signal processing: A long signal (N elements) is filtered by convoluting it with a short sequence (M elements), with FFT's of length N+M. This package is optimized for convoluting a long signal (the charges, say length N) with a long filter (the interaction function, length 2N). The FFT is of length 2N only (it would be 3N in the example above). Further, the FFT of the filter is done once and for all, so that for repeated calculations it need not be done again.

We are currently considering implementing the Parseval theorem as well, which can be used if the integral over the convoluted functions is of interest only (The total energy in our application).

List of functions

Convolution is performed by calling the following functions (Hyperlinks lead to the detailed synopsis):
  1. Direct Summation (for debugging):
  2. Summation in Fourier Space:

Detailed synopsis

Convolution by direct summation:

      Subroutine CV1DDInit(din, N, M, NM, dout, sz)
C Take the array din(1:NM, 1:M) and prepare it for convolution. 
C din(*) is used as din(1:N, 1:M) in the caller. dout is allocated as
C dout(1:sz) by the caller (must be N*M at least)
C All that is done here is copy this into a contiguous array.
C Upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
      integer N, M, NM, sz
      real*8 din(1:NM, 1:M), dout(1:sz)
      
      Subroutine CV1DD(d, N, M, NM, if, Mif)
C Take the data in d (declared d(1:NM, 1:M), used as d(1:N,1:M)) and
C convolute it with the function in if(1:N,1:M). The result will be returned
C in d(1:N,1:Mif). if will not be changed, so it can be used several times.
C if has to be prepared with CV1DDInit
      integer N,M,NM,Mif
      real*8 d(1:NM, 0:Mif-1), if(0:N-1, 1:M)

      Subroutine CV2DDInit(din, N, M, NM, dout, sz)
C Take the array din(1:NM, 1:M) and prepare it for convolution. 
C din(*) is used as din(1:N, 1:M) in the caller. dout is allocated as
C dout(1:sz) by the caller (must be N*M at least)
C All that is done here is copy this into a contiguous array
C upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
      integer N, M, NM, sz
      real*8 din(1:NM, 1:M), dout(1:sz)

      Subroutine CV2DD(d, N, M, NM, if, dout)
C Take the data in d (declared d(1:NM, 1:M), used as d(1:N,1:M)) and
C convolute it with the function in if. The result will be returned in d.
C if will not be changed, so it can be used several times.
C if has to be prepared with CV2DDInit
      integer N,M,NM
      real*8 d(1:NM, 1:M), if(0:N-1, 0:M-1), dout(1:NM,1:M)

      Subroutine CV3DDInit(din, K,L,M, KM,LM, dout, sz)
C Take the array din(1:NM, 1:M) and prepare it for convolution. 
C din(*) is used as din(1:N, 1:M) in the caller. dout is allocated as
C dout(1:sz) by the caller (must be N*M at least)
C All that is done here is copy this into a contiguous array
C upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
      integer K,L,M, KM,LM, sz
      real*8 din(1:KM, 1:LM, 1:M), dout(1:sz)
      
      Subroutine CV3DD(d, K,L,M, KM,LM, if, dout)
C Take the data in d (declared d(1:KM, 1:LM, 1:M), used as d(1:K,1:L,1:M)
C and convolute it with the function in if. The result will be returned in d.
C if will not be changed, so it can be used several times.
C if has to be prepared with CV3DDInit
      integer K,L,M, KM,LM
      real*8 d(1:KM, 1:LM, 1:M), if(0:K-1, 0:L-1, 0:M-1)
      real*8 dout(KM, LM, M)

FFT

      Subroutine CV3DInit(din, H,K,L, HM,KM, dout, sz)
C Take the array din(HM,KM,L) and prepare it for convolution using FFT.
C din(*) is used as din(H,K,L) in the caller. dout is allocated as
C dout(1:sz) by the caller (sz must be 8*H*K*L at least)
C upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
C NOTE: This works for NON-periodic functions, as all maters of zero-padding
C are taken care of.
      integer H,K,L, HM,KM, sz
      real*8 din(HM, KM, L), dout(8*H*K*L)
      
      Subroutine CV3D(d, H,K,L, HM,KM, if, dout, h0,k0,l0,
     &     HMout,KMOut, scr,lscr)
C Take the data in d (declared d(HM,KM,L), used as d(H,K,L)) and
C convolute it with the function in if using FFT. The result will be 
C returned in dout(h0:h0+H-1,k0:k0+K-1,l0:l0+l-1). dout is declared by
C the caller as dout(1:HMout,1:KMOut,1:l0+l-1) at least. if will not be
C changed, so it can be used several times. if has to be prepared with
C CV2DInit. scr is a scratch place of at least size 12*H*K*L, if it is
C to be used as signaled by lscr=1. If lscr=0, scr might be a dummy
      integer H,K,L, HM,KM, h0,k0,l0, HMOut,KMOut, lscr
      real*8 d(HM,KM,L), if(8*HM*KM*L) ! ???
      real*8 dout(1:HMout,1:KMOut,1:l0+L-1), scr(0:6*H*K*L*lscr)

      Subroutine CV2DInit(din, N, M, NM, dout, sz)
C Take the array din(1:NM, 1:M) and prepare it for convolution using FFT.
C din(*) is used as din(1:N, 1:M) in the caller. dout is allocated as
C dout(1:sz) by the caller (sz must be 4(N+1)(M+1) at least)
C upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
C NOTE: This works for NON-periodic functions, as all maters of zero-padding
C are taken care of.
      integer N, M, NM, sz
      real*8 din(1:NM, 1:M), dout(1:4*N*M)  ! 4*N*M need not be true

      Subroutine CV2D(d,N,M,NM,if,dout,i0,j0,NMout,scr,lscr)
C Take the data in d (declared d(1:NM, 1:M), used as d(1:N,1:M)) and
C convolute it with the function in if using FFT. The result will be 
C returned in dout(i0:i0+N-1,j0:j0+M-1). dout is declared by the caller
C as dout(1:NMout,1:M+j0-1) at least. if will not be changed, so it can
C be used several times. if has to be prepared with CV2DInit.
C scr is a scratch place of at least size 6*N*M, if it is
C to be used as signaled by lscr=1. If lscr=0, scr might be a dummy
      integer N,M,NM,i0,j0,NMout,lscr
      real*8 d(1:NM, 1:M), if(4*(N+2)*(M+2))
      real*8 dout(1:NMout,1:M+j0-1), scr(0:6*N*M*lscr)

      Subroutine  CV1DInit(din,N,M,NM,dout,sz)
C Take the array din(1:NM, 1:M) and prepare it for convolution using FFT.
C Convolution is only performed in the 1st dimension.
C din(*) is used as din(1:N, 1:M) in the caller. dout is allocated as
C dout(1:sz) by the caller (must be 2*N*M at least)
C upon return, sz contains the number of reals actually used in dout, or
C -1 if an error occured (sz too small)
C Note: This works for NON-periodic functions, as all maters of zero-padding
C are taken care of
      integer N, M, NM, sz
      real*8 din(1:NM, 1:M), dout(1:2*N*M)

      Subroutine CV1D(d,N,M,NM,if,Mif,dout,i0,j0,NMout,scr,lscr)
C Take the data in d (declared d(1:NM, 1:M), used as d(1:N,1:M)) and
C convolute it with the function in if using FFT, with convolution in the first
C indices only. The result will be returned in dout. if will not be changed,
C so it can be used several times. if has to be prepared with CV1DInit.
C Mif is the number of fft'ed if's in if, which is prepared as if(N2,Mif) by
C CV1DInit. The size of the result in dout is always (1:N,1:mif), with
C dout being declared as (1:NMOut,1:M+j0-1)
C scr is a scratch array of at least size 4*N*M if to be used, as signaled
C by lscr=1. If lscr=0, local scratch will be allocated.
      integer N,M,NM,Mif,i0,j0,NMout,lscr
      real*8 d(1:NM, 1:M), if(2*N*M), dout(1:NMOut,1:Mif+j0-1)
      real*8 scr(4*N*M*lscr)

      Subroutine CV1DGather(d,N,M,NM,if,fi1,s1,fi2,s2,scr,lscr)
C Take the data in d (declared d(1:NM, 1:M), used as d(1:N,1:M)) and
C convolute it with the function in if using FFT, with convolution in the first
C indices only. fi1 will contain the sum of the M convolutions done, which
C is a vector of N values, stored with stride s1. fi2 and s2 serve the same
C purpose, but with d mirrored in the second dimension. if will not be changed,
C so it can be used several times. if has to be prepared with CV1DInit, as
C if(N2,M).
C scr is a scratch array of at least size 2*N*M if to be used, as signaled
C by lscr=1. If lscr=0, local scratch will be allocated.
C This highly specialized operation is required to efficiently
C compute the potential contribution of the volume charges to the surfaces.
      integer N,M,NM,s1,s2,lscr
      real*8 d(1:NM, 1:M), if(2*N*M), fi1(N*s1), fi2(N*s2)
      real*8 scr(2*N*M*lscr)

Graphical explanation of terms

The graphic to the right explains the compression in the initialization operation for the example of 2D-Data. The first index is plotted to the right, the second index to the top. This is what CV2DDInit(din, N, M, NM, dout, sz) and CV2DInit (din, N, M, NM, dout, sz) do, with din the left and dout the right array. CV2DInit actually returns the FFT of the right array, with a few additional elements - the return size is (N+1)*(M+1).

All interaction functions are supposed to be symmetrical, and only the necessary data is stored. This is explained in the picture to the left: The din-array in the previous figure is the shaded area only. The element number one describes the self interaction, whereas those with number two describe nearest neighbour interaction. As shown in the next figure, the numbers in these cells need not be the same.








Here is shown what happens to a single one in an array which is otherwise zero (empty squares), if convolved with a specified filter (lower left matrix). The result of the convolution operation is given to the right. This corresponds to a call to Subroutine CV2DD(d, N, M, NM, if, dout) with d the upper left and if the lower left matrix. Of course, if has to be prepared with CV2DDInit(din, N, M, NM, dout, sz). The right matrix, the result, is returned by CV2DD in dout. Please note that the FFT-Version is slightly more sopisticated, as it is capable of writing the result into a rectangular area of a specified matrix. The result, however, is the same to numerical precision. See the example program for details.





Author: K. Ramstöck ( klaus@ww.uni-erlangen.de). The code in parallel.f stems from Peter Lockey (P.Lockey@dl.ac.uk). If you are looking at a static copy of this file, you can always view a more recent version as http://magnet.ww.uni-erlangen.de/~klaus/conv.html