------------------------------------------------------------------ From: jott@aol.com (Jott) Newsgroups: comp.dsp Subject: Re: needed optimized fft Date: 16 Nov 1995 14:35:54 -0500 Organization: America Online, Inc. (1-800-827-6364) Message-ID: <48g3qq$i82@newsbf02.news.aol.com> References: <48dqoo$req@daffy.anetsrvcs.uwrf.edu> Have you considered taking advantage of the complex FFT by combining two real inputs into one FFT computation. The book I am referencing is "Handbook of Real-Time FFTs" by Winthrop Smith and Joanne Smith (IEEE PRESS) though I'm sure it is in many other books. Here is a short description of the method. Step 1. start with a(n) and b(n) both real inputs and of length N. n=0,1,2,...,N-1 c(n) = a(n) + j*b(n) step 2. C(k) = FFT{c(n)} = R(k) + j*I(k) k=0,1,2,...,N-1 NOTE that C(k) = A(k) + j*B(k) but since A(k) and B(k) are complex they need to be separated. Step 3. N is assumed even (if N is odd step 3 is different) for k = 1,2,....(N-2)/2 RP(k) = RP(N-k) = 0.5*[R(k) + R(N-k)] RM(k) = -RM(N-k) = 0.5*[R(k) - R(N-k)] IP(k) = IP(N-k) = 0.5*[I(k) + I(N-k)] IM(k) = -IM(N-k) = 0.5*[I(k) - I(N-k)] RP(0) = R(0) IP(0) = I(0) RM(0) = IM(0) = RM(N/2) = IM(N/2) = 0; RP(N/2) = R(N/2) IP(N/2) = I(N/2) Step 4 - final output for k=0,1,2,...,N-1 A(k) = RP(k) + j*IM(k) B(k) = IP(k) + j*RM(k) This scheme may be more efficient than two real FFTs. The above scheme could also be used if the orginal sequence x(m) of length M is split into two sequences of length N= M/2 ( M has to be even and assume N are even, if N is odd step 3 is slightly different ). The orginal sequence is broken up such that the even samples are placed in the real array and the odd samples placed in the imaginary array. a(n) = x(2n) , b(n) = x(2n+1) , n = 0,1,2,...,N The follow steps 1 through 3 above and use the following for step 4 step 4 for k= 0,1,2,...,N-1 XR(k) = XR(M-k) = RP(k) + cos(k*pi/N)*IP(k) - sin(k*pi/N)*RM(k) XI(k) = -XI(M-k) = IM(k) - cos(k*pi/N)*RM(k) - sin(k*pi/N)*IP(k) XR(0) = RP(0) + IP(0) XI(0) = IM(0) - RM(0) XR(N) = R(0) - I(0) XI(N) = 0 X(k) = X*(M-k) = XR(k) + j*XI(k) These may or may not be more efficient than using FFTs algorithms designed for real inputs, but they do allow more efficient use of what you do have. Maybe a discussion on the benefits of the above approaches versus real input FFT algorithms would be helpful. I would like to hear from people who have used complex FFTs for real inputs. I have only investigated it enough to prove it can be done. I have not investigated the computational load of using complex FFTs to process real inputs versus real input FFT algorithms. HTH, Joseph Ott Senior System Engineer VTG ------------------------------------------------------------------ I read the "note on real fft" on your hpmepage and may I suggest that the method the guy is describing is VERY OLD and very well known. It is complicated and kludgy and will have more overhead than the simple straight- forward real fft. The ref to read is the one by Sorensen et al I cite in my real fft C code. I think you should suggest that people use a real fft code instead. I haven't needed the ifft yet but if I do I will translate that to C too. Cheers, Bill Simpson ------------------------------------------------------------------ I strongly disagree with this. I have used this method on shipped systems based on 56K. The problem with the real FFT is that you can't make it go anywhere near twice as fast as a complex FFT, and to speed it up significantly requires customizing a lot of passes - say the first two and the last two. This means you need a lot more DSP code, which can be a problem in many applications. I would never describe the technique as 'kludgey' because is it mathematically perfectly sound. Sure, it's 'well known' - it isn't in a lot of textbooks, but it is simple enough to figure out. Remember, you're getting two real FFT's for the cost of one CFFT, plus a bit of simple pre- and post- processing. In our application, the pre- and post-processing was rolled in with the processing before and after the FFT, and was almost free. There is no way this will have 'more overhead' than the "straightforward" real fft -- unless your FFT is VERY small, like 64 points or less. There is something to watch out for -- since you are rolling two FFT's into one, there will be noise crosstalk between the channels due to rounding errors (assuming an integer DSP). In our application, the two channels were left and right from stereo, so crosstalk wasn't too much of a problem. In a double precision floating-point enviroment, you would should have >100 db isolation easily. The type of FFT - DIT or DIF - affects the relationship between the signal spectrum and the rounding noise spectrum in interesting ways. One of the things they don't cover in textbooks. --------------- reply my remark: > btw.: i found that considerations > for > -DSP/non DSP > -integer FFT/ float FFT > -small sizes/long sizes > -cache mem machines / non ... > etc etc > are often completely different. Yes, it does - there's also real-world/academic to deal with - often academic works count the multiplies and divides and ignore the addressing; in any decent DSP the hard part is addressing and sequencing to keep the ALU busy. ------------------------------------------------------------------