FFT Processor Chip Info Page

This page contains a comprehensive table listing key attributes of Fast Fourier Transform (FFT) chips, such as speed, power, and word size. It also links to many kinds of FFT processors: special-purpose chips, board-level products, soft/synthesizable processors, and programmable DSP chips. This page is maintained by Bevan Baas.


FFT Chip Comparisons

This table lists key features of commercial and academic FFT processors. The only requirement is that they are able to compute a 1024-point complex transform.


| Processor | Year | CMOS Tech (µm) | Datapath Width (bits) | Dataword Format (fixed pt., block float, float pt.) | Supply Voltage (V) | Execution Time (µsec per 1024-pt transform) | Power (mW) | Clock (MHz) | Number of Chips (M = DataMem, C = CoeffMem) | Area, 0.5 µm eff. (mm^2) | I/O pads (pins) | Energy Efficiency (FFTs per Energy, see below) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DASP/PAC, Honeywell [1] | 1988 | 1.2 µm | 16 | block float | - | 102 µsec | 2000 + 2000 + 5*250 | - | 1+1+4M+C | - | 269/180/- | 1.7 |
| PDSP16510A, Zarlink (Plessey, Mitel) | 1989? | 1.4 µm | 16 | block float | 5.0 | 98 µsec | 3000 | 40 | 1 | 22 | 84 | 3.6 |
| PDSP16515A, Zarlink (Plessey, Mitel) | - | - | 18 | block float | 5.0 | 87 µsec | - | 45 | 1 | - | 84 | - |
| L64280, LSI [5] | 1990 | 1.5 µm | 20 | float | 5.0 | 26 µsec | 20,000 | 40 | 10+10 | 233 | - | 2.9 |
| Dassault Electronique [6] | 1990 | 1.0 µm | 12 | block float | 5.0 | 6.4 µsec | 24,000? | 40 | 6 | 255 | 299 | 3.4 |
| Y. Zhu, Univ. of Calgary [7] | 1993 | 1.2 µm | 16 | block float | 5.0 | 155 µsec | - | 33 | 1+1+2M+C | 15+ | 132 | - |
| TM-66, Texas Mem Sys | - | 0.8 µm | 32 | float | 5.0 | 65 µsec | 7000+ | 50 | 1+1+M+ | - | 299/? | <3.4 |
| BDSP9124/9320, Butterfly DSP | - | 0.8 µm | 24 | block float | - | 54 µsec | - | 60 | 1+1+2M+C+ | - | 262/68 | - |
| Cobra, Colorado State [10] | 1994 | 0.75 µm | 23 | - | 5.0 | 9.5 µsec | 7700 | 40 | 16+ | 1104+ | 391 | <12.4 |
| CNET, E. Bidet [11] | 1994 | 0.5 µm | 10 | - | 3.3 | 51 µsec | 300 | 20 | 1 | 100 | - | 13.6 |
| Spiffee 1, Stanford | 1995 | 0.7 µm, Lpoly = 0.6 µm | 20 | fixed | 3.3 | 30 µsec | 845 | 173 | 1 | 25 | 70 | 27.6 |
| Spiffee 1, Stanford | | | | | 1.1 ** | 330 µsec | 9.5 | 16 | | | | 223 |
| Spiffee Low Vt *, Stanford | | 0.8 µm, Lpoly = 0.26 µm | | | 0.4 | 93 µsec | <9.7 | 57 | | | | >887 |
| Spiffee ULP *, Stanford | | 0.5 µm | | | 0.4 | 61 µsec | 8 | 85 | | | | 1025 |
| DaSP/PaC/RaS, Array Microsystems | 1996? | - | 16 | block float | 5.0 | 131 µsec | 1750 + 2000 + 3*2000 | 40 | 1+1+3 | - | 144/144/144 | - |
| SNC960A, Sicom | 1996 | 0.6 µm | 16 | - | 5.0 | 20 µsec | 2000-3000 | 65 | 1 | - | - | 9.0 |
| DSP-24, DSP Architectures [13] | 1997 | 0.5 µm | 24 | block float | 3.3 | 21 µsec | 3500 | 100 | 1 | 217 | 308 | 8.7 |
| M. Wosnitza, ETH, Zurich [14] | 1998 | 0.5 µm | 32 | block float | 3.3 | 80 µsec | 6000 | 66 | 1 | 167 | 180 | 2.4 |
| Radix RDA108 | - | - | <=19 | - | 3.3 | 12.2 µsec | - | 84 | 2 | - | 313 | - |
| DoubleBW | 2000 | 0.35 µm | 24 | float | 3.3 | 10 µsec | 8000 | 128 | 1 | 429 | 530 | 5.6 |
| TM-44, Texas Mem Sys | 2001 | 0.13 µm | 32 | float | - | 8.04 µsec (16.07/2) | 8000 | 100 | 1 | - | 800+ | 3.9 |
| S. M. Currie, Mayo FFT [15] | 2002 | 0.25 µm | 16 | fixed | 2.5 | < 11 µsec | - | 100 | 1? | 400 | 196 | - |
| PowerFFT, Eonic BV | 2002 | 0.18 µm | 32 | float | 1.8 | 10 µsec | 1000 | 128 | - | - | 600 | 34.6 |
| J.-C. Kuo, NTU [16] | 2003 | 0.35 µm | 16 | fixed | 3.3 | 40 µsec | 810 | 80 | 1 | 31 | - | 8.1 |
| Programmable DSP Processors | | | | | | | | | | | | |
| C40, Texas Instruments | - | 0.7 µm | - | float | 5.0 | 1298 µsec | 4500 | 60 | - | - | - | - |
| SHARC ADSP-21061, Analog Devices | - | - | - | float | 5.0 | 460 µsec | 4500 | 40 | - | - | - | - |
| DSP32C, Lucent | - | - | 32 | float | - | 2110 µsec | - | 80 | - | - | - | - |
| DSP16000, Lucent | - | - | - | - | 2.7 | - | - | 100 | - | - | - | - |
| NM6403, Module | 1998 | 0.5 µm | 32 | fixed | 3.3 | 439 µsec | 1300 | 50 | 1 | - | 256 | 1.7 |
| HiPAR-DSP 4, Universitat Hannover | 1999 | 0.5 µm | 16 | fixed | - | 222 µsec | 5000 | 66 | 1 | 180 | - | 0.34 |
| Imagine, Stanford University | 2002 | 0.15 µm | - | - | 2.0 | 20.6 µsec | ~9000 | 180 | 1? | 2844 | 792 | - |
| Synthesizable Processors | | | | | | | | | | | | |
| Inventra, Mentor | 1997 | - | 20 | - | - | 90 µsec | - | - | - | 27K gates | - | - |
| FPGA implementations | | | | | | | | | | | | |
| FFT core, Virtex 2 Pro 50, DSPLogic | 2005 | - | 16? | fixed | - | 5.5 µsec (12.8 µsec) *** | - | 199 | 1 | - | - | - |
| FFT core, Virtex-4, 4DSP | 2004 | 0.09 µm | 32 (24+8), IEEE-754 | float | - | 5.12 µsec, 2.56 µsec **** | - | 200 | 1 | 19,836 slices | - | - |
| Other Processors | | | | | | | | | | | | |
| Cray 2 (1-cpu) | - | - | - | float | - | 1000 µsec | - | 244 | - | - | - | - |
| Cray Y-MP (1-cpu) | - | - | - | float | - | 600 µsec | - | 159 | - | - | - | - |

FFTs per Energy   =   Tech   *   ( 2/3 (DPath/20) + 1/3 (DPath/20)^2 )   /   Power   /   Exec Time   /   10^-6
The function shown above, FFTs per Energy, gives an adjusted number of 1024-point complex FFTs that can be calculated for a fixed amount of energy, and attempts to factor out the technology (Tech, in µm) and the datapath word width (DPath, in bits). Power is in mW and Exec Time is in µsec, as tabulated. The function makes use of the observation that about 1/3 of the energy consumption of the 20-bit Spiffee processor scales as DPath^2 (e.g., multipliers) and about 2/3 of the circuits scale in complexity linearly with DPath. The constant 10^-6 is included to put the result into a more convenient range.
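The formula can be checked against the table with a short script. This is a minimal sketch, not part of the original page: the function name `ffts_per_energy` and its parameter names are hypothetical, and it assumes the units used in the table (Tech in µm, DPath in bits, Power in mW, Exec Time in µsec). The three input rows are copied from the table.

```python
def ffts_per_energy(tech_um, dpath_bits, power_mw, exec_time_us):
    """Adjusted 1024-point FFTs per unit energy, following the page's formula.

    Hypothetical helper; units are assumed to match the table:
    tech in µm, datapath in bits, power in mW, execution time in µsec.
    """
    width = dpath_bits / 20.0  # datapath width relative to the 20-bit Spiffee
    # 2/3 of the energy is taken to scale linearly with width, 1/3 quadratically.
    scale = (2.0 / 3.0) * width + (1.0 / 3.0) * width ** 2
    return tech_um * scale / power_mw / exec_time_us / 1e-6


# (tech µm, datapath bits, power mW, exec time µsec), copied from the table.
rows = {
    "Spiffee 1": (0.7, 20, 845, 30),
    "CNET": (0.5, 10, 300, 51),
    "PDSP16510A": (1.4, 16, 3000, 98),
}

for name, args in rows.items():
    print(f"{name}: {ffts_per_energy(*args):.1f}")
```

For these three rows the script reproduces the tabulated Energy Efficiency values (27.6, 13.6, and 3.6), which suggests the unit assumptions above match how the table was computed.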
* Low Vt Spiffee processors
The full Spiffee Low Vt version has not yet been fabricated, although portions of it have been. The numbers shown are extrapolated from measurements, and actual results should be better than shown because the power number includes additional circuits that the baseline Spiffee 1 chip does not have. The Spiffee ULP numbers are from simulations.

** For this case, the n-well bias was set to -0.5 V. The current drawn was approximately 10 µA.

*** Performance limited to 12.8 µs by CPU/FPGA I/O.
**** Two cores in one FPGA double the throughput.
[1] S. Magar, et al., ICASSP '88. Assumes memory power = 250 mW per chip.
[5] P. Ruetz, et al., ICPR '90.
[6] Private communication with designer.
[7] Private communication with designer.
[10] G. Sunada, et al., ICCD '94.
[11] E. Bidet, C. Joanblanq, and P. Senn, CICC '94.
[13] Simulated results. Private communication with designer.
[14] M. Wosnitza, "A High Precision 1024-point FFT Processor for 2D Convolution", ISSCC '98.
[15] S. M. Currie, B. K. Gilbert, B. A. Randall, P. R. Schumacher, E. E. Swartzlander, "Implementation of a Single Chip, Pipelined, Complex, One-Dimensional Fast Fourier Transform in 0.25 um Bulk CMOS", ASAP, 2002.
[16] J.-C. Kuo, C.-H. Wen, and A.-Y. Wu, "Implementation of a Programmable 64~2048-point FFT/IFFT Processor for OFDM-Based Communication Systems," ISCAS, 2003.

A few more FFT processors

FFT board-level products

Programmable DSP processors

Soft FFT cores

Misc. FFT stuff


Last update: October 2, 2005


Keywords: Low-power, high-performance, DSP, digital signal processing, Fast Fourier Transform, Discrete Fourier Transform, DFT