We present the cached-FFT algorithm, which explicitly caches data from main memory in a much smaller and faster memory. This approach increases performance and, by reducing communication energy, improves energy efficiency.
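To make the caching idea concrete, the following is a minimal Python sketch, not the processor's actual dataflow. It assumes a radix-2 decimation-in-time FFT whose stages are split into consecutive blocks (called "epochs" here purely as an illustrative label). Within each block, every self-contained group of points is copied into a small list standing in for the cache, processed for several stages, and written back. The function names, the `cache_size` parameter, and the access counter are all hypothetical.

```python
import cmath


def bit_reverse_permute(x):
    """Return a copy of x reordered by bit-reversed index (len(x) a power of two)."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r, v = 0, i
        for _ in range(bits):
            r = (r << 1) | (v & 1)
            v >>= 1
        out[r] = x[i]
    return out


def cached_fft(x, cache_size):
    """Radix-2 DIT FFT that touches main memory only once per block of stages.

    Returns (spectrum, main_memory_accesses). Both len(x) and cache_size are
    assumed to be powers of two, with cache_size >= 2.
    """
    n = len(x)
    stages = n.bit_length() - 1
    stages_per_epoch = cache_size.bit_length() - 1
    mem = bit_reverse_permute(list(x))  # stands in for main memory
    accesses = 0
    lo = 0
    while lo < stages:
        hi = min(lo + stages_per_epoch, stages)
        group = 1 << (hi - lo)
        for base in range(n):
            if (base >> lo) & (group - 1):
                continue  # not the first index of a group
            # A group holds the indices that differ only in bits lo..hi-1,
            # so every butterfly of stages lo..hi-1 stays inside it.
            idx = [base | (q << lo) for q in range(group)]
            cache = [mem[g] for g in idx]  # load the group into the cache
            for s in range(lo, hi):
                bit = 1 << (s - lo)
                for q in range(group):
                    if q & bit:
                        continue
                    k = idx[q] & ((1 << s) - 1)  # twiddle exponent
                    w = cmath.exp(-2j * cmath.pi * k / (1 << (s + 1)))
                    u, t = cache[q], w * cache[q | bit]
                    cache[q], cache[q | bit] = u + t, u - t
            for q, g in enumerate(idx):  # write the group back
                mem[g] = cache[q]
            accesses += 2 * group
        lo = hi
    return mem, accesses


spectrum, accesses = cached_fft([complex(i % 5, 0) for i in range(64)], cache_size=8)
```

In this sketch, a 64-point transform with an 8-word cache reads and writes each point in main memory once per block of three stages rather than once per stage, which is the kind of reduction in main-memory traffic the approach relies on.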
Spiffee is a 1024-point, single-chip, 460,000-transistor, 40-bit complex FFT processor designed to operate at very low supply voltages. It employs the cached-FFT algorithm, which enables the design of a well-balanced, nine-stage pipeline. The processor calculates a complex radix-2 butterfly every cycle and contains unique hierarchical-bitline SRAM and ROM memories that operate well both in standard environments and in low-supply-voltage, low-threshold-voltage environments. The processor's substrate and well nodes are connected to chip pads, so they can be biased to adjust transistor thresholds.
Spiffee has been fabricated in a standard 0.7 µm (Lpoly = 0.6 µm) CMOS process and was fully functional on its first fabrication. At a supply voltage of 1.1 V, Spiffee calculates a 1024-point complex FFT in 330 µs while dissipating 9.5 mW, resulting in an adjusted energy efficiency more than 16 times greater than that of the previously most efficient FFT processor. At a supply voltage of 3.3 V, it operates at 173 MHz, a clock rate 2.6 times faster than that of the previously fastest FFT processor.
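The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text, and all variable names are illustrative. It assumes the standard radix-2 butterfly count of (N/2)·log2(N) and ignores pipeline fill and memory-transfer overhead, so the 3.3 V time it derives is only a rough lower bound, not a measured result. The technology scaling behind the "adjusted" efficiency comparison is not reproduced here.

```python
import math

N = 1024                   # transform length (points)
POWER_1V1_W = 9.5e-3       # power dissipation at 1.1 V
TIME_1V1_S = 330e-6        # 1024-point FFT time at 1.1 V
CLOCK_3V3_HZ = 173e6       # clock rate at 3.3 V

# Energy consumed per transform at 1.1 V: power times execution time.
energy_per_fft_joules = POWER_1V1_W * TIME_1V1_S

# A radix-2 FFT needs (N/2) * log2(N) butterflies.
butterflies = (N // 2) * int(math.log2(N))

# One butterfly per cycle gives a lower bound on the 3.3 V transform time
# (assumption: overheads such as pipeline fill are ignored).
min_time_3v3_s = butterflies / CLOCK_3V3_HZ

print(f"energy per FFT at 1.1 V: {energy_per_fft_joules * 1e6:.3f} uJ")
print(f"butterflies per FFT:     {butterflies}")
print(f"lower-bound time at 3.3 V: {min_time_3v3_s * 1e6:.1f} us")
```

Under these assumptions, each 1024-point transform at 1.1 V consumes roughly 3.1 µJ and requires 5120 butterflies.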
An Approach to Low-Power, High-Performance, Fast Fourier Transform Processor Design (1.3 MB, 196 pages)