SRC® Performance Advantage
Reconfigurable computing systems derive their performance advantage over
traditional systems through parallelism, computational efficiency, and access to
application-specific functional, prefetch, and data access units. This performance
delivers a significant competitive advantage on the increasingly level
playing field of clustered systems.
Application Performance: SRC Application Results
The following table illustrates this point by showing the performance advantage
of a single SRC Series H MAP® processor over highly tuned code running on a
standard microprocessor. The gains come from the ability to implement a custom mix of functions for each subroutine.
| APPLICATION | MAP PERFORMANCE | SPEEDUP: MAP PROCESSOR VS. STANDARD µP |
|---|---|---|
| Signal Processing (Spectrum Analyzer) | 2880 nsec per 4096 samples | 456x |
| Image Processing (Normalized Cross Correlation) | 0.105 sec/frame | 300x** |
| Finance (Black-Scholes) | 300M/sec | 90x* |
| Backprojection (CAT, MRI, PET scanning, Radar processing) | 3.77 sec (5040 pulses, image size - 1002001 pixels) | 38x* |
| Reverse Time Migration | 13.3 nsec per output migration point | 25x* (see note below) |
| Molecular Dynamics (LAMMPS) | 0.11 sec/step | 10x** |
* Speedup relative to a 3.0 GHz Xeon
** Speedup relative to a 2.67 GHz Nehalem
All others relative to 2.8 GHz Xeon
Note: The performance gain for the Reverse Time Migration application on the Series H MAP processor depended on the size of the 3D seismic volumes. For smaller volumes (300x300x300), the Series H MAP processor ran at least 25 times faster than a 3.0 GHz Intel Xeon processor. On larger volumes, the speedup exceeded 25 times.
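Each speedup in the table is a ratio of baseline time to MAP time, so the baseline figures it implies can be recovered by simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the baseline values it prints are derived from the table rather than measurements reported here. For Reverse Time Migration, the 25x figure is the lower bound quoted for smaller volumes.

```python
# Illustrative arithmetic only: the baseline figures are implied by the table
# (MAP figure combined with the reported speedup), not measured values.

# (application and unit of work, MAP time in seconds, reported speedup)
MAP_RESULTS = [
    ("Spectrum Analyzer, per 4096 samples", 2880e-9, 456),
    ("Normalized Cross Correlation, per frame", 0.105, 300),
    ("Backprojection, per image", 3.77, 38),
    ("Reverse Time Migration, per output point", 13.3e-9, 25),
    ("Molecular Dynamics (LAMMPS), per step", 0.11, 10),
]


def implied_baseline_time(map_time_s: float, speedup: float) -> float:
    """Baseline time implied by speedup = baseline_time / map_time."""
    return map_time_s * speedup


def implied_baseline_rate(map_rate: float, speedup: float) -> float:
    """Baseline throughput implied by speedup = map_rate / baseline_rate."""
    return map_rate / speedup


if __name__ == "__main__":
    for name, map_time_s, speedup in MAP_RESULTS:
        baseline_s = implied_baseline_time(map_time_s, speedup)
        print(f"{name}: MAP {map_time_s:.3g} s, implied baseline {baseline_s:.3g} s")

    # Black-Scholes is reported as a rate (300M/sec), so divide instead of multiply.
    rate = implied_baseline_rate(300e6, 90)
    print(f"Black-Scholes: implied baseline rate {rate:.3g} per sec")
```

Times and rates are handled separately because a speedup multiplies a baseline rate but divides a baseline time.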
Application Performance: Customer Application Results
The table below summarizes results reported by customers using SRC Series E MAP
processors.
| APPLICATION | CUSTOMER | SPEEDUP: MAP PROCESSOR VS. STANDARD µP |
|---|---|---|
| Cryptography (Bent Function Generation) | Naval Postgraduate School (NPS) | 6500x |
| Target Recognition (Probeset Matching) | Colorado State University | 535x |
| Gravitational Force (Astronomy) | NCSA Dept. of Astronomy | 135x |
| Cosmology | NCSA Dept. of Astronomy | 75x |
| Deconvolution | NCSA National Optical Astronomy Observatory (NOAO) | 50x |
All speedups relative to 2.8 GHz Xeon
The speedup numbers above include all overhead, including data movement. All data are for a single MAP processor or a single microprocessor core and assume 100% scalability across microprocessor cores. Comparisons of the MAP processor with actual microprocessor-based systems would yield even higher speedups, because multicore microprocessor systems scale at less than 100%.
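One way to read the scalability remark is as a simple model: if a single MAP processor is S times faster than one core, it is S / (n × e) times faster than an n-core system whose cores scale with parallel efficiency e. The sketch below works through that model. The function name, the core count of 8, and the efficiency of 0.8 are all hypothetical values chosen for illustration, not figures from SRC.

```python
def speedup_vs_multicore(single_core_speedup: float, cores: int,
                         efficiency: float = 1.0) -> float:
    """MAP speedup over an n-core system under a simple scaling model.

    efficiency is the assumed parallel efficiency of the multicore system:
    1.0 means perfect (100%) scaling, lower values mean each added core
    contributes less than a full core's worth of throughput.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return single_core_speedup / (cores * efficiency)


if __name__ == "__main__":
    s = 456  # single-core speedup from the Spectrum Analyzer row
    ideal = speedup_vs_multicore(s, cores=8, efficiency=1.0)
    realistic = speedup_vs_multicore(s, cores=8, efficiency=0.8)  # assumed value
    print(f"vs 8 cores, 100% scaling: {ideal:.2f}x")
    print(f"vs 8 cores, 80% scaling:  {realistic:.2f}x")
```

Under this model any efficiency below 1.0 raises the MAP processor's speedup relative to the ideal-scaling case, which is the direction the paragraph above describes.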