1.0 Accelerators and Heterogeneous Systems
2.0 FPGAs
2.1 Nature of the FPGA for Acceleration
2.2 Programming Environment
2.3 FPGAs Are Best for ...
2.3.1 Very complex sequences of logical manipulation and arithmetic that could be accomplished in parallel
2.3.2 Certain classes of applications that fit well with FPGAs
2.3.3 Applications written in C with source code available for manipulation
2.3.4 Manipulating bits and other unusual transformations not found directly in general processor instruction sets
3.0 Graphics Cards
3.1 Nature of the Graphics Card for Acceleration
3.2 Programming Environment
3.3 Graphics Cards Are Best for ...
3.3.1 Large blocks of independent data items to which identical processing should be applied
3.3.2 Software applications that detect and use graphics cards to accelerate their operation
3.3.3 Certain classes of applications that have performed well with graphics card acceleration
3.3.4 Broad language support needed
4.0 Cell Broadband Engine (Cell) Processors
4.1 Nature of the Cell Chip for Acceleration
4.2 Programming Environment
4.3 Cell Chips Are Best for ...
4.3.1 Shorter vector lengths and rapid shifting between SIMD and nonvector general processing
4.3.2 Predominately single-precision, floating-point calculations
4.3.3 Stream processing approaches
5.0 Cray Threadstorm Processor
5.1 Nature of the Threadstorm for Acceleration
5.2 Programming Environment
5.3 Threadstorm Is Best for ...
5.3.1 Applications that run poorly in cache on other processor types
5.3.2 Applications that don't exhibit much locality of reference in data access patterns
6.0 Future Accelerators