Differences
This shows you the differences between two versions of the page.
wiki:overview [2018/04/27 17:24] – Andreas Moshovos | wiki:overview [2018/04/27 17:59] (current) – Andreas Moshovos | ||
---|---|---|---|
Line 42: | Line 42: | ||
** Dynamic Stripes **: It is well known that the precision that activations need can be tailored per network layer. Several hardware approaches exploit this precision variability to boost performance and energy efficiency. Here we show that assigning precisions at the layer level leaves much on the table. In practice, the required precision varies with the input and at a much finer granularity. An accelerator only needs to consider as many activations as it can process per cycle. In the work below we show how to adapt to precision variability at runtime at the processing granularity. We also show how to boost performance and energy efficiency for fully-connected layers. | ** Dynamic Stripes **: It is well known that the precision that activations need can be tailored per network layer. Several hardware approaches exploit this precision variability to boost performance and energy efficiency. Here we show that assigning precisions at the layer level leaves much on the table. In practice, the required precision varies with the input and at a much finer granularity. An accelerator only needs to consider as many activations as it can process per cycle. In the work below we show how to adapt to precision variability at runtime at the processing granularity. We also show how to boost performance and energy efficiency for fully-connected layers. | ||
- | * Alberto Delmas, Patrick Judd, Sayeh Sharify, Andreas Moshovos, [[https:// | + | * Alberto Delmas, Patrick Judd, Sayeh Sharify, Andreas Moshovos, [[https:// |
** LOOM: An Accelerator for Embedded Devices **: When compute needs are modest, the design described below exploits both activation and weight precisions. | ** LOOM: An Accelerator for Embedded Devices **: When compute needs are modest, the design described below exploits both activation and weight precisions. |
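
The Dynamic Stripes entry above contrasts per-layer precision with precision chosen at runtime for each group of activations processed together. The sketch below is a rough illustration of that contrast and not the design from the cited work. It assumes unsigned fixed-point activations, a hypothetical bit-serial engine whose cycle count per group equals the number of bits processed, and a precision rule that only trims leading zero bits. The function names, the group size, and the sample values are all made up for illustration.

```python
from typing import List


def required_bits(values: List[int]) -> int:
    """Bits needed for the largest value in a non-empty, unsigned group (at least 1)."""
    return max(1, max(values).bit_length())


def layer_cycles(activations: List[int], group_size: int) -> int:
    """Cycle count when one precision, fixed for the whole layer, is used for every group."""
    n_groups = -(-len(activations) // group_size)  # ceiling division
    return n_groups * required_bits(activations)


def dynamic_cycles(activations: List[int], group_size: int) -> int:
    """Cycle count when each group uses only the bits its own values need."""
    groups = [
        activations[i:i + group_size]
        for i in range(0, len(activations), group_size)
    ]
    return sum(required_bits(group) for group in groups)


# Hypothetical activation values; one large value forces 8 bits at layer level.
acts = [3, 0, 1, 2, 200, 17, 5, 9, 0, 0, 1, 0, 12, 7, 6, 15]
print(layer_cycles(acts, 4))    # 4 groups x 8 bits = 32
print(dynamic_cycles(acts, 4))  # 2 + 8 + 1 + 4 = 15
```

In this toy model a single large activation sets the precision for the whole layer, while per-group detection confines that cost to the one group that contains it.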