RUN-TIME CONFIGURABLE APPROXIMATE MULTIPLIER DESIGN
Date
2023-08-14
Publisher
Saudi Digital Library
Abstract
The complexity of arithmetic continues to be an issue in the design of high-performance
and energy-efficient hardware. The problem is further exacerbated in systems powered at variable power levels, which can limit their computation capabilities. Multipliers
constitute a major component of these applications with complex logic design and
a large gate count compared to other arithmetic units. As such, there is significant
interest in designing new approaches to low-complexity multipliers.
Recently, approximate arithmetic, in particular approximate adders and multipliers, has shown notable advantages across a wide spectrum of naturally
imprecision-tolerant applications, such as image processing, pattern recognition, and
machine learning (ML). The concept of approximate arithmetic involves replacing
system components of normal degrees of complexity with less complex components,
which may provide reduced accuracy. Compared to the adder, the multiplier is the
more crucial component of these applications, with a more complex logic design and a larger gate
count.
This thesis investigates the feasibility and benefit of trading accuracy for energy
at run-time by using configurable approximate arithmetic hardware. In the first
approach, a configurable adaptive approximation method for multiplication is proposed. The extra overheads associated with the configuration circuits prove to
be negligible compared to the multiplier’s costs. Central to the proposed approach
is a significance-driven logic compression (SDLC) multiplier architecture that can
dynamically adjust the level of approximation depending on the run-time power/accuracy constraints. The architecture can be configured to operate in the exact mode
(no approximation) or in progressively higher approximation modes (i.e., 2-bit to 4-bit
SDLC). In the second approach, a novel ML hardware design method centred around
multiply–accumulate (MAC) units is presented. Core to the configurable MAC design is a configurable multiplier. In the third approach, a configurable modified
activation function is proposed to minimize the prediction error of the configurable
MAC design.
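The idea of a multiplier with selectable approximation modes can be illustrated with a small behavioural model. This is a minimal sketch under stated assumptions, not the SDLC circuit described in the thesis: it assumes unsigned operands, groups consecutive partial-product rows into clusters of 1, 2 or 4, and combines the rows inside a cluster with a bitwise OR instead of an addition. The mode names, the function names and the clustering rule are all hypothetical.

```python
# Hypothetical mode names mapped to the number of partial-product rows per cluster.
CLUSTER_SIZE = {"exact": 1, "2-bit": 2, "4-bit": 4}


def approx_multiply(a, b, mode="exact", width=8):
    """Multiply two unsigned `width`-bit integers under the selected mode.

    Assumption: rows inside a cluster are merged with OR (cheaper than an
    adder); this is an illustrative stand-in, not the thesis's exact logic.
    """
    cluster = CLUSTER_SIZE[mode]
    # One partial-product row per bit of b, already shifted to its weight.
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    total = 0
    for start in range(0, width, cluster):
        merged = 0
        for row in rows[start:start + cluster]:
            merged |= row  # OR replaces addition inside a cluster
        total += merged    # clusters themselves are accumulated exactly
    return total


def mean_relative_error(mode, width=8):
    """Average relative error over all non-zero operand pairs."""
    errors = []
    for a in range(1, 1 << width):
        for b in range(1, 1 << width):
            exact = a * b
            errors.append((exact - approx_multiply(a, b, mode, width)) / exact)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for mode in CLUSTER_SIZE:
        print(mode, approx_multiply(13, 11, mode))
```

For the operands 13 and 11 this model returns 143 in the exact mode, 135 in the 2-bit mode and 127 in the 4-bit mode. Because an OR never exceeds the corresponding sum, the result can only fall below the exact product, and larger clusters can only increase the error.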
To evaluate and validate the trade-offs, the three approaches (configurable multiplier, MAC unit and modified activation function) are designed in SystemVerilog
and synthesized using Synopsys Design Compiler, employing a UMC 90 nm digital complementary metal-oxide-semiconductor (CMOS) technology, as well as implemented on
field-programmable gate arrays (FPGAs), and then compared with other available methods. These improvements come at the expense of errors introduced into
the circuit, which are also investigated. The efficacy of the first approach (configurable multiplier) is evaluated with a real-life image processing application, which
consists of additions and multiplications using the three proposed multiplier configurations (Exact, 2- and 4-bit SDLC). The analysis considers the Gaussian blur filter
since it is widely used in image processing applications, typically to reduce image
noise and artifacts by acting as a low-pass filter.
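The Gaussian blur evaluation can be pictured as a filter whose multiplications are routed through a swappable multiplier. The sketch below is illustrative only: the 5-tap binomial kernel, the clamped borders, the separable two-pass structure and the OR-based `or_cluster_multiply` stand-in are assumptions, since the abstract does not state the kernel or reproduce the SDLC logic.

```python
from operator import mul

# Assumed kernel: 5-tap binomial approximation of a Gaussian (weights sum to 16).
KERNEL = (1, 4, 6, 4, 1)


def or_cluster_multiply(a, b, width=8, cluster=2):
    """Hypothetical approximate multiplier: OR rows inside each cluster."""
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    total = 0
    for start in range(0, width, cluster):
        merged = 0
        for row in rows[start:start + cluster]:
            merged |= row
        total += merged
    return total


def blur_1d(line, multiply=mul):
    """Filter one row or column of 8-bit pixels, clamping at the borders."""
    n = len(line)
    out = []
    for x in range(n):
        acc = 0
        for k, weight in enumerate(KERNEL):
            idx = min(max(x + k - 2, 0), n - 1)
            acc += multiply(weight, line[idx])
        out.append(acc >> 4)  # divide by the kernel sum (16)
    return out


def blur(image, multiply=mul):
    """Separable blur: horizontal pass, then vertical pass."""
    rows = [blur_1d(r, multiply) for r in image]
    cols = [blur_1d(list(c), multiply) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    image = [[(17 * x + 31 * y) % 256 for x in range(6)] for y in range(6)]
    print(blur(image))                       # exact multiplications
    print(blur(image, or_cluster_multiply))  # approximate multiplications
```

Passing the multiplier as an argument keeps the filter code identical across configurations, so any difference between the two outputs comes from the multiplier alone.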
Additionally, the second and third approaches are evaluated as the key processing
blocks in a multi-layer perceptron (MLP) network in order to validate the dynamic
tunability between accuracy and power consumption. As case studies, the MLP is
trained using well-known ML datasets. The configurable multiplier design (first approach) is well suited to energy-efficient multiplier
designs where quality requirements can be relaxed. The second and the third approaches (configurable MAC unit and activation function) can also be used within
power-adaptive neuron modules with a minimal loss in output quality compared
to those used in previous studies.
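A power-adaptive neuron built around a configurable MAC unit might be modelled along the following lines. This is a rough sketch under several assumptions: sign-magnitude handling of negative weights, an OR-based cluster multiplier standing in for the SDLC design, and a plain ReLU in place of the thesis's modified activation function, which the abstract does not specify. All names are hypothetical.

```python
def or_cluster_multiply(a, b, width=8, cluster=1):
    """Hypothetical unsigned multiplier; cluster=1 is exact."""
    total = 0
    for start in range(0, width, cluster):
        merged = 0
        for i in range(start, min(start + cluster, width)):
            if (b >> i) & 1:
                merged |= a << i  # OR rows that share a cluster
        total += merged
    return total


def signed_multiply(a, b, cluster):
    """Assumed sign-magnitude wrapper: approximate magnitudes, exact sign."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * or_cluster_multiply(abs(a), abs(b), cluster=cluster)


def mac(acc, a, b, cluster):
    """One multiply-accumulate step at the selected approximation level."""
    return acc + signed_multiply(a, b, cluster)


def neuron(inputs, weights, bias=0, cluster=1):
    """Weighted sum through the configurable MAC, followed by a plain ReLU."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc = mac(acc, x, w, cluster)
    return max(acc, 0)


if __name__ == "__main__":
    for cluster in (1, 2, 4):
        print(cluster, neuron([3, 5], [7, 2], cluster=cluster))
```

With inputs `[3, 5]` and weights `[7, 2]`, this model yields 31, 29 and 25 for cluster sizes 1, 2 and 4, which shows how the same neuron can be switched between accuracy levels at run-time by changing a single parameter.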
Keywords
Energy efficiency, approximate computing, low power design, multiplier design, MAC design