For my final project in Cornell’s Complex ASIC Design course I built a neural network accelerator that used the Logarithmic Number System (LNS). There has been some interest in using the LNS for neural networks because neural computation is multiplication-heavy, and multiplications in the LNS can be implemented with a simple adder. I augmented a PyMTL implementation of a single-threaded PARCv2 processor with instructions to convert between LNS and fixed point, add LNS numbers, and multiply LNS numbers. I also built a matrix multiplication accelerator that could use either fixed point or the LNS.

After integrating the accelerator with the processor and evaluating the various possible designs, it became clear that the delay and energy benefits of using the accelerator at all (generic CPU vs. custom datapath and control) far outweighed any savings in the ALU (LNS vs. fixed point). This follows from two properties of matrix multiplication: 1) it has a regular dataflow that custom hardware can exploit well, and 2) the benefits of cheap LNS multiplications wash out once the expensive but necessary LNS additions are included.
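To make that tradeoff concrete, here is a minimal Python sketch of LNS arithmetic. It is not the project's hardware (which used a fixed-point log encoding plus a sign bit); floats and these function names are mine, used purely for illustration:

```python
import math

def to_lns(x):
    # Encode a positive value as its base-2 logarithm.
    return math.log2(x)

def from_lns(l):
    # Decode back to the linear domain.
    return 2.0 ** l

def lns_mul(a, b):
    # Multiplication in LNS is a single addition of exponents:
    # log2(x * y) = log2(x) + log2(y). This is the cheap operation.
    return a + b

def lns_add(a, b):
    # Addition is the expensive operation. With hi = max(a, b):
    #   log2(x + y) = hi + log2(1 + 2^(lo - hi))
    # The correction term log2(1 + 2^(lo - hi)) is nonlinear and is
    # typically approximated with a lookup table in hardware.
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))

# Example: 3 * 5 and 3 + 5 computed in the log domain.
a, b = to_lns(3.0), to_lns(5.0)
print(from_lns(lns_mul(a, b)))  # ~15.0
print(from_lns(lns_add(a, b)))  # ~8.0
```

Since every dot product interleaves one addition with each multiplication, the lookup-table cost of `lns_add` lands on the critical path just as often as the cheap `lns_mul` does, which is why the multiplier savings washed out.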
The code for this project is currently private because I can’t release details about my baseline processor. Depending on your needs I may still be able to help, so contact me if you’re interested.