For low bit representations, you could just pre-calculate all the possible answers and cache those. A simple algorithm would be to decode numbers, do the arithmetic, and re-encode the answer. With the seven bit example used in the article, there are only 2^7 = 128 unique numbers. So you'd need 16KB of memory to store all possible multiplications. With 16 bit numbers that grows to 4GB. Beyond that, it probably gets too expensive rapidly. You might get some optimizations out of the symmetry for positive and negative.