# Stochastic Computing's Comeback: A Viable Approach for Nanoscale Technologies?

When we first learn about digital microsystems, we often start by seeing how information can be represented using bits, 0 and 1. Representing any number larger than 1, or any fraction, requires several bits. Then, we usually proceed to learn about sign-and-magnitude and 2's complement notations. We may also learn about other fixed-point notations and floating-point systems and how complex they can get, especially in their circuit implementations.

However, even in the early days of computing, pioneers such as John von Neumann postulated the existence of other ways to represent signals [1]. One of these, stochastic computing, gained prominence in the 1960s and 1970s [2], but was soon forgotten. It is now making a comeback.

In its broadest sense, stochastic computation is a system of computation that relies on randomness as a necessary component. Stochastic control comes to mind as one of the earliest and most prominent uses of the concept. A more specific version of stochastic computation is a system where quantities being processed are represented through the statistics of a random process.

### An example

In the world of computation, one specific and often-used form of stochastic computing uses sequences of randomly generated bits. The time average of the bits is clearly a number in the interval [0, 1], and is the number being represented. For instance, in the example sequence below, there are 4 zeros and 16 ones in the sequence of length 20. As such, the underlying number represented by this sequence is 16/20 = 0.80.

0 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1

Such a sequence could be generated, for example, by flipping a biased coin with an 80% likelihood of generating a 1, 20 consecutive times.
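To make this concrete, here is a short Python sketch that plays the role of the biased coin (the function names are mine, purely for illustration):

```python
import random

def stochastic_sequence(p, length, seed=0):
    """Software stand-in for a biased coin: each bit is 1 with probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(length)]

def value(bits):
    """The number represented by a stochastic sequence is its fraction of 1s."""
    return sum(bits) / len(bits)

seq = stochastic_sequence(0.80, 10_000)
print(value(seq))  # close to 0.80
```

With a long enough sequence, the fraction of 1s settles near the represented value.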

Now comes the interesting part.

Take two such independently generated sequences, and feed each one into an input of a 2-input AND gate. Below, I show an example where one input corresponds to 0.80 and the other to 0.50. What is the result?

Input A    0 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1
Input B    1 0 0 1 1 1 1 0 0 1 0 0 1 0 0 1 1 1 0 0
A AND B    0 0 0 1 0 1 1 0 0 1 0 0 1 0 0 1 1 1 0 0

The resulting sequence has 8 ones out of 20, corresponding to a result of 0.40, exactly the product of the two inputs! Indeed, in a statistical sense, the AND gate acts as a multiplier. Imagine that: replacing all the logic circuitry required to build a conventional multiplier with a single logic gate! Actually, it is not that simple.
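If you want to check this behavior beyond a 20-bit example, here is a small Python experiment (the names are mine, for illustration only): with long independent input sequences, the AND output's average settles near the product of the input averages.

```python
import random

def and_gate(a_bits, b_bits):
    """One AND gate in hardware: bitwise AND of the two input streams."""
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(1)
n = 100_000
a = [1 if rng.random() < 0.80 else 0 for _ in range(n)]
b = [1 if rng.random() < 0.50 else 0 for _ in range(n)]

product = sum(and_gate(a, b)) / n
print(product)  # close to 0.80 * 0.50 = 0.40
```

The statistical argument is simple: for independent inputs, the probability that both bits are 1 is the product of the individual probabilities.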

So, what are the tradeoffs? There are several.

(1) As stochastic computation involves randomness, there will always be some inexactness in the number representation. In conventional number representations, we can guarantee a maximum quantization error (at least for numbers that are within bounds). This is not the case for stochastic computation, where the error (i.e., the difference between the number being represented and the sequence average) shrinks only slowly with sequence length and is not strictly bounded: the resolution of a length-N sequence is 1/N, so the sequence length must double to gain the equivalent of one more bit of representation, and the random fluctuation around the represented value decreases only as 1/√N. In essence, there is a tradeoff between circuit complexity (i.e., number of logic gates) and sequence length (i.e., computation time).
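An example here will be very useful. The following Python experiment (function name mine, for illustration) estimates the average representation error for several sequence lengths:

```python
import random

def avg_abs_error(p, length, trials=2000, seed=0):
    """Average |sequence average - p| over many random sequences of one length."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        ones = sum(1 for _ in range(length) if rng.random() < p)
        total += abs(ones / length - p)
    return total / trials

for n in (16, 64, 256, 1024):
    print(n, avg_abs_error(0.80, n))
# the error roughly halves each time the length quadruples (~ 1/sqrt(n))
```

In other words, each extra bit of effective precision costs far more than one extra clock cycle, which is the crux of the complexity-versus-time tradeoff.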

(2) Stochastic computation assumes that the bits in a sequence are mutually independent (i.e., we assume an underlying Bernoulli process). Some really interesting forms of stochastic computation rely on flip-flops (e.g., for division), or even more complex finite state machines (e.g., as approximations to nonlinear functions) [3][4]. These processes break the independence of consecutive bits, and much effort has to be put into dealing with this.

In the multiplier example above, the two input sequences, A and B, must also be independent from each other. Consider the following example where the inputs would not be independent, by "pushing" all the 1s to the left:

Input A 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
Input B 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
A AND B 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0

Here, the inputs are no longer "stochastic." Instead, they are what we call "pulse-width modulated." We see that the output is no longer the product of the inputs, but rather the minimum of the inputs. I leave it as an exercise for the reader to determine what the OR gate would produce, both for stochastic inputs and for pulse-width-modulated inputs.

(3) Generating stochastic sequences can be tricky. There is no such circuit element in digital logic design as a "biased coin"!
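In hardware, the usual workaround is a pseudo-random source, typically a maximal-length linear-feedback shift register (LFSR), followed by a comparator: output a 1 whenever the LFSR value falls below a threshold proportional to the target number. Here is a Python sketch of that scheme, assuming the common 8-bit tap set [8, 6, 5, 4] (function names are mine):

```python
def lfsr_states(seed=0xE1, length=255):
    """8-bit maximal-length Fibonacci LFSR (taps 8, 6, 5, 4).

    Starting from a nonzero seed, it cycles through all 255 nonzero states,
    a cheap pseudo-random source in digital hardware."""
    state = seed
    states = []
    for _ in range(length):
        states.append(state)
        bit = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | bit) & 0xFF
    return states

def stochastic_bits(p, length=255):
    """Comparator-based generator: 1 whenever the LFSR value is below p * 256."""
    threshold = int(p * 256)
    return [1 if s < threshold else 0 for s in lfsr_states(length=length)]

bits = stochastic_bits(0.80)
print(sum(bits) / len(bits))  # close to 0.80
```

Strictly speaking, an LFSR is only pseudo-random, and sharing one LFSR between two inputs would destroy the independence assumed in tradeoff (2), so practical designs use separate (or differently seeded) generators per input.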

To date, these challenges have limited the utility of stochastic computation to a few domains where the algorithms of interest have a nice mapping onto stochastic gates. These include, for instance, image processing [5] and iterative decoding of error-control codes [6][7].

However, there has recently been a flurry of activity in the area due to stochastic computation's inherent fault tolerance. Imagine if one bit were generated in error in the first sequence above, in the 17th position for illustration:

Input A    0 1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1   (0.75 instead of 0.80)
Input B    1 0 0 1 1 1 1 0 0 1 0 0 1 0 0 1 1 1 0 0   (exact value of 0.50)
A AND B    0 0 0 1 0 1 1 0 0 1 0 0 1 0 0 1 0 1 0 0   (0.35 instead of 0.40)

The average of the first sequence changes from 0.80 to 0.75 and the result changes from 0.40 to 0.35: not a huge change. In reality, as sequence length grows, the impact of any single bit error decreases substantially, making stochastic computation an excellent candidate for emerging technologies with large component variation, or for operating conventional CMOS circuits at extremely low voltages, where noise margins shrink. Some logic-level approaches are discussed in [8]. There are also several circuit-level approaches to fault tolerance that incorporate stochastic techniques (see, e.g., [9]).
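A quick Python sketch (names mine, for illustration) makes the fault-tolerance argument concrete: flipping k bits in a length-n sequence can shift the represented value by at most k/n.

```python
import random

def flip(bits, positions):
    """Model transient faults by inverting the bits at the given positions."""
    out = list(bits)
    for i in positions:
        out[i] ^= 1
    return out

rng = random.Random(2)
n = 10_000
a = [1 if rng.random() < 0.80 else 0 for _ in range(n)]
faulty = flip(a, rng.sample(range(n), 10))  # ten random single-bit faults

clean_val = sum(a) / n
faulty_val = sum(faulty) / n
print(clean_val, faulty_val)  # values differ by at most 10/n = 0.001
```

Contrast this with a conventional binary word, where a single fault in the most significant bit changes the value by half the full range.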

Stochastic computation has definitely seen a resurgence in recent years, and I expect more to come. In my opinion, the field is far wider than just dealing with sequences of random bits. Rather, it is a way of thinking (quantization in statistics, also related to multiple-valued logic) whose intersection with other stochastic techniques, such as genetic algorithms and simulated annealing, will lead to new and innovative solutions to problems.

If you are interested in learning more about stochastic computation, I suggest an excellent tutorial by Brown and Card [3].

References:
[1] J. von Neumann, “Probabilistic logics and the synthesis of reliable organisms from unreliable components,” Automata Studies, pp. 43–98, 1956.
[2] B.R. Gaines, “Stochastic computing systems,” Advances in Information Systems Science, J.F. Tou, ed., vol. 2, ch. 2, pp. 37–172, New York: Plenum, 1969.
[3] B. Brown and H.C. Card, “Stochastic neural computation I: Computational elements,” IEEE Transactions on Computers, vol. 50, no. 9, pp. 891–905, 2001.
[4] P. Li, W. Qian, D. Lilja, M. Riedel, and K. Bazargan, “Using a two-dimensional finite-state machine for stochastic computation,” IEEE/ACM International Workshop on Logic and Synthesis, 2012.
[5] W. Qian, X. Li, M. Riedel, K. Bazargan, and D. Lilja, “An architecture for fault-tolerant computation with stochastic logic,” IEEE Transactions on Computers, vol. 60, no. 1, pp. 93–105, 2011.
[6] V. Gaudet and A. Rapley, “Iterative decoding using stochastic computation,” Electronics Letters, vol. 39, pp. 299–301, 2003.
[7] S. Sharifi Tehrani, W. Gross, and S. Mannor, “Stochastic decoding of LDPC codes,” IEEE Communications Letters, vol. 10, no. 10, pp. 716–718, 2006.
[8] V. Shmerko, S. Yanushkevich, and S. Lyshevski, Computer Arithmetics for Nanoelectronics, Taylor & Francis/CRC Press, 2009.
[9] C. Winstead, Y. Tang, E. Boutillon, C. Jego, and M. Jézéquel, “A space-time redundancy technique for embedded stochastic error correction,” International Symposium on Turbo Codes (ISTC), 2012.