IEEE PHOTONICS TECHNOLOGY LETTERS, VOL. 25, NO. 17, SEPTEMBER 1, 2013
High-Speed Optical Vector and Matrix Operations Using a Semiconductor Laser
Daniel Brunner, Miguel Cornelles Soriano, Member, IEEE, and Ingo Fischer
Abstract: Using induced transient dynamics of a standard telecommunication laser, we perform optical computation at high data rates. Employing time-multiplexing, we emulate a nonlinear photonic network via a single laser diode. Based on the induced dynamics of the network, a machine learning technique is implemented. We compute the multiplication of a scalar with a vector, vector dot and cross products, and the multiplication of a matrix and a vector, using the same system. Information is injected all-optically at a rate of 5 GSamples/s, allowing for up to $5 \times 10^{8}$ operations per second.
Index Terms: Optical computing, semiconductor lasers, nonlinear optics, analog computers.
I. INTRODUCTION
PHOTONIC information processing has been a tantalizing prospect for many decades. Interest remains high, even after the field encountered severe setbacks. This continuing interest stems from the enormous prospects, such as high data rates, energy efficiency and true parallelism. Due to the astonishing success of electronic-digital computers, alternative systems were rendered almost irrelevant in the past. Recently, however, a revitalized interest in optical computation has emerged, partially based on alternative approaches to photonic information processing [1]. Because fundamental vector operations are important to information processing in general, as well as to system control and telecommunication tasks, they were considered in some of the earliest optical computing concepts [2], [3]. Other all-optical implementations of vector and matrix operations exist [4]. Most, however, are based on designs specific to a single operation, and as such require a high degree of specialized hardware. For seamless interoperability with most present-day photonic devices and for maximum technological relevance, a multipurpose photonic information processing scheme operating at telecommunication wavelengths would be of high interest. Here, we experimentally demonstrate the photonic realization of a variety of algebraic operations between vectors, scalars and matrices, using a modification of the machine learning
Manuscript received May 6, 2013; revised June 26, 2013; accepted July 6, 2013. Date of publication July 15, 2013; date of current version August 5, 2013. This work was supported in part by MICINN, Comunitat Autònoma de les Illes Balears, FEDER, the European Commission under Projects TEC2012-36335, Grups Competitius, EC FP7 Projects PHOCUS under Grant 240763 and the NOVALIS under Grant 275840. The authors are with the Instituto de Física Interdisciplinar y Sistemas Complejos, IFISC, Palma de Mallorca E-07122, Spain (e-mail: dbrunner@ifisc.uib-csic.es;
[email protected];
[email protected]). Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LPT.2013.2273373
concept of Reservoir Computing (RC) [5], [6]. When applied to more abstract and complex tasks, e.g. spoken digit recognition and chaotic time-series prediction, optical RC systems have already yielded excellent performance [7]. In this proof-of-concept experiment, we significantly extend the range of demonstrated information processing tasks based on RC with all-optically implemented reservoirs. None of the computations reported here requires memory. Therefore, we create a photonic network without recurrent connections and consequently without fading memory. Information injection and training procedures, however, remain unaltered and identical to the concept of RC. To avoid ambiguity in our nomenclature, we will refer to our computational scheme as nonlinear transient computing [8].
II. NONLINEAR TRANSIENT COMPUTING EMBEDDED IN HARDWARE
In nonlinear transient computing, computation is based on the response of a nonlinear network to injected information. Such networks can be established in a large variety of devices, e.g. using time-multiplexed [9] or multiple-node networks [10]. In our experiments, the all-optical realization of the nonlinear network exclusively utilizes standard, off-the-shelf telecommunication components. The attractiveness of this computational concept stems from the rather weak constraints upon the exact structure of the nonlinear network. As a consequence, implementations in opto-electronic [11], [12] and all-optical systems [7], [13] have recently been realized in proof-of-concept experiments. In all of these publications, the computational power of the network is based on the nonlinearity provided by standard photonic components, for example the threshold and amplitude-phase coupling of light inside a semiconductor laser diode [7], [14]. The scheme does not require a specific system architecture for solving different tasks and is based on simple learning rules.
Defining the output as a linear weighted sum of nonlinear transient states [5], [9], such systems can, under certain conditions, perform any mathematical operation [15]. None of the computations carried out in our experiment requires memory. Consequently, the delayed feedback included in earlier implementations of our concept was removed. In the original RC concept, nonlinear transients were induced in a reservoir of multiple, randomly interconnected nonlinear elements [5], [6]. Using time-multiplexing and data pre-processing, such a nonlinear network can be realized with a single nonlinear element [9]. Fig. 1 illustrates the concept,
1041-1135 © 2013 IEEE
BRUNNER et al.: HIGH-SPEED OPTICAL VECTOR AND MATRIX OPERATIONS USING A SEMICONDUCTOR LASER
Fig. 1. Transient computing with a single nonlinear element based on time multiplexing. Panel a) shows the two clock signals $T_1$ and $T_2$ and the input signal $u$. One computational step has a duration of $T_2$, which is divided into subintervals spaced by $T_1$. Each time step $T_1$ during one $T_2$ defines one individual transient state. The input signal $u$ consists of the input information, multiplied by a matrix defining the connectivity from the input to the nonlinear network. Panel b) schematically shows the laser diode (LD) transient states induced by the information $u$, and the construction of the readout $y_k$. Red arrows illustrate the connectivity between network nodes induced by the system’s inertia. A schematic representation of our experimental realization is given in panel c).
showing the individual time-multiplexing sequences (a), the laser’s transient response (b) and our experimental implementation (c). As illustrated in Fig. 1 a), for a network consisting of $N$ transient states, one computational operation takes the time $T_2 = T_1 \cdot N$. Two clock signals therefore exist in our system: $T_2$ defines the temporal duration of one computational operation, while $T_1$ refers to the individual transient states. We define $T_1$ as a fraction of $T_0$, the characteristic response time of the laser, so that the ratio $T_1/T_0$ acts as a scaling factor. For $T_1 < T_0$, the system’s inertia couples consecutive transients and therefore establishes internal network connections. In a different information processing task, we empirically found the tradeoff between node coupling and the network’s response to be optimal for $T_1/T_0 \approx 0.2$ [9]. The induced nonlinear transients, i.e. the laser output, are illustrated in Fig. 1 b). Each transient state corresponds to a numbered circle. Arrows linking the numbered circles illustrate the network’s connection topology induced by the inertia and $T_1/T_0 < 1$. The system’s output $y(T_2)$, providing the result of a computation, is a linear weighted sum of the network’s transients, as illustrated in Fig. 1 b). The computational scheme is inherently parallel in terms of data injection, processing and data readout, hence input and readout information are vectors. In our experiment, the number of independent components ($L$) in the input information ($v$) ranges from four (multiplying vector $a$ by scalar $c$) to twelve ($a$ multiplied by matrix $D$),
Fig. 2. Algebraic vector and matrix operations, computed all-optically based on nonlinear photonic transients, for number of transients used for the computations ranging from 10 to 100. a) Error in terms of the standard deviation between target and computation results, b) the equivalent computation accuracy in number of bits. The dotted line in panel b) corresponds to the simulated accuracy limitation due to detection noise for the scalar multiplication of a vector.
with each scalar, vector or matrix component represented by an input dimension. We constructed the injection signal ($u$) from a random superposition of the $L$ input vector components, according to
$$u_n = \sum_{m=1}^{L} \omega_{n,m} \cdot v_m. \qquad (1)$$
Here $\omega$ is the input connectivity matrix (dimensionality $N \times L$) and element $u_n$ ($n = 1 \ldots N$) is the data injected for transient $n$. Equation (1) results in a vector whose number of components equals the network size $N$. 70% of the elements in $\omega$ were set to zero; the remaining 30% were randomly chosen as ±1. The sparsity of $\omega$ ensures that each value of $u_n$ only includes information about a few components of $v$. Accordingly, the training procedure is able to emphasize (suppress) the nodes mixing the required (not required) input data components for each dimension of the output. The number of independent output components required by the specific operation defines the number of readout classifiers needed. As an example, the vector dot product $a \cdot b$ requires one readout only ($k = \{1\}$), while $a \times b$ requires three ($k = \{1, 2, 3\}$). The linear readout weights are trained using standard machine learning techniques, e.g. linear regression [9]. Using the example of the vector cross product, we randomly create $\xi = 1000$ different realizations of vectors $a$ and $b$, and calculate the readout targets ($y^t$) according to $y^t = a \times b$. Performance is evaluated via off-line cross validation, using 80% (20%) of the realizations for training (testing). We construct $\xi$ realizations of input vectors, matrices and scalars, with random values between zero and one for each component. The standard deviation is computed from the deviation between the test samples’ classifier outputs and their targets, and is averaged across
the $\xi$ different realizations:
$$\delta = \sum_{i=1}^{\xi} \sum_{j=1}^{k} \frac{\lVert y_{i,j} - y^{t}_{i,j} \rVert}{k\,\xi}. \qquad (2)$$
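The offline part of this procedure, i.e. Eq. (1), the training of the linear readout and Eq. (2), can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used in the experiment: each entry of $\omega$ is drawn independently (zero with probability 0.7, otherwise ±1), which the text does not specify, and the linear regression is solved through lightly regularized normal equations as one possible "standard" technique. All function names are illustrative, and the transient states themselves must come from a measurement or a model of the nonlinear element.

```python
import random


def make_mask(n_transients, n_inputs, density=0.3, seed=0):
    """Sparse input connectivity (assumption: independent entries, 0 or +/-1)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) if rng.random() < density else 0.0
             for _ in range(n_inputs)]
            for _ in range(n_transients)]


def injection_signal(mask, v):
    """Eq. (1): u_n = sum_m mask[n][m] * v[m]."""
    return [sum(w * x for w, x in zip(row, v)) for row in mask]


def fit_readout(states, targets, ridge=1e-9):
    """Least-squares readout weights for one output; the bias is the last entry."""
    rows = [list(s) + [1.0] for s in states]  # append a constant bias input
    d = len(rows[0])
    # Normal equations (X^T X + ridge * I) w = X^T y; ridge is a hypothetical
    # small regularizer that keeps the system solvable.
    a = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for col in range(d):
        piv = max(range(col, d), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = a[r][col] / a[col][col]
            for c in range(col, d):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in range(d - 1, -1, -1):
        w[i] = (b[i] - sum(a[i][j] * w[j] for j in range(i + 1, d))) / a[i][i]
    return w


def readout(weights, state):
    """Linear weighted sum of the transient states plus the bias term."""
    return sum(w * s for w, s in zip(weights, state)) + weights[-1]


def mean_error(outputs, targets):
    """Eq. (2): absolute deviation averaged over xi realizations and k readouts."""
    xi, k = len(outputs), len(outputs[0])
    total = sum(abs(y - t)
                for out, tgt in zip(outputs, targets)
                for y, t in zip(out, tgt))
    return total / (k * xi)
```

In use, `injection_signal` would be applied to every realization of $v$, the resulting transients collected as `states`, `fit_readout` called once per output dimension on the 80% training split, and `mean_error` evaluated on the remaining 20%.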
Once the system’s analog classifier has been optimized for the trained vector or matrix operation, this operation can be carried out for any input values. Computation is therefore not restricted to the $\xi$ random examples used for training and testing; rather, the system has learned and approximated the mathematical rule of the required operation. As such, after a learning procedure carried out offline on a computer, the system is capable of executing the various trained operations without further supervision. First hardware realizations of analog classifiers for RC based on time-multiplexing are being developed [16]. Such a system could afterwards perform the required computation without relying on further training.
III. EXPERIMENTAL SETUP
A schematic of our experimental realization of nonlinear transient computing is shown in Fig. 1 c). In the all-optical implementation of the nonlinear network, we utilized a standard, fiber-pigtailed semiconductor laser as the central nonlinear element (LI). The laser operated at telecommunication wavelengths (λ = 1543 nm) and was biased at 12.2 mA, corresponding to 2.5% above the solitary laser threshold. As the optical information injection source we used a narrow-linewidth, tunable laser (LII) tuned to the same wavelength. Signal $u$ was encoded via intensity modulation of the injection laser using a Mach-Zehnder modulator (MZM). The modulation rate of the injection signal was $5 \times 10^{9}$ Samples/second; the modulation resulted in minimum and maximum powers of 1 μW and 1 mW, respectively. The purpose of our experiment was to characterize the computational performance of the laser’s transient dynamics alone. We accordingly eliminated the influence of the MZM $\sin^2$ nonlinearity during the data preprocessing stage. Experimental results of the individual tasks are shown in Fig. 2, depicting the average calculation error. Panel a) shows the error $\delta$ calculated via Eq. (2), panel b) the corresponding computation accuracy in number of bit levels.
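The text does not detail how the MZM nonlinearity was removed during preprocessing. One common way, shown here purely as an illustration, is to predistort the drive signal with the inverse of an idealized $\sin^2$ transfer function. The sketch assumes a lossless modulator, a drive phase restricted to $[0, \pi/2]$ and the power range of 1 μW to 1 mW quoted above; the function names are hypothetical.

```python
import math


def mzm_transmission(phase):
    """Idealized MZM intensity transfer (assumption): sin^2 of the drive phase."""
    return math.sin(phase) ** 2


def predistort(target_power, p_min=1e-6, p_max=1e-3):
    """Drive phase in [0, pi/2] that yields target_power (in W) after the MZM."""
    # Normalize to [0, 1] and clip values outside the modulator's power range.
    x = min(1.0, max(0.0, (target_power - p_min) / (p_max - p_min)))
    return math.asin(math.sqrt(x))


def modulated_power(phase, p_min=1e-6, p_max=1e-3):
    """Optical power (in W) behind the MZM for a given drive phase."""
    return p_min + (p_max - p_min) * mzm_transmission(phase)
```

With this predistortion, `modulated_power(predistort(p))` reproduces `p` across the modulator range, so the injected power depends linearly on the intended signal $u$ and any nonlinearity in the response can be attributed to the laser.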
The information processing performance was evaluated for networks ranging in size from 10 to 100 transients. Each of the individual tasks poses a different computational challenge to the system. Multiplying a vector by a scalar can be executed accurately with as few as 10 transient states, with an error of ∼4% for ≥15 transients. Vector dot and cross products are computationally more demanding. According to the results displayed in Fig. 2, they are computed with approximately twice the error of the scalar multiplication. Additionally, a network size of >30 was required. Not surprisingly, the most computationally demanding task was the multiplication of a vector by a matrix. While the lowest error achieved was practically identical to that for the vector dot and cross products, the size of the network required to achieve similar accuracy increased further to ≥50. For all reported tasks, the error between the optically computed values and their targets lies significantly below 10%
for computations based on more than 25 transients. In an entirely hardware-based system, one could perform between $5 \times 10^{8}$ and $5 \times 10^{7}$ operations per second for computations using between 10 and 100 transients, respectively. For each operation, the computation accuracy approaches or exceeds 4-bit resolution. In our experiments, we measured a signal-to-noise ratio of ∼5%, largely dominated by detection noise. To illustrate the impact of the detection noise, we compute the nonlinear response of an Ikeda nonlinearity ($\sin^2$, no delayed feedback), utilizing the same injection data as in our experiment. Due to the analogue nature of the readout values, noise has a direct impact on system performance. When the system’s readout is trained using noiseless data, the error approaches 0% for 25 or more nonlinear elements. Adding 5% (detection) noise to the response of the Ikeda nonlinearity results in a performance decrease of ∼1.5%. For the product between a vector and a scalar, results are illustrated by the dotted line in Fig. 2 b). Hence, a detector with a lower noise figure could significantly improve the performance of the system. When the system’s output is interfaced directly with other optical elements, a detector could be avoided, eliminating this noise source.
IV. CONCLUSION
We have experimentally demonstrated high-speed algebraic vector and matrix operations using standard, off-the-shelf telecommunication components. Unlike most optical information processing systems, the demonstrated analogue transient computing scheme has been successful in several different, highly diverse benchmark tests. As such, we significantly extend the range of possible applications of high-speed all-optical information processing systems based on machine learning.
Especially for applications where a compromise between high accuracy, high speed and energy efficiency has to be found, such optical information processing approaches could offer attractive alternatives to available techniques. In an implementation of our scheme in multi-node networks, hence omitting the time-multiplexing procedure, the 1/N injection rate reduction can be partly or entirely eliminated. Much higher data rates are therefore achievable, possibly reaching tens of GSamples/s. In addition, fast nonlinear optical transients exist in many systems with dynamical responses on picosecond timescales and shorter. A detailed investigation of the relation between the computational performance and laser diode parameters such as the linewidth enhancement factor (here α ∼ 2) and the damping of relaxation oscillations (here ∼2.5 ns$^{-1}$) is still lacking at this point. Semiconductor lasers exhibit some variety in each of these parameters. Consequently, the accuracy of such computations might increase significantly in the future. Furthermore, the scaling factor $T_1/T_0$ was optimized for a system with memory, executing a prediction task. In our experiments we emphasize the general-purpose properties of our scheme, and therefore utilize parameters identical or similar to those in results reported earlier. Future task-specific optimization of $T_1/T_0$ could nevertheless result in an increase in performance and processing speed.
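The operation rates quoted in this letter follow directly from the 1/N rate reduction of time multiplexing, since one operation occupies $T_2 = T_1 \cdot N$. The short sketch below makes this arithmetic explicit; it assumes that exactly one injected sample corresponds to one transient state, and the function names are illustrative.

```python
def transient_spacing(sample_rate):
    """Duration T1 of one transient state, assuming one sample per transient."""
    return 1.0 / sample_rate


def operations_per_second(sample_rate, n_transients):
    """One operation takes T2 = N * T1, so the rate is the sample rate over N."""
    return sample_rate / n_transients


if __name__ == "__main__":
    rate = 5e9  # injection rate in samples per second, as in the experiment
    for n in (10, 25, 50, 100):
        print(n, operations_per_second(rate, n))
```

For the injection rate of 5 GSamples/s this reproduces the range of $5 \times 10^{8}$ (N = 10) to $5 \times 10^{7}$ (N = 100) operations per second stated above.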
Physically implemented reservoir and nonlinear transient computing have been established with great success in recent years. To reach the full potential of this promising general-purpose approach to information processing, a physical realization of the injection and readout stages has to be demonstrated. Early implementations of the readout stage have been demonstrated [16] and suggested [7]. A physical implementation of the entire concept could open new doors in the photonic information processing and machine learning communities.
REFERENCES
[1] H. J. Caulfield and S. Dolev, “Why future supercomputing requires optics,” Nature Photon., vol. 4, no. 5, pp. 261–263, 2010.
[2] H. J. Caulfield, W. T. Rose, M. J. Foster, and S. Horvitz, “Optical implementation of systolic array processing,” Opt. Commun., vol. 40, no. 2, pp. 86–90, Dec. 1981.
[3] R. A. Athale, W. C. Collins, and P. D. Stilwell, “High accuracy matrix multiplication with outer product optical processor,” Appl. Opt., vol. 22, no. 3, pp. 368–370, Feb. 1983.
[4] L. Yang, R. Ji, L. Zhang, J. Ding, and Q. Xu, “On-chip CMOS-compatible optical signal processor,” Opt. Express, vol. 20, no. 12, pp. 13560–13565, Jun. 2012.
[5] H. Jaeger and H. Haas, “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication,” Science, vol. 304, no. 5667, pp. 78–80, Apr. 2004.
[6] D. V. Buonomano and W. Maass, “State-dependent computations: Spatiotemporal processing in cortical networks,” Nature Rev. Neurosci., vol. 10, pp. 113–125, Feb. 2009.
[7] D. Brunner, M. C. Soriano, C. R. Mirasso, and I. Fischer, “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nature Commun., vol. 4, p. 1364, Jan. 2013.
[8] R. Martinenghi, S. Rybalko, M. Jacquot, Y. K. Chembo, and L. Larger, “Photonic nonlinear transient computing with multiple-delay wavelength dynamics,” Phys. Rev. Lett., vol. 108, no. 24, pp. 244101-1–244101-4, Jun. 2012.
[9] L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system,” Nature Commun., vol. 2, p. 468, Sep. 2011.
[10] K. Vandoorne, J. Dambre, D. Verstraeten, B. Schrauwen, and P. Bienstman, “Parallel reservoir computing using optical amplifiers,” IEEE Trans. Neural Netw., vol. 22, no. 9, pp. 1469–1481, Sep. 2011.
[11] L. Larger, et al., “Photonic information processing beyond Turing: An optoelectronic implementation of reservoir computing,” Opt. Express, vol. 20, no. 3, pp. 3241–3249, Jan. 2012.
[12] Y. Paquot, et al., “Optoelectronic reservoir computing,” Sci. Rep., vol. 2, p. 287, Feb. 2012.
[13] F. Duport, B. Schneider, A. Smerieri, M. Haelterman, and S. Massar, “All-optical reservoir computing,” Opt. Express, vol. 20, no. 20, pp. 22783–22795, Jul. 2012.
[14] M. C. Soriano, J. Garcia-Ojalvo, C. R. Mirasso, and I. Fischer, “Complex photonics: Dynamics and applications of delay-coupled semiconductors lasers,” Rev. Modern Phys., vol. 85, no. 1, pp. 421–470, Mar. 2013.
[15] J. Dambre, D. Verstraeten, B. Schrauwen, and S. Massar, “Information processing capacity of dynamical systems,” Sci. Rep., vol. 2, p. 514, Jul. 2012.
[16] A. Smerieri, F. Duport, Y. Paquot, B. Schrauwen, M. Haelterman, and S. Massar, “Towards fully analog hardware reservoir computing for speech recognition,” in Proc. AIP Conf., 2012, pp. 1892–1895.