Biol. Cybern. 75, 441–452 (1996)

Biological Cybernetics © Springer-Verlag 1996

Population networks: a large-scale framework for modelling cortical neural networks

Hanspeter A. Mallot¹, Fotios Giannakopoulos²

¹ Max-Planck-Institut für biologische Kybernetik, Spemannstrasse 38, D-72076 Tübingen, Germany
² Mathematisches Institut der Universität zu Köln, Weyertal 86-90, D-50931 Köln, Germany

Received: 9 January 1996 / Accepted in revised form: 24 July 1996

Abstract. Artificial neural networks are usually built on rather few elements such as activation functions, learning rules, and the network topology. When modelling the more complex properties of realistic networks, however, a number of higher-level structural principles become important. In this paper we present a theoretical framework for modelling cortical networks at a high level of abstraction. Based on the notion of a population of neurons, this framework can accommodate the common features of cortical architecture, such as lamination, multiple areas and topographic maps, input segregation, and local variations of the frequency of different cell types (e.g., cytochrome oxidase blobs). The framework is meant primarily for the simulation of activation dynamics; it can also be used to model the neural environment of single cells in a multiscale approach.

1 Introduction

The standard approach to the modelling of neural networks is based on ‘integrate and fire’ units such as the adjustable neuron of Widrow and Hoff (1960). A set of neurons is represented by a vector of activities, each component of which corresponds to one unit, and a weight matrix representing the transmission weights between these units. While this approach has proved extremely powerful for many problems where modelling of individual neurons is important, it is not easily applied to the rich higher-level structures of large-scale networks such as cortices. In particular, three problems arise:

– The huge number of neurons, for example, in the mammalian neocortex makes a complete model almost impossible. It is therefore desirable to include properties of groups of neurons, i.e., higher-level properties, within one model element.
– Even though larger computers make such huge networks treatable¹, the available anatomical and physiological data do not suffice to specify the properties of all individual neurons. Generally, data apply to groups or classes of neurons, such as ‘the pyramidal cells of layer 6’, say.
– The connectivity matrix does not make use of information on spatial nearness or geometry. It is obvious, however, that geometry is extremely important in cortical organization: for example, neighboring neurons are much more likely to be connected than more distant ones. A notion of space in neural network models is therefore desirable.

Correspondence to: H.A. Mallot
¹ For examples of large single-unit models of cortices, see Patton et al. (1992), Somers et al. (1995), and Stemmler et al. (1995).

One way to meet the above requirements is the continuous approach, where each neuron is identified by a point in a two-dimensional layer (e.g., Beurle 1956; von Seelen 1968; Wilson and Cowan 1973; Amari 1977; Cowan and Ermentrout 1978; an der Heiden 1980). The weighted summation u_i = Σ_j w_ij s_j of the inputs s_j is replaced by the equation

u(x) = ∫ W(x, x′) s(x′) dx′   (1)
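Discretized on a grid, (1) is a matrix–vector product, and for a kernel of the form w(x − x′) (the space-invariant case discussed next) the same integral reduces to a convolution. A minimal numerical sketch; the grid size, Gaussian kernel, and impulse input are illustrative assumptions:

```python
import numpy as np

# Discretization of (1), u(x) = ∫ W(x, x') s(x') dx', on a 1-D grid.
# Grid size, kernel width, and the impulse input are illustrative assumptions.
n, dx = 201, 0.05
x = np.arange(n) * dx

def gaussian(d, sigma=0.2):
    return np.exp(-d**2 / (2 * sigma**2))

# General (space-variant) case: an arbitrary kernel matrix W[i, j] = W(x_i, x_j).
W = gaussian(x[:, None] - x[None, :])
s = np.zeros(n)
s[n // 2] = 1.0 / dx                      # discrete approximation of a point stimulus
u_general = (W @ s) * dx                  # midpoint rule for the integral in (1)

# Space-invariant case W(x, x') = w(x - x'): the same integral becomes a convolution.
w = gaussian(x - x[n // 2])
u_conv = np.convolve(s, w, mode='same') * dx

assert np.allclose(u_general, u_conv, atol=1e-9)
```

For a space-variant kernel the full matrix is needed; the convolution shortcut applies only when all neurons share the same connectivity profile.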

The important point about this equation is that the indices i, j are replaced by position variables x, x′ ∈ R², for which metric relations apply. It is often assumed that the kernel W is of the form W(x, x′) = w(x − x′), in which case (1) becomes a convolution. This assumption implies that the spatial connectivity patterns of all neurons are identical, i.e., that the system is uniform or space-invariant. In earlier work, the continuous approach has been extended in two directions: (a) interaction of multiple neural layers (Krone et al. 1986; Giannakopoulos 1989) and (b) space variance, e.g., in the form of point-to-point maps or modulations of cell density (Mallot et al. 1990). In this paper, we unify these two extensions into a general framework for models of the activation dynamics of layered, space-variant (i.e., cortex-like) networks.

2 Cortical networks

The organization of neural tissue in the central nervous systems of vertebrates can be classified into three major types: nuclei, formatio reticularis, and cortex. A cortex can be defined as a stack of neural layers with vertical connectivity


across the layers and topographic organization of input and output. The best-studied cortical structure is of course the one found in the mammalian neopallium. Similar structures, however, are rather widespread. They occur in other sections of the vertebrate brain, such as the avian visual wulst, the optic tectum or superior colliculus in the midbrain, and in the cerebellum. In invertebrates, similar organizations have been described, for example, in the cephalopod optic lobe. In summary, there seems to be an evolutionary advantage of cortical organization favoring its emergence in unrelated species and in different sections of the brain.² Models of neural networks are thus required that explicitly contain these high-level features of cortical organization. From the point of view of modelling, the most important features of the organization of cerebral cortex are the following:

Lamination. Cortical layers are characterized by various anatomical and physiological parameters, such as the relative abundance of cell classes, soma size, pharmacology, and both intrinsic and inter-area connectivity (Braitenberg and Schüz 1991). These parameters remain roughly constant within a two-dimensional sheet, but vary between sheets. In contrast, the ‘layers’ of artificial neural networks are defined by topology only, i.e., by the block structure of the connectivity matrix, leaving no room for geometrical concepts such as two-dimensional extent.

Intra-area connectivity and feedback. Connections between two cortical neurons can be mediated by parallel synapses located in different layers. If detailed timing is considered, this multiplicity of connections can be functionally significant owing to differences in propagation time. Layers can also be devoid of nerve cell somata, mediating fiber contacts of neurons whose somata are located elsewhere; an example is the molecular layer (lamina 1) of the neocortex. Feedback connections are common also within one given layer (Lund 1988).
Topographic maps. Inter-area connectivity is often organized as a topographic map, i.e., neurons located close together in one area project to neighboring regions within the target area. The retinotopic map of the primary visual cortex in mammals is constructed in two such steps, with the lateral geniculate nucleus as intermediate stage. Topographic maps can be modelled as continuous functions between the layers (‘rubber-sheet transformation’) plus a convolution accounting for terminal arborizations in the target area.

² In general, an anatomical trait can be expected to have a high functional significance if it evolved independently in different species. For example, Northcutt and Kaas (1995) pointed out that the sublamination of cortical layer 4 in squirrels and tree shrews (Tupaia) is homoplastic, i.e., has evolved independently in the two taxa. It is therefore likely that sublamination plays an important role in cortical function in these taxa.

Input columns. While tangential uniformity on a coarse spatial scale is a general feature of cortical organization, systematic inhomogeneities do exist on a finer scale. One such

inhomogeneity is that of input columns such as the ocular dominance columns found in the primary visual cortex of mammals with overlapping visual fields. In this case, the topographic maps from the thalamic representations of the two eyes in the primary visual cortex are disrupted and interlaced to form the well-known stripe pattern. Mathematically, this type of columnar arrangement can be modelled by simple modifications of the mapping functions (cf. Sect. 4.3).

Density modulation and output columns. Within a cortical layer, the relative densities of different cell types may also vary. If cells projecting to different areas are treated as different types, this density modulation can be used to model output columns such as the specific connectivity between cytochrome oxidase blobs and interblobs in area V1 as a source area and the thick and thin stripes in area V2 as a target area.

3 Model

3.1 Populations and layers

In the continuous approach, the basic unit of modelling is a population p of neurons distributed over a domain B_p. Each point x in B_p corresponds to a cell body, and the excitation of the entire population is modelled by the continuous activity patterns e_p(x, t) (spike rate) and u_p(x, t) (intracellular potential). The connectivity of the populations is determined by their axonal and dendritic arborizations, which are assumed to be uniform (shift-invariant) up to topographic (point-to-point) maps. Both types of fibers spread over various cortical layers, which will be modelled by the connection layers C_l, l ∈ {1, . . . , L} described below. In Table 1, the variables of the proposed model are summarized, together with their appropriate dimensions.

Table 1. Variables describing the continuous cortex model

Symbol | Dimension | Interpretation

State variables
s_l(y, t) | Transmitter production / (Area × time) | Presynaptic stimulus density (y ∈ C_l)
u_p(x, t) | Potential | Intracellular excitation (x ∈ B_p)
e_p(x, t) | Spikes / (Area × time) | Spatiotemporal spike rate (x ∈ B_p)

Anatomical variables
B_p ⊂ R² | | Domain of u_p, e_p
C_l ⊂ R² | | Domain of s_l
̺_p(x) | Cell bodies / Area | Density of cells of population p (x ∈ B_p)
R_lp(x) | | Axonal map B_p → C_l
Q_pl(y) | | Dendritic map C_l → B_p (will often be the identity)

Dynamics
f_p(u_p) | Spikes / (Cell body × time) | Nonlinear transfer function
δ_pl(y) | Potential / Transmitter production | Weighting (sensitivity) function of dendritic summation (population p from connection layer l; y ∈ C_l)
α_lp(y) | Transmitter production / (Area × spike) | Weighting (point spread) function of axonal spread (population p to connection layer l; y ∈ C_l); sign models excitation and inhibition, respectively
T^δ_pl | Time | Synaptic delay plus dendritic propagation delay
T^α_lp | Time | Axonal propagation delay
h_pl(t) | | Temporal weighting function of dendritic summation

Axonal spread (divergence). The axonal arborization for each point x ∈ B_p is modelled by a set of weight functions α_lp : C_l → R and reference points y = R_lp(x), l = 1, . . . , L. Thus, the soma at position x spreads activity around the points R_lp(x) in a range determined by the point spread functions α_lp. The reference points vary continuously with the position of the soma, thus making up a continuous, point-to-point map R_lp : B_p → C_l. These maps model topographic mappings between cortical areas; within one area, they are close to the identity. The spread of excitation from population p to layer l takes time, which is modelled by the delays T^α_lp. Delays depend on the ‘vertical’ distance between the layers B_p and C_l. In comparison, time delays due to ‘horizontal’ distances within the layers are assumed to be negligible. They can, however, be included by slight modifications in the network equations (nonseparability of spatial and temporal components; cf. Ermentrout and Cowan 1979).

Dendritic summation (convergence). The dendritic arborization for each point x ∈ B_p is modelled by a set of weight functions δ_pl : C_l → R⁺ and reference points y = Q_pl^{−1}(x), l = 1, . . . , L. Thus, the soma at position x collects activity around the points Q_pl^{−1}(x) in a range determined by the sensitivity functions δ_pl. The reference points vary continuously with the position of the soma, thus making up a continuous, point-to-point map Q_pl : C_l → B_p. Usually, these maps will be the identity; mathematically, they are needed to maintain the distinction between the domains B_p and C_l. Temporal summation due to both transmitter accumulation and the dynamics of postsynaptic potentials is modelled by a weighting function h_pl(t). Compared with the more popular modelling of time structure by differential terms in u_p (Sect. 4.1), the use of h_pl has the advantage of being specific for each input layer; for example, the same population may now have fast synapses in one layer and slow synapses in another. Finally, a delay T^δ_pl is introduced to model synaptic delay and dendritic propagation time. As before, the delays depend on the ‘vertical’ distance between the layers only.
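The layer specificity of the temporal kernels can be made concrete with the exponential weighting function used later in (8); the two time constants below are illustrative assumptions, standing for a ‘fast’ and a ‘slow’ synaptic port of the same population:

```python
import numpy as np

# Temporal weighting functions h_pl: the same population p may have fast
# synapses in one connection layer and slow synapses in another.
# Exponential kernels as in (8); time constants are illustrative assumptions.
dt = 0.001
t = np.arange(0.0, 1.0, dt)

def h(tau, mu=1.0):
    return mu * np.exp(-t / tau)

h_fast, h_slow = h(0.01), h(0.1)                 # per-layer time constants

s = np.zeros_like(t)
s[100] = 1.0 / dt                                # presynaptic impulse at t = 0.1
u_fast = np.convolve(s, h_fast)[:len(t)] * dt    # causal temporal summation
u_slow = np.convolve(s, h_slow)[:len(t)] * dt

assert abs(u_fast[100] - 1.0) < 1e-6             # both peak at the impulse (mu = 1)
assert u_slow[900] > u_fast[900]                 # the slow port integrates longer
```

The same presynaptic event thus produces different time courses in different dendritic ports of one population.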

Somatic point operations. In addition to these spatial and temporal weighting functions, two types of point operations are performed by each population. First, a space-invariant, nonlinear transfer function f_p models the relation between intracellular potential u and spike rate. Second, the density of cells of the population may vary according to a density function ̺_p : B_p → R⁺.

Connection layers. The connection of two populations, p, q, say, is determined by the amount of overlap between the corresponding axonal and dendritic weighting functions (cf. Fig. 1). This assumption is discussed at length by Braitenberg and Schüz (1991), who call it ‘Peters’ rule’ (see also Peters 1985: 64ff). In the space-invariant case, it amounts to a convolution of the weighting functions α_lq and δ_pl. As can be seen from Fig. 1, connections can be multiple, each pathway having different spatial and temporal characteristics. When connecting neural populations into networks, it is convenient (though slightly redundant) to define an additional state variable, the combined presynaptic activities


u_p(x, t) = Σ_{l=1}^{L} ∫_{−∞}^{t} ∫_{C_l} s_l(y, t′ − T^δ_pl) h_pl(t − t′) δ_pl(Q_pl^{−1}(x) − y) dy dt′   (2)

The intracellular activity described by the variable up is a potential that does not depend on the density of cells present at any one location. The presynaptic excitation sl in connection layer Cl is the transmitter production per unit area and time. The separable kernel hpl δpl transforms this transmitter concentration into intracellular activity.

Fig. 1. Modelling the connections between two neural populations q, p. The ovals represent axonal weight functions α_lq, the hatched boxes represent dendritic weight functions δ_pl

s_l(y, t), forming the common input to the dendritic weight functions δ_pl of all populations p. The idea is that the output of one population is not the immediate input to some other population. Rather, several outputs from different populations are accumulated into one distribution of presynaptic activity, which then feeds into all neural populations with an appropriate dendritic port. We call the supports of the presynaptic activity connection layers; they are indexed by l ∈ {1, . . . , L}. Connection layers are reminiscent of the ‘blackboard’ structure in multi-agent computer systems in that they collect activity from several populations without keeping track of the original source of the activity. When in turn a neural population ‘reads’ from the connection layer, it cannot know whose activity it is reacting to. This lack of labelling of the activities is a direct consequence of ‘Peters’ rule’.
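In the space-invariant case, the overlap rule can be checked numerically: the effective coupling from a source population to population p is the convolution of the axonal spread α with the dendritic sensitivity δ, and the outputs of several source populations are pooled in the layer before being read out. A sketch with illustrative 1-D kernels and amplitudes:

```python
import numpy as np

# 1-D illustrative kernels; widths and amplitudes are assumptions, not fitted values.
n, dx = 201, 0.05
x = (np.arange(n) - n // 2) * dx

def gaussian(d, sigma):
    return np.exp(-d**2 / (2 * sigma**2))

alpha_q = gaussian(x, 0.15)     # axonal spread of population q into layer l
alpha_r = gaussian(x, 0.40)     # axonal spread of a second population r
delta_p = gaussian(x, 0.10)     # dendritic sensitivity of population p in layer l

e_q = np.zeros(n); e_q[80] = 1.0 / dx     # impulse of activity in q
e_r = np.zeros(n); e_r[120] = 1.0 / dx    # impulse of activity in r

# Axonal spread: both outputs are accumulated into ONE presynaptic field s_l.
s_l = (np.convolve(e_q, alpha_q, 'same') + np.convolve(e_r, alpha_r, 'same')) * dx

# Dendritic summation: p reads the pooled field; the sources are no longer labelled.
u_p = np.convolve(s_l, delta_p, 'same') * dx

# Equivalent view ('Peters' rule'): effective couplings are convolutions alpha * delta.
k_pq = np.convolve(alpha_q, delta_p, 'same') * dx
k_pr = np.convolve(alpha_r, delta_p, 'same') * dx
u_direct = (np.convolve(e_q, k_pq, 'same') + np.convolve(e_r, k_pr, 'same')) * dx

assert np.allclose(u_p, u_direct, atol=1e-6)
```

The two computations agree (up to boundary truncation), which is exactly why the reading population cannot recover the identity of the sources.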

Curved cortical surfaces. The sheets Cl and Bp are treated as subsets of R2 in this paper. In general, they can be considered parameterized two-dimensional surfaces in R3 . In this case, the appropriate area elements have to be inserted in the equations below.

3.2 Network equations

Consider a network of P neural populations [index p, state variables e_p(x, t) and u_p(x, t), x ∈ B_p] connected via L connection layers [index l, state variable s_l(y, t), y ∈ C_l]. The activation dynamics of the resulting network can be formulated in three steps (cf. Fig. 2, Table 1): dendritic summation, somatic point operations, and axonal spread.

Dendritic summation ({s_l; l = 1, . . . , L} → u_p): For each population p, inputs from different connection layers s_1, . . . , s_L are accumulated according to the dendritic arborizations δ_pl, the delays T^δ_pl, and the temporal weighting functions h_pl.

Somatic point operations (u_p → e_p): The resulting intracellular potential is passed through a nonlinearity f_p and locally weighted with the density of the cell population ̺_p(x):

e_p(x, t) = ̺_p(x) f_p(u_p(x, t))   (3)

The nonlinearity f_p transforms intracellular into extracellular activity. The term f_p(u_p) is still independent of cell density. The spatiotemporal spike rate e_p of the population is obtained by multiplication with the local cell density ̺_p. It thus models the combined extracellular activity of all neurons in the population, as can be measured by coarse methods such as local field potential, the intrinsic signal in optical recording, etc.

Axonal spread ({e_p; p = 1, . . . , P} → s_l): The resulting excitation is spread over the axonal densities α_lp and added to the activity of the connection layer to which the axon projects, again with appropriate delays (propagation times) T^α_lp. For axons from population p projecting to a connection layer l in another cortical area, a point-to-point mapping R_lp has to be considered:

s_l(y, t) = Σ_{p=1}^{P} ∫_{B_p} e_p(x, t − T^α_lp) α_lp(y − R_lp(x)) dx   (4)

In this equation, one or several of the populations may be considered receptor layers which receive external input and transmit it to connection layers accessible by the rest of the network.
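A discrete-time sketch of the three steps (dendritic summation, somatic point operations, axonal spread) for two populations, one excitatory and one inhibitory, sharing a single connection layer. All kernels, gains, the one-step exponential temporal weighting, the identity maps, and the unit densities are illustrative assumptions, not a fitted model:

```python
import numpy as np

# Discrete-time sketch of the three-step dynamics (2)-(4) for P = 2
# populations ('E' excitatory, 'I' inhibitory) and L = 1 connection layer.
# Maps R, Q are the identity and densities rho_p = 1 (uniform case);
# all kernels and gains are illustrative assumptions.
n, dx, steps = 101, 0.1, 60
x = (np.arange(n) - n // 2) * dx

def gauss(sig):
    g = np.exp(-x**2 / (2 * sig**2))
    return g / (g.sum() * dx)              # spread function with unit integral

alpha = {'E': gauss(0.3), 'I': gauss(0.6)}  # axonal point spread alpha_lp
delta = {'E': gauss(0.2), 'I': gauss(0.2)}  # dendritic sensitivity delta_pl
sign = {'E': +1.0, 'I': -1.0}               # excitatory / inhibitory sign
f = {'E': lambda u: np.maximum(u, 0.0),     # nonlinear transfer functions f_p
     'I': lambda u: np.maximum(0.5 * u, 0.0)}

u = {p: np.zeros(n) for p in 'EI'}
external = np.exp(-x**2 / 2.0)              # stationary external input

for _ in range(steps):
    # axonal spread (4): pooled presynaptic field, sources unlabelled
    s = external + sum(sign[p] * np.convolve(f[p](u[p]), alpha[p], 'same') * dx
                       for p in 'EI')
    # dendritic summation (2) with a one-step exponential temporal kernel
    for p in 'EI':
        u[p] = 0.8 * u[p] + 0.2 * np.convolve(s, delta[p], 'same') * dx

e = {p: f[p](u[p]) for p in 'EI'}           # somatic point operation (3)
assert all(np.isfinite(u[p]).all() for p in 'EI')
```

With these balanced kernels the activity settles to a bounded, spatially smooth profile; external input enters the loop exactly like the output of a receptor population.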

Boundary conditions. All integrals are taken over subsets of the plane, and no activity crosses the boundaries of these integration domains. The state variables describing the activity are not defined outside the domains considered. Consistency is guaranteed by the mapping functions Q, R, which are point-to-point and onto between their respective domains and ranges. A simple case with cross-boundary crosstalk is described briefly in the example in Sect. 4.4 below.

Magnification. One problem in the formulation of the above equations is the proper modelling of areal magnification, cellular magnification, and possible amplifications of the signal that might be associated with either type of magnification. Here, we have chosen a formulation in which magnification and amplification are largely decoupled. We define magnification for a projection from B_q to B_p via a connection layer C_l. With P_plq := Q_pl ◦ R_lq, we have:


Fig. 2. Activation function of a single neural population p. Activity is collected from the connection layers 1–4 via the dendritic weighting functions δ_pl to form the intracellular potential u_p (cf. equation 2). This potential is passed through two point operations: nonlinearity f_p and space-variant density modulation ̺_p (cf. equation 3). The resulting excitation e_p is spread over the axonal arborizations α_lp both within the same area (no topographic mapping) and to other areas (involving a topographic mapping R_lp; cf. equation 4)


– Areal magnification

M_a(x) := | det J_{P_plq}(x) |   (5)

– Cellular magnification

M_c(x) := M_a(x) ̺_p(P_plq(x)) / ̺_q(x)   (6)
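As a numerical illustration of (5) and (6), the Jacobian determinant of the composed map can be computed by finite differences; the linear toy map and the uniform densities below are illustrative assumptions:

```python
import numpy as np

# Areal and cellular magnification, (5) and (6), for an assumed linear map
# P(x) = (2 x1, 3 x2) and illustrative (uniform) cell densities.
def P(x):
    return np.array([2.0 * x[0], 3.0 * x[1]])

def jacobian(F, x, h=1e-6):
    # central finite differences; adequate for this smooth toy map
    J = np.zeros((2, 2))
    for col in range(2):
        step = np.zeros(2); step[col] = h
        J[:, col] = (F(x + step) - F(x - step)) / (2 * h)
    return J

rho_q = lambda x: 1.0          # source density (uniform, illustrative)
rho_p = lambda x: 2.0          # target density (uniform, doubled)

x0 = np.array([0.3, -0.1])
M_a = abs(np.linalg.det(jacobian(P, x0)))     # areal magnification (5)
M_c = M_a * rho_p(P(x0)) / rho_q(x0)          # cellular magnification (6)
assert abs(M_a - 6.0) < 1e-5 and abs(M_c - 12.0) < 1e-5
```

An area element is stretched by the factor 6 here; doubling the target cell density doubles the cellular magnification on top of that.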

In addition, the amplification A(x) can be defined as the total signal energy in population p resulting from a Dirac pulse of excitation applied at position x ∈ B_q in population q. Using this definition, the decoupling of magnification and amplification means the following: (a) amplification does not depend on the mapping R_lq or its magnification (except for boundary effects), while it is affected by the total divergence (‖α_lq‖) and convergence (‖δ_pl‖); (b) magnification, both areal and cellular, does not depend on the divergence and convergence factors.

The decoupling of areal magnification and amplification in (4) is due to the fact that the integrals are taken over the domain of the populations, B_p. Areal magnifications produced by the mappings R_lp do not lead to an increase in presynaptic excitation (transmitter production) in the connection layer. Rather, each area element ∆B of B_p has the same total number of synaptic terminals in the area element R_lp(∆B) ⊂ C_l, which are distributed over whatever area the image of ∆B has. This ‘dilution’ of activity in the presence of magnifying maps is seen better when rewriting (4) with C_l as the integration domain:

s_l(y, t) = ∫_{C_l} Σ_{p=1}^{P} e_p(R_lp^{−1}(y′), t − T^α_lp) α_lp(y − y′) | det J_{R_lp^{−1}}(y′) | dy′   (7)

The factor | det J_{R_lp^{−1}}(y′) |, which comes in via the integral transformation theorem, is the inverse of the areal magnification. It corresponds to the ‘dilution’ mentioned above.

4 Examples

4.1 Space-invariant networks in differential form

An important special case of the population network is obtained by two assumptions. First, we omit all space variances, i.e., we assume Q_pl = R_lp = id, C_l = B_p = R², and ̺_p ≡ 1 for all p, l. Second, we choose for the temporal weighting function

h_pl(t) = µ_p exp{−t/τ_p},  τ_p > 0, µ_p > 0.   (8)

Note that the weighting function no longer depends on l, the index of the connection layers. In this situation, we can substitute (3, 4) into (2) and obtain by differentiation with respect to time:

∂u_p(x, t)/∂t = −(1/τ_p) u_p(x, t) + µ_p Σ_{l=1}^{L} δ_pl ∗ [ Σ_{q=1}^{P} α_lq ∗ f_q(u_q(x, t − T_plq)) ]   (9)

In (9), the asterisks denote two-dimensional spatial convolutions. The combined time delay T_plq := T^δ_pl + T^α_lq corresponds to the flow of activity from population q to population p via connection layer l. Equations of this type have been studied by, for example, Feldman and Cowan (1975), Amari (1977), Amari and Arbib (1977), Ermentrout and Cowan (1979, 1980), von Seelen et al. (1987), Mallot and Giannakopoulos (1992), and Giannakopoulos and Oster (in press). Nonlinear properties of layered structures such as multistability or limit cycle behavior have been studied with simpler models using two populations (one excitatory and one inhibitory) and one or two connection layers. For example, Ermentrout and Cowan (1979, 1980) consider a system without time delays in which the distinction of pathways via different connection layers becomes obsolete. For suitable transmission weights, they found spatially periodic stationary solutions (such as stripes of activity), spatially constant



Fig. 3. Model of a cortical area with L = 4 layers and P = 7 cell populations. In the upper part of the figure, the connection layers and their state variables s are numbered 1, . . . , 4. In the lower part, the populations and their state variables u, e are numbered 1, . . . , 7. The populations 8, 9 provide external input. Left-hand box: axonal spread operation, (4). Right-hand box: dendritic summation, (2). Somatic point operations (nonlinearity, multiplication by density function ̺) are shown by example only


Fig. 4. Sketch of the retino-geniculo-cortical pathway in the population network formulation. The right-hand box (‘visual cortex’) is a simplified drawing of the network depicted also in Fig. 3 and the numbering is consistent with that figure. Connections within a brain area are drawn in thin lines; the directions (dendritic, axonal) of these connections are not shown in the figure. All mappings associated with intra-area projections are assumed to be the identity. Connections between different brain regions (thick lines) are mediated by axons and do involve topographic mappings

temporal oscillations, and standing waves of activity, i.e., periodic solutions in both space and time. In the general case, i.e., with or without delays, and with an arbitrary number of populations and connection layers, it can be shown that such spatially periodic solutions cannot arise from spatially constant initial conditions. That is to say, the system in itself cannot generate a space-dependent pattern of activity from space-independent stimuli and initial conditions (Giannakopoulos 1989). As a consequence, space variances have to be introduced into the system either anatomically or through the input. Among the applications of this type of model to information-processing problems, we mention the cooperative

computation of stereo correspondence in layers (maps) of disparity-tuned neurons (for review see Blake and Wilson 1991). Chipalkatti and Arbib (1988) have formulated the mechanism in terms of integro-differential equations that are easily transformed into a population network with two populations and one connection layer.

4.2 Layered cortical area

A linear model of a layered cortical area with a larger number of connection layers and populations was presented by Krone et al. (1986). It can be derived from (9) by omitting the nonlinearities f_p. An extended version of this model with


P = 7 populations and L = 4 connection layers appears in Fig. 3. The domains of the populations B_p can be identified with the connection layer C_l in which the somata of the population are located (small circles in Fig. 3). In the framework presented here, the resulting combination of a connection layer with a varying number of populations models an anatomical lamina. In the example of Fig. 3, layer 1 is a ‘molecular’ layer with connections but no cell populations at all, layers 2 and 4 each contain the somata of two populations, and layer 3 contains the somata of three populations. Note that molecular layers cannot be modelled without explicitly representing both dendritic and axonal arborizations.

The most interesting property of this model (Krone et al. 1986) is the overall spatiotemporal filtering performed by the network using simple Gaussian weight functions α_lp, δ_pl and first-order low-passes h_pl(t) (8) for the individual feedforward connections. The resultant kernels of the feedback structure are nonseparable; more specifically, their width changes with time. This result is reminiscent of the recursive filters used in image processing: complex nonseparable filters requiring three-dimensional convolutions can be realized recursively by simple separable filters requiring just a two-dimensional (spatial) plus a one-dimensional (temporal) convolution at each iteration. A comparison of the spatiotemporal properties of the layered network with neurophysiological findings has been given by Dinse et al. (1991).
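The differential form (9) can be integrated directly, e.g. by an Euler scheme. The sketch below uses one excitatory and one inhibitory population, a single connection layer, and no delays; all kernels, time constants, and gains are illustrative assumptions:

```python
import numpy as np

# Euler integration of the space-invariant differential form (9) without
# delays, for P = 2 populations and L = 1 connection layer.
# Kernels, time constants, and gains are illustrative assumptions.
n, dx, dt, steps = 101, 0.1, 0.05, 400
x = (np.arange(n) - n // 2) * dx
tau = {'E': 1.0, 'I': 0.5}
mu = {'E': 1.0, 'I': 1.0}

def gauss(sig, amp):
    g = np.exp(-x**2 / (2 * sig**2))
    return amp * g / (g.sum() * dx)         # kernel with total weight 'amp'

alpha = {'E': gauss(0.3, 1.0), 'I': gauss(0.8, -0.8)}   # signed axonal kernels
delta = {'E': gauss(0.2, 1.0), 'I': gauss(0.2, 1.0)}    # dendritic kernels
f = lambda u: np.tanh(np.maximum(u, 0.0))               # saturating transfer function
inp = 0.5 * np.exp(-x**2 / 0.5)                         # stationary external drive

u = {p: np.zeros(n) for p in 'EI'}
for _ in range(steps):
    # one shared connection layer pools the signed outputs of both populations
    s = inp + sum(np.convolve(f(u[q]), alpha[q], 'same') * dx for q in 'EI')
    for p in 'EI':
        du = -u[p] / tau[p] + mu[p] * np.convolve(s, delta[p], 'same') * dx
        u[p] = u[p] + dt * du

assert all(np.isfinite(u[p]).all() for p in 'EI')
```

With the saturating transfer function the activity relaxes to a bounded stationary profile shaped by the excitatory-center, inhibitory-surround kernel balance.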

4.3 Thalamo-cortical pathway

As an example of the interaction of multiple maps and layers, we discuss the representation of the binocular image in the ocular dominance stripes of the primary visual cortex in mammals. Figure 4 gives a simplified view of the initial stages of the visual pathway in terms of the connection layers involved. The cortex (right-hand box) consists of four connection layers with the corresponding cell populations, as was shown in Fig. 3. In addition, we consider two connection layers C_5, C_6 in the lateral geniculate nucleus (LGN): one for the ipsilateral input and one for the contralateral. Each of these layers contains two cell populations, one of which is excitatory (B_8, B_9) and one inhibitory (B_10, B_11). The retinal input is modelled by the excitation of two additional populations: B_12, B_13. In Fig. 4, the connections involving a topographic map are symbolized by heavy lines; as always, these are connections from a population to a connection layer. The retinothalamic maps R_5,12 and R_6,13 are discussed in detail in Mallot (1985); see also the Appendix. The feedback projections from population B_6 to connection layers C_5, C_6 are nonspecific and are modelled by identity maps and broad spread functions α.

Here, we will focus on modelling of the ocular dominance stripes by the two mappings R_3,8 and R_3,9. Let us denote by c_3,8(y), c_3,9(y) the relative dominance in the visual cortex of the inputs from the two eyes; in the absence of other inputs, we assume c_3,8(y) + c_3,9(y) ≡ c_o. If this distribution of dominance is to be generated by topographic maps, it has to be accompanied by appropriate magnifications. Consider two neighboring retinal points that are

mapped into different dominance stripes, i.e., with a stripe from the other eye in between. In this case, the magnification of the stripe-generating map is very large. Due to the relation between magnification and the ‘dilution’ of excitation discussed in relation to (4, 7), dominance will be low in regions where magnification is high. This leads to the following equation:

| det J_{R_3,8^{−1}}(y) | = c_3,8(y)   (10)

and an analogous equation for R_3,9.

Example. Assume now that the ocular dominance stripes form a straight regular pattern aligned with the y₂-axis on the cortical surface. The coordinate orthogonal to the stripes is y₁. Let further c_3,8(y) = c_o(1 + a cos 2πy₁) and c_3,9(y) = c_o(1 − a cos 2πy₁) for 0 < a < 1. From the regularity of the stripe pattern, it follows that R_3,8 is of the form

R_3,8(x₁, x₂) = (r_3,8(x₁), x₂)   (11)

for some function r_3,8(x₁), and therefore

det J_{R_3,8^{−1}}(y₁, y₂) = 1 / r′_3,8(r_3,8^{−1}(y₁)).   (12)

We obtain the solution

r_3,8(x₁) = (1/π) arctan( √((1 − a)/(1 + a)) tan πx₁ )   (13)

Here, the constant c_o has been set to 1/√(1 − a²). This solution holds in the interval x₁ ∈ (−1/2, 1/2). Outside this interval, it can be suitably continued. The solution for R_3,9 differs only by the sign of a. The first components of both functions for a = 0.9 are plotted in Fig. 5. Simulations of a special case of the network sketched in Fig. 4 have been published elsewhere (Mallot and Brittinger 1989). Details of the network equations can be found in Giannakopoulos (1989). Other cases of input segregation that can be modelled along the same lines are the mapping from the parvo- and magnocellular layers of the LGN to the appropriate blobs and interblobs in area V1, and further to the thick and thin stripes of area V2.

4.4 Reciprocal feedback

Visual cortical areas do not exist in isolation. In all species studied so far, there exist at least two cortical visual areas, one (V1, area 17) with a roughly circular shape and a second, elongated one (V2, area 18) surrounding it like a belt or crescent (Allman 1977; Krubitzer 1995). Where the retinotopic map of both areas has been studied, a continuous transition across the V1/V2 border is found, the area V2 map approximating a mirror image of the area V1 map and vice versa. Along the border, the retinal decussation is represented, which in cats and primates coincides with the vertical midline (vertical meridian) of the visual field. As a simple model of this situation, consider the network depicted in Fig. 6, where two populations B₁, B₂ are connected reciprocally via topographic maps. For the stationary case, we obtain from (4):

Fig. 5. First component r(x_1) of the thalamo-cortical mapping functions modelling ocular dominance stripes (13) with a = 0.9. Continuous line, r_{3,8}(x_1); broken line, r_{3,9}(x_1). Note that the graphs of r_{3,8} and r_{3,9} are mirror symmetric with respect to the diagonal (thin line)

  s_1(y) = \int_{B_2} e_2(x) \, \alpha_{12}(y - R_{12}(x)) \, dx        (14)

  s_2(y) = \int_{B_1} e_1(x) \, \alpha_{21}(y - R_{21}(x)) \, dx        (15)

These equations can be combined into one single equation by considering the joint domains B := B_1 ∪ B_2 and C := C_1 ∪ C_2. We can define the joint maps

  R : B \to C, \quad R(x) := \begin{cases} R_{21}(x), & x \in B_1 \\ R_{12}(x), & x \in B_2 \end{cases}        (16)

and

  Q : C \to B, \quad Q(y) := \begin{cases} Q_{11}(y), & y \in C_1 \\ Q_{22}(y), & y \in C_2 \end{cases}        (17)

Fig. 6. Reciprocal connection between two populations modelling areas V1, V2. For explanation see text

Figure 7 illustrates the situation for a simplified version of the V1/V2-mapping of the cat proposed by Mallot (1985). Figure 7a shows the maps R_{1,0} and R_{2,0} from the contralateral visual field to area V1 (shaded region) and area V2. In Fig. 7b, c, the reciprocal map R is shown. Note that for Q = id and C = B, the reciprocal map (16) has the property R = R^{-1}. The state variables s(y) and e(x) can also be defined piecewise on C and B, respectively. Let us further assume that the local connectivity is uniform over the joint domains, \alpha_{12} = \alpha_{21} =: \alpha, \delta_{11} = \delta_{22} =: \delta. With these definitions, (14, 15) can be rewritten jointly as

  s(y) = \int_B e(x) \, \alpha(y - R(x)) \, dx        (18)

In (18), the V1/V2 border is no longer modelled as a boundary condition. Unlike the situation in (14, 15), activity and axonal arborizations can now cross freely from one area to the other. If we apply the same argument to (2) and omit nonlinearity and density modulation, we obtain the feedback equation for reciprocally connected pairs of areas:

  e(x) = \int_B e(x') \, K(x, x') \, dx'        (19)

where

  K(x, x') = \int_C \alpha(y - R(x')) \, \delta(Q^{-1}(x) - y) \, dy        (20)

The homogeneous equation (19) is a fixed point equation whose solutions are eigen-patterns of activity on the joint domain B with eigenvalue 1. An inhomogeneous equation results if retinal input is also considered. The kernel is in general not symmetric. It describes the interaction of mapping (long-range connectivity) and convolution (short-range connectivity).

Example. As a simple example, we assume that R takes the form R(x_1, x_2) = (−x_1, x_2) (true mirroring at the vertical meridian), and Q = id. With B_1 = C_1 = R^+ × R and B_2 = C_2 = R^− × R we have B = C = R^2. Let us further assume that α and δ are rotationally symmetric and that the product of their Fourier transforms takes the value 1 on the circle \omega_1^2 + \omega_2^2 = \omega_r^2 for some \omega_r > 0. In this situation, excitation patterns of the form

  e(x) = \cos(\omega_r (x_1 \sin \varphi + x_2 \cos \varphi))        (21)

for ϕ = 0 (horizontal stripes) and ϕ = π/2 (vertical stripes) are solutions of (19), whereas stripes of other orientations are generally not. In addition, concentric solutions of the form

  e(x) = J_0(\omega_r \|x\|)        (22)

exist, where J_0 is the zeroth-order Bessel function. Reciprocal feedback thus enhances certain global patterns in input images while it attenuates others.
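The eigenpattern property can be checked numerically in one dimension for the vertical stripes (ϕ = π/2): mirror the pattern at the meridian, then apply a combined transfer function for α and δ. In the sketch below, the grid size, domain length, and Gaussian-bump transfer shape are our choices; only the value 1 at ω_r matters.

```python
import numpy as np

N, L = 512, 8.0                        # grid points, periodic domain length
x = np.arange(N) * L / N
wr = 2 * np.pi * 4 / L                 # resonant frequency: 4 cycles per domain
e = np.cos(wr * x)                     # candidate eigenpattern, phi = pi/2 in (21)

# Mirror map R(x) = -x on the periodic grid (vertical meridian at x = 0)
e_mirrored = e[(-np.arange(N)) % N]

# Combined transfer of alpha and delta: equals 1 at |omega| = wr, < 1 elsewhere
w = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
transfer = np.exp(-((np.abs(w) - wr) ** 2))

out = np.fft.ifft(np.fft.fft(e_mirrored) * transfer).real
assert np.allclose(out, e, atol=1e-10)   # the stripe pattern reproduces itself
```

A stripe pattern of intermediate orientation would be altered by the mirroring step and therefore fails to reproduce itself, in line with the text.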

5 The mapping magnification equation What do the mapping functions Rlp in the above equations look like? Besides direct measurements of topographic maps in, for example, the visual, auditory, and somatosensory pathways, there exist independent measurements of areal magnification factors, i.e., the local ratio of the area of a patch on the sensory surface (retina, basilar membrane, skin) and the area of its cortical representation. Fischer (1973) proposed that the mapping function for a neural representation can be inferred from the local density of retinal ganglion cells which he took as an estimate of the areal magnification factor. He proposed two maps with the same magnification factor: a radial compression and the complex logarithmic map (see also Schwartz 1980). This idea was subsequently extended to multiple cortical areas (Mallot 1985).

Fig. 7a–c. Analytic approximation of retinotopic mapping in the cat visual cortex, areas V1 and V2. a Mapping of the right visual field (inset in upper left). Area V1 [mapping R_{1,0} from (31)] is marked in gray. It is partially surrounded by area V2 [mapping R_{2,0} from (32)]. Continuous lines, vertical lines in the visual field; broken lines, horizontal lines in the visual field. The area centralis and its representation are indicated by a black dot. Note the bifurcation of the representation of the horizontal meridian in area 18. (Simplified from Mallot 1985; for the equations see Appendix.) b Detail of a around the area centralis representation. The black grid gives a coordinate system of the cortical surface. c Same as b with the grid transformed by the reciprocal mapping R from area V1 to V2 and vice versa

The basic idea of this approach is that the mapping from retina to visual cortex is accomplished with constant cellular magnification. Denoting the retinal ganglion cell density by ̺_r, we obtain from (6)

  \varrho_r(x) = c \, |\det J_R(x)|        (23)

for some constant c. We call (23) the 'mapping magnification equation'. It is a nonlinear, two-dimensional partial differential equation. Note that solutions of (23) for a given distribution of ganglion cell densities are not unique. Boundary conditions such as the shape of the outline of the target area have to be specified to determine the solution. The role of boundary effects in the construction of topographic maps from local rules (competitive learning) has been studied recently by Wolf et al. (1994). Furthermore, (23) is underdetermined since it provides just one equation for two unknowns, i.e., the two components of R. A number of additional constraints have been proposed that can be used as simultaneous equations. For instance, if only conformal solutions are sought, the Cauchy-Riemann equations can be used. In this case, (23) is transformed into the inhomogeneous eikonal equation of mathematical physics (cf. Zauderer 1989). Another constraint, which was used by Fischer (1973), is the assumption that radii are mapped to radii. Yet another constraint can be derived from the temporal organization of axon growth: newly developing axons are guided by axons already in place
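As a concrete instance of (23), consider the complex logarithmic map discussed by Fischer (1973) and Schwartz (1980): for R(z) = log z the Jacobian determinant is |R'(z)|^2 = 1/|z|^2, so the corresponding ganglion cell density is ̺(r) ∝ 1/r^2. A finite-difference sketch (the step size and sample points are arbitrary choices):

```python
import cmath

def R(z):
    return cmath.log(z)   # complex logarithmic map

def det_J(z, h=1e-5):
    # Jacobian determinant of (x, y) -> (Re R, Im R) by central differences
    dx = (R(z + h) - R(z - h)) / (2 * h)
    dy = (R(z + 1j * h) - R(z - 1j * h)) / (2 * h)
    return dx.real * dy.imag - dy.real * dx.imag

for z in [1 + 0j, 0.5 + 0.5j, 2 - 1j, 0.1 + 0.3j]:
    r2 = abs(z) ** 2
    assert abs(det_J(z) - 1 / r2) < 1e-6 / r2   # |det J| = 1/|z|^2
```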

and are therefore likely to terminate in a region adjacent to the termination site of the previous growth front. This mechanism leads to a 'chronotopic' organization (Molnár and Blakemore 1995) in which the growth fronts and the directions perpendicular to them will be mapped to an orthogonal pattern in the final representation.

In the following, we discuss a simple example with a rotationally symmetric ganglion cell density (Fig. 8). We introduce polar coordinates both in the retina [(x_1, x_2) = r(\cos ϕ, \sin ϕ)] and in the target area [(y_1, y_2) = s(\cos ϑ, \sin ϑ)]. With these coordinates, (23) becomes

  \varrho(r) = \frac{s}{r} \, |s_r \vartheta_\varphi - s_\varphi \vartheta_r|        (24)

(Here, the subscripts denote partial derivatives.) We assume that the radii ϕ ≡ const. of the retina are mapped to radii ϑ ≡ const. on the target area. This assumption amounts to the requirement ϑ_r ≡ 0. We briefly discuss two solutions for the resulting equation

  \varrho(r) = \frac{1}{r} \, s \, s_r \, \vartheta_\varphi        (25)
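The polar form (24) can be cross-checked against the Cartesian Jacobian determinant in (23) for any map that sends radii to radii. The test map below (s = r^2, ϑ = ϕ + 0.3 sin ϕ) is our own choice, picked only to make both s_r and ϑ_ϕ nontrivial:

```python
import math

def F(x, y):
    # Test map in polar form: s(r) = r^2, theta(phi) = phi + 0.3*sin(phi)
    r, phi = math.hypot(x, y), math.atan2(y, x)
    s, th = r * r, phi + 0.3 * math.sin(phi)
    return s * math.cos(th), s * math.sin(th)

def det_J(x, y, h=1e-5):
    # Cartesian Jacobian determinant by central differences
    u1, v1 = F(x + h, y)
    u0, v0 = F(x - h, y)
    u3, v3 = F(x, y + h)
    u2, v2 = F(x, y - h)
    return ((u1 - u0) * (v3 - v2) - (u3 - u2) * (v1 - v0)) / (4 * h * h)

for x, y in [(1.0, 0.2), (0.3, 0.7), (-0.5, 1.1)]:
    r, phi = math.hypot(x, y), math.atan2(y, x)
    rho = r * (2 * r) * (1 + 0.3 * math.cos(phi))  # (s/r) * s_r * theta_phi, as in (24)
    assert abs(det_J(x, y) - rho) < 1e-5 * rho
```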

Radial compression. By assumption, ϑ does not depend on r. We can thus write ϑ(r, ϕ) = f(ϕ) for some differentiable function f. Using the relation 2 s s_r = (s^2)_r, we obtain the solution

  s(r, \varphi) = \sqrt{ \frac{2}{f'(\varphi)} \int_0^r r' \varrho(r') \, dr' }        (26)

  \vartheta(r, \varphi) = f(\varphi)        (27)

Figure 8b shows this solution for the case ̺(r) = c r^{−0.6}, where f(ϕ) = ϕ and hence s(r, ϕ) ∝ r^{0.7}. In Fig. 8c the function f(ϕ) has been changed to ϕ − (1/2) sin ϕ. Note that the mapping of Fig. 8c does not correspond to a visual streak; the areal magnification factor has the same distribution in all parts of the figure. With ̺(r) = 1/r^2 outside some central disk r < r_o and f(ϕ) = ϕ, (26) becomes equivalent to equation 4 of Fischer (1973).

Conformal map. Another way to constrain ϑ_ϕ is by means of the Cauchy-Riemann equations, i.e., by looking for conformal solutions. The appropriate Cauchy-Riemann equation in polar coordinates is

  s_r = \frac{s}{r} \vartheta_\varphi        (28)

Substitution of (28) into (25) yields ̺(r) = (s_r)^2, which is easily integrated to give the solution s(r, ϕ) = \int \sqrt{\varrho(r)} \, dr. The corresponding solution for ϑ is found by going back to (25), this time eliminating s_r by means of (28). The result is

  \vartheta(r, \varphi) = \frac{s_r}{s} \, r \varphi = r \varphi \, \frac{\sqrt{\varrho(r)}}{\int \sqrt{\varrho(r)} \, dr}

Since we have assumed initially that ϑ does not depend on r, solutions can only exist if the fraction evaluates to c/r for some constant c. This implies that ̺(r) is a power function, ̺(r) = c r^p, p ≠ −2. Thus we have proven that conformal maps which map radii to radii and whose magnification factor is rotationally symmetric must be power functions. The solution reads

  s(r, \varphi) = \int_0^r \sqrt{\varrho(r')} \, dr' = \frac{2\sqrt{c}}{2+p} \, r^{1+p/2}        (29)

  \vartheta(r, \varphi) = \frac{2+p}{2} \, \varphi        (30)

It is depicted in Fig. 8d.

Fig. 8a–d. Examples of topographic mapping functions with the same distribution of areal magnification, ̺(r, ϕ) = r^{−0.6}. a Retinal grid whose image is used to illustrate the maps. If only the shaded area is considered, the shaded parts in the following parts of the figure can be considered models of area V1, while the unshaded parts model area V2 (see Mallot 1985). b Radial compression. c Radial compression with different compression factors along different radii. d Conformal map (complex power function)

6 Conclusion

In this paper we have summarized a number of results and approaches toward a continuous theory of cortical networks. We argue that the continuous approach is appropriate both for anatomical (huge numbers of cells, need for explicit representation of space) and for physiological reasons. In physiology, the spatiotemporal flow of excitation has indeed been studied on a level above that of single cells, i.e., by current source density, field potentials, optical recording, etc. In this sense, we hope that the level of description presented here will prove useful for the understanding of physiological results. The framework presented seems to be most powerful for the modelling of high-level features of cortical architecture. It is less well adapted for networks with strong space variances on a fine spatial scale. In particular, learning by synaptic weight dynamics is typically different for each neuron. Since, in the framework presented here, connectivity is constant for an entire population of cells, single-cell learning cannot be modelled (but see Yuille et al. 1989). When single cell properties are considered, the scheme might however be used to model the ‘milieu’ in which the single cell is embedded. The activity variables e and s can easily be used as background stimuli or biases modulating single-cell responses. Thus, the population network approach may prove useful as part of multiscale simulations of neural networks (e.g., Aertsen et al. 1989). The main focus of the type of model presented here, however, lies in the modelling of spatiotemporal flows of activity. The information-processing capabilities of spatiotemporal patterns of activity in layered, space-variant networks have been explored in a number of special cases, such as prey-predator discrimination in anurans (Ewert and von Seelen 1974; Cervantes-P´erez et al. 1985), binocular prey localization in salamanders (Manteuffel and Roth 1993; Wiggers et al. 
1995), saccade generation in the superior colliculus (Ottes et al. 1986), the construction of spatial or spatiotemporal receptive fields in the visual cortex by intrinsic feedback connections (Korn and von Seelen 1972; Krone et al. 1986; Sabatini 1996), or technical vision systems (von Seelen et al. 1995). The general framework presented here provides a common language for this class of models. Furthermore, it allows the modelling of a number of additional features of real cortices such as multiple maps, ocular dominance stripes, or cytochrome oxidase blobs, whose computational significance is only poorly understood.


Appendix. Mapping functions for Fig. 7

Figure 7a shows an approximation of the retinotopic maps of the visual areas V1 and V2 of the cat (simplified from Mallot 1985). Note that only the contralateral visual hemifield is represented in each cerebral hemisphere. If we identify this hemifield B_0 with the interval [0, 1] × [−1, 1], the maps have the following equations:

  R_{1,0} := S \circ L_1 \circ T \circ L_3        (31)

  R_{2,0} := S \circ L_2 \circ T \circ L_3        (32)

where

  T(x) = \|x\| \begin{pmatrix} \cos(0.43 \arctan x_2/x_1) \\ \sin(0.43 \arctan x_2/x_1) \end{pmatrix}, \quad S(x) = x \, \|x\|^{-0.57}

  L_1(x) = x \begin{pmatrix} 0.7 & 0.1 \\ -0.2 & 0.9 \end{pmatrix}, \quad L_2(x) = x \begin{pmatrix} -0.225 & 0.1 \\ 0.4 & 0.9 \end{pmatrix}, \quad L_3(x) = x \begin{pmatrix} 9.0 & 0.0 \\ -4.0 & -10.0 \end{pmatrix}
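The compositions (31, 32) can be evaluated directly. In the sketch below we read the radial exponent of S as −0.57, so that S∘T acts like the complex power map z ↦ z^{0.43}; the sign of the exponent is our reading of an ambiguous extraction. atan2 stands in for arctan(x_2/x_1), and all function names are ours.

```python
import math

def T(x):
    n = math.hypot(x[0], x[1])
    th = 0.43 * math.atan2(x[1], x[0])       # angle compressed by factor 0.43
    return (n * math.cos(th), n * math.sin(th))

def S(x):
    n = math.hypot(x[0], x[1])
    return (x[0] * n ** -0.57, x[1] * n ** -0.57)   # radius r -> r^0.43

def lin(M, x):
    # x treated as a row vector, as in the appendix: L(x) = x M
    return (x[0] * M[0][0] + x[1] * M[1][0], x[0] * M[0][1] + x[1] * M[1][1])

L1 = ((0.7, 0.1), (-0.2, 0.9))
L3 = ((9.0, 0.0), (-4.0, -10.0))

def R10(x):
    return S(lin(L1, T(lin(L3, x))))         # R_{1,0} = S . L1 . T . L3, (31)

p = (0.5, 0.25)
# T preserves the radius and compresses the angle by 0.43
assert abs(math.hypot(*T(p)) - math.hypot(*p)) < 1e-12
assert abs(math.atan2(T(p)[1], T(p)[0]) - 0.43 * math.atan2(p[1], p[0])) < 1e-12
# S . T maps radius r to r^0.43 under our sign reading
assert abs(math.hypot(*S(T(p))) - math.hypot(*p) ** 0.43) < 1e-12
assert all(map(math.isfinite, R10(p)))
```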



Acknowledgements. We would like to thank Werner von Seelen, who introduced us to this field and shaped our thinking. The work described in this paper was in part supported by the Deutsche Forschungsgemeinschaft grant Ma 1038/5-1. We are grateful to Matthias Franz and Almut Schüz for helpful comments on earlier versions of this manuscript.

References

Aertsen AMHJ, Gerstein GL, Habib MK, Palm G (1989) Dynamics of neuronal firing correlation: modulation of 'effective connectivity'. J Neurophysiol 61:900–917
Allman J (1977) Evolution of the visual system in the early primate. Prog Psychobiol Physiol Psychol 7:1–53
Amari S-I (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27:77–87
Amari S-I, Arbib MA (1977) Competition and cooperation in neural nets. In: Metzler J (ed) Systems neuroscience. Academic Press, New York, pp 119–165
Beurle RL (1956) Properties of a mass of cells capable of regenerating pulses. Phil Trans R Soc Lond B 240:55–94
Blake R, Wilson HR (1991) Neural models of stereoscopic vision. Trends Neurosci 14:445–452
Braitenberg V, Schüz A (1991) Anatomy of the cortex. Statistics and geometry. Springer-Verlag, Berlin Heidelberg New York
Cervantes-Pérez F, Lara R, Arbib MA (1985) A neural model of interactions subserving prey-predator discrimination and size preference in anuran amphibia. J Theor Biol 113:117–152
Chipalkatti R, Arbib MA (1988) The cue integration model of depth perception: a stability analysis. J Math Biol 26:235–262
Cowan J, Ermentrout G (1978) Some aspects of the 'eigenbehavior' of neural nets. In: Levin S (ed) Studies in mathematical biology, part I: Cellular behavior and the development of pattern. MAA, Washington DC, pp 67–117
Dinse HRO, Krüger K, Mallot HA, Best J (1991) Temporal structure of cortical information processing: cortical architecture, oscillations, and nonseparability of spatiotemporal receptive field organization. In: Krüger J (ed) Neuronal cooperativity. Springer-Verlag, Berlin Heidelberg New York, pp 68–104
Ermentrout GB, Cowan JD (1979) Temporal oscillations in neuronal nets. J Math Biol 7:265–280
Ermentrout GB, Cowan JD (1980) Large scale spatially organized activity in neural nets. SIAM J Appl Math 38:1–21
Ewert J-P, von Seelen W (1974) Neurobiologie und System-Theorie eines visuellen Muster-Erkennungsmechanismus bei Kröten. Kybernetik 14:167–183
Feldman J, Cowan J (1975) Large-scale activity in neural nets. I. Theory with application to motoneuron pool responses. Biol Cybern 17:29–38
Fischer B (1973) Overlap of receptive field centers and representation of the visual field in the cat's optic tract. Vision Res 13:2113–2120
Giannakopoulos F (1989) Nichtlineare Systeme zur Beschreibung geschichteter neuronaler Strukturen. PhD Thesis, Department of Mathematics, Johannes Gutenberg-Universität, Mainz, Germany
Giannakopoulos F, Oster O (in press) Bifurcation properties of a planar system modelling neural activity. J Differ Equat Dyn Syst
Heiden U an der (1980) Analysis of neural networks. (Lecture notes in biomathematics, vol 35) Springer-Verlag, Berlin Heidelberg New York
Korn A, von Seelen W (1972) Dynamische Eigenschaften von Nervennetzen im visuellen System. Kybernetik 10:64–77
Krone G, Mallot HA, Palm G, Schüz A (1986) The spatio-temporal receptive field: a dynamical model derived from cortical architectonics. Proc R Soc Lond B 226:421–444
Krubitzer L (1995) The organization of neocortex in mammals: are species really so different? Trends Neurosci 18:408–417
Lund JS (1988) Anatomical organization of macaque monkey striate visual cortex. Annu Rev Neurosci 11:253–288
Mallot HA (1985) An overall description of retinotopic mapping in the cat's visual cortex areas 17, 18, and 19. Biol Cybern 52:45–51
Mallot HA (1995) Layered computation in neural networks. In: Arbib MA (ed) The handbook of brain theory and neural networks. MIT Press, Cambridge, Mass, pp 513–516
Mallot HA, Brittinger R (1989) Towards a network theory of cortical areas. In: Cotterill RMJ (ed) Models of brain function. Cambridge University Press, Cambridge, pp 175–189
Mallot HA, Giannakopoulos F (1992) Activation dynamics of space-variant continuous networks. In: Taylor JG, Caianiello ER, Cotterill RMJ, Clark JW (eds) Neural network dynamics. Springer-Verlag, Berlin Heidelberg New York, pp 341–355
Mallot HA, von Seelen W, Giannakopoulos F (1990) Neural mapping and space-variant image processing. Neural Networks 3:245–263
Manteuffel G, Roth G (1993) A model of the saccadic sensorimotor system of salamanders. Biol Cybern 68:431–440
Molnár Z, Blakemore C (1995) How do thalamic axons find their way to the cortex? Trends Neurosci 18:389–397
Northcutt RG, Kaas JH (1995) The emergence and evolution of mammalian neocortex. Trends Neurosci 18:373–379
Ottes FP, van Gisbergen JAM, Eggermont JJ (1986) Visuomotor fields of the superior colliculus: a quantitative model. Vision Res 26:857–873
Patton P, Thomas E, Wyatt RE (1992) A computational model of vertical signal propagation in the primary visual cortex. Biol Cybern 68:43–52
Peters A (1985) Visual cortex of the rat. In: Peters A, Jones EG (eds) Cerebral cortex, vol 3, Visual cortex. Plenum Press, New York London, pp 19–80
Sabatini SP (1996) Recurrent inhibition and clustered connectivity as a basis for Gabor-like receptive fields in the visual cortex. Biol Cybern 74:189–202
Schwartz EL (1980) Computational anatomy and functional architecture of striate cortex: a spatial mapping approach to perceptual coding. Vision Res 20:645–669
Seelen W von (1968) Informationsverarbeitung in homogenen Netzen von Neuronenmodellen I. Kybernetik 5:133–148
Seelen W von, Mallot HA, Giannakopoulos F (1987) Characteristics of neuronal systems in the visual cortex. Biol Cybern 56:37–49
Seelen W von, Bohrer S, Kopecz J, Theimer WM (1995) A neural architecture for visual information processing. Int J Comput Vision 16:229–260
Somers DC, Nelson SB, Sur M (1995) An emergent model of orientation selectivity in cat visual cortical simple cells. J Neurosci 15:5448–5465
Stemmler M, Usher M, Niebur E (1995) Lateral interactions in primary visual cortex: a model bridging physiology and psychophysics. Science 269:1877–1880
Widrow B, Hoff ME (1960) Adaptive switching circuits. Convention record, IRE WESCON, New York


Wiggers W, Roth G, Eurich C, Straub A (1995) Binocular depth perception mechanisms in tongue-projecting salamanders. J Comp Physiol 176:365–377
Wilson HR, Cowan JD (1973) A mathematical theory of functional dynamics of cortical and thalamic nervous tissue. Kybernetik 13:55–80
Wolf F, Bauer H-U, Geisel T (1994) Formation of field discontinuities and islands in visual cortical maps. Biol Cybern 70:525–531
Yuille AL, Kammen DM, Cohen DS (1989) Quadrature and the development of orientation selective cortical cells by Hebb rules. Biol Cybern 61:183–194
Zauderer E (1989) Partial differential equations of applied mathematics, 2nd edn. Wiley, New York