Embodied Intelligence Through Coordination

Jan 1, 2019

Embodied Intelligence Through Coordination Dynamics in the Brain-Body Complex

Kole Harvey

In both the brain and body we see an intrinsic tendency to form patterns of coordination between individual components, or in other words the tendency to self-organize into a complex system capable of emergent behavior far more complex than any of its components. In this article we review theoretical and modeling work over the past three decades from the fields of coordination dynamics, which has studied the nature of this cooperation in detail, neuroscience, which has studied functional networks of cooperating components in the brain, and robotics, which has studied self-organizing patterns in non-linear systems. We attempt to provide a novel perspective that encapsulates this work in a manner that points to its utility in the development of embodied (robotic) intelligence. Notably, we seek to stimulate discussion on alternative forms of robotic control based on the way the human brain coordinates with the body in a rich environment and ask whether it holds the key to surmounting the intractable nature of high dimensional non-linear dynamical systems. We then expand this discussion using a set of uncovered principles to describe how we could use existing knowledge to develop autonomous intelligent or ‘cognitive-like’ behavior at the infant level.

Highlights

• How to constrain the degrees of freedom of an intelligent agent so that it can develop meaningful complex behavior
• How to achieve autonomous behavior from the dynamic interplay between processes at the brain, body, and environment level
• The development of open ended learning on top of coordination dynamics in the brain and body
• The emergence of metastable, contextually appropriate behavior which can robustly achieve goals while at the same time remain adaptive to environmental contingencies at multiple timescales
• What aspects of the complex intricate workings of the brain should be focused on when trying to develop intelligent embodied agents

I: Embodied Behavior and Coordination Dynamics

1. Embodied, embedded behavior through coupled oscillators

First we will start from a specific example from Kuniyoshi and colleagues and note certain aspects of it that can be generalized for embodied (robotic) intelligent agents. Kuniyoshi and Sangawa (2006) model the spontaneous ‘generalized movements’ (GM) performed by the fetus from 2 months into gestation until birth. These movements involve the entire body and lead to the development of a fundamental motor repertoire constituting the child’s first attempts at coordinating brain, body, and environment. Specifically, the main actors in the emergence of GM which can be modeled are the muscles, spindles, spine, medulla, and cortex. This model allows coupling between neural, body, and environment units, themselves modeled by central pattern generators (CPGs), which are abstractions of the kind of oscillators found in the spinal circuit of vertebrates.

While CPGs are generally used to create monostable limit cycle behavior, networked CPGs can switch between limit cycle modes to produce multistable behavior. The difficulty of finding the correct parameters to generate interesting behavior in such circuits is a particular limitation, but they can be extended to a network of non-linearly coupled chaotic elements through the use of a globally coupled map (GCM). A GCM uses a mean field which indirectly links each element to every other, resulting in a network that can support specific modes of behavior depending on the degree of coupling between elements. In the absence of coupling, each element behaves independently; when coupled, the behavior of the elements ranges from complete synchronization, to ordered phases with multiple clusters of entrained elements, to partially ordered phases with continuously shifting cluster configurations, to random/chaotic behavior.
This range of behavior is a generalizable property resulting from the competing tendencies of each unit either to segregate from the others or to integrate with them by synchronizing via the mean field. The ability to shift between different configurations (networks) of clusters is an example of metastability and allows for adaptive behavior. A further novelty added by Kuniyoshi is to couple the (chaotic) elements of the spinal network of ‘motor neuron’ oscillators through the environment via a muscle/spindle pair instead of a generic mean field. In this way, the body/environment dynamics are a medium through which coordination patterns emerge, leading to the self-organization of inherently meaningful coordination patterns fundamentally derived from the agent’s physical constraints.

A simulated medulla oblongata consisting of Bonhoeffer-van der Pol (BVP) oscillators is also included, which can create either cyclic (classic CPG behavior) or chaotic motor patterns, depending on an external control signal. This area essentially plays the role of ‘midbrain behavioral selection’ as postulated in (Humphries, Gurney, and Prescott 2007). The result is that the whole system runs itinerantly through quasi-stable ‘ghost’ attractors. As a whole, the system is continuously changing its dynamic structure and thus explores different behaviors without converging to a stable attractor. These itinerant dynamics act as a search process to explore the possible behaviors of the body-environment space. Lastly, the model adds a variant of a Kohonen map to act as sensorimotor cortex (M1/S1), which can learn coherent coordination patterns as they are explored in the system, to be ‘reloaded’ later as top-down commands for the body. In conclusion, the model incorporates a minimal set of components required for a demonstration of embodied, embedded behavior.
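The globally coupled map described in this section can be illustrated with a minimal sketch. It assumes the common textbook form of a GCM, in which each element is a chaotic map f(x) = 1 - a x^2 and every element is pulled toward the mean field with strength eps. The function names, the parameter values (a = 1.8, 50 elements, 200 steps) and the use of `spread` as a crude synchronization measure are illustrative choices, not details taken from Kuniyoshi's model, which couples its elements through a simulated body rather than a generic mean field.

```python
import random


def local_map(x, a):
    """Chaotic local map f(x) = 1 - a*x^2 (maps [-1, 1] into itself for 0 < a <= 2)."""
    return 1.0 - a * x * x


def gcm_step(xs, a, eps):
    """One update of a globally coupled map.

    Each element mixes its own dynamics (weight 1 - eps) with the
    mean field of all elements (weight eps).
    """
    fx = [local_map(x, a) for x in xs]
    mean_field = sum(fx) / len(fx)
    return [(1.0 - eps) * f + eps * mean_field for f in fx]


def run_gcm(n=50, a=1.8, eps=0.1, steps=200, seed=0):
    """Iterate the map from random initial conditions and return the final state."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        xs = gcm_step(xs, a, eps)
    return xs


def spread(xs):
    """Distance between the largest and smallest element (0 means full synchrony)."""
    return max(xs) - min(xs)


if __name__ == "__main__":
    # Sweep the coupling strength to compare uncoupled and coupled behavior.
    for eps in (0.0, 0.1, 0.3, 0.8):
        print(f"eps={eps:.1f}  spread={spread(run_gcm(eps=eps)):.6f}")
```

With eps = 0 the elements evolve independently, while strong coupling collapses them onto a single trajectory; intermediate values are where clustered and partially ordered regimes would be expected, although this sketch does not attempt to classify them.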

2. Generalization

Let us first briefly summarize the five key ideas that Kuniyoshi’s model incorporates.

Self-organization: There is no central homunculus; the model takes advantage of the emergent behavior of complex systems to engage in multiple spontaneous behaviors and dynamic transitions between them.
Coherence: Individual parts cooperate; the whole is greater than the sum of its parts. When the body moves it is a result of the collaboration between musculoskeletal-neural units to form ‘synergies’.
Embodiment/embeddedness: Neural dynamics are constrained by constant dynamic interaction with the environment through a physical body. The ‘mind’ results from brain-body-environment interaction.
Metastability: The ability to transiently converge near ‘solution’ regions without getting stuck in attractors, ensuring a continuous, fluent, adaptive response to the ever changing environment.
Complementarity: A balance between segregation and integration, as relative coordination both within and between spinal/midbrain/cortical areas, allows for propagation of information through transiently formed functional networks.

Our main hypothesis in this article is that the above set of properties together hold the potential to create a tractable, realistic solution to the learning problem of robotic agents with high degrees of freedom (DOF). However, in order to explore these properties outside of any given implementation we require a compatible theory which can describe them at a sufficient level of abstraction while also allowing for their application in the analysis of bodily and neural behavior. We posit that the field of coordination dynamics does just this. Coordination dynamics aims at finding and explaining dynamic, functional orderings among interacting components, or in other words their dynamic coordination patterns, and how they emerge from non-linear coupling of those components.
Coordination dynamics is a relevant field of study for understanding the emergence of ‘cognitive’ function from the brain and its complex interactions with the body and environment since it provides an answer to the ‘coordination problem’, defined as “how, for any given cognitive function, the non-linear coupling among component parts gives rise to a wide variety of complex, coordinated behaviors.” (Bressler and Kelso 2001, p. 26) We are motivated to extend these ideas to the cortical level and explain how neuronal populations can dynamically form functional networks – clusters of cooperating brain areas that transiently cohere to one another in order to perform some emergent function not possible by any of the areas on their own. We see a direct correspondence between the ‘mental’ workings of the brain and the ‘motor’ workings of the body, which both consist of dynamic groupings or ‘synergies’ of components which cooperate through non-linear coupling to create novel function. By surmounting the DOF problem in both cases, an intractable set of infinite possibilities – the body-mental search space – is reduced to a tractable subset, directly informed by the bodily, environmental, and social constraints that shape our form of life. It remains for us to discuss how this contributes to the formation of intelligent behavior which can be used to control complex embodied robotic agents. The key question we seek to ask in this article is what steps we can take to use the principles of coordination dynamics to extend the fetus-level agent of Kuniyoshi’s coupled chaotic system to an infant-level agent – capable of learning and executing sequences of ‘cognitive’ intentional behavior in a real world environment.

3. Using coordination dynamics to describe the brain and body

Coordination dynamics tells us how patterns of interdependency change over time. The processes of pattern formation and change lie at the crux of complex systems science, where the interconnectedness of the elements of a system tends to exert more influence on the development of macro-level behavior than the individual details of any one component. Logically then we should expect that coordination dynamics is well suited to help in demystifying the complex interactions between the brain, body and environment. In coordination dynamics, the discussion centers on the main collective variables which can describe interactions between components while ignoring irrelevant details. Generally, the relative phase relationship is chosen for this role as it represents the spatiotemporal order between components in a succinct way. Further, relative phase changes between neural populations (e.g. cortical regions) operate on a scale slower than individual neurons and so are more comparable to the dynamics of actual behavior. This variable is shown to go through abrupt phase transitions at important points in the dynamics (bifurcations), which correspond to behavioral switches. In this respect it is similar to Ising models, which also describe macro-level phenomena using highly interconnected ‘simple agents’ representing, for example, individual molecules. Coordination dynamics can be summed up elegantly using the HKB model (Haken, Kelso, and Bunz 1985), which describes the dynamics of a 1:1 coordination between two oscillatory components:

$$\dot{\phi} = \delta\omega - a\sin\phi - 2b\sin 2\phi + \sqrt{Q}\,\xi_t \qquad (1)$$

This equation relates the relative phase ϕ to its rate of change and thus describes the dynamic coordination patterns between two components, parameterized by the frequency difference δω between the two components and the strength of coupling between them, b/a, as well as a noise component ξ with strength Q. The HKB model allows for an intuitive description of the way individual body or brain components can come into and out of synchronization with one another, and is thus well suited to describing the transient cooperation that is fundamental to the workings of the brain-body complex. Focusing on abstract quantities such as frequency and relative phase between oscillators allows for an account of ‘functional network’ formation in the brain, as individual components can organize into temporary relationships by synchronizing their phase. This synchronization between non-linear oscillators can take three forms: monostability, a single stable state to which the relative phase is drawn and where it stays; multistability, multiple stable states under identical boundary conditions, each of which the system may settle into depending on its history; and metastability, the absence of stable attractors but with ‘ghost’ attractors in their place, where the relative phase can hover for a while before eventually leaving again, leading to itinerant dynamics.

It is metastability that we are most interested in pursuing further, as it provides the means for transient coherence between components without sacrificing the ability to adapt to the ever changing circumstances of the body, environment, and other brain areas. Metastability is fundamentally linked to the complementarity of the opposing tendencies of segregation and integration – each cortical (or bodily) component has an innate tendency to some particular dynamical regime, while the connectivity between components leads to another tendency for components to sacrifice some of their independence in order to collaborate with one another. Our thesis is that we can learn about the processes behind intelligent behavior through the study of interaction between oscillatory neural patterns and embodied behavior, that is, self-organized complex system dynamics, without modeling all the way down to the microscopic details of individual spiking neurons. Indeed, in (Kelso, Fuchs, and Jirsa 1999; Kelso, Dumas, and Tognoli 2013), the brain’s workings are divided into three basic layers – the neuron layer, the neural field layer, and the behavioral layer. Previous such ‘multiscale’ models, including those relying on neural field theory, are discussed in Section 5. We posit here that by abstracting away from individual details and focusing on the level of the neural field and up, it may be possible using present day knowledge to achieve at least infant level artificial intelligence, given the right approach. In addition to Jirsa’s work on neural field descriptions of the brain, there is a line of research originating from Schöner’s Dynamic Field Theory (DFT) (Schöner and Spencer 2015) that, similarly to the present article, attempts to link dynamics at the neural field layer to robotic control (Erlhagen and Schöner 2002; Lomp et al. 2016; Zibner et al. 2011).
However, the models frequently seen in these discussions tend to focus on static, predefined dynamic landscapes and explain only isolated phenomena. Further, they do not discuss how such fields can self-organize autonomously and generally only seek to give a descriptive account of cognitive dynamics. We believe it is for this reason that robotic implementations of DFT have yet to show marked improvements over traditional, cognitivist/computational forms of robotic control. Nevertheless, there is certainly compatibility between DFT and coordination dynamics in complex systems; the key to bridging the gap between them is presumably the introduction of self-organization and metastability, allowing for spontaneous generation of meaningful patterns in the modeled neural field. We can certainly imagine an agent that is fundamentally constrained by its physical manifestation from the bottom up in embodied/embedded fashion – always already in the world – but which relies on the emergent powers of complex systems to do more than just linearly couple individual areas, obtaining a whole more powerful than the sum of its parts. Adding to this the adaptability of metastable functional networks, which can form and disperse as the situation demands, can lead us, we posit, to a form of intelligence that goes beyond the limitations of traditional robotics and AI.
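The three regimes of the HKB model discussed in this section can be explored numerically. The following is a minimal sketch, assuming the widely used extended HKB form in which the rate of change of ϕ is δω − a sin ϕ − 2b sin 2ϕ plus noise of strength Q, integrated with a simple Euler-Maruyama step. The function names, step size, and parameter values are illustrative assumptions rather than values taken from the cited work.

```python
import math
import random


def hkb_rate(phi, d_omega, a, b):
    """Deterministic part of the extended HKB equation for relative phase phi."""
    return d_omega - a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi)


def simulate_hkb(phi0, d_omega=0.0, a=1.0, b=1.0, q=0.0,
                 dt=0.01, steps=5000, seed=0):
    """Euler-Maruyama integration; returns the unwrapped relative phase trajectory."""
    rng = random.Random(seed)
    phi = phi0
    trajectory = [phi]
    for _ in range(steps):
        # Gaussian white noise scaled by sqrt(Q * dt); skipped when q == 0.
        noise = math.sqrt(q * dt) * rng.gauss(0.0, 1.0) if q > 0.0 else 0.0
        phi += hkb_rate(phi, d_omega, a, b) * dt + noise
        trajectory.append(phi)
    return trajectory


if __name__ == "__main__":
    # Multistable: with b/a > 1/4 both in-phase (0) and anti-phase (pi) attract.
    print("bistable   :", simulate_hkb(0.5)[-1], simulate_hkb(2.8)[-1])
    # Monostable: with b/a < 1/4 the anti-phase state loses stability.
    print("monostable :", simulate_hkb(2.8, b=0.1)[-1])
    # Metastable: a large frequency difference removes all fixed points, so the
    # phase keeps advancing but slows down near the remnant ('ghost') of the
    # former attractor.
    print("metastable :", simulate_hkb(0.0, d_omega=2.8)[-1])
```

In the last case the relative phase never settles, yet it spends a large share of each cycle near the region where the attractor used to be, which is the hovering behavior described above. Setting `q` above zero adds the stochastic term, which in the bistable regime can trigger switches between the two states.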

4. Constraint satisfaction and solution finding

In Kuniyoshi’s work we see how a collection of non-linear components can, through interconnectivity, come together to achieve stable balanced behavior; at each transition, the system as a whole converges to an ‘energy minimum’ that satisfies the most constraints at each of the junctions connecting individual systems: between environment and body, between individual body parts, between body and subcortical system, and between subcortical and cortical systems. We see a direct correspondence between this kind of self-organizing synergy emergence and the theory of solution finding as expressed in (Bressler and Kelso 2001). Specifically, individual areas can find solutions to local ‘problems’ (sets of constraints) through a relaxation process; at a somewhat slower timescale, those areas that have found local solutions are then available to coordinate with other areas through increased coherence (relative coordination). In this way, local solutions are integrated into a global solution which temporarily generates a specific functional network which holds all its members near phase synchrony in the metastable regime. This process can continue at greater spatiotemporal scales to eventually reach a global energy minimum which incorporates the entire brain, and constitutes a ‘perceptuo-motor Gestalt’ which satisfies the most environmental/bodily/mental constraints possible given the current available behavioral repertoire. Notably, perception and action are in parallel – there is no need to convert the perceptual Gestalt into action through some hidden process of ‘cognition’; as the global state has reached consensus, the motor areas and body are already part of this consensus – in effect the ‘action’ is already underway, and is not so much a response as an ongoing adaptation that is meaningful at all levels. Nevertheless, the motor components of this network can temporarily become more independent as they process the ongoing action which has already been ‘worked out’ by the global network (Cohen and D’Esposito 2016); in this way we see how the current discussion absorbs traditional views of behavior based on perception-action cycles.
We have seen how physical constraints might reduce the search space for coupled chaotic systems such as that of Kuniyoshi and further in work such as (Mori and Kuniyoshi 2010; Mori, Okuyama, and Asada 2013; Shim and Husbands 2012), but what about in the brain? What constraints might be used to make the search process tractable there? First of all, we know that the brain, via input from its sensory systems, is constrained by the physical body and the kinds of interactions that can be potentially had in its environment. Thus input and output are reduced to a ‘physically possible’ subset. Additionally, the midbrain’s more archaic methods of behavioral selection impose another level of constraint that limits behaviors to coherent modes of body-environment and intra-body relationships. As the fetus develops, we can further imagine that there would be a bias for development of salient or intrinsically pleasing modes of behavior, leading to what we may call a ‘hedonic bias’. That is not to say that behaviors which harm or displease the fetus-stage organism are not possible, but rather their avoidance would lead to a certain developmental bias which carves out a particular niche of behavioral space and eventually leads to a kind of ‘sensorimotor identity’. There are further constraints that may play a role in limiting the functional connectivity patterns of the brain at the scale of the neural field. We have discussed how the formation of a functional network, or the partial integration of various areas, can be thought of as a form of alignment where the relative phase of each area temporarily settles near a metastable ‘ghost’ attractor, and does so in a manner so as to minimize the energy of the system (relaxation). 
Indeed this resembles the way in which Hopfield nets and other artificial neural networks operate to solve externally posed problems, with the distinction that the network in those cases settles to a stable state in the finite ‘attractor landscape’ encoded in the structure of the network. By contrast, metastable functional networks allow for a level of flexibility beyond that afforded by the semi-permanent links in the synaptic substrate atop which they form. Such a network achieves temporary coherence – a solution to the complex set of constraints that the unique combination of environment and bodily state imposes on it at present – and in doing so demonstrates the robust, adaptive, self-organizing, flexible form of ‘behavioral response’ which we so desire. If we accept this premise, it is easy to see how the same principle could play out at multiple scales – linking not just intra-area components but acting at the inter-area, intra-hemisphere, and brain-wide scale, resembling the kind of ‘cell-assemblies’ first posited by (Hebb 1949) and theorized by many (Pulvermüller, Garagnani, and Wennekers 2014) as the key to mesoscale functionality in the brain.
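For comparison, the Hopfield-style relaxation mentioned in this section can be sketched in a few lines. This is the contrast case, not the metastable mechanism the article argues for: the network below stores two patterns with a Hebbian rule and then descends its energy function until it settles into a fixed attractor. The function names, the choice of two 8-unit patterns, and the sweep-based update order are illustrative assumptions.

```python
def train_hopfield(patterns):
    """Hebbian weights for +1/-1 patterns, with no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, s):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def relax(w, state, max_sweeps=10):
    """Asynchronous updates, one unit at a time, until no unit changes.

    Returns the settled state and the energy recorded after every update.
    """
    s = list(state)
    energies = [energy(w, s)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            # Keep the current value when the local field is exactly zero.
            new = s[i] if h == 0 else (1 if h > 0 else -1)
            if new != s[i]:
                s[i] = new
                changed = True
            energies.append(energy(w, s))
        if not changed:
            break
    return s, energies


if __name__ == "__main__":
    p1 = [1, 1, 1, 1, -1, -1, -1, -1]
    p2 = [1, -1, 1, -1, 1, -1, 1, -1]
    weights = train_hopfield([p1, p2])
    cue = [-1] + p1[1:]  # stored pattern with one unit corrupted
    settled, energies = relax(weights, cue)
    print(settled == p1, energies[0], energies[-1])
```

The energy never increases along the way, so the network ends in whichever stored minimum is nearest the cue and stays there. A metastable network in the sense used here would instead linger near such a configuration and then move on as its constraints change.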

5. What we can learn from neural field theory

Jirsa & Haken’s neural field model uses ‘conversion operations’ to define relations between firing rates and local field potentials in the brain, in other words how to link the scales of individual neurons and neuronal populations (Haken and Jirsa 1996). Further, (Jirsa et al. 2002) describes how this theory can be used to describe brain function and behavior at multiple levels of description: neural ensembles at one level, and their interactions governed by heterogeneous connections between cortical areas at another. Spatiotemporal mappings between the modeled (neocortical) activity and input/output signals called ‘functional units’ can then be used to compare the model to EEG/MEG/behavioral movements. This formalization of the workings of the brain is further developed into a ‘Virtual Brain’ model in (Jirsa et al. 2010). Here we see a multi-level model of the brain with the aforementioned heterogeneous connections between ensembles. This model serves as an ideal research platform for investigating the relation between structural connectivity and the dynamics of the active brain, including other influences on the dynamics such as transmission delays, noise, and external input. Additionally, ideas from graph theory can be used to classify motifs in the brain network such as functional modules – sets of functionally related brain regions connected to one another that jointly act towards certain tasks or form sensory/motor systems (Dosenbach et al. 2007), and hub nodes – which are highly connected regions of the brain mainly in the frontal/parietal lobes that form a core or ‘rich club’ between one another (Friston et al. 2008). Hub nodes in particular have been posited by Shanahan as a solution to the coalition formation problem (Shanahan 2012), which seeks to determine how it is that a ‘coalition’ of distributed brain processes can form when it is not hardwired in advance.
His solution is that a ‘connective core’ (a rich club) of hub nodes central to the entire network can allow for the control and competition of coupling relations throughout the entire brain, and thus act as a ‘conductor’ of sorts for the transient coupling and decoupling of other brain regions even before permanent structural links form to ‘lock in’ such couplings. If true, this theory would allow us to postulate how novel behaviors can arise in the absence of any particular network structure encoding them, and would solve the chicken-and-egg problem of learning mechanisms in the brain. Multiscale models of the brain such as in (Deco et al. 2008; Jirsa et al. 2010) thus give us insight into how to focus on certain aspects (for example dynamical and network patterns, delays, etc.) conducive to brain functionality without getting bogged down in detail. Studying these aspects can reveal recursive/hierarchical structure of spatiotemporal scales which can elucidate universal dynamical principles and aid in reducing the target of study to fewer components. The success of the Virtual Brain model in reproducing resting state/default mode network activity is very encouraging, and in tandem with the development of these sorts of models we should expect to be able to find those most important principles that can be abstracted away from the brain and used to drive an intelligent agent based on simplified models of network behavior.
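The graph-theoretic notions of hub nodes and a rich club used in this section can be made concrete on a toy network. The sketch below assumes the standard unnormalized rich-club coefficient, the density of connections among nodes whose degree exceeds k. The network itself (four fully interconnected hubs, each with three peripheral nodes) and all function names are hypothetical and chosen only for illustration; real connectome analyses typically also normalize against randomized networks, which is omitted here.

```python
from itertools import combinations


def degrees(edges):
    """Number of connections per node in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def rich_club_coefficient(edges, k):
    """phi(k) = 2 * E_k / (N_k * (N_k - 1)) over the nodes with degree > k."""
    deg = degrees(edges)
    rich = {node for node, d in deg.items() if d > k}
    if len(rich) < 2:
        return 0.0
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2.0 * e_rich / (len(rich) * (len(rich) - 1))


def toy_network(n_hubs=4, leaves_per_hub=3):
    """Hypothetical network: a fully connected core of hubs plus peripheral nodes."""
    hubs = [f"hub{i}" for i in range(n_hubs)]
    edges = list(combinations(hubs, 2))
    for hub in hubs:
        edges += [(hub, f"{hub}_leaf{j}") for j in range(leaves_per_hub)]
    return edges


if __name__ == "__main__":
    edges = toy_network()
    print("degree of hub0:", degrees(edges)["hub0"])
    print("phi(0):", rich_club_coefficient(edges, 0))  # whole network
    print("phi(1):", rich_club_coefficient(edges, 1))  # hubs only
```

The jump in the coefficient once the low-degree nodes are excluded is the signature of a densely interconnected core, the kind of structure a ‘connective core’ account relies on.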

II: Cortical Learning

6. Coordination dynamics stored in the nervous system

We will now investigate various challenges pertaining to skill learning given the present approach. We have seen from Kuniyoshi’s model how an embodied agent can bootstrap learning through the initial self-organized origin of synergies in the spinal-medulla areas. As the human fetus grows, the connection between spinal cord, midbrain, and cortex strengthens, and cortical dynamics begin to be influenced by and dictate the movements of the body. While the cortical area is represented by a Kohonen map in Kuniyoshi’s model, here we describe the theoretical potential of replacing this with a network of non-linear units similar to those in the spinal model. In such a model of the brain, we would seek to develop self-organization of functional networks corresponding to bodily and environmental constraints, developed on top of initial connectivity, and possibly directed by an initially present ‘connective core’ of hub nodes. Full body babbling in the fetus stage could thus lead to chaotic search of coherent body-midbrain-brain configurations, and repeated exposure to the ensuing transient states of connectivity in cortex during periods of such coherence could then be stored via structural changes in the cortical connectivity itself.

While non-cortical networks may be sufficient for learning certain primary movements such as turning or crawling, it remains to describe how additional cortical mechanisms allow development of more complex behavior over time. Further, we must describe how new skills may be learned without erasing old ones, namely how to solve the stability-plasticity dilemma. Also, the search space of all possible skills is so large that we must describe how the current approach is able to follow a tractable path through development, for example via incremental exploration of the behavioral space in a sensible way.
Schöner, Zanone, and Kelso (1992) go into detail regarding how the principles of coordination dynamics can be used to describe learning. In this view the central nervous system (CNS) stores not coordination patterns per se, but the coordination dynamics themselves – which include not only individual patterns but also their dynamic environment, including how stable the supported patterns are to perturbation. A multitude of patterns and their dynamics thus share a space in the CNS – what we here refer to as the ‘skill landscape’. The learning of new skills then corresponds to the formation of new attractors in this landscape, and new attractors can affect the stability of old attractors, so that an accommodation process (Piaget 1952) is necessary to keep the entire system stable throughout its development. In the above theory, constraints (such as those of the body and environment) do not have to specify any particular pattern but modify the landscape to favor a certain subset of attractors (certain coordination patterns), while being unspecific to some variables such as the choice of limb involved. Indeed such task-level representation of coordination dynamics has been uncovered in (Kelso and Zanone 2002) and allows for transfer of learned skills between effectors (e.g. between arms and legs).

The above internal constraints are referred to as ‘intrinsic dynamics’, and correspond to the learned set of skills. In comparison, ‘behavioral information’ is defined as particular perceived environment information/intentions additional to the demands of the task or body that further reduce the set of possibilities such that a particular set of patterns is given preference – or what might be described as a ‘solution manifold’ (Scholz and Schöner 1999). From this we can see how motor skills can be stored in the CNS without over-specifying their details (similar to the idea of ‘generalized motor programs’ (Schmidt and Lee 2005)), allowing for reliable behaviors that are also adaptive to any particular context. It also allows for some aspects of the environment to be treated as invariant with respect to a set of constraints, leading to a primitive form of ‘concepts’ abstracted away from the details. The malleable contribution of certain behavioral information allows for goal-directed behavior that can change on the fly, contingent on the availabilities of the environment during its execution. As behavioral information, intrinsic dynamics, and the state of the environment converge towards the same target, ‘affordances’ in the landscape turn into more attractive ‘solicitations’ (Bruineberg and Rietveld 2014) and the ensuing behavior becomes more deterministic as it transiently settles to a coherent state. As the task is executed, the environment and internal demands change, and so the functional network which united to achieve the present goal disbands again to transiently seek out its next metastable state.
Conversely, if behavioral information differs greatly from intrinsic dynamics, instability ensues as the agent cannot achieve its ends in the present environment, and the search process must continue to try and find an alternative. What we have described here is surely the crux of adaptive behavior in general. At this point, one question we must ask is what happens in the early stages of learning, in which the agent has only minimal intrinsic dynamics, and every situation results in an unstable response. In other words, how can a complex behavior that isn’t a primitive form of body coherence be stabilized long enough to ‘lock it in’ to the CNS? In addition to the connective core which can govern functional interactions on a global scale, there is also evidence for a so-called ‘coordination network’ in the supplementary motor cortex (SMC)/premotor cortex, outside of the primary sensorimotor areas of M1/S1 (Jantzen and Kelso 2004). Such a coordination network allows for external stabilization of primary sensorimotor functional networks (e.g. through alpha band inhibition of extraneous components (Popov, Kastner, and Jensen 2017; H. Park et al. 2016)). Through this mechanism, we can imagine how temporary behavioral information may be made strong enough to form a proto-attractor in the skill landscape, long enough for intrinsic dynamics to converge towards creation of a more permanent attractor in the manner outlined in the above view of the learning process. If the behavior is rewarding, release of neuromodulators may further accelerate this process by triggering transient neuroplasticity in the area (Soltoggio 2008). While the above account is somewhat speculative in nature, our aim is to provide an example of what sorts of mechanisms we should expect to look for when examining neural correlates of the skill learning process.
A main point of the provided example is that any new skills are always generated from information found in the body/environment of the agent or adapted from pre-existing skills – thus always a relevant form of embodied behavior which branches out from the existing repertoire rather than being entirely sui generis. This is backed by the fact that behavioral information and intrinsic dynamics must overlap to some extent in order for long term stability to emerge – in other words, what the agent already knows will heavily affect the skills it can learn at any particular point in time. As Patrick Wilson put it, "You can only learn what you already almost know." Based on this we can imagine the developmental path of an organism or artificial agent as a fractal-like structure where each branch can be further developed in detail, with new branches always biased by the past developmental history of the organism (Figure 1). We have already seen how this image can describe not only development but real time behavior, as the possibilities for action are successively constrained by context. Neural Darwinism theory tells us that those behaviors most engaged in would further dictate the structural connectivity of the brain (Edelman 1987), and so an element of determinism arises which constrains the future possibility of development as well as the search space during learning. Thus newly developed behaviors in the infant are not fully derivative but indeed never entirely novel either. It is easy to see how these limitations could ultimately show up as broken symmetries in the space of functional networks which in turn develop in the brain.

Figure 1. The incremental acquisition of skill.

Here we show the predicted pattern of behavioral skill acquisition – a process that has greater than linear ability to search the behavioral space but is non-ergodic (has a 1.x fractal dimension), thus providing a tractable solution to the degrees of freedom problem. Each acquired skill sets the stage for a ‘branching’ into newer derivative skills as the behavioral niche incrementally expands, offering the agent new opportunities and challenges in piecemeal fashion.

7. The developmental process of continuous skill acquisition and adaptation

The ultimate objective of an intelligent agent can be seen not as the storage of static patterns of behavior, but as capturing a skill landscape of coordination dynamics which ties together all skills in a mutually consistent way. We have seen how existing skills influence the acquisition of any new skills by biasing the developmental process and subset of experiences had thereafter; here we will show how the learning of new skills can influence previously learned skills through Piagetian accommodation. Further drawing from Piaget, we will then speculate on how existing skill landscapes can allow for more complex skills to be developed, leading to more sophisticated behavior.

When confronted with behavioral information corresponding to a to-be-learned skill, the agent must initially perceive or contemplate that skill on the basis of its existing network structure; in other words, the ‘target’ of skill acquisition must be set in terms of the current skill landscape and interpreted in terms of what the agent already knows. We observe that any future skill the agent would be capable of learning must fall into one of the following categories:

a. A reconfiguration of an existing skill
b. A combination of existing sub-skills
c. A novel skill with several prerequisites that have yet to be acquired

We will cover each of these cases in turn.

(a) Reconfiguration The modification of an existing skill is related to what has been called ‘motor adaptation’: the ability of the nervous system to adjust specific parameters of motion based on external changes (Deuschl et al. 1996). Here the main skeleton of the skill already exists, but slight modifications of the dynamics must occur after an environmental change or in order to support instantiation of the skill in multiple contexts. This is done by shifting, expanding, or changing the shape of the skill's solution manifold. Examples of this in organisms may include the adjustment of motor actions after gaining or losing weight or after sustaining an injury. The traditional explanation of this phenomenon is that an internal predictive model begins to produce ‘prediction error’ as the body changes, which in turn causes the ‘internal model’ to change its parameters down the error gradient. An explanation of this learning process more in tune with the current approach instead emphasizes the influence that the changing body morphology has on the coordination dynamics, shifting the skill attractor away from the behavioral information or goal state. The ensuing mismatch between behavioral information and intrinsic dynamics (which upon analysis may look like a ‘prediction error’ of sorts) should influence the learning dynamics so that the intrinsic dynamics are shifted back towards the target dynamics, as explained previously. Presumably, the use of a separate coordinating network such as premotor cortex/SMC can temporarily constrain the dynamics of the sensorimotor network of S1/M1 so that the required pattern is held in place despite the pull toward pre-existing intrinsic dynamics. The adaptation is then stored via changes in structural connectivity, possibly with the assistance of neuromodulators such as dopamine which signal the reward of goal completion when the updated pattern is employed in the new context.
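The account of reconfiguration above can be illustrated with a deliberately minimal toy model. This is a sketch under stated assumptions, not a model taken from the cited literature: the single phase-like variable `phi`, the slow variable `memory`, the gains `k`, `c`, `eps`, and the sinusoidal form of both equations are all hypothetical choices made for illustration. A body change displaces the intrinsic attractor by `shift`; an external stabilizing term (standing in for the coordination network) holds the performed pattern near the target; and a slow learning term moves the intrinsic attractor toward the pattern actually performed.

```python
import math

def adapt(shift, target=0.0, k=1.0, c=2.0, eps=0.05, dt=0.01, steps=40_000):
    """Slow re-learning of a skill attractor after a body change (toy model).

    All equations and parameters are illustrative assumptions.
    Returns the learned offset `memory` and the final performed pattern `phi`.
    """
    memory = 0.0   # slow variable: learned offset of the intrinsic attractor
    phi = target   # fast variable: the coordination pattern actually performed
    for _ in range(steps):
        attractor = memory + shift  # where the changed body places the attractor
        # Fast dynamics: intrinsic pull (gain k) plus external stabilization (gain c)
        dphi = -k * math.sin(phi - attractor) - c * math.sin(phi - target)
        # Slow learning: the intrinsic attractor drifts toward the performed pattern
        dmem = eps * math.sin(phi - attractor)
        phi += dt * dphi
        memory += dt * dmem
    return memory, phi

def settle(attractor, phi0, k=1.0, dt=0.01, steps=5_000):
    """Relax under the intrinsic dynamics alone, without external stabilization."""
    phi = phi0
    for _ in range(steps):
        phi += dt * (-k * math.sin(phi - attractor))
    return phi

if __name__ == "__main__":
    shift = 0.6
    print("before adaptation, unassisted pattern:", settle(shift, 0.0))
    memory, _ = adapt(shift)
    print("after adaptation, unassisted pattern:", settle(memory + shift, 0.3))
```

In this sketch the mismatch term `sin(phi - attractor)` plays the role that ‘prediction error’ plays in internal-model accounts, but it arises from the difference between the held pattern and the intrinsic dynamics rather than from an explicit predictive model.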

(b) Combination As initial basic skills carve out a niche for the agent, it will encounter the opportunity to string skills together into a sequence and achieve a result not possible by any of the skills used in isolation. Here we enter the domain of ‘cognition’ as more long term effects need to be considered or ‘planned’ in advance, and the behavior in question must be constrained not only by the immediate environment but by future supposed environments and the next skills to be used in the sequence. A canonical example of this is the ‘end state comfort effect’, in which humans tend to grasp different locations of an object dependent on what they plan on doing with the object next (Rosenbaum et al. 2006). Here we see a qualitative change in operation from animalistic behavior to more human-like behavior, although primitive versions of sequential behavior can also be seen in rodents and non-human primates. The modularity of the motor system has been a cause for much debate, but past models such as MOSAIC (Haruno, Wolpert, and Kawato 2001) have shown us the limitations of discrete ‘symbolic’ modules arranged in a hierarchical or sequential order. Instead it is more fruitful to consider how the coordination dynamics of individual skills can be modified on the fly to support their insertion into a sequence. If we are to model this in practice, we must consider an extension of these dynamics to include forcing terms from the neural areas which govern sequence dynamics, allowing for modification of the solution manifold on the fly. Since we know that warping of the dynamics based on contingent information is possible, as discussed in Kelso and Zanone (2002), it remains to consider the role of a representation of ‘sequence’. We can envision sequence information as another constraint: behavioral information that reduces the space of the solution manifold of a skill to a subset that is compatible with prior and subsequent steps, or indeed warps the manifold in such a way as to be compatible. The simplest implementation of this would be a memory module which preemptively calls forth the subsequent constraints and relates them to the relevant motor areas. The mechanics involved seem analogous to the workings of the previously discussed external coordination network, with the difference that the target dynamics are future-oriented instead of present-oriented.
Further, through the learning process, frequent use of a skill in the context of a sequence will permanently warp it to those constraints, allowing for ‘automatic’ execution of entire sequences of behavior. The process of accommodation would then allow for different sequences to apply different solution manifolds, ultimately carving one initial skill into multiple variants by creating sub-regimes in the skill landscape, while keeping the usage of additional neural substrate to a minimum. We show an example of how accommodation could work to stabilize the skill landscape after introduction of a new skill or sequence in Figure 2. While the detailed mechanisms behind this kind of stabilization remain to be worked out, this gives us a good starting point to consider more spatiotemporally complex behavior which can have long term goals achieved by many intermediate steps.

Figure 2. Assimilation and accommodation of skill in a metastable skill landscape. We visually demonstrate the formation of skill attractors in a dynamic landscape such as that governed by the HKB equation in its metastable regime as a metaphor for the high dimensional coordination patterns in the CNS/body. (a) A metastable ‘ghost’ attractor pertaining to a particular functional network configuration exists in the brain-body system after being assimilated from body-environment exchanges. As the system nears this area, relative coordination between independent components takes place for a short while before the system moves out of the ghost, disbanding the functional network. (b) A new skill has just been learned but is incompatible with the preexisting skill, and thus destabilizes the network. The system can only hold the state in the corresponding functional networks for a short time before being destabilized. External coordination is required to remain in a coordinated state for long. (c) Practice of maintaining coordination in the case of both new and old skills evolves the landscape to accommodate both skills at the same time. Now either skill can manifest as a ‘habit’ without extra attention from the coordination network.
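To make the ‘ghost’ attractor of panel (a) concrete, the following minimal sketch integrates the extended HKB equation for the relative phase, dφ/dt = δω − a sin φ − 2b sin 2φ, in two regimes. The parameter values (a = b = 1, δω = 0.5 and δω = 2.9), the Euler step, and all function names are assumptions chosen for illustration; they are not taken from the figure. With a small frequency mismatch δω a stable fixed point exists and the phase locks. With δω just above the critical value the fixed point vanishes, yet the phase still lingers near where it used to be before escaping, which is the relative coordination described in the caption.

```python
import math

def hkb_rate(phi, delta_omega, a=1.0, b=1.0):
    """Extended HKB relative-phase dynamics: dphi/dt."""
    return delta_omega - a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi)

def simulate(delta_omega, phi0=0.0, dt=1e-3, steps=200_000, a=1.0, b=1.0):
    """Forward-Euler integration; returns the unwrapped phase trajectory."""
    phi, trajectory = phi0, []
    for _ in range(steps):
        phi += dt * hkb_rate(phi, delta_omega, a, b)
        trajectory.append(phi)
    return trajectory

def slowest_phase(delta_omega, a=1.0, b=1.0, grid=10_000):
    """Phase in [0, 2*pi) where the flow is slowest (the remnant of the attractor)."""
    candidates = [2.0 * math.pi * i / grid for i in range(grid)]
    return min(candidates, key=lambda p: abs(hkb_rate(p, delta_omega, a, b)))

def dwell_fraction(trajectory, centre, half_width=0.5):
    """Fraction of time the wrapped phase spends within half_width of centre."""
    inside = 0
    for phi in trajectory:
        distance = (phi - centre + math.pi) % (2.0 * math.pi) - math.pi
        if abs(distance) < half_width:
            inside += 1
    return inside / len(trajectory)

if __name__ == "__main__":
    locked = simulate(delta_omega=0.5)       # fixed point exists: phase locking
    metastable = simulate(delta_omega=2.9)   # no fixed point: dwell and escape
    ghost = slowest_phase(2.9)
    print("locked final rate:", hkb_rate(locked[-1], 0.5))
    print("metastable phase advance (rad):", metastable[-1])
    print("dwell fraction near ghost:", dwell_fraction(metastable, ghost))
    print("uniform-drift baseline:", 1.0 / (2.0 * math.pi))
```

Comparing the dwell fraction with the uniform-drift baseline gives a simple scalar measure of how strongly the vanished attractor still organizes the dynamics.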

(c) Novel skill learning The acquisition of the most temporally distant sorts of skills is summed up by the idea of setting zones of proximal development (ZPDs) somewhere in an abstract goal space (Vygotsky 1987), the relevant behavioral correlates (means) of which are either not yet developed or can only be loosely extrapolated from the present skill landscape. Given an episodic memory system, an agent (such as a human infant) could recall certain novel or salient events with some random frequency or when confronted with a similar environment to the original memory. The agent could then engage in a directed ‘practice’ towards acquisition of the new skill. All that is needed for this is the recall of the event to bias the current activity to a subset of behaviors that more closely relate to that event – for example imitation of other agents, approach and manipulation of key objects, or attempts to confront limitations which prevent proximal access to the source of interest (sitting up, rolling, crawling, reaching, etc.). Even without prior knowledge of the structure of the novel event corresponding to a new skill, the agent can either incrementally adjust its skill set in a manner which leads to acquisition of the skill, or eventually give up due to frustration (reducing the probability of future recall of that event). The constraints of environmental aspects and body physics (and eventually learned sociocultural aspects) already reduce the subset of potential behaviors in any particular situation, so this recall bias may be all that is required for agents to develop skills that can grow in complexity but still be complementary to the structure of the environment, thus leading to a tractable developmental trajectory. 
Importantly, each skill mutually constrains other skills in the skill landscape of the agent, and so the acquisition of skills should not be seen as the learning of disparate ‘symbols’, but more as an expansion of the dynamic capabilities of the agent as a whole. In this way, the radius of behavioral reach of the agent grows in an incremental and sensible way, progressively increasing the DOF of the agent and so presenting a restricted subset of new challenges in piecemeal fashion which is always consistent with the presently learned skills (Figure 1). This mirrors Bernstein’s idea of incrementally releasing ‘frozen degrees of freedom’ in order to make the search space of behavior more tractable (Bernstein 1967). The relation between the sphere of behavioral dynamics and the functional networks of the CNS lets us understand that the above incremental acquisition process can describe not only physical behavior but also the development of neural synergies that start from humble beginnings of cooperation between local neural areas and advance to hemispherical or global assemblies which evolve over space and time through dynamic, metastable cooperation.

8. Remaining issues and future research

Coordination dynamics leads us to a descriptive picture of the learning dynamics, which involve the gradual shift of intrinsic dynamics towards skills that satisfy intentional and environmental constraints. Yet we still do not have a clear picture of how this is implemented in terms of the connectivity of the CNS. We expect that changing structural connectivity between areas is the main mechanism at play, and can imagine how neuromodulator-dependent Hebbian learning may act as a way of ‘laying down the tracks’ between areas so as to ‘lock in’ the required pattern dynamics. The underlying principle we expect such a system to obey is to attempt to lock in as many rewarding patterns as possible while maintaining mutual stability with previously learned pattern dynamics, thus overcoming the stability-plasticity dilemma through repeated application of Piagetian assimilation and accommodation. A model of structural learning based on coordination dynamics and non-linear coupling between elements, which ties long timescale learning dynamics to real-time behavioral dynamics, would lead to a powerful explanation of how functional connectivity in the CNS can allow for complex human-level behavior. Indeed we see a glimpse of this in the works of Kuniyoshi as well as Shim & Husbands, where coupled chaotic elements search for stable behavioral patterns and are able to lock in these patterns through learning mechanisms. Further, Park et al. (2017) make a valuable contribution linking the properties of networks with varying amounts of random connectivity to the temporal scale and complexity of resulting behaviors in a similar embodied model of coupled non-linear elements. From this we can see how work on investigating the properties of complex networks can give us intuition with regard to the operation of complex systems such as the (embodied) CNS.
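As a rough illustration of what ‘laying down the tracks’ could look like in a model, the sketch below couples two phase oscillators and gates a Hebbian-style update of their coupling weight by a scalar reward. It is a toy under stated assumptions and is not the learning rule used by Kuniyoshi, Shim & Husbands, or Park et al.: the Kuramoto-style sine coupling, the update rule dw/dt = η · reward · (cos(θ2 − θ1) − w), the external stabilizing gain `c`, and every parameter value are hypothetical. During training an external term holds the two oscillators in a coordinated state; the question is whether the pattern survives once that support is removed.

```python
import math

def train_and_release(reward, omega=(1.0, 1.5), w0=0.1, c=1.0, eta=0.05,
                      dt=0.01, train_steps=20_000, release_steps=10_000):
    """Reward-gated Hebbian coupling between two phase oscillators (toy model).

    Returns the learned weight and the drift of the relative phase after the
    external stabilization `c` is withdrawn. All names and values are assumptions.
    """
    theta1, theta2, w = 0.0, 0.0, w0
    # Training: external coordination (gain c) holds the pattern in place
    for _ in range(train_steps):
        diff = theta2 - theta1
        gain = w + c
        theta1 += dt * (omega[0] + gain * math.sin(diff))
        theta2 += dt * (omega[1] - gain * math.sin(diff))
        # Plasticity only occurs while the reward signal is present
        w += dt * eta * reward * (math.cos(diff) - w)
    # Release: only the learned coupling remains, and plasticity is switched off
    start = theta2 - theta1
    for _ in range(release_steps):
        diff = theta2 - theta1
        theta1 += dt * (omega[0] + w * math.sin(diff))
        theta2 += dt * (omega[1] - w * math.sin(diff))
    drift = (theta2 - theta1) - start
    return w, drift

if __name__ == "__main__":
    for reward in (1.0, 0.0):
        w, drift = train_and_release(reward)
        print(f"reward={reward}: learned weight={w:.3f}, phase drift={drift:.2f} rad")
```

For two symmetrically coupled oscillators the pair stays phase locked only if twice the weight is at least the frequency difference, so in this sketch the rewarded run ends above that threshold and keeps its pattern, while the unrewarded run leaves the weight unchanged and drifts once the external support is gone.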
In order to advance this research effort we must condense the steadily growing literature on connectomics and functional network mapping into concise theory which can be replicated in the development of embodied agents. We have seen how coordination dynamics can serve this purpose, and future modeling work that incorporates metastable coupling between non-linear elements while adhering to principles of connectivity deduced from models such as the Virtual Brain offers vast potential toward this end. Importantly, we believe a focus should be put on understanding the sequential process of transient coupling between neural areas in the context of particular tasks such as reaching, grasping, manipulating, and so on. We have seen how complex embodied behavior can come about even when the non-linear dynamics involved are simplified to a few differential equations incorporating the main aspects at play. The next step would be to extend this to the infant level by using similar coupled oscillator models with metastable functional networks and malleable connectivity to allow the agent to engage in and learn more complex, sequential behavior that is still fundamentally grounded in its body and environment. Success in such research would help direct the attention of more researchers in robotics and the cognitive sciences to the importance of the embodied and embedded nature of real world intelligence.

References

Bernstein, N. 1967. The Coordination and Regulation of Movements. Oxford, New York: Pergamon Press.
Bressler, Steven L, and J A.S. Kelso. 2001. “Cortical Coordination Dynamics and Cognition.” Trends in Cognitive Sciences 5 (1): 26–36.
Bruineberg, Jelle, and Erik Rietveld. 2014. “Self-Organization, Free Energy Minimization, and Optimal Grip on a Field of Affordances.” Frontiers in Human Neuroscience 8. https://doi.org/10.3389/fnhum.2014.00599.
Cohen, J. R., and M. D’Esposito. 2016. “The Segregation and Integration of Distinct Brain Networks and Their Relationship to Cognition.” Journal of Neuroscience 36 (48): 12083–94. https://doi.org/10.1523/JNEUROSCI.2965-15.2016.
Deco, Gustavo, Viktor K. Jirsa, Peter A. Robinson, Michael Breakspear, and Karl Friston. 2008. “The Dynamic Brain: From Spiking Neurons to Neural Masses and Cortical Fields.” PLoS Computational Biology 4 (8). https://doi.org/10.1371/journal.pcbi.1000092.
Deuschl, G., C. Toro, T. Zeffiro, S. Massaquoi, and M. Hallett. 1996. “Adaptation Motor Learning of Arm Movements in Patients with Cerebellar Disease.” Journal of Neurology Neurosurgery and Psychiatry 60 (5): 515–19. https://doi.org/10.1136/jnnp.60.5.515.
Dosenbach, Nico U F, Damien A Fair, Francis M Miezin, Alexander L Cohen, Kristin K Wenger, Ronny A T Dosenbach, Michael D Fox, et al. 2007. “Distinct Brain Networks for Adaptive and Stable Task Control in Humans.” Proceedings of the National Academy of Sciences of the United States of America 104 (26): 11073–78. https://doi.org/10.1073/pnas.0704320104.
Edelman, Gerald M. 1987. Neural Darwinism: The Theory of Neuronal Group Selection. https://doi.org/10.1016/0004-3702(89)90004-0.
Erlhagen, Wolfram, and Gregor Schöner. 2002. “Dynamic Field Theory of Movement Preparation.” Psychological Review 109 (3): 545–72. https://doi.org/10.1037/0033-295X.109.3.545.
Friston, Karl J, Patric Hagmann, Leila Cammoun, Xavier Gigandet, Reto Meuli, Christopher J Honey, Van J Wedeen, and Olaf Sporns. 2008. “Mapping the Structural Core of Human Cerebral Cortex.” PLoS Biology, no. 7: e159.
Haken, H., J. A S Kelso, and H. Bunz. 1985. “A Theoretical Model of Phase Transitions in Human Hand Movements.” Biological Cybernetics 51 (5): 347–56. https://doi.org/10.1007/BF00336922.
Haken, H., and V. Jirsa. 1996. “Field Theory of Electromagnetic Brain Activity.” Physical Review Letters 77: 960–63. https://doi.org/10.1103/PhysRevLett.77.960.
Haruno, Masahiko, Daniel M. Wolpert, and Mitsuo Kawato. 2001. “MOSAIC Model for Sensorimotor Learning and Control.” Neural Computation 13 (10): 2201–20. https://doi.org/10.1162/089976601750541778.
Hebb, D. O. 1949. The Organization of Behavior.
Humphries, M D, K Gurney, and T J Prescott. 2007. “Is There a Brainstem Substrate for Action Selection?” Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 362 (1485): 1627–39. https://doi.org/10.1098/rstb.2007.2057.
Jantzen, Kelly J, and J A Scott Kelso. 2004. “Neural Coordination Dynamics of Human Sensorimotor Behavior: A Review.”
Jirsa, V K, O Sporns, M Breakspear, G Deco, and A R Mcintosh. 2010. “Towards the Virtual Brain: Network Modeling of the Intact and the Damaged Brain,” 189–205.
Jirsa, Viktor K, Kelly J Jantzen, Armin Fuchs, and J A Scott Kelso. 2002. “Spatiotemporal Forward Solution of the EEG and MEG Using Network Modeling” 21 (5): 493–504.
Kelso, J A S, and P Zanone. 2002. “Coordination Dynamics of Learning and Transfer Across Different Effector Systems” 28 (4): 776–97. https://doi.org/10.1037//0096-1523.28.4.776.
Kelso, J A Scott, Guillaume Dumas, and Emmanuelle Tognoli. 2013. “Outline of a General Theory of Behavior and Brain Coordination.” Neural Networks 37: 120–31. https://doi.org/10.1016/j.neunet.2012.09.003.
Kelso, J A Scott, Armin Fuchs, and Viktor K Jirsa. 1999. “Traversing Scales of Brain and Behavioral Organization I: Concepts and Experiments.”
Kuniyoshi, Yasuo, and Shinji Sangawa. 2006. “Early Motor Development from Partially Ordered Neural-Body Dynamics: Experiments with a Cortico-Spinal-Musculo-Skeletal Model,” 589–605. https://doi.org/10.1007/s00422-006-0127-z.
Lomp, Oliver, Mathis Richter, Stephan K. U. Zibner, and Gregor Schöner. 2016. “Developing Dynamic Field Theory Architectures for Embodied Cognitive Systems with Cedar.” Frontiers in Neurorobotics 10 (November): 1–18. https://doi.org/10.3389/fnbot.2016.00014.
Mori, Hiroki, and Yasuo Kuniyoshi. 2010. “A Human Fetus Development Simulation: Self-Organization of Behaviors through Tactile Sensation.” 2010 IEEE 9th International Conference on Development and Learning, ICDL-2010 - Conference Program, 82–87. https://doi.org/10.1109/DEVLRN.2010.5578860.
Mori, Hiroki, Yuzi Okuyama, and Minoru Asada. 2013. “Emergence of Diverse Behaviors from Interactions between Nonlinear Oscillator Complex Networks and a Musculoskeletal System.” European Conference on Artificial Life 2013 12: 324–31. https://doi.org/10.7551/978-0-262-31709-2ch048.
Park, Hyojin, Dong Soo Lee, Eunjoo Kang, Hyejin Kang, and Jarang Hahm. 2016. “Formation of Visual Memories Controlled by Gamma Power Phase-Locked to Alpha Oscillations,” no. June. https://doi.org/10.1038/srep28092.
Park, Jihoon, Hiroki Mori, Yuji Okuyama, and Minoru Asada. 2017. “Chaotic Itinerancy within the Coupled Dynamics between a Physical Body and Neural Oscillator Networks.” PLoS ONE 12 (8): 1–30. https://doi.org/10.1371/journal.pone.0182518.
Piaget, Jean. 1952. The Origins of Intelligence in Children. Journal of Consulting Psychology. Vol. 17. Routledge.
Popov, Tzvetan, Sabine Kastner, and Ole Jensen. 2017. “FEF-Controlled Alpha Delay Activity Precedes Stimulus-Induced Gamma-Band Activity in Visual Cortex.” The Journal of Neuroscience 37 (15): 4117–27. https://doi.org/10.1523/JNEUROSCI.3015-16.2017.
Pulvermüller, Friedemann, Max Garagnani, and Thomas Wennekers. 2014. “Thinking in Circuits: Toward Neurobiological Explanation in Cognitive Neuroscience.” Biological Cybernetics 108 (5): 573–93. https://doi.org/10.1007/s00422-014-0603-9.
Rosenbaum, David A., Rajal G. Cohen, Ruud G.J. Meulenbroek, and Jonathan Vaughan. 2006. “Plans for Grasping Objects.” Motor Control and Learning, no. December 2015: 9–25. https://doi.org/10.1007/0387-28287-4_2.
Schmidt, Richard, and Tim Lee. 2005. “Motor Control and Learning: A Behavioral Emphasis.” Human Kinetics, 535. https://doi.org/10.1007/s00586-014-3249-3.
Scholz, John P., and Gregor Schöner. 1999. “The Uncontrolled Manifold Concept: Identifying Control Variables for a Functional Task.” Experimental Brain Research 126 (3): 289–306. https://doi.org/10.1007/s002210050738.
Schöner, Gregor, John Spencer, and D F T Research Group. 2015. Dynamic Thinking: A Primer on Dynamic Field Theory. Oxford Series in Developmental Cognitive Neuroscience. https://doi.org/10.1093/acprof:oso/9780199300563.001.0001.
Schöner, Gregor, Pier G. Zanone, and J. A. S. Kelso. 1992. “Learning as Change of Coordination Dynamics: Theory and Experiment.” Journal of Motor Behavior 24 (1): 29–48. https://doi.org/10.1080/00222895.1992.9941599.
Shanahan, Murray. 2012. “The Brain’s Connective Core and Its Role in Animal Cognition,” 2704–14. https://doi.org/10.1098/rstb.2012.0128.
Shim, Yoonsik, and Phil Husbands. 2012. “Chaotic Exploration and Learning of Locomotion Behaviors.” Neural Computation 24 (8): 2185–2222. https://doi.org/10.1162/NECO_a_00313.
Soltoggio, Andrea. 2008. “Neural Plasticity and Minimal Topologies for Reward-Based Learning.” Proceedings - 8th International Conference on Hybrid Intelligent Systems, HIS 2008, no. October 2008: 637–42. https://doi.org/10.1109/HIS.2008.155.
Vygotsky, Lev. 1987. “Zone of Proximal Development.” Mind in Society: The Development of Higher Psychological Processes. Vol. 5291. https://doi.org/10.4135/9781412963848.n282.
Zibner, Stephan K.U., Christian Faubel, Ioannis Iossifidis, and Gregor Schöner. 2011. “Dynamic Neural Fields as Building Blocks of a Cortex-Inspired Architecture for Robotic Scene Representation.” IEEE Transactions on Autonomous Mental Development 3 (1): 74–91. https://doi.org/10.1109/TAMD.2011.2109714.