A Voyage of Discovery

In which the end of all our exploring
Will be to arrive at the New World Disorder
And know the place for the first time

(adapted from T S Eliot, ʻFour Quartetsʼ)
Part 1; Looking from the old world to the new
- Chapter 1; The Intertwining of Defense and Economics; the Emergence of Network Centric Warfare and the ʻGig Economyʼ
- Chapter 2; The Cold War: Our Point of Departure

Part 2; Storm tossed waters
- Chapter 3; Complexity, Entropy and Non-Linearity
- Chapter 4; Clustering, Herding and Volatile Information Flows
- Chapter 5; Volatile Information Flows and Non-Linear Decisions

Part 3; Modeling the new world disorder
- Chapter 6; Analyzing Schelling-type Automata Models and Edge Networks
- Chapter 7; Validations and Speculations
The Purpose of This Book

The global transition from the industrial age to the information age implies the need for new methods and tools to help us be more agile in adapting to the new world disorder sketched out by thinkers such as Joshua Cooper Ramo of the Kissinger Institute (“The Seventh Sense”; “The Age of the Unthinkable”). The winners will surf this change; the losers will be buried by it. This disruptive disorder is shifting the balance of power in conflict and economics away from slow-to-change, over-regulated, chain-of-command hierarchies towards self-organizing, adaptive networks. We need new economic and conflict models to understand how to be winners and surf the change. The purpose of this book is to develop these new mathematical methods, based on the idea of extremal models, which originated from my earlier work on conflict modeling. Extremal models are models of highly entangled and complex systems far from equilibrium which represent each entity as simply as possible, allowing the system complexity to emerge from the richness of their interactions. I call this approach “Extremal Modeling”.

This book is dedicated to Per Bak, friend and colleague, whose untimely death robbed us all of a great mind.

James Moffat
London, UK, 2018
Foreword by Dr. Walter Perry, Washington DC, USA

Mathematics has always played a major role in military operations, from the use of geometry to design weapons in ancient Greece and Rome to todayʼs sophisticated combat models and simulations. With the advent of the computer, models and simulations have become increasingly complex, and closer to capturing the “fog and friction” of real war. Central to the development and application of these concepts is the close collaboration between the U.S. and the UK. Over the years, from the hunt for German U-boats in the North Atlantic to the hunt for implanted Improvised Explosive Devices (IEDs) in Iraq and Afghanistan, U.S. and UK operational analysts have worked together, and with other friendly nations, to help operational commanders accomplish their missions through the development of advanced concepts aimed at predicting enemy behavior and improving friendly situational awareness. This book builds on that shared, world-class expertise.

As a senior information scientist at the U.S. RAND Corporation in the mid-1990s, I was fortunate to be selected to serve two years as an exchange scientist with the UK Centre for Defence Analysis (CDA), part of the UK Ministry of Defence. Since my interest at that time was assessing the effects of information on military command and control, I was paired with Professor Moffat. Together we conducted studies examining the effects of information on Royal Navy command and control, and the effects of trading situational awareness for firepower in support of the British Army. In between, we conducted research focused on developing decision making models. Our collaboration did not end with my return to the U.S.; we continued to share research results to the benefit of both our countries. In 1996, CDA became the Defence Science and Technology Laboratory (Dstl) and I was frequently invited to present my research findings to Dstlʼs research staff.
In addition, Professor Moffat participated in international scientific organizations that included the Santa Fe Institute in New Mexico. He was also deeply involved in NATO research collaboration, gaining a global reputation in mathematical modeling which bodes well for the insights developed here.
Modeling Conflict and Competition far from Equilibrium. Chapter 1; The intertwining of defense and economics; the emergence of network centric warfare and the ʻgig economyʼ
Professor James Moffat, University of Aberdeen
Modeling Conflict and Competition far from Equilibrium Chapter 2; The Cold War: Our Point of Departure
Summary: The conventional phase of the Cold War was planned as the last gasp of industrial age Clausewitzian ʻKriegʼ between hierarchies of command. Mathematical modeling was centred on the grinding process of attrition, in which large blocks of armoured ground units interacted in order to force defeat by a process of wearing away the other. By making a number of reasonable assumptions it is possible to represent this process as a set of linked, first order differential equations.
During the Cold War (1945-1991) a key question was for how long an attack by the Soviet Union could be held, providing time for the political process to re-engage, thereby avoiding further escalation. To advise on this question, mathematical simulation modeling of a war in Central Europe between NATO and the Soviet Union was undertaken. This represented at its core the grinding process of attrition, in which large blocks of armored ground units interacted in order to force defeat by a process of wearing away the other. By making a number of reasonable assumptions it is possible to represent this process as a set of linked, first order differential equations. The systematic use of this approach was initiated by F W Lanchester [1] and they are thus known as Lanchester equations. The following are a minimal set of assumptions which capture the core dynamic.

1. each unit on each side is within sensor and weapon range of all units on the other side;
2. units on each side are identical, but the units on one side may have a different kill rate to the units on the opposing side;
3. each firing unit is sufficiently well aware of the general location and condition of all enemy units so that when a target is killed, fire may be immediately shifted to a new target;
4. this new target is randomly chosen from the surviving targets.
With these ʻAimed Fireʼ assumptions we can easily derive the mathematical relationships involved.

Lanchester Equations for Aimed Fire: At time t, and considering the small increment of time between t and t+δt, the number of Blue casualties δb(t) is given by the number of remaining Red units times the number of targets each kills, that is, \( r(t)\, k_r\, \delta t \), where \( k_r \) is the effectiveness of a single Red unit engagement per unit time, and \( r(t)\,\delta t \) is a measure of the number of such engagements. Since casualties reduce the Blue strength, we have the first order differential relationship

\[ \frac{db}{dt} = -k_r\, r(t). \]

By symmetry we also have a similar relationship for the attrition of Red units, namely

\[ \frac{dr}{dt} = -k_b\, b(t). \]

Thus for aimed fire, the attrition
rate imposed on the enemy is proportional to the strength of the force. This is the logic of the concentration of force (as evidenced in Desert Storm, the Gulf War of 1990-1991), and we incorporated these into high-level wargames in order to assess strategic alternatives prior to that action. These helped to give context for the ʻleft hookʼ through the desert adopted by the coalition forces, predicting rapid advance with light casualties.

Cybernetics and the Cold War

The successful contribution of analytics during World War II naturally caused the Admiralty, the War Office, and the Air Ministry to see analytics as an aid to the planning of the forces in post-war UK. The problems presented in the early post-war years were formidable, including the strong downward pressure on defence budgets as the UK struggled to rebuild its economy. Forward-looking studies to support the planning process for future equipment procurement became a key issue: sifting through alternatives in order to gain best value from limited resources. Moving to the heart of the Cold War, during the 1970s and 1980s, analytics was increasingly used to consider potential future scenarios and conflicts, primarily, of course, in the hope that they would never happen. During the Cold War, mathematical modeling, exploiting the new technology of computer simulation, considered a very narrow set of scenarios in great detail: those potentially arising from major conflict in western Europe. With the rapid collapse of the Soviet Union during the period 1989-1991 came a whole series
of changes to the unpredictability, nature and range of potential future scenarios of conflict. This led to a key change that analytics needed to address to remain relevant, namely the entry of the information age into the Command process. The scripted warfare of the Cold War period was centered on attrition of the enemy, while coping with the ʻfog and frictionʼ of war. ʻFogʼ limits our sensing of enemy intent; ʻfrictionʼ constrains the number of courses of action open to us in response to our perception of events. Ashbyʼs law of requisite variety for controlled systems [2] requires the ʻvarietyʼ (the number of accessible states) of the system to be matched by the variety of its control. The fog and friction of the industrial age Cold War period meant that the forces in the field had low variety. This was matched by a low variety of command which we describe as ʻDeconflicted Commandʼ [3], where the battlefield was divided into sectors to reduce the command variety, with little or no land force interaction across sector boundaries. The aim of the Soviet Forces¹ was to sweep across Western Europe using the mass of their conventional forces to strike swiftly and deeply. In the light of such a threat, any mathematical model of the conventional (pre-nuclear) phase had to reflect NATOʼs primary aim: to slow the advance of the conventional forces for long enough to allow the political process to reengage. This slowing down by NATO would be achieved by a process of attrition, placing our strength against their strength in an all-out battle for survival. Mathematical modeling of attrition warfare, assuming Deconflicted command, was based on Lanchesterʼs equations (see Box). A comprehensive UK review is given in [4] and in the US by [5]. The Lanchester equations represented this conflict of attrition between two warring parties (ʻRedʼ and ʻBlueʼ) in which each side uses aimed fire to attack the other side [5]. From this we have the relationship

\[ \frac{db}{dt} = -k_r\, r(t). \]

By symmetry we have a similar relationship for the attrition of Red units, namely

\[ \frac{dr}{dt} = -k_b\, b(t). \]
From these expressions, setting b = b(t) and r = r(t), it also follows that \( k_b\, b\, db = k_r\, r\, dr \). Integrating both sides of this equation, and using the initial values \( r_0 \) and \( b_0 \) for the number of
¹ We use Soviet or Warsaw Pact interchangeably to represent the political and military grouping of communist countries under the sway of Russia during the period 1945 to 1989/1991.
Red and Blue units at the start of the conflict, we have the following relationship (from which the ʻsquare-lawʼ name derives):

\[ k_b \left( b_0^2 - b^2 \right) = k_r \left( r_0^2 - r^2 \right) \quad (2.1) \]
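As a quick sanity check, the square-law invariant can be verified numerically by integrating the aimed-fire equations. The sketch below uses simple Euler stepping; the kill rates and initial strengths are illustrative values, not data from the text:

```python
# Numerical check of the Lanchester square law:
# integrate db/dt = -k_r r, dr/dt = -k_b b and verify that
# k_b (b0^2 - b^2) = k_r (r0^2 - r^2) holds along the trajectory.
# The rates and initial strengths below are illustrative only.

def simulate(b0, r0, kb, kr, dt=1e-4, t_end=1.0):
    b, r = float(b0), float(r0)
    t = 0.0
    while t < t_end and b > 0 and r > 0:
        db = -kr * r * dt
        dr = -kb * b * dt
        b, r = b + db, r + dr
        t += dt
    return b, r

b0, r0, kb, kr = 100.0, 120.0, 0.03, 0.02
b, r = simulate(b0, r0, kb, kr)
lhs = kb * (b0**2 - b**2)
rhs = kr * (r0**2 - r**2)
# The two sides agree to within the discretisation error of the Euler scheme.
print(lhs, rhs)
```

Shrinking the step size `dt` tightens the agreement between the two sides, confirming that the invariant is a property of the continuous equations rather than of the discretisation.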
Now define the variables \( x = 1 - r/r_0 \), \( y = b/b_0 \), and \( L = k_r r_0^2 / (k_b b_0^2) \). Battles with the same value of L will evolve in a similar way. Substituting these variables x, y and L into the expression (2.1) above we have:

\[ 1 - y^2 = L x (2 - x) \]

\[ y = \sqrt{1 - L x (2 - x)} = \left( 1 - 2 L x \left( 1 - \frac{x}{2} \right) \right)^{1/2} \quad (2.2) \]

A first-order expansion of the expression at (2.2) is of the form

\[ y \cong 1 - L x \left( 1 - \frac{x}{2} \right) \cong 1 - L x \]
The relationship \( y = 1 - Lx \) corresponds to Lanchesterʼs linear law [5]. Thus we can see that when x, the proportion of Red units destroyed, is small, the linear and square law relationships evolve in the same way, and then diverge as the nonlinear terms grow larger. This Lanchester law of the Cold War period determined the rate of attrition and hence the force ratio as a function of time. The movement of the line of advance, called the forward edge of the battle area or FEBA, was then determined by the force ratio in the same way that a head of steam moves a piston one way or the other. Such models were thus called piston models, and the representation of the attrition and movement process in this way formed the core of the simulation models of that time. In practice, elaborated forms of the Lanchester aimed fire assumptions, such as the Bonder-Farrell equations [5], included sensor detections, ability to fire, detailed terrain screening characteristics and individual weapon system laydowns. This simulated dynamically a set of individual one-on-one exchanges between various types of direct fire units with time-varying interaction coefficients. In essence we are evaluating, through step-by-step simulation, a dynamic version of the heterogeneous form of the Lanchester interaction equations, which has no analytic solution. If \( R_i(t) \), \( B_j(t) \), \( \gamma_{ji}(t) \) are
respectively the number of surviving red units of type i at time t, the number of surviving blue units of type j at time t, and the attrition rate per unit time at time t of units of red of type i by a unit of blue of type j calculated by the simulation, then we have the instantaneous heterogeneous Lanchester relation

\[ \Delta R_i(t) = \Delta t \sum_j \gamma_{ji}(t)\, B_j(t) \]

where \( \Delta R_i(t) \) is the number of red units of type i destroyed by blue during a single simulation time step of duration \( \Delta t \). In the continuous limit \( \Delta t \to 0 \) we can replace this time-stepped relationship by the set of differential equations

\[ \frac{\partial R_i(t)}{\partial t} = -\sum_j \gamma_{ji}(t)\, B_j(t) \]
In matrix-vector terms we can write this, in an obvious notation, as the linear system

\[ \frac{\partial R(t)}{\partial t} = -\Gamma(t)\, B(t) \]

Lanchesterʼs aim, in developing these relationships, was to demonstrate mathematically the benefits of concentration of force: the anecdotal understanding that in war it is more advantageous to keep your forces concentrated together in a focused assault. We can see that this effect occurs by considering this question (to concentrate or not) from the point of view of optimal control theory. Since Lanchesterʼs equations in heterogeneous form are a linear system, we can show ([6], Appendix) that an extremal control must be unique and optimal. Thus any optimal control is bang-bang control, where the value ʻbangsʼ from one extreme possible value to another and, like a Heaviside function, never assumes any intermediate value. In this sense, the optimal control always lies on the boundary of the admissible set of controls. In terms of force employment it means that, under Lanchester assumptions, the whole force should be employed against a single objective.

An example of Aggregation

Tracking and aggregating the simulation output can give us insight into what is driving the output. By exploiting the symmetry of this type of armored warfare we can also derive a
ranking of the various unit types. This approach can be useful where urgent high-level insights are required. For example, I was required, during the mature Cold War period, to quickly analyze the impact of the class consisting of anti-armor weapons available to both sides (denoted Red and Blue). I did this by considering their ranking as follows.

1. Assume \( s_i(t) \) (the rank of Red weapon class i) is proportional to the attrition of Blue units times their ranking \( r_j(t) \). This gave me an equation of the form \( s_i(t) \propto \sum_j \gamma_{ij}(t)\, r_j(t) \).

2. Assume that the effect of the arms race is to force the overall ranking vector s of Red to be proportional to the overall ranking vector of Blue, so that

\[ s_i(t) = \lambda\, r_i(t) = \sum_j \gamma_{ij}(t)\, r_j(t), \quad \text{that is,} \quad \lambda\, r(t) = \Gamma(t)\, r(t) \]
Hence the Blue ranking vector is an eigenvector of the attrition matrix \( \Gamma(t) \). This is a square matrix with positive entries, so the Perron-Frobenius theorem (see for example [7]) guarantees that such an eigenvalue and eigenvector exist. With these assumptions of symmetry we can obtain a ranking of the various weapon classes as a function of time t. In essence we have solved for a fixed point of the function

\[ r(t) \to f(r(t)) = \frac{\Gamma(t)\, r(t)}{\lambda}. \]

If required, this approach can be generalised to the case of a non-linear ranking by using, for example, Brouwerʼs fixed point theorem or its generalisation by Schauder to Hausdorff topological vector spaces, illustrating the connection between equilibrium, hierarchy and fixed points.
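The ranking computation above amounts to finding the Perron eigenvector of the attrition matrix. A minimal sketch in Python uses power iteration; the matrix entries are invented for illustration and are not drawn from any real weapon data:

```python
# Power iteration to find the Perron-Frobenius eigenvector of a
# positive attrition matrix Gamma. By the Perron-Frobenius theorem,
# iteration from any positive start vector converges to the dominant
# eigenvector, whose entries give the weapon-class ranking.
# The matrix entries below are illustrative only.

def perron_rank(gamma, iters=200):
    n = len(gamma)
    r = [1.0] * n  # any positive start vector will do
    for _ in range(iters):
        s = [sum(gamma[i][j] * r[j] for j in range(n)) for i in range(n)]
        norm = sum(s)
        r = [x / norm for x in s]  # normalise so the ranking sums to 1
    # estimate the dominant eigenvalue lambda from one component
    lam = sum(gamma[0][j] * r[j] for j in range(n)) / r[0]
    return r, lam

gamma = [[0.4, 0.2, 0.1],
         [0.3, 0.5, 0.2],
         [0.1, 0.3, 0.6]]
ranking, lam = perron_rank(gamma)
print(ranking, lam)
```

Because every entry of the matrix is positive, the iteration converges regardless of the positive starting vector, which is exactly the robustness the Perron-Frobenius argument in the text relies on.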
Economic Competition in Stable Market Conditions

In completely stable market conditions, prices represent points of stable equilibria between supply and demand, due to competition between market traders. J von Neumann [8] conceived the interaction between two such traders as a zero-sum matrix game of payoffs for pure strategies. If the matrix of payoffs has a saddle point, the game solution is the corresponding pair of pure strategies. If such a stable point does not exist, von Neumann proved, using Brouwerʼs fixed point theorem, that an equivalent point of stability exists, provided that each player imposes a probabilistic choice function across their set of strategies. In later Chapters we will show that even in such stable markets, the optimal choice lies on a folded surface, giving rise to sudden, non-smooth change.
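For the simplest case of a 2×2 zero-sum game with no saddle point, the mixed-strategy equilibrium whose existence von Neumann proved has a well-known closed form. A small sketch (the payoff matrix is invented for illustration):

```python
# Closed-form mixed-strategy solution of a 2x2 zero-sum game with no
# saddle point. For payoff matrix [[a, b], [c, d]] (row player's gains),
# equalising the expected payoffs of the two pure strategies gives:
#   row player plays row 0 with probability p = (d - c) / (a - b - c + d)
#   column player plays col 0 with probability q = (d - b) / (a - b - c + d)
#   value of the game v = (a*d - b*c) / (a - b - c + d)

def solve_2x2(a, b, c, d):
    denom = a - b - c + d
    p = (d - c) / denom          # row player's mixing probability
    q = (d - b) / denom          # column player's mixing probability
    v = (a * d - b * c) / denom  # value of the game
    return p, q, v

# Matching-pennies-style payoffs: no saddle point exists.
p, q, v = solve_2x2(1.0, -1.0, -1.0, 1.0)
print(p, q, v)  # each player mixes 50/50; the game value is 0
```

For larger payoff matrices the same equilibrium can be computed by linear programming, which is the standard numerical route.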
Front Line Analysis Rediscovered; The Falklands War
In 1982, I was a junior analyst working as part of a small policy analysis team, providing reports and advice to the management board of the Royal Air Force. The team were in the main headquarters building of the Ministry of Defence, in Whitehall, London, just opposite number 10 Downing Street. My responsibilities covered all existing and planned air-to-ground weapons. At the request of the Air Staff, I had completed a review of the stockpile of such weapons using Linear Programming to help decide the best mix. This had indicated the need to increase significantly the number of general purpose laser- and GPS-guided bombs with high accuracy, to complement more specialist weapons; see the technical appendix to [9]. Through this work I had established good links with the senior military officers in the building. The Chief of the Air Staff was asked by the Secretary of State for Defence to establish a way of attacking Port Stanley airfield using one or more Vulcan aircraft. This airfield was the one useful concrete runway on the Falkland Islands, and was being used by the Argentinean forces as one of their main links back to the mainland. The analysis had to be done in a matter of hours, for reasons that only became clear a few weeks later. They turned to me for advice. An RAF Air Commodore was immediately nominated as my point of contact with the Chief of the Air Staff. An Operations Cell to manage the initial response to the Argentinean invasion had been set up in the building, and I was given direct access, through two steel-barred doors, to the RAF Group Captain (an ex-Vulcan pilot) who was in control of this. His desk sat at the rear of a long, low room festooned with cables, TV monitors, and the desks of very busy people. In discussion with him, I established some of the parameters of the operation, such as the height of the aircraft over the target.
I then retrieved a large-scale map of Port Stanley Airfield from the map store in the basement of the building, several floors underground. With some help, I rapidly established the name of the construction firm that had constructed the airfield in the first place and phoned them up. They gave me the precise composition of the runway surface (such as the thickness of the concrete and what was underneath it). Colleagues at
the Royal Aircraft Establishment (as it then was) could then calculate for me the size of a crater made by a 1,000 lb bomb dropped from a Vulcan. I had then about three hours to calculate the optimal bomb load. This was crucial to the success of the mission. Too many would increase the lift requirement on the airframe unnecessarily, leading to higher than required fuel usage and too many air-to-air refuelings, or even possible ditching of the aircraft in the South Atlantic. Too few would jeopardize the ultimate aim of the mission: to strike the runway and deny its use. The bomb bay of the Vulcan allowed three packages of 1,000 lb bombs to be carried in total. Each of these packages consisted of two slung rows. The first row consisted of four bombs, with the second row of three bombs fitted snugly into the gaps between those in the first row. These mechanical constraints reduced the problem to carrying 7, 14 or 21 bombs. I calculated the throw of the bombs and their scatter along the mean track of the aircraft, which was at a slight, optimized, angle relative to the runway. My analysis confirmed that all 21 would be required to have a reasonable chance of striking the runway. This immense payload had to be hauled across several thousand miles of ocean. It meant refueling aircraft refueling other refuelers to get them into the right position to refuel the Vulcan. The Group Captain worked out the intervalometer settings that would need to be used (this was not trivial, because the Vulcan was designed to deliver nuclear weapons). These release timings were based on my calculations of the required spacing between the craters on the ground.
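The style of calculation described above can be illustrated with a simple Monte Carlo sketch: a stick of equally spaced bombs crosses a rectangular runway at an angle, the whole stick is displaced by an aiming error, and we estimate the chance that at least one bomb lands on the runway. Every parameter below (runway size, spacing, aim error, angle) is invented for illustration; this is not the original 1982 analysis:

```python
import math, random

# Monte Carlo sketch of a stick-bombing calculation: n bombs, equally
# spaced along the aircraft track, cross a rectangular runway at an
# angle; a Gaussian aiming error shifts the whole stick. We estimate
# the chance that at least one bomb lands on the runway.
# All numbers are illustrative only.

def hit_probability(n_bombs, spacing, angle_deg, runway_len, runway_wid,
                    aim_sd, trials=20000, seed=1):
    rng = random.Random(seed)
    ang = math.radians(angle_deg)
    hits = 0
    for _ in range(trials):
        # aiming error displaces the stick centre (along and across track)
        cx = rng.gauss(0.0, aim_sd)
        cy = rng.gauss(0.0, aim_sd)
        for k in range(n_bombs):
            d = (k - (n_bombs - 1) / 2) * spacing  # offset along the stick
            x = cx + d * math.cos(ang)
            y = cy + d * math.sin(ang)
            if abs(x) <= runway_len / 2 and abs(y) <= runway_wid / 2:
                hits += 1
                break  # at least one bomb on the runway; stop this trial
    return hits / trials

# A longer stick of bombs should never reduce the chance of a hit.
p7 = hit_probability(7, 60.0, 30.0, 1200.0, 45.0, 400.0)
p21 = hit_probability(21, 60.0, 30.0, 1200.0, 45.0, 400.0)
print(p7, p21)
```

Running the sketch with increasing stick lengths reproduces the qualitative conclusion of the text: with a large aiming error relative to the runway width, only the full stick gives a reasonable chance of a hit.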
I drew the bomb craters on three strips of plastic (seven on each strip) to the correct scale of the large map of the airfield, and wrote out my analysis of the chance of getting a bomb on the runway in terms of the number of bombs, the height of the aircraft and the angle of the aircraft track relative to the runway (all of which were important in deciding the final aircraft flight path), and I then briefed the ʻ1-starʼ RAF air commodore. He told me later that this information was first briefed to the Secretary of State later the same day, and then briefed by the Chief of the Air Staff to the War Cabinet, chaired by Mrs. Thatcher, the then Prime Minister. Mrs. Thatcher herself played around with the strips of plastic and the map before declaring that 21 bombs would have to be used, and so it was decided. Following detailed planning by the RAF of the refueling schedule, and much innovative engineering, including
retrofit of the refueling probes, the attack went ahead about two weeks later, and one bomb struck the runway, as planned and as reported across the worldʼs press and media the following day. From the official RAF history of the operation (http://www.raf.mod.uk/history/OperationBlackBuck.cfm, accessed Nov 2016) we have the following:

Of the 21 bombs, one hit the runway at its mid-point, cratering the concrete; the rest fell to one side and caused serious damage to airfield installations, aircraft and stores. After the attack, the plan called for the Vulcan to return to 300 ft to avoid the defences. Since no reaction was detected from the Argentine defences, Withers immediately climbed to an economic cruising level to save fuel. The return trip went exactly as planned; the rendezvous with the Nimrod and the additional tanker support were straightforward after the events of the long night. XM607 touched down at Ascension at the end of an astonishing 15 hours and 50 minutes in the air, which included 18 air-to-air refuelings. For this extraordinary, record-breaking mission and their superb airmanship throughout, Flt Lt Withers and Sqn Ldr Tuxford were awarded the Distinguished Flying Cross and the Air Force Cross respectively.

The Cascade of Effects

The immediate (military) effect was disruption of take-offs from Port Stanley airfield. However, Operation BLACK BUCK (as it became known) not only reached and bombed the target but, in doing so, showed the Argentineans that the RAF had the potential to hit targets in Argentina [10]. This forced them to move their fighters further north, thus preventing them from escorting the air raids against the British, especially the RN and Merchant Ships entering San Carlos Water. This reduced the pressure of attacks on the Royal Naval task force (cascaded military effect). “Moffat.... had indirectly taken out a large chunk of the entire Argentinean air force” [11].
At the political level, the consequent success of the task force (due to the brave men and women on board, many of whom gave their lives during that action) bolstered the Prime Ministerʼs position in helping to confront the Soviet Union during the Cold War period (cascaded political effect).
All of this was achieved by a very hierarchical ʻcold warʼ planning organization which was able to morph for a short time into an agile, trusted and informal network of experts.
Moving Beyond the Cold War; First Attempts

The MACRO aggregate level model was developed by Bonder and Farrell as an early attempt to move beyond the ponderous and very detailed Cold War modeling. MACRO used Linear Programming to derive weightings for different components of the direct fire battle, aggregating the direct fire force into a homogeneous mass along a given corridor of advance, represented as a flexible piston. During the planning phase in the run-up to Gulf War 1 (Desert Shield and Desert Storm), a number of these pistons were used to characterise alternative coalition plans and force laydowns. The coalition front line forces in a given corridor of advance were represented by the aggregate value B(t) and the Iraqi forces by the aggregate value R(t), with a Lanchester-like attrition equation applied. It was calibrated in the UK by historical analysis and by wargaming the same situation employing military staffs, to ensure that it captured the essential dynamics of the process. This type of mathematical simulation model, and its more detailed counterparts, such as the US VIC (Vector in Commander) model, were developed within the context of the Cold War and thus represent industrial age command. The role of the commander-in-chief (such as General Schwarzkopf in Gulf War 1) in attacking the enemy was to amass overwhelming force with the aim of achieving rapid advance along a number of attack axes. Such a plan of attack I call ʻDeliberate Planningʼ. Less hierarchical, more agile structures, devolving decision-making to lower levels, require this to be complemented by a process I call ʻRapid Planningʼ.
Introduction to the Modeling of Deliberate and Rapid Planning

During the earlier post-war period, communications were slow and networks were sparse; the resultant variety of the control system was low. Thus the combat system itself was Deconflicted (the Central Region ʻlayer cakeʼ) to reduce to a minimum the interaction between neighboring army units and thus lower the variety to match that of the command system. At this level of planning, a commander-in-chief considers a number of potential courses of action, taking into account his intent (i.e. his primary goal or objective), and the perceived
intent of the enemy force. The algorithm which represents this process firstly develops a ʻpictureʼ of the layout and intent of the enemy force, based on sensor inputs. On the basis of this ʻpictureʼ, the planner then chooses, using a form of Artificial Intelligence (AI), a layout of the friendly force which best achieves the commanderʼs goals. This aspect of deliberate planning can be performed, for example, by ʻbreedingʼ plans in an innovative way, using a genetic algorithm (GA), and then selecting a plan with a high ʻfitnessʼ level from the GA. Rapid Planning is based on the psychology of naturalistic decision making. It represents mathematically the heuristic pattern-matching approach adopted by experts under stressful and highly dynamic circumstances. More detail of these AI algorithms will be given in later Chapters.
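The plan-breeding idea can be illustrated with a minimal genetic algorithm. Here a ʻplanʼ is just a bit-string and the fitness function is a toy stand-in (it simply counts ones); both are invented for illustration and are not the planning model described in later Chapters:

```python
import random

# Minimal genetic algorithm sketch for 'breeding' plans. A plan is a
# bit-string; fitness here is a toy stand-in (count of ones). Tournament
# selection, one-point crossover and bit-flip mutation then evolve a
# population toward high fitness.

def fitness(plan):
    return sum(plan)  # toy objective: maximise the number of 1s

def breed(parent_a, parent_b, mut_rate, rng):
    cut = rng.randrange(1, len(parent_a))  # one-point crossover
    child = parent_a[:cut] + parent_b[cut:]
    # bit-flip mutation keeps some diversity in the population
    return [bit ^ 1 if rng.random() < mut_rate else bit for bit in child]

def run_ga(n_bits=30, pop_size=40, generations=60, mut_rate=0.01, seed=7):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # tournament selection: pick the fitter of two random plans
            a, b = rng.choice(pop), rng.choice(pop)
            return a if fitness(a) >= fitness(b) else b
        pop = [breed(select(), select(), mut_rate, rng) for _ in range(pop_size)]
    return max(pop, key=fitness)

best = run_ga()
print(fitness(best), len(best))
```

In the planning application sketched in the text, the bit-string would encode a force layout and the fitness function would score how well that layout achieves the commanderʼs goals.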
References

1. Lanchester F W (1916) Aircraft in Warfare: The Dawn of the Fourth Arm. Constable Ltd, UK
2. Ashby W R (1991) Requisite Variety and Its Implications for the Control of Complex Systems. In: Facets of Systems Science. International Federation for Systems Research International Series on Systems Science and Engineering, vol 7. Springer, Boston, MA, USA
3. NATO (2010) NATO NEC C2 Maturity Model. Office of the Secretary of Defense, US DoD
4. Bowen K and McNaught K (1996) In: The Lanchester Legacy, Vol 3. Coventry University, UK
5. Taylor J (1980) Lanchester Models of Warfare (two-volume set). Military Applications Section, Operations Research Society of America, USA
6. Moffat J (2003) Complexity Theory and Network Centric Warfare. Office of the Secretary of Defense, US DoD (reprinted 2004; now available on Kindle)
7. Keener J (1993) The Perron-Frobenius Theorem and the Ranking of Football Teams. SIAM Review 35
8. von Neumann J and Morgenstern O (1944) Theory of Games and Economic Behavior. Princeton University Press
9. Daniel D et al (1984) Maximising Flexibility in Air Force Weapon Procurement. J Operational Research Soc 35, No 3
10. RAF AP 3003, A Brief History of the Royal Air Force
11. Ramo J (2009) The Age of the Unthinkable. Little, Brown, London UK
Modeling Conflict and Competition far from Equilibrium. Chapter 3; Complexity, Entropy and Non-Linearity
Professor James Moffat

Lanchester to the Fractal-Based MANA Equation

The Technical Cooperation Program (TTCP) is an agreement among Canada, the UK, the USA, New Zealand and Australia allowing, inter alia, collaborative research on aspects of
defence and security. Much of what follows is a consequence of TTCP collaborative research captured in a TTCP report [18], available at arXiv:nlin/0607051v1 (accessed Nov 2017). The two key inputs to our scale free system of conflict, we assume, are the unit effectiveness k and the time interval Δt. The variable k has dimensions \( F T^{-1} F^{-1} = T^{-1} \), with F the dimension of force and T the dimension of time; thus \( k\,\Delta t \) is dimensionless. The key output is the attrition to the opposing force per time interval. We thus assume a scaling collapsed relationship of the form

\[ E(\Delta b / \Delta t) \propto k\, R(t)\, \Phi(\Pi) \]

with the dimensionless variable \( \Pi = k \Delta t \). E is a statistical expectation over some time period T which we normally assume to be significantly greater than Δt. The next question is the precise functional form of the system response function Φ. Results from the New Zealand MANA conflict model produced by the UK at USPACOM Maui [19] are relevant here. A key process underlying the MANA model is how agents from either side discover opposition agents. Recall that for the aimed fire Lanchester equation (see Box) it was necessary to assume that each unit can detect and potentially fire at all the remaining units on the other side. Here we assume that the detection range of the agents is much smaller than the size of the battlespace. This forces the set of interactions across the large number of units to be both local and non-linear in character, and thus satisfies our definition of a complex system. The evidence from [7-10, 19] that the real system is fractal in nature leads us to assume that Φ is a smooth homogeneous function and so, by Eulerʼs homogeneous function theorem, has the form \( \Phi(x) \cong x^{\alpha} \), where \( x = k \Delta t \) and α is an anomalous dimension. Note that this analysis does not apply to constrained flow through pipes, since for that application the scaling collapse plot has a non-smooth kink corresponding to the onset of non-laminar flow.
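A scaling relationship of this kind is typically tested by fitting the exponent α on log-log axes. The sketch below does this with synthetic data generated from a known exponent; the exponent value and noise level are invented for the example:

```python
import math, random

# Sketch of estimating the anomalous exponent alpha in Phi(x) ~ x**alpha
# by least-squares regression on log-log axes, using synthetic data
# generated with a known exponent (0.75 here, chosen for illustration).

def fit_exponent(xs, ys):
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(lx) / n, sum(ly) / n
    # slope of the log-log regression line = estimated alpha
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

rng = random.Random(0)
true_alpha = 0.75
xs = [0.01 * (i + 1) for i in range(100)]
# multiplicative noise keeps the data positive for the log transform
ys = [x ** true_alpha * math.exp(rng.gauss(0.0, 0.05)) for x in xs]

alpha_hat = fit_exponent(xs, ys)
print(alpha_hat)  # close to 0.75
```

A straight line on the log-log plot is the signature of the smooth homogeneous form assumed above; the kink mentioned for pipe flow would show up as a break in the fitted line.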
Inserting Eulerʼs expression in the form Φ(Π) = Π^α, where Π = kΔt, into the tentative relation

E(ΔB/Δt) ∝ k R(t) Φ(Π)

gives

E(ΔB/Δt) ∝ k R(t) (kΔt)^α
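The scaling collapse idea can be checked numerically. The following sketch is my own illustration, not from the TTCP report: the constant c, the force level R and the anomalous dimension α = 0.5 are all hypothetical values, chosen only to show that the rescaled output depends solely on the dimensionless invariant Π = kΔt.

```python
# Illustrative sketch (hypothetical values): a synthetic response obeying
# the fractal attrition rate law E(dB/dt) = c * k * R * (k*dt)**alpha.
def attrition_rate(k, dt, R=100.0, c=1.0, alpha=0.5):
    return c * k * R * (k * dt) ** alpha

# Scaling collapse: the rescaled output E/(k*R) depends only on the
# dimensionless product PI = k*dt, so curves for different k coincide.
def collapsed(k, dt, R=100.0):
    return attrition_rate(k, dt, R) / (k * R)

PI = 0.25  # one value of the dimensionless invariant
values = [collapsed(k, PI / k) for k in (0.1, 1.0, 10.0)]
assert all(abs(v - values[0]) < 1e-12 for v in values)
print(values[0])  # 0.5 -- a single collapsed curve value at PI = 0.25
```

Varying k over two orders of magnitude while holding Π fixed leaves the collapsed value unchanged, which is the defining property of a scale free relationship.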
Modeling Conflict and Competition far from Equilibrium Chapter 4; Clustering, Herding and Volatile Information Flows.
Summary: Clustering or herding of agents, in conflict or in markets, leads to a rich set of possible trajectories of system evolution. To understand the classes of behaviors which might emerge, we define and analyze a set of extremal models. These turn data into patterns and patterns into meaning.

Capturing the process of intelligent agents in conflict or competition, set within a widely divergent set of possible futures, produces many possible trajectories of system evolution for analysis to consider. An extremal model is the simplest possible form of cellular automaton or agent based simulation model which represents the effects of interest [1].

Volatility in Stock Prices
A key human behavior affecting stock prices, and the related phenomenon of bubble markets, is the clustering of market trades and traders due to the herding instinct. The herding instinct is a mentality characterized by a lack of individual decision-making or thoughtfulness, causing people to think and act in the same way as the majority of those around them. In finance, the herd instinct relates to instances in which individuals gravitate to the same or similar investments based almost solely on the fact that many others are investing in those stocks. The fear of regret at missing out on a good investment is often a driving force behind the herd instinct [2].

Herding and Investment Bubbles
An investment bubble results from a rapid escalation in the price of an asset over its intrinsic value, caused by exuberant market behavior perpetuated through a positive feedback loop. The bubble inflates until the asset price reaches a level
beyond economic rationality, when further increases in price are contingent purely on investors continuing to buy in at the highest price [2]. The move to electronic trading, reducing the delay between offer and acceptance of a trade on the upside of a bubble, and shorting strategies on the downside, can make the situation worse.

To understand clustering, herding and other related phenomena, it is best to start with an extremal system model consisting of simple agents which generate complex behaviors through their interactions. A good example is the ʻSugarscapeʼ agent model of trading [3], from which the classic distribution of wealth in a market economy emerges. In our case we used the ISAAC model of conflict developed by the US Marine Corps [4]. Our reason for this is that it represents an extreme point of the set of all such agent based simulations: it is simple, represents the effect of local clustering of agents, is completely unscripted and exhibits non-linear effects. It thus stands in contrast to the opposite extreme of a very tightly scripted model, which is complicated but not complex. I need to explain the difference between a complicated and a complex model, and this is as good a time as any to tackle it.

Complicated vs Complex Models
Imagine lying on a snowy slope, watching skiers coming down the opposite side of the valley, on-piste. There are two explanations for the behavior you see, corresponding to the two paradigms:
1. The complicated explanation, corresponding to top down or ʻdeconflictedʼ command and control [5]. Each skierʼs movements are prescribed through a hidden plan.
2. The complex explanation, corresponding to enabling or ʻedgeʼ command and control [5]. The constraints in place produce both desired and surprising emergent effects.
Applying Occamʼs Razor (that the simpler explanation is nearer the truth) would tend to favour the second explanation, in most cases. From these considerations, and others, we can characterise a complex and adaptive system as having the following set of characteristics [6];
1. Non-linear interaction between agents, in the sense that the relation between the input and output of the interaction is non-linear.
2. Decentralised Control: the emergent behavior is generated through local coevolution, not through top down instruction.
3. Non-Equilibrium Order: order emerges from the space and time correlations inherent in an open, dissipative system far from equilibrium.
4. Adaptation: clusters of local interactions are constantly being created and dissolved across the system, corresponding to correlation effects in space and time.
5. Collectivist Dynamics: elements locally influence each other, and these effects ripple through the system.

To develop a mathematical model of herding exhibiting these ʻfive characteristics of complexityʼ we are driven to assume that our model has to be scale free. From Chapter 3 this implies that the relationship between inputs and outputs can be represented as a relationship between renormalisation group invariants, such as the following for a 2-input, 1-output or (2,1) model:
Π = Φ(Π₁, Π₂)

Different values of the invariants Π₁ and Π₂ correspond to essentially different dynamic regimes of operation of the system. Unfolding the (2,1) model gives:
f(a₁, …, a_k, b₁, b₂) = a₁^p … a_k^r Φ( b₁ / (a₁^{p₁} … a_k^{r₁}), b₂ / (a₁^{p₂} … a_k^{r₂}) )
Two possibilities are available for this system (see Chapter 3 for details).

Type 1 model.
Φ tends to a non-zero finite limit as either of the independent inputs tends to 0 or infinity, as appropriate. This means that the transformation function Φ can be replaced by a constant.
Type 2 model.
Φ has a renormalisation limit of the Gell-Mann form. The power law form of the limiting expression still leads to complete separation of variables, but with power exponents equal to the ʻanomalousʼ fractional dimensions of renormalisation.

For evidence of models of Types 1 and 2, where there is no absolute scale or gauge for the variables, this approach directs us to search for power law relationships of the form y = x^α which, if plotted on a log-log scale, give a straight line whose slope is the power law exponent. Such expressions arise naturally in certain types of complex systems, particularly where fractal structures are involved, and I refer to them as ʻscalingʼ relationships, since they have no preferred gauge or scale.
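The log-log test just described can be sketched in a few lines. This is my own minimal illustration (the function name and the synthetic data are mine): regress log y on log x by ordinary least squares and read the power law exponent off the slope.

```python
import math

# A minimal sketch of the log-log test for a scaling relationship
# y = x**alpha: regress log y on log x and read off the slope.
def loglog_slope(xs, ys):
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Ordinary least squares slope of log y against log x.
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic data generated from y = 3 * x**1.35; the slope recovers the
# exponent, while the prefactor 3 only shifts the intercept.
xs = [1, 2, 4, 8, 16, 32]
ys = [3 * x ** 1.35 for x in xs]
print(round(loglog_slope(xs, ys), 3))  # 1.35
```

With real, noisy data the straight-line fit is only approximate, and the quality of the fit itself is part of the evidence for or against scaling.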
Evidence for stable long term scaling behavior This analysis deals with the macro level looking across wars and having a time horizon of centuries. It is complemented by the recent historical analysis work on the Afghan conflict by Dobias et al. described in Chapter 3, the analysis in [7] and by other earlier and more ad hoc pieces of evidence [8]. First there is the recent work of Roberts and Turcotte [9]. By considering the intensity of the war as the number of battle deaths (suitably normalised to reflect the total civilian population), scaling relationships are obtained between the intensity of war and its frequency, which are stable over several centuries. This extends previous work by Richardson [10].
Secondly, in [11] Hartley has analysed datasets also spanning several centuries in time. Given initial force sizes x₀, y₀ and final force sizes x, y he defines the following two dimensionless variables:

1. The Helmbold Ratio: HELMRAT = (x₀² − x²) / (y₀² − y²)

2. The Initial Force Ratio: FORRAT = x₀ / y₀

In [11] it is shown that these variables have a remarkably stable scaling relationship of the form:

The Hartley Relationship: Ln(HELMRAT) = α Ln(FORRAT) + β
where the expected value of α is approximately 1.35 and the value of β is approximately normally distributed about the value –0.22 with standard deviation 0.7. Hartley shows that the value of α has the characteristics of a universal constant.
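The Hartley relationship is easy to evaluate numerically. The following sketch is my own (the engagement figures are hypothetical, not from Hartley's datasets); it computes the two dimensionless ratios and the expected value of Ln(HELMRAT) using the quoted fit α = 1.35, β = −0.22.

```python
import math

# A numerical sketch of the Hartley relationship (illustration only;
# the force sizes below are hypothetical, not historical data).
def helmrat(x0, x, y0, y):
    return (x0 ** 2 - x ** 2) / (y0 ** 2 - y ** 2)

def forrat(x0, y0):
    return x0 / y0

def hartley_prediction(x0, y0, alpha=1.35, beta=-0.22):
    # Expected value of Ln(HELMRAT) given Ln(FORRAT), per the quoted fit.
    return alpha * math.log(forrat(x0, y0)) + beta

# Hypothetical engagement: Blue falls from 1000 to 800,
# Red falls from 500 to 200.
observed = math.log(helmrat(1000, 800, 500, 200))
predicted = hartley_prediction(1000, 500)
print(round(observed, 3), round(predicted, 3))
```

The observed and predicted values will generally differ for any single engagement, since β is itself approximately normally distributed with standard deviation 0.7; the relationship holds in expectation across many battles, not case by case.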
Finally, in [6] we considered the emergent behavior of manoeuvre based warfare, based on a number of sets of historical data. The historical data indicate that for a given type of breakthrough (Immediate, Quick or Prolonged) the mean advance at breakthrough has a Lognormal distribution with a certain scaling character, corresponding to the phenomenon of scaling collapse introduced in Chapter 3. This means that all the curves collapse onto each other under suitable renormalisation of the parameters. There is also evidence from these data for two classes of such emergent behavior, corresponding to linear and radial breakthrough.

Businesses evolve or die. In the battle for survival, those that thrive do so by occupying the equivalent of biological niches, converting inputs into sufficient value to continue surviving. The dynamical ecosystem is the total economy. Trophic levels stratify it into micro-level/short-term, meso-level/medium-term and macro-level/long-term decision horizons. We now need to relate this pervasive idea of renormalisation/scale
invariance spanning across the trophic levels of our ecosystem to a model formalism for the ISAAC cellular automaton combat model and also to encompass aspects of the effect of clustering and local collaboration within this procedure. First, let us assume that we have an agent based conflict simulation representing the interaction between Red and Blue agents. Also assume that the clustering process, say for Red, is represented by the following effects:
1. That the number of discrete clusters of Red agents at time t, N(t), is specified ahead of the simulation and that 2. N(t) is a decreasing function of t.
The first of these effects simply states that we know, or can calculate, the average cluster size for the Red agents. The second is meant to suggest that the number of Red clusters stabilizes and then decreases over time, reflecting the desire of Red to concentrate force. This latter assumption requires some caveats. It applies, most likely, in the mature and end phases of a simulation as numbers decline. Nearer to the start, evidence suggests that the number of clusters tends to increase, due to dispersion from starting positions and initial interaction with opposing forces. With these assumptions, let us further assume that the smallest cluster of Red agents, of size X(t) at time t, is taken and added to another, randomly chosen cluster of Red agents.
This process represents both the concentration of Red force and the reconstitution of force elements. Define:

➢ The cluster distribution function: ϕ(x, t) = (expected number of clusters of Red agents of size ≥ x at time t) ÷ (initial total number of clusters of Red agents);

➢ The fraction of remaining Red clusters: N(t) = (total number of remaining clusters of Red agents at time t) ÷ (initial total number of clusters).

Adapting the proof of [12], it can easily be shown that the cluster distribution function ϕ(x, t) approaches a self-similar distribution of the form

ϕ(x, t) = g(x / X(t)) / X(t)
where g is a positive continuous function and where, for no clustering (i.e. x=1) we have g (1) = N (t ) X (t ) . This self-similar form means that we can define the distribution of relative cluster size in a way which is time invariant. Now assume that the evolution of the distribution of ϕ ( x, t ) is continuous with a continuous differential, which we define as ʻsmoothʼ. Then a small change in time t leads to a small change in ϕ ( x, t ) - this is equivalent to saying that the renormalisation group is smooth. If ϕ ( x, (1 − δ )t ) is the expected cluster size x at time (1 − δ )t and ϕ ( x, t ) is the same expectation at time t , then this assumption means we can find a constant b to first order such that:
ϕ(x, t) = (1 + bδ) ϕ(x, (1 − δ)t)

i.e. for δ → 0,

t ∂ϕ(x, t)/∂t = b ϕ(x, t)

thus

log ϕ(x, t) = α log t + β

with α = b. We have shown that the normalized expected cluster size at time t, ϕ(x, t), varies as a power law with scaling constant α. This scaling scheme can be related directly to ISAAC cellular automata behavior by assuming each automaton moves, taking account of interactions, with a mean velocity v in time Δt. This represents the fact that ΔB/Δt is proportional to the product of

(Red unit effectiveness) × (the probability of meeting a Red cluster) × (the expected number of Red units per cluster).
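The clustering process assumed above, in which the smallest Red cluster is repeatedly absorbed into a randomly chosen cluster, can be sketched directly. This is an illustration of mine, not the author's simulation code; it shows the two assumed effects, a decreasing cluster count N(t) and conservation of total force.

```python
import random

# Illustrative extremal sketch of the clustering process described in the
# text: at each step the smallest Red cluster is absorbed into another,
# randomly chosen cluster, so the number of clusters N(t) decreases while
# the total force is conserved.
def coalesce(clusters, steps, rng):
    clusters = list(clusters)
    for _ in range(steps):
        if len(clusters) < 2:
            break
        smallest = min(range(len(clusters)), key=clusters.__getitem__)
        size = clusters.pop(smallest)
        clusters[rng.randrange(len(clusters))] += size  # absorb it
    return clusters

rng = random.Random(1)
start = [1] * 200                      # 200 Red units, initially dispersed
end = coalesce(start, 150, rng)
print(len(end), sum(end))              # 50 200 -- count falls, force conserved
```

Recording the cluster sizes over time and rescaling by the smallest cluster X(t) is the numerical analogue of testing for the self-similar form ϕ(x, t) = g(x / X(t)) / X(t).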
Keeping the cluster size constant for the moment, this indicates, from the analysis of Chapter 3, that the rate law for the ensemble average of Blue automata is given by

ΔB/Δt = k^{q(D)} Δt^{r(D)}

If Red cluster size varies according to a distribution ϕ(x, t) = g(x / X(t)) / X(t), where x is the cluster size and X(t) the smallest cluster at time t, then we have:

ΔB/Δt = k^{q(D)} Δt^{r(D)} N(t) g(y(t))

where N(t) is the normalised number of clusters of Red at time t and g(y(t)) is the scaled distribution of cluster size, whose expectation evolves as a power law (as we have shown). The above equation is thus exactly what we would expect for a renormalisation invariant model of Type 2.

More General Forms of Clustering
Fundamental to the above exposition, leading to a metamodel of Type 2, is the assumption that the average number of Red clusters stabilizes and then decreases with time. This does not adequately describe the full ability of Red units to move in a dispersed way and re-concentrate at selected points to attack Blue where necessary. Similarly, traders will cluster and re-cluster dependent on the coevolution of stock prices. A more general statement of the unifying mechanism for clustering and local collaboration, leading to scaling effects and power laws, is that of Self-Organised Criticality (SOC).

Self-Organised Criticality
Our general context is, as stated earlier in the chapter, that businesses evolve or die. In the battle for survival, those that thrive do so by occupying the equivalent of biological niches, converting inputs into sufficient value to continue surviving. The
dynamical ecosystem is the total economy. Trophic levels stratify it into micro-level/short-term, meso-level/medium-term and macro-level/long-term decision horizons. In the Self Organized Criticality (SOC) model of such a dynamical ecosystem, species evolve over a fitness landscape with random mutations and relative selection towards higher fitness. Bak [13] introduced a set of extremal cellular automata models to describe these extremal processes. Here we consider two fundamental questions: What is the mechanism of SOC clustering? What universal properties describe these clusters once the critical state has been reached?
Two exact equations for the dynamics of self-organised behavior have been formulated. The first describes the approach toward the critical attractor as a function of time, and is governed by a ʻgapʼ equation for the divergence of cluster sizes. This shows how such clustering processes are generated. The second represents the fractal structure of the clusters by describing the statistics of the number of active ʻsitesʼ or automata involved in a cluster. An exponent governs the approach to the critical attractor state. A number of cellular automata models of SOC are possible, corresponding to distinct types of clustering phenomena, and Bak uses a particular assumption to relate the critical exponents in a broad range of such models to two basic exponents characterizing the critical attractor. One such extremal model captures the key idea of local clustering between physically neighboring automata, and is known as the Bak-Sneppen Evolution Model.
The Bak-Sneppen Evolution Model
In this cellular automaton model [14], we have a d-dimensional lattice, and random numbers f_i drawn at random from the interval [0,1] occupy the lattice sites. At each update step the extremal site (that is, the one with the smallest value
of f_i) is chosen, and then it and its 2d immediate neighbours are assigned new random numbers. As a model of evolution, the values f_i correspond to ʻfitnessʼ values. Changing both the site and neighboring sites captures the process of local co-evolution. It turns out that the set of such active sites is a fractal in space-time. The approach to the critical attractor of the process (at which clusters of all sizes are possible) is controlled by the ʻgap equationʼ

dG(s)/ds = (1 − G(s)) / (L^d ⟨S⟩_{G(s)})

where G(s) is the maximum extremal value f_i(s) at time s, L is the linear size of the lattice, and ⟨S⟩_{G(s)} is the average cluster size at time s. (A cluster consists of a set of extremal values f_i, each of which is a neighbour of the previous extremal value. The size of a cluster is the number of timesteps for which this process continues.) From the previous equation, the rate of closure of the gap is inversely proportional to the average avalanche size:

Ġ(s) ∝ 1 / ⟨S⟩_{G(s)}

Thus at the critical value f_c, ⟨S⟩_{G(s)} → ∞ and we have the scaling law

⟨S⟩ ≈ (f_c − f_i)^{−γ}
for some exponent γ . Thus the distribution of cluster size is governed by a power law expression at the critical point.
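The model itself is short enough to sketch. The following is my own minimal one-dimensional implementation, not Bak and Sneppen's code: N sites on a ring hold random fitness values, and each update replaces the least-fit site and its two neighbours with new random values, while we track the gap G(s), the largest minimum fitness seen so far.

```python
import random

# A minimal sketch of the one-dimensional Bak-Sneppen evolution model.
# The gap G(s) -- the largest minimum-fitness value seen so far --
# closes towards the critical value f_c (roughly 0.667 in one dimension).
def bak_sneppen(n_sites, n_steps, rng):
    f = [rng.random() for _ in range(n_sites)]
    gap = 0.0
    for _ in range(n_steps):
        i = min(range(n_sites), key=f.__getitem__)   # extremal site
        gap = max(gap, f[i])                         # track the gap G(s)
        for j in (i - 1, i, (i + 1) % n_sites):      # site plus 2 neighbours
            f[j] = rng.random()                      # (indices wrap round)
    return gap

gap = bak_sneppen(n_sites=200, n_steps=20000, rng=random.Random(42))
print(round(gap, 2))  # closes slowly towards f_c as avalanches grow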
Before the critical point is reached, at some time t, if f_0 is the smallest random number on the lattice in the evolution model, then random numbers created at the next timestep will only continue the clustering process if they are smaller than f_0.
Thus the value f_0 can be viewed as the branching probability of a random process over time, giving rise to a scaling relation of the form

P(S, f_0) = S^{−τ} g(S (f_c − f_0)^{1/σ})

for the probability distribution of clusters of size S for the extremal value f_0.

Bubble Markets
In the context of herding creating a bubble market, this equation describes the statistics of local bubble formation at a transient point f_0 heading towards the critical attractor value f_c. The parameters τ and σ are model dependent and g is a scale invariant function. If, as the system evolves, we mark each of the minimal sites on the lattice as it is identified as an extremal value f_0, then the set of marks generated over time forms a fractal in space-time. Cuts of this fractal in the space direction at a given time identify the site which is ʻactiveʼ (i.e. chosen as the minimal site) at that time. Cuts in the time direction produce a fractal time series.

Cluster Formation in the ISAAC Model
A force which is spread uniformly across the battlefield will have a fractal dimension of D = 2. Conversely, an extremely concentrated force will tend towards D = 0. All other cases fall between these limits. In a series of experiments carried out at the British Army Training Unit Suffield (BATUS), troops were issued with GPS units allowing their location to be known as the battle evolved over time. Readings taken at a particular point during the trial indicated a fractal distribution with a fractal dimension of 0.8. Readings over a 2 hour period of battle indicated a scatter of fractal dimension in the range 0.8 to 1.1. The pattern of fractal dimension is a general increase, followed by a levelling off. There is some instability towards the end, possibly due to the reduction in active units. A number of UK BATUS data sets have been analyzed in the same way and show similar
characteristics. These results lie within the expected range of ISAAC standard cases, for which the mean fractal dimensions calculated were in the range 0.9-1.7:

➢ ʻDispersedʼ: 0.9
➢ ʻDynamicʼ: 1.0
➢ ʻFluidʼ: 1.1
➢ ʻClassic Frontsʼ: 1.7
A regression plot of D versus the exponent q(D) gave a slope of 0.55, which agrees with the analysis in Chapter 3. We have noted already that the set of all active sites (corresponding to our clustering force elements) is a fractal. It is shown in [14] that the fractal dimension of this set is given by the expression

D(τ − 1)

where D is the cluster fractal dimension and τ is a measure of the cluster size distribution. For the Bak-Sneppen evolution model, we have D = 2.92 and τ = 1.245 [14] (Table II, 2-dimensional case). This gives a fractal dimension for the active sites of 0.72. In competitive markets, this fractal dimension reflects the clustering of trades and traders on the trading floor, where informal information links between traders form such clusters. I discussed the swarming dynamics of bubble/cluster formation and dissolution in ISAAC in detail in [complex bk], by applying the disjoint-set based Hoshen-Kopelman algorithm [ref] over time to the grid of cellular automata. Each cell of this grid is given a label: empty; isolated occupant; or member of a cluster of size x. At a given time t each cell label is updated from time t-1. This independent analysis illustrates the fractal dynamics of bubble/cluster formation.
The Rodgers-DʼHulst Mechanism: Coalescence and Fragmentation
In another line of attack on herding mechanisms, I worked with Professor Rodgers of Brunel University, author of the Rodgers-DʼHulst mechanism of coalescence and fragmentation. We jointly considered its analysis [15] as part of a categorization of informal complex networks as trophic levels at the linked (due to Barabasi), synched (due to Strogatz) and cliqued (due to Moffat) levels. We consider the conflict arena, and define the attack strength of a given terrorist or insurgent attack unit as the minimum number of people typically injured or killed as the result of an event involving this attack unit. The sum of the attack strengths over all of the attack units is assumed to be a constant N, which is a representation of the total attack capability. The questions to be addressed are:

➢ How is this capability partitioned? and
➢ How will this partitioning change over time?

At one extreme there could be N units, each with an attack strength of 1. At the other extreme there could be one completely coalesced attack unit with attack strength N. The real distribution will lie between these. The Rodgers-DʼHulst coalescence and fragmentation mechanism assumes that these attack strengths adapt in the following way [15]. Consider an arbitrary attack unit. We define the following extremal model. At a given time, assume that this attack unit decides, as a group, EITHER to split itself up in order to mislead the enemy OR to coalesce (i.e. combine) with another attack unit, forming a single attack unit whose attack strength is the sum of the individual attack strengths. This mimics two insurgent groups deciding via radio communication or locally networked messaging to meet up and join forces in order to carry out a larger scale attack. To implement this fragmentation/coalescence process, at a given timestep we choose an attack unit at random, but with a probability which is proportional to its attack strength.
With a fixed probability p, this attack unit fragments into a number of
attack units with attack strength 1. Attack units with higher attack strength will either run across the enemy more often and/or be more actively sought by the enemy. Otherwise, with probability 1-p, the chosen attack unit coalesces with another attack unit, which is chosen at random but again with a probability proportional to its attack strength. The two attack units then combine. The justification for choosing attack units for coalescence with a probability proportional to their attack strength is as follows. It is risky to combine attack units, since it must involve significant message passing between the two units in order to coordinate their actions. Hence it becomes increasingly less worthwhile to combine attack units as the attack units get smaller. It turns out that infrequent fragmentation (a low value of p) is sufficient to yield a steady state process in which the number of attack units with attack strength s, plotted against the value s, gives a power-law distribution with exponent α = 2.5. If we assume that any particular attack unit could be involved in an event in a given time interval with a probability which is independent of its attack strength [16], then we have, given our definition of attack strength:

P(attack severity S) = P(attack) P(attack strength S)

Thus log P(attack severity S) = log P(attack) + log P(attack strength S) = k − 2.5 log S. This definition of attack severity, as greater than a given level of effect, reflects the uncertainty of outcome of any given incident. [16] also indicates that this exponent value is a universal attractor over time, in the sense that the distribution of terrorist attack severity, defined as above, converges to a power law with exponent α = 2.5. These processes of fragmentation and coalescence are mechanisms whereby insurgent units can adapt their interactions over time in response to the variety of tasks which they wish to carry out.
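The mechanism just described can be sketched as a short simulation. This is my own illustration, not the published model code; the total capability N, the fragmentation probability p and the step count are arbitrary demonstration values.

```python
import random

# An illustrative sketch of the Rodgers-D'Hulst coalescence and
# fragmentation mechanism. Total attack capability is conserved. Each
# step picks a unit with probability proportional to its strength; with
# probability p it shatters into strength-1 units, otherwise it
# coalesces with a second, also strength-proportionally chosen, unit.
def rodgers_dhulst(n_total, p, steps, rng):
    units = [1] * n_total
    for _ in range(steps):
        i = rng.choices(range(len(units)), weights=units)[0]
        if rng.random() < p:
            # Fragmentation: the unit breaks into singletons.
            units.extend([1] * (units[i] - 1))
            units[i] = 1
        elif len(units) > 1:
            # Coalescence: pick a second, distinct unit by strength.
            j = i
            while j == i:
                j = rng.choices(range(len(units)), weights=units)[0]
            a = units.pop(max(i, j))
            b = units.pop(min(i, j))
            units.append(a + b)
    return units

units = rodgers_dhulst(n_total=500, p=0.05, steps=10000,
                       rng=random.Random(7))
print(len(units), max(units))  # fewer, larger units; sum(units) stays 500
```

Counting how many units have each strength s in the steady state, over many runs, and fitting a log-log slope would recover the quoted exponent of approximately 2.5.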
In joint work with Brunel University [15] we used an extremal modeling approach to consider the ability to reconfigure information sharing between units. Each agent in our model has a bivariate input-output task vector (m, s) defined by:
• A complete set of information categories m;
• The subset s required to complete a particular knowledge based task.
Given two units which have different tasks, and hence different task vectors to be completed, we assume that the more similar their tasks, the more likely they are to form a link and share information. To reflect this, we define the link strength between the two actors as proportional to the inner product of their task vectors. We assume that information tends to be shared with actors with similar tasks (and hence similar task vectors). It follows that informal links to allow this information sharing should occur preferentially between such actors. As these links accumulate across the network, they will form clusters. A cluster is a set of linked units who are connected by unbroken paths of one or more links and can thus share information across the whole cluster [17]. At each timestep, our model chooses whether to create a new information sharing link or to destroy a cluster of existing links. The precise process in the model is based on the Rodgers-DʼHulst mechanism and is described below.

Link Creation
With probability p, the model attempts to create a new link. Two units are chosen at random and their link strength f (i.e. the inner product of their task vectors) is calculated. This link strength takes a value between 0 and 1 when suitably normalized. It is then used as the probability that an informal link is formed between the units.
Link and Cluster Destruction
With probability (1-p), the model attempts to destroy an existing cluster of links. In our extremal model of the process, denoted Model 0, we assume that a link is chosen at random. This link will have a link strength, denoted f. With probability (1 − f), all links in the cluster containing this link are destroyed.
Results for Model 0
We developed a network based cellular automata model where every agentʼs data vector has length 6, and three of the entries have to be filled to complete a task (thus m = 6, s = 3). In this (6, 3) model the link strengths (the inner products of the various task vectors) can only take the values 1/3, 2/3 and 1. The proportions of these link strengths in the model as the cluster size varies are shown in Figure 4 - 1.
This shows that these proportions are a global invariant.
Figure 4 - 1: Plot shows the proportion of each link strength in clusters of size x, for 0 < x < 400.
In the case we have considered here, the three possible link strengths 1/3, 2/3 and 1 have the proportions 0.3, 0.6 and 0.1 respectively in a given cluster of agents, as shown in Figure 4 - 1. If we increase the number of possible tasks, we will have a larger number of possible link strengths, and we can look at this more refined distribution. We thus analyzed agents with a (32, 16) input-output task vector in order to allow a more general bell-shaped distribution of potential link strengths. The results remain consistent.
If we now consider the distribution of cluster sizes across the network, we wish to test the key theoretical prediction that this should approximate to a power-law distribution with an exponent of 2.5 (as expected from the Rodgers-DʼHulst mechanism). Figure 4 - 2 shows the distribution of cluster sizes for a model with a (6, 3) task vector and with a probability of exploration p = 0.5. The first plot (open diamonds) corresponds to a network of 2^16 agents; the middle plot (crosses) to a network of 2^18; and the right hand plot (open squares) to a network of 2^20. These very large numbers of interacting agents reveal the underlying power-law behavior as the limiting case.
Figure 4 – 2; The distribution of cluster sizes approximates to a power-law with an exponent of 2.5 (straight line) and the approximation improves as the network increases in size.
Further investigation of the nature of these clusters of interlinked agents indicates that they tend to be very ʻramifiedʼ and ʻtree-likeʼ, with very few loops or cycles. The emergent informal networks thus tend to have low clustering coefficients and a high average path length (the opposite of the properties of a small world network). They recall the high entropy filaments which are a consequence of Liouvilleʼs theorem applied to phase space. This structure would be highly adaptable, but also highly vulnerable to attack on its information links.
Small Numbers Analysis of a Quasi-Chaotic System
The type of analysis described above, exploring the ʻthermodynamic limitʼ, works well when provided with large volumes of raw data. For the relatively small numbers of agents that usually cannot provide statistical data of acceptable quality, we need another approach. In collaboration with Imperial College [18] we developed a bespoke clustering algorithm that is particularly well suited to conflict scenarios with a relatively small number of agents, such as those typically found in covert insertion operations (one solution of the folded surface of options examined in Chapter 2). Our aim in this example is to detect this infiltration process, and the analysis proceeds as follows, for N agents with N ʻsmallʼ.

➢ Consider the agent system as a fully connected bi-directional weighted graph with N nodes and M = N(N-1)/2 edges. The nodes represent the agents and the weights on the edges are a measure of how well the agents are related.
  o To analyse static spatial clustering, edge weights are defined to be the geometric distances between agents.
  o To analyse clustering corresponding to movement, edge weights are defined to be the angular differences between the directions of agent movement.
➢ Build a Minimum Spanning Tree (MST) for this graph. Only a few of the M edges will be part of the MST.
➢ Compute a density approximation of the distribution of MST edge weights using a Gaussian-Parzen window estimation.
➢ Find the maximum peak in the density estimation and compute a ʻhalf-powerʼ cut-off edge threshold (located to the right of the peak) at which the density drops to half of the peak value.
➢ Remove the edges in the MST that have values above the cut-off threshold. The end result is a ʻforestʼ of sub-trees of the MST. Each tree in this forest represents a separate clustered group of agents.
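The pipeline above can be sketched end to end on synthetic spatial data. This is a hedged illustration of mine, not the Imperial College code: function names, the bandwidth and the test geometry are all my own choices, with MST construction by Prim's algorithm and a simple grid-based Parzen estimate.

```python
import math
import random

def mst_edges(points):
    # Prim's algorithm over the complete graph of pairwise distances.
    n = len(points)
    def dist(a, b):
        return math.dist(points[a], points[b])
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        a, b = min(((i, j) for i in in_tree for j in range(n)
                    if j not in in_tree), key=lambda e: dist(*e))
        in_tree.add(b)
        edges.append((a, b, dist(a, b)))
    return edges

def count_groups(points, bandwidth=0.5):
    edges = mst_edges(points)
    weights = [w for _, _, w in edges]
    def density(x):  # Gaussian (Parzen) window estimate of edge weights
        return sum(math.exp(-0.5 * ((x - w) / bandwidth) ** 2)
                   for w in weights)
    grid = [max(weights) * i / 200 for i in range(201)]
    dens = [density(x) for x in grid]
    peak = max(range(201), key=dens.__getitem__)
    # 'Half-power' cut-off: first point right of the peak at which the
    # density drops to half of the peak value.
    cut = next((grid[i] for i in range(peak, 201)
                if dens[i] <= dens[peak] / 2), max(weights))
    # Remove edges above the cut-off; components of the forest = groups.
    parent = list(range(len(points)))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for a, b, w in edges:
        if w <= cut:
            parent[find(a)] = find(b)
    return len({find(i) for i in range(len(points))})

rng = random.Random(0)
def blob(cx, cy):  # ten agents scattered around a centre point
    return [(cx + rng.gauss(0, 0.2), cy + rng.gauss(0, 0.2))
            for _ in range(10)]

n_groups = count_groups(blob(0, 0) + blob(10, 10))
print(n_groups)  # two well-separated groups are detected
```

With two well-separated blobs, the single long inter-blob MST edge lies far beyond the half-power threshold set by the many short intra-blob edges, so cutting it leaves two trees, i.e. two detected groups.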
In our example there are only 15 agents interacting, implying that the system may be chaotic rather than complex. Each agent is given an 8-neuron recurrent neural network brain, leading to a rich repertoire of non-linear local choices.
We complete Chapter 4 by discussing in some detail the rich emergent properties obtainable from this quasi-chaotic system, and how we analysed them. This section owes much to the joint contributions of Nick Walmsley of Dstl and Professor Erol Gelenbe of Imperial College.
First, the investigation focussed upon the spatial behavior of agents. To carry this out, simulation runs both within the normal terrain and in an obstacle-free environment were assessed. Using Figures 4 - 3 and 4 - 4 for comparison, showing snapshots of simulation runs at successive times within the simulations, the effect of ʻsyntheticʼ terrain may be examined. The agent trajectories are shown as coloured trails. These provide information on the evolution of the spatial configuration of the different teams throughout the simulation run.
Figure 4-3: Evolution of the spatial configuration of the agents with respect to time in a terrain with obstacles (green markings). Panels a)-d) show snapshots at steps 0, 300, 600 and 900.
Figure 4-4: Evolution of the spatial configuration of the agents without terrain. Panels a)-d) show snapshots at steps 0, 300, 600 and 900.
Terrain has the effect of reducing the ability of agents to act as a team. Essentially, this is due to the addition of repulsive forces experienced by the agents as they come within close proximity of the ʻterrain objectsʼ.
Besides spatial configuration, instantaneous agent behavior was also measured as a function of time. Within the setting of these simulations, instantaneous agent behavior refers to the direction in which an agent moves (its velocity). Comparing the individual cases of Figure 4-5, we see that terrain has the effect of introducing ʻnoiseʼ to the system, thus weakening group cohesiveness. The significant effect of the introduction of weapons can be seen in the latter stages of the simulation: more stability in group cohesiveness is evidenced in those scenarios with weapons deployed. Figure 4-6 illustrates detection schemata for fragmentation of a group into smaller groups or combination of two or more groups into one.
Figure 4-5: Spatial and behavioral clustering - detection of agent divergence into chaos.
Figure 4-6: Detection of group fragmentation and combination.
Figure 4-6 illustrates how the fragmentation or combination of groups can be detected by exploiting these techniques. The top inset shows the distribution of edge metrics, and the lower inset the evolution of the MST. Different edge metrics, such as the agent-to-agent divergence illustrated in Figure 4-5, can be used either separately or in combination to tease out such patterns of meaning. In summary, these techniques allow us to turn quasi-chaotic behavior into patterns of meaning.
References
1. Per Bak (2002). Informal Seminar, Imperial College, UK.
2. Investopedia (accessed Dec 2017). https://www.investopedia.com/terms/h/herdinstinct.asp#ixzz50Pikq7pF
3. Epstein J, Axtell R (1997). ʻSugarscapeʼ. Cambridge, MA: MIT Press.
4. Ilachinski A (2004). ʻArtificial War: Multiagent-Based Simulation of Combatʼ. Singapore: World Scientific.
5. Alberts D, Huber R, Moffat J (2010). ʻNATO NEC C2 Maturity Modelʼ. Washington DC, USA: US DoD CCRP.
6. Moffat J (2003). ʻComplexity Theory and Network Centric Warfareʼ. Washington DC, USA: US DoD CCRP. Now available on Kindle.
7. Moffat J, Scales T et al (2011). ʻQuantifying the Need for Force Agilityʼ. International C2 Journal, Vol 5, No 1.
8. Moffat J (2006). ʻMathematical Modeling of Information Age Conflictʼ. Journal of Applied Mathematics and Decision Sciences, Vol 2006, Article ID 16018.
9. Roberts D, Turcotte D (1998). ʻFractality and Self-Organised Criticality of Warsʼ. Fractals, Vol 6, No 4.
10. Richardson L (1960). ʻStatistics of Deadly Quarrelsʼ. Pittsburgh: The Boxwood Press.
11. Hartley D S (1991). ʻConfirming the Lanchesterian Linear-Logarithmic Model of Attritionʼ. Martin Marietta Centre for Modeling, Simulation and Gaming, Report K/DSRD-263/R1.
12. Carr J, Pego R (2000). ʻSelf-Similarity in a Cut and Paste Model of Coarseningʼ. Proc R Soc Lond A, Vol 456.
13. Paczuski M, Maslov S, Bak P (1994). ʻField Theory for a Model of Self-Organised Criticalityʼ. Europhys Lett, Vol 27, No 2.
14. Paczuski M, Maslov S, Bak P (1996). ʻAvalanche Dynamics in Evolution, Growth and Depinning Modelsʼ. Phys Rev E, Vol 53, No 1.
15. Moffat J (2014). ʻComplex Adaptive Information Networks for Defence: Networks for Self-Synchronizationʼ, in ʻNetwork Topology in Command and Control: Organization, Operation, and Evolutionʼ, eds T Grant, R Janssen, H Monsuur. IGI Global, USA.
16. Johnson N, Spagat M et al (2005). ʻFrom Old Wars to New Wars and Global Terrorismʼ. arXiv:physics/0506213 (accessed Dec 2017).
17. Perry W, Moffat J (2004). ʻInformation Sharing Among Military Headquarters: The Effects on Decision-makingʼ. Santa Monica, California, USA: The RAND Corporation.
18. Gelenbe E, Walmsley N et al (2006). ʻA Dynamic Model for Identifying Enemy Collective Behaviorʼ. 11th International Command and Control Research and Technology Symposium, Cambridge, UK.
Modeling Conflict and Competition far from Equilibrium. Chapter 5; Volatile Information Flows and Non-Linear Decisions.
Professor James Moffat

Learning; Network Evolution by Budding
Our network tree representation of human reasoning can evolve and learn, developing ʻbud formsʼ. The bud is a subtree rooted at a new connector node X and attached to the pre-existing proposition U; influence propagates through the new nodes X, Y and Z, as shown in Figure 5-1.
Figure 5-1: The bud rooted at X, with parent node U and child nodes Y and Z. Inference flows down the bud to the tip via the messages $\pi_x(u)$, $\pi_y(x)$ and $\pi_z(x)$; information flows up via the likelihood messages $\lambda_y(x)$ and $\lambda_z(x)$.
Bud Formation; The Network Explores and Learns.
Proposition U represents the effect of the pre-existing network on the bud form. This means that all the influence of the network on the bud is reflected in the local influence between X and its nearest neighbors in the network.
• We call this the locality assumption. It is equivalent to the assumption, for static inference networks, of conditional independence [1].
• Pruning the tree: by carrying out measurements external to the system S, we can gain additional information, allowing the replacement of a bud by a number. This terminates further budding on this branch of the tree.
• The budding and pruning mechanisms, together with the locality assumption, give our extremal model of reasoning the characteristics of a cellular automaton.
For the bud structure shown at Figure 5-1 the node U corresponds to a higher-level proposition, X is a connector node, and Y and Z are either lower-level propositions or terminal data values blocking further budding. We assume that further budding can continue in this case, so that Y and Z are lower-level propositions. The notation of Figure 5-1 indicates that evidence $e_y^-$ is available at the node Y and evidence $e_z^-$ is available at the node Z. There is also evidence $e_x^+$ available from the rest of the network, which influences the proposition U. We now fuse these various (independent) pieces of data together and update our belief in the connector node X. The total evidence available at X from Y and Z is defined as [1]:

$$e_x^- = e_y^- \cup e_z^-$$

For the likelihood $\lambda$ of observing this data, given the proposition X, we have:

$$\lambda(x) = P(e_x^- \mid x) = P(e_y^-, e_z^- \mid x) = P(e_y^- \mid x)\,P(e_z^- \mid x) = \lambda_y(x)\,\lambda_z(x)$$
The updated degree of belief, $\pi$, in proposition X similarly satisfies:

$$\pi(x) = P(x \mid e_x^+) = \sum_u P(x \mid e_x^+, u)\,P(u \mid e_x^+) = \sum_u P(x \mid u)\,\pi_x(u) = M_{x|u} \cdot \pi_x(u)$$

where $\pi_x(u)$ is defined as $P(u \mid e_x^+)$; we have exploited the locality assumption for X, and the sum is over the different possible values u of the proposition U. We can express the current belief in proposition X as:

$$\mathrm{BEL}(x) = P(x \mid e_x^+, e_x^-) = \alpha P(e_x^- \mid e_x^+, x)\,P(x \mid e_x^+) = \alpha P(e_x^- \mid x)\,P(x \mid e_x^+) = \alpha\,\lambda(x)\,\pi(x) = \alpha\,\lambda_y(x)\,\lambda_z(x)\,M_{x|u} \cdot \pi_x(u)$$
We can consider this to be the total strength of belief due to information derived from the pre-existing tree and its new bud.
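As a toy numerical illustration of this fusion, the sketch below implements the $\lambda$/$\pi$ update at a connector node for binary propositions. All the probability values are invented for illustration, and the helper name `bud_belief` is ours, not from the text.

```python
import numpy as np

def bud_belief(P_x_given_u, pi_u, lam_y, lam_z):
    """Fuse parent and child evidence at connector node X (binary case).

    P_x_given_u : matrix M_{x|u}, entry [u, x] = P(x | u)
    pi_u        : prior message pi_x(u) = P(u | e_x+)
    lam_y/lam_z : likelihood messages lambda_y(x), lambda_z(x)
    """
    pi = P_x_given_u.T @ pi_u   # pi(x) = sum_u P(x|u) pi_x(u)
    lam = lam_y * lam_z         # lambda(x) = lambda_y(x) lambda_z(x)
    bel = lam * pi
    return bel / bel.sum()      # alpha normalises BEL(x) to sum to 1

# Illustrative numbers (assumed, not from the text)
P_x_given_u = np.array([[0.9, 0.1],   # P(x | u = true)
                        [0.3, 0.7]])  # P(x | u = false)
pi_u = np.array([0.6, 0.4])
lam_y = np.array([0.8, 0.2])
lam_z = np.array([0.7, 0.4])
bel = bud_belief(P_x_given_u, pi_u, lam_y, lam_z)
```

With these numbers the child evidence strongly favours X being true, and the normalised belief concentrates accordingly.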
Part 2: Multi-Hypothesis Tracking as a Representation of Human Reasoning
Let $\{y_t\}$ be an n-vector of observations at discrete times t(1), t(2), ... of some underlying system S. For example, S could be a portfolio of n different share acquisitions, and the vector would then be the current quoted price of each share type in the portfolio. Tracking these over time as a vector, rather than individually, gives us the opportunity to examine the covariance and correlation between prices, in addition to the total value. Assuming normal errors, the determinant of the covariance matrix is a measure of the entropy of the distribution. Applying the Kullback-Leibler information measure [2] shows that tracking the full vector rather than its individual components can add a significant amount of additional information, thereby increasing the accuracy of forecasting future pricing. So far, we have considered just one such future trajectory of the system (e.g. business as usual). However, we desire the ability to take account of sudden changes in market assumptions due to herding and the creation of bubble markets, as we discussed earlier. We adopt for this construction the general approach of Bayesian forecasting as in [3], with the aim of building a set of models based on Kalman Filter theory, each corresponding to a different set of market assumptions. This is similar to multiple hypothesis target tracking, and I call this approach a Multiple Hypothesis Agile Tracker (M-HAT). Agility in this example is the ability to switch appropriately among future hypotheses H(1), H(2), H(3) or H(4), where these are now defined as:
• H(1) ʻsteadyʼ: prices fluctuate around a stable mean;
• H(2) ʻblipʼ: price fluctuates, and a trend appears to begin, but it is a random event which can be ignored for two updates either side of a possible turning point;
• H(3) ʻrapid changeʼ: mean prices trending, increasing or decreasing rapidly; formation or deflation of a bubble market due to herding. Equivalent to a fractal signal of change in other price trackers;
• H(4) ʻexplosive changeʼ: significant correction corresponding to a step change in value when markets open.
The Multiple Hypothesis Agile Tracker (M-HAT)
We start with the vector of observations $y_t$ at time t and the state of the system described by the system vector $\theta_t$. We assume these are linked by an adaptive locally linear model:

$$y_t = F_t \theta_t + v_t$$

where $v_t$ is a random error vector drawn from the multivariate normal distribution $N(0, V_t)$ and $F_t$ is a matrix of known independent functions $f(i,j)$ of time. Similarly, the system equation updates the value of the state vector $\theta$ at time t−1 to the value at the next timestep t by an adaptive locally linear model:

$$\theta_t = G_t \theta_{t-1} + w_t$$

where $G_t$ is a matrix of known functions $g(i,j)$ of time, and $w_t$ is an error vector drawn at random from a multivariate normal distribution $N(0, W_t)$. By recursive use of the system equation, it is clear that for any time t the distribution of $\theta_t \mid D_t$, i.e. given the information available at time t, is multivariate normal, provided that the initial assumed distribution of $\theta_0$ is multivariate normal. Define the conditional distribution of $\theta$ as a multivariate normal distribution with mean $M_{t-1}$ and covariance matrix $C_{t-1}$. These represent best estimates given the information available at time t−1. We now update these with the latest information vector; we can write this in terms of conditional distributions as:

$$(\theta_{t-1} \mid D_{t-1}) \approx N(M_{t-1}, C_{t-1}), \qquad (\theta_t \mid D_t) \approx N(M_t, C_t)$$

We now consider the joint distribution of the vector $\binom{\theta_t}{y_t}$ as assessed with the information available at time t−1, with marginal distributions $(\theta_t \mid D_{t-1})$ and $(y_t \mid D_{t-1})$. From the system equation, since the expectation E is linear, we have:

$$E(\theta_t) = G\,E(\theta_{t-1}) + E(w_t) = G M_{t-1}$$

and from the observation equation we have:

$$E(y_t) = F_t\,E(\theta_t) + E(v_t) = F_t G M_{t-1} = \hat{y}_t$$
which is the expectation of the one-step-ahead forecast of $y_t$ for a single filter process. Given the actual value of the vector of observations $y_t$, the updated estimate of the system state, given by the distribution of $(\theta_t \mid y_t)$, is a conditional distribution corresponding to a slice through the joint distribution of the vector $\binom{\theta_t}{y_t}$. The position of this slice is determined by the collapse of the one-step-ahead forecast to its actual value $y_t$ at time t [4]. The recursive form of the filter then follows as an expression of the recursive mapping:

$$F_j(t-1, t): \;(\theta_{t-1} \mid D_{t-1}) \approx N(M_{t-1}, C_{t-1}) \;\rightarrow\; (\theta_t \mid D_t) \approx N(M_t, C_t)$$

We can combine these using the ʻmixture modelʼ schema developed in [3], which updates both the probability of the model and the data. We now describe all of this in more detail by an extended example taken from a real conflict application. Inevitably some details have been omitted.
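One recursion of this scheme can be sketched as follows. This is a schematic single step, not the full M-HAT of [3]: the function name `mhat_step`, the per-hypothesis parameterisation, and the use of the one-step-ahead predictive density to reweight hypothesis probabilities are our assumptions for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mhat_step(models, probs, M, C, y):
    """Run each hypothesis' Kalman recursion on observation y and reweight
    the hypothesis probabilities by their one-step-ahead predictive
    likelihoods (a sketch of the mixture-model schema).

    models : list of (F, G, V, W) matrices, one tuple per hypothesis
    probs  : prior hypothesis probabilities
    M, C   : posterior mean / covariance of theta at time t-1
    """
    posteriors, likes = [], []
    for F, G, V, W in models:
        m_pred = G @ M                       # E(theta_t) = G M_{t-1}
        C_pred = G @ C @ G.T + W
        y_pred = F @ m_pred                  # one-step-ahead forecast
        S = F @ C_pred @ F.T + V             # forecast covariance
        K = C_pred @ F.T @ np.linalg.inv(S)  # Kalman gain
        m_new = m_pred + K @ (y - y_pred)
        C_new = C_pred - K @ F @ C_pred
        posteriors.append((m_new, C_new))
        likes.append(multivariate_normal.pdf(y, mean=y_pred, cov=S))
    w = probs * np.array(likes)
    return posteriors, w / w.sum()
```

For a scalar price series, a ʻsteadyʼ hypothesis (G = 1) and a ʻrapid changeʼ hypothesis (G > 1, an assumed trend multiplier) compete: an observation consistent with the trend shifts the probability mass towards the rapid-change model.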
Example Development of a Multimodel of 4-Hypothesis Agile Trackers (4-HAT).
I developed deliberate planning and rapid planning, see for example [5-8], in order to capture in mathematical form the global and local processes of decision making in conflict. These are available as e-books on Amazon Kindle™. For free access to the full texts of related peer-reviewed papers, visit Researchgate.net [9]. Deliberate planning corresponds to a longer time horizon and is represented in our current models as alternative strings of available missions, each with entry conditions which must be satisfied [7]. Rapid planning is most suitable for capturing decisions taken by experts in stressful and rapidly changing circumstances. It is represented as a form of pattern matching, where the current perceived situation is compared with a set of mentally stored reference patterns, accumulated through experience and training. This is a satisficing method in which the patterns are directly linked to choices by exploiting the mathematical properties of the 4-HAT.
Our Theory of How Experts Make Local Choices
Our mathematical theory of local choice by experts is the Rapid Planner. It is based upon the Recognition Primed Decision-Making (RPDM) model of [13], which views decision-making under pressure as a process closer to recognition than to a rational analysis of alternatives. The decision-maker, under this view, has several potential courses of action, each of which will be appropriate in one or another type of situation. When making the decision, the decision-maker compares his current situation, or more properly his perception of that situation, with his library of possible archetypal situations, and selects the one which seems the closest match to his current problem. The advantage of this method is that it can be very fast and can combine both conscious and unconscious perception (the latter is what we often term intuition). This theory makes a number of assumptions which we have tested and found to be true. We now describe the story of that process.
Experimental Building Blocks 1 and 2
I was fortunate to work directly with UK military officers, who as a group are highly professional and reckoned to be among the best in the world. We were thus able to carry out a set of experiments called Building Blocks 1 and 2 with a number of carefully selected UK officers of the rank of Major or Lt Col, each with appropriate experience, including tactical decision-making in conflict.
Building Block 1
We investigated the choice process initially using a series of decision games. The players were young UK army officers of around Major rank with recent conflict experience. In the first set, all players were expert in the protocols concerned with defence against a chemical or biological ʻWMDʼ attack. Each game was played at Battlegroup command level, and began with a map and an initial situation briefing, with additional information being presented on a sequence of cards, based on earlier developments in [11]. This method generates large quantities of data, with each information update (presentation of a card) being a data point. The volume of data allows the use of multivariate probit or logit regression models to analyse the results. These data-hungry statistical models had a high degree of success in predicting the observed choice behavior, as discussed in detail in [10].
The first game involved a single ʻkeyʼ decision based on mounting evidence of a potential WMD attack using biological weapons. The experiment was conducted in a similar format to previous single decision games such as those described in [11]. For example, players operated in pairs, with the pairs being permitted to discuss any proposed course of action and being required to come to a joint decision. Each player, we emphasize, as well as being a highly professional UK army officer of around Major rank, was an expert in WMD defence. At each turn of the game, an update card was presented which gave a single piece of information of a ʻtypeʼ or ʻcategoryʼ. The cards were presented in order, one at a time, with the next card being presented when the players requested it, forming a ʻserialʼ. Each serial thus represented a fixed sequence of cards corresponding to an unfolding scenario with accumulating information. During development of these serials, prior to the set of games, the cards were initially randomly shuffled, and a sequence of cards selected to form a serial. Following adjustments, checking for illogical sequences and whether the number of alarms was too high, the sequence was then ʻfrozenʼ to form a serial capturing the effect of a dynamic and unpredictable set of events occurring as the scenario emerged. Following each information update, the players had to choose from the following options:
➢ Continue current mission;
➢ Issue a precautionary alert to troops in theatre, leading to use of initial protection measures;
➢ Escalate to ʻprobable attackʼ state, including recommending the taking of medical countermeasures and reporting this up the chain of command.
Eight serials were developed in total, with all pairs of players being exposed to all serials. The allocation of the order of serials to individual pairs was decided using a Latin Square design, which ensured that each pair not only saw the serials in a different order, but also saw each serial preceded by each of the other serials, to compensate for learning effects. At any given time working through a serial, each pair will have seen a certain number of cards $n_i$ in each of a total of six information categories, $1 \le i \le 6$. One measure of the information presented is a linear combination of these, weighted to reflect the relative importance the pair of players either implicitly or explicitly put on information of a certain category, giving an expression of the form $\sum_i \beta_i n_i$ plus a constant value $\alpha$ corresponding to the situation at the start of each experiment.
There are three possible action states:
➢ j=0 (no alert issued);
➢ j=1 (precautionary alert issued);
➢ j=2 (report of probable attack issued).
The transitions 0→0, 0→1, 0→2, 1→1 and 1→2 were those allowed in this game, at the time a new information card was revealed to the pair of players. A probit analysis of the playersʼ decisions was made based on the six information categories $(n_1, n_2, \ldots, n_6)$. These represent the numbers of cards of each information category exposed to a pair of players in each experimental serial. Thus, we define (for a given pair of players):

$$\gamma_j = P(< \text{state } j) = \Phi\!\left(\alpha_{j-1} + \sum_{i=1}^{6} \beta_i n_i\right), \quad j = 1, 2$$

with $\Phi$ the cumulative distribution of the standard normal distribution N(0,1). This allows for two intercepts ($\alpha_0$ and $\alpha_1$), one for each of the two types of response, and weights ($\beta_1, \ldots, \beta_6$), calculated from multivariate probit analysis, for each of the six information categories used to classify each card. Writing $y_{j-1} = \alpha_{j-1} + \sum_{i=1}^{6} \beta_i n_i$, the probability of being at least in a certain action state is thus:

$$\Pr(Y \ge j \mid n_1, n_2, \ldots, n_6) = 1 - \Phi(y_{j-1})$$

Using this as our statistical model, the probit analysis indicates that the weights $\beta_i$ are all negative, so that the probability $1 - \Phi(y_{j-1})$, as a measure of perceived threat, is an increasing function of the number of cards played over time.
Model Prediction
Given that a certain number of cards $n_i$ of each type i have been played so far in a serial, we predict that the players will be in the various action states with the following probabilities:

$$P(Y \ge 2) = P(\text{Probable Attack}) = 1 - \Phi\!\left(\alpha_1 + \sum_{i=1}^{6} \beta_i n_i\right)$$

$$P(Y \ge 1) - P(Y \ge 2) = P(\text{Precautionary Alert}) = \Phi\!\left(\alpha_1 + \sum_{i=1}^{6} \beta_i n_i\right) - \Phi\!\left(\alpha_0 + \sum_{i=1}^{6} \beta_i n_i\right)$$

These can be compared with the same probability assessed from the population of all pairs of players for the same serial and the same time into the serial, marked by the same numbers and types of information cards played. Model prediction gives a good fit to the observed data, not only tracking the general rate of increase of the probability of issuing an alert, but also following closely the degree of upward movement for each of the individual steps as each of the information cards is played. Overall, this statistical model of the decision process [10], where the variates correspond to the six information categories, can account for 60% of the variability in P(probable attack warning issued) and 80% of the variability in P(precautionary alert or higher issued). This represents the contribution to player choice of the information presented. The residual 20%-40% of variation represents the effect of personality and other individual differences between commanders.
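Evaluating these decision probabilities is straightforward; a minimal sketch follows. The intercepts and weights below are invented for illustration (the fitted values are not reproduced in the text), although the weights are chosen negative, in line with the probit analysis reported above.

```python
import numpy as np
from scipy.stats import norm

# Illustrative intercepts and (negative) category weights -- assumed values,
# not the fitted coefficients from the probit analysis.
alpha0, alpha1 = 1.5, 2.5
beta = np.array([-0.2, -0.15, -0.3, -0.1, -0.25, -0.2])

def decision_probabilities(n):
    """Action-state probabilities after n = (n1,...,n6) cards per category."""
    z0 = alpha0 + beta @ n
    z1 = alpha1 + beta @ n
    p_attack = 1 - norm.cdf(z1)            # P(Y >= 2): probable attack
    p_alert = norm.cdf(z1) - norm.cdf(z0)  # P(Y >= 1) - P(Y >= 2)
    p_none = norm.cdf(z0)                  # P(Y < 1): no alert
    return p_none, p_alert, p_attack
```

Because the weights are negative, the probability of the higher alert states rises monotonically as cards accumulate, reproducing the qualitative behavior described above.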
Multiple Decision Games
The current approach of UK forces is that, where WMD protective measures are concerned, the local commander has the freedom to modify the level of personal protection, taking higher levels of WMD risk in exchange for benefits in terms of freedom of action to achieve their objectives and reduced degradation (e.g. reduced fatigue and/or increased speed of movement). This problem was significantly more complex than that addressed in the single decision gaming, since troops could, by order from the force commander, move up and down WMD protection states as the commanderʼs perception of the chemical and biological threat waxed and waned over the serial. This makes the problem of modeling significantly more difficult than for a single decision to enhance protection due to a cumulative threat. However, the basic message still applies: information drives choice, as we will demonstrate.
Several routes towards tactical objectives were identified, similar to those encountered during Operation Iraqi Freedom (OIF). These we labelled from A to H (with each objective having just one route associated with it). The routes were selected to give a variety of lengths, with all routes being in a general northerly or north-westerly direction. In each game, the playerʼs intent was to move along a single route towards his objective. The introductory contextual material asked the players to decide, at the beginning of the serial and after each information update, whether to:
➢ order a change in protection level, enhancing or decreasing protection of the force under command; or
➢ continue advancing on the route with no change in WMD protection state.
There are three choices of WMD protection state: DS1 (equipment carried but not put on); DS2 (suit mandated); DSR (respirator and suit mandated).
Time Pressure
Players were given only a general idea of the total amount of time available: they had no information on the number of cards in a serial or the total number of serials to be completed, only that it was reasonable to assume that a day would be sufficient to complete the available serials. Because the rate of advance during each serial was restricted by the WMD protection state, yet no additional time was available, pressure was put on those trying to complete the longer routes, thus enforcing the need to balance risk to personnel against risk to the mission. In this wider command environment, players did not require specialist WMD defence expertise. These pairs of players (18 in total) were allocated to the serials using a Graeco-Latin Square design, which permuted routes with serials. Pairs were allocated lines in the Graeco-Latin Square at random, without replacement, in order to avoid narrowing of the potential choices by the players through overt scripting of the scenario. Having moved to a non-monotonic problem, a cumulative model was no longer appropriate. We therefore developed a very simple extremal Markov-based model that took account only of the previous WMD protection state and the information category or ʻtypeʼ of the last card presented. From this simple model, the probability of being in at least WMD protection state j > 0 equals:
$$P(Y \ge j) = 1 - \Phi(y_{j-1}); \qquad y_{j-1} = \alpha_{j-1} + \beta_{prevDS}\,(prevDS) + \beta_{lastcardtype}\,(lastcardtype)$$

The variables are the previous WMD protection state (prevDS) and the information category of the last card played (lastcardtype). We can now calculate, as in the single decision game model, the following Decision Probabilities:

$$P(Y \ge 2) = P(DSR) = 1 - \Phi\!\left(\alpha_1 + \sum_i \beta_i x_i\right)$$

$$P(Y \ge 1) - P(Y \ge 2) = P(DS2) = \Phi\!\left(\alpha_1 + \sum_i \beta_i x_i\right) - \Phi\!\left(\alpha_0 + \sum_i \beta_i x_i\right)$$

$$P(Y < 1) = P(DS1) = \Phi\!\left(\alpha_0 + \sum_i \beta_i x_i\right)$$

These probabilities depend only on the previous WMD protection state and the type of the last card played, for which this process model is particularly suited [12].
Model Prediction and Validation.
Our Markov process model has three states, corresponding to the three WMD protection states. The passage of time is measured by observing each state to ascertain whether or not it is occupied at each positive integer k, building up a statistical measure over the players of the observed probability of occupation of each of the states DS1, DS2, DSR as functions of k. These are calculated using the Markov transition probabilities k → k+1 as follows:

$$P(DS1_{k+1}) = P(DS1_k)\,P(DS1_{k+1} \mid DS1_k) + P(DS2_k)\,P(DS1_{k+1} \mid DS2_k) + P(DSR_k)\,P(DS1_{k+1} \mid DSR_k)$$

$$P(DS2_{k+1}) = P(DS1_k)\,P(DS2_{k+1} \mid DS1_k) + P(DS2_k)\,P(DS2_{k+1} \mid DS2_k) + P(DSR_k)\,P(DS2_{k+1} \mid DSR_k)$$

$$P(DSR_{k+1}) = P(DS1_k)\,P(DSR_{k+1} \mid DS1_k) + P(DS2_k)\,P(DSR_{k+1} \mid DS2_k) + P(DSR_k)\,P(DSR_{k+1} \mid DSR_k)$$

The conditional probabilities involved are calculated from our earlier Decision Probabilities, given that we also know the information category of the last card presented to the players. The model is thus fully described in the equations above, which show the probabilities of being in a given WMD protection state (DS) after presentation of the k-th card. The Markov element of the model updates the probabilities of being in the three individual WMD protection states. We then estimated the performance of the model. This initial model explained about 60% of the variability in choice reflected in P(DSR). Refining the information by using a finer classification of the data into more orthogonal categories increased this to 82%. An additional validation check on this refined model of behavior was made by comparing the model generated from a part of the data with the behavior of the remaining portion of the data. This additional constraint reduced our measure of the influence of information on behavior by about 10%, to a still overwhelming 73%, indicating the key contribution of information to the decision outcome.
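The three update equations above can be written compactly as a single vector-matrix product. The transition matrix below is an assumed illustration; in the actual model each matrix would be built from the fitted Decision Probabilities, given the category of the last card played.

```python
import numpy as np

# One illustrative transition matrix, assumed to correspond to an 'alarming'
# card category. Entry [i, j] = P(state j at k+1 | state i at k), with the
# states ordered (DS1, DS2, DSR). The values are assumptions, not the fitted
# conditional probabilities.
T_alarming = np.array([[0.5, 0.4, 0.1],
                       [0.1, 0.6, 0.3],
                       [0.0, 0.2, 0.8]])

def propagate(p_k, T):
    """Update occupancy probabilities over (DS1, DS2, DSR) for one card."""
    return p_k @ T   # vector-matrix form of the three update equations

p = np.array([1.0, 0.0, 0.0])   # all players start in DS1
for _ in range(5):              # five alarming cards in a row
    p = propagate(p, T_alarming)
```

Under a run of alarming cards the occupancy probability drains from DS1 towards DSR, mirroring the behavior the fitted model tracks in the experimental data.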
If the models are extremal, then it is probably also important that they are probabilistic. The games do not produce a rule-based model of decision-making; different players did not display any great consistency in their behavior. The particular commander of a particular unit may have a tendency to be at one end of the decision spectrum or the other, and the games were able to show how such individual tendencies were distributed. The exact subjective detail of the situation and the circumstances may put the remaining 20-40% of variability unexplained by these models beyond the grasp of prediction. Because of this, such decision models should be probabilistic in nature. The Rapid Planner works by comparing a number of different decision options with an estimate of the current situation, consisting not only of an estimate of the value of different parameters of the situation, but also an associated set of estimated uncertainties. The model compares the probabilities that the true situation is that associated with each of the decision options and chooses amongst them. In terms of a decision process, we define three regions. The first defines the ʻNo Actionʼ region. The second (Region A) corresponds to the set of values $\{x_1, x_2\}$ (i.e. the number of cards of each of the two information types which have already been shown to the pair of commanders) for which the probability $1 - \Phi(\alpha_0 + \beta_1 x_1 + \beta_2 x_2)$ is $\ge 0.5$. This corresponds to the region of the decision space within which the commander would choose the course of action ʻissue precautionary alert or higherʼ. The third region (Region B) is the subset of Region A corresponding to the set of values $\{x_1, x_2\}$ for which the probability $1 - \Phi(\alpha_1 + \beta_1 x_1 + \beta_2 x_2)$ is $\ge 0.5$. This corresponds to the region within which the commander would choose the course of action ʻissue probable attack warningʼ. The cues of the decision process forming the Recognized Picture are thus the different information categories in this case, and they are weighted by the constants $\beta_i$. The regions No Action, Region A and Region B within this Recognized Picture, corresponding to the different decisions, are represented probabilistically, corresponding to fuzzy membership functions, as in the Rapid Planning process. Analysis of the multiple decision games (which allowed movement between the three states DS1, DS2 and DSR) shows that the commandersʼ decisions can be captured in a similar way, and in fact form a simple Markov process, which is also to be expected from Rapid Planning considerations (Moffat, 2002).
In each case, then, the statistical model which explains the commandersʼ decision-making process is the Rapid Planning process (to an acceptable level of approximation), and is thus a relatively simple probabilistic model, rather than one based on complex, embedded and deterministic decision rules.
Measuring Information
The interpretation of the dimensions of the decision space is an interesting question. In the usual formulation of the Rapid Planner, the axes of the decision space are real-world variables about whose value the decision-maker is uncertain. Here, there is no obvious equivalent to that process. Rather than estimating a value, players are proceeding directly to their decisions.
Treating the numbers of cards of different information categories as the axes of the decision space leaves open the question of what the cards are representing. In particular, the relative values of the information categories (i.e. the weights) are remarkably stable when the analysis is varied slightly or if only partial data sets are selected. This seems to indicate that the weights given to the different information categories are representative of some kind of intrinsic quantity, rather than merely being a statistical convenience. The validation exercise also seems to indicate that the weights assigned to the cards have some real meaning.
There is evidence from the experiments that there is a Bayesian element to the decision-making process: the initial intercepts ($\alpha_0$ and $\alpha_1$) look remarkably like Bayesian prior beliefs, the belief that an action is appropriate even if no information has yet been received. This may be due in part to the fact that the players are placed in a context within which they are expected to ʻmake a decisionʼ.
Bayesʼ theorem tells us that we can update probabilities given a prior P(A):

$$P(A \mid B) = P(A)\,\frac{P(B \mid A)}{P(B)}$$

This can be translated into an additive form using logarithms:

$$\log P(A \mid B) = \log P(A) + \log \frac{P(B \mid A)}{P(B)}$$
There is an alternative formulation of the statistical model that allows this kind of Bayesian updating [12]. The linear logistic model is based upon the logistic transform, defined as:

$$\theta = P(Y = 1 \mid x) = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)}$$

where $\theta$ is the probability of an event given a determining variable x, and $1-\theta$ is the probability of the event not taking place. This allows a linear regression similar to the probit model, but instead of using the cumulative normal $\Phi$ we use the log of the ʻodds ratioʼ:

$$P(\theta = 1) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}}, \qquad P(\theta = 0) = 1 - P(\theta = 1) = \frac{1}{1 + e^{\alpha + \beta x}}$$

Hence

$$\text{odds ratio} = \frac{P(\theta = 1)}{P(\theta = 0)} = e^{\alpha + \beta x} \quad\text{and}\quad \log(\text{odds ratio}) = \alpha + \beta x$$

This gives us regression outputs in terms of the odds ratio, and means that a change in either the input variable x, or a change in the weighting $\beta$, can be interpreted in terms of a shift in the odds ratio. More generally, x is a vector of variables with a corresponding vector of weights $\beta$, so that $\beta x = \sum_i \beta_i x_i$.
The logistic transform L(y) = exp(y) / (1 + exp(y)) is also very close numerically to the cumulative normal distribution Φ(y), once a scaling factor has been applied [12]. An alternative formulation of our decision model is thus possible in terms of likelihood ratios and odds ratios. This approach is also consistent with recent research on decision-making. We consider O(A|B), the odds ratio of A given evidence B, and Δ(B|A), the likelihood ratio of B given hypothesis A, defined as follows:

O(A | B) = P(A | B) / P(−A | B);   Δ(B | A) = P(B | A) / P(B | −A)

From Bayesʼ Theorem we have:

P(A | B) / P(−A | B) = [P(B | A) / P(B | −A)] · [P(A) / P(−A)]
And hence: O(A | B) = Δ(B | A) · O(A). Note that this odds ratio formulation is independent of P(B). This formulation, by taking logs, gives us a quantity of the kind that we are looking for:
log O( A | B) = log Δ( B | A) + log O( A).
It is also independent of the number of alternative hypotheses B, which is important when dealing with complex decision-making. Thus, if we can describe the data using a logistic model L rather than a probit model Φ, we can potentially produce a model equivalent to our probit model, with the value of a piece of new information being expressed as a log-likelihood ratio which updates the log-odds ratio.
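The closeness of the logistic and probit transforms is easy to verify numerically. The scaling factor of about 1.702 used below is the commonly quoted value; it is stated here as an assumption rather than taken from the text.

```python
import math

def logistic(y: float) -> float:
    """The logistic transform L(y) = e^y / (1 + e^y)."""
    return math.exp(y) / (1.0 + math.exp(y))

def Phi(y: float) -> float:
    """Cumulative standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

# With a scaling factor of about 1.702 the two curves agree everywhere
# to within roughly 0.01 (the factor is an assumption, not from the text).
scale = 1.702
max_gap = max(abs(logistic(scale * z) - Phi(z))
              for z in [i / 100.0 for i in range(-500, 501)])
assert max_gap < 0.01
```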
The logistic model for the multiple decision games is similar to the model used before for the probit analysis, and can be similarly stated, as follows:
P(Y ≥ j) = 1 − L(y_{j−1})

where

y_{j−1} = α_{j−1} + β_prevDS(prevDS) + β_typeoflastcard(lastcardtype)

with x = {x1, x2}, where x1 = prevDS (the previous dress state) and x2 = lastcardtype (the type of the last card played). Thus, as with the probit model, we can define the following Decision Probabilities:

P(DSR) = 1 − L(α1 + Σᵢ βᵢxᵢ)

P(DS2) = L(α1 + Σᵢ βᵢxᵢ) − L(α0 + Σᵢ βᵢxᵢ)

P(DS1) = L(α0 + Σᵢ βᵢxᵢ)

Here, L is the logistic transform L(y) = exp(y) / (1 + exp(y)).
The result of applying such a logistic model to the second set of games (with multiple decisions) is thus to use an ordinal logistic regression with a logistic transform link rather than a probit (cumulative normal) link. The model is the simplest of the information category models, with none of the categories being divided further. The variance explained by the two models (probit versus logistic) is the same to three decimal places (0.5922 vs 0.5921). Clearly the agreement between the two models is very good, as could be anticipated; the differences between the two transforms are mainly significant at the outer ends of the probability distribution, when we are dealing with rare events.
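The three Decision Probabilities can be computed directly from the two cut-points. The cut-point and weighted-input values below are illustrative assumptions, not fitted values from the games.

```python
import math

def L(y: float) -> float:
    """The logistic transform."""
    return math.exp(y) / (1.0 + math.exp(y))

def decision_probabilities(alpha0: float, alpha1: float, beta_x: float):
    """Ordinal (cumulative) logistic model with cut-points alpha0 < alpha1.
    Returns (P(DS1), P(DS2), P(DSR)) in the notation of the text."""
    p_ds1 = L(alpha0 + beta_x)
    p_ds2 = L(alpha1 + beta_x) - L(alpha0 + beta_x)
    p_dsr = 1.0 - L(alpha1 + beta_x)
    return p_ds1, p_ds2, p_dsr

# Illustrative cut-points and weighted input (assumed values).
probs = decision_probabilities(alpha0=-0.5, alpha1=1.0, beta_x=0.3)
# The three probabilities form a proper distribution by construction.
assert abs(sum(probs) - 1.0) < 1e-12
assert all(p >= 0.0 for p in probs)
```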
Results from Building Block One Experiments

Summary: Our results indicate that even when time is of the essence and human lives are on the line, for expert tactical commanders, information updates are the dominant factor in the situation assessment process, not individual commander personality, beliefs or cultural background. Our initial results support the hypothesis that this information is transformed into a desired choice as described by the mathematical process of Rapid Planning.
A number of key findings follow;
¾ From the statistical analysis of the two types of game (single decision and multiple decision) it is clear that information is a key driver of the commandersʼ decision-making process, rather than other effects such as the variation between different commanders. This decision-making can best be described by models which are probabilistic in nature rather than models which are deterministic and rule-based.
¾ The statistical models which provide a best fit to the command decision-making investigated here are entirely compatible with the Rapid Planning process.
¾ It is possible to use such card-based decision games to assess the weightings which the commanders place on the different categories of information they use in their decision-making. We have now incorporated this into an extended form of the Rapid Planning model.
¾ A statistical model based on the logistic transform rather than the probit transform can provide additional insight into the value of these weightings in terms of how they change the odds ratio of a choice.

The choice dilemma described in this extended example considers two levels of attributes. The first considers the commanderʼs local perception of the situation, leading to a locally desired choice. The second attribute measures longer-term and more global concerns related to higher-order considerations (for example, the overall aims and intent of the campaign). Our analysis, both in chapter 3 and here, has identified this dilemma mathematically with a folded cubic surface forming a subset of the solution space.

Emerging from these results is the idea that experts act like surgeons, diagnosing the situation by matching the pattern of cues and indicators, contained in presented information, to previous situations remembered from past experience and training. This recognition
process provides access to pre-learned knowledge about how "best" to behave and what to expect in such a situation. This pre-learned knowledge shapes the decision-makerʼs process of situation assessment and provides the starting point for course of action generation. In terms of the Rapid Planning process, this relates to the subjectively pre-specified areas of the Recognized Picture on which the decision-maker places emphasis. It also relates to how each of these areas is mapped onto a locally preferred course of action. The overall process is then:

1. Identify where you are in the Recognized Picture, based on tracking the values of the key information attributes over time;
2. Assess which of the pre-specified areas of this space the track pertains to (the pattern matching process);
3. Map from that identification onto a locally desirable choice;
4. Resolve the dilemma between local and more global perceptions;
5. Implement the final choice.

A set of further experiments was initiated to investigate and test this theory.
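This pattern-matching loop can be sketched minimally as follows. The pattern centres, the two-dimensional picture, and the course-of-action labels are all hypothetical illustrations; none of these values come from the experiments.

```python
import math

# Hypothetical pre-learned fixed patterns in a 2-D Recognized Picture:
# name -> (pattern centre, locally desired course of action).
PATTERNS = {
    "business_as_usual": ((0.2, 0.2), "remain in hides"),
    "enemy_build_up":    ((0.8, 0.3), "attack North/North-East"),
    "local_threat":      ((0.3, 0.8), "attack West"),
}

def match_pattern(track_point):
    """Steps 1-3: locate the current track in the picture, match the nearest
    pre-specified pattern, and map it to a locally desired action."""
    name = min(PATTERNS, key=lambda k: math.dist(track_point, PATTERNS[k][0]))
    return name, PATTERNS[name][1]

def final_choice(local_action, global_action, weight_local=0.5):
    """Steps 4-5: resolve the local/global dilemma by a simple weighting
    (a placeholder for the expected-Loss resolution developed later)."""
    return local_action if weight_local >= 0.5 else global_action

name, local_coa = match_pattern((0.75, 0.25))
assert name == "enemy_build_up"
```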
Building Block Two; The Final Choice

In Building Block One we established that objective information updating a pattern matching process is the key method used by military experts under pressure to make a locally valid choice. Next, we needed to validate, by experiment, how such experts reconcile this with higher level perceptions in the stressful and rapidly moving circumstances of tactical command. To do this we developed the RPD game, where RPD stands for Recognition Primed Decision-making. This psychological structure for rapid decision-making by experts was developed by Klein [13], who supported this early work. Such games now form a defined subset of all games known as serious games. The RPD game structure was based on command decisions at Battlegroup and Company levels set in two different conflict scenarios: warfighting and counter-terrorism. The game is composed of a sequential set of steps;
¾ Players were in tactical command and were presented with an initial operational picture and a situation briefing.
¾ They were given a short period (ten minutes) to appraise the situation.
¾ This was followed by an information update in the form of an Intelligence briefing which might (or might not) demand action.
¾ The participants were then asked to choose and write down a course of action without being given any further time for analysis. The Intelligence update was designed to give them some latitude in choosing from a wide variety of plausible courses of action. After the course of action was selected, participants were invited to record their situation assessment along with the key indicators considered relevant to their course of action choice. It was accepted that this data might reflect post-hoc rationalisation to some extent. To account for any changes in situation assessment due to the process of having to analyse and express it, the players were also offered the opportunity to record any other courses of action that they may have considered. For the warfighting game, the player was in command of a Battlegroup (BG) of three tank Companies located in hides on a large wooded ridge feature; see Figure 5 - 2. Enemy troops in armored and mechanized units could be seen travelling along roads either side of the ridge.
Figure 5 – 2; Figure oriented with North upwards. NATO standard iconography indicates Blue battlegroup under player command on Elfas ridge North West (top left) of Figure. Other Blue forces lie to the South. Situation update indicates probable enemy airborne deployment to the West of the Elfas feature. The strength of these deployments is assessed as significant.
The global mission, as part of stabilizing the region, was to delay the enemy advance, part of which was running along both flanks of the ridge. A written brief was presented that described the operational update depicted in Figure 5 – 2. The participants were asked to write down immediately their course of action. The player, as commander in the field HQ, has several possible courses of action available, for example:

• remain in hides and do nothing;
• request more information;
• attack North/North-East (hoping for surprise) against the armored enemy units;
• attack West directly against the reported deployment of airborne troops;
• maintain a South-east withdrawal route to join up with own forces.
The choices across the twenty-four participants appear to vary according to the way that they have taken account of the situation attributes and the higher-level mission orders. Some participants choose to give very little (if any) weight to the higher-level mission orders and focus on achieving effects that satisfy local utility. This local view of the mission in some cases tends to extend to the use and interpretation of the situation attributes in terms of both space and time. So, for example, the situation may be appraised purely as a snapshot in time (i.e. with little or no forward projection), so that decision outcomes are assessed only from the local viewpoint.
Table 5 – 1 shows the association of the weighted utility values and attributes with the selected course of action. Participants appeared to carry out two concurrent, inter-dependent assessment processes: threat assessment and risk assessment. For those participants who are not so concerned about the uncertainty in the situation update, the relative weightings on the local and higher mission priorities, coupled with the practical consideration of employing tanks against dismounted airborne troops, result in two very different courses of action. Some chose to attack West (employing tanks against dismounted troops and preventing link-up and closure of the gap West of Luthorst), while others chose to attack North/N-East, using tanks against tanks and adhering to Brigade orders.
Battlegroup Mission | Brigade Mission | Situation Certainty | Course of Action Selected
------------------- | --------------- | ------------------- | -------------------------
High                | Low             | Certain             | Attack West
Low                 | High            | Certain             | Attack North/N-East
High                | Low             | Uncertain           | Recce for Attack West
Medium              | Medium          | Certain             | Limited Attack West
High                | Low             | Uncertain           | Withdraw East or South
Medium              | Medium          | –                   | –
High                | Low             | Certain             | Prepare Artillery and recce
High                | Low             | Uncertain           | Recce & report to Brigade

Table 5 – 1; Warfighting RPD game results, adapted from [14]; each row is a summary of a player response. The relative importance of global versus local situation assessments confirms the dilemma hypothesis; dilemma plus uncertainty implies the choice of a cautious course of action.
Peace-Support / Counter-Terrorism Scenario

A small Blue team is escorting a convoy through winding tracks in difficult country (such as the support to Afghan forces given during Operation Enduring Freedom (OEF)). The information update to the player monitoring the situation back at TAC-HQ is that not far ahead there is a road block manned by a local terrorist group. There are several courses of action available, for example:

• order the team to negotiate their way out of the situation;
• order a withdrawal to move the civilian convoy vehicles to a safe distance;
• do nothing and hope that the unit and convoy will get through eventually;
• deploy the quick reaction force (QRF) and move artillery units to fire positions.

There are well-defined Blue rules of engagement:

• Personal and direct-fire weapons may be used to engage a positively identified threat;
• Indirect fire may be used to engage a positively identified threat.
Here the immediate potential outcomes of the mission are measured against attribute x1 and scored by Loss function U1. This evaluates features that have consequences local to the situation, such as an escalation of the immediate threat by ambush or weapon-firing, the reduced security of the civilians in the convoy and the likelihood of kidnap, theft of supplies, etc. The second attribute x2 is scored by Loss function U2, evaluating more global issues concerning, for example, the integrity of the NATO campaign and political perceptions of NATOʼs ability to show resolve while adhering to the rules of engagement. The nature of peace-support operations generally means that the command structure tends to be flatter, with a less explicit hierarchy of mission orders (in contrast to the war-fighting scenario). Therefore, we would expect the course of action selection to be driven more by the situation attributes than by the weighting of mission priorities. The expected Loss/Utility model is then assumed to be of the general form;
α(γ)·U1 + β(γ)·U2

where:

α(γ) and β(γ) are the subjective priority weightings of local (resp. global) effects;
U1 is the utility of the decision with respect to local outcomes;
U2 is the utility of the decision with respect to more strategic consequences of the decision;
γ represents the general level of the commanderʼs uncertainty in the situation.

Participantsʼ course of action choices, d, can be set against a notional decision scale (reflecting the degree of overt force deployed) that ranges from "deploy the QRF with all available support (such as artillery and helicopters)" to "negotiate with IVCP troops and do NOT deploy QRF". Several participants chose to "find out more" by sending in reconnaissance assets. This supports the fundamental basis for the non-linear Loss approach that there are two major control dynamics:

¾ the dynamics of the actual situation; in particular whether or not the situation is close to, or approaching, a critical condition that demands corrective action;
¾ the associated probability dynamic (situation uncertainty), strongly inter-related with consequential Loss.
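The weighted Loss/Utility form above can be sketched as follows. The monotone shapes chosen for the weighting functions α(γ) and β(γ) are illustrative assumptions, not the forms fitted in the text.

```python
def expected_score(u_local: float, u_global: float, gamma: float) -> float:
    """alpha(gamma)*U1 + beta(gamma)*U2, with uncertainty gamma shifting
    weight from local to global considerations (an assumed monotone form)."""
    alpha = 1.0 - gamma   # weight on local outcomes falls with uncertainty
    beta = gamma          # weight on global/strategic outcomes rises
    return alpha * u_local + beta * u_global

# When local utility exceeds global utility, low uncertainty (more weight
# on the local view) scores higher than high uncertainty.
low = expected_score(u_local=0.9, u_global=0.4, gamma=0.1)
high = expected_score(u_local=0.9, u_global=0.4, gamma=0.9)
assert low > high
```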
Figure 5 – 3 schematically represents the participantsʼ situation assessments plotted in the two-dimensional space representing the distilled form of the Recognized Picture. Each letter corresponds to a playerʼs assessment in the experiment. The position of the letter, and the associated arrow, indicate their stated position (abstracted to be in terms of the two-dimensional representation) of their situation assessment, and the arrows attempt to show how they anticipate that the situation will develop. Overlaid onto this situation assessment plot is the grouping of their Course of Action choices. This representation goes some way towards an initial validation of the Rapid Planning process.
Rapid Planning

The Building Block 1 and 2 experimental results validate the following mathematical model of rapid expert choice under stress, which I call Rapid Planning. The perceived pattern of events is defined by a number n of information attributes forming an n-dimensional space called the Recognized Picture. The expertʼs evolving situational assessment over time corresponds to a compact fuzzy subset of this space, represented by n uncorrelated fat-tailed lognormal or normal marginals trimmed at the 2-σ level. This moves across the Recognized Picture as the information flows update. If the situation is changing significantly, the type of Bayesian inference process we have chosen (the M-HAT) corresponds naturally to a number of hypotheses that the commander has about the changing nature of the battlespace, and how this links to a number of fixed patterns in his Recognized Picture corresponding to the psychological model of Klein [13]. Choice of a fixed pattern leads directly to a corresponding locally desired course of action. This is resolved with higher level, more global intelligence, minimizing expected Loss (equivalently, maximizing expected utility). Some of these ideas are unpacked a little further below.
Lognormal and Fat Tail Distributions

Although the choice of an n-vector multivariate normal random variable Z may seem restrictive, we can easily transform between this and the corresponding multivariate lognormal random variable Y = exp Z. This is a smooth bijective mapping onto the strictly positive part of Rⁿ. This allows us to model a skewed fat-tail distribution as a lognormal with high variance. In certain cases, this approach is more than just fitting a distribution to
the data. A lognormal distribution is generated by the product of independent, identically distributed random variables. If a portfolio of investment returns has this characteristic and yields returns much higher than expected, then the distribution of returns is lognormal with a high variance.
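The product-generating mechanism can be illustrated numerically: the log of a product of i.i.d. positive factors is a sum, so by the central limit theorem the product is approximately lognormal. The uniform growth factors below are an illustrative assumption, not a model of any particular portfolio.

```python
import math
import random

random.seed(42)

def product_of_returns(n: int) -> float:
    """Product of n i.i.d. uniform(0.9, 1.2) growth factors."""
    p = 1.0
    for _ in range(n):
        p *= random.uniform(0.9, 1.2)
    return p

# Sample many such products and look at the distribution of their logs.
logs = [math.log(product_of_returns(50)) for _ in range(2000)]
mean = sum(logs) / len(logs)
var = sum((x - mean) ** 2 for x in logs) / len(logs)

# Theoretical mean of log of one factor: (b ln b - a ln a)/(b - a) - 1.
a, b = 0.9, 1.2
per_factor = (b * math.log(b) - a * math.log(a)) / (b - a) - 1.0
# The sample mean of the log-products sits close to 50 * per_factor,
# consistent with an approximately lognormal product.
assert abs(mean - 50 * per_factor) < 0.08
```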
[Figure 5 – 3 appears here: a two-dimensional plot of the distilled Recognized Picture, with the perceived level of strength of local terrorists on the horizontal axis and the regional perception of terrorist threat level on the vertical axis. Lettered points mark individual player assessments, grouped into course-of-action regions labelled ʻDeploy QRFʼ, ʻDo bothʼ, ʻMore prepare recceʼ, ʻPrepare, talk, & inform policeʼ and ʻNegotiateʼ, with arrows showing projections of situation assessments.]

Figure 5 – 3; Peace Support / Counter-Terrorism RPD game results validating the assumed structure of the recognized picture, adapted from [14]; a plot is shown of each experienced army officerʼs current and evolving situational assessment in the recognized picture and preferred mission transition. Subsequent analysis produced a coarse graining of the space into discrete volumes, each corresponding to a different choice. The vector at each plot represents the playerʼs assessment of short term change as deteriorating over time.
We use a Beliefs, Desires and Intentions (BDI) layered agent structure to capture the full process where the layers within the agent correspond to those of a mobile robot [15, 16];
¾ Deliberate planning where the agent looks ahead, in collaboration with other agents, to adapt its plan and future mission structure. The current selected plan is assumed isomorphic to its Intent. For example, this global intent corresponds to Alice the supervisor in her corner office setting a board level yield to investment target as described in chapter 3. This is then cascaded down to a set of targets for trader Bob and his colleagues in the cubicle farm.
¾ A scheduling layer to turn this into a schedule of lower level operations for the individual agent; and,
¾ Rapid planning at the individual agent level, where the robot agent selects from a small set of alternative possible operations which are functionally defined. For example, trader Bob mentally sets his own target yield low for Monday due to lower than normal activity levels, knowing that in his sector the volume of trade will pick up later in the week.

The process model invokes an analysis of the agentʼs current observations of the recognized picture, based on data received by the agent via its sensors. The analysis of these data consists of data smoothing and parameter (mean and covariance) estimation. The data analysis is performed by an M-HAT. For example, each agent deduces its local PCPR from observations of enemy combat power and own force combat power. Each of these two volatile data input streams is analyzed independently by an M-HAT mixture model, being a mixture (or combining) of single-hypothesis models called 1-HATs. Which models we choose to mix is our choice. The following set seems to work well in practice;
¾ The steady 1-HAT is of 1st order polynomial form, and corresponds to business as usual. The other three 1-HATs in the mixture model are all 2nd order polynomial models.
¾ The blip 1-HAT represents a system model that describes a transient in the time series.
¾ The rapid change 1-HAT represents a system model that describes a slope change in the time series.
¾ The explosive change 1-HAT represents a system model that describes a step change in the time series following a pause.

The key output parameters estimated by each of these 1-HATs are the current mean and covariance (m_t, C_t). We choose the estimate from the 1-HAT tracker with the highest posterior probability.
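The mixture-selection step can be sketched as follows: given each 1-HAT's one-step prediction and the newest observation, update the posterior hypothesis probabilities and keep the winning tracker's estimate. The Gaussian predictive form and all numerical values are illustrative assumptions, not the full M-HAT machinery of [3].

```python
import math

def gaussian(x: float, mean: float, var: float) -> float:
    """Normal density, used as the predictive likelihood of each 1-HAT."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def select_hat(priors: dict, predictions: dict, observation: float):
    """Bayesian model selection across the 1-HATs of the mixture:
    posterior ∝ prior * likelihood, then pick the highest posterior."""
    posts = {h: priors[h] * gaussian(observation, m, v)
             for h, (m, v) in predictions.items()}
    total = sum(posts.values())
    posts = {h: p / total for h, p in posts.items()}
    best = max(posts, key=posts.get)
    return best, posts

predictions = {            # (predicted mean, predicted variance) per 1-HAT
    "steady":    (10.0, 1.0),
    "blip":      (13.0, 4.0),
    "rapid":     (16.0, 4.0),
    "explosive": (20.0, 9.0),
}
priors = {h: 0.25 for h in predictions}
best, posts = select_hat(priors, predictions, observation=15.5)
assert best == "rapid"
```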
The Recognized Picture

The Recognized Picture is the set of information needed by the commander, updated over time. Mathematically we represent it as a space spanned by a set of factors which are considered most relevant. For warfighting these relate to force ratio, logistics status etc., whereas for peacekeeping they may be numbers of refugees, water supplies and so on. In each case, the M-HAT formulation updates the assessment of where the commander perceives he is located within the space described by this set of factors. This is represented using Fuzzy Set Theory.
Fuzzy Sets and Local Choices

Taking as an example enemy and own force strengths as the factors forming the Recognized Picture, each M-HAT mixture model operates on an input time series. For one mixture model the input time series comprises observations, from sensors and other means, of the enemy combat power in the command agentʼs local area of interest. For the other mixture model the input time series comprises observations, using radio or GPS updates, of own force combat power in the agentʼs local area of interest. Each 4-HAT mixture model processes its associated time series and outputs a normal distribution of belief. Exploiting the Bayesian methods of [3], a probability is also assessed for each of the hypotheses (rapid change in enemy threat level is now occurring, for example). These probabilities are tracked over time. Each 4-HAT mixture model assessment is thus of the form:
¾ For a given dimension of the Recognized Picture, such as the enemy force level assessment, and corresponding 4-HAT mixture model tracker, an updated fuzzy estimate (m_t, C_t) of location on that dimension. In theory, there are four of these estimates, one produced by each 1-HAT in the mixture model. We simply choose the estimate corresponding to the hypothesis (such as rapid change now occurring on this dimension) with the highest posterior probability as calculated from each 1-HAT.

The idea behind the situation assessment described here is to provide an initial ʻOKʼ/ʻNot OKʼ alert. If the situation has shifted significantly since the last update, the assessment is ʻNot OKʼ and the agent needs to do some pattern matching in order to find out if a plan adaptation is required.
Pattern Matching and Preferred Mission Selection

The aim at this final stage is to try to recognize the extant situation in the outside world, based on the data received by the command agent, and to identify the mission appropriate to this situation. In the pattern matching process example at Figure 5 – 4 we compare the assessed PCPR distribution against a number of fixed patterns, denoted P[k]. When the ʻNot OKʼ situation is detected the agent executes the pattern matching process using the current PCPR. This results in a set of Bayesian posterior probabilities for the pre-defined patterns; and thence for the associated courses of action that the agent can adopt. These
probabilities define the relative ʻpreferabilityʼ of each pattern, P[k] (and thence the mission associated with each pattern) given the current local situation (defined by the PCPR).
[Figure 5 – 4 appears here: the Recognized Picture spanned by the own force combat power estimate (horizontal axis) and the enemy force combat power estimate (vertical axis).]

Figure 5 – 4; Structure of the Recognized Picture (RP), based on the building block experimental data [10, 14]; see Figure 5 – 3. Fixed patterns (open ovals: trimmed multivariate normal distributions) form fuzzy subsets of the RP; the trajectory of situational assessment (hatched ovals: trimmed multivariate posterior normal distributions) also forms fuzzy subsets of the RP.
Each fixed pattern is a representation of one possible situation that could exist in the outside world and the question to be answered is: which of these patterns (and thence situation) is most likely, given the observed PCPR? The comparison is made by computing the degree of overlap of the two trimmed bivariate normal distributions corresponding to assessed PCPR and a given pattern, P[k] . This yields two outputs:
¾ L(PCPR | P[k]) - the likelihood that the given PCPR would have been obtained had the situation in the outside world been the one represented by pattern P[k];
¾ p(P[k] | Dt) - the posterior probability that pattern P[k] is the one that best represents the situation extant in the outside world, given the time series of (enemy and own force combat power) observations seen to date (Dt).
Having calculated the posterior probability of each pattern P[1], P[2], ... P[N] we select the pattern, P[k], with the highest posterior probability as the one that best represents the situation extant in the outside world. The situation has now been ʻrecognizedʼ - and it is represented by the selected pattern, P[k]. The decision-makerʼs experience and training is represented within the agent as a bijective mapping from the set of fixed patterns to the set of appropriate missions in the agentʼs long-term memory. Thus, the selected pattern P[k], representing the recognized situation, leads directly to a locally desired mission.
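The pattern-selection step can be sketched in the same Bayesian style. Here each fixed pattern and the assessed PCPR are reduced to diagonal-covariance bivariate normals, a simplification of the trimmed multivariate distributions in the text, and all numerical values are illustrative assumptions.

```python
import math

def density(point, mean, var):
    """Product of independent 1-D normal densities (diagonal covariance)."""
    d = 1.0
    for x, m, v in zip(point, mean, var):
        d *= math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return d

def recognize(pcpr_mean, patterns, priors):
    """Posterior probability of each fixed pattern P[k] given the PCPR mean."""
    posts = {k: priors[k] * density(pcpr_mean, m, v)
             for k, (m, v) in patterns.items()}
    total = sum(posts.values())
    return {k: p / total for k, p in posts.items()}

# Hypothetical fixed patterns: name -> (mean per axis, variance per axis).
patterns = {
    "P1_low_threat":  ((0.2, 0.8), (0.05, 0.05)),
    "P2_balanced":    ((0.5, 0.5), (0.05, 0.05)),
    "P3_high_threat": ((0.9, 0.3), (0.05, 0.05)),
}
priors = {k: 1.0 / 3.0 for k in patterns}
posterior = recognize((0.85, 0.35), patterns, priors)
best = max(posterior, key=posterior.get)
assert best == "P3_high_threat"
```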
Modelling the final choice

How does the agent finally choose either to stick with the current mission or adapt to another? We are assuming that the system is complex and thus under Edge control¹ [17]. The set of tactical choices made cumulatively by the agent over time may give the appearance of an external intelligent planner steering the system, but higher level control is emergent. To quote the first few lines of the editorial from [18] concerning robot behavior: "In academia, the word tactical often refers to the quality of an action intended to give advantage over an opponent or problem scenario. As such it is reasonable to suggest that a sequence of tactical actions taken by some agent often results in what we perceive as intelligent behavior". For initial guidance on what this means, we turn to the psychology of the decision process. Janis and Mann [19] identify a potential dilemma between the losses incurred due to not changing the mission, and the risks associated with change. This choice dilemma between potentially conflicting objectives is identical mathematically to Bob the market traderʼs dilemma analyzed in Chapter 3. Alice his supervisor has set a board level target for return on investment, and she cascades that global target down to a set of yield to investment ratio (YTIR) targets for Bob (and the other traders in the cubicle farm) which he has to take account of in his day to day trading. As shown in chapter 3, the subset of best choices {δ*} creates a folded surface within the space of feasible solutions.
Bud Formation Revisited
1 A term invented in discussion with David S Alberts, during a break from working up the ideas behind the NATO doctrine for command in the information age [17].
This process can be described as the forming of a bud, in the sense of an extremal model of local human reasoning as a graphical belief network, as discussed earlier. Our bud in this case has the form of Figure 5 – 5, showing the transmission of belief from the rest of the network and the influence of locally derived information. Setting y as the local threat data vector and e as the perception of threat level derived from assets, analysis and intelligence with a broader span, these transmissions of influence are derived as follows;
y = e_y⁻ = e_k⁻ and e_k⁺ = e

λ(k) = λ_y(k) ≡ P(e_k⁻ | k) = P(y | k);   π_k(j) ≡ P(j | e_k⁺) = P(j | e)
[Figure 5 – 5 appears here: a three-node belief network in which the node ʻIntent jʼ passes π_k(j) down to the node ʻPattern kʼ, which in turn receives λ_y(k) up from the node ʻLocal data vector yʼ.]

Figure 5 – 5; Bud formation in tactical decision making; the intent j communicates the influence of the rest of the network to the bud, as required by locality assumptions; bud formation is pruned by the data vector y. Belief in enemy intent corresponds to a probability associated with each intent j drawn from a finite set.
The relationship we have now constructed between the M-HAT functions and our extremal model of human decision-making, as a series of interacting inferences across a ʻbudding networkʼ, indicates that a key driver in the final choice is the relative importance to the commander of these two viewpoints. There is significant evidence in support from our experiments at Building Block 2 (discussed earlier in this chapter). These functions thus represent plausible models of the commanderʼs choice. The resulting decision between conflicting objectives is then a dilemma. The outcome of this dilemma is a choice between deviating from the current course of action or not, based on the changing perception of the threat. We assume that the unit commander will always want to take the decision δ which minimizes the expected Loss, based upon past experience.
The commander will then translate this decision into a course of action going forward which, again, in his or her past experience, is the most suitable. In our example model, the unit commanderʼs local threat assessment is that the PCPR at time t is normally distributed with expectation c and variance V, as determined by the 4-HAT tracking enemy combat power, so that f(θ) ≈ N(c, V). However, more global intelligence and sensor feeds into his TAC-HQ show the potential for the mean of the distribution of PCPR density to change from c to c + δ, where the decision δ is the shift in assessment of local threat. The variance is assumed to remain the same with the decision change. The belief in PCPR, given the decision to shift the threat perspective, is then:

f(θ | δ) = (1/√(2πV)) · exp[−(θ − (c + δ))² / (2V)], i.e. N(c + δ, V)
The Loss function is modelled, as in chapter 3, by a conjugate normal loss function:

L(θ, δ) = h · [1 − exp(−(θ − μ)² / (2k(δ)))]

The parameter h is a cutoff corresponding to maximum loss. Integrating over all feasible values of threat level θ gives the Loss expected if a decision δ is made. This yields an expected Loss function of the general form;
E(δ) = (h/√(2πV)) ∫_{−∞}^{∞} (1 − exp[−(θ − μ)²/(2k)]) · exp[−(θ − (c + δ))²/(2V)] dθ

= (h/√(2πV)) ∫_{−∞}^{∞} exp[−(θ − (c + δ))²/(2V)] dθ − (h/√(2πV)) ∫_{−∞}^{∞} exp[−(θ − (c + δ))²/(2V) − (θ − μ)²/(2k)] dθ

Substituting t = (θ − (c + δ))/√(2V) and q = √((k + V)/(2kV)) · (θ − (μk + (δ + c)V)/(k + V)):

E(δ) = (h/√π) ∫_{−∞}^{∞} exp(−t²) dt − (h/√π) · √(k/(k + V)) · exp[−(δ − (μ − c))²/(2(k + V))] ∫_{−∞}^{∞} exp(−q²) dq

E(δ) = h − h · √(k/(k + V)) · exp[−(δ − (μ − c))²/(2(k + V))] = h · [1 − √(k/(k + V)) · exp(−(δ − d)²/(2(k + V)))]   (1)

writing d = μ − c.
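Equation (1) can be checked numerically: with k held constant, the expected Loss has its minimum exactly at δ = d = μ − c. The parameter values below are illustrative.

```python
import math

def expected_loss(delta, h=1.0, k=2.0, V=1.0, d=0.7):
    """Equation (1): E(delta) = h*(1 - sqrt(k/(k+V)) * exp(-(delta-d)^2 / (2(k+V))))."""
    return h * (1.0 - math.sqrt(k / (k + V))
                * math.exp(-(delta - d) ** 2 / (2.0 * (k + V))))

# Grid search over delta confirms the minimum sits at delta = d.
grid = [i / 1000.0 for i in range(-2000, 2001)]
best = min(grid, key=expected_loss)
assert abs(best - 0.7) < 1e-3
```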
The scalar-valued function E(δ) is of Lyapunov candidate type, and we seek the value of δ, denoted δ*, which minimizes expected Loss. This is equivalent to the local commander continually adapting his threat perspective at each command cycle so as to minimize the likely overall Loss involved. Setting the difference between the global and local perspectives of the threat PCPR to be μ − c = d, with k held constant, differentiating equation (1) as a function of δ and setting the result to zero for a turning point yields:

E′(δ) = h · ((δ − d)/(k + V)) · √(k/(k + V)) · exp[−(δ − d)²/(2(k + V))] = 0

There is only one solution to this equation: the choice δ = d, so that the adaptation is always to the global threat assessment; c + δ = c + d = c + (μ − c) = μ. This corresponds to the unit commander simply ignoring the information given by his own sensors; not always a wise choice. In fact, if the shift δ in perception of risk is large, it is of great importance that it is correct. If the change is small, it is of less importance. Therefore k cannot be constant; k is dependent on the decision δ made. The analysis in chapter 3 assumes that:

k = k(δ, ρ) = (η² + ρ²δ²)^(−1/2)

As discussed in chapter 3, the variable ρ changes as the pressures affecting the situation change, introducing aspects such as blame culture into the decision model.
References

1. Pearl J (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
2. Kullback S and Leibler R (1951). On information and sufficiency. Annals of Mathematical Statistics 22.
3. West M and Harrison J (1997). Bayesian Forecasting and Dynamic Models, 2nd Edition. Springer.
4. Mood A and Graybill F (1974). Introduction to the Theory of Statistics. McGraw-Hill.
5. Moffat J (2002). Command and Control in the Information Age; Representing its Impact. The Stationery Office, London. Now available on Kindle.
6. Moffat J (2003). Complexity Theory and Network Centric Warfare. Office of the Secretary of Defense, DoD, Washington DC, USA. Now available on Kindle.
7. Moffat J (2011). Adapting Modeling & Simulation for Network Enabled Operations. Office of the Secretary of Defense, DoD, Washington DC, USA. Now available on Kindle.
8. Moffat J et al. (1999). Code of Best Practice (COBP) on the Assessment of Command and Control. NATO RTO-TR-9, AC/323(SAS)TP/4. Now available on Kindle.
9. ResearchGate: https://www.researchgate.net/profile/James_Moffat (accessed Feb 2018).
10. Moffat J and Medhurst J (2009). Modeling of Human Decision-Making in Simulation Models of Conflict Using Experimental Gaming. European Journal of Operational Research 196(3), pp 1147-1157.
11. Perry W and Moffat J (1997). Measuring Consensus in Decision Making: An Application to Maritime Command and Control. Journal of the Operational Research Society 48(4), pp 383-390.
12. Cox D R and Snell E J (1989). Analysis of Binary Data (Vol. 32). CRC Press.
13. Klein G A (1993). A recognition-primed decision (RPD) model of rapid decision making (pp 138-147). New York: Ablex Publishing Corporation.
14. Dodd L, Moffat J and Smith J (2006). Discontinuity in decision-making when objectives conflict: a military command decision case study. Journal of the Operational Research Society 57(6), pp 643-654.
15. Müller J P (1996). The Design of Intelligent Agents: A Layered Approach (Vol. 1177). Springer Science & Business Media.
16. Brooks R A (1986). A Robust Layered Control System for a Mobile Robot. IEEE Journal on Robotics and Automation RA-2(1), March 1986.
17. NATO (2010). The NATO NEC C2 Maturity Model. Office of the Secretary of Defense, DoD, Washington DC, USA.
18. Journal of Defense Modeling and Simulation 9(1), 2012.
19. Janis I and Mann L (1979). Decision Making: A Psychological Analysis of Conflict, Choice, and Commitment. New York: The Free Press.
118
Modeling Conflict and Competition far from Equilibrium: Chapter 6;
Analyzing Schelling – type Automata Models and Edge Networks
Professor James Moffat

… if p > p(cr) then almost all clusters are percolating. As p → p(cr) from above, it can be shown [3] that the distribution in size of percolating clusters is fractal. In fact, for p slightly greater than p(cr), the fraction F of individual domains that form part of a percolating cluster takes the form F ∝ (p − p(cr))^γ. From this, we can see that as p approaches the critical value p(cr) from above, the fraction of domains forming part of the percolating cluster tends to zero (for a very large initial configuration). Thus such clusters can become very tenuous close to the critical point. We will encounter this phenomenon again in our discussion of information entropy, although it is unclear at present if they are related.
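To make this tenuous-cluster behaviour concrete, here is a minimal illustrative sketch (not from the text) of site percolation on an n x n square lattice: a union-find structure merges neighbouring occupied domains, and we track the fraction of all domains belonging to the largest cluster as p rises past the two-dimensional site threshold p(cr) ≈ 0.593. The function name and parameters are hypothetical.

```python
import random

def largest_cluster_fraction(n, p, seed=0):
    """Fraction of all n*n lattice sites belonging to the largest
    cluster, for site percolation with occupation probability p."""
    rng = random.Random(seed)
    occupied = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    parent = {}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Sweep the lattice, merging each occupied site with its occupied
    # upper and left neighbours (4-connectivity).
    for i in range(n):
        for j in range(n):
            if occupied[i][j]:
                parent.setdefault((i, j), (i, j))
                if i > 0 and occupied[i - 1][j]:
                    union((i, j), (i - 1, j))
                if j > 0 and occupied[i][j - 1]:
                    union((i, j), (i, j - 1))
    if not parent:
        return 0.0
    sizes = {}
    for site in parent:
        root = find(site)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / (n * n)

# Well below p(cr) ~ 0.593 the largest cluster is a vanishing fraction
# of the lattice; above it, a tenuous spanning cluster grows rapidly.
for p in (0.40, 0.65, 0.80, 0.95):
    print(p, round(largest_cluster_fraction(60, p), 3))
```

On a finite lattice the transition is smoothed, but the qualitative picture of the text (a cluster fraction that collapses towards zero near the critical point) is visible even at modest lattice sizes.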
Small World Networks
A network is a small world if the average path length is of the same order as the equivalent random graph (i.e. a random graph having the same number of nodes, and the same expected degree per node) but where the local clustering coefficient C is much greater than for the equivalent random graph. The small world network [4] has its roots in social systems in which most people have very local relationships (e.g. local friends), but also have a few friends who are a long distance away. To construct a simple example of such a network, we start with an ordered one-dimensional lattice of N nodes with a periodic boundary. We assume that every node is linked to its first K neighbours (K/2 on each side) and that N is significantly bigger than K. We also add an expected number pNK/2 of long range edges which connect nodes that otherwise would be part of different local communities. In developing a model which relates the average path length l to the main driving parameters of the system (p, K, N), we can derive the limiting form as a renormalisation invariant model of Gell-Mann type as discussed in chapter 4;

l ∝ (N/K) f(pKN)
where f(x), evaluated at x = pKN, is a scale invariant function. Scaling collapse of the three variables p, K, N into one composite variable x is characteristic of the use of renormalisation group approaches. When the probability p is low, the spectral density function of the adjacency matrix has several spikes, reflecting the regular lattice-like nature of the network. As p increases to 1, the spectrum smooths out and tends to the semi-circular distribution characteristic of a random graph [5]. Thus the loop structure tends towards that of a random network for high values of p, reflecting the increasing dominance of the random longer range links.
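The construction just described can be sketched as follows, assuming a ring lattice with K neighbours per node, an expected pNK/2 random long-range shortcuts, and a brute-force breadth-first search for the average path length l; the function names are illustrative, not from any model in this book.

```python
import random
from collections import deque

def small_world(n, k, p, seed=0):
    """Ring lattice of n nodes, each linked to its k nearest neighbours
    (k/2 per side), plus an expected p*n*k/2 random long-range edges."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    for _ in range(round(p * n * k / 2)):
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def avg_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs,
    by breadth-first search from every node."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# A small density of long-range shortcuts collapses the average path
# length l while the lattice remains locally clustered.
print(round(avg_path_length(small_world(200, 4, 0.0)), 2))  # ~ N/(2K)
print(round(avg_path_length(small_world(200, 4, 0.1)), 2))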
Scale Free Networks

A network is said to be scale free if the network as a system transforming inputs to outputs is scale free as defined by Moffat [6]. Such a network exhibits properties of scale invariance. This implies that the distribution of richness across nodes is a power law, with a few nodes acting as hubs and having many ingoing and outgoing links, while most nodes have a much sparser connection to the network. In general, neither random nor Watts-Strogatz type small world networks have this as a natural property. Since it is an emergent property of many real networks such as the World Wide Web, research has focused on how such scale free networks evolve, while still retaining other characteristics such as a high local clustering coefficient. It turns out that the two key factors required to create a scale free network are growth and preferential attachment. These are described as follows.

Growth. At each timestep, a new node is added to the network. This node comes with a number of new links which are attached to the existing nodes.

Preferential Attachment. When choosing which node a new link should attach to, the probability of choice is proportional to the richness of each of the existing nodes.

For this basic Barabasi type network evolution model, the exponent γ characterizing the power law which describes the degree distribution is fixed at γ = 3. We need to add additional locally adaptive features to the basic model to give the wide spread of exponent values we see across various real networks. Such real scale free networks typically retain a low average path length, in the sense that the average path length is proportional to the logarithm of the number of nodes. Correlations also tend to develop spontaneously between connected pairs of nodes with degree values k and l.
It is possible to generalise the basic model in a number of ways to achieve a broad spread of values of γ, including;
¾ Adding both edges and nodes to the existing network over time,
¾ Rewiring, where we take an existing edge and then reassign its ending node within the network.
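A minimal sketch of growth plus preferential attachment, with node degree standing in for "richness" (the basic Barabasi model, for which γ = 3). The stub-list sampling trick and all names are illustrative, not from any model described in this book.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Grow a network to n nodes: each new node attaches m links to
    existing nodes with probability proportional to their degree."""
    rng = random.Random(seed)
    # Fully connected seed core of m + 1 nodes; list every edge endpoint
    # so each node appears once per link it carries.
    targets = [v for i in range(m + 1) for j in range(i) for v in (i, j)]
    degree = {i: m for i in range(m + 1)}
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            # Sampling uniformly from the stub list implements
            # degree-proportional (preferential) attachment.
            chosen.add(rng.choice(targets))
        degree[new] = m
        for t in chosen:
            degree[t] += 1
            targets.extend((new, t))
    return degree

deg = barabasi_albert(5000, 3)
print("max degree:", max(deg.values()),
      "hubs (degree >= 30):", sum(1 for d in deg.values() if d >= 30))
```

The run illustrates the hub structure described above: a handful of nodes acquire very high degree while most retain only a few links.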
Analysing the Benefit of Edge Networking

Now we turn to the question;
¾ What is the net benefit (if any) of using the knowledge obtained by Edge networked interconnection?

To answer this, I want to begin by recalling from chapter 3 the following;

'A dynamical system 'flow' starting in a low entropy state will tend to wander into larger coarse grained volumes; hence entropy tends to increase over time if the system is isolated, giving rise to the second law of thermodynamics. Liouville's Theorem that the divergence of such a flow is zero implies that, starting from a given probability distribution of initial conditions with low variance, the distribution will spread out and filament over time, with these filaments wandering into larger coarse grained volumes…'

For information networks we can apply the same considerations of coarse graining and system complexity. However, the evolution of the system in terms of renormalisation rather than time is now the key variable. Analysis of thermodynamic evolution shows that the system, if isolated, self-organizes to an attractor state of maximum entropy. For a self-organizing information system, scale invariance is the attractor state, and renormalisation over time is the process by which the system evolves. We now wish to consider the Kolmogorov Complexity, since this is an information based measure of descriptive complexity [ref Cover]. The basic idea has emerged several times; for example in Popper's development of axiomatic probability theory [7] and in Gell-Mann's suggestion of repeated patterns in data leading to the ability to succinctly describe the data [8]. In this case the expected description length is small, and the ability to summarize increases our understanding (as occurs in the development of the recognized picture). Popper's viewpoint is that the theory which explains the data with the smallest number of variables is the one giving maximum insight.
Conversely, a theory which simply mimics reality gives no additional insight. To put these ideas into a useful form, given a dataset D, we define a strictly positive likelihood or probability of D, denoted p(D), such that log(1/p(D)) = −log p(D) is the expected description length of D. We define this to be the Kolmogorov complexity measure of the dataset D.
More generally, given a sequence of datasets D_i, clearly the expected description length of the sequence is just the sum of the description lengths for each, if they do not overlap in meaning. This implies that;

− Σ_{i=1}^{N} p(D_i) log p(D_i)

is the composite Kolmogorov Complexity (K – C) of these datasets.
From this equation we can see that K – C has the following properties;
¾ K – C takes values in the semi-open interval ]0, log N]. Knowledge in the region near 0 is high, corresponding to very succinct descriptions.
¾ Knowledge in the region near to log N is low, corresponding to very lengthy descriptions with little or no pattern.
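These two properties can be checked numerically. The sketch below is illustrative only (base-10 logarithms are assumed, to match the numerical values quoted later in this chapter); it computes the composite K – C of a set of non-overlapping datasets from their probabilities.

```python
from math import log10

def description_length(p):
    """Expected description length -log p of a dataset of probability p."""
    return -log10(p)

def composite_kc(probs):
    """Composite Kolmogorov Complexity of non-overlapping datasets D_i:
    - sum over i of p(D_i) log p(D_i)."""
    return -sum(p * log10(p) for p in probs)

# N equally likely datasets attain the upper end log N of the
# semi-open interval ]0, log N]; a sharply peaked distribution gives
# a value near 0, i.e. a very succinct description.
N = 17
print(round(composite_kc([1 / N] * N), 2))   # log 17 = 1.23
print(round(composite_kc([0.99, 0.01]), 3))
```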
We will now describe an experiment, run jointly with the Portuguese army, to test this theory of Kolmogorov Complexity. The experiment was carried out, as part of a NATO research agreement, with the Portuguese Military Academy in Lisbon, using volunteer teams of military cadets in officer training at the academy. This elegant former palace was a reminder of the alliance between the UK and Portugal, going back to 1371. The experiment was structured around a terrorist incident in which each of the teams had to work together, under different networking assumptions, to rapidly determine the precise nature of what had happened; the Who, What, When, and Where of the incident. We used the US DoD ELICIT game framework (with active involvement of the US during the analysis of results) since this allowed us to capture the four main networking topologies shown in Figure 6 – 1.
[Figure: the four ELICIT configurations, each built from the Who, What, When and Where team websites and teams. Panels show the De-conflicted, Coordinated and Collaborative C2 approaches (with team members, team leaders, deconflictors and a coordinator/facilitator whose role is defined by instructions, and a feature allowing players to evaluate and share/post factoids), rising to the Edge approach in which players have access to all websites.]
Figure 6 – 1; the four networking structures represented in ELICIT. Each corresponds to a NATO defined Command and Control (C2) Approach [9]. Deconflicted networking (top left) implies the system is the sum of its parts. Moving to more advanced forms (Coordinated; Collaborative; Edge) leads to increasing interaction synergy. Small light circular nodes are team members, small dark red circular nodes are team leaders (where present) and large circular dark nodes are coordinators across teams (where present). As we progress to higher levels of interaction, the coordinator’s role becomes more dynamic and proactive. Rectangles represent websites potentially accessible by team members to post or pull information, with networked cross-coupling between these websites increasing at the higher interaction levels.
For our experiments, each ELICIT game was played by a team of 17 military cadets, with 4 teams in all (a total involvement of 68 cadets), and we defined four 'solution spaces' corresponding to the four parts of the overall solution (Who, What, When and Where).
Hypothesis Testing
The underlying hypothesis we tested was that the reduced inertia in sharing of information at the more networked levels would, as a null hypothesis H(0), have no effect on reducing solution times – and thus keep civilians at risk. Fortunately H(0) turned out to be false. For a given solution space, each player developed over time a description of the solution (including the null case where no solution was given) – we call this an ID in ELICIT. For each solution space, there were up to K possible choices, and a particular choice was represented by k (1 ≤ k ≤ K). For example, the 'Who' choices corresponded to up to K different possible terrorist organizations. As the game progressed, additional information was made available and shared. We tested for convergence over time, corresponding to increasingly coincident IDs such as assessments of device location (Where) and device type (What). Hence, we tracked;

S(i, t, k) = Number of coincident IDs of type k for solution space i at time t
For example, if we consider the 'What' solution space, and there are 3 identical IDs by various players for a device of type k, then S(What, t, k) = 3. The probability of this description is defined as:

p(i, t, k) = p(What, t, k) = S(What, t, k)/17 = 3/17 = 0.18 (since the number of players is 17).

Note that −log p(i, t, k) (the expected description length) in this case is 0.75 when there are 3 coincident IDs, falling to about 0.03 if 16 out of 17 players give the same ID.
The total number of definite IDs is given by

Σ_{k=1, S(i,t,k)≠0}^{K} S(i, t, k).

The number of players who do not make a definite ID for solution space i is then given by:

17 − Σ_{k=1, S(i,t,k)≠0}^{K} S(i, t, k).
This equation also provides an indication of the level of uncertainty of a group towards any possible ID, making the reasonable assumption that uncertainty is related to unwillingness to make a positive ID. For the null case (no ID given) we define the probability of this description as:
p(i, t, k = ∅) = 1/17, where ∅ denotes the null set.

In this case the expected description length of each null ID is −log(1/17) = log 17, and is thus as large and positive as it can be. If many players do not supply an ID (an event which requires a long description length to lay out), then the Kolmogorov complexity increases significantly. For example, at the beginning of the game, when there are no positive IDs, there are 17 null IDs, each with a description length of log 17. We now define the Kolmogorov Complexity for solution space i at time t as

KC(i, t) = − { Σ_{k=1, S(i,t,k)≠0}^{K} p(i, t, k) log p(i, t, k) + [17 − Σ_{k=1, S(i,t,k)≠0}^{K} S(i, t, k)] (1/17) log(1/17) }
This expression then represents the expected description length for our solution space corresponding to each of the possible values of p (i, t , k ) , including all of the null IDs (each taken separately in the summation). In addition, it also provides an indication of the level of uncertainty of the group towards any possible ID since it refers to the number of individuals that did not provide a positive ID.
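A sketch of this computation for a single solution space is given below, using base-10 logarithms (which reproduce the value log 17 = 1.23 quoted in the text); the function and its argument convention are hypothetical, not the actual ELICIT analysis code.

```python
from math import log10

N_PLAYERS = 17

def kc(id_counts):
    """KC(i,t) for one solution space, given the number of coincident
    positive IDs S(i,t,k) for each choice k (zero entries omitted).
    Each player with no ID contributes a null ID of probability 1/17."""
    positive = -sum((s / N_PLAYERS) * log10(s / N_PLAYERS)
                    for s in id_counts if s > 0)
    nulls = N_PLAYERS - sum(id_counts)
    return positive + nulls * (1 / N_PLAYERS) * log10(N_PLAYERS)

# At the start of the game all 17 IDs are null, so KC = log 17 = 1.23,
# the maximum Shannon entropy; convergence drives KC towards 0.
print(round(kc([]), 2))     # no positive IDs
print(round(kc([3]), 2))    # 3 coincident IDs
print(round(kc([16]), 2))   # 16 of 17 players agree
```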
Measuring Cognitive Self-Synchronization
Cognitive Self-Synchronization (CSSync) measures the amount of coherence across a group at a particular time t in terms of the four solution spaces (Who, What, Where and When) of a terrorist attack. Note that our emphasis here is on the synchronization of the positive IDs made by the subjects. Treating these subject identifications as a measure of uncertainty, the function we use to represent CSSync, based on Kolmogorov Complexity, is the following:
CSSync(i, t) = 1 − KC(i, t) / Max_Entropy
CSSync is measured separately for each of the four solution spaces i = 1,…,4. Note that Max_Entropy refers to the maximum Shannon entropy value and is used to normalize CSSync to a value between 0 and +1. The values at the boundaries may be interpreted as follows:
• CSSync = 0 means the system is disordered;
• CSSync = 1 means the system is fully synchronized.
We assume that any group of players operating in ELICIT has an initial state of maximum disorder (maximum Shannon entropy), that is:

Max_Entropy = − Σ_{i=1}^{N} (1/N) log(1/N) = log N.
In our case, Max_Entropy = log 17 = 1.23 and;

CSSync(i, t) = 1 − KC(i, t)/log 17 = 1 − 0.81 KC(i, t)
This implies that reducing Kolmogorov complexity, leading to greater understanding of the problem, should increase team synchronization. A measure for the overall CSSync(t) at time t is the equally weighted sum of the partial values, that is:
CSSync(t) = (1/4) Σ_{i=1,…,4} CSSync(i, t)
We are essentially assuming that the four solution spaces are non-overlapping in terms of evidence and meaning, and we can then add the contributions from each of them to give an overall measure, based on Kolmogorov Complexity, for the state of the game at time t. Formally this is not strictly true since the solution spaces are linked, but the sum is always an upper bound and usually works well in practice as a measure of merit.
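The normalization can be sketched as follows (illustrative only, with base-10 logarithms assumed as before):

```python
from math import log10

MAX_ENTROPY = log10(17)  # = 1.23 for 17 players

def cssync(kc_value):
    """CSSync(i,t) = 1 - KC(i,t)/Max_Entropy for one solution space."""
    return 1 - kc_value / MAX_ENTROPY

def cssync_total(kc_values):
    """Overall CSSync(t): equally weighted mean over the four
    (assumed non-overlapping) solution spaces."""
    return sum(cssync(k) for k in kc_values) / len(kc_values)

# Maximum disorder (KC = log 17) gives CSSync = 0; full convergence
# (KC = 0) gives CSSync = 1.
print(cssync(MAX_ENTROPY), cssync(0.0))
print(round(cssync_total([0.1, 0.2, 0.3, 0.4]), 3))
```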
Theoretical Prediction
Our theory predicts that the ability to network easily with other team members leads to deeper shared understanding, thereby decreasing Kolmogorov Complexity. In such a scenario we have a description length reducing over time for all problem spaces. But we have that;
CSSync(t) = (1/4) Σ_{i=1,…,4} [1 − 0.81 KC(i, t)] = 1 − 0.2 Σ_{i=1,…,4} KC(i, t)
This leads to Prediction # 1;
¾ Reducing Kolmogorov Complexity KC(i, t) should lead to increasing CSSync(t) as we move up the networking levels.

Interpreting Kolmogorov Complexity in this way as a lack of system level knowledge is consistent with its identification mathematically as Shannon's Information Entropy. In chapter 3 we postulated that, with no intervention, this form of entropy would increase over time in terms of the expectation of an evolving Gibbs ensemble of identical systems. This leads to Prediction # 2;
¾ Increasing levels of CSSync require increasing investment of energy from outside the system.
Experimental Results
The CSSync mean values for no networking (Conflicted) and each networking option are shown in Figure 6 – 2.
Figure 6 - 2; Cognitive Self-Synchronization (y-axis) plotted against networking interaction level (x-axis). Horizontal bars show mean values across game outcomes; vertical bars indicate Max/Min values. The plot supports Prediction # 1 from theory: CSSync increases as the Kolmogorov Complexity of interaction reduces.
The experimental data shows, from Figure 6 – 2, a direct relation between the networking approach adopted and the resultant Self-Synchronization achieved in the cognitive domain (CSSync). From the data we have available, we can interpret this as a result of moving up in terms of the networking approach. A collective thus progressively removes constraints that inhibit information sharing, interaction, allocation of decision rights and the development of shared understanding and insight; and, at the same time, sets enablers that influence an increase in their members' proactiveness. This in turn contributes to more information sharing, better levels of shared awareness and thus reduced Kolmogorov Complexity. This is confirmed as networking interaction increases through the other approaches to Edge. The latter case is of particular interest.
The Benefits of Power to the Edge

Collaborative networking and Edge networking are equivalent in terms of network access (i.e., access to other players and websites). Thus the change in outcomes must be due to changing the organizational structure from a well-defined hierarchy to the organization described in Table 6 – 1, without pre-defined roles and with fully distributed decision rights to the edge. The three dimensions x = allocation of decision rights, y = information sharing and z = patterns of interaction delineate a networking space as shown in Figure 6 – 3. The regions of the three dimensional networking space within which the four classes of networking are located lie sequentially along the diagonal vector of this space, with no networking at the origin and Edge networking in the upper right hand corner. As we move up the dimensions of Figure 6 – 3, the frequency of interactions between entities increases and thus their focus shifts from the Information domain (from sparse to rich exchange of information) to the Cognitive domain (toward higher degrees of situational awareness) and to the Social domain (toward higher degrees of shared awareness and understanding and increased sharing of resources).
Figure 6 - 3; Left hand image: vector of networking approaches, rising from deconflicted at bottom left through coordinated and collaborative to edge at top right. Right hand image: networking approaches embedded as volumes in Cartesian 3-dimensional Networking Space (grey box); adapted from [9].
Both Collaborative and Edge networking organizations succeeded in making most information accessible to all members. Yet, for the Edge organization, subjects displayed a significant increase in activity during the game (see Figure 6 – 4) and were able to reach the best scores for CSSync. The lack of variation across teams working the Edge case may be indicative of a maxing out of synchronization potential.
Dimension | Edge networking
X = Allocation of Decision Rights | Distributed to all players
Y = Shared Information Resources | Shared across all players, and all information accessible
Z = Patterns of Networked Interaction | Unconstrained, broad and rich emergent networks

Table 6 - 1. Characterization of Edge networking
To test Prediction # 2, we also measured the associated effort required to Self-Synchronize. We counted as effort the amount of decision-making related activity, corresponding to energy crossing the system boundary [2]. A unit of effort was thus assumed expended when any of the following actions occurred:
• A factoid was shared by a player with another player.
• A factoid was posted by a player to a website.
• A player performed a pull from a website.
• A player performed an identification (an ID).

Total effort per unit of time spent was the metric used to normalize the values across the different ELICIT games. The resulting effort per hour measured for each networking approach is presented in Figure 6 - 4.
[Figure 6 – 4 bar chart: Effort (cost), on a scale of 0 to 1800 actions per hour, for each networking approach from Conflicted through to Edge; each bar is broken down into IDs per hour, Pulls per hour, Posts per hour and Shares per hour.]
Figure 6 – 4; The various bands of shading show the main contributors to the total effort in each case. At higher networking approach levels, the increasing number of pulls from websites is the dominant feature. Maintaining the Edge networking approach during the game required the greatest expenditure of effort, followed by Collaborative C2. High Cognitive Self-Synchronization thus requires significant activity to sustain it. These findings are in line with Prediction # 2, given that they correspond to a single Gibbs ensemble.
The correlation indicated by Figure 6 – 4 was confirmed by a linear regression across the 18 game outcomes from which it is clear that a direct and proportional relation exists between effort spent and CSSync. Overall, the results indicate that the ability to Self-Synchronize in the cognitive domain, as measured by an information-based measure of merit, shows a steady improvement with the networking approach adopted in the game. This improvement in cognitive Self-Synchronization with networking approach is also directly related to the level of activity (the energy drawn into the system) required to sustain that networking approach.
Multiple Gibbs Ensembles

A specialized analysis was jointly carried out by a NATO research group (of which the author was co-chair) and the US DoD of 37 ELICIT experimentation trials conducted in the USA, Portugal, and Singapore over a 3-year period. Each of these was carried out using identical protocols to our game. We thus had the equivalent of 37 + 1 = 38 Gibbs ensembles of results. This permitted rigorous statistical testing of the hypothesis that higher levels of networking approach would perform more efficiently and more effectively. The results of the analysis were clear and unambiguous [9]. At the 95% confidence level we have;
Edge networking teams;
¾ were more likely to correctly solve the knowledge problem than hierarchies;
¾ solved the knowledge problem more quickly than hierarchies;
¾ shared information more than hierarchies.
References
1. Schelling T (1969) 'Models of Segregation'. American Economic Review 59(2).
2. Moffat J (2003) 'Complexity Theory and Network Centric Warfare'. US DoD, Washington DC. Now available on Kindle.
3. Bak P (1996) 'How Nature Works'. Copernicus, New York.
4. Watts D J and Strogatz S H (1998) 'Collective Dynamics of "Small-World" Networks'. Nature 393(6684).
5. Albert R and Barabasi A-L (2002) 'Statistical Mechanics of Complex Networks'. Reviews of Modern Physics 74.
6. Moffat J (2006) 'Mathematical Modeling of Information Age Conflict'. Journal of Applied Mathematics and Decision Sciences, vol 2006, article ID 16018.
7. Thornton S, 'Karl Popper', The Stanford Encyclopedia of Philosophy (Summer 2017 Edition), Edward N Zalta (ed.).
8. Gell-Mann M (2002) Discussion at Santa Fe Institute with the Author.
9. NATO (2010) 'The NATO NEC C2 Maturity Model'. US DoD, Washington DC.
Modeling Conflict and Competition far from Equilibrium: Chapter 7: Validations and Speculations
Professor James Moffat

…evel. Overall, our algorithmic representation of human intelligence, as exemplified by decision making and its consequences in conflict, is validated when compared at the outcome or whole model level, as well as at the individual decision level of chapter 5. This approach results in a layered agent representation, where the bottom layer represents rapid response to immediate circumstances and higher level layers are plans anticipating future possibilities which are strings of these lower level responses. The results are consistent with the arguments put forward by Koch and Crick [4] and with recent developments in intelligent robotics [5]. Finally, developing new strings of missions as used in the COMAND model is a time-consuming business. We have developed a possible way of at least partially automating this
process. This is based on an implementation of genetic algorithms [6] and a summary of the research is given at Figure 7 - 3.
Figure 7 – 3; Prototype generation of a string of missions. The top two boxes show autonomous tasking of sensors building a perception of enemy force deployment. The lower box shows a string of missions to be carried out by own forces, as generated by the genetic algorithm evolving a haploid gene pool of chromosomes. Each chromosome codes for a string of missions; a fitness function then converts this to a likely outcome. The final mission string consists of a main force, mission tasked to advance to a set of objectives (thick line) avoiding the main perceived concentrations of enemy force, and a smaller force (narrow line) mission tasked to protect the right flank of the main force.
Speculations; Cybernetic Variety and Extremal Models
We defined a system as a function Φ which transforms inputs to outputs and outcomes. To bring this system to life we embed this transformation in an environment. Cybernetics studies the control of such systems, and a key construct in cybernetic theory is the number of potentially available system states, which is called the variety of the system. Ashby's law of Requisite Variety states that a system is in control if the system variety matches the variety of its environment. For example, in the industrial age, our hierarchical networks and slow communications gave rise to low variety (a simple controller); thus we had to partition the battlespace into sectors and have specialized force units (a simple system), in order to reduce the variety of the battlespace, in accord with Ashby's Law. During the Cold War (Chapter 2), the whole of western Europe was divided in this way into sectors which were the responsibility of different NATO nations—low variety of the physical battlespace was matched to low variety of the command process. Command was also hierarchical, reflecting an efficient solution to a relatively stable external environment. In the defence context, the period of the Cold War was an example of awful stability—the threat stayed essentially constant for over forty years. As a consequence, detailed roles and specialist forces were engineered, operating inside well-defined sectors of operation, and managed by an unchanging hierarchy of command. Analysis of this "scenario" went into more and more detail of particular pieces of the puzzle.
By contrast, an agile management process succeeds when conditions are very uncertain and dynamic. Relating to the defence context, multiple scenarios of the future must be considered, each with high uncertainty associated with them. It is this uncertainty and a potentially very dynamic “battlespace” which is driving defence in the direction of “edge organizations” which have the agility to cope; agile but not fragile. Analysis of these situations puts the emphasis on the spread of likely futures, rather than on the detail of a specific “scenario.” As we move deeper into the information age, we thus foresee a turbulent and uncertain set of futures, and a battlespace with high variety. Thus we need to construct a representation of the human command and decision-making process which gives rise to high variety through sharing and networking.
In my work I have captured this by creating two representations of command and decision-making, denoted deliberate planning and rapid planning. Rapid planning, reacting to local and fast changing circumstances, creates variety, and corresponds in cybernetic terms to feedback control. This is then constrained by more strategic deliberate planning in order to produce the requisite variety of command. Deliberate planning corresponds to a broad, cognitively-based review of the options available. In cybernetic terms this is feedforward control, since it involves the use of a model (i.e., a model within our model) to predict the effects of a given system change. In developing these ideas into computer algorithms which can be implemented in simulation models, we have discussed how it is possible to exploit ideas from complex adaptive systems theory and artificial-intelligence based agent approaches in order to develop mathematical algorithms corresponding to rapid and deliberate planning which avoid the use of lengthy rule sets, and instead use simpler, more generic mathematical representations. This extremal modeling approach has many advantages when it comes to actual model construction and use. We thus end with an elegant illustration of how we may be able to build a truly extremal model of any time-stepped simulation.
Building an Extremal Version of SIMBAT

The simple battlegroup model (SIMBAT) was itself a significant step forward in the development of fast-running, simple yet valid simulation models including explicit representation of tactical decision-making. The question we seek to explore here is;
How much simpler can we make SIMBAT without compromising its validity? (The author gladly acknowledges the contribution of Professor Russell Cheng, Southampton University, to answering this question). In SIMBAT each agent has an embedded rapid planner and moves along its own designated path to reach a specific objective. The model is time-stepped, and during each time-step the agents do the following key things;
¾ Update their assessment of the local threat and perceived combat power ratio (PCPR) using the sighting and acquisitions algorithm.
¾ An agent encountering hostile units will try to engage these in combat either as a mini-battle in which a unit exchanges fire at a distance, or as a close combat.
¾ Trigger their rapid planning algorithms to decide whether to advance, halt or retreat.
The nature of this updating process is such that each agent moves from one waypoint to another, forward or back, with mean velocity v during an update time Δt, or else remains stationary or is rendered ineffective. Assuming that we observe the system only at the start of each update cycle, this creates an embedded Markov process representation of SIMBAT.
Step One; Reducing Dimensionality Let N_e be the initial number of enemy model agents, N the corresponding number of own-force model agents, and W_j the number of equally spaced waypoints with spacing Δd on track j, such that v Δt = Δd. Each agent has a separate track and can be labelled by its waypoint on the track if still effective, or else is labelled as operationally ineffective. A potential system state is then given by a system snapshot corresponding to a string of such labels of length N_e + N. The total system variety is then

∏_{j=1}^{N_e + N} (W_j + 1)
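As a minimal sketch (the function name is ours, not the book's), this state count, the product of (W_j + 1) over all N_e + N tracks, can be computed directly:

```python
# Illustrative sketch: total system variety for agents on separate tracks.
# Each agent on track j has W_j waypoint states plus one "operationally
# ineffective" state, giving W_j + 1 labels per agent.

def total_variety(waypoints_per_track):
    """Product of (W_j + 1) over all N_e + N tracks."""
    total = 1
    for w in waypoints_per_track:
        total *= w + 1
    return total

# The 2 v 1 example from the text: three tracks, 3 waypoints each.
print(total_variety([3, 3, 3]))  # (3 + 1)^3 = 64
```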
This is a potentially very large number. We start with the simplest non-trivial case, where one side has 1 agent and the other has 2 agents. We label an agent state by its track location if it remains effective, or else by the state labelled operationally ineffective. As already noted, the agent on track j then generates W_j + 1 potential states. Assuming emergent control and edge organisation, the total potential cybernetic variety for a 2-agent v 1-agent battle with 3 waypoints per track is 4 for the single agent and 16 for the opposing 2 agents. Together these create 64 potential system states, whose probability distribution at time t we denote by the column vector π(t). The Chapman-Kolmogorov equation is thus, for a homogeneous process, of the form π(t + 1) = P π(t), with P a 64 x 64 stochastic transition matrix. Figure 7 - 4 shows a typical example for our problem domain.
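The iteration π(t + 1) = P π(t) is straightforward to mechanise. The following sketch uses an invented 3-state toy chain, not the actual 64-state SIMBAT matrix, just to show the convention: P is column-stochastic (column j holds the transition probabilities out of state j), so it acts on the state vector from the left.

```python
import numpy as np

# Toy sketch (an assumed example, not the actual 64-state SIMBAT matrix):
# a 3-state Markov chain with one transient state (0) and two absorbing
# states (1 and 2), iterated via pi(t+1) = P pi(t).
P = np.array([
    [0.5, 0.0, 0.0],   # probability of staying in transient state 0
    [0.3, 1.0, 0.0],   # absorption into state 1
    [0.2, 0.0, 1.0],   # absorption into state 2
])

pi = np.array([1.0, 0.0, 0.0])   # start in the transient state
for _ in range(25):              # ~25 iterations, as in the text
    pi = P @ pi

print(pi.round(4))  # probability mass has drained into the absorbing states
```

By 25 iterations the transient state's probability is 0.5^25, i.e. negligible, mirroring the behaviour reported for the full model in Table 7 - 2.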
Figure 7 - 4; State transition matrix for a 2 v 1 battle with 3 waypoints per track, resulting in (3 + 1)³ = 64 potential system states. By changing the order of the states, equivalent to a unitary transformation, the matrix P has been transformed into upper Hessenberg form. Each of the 64 x 64 matrix elements is shaded according to its value, with the band clustered around the diagonal corresponding to those states staying local, and values in the top right corner corresponding to those jumping to another locality (the Small World structure).
Experiments confirm that there are no recurrent states; each state is either transient or absorbing (halting). In the limit t → ∞, only the absorbing states have non-zero probability. In practice this occurs after about 25 iterations, as shown in Table 7 – 2.

Blue1   Red1   Red2    CK (Simuln)
4       4      4       0.0595 (0.059)
0       4      4       0.2061 (0.204)
4       0      4       0.1081 (0.106)
4       4      0       0.1081 (0.107)
0       0      4       0.0751 (0.076)
0       4      0       0.0751 (0.075)
4       0      0       0.3549 (0.360)
0       0      0       0.0119 (0.012)
Total                  0.9989 (0.999)

Table 7 – 2; Probabilities at t = 25 for each of the 8 absorbing states for the 1 Blue v 2 Red example. Each of the first three columns gives the track label for the corresponding unit. A state is a combination of three track labels, and a halting state occurs when one of the eight listed states occurs. The possible track labels are either the final waymark, 4, or 0, meaning that the agent has been rendered ineffective. The probabilities of the halt states are calculated from the full Chapman-Kolmogorov matrix (CK) and from the mean of 5000 replications of an importance-sampled Monte Carlo model (Simuln). The results are very close.
The importance sampling simulation approach can be computed ʻon the flyʼ without the need to store the full Chapman-Kolmogorov matrix. It will be the first step in the development of our extremal model.
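The on-the-fly idea can be sketched as follows. For brevity this uses plain Monte Carlo rather than importance sampling, and reuses an invented 3-state toy chain (one transient state, two absorbing states) rather than the real SIMBAT dynamics; the point is only that each replication walks the chain to absorption without ever storing a transition matrix.

```python
import random

# Sketch (our toy example, not the book's code): estimating absorbing-state
# probabilities by replication instead of storing the full C-K matrix.
# State 0 is transient; states 1 and 2 are absorbing (halting) states.

def run_once(rng):
    state = 0
    while state == 0:
        u = rng.random()
        if u < 0.5:
            state = 0          # stay in the transient state
        elif u < 0.8:
            state = 1          # absorb into state 1
        else:
            state = 2          # absorb into state 2
    return state

rng = random.Random(42)
n = 5000                       # 5000 replications, as in Table 7 - 2
counts = {1: 0, 2: 0}
for _ in range(n):
    counts[run_once(rng)] += 1

print(counts[1] / n, counts[2] / n)   # estimates near 0.6 and 0.4
```

Only the current state and the running tallies are held in memory, which is what makes the simulation approach attractive when the full state space is very large.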
The second step is to identify seven areas within the SIMBAT model which could be replaced by simple parameters {λ_j; j = 1, …, 7}, optimised to a particular scenario outcome. Together these two steps form our proposed extremal model, which we denote a Markov Chain SIMBAT-Like (MCSL) model.
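The optimisation step can be illustrated schematically. Everything below is a hypothetical stand-in: `mcsl_output`, the reference values, and the single tuned parameter are invented for illustration, and a simple grid search on squared error replaces whatever tuning method [7] actually employs. The shape of the idea, adjusting λ so the fast model's outputs match the reference model's, is all the sketch is meant to convey.

```python
# Hypothetical sketch of the tuning idea: choose an adjustment factor so
# that the fast surrogate model's outputs match the reference outputs.
# mcsl_output and simbat_reference are invented stand-ins, not real models.

def mcsl_output(lam):
    """Placeholder for a fast MCSL run with adjustment factor lam."""
    return [lam * x for x in (1.0, 2.0, 3.0)]

simbat_reference = [0.9, 1.8, 2.7]   # stand-in for averaged reference outputs

def squared_error(lam):
    return sum((m - s) ** 2
               for m, s in zip(mcsl_output(lam), simbat_reference))

# Grid search over candidate values of lambda in [0.5, 1.5].
best = min((lam / 100 for lam in range(50, 151)), key=squared_error)
print(best)  # the factor that best matches the reference outputs
```

In practice there would be one such factor per replaced area of the model, fitted jointly against a set of scenario outcomes rather than a single output vector.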
Initial Findings are Promising ¾ The MCSL model runs about 40 times faster than the equivalent SIMBAT model. ¾ It seems possible, using the simple techniques used in [7], to select the parameter settings by comparing the outputs of MCSL runs with those of SIMBAT runs, so that the MCSL model behavior closely matches that of the SIMBAT model.
¾ Further work indicates that the adjustment factors may be fairly robust to changes in scenario. ¾ The same general approach could be applied to any time-stepped simulation model.
References
1. Moffat, J., Campbell, I., & Glover, P. (2004). Validation of the mission-based approach to representing command and control in simulation models of conflict. Journal of the Operational Research Society, 55(4), 340-349.
2. Moffat, J. (2011). Adapting Modeling and Simulation for Network Enabled Operations. US DoD, Washington DC, USA.
3. Moffat, J. (2006). Mathematical modeling of information age conflict. Advances in Decision Sciences.
4. Crick, F., & Koch, C. (2003). A framework for consciousness. Nature Neuroscience, 6(2), 119.
5. Müller, J. (1996). The Design of Intelligent Agents: A Layered Approach (Vol. 1177). Springer Science & Business Media.
6. Moffat, J., & Fellows, S. (2010). Using genetic algorithms to represent higher-level planning in simulation models of conflict. Advances in Artificial Intelligence, vol. 2010, Article ID 701904, doi:10.1155/2010/701904.
7. Cheng, R., & Moffat, J. (2012). Optimally tuned Markov chain simulations of battles for real time decision making. Proceedings of the 2012 Winter Simulation Conference (WSC), Eds. C. Laroque, J. Himmelspach, R. Pasupathy, O. Rose, and A. M. Uhrmacher. IEEE.