Preliminary Experiments with Learning Agents in an Interactive Multi-agent Systems Architecture Tradespace Exploration Tool

Daniel Selva, Clara Abello, Nozomi Hitomi
Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, New York 14853
Email: {ds925, ca426, nh295}@cornell.edu
Abstract—One of the main problems of current system design and architecture tools is the communication between the tool and the user and vice versa. The tools are too opaque, and the users often do not understand their outputs, which results in a loss of confidence in the tool. The users are also frustrated because they do not have an effective way of providing their expert knowledge (acquired either in real time or a priori) to the tools, which results in poor performance. This work describes an ongoing effort to improve that communication by incorporating an interactive and “transparent” learning agent into a multi-agent tradespace exploration tool. This learning agent mines the current population of architectures for driving features (combinations of architectural variables) that appear to drive architectures towards a “good region” or a “bad region” of the tradespace, and shares that information with the user. This information is then used to produce a surrogate classification model based on a decision tree that predicts whether or not an architecture is likely to be in the “good region”. The decision tree model was chosen based on evidence that it is among the most human-understandable classification models. The information about driving features is also fed to an adaptive heuristic optimization agent that uses it to intelligently apply “good features” to or remove “bad features” from existing architectures, with the hope of improving the efficiency of the search process. Perhaps most importantly, information about driving features, surrogate models, and heuristics can help the user better understand the results of the tool and gain useful architectural insight, such as dominant features and trade-offs in the architecture space.
Manuscript created 15 Feb 2015. This work was supported in part by the College of Engineering at Cornell University and the Interdisciplinary Center of the Polytechnic University of Catalonia.

I. INTRODUCTION

Tradespace exploration and optimization tools are decision support tools broadly used in systems architecture [1]–[3]. Traditionally seen as a way to optimize the architecture of a system, they are also a powerful means to help decision makers better understand the main trade-offs across families of architectures in the presence of technical and programmatic uncertainty, and of ambiguity in stakeholder needs.

Architecture tradespace exploration tools contain an enumerable and executable model of a system architecture. The enumerability of the model implies that one can define a space of alternative architectures that can be exhaustively or partially searched.
Typically, an architecture is modeled as a set of decisions, and each decision is assigned a set of allowed values, usually categorical, sometimes discrete, and rarely continuous. This, together with a set of feasibility constraints, defines the architecture space, which is usually very large due to combinatorial explosion. The executability of the model refers in this context to its ability to simulate some aspects of the behavior of the architecture and estimate the value that it can deliver to stakeholders. The relative value of an architecture is assessed by using a number of metrics or figures of merit as a proxy for stakeholder satisfaction. Such metrics typically include some estimate of cost, performance, technical and/or programmatic risk, schedule, and other important lifecycle properties such as flexibility and robustness. It is customary but not essential to aggregate all metrics except for cost into a single utility metric using the precepts of multi-attribute utility theory [4].

The architecture space can be searched exhaustively by brute force, or partially, either by means of deterministic methods such as design of experiments (e.g., orthogonal arrays, Latin hypercubes) or stochastic methods such as heuristic optimization (e.g., genetic algorithms, particle swarm optimization [5], [6]). Design of experiments methods usually provide more insight about the structure of the tradespace (e.g., the presence of clusters or strata of architectures), whereas heuristic optimization usually samples a larger fraction of “good” architectures. Hence, the two approaches can be seen as complementary.

In the context of heuristic optimization, since the performance of different strategies varies widely across problems as per the “no-free-lunch” theorem, a relatively recent trend has been to combine multiple strategies in a multi-agent framework where each agent implements a given heuristic [7], [8]. Different aspects of such multi-agent systems are currently being studied, such as the trade-offs between cooperation and competition between agents [9], the effectiveness of adaptive strategies, and the introduction of a hyper-heuristic that monitors the performance of different heuristics at different points in time during the search and allocates resources (i.e., architectures) to heuristics based on their observed performance [10], [11].
One of the main challenges in the development and use of architecture tradespace exploration tools is the interaction with the users. The users often partially disagree with the outputs of the tool, and have difficulty tracing them back to specific input assumptions, or simply understanding the dynamics of the complex models used to calculate the outputs. As a result, users may lose confidence in the tool and decide to resort to more traditional, 100% human methods for architecture selection.

The keys to alleviating this problem are two-fold: optimizing the functional allocation between the human and the computer, and designing effective communication strategies between the human and the computer. Both are different aspects of designing better human-computer interaction frameworks [12]. Given a set of tasks to be performed, functional allocation is defined as the process of deciding which tasks will be assigned to the human and which will be assigned to the computer [13]. In the case of tradespace exploration, there is some evidence that the performance of a hybrid human-machine system has the potential to exceed that of both fully manual and fully automated systems [3]. These findings have been corroborated in other fields such as machine learning [14].

Concerning communication between the human and the computer, there are two aspects to it: communication from the computer to the human (visualization, explanation) and from the human to the computer (preferences, expert knowledge). Both aspects of the communication are necessary for success. For example, the computer must be able to present the results of the tradespace in such a way that the user can understand them. Similarly, the human must be able to express preferences and expert knowledge about the problem in such a way that the computer can use the information effectively.

The remainder of this paper describes several aspects of a multi-agent framework for architecture tradespace exploration, focusing on the human-computer interaction aspects. An agent is described that learns the driving features of the tradespace using an algorithm adapted from association rules, uses these features to construct a surrogate model based on a classification tree, and then uses the classification tree to construct adaptive heuristics. All the information generated in this process is also presented to the user in real time.

II. METHODS

A. MadKit multi-agent framework

The tool uses MadKit as the architecture for the multi-agent framework [15]. In the MadKit framework, agents can take multiple roles within groups. Agents perform a behavior specified in a live() function while alive, and they can be created and destroyed by a central manager agent. Communication between agents is done through the exchange of directed or broadcast messages. For a more complete description of our MadKit-based multi-agent framework, see [16].
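MadKit itself is a Java platform, so the following is only a schematic Python sketch of the pattern just described — agents with a live() behavior and a mailbox, plus a central manager that creates agents and broadcasts messages. Apart from live(), all class and method names here are our own illustration, not the MadKit API:

    import queue

    class Agent:
        """Schematic agent: a mailbox plus a live() behavior."""
        def __init__(self, name):
            self.name = name
            self.mailbox = queue.Queue()  # directed messages land here
            self.alive = True

        def send(self, other, message):
            other.mailbox.put((self.name, message))

        def live(self):
            """One step of behavior; subclasses override this."""
            raise NotImplementedError

    class HeuristicAgent(Agent):
        def live(self):
            while not self.mailbox.empty():
                sender, msg = self.mailbox.get()
                # e.g., receive architectures to improve, apply a heuristic, reply
                print(f"{self.name} got {msg!r} from {sender}")

    class ManagerAgent(Agent):
        """Central manager: creates agents and broadcasts messages."""
        def __init__(self):
            super().__init__("manager")
            self.agents = []

        def create(self, agent):
            self.agents.append(agent)

        def broadcast(self, message):
            for a in self.agents:
                self.send(a, message)

        def live(self):
            for a in self.agents:
                if a.alive:
                    a.live()

    manager = ManagerAgent()
    manager.create(HeuristicAgent("crossover-agent"))
    manager.create(HeuristicAgent("mutation-agent"))
    manager.broadcast("new population available")
    manager.live()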
B. User → Computer communication: candidate driving features and region delimitation

The user has two main ways of passing information to the computer: 1) choosing and defining a set of candidate features that are likely to be driving features in the problem at hand; 2) delimiting the good and bad regions.

An architectural feature is defined as any predicate function f : A → {true, false} that, when applied to an architecture a ∈ A, returns a Boolean value. An example is any logical combination of values for a subset of architectural decisions (e.g., f(a) = (x1 == 1 ∧ x2 == 0)). We will loosely refer to features whose predicate function uses n architectural decisions as features of order n. The goal of having the human provide a set of candidate features is to reduce the size of the feature space that the computer will have to search. The computer proposes a set of atomic candidate features of reduced order by default, but choosing a set of higher-level features (e.g., EmptyOrbit(Ok)) might make the search much more efficient. Importantly, this set of features can be changed during the search as the user gains more information about the tradespace.

The second way of passing information to the computer is by delimiting “the good region” and “the bad region”. For example, the good region can be defined as the low-cost, high-performance region, or as the high-utility region given a set of weights for the metrics. Since we can get useful information both from statistically good and statistically bad aspects of the architectures, we also identify a “bad” region of the space. These regions are delimited by means of a set of inequality constraints (e.g., cost < 10 ∧ perf > 0.7).
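As a concrete illustration (a minimal sketch with hypothetical decision and metric names), features can be represented as predicate functions over an architecture, and the regions as conjunctions of inequality constraints on the metrics:

    # An architecture as an assignment of values to decisions (hypothetical names).
    arch = {"x1": 1, "x2": 0, "x3": 1}

    # A feature of order 2: f(a) = (x1 == 1 ∧ x2 == 0).
    def f(a):
        return a["x1"] == 1 and a["x2"] == 0

    # Region delimitation as inequality constraints on the metrics,
    # e.g. good region = {cost < 10 ∧ perf > 0.7}.
    def in_good_region(metrics):
        return metrics["cost"] < 10 and metrics["perf"] > 0.7

    def in_bad_region(metrics):
        return metrics["cost"] > 10 and metrics["perf"] < 0.7

    print(f(arch))                                    # True
    print(in_good_region({"cost": 8, "perf": 0.8}))   # True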
C. Automatic feature identification

Given the set of candidate features and the good and bad regions defined by the user, a learning agent mines the current tradespace for features that lead to the good and bad regions of the tradespace with high probability. This is done by adapting the concept of association rules. In unsupervised learning, association rules [17] are used to identify regions of the feature space that have high probability density. The typical example is the market basket: given a set of example market baskets, it is easy to see that some items are bought together more often than others, such as {bread, butter, eggs, milk}, {peanut butter, jelly}, and so forth. The support of a feature (e.g., an item set in the market basket) is defined as the fraction of examples that present the feature. High-support features can then be used to generate rules defining probable relations between variables: for example, {bread, butter} → milk. The confidence of a feature of order n > 1 is defined as the support of the feature divided by the maximum support of all the features of smaller order it contains.

We adapt the concepts of item sets and their support/confidence to describe architectural features (combinations of architectural variables) that appear often in the good and bad regions. Thus, this is a dual market basket problem, since there is a “good basket” and a “bad basket”. Accordingly, the algorithm finds two sets of driving features (good features and bad features); the union of these sets forms our set of driving features.

The market basket problem is usually solved by means of a greedy algorithm called the apriori algorithm. For our purposes, we adapted the apriori algorithm to take into account the different prior probabilities of different features. In association rules, if we assume a uniform item space where all items are equally likely to appear, all the “features” have the same probability of occurring randomly, because they represent binary subsets of the same size. In general, however, our features have different probabilities of occurring naturally, as they represent physical aspects of the system that may correspond to event spaces of different sizes in the mathematical space (e.g., the Separate(Ii, Ij) feature defined in our case study is much more likely to occur naturally than the TogetherInOrbit(Ii, Ij, Ok) feature). Thus, the support was normalized by the probability of finding the feature in a random population. The pseudocode for our driving feature identification algorithm is shown in Algorithm 1.

Algorithm 1 Driving feature identification
  drivingFeatures ← ∅
  for all possible features do
    support ← computeSupport(data, scheme)
    if support > supportThreshold then
      if order(feature) = 1 then
        drivingFeatures ← drivingFeatures ∪ {feature}
      else
        conf ← computeConfidence(feature)
        if conf > confidenceThreshold then
          drivingFeatures ← drivingFeatures ∪ {feature}
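Algorithm 1 can be rendered in a few lines of Python. This is a sketch with names of our own choosing; it assumes candidate features are supplied in order of increasing order, so that the supports of the contained sub-features are already available when the confidence of a higher-order feature is computed:

    def support(feature, population, in_region):
        """Fraction of architectures in the region of interest that exhibit the feature."""
        region = [a for a in population if in_region(a)]
        if not region:
            return 0.0
        return sum(feature(a) for a in region) / len(region)

    def normalized_support(feature, population, in_region, random_population):
        """Support normalized by the feature's base rate in a random population,
        since features have different prior probabilities of occurring naturally."""
        base = sum(feature(a) for a in random_population) / len(random_population)
        return support(feature, population, in_region) / base if base > 0 else 0.0

    def find_driving_features(candidates, population, in_region, random_population,
                              supp_threshold, conf_threshold):
        """Adapted apriori: order-1 features are kept on support alone; higher-order
        features must also pass the confidence test against the features they contain."""
        driving, supports = [], {}
        for feature, order, subfeatures in candidates:  # sorted by increasing order
            s = normalized_support(feature, population, in_region, random_population)
            supports[feature] = s
            if s > supp_threshold:
                if order == 1:
                    driving.append(feature)
                else:
                    conf = s / (max(supports[sub] for sub in subfeatures) or 1e-9)
                    if conf > conf_threshold:
                        driving.append(feature)
        return driving

Running the same routine twice, once with in_region set to the good-region test and once with the bad-region test, yields the two sets of driving features described above.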
D. Surrogate model based on a classification tree

Once the set of driving features has been identified, a training dataset can be automatically constructed inline (i.e., during execution of the tradespace search/optimization algorithm), in which each example has its set of features populated, together with a tag indicating whether or not the architecture belongs to the region defined by the user as “good”. (The tagging is of course automatic and does not require human input.) This training set can then be used to train a classification tree using the C4.5 algorithm [17], [18]. Essentially, C4.5 is a recursive algorithm that performs the following operations at each node: if the remaining set of examples is empty or they all belong to the same class, the node is made a leaf; otherwise, the feature that provides the greatest information gain is chosen, and the set of examples is divided according to this feature. The algorithm proceeds creating child nodes until all nodes are leaves.

The resulting classification tree can be seen as a low-precision but high-accuracy surrogate model that provides a Boolean estimate of the value of the architecture, i.e., is the architecture “good” or “bad”? The goal of this surrogate model is thus not to predict a certain metric, such as the cost of the architecture, very precisely, but rather to help discover simple rules or cases combining a small number of architectural features that lead to generally good (or bad) architectures as defined by the user.

One could argue that this task is already accomplished by the identification of the driving features, which essentially tell us that an architecture is good if any of the good driving features is present, and bad if any of the bad driving features is present. In practice, even though it can express any logical function given the right features, this simple logical OR over the set of driving features may not be the most compact way of communicating what leads to good or bad architectures. The classification tree tries to generalize that structure by adding a hierarchy of driving features.

We note also that classification trees have been identified as generating knowledge in a human-learnable way. This goes back to the pioneering work of Newell and Simon, who showed that the way human experts reason can be modeled using logical rules [19]. Note also that in a recent study on human-machine interaction in the context of machine learning algorithms, it was determined that humans understood rule-based explanations more easily than any other type of explanation (feature-based, similarity-based) [14].
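The training step can be sketched as follows. We use scikit-learn's DecisionTreeClassifier as a stand-in (it implements CART rather than C4.5, but plays the same role here), with one Boolean column per driving feature and region membership as the label; the data and feature names below are toy stand-ins, not results from the tool:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy stand-in data: rows = architectures, columns = driving features (Booleans),
    # label = 1 if the architecture falls in the user-defined "good" region.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 3))               # three driving features
    y = ((X[:, 0] == 1) & (X[:, 2] == 0)).astype(int)   # hypothetical ground truth

    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)  # shallow => human-readable
    print(export_text(tree, feature_names=["DualClutch", "CeramicBrakes", "MoonRoof"]))

Keeping the tree shallow is what preserves the human readability that motivated the choice of this model in the first place.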
E. Adaptive heuristics

The driving features identified can be used more explicitly in the search process by incorporating them into adaptive search heuristics. For instance, if Algorithm 1 determines that F1 = {x1 = 1, x2 = 0, x3 = 0} is a good feature, then we can construct a heuristic that modifies an architecture where {x1 = 1, x2 = 0, x3 = 1} by setting x3 = 0. Of course, it is not guaranteed that the architecture will improve as a result of applying this modification, but by construction of Algorithm 1 it is statistically likely that the change will push the architecture closer to the “good region”. This is a similar argument to the one made in Holland's justification of genetic algorithms with the theory of schemata [20], except that here we use the information about building blocks more greedily than in GAs. In fact, to avoid ending up with heuristics that are too greedy (too much exploitation versus exploration), the heuristic was made stochastic, so that there is a large probability 1 − ε that the association rule is applied, and a small probability ε that the architecture is simply randomly mutated. Note the similarity of this approach to the known optimal policy for the dynamic multi-armed bandit problem [21], [22]. The information from the classification tree can also be useful in evolving these heuristics, since features that appear higher in the hierarchy are likely to be more important and should therefore be prioritized.
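A minimal sketch of such a heuristic (function and variable names are ours) follows; it applies a good feature with probability 1 − ε and otherwise falls back on a plain random mutation:

    import random

    def adaptive_heuristic(arch, good_feature, decisions, epsilon=0.1):
        """With probability 1 - epsilon, apply a 'good' driving feature to the
        architecture (e.g. F1 = {x1: 1, x2: 0, x3: 0}); with probability epsilon,
        perform a plain random mutation to preserve exploration."""
        child = dict(arch)
        if random.random() < epsilon:
            d = random.choice(decisions)
            child[d] = 1 - child[d]        # random bit-flip mutation (binary decisions)
        else:
            child.update(good_feature)     # force the feature's variable assignments
        return child

    # Example: pushes {x1:1, x2:0, x3:1} toward the good feature by setting x3 = 0.
    print(adaptive_heuristic({"x1": 1, "x2": 0, "x3": 1},
                             {"x1": 1, "x2": 0, "x3": 0}, ["x1", "x2", "x3"]))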
So far we have implied that a heuristic would only apply one change (i.e., add or remove one feature) at a time. This is what is known in heuristics as a small move. Since there is evidence that a combination of large and small moves leads to more effective search [23], in our actual adaptive heuristics the number of features applied by the heuristic to an architecture is also made random. More precisely, each feature has a certain probability p of being applied. This parameter can also be modified at run time to foster large changes at the beginning (exploration) and smaller changes at the end (exploitation), as done in simulated annealing and other metaheuristics [24]. Finally, we note that the heuristics are called adaptive because, while the structure of such heuristics can be coded off-line, their content (i.e., the exact features and variables) is adapted on-line using the results from the learning agent.

F. Iteration

We have already mentioned that the human has several mechanisms to incorporate expert knowledge into the framework. Essentially, this is an interactive process in which the main role of the user is to interact with the computer to select a set of driving features and good and bad regions. The users can redefine the good and bad regions, adapt the thresholds, and even define new features based on new, higher-order predicate functions, until they are satisfied with the results. In other words, the user is discovering what good and bad architectures are at the same time that the architecture space is being searched. This is different from the traditional approach, in which the definition of user preferences precedes the search process.

III. PRELIMINARY RESULTS

A. Validation with a toy example: buying a car

Validation of complex tradespace exploration tools is challenging due to the absence of a “truth”. Often, a simple toy example is used for validation purposes, since the user can easily understand the main trade-offs in the problem and check whether the results provided by the tool are reasonable. For this work, we chose a simple toy example based on designing a car by choosing from a set of available items such as ceramic brakes, a solar roof, and so forth. The design problem can thus be readily formulated as a simple 0/1 knapsack problem, where each item has some benefit and some cost, and the goal is to maximize the benefit for a certain cost. The items and their individual benefits (a.k.a. Want) and costs are shown in Table I. Instead of the usual single-objective formulation, we formulate and solve the bi-objective (benefit-cost) problem, thus generating a Pareto frontier of designs. For benchmarking purposes, the population of designs obtained after running a standard genetic algorithm (NSGA-II [5]) for 30 generations is shown in Fig. 1, together with the true Pareto front obtained by brute force. We see that the Pareto frontier found by the genetic algorithm is a good approximation of the true Pareto frontier.
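For concreteness, evaluating a design in this bi-objective knapsack formulation amounts to two sums. The sketch below uses a subset of the item data from Table I below:

    # A subset of the Table I items: (want, cost) per option.
    ITEMS = {
        "DualClutch":         (5, 2900),
        "AdaptiveSuspension": (5, 1000),
        "CeramicBrakes":      (3, 8150),
        "MoonRoof":           (-5, 0),
    }

    def evaluate(design):
        """Bi-objective evaluation of a 0/1 selection: total want vs. total cost."""
        want = sum(ITEMS[i][0] for i in design)
        cost = sum(ITEMS[i][1] for i in design)
        return want, cost

    print(evaluate({"DualClutch", "AdaptiveSuspension"}))  # (10, 3900)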
TABLE I
List of items with their Want and Cost. The benefit (Want) is scaled from −5 to 5.

Item                    Want   Cost ($)
SmartPhoneIntegration      1        500
19inWheels                 4       1200
DualClutch                 5       2900
AdaptiveSuspension         5       1000
CeramicBrakes              3       8150
DriverAssistantPkg         0       1900
ExecutivePkg               1       4300
LightingPkg               -2       1900
Side&TopViewCam            2        750
MoonRoof                  -5          0
ParkingAssistant           0        500
PwrRearSunShade           -2        575
Fig. 1. Tradespace of the car-buying example after 30 generations
1) Driving features: In this simple example, we define features as combinations of items/options that are chosen or not chosen; e.g., a feature can be choosing the item “ceramic brakes” and not choosing the item “moon roof”. If “ceramic brakes” corresponds to variable x1 and “moon roof” to variable x2, this feature would be x1 = 1 ∧ x2 = 0. Note that features in this example are the same as Holland's binary schemata, and there are 3^n features, whereas there are only 2^n different designs.

We look at features that appear often in the “good region”. An architecture is considered to be in the “good region” if its cost is less than $11,000 and its want is greater than 4.0 (cf. Fig. 1). The following driving features were identified by the agent: Dual Clutch chosen, Ceramic Brakes not chosen, Moon Roof not chosen, and the combination Ceramic Brakes and Moon Roof both not chosen. This makes sense, as these are the features with the highest benefit-to-cost ratios. The support threshold was set to 20% and the confidence threshold to 90% to obtain a reasonable number of driving features.

2) Classification tree: The classification tree resulting from those driving features is shown in Fig. 2. We measured its classification accuracy in the usual way. In this case, as the total number of architectures that can be generated is relatively small (4096), all the possible architectures were generated and then divided randomly into two sets, a training set and a testing set. This process was repeated 45 times (bootstrapping), and the average accuracy of the model was 90%.
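This accuracy estimate can be reproduced with a short repeated random-split routine. The sketch below follows the text's 45 repetitions; the 50/50 split size is our assumption:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    def average_accuracy(X, y, repetitions=45, test_fraction=0.5, seed=0):
        """Repeatedly split the full enumeration of architectures into random
        train/test sets, retrain the tree, and average the test accuracy."""
        rng = np.random.default_rng(seed)
        scores = []
        for _ in range(repetitions):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=test_fraction,
                random_state=int(rng.integers(1_000_000)))
            tree = DecisionTreeClassifier(max_depth=3).fit(X_tr, y_tr)
            scores.append(tree.score(X_te, y_te))
        return float(np.mean(scores))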
Fig. 2. Classification tree for the toy example:

  CeramicBrakes and MoonRoof absent? (weight = 2019)
  ├─ No:  Bad (weight = 1533)
  └─ Yes: DualClutch present? (weight = 486)
          ├─ Yes: Good (weight = 239)
          └─ No:  Bad (weight = 247)

TABLE II
Confusion matrix for the toy example classification tree

Classification by surrogate   Actual Good   Actual Not Good   Total
Good                                  197                91     288
Not Good                              103              1623    1726
Total                                 300              1714    2014
That is, the simple logic shown in Fig. 2, based on just two features, is enough to predict whether an architecture lies in the good region defined by the user 90% of the time. More detailed information is provided in the confusion matrix of Table II. Note that the goal of this classification tree is not to be used as an accurate classifier, but rather to provide useful information to the user as to what drives architectures towards the “good region”.

3) Adaptive heuristic: Adaptive heuristics were created based on the driving features identified. In particular, these heuristics remove the ceramic brakes and the moon roof from architectures, and/or add the dual clutch to architectures. The parameters setting the probability of random mutation and the probability of applying each feature were set to ε = 0.1 and p = 0.6, respectively. The performance of the adaptive heuristics was assessed by comparing the quality and diversity of the final population obtained with NSGA-II (crossover plus mutation) augmented with the adaptive heuristic against the baseline case of plain NSGA-II. As the tradespace of this toy example is very small, the results were not conclusive: both approaches performed similarly well, and the results are thus not shown graphically for the sake of brevity.

B. Application to a real-world complex system: climate monitoring constellation

While validation with a toy example is important, it is no less important to show that the method can be used in a real-life complex architecting problem. We chose for this work an example from the aerospace domain in which the authors have done previous work [25]. The problem is to design a constellation of satellites to provide operational observations of the Earth's climate. The open-ended problem statement was simplified into a mathematical formulation of a simple assignment problem between instruments and orbits: given a set of candidate instruments and orbits, an architecture is simply given by a Boolean matrix A, where A(i, j) = 1 if instrument i is assigned to orbit j, and A(i, j) = 0 otherwise.
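In code, this encoding is simply a Boolean matrix (a sketch with NumPy, using the instrument and orbit names of Table III; variable names are ours):

    import numpy as np

    INSTRUMENTS = ["ACE_ORCA", "ACE_POL", "ACE_LID", "CLAR_ERB", "ACE_CPR",
                   "DESD_SAR", "DESD_LID", "GACM_VIS", "GACM_SWIR", "HYSP_TIR",
                   "POSTEPS_IRS", "CNES_KaRIN"]
    ORBITS = ["LEO-600-polar-NA", "SSO-600-SSO-AM", "SSO-600-SSO-DD",
              "SSO-800-SSO-AM", "SSO-800-SSO-DD"]

    # A[i, j] = 1 iff instrument i is assigned to orbit j.
    A = np.zeros((len(INSTRUMENTS), len(ORBITS)), dtype=bool)
    A[INSTRUMENTS.index("HYSP_TIR"), ORBITS.index("SSO-600-SSO-AM")] = True

    print(A.sum())  # number of instrument-orbit assignments in this architecture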
The list of candidate instruments and orbits is shown in Table III.

TABLE III
List of candidate instruments and candidate orbits

Instruments: ACE ORCA, ACE POL, ACE LID, CLAR ERB, ACE CPR, DESD SAR, DESD LID, GACM VIS, GACM SWIR, HYSP TIR, POSTEPS IRS, CNES KaRIN

Orbits: LEO-600-polar-NA, SSO-600-SSO-AM, SSO-600-SSO-DD, SSO-800-SSO-AM, SSO-800-SSO-DD

Fig. 3. Tradespace of the climate monitoring constellation problem after 40 generations
For each architecture, a cost metric and a benefit metric were obtained using a complex simulation tool previously developed by the authors [26]. The tradespace after running the basic NSGA-II algorithm for 40 generations is shown in Fig. 3. In this case, the true Pareto front is not known and is therefore not shown.

1) Driving features: Our initial set of features was the set of simple assignments of instruments to orbits, namely PresentInOrbit(Ii, Oj) (instrument i is present in orbit j) and AbsentFromOrbit(Ii, Oj) (instrument i is absent from orbit j). After some iterations, we added the following set of candidate features that appeared to explain scores (see the sketch after this list for how they can be implemented):

• Present(Ii): instrument i is present in at least one of the orbits
• Absent(Ii): instrument i is absent from all the orbits
• Empty(Oi): orbit i is empty
• Together(Ii, Ij): instruments i and j are present together in at least one orbit
• TogetherInOrbit(Ii, Ij, Ok): instruments i and j are present together in orbit Ok
• Separate(Ii, Ij): instruments i and j are never present in the same orbit

Note that these features can be extended to higher-order features considering more than two instruments; these higher-order features were also considered.
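Under the Boolean-matrix encoding introduced above, each of these predicates is a one-liner, and new composite features (such as the TogetherButNotInOrbit feature discussed later) can be assembled from them. This is a sketch with function names of our own; a two-instrument variant of the composite feature is shown for brevity:

    import numpy as np

    def present(A, i):     return A[i, :].any()       # Present(Ii)
    def absent(A, i):      return not A[i, :].any()   # Absent(Ii)
    def empty(A, k):       return not A[:, k].any()   # Empty(Ok)
    def together(A, i, j): return (A[i, :] & A[j, :]).any()  # Together(Ii, Ij)
    def together_in_orbit(A, i, j, k):                       # TogetherInOrbit(Ii, Ij, Ok)
        return A[i, k] and A[j, k]
    def separate(A, i, j): return not together(A, i, j)      # Separate(Ii, Ij)

    # Composite, higher-order features can be built from the above, e.g.
    # TogetherButNotInOrbit: instruments together somewhere, but not in orbit k.
    def together_but_not_in_orbit(A, i, j, k):
        return together(A, i, j) and not together_in_orbit(A, i, j, k)

    A = np.zeros((12, 5), dtype=bool)
    A[0, 1] = A[1, 1] = True   # instruments 0 and 1 together in orbit 1
    print(together(A, 0, 1), together_but_not_in_orbit(A, 0, 1, 0))  # True True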
TABLE V
Confusion matrix of the classification tree in the climate monitoring constellation problem

Classification by surrogate   Actual Good   Actual Not Good   Total
Good                                   20               154     174
Not Good                               50              1776    1826
Total                                  70              1930    2000
In this case, the population used to identify driving features was a randomly generated set of 2000 architectures, and the thresholds for support and confidence were set to 60% and 99%, respectively. After a few iterations, we defined the good and bad regions as follows: an architecture is considered to be in the “good region” if its cost is less than $4,500M and its science metric is greater than 0.15, and in the “bad region” if its cost is greater than $4,500M and its science metric is less than 0.15. The driving features identified with these parameters are shown in Table IV.

We observe that there are more negative features (i.e., features that push the architecture toward the “bad region”) than positive features. In other words, there are many more ways of creating poor architectures than ways of creating good architectures. Most of these negative features are explained by suboptimal assignments of individual instruments to orbits. For example, instruments that use UV/VIS/NIR radiation, such as GACM VIS, do not work well in dawn-dusk sun-synchronous orbits (SSO-DD), where the local time on observed regions is either dawn or dusk, which limits the amount of light available. We observe the absence of driving features of order 2, whereas there are multiple negative features of order 3 of the form TogetherInOrbit({Ii, Ij, Ik}, Ol). On the positive side, we observe that, other than PresentInOrbit(HYSP TIR, SSO-600-SSO-AM), all features are more complex. In particular, the two EmptyOrbit() features related to the dawn-dusk orbits appear as positive features. This is a good example of an incomplete set of candidate features: while the preliminary results seem to imply that having both orbits empty would be a good thing, what is really happening is that there is substantial redundancy between the two orbits, and therefore the real good feature is EmptyOrbit(SSO-600-SSO-DD) ∨ EmptyOrbit(SSO-800-SSO-DD).

2) Classification tree: The set used to train the classification tree was a population of 1659 unique architectures obtained after several iterations of applying the optimization algorithm. The model was then tested on a set of 2000 randomly generated architectures (different from the set used to identify the features). The classification tree resulting from the first set of driving features is shown in Fig. 4 and gave a classification accuracy of 89.8%. More detailed results can be seen in the confusion matrix shown in Table V. These relatively poor results suggest to the user that there might be even higher-order features that better describe the problem.
Fig. 5. Pareto fronts for NSGA-II alone or augmented with adaptive heuristics after 25 generations
We have already given one example, namely that of the two EmptyOrbit() features that should be combined with a logical OR. Another example: we might suspect that what is driving performance is the fact that a set of instruments is together in any orbit except for a particular one. In that case, we may add a new feature structure such as

TogetherButNotInOrbit({Ii, Ij, Ik}, Ol) := Together({Ii, Ij, Ik}) ∧ ¬TogetherInOrbit({Ii, Ij, Ik}, Ol)

and run the algorithm again. This interactive process and the ability to create new high-order features with this language are really the key aspects of this research, and what ultimately leads to architectural insight.

3) Adaptive heuristic: The performance of the adaptive heuristic was compared to the performance of plain NSGA-II, as in the toy example. Both algorithms were run for 25 generations on a population of 100 architectures, and the process was repeated 5 times for each to improve the confidence in the results. The “cooling scheme” (i.e., the evolution of the probabilities of the adaptive heuristic) was set such that the adaptive heuristic was essentially shut down after 10 generations, and the standard NSGA-II operators took over for the remaining generations. As can be seen in Fig. 5, the final Pareto front obtained with the adaptive heuristic is significantly better than the one given by the plain genetic algorithm. This shows the promise of having an adequate way of feeding expert information back to the search algorithm.

IV. CONCLUSION

A. Summary, main findings, and limitations

This paper has presented the results of preliminary experiments with the incorporation of interactive learning agents into a multi-agent architecture optimization framework. The learning agents perform three main tasks: identification of driving architectural features, construction of a surrogate model (a binary classifier based on a decision tree), and generation of intelligent adaptive heuristics.
TABLE IV
Driving features identified for the climate monitoring constellation problem

Order 1
  Positive: PresentInOrbit(HYSP TIR, SSO-600-SSO-AM)
  Negative: PresentInOrbit(CLAR ERB, LEO-600-polar-NA); PresentInOrbit(ACE LID, SSO-600-SSO-AM); PresentInOrbit(CLAR ERB, SSO-600-SSO-AM); PresentInOrbit(CLAR ERB, SSO-600-SSO-DD); PresentInOrbit(GACM VIS, SSO-600-SSO-DD); PresentInOrbit(GACM SWIR, SSO-600-SSO-DD); PresentInOrbit(HYSP TIR VIS, SSO-600-SSO-DD); PresentInOrbit(ACE LID, SSO-800-SSO-DD); PresentInOrbit(CLAR ERB, SSO-800-SSO-PM)

Order 3
  Negative: TogetherInOrbit({DESD LID, ACE LID, ACE ORCA}, LEO-600-polar-NA); TogetherInOrbit({POSTEPS IRS, ACE LID, ACE ORCA}, LEO-600-polar-NA); TogetherInOrbit({HYSP TIR, CLAR ERB, ACE POL}, SSO-600-SSO-AM); TogetherInOrbit({POSTEPS IRS, GACM SWIR, ACE POL}, SSO-600-SSO-AM); TogetherInOrbit({CNES KaRIN, DESD LID, CLAR ERB}, SSO-800-SSO-DD)

Order 5
  Positive: Absent(ACE LID); Absent(CLAR ERB); Absent(ACE CPR); Absent(DESD SAR)
  Negative: Absent(CNES KaRIN)

Order 12
  Positive: Empty(SSO-600-SSO-DD); Empty(SSO-800-SSO-DD)
Fig. 4. Classification tree for the climate monitoring constellation problem:

  SSO-600-SSO-DD empty? (weight = 1659)
  ├─ No:  Bad (weight = 979)
  └─ Yes: ACE_LID absent? (weight = 680)
          ├─ No:  Bad (weight = 23)
          └─ Yes: CLAR_ERB absent? (weight = 657)
                  ├─ Yes: Good (85%) (weight = 563)
                  └─ No:  CLAR_ERB in LEO-600-polar-NA? (weight = 94)
                          ├─ No:  Bad (weight = 9)
                          └─ Yes: SSO-800-SSO-DD empty? (weight = 85)
                                  ├─ No:  Good (weight = 65)
                                  └─ Yes: Bad (weight = 20)
The goal of these tasks is to provide architectural insight to the user in an interactive process in which the user simultaneously discovers their preferences and the main architectural trade-offs.

The paper has shown the promise of some of the key aspects of the methodology, including an adaptation of the apriori algorithm, the choice of classification trees as human-understandable surrogate models, and the development of an effective strategy to incorporate expert knowledge into the search process by means of adaptive heuristics that act in the early stages of the search. Of course, this approach is by construction limited by the user's capacity to find a good set of architectural features, which depends on their expertise and prior knowledge about the problem. It is also important to note that these preliminary experiments are based on a small number of trials that do not provide statistically significant results.

B. Future work

We plan to continue this work and investigate several aspects of this line of research. A first step is to confirm these results with a larger number of trials and conditions, such as different values of the support and confidence thresholds for the driving feature identification, or of ε and p for the adaptive heuristic. The adequacy of our choice of classification trees can also be confirmed empirically. A natural extension of the framework is to incorporate a knowledge-based system (e.g., based on first-order logic) to support the user in the discovery of high-level features. For example, an inference algorithm such as those used in theorem-proving systems should be able to generalize the two EmptyOrbit() features using a logical OR. Finally, we will place emphasis on studying ways of fostering effective communication between the human and computational agents in the framework, such as explanation and mutual learning.

ACKNOWLEDGMENT

The authors would like to thank the College of Engineering at Cornell University and the Interdisciplinary Center (CFIS) of the Polytechnic University of Catalonia for partially supporting this research.

REFERENCES

[1] A. M. Ross, D. E. Hastings, J. M. Warmkessel, and N. P. Diller, “Multi-attribute Tradespace Exploration as Front End for Effective Space System Design,” Journal of Spacecraft and Rockets, vol. 41, no. 1, pp. 20–28, 2004.
[2] O. de Weck, M. Chaize, and R. de Neufville, “Staged Deployment of Communications Satellite Constellations in Low Earth Orbit,” Journal of Aerospace Computing, Information, and Communication, vol. 1, no. 3, pp. 119–136, Mar. 2004.
[3] D. Selva, “Experiments in Knowledge-intensive System Architecture: Interactive Architecture Optimization,” in 2014 IEEE Aerospace Conference, Big Sky, Montana: IEEE, 2014.
[4] A. M. Ross, D. E. Hastings, J. M. Warmkessel, and N. P. Diller, “Multi-attribute tradespace exploration as front end for effective space system design,” Journal of Environmental Management, vol. 41, no. 1, pp. 43–55, 2004.
[5] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, Apr. 2002.
[6] F. Glover and G. A. Kochenberger, Handbook of Metaheuristics. New York, NY: Kluwer Academic Publishers, 2003.
[7] A. Ligtenberg, M. Wachowicz, A. K. Bregt, A. Beulens, and D. L. Kettenis, “A design and application of a multi-agent system for simulation of multi-actor spatial planning,” Journal of Spacecraft and Rockets, vol. 72, no. 1-2, pp. 20–28, 2004.
[8] F. Peng, K. Tang, G. Chen, and X. Yao, “Population-based algorithm portfolios for numerical optimization,” IEEE Transactions on Evolutionary Computation, vol. 14, pp. 782–800, 2010.
[9] L. Landry and J. Cagan, “Protocol-based multi-agent systems: the effect of diversity, dynamism, and cooperation in heuristic optimization approaches,” ASME Journal of Mechanical Design, vol. 133, no. 2, pp. 021001-1–11, 2011.
[10] E. K. Burke, M. Gendreau, M. Hyde, G. Kendall, G. Ochoa, E. Özcan, and R. Qu, “Hyper-heuristics: a survey of the state of the art,” Journal of the Operational Research Society, vol. 64, no. 12, pp. 1695–1724, 2013.
[11] D. Ouelhadj and S. Petrovic, “A cooperative hyper-heuristic search framework,” Journal of Heuristics, vol. 16, pp. 835–857, 2010.
[12] P. F. Egan, J. Cagan, C. Schunn, and P. R. Leduc, “Cognitive-based search strategies for complex bio-nanotechnology design derived through symbiotic human and agent-based approaches,” in Proc. ASME 2014 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, Buffalo, NY, Aug. 2014, pp. 34714-1–10.
[13] R. Parasuraman, T. B. Sheridan, and C. D. Wickens, “A model for types and levels of human interaction with automation,” IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, vol. 30, no. 3, pp. 286–297, 2000.
[14] S. Stumpf, V. Rajaram, L. Li, W.-K. Wong, M. Burnett, T. Dietterich, E. Sullivan, and J. Herlocker, “Interacting meaningfully with machine learning systems: Three experiments,” International Journal of Human-Computer Studies, vol. 67, no. 3, pp. 639–662, 2009.
[15] O. Gutknecht and J. Ferber, “MadKit: a generic multi-agent platform,” in Proc. of the Fourth International Conference on Autonomous Agents, New York, NY, Jun. 2000, pp. 78–79.
[16] N. Hitomi and D. Selva, “Experiments with human integration in asynchronous and sequential multi-agent frameworks for architecture optimization,” in 2015 Conference on Systems Engineering Research, Hoboken, NJ, Apr. 2015.
[17] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY, USA: Springer, 2003.
[18] J. R. Quinlan, C4.5: Programs for Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers, 1993.
[19] A. Newell and H. A. Simon, Human Problem Solving. Englewood Cliffs, NJ: Prentice Hall, 1972.
[20] J. H. Holland, Adaptation in Natural and Artificial Systems, 2nd ed. Cambridge, MA: MIT Press, 1992.
[21] J. Gittins, K. Glazebrook, and R. Weber, Multi-Armed Bandit Allocation Indices, 2nd ed., 2011.
[22] K. H. Schlag, “Why Imitate, and If So, How? A Boundedly Rational Approach to Multi-armed Bandits,” Journal of Economic Theory, vol. 78, pp. 130–156, 1998. [Online]. Available: http://linkinghub.elsevier.com/retrieve/pii/S0022053197923474
[23] J. Schneider and S. Kirkpatrick, “Ruin & Recreate,” in Stochastic Optimization, 2006, pp. 73–77. [Online]. Available: http://link.springer.com/content/pdf/10.1007/978-3-540-34560-2_11.pdf
[24] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, 1983.
[25] D. Selva, “Knowledge-intensive global optimization of Earth observing system architectures: a climate-centric case study,” in Proc. SPIE, vol. 9241, 2014, paper 92411S. [Online]. Available: http://dx.doi.org/10.1117/12.2067558
[26] D. Selva, B. G. Cameron, and E. F. Crawley, “Rule-based System Architecting of Earth Observing Systems: The Earth Science Decadal Survey,” Journal of Spacecraft and Rockets, vol. 51, no. 5, pp. 1505–1521, 2014.