A Multiagent Framework for Component-Level Creativity Evaluation

Carrie Rebhuhn, Brady Gilchrist, Sarah Oman, Irem Tumer, Rob Stone, and Kagan Tumer
Oregon State University
Abstract. Multiagent-based simulation has been successfully used to understand and predict organizational and social behavior. In this work, we apply this concept to product design and use a multiagent framework to study the interactions of functions and components in a product that lead to novel and creative solutions. Though there is a clear-cut correlation between profitability and creativity in a product, the steps that lead to designing a "creative" product are elusive. To address this issue, we use recent advances in multiagent credit assignment to propagate a product's creativity back to its components, and determine which component (or set of components) contributes the most to a product's novelty. To validate our approach, we use the Design Repository, which contains thousands of products that have been decomposed into functions and components. We demonstrate the applicability and benefits of the multiagent-based approach by (i) assessing and propagating the creativity of a set of products analyzed by Popular Science, the IDSA IDEA Award, and Time Magazine's 50 Best Innovations, and (ii) directly assessing the creativity of products in the Design Repository through surveys and propagating these scores back to the components.
1 Introduction
Market trends in product sales point to innovation as the key to successful products. Innovation itself is an abstract concept that can take many forms, but what is it that makes these innovative new products actually innovative? In this work, we set out to determine what aspects of products make them innovative, and to record that information for reuse in later designs. To further understand creativity, and what makes things creative, we looked at products that had been identified by popular media such as Time Magazine, the IDSA IDEA Award, and Popular Science. The products in these lists were selected based on expert opinion and several other indicators; Popular Science, for example, uses indicators such as significance of the design and originality [1]. The Design Repository, maintained and housed at Oregon State University, serves as a way to record and reuse information on product innovativeness. The Design Repository offers engineers a way to store and reuse design information when making design decisions, specifically by suggesting components which may be used to fulfill the functional requirements of the product being designed.
Previous work has focused on propagating creativity down to the component level through comparative analysis of standard and exceptionally creative products [2]. In that work, an innovation parameter was calculated based on the rareness of a component in a population of designs and was weighted according to function in order to calculate creativity.

Multiagent-based simulation has been successfully used in many domains to study organizational and social behavior [3-6]. By simulating the interaction among agents, this approach allows us to capture inherent properties of large systems and provides insight into their operation. In this work, we extend this approach to the process that takes place during product design, where multiple functions need to be satisfied by multiple components to lead to a functioning and appealing design. We view this process as a multiagent coordination problem in which the agents are the functions and take actions to select components in the design.

The key contribution of this work is in determining how, in such a multiagent system, the full product's novelty score can be propagated down to each component. This is the same mathematical formulation as the multiagent credit assignment problem, in which each agent receives a reward based on its contribution to the performance of the full system. Because in many domains it is difficult to assess the impact of a particular agent on the system's performance, this is an active research topic. In this paper we concentrate on difference rewards, which assess the impact of an agent based on how the system would perform in the absence of that agent.

The rest of the paper is structured as follows: in Section 2 we provide an overview of the Design Repository, previous techniques used to calculate component creativity, and difference rewards. In Section 3 we frame design as a multiagent system, suggest a reward structure for use in this system, and propose two methods of creativity propagation to the component level. In Section 4 we show results from testing our algorithm on the professionally evaluated dataset as well as the dataset created by our own survey, and we give an example of a creative redesign based on our creativity propagation. Finally, in Section 5 we discuss future work on creativity assessment using advanced statistical techniques.
2 Background
This work draws on principles from design engineering as well as multiagent learning. On the design side, Section 2.1 explains the general structure and benefits of the Design Repository at Oregon State University, and Section 2.2 introduces novelty, a metric developed by Shah et al. [7] for quantifying the uniqueness of a design. On the multiagent side, Section 2.3 identifies difference rewards as a learning-based method for creativity propagation.

2.1 Storing Innovation Information
The Design Repository at Oregon State University contains a wealth of information about modern products [8]. It currently contains over 180 products with
more than 6000 artifacts. The purpose of the repository is to store previous design information. Products are disassembled into their individual components, and information about each component is recorded, such as function, mass, dimensions, materials, manufacturing processes, and failure modes. Functional models are created for every product, identifying the way that materials, signals, and energy change as they flow through the product. Functions are assigned to components using the functional basis [9], which provides a common language for functionality that is universally understood.

The output from the repository is a morphological matrix, that is, a matrix of component solutions to given functions, containing all possible solutions for a given function (e.g., "export") and flow (e.g., "electricity"). This matrix can be used by the designer to select components and assemble them into a complete concept. In addition to the components themselves, the repository also provides the frequency with which each component solves the given function.

What is missing from the repository is a way to capture the creativity of products and their components. With the inclusion of creativity information, the creativity of concepts can be determined earlier in the design process. In order to promote creativity-oriented designs, creativity at the product level must be able to be propagated down to the component and, ultimately, functional levels.

2.2 Novelty Calculations and Creativity
The attempt to ascribe metrics to creativity is not a new undertaking, and it is a source of recent research in the design community [2]. Quality, quantity, novelty, and variety were identified by Shah et al. [7] as metrics for evaluating a set of concepts. In particular, the novelty metric has been implemented in recent work by Oman et al. [2] along with a weighting system in order to identify the creativity score of particular functions. The novelty metric we use in this paper is similar to the one used by Oman et al., and is given by the equation:

S_{Nj} = \frac{T_j - R_j}{T_j}    (1)
where S_{Nj} represents the novelty, T_j is the number of designs in the set being examined, and R_j is the number of times the component solution for function j is seen in the set. This novelty metric has traditionally been applied only over sets of products with a similar type or goal; in our work, however, we apply it over a set of products of varying types and goals. Previous work has focused on comparing the functions of 'common' products to those of 'innovative' products [2]. In this way, functions which were unique to the innovative products could be identified, and their frequency of appearance within the design set could be used to characterize their general impact on the creativity of products. In this work, we take this comparison idea and draw on the reinforcement learning concept of a "difference reward", which has been shown in many domains to effectively characterize the impact that an agent has on a multiagent system [10].
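As a concrete illustration of Equation 1, the following is a minimal Python sketch; the function and variable names are our own and not part of the repository schema.

```python
def novelty(T_j, R_j):
    """Novelty S_Nj of a component solution for function j (Equation 1).

    T_j: number of designs in the set being examined.
    R_j: number of times this component solution for function j appears in the set.
    Returns 0 when every design uses this solution, approaching 1 as it becomes rare.
    """
    return (T_j - R_j) / T_j

# Example: a component solution that appears in 3 of the 30 designs in a set.
print(novelty(30, 3))  # 0.9
```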
2.3 Difference Rewards
Difference rewards have been used in a wide variety of large multiagent systems, including air traffic control, rover coordination, and network routing [11, 10, 12], to promote coordination among agents. This reward works by comparing the evaluation of the system to that of a system in which the agent is absent, or is replaced by some counterfactual, i.e., an alternative action the agent could have taken [10]. Mathematically this is expressed as:

D_i(z) = G(z) - G(z_{-i} + c)    (2)
where G(z) is the full-system reward and G(z_{-i} + c) is the reward for a hypothetical system in which agent i was removed and its impact was replaced by a counterfactual c. In this work we use difference rewards to compare products containing particular creative components to products in which those creative components are replaced by a 'standard' component. This is discussed in more detail in Section 3.1.
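For concreteness, here is a minimal Python sketch of this comparison; representing a design as a list of components and scoring it by summing novelty values are our illustrative choices, not the implementation used in this work.

```python
def difference_reward(G, z, i, c):
    """D_i(z) = G(z) - G(z with agent i's choice replaced by counterfactual c)."""
    z_counterfactual = list(z)
    z_counterfactual[i] = c  # swap agent i's component for the counterfactual
    return G(z) - G(z_counterfactual)

# Toy evaluation: score a design by summing per-component novelty values.
novelty_of = {"power cord": 0.1, "battery": 0.4, "solar panel": 0.9}
G = lambda design: sum(novelty_of[comp] for comp in design)

design = ["solar panel", "battery"]
print(difference_reward(G, design, 0, "power cord"))  # approximately 0.8
```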
3 Product Design as an Agent-based Game
To create a product, a designer must compose a set of components to satisfy a set of engineering constraints set by the customer. Functional requirements must be fulfilled by the components selected. Essentially, design may be broken down at a rudimentary level to function satisfaction through the selection of components: a game in which components must be selected to fulfill the properties desired by the engineering requirements of the design. The 'game' of product design is thus structured according to three primary factors: products, functions, and components.

1. Products: The 'goal' of a designer is to create a product. Products feature a set of functions which they must perform, and a set of components which have been selected to satisfy these functions. We also assume that products have an associated creativity score.
2. Functions: The purpose of a product is to perform a set of functions. This defines the requirements for the action selection of the designer.
3. Components: The components are the 'actions' which a designer has available to fulfill the functional requirements of the end product.

In a step toward an autonomous design strategy, we assign our agents the task of function satisfaction, or "component selection". An agent is assigned to select a component to fill a function (e.g., to import electricity) and has an action space which includes all components which could possibly accomplish this task (e.g., a battery or a power cord). We used two datasets of real designs from the Design Repository to train our agents, using the function-component matrices of these designs to essentially force an agent to take an action while designing a product. We then evaluated the designs using the product-level rewards given by these datasets,
and rewarded the actions using two separate methods, which are explained in more detail in Section 3.2. Agents then performed Q-learning with a learning rate of 0.5 using this reward. Q-values were initialized to the novelty score, as we assume this is a good first-pass approximation of a creativity score for the component.
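A minimal sketch of this learning step, treating each component selection as a stateless (bandit-style) Q-update toward the observed reward; the function-component pairs and novelty values below are placeholders.

```python
ALPHA = 0.5  # learning rate used in this work

# Q-values keyed by (function, component), bootstrapped with novelty scores (Equation 1).
novelty_score = {
    ("import electricity", "power cord"): 0.1,  # illustrative values only
    ("import electricity", "battery"): 0.4,
}
Q = dict(novelty_score)

def q_update(pair, reward):
    """Move the pair's Q-value toward the observed (difference) reward."""
    Q[pair] += ALPHA * (reward - Q[pair])

q_update(("import electricity", "battery"), reward=0.8)
print(Q[("import electricity", "battery")])  # 0.4 + 0.5 * (0.8 - 0.4) = 0.6
```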
3.1 Reward Structures in the Design Game
To use the difference reward decomposition given by Equation 2, we need an equation for the global score of a device which we can decompose. In Section 3.2 we assume that we begin with a product-level creativity evaluation G(d), which is given by our experimental dataset (explained further in Section 3.3), and formulate a decomposition of G(d). We then use this formulation to derive a difference objective, and present two methods of applying difference rewards to our dataset. In Section 3.3 we present the formulation of the two datasets which we use to train our agents.

3.2 Product-level Creativity Score Decomposition and Difference Reward Derivation
Novelty scores, though they do not yet have a direct mathematical correlation with creativity scores, have been used in previous work along with a hand-tuned weighting in order to assess the creativity of different components. From this, we postulate that G(d) = f(S_N), where G(d) is the total creativity score of a device and f(S_N) is some function of the novelties of all of the components within the device. As a starting point, we assume that the creative impact of a component is proportional to its novelty score. This allows us to decompose the product-level creativity into component impacts I_i:

G(d) = \sum_{i=1}^{n} \frac{S_{Ni}}{S_{N_{all}}} \times G(d) = \sum_{i=1}^{n} I_i    (3)
where d is a design with n different functions, I_i is the impact of component i on the global score of the design, G(d) is the full score for that design (recall that this is given by our data), S_{N_{all}} is the sum of the novelties of all components, and S_{Ni} is the novelty score for component i. Using this decomposition we can derive a difference reward which represents what would happen if one component were taken out and replaced by another, using these impacts I_i. The difference reward for this system can be derived by using the G(z) formulation of Equation 3 in Equation 2, with the replacement component represented by the counterfactual term c. Making this substitution and simplifying yields the difference reward:

D_i(d) = \frac{S_{Ni} - S_{Nc}}{S_{N_{all}}} \times G(d)    (4)
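Spelling out the substitution step: holding G(d) and S_{N_{all}} fixed when component i's novelty S_{Ni} is exchanged for the counterfactual novelty S_{Nc} (an assumption implicit in the decomposition above), the algebra is

D_i(d) = G(d) - G(d_{-i} + c)
       = \sum_{j=1}^{n} \frac{S_{Nj}}{S_{N_{all}}} G(d) - \left( \sum_{j \neq i} \frac{S_{Nj}}{S_{N_{all}}} G(d) + \frac{S_{Nc}}{S_{N_{all}}} G(d) \right)
       = \frac{S_{Ni} - S_{Nc}}{S_{N_{all}}} G(d)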
where S_{Nc} refers to the novelty score of the component which has hypothetically replaced the original component in the design. In difference rewards, the counterfactual term c can be defined in a variety of ways; in our domain, we employ two counterfactual reward formulations. The first counterfactual uses the most common component solution (i.e., the one with the lowest novelty score), and we refer to the difference reward formulated using this method as C_i^{pair}(d). This reward is applied to all component solutions in the design, and is given by the equation:

C_i^{pair}(d) = \frac{S_{Ni} - S_{N_{least}}}{S_{N_{all}}} \times G(d)    (5)
where S_{N_{least}} refers to the novelty of the most commonly used component which fills the agent's function. The second counterfactual uses all other component solutions as the comparison. This means that the set reward C_i^{set}(d) is recalculated k - 1 times, where k is the number of different components in the agent's action space, and the agent performs a Q-update with each of the k - 1 difference rewards in turn. The equation for this creativity evaluation is therefore:

C_i^{set}(d) = \frac{S_{Ni} - S_{N_{alt}}}{S_{N_{all}}} \times G(d)    (6)
where S_{N_{alt}} refers to the novelty of an alternative component seen in the set of products; this is applied over all alternatives. The Q-table values resulting from these calculations represent the creativity estimates of the function-component pairs found in the set. These values are the agent's estimate of the creative impact of the function-component pair on the product-level design score, and are used to present our findings on the two datasets in Section 4.
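The following Python sketch shows how the two rewards could be computed for a single agent's selection, assuming the agent's action space is available as a dictionary mapping candidate components to their novelty scores; the data layout and numbers are our own illustration.

```python
def c_pair(chosen, action_novelty, novelty_sum, G_d):
    """C^pair (Equation 5): compare the chosen component against the most common
    (lowest-novelty) component that could satisfy the same function."""
    s_least = min(action_novelty.values())
    return (action_novelty[chosen] - s_least) / novelty_sum * G_d

def c_set(chosen, action_novelty, novelty_sum, G_d):
    """C^set (Equation 6): one reward per alternative component; the agent then
    performs k - 1 Q-updates, one with each returned value."""
    s_i = action_novelty[chosen]
    return [(s_i - s_alt) / novelty_sum * G_d
            for alt, s_alt in action_novelty.items() if alt != chosen]

# Illustrative action space for the function "import electricity".
action_novelty = {"power cord": 0.1, "battery": 0.4, "solar panel": 0.9}
novelty_sum = 1.5  # S_N_all: sum of the novelties of all components in the design
G_d = 1.0          # product-level creativity score from the dataset

print(c_pair("battery", action_novelty, novelty_sum, G_d))  # (0.4 - 0.1) / 1.5 = 0.2
print(c_set("battery", action_novelty, novelty_sum, G_d))   # one value per alternative
```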
3.3 Training Data
As product-level creativity metrics are a subject of ongoing research, we used two datasets to train the agents. The first featured a binary evaluation based on a set of 30 designs in the repository which were identified by Popular Science, the IDSA IDEA Award, and Time Magazine's 50 Best Innovations as particularly creative (G(d) = 1), together with a set of designs identified to have low creativity (G(d) = 0) as a comparison. As a binary global score tends to be somewhat more difficult to learn, we also performed a user study which polled 10 people on the creativity (scored 1-5) of 50 designs from the repository, with 1 being the score of a 'common' device and 5 being the score of an 'extremely creative' device. The responses were averaged and then scaled to lie between 0 and 1. The individuals in the study were college-aged with various majors including history, computer science, mechanical engineering, design and human environment, and
electrical engineering. Though the sample size is small, we believe that the results derived from this data give a good preliminary representation of the trends we can expect to find using our creativity decomposition.
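As an illustration of how the survey responses could be turned into a product-level score, a small sketch assuming a simple linear rescaling of the 1-5 average (the exact rescaling is not specified here, so this is an assumption):

```python
def survey_score(ratings):
    """Average the 1-5 ratings for a design and rescale so that 1 -> 0.0 and 5 -> 1.0."""
    mean = sum(ratings) / len(ratings)
    return (mean - 1) / 4

# Ten hypothetical responses for one design.
print(survey_score([3, 4, 2, 5, 3, 3, 4, 2, 3, 4]))  # 0.575
```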
4 Results

4.1 Function-Component Pair Creativity Computation
The results are divided according to the training data used to obtain them. One set of results was produced using data from the Popular Science products and their comparisons (Figure 1), while another set was produced using data from the survey of 50 products in the repository (Figure 2). The x-axis in these results represents unique function-component pairs, and the y-axis corresponds to their score as computed by four different evaluations. The results are sorted by the "AvgScore" values of the function-component pairs, calculated as a simple average of the product-level scores of all products in which the function-component pair is found. The "Novelty" score (as calculated by Equation 1) is shown as well. The "Cpair" score represents the Q-value of the agent for that particular function-component pair, adjusted using the difference reward with a counterfactual of the most common component solution. The "Cset" score represents the Q-value adjusted using the difference reward with a counterfactual of all other component solutions seen in the set. As the results are sorted by "AvgScore" along the x-axis, an increase along this axis roughly corresponds to an increase in product-level creativity. For the following results we assume that the x-axis is a good representation of how creative an entire product is, and make our comparisons accordingly.

4.1.1 Results Found by Popular Science

The results found using the dataset of Popular Science, the IDSA IDEA Award, and Time Magazine's 50 Best Innovations showed significant variance, quite possibly because there were only 30 components evaluated and the learning signal was binary. The calculated novelties for these components were all roughly similar, although we see points of lower novelty in the mid-range creativities. This dip indicates that products in the middle range of creativity can use much more common components than those in the higher or lower ranges. In Figure 1 we also see a disconnect between the average scores ("AvgScore") of the products due to the binary nature of the scoring. However, the most striking trend in this graph can be seen in the two difference-based creativity evaluations: "Cpair", while returning higher creativity estimates, shares with "Cset" the trend of appearing to 'fan out'. That is, in the lower ranges of creativity, the algorithms return a much more consistent evaluation of a component's impact on the creativity of the system; as creativity increases, however, components are identified as being either more beneficial or more detrimental to the creativity score of the entire product.
Fig. 1. Results from the Popular Science identification of creativity, sorted by average product-level creativity (AvgScore). The creativity evaluations show scattering after the point at which a medium level of average design-level creativity is reached. Additionally, there is a drop in novelty corresponding to the medium ranges of creativity.
4.1.2 Results Found by a User Study

The results found using the dataset populated by the survey responses showed significantly less variance than the results found on the dataset in Figure 1. Many of the trends observed in Figure 1 were also observed in Figure 2: the same dip in novelty scores in the mid-range of creativity, as well as a general closeness of creativity values at lower design-level creativity and a larger spread at higher design-level creativity. The trend is somewhat less pronounced for "Cpair" here, likely due to the nature of the dataset. However, "Cset" shows a clear 'fanning out' trend, which indicates that at higher levels of creativity the difference reward is able to identify the components which add to and detract from the product-level creativity.

4.1.3 Summary and Implications of Results

Both sets of results show a dip in novelty scores as well as a spreading out of the creativity evaluations as creativity increases. As the products in each dataset are different, the fact that the graph shapes are consistent across both analyses suggests that the two difference rewards are consistent in what they discover in the data. We can draw three main conclusions from our analysis of the novelty, "Cpair", and "Cset" values.
Fig. 2. Results from the user study identification of creativity, sorted by average product-level creativity (AvgScore). The creativity evaluations show gradual scattering as the average product-level creativity increases. There is a drop in novelty corresponding to the medium ranges of creativity.
First, novelty does relate to the creativity score of the device; second, the creativity of a device in the higher ranges tends to depend on single components with high novelty; and third, the assessed creative impact of a component depends on the technique and dataset used.

Our first conclusion, that the novelty of the components within a device correlates with creativity, validates our approach of bootstrapping the Q-table with novelty scores. However, the dip in novelty scores we discovered also uncovers a major shortcoming in our attempt to identify creative components: more common components may still receive a mid-level creativity score, but this depends on the configuration of components rather than on the components themselves. This accounts for solutions in which common components are assembled in a creative configuration. However, we also demonstrate that at very high levels of product-level creativity, more novel components are seen more frequently. We also see novel components at the lower levels of creativity; this is most likely due to devices which perform only one task uniquely. For example, a microwave heats things in a way not found in any other product, but it performs only one task and therefore is not particularly creative.
Our second conclusion best highlights the intent of developing our difference methods for creativity identification: in creative products, there tend to be components which add more to the creativity of the device than other components. This comes with the parallel observation that, at higher product-level creativity, some components actually detract severely from the creativity of a product. This invites the use of this method for design improvement toward higher levels of creativity, as we can identify which components help and hinder the creativity of a product and modify it accordingly.

Our final conclusion is that the evaluation of creativity is sensitive to the dataset to which it is applied. Comparing Figures 1 and 2, we see that in Figure 1 "Cpair" shows the same pattern as "Cset" of tightness at lower levels of creativity and scattering at higher levels, but in Figure 2 "Cpair" appears to have much more scattering throughout all levels of creativity. This is due to the nature of the datasets on which "Cpair" was applied: the first featured a set of products which were clearly creative or clearly not creative, with little gradient in this metric, while the second featured a range of creativities. As "Cpair" only compares against the lowest novelty score in the solution set, it was better tailored to identify creativity in the set which had either very low or very high creativity scores. "Cset" gave similar evaluations of creativity independent of the dataset, as it performed comparisons throughout the entire set, regardless of whether the novelty was higher or lower.

4.2 Generating New Designs Based on Function-Component Pair Creativity vs. Novelty
Though we can demonstrate consistency and patterns in the data using our techniques, an example in an actual design provides a more intuitive look at what the different techniques actually offer. For this reason, we present a redesign of the design which had the lowest creativity ranking according to our survey: the dustbuster. We selected the five functions in the dustbuster with the lowest "AvgScore" ratings and calculated suggested replacements for the components according to the highest-ranked component solutions discovered by our different techniques. The results are shown in Table 1. Assessing the creativity of the different replacements for the design is difficult to do objectively. Additionally, as practicality is separated from creativity, not all suggestions are necessarily optimal or even possible to implement. However, all suggestions offer a designer a different perspective on how to modify a vacuum. The novelty evaluation suggests using a hydraulic pump, which is rarely if ever seen in vacuums and may offer an interesting perspective on how to creatively channel material sucked into the vacuum. The "Cpair" evaluation suggests that a speaker replace the electric switch used to turn on the vacuum, which indicates a redesign featuring a voice-activated vacuum cleaner. The "Cset" evaluation suggests that a displacement gauge should replace the on-switch, orienting the redesign around a vacuum which automatically detects metal waste.
Ultimately, the usefulness of the design suggestions is still in the hands of the engineer. However, the suggestions based on "Cpair" and "Cset" appear to offer more interesting solutions overall to the given functions. "Cpair" and "Cset" both allow for the propagation of an external evaluation of creativity down to the component level, whereas novelty scores only take into account the frequency of a function-component solution within the dataset.

Function                 | Repository      | Novelty        | Cpair   | Cset
convert energy to signal | electric switch | light source   | speaker | displacement gauge
control magnitude energy | electric cord   | washer         | stop    | spring
channel energy           | electric cord   | abrasive       | coupler | cam
channel material         | guiders         | hydraulic pump | cushion | cushion
convert energy           | electric motor  | cover          | needle  | spring

Table 1. Redesign suggestions for the five lowest-scoring function-component solutions in the dustbuster as estimated by "AvgScore". Items under 'Repository' are the original components for the respective function, while items under 'Novelty', 'Cpair', and 'Cset' are suggestions to satisfy this function based on the novelty scores, the "Cpair" evaluation, and the "Cset" evaluation, respectively. Ties between components are broken by "AvgScore" values.
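The redesign procedure itself can be sketched as a ranking over the learned scores; the candidate lists and score values below are hypothetical stand-ins for the repository output, and the tie-breaking by "AvgScore" used in Table 1 is omitted for brevity.

```python
def suggest_replacement(function, current, candidates, score):
    """Propose the highest-scoring alternative component for a function.

    function:   the function being redesigned (e.g., "convert energy to signal").
    current:    the component currently used in the design.
    candidates: components known to satisfy this function.
    score:      dict mapping (function, component) to an evaluation value
                (novelty, "Cpair" Q-value, or "Cset" Q-value).
    """
    alternatives = [c for c in candidates if c != current]
    return max(alternatives, key=lambda c: score[(function, c)])

# Hypothetical "Cpair" Q-values for the dustbuster's "convert energy to signal" function.
cpair_q = {("convert energy to signal", "speaker"): 0.7,
           ("convert energy to signal", "light source"): 0.5,
           ("convert energy to signal", "electric switch"): 0.1}
print(suggest_replacement("convert energy to signal", "electric switch",
                          ["electric switch", "speaker", "light source"], cpair_q))  # speaker
```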
5 Future Work
This work suggests that trends in component-level creativity can indeed be found using a decomposition of the product-level creativity score, and offers motivation for more extensive testing and validation. The next step in this application is to develop an automated and objective method of assessing full-design creativity. Advanced statistical methods such as latent variable models [13] provide a promising direction for this full-design assessment, as they quantify properties that are inherently difficult to measure. A latent variable model is in development to take advantage of product attributes to give a measure of innovativeness. Using this model will give a range of creativity scores, indicating the level of creativity of a product. A latent variable model would be capable of compensating for current versus past perceptions of creativity, and would also take into account perceptions of innovation by different demographics.

The eventual goal of this work is to use a multiagent system to refine the automated design suggestions offered by the Design Repository. A latent-variable design score, combined with the multiagent framework outlined in this work, will allow for an automated method of refining the creativity scores of designs, and thereby will allow the repository to offer suggestions which promote the design of more creative products. A better formulation of difference rewards in this domain may also make use of these latent variables. However, the work presented here is a good first step toward an automated method of propagating creativity down to the component level, and provides a framework which can be extended to include latent variables or other product-level creativity evaluations.
Acknowledgments

The authors would like to thank the National Science Foundation for their support under Grant Nos. CMMI-0928076 and 0927745 from the Engineering Design and Innovation Program and the NSF REU supplement grant CMMI-1033407. Any opinions or findings of this work are the responsibility of the authors, and do not necessarily reflect the views of the sponsors or collaborators.
References

1. Best of What's New 2011 (2011). https://www.popsci.com/bown2011/html/index/judging
2. Oman, S., Gilchrist, B., Rebhuhn, C., Tumer, I.Y., Nix, A., Stone, R.: Towards a repository of innovative products to enhance engineering creativity education. (August 2012)
3. Grieser, K., Baldwin, T., Bohnert, F., Sonenberg, L.: Using ontological and document similarity to estimate museum exhibit relatedness. J. Comput. Cult. Herit. 3(3) (February 2011) 10:1-10:20
4. Norling, E.: Folk psychology for human modelling: Extending the BDI paradigm. In: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1. AAMAS '04, Washington, DC, USA, IEEE Computer Society (2004) 202-209
5. Villatoro, D., Sen, S., Sabater-Mir, J.: Exploring the dimensions of convention emergence in multiagent systems. Advances in Complex Systems (ACS) 14(02) (2011) 201-227
6. Quattrociocchi, W., Conte, R.: Exploiting reputation in distributed virtual environments. CoRR abs/1106.5111 (2011)
7. Shah, J.: Metrics for measuring ideation effectiveness. Design Studies 24(2) (March 2003) 111-134
8. Bohm, M., Vucovich, J., Stone, R.: Using a design repository to drive concept innovation. Journal of Computer and Information Science in Engineering 8(1) (2008)
9. Hirtz, J., Stone, R., McAdams, D., Szykman, S., Wood, K.: A functional basis for engineering design: Reconciling and evolving previous efforts. Research in Engineering Design 13(2) (2002) 65-82
10. Agogino, A., Tumer, K.: Analyzing and visualizing multiagent rewards in dynamic and stochastic domains. Autonomous Agents and Multi-Agent Systems 17(2) (2008) 320-338
11. Tumer, K., Agogino, A.K.: A multiagent approach to managing air traffic flow. Journal of Autonomous Agents and Multi-Agent Systems (2010)
12. Wolpert, D., Tumer, K.: Collective intelligence, data routing and Braess' paradox. Journal of Artificial Intelligence Research 16 (2002) 359-387
13. Leder, H., Carbon, C.C.: Dimensions in appreciation of car interior design. Applied Cognitive Psychology 19(5) (2005) 603-618