Data science approaches to prevent failures in systems engineering
Sponsor: DASD(SE)
Prof. Karen Marais and Prof. Bruno Ribeiro
9th Annual SERC Sponsor Research Review, November 8, 2017
FHI 360 Conference Center, 1825 Connecticut Avenue NW, 8th Floor, Washington, DC 20009
www.sercuarc.org
Project failures occur despite systems engineering best practices
Project delays, cost overruns, quality concerns, cancellations…
Why aren’t these methods helping (as much as we hope)?
Several possible reasons…
① They rely on extensive data creation, collection, and tracking, which is hard to do
② We think they are not useful, and so they are not
Our core ideas:
① risk assessment based on the “real reasons” for systems engineering failures, and
② augmenting existing data with team assessments, i.e., Wisdom of the Crowd (WoC), to uncover problems and likely “real reason” causes
Most systems engineering failures do not involve black swans
Most failures result from rather prosaic and predictable white swans:
• Lost tacit knowledge when employee(s) departed
• Subjected to insufficient testing
• Created deficient requirements
• Failed to provide resources
• Violated regulations
• Failed to inspect
• Used inadequate justification
• Violated procedures
• Subjected to inadequate reviews
• Failed to form a contingency plan
• Managed risk poorly
• Kept poor records
• Failed to consider systems factors
• Created deficient procedures
• Failed to supervise
• Lacked experience
• Enforced deficient regulations
• Did not allow aspect to stabilize
• Failed to consider human factors
• Did not learn from failure
• Failed to maintain
Diane Sorenson and Karen Marais, “Patterns of Causation in Accidents and Other Systems Engineering Failures,” IEEE Systems Conference, Orlando, FL, April 2016.
The failure cause network shows how these causes relate to one another
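For illustration, this network can be encoded as a directed graph whose nodes are the causes above and whose edges record which cause contributed to which. Below is a minimal Python sketch using networkx; the edges are hypothetical placeholders for the figure's actual links, which come from Sorenson and Marais (2016).

```python
# Minimal sketch: encode the failure cause network as a directed graph.
# The edges below are hypothetical placeholders, NOT the actual network
# reported by Sorenson and Marais (2016).
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("Failed to provide resources", "Subjected to insufficient testing"),
    ("Lacked experience", "Created deficient requirements"),
    ("Created deficient requirements", "Subjected to insufficient testing"),
    ("Kept poor records", "Did not learn from failure"),
])

# Causes that enable many downstream causes (high out-degree) are natural
# leverage points for prevention.
for cause, out_deg in sorted(G.out_degree, key=lambda kv: kv[1], reverse=True):
    print(f"{cause}: enables {out_deg} downstream cause(s)")
```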
Consider the Mars Climate Orbiter failure
The project was severely understaffed, with some people working 80 hours per week. The team monitoring the spacecraft saw that errors were accumulating on the aim point for the spacecraft, but did not investigate.
People knew something wasn’t right… Can we develop ways to get this information? (without requiring ever more paperwork…)
Big data can help, but it’s harder than you might think.
Expectation of Machine Learning/AI/Big Data
Human bookkeeping + Automated data collection (sensors) + Machine Learning / AI = insights that prevent and forecast failures
Reality of Machine Learning/AI/Big Data
The bookkeeping that feeds these models often looks like:
• “Estimated MTBF ~10 years? Check later with Emily”
• “If sensor measurement is wrong do… TODO: Describe our work‐around. Need to revisit this later.”
And data collection is not a high priority when a project is in trouble.
Human bookkeeping (low priority in crunch time) + Automated data collection (sensors) + Machine Learning / AI = data that is faulty and incomplete when needed most
We propose a tool that uses both existing data and Wisdom of the Crowd to help predict failures
Enterprise Software Derived Inputs + App Derived Inputs → Machine Learning Algorithm → Failure Prediction
Predictions are compared against actual failures, and the algorithm is continuously updated within the organization.
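As a rough sketch of this pipeline (not the project's actual algorithm), the two input streams can be concatenated into a single feature vector, a classifier fit on recorded outcomes, and the fit repeated as new actual failures are logged. The feature layout and synthetic data below are assumptions made for illustration.

```python
# Minimal sketch of the proposed loop: enterprise-software inputs and WoC-app
# inputs feed one classifier, refit whenever new actual-failure labels arrive.
# Synthetic data stands in for both input sources.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
enterprise_inputs = rng.normal(size=(n, 4))   # e.g., budget, schedule signals
app_inputs = rng.normal(size=(n, 3))          # e.g., WoC health assessments
X = np.hstack([enterprise_inputs, app_inputs])
y = rng.integers(0, 2, size=n)                # recorded actual failures (0/1)

model = LogisticRegression().fit(X, y)

def predict_failure_risk(enterprise_row, app_row):
    x = np.hstack([enterprise_row, app_row]).reshape(1, -1)
    return model.predict_proba(x)[0, 1]

# "Continuously updated": refit as the organization records new outcomes.
def record_outcomes_and_update(new_X, new_y, X, y):
    X, y = np.vstack([X, new_X]), np.concatenate([y, new_y])
    return LogisticRegression().fit(X, y), X, y
```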
How will we get there?
IDENTIFY INPUT DATA and DEVELOP COLLECTION APP:
• IDENTIFY Enterprise Software Derived Inputs
• IDENTIFY Human Derived Inputs and DEVELOP WoC App
DEVELOP First Generation of Machine Learning Algorithm
→ Failure Prediction at PARTNER ORGANIZATION
→ RECORD Actual Failures at PARTNER ORGANIZATION
Use partner organization data to train the first generation of the machine learning algorithm and tailor the set of input parameters.
Identifying input data
Use student and partner organization data to train the first and second generations of the machine learning algorithm and tailor the set of input signals.
REAL REASONS provide the initial seed for selecting FACTORS; FACTORS are identified based on the literature; FACTORS point to SIGNALS from the enterprise software and the WoC app; SIGNALS are then linked to REAL FAILURES.
Over time, the machine learning code makes direct links between signals and failures, so we can discard the factors and real reasons.
How does input data relate to the real reasons?
SIGNAL from Employee App: How many times did you ask your team members a “why” question today?
→ FACTOR: Low proactivity. Low proactivity may mean missed opportunities to question and improve requirements.
SIGNAL from Finance Software: The percentage of budget associated with replacing faulty or unsuitable parts.
→ FACTOR: Faulty parts. Many faulty parts may be a sign of poor requirements specification (e.g., a good part used in the wrong way, or a poor-quality part).
Both factors point to the same REAL REASON: Conducted poor requirements engineering.
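To make the signal-to-factor-to-real-reason chain concrete, here is a minimal sketch of a lookup structure tying each signal to its factor and real reason. The signal names and thresholds are hypothetical, chosen only to mirror the two examples above.

```python
# Minimal sketch of a signal -> factor -> real-reason mapping.
# Signal names and thresholds are illustrative, not the project's schema.
SIGNAL_MAP = {
    "why_questions_per_day": {          # from the employee WoC app
        "factor": "Low proactivity",
        "real_reason": "Conducted poor requirements engineering",
        "concern_if": lambda v: v < 1,  # hypothetical threshold
    },
    "faulty_part_budget_pct": {         # from finance software
        "factor": "Faulty parts",
        "real_reason": "Conducted poor requirements engineering",
        "concern_if": lambda v: v > 0.05,
    },
}

def flag_concerns(signals):
    """Return (signal, factor, real_reason) triples that look concerning."""
    return [
        (name, m["factor"], m["real_reason"])
        for name, m in SIGNAL_MAP.items()
        if name in signals and m["concern_if"](signals[name])
    ]

print(flag_concerns({"why_questions_per_day": 0, "faulty_part_budget_pct": 0.12}))
```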
Wisdom‐of‐the‐Crowd Information
Accurate Wisdom‐of‐the‐Crowd predictions from incomplete pictures:
An expert with a complete view and understanding of the entire process may be able to give a reasonable assessment of potential problems and delays. As projects become more complex, this assessment becomes increasingly hard, and dedicated experts are no longer able to maintain a complete view.
(Wisdom of the Crowd) Can we use the assessments of non‐experts with partial views to train neural networks to learn to
a. predict successes and failures using non‐experts with incomplete (possibly biased) information?
b. ask relevant questions of these non‐experts to make the data richer and better predict success and failure?
Predicting Outcomes from WoC Inputs
Hypothesis: Non‐expert opinions and their relationships can help predict project outcomes.
Approach: Use our newly developed Sparse Pattern Convolutional Neural Network (SPCNN) to learn dynamic relational dependencies between group actors, their opinions, and project outcomes.
Input: WoC team member assessments of project health, WoC assessments of potential personal issues, team structure, and traditional indicators: inputs from each individual in the team plus relevant extra data. A graph encodes relationship patterns such as Manager (M)–Contractor (C) and Engineer (E)–Engineering Intern (EI).
Output: Predicted outcome of project milestones.
Meng, C., Sekar, C., Ribeiro, B., and Neville, J., “Predicting Subgraph Evolution in Heterogeneous Dynamic Networks,” 2017 (preprint).
Yang, J., Ribeiro, B., and Neville, J., “Should We Be Confident in Peer Effects Estimated from Partial Crawls of Social Networks?” AAAI Conference on Web and Social Media (ICWSM), 2017.
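For flavor only, the sketch below pools each member's WoC opinion scores with counts of relationship patterns (M–C, E–EI, and so on) and scores the milestone with a plain logistic model. This is not SPCNN: the roles, weights, and features are invented for illustration.

```python
# Stand-in for graph-based outcome prediction: pool per-member opinions with
# counts of role-pair relationship patterns, then score with a logistic model.
# NOT the authors' SPCNN; all names and weights here are hypothetical.
import numpy as np

ROLE_PAIRS = [("M", "C"), ("M", "E"), ("E", "EI"), ("C", "E")]

def team_features(opinions, edges):
    """opinions: {member: health score in [0,1]}; edges: [(member, member)]."""
    roles = {m: m.rstrip("0123456789") for m in opinions}   # "M1" -> "M"
    pattern_counts = [
        sum(1 for a, b in edges if {roles[a], roles[b]} == set(pair))
        for pair in ROLE_PAIRS
    ]
    return np.array([np.mean(list(opinions.values())),
                     np.min(list(opinions.values())),
                     *pattern_counts], dtype=float)

# Hypothetical trained weights; in practice these would be fit on recorded
# milestone outcomes.
w = np.array([2.0, 1.5, -0.1, 0.05, 0.05, -0.05])

def milestone_success_prob(opinions, edges):
    z = team_features(opinions, edges) @ w
    return 1.0 / (1.0 + np.exp(-z))

team = {"M1": 0.8, "C1": 0.4, "E1": 0.6, "EI1": 0.7}
print(milestone_success_prob(team, [("M1", "C1"), ("E1", "EI1"), ("M1", "E1")]))
```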
Active Learning + Contextual Bandits
• Problem: There are too many questions we would like to ask.
― We must limit the number of questions asked (to avoid subject fatigue).
― Which questions/answers correlate most with the outcome?
• Approach: An active-learning loop that learns which questions to ask. The learner predicts the usefulness of each question, chooses questions to ask, gets answers, and makes a prediction; it then tests the prediction, measures prediction accuracy, and feeds the measured usefulness of each question back into its future choices.
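A minimal epsilon-greedy stand-in for this loop: treat each candidate question as a bandit arm, use the measured gain in prediction accuracy as its reward, and favor questions with the highest running usefulness while still exploring. The question names and simulated rewards below are assumptions, not the project's implementation.

```python
# Epsilon-greedy sketch of the question-selection loop: estimate each
# question's usefulness from how much asking it improved prediction accuracy,
# and prefer the most useful questions while still exploring.
import random

questions = ["health", "workload", "morale", "schedule_confidence"]
usefulness = {q: 0.0 for q in questions}   # running mean reward per question
asked = {q: 0 for q in questions}
EPSILON, BUDGET = 0.2, 2                   # explore rate; max questions/round

def choose_questions():
    ranked = sorted(questions, key=lambda q: usefulness[q], reverse=True)
    return [random.choice(questions) if random.random() < EPSILON else q
            for q in ranked[:BUDGET]]

def update(question, accuracy_gain):
    """Feedback step: reward = measured improvement in prediction accuracy."""
    asked[question] += 1
    usefulness[question] += (accuracy_gain - usefulness[question]) / asked[question]

# One simulated round: ask, observe usefulness, update.
for q in choose_questions():
    update(q, accuracy_gain=random.random())  # stand-in for measured gain
print(usefulness)
```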
Vision for Product Development: Year One and Future
IDENTIFY INPUT DATA and DEVELOP COLLECTION APP:
• Enterprise Software Derived Inputs: IDENTIFY and TAILOR (Year One); EXPAND and REFINE (Future)
• Human Derived Inputs: IDENTIFY inputs and DEVELOP WoC App (Year One); REFINE WoC App (Future)
Machine Learning Algorithm: DEVELOP 1st Generation (Year One); REFINE 2nd Generation (Future)
Failure Prediction at PARTNER ORGANIZATION; Actual Failures recorded at STUDENT ORGANIZATIONS (Year One) and at PARTNER ORGANIZATION (Future)
Year One: Use student organization data to train the first generation of the machine learning algorithm and tailor the set of input parameters.
Future: Use partner organization data to train the second generation of the machine learning algorithm and expand and refine the set of input parameters as necessary.