Algorithmic Accountability

Algorithmic Accountability: Transparency for Big Data, machine learning, and data protection

Assoc. Prof. Dr. Lothar Fritsch Karlstad University, Sweden RSA Conference (USA) Recap - Nordic Tour Oslo, May 28, 2018

Lothar Fritsch Associate Professor Dr. (Docent)

KARLSTAD UNIVERSITY

Welcome to Karlstad University!
- Established 1999
- University college, 1978-1998
- Branch of the University of Gothenburg, 1967-1977
- Teacher education in Karlstad since 1843
- Training for nurses since 1907
- Excellent research group in Computer Science
- Just over 16,000 students
- 260 doctoral students
- Around 1,250 staff members


Our location
- Main campus in Karlstad
- About 90,000 inhabitants
- County capital of Värmland county
- Ingesund School of Music in Arvika


Who we are
- Privacy and Security group (PriSec) in the Department of Mathematics and Computer Science at Karlstad University, part of the excellent research group at KAU.
- 15 researchers (2 professors, 2 associate professors/docents, 1 assistant professor, 1 adjunct, 9 PhD students, 1 guest researcher)

- Research focus: information privacy, technical data protection, and network security.
- Engagement in EU-funded and other research projects.
- Winner of the 2017 Swedish Cyber Challenge; semi-finalist in the 2017 European finals.


Algorithmic Accountability?
- Who is held accountable for decisions made by machines?


"Smart Algorithms" and AI
- "Smart algorithms" are said to solve many data processing problems better than humans.
- Precondition: "Big Data". The more data collected, the better the algorithms are believed to perform.
- Even better: "self-learning" algorithms shall make sense of masses of unstructured data.

[Diagram: Massive data collection → "Big Data Miracle" → "Smart AI Miracle". Proof?]


"Smart" Algorithms?
- Simple correlations.
- Group norms enforced on individuals.
- Poorly calibrated models of reality are implemented.
- Errors in data or programs are common.
- "Contextual integrity" for data use: an interpretation of data is only valid in the specific context of a person's life (Helen Nissenbaum).
- Future privacy in the Big Data age is control over how one is being "interpreted by others" based on data (Mireille Hildebrandt).


Liability for Algorithms?
- "Algorithmic trading" at stock exchanges creates massive problems.
- "Smart" cars crash and kill (Tesla, Volvo).
- Who is liable for damages, such as a stock trading shutdown or lives lost? The CEO, the programmers, the insurance, the "machine"?


Flash Crash
- Algorithms gone wild on the stock market on May 6, 2010.
Source: Vox.com, https://www.vox.com/2014/4/15/5616574/high-frequency-trading-guide-real-problems-explained


Big Data?
- Belief that collecting massive amounts of data on business transactions, consumers, or sensors will enable better understanding.
- Creates a need for massive hard disks and powerful CPUs (and possibly more expensive database licenses).
- More (historic) data available to feed into statistical analysis.
- "Data is the new oil!" Then a "personal data breach" is the new oil spill!
- What quality is this data? What is its life cycle? What if it has expired? This may lead to poor analysis results!


Smart Algorithms?
- "Smartness" is mostly built upon the fact that simple, structured computation and data sorting are done much faster by computers than by humans.
- Algorithms are very good at discrete computation on high-quality data; they outperform humans in this area (Hollerith's punch card sorting machines did, too!).
- Algorithms are very bad at handling uncertainty, unexpected events, variable data quality, or changes in the world that the statistical model is not prepared for.
- Often, algorithms are used to predict the future from the past (stock rates, insurance rates, human behavior), either from individual or group historic data.
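The "predict the future from the past" pattern can be made concrete in a few lines. A minimal sketch with invented numbers: a least-squares trend fitted to historic values extrapolates confidently, and says nothing about events outside the model.

```python
# Minimal sketch: extrapolating a trend from historic data (invented numbers).
# A least-squares line fitted to the past cannot anticipate a regime change.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Historic "stock rates": a steady upward trend.
history = [100, 102, 104, 106, 108]
a, b = fit_line(list(range(len(history))), history)

# The model confidently predicts that day 5 continues the trend...
forecast = a + b * 5
print(forecast)  # 110.0
# ...but a crash on day 5 (e.g. a flash crash) lies entirely outside the model.
```

This is the whole trick behind much "prediction": the statistical model encodes the assumption that tomorrow resembles yesterday.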


Machine Learning?
- Trendy thanks to accessible algorithms in APIs. Both professionals and amateurs can now use ML technology.
- ML often discovers patterns, e.g. correlations, in large amounts of data. Example: weather phenomena aligned with changes in electricity demand.
- Some algorithms build their own "models" from such observations and then apply them.
- HOWEVER: Are correlations coincidental (by chance) or causal (B follows from A)?
- What about input data quality? What about the "gaps" the ML doesn't see?
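The coincidental-versus-causal question can be illustrated with the classic confounded-variables example. A sketch with invented numbers: two series correlate almost perfectly, yet neither causes the other.

```python
import math

# Sketch: a correlation coefficient says nothing about causation.
# All numbers below are invented for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Monthly ice cream sales and drowning incidents (invented numbers):
ice_cream = [20, 35, 50, 70, 90, 85]
drownings = [2, 3, 5, 7, 9, 8]

r = pearson(ice_cream, drownings)
print(r > 0.99)  # True: near-perfect correlation, yet neither causes the other.
# A hidden variable (summer weather) drives both series.
```

An ML system fed only these two columns would happily "learn" the pattern; the gap it doesn't see is the confounder that was never in the data.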


Who makes the models?
- Statisticians
- "Data scientists"
- Stock brokers
- Product managers
- Salesmen
- ANYONE, independent of professional background, can use the APIs!

But who calibrates and verifies the models before they are let loose?


Training data, models and reality
- "Smart algorithms" are really program code and databases filled with statistical representations of training data.

[Diagram: Training data → Model → "Smart System"]
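The point that a "model" is just stored statistics of the training data can be made concrete. A minimal sketch, with an invented nearest-class-mean "model": after training, a few numbers are all the "smart system" knows about the world.

```python
# Minimal sketch of "training data -> model -> smart system".
# The "model" is nothing but stored statistics of the training data
# (here: per-class means). All labels and numbers are invented.

def train(samples):
    """samples: list of (feature_value, class_label). Returns the 'model'."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Invented training data: monthly spending of two customer groups.
training_data = [(100, "low"), (120, "low"), (900, "high"), (1100, "high")]
model = train(training_data)
print(model)  # {'low': 110.0, 'high': 1000.0}
```

Everything the training data did not contain (context, exceptions, changes in the world) is simply absent from those stored numbers.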


Application of the "trained" model
- The model is then applied to newly collected data (sensors, consumers) to classify it or to predict behavior.

[Diagram: Observation / sensor data → "Smart System" with trained model → Conclusion / Prediction, with possible update of model and calibration ("self-learning")]
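The apply-and-update loop can be sketched as well. A minimal illustration, assuming a "model" that is just a dict of per-class means over invented data; "self-learning" here is a naive running-mean update folded in after each classification.

```python
# Sketch of applying a trained model and a naive "self-learning" update.
# The model is a dict of per-class means (invented numbers).

model = {"low": 110.0, "high": 1000.0}
counts = {"low": 2, "high": 2}  # how many samples each mean is based on

def classify(model, value):
    """Assign the class whose stored mean is closest to the observation."""
    return min(model, key=lambda label: abs(model[label] - value))

def self_learn(model, counts, value):
    """Fold the new observation back into the model (running-mean update)."""
    label = classify(model, value)
    counts[label] += 1
    model[label] += (value - model[label]) / counts[label]
    return label

label = self_learn(model, counts, 140)
print(label)          # low
print(model["low"])   # 120.0 -- the model has drifted
```

Note what the update quietly does: the model's own (possibly wrong) classification becomes new "training data", which is exactly why self-learning steps must be logged for accountability.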


The personal data supply chain


Accenture report: Building digital trust: The role of data ethics in the digital age, https://www.accenture.com/us-en/insight-data-ethics , accessed April 2018

Accountability is Transparency!
- Any "smart algorithm" may have to prove the correctness of a decision or a prediction.
- Such proof requires insight into:
  - the training data
  - the model
  - all self-learning steps performed up to the decision (including their dynamic training data from sensors)
  - the actual input data that led to the decision
  - the actual algorithm code used for the decision (since it gets updated on occasion)
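That evidence list suggests what a per-decision audit record might contain. A sketch (all field names and versions are hypothetical): fingerprint every artifact the decision depended on, so it can later be reproduced and checked.

```python
import hashlib
import json
from datetime import datetime, timezone

# Sketch of a per-decision audit record. All field names and version
# strings are hypothetical; the point is to fingerprint everything
# the decision depended on.

def fingerprint(obj):
    """Stable SHA-256 hash of any JSON-serializable artifact."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def audit_record(training_data, model, code_version, input_data, decision):
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "training_data_hash": fingerprint(training_data),
        "model_hash": fingerprint(model),       # incl. self-learning updates
        "code_version": code_version,           # the algorithm gets updated
        "input_hash": fingerprint(input_data),  # the data behind this decision
        "decision": decision,
    }

record = audit_record(
    training_data=[[100, "low"], [900, "high"]],  # invented
    model={"low": 100.0, "high": 900.0},
    code_version="scorer-v2.3.1",                 # hypothetical
    input_data={"spending": 140},
    decision="low",
)
print(record["decision"])  # low
```

Without such records, "proving" a past decision is impossible once the model has self-learned or the code has been updated.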


Transparency rights in the GDPR
- Data subjects have the right to inquire about data processing if their personal data is involved in identifiable ways.
- A "neural network" that encodes personal data, but then recognizes a person, is most likely "identifiable data" when facing a judge in court.
- Data subjects can inquire about the data stored, about subcontractors processing it, and about other data sources used.
- Data subjects can request correction and (partial) deletion, and they can object to processing.
- Does your "AI" provide such intervention possibilities? Can you "roll back" a neural network to a state "without" a specific person's data?
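For simple statistics-based models, one honest answer to the "roll back" question is to keep the training data and retrain without the subject. A minimal sketch with invented data and a per-class-mean "model":

```python
# Sketch: "rolling back" a model to a state without one person's data.
# For a simple statistics-based model, the reliable route is to keep
# the training data and retrain without that subject. Invented data.

def train(samples):
    """Per-class-mean 'model' from (value, label, subject) triples."""
    sums, counts = {}, {}
    for value, label, _subject in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

training_data = [
    (100, "low", "alice"), (120, "low", "bob"),
    (900, "high", "carol"), (1100, "high", "dave"),
]
model = train(training_data)

# Erasure request from "bob": retrain without his records.
without_bob = [s for s in training_data if s[2] != "bob"]
model = train(without_bob)
print(model)  # {'low': 100.0, 'high': 1000.0}
# For a neural network, no such clean recomputation exists out of the box:
# the person's data is diffused through all the weights.
```

The sketch only works because the training data was retained with per-subject provenance, which is itself a data protection trade-off.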


Data management and data quality

[Diagram combining: Big Data I, knowledge discovery, Smart Algorithm, Decision / Conclusion / Prediction, Big Data II; cross-cutting concerns: data quality, data transparency, data protection]

Source: Fritsch, L.: How Big Data helps SDN with data protection and privacy. In: Taheri, J. (ed.) Big Data and Software Defined Networks, pp. 339-351. The Institution of Engineering and Technology (IET), London (2018)

Accenture's "Universal principles for data ethics"
1. The highest priority is to respect the persons behind the data.
2. Account for the downstream uses of datasets.
3. The consequences of utilizing data and analytical tools today are shaped by how they've been used in the past. There's no such thing as raw data. All datasets and accompanying analytic tools carry a history of human decision-making. As far as possible, that history should be auditable. This should include mechanisms for tracking the context of collection, methods of consent, chains of responsibility, and assessments of data quality and accuracy.
4. Seek to match privacy and security safeguards with privacy and security expectations.
5. Always follow the law, but understand that the law is often a minimum bar.
6. Be wary of collecting data just for the sake of having more data.
7. Data can be a tool of both inclusion and exclusion.
8. As far as possible, explain methods for analysis and marketing to data disclosers.
9. Data scientists and practitioners should accurately represent their qualifications (and limits to their expertise), adhere to professional standards, and strive for peer accountability.
10. Aspire to design practices that incorporate transparency, configurability, accountability, and auditability.
11. Products and research practices should be subject to internal (and potentially external) ethical review.
12. Governance practices should be robust, known to all team members and regularly reviewed.

Source: Accenture report: Building digital trust: The role of data ethics in the digital age, https://www.accenture.com/us-en/insight-data-ethics , accessed April 2018

Questions?

Free on-line course on Privacy by Design at https://www.kau.se/cs/pbd

Connect with me on LinkedIn and ResearchGate!
