Markus Schmidberger, Data Platform Architect. Glomex GmbH – A ProSiebenSat.1 Media SE company. Berlin, April 12th 2016
Big Data is Dead, Long Live Business Intelligence?
Michael Muckel, Head of Data Platform
Markus Schmidberger, Data Platform Architect
Glomex GmbH – A ProSiebenSat.1 Media SE company
Berlin, April 12th 2016
Data Science Environment
Data Platform - MicroService Layout
Content API
Content Import Service
CDN files
CDN Log Import Service
data stream
other modules
Content Discovery Service Data Management Service
Data API
Metadata Service KPI & Analytics Service
Portal
AdProxy Log Import Service
Data Lake data stream
VAS Log Import Service
data stream
Player Feedback Import Service
Data Layer
Technical Monitoring Service
Real-Time Dashboards
Dev / Ops Analytics Service
Data Platform Access Team
Data Quality Service External Data Import Service
Data Science Analytics Service
Data Science UI
Data Platform Monitoring Service
INGEST
STORE
PROCESS & ANALYSE
VISUALIZE & SERVE
Data Science Environment
Project Jupyter: http://jupyter.org/
Amazon Kinesis
Development
Cluster Technology
Data Sources
Data Science Environment - Architecture
Amazon Redshift
Amazon S3
Elasticsearch
Amazon EMR (in development)
GitHub (in development)
Our Lambda Architecture on AWS
Data Platform - Lambda Architecture
Batch Layer
other player modules
AWS Lambda
Amazon Redshift
CDN files
Amazon API Gateway Portal
AWS Lambda
Amazon Elastic MapReduce + Spark
Serving Layer
S3
EC2 with Caravel
EC2 with Jupyter
Team
data stream
Instance with Kinesis Agent
Amazon Kinesis Firehose
AWS Lambda
Speed Layer
EC2 with Elasticsearch
EC2 with Grafana
Applications
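The speed layer above routes a data stream through Kinesis into AWS Lambda. As a rough illustration of what such a function has to do first, the sketch below decodes the base64-encoded payloads of a Kinesis event. It assumes a Kinesis stream as the Lambda event source and JSON payloads, neither of which the slide confirms; the function names and the sample fields (`video_id`, `event`) are hypothetical.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON documents carried by a Kinesis-triggered Lambda event."""
    docs = []
    for record in event.get("Records", []):
        # Kinesis delivers each payload base64-encoded under kinesis.data
        payload = base64.b64decode(record["kinesis"]["data"])
        docs.append(json.loads(payload))
    return docs


def handler(event, context):
    """Hypothetical Lambda entry point for the speed layer."""
    docs = decode_kinesis_records(event)
    # A real function would forward docs to the speed-layer store here
    # (Elasticsearch in the diagram); that step is omitted in this sketch.
    return {"decoded": len(docs)}


# Local usage example with a hand-built event (sample data, not real logs)
sample_payload = json.dumps({"video_id": "a", "event": "play"}).encode("utf-8")
sample_event = {
    "Records": [
        {"kinesis": {"data": base64.b64encode(sample_payload).decode("ascii")}}
    ]
}
result = handler(sample_event, None)
```

Keeping the decoding step in its own function makes it testable locally without any AWS resources.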
Key Takeaways
Lambda Architecture
Enrich your traditional, batch-driven BI workflow with real-time analytics
Use the Lambda Architecture as a guiding principle and adapt it to your needs
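To make the guiding principle concrete, here is a minimal in-memory sketch of the three layers: a batch view recomputed from all events up to a cutoff, a speed view covering only newer events, and a serving step that merges the two. The event shape `(timestamp, video_id)`, the function names, and the cutoff value are assumptions made for illustration; the platform itself uses Redshift, EMR, and Elasticsearch rather than Python counters.

```python
from collections import Counter


def batch_view(events, cutoff):
    """Batch layer: recompute play counts from all events up to the cutoff."""
    view = Counter()
    for ts, video_id in events:
        if ts <= cutoff:
            view[video_id] += 1
    return view


def speed_view(events, cutoff):
    """Speed layer: count only events the last batch run has not seen."""
    view = Counter()
    for ts, video_id in events:
        if ts > cutoff:
            view[video_id] += 1
    return view


def serve(batch, speed):
    """Serving layer: merge both views to answer a query."""
    return dict(batch + speed)


# Hypothetical events: (timestamp, video_id)
events = [(1, "a"), (2, "b"), (3, "a"), (4, "a"), (5, "b")]
cutoff = 3  # the last batch run covered timestamps up to 3

merged = serve(batch_view(events, cutoff), speed_view(events, cutoff))
print(merged)
```

Because the two views cover disjoint time ranges, summing them never double-counts an event, and the speed view can be discarded once the next batch run catches up.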
Key Takeaways
Focus on feature development and robust pipelines, not on infrastructure management
AWS managed services provide a robust way to run complex big data infrastructures
Follow best practices provided by AWS and the community
Key Takeaways
Provide an open data environment
Trust the creativity of your engineering teams to find insights in your datasets
Structure your data so that it can be accessed in both processed and raw form
Notebooks provide easy access to even large distributed datasets
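One way to keep data reachable in both raw and processed form is a predictable object key layout in the data lake. The sketch below builds such keys; the `raw` and `processed` zone names, the `source=` and `dt=` partition prefixes, and the file names are hypothetical conventions chosen for the example, not the layout used by the authors.

```python
from datetime import date

ZONES = ("raw", "processed")


def s3_key(zone, source, day, filename):
    """Build a date-partitioned object key for one zone of the data lake."""
    if zone not in ZONES:
        raise ValueError("zone must be one of %s" % (ZONES,))
    return "{}/source={}/dt={}/{}".format(zone, source, day.isoformat(), filename)


# The same day of CDN logs, addressable in both forms
day = date(2016, 4, 12)
raw_key = s3_key("raw", "cdn-logs", day, "part-0000.gz")
processed_key = s3_key("processed", "cdn-logs", day, "part-0000.parquet")
```

With a shared scheme, a notebook user can switch between the untouched input and the cleaned output by changing only the zone prefix.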
We are hiring …
Michael Muckel, Head of Data Platform
Markus Schmidberger, Data Platform Architect
Glomex GmbH – A ProSiebenSat.1 Media SE company