A Summary of Parallel Learning Efforts
DIMACS Workshop on Parallelism: A 2020 Vision
John Langford (Yahoo!)
What is Machine Learning? The simple version:

Given data (x, y)*, find a function f(x) which predicts y.

y ∈ {0, 1} is a "label", x ∈ R^n is "features", and f(x) = ⟨w · x⟩ is a linear predictor.
y might be more complex and structured. Or nonexistent... x might be a sparse vector or a string. f can come from many more complex functional spaces. In general: the discipline of data-driven prediction.
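To make the simple version concrete, here is a minimal sketch of a linear predictor for binary labels (my illustration, not from the talk; the weights, the features, and the 0.5 decision threshold are all assumptions):

import numpy as np

# Linear predictor f(x) = <w, x>, thresholded at 0.5 for a {0, 1} label.
def predict(w, x):
    return 1 if np.dot(w, x) > 0.5 else 0

w = np.array([0.2, -0.1, 0.7])   # hypothetical learned weights
x = np.array([1.0, 0.0, 1.0])    # feature vector in R^n
print(predict(w, x))             # 0.2 + 0.7 = 0.9 > 0.5, so prints 1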
Where is Machine Learning?

At Yahoo!:
1 Is the email spam or not?
2 Which news article is most interesting to a user?
3 Which ad is most interesting to a user?
4 Which result should come back from a search?
In the rest of the world:
1 "statistical arbitrage"
2 Machine Translation
3 Watson
4 Face detectors in cameras
5 ... constantly growing.
How does it work?

A common approach = gradient descent. Suppose we want to choose w for f(x) = ⟨w · x⟩. Start with w = 0. Compute a "loss" according to l_f(x, y) = (f(x) − y)^2. Alter the weights according to w ← w − η · ∂l_f/∂w.
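A minimal sketch of this update rule (illustrative numpy code, not from the talk; the learning rate η = 0.1 and the two example pairs are assumptions):

import numpy as np

def sgd_step(w, x, y, eta=0.1):
    # f(x) = <w, x>; squared loss l_f = (f(x) - y)^2.
    # Gradient: dl_f/dw = 2 * (f(x) - y) * x, so w <- w - eta * dl_f/dw.
    residual = np.dot(w, x) - y
    return w - eta * 2.0 * residual * x

w = np.zeros(3)   # start with w = 0
for x, y in [(np.array([1.0, 0.0, 1.0]), 1), (np.array([0.0, 1.0, 0.0]), 0)]:
    w = sgd_step(w, x, y)
print(w)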
There are many variations and many other approaches. All efficient methods have some form of greedy optimization core. But it's not just optimization:
1 We must predict the y correctly for new x.
2 There are popular nonoptimization methods as well.
Demonstration

Learning to classify news articles (RCV1 dataset)
An Outline of What's Next

Ron Bekkerman, Misha Bilenko and I are editing a book on "Scaling up Machine Learning". Overview next.
What's in the book?

Parallel Unsupervised Learning Methods:
1 Information-Theoretic Co-Clustering with MPI
2 Spectral Clustering using MapReduce as a subroutine
3 K-Means with GPU (see the sketch after this list)
4 Latent Dirichlet Allocation with MPI
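For orientation on what is being parallelized, here is a minimal single-machine k-means (Lloyd's algorithm) sketch in numpy. This is my illustration only; the GPU-parallel version covered in the book is substantially more involved, and the synthetic data here is an assumption:

import numpy as np

def kmeans(X, k, iters=10, seed=0):
    # Plain Lloyd's algorithm: alternate nearest-center assignment
    # and centroid recomputation.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centers, assign = kmeans(X, k=2)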
It’s very hard to compare different results.
... But let's try:

[Figure: "Speed per method" for the unsupervised methods above, log-scale Features/s (10 to 1e+07), parallel vs. single runs: RBM-93K (GPU, Images); K-Means-400 (GPU, Synthetic); LDA-2000 (MPI-1024, Medline); Spectral Clustering-53 (MapReduce-128, RCV1); Info-Theoretic Co-Clustering (MPI-400, RCV1 800x800)]
Ground Rules

Ginormous caveat: Prediction performance varies wildly depending on the problem–method pair.

The standard: Input complexity/time.
⇒ No credit for creating complexity then reducing it. (Ouch!)

Most interesting results reported. Some cases require creative best-effort summary.
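Concretely, "input complexity/time" is throughput in Features/s, as in the plots that follow. A hypothetical example of the bookkeeping (all numbers are mine, for illustration only):

# Throughput standard: input complexity / time, reported as Features/s.
examples = 780_000            # hypothetical corpus size
nonzeros_per_example = 80     # average nonzero features per example
wall_clock_seconds = 1_200    # total training time

features_per_second = examples * nonzeros_per_example / wall_clock_seconds
print(f"{features_per_second:.0f} features/s")   # -> 52000 features/s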
[Figure: "Supervised Training: Speed per method", log-scale Features/s (100 to 1e+07), parallel vs. single runs: Linear (Threads-2, RCV1); Boosted DT (MPI-32, Ranking); Decision Tree (MapReduce-200, Ad-Bounce); RBF-SVM (TCP-48, MNIST 220K); Ensemble Tree (MPI-128, Synthetic); RBF-SVM (MPI?-500, RCV1)]
[Figure: "Supervised Testing (but not training): Speed per method", log-scale Features/s (100 to 1e+06), parallel vs. single runs: Conv NN (ASIC (sim), Caltech); Boosted DT (GPU, Images); Conv NN (FPGA, Caltech); Inference (MPI-40, Protein); Speech-WFST (GPU, NIST)]
My Flow Chart for Learning Optimization
1 Choose an efficient, effective algorithm.
2 Use compact binary representations.
3 If (computationally constrained) then GPU
  else if (few learning steps) then Map-Reduce
  else Research Problem.
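Step 3, written out as a hypothetical helper function (my rendering of the flow chart, not code from the talk):

def choose_platform(computationally_constrained, few_learning_steps):
    # Steps 1-2 (efficient algorithm, compact binary representations)
    # are assumed done before reaching this decision point.
    if computationally_constrained:
        return "GPU"
    if few_learning_steps:
        return "Map-Reduce"
    return "Research Problem"

print(choose_platform(False, True))   # -> Map-Reduce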