Handling Large Datasets in Parallel Metaheuristics: A ... - Google Sites
Handling Large Datasets in Parallel Metaheuristics: A Spares Management and Optimization Case Study
Chee Shin Yeo, Elaine Wong Kay Li, Yong Siang Foo
Metaheuristics
Solve optimization problems in diverse domains
- Search over a solution space for an optimal solution that minimises an objective function
Challenges
- Exponentially increasing execution time
- Memory intensive
- Inconsistent performance due to random generation
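As a minimal sketch of the idea on this slide, the following Python snippet runs a randomised local search that minimises an objective function. The objective, the step size, and the function names are all hypothetical and chosen only for illustration; they are not taken from the presentation. Running it with different seeds shows how random generation leads to different results from run to run.

```python
import random

def objective(x):
    # Hypothetical objective: a simple bowl with its minimum at x = 3.
    return (x - 3.0) ** 2

def local_search(seed, iterations=200, step=0.5):
    """Randomised local search: keep a candidate only if it lowers the objective."""
    rng = random.Random(seed)
    current = rng.uniform(-10.0, 10.0)   # random initial solution
    best_cost = objective(current)
    for _ in range(iterations):
        candidate = current + rng.uniform(-step, step)
        cost = objective(candidate)
        if cost < best_cost:             # accept only improving moves
            current, best_cost = candidate, cost
    return current, best_cost

# Different seeds give different search paths and, in general, different results.
results = [local_search(seed) for seed in range(5)]
for seed, (solution, cost) in enumerate(results):
    print(f"seed={seed} solution={solution:.4f} cost={cost:.6f}")
```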
Parallel Metaheuristics
Search in parallel using multiple searches over a solution space
- Solution traces to reduce searching
  - Keeping track of past searches
- Cooperative methods with different initial solutions
  - Parallel searches exchange intermediate results
Problem: Large datasets
- Insufficient memory
- Network bottlenecks
Parallel Metaheuristics: Flow Control Workflow
Run as a flow control workflow with n states, each state with x Multiple Independent Runs (MIR)
Optimal solution is the output O_n of the last state (S_n)
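The workflow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the objective function, the `run_mir` and `run_workflow` names, and the rule that the best MIR result of a state becomes the input of the next state are all assumptions made for this example. The x runs of each state execute concurrently, and the output of the last state is taken as the final solution.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def objective(x):
    # Hypothetical objective with its minimum at x = 3.
    return (x - 3.0) ** 2

def run_mir(start, seed, iterations):
    """One Multiple Independent Run: a short randomised local search from `start`."""
    rng = random.Random(seed)
    current, cost = start, objective(start)
    for _ in range(iterations):
        candidate = current + rng.uniform(-0.5, 0.5)
        candidate_cost = objective(candidate)
        if candidate_cost < cost:
            current, cost = candidate, candidate_cost
    return current, cost

def run_workflow(n_states, x_runs, iterations, start=-10.0):
    """Run n states in sequence; each state launches x MIRs in parallel."""
    state_input = start
    outputs = []
    with ThreadPoolExecutor(max_workers=x_runs) as pool:
        for state in range(n_states):
            seeds = [state * x_runs + run for run in range(x_runs)]
            futures = [pool.submit(run_mir, state_input, s, iterations) for s in seeds]
            results = [f.result() for f in futures]
            # Assumption: the best MIR result is the state's output O_i.
            state_input, best_cost = min(results, key=lambda r: r[1])
            outputs.append((state_input, best_cost))
    return outputs

outputs = run_workflow(n_states=3, x_runs=2, iterations=50)
final_solution, final_cost = outputs[-1]   # output of the last state
print(final_solution, final_cost)
```

In this sketch the `iterations` argument plays the role of a stop criterion for each state: a smaller value means intermediate results are handed to the next state sooner.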
Flow Control Workflow: Clustering Policy
Workflow Clustering Policy: Executes the entire workflow as a single job
[Figure: timeline of processors P1 and P2. S1 (MIR1a, MIR1b), S2 (MIR2a, MIR2b) and S3 (MIR3a, MIR3b) run as one job, with a single Td before and a single Ta after the entire workflow.]
State Clustering Policy: Executes each state of the entire workflow as a single job
[Figure: timeline of processors P1 and P2. Each state runs as its own job: Td, S1 (MIR1a, MIR1b), Ta; Td, S2 (MIR2a, MIR2b), Ta; Td, S3 (MIR3a, MIR3b), Ta.]
Job Clustering Policy: Executes each MIR in a state as a single job
[Figure: timeline of processors P1 and P2. Each MIR runs as its own job, bracketed by Td and Ta: MIR1a, MIR2a, MIR3a in one row and MIR1b, MIR2b, MIR3b in the other.]
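To make the difference between the three clustering policies concrete, the sketch below groups the MIRs of a workflow into jobs under each policy. The function name `cluster_jobs` and the policy strings are hypothetical, introduced only for this example. Since each job in the figures is bracketed by one Td and one Ta segment, the number of jobs also indicates how many times those segments occur.

```python
def cluster_jobs(n_states, x_runs, policy):
    """Group MIR labels into jobs according to a clustering policy.

    policy: "workflow" (one job for everything), "state" (one job per state),
    or "job" (one job per MIR). These names are illustrative only.
    """
    states = [
        [f"MIR{s}{chr(ord('a') + r)}" for r in range(x_runs)]
        for s in range(1, n_states + 1)
    ]
    if policy == "workflow":
        return [[mir for state in states for mir in state]]
    if policy == "state":
        return [list(state) for state in states]
    if policy == "job":
        return [[mir] for state in states for mir in state]
    raise ValueError(f"unknown policy: {policy}")

for policy in ("workflow", "state", "job"):
    jobs = cluster_jobs(n_states=3, x_runs=2, policy=policy)
    print(policy, len(jobs), jobs)
```

With 3 states and 2 MIRs per state, as in the figures, this gives 1 job for workflow clustering, 3 for state clustering, and 6 for job clustering.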
Case Study: Spares Management and Optimization (SMO)
Optimization scenario
- Aircraft spare parts
- 59 airports, with time-based delivery commitment at selected airports
- Logistics flights between all locations
- Flight network: ~320,000 flight hours/year
Experimental Setup
Effect of Stop Criterion on IBM (2.26GHz, 32GB)
Effect of Clustering Policy on DELL (3.0GHz, 4GB)
Conclusion
Flow control workflow for parallel metaheuristics
- Stop Criterion: Exchange of intermediate data
- Clustering Policy: Assignment of jobs
Memory availability is a critical issue
Fewer iterations for the stop criterion is better
- Less memory, shorter completion time
Job clustering policy is better
- Least memory, more reliable completion
Future Work
More intelligent optimization
- Self-configuring stop criterion
Resource contention in multi-user environment
- Effective scheduling mechanism
End of Presentation Thank You Any Questions/Comments?