Harnessing Parallelism in a Parallel Discrete-Event ... - CiteSeerX

Proceedings of the IASTED International Conference Modelling and Simulation May 5 – 8, 1999, Philadelphia, Pennsylvania, USA

Harnessing Parallelism in a Parallel Discrete-Event Simulation

Chu-Cheow LIM, Yoke-Hean LOW, Boon-Ping GAN, Sanjay JAIN
Gintic Institute of Manufacturing Technology
71 Nanyang Drive, Singapore 638075, Singapore
Email: fcclim,yhlow,bpooi,[email protected]

Stephen J. Turner
Dept. of Computer Science, University of Exeter
Exeter, EX4 4PT, U.K.
[email protected]

Wentong CAI, Wen Jing HSU, Shell Ying HUANG
Centre for Advanced Information Systems, School of Applied Science
Nanyang Technological University
Singapore 639798, Singapore
faswtcai,hsu,[email protected]

Abstract

This paper looks at a parallel discrete-event simulation of wafer fabrication models. This is an irregular computation because when events are sent from one logical process (LP) to another, the (virtual) times of their movement cannot be calculated in advance. Each event-handling operation takes time in the order of tens of microseconds, so the application has fine-grained parallelism. We use a synchronous (“super-step”), conservative simulation protocol to organize the simulation into a regular, BSP-like structure. This eases the programmers’ coding efforts. A regular computation structure also makes it easier to analyze the program’s performance, identify bottlenecks and perform code optimizations. We summarize several optimizations to reduce cache miss rates, and show that it is possible to achieve reasonable speedups (with respect to an optimized C++ sequential implementation), for a non-trivial simulation model with fine-grained events.

Keywords: Synchronous conservative protocol, parallel discrete-event simulation.

1. Introduction

The work presented in this paper was done in the context of a project whose object is to study the use of parallel and distributed simulation (PADS) techniques to speed up discrete-event simulation of a virtual factory [4]. A discrete-event simulation is hard to parallelize because both its data access pattern and computation structure are irregular. The task is made harder if the simulation model has fine-grained parallelism, and does not contain intensive floating-point calculations.

Section 1.1 first describes the simulation model. Next Section 2 briefly outlines the synchronous protocol that we use. The main purpose is to show that a simulation protocol with a regular, BSP-like [3] structure makes it easier for us to identify implementation optimizations (Section 3). Section 3 will also give the timings for both our parallel and sequential implementations. We conclude the paper in Section 4.

1.1. Simulation Model

Our application is a general model for discrete-event simulation of wafer fabrication. The specific wafer-fab models are based on data sets from Sematech [13]. Sematech (http://www.sematech.org) is a consortium of semiconductor manufacturing companies that does research for its members in non-competitive areas of semiconductor manufacturing. Sematech has defined a data format to describe wafer fabrication processes, so that data can be easily exchanged among simulation users.

Each input data set describes the configuration of a specific plant, and has multiple homogeneous sets of machines. Each homogeneous machine set has a common queue. In wafer fabrication, the wafers move from one machine to the next in lots. Sematech specifies that there are machines which (a) process wafers within a lot individually, (b) process a wafer-lot as a whole, and (c) group multiple lots into a batch for processing. We implemented two types of machine sets. The first, for modelling (a) and (b), is a lot-processing machine which schedules a wafer-lot at a time. The second type models

To correspond to “real world” simulations, we incorporated the rules which simulationists normally associate with each machine set. The rules include: (a) a dispatching rule to decide how the waiting lots (in the common queue for a machine set) are prioritized for scheduling, (b) a setup rule to decide if the setup of an available machine (within a set) is suitable for a waiting lot, or if the setup is to be changed, and (c) a time-out to prevent wafer-lots from waiting indefinitely.
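The three rules attached to a machine set can be illustrated with a small Python sketch. This is not the authors' implementation: the class and field names (`Lot`, `MachineSet`, `dispatch_key`, `due_time`), the earliest-due-date dispatching rule used in the example, and the order in which the rules are applied (time-out first, then setup match, then plain priority) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Lot:
    lot_id: str
    arrival_time: float   # virtual time at which the lot joined the common queue
    due_time: float       # hypothetical attribute used by the example dispatching rule
    setup: str            # setup the lot needs on a machine


@dataclass
class MachineSet:
    """A homogeneous set of machines sharing one common queue (illustrative only)."""
    dispatch_key: Callable[[Lot], float]   # (a) dispatching rule: lower key = higher priority
    timeout: float                         # (c) longest time a lot may wait
    queue: List[Lot] = field(default_factory=list)

    def select_lot(self, now: float, machine_setup: str) -> Optional[Lot]:
        """Pick the next lot for an idle machine whose current setup is machine_setup."""
        if not self.queue:
            return None
        ranked = sorted(self.queue, key=self.dispatch_key)
        # (c) time-out: a lot that has waited too long is served regardless of setup.
        overdue = [lot for lot in ranked if now - lot.arrival_time >= self.timeout]
        # (b) setup rule: otherwise prefer a lot that matches the current setup;
        # if none matches, the best-ranked lot is taken and the setup must change.
        matching = [lot for lot in ranked if lot.setup == machine_setup]
        chosen = (overdue or matching or ranked)[0]
        self.queue.remove(chosen)
        return chosen
```

With three waiting lots and an earliest-due-date rule, an idle machine set up for "etch" would pick the most urgent "etch" lot even if a "litho" lot has an earlier due time, unless some lot has already exceeded the time-out.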

Initialization
/* Executed by every LP, say
while ( GST ) do
    Swap InBuff and OutBuff
    Start Super-step
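Because the pseudocode above is cut off in this copy, the following Python sketch shows only the general shape of a synchronous, super-step conservative loop, not the paper's actual protocol. It assumes that GST is the smallest pending event timestamp over all LPs, that each super-step processes only events stamped exactly with GST, and that events sent during a super-step are delivered at the start of the next one (standing in for the InBuff/OutBuff swap). The names `LP`, `run`, and `handler` are hypothetical, and the LPs are stepped one after another in a single thread rather than in parallel.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Event = Tuple[float, int, str]                       # (timestamp, destination LP id, payload)
Handler = Callable[[int, float, str], List[Event]]   # returns newly generated events


class LP:
    def __init__(self, lp_id: int):
        self.lp_id = lp_id
        self.pending: List[Tuple[float, str]] = []   # local future-event heap
        self.in_buff: List[Event] = []               # events received from the last super-step
        self.out_buff: List[Event] = []              # events produced in this super-step


def run(lps: Dict[int, LP], handler: Handler, end_time: float) -> List[Tuple[float, int, str]]:
    """Run super-steps until no events remain or GST passes end_time."""
    trace = []
    while True:
        # Deliver what was sent in the previous super-step.
        for lp in lps.values():
            for ts, _, payload in lp.in_buff:
                heapq.heappush(lp.pending, (ts, payload))
            lp.in_buff = []
        # Global reduction: GST is the smallest pending timestamp over all LPs.
        times = [lp.pending[0][0] for lp in lps.values() if lp.pending]
        if not times or min(times) > end_time:
            return trace
        gst = min(times)
        # Super-step: every LP handles its events stamped exactly GST.
        # This is safe only if handlers never emit timestamps below GST.
        for lp in lps.values():
            while lp.pending and lp.pending[0][0] == gst:
                ts, payload = heapq.heappop(lp.pending)
                trace.append((ts, lp.lp_id, payload))
                lp.out_buff.extend(handler(lp.lp_id, ts, payload))
        # Barrier: route each out-buffer entry to its destination's in-buffer.
        for lp in lps.values():
            for ev in lp.out_buff:
                lps[ev[1]].in_buff.append(ev)
            lp.out_buff = []
```

In a real parallel run the per-LP loops would execute concurrently between barriers; the regular compute, exchange, reduce pattern is what gives the computation its BSP-like structure.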
