ARTICLES Management of Computing Robert Zmud, Editor
Dynamic File Migration in Distributed Computer Systems

The importance of file migration is increasing because of its potential to improve the performance of distributed office, manufacturing and hospital information systems. To encourage research on the file migration problem, the authors summarize prior research on the problem, provide a detailed comparison of file migration and dynamic file allocation problems, and identify important areas of research to support the development of effective file migration policies.
Bezalel Gavish and Olivia R. Liu Sheng

B. Gavish's research was supported in part by a Dean's summer research grant at the Owen Graduate School of Management. © 1990 ACM 0001-0782/90/0200-0177 $1.50. Communications of the ACM, February 1990, Volume 33, Number 2.

The demand for information services is increasing at an explosive rate. Because business operations span large geographical regions and face external competitive pressures, information services are expected to collect, store, retrieve, process, aggregate and distribute timely information from data generated at geographically remote and dispersed sites. The traditional mode of providing computerized information services in large organizations has been to operate large centralized computer systems, with most of the processing, storage and retrieval of data performed at central sites. Such systems have the disadvantages of high cost and high loads on the centralized computing and communication devices. Consequently, both communication with and processing at the central site have become system bottlenecks, access speeds have slowed down, and the availability and reliability of information services have deteriorated.

Advances in communication technologies and the decreasing costs of computers have made distributed computer systems an attractive alternative for satisfying the information needs of large organizations. Distributed computer systems are composed of multiple geographically dispersed computer sites connected through communication networks. Each computer site has its own processing, storage and communication devices as well as the necessary software resources. Appropriate communication network and control protocols enable users and applications to access local and remote system resources in an integrated manner. By taking advantage of resource sharing and load balancing, distributed computer systems offer higher system reliability and
availability, reduced system and communication costs, shorter response time and increased system throughput.

In distributed systems, data files are accessed for information retrieval and update activities by dispersed users and applications. A file may be a conventional single file, a file that is part of the system database, a fragment, or a collection of any of these. Unlike those in centralized computer systems, data files in distributed systems can be replicated and allocated to computing sites so as to reduce communication costs and to increase reliability. Creation of multiple file copies requires the use of synchronization mechanisms to maintain data consistency among the replicated copies. As a result of concurrency control, update transactions must be processed on all the relevant copies of a file, while read-only queries can be answered from any copy of a file.

An important issue in the design of distributed information systems is how to determine the file replication level and the allocation of the replicated file copies necessary to achieve satisfactory system performance. This problem is referred to in past literature as the file allocation problem (FAP). Allocation of file copies in distributed systems involves tradeoffs between storage, processing and communication costs, and services based on the anticipated usage of files. Computing sites to which file copies are assigned have the advantage of quick and inexpensive information retrieval, but must bear the overhead of file storage and maintenance. The remaining sites incur communication costs and transmission delays when remotely referencing a file.

File accessing in many distributed environments is characterized by the phenomenon of locality of reference, in which we observe high-intensity periods of file referencing by users in the same location, followed by periods of low activity. Examples of such a phenomenon can be found in airline reservation systems around flight arrival and departure times and in hotel booking systems when conferences and conventions take place. During busy times, large volumes of communication traffic are generated at remote locations. As a result, overall operating costs increase and communication services deteriorate. An alternative to this situation is to transfer a copy of the referenced files to the requesting site so that file accesses can be done locally. This strategy, however, requires either interrupting operations across the entire system to perform a file reallocation, or incorporating file migration operations into distributed database management systems for on-line dynamic control.

Unlike system-wide file reallocations, file migration operations (a term adopted from file migration procedures in computer storage hierarchies) are procedures to create, transfer or delete a single file copy in a distributed system. Because file migration operations incur relatively little system interruption and can therefore be performed frequently, they are particularly useful for improving system performance by correcting file allocation during short bursts of intense file access. In addition, file migration may lower system operating costs when different computer sites operate under different operating cost scales, when time zone differences allow the use of regular shift operations at a remote site instead of a local late shift, and when remote computer devices can be used during off-peak periods rather than local resources within peak-load hours. Recent distributed computer systems, such as ROE [8], STORK [30], IBIS [38] and LOCUS [32], have incorporated file migration operations as options for accessing remote files.
In most of these systems, file migration operations are performed on an ad hoc basis only, without taking their long-term impact into consideration. As a result, only a fraction of the potential benefits of these operations is realized under the appropriate conditions. File migration clearly has very high potential to improve system performance and decrease system costs significantly. On the other hand, improper control of file migration operations may lead to inferior performance levels. An elaborate analysis of the tradeoffs in file migration is required to fully realize its potential benefits.

For easy reference, the problem that addresses proper control policies for file migration operations is referred to as the file migration problem. It should be distinguished from the dynamic file allocation problem, which considers file reallocations only. While dynamic file allocation problems began to receive some attention more than a decade ago, research on file migration in distributed computer systems started only recently. The importance of the file migration problem is, however, increasing rapidly as distributed information systems, such as those for managerial, manufacturing and hospital applications, are beginning to
evidence more and more dynamic changes in file accessing activities. To spur research interest in the problem, this article surveys previous research on the problem, provides a detailed comparison of file migration and dynamic file allocation (the closest kin of file migration) problems, and identifies important potential areas of research to support the development of effective file migration policies.

FILE ALLOCATION VERSUS FILE MIGRATION: AN AIRLINE RESERVATION EXAMPLE

Differences between file allocation and file migration in distributed computer systems must be understood when deciding which option to use. The differences and the problem-solving tradeoffs can be best illustrated by an example of an airline reservation system. In all computerized airline reservation systems, information regarding schedules, fares, reservations and other basic flight information is frequently requested and changed by transactions generated by passengers and by operational and managerial personnel. A fictitious relational database, AIRLINE, is used in this example to maintain all flight and reservation data. It has the simplified schema shown in Table I. Typical references to the database are:

• Inquiries about flight schedules and air fares;
• Initiation, confirmation or cancellation of reservations;
• Ticketing, seat assignment, luggage handling and ordering of special meals;
• Flight and crew scheduling, changes in air fares, routine reports and inquiries about flight information by personnel at all levels.
Since the operations of an airline may span national or international boundaries, transactions directed to the AIRLINE database may originate from widely separated geographic areas. With the rapid price-performance revolution in computing and communication technologies, a distributed computing approach to information processing in this airline reservation system has long been economically viable. This is evidenced by the recent conversions from centralized systems to distributed computer systems for handling reservation information by American Airlines and United Airlines [31].
Static File Allocation

In a distributed system, the allocation of the AIRLINE database can range from a single copy located at the central headquarters, similar to the design of a centralized system, to a fully replicated design which allocates a copy of the database to each branch office. Assuming that the criterion for file allocation involves only system operating costs, the cost level depends upon how well the allocation meets the demand generated by the whole airline operation and upon the maintenance costs associated with a given allocation. For instance, Figure 1(a) and (b) depict two possible allocations of the AIRLINE database. The query and update frequencies for various cities in the example are described in Table II.

TABLE I. A Sample AIRLINE Reservation Database

Relation      Attributes
Flight        flight number, airplane model number, scheduled days, initiation date, expiration date, smoking/nonsmoking boundary, first/economy class boundary, manager-in-charge
Fares         flight number, stop code, departing city, destination city, departure time, arrival time, first class fare, coach fare, discount fare
Crew          flight number, stop code, crew member, crew position
Reservation   flight number, stop code, passenger id, seat assignment, meal type, reservation status, fare class

TABLE II. Query and Update Frequencies of the AIRLINE Database

City        Query frequency†   Update frequency†
Chicago     100                50
Denver      80                 20
Nashville   60                 15
Syracuse    40                 10
Tucson      20                 5

† Transaction frequency = number of transactions per time unit.

Based on these data, the solution in Figure 1(a), in which copies of the AIRLINE database are allocated to Tucson and Syracuse, will result in transfers of data to remote locations for 80 percent of query transactions. Communication between the transaction origination site and both database sites will be incurred for 85 percent of the update transactions. By changing the sites of the AIRLINE database to Chicago and Denver, as in Figure 1(b), the total communication volume is reduced. Now only 40 percent of the queries involve remote communication, and the percentage of two-link communication updates (interaction with both database sites) is reduced to 30 percent. In addition to communication traffic, other determinants of system operating costs include database storage and processing costs, as well as communication costs which depend on the communicating locations and the types of links used for communication.

A database can be fragmented when it is distributed to computing sites. For example, transactions generated in Tucson will most often reference information on flights departing from or arriving at Tucson. In order to reduce the amount of remote communication, the fragment of the AIRLINE database consisting of local (Tucson) flight information could reside in Tucson. Figure 2 shows an allocation of the fragmented AIRLINE database. In allocating fragmented databases, a fragment of a database can be regarded as a file. Allocation of database copies is thereby reduced to the allocation of single file copies.
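The percentages quoted for the two allocations in Figure 1 follow directly from Table II. The sketch below is only an illustration of that arithmetic: the function name is made up here, and it assumes (as the example implies) that a query is remote whenever its originating city holds no copy, and that an update needs two links whenever its originating city holds neither of the two copies.

```python
# Frequencies from Table II (transactions per time unit).
QUERIES = {"Chicago": 100, "Denver": 80, "Nashville": 60, "Syracuse": 40, "Tucson": 20}
UPDATES = {"Chicago": 50, "Denver": 20, "Nashville": 15, "Syracuse": 10, "Tucson": 5}


def remote_shares(copy_sites):
    """Return (% of queries served remotely, % of updates needing two links).

    Illustrative helper, not part of the article: a transaction counts as
    remote when the city that issues it holds no copy of the database.
    """
    copy_sites = set(copy_sites)
    remote_q = sum(f for city, f in QUERIES.items() if city not in copy_sites)
    remote_u = sum(f for city, f in UPDATES.items() if city not in copy_sites)
    return (100 * remote_q / sum(QUERIES.values()),
            100 * remote_u / sum(UPDATES.values()))


print(remote_shares({"Tucson", "Syracuse"}))  # allocation of Figure 1(a)
print(remote_shares({"Chicago", "Denver"}))   # allocation of Figure 1(b)
```

For the Tucson/Syracuse allocation this yields 80 percent remote queries and 85 percent two-link updates; for the Chicago/Denver allocation it yields 40 percent and 30 percent.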
Dynamic File Allocation

Static file allocations are attractive for systems which have a stable level of file access intensities. Systems in which access intensities vary over time but are predictable, mainly as a result of predictable growth or decline in information needs or of periodic differences in usage patterns, can benefit from file allocations that adapt to those changes. In systems with high variability in usage patterns, users experience degraded performance levels and higher costs if file allocations remain static throughout the operational period. To derive a dynamic file allocation based on anticipated changes in file access intensities, file reallocation costs incurred in later stages have to be taken into consideration in the initial design process.

For example, the semi-annual transaction frequencies for the AIRLINE database depicted in Table III indicate significant seasonal variations between the reservation activities in Tucson and Syracuse. To reduce remote communications in transaction processing, a sensible allocation would assign a copy of the database to Chicago, Denver and Tucson during the winter season and to Chicago, Nashville and Syracuse during the summer season. This allocation requires file reallocations to take place with season changes, moving database copies to Nashville and Syracuse. However, the reallocation cost can be lowered if an additional copy resides in Nashville from November through April and in Denver from May through October. The savings represent the file transfer cost for the Nashville copy and the transmission of query requests from Denver during summer seasons and from Nashville during the winter. Additional costs involve the storage costs of the additional file copy and the update overhead to keep it consistent with other copies. Such tradeoffs can all be considered by dynamic file allocation models, leading to file allocation and reallocation policies for a planning horizon with varying activity levels.
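The seasonal choice of copy sites in this example can be checked with a short sketch. It uses the Table III query frequencies and, as a deliberate simplification, ranks three-copy allocations by remote query volume only; the update, storage and reallocation costs that dynamic file allocation models also weigh are ignored. The function names are illustrative and do not come from the article.

```python
from itertools import combinations

CITIES = ["Chicago", "Denver", "Nashville", "Syracuse", "Tucson"]

# Query frequencies per season, taken from Table III.
QUERIES = {
    "Nov-Apr": dict(zip(CITIES, [100, 80, 60, 20, 80])),
    "May-Oct": dict(zip(CITIES, [100, 60, 80, 80, 20])),
}


def remote_queries(season, copy_sites):
    """Queries per time unit issued from cities that hold no copy."""
    return sum(f for city, f in QUERIES[season].items() if city not in copy_sites)


def best_sites(season, copies=3):
    """Exhaustively pick the copy sites that minimize remote query volume."""
    return min(combinations(CITIES, copies),
               key=lambda sites: remote_queries(season, sites))


for season in QUERIES:
    sites = best_sites(season)
    print(season, sites, remote_queries(season, sites))
```

Under these assumptions the winter choice is Chicago, Denver and Tucson and the summer choice is Chicago, Nashville and Syracuse, matching the allocation described in the text. Exhaustive enumeration is workable only because there are five sites; it is not a substitute for the optimization models surveyed later.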
However, dynamic file allocation problems are more difficult to solve than their static counterparts because of the larger state space needed to model the dynamic allocation options.

File Migration in Distributed Computer Systems

The dynamic file allocation example described in the previous section considered only file reallocations. File reallocations are costly, generate system-wide delays and require prior knowledge of potential changes in system activity levels. They are applicable only when changes in file access intensity are observed over a significant portion of the system and remain in effect for long periods of time (months or years). File migration operations, on the other hand, involve only a single copy of a file, or of clearly defined fragments of a database which can be treated as separate files. They react to significant temporary changes in usage patterns which are difficult to predict beforehand and may extend over shorter periods of time. File migrations which are incorporated as part of a distributed operating system offer additional advantages and should be considered for the following reasons:
(1) Locality of reference is usually characterized by changes in access intensity to files which originate from the same geographic area. In such situations, significant gains may be generated by temporarily assigning a local copy. Single-copy file migration operations are sufficient to achieve this purpose.
(2) Dynamic file reallocations can be staged using file migration operations. Hence, system-wide adjustments of file allocations can still be accomplished by a sequence of single file migrations.
(3) Dynamic file reallocation models rely heavily on prior knowledge regarding potential usage patterns of the system databases. File migrations, when incorporated as part of the distributed operating system, are not very sensitive to prior estimates of system usage patterns. They automatically react to temporary changes in access intensity by making the necessary adjustments in file locations without human management or operator intervention.

[FIGURE 1. Two Viable Allocations of the AIRLINE Database. Panels (a) and (b) show the sites Chicago, Denver, Nashville, Syracuse and Tucson. The legend symbols denote a computing site consisting of computing, memory and peripheral devices as well as logical resources; a copy of the AIRLINE database; and the flow of query and update transactions.]

[FIGURE 2. An Allocation of the Fragmented AIRLINE Database. The additional legend symbol denotes a fragment of the AIRLINE database; refer to Figure 1 for the rest of the symbols.]

The following example demonstrates the savings offered by file migration. A few hours before and after a flight stops at an airport, a surge in access activity from the airport to the appropriate portions of the reservation database is observed. During those periods, if the requested database fragment is remotely located, a large volume of long distance traffic will be generated. When the amount is substantial enough, the system can benefit from migrating the referenced database fragment to the cities at which a flight stops a few hours before the flight arrives there. For instance, a flight that leaves Syracuse heading for Tucson stops over at Chicago for departing and connecting passengers. The database for the reservation system may reside at Chicago to be near the headquarters. However, a few hours before the flight leaves Syracuse, intensive queries and updates of reservation, ticketing, seat assignment, baggage handling, crew and fare information are likely to be generated from Syracuse. The numbers of various transactions and the volume of communication generated by them are described in Table IV. Overall, 58.4 Kbytes of communication can be saved at the cost of file migration for each leg of this flight, since the transactions can be processed locally (in Syracuse) after the fragment migrates to Syracuse. Taking into consideration that the same pattern repeats for every leg of this flight, the net savings offered by file migration add up to 76.8 Kbytes for the entire flight. The costs and savings of file migration for this example are depicted in Table V.

A typical airline has a few thousand trip-legs per day. Even if the savings per leg are only one dollar, they can add up to a few million dollars per year. It is clear from the net savings that the airline reservation system can benefit significantly from file migration. In view of this potential, file migrations can be incorporated in distributed systems provided that they are closely monitored and are invoked whenever they can generate benefits.

The above example is a simplified one and is used mainly to illustrate part of the tradeoffs involved. In reality, references to the migrating fragment from other locations should also be considered in determining the net savings generated by file migrations. Furthermore, if the migrating fragment is only a duplicate of the existing copy at Chicago, then occasional communication between the multiple copies of the fragment for update transactions has to be taken into account.
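The tradeoff in this example reduces to a break-even comparison: migrate a fragment only when the remote traffic it avoids exceeds the cost of moving it plus whatever traffic the move itself creates. The sketch below illustrates that comparison under stated assumptions. The function names and every number in it are hypothetical; they are not the values of Tables IV and V, and the three thousand legs per day is just one reading of "a few thousand".

```python
def net_saving_kb(remote_traffic_kb, migration_cost_kb, residual_traffic_kb=0.0):
    """Traffic avoided by serving a burst locally, net of migration overhead.

    residual_traffic_kb stands for traffic that remains after the move, for
    example references from other sites or updates propagated between copies.
    """
    return remote_traffic_kb - residual_traffic_kb - migration_cost_kb


def should_migrate(remote_traffic_kb, migration_cost_kb, residual_traffic_kb=0.0):
    """Migrate only when the net saving is strictly positive."""
    return net_saving_kb(remote_traffic_kb, migration_cost_kb, residual_traffic_kb) > 0


def annual_savings(legs_per_day, saving_per_leg, days_per_year=365):
    """Scale a per-leg saving to a yearly figure."""
    return legs_per_day * saving_per_leg * days_per_year


# Hypothetical burst: 60 Kbytes of remote traffic avoided, 20 Kbytes to migrate.
print(net_saving_kb(60.0, 20.0), should_migrate(60.0, 20.0))
# Hypothetical airline: 3000 trip-legs per day at one dollar saved per leg.
print(annual_savings(3000, 1.0))
```

The residual term is where the caveats from the text enter: references from other locations and update traffic between duplicate copies both reduce the net saving and can reverse the decision.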
In brief, the tradeoffs involved in file migration are extremely complex. The derivation of a file migration policy thus requires elaborate analysis of the impact of file migration on overall system performance.

TABLE III. Seasonal Query and Update Frequencies of the AIRLINE Database

            November-April          May-October
City        Queries†   Updates†     Queries†   Updates†
Chicago     100        50           100        50
Denver      80         20           60         15
Nashville   60         15           80         20
Syracuse    20         5            80         20
Tucson      80         20           20         5

† Transaction frequency = number of transactions per time unit.

PAST RESEARCH ON RELATED PROBLEMS

As can be seen from the airline reservation example, file migration problems evolve from static and dynamic file allocation problems. Although modeling of file migration problems is more complex, the analytic and solution techniques employed for static and dynamic file allocation models still provide a foundation for developing the models and solution procedures for file migration problems. This section reviews past research on file allocation and file migration in distributed computer systems in order to identify the important research and implementation issues.

Static File Allocation

Static file allocation problems (static FAP) have received substantial attention in the literature. Based on
TABLE VI. A Summary of Dynamic File Allocation Models

Non-adaptive models

Segall [36]
  Input†: Continuous access rates or their distribution
  File allocation time points: Unfixed/continuous
  File copies allowed: Single
  Dependencies incorporated: None
  Decision criteria: File storage/comm. costs
  Model formulation: Markov decision
  Solution procedures: Sequential algorithm
  Worst-case time complexity: NP-complete
  Output: Non-stationary file transfer control policies
  Optimal solutions: Yes
  Remarks: A less general discrete-time model was also presented

Levin & Morgan [24]
  Input†: Discrete T-period access rates
  File allocation time points: Fixed/discrete
  File copies allowed: Multiple
  Dependencies incorporated: Program/data
  Decision criteria: File storage/comm. costs
  Model formulation: Dynamic programming
  Solution procedures: Branch-and-bound
  Worst-case time complexity: NP-complete
  Output: T-period file assignments
  Optimal solutions: Yes
  Remarks: A less general one-period look-ahead model was also developed

Segall & Sandell
  Input†: Access rate distribution
  File allocation time points: Unfixed/continuous
  File copies allowed: Single
  Dependencies incorporated: Distributed control pattern
  Decision criteria: File storage/comm. costs
  Model formulation: Signal in white Gaussian noise model
  Solution procedures: Sequential algorithm
  Worst-case time complexity: NP-complete
  Output: Non-stationary file transfer control policies
  Optimal solutions: Yes
  Remarks: This is the dynamic file assignment model for the information-exchange pattern. The solution for the information-no exchange model cannot be calculated.

Adaptive models

Levin [23]
  Input†: Current and past access patterns
  File allocation time points: Fixed/discrete
  File copies allowed: Multiple
  Dependencies incorporated: None
  Decision criteria: To be decided upon implementation
  Model formulation: To be decided upon implementation
  Solution procedures: To be decided upon implementation
  Worst-case time complexity: Not explicit
  Output: Reallocation decision/policy
  Optimal solutions: No
  Remarks: The model is only a two-step process for dynamic file allocation where the actual formulations and solution procedures for each step were not addressed

Ames & Foster
  Input†: Actual and assessed access rates of previous period
  File allocation time points: Fixed/discrete
  File copies allowed: Multiple
  Dependencies incorporated: Program/data
  Decision criteria: File storage/comm. costs
  Model formulation: Exponential smoothing/integer programming
  Solution procedures: Calculation of linear eqs. and solving the one-period look-ahead model ([24])
  Worst-case time complexity: NP-complete
  Output: One-period reallocation decision/policy
  Optimal solutions: No
  Remarks: It can be extended to a T-period adaptive model

Yu et al. [42]
  Input†: Data requirement of actual or simulated transactions
  File allocation time points: Unfixed/discrete
  File copies allowed: Multiple
  Dependencies incorporated: Interaction of files
  Decision criteria: Comm. costs
  Model formulation: Integer programming
  Solution procedures: Two-phase add-drop algorithm
  Worst-case time complexity: Linear
  Output: (New) file assignment
  Optimal solutions: No
  Remarks: Reallocation costs were not considered

† In addition to other system and cost parameters.
dynamic in that file reallocations are restricted to the beginning of each period. Most recent research on the dynamic FAP investigated adaptive models which yield lower computational complexity. This is achieved by restricting the reallocations to single-file reallocations only. These adaptive models are not well suited for real-time operation because, to control file allocations dynamically during system operation, they derive file reallocations from static file allocation models, which are NP-complete. To improve the applicability of the research results on dynamic FAPs, it is necessary to study the problem structure under realistic schemes for file relocations, in conjunction with effective control mechanisms, and to develop specialized heuristics for practical implementations.
File Migration

As mentioned earlier, the primary difference between dynamic FAP and file migration problems is in the operations used to change a file assignment. Dynamic FAP considers file reallocations which might involve reallocating multiple copies of a file. Such major changes could lead to system-wide interruptions of information services. In file migration problems, migration operations which create, delete or transfer a single file copy are used to change the file assignment. Since migration operations take a relatively short time to complete, the operations can be incorporated into the distributed operating system and their control automated without offline monitoring and decision-making. In recent distributed computer systems, especially local-area distributed systems such as IBIS [38], ROE [8], LOCUS [32] and STORK [30], file migration has been incorporated at the operating system level. Automated control of file migration in distributed computer systems, however, was not investigated until the first study by Porcar [33]. In what follows, we summarize the few studies on the development of automated file migration policies in distributed computer systems.

Evaluation of file migration policies in computer networks was first undertaken by Porcar [33]. Both optimal and suboptimal single-copy file migration policies were investigated. A semi-Markov decision model was used to obtain optimal single-copy policies. When studying the more ambitious multi-copy policies, only suboptimal policies were evaluated. The optimal policies are derived based on the distribution of interreference patterns assessed beforehand, and therefore are non-adaptive. The optimal model is similar to Segall's model [36] with the exception that Porcar's model assumes an infinite time horizon and considers only demand policies. A demand policy is activated only when a file is demanded.
The suboptimal policies are also predetermined independent of the assessed system characteristics. Simulation experiments have been used to evaluate the performance of suboptimal policies in selecting a migration control policy.

When addressing file placement in distributed computer systems, Wah paid special attention to the problem of file migration [41]. He proved that the problem of determining when to migrate multiple copies of a single file to minimize system costs is NP-hard, by showing that the corresponding decision problem is reducible from the knapsack decision problem. A simple heuristic was proposed to permit limited migration in the locality of a specific computing site and to determine whether migration should take place. The decision to migrate is triggered at a file site by changes in access intensities from the local site, while the intensities of remote accesses generated from other sites remain the same. Whenever file migration is necessary, the static file allocation algorithm is called to find the optimal relocation of the file copy in its locality.

Hać [18] discussed the impact of file replication, file transfer and process transfer policies on the performance of a simple database system in a local area network environment. The interrelationship among these policies and the problems of data access, concurrency control, transaction serialization and deadlock control were addressed in the discussion. An algorithm for improving system performance by file replication and by file and process transfer was presented. The algorithm can be used at system initiation, when new files or processes are created, or when significant changes in system performance are detected; it therefore can be regarded as either non-adaptive or adaptive. The algorithm is not very explicit about how the locations of files and processes are determined when migration is considered, or about the dependency of migration delay on the origination and destination of a migration operation.

Liu Sheng [25] examined the most general file migration problem, in which the complete variety of file migration operations (creations, transfers and deletions) can be performed at any point in time when a file is referenced (i.e., on-demand file migration). Furthermore, file reallocations at the beginning of each time period were also considered, making the file management policy over the entire horizon complete. Based on the recursive relationship between successive migration decisions, the optimization model for optimal migration/reallocation policies is formulated as a Markov decision model. Although the optimal policies will not change until system redesign/reorganization, the execution of the optimal policies is dependent upon actual system conditions observed during system operations. Hence, the policies adapt to some extent to system changes. Such migration policies are referred to as semi-adaptive throughout this article. The main disadvantage of this approach is high computational complexity due to the very large state space needed to derive the overall optimal policies.
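The recursive relationship between successive migration decisions can be illustrated with a toy backward recursion. The sketch below is not the model of [25] or [33]: it assumes a single file copy, two sites, transfer-only migration on demand, a fixed number of remaining references, and made-up costs and transition probabilities. All names and numbers are hypothetical.

```python
from functools import lru_cache

REMOTE_COST = 1.0    # hypothetical cost of serving one reference remotely
TRANSFER_COST = 3.0  # hypothetical cost of moving the single file copy

# Hypothetical locality of reference: the next reference most likely
# originates from the same site as the current one.
NEXT_SITE = {
    "Syracuse": {"Syracuse": 0.9, "Chicago": 0.1},
    "Chicago": {"Syracuse": 0.1, "Chicago": 0.9},
}


@lru_cache(maxsize=None)
def plan(copy_at, requester, refs_left):
    """Return (expected cost, action) for the current reference.

    refs_left counts the current reference; the recursion works backward
    from refs_left == 0, where no further cost is incurred.
    """
    if refs_left == 0:
        return 0.0, "stop"

    def future(new_copy_at):
        # Expected cost of the remaining references given where the copy ends up.
        return sum(p * plan(new_copy_at, nxt, refs_left - 1)[0]
                   for nxt, p in NEXT_SITE[requester].items())

    if copy_at == requester:
        return future(copy_at), "local"
    stay = REMOTE_COST + future(copy_at)      # serve remotely, leave the copy
    move = TRANSFER_COST + future(requester)  # migrate the copy to the requester
    return (move, "migrate") if move < stay else (stay, "remote")


print(plan("Chicago", "Syracuse", 4)[1])  # remote
print(plan("Chicago", "Syracuse", 5)[1])  # migrate
```

With these numbers, migration pays off only when enough references remain to amortize the transfer cost. Even in this toy case the state must record both the copy location and the requesting site; allowing multiple copies, creations and deletions multiplies the states, which is the source of the complexity noted above.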
In view of the NP-complete time complexity of optimal file migration models and the low accuracy of preassessed access rates, Liu Sheng [25] developed an adaptive heuristic for individual file migration decisions. Unlike most of the adaptive models, the proposed heuristic, employing greedy principles, decides on file creation or file deletion only when the file is referenced. The model takes into consideration the observed system state and the impact of limited future file migration operations. Based on the same considerations, a greedy heuristic for generating an initial file allocation was also developed to provide a complete control mechanism for file management over time.

A summary of the file migration models is provided in Table VII. Most of these models simplify the problem by allowing only a single file copy in the system [33] or by employing heuristics to resolve file migration decisions [18, 25]. Non-adaptive and semi-adaptive models guarantee optimal migration policies when provided with perfect estimates of future access rates [25, 33]. Adaptive models [18, 25, 41], on the other hand, have the potential to react to unexpected system changes with low computational time and close to optimal improvement. Of the developed models, only Liu Sheng [25] analyzed optimization models for the general migration problem. As a result, the proposed model has a high computational complexity, which limits its applicability to large systems. However, its in-depth analysis of short-term and long-term impacts of file migrations provides insight into developing simple heuristic rules of file migration [25] for large-scale applications.

TABLE VII. A Summary of File Migration Models

Non-adaptive models

Porcar [33]
  Input¹: Interreference time distribution
  File migration time points: Unfixed/continuous
  Types of migration considered: File transfer
  File copies allowed: Single
  Dependencies incorporated: None
  Time horizon: Infinite
  Decision criteria: File storage/comm. costs
  Model formulation: Semi-Markov decision
  Solution procedure: Policy iteration algorithm
  Worst-case time complexity: NP-complete
  Output: Stationary file transfer control policies
  Optimal solutions: Yes
  Remarks: File migrations are restricted to file transfers only

Semi-adaptive models

Liu Sheng [25]
  Input¹: Discrete T-period access rates
  File migration time points: Unfixed/continuous
  Types of migration considered: File creation², file transfer, file deletion
  File copies allowed: Multiple
  Dependencies incorporated: None
  Time horizon: Finite/T-period
  Decision criteria: File storage/comm. costs
  Model formulation: Markov decision programming
  Solution procedure: Value iteration algorithm
  Worst-case time complexity: NP-complete
  Output: T-period file migration policies/initial allocation
  Optimal solutions: Yes
  Remarks: File deletion can be performed at the same time as file creation or file transfer

Adaptive models

Wah [41]
  Input¹: Access rates in the locality of a node
  File migration time points: Fixed/discrete
  Types of migration considered: File transfer
  File copies allowed: Multiple
  Dependencies incorporated: None
  Time horizon: Finite/one-period
  Decision criteria: Comm. costs
  Model formulation: Linear cost function/static allocation model⁴
  Solution procedure: Calculation of cost functions/not addressed⁴
  Worst-case time complexity: Linear/not explicit⁴
  Output: Migration decision/new location of the copy in a node's locality
  Optimal solutions: No
  Remarks: File migrations are restricted to file transfers only

Hać [18]
  Input¹: Transaction frequencies, workload, time required and file reference probability
  File migration time points: Unfixed/continuous
  Types of migration considered: File transfer, file creation², process transfer
  File copies allowed: Multiple
  Dependencies incorporated: Process/file
  Time horizon: Infinite³
  Decision criteria: Utilization of system resources and other system performance measures such as response time and throughput
  Model formulation: Iterative algorithm incorporating maximizations, comparisons and queueing models
  Solution procedure: Embedded in the model
  Worst-case time complexity: Quadratic
  Output: File replication/transfer and process transfer policy
  Optimal solutions: No
  Remarks: It can be regarded as a non-adaptive model when the input does not incorporate recent system changes

Liu Sheng [25]
  Input¹: Actual past access rates, assessed future access rates
  File migration time points: Unfixed/continuous
  Types of migration considered: File creation², file deletion
  File copies allowed: Multiple
  Dependencies incorporated: None
  Time horizon: Finite/one-period
  Decision criteria: File storage/comm. costs
  Model formulation: Iterative greedy algorithm
  Solution procedure: Calculation of forecast and cost functions
  Worst-case time complexity: Linear
  Output: Single file migration decisions
  Optimal solutions: No
  Remarks: File creation and file deletion cannot be implemented at the same time

¹ In addition to other system and cost parameters.
² A new copy of the file is created and transferred to a selected location.
³ Implied from the use of stationary statistics for performance measures.
⁴ One part for determining whether to migrate and the other part for how to migrate.

COMPARISON OF DYNAMIC FILE ALLOCATION AND FILE MIGRATION MODELS

Each migration operation deals with only a single file copy. As a result, an individual file migration operation
February 1990
Volume 33
Number 2
might be less effective than a complete file reallocation in improving system performance. However, selecting an optimal or near-optimal single operation is less complex than determining complete file reallocations. Hence, file migration can be invoked more frequently, thereby responding to system changes more rapidly than file reallocations. To illustrate the merits and disadvantages of automated file migration control over dynamic file allocation policies, comparable file migration and dynamic file allocation models for the more general system environment are compared in this section.

Among the dynamic file allocation and migration models summarized in Tables VI and VII, only Levin/Morgan’s dynamic allocation model [24] and Liu Sheng’s file migration model [25] generate optimal policies for general multiple-copy distributed database systems and are therefore readily comparable. These two models will be used to demonstrate the principal modeling differences and the improvements that can be realized from their automated control policies.

Both Levin/Morgan’s and Liu Sheng’s models were formulated as Markov decision models using dynamic programming techniques. The main differences between them are in the definition of decision points for policy changes and the representation of system state at a decision point, which result in a marked difference in the robustness of the two models [25]. Specifically, file reallocations are restricted to the beginning of each time period by both models. However, creations, transfers and deletions of a single file copy are permitted whenever the file is referenced in Liu Sheng’s model. In contrast, the file reallocation policies generated by Levin/Morgan’s model depend on discrete time (period) only, and therefore their performance is extremely vulnerable to unexpected deviations from the assessed file usage levels within a period.
On the other hand, file migration policies generated by Liu Sheng’s model are both (continuous) time and system state dependent, so that a file migration operation is not executed unless its underlying system condition and time frame are present in the system.

System costs resulting from the policies generated by the two models were numerically compared in [25] over an extensive set of examples based on Casey’s five-site system [3]. Casey’s five-site system was selected because it is small enough to allow exact testing of the two models. The comparison showed that dynamic file migration policies were able to generate about 10 percent cost savings over dynamic file allocation policies in most cases, and the savings increased as the number of time periods or the length of each time period increased. Other properties of file migration policies examined in [25] include the weak dependency of the effectiveness of file migration on the initial file allocation at system initiation/reorganization; the increase of cost improvements with the degree of reference locality; and the low level of sensitivity to deviations of estimated access patterns from the realized access requests. Those properties demonstrate the robustness
of optimal file migration procedures. Interested readers may refer to [25] for an in-depth numerical comparison of the two models as well as for a discussion and examples of those properties.

While the computational complexity of Liu Sheng’s model is higher than that of Levin/Morgan’s model by a factor proportional to the average number of file access requests, both models belong to the NP-complete family. Large distributed systems, typically with tens of sites, may not find it practical to adopt either model. However, file migration operations are expected to be particularly effective in large systems in that they generate fewer interruptions and provide automatic corrections to deficiencies in file allocation caused by changes in usage levels across portions of a system. Dynamic file allocation approaches, on the other hand, are less appropriate for large systems since system-wide changes in database usage levels are less likely to occur frequently and total system interruptions by file reallocations are less tolerable.

It is, therefore, important to find reliable approaches to satisfactory control of file migration in large systems. One of the solutions may very well be polynomially-bounded heuristics. Our own experience with adaptive file migration heuristics [25] indicated that the cost improvement generated by heuristics over dynamic file allocations is satisfactory (close to 10 percent savings) in some cases and negligible (less than 1 percent improvement) in others. Continued research is needed to develop effective real-time heuristics for controlling file migration in large systems.

In summary, file migration models are typically more robust than dynamic file allocation models, while dynamic file allocation models have a slight advantage over file migration models in their computational complexity. This tradeoff is mainly inherent in the nature of the two problems.
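The dynamic programming view that underlies this comparison can be made concrete with a deliberately small example. The following is a minimal sketch, not the formulation of Levin/Morgan [24] or Liu Sheng [25]: it assumes a single file copy, known per-period access counts, and decisions only at period boundaries. The function name `plan_single_copy` and the `access`, `comm` and `move` tables are hypothetical names introduced for illustration. Restricting the problem to one copy keeps it polynomial, unlike the general multiple-copy case discussed above.

```python
from typing import List, Tuple


def plan_single_copy(
    access: List[List[float]],  # access[t][i]: requests issued by site i in period t
    comm: List[List[float]],    # comm[i][k]: cost per request from site i to a copy at k
    move: List[List[float]],    # move[j][k]: one-time cost of transferring the copy j -> k
    start: int,                 # site holding the copy before period 0
) -> Tuple[float, List[int]]:
    """Backward induction over periods; returns (minimum cost, site per period)."""
    periods, sites = len(access), len(comm)
    # cost_to_go[j]: cheapest cost of the remaining periods if the copy sits at j
    cost_to_go = [0.0] * sites
    choices: List[List[int]] = []
    for t in range(periods - 1, -1, -1):
        # cost of serving all requests of period t from a copy at site k
        serve = [sum(access[t][i] * comm[i][k] for i in range(sites))
                 for k in range(sites)]
        new_cost, new_choice = [], []
        for j in range(sites):
            best = min(range(sites),
                       key=lambda k: move[j][k] + serve[k] + cost_to_go[k])
            new_choice.append(best)
            new_cost.append(move[j][best] + serve[best] + cost_to_go[best])
        cost_to_go = new_cost
        choices.append(new_choice)
    choices.reverse()
    # roll the stored decisions forward from the starting site
    path, site = [], start
    for t in range(periods):
        site = choices[t][site]
        path.append(site)
    return cost_to_go[start], path


# Toy data: usage shifts from site 0 to site 1 after the first period.
cost, path = plan_single_copy(
    access=[[10, 0], [0, 10], [0, 10]],
    comm=[[0, 1], [1, 0]],
    move=[[0, 3], [3, 0]],
    start=0,
)
```

In this toy instance the plan keeps the copy at site 0 for the first period and transfers it once usage shifts. Because the plan depends on time only, it inherits the weakness attributed above to period-based policies: if actual usage deviates from `access`, the stored decisions are no longer matched to the system state.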
Another consideration is the amount of data required to represent the optimal file migration policy. Specifically, the optimal policy generated by Liu Sheng’s model specifies the optimal time-dependent migration action to take for all feasible system states, and the amount of data becomes substantial for a large system. In such cases, the optimal policy may have to be stored on external storage devices. Its automated implementation, then, requires the employment of some direct accessing scheme, e.g., hashing or indexing, to search for the optimal action based on the system state at each decision point.

FILE MIGRATION: PRESENT STATE AND FUTURE DIRECTIONS
To provide timely and cost-effective information services in an ever-changing environment, distributed computer systems have to operate under automated control of file allocation, concurrency control and query processing, without interruptions from offline administrators when unexpected changes occur. File migration operations represent a way of adjusting file allocations to react to both temporary and long-term
Communications of the ACM
changes in file usage patterns. They are more effective in distributed systems, particularly large systems, that experience short bursts of busy file access activities. With the cost/capacity revolution in data communication networks, such operations have recently become technically viable [8, 30, 32, 38]. Since their execution requires relatively small amounts of time, file migration operations can be incorporated as part of the distributed operating system using automated control policies without offline monitoring and decision-making.

Since the early 1980s, experimental distributed computer systems, especially local-area based distributed systems such as IBIS [38], ROE [8], LOCUS [32] and STORK [30], have supported file migration operations as part of the system or as user options for remote information retrieval. In most cases, the migration operations are either implemented on an ad hoc basis or subject to user control, realizing only a fraction of the potential benefits of file migration.

Automated control of file migration in a distributed computer system requires the development and implementation of mechanisms to determine the control policy before the system is initiated/reinitiated or in real time during system operation. Since the initial introduction of distributed file migration, very few studies of automated file migration control [18, 25, 33, 41] have been conducted. Although some of the developed models [18, 25] attempted to treat the problem in a general system environment, none of the studies considered realistic settings of migration control. As initial studies on the problem, however, they provide insights into the development of more practical mechanisms for automated file migration. Continued research is essential to the development of effective and practical control mechanisms for file migration in a variety of distributed environments.
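A control policy determined before the system is initiated has to be consulted at each decision point during operation, which is where a direct accessing scheme such as hashing comes in. The sketch below is only an illustration under stated assumptions: the state encoding (period, set of sites holding a copy, referencing site), the action tuples, and the names `lookup_action` and `apply_action` are all hypothetical and are not taken from any of the cited models. A Python dictionary stands in for the hashed policy store.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical encodings chosen for this sketch.
StateKey = Tuple[int, FrozenSet[int], int]  # (period, sites holding a copy, referencing site)
Action = Tuple  # ("none",), ("create", site), ("delete", site) or ("transfer", src, dst)


def lookup_action(policy: Dict[StateKey, Action], period: int,
                  copies: Iterable[int], site: int) -> Action:
    """Hash the current decision point into the stored policy table."""
    return policy.get((period, frozenset(copies), site), ("none",))


def apply_action(copies: Iterable[int], action: Action) -> Set[int]:
    """Return the copy set after one single-copy migration operation."""
    result = set(copies)
    kind = action[0]
    if kind == "create":
        result.add(action[1])
    elif kind == "delete" and len(result) > 1:  # never drop the last copy
        result.discard(action[1])
    elif kind == "transfer" and action[1] in result:
        result.discard(action[1])
        result.add(action[2])
    return result


# A two-entry toy policy: replicate to site 2 when it references the file in
# period 0, then retire the copy at site 0 when site 2 references it in period 1.
policy = {
    (0, frozenset({0}), 2): ("create", 2),
    (1, frozenset({0, 2}), 2): ("delete", 0),
}
copies = {0}
copies = apply_action(copies, lookup_action(policy, 0, copies, 2))
copies = apply_action(copies, lookup_action(policy, 1, copies, 2))
```

Any decision point that is missing from the table falls back to `("none",)`, so an operation runs only when both its time frame and its system state are present, which mirrors the time and state dependence described for migration policies earlier in the article. The table grows with the number of feasible states, which is the storage concern raised above.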
The following directions for future research are suggested:

Developing simple file migration heuristics with low overhead. Optimization models require a high computational effort even for moderate size problems. Moreover, an optimal policy is no longer optimal when system activity deviates from the forecasted level. Optimization analysis is used mainly as a means of gaining insight into a complex problem. For actual implementations, heuristics with low computational complexity tend to be a better choice for large applications. Therefore, the emphasis of future research should be on the investigation of effective heuristic rules and the associated implementation considerations in various file migration environments.

Addressing file migration problems in local-area based distributed systems. Network topologies have a significant impact on the effectiveness of file migration in distributed systems. For example, the broadcasting communication property inherent in an Ethernet-class network may suggest that a full replication policy would be optimal for file allocations in Ethernet-based distributed systems. The structure of optimal file migration policies should be studied for specific network topologies.

Analyzing the impact of capacity constraints and reliability/availability requirements on file migration. Most of the studies of file migration problems have not taken into consideration capacity constraints and reliability/availability requirements in model development. Because these essential factors vary with system failure patterns, recovery procedures and consumption of system resources, analysis of their impact is an important direction for future system investigation.

Incorporating performance analysis. In addition to operating costs, metrics such as average transaction response time and system throughput are important performance measures of system designs. Models of file migration problems that take into consideration the stochastic behavior of such systems are an important area for investigation. Elaborate queueing analysis will be required to model such problems.

Examining file migration problems where references to different files can be interrelated. In many systems, a transaction may make it necessary to reference multiple files. The interrelationships among the references to different files have an impact on file migration policies. Analysis of systems where interrelated file references occur frequently seems to be an open research area.

Developing decentralized/distributed algorithms for controlling file migration activities. Existing file migration heuristics and algorithms are centralized in nature in that they require system-wide exchanges of status information, increasing communication overhead and reducing system reliability by requiring the availability of the centralized controlling sites. To reduce system overhead and increase reliability, future research efforts could focus on developing decentralized/distributed algorithms for file migration which utilize localized system information only.
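To give a feel for what a low-overhead rule using localized information only might look like, here is a minimal sketch of a threshold rule evaluated at one site when the file is referenced there. It is an illustrative assumption, not the heuristic of [25] or of any other cited study: the function `decide`, its parameters, and the cost model (a read benefit weighed against update propagation, storage and creation costs) are hypothetical.

```python
def decide(reads: int, updates: int, has_copy: bool, is_last_copy: bool,
           read_cost: float, update_cost: float,
           storage_cost: float, create_cost: float) -> str:
    """Greedy local rule: returns "create", "delete", "keep" or "none".

    reads        -- references issued at this site in the current window
    updates      -- updates from elsewhere that a local copy must absorb
    is_last_copy -- True if this site holds the only copy in the system
    """
    benefit = reads * read_cost                      # remote accesses avoided
    upkeep = updates * update_cost + storage_cost    # cost of holding a copy
    if has_copy:
        if is_last_copy or benefit >= upkeep:
            return "keep"
        return "delete"
    # creating a copy must also pay for the one-time transfer
    return "create" if benefit > upkeep + create_cost else "none"


# Example: a read-heavy site without a copy, then an update-heavy site with one.
first = decide(10, 1, False, False, 1.0, 2.0, 1.0, 4.0)
second = decide(2, 5, True, False, 1.0, 2.0, 1.0, 4.0)
```

The rule costs a handful of arithmetic operations per reference and needs only counters kept at the referencing site, plus the knowledge of whether its copy is the last one. Whether such a rule comes close to an optimal policy is exactly the kind of question the research directions above leave open.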
In conclusion, to bring dynamic file migration to real-world installations of distributed computer systems, such implementation issues as dynamic file directory management, data compression techniques for reliable and fast file transmission, and design issues for the application layer of network management to effectively incorporate file migration operations need to be addressed. Due to the proliferation of local- and wide-area networking and the development of distributed operating systems and applications, we anticipate increased commercial and academic interest in these issues. These problems will become fruitful areas for further investigation. In addition, in view of the diversity in commercial applications of distributed information systems, research in file migration control must continue to examine the impact of new technologies such as hypertext, integrated voice, image and data service, optical disks, parallel architectures, neural networks and distributed artificial intelligence on the design of dynamic file migration in distributed computer systems.
REFERENCES
1. Ames, J.E., and Foster, D. Dynamic file assignment in a star network. In Proceedings of Computer Networks Symposium (Gaithersburg, Md., 1977), pp. 36-40.
2. Baumol, W.J., and Wolfe, P. A warehouse location problem. Oper. Res. 6, 2 (1958), 252-263.
3. Casey, R.G. Allocation of copies of a file in an information network. In Proceedings of the Spring Joint Computer Conference, AFIPS (1972), pp. 617-625.
4. Ceri, S., Martella, G., and Pelagatti, G. Optimal file allocation in a computer network: A solution method based on the knapsack problem. Comput. Netw. 6 (1982), 345-357.
5. Chen, P.P., and Akoka, J. Optimal design of distributed information systems. IEEE Trans. Comput. C-29, 12 (Dec. 1980), 1068-1080.
6. Chu, W.W. Optimal file allocation in a multicomputer information system. IEEE Trans. Comput. C-18, 10 (Oct. 1969), 885-889.
7. Dowdy, L.W., and Foster, D.V. Comparative models of the file assignment problem. Comput. Surv. 14 (June 1982), 287-313.
8. Ellis, C., and Floyd, R. The ROE file system. In Proceedings of the 3rd Symposium on Reliability in Distributed Software Systems (Oct. 1983), pp. 175-181.
9. Eswaran, K.P. Placement of records in a file and file allocation in a computer network. In Information Processing 74. IFIPS, N.Y., 1974.
10. Fisher, M.L., and Hochbaum, D.S. Database location in computer networks. J. ACM 27, 4 (1980), 718-735.
11. Foster, D.V., Dowdy, L.W., and Ames, J.E. File assignment in a computer network. Comput. Netw. 5 (1981), 341-349.
12. Gavish, B. Models for configuring large scale distributed computing systems. AT&T Technical J. 64, 2 (1985), 491-532.
13. Gavish, B. Optimization models for configuring distributed computer systems. IEEE Trans. Comput. C-36, 7 (1987), 773-793.
14. Gavish, B., and Pirkul, H. Allocation of data bases in distributed computing systems. In J. Akoka, Ed., Management of Distributed Data Processing. North-Holland, 1982, pp. 215-231.
15. Gavish, B., and Pirkul, H. Computer and database location in distributed computer systems. IEEE Trans. Comput. C-35, 7 (1986), 583-590.
16. Ghosh, S.P. Distributing a data base with logical associations on a computer network for parallel searching. IEEE Trans. Softw. Eng. SE-2, 2 (June 1976).
17. Grapa, E., and Belford, G.G. Some theorems to aid in solving the file allocation problem. Commun. ACM 20, 11 (Nov. 1977), 878-882.
18. Hac, A. File migration and process migration in a local area network. In Proceedings of INFOCOM 86 (1986), pp. 488-495.
19. Hevner, A.R., and Rao, A. Distributed data allocation strategies. In M.C. Yovits, Ed., Advances in Computers, Vol. 27. Academic Press, 1988, pp. 121-155.
20. Irani, K.B., and Khabbaz, N.G. A methodology for the design of communication networks and the distribution of data in distributed supercomputer systems. IEEE Trans. Comput. C-31, 5 (May 1982), 419-434.
21. Jenny, C.J. Placing files and processes in distributed systems: A general method considering resources with limited capacity. Research Report, IBM Zurich Research Laboratory, Switzerland, 1982.
22. Laning, L.J., and Leonard, M.S. File allocation in a distributed computer communication network. IEEE Trans. Comput. C-32, 3 (1983), 232-244.
23. Levin, K.D. Adaptive structuring of distributed databases. In Proceedings of National Computer Conference (1982), pp. 691-696.
24. Levin, K.D., and Morgan, H.L. A dynamic optimization model for distributed databases. Oper. Res. 26, 5 (Sept.-Oct. 1978), 824-835.
25. Liu Sheng, O.R. Models for dynamic file migration in distributed computer systems. Ph.D. dissertation, William E. Simon Graduate School of Business Administration, University of Rochester, Rochester, New York, 1986.
26. Loomis, M.E.S. Data base design: Object distribution and resource-constrained task scheduling. Ph.D. dissertation, Department of Computer Science, University of California, Los Angeles, 1975.
27. Mahmoud, S., and Riordan, J.S. Optimal allocation of resources in distributed information networks. ACM Trans. Database Syst. 1, 1 (Mar. 1976), 67-78.
28. Morgan, H.L., and Levin, K.D. Optimal program and data locations in computer networks. Commun. ACM 20, 5 (May 1977), 315-322.
29. Murthy, K., Kam, J.F., and Krishnamoorthy, M.S. An approximation algorithm for the file allocation problem in computer networks. In 2nd Symposium on Principles of Data Base Systems (March 1983), pp. 258-266.
30. Paris, J., and Tichy, W.F. STORK: An experimental migrating file system for computer networks. IEEE Infocom, 1983.
31. PCWeek. Connectivity supplemental. 1988.
32. Popek, G., Walker, B., Chow, J., Edwards, D., Kline, C., Rudisin, G., and Thiel, G. LOCUS: A network transparent, high reliability distributed system. In Proceedings of the 8th ACM Symposium on Operating Systems Principles, ACM SIGOPS Operating System Rev. 15, 5 (1981), 47-58.
33. Porcar, H.M. File migration in distributed computer systems. Ph.D. dissertation, Physics, Computer Science and Mathematics Division, Lawrence Berkeley Laboratory, University of California, Berkeley, 1982.
34. Ramamoorthy, C.V., and Wah, B.W. Data management in distributed data bases. In Proceedings of National Computer Conference (1979), pp. 667-680.
35. Ramamoorthy, C.V., and Wah, B.W. The isomorphism of simple file allocation. IEEE Trans. Comput. C-32, 3 (Mar. 1983), 221-232.
36. Segall, A. Dynamic file assignment in a computer network. IEEE Trans. Automatic Control AC-21, 2 (Apr. 1976), 161-173.
37. Segall, A., and Sandell, Jr., N.R. Dynamic file assignment in a computer network-Part II: Decentralized control. IEEE Trans. Automatic Control AC-24, 5 (Oct. 1979), 709-715.
38. Tichy, W., and Zuwang, R. Towards a distributed file system. In Proceedings of the 1984 USENIX Summer Conf. (1984), pp. 87-97.
39. Trivedi, K.S., Wagner, R.A., and Sigmon, T.M. Optimal selection of CPU speed, device capacities and file assignments. J. ACM 27, 3 (July 1980), 457-473.
40. Wah, B.W. An efficient heuristic for file placement on distributed databases. In COMPSAC 80 (Oct. 1980), 462-468.
41. Wah, B.W. File placement on distributed computer systems. IEEE Comput. 17, 1 (Jan. 1984), 23-30.
42. Yu, C.T., Siu, M., Lam, K., and Chen, C.H. Adaptive file allocation in star computer network. IEEE Trans. Softw. Eng. SE-11, 9 (Sept. 1985), 959-965.
CR Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems-distributed databases; D.4.3 [Operating Systems]: File Systems Management-Distributed file systems; H.2.4 [Database Management]: Systems-transaction processing
General Terms: Algorithms, Design, Management
Additional Key Words and Phrases: Dynamic control, file allocation, file migration

ABOUT THE AUTHORS:
BEZALEL GAVISH is a professor of Business Administration at Vanderbilt University. His current research interests include the design and analysis of computer communication networks, design and analysis of distributed computing systems, systems analysis and design, combinatorial optimization, and scheduling and routing in logistic systems. Author’s Present Address: Owen Graduate School of Management, Vanderbilt University, Nashville, TN 37203.

OLIVIA R. LIU SHENG is an assistant professor of Management Information Systems at the University of Arizona and a consultant to the Toshiba Corporation on database design for medical image data. Her current research interests include analysis and design of distributed database and knowledge systems, image data management, automation of systems analysis and design, computer-mediated communication support, distributed group work support and integrating office information systems. Author’s Present Address: Department of Management Information Systems, Karl Eller Graduate School of Management, College of Business and Public Administration, University of Arizona, Tucson, AZ 85722.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.