Computer handling of geographical data

natural resources research

XIII

Computer handling of geographical data

R. F. Tomlinson

H. W. Calkins

D. F. Marble

the unesco press

Computer handling of geographical data

An examination of selected geographic information systems

The Unesco Press

Published in 1976 by The Unesco Press Place de Fontenoy, 75700 Paris Printed by Journal de Geneve ISBN 92-3-101340-8

© Unesco 1976

Printed in Switzerland

Preface

The increasing demand for the storage, analysis and display of large quantities of environmental data, in order to facilitate rational utilization, planning and management of natural resources, has led, in recent years, to rapid developments in the application of computers to environmental and natural resources data handling and the creation of sophisticated information systems.

Within the framework of its activities related to the promotion of modern methods of data acquisition and processing in environmental studies and natural resources research, Unesco has become involved in these developments. It has collaborated with competent bodies of international scientific organizations concerned with different aspects of the research and management of environmental and natural resource information systems, and sponsored a number of expert groups' meetings and symposia to promote international co-operation in the development and utilization of these new data handling techniques.

Various governmental agencies and regional planning bodies in both developed and developing countries have shown interest in the potentialities of computer-aided environmental data handling systems. Many of them are starting to use modern methods to handle natural resources data, but the lack of clear advice and information on the range of techniques available often makes it difficult to select the most suitable approach to be employed in each particular case. It was felt that an impartial appraisal of various techniques described in different sources was needed to facilitate this selection and that beginners in this field would benefit from a comparative study of the different approaches adopted.
In response to these needs, and with the co-operation of Unesco and governmental agencies in Canada and the United States of America, the Commission on Geographical Data Sensing and Processing of the International Geographical Union has initiated and carried out an examination of five major data handling techniques currently used for non-graphic storage and computer-aided manipulation of location-specific environmental data. The geographic information systems examined by the Commission represent different methods of encoding location-specific data in a computer-readable form as well as different encoding procedures. One of them was developed in Canada, and the other four in the USA, in response to certain requirements of federal, state and local government agencies. Although the present study makes no attempt to cover all the environmental information systems that exist in different parts of the world (particularly in Europe), the choice of systems examined reflects objectively the main trends in their development.

In view of its aims and objectives, the study is technical in character and is intended mainly for specialists already acquainted with the principles of computer-aided data handling who are engaged in the organization and development of geographic information systems. It is hoped, however, that it will be of interest to all environmental scientists seeking reliable information on the most advanced achievements in this field with a view to their practical implementation, as well as to governmental agencies collecting and utilizing environmental and natural resources data, university professors and students.

The responsibility for the selection and presentation of the material and for the views and opinions expressed in the book rests with the authors.


Acknowledgements

This examination of selected geographic information systems and digitizing methods for handling environmental data is the first to be carried out by an independent organization. It is an attempt to extract, summarize, and combine the lessons learnt by those who have met the problems in this new and rapidly developing field. The results of their continuing experience may provide guidelines to agencies that face the task of handling increasing amounts of data on the earth's natural resources, and that are contemplating the use of modern data handling methods. The study is therefore written less for a general audience than for the reader who has some knowledge of the purposes and usual functions of geographic information systems and of the terminology that describes them.

The data on which the report is based were gathered by the Commission on Geographical Data Sensing and Processing of the International Geographical Union. The Commission has received substantial technical assistance from various organizations, particularly from the United States Geological Survey, Department of the Interior, Government of the United States of America; State Governments of Minnesota and New York; County of San Diego, California; Oak Ridge National Laboratory, Tennessee; and Calspan Corporation. Major assistance was obtained from the Department of the Environment, Government of Canada. All these organizations have provided data not previously made public, for the purposes of this impartial enquiry.

Many persons have thus contributed to the research incorporated in this report. Some have supplied factual and graphic material, and others have carried out specified experiments, or produced background documentation specifically written for sections of the report, or provided a detailed review of the work produced. Their efforts have been co-ordinated by two working group chairmen, Dr. H. Calkins and Dr. D.
Marble of the State University of New York at Buffalo, who have been responsible for the main body of this text.

The essential finances for the work of the Commission on Geographical Data Sensing and Processing, International Geographical Union, have come from a number of sources. The work on which this report is based was supported in part by funds from the Division of Earth Sciences of the United Nations Educational, Scientific and Cultural Organization (Unesco Grant No. 600-130) and from the United States Geological Survey (USGS Grant No. 14-08-0001-G-67). This fiscal support is greatly appreciated.

Many persons have contributed administrative and technical advice. Dr. James Anderson, Chief Geographer, USGS, and Robert Alexander of the Geography Program, USGS, must be thanked for their guidance in the formative stages of the work. The help and encouragement of the International Geographical Union has also been generously channelled through Professor Chauncy Harris, Secretary-General of the International Geographical Union, University of Chicago.

In terms of technical assistance, it is invidious to single out names when many persons have contributed to the overall objective. I would, however, be incurring their joint displeasure if special appreciation were not expressed to Dr. William Mitchell of the staff of the Geography Program, USGS. In all stages of the work his criticism has been incisive and his patience has been monumental. As Chief of the new Geographic Information Systems Unit, his task has been to tackle the day-to-day problems inherent in developing a rapidly growing technology within government service. Without his continuing assistance and many hours of work related to Commission activity, this report could not have been produced.

Finally, and on behalf of all members of the Commission and the authors and members of the working groups, I would like to express appreciation to Emily Marica, Susan Wierman and Lila Blanchard for invaluable advice and assistance in the production of all aspects of this report. To each of these persons and to those numerous colleagues who have individually and jointly taken part in discussions and provided assistance in many ways, I extend my most sincere thanks.

R. F. T. Ottawa, 1975


Principal Investigator: R. F. Tomlinson

Principal Authors: H. W. Calkins, D. F. Marble

Contributing Authors: A. R. Boyle, K. Dueker, R. Durfee, E. Hardy, G. Lewandowski, L. Maki, W. Mitchell, D. Sinton, D. Steiner, W. Switzer, W. Tobler, J. Veroughstraete, S. Wierman, D. Yaeger

Contents

I GENERAL REVIEW
1. Perspectives for evaluation
2. Essential parts of a geographic information system
3. Lessons to be learned from review of existing systems and digitizing methods

II DESCRIPTION OF SYSTEMS AND EXPERIMENT
4. The Canada Geographic Information System (CGIS)
5. The Polygon Information Overlay System (PIOS)
6. The Minnesota Land Management Information System (MLMIS)
7. The Land Use and Natural Resources Inventory of New York State (LUNR)
8. The Oak Ridge Regional Modelling Information System (ORRMIS)
9. Data encoding experiment

III APPENDICES
1. Canada Geographic Information System: Co-operating federal and provincial government agencies
2. Canada Geographic Information System: Summary of CLI classification codes
3. Canada Geographic Information System: Graphics subsystem
4. Land Use and Natural Resources Inventory of New York State: Classification system
5. Oak Ridge ORRMIS Data Classification System
6. Data Encoding Experiment: Estimates of land use area by tract

Part I

General review

1

Perspectives for evaluation

A geographic information system¹ or development project can be described and evaluated according to several perspectives. However, the objectives of the geographic information system or development project and the purpose of the efforts must first be clearly understood, so that the system can be viewed in the context intended by the individuals or agency sponsoring the system. Several types of systems, or stages of system development, are generally recognized:
1. Research or experimental mode - to demonstrate that a particular data processing or manipulation technique is feasible;
2. Demonstration mode - to show the system's actual or potential utility in one or more applications with a hypothetical data set, or a small test data set;
3. Operational or production mode - to process actual data regularly for specified problems and maintain the data through specified updating procedures.

'System' is a loosely defined term and any system can be viewed as a subsystem of some larger system, or, vice versa, any system may be made up of several subsystems (Churchman, 1968). This document does not attempt to rationalize the different uses of the term 'system' but allows the definition to vary depending upon the particular project or activity under investigation.

Each system mentioned in this report can currently be assigned to one of the above categories. Such assignment, however, is mainly for convenience and is probably transitory. Project objectives are usually not specific and any review, description or evaluation must recognize that all systems pass through each stage during their development. This is particularly important at the more advanced stages of development because attention must be paid to the maintenance, addition and loss of selected functions over time, including the ability to input, process, retrieve and finally analyse the data.

Evaluation perspectives provide a common background for the systems studied.
Three perspectives are considered relevant to evaluation:

1. Evaluation within context
This perspective is aimed specifically at evaluating the ability of a system to meet its stated objectives. Frequently, objectives are not stated explicitly, which makes the perspective difficult to adopt, or the objectives of a system may be simply 'to provide data to decision makers'. Systems that are in the experimental or demonstration phase often lack adequate documentation, and their objectives may change during the development and demonstration process; evaluation of the system at a given point in time thus quickly becomes outdated. Finally, to obtain adequate information one must ask the system developers for a self-evaluation at the very time when they are most concerned with demonstrating the utility and feasibility of their system. This perspective is described here because it points out a constraint on system evaluation of any kind; it is not otherwise pertinent to this study.

2. Evaluation outside context
This perspective is aimed directly at the question of transferability.

The focus is on generic data handling capabilities and the possibility of applying them to another set of problems and data handling requirements. This perspective limits evaluation to the specific functions that are currently in operation. The conditions according to which the system operates are considered only to the extent that they are a constraint on data handling capabilities, and it is of no concern whether or not the system is meeting a defined set of objectives or whether, in fact, objectives exist.

3. Evaluation by objective appraisal
The perspective of objective appraisal is similar to the 'outside-context' perspective in that it asks whether existing methods and techniques can be applied to another set of problems. However, under objective appraisal the investigation is not limited to the current capabilities of the system. The procedure used is to establish a conceptual framework containing the elements considered appropriate to meet a perceived need, to which the system or some of its elements may be transferred. This approach, therefore, starts by identifying the total range of capabilities for data acquisition and manipulation for a set of perceived problems, and then compares the capabilities of existing systems to this set. In this framework the various elements of a system may be evaluated independently with respect to their applicability to the new set of problems. Final recommendations may include selected elements from a variety of systems.

This study uses the objective appraisal perspective, and recommendations and observations will be limited to the relevant factors for a programme of land use or natural resource inventory. This approach has been chosen because it provides the reader with the best information available at present and because it is without prejudice to the individuals or agencies responsible for developing and operating the geographic information systems studied.

The study is presented in two parts. The first discusses concepts fundamental to the development of a geographic information system, and the lessons learned from applying them to the systems selected for study. The second part is a detailed account of each system, and of the data encoding experiment which attempted to provide a direct means of comparison of various approaches and operations.

¹ The meaning of a 'geographic information system' and the underlying concepts of handling location-specific data are discussed in several publications.
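The objective appraisal procedure described in this chapter (identify the required range of capabilities, then compare each existing system with that set) can be sketched as a simple set comparison. The Python sketch below is illustrative only: the capability names, the labels 'System A' and 'System B', and the functions `appraise` and `select_elements` are hypothetical and do not describe any of the systems studied.

```python
def appraise(required, systems):
    """For each system, list the required capabilities it covers and lacks.

    required: set of capability names drawn from the conceptual framework
    systems:  dict mapping a system label to the set of capabilities it offers
    """
    report = {}
    for name, capabilities in systems.items():
        report[name] = {
            "covered": sorted(required & capabilities),
            "missing": sorted(required - capabilities),
        }
    return report


def select_elements(required, systems):
    """For each required capability, list the systems that could supply it.

    This mirrors the idea that final recommendations may combine
    selected elements from a variety of systems.
    """
    return {
        capability: sorted(
            name for name, offered in systems.items() if capability in offered
        )
        for capability in sorted(required)
    }


# Hypothetical inputs, invented purely for illustration.
required_capabilities = {"area measurement", "overlay", "map output"}
example_systems = {
    "System A": {"overlay", "map output"},
    "System B": {"area measurement", "tabular output"},
}

appraisal = appraise(required_capabilities, example_systems)
sources = select_elements(required_capabilities, example_systems)
```

In this invented example neither system covers the whole required set on its own, but between them every required capability has at least one possible source.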

REFERENCE

Churchman, C. W., 1968, The Systems Approach, Dell Publishing Co., Inc., New York.

2

Essential parts of a geographic information system

The various approaches to evaluation of a given information system make it clear that a systematic evaluation procedure is needed. This has been developed in the form of a conceptual or hypothetical framework, which can be used as a tool for comparison of existing information systems. The conceptual framework represents what are considered at present to be the key elements of an information system. It must be recognized that this procedure means comparison of an 'ideal' system, the framework, with an 'imperfect' system, such as one of the case study systems. However, it is currently the only way to appraise systems objectively in terms of the 'outside-context' evaluation of the system with respect to its transferability.

The recommendations and observations contained in this document have been derived by comparing the case study systems with the conceptual framework. It is felt that this type of comparison is more valid than direct comparison of existing systems, for the following reasons:
1. The conceptual framework represents a neutral position and thus allows each system to be evaluated in an objective manner which removes the 'within-context' considerations and emphasizes transferability;
2. It does not consider only the elements of a system that are evident in the case studies; rather, it highlights some of the major issues which have arisen during development of the systems but which could not be exploited within the time and resources available; and
3. It provides a perspective on the systems studied which directly suggests recommendations for similar efforts elsewhere.

Figure 1 depicts a systematic framework for design and evaluation of geographic information systems. Three major stages are shown:
Stage 1: Determination of the system's objectives, assessment of resources (including geo-referencing systems), and development and evaluation of specifications.
Stage 2: Description and evaluation of alternative system designs that are capable of meeting the specifications set forth in Stage 1.
Stage 3: Overall evaluation of alternative system designs in terms of potential benefits and system costs.

Feedback loops are shown to indicate the major checkpoints in the process. If the results of any stage are not satisfactory, either the system specifications will have to be changed, or, if possible, additional alternative designs may be investigated.

[Fig. 1. Information system design and evaluation model. Stage 1: 1.1 Describe objectives, client, and client needs; 1.2 Describe and evaluate data needs; 1.3 Describe and evaluate geographic reference needs; 1.4 Inventory existing data sources and collection programs; 1.5 Inventory geographic referencing systems; 1.6 Describe data set specifications; 1.7 Describe information delivery requirements; 1.8 Describe geographic referencing system; 1.9 Evaluate system specifications and objectives. Stage 2: 2.1 Describe alternative information systems; 2.2 Describe hardware requirements; 2.3 Describe software requirements; 2.4 Describe operating environment; 2.5 Evaluate feasibility and cost; 2.6 Describe legal implications; 2.7 Describe political implications; 2.8 Evaluate legal and political implications. Stage 3: 3.1 Final evaluation (benefits, costs, impacts). Feedback loops are marked between the stages.]

It must be emphasized that the imperfect existing systems are evaluated by comparison with the idealized model for the explicit purpose of making recommendations for the development of new systems by other agencies. In no way can the conclusions of this document be construed as criticism of an individual system in the context of its own explicit or implicit objectives or its efficacy within its own environment. Although the question of a system's utility within its own environment is valid, this study did not adequately explore the aspect. In fact, the systems discussed are all reasonably successful in the sense that they provide sufficient documentation, utility and other related factors to allow a study to be conducted. This alone could be considered a measure of success. Many other systems, real or proposed, have been reported in the literature but could not be reviewed in depth simply because
adequate documentation does not exist or the personnel are unavailable for interview and comment. Finally, it should be pointed out that all the systems reviewed by this project have passed through the proposal and implementation stages and entered the stage of continuing use. This, in itself, represents a considerable degree of success.

A geographic information system consists of six major subsystems:
1. The management subsystem;
2. The data acquisition subsystem;
3. The data input and storage subsystem;
4. The data retrieval and analysis subsystem;
5. The information output subsystem; and
6. The information use subsystem.
The components of the ideal system, in themselves, constitute a major recommendation of the IGU, as these components can be applied to any similar activity concerned with information systems for land use or natural resources undertaken by other agencies in the future.

Management subsystem
The management subsystem is concerned exclusively with the normally recognized management functions considered essential to the continuing success of any particular operation. In terms of an information system, the management subsystem must consider the following:
1. A long-term staff plan which recognizes the fact that staff will change substantially during the phases of design, development and implementation;
2. A fiscal plan which addresses the questions of the total resources required to develop the system and also directly provides for fiscal continuity;
3. A programme for publicizing the system which accurately represents what is being done without overselling the system, while at the same time providing the interim products to potential users; in this way the system is viewed as legitimate by the user community and gains its support, facilitating continuity of financial resources;
4. An extensive educational programme for users which deals with the techniques for using the system, but more importantly recognizes the application problems of the user and provides him with the information and skills he needs to structure his problem in quantitative terms, so that he will find the system useful to him; and
5. A user feedback system, not in terms of abstract utility, but rather one which allows the user to provide feedback both while he is learning how to view his problem in a quantitative perspective and while he attempts to use the system by analysing his problem quantitatively.

Data acquisition subsystem
The data acquisition subsystem addresses the problem of acquiring all the data elements required to analyse a particular problem and make the appropriate recommendations. It must be recognized first that an information system can usually only provide a portion of the information required. The materials to be distributed by the agency developing a geographic information system should include consideration of the entire data acquisition subsystem, and advice should be given to potential users on the critical elements concerned with this subsystem. These critical elements are:
1. A committee of data gathering agencies upon whom the system intends to rely to meet user needs;
2. A methodology for identifying and describing the location of all source data, including documentation, continued access, and opportunities for change in data collection procedures;
3. A methodology for evaluation of source data, particularly in terms of their applicability to the problems at hand and their accuracy and reliability.
It is important that a user recognizes the need for a formalized data acquisition programme, and it may be desirable to provide guidance to potential users in this area.

Data input and storage subsystem
The data input and storage subsystem deals mainly with the technical operations necessary to convert the source data to a digital form.
The typical requirement is to convert a graphic image, such as aerial photography, satellite imagery or mapped data, to an error-free digital record. The significant steps in this process are:
1. Partitioning, or the separation of the data into workable units;
2. Control processes, the method and procedures for controlling the processing of the workable units;
3. The encoding operation, the conversion of the graphic image to digital form, including verification and correction procedures;
4. The encoding of the descriptive data corresponding to the graphic units;
5. Data reduction, the simplification or compaction of the encoded graphic image; and
6. Data file construction, addressing the questions of file organization, access method and the linkage of image data with descriptive data.
The product of the data input subsystem is the stored digital representation of the image and descriptive data in a form suitable for retrieval and analysis.

Data retrieval and analysis subsystem
The data retrieval and analysis subsystem deals exclusively with the operations of extracting the data from storage and performing the analytical operations needed to meet the requirements of the problem at hand. Typical operations which must be performed in a geographic information system are:
1. Retrieval of the data from storage;
2. Measurement of areas or calculation of distances;
3. Comparison of multiple data sets, that is, a procedure to overlay one data set upon another and to determine the intersection or union of the variables;
4. Statistical analyses appropriate for spatial data; and
5. Specific analytical procedures determined by the user.

Information output subsystem
The information output subsystem is generally fairly simple, consisting of tabular listings of measurements, comparisons or statistical analyses, and graphic display, in map form, of the entire data set, a portion thereof, or the results of analytical operations.
It should be noted that validity or verification procedures are needed to determine: (1) that the data retrieved from storage are accurate within certain limits; (2) that the analytical procedures used are valid in terms of the data accuracy; and (3) that the graphic display of the data reflects the confidence limits as determined by the data characteristics and the analytical procedures.

Information use subsystem
The information use subsystem deals mainly with the interface between the user and the system. The first question which must be addressed is the type of access required. In some situations the user has direct access to the system. In others he must rely on the assistance of a technician, or a professional skilled in the applications. In yet others he must develop a formalized negotiation process which involves several individuals with the appropriate range of capabilities, including knowledge of the analytical techniques, system capabilities and limitations, and the cost and implications of using a system for a particular problem.

The particular manner of transmitting output to the use environment is only significant insofar as it exists and functions well. Different situations call for different institutional arrangements, and the function of output transfer must be explicitly planned for in the design of a geographic information system. The resultant use of the output is essentially beyond the control of the individuals or agency responsible for the system. However, it can reasonably be assumed that if the output is delivered in a readily understandable format and in a timely fashion it will be used to the maximum possible extent.

Overview of the case studies and data encoding experiment
The case studies were chosen to represent the various methods of encoding graphic data. The systems and their encoding methods are as follows:¹

The Canada Geographic Information System (CGIS): fine polygon (drum scanner)
The San Diego Comprehensive Planning Organization's Polygon Information Overlay System (PIOS): coarse polygon (digitizer)
The Minnesota Land Management Information System (MLMIS): medium grid (manual)
The New York Land Use and Natural Resource Information System (LUNR): large grid (manual)
The Oak Ridge Regional Modelling Information System (ORRMIS): small grid (flying spot scanner)

¹ The difference between fine and coarse polygon systems is the number of points used to define a polygon.

These systems also represent the different encoding procedures: manual, manually operated digitizer, flying spot scanner, and drum scanner, as indicated in parentheses. Each system is fully described in Chapters 4 to 8 of this report. The data encoding experiment, fully described in Chapter 9, was an attempt to generate directly comparable statistics for all systems, plus other digitizing methods under development or operation. The Digimap system recently developed by Calspan Corporation was included in the experiment as it represents a different approach, though one that has not yet been incorporated into an operational geographic information system.
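The distinction between fine polygon, coarse polygon and grid encoding can be made concrete with a small area calculation, one of the measurement operations listed for the data retrieval and analysis subsystem. The following Python sketch is not taken from any of the systems studied. The function names (`polygon_area`, `point_in_polygon`, `grid_area`, `regular_polygon`), the use of the shoelace formula and ray casting, and the rule that a grid cell counts when its centre falls inside the polygon are all assumptions chosen for illustration.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon from its boundary points (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if the point (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_area(vertices, cell_size):
    """Estimate area by counting square cells whose centre is inside the polygon."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    min_x, min_y = min(xs), min(ys)
    cols = math.ceil((max(xs) - min_x) / cell_size)
    rows = math.ceil((max(ys) - min_y) / cell_size)
    count = 0
    for r in range(rows):
        for c in range(cols):
            centre_x = min_x + (c + 0.5) * cell_size
            centre_y = min_y + (r + 0.5) * cell_size
            if point_in_polygon(centre_x, centre_y, vertices):
                count += 1
    return count * cell_size ** 2


def regular_polygon(n, radius=1.0):
    """Boundary of a circle of the given radius encoded with n points."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# The same circular tract encoded with few and with many boundary points.
coarse_area = polygon_area(regular_polygon(8))
fine_area = polygon_area(regular_polygon(64))
```

Under these assumptions, a unit circle encoded with 8 boundary points yields a smaller area than the same circle encoded with 64 points, and both fall short of the true area. A grid estimate from `grid_area` depends in a similar way on the cell size chosen.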


3

Lessons to be learned from review of existing systems and digitizing methods

The recommendations resulting from a synthesis of the case study material and, when applicable, the data encoding experiment are presented as they relate to each subsystem of the conceptual framework. In the opinion of the Commission, these are the major lessons to be learned from this type of comparative analysis.

MANAGEMENT SUBSYSTEM

There are just as many problems, and possibly more, on the management side of implementing an information system as there are on the technical side. The following specific problems have been observed in one or more of the systems studied.

1. There are likely to be significant time delays in obtaining project approval, acquiring the necessary equipment, hiring and training staff and actually implementing the system from an administrative (management) standpoint. It is necessary to note this and to indicate the order of magnitude of such delays. A programme for system design should, if possible, be structured to reduce the adverse effects of such time delays to a minimum, that is, to anticipate significant delays from one or more causes. If the delays do not occur, system development will only be ahead of schedule rather than substantially late. Opponents of a system development project will take as much advantage of delays as they can. Therefore, if approval of a project can theoretically be obtained in 3 weeks, for example, do not assume that it will be.

2. Any system that is proposed must win support at a high level of decision making. Merely to convince the technical personnel of its value is inadequate to obtain approval for systems development and the concomitant continuing support. The major problem, at present, is how to overcome the negative attitude and misconceptions that exist concerning the use of computers. One must emphasize to people at senior levels that the system is intended for use by their advisers and staff, and that it represents an important tool needed by these people to function in their jobs. This, of course, requires an explicit, objective statement identifying the immediate users of the system. Broad statements that decision makers are the intended users of the final results may be true, but the results must pass through one or more staff levels before the information is put into the format that a decision maker can easily accept.

3. The relevant groups that need to participate in system development are:
a. the data gatherers (the agencies that will supply data on a continuous basis);
b. the system designers (usually representatives of the system sponsor, who reflect the goals or objectives of the sponsor); and
c. the anticipated user group.

4. The representatives of the data gatherers must do the following:
a. identify the data supplies;
b. evaluate the data supplies with respect to the problems at hand; and
c. solve the problems of data flow from the supplies to the system.


5. Documentation is essential and resources spent on it are never wasted. Documentation is particularly significant, for the following reasons: a. staff changes are inevitable and new staff members will have to trace the history of the system's development, through regular documentation on the state of the system, significant decisions, why decisions were made, and so forth; b. the system will be transferred from an outside contractor to the sponsoring agency or its designee.

6. Staffing plans for the development of an information system must recognize the following phases: a. conceptualization of the system and entrepreneurial activities; b. research management; c. system debugging; and d. operation and maintenance. Either the individual responsible for directing the development of the system must fill all of the above roles, or different individuals should be responsible for the system during the time required for each identified phase. The three possibilities for meeting this problem are: a. change of personnel; b. assignment to one individual who changes his interests and capabilities as the system work progresses; or c. progressive promotion. It is recognized that one individual usually does not have the interest, capability or willingness to continue with the system through all four development phases. Essentially, staff change or modification is inevitable and must be planned for systematically.

7. Job descriptions must be written with the total length of system development in mind; the recommended time frame is 8 to 10 years. This implies a long-range and well-thought-out staff plan, which is a difficult task.

8. Staff planners should recognize that experienced personnel are required at all levels. Planning for the minimum of staff or hiring personnel at the lowest levels will cause significant difficulties during or after system development. Each level should be filled with an individual of the highest quality available.
This philosophy often runs counter to some government hiring practices and may force the use of non-government or 'outside' talent. Frequently, high-level staff in government tend to be administrators rather than technical personnel and use of high-quality persons from industry may be the only feasible alternative.

9. Government employment policies in force at any particular time may require the use of outside personnel on a consulting or contract basis. When this occurs, and it can be expected to occur in most information system projects, the outside individuals or firms must be subjected to the closest management and supervision possible. Otherwise, they may produce what they think is required rather than what is needed by the system. To ensure that their work can be used, it must be designed in terms of a totally integrated project rather than viewed from the particular perspective that the outside individual or firm may have on the subject. Mechanisms to monitor and control any outside help must be set up and enforced.

10. It is not satisfactory to use universities, or outside organizations such as consulting companies, as places to build information systems or components thereof. Although the sponsoring organization temporarily avoids the staffing problem, it later becomes a major stumbling block. For example, contracting with universities for student labour is cheap but not necessarily reliable.

11. There are several problems associated with maintaining fiscal continuity for the development of an information system, given that such development will probably take from 5 to 10 years to complete. The models of fiscal continuity are as follows: a. Orientation of the project so that there are deliverable products at the end of each year or funding period, which justify the continuation of the development programme.
The problem associated with this approach is that substantial effort may be needed to produce the deliverable products, at the expense of overall system development. Also, the interim products may not be very useful and may not fit into the final system. This approach can tend to extend the development time. b. Emphasis on system development over the shortest time without concern for interim deliverable products. Experience has shown that 5 years is probably a reasonable estimate for the delivery of the first products when this approach is used. In general, this approach may only be sustainable in government agencies.


c. Inclusion in (b) above of an extensive user education programme within the project from the very beginning instead of deliverable products. The appropriate user education programme will be identified under that subject.

12. System publicity. If development takes up to 5 years it will be necessary to report progress on a regular basis. One mechanism is a series of deliverable products, but the dangers of this have already been mentioned. The user education programme may provide a suitable reporting mechanism, especially if it is co-ordinated with the major steps of system development. Also, it should be possible to organize the development programme in such a way that simpler tasks can be accomplished at an early date. This point is particularly pertinent to users who do not have many resources, either of manpower or money, to spend on system development. An approach must be devised that integrates user education and interim deliverable products within the continuing development of the system without excessively delaying it. This implies that much more attention should be given to the management function than has been the case to date. 13. The education process must start at the current level of capability of the user rather than at the system's level. Education is not how to use the system, which is somewhat trivial, but rather how the user's problems can be handled in quantitative terms using data. It should be assumed that the user must be shown an approach to his problem solving that is entirely new to him. 14. It is impractical to expect user feedback to the system unless an education programme for the user is initiated at the very start. A user cannot provide valid feedback if his capabilities are not equal to the system's manipulative power. A user training programme should include the following elements: a. Formulation of the user's problems in quantitative terms for: i. regular university curricula; ii. continuing professional education; b. Instruction on use of the system in terms of: i. operating characteristics; ii. assumptions concerning the data and manipulative functions; iii. reasonable expectations on system performance; iv. limitations of the system. 
This can only be accomplished by a programme that trains the user step by step as the system develops. It should be realized that the user training programme will require substantial resources and that user feedback can only be meaningful if it is in a structured form. The user must be able to interact with the system as it develops, even if hypothetical problems and output have to be used. These are considered to be the required conditions for obtaining valid user feedback.

DATA ACQUISITION, INPUT, STORAGE, RETRIEVAL AND ANALYSIS SUBSYSTEMS

1. One of the major technical constraints continues to be digitization. This is understood to be the process of converting an error-prone manuscript into an acceptably error-free, non-graphic, machine-readable file. The error-free file is particularly significant because in the current state of computer development the computer cannot intelligently ignore non-logical errors. The process of creating an error-free file consists of: a. pre-editing graphic data; b. digitizing; c. error correction; and d. file structuring.

2. The development of interactive capabilities can be of considerable assistance in the correction of errors. However, more importantly, interactive systems allow a user to browse through a data file which would otherwise be invisible.

3. If a system is expected to handle multi-format data, that is, point, line and area data, then the file structure must be designed with this in mind. It can be fairly expensive to add one or more of these types to an existing file structure originally designed for only one type, and it can reduce efficiency greatly.

4. Manipulative functions are still a problem because most users do not understand the assumptions or statistical bases on which they are made. This problem, however, must be addressed under user education.


INFORMATION OUTPUT AND USE SUBSYSTEMS

1. Users generally perceive that automated systems can perform much faster than manual systems and they consequently judge the system's performance according to this perceived capability, that is, the time it takes to respond to their queries. The user's expectations are much higher than if the task were done manually, even if he knows nothing about the specifics of the system.

2. An understanding of the manipulative techniques is required so that the output can be properly interpreted. Currently, users in general do not have an adequate understanding of the processes going on inside the computer and therefore they fail to interpret the output properly. This can lead to the situation where many of the system's capabilities are not used because the manipulative processes are not understood, and where manipulations are being carried out, the user accepts the results without understanding the underlying assumptions and qualifying his interpretation of the results based on those assumptions. For example, the assumption of homogeneity of areas used for the overlay process can be overlooked.

3. Standard statistical packages do not adequately meet the needs of spatial data. There are statistical methods which can handle spatial data but, in general, these are not used and there seems to be a tendency to use standard statistical packages which assume a zero dimension from a spatial context. In the spatial manipulation programs of a geographic information system, it is important to define and use statistical procedures that recognize that one observation may have a spatial dependence on another observation.

4. As an interim device, to substitute for user education, one can use the go-between process. In this process, knowledgeable individuals act as intermediaries between the user and the system, translating the user's problem into quantitative terms to which the system can respond. The go-between process may always be required, because even if high-level decision makers attain sufficient knowledge to use the system effectively, they may not want to take the time to do so.

5. For policy decisions, particularly those made by high-level officials, the primary model is a mental one, in which the decision maker's 'image' of the problem is built on all information seen or heard.
Consequently, information systems must be built for lower-level technical people and these people must transform the information into a form compatible with the policy maker's mental model. The degree to which a decision maker's image can potentially be modified by information from a formal information system or model should determine whether a system is built or used in a particular context. Currently, there are no mechanisms by which to determine the degree to which this image is subject to modification by the type of information that would come from an information system. None of the systems studied have a mechanism for either determining the effects of information or negotiating the use of information, and the lack of such mechanisms is a detriment to them. Further, even if these functions are supplied, there may be no perceivable change in decision making, and it could easily be fatal to base the continuity and survival of the information system on such notions. However, there is a clear need for either the sponsor or designer of the system to take the initiative and actively advocate use of the information system. The recommended method of accomplishing this is for the system's staff to include personnel capable of understanding the user's problem and translating it into terms suitable to the use of quantified data. In short, the designer must take the system to the user and make sure that the user can understand it in terms of his problem.

6. Key elements for incorporating an information system into a decision-making framework are: a. the development of interagency data linkages; b. better design of formal models; c. incorporation of these models into the decision-making framework; d. post-university transfer of technology (continuing education); e. changes in institutional structure. In the context of a federal government, the notion of the independence of data gathering agencies or lack of bias will not always hold.
Most data on which decisions are based are collected by line agencies, and the particular bias of any line agency is seen in what data they choose to collect and how they collect the data.

DATA ENCODING EXPERIMENT

The results of the data encoding experiment must be viewed very carefully. Initially, it was hoped that a sample graphic data base could be processed by the various systems identified earlier and that directly comparable results could be obtained. However, the basic design philosophies are sufficiently different that the identical data could not be input to each system and meaningful conclusions derived from the results. Thus, conclusions from the data encoding experiment are limited to those that can reasonably be considered independent from the bias caused by use of a single data base for all systems. The major findings of the experiment were: 1. The different approaches used in the test systems do not lend themselves to a simple standard test; and

2. A benchmark testing procedure requires very careful design and iteration with systems to be tested, to define objectives clearly and ensure comparability of results. The test used was oversimplified and poorly controlled in absolute terms. Several lessons have, however, been learned and the test is included to illustrate the difficulties inherent in this type of comparison. Such difficulties may not be apparent to potential users who have to evaluate conflicting claims made by proponents of various approaches. Despite the deficiencies, two general conclusions can be derived from the encoding experiment: 1. Grid-based systems (medium and large sizes particularly) tend to be significantly less accurate in area measurement than systems based on either coarse or fine polygons; and

2. Grid-based systems tend to underestimate or lose linear features when two data variables are overlaid.

A major recommendation of this study resulting from the attempt to implement the experiment is that a carefully designed, well-thought-out benchmarking procedure is virtually mandatory in any serious project concerned with a geographic information system.

CONCLUSIONS

In summary, the lessons described in Chapter 3 are considered to be essential factors that must be explicitly addressed if a system is to be successful. Specifically, for a system to survive and be useful over time the following conditions must be met: a. there must be a perceived need to gather some data; b. the data must in fact be gathered; c. the data must continue to be gathered; d. the system itself must be an efficient way to handle the data; and e. there must be a continuing perceived need for those data to be handled, that is, a continuous use for the data. Clearly, these are not the only factors that must be considered. Many technical factors of equal importance have not been discussed here because they are already well developed and generally accepted. Finally, it is certain that expected problems will be discovered in the continuing research and development of geographic information systems. This report has enumerated the problems that have become apparent from past efforts, in order that future system development projects may benefit from the accumulation of past experience.


Part II

Description of systems and experiment

4

The Canada Geographic Information System (CGIS)

GENERAL DESCRIPTION

The Canada Geographic Information System (CGIS) started as a computer system planned to facilitate use of data gathered by the Canada Land Inventory (CLI). The Government of Canada approved development of the system in 1964. Both CLI and CGIS were initially sponsored by federal and provincial agencies developing programmes to aid rural areas of Canada under the Agricultural Rehabilitation and Development Act of 1961 (ARDA). Facilitating use of CLI-type data has remained a primary objective of the developers and sponsors of CGIS. However, data from other sources are included in the CGIS data bank, and Environment Canada now emphasizes CGIS's ability to handle geographic data in general rather than just those of CLI.

Data processing

Most data for CGIS were acquired through CLI, which mapped the capability of land for various uses, as well as the present use of the land. The CLI covers most of southern Canada from the Atlantic to the Pacific oceans (Fig. 2). CGIS data from other sources include census data and administrative, watershed and shoreline boundaries. These data files increase the usefulness of CLI data by making it possible to aggregate data for specific areas and to compare physical and socio-economic characteristics. Mapped data to be input into the CGIS data base are of two types: image and descriptive. Data describing map images, that is, polygon boundaries, are digitally encoded on tapes automatically by a drum scanner developed for CGIS by IBM Canada. Since about 1974, polygon data from low-density maps can be input using a conventional digitizer table, by a system called DIGIMAT. Point data are also encoded using a standard table digitizer. Descriptive data classifying map images are coded on tape by manual processes. The two data sets are linked by a unique number assigned to each polygon or point. The ability to input network as opposed to polygon or point data has not yet been exercised.
Provision for handling line data exists in the software, but as yet there has been no demand for this facility. In terms of scale, the system can now input maps at any scale from 1:10,000 to 1:25,000, and as of August 1975, one map at 1:1,000,000 has been successfully entered (W. A. Switzer, personal communication). The source data are collected and mapped in polygon rather than grid form, which gives CGIS the ability to maintain exact boundary data and flexibility in data manipulation. Polygon data can be automatically converted to any size of grid cell summary at any time, which is of particular value in input to some forms of subsequent spatial analysis. Figure 3 is part of a map of present land use, an example of the type of polygons processed by the system. The main storage unit in CGIS is termed a 'coverage'. A coverage comprises data concerning a single descriptive variable and may potentially cover the whole of Canada. The definition of the data coverages has relied on relatively stable factors, such as soils, thereby prolonging the useful life of the data. CGIS is developing 11 coverages as part of the CLI data base: soil capability for agriculture;


Fig. 2. Areas covered by the Canada Land Inventory. Source: Report No. 1, 1970.

Fig. 3. Part of a present land use map (photographically reduced from a map at 1:50,000 scale).


land capability for forestry, recreation, ungulates and waterfowl; fresh water capability for sportfish; land use; census (1971), administrative, and watershed boundaries; and shorelines. These data are being input at a scale of 1:250,000, but some are input at 1:125,000 (CLI data in British Columbia) and some at 1:50,000 for major urban areas of Canada (land use at two time periods, agriculture and recreation capability, shoreline and census boundaries). For each coverage, image data are stored in compact notation by reference to incremental changes in the direction and length of line segments which form polygons, and descriptive data are stored for each polygon in an order determined by arbitrary numbers, which link the descriptive and the coded image data for the polygon.

Data analysis

CGIS data files are retrieved and analysed by means of prewritten programs and subroutines, and the user can write specific subprograms to meet special requirements. This structure helps CGIS to serve users by saving time and requiring less individual programming than would a system with no prewritten programs. It is also a flexible method because programs can be combined in a variety of ways. As shown in Figure 4, a retrieval programmer acts as a necessary bridge between the user and the system.
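The storage scheme described above (boundaries held as incremental changes in direction and length, with an arbitrary polygon number linking them to the descriptive record) can be illustrated with a minimal sketch. The direction numbering, the `Polygon` record and the sample values are assumptions made for this illustration; they are not the CGIS tape format.

```python
from dataclasses import dataclass

# Eight compass directions as (dx, dy) unit steps. This numbering is an
# assumption for the sketch, not the convention used by CGIS.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]


@dataclass
class Polygon:
    polygon_id: int   # arbitrary number linking image and descriptive data
    start: tuple      # (x, y) of the first boundary point
    segments: list    # [(direction_index, length), ...]


def decode_boundary(poly):
    """Expand incremental (direction, length) segments into vertices."""
    x, y = poly.start
    vertices = [(x, y)]
    for direction, length in poly.segments:
        dx, dy = DIRECTIONS[direction]
        x, y = x + dx * length, y + dy * length
        vertices.append((x, y))
    return vertices


def polygon_area(vertices):
    """Shoelace formula; expects a closed ring (first vertex == last)."""
    twice_area = 0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


# Descriptive data keyed by the same polygon number (hypothetical record).
descriptive = {17: {"coverage": "present land use", "class": "T"}}

# A 4 x 3 rectangle: east 4, north 3, west 4, south 3.
rectangle = Polygon(17, (0, 0), [(0, 4), (2, 3), (4, 4), (6, 3)])
ring = decode_boundary(rectangle)
area = polygon_area(ring)
land_class = descriptive[rectangle.polygon_id]["class"]
```

Storing only the starting point and the increments keeps the boundary compact, while the shared number lets a retrieval program fetch the classification for any decoded polygon.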

[Figure 4: flow diagram. The user poses a question (Step 1) and receives the answer (Step 5); the retrieval programmer generates the request program (Step 2) and checks for successful execution; the computer executes the system and writes the report (Step 3). * Interaction to define problem in quantitative terms. ** Interaction regarding interpretation of results or refinement of problem statement.]

Fig. 4. Simplified retrieval processing cycle. Source: CGIS Vol. 1, pp. 101-102.

Users may initiate a request by indicating specific geographical areas of interest or by specifying particular classifications for one or more coverages. Geographic areas may be specified in several ways, for example (Geo-Information System, 1972, Part 4, p. 3):
1. Topographical map sheet number - e.g., 11L07
2. Administrative unit - county, province, parish, etc.
3. Census unit - enumeration area, electoral district
4. Regional development area - Gaspe, Northern New Brunswick, etc.
5. Arbitrary circle - e.g., 100 miles around Pembroke
6. Arbitrary polygon - all land included between specified vertices (co-ordinates).

In addition to specifying a geographical area of interest, the user may inquire about a specific coverage, specific elements of a coverage, or a combination of coverages or elements. The most significant analytical capability of CGIS is that coverages can be overlaid to produce new ones. The new coverages can then be analysed in the same manner as the original ones. For example, one can ask the total area within a county of all second-class agricultural land, or of all second-class agricultural land which is first-class wildlife habitat. Any number of coverages may be overlaid and combined in this way.

Information output

CGIS provides information primarily in tabular form, but maps can also be automatically produced. The system has been designed to include extensive capabilities for graphic output. The output information is the product of retrieval and analysis activities and may be derived from one or more coverages. Figure 5 is an example of tabular information produced using a single coverage. It is a list of all present land use categories giving the total area of each in the coverage and the percentage of total land area represented by each use (Geo-Information System, 1972, Part 4, p. 5).
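The overlay query described above (second-class agricultural land that is also first-class wildlife habitat, within a county) can be sketched on a grid cell summary of two coverages. This is a simplification: CGIS overlays polygons, whereas the sketch assumes both coverages have already been converted to the same grid. The cell size, the class values and the county mask are all hypothetical.

```python
CELL_AREA_ACRES = 40.0  # hypothetical area represented by one grid cell

# Two coverages on the same grid; values are capability classes (hypothetical).
agriculture = [
    [1, 2, 2, 3],
    [2, 2, 3, 3],
    [2, 1, 1, 4],
]
wildlife = [
    [1, 1, 2, 2],
    [1, 2, 2, 1],
    [3, 1, 1, 1],
]
# True where the cell lies inside the county of interest (hypothetical).
county = [
    [True, True, True, False],
    [True, True, True, False],
    [True, True, True, True],
]


def overlay(first, second):
    """Combine two coverages into a new one whose cells hold both classes."""
    return [[(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]


def area_where(coverage, mask, predicate):
    """Total area of cells inside the mask whose value satisfies predicate."""
    cells = sum(
        1
        for row, mask_row in zip(coverage, mask)
        for value, inside in zip(row, mask_row)
        if inside and predicate(value)
    )
    return cells * CELL_AREA_ACRES


# Single-coverage question: all second-class agricultural land in the county.
second_class = area_where(agriculture, county, lambda c: c == 2)

# Overlay question: second-class agriculture AND first-class wildlife habitat.
combined = overlay(agriculture, wildlife)
both = area_where(combined, county, lambda c: c == (2, 1))
```

Because `overlay` returns a coverage of the same shape, its result can be queried or overlaid again in the same way, mirroring the statement that new coverages are analysed like the original ones.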

LISTING OF UNIQUE PRESENT LAND USES

CLASSES  DESCRIPTION                                      ACREAGE*  PERCENT (LAND AREA)
A        95%-100% CROPLAND                                       0   0.0
A        75%-94.9% CROPLAND                                      0   0.0
A        50%-74.9% CROPLAND                                 17,869   1.4
P        95%-100% IMPROVED PASTURE AND FORAGE CROPS            493   0.0
P        75%-94.9% IMPROVED PASTURE AND FORAGE CROPS        15,464   1.2
P        50%-74.9% IMPROVED PASTURE AND FORAGE CROPS       542,606  44.3
B        URBAN LAND USE (NON-AGRICULTURAL)                   6,946   0.5
E        MINES, QUARRIES, SAND AND GRAVEL PITS                 702   0.0
G        ORCHARDS AND VINEYARDS                                351   0.0
H        CROPLAND (FIELD CROPS-GRAIN, VEGETABLES, ETC)         838   0.0
K        ROUGH GRAZING AND RANGELAND                        74,453   6.0
L        UNVEGETATED SURFACES (ROCK)                           453   0.0
M        SWAMP, MARSH OR BOG                                23,203   1.8
O        URBAN OUTDOOR RECREATION (PARKS, ARENAS)              874   0.0
S        UNPRODUCTIVE LAND (SAND)                            5,513   0.4
T        PRODUCTIVE WOODLAND                               463,457  37.9
U        NON-PRODUCTIVE WOODLAND (SMALL TREES, BUSHES)      69,367   5.6
Z        WATER (OCEANS, LAKES, PONDS, RIVERS)            2,083,676   0.0

         TOTAL AREA (ACRES)                              3,306,265
         TOTAL AREA EXCLUDING WATER (ACRES)              1,222,589

*1 ac = 0.405 ha.

Fig. 5. Sample CGIS output (simple retrieval). Source: Geo-Information System, Part 4.
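The arithmetic behind a listing such as Figure 5 can be checked with a short sketch that sums the acreages and expresses each land class as a percentage of the area excluding water. The row labels are abbreviated, the function and variable names are hypothetical, and the use of truncation to one decimal place is an inference from the printed figures (for example 0.5 for class B), not a documented CGIS rule.

```python
import math

# (class code, abbreviated description, acres) as listed in Fig. 5.
ROWS = [
    ("A", "95%-100% cropland", 0),
    ("A", "75%-94.9% cropland", 0),
    ("A", "50%-74.9% cropland", 17869),
    ("P", "95%-100% improved pasture", 493),
    ("P", "75%-94.9% improved pasture", 15464),
    ("P", "50%-74.9% improved pasture", 542606),
    ("B", "urban land use", 6946),
    ("E", "mines, quarries, pits", 702),
    ("G", "orchards and vineyards", 351),
    ("H", "cropland (field crops)", 838),
    ("K", "rough grazing and rangeland", 74453),
    ("L", "unvegetated surfaces", 453),
    ("M", "swamp, marsh or bog", 23203),
    ("O", "urban outdoor recreation", 874),
    ("S", "unproductive land (sand)", 5513),
    ("T", "productive woodland", 463457),
    ("U", "non-productive woodland", 69367),
    ("Z", "water", 2083676),
]
ACRES_TO_HECTARES = 0.405  # conversion given in the figure's footnote


def truncate_one_decimal(value):
    """Drop digits after the first decimal place (assumed print rule)."""
    return math.floor(value * 10) / 10


total_area = sum(acres for _, _, acres in ROWS)
land_area = sum(acres for code, _, acres in ROWS if code != "Z")

# Percentage of land area for every class except water.
percent = {
    (code, desc): truncate_one_decimal(100 * acres / land_area)
    for code, desc, acres in ROWS
    if code != "Z"
}

land_area_hectares = land_area * ACRES_TO_HECTARES
```

The two sums reproduce the totals printed at the foot of the listing, which is a useful consistency check on any retrieval that reports both class areas and overall totals.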


Objectives and organizational environment

The Canada Geographic Information System originated as a computer mapping system planned to facilitate use of data gathered by the Canada Land Inventory (Report No. 1, 1965, p. 1; Tomlinson, 1963, p. 1). As early as 1962, CLI developers recognized that unless CLI data were handled automatically, much of their potential usefulness would be lost. The initial plan for a computer mapping system resulted from perceptions of two aspects of CLI data: that they were spatial, that is, in map form; and that they were very extensive (estimates ranged between 1,600 and 30,000 maps, depending on scale and number of coverages) (Seminar, 1962, pp. 7-8, 10; Tomlinson, 1963, pp. 1, 5). Because CGIS was originally closely related to CLI, and thus to the Agricultural Rehabilitation and Development Administration (ARDA), its initial objectives related to rural development problems and programmes. A computerized system was seen as a device that would save time and effort, a tool for data analysis. Automation would help CLI to fulfil its objective of making data available to the land use planning efforts of the provinces and the federal government (Report No. 1, 1965, p. 5; Symington, 1973, pp. 11-12). Therefore, ARDA sponsored an investigation into the technical and economic feasibility of a computer system to store, analyse and produce both tabular and mapped data. Feasibility reports indicated that no suitable computer mapping system existed, but recommended that one be developed. This was to become the Canada Geographic Information System (Tomlinson, 1963; Tomlinson, 1965).

Canada Land Inventory

The CLI was designed to produce information for the federal and provincial ARDA's land use plans and programmes. Thus, CLI encompasses areas of Canada which ARDA defined as having significant opportunities and resources available to rural Canadians. These areas are shown in Figure 2.
The purpose of CLI was to create an inventory of present land use and the land's capability for a number of uses, and to encourage the use of this information by planners. Four types of information form the main base of CLI. These are: 1. the capability of the land for agriculture, 2. the capability of the land for forestry, 3. the capability of the land for recreation, and 4. the capability of the land for supporting wildlife. The CLI was planned and designed between 1958 and 1963. The inventory began in 1963 after gaining approval of the Government of Canada. Inventories were undertaken by the provinces with the financial aid of the federal government (Report No. 1, 1970, p. 5). The product of CLI is a series of maps published by the Government of Canada and a series of reports explaining the inventory (Reports Nos. 1-7). By March, 1973, most of the maps at 1:250,000 scale were in print (Index, 1973).

Canada Geographic Information System

Basic research into the handling of mapped data by computer began in Canada in 1961 (Tomlinson, 1962). Design and planning of CGIS in the Government of Canada began in 1963 when ARDA sponsored the investigations into the technical and economic feasibility of a computer system to store, analyse and produce both tabular and mapped data (Tomlinson, 1963; Tomlinson, 1965). In November, 1965, the Government of Canada approved further development of CGIS and in September, 1966, approved a contract with IBM Canada for development of CGIS hardware and software under government direction (Schedule A, 1965; Switzer, 1973, p. 22) and a contract with Spartan Air Services Limited to conduct parallel research in the techniques of spatial data analysis and their relevance to land use planning. Development of CGIS has continued through yearly funding by the Canadian government (Switzer, 1973b). Although the system had been experimentally demonstrated by 1968, several unexpected events inhibited its immediate use.
This caused problems for developers who were faced with user requests they could not meet. Consequently, the system 'went underground'. From late 1969 until late 1971, while the system's staff adjusted to organizational changes and conducted an extensive check of the system, nothing official was published about CGIS (Switzer, 1973b, pp. 1, 21). Development of the system has been continuous, despite the organizational changes and several staff changes.


Current status

Although CGIS began under the auspices of CLI, Environment Canada now emphasizes the separate nature of CGIS as a general geographic information system. Both CGIS and CLI remain within Environment Canada, however, and CLI data form the main component of the CGIS data base. The current desire for separation may relate in part to the fact that CGIS has not yet automated CLI data for all areas of Canada. Being quite realistic as to the amount of time required to complete the input of CLI data, Environment Canada desires to stress the system's capability to process geographic data as well as the availability of CLI data in particular. This approach is aimed at making the system useful to more potential users. The CGIS staff now consists of about 26 full-time employees (Table 1). The staff also processes data from sources other than CLI, but this accounts for only 5% of their time. It is apparent that Environment Canada is committed to continuing the development of CGIS and encouraging its use by a variety of agencies across Canada. Each major capability will be available for use after it has been tested and shown to perform to the standards set by Environment Canada. The rules, regulations and procedures that govern system use are currently being prepared by Environment Canada.

Table 1. CGIS staff, estimated and actual (1973). Source: William Switzer interview, June, 1973. Notes p. 27. M. C. Thurger & D. A. Hodgins, 'Map Production Estimate (Revised)', July 23, 1971, p. 8.

1971 - Proposed
  3 Systems programmers
  1 Production co-ordinator
  2 Production co-ordinator clerks
  5 Map scribers
  5 Cartographers - code manual error correction commands
  2 Clerks - digitizing and encoding
  3 Clerks - receipt and input of maps, clerical work, receptionist
  4 Keypunch operators - put classification data on tapes, run scanner
  1 Secretary
  TOTAL: 26

Actual - 1973
  1 System manager
  8 Systems programmers
  2 Production persons
  6 Draftsmen - scribing, manual error correction, digitizing
  4 Data processors - scanning, keypunching, digitizing and key-to-tape
  2 Clerks - receipt and input of maps, clerical work
  1 Remote terminal operator
  1 Secretary
  1 Receptionist
  TOTAL: 26

HISTORIC DEVELOPMENT

The following sections discuss these historic developments in greater detail. First, an analysis of the Canada Land Inventory provides insight into the purposes of CGIS development and the content of the CGIS data bank. Following this is a section on historic development of CGIS itself.

Development of the Canada Land Inventory

Purposes and objectives. The federal and provincial administrators of the Agricultural Rehabilitation and Development Act were concerned about expected pressures from urban development on rural people and lands. One of their goals was to maximize rural income and productivity for the benefit of individual farmers as well as all Canadians. One way to pursue this goal was to determine the most efficient pattern of land use and then implement policies designed to draw the current land use pattern toward the most efficient one (Seminar, 1962, p. 1; Report No. 1, 1970, p. 2).

Efficient land use is not specifically defined in CLI documentation. However, one may infer that it was defined in terms of the productivity of the land and the income resulting from that productivity. Depending on demand and prices, income may not be high when a great deal is produced, of course, so these two measures of efficiency are somewhat contradictory. Some measure of the distribution of income is discussed, so efficient land use also must have implied an efficient population distribution. That is, an efficient land use pattern would mean one where productivity was high enough to meet demand without causing surpluses, and where the income generated by the sale of products was one which made maximum use of the land's physical capabilities (Seminar, 1962, p. 1; Socio-Economic Variables, undated, pp. 1-4, 7-8; Socio-Economic Aspects, 1965, pp. 1-2, 10).

Because efficient land use was thus linked to an efficient population distribution, ARDA administrators and planners desired information about the people as well as the land. The Canada Land Inventory was designed as a means to obtain information on the various land capabilities and the distribution of population. The following might represent a planning process such as that envisioned by ARDA administrators:
1. Gather information.
2. Analyse information to determine land capabilities and compare them with current population distributions and incomes.
3. Prepare and evaluate alternative plans to achieve an efficient pattern of land use.
4. Choose the plan with the maximum efficiency and develop specific federal and provincial programmes to implement the plan.
5. Implement the programmes.

In addition to this positive role envisioned for the inventory, ARDA felt that the availability of such information would help prevent bad decisions or ineffective programmes.
'It became increasingly apparent that without a land capability inventory, programmes of land adjustment and regional economic development would be based on judgements which, in the absence of essential information, would be fallible and costly.' (Report No. 1, 1970, p. 1, emphasis added.)

Thus, CLI's role was that of an information base for land use and economic planning in rural Canada. This may be traced to the desire to plan for resource management and regional economic development under the Agricultural Rehabilitation and Development Act.

Early recommendations. The idea of an inventory of land use and capability has been in circulation since at least 1958, when the Senate Special Committee on Land Use in Canada recommended:

'That it be called to the attention of the proper authorities the need of a systematic land-use survey, based upon appropriate factors, to provide for an economic classification of the land according to its suitability' (Report No. 1, 1970, p. 55).

Available documents do not elaborate on or explain this recommendation or its context. However, it does indicate an early interest in an inventory. It is interesting to note a difference between this recommendation and the actual Inventory, namely the use of the word suitability instead of capability. Definitions of both suitability and capability may be broad or narrow, based on one or two variables or a comprehensive study of many variables. However, suitability implies more of a value judgement and presumably a more comprehensive analysis than does capability. For example, capability of the soil for agriculture might mean its physical ability or capacity for production assuming good farming practices. Suitability for agriculture, on the other hand, might be based on consideration of socio-economic factors such as unemployment, prices, and tradition; ecological and aesthetic factors such as water quality; and the benefits foregone if alternative uses are rejected.
It is significant that CLI surveyed capabilities, not suitabilities as the Senate Committee suggested. The data gathered by CLI, although not completely free from value judgements, are thus not so encumbered with them as a suitability survey would have been. CLI data can be used as an input into decisions about suitable land use patterns. Suitability may be defined as the user sees fit, so that federal and provincial planners can make their own decisions about suitable land uses rather than having decisions made for them. This extends the usefulness of the CLI data base considerably.

No immediate development activity followed the 1958 recommendation for an inventory. However, in June, 1961, the Agricultural Rehabilitation and Development Act became law (Report No. 1, 1965, p. 1). Administrators of this act were to be the initial developers of CLI. In October, 1961, 4 months after the Act became law, at the 'Resources for Tomorrow' Conference held in Montreal, resource specialists stressed the forestry, wildlife, and recreation (Report No. 1, 1970, pp. 1, 52-55).

Planning. In 1962, development of CLI officially began. The organizational structure, responsibility and financial capability for the inventory were set up in the Agricultural Rehabilitation and Development Administration (Report No. 1, 1970, p. 4). From the beginning, ARDA called on experts and recognized the interest of the provinces in the results of the inventory. In November, 1962, the federal ARDA Administration held a seminar in Ottawa on a 'National Land Capability Inventory', the purpose of which was 'to discuss the need, feasibility, possible scope and form of a nation-wide inventory' of land capability. Experts from all parts of Canada participated in the seminar, and all agreed that an inventory was needed (Seminar, 1962, pp. 1-2; Report No. 1, 1970, pp. 4-5).

Several more specific recommendations resulted from the seminar. First, many participants felt that provincial authorities should conduct the actual inventory using categories standardized nationally. Participants felt this would facilitate both any further aggregation desired for national planning and any more detailed studies for local purposes. This recommendation has been followed (Seminar, 1962, p. 2; DREE, 1970, Intro.; Report No. 1, 1970, p. 5).

A second recommendation made at the 1962 seminar advocated separate classification systems for each use inventoried, as opposed to a general classification as good, marginal or bad for agricultural use alone. Participants felt it desirable to inventory capabilities for several uses in addition to agriculture; namely, forestry, wildlife management, recreation and urban development. Participants also recommended a survey of present land use. These recommendations too were carried out in CLI. Inventories of economic and climatic factors were also recommended at the seminar.
Participants felt that economics could be a limit to capability just as could physical factors, because costs and prices influence production just as climate and soil fertility do. Despite these reasons, economic and climatic data have not been mapped as part of the CLI programme. However, some economic data are available on Census tapes through the use of CGIS. A system for classifying climate has also been studied, though no firm commitment has been made to develop such a system fully (Seminar, 1962, pp. 5-7, 10; Report No. 1, 1970, pp. 14-15). In another recommendation, participants cited the availability of considerable amounts of data for some factors (such as soils data for agriculture) and lack of data for others (such as land capability for wildlife). They stated that in order to use available data and gather new data in a relatively short time, the inventory should not be extremely detailed but rather of a 'reconnaissance' type. In addition to being quicker than a detailed survey, the general survey would be focussed on 'areas of major concern' rather than exclusively local problems (Seminar, 1962, pp. 3-4, 9). Finally, participants pointed out the need for further definition of the purposes and scope of the inventory with regard to federal and provincial administration of the ARDA programme (Seminar, 1962, pp. 9-10). Implementation. In 1963, the ARDA Administration took several steps toward implementing CLI. In May, 1963, the National Soil Survey adopted a national soil capability system for agriculture developed at ARDA's request; in October, the Government of Canada approved funding for CLI; and in November, the Canadian Council of Resource Ministers recommended that CLI be undertaken through agreements between ARDA and the provinces. This was done, with the provinces carrying out the inventory using federal funds (Report No. 1, 1970, p. 5). 
Although the inventory did not begin until 5 years after the Senate Committee's 1958 recommendation, by 1963 support for the CLI seems to have been quite broad. That the Inventory began in 1963 was at least partly due to earlier planning and design efforts, in which people from all parts of Canada and both federal and provincial governments as well as private citizens had participated. The accomplishment of CLI required co-operation among people from eight federal departments, at least two departments of each provincial government, and four universities. Disciplines represented included agriculture, forestry, fisheries, wildlife, meteorology, economics, statistics, soils, mines and water. Appendix 1 lists the co-operating federal, provincial and university interests as they were listed in the 1970 edition of CLI Report No. 1.

Organization and management. Responsibilities of the federal government included standardizing inventory classification systems and methodologies across Canada, providing technical and financial assistance to the provinces for conducting the inventory, publishing the results, and providing a data processing system (CGIS). Responsibilities of the provincial governments included actually conducting the inventory, providing the data to the federal government, and publishing any desired maps of the province (Report No. 1, 1965, pp. 18-21).

Scope. The objectives and organizational environment of CLI have influenced the type of data classifications included in the inventory and the geographic area covered by the inventory. The decision about what variables to include in such an inventory is, of course, crucial to the later usefulness of the inventory.

Sponsored by the ARDA Administration and aimed at rural development planning, CLI was not designed to cover the entire land area of Canada. Rather, it covers those parts of Canada described as 'settled portions of rural Canada and adjoining areas which affect the income and employment opportunities of rural residents'. The area covered is mostly in the southern part of Canada, extending east and west to the coastlines, and amounts to approximately 2.5 million km² (1 million sq. miles) (Fig. 2) (Report No. 1, 1970, p. 3; DREE, 1970, Intro.).

In line with its rural orientation, the data variables, or coverages as they are called, concern resources rather than sociological factors. Four capabilities¹ are consistently mentioned in available documents and seem to have constituted the heart of the inventory from the beginning. Capability for supporting agriculture, forestry, recreation, and wildlife formed the nucleus of discussions as early as the 1961 Resources for Tomorrow Conference (Report No. 1, 1970, pp. 52-55). These four variables are included in every list published since 1961. In more recent publications, wildlife is broken down into ungulates and waterfowl, and sport fish are included (Table 2).

Table 2.

CLI data variables - proposed and adopted. Sources: (In order for each column) Report No. 1, 1970; Report of Seminar ..., 1962; Tomlinson, 1965; Socio-Economic Variables ...; Report No. 1, 1965; Symington, 1973; Report No. 1, 1970; DREE, 1970; Environment Canada, March 1973; W. A. Switzer, 1975, personal communication.

Variable                 Sources marking the variable (x), of the ten listed
Agriculture              x x x x x x x x x x
Forestry                 x x x x x x x x x x
Recreation               x x x x x x x x x x
Wildlife                 six sources
*Ungulates               four sources
Waterfowl                four sources

[Sport fish, Present land use, Socio-economic, Agro-climatic, Urban development, Economic and Census subdivisions are each marked in only some of the ten sources.]

1. 'Capability' will be used interchangeably with 'coverage' to denote data for a single variable covering the study area shown in Fig. 2.

Three other variables are often mentioned in various forms but are not included in the latest document, the 'Index' published by the Lands Directorate, Environment Canada. These are socio-economic (economic or census), present land use, and climatic (or agro-climatic) data. Socio-economic data were often mentioned as an important link between the data on capability and use and the ARDA objective of improving the lot of the rural citizen (Seminar, 1962; Socio-Economic Aspects, 1965; Socio-Economic Variables, undated; Report No. 1, 1965, 1970; Symington, 1973). A consultant's draft report to ARDA on socio-economic variables aptly stated that 'changes in land use also mean changing the use of capital and manpower' (Socio-Economic Variables, p. 3). Participants in ARDA's 1962 Seminar also pointed out that economic factors can limit production. Socio-economic data are necessary to comprehensive planning. In spite of these justifications, socio-economic data were not mapped by CLI. CLI developers evidently felt that a user who desired them could obtain Census data from the Dominion Bureau of Statistics (Statistics Canada) or through CGIS (Report No. 1, 1970, pp. 14-15). If the CLI programme attempted to map socio-economic data, it would be duplicating the efforts and mandate of Statistics Canada. It is thought better to input the boundary information available from Statistics Canada (as is currently done), and to relate to it the mass of data collected by Statistics Canada and available on summary tapes (W. A. Switzer, 1975, personal communication). Present land use is the second variable which CLI designers described as important; although it has been mapped, it has not been published except for selected maps in one or two locations (W. A. Switzer, 1975, personal communication). This variable too is available to planners through the computer mapping process of the CGIS (Report No. 1, 1970, pp. 14-15). 
A preliminary classification system for climatic data was developed as recommended, but inventory and mapping of this variable as such were not made part of CLI. Climatic information did play a part in several of the capability classification systems developed for other variables, however (Report No. 1, 1970). Development of capability classifications. Land capability is a derived variable that cannot be measured directly and simply, but rather depends on an interpretation of raw data. Definitions of capabilities used in CLI were developed for it through the co-operation of federal and provincial agencies. In general, CLI capabilities refer to the physical capability of the land to support varying levels of intensity of activity or use, in other words, the productivity of the land. Accessibility and economic demand were not considered in assessing capability; nor was present land use, because unless the land was urbanized its actual use did not influence its capability. Good resource management practices, private as well as public, were assumed. However, as defined earlier, capability is not necessarily the same as suitability in terms of environmental quality or socio-economic considerations. The CLI classification of capabilities is given in Appendix 2. Developing capability classifications posed more of a problem for some variables than for others. Classification systems were developed separately for each variable and brought together data previously unco-ordinated. Availability of data as well as the general resource orientation of the variables played a part in the design of the classification systems. Soils and climatic data figured prominently in a number of the capability classifications (Seminar, 1962; Report No. 1, 1970). The soil capability classification system for agriculture was developed by the National Soil Survey Committee in co-operation with the federal and provincial ARDA administrations, assisted by the Soil Conservation Service of the U. S. 
Department of Agriculture. Developers of the classification system built on the soil survey experience of the Canada Department of Agriculture, provincial governments, and agricultural colleges (Report No. 2, 1972, pp. 2-3).

The land capability classification system for forestry was developed through co-operative efforts of the Canada Department of Forestry and Rural Development and provincial forestry departments. The system was tested in pilot projects in each province before its approval in 1965 (Report No. 1, 1970, p. 9; Report No. 4, 1972, p. 1). CLI Report No. 4, 'Land Capability Classification for Forestry', includes the regional reports that formed the basis for the national classification system. These reports described the particular features in each province that influence the forest productivity of the land, and the methods by which those features were translated into national classification ratings. Between 1965 and 1967, the system was twice revised by the National Committee on Forest Land, a group representing all Canadian agencies that classify and administer large land areas (Report No. 4, 1972, p. 1).

Development of the land capability classification for outdoor recreation was guided by the Recreation Sector of CLI. A Committee of the Federal-Provincial Parks Conference helped in initial development in 1965. The first system was tested in 1965, and a revised system was adopted at a meeting of representatives of all interested agencies held in February, 1966. Guidelines were further revised in 1966. Staff from the National and Historic Parks Branch, Department of Indian Affairs and Northern Development, helped to co-ordinate development of the system (Report No. 6, 1969, p. 1; Report No. 1, 1970, p. 12).

Officials of the Canadian Wildlife Service and provincial game and wildlife agencies developed the land capability classification system for wildlife (ungulates and waterfowl). An initial system was developed and reviewed at regional and national meetings in 1964 and 1965, and a system was adopted in July, 1965, at the Federal-Provincial Wildlife Conference. Only minor changes were made in this system after survey and mapping work was begun (Report No. 7, 1969, p. 1; Report No. 1, 1970, p. 13).

A system for classifying the capability of water bodies and watersheds for sportfish was developed, but no available documents report who developed it. Data on sportfish are only available through CGIS (Report No. 1, 1970, pp. 13-14; CGIS Vol. 4, 1972, Ch. 11).

The Geographical Branch of the Canada Department of Energy, Mines and Resources has been mapping present land use since 1950. In 1963, ARDA requested that the Branch develop a land use classification system compatible with other CLI classifications. The Branch did so, with the advice of academicians as well as government researchers. CLI did not map present land use for distribution; however, data and maps are available through CGIS (McClellan et al., 1968; Report No. 1, 1970, p. 14).

Products and their use. CLI data were collected by the provinces at the 1:50,000 scale, which was considered by the provincial governments to be the smallest scale at which data pertinent to provincial land management practices could be recorded.
The federal agency responsible for CLI then generalized the maps to a 1:250,000 scale and published them (CGIS Vol. 1, 1970, Intro.). Users of CLI maps include planners, engineering firms, individuals, conservation authorities, and a variety of users in private, municipal, provincial and federal agencies. CLI maps are distributed to frequent users on a regular basis and to others who may request them. Environment Canada records indicate that CLI distributed more than 10,000 printed copies of various maps (coverages and areas) up to the beginning of 1973. In addition, the work sheets at 1:50,000 have been available through provincial offices.

To supplement the maps, CLI developers have issued reports explaining in detail their classification system. This service also makes CLI data more useful in that the user can assess the relevance and limitations of each classification system in terms of his own purposes. The following reports are, therefore, an extremely important aspect of CLI.

Report No. 1, 1970, 'The Canada Land Inventory, Objectives, Scope and Organization', Department of Regional Economic Expansion.
Report No. 2, 1972, 'The Canada Land Inventory, Soil Capability Classification for Agriculture', Environment Canada, Land Inventory.
Report No. 4, 1972, 'The Canada Land Inventory, Land Capability Classification for Forestry', Environment Canada, Land Inventory.
Report No. 6, 1969, 'The Canada Land Inventory, Land Capability Classification for Outdoor Recreation', Department of Regional Economic Expansion.
Report No. 7, 1969, 'The Canada Land Inventory, Land Capability Classification for Wildlife', prepared for the Canada Land Inventory by N. G. Perret, Department of Regional Economic Expansion.

Pilot studies on land use planning. In order to demonstrate and assess the usefulness of CLI data to the process of land use planning, CLI developers were authorized in 1967 to fund pilot studies that would use CLI data in land use planning.
Such studies, called land capability analyses, are evidently being undertaken in British Columbia and Quebec. Little information is available about them, but they seem to involve mapping areas of Classes 1, 2 and 3 for agriculture, forestry, recreation, ungulates, and waterfowl; determining overlaps; adding agricultural and forestry lands of Class 4 and urban areas; and finally, analysing the resulting pattern in some way that allows for consideration of present land use and socio-economic factors (Report No. 1, 1970, pp. 16-18; CGIS Vol. 1, 1970, 'Land-Use Planning'; Environment Canada, 1973a). Future plans.

Plans for future development of CLI are undocumented, but presumably they include completing the map series at 1:250,000 scale. No plans are mentioned for updating the maps in the future, and no estimate is made of the length of time the maps now in production will be current. These considerations are important, because they influence the potential long-range usefulness of the data base.

Development of the Canada Geographic Information System

Feasibility studies. The idea of developing a geographic information system using CLI data is first documented in ARDA's 1962 Seminar on a National Land Capability Inventory (Tomlinson, 1962). While planning and designing the Inventory, ARDA was also investigating a computer mapping system. Spartan Air Services, Ltd. carried out a further feasibility study for ARDA during 1963 (Tomlinson, 1963). Its report reviewed data available to a computer system, input and output devices, data processing and analytical capabilities. The report reflected government estimates that 1,600 to 1,700 maps would be available for processing (Table 3). These maps would come from CLI, the Dominion Bureau of Statistics (Statistics Canada), the Ontario Research Foundations, the Geographic Branch, Department of Energy, Mines and Resources, and the provincial governments. The report noted that unmapped statistical data were also available from these and other sources, some of them (notably, census tapes) in machine-readable form.

The report stated that statistical data posed no input or output problems in terms of equipment or programming. Mapped data posed no output problems, but no completely automated input mechanism was then available. It was recommended that development of this type of equipment would best serve ARDA's purposes. Non-automatic and automatic digitizers that involved manual tracing of boundary lines were ruled out as too time-consuming and costly. The input process recommended in the report was essentially the one later developed for CGIS by IBM.
It consists of a drum scanner which automatically digitizes polygon boundaries scribed in scribecoat; a digitizer, which is used to encode one point within each polygon; and a key-to-tape device which is used to encode descriptions of each polygon. IBM later termed the design and use of the drum scanner 'the first of many technical breakthroughs' made in developing CGIS (IBM Canada, 1968, p. 2).

The 1963 report recommended development of a rectangular co-ordinate system and a compact notation for data storage. Also recommended was the use of a computer capability comparable to that of the IBM 7000 series, which was the most advanced available in 1963. Analytical capabilities were to be reviewed by a team consisting of an agro-economist and a mathematician-statistician backed up by programmers (Tomlinson, 1963).

A second feasibility study was prepared for the government in 1965 (Tomlinson, 1965). This report expanded on the first one and emphasized the economic feasibility of a computer system. There are several interesting differences between the 1963 and 1965 reports. The estimated number of maps available to the system increased from 1,600-1,700 to 2,500-3,000. The CLI programme was to take 3 instead of 2 years, and the minimum number of maps the system should be capable of handling in a year rose from 1,000 to 1,200 (Table 3). The second feasibility report did not disagree with the recommendations of the earlier one; however, the recommended system was compared with other means of processing the data. The result is a reiteration of the 1963 recommendations, with the addition of some tentative cost estimates.

Three general means of analysing maps were identified and evaluated. Map estimation, the term used for examining maps and overlays visually, was thought to provide a good start, but it was subjective and became quite difficult at a scale as small as 1:250,000. It was concluded that map estimation alone was insufficient to fulfil CLI objectives.
A second method, manual map measurement, involved use of a dot grid superimposed on a map or a polar planimeter to assess areas of polygons on a map. This method was accurate and worked well for small areas represented at small scales, but it was time consuming and expensive for a large number of maps. The third method, computer map assessment, had great potential, but involved high initial cost and uncertainty due to the newness of the techniques and equipment.

Further comparisons of manual and computer methods of map measurement were made by estimating the costs for performing 30 comparisons of two variables for an area of 700,000 km² (270,000 sq. miles). Manual assessment was applied to maps at two scales: 1:250,000, and optimum scale, that is, the scale at which the data were available, usually 1:50,000. Costs are summarized in Table 4. From these comparisons several conclusions were drawn. Manual map measurement at the optimal scale was far too expensive. Analysing maps at only the 1:250,000 scale was insufficient because it would not serve the provinces. Computer map measurement had a high initial cost, but subsequent costs would be lower. Additionally, it was determined that the computer method of map measurement and analysis was far superior to the other two, in that the original data are coded and can be manipulated for future uses quickly without more manual measurement.
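The dot-grid technique named above under manual map measurement can be illustrated with a short sketch. It is not taken from the feasibility reports: the function names, the choice of placing one dot at the centre of each grid cell, and the test polygons are all assumptions made for illustration. The idea is simply to count the dots that fall inside a polygon and multiply by the area each dot represents.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon.

    `polygon` is a list of (x, y) vertices in order (hypothetical format).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dot_grid_area(polygon, spacing):
    """Estimate polygon area by counting grid dots that fall inside it.

    One dot is placed at the centre of every square cell of side `spacing`
    (an assumed layout); each dot then stands for spacing * spacing of area.
    """
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    i0, i1 = math.floor(min(xs) / spacing), math.ceil(max(xs) / spacing)
    j0, j1 = math.floor(min(ys) / spacing), math.ceil(max(ys) / spacing)
    count = 0
    for i in range(i0, i1):
        for j in range(j0, j1):
            cx = (i + 0.5) * spacing
            cy = (j + 0.5) * spacing
            if point_in_polygon(cx, cy, polygon):
                count += 1
    return count * spacing * spacing


if __name__ == "__main__":
    triangle = [(0, 0), (10, 0), (0, 10)]  # true area is 50 square units
    for spacing in (1.0, 0.5):
        print(spacing, dot_grid_area(triangle, spacing))
```

A finer grid brings the estimate closer to the true area but requires proportionally more dots to be counted, which is consistent with the observation above that the manual method becomes time consuming and expensive for a large number of maps.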

Table 3. Comparison of estimates of numbers of maps available to CGIS. Sources: Tomlinson, 1963; Tomlinson, 1965.

Coverage                     1963                1965
Agriculture                  135-400 sheets*     135-500 sheets*+
Forestry                     ca. 200 sheets      ca. 200 sheets
Wildlife                     135-150 sheets      135-150 sheets
Recreation                   135-150 sheets      135-150 sheets
Census boundaries            500 maps**          500 maps**
Climate                      2-14 maps***        2-14 maps***
Land use                     400-500 sheets      1200-1500 sheets*+
Actual total of estimates    1507-1914           2307-2914
Estimated total reported     1600-1700           2500-3000+

1963 Recommended input capability: minimum of 1000 maps/year (2-year CLI programme)
1965 Recommended data handling capacity: minimum of 1200 maps/year (3-year CLI programme)

* Various scales 1:50,000 - 1:250,000.
** Various scales 1:50,000 - 1:500,000.
*** 1:2,000,000 scale.
+ New estimates reported in Economic Feasibility Report, ca. 1965.

Table 4. Total cost estimated in 1965 to complete 30 assessments in 3 years.

Method                                          Cost and computer time                            Manpower
1. Manual map measurement at 1:250,000          $879,292 + 122 hr IBM 1401                        58 persons
2. Manual map measurement at optimal scale      $8,414,145 + 488 hr IBM 1401 or 50 hr IBM 7000    556 persons
3. Computer map measurement at optimal scale    $1,112,202 + 2,047 hr IBM 7000                    13-27 persons

The report stressed that it was impossible to make firm cost estimates for a computer system because it would be a state-of-the-art development. Cost could not be judged until such a system was actually developed and used. However, it was stated that relative costs could be estimated and from this a judgement made whether to develop the system or not. Table 5 shows cost estimates over the 3 years estimated for the development of the system.

Table 5. Cost estimates for 12 months made in 1965. Source: Tomlinson, 1965.

Function                                   $/yr       $/yr       $/yr
Map preparation
  Labour                                 30,000     50,000     13,661
  Material                                4,800      8,000      2,200
Input system
  Non-automatic unit, semiautomated      40,000          -          -
  Operator                                7,000      7,000      7,000
  Drum scanner                           75,000     76,041          -
  Operator                                7,000      7,000      7,000
Program control (ADET)                  150,000    150,000    150,000
Programming                             100,000    100,000     50,000
Statistical data reduction                2,500      5,000          -
Co-ordination                            21,000     21,000     21,000
Total                                   437,300    424,041    250,861
Machine time, hr (IBM 7000 series)          200        800      1,047
  cost, $                                30,000    120,000    157,050
Total, $                                467,300    544,041    407,911
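The arithmetic behind Table 5 can be checked with a few lines of code. The sketch below is illustrative only: the dictionary layout and function names are hypothetical, empty cells are entered as zero, and the placement of each figure in a particular column is an assumption inferred from the printed column totals. Under that reading, the line items add up to the printed subtotals, the machine cost in every column corresponds to the same hourly rate, and the three subtotals and the machine hours add up to the $1,112,202 and 2,047 hr shown for computer map measurement in Table 4.

```python
# Figures as read from Table 5 (dollars per 12-month period, three periods).
# Empty cells are entered as 0; the column placement is an assumption.
LINE_ITEMS = {
    "Map preparation, labour": (30_000, 50_000, 13_661),
    "Map preparation, material": (4_800, 8_000, 2_200),
    "Non-automatic input unit": (40_000, 0, 0),
    "Operator (non-automatic unit)": (7_000, 7_000, 7_000),
    "Drum scanner": (75_000, 76_041, 0),
    "Operator (drum scanner)": (7_000, 7_000, 7_000),
    "Program control (ADET)": (150_000, 150_000, 150_000),
    "Programming": (100_000, 100_000, 50_000),
    "Statistical data reduction": (2_500, 5_000, 0),
    "Co-ordination": (21_000, 21_000, 21_000),
}
PRINTED_SUBTOTALS = (437_300, 424_041, 250_861)
MACHINE_HOURS = (200, 800, 1_047)
MACHINE_COSTS = (30_000, 120_000, 157_050)
PRINTED_TOTALS = (467_300, 544_041, 407_911)


def column_sums(items):
    """Sum each of the three period columns over all line items."""
    return tuple(sum(column) for column in zip(*items.values()))


def implied_hourly_rates(costs, hours):
    """Machine cost divided by machine hours, one value per period."""
    return tuple(cost / hr for cost, hr in zip(costs, hours))


def grand_totals(subtotals, machine_costs):
    """Subtotal plus machine cost, one value per period."""
    return tuple(s + m for s, m in zip(subtotals, machine_costs))


if __name__ == "__main__":
    print("line-item sums:", column_sums(LINE_ITEMS))
    print("hourly rates:  ", implied_hourly_rates(MACHINE_COSTS, MACHINE_HOURS))
    print("grand totals:  ", grand_totals(PRINTED_SUBTOTALS, MACHINE_COSTS))
    print("3-period cost: ", sum(PRINTED_SUBTOTALS))
    print("3-period hours:", sum(MACHINE_HOURS))
```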

These two feasibility reports evidently formed the basis for future development of CGIS. IBM had provided information used in these studies and was later hired to develop the system under government direction. In November, 1965, the Government approved initial developments of CGIS and in September, 1966, approved the contract with IBM Canada for development of CGIS hardware and software under government direction.

A team of specialists to monitor the system's compliance with user needs was still considered necessary if a computer system were developed. The team was called an Assessment Determination and Evaluation Team (ADET) and was established under contract at Spartan Air Services, Ltd. in 1965. There is evidence that ADET was formed ('Socio-Economic Variables to be Included in the CLI GeoInformation System' is a draft ADET report), but personnel, duties, source of funds and dates of operation are not documented in available literature. The contract with Spartan Air Services, Ltd. was terminated in 1967 and the functions were nominally absorbed by government staff of the Rural Development Branch, of which CLI was a part.

Organizational environment of CGIS and CLI. Initially sponsored by ARDA in the Canada Department of Agriculture, CLI and CGIS have changed departments three times since 1963. By 1965, CLI and CGIS were in the Department of Forestry and Rural Development. Then by 1968 they became part of the Department of Regional Economic Expansion. The latest move, which took place in 1972, put CLI and CGIS in Environment Canada, Lands Directorate. Some of these changes were due to governmental reorganization, and others to changing departmental objectives and priorities (Report No. 1, 1965; McClellan et al., 1968; Environment Canada, 1973a; Report No. 2, 1972; Switzer, 1973a).

The basic objectives and geographical scope of CLI do not appear to have changed as a result of these organizational shifts. Report No. 1 on CLI, 'Objectives, Scope and Organization', published in 1965 by the Department of Forestry and Rural Development, is substantially the same as the 1970 report of the same name published by the Department of Regional Economic Expansion.

There seems to have been an interaction between some aspects of CGIS and CLI, for example, the availability of present land use and socio-economic data only on computer-generated maps instead of published maps. One of the major objectives for CGIS from the beginning was to provide the ability to compare statistical data with mapped CLI data. The early dependence of CGIS on CLI played a major role in determining the characteristics of the current system. Efforts are currently under way to expand the system's operational capabilities (Environment Canada, 1973b, p. 7).

System name. Changes in the organizational environment were accompanied by changes in the system name.
Originally designated simply as a 'Computer Mapping System' (Tomlinson, 1963), it was referred to as a 'Geographic Information System' (Tomlinson, 1968; IBM Canada, 1968) or 'Geo-Information System' (GeoIS) (Tomlinson, 1967), and then officially became the 'Canada Geographic Information System', CGIS (CGIS, 1972; Tomlinson, 1973).

Development between 1965 and 1968. By 1965, when CLI and CGIS were the responsibility of the Rural Development Branch, Department of Forestry and Rural Development, the development of the computer mapping system had been studied and approved. Between 1965 and 1969, several articles were written about CGIS. As a result of these and perhaps other sources of information, the system gradually became known and predictions about its capabilities and availability were made public. Unfortunately, the system was not available for use when expected. This can be attributed mainly to the fact that CGIS was the first system of its kind and there was little information on which to base predictions of either capability or availability. The economic feasibility report of 1965 had stated that the proposed computer mapping system could be developed in 3 years. Two years later, in a report entitled 'An Introduction to the Geo-information System of the Canada Land Inventory', it was predicted that routine use of the system would be possible in September 1967. A later revision of these reports, entitled 'A Geographic Information System for Regional Planning', put the completion date a year later, in September 1968. However, circumstances common to state-of-the-art development caused further delays. Both these articles did, however, distinguish the CGIS 'information system' from the 'data bank'. The information system was defined as the equipment and programs needed to input and process data, as opposed to the machine-readable data files, which were called the data bank.
It was stated, 'It is quite possible to have the entire geographic information system with full operating capability and have no data in the data bank.' (Tomlinson, 1968, p. 201). Developing processing capability is a large task and an important step toward providing a usable information system, but without a data bank, processing capabilities are useless. Developing a data bank is also time-consuming and CGIS developers are still working on it. The 1:250,000 data base is now 90% complete. In early documents about CGIS it does not seem to have been made clear that the information system and data bank would take some time to develop. An IBM Canada report written in 1968 stated at one point that the information system was just beginning to input maps, but at a later point the document lists 'present base coverages', failing to distinguish between the information system and the data bank. This clearly points to the need for very careful thought about system documentation in order to avoid confusion on the part of potential users. Developers underestimated not only the problems but also the costs of system development. However, these problems are not unique to CGIS. Evidently, in a state-of-the-art system neither developers nor potential users can predict the cost-effectiveness of the system but must wait until it is operating to evaluate it. Finally, it must be pointed out that it is virtually impossible to predict the nature and magnitude of


the problems that will be encountered in a system development programme as fundamental and original as CGIS. One must be prepared to accept significant differences from original estimates of time and cost. Such projects are usually made operational by the efforts of a highly skilled management unit and a staff of competent technicians.

Current developments, 1969-73. In 1969, realizing that premature knowledge of the system could only lead to the dissatisfaction of potential users, developers ceased to discuss CGIS formally until the system became operational in 1971, when Environment Canada actually began automating data. For about 9 months in 1971, data were input at the 1:50,000 scale. However, the input process proved to be too slow at that time. It was estimated that it would take 10 to 15 years to complete the data base at that scale and at the level of processing efficiency and computer time and costs applicable in 1971 (Switzer, 1973a), rather than the 5 years hoped for earlier (Table 6).

Table 6. Time needed to create CGIS data bank.

Source                                     No. of maps   Scale       Years   Rate
1963 Feasibility Report                    1600-1700     optimal     2       1000/year
1965 Economic Feasibility Report           2500-3000     optimal     3       1200/year
July 1969 GeoIS Project Discussion         22,000        1:50,000    5       4400/year
July 1971 Map Production Est. (Revised)    2,365         1:250,000   2       5/day
July 1973 Environment Canada               2,420         1:250,000   ?       60-70/month (since Apr. 1973)
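The rates in Table 6 follow from dividing the number of maps by the planned number of years. The short Python sketch below reproduces that arithmetic for the 1:50,000 and 1:250,000 estimates. The figure of 250 working days per year is an assumption introduced here for illustration (the source does not say how the '5/day' rate was derived), and the row labels are shortened paraphrases.

```python
# Map counts and planned durations as given in Table 6.
ESTIMATES = {
    "1:50,000 data base (5-year plan)": (22_000, 5),
    "1:250,000 data base (July 1971 revised estimate)": (2_365, 2),
}

# ASSUMPTION: about 250 working days per year; not stated in the source.
WORKING_DAYS_PER_YEAR = 250


def maps_per_year(n_maps: int, years: float) -> float:
    """Average number of maps that must be input per year."""
    return n_maps / years


for label, (n_maps, years) in ESTIMATES.items():
    yearly = maps_per_year(n_maps, years)
    daily = yearly / WORKING_DAYS_PER_YEAR
    print(f"{label}: {yearly:.0f} maps/year, about {daily:.1f} per working day")

# The 10-to-15-year estimate for the 1:50,000 scale implies a much lower rate.
for years in (10, 15):
    rate = maps_per_year(22_000, years)
    print(f"22,000 maps in {years} years: {rate:.0f} maps/year")
```

Under the assumed working year, the 1971 estimate comes to a little under five maps per working day, which is consistent with the '5/day' entry, while the 10-to-15-year estimate corresponds to between one third and one half of the 4400 maps per year originally hoped for.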


B. MLMIS numerical description of section. Numbering system for forties and lots. Source: Orning and Maki (1972).

Besides these geocodes based on the Land Survey system, each parcel is referenced to a county, minor civil division (MCD), and latitude and longitude co-ordinates. Counties are numbered from 1 to 87 in alphabetical order. MCDs in each county are numbered in alphabetical order. Longitude and latitude are coded for the centroid of each MCD.

Data encoding process. Once the geocoding system was established, the data were converted to machine records using the process shown in Figure 46. This original process will not be used for any further data input because of the subsequent development of a data entry system using a cathode-ray tube in interactive mode. The original data capture procedure was as follows. Townships were outlined on the aerial photographs. A mylar overlay was used to define the 16 forties within each of the 36 sections in the township. A team of three students encoded the data. Two interpreted the dominant use of each forty, thus double-checking each other; the third recorded their interpretation on a county highway map. Next, another student mark-sensed the data on punch cards on which geocodes for each forty had been punched. The data were transferred to tapes by a Motorola mark-sense reader, then run through the Minn-Map program, which checked for errors. Finally, any errors were corrected using Control Data Corporation's 7000 Modify/Scope system (Orning and Maki, 1972, p. 67). Interpretation times were calculated in the Northwest Minnesota pilot study. 'The average interpretation time for northern Minnesota was 26 minutes per township. This does not include field checking, report writing, or preparation of photos and maps. Interpretation time ranged from 5 minutes to 110 minutes (per township).' (Fig. 47) (Orning and Maki, 1972, p. 67). Coding land use data for the state took somewhat over 2 years. The work was done by undergraduates at the University of Minnesota at $1.90 per hour, an essential factor in cost cutting (Yaeger, 1973a).
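The nesting of the geocode (county, township, range, section, forty) can be pictured as a composite record key. The Python sketch below is only an illustration under stated assumptions: the class name ParcelKey, its field names, the township and range values, and the simple 1-to-16 forty index are all hypothetical, and the actual MLMIS forty and lot numbering differs. Only the counts (87 counties, 36 sections per township, 16 forties per section) and the 26-minute average come from the text.

```python
from dataclasses import dataclass
from itertools import product

SECTIONS_PER_TOWNSHIP = 36  # from the text
FORTIES_PER_SECTION = 16    # from the text


@dataclass(frozen=True, order=True)
class ParcelKey:
    """Hypothetical record key mirroring the nesting of the land survey."""
    county: int    # 1-87, numbered in alphabetical order (from the text)
    township: int
    range_: int
    section: int   # 1-36
    forty: int     # 1-16 here; the real MLMIS numbering scheme differs


def forties_in_township(county: int, township: int, range_: int) -> list[ParcelKey]:
    """Enumerate a key for every forty of one township."""
    return [
        ParcelKey(county, township, range_, section, forty)
        for section, forty in product(
            range(1, SECTIONS_PER_TOWNSHIP + 1),
            range(1, FORTIES_PER_SECTION + 1),
        )
    ]


# Township and range numbers below are made-up example values.
parcels = forties_in_township(county=1, township=140, range_=30)
print(len(parcels))            # number of forties in one township
print(26 * 60 / len(parcels))  # average seconds of interpretation per forty
```

With 576 forties per township, the reported average of 26 minutes per township corresponds to under three seconds of interpretation per forty, which helps explain why student labour at $1.90 per hour could cover the state in about two years.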
Problems in distinguishing land use categories from the aerial photographs were documented in Land Management Information in Northwest Minnesota (Orning and Maki, 1972). The major source of confusion was in determining where one land use ended and another began so that the predominant use of the forty could be determined. The report highlights the difficulties as follows: 'Five categories of physical characteristics were isolated by the interpreters as problems most affecting interpretation time: beach ridges, open-cultivated decisions, marsh-open decision, forest-swamp decision, and forest-open decisions. In addition to these problems, two cultural characteristics were found to affect interpretation time. In larger urban centers and in the extractive areas of the Iron Range, interpretation time tended to be greater.' (Orning and Maki, 1972, p. 67) (Fig. 47).

The Minnesota Land Management Information System (MLMIS)

Fig. 46. Original MLMIS data input process.

1. A mylar overlay (1 township, 36 sections, 576 forties) is placed on a 1:90,000 print of the aerial photograph.
2. Two students interpret dominant land use and water orientation for each forty from photographs.
3. Third student records data for each forty on a county highway map.
4. Geocodes are punched, one card per forty.
5. Fourth student mark-senses data from map on appropriate card.
6. Mark Sense Reader transfers data to tapes.
7. Minn-Map programs detect errors.
8. Errors are corrected on tapes using CDC 7000 Modify/Scope system.

In order to facilitate future data input, the capability to display and enter data using a cathode ray tube (CRT) has been developed and is being further refined. Half a township can be displayed on the screen and modified by using the LANDUSE and SCAN subroutines. The SCAN subroutine displays raw data of the parcel records in list form. The LANDUSE subroutine displays the data spatially by displaying the land use for each parcel in the half-township as a two-digit number (Fig. 48). The CRT is on-line to a CDC 3200 computer. Data files for the 3200 are built from the MLMIS data stored on a CDC 6600. Data are read off tapes by the 6600 computer and transferred via phone lines to the 3200, which writes the data on a disk pack for random access (Fig. 49) (Outline of MLMIS Data.. .; Kozar and Orning, 1972).
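The spatial display produced by LANDUSE can be imitated with a few lines of Python. This is a sketch under assumptions, not the MLMIS subroutine itself: it assumes that a township is a 24 by 24 grid of forties (6 by 6 sections of 4 by 4 forties each), that the half-township shown on the screen is a block of 12 rows by 24 columns, and the codes 22 and 66 are placeholder values with no meaning attached. The function name render_half_township is hypothetical.

```python
SECTION_SIDE = 4                   # a section holds 4 x 4 = 16 forties
TOWNSHIP_SIDE = 6 * SECTION_SIDE   # 6 x 6 sections give 24 x 24 forties


def render_half_township(codes: list[list[int]]) -> str:
    """Format a 12-row by 24-column block of land use codes as two-digit numbers."""
    rows, cols = TOWNSHIP_SIDE // 2, TOWNSHIP_SIDE
    if len(codes) != rows or any(len(row) != cols for row in codes):
        raise ValueError(f"expected {rows} rows of {cols} codes")
    if any(not 0 <= code <= 99 for row in codes for code in row):
        raise ValueError("land use codes must fit in two digits")
    return "\n".join(" ".join(f"{code:02d}" for code in row) for row in codes)


# Hypothetical data: code 22 everywhere, with a small patch of code 66.
grid = [[22] * TOWNSHIP_SIDE for _ in range(TOWNSHIP_SIDE // 2)]
for r in range(3, 6):
    for c in range(8, 12):
        grid[r][c] = 66

print(render_half_township(grid))
```

Printing the codes in their grid positions, rather than as a list of records, is what lets an operator spot an isolated or implausible value at a glance, which is presumably why the text pairs a list view (SCAN) with a spatial one (LANDUSE).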


Fig. 47. Interpretation times and problems in the areas of MLMIS coverage.

Distribution of townships by time class and interpretation problems

Interpretation time          Percent of townships   Average number of physical-cultural
                             by time class          problems affecting interpretation time
Very Fast (5-10 minutes)     12                     [illegible]
Fast (11-20 minutes)         26                     1.3
Medium (21-35 minutes)       30                     [illegible]
Slow (35-60 minutes)         27                     2.1
Very Slow (61+ minutes)      5                      2.0
TOTAL                        100%

This time and problem distribution was determined from a systematic sample of 1/3 of the townships in northern Minnesota.

Occurrence of interpretation problems within each time class

Problem            Very Fast   Fast   Medium   Slow   Very Slow   Total
Beach Ridge        3%          7%     5%       5%     12%         4%
Open-Cultivated    20          46     67       77     82          37%
Swamp-Open         13          32     40       63     53          25%
Forest-Swamp       37          30     33       32     18          22%
Forest-Open        8           16     17       22     47          12%
                                                                  100%

Fig. 48. LANDUSE subroutine displays land use in each parcel as a two-digit number.

Fig. 49. Capability to enter data and display on CRT. Source: Outline of MLMIS Data Entry System.

At the 6600 computer:
1. MLMIS data tapes are mounted. Data are read off the tape files and transferred to a special file, which is sent across a telephone line to the 3200 computer.
2. Updated data are written back on to the MLMIS data tape.

At the 3200 computer:
1. Upon reaching the 3200 computer, the data are written on a disk pack in such a way that they can be randomly accessed.
2. The user can request townships to be displayed on the CRT and can enter or correct data on the screen; these corrections are sent back to the disk pack to await further action.
3. When the user has finished entering data for the townships contained on the disk pack, the updated file is sent back across the telephone line to the 6600 computer.

Data storage

The data bank which results from the data input process contains data stored in a sequence determined by county number, township and range numbers, section number, and forty number.

Data retrieval and output

Retrieval. MLMIS is oriented to data input, retrieval and mapping rather than statistical analysis or modelling. SPA has developed a technique of location analysis called EPPL (Environmental Planning and Programming Language) which uses MLMIS data and facilities. EPPL has the ability to analyse data at various grid sizes, down to 1.08 ha (2 2/3 ac). The location of each data cell is referenced by its row and column in the grid, rather than by legal description or other geocodes used in MLMIS. EPPL compares a user's data to specified criteria and rates each data cell based on the criteria. EPPL can also determine whether or not a large enough group of cells that meet the requirements are close enough to each other to fulfil the user's needs.

Output. MLMIS's main output consists of township maps on which the predominant use of each forty is printed as a block, five characters wide and three high, utilizing a line printer. Each land use category has a corresponding unique character. After retrieving data in this form, the user performs any desired analysis thereof. SPA and CURA have co-operated in using MLMIS to produce two major products: the report entitled Land Management Information in Northwest Minnesota (Orning and Maki, 1972), and a state land use map printed in several colours. The report is the most complete system documentation to date and shows how planners can use MLMIS data in regional analysis. The State Land Use Map was produced from tinted photographs of printed township maps reduced to a scale of 1:500,000 (Fig. 50). The composite state map was produced by hard-pasting all the reduced township maps together for each colour separation.
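The line-printer output format described above (one block of five characters by three lines per forty, one character per land use category) can be sketched in Python as follows. The category names, the symbol table and the function name render_map are hypothetical, chosen only for illustration; the source gives the block dimensions but not the actual MLMIS symbols.

```python
BLOCK_WIDTH = 5   # characters per forty, from the text
BLOCK_HEIGHT = 3  # printed lines per forty, from the text

# Hypothetical category-to-character table; the real MLMIS symbols are not given.
SYMBOLS = {"forest": "*", "cultivated": ".", "water": "W", "urban": "U"}


def render_map(land_use: list[list[str]]) -> str:
    """Render a grid of dominant land uses as line-printer text."""
    lines = []
    for row in land_use:
        text = "".join(SYMBOLS[use] * BLOCK_WIDTH for use in row)
        lines.extend([text] * BLOCK_HEIGHT)
    return "\n".join(lines)


# One section (4 x 4 forties) of made-up data.
section = [
    ["forest", "forest", "water", "water"],
    ["forest", "cultivated", "cultivated", "water"],
    ["cultivated", "cultivated", "urban", "urban"],
    ["cultivated", "cultivated", "urban", "urban"],
]
print(render_map(section))
```

A block wider than it is tall compensates for the fact that line-printer characters are taller than they are wide, so each forty prints as a roughly square patch; this reading of the five-by-three choice is an inference, not something the source states.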
SYSTEM USE

The goal of SPA and CURA in developing MLMIS for potential use has been to provide information to decision makers so that they will make better decisions. As with almost any system, the actual identity of these users is somewhat obscure. SPA hopes to use MLMIS in future state planning efforts as well as more specific studies such as planning of recreation areas. CURA also believes that other state departments such as Natural Resources, Highways and Administration as well as county and regional planners will find the system useful (Kozar and Orning, 1972). To the present, actual use of MLMIS data has been infrequent. Costs of obtaining Xerox copies of township maps are quite low, but only two variables are available, and analysis of the mapped data must be done by the user. The Minnesota Department of Highways, the Bureau of Engineering of the Department of Natural Resources, and the Rochester-Olmsted Council of Governments have used the township data. In addition, 3,500 copies of the State Land Use Map and 700 copies of the 1972 Report on Northwest Minnesota have been distributed by SPA (Yaeger, 1973b). More significantly, however, one of the most requested products of MLMIS has been the aerial photography. The following uses of these photographs were listed in the Northwest Minnesota Report:

1. The Minnesota Highway Department, for updating county highway maps.
2. The Department of Natural Resources:
   a. Bureau of Planning, for studying the St. Croix Wild River, and for locating recreational areas.
   b. Division of Game and Fish, for inventory work in wetlands and for field reconnaissance in watershed investigation.
   c. Division of Waters, Soils and Minerals, which uses the transparencies of the 7.5-minute equivalents in the administration of shoreland and floodplain zoning programmes.
3. The Department of Economic Development, which is studying the use of photograph enlargements for planning purposes.
4. The Soil Science Department, University of Minnesota, in soil mapping programmes. The photographs are used for: delineating topographic areas; locating roads and trails; determining soil group boundaries; delineating mines, dumps and cultural features that would not have been found on previous leaf-on photographs.

Fig. 50. Minn-Map township (red overlay).

5. The United States Geological Survey, which uses the 7.5-minute quadrangle equivalents (photomaps), diapositives and contact prints for interim revision of its quadrangle maps. The photographs are also used for USGS's new 7.5-minute orthophoto maps. (Orning and Sizar, 1972, pp. 65-66).

Current information indicates that more than 65,000 individual photographs have been purchased by users from SPA. As MLMIS developers update the data bank and develop the ability to analyse the data, the system will probably become more widely known and used.

FUTURE DEVELOPMENT PLANS

SPA and CURA are continuing to develop MLMIS incrementally through specific projects. The work plan for the 1973-75 biennium includes studies of land use priorities, preparation of coding manuals, special projects, analysis of institutions and development of CRT software for data entry (MLMIS Study, Work Plan). Land use priorities will be studied in Itasca County and the Arrowhead Development Region as well as state-wide. Coding manuals containing standards for data collection and storage will be produced for land cover and land use, soils and public land ownership records, as well as a general geocoding manual concerning points, lines and areas. Special projects will include studies of legal controls, agency powers and permits and licenses affecting land use. These studies will be linked with the analysis of institutional controls and their relationship to MLMIS. Land use data will be updated in accordance with a sampling procedure, perhaps based on permits and licenses. This update will use software developed for an interactive data entry system. Continued use of the pilot study technique of system development has had both positive and negative results. The system has been able to obtain funds year by year, and several beneficial products have been produced, but there evidently has been no well-documented effort at total system design.
Only broad, long-range goals and immediate objectives have been documented. There appears to be no systematic overall plan for development which would relate the immediate objectives to the long-range goals. The overall goal of MLMIS is to make data available to decision makers. The organizational environment of MLMIS has in large part been responsible for the degree to which MLMIS has developed toward its objectives. Development of MLMIS through pilot studies as an integrated part of the work of the Planning Agency and CURA has had several positive results: aerial photography has been useful to several agencies, and the data bank has seen some use. However, the system is still vulnerable to a loss of funding. There is no specific authorization to operate MLMIS. It is funded through CURA and Planning Agency budgets and depends on their personnel and programme orientation (Yaeger, 1973a). Yaeger has estimated the total cost of system development thus far to be in excess of $400,000. Benefits to users (decision makers) are currently limited by the fact that only two variables are available, land use and water orientation. A current pilot study in Itasca County in northern Minnesota is being used to experiment with adding variables and to develop software for data manipulation. However, rather than expanding their central data bank, MLMIS developers' primary goal for work during the 1973-75 biennium is to 'structure and organize data collected by public agencies' (Kozar and Orning, 1972, p. 4). System viability may increase if its operation is transferred to the DOA, which would provide a more solid legal basis for continuation. However, interest in the system may be higher at SPA than at DOA, and a transfer might be a setback. In the meantime, although the lack of a system design does not conform to rational planning methods, it does allow the developers to be more flexible and open to changes in development activities which may keep the system viable.

BIBLIOGRAPHY

Arrowhead Regional Development Commission, 1973, 'ARDC Information Systems Development, January-June, 1973'.

Borchert, John R. and Donald D. Carroll, 1970, Minnesota Settlement and Land Use, 1985, prepared for the Minnesota State Planning Agency, based on initial work done in a graduate seminar conducted at the University of Minnesota, Minneapolis.

Brown, Dwight, 1973, 'Type 1 Progress Report for ERTS Project No. SR-2831.

Center for Urban and Regional Affairs (CURA), 1973, 'CURA, '71-'73', University of Minnesota, Minneapolis.

Center for Urban and Regional Affairs and State Planning Agency (CURA and MSPA), 1972, 'Minnesota Land Use, The Minnesota Land Management Information System', University of Minnesota, Minneapolis, and the Minnesota State Planning Agency, St. Paul.

'Governor's Environmental Message, Land Use Planning', no date.

Kozar, Kenneth, 1973, 'Pilot Resource Management System', memo dated March 8, 1973, on CURA stationery.

Kozar, Kenneth, and George Orning, 1972, 'Anticipated uses of the MLMIS data structure and projected goals for 1973-1975', memo dated November 13, 1972, on CURA stationery.

Minnesota State Department of Administration, Information Systems Division (MSDOA), 1971, 'The Minnesota Land Information System', Information Systems, Minnesota's Natural Resources, Newsletter 2, March 1971, St. Paul, Minnesota.

Minnesota State Planning Agency (MSPA), 1968, 'A State Land Inventory'.

MSPA, 1972a, 'Land Management Information Report Completed', news release, September 29.

MSPA, 1972b, 'Proposed Land Use Planning Program', with attached chronological review of the land use activities of the State Planning Agency from 1968 to 1972.

MLMIS Study, 'Work Plan - July 1973 to June 1975'.

Orning, George W. and Les Maki, 1972, Land Management Information in Northwest Minnesota, the Beginning of a Statewide System, Report No. 1, Minnesota Land Management Information System Study, prepared for the Minnesota State Planning Agency and the Upper Great Lakes Regional Commission, Center for Urban and Regional Affairs, University of Minnesota, Minneapolis.

'Outline of MLMIS Data Entry System', no date, no author.

Regional Development Act, State of Minnesota (As Amended by Chapters 153 & 174 Laws, 1971).

U.S. Department of the Interior, Bureau of Land Management (USDI), 1947, Manual of Surveying Instructions, U.S. Government Printing Office.

Yaeger, Donald, 1973a (May), interview by Hugh W. Calkins and presentation at the University of Washington, Seattle.

Yaeger, Donald, 1973b (Aug.), letter to Hugh W. Calkins including list of users of township maps.


7

The Land Use and Natural Resources Inventory of New York State (LUNR)

GENERAL DESCRIPTION

The Land Use and Natural Resources Inventory is the computerized record of an aerial survey of New York State's land resources. Supported by some retrieval, analysis and display computer programs developed for the inventory, it represents a relatively well established state-wide land information system. The LUNR system at present contains 130 categories of land use data and four categories of supplemental data for the entire state. The data were manually interpreted, then converted to machine records and stored as a sequence of descriptors for square cells, 1 km² (0.4 sq. mile) in area. Two sets of computer programs are available for retrieval and analysis of the stored data. DATALIST I produces tabular summaries of raw data or performs limited analyses. PLANMAP II and IV have more sophisticated analytical capability and produce computer graphic maps.

OBJECTIVES AND ORGANIZATIONAL ENVIRONMENT

LUNR was undertaken because of a general feeling in state government that a consistent inventory of the state's natural resources was needed, rather than because it was required to achieve any specific objectives. In 1966, Governor Rockefeller stated that a natural resources inventory would be conducted. The State Office of Planning Coordination was assigned that task and decided to obtain information on land use as well as natural resources, for its own purposes. The system thus originated to fulfil some unidentified needs felt to exist in the state rather than for a specific purpose. The LUNR inventory was developed for the New York State Office of Planning Coordination (now the Office of Planning Services, OPS) by the Center for Aerial Photographic Studies (CAPS) at Cornell University. CAPS had general responsibility for system design, development and operation during the inventory process and for some time thereafter (CAPS, 1970-71). More recently, CAPS has become disengaged from the project and OPS has adopted it as its capabilities have grown.
The data bank is available for general use through the LUNR Users Service at Cornell University. Operation of that service at present constitutes Cornell's only involvement in development activities for the state's land information system. Present plans call for the creation of a new data base and a set of computer programs for the state. This project will be undertaken by OPS under the guidance of a state interagency committee (Guinn et al., 1973).

HISTORIC DEVELOPMENT

The LUNR inventory grew out of a pledge from Governor Rockefeller in 1966 to provide an inventory of the state's natural resources. Responsibility for this inventory was first assigned to the Conservation Commission, but was eventually given to the state's Office of Planning Services, successor to the Office of Planning Coordination (OPC). At that time the Center for Aerial Photographic Studies (CAPS) at Cornell University had completed an inventory of the Delmarva Peninsula by aerial photographic interpretation. The data derived from this project had been coded and stored in a computer by use of


the SYMAP program developed at Harvard's Laboratory for Computer Graphics. OPC contracted with CAPS to develop LUNR. Before developing LUNR, CAPS had gained considerable experience with gathering and analysing land-related information for large areas. Development of LUNR was viewed 'as an opportunity to develop techniques which could be applied to similar projects and problems regardless of their geographic location' (Shelton and Hardy, 1971, p. 3). The general system was intended to be independent of CAPS' expertise and workable with commonly available techniques and equipment. It was not conceived of as the ultimate geographic information system, but 'of immediate operational use in the period ... before more sophisticated techniques become available on a practical basis' (Shelton and Hardy, 1971, p. 3). Variations of the basic system developed have been used in the Hudson River Valley, Puerto Rico, El Salvador and Colorado (Shelton and Hardy, 1971, p. ii). The OPC first contracted with CAPS to perform a five-county pilot study. The purpose was to develop a classification system and to do some computer programming. The area surrounding Syracuse, New York, was selected for the pilot study because a number of independent land use inventories were being conducted there, and accuracy could be checked (Belcher, 1972, p. 15). Initial plans called for the use of SYMAP to store and retrieve the data collected. In the course of the pilot study it was determined that this program package could not efficiently process the large volume of data produced, so a new set of programs was devised (Belcher, 1972, p. 6). The earliest plans called for a classification system with only six categories: water, woods, farming, urban, and commercial-industrial. After representatives from state agencies, the universities and planning organizations had been interviewed, the initial list was expanded to 130 items (Belcher, 1972, pp. 4-5). This classification system is described below.
The LUNR inventory was conducted by CAPS between 1967 and 1970. It required about 2 1/2 years and $500,000, which breaks down to a cost of about $4 per km² ($10 per sq mile). Although the inventory was a state project, the federal government paid about 70% of the estimated cost with HUD and Appalachia funds (Shelton and Hardy, 1971, p. 18). When the inventory was completed, a LUNR Users Service was established at Cornell University. This organization serves the general public and local planning and government organizations. A separate data file and set of programs have been maintained at the Office of Planning Services in Albany to serve state planning projects.

DETAILED FUNCTIONAL DESCRIPTION

Data acquisition

Aerial photography. About 85% of the data in the LUNR system was interpreted from stereographic panchromatic aerial photographs taken in the spring of 1967 (about 10%) and the spring of 1968. New York City and Long Island were photographed in 1969 and 1970. The photography was flown at 1:24,000 to correspond directly with U.S. Geological Survey 7.5-minute quadrangles. The entire state has been mapped at this scale, so the USGS map series was used for base mapping. About 17,500 prints measuring 23 x 23 cm (9 x 9 in.) were made. They have been combined into 129 photo-mosaic indexes. Prints of the indexes, the original photographs and enlargements can be purchased through OPS or directly from the contractor who flew the photography.

Field checking. The remaining 15% of the data in the inventory, the secondary data, was apparently obtained in the course of field checking. During this process county planners, civil defense representatives, highway superintendents and public health officers in each county were interviewed. In this way data from existing inventories and future plans were obtained (Belcher, 1972, p. 6). 'This and ... other supplemental information, such as the level of highway access, were also added to the coding form for each cell' (Crowder, p. 7).
An accuracy of 90 to 95% is claimed for the data derived in this manner (Shelton and Hardy, 1971, p. 6).

Other mapped data. Another class of supplementary data has been developed for incorporation into the LUNR system. It consists of four general maps which have been geographically referenced to be compatible with the 1-km² (0.4 sq mile) grid system. The maps are:

1. The General Soil Map of New York State. This data set was derived from a map prepared by soil scientists of the U.S. Department of Agriculture. The map is quite detailed, in that 214 of the state's total of 258 soil series are represented. Resolution, however, is limited since no units of less than 1.2 km² (300 ac) are shown. A single LUNR inventory cell occupies 1 km² (247.1 ac). A single value, a mapping unit of the 'dominant' soil type, was recorded for each cell except where subdominant soils occurred frequently in an area but were masked by larger proportions of dominant soils in each cell. When this situation occurred, a representative number of cells containing the dominant soil would be coded as the subdominant type. 'Dominance is determined by a soil series relative position in the naming sequence for a soils association.' (Crowder, p. 7).

2. The Geologic Map of New York summarizes data for the state. To simplify data coding and use, the 152 bedrock types were compressed into 11 map units for inclusion in the LUNR inventory. These 11 categories have been interpreted as to their basic characteristics and capabilities. 'By using the appropriate combination of data code and data value, any of the (original) map units can (also) be retrieved.' (Crowder, p. 7). As with the soil map, only the dominant rock type for each cell was recorded.

3. A map of Economic Viability of Farm Areas was also adapted for computer storage in the LUNR inventory. This map was prepared on the basis of interpretation of soils, topographic, climatic, and water resources data and included consideration of locational, market, access roads and other social variables. One of three levels of viability (high, medium or low) was derived from these data and used to characterize parts of the state.

4. Landforms and depth to bedrock, indicated by a combined mapping symbol, were included as supplemental data for 15 quadrangles, which were chosen because of conditions expected to constrain development or because of expected development pressures. Data values were recorded as percentages of the kilometre grid cell (Crowder, p. 9).

Data classification. The LUNR inventory classification system was devised after consultation with a variety of potential clients including representatives from state agencies and planners from counties and larger areas (Belcher, 1972, p. 5). About 130 categories are currently included, and the storage format allows space for an additional 200 data items for each cell. The classification system consists of ten major categories: agriculture, forest land, water resources, residential, commercial and industrial, outdoor recreation, extractive industry, public and semi-public, transportation, and nonproductive land. These categories are further divided into more detailed area or point subcategories. There are 51 area and 68 point subcategories. Definitions of these categories are reproduced in Appendix 4.

Area categories and subcategories are coded as a combination of capital and small letters, except for outdoor recreation, where a numeral follows the designation 'OR', and public and semi-public, where a capital letter is followed by a numeral. Point (and linear) categories are coded as small letters only or as small letters followed by numerals or non-numerical signs. Symbols consisting of small letters and non-numerical signs indicate that the data have been stored as a numerical count or total length per cell. Letters with numerals indicate only that this category is present in a particular cell (LUNR Classification Manual, 1972, p. 4). These symbols were used as the mapping codes during interpretation and overlay preparation and are used to identify categories desired for computer retrieval and analysis.

The system was developed specifically to meet the needs of interested state agencies and was not intended to be comprehensive or exhaustive. The cell nature of the system allows new descriptors to be added as desired, but none beyond the four supplemental categories described earlier have been added since the system was developed.

Data input

Map preparation.
A grid system with 1-km² (0.4 sq mile) cells was devised and related to the USGS quadrangles. The resolution of the system was thus limited to 1 km². This was recognized as too gross for urban planning, but because rural areas were to be the main subjects of study, it was felt to be adequate. Also, funding was limited and the number of cells required rises geometrically and inversely with the cell size selected (Belcher, 1972, p. 4). About 140,000 cells were required to cover the entire state. Each is uniquely numbered with x, y co-ordinates, the numbering system originating in the southwestern corner of the state.

The photography was manually interpreted at CAPS, mainly by graduate students or spouses without previous experience with aerial photographic interpretation. Data interpreted from the photographs were transferred manually to transparent 7.5-minute quadrangle overlays. Three types of overlays were produced for each quadrangle.

1. Area land use overlay - Polygons of particular uses were outlined and areas estimated by placing a hectare grid over each cell and counting the number of hectare cells in which a particular use predominated. These counts were then used to estimate percentages of land uses for each cell (Fig. 51). Both land uses (human) and natural resource characteristics (for example, natural lakes, forest land) were noted, but only one characteristic was assigned to a particular polygon. The smallest unit recognized during interpretation was 0.4 ha (1 ac) (Crowder, p. 5).

2. Point land use overlay - This consists of both point or small-area features such as underground mining or campgrounds, and linear features such as roads or streams. Point features were tallied by category. Total lengths of streams were measured and likewise tallied. Many were traced on the overlay but some were not (Fig. 52) (Crowder, p. 3).

3. Compilation overlay - This would show minor civil divisions such as township lines, county lines, villages, and it would carry road classifications. The length (miles) of each within a cell would be recorded. 'Our practice was to record the length of roads, streams, and shorelines in a cell.' (Belcher, 1972, p. 6).

Conversion to machine records. The overlays were edited and partly field checked before data were converted to machine records. Data from the overlays were then summarized on a coding sheet, which listed the percentage of each cell in particular land uses, the numbers (or indications of presence or absence) of point items, and the total length of linear features. One coding sheet was used for each cell. These data were then punched on to EDP cards, of which three were required per cell. Each cell was coded to reflect its location by county, minor divisions and watershed, so that the data could be accessed automatically for these units.
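The hectare-grid counting step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the record layout and the codes `USE_A` and `USE_B` are hypothetical, and only `FN` (natural forest) is a code mentioned in the text. A 1-km² cell is taken to hold 100 one-hectare squares, each assigned the use that predominates in it.

```python
from collections import Counter

HECTARES_PER_CELL = 100  # a 1-km2 cell contains 10 x 10 one-hectare squares


def land_use_percentages(hectare_codes):
    """Percentage of a cell in each land use, from per-hectare dominant codes."""
    if len(hectare_codes) != HECTARES_PER_CELL:
        raise ValueError("expected one dominant code per hectare square")
    counts = Counter(hectare_codes)
    return {code: 100.0 * n / HECTARES_PER_CELL for code, n in counts.items()}


def coding_sheet(x, y, hectare_codes, point_counts, linear_lengths):
    """Bundle one cell's values as a coding-sheet record (layout is assumed)."""
    return {
        "cell": (x, y),
        "area_percent": land_use_percentages(hectare_codes),
        "points": dict(point_counts),
        "lengths": dict(linear_lengths),
    }


# Illustrative cell: 57 hectares of natural forest (FN) plus two placeholder uses.
codes = ["FN"] * 57 + ["USE_A"] * 40 + ["USE_B"] * 3
sheet = coding_sheet(412, 712, codes, {"point_item": 2}, {"streams": 1.5})
print(sheet["area_percent"]["FN"])
```

Because every hectare square carries exactly one dominant code, the percentages for a cell always sum to 100, which is a convenient check on a coded sheet.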
Figure 53 shows a summary of the data acquisition and encoding procedures.

Data storage

LUNR data are stored on two IBM 2316 direct-access disk packs. One disk contains the land use data obtained from aerial photographic interpretation and field checking, and the other the four categories of supplemental data described earlier. This second data set became operational early in 1974. Storage capacity for an additional 200 items per cell is available now and could be expanded to accommodate several thousand more (Crowder, p. 6). Disk storage is felt to have made storage and retrieval of the large volumes of data produced by the LUNR inventory practical. 'Data for a cell could be read on to the storage disk by means of any one of several data input programs which require only that the card be punched in the format for that program and contain the appropriate cell coordinate values. This made each card completely independent of other cards. Cards were read on to the disk in random order, as they became available from the data processing operation. Retrieval of data proceeds along the same line: cell coordinates are specified for any area for which data sets are to be retrieved, and those data are retrieved directly from disk (without searching through all the cell data records).' (Shelton and Hardy, 1971, pp. 10-11).

Data retrieval, analysis and output

Two general types of retrieval and analysis programs, DATALIST and PLANMAP, were developed for the LUNR inventory. They were specifically designed to provide easy access to the data, so that an unskilled user would be able to retrieve desired information and perform relatively sophisticated analyses without the assistance of a programmer or a knowledge of programming. Both types of program employ the same land use categories and coding titles. DATALIST retrieves raw data in input format and can be used to perform simple arithmetical operations. Output is a tabular summary. PLANMAP II and IV can perform logical as well as mathematical operations on raw data and produce computer graphic maps as output.

DATALIST I. This program can be used to extract raw data or to make summaries of single or multiple point, area or supplemental data (for example, the area of forest land) for selected areas. The study area can be selected in one of two ways.

1. The grid cell co-ordinates defining the perimeter can be given. It is only necessary to identify the cells within which polygon vertices are located. The program will interpolate between designated cells where gaps exist. A maximum of 40 cells can be specified to describe a study area boundary (Fig. 54).

2. If an entire county, minor civil division or watershed is to be studied, a numerical code can be used and appropriate cells are automatically determined and accessed.

Fig. 51. Area land use overlay.

Fig. 52. Point (and linear) land use overlay.

Fig. 53. Data acquisition and encoding procedures. Flow chart boxes: receive aerial photographs; sort photographs according to USGS quadrangle maps; aerial photographic interpretation; field check aerial photographic interpretation; draft LUNR overlays (product: LUNR overlays of USGS 7.5-minute quadrangles); gather supplemental data; code data to Universal Transverse Mercator grid; key punch coded data on to EDP cards; transfer data to disk pack (product: data suitable for computer manipulation and display).

Fig. 54. Description of boundary by grid cell co-ordinates. Co-ordinates: (412,712), (414,711), (415,710), (417,709), (415,709), (411,705), (411,703), (409,703), (409,705), (405,705), (405,712), (409,712), (409,710), (412,710).
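The boundary description in Fig. 54 can be illustrated with a short sketch. The text says only that the program interpolates between the designated vertex cells; the even-stepping rule used here is an assumption, not the documented LUNR method, and the function names are hypothetical. The vertex list is the one given in Fig. 54.

```python
def cells_between(a, b):
    """Cells on a straight run from cell a to cell b, both ends included.

    Assumed rule: take max(|dx|, |dy|) equal steps and round each
    intermediate position to the nearest cell (integer arithmetic only).
    """
    (x0, y0), (x1, y1) = a, b
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [a]
    cells = []
    for i in range(steps + 1):
        # floor(d * i / steps + 0.5) written with integer floor division
        x = x0 + (2 * dx * i + steps) // (2 * steps)
        y = y0 + (2 * dy * i + steps) // (2 * steps)
        cells.append((x, y))
    return cells


def boundary_cells(vertices):
    """Cells crossed by the closed boundary through the vertex cells, in order."""
    ring, seen = [], set()
    closed = list(vertices) + [vertices[0]]
    for a, b in zip(closed, closed[1:]):
        for cell in cells_between(a, b):
            if cell not in seen:
                seen.add(cell)
                ring.append(cell)
    return ring


# Vertex cells as listed in Fig. 54.
FIG_54 = [(412, 712), (414, 711), (415, 710), (417, 709), (415, 709),
          (411, 705), (411, 703), (409, 703), (409, 705), (405, 705),
          (405, 712), (409, 712), (409, 710), (412, 710)]

ring = boundary_cells(FIG_54)
print(len(ring), "boundary cells from", len(FIG_54), "vertex cells")
```

Only the vertex cells need to be supplied by the user; the cells in the gaps between them are filled in by the stepping rule.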

DATALIST can deal with up to 10 'expressions' per query. Each expression represents retrieval of a value for some data category (for example, area of pasture), or 'a mathematical expression consisting of one or more (data category values) and valid arithmetical or relational operators and constants' (User Manual, 1971, p. 18). Addition, subtraction, multiplication and division are valid 'operators'. Two logical operators (and, or) and six comparative operators (not equal, equal, less than, greater than, less than or equal, greater than or equal) may also be used. These are useful for determining the presence or absence of some data category value or combination of values. DATALIST provides tabular summaries of raw or analysed data (Fig. 55). Output is by cell, and up to ten data categories can be represented per query. Values are printed as derived except where conditional operators are used, when '0' indicates absence and '2' indicates presence of the condition described in the query (User Manual, 1971, p. 19). DATALIST summaries are particularly useful as statistical and general reference tools and are also a relatively inexpensive means of identifying cells that exhibit desired characteristics, for which further analysis with PLANMAP may be desirable. Its main disadvantage relative to PLANMAP is its inability to recognize patterns or to identify features that may be adjacent to one another (Crowder, p. 8).

PLANMAP II and IV. These programs are capable of more complicated forms of analysis than DATALIST. Basically, PLANMAP programs produce line-printer maps which show up to ten visual density

Fig. 55. DATALIST tabular summary of data in selected categories. Columns: cell co-ordinates (UTM E, UTM N), followed by the values for (1) Active Agriculture, (2) Inactive Agriculture, (3) Brushland, (4) Natural Forest, (5) Forest Plantation, (6) Streams & Lakes and (7) Wetlands.

levels produced by character overprinting (Fig. 56). Each cell is assigned a particular density level based on criteria specified during programming. A PLANMAP II program consists of five required and six optional statements. Their capabilities will be discussed within the context of a description of the steps involved in running a PLANMAP program. The first two statements may be used to delineate the study area. If an entire county, minor civil division or watershed is to be studied, its name is stated and the appropriate cells are automatically accessed using the MAP statement. Other areas can be defined by stating a maximum of 14 pairs of cell co-ordinates to describe its perimeters in a co-ordinates statement. The program will then interpolate a boundary on a 'line-of-sight' basis, and data for the total area of any cell crossed by this line will be included (Fig. 54). If 14 pairs of co-ordinates are not sufficient to describe the boundary, the map can be produced by preparing more than one program. The next statement in a PLANMAP II program is the WEIGHTS statement, which is mandatory because the PLANMAP II program prints maps only of weight values, not of the data values themselves. The analytical usefulness of the WEIGHTS statement derives from the user's ability to assign different levels of importance, or weights, to different parts of the ranges of data values. The user can form composite variables from successive WEIGHTS statements; up to 50 can be used in a single PLANMAP II program. However, the weights produced by these WEIGHTS statements will be combined to produce only one set of final weighted values. An optional statement which may follow WEIGHTS is the EXCLUDE statement or statements. They are used to withhold from the analysis cells that have specified data values. Cells can be withheld if their data values are greater than a specified value, less than a specified value or greater than one and less than another. 
Up to 50 EXCLUDE statements can be included in a single PLANMAP II program. Once weights and exclusions are specified, the PLANMAP II program divides the range of the final weighted values into up to ten density levels. Unless otherwise specified, these levels will represent 0-9%, 10-19%, ... 90-99% densities. (The percentage refers to the percentage of the cell having the weighted characteristics.) A smaller number of density levels can be selected if desired, but the entire range of possible values must be covered in a WEIGHTS statement. It should be noted that exact data values are lost by splitting the range into density levels. For example, a cell containing 19% natural ponds and lakes would be represented as containing 10-19%. Exact values can only be retrieved by DATALIST.

The remaining statements in a PLANMAP II program provide control over features of the line-printer maps. WRITE TITLES is used to add map titles and legends. NEW NUMBER OF LEVELS IS can be used to reduce the number of density levels shown to fewer than ten. NEW CHARACTERS ARE can be used to heighten the contrast of density where fewer than ten levels are desired. SUPRESS LEVEL INDICATORS can be used to eliminate the symbol that would otherwise be printed in the centre of the group of symbols used to represent each cell in a printout. NEW CELL SIZE IS CHARACTERS HIGH BY CHARACTERS WIDE can be used to change the represented cell size on output maps. In order to represent 1-km² cells as square on printouts, PLANMAP represents cells as three characters high by five characters wide unless this option is used. The normal printed cell size represents an approximate scale of 1:78,700 (1 in. = 2 km). A scale of 1:24,000 (1 in. = 2,000 ft) can be approximated by specifying a printout five characters high by eight characters wide. Also, maps may be reduced in size by, for example, using only a single overprinted character per cell. This will, of course, distort the represented dimensions of a map by lengthening the vertical axis. PLANMAP II also offers an ATLAS option which is used to produce maps showing the predominant land use in each cell from a choice of ten possible uses. HALT statements are used to indicate the end of statements for each map, because more than one map query may be submitted at a time. FINISH signals the end of a mapping run.

Two examples of user queries for PLANMAP II programs are shown in Table 16. The first set uses only mandatory statements to produce a map of natural forest cover (FN) in the specified area. The second set uses all optional as well as mandatory statements. Optional statements are identified by an asterisk, and mandatory statements by a double asterisk. Figure 56 shows a reduced sample PLANMAP output with ten density classes. Each 1-km² cell is represented by a single set of density classes, three characters high by five characters wide. A numeral corresponding to the density class is printed in row 3, column 2, to locate individual cells.

The main advantage claimed for PLANMAP II over DATALIST is its ability to present visually the patterns and spatial relationships of variables. Also, the program is able to sort through a large number of cells, and to determine and display patterns of cells that exhibit specified combinations of characteristics. This is especially useful for studies to locate potential sites.
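The splitting of weighted values into density levels, and the printing of each cell as a small block of characters, can be sketched as follows. This is an illustration only: the symbol set stands in for the overprinted line-printer characters, the function names are hypothetical, and the level indicator is placed at the centre of the block as described for SUPRESS LEVEL INDICATORS.

```python
SYMBOLS = " .:-=+*#%@"  # hypothetical stand-ins for ten overprinted densities


def density_level(percent, n_levels=10):
    """Map a weighted percentage (0-100) to a density level 0..n_levels-1."""
    width = 100.0 / n_levels
    return min(int(percent // width), n_levels - 1)


def render_cells(levels, high=3, wide=5, indicators=True):
    """Return the print lines for one row of cells, each cell high x wide."""
    rows = []
    for r in range(high):
        line = ""
        for level in levels:
            block = list(SYMBOLS[level] * wide)
            if indicators and r == high // 2:
                block[wide // 2] = str(level)  # level numeral at block centre
            line += "".join(block)
        rows.append(line)
    return rows


# A cell with 19% of the weighted characteristic falls in the 10-19% level.
print(density_level(19))
for line in render_cells([density_level(p) for p in (5, 19, 64, 99)]):
    print(line)
```

As the text notes, the exact value (19%) is not recoverable from the printed level; only the class it falls in is shown.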

Fig. 56. Sample PLANMAP line-printer output with ten density classes (reduced).

Fig. 68. Calspan (DIGIMAP) encoding procedure. Final flow chart box: automatically position scanner at the next map section.

RESULTS OF THE EXPERIMENT

Accuracy of tract area measurement

To assess the accuracy of tract area measurements computed by each system, the system-computed areas were compared with areas measured manually using a planimeter¹. Three main approaches were used to compare the system-computed with the planimetered areas: descriptive statistics, regression equations, and percentage deviations from planimetered values.

Regression. Regression equations were computed for each system using the planimetered tract areas as the dependent variable. Had the system-computed and planimetered measurements exactly agreed, the resulting equation would have been: planimetered area = system-computed area, with a coefficient of 1.00 and a zero constant term. The actual equations were very close to this, with the smallest coefficient being 0.90 and the largest constant, 22.3 (Table 18). The Calspan system had the best coefficient, 1.00, and CGIS had the lowest constant term, 0.53 ha (1.3 ac). The relatively high constant term in the MLMIS equation is probably the result of the size of the grid cell used in that system.

Table 18.

System to planimeter regressions.

Regression equation                         R²
Planimeter = 0.96 (ORRMIS)  +  6.5          0.975
Planimeter = 0.93 (CGIS)    +  1.3          0.998*
Planimeter = 0.90 (PIOS)    +  4.5          0.997
Planimeter = 0.93 (MLMIS)   + 22.3          0.969
Planimeter = 1.00 (Calspan) +  7.2          0.996
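The regressions in Table 18 have the form planimetered area = coefficient x system-computed area + constant. A minimal sketch of such a fit by ordinary least squares is given below; the source does not state how the equations were computed, so the method and the function name are assumptions. The sample values are the first eight tracts (34.00 to 41.00) of the planimeter and Calspan columns of Table 19, so the result will not reproduce the full 49-tract equation.

```python
def fit_line(system, planimeter):
    """Least-squares fit of planimeter = slope * system + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(system)
    mean_x = sum(system) / n
    mean_y = sum(planimeter) / n
    sxx = sum((x - mean_x) ** 2 for x in system)
    syy = sum((y - mean_y) ** 2 for y in planimeter)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(system, planimeter))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Areas in acres for tracts 34.00 to 41.00 (Table 19).
planimeter = [319, 143, 132, 191, 128, 112, 175, 96]
calspan = [332, 135, 147, 188, 139, 116, 144, 90]

slope, intercept, r2 = fit_line(calspan, planimeter)
print(f"Planimeter = {slope:.2f} (Calspan) + {intercept:.1f}, R2 = {r2:.3f}")
```

If a system agreed exactly with the planimeter, the fit would return a slope of 1, an intercept of 0 and an R² of 1, which is the ideal case described in the text.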

*The area calculations provided by CGIS were modified by a factor of 1.076 to compensate for changing the latitude of the maps. In the opinion of CGIS, this should not have been done. The reciprocal of 1.076 is 0.929. All data, charts and graphs presented in this chapter reflect the 1.076 factor.

Descriptive statistics. Descriptive statistics were calculated for each system's computed areas and for the planimetered tract areas (Tables 19 and 20). Several conclusions can be drawn from these statistics. There was a range of 27.5 ha (67.9 ac) between the maximum mean tract area (PIOS: 181.2 ha, or 447.3 ac) and the minimum mean tract area (Calspan: 153.7 ha, or 379.4 ac). This range was rather large: 18% of the minimum and 15% of the maximum mean tract area, whereas a range of zero would be expected if all systems were equally accurate. There was also a range of 16.6 ha (41.0 ac) between the minimum and maximum median tract area. This range was much larger than zero, which would be expected if the systems were all equally accurate. Median tract areas were in all cases smaller than the mean, because most tracts were comparatively small. However, there was quite a large range between the smallest and largest tracts. Estimated areas for the largest tract, 214.03, ranged from 745 ha (1840 ac) (MLMIS) to 841 ha (2078 ac) (PIOS), a difference of 96 ha (238 ac), which was 13% of the MLMIS estimate or 11% of the PIOS estimate. The tracts computed to have the smallest area were tracts 35.02 and 40.01 by ORRMIS measurements and tracts 41 and 111 by the other systems and the planimeter measurements (Table 21). The differences between the estimated areas of these tracts were quite large, ranging from 59 to 103% of the minimum estimates and from 37 to 58% of the maximum estimates.

Because there was so much variation between systems in the minimum and maximum tract area measurements, there was also variation in the range and standard deviations computed for each system. The difference in the range was 88.7 ha (219 ac), or 11 to 12% of the maximum or minimum value.

¹ The planimetered tract areas are assumed to be accurate for this analysis. It is recognized that such manual methods are subject to error, although each area was measured repeatedly until three measurements were very close.
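The comparisons in this passage all follow one pattern: take the spread between the smallest and largest estimate and express it as a percentage of each. A small sketch of that arithmetic, using mean tract areas quoted in the text and Table 20 and the conversion factor from the table footnotes, is shown below; the function name is hypothetical.

```python
ACRE_TO_HA = 0.405  # conversion factor quoted in the table footnotes


def spread(values):
    """Return (range, range as % of minimum, range as % of maximum)."""
    lo, hi = min(values), max(values)
    rng = hi - lo
    return rng, 100.0 * rng / lo, 100.0 * rng / hi


# Mean tract areas in acres: CGIS, PIOS, MLMIS, Calspan.
means = [436.1, 447.3, 409.8, 379.4]
rng, pct_of_min, pct_of_max = spread(means)
print(round(rng, 1), "ac =", round(rng * ACRE_TO_HA, 1), "ha")
print(round(pct_of_min), "% of minimum,", round(pct_of_max), "% of maximum")

# Largest tract (214.03): MLMIS and PIOS estimates in acres.
print(spread([1840, 2078]))
```

The same function applied to the four estimates for a single small tract gives the Range, R/min and R/max rows of Table 21.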

Data encoding experiment

Table 19. Computed census tract areas.

Census tract area in acres*

Tract     Plani-   ORRMIS    ORRMIS
number    meter    (40 ac)   (2.5 ac)   CGIS    MLMIS   PIOS    Calspan
 34.00      319      340       315       331     320     340      332
 35.01      143      170       146       148     120     152      135
 35.02      132       85       143       163     120     169      147
 36.00      191      212       172       204     200     206      188
 37.00      128      127       138       150     120     154      139
 40.01      112       85       109       129     200     131      116
 40.02      175      170       164       175     160     181      144
 41.00       96      127       106        98      80      99       90
 42.00      191      127       149       177     200     183      166
 43.00      207      170       231       243     160     251      175
 47.00      399      467       414       431     520     442      387
 48.00      191      212       180       205     120     210      190
 49.00      526      552       562       571     600     588      516
 50.00      654      765       693       712     680     734      654
 53.00      239      212       236       265     200     275      248
101.00      701      807       809       786     760     807      722
102.00      414      340       366       426     360     439      348
103.00      510      595       554       544     560     560      489
104.00      446      467       430       459     360     473      409
105.00      239      255       255       252     240     257      229
106.00      319      212       287       332     320     345      309
107.00      191      212       220       228     240     233      209
108.00      303      340       279       291     240     299      241
109.00      223      127       210       246     240     253      228
110.00      175      255       193       199     200     204      183
111.00       80      127       104        97      80      99       90
112.00       96      127       111       113     120     116      104
113.00      239      255       226       239     160     248      221
114.00      271      297       281       268     240     273      248
115.00      319      297       334       317     320     328      293
116.00      797      680       762       842     920     865      751
117.00      590      680       613       637     600     572      585
118.00      335      382       348       357     360     365      330
119.00      319      297       287       327     240     337      283
120.00      175      127       223       195     160     205      199
121.00      260      297       276       278     240     282      258
122.00     1055     1062      1078      1140    1000    1179     1056
123.00      542      552       526       596     560     616      548
124.00      494      510       528       539     560     559      493
125.00      579      595       651       659     600     677      585
126.00      542      552       531       568     520     584      507
127.00      424      425       415       464     320     475      407
202.00      590      552       603       654     560     676      607
203.00      287      340       295       298     320     306      275
204.00      430      467       441       460     400     456      426
205.01     1084     1189      1151      1187    1320    1231     1097
205.02      456      510       486       504     520     518      465
206.00      797      807       781       859     800     889      795
214.03     1881     1869      1847      2006    1840    2078     1855

*1 ac = 0.405 ha.

Table 20. Statistics describing tract area estimates, in acres*.

                     Plani-   ORRMIS    ORRMIS
                     meter    (40 ac)   (2.5 ac)   CGIS     PIOS     MLMIS    Calspan
Mean                  405.4    416.9               436.1    447.3    409.8    379.4
Median                317.0    323.8               325.5    335.5    298.0    294.5
Standard deviation    319.2    329.4               344.1    355.6    336.1    317.9
Range                1801.0   1784.0    1743.0    1909.0   1979.0   1760.0   1765.0
Minimum                80.0     85.0     104.0      97.0     99.0     80.0     90.0
Maximum              1881.0   1869.0    1847.0    2006.0   2078.0   1840.0   1855.0

*1 ac = 0.405 ha.

Table 21.

Range in minimum tract area measurements (minimum tract areas detected, acres*).

                          Tract no.
System             35.02   40.01   41.00   111.00
Planimeter          132     112      96      80
ORRMIS (40 ac)       85      85     127     127
ORRMIS (2.5 ac)     143     109     106     104
CGIS                163     129      98      97
PIOS                169     131      99      99
MLMIS               120     200      80      80
Calspan             147     116      90      90

Range                84     115      47      47
R/min (%)            99     103      59      59
R/max (%)            50      58      37      37

*1 ac = 0.405 ha.

The difference in the standard deviation was 37.1, which is 10 to 12% of the maximum or minimum values. PIOS estimates seemed to be consistently higher than those of the other systems or the planimetered values, as shown by the mean, median, maximum and minimum areas. PIOS also had a broader range and standard deviation, indicating more variation in tract area than measured by the other systems. The trend for PIOS to make higher estimates was also evidenced in the PIOS regression equation in Table 18, where PIOS had one of the smallest coefficients.

It must again be emphasized that the planimetered values are only assumed to be accurate for the purposes of this experiment. The planimetered values are probably liable to instrumental and operational bias in the same manner as the other techniques. They represent only an artificial zero point against which comparisons are made; any one of the systems could have been assumed to be totally accurate and comparisons could have been made with it. The planimetric approach was adopted for this purpose mainly because it is in widespread use and is a traditional manual method of area measurement against which the computer-aided systems will be intuitively compared by a wide variety of users. The fact that the planimetric approach is only a nominal standard must be borne in mind when considering the comparative comments made above and the ones that follow. The significant measures of most importance to the user are the range and distribution of deviations computed for each approach.

Calspan and MLMIS showed some tendency to underestimate in comparison to the other systems and the planimetered areas. Calspan had the smallest mean and median values. However, MLMIS had the smallest estimates for the minimum- and maximum-sized tracts, but the MLMIS and planimeter estimates were equal for the smallest tract.
MLMIS had the smallest range and Calspan the smallest standard deviation, indicating a more compact distribution of tract areas than measured by the other systems or the planimeter. Percentage deviations from planimetered area. These differences may be further clarified by considering the percentage deviations from the planimetered tract area (Table 22). Percentage deviations were computed by subtracting the planimetered area from each computed area and dividing the result

150

Data encoding experiment

Table 22.

Percentage deviations from planimetered area of census tracts*.

Tract     CGIS     ORRMIS    ORRMIS     MLMIS    PIOS     Calspan
                   (40 ac)   (2.5 ac)
 34.00    .0376    .0658    -.0125     .0031    .0658    .0408
 35.01    .0350    .1888     .0210    -.1608    .0629   -.0559
 35.02    .2348   -.3561     .0833    -.0909    .2803    .1136
 36.00    .0681    .1099    -.0995     .0471    .0785   -.0157
 37.00    .1719   -.0078     .0781    -.0625    .2031    .0859
 40.01    .1518   -.2411    -.0268     .7857    .1696    .0357
 40.02    .0000   -.0286    -.0629    -.0857    .0343   -.1771
 41.00    .0208    .3229     .1042    -.1667    .0313   -.0625
 42.00   -.0733   -.3351    -.2199     .0471   -.0419   -.1309
 43.00    .1739   -.1787     .1159    -.2271    .2126   -.1546
 47.00    .0802    .1704     .0375     .3033    .1078   -.0301
 48.00    .0733    .1099    -.0576    -.3717    .0995   -.0052
 49.00    .0856    .0494     .0684     .1407    .1179   -.0190
 50.00    .0887    .1697     .0596     .0398    .1223    .0000
 53.00    .1088   -.1130    -.0126    -.1632    .1506    .0377
101.00    .1213    .1512     .1541     .0842    .1512    .0300
102.00    .0290   -.1787    -.1159    -.1304    .0604   -.1594
103.00    .0667    .1667     .0863     .0980    .0980   -.0412
104.00    .0291    .0471    -.0359    -.1928    .0605   -.0830
105.00    .0544    .0669     .0669     .0042    .0753   -.0418
106.00    .0408   -.3354    -.1003     .0031    .0815   -.0313
107.00    .1937    .1099     .1513     .2565    .2199    .0942
108.00   -.0396    .1221    -.0792    -.2079   -.0132   -.2046
109.00    .1031   -.4305    -.0583     .0762    .1345    .0224
110.00    .1371    .4571     .1028     .1429    .1657    .0457
111.00    .2125    .5875     .3000     .0000    .2375    .1250
112.00    .1771    .3229     .1563     .2500    .2083    .0833
113.00    .0000    .0669    -.0544    -.3305    .0377   -.0753
114.00   -.0111    .0959     .0369    -.1144    .0074   -.0849
115.00   -.0063   -.0690     .0470     .0031    .0282   -.0815
116.00    .0565   -.1468     .0439     .1543    .0853   -.0577
117.00    .0797    .1525     .0389     .0169   -.0305   -.0085
118.00    .0657    .1403     .0388     .0746    .0896   -.0149
119.00    .0251   -.0690    -.1003    -.2476    .0564   -.1129
120.00    .1143   -.2743     .2743    -.0857    .1714    .1371
121.00    .0692    .1423     .0615    -.0769    .0846   -.0077
122.00    .0806    .0066     .0218    -.0521    .1175    .0009
123.00    .0996    .0185    -.0295     .0332    .1365    .0111
124.00    .0911    .0324     .0688     .1336    .1316   -.0020
125.00    .1382    .0276     .1244     .0363    .1693    .0104
126.00    .0480    .0185    -.0203    -.0406    .0775   -.0646
127.02    .0943    .0024    -.0212    -.2453    .1203   -.0401
202.00    .1085   -.0644     .0220    -.0508    .1458    .0288
203.00    .0383    .1847     .0278     .1150    .0662   -.0418
204.00    .0698    .0860     .0239    -.0698    .0605   -.0093
205.01    .0950    .0969     .0618     .2177    .1356    .0120
205.02    .1053    .1184     .0658     .1404    .1360    .0197
206.00    .0778    .0125    -.0200     .0038    .1154   -.0025
214.03    .0665   -.0064    -.0181    -.0218    .1047   -.0138

*Percentage deviation = (System computed area - Planimetered area) / Planimetered area


Computer handling of geographical data

by the planimetered area. Overall, these deviations ranged from -43% (tract 109, ORRMIS) to +79% (tract 40.01, MLMIS) of the planimetered tract area. Statistics describing the percentage deviations for each system are listed in Table 23.

Table 23.

Statistics describing percentage deviations from planimetered census tract areas.

System             Mean     Median   Standard    Range   Minimum   Maximum
                                     deviation
ORRMIS (40 ac)     0.032    0.053    0.196       1.018   -0.430    0.587
ORRMIS (2.5 ac)    0.027
CGIS               0.079    0.077    0.063       0.308   -0.073    0.235
PIOS               0.107    0.104    0.068       0.322   -0.042    0.280
MLMIS              0.000    0.003    0.188       1.157   -0.372    0.786
Calspan           -0.018   -0.010    0.075       0.342   -0.205    0.137

It is interesting to note that although MLMIS had the largest single percentage deviation, 79%, overall deviations balanced each other, because MLMIS had the mean and median deviations closest to zero, which would be the ideal. PIOS had the mean and median percentage deviations furthest from zero, 11% and 10%, respectively. However, more information about relative accuracy can be gained by considering the measures of dispersion (the range and standard deviation) together with the mean and median, which indicate the balance or centrality of the distribution of percentage deviations. CGIS had the smallest standard deviation and range, ORRMIS had the largest standard deviation, and MLMIS the largest range. Figure 69 indicates the magnitudes of the differences in range. The two grid systems, MLMIS and ORRMIS, had by far the largest ranges, while the other three systems had ranges that were smaller and quite comparable. Seen in this context, the fact that MLMIS had a mean and median percentage deviation close to zero is countered by the large percentage deviations which the system permitted, and the larger means and medians of CGIS and PIOS are offset by their relatively small ranges. However, both the CGIS and PIOS distributions tend to be positive, whereas the Calspan distribution is more closely balanced around zero, has mean and median closer to zero, and is about as compact as CGIS and PIOS, so the Calspan system gives the best results in this context. The comments made earlier concerning the use of the planimetered measures as an artificial zero must be kept firmly in mind.
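The percentage deviation and the summary measures discussed above can be sketched in a few lines of Python. This is only an illustration: the tract areas are hypothetical, the function names are invented for the example, and the use of the sample (rather than population) standard deviation is an assumption, since the report does not say which was used.

```python
from statistics import mean, median, stdev


def percentage_deviation(computed, planimetered):
    """(system computed area - planimetered area) / planimetered area."""
    return (computed - planimetered) / planimetered


def summarize(deviations):
    """Centrality and dispersion measures of the kind listed in Table 23."""
    lo, hi = min(deviations), max(deviations)
    return {
        "mean": mean(deviations),
        "median": median(deviations),
        "std": stdev(deviations),  # sample standard deviation (assumption)
        "range": hi - lo,
        "min": lo,
        "max": hi,
    }


# Hypothetical tract areas in acres; not data from the experiment.
planimetered = [100.0, 200.0, 400.0, 800.0]
computed = [110.0, 190.0, 420.0, 800.0]

devs = [percentage_deviation(c, p) for c, p in zip(computed, planimetered)]
stats = summarize(devs)
```

A system whose deviations balance around zero can still have a wide range, which is why the mean and median are read together with the range and standard deviation.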

Fig. 69. Range in percentage deviation from planimetered tract area (median marked for each system).

The observed deviation for both MLMIS and ORRMIS was evaluated. It was discovered that the staff at Oak Ridge National Laboratory had used the variable grid size option of the system for the input of census tract boundaries, choosing a 17-ha (42-ac) grid cell for the tract boundaries. When the impact on measurement range and deviation was observed, the staff at Oak Ridge reprocessed the tract data using a smaller grid size of 1 ha (2.5 ac). Some results of this second calculation have been entered on Tables 18, 20, 21, 22 and 23 and Figure 69. The revised measurements of land use by tract have been completed but are not included, as no additional conclusions could be drawn from them. However, it must be remembered that the land use measurements for ORRMIS reported here are based on the 17-ha (42-ac) grid cell and not on the 1-ha (2.5-ac) cell.

The percentage deviations from the planimetered tract areas are plotted in Figures 70 to 74¹. Here again, it is evident that the ORRMIS 17-ha (42-ac) and MLMIS distributions are much more diffuse than the others, that CGIS and PIOS distributions are more clustered but tend to be positive, and that the Calspan distribution is more closely clustered and balanced around zero. One can also perceive from these plots that every system tends to allow larger percentage deviations for the smaller tracts than for the larger ones. This tendency becomes even more evident when the absolute values of the percentage deviations are plotted (Figs. 75 to 79).

Land use area measurement

The land use categories and mapping codes used for this experiment have been shown previously (Table 17). Not all the categories listed were present in the 7.5 x 7.5-minute map used. Table 24 lists the total land use in each category as measured by four of the five systems. There is considerable variation in these areas, just as there was in the census tract areas. The range between the minimum and maximum measurements for a land use category varies between 3 and 576 ha (8 and 1,423 ac) and between 12 and 100% of the maximum area measured for a category. The absence of the area of uses coded 42 and 43 for some systems represents coding errors. However, on the source map provided to CGIS, the polygons of land use 43 were identified as land use 42, and the absence of land use 43 is not a coding error. The presence of uses coded 0 and 18 for PIOS represents either coding errors or the absence of data on the source map. (This indicates the need for predigitizing editing, as discussed further in the conclusions.)

Table 24.

Total land use areas for each system, in acres*.

Use code    CGIS    ORRMIS    MLMIS    PIOS    Range    Range/Maximum
   0           0        0        0      689      689    1.00
  11        9546     9400     9360     9848      488    0.50
  12        2693     2521     3440     2017     1423    0.41
  13        2091     1943     1080     2231     1151    0.52
  15        1196     1041     1040     1258      218    0.17
  16        1714     1503     1520     1674      211    0.12
  18           0        0        0      186      186    1.00
  19         766      718      640      617      149    0.19
  21          58       77      160       72      102    0.64
  23          27       24       40       32       16    0.40
  41         226      214      160      244       84    0.34
  42         124       66        0        0      124    1.00
  43           0       50      160      126      126    1.00
  54        2816     2803     2360     2868      508    0.18
  61          64       40       40       12       52    0.81
  62          44       40       40       48        8    0.16

*1 ac = 0.405 ha.

¹ The program used to plot these scattergrams overprints when values are equal, so each tract does not have a corresponding unique point.

Accuracy of land use by census tract overlay

Four of the five systems discussed above were able to overlay at least two sets of mapped data. For this experiment they combined the census tracts with a map of land use for the same 7.5 x 7.5-minute map. As might be expected, errors that occurred in processing the two maps separately were accentuated when the two maps were combined (Table 25). The range between the minimum and maximum measurements of the area of a single land use within a tract was from 5% of the maximum measurement (tract 214.03, use 41) to 110% of the maximum measurement (tract 205.01, use 11). The range in this case was greater than the maximum area estimate because PIOS computes area for use 11 by subtracting the area of other uses from the total area of the overlaid polygon, in this case census tract 205.01, and assigning the remainder to use 11, that is, 499 ha minus 502 ha equals -3 ha (1,231 ac minus 1,239 ac equals -8 ac).
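The arithmetic behind the 110% figure can be reproduced with a short Python sketch. The function names are illustrative only, and nothing here is meant to describe the PIOS code itself beyond the subtraction stated in the text; the acreages are the tract 205.01 values reported in Table 25.

```python
def residual_area(total_area, other_use_areas):
    """Assign to the residual use whatever the other uses leave over."""
    return total_area - sum(other_use_areas)


def range_over_max(estimates):
    """Range between the smallest and largest estimate, as a share of the largest."""
    return (max(estimates) - min(estimates)) / max(estimates)


# Tract 205.01, acres: PIOS areas for every use other than use 11 (Table 25).
pios_other_uses = [746, 12, 1, 15, 31, 0, 24, 410]
pios_use_11 = residual_area(1231, pios_other_uses)

# Use 11 estimates from ORRMIS, CGIS, PIOS and MLMIS for the same tract.
ratio = range_over_max([16, 0, pios_use_11, 80])
```

Because the residual can fall below zero, the range (88 ac) exceeds the largest single estimate (80 ac), giving a ratio above 1.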

Fig. 70. CGIS: Percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 71. ORRMIS: Percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 72. MLMIS: Percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 73. PIOS: Percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 74. Calspan: Percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 75. Absolute values of CGIS percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 76. Absolute values of ORRMIS percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 77. Absolute values of MLMIS percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 78. Absolute values of PIOS percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Fig. 79. Absolute values of Calspan percentage deviation from planimetered tract areas plotted against planimetered area of tracts.

Table 25.

Selected examples* of estimates of land use area by tract, in acres**.

Tract     Use     ORRMIS    CGIS    PIOS    MLMIS    Range          Range/
                                                     (max.-min.)    maximum
 35.01    11          56      60      65       40        25         0.3846
          12          90      88      86       80        10         0.1111
          13           0       0       1        0         1         1.0000
          16          24       0       0        0        24         1.0000
          Total      170     148     152      120

 37.00    11          74      86      83       80        12         0.1395
          12          53      64      71       40        31         0.4366
          Total      127     150     154      120

112.00     0           0       0      93        0        93         1.0000
          11          21      20      22        0        22         1.0000
          12          96      92       0      120       120         1.0000
          15           0       0       1        0         1         1.0000
          19          11       0       0        0        11         1.0000
          Total      128     112     116      120

117.00    11         526     473     425      520       101         0.1920
          12          56      21      15        0        56         1.0000
          15          88     131     118       40        91         0.6947
          18           0       0      14        0        14         1.0000
          19          11      12       0       40        40         1.0000
          Total      681     637     572      600

205.01    11          16       0      -8       80        88         1.1000
          13         619     672     746      640       127         0.1702
          15           0       7      12        0        12         1.0000
          16           3       0       1        0         3         1.0000
          19          24       9      15        0        24         1.0000
          41          27      22      31        0        31         1.0000
          42          21      19       0        0        21         1.0000
          43           0       0      24       40        40         1.0000
          54         481     458     410      560       150         0.2679
          Total     1191    1187    1231     1320

214.03    11         690     666     688      640        50         0.0725
          12         189     199     183      200        17         0.0850
          13          35       1       3        0        35         1.0000
          16         481     637     648      560       167         0.2577
          18           0       0     105        0       105         1.0000
          19         127     144      45      120        99         0.6875
          41          40      40      42       40         2         0.0476
          42           5       0       0        0         5         1.0000
          54         263     275     323      200       123         0.3808
          62          40      44      45       80        40         0.5000
          Total     1870    2006    2082     1840

* Complete estimates are shown in Appendix 6.
** 1 ac = 0.405 ha.

In addition to this anomaly, there were a large number of use-by-tract areas which at least one system measured as zero, but one or more other systems recorded as positive. Nearly 50% of the listed use-by-tract polygons show this type of error, so that the range was 100% of the maximum area measurement. In Table 26 these cases are classified according to which system or systems recorded zero area.

Table 26.

Analysis of the use-by-tract areas for which one or more systems registered no area. Entries are tract-use codes (for example, 3502-16 is tract 35.02, use 16); the number of areas in each group is given in parentheses.

Missed by one system (60 areas)
  ORRMIS (5): 3502-16 3600-16 4002-13 11600-61 20600-21
  CGIS* (2): 20400-43 20501-11
  PIOS (6): 10400-12 10500-12 10600-12 11000-12 11200-12 11700-19
  MLMIS (47): 3400-54 4002-54 4100-12 4100-16 4300-13 4700-12 4800-12 4800-54 5300-13 5300-15 5300-16 10100-11 10600-54 10800-11 10900-19 11000-15 11200-11 11300-11 11400-15 11400-16 11600-15 11700-12 11800-15 11800-41 11900-13 12000-15 12100-12 12100-15 12100-16 12300-13 12300-54 12400-16 12400-19 12400-21 12600-16 12702-19 12702-41 20200-12 20200-16 20400-54 20501-19 20501-41 20502-54 20600-13 20600-41 20600-61 21403-13

Missed by two systems (27 areas)
  ORRMIS and CGIS* (2): 20501-43 20502-43
  ORRMIS and PIOS (0)
  ORRMIS and MLMIS (11): 10200-12 10500-15 10600-15 10600-19 10700-16 10700-54 10900-16 11500-15 12600-15 20300-13 20501-15
  CGIS* and PIOS (0)
  CGIS* and MLMIS (8): 10400-16 10800-19 11500-16 12100-13 12100-19 12500-21 12600-19 20501-16
  PIOS and MLMIS (6): 3400-16 10300-12 10500-16 10900-12 20501-42 20502-42

Recorded by one system (38 areas)
  ORRMIS (7): 3400-13 3501-16 11000-19 11200-19 20502-21 20600-19 21403-42
  CGIS* (4): 10700-12 12702-13 20400-42 20502-61
  PIOS (26): 3501-13 4001-16 10900-18 11200-15 11700-18 11900-16 12000-13 12200-62 12400-18 20200-21 20502-18 21403-18 10200-0 10300-0 10400-0 10500-0 10600-0 10700-0 10900-0 11000-0 11200-0 11300-0 11400-0 11500-0 11600-0 20600-0
  MLMIS (1): 10200-21

Total number of areas missed by 1, 2 or 3 systems: 60, 27 and 38 respectively.

*Land use 43 was not shown on the source documents provided to CGIS.

Table 27.

Land-use by census-tract polygons missed by MLMIS but recorded by all others.

                      Acres* recorded
Tract     Use     CGIS    ORRMIS    PIOS
 34.00    54        20       24       37
 40.02    54        18       11       15
 41.00    12        13       27       12
 41.00    16        14       19       13
 43.00    13        39       24       40
 47.00    12        14       11        4
 48.00    12        16       11       26
 48.00    54        60       90       57
 53.00    13        36       19       29
 53.00    15         1        5        1
 53.00    16         3        3        5
101.00    11        38       29       30
106.00    54        58       21       64
108.00    11        17        5       13
109.00    19        35       13       24
110.00    15         7       13        6
112.00    11        20       21       22
113.00    11        46       80       45
114.00    15         2        8        6
114.00    16        62       58       34
116.00    15        10        8       22
117.00    12        21       56       15
118.00    15        24       27       18
118.00    41        20        8       21
119.00    13         4       21        4
120.00    15        13        8       14
121.00    12        18       19       18
121.00    15        14        3        9
121.00    16         9       29       17
123.00    13        17       27       17
123.00    54        22       16       27
124.00    16        28       24       29
124.00    19        44       19       22
124.00    21        17       16       18
126.00    16        16       13       17
127.02    19        19       16       19
127.02    41         4       13        5
202.00    12        32        5       31
202.00    16        34       21       29
204.00    54        14       21       12
205.01    19         9       24       15
205.01    41        22       27       31
205.02    54        14       11       14
206.00    13        34       48       29
206.00    41        53       72       63
206.00    61        35       40        1
214.03    13         1       35        3

*1 ac = 0.405 ha

Three major categories are listed in Table 26: areas missed by one system out of four, areas missed by two systems out of four, and areas missed by three systems out of four. There is some reciprocity between the first and third categories, in that if one system miscodes a polygon, an entry will appear in both categories. For example, in Table 25, for tract 117 PIOS records 5.7 ha (14 ac) of use 18 and the other three systems record none, but they record 4.5, 4.9 or 16.2 ha (11, 12 or 40 ac) of use 19 and PIOS records none. This situation is noted in Table 26 as one case under 'Missed by one system, PIOS' for tract 117 use 19 and one case under 'Recorded by one system, PIOS' for tract 117 use 18. A similar type of reciprocal entry occurs when a use area is recorded in one tract by one system and another tract by the other systems. For example, ORRMIS recorded 10 ha (24 ac) of use 16 in tract 35.01 and 3 ha (8 ac) in tract 36 whereas the other systems recorded none, and ORRMIS recorded no area of use 16 in tract 35.02 whereas the others showed 23, 21 and 16 ha (57, 53 and 40 ac). This situation is recorded in Table 26 as one case under 'Missed by one system, ORRMIS', for tract 35.02 use 16, and two cases under 'Recorded by one system, ORRMIS', for tract 35.01 use 16 and tract 36 use 16. As shown by the above example, not all the entries in these two categories have reciprocal entries. In tract 214.03, PIOS records 43 ha (105 ac) of use 18 but the other three record none. PIOS also records 18 ha (45 ac) of use 19 in the same tract, which is approximately 41 ha (100 ac) less than that recorded by the other systems. This indicates that a land use polygon was miscoded by PIOS. In Table 26 the situation is shown as one case under 'Recorded by one system, PIOS', but there is no reciprocal entry under 'Missed by one system'.

The category accounting for the largest number of use-by-tract areas for which one or more systems recorded no area is that 'Missed by one system, MLMIS', which accounts for 37.6% of all the cases. These cases are further analysed in Table 27. Sixty per cent of these areas are of 8 ha (20 ac) or less, as measured by CGIS, and therefore would tend to have been lost because of the 16-ha (40-ac) grid coding of predominant use in MLMIS. Another 27% were measured as 9 to 16 ha (21 to 40 ac) by CGIS, and these may also have been missed because of the grid cell. Thus, 87% of the use-by-tract areas missed by MLMIS but recorded by all other systems were measured as 16 ha (40 ac) or less by CGIS. Only six cases (13%) were of larger areas according to CGIS figures.

The category with the second largest number of use-by-tract areas missed by one or more systems is the category 'Recorded by one system, PIOS', which accounts for 26 cases, or 20.8% of all the cases listed. Fourteen of them are cases where PIOS recorded areas for land use code zero, which was not one of the codes used in this experiment. The areas represent polygons which were coded zero either as a coding error or due to missing data on the source map. The polygons were contained in, and therefore affect the accuracy of, PIOS areas in 14 tracts (numbers 102, 103, 104, 105, 106, 107, 109, 110, 112, 113, 114, 115, 116 and 206). One polygon that was miscoded zero instead of 12 accounts for nearly all the cases, because it was a long and narrow, winding polygon (such as would represent a transportation facility) which overlapped many census tracts.

The third largest group of cases in Table 26 is areas missed by both ORRMIS and MLMIS. Both these systems used a 16-ha (40-ac) grid for the overlay, but ORRMIS used a smaller grid for the output of the combined maps whereas MLMIS used the same grid. Therefore, there are fewer cases missed by both systems than by MLMIS alone, but 11 cases were missed by both. The smallest of these 11 areas were measured as 0.4 ha (1 ac) by CGIS and PIOS (tract 107 use 16, and tract 126 use 15), and none was much more than 16 ha (40 ac); the largest was measured as 19 ha (46 ac) by CGIS and 15 ha (37 ac) by PIOS (tract 106 use 15). Table 28 shows the areas not recorded by MLMIS and ORRMIS; it also shows the change in number of areas not recorded by ORRMIS after they had reprocessed the test data using the 1-ha (2.5-ac) grid coding.

Table 28.

Areas not recorded by MLMIS and ORRMIS.

Tract 102.00 105.00 106.00 106.00 107.00 107.00 109.00 115.00 126.00 203.00 214.01

Use

CGIS

PIOS

12

28 1 46 3 1 11 4

41 10 37 12 1

15 15 19 16

54 16 15

15 13 15

5 1 23 7

5 5 11 1 6 12

Acres* recorded MLMIS ORRMIS (40 ac) (2. 5 ac) 0 0 0

0 0 0 0 0 0

0 0

0 0 0 0 0

0 5 26 0 0

0

0

0

3

0 0

3 0 8 29

0 0

*1 ac = 0.405 ha.
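The way small or narrow areas disappear under predominant-use grid coding, as discussed above for MLMIS and ORRMIS, can be illustrated with a minimal Python sketch. The cell contents are hypothetical and the function is an invented illustration of the general technique (each cell takes the single use that covers most of it), not a description of either system's actual code.

```python
from collections import Counter


def grid_areas(cells, cell_area):
    """Area per use when each cell is coded with its predominant use only.

    `cells` is a list of dicts mapping use code -> true area inside the cell.
    """
    recorded = Counter()
    for cell in cells:
        predominant = max(cell, key=cell.get)  # use covering the most area
        recorded[predominant] += cell_area
    return dict(recorded)


# Hypothetical example: three 40-acre cells. Use 54 is a narrow strip that
# occupies 10 acres of every cell but is never the predominant use.
cells = [
    {11: 30, 54: 10},
    {11: 30, 54: 10},
    {12: 30, 54: 10},
]
coarse = grid_areas(cells, cell_area=40)
true_use_54 = sum(cell.get(54, 0) for cell in cells)
```

In this example 30 acres of use 54 exist on the ground but none is recorded, while uses 11 and 12 are each overstated by 10 acres per cell.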


Quality of mapped output

The source maps used by the systems were shown in Figures 66 and 67. As CGIS produced the source maps, no other outputs have been included from CGIS. Mapped output from ORRMIS, PIOS and Calspan is shown in Figures 80 to 84 and can be compared visually to the source maps. MLMIS did not submit a plot of the data.

Fig. 80. ORRMIS shaded plot of land use polygons (Norfolk 7.5-minute quadrangle, level II land use, polyconic projection, cell = 2.5 acres).

Fig. 81. PIOS plot of land use polygons.

Fig. 83. Calspan plot of land use polygons.

Fig. 84. Calspan plot of census tracts.

COSTS

This section describes the costs that the systems reported for their parts of the experiment and reproduces the project cost reports for MLMIS, PIOS and ORRMIS. As Calspan is a private enterprise, cost figures could not be made available for publication.* CGIS data are not reported in detail because this system processed an entire map sheet at a scale of 1:100,000 from input sheets at two different scales (1:125,000 and 1:50,000). The costs for the smaller area processed by the other systems would have been difficult to factor to provide equivalent costs for comparison purposes. If the CGIS costs were prorated on a linear basis to represent the smaller 7.5 x 7.5-minute map, the costs would compare very favourably with those of the other systems. It must be stressed, however, that this would not be a completely valid comparison technique.

A complete comparative analysis of costs has not been possible because of the factors identified below. This section, instead, focusses on the distribution of costs within each system, attempting to identify steps or processes that appear to be either efficient or inefficient. The reported costs are probably useful only for a very general indication of overall system performance and are not adequate for comparisons of cost effectiveness. Each system was designed for a purpose and at a geographic scale quite different from those assigned for the experiment; most are more suitable for larger areas and have processed the single 7.5 x 7.5-minute map with uneven degrees of efficiency. A wide variety of capabilities and products is represented by these systems, so their relative performances on this problem are difficult to compare. Because the experimental problem was not a routine processing job, a number of special costs were incurred. These have been separated from the others as far as possible. Finally, only direct costs are reported and capital investment in either hardware or software is excluded; these vary enormously, further complicating comparisons.

Despite the qualifications expressed above, it is felt that the results of this data encoding experiment have provided a useful comparison of the features and performances of the systems. Careful comparison of specific procedures, costs and products can provide a useful indication of the operational state-of-the-art in various aspects of geographic information system development.

Significant steps or processes

Costs (or levels of effort) must be analysed in terms of a detailed sequence of steps which cover the major processes of data acquisition, data input, data retrieval and analysis, and information output. The following list is the composite sequence of steps deemed necessary to describe the process.

1. Acquisition of source data
   - description of source data (documentation)
   - evaluation of source data

2. Input procedures
   - partitioning (separation of the data into workable units)
   - control (method and procedures for controlling the processing of workable units)
   - encoding (graphic to digital conversion, classification coding, verification and correction procedures)
   - data reduction (simplification or compaction of encoded data)
   - data file construction (organization, access method)

3. Retrieval and analysis
   - retrieval from storage
   - edge matching (combining various segments of a single data set, necessitated when partitioning data into workable units or for data storage efficiency)
   - measurement
   - comparison of multiple data sets (overlay procedures)

4. Information output
   - tabular listings
   - graphic display
   - verification procedures

The discussion of individual system costs and levels of effort will be in terms of the above categories to the extent possible. However, the diverse nature of the systems tested makes it necessary to combine or omit certain categories.

*Additional information can be obtained from Mr. G. Lewandowski, Calspan Corporation, Buffalo, N.Y.

MLMIS experiment. Table 29 indicates the steps and associated costs for processing the data by MLMIS in this experiment. As shown in the table, MLMIS is a labour-intensive system, particularly concerning the entire data input process. Two significant facts are discernible from the figures listed:

1. The manual coding time is not a function of data density but remains fairly constant across data sets (land use and census tracts). Although some difference is shown between the two data sets encoded, it is clear that the need to assign a classification value manually to every grid cell establishes a minimum level of effort for classification coding that is related to the number of cells.

2. A relatively large proportion of the total cost and man-hours was required for project definition and set-up. The 8-hour figure shown represents the total time spent with IGU personnel, and the time actually pertaining to data encoding cannot be separated from this total estimated time.

Table 29.

MLMIS project costs.

Job                                          Rate per   Normal       Extra time    Total   Total
                                             hour ($)   production   because ad    time    cost ($)
                                                        time (hr)    hoc job (hr)  (hr)
Liaison with Institute of Urban and
Regional Research, University of Iowa,
and project administration                   10.00      -            8.0           8.0      80.00

Encoding of land use and census tract data
  Coding: Land use                            2.84      5.5          1.5           7.0      19.88
  Coding: Census tracts                       2.84      5.0          0.5           5.5      15.62
  Keypunching: Land use and census tracts     2.60      3.5          -             3.5       9.10

Computer costs
  Programming: Mapping                        8.00      1.5          2.5           4.0      32.00
  Programming: Frequency cross-tabulation    10.00      0.5          -             0.5       5.00
  Machine costs: Mapping                      -         $1.03        $8.00         -         9.03
  Machine costs: Frequency cross-tabulation   -         $3.07        -             -         3.07

Total                                                   Normal costs Extra costs
                                                        $60.02       $113.68               $173.70
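The totals in Table 29 can be checked with a short Python sketch. The allocation of hours between normal production time and extra time used here is an assumed reading of the table, chosen because it reproduces the printed normal, extra and overall totals; the variable names are illustrative.

```python
# (job, rate in $/hr, normal hours, extra hours): assumed reading of Table 29
labour = [
    ("Liaison and project administration", 10.00, 0.0, 8.0),
    ("Coding: land use", 2.84, 5.5, 1.5),
    ("Coding: census tracts", 2.84, 5.0, 0.5),
    ("Keypunching", 2.60, 3.5, 0.0),
    ("Programming: mapping", 8.00, 1.5, 2.5),
    ("Programming: frequency cross-tabulation", 10.00, 0.5, 0.0),
]

# (job, normal machine cost in $, extra machine cost in $)
machine = [
    ("Mapping", 1.03, 8.00),
    ("Frequency cross-tabulation", 3.07, 0.00),
]

normal = sum(rate * n for _, rate, n, _ in labour) + sum(n for _, n, _ in machine)
extra = sum(rate * e for _, rate, _, e in labour) + sum(e for _, _, e in machine)
total = normal + extra

print(f"normal ${normal:.2f}, extra ${extra:.2f}, total ${total:.2f}")
```

On this reading, about two-thirds of the reported MLMIS cost arose because the experiment was an ad hoc job rather than routine production.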

MLMIS represents the least cost, in terms of total dollars, of all the experiments. However, this cost must be viewed in the light of the required accuracy, as discussed in the previous section.

ORRMIS experiment. Table 30 lists the costs and levels of effort experienced by ORRMIS in conducting the experiment. As with MLMIS, the significant factors in this case were mainly the man-hours required for several tasks, as identified below.

1. The process of map preparation, and specifically the manual inking of polygons before scanning, required 13.5 man-hours for both data sets.

2. The editing of the encoded data sets presents an interesting comparison. Although initial manual preparation (inking) required about the same number of man-hours for each data set, editing the land use data set required 8 man-hours, whereas editing the census tracts required only 15 minutes. From these figures it is inferred that manual editing time increases rapidly as data density and complexity increase.

3. It should also be noted that the processing of these two data sets resulted in the creation of 101 separate frames for scanning. Even though the unit cost for scanning is low ($1.00/frame), the initial partitioning (inking separate classifications) creates a large number of frames to be scanned.

4. Finally, the man-hours required for supervision of computer job submission (7.5) were a significant cost.

Table 30.

ORRMIS project costs. Personnel

Manhours

Grade

Data preparation Computer-drawn graticule Scan cell mask

Positioning of registration bail Checking registration and alignment of mask and copy work Inking land use data Mylar overlay sheets Photographing land use data Inking census data Photographing census data Film 35-mm (25') Film development Scanning/digitizing Scanner (cost/frame $1. 00) Data processing Scanner magnetic tape conversion 'Quick-look1 plots Header card preparation Generation of header card information Land use Census tracts Scan edit data Fiducial plot Fiducial adjustment Cell assignment Loading data into ORRMIS Back-up creation (during ORRMIS loading) Editing land use data Editing census tract data Editing changes to ORRMIS data base Computer job submission Display and analysis Cluster maps Computer plots (all plots) Tabular display Submitting jobs, etc. Total man-hours: technician Total man-hours: supervisor

Equipment and supplies

Computer time

1 2

Technician Technician

1

Supervisor

7.5

Technician

1 6 0. 25

Technician Technician Technician

Cost

IBM 360/91

($)

0.25 sec plotting

0.05 0.50

2. 10

1. 10 1.00 101.00

0. 25 0. 25

0. 75

4.0

1.5 mm min

5.00 10.00

4.0 min 1. 25 min 1.0 sec

12.00 6.00 0.05

10.0 min

2.00

0.5 min 3.5 min 7.0 sec

2.00 19.00 0.50

Technician

Technician Technician

7.5

Supervisor

26. 35 10. 75

8.00 14.00 0. 15

Technician Technician

8 0.25

2. 25

2.5 min 3. 25 min 0.50 sec

Supervisor Total supply and computer cost $289.45

As noted previously, the use of different cell sizes for the two data sets, 1 ha (2.5 ac) for land use and 17 ha (42 ac) for census tracts, may have significantly affected the cost figures. The data are not sufficient to permit calculation of the total cost, as personnel rates are not shown. However, this analysis has identified the high-cost steps in the ORRMIS process.

PIOS experiment. Table 31 lists the component costs for the PIOS experiment. The significant factors in this test are as follows.

1. The base maps were enlarged from 1:100,000 to 1:24,000 for digitizing and split into two equal parts.

2. Man-hours and machine time costs for digitizing are rather high for the size of the data base.

3. All computer costs are minimal except the overlay process. A cost of $88 for the overlay of one data set upon another for a single 7.5 x 7.5-minute map suggests that this step might prove expensive for larger and more complex problems. However, the portion of this cost that is for starting the program and the amount of computer time required are not known.

Table 31.

PIOS project costs.

Item                                                              Cost    Subtotal

Map preparation
  Enlargements of base maps to 1:24,000 (1 in. = 2000 ft) scale   $ 23.84
  Numbering of individual polygons: 3.4 hr clerical @ $3.50         12.25
                                                                            $  36.09

Digitizing
  Operator: 12 hr @ $3.50                                           42.00
  Travel mileage - 80 miles @ $0.12                                  9.60
  Machine time: 10 hr @ $10.00                                     100.00
                                                                              151.60

Data processing
  Polygon edit program                                              32.59
    Land use file 031A: 29 polygons, 0 parity errors
    Land use file 031B: 79 polygons, 26 parity errors
    Land use file 031C: 57 polygons, 1 parity error
    Land use file 032: 96 polygons, 187 parity errors
    Census tract file 031: 27 polygons, 0 parity errors
    Census tract file 032: 63 polygons, 1 parity error
  Polygon merge program                                             17.70
    Merge and plot land use files: merge 6 files, 3 invocations
      of merge program, plot merged file
    Merge and plot census tract files: merge 2 files, 1 invocation
      of merge program, plot merged file
  Polygon update program                                             4.27
    Update land use file: 158 transactions
    Update census tract file: 1 transaction
  Polygon list program                                               2.81
    List land use file: 261 polygons
    List census tract file: 90 polygons
  Complot program (composite plot)                                   4.32
    Composite plot of land use and census tract files
      at 1:42,000 (1 in. = 3500 ft) scale
  Polygon overlay program (PIOS)                                    88.32
    Overlay land use and census tract files
    Plot residual file
    Sort residual file: 3 programs
    PIOS edit program
                                                                              150.01

Other charges
  Plotter: 1.13 hr @ $30.00                                         33.90
  Programmer/analyst: 6 hr @ $7.50                                  45.00
                                                                               78.90

Total                                                                       $ 416.60
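The arithmetic of Table 31 can be checked with a short sketch. The component figures below are transcribed from the table; the grouping into a dictionary and the variable names are illustrative assumptions, not part of the PIOS software.

```python
from decimal import Decimal

# Component costs in dollars, transcribed from Table 31.
# The dictionary layout and names are assumptions made for this sketch.
costs = {
    "Map preparation": ["23.84", "12.25"],
    "Digitizing": ["42.00", "9.60", "100.00"],
    "Data processing": ["32.59", "17.70", "4.27", "2.81", "4.32", "88.32"],
    "Other charges": ["33.90", "45.00"],
}

# Decimal keeps the cents exact, which binary floats would not guarantee.
subtotals = {group: sum(map(Decimal, items)) for group, items in costs.items()}
total = sum(subtotals.values())

# Share of the whole experiment taken by the single overlay step.
overlay_share = Decimal("88.32") / total

for group, amount in subtotals.items():
    print(f"{group:<16} ${str(amount):>7}")
print(f"{'Total':<16} ${str(total):>7}")
print(f"Overlay share of total: {float(overlay_share):.1%}")
```

The subtotals and total computed this way agree with those printed in the table, and the overlay step alone accounts for roughly a fifth of the total.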


Concluding comments on cost analysis. The aim of the cost analysis has been to identify steps or processes within each system that are high in cost or level of effort and that may thus significantly affect attempts to use the systems, or similar procedures, for any substantial task. Cost data have not been included for the CGIS experiment because, as mentioned above, their cost figures were based on the much larger area they had processed for the earlier USGS project and have not been factored to provide equivalent costs for comparison with the limited test undertaken by the other systems.

As far as is known, this report represents the first comparative investigation of spatial encoding techniques. It reveals the complexities of such a process and the illusory nature of simplistic cost comparisons relating to specific parts of a digitization process. Careful comparison of the various component procedures, costs and products described, however, can provide a useful indication of the state of development of the various aspects of geographic information systems in 1974.

Several important observations can be made from this test. First, the general conclusions about accuracy of measurement are worth repeating:

1. Grid-based systems (in medium and large sizes particularly) tend to be significantly less accurate in area measurement than either coarse or fine polygon systems; and

2. Grid-based systems tend to underestimate or lose linear features when two data variables are overlaid.

Further investigations will undoubtedly be undertaken by many agencies concerned with converting graphic data to a machine-readable form. As shown in this experiment, the various approaches do not lend themselves to a simple, standard comparative test.
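The first conclusion, on area measurement, can be illustrated with a minimal sketch. It does not reproduce the method of any system tested; it assumes a hypothetical narrow parcel, measures its area from the polygon outline, and compares that with estimates obtained by counting grid cells whose centres fall inside the outline. All function names and coordinates are invented for illustration.

```python
def shoelace_area(vertices):
    """Area of a simple polygon computed from its outline (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(x, y, vertices):
    """Even-odd ray-casting test for a point against a simple polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_area(vertices, cell):
    """Area estimated by counting cells whose centre lies inside the polygon."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    nx = int((max(xs) - min(xs)) / cell) + 1
    ny = int((max(ys) - min(ys)) / cell) + 1
    count = 0
    for j in range(ny):
        for i in range(nx):
            cx = min(xs) + (i + 0.5) * cell
            cy = min(ys) + (j + 0.5) * cell
            if point_in_polygon(cx, cy, vertices):
                count += 1
    return count * cell * cell


# Hypothetical long, narrow parcel (arbitrary map units).
parcel = [(0.0, 0.0), (10.0, 1.0), (10.0, 1.6), (0.0, 0.6)]

print("polygon area:", shoelace_area(parcel))
for cell in (2.0, 0.5, 0.05):
    print(f"grid estimate, cell {cell}:", grid_area(parcel, cell))
```

With cells that are large relative to the width of the parcel, the estimate depends heavily on where cell centres happen to fall; it converges on the polygon figure only as the cell size shrinks. Narrow features can be missed entirely by a coarse grid, which is consistent with the second conclusion above.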
Future investigation must now be concerned not simply with digitization but with the whole process of converting an error-prone source document to an acceptably error-free, machine-readable record, in a format and file structure convenient to the eventual user of the information. Source documents of many types, sizes and degrees of complexity may demand different encoding approaches. Removing errors from the source document and those introduced by the digitizing process may be much more costly than any one digitizing process itself.

The test described above was limited to a small piece (129 cm², or 20 sq in, at 1:100,000 scale) of source material and ignored the task of fitting such sections together to form a map of a whole region or county. Yet that task of edgematching may be efficiently handled by some digitizing approaches but may impose significant difficulties after other forms of digitizing have apparently been completed.

Conversion to a machine-readable format must eventually produce a data format that is of direct value for subsequent retrieval or manipulation. It is irrelevant whether the costs of achieving that end are incurred during digitizing or during the subsequent handling of the data resulting from the digitizing. The objective must be efficiency in the overall process of establishing a useful data base.

In the Canada Geographic Information System, for example, the procedures prior to document scanning not only prepare the data for automatic digitization, but effectively edit and remove errors from the source documents. In this system, it is found to be efficient to process the raw data from the scanner in several ways before compiling the final line data in machine storage. This process involves change of map projection, centroid calculation, area measurement, linear measurement, automatic alignment of codes to facilitate retrieval, topological editing and automatic line correction, and simple forms of line smoothing and scale change.
These processes make it much easier to retrieve data from the machine-readable records that are subsequently produced, but they form an integral part of the process of turning the source document into a digital record of the graphic image. Because the costs from CGIS relate to the overall task, they cannot be compared directly with those of a particular digitization process. As increasingly sophisticated techniques are devised in future to accomplish the overall task of creating computer-stored maps, the comparison of segments of the various approaches will be similarly meaningless if taken out of context. Even in this early stage (1974), examination of the current state of development of such techniques has made this apparent. Future studies must therefore concern themselves not only with digitization but also with the nature and quantity of source documents and, most importantly, with the manner in which all the data concerned are eventually stored in machine-readable form for various subsequent uses.


Appendix 3

Canada Geographic Information System Graphics subsystem


For some time there has been interest in the possibility of on-line interactive graphics to assist users in the interrogation and manipulation of the CGIS data base. This effort commenced in April 1974 with the acquisition of a Tektronix 4012 storage display system and hard copy unit. The first task was to determine the feasibility of painting a screen with the type of tabular and graphic information that would be of value to land use planners, resource managers and others interested in interrogating such a data base. The results of the first effort indicated that further investigation was warranted. This report describes one data base and the current software.

The hardware will soon be upgraded to a Tektronix 4014 storage display system and hard copy unit, which will provide a 48.2-cm (19-in) screen to view the data. The software developed to date is general and not restricted to operations on the data base described. It is a simple task to take any area from the CGIS data base, convert it to graphics format, and operate upon it using current software. Software development is the responsibility of Mr. T. A. Fisher, Head of Data Retrieval for the Canada Geographic Information System, and Mr. Sujit Banerjee, Programmer/Analyst.

The data base consists of one 1:50,000 map sheet of the Ottawa area (31G05). For this map sheet five different coverages were prepared according to CLI classification codes (Appendix 2; an eighth class has been added to the original seven, for unmapped, or unclassified, areas) and CGIS land use criteria (Table 8). The coverages were land use 1964, land use 1968, land use 1973, agricultural capability, and recreation capability. These five maps were processed through the input and data reduction subsystems of the CGIS to produce a data base. The five coverages for the one map sheet were then overlaid using the retrieval subsystem capabilities. The resultant data base was converted from the standard CGIS data base format into a format more suitable for graphics manipulation. This data base is stored on a direct-access device.

The variables available for interrogation are:

PLU64 - present land use (1964)
PLU68 - present land use (1968)
PLU73 - present land use (1973)
AGRCL - agriculture class (CLI)
AGRS1 - agriculture subclass 1 (CLI)
AGRS2 - agriculture subclass 2 (CLI)
RECCL - recreation class (CLI)
RECF1 - recreation primary feature (subclass, CLI)

In the overlay operation of CGIS, each polygon produced as a result of the overlay has associated with it an identifier that is a concatenation of all the component identifiers. For example, a particular location for the five basic input coverages might be defined as Class 2 recreation, Class 3 agriculture, built up in 1973, unimproved pasture in 1968 and productive woodland in 1964. The resultant polygon from the overlay would have these five identifiers associated with it, that is, 23BKT. From the overlay 12,000 unique combinations of identifiers were produced. Many of these identifiers describe polygons that are less than 0.04 ha (0.10 ac). The number of unique combinations of identifiers gives some idea as to the number of possible questions that might be asked about the area.

Note that the data base described above is only one of those that could be made available for manipulation: any data, once in the CGIS data base, can be operated upon using the graphics software.
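The concatenated identifier can be sketched as follows. This is an illustration only, not the CGIS implementation: the ordering of coverages is inferred from the 23BKT example, the dictionary-based record layout is an assumption, and the sample fragments and their areas are invented.

```python
from collections import defaultdict

# Order of the component identifiers, inferred from the 23BKT example:
# recreation class, agriculture class, then land use 1973, 1968 and 1964.
COVERAGE_ORDER = ("RECCL", "AGRCL", "PLU73", "PLU68", "PLU64")


def overlay_identifier(codes):
    """Concatenate the component identifiers of one overlay polygon."""
    return "".join(codes[name] for name in COVERAGE_ORDER)


# The location described in the text: Class 2 recreation, Class 3 agriculture,
# built up in 1973, unimproved pasture in 1968, productive woodland in 1964.
location = {"RECCL": "2", "AGRCL": "3", "PLU73": "B", "PLU68": "K", "PLU64": "T"}
print(overlay_identifier(location))  # 23BKT

# Hypothetical overlay fragments: (component codes, area in hectares).
fragments = [
    (location, 1.20),
    (location, 0.03),
    ({"RECCL": "2", "AGRCL": "3", "PLU73": "B", "PLU68": "B", "PLU64": "T"}, 0.50),
]

# Total area per unique combination of identifiers.
area_by_identifier = defaultdict(float)
for codes, area in fragments:
    area_by_identifier[overlay_identifier(codes)] += area

print(len(area_by_identifier), "unique combinations")
```

Because every retrieval question amounts to a pattern over these combined identifiers, the count of unique combinations is a rough measure of how many distinct questions the data base can answer.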


The current graphics routine is essentially a basic retrieval mechanism for displaying and tabulating subsets of the data base. As such, it has no manipulative capabilities. For example, the various classes of the land cannot be weighted and decisions made based upon the weighted values. Results can only be selected, tabulated and plotted. The routine has five commands: LIST, SELECT, PLOT, GRAPH and DISTANCE.

The LIST command provides the user with a list of variables that can be accessed for report generation and plotted (mapped) output (example 1).

The SELECT command allows the user to specify all or some of the variables to produce a tabular report giving the area that satisfied the selection criteria. For each of the variables specified, the user is requested to supply the values for which a report is desired. A 'nil' response, that is, hitting the 'return' button on the keyboard, indicates that all values for that particular variable are to be selected. An asterisk before a value indicates the NOT function. For example, to examine the change in the urban area between 1964 and 1968, the statement would be 'not built up in 1964 (*B) and built up (B) in 1968'. The tabular report would show the amount of land that had become urban during that time. To obtain a breakdown of the land that had changed from non-urban to urban use, all the individual values describing non-built-up areas must be entered for 1964. For 1968, the value describing built-up areas would be entered. The report would be for the same total area, but would be broken down to indicate the composition of the land that had changed from non-urban to urban use. Each selection is numbered in ascending sequence starting at 1. The selection number and date are printed under the report heading (examples 2 and 3).

The PLOT command is used in conjunction with the SELECT command. Once a selection has been made and a tabular report produced, the user has the option of requesting a plotted map of the polygons that satisfied the selection criteria. Plots are given the same number as that used in the most recent selection. This facilitates correlation of tabular reports and plots. The PLOT command can enlarge the desired portion of the data base. The user specifies the portion required in inches from the left side of the study area (data base). A return (with no data input) indicates that the entire data base is to be plotted. The system takes the dimensions as supplied by the user and scales the data base or the map to fit the screen. The scale is the same in both the x and y directions. The system asks the user if he wishes to see the shoreline produced on the map. A 'no' answer maps only the selected criteria. The selected criteria and the water bodies are plotted in response to 'yes'. When plotting is completed, the scale and selection criteria are written above the map (examples 4 and 5).

The GRAPH command makes use of the cross-hair cursor available on the graphics hardware to enlarge a portion of the plotted image. It is used after a PLOT command and allows the user to enlarge a portion of a plot by pointing to a bounding rectangle with the cursor.

The DISTANCE command makes use of the cross-hair cursor in the same manner as the GRAPH command to determine the distance, in miles, between any two points on the plotted image. This command is used after a map has been plotted (example 6).

At this stage in the development of CGIS graphics, selected individuals are using the system experimentally so that their views about the system, their requirements and their ideas for possible future enhancements to the system can be determined. More resources will be committed to develop the graphics capability further.
Improvements will include some form of shading for the water bodies, some type of location-specific identifiers so that the user can better orient the mapped output, a SCALE command to allow the user to specify the scale of the plotted map, and some further type of interactive capability through the use of the cross-hair cursor.

EXAMPLES

1. The LIST command.

Enter:

LIST

Response:

LIST OF COVERAGES AVAILABLE
AGRS1 - AGRICULTURE SUBCLASS 1
PLU64 - PRESENT LAND USE (1964)
PLU68 - PRESENT LAND USE (1968)
PLU73 - PRESENT LAND USE (1973)
RECCL - RECREATION CLASS
AGRCL - AGRICULTURE CLASS
AGRS2 - AGRICULTURE SUBCLASS 2
RECF1 - RECREATION PRIMARY FEATURE
ENTER COMMAND


2. The SELECT command.

Enter:

SELECT
INPUT COVERAGES TO BE SELECTED
PLU64 PLU68
INPUT LAND USES TO BE SELECTED FROM PLU64 - PRESENT LAND USE (1964)
*B
INPUT LAND USES TO BE SELECTED FROM PLU68 - PRESENT LAND USE (1968)
B

Response:

*** SUMMARY OF AREAS SELECTED ***
*** SELECTION 1  20/02/75 ***
PLU64 - PRESENT LAND USE (1964)
USE         AREA    PER CENT
B            0.0         0.0
PLU68 - PRESENT LAND USE (1968)
USE         AREA    PER CENT
B        6,372.1       100.0
TOTAL AREA SELECTED IS 6,372
ENTER COMMAND

3. The SELECT command.

Enter:

SELECT
INPUT COVERAGES TO BE SELECTED
PLU64 PLU68
INPUT LAND USES TO BE SELECTED FROM PLU64 - PRESENT LAND USE (1964)
EGHKMOPSTU8L
INPUT LAND USES TO BE SELECTED FROM PLU68 - PRESENT LAND USE (1968)
B

Response:

*** SUMMARY OF AREAS SELECTED ***
*** SELECTION 4  20/02/75 ***
PLU64 - PRESENT LAND USE (1964)
USE         AREA    PER CENT
E          257.8         4.0
G           33.8         0.5
H           61.4         0.9
K        1,906.7        29.9
M           20.4         0.3
O           80.0         1.2
P        2,820.4        44.2
S            0.0         0.0
T          758.0        11.8
U          421.5         6.6
8           11.7         0.1
L            0.0         0.0
PLU68 - PRESENT LAND USE (1968)
USE         AREA    PER CENT
B        6,372.1       100.0
TOTAL AREA SELECTED IS 6372
ENTER COMMAND
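The selection logic shown in examples 2 and 3 can be sketched as below. This is a hedged illustration and not the CGIS code: it assumes single-character value codes (as the response EGHKMOPSTU8L suggests), a simple list-of-dictionaries polygon table, and invented areas. The function names are hypothetical, and the breakdown it prints for a NOT criterion is a design choice of the sketch rather than a copy of the report layout in example 2.

```python
from collections import defaultdict


def parse_values(response):
    """Parse a typed response: '' selects all values, a leading '*' means NOT."""
    response = response.strip()
    if not response:
        return None  # nil response: every value qualifies
    negate = response.startswith("*")
    return negate, set(response.lstrip("*"))


def qualifies(code, criterion):
    """True when a polygon's code satisfies one variable's criterion."""
    if criterion is None:
        return True
    negate, codes = criterion
    return (code in codes) != negate


def select(polygons, responses):
    """Return the total area selected and a per-variable area breakdown."""
    criteria = {var: parse_values(text) for var, text in responses.items()}
    chosen = [p for p in polygons
              if all(qualifies(p[var], crit) for var, crit in criteria.items())]
    total = sum(p["area"] for p in chosen)
    report = {}
    for var in criteria:
        areas = defaultdict(float)
        for p in chosen:
            areas[p[var]] += p["area"]
        report[var] = {code: (area, 100.0 * area / total if total else 0.0)
                       for code, area in sorted(areas.items())}
    return total, report


# Hypothetical overlay polygons; the areas are invented for illustration.
polygons = [
    {"PLU64": "P", "PLU68": "B", "area": 30.0},
    {"PLU64": "K", "PLU68": "B", "area": 10.0},
    {"PLU64": "B", "PLU68": "B", "area": 50.0},
    {"PLU64": "P", "PLU68": "P", "area": 20.0},
]

# 'Not built up in 1964 (*B) and built up (B) in 1968'.
total, report = select(polygons, {"PLU64": "*B", "PLU68": "B"})
print("TOTAL AREA SELECTED IS", total)
for var, rows in report.items():
    for code, (area, pct) in rows.items():
        print(f"{var}  {code}  {area:8.1f}  {pct:5.1f}")
```

Entering the non-built-up codes individually, as in example 3, would select the same polygons as the NOT form; only the level of detail requested in the report differs.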


4. The PLOT command.

Enter:

PLOT
INPUT XMIN AND XMAX IN INCHES
APPROX X SCALE IS 1 : 206,092
APPROX Y SCALE IS 1 : 206,092
IS SHORELINE DESIRED ON THIS PLOT YES/NO
NO

Response:

PLOT 4     APPX SCALE IS 1 : 206,092
SELECTED COVERAGES * PLU64 PLU68

[Plotted map of the selected polygons, without shoreline; not reproduced.]


5. The PLOT command.

Enter:

PLOT
INPUT XMIN AND XMAX IN INCHES
APPROX X SCALE IS 1 : 206,092
APPROX Y SCALE IS 1 : 206,092
IS SHORELINE DESIRED ON THIS PLOT YES/NO
YES

Response:

PLOT 4     APPX SCALE IS 1 : 206,092
SELECTED COVERAGES * PLU64 PLU68

[Plotted map of the selected polygons, with shoreline; not reproduced.]
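The scale reported by PLOT, and the conversion that a DISTANCE measurement implies, can be sketched as follows. This is an assumption-laden illustration, not the CGIS routine: it supposes that cursor positions are available in screen inches and that the scale is the single denominator that fits the requested area on the screen in both directions. The function names and sample dimensions are hypothetical.

```python
import math

INCHES_PER_MILE = 63_360  # 5,280 ft x 12 in


def fit_scale(ground_width_in, ground_height_in, screen_width_in, screen_height_in):
    """Scale denominator that fits the area on the screen, equal in x and y."""
    return max(ground_width_in / screen_width_in,
               ground_height_in / screen_height_in)


def ground_distance_miles(p, q, scale_denominator):
    """Ground distance in miles between two screen points given in inches."""
    screen_inches = math.hypot(q[0] - p[0], q[1] - p[1])
    return screen_inches * scale_denominator / INCHES_PER_MILE


# Hypothetical study area and screen sizes, all in inches.
print("scale 1 :", fit_scale(1_200_000, 900_000, 8.0, 6.0))

# Two hypothetical cursor positions on a plot at the scale shown in the examples.
miles = ground_distance_miles((1.0, 1.0), (4.0, 5.0), 206_092)
print(f"distance: {miles:.2f} miles")
```

Taking the larger of the two ratios in `fit_scale` is what keeps the whole area on the screen while preserving the same scale in both directions, as the PLOT description requires.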
