computer structures: readings and examples

COMPUTER STRUCTURES: READINGS AND EXAMPLES

C. Gordon Bell
611 Washington St. #2506
San Francisco, CA 94111

Computer Structures: Readings and Examples

McGraw-Hill computer science series

RICHARD W. HAMMING, Bell Telephone Laboratories
EDWARD A. FEIGENBAUM, Stanford University

Bell and Newell, Computer Structures: Readings and Examples
Cole, Introduction to Computing
Gear, Computer Organization and Programming
Givone, Introduction to Switching Circuit Theory
Hellerman, Digital Computer System Principles
Kohavi, Switching and Finite Automata Theory
Liu, Introduction to Combinatorial Mathematics
Ralston, Introduction to Computer Science
Rosen, Programming Systems and Languages
Salton, Automatic Information Organization and Retrieval
Watson, Timesharing System Design Concepts
Wegner, Programming Languages, Information Structures, and Machine Organization

Computer Structures: Readings and Examples

C. Gordon Bell
Professor of Computer Science and Electrical Engineering
Carnegie-Mellon University

Allen Newell
University Professor
Carnegie-Mellon University

McGraw-Hill Book Company
New York, St. Louis, San Francisco, Düsseldorf, London, Mexico, Panama, Rio de Janeiro, Singapore, Sydney, Toronto

To Brigham, Laura, Paul

Computer Structures: Readings and Examples

Copyright © 1971 by McGraw-Hill, Inc. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Library of Congress Catalog Card Number 75-109245

07-004357-4

1234567890 HDBP 79876543210

This book was set in News Gothic by Graphic Services, Inc., printed on permanent paper by Halliday Lithograph Corporation, and bound by The Book Press, Inc. The designer was Elliot Epstein; the drawings were done by John Cordes, J. & R. Technical Services, Inc. The editors were Richard Dojny and J. W. Maisel. William P. Weiss supervised production.

Preface

The structures that we call computer systems continue to grow in complexity, in size, and in diversity. This book is linked firmly to the nature of this growth. The book is about the upper levels of computer structure: about instruction sets, which define a computer system at the programming level; and about organizations of processors, memories, switches, input-output devices, controllers, and communication links, which provide the ultimate functioning system.



These levels are just emerging into well-defined systems levels with developed symbolic techniques of analysis and synthesis and accumulated engineering know-how, all expressed in a crystallized representation. These aspects of computer systems have always existed, of course, but only in rudimentary form. The classical four-box picture of a computer (arithmetic unit, memory, input-output, and control) is certainly an effective organization of components to process information. But multiple processors, hierarchies of memories, and remote communications force the top level of organization into a distinct level, requiring analysis and rational design. Similarly, the 25 instructions of the IBM 701 computer (developed around 1953) are certainly an instruction set, indeed one worthy of study. But processors with dozens of registers and almost unlimited logical circuitry again force the instruction set to become a topic of rational analysis and design.

This book is tied to the emergence of these upper levels of organization: eight years ago (a computer engineer's half dozen) would have been too early to write this book; eight years hence would be too late. Eight years ago the diversity and complexity of computer structures was not sufficient to justify the attention this book provides. This book would have been too thin. Eight years hence textbooks will exist that treat these levels systematically. This book will then appear too descriptive.

But right now, as these aspects of computer structure are emerging, and with systematic treatment still precluded, there is a need to make available material on these levels for systematic reference and study. Our choice has been to present a large set of examples, which illustrate the various design options and structural possibilities, both in instruction sets and in overall configurations. These examples are descriptions of actual computer systems, taken from the technical literature or from technical reports and manuals. Descriptions of actual systems are to be much preferred over idealized abstractions. The latter can reflect the real issues only after successful systematization.

Not only are the chapters about actual computers, they present much detail. The complexity of computers resides in part in their size and the multiplicity of their parts, e.g., to their having 200 instructions rather than 20, or having to service 50 Teletypes rather than 2. It seems essential to describe computer systems in their entirety, rather than via simplified vignettes. Again, this view stems from the existing state of the art. Eight years hence, it will not necessarily hold.

We fall from grace on all the above principles, providing occasionally descriptions of paper machines and partial descriptions of partial systems. But our feeling that detail and reality is important remains. This is why this book is so large; and fit for study rather than for reading.

The book presents a large number of examples. Variation needs to be presented along all the major dimensions that instruction sets and system configurations currently exhibit. Thus, as a glance at the table of contents will show, the examples in the book are hardly picked at random. The variation is empirical. It exists in the population of computers that have actually been built. This characteristic of the book stems, again, from our assessment that the upper levels of computer structure are still in an essentially descriptive and empirical state of development. However, as the book documents, ample variation occurs in existing computer systems. The evidence presented here should finally lay to rest the remarks, once echoed almost universally and still heard occasionally, that nothing has happened in computer structure since the von Neumann machine.

Dimensions of variations imply a framework, for dimensions do not by themselves arise from a population of systems. They require the aid, witting or not, of a conceptual framework. As the first three chapters of the book testify, we have most wittingly created a framework, and have had no hesitation in imposing it throughout the book. However, in keeping with our view already expressed, this framework is primarily descriptive. It has come inductively from the common lore, from our own experiences as designers, and from the effort of putting this book together. This attempt at systematization has given rise to two notations: one for instruction sets (ISP) and the other for configurations of major components (PMS). But, again, these notations are primarily descriptive.

So much for what the book actually tries to provide. What are our goals for it? The first is educational. There are three distinct populations of professionals whose education is to be served by this book: the computer engineer, who will design computer systems; the computer scientist, who is concerned primarily with the programming level and with various abstract views of information processing; and the electrical engineer, who sees computer systems simply as one part of a larger physical technology.

For all of these, we see no sense in talking of elementary versus advanced treatments of computer structure. There is surely "less" versus "more," but consistent with our view of the current art, no vertical stratification of education in these aspects of instruction sets and device configurations is possible. It is sufficient, in the present day, for computer systems to become accepted as worthy of study in their own right.

This book will hardly make easy fare for undergraduate students who do not have an instructor somewhat skilled in the art that is being taught. However, this book is meant for study. A good instructor can, we feel, develop an excellent course (or part thereof) in computer structures, taking this book as the basic material. In addition to the three introductory chapters, Chapter 5 (on the DEC PDP-8), by providing a complete example of a computer system with descriptions at all systems levels, helps to tie the aspects of computer structure discussed in this book to the view students will pick up from a traditional course in logical design.

It goes without saying that for the computer engineer and designer, the material of this book should be fully assimilated. In designing a new computer system, or subsystem thereof, he should be familiar with all that this book has to offer: the design choices, the structural variations possible, the experiments of the past and the design needs they attempted to satisfy. Given that systematic analysis does not yet exist, there is no substitute for extensive, critical understanding of the existing examples of designed systems. We assume the student of computer engineering comes to this book with a working knowledge of logical design. He should find it possible to realize many of the systems described in this book at the next lower levels of logic structure.

For the computer scientist, the levels of computer structure discussed in this book constitute a substantial part of what he should know about the physical devices that underlie his science. As we pass downward from these levels to lower ones (to register-transfer systems, sequential logic circuits, combinatory circuits, continuous circuits, and on down), the relevance of each level gradually fades. The levels of this book, along with the register-transfer level, constitute the main aspects of computer structure that the computer scientist must understand. It does not matter that they are, as yet, basically empirical and descriptive. The computer scientist undoubtedly will not be able to carry through the design of the systems described in this book in terms of the lower logic levels, but this is not necessary for an appropriate grasp of these upper levels of computer structure. Indeed, this is what it means for distinct systems levels to exist.

For the electrical engineer, this book undoubtedly presents more examples than he cares to know (or needs to). But an appropriate sampling, plus the overview presented in the first three chapters, is appropriate to give him some insight into the elaborate growth that has occurred on top of the basic digital technology created within electrical engineering.

The student of systems engineering may also find the material presented here useful, as an example of a class of complex systems which has evolved several distinct levels of representation. Again, the book undoubtedly presents too massive a dose of detail for him, but the overview in the first chapters, plus a sampling throughout the space of computer systems, should prove highly instructive.

We have goals for the book in addition to the educational ones. We think the book can serve as a useful reference for the practicing computer engineer. The time is past when every computer engineer knows about all computer systems because he has lived through all of computer history. That position is now reserved for those of us who are past forty (and still active). For the rest, a source book that provides the cumulated design experience of the field is a useful substitute, especially so if it contains enough detail so that a designer can reasonably evaluate the actual computer systems that embody a particular design alternative.

Behind the goal of the book as a guide for the practicing computer designer lies the feeling that the field of computer engineering needs to develop a sense of history and of looking to the past for guidance. The fantastic advance in basic logic technology (in speed, cost, and reliability) makes each day seem an absolutely new one. But, of course, it is not. Many alternative designs have been tried out in past systems, in ways relevant to current design. Thus, we have the goal of saving some of the past in a form accessible to the future needs of computer design. This goal is mixed with a certain archival feeling. Many of the systems in this book have never been documented, other than in manuals and various elementary how-to programming books.

A final goal comes from our feelings as computer scientists that the variety of computer systems is a phenomenon worthy of study in its own right. This book carries, therefore, an invitation to taxonomy: to asking how to classify the diversity of forms of computer systems that are coming into existence. Taxonomic endeavors usually take place in a field of natural systems, particularly biological systems. It may seem strange that a domain of artificial systems calls for taxonomic activity. But the demand for empirical classification exists whenever there is a population of significant size and rich structure. Rudimentary classification efforts have occurred for many populations of artifacts: for ships, for aircraft, for houses. This book should amply confirm that computer systems are complex and diverse enough (and undergoing enough continual proliferation and evolution) to command significant taxonomic endeavor.

Enough is said in the first two chapters about the new notations introduced in the book, so that nothing substantive need be added here. We apologize for inflicting new notation on the reader. We feel that good notations are really quite important for the aspects of computer structure described in this book. Much would be gained by the whole field of computers (by manufacturers, sellers, buyers, users, students, programmers, engineers, planners, and scientists) if relatively uniform notations came into common use. Although we have no illusions about the perfection of the notations we have introduced, we would be most happy if they cause a rise in concern for standard notations and nomenclature.

A large number of distinct systems are described in substantial detail. We have redescribed many of the systems in the common notation introduced in the book. The accuracy of all these descriptions is a major problem. Even where the papers are reproduced from the literature, this problem of accuracy remains, although then it is not ours alone. Even though we have taken pains to obtain accurate information on the systems and to portray them faithfully in our various descriptions and figures, there is no way we can be responsible for their ultimate accuracy. The PMS and ISP figures, in particular, cannot be guaranteed to be accurate representations of the systems they purport to describe. Ultimately, one would like to have simulation languages for such notations and to verify (up to the usual criteria of a debugged program) that a system given by, say, an ISP description, simulates the behavior of the target machine. But that day is still far off.

Our most fundamental acknowledgment is to the contributors to this volume, not only for the articles they have written, but for the computers they have designed and built, thereby creating a population of fascinating artifacts worthy of study. An additional reason for reprinting their articles rather than simply describing their computer systems is the importance of having available the views of the designers themselves about the nature of their systems.

The research on the basic ideas underlying the notations was supported by Advanced Research Projects Agency of the Office of the Secretary of Defense (F 44620-67-C-0058) and is monitored by the Air Force Office of Scientific Research.

We would like to extend an acknowledgment to the organizations that have produced all of these computers, oftentimes, it would seem, in defiance of the laws of economics. Perhaps, as the old saw has it, a computer manufacturer is simply a computer's way of breeding another computer. This might account for the tenacity shown by computer manufacturers in spawning the vast numbers of computer systems that provide our field of study. Within this general acknowledgment, we would like to extend a very specific one to all the people in these organizations who helped make information available to us: the manuals, photographs, dates, etc., that this book has demanded in such great quantity.

We are indebted to the students who have read and criticized the various PMS and ISP figures: Richard Dove, Wayne Kohl, Michael Knudsen, Paul Mobus, and Charles Pfferkorn. Ken Fitzgerald and Anita Jones of IBM were kind enough to read the introduction to the IBM System/360. Professor David L. Parnas initially reviewed the text and contents, thus providing many helpful suggestions. Our other colleagues, especially Professors Angel Jordan, Alan Perlis, Herbert Simon and Everard M. Williams deserve a special thanks for their patience and encouragement.

Finally, we would like to thank those who were a part of the machine that assembled the book: the editors of McGraw-Hill; Mrs. Mary Ross who assembled the bibliography, figures, and contributor articles; Mrs. Mildred Sisko who typed the PMS and ISP Appendix; and especially Mrs. Dorothy Josephson who not only typed nearly all drafts of the book, but also the final PMS figures, and ISP Appendices.

C. Gordon Bell
Allen Newell

Acknowledgments

R. H. Allmark and J. R. Lucking: Design of an Arithmetic Unit Incorporating a Nesting Store, Proceedings of the International Federation of Information Processing Congress 1962, pp. 694-698, North Holland Publishing Co., Amsterdam, Holland, by permission from American Federation of Information Processing Societies (AFIPS), Spartan Books, Washington, D.C.

R. L. Alonso, H. Blair-Smith, and A. L. Hopkins: Some Aspects of the Logical Design of a Control Computer, A Case Study, Transactions on Electronic Computers, vol. EC-12, no. 6, pp. 687-697, December, 1963, by permission of the authors and the Institute of Electrical and Electronics Engineers (IEEE).

James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and Robert J. Williams: D825: A Multiple Computer System for Command and Control, Proceedings of the AFIPS Fall Joint Computer Conference, 1962, vol. 22, pp. 86-96, by permission from AFIPS, Spartan Books, Washington, D.C.

The authors acknowledge: The authors wish to acknowledge the outstanding efforts of their many colleagues at Burroughs Laboratories who have contributed so well and in so many ways to all stages of D825 design, development, fabrication, and programming. It would be impossible to cite all of these efforts. The authors also wish to acknowledge the contributions of Mr. William R. Slack and Mr. William W. Carver, also of Burroughs Laboratories. Mr. Slack has been closely associated with the D825 from its original conception to its implementation in hardware and software. Mr. Carver made important contributions to the writing and editing of this paper.

George H. Barnes, Richard M. Brown, Maso Kato, David J. Kuck, Daniel L. Slotnick, and Richard A. Stokes: The ILLIAC IV Computer, Transactions on Computers, vol. C-17, no. 8, pp. 746-757, August 1968, by permission of the authors and the IEEE.

The authors acknowledge: This work was supported in part by the Department of Computer Science, University of Illinois, Urbana, Illinois, and in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center, Griffiss Air Force Base, Rome, New York, under Contract USAF 30(602)4144. The authors are pleased to acknowledge their indebtedness to the group at the Westinghouse Electric Corporation that initiated the parallel computer effort. The work of W. C. Borck, A. B. Carroll, J. R. Hudson, W. H. Leonard, R. C. McReynolds, and G. Shapiro formed the basis for the subsequent efforts. Of particular importance is the work of J. G. Gregory in tuning the conceptual design to the real world of technology.

Theodore R. Bashkow, Azra Sasson, and Arnold Kronfeld: System Design of a FORTRAN Machine, Transactions on Electronic Computers, vol. EC-16, no. 4, pp. 485-499, August 1967, by permission of the authors and the IEEE.

The authors acknowledge: This research is supported by the Air Force Office of Scientific Research Contract AF19(628)-2798.

G. A. Blaauw and F. P. Brooks, Jr.: The Structure of System/360, Part I: Outline of the Logical Structure, IBM Systems Journal, vol. 3, no. 2, pp. 119-135, 1964, by permission from the IBM Systems Journal.

Erich Bloch: The Engineering Design of the Stretch Computer, Proceedings of the Eastern Joint Computer Conference, 1959, pp. 48-58, by permission of the author and the Institute of Electrical and Electronics Engineers.

The author acknowledges: The efforts and contributions of many people have gone into the engineering design of the Stretch computer. To mention all would be impossible. However, the following individuals and their groups were responsible for the units indicated: Mr. R. T. Blosk for the Instruction Unit, Mr. J. F. Dirac for the Look-ahead Units, Messrs. J. A. Hipp and O. L. MacSorley for the Arithmetic Units, and Mr. L. O. Ulfsparre for the Memory Bus. The Systems Development was under the guidance of Messrs. S. W. Dunwell and R. E. Merwin.

Arthur W. Burks, Herman H. Goldstine, and John von Neumann: Preliminary Discussion of the Logical Design of an Electronic Computing Instrument, "Collected Works of John von Neumann," vol. V, pp. 34-79, General Editor: A. H. Taub, Macmillan Company, by permission from Pergamon Press, New York, 1963.

The authors acknowledge: This report has been prepared in accordance with the terms of Contract W-36-034-ORD-7481 between the Research and Development Service, Ordnance Department, U.S. Army and the Institute for Advanced Study. The authors wish to express their thanks to Dr. John Tukey, of Princeton University, for many valuable discussions and suggestions.

John W. Carr III: UNIVAC Scientific (1103A) Instruction Logic, pp. 77-83; IBM 650 Instruction Logic, pp. 93-98; Instruction Logic of the Soviet Strela (Arrow), pp. 111-115; Instruction Logic of the MIDAC, pp. 115-121, chap. 2, Programming and Coding, "Handbook of Automation, Computation, and Control," vol. 2, edited by Eugene M. Grabbe, Simon Ramo, and Dean Wooldridge, Copyright © 1959 John Wiley & Sons, Inc., New York, reprinted by permission.

J. Presper Eckert, Jr., James R. Weiner, H. Frazer Welsh, and Herbert F. Mitchell: The UNIVAC System, American Institute of Electrical Engineers-Institute of Radio Engineers Conference, pp. 6-16, December, 1951, by permission of the authors and the IEEE.

The authors acknowledge: The UNIVAC System has been an over-all company project and hundreds of people have participated. It is, therefore, difficult to acknowledge the contributions of individuals. However, special mention must be made of the contributions of Mr. H. Lukoff, Mr. E. I. Blumenthal, Mr. L. D. Wilson, and Mr. B. D. Chapline, Jr. To the Census Bureau a great debt of gratitude is owed for their continuous support of the project.

W. S. Elliott, C. E. Owen, C. H. Devonald, and B. G. Maudsley: The Design Philosophy of Pegasus, A Quantity-production Computer, Proceedings of the Institution of Electrical Engineers, London, Pt. B, vol. 103, Supplement 2, pp. 188-196, 1956, by permission of the Institution of Electrical Engineers.

The authors acknowledge: The authors would like to acknowledge the contributions that Mr. C. Strachey and Dr. D. B. Gillies, of the National Research Development Corporation, and Dr. J. M. Bennett and Mr. T. G. H. Braunholtz, of Ferranti, Ltd., made to the logical design of Pegasus: particular thanks are due to Mr. C. Strachey for originating the order code. They also thank Ferranti, Ltd., and the National Research Development Corporation for permission to publish the paper.

R. R. Everett: The Whirlwind I Computer, Review of Electronic Digital Computers, Joint American Institute of Electrical Engineers-Institute of Radio Engineers Conference, pp. 70-74, February, 1952, by permission of the author and the IEEE.

Thomas W. Kampe: The Design of a General-purpose Microprogram-controlled Computer with Elementary Structure, Institute of Radio Engineers, Transactions on Electronic Computers, vol. EC-9, no. 2, pp. 208-213, June, 1960, by permission of the author and the IEEE.

The author acknowledges: The author wishes to thank his co-designers, R. Compton and T. Hayata, for their assistance during the design of the SD-2 computer and for their suggestions on this paper.

T. Kilburn, D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner: One-level Storage System, Institute of Radio Engineers Transactions, vol. EC-11, no. 2, pp. 223-235, April, 1962, by permission of the authors and the IEEE.

The authors acknowledge: The authors gratefully acknowledge the contributions made to this work by all members of the Atlas computer team at both Manchester University and Ferranti Ltd.

B. W. Lampson, W. W. Lichtenberger, and M. W. Pirtle: A User Machine in a Time-sharing System, Proceedings of the Institute of Electrical and Electronics Engineers, vol. 54, no. 12, pp. 1766-1774, December, 1966, by permission of the authors and the IEEE.

The authors acknowledge: The work for this paper was supported in part by the Advanced Research Projects Agency, Department of Defense, Contract SD-185. The software portion of the system was designed and written in part by L. P. Deutsch, who is entitled to equal credit with the authors for the ideas in this paper. L. Barnes also contributed significantly to the final result.

M. Lehman: A Survey of Problems and Preliminary Results Concerning Parallel Processing and Parallel Processors, Proceedings of the Institute of Electrical and Electronics Engineers, vol. 54, no. 12, pp. 1889-1901, December, 1966, by permission of the author and the IEEE.

The author acknowledges: This paper reports on a group activity in which each individual member had his own specific assignments and in addition participated in regular discussions on all aspects of the project. Credit is therefore due to all members of the group which, during the period covered by the contents of this paper, included G. C. Driscoll, M. Lee, A. P. J. Mullery, J. L. Rosenfeld, H. P. Schlaeppi, and M. Weitzman. I should also like to express my sincere thanks to Dr. H. A. Ernst for the constructive criticism, advice, and encouragement offered during preparation of this paper. My sincere thanks are also due to members of the Graphics and Design Department at the Thomas J. Watson Research Center, and in particular to G. Massi and Mrs. M. J. LaMarre for their preparation of the charts and figures. Last, my thanks to Mrs. J. Galto for her infinite patience in the repeated retypings of the manuscript.

A. L. Leiner, W. A. Notz, J. L. Smith, and A. Weinberger: PILOT, The NBS Multicomputer System, Proceedings of the Eastern Joint Computer Conference, 1958, pp. 71-75, by permission of the authors and the IEEE.

The authors acknowledge: The authors wish to acknowledge the valuable contributions of their colleagues H. Loberman and W. Youden, who helped to develop the logical design and programming procedures for this system.

William Lonergan and Paul King: Design of the B 5000 System, Datamation, vol. 7, no. 5, pp. 28-32, May, 1961, by permission of, published and Copyrighted © 1961 by F. D. Thompson Publications, Inc., Greenwich, Conn.

Richard E. Monnier, Thomas E. Osborne, and David S. Cochran: The HP Model 9100A Computing Calculator. This chapter is a compilation of three articles: A New Electronic Calculator with Computerlike Capabilities, by Richard E. Monnier, pp. 3-9; Hardware Design of the Model 9100A Calculator, by Thomas E. Osborne, pp. 10-13; and Internal Programming of the 9100A Calculator, by David S. Cochran, pp. 14-16, which appeared in the Hewlett-Packard Journal, volume 20, no. 1, September, 1968, by permission of the Hewlett-Packard Journal.

R. E. Porter: The RW-400: A New Polymorphic Data System, Datamation, vol. 6, no. 1, pp. 8-14, January/February, 1960, by permission of, published and Copyrighted © 1960 by F. D. Thompson Publications, Inc., Greenwich, Conn.

J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis: A Command Structure for Complex Information Processing, Western Joint Computer Conference 1958, by permission of the authors and the IEEE.

W. Y. Stevens: The Structure of System/360, Part II: System Implementations, IBM Systems Journal, vol. 3, no. 2, pp. 136-143, 1964, by permission from the IBM Systems Journal.

James E. Thornton: Parallel Operation in the Control Data 6600, Proceedings of the AFIPS Fall Joint Computer Conference, Pt. II, vol. 26, pp. 33-40, 1964, by permission from AFIPS, Spartan Books, Washington, D.C.

W. L. van der Poel: ZEBRA, A Simple Binary Computer, Proceedings of an International Conference on Information Processing, Paris, UNESCO House, June, 1959, pp. 361-365, by permission from AFIPS, Spartan Books, Washington, D.C.

Helmut Weber: A Microprogrammed Implementation of EULER on IBM System/360 Model 30, Communications of the Association for Computing Machinery, vol. 10, no. 9, pp. 549-558, September, 1967, Copyright © 1967 Association for Computing Machinery, Inc., by permission of the author and the Association for Computing Machinery, Inc.

The author acknowledges: I wish to thank Jack Carman, who wrote the I/O Control Program and the Operating System linkage for the EULER system and Miss Sheila Morrison who helped prepare the figures. I am also grateful for the valuable criticism offered by the referee, W. C. McGee, as well as Professor N. Wirth and E. Satterthwaite.

M. V. Wilkes and J. B. Stringer: Micro-programming and the Design of the Control Circuits in an Electronic Digital Computer, Proceedings of the Cambridge Philosophical Society, Pt. 2, vol. 49, pp. 230-238, April, 1953, by permission of the authors and the Cambridge Philosophical Society, Cambridge, England.

The authors acknowledge: The authors wish to express their thanks to Mr. A. L. Freedman and Mr. W. Renwick for assisting them in clarifying a number of points, and to Professor D. R. Hartree, F.R.S., for his generous help with the preparation of the paper.

J. H. Wilkinson: The Pilot ACE, by permission from Automatic Digital Computation, pp. 5-14, National Physical Laboratory, Teddington, England, March 25-28, 1953.

Joseph E. Wirsching: NOVA: A List-oriented Computer, Datamation, vol. 12, no. 12, pp. 41-43, December, 1966, by permission of, published and Copyrighted © 1966 by F. D. Thompson Publications, Inc., Greenwich, Conn.

The author acknowledges: This work was performed under the auspices of the U.S. Atomic Energy Commission.

Several organizations have contributed to the writing and production of this book by giving us permission to use material from their publications. In many cases they have also supplied us with original copies. We have credited their text, tables, pictures, and diagrams when they are used. This cooperation has been invaluable. The specific organizations are:

Adams's Associates: Computer Characteristics Quarterly. (Adams, 1966-1968)
Computers and Automation magazine
Control Data Corporation, 8100 34th Avenue South, Minneapolis, Minnesota
Datamation magazine
Digital Equipment Corporation, 146 Main Street, Maynard, Massachusetts
Hewlett-Packard Company, 1501 Page Mill Road, Palo Alto, California
International Business Machines Corporation, White Plains and Poughkeepsie, New York
Massachusetts Institute of Technology, Cambridge, Massachusetts
National Science Foundation
Olivetti Underwood Corporation, 1 Park Avenue, New York, New York
Scientific Data Systems, 1649 Seventeenth Street, Santa Monica, California

Contributors

R. H. Allmark
R. L. Alonso
James P. Anderson
George H. Barnes
Theodore R. Bashkow
G. A. Blaauw
H. Blair-Smith
Erich Bloch
F. P. Brooks, Jr.
Richard M. Brown
Arthur W. Burks
John W. Carr III
David S. Cochran
C. H. Devonald
J. Presper Eckert, Jr.
D. B. G. Edwards
W. S. Elliott
T. O. Ellis
R. R. Everett
Herman H. Goldstine
Samuel A. Hoffman
A. L. Hopkins
Thomas W. Kampe
Maso Kato
T. Kilburn
Paul King
Arnold Kronfeld
David J. Kuck
B. W. Lampson
M. J. Lanigan
M. Lehman
A. L. Leiner
W. W. Lichtenberger
William Lonergan
J. R. Lucking
B. G. Maudsley
Herbert F. Mitchell
Richard E. Monnier
W. A. Notz
Thomas E. Osborne
C. E. Owen
M. W. Pirtle
R. E. Porter
Azra Sasson
J. C. Shaw
Joseph Shifman
H. A. Simon
Daniel L. Slotnick
J. L. Smith
W. Y. Stevens
Richard A. Stokes
J. B. Stringer
F. H. Sumner
James E. Thornton
W. L. van der Poel
John von Neumann
Helmut Weber
A. Weinberger
James R. Weiner
H. Frazer Welsh
M. V. Wilkes
J. H. Wilkinson
Robert J. Williams
Joseph E. Wirsching

Contents

Part 1

Preface

V

Contributors

xiii

The Structure Chapter

1

Chapter 2

of

The

Acknowledgments

Computers

Introduction

PMS

The

Chapter 3

and

ISP

Chapter 4

15

Processors with One Address per Instruction

89

Chapter 16 Chapter 17

Preliminary Discussion of the LogiDesign of an Electronic Com-

cal

Instrument

puting

Herman H. John von Neumann The DEC PDP-8 The Whirlwind

Chapter 5 Chapter 6

— Arthur

92

IBM

The

Chapter 7

Some Aspects

Chapter 9

Hopkins The SDS 910-9300

Stretch

J.

the

— Computer Erich Bloch

Series

The Design Philosophy

of Pegasus,

Structure

A Quantity-production Computer — W. S. Elliott, C. E. Owen, C. H. The

a "virtual" contents,

Structure I

— Outline

of of

Brooks,

171

Chapter 10 Chapter 39

An

— G.

A.

Blaauw and

8-bit-character

Parallel

F. P.

Jr.

Operation

Data 6600—James

System/360, the

Computer in E.

the

184

Control

Thornton

Logical

which means that because many of the computers are relevant

type to indicate a nonsequential mapping for computers placed out of "physical" order. virtual order.

Storage Kilburn, D. B. G. Edwards, M.

T.

Processors with a General-register State

Part

is

Chapter 34

Lanigan, and F. H. Sumner The Engineering Design of

146

L.

157

— System

One-level

of the Logical Design

Devonald, and B. G. Maudsley

Chapter 43

Mitchell

Chapter 23

a

and A.

Section 2

Jr., James R. Weiner, H. Frazer Welsh, and Herbert F.

137



Chapter 42

The UNIVAC System—J. Presper Eckert,

1800

Control Computer: A Case Study R. L. Alonso, H. Blair-Smith,

of

Chapter 8

— Computer

R. R. Everett

Chapter 33

Chapter 41 120

I

The LGP-30 and LGP-21

IBM 650 Instruction Logic—John W. Carr III The IBM 7094 I, II

W.

Goldstine, and

Burks,

This

37

Instruction-set Processor: Main-line computers

Section 1

1

The Computer Space

Descriptive

Systems

Part 2

3

to more than one The reader might read

part and section, we have used italic (reference) the book according to the

xvi

Contents

Part 3

The

Instruction-set Processor Level: Variations in the Processor

Section 1

Processors with Greater than One Address per Instruction

ACE—

Chapter 11

The

Chapter 12

ZEBRA, A Simple Binary Computer — W. L. van der Poel UNIVAC Scientific (1103A) Instruction Logic— John W. Carr III The RW-400: A New Polymorphic

Chapter 13 Chapter 38

Pilot

Data System

Section 2

Chapter 19

— R.

H. Wilkinson

J.

191

Chapter 14

193

Chapter 15

Memory

Chapter 9

The LGP-30 and LGP-21

Chapter 11 Chapter 8

The

H.

ACE—

J.

Frazer

The Design Philosophy of Pegasus,

W. 217

James

R.

Elliott,

C.

E.

Owen,

C.

H.

IBM

Chapter 26

John W. Carr III NOVA: A List-oriented Computer—

Presper

650

Instruction

Logic220

Joseph E. Wirsching

Weiner,

and Herbert

Welsh,

S.

A

— Computer

Chapter 17

H. Wilkinson

UNIVAC System—J. Jr.,

213

Devonald, and B. G. Maudsley

van der Poel

Eckert,

Soviet

(Arrow)— John W. Carr III

Quantity-production

Chapter 16

The

the

of

216

ZEBRA, A Simple Binary Computer

Pilot

209

III

Logic

E. Porter

The OLIVETTI Programma 101 Desk

L.

Instruction Strela

Processors Constrained by a Cyclic, Primary

— W.

Carr

205

Calculator

Chapter 12

W.

John

200

MIDAC —

Instruction Logic of the

F.

Mitchell

Section 3

Chapter 18

Section 4

Chapter 19

Processors for Variable-length-string Data

The IBM 1401

225

Section 5

Chapter 21

The OLIVETTI

Programma

Chapter 36

An

8-bit-character

Computer

The

HP

Thomas

237

Processors with Stack Memories (Zero Addresses per Instruction)

Design of an Arithmetic Unit Incorporating a Nesting Store R. H.

for

—A

Multiple-computer System

Command and Control—James P.

Anderson,

Samuel

A.

Hoffman,

E.

Monnier,

and David

S.

243

257 Joseph Shifman, Williams



D825

E. Osborne,

Cochran

Model 9100A Computing

R. Lucking J. Design of the B 5000 SystemWilliam Lonergan and Paul King

235

Calculator — Richard

101

Allmark and

Chapter 22

Chapter 10

Desk Calculator Computers: Keyboard Programmable Processors with Small Memories

Desk Calculator Chapter 20

224

262

Chapter 30

A Command

and

Structure for



Information Processing A. Newell, H. A. Simon,

267 Chapter 32

Robert

/.

J.

Complex C.

T.

Shaw,

O. Ellis

Microprogrammed Implementation of Model

EULER on IBM System/ 360 30— Helmut Weber

Contents

Section 6

Chapter 23

Processors with Multiprogramming Ability

274

One-level Storage System— T. KilD. B. G. Edwards, M. J.

Chapter 24

burn,

Lanigan, and F. H. Sumner

Chapter 21

Part 4

The

of

Design

B 5000 System —

the

William Lonergan and Paul King User Machine in a Time-sharing

A

— B.

W. Lampson, W. W. Lichtenberger, and M. W. Pirtle

276

System

291

Instruction-set Processor Level: Special-function Processors

Section

1

Chapter 41 Chapter 43

Processors to Control Terminals and Secondary Memories (Input-output Processors)

The

IBM

The Part

7094

I,

— Outline

Brooks,

Section 2

Chapter 26

of

System/360,

of

the

I

Structure/G. A.

Blaauw and

NOVA: A

List-oriented

ILLIAC

George

Section 3

Chapter 28

F.

H.

IV

Barnes,

Stokes

ComputerRichard

334

Chapter 20

an Elec-

of a

Chapter 32



Chapter 30

Structure for



Design of a FORTRAN Machine Theodore R. Bashkow,

System



E.

Monnier,

Osborne, and David

S.

A Microprogrammed Implementation of EULER on IBM System/ 360 Model 30— Helmut Weber

348 Azra Sasson, and Arnold Kronfeld

Complex

Information Processing J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis

Chapter 31

— Richard

E.

341

Processors Based on a Programming Language

A Command

Model 9100A Computing

Cochran

General-purpose

Thomas W. Kampe

HP

Thomas 335

B. Stringer

The

Calculator

— Computer M. V. Wilkes and

The Design

320

M.

Microprogram-controlled Computer with Structure Elementary

Section 4

305

Brown, Maso Kato, David J. Kuck, Daniel L. Slotnick, and Richard E.

316

Microprogramming and the Design

J.

338 Display Processor

325

Computer-

of the Control Circuits in

Chapter 29

DEC

P.

Processors Defined by a Microprogram

tronic

IBM 1800

Jr.

Processors for Array Data

The

The

The

Logical

Joseph E. Wirsching

Chapter 27

Chapter 33 Chapter 25

II

Structure

303

Chapter 32 349

363

A Microprogrammed Implementation of EULER on IBM System/360 Model

30— Helmut Weber

382

xvii

xviii

Contents

Part 5

Contents

PMS

Appendix

and ISP Notations

General Conventions

607

607 608

8 Attributes

2 Metanotation

608

9 Null

3 Basic Syntax

609

1

4

Basic Semantics

Commands: Assignments, AbbreviaForms

tion, Variables,

PMS

609

612

Symbol

and

Optional

Ex-

pression

613

10

Names

613

11

Numbers

614

5 Indefinite Expressions

610

12 Quantities, Dimensions, and Units

615

6 Lists and Sets

611

13 Boolean and Relations

615

7 Definite Expressions

611

Conventions

615

616

7 Switch

2 General Units

616

8 Control (K)

624

3 Information Units

616

9 Transducer (T)

625

4 Component

617

5 Link (L)

1

6

Dimensions

Memory

ISP Conventions

(M)

623

(S)

10 Data-operations (D)

626

619

11 Processor (P)

626

620

12

628

Computer

(C)

628

Data-types

629

3 Operations

632

2 Instruction

631

4 Processors

635

1

Bibliography

638

Name

653

Index

Machine and Organization Index

656

Subject Index

661

xix

Part 1

The structure

of

computers

1

Chapter

This book presents many examples of computer systems. It presents them in enough detail so that meaningful engineering study and analysis are possible. Most of these examples are presented by using the original descriptions of them in the technical literature. Others have been redescribed by us, especially where the original descriptions existed only in technical manuals. In both cases there are considerable discussion and analysis of the computer structures: what problems they were intended to solve, what solutions were adopted, and how these solutions have fared. Yet the emphasis has remained on detailed descriptions precise enough so that the systems themselves are available for independent study.

Why should one want to produce such a book? Collections of reprintings from the technical literature are common in many fields of science and engineering, e.g., "Programming Systems and Languages" [Rosen, 1967]. We have departed from this traditional exercise in two ways, both of which seem important to us. First, we have presented substantial amounts of detail: in effect, block diagrams of computer structures and the equivalents of programming manuals. These constitute neither good reading nor a way of communicating the "essential ideas" in the field. Second, we have introduced a system of notation and have used it not only in the parts we ourselves have written but also to provide additional (sometimes redundant) descriptions of computer systems in the reprinted articles. Why should there be a book like this? The reasons are several and require some background discussion.

Computer systems

Computer systems are one example of man's more complex artificial systems.¹ They have existed as successful engineering products long enough to undergo radical evolution and to give rise to a number of basic, unique technologies. They are sufficiently complex that they have given rise to a science, that is, to a continuing, institutionalized endeavor to understand what sort of beast has been brought forth.²

¹ We need not argue that they are the most complex system. That view is myopic. Setting aside quasi-natural systems, such as cities and economies, it is still the case that a modern aircraft carrier is more complex than a modern computer by any reasonable measure.

² Here uniqueness can be claimed, perhaps, since few other artifactual systems (again, excluding the quasi-natural ones) provide new phenomena that require sustained scientific investigation to understand them. There certainly is no science of aircraft carriers. But there is a computer science.

Our fundamental interest is in the development of this science and technology of computers (one of us also likes to build computers). To understand why this particular book seems to us to be the right way to push this development at this particular time requires characterizing the current state of computer-systems technology.

A computer system is complex in several ways. Figure 1 shows the most important. There are at least four levels of system description, possibly five, that can be used for a computer. These are not alternative descriptions in the sense that anything said one way can be said another. On the contrary, each level arises from abstraction of the levels below it. Each does a job that the lower levels could not perform because of the unnecessary detail they would be forced to carry around.

A system (at any level) is characterized by a set of components, of which certain properties are posited, and a set of ways of combining components to produce systems. When formalized appropriately, the behavior of the systems is determined by the behavior of its components and the specific modes of combination used.

Structures: Network/N, computer/C
Components: Processors/P, memories/M, switches/S, controls/K, transducers/T, data operators/D, links/L

Structure: Programs, subprograms
Components: State (memory cells), instructions, operators, controls, interpreter

Circuits: Arithmetic unit
Components: Registers, transfers, controls, data operators (+, -, etc.)

Circuits: Counters, controls, sequential transducer, function generator, register arrays
Components: Flip-flops: reset-set/RS, JK, delay/D, toggle/T; latch, delay, one shot

Circuits: Encoders, decoders, transfer arrays, data ops, selectors, distributors, iterative networks
Components: AND, OR, NOT, NAND, NOR

Circuits: Amplifiers, delays, attenuators, multivibrators, clocks, gates, differentiator
Active components: Relays, vacuum tubes, transistors
Passive components: Resistor/R, capacitor/C, inductor/L, diode, delay lines

Fig. 1. Hierarchy of levels: computer structure.

states, inputs, outputs


Elementary circuit theory is an almost prototypic example. The components are R's, L's, C's, and voltage sources. The mode of combination is to run wires between the terminals of components, which corresponds to an identification of current and voltage at these terminals. The algebraic and differential equations of circuit theory provide the means whereby the behavior of a circuit can be computed from the properties of its components and the way the circuit is constructed.
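As a small illustration of computing circuit behavior from component properties, the following sketch takes one resistor, one capacitor, and a constant voltage source wired in series, and integrates the resulting differential equation numerically. The component values, the function names, and the choice of forward Euler integration are all assumptions made for this example; they do not come from the text.

```python
import math


def rc_step_response(v_source, resistance, capacitance, t_end, dt):
    """Integrate C * dv/dt = (v_source - v) / R with forward Euler steps.

    Returns the capacitor voltage at time t_end.
    """
    v = 0.0  # assumption: the capacitor starts fully discharged
    steps = int(round(t_end / dt))
    for _ in range(steps):
        current = (v_source - v) / resistance  # Ohm's law across the resistor
        v += dt * current / capacitance        # capacitor law: i = C * dv/dt
    return v


def rc_closed_form(v_source, resistance, capacitance, t):
    """Analytic solution of the same series RC circuit."""
    return v_source * (1.0 - math.exp(-t / (resistance * capacitance)))


# Hypothetical component values: 1 kilohm, 1 microfarad, 5 volt source.
R, C, VS = 1_000.0, 1e-6, 5.0
tau = R * C  # time constant of the circuit
numeric = rc_step_response(VS, R, C, t_end=tau, dt=tau / 10_000)
exact = rc_closed_form(VS, R, C, tau)
print(f"numeric = {numeric:.4f} V, closed form = {exact:.4f} V")
```

The two component laws plus the wiring (a single series loop) are the only inputs; the behavior of the whole circuit follows from them, which is the sense in which the level is self-contained.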

There is a recursive feature to most system descriptions. A system, composed of components structured in a given way, may be considered a component in the construction of yet other systems. There are, of course, some primitive components whose properties are not explicable as the resultant of a system of the same type. For example, a resistor is not to be explained by a subcircuit but is taken as a primitive. Sometimes there are no absolute primitives, it being a matter of convention what basis is taken. For example, one can build logical design systems from many different primitive sets of logical operations (AND and NOT, NAND, OR and NOT, etc.).

A system level, as we have used the term in Fig. 1, is characterized by a distinct language for representing the system (that is, the components, modes of combination, and laws of behavior). These distinct languages reflect special properties of the types of components and of the way they combine. Otherwise, there would be no point in adopting a special representation. Nevertheless, these levels exist in the system analyst's way of describing the same [...]

[...] to come to mind first, but card readers, card punches, and Teletype terminals are other examples. These devices obey laws of motion and are analyzed in units of mass, length, and time.

The next level is the logic level. It is unique to digital technology, whereas the circuit level (and below) is what digital technology shares with the rest of electrical engineering. The behavior of a system is now described by discrete variables which take on only two values, called 0 and 1 (or + and -, true and false, high and low). The components perform logical functions: AND, OR, NOT, NAND, etc. Systems are constructed in the same way as at the circuit level, by connecting the terminals of components, which thereby identify their behavioral values. The laws of boolean algebra are used to compute the behavior of a system from the behavior and properties of its components.

The previous paragraph described combinatorial circuits whose outputs are directly related to the inputs at any instant of time. If the circuit has the ability to hold values over time (store information), we get sequential circuits. The problem that the combinatorial-level analysis solves is the production of a set of outputs at time t as a function of a number of inputs at the same time t. As described in textbooks, the analysis abstracts from any transport delays between input and output; however, in engineering practice the analysis of delays is usually considered to be still part of the combinatorial level. In Fig. 3 we show a combinatorial network formed from combinatorial elements which realize three boolean output expressions, O1, O2, and O3, as a function of the input boolean variables A and B. Note that in the symbolic representation of the structure we can write an expression that reflects the structure of the combinatorial network, but, on reduction, the boolean equations no longer reflect the actual structure of the combinatorial circuit but become a model to predict its behavior.

The representation of a sequential switching circuit is basically the same as that of a combinatorial switching circuit, although one needs to add memory components, such as a delay element (which produces as output at time t the input at time t - τ). Thus the equations that specify structure must be difference equations involving time. Again, there is a distinction (even in representation) between synchronous circuits and asynchronous circuits, namely, whether behavior can be represented by a sequence of values at integral time points (t = 1, 2, 3, ...) or must deal in continuous time. But this is a minor variation. Figure 4 gives a sequential logic circuit in both an algebraic and a graphical form and shows also the representation of the behavior of the system.

Now it is clear that logic circuits are simply a subspecies of general circuits. Indeed, to design the logic components one constructs circuit-level descriptions of them. For instance, Fig. 5 shows a circuit for a NAND gate plus a table of its behavior. It is evident that its behavior corresponds to that of the NAND gate (or NOR) only if certain restrictions hold; namely, that one does not look at the voltage (which is identified as the behavior variable in the logic circuit) during certain periods when it is transient ("settling down," to use the common logic level phrase). Thus the logic level is an instance of the circuit level only in the same sense that the circuit level is an instance of Maxwell's equations as a limiting case in which certain features are deliberately ignored. One buys a great deal from the specialization to logic circuits, since one can compute the behavior of circuits at the logic level that are extremely complex at the circuit level. The techniques for doing so use an entirely different mathematical apparatus.

In general, we cross into another level when the representation at the previous level provides information that is no longer relevant. A lower level is concerned with explaining the behavior of a certain structure, whereas the next highest level takes the lower level as given (a primitive). The higher level is concerned not about internal behavior but only how primitives are combined.

A glance at Fig. 1 shows that we have described only the lower part of the logic level. There is another part, called the register-transfer level (or RT level). This is still an uncertain level, a matter [...] level. The practicing logic designer (by now an institutionalized position, on a par with that of circuit designer) has sequential and combinatorial circuits as his basic analytic tools, and he attempts to design systems on the register-transfer level (e.g., central processors) with these as tools. The register-transfer level has emerged from the informal attempts to create a notation closer to the job to be done.

Recently there have been a number of efforts to construct formalized register-transfer systems. Most of them are built around the construction of a programming system or language that permits computer simulation of systems on the RT level. Although there is agreement on the basic components and types of operations, there is much less agreement on the representation of the laws of the system (corresponding to the production system in Fig. [...]

Fig. 6. Register-transfer sublevel of the logic level: sum of integers.
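To make the idea of simulating a system at the RT level concrete, here is a minimal sketch of a clocked register-transfer machine that sums the integers 1 through N, the task named in the caption of Fig. 6. The register names (S for the running sum, I for the counter, N for the limit), the start and run flags, and the exact transfer rules are assumptions chosen for this illustration; they are not the book's notation or its figure.

```python
def rt_sum_of_integers(n, max_clocks=10_000):
    """Simulate a register-transfer description that sums the integers 1..n.

    Each loop iteration is one clock period. Every transfer in a period
    reads the old register values and writes into a fresh copy, so all
    transfers in the same period take effect simultaneously.
    """
    regs = {"S": 0, "I": 0, "N": n, "start": 1, "run": 0}
    trace = []
    for _ in range(max_clocks):
        nxt = dict(regs)
        if regs["start"]:
            # start -> (S <- 0; I <- 1; start <- 0; run <- 1)
            nxt.update(S=0, I=1, start=0, run=1)
        elif regs["run"]:
            if regs["I"] > regs["N"]:
                # run and (I > N) -> (run <- 0)
                nxt["run"] = 0
            else:
                # run and (I <= N) -> (S <- S + I; I <- I + 1)
                nxt["S"] = regs["S"] + regs["I"]
                nxt["I"] = regs["I"] + 1
        else:
            break  # neither flag set: the machine is idle
        regs = nxt
        trace.append(dict(regs))
    return regs["S"], trace


total, trace = rt_sum_of_integers(4)
print(total)
for clock, state in enumerate(trace, start=1):
    print(clock, state)
```

The description says nothing about gates or flip-flops: registers, conditional transfers, and a clock are the primitives, which is what distinguishes this level from the switching-circuit level below it.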

Fig. 2a. [Chart of computers by organization and year. Legend: reasonably compatible series; upward compatible; * non-stored program calculator.]

Fig. 2d. [...] chart: pre-computer technology.

Function

The most striking fact about function is the existence of only a single dimension, and with only a few values. Perhaps we have taken a simplistic view of the functions that computers perform, but we think our computer space represents reality: To wit, there is remarkably little shaping of computer structure to fit the function to be performed. At the root of this lies the general-purpose nature of computers, in which all the functional specialization occurs at the time of programming and not at the time of design. However, it might seem that specialized environments would not require all the generality, so that functional adaptation would be possible. But this appears not to be so for two reasons. First, the level of operations of the Pc (as defined in the ISP) is still too basic to reflect the kind of specialization offered by the environment (think of information-transfer or conditional-transfer operations). Second, all environments ultimately require a variety of tasks in addition to the main specialized task. These include at least language compilation or assembly, readable formatted output, debugging aids, and other utility routines. By the time these have been added, a substantial requirement for generality has been generated.

However, this is not the whole story. A second part is the difference between the computer type and the specific configuration assembled for a task. The latter is often carefully specialized to the function to be performed. But this is mostly the amount of Mp, the amount and types of Ms, and the number and types of T's. Within limits, these are all items that can be attached to any type of computer (i.e., to any Pc) and are handled in an environment-independent way. Thus there is little specialization of computer types, but great specialization of particular configurations. That this should be the case indicates something about the nature of the functional specialization: that it can be expressed adequately in gross PMS terms, as more bits of storage and more data rate.

There is still more to the story. Some functional specialization exists, as indicated in the dimension. This depends primarily on two kinds of things beyond the reach of the configurational adaptation described above. The first consists of demands for reliability, ruggedness, small size, etc. These have strong effects on design, but below the ISP and PMS levels. The second consists of demands for large amounts of processing power. One response to this again affects design at the lower levels of logic, devices, and circuitry and has little impact on design at the ISP and PMS level. But response is also possible in terms of the data-types that are built into the ISP. Large machines have data-types that are appropriate to their tasks (with operations to match), and these affect the design. In fact, this effect is the substance of the functional specialization shown in the computer-space dimension.

Finally, there is one last part of the story, and it is the most interesting of all. Various groups of computer engineers have felt strongly from time to time that functional specialization should exist, and they have set out to create such machines. These efforts have often produced machines that were different from the existing main line of computers, i.e., were appropriately specialized. But the net effect of almost all such attempts has been that the new idea was seen to be good in general for all computers and was taken back into the main line of computers. Thus, what started out to be a functional separation turned out to be simply a way to produce rapid development of a more universally applicable computer. A classic example is the expansion of input/output facilities in creating a functionally specialized business machine, which simply led to better I/O facilities for all computers. We will have more to say about such examples as we discuss the values along the dimension.

Computer-system function

Scientific. The first machines were clearly designed for scientific calculations. In fact, Aberdeen Proving Grounds funded the early work on the ENIAC for the computation of ballistic firing tables. And the image used frequently by the early computer designers was the computer as a statistical clerk, the arithmetic unit being the desk calculator, the memory the work sheet, and the program the instructions that the mathematician gave to the clerk.

From a design standpoint, scientific computation has posed two striking requirements. The first is the great accuracy of the numbers, which has led to word lengths of 36 to 60 bits (11 to 18 decimal digits of significance) and arises from the propagation of roundoff error during repeated arithmetic operations. The second is the emphasis on fast arithmetic operations, i.e., for arithmetic power. In the early machines the standard rule for estimating computation times was to count the number of multiplications in [...] be neglected. The arithmetic unit has [...] where the floating point multiply is hardly more [...]

Thus, the main effect at the ISP level is the emphasis on [...] operations in the ISP. The main PMS [...] the classic "statistical clerk" [...] The press for increased arithmetic processing has led in recent [...] the look-ahead of Stretch (Chap. 34) and the n-instruction buffer of the CDC 6600 (Chap. 39). This might be considered a unique functional specialization for scientific computation. It is too early to tell, but it is our impression that, although the needs for scientific computation initiated the exploration of concurrency and parallelism, we will eventually see them in all computers above a certain power, whatever the task domain. Physical limits on component speed and signal propagation will make these techniques universally attractive.

A better case for permanent specialization can be made in the special algorithm computers, which compute the fast Fourier transform or do vector operations. Here we finally have systems whose whole design is responsive to a narrow class of problems. This may extend to the very special kinds of Pc parallelism exhibited by the ILLIAC IV (Chap. 27), although there is substantial generality in such systems.

Business. In the early days of electronic computing it was felt by many that there was a major functional separation between business computing and scientific computing.¹ Scientific problems were "large computing-small input/output"; business problems were "small computing-large input/output." Certainly most of the existing computers, designed for scientific computation, had poor input/output facilities. [...] used the Pc, for example, to control everything dynamically, actually catching the bits from running tapes on the fly (by executing well-timed small loops).

These design efforts for business computers resulted in the IBM 702 (and subsequently the IBM 705, 708, and 7080). This machine had two major innovations: it used characters, and it had more flexible and voluminous input/output. The latter feature was immediately incorporated [...] and then into all large computers as separate input/output control (either Kio or Pio), for it was realized that there were also demands on input/output for scientific calculation. Thus the bifurcation was temporary. The specialization to characters as a basic type (as opposed to long words) was already present in the IBM 702 but did not have its effect until 5 years later with the development of the IBM 1401 (Chap. 18). The latter machine was adapted to business, both in being character-based and in being small enough so that small businesses could afford it. It was extremely successful (many thousands were produced) and certainly represents a successful func-

¹ Such feelings are still extant, but we are concerned here not with the validity of the feelings but with what they led to at a particular period

developed to expensive than floating point add. This requirement on fast arithmetic, however, has really been directed at the logical design level,

PMS

701, for

into scientific computers, e.g., into the 709,

rarily halted.

the adoption of long word lengths, floating point data-types (in addition to integers), and an extensive repertoire of arithmetic

for

structure that permitted

a program; all else could

not at the ISP or

The IBM

PMS

effect

design.

times to the development of various forms of Pc concurrency, as

of

computer development.

47

48

The structure

Part 1

of

computers

tional specialization for business.

However,

it is

interesting that

the specialization has not been maintained, for the IBM System/360 (Chaps. 43 and 44) is again a single machine, although it

has in essence two internal ISP's, one centered around characters

and the other around floating point data-types, that is, and a scientific specialization residing side by side. 1

a business
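The correspondence quoted in the Scientific paragraph, word lengths of 36 to 60 bits against 11 to 18 decimal digits of significance, can be checked with one line of arithmetic: an n-bit binary number carries about n·log10(2) decimal digits. The Python sketch below is an illustration only. The function name `decimal_digits` is ours, and the calculation treats the whole word as magnitude, ignoring any sign or exponent bits, which the text does not break out.

```python
import math


def decimal_digits(bits: int) -> float:
    """Approximate decimal digits of significance in a binary number of `bits` bits."""
    return bits * math.log10(2)


# The two word lengths quoted in the text for scientific machines.
for bits in (36, 60):
    print(f"{bits}-bit word: about {decimal_digits(bits):.1f} decimal digits")
```

Rounded to whole digits, the two results agree with the range given in the text.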

Control. The third functional value is a computer used for control in real time. Examples are process-control computers, aerospace computers, and laboratory instrument-control computers. The role of the computer is to act as a sophisticated control (K) in some larger physical process, and thus it plays a subordinate role. Their relatively late arrival was due to the high cost and unreliability of early computers, as well as to the lack of necessary interface equipment.

The functional specialization is seen most strongly in the word size, which reflects the appropriate numerical data-type. The numbers used in control processes are generated by physical devices and are rarely better than 0.1 percent accurate. Since elaborate arithmetic calculations are not called for, the numbers, and hence the word size, can be around 12 bits. Most control computers have been 12 to 18 bits/word.

A second specialization, again reflecting appropriate data-types, is that all control computers are binary and have boolean operations. This arises because many of the external conditions to be sensed and effected are binary in nature.

About the only other functional specialization of control computers is the interrupt capability² to allow them to respond to many potentially simultaneous external conditions in real time. This provides apparent parallelism, though still using a sequential processor. This is another possible example of functional specialization leading to reunification rather than divergence, for it has again been widely accepted that all general-purpose computers must have good interrupt capabilities. However, interrupts, though not existing in early computers, were developed to obtain good input/output facilities, not for control computers.

Chapters 7 and 29 give examples of aerospace computers, and Chap. 33 describes the IBM 1800, which is specifically designed for process control. As these examples show, a complex ISP is not necessarily required. This in part reflects the fact that control computers may retain their programs over their whole lifetime, so that programming and reprogramming is less important. (It is not absent, however, and so this is not a very strong functional adaptation.)

¹The story above has been told exclusively in terms of IBM machines. Although this does not distort the picture too strongly in terms of total movements of the field, since IBM dominated the market, concurrent developments were taking place throughout the field. UNIVAC I was the first computer built by a manufacturer and did not have the idiosyncrasies we ascribe to IBM; on the other hand, the marketing effort for it was nil.

²Apparently introduced in the UNIVAC 1103.

Communication. The functional specialization of communication could be taken as a subfunction of a control computer. The function is mainly to behave as a switch. In a message-switching application the computer transfers messages from terminals (and links) into primary (and sometimes secondary) memories and then transfers them to other terminals (and links). In message switching, messages are first stored and then forwarded. The computer in a telephone exchange functions as a very sophisticated switch control. Here the computer reads the off-the-hook signal, detects the dialed numbers, rings the dialed parties, and finally sets the switches to connect the telephones together. In some instances, when it answers information inquiries about new telephone numbers or reroutes calls to other phones, it functions as a memory. Thus a communications computer is functionally a switch or a control for a switch.

The main distinction between control computers and communications computers is that the task environment of the latter, since it consists of digitally encoded messages (even in the case of the voice telephone exchange), can be handled directly by the communications computer. That is, the communications computer can do the work of transshipment and storage as well as control. There are no pure examples of communications computers in this book. However, the Pio's serve essentially the same function within a single computer (Part 4, Sec. 1), and they can profitably be examined from this viewpoint.

File Control. We list this as a separate specialization only because, in actuality, a number of computers have been built to do exactly this task. The specialization is easily described: It is a communication computer with the messages being characters (since they are built for business), and with the large memory (the file) being considered to be part of the system. There are no examples of file-control computers in this book, but the early IBM 305 and UNIVAC file computers serve this function. An IBM 1800 is used as the control for a 10^12-bit photo-optical memory, for example.

Terminal. Since it is possible to obtain a separate computer system whose only function is to run a display, we have listed this as a separate functional specialization. In fact, it is better viewed (and almost always occurs) as a component of a larger computer system,

i.e., as a special Pio. The DEC 338 is such a P.display and is described both later in this chapter and in detail in Chap. 25.

Time-sharing. The requirement to have a large number of users in simultaneous conversational interaction with a single large machine has bred a new specialization, that of the time-sharing computer. All the computers described above can be time-shared (even if they do not have interrupts or inherent multiprogramming). However, the emphasis on this mode of operation with the particular timing and flexibility requirements of human users doing general computing at consoles in multiple software systems has led to a number of innovations in design. The most important is the virtual-memory techniques for achieving multiprogramming (described in Part 3, Sec. 6). There is also substantially increased complexity of PMS structure to handle the integration of large files, swapping memories, and the huge software systems that seem to be endemic to time-sharing systems. It is still too early to tell whether any of the design responses will produce permanent specialization or will again simply be the first instigation of design features that will become universally used.

In summary, we see that there is functional specialization and that it translates mostly into total size of the machine and into the data-types available. Many of the other design aspects created in response to functional specialization have instead become the common property of all machines.

Performance

For a device that does a complex job, it is meaningless to ask for a single precise index of performance. It is like asking for the average speed of a given model of car over its lifetime without specifying who will own it, where he will drive it, and what sort of terrain he will encounter along the way. Notice that the difficulty is as much in the complexity of the task environment as in the complexity of the internal workings of the machine. Specify everything about the environment, and the performance can often be given in a single figure. It may be hard to determine, but at least it is well defined. If you know the terrain and road conditions perfectly and how the car was driven, then from the structure of the car it is possible to figure out the instantaneous velocity and from this to construct the average speed.

To put this in terms of computers, given a particular configuration for a computer system, given a particular program, and given a particular set of input data, it is possible to determine all aspects of the performance: whether it was correct, how long it took, how much space was used, and so on. But we are not interested in such specifics. We want to know how well the computer system performs, given some vague notion of the kind of task programs and data that will be used with it. Although we know that we cannot have adequate measures, we believe that there is something that can be said about the performance that tells us that a CDC 6600 is many times more powerful in actual performance than a PDP-8.

An interesting way to look at the problem of specifying performance is to play a simple game: We will give you a number, say 4. You are to give the best description of computer systems involving only that many parameters (equivalently, dimensions or attributes). That is, what is the best description of a computer that can be stated in four numbers? The game is easier to play if we speak of the dimensions, rather than the information content of the description (in bits, say).¹ We have still not defined "best," of course. It can be taken to mean the best prediction of the relative ordering of the computer system; better on the index means better on the same task.²

To start at the beginning, what single number would you give to characterize a computer's power? Such a question makes most people uncomfortable, since strong feelings exist for at least two kinds of numbers, dealing with speed and memory, respectively. If forced, we would probably settle for something related to processing speed. The cycle time of the primary memory is a possibility because for simple machines it determines (limits) the operation rate. It is a structural parameter, but that is no reason to avoid it as a performance index. The average number of instructions per second, or operations per second, is a better indicator. Since the latter does not take into account the size of the word being processed, perhaps average bits processed per second is the best single number. (We measure this number at the processor, and include both the instruction and data streams.)

To take an average we must adopt some weightings. The simplest scheme is simply to add all the instruction (or operation) times and divide by their number. This is equivalent to weighting them equally, the rare ones and the common ones. If we want to do better than that we need some data. Several sets of relative frequencies of instruction types, called "mixes," have been used in the literature. Table 2 gives four examples. The Gibson mix is

¹It is not fair, of course, to invent tricks to encode many conceptually independent dimensions into a single one, just to beat the limit. On the other hand, composite dimensions, such as average operation time, are perfectly acceptable.

²Definitional precision is not appropriate, since we are not attempting to deal seriously with the technical questions of indices, only to illustrate the issues.
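The two averaging schemes described in the paragraph on weightings, equal weights versus a "mix" of relative frequencies, can be sketched in a few lines of Python. Everything numeric here is an assumption made up for illustration: the four instruction classes, their times in microseconds, and the mix weights are hypothetical and are not taken from Table 2. The conversion to bits per second is also a simplification of ours, counting a fixed number of bits (one instruction word plus one data word) per operation.

```python
# Hypothetical instruction times in microseconds (not from any real machine).
TIMES_US = {"load_store": 2.0, "fixed_add": 2.0, "multiply": 10.0, "branch": 1.0}

# Hypothetical mix: relative frequency of each instruction class (sums to 100).
MIX = {"load_store": 40, "fixed_add": 30, "multiply": 5, "branch": 25}


def equal_weight_time(times_us):
    """Simplest scheme: add all instruction times and divide by their number."""
    return sum(times_us.values()) / len(times_us)


def mix_weighted_time(times_us, mix):
    """Average instruction time weighted by the relative frequencies in `mix`."""
    total_weight = sum(mix.values())
    return sum(mix[op] * times_us[op] for op in mix) / total_weight


def bits_per_second(avg_time_us, bits_per_operation):
    """Bits processed per second, assuming a fixed bit count per operation."""
    return bits_per_operation / (avg_time_us * 1e-6)


equal = equal_weight_time(TIMES_US)
weighted = mix_weighted_time(TIMES_US, MIX)
print(f"equal-weight average: {equal:.2f} us")
print(f"mix-weighted average: {weighted:.2f} us")
# Assumption: 36-bit word, one instruction word plus one data word per operation.
print(f"approx. rate: {bits_per_second(weighted, 2 * 36):.3g} b/s")
```

With these made-up figures the rare but slow multiply dominates the equal-weight average, while the mix-weighted average is pulled toward the common, fast instructions, which is the point of using a mix.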


Table 2 Instruction-mix weights for evaluating computer power

Gibson 1

Arbuckle [1966] Fixed

+

/

Knight

-

10(25) 6

X

Knight (commercial)

(scientific) 2

25(45)

2

1

2

+ x

Floating Floating

/



5.6

2.0

-=-

Floating

10

9.5

Load/store

28.5

Indexing Conditional branch

22.5

25 (move)

Compare

20 24

Branch on character

10

13.2

4

Edit

7

I/O initiate

Other

72

18.7

74

¹Published reference unknown.

²Extra weight for either indirect addressing or index registers.

probably the best known. The best source for such data comes from instruction counts of running programs.

Knight takes the view (Fig. 3) that a single number can be used to indicate power, and his formula has been evaluated for some 300 computers [Knight, 1966]. His formula is the product of three factors: processing time, memory size (in words), and word length. The formula was derived (roughly) to measure power so that technological change could be modeled. Applying the formula is like measuring automotive-vehicle power as a product of speed, weight, and the number of wheels. (Such an indicator is roughly proportional to a car's momentum.) Thus, although it is a reasonable single-number indication for power, a computer buyer could not use it directly.

Taking averages, as in the case of mixes, suggests a more sophisticated approach. A collection of programs, called a "bench mark," is developed that does a variety of different tasks. Then the one number is the time it takes to do this collection. Such a bench mark generates its own frequencies of occurrence of the primitive instructions. It brings in a number of additional dimensions that affect performance: the instruction code, the size of Mp, programming skill, input/output devices, etc. It also carries with it an implicit frequency of different kinds of task demands (how much of the set involves compiling, how much number crunching, how much I/O, etc.).

There are severe practical problems in carrying out such measurements on many computers, since the problems must be coded and run on all the systems. It is somewhat easier if the task set is restricted to programs coded in a procedure-oriented language, such as FORTRAN, where all computers accept FORTRAN. Nevertheless, although it has often been done to compare two systems, only occasionally has it been done for even a modest number.

We feel that for a general-purpose computer the compiler-derived bench mark is a reasonable single-performance number. Much actual use will be with the compiler, and good compilers produce code to rival hand coding, so that special features of the machine are utilized. Cox [1968] compares several, using hand coding and compilers for several tasks.

There is a difficulty with the bench-mark scheme that is inherent in its strongest advantage, that of doing a total problem and thus integrating all features of the computer. The number obtained depends not only on the type of computer, for example, an IBM 704, but on the exact configuration, for example, 16 kwords of Mp versus 32 kwords, and even on the operating system and the software (which version of FORTRAN). Thus, although the number perhaps comes closest to an adequate single-performance figure, it becomes much less of a parameter characterizing the structure of the computer than one characterizing a contingent total system.

Let us underscore again the distinction between the computer type and the particular configuration (possibly including basic software) assembled in a particular installation. Computer systems are designed with certain forms of variability. To specify a CDC 1604 is to specify many things, such as the ISP of the Pc, the cycle time of Mp, the K's used to control secondary memories (Ms), and interfaces to the external world. But it leaves open many other

[Fig. 3. Knight's functional model: algorithm to calculate P for any computer system. (Courtesy of Datamation, vol. 12, no. 9, September, 1966, page 42.) The figure gives P, the computing power of the nth computing system, in terms of the word length L (in bits), the total number of words in memory T, the word factor WF, the time t_c for the Central Processing Unit to perform 1 million operations, and the time t_I/O the Central Processing Unit stands idle waiting for I/O to take place, together with instruction-weighting and I/O-overlap factors whose values differ for scientific and commercial computation.]

things, e.g., the types and sizes of Ms and the size of Mp. On some computers it can even leave open part of the ISP (e.g., the multiply/divide options on many small machines), or the speed of the Pc and Mp (e.g., in the IBM System/360).

When we ask questions about computer systems, we should be clear whether we are talking about a computer "type," such as CDC 1604, or whether we are talking about a particular installation, with all the variability specified. It is possible to describe either with PMS and ISP, provided we recognize that the diagrams for the types represent maximal possibilities for assembling particular systems. This is how almost all the PMS and ISP diagrams in this book were prepared. From the point of view of our "number game," if we are talking about computer types, we might prefer numbers that do not depend on the particular configuration.

If two numbers were available for describing performance, what would they be? Clearly there are several directions to go. One could fractionate the bench mark, so that one has a bench mark for arithmetic-rich tasks and a bench mark for others (a composite of compiling and data processing). One could decompose the processing rate into, say, operations per second and word size (from which bits per second can be recaptured approximately). Alternatively, one could retain only a single rate number for processing and add a measure of the memory available, e.g., size of Mp (in bits). Of the three we would choose the latter, especially if we were talking about a particular installation rather than computer types, for which Mp size remains variable.

We can continue this game through several numbers. Table 3 shows some of our choices. Various parameters drop out or change only when they are decomposed into other parameters from which they can be recovered. Thus, initially Mp must be measured in bits, but when the word size is given, Mp is more reasonably measured in words.

One of the reasons for exposing such a list is to emphasize its judgmental and approximate character. There is as yet no way to validate such proposals for brief descriptions. If we had bench marks, which are themselves only approximations at measuring performance, we might look at how well the parameters in Table 3 predict the bench marks. But there remain the difficulties of how to take into account the additional aspects of the total system (e.g., compiler efficiency) that are implied in the bench mark. Alternatively, one might want to construct a mixed description of bench-mark numbers and measurements of the kind in Table 3. Then the relationship between bench marks and these other measurements would become an indirect measure of the efficiency of the rest of the system.

We have discussed performance in a crude and cavalier way, but this accurately reflects the state of the art. There are no precise measures for performance. There are precise structure and performance measures of individual components (e.g., memory size, and speed and word length, and processor instruction times). When designers (and users) are faced with obtaining a certain total performance for a given cost, the only method is that of the bench mark, because the task is such a significant variable. If performance is to be increased, unless the task is sufficiently trivial, it is difficult to predict what effect changing even the most direct structural variables will have (e.g., memory speed).

Structure

We now turn from function and performance, which provide design constraints and objectives, to the dimensions of structure, which provide the space in which the design is actually cast. A structural dimension is one in which the designer can attain any of the values along the dimension by relatively direct means. Thus a machine is completely specified by listing all its values along the structural dimensions. From this, the system's function and its performance within that function can be determined.

What dimensions should be selected for structure? The viewpoint is distinctly different from that of performance, where one

Table 3 Performance parameters specification (as a function of an allowable number of parameters)

Number of parameters allowed: 1 [...]

Parameters: Pc(i.rate:(b/s)); Pc(operation-rate:(op/s)); Pc(width (b)); Mp(size:(b)); Mp(i.(words)); Ms(i.(words)) [...]
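The decomposition described in the number-game paragraphs, where a single bits-per-second rate is split into operations per second and word size, and Mp is restated in words once the word size is known, can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `PerformanceDescription`, its field names, and the sample machine are hypothetical and are not taken from Table 3, and the product of operation rate and word size only approximately recovers the bit rate, as the text itself notes.

```python
from dataclasses import dataclass


@dataclass
class PerformanceDescription:
    """A three-number description: operation rate, word size, and Mp size in words."""

    ops_per_second: float  # Pc operation rate (op/s)
    word_bits: int         # word size (b)
    mp_words: int          # size of primary memory, in words

    def bits_per_second(self) -> float:
        """Approximate single-number processing rate recovered from the parts."""
        return self.ops_per_second * self.word_bits

    def mp_bits(self) -> int:
        """Mp size in bits, the measure used before word size is given."""
        return self.mp_words * self.word_bits


# Hypothetical installation, chosen only to make the arithmetic easy to follow.
machine = PerformanceDescription(ops_per_second=100_000.0, word_bits=36, mp_words=32_768)
print(f"processing rate: {machine.bits_per_second():,.0f} b/s")
print(f"Mp size: {machine.mp_bits():,} bits ({machine.mp_words:,} words)")
```

The two-number description of the text corresponds to reporting only `bits_per_second()` and `mp_bits()`; the three-number description reports the stored fields and lets the reader recompute the composites.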

averages and combines many features to summarize effective output. This tends to obscure structure. For structure, one wants maximally independent aspects which are easily obtained if selected as a design choice. For example, if the computer designer had only a single dimension to describe a computer, he would undoubtedly select the logic technology used in the Pc and K's. This tells him a good deal about many aspects of the computer's structure. In fact, the technology and the average bits processed by the Pc per second are correlated, and so each can be used to predict the other, though only imperfectly. If one is interested in performance, effective bits per second is preferred; if one is interested in design, technology is preferred.

The computer space in Table 1 presents our choice of the major structure dimensions. There is even less means to validate the choice of dimensions here than there is for performance. Nevertheless, there are a few hallmarks. Perhaps the most important is redundancy (the opposite side of the coin from independence, mentioned above). Several dimensions of structure may covary, so that giving any one of them is tantamount to giving the others. This covariation need not come from physical dependence; it may arise from the nature of an appropriate design and good engineering practice. Such a cluster of covarying dimensions is likely to indicate an important dimension (which one among the correlates is to be used is a secondary matter). Table 1 is organized in terms of such clusters, with one of each selected as the main representative and placed at the left.

A second hallmark derives from the hierarchical nature of computer systems. Generally a description of a system consists of the union of the description of its parts, plus a description of the interconnections. This is the basic style of PMS, for example. But there are a few features that affect the total system, i.e., affect many components. These are usually rather important. Technology is a prime example.

Yet a third clue is that the dimensions discriminate the actual population of computers. If all machines had single-address instructions, for instance, there would be no sense in using number of addresses per instruction as a dimension. Any computer engineer who had studied machines at all would know this to be true of all computers. Thus one looks for dimensions that spread the machines out evenly into a substantial number of categories.

If the dimensions of the space are known, a computer is supposed to be defined by a single point. For most existing computers this is actually the case. However, if a computer system were complicated enough, say consisting of several processors, each built with different technologies and having a different number of addresses per instruction, then such a representation would not be possible. For instance, the Rice University computer uses vacuum tubes, transistors, and integrated-circuit logic. But such complexities are rare; time and good engineering practice work against it. If it were necessary to consider such cases, then additional dimensions (e.g., for secondary and tertiary logic) could be added, or several points in the space for a given computer could be used.

The computer-structure space is thus our choice of the seven most important dimensions. It is our response, so to speak, to playing the number game, given only seven descriptors. They are arranged in order of importance, although clearly no simple way exists to validate such an order. But, if we were to have only three attributes to describe the structure of a computer system, we would pick logic technology, word size, and PMS structure (i.e., what processors exist with what functions).

At this point we are ready to proceed through the space, describing the various dimensions and discussing how the computer systems in this book illustrate various points along them. We take up each major dimension separately. A few of the correlated dimensions are accorded separate sections, but most are discussed along with the main dimension.

Technology

Computers are constrained by the physical technology from which they are constructed. It is not just that new technologies provide greater speed, size, and reliability at less cost, although of course they do that. But technologies dictate the kinds of structures that can be considered and thus come to shape our whole view of what a computer is. For instance, the emergence of the PMS-level system is due to advances in technology. Prior to transistor technology, it did not make sense to think of elaborate PMS structures. The costs of the various parts were too high and the reliabilities were too low. When, occasionally, such a machine was in fact designed, it invariably proved too far ahead of its time to succeed. An example in this book might be the RW-40, described in 1960 (Chap. 38). A more classic example is the Analytic Engine of Babbage, which he designed in 1844 and was never able to complete. The technology of the time was entirely mechanical, and its crude state accounts for a large share of the failure.¹ Thus the technology is by all odds the most important single attribute to know about the computer system.

¹Thus, the first real digital computer established the precedent of failing by a large margin to meet the expected dates of completion and full operation.

Many technologies go into making up a computer. Each component type typically uses a different one. In current (so-called

54 Part

1

The structure

of

computers

third-generation) machines the Pc

may

use hybrid- and inte-

when

cially

technological costs are of interest rather than market

for the

costs (which reflect

Pc generalized registers, core technology for the Mp, electromechanical technology for tapes and disks (with integrated circuits

effect of technology

grated-circuit technology for

its

logic, thin-film

technology

mechanical technology for card punches and typewriters, and even manual technology for mounting tapes and disk for logic),

packs.

The

existence of

all

of systems balance, issues

remains true in the current generation that input/ not in balance with the internal structures. This is due

it

example, output

is

to the crude state of terminal technology, so that cost too

much

it

appears to

to provide an appropriate solution. 1

The heterogeneity

of technologies

is

for

any component, but

within a technology.) Thus there

is

this

is

usually

a sense in which the leading

technology can be used to represent them all. This is the technology used for the logic level and is the one listed in the computer If it

space. a computer,

is

known

it is

that transistor logic

is

used

Ms

is

electromechanical,

a safe prediction that

in the

Pc of

Mp is core, Tio is electromechanical printers and punches, etc. This reflects the fact that technology develops and hence becomes locked with calendar time. Thus a prediction is from logic technology to date

be current

it

seemed necessary

how

and then

Nevertheless the

factors).

to give a

other dimensions) that

all

measure of cost

in

Table

1,

no matter

crude.

We have

indicated only a few of the dimensions that are corre-

lated with technology. In fact, the only dimensions in Table

1

that

are independent of technology are the word length and the Pc addresses/instruction. All the rest show dependence on technology. For some, such as memory speed and size, there is a direct

For others, such as PMS structure and Pc concurrency, the development of more complex versions the leading edge, so to speak depends on technology, but there is free use of all correlation.

not a consequence of

cost/benefit analysis; rather, each represents the forefront technology for the type of device shown. (There is, of course, cost/

performance exchange

neously pushing up performance along

these technologies poses major issues

which are only imperfectly resolved. For

numerous other

on costs has been so striking (while simulta-

to all other things

known

to

at that date.





versions that are in existence at

any given time. There are

other dimensions of importance, not

shown

in

Table

1,

still

that have

changed with technology, e.g., electric-power consumption. One way to see both what varies and what is independent of

also

technology

is

wind (Chap.

compare selected machines. For

to

6),

a first-generation system, and the

instance, Whirl-

IBM

1800 (Chap.

have reasonably similar ISP descriptions, if one ignores index registers, which were not invented at the time of Whirlwind's design. However, they have very different 33), a third-generation system,

PMS structures. In Whirlwind, the early system, transferred information between Tio's and Ms was under program control of the Pc.

The

existing

Pc

registers

and

transfer gates

were used because

This correlation of date with technology is given in the computer space along with the generation. It can also be seen in the

uses hybrid circuits,

time chart. The correspondences must be taken as very rough only.

devoted to special functions; hence there are many Pio's operating

The technologies are listed in increasing power (and decreasing cost). The dates run in exactly the same order. The one exception which has been introduced very recently and is a special technology for ruggedness, reliability, and direct external coupling is

fluidics,

in certain control systems. (Small fluidic

computers are at the early

was too expensive

we

list

the dimensions:

Pc speed (operations per second), and cost (dollars per million opwhich vary directly (or inversely) with logic tech-

to

it is

have separate ones. In the 1800, which economical to have additional subsystems

independently of the main Pc.

It

was not

cost alone that limited

the complexity of first -generation vacuum-tube systems. The large physical size of tubes introduced substantial transmission delays; their large

system; and the

prototype stage.) Alongside the technology dimension

it

number

power consumption added dependency on a cooling their limited life and deteriorating nature constrained of tubes that could

be used

in a

system requiring high

reliability.

The IBM 700

scientific series (701, 704, 709, 7090, 7040, 7044,

erations), all of

7094

nology. In general, costs are extremely difficult to determine, espe-

ing structure over time, hence across technologies, but

I

and

II)

offers

another comparison, where there

is

an evolv-

where

for

reasons of compatibility the ISP's have remained almost constant (except for the 701). Again we see radical increases both in perform1 Although beside the point of the current discussion, one reason why these imbalances appear to be "permanent" is that the time constant for change

technology is of the same order as the time constant for human beings systems analysts, programmers, and users) to understand the imbalance. Before system imbalance is diagnosed and solved, the terms of the problem change, inducing new imbalances. in the (i.e.,

ance (Pc speed increases by a factor of 5 from the 701 to the 704 and another 10 to the 7094 II) and PMS complexity. But various other features, though not affecting compatibility, were locked in with the ISP and remained fairly constant. For example, Mp size

went

to

32

kw

(kilowords) early in the series with the 704; and

The computer space 55

Chapter 3

took a jerry-rigged modification to get 64 kw on a 7094 toward the end of the lifetime of the series (see Chap. 41, page 517). it

Throughout this section we have referred to technology as the dominant factor in the computer. Does this mean that computer development waits upon new fundamental windfalls? We have been lucky in getting the transistor and, to a lesser degree, the integrated circuit from external efforts. However, core memories were invented for the computer and resulted because of need. Read-only memories have also resulted both from development at the circuit level and from pressure above, requiring the memories to be developed. All the electromechanical secondary memories (i.e., magnetic tape, drums, disks, and photostores) have resulted from the computer's needs. Thus, although technology is dominant, the computer often forces the development.

The Pc operation rate is strongly correlated with logic technology, as we have indicated in the computer space. Our discussion about technology and generations is also about operation rate. The principal reason for the higher operation rate is faster logic technology. Technology also has a secondary effect on increasing speed. More reliable devices allow large computers to be built. Smaller devices allow higher device densities, thus decreasing stray capacitance and inductance and shortening transmission delays. Smaller components also allow increased interconnection density.

Operation rate is also relatively highly correlated with total performance. If we hold the structure and concurrency constant, the simplest way to increase performance is by increasing the clock rate. The increases in the performance/cost ratio over the past two decades of computer evolution have made their primary gains through higher operation rates. The two 16-bit computers already mentioned, Whirlwind (Chap. 6) and the IBM 1800 (Chap. 33), provide a nice comparison of the evolution. With a difference of 10 years and two generations, their cost ratio is ~10:1 whereas performance¹ is ~1:5 and the internal clock rates are also ~1:5.

¹ However, it is not as dramatic an example as we could find. By picking a better third-generation example we might get a cost ratio of ~100:1 and a performance ratio of ~1:10.

Information structure: word length, information base, and data-types

All computers structure their information in a hierarchy of units, which we defined as an i-unit in Chap. 2. For example, the IBM System/360 starts with the bit; then the byte, which is 8 bits; then the word, which is 4 bytes; then the record, which is a variable number of words. In between, playing minor roles, are decimal digits (4 bits), the halfword, and the double word. A number of features of the design are related to this hierarchical organization of data. Before we consider them, we need to characterize the organization itself. One characteristic of this organization, the word length (in bits), gives most of the information, the rest of the hierarchy adding only a little. Let us see why this is so.

At the bottom there is the bit, encoded in two-state devices. Although other numbers of states are possible, and ternary (three-state) machines have been proposed occasionally, digital technology has developed exclusively to handle binary information. There are several reasons for this. The first is the requirement for high reliability and high signal-to-noise ratios in the basic devices. Generally a basic n-state device (that is, one not built up from other k-state devices) is realized by breaking a continuous physical dimension, such as voltage, current, or magnetic flux, into n discrete levels or regions. Reliability and signal-to-noise ratio then depend on keeping adequate separation. This is easiest to do with two states (e.g., in the limit they become on-off devices) and becomes progressively more difficult as n increases. The second reason is the simplicity of the logical design for binary representations. A basic device for combining two ternary digits must deal with 3 × 3 = 9 configurations, rather than 2 × 2 = 4 configurations for the binary case. This also gets worse as n increases. A final reason (the coup de grace, so to speak) is that no one has ever found striking advantages for the resulting processing structure in having more than two states. Thus there are no compelling reasons to suffer the first two disadvantages. In short, what might have been an important dimension on which to distinguish computers, namely, the number of states in the basic encoding, turns out instead to be one of the great uniformities in digital technology.

Information base. That the physical devices deal ultimately in bits does not imply that the information processing must be organized in terms of bits. It is possible to select an arbitrary base (one with any number of states) and construct the entire ISP in its terms. A base unit is represented physically, of course, as a set of bits. If one wanted a base 13 machine, for example, one would have to use at least 4 bits (with 16 states) to encode it. But no operations at the ISP level would refer to anything but base units and data structures built up from sets of base units, and there would be no way to manipulate directly the bits that represented the base. Thus, using a base other than binary obtains whatever advantages might accrue to n-state units, without any of the disadvantages at the device level.
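The two pieces of arithmetic in this discussion (the number of bits needed to hold one base unit, and the number of input configurations a two-input n-state device must handle) can be checked with a short sketch. The function names below are illustrative assumptions, not notation from the text.

```python
def bits_per_base_unit(base: int) -> int:
    """Smallest number of bits whose 2**bits states can encode one base-`base` unit."""
    if base < 2:
        raise ValueError("a base needs at least two states")
    # (base - 1).bit_length() is the ceiling of log2(base) for base >= 2.
    return (base - 1).bit_length()


def unused_states(base: int) -> int:
    """Bit patterns left over when one base unit is stored in whole bits."""
    return 2 ** bits_per_base_unit(base) - base


def pair_configurations(n_states: int) -> int:
    """Input configurations seen by a device that combines two n-state digits."""
    return n_states * n_states


if __name__ == "__main__":
    for base in (2, 10, 13):
        print(base, bits_per_base_unit(base), unused_states(base))
    print(pair_configurations(2), pair_configurations(3))
```

For base 13 this gives 4 bits with 3 of the 16 patterns unused, which is the case cited in the text; a decimal base also needs 4 bits and leaves 6 patterns unused.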


Computers have been built with a variety of different bases, the main ones being binary, decimal, and character. The character has shifted between a 6-bit character and an 8-bit character (byte).¹ The arguments for bases other than binary (which represents the natural base of the computer) all hinge on the alphabets used externally by human beings and the desire to avoid conversions into a different representation inside the computer. With universal acceptance of higher languages, such as FORTRAN and ALGOL, this argument has also lost much of its force. In fact, all third-generation machines are binary. Nevertheless, in the fifties there was much controversy over which base to use, and the machines presented in this book exhibit all three bases.

¹ Seven bits have been proposed for communication purposes but have never been made the basis of a machine, as far as we know.

There is little difference between binary and decimal computers in their ISP organization. However, there is a great difference between these two and character machines. The latter are designed for handling text and are constructed to deal with variable-length strings of characters. Correspondingly, they deemphasize numerical computation. Both these decisions affect the ISP considerably. Thus, in the computer space we indicate the base dimension along with the word-length dimension. The two together make up a single dimension.

Word length. Let us now examine the role of word length. The word is the first major information unit above the base. It is defined as n bits for a binary computer or n digits for a decimal computer (character machines being excluded as not having a fixed word length). Sometimes there are intermediate units, but they always play a minor role and we can disregard them at this stage. As we noted earlier, the main determinant of word length has been the function of the total system: large word lengths for arithmetic systems, small word lengths for control systems (and character strings for business). Thus, only within narrow limits is the word length a free design choice. However, the interesting thing about word length is not so much its determinant as the way it affects other aspects of the total system design. This starts with a design decision that the unit of information transfer between components will be a word. As soon as this becomes the case, then registers in various components must hold a word, since that is what arrives or is to be transmitted. Thus the word becomes the information unit of the Mp, and most of the registers of the Pc hold one word. The instruction is designed to fit into one word, since that is the number of bits that is obtained "at once" and hence can be used to effect the next time increment of processing.

Once these basic features are set, others follow. An integer number of any smaller units, such as the character, should fit into a word, since otherwise a set of words will not provide a homogeneous sequence of subunits. (That is, only five 6-bit characters fit into 32 bits, so that a set of 32-bit words filled with 6-bit characters has a number of 2-bit holes in it. This can complicate algorithms that deal with long character strings.) The constraint of compatibility is not so strong with Ms, since speeds are slow enough to permit conversion algorithms (either hardware or software). Still, the system is simpler (and therefore usually will work better) if incommensurabilities of information units do not exist. Thus, to pick an example, the number of parallel tracks on magnetic tapes tends to divide evenly into the word length. IBM tapes for the 700 series of 36-bit machines have six data tracks; for the System/360, which has a 32-bit word, the tapes have eight data tracks.

There is an interesting correlation between the word length of a computer and the number of data-types that it makes available. As we saw in Chap. 2, the operations in a computer can be classified according to the type of data they operate upon. Each data-type tends to have a certain set of operations appropriate to it (for example, +, −, ×, and / for numbers) and the decision to include a data-type carries with it the decision to include its operations. Thus the number of operations tends to grow with the number of data-types. The total amount of hardware in a computer grows as the word size (because data paths are word-parallel²) and also as the number of operations. Thus machines with large word size tend to be large machines and have many data-types and many operations. ("Large" as an adjective for machines invariably means big and expensive, hence, given economics, capable of doing large amounts of processing.)

² The issue of bit-serial versus bit-parallel is discussed subsequently.

There are two additional, somewhat independent, features that support the relationship between word size, number of data-types, and size of computer. First, with a large system there will already be available many of the pieces necessary to add additional operations. That is, the marginal cost of a new operation goes down as the system grows. Therefore, given a large system, there is a tendency to add more operations. The number of operations per data-type is not easy to increase; rather, one adds new data-types. Second, with small word lengths, one cannot define many worthwhile data-types that will fit into a word, and multiple-word data-types are left to the programmer to define with software. With large word lengths there are many different worthwhile data-types that fit into the word, for instance, decompositions of the word into partial words, or into character strings. Each of these requires additional operations, since the initial data-types involve the entire word or some large part of it (i.e., the word, address, and integer operations).
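The commensurability point made earlier in this discussion (five 6-bit characters in a 32-bit word leave a 2-bit hole, whereas six fit a 36-bit word exactly) can be illustrated with a minimal sketch. The helper names and the left-justified packing layout are assumptions made for illustration; the text does not prescribe a layout.

```python
def chars_per_word(word_bits: int, char_bits: int) -> tuple:
    """Return (whole characters per word, unused 'hole' bits per word)."""
    return divmod(word_bits, char_bits)


def pack_chars(codes, word_bits=32, char_bits=6):
    """Pack character codes into words, left-justified, hole bits left as zeros.

    Assumed layout: characters fill the word from the high-order end and the
    leftover low-order bits form the hole.
    """
    per_word, hole = chars_per_word(word_bits, char_bits)
    words = []
    for start in range(0, len(codes), per_word):
        group = codes[start:start + per_word]
        word = 0
        for code in group:
            if not 0 <= code < (1 << char_bits):
                raise ValueError("character code does not fit in char_bits")
            word = (word << char_bits) | code
        # Pad a short final group, then skip over the hole at the low end.
        word <<= char_bits * (per_word - len(group))
        word <<= hole
        words.append(word)
    return words


if __name__ == "__main__":
    print(chars_per_word(32, 6))   # 6-bit characters in a 32-bit word
    print(chars_per_word(36, 6))   # 6-bit characters in a 36-bit word
    print(chars_per_word(32, 8))   # 8-bit bytes in a 32-bit word
    print([bin(w) for w in pack_chars([1, 2, 3, 4, 5, 6])])
```

Any routine that walks a long string stored this way has to step over the hole at every word boundary, which is the complication the text refers to.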

In sum, the word length stands as an indicator of many aspects of the machine. It not only tells something about the basic organization of many components but indicates how big the computer is, both in number of data-types and number of operations.

Figure 2 shows time lines of well-known computers with their word length, with a special time line for the ones in this book. Five groups are suggested in the figure which classify these computers.¹ The classes overlap, and to separate a computer into one of two classes requires more knowledge (e.g., the number of data-types). For example, the 24-bit SDS 9300 and CDC 3200 appear in the same class with the 36-bit IBM 7090 just because both machines have floating point hardware and, in fact, perform comparably for arithmetic tasks.

¹ The class number is essentially [log2 (Mp word length) − 2].

The one design choice that makes word length have few of the consequences just described is making a computer bit-serial rather than bit-parallel. In many machines information transfers are conducted on a single bit stream (especially Pc-Mp transfers). Coincident with this is the construction of operations on a bit-by-bit basis. This works well for arithmetic and logical operations. Time is traded for hardware. The cost of the system becomes independent of word length, but the processing rates go down correspondingly. This design decision was an extremely important one when logic was expensive and unreliable. It has become less so in the current era, where processors and transfer paths are relatively few in number while both the cost and the reliability of components have improved. However, as large parallel processors are considered (~10^3 P's), bit-serial processors again become a serious design alternative. (See the serial computers of Part 3, Sec. 2.)

In summary, word length is an important dimension, and we find many characteristics either proportional to or inversely proportional to it. To be sure, these relations hold only for current design practice, as we have seen with the bit-serial designs. The main-line computers in Part 2 are ordered according to increasing word length.

Data-types. We have presented the number of data-types as being correlated with word length and also with computer size through the effect on number of operations. Although far from perfect, there is a rough order in which specific data-types are included in a computer. We have listed the main types in such an order in the data-type dimension of the computer space. (See Chap. 2 for their definitions.) To be located at a point on this dimension (say at floating point) means to have all the data-types below it on the dimension (i.e., word, address, integer, boolean). Occasionally machines which violate this have arisen. Decimal machines do not generally have boolean data-types, and there has been some attempt at machines with only floating point, without a separate integer type (e.g., the CDC G20²).

² Originally the Bendix G-20.

The reason behind this cumulation of data-types, i.e., in a fixed order, is that certain general tasks must be performed by any computer. It must transmit data between the Pc and Mp, and this transmission has nothing to do with the meaning or content of the data; thus there is always the "unit of transmission," which is the word (except on character machines). Next, all computers manipulate addresses to achieve generality (e.g., to compile), providing for a second data-type. Next come integers, since almost all algorithms make use of arithmetic (this could conceivably be absent in some communications computers), and on up to floating point numbers, multiple precision, and vector and string operations. At each stage the uses are more specialized so that lower ones cannot be eliminated, except for a few cases such as handling addresses as regular integers.

Addresses per instruction and processor state

The number of addresses in an instruction has been a traditional way of describing processors (i.e., their ISP's) and hence the computer systems containing these processors.³ We use it in Parts 2 and 3 to separate the different processors.

³ Although used mostly to describe Pc's, the description applies to any processor.

Originally the dimension was simple: one-, two-, three-, and four-address machines were constructed. It has become somewhat more complex. A "one plus one" machine has one address for data and one for determining the next instruction, and is to be distinguished from a two-address machine, which uses both addresses for data. Index registers and so-called general registers provide instruction schemes which lie somewhere between one- and two-address organizations. When processors admit several instruction formats or variable-length instructions, matters become even more complicated.

A correlated dimension in the computer space is the amount of processor state, that is, the number of bits that exist in the processor, as described in the ISP. This is the amount of information that can be held at the end of one instruction to provide the processing context for the next instruction. It consists of a number of status and mode bits (in modern machines packaged into registers, but in earlier machines simply scattered around in the processor), the next instruction address, the accumulator and other arithmetic registers, the index registers, and other general registers making up a "scratch-pad" memory. It is a simpler descriptor of the ISP than addresses per instruction, since it is independent of the number and variety of instruction formats. It is easy to define processor state generally for any ISP, but difficult to define addresses per instruction.

The processor state is not the total number of bits in the processor, since there may be registers in the physical system that are used within the interpretation of one instruction but which carry no information between instructions. Address registers for obtaining operands from Mp are the most common such "underground" or "temporary" registers, but there can be others. We implied this distinction by defining processor state in terms of the ISP rather than the physical processor.

The correlation between the processor state and the number of addresses per instruction is not simple, since it rests on two separate issues. For the first, note that larger programs perform transformations on the state of Mp (or even Ms or Tio's) and are not concerned with the state of the processor. Processor state enters only because, in decomposing the total algorithm into a series of small steps, it is not possible (or efficient) to make each step a transformation from Mp to Mp. Basically, this happens because the instruction does not hold enough information to specify the Mp-to-Mp transformations. For example, if one wants to add two numbers, two operands are required, and an instruction must contain at least two addresses; if it does not, then an intermediate state (i.e., processor state) must be created to hold the information while the additional instructions are fetched. Thus, one-address organizations require the most processor state, with less for two- and three-address organizations. This consideration stops at three (two operands and a result) because only a few elementary operations are more than binary. The processor state cannot be eliminated entirely, however, since there must be at least an instruction address (a program register) to maintain continuity of the program.

The second source of correlation between processor state and addresses per instruction comes from differential access time to processor registers and to Mp. As long as there is an appreciable differential, substantial gain in processing power can be obtained from increasing processor state. This derives, again, from the structure of algorithms which generate intermediate results that are used almost immediately afterward and then are of no further interest. Rapid temporary storage and retrieval are beneficial under these conditions. Thus, working against higher address organization is the extra time to store in Mp results that need only temporary storage. Thus, also, index registers and general registers almost always imply increased processor state, although they need not do so logically (that is, the registers could exist in Mp and still have their effect on the instruction format).

With interrupts and multiprogramming the processor state gains additional significance, since it is the amount of information that has to be saved and restored when switching programs. For example, in the Honeywell H-800, an early three-address computer, the processor state per program consisted only of the program counter and index registers, and when io-halts occurred during processing, the Pc was switched immediately to another program. Eight programs could run concurrently (by having a total processor state of 64 program registers). In present computers with general-register state, often 25 ~ 100 words must be stored, which implies an appreciable time for switching contexts.

We can now consider briefly the different organizations according to addresses per instruction. To show the common similarities, we give in Fig. 4 a state diagram that can be used for all processors. In common is the basic idea of the stored program: Fetch an instruction, determine what the instruction is to do, then execute it (the fetch-execute cycle). Other than this, only a part of the state diagram will be applicable to a given processor type.

Fig. 4. ISP interpretation state diagram. The diagram distinguishes Mp controlled states from Pc controlled states. Note: Any state may be null.

State name     Time in a state   Meaning
soq/oq         toq               Operation to determine the instruction q
saq/aq         taq               Access (to Mp) for the instruction q
so.o/o.o       to.o              Operation to decode the operation of q
sov.r/ov.r     tov.r             Operation to determine the variable address v
sav.r/av.r     tav.r             Access (to Mp) to read the variable v
so/o           to                Operation specified in q
sov.w/ov.w     tov.w             Operation to determine the variable address v
sav.w/av.w     tav.w             Access (to Mp) to write variable v

As shown in the computer space, the addresses-per-instruction dimension starts with zero addresses, then one address, then one plus indexing, one plus general registers, and on up to two, three, and variable addresses. However, from an expository viewpoint one should follow a different course, starting with single-address machines, then indexing, then two- and three-address machines, then general registers, and finally the zero-address and variable-address organizations. This not only puts the more common organizations first but makes it easy to relate the organizations to each other.

P(1 address) and P(1 + index address). These Pc's constitute most first-, second-, and simple third-generation computers. The earliest outline of the structure was the IAS computer (Chap. 4), which has come to be known as the von Neumann computer. Although fundamentally like the IAS computer, EDSAC's adaptation appears to be the closest prototype to this class. Although EDSAC is not described, it significantly influenced M.I.T.'s Whirlwind I (Chap. 6).

A significant change to the IAS machine was the addition of the index register (called B-tubes) in the Manchester University machine in the early 1950s. The evolution can be seen by comparing the first and third generations using Whirlwind (Chap. 6) and IBM 1800 (Chap. 33) or looking at the IBM 701-7094 evolution in Part 6, Sec. 1. Index registers are motivated by the frequent occurrence, in 1 address systems, of circuitous address calculations that involve first computing the address (e.g., the index of an array in Mp) and then planting it just ahead in the instruction stream in order to make use of it as an address. Providing a set of index registers introduces a second address into the instruction, even though of extremely limited function. Thus we classify processors with indexing as having (1 + x) addresses per instruction.¹

¹ An alternative view of index registers suggests that they double the number of data-types by allowing operations on vector data elements rather than just scalars. Indirect addressing, on the other hand, does not add to the addresses per instruction; rather, it introduces a second operation per instruction.

For the 1 address processor, the processor state (Mps) typically consists of the program counter (instruction location counter), an Accumulator/AC, a Multiplier-Quotient register/MQ (the extension of AC), and one or more Index registers/X/XR. With only one address in the instruction, the one arithmetic register, A, must be used for temporary results. Thus an effective-address integer (z) is computed as a function of the address part (v part) of the instruction (q) and the index registers. This process is typically

z := v + X[j]

where X[j] is the jth index register as specified in the instruction. There are several forms for the transmission operators between A and Mp.
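The fetch-execute cycle and the effective-address rule z := v + X[j] can be illustrated with a toy interpreter for a one-address Pc with index registers. This is only a sketch under stated assumptions: the operation names (LOAD, ADD, STORE, LDX, HALT), the tuple instruction format, and the convention that j = None means "no indexing" are all hypothetical and are not taken from any machine described in the book.

```python
from dataclasses import dataclass, field


@dataclass
class OneAddressPc:
    """Toy 1 + index address processor; the processor state is pc, ac and x."""
    mp: list                                   # primary memory Mp (data and instructions)
    pc: int = 0                                # program counter
    ac: int = 0                                # accumulator
    x: list = field(default_factory=lambda: [0, 0, 0])   # index registers X[j]

    def effective_address(self, v, j):
        """z := v + X[j]; with j None (assumed convention) the address is just v."""
        return v if j is None else v + self.x[j]

    def step(self) -> bool:
        """One fetch-execute cycle; returns False once HALT is executed."""
        op, j, v = self.mp[self.pc]            # fetch the instruction q
        self.pc += 1
        if op == "HALT":                       # decode, then execute
            return False
        if op == "LDX":                        # hypothetical: X[j] := v (immediate)
            self.x[j] = v
            return True
        z = self.effective_address(v, j)
        if op == "LOAD":
            self.ac = self.mp[z]
        elif op == "ADD":
            self.ac += self.mp[z]
        elif op == "STORE":
            self.mp[z] = self.ac
        else:
            raise ValueError(f"unknown operation {op!r}")
        return True

    def run(self):
        while self.step():
            pass


def demo() -> OneAddressPc:
    """Sum the three-element array at Mp[10..12] into Mp[13] using indexing."""
    program = [
        ("LDX", 0, 0), ("LOAD", 0, 10),
        ("LDX", 0, 1), ("ADD", 0, 10),
        ("LDX", 0, 2), ("ADD", 0, 10),
        ("STORE", None, 13), ("HALT", None, 0),
    ]
    machine = OneAddressPc(mp=program + [0, 0] + [5, 7, 9] + [0])
    machine.run()
    return machine


if __name__ == "__main__":
    print(demo().mp[13])
```

Every array access in the demo names the same address part (10) and varies only X[0], so no instruction has to be rewritten in Mp to step through the array. That is the address "planting" that index registers were introduced to avoid.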




3 sets of protection and relocation

virtual

Similar to above.

physical a

homo-

Similar to above. Simple, pure procedures with one data array area can be

memory.

More

implemented. (UNIVAC

similar to

page mapping.

Has not been used

108, PDP-10)

any conventional

computer.

registers.

Mapping,

in

1

Mv > Mp:

Memory page mapping

For each page ($2^6$ to $2^{12}$ words) in a user's virtual memory, corresponding information is kept concerning the actual physical location in primary or secondary memory. If the map is in primary memory, it may be desirable to have "associative registers" at the processor-memory interface to remember previous reference to virtual pages, and their actual locations. Alternatively, a hardware map may be placed between the processor and memory to transform processor virtual addresses into physical addresses.

Relatively expensive. Not as general as following method for implementing pure procedures. (Atlas, CDC-3500, SDS-940)

Memory page/segmentation mapping

Additional address space is provided beyond a virtual memory above by providing a segment number. This segment number addresses or selects the page tables. This allows a user an almost unlimited set of addresses. Both segmentation and page map look-up is provided in hardware. May be thought of as two-dimensional addressing.

Expensive. Little experience to judge effectiveness. (GE 645, IBM 360/67)

Indirect references through a descriptor table to segments

All data are considered part of a descriptor array which is referred to by a number. A descriptor table indexed by the descriptor number is used to locate the array in Mp and give its size.

An indirect reference must be made to the description table in Mp. (B 5500)
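The page-mapping entry above can be illustrated with a small sketch: a full map held in a dictionary, plus a few "associative registers" that remember recent page references. The page size of $2^9$ words, the capacity of four associative registers, the oldest-first replacement rule, and all names below are assumptions chosen for illustration, not details of the machines listed.

```python
PAGE_BITS = 9  # assumed page size of 2**9 words (the table allows 2**6 to 2**12)


class PageFault(Exception):
    """The referenced page is not in primary memory."""


def translate(virtual_address, page_table, associative, capacity=4):
    """Map a virtual address to a physical address.

    page_table:  dict, virtual page -> physical frame (page absent if not in primary memory)
    associative: dict used as a small memory of recent page -> frame pairs
    """
    page, offset = divmod(virtual_address, 1 << PAGE_BITS)
    frame = associative.get(page)
    if frame is None:                      # not remembered: consult the full map
        frame = page_table.get(page)
        if frame is None:
            raise PageFault(page)
        if len(associative) >= capacity:   # forget the oldest remembered page
            del associative[next(iter(associative))]
        associative[page] = frame
    return (frame << PAGE_BITS) | offset
```

A reference to a page that is absent from the map raises `PageFault`; in a real system that would be the point at which the page is fetched from secondary memory.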

... z is encountered in the program, the information at Mp[z] will be obtained. There are still, however, two different ways to obtain the effect of a virtual memory.

First, one can operate interpretively, with a software system taking the place of hardware. That is, the programs of all the users are in a nonmachine language (e.g., a higher procedure-oriented language), and each access in the language is processed by the software interpreter before an access is made to Mp. It is clear that all the logical power of a memory mapping is available with this scheme. The only drawback is the loss of efficiency from the interpretation, which may range from a factor of 5 to 100. Consequently this scheme is used only in special circumstances, such as multiuser time-shared conversational algebraic languages.

The second scheme is to modify the code at the time it is placed in the Mp for a given run, so that all addresses in the code correspond to the actual Mp addresses used. That is, an assembly or translation operation is performed each time the program is placed in Mp. The advantage of this scheme is that no further address calculations are necessary. There are three disadvantages. Assembly operations are expensive so that, although the scheme is tolerable if the program is brought in once and run to completion, it is not tolerable if ... in ... out of Mp. In ...

Every reference Mv[z] takes place as

Mv[z] := (¬ ... → Mp[z]; ... → protection violation ← 1)

The other two schemes ...

... and relocating registers) hardware. A protection and relocation register mechanism is used in four schemes of Table 6. These provide either one concatenated, one additive, two additive, or n additive register pairs for mapping a single program into one, one, two, or n nonadjacent blocks in Mp. The authors know of no schemes where more than three registers are used; this would really be akin to using a more general page map. Generally, these schemes restrict Mv ...
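A minimal sketch of the additive protection-and-relocation idea follows. It assumes each register pair holds a base address and a block length, and that the virtual address space is the concatenation of the blocks in order; the class and function names are hypothetical, and real machines differ in how the bound is encoded.

```python
from typing import Sequence, Tuple


class ProtectionViolation(Exception):
    """The virtual address falls outside every mapped block."""


def relocate(z: int, pairs: Sequence[Tuple[int, int]]) -> int:
    """Map virtual address z to a physical address in Mp.

    pairs is a sequence of (base, length) register pairs; one pair gives a
    single relocated block, n pairs give n nonadjacent blocks.
    """
    if z < 0:
        raise ProtectionViolation(z)
    offset = z
    for base, length in pairs:
        if offset < length:
            return base + offset   # additive relocation within this block
        offset -= length           # otherwise try the next block
    raise ProtectionViolation(z)
```

With two pairs, for example `[(4096, 100), (9000, 50)]`, virtual addresses 0 to 99 land in the first block and 100 to 149 in the second; anything beyond that is a protection violation.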

... not important. It is, of course, also possible to handle square roots by iterative techniques. In fact, if $X$ is our estimate of $a^{1/2}$, then $X' = \tfrac{1}{2}(X + a/X)$ is ... one division per iteration. As will be seen below in our more detailed examination of the arithmetic organ we do not include a square- ... II.)

5.5. ... part of our arithmetic organ requires little discussion at this point. It should be a parallel storage organ which ... contains.

5.6. ... of length $n$, the length of the largest carry sequence is on the average not in excess of $\log_2 n$. Let $p_n(v)$ designate the probability that a carry sequence is of length $v$ or greater in the sum of two binary words of length $n$. Then clearly $p_n(v) - p_n(v + 1)$ is the probability that the largest carry ... From ... We now ...

Indeed, $p_n(v)$ ... by themselves do not contain a carry sequence of length $\ge v$. In this case any carry sequence of length $\ge v$ in the total numbers (of length $n$) must end with the last digits of the total sequence. Hence these must form the combination 1, 1. The next $v - 1$ digits must propagate the carry, hence each of these must form the combination 1, 0 or 0, 1. (The combinations 1, 1 and 0, 0 do not propagate a carry.) The probability of the combination 1, 1 is $\tfrac{1}{4}$, that of one of the alternative combinations 1, 0 or 0, 1 is $\tfrac{1}{2}$. The total probability of this sequence is therefore $(\tfrac{1}{4})(\tfrac{1}{2})^{v-1} = (\tfrac{1}{2})^{v+1}$. The remaining $n - v$ digits must not contain a carry sequence of length $\ge v$. This has the probability $1 - p_{n-v}(v)$. Thus the probability of the second case is $[1 - p_{n-v}(v)]/2^{v+1}$. Combining these two cases, the desired relation

$$p_n(v) = p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}}$$

obtains. The observation that $p_n(v) = 0$ if $v > n$ is trivial.

We see with the help of the formulas proved above that $p_n(v) - p_{n-1}(v)$ is always $\le 1/2^{v+1}$, and hence that the sum

$$\sum_{i=v}^{n} \left[ p_i(v) - p_{i-1}(v) \right] = p_n(v)$$

is not in excess of $(n - v + 1)/2^{v+1}$ since there are $n - v + 1$ terms in the sum; since, moreover, each $p_n(v)$ is a probability, it is not greater than 1. Hence we have

$$p_n(v) \le \min\left[ 1, \frac{n - v + 1}{2^{v+1}} \right].$$

Finally we turn to the question of getting an upper bound on $a_n = \sum_{v=1}^{n} p_n(v)$. Choose $K$ so that $2^K \le n \le 2^{K+1}$. Then

$$a_n = \sum_{v=1}^{K-1} p_n(v) + \sum_{v=K}^{n} p_n(v) \le \sum_{v=1}^{K-1} 1 + \sum_{v=K}^{n} \frac{n}{2^{v+1}} \le K - 1 + \frac{n}{2^K}.$$

This last expression is clearly linear in $n$ in the interval $2^K \le n \le 2^{K+1}$, and it is $= K$ for $n = 2^K$ and $= K + 1$ for $n = 2^{K+1}$, i.e. it is $= \log_2 n$ at both ends of this interval. Since the function $\log_2 n$ is everywhere concave from below, it follows that our expression is $\le \log_2 n$ throughout this interval. Thus $a_n \le \log_2 n$. This holds for all $K$, i.e. for all $n$, and it is the inequality which we wanted to prove.

For our case $n = 40$ we have $a_n \le \log_2 40 \approx 5.3$, i.e. an average length of about 5 for the longest carry sequence. (The actual value of $a_{40}$ is 4.62.)

5.7. Having discussed the addition, we can now go on to the subtraction. It is convenient to discuss at this point our treatment of negative numbers, and in order to do that right, it is desirable to make some observations about the treatment of numbers in general.

Our numbers are 40 digit aggregates, the left-most digit being the sign digit, and the other digits genuine binary digits, with positional values $2^{-1}, 2^{-2}, \ldots, 2^{-39}$ (going from left to right). Our accumulator will, however, treat the sign digit, too, as a binary digit with the positional value $2^0$, at least when it functions as an adder. For numbers between 0 and 1 this is clearly all right: The left-most digit will then be 0, and if 0 at this place is taken to represent a + sign, then the number is correctly expressed with its sign and 39 binary digits.

Let us now consider one or more unrestricted 40 binary digit numbers. The accumulator will add them, with the digit-adding and the carrying mechanisms functioning normally and identically in all 40 positions. There is one reservation, however: If a carry originates in the left-most position, then it has nowhere to go from there (there being no further positions to the left) and is "lost". This means, of course, that the addend and the augend, both numbers between 0 and 2, produced a sum exceeding 2, and the accumulator, being unable to express a digit with a positional value $2^1$, which would now be necessary, omitted 2. That is, the sum was formed correctly, excepting a possible error 2. If several such additions are performed in succession, then the ultimate error may be any integer multiple of 2. That is, the accumulator is an adder which allows errors that are integer multiples of 2; it is an adder modulo 2.

It should be noted that our convention of placing the binary point immediately to the right of the left-most digit has nothing to do with the structure of the adder. In order to make this point clearer we proceed to discuss the possibilities of positioning the binary point in somewhat more detail. We begin by enumerating the 40 digits of our numbers (words) from left to right. In doing this we use an index $h = 1, \ldots, 40$. Now we might have placed the binary point just as well between digits $j$ and $j + 1$, $j = 0, \ldots, 40$. Note, that $j = 0$ corresponds to the position at the extreme left (there is no digit $h = j = 0$); $j = 40$ corresponds to the position at the extreme right (there is no position $h = j + 1 = 41$); and $j = 1$ corresponds to our above choice. Whatever our choice of $j$, it does not affect the correctness of the accumulator's addition. (This is equally true for subtraction, cf. below, but not for multiplication and division, cf. 5.8.) Indeed, we have merely multiplied all numbers by $2^{j-1}$ (as against our previous convention), and such a "change of scale" has no effect on addition (and subtraction). However, now the accumulator is an adder which allows errors that are integer multiples of $2^j$; it is an adder modulo $2^j$. We mention this because it is occasionally convenient to think in terms of a convention which places the binary point at the right end of the digital aggregate. Then $j = 40$, our numbers are integers, and the accumulator is an adder modulo $2^{40}$. We must emphasize, however, that all of this, i.e. all attributions of values to $j$, are purely convention, i.e. it is solely the mathematician's interpretation of the functioning of the machine and not a physical feature of the machine. This convention will necessitate measures that have to be made effective by actual physical features of the machine, i.e. the convention will become a physical and engineering reality only when we come to the organs of multiplication.

We will use the convention $j = 1$, i.e. our numbers lie in 0 and 2 and the accumulator adds modulo 2.

This being so, these numbers between 0 and 2 can be used to represent all real numbers modulo 2. Any real number $x$ agrees modulo 2 with one and only one number $\bar{x}$ between 0 and 2 or, to be quite precise: $0 \le \bar{x} < 2$. Since our addition functions modulo 2, ...
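The claim that the position of the binary point is purely a matter of interpretation can be checked with a short sketch. The code below adds 40-digit patterns while discarding any carry out of the left-most position, and then reads the result with the binary point between digits $j$ and $j + 1$. The function names and the use of Python integers and `Fraction` are illustrative assumptions, not a description of the machine's circuitry.

```python
from fractions import Fraction

WORD_BITS = 40
MASK = (1 << WORD_BITS) - 1


def accumulate(patterns):
    """Add 40-digit patterns; a carry out of the left-most digit is lost."""
    total = 0
    for pattern in patterns:
        total = (total + pattern) & MASK
    return total


def value(pattern, j):
    """Read a pattern with the binary point between digits j and j + 1 (j = 0..40)."""
    return Fraction(pattern, 1 << (WORD_BITS - j))


# With j = 1 the patterns below stand for 1.5 and 1.0; their true sum, 2.5,
# exceeds 2, so the accumulator holds 0.5, i.e. the sum modulo 2.
a = 3 << 38   # 1.5 under the convention j = 1
b = 1 << 39   # 1.0 under the convention j = 1
result = accumulate([a, b])
```

The same bit pattern in `result` is the correct sum modulo $2^j$ for every choice of $j$; only the reading of the pattern changes, not the addition.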


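As a numerical check on the carry estimate of 5.6, the recurrence $p_n(v) = p_{n-1}(v) + [1 - p_{n-v}(v)]/2^{v+1}$, with $p_n(v) = 0$ for $v > n$, can be evaluated directly and summed to give $a_n = \sum_{v=1}^{n} p_n(v)$. The sketch below assumes only that recurrence and that definition; the function names are illustrative. Its result for $n = 40$ can be compared with the bound $\log_2 40$ and with the value quoted in the text.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n: int, v: int) -> float:
    """Probability of a carry sequence of length >= v in a sum of two n-digit words."""
    if v > n:
        return 0.0
    return p(n - 1, v) + (1.0 - p(n - v, v)) / 2 ** (v + 1)


def average_longest_carry(n: int) -> float:
    """a_n, the sum of p_n(v) for v = 1..n."""
    return sum(p(n, v) for v in range(1, n + 1))


a_40 = average_longest_carry(40)
bound = math.log2(40)
```

For $v = 1$ the recurrence reduces to $p_n(1) = 1 - (3/4)^n$, which gives a quick sanity check on the implementation.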
The round-off procedures, which we can use in this connection, fall into two broad classes. The first class is characterized by its ...

... whether the random number in question lies in the interval $0, 1/2^{n+1}$, or in the interval ... $1/2^n$. Hence comparing with the "rounded-off" value, we have a difference random in the intervals $0, 1/2^{n+1}$ and $0, -1/2^{n+1}$, i.e. in the interval $-1/2^{n+1}, 1/2^{n+1}$. Hence its mean is 0 and its variance $(1/12)\,2^{-2n}$.

If the number ... When applied to a number of the form ... the second one, $xy = (.\xi_1 \ldots \xi_n \xi_{n+1} \ldots \xi_{2n})$, i.e. $p = n$. Hence for the division both methods are applicable without modification. In multiplication a bias of the order of $1/2^{2n}$ may be introduced. We have seen that it is pointless to insist on removing biases of this size. We will therefore use the unmodified methods in this case, too. It should be noted that the bias in the case of multiplication can be removed in various ways. However, for the reasons set forth above, we shall not complicate the machine by introducing such corrections.

Thus we have two standard "round-off" methods, both unbiased to the extent to which we need this, and with the variances $(1/3)\,2^{-2n}$ and $(1/12)\,2^{-2n}$, that is, with the dispersions $(1/\sqrt{3})(1/2^n) = 0.58$ times the last digit and $(1/2\sqrt{3})(1/2^n) = 0.29$ times the last digit. The first one requires no carry facilities, the second one requires them.

Inasmuch as we propose to form the product $x'y'$ in the accumulator, which has carry facilities, there is no reason why we should not adopt the rounding scheme described above which has the smaller dispersion, i.e. the one which may induce carries. In the case, however, of division we wish to avoid schemes leading to carries since we expect to form the quotient in the arithmetic register, which does not permit of carry operations. The scheme which we accordingly adopt is the one in which $\omega_n$ is replaced by 1. This method has the decided advantage that it enables us to write down the approximate quotient as soon as we know its first $(n - 1)$ digits. It will be seen in 5.14 and 6.6.4 below that our procedure for forming the quotient of two numbers will always lead to a result that is correctly rounded in accordance with the decisions just made. We do not consider as serious the fact that our rounding scheme in the case of division has a dispersion twice as large as that in multiplication since division is a far less frequent operation.

A final remark should be made in connection with the possible, occasional need of carrying more than $n = 39$ digits. Our logical control is sufficiently flexible to permit treating $k$ ($= 2, 3, \ldots$) words as one number, and thus effecting $n = 39k$. In this case the round-off has to be handled differently, cf. Chapter 9, Part II. The multiplier produces all 78 digits of the basic 39 by 39 digit multiplication: The first 39 in the Ac, the last 39 in the AR. These must then be manipulated in an appropriate manner. (For details, cf. 6.6.3 and 9.9-9.10, Part II.) The divider works for 39 digits only: In forming $x/y$, it is necessary, even if $x$ and $y$ are available to $39k$ digits, to use only 39 digits of each, and a 39 digit result will appear. It seems most convenient to use this result as the first step of a series of successive approximations. The successive improvements can then be obtained by various means. One way consists of using the well known iteration formula (cf. 5.4). For $k = 2$ one such step will be needed, for $k = 3, 4$, two steps, for $k = 5, 6, 7, 8$ three steps, etc. An alternative procedure is this: Calculate the remainder, using the approximate, 39 digit, quotient and the complete, $39k$ digit, divisor and dividend. Divide this again by the approximate, 39 digit, divisor, thus obtaining essentially the next 39 digits of the quotient. Repeat this procedure until the full $39k$ desired digits of the quotient have been obtained.

5.13. We might mention at this time a complication which arises when a floating binary point is introduced into the machine. The operation of addition which usually takes at most 1/10 of a multiplication time becomes much longer in a machine with floating binary since one must perform shifts and round-offs as well as additions. It would seem reasonable in this case to place the time of an addition as about 1/3 to 1/2 of a multiplication. At this rate it is clear that the number of additions in a problem is as important a factor in the total solution time as are the number of multiplications. (For further details concerning the floating binary point, cf. 6.6.7.)

5.14. We conclude our discussion of the arithmetic unit with a description of our method for handling the division operation. To perform a division we wish to store the dividend in SR, the partial remainder in Ac and the partial quotient in AR. Before proceeding further let us consider the so-called restoring and non-restoring methods of division. In order to be able to make certain comparisons, we will do this for a general base $m = 2, 3, \ldots$.

Assume for the moment that divisor and dividend are both positive. The ordinary process of division consists of subtracting from the partial remainder (at the very beginning of the process this is, of course, the dividend) the divisor, repeating this until the former becomes smaller than the latter. For any fixed positional value in the quotient in a well-conducted division this need be done at most $m - 1$ times. If, after precisely $k = 0, 1, \ldots, m - 1$ repetitions of this step, the partial remainder has indeed become less than the divisor, then the digit $k$ is put in the quotient (at the position under consideration), the partial remainder is shifted one place to the left, and the whole process is repeated for the next position, etc. Note that the above comparison of sizes is only needed at $k = 0, 1, \ldots, m - 2$, i.e. before step 1 and after steps $1, \ldots, m - 2$. If the value $k = m - 1$, i.e. the point after step $m - 1$, is at all reached in a well-conducted division, then it may be taken for granted without any test, that the partial remainder has become smaller than the divisor, and the operations on the position under consideration can therefore be concluded. (In the binary system, $m = 2$, there is thus only one step, and only one comparison of sizes, before this step.) In this way this scheme, known as the restoring scheme, requires a maximum of $m - 1$ comparisons and utilizes the digits $0, 1, \ldots, m - 1$ in each place in the quotient.

The difficulty of this scheme for machine purposes is that usually the only economical method for comparing two numbers as to size is to subtract one from the other. If the partial remainder $r_n$ were less than the dividend $d$, one would then have to add $d$ back into $r_n - d$ in order to restore the remainder. Thus at every stage an unnecessary operation would be performed. A more symmetrical scheme is obtained by not restoring. In this method (from here on we need not assume the positivity of divisor and dividend) one compares the signs of $r_n$ and $d$; if they are of the same sign, the dividend is repeatedly subtracted from the remainder until the signs become opposite; if they are opposite, the dividend is repeatedly added to the remainder until the signs again become like. In this scheme the digits that may occur in a given place in the quotient are evidently $\pm 1, \pm 2, \ldots, \pm(m - 1)$, the positive digits corresponding to subtractions and the negative ones to additions of the dividend to the remainder.

Thus we have $2(m - 1)$ digits instead of the usual $m$ digits. In the decimal system this would mean 18 digits instead of 10. This is a redundant notation. The standard form of the quotient must therefore be restored by subtracting from the aggregate of its positive digits the aggregate of its negative digits. This requires carry facilities in the place where the quotient is stored.

We propose to store the quotient in AR, which has no carry facilities. Hence we could not use this scheme if we were to operate in the decimal system.

The same objection applies to any base $m$ for which the digital representation in question is redundant, i.e. when $2(m - 1) > m$. Now $2(m - 1) > m$ whenever $m > 2$, but $2(m - 1) = m$ for $m = 2$. Hence, with the use of a register which we have so far contemplated, this division scheme is certainly excluded from the start unless the binary system is used.

Let us now investigate the situation in the binary system. We inquire if it is possible to obtain a quasi-quotient by using the non-restoring scheme and by using the digits 1, 0 instead of 1, $-1$. Or rather we have to ask this question: Does this quasi-quotient bear a simple relationship to the true quotient?

Let us momentarily assume this question can be answered affirmatively and describe the division procedure. We store the divisor initially in Ac, the dividend in SR and wish to form the quotient in AR. We now either add or subtract the contents of SR into Ac, according to whether the signs in Ac and SR are opposite or the same, and insert correspondingly a 0 or 1 in the right-hand place of AR. We then shift both Ac and AR one place left, with electronic shifters that are parts of these two aggregates.
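The question of whether the quasi-quotient bears a simple relationship to the true quotient can be explored with a small sketch. The code below is a textbook formulation of binary non-restoring division, not a transcription of the register-level procedure: it assumes a positive divisor `d`, a dividend `x` with $-d \le x < d$, and exact integer arithmetic, and all names are illustrative. Each step records the digit 1 for a subtraction and 0 for an addition, so the recorded digits stand for the signed digits $+1$ and $-1$.

```python
from fractions import Fraction
from typing import List, Tuple


def nonrestoring_divide(x: int, d: int, n: int) -> Tuple[List[int], int]:
    """Run n non-restoring steps; return the quasi-quotient digits and the final remainder."""
    if d <= 0 or not (-d <= x < d):
        raise ValueError("requires d > 0 and -d <= x < d")
    r = x
    bits = []
    for _ in range(n):
        if r >= 0:              # signs agree: record 1, double and subtract
            bits.append(1)
            r = 2 * r - d
        else:                   # signs differ: record 0, double and add
            bits.append(0)
            r = 2 * r + d
    return bits, r


def true_quotient(bits: List[int]) -> Fraction:
    """Convert digits 1, 0 (standing for +1, -1) into an ordinary binary fraction."""
    n = len(bits)
    quasi = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))
    return 2 * quasi - 1 + Fraction(1, 2 ** n)
```

Under these assumptions the relationship is indeed simple: if $Q'$ is the quasi-quotient read as a binary fraction, the signed-digit quotient equals $2Q' - 1 + 2^{-n}$, and it differs from $x/d$ by exactly $r_n/(d \cdot 2^n)$, which is at most $2^{-n}$ in magnitude.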

At this point we interrupt the discussion to note this: multiplication required an ability to shift right in both Ac and AR (cf. 5.8). We have now found that division similarly requires an ability to shift left in both Ac and AR. Hence both organs must be able to shift both ways electronically. Since these abilities have to be present for the implicit needs of multiplication and division, it is just as well to make use of them explicitly in the form of explicit orders. These are the orders 20, 21 of Table 1, and of Table 2, Part II. It will, however, turn out to be convenient to arrange some details in the shifts, when they occur explicitly under the control of those orders, differently from when they occur implicitly under the control of a multiplication or a division. (For these things, cf. the discussion of the shifts near the end of 5.8 and in the third remark below on one hand, and in the third remark in 7.2, Part II, on the other hand.)

Let us now resume the discussion of the division. The process described above will have to be repeated as many times as the number of quotient digits that we consider appropriate to produce in this way. This number is likely to be 39 or 40; we will determine the exact ...




... into two groups: Those that specify operations which are performed within the computer and those that specify operations ... than the input and output operations, and hence they will be discussed more in detail than the latter (which are treated briefly ... operations the reasons for building them into the control have already been given. In this section we will give reasons for building the other operations into the control and will explain in the case ...

... [S(x) → Ac+, S(x) → Ac−, S(x) → Ah+, S(x) → Ah−, S(x) → Ac+M, S(x) → Ac−M, S(x) → Ah+M, S(x) → Ah−M] involves the following possible four steps:

First: Clear SR and transfer into it the number at S(x).

Second: Clear Ac if the order contains the symbol c; do not clear Ac if the order contains the symbol h.

Third: Add the number in SR or its negative (i.e. in our present system its complement with respect to $2^1$) into Ac. If the order does not contain the symbol M, use the number in SR or its negative according to whether the order contains the symbol + or −. If the order contains the symbol M, use the number in SR or its negative according to whether the sign of the number in SR and the symbol + or − in the order do or do not agree.

Fourth: Perform a complete carry.

Building the last four addition operations (those containing the symbol M) into the control is fairly simple: It calls only for one extra comparison (of the sign in SR and the + or − in the order, cf. the third step above), and it requires, therefore, only a few tubes more than required for the first four addition operations (those not containing the symbol M).

It should be noted that these operations can be programmed out of the other operations of Table 1 with correspondingly few orders (three for absolute value and five for minus absolute value), so that some further justification for building them in is required. The absolute value order is frequently used in connection with the orders L and R (see 6.6.7), while the minus absolute value order makes the detection of a zero very simple by merely detecting the sign of $-|N|$. (If $-|N| \ge 0$, then $N = 0$.)

6.6.2. The operation of S(x) → R involves the following two steps:

First: Clear SR, and transfer S(x) to it.

Second: Clear AR and add the number in the Selectron register into it.

The operation of R → Ac merits more detailed discussion, since there are alternative ways of removing numbers from AR. Such numbers could be taken directly to the Selectrons as well as into Ac, and they could be transferred to Ac in parallel, in sequence, or in sequence parallel. It should be recalled that while most of the numbers that go into AR have come from the Selectrons and thus need not be returned to them, the result of a division and the right-hand 39 digits of a product appear in AR. Hence while an operation for withdrawing a number from AR is required, it is relatively infrequent and therefore need not be particularly fast. We are therefore considering the possibility of transferring at least partially in sequence and of using the shifting properties of Ac and of AR for this. Transferring the number to the Selectron via the accumulator is also desirable if the dual machine method of checking is employed, for it means that even if numbers are only checked in their transit through the accumulator, nevertheless every number going into the Selectron is checked before being placed there.
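The four steps of the addition orders, and the zero test through $-|N|$, can be sketched as follows. The sketch treats a word as a 40-digit pattern with the sign digit at the left and negation as the complement with respect to 2 (in pattern terms, with respect to $2^{40}$); the function names and the boolean encoding of the order symbols c/h, +/− and M are assumptions made for illustration.

```python
WORD_BITS = 40
MASK = (1 << WORD_BITS) - 1
SIGN = 1 << (WORD_BITS - 1)


def negate(word: int) -> int:
    """Complement with respect to 2 (2**40 in pattern terms)."""
    return (-word) & MASK


def is_negative(word: int) -> bool:
    return bool(word & SIGN)


def add_order(ac: int, sr: int, clear: bool, minus: bool, magnitude: bool) -> int:
    """Return the new Ac after one addition order.

    clear:     True for the symbol c, False for h
    minus:     True for the symbol -, False for +
    magnitude: True if the order contains the symbol M
    """
    if clear:
        ac = 0
    if magnitude:
        # use the negative when SR's sign and the order's sign do not agree
        use_negative = is_negative(sr) != minus
    else:
        use_negative = minus
    operand = negate(sr) if use_negative else sr
    return (ac + operand) & MASK


def is_zero(n: int) -> bool:
    """N = 0 exactly when -|N| is not negative."""
    minus_abs = add_order(0, n, clear=True, minus=True, magnitude=True)
    return not is_negative(minus_abs)
```

The only extra work for the M orders is the single comparison between the sign of SR and the sign in the order, which mirrors the remark that only a few more tubes are needed.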

The operation

S(x)

is

X R —* Ac involves the following six

steps:

Clear SR and transfer S(x) (the multiplicand) into it. Second: Thirty-nine steps, each of which consist of the two following parts: (a) Add (or rather shift) the sign digit of SR into First:

the partial product in Ac, or add all but the sign digit of SR into the partial product in Ac depending upon whether the right-most or 1 and effect the appropriate carries, (b) Shift digit in AR is





Ac and AR

the sign digit of Ac with a and the of the immediately right sign digit (positional value with the previously right-most digit of Ac. (There are ways

digit of

2" 1 )

to save time digit in

to the right,

fill

AR

Ar

by merging these two operations when the right-most but we will not discuss them here more fully.)

is 0,

Third: If the sign digit in

SR

is

1 (i.e.



),

then inject a carry

Chapter 4

into the right-most stage of

Ac and place

a

1

into the sign digit

of Ac.

Fifth:

If

If

a partial carry system

AR

is 1 (i.e.



),

then sub-

Add

or subtract the contents of

in the

main

in

into Ac, depending on

the same alternative as above. Fourth: Fill the right-most digit of

was employed

SR

its

AR

with a

1,

and change

sign digit.

necessary at the end. Sixth: The appropriate round-off must be effected. (Cf. Chapter Part II, for details, where it is also explained how the sign digit

For the purpose of timing the 39 steps involved in division a 6 = 64) will be built six-stage counter (capable of counting to 2 into the control. This same counter will also be used for timing

treated as part of the round-off

the 39 steps of multiplication, and possibly for controlling Ac when a number is being transferred between it and a tape in either

process, then a complete carry

of the Arithmetic register

is

is

process.) It will

be noted that since any number held in Ac at the begin-

ning of the process to

depending on whether there was disagreement or agreement (a), (c)

the original sign digit of tract the contents of SR from Ac. Fourth:

9,

Preliminary discussion of the logical design of an electronic computing instrument

is

gradually shifted into

accumulate sums of products

Ac

in

AR,

it is

impossible without storing the various

products temporarily in the Selectrons. While this is undoubtedly a disadvantage, it cannot be eliminated without constructing an extra register, and this does not at this moment seem worthwhile.

On the other hand, saving the right-hand 39 digits of the answer accomplished with very little extra equipment, since it means -39 1 connecting the 2 stage of Ac to the 2" stage of AR during the is

shift operation.

simplifies the

The advantage

of saving these digits

is

handling of numbers of any number of digits

that

it

in the

direction (see

and Ap'

6.8.).

The

6.6.5.

—»

three substitution operations [At

involve transferring

S(x)]

all

—>

S(x),

or part of the

Ap —»

S(x),

number held

in Ac into the Selectrons. This will be done by means of gate tubes connected to the registering flip-flops of Ac. Forty such tubes are needed for the total substitutions, At—» S(x). The partial substitu-

tion

Ap —»

S(x)

digits of the

and Ap' —»

number held

in the left-hand

S(x) requires that

in

Ac be

the left-hand twelve

substituted in the proper places

and right-hand orders, respectively. This may be

done by means of extra gate tubes, or by shifting the number in Ac and using the gate tubes required for At —» S(x). (This scheme

of 39fc binary

needs some additional elaboration, when the order directing and

an integer) and sign can be divided into k parts, each part being placed in a separate Selectron position. Addition and subtraction of such numbers may be programmed out of a

the order suffering the substitution are the two successive halves of the same word; i.e. when the latter is already in FR at the time

series of additions or subtractions of the 39-digit parts, the carry-

effected in the Selectrons

computer

the last part of 5.12).

(cf.

digits (where k

Any number

is

over being programmed by means of Cc —» S(x) and Cc' -* S(x) operations. (If the 2° stage of Ac registers negative after the addition of

two

dure

may be

followed in multiplication

parts.) if

all

and hence

it is

one of the

ways described at the end of 5.12.

The operation of division Ac + S(x) —* R involves the four following steps: First: Clear SR and transfer S(x) (the divisor) into it. 6.6.4.

The importance

to coding

remainder) and of SR, and sense whether they agree or not. (b) Shift Ac and AR left. In this process the previous sign digit of

quence

Fill

the right-most digit of

and the right-most

digit of

AR

Ac

(after the shift)

(before the shift) with

possible the coding of classes of problems in contrast

each individual problem separately. Because Ap -> S(x) S(x) are available, any program sequence may be stated

in general

for the

0,

make

Third: Thirty-nine steps, each of which consists of the following

is lost.

form (that

is,

without Selectron location designations

numbers being operated on) and the Selectron locations of the numbers to be operated on substituted whenever that seis

used. As an example, consider a general code for nth

m

with a

order integration of

or

independent variable

1,

open.)

remove a very sizeable burden from the person coding problems, for they

Sense the signs of the contents of Ac (the partial

Ac

still

of the partial substitution operations can

wise conveniently perform, such as making use of a function table stored in the Selectron memory. Furthermore, these operations

and Ap' —»

(a)

at the next step in

hardly be overestimated. It has already been pointed out (3.3) that they allow the computer to perform operations it could not other-

Second: Clear AR.

three parts:

become operative

FR. There are various ways to take care of this complication, either

decisions in this respect are

39fc digit division in

planned to program

has already reached CR, to

78 digits of the

39-digit parts are kept, as is planned. (For the cf. details, Chapter 9, Part II.) Since it would greatly complicate the computer to make provision for holding and using a 78 digit

dividend,

becomes operative in CR, so that the substitution comes too late to alter the order which

by some additional equipment or by appropriate prescriptions in coding. We will not discuss them here in more detail, since the

two

product of the

the former

A similar proce-

39-digit parts, a carry-over has taken place

2~ 39 must be added to the sum of the next

when

t,

total differential equations for p steps of formulated in advance. Whenever a prob-

115

The

116 Part 2

instruction-set processor: main-line

Section

computers

Processors with one address per instruction

1

coded for the computer, the general can be inserted into the statement of the integration sequence instructions for telling the sequence with coded problem along

point since a different scale factor does not need to be

where

the

lem requiring

it

will

this rule is

be located

in the

memory

[so that

the proper S(x)

Cm—

* S(x), etc.]. designations will be inserted into such orders as Whenever this sequence is to be used by the computer it will

automatically substitute the correct values of m, n, p and At, as well as the locations of the boundary conditions and the descriptions of the differential equations, into the general sequence. (For

the details of this particular procedure, cf. Chapter 13, Part II.) A library of such general sequences will be built up, and facilities provided for convenient insertion of any of these into the coded statement of a problem (cf. 6.8.4). When such a scheme is used, only the distinctive features of a problem need be coded.

6.6.6. The manner in which the control shift operations [Cm → S(x), Cm' → S(x), Cc → S(x), and Cc' → S(x)] are realized has been discussed in 6.4 and needs no further comment.

6.6.7.

One basic question which must be decided before a computer is built is whether the machine is to have a so-called floating binary (or decimal) point. While a floating binary point

for

remembered

each number.

To program a floating binary point involves detecting where the first zero occurs in a number in Ac. Since Ac has shifting facilities this can best be done by means of them. In terms of the operations previously described this would require taking the given number out of Ac and performing a suitable arithmetical operation on it: for a (multiple) right shift a multiplication, for a (multiple) left shift either one division, or as many doublings (i.e. additions) as the shift has stages.

However, these operations are inconvenient and time-consuming, so we propose to introduce two operations (L and R) in order that this (i.e. the single left and right shift) can be accomplished directly. These operations make use of facilities already present in Ac and hence add very little equipment to the computer.
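The contrast between shifting by arithmetic and shifting directly can be sketched in a few lines. This is only an illustration under stated assumptions: numbers are modelled as non-negative integers holding a 39-bit fixed-point fraction, and the function names are hypothetical, not operations of the machine.

```python
FRACTION_BITS = 39  # assumed width of the fixed-point fraction


def left_shift_by_doubling(value, stages):
    """Left shift carried out as repeated doubling, i.e. one addition per stage."""
    for _ in range(stages):
        value = value + value
    return value


def right_shift_by_multiplication(value, stages):
    """Right shift carried out as one multiplication by the constant 2**-stages."""
    factor = 1 << (FRACTION_BITS - stages)  # 2**-stages as a fixed-point fraction
    return (value * factor) >> FRACTION_BITS


def direct_left_shift(value):
    """A single-place left shift done in one step, in the spirit of operation L."""
    return value << 1


def direct_right_shift(value):
    """A single-place right shift done in one step, in the spirit of operation R."""
    return value >> 1
```

The arithmetic versions need either a full multiplication or one addition per stage, whereas the direct versions need a single step per place, which is the saving the text argues for.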

It should be noted that in many instances a single use of L and possibly of R will suffice in programming a floating binary point. For if the two factors in a multiplication have no superfluous zeros, the product will have at most one superfluous zero (if 1/2 ≤ X < 1 and 1/2 ≤ Y < 1, then 1/4 ≤ XY < 1). This is




SC

/>^SC

[12]

^

array

circuits

yGontrol

SE

circuits

/

»CC

Combinatorial circuits [9]

\

Fli

State system

P.-

'[14]

feed back)

Programming

Inverter

indicates

figure

of

still

level (ISP)

interpretation is given in Appendix 1 of this chapter and the specification of the programming machine. In addition, it constrains the physical machine's behavior to have a particular

The ISP

Multivibrator [I4J [active component)

[15]

is

R (passive component) X]

and behavior

look at these primitives (although

.

Transistor

[

Mp

We should

together as a C) at the register-transfer level.

[15]

Electrical circuits

to describe the internal structure

and Pc.

[10,11]

(data operation)

NAND

needed

level

CC

f£ c

(with

the

\

1

is

level

A

data

operation/

Switching

[13]

fControi-

ISP.

number of instance

The ISP has been discussed earlier in the chapter.

Fig. 6. DEC PDP-8 hierarchy of descriptions.

Register-transfer level

The C can also be represented at the register-transfer level by using PMS. Figure 4 (by DEC) shows the register-transfer level;

Abstract representations

Figure 6 also lists some of the methods used to represent the physical computer abstractly at the different description levels. As mentioned previously, only a small part of the PDP-8 description tree is represented here. The many documents, schematics, diagrams, etc., which constitute the complete representation of even this small computer include logic diagrams, wiring lists, circuit schematics and printed-circuit board layout masks, production description diagrams, production parts lists, testing specifications, programs for testing and diagnosing faults, and manuals for modification, production, maintenance, and use. As the discussion continues down the abstract description tree, the reader will observe that the tree conveniently represents the constituent objects of each level and their interconnection at the next highest level.

Each level in the abstract-description tree will be described in order.

The PMS level

The simplified PMS structure in Fig. 3 has been reduced from Fig. 1. The computer is small enough so that the physical delineation of the PMS components, such as K's and S's, is less pronounced than in larger systems. In fact, in the case of the S('Memory Bus, 'I/O Bus), the S's are actually within the K and


Fig. 8. DEC PDP-8 register-transfer-level PMS diagram. [Diagram: the Pc registers Link/L, Accumulator/AC, Memory buffer/MB, Memory address/MA, Program counter/PC and Instruction register/IR, each with its permitted data operations; Mp(core) with its core stack and sense amplifiers; the links L('Memory Bus) and L('I/O Bus); and T.console (switches and lights).]

only registers, operations, and L's are important at this level. We still lack information about the conditions under which operations are evoked.

Figure 8 is a PMS diagram of Pc-Mp registers. Here (although we do not bother with electrical pulse voltages and polarities) we show considerably more detail than in Fig. 4. We declare the Pc state (including the temporary register) within Pc. The figure also gives the permissible data operations, D, which are permitted on the registers. It should be clear from this that the logical design level for the registers and the operators can easily be reached. The K logic design cannot be reached until we use the programming level constraints (ISP), thus defining the conditions for evoking the data operators.

The core memory. The Mp structure is given in Fig. 8. A more detailed block diagram which shows the core stack with its twelve 64 X 64 1-bit core planes is needed. Such a diagram, though still a functional block diagram, takes on some of the aspects of a circuit diagram because a core memory is largely circuit-level details. The Mp (Fig. 9) consists of the component units: the two address decoders (which select 1 each of 64 outputs in the X and Y axis directions of the coincident current memory); selection switches (which transform a coincident logic address into a high-current path to switch the magnetic cores); the 12 inhibit drivers (which switch a high current or no current into a plane when either a 0 or 1 is rewritten); 12 sense amplifiers (which take the induced low sense voltage from a selected core from a plane being switched or not switched and transform it into a 1 or 0); and the core stack, an array M[0:7777₈]. Since this is the only time the Mp is mentioned, Fig. 9 also includes the associated circuit-level hardware needed in the core-memory operation, such as power supplies, timing, and logic signal level conversion amplifiers. The timing signals are generated within Pc(K) and are shown together with Pc's clock in Fig. 10.

have selection current. Only one core

The process of reading a word from

the selected intersection

memory

A

12-bit selection address

is

established on the

MA unique num-

2

3

A

word

logic signal

high-current

a core

is

made

a

is

it),

is

at

=

switched to then a

bit within a core

addresses.

The read

each plane

Iswitching/2, and the current Iy

=

Iswitching.

amplifier

1.

X and Y selection and Y directions 64 x 12 cores

(by having Iswitching amperes is read at the output

plane [0:7777 8 ]. All 12 cores of the selected The sense time at which the sense

are reset to 0. is

observed

in effect creates

X

+

was present and

1

is

tms (memory

MB =
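The core-memory organization described above (two 1-of-64 address decoders, twelve 64 X 64 planes, a read that leaves the selected cores at 0, and a rewrite in which inhibit drivers hold selected cores at 0) can be modelled roughly as follows. This is a behavioural sketch only, not the PDP-8 circuit: the class and function names are hypothetical, and the assignment of the high and low six address bits to the X and Y axes is an assumption.

```python
WORD_BITS = 12    # one core plane per bit of the word
AXIS_LINES = 64   # each decoder selects 1 of 64 lines


def decode_address(address):
    """Split a 12-bit address into X and Y line numbers (axis assignment assumed)."""
    if not 0 <= address <= 0o7777:
        raise ValueError("address must lie in 0..7777 octal")
    return address >> 6, address & 0o77


class CoreStack:
    """Behavioural model of twelve 64 x 64 one-bit core planes."""

    def __init__(self):
        self.planes = [
            [[0] * AXIS_LINES for _ in range(AXIS_LINES)] for _ in range(WORD_BITS)
        ]

    def read(self, address):
        """Destructive read: sense each selected core, leaving it at 0."""
        x, y = decode_address(address)
        word = 0
        for bit, plane in enumerate(self.planes):
            sensed = plane[x][y]   # sense amplifier reports a 1 only if the core switches
            plane[x][y] = 0        # the read drives the selected core to 0
            word |= sensed << bit
        return word

    def write(self, address, word):
        """Rewrite: an inhibited plane keeps its selected core at 0, others go to 1."""
        x, y = decode_address(address)
        for bit, plane in enumerate(self.planes):
            inhibited = ((word >> bit) & 1) == 0
            plane[x][y] = 0 if inhibited else 1

    def read_restore(self, address):
        """A full memory cycle: read the word, then write it back."""
        word = self.read(address)
        self.write(address, word)
        return word
```

A bare `read` without the following `write` loses the stored word, which is why a read in a memory of this kind is paired with a rewrite.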

Fig. 2. Digit-delay circuit.

one

off,

deferring the charging of

volts

3ai sec

,

The

D2

be noted that the reset pulse

-*J

in cascade in Pegasus

digit, this

and charges up a storage condenser, C, t the end of the next clock pulse by a 'reset'

to charge the storage condenser. This merely has the effect of

-150 -150--150 volts

cut off at the end of the

computer supply whose amplitude and phasing clock pulse is shown in Fig. 3.

Output 2 Lood pins

volts

is

computer

D

pulse applied through

.

Input clock

+ 200

When V,

flows through diodes

which

100

;330/int

kJT.

volts

of Pegasus, a quantity-production

a further gating with a clock pulse.

digits from the gate input circuit are applied to the of the anode voltage of which falls, so building up a grid Vj,

[_,

volts

••>

173

174

The

Part 2

instruction-set processor: main-line

Section 2

computers

Processors with a general register state

current in the meantime continuing to flow through the diodes with little loss in the stored energy of L, since the voltage across L is low at this time.

The output cathode-follower V2 is caught at −10 volts in the negative direction by a diode; this safeguards the crystal-diode circuits driven by it in the event of failure of the h.t. supply or V2, and it removes residual ripple on the bottom of the input waveform, and thus reduces the back voltage and hence leakage in diodes of gates driven by the output. The second output through a diode can be used in conjunction with similar outputs from other circuits and a resistor (pins 3 and 4) to make an 'or' (up to about 16-way).

In general, each output circuit has two available load resistors, disposed between direct and 'or' outputs according to a set of rules which are applied for each case. The number of units which can be driven by an output can vary between three and 16 according to circumstances; where more have to be driven than the rules allow, use is made of 'booster' cathode-followers available on one of the packages.

Fig. 5. The adder/subtracter. [Diagram labels: Add; Subtract; Carry suppression; Carry; (a '0' following a series of 1's); X + Y or X − Y (delayed one digit); symbols for digit delay, inverter, cathode follower and AND gate.]

Some examples of the use of the logical circuits

Two examples will be given, the first being a simple arrangement, the staticizor, which is used frequently, and the second being a complicated arrangement, the adder/subtracter, which is used infrequently. The symbols used to indicate the circuit units are shown in Figs. 2c and 5b.

The staticizor. The function of a staticizor is to remember the fact that a digit occurred at a particular time, for an indefinite period, the method generally used in Pegasus being shown in Fig. 4. A digit delay with a twin 'and' gate input has its output connected to one of its inputs. It is turned on by gate 1, which causes a digit to circulate as long as the inputs to gate 2 remain positive. It is normally turned off by an inverted pulse on one of the gate 2 inputs.
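The staticizor's behaviour can be illustrated with a small clocked model. It is a logical sketch only, not the valve circuit: the class name and the boolean treatment of "positive" leads are assumptions, as is the choice that gate 1 takes effect even when gate 2 is closed (the two gates are simply OR-ed into the digit delay).

```python
class Staticizor:
    """One-digit delay whose output is fed back through gate 2."""

    def __init__(self):
        self.stored = False  # digit currently circulating in the delay

    def clock(self, gate1_inputs, gate2_inputs):
        """Advance one digit time and return the digit now held.

        gate1_inputs: leads that must all be positive (True) to turn the unit on.
        gate2_inputs: leads that must all stay positive for the digit to recirculate.
        """
        turned_on = all(gate1_inputs)
        recirculated = self.stored and all(gate2_inputs)
        self.stored = turned_on or recirculated
        return self.stored
```

For example, one digit time with gate 1 open sets the unit; it then holds its digit for as long as every gate 2 lead stays positive, and drops it in the first digit time in which any gate 2 lead goes negative.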

[Fig. 4 labels: Staticizor is set if these leads are positive; staticizor is turned off if either of these leads is negative.]

The adder/subtracter. Figure 5 shows an adder/subtracter unit with inputs for X and Y and an output X + Y for the sum or X − Y for the difference. There are two further input control leads marked 'add' and 'subtract'. If the 'add' lead is held positive while the 'subtract' lead is held negative, the unit acts as an adder. If the 'subtract' lead is held positive and the 'add' lead negative, the unit acts as a subtracter. Carry suppression is controlled by the lead marked 'carry suppression'. Carries are allowed to propagate when this lead is held positive, so that a negative signal on this lead will suppress carry.
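The effect of the three control leads can be sketched as a digit-serial routine. This is a logical illustration, not the Pegasus circuit: the function name, the least-significant-digit-first ordering, and the use of booleans for positive and negative leads are all assumptions, and the one-digit delay of the real output is not modelled.

```python
def serial_add_subtract(x_bits, y_bits, subtract=False, carry_allowed=True):
    """Add or subtract two digit streams, least significant digit first.

    subtract=False models 'add' positive and 'subtract' negative;
    subtract=True models the opposite setting.
    carry_allowed=False models a negative 'carry suppression' lead.
    """
    carry = 0  # carry when adding, borrow when subtracting
    out = []
    for x, y in zip(x_bits, y_bits):
        if subtract:
            result = x - y - carry
            carry = 1 if result < 0 else 0
        else:
            result = x + y + carry
            carry = result >> 1
        out.append(result & 1)
        if not carry_allowed:
            carry = 0  # a suppressed carry never reaches the next digit
    return out


def to_bits(value, width):
    """Helper: integer to a list of digits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Helper: list of digits, least significant first, back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

With carries suppressed, the adder reduces to a digit-by-digit sum without carry, which is one way to see why a separate suppression lead is useful.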

Table 1 gives the digits appearing at the outputs of logical elements in the adder/subtracter unit for all combinations of input and carry digits when the unit is operating as an adder.

Fig. 4. The staticizor.

Arrangement of circuits based on packages

It was required to base the logical circuits on a standard size of package which could also be used for other circuits, e.g. a nickel-line 1-word store [Fairclough, 1956]. A unit which could accommodate three valves and had a 32-way plug was decided on; the

Chapter 9
The design philosophy of Pegasus, a quantity-production computer

Table 1. Digits at various internal points of the adder/subtracter unit when set to add, for all combinations of the input and carry digits

The magnetic-drum store and the circuit packages used with it are described in another paper [Merry and Maudsley, 1956], as is the nickel-line store [Fairclough, 1956].

The mechanical design of the packages

General form

Each standard package consists of three main parts, namely the valve panel, the component panel and the plug. The valve panel is an aluminium pressing, there being three types: a 3-valve type, a 2-valve type and a blank. The package type number is marked on the panel by two dots according to the standard resistor colour code.

The component panel houses up to 100 components, including small transformers, chokes and coils, the panel and the handle being made in one piece from sheet insulating material. This design provides a minimum resistance to airflow over the valves and gives ample protection to the valves against accidental damage.

The plugs and sockets are used in multiples of eight connections. Most of the packages have four plugs providing 32 connections, but up to 64 are possible in each package. The plug contacts are made of brass and are heavily silver-plated. The socket uses a proprietary valve-holder contact, which can readily be replaced if damaged. This combination of plug and socket has a consistently low contact resistance (0.003 ohm at 1 amp); the insertion and withdrawal force is about 4 oz per contact.

The wiring of the packages

At present packages are wired and soldered by hand. The wiring is point-to-point, and within the limitations of layout for efficient performance, wire lengths are standardized for mass production on automatic wire-cutting and stripping machines. The symmetry of the eyelet positions makes it possible to use components which are preformed to a standard pitch and would allow for automatic preforming and insertion of components. Experimental packages have been produced by photo-etched wiring and dip soldering.

Specification of the computer Pegasus

Summary specification

A detailed specification would cover the ground of the programming manual [Pegasus Programming Manual, Ferranti Ltd., London] and would be out of place here.

Pegasus is a binary serial-digital computer. The word length is 42 binary digits, of which 39 digits are used for a number and its sign (negative numbers are represented by their complements with respect to two), one digit is used for a parity check and the other two are gap digits. The length of an order is 19 binary digits, so that one word may consist of two orders, the remaining digit being a 'stop-go' digit. If the 'stop-go' digit is a '0', the computer will stop before obeying the orders in the word, but will proceed unhindered if the digit is a '1'.

There is a 2-level store, a magnetic drum holding 5120 words and an immediate-access or computing store of 55 single-word magnetostriction delay lines. An order is made up of seven N-digits, three X-digits, six F-digits and three M-digits, the N-digits being the most significant and the M-digits the least significant. The N-digits allow 128 addresses in the immediate-access store (of which only 63 are used). The registers in this store are shown in Fig. 8. The X-digits refer to one of the accumulators, the registers corresponding to N-addresses 0-7. Thus the order code is a 2-address code with one address referring to only a limited part of the store. The F-digits indicate the function of the order. A list of functions and their corresponding F values are given in the appendix of this chapter. The M-digits indicate a modifier for the order: they select one of the accumulators, and the modification process is to add certain parts of the contents of the selected accumulator to the order before it is

Fig. 7. Standard package. [Diagram label: valve mounting panel.]
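The order layout given in the summary specification (seven N-digits, three X-digits, six F-digits and three M-digits, with N most significant and M least significant) can be sketched as a pair of packing and unpacking routines. The names are hypothetical, and the sketch covers a single 19-digit order only; how the two orders and the 'stop-go' digit are arranged within the word is not stated here, so it is not modelled.

```python
from typing import NamedTuple

ORDER_BITS = 19  # 7 N-digits + 3 X-digits + 6 F-digits + 3 M-digits


class Order(NamedTuple):
    n: int  # register address in the computing store, 0..127
    x: int  # accumulator, 0..7
    f: int  # function, 0..63
    m: int  # modifier accumulator, 0..7


def encode_order(n, x, f, m):
    """Pack the four fields into a 19-digit order, N-digits most significant."""
    if not (0 <= n < 128 and 0 <= x < 8 and 0 <= f < 64 and 0 <= m < 8):
        raise ValueError("field out of range")
    return (n << 12) | (x << 9) | (f << 3) | m


def decode_order(order_bits):
    """Unpack a 19-digit order into its N, X, F and M fields."""
    if not 0 <= order_bits < (1 << ORDER_BITS):
        raise ValueError("an order is 19 binary digits long")
    return Order(
        n=(order_bits >> 12) & 0x7F,
        x=(order_bits >> 9) & 0b111,
        f=(order_bits >> 3) & 0o77,
        m=order_bits & 0b111,
    )
```

Since the F field is six digits wide, it is convenient (an assumption of this sketch, consistent with the text's references to orders such as 11, 66 and 67 and to groups 0, 1 and 4) to write function values as two octal digits, e.g. `encode_order(5, 3, 0o11, 0)`.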

Fig. 8. [Diagram labels: name of register; address of register; always zero; accumulators; single-word transfer; block transfers to and from main store. Fragments: All stored information ... parity digit, ... correctly stored.]

is the 3-address code, A + B → C. An examination of this form of code, however, shows that in many cases two of the addresses are the same, so that the order takes the 2-address form, A + B → A. A further examination shows that in a large proportion of cases the address A is confined to a very few addresses. This leads to the suggestion of a code of the form N + X → X, where X covers only a small part of the store while N covers the whole store. This will have the advantage of yielding a reasonably short order. In Pegasus two such orders are incorporated in one word, leaving sufficient digits to specify a modification register (a Mancunian B-line) in each order.

The extreme case of this code is, of course, the single-address code, where X is confined to one address, the accumulator. However, experience had convinced the programmers collaborating in the design of Pegasus that, with single-address codes, a large number of orders are concerned solely with transfers of numbers from one register to another; the single accumulator through which all numbers must pass and in which all operations have to be performed is a restriction. In the Manchester University computer the B-lines serve two very valuable but distinct purposes: they allow order modification and rudimentary arithmetic (such as counting) to be done without disturbing the accumulator. It was felt that fuller arithmetic and logical facilities on these B-lines would have been extremely valuable. The seven accumulators in Pegasus, used for modification and arithmetic, are a development of the B-line concept.

Having a large number of jump instructions greatly helps in organizing a programme. In particular, one order enables a jump to be made depending on the condition of an accumulator (being zero, for example), and another order on the complementary condition (being not zero). When only one of these orders is available it is necessary to think ahead to see whether or not the correct condition will be satisfied. Although the eight jump instructions included in the code were felt initially to be enough, it is now suggested by programmers that even more such orders would be helpful.

Special facilities for dealing with 'red tape'. The difficulties associated with the 2-level storage system have been greatly reduced by having an order-modification procedure which depends on the function of the order (Fig. 9). This method of modifying orders, used in conjunction with order 66 of the code (the unit-modify order), enables the counting through blocks of information to be done with relative ease. The use of the group-4 orders of the code enables counters to be set conveniently and a constant (up to 127) to be placed in an accumulator, the constant being the value of the N-digits of the order. Order 67 (the unit-count order) enables the counting of cycles of operations to be dealt with in a simple way. A jump can be programmed to take place automatically to another part of the programme when the required number of cycles has been performed.

The logical shift orders, 52 and 53, are also included to simplify 'red tape'. In particular, they are used for packing and unpacking words holding several items of information. As a result of including these various orders, the order code of Pegasus is quite large. It is worth remarking, however, that by a sensible grouping of the orders in the code the remembering of the code is a very simple task. A sensible arrangement of the code tends to reduce the amount of equipment needed to engineer it. For example, when the equipment for dealing with group 0 of the code has been allocated, groups 1 and 4 require the addition of only three gates.

Facilities for checking programmes. The features mentioned above make the computer easier to programme, and there are other facilities in Pegasus that make it easier to check out and develop new programmes. These include causing the machine to stop obeying orders, either under programme control or when the programme is in error. In particular, the machine stops if an order for writing in the main store is reached and an overflow indicator is set. A further aid when testing new programmes is the automatic punching out of all main-store addresses appearing in block-transfer orders. When this information is examined an indication of the course of a programme is readily obtained. The punching can be inhibited by a switch when a return to full-speed running is needed.

Machine rhythm

The logical design of Pegasus is built around a nucleus that deals with the simple arithmetic orders, groups 0, 1 and 4, of the code. This nucleus contains the control section, i.e. the order register and order decoding equipment, and the mill in which these orders are executed. The design for this nucleus could not begin until a basic rhythm dealing with the extraction from the computing store and the execution of such a pair was determined. When the outline of this nucleus was clear, the equipment for dealing with the remaining orders in the code was designed to fit it.
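The order form N + X → X discussed earlier in this section can be illustrated with a short sketch. It assumes, for illustration only, that the computing store is a list of 39-digit two's-complement words treated as integers, that the accumulators are the registers at addresses 0-7, and that overflow is reported to the caller; the function names are hypothetical and no actual Pegasus order is reproduced.

```python
WORD_BITS = 39                      # digits used for a number and its sign
WORD_MASK = (1 << WORD_BITS) - 1


def to_signed(word):
    """Interpret a 39-digit word as a two's-complement integer."""
    return word - (1 << WORD_BITS) if word >> (WORD_BITS - 1) else word


def add_n_to_x(store, n, x):
    """Perform N + X -> X and report whether the true sum was out of range."""
    if not 0 <= x <= 7:
        raise ValueError("X may refer only to an accumulator (addresses 0-7)")
    total = to_signed(store[n]) + to_signed(store[x])
    limit = 1 << (WORD_BITS - 1)
    overflow = not (-limit <= total < limit)
    store[x] = total & WORD_MASK    # result replaces the accumulator's contents
    return overflow
```

Because the result always lands in one of a few accumulators, the X field needs only three digits, which is what keeps the order short enough for two to fit in a word.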

The following arguments led to the basic rhythm. Since the orders of groups 0, 1 and 4 are similar in many respects, for definiteness, it will be sufficient to consider a particular order, 11 of the code, say. This is an order which takes two numbers from the computing store and replaces one of them by their sum.

It would take a prohibitive amount of equipment to extract these numbers, add them together and have the least significant digit of the sum available for replacing in the store in the same digit time as the least significant digits of the two components taken out of the store. In practice, some four digit times at least would be needed for this sequence of operations. Thus, it would be impossible to return the sum to the store in the same word time as the operands are extracted without having an entry point to each register which is in a different timing from the normal circulation entry. To produce two such entry points to each register would mean more equipment associated with each register, which was considered an uneconomical use of extra equipment. Instead, it was decided to delay the sum so that it could enter the register in the computing store in the next word time in standard timing. This involves one common delaying circuit instead of one for every register. Such an order therefore takes two word times to execute.

It may be argued that this second word time could be made to overlap with the first word time for the next order. Two reasons oppose this: the new contents of the register being changed might be required by the next order; and two different sets of equipment for selecting a storage register would be needed if numbers were to be extracted from one and replaced in another register in the same word time.

Thus, the execution of a pair of orders taken from the computing store requires four word times. The reasons for opposing the overlapping of the execution of two orders also oppose the extraction of an order pair while the previous pair is being dealt with. Five word times are therefore needed for the process of extracting and obeying a pair of simple arithmetic orders. More time may be needed for some of the other orders in the code.

The basic 3-beat rhythm is thus established:

(a) Extract the order pair from the computing store.
(b) Obey the first order of the pair.
(c) Obey the second order.

The duration of beat (a) is one word time; beats (b) and (c) are each two word times long for orders in groups 0, 1, 4 and 6 of the code, but may be longer for other orders.

Times for typical operations

The times for the various arithmetic operations are:

Addition and subtraction    0.3 millisec
Multiplication              2.0 millisec
Division                    5.4 millisec

These times include an allowance for the time to extract the orders. Some times for standard subroutines are:

Exponential function        24 millisec
Sine function               29 millisec
Logarithmic function        34 millisec

Finally, to give some indication of the time for a typical problem, a set of 50 simultaneous equations (with a single right-hand side) takes about 10¾ min. Of this time, 3 min 8 sec is for input, 7 min 17 sec is for calculation and 18 sec is for output.

Realizing the specification

The detailed logical design

It would take too long to describe fully the detailed logical design. One aspect is worth mentioning, however, namely the avoidance of all 'exceptions' in the results of orders. As an example of an exception consider the overflow indicators, which should be set whenever the final result of an order is outside the permissible range of numbers. In multiplication this can occur only when both the multiplier and the multiplicand are −1, and this is likely to occur very infrequently. Rather than provide equipment to sense this infrequent case, it is easier to put a footnote in the programming manual, where the overflow indicator is described, pointing out the exception. It was felt, however, that such exceptions should be avoided even at the expense of extra equipment or extra complication. For this and other reasons concerned with facilitating machine use, the logic of Pegasus is quite complicated.

The end-product of the detailed logical design is a series of diagrams with symbols corresponding to the circuit units of the packages, as shown, for example, in Fig. 5. The inputs and outputs of the units on these diagrams correspond to the pins of the sockets into which the packages plug. Thus, the wiring lists of connections of these pins can be produced from these logical diagrams. The first step in the production of these lists is to allocate a position


in the cabinets to each logical circuit in such a way as to reduce the amount of wire needed. When the layout has been completed, the last stage of producing the wire lists can proceed.

General construction of machine

The main units are shown in Fig. 10.

Fig. 10. Main units. [Diagram labels: Bay 1 logic packages; Bay 2 logic packages; Bay 3 logic packages; packaged monitor unit; programmers control panel; input equipment; fibre glass filter; drum packages.]

The package frame. This unit is a simple light-alloy frame supporting diecast light-alloy frame racks to which the back socket panels are fixed. The packages slide into grooves in the rack and plug into sockets at the back, a polarizing feature preventing the insertion of a package upside down. If electrical or magnetic screening is necessary between any packages, a special metal plate is inserted in slots in the cast rack and is fixed by a single screw in the back panel. Coded aluminium strips containing coloured plastic studs which identify the position of each package are fixed to the front of each casting.

Arrangement of the packages. There are 200 packages per cabinet, arranged in ten horizontal rows of 20 units per row. The metal valve panels are placed so that the edges almost touch. The component panel of each unit is in register with the unit in the corresponding position in each of the other rows, thereby providing vertical chimneys for cooling the components secured to these panels. Warm air from the main source of heat, the valves, is prevented by the valve panels from reaching the more temperature-sensitive components, such as diodes, secured to the component panel.

The back panel wiring. For locating long signal wires between sockets a system of plastic strips is used, which hold the wires at definite positions given by the instructions on the wiring lists. The exact route of every wire is predetermined, thus making wiring and inspection more reliable and fault finding and maintenance easier.

Final assembly. The completely wired frame is assembled in its cabinet, which has already been fitted with the control and auxiliary supply circuit unit, heater transformers, fuses, cooling assembly and cableforms. The work of connecting the cableforms, heaters and earths can be done by relatively unskilled labour working to clearly written instructions and diagrams.

The cooling system. Each cabinet has its own cooling system as an integral part of the construction; there is therefore no difficulty in cooling cabinets added to existing computers. Two axial-flow turbo blowers are mounted in the base beneath an airtight pressure chamber, each providing 300 ft³/min of air at a total pressure head of 1 in (water gauge). The maximum temperature rise is 10° C.

The power supply. A separate cubicle houses metal rectifiers, shunt stabilizing valves and control circuits. The power is obtained from the mains through a motor-alternator set, the output of which is stabilized to 2%, the main purpose of this set being to act as a buffer against switching surges and other mains voltage variations. The valve heaters in the computer are energized from the stabilized alternator output, which is expected to extend the valve life.

Maintenance

General

All digital computers so far have a fault rate which cannot be ignored. When the best has been done in the choice of components, circuits and mechanical construction, attention must be paid to the following points to get the best out of a machine:

(a) Rapid fault location
(b) Getting the machine working again as soon as possible after locating a fault
(c) Preventive maintenance

Fault location

There are parity-checking circuits on both the main and the high-speed stores. Errors of a single digit in the stores stop the machine. The fault can then be quickly located by examination of the monitors.

For other faults the general method is to run a test programme (assuming the fault is not in the main control) which will indicate the area of the fault. Detailed examination can then be carried out with the monitors.

All outputs of circuit units are readily accessible at monitoring sockets on the front of each package, and in addition about 80 points can be directly selected by switches from the monitoring position: these include all store lines and a number of key waveforms. Fault-finding is normally a matter of tracing 0's and 1's through the machine with reference to logical diagrams rather than electronic circuit diagrams.

A variety of triggers can be selected for the monitor time-bases, these including

(a) Trigger at any word position within a drum revolution (128 different times selectable by switches)
(b) Trigger at any word time of any selected order

These triggers and some other monitoring facilities are produced by 19 standard packages and are found to be well worth the extra equipment.

Fault repair

Once a faulty package has been located, the machine can be got working again immediately by replacement of the package with a spare; repair of the faulty package can be done at leisure with the aid of a package tester. With this equipment a package can quickly be given a series of standard tests; each is selected by switches, and the performance is measured either by observation of meters or a built-in oscillograph.

During commissioning not one case was found of the first machine doing other than what one would expect from the logical diagram (except for a very few cases of incorrect wiring).

Preventive maintenance

The machine h.t. supplies are reduced while the test programmes are being run. This marginal testing shows up incipient faults such as deterioration in valves, crystal diodes or resistors. The machine is at present kept in good running order down to 10% margins

181

182

The

Part 2

instruction-set processor: main-line

(the supplies are normally controlled to about

although correct running at about

20%

Section 2

computers

1%

of nominal),

reduction has been ob-

for

55% hours'

running.

The

Processors with a general register state

majority of package replacements are

done during routine maintenance.

served. to

The packaged method of construction of computers has proved have great advantages in design, construction and operation.

Conclusions first machine has been computing regularly for only a few months and has been on regular preventive maintenance (about 1 hour per day) for a few weeks. Error-free runs of over 30 hours are common, and at the time of writing there has been no error

The

References

ElliW56a; ElboR53; ElliW51, 52, 53, 56b; FairJ56; JohnD52; MerrI56; Pegasus Programming Manual, Ferranti Ltd., London; Pegasus Maintenance Manuals, Ferranti Ltd., London.

APPENDIX The Pegasus Order Code

00

x'

01

x'

02

x'

03

x'

04

x'

05 x 06

1

x'

= n =x+ n = -n =x- n =n—x —x&n = x E=£ n

26

11 n'

12 n'

13 n' 14 n'

15 n' 16 n'

30

17 Not allocated

21 (pq)' 22 (pq)'

= = =

23 (nq)'

=

n n

p

x •

x

+

+

2 39

2~

3S

2~

3S

q

+

nx

this

n

+

order assumes that any

overflow

q

tions

is

in 7.

due to operaClears overflow

unless n' overflows

< 24

25

y+

2

2^ 38

(— =

27 Not allocated

=x — n+x = -x — n —x = x— n = n&x = n^x

20 (pq)'

+

j

— -% < p'/n < % (rounded ;

single-

length division

07 Not allocated

10 n'

q'

- 38

(t)

-

x

+

2~ 38 q

p'/n




B

-1

C[0]^;

{

Go to head 0+ B string

m[b}-mb^b-i)

state

state

I

B— B-1;

Fig. 4.

10

L |

^ M

Thus Fig. 4 is a more detailed description of states and o.v'. Each horizontal pair of states (Fig. 4) corre-

sponds to a single scan of the states of type 1 instruction o.v, o.v, o, among states 2 and 3 correspond to the

M[B]— -.M[B]

iM[Btl]*;

up accord-

3.

o.v' in Fig. 3. Transition:

B— B +

string has terminated

;

>B

dress pointer registers. These point to the tail (or least significant digit), that

+0;

A

trecomp-.";

AC[0]

I^address register* the instruction location pointer

r

I

[l

1

A[l :3] ,

address[X[1:3];|,

:=

(

Address encoding for 1 of 16000 from a ter X. Indexina described below.

3

char value of regis-


Part 3

The

Section 3

instruction-set processor level: variations in the processor

[

APPENDIX

Processors for variable-length-string data

IBM 1401 ISP DESCRIPTION (Continued)

1

x [3]

'

+

**ooo 10

x

+

x[]] x ]000,

x[l :3][bcd. string})

Instruction Format op]

{3.ch})); B address set up or djzhar

next

«-

1

)

;

next B[l] ^-d^har;

;

active

-*

(b[2]

*- get,_,char)

;

active



(B[3j

«-

get^char)

;

active



(Bwaddress^present

«-

)

;

active



(B^ddress^present

*- 0)

;

Bwaddress^jpresent

add index register to I or A

(d^char^present «-0;

(d,_,char «- get,_,char

d.jChar.^present

1

next

-1

(A[2] * 0)

active

next

^get^char; next

A^address^present

iM[l]

I or A address set up or d^char

d^char;

next

;

(A[2]

active-* (A^ddress^present ,

«_

char instruction

proceed to get an I or A address

*-0); next

-» B

(d^char ^-get^char; next A[i]

-imls -»

mis

-^

d^char^present

active

(

*- 0;

-*

next next 1

record whether B address is present next

add index register to B

(

d^char.jjresent .

eat

->ij;i

i-

.004 398 364 291

end accumulate reg ».-*«. ¥.2.»/% Removes condrtiorts end

.stars

machine.

The FIXED POINT mode displays numbers in the way they are most commonly written. The DECIMAL DIGITS wheel allows setting the number of digits displayed to the right of the decimal point anywhere from 0 to 9. Figure 2 shows a display of three numbers with the DECIMAL DIGITS wheel set at 5. The number 5.336 845 815 × 10⁵ = 533 684.5815 is too big to be displayed in FIXED POINT without reducing the DECIMAL DIGITS setting to 4 or less. If the number is too big for the DECIMAL DIGITS setting, the register involved reverts automatically to floating point to avoid an apparent overflow. In FIXED POINT display, the number displayed is rounded, but full

1

FLOATING POINT, digits to the right of the decimal are grouped in threes.

A pull-out instruction card, Fig. 3, is located at the front of the calculator under the keyboard. The operation of each key is briefly

Fig. 3. Pull-out instruction card.

Chapter 20 The HP Model 9100A computing calculator

them. Special keys located in a block to the left of the digit keys are used to identify the lettered registers.

To store a number from the X register the key x→( ) is used. The parenthesis indicates that another key depression, representing the storage register, is necessary to complete the transfer. For example, storing a number from the X register into register 8 requires two key depressions: x→( ) 8. The X register remains unchanged. To store a number from Y register the key y→( ) is used.

The contents of the alpha registers are recalled to X simply by pressing the keys a, b, c, d, e, and f. Recalling a number from a numbered register requires the use of the y⇄( ) key to distinguish the recall procedure from digit entry. This key interchanges the number in the Y register with the number in the register indicated by the following keystroke, alpha or numeric, and is also useful in programs since neither number involved in the transfer is lost.

The CLEAR key sets the X, Y, and Z display registers and the f and e registers to zero. The remaining registers are not affected. The f and e registers are set to zero to initialize them for use with the ACC+ and ACC- keys as will be explained. In addition the CLEAR key clears the FLAG and the ARC and HYPER conditions, which often makes it a very useful first step in a program.

In converting from polar to rectangular coordinates, θ is placed in Y, and R in X, the conversion key is pressed and the display shows y in Y and x in X.

ACC+ and ACC- allow addition or subtraction of vector components in the f and e storage registers. ACC+ adds the contents of the X and Y register to the numbers already stored in f and e respectively; ACC- subtracts them. The RCL key recalls the numbers in the f and e registers to X and Y.
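The register transfers just described can be sketched in a few lines of Python. This is only an illustrative model, not the calculator's microprogram: the class name, the method names and the choice of sixteen storage registers named 0 to 9 and a to f are assumptions made for the sketch, and ordinary Python floats stand in for the machine's decimal numbers.

```python
class Registers9100:
    """Toy model of the store, recall, CLEAR and ACC transfers (hypothetical names)."""

    def __init__(self):
        self.x = self.y = self.z = 0.0
        # Assumption: ten numbered and six lettered storage registers.
        self.store = {name: 0.0 for name in "0123456789abcdef"}

    def x_to(self, name):
        """Store X in the named register; X remains unchanged."""
        self.store[name] = self.x

    def y_to(self, name):
        """Store Y in the named register."""
        self.store[name] = self.y

    def y_exchange(self, name):
        """Interchange Y with the named register; neither number is lost."""
        self.y, self.store[name] = self.store[name], self.y

    def recall_alpha(self, name):
        """Recall a lettered register to X."""
        if name not in "abcdef":
            raise ValueError("numbered registers are recalled with y_exchange")
        self.x = self.store[name]

    def clear(self):
        """Zero X, Y, Z and the f and e registers; other registers are untouched."""
        self.x = self.y = self.z = 0.0
        self.store["f"] = self.store["e"] = 0.0

    def acc_plus(self):
        """Add X to f and Y to e."""
        self.store["f"] += self.x
        self.store["e"] += self.y

    def acc_minus(self):
        """Subtract X from f and Y from e."""
        self.store["f"] -= self.x
        self.store["e"] -= self.y

    def rcl(self):
        """Recall f and e to X and Y."""
        self.x, self.y = self.store["f"], self.store["e"]
```

Adding the vectors (3, 4) and (1, 2) with two ACC+ operations followed by RCL would leave 4 in X and 6 in Y under this model.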

Illegal operations

A light to the left of the CRT indicates that an illegal operation has been performed. This can happen either from the keyboard or when running a program. Pressing any key on the keyboard will reset the light. When running a program, execution will continue but the light will remain on as the program is completed.

The illegal operations are:

Division by zero \/x

where

x


performed in parallel with the time required to complete the functions. Thus with a simple one core per bit system successive reads can be made at 1 µsec intervals and writes at 2 µsec.

A single-length word may represent a floating-point number with an 8-bit characteristic c, and a 40-bit fractional part f; the value of the number is then f × 2^(c - 128). The fraction f is limited to the range -1 ≤ f < -½, 1 > f ≥ ½, or f = 0 when c is also zero. All floating-point operations assume that operands are in this standard form and give correctly rounded results in standard form. Functions for the addition and subtraction of double-length floating-point numbers have been provided, as these give increased accuracy and stability in many matrix operations.
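As a small worked illustration of the number format above, the following Python sketch evaluates f × 2^(c - 128) exactly and checks the standard-form condition. It is an assumption-laden sketch: the fraction is taken to be a 40-bit two's-complement value with the binary point after the sign bit, the function names are invented for the example, and nothing is implied about where the two fields sit inside the 48-bit word.

```python
from fractions import Fraction

FRACTION_BITS = 40   # from the text: 40-bit fractional part f
EXCESS = 128         # from the text: value is f * 2**(c - 128)


def fraction_from_bits(bits):
    """Interpret a 40-bit pattern as a two's-complement fraction in [-1, 1).

    Assumption: the binary point follows the sign bit.
    """
    if not 0 <= bits < 1 << FRACTION_BITS:
        raise ValueError("fraction must fit in 40 bits")
    if bits >= 1 << (FRACTION_BITS - 1):
        bits -= 1 << FRACTION_BITS
    return Fraction(bits, 1 << (FRACTION_BITS - 1))


def value(frac_bits, characteristic):
    """Exact value f * 2**(c - 128) of a floating-point number."""
    if not 0 <= characteristic < 256:
        raise ValueError("characteristic must fit in 8 bits")
    return fraction_from_bits(frac_bits) * Fraction(2) ** (characteristic - EXCESS)


def is_standard(frac_bits, characteristic):
    """True when f lies in [-1, -1/2) or [1/2, 1), or f = 0 with c = 0."""
    f = fraction_from_bits(frac_bits)
    if f == 0:
        return characteristic == 0
    return Fraction(1, 2) <= f < 1 or -1 <= f < Fraction(-1, 2)
```

With f = ½ and c = 129 the value is exactly 1; the same quantity written with f = ¼ and c = 130 is not in standard form, which is why results are shifted back to standard form after each operation.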

The arithmetic unit

As shown in Fig. 1, there are six full length transistor flip-flop registers in the arithmetic unit; there are also two 8-bit registers used when performing floating-point operations. The main facilities associated with these registers are as follows.

W1, W2 and W3 are the three most accessible cells of the nesting store; transfers to the core part of the nesting store, being


The

Part 3

formed by adding the minuend's complement to the subtrahend with a carry inserted into the right-most adder stage.
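The complement-and-carry technique can be illustrated with a short Python sketch. It shows only the general two's-complement identity on 48-bit words (the word being subtracted is complemented and a carry of one is fed into the right-most stage); the function names are invented for the example, and the sketch says nothing about the carry-skip circuitry or register routing of the actual adder.

```python
WORD_BITS = 48
MASK = (1 << WORD_BITS) - 1


def add(a, b, carry_in=0):
    """48-bit addition with an optional carry into the right-most stage."""
    return (a + b + carry_in) & MASK


def subtract(a, b):
    """Compute a - b by adding the complement of b with the carry inserted."""
    return add(a, ~b & MASK, carry_in=1)


def to_signed(word):
    """Read a 48-bit word as a two's-complement integer."""
    return word - (1 << WORD_BITS) if word >> (WORD_BITS - 1) else word
```

For instance, subtract(5, 7) yields the 48-bit pattern that to_signed reads as -2, with no separate subtractor needed.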

MAIN TRANSFERS

Nb

NESTING ADDRESSING

STORE

COUNTER

r I

AMPLIFIERS

between

store control

and the arithmetic

is

quence of timed pulses along lines which activate the various transfers etc., between the registers. The sequences have been

WRITE_

[AMPLIFIERS

constructed so that

many operations are performed simultaneously,

reducing the overall time to a minimum; thus the function sin-

W3

r-L

acts as a buffer

and together with Bl and B2,

used in nearly every function. Arithmetic unit control interprets each instruction as a se-

unit,

CORE REGISTERS

READ

Processors with stack memories (zero addresses per instruction)

Section 5

instruction-set processor level: variations in the processor

gle-length fixed-point

is

performed by:

W3

Bl and Nb respectively, read from the nesting store, a simultaneously commencing clearing the carry inserted into the right-most adder stage

i

SWITCH

add

Transferring

L

Wl, W2,

to B2,

and switching the adder's output to Wl. Wl

Adding and simultaneously transferring Nb

ii

CLEAR

TO STORE CONTROL

Each step takes

AUXILIARY TRANSFERS AND SHIFTS

OF OR-8

>1

* ^CC8BITS)|

LEFT SHIFTS OF

STANDARDISATION

0,l.2,S,8

,

OR -8

AND CONVERSION LOGIC SHIFT

~ CHARACTERISTIC MODIFIER

A.U^CONTROL, PULSES

Fig. 1. Block diagram of the arithmetic unit. Full lines represent information transfers; dotted lines represent control pulses. All registers are 48 bits long unless otherwise stated.

similar arithmetic unit operating only on single-length

enables all double-length arithmetic operations to be performed without writing information back into the nesting store during the function; this would have complicated the sequences and increased the time for the functions.

When determining the arrangement of transfer paths between the various registers, it was found sufficient to consider only the or lengthy double-length functions which required complicated sequences; in particular the function for adding two double-length

W1 and W2, together with B1 and B2, form a double-length shifting register which may be used as two independent single-length shifting registers. B1 and B2 are the inputs to the 48-bit adder whose output may be routed to W1, W2, or to the characteristic difference register

influence.

is

set

contains 13 carry-skip stages which reduce the carry time to a maximum of 150 nsec Subtraction is per-

The adder

on fixed-point addition and sub-

the sign of the result differs from that expected, and on floating-point operations if the characteristic exceeds the if

maximum

allowable; shifting

may

also cause overflow.

Shift control

Shifting operations are effected by transfers between W1 (and/or W2) and B1 (and/or B2), and back again. The shift transfer paths from the W to the B registers provide right shifts of 0, 1, 2, 5

CD.

propagation

numbers had great

overflow indication

traction

W3.

num-

changes the contents of the two most accessible cells in the nesting store with those of the next most accessible pair. The sixth register

An via

W3

bers could be designed using only four full-length registers. At least five registers are required to perform the function which inter-

floating

made

of the last step,

To speed up multiplication and division, these functions are carried out in a separate unit employing the stored carry principle, but the results are finally assimilated within the arithmetic unit.

A

RIGHT SHIFTS 0,1,2,5,8

W2.

has been refilled from the core nesting store.

RIGHT SHIFTS OF 0,1,2,5,8. OR -8 LEFT SHIFTS OF 0,1,2,5,8 OR -8

and by the end

0.5 jusec

to

W

Chapter 21 |

or 8 places, and a left shift of 8 places; the paths from the B to the W registers provide the same shifts in the reverse direction.

Hi

W

The two sets of shift paths are used alternately, those from the W registers being used first; all shifts are terminated using a path into the W registers. Shifts of a large number of places are

accomiv t;

necessary the number is then transferred back into the W registers: the remaining shifts, or the whole shift if the number of places is less than eight, is then completed by a transfer to the B registers and back again using two appropriate paths. With the shifts available, extension of the B registers by two bits at the right-most end enables any shift to be performed without loss of accuracy.
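One way to read the shift scheme is sketched below in Python. It is an interpretation offered for illustration, not a description of the shift-control logic: shifts are written as signed numbers (positive for right, negative for left), the path sets are taken from the text as right shifts of 0, 1, 2, 5 or 8 and a left shift of 8 on the W to B paths with the mirror image on the B to W paths, and the function name and the guard-bit test are assumptions. Under this reading a shift of fewer than eight places is the net effect of two transfers in opposite senses, and the intermediate right shift never overshoots the final position by more than the two extra bits of the B registers.

```python
W_TO_B = (0, 1, 2, 5, 8, -8)            # signed shifts; positive means right
B_TO_W = tuple(-s for s in W_TO_B)      # same shifts in the reverse direction
GUARD_BITS = 2                          # extension of the B registers


def plan_shift(places):
    """Return a list of (path, signed shift) transfers giving a net shift."""
    if places == 0:
        return []
    direction = 1 if places > 0 else -1
    remaining = abs(places)
    steps = []
    in_w = True
    # A large shift is built from eights, alternating between the path sets.
    while remaining >= 8:
        steps.append(("W->B" if in_w else "B->W", 8 * direction))
        in_w = not in_w
        remaining -= 8
    # If necessary, transfer the number back into the W registers unshifted.
    if not in_w:
        steps.append(("B->W", 0))
    target = remaining * direction
    if target == 0:
        return steps
    # Finish with a transfer to B and back using two appropriate paths.
    for first in W_TO_B:
        second = target - first
        if second in B_TO_W and first <= max(target, 0) + GUARD_BITS:
            return steps + [("W->B", first), ("B->W", second)]
    raise ValueError("no pair of paths gives the requested shift")
```

A right shift of three places, for example, comes out as a right shift of five into B followed by a left shift of two back into W, which is where the two guard bits are needed.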

word

number

of places

is

by-passed.

When

a shift

is

to

ii

be performed, the

iv

store.

Ada, simultaneously clearing the sign of W2. floating

numbers

complement of Wl to Bl, B2 and switch the adder's output to

Transfer the

Wl

Store the characteristic of

in

transfer

Wl

W2 CD.

register in the eight-bit register

and add.

Clear the characteristic positions of Wl, simultane-

contains minus the difference in charac-

Clear the characteristic of W2, and

of the result.

The character conversion operations to, and from, binary are accomplished by shift control, using a method involving successive

vii

and adding or subtracting portions

viii

Supply control pulses to shift control and thus perform the required right-shift of eight Wl or W2. Having completed the shift, transfer Wl, W2 and W3

nesting store. Add the fractional parts, simultaneously transferring

Nb

ix

Examples of sequences

to

W2.

Supply control pulses it

of the radix word.

the shifts required. Store the complement of the

two sequences

respectively,

number

and perform of left-shifts

performed

in (viii) in the characteristic position of B2,

C

to the characteristic position of Bl, switch

the adder to Wl. x

are described.

Nb

to shift control so as to cause

to enter the standardization procedure

transfer

core register of the nesting store). i Transfer Wl, W2, W3 to B2, B\ and

about

to B2, Bl and Nb respectively, simultaneously switching the adder's output to Wl, clearing the carry into the right-most adder stage and reading from the core-

during this metic unit control for use in forming the correct characteristic

— D, (i.e. subtract the double-length fixed-point number in Wl and W2 from the number in W3 and the most accessible

is

replace the contents of C thus C contains the larger characteristic.

The number of shifts performed standardising operation is made available to the arith-

unit,

Wl

by the sign digit of CD, by the characteristic of B2;

into shift information.

working of the arithmetic

if

to be shifted, determined

vi

ii

inserting a carry into the right-

and read from the nesting

teristics.

determined by logical circuits which interpret the pattern of

a

stage,

with fresh data), switch the

Transfer the complement of Wl to Bl and Nb to B2, switch the adder's output to Wl and insert a carry into the right-most adder stage if W2 is negative.

shift register

v

illustrate the

filled

W2,

CD

ating on the characteristic positions of the two numbers. After the addition, the shift required to restore the result to standard form

To

W2 to B2 (but setting the W3 directly to Bl (W3

into the shift number register ously transferring in shift control. This latter operation is such that the

When performing floating-point addition and subtraction, shifts are required to equalize the characteristics of the two numbers; the amount of shift is calculated by a modified subtraction, oper-

shifting of the character word,

most adder Add.

C Hi

necessary to obtain the shift.

Wl

now been

of

positive), transfer

(i.e. add the two single-length and W2).

with a string of command pulses by the arithmetic unit control; shift control then re-routes these pulses to perform the transfers

is

has by

to

and the type of shift are transferred into a semiautonomous unit, called the shift control, which is then supplied

bits in

complement

B2

+F i

In double-length arithmetic shifts, the sign digit of the less significant

Transfer the sign of

adder's output to

plished by a series of shifts of eight places in the appropriate direction until the number of places remaining is less than eight; if

Design of an arithmetic unit incorporating a nesting store

The sum

Perform a special add operation which only affects the characteristic positions of Wl.

is

thus formed in

Wl. Rounding the answer

is

carried

out using two special control pulses which complete all floatingpoint operations, these call up logic to deal with the cases when

simultaneously reading from the core nesting store.

the rounding operation necessitates re-standardization of the re-

A dummy

sult.

pulse.

Conclusions

The advantages of a machine incorporating a nesting store in the arithmetic unit are:

i The machine is simple to programme using the machine language,

ii Programmes are faster, since many main store transfers are eliminated, and the access time of the nesting store is virtually zero. They are more compact because less information is required to specify many instructions.

iii As the operation of the arithmetic unit is largely independent of the main store, their controls may readily be separated. This allows store control to process instructions whilst the arithmetic unit control processes a prior instruction, thereby leading to faster execution of the programme.

The main disadvantage is an increase in the order of complexity involved.

References

AllmR62; DaviG60; HaleA62

Chapter 22

Design of the B 5000 system¹

William Lonergan / Paul King

Computing systems have conventionally been designed via the 'hardware' route. Subsequent to design, these systems have been handed over to programming systems people for the development of a programming package to facilitate the use of the hardware. In contrast to this, the B 5000 system was designed from the start as a total hardware-software system. The assumption was made that higher level programming languages, such as ALGOL, should be used to the virtual exclusion of machine language programming, and that the system should largely be used to control its own operation. A hardware-free notation was utilized to design a processor with the desired word and symbol manipulative capabilities. Subsequently this model was translated into hardware specifications at which time cost constraints were considered.

Design objectives

The fundamental design objective of the B 5000 system was the reduction of total problem through-put time. A second major objective was facilitation of changes both in programs and system configurations. Toward these objectives the following aspects of the total computer utilization problem were considered: Statement of problems in machine-independent higher-level languages; efficiency of compilation of machine language; speed of compilation of machine language; program debugging in higher-level languages; problem set-up and load time; efficiency of system operation; ease of maintaining and making changes in existing programs, and ease of reprogramming when changes are made in a system configuration.

Design criteria

Early in the design phase of the B 5000 system the following principles were established and adopted: Program should be independent of its location and unmodified as stored at object time; data should be independent of its location; addressing of memory within a program should take advantage of contextual addressing schemes to reduce redundancy; provisions should be made for the generalized handling of indexing and subroutines; a full complement of logical, relational and control operators should be provided to enable efficient translation of higher-level source languages such as ALGOL and COBOL; program syntax should permit an almost mechanical translation from source languages into efficient machine code; facilities should be provided to permit the system to largely control its own operation; input-output operations should be divorced from processing and should be handled by an operating system; multi-programming and true parallel processing (requires multiple processors) should be facilitated, and changes in system configuration (within certain broad limitations) should not require reprogramming.

System organization

The B 5000 system achieves its unique physical and operational modularity through the use of electronic switches which function logically like telephone crossbar switches. Figure 1 depicts the basic organization of the system as well as showing a maximum system.

Master control program

A master control program will be provided with the B 5000 system. It will be stored on a portion of the magnetic drum. During normal operations, a small portion of the MCP will be contained in core memory. This portion will handle a large percentage of recurrent system operations. Other segments of the MCP will be called in from the magnetic drum, from time to time, as they are required to handle less frequently-occurring events, or system situations. Whenever the system is executing the master control program, it is said to be in the Control State. All entries to the Control State are made via 'interrupts.' A special operation is provided, which can only be executed when the system is in the Control State, to permit control to return to the object program it was executing at the time the 'interrupt' occurred.

The following are a few typical occurrences which cause an automatic 'interrupt' in the system: An input-output channel is

¹Datamation, vol. 7, no. 5, pp. 28-32, May, 1961.


way around machine design, but they still must provide object coding to accomplish the storage and recall functions. In brief, conventionally designed computers, with or without automatic programming aids, require the wasteful expenditure of programming effort, memory capacity, and running time to overcome the limitations of their internal organization.

The problem is attacked directly in the B 5000 by incorporation of a "pushdown" stack, which completely eliminates the need for instructions (coded or compiled) to store or recall intermediate results.

In a B 5000 processor, the stack is composed of a pair of registers, the A and B registers, and a memory area. As operands are picked up by the programs, they are placed in the A register. If the A register already contains a word of information, that word is transferred to the B register prior to loading the operand into the A register. If the B register is also occupied by information, then the word in B is stored in a memory area defined by an address register S. Then the word in A can be transferred to B and the operand brought into the A register. The new word coming into the stack has pushed down the information previously held in the registers. As each pushdown occurs, the address in the S register is automatically increased by one. The information contained in the registers is the last information entered into the stack; the stack operates on a "last in-first out" principle. As information is operated on in the stack, operands are eliminated from the stack and results of operations are returned to the stack. As information is used up by operations being performed, it is possible to cause "pushups," i.e., a word is brought from the memory area in the stack addressed by the S register, and the address in the S register is decreased by one.

To eliminate unnecessary pushdowns and pushups, the A and B registers both have indicators used for remembering whether the registers contain information or are empty. When an operand is to be placed in the stack and either of the registers is empty, no pushdown into memory occurs. Also, when an operation leaves one or both of the registers empty, no automatic pushup occurs.

Polish notation

The Polish logician, J. Lukasiewicz, developed a notation which allows the writing of algebraic or logical expressions which do not require grouping symbols and operator precedence conventions. For example, parentheses are necessary as grouping symbols in the expression A(B + C) to convey the desired interpretation of the expression. In the expression A + B/C, the normal interpretation is A + (B/C), rather than (A + B)/C, because of the convention that the / operator is of higher precedence than the + operator. The right-hand Polish notation used in the B 5000 is based on placing the operators to the right of their operands: A + B becomes AB + in Polish notation. A + B + C can be written either as AB + C +, or as ABC + +. In the expression ABC + +, the first + operator says to add the operands B and C. The second + operator says to add A to the sum of B and C. Returning to the first examples above, A(B + C) can be written as BC + A × or ABC + × in Polish. The second example is written as BC/A + or ABC/ +. The extension of Polish notation to handle equations is shown in the following example:

Conventional notation Z = A(B - C)/(D + E)

Polish notation ABC - × DE + / Z =

The stack in use

To illustrate the functioning of the stack, two simple examples are shown in Figs. 4 and 5. In the examples, the letters P, Q and R represent syllables in the program that cause the operands P, Q, and R to be picked up and placed in the stack. The symbols + and × represent syllables that cause the add and multiply operations to occur. The two examples represent different ways of writing P(Q + R) in Polish notation. The first example in Fig. 4 does not require pushdowns or pushups. The second example, shown in Fig. 5, requires a pushdown in the execution of the syllable R, and a pushup in the execution of the syllable ×. The columns in the table represent the contents of the various registers after execution of the syllable listed in the first column.
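The behaviour of the A and B registers and the memory area on the two ways of writing P(Q + R) can be imitated with a short Python sketch. It is a toy model under stated assumptions rather than the processor's logic: None stands in for the "empty" indicators, a Python list stands in for the memory area addressed by S, the result of an operation is assumed to be left in B with A marked empty, and the class and function names are invented for the example. The multiply syllable is written * in the code.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


class B5000Stack:
    """Toy model of the two-register pushdown stack (illustrative only)."""

    def __init__(self):
        self.a = None       # None plays the part of the "empty" indicator
        self.b = None
        self.memory = []    # stack area in memory; its length stands in for S
        self.pushdowns = 0
        self.pushups = 0

    def load(self, value):
        """Place an operand in A, pushing down only when both registers are full."""
        if self.a is not None:
            if self.b is not None:
                self.memory.append(self.b)
                self.pushdowns += 1
            self.b = self.a
        self.a = value

    def _fill(self):
        """Bring words up from memory until both registers hold operands."""
        while self.a is None or self.b is None:
            if self.a is None and self.b is not None:
                self.a, self.b = self.b, None
            if not self.memory:
                raise IndexError("not enough operands in the stack")
            self.b = self.memory.pop()
            self.pushups += 1

    def operate(self, symbol):
        """Apply a two-operand operator to B and A."""
        self._fill()
        self.b = OPS[symbol](self.b, self.a)
        self.a = None       # assumption: result stays in B, A becomes empty

    def top(self):
        return self.a if self.a is not None else self.b


def run(syllables, operands):
    """Execute a string of Polish-notation syllables and return the stack."""
    stack = B5000Stack()
    for syllable in syllables:
        if syllable in OPS:
            stack.operate(syllable)
        else:
            stack.load(operands[syllable])
    return stack
```

Running the string Q R + P * through this model needs no pushdown or pushup, while P Q R + * needs one pushdown (at R) and one pushup (at the multiply), in agreement with the description of Figs. 4 and 5.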

Fig. 4. Polish notation QR + P ×

Fig. 5. Polish notation PQR + ×

Independence of addressing

One of the goals set in the design of the B 5000 was to make the programs independent of the actual memory locations of both the program itself and the data, in order to provide really automatic

Word mode program

The word mode of the B 5000 processor has four types of syllables. The syllable type is distinguished by the two high-order bits of each 12-bit syllable. The types of syllable and the identification bits are:

00 - Operator Syllable

01 - Literal Syllable

10 - Operand Call Syllable

11 - Descriptor Call Syllable

The first of these, the operator syllable, causes operations to be performed. The remaining ten bits of the operator syllable are the operation codes. There are approximately sixty different operations in the word mode. For those operations requiring an operand or operands, the processor checks for sufficient operands in the registers; if they are not there, pushups from the stack in memory occur automatically.

The literal syllable is used for placing constants in the stack to be used as operands. The ten bits of the literal syllable are transferred to the stack. This allows the program to contain integers less than 1,024 as constants.

The operand call syllable and the descriptor call syllable address locations in the program reference table. The purpose of the operand call syllable is to place an operand in the stack; the purpose of the descriptor call syllable is to place the address of an operand, a descriptor, in the stack. There are four situations that arise, depending on the word read from the program reference table.

1 The word is an operand.

2 The word is a descriptor containing the address of the operand.

3 The word is a descriptor containing the base address of the data area in which the operand resides.

4 The word is a program descriptor containing the base address of a subroutine.

For (1), the operand call syllable has completed its action by placing an operand in the stack. The descriptor call syllable will cause the construction of a descriptor of the operand, replacing the operand by the constructed descriptor. For (2), the operand call syllable then reads the operand from the cell addressed. The descriptor call syllable has completed its action. For (3), indexing of the descriptor by the item that is the second item in the stack occurs. For an operand call syllable, the operand is obtained from the indexed address; for the descriptor call syllable, action is now complete after the indexing. In the case of (4), subroutine entry occurs to the subroutine addressed. A word of the three previous types may be left in the registers upon return from the subroutine, in which instance the actions described above will take place, depending upon the type of syllable which initiated the subroutine.

Essentially, the four types of action that occur for an operand call syllable are obtaining an operand directly, indirectly, from an array, or by computation. Sometimes in the use of the call syllables, it is not known which type of action will occur when the program is created. This is particularly true for call syllables in subroutines.

Programs in the word mode consist of strings of syllables which follow the rules of Polish notation. Variable length strings of call syllables and literal syllables, which place items of information in the stack, are followed by operator syllables which perform their operations on information in the stack.

The indexing features of the B 5000 allow generalized indexing and at the same time provide complete storage protection. Data areas and program segments of different programs may be intermingled, but a program is prevented from storing outside of its data areas. The method of indexing allows any of the 1,024 words of the program reference table to be considered index registers. Multilevel indexing is provided, i.e., indices of arrays can themselves be elements of arrays.

The subroutine control provided in the B 5000 allows nesting of subroutines, even recursive nesting (a subroutine a subroutine of itself), arbitrarily deep. Dynamic allocation of storage for parameter lists and temporary working storage simplify the use of subroutines. Storage is automatically allocated and deallocated as required.

Character mode program

In the character mode of the B 5000 Processor, there is only one type of syllable, called the operator syllable. Program segments in the character mode are constructed of strings of these syllables. The character mode is designed to provide editing, formatting, comparison, and other forms of data manipulation. In doing so, the processor uses two areas of memory, the source and destination areas. When a program switches from word mode to character mode, two descriptors containing the base addresses of these areas are supplied. The source area or destination area may be changed at any time during character mode so that the program may act on several areas.

The character mode operator syllable is split into two 6-bit parts; the last part specifies the operation to be performed and the first part specifies the number of times the operation is to be performed. Operations are provided for the deletion, transferring,

comparison, and insertion of characters or bits. Also, there are operations which allow the repetition of syllable strings. This is quite useful for complex table look-up operations and for editing information which contains repeated patterns.
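The two syllable layouts described in this chapter can be illustrated with a short sketch. It is not taken from the B 5000 documentation: the function and class names are hypothetical, and it assumes that the "first" 6-bit part of a character mode syllable is the high-order half and the "last" part the low-order half, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Two high-order identification bits of a 12-bit word mode syllable,
# as listed in the text.
WORD_MODE_TYPES = {
    0b00: "operator",
    0b01: "literal",
    0b10: "operand call",
    0b11: "descriptor call",
}


@dataclass(frozen=True)
class WordSyllable:
    kind: str   # one of the four syllable types
    field: int  # remaining ten bits: operation code, constant, or table address


def _check_12_bits(syllable: int) -> None:
    if not 0 <= syllable < (1 << 12):
        raise ValueError("a syllable is 12 bits wide")


def decode_word_syllable(syllable: int) -> WordSyllable:
    """Split a word mode syllable into its type and its ten-bit field."""
    _check_12_bits(syllable)
    return WordSyllable(WORD_MODE_TYPES[syllable >> 10], syllable & 0x3FF)


def decode_char_syllable(syllable: int) -> tuple:
    """Split a character mode syllable into (repeat count, operation).

    Assumption: the first part is the high-order six bits and the
    last part is the low-order six bits.
    """
    _check_12_bits(syllable)
    return syllable >> 6, syllable & 0x3F
```

Under these assumptions, `decode_word_syllable(0b010000000101)` yields a literal syllable carrying the constant 5, and the ten-bit field explains why literals and program reference table addresses are both limited to values below 1,024.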


Conclusion

The Burroughs B 5000 system has been designed as an integrated hardware-software package which offers such benefits as savings in the memory space required to store equivalent object programs; multi-processing and parallel processing; and running identical programs on systems with different size memories and different system configurations with no loss in individual system efficiency.

References

LoneW61; BartR61; BockR63; CarlC63; MaheR61

Section 6

Processors with multiprogramming ability

The processors in this section have features which allow multiple programs to exist in the primary memory at the same time. The programs can be executed alternately by a single processor without having to wait for new programs to be input. The cost is only that of changing the processor state, which involves only a few instructions at most (and only one instruction on some systems, such as the CDC 6600). Since programs are subject to numerous unpredictable delays within a single run for interchange with the external environment (either via Ms or T), substantial increases in Pc utilization can be achieved by multiprogramming. If more than a single processor has access to Mp, the system is called a multiprocessor system.

Time-shared computers are generally multiprogrammed. Alternatively, time-shared systems can be implemented by swapping programs, one at a time, into primary memory for interpretation. The Berkeley Time-Sharing System (Chap. 24) uses both multiprogramming and program swapping. The Burroughs B 5000 (Chap. 22) is an early computer to have multiprogram capability.

The idea of multiprogramming is so fundamental that it should be among the first concepts to be understood by the student of computing systems. A very nice review of memory mapping and storage allocation is presented in the paper Dynamic Storage Allocation Systems [Randell and Kuehner, 1968].

Atlas

The Atlas is one of the most important machines described in this book. The prototype was originally designed and constructed at Manchester University. The Atlas 1 and Atlas 2 were produced by Ferranti Corp. (prior to becoming part of I.C.T.¹). Atlas 1 is the most interesting; it incorporates most of the features of the Atlas prototype. The Lincoln Laboratory TX-2 [Clark, 1957] influenced some Atlas features: multiple index registers and interrupt processing of input/output devices.

¹International Computers and Tabulators, U. K.

Atlas' detailed internal structure is described in a paper [Sumner et al., 1962].

Two original features, one-level storage and extracodes, have been copied in many other machines. A one-level store is common to most new computers which are time-shared or multiprogrammed; the scheme for memory paging in the SDS 940 is essentially that of Atlas.

The extracodes feature allows ordinary machine operation codes to be used to call subroutines. Commonly used complex instructions (such as sin, cos, and monitor calls) can be written in a common operating system accessible to all users. Initially these subroutines were stored in a read-only memory. The ISP is straightforward and extremely nice. The extracode idea appears in the SDS 900 series and was used in the SDS 940 system for defining the common-user instructions. The IBM System/360 SVC (supervisor call) instruction is an adaptation of the extracode.

Atlas was about the earliest computer to be designed with a software operating system and the idea of user machine in mind. The operating system has been nicely described [Kilburn et al., 1961] and evaluated [Morris et al., 1967].

In a letter to the authors of this book, F. H. Sumner makes the following comments on Atlas.

The initial ideas and the preliminary research on the Atlas computer system started in the Department of Computer Science of the University of Manchester in 1956. The team, under the direction of Professor T. Kilburn, was later supplemented by several members of the I.C.T. Computer Research Department, and the prototype machine was working in the department by the Autumn of 1961. The first production model became operational in January 1963.

The significant features of the system can be summarised as:

1 The provision of a virtual address field greater than the real address space.

2 The implementation of a "one-level" store using a mixture of core store and drum store.

3 The interrupt system and the method of peripheral control.

4 The realisation at the design stage that there would be a complex operating system and the provision in the hardware of specific features to assist such an operating system.

The method of peripheral control permitted the attachment of a large number of on-line peripherals with rapid response and entry into the operating system for a peripheral requiring attention. This, together with the multiprogramming features, makes the design ideal for the attachment of keyboards for the provision of multiaccess operation. In the original design, provision for several such on-line typewriters was made, but at the production stage it was decided to remove these as an economy measure. In view of the subsequent development of on-line operation, this was rather an unfortunate decision.

The Atlas computer at the University has now been in continuous operation for four years and it is expected to provide for the major part of the University's computing needs until 1971. During the period of its operation the provision of extensive monitoring and logging information has permitted the behaviour of the system to be studied in detail. The results of these studies have been extremely valuable in the design of a successor to the Atlas.

A user machine in a time-sharing system

The Berkeley Time-Sharing Computer (Fig. 1) is based on the SDS 930 (Chap. 24). The hardware modifications to the SDS 930, together with the operating system software, were sold by Scientific Data Systems as the SDS 940. The operating system and hardware modifications for multiprogramming make the 940 one of the first commercially available combined hardware-software time-sharing computers.¹ The description in Chap. 24 is concerned with the machine as it appears to the user. That is, the hardware and the operating system software are both presented in the context in which they contribute to form a user machine. The 940 uses a memory map which is almost a subset of that of Atlas but is more modest than that of the IBM 360/67 [Arden et al., 1966] and GE 645 [Dennis, 1965; Daley and Dennis, 1968]. A number of instructions are apparently built in via the programmed operator calling mechanism, based on Atlas extracodes (Chap. 23). The software-defined instructions emphasize the need for hardware features. For example, floating-point arithmetic is needed when several computer-bound programs are run. The SDS 945 is a successor to the 940, with slightly increased capability but at a lower cost.

¹Time-shared computers consist of both hardware and a complex software operating system. Adams Computer Characteristics Quarterly lists the deliveries of general-purpose time-shared computers as DEC PDP-6 hardware, October, 1964 (software in early 1965); SDS 940 hardware (and Berkeley software) April, 1966; GE 635, 645 hardware, May, 1965 (M.I.T.'s project MULTICS software, around 1969); IBM System/360 Model 67 hardware, March, 1966 (software, around 1968).

Fig. 1. University of California (Berkeley) time-shared-computer PMS diagram.

Design of the B 5000 System

The Burroughs B 5000 computer is described in Part 3, Sec. 5, page 257, Chap. 22.

Chapter 23

One-level storage system¹

T. Kilburn / D. B. G. Edwards / M. J. Lanigan / F. H. Sumner

Summary

After a brief survey of the basic Atlas machine, the paper describes an automatic system which in principle can be applied to any combination of two storage systems so that the combination can be regarded by the machine user as a single level. The actual system described relates to a fast core store-drum combination. The effect of the system on instruction times is illustrated, and the tape transfer system is also introduced since it fits basically in through the same hardware. The scheme incorporates a "learning" program, a technique which can be of greater importance in future computers.

1. Introduction

In a universal high-speed digital computer it is necessary to have a large-capacity fast-access main store. While more efficient operation of the computer can be achieved by making this store all of one type, this step is scarcely practical for the storage capacities now being considered. For example, on Atlas it is possible to address 10^6 words in the main store. In practice on the first installation at Manchester University a total of 10^5 words are provided, but though it is just technically feasible to make this in one level it is much more economical to provide a core store (16,000 words) and drum (96,000 words) combination.

Atlas is a machine which operates its peripheral equipment on a time division basis, the equipment "interrupting" the normal main program when it requires attention. Organization of the peripheral equipment is also done by program so that many programs can be contained in the store of the machine at the same time. This technique can also be extended to include several main programs as well as the smaller subroutines used for controlling peripherals. For these reasons as well as the fact that some orders take a variable time depending on the exact numbers involved, it is not really feasible to "optimum" program transfers of information between the two levels of store, i.e., core store and drum, in order to eliminate the long drum access time of 6 msec. Hence a system has been devised to make the core drum store combination appear to the programmer as a single level of storage, the requisite transfers of information taking place automatically. There are a number of additional benefits derived from the scheme adopted, which include relative addressing so that routines can operate anywhere in the store, and a "lock out" facility to prevent interference between different programs simultaneously held in the store.

¹IRE Trans., EC-11, vol. 2, pp. 223-235, April, 1962.

2. The basic machine

The arrangement of the basic machine is shown in Fig. 1. The available storage space is split into three sections; the private store which is used solely for internal machine organization, the central store which includes both core and drum store, in which all words are addressed and is the store available to the normal user, and finally the tape store, which is the conventional backing-up large capacity store of the machine. Both the private store and the main core store are linked with the main accumulator, the B-store, and the B-arithmetic unit. However the drum and tape stores only have access to these latter sections of the machine via the main core store.

The machine order code is of the single address type, and a comprehensive range of basic functions are provided by normal engineering methods. Also available to the programmer are a number of extra functions termed "extracodes" which give automatic access to and subsequent return from a large number of built-in subroutines. These routines provide

1 A number of orders which would be expensive to provide in the machine both in terms of equipment and also time because of the extra loading on certain circuits. An example of this is the order: Shift accumulator contents ±n places where n is an integer.

2 The more complex mathematical operations, e.g., sin x, log x, etc.,

3 Control orders for peripheral equipments, card readers, parallel printers, etc.,

4 Input-output conversion routines,

5 Special programs concerned with storage allocation to different programs being run simultaneously, monitoring routines for fault finding and costing purposes, and the detailed organization of drum and tape transfers.

Fig. 1. Layout of basic machine. (Blocks labelled in the figure include: fixed store, 2 meshes of 4,096 words; subsidiary store, 1,024 words; B store, 128 words; B arithmetic unit; main accumulator; main core store, 4 stacks of 4,096 words; drum store, 4 drums of 24,576 words; 8 tape decks; peripheral equipments; two-way information channels and address channels.)

All this information is permanently required and hence is kept in part of the private store termed the "fixed store" [Kilburn and Grimsdale, 1960a] which operates on a "read only" basis. This store consists of a woven wire mesh into which a pattern of small "linear" ferrite slugs are inserted to represent digital information. The information content can only be changed manually and will tend to differ only in detail between the different versions of the Atlas computer. In Muse this store is arranged in two units each of 4096 words, a unit consisting of 16 columns of 256 words, each word being 50 bits. The access time to a word in any one column is about 0.4 µsec. If a change of column address is required, this figure increases by about 1 µsec due to switching transients in the read amplifiers. Subsequent accesses in the new column revert to 0.4 µsec. The fixed store operates in conjunction with a subsidiary core store of 1024 words which provides working space for the fixed store programs, and has a cycle time of about 1.8 µsec. There are certain safeguards against a normal machine user gaining access to addresses in either part of the private store, though in effect he makes use of this store through the extracode facility.

The central store of the machine consists of a drum and core store combination, which has a maximum addressable capacity of about 10^6 words. In Muse the central store capacity is about 96,000 words contained on 4 drums. Any part of this store can be transferred in blocks of 512 words to/from the main core store, which consists of four separate stacks, each stack having a capacity of 4096 words.

The tape system provides a very large capacity backing store for the machine. The user can effect transfers of variable amounts of information between this store and the central store. In actual fact such transfers are organized by a fixed store program which initiates automatic transfers of blocks of 512 words between the tape store and the main core store. The system can handle eight tape decks running simultaneously, each producing or demanding a word on average every 88 µsec.

The main core store address can thus be provided from either the central machine, the drum, or the tape system. Since there is no synchronization between these addresses, there has to be a priority system to allocate addresses to the core store. The drum has top priority since it delivers a word every 4 µsec, the tape next priority since words can arise every 11 µsec from 8 decks and the machine uses the core store for the rest of the available time. A priority system necessarily takes time to establish its priority, and so it has been arranged that it comes into effect only at each drum or tape request. Thus the machine is not slowed down in any way when no drum or tape transfers take place. The effect of drum and tape transfers on machine speed is given in Appendix 1.

To simplify the control commands given to the drum, tape, and peripheral equipment in the machine, the orders all take the form b->S or s->B and the identification of the required command register is provided by the address S. This type of storage is clearly widely scattered in the machine but is termed collectively the V-store.

In the central machine the main accumulator contains a fast adder [Kilburn et al., 1960b] and has built-in multiplication and division facilities. It can deal with fixed or floating point numbers and its operation is completely independent of the B-store and B-arithmetic unit. The B-store is a fast core store (cycle time 0.7 µsec) of 120 twenty-four bit words operating in a word selected partial flux switching mode [Edwards et al., 1960]. Eight "fast" B lines are also provided in the form of flip-flop registers. Of these, three are used as control lines, termed main, extracode, and interrupt controls respectively. The arrangement has the advantage that the control numbers can be manipulated by the normal B-type orders, and the existence of three controls permits the machine to switch rapidly from one to another without having to transfer control numbers to the core store. Main control is used when the

central machine is obeying the current program, while the extracode control is concerned with the fixed store subroutines. The interrupt control provides the means for handling numerous peripheral equipments which "interrupt" the machine when they either require or are providing information. The remaining "fast" B lines are mainly used for organizational procedures, though B124 is the floating point accumulator exponent.

The operating speed of the machine is of the order of 0.5 × 10^6 instructions per second. This is achieved by the use of fast transistor logic circuitry, rapid access to storage locations, and an extensive overlapping technique. The latter procedure is made possible by the provision of a number of intermediate buffer storage registers, separate access mechanisms to the individual units of core store and parallel operation of the main accumulator and B-arithmetic units. The word length throughout the machine is 48 bits which may be considered as two half-words of 24 bits each. All store transfers between the central machine, the drum and tape stores are parity checked, there being a parity digit associated with each half-word. In the case of transfers within the central store (i.e., between main core store and drum) the parity digits associated with a given word are retained throughout the system. Tape transfers are parity checked when information is transferred to and from the main core store, and on the tape itself a check sum technique involving the use of two closely spaced heads is used.

The form of the instruction, which allows for two B-modifications, and the allocation of the address digits is shown in Fig. 2a. Half of the addressable store locations are allocated to the central store which is identified by a zero in the most significant digit of the address. (See Fig. 2b.) This address can be further subdivided into block address, and line address in a block of 512 words. The least significant digits, 0 and 1, make it possible to address 6 bit characters in a half word and digit 2 specifies the half word.

The function number is split into several sections, each section relating to a particular set of operations, and these are listed in Fig. 2c. The machine orders fall into two broad classes, and these are

1 B codes: These involve operations between a B line specified by the Ba digits in the instruction and a core store line whose address can be modified by the contents of a B line determined by the Bm digits. There are a total of 128 B lines, one of which, B0, always contains zero. Of the other lines 90 are available to the machine user, 7 are special registers previously mentioned, and a further 30 are used by extracode orders.

2 A codes: These involve operations between the Accumulator and a core store line whose address can now be doubly modified.
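The subdivision of the address described above can be made concrete with a small sketch. It is an illustration only, not the Atlas hardware logic: it assumes a 24-bit address field (the excerpt gives the half-word length as 24 bits but does not state the address width), and the names `AtlasAddress` and `decode_address` are hypothetical.

```python
from typing import NamedTuple

ADDRESS_BITS = 24  # assumption: the address occupies one 24-bit half-word


class AtlasAddress(NamedTuple):
    central_store: bool  # most significant digit is zero
    block: int           # block address (remaining high-order digits)
    line: int            # line within a block of 512 words
    half_word: int       # digit 2 selects the half word
    character: int       # digits 0 and 1 select a 6-bit character


def decode_address(addr: int) -> AtlasAddress:
    """Split an address into the fields named in the text."""
    if not 0 <= addr < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in the assumed 24 bits")
    return AtlasAddress(
        central_store=(addr >> (ADDRESS_BITS - 1)) == 0,
        block=(addr >> 12) & 0x7FF,   # digits 12 to 22 under the assumption
        line=(addr >> 3) & 0x1FF,     # 9 digits address 512 words
        half_word=(addr >> 2) & 0b1,
        character=addr & 0b11,
    )
```

With these widths the block field has 11 digits, so 2,048 blocks of 512 words give 2^20 words, which agrees with the "about 10^6 words" quoted for the maximum addressable central store.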

Function 10 bits
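The core store priority rule of Sec. 2 (drum first, then tape, then the central machine) and the tape timing quoted there can be checked with a few lines. This is a minimal sketch with hypothetical names; the real arbitration is done by hardware and only comes into effect at each drum or tape request.

```python
from typing import Iterable, Optional

# Order of priority for core store addresses, highest first, as stated in Sec. 2.
PRIORITY = ("drum", "tape", "machine")


def grant(requests: Iterable[str]) -> Optional[str]:
    """Return the requesting source with the highest priority, if any."""
    pending = set(requests)
    for source in PRIORITY:
        if source in pending:
            return source
    return None


# Eight decks, each producing or demanding a word every 88 microseconds on
# average, present a word to the core store every 88 / 8 microseconds overall.
TAPE_DECKS = 8
DECK_WORD_INTERVAL_US = 88
combined_tape_interval_us = DECK_WORD_INTERVAL_US / TAPE_DECKS
```

The quotient is 11, matching the statement that tape words can arise every 11 µsec from 8 decks, and the drum's shorter interval of 4 µsec is the stated reason it is given top priority.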

3. One-level store concept

The choice of system for the fast access store in a large scale computer is governed by a number of conflicting factors which include speed and size requirements, economic and technical difficulties. Previously the problem has been resolved in two extreme cases either by the provision of a very large core store, e.g., the 2.5 megabit [Papian, 1957] store at M.I.T., or by the use of a small core store (40,000 bits) expanded to 640,000 bits by a drum store as in the Ferranti Mercury [Lonsdale and Warburton, 1956; Kilburn et al., 1956] computer. Each of these methods has its disadvantages, in the first case, that of expense, and in the second case, that of inconvenience to the user, who is obliged to program transfers of information between the two types of store and this can be time consuming. In some instances it is possible for an expert machine user to arrange his program so that the amount of time lost by the transfers in the two-level storage arrangement is not significant, but this sort of "optimum" programming is not very desirable. Suitable interpretative coding [Brooker, 1960] can permit the two-level system to appear as one level. The effect is, however, accompanied by an effective loss of machine speed which, in some programs and depending on details of machine design, can be quite severe, varying typically, for example, between one and three.

The two-level storage scheme has obvious economic advantages, and inconvenience to the machine user can be eliminated by making the transfer arrangements completely automatic. In Atlas a completely automatic system has been provided with techniques for minimizing the transfer times. In this way the core and drum are merged into an apparent single level of storage with good performance and at moderate cost. Some details of this arrangement on the Muse are now provided.

The central store is subdivided into blocks of 512 words as shown by the address arrangements in Fig. 1b. The main core store is also partitioned into blocks of this size which for identification purposes are called pages. Associated with each of these core store page positions is a "page address register" (P.A.R.) which contains the address of the block of information at present occupying that page position. When access to any word in the central store is required the digits of the demanded block address are compared with the contents of all the page address registers. If an "equivalence" indication is obtained then access to that particular page position is permitted. Since a block can occupy any one of the 32 page positions in the core store it is necessary to modify some digits of the demanded block address to conform with the page positions in which an equivalence was obtained.

These processes are necessarily time consuming but by providing a by-pass of this procedure for instruction accesses (since, in general, instruction loops are all contained in the same block) then most of this time can be overlapped with a useful portion of the machine or core store rhythm. In this way information in the core store is available to the machine at the full speed of the core store and only rarely is the over-all machine speed affected by delays in the equivalence circuitry.

If a "not equivalence" indication is obtained when the demanded block address is compared with the contents of the P.A.R.'s then that address, which may have been B-modified, is first stored in a register which can be accessed as a line of the V-store. This permits the central machine easy access to this address. An "interrupt" also occurs which switches operation of the machine over to the interrupt control, which first determines the cause of the interrupt and then, in this instance, enters a fixed store routine to organize the necessary transfers of information between drum and core store.

A. Drum transfers

On each drum, one track is used to identify absolute block positions around the drum periphery. The records on these tracks are read into the registers which can be accessed as lines of the V-store and this permits the present angular drum position to be determined, though only in units of one block. In this way the time needed to transfer any block while reading from the drums can be assessed. This time varies between 2 and 14 msec since the drum revolution time is 12 msec and the actual transfer time is 2 msec.

The time of a writing transfer to the drums has been reduced by writing the block of information to the first available empty block position on any drum. Thus the access time of the drum can be eliminated provided there are a reasonable number of empty blocks on the drum. This means, however, that transfers to/from the drum have to be carried out by reference to a directory and this is stored in the subsidiary store and up-dated whenever a transfer occurs.

When the drum transfer routine is entered the first action is to determine the absolute position on a drum of the required block. The order is then given to carry out the transfer to an empty page position in the core store. The transfer occurs automatically as soon as the drum reaches the correct angular position. The page address register in the vacant position in the core store is set to a specific block number for drum transfers. This technique simplifies the engineering with regard to the provision of this number

from the drum and also provides a safeguard against transferring a wrong block.

As soon as the order asking for a read transfer from the drum has been given the machine continues with the drum transfer program. It is now concerned with determining a block to be transferred back from the core store to the drum. This is necessary to ensure an empty core store page position when the next read transfer is required. The block in the core store to be transferred has to be carefully chosen to minimize the number of transfers in the program and this optimization process is carried out by a learning program, details of which are given in Sec. 5. The operation of this program is assisted by the provision of the "use" digits which are associated with each page position of the core store.

To interchange information between the core store and drums, two transfers, a read from and a write to the drum are necessary. These have to be done sequentially but could occur in either order. The technique of having a vacant page position in the core store permits a read transfer to occur first and thus allows the time for the learning program to be overlapped either into the waiting period for the read transfer or into the transfer time itself. In the time remaining after completion of the learning program an entry is made into the over-all supervisor program for the machine, and a decision is taken concerning what the machine is to do until the drum transfer is completed. This might involve a change to a different main program.

[...] while a drum or tape transfer is taking place to that page. This is prevented in Atlas by the use of a "lock out" (L.O.) digit which is provided with each Page Address Register. When a lock out digit is set at 1, access to that page is only permitted when the address has been provided either by the drum system, the tape system, or the interrupt control. The latter case permits all transfers from paper tape, punched card, and other peripheral equipments, to be handled without interference from the main program. When the transfer of a block has been completed the organizing [...] and access to that page position can then be made from the central machine. It is clear that the L.O. digit can also be used to prevent interference between programs when several different ones are being held in the machine at the same time.

In Sec. 3 it was stated that addresses demanding access to the core store could arise from three distinct sources, the central machine, the drum, and the tape. These accesses are complicated because of (1) the equivalence technique, and (2) the lock out digit.

The provision of the Page Address Registers, the equivalence circuitry, and the learning program have permitted the core store and drum to be regarded by the ordinary machine user as a one-level store, and the system has the additional feature of "floating address" operation, i.e., any block of information can be stored in any absolute position in either core or drum store. The minimum access time to information in this store is [...] the core store and [...]

B.

The core store

is

split into four stacks,

sequential addresses it is

Source of address 1.

Central Machine

2.

Drum System

3.

Tape System

=

and write

is

[E.Q.]

Access to required page position Access to required page position Access to required page position

discussed.

each with individual address

digits are

time shared between

thus arranged across two stacks. In this way from consecutive ad-

by increasing the size of the read channel. This to be completely obeyed in three store two instructions permits dresses in parallel

"accesses." The choice of this particular storage arrangement is discussed in Appendix 2. The coordination of these four stacks is done by the "core stack

coordinator" and some features of this are

now

discussed, starting

with the operation of a single stack.

of equivalence

and lock out (

1

0)

now

possible to read a pair of instructions

Comparison of demanded block address with contents of the P.A.R.'s resultant state Equivalence

is

the various stacks. Sequential address positions occur in two stacks alternately and a page position which contains a block of 512

Table

(Lock out

obviously limited by

is

this

decoding and read and write mechanisms. The stacks are then combined in such a way that common channels into the machine

program

(

arrangement and

Core store arrangement

resets the L.O. digit to zero

1

its

for the address, read

A program could ask for access to information in a page position

is

various cases and the action that takes place are summarized

in Table

circuits

ice 1 Equivalence Lock out = 1)

Not equivalence

\

[N.E.Q.]

[E.Q. 6- L.O.]

Enter

drum

transfer routine

Not available to

this

program

Fault condition indicated

Fault condition indicated

Fault condition indicated

Fault condition indicated
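The cases of Table 1 can be sketched as a small decision function. This is only an illustration: the names `Source` and `resolve_access` and the list representation of the page address registers are assumptions made for the example, and the drum and tape rows assume, in line with the description of the lock out digit, that these sources may reach only a page whose lock out digit is 1.

```python
from enum import Enum


class Source(Enum):
    CENTRAL_MACHINE = 1
    DRUM = 2
    TAPE = 3


def resolve_access(source, block, page_address_registers, lock_out):
    """Return (action, page) for one demanded block address.

    page_address_registers: block number held in each page position
                            (None for an empty position).
    lock_out: the 0/1 lock out digit of each page position.
    Both representations are hypothetical, chosen for the sketch.
    """
    # Equivalence: compare the demanded block with every P.A.R.
    for page, held in enumerate(page_address_registers):
        if held == block:
            locked = lock_out[page] == 1
            if source is Source.CENTRAL_MACHINE:
                action = "not available to this program" if locked else "access"
            else:
                # Assumption: drum and tape may only use locked-out pages.
                action = "access" if locked else "fault"
            return action, page
    # Not equivalence: only the central machine may cause a drum transfer.
    if source is Source.CENTRAL_MACHINE:
        return "enter drum transfer routine", None
    return "fault", None


pars = [17, 42, None]   # blocks 17 and 42 are in core, one page is empty
lock = [0, 1, 0]        # block 42 is in the middle of a transfer
```

With these values a central machine request for block 42 is refused while the transfer is in progress, whereas the drum system addressing the same page is given access.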

C. Operation of a single stack of core store

The storage system employed is a coincident current M.I.T. system arranged to give parallel read out of 50 digits. The reading operation is destructive and each read phase of the stack cycle is followed by a write phase during which the information read out may be rewritten. This is achieved by a set of digit staticizors which are loaded during the read phase and are used to control the inhibit current drivers during the write phase. When new information is to be written into the store a similar sequence is followed, except that the digit staticizors are loaded with the new information during the read phase. A diagram indicating the different types of stack cycle is shown in Fig. 3.

There is a small delay WD (~100 mμsec) between the "stack request" signal, SR, and the start of the read phase to allow for setting of the address state and the address decoding. The output information from the store appears in the read strobe period, which is towards the end of the read phase. In general, the write phase starts as soon as the read phase ends. However, the start of the write phase may be held up until the new information is available from the central machine. This delay is shown as WW in Fig. 3c.

The interval TA between the stack request and the read strobe is termed the stack access time, and in practice this is approximately one third of the cycle time TC. Both TA and TC are functions of the storage system and, assuming that WW is zero, have typical values of 0.7 μsec and 1.9 μsec respectively. A holdup gate in the request channel prevents the next stack request occurring before the end of the preceding write phase.
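The timing of a single stack can be illustrated with a short calculation. The sketch below is a simplified model, not a description of the circuitry: the function name `stack_schedule` is hypothetical, times are in mμsec (nanoseconds), the decoding delay WD is taken to be included in TA, and a request held by the holdup gate is assumed to be issued the moment the preceding write phase ends.

```python
T_A_NS = 700    # stack access time TA, about 0.7 microseconds
T_C_NS = 1900   # stack cycle time TC with no write hold up, about 1.9 microseconds


def stack_schedule(request_times_ns, write_waits_ns=None):
    """Return (start, read_strobe, end_of_write) for each request to one stack.

    write_waits_ns gives the delay WW before each write phase; it defaults
    to zero for every cycle.
    """
    if write_waits_ns is None:
        write_waits_ns = [0] * len(request_times_ns)
    free_at = 0
    schedule = []
    for requested, wait in zip(request_times_ns, write_waits_ns):
        # Holdup gate: no new request before the preceding write phase ends.
        start = max(requested, free_at)
        strobe = start + T_A_NS
        end = start + T_C_NS + wait
        schedule.append((start, strobe, end))
        free_at = end
    return schedule
```

For example, two requests arriving at 0 and 500 mμsec give read strobes at 700 and 2600 mμsec: the second request waits for the first write phase to finish, which is why a single stack cannot deliver a word more often than once per cycle time.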

Fig. 3. Basic types of stack cycle. (a) Read order (s → A). (b) Write order (a → s). (c) Read-write order (b + s → S). TA = access time; TC = cyclic time; WD = wait for address decoding and loading of address register; WW = wait for release of write hold up.

D. Operation of the main core store with the central machine

A schematic diagram of the essentials of the main core store control system is shown in Fig. 4. The control signals SA1 and SA2 indicate whether the address presented is that of a single word or a pair of sequentially addressed instructions. Assuming that the flip-flop F is in the reset condition, either of these signals results in the loading of the buffer address register (B.A.R.). This loading is done by the signal B.A.B.A. which also indicates that the buffer register in the central machine has become free.

In dealing with the request the block address digits in the B.A.R. are first compared with the contents of all the page address registers. Then one of the indications summarized in Table 1 and indicated in Fig. 4 is obtained. Assuming access to the required store stack is permitted then a SET CSF signal is given which resets the flip-flop F. If this occurs before the next access request arises, then the speed of the system is not store-limited. In most cases SET CSF is generated when the equivalence operation on the demanded block address is complete, and the read phase of the appropriate stack (or stacks) has started. Until this time the information held in the B.A.R. must not be allowed to change.

In Fig. 5 a flow diagram is shown for the various cases which can arise in practice. When a single address request is accepted it is necessary to obtain an "equivalence" indication and form the page location digits before the stack request can be generated. The SET CSF signal then occurs as soon as the read phase starts. If a "not equivalent" or "equivalent and locked out" indication is obtained a stack request is not generated, and the contents of the B.A.R. are copied in to a line of the V-store before SET CSF is generated.

When access to a pair of addresses is requested (i.e., an instruc-

[Fig. 4. Main core store control system. Labels: buffer address register (block address, line address); page address registers 0 to 31; equivalence circuitry (EQ, NEQ, EQ & L.O.); page digit register; comparison circuit (right page, wrong page); instruction address and not instruction address; control circuitry; stack request; stack address.]

It is necessary to ensure a certain minimum time between successive read strobes from the core store stacks to allow satisfactory operation of the parity circuits, which take about 0.4 μsec to check the information. This time could be reduced, but as it is only possible to get such a condition for a small part of the normal instruction timing cycle it was not thought to be an economical proposition.

The basic machine timing is now discussed.

4. Instruction times

In high-speed computers, one of the main factors limiting speed of operation is the store cycle time. Here a number of techniques, e.g., splitting the core store into four separate stacks and extracting two instructions in a single cycle, have been adopted despite a fast basic cycle time of 2 μsec in order to alleviate this situation. The time taken to complete an instruction is dependent upon

1 The type of instruction (which is defined by the function digits)
2 The exact location of the instruction and operand in the core or fixed store since this can affect the access time
3 Whether or not the operand address is to be modified
4 In the case of floating point accumulator orders, the actual numbers themselves
5 Whether drum and/or tape transfers are taking place

The approximate times for various instructions are given in Table 2. These figures relate to the times between completing instructions when obeying a long sequence of the same type of instruction. While this method is not ideal, it is necessary because in practice obeying one instruction is overlapped in time with some part of three other instructions. This makes the detailed timing complicated, and so the timing sequence is developed slowly by first considering instructions obeyed one after another. It is convenient to make these instructions a sequence of floating point additions with both instruction and operand in the core store and with the operand address single B-modified.

[Table 2. Approximate instruction times, by type of instruction.]

To obey this instruction the central machine makes two requests to the core store, one for the instruction and the second for the operand. After the instruction is received in the machine the function part has to be decoded and the operand address modified by the contents of one of the B registers before the operand request can be made. Finally, after the operand has been obtained the actual accumulator addition takes place to complete the instruction. The time from beginning to end of one instruction is 6.05 μsec and an approximate timing schedule is as follows in Table 3.

If no other action is permitted in the time required to complete the instruction (steps 1 to 8 in Table 3), then the different sections of the machine are being used very inefficiently, e.g., the accumulator adder is only used for less than 1.1 μsec. However, the organization of the computer is such that the different sections such as store stacks, accumulator and B-arithmetic unit, can operate

Table 3 Timing sequence for floating point addition (instructions and operands in the core store)

Fig. 6. Timing diagram for a sequence of floating point addition orders. (Single-address modification.)
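The inefficiency of obeying floating point additions strictly one after another can be put as a single ratio. The sketch below uses only the two figures quoted in Sec. 4 (6.05 μsec for the whole instruction, less than 1.1 μsec of accumulator adder activity); the function name `utilization` is an assumption made for the example.

```python
def utilization(busy_us, total_us):
    """Fraction of the instruction time during which a unit is in use."""
    return busy_us / total_us


# Upper bound, since the adder is stated to be busy for less than 1.1 microseconds.
adder_share = utilization(1.1, 6.05)
```

The adder is therefore busy for at most about 18 per cent of a non-overlapped instruction, which is the motivation for overlapping successive orders as in Fig. 6.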

1 Element of first vector into accumulator. (Operand B-modified.)
2 Multiply accumulator by element of second vector. (Operand B-modified.)
3 Add partial product to accumulator.
4 Copy accumulator to store line containing partial product.
5 Alter count to select next elements and repeat.

The time for this loop with instructions and operands on the core store is 12.2 μsec. The value of the overlapping technique is shown by the fact that the time from starting the first instruction to finishing the second is approximately 10 μsec.

When the drum or tape systems are transferring information to or from the core store then the rate of obeying instructions which also use the core store will be affected. The effect is discussed in more detail in Appendix 1. The degree of slowing down is dependent upon the time at which a drum or tape request occurs relative to machine requests. It also depends on the stacks used by the drum or tape and those being used by the central machine. The approximate slowing down is by a factor of 25 per cent during a drum transfer and by 2 per cent for each active tape channel. (See Appendix 1.)

5. The drum transfer learning program

The organization of drum transfers has been described in Sec. 2A. After the transfer of the required block from the drum to the core store has been initiated, the organizing program examines the state of the core store, and if empty pages still exist, no further action is taken. However, if the core store is full it is necessary to arrange for an empty page to be made available for use at the next nonequivalence. The selection of the page to be transferred could be made at random; this could easily result in many additional transfers occurring, as the page selected could be one of those in current use or one required in the near future. The ideal selection, which would minimize the total number of transfers, could only be made by the programmer. To make this ideal selection the programmer would have to know (1) precisely how his program operated, which is not always the case, and (2) the precise amount of core store available to his program at any instant. This latter information is not generally available as the core store could be shared by other central machine programs, and almost certainly by some fixed store program organizing the input and output of information from slow peripheral equipments. The amount of core store required by this fixed store program is continuously varying [Kilburn et al., 1961]. The only way the ideal pattern of transfers can be approached is for the transfer program to monitor the behavior of the main program and in so doing attempt to select the correct pages to be transferred to the drum. The techniques used for monitoring are subject to the condition that they must not slow down the operation of the program to such an extent that they offset any reduction in the number of transfers required. The method described occupies less than 1 per cent of the operating time, and the reduction in the number of transfers is more than sufficient to cover this.

That part of the transfer program which organizes the selection of the page to be transferred has been called the "learning" program. In order for this program to have some data on which to operate, the machine has been designed to supply information about the use made of the different pages of the core store by the program being monitored.

With each page of the core store there is associated a "use" digit which is set to "1" whenever any line in that page is accessed. The 32 "use" digits exist in two lines of the V-store and can be read by the learning program, the reading automatically resetting them to zero. The frequency with which these digits are read is governed by a clock which measures not real time but the number of instructions obeyed in the operation of the main program. This clock causes the learning program to copy the "use" digits to a list in the subsidiary store every 1024 instructions. The use of an instruction counter rather than a normal clock to measure "time" for the learning program is due to the fact that the operations of the main program may be interrupted at random for random lengths of time by the operation of peripheral equipments. With an instruction counter the temporal pattern of the blocks used will be the same on successive runs through the same part of the program. This is essential if the learning program is to make use of this pattern to minimize the number of transfers.

When a nonequivalence occurs and after the transfer of the required block has been arranged, the learning program again adds the current values of the "use" digits to the list and then uses this list to bring up to date two sets of times also kept in the subsidiary store. These sets consist of 32 values of t and T, one of each for each page of the core store. The value of t is the length of time since the block in that page has been used. The value of T is the length of the last period of inactivity of this block. The accuracy of the values of t and T is governed by the frequency with which the "use" digits are inspected.

The page to be written to the drum is selected by the application in turn of three simple tests to the values of t and T.

1 Any page for which t > T + 1, or
2 That page with t ≠ 0 and (T − t) max, or
3 That page with T max (all t = 0).

The first rule selects any page which has been currently out of use for longer than its last period of inactivity. Such a page has probably ceased to be used by the program and is therefore an ideal one to be transferred to the drum. The second rule ignores all pages with t = 0 as they are in current use, and then selects the one which, if the pattern of use is maintained, will not be required by the program for the longest time. If the first two rules fail to select a page the third ensures that if the page finally selected is wrong, in that it is immediately required again, then, as in this case T will become zero, the same mistake will not be repeated.

For all the blocks on the drum a list of values of τ is kept. These values of τ are set when the block is transferred to the drum:

τ = time of transfer − value of t for transferred page

When a block is transferred to the core store the value of τ is used to set the value of T:

T = time of transfer − value of τ for this block
  = length of last period of inactivity

For the block transferred from the drum t is set to 0.

In order to make its decision the learning program has only to update two short lists and apply at the most three simple rules; this can easily be done during the 2 msec transfer time of the block required as a result of the nonequivalence. As the learning program uses only fixed and subsidiary store addresses it is not slowed down during the period of the drum transfer.

The over-all efficiency of the learning program cannot be known until the complete Atlas system is working. However, the value of the method used has been investigated by simulating the behavior of the one-level store and learning program on the Mercury computer at Manchester University. This has been done for several problems using varying amounts of store in excess of the core store available. One of these was the problem of forming the product A of two 80th order matrices B and C. The three matrices were stored row by row each one extending over 14 blocks; only 14 pages of core store were assumed to be available. The method of multiplication was

b11 × 1st row of C = partial answer to 1st row of A
b12 × 2nd row of C + partial answer = second partial answer, etc.

Thus matrix B was scanned once, matrix C 80 times and each row of matrix A 80 times.

Several machine users were asked to spend a short time writing a program to organize the transfers for a general matrix multiplication problem. In no case when the method was applied to the above problem were fewer than 357 transfers required. A program written specifically for this problem which paid great attention to the distribution of the rows of the matrices relative to block divisions required 274 transfers. The learning program required 234 transfers; the gain over the human programmer was chiefly due to the fact that the learning program could take full advantage of the occasions when the rows of A existed entirely within one block.
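The three selection tests and the bookkeeping of t, T and τ described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the Atlas program: the names `PageTimes`, `update`, `select_page`, `to_drum` and `from_drum` are hypothetical, time is counted in inspection intervals (one per 1024 instructions), and the way T is refreshed from the "use" digits (T takes the value of t when a page is used again after a gap) is an assumption made to match the definition of T as the last period of inactivity.

```python
from dataclasses import dataclass


@dataclass
class PageTimes:
    t: int = 0  # time since the block in this page was last used
    T: int = 0  # length of the last period of inactivity of this block


def update(pages, use_digits):
    """Bring t and T up to date from one reading of the "use" digits."""
    for page, used in zip(pages, use_digits):
        if used:
            if page.t > 0:
                page.T = page.t  # assumption: an inactive period has just ended
            page.t = 0
        else:
            page.t += 1


def select_page(pages):
    """Apply the three tests in turn and return the index of the page to write out."""
    # Rule 1: any page out of use for longer than its last period of inactivity.
    for i, page in enumerate(pages):
        if page.t > page.T + 1:
            return i
    # Rule 2: ignore pages in current use (t = 0), then take the largest T - t.
    idle = [i for i, page in enumerate(pages) if page.t != 0]
    if idle:
        return max(idle, key=lambda i: pages[i].T - pages[i].t)
    # Rule 3: all t = 0, so take the page with the largest T.
    return max(range(len(pages)), key=lambda i: pages[i].T)


def to_drum(page, now):
    """Value of tau recorded for a block when its page is written to the drum."""
    return now - page.t


def from_drum(tau, now):
    """Times for a block just read back: T = now - tau, t = 0."""
    return PageTimes(t=0, T=now - tau)
```

Because τ records the moment a block was last used, a block that is recalled immediately after being written out returns with a small T, so rule 3 will not choose it again in preference to pages with longer periods of inactivity.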

Many other problems involving cyclic running of single or multiple sets of data were simulated, and in no case did the learning program require more transfers than an experienced human programmer.

A. Prediction of drum transfers

Although the learning program tends to reduce the number of transfers required to a minimum, the transfers which do occur still interrupt the operation of the program for between 2 and 14 msec as they are initiated by nonequivalence interrupts. Some of this time loss could be avoided by organizing the transfers in advance. A very experienced programmer having sole use of the core store could arrange his own transfers in such a way that no unnecessary ones ever occurred and no time was ever wasted waiting for transfers to be completed. This would require a great deal of effort and would only be worthwhile for a program that was going to occupy the machine for a long time. By using the data accumulated by the learning program it is possible to recognize simple patterns in the use made by a program of the various blocks of the one-level store. In this way a prediction program could forecast the blocks required in the near future and organize the transfers. By recording the success or failure of these forecasts the program could be made self-improving. For the matrix multiplication problem discussed above the pattern of use of the blocks containing matrix C is repeated 80 times, and a considerable degree of success could be obtained with a simple prediction program.

6. Conclusions

A specific system for making a core-drum store combination appear as a single level store has been described. While this is the actual system being built for the Atlas machine the principles involved are applicable to combinations of other types of store, for example, a tunnel diode-fast core store combination for an even faster machine. An alternative which was considered for Atlas, but which was not as attractive economically, was a fast core-slow core store combination. The system too can be extended to three levels of storage, and indeed if 10^6 words of total storage had to be provided then it would be most economical to provide it on a third level of store such as a file drum.

The automatic system does require additional equipment and introduces some complexity, since it is necessary to overlap the time taken for address comparison into the store and machine operating time if it is not to introduce any extra time delays. Simulated tests have shown that the organization of drum transfers is reasonably efficient, and other advantages which accrue, such as efficient allocation of core storage between different programs and store lock out facilities, are also invaluable. No matter how intelligent a programmer may be he can never know how many programs or peripheral equipments are in operation when his program is running. The advantage of the automatic system is that it takes into account the state of the machine as it exists at any particular time. Furthermore if, as in normal use, there is some sort of regular machine rhythm even through several programs, there is the possibility of making some sort of prediction with regard to the transfers necessary. This involves no more hardware and will be done by program. However, this stage will probably be left until results on the actual system are obtained.

It can be seen that the system is both useful and flexible in that it can be modified or extended in the manner previously indicated. Thus despite the increase in equipment, the advantages which are derived completely justify the building of this automatic system.

APPENDIX 1 ORGANIZATION OF THE ACCESS REQUESTS TO THE CORE STORE

There are three sources of access requests to the core store, namely the central machine, the drum, and the tape systems. In deciding how the sequence of requests from all three sources are to be serialized and placed in some sort of order, a number of facts have to be considered. These are

1 All three sources are asynchronous in nature.
2 The drum and tape systems can make requests at a fairly high rate compared with the store cycle time of approximately 2 μsec. For example, the drum provides a request every 4 μsec and the tape system every 11 μsec when all 8 channels are operative.
3 The drum and tape systems can only be stopped in multiples of a block length, i.e., 512 words. This means that any system devised for accessing the core store must deal with both the average rates of drum and tape requests specified in 2. Only the central machine can tolerate requests being stopped at any time and for any length of time.

From these facts a request priority can be stated which is

a Drum request.
b Tape request.
c Central machine request.
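The request priority stated in Appendix 1 (drum, then tape, then central machine) amounts to a fixed-priority choice among whichever sources have a request outstanding. The sketch below shows only that ordering; the name `next_request` and the use of a set of pending sources are assumptions made for the example, and the break-in mechanism that enforces the priority in the hardware is not modelled.

```python
# Fixed priority order: the sources that cannot be stopped at arbitrary
# moments come first, the central machine last.
PRIORITY = ("drum", "tape", "central machine")


def next_request(pending):
    """Return the pending source that should be granted the core store next."""
    for source in PRIORITY:
        if source in pending:
            return source
    return None  # nothing is waiting
```

The central machine is placed last because it is the only source that can tolerate having its requests stopped at any time and for any length of time.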

4 A machine request can be accepted by the core store, but because there is no place available to accept the core store information, its cycle is inhibited and further requests held up. In the case of successive division orders this time can be as long as 20 μsec, in which case 5 drum requests could be made. To avoid having an excessive amount of buffer storage for the drum two techniques are possible:
a When drums or tapes are operative do not permit machine requests to be accepted until there is a place available to put the information.
b Store the machine request and then permit a drum or tape request.
The latter scheme has been adopted because it can be accommodated more conveniently and it saves a small amount of time.
5 If the central machine is using the private store then it is desirable for drum and tape transfers to the core store not to interfere with or slow down the central machine in any way.
6 When the central machine, drum and tape are sharing the core store then the loss of central machine speed should be roughly proportional to the activity of the drum or tape systems. This means that drum or tape requests must "break" into the normal machine request channel as and when required.

The system which accommodates all these points is now discussed. Whenever a drum or tape request occurs inhibit signals are applied to the request channel into the core stack coordinator and also to the stack request channels from this coordinator. This results in a "freezing" of the state of flip-flop F (Fig. 5) and this state is then inspected (Fig. 7, point X). If the state is "busy" this means that a machine order has been stopped somewhere between the loading of the buffer address register (B.A.R.) and the stack request. Normally this time interval can vary from about 0.5 μsec if there are no stack request holdups, to 20 μsec in the case of certain accumulator holdups. In either case sufficient time is allowed after the inspection to ensure that the equivalence operation has been completed. If an equivalence indication is obtained all the information relevant to this machine order (i.e., the line address, page digits, stack(s) required and type of stack order) is stored for future reference. Use is made here of the page digit register provided to allow the by-pass on the equivalence circuitry for instruction accesses. The core store is then made free for access by the drum or the tape. If the core store had been found free on inspection, the above procedure is omitted.

Fig. 7. Drum and tape break in systems.

A drum or tape access (as decided by the priority circuit) to the core store then occurs, which removes the inhibits on the stack request channels. When the stack request for the drum or tape cycle is initiated these inhibits are allowed to reapply. At this stage (Fig. 7, point Y), if there is a stored machine order it is allowed to proceed if possible. The inhibits on the machine request channels are removed when the stack request for the stored machine order occurs. If there is no stored machine order this is done

immediately, and the central machine is again allowed access to the core store. However, another drum or tape request can arise before the stack request of the stored machine order occurs, in particular because this latter order may still be held up by the central machine. If this is the case the drum or tape is allowed immediate access and a further attempt is made to complete the stored machine order when this drum or tape stack request occurs.

If the stored machine order was for an operand, the content of the page digit register will correspond to the location of this operand. The next machine request for an instruction pair will then almost certainly result in a "wrong page" indication. This is prevented by arranging that the next instruction pair access does not by-pass the equivalence circuitry.

The effect on the machine speed when the drum or tapes are transferring information to or from the core store is dependent upon two factors. First, upon the proportion of time during which the buffer register in the core coordinator is busy dealing with machine requests, and secondly, upon the particular stacks being used by the central machine and the drum or tape. If the computer is obeying a program with instructions and operands on the fixed or subsidiary store then the rate of obeying instructions is unaffected by drum or tape transfers. A drum or tape interrupt occurring when the B.A.R. is free prevents any machine address being accepted onto this buffer for 1.0 μsec. However, if the B.A.R. is busy then the next machine request to the core store is delayed until 1.8 μsec after the interrupt if different stacks are being used, or until 3.4 μsec after the interrupt if the stacks are the same.

When the machine is obeying a program with instructions and operands on the core store the slowing down during drum transfers can be by a factor of two if instructions, operands, and drum requests use the same stacks. It is also possible for the machine to be unaffected. The effect on a particular sequence of orders can be seen by considering the one discussed in Sec. 4 and illustrated in Fig. 6. In this sequence the instructions are on stacks ... and 1 ... drum or tape ... the result in this particular case that the machine can still operate at 80 per cent of its normal speed.

APPENDIX 2 METHODS OF DIVISION OF THE MAIN CORE STORE

The maximum frequency with which requests can be dealt with by a single stack core store is governed by the cycle time of the store. If the store is divided into several stacks which can be cycled independently then the limit imposed on the speed of the machine by the core store is reduced. The degree of division which is chosen is dependent upon the ratio of core store cycle time to other machine operations and also upon the cost of the multiple selection mechanisms required.

Considering a sequence of orders in which both the instruction and operand are in the core store, then for a single stack store ... than the limits imposed by other sections of the computer (Sec. 4). If the store is divided into two stacks and instructions and operands are separated, then the limit is reduced to 2 μsec which is still rather high. The provision of two stacks permits the addressing of the store to be arranged so that successive addresses are in alternate stacks. It is therefore possible by making requests to both stacks at the same time to read two instructions together, so reducing the number of access times to three per instruction pair. Unfortunately such an arrangement of the store means that operands are always on the same stacks as instruction pairs, and the limit imposed by the cycle time is still 2 μsec per order even if the two operand requests in the instruction pair are to different stacks and occur at the same time.

Division into any number of stacks with the addressing system working through each stack in turn cannot reduce the limit below 2 μsec since successive instructions normally occur in successive addresses and are therefore in the same stack. However, four stacks

arranged in two pairs reduces the limit to 1 /usee as the operands can always be arranged to be on different stacks from the instruc-

while the operands are on stacks 2 and is

the limit imposed on the operating speed by the store is two cycle times per order, i.e., 4 /usee in Atlas. This is significantly larger

transferring alternately to stacks

and

1

3. If

the

then the effect

any interrupt within the 3.2 /usee of an instruction pair is to increase this time by between 0.5 and 3.4 /usee depending upon of

tion pairs. In order to reduce the limit to 0.5 /usee

it is

necessary

have eight stacks arranged in two sets of four and to read four instructions at once, which would increase the complexity of the to

where the interrupt occurred. The average increase is 1.8 /usee and for a tape transfer with interrupts every 88 /usee the computer

central machine.

98 per cent of the normal rate. During drum transfers the interrupts occur every 4 jusec which would suggest a slowing down to 60 per cent of normal. However, for

The limit of 1 /usee is quite sufficient and further division with the stacks arranged in pairs only enables the limit to be more easily obtained by suitable location of the instructions and operands.

can obey instructions

at

any regular sequence of orders the requests to the core store by the machine and by the drum rapidly become synchronized with

is

The location of instructions and operands within the core store under the control of the drum transfer program; thus when there


Part 3 | The instruction-set processor level: variations in the processor

Section 6 | Processors with multiprogramming ability

Chapter 24

A user machine in a time-sharing system¹

B. W. Lampson / W. W. Lichtenberger / M. W. Pirtle

¹Proc. IEEE, vol. 54, no. 12, pp. 1766-1774, December, 1966.

Summary

This paper describes the design of the computer seen by a machine-language programmer in a time-sharing system developed at the University of California at Berkeley. Some of the instructions in this machine are executed by the hardware, and some are implemented by software. The user, however, thinks of them all as part of his machine, a machine having extensive and unusual capabilities, many of which might be part of the hardware of a (considerably more expensive) computer.

Among the important features of the machine are the arithmetic and string manipulation instructions, the very general memory allocation and configuration mechanism, and the multiple processes which can be created by the program. Facilities are provided for communication among these processes and for the control of exceptional conditions.

The input-output system is capable of handling all of the peripheral equipment in a uniform and convenient manner through files having symbolic names. Programs can access files belonging to a number of people, but each person can protect his own files from unauthorized access by others.

Some mention is made at various points of the techniques of implementation, but the main emphasis is on the appearance of the user's machine.

Introduction

A characteristic of a time-sharing system is that the computer seen by the user programming in machine language differs from that on which the system is implemented [Bright, 1964; Comfort, 1965; Forgie, 1965; McCullogh et al., 1965; Schwartz, 1964]. In fact, the user machine is defined by the combination of the time-sharing hardware running in user mode and the software which controls input-output, deals with illegal actions which may be taken by a user's program, and provides various other services. If the hardware is arranged in such a way that calls on the system have the same form as the hardware instructions of the machine [Lichtenberger and Pirtle, 1965], then the distinction becomes irrelevant to the user; he simply programs a machine with an unusual and powerful instruction set which relieves him of many of the problems of conventional machine-language programming [Lampson, 1965; McCarthy et al., 1963].

In a time-sharing system which has been developed by and for the use of members of Project Genie at the University of California at Berkeley [Lichtenberger and Pirtle, 1965], the user machine has a number of interesting characteristics. The computer in this system is an SDS 930, a 24 bit, fixed-point machine with one index register, multi-level indirect addressing, a 14 bit address field, and 32 thousand words of 1.75 μs memory in two independent modules. Figure 1 shows the basic configuration of equipment. The memory is interleaved between the two modules so that processing and drum transfers may occur simultaneously.

A detailed description of the various hardware modifications of the computer and their implications for the performance of the overall system has been given in a previous paper [Lichtenberger and Pirtle, 1965]. Briefly, these modifications include the addition of monitor and user modes in which, for user mode, the execution of a class of instructions is prevented and replaced by a trap to a system routine.

The protection from unauthorized access to memory has been subsumed in an address mapping scheme: both the 16 384 words addressable by a user program (logical addresses) and the 32 768 words of actual core memory (physical addresses) have been divided into 2048-word pages. A set of eight six-bit hardware registers defines a map from the logical address space to the real memory by specifying the real page which is to correspond to each of the user's logical pages. Implicit in this scheme is the capability of marking each of the user's pages as unassigned or read-only, so that any attempt to access such a page improperly will result in a trap.

All memory references in user mode are mapped. In monitor mode, all memory references are normally absolute. It is possible, however, with any instruction in monitor mode, or even within a chain of indirect addressing, to specify use of the user map. Furthermore, in monitor mode the top 4096 words are mapped through two additional registers called the monitor map. The mapping process is illustrated in Fig. 2.
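The user-mode mapping described above can be illustrated with a small model. This is only a sketch under stated assumptions: the text gives the page size (2048 words), the number of map registers (eight), and the existence of unassigned and read-only markings, but not how those markings are encoded in the six bits. The names `UserMap`, `MapTrap`, and the `(real_page, read_only)` tuple layout are hypothetical and chosen for illustration.

```python
PAGE_SIZE = 2048      # words per page, as stated in the text
UNASSIGNED = None     # hypothetical marker for a logical page with no real page


class MapTrap(Exception):
    """Stands in for the hardware trap to a system routine."""


class UserMap:
    """Eight map registers, one per logical page of the 16 384-word space."""

    def __init__(self, entries):
        # Each entry is UNASSIGNED or a (real_page, read_only) pair.
        # The real encoding inside six bits is not given in the text.
        if len(entries) != 8:
            raise ValueError("the user map has exactly eight registers")
        self.entries = list(entries)

    def translate(self, logical_address, write=False):
        """Map a 14-bit logical address to a physical address, or trap."""
        if not 0 <= logical_address < 8 * PAGE_SIZE:
            raise ValueError("logical addresses are 14 bits")
        page, offset = divmod(logical_address, PAGE_SIZE)
        entry = self.entries[page]
        if entry is UNASSIGNED:
            raise MapTrap(f"logical page {page} is unassigned")
        real_page, read_only = entry
        if write and read_only:
            raise MapTrap(f"logical page {page} is read-only")
        return real_page * PAGE_SIZE + offset


# Logical page 0 -> real page 5 (writable), logical page 1 -> real page 9 (read-only).
example_map = UserMap([(5, False), (9, True)] + [UNASSIGNED] * 6)
```

With this map, logical address 3 falls in logical page 0 and translates to word 3 of real page 5, while any store into logical page 1 or any reference to pages 2 through 7 raises `MapTrap`.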

Another significant hardware modification is the mechanism for going between modes. Once the machine is in user mode, it can get to monitor mode under three circumstances:

1  If a hardware interrupt occurs.

2  If a trap is generated by the user program as outlined.

3  If an instruction with a particular configuration of two bits is executed. Such an instruction is called a system programmed operator (SYSPOP).

In case 3, the six-bit operation field is used to select one of 64 locations in absolute core. The current address of the instruction is put into absolute location zero as a subroutine link, the indirect address bit of this link word is set, and another bit is set, marking the memory location in the link word as having come from user-mapped memory. The system routine thus invoked may take a parameter from the word addressed by the SYSPOP, since its address field is not interpreted by the hardware. The routine will address the parameter indirectly through location zero and, because of the bit marking the contents of location zero as having come from user mode, the user map will be applied to the remainder of the address indirection. All calls on the system which are not inadvertent are made in this way.

A monitor mode program gets into user mode by transferring to an address with mapping specified. This means, among other things, that a SYSPOP can return to the user program simply by branching indirect through location zero.

As the above discussion has perhaps indicated, the mode-changing arrangements are very clean and permit rapid and natural transfers of control between user and system programs. Advantage has been taken of this fact to create a rather grandiose machine for the user. Its features are the subject of this paper.

Fig. 1. Configuration of equipment.

Basic features of the machine

A user in the Berkeley time-sharing system, working at what he thinks of as the hardware language level, has at his disposal a machine with a configuration and capability which can be conveniently controlled by the execution of machine instruction sequences. Its simplest configuration is very similar to that of a
standard medium-sized computer. In this configuration, the machine possesses the standard 930 complement of arithmetic and logic instructions and, in addition, a set of software interpreted monitor and executive instructions. The latter instructions, which will be discussed more fully in the following, do rather complex input-output of many different kinds, perform many frequently used table lookup and string processing functions, implement floating point operations, and provide for the creation of more complex machine configurations. Some examples of the instructions available are:

1  Load A, B, or X (index) registers from memory or store any of the registers. Indexing and indirect addressing are available on these and almost all other instructions. Double word load and store are also available.

2  The normal complement of fixed-point arithmetic and logic operations.

3  Skips on various arithmetic and logic conditions.

4  Floating point arithmetic and input-output. The latter is in free format or in the equivalent of Fortran E or F format.

5  Input a character from a teletype or write a block of arbitrary length on a drum file.

6  Look up a string in a hash-coded table and obtain its position in the table.

7  Create a new process and start it running concurrently with the present one at a specified point.

8  Redefine the memory of the machine to include a portion of that which is also being used by another program.

It should be emphasized that, although many of these instructions are software interpreted, their format is identical to the standard machine instruction format, with the exception of the one bit which specifies a system interpreted instruction. Since the system interpretation of these instructions is completely invisible to the machine user, and since these instructions do have the standard machine instruction format, the user and his program make no distinction between hardware and software interpreted instructions.

Some of the possible 192 operation codes are not legal in the user machine. Included in this category are those hardware instructions which would halt the machine or interfere with the input-output if allowed to execute, and those software interpreted instructions which attempt to do things which are forbidden to the program. Attempted execution of one of these instructions will result in an illegal instruction violation. The effect of an illegal instruction violation is described later.

Memory configuration

The memory size and organization of the machine is specified by an appropriate sequence of instructions. For example, the user may specify a machine which has 6K of memory with addresses from 0 to 13777₈; alternatively, he may specify that the 6K should include addresses 0 to 3777₈, 14000₈ to 17777₈, and 34000₈ to 37777₈. The user may also specify the size and configuration of the machine's secondary storage and, to a considerable extent, the structure of its input-output system. A full discussion of this capability will be deferred to a later section.

The next few paragraphs discuss the mechanism by which the user's program may specify its memory size and organization. This mechanism, known as the process map to distinguish it from the hardware memory address mapping, uses a (software) mapping register consisting of eight 6-bit bytes, one byte for each of the eight 2K blocks addressable by the 14 bit address field of an instruction. Each of these bytes either is 0 or addresses one of the 63 words in a table called the private memory table (PMT). Each user has his own private memory table. An entry in this table provides information about a particular 2K block of memory. The block may be either local to the user or it may be shared. If the block is local, the entry gives information about whether it is currently in core or on the drum. This information is important to the system but need not concern the user. If the block is shared, its PMT entry points to an entry in another table called the shared memory table (SMT). Entries in this table describe blocks of memory which are shared by several users. Such blocks may contain invariant programs and constants, in which case they will be marked as read-only, or they may contain arbitrary data which is being processed by programs belonging to two different users.

A possible arrangement of logical or virtual memory for a process is shown in Fig. 3. The nature of each page has been noted in the picture of the virtual memory; this information can also be obtained by taking the corresponding byte of the map and looking at the PMT entry specified by that byte. The figure shows a large amount of shared memory, which suggests that the process might be a compiler sharing the code for compilation with other processes translating programs written in the same source language. Virtual pages one and two might hold tables and temporary storage which are unique to each separate compilation. Note that, although the flexibility of the map allows any block of code or data to appear anywhere in the virtual memory, it is certainly not true that a program can run regardless of which pages

... common to ... the routine and data base do not fit into 16K, or where several routines are concurrently employed, it may be necessary to make frequent adjustment to the map during execution.
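The process map, private memory table, and shared memory table described in this section can be modelled with a few small records. The following is an illustrative sketch only: the text states that each of the eight map bytes is either 0 or selects one of 63 PMT entries, and that a PMT entry is either local (in core or on the drum) or points to an SMT entry. The field names, the use of a Python dict for the PMT, and the function `describe_page` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedBlock:
    """One entry of the shared memory table (SMT); fields are hypothetical."""
    name: str
    read_only: bool


@dataclass
class PMTEntry:
    """One entry of a user's private memory table (PMT); fields are hypothetical."""
    in_core: bool = False                  # meaningful for local blocks only
    shared: Optional[SharedBlock] = None   # set when the 2K block is shared


def describe_page(process_map, pmt, page):
    """Describe one of the eight 2K virtual pages of a process."""
    byte = process_map[page]
    if byte == 0:                 # a zero byte means no block is assigned
        return "unassigned"
    entry = pmt[byte]             # bytes 1..63 select a PMT entry
    if entry.shared is not None:
        kind = "read-only" if entry.shared.read_only else "read-write"
        return f"shared {kind} ({entry.shared.name})"
    return "local, in core" if entry.in_core else "local, on drum"


# A compiler-like arrangement: shared code in page 0, private tables in pages 1 and 2.
compiler_code = SharedBlock("compiler code", read_only=True)
example_pmt = {
    1: PMTEntry(shared=compiler_code),
    2: PMTEntry(in_core=True),
    3: PMTEntry(in_core=False),
}
example_process_map = [1, 2, 3, 0, 0, 0, 0, 0]
```

Adjusting the map during execution, as the text mentions, corresponds here to storing a different PMT index into one of the eight positions of `example_process_map`.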

Multiple processes

An important feature of the user machine allows the user program, which in the current context will be referred to as the controlling process, to establish one or more subsidiary processes. With a few minor exceptions, to be discussed, each subsidiary process has the same status as the controlling process. Thus, it may in turn establish a subsidiary process. It is therefore apparent that the user machine is in fact a multi-processing machine. The original suggestion which gave rise to this capability was made by Conway [Conway, 1963]; more recently the Multics system has included a multi-process capability [Corbato and Vyssotsky, 1965; Dennis and Van Horn, 1966; Saltzer, 1966].

A process is the logical environment for the execution of a program, as contrasted to the physical environment, which is a hardware processor. It is defined by the information which is required for the program to run; this information is called the state vector. To create a new process, a given process executes an instruction which has arguments specifying the state vector of the new process. This state vector includes the program counter, the central registers, and the process map. The new process may have a memory configuration which is the same as, or completely different from, that of the originating process. The only constraint placed on this specification is that the total memory available to the multi-process system is limited to 128K by the process mapping mechanism, which is common to all processes. Each user, of course, has his own 128K.

This facility was put into the system so that the system could control the user processes. It is also of direct value, however, for many user processes. The most obvious examples are input-output buffering routines, which can operate independently of the user's main program, communicating with it through memory and with interrupts (see the following). Whether the operation being buffered is large volume output to a disc or teletype requests for information about the progress of a running program, the degree of flexibility afforded by multiple processes far exceeds anything which could have been built into the input-output system. Furthermore, the overhead is very low: an additional process requires about 15 words of core, and process switching takes about 1 ms under favorable conditions. There are numerous other examples of the value of multiple processes; most, unfortunately, are too complex to be briefly explained.

A process may create a number of subsidiary processes, each of which is independent of the others and equivalent to them from the point of view of the originating process. Figure 4 shows two simple multi-process structures, one for each of two users. Note that each process has associated with it pointers to its controlling process and to one of its subsidiary processes. When a process has two immediate descendants, as in the case of processes 1.2 and 1.3, they are chained together on a ring. Thus, three pointers, up, down, and ring, suffice to define the process structure completely. The up pointers are, of course, redundant, but are convenient for the implementation. The process is identified by a process number which is returned by the system when it is created.

A complex structure such as that in Fig. 5 may result from the creation of a number of subsidiary processes. The processes in Fig. 5 have been numbered arbitrarily to allow a clear description of the way in which the pointers are arranged. Note that the user need not be aware of these pointers; they are shown here to clarify the manner in which the multiple process mechanism is implemented.

A process may destroy one of its subsidiary processes by executing the appropriate instruction. For obvious reasons this operation is not legal if the process being destroyed itself has subsidiary processes.
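The up, down, and ring pointers described above can be sketched as a small linked structure. This is a model for illustration, not the system's implementation: the class `Process`, the functions `create`, `subsidiaries`, and `destroy`, and the convention that an only child's ring pointer refers to itself are all assumptions. The rule that a process with subsidiaries of its own may not be destroyed is taken from the text.

```python
class Process:
    def __init__(self, number, up=None):
        self.number = number   # process number returned at creation
        self.up = up           # controlling process
        self.down = None       # one of the subsidiary processes, if any
        self.ring = self       # ring of processes sharing the same controller


def create(parent, number):
    """Create a subsidiary of parent and link it onto the ring of its siblings."""
    child = Process(number, up=parent)
    if parent.down is None:
        parent.down = child
    else:
        first = parent.down
        child.ring = first.ring
        first.ring = child
    return child


def subsidiaries(parent):
    """Walk the ring once and return the numbers of all immediate descendants."""
    found = []
    first = parent.down
    if first is None:
        return found
    node = first
    while True:
        found.append(node.number)
        node = node.ring
        if node is first:
            return found


def destroy(parent, child):
    """Unlink a subsidiary; not legal if it has subsidiaries of its own."""
    if child.up is not parent:
        raise ValueError("only the controlling process may destroy a process")
    if child.down is not None:
        raise ValueError("process being destroyed itself has subsidiary processes")
    if child.ring is child:
        parent.down = None
    else:
        prev = child
        while prev.ring is not child:
            prev = prev.ring
        prev.ring = child.ring
        if parent.down is child:
            parent.down = child.ring
    child.up, child.ring = None, child
```

Because the down pointer reaches only one subsidiary, every other sibling is found by following ring pointers, which is why the three pointers suffice to describe the whole structure.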


3  If the process attempts to obtain new memory, scan upward through the process hierarchy until the topmost process is reached. If at any time during this scan a process is found for which the address causing the trap is legal, propagate the memory assigned to it down through the hierarchy to the process causing the trap.

Option 3 permits a process to be started with a subset of memory and later to reacquire some of the memory which was not given to it initially. This feature is important because the amount of memory assigned to a process influences the operating efficiency of the system and thus the speed with which it will be able to respond to teletypes or other real-time devices.

The input-output system

The user machine has a straightforward but unconventional set of input-output instructions. The primary emphasis in the design of these instructions has been to make all input-output devices interface identically with a program and to provide as much flexibility in this common interface as possible. Two advantages result from this uniformity: it becomes natural to write programs which are essentially independent of the environment in which they operate, and the implementation of the system is greatly simplified. To the user the former point is, of course, the important one.

It has been common, for example, for programs written to be controlled from a teletype to be driven instead from a file on, let us say, the drum. A command exists which permits the recognizer for the system command language and all of the subsystems to be driven in this way. This device is particularly useful for repetitive sequences of program assemblies and for background jobs which are run in the absence of the user. Output which normally goes to the teletype is similarly diverted to user files.

Another application of the uniformity of the file system is demonstrated in some of the subsystems, notably the assembler and the various compilers. The subsystem may request the user to specify where he wishes the program listing to be placed. The user may choose anything from paper tape to drum to his own teletype. In the absence of file uniformity each subsystem would require a separate block of code for each possibility. In fact, however, the same input-output instructions are used for all cases.

The input-output instructions communicate with files. The system in turn associates files with the various physical devices. Programs, for the most part, do not have to account for the peculiarities of the various actual devices. Since devices differ widely in characteristics and behavior, the flexibility of the operations available on files is clearly critical. They must range from single-character input to the output of thousands of words.

A file is opened by giving its name as an argument to the appropriate instruction. Programs thus refer to all files symbolically, leaving the details of physical location and organization to the system. If authorized, a program may refer to files belonging to other users by supplying the name of the other user as well as the file name. The owner of a file determines who is authorized to access it. The reader may compare this file naming mechanism with a more sophisticated one [Daley and Neumann, 1965], bearing in mind the fact that the file names can be of any length and can be manipulated (as strings of characters) by the program.

Access to files is, in general, either sequential or random in nature. Some devices (like a keyboard-display or a card reader) are purely sequential, while others (like a disk) may be either sequentially or randomly accessed. There are accordingly two major I/O interfaces to deal with these different qualities. The interface used in conjunction with a given file depends on whether the file was declared to be a random or a sequential file. The two major interfaces are each broken down into other interfaces, primarily for reasons of implementation. Although the distinction between sequential and random files is great, the subinterfaces are not especially visible to the user.

Sequential files

The three instructions CIO (character input-output), WIO (word input-output), and BIO (block input-output) are used to communicate with a sequential file. Each instruction takes as an operand a file number. This number is given to the program when it opens a file. At the time of opening a file it must be specified whether the file is to be read from or written onto. Whether any given device associated with the file is character-oriented or word-oriented is unimportant; the system takes care of all necessary character-to-word assembly or word-to-character disassembly.

There are actually three separate, full-duplex physical interfaces to devices in the sequential file mechanism. Generally, these interfaces are invisible to programs. They exist, of course, for reasons of system efficiency and also because of the way in which some devices are used. The interfaces are:

1  Character-by-character (basically for low-speed, character-oriented devices used for man-machine interaction)

2  Buffered block I/O (for medium-speed I/O applications)

3  Block I/O directly from user core (for high-speed situations)
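The character-to-word assembly and word-to-character disassembly mentioned above can be sketched as follows. The text gives only the 24-bit word length; the packing of three 8-bit characters per word, the zero padding of a partly filled final word, and the function names are assumptions made for this illustration.

```python
WORD_BITS = 24        # word length of the SDS 930, from the text
CHAR_BITS = 8         # assumption: 8-bit character codes
CHARS_PER_WORD = WORD_BITS // CHAR_BITS   # three characters per word


def assemble_words(char_codes):
    """Pack character codes into 24-bit words, first character in the high bits."""
    words = []
    for start in range(0, len(char_codes), CHARS_PER_WORD):
        group = list(char_codes[start:start + CHARS_PER_WORD])
        # Assumption: a partly filled final word is padded with zero codes.
        group += [0] * (CHARS_PER_WORD - len(group))
        word = 0
        for code in group:
            if not 0 <= code < (1 << CHAR_BITS):
                raise ValueError("character code does not fit in 8 bits")
            word = (word << CHAR_BITS) | code
        words.append(word)
    return words


def disassemble_word(word):
    """Split one 24-bit word back into its three character codes."""
    mask = (1 << CHAR_BITS) - 1
    return [(word >> shift) & mask for shift in (16, 8, 0)]
```

A program reading words from a character-oriented device would see the output of `assemble_words`, and a program reading characters from a word-oriented device would see the output of `disassemble_word`; in either case the program issues the same instruction.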

It should be pointed out that there is no particular relation between these interfaces and the three instructions CIO, WIO, and BIO. The interface used in a given situation is a function of the device involved and, sometimes, of the volume of data to be transmitted, not of the instruction. Any interface may be driven by any instruction.

Of the three subinterfaces under discussion, the last two are straightforward. The character-by-character interface is, however, somewhat different and deserves some elaboration. Devices associated with this interface are generally (but not necessarily) used for man-machine interaction. Consider the case of a person communicating with a program by means of a keyboard-display (or a teletype). He types on the keyboard and the information is transmitted to the computer. The program may wish to make an immediate response on the display screen. In many cases this response will consist of an echo of the same character, so that the user has the feeling of typing directly onto the screen (or onto the teleprinter).

So that input-output can be carried out when the program is not actually in main memory, the character-by-character input interface permits programs a choice of a number of echo tables; it further permits programs a choice of grade of service by permitting them to specify whether a given character is an attention (or break) character. Thus, for example, the program may specify that each character typed is to be echoed immediately and that all control characters are to result in activation of the program regardless of the number of characters in the input buffer. Alternatively, the program may specify that no characters are echoed and every character is a break character. By changing the specification the program can obtain an appropriate (and varying) grade of service without putting undue load on the system. Figure 6 shows the components of the character-by-character interface; responsibility for its operation is split between the interrupt routine called when the device signals for attention and the routine which processes the user's I/O request.

Fig. 6. The character-oriented interface.

The advantage of the full-duplex, character-by-character mode of operation is considerable. The character-by-character capability means that the user can interact with his program in the smallest possible unit: the character. Furthermore, the full-duplex capability permits, among other things, (1) the program to substitute characters or strings of characters as echoes for those received, (2) the keyboard and display to be used simultaneously (as, for example, permitting a character typed on a keyboard to pre-empt the operation of a process. In the case of typing information in during the output of information, a simple algorithm prevents the random admixture of characters which might otherwise result), and (3) the ready detection of transmission errors.

Instructions are included to enable the state of both input and output buffers to be sensed and perhaps cleared (discarding unwanted output or input). Of course, it is possible for a program to use any number of authorized physical devices; in particular, this includes those devices used as remote consoles. A mechanism is provided to permit output which is directed to a given device to be copied on all other devices which are output linked to it (and similarly for input). This is useful when communication among users is desired and in numerous other situations.

The sequential file has a structure somewhat similar to that of an ordinary magtape file. It consists of a sequence of logical records of arbitrary length and number. On some devices, such as a card reader or the teletype, a file may have only one logical record. The full generality is available for drum files, which are the ones most commonly used. The logical record is to be contrasted with the variable length physical record of magtape or the fixed length record of a card. Instructions are provided to insert or delete logical records and increase or decrease them in length. Other instructions permit the file to be "positioned" almost instantaneously to a specified logical record. This gives the sequential file greater flexibility than one which is completely unaddressable. This flexibility is only possible, of course, because the file is on a random-access device and the sequential structure is maintained by pointers. The implementation is discussed in the following.

When reading a sequential file, CIO and WIO return certain unusual data configurations when they encounter an end of record or end of file, and BIO terminates transmission on either of the conditions and returns the address of the last word transmitted. In addition, certain flag bits are set by the unusual conditions, and an interrupt may be caused if it has been armed.
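The echo-table and break-character mechanism of the character-by-character interface, described earlier in this section, can be sketched as below. This is a hypothetical model: the text says only that a program chooses an echo table and specifies which characters are break characters, so the class name `CharacterInput`, the dictionary form of the echo table, and the method names are assumptions.

```python
class CharacterInput:
    """Model of the input side of the character-by-character interface."""

    def __init__(self, echo_table, break_chars):
        self.echo_table = dict(echo_table)    # typed character -> echo string
        self.break_chars = set(break_chars)   # characters that activate the program
        self.buffer = []                      # input buffer held by the system

    def on_input_interrupt(self, ch):
        """Handle one typed character without running the user program.

        Returns the echo to send to the device and whether the program
        should be activated because ch is a break character.
        """
        self.buffer.append(ch)
        echo = self.echo_table.get(ch, "")    # no entry means no echo
        return echo, ch in self.break_chars

    def read_buffer(self):
        """Called by the activated program to take the buffered input."""
        data = "".join(self.buffer)
        self.buffer.clear()
        return data


# Echo letters as typed, echo carriage return as return plus line feed,
# and activate the program only on carriage return.
letters = "abcdefghijklmnopqrstuvwxyz"
line_mode = CharacterInput(
    echo_table={**{c: c for c in letters}, "\r": "\r\n"},
    break_chars={"\r"},
)
```

The second grade of service mentioned in the text, no echo and every character a break character, corresponds to an empty echo table together with a break set containing every character.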

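The logical-record structure of a sequential file, and the way a reader learns that it has reached the end of a record or of the file, can be illustrated with a toy model. The text does not say what the "unusual data configurations" are, so the sentinel objects `END_OF_RECORD` and `END_OF_FILE`, the class `SequentialFile`, and its method names are hypothetical stand-ins.

```python
END_OF_RECORD = object()   # hypothetical stand-in for the end-of-record signal
END_OF_FILE = object()     # hypothetical stand-in for the end-of-file signal


class SequentialFile:
    """A file as a sequence of logical records of arbitrary length."""

    def __init__(self, records):
        self.records = [list(r) for r in records]
        self.record = 0    # index of the current logical record
        self.offset = 0    # next word within that record

    def position(self, record_index):
        """Position the file at the start of a specified logical record."""
        if not 0 <= record_index <= len(self.records):
            raise IndexError("no such logical record")
        self.record, self.offset = record_index, 0

    def read_word(self):
        """Return the next word, or a marker at end of record or end of file."""
        if self.record >= len(self.records):
            return END_OF_FILE
        current = self.records[self.record]
        if self.offset >= len(current):
            self.record += 1
            self.offset = 0
            return END_OF_RECORD
        word = current[self.offset]
        self.offset += 1
        return word
```

Reading a file holding the records `[1, 2]` and `[3]` yields 1, 2, the end-of-record marker, 3, the end-of-record marker, and then the end-of-file marker on every further read.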

The implementation scheme for auxiliary storage of the sequential file is illustrated in Fig. 7. Information is written on the drum in 256-word physical records. The locations of these records are kept track of in 64-word index blocks containing pointers to the data blocks. For the file shown, the first logical record is more than 256 words long but ends in the second 256-word block. The second logical record fits in the third 256-word block and the third logical record is in the 4th data block followed by an end of file. If a file requires more than 64 index words, additional index blocks are chained together, both forward and backward. Thus, in order to access information in the file it is necessary only to know the location of the first index block.
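The index-block arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `IndexBlock`, `append_data_block`, and `data_block_address` are hypothetical, drum addresses are plain integers, and every one of the 64 index words is assumed to hold a data-block pointer (the text does not say where the forward and backward chain links are stored).

```python
from dataclasses import dataclass, field
from typing import List, Optional

POINTERS_PER_INDEX_BLOCK = 64  # assumption: all 64 index words are data pointers


@dataclass
class IndexBlock:
    """One 64-word index block; chained forward and backward (hypothetical model)."""
    pointers: List[int] = field(default_factory=list)  # drum addresses of data blocks
    next: Optional["IndexBlock"] = None
    prev: Optional["IndexBlock"] = None


def append_data_block(first: IndexBlock, drum_address: int) -> None:
    """Record a new 256-word data block, chaining a new index block when full."""
    blk = first
    while blk.next is not None:
        blk = blk.next
    if len(blk.pointers) == POINTERS_PER_INDEX_BLOCK:
        new_blk = IndexBlock(prev=blk)
        blk.next = new_blk
        blk = new_blk
    blk.pointers.append(drum_address)


def data_block_address(first: IndexBlock, n: int) -> int:
    """Find the drum address of the n-th data block, starting from the first index block."""
    hops, slot = divmod(n, POINTERS_PER_INDEX_BLOCK)
    blk = first
    for _ in range(hops):
        if blk.next is None:
            raise IndexError("data block number beyond end of file")
        blk = blk.next
    return blk.pointers[slot]
```

The only entry point is the first index block, which mirrors the remark that knowing its location is sufficient to reach any information in the file.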

It may be worthwhile to point out that all users share the same drum. Since the system has complete control over the allocation of space on the drum, there is no possibility of undesired interaction among users. Available space for new data blocks or index blocks is kept track of by a bit table, illustrated in Fig. 8. In the figure, each column represents one of the 72 physical bands on the drum allocated for the storage of file information. Each row represents one of the 64 256-word sectors around a band. Each bit in the table thus represents one of the 4608 data blocks available. The bits are set when a block is in use and cleared when the block becomes available. Thus, if a new data block is required, the system has only to read the physical position of the drum, use this position to index in the table, and search a row for the appearance of a 0. The column in which it is found indicates the physical track on which a block is available. Because of the way the row was chosen, this block is immediately accessible. This scheme has two advantages over its alternative, which is to chain unused blocks together:

1 It is easy to find a block in an optimum position, using the algorithm just described.
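As a rough illustration of the bit-table search, the sketch below models the table as 64 rows (sectors) by 72 columns (bands). The function names, the `lookahead` parameter, and the fallback to later rows when a row is full are assumptions made for the example; the text only states that a row chosen from the drum position is searched for a 0.

```python
BANDS = 72     # columns: physical bands allocated to file storage
SECTORS = 64   # rows: 256-word sectors around a band


def new_bit_table():
    """All bits cleared: every one of the 72 x 64 = 4608 blocks is available."""
    return [[0] * BANDS for _ in range(SECTORS)]


def allocate(table, current_sector, lookahead=1):
    """Pick a free block in the row that will pass under the heads soonest.

    `lookahead` (hypothetical) is how many sectors ahead of the current drum
    position the search starts. Trying later rows when a row is full is also
    an assumption of this sketch.
    """
    for step in range(SECTORS):
        row = (current_sector + lookahead + step) % SECTORS
        for band, bit in enumerate(table[row]):
            if bit == 0:
                table[row][band] = 1   # set: block now in use
                return band, row
    return None                        # drum full


def release(table, band, sector):
    """Clear the bit when the block becomes available again."""
    table[sector][band] = 0
```

Because the row is derived from the current rotational position, the block found is the one that can be reached with the least rotational delay, which is the point of the scheme.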

A user machine in a time-sharing system

Part 3 | The instruction-set processor level: variations in the processor
Section 6 | Processors with multiprogramming ability

Part 4

The instruction-set processor level: special-function processors

This part contains descriptions of processors that do not interpret general programming languages; that is, they are not Pc's. They are all P's, however, since they have an interpreter that determines not only the operations to be taken, given the current instruction, but the next instruction to be obtained.

A Pio (Sec. 1) is a processor that controls T and Ms components. It manages block or vector transmission between Ms or T and Mp.

A P.array (Sec. 2) processes both vectors and two-dimensional matrices. By recognizing these data as fundamental units, programs (or algorithms) can be expressed efficiently in terms of primitive operators. The chief advantage of these P's is their ability to take advantage of the data structure for parallel interpretation, thereby increasing processing speed.

A microprogram processor (Sec. 3) is a processor type which is designed to interpret and process a data-type which is a program. In effect, this processor is a computer within another computer, programmed to act as an interpreter.

A language processor (Sec. 4) interprets a data-type derived from the primitives of a programming language. In contrast, a conventional processor interprets a language based on fundamental hardware implementation primitives. The difference is clearly apparent as increased complexity of the language processors.

Section 1

Processors to control terminals and secondary memories (input-output processors)

The first three chapters of this section show the evolution of the IBM Data Channels (io processors) from 1958 (the 7094 II) to the present (the 1800, which came after the 360). The processor approach for controlling T and Ms components, while more general, should be contrasted with the specialized one-instruction controls in the B 5000 (Chap. 22) and Burroughs D825 (Chap. 36). The fourth chapter, on the DEC 338, shows a processor that controls cathode-ray-tube display consoles. The graphic terminals are the first T's of sufficient complexity to utilize a processor of their own. The first CRT displays used the Pc (e.g., on Whirlwind); then small Pc's were adapted to the task; the DEC 338 is one of the earliest special P.display's that appeared.

There is no example in this section of a specialized P for message concentration and switching. For computer systems multiple remote inputs are still recent enough so that either the main Pc handles the task, via specialized K, or small Pc's are committed to it. However, in the telephone industry there has been a very substantial development by the Bell System of the Electronic Switching System (ESS), which uses specialized C's to control switching (routing). In computer systems, we can expect the use of such specialized processors to increase in the near future.

The IBM 7094 II

The IBM 709, a member of the IBM 701-7094 II family, is one of the first computers to have an io processor (IBM name: Data Channel) in its structure. Chapter 41 discusses the two Data Channel types: the early 7607 and the later 7909. The 7909 Data Channel ISP, and a K which it controls, are given in Appendix 2 and 3 of Chap. 41. The principal difference is that Pc controls the Pio ('7909) which in turn controls the K, which in turn controls a T or Ms; the Pc controls the Pio (7607) and the K; the K controls the T or Ms. The series is discussed in Part 6, Sec. 1, page 515.

The structure of System/360, Part I: outline of the logical structure

The io processors (Selector and Multiplexor Channels) in the System/360 have evolved from the IBM 701-7094 II Series. Part 6, Sec. 3 presents the ISP and PMS structures for these processors. Depending on the computer model, the implementations are realized by a microprogrammed processor interpreting a shared control program for both Pio's and Pc, or by a hardwired Pio. The multiple Pio's in a 360 Multiplexor Channel, though logically independent, are implemented as a single, shared physical processor.

The IBM 1800

The structure is discussed in Part 5, Sec. 2, and the Pio's in this structure are presented in Chap. 33, page 396.

The Digital Equipment Corporation DEC 338 display processor

The DEC 338 is an early P.display. It directly interprets a stored program to control a T.display. Earlier T.displays were controlled by Pc (Whirlwind, Chap. 6), or by a special K.display without stored-program capability, or by a general-purpose Pio. The last method outputs fixed length blocks containing data to be interpreted by T.display as points, vectors, characters, curved line segments, etc. The control of T.display first by Pc, then by a K, then by a Pio, and finally by a P.display has been observed as an evolution [Myer and Sutherland, 1968]. Myer and Sutherland also observe that the evolution is about to become a closed cycle because the generality of a Pc is needed to control a T.display.

Note that the 338 has a very extensive ISP. In fact, the P.display's ISP is more extensive than the companion Pc of the PDP-8 (Chap. 5). There are some display tasks which require a Pc, for example, compiling programs (pictures), calculating elaborate light-pen tracking figures, making coordinate and curved lines to straight-line vector approximation transformations, and communicating with other system components.

Another approach to the design of a P.display is based on a P.microprogram which is shared among many T.displays [Rose, 1967]. Yet another alternative, which has not yet been tried, is to incorporate a Pio (P.display) as a special mode in a conventional Pc. Thus the P would interpret either conventional Pc instructions or P.display instructions.

P.display is the interpreter for the output of pictures or graphics. The 338 utilizes data space efficiently simply because the data are long variable-length strings (word vectors). The instruction requires almost no space to specify the data operations and addresses; data are interpreted directly or immediately in the instruction rather than via instruction addresses.

Another feature which allows a program to be efficiently encoded is the stack mechanism for storing subroutine linkages. Subroutines in P.displays are actually programs which form part of a more complete picture. Subroutines are actually subpictures. Although the stack mechanism allows for recursive picture calls, the stack is used principally to save space and to allow multiple T.displays to use common picture programs.

A problem in the 338 which is common to all multi-P structures is intercommunication among the P's. Pc is the controlling P, as is the case with most Pc-Pio structures. The P('338) has no trap to itself but relies on an interrupt signal to Pc. The Pc processes both tasks which P.display might process, given an interrupt system, and other tasks beyond P.display's capability.

A clock should be built into the 338. The brightness or intensity of a picture is determined both electronically (see the mode instructions for controlling intensity) and by the rate at which the pictures are repeated. A clock would allow the time when pictures are started or drawn to be specified; thus the intensity would be independent of picture length.

The 338 requires more hardware than a simpler Pc. However, a large amount of this hardware is used to control the generation of characters and lines. The lines (vectors) are drawn using a DDA (Digital Differential Analyzer) technique. Perhaps about one-half of the registers could be eliminated if the 338 were not a P. A simpler alternative approach using a similar computer, the PDP-9, was constructed by Bell Telephone Laboratories and DEC, making the display only a K.

A more elaborate Pc interrupt system with reduced overhead time would enable Pc to take on the specialized program control functions in the 338. Such a scheme might pass the program or instruction counter parameter directly from P.display to Pc. In this way, Pc or P.display would alternatively process part of a single instruction stream, depending on the task.

Despite the problems of this early P.display, it has a sophistication which successors appear to be following.

Chapter 25

The DEC 338 display computer

Introduction

The C(display; 'DEC 338) is a C('DEC PDP-8) with a P.display which can connect to a T(#1:8; display; CRT; area: 9.375 X 9.375 in²). The PMS structure is shown in Fig. 1, Chap. 5, describing the PDP-8. The Pc ISP is given in Appendix 1 of Chap. 5. The C('338), although designed to stand alone, is generally used as a satellite to a larger C, via an L(Dataphone). The rationale for using a C as a T is based on the bandwidth and storage requirements needed to maintain graphical picture displays. A human being manipulating pictures (rotation, scale change, and conversion of internal linked data structure to a picture structure) requires short response time; this requirement places high processing demands on larger C's. Thus this C(display) is a preprocessor for larger, more general C's.

The actual T(CRT) is a 16-inch CRT with a 9 3/8-inch square viewing area covered by 1,024 x 1,024 (XY) points. The diameter of the points is ~0.015 inch. The spot is magnetically deflected and focused. All eight T(CRT)'s can be driven together or used independently. A photomultiplier connected through a fiber-optic bundle link is used as a light pen (a photosensitive sensor) to detect spots on the T. The light pen allows the P.display to detect whether a user has "pointed to" a displayed spot.

Pc and P.display access the same Mp; the total data rate available from Mp is one 12-bit word/1.5 microseconds. The instruction times of P.display are a function of the point plotting times of the T(CRT): 0.3 microsecond to the next incremental unintensified point (approximately 0.010 inch away); 1.2 microseconds to an incremental intensified point; and 35 microseconds to a point plotted at a random position.

The state (registers) of C.display is given in the ISP description of Appendix 1 of this chapter. There are four parts of the state: the control registers for Program Flow State, the Picture State (or position of beam), Console and Light-pen State, and Mp State. The instruction interpreter is fairly simple and is best described by the state diagram (Fig. 1). The instructions are given in Tables 1 and 2. The remainder of the chapter discusses the P.display instructions and the Pc instructions for communicating with P.display.

Principle of operation

The actual picture is held stationary by repeatedly displaying (intensifying) a particular point, line, etc. The number of times a figure has to be displayed so that it appears stationary and does not flicker depends on the CRT phosphor, the figure, and environmental parameters. The generally accepted range is a plotting rate of 20 to 50 plots/second; thus a complete picture has to be drawn in 50 to 20 milliseconds. If we assume a 30-Hz plot rate, about 28,000 points can be plotted in vector mode (or 280 to 1,120 inches, depending on the spacing). About 1,000 characters can be displayed in 30 milliseconds using character mode.
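The point budget quoted above follows from the frame time and the per-point plotting time. The short sketch below reproduces the arithmetic; the function names are hypothetical, and the defaults (1.2 microseconds per intensified point, 0.010 inch spacing) are taken from the figures given in this chapter.

```python
def points_per_frame(refresh_hz, us_per_point=1.2):
    """Number of points that fit in one refresh period at a given plotting time."""
    frame_us = 1_000_000 / refresh_hz      # frame time in microseconds
    return int(frame_us / us_per_point)


def drawn_length_inches(points, spacing_inches=0.010):
    """Total line length if successive points are `spacing_inches` apart."""
    return points * spacing_inches


points = points_per_frame(30)              # a 30-Hz plot rate
length = drawn_length_inches(points)       # at the finest 0.010-inch spacing
```

At 30 Hz this gives a little under 28,000 intensified points and a little under 280 inches of line at the finest spacing; a fourfold larger spacing gives roughly the upper end of the quoted length range.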

Fig. 1. DEC 338 instruction-interpretation state diagram. (Figure notes: 1, executed by Pc to start P.display; 2, executed by Pc to stop P.display; 3, data state "states"; 4, control state "states". State transitions occur approximately each Mp cycle.)

When the light pen is used, a display program is required to "track" the pen. The pen's position is determined by displaying known points. The pen, of course, detects the points when it is present at the displayed point's position; therefore the program knows the location of the pen.

The parameters of interest for a display vary, depending on the application. However, the general parameters are:

Table 1 The DEC 338 control-mode instruction set

† Set; allow instruction bits to specify new value.
‡ A two-word instruction, second word contains low-order 12 bits for DAC (jump address).
¶ Skip can be for true or false.
§ Inhibit restoration of bits.

Instructions and their interpretation in P(display)

Two instruction-set types are interpreted in the P.display: Data State, in which instructions specify display information; and Control State, in which instructions specify program control information (e.g., jumps, modes, etc.). A state diagram for the interpretation process is given in Fig. 1.

Data-state instructions

There are seven instructions (which DEC calls modes) that can be executed while P.display is in data state. The instructions (modes) are really substates of data state. The instructions (actually more like data) are interpreted for the mode. When all the data-mode instructions have been interpreted, an escape instruction returns the P.display to control state. A control instruction is issued to select a mode and simultaneously place the display in data state.

Table 2 DEC 338 data-mode instruction set

Increment mode. This mode is used to draw curves and alphanumeric characters and other small symbols. Two instructions are stored per word. An instruction will cause the beam position to be moved one, two, or three times, in 0.010-inch increments, in one of eight directions. Direction 0 is to the right, direction 1 is up and to the right, etc.
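A small sketch may make the increment-mode motion concrete. It assumes that the eight directions are numbered counterclockwise from "right" (the text only gives directions 0 and 1), treats one increment as one raster unit, and ignores intensification and screen edges; the names `DIRECTIONS` and `increment` are hypothetical.

```python
# Assumed numbering: 0 = right, 1 = up and right, then counterclockwise.
DIRECTIONS = [
    (1, 0), (1, 1), (0, 1), (-1, 1),
    (-1, 0), (-1, -1), (0, -1), (1, -1),
]


def increment(x, y, direction, count):
    """Move the beam `count` (1, 2, or 3) steps in one of eight directions.

    Returns the list of beam positions visited, one per step.
    """
    if count not in (1, 2, 3):
        raise ValueError("an increment instruction moves one, two, or three times")
    dx, dy = DIRECTIONS[direction]
    visited = []
    for _ in range(count):
        x, y = x + dx, y + dy
        visited.append((x, y))
    return visited
```

A character or small curve is then just a sequence of such (direction, count) pairs, two of which are packed into each 12-bit word.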

Vector mode. The vector mode is used to draw straight-line segments. This two-word instruction causes the beam position to be moved along a line represented by an 11-bit delta y and an 11-bit delta x.

Vector continue mode. This mode is used to draw a straight line similar to vector mode but causes the line to be extended until an "edge" is encountered, i.e., to the edge of the screen.

Short vector mode. The short vector mode is used to draw figures composed of short line segments. A one-word instruction specifies a 5-bit delta y and a 5-bit delta x quantity. It is transformed within the display to the same format as vector mode and operates in the same manner.

The preceding modes move the beam by counting the X and Y position registers. The counting is done at 1.2 microseconds per step on an intensified move and at 0.30 microsecond per step on a nonintensified move.

Point mode. Point mode is used for random point plotting. A two-word instruction specifies new Y and/or X coordinates to be placed into the Y and X position registers.

Graph-plot mode. This mode is used to draw curves of mathematical functions. A one-word instruction has data for the Y or X position register; at the same time, X or Y, respectively, is incremented by a count of one, two, four, or eight, depending on the scale factor.

Point and graph-plot modes operate at a rate depending upon the position of the new point with respect to the previous point. If a point is only one-eighth of the screen away, the delay for beam-settling time is 6 microseconds; otherwise the settling time is 35 microseconds.

Character generation option instructions. The alphanumeric characters or special symbols which make up a character set are stored in Mp in increment mode or short vector mode. These characters can be arbitrarily defined. A 6-bit (or 7-bit) character code in the instruction is used to locate a word in a table in Mp called the dispatch table. The base address of the table is specified by the Starting Address Register/SAR. SAR may be loaded by instructions from the Pc. The SAR represents the most significant 6 bits of a 15-bit memory address. The character code represents the least significant 6 (or 7) bits. A seventh SAR bit, corresponding to the octal position 100, is used with 6-bit characters as a case bit (i.e., uppercase or lowercase characters) and may be set or cleared with a control character.

A word in the dispatch table has the following format:

Bit 0: If bit 0 is a 1, bits 1 to 11 are used to perform a control function as specified by particular control instructions. If bit 0 is a 0, bits 2 to 11 are combined with SAR to specify the address at which the character definition program starts. (The address bit 2 is common to both the SAR and bit 2 of the dispatch word and so may be specified in either place or in both places.)

Bit 1: Determines the mode in which the character is to be displayed. If bit 1 is a 0, the increment mode is used to plot the character; if bit 1 is a 1, the short vector mode is used to plot the character.

Control-state instructions

There are six control-state instructions.

Parameter. Parameter is used to set values in scale, light-pen, and intensity registers.

Mode. Mode is used to set up the data-state mode (or data-mode instruction). Mode is also used to stop the display.

Conditional skip. The skip instruction tests the state of the P.display and the pushbuttons.

Miscellaneous. These instructions include both tests and additional parameter control.

Display jump and push-jump subroutine instructions. The display jump instruction has 15 address bits, so that a jump may be executed to any location in the display file within the 32-kw memory. The display subroutine instructions are push-jump (an extension of the jump instruction) and pop, the return from subroutine. The push-jump works as follows: The current state of the display (Light Pen Enable, Data Mode, Scale, and Intensity) is stored, along with the return address, in two successive locations in the first 4,096 words of memory. The locations are determined by the pushdown pointer, PDP. This pointer is initially set by a Pc instruction. The normal jump is then executed. To return from a subroutine, the pop instruction is executed. It has no address bits. Its function is to return the display to a previous state by sending the last words on the push-down stack back to the display.
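The push-jump and pop behavior just described can be modeled in a few lines. The sketch below is a simplified illustration: the class name `DisplaySketch`, the dictionary used for memory, the representation of the saved state as a small record rather than packed bits, and the upward direction of stack growth are all assumptions, not details taken from the 338.

```python
class DisplaySketch:
    """Toy model of the P.display push-down stack (hypothetical names and layout)."""

    def __init__(self, pdp):
        self.memory = {}          # stands in for the first 4,096 words of Mp
        self.pdp = pdp            # push-down pointer, initially set by Pc
        self.dac = 0              # display address counter
        self.state = {"light_pen": 0, "mode": 0, "scale": 0, "intensity": 0}

    def push_jump(self, target, return_address):
        # Save display state and return address in two successive locations.
        self.memory[self.pdp] = dict(self.state)
        self.memory[self.pdp + 1] = return_address
        self.pdp += 2             # assumption: the stack grows upward
        self.dac = target         # then execute the normal jump

    def pop(self):
        # Send the last two words on the stack back to the display.
        self.pdp -= 2
        self.state = dict(self.memory[self.pdp])
        self.dac = self.memory[self.pdp + 1]
```

Because the state travels with the return address, a subpicture can change scale or intensity freely and the caller resumes with its own settings after the pop.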

The stack approach to subroutining as implemented on the 338 has certain advantages over the jump to subroutine instruction normally used in Pc's:

1 Memory space is conserved since return address locations are not required in each subroutine in memory.

2 A subroutine can be called any number of times before return to the main routine.

3 Since the state of the display is saved on the stack and subsequently restored, subroutines are truly transparent; that is, after the return they leave the state of the display program the same as before the subroutine call.

4 The subroutines can either retain the same state or change the state of the display by using one or more of the "inhibit restore" bits available in the pop instruction. The programmer can elect independently to inhibit restoration of mode, light pen, and scale, or intensity information.

Instructions in Pc for communicating with P(display)

Instructions in Pc communicate with P.display. The physical connection is by the S(I/O Bus). The in-out transfer instructions in Pc are used to initialize and read the state of P.display.

P.display state initialization from Pc instructions
Set Push Down Pointer from AC
Set Display Address Counter from AC
Set Push Button contents from AC
Set miscellaneous flag and status bits from AC
Set character generator SAR address from AC

P.display status to Pc instructions
Read Push Down Pointer into AC
Read X register into AC
Read Y register into AC
Read Display Address Counter into AC
Read Status words 1, 2, 3, 4, 5 into AC (60 miscellaneous bits of flags, modes, etc.)

Picture debugging modes. These modes can be programmed and aid picture debugging. A bit can be set to override the nonintensify bit in data-mode instructions. When this bit is a 1, all points and vectors are plotted, whether they are to be intensified or not. The search enable instruction forces the display to run until a particular instruction type is found. The instruction type is specified by the search enable instruction.

APPENDIX 1

DEC 338 DISPLAY PROCESSOR ISP DESCRIPTION

pb^clear

:=

i

pbj:lear

(PB

(Scale -»

light^penuchange

->

:

1 1



Ace

C(AcCi) to n

.r*+» +1 to

Ace Ace

< 0,

transfer control to n;

C(Acc)-2» If

C(Acc)

to

if

C(Acc)

>

proceed serially) Read next character on input mechanism into n

(i.e.,

On The

Effect of order

Send C(n)

to output

mechanism

0, ignore

337

338

Part

4

The

Section 3

instruction-set processor level: special-function processors

Processors defined by a microprogram

Table 2

Notation: A, B, C, ... stand for the various registers in the arithmetical and control register units (see §3 of the text). 'C to D' indicates that the switching circuits connect the output of register C to the input of register D; '(D + A) to C' indicates that the output of register A is connected to the one input of the adding unit (the output of D is permanently connected to the other input), and the output of the adder to register C. A numerical symbol n in quotes (e.g., '1') stands for the source whose output is the number n in units of the least significant digit.

Column headings: Arithmetical unit; Control register unit; Conditional flip-flop (Set, Use); Next micro-order.

Microprogramming and the design

Chapter 28

operations called for in the control register unit. The fourth column shows which conditional flip-flop, if any, is to be set and the digit which is to be used to set it; for example, (1)Cs means that flip-flop number 1 is set by the sign digit of the number in register C, while (2)G1 means that flip-flop number 2 is set by the least significant digit of the number in register G. In the case of unconditional micro-orders columns 5 and 7 are blank and column 6 contains the address of the next micro-order to be executed. In the case of conditional micro-orders column 5 shows which flip-flop is used to operate the conditional switch and columns 6 and 7 give the alternative addresses to which control is to be sent when the conditional flip-flop contains a 0 or a 1 respectively.

Micro-orders 0 to 4 are concerned with the extraction of orders from the store. They serve to bring about the transfer of the order from the store to register E and then cause the five most significant digits of the order to be placed in register II with the result that control is transferred to one of the micro-orders 5 to 15, each of which corresponds to a distinct order in the machine order code. In this way the sequence of micro-orders needed to perform the particular operation called for is begun.

The way in which the various operations are performed can be followed from Table 2.
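The column layout described above (an action, an optional flip-flop to set, an optional flip-flop to use, and one or two next addresses) can be illustrated with a very small interpreter. This is a sketch under stated assumptions, not the machine of Table 2: the dictionary layout, the use of `None` to halt, and the two-order example microprogram are all hypothetical.

```python
def run(store, regs, start=0, max_steps=100):
    """Step through micro-orders, returning the sequence of addresses executed."""
    flipflops = {1: 0, 2: 0}                 # the conditional flip-flops
    trace = []
    address = start
    while address is not None and len(trace) < max_steps:
        order = store[address]
        trace.append(address)
        order["action"](regs)                # register transfers
        if order["set"] is not None:         # which flip-flop to set, and from what digit
            number, digit = order["set"]
            flipflops[number] = digit(regs)
        if order["use"] is None:             # unconditional: single next address
            address = order["next"][0]
        else:                                # conditional: choose by flip-flop content
            address = order["next"][flipflops[order["use"]]]
    return trace


def decrement_c(regs):
    regs["C"] -= 1


def no_op(regs):
    pass


def sign_of_c(regs):
    return 1 if regs["C"] < 0 else 0


# Hypothetical microprogram: count register C down until it goes negative.
store = {
    0: {"action": decrement_c, "set": (1, sign_of_c), "use": None, "next": (1, None)},
    1: {"action": no_op, "set": None, "use": 1, "next": (0, None)},
}
regs = {"C": 2}
trace = run(store, regs)
```

In this example the flip-flop is set by one micro-order and tested by the following one; whether a single micro-order may both set and use the same flip-flop is not addressed by the sketch.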

In the section dealing with multipli-

assumed that numbers

lie in

—1


Ten b operations are primarily involved ... 21) ... are impossible. (1L means the symbol in L ... recalled.) It is necessary to be able to detect the responsibility bit (b = 22), since there are occasions when the explicit structure of lists is important, and not just the information they designate. Finally, although the signal bit is just a single switch, it is necessary to have two symbols, one corresponding to "signal on" and the other to "signal off" (b = 26 and 27), so that the information in the signal can be retained for later use (b = 28 and 29).

The sense of the signal is not arbitrary. In general "off" is used to mean that a process "failed," "did not find," or the like. Thus, in operations b = 6 and 7, the failure to find a "stop interpretation" operation sets the signal to "off." Likewise, the end of a list will be symbolized by setting the signal to "off."

List operations

Both the "save" and "delete" operations are used to manipulate lists, but besides these, several others are needed. The three operations, b = 30, 31, 32, allow for search over list structures. They can be paraphrased as: "get the referent," "turn down the sublist," and "get the next word of the list." They all have in common that they replace a known symbol with an unknown symbol. This unknown symbol need not exist; that is, the symbol referred to may contain a b = 10 operation, which means that the end of the list has been reached. Consequently, the signal is always set "on" if the symbol is found, and "off" if the symbol is not found. One of the virtues of the common signal is apparent at this point, since, if the programmer knows that the symbol exists, he will simply ignore the signal. Instruction formats that provide for additional addresses for conditional transfers would force the programmer to attend to the condition even if it only meant leaving a blank space in the program.

To illustrate how these search operations work, Fig. 6 shows a list of lists, L300, and a known cell, L100. Cell L100 contains the reference to the list, L300. The programmer does not know how the list is referenced. He wants to find the last symbol on the last list of the structure. His first step is (30, 1, L100), which replaces the reference by the name of the list, L300. He then searches down to the end of list L300 by doing a series of operations: (32, 1, L100). Each of these replaces one location on the list by the next one. In fact, a loop is required, since the length of the list is unknown. Hence, after each "find the next word" operation, he must transfer, on the basis of the signal, back to the same operation if the end of the list hasn't been reached. The net result, when the end of the list is reached, is that the location of the last word on list L300 rests in L100. Since in this example he wants to go down to the end of the sublist of the last word on the main list, he next performs (31, 1, L100). This operation replaces the location of the last word with the name of the last list, L700. Now the search down the sublist is repeated until the end is again reached; at this point the location of the last symbol on the last list is in L100, as desired. The sequence of code follows:

Location    Symbol (b, c, d)
            30, 1, L100
L888        32, 1, L100
            4, 0, L888
            31, 1, L100
L999        32, 1, L100
            4, 0, L999
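The search just traced (run to the end of the main list, turn down the sublist, run to its end) can be sketched with one-way linked words and a working cell. The sketch is written in the spirit of operations b = 31 and 32 only: the names `Word`, `make_list`, `next_word`, `turn_down`, and `last_of_last` are hypothetical, the "get the referent" step (b = 30) is omitted, and the signal is modeled as a returned boolean.

```python
class Word:
    """One word of a one-way list: a symbol and a link to the next word."""

    def __init__(self, symbol, link=None):
        self.symbol = symbol   # a datum, or the first Word of a sublist
        self.link = link       # next Word on the list, or None at the end


def make_list(symbols):
    head = None
    for sym in reversed(symbols):
        head = Word(sym, head)
    return head


def next_word(cell):
    """Replace s by the next word; signal off (False) and s unchanged at the end."""
    nxt = cell["s"].link
    if nxt is None:
        return False
    cell["s"] = nxt
    return True


def turn_down(cell):
    """Replace s by the sublist it names; signal off if there is none."""
    sub = cell["s"].symbol
    if not isinstance(sub, Word):
        return False
    cell["s"] = sub
    return True


def last_of_last(cell):
    while next_word(cell):     # loop on the signal: list length is unknown
        pass
    turn_down(cell)            # go down the sublist of the last word
    while next_word(cell):
        pass
    return cell["s"].symbol
```

The two `while` loops correspond to the two "transfer on the signal" instructions in the code sequence, and the working cell plays the part of L100.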

The operations, b = 33 and 34, allow for inserting symbols in a list either before or after the symbol designated. The lists in this system are one-way: although there is always a way of finding the symbol that follows a designated symbol, there is no way of finding the symbol that precedes a designated symbol. The "insert before" operation does not violate this rule. In both operations,

A command

Chapter 30 |

structure for complex information processing

359

list. The name of the CIA list for the program structure which is to be reactivated on completion or interruption of the current program structure is the second item on the L3 list, etc. Therefore, the L3 list is appropriately called the current CIA list. The "save" and "delete" operations are used to manipulate L3 analogously to their use with L2 previously described. Appendix 3 gives a more complete schematic representation of the interpretation cycle. It has still been necessary to represent only selected b operations.

Data programs

In the section on list operations a search of a list was described. There the data were passive; the processing program dictated just what steps were taken in covering the list. Consider a similar situation, shown in Fig. 8, where there is a working cell, L100, which contains the name of a list, L300. L300 is a data program. There is a program that wants to process the data of L300, which is a sequence of symbols. This program knows L100. To obtain the first symbol of data, it does (6, 1, L100), that is, "execute the program whose name is in L100." The result is to create a CIA list, L500, put its name in L100, and fire the program. Some sort of processing will occur, as indicated by the blank words of L300. Presumably this has something to do with determining what the data are, although it might be some bookkeeping on L300's experience as a data file. Eventually a word is reached which contains (0, 1, L800). This operation stops the interpretation, and returns control to the original processing program. The first symbol of data is defined to be 1L800. The processing program can designate this by 4L100, since the sequence of c = 4 prefixes in L100 and L500 pass along the interpretation until it ultimately becomes 1L800. Now the processing program can proceed with the data. It remains

Processors based on a programming language


15

16 17

18 19

Copy s into communication list, saving 1L Move * into communication list, saving 1L Move 1L into location of s, saving s. Move 1L into location of s, destroying s. Copy

location of s into communication

Create a

new symbol

in location of

.

1. .

2.

list,

saving 1L

saving

s,

Fetch the current instruction according to the current instruction address (CIA) of the current

21

22 23

Turn signal on Turn signal on

if s if s

Turn signal on if Turn signal on.

s

is

.

c to c

s.

=

If

1L

,

off if not.

1L

,

off if not; delete

responsible,

1L

Turn signal

Invert signal.

26 27

Copy Copy

28

Set signal according to

s.

29

Set signal according to

s;

off.

3.

signal into location of

s.

signal into location of

s,

= =

put CIA

which saving

delete

s.

and turn

Replace

to step

If

b

=

If

b

=

step

1.

31

= 10), leave s and turn signal off. symbol doesn't exist (b in d of s and turn signal on; if symbol the s symbol Replace by

by the symbol designated by

doesn't exist, leave s and turn signal

by the location

s,

signal on;

of the next

symbol after d of

"0, 4,

part of

d

If

b

If

b

If

b If

= = -

s

If

new CIA =

d,

CIA and go to step 4. new CIA = d part of s and go

turn the signal on, delete 1 save

CIA,

set a

2 replace 3 replace

b, c,

d by

CIA by

to

Insert

(move symbol from communication

list).

2. 1.

signal off, delete

CIA and go

to

4.

"popped up" CIA

Replace to step

and go to step

10 delete CIA.

is

go to step 3. Otherwise go to step 4.

s

the d part of s and go to step

rent instruction again,

and

34

after s

set a

operation: (Some of the b operations

no CIA "pops up" turn

step

of s)");

33

1L

it,

1.

off.

(f, replaced by if next symbol does not exist, leave s and turn signal off. Insert 1L before s (move symbol from communication list).

(s

3.

and go to step 3. d parts of the word at address

affect the interpretation cycle follow.)

if

s

reduce

s.

30

turn signal on

c

in the address register

Decode and execute the b

List Operations

Replace

If

4 replace c, d by the c, d and go to step 2. If c = 5 mark CIA "incomplete," save

If c

.

not.

off if

c

and go

25

32

at address d,

=

2 replace d by d part of the at address d, reduce c to c = 1 and continue. If c = 1 2 and continue.

to step put d in the address register and go

24

s

list.

Decode and execute the c operation: c = 3 replace d by d part of the word word

20

CIA

If

Signalling Operations

= =

THE INTERPRETATION CYCLE

APPENDIX 3

Communication List Operations 14

Processors based on a programming language

Section 4

instruction-set processor level: special-function processors

CIA by 1.

the

/

marked "incomplete" fetch the curmove 1L into address register and

4.

part of the current instruction

and go

Chapter 31

System design of a FORTRAN machine¹

Theodore R. Bashkow / Azra Sasson / Arnold Kronfeld

Summary

A system design is given for a computer capable of direct execution of FORTRAN language source statements. The allowed types of statements are the FORTRAN DO, GO TO, computed GO TO, Arithmetic, READ, PRINT, arithmetic IF, CONTINUE, PAUSE, DIMENSION and END statements. Up to two subscripts are allowed for variables and no FORMAT statement is needed. The programmer's source program is converted to a slightly modified form while being loaded and placed in a Program Area in lower memory. His original variable names and statement numbers are retained in a Symbol Table in upper memory, which also serves as the data storage area. During execution of the program each FORTRAN statement is read and interpreted at basic circuit speeds since the machine is a hardware interpreter for these statements. The machine corresponds therefore to a "one-pass, load-and-go" compiler except, of course, that there is no translation to a different machine language. It is estimated that the control circuitry for this machine will require on the order of 10,000 diodes and 100 flip-flops. This does not include arithmetic circuitry.

Index Terms

Digital computer system, digital machine design, FORTRAN, FORTRAN computer system, direct FORTRAN language machine, hardware interpreter.

Introduction

The algebraic languages, in particular FORTRAN in this country, have had enormous impact on the utilization of computers for scientific and engineering computation. They were designed in large part to overcome the annoyance of lengthy learning time and the laborious attention to detail needed to use a basic machine language.

These annoyances are overcome by providing a language which is closer to English in form, and freer of "bookkeeping" details, than the usual machine languages, and by providing a machine language program, called a compiler or translator, to convert from the source program written by a user to an object program executable by a computer. Thus the original drawbacks are overcome but the discrepancy between the external language of the user and the internal language of the machine leads to at least two others. The compilation run of the machine, during which the language translation is accomplished, is a waste of time and money to the user since he must pay for this time though he gets no problem answers from it. Secondly, the user has specified the logical flow and arithmetic details of his solution in the source language. However, when the machine "hangs up" or when he attempts to debug his program, all he finds displayed on the machine console is the machine language. (On large machines he gets equivalently an esoteric print-out in a symbolic form of machine language.)

To overcome these difficulties one could use an interpretive translator of the source language instead, but the historical deficiencies of interpreters, loss of memory space and loss of speed of execution, have caused this solution to be shunned. Another solution is also possible: design a machine which executes an algebraic language directly as its "machine language." This approach is based on a recognition that once the allowable syntax and associated semantics of language statements have been firmly specified it is a matter of choice whether to write a compiler, to write an interpreter or to build an interpreter out of hardware.

The software choice has been almost overwhelmingly to write a compiler. Since the choice of hardware interpreter, or machine, has not been made, and in fact has hardly been explored to any great extent, a study has been made in order to see if this choice leads to a system which is competitive with the usual software system.

It should be understood that such a machine has not been constructed. However, the design² is sufficiently complete that construction seems feasible.

¹ IEEE Trans., EC-16, vol. 4, pp. 485-499, August, 1967.

Language: design philosophy

Since the machine language is to be an algebraic one it seemed reasonable to choose a simple subset of the most commonly used one, FORTRAN. This eliminates the necessity for inventing still another such language and allows attention to be focused on machine design. In fact, the subset chosen is quite close to that known as "Preliminary FORTRAN for the IBM 1620," which is complete enough to be quite useful, but which does not include such innovations as subroutines, etc. In addition, the usual "built in" subroutines SIN (x), COS (x), etc., are not included. Their inclusion would require additional effort for their hardware implementation which did not appear to be worth expending at this time.

² See final technical report for Contract AF 19(628)-2798.

The statement types which are accepted by the FORTRAN machine as machine language are listed in the table that follows.

Statement: DIMENSION v, v, ...
Comment: This statement has the effect of reserving memory space for the subscripted variables v. Each v stands for a variable name followed by parentheses enclosing one or two constants.

Statement: a = b
Comment: The value of the arithmetic expression b is stored in the memory location referenced by the variable name a, which may have up to two subscripts.

Statement: GO TO n
Comment: Program control is transferred to the statement numbered n.

Statement: GO TO (n1, n2, ..., nm), i
Comment: Program control is transferred to one of the statements numbered n1, n2, ..., nm depending on the value of i at the time this statement is executed.

Statement: IF (e) n1, n2, n3
Comment: Program control is transferred to the statement numbered n1 if the algebraic expression e is negative, to that numbered n2 if e is zero, and to that numbered n3 if e is positive.

Statement: PAUSE
Comment: Program execution is halted until restarted by console switch.

Statement: DO n i = m1, m2, m3
Comment: All statements following this one in the program, including the statement numbered n, are executed repeatedly. The first execution is with i equal to m1; i is incremented by the value of m3 before each succeeding execution. This continues until i is greater than m2, at which time program control is transferred either to the statement following n or to that statement required by the rules for DO nests. If m3 is not given it is understood to be 1.

Statement: CONTINUE
Comment: This statement has the effect of the "no operation" instruction in conventional machines. Program control goes to the next statement in the program unless the CONTINUE is the last statement in the range of a DO. In this case normal DO sequencing takes place.

Statement: READ, List and PRINT, List
Comment: These statements cause data to be read or printed, respectively, in accordance with the specified list of variables which may be subscripted; however, the "implied DO" feature has not been implemented. No FORMAT statement is available with this machine, therefore no statement number need be given.

Statement: END
Comment: This statement generates a control signal to start execution of the program.

Some familiarity with the FORTRAN language is assumed. No distinction is made in this machine between fixed (integer) and floating point (real) variables. These may have names of any length, starting with any alphabetic character. Fixed point constants may be specified, in a program or as data, as any combination of one to four numeric characters preceded by a + or - sign; however, these are converted to an internal decimal floating point number and so there are no restrictions on "mixed mode" expressions. Statement numbers must be unsigned fixed point constants, which are not so converted since they only affect program control and not arithmetic processing.

Floating point constants are specified in the form of a mantissa of one to four numeric symbols preceded by a decimal point (and a + or - sign). These are followed by the character E and a single (positive or negative) digit representing the power of ten in the usual scientific notation.

These constraints on number size and format are made to simplify certain circuits and could easily be relaxed if desired. The restriction to a two-subscript maximum for subscripted variables is similarly motivated.

Internally, all numerical data require three 8-bit words (Fig. 1). The first two words contain the four-digit mantissa, packed two per word in a 4-bit code for each digit. A decimal point is assumed to exist to the left of the most significant digit. The two most significant bits of the third word are zero. The third bit is 0 if the mantissa is positive, or 1 if it is negative, and similarly the fourth bit is 0 or 1 if the exponent is, respectively, positive or negative. The single exponent digit occupies the least significant four bits of this word. All other characters occupy a full 8-bit word of which the two most significant bits are 1's. Any numeric characters which are symbols of a variable, e.g., the "2" in AB2X, also occupy a full word of this type. Statement numbers are simply packed 2 digits per word and always occupy 2 full words.
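The three-word number format described above can be illustrated with a small sketch. It is not the authors' circuit: the function names are hypothetical, the 4-bit digit code is assumed to be plain binary-coded decimal (the text only says "a 4-bit code"), and the sign bits are taken to be 0 for positive and 1 for negative.

```python
def pack_number(mantissa_digits, mantissa_negative, exponent_digit, exponent_negative):
    """Pack a four-digit mantissa and a one-digit exponent into three 8-bit words.

    Assumption: each decimal digit is stored as plain 4-bit BCD.
    """
    if len(mantissa_digits) != 4 or any(ch not in "0123456789" for ch in mantissa_digits):
        raise ValueError("mantissa must be exactly four decimal digits")
    if not 0 <= exponent_digit <= 9:
        raise ValueError("exponent must be a single decimal digit")
    d = [int(ch) for ch in mantissa_digits]
    word1 = (d[0] << 4) | d[1]          # two most significant mantissa digits
    word2 = (d[2] << 4) | d[3]          # two least significant mantissa digits
    # Third word: top two bits zero, then mantissa sign, exponent sign, exponent digit.
    word3 = (int(mantissa_negative) << 5) | (int(exponent_negative) << 4) | exponent_digit
    return bytes([word1, word2, word3])


def unpack_number(words):
    """Inverse of pack_number; returns (digits, mantissa_negative, exponent_digit, exponent_negative)."""
    word1, word2, word3 = words
    digits = f"{word1 >> 4}{word1 & 0xF}{word2 >> 4}{word2 & 0xF}"
    return digits, bool(word3 & 0x20), word3 & 0x0F, bool(word3 & 0x10)


# The value of Fig. 1, +0.5739 E-4
packed = pack_number("5739", False, 4, True)
```

Because the top two bits of the third word are zero, it cannot be confused with an ordinary character word, whose two most significant bits are 1's.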

Fig. 1. +0.5739 E-4 in three consecutive words in memory.

Before proceeding with the description of the overall charac- [...] of the pointers correspond to indirect addresses. Figure 2 shows a sketch of the overall system and Tables 2 to 7 show to what extent the original statements have been altered.

Fig. 2. FORTRAN computer system. [Block diagram: memory address register; memory containing the Symbol Table, Program Area and I/O buffer; memory buffer register; arithmetic unit; load, read/print and input-output circuits; program and data paths.]

Loading a program

A program, which is punched in a paper tape, is loaded into memory by energizing the tape read circuit which reads a statement on the tape, including the end-of-statement symbol ±, into the I/O buffer. The read circuit is then de-energized. The least significant 6 bits of each word of the buffer hold the internal BCD representation of each symbol.

A scan circuit (Fig. 3) now picks up each symbol in the statement from left to right and as each symbol is decoded it reacts as follows.

If the first symbol is a digit, control is turned over to a Statement Number Load circuit. This circuit shifts the statement number digit by digit into a register (SHR). The maximum allowable length of a statement number is 4 digits and all statement numbers are carried internally in this form, i.e., a programmer's statement number 13 is carried in 2 words as 0013. A search is now made of the Symbol Table area. One of three possibilities exists:

1 The statement number is not found in the Symbol Table. It is put into the Symbol Table followed by the value of the current Program location. The statement number is also put into the Program Area starting at this location and the Program Counter incremented appropriately, i.e., by 2 since two 8-bit words are used.

2 The statement number is found in the Symbol Table because it has been previously referred to by an IF or GO TO. The current value of the Program Counter is placed into the two memory locations following the statement number. (These were left blank when the statement number was previously processed.) The statement number is put into the Program Area and the Program Counter is incremented.

3 The statement number is found in the Symbol Table because it has been previously referred to by a DO statement. A description will be deferred until the DO statement loading is described since the circuit's behavior is more meaningful in that context.

After a statement number has been processed in this fashion, or if the first symbol in the statement was not a digit (no statement number was assigned), then the scan circuit continues to pick up each symbol from left to right until it is able to classify the statement as to type. It then turns over control to the appropriate loading circuit as indicated in Fig. 3.

All of these loading circuits put the statements into the Program Area after replacing variable names and statement number references in the program with addresses or pointers. They also replace reserved names such as GO TO or CONTINUE with a single 8-bit code (token).

Each unique variable name in the program, however, is also stored in the Symbol Table once using an 8-bit code for each symbol. For nonsubscripted variables the three words following the name are reserved for the data that will be associated with this name when the program is executed. Subscripted variable names are found in DIMENSION statements which must precede the use of these variables in the program. In this case as many locations following the name are reserved as have been computed from the DIMENSION statement. The name in the Symbol Table is preceded by a special symbol a, to indicate that it is a subscripted variable. In addition, the first of the two subscript values in the DIMENSION statement is also stored immediately following the name. This number is needed during program execution for constructing the proper address of the element of the array specified by a subscripted variable.¹

¹ A pointer to the next available location in the Symbol Table is also stored for speed in Symbol Table searching.
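The first two statement-number cases above amount to resolving forward references through the Symbol Table. The sketch below is only an illustration of that bookkeeping: the class and method names are hypothetical, a Python dictionary stands in for the sequential Symbol Table search, and the third case (a number previously referenced by a DO) is not modelled.

```python
class StatementTable:
    """Toy model of the statement-number entries in the Symbol Table."""

    def __init__(self):
        # four-digit statement number -> Program Area address, or None while still blank
        self.entries = {}

    @staticmethod
    def normalize(number):
        """Carry every statement number as four digits, e.g. 13 becomes '0013'."""
        return f"{int(number):04d}"

    def reference(self, number):
        """An IF or GO TO mentions the number; create a blank entry if it is new."""
        key = self.normalize(number)
        self.entries.setdefault(key, None)
        return key

    def define(self, number, program_counter):
        """The numbered statement itself is loaded; record or fill in its address."""
        key = self.normalize(number)
        self.entries[key] = program_counter
        return key

    def resolve(self, number):
        """Follow the indirect address to the statement's Program Area location."""
        address = self.entries[self.normalize(number)]
        if address is None:
            raise LookupError("statement number referenced but never defined")
        return address


table = StatementTable()
table.reference(13)        # GO TO 13 appears before statement 13 is loaded
table.define(13, 120)      # statement 13 is later loaded at Program Area address 120
```

Keeping the address in the table, rather than patching every earlier GO TO, is what makes a single loading pass sufficient.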

Fig. 3. Load processing sequence and control. [Paper tape feeds the I/O buffer; the scan circuit passes each statement to one of the processing circuits: statement number, DIMENSION, ARITHMETIC, DO, GO TO, COMPUTED GO TO, READ, PRINT, IF, PAUSE, CONTINUE, END.]

The Symbol Table address of the data location replaces all symbols of the variable name in the Program Area except for the first. This symbol, which must be alphabetic, is retained in the Program Area as an indicator that this is indeed a variable. All special symbols such as (, ), +, -, etc. are simply stored sequentially in the Program Area in the 8-bit BCD form as they appear in the original statement. Statement numbers in IF and GO TO statements are similarly replaced by the address in the Symbol Table which holds the address in the Program Area of the statement having that number. Note that this is an indirect address to the statement. Statement numbers in DO statements are dealt with somewhat differently as will be explained later.

Because variable names and statement number references can appear many times in a program, these searches of the Symbol Table are controlled by two special circuits, the Variable Match Unit (VMU) and the Statement Match Unit (SMU). These circuits indicate either that the name or statement number is already in the Symbol Table or it is not. Thus the first appearance of a variable name, statement number, or reference to a statement number causes it to be put into the Symbol Table. Subsequent references merely utilize these previously assigned data or Program addresses. Therefore each name or statement number is stored in the Symbol Table only once with an exception noted below.

In general, the programmer's statement is altered only in the above described fashion. However, for ease of execution the computed GO TO has its index parameter name, i.e., the "i" in GO TO (n1, n2, ..., nm), i, changed from the position following the parenthesis to a position preceding the parenthesis.

The DO statement requires the most complex loading algorithm. Basically, the idea is to place the DO statement itself, essentially unchanged, into the Program Area but to extract the

range statement number (which specifies the range of the DO) and put it into the Symbol Table. It is there preceded by a special symbol A, designating it as being referenced by a DO, and followed by the Program Area address of the corresponding DO statement.

The DO statement in the Program Area has its original statement number replaced by a special symbol, λ, and an internal address which is determined as follows (see Table 6). If this DO is the only DO, or if it is the first of a nest of DO's, specifying a particular range statement number, then this internal address is the program address of the next statement outside the DO range, i.e., the address to which control should go if this DO or DO nest is satisfied.

This outside address is found by the Statement Number Load circuit at the time the last statement in the range appears in the I/O buffer for loading. The circuit first detects that a matching statement number in the Symbol Table is preceded by a A. It then extracts and saves the Program Area address of the first DO and the last DO, if there is a nest, or simply the only address if there is just one. The statement number is put in the Program Area as always. In addition, the Program Area address of the λ token of the last DO in the nest is also put in the Program Area immediately following it. In addition, a special flip-flop, the LSFF, is set. The loading circuit for each statement type allowed to be the last statement in a DO range tests this LSFF after it has loaded the statement into the Program Area. If it is on, the current contents of the Program Counter, the address of the next statement outside the DO range, are used as the internal address in the first (or only) DO of the nest. It should be noted that this DO range statement number together with its own Program Area location will also appear in the Symbol Table without a preceding A. This is necessary because it is possible (and even legal in some cases!) to have an IF or GO TO refer to it also.

If this DO is just one of a nest of DO's, the internal address is the Program Area address of the λ token of the next preceding DO statement. This is easily found by a Symbol Table search for the range statement number since there is an entry in the Symbol Table corresponding to every DO statement. Thus for a DO nest three deep all ending in the same statement number 100, for example, there will be three entries in "DO nest order" of the number 0100, each followed by the corresponding DO statement Program Area address.

The method used to design the circuits which implement these functions is the same in each case. From the English language description of the function a sequential circuit state diagram is constructed. The circuit is then synthesized from the state diagram using established methods. The state diagrams of the Arithmetic Statement Loading circuits and the Variable Match Unit, which are used during Loading, are shown in the Appendix. The hardware implementation of the state diagram of the Variable Match Unit is also described there.

Executing a program

When the END statement signaling the end of a source program is encountered by the scan unit, the machine leaves its load mode, executes an automatic RESET, and enters the execution mode. (Reset forces the address 100 into the Program Counter.) Pressing the console start button causes statement execution to begin at the first executable statement which is always found at memory address 100. There is a separate statement execution circuit for each statement type. In addition, the Statement Number processing circuit reacts to a digit as the first symbol in a statement.

Each of these circuits is in an initial state when execution begins. One and only one can leave its initial state when the first symbol of a statement is read from memory. The responding circuit then retains control as it executes the statement until the first | (end of statement symbol) is read from memory. It then returns to its initial state. The first symbol of the next statement, as indicated by the Program Counter, is read and causes some circuit to leave its initial state, etc. Thus the first symbol of a statement acts like the "operation code" portion of a conventional computer instruction word. The first symbol must be (since the load circuitry causes this) one of the 8-bit tokens for the various statement types, or a digit of a statement number, or the alphabetic character of the variable on the left of the "=" symbol of an arithmetic statement. The tokens are represented in this paper as shown in Table 1.

Table 1

Statement type                   Token
GO TO n                          GO TO
GO TO (n1, n2, ..., nm), i       COMGOTO
IF (e) n1, n2, n3                IF
PAUSE                            PAUSE
DO n i = m1, m2, m3              DO
CONTINUE                         CONTINUE
READ                             READ
PRINT                            PRINT

It is possible, however, for the DO execution circuitry to leave its initial state either by reading of the DO token or by reading of the λ token immediately following it. The former causes DO initialization, the latter causes DO indexing and testing as will be described later.

The action of the execution circuits is briefly given below.
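The rule that the first symbol of a statement selects exactly one execution circuit can be sketched as a dispatch function. This is an illustrative model only: the function name and the returned labels are hypothetical, tokens are represented by their Table 1 names rather than 8-bit codes, and the second entry point of the DO circuit is left out.

```python
# Token names as they are written in Table 1 (stand-ins for the 8-bit codes).
STATEMENT_TOKENS = {"GO TO", "COMGOTO", "IF", "PAUSE", "DO", "CONTINUE", "READ", "PRINT"}


def responding_circuit(first_symbol):
    """Return the name of the one circuit that leaves its initial state."""
    if first_symbol in STATEMENT_TOKENS:
        return first_symbol                      # a statement-type token
    if len(first_symbol) == 1 and first_symbol in "0123456789":
        return "STATEMENT NUMBER"                # digit: statement number processing
    if len(first_symbol) == 1 and first_symbol.isalpha():
        return "ARITHMETIC"                      # variable on the left of "="
    raise ValueError(f"no circuit responds to {first_symbol!r}")
```

For example, `responding_circuit("0")` selects statement number processing, while `responding_circuit("X")` selects the arithmetic statement circuit.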

Statement number processing

When the first symbol of a statement is a digit this circuit is energized. If there are only four digits (packed into two memory words) the circuit returns to its initial state and the remainder of the statement is executed. If there are eight digits (packed into four memory words), the last four digits (the address of the λ of the last, or only, DO in a nest) are saved in a register, SSAR. The LSFF is turned on, the circuit returns to its initial state and the remainder of the statement is executed. If the remainder of the statement is not an IF, GO TO, or DO statement, the execution circuitry in control executes the statement and then tests for the LSFF being on. If it is on, the Program Counter contents are replaced with the SSAR contents, the LSFF is reset, and the circuit returns to its initial state. In this case the SSAR holds the program address of the λ token of the innermost DO. When this λ is read, DO indexing and testing take place. If the LSFF is off, the circuit returns to its initial state.

GO TO n

The GOTO token energizes this circuit. The four-digit address (packed into two memory words) immediately following the token is extracted. The contents of this address are put into the Program Counter and the circuit returns to its initial state.

Example.¹ GO TO 15$ (Table 2).

Computed GO TO

The COMGOTO token energizes this circuit. The initial alphabetic symbol of i, now immediately following the token, is read and discarded and the four-digit address immediately following is extracted. The contents of this address (the current value of i) are put into a register and decremented by one. If the result is zero, the four-digit address following the left parenthesis is extracted. The contents of this address are put into the Program Counter and the circuit returns to its initial state.

¹ All examples are written as though this statement or statements were the first in the program.
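The two GO TO circuits above both reach their target through an indirect address held in the Symbol Table. The sketch below illustrates that with a dictionary standing in for memory. The function names are hypothetical, values are plain integers rather than the machine's three-word decimal format, and the handling of an index greater than one (repeating the decrement for each successive address) is an assumption, since the surviving text describes only the case where the first decrement gives zero.

```python
def execute_goto(memory, pointer):
    """GO TO n: the pointer after the token addresses a Symbol Table slot
    whose contents are the Program Area address of statement n."""
    return memory[pointer]          # new Program Counter value


def execute_computed_goto(memory, index_address, pointers):
    """GO TO (n1, ..., nm), i: count down from the current value of i.

    Assumption: the circuit decrements once per listed address and takes
    the address at which the count reaches zero.
    """
    remaining = memory[index_address]
    for pointer in pointers:
        remaining -= 1
        if remaining == 0:
            return memory[pointer]
    raise ValueError("index value does not select any listed statement")


# Hypothetical memory image: i is kept at 900; slots 500, 502, 504 hold statement addresses.
memory = {900: 2, 500: 100, 502: 140, 504: 180}
new_pc = execute_computed_goto(memory, 900, [500, 502, 504])
```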

[Tables 2 to 7, showing to what extent the original statements are altered by loading, appeared here.]

first right parenthesis after the F causes Z3 to equal zero. This condition causes the value stored in d 30 to be placed in the SR. The value of i is decremented

Therefore the

Arithmetic Statement execution. These storage areas for partial results are called

d i0 div where ,

t

specifies the "level" at

which

1

computation is taking place, t is equal to zero until a left parenthesis is encountered which increases the current value of i by

to 2.

An

5

follows exception occurs if the left parenthesis immediately It is also level remains at zero. the the ss symbol. In this case 1.

is

tial results.

control values are required at every level. The count of left parentheses at any i level is stored as a number, l t Before i is incremented, the incompleted arithmetic operations still re-

Two

6

an indicator quired at the current level are indicated by giving * to and t t + indicators needed are Also 3. or value the 1, 2, f,

-

from

and

*

from

/.

To

7

,

,

-I-

clarify the significance of

made

these control values an analysis will be

8

sets of

1

((B

+

The

+

(C/((D

E'(F))))

circuit reads

and discards the The first two left

+

2

and saves the address of A, then reads which puts the circuit at the level i = 0. set to 2. The parentheses cause l to be

B

(", t

is

is

set to zero to indicate the plus sign.

The left parenthesis also causes to be incremented to one and since it is the only one at this level, \ is also set to The division symbol 1. The value of C is stored in d i

w

followed by a left parenthesis causes t t to be set to 2 to indicate the condition "C/(". Since we might find "C*(" in * is set to 1 to indicate the division. other cases, t j

3

The left parenthesis also causes to be incremented to 2 and the next left parenthesis increments l 2 to 2. The value of D is stored in d 20 and the value of E put into d 2l respeca left parentively. The multiplication symbol followed by i

,

thesis causes

"D + E*

(". f

r2

+

to

be

2

and

set to t * 2

3 to indicate the condition

are each set to zero to indicate

the plus and multiplication symbols, respectively.

4

The

left

parenthesis before the

F causes

i

to

=

2

t*

{

Basic circuit operation at any level page 363, footnote 2.

is

for

0) causes the

computation,

d20 The next two paren-

in

.

F cause l2 to equal zero. Therefore, this result placed in the SR. The value of t is decremented to 1. *

equal to 2 and t 1 is equal to 1 the computation stored in d 10 The final parenthesis after causes l x to equal zero. Therefore this result goes to

the

F

SR.

i

Since

is

tj

made and

is

is

decremented to

t

is

one and

made and

The

.

t

+

is

d^ +

SR,

stored in d^.

d00 + G to be made and two parentheses cause l to be zero;

the computation

d00 The .

zero the computation,

is

the result

+G causes

zero.

final

is

placed in SR.

(If

another right

Any subscripted variable addresses are computed easily from the initial DIMENSION statement information, saved in the Symbol Table, and the current value of the subscripts. Assume the first data location for an array A(I, J) is stored at a location A_base + 1. If the DIMENSION statement read DIMENSION A(5, 10) then the computation, A_base + 5 * (J - 1) + I, gives the correct data address for any nonzero value of I and J. (This is true only if a complete data word is stored per memory word; in this machine the expression is slightly more complicated.)

In this machine the partial result locations d_i0 and d_i1 are actually 3 words long, of course, to accommodate the data. An additional word is used to store control information, where 4 bits are used for t_i, t_i+ and t_i* and the remaining 4 bits for the l_i count. The i counter therefore is actually incremented or decremented by 7 instead of one. Thus at any level, of which there can be 14 since the I/O buffer is 100 words long, the l_i count can be as great as 15. This is more than adequate since it allows for 210 left parentheses, which is much longer than the I/O buffer.
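The subscript address rule given above, A_base + 5 * (J - 1) + I for DIMENSION A(5, 10), can be checked with a short sketch. The function name and the base address 1000 are hypothetical, and the sketch uses the simplified case the text describes, with one complete data word per memory word.

```python
def element_address(a_base, first_dimension, i, j):
    """Address of A(I, J) when the first element A(1, 1) is at a_base + 1.

    Only the first DIMENSION value is needed, which is why the loader
    stores it in the Symbol Table next to the array name.
    """
    if i < 1 or j < 1:
        raise ValueError("subscripts start at 1")
    return a_base + first_dimension * (j - 1) + i


# DIMENSION A(5, 10) with a hypothetical base address of 1000
addresses = [element_address(1000, 5, i, j) for j in range(1, 11) for i in range(1, 6)]
```

The 50 elements occupy consecutive locations, with the first subscript varying fastest.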

must be decremented by one

=

2

be stored

is

be incremented

and l3 to be set to 1. The value of F is placed in d 30 The Arithmetic Statement circuit always puts the final value computed at any level into the arithmetic unit regis= for any i. Clearly l ter, SR. It does this whenever l

to 3

1

to

parenthesis were found, this would cause an error condition to be indicated.) The $ symbol causes the contents of SR to be stored at the previously saved memory address for A.

G))t

stored in d^. The plus sign followed by a left t to be set to 1 to indicate parenthesis cause the indicator we Since "B condition the + (". might in other cases find

value of



SR

therefore the value in d^,

=

"B

*

d 21

stored in

parentheses:

A =

+

Since

is

of the following ex-

which contains some unneeded but legitimate

t+

being 3 (and

d 10 /SR

.

pression,

2

theses after

necessary to store control information which relates to these par-

distinguish

t

d20

length.

Since the appearance of the Symbol Table and Program Area little to this discussion, an example will be omitted.

would add

t

each right parenthesis.

described in the earlier report. See

Conclusion

We have illustrated in some detail that a machine for direct translation of a simple algebraic language is possible. It would therefore seem that further investigation be made of the economic position of this solution vis-a-vis the software compiler solution. Unfortunately, the present authors are not sufficiently versed in compiler construction to make such a comparison.

The actual construction of such a machine as an independent unit is probably not reasonable except under particular circumstances in which only small one-shot scientific problems form the bulk of the computing. However, as an adjunct to a larger general purpose machine, it may well serve a need as a hardware interpreter for widely used higher level languages. As a result of a fairly complete design of the control circuits of this machine, it is estimated that 10,000 diodes and 100 flip-flops would be needed for these alone (not including arithmetic circuits). The design techniques used are simple and straightforward but rather expensive. These designs should probably only be considered for use with integrated circuitry.

References

AndeJ61; BashT64; International Business Machines Corporation, General Information Manual; FORTRAN, Form F28-807401, December, 1961; IBM 1620 FORTRAN: Preliminary Specifications, Form J29-4200-2, April, 1960

APPENDIX 1

The variable match unit (VMU)¹

The Symbol Table at the end of the load mode should contain all variable names used by the program, together with empty locations reserved for data associated with these names. The Program Area at the end of the load mode should have a program in which all variable names have been modified in that only the first letter is retained, followed by the Symbol Table address of the data associated with this name. Since any variable name may appear many times in a program, a search is required, during the loading, to see if the name already exists in the Symbol Table. The search of the Symbol Table (ST) consists of comparing each name there with the variable name in the statement being loaded.

All statements are loaded by an appropriate circuit of Fig. 3 from the I/O buffer and into the Program Area of the memory. Therefore the variable name in the Statement exists physically in the I/O buffer. It is the function of the VMU to make this search when energized or "called" by the loading circuits for DIMENSION, DO, computed GO TO, READ, PRINT, IF and Arithmetic statements in which variable names appear. The output action of the VMU is to set either the OK, AOK or EOL flip-flops. These flip-flops respectively indicate that the ST either:

1 holds the variable in question as a result of previous loading, or
2 that the variable is subscripted and has been previously loaded by the DIMENSION statement loading circuit, or
3 that the End-of-List (EOL) token was found, indicating the absence of the variable in the ST.

The state diagram for this circuit is shown in Fig. 4. When the VMU is triggered by the START signal in state 0, the circuit goes to state 1, the next clock pulse sends it to state 2 from which it starts its search of the ST. In going from 1 to 2, the I/O Counter (CIO) contents are saved in register SCIO since the name may have to be scanned again. The Symbol Table Counter (STC) is initialized to 4095 since the ST is scanned sequentially downward. If a character of a variable name in the I/O buffer is found in the corresponding position of a name in the ST, the character is said to be matched. The VMU proceeds from state 2 to state 3 of (Fig. 4) if the first character of the name under scan matches. Otherwise the state changes from 2 to 8, if the NO MATCH signal is given. The MATCH or NO MATCH signals are generated as a result of comparing the contents of the ST location undergoing the scan (the contents reside in the Memory Buffer Register, MBR), with the contents of the register COMP which has the character from the I/O buffer. The first character is put into COMP by the calling circuit, thereafter the VMU picks them up in the 3-4 transition. The CIO and STC counters are incremented and decremented, respectively, and the VMU oscillates between states 3 and 4 as long as matching continues. This comparison process will terminate when either an arithmetic operator, S, is read from the I/O buffer, sending the circuit to state 6 from state 3, or the NO MATCH signal is given with respect to the contents of the COMP unit, causing the transition from state 4 to 5. In state 6, if a digit is next read from the ST, corresponding in position to the appearance of the operator from the I/O buffer, clearly the names are the same and the OKFF is set to 1, and the transition out of state 6 is made. On the other hand, if another alphameric character in the ST corresponds to an operator, S, in the I/O buffer, the names are not the same and the transition from 6 to 5 is made. In state 5 the circuit just reads to the end of this nonmatching name in the ST. A digit at the end of a name causes the transition 5-7, during which the STC is stepped over the 3 data locations to the next ST entry and the CIO is reinitialized (SCIO → CIO).

¹Symbols used in this Appendix are described in Table 8.
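The search that the VMU performs can be imitated in software to make the sequence of comparisons easier to follow. This is a loose sketch, not the circuit itself, and it rests on several assumptions that are not in the original: the Symbol Table is modeled as a Python list scanned upward from index 0 (the real STC starts at 4095 and counts down), names are purely alphabetic, each name is closed by a single digit cell followed by 3 data cells, and the AOK (subscripted variable) case is left out. The names `vmu_search` and `EOL` are hypothetical.

```python
EOL = "EOL"  # hypothetical stand-in for the End-of-List token


def vmu_search(st, name):
    """Return ("OK", index_of_digit_cell) if name is in the table, else ("EOL", None).

    st is a list of cells: the characters of a name, one digit cell that ends
    the name, then 3 data cells, repeated, and finally the EOL token.
    """
    pos = 0
    while st[pos] != EOL:
        # States 2-4: compare character by character while they match.
        k = 0
        while k < len(name) and st[pos + k] == name[k]:
            k += 1
        # State 6: the name in the buffer ended; a digit in the table at the
        # same position means the two names are the same.
        if k == len(name) and st[pos + k].isdigit():
            return "OK", pos + k
        # State 5: read on to the end of the nonmatching name.
        while not st[pos + k].isdigit():
            k += 1
        # Transition 5-7: step over the digit cell and the 3 data locations.
        pos += k + 1 + 3
    return "EOL", None


# Example table holding the names AB and X.
table = ["A", "B", "0", None, None, None, "X", "1", None, None, None, EOL]
```

Searching this table for `"X"` skips the entry for `AB`, steps over its three data cells, and reports a match; searching for `"A"` fails against both entries and ends at the EOL token.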

376

Part

4

The

Section 4

instruction-set processor level: special-function processors

Processors based on a programming language

[Fig. 4. State diagram of the variable match unit (VMU). Legible transition labels include START VMU, MATCH, NO MATCH, EOL/SET EOLFF, SET OKFF, SET AOKFF, CIO → SCIO, SCIO → CIO, and READ (STC).]

match

SCIO 0100 0000 1001 0101 -> STC

STC -+

from state

Therefore the execution of the above microsteps, in that order,

,

which indicates a unique state of the state diafinally the third comes from the control cycle counter.

MAR

READ

3.

AND

The output of each AND gate is a line indicating a unique microstep. The AND's feed OR gates, which actually energize the given

read cycle.

This signal causes the

Each

skeletal counter

address register. initiates a

,

in Fig. 6.

gate in this figure has 3 inputs except those not requiring input

CHANGE STATE

This signal causes the CIO to be incremented by one. CIO—> MAR This signal causes the CIO to be gated to the

READ This signal CHANGE STATE

should

we are in, the outputs of the decoder of the control cycle counter, and the input lines (S v S MATCH,

are:

fCIO

memory

it

use the outputs of the skeletal counter which will

MATCH

2 of Fig. 4, if a signal is present we are supposed to increment the CIO counter and then read the I/O buffer.

Consequently the microsteps required

present,

line information.

(Fig. 4).

to realize a circuit

will

the "/„ in the

is

indicate to us the state

ones are initially represented in a state diagram form, such as the state diagram for the loading of the Arithmetic Statement (Fig. 5)

MATCH signal

to state

State 2

CHANGE STATE and MATCH INCREASE CIO CIO -> MAR

READ

Fig. 6. State diagram implementation.

State 2

CHANGE STATE and NO MATCH CHANGE STATE

State 5

and S v

DECREASE STC STC -+ MAR READ CHANGE STATE State 5

and d

DECREASE STC DECREASE STC DECREASE STC SCIO —> CIO CIO—> MAR READ

CHANGE STATE

In state 0 of Fig. 4 a START VMU signal takes it to state 1. This is accomplished by the top AND of Fig. 6. The only microstep needed is CHANGE STATE. In state 1 of Fig. 4, the next clock pulse (after reaching state 1) causes a transition to state 2. In this case we need to save CIO contents in register SCIO (CIO → SCIO), set the STC to 4095 (4095 → STC, shown above in BCD form) and now get the contents of the address in the Symbol Table Counter (READ(STC)). This latter is implemented by the two microsteps STC → MAR followed by a READ command to the core memory. This transition from 1 to 2 of Fig. 4 is accomplished by the next 5 AND gates shown in Fig. 6. The next AND gates shown accomplish the transition from state 2 to 3 if there is a MATCH. The next AND accomplishes the transition from 2 to 8 if there is NO MATCH (in this case nothing need be done). Finally the lowest two groups of AND gates implement the required microsteps as the circuit changes from state 5 to 7 if a 4-bit digit code is sensed or causes the circuit to remain in state 5 after decrementing the STC if an 8-bit variable code is read.
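The BCD constant loaded into the STC and the five microsteps of the state 1 to state 2 transition can be written out as a quick check. This is a sketch for illustration only; `to_bcd` and `STATE_1_MICROSTEPS` are names invented here, and the list simply restates the microsteps named in the text, one per AND gate.

```python
def to_bcd(value, digits=4):
    """Encode a non-negative decimal number as 4-bit BCD groups."""
    decimal = str(value).zfill(digits)
    return " ".join(format(int(ch), "04b") for ch in decimal)


# Microsteps for the transition from state 1 to state 2, in order.
STATE_1_MICROSTEPS = [
    "CIO -> SCIO",               # save the I/O counter
    to_bcd(4095) + " -> STC",    # start the scan at the top of the Symbol Table
    "STC -> MAR",                # address the first Symbol Table cell
    "READ",                      # fetch it from core memory
    "CHANGE STATE",              # advance the state counter to state 2
]
```

The encoding of 4095 comes out as `0100 0000 1001 0101`, the bit pattern shown in the microstep list, and the list has five entries, matching the five AND gates mentioned for this transition.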

Chapter 32

A microprogrammed implementation of EULER on IBM System/360 Model 30¹

Helmut Weber

Summary

An experimental processing system for the algorithmic language EULER has been implemented in microprogramming on an IBM System/360 Model 30 using a second Read-Only Storage unit. The system consists of a microprogrammed compiler and a microprogrammed String Language Interpreter, and of an I/O control program written in 360 machine language. The system is described and results are given in terms of microprogram and main storage space required and compiler and interpreter performance obtained. The role of microprogramming is stressed, which opens a new dimension in the processing of interpretive code. The structure and content of a higher level language can be matched by an appropriate interpretive language which can be executed efficiently by microprograms on existing computer hardware.

Introduction

Programs written in a procedure-oriented language are usually processed in two steps. They are first translated into an equivalent form which is more efficiently interpretable; then the translated text is interpreted ("executed") by an interpretation mechanism.

The translation process is a data-invariant and flow-invariant operation. It consists of two parts: an analytical part, which analyzes the higher level language text, and a generative part, which builds up a string of instructions that can be directly interpreted by a machine. The analytical part of the translator depends on the higher level language; the generative part depends on a set of instructions interpretable by a machine. Historically there was only one set of instructions which could be interpreted efficiently by a machine, its "machine language." Figure 1 outlines this scheme.

[Fig. 1. Legible labels: program in higher-level language; analysis (language dependent); generation (machine dependent); intermediate text; input data.]

Some of the processors of the IBM System/360 family are microprogrammed machines. On them the "360 machine language" is interpreted not by wired-in logic but by an interpretive microprogram, stored in control storage, which in turn is interpreted by wired-in logic. Therefore, in a certain sense the 360 language is not the "machine language" of these processors but the (efficiently interpretable) language in which the processors of the System/360 family are compatible. The true "machine language" of these processors is their microprogram language. This language is on a lower level than the "360 language"; it contains the elementary operations of the machine as operators and the elements of the data flow and storage as operands.

Now it is conceivable to compile a program written in a higher level language into a microprogram language string. This string would undoubtedly contain substrings which occur over and over in the same sequence. We could call these substrings procedures and move them out of the main string, replacing their occurrence by a procedure call symbol, followed by a parameter designator pointing to the particular procedure. Our object program then takes on the appearance of a sequence of call statements. From here it is only a final step to eliminate the call symbols and furnish an interpreting mechanism which interprets the remaining sequence of "procedure designators."

The process just described will result in the definition of a string language and the development of a microprogrammed interpretation system to interpret texts in this string language. The situation is similar to the System/360 case: the string language corresponds to the 360 language. Programs written in a higher level language are compiled into string language text to be stored in main storage. The string language interpreter corresponds to the microprogram which interprets 360 language texts. It consists of a recognizing part to read the next consecutive string element and to branch to an appropriate action routine and of action routines to execute the particular procedure called for by the string element. The essential difference between our situation and the 360 case is that the string language reflects the features of the particular higher level language as well as the features of the particular hardware better than the general purpose 360 language.

What is gained by defining this string language and by providing a microprogrammed interpreter for it? From the method of definition described, it can be seen that the elements of the string language correspond directly to the elements of the higher level language after all simplifying data-invariant and flow-invariant transformations have been performed. But the elements of the string language are also well-adapted to the microprogram structure of the machine. Therefore, during the compiling process (see Fig. 2) only a minimum of generation is necessary to produce the string language text. The compiler is shorter and runs faster. But the more important aspect is that object code execution is also faster. The string language interpreter in case 2 will be coded to take care of all necessary operations in a concise form, whereas in case 1 it will be necessary to compile a whole sequence of machine language instructions for an elementary operation in the higher level language. Examples of this are the compilation of 360 code for an add operation in COBOL of two numbers with different scaling factors or the compilation of machine instructions for table lookup or search operations, etc. In these cases the string language interpreter of Fig. 2 will execute a function much faster than the machine language interpreter of Fig. 1 will execute the equivalent sequence of machine language instructions. Therefore, object code execution will be faster in scheme 2.

¹Comm. ACM, vol. 10, no. 9, pp. 549-558, September, 1967.
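The two-part structure of the string language interpreter (a recognizing part that reads the next string element and branches, plus a set of action routines) can be sketched in a few lines. This is only an illustration of the structure described above: the three operators, the stack discipline, and the name `run` are assumptions made for the example and are not taken from the EULER string language.

```python
def run(string_text):
    """Interpret a list of (designator, argument) string elements."""
    stack = []

    # Action routines: each executes the procedure one designator stands for.
    def push(arg):
        stack.append(arg)

    def add(_arg):
        right = stack.pop()
        left = stack.pop()
        stack.append(left + right)

    def mul(_arg):
        right = stack.pop()
        left = stack.pop()
        stack.append(left * right)

    actions = {"push": push, "add": add, "mul": mul}

    # Recognizing part: read the next consecutive element and branch
    # to the appropriate action routine.
    for designator, arg in string_text:
        actions[designator](arg)
    return stack.pop()


# (2 + 3) * 4 expressed as a sequence of procedure designators.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
```

In this picture the compiler only has to emit the designators; all the detail of how an addition is carried out lives once in the action routine, which is the space and speed argument the text makes for scheme 2.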

If

object code performance

is

not as

much

in

demand

as object

also storage space economy, the string language interpreter can is as such that the be written tightly packed as string language


I

|

and storage

is

new N:

allocated dynamically,

N-

begin

new

EULER

A

A;



Chapter 33

The IBM 1800

Part 5 The PMS level
Section 2 Computers with one central processor and multiple input/output processors

[Figure residue: IBM 1800 configuration diagram. Legible labels include 2310, storage, data channel, card I/O, paper tape, printer and plotter, additional printers no. 5 through 8, optional features, and digital output control (DOC).]

2 An interval timer has counted a previously set time interval.
3 A magnetic-tape drive has completed a data transfer previously requested and is ready for another request.
4 An operator has initiated an interrupt from the Pc console.
5 A device such as a typewriter has just printed a character and is ready to receive the next one.

Primary-memory communication and data transmission with terminals and secondary memory

Two methods are used to transmit data between Mp and Ms, or Mp and T. First, low-speed devices are controlled directly by the program. Each character or word of data is transmitted to or from the Pc and onto T by means of an Execute I/O (XIO) instruction. The Pc program and device synchronization are accomplished by using the interrupt mechanism. Devices operating under direct program control include typewriter, printer, plotter, paper tape (reader and punch), analog-to-digital converters, contact sense, voltage-level sense, pulse counters, etc.

The second method of transferring data is via the Pio('Data Channels). The Pio program is started by the XIO instruction of the Pc. The transfer of data words then proceeds under control of the specified Pio, completely asynchronous to and in parallel with Pc program operation. The Pio gains Mp access independent of Pc (Pc operation is suspended for one Mp cycle). During the cycle, the data are taken from or placed into core storage by Pio (via internal Pc control and registers). As soon as the Pio has been satisfied, which normally takes one cycle, the Pc proceeds. The logical state of the Pc, or the Instruction-set Processor, is not changed by Pio's access to Mp. This method of access is referred to as "cycle stealing." Devices (Ms and T) operating under Pio control include magnetic tapes, disks, line printer, card reader-punch, and the link to the IBM System/360.

Some devices can operate under both Pc and Pio control, depending on their characteristics and the configuration, e.g., analog input, analog output, digital input, and digital output.

Process I/O, controls and transducers

Analog inputs. Analog-input equipment includes analog-to-digital converters, multiplexors, amplifiers, and signal conditioning equipment to handle various analog-input signals. The data input rates are up to 20,000 16-bit samples per second, with program selectable resolution and external synchronization. There can be 1,024 (via relay) and 256 (via high-speed solid state) multiplexed analog-input channels connected to a single K (analog-to-digital converter). The Configurator (Fig. 2) shows the allowable inputs.

Digital inputs. The Digital Input provides up to 384 process interrupts; up to 1,024 bits of contact sense, digital input, or parallel register input; and 128 bits of event input counters as 1-, 8-, and 16-bit counting registers.

Analog outputs. Up to 128 analog outputs can be provided.

Digital outputs. Digital Outputs provide up to 2,048 bits of pulse output, contacts, and registers.

IO processors (data channels)

Pio('Data Channels) give a T or Ms the ability to communicate directly with Mp. For example, if an input unit requires a primary memory cycle to store data that it has collected, the Pio communicates directly with Mp and stores the data. The Pio's run even if Pc is waiting. The Pio's have two registers: a Word Count which is used to count the number of words being transferred in a block between a device and Mp memory; and a Channel Address which points to the next word transferred in a block. The Channel Address is also used to select the next instruction in the Pio program for the next block transfer task.

Two basic types of Pio's are used, nonchaining and chaining. The Pio's provide the ability to transfer either a single block (nonchaining) or multiple blocks (chaining) directly to Mp independent of Pc.

The central processor

Registers in the physical processor

Figure 4 shows the relationship of the registers in Pc, together with those in the Instruction-set Processor. Those registers accessible by the program are shown with an °. All the registers are accessible from the console. A description of the functions of each register is given below.

Storage address register (SAR). All Pc references to Mp are selected by this 16-bit register. Pio references to Mp use the Channel Address Register (CAR) of the active Pio.

Instruction register (I)¹. This 16-bit counter register holds the address of the next instruction.

Storage buffer register (B). This 16-bit register is used for buffering all word transfers with Mp.

¹A descriptive name undoubtedly concocted by one of IBM's marketing departments.
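The role of the two Pio registers during a cycle-stealing block transfer can be illustrated with a small model. This is a sketch under stated assumptions, not a description of the 1800 hardware: Mp is a Python list, the Word Count is assumed to count down to zero, each word transferred costs exactly one stolen Mp cycle, and the class and function names are invented for the example.

```python
class Pio:
    """Toy model of a nonchaining data channel with its two registers."""

    def __init__(self, channel_address, word_count):
        self.channel_address = channel_address  # next Mp location to use
        self.word_count = word_count            # words still to transfer

    def steal_cycle(self, mp, word):
        """Store one word in Mp during a stolen cycle; Pc state is untouched."""
        mp[self.channel_address] = word
        self.channel_address += 1
        self.word_count -= 1
        return self.word_count > 0  # True while the block is unfinished


def transfer_block(mp, start, words):
    """Move a block from a device into Mp; return the number of stolen cycles."""
    pio = Pio(start, len(words))
    stolen = 0
    for word in words:
        pio.steal_cycle(mp, word)
        stolen += 1
    return stolen


memory = [0] * 8
cycles = transfer_block(memory, 2, [7, 8, 9])
```

After the call, three words sit in consecutive locations starting at address 2 and three Mp cycles have been taken from the Pc, which is the whole cost of the transfer as far as the Pc program is concerned.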

[Figure residue: register diagram. Legible labels include Console, Core Storage, SAR, Interval Timers, Operation Monitor, and a connection to input devices.]

N THEN

GO TO

FIN

If all

n

/-tasks started,

proceed with

K

¹This is not quite accurate. The simple MOP algorithm presented here does not explicitly interlock the split sequence. There is therefore a possibility that unnecessary task-calls may be queued during the execution of the split which is to generate the nth task. The probability of this is, however, small, while the degradation arising from an interlock could be significant, and the algorithm in the form given appears more economical.

Chapter 37 A survey of problems and preliminary results concerning parallel processing and parallel processors

6.3 Macro-parallelism

Commonly used numerical algorithms, data processing procedures, and computer programs are generally sequential in nature. The reason for this is largely historical, a consequence of the fact that the mechanisms, human, mechanical, and electronic, used in developing and executing these procedures have been incapable of significant parallel activity, other perhaps than the simultaneous, coordinated use of many humans. The advent of parallel processing systems thus calls for the modification of accepted techniques to expose any inherent parallelism. The resultant procedures must then be further adapted to make parallel tasks of such a magnitude that the overhead involved in their generation becomes insignificant. But the ultimate benefit from parallel execution will be obtained only by going back to the problems themselves. These must be analyzed anew. Algorithms must be developed that make it possible to exploit the parallel executing capability, by introducing into the mathematical and program model the parallelism that ultimately reflects the physical parallelism of the system or phenomena being studied. In this need to return to fundamentals, the situation is somewhat analogous to the early days of electronic computing, when attempts at commercial application were largely frustrated until it was realized that widespread application required the development of new techniques, rather than the adaptation and mechanization of existing procedures.

At the present time, however, our direct activity in problem analysis has concentrated mainly on the adaptation of existing numerical techniques for parallel processing, for problems in which the basic macro-parallelism was self-evident. These include, for example, linear algebra and the solution of elliptic partial differential equations. In these areas the extent and nature of the parallelism had previously led to proposals for vector processing systems such as Solomon [Slotnick et al., 1962; Gregory and McReynolds, 1963] and Vamp [Senzig and Smith, 1965]. Other areas in which the parallelism is self-evident but where vector processors prove less effective are those in which the algorithms model distinct physical activities such as in file processing and Monte Carlo techniques. For all significant problems investigated [Schlaeppi, 19??] it was possible to establish the existence of parallel tasks of such a length that tasking overheads could be expected to be negligible.

Other classes of problems have been studied, both in terms of the extension of existing algorithms and the development of new ones. In particular we refer to the extraction of polynomial roots [Shedler and Lehman, 1966], solution of equations [Shedler, 1966], and the solution of linear differential equations [Nievergelt, 1964], [Miranker and Liniger, 1967]. These various studies, not all directly related to the present project, were more mathematical in nature, and to the best of our knowledge, no attempt has yet been made to develop efficient parallel computer programs. Thus, while numerical methods are beginning to emerge which enable the exploitation of macro-parallelism in the solution of time-limited problems, and from which it appears that significant reductions may be obtained in throughput times, much work remains to be done on re-programming the problems themselves.

7. Simulation

7.1 Simulation as a design tool

It has been our experience with simulation that its principal function as a design tool is to focus attention on features that require investigation and explanation. Many results, qualitative and quantitative, that are obtained during simulation experiments may also be obtained analytically. It is, however, the insight and understanding gained from the design of simulation experiments and the analysis of their results that draws attention to specific details and difficulties. The undeniable value of simulation in development and design is therefore quite different from that in system evaluation, where meaningful performance figures may be obtained when the work load is well defined.

7.2 The executing simulator

In the present study simulation was seen as fulfilling a number of additional functions. In particular it made available a usable working model of a parallel processing system. This would give potential users the incentive to undertake actual programming and to gain limited operational experience. An executing simulator was also required for the investigation of what is commonly regarded as the most immediate question in parallel processing, the extent of performance degradation due to storage-access interference and executive (queue-access) interference. Such an executing simulator is now operational and its use is discussed in the next section. We note parenthetically that a limitation of this type simulator is its speed. For the evaluation of total system performance over any length of time, particularly when using a computer itself much slower than the simulated system, only gross, nonexecuting, simulation is reasonable [Katz, 1966].

The system presently modeled in the executing simulator includes the processors, switch, and Storage Modules of Fig. 1. The storage modules are accessed through a fully interleaved address

463

464

The

Part 5

PMS

Section 3

level

though it is clear that in any realization interleaving partial, both to sustain high availability and to decrease

structure,

be

will

storage interference between independent jobs. The individual processors have a System/360-like structure [Blaauw and Brooks,

augmented subset of S/360 machine language. The nonstandard instructions added to the repertoire in-

sizes of matrices

clude the functions discussed in Section

be used

in the

also as an instruction buffer,

model

for

which the interference

4.

is

The

results are

quoted

in the

The simulator configuration is parameterized so that, for example, the numbers of storage modules and processors, instruction execution times (in storage cycles), and the nature of statistics gathered and printed may be selected for each run. The next section.

program

ment

modular, and both system features and measuremay be expanded or modified as required.

itself is

facilities

and

parallel

isolate the effect of

processing

commensurate

mapping with the address structure of the which demonstratively had significant influence on the

store,

results.

Instruction execution times for the most frequently executed instructions used in the experiment are given in Table 2.

local store LSi,

however not included

were used to

for multiprocessing

periodicities of array

1964] and execute an

to

Computers

for

These times exclude the instruction fetch time (one instruction each fetch), since these are overlapped unless storage conflict

occurs,

may

when

a request must be queued.

also include a data fetch

further store access time

is

(RX

The arithmetic operations

instructions) in

which case

a

required.

7.3 Simulator experiments

7.3.1 Kernels. Simulation experiments first concentrated on an investigation of storage interference arising in the execution of typical kernels from numerical analysis. The results indicated that under the limited condition of the experiments and for a storage module-to-processor ratio of two, interference would degrade performance by less than twenty percent, dropping to some five percent for storage module-to-processor ratio of eight. Addition of a local processor store and its use as an instruction buffer effectively eliminated interference, as expected, indicating that it had been substantially due to instruction-fetch interference.

These results were considered to have been generated under conditions too restrictive to permit generalization. In particular each set referred only to concurrent executions of a single loop. Thus more recent experiments have included many runs of a matrix-multiply subroutine and the solution of an electrical network problem using an appropriately modified version of the Jacobi variant of the Gauss-Seidel solution of a set of linear algebraic equations.

7.3.2 The matrix multiplication. The Matrix Multiply program was written in two versions. A classical sequential program excluding all the special instructions provided the standard on which measurement of the parallelism overhead and interference could be based. The second, parallel, program used the onion peeling rather than the MOP algorithm described in Sec. 7.2. The product matrix was partitioned by rows, with the computation of each comprising one task. The experiments were performed for square matrices of dimensions thirty-nine and forty with from one to sixteen processors and sixteen to sixty-four storage modules. Two ...

In the absence of an internal instruction buffer, processors executing the same program string interfere with each other continuously during instruction fetches. To minimize this effect for loops that are short relative to the width of the interleaving, it is profitable to unwind such loops by repetition so that the resultant string stretches as far as possible across the interleaved store. The program was unwound in this way. We note, however, that it is in fact better [Rosenfeld, 1965] to repeat the loop, appropriately modified, several times across the interleaved store, directing successive processors to successive, but unconnected, loops. This can decrease interference by as much as twenty percent over the previous case.

Some results of the simulation are given in Table 3 and plotted in Figs. 5 and 6. We note that running time (col. 4) is defined as the interval between the start of the first processor on its first task and the completion, by the last processor to finish, of its final task. Since an onion peel technique has been used for the splitting, there is an interval (of order 70 storage cycles) between the start of successive tasks. There is also an initial interval (87 memory cycles) in which the first processor initializes the program. Finally, the finish of processors is staggered and, in particular, for the sixteen-processor case, eight processors are assigned two tasks (rows) in succession, and eight, three tasks. The former processors will, of

Table 2   Execution time in storage cycles

Instruction                              Execution time
Fixed Point Addition                      0.4
Floating Point Addition                   0.5
Floating Point Multiplication             1.0
Floating Point Division                   2.0
Terminate                                25.0
Split                                    25.0
New Task Fetch (Part of Terminate)       25.0
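The row-by-row task split described above can be sketched as follows. This is a minimal illustration, not the simulator used in the paper: it assumes every row task takes the same time, so that handing out tasks round-robin matches the order in which processors would return for more work. The function names `assign_rows` and `task_count_histogram` are hypothetical.

```python
def assign_rows(n_rows, n_processors):
    """Hand out one row task at a time to processors in round-robin order.

    Assumption (not from the paper): all tasks are equally long, so
    round-robin reproduces the order in which processors become free.
    """
    tasks = {p: [] for p in range(n_processors)}
    for row in range(n_rows):
        tasks[row % n_processors].append(row)
    return tasks


def task_count_histogram(tasks):
    """Map 'number of tasks' to 'how many processors received that many'."""
    hist = {}
    for rows in tasks.values():
        hist[len(rows)] = hist.get(len(rows), 0) + 1
    return hist


if __name__ == "__main__":
    # Forty rows on sixteen processors: eight processors get three rows,
    # eight get two, which is the staggered finish noted in the text.
    print(task_count_histogram(assign_rows(40, 16)))
```

For the forty-row, sixteen-processor case this gives eight processors with three tasks and eight with two, so the two-task processors stand idle while the others finish their third row.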

Chapter 37   A survey of problems and preliminary results concerning parallel processing and parallel processors

Part 5   The PMS level   Section 3   Computers for multiprocessing and parallel processing

[Table 3 and the accompanying plots are not recoverable here; the surviving axis label reads "storage kilocycles".]

... effects which must be understood within the framework of a numerical analysis of the relaxation solutions. Figures 7, 8, and 9 present the basic performance data, throughput time, and total processor time, for a total of one hundred and forty-four cases. The variables are the number of processors in the system (12 cases), the size of the inner loop as represented by the number of currents (from 2 to 5) evaluated in the loop, and the number of interleaved storage modules (16, 32, 64).

These curves clearly indicate the reduction in throughput time to be obtained from the use of parallel processing, the consequent increase in processor cost due to interferences of various sorts, the resultant effect of diminishing returns, and the actual increase in throughput time, when too many processors chase too few equations and generally get seriously "into each other's way." For the smaller inner loops and when interference between processors is low, total processor times vary somewhat erratically. The causes for this are related to the relaxation pattern and the rate of convergence in each case. In fact there appears strong circumstantial evidence that an ad hoc procedure, which does not guarantee sequential evaluation of the equations, improves performance. This point, however, requires further study.

Figure 10 reproduces some of the results of the previous three figures for the case of a five-equation inner loop. Table 4 lists these same results as a percentage of the time using one processor and compares them with the reciprocal of the number of processors. Figure 11 indicates storage interference and parallel processing overheads as a function of the number of processors, with storage modularity again a parameter and an inner loop again comprising

Fig. 9. Total processor and throughput times in electrical network analysis: 64 storage modules. [Plot of storage kilocycles against number of processors; surviving legend entries are inner loops of 2, 3, and 4 equations.]
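The comparison that Table 4 is said to make, run time as a percentage of the one-processor time set against the reciprocal of the number of processors, can be sketched as below. The function name `relative_times` and the sample figures are hypothetical; the table's actual values are not available here and none are implied.

```python
def relative_times(throughput_times):
    """Compare measured run times with the ideal 1/n scaling.

    throughput_times: dict {n_processors: run_time}; must include n = 1.
    Returns {n: (measured_percent, ideal_percent)}, where measured_percent
    is the run time as a percentage of the one-processor time and
    ideal_percent is 100 / n, the reciprocal of the number of processors.
    """
    base = throughput_times[1]
    return {
        n: (100.0 * t / base, 100.0 / n)
        for n, t in sorted(throughput_times.items())
    }


if __name__ == "__main__":
    # Hypothetical run times in storage kilocycles, for illustration only.
    sample = {1: 600.0, 2: 330.0, 4: 195.0}
    for n, (measured, ideal) in relative_times(sample).items():
        print(n, round(measured, 1), round(ideal, 1))
```

The gap between the two percentages for a given n is one way to read off the combined cost of interference and parallel-processing overhead.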

Table 4   Run time for resistor network system relative to the run time using one processor, with a five equation inner loop

... factor. Such a factor is intuitive and environment-sensitive, depending on the relative concern for speed and for costs of various sorts. For the present data we have chosen to display a function of K and the product (total processor time × throughput time), where K is a constant, throughput time a measure of the speed of computation, and total processor time a measure of the cost.

Any ultimate evaluation of a parallel processing system within a working environment depends on actual operating experience. This in turn requires the existence of a system and the interest of users. Only when usable systems become available will the concept of parallel processing in integrated systems be accurately evaluated.

8. Conclusion

In this paper we have presented some thoughts on parallel processing. In particular we have chosen to survey the topic by including an extensive bibliography and some of the results of our work in this area. The discussion has had to be brief, but our intention has been to convey the picture of the potential that parallel processing systems offer for the future development of computing. The key to successful exploitation lies in a new, unified, and scientific approach to the entire problem of the design and usage of computing systems. The development of large, integrated systems raises many problems, but there can be no doubt that economic solutions to these will be found. Their development should comprise a significant part of the computer system architectural design effort of the next few years.

References

BlaaG64; BrigH64; ConwM63; CorbF65; DennJ66; DesmW64; DreyP58; FalkA64; GillS58; GregJ63; KatzJ66; LehmM65; LeinA59; McCuJ65; MiraW67; NievJ64; RoseJ65; SchlH??; ShedG66a, b; SlotD62; SmitR64; PL/I Language Specification, Form C28-6571

Bibliography

AlleM63; AmdaG62; AndeJ62, 65; ArdeB66; BaldF62; BlaaG64; BrigH64; BuchW62; BussB63; CoddE62; ComfW65; ConwM63; CorbF62, 65; CritA63; DaleR65; DennJ65, 66; DesmW64; DijkE65; DreyP58; ErnsH63; EstrG60, 63; EwinR64; FalkA64; ForgJ65; FranJ57; GillS58; GlasE65; GregJ63; HellH61, 66; KatzJ66; KinsH64; KnutD66; LehmM63a, 63b, 65; LeinA59; LourN59; MarcM63; McCaJ62; McCuJ65; MeadR63; MillW63; MiraW67; NievJ64; OssaJ65; PennJ62; RoseJ65; SchlH??; SeebR63; SenzD65; ShedG66a, 66b; SlotD62; SmitR64; SquiJ63; StraC59; VyssV65; WirtN66; IBM OS/360 PL/I Language Specification, Form C28-6571; Proc. IFIP 1962 "Symposium on Multi-Programming" 1963.

Section 4

Network computers and computer networks

The RW-400 and the CDC 6600 are actually computer networks by our definition of a computer (Chap. 2, page 17). Yet because of the restrictions on the quantity and location of the components in these structures, we still consider them to be computers. On the other hand, two or more computers which are separated physically, yet connected, constitute a computer network. Computer networks will appear in the future; it is important to understand the basis for them.

The RW-400: a new polymorphic data system

Chapter 38 presents the RW-400 (also called the AN/FSQ-27), a later version of the Ramo-Wooldridge RW-40 originally designed in 1959. The diagram (page 478) gives an indication of the relationship and names of the components. The PMS structure in Fig. 1 has more configuration details. At least six RW-400's were built for military command and control applications (although the number of computers of a type in existence has little to do with a machine's worth or ability).

The RW-40 ISP as given in Appendix 1 of Chap. 38 is a good example of a processor with a two-address instruction set. The ISP does not have index registers; it has a small state consisting of the accumulator (A), a limited extended accumulator (B), the program counter (P), and about 6 state bits. The Pc is limited by its ability to address directly only a 1,024-word Mp. The ISP is undoubtedly sufficient for solving the kinds of problems encountered by the computer and compares favorably with Whirlwind and the IBM 1800.

The RW-40 introduced multiple parts for reliability [Rothman, 1959]. Multiple C's (or Mp-Pc and Mp-Pio) are provided for redundancy and capacity. However, the S('Central Exchange) which provides communication among the C's may not have redundant parts. The multiple-computer concept can be viewed as the forerunner to our present computer networks, in which the central switching element is the Telephone Exchange. Over a longer time span, the RW-400 may be most significant as a pioneer. However, the whole system, with the exception of the small Mp's, is nicely designed. The problem of low speed T(typewriter, display)'s is handled well by transferring data from Mp-Pc to Ms(drum) for concurrent and independent T and P activity. Similar solutions are common by using an M, local to particular T's, and local C's for managing T activity. The structure should be compared with the CDC 6600 (Chap. 39) and the network examples in Chap. 40.

The CDC 6400, 6500, 6600, 6416, and 7600

The CDC 6600 development began in 1960, using high-speed transistors and discrete components of the second generation. The first 6600 was delivered in September, 1964. Subsequent compatible successors included the 6400, in April, 1966, which was implemented as a conventional Pc (a single shared arithmetic function unit instead of the 10 D's); the 6500 in October, 1967, which uses two 6400 Pc's; and the 6416 in 1966, which has only peripheral and control processors. The first 7600, which is nearly compatible, was delivered in 1969. The dual processor 6700, consisting of two 6600 Pc's, was introduced in October, 1969. Subsequent modifications to the series in 1969 included the extension to 20 peripheral and control processors with 24 channels. CDC also marketed a 6400 with a smaller number of peripheral and control processors (e.g., 6415-7 with 7). Reducing the maximum PCP number to 7 also reduced the overall purchase cost by approximately $56,000 per processor.

The computer organization, technology, and construction are described in Chap. 39. ISP descriptions for both the Pc and Pc('Peripheral and Control Processors/PCP) are given in Appendices 1 and 2 of Chap. 39.

To obtain the very high logic speeds, the components are placed close together. The logic cards use a cordwood-type construction. The logic is direct-coupled transistor logic, with 5 nanoseconds propagation time and a clock of 25 nanoseconds. The fundamental minor cycle is 100 nanoseconds and the major cycle is 1,000 nanoseconds, also the memory cycle time. Since the component density is high (about 500,000 transistors in the 6600), the logic is cooled by conduction to a plate with Freon circulating through it.

This series is interesting from many aspects. It has remained the fastest operational computer for many years. Its large ...

... why we consider the 6600 to be fundamentally a network. Each Cio (actually a general-purpose, 12-bit C) can easily serve the specialized Pio function for Cc. The Mp of Cc is an Ms for a Cio, of course. By having a powerful Cio, more complex input-output tasks can be handled without Cc intervention. These tasks can include data-type conversion, error recovery, etc. The K's which are connected to a Cio can also be less complex. Figure 2 has about the same information as Thornton's Fig. 1 block diagram (Chap. 39).

A detailed PMS diagram for the C('6400, '6416, '6500, and '6600) is given in Fig. 3. The interesting structural aspects can be seen from this diagram. The four configurations, 6400 ~ 6600, are included just by considering the pertinent parts of the structure. That is, a 6416 has no large Pc; a 6400 has a single straightforward Pc; a 6500 has two Pc's; and the 6600 has a single powerful Pc. The 6600 Pc has 10 D's, so that several parts of a single instruction stream can be interpreted in parallel. A 6600 Pc also has considerable M.buffer to hold instructions so that Pc need not wait for Mp fetches.

The implementation of the 10 Cio's can be seen from the PMS diagram (Fig. 3). Here, only one physical processor is used on a time-shared basis. Each 0.1 μs a new logical P is processed by the physical P. The 10 Mp's are phased so that a new access occurs each 0.1 μs. The 10 Mp's are always busy. Thus the i.rate is 10 × 12 b/μs or 120 megabits/s. This process of shifting a new Pc state into position each 0.1 μs has been likened to a barrel by CDC. A diagram of the process is shown in Fig. 4.

The T's, K's, and M's are not given, although it should be mentioned that the following units are rather unique: a K for the management of 64 telegraph lines to be connected to a Cio; an Ms(disk) with four simultaneous access ports, each at 1.68 megachar/s data transfer rate, and a capacity of 168 megachar; an Ms(magnetic tape) with a K(#1:4) and S to allow simultaneous transfers to 4 Ms; the T(display) for monitoring the system's operation; K's to other C's and Ms's; and conventional T(card reader, punch, line printer, etc.).

ISP

The ISP description of the Pc is given in Appendix 1, Chap. 39. The Pc has a very clean, straightforward scientific-calculation oriented ISP. We can consider it a variation on the general register structure because the Pc state has three sets of general registers. Their use is explained both in Chap. 39 and its Appendix 1. This structure assumes that a program consists of several read accesses to a large array(s), a large number of operations on these accessed elements, followed by occasional write accesses to store results. We would agree that this is a valid assumption for scientific programs (e.g., look at a FORTRAN arithmetic statement), and it is probably valid for most other programs as well.

Cc has provisions for multiprogramming in the form of a protection and relocation address. The mapping is given in the ISP description for both Mp and Ms('Extended Core Storage/ECS).

Appendix 2, Chap. 39, has an ISP description of the PCP. Appendix 2 includes a figure which shows the instruction decoding and execution as well. The 6600 PCP is about the same as the early CDC 160. The PCP has an 18-bit A register because it has to process addresses for the large Cc.

One interesting aspect of the 6600 which we question is the lack of communication among components at the ISP (programming) level. When Pc stops, it has no way of explicitly informing any other components. There are no interprocessor interrupts. An io device cannot interrupt a Pio, nor can Pio's communicate with one another except by polling. The state switching for Pc is, however, elegant, since a Pio can request Pc to stop a job, store Mps, and resume a new task in one instruction. (The t.save + t.restore ~ 2 μs.)

The Cio's functions are data transmission between a peripheral device and the large Cc via the Cio's Mp with some data transformation or conversions; complete task management, including initiation, termination, and error handling; and management of Pc. The Cio's perform in about the same manner as the C('Attached Support Processor) in the N('360 ASP) (Chap. 40, page 506).

The operating system

The operating-system software is managed by a single fixed Cio. The remaining nine Cio's are free, and as tasks arise in the system, the Cio's assign themselves to particular tasks, carry out the tasks, and then free themselves to take on other tasks. The operating-system software resides in Mp(Pc) (that is, Cc) accessible to all Cio's and includes:

1 The variables which determine the state of a particular job, e.g., data pointers to Ms(disk, 'ECS), running time, etc.

2 Parts of the operating system used by the Cio responsible for the system management; list of jobs to do, etc.

3 Programs for the Cio's: task management programs (or programs to get the management program from Ms) which the Cio's use
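The barrel arrangement of the ten peripheral processors described in this section can be sketched as follows. This is only an illustration of the arithmetic in the text (one 12-bit word per 0.1 μs slot, ten logical processors served in rotation); the constant and function names are hypothetical and nothing here models the actual hardware.

```python
SLOT_NS = 100      # one barrel slot: 0.1 microsecond, from the text
N_LOGICAL = 10     # logical peripheral and control processors
WORD_BITS = 12     # bits per peripheral-processor memory word


def barrel_schedule(n_slots):
    """Return which logical processor is served in each successive slot."""
    return [slot % N_LOGICAL for slot in range(n_slots)]


def aggregate_rate_megabits_per_s():
    """One 12-bit word moves per slot; bits per microsecond equal megabits/s."""
    return WORD_BITS * 1000 / SLOT_NS


def revisit_interval_ns():
    """Time between two consecutive turns of the same logical processor."""
    return N_LOGICAL * SLOT_NS


if __name__ == "__main__":
    print(barrel_schedule(12))
    print(aggregate_rate_megabits_per_s())
    print(revisit_interval_ns())
```

Each logical processor gets one slot per 1,000 ns, which matches the memory cycle time quoted for the series, so every one of the ten memories can be kept busy while the single physical processor never waits.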

Fig. 3. CDC 6400, 6416, 6500, and 6600 PMS diagram. [Diagram not recoverable. Surviving notes: no C('Central) in CDC 6416; CDC 6500 and CDC 6400 do not have K('Scoreboard), separate D's, and M('Instruction Stack); only present in CDC 6500; see Chapter 39 for operation.]

[Fig. 4 residue. Surviving labels: 10 memories, 4096 words each, 12-bit; central memory.]

In a typical CDC system, one might expect the assignment of PCP's to be as follows:

1 Operating-system execution, including scheduling and management of Cc and all Cio's

2 Display of job status data on T(display)

3 Ms(disk) transfer management

4 T(printers, card reader, card punch)

5 L(#1:3; to: C.satellite)

6 Ms(magnetic tape)

7 T(64 Teletypes)

8 Free to be used with Ms(disk) and Ms(magnetic tape)

9 Free

10 Free

7600

The CDC 7600 system is an upward compatible member of the CDC 6000 series. Although the main Pc in the 7600 is compatible with the main Pc of the 6600, instructions have been added for controlling the io section and for communicating between Large Core Memories/LCM and Small Core Memory/SCM. It is expected to compute at an average rate of four to six times a C('6600).

The PMS structure (Fig. 5) is substantially different from that of the 6600. The C('7600 Peripheral Processing Unit/PPU), unlike the C('6600 Peripheral and Control Processor)'s, has a loose coupling with the main C. The PPU's are under control of the main C when transferring words into SCM via K('Input Output Section). The 15 C('PPU)'s have 8 input/output channels. These channels, which can run concurrently, provide the link between C('PPU) and peripheral Ms's and T's. Some of the PPU's are located in the same physical space as the Pc.

Fig. 5. CDC 7600 computer PMS diagram. [Diagram not recoverable.]

The 7600 Pc can be interrupted by a clock, the PPU's, and a trap condition within the Pc. A breakpoint address, BPA, can be set up within Pc such that, on the program reaching BPA, a trap is initiated. This interruption scheme is in contrast to that of the 6600, which could not be interrupted or trapped. The 7600 interrupt may be a reaction to the lack of intercommunication in the 6600.

Conclusions

Although the 6600 was somewhat behind its announced delivery schedule and represented a significant drain on the financial resources of CDC, it is now clear that it is a successful product. There have been instances of very large computers not being carried to completion either for financial or technical reasons. The 6600 seems to be the first large computer to achieve these marks of success. Here we are interested in the 6600 because it has held the "world's largest computer" title for so long.

Computer-network examples

In Chap. 40, we present examples of seven computer networks. There is a dearth of both computer networks and of papers on computer networks. This chapter takes examples from papers and from knowledge of several existing or proposed networks.

Chapter 38

The RW-400: a new polymorphic data system¹

R. E. Porter

Summary   The RW-400 Data System, based upon modularly constructed, independently operating and flexibly connected components, is the logically evolved successor to conventional computer designs. It provides the means by which information processing requirements can be met with equipment capable of producing timely results at a cost commensurate with problem economic value. System obsolescence is minimized by the expandability in numbers and types of processing modules. Real time reliability is assured at minimum cost by component duplication and by the advanced design techniques employed in the system's manufacture. Man-machine communication facilities are program controlled for maximum flexibility. Parallel processing and parallel information handling modules increase the system's speed and adaptability when handling complex computing workloads. This polymorphic design truly represents an extension of man's intellect through electronics.

The RW-400 Data System

The RW-400 Data System is a new design concept. It was developed to meet the increasing demand for information processing equipment with adaptability, real-time reliability and power to cope with continuously-changing information handling requirements. It is a polymorphic system including a variety of functionally-independent modules. These are interconnectable through a program-controlled electronic switching center. Many pairs of modules may be independently connected, disconnected, and reconnected, in microseconds if need be, to meet continuously-varying processing requirements. The system can assume whatever configuration is needed to handle problems of the moment. Hence it is best characterized by the term "polymorphic" (having many shapes).

Rapid, program-controlled switching of many pairs of functionally-independent modules permits nondisruptive system expandability, operating reliability, simultaneous multi-problem processing capability, and man-machine intercommunication feasibility. These are only partially found in computers of conventional design.

Computer users have been forced heretofore to match problems to computer limitations. Problem changes posed serious reorientation and reprogramming difficulties. Changes from one computer to another model, due to growth in applications, often resulted in large expenditures of time and money. During maintenance or malfunction of a conventional computer its entire processing capacity is shut down. Real time processing reliability cannot be maintained on an around-the-clock basis. The conventional machine must process its problems serially. This serious limitation is only partially alleviated by time-sharing or computing-element-doubling designs. The high cost-per-hour of conventional computer operation rules out direct man-machine intercommunication during other than emergency situations.

The radically-new polymorphic design concept of the RW-400 Data System was evolved by Ramo-Wooldridge engineers to provide a practical solution to those information processing problems now inadequately handled by conventional computer designs. The RW-400 is a powerful new tool in the field of intellectronics, the extension of man's intellect by electronics.

System description

The RW-400 Data System contains an optional number and variety of functionally-independent modules. These communicate via a central electronic switching exchange. Each module is designed, within practical economic and functional limits, to maximize system adaptability over a wide range of problem types and sizes.

This new design embodies the latest proven electronic design techniques, assuring high processing speeds and high equipment reliability. The RW-400's modularity assures reliable, round-the-clock processing of information with controllable computing capacity degradation during module maintenance or malfunction. Practical man-machine intercommunication is achieved in the RW-400 system by use of program-controlled information display and interrogation consoles.

Figure 1 shows the over-all system design. Modules of various types communicate through a central exchange switching center. Computing and buffering modules provide control for the system. These modules are self-controlled and make possible completely independent processing of two or more problems. One of the computer modules may be designated the master computer and in this role initiates and monitors actions of the entire system. An alert-interrupt network is provided to allow coordinated system action. Therefore, the system as applied to given information processing problems may change on a short range (microsecond) basis, thus providing, through programming, a self-organizing aspect to the system. In addition, the system may change through the years as the applications change. The most efficient and economical complement of equipment is applied to the problem at all times.

Fig. 1. The RW-400 data system. [Block diagram: controlling, computing, buffering, display, interrogation, auxiliary storage, and input-output modules connected through a switching center.]

¹ Datamation, vol. 6, no. 1, pp. 8-14, January/February, 1960.

An RW-400 system is built around an expandable Central Exchange (CX) to which a number of primary modules may be attached. These are: Computer Modules (CM); self-instructed Buffer Modules (BM); Magnetic Tape Modules (TM); Magnetic Drum Modules (DM); Peripheral Buffer Modules (PB); and console communication Display Buffer Modules (DB). How many modules are put together in a system is entirely a function of system application.

In addition to primary system modules, punched tape, punched card, high speed printing and control console devices are available. These handle nominal system input/output requirements. Additional man-machine communication devices such as interrogation, display and control consoles, may be included in the system as problem requirements dictate. A Tape Adapter (TA) module is available to provide compatibility with magnetic tape of other computers. Information generated at Flexowriter inquiry and recording stations may be directly received by the system via the Peripheral Buffer Module. This latter module also buffers the receipt of TWX and punched tape information.

The way in which a particular RW-400 Data System functions depends on the number and type of each module included. It may be initially composed of the minimum number and variety of modules needed to do a small problem or the initial part of some large but yet-to-be-defined problem. Such a system would work much like a conventional computer. It would probably include a buffer module and thus have a parallel data handling capability not found in the conventional design at a comparable price. The initial system installation may then be augmented by the timely addition of modules.
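The idea of a program-controlled exchange that connects, disconnects, and reconnects pairs of modules can be sketched as a small bookkeeping structure. This is a toy model under stated assumptions, not the RW-400's logic: it assumes only that a module takes part in at most one connection at a time, and the class name `Exchange`, its methods, and the module labels used in the example (CM1, BM1, TM1, DM1, formed from the abbreviations in the text) are hypothetical.

```python
class Exchange:
    """Toy model of a switching center that pairs up modules.

    Assumption: each module may be in at most one connection at a time.
    """

    def __init__(self, modules):
        # partner[m] is the module currently connected to m, or None if free.
        self.partner = {m: None for m in modules}

    def is_free(self, module):
        return self.partner[module] is None

    def connect(self, a, b):
        """Connect a and b if both are free; return True on success."""
        if a == b or not self.is_free(a) or not self.is_free(b):
            return False
        self.partner[a], self.partner[b] = b, a
        return True

    def disconnect(self, module):
        """Release the connection that 'module' takes part in, if any."""
        other = self.partner[module]
        if other is not None:
            self.partner[module] = None
            self.partner[other] = None


if __name__ == "__main__":
    cx = Exchange(["CM1", "BM1", "TM1", "DM1"])
    print(cx.connect("CM1", "TM1"))   # both free, so the pair is connected
    print(cx.connect("BM1", "TM1"))   # TM1 is busy, so the request fails
    cx.disconnect("CM1")
    print(cx.connect("BM1", "TM1"))   # TM1 has been released
```

A requesting program that receives a refusal can either wait and retry or carry on with other work and ask again later; the sketch leaves that choice to the caller.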

A

buffer

module (BM) has the capability

to control

its

acquisi-

tion and dissemination of information independently. The buffer provides a computer module with parallel data handling capability

without complicating the problem processing program with the conventional intermixture of arithmetic and housekeeping in-

by the processing

structions. Information previously generated

program

may be appropriately disposed of within the system while continues. Data needed at a subsequent time in the

processing

be retrieved from system storage in advance of

processing may need while processing progresses. The simultaneity of these operations not only materially increases over-all processing speed but also increases the practical utility of the less costly types of internal system storage such as a magnetic tape.

The computer (CM) or buffer (BM) modules, when acting in a controlling capacity, may initiate connection to an information storage or handling module during that part of the processing program when the two can work profitably in unison. The pair of modules thus interconnected neither affect nor are affected by other modules. Logical interlocks prevent unwanted cross talk among modules. An intermodule communication system lets controlling modules signal status or alert other such modules of their need to communicate. The decision by a module receiving an alert signal to permit interruption or to proceed is optional with that module. The optional interrupt feature is that needed to make the often-discussed but seldom-used program interrupt capability both useful and practical. Programs may thus permit interruptions only at convenient points in the processing sequence.

Modules may be assigned, under program control, to work together on a problem in proportion to its needs. As soon as a module's function is complete for a given problem, that module may be released for reassignment to some other task. The system is thus self-controlled to match processing capacity to each problem for the time necessary to do the job. Full system capacity may be brought to bear upon a very large problem when needed. This capacity may be apportioned among a number of smaller problems for simultaneous processing, program compilation, program checkout, module maintenance etc., when it is not needed for maximum system effort.

From the preceding system description, it is apparent that such equipment can be expanded from a modest initial installation into a very powerful and comprehensive information processing center as requirements warrant. More specific descriptions of principal system modules follow to give the reader a better feel for how this system might perform his information processing work.

Functional modules

The key to appreciative understanding of the power of the RW-400 lies in the knowledge of intermodule connection. It is appropriate to describe the Central Exchange (CX) first, then follow with unit descriptions of the various modules.

The central exchange

The Central Exchange performs the vital function of interconnecting a pair of modules whenever requested to do so by either a computer or a buffer module. Since internal programmed control is only possible within a computer or a buffer module, one of the interconnected pair of modules must be either a computer or a buffer. The time in which any connection may be made or broken is about 65 microseconds. An exchange has basic capacity to connect any of 16 computer or buffer modules to any of 64 auxiliary function modules. There is nothing sacred about the number 16 since it is possible to extend the CX module's interconnection matrix through design modification when need arises.

The CX is an expandable, program-controlled, electronic switching center capable of connecting or disconnecting any available pair of modules in roughly the time of one computer instruction execution. Figure 2 illustrates the permissible module interconnections within the Central Exchange. Every intersection on the illustration represents a possible connection between modules. The "x-ed" intersections indicate typical connections in force at any point in time.

The control logic of the CX module's connection table prevents more than one interconnection on any horizontal (controlling) or vertical (controlled) data path representation on the diagram. When connection is requested of the Central Exchange while one of the required modules is already carrying out a previous assignment, the requesting module can be programmed to sense this condition and wait until connection can be made without interference. Should waiting be undesirable, the requesting module can go on about its business and check back later to see when the desired connection can be made. There is an implication here, of course, that knowing the kind of a system he is dealing with, a programmer requests connections in advance of need whenever possible.

Provision for master-slave control is included via an Assignment Matrix established within the CX module by a computer module previously assigned to master status. Such a provision is necessary to preclude inadvertent connection requests from unchecked programs or malfunctioning control modules from affecting sets of modules simultaneously processing another problem. Connection requests are therefore essentially filtered through both an assignment and an interconnection validity matrix prior to being acted
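The connection rules described for the Central Exchange can be illustrated with a small model. This is only a sketch under stated assumptions: the class name, method names and return strings are hypothetical, and the article gives no detail of how the CX hardware represents its tables. The sketch keeps at most one connection per controlling (horizontal) path and per controlled (vertical) path, and filters every request through an assignment matrix first.

```python
class CentralExchange:
    """Toy model of the CX connection table (all names are illustrative)."""

    def __init__(self, n_controlling=16, n_controlled=64):
        self.n_controlling = n_controlling
        self.n_controlled = n_controlled
        self.row_to_col = {}  # controlling module -> connected controlled module
        self.col_to_row = {}  # controlled module -> connected controlling module
        # Assignment matrix: controlled modules each controlling module may request.
        self.assigned = {r: set() for r in range(n_controlling)}

    def _check(self, controlling, controlled):
        if not 0 <= controlling < self.n_controlling:
            raise ValueError("no such controlling module")
        if not 0 <= controlled < self.n_controlled:
            raise ValueError("no such controlled module")

    def assign(self, controlling, controlled):
        """Stands in for the master computer module setting up the Assignment Matrix."""
        self._check(controlling, controlled)
        self.assigned[controlling].add(controlled)

    def request(self, controlling, controlled):
        """Return 'connected', 'busy' or 'refused' for a connection request."""
        self._check(controlling, controlled)
        if controlled not in self.assigned[controlling]:
            return "refused"  # filtered out by the assignment matrix
        if controlling in self.row_to_col or controlled in self.col_to_row:
            return "busy"     # one of the two paths is already in use
        self.row_to_col[controlling] = controlled
        self.col_to_row[controlled] = controlling
        return "connected"

    def release(self, controlling):
        """Break the connection held by a controlling module, if any."""
        controlled = self.row_to_col.pop(controlling, None)
        if controlled is not None:
            del self.col_to_row[controlled]
```

A requesting program that receives "busy" could either retry in a loop (waiting) or continue with other work and ask again later, which mirrors the two options the article describes.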


Multiply Accumulate wherein the contents of H are multiplied by G and added to A; and Transmit where the contents of G are stored in H.

The ten program control instructions are Store, Store Double Length Accumulator, Load Accumulator, Insert Mask in the S Register, Stop, Link Jump, Compare Jump, Tally Jump, Test Jump and a Multi-purpose Shift.

The five external instructions are those which cause data to be transmitted to or received from a device external to the computer. Each command is multi-purpose in nature and hence equivalent to several conventional external instructions. The commands are Command Output, Data Input, Conditional Data Input, Data Output and Character Transfer. A comprehensive discussion of the variation of each of these commands is not pertinent to this article. Suffice it to say that commands are available for carrying out a wide variety of intermodule data communication.

The interrupt capability of a Computer Module is a logical generalization of the "trapping" feature found on several conventional computers. It permits the automatic interruption of a program, at the option of the program, when the computer module receives an "alert" that a condition requiring attention has arisen. It can be used to warn the program when an error of some type has occurred, minimize unproductive computer waiting time while another module completes its task, eliminate many programmed status test instructions and provide a convenient means of subjecting one computer module to the control of another. Program control of interruptions within a CM-400 is accomplished through the sense register S. This register may be filled with an interrupt


mask by means of the Insert S instruction. A bit by bit correspondence exists between the S register and the interrupt register I to which the alert lines are connected. A Test Jump instruction can be used to examine the coincidence between these registers of an alert signal in a bit position corresponding to a one in the S register mask. If an alert is received by the computer during the execution of an instruction, control will be transferred to memory location "0" at the end of the instruction if, and only if, (a) the sense bit corresponding to the alert is a "one," (b) the master sense bit is a "one," and (c) the instruction was not an "Insert S." The master sense bit in the S register may be programmed to permit the interrupt to take place according to the interrupt mask or to inhibit interrupt until the program can conveniently cope with it. All instructions being executed at the time an interrupt condition occurs are completed before the interruption is allowed to take place. Figure 3 schematically illustrates the Computer Module's primary registers and the interconnecting information paths.

Typical two-address addition and subtraction times are approximately 35 microseconds including memory access time. Multiplication takes about 80 microseconds, and division and square root about 130 and 170 microseconds respectively.

Before attempting to draw a comparison between a CM and a deluxe conventional computer the reader should bear in mind the trade offs in features versus cost; parallel processing versus sequential processing; independent information handling versus program complicating "housekeeping"; and real time system reliability versus periodic inoperability. The only valid comparison is that between the RW-400 Data System and a conventional computer applied to the same task. The contribution made by the Buffer Modules to the RW-400 system can be better assessed by the reader after the following description has been considered.

RW-400 analysis console.

The buffer module

A Buffer Module consists of two independent logical buffer units, each having 1024 words of random access magnetic core storage and a number of internal registers used in performing its functions when in the self-controlling mode. A Buffer Module may be connected to a Computer Module so that the Buffer's core storage is accessible to the computer as an extension of the computer's own storage. A Buffer may also serve as an intermediary device between a computer and another module, such as a tape or drum, to minimize time conventionally lost in data transfers. The Buffer is capable of recognizing and executing certain instructions stored in its own memory. It can therefore be left to perform data handling functions on its own while computer modules are otherwise occupied.

A Buffer Module may be connected to a Computer Module and the buffer 1024 word storage used as an indirectly addressed extension of the computer's own working storage. When the address 1023 (all ones) appears in the operand field of a computer instruction to be executed, the computer is signalled that the operand refers to some cell in buffer storage. The computer then uses the number in the buffer read register R (or in the case of a few instructions, the buffer write register W) as the effective address designated by the operand field of the instruction. Extended addressing may be used in either the first or second operand field of the instruction or in both operand fields. If extended addressing is used in only one operand field, the effective address designated by that field is the number in register R. A "1" is automatically added to the contents of the R register after the instruction is executed. If extended addressing is used in both operand fields of an instruction, the effective address of the first operand is the number in register R and the effective address of the second operand is one more than the number in register R. A "2" is automatically added to the contents of register R after the execution of this type of instruction. The R (or W) register may be preset to any desired initial condition by means of the computer's Command Output instruction. All the commands being executed by the computer must be stored within the computer module's storage and may not be in buffer cells addressed by the computer at execution time. The extended addressing and buffer register indexing may be used to materially simplify repetitive data acquisition operations.
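The extended addressing rule above can be sketched as a small function. This is an illustration only, under stated assumptions: the function name, the tuple representation of an operand and the constant name are hypothetical, only the read register R is modelled (not W), and wrap-around of R at the end of buffer storage is not modelled because the article does not describe it.

```python
BUFFER_FLAG = 1023  # operand field value (all ones) that refers to buffer storage


def resolve_operands(first, second, r):
    """Resolve the two operand fields of one instruction.

    Returns (effective_first, effective_second, new_r), where each effective
    operand is a (storage, address) pair and storage is "computer" or "buffer".
    """
    ext_first = first == BUFFER_FLAG
    ext_second = second == BUFFER_FLAG
    if ext_first and ext_second:
        # Both fields extended: operands at R and R + 1, then R advances by 2.
        return ("buffer", r), ("buffer", r + 1), r + 2
    if ext_first:
        # Only the first field extended: operand at R, then R advances by 1.
        return ("buffer", r), ("computer", second), r + 1
    if ext_second:
        # Only the second field extended: operand at R, then R advances by 1.
        return ("computer", first), ("buffer", r), r + 1
    # No extended addressing: R is left unchanged.
    return ("computer", first), ("computer", second), r
```

Calling the function repeatedly with the returned `new_r` steps through consecutive buffer cells without any explicit index arithmetic in the program, which is the simplification of repetitive data acquisition that the text refers to.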

The primary function of a Buffer Module is not, however, that of an auxiliary computer storage unit. The drum and tape modules more aptly serve this function in the RW-400 system. A Buffer Module is capable of operating autonomously and of controlling other modules such as Tape Modules, Drum Modules, Peripheral Buffers, Display Buffers, Printers or Plotters. This capability enables the Buffer Modules in a system to perform routine tape searching and data transferral tasks thereby freeing the Computer Modules to do more computing. In its "self-instruction" mode, the buffer executes its own internally stored program in much the same fashion as a computer. The memory of a Buffer Module will therefore be occupied by its own control programs as well as blocks of data which it is holding for transmission to other units. The buffer is used to acquire information from the relatively slower auxiliary storage and communication modules while the computer proceeds at high speed. Blocks of information retrieved in advance of computer need by the buffer may then be rapidly transferred to the computer's own storage or operated upon as they stand in the buffer via the indirect addressing capability of the computer.

Another feature of the buffer is its switching capability. Each Buffer Module is composed of two buffer units tied together. A unit function switching feature permits the employment of the two units together in an alternating mode of operation. Continuous information transfer from tape to computer, for example, may be accomplished without stopping the tape unit. A switching instruction executed simultaneously by both units of a Buffer Module causes whatever devices were connected to the first unit to be connected to the second and vice versa.

RW-400 Buffer Module.

Now that the functional controlling modules and the module interconnection concept have been discussed, the more conventional auxiliary storage modules available with the system may be described to round out the processing capability of the system.

The tape modules

A Tape Module consists of an altered Ampex FR-300 tape transport plus the necessary power supplies and control circuitry to effect information reading, writing and control. One inch mylar tape is used. Information is written on 16 channels two of which are clock channels. The remaining 14 channels consist of 13 information bits plus parity. The information reading or recording rate is 15,000 computer words per second. Data may be recorded on tape in variable blocks up to a maximum of 1024 words per block (the size of the storage available to hold the data in a sending or receiving module). Each block is preceded by a block identification which permits selective tape information searching by a Buffer Module. Single blocks imbedded in a tape file of other blocks can be overwritten. A two-stack head permits automatic verification of each block as it is written. Readback parity errors are automatically detected during the writing process. Thus dropout areas may be determined while the data is still available in a computer or buffer for recording elsewhere.

A description of the RW-400's tape handling capability would not be complete without mentioning the Tape Adapter (TA) module. This is a self-contained unit capable of performing the reading and writing of magnetic tapes in a format acceptable to the IBM 704 and 709 systems. The TA consists of an Ampex FR-300 half-inch digital tape transport, including dual gap head and servo control system; reading, writing and control circuits; and a housing with its own blower and power supply.

module


The drum module

The Drum Module (DM) contains a magnetic drum with storage capacity of 8192 words. It may be connected to either a Computer or a Buffer Module through the Central Exchange. Average access time to the first word position on the drum is 8½ milliseconds. Successive words are transmitted at the rate of 60,000 computer words per second. The Drum Module is conventionally used as an intermediate item storage device to minimize tape handling time.

Special system communication modules

The external data and man-machine communication of the RW-400 Data System are handled via drum buffer modules. A wide variety of asynchronously operated equipment is speed matched and program controlled through the features designed into these special system communication modules.

The Peripheral Buffer (PB) provides input/output buffers for communication between Computer or Buffer Modules and relatively slow speed external devices such as Flexowriters, Plotters, Punched Tape Handlers, Teletype Lines and Keyboard Operated Equipment. The Peripheral Buffer stores its information in four pairs of bands which operate alternately as circulating registers. Each band contains eight input and eight output buffers for a total of 32 input buffers and 32 output buffers in each Peripheral Buffer Module. Each buffer is a drum band sector 64 computer words long. Conventionally one input and one output buffer sector are connected to each external device (such as a Flexowriter) to permit two-way communication between the external device and the RW-400 system.

The display buffer

A Display Buffer (DB) acts as a recirculating storage for the cathode ray tube display units in a Display Console. Information to be displayed is sent to the DB band associated with a particular display tube via the Central Exchange. The Display Buffer sends only status information back to other system modules upon request. The information displayed on any tube is controlled by the bit pattern sent to the Display Buffer. The display pattern is regenerated 30 times per second to minimize image fading and flicker. The preceding explanation of the Display Buffer has little meaning to a reader unfamiliar with the features of the Display Console itself. This console is therefore described in more detail in the following paragraphs.

Display consoles

Display Consoles can give a problem "analyst" or "monitor" a visual picture of the status or results of any information being handled by the RW-400 system. In addition to the actual Cathode Ray Tube, numerical indicator, signal lamp and typewriter information outputs, several types of keyboard activated system control and parameter entry facilities are provided on the console. The total man-machine communication facility represented by each console is designed to be primarily a function of the computer control programs initiated by the analyst via his console.

A set of Display Control Keys generate messages which are recorded on a Peripheral Buffer sector for later interpretation and display generation by a computer program. A set of Process Step Keys are provided the analyst so that he can initiate preprogrammed system processing variations. Associated with the Process Step Keys is an overlay or "program card" which permits the assignment of a variety of meanings to the set of Process Step Keys. Insertion of the overlay by the analyst gives him a unique label for each Process Step Key and automatically cues the controlling computer to assign the corresponding set of programs to each key message. A Data Entry Keyboard is provided on the console so that the analyst can enter control parameters when asked to do so via the display devices.

A Joystick Lever affords the console operator a means of controlling the position of cross hair markers on the cathode ray display tubes. Associated with the joystick are control keys which may be used to send a message to the controlling computer specifying the coordinates of the cross hairs. Control programs may be written, for example, to act upon this information to reorient the display with respect to the area selected by the cross hair position.

A Light Gun is also provided as a means of selecting any point on the cathode ray tube displays. The gun emits a small beam of light. With the beam centered on a given point on the cathode ray display tube, pressing the trigger results in the automatic generation of a message to the Peripheral Buffer specifying the address in the Display Buffer containing the coordinates of the selected point.

A set of Status and Error lights are contained on the Display Console to provide the console operator with over-all knowledge of the system and thus minimize conflicting control requests and intermodule interference. For example, a Peripheral Buffer may not be ready to accept a console key message until after certain previously requested control actions have been completed. The Status Lights indicate this condition to the console operator so that he may act accordingly.

The printer module

The Printer Module (PR) is basically a 160 column, 900 line per minute Anelex type printer. It receives information from either a Computer or a Buffer module via the Central Exchange. Individual characters to be printed are represented by a 6-bit code and are transmitted four to a computer word. Zero suppression, line completion and information block end codes are included for format control. A plugboard is provided for flexibility in columnar data arrangement. Paper feed is controlled by means of a loop of 7-channel punched paper tape. Control of the printing operation has been arranged so that the connected control module may send line headings from one set of memory locations, stop sending information while going to a different part of the memory, and then proceed to send data from this new set of memory locations to complete a line of print.

The punched card modules

The RW-400 Data System may be equipped with a high speed punched card reading module (CR) and an IBM card punch. The CR communicates with Computer or Buffer modules via the Central Exchange. It is capable of reading 80 column punched cards at the rate of 2,500 cards per minute. The card punch is connected to the system through the Peripheral Buffer Module (PB) since it is a relatively low speed device. Emphasis has not been placed on directly connected punched card equipment since the sources of large volumes of punched cards usually convert this data into magnetic tape form which may be more rapidly handled using the Tape Adapter Module (TA).

References RothS59; WestG60

APPENDIX 1 RW 40 ISP DESCRIPTION

Instruction Interpretation Process

Chapter 39

Parallel operation in the Control Data 6600¹

James E. Thornton

¹AFIPS Proc. FJCC, pt. 2, vol. 26, pp. 33-40, 1964.

History

In the summer of 1960, Control Data began a project which culminated October, 1964 in the delivery of the first 6600 Computer. In 1960 it was apparent that brute force circuit performance and parallel operation were the two main approaches to any advanced computer.

This paper presents some of the considerations having to do with the parallel operations in the 6600. A most important and fortunate event coincided with the beginning of the 6600 project. This was the appearance of the high-speed silicon transistor, which survived early difficulties to become the basis for a nice jump in circuit performance.

System organization

The computing system envisioned in that project, and now called the 6600, paid special attention to two kinds of use, the very large scientific problem and the time sharing of smaller problems. For the large problem, a high-speed floating point central processor with access to a large central memory was obvious. Not so obvious, but important to the 6600 system idea, was the isolation of this central arithmetic from any peripheral activity.

It was from this general line of reasoning that the idea of a multiplicity of peripheral processors was formed (Fig. 1). Ten such peripheral processors have access to the central memory on one side and the peripheral channels on the other. The executive control of the system is always in one of these peripheral processors, with the others operating on assigned peripheral or control tasks. All ten processors have access to twelve input-output channels and may "change hands," monitor channel activity, and perform other related jobs. These processors have access to central memory, and may pursue independent transfers to and from this memory. Each of the ten peripheral processors contains its own memory for program and buffer areas, thereby isolating and protecting the more critical system control operations in the separate processors. The central processor operates from the central memory with relocating register and file protection for each program in central memory.

Peripheral and control processors

The peripheral and control processors are housed in one chassis of the main frame. Each processor contains 4096 memory words of 12 bits length. There are 12- and 24-bit instruction formats to provide for direct, indirect, and relative addressing. Instructions provide logical, addition, subtraction, shift, and conditional branching. Instructions also provide single word or block transfers to and from any of twelve peripheral channels, and single word or block transfers to and from central memory. Central memory words of 60 bits length are assembled from five consecutive peripheral words. Each processor has instructions to interrupt the central processor and to monitor the central program address.

To get this much processing power with reasonable economy and space, a time-sharing design was adopted (Fig. 2). This design contains a register "barrel" around which is moving the dynamic information for all ten processors. Such things as program address, accumulator contents, and other pieces of information totalling 52 bits are shifted around the barrel. Each complete trip around requires one major cycle or one thousand nanoseconds. A "slot" in the barrel contains adders, assembly networks, distribution network, and interconnections to perform one step of any peripheral instruction. The time to perform this step or, in other words, the time through the slot, is one minor cycle or one hundred nanoseconds. Each of the ten processors, therefore, is allowed one minor cycle of every ten to perform one of its steps. A peripheral instruction may require one or more of these steps, depending on the kind of instruction. In effect, the single arithmetic and the single distribution and assembly network are made to appear as ten. Only the memories are kept truly independent. Incidentally, the memory read-write cycle time is equal to one complete trip around the barrel, or one thousand nanoseconds.
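The timing of the barrel can be worked through with a short sketch. It assumes, purely for illustration, that processor 0 occupies the slot during the first minor cycle, processor 1 during the second, and so on; the paper gives the cycle times but not the ordering, and the function and constant names here are hypothetical.

```python
MINOR_CYCLE_NS = 100                                # time through the slot
N_PROCESSORS = 10                                   # peripheral processors sharing the slot
MAJOR_CYCLE_NS = MINOR_CYCLE_NS * N_PROCESSORS      # one trip around the barrel


def slot_schedule(n_minor_cycles):
    """Yield (start_time_ns, processor) for each successive minor cycle."""
    for cycle in range(n_minor_cycles):
        yield cycle * MINOR_CYCLE_NS, cycle % N_PROCESSORS


def completion_time_ns(processor, steps):
    """Time at which `processor` finishes an instruction needing `steps` steps.

    Assumes the processor's first step starts at processor * 100 ns and that
    each later step has to wait one full major cycle for the next turn.
    """
    first_start = processor * MINOR_CYCLE_NS
    last_start = first_start + (steps - 1) * MAJOR_CYCLE_NS
    return last_start + MINOR_CYCLE_NS
```

Under these assumptions each processor advances by one step per 1000 nanoseconds, which matches the memory read-write cycle time quoted in the text, so a processor's own memory is ready again by the time its next turn in the slot arrives.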


Input-output channels are bi-directional, 12-bit paths. One 12-bit word may move in one direction every major cycle, or 1000 nanoseconds, on each channel. Therefore, a maximum burst rate of 120 million bits per second is possible using all ten peripheral processors. A sustained rate of about 50 million bits per second can be maintained in a practical operating system. Each channel may service several peripheral devices and may interface to other systems, such as satellite computers.

Peripheral and control processors access central memory through an assembly network and a dis-assembly network. Since five peripheral memory references are required to make up one central memory word, a natural assembly network of five levels is used. This allows five references to be "nested" in each network during any major cycle. The central memory is organized in independent banks with the ability to transfer central words every minor cycle. The peripheral processors, therefore, introduce at most about 2% interference at the central memory address control.

A single real time clock, continuously running, is available to all peripheral processors.

Central processor

The 6600 central processor may be considered the high-speed arithmetic unit of the system (Fig. 3). Its program, operands, and results are held in the central memory. It has no connection to the peripheral processors except through memory and except for two single controls. These are the exchange jump, which starts or interrupts the central processor from a peripheral processor, and the central program address which can be monitored by a peripheral processor.

A key description of the 6600 central processor, as you will see in later discussion, is "parallel by function." This means that a number of arithmetic functions may be performed concurrently. To this end, there are ten functional units within the central processor. These are the two increment units, floating add unit, fixed add unit, shift unit, two multiply units, divide unit, boolean unit, and branch unit. In a general way, each of these units is a three address unit. As an example, the floating add unit obtains two 60-bit operands from the central registers and produces a 60-bit result which is returned to a register. Information to and from these units is held in the central registers, of which there are twenty-four. Eight of these are considered index registers, are of 18 bits length, and one of which always contains zero. Eight are considered address registers, are of 18 bits length, and serve to address the five read central memory trunks and the two store central memory trunks. Eight are considered floating point registers, are of 60 bits length, and are the only central registers to access central memory during a central program.

In a sense, just as the whole central processor is hidden behind central memory from the peripheral processors, so, too, the ten functional units are hidden behind the central registers from central memory. As a consequence, a considerable instruction efficiency is obtained and an interesting form of concurrency is feasible and practical. The fact that a small number of bits can give meaningful definition to any function makes it possible to develop forms of operand and unit reservations needed for a general scheme of concurrent arithmetic.

Instructions are organized in two formats, a 15-bit format and a 30-bit format, and may be mixed in an instruction word (Fig. 4). As an example, a 15-bit instruction may call for an ADD,


absence of the two restraints. The instruction executions, in comparison, range from three minor cycles for fixed add, 10 minor cycles for floating multiply, to 29 minor cycles for floating divide.

To provide a relatively continuous source of instructions, one buffer register of 60 bits is located at the bottom of an instruction stack capable of holding 32 instructions (Fig. 5). Instruction words from memory enter the bottom register of the stack pushing up the old instruction words. In straight line programs, only the bottom two registers are in use, the bottom being refilled as quickly as memory conflicts allow. In programs which branch back to an instruction in the upper stack registers, no refills are allowed after the branch, thereby holding the program loop completely in the stack. As a result, memory access or memory conflicts are no longer involved, and a considerable speed increase can be had.

Five memory trunks are provided from memory into the central processor to five of the floating point registers (Fig. 6). One address register is assigned to each trunk (and therefore to the floating point register). Any instruction calling for address register result implicitly initiates a memory reference on that trunk. These instructions are handled through the scoreboard and therefore tend to overlap memory access with arithmetic. For example, a new memory word to be loaded in a floating point register can be brought in from memory but may not enter the register until all previous uses of that register are completed. The central registers, therefore, provide all of the data to the ten functional units, and receive all of the unit results. No storage is maintained in any unit.

Central memory is organized in 32 banks of 4096 words. Consecutive addresses call for a different bank; therefore, adjacent addresses in one bank are in reality separated by 32. Addresses may be issued every 100 nanoseconds. A typical central memory information transfer rate is about 250 million bits per second.

As mentioned before, the functional units are hidden behind the registers. Although the units might appear to increase hardware duplication, a pleasant fact emerges from this design. Each unit may be trimmed to perform its function without regard to others. Speed increases are had from this simplified design.

As an example of special functional unit design, the floating multiply accomplishes the coefficient multiplication in nine minor cycles plus one minor cycle to put away the result for a total of 10 minor cycles, or 1000 nanoseconds. The multiply uses layers of carry save adders grouped in two halves. Each half concurrently forms a partial product, and the two partial products finally merge while the long carries propagate. Although this is a fairly large complex of circuits, the resulting device was sufficiently smaller than originally planned to allow two multiply units to be included in the final design.
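The central memory organization described earlier on this page (32 banks of 4096 words, with consecutive addresses falling in different banks) can be illustrated with a short address-mapping sketch. It assumes the bank is selected by the address modulo 32, which is what "adjacent addresses in one bank are in reality separated by 32" implies; the function and constant names are hypothetical, and bank conflicts are not modelled.

```python
N_BANKS = 32
WORDS_PER_BANK = 4096
WORD_BITS = 60
ISSUE_INTERVAL_NS = 100   # addresses may be issued every 100 nanoseconds


def bank_and_offset(address):
    """Map a central memory word address to (bank, word within that bank)."""
    if not 0 <= address < N_BANKS * WORDS_PER_BANK:
        raise ValueError("address outside central memory")
    return address % N_BANKS, address // N_BANKS


def peak_bits_per_second():
    """Upper bound on transfer rate: one 60-bit word per issue interval."""
    return WORD_BITS * 1_000_000_000 // ISSUE_INTERVAL_NS
```

With these figures the upper bound works out to 600 million bits per second, so the typical rate of about 250 million bits per second quoted in the text sits well below the limit set by the issue interval alone.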

Fig. 7. 6600 printed circuit module.

Fig. 9. 6600 main frame section.

lines. Interconnections between chassis are made with coaxial cables.

Both maintenance and operation are accomplished at a programmed display console (Fig. 10). More than one of these consoles may be included in a system if desired. Dead start facilities bring the ten peripheral processors to a condition which allows information to enter from any chosen peripheral device. Such loads normally bring in an operating system which provides a highly sophisticated capability for multiple users, maintenance, and so on.

The 6600 Computer has taken advantage of certain technology advances, but more particularly, logic organization advances which now appear to be quite successful. Control Data is exploring advances in technology upward within the same compatible structure, and identical technology downward, also within the same compatible structure.

References AllaR64; ClayB64

Chapter 39

APPENDIX 1 CDC 6400, 6500, 6600 CENTRAL PROCESSOR ISP DESCRIPTION


Instruction Format

instruction<...>              although 30 bits, most instructions are 15 bits; see Instruction Interpretation Process
fm := instruction<...>        operation code or function
fmi := fm□i                   extended op code
i := instruction<...>         specifies a register or an extension to op code
j := instruction<...>         specifies a register
k := instruction<...>         specifies a register
jk := j□k                     a shift constant (6 bits)
K := instruction<...>
long_instruction := ((fm [...]

[...] "SAi Xj + K" (fm = 52) → (A[i] ← [...]
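The field structure listed above can be made concrete with a short decoding sketch. The bit positions are not legible in this copy, so the layout used here is an assumption: a 6-bit fm followed by 3-bit i, j and k in the 15-bit form, and a 6-bit fm, 3-bit i and j, and an 18-bit K in the 30-bit form. Operation codes are taken to be octal. The function names are hypothetical.

```python
def decode_short(parcel: int) -> dict:
    """Split a 15-bit instruction into fm, i, j, k and the 6-bit jk."""
    if not 0 <= parcel < 1 << 15:
        raise ValueError("not a 15-bit instruction")
    j = (parcel >> 3) & 0o7
    k = parcel & 0o7
    return {
        "fm": parcel >> 9,          # operation code or function
        "i": (parcel >> 6) & 0o7,   # register, or extension to the op code
        "j": j,
        "k": k,
        "jk": (j << 3) | k,         # j concatenated with k: shift constant
    }

def decode_long(word: int) -> dict:
    """Split a 30-bit instruction into fm, i, j and the 18-bit constant K."""
    if not 0 <= word < 1 << 30:
        raise ValueError("not a 30-bit instruction")
    return {
        "fm": word >> 24,
        "i": (word >> 21) & 0o7,
        "j": (word >> 18) & 0o7,
        "K": word & 0o777777,
    }

def encode_long(fm: int, i: int, j: int, K: int) -> int:
    """Inverse of decode_long, used here only to build an example word."""
    return (fm << 24) | (i << 21) | (j << 18) | (K & 0o777777)

# "SAi Xj + K" with fm = 52 (octal), i = 1, j = 2, K = 100 (octal)
example = decode_long(encode_long(0o52, 1, 2, 0o100))
```

With this layout the same three high-order fields are read identically in both formats, so a decoder only needs fm to decide whether the remaining bits are k or the start of K.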

Fig. 3. IBM 7094 central-processing-unit information flow. (Courtesy of International Business Machines Corporation.)

[...] Available to the Instruction Set Processor

Part 6 | Computer families    Section 1 | The IBM 701-7094 II sequence, a family by evolution

Divide-check. The Divide-Check Indicator is turned on, in fixed-point or floating-point division, if the magnitude of the number (dividend) in the AC is greater than or equal to the magnitude of the number in memory (divisor).

Input-output check. The Input-Output Check Indicator (I-O check) is turned on by the attempted execution of an input/output instruction without first selecting an input/output unit.

Transfer trap mode. The computer can be operated in a special Transfer Trap Mode. Operation in the Trap Mode permits the program to run at normal speed with interruptions of normal operation only at transfer points. At such points the location of the last sequential instruction is saved, and a transfer of control is made to a fixed location.

Sense lights. Four Sense Lights are also on the console. Any one of these lights may be turned on, off, or the status tested by instructions.

Sense switches. Six Sense Switches are located on the console. They may be turned on or off manually, and there are instructions which sense them.

Panel in-out switches. These 36 switches on the console may be read by an instruction.

Instruction-set interpretation

The basic computer clock cycle is 2.0 μs in 7094 I and 1.4 μs in 7094 II, as dictated by Mp. Within the single 2- (or 1.4-) microsecond cycle, up to 10 sequential register transfers and/or data operations can take place, each of which transfers information among the Pc's registers; several operations may occur simultaneously. In Pc four different cycles are used: instruction/I, execute/E, logic/L, and buffer/B. The cyclic sequence of an instruction is fixed, always beginning with an I cycle and progressing to E, L, or B cycles, and again I, depending on the instruction. The number of cycles required for an instruction may vary from 1 (e.g., transfer) to 19 (e.g., double-precision floating-point divide).

Instruction cycle (I). The I cycle begins when IC furnishes the instruction location to Mp, via S('Multiplexor). The addressed instruction word taken from Mp goes to the Multiplexor Storage Bus (Fig. 3). From the Multiplexor Storage Bus the instruction is read into the Storage Register where it is separated into the operation portion and the address portion of the instruction word. The operation portion of the Storage Register goes into the Instruction Register, where the operation code is decoded and the execute control circuitry is set up to perform the operation specified by the instruction. The address portion of the instruction word, now located in the Storage Register, may be used directly. Normally, however, it goes to the Address Register and then to the Multiplexor Address Switch to locate the appropriate data word in Mp. If the address is to be modified, it is routed from the Storage Register to the Index Adders for Index-register modification. The modified address is then brought to the Address Register and on to the Multiplexor Address Switch to locate the data word in core storage.

Concurrently, during the same instruction cycle, a second instruction, located at the immediately higher odd-numbered Mp address location, is brought to the Instruction Backup Register/IBR. While in the IBR, the odd-numbered instruction is partially decoded to determine if it meets certain criteria for concurrent execution, thus saving a second Mp reference. If the instruction in the IBR cannot be executed with the current instruction, it is ignored in the current I cycle and is brought into the Storage Register on the next I cycle.

Execution cycle (E). The execution (E) cycle is used when a reference to core storage is needed. All instructions requiring an operand have an E cycle following the I cycle. Indirect addressing of an instruction requires an extra E cycle. In other words, an instruction that normally goes from I to E to be executed will go to I, E, E if it is indirectly addressed.

Logic cycle (L). The L cycle is an execute cycle that does not require a reference to Mp. Many instructions use both E and L cycles when information is required from storage and the instruction cannot be completed during an E cycle. Other instructions require no reference to storage and, therefore, use only I and L cycles for their completion.

Buffer cycle (B). A buffer (B) cycle is a null Pc cycle; it is used when the data channels get information from or put information into core storage. This information can be either data or data-channel commands. All demands for B cycles come from the channels themselves. Because of the nature of Ms's and T's, the demand for a B cycle takes precedence over an instruction being performed by Pc. If Pc is in its logic cycle, then both an L and B cycle occur simultaneously.
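The cycle rules described above lend themselves to a small timing sketch. It is a simplification under stated assumptions: B cycles demanded by the channels are ignored, every cycle is taken to last exactly one basic clock cycle, and the function names and the `logic_cycles` parameter are hypothetical.

```python
CYCLE_US = {"7094 I": 2.0, "7094 II": 1.4}  # basic clock cycle (from the text)

def cycle_sequence(needs_operand: bool, indirect: bool = False,
                   logic_cycles: int = 0) -> list[str]:
    """Return the I/E/L cycle sequence of one instruction."""
    sequence = ["I"]                      # every instruction begins with I
    if needs_operand:
        sequence.append("E")              # reference to core storage
        if indirect:
            sequence.append("E")          # extra E cycle for indirect addressing
    sequence.extend("L" for _ in range(logic_cycles))
    return sequence

def execution_time_us(sequence: list[str], model: str) -> float:
    """Total time for a cycle sequence on the given model."""
    return len(sequence) * CYCLE_US[model]

transfer = cycle_sequence(needs_operand=False)              # 1 cycle
load_indirect = cycle_sequence(needs_operand=True, indirect=True)
```

On these assumptions the stated range of 1 to 19 cycles corresponds to 2.0 to 38.0 μs on the 7094 I, and proportionally less on the 7094 II.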

Chapter 41

Instruction interpretation. Instruction flow diagrams for the instructions are given in Fig. 4. These

CAL, and CLS

Operations on AC and Mps (A

) V (BPT

3

1(

A BPT)

)

-> (P



(instruction



(EM2

instruction

->

(EM3 (P

condition POP

:=

(

)

;

+ 1);

nstruct ion A (EM2

2))

A

(

i

nstruct ion A (EM3 • 3)))

(M[0] t-OvDlCP; P *-'00 g + popj;ode);

->

EOM

-> lO^i

nstruct on^xecut ion

;

POT

-> lO^i

nstruct ionjaxecut ion

;

PIN

-» lOjji

nstruct ionjexecut ion;

SKS

-> IO w

nstruct ionjaxecut ion

i

i

programmed operator; 64 user defined instructions called via subroutine link in M[0]; see the definition of the IO instruction set below

;

end Instruction_execution; not including Input Output instructions

)

Input-Output Control from the Pc

KT and KMs State

Devices consist of the following parts:

IO_Device[0:77777₈]        name (or address) of a specific IO device; the EOM command is first given to select the specific device; subsequent commands are implicitly to the selected device
IO_Output[0:77777₈]
IO_Input[0:77777₈]         Input and Output Data buffers associated with specific devices
IO_Ready[0:77777₈]         bit for each device to denote when device is ready to transmit data
IO_Select[0:77777₈]        a bit within each device denoting it has been selected for an operation
io_unit<0:14>              the particular io device selected by the EOM command

IO Instruction Set

command to select or address the device: energize output M

EOM



(io^uni

POT



(lO^Selectllo^unit] A lOJteadyt io^uni t]

t

c-

•)

J

lOJJutputtiouunit]



(POT)):

(

]



;

t

(PIN));

SKS -»(io _ unit *-e: next l

l

P

wait until ready sVip if signal is not set

1

(lO _Jselect[io U(unit] a IOJ*eady

wait until ready input data command

->

[

io^un

1

1

]



{

«-P + 1);

iojunit

(interrupt

R —» (Interrupt

IET

-

IDT -*

(Interrupt (—i

enable interrupt; turn on mode

«- 1)j t-

0)



P

disable interrupt; turn off

;



each alphabet are ordered

is

name

BILL :=

example

=^©

right (high).

3.4

and y

class(x)

>

a free

is

assigns the

... 9

»M< of

(GC

z

|;,:«

The characters

If

x:=y

Z

.

x

4.1

one of the following alphabets:

of

Commands: assignment,

4.

a sequence of characters written without spaces.

is


reset -time + constant X a) >

first-in-first-out

cocomponents: (input: component, output: component:

p

|

|

~constant)

|

dequeue: (constant ~constant)); |

component);

tmx / tm permanency: (decay transmit-destruct time-multiplexed / irreversible fixed until broken / cyclic permanent

subcomponents: ('control; 'input-buffer: M.i-unit; 'output-buffer:

|

|

|

|

|

|

|

moving

M.i-unit);

fixed manual); |

operation: (open close); |

hang-up-delay:

ft];

concurrency: (1|2); delay: delay(links);

concurrency-type: (simplex half-duplex full-duplex / duplex); |

|

L-initiator: initiator(links); i-rate: i-rate(link);

technology) delay: delay(link);

hang-up-delay: access-time /

A

ft];

constant

ta:

[t])

A gate-switch acts as a simple-link or as no connection. It is used to transmit information conditionally between the ports of two components. It can be used as a basic primitive to express the structure of other switches, including the simple-switch. The parameters will be discussed under the simple-switch.

A simple-switch consists of a set of potential links between a set of input and output components, with an operation (access) that can actualize some subset of the links. This is done according to an instruction called the address (which may or may not be held in a memory). For a switch, the cocomponent input and output ports are sometimes listed to specify the size of the switch. An important parameter is the concurrency-type, which describes the various subsets that can be simultaneously realized.

7.3 simple-switch := component (
    cocomponents: (input/from: component-set, output/to: component-set, initiator: component-set);
    subcomponents: control, links: link-set, 'address: memory;
    operation: access;
    size: [...] + size(output(cocomponents));
    concurrency: integer;
    concurrency-type: (simplex | half-duplex | full-duplex / duplex | dual-simplex | dual half-duplex | dual full-duplex / dual-duplex | time-multiplexed-cross-point / 1 trunk | cross-point | dual-cross-point | k-trunk);
    hierarchy: (hierarchical | nonhierarchical / anarchical);
    location: (central | distributed | (cocomponent set));
    distribution: (radial | bussed / bus / chain / daisy chain);
    access-time / ta: [...]
    switch-type := switch-type(address / a, prior-address / p) [...]

The values given correspond to practical alternatives: simplex, in which only a single simplex may be established at a time; duplex, in which a single full-duplex may be established; cross-point (also dual-cross-point), which permits true simultaneity; time-multiplexed-cross-point, in which functional simultaneity is established for many links by means of rapid switching within the course of transmission of an i-unit (in essence the time-multiplexed cross-point has 1-trunk, which permits 1 conversation); and finally k-trunks for k-simultaneous conversations. We often use a duplex switch instead of simplex or half duplex switch in PMS diagrams, even though the latter would be more accurate.

Hierarchy is a redundant attribute derived from the cocomponent set. As a rule, if there are n identical cocomponents each of which communicates with one another, there is no hierarchy. A telephone system is a typical nonhierarchical structure. Usually the switches internal to a computer are hierarchical in that there are n components of type a which communicate with m components of type b. The a's only communicate with the b's and vice versa: hierarchy does not determine the component initiating the dialogue.

The location of a switch refers to whether the hardware is localized within one of the components using the switch, whether it is separate (called central), or whether it is distributed through all the cocomponents. An attribute that is not completely independent is distribution, which denotes whether the physical structure is a continuous bus or chain or is fed radially from a centralized component. See Fig. 13, Chap. 3, page 67 for common alternative physical structures.

A major way of classifying simple-switches is by their access time: cyclic, linear, random, etc. With each is [...] that given the type of formula determines the actual access time. The two critical parameters in most switches are the address being sought (a) and the prior address (p), which represents the existing state of the switch. Thus, in a bilinear switch the access time consists of a start-up time plus a time proportional to the magnitude of the difference between the prior address and the desired address. This differs from a linear switch, which only permits movement in one direction and must reset to an initial state if an address lower than the existing address (p) is sought. An interleave memory is a collection of random-access memories, depending on the relationship between a and p (usually a modular one, such as (a = p mod 4) → long access; (a ≠ p mod 4) → short access). Random access means that the access time is independent of both a and p. This constancy may be only approximate (as in using a drum with its cyclic character ignored). Queues and stacks differ from the other switches in having a degenerate addressing system such that the next link selected is determined by the state of the switch itself. Dequeues allows either of the two ends of a queue to be accessed.

Permanency refers to how long the switch maintains a link (or set of them) after establishing the link by an access operation. The three common values are (1) the destruction of the connection with the transmission of the i-unit across the link, (2) the maintenance of the connection permanently, and (3) the autonomous movement of the connection (as in disks and drums). The latter two give rise to the p used in the access formulas. Rarer is a decay function, in which the link remains established for some period of time, or an irreversible connection, which can be set just once and from then on operates like a simple-link. Hang-up delay is the time taken to break a connection after the appropriate i-unit has been transmitted. Hang-up delay is given only for certain permanencies of fixed-until-broken and manual switches.

A number of parameters derive directly from the properties of the set of ports or links: the size of the i-unit, the information-rate, the link delay, the direction of data flow, and the component that can initiate data transmission (as opposed to initiating accessing). Finally, there is technology, which is not given in detail, since much of it is identical to memory technology.

examples
    S('I/O BUS; location: K; from: P; to: K; half-duplex; initiators: P, K; switch-type: random; ta: 5 μs; concurrency: 1)
    S(cross-point; 16 M; 6 (P + K); concurrency: 6; location: M)

7.4 compound-switch := simple-switch (
    subcomponents: control, links: link-set, subswitches: switch-set, 'address: memory;
    access-time: (cascade: sum(access-time(subswitches)) | parallel: max(access-time(subswitches))) )

A compound-switch is one that consists of an array of switches whose links are connected so that the outputs of some are inputs to others and thus effects a total set of links, which go from output to input component-sets. It can be defined as an extension of a simple-switch, since most parameters are defined identically for both.

Many combinations of accessing arrangements are possible. The two most common are given above. A cascade-switch is one in which each accessing of the next subswitch must take place after the prior one so that the access times add. A parallel-switch makes all the accesses simultaneously, so that the total access time is simply the access time of the subswitch that takes longest. (In both cases, there can be additional overhead time, but this can usually be allotted to the subswitches and does not require separate terms in the expressions for access time.)

8. Control

8.1 Control / K := simple-control | compound-control

8.2 simple-control := component (
    cocomponents: controlled / object: component-set, 'instruction: component-set, 'data: component-set;
    subcomponents: 'instruction: memory, working / w: memory, instruction-set;
    operations: evoke / →, next-evoke / next, condition-operations;
    controlled-operations: (controlled-component: operation)-list;
    instruction-source: (none | data | instruction);
    [...] )

A simple-control is a logical circuit (usually sequential) that evokes operations in other components (the controlled, or object, components). Thus, its main operations are those of evoking and evoking-next (symbolized as → and next in ISP). However, it must also detect conditions on which such evoking depends, so that it has available additional operations, that are combined in an instruction-set (see ISP 2.1). These vary greatly in complexity, from boolean operations to arithmetic operations (such as counting the number of i-units processed).

A major distinction is the source of the external instructions that can be given the control. At one extreme there is none, as in a clock whose function is to interrupt the system every millisecond. The common case is that in which the external instruction comes via the data [...]. More complex controls have a separate set of external instructions (often called control characters or commands). A control does not obtain its own next instruction, being dependent on an external component to set it into action. This is the primary characteristic that distinguishes it from a processor. It does have an instruction-set, which is the ISP expression that shows what conditions evoke what actions.

No technology is given, since controls are all realized in a logic technology, as given in the definition of component. Likewise, no function parameter is given, since there exists no special vocabulary to designate the different subspecies of control tasks.
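The access-time rules given above for simple and compound switches can be written out as a short sketch. It is illustrative only: the parameter names (`startup`, `per_step`, `reset`, `modules`) are hypothetical, the linear switch is assumed to reset to address 0, and the interleave rule follows the modular example in the text (same module as the prior address gives the long access).

```python
def bilinear_access(a: int, p: int, startup: float, per_step: float) -> float:
    """Start-up time plus a time proportional to |a - p|."""
    return startup + per_step * abs(a - p)

def linear_access(a: int, p: int, startup: float, per_step: float,
                  reset: float) -> float:
    """Movement in one direction only; reset first if a lies below p.

    Assumption: the initial state reached by a reset is address 0.
    """
    if a >= p:
        return startup + per_step * (a - p)
    return reset + startup + per_step * a

def interleaved_access(a: int, p: int, short: float, long: float,
                       modules: int = 4) -> float:
    """Long access when a and p fall in the same module, short otherwise."""
    return long if a % modules == p % modules else short

def cascade_access(subswitch_times: list[float]) -> float:
    """Cascade-switch: the subswitch accesses follow one another, so times add."""
    return sum(subswitch_times)

def parallel_access(subswitch_times: list[float]) -> float:
    """Parallel-switch: all accesses at once, so the slowest one dominates."""
    return max(subswitch_times)
```

A random-access switch is the degenerate case in which the function ignores both a and p and returns a constant.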

examples
    K(Mp; input: Pc; output: Mp)
    K(D(multiply))
    K(Instruction set processor/ISP; input: M.processor_state; output: D, K(Mp), K('I/O Bus); M(read-write; 1 w; 40 b; working); M(read only; 100 w; 36 b/w; 1 μs/w))

8.3 compound-control := simple-control (
    subcomponents: alternatives: simple-K-set, 'instruction: memory, working: memory;
    instruction-source: mode-instructions)

A compound-control consists of a collection of alternative simple-controls and can be given as an extension of the simple-control. At any time, the control is one of these simple-controls. Determination of what simple-control is operative (often called the mode the control is in) is by a mode-instruction from some external component. This additional freedom requires a subcomponent, the control-state, to hold the current specification. (Thus it is possible, though rare, that the actual simple-K is determined by a sequence of mode-instructions, each determining some part of the control state.)

9. Transducer

9.1 Transducer / T := simple-transducer | compound-transducer

A simple-transducer is a pair of connected links that have different i-units and/or underlying carriers. As defined above, transduction is a digital operation, taking in an i-unit of the input link and producing an i-unit of the output link. Meaning is preserved; that is, only the encoding has changed. Preservation of meaning distinguishes transduction from data operation. The amount of information need not be preserved, so that information divergence is an additional characteristic of a transducer. It may be positive or negative, as the net number of bits is either increased or decreased. A simple-transducer is called a simplex, in that information flow is in one fixed direction only (as in a simple-link).

Knowing the function of the transducer permits an inference of whether one interface of the transducer involves a human being. This inference can be derived from the port characteristics.

9.2 simple-transducer := component (
    cocomponents: input: component, output: component, initiator: (input | output | both);
    subcomponents: input: L, output: L, 'control;
    concurrency: 1;
    concurrency-type: simplex;
    transduction: port(output) ← port(input);
    divergence-rate / divergence [...] i-unit(input) [...]
    o-rate [i/t];
    See port of component;
    portability: (portable | not portable / fixed);
    [...] )

transduction-technology := (amplification | analog-digital | angular | linear | attenuation | electroluminescence | electromagnetic | electromechanical | electromechanical-acoustic | electro-optical | mechanical-indentation | photochemical | xerographic)

transducer-technology := (analog-digital converter | bell | buzzer | TV camera / vidicon | card reader | card punch | CRT display | storage CRT display | printed document display | plasma display | 3 D display | magnetic character reader / document reader | document printer | film reader | film writer | gong | joystick | keys | keyboard | light gun | light pen | linear actuator | continuous line plotter | line printer / printer | SRI mouse | paper tape reader | paper tape punch | incremental point plotter | pressure transducer | speech synthesizer | Rand tablet | Sylvania tablet | telephone dial | push button telephone dial | thermocouple | Lincoln Laboratory Wand)

example
    T(line printer; 1000 lines/m; 132 char/line; 8 bit/char)
    T(paper tape; reader; 300 char/s; 8 b/char; width: 1 in.)
    T(sense amplifier; i-rate: .5 w/s; 24 b/w; input: M(memory stack))

9.3 compound-transducer := [...] (
    concurrency: [...];
    concurrency-type: (half-duplex | full duplex);
    compound-transducer-technology;
    [...] )

compound-transducer-technology := (card reader-punch | computer console / processor console / console | Dataphone | keyboard-CRT display | diskpak drive | film write-reader | magnetic card transport | magnetic tape transport | typewriter | Teletype | special purpose console (airlines reservations | stock quotation | data collection))

A compound-transducer consists of a set of simple-transducers. The two simplest kinds are the half-duplex and the full-duplex, which are extensions of the simple-transducer, wherein the direction of information flow can be either [...]
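The transducer examples above give rates in mixed units, and a short calculation shows how they reduce to a common information rate. This is a sketch using the figures from the two examples whose units are unambiguous; the function names are hypothetical, and the sign convention for divergence (output bits minus input bits) is an assumption consistent with the statement that divergence is positive when the net number of bits increases.

```python
def line_printer_bits_per_second(lines_per_minute: float, chars_per_line: int,
                                 bits_per_char: int) -> float:
    """Information rate of a line printer, converted from lines per minute."""
    return lines_per_minute * chars_per_line * bits_per_char / 60

def paper_tape_bits_per_second(chars_per_second: float, bits_per_char: int) -> float:
    """Information rate of a paper tape reader."""
    return chars_per_second * bits_per_char

def divergence(bits_in: int, bits_out: int) -> int:
    """Net change in bits per i-unit across the transducer (assumed sign: out - in)."""
    return bits_out - bits_in

printer_rate = line_printer_bits_per_second(1000, 132, 8)   # T(line printer; ...)
tape_rate = paper_tape_bits_per_second(300, 8)              # T(paper tape; reader; ...)
```

Expressed this way, the two devices can be compared directly with the i-rate parameters of the links and memories they connect to.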



1

1 1

segment

|

segments paging) |

= (fixed length page segments multiple length page segments variable length page segments :

|

|

|

segments);

P-concurrency: (serial / serial by bit | parallel / parallel by word | multiple instruction streams | multiple data streams (arrays) | pipeline processing | instruction-memory);

instruction-memory := (none | 1 instruction look ahead | n instruction look ahead | cache / look aside / slave memory)

possibilities as

really

leading number, since

1

(no relocation protect only

segmented-programming

Mp,

and the program-switching time, which is change context from one program to another. In simple operating regimes (standard batch processing) program-switching time is not an important parameter; it becomes so when interrupts are permitted. For interis

+

multiprogramming!

|

to

rupts, the response time

fixed |

|

operations can be performed per cycle time

an averaging of the various

n Pio monitor

-

swapped program

2 segment / pure impure segments

memories, since they

the cycle-time of

(

|

index registers and general

struction set);

for data

simple-processor

P|l P with interrupt |1 program with multiple

multiprogramming

and

long run limits the rate at which instructions and data can be accessed (and also determines the maximum throughput); the concur-

which

ad-

1

multiprogramming segmented-programming);

various amounts of

its own operation time and its own possibeing overlapped with other operations. Several parameters are

summarize

~2 w/ instruction;

/ instruction)

)

of the operations has

given that

is

(1

=

:

concurrent subprograms

named

bilities for

is

address / instruction;

500 kw/s; data-types: words; integer; 36 b/w

79 (LD AC ((A

formed from

(A

single bit vector

add

>

0) -» P P (Ov, M[B]

0:11> :=

jump

1)

need not be given

EXAMPLES OF REGISTERS FORMED BY CONCATENATION

LAC

B)

1)-(A

Lehman, M., R. Eshed, and Z. Netter: SABRAC - A New Generation Serial Computer, IEEE Trans., vol. EC-12, no. 6, pp. 618-628, December, 1963.

LehmM65

E.:

Changes

in

Computer

vol. 12, no. 9, pp.

Per-

M.: Serial

LehmM66

Mode Operation and

Parallel Processing, Proc.

Pt. 2, pp.

40-54,

E.: Evolving Computer Performance 1963-1967, Datamation, vol. 14, no. 1, pp. 31-35, January, 1968.

Lehman, speed

September, 1966.

KnutD66

L.,

Lebedev, S. A.: The High-speed Calculating Machine of the Academy of Sciences of the USSR, J. ACM, vol. 3, pp. 129-133, 1956.

partial translation

The Oracle Memory System,

formance, Datamation,

KnigK68

J.

LebeS56

1953. Knight, Kenneth

W. W. Lichtenberger, and in a Time-sharing

sharing System, AFIPS Proc. FJCC, 601-609, 1967.

Izdatelstvo

Argonne Natl. Lab., Proc. Symp. on Large Scale Digital Computing Machines, pp. 47-58, August,

KnigK66

Langdon,

pp.

available, 1956.

KleiR53

W.,

A User Machine

of a High-speed Transistor for the ASLT Current Switch, IBM J. of Res. and Dev., vol. 11, no. 1,

vol. 4, no.

Tsifrovie

Machines),

Digital

B.

Pirtle:

System, Proc. IEEE, vol. 54, no. 12, pp. 1766-1774, December, 1966.

April,

Kinslow, H. A.: The Time-sharing Monitor System, AFIPS Proc. FJCC, Pt. I, vol. 26, pp. 443-454, 1964.

C-17, no. 8, pp.

Lampson,

1962.

KinsH64

vol.

LampB66

279-294, 1961. KilbT62

ILLIAC IV Software and Application

J.:

Lampson, B. W.: Interactive Machine Language Programming, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 473-481, 1965.

222-225, October, 1961.

Payne, and D. J. Howarth: The AFIPS Proc. EJCC, vol. 20, pp.

November,

LampB65

I:

puter

Kuck, D.

Programming, IEEE Trans., 758-770, August, 1968.

1960. G. Edwards, and

Division,

1961.

High-

IFIPCong. 1965,

631-633, 1965.

Lehman, M.: A Survey of Problems and Preliminary Results Concerning Parallel Processing and Parallel Processors, Proc. IEEE, vol. 54, no. 12, pp. 1889-1901, December, 1966.

Knight, Kenneth

Knuth, D. E.: Additional Comments on a Problem in Concurrent Programming Control, Comm. ACM, vol. 9, no. 5, pp. 321-322, 1966.

LeinA54

A.

L.,

and

S.

N.

Alexander: System

neers, vol. EC-3, no. 1, pp. 1-10,

LeinA57

March, 1954.

Leiner, A. L, W. A. Notz, J. L. Smith,

and

A.

Weinberger: Organizing a Network of Computers to Meet Deadlines, Proc. EJCC, pp. 115-128, 1957.

Kroger, Marlin G., et at.: Computers in Command and Control, TR61-12, prepared for

DOD:ARPA by Digital Computer Application Study, Institute for Defense Analyses, Research

Leiner,

Organization of the DYSEAC, Professional Group on Electronic Computers, Institute of Radio Engi-

LeinA58

Leiner, A. L, W. A. Notz, J. L. Smith,

and

A.

Bibliography

Proc.

System,

LeinA59

NBS Multicomputer EJCC, pp. 71-75, 1958.

PILOT, The

Weinberger:

Leiner, A. L, W. A. Notz, J.

New

Weinberger: PILOT, A

System, 1959.

LichW65

ACM,

/.

vol. 6,

L.

Smith, and A.

Multiple

Computer

no. 3, pp.

313-335,

Pirtle: A Facility Man-Machine Inter-

in

Experimentation

action,

MaheR61

AFIPS

Proc.

FJCC,

Pt.

I,

vol.

27,

MarcM63

Lindquist, A. B., R. R. Seeber, and L. W. Comeau: A Time-sharing System Using an Associative Memory, Proc. IEEE, vol. 54, no. 12, pp.

Liptay,

85,

LoneW61

LonsK56

Lonsdale, vol. 103,

Lourie,

MelbA65

and

Digital

Supp.

2,

E. T.

Warburton: Mercury: A

Computer, Proc. IEE, pp. 174-183, 1956.

H. Schrimpf, R. Reach, Proc.

McCarthy, in

MendM66

MercR57

ture,"

M.I.T.

Press,

Merrl56

in

EJCC, pp.

a Multi-

MetrN52

75-81,

MMIW63

Cambridge, Mass.,

J.,

McCormick, Bruce

MiraW67

McCullough,

H.:

The

Illinois

Pattern Rec-

J. D., K.

H. Speierman,

and

F.

McPherson,

J.

L.,

Proc.

FORTRAN

and

J.,

A. W.

24-27,

April,

England: The

SDS

Real-time, Time-sharing Computer,

FJCC,

Mercer, Robert

J.:

vol.

29,

pp. 51-64,

Micro-programming, 157-171, 1957.

/.

1966.

ACM,

Merry, I. W., and B. G. Maudsley: The Magnetic-drum Store of the Computer Pegasus, Proc. IEE, Pt. B, vol. 103, Supp. 2, pp. 197-202, 1956.

Metropolis, N., E. Richardson, H. B. Proc.

Klein, W. Orvedahl, J. R.

F.

Demuth, and

ACM,

J.

B.

Jackson:

Toronto Conf, pp. 13-17,

Miller,

W.

F.,

and

R. A.

Aschenbrenner. The

GUS

Miranker, W. L., and W. M. Liniger: Parallel Methods for the Numerical Integration of Ordinary Differential Equations, Math. of Computation, vol. 21, no. 99, pp. 303-320, July, 1967.

MolnC67

and

S.

N. Alexander: Per-

Molnar, Charles E., Severo M. Ornstein, and Antharvedi Anne: The CHASM: A Macromodular

Computer

W.

Zurcher: Design for a Multiple User Multiprocessing System, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 611-617, 1965.

McPhJ51

AFIPS

A

vol. 8, pp.

tion, vol. 21, no.

ognition Computer-ILLIAC III, IEEE Trans., vol. EC-12, no. 5, pp. 791-813, December, 1963.

McCuJ65

7:

M. Pugmire: A Small

Multicomputer System, IEEE Trans., vol. EC-12, no. 6, pp. 671-676, December, 1963.

pp. 51-57, 1963.

McCoB63

J.

Direct Processing of

September, 1952.

R. Licklider:

for a

SIGMA

MANIAC,

S. Boilen, E. Fredkin, and J. C. A Time-sharing Debugging System Small Computer, AFIPS Proc. SJCC, vol. 23,

McCarthy,

and

J.,

Nash: The Ordvac,

P.

J.

37-43, December, 1951.

drum

1962.

McCaJ63

Mendelson, M.

Pt.

Management and The

Melbourne, A.

vol. 4, no. 2, pp.

Pt. B,

"Time Sharing Computer Systems the Computer of the Fu-

J.:

and

Statements, Computer J., 1965.

1959.

McCaJ62

E.,

Conf., pp.

and W. Kahn:

Arithmetic and Control Techniques

program Computer,

R.

Meagher,

604 Machine Description, IBM 38 pp., December, 1963.

Computer for the

F.:

K.,

IM.,

MeagR51

J.,

Lonergan, William, and Paul King: Design of the B 5000 System, Datamation, vol. 7, no. 5, pp. 28-32, May, 1961.

High Speed

LourN59

Sys.

Lloyd, R. H. [...]: ASLT: An Extension of Hybrid Miniaturization Techniques, IBM J. of Res. and Dev., vol. 11, no. 1, pp. 86-92, January, 1967.

R. M.:

Meade,

Proc. SJCC,

29-40, 1963.

Internal Mem.,

15-21, 1968.

vol. 7, no. 1, pp.

LloyR67

The Cache, IBM

II.

MeadR63

AIEE-IRE

Structural Aspects of the Sys-

S.:

J.

tem/360 Model

Marcotty, M. J., F. M. Longstaff, and A. P. M. Williams: Time-sharing on the Ferranti-Packard vol. 23, pp.

1774-1779, December, 1966. LiptJ68

J.:

Multiprocessor

FP6000 Computer System, AFIPS

pp.

589-598, 1965.

LindA66

Maher, R. J.: Problems of Storage Allocation in a Multiprogrammed System, Comm. ACM, vol. 4, no. 10, pp. 421-422, October, 1961.

Lichtenberger, W., and M. W. for

formance of the Census Univac System, AIEEIBE Conf., pp. 16-22, December, 1951.

for Analyzing

Proc. SJCC, vol. 30, pp.

MonnR68

Monnier, Richard E.: A New Electronic Calculator with Computerlike Capabilities, Hewlett-Packard J., vol. 20, no. 1, pp. 3-9, September, 1968.

Neuron Models, AFIPS 393-401, 1967.


MorrD67

Morris, Derrick, Frank H. Sumner, and Michael Wyld: An Appraisal of the Atlas Supervisor,

PapiW57

ACM

Proc.

MuntC62

2,

MyerT68

A

Processing Interpreter for

List

M.I.T., Instrumentation Lab.,

AGC4, MurtJ66

C. A.:

Muntz,

PatzW67

York, 1966.

PennJ62 and

T. H.,

E.

I.

On the Design Comm. ACM, vol. 11, no.

Sutherland:

and

and

The Logic Theory

H. A. Simon:

vol. IT-2, no. 3, pp.

J.

WJCC,

C.

Shaw, and H.

PikeJ52

PlugW61

A.

Simon: Empiri-

PortR60

J. C. Shaw, and H. A. Simon: The Elements of a Theory of Human Problem Solving, Psychology Rev., vol. 65, pp. 151-166, March, 1958.

RajcJ43

OssaJ65

731-733, December, 1964.

Hardware for Information Processing Systems: Today and in the Future, Proc. IEEE, vol. 54, no. 12, pp. 1820-1835, DecemOsborne, Thomas E.: Hardware Design of the Model 9100A Calculator, Hewlett-Packard J., vol.

RandB68

PadeA64

Padegs,

IBM PadeA68

Sys.

Padegs,

I,

A.: J.,

A.:

Structural

85,

pp. 22-29, 1968.

James

473-

vol. 5, no. 9, pp.

Input-Output Devices Used with pp. 36-38, Decem-

L.:

III.

Aspects of the Sys Extensions to Float

IBM

Sys.

J.,

vol. 7,

"SABRE"

Proc.

WJCC,

Porter, R.

no

Perry: American AirElectronic Reservations System,

and M. N.

Plugge, W. R., lines'

pp.

593-602, May, 1961.

The RW-400-A New Polymorphic

E.:

Rajchman,

Randell, B.,

J.,

no.

6,

Snyder, and Rudnick:

under terms

of

1,

RCA

OSRD

pp.

Labo-

contract

Randell, B., and C. J. Kuehner: Dynamic Storage Allocation Systems, Comm. ACM, vol. 11, no. 5, pp. 297-306, May, 1968.

RichR55

Richards, R. K.: "Arithmetic Operations in Digital Computers," D. Van Nostrand Company, Inc., Princeton, N.J., 1955.

RobeJ58

Robertson, J. E.: A New Class of Digital Division Methods, IRE Trans., vol. EC-7, no. 3, pp. 218-222, September, 1958.

RobeL67

Roberts, Lawrence G.: Multiple Computer Networks and Intercomputer Communication, ACM Symp. on Operating System Principles, Gatlinburg, Tenn., Oct. 1-4, 1967.

RoseG67

Rose, Gordon

grammed

Channel Design Considerations vol. 3, no. 2, pp. 165-180, 1964

ing-point Architecture, 1,

231-241, 1965.

vol. 27, pp.

tem/360 Model

Pike,

pp.

pp. 10-13, September, 1968.

J. F., L. E. Mikus, and S. D. Dunten: Communications and Input-Output Switching in a Multiplex Computing System, AFIPS Proc.

Pt.

in

Allocation Systems,

Ossanna,

FJCC,

and T. Pearcey: Use of Multiprothe Design of a Low Cost Digital

J. P.,

OEM-sr-591.

Nisenoff, N.:

1,

Penny,

ratories Report,

Nievergelt, J.: Parallel Methods for Integrating Ordinary Differential Equations, Comm. ACM,

20, no.

Memory and Computer

Data System, Datamation, vol. 8-14, January/February, 1960.

Theory Machine,

ber, 1966.

OsboT68

Read-only

SEAC, AIEE-IRE-ACM Conf.,

pp. 218-230, February, 1957.

12, pp.

A.:

ber, 1952.

Newell, A.,

vol. 7, no.

NiseN66

vol. 6,

Computer, Comm. ACM,

61-79,

J. C.

cal Explorations of the Logic

NievJ64

Gilbert C. Vandling: Sys-

Microprogramming, Comno. 12, pp. 62-66, Decem-

of

476, September, 1962.

Shaw: Programming the Newell, A., Logic Theory Machine, Proc. WJCC, pp. 230-

Newell, A.,

Peacock,

gramming

240, February, 1957.

NeweA58

and

1 Control, to be published.

410-414, June, 1968.

Newell, A.,

Proc.

J.,

162-

vol. 30, no. 10, pp.

J.

Machine, IRE Trans., September, 1956.

NeweA57b

Patzer, William

puter Design, ber, 1967.

PeacA??

Myer,

High-speed Computer Stores 2.5

N.:

tems Implications

Mem.

C: Highly Parallel Information Processing Systems, in "Advances in Computers," vol. 7, pp. 2-116, Academic Press, Inc., New Murtha,

6, pp.

NeweA57a

AGC

Cambridge, Mass., January, 1962.

of Display Processors,

NeweA56

67-75, 1967.

Natl. Meeting, pp.

Papian, W.

Megabits, Electronics, 167, October, 1957.

T.

Trans., vol.

A.:

"Intergraphic,"

A

Micropro-

Graphical-Interface Computer,

EC-16, no.

6, pp.

IEEE

773-784, Decem-

ber, 1967.

1 According to E. F. Codd, this article has not been published as of Jan. 23, 1968. However, "Microprogram Control for System/360" by S. G. Tucker, IBM Sys. J., vol. 6, no. 4, 1967, has been published and covers the material that we think was intended to be in PeacA??.


RoseJ65

Marbles and Boxes, IBM Res. Yorktown Hts., N.Y., November, Rept.,

Rosenfeld, Project

1965.

RoseS67

nization for Array Processing,

J.:

Pt.

SerrR62

Rosen, Saul: "Programming Systems and Languages," McGraw-Hill Book Company, New

RoseS69

Rosen, Saul: Electronic Computers: A Historical Survey, Computing Surveys, vol. 1, no. 1, pp. 7-36, March, 1969.

RosiR69

ShanC38

SharW69

1969.

ShawJ58

F.:

RossH53

Ross, Harold D., Jr.: The Arithmetic Element of the IBM Type 701 Computer, Proc. IRE, vol. 41, no. 10, pp. 1287-1294, October, 1953.

RothS59

Rothman, Intern.

S.:

R/W 40

Data Processing System,

SaltJ66

ShedG66b

SaxoJ63

Saxon,

"Programming the IBM 7090,"

A.:

J.

SchlH??

Schlaeppi, H.

P.:

Englewood

Cliffs, N.J.,

SchwJ64

SechR67

Schwartz,

J.

I.:

Sechler, R.

A. R. Strube,

397-411,

ASLT Circuit

and

vol.

Seeber, R. R., and A. B. Lindquist: Associative Logic for Highly Parallel Systems, AFIPS Proc. FJCC, vol. 24, pp. 489-493, 1963.

SegaR61

Segal, R.

J.,

and H.

P.

SlutR51

W. Carl Borck, and Robert

Slutz,

Ralph

J.:

Engineering Experience with the

SEAC, AIEE-IRE Conf,

to Air Force Digital Data Communication System, AFIPS Proc. EJCC, vol. 20, pp. 264-278, 1961.

Senzig, D. N.,

and

R. V. Smith:

pp.

90-94, December,

1951.

SmitR64

SoloM66

SquiJ63

Smith, R. V., and D. N. Senzig: Computer Organization for Array Processing, IBM Res. Rept. RC

N.Y.,

December, 1964.

Solomon, Martin B., Jr.: Economies of Scale and the IBM System/360, Comm. ACM, vol. 9, no.

Computer Orga-

435-440, June, 1966.

Squire, J. S., and S. M. Polais: Programming and Design Considerations of a Highly Parallel Computer, AFIPS Proc. SJCC, vol. 23, pp. 395-400, 1963.

Guerber: Four Advanced

Computers— Key

SenzD65

L.,

McReynolds: The SOLOMON Computer, AFIPS Proc. FJCC, vol. 22, pp. 97-107, 1962. C.

6, pp.

SeebR63

RC

Turnbull:

IBM J. of Res. and Dev., 74-85, January, 1967.

Design,

11, no. 1, pp.

J. R.

Res. Rept.

Slotnick, Daniel

1330, Yorktown Hts., F.,

IBM

SlotD62

A General-purpose Time-sharing Proc. SJCC, vol. 25, pp.

Numerical Methods for

Shupe, P. D., and R. A. Kirsch: SEAC, Review of Three Years of Operation, Proc. EJCC, pp. 83-90, 1953.

Extensions of PL/I-like Languages for Parallel Processing, with Programming Examples, in preparation.

System, AFIPS 1964.

Parallel

ShupP53

1963.

S.:

1619, Yorktown Hts., N.Y., June, 1966.

L.:

Prentice-Hall, Inc.,

Shedler, G.

the Solution of Equations,

MAC-TR-30,

Accents, Proc.

S., and M. Lehman: Parallel Compuand the Solution of Polynomial Equa-

Shedler, G.

IBM Res. Rept. 1550, Yorktown Hts., N.Y., February, 1966.

Saltzer, J. H.: Traffic Control in a Multiplexed

Computers with European WJCC, pp. 14-17, 1957.

York,

1958.

ShedG66a

tions,

Samuel, Arthur

Com-

J. C., A. Newell, H. A. Simon, and T. O. A Command Structure for Complex Information Processing, Proc. WJCC, pp. 119-128,

Ramo-Wooldridge, Div. of Inc., Los Angeles,

July, 1966.

of

New

Ellis:

1959,

M.I.T. Tech. Rept.

"The Economics

Shaw,

tation

June, 1959.

F.:

1969.

Information Processing and

Computer System,

SamuA57

Sharpe, William

puters," Columbia University Press,

Thompson RamoWooldridge, Calif.,

Shannon, C. E.: A Symbolic Analysis of Relay and Switching Circuits, Trans. AIEE, vol. 57, pp.

on

Conf.

Auto-math

M. M. Astrahan, G. W. Patterson, and Pyne: The Evolution of Computing Machines and Systems, Proc. IRE, vol. 50, no. 5, pp. 1039-1058, May, 1962. B.

713-723, 1938.

Contemporary Concepts of Microprogramming and Emulation, Computing Surveys, vol. 1, no. 4, pp. 197-212, December,

Rosin, Robert

AFIPS Proc. FJCC,

117-128, 1965.

vol. 27, pp.

Serrell, R., I.

York, 1967.

I,

SteeT61

Steel, T.

WJCC, StevL52

B„

pp.

Stevens,

L.

Jr.:

A

First Version of

UNCOL,

Proc.

371-377, 1961. D.:

Engineering Organization of for the IBM 701 Electronic

Input and Output


AIEE-IRE-ACM Machine, Data-processing Conf., pp. 81-85, December, 1952. StevW64

Stevens, W. Y.: The Structure of System/360, Part II— System Implementations, IBM Sys. J.,

Walendziewicz,

neering, Anaheim, 30-31, 1962.

C: Time Sharing in Large Fast ComProc. ICIP, UNESCO, pp. 336-341, June,

WareW63a

Sumner, F. H., G. Haley, and E. C. Y. Chen: The Central Control Unit of the "Atlas" Computer,

Taylor, Norman H.: Evaluation of the Engineering Aspects of Whirlwind I, AIEE-IRE

ples of Operation,

WareW63b

A Review

Teager, Herbert M.: Rev.,

Computing

vol.

6,

of

AmdaG64a;

no. 5, pp.

R. N.,

Thompson,

and

WebeH67

Thornton, James E.: Parallel Operation in the Control Data 6600, AFIPS Proc. FJCC, Pt. II, vol.

Tomasulo, R. M.: An Efficient Algorithm

WeikM55

VandW52

WeikM61 for

VandW56

Sci.

Van der

L.:

Poel, W.

Res.,

Van der

Poel, W.

The

WestG60

30,

549-558, Sep-

Electronic Digital

Sec.

B, vol. 2,

A Survey of Domestic Electronic Computing Systems, Ballistic Research Decem-

Weik, Martin H.:

A Third Survey

Electronic Digital

Computing Systems,

BRL Rept.

Project No.

of

Domestic

Md.;

Ballistic

report

1010, Department of the

5B03-06-002 (1961).

Weik, Martin H., Jr.: A Fourth Survey of Domestic Electronic Digital Computer Systems,

West, George P., and Ralph J. Koerner: Communications within a Polymorphic Intellectronic System, Proc. WJCC, pp. 225-230, 1960.

Wilkinson,

Automatic

pp.

cal

Logical Principles of

Thesis,

WilkM51a

Amsterdam,

J.

H.:

Digital

"The Pilot ACE," pp. 5-14, Computation, National Physi-

Laboratory, Teddington,

England, March

UNESCO,

pp.

361-365,

Wilkes, M. V.: The Best Way to Design An Automatic Calculating Machine, Manchester University Computer Inaugural Conf., July, 1951. Published by Ferranti Ltd.,

ZEBRA, A Simple Binary

L.:

Proc. ICIP,

June, 1959.

vol. 10, no. 9, pp.

Research Laboratories, Aberdeen, Md., 1227; processed by Defense DocumentaRept. tion Agency, Defense Supply Agency No. 42900, January, 1964.

1956.

Computer,

York, 1963.

25-28, 1953.

Some Simple Computers, VandW59

New

Ballistic

WilkJ53

A Simple

Computer, Appl. 367-400, 1952.

and Machine Design,"

Weik, M. H.:

Army

and

Unger, S. H.: A Computer Oriented toward Spatial Problems, Proc. IRE, vol. 46, no. 10, pp. 1744-1750, October, 1958. L.:

Inc.,

EULER on IBM System/360 Model

supersedes

WeikM64

Poel, W.

Sons,

Research Laboratories, Aberdeen,

G.:

Turing, Sara: "Alan M. Turing," W. Heffer & Sons, Ltd., Cambridge, England, 1959.

Van der

ber, 1955.

UngeS58

York, 1963.

Laboratories, Aberdeen, Md., Rept. 971,

1967.

TuriS59

New

Weber, Helmut: A Microprogrammed Implemen-

Digital

Microprogram Control for System/360, IBM Sys. J., vol. 6, no. 4, pp. 222-241, S.

and Programming," John

D825

1967. Tucker,

Inc.,

vol. 2, "Circuits

Comm. ACM,

Exploiting Multiple Arithmetic Units, IBM J. of Res. and Dev., vol. 11, no. 1, pp. 25-33, January,

TuckS67

Oct.

tember, 1967. Wilkinson: The

J. A.

26, pp. 33-40, 1964.

TomaR67

117-127,

pp.

Ware, W. H.: "Digital Computer Technology and

tation of

355-356,

Automatic Operating and Scheduling Program, AFIPS Proc. SJCC, vol. 23, pp. 41-49, 1963.

ThorJ64

Sons,

John Wiley

September-October, 1965.

ThomR63

&

Design,"

Conf., pp.

75-78, December, 1951.

TeagH65

Calif,

Ware, W. H.: "Digital Computer Technology and Design," vol. 1, "Mathematical Topics, PrinciWiley

657-662, 1962.

Proc. IFIP Cong. 1962, pp.

The D210 Magnetic ComComputer Engi-

E. T.:

puter, Proc. Conf. on Spaceborne

1959.

TaylN51

WaleE62

Strachey, puters,

SumnF62

Vyssotsky, V. A., F. J. Corbato, and R. M. Graham: Structure of the Multics Supervisor, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 203-212, 1965.

136-143, 1964.

vol. 3, no. 2, pp.

StraC59

VyssV65

WilkM51b

London.

Wilkes, M. V.: The Edsac Computer, AIEE-IRE Conf., pp. 79-83, December, 1951.


WilkM52

Wilkes, M. V., D. J. Wheeler, and S. Gill: "The Preparation of Programs for a Digital Computer," Addison-Wesley Publishing Company, Reading, Mass., 1952.

WilkM53

WilkM58b

WillF49

Wilkes, M. V., and J. B. Stringer: Microprogramming and the Design of the Control Circuits in an Electronic Digital Computer, Proc. Cambridge Phil. Soc., Pt. 2, vol. 49, pp. 230-238, 1953.

March, 1949.

183-202,

WirsJ66

W. Renwick, and D. J. Wheeler: The Design of the Control Unit of an Electronic Digital Computer, Proc. IEE, Pt. B, vol. 105, pp. 121-128, March, 1958. Wilkes, M.

Williams, F. C., and T. Kilburn: A Storage System for Use with Binary-Digital Computing Machines, Proc. IEE, Pt. 3, vol. 96, pp. 81-100,

Wilkes,

April,

WilkM58a

Inc.,

Operating Experience, Proc. EJCC, pp. 91-95, 1953.

Wirsching, Joseph

WirtN66a

Proc.

Microprogramming,

V.:

tion of I,

Wilkes, M.

V.:

Slave Memories and Dynamic

V.:

Trans., vol.

12,

List-oriented

no.

12,

pp.

and H. Weber: EULER: A Generalization of ALGOL, and Its Formal Definition: Part

Comm. ACM,

vol. 9, no. 1, pp.

13-25, Janu-

The Growth of

Charles

R.:

WirtN66c

A Review

of

ORDVAC

Comm. ACM,

vol. 9, no. 2, pp.

89-99, Febru-

ary, 1966.

Interest in Micro-

1969.

Wirth, N., and H. Weber: EULER: A Generalization of ALGOL, and Its Formal Definition: Part II,

EC-14, no.

programming: A Literature Survey, Computing Surveys, vol. 1, no. 3, pp. 139-145, September,

Williams,

NOVA: A

EJCC,

WirtN66b

Storage Allocation, IEEE 2, pp. 270-271, 1965.

WillC53

E.:

ary, 1966.

Wilkes, M.

Wilkes, M.

96, pp.

Wirth, N.,

tion of

WilkM69

in Pt. 2, vol.

1949.

Computer, Datamation, vol. 41-43, December, 1966.

V.,

pp. 18-20, 1958.

WilkM65

Same paper

April,

Wirth, N.: A Note on "Program Structures" for Parallel Processing, Comm. ACM, vol. 9, no. 5, pp. 320-321, May, 1966.

ZadeL63

Zadeh, Lotfi A., and Charles A. Desoer: "Linear System Theory," McGraw-Hill Book Company, New York, 1963.


Name

Adams, Charles W., 42, 585

Adams Associates, 42, 257, 580

Burdette, E. W., 119

Burks, Arthur W., 86-119

Ainsworth, Ernest, 212 Alexander,

S.

N., 165,

469

Bussell, B.,

212

496 M. W., 469

Allard, R. W.,

Allen,

Allmark, R. H., 257, 262-266 Alonso, R. L., 146-156

Amdahl, Gene M., 259, 469, 561 Anderson, D. W., 587 Anderson, James

P.,

257, 348, 447-455, 469,

586 Anderson, S. F., 587 Anne, Antharvedi, 73 Arbuckle, R. A., 50 Arbuckle, T., 349

Arden, B. W., 81, 275, 469, 566, 571 Aschenbrenner, R. A., 469 Aspinall, D., 277 Astrahan, M. M., 42, 119, 144, 212, 223, 515

Babbage, Charles, 46
Backus, John, 9
Baldwin, F. R., 46
Baldwin, R. R., 469
Barnes, George H., 320-333
Bartlett, K. A., 504
Barton, R. S., 257, 273
Bashkow, Theodore R., 363-381
Basilewskii, Iu. Ia., 213
Beckman, F. S., 146
Belsky, M. A., 349
Benington, H. D., 504
Bernstein, A., 349
Bhushan, A., 507
Bibb, J., 469
Blaauw, G. A., 259, 426, 428, 464, 561, 588-601
Blair-Smith, H., 146-156
Bloch, Erich, 421-439

P.,

Boland, L. J., 587 Borck, W. Carl, 320, 463

Bouchon, Falcon, Jacques, 46 Boutwell, E., Jr., 334 Bowden, B. V., 42 Bright, H. S., 291, 456 Brooker, R. A., 279

Edwards, D. B. G., 276-290 Elbourne, R. D., 172, 212 Elliott, W. S., 171-183 Ellis, T. O., 257, 349-362 England, A. W., 396 England, W. A., 149 Ernst, H. A., 469 Eshed, R., 469 Estrin, Gerald, 119, 469 Evans, D. S., 171 Everett, R. R., 137-145, 504 Ewing, R. G, 469

Fagen, R.

E.,

496

Fagg, P., 385 Fairclough, J. W., 171, 174, 176, 385 Falkoff, A. D., 13, 458,

587

Fikes, Richard E., 571

587 J., 83, 340, 291, 469 Forgie, James Forrester, J. W., 75

Flynn, Michael

W„

Fotheringham, John, 190 Frankovich, J. M., 469 Fredkin,

E.,

291

Fried, 45 Frizzell,

Clarence

E.,

525

Ill

Cray, Seymour, 471 Critchlow, A. J., 469

Galler, B. A., 81, 275, 469, 566, 571

Culler, Glen, 45

Daley, Robert

Gibson, C. T., 81, 587 Gibson, D. H., 574

C,

275, 297, 469, 517, 523, 571

Gibson,

W.

Gill, S.,

456

B.,

469

Darringer, John A., 13 Davies, D. W., 504

Glaser, E. L., 469 Goldschmidt, R. E., 587

Davies, P. M., 469

Goldstine, Herman H., 87-119 Grabbe, E. M., 205-215, 220-224 Graham, R. M., 469 Granito, G. D., 587

Davis, G. M., 257 Dean, R. F., 340, 587

Demuth, H.

B.,

Desmonde, W.

Bock, R. V., 257 S., 291

Boilen,

P., Jr., 146,

Crawford,

119

Dennis, Jack B., 81, 275, 295, 457, 469 Dent, B. A., 257

Blosk, R. T., 439

Brooks, F.

Campbell, Robert V. D., 42
Carlson, C. B., 257, 273
Carpenter, H. G., 171
Carr, J. W., III, 205-215, 220-224
Carter, W. C., 587
Casale, Charles T., 69, 155, 156, 396
Chase, George C., 42
Chen, E. C. Y., 274
Chen, T. C., 587
Chu, J. C., 119, 396
Clark, Wesley A., 274
Clayton, B. B., 496
Cochran, David S., 243-256, 439
Codd, E. F., 397, 439, 469
Comeau, L. W., 587
Comfort, W. T., 291, 469
Conti, Carl J., 563, 574
Conway, Melvin E., 295, 457
Corbato, Fernando J., 295, 457, 469, 517, 523, 571
Couleur, J., 469
Cox, Jerome R., Jr., 50

H.,

456

Desoer, Charles A., 7 Devonald, C. H., 171-183

Green, A., 156 Green, J., 392 Greene, J., 340, 587 Greenstadt, J. L., 525 Greenwald, Sidney, 212 Gregory, J. G., 315, 463 Grimsdale, R. L., 277, 587 Grosch, H. R. J., 585 Gruenberger, F. J., 89, 119

469 385 Dorff, E. K., 496 Dreyfus, P., 456 Dunten, S. D., 469 Dunwell, S. W., 421 Dijkstra, E. W.,

Doody, D.

Index

T.,

Grumette, Murray, 525 Guerber, H. P., 509

259, 349, 423, 428, 464,

561, 588-601

Brown, J. L„ 385 Brown, Richard M., 320-333 Buchholz, Werner, 396, 421, 428, 469, 515

Earle,

J.

G.,

Eccles,

W.

Eckert,

J.

587 46

Haines, L. H., 392 Haley, A. C. D., 266

H.,

Presper,

Jr.,

91, 157-169,

396

Haley, G., 274


Hamblin, C.

257

L.,

Lebedev,

Haney, Frederick M., 9 Hartley, D. F., 290 257 Haueter, R. C, 212 Hayata, Tomo, 344 Hellerman, H., 469 Herwitz, Paul S., 397 Hillegass, John R., 587 Hipp, J. A., 385 Hodges, Donald, 257 Hoffman, Samuel A., 257, 447-455, 469 Holland, John, 315, 320

Hauck, E.

A.,

H, 46

Hollerith,

Hopkins, A. L., 146-156, 349 Hoskinson, E. A., 334 Howarth, D. J., 274

Hughes, E. S., Jr., 223 Huskey, H. D., 191, 193

Iverson,

Jackson,

Kenneth

469 71, 334,

Kato, Maso, 320-333 Katz,

J.

H.,

463

Kepler, Johannes, 46 Kilburn, T., 75, 274-290

King, Paul, 257, 267-273 Kinslow, H. A., 469 Kirsch, R. A., 212 Kister,

J.,

349

Kitov, A. I., 213 Klein, E. F., 119 Klein, R. J., Jr., 119

Knight, Kenneth E., 50-51

Knuth, D.

E.,

469

Koerner, Ralph J., 485 Kroger, Marlin G., 448 Kronfeld, Arnold, 363-381

Kuck, David

J.,

Kuehner, C.

J.,

320-333 274

77,

W., 291-300 Landy, B., 290 Langdon, J. L., 581 Lanigan, M. J., 276-290 Laning, J. H., Jr., 146-156 Lauer, Hugh C, 571 Lawless, W. J., Jr., 146

Lampson,

B.

Nievergelt, J., 463 Nisenoff, N., 42

Lichtenberger, W. W., 291-300 Licklider, J. C. R., 291 Lindquist, A. B., 469, 587

Notz,

Liniger,

Liptay,

W.

M., 463

J. S.,

W.

A.,

440-445, 449, 456

C,

O'Brien, T.

587

81, 275, 469, 566,

571

Oleksiak, R., 156

587

Lloyd, R. H. F.,

Oliver, G.,

Lonergan, William, 257, 267-273 Longstaff, F. M., 469 Lonsdale, K, 279 Lourie, N., 469 Low, P. R., 587 Lowry, E. S., 397 Lucking, J. R., 257, 262-266 Lukasiewicz, J., 270

469

Ornstein, Severo M., 73

Orvedahl, W., 119

Osborne, Thomas E., 243-256 Ossanna, J. F., 469 Owen, C. E., 171-183

Padegs, A., 587

D., 291, 456,

McCullough,

J.

McDonough,

E.,

Pascal, Blaise,

341-347

46

Patterson, G. W., 42, 119, 144, 212, 223

469

397

McReynolds, Robert C., 315, 320, 463 Maher, R. J., 273 Marcotte, A. U., 587 Marcotty, M. J., 469

Jones, P. D., 290 Jordan F. W., 46

W„

349

Leibniz, Gottfried Wilhelm, 46 Leiner, A. L., 212, 440-445, 449, 456

MacLaren, M. Donald, 587 McPherson, J. L., 165, 169

Johnston, D. L., 171

Kahn,

P. G., 297,

Newell, A., 257, 349-362

McCarthy, J., 291, 469 McCormick, Bruce H., 315

587

Jacquard, Joseph Marie, 46 Johnston, A. St., 171

Kampe, Thomas W.,

Neumann,

213

Papian, W. N., 279 Parnes, David L., 13

119

B.,

J.

E., 13,

S. A.,

Lehman, M., 393, 446, 456-469

Mauchly, John W., 91 Maudsley, B. G., 171-183 Mauer, H, 156 Meade, R. M., 469

Meagher, R. E., 119 Melbourne, A. J., 392 Mendelson, M. J., 396 Mercer, Robert J., 340 I. W., 171, 176 Merwin-Daggett, Marjorie, 469, 517, 523, 571 Messina, B. U., 587

Merry,

Patzer, William J., 340 Payne, R. B., 274

Peacock, A., 604 Pearcey, T, 469

Penny,

J. P.,

469

Perry, M. N., 504 Peterson, H. P., 469

James L., 212 M. W., 291-300 Pitkowsky, S. H., 574 Plugge, W. R., 504 Polais, S. M., 469 Poland, C. B., 469 Pomerene, James H., 397 Porter, R. E., 449, 477-488 Powers, D. M., 587 Preiss, R. J., 587 Pugmire, J. M., 392 Pike,

Pirtle,

Pyne,

I.

B., 42, 119, 144,

212, 223

Metropolis, N., 119 Mikus, L. E., 469

W.

469 463 Mitchell, Herbert F., 157-169 Molnar, Charles E., 73 Monnier, Richard E., 243-256 Montgomery, H. C, 587 Morris, Derrick, 274 Mueller, 46 Miller,

Miranker,

Muntz, C. Murtha, J.

F.,

W.

L.,

A., 155, 156

C, 320

Myer, T. H., 303

119 J. P., Naur, Peter, 9 Nash,

R. M., 290 Netter, Z., 469

Needham,

Rajchman,

Ramo,

S.,

Ill J., 205-215, 220-224

Randell, B., 77, 274

Reach, R., 469 Reinheimer, H. J., 587 Renwick, W., 346 Richards, R. K, 146, 150 Richardson,

J.

Robbins, R.

C,

Roberts,

R.,

119

171

Lawrence

De

G., 45,

349 Robertson, J. E., 431 Rochester, N., 515 Rose, Gordon A., 304, 469 Rosen, Saul, 3, 42 Rosenfeld, J., 468 Rosin, Robert F., 340, 649 Roberts, M.

V.,

504


Ross, Harold D.,

Rothman,

S.,

Jr.,

525

Stein, P.,

W. Y., 563, 587, 602-606 Stokes, Richard A., 320-333 Stotz, R. H., 507

Rudnick, 111, 119

Stevens,

295 Samuel, Arthur L., 42, 119, 144, 257 Sanderson, J. G., 469 Sasson, Azra, 363-381 Saxon, J. A., 525 Scalzi, C. A., 397 Scantlebury, R. A., 504 Schickhardt, Wilhelm, 46 Schlaeppi, H. P., 457, 463 Schmitt, W. J., 396 Schrimpf, H., 469 Schwartz, J. I., 291 Scott, N. R., 209 Sechler, R. F., 587 Seeber, R. R., 469, 587 Segal, R. J., 509 Senzig, D. N., 463, 469 Serrell, R., 42, 119, 144, 212, 223 Shannon, E. C, 46, 649 Sharpe, William F„ 585 Shaw, J. C, 257, 349-362 Shedler, G. S., 463 Shifman, Joseph, 257, 447-455, 469 Shupe, P. D., 212 Simon, H. A., 257, 349-362 Slotnick, Daniel L., 315, 320-333, 463 Slutz, Ralph J., 210 Smith, J. L., 440-445, 449, 456 Smith, J. W., 587

Strachey,

Saltzer,

J.

von Neumann, John, 86-119 Vyssotsky, V. A., 295, 457, 469

349

Stevens, L. D., 525

470, 485

H.,

C, 469

344 J. B., 200, 335-340, Strube, A. R., 587

Stringer,

Sumner, Frank H., 274-290 Sussenguth, E. H., 13, 587 Sutherland, I. E., 303

Taub, A. H., 92 Taylor,

Norman

H., 144

Teager, Herbert M., 587 Thomas, C. E., 279

Thomas,

L. X., 46

Thompson,

R. N., 455

349

Unger, S. H., 320 Updike, B. M., 340, 587

Smith, R. V., 463, 469 Snyder, 111, 119

Solomon, Martin R., Sparacio, F. J., 587

Jr.,

561

Speierman, K. H., 291, 456, 469 Squire, J. S., 469 Steel, T. B., Jr., 8

Van der Poel, W. L., 200-204 Van Derveer, E. J., 587 Vandling, Gilbert

C, 340

Van Horn,

295, 457

E.

C,

Vareha, Albin L.,

Jr.,

Weber, Helmut, 257, 340, 348, 382-392, 469, 587 Weik, Martin H., Jr., 42 Weinberger, A., 440-445, 449, 456 Weiner, James R., 157-169 Wells, M., 349 Welsh, H. Frazer, 157-169 West, George P., 485 Westervelt, F. H., 81, 275, 469, 566, 571

Wilkes, M.

Turing, Sara, 191, 199 Turn, R., 469 Turnbull, J. R., 587

S.,

Walden, W., 349 Walendziewicz, E. T, 148, 156 Warburton, E. T, 279 Ward, J. E., 507 Ware, W. H., 650

Wheeler, D.

Thornton, James E., 489-503 Tomasulo, R. M., 587 Tonik, A. B., 396 Tucker, S. G, 340 Turing, Alan M., 23, 191, 193

Ulam,


571

J.,

V,

346 84, 139, 200, 214, 334-340,

344, 345, 396, 574

455 193-199 Wilkinson, P. T, 504 Williams, A. P. M., 469 Williams, Charles R., 119 Williams, F. C, 75 Williams, Robert J., 257, 447-455, 469 Wirsching, Joseph E., 316-319 Wirth, N., 257, 348, 383, 389, 392, 469 Witt, R. P., 172, 212 Wolf, K. A., 496 Wooldridge, D. E., 205-215, 220-224 Wright, M. V, 349 Wyld, Michael T, 274 Wilkinson, Wilkinson,

J.

A.,

J.

H.,

Zadeh, Lotfi A., 7 Zemlin, R. A., 496 Zraket, C. A., 504 Zurcher, F. W., 291, 456, 469


Machine and Organization Index Page references

in boldface refer to the

Aberdeen Proving Grounds

ENIAC;

(see

Appendix, ISP descriptions, and

B

EDVAC;

IAS)

ACE

(NPL/National Physical Laboratory), 43, 44, 74, 190, 193-199, 216

39,

ISP, 193-199

191, 193, 198

197-199 ADU/ Accumulation and Distribution Unit T(io),

ComLogNet) AEC/Atomic Energy Commission, 396 AGC/Apollo Guidance Computer (M.I.T. (see

Instrumentation Laboratory), 44, 89,

146-156 D(arithmetic), 150-152

design and construction, 148 interpreter, 147-148 introduction, 146

PMS, 146-148

283, and 300 (Burroughs), 43-44 2500, 2501, and 3500 (Burroughs), 43

B B 5000

(Burroughs), 43, 44, 79, 81, 257-261,

BESK,

language, 45

AMBIT

HIE,

II,

44

language, 45

(see

(see

performance, 470-471 PMS, 470, 471-475, 476, 489-494

RT, 491-494 8090 and 8092

CDC

under

CDC;

G-15; G-20)

CDC

(see

160, A,

CDP/Communications Data Processor ComLogNet) Census, Bureau

of, 157,

164-165

(Lincoln Laboratory), 43

Chasm

special purpose computer, 73

COBOL

BESM, 213

60 and 61 language, 45 Columbia University Calculator, 46

BINAC

COMIT

(Eckert-Mauchly), 43, 91, 163 (Business Information Technology),

language, 33, 45

ComLogNet,

CORC

BIZMAC I, II BTL MACRO

G) (see

C.E.C.E., 39

CG24

Bitran 6 (Fabri-Tek), 44

D825, operating system)

APL/A Programming Apollo (see AGC)

45,

509-510

language, 45

CPC/Card Programmed

(RCA), 39-43 language, 45

Calculator (IBM), 43,

88

CSIRAC, 89

BTSS/Berkeley Time Sharing System

Culler-Fried on line language, 45

(University of California, Berkeley), 44,

Language,

13,

45

Argonne Laboratory, 257 Arithmometer (L. X. Thomas), 46

ARPA/Advanced Research

Projects Agency,

291-300, 315 network, 510-512

Arrow

packaging, 494-496

43, 45,

44

AOSP/Automatic Operating and Scheduling program APEXC, 39

CDC

494-495 489

ISP, 472, 491-493, 497-503 operating system, 472-475

39, 89

BIT 480

AN/FSQ-27 (see RW-40 and 400) AN/GYK-3(V) (see D825 and D830) AN/UYK (RW => TRW), 71

6400, 6416, 6500, 6600, 6700, and 7600,

history, 470,

257-261, 325, 328 B 8500 and B 8501 (Burroughs), 43-44, 64, 257 Babbage's Analytic Engine, 42, 46, 53 Babbage's Difference Engine, 46 Baldwin Calculator, 46 BASIC (Dartmouth College), 45, 236

Bendix =>

1700, 44 3400, 3600, and 3800, 43-44, 348, 396

470-476, 489-503

42-43, 45-46

language, 13, 45, 73, 257, 267, 348

1604, 44, 89

circuits,

B 5500 (Burroughs), 43-45 B 6500 and B 7500 (Burroughs),

Air Force, 137

CDC CDC CDC CDC

43-45, 47, 71, 76, 79, 83, 120, 170, 397,

operating system, 267-268 PMS, 258-260, 268

ALGOL

ALWAC

(see Strela)

ASI 6000 (EMR), 44
Atlas (Manchester University, Ferranti), 43-45, 82, 91, 274-290
  input-output, 274-283, 285-289
  interrupt, 274, 276-277
  introduction, 276
  ISP, 276-279, 283-285
  M(core), 280-283, 289-290
  multiprogramming, 274-283
  operating system, 279, 285-287
  PMS, 277, 279-283, 289-290
  RT, 287-289
ATLAS-1 and 2 (Ferranti), 43
AVIDAC, 39-89

160, 170, 180, 250, 260, 263, 270, 273, 280,

Bell System, 303 Bell Telephone Laboratory computers, 39,

ISP, 152-155

ALPAK

diagrams.

267-273 design, 267 ISP, 268-273

introduction, 193

PMS,

PMS

45, 274-275, 291-300 input-output, 297-300 introduction, 291-292

D825 and D830

291-297 297-300

ISP,

design philosophy, 447-450 input-output, 454-455

M(files),

multiprogramming, 291-295 operating system, 292-300 PMS, 275, 292 T(io), 297 Burroughs

(see

B

2500;

B

5000;

ISP,

DASK, 89

B

5500;

IV)

California, University of, Berkeley (see

BTSS)

Carnegie-Mellon University, 120, 571

CDC/Control Data Corporation

(see G-15;

G-20) 160, A, G, 43, 44, 120

924, 3100, 3200, 3300, and 3500, 43-44,

79

453

operating system, 450-455 PMS, 260, 450-455

B 6500; B 8500; D825; Datatron 204, 205, and 220; E 101, 102, and 103; ILLIAC

CDC CDC

(Burroughs), 44, 45, 257-260,

446-455

Datamatic 1000 (Honeywell), 39, 43 DATANET 30 (GE), 43 Datatron 204, 205, and 220 (Burroughs), 39, 43, 44 DDP-19 (Honeywell), 43 DDP-24, 224, and 124 (Honeywell), 43-44 DDP-116, 316, 416, and 516 (Honeywell), 43-44, 512

DEC/Digital Equipment Corporation PDP-1) DEC 338, 260, 303-314, 396 interpreter, 305 introduction, 305

(see


DEC

FORTRAN

338, ISP, 305-309, 310-314

121

PMS,

(See also

PDP-8)

191 (English Electric), 39, 43-45, (See also ACE) DMI/Data Machine Inc. => Varian Associates,

Deuce

44

DMI DMI

520/1 (Varian), 44

620 (Varian), 44 Postal and Telecommunications Services, 200 Dynamo language, 45 DYSEAC (National Bureau of Standards), 39, 43, 172, 440

Machine, 44, 348, 363-381 interpreter, 366-379 introduction, 363-364 ISP, 363-365 365-381 logical design, PMS, 365-366 RT, 364-368, 375-381 FX-1 (Lincoln Laboratory), 43-45

101, 102,

and 103 (Burroughs),

43,

44

EAI/Electronic Associates Inc., 44 EAI 640, 44 Eccles-Jordan Flip-Flop, 46

Eckert-Mauchly Computer Corporation => UNIVAC, 91 EDSAC I and II (Cambridge University), 39, 42-45, 58, 89, 139, 144, 196, 398

EDVAC/Electronic Discrete Variable Automatic Computer (University

of

Pennsylvania) 39, 42-45, 95 Eight-bit character computer, 170, 184-187,

224

G-15 (Bendix => CDC), 39, 43-44, 74, 191
G-20 (Bendix => CDC), 44, 57, 152
Gamma 60 (Machines Bull), 44, 456
GARDE 312 (GE), 43
GE 100/ERMA, 43
GE 115, 43
GE 205, 210, 215, 225, 235, 255, and 265, 43-44
GE 412, 435, 43-44
GE 635, 625, 43
GE 645 (General Electric), 43, 45, 79, 275
GE 4040, 4050, 4060, 4020, and 4050 II, 43
General Automation (see SPC-8)
General Precision => CDC (see LGP-30)
Genie project (see BTSS)
George (University of New South Wales), 257
Gott Sei Danke, 346
GPS language, 45

H-200

=> ICT/International

Computers and Tabulators (see KDF 9) ENIAC/Electronic Numerical Integrator and (University of Pennsylvania),

39, 42-43, 45-47, 88, 113

ERA/Engineering Research Associates => UNIVAC,

UNIVAC

43, 192

1101, 1102;

UNIVAC

1103 A)

ERMA

(see

GE

and 1460, 43-45,

47, 61, 188,

231-234

PMS, 226

ISP, 184, 185, 186-187

(See also

1401, 1440,

224-234, 562-564 history, 225 interpreter, 229

RT, 229-230 1410 and 7010, 43, 44 1620, III, and 1710, 43-44, 225 1800 and 1130, 43-45, 48, 55, 90, 396, 399-420, 470, 575-576, 579, 583-586 input-output, 405, 409-411 interpreter, 408-409 introduction, 399-400 ISP, 407-416, 417-420 PMS, 400-405, 404 RT, 405-409, 411-413

IBM IBM IBM

IBM 2938, 45, 72 IBM 7030 (see Stretch) IBM 7070, 7072, 7074, 43, 44 IBM 7094 I, II, 7044, 7040, 7090,

709,

and

704, 30-32, 39, 43-45, 47, 54, 64, 70, 79, 91, 149, 303, 396, 422, 433, 515-541,

562-564 515-517 interpreter, 522-523 ISP, 523, 526-541 multiprogramming, 523 P(io), 524-525 PMS, 517-519 RT, 520-522 history,

EMR

Computer

433 1130 (see IBM 1800)

47, 87,

IBM IBM

ISP, 226-229,

introduction, 184

6130, 44 English Electric

702, 39, 43, 47, 87 705, 705 III, 708, and 7080, 39, 43-44,

introduction, 225-226

Dutch

E

IBM IBM

100)

ESS/Electronic Switching System (Bell System), 303 44, 73, 257, 348, 382-392 interpreter (microprogram), 385-392 introduction, 382-383

EULER,

ISP, 383-385, 388-391

PMS, 382-392

series: 110, 120, 125, 200, 400, 1200,

1250, 2200, 3200, 4200,

and 8200

(Honeywell), 43, 44, 58, 225

H-1400 and 1800 (Honeywell), 43 Harvard (see Marks) Hollerith Punched Cards, 46 Honeywell (see Datamatic 1000; DDP-19; DDP-24; DDP-116; H-200; H-1400) Host computer (see ARPA network) HP/Hewlett-Packard (see HP 9100A) HP 9100A, 44, 235-236, 243-256 D, 243-244, 254-256 ISP, 243-249 microprogram, 254-256 packaging, 250, 252-253 PMS, 235, 249-254 RT, 250 T, 243, 248, 253

IBM IBM IBM

Multiplying Calculator, 46 Stretch (see Stretch) System/360, 43-45, 61, 64, 303, 396

addressing, 565-566, 594

array processor, 576-579

base register, 594 (See also addressing above) bibliography, 587

branch instructions, 595 channel-to-channel adapter, 576 circuits, 564,

cost,

603-604

579-585

critique

by authors, 561-587

data types, 564-565, 590-594 design, 561-564, 588

Fabri-Tek (see Bitran FACT language, 45

direct control, 597

6)

Ferranti Corp. Ltd. => ICT/International Computers and Tabulators, 39 (See also Atlas; Mercury; Pegasus)

FLAC (Florida), 39 FOCAL (DEC) language,

FORMAC FORTRAN

236 (IBM) language, 45 (IBM),

IV language,

FORTRAN 45, 50, 73,

Advanced Studies machine von Neumann)

IAS/Institute for (see

IBM ASP/ Attached-Support Processor, 506 IBM 305 (disk), 43, 45 IBM 650, 39, 43, 44, 91, 216, 220-223 ISP,

IBM II,

348

FORTRAN

220-223

701, 39, 43-45, 47, 89, 515-516

floating point, 591-592 functional schematic, 589

general registers, 564-565 history, 561 (See also design above)

information formats (see data types above)

PMS, 515 (See also

(See also input-output below) emulation, 562-563

IBM

7094)

innovations, 562

IBM

System/360, input-output, 588, 598-601 [See also P(io; data channels) below; PMS

and

PMS

diagrams below] interpreter, 594-595, 604-605 interrupts, 596-597 introduction, 561, 588 ISP, 564-566, 588-601 logical structure, 588-601 (See also ISP above) M(content addressable), 571, 573-574 M(Large capacity store), 571-572, 582-583 M(read only), 604-605 (See also

microprogramming below; Models 30, 40, and 50 below) microprogramming, 563-564, 604-605 (See also Models 30, 40, and 50 below) model range, 561-564, 588, 602-606

IBM/System

360,

LARC (UNIVAC), 43-44,

and PMS diagrams, 580

PMS

T(print, punch,

design philosophy, 456-457

573 S(cross-point; time-multiplexed; BCU), SLT/Solid Logic Technology, 564, 603-604 storage protection (see multiprogramming

above) storage-to-storage channel, 576-577 SVC/Supervisor Call, 597

simulation, 463-469

Leprechaun (Bell Telephone Laboratories), 43 LGP-30, and LGP-21 (General

ASCII, 593 decimal, 593-594

Precision

EBCDIC, 592

ILLIAC

microprogram, 382-385, 388-392 RT, 386 67, 76, 79, 275, 561, 563, 571,

573-574

Model 75, 561, 563, 571 Model 85, 76, 561, 563, 574-575 Model 91, 561, 563, 575 Models 30, 40, and 50, 561, 563,

566, 568,

602-603 563, 571-572, 582-583, 602-603 585-587 multiprocessing, 456-469, multiprogramming, 565-566, 571, 573-574,

Mp,

9) Illinois), 39,

43-45,

320-330 input-output, 322, 327-328 interpreter, 322-325 introduction, 320-321 ISP, 322-325, 330-333 PMS, 321-322, 327-329 K(P), 322-323 RT, 326

LINC/Laboratory Instrument Computer

IBM ASP)

performance, 563, 579-587, 602-606 P(io; data channels), 573-574, 576-577, 598-601, 605-606

and PMS diagrams, 563, 566-579, 602-606 K(special controls), 576 Model 20, 567 Model 44, 569-571, 569 Model 67, 571, 573-574, 573 Model 75, 567, 571-572 Model 85, 574-575, 575 Model 91, 575 Models 30, 40, and 50, 65, 566-568, 566-567 Ms (data cell, disk, drum), 577, 579 576 P(special), 576-578, 576 S(c), 579, 581 (See also networks above; T(analog), 581 T(audio), 579 T(display), 579

of,

LRL

Livermore, California, 396-397 network, 507

MAC-16 (Lockheed

interpreter, 351, 354-355, ISP, 354-359, 361-362

Electronics), 44 language (University of Michigan), 45 MADM/Manchester Automatic Digital Machine, 39, 58 Manchester University, 39, 45, 340

MAD AGC)

44,

(See also Atlas;

MANIAC

348-362 design, 349-350

359-362

RT, 352-354 IPL VC, 257

MADM;

Mark

I;

Muse)

(University of California,

Michigan, University 209-212, 571

M.I.T.

262-263

II

I, II, III, and IV (Harvard), 39, 42-43, 46 Mathmatic language, 45 MEG, 39 Mercury (Ferranti), 39, 279

KDF

ISP,

and

Los Alamos), 39, 43, 89 I (Manchester University), 43

MIDAC

9 (English Electric), 44, 257-266

I

Mark Mark

Jacquard Punched Card Loom, 46 JOHNNIAC (RAND), 43-44, 78, 89 JOSS (RAND) language, 45, 78 JOVIAL (SDC) language, 45

introduction, 262

CG24; FX1; LINC; MTC; TX-0,

43

IPL I, II, III, IV, and V, 45, 257 IPL VI/Information Processing Language,

D, 263-266

IBM ASP)

(See also

LRL/Lawrence Radiation Laboratory,

under ILLIAC) computer (see ARPA, network)

578

P(array), 576-578,

PDP-8)

(see

Lincoln Laboratory (M.I.T.), 571

1.5 language, 45 Lockheed Electronics (see MAC-16) Los Alamos (see AEC)

(See also

45, 73, 257,

PMS

tape), 578-579,

University

LINC-8 (DEC)

TX-2) LISP 1.0 and

66, 72, 315,

Illinois,

44, 45, 74, 91, 192.

(M.I.T. Lincoln Laboratory), 43, 44, 120

Instrumentation Laboratory, M.I.T. (see Interdata, Model 3 and 4, 44, 184

networks, 576-579, 581, 598

Ms(magnetic

KDF

(University of

ILLIAC II (University of Illinois), 43 ILLIAC III (University of Illinois), 43, 351 ILLIAC IV (University of Illinois), 43-45, 47,

IMP

597-598 (See also

I

=> CDC),

216-219 ISP, 217, 218-219 PMS, 217

89

602-606 ISP, 385-388

Model

operating system, 461-463 performance, 456-457, 463-469

Leibniz Calculator, 46 LEO I and II, 39

variable-length character strings, 591

(See also Atlas;

457-461

instructions,

interrupt, 458-461 introduction, 456

PMS, 459-461

system implementations, 602-606 timer, 597

Model Model

569

Research),

application, 464-469

RT, 568, 570, 572, 603-604

(See also performance below) Model 20, 563-567 30, 236, 348, 382-392, 566-568,

396-397

44-45, 446, 456-469

581

T(telephone, typewriter), 579, 596-598 processor state, 564-565, 588,

ICT/International Computers and Tabulators, 91, 274

25, 184, 563, 567,

86,

Lehman Computer example (IBM

read),

of,

MAD, MIDAC,

(Michigan, University

of), 39, 44,

192,

192,

209-212 ISP, 209-212 MILSMAC, 347 MISTIC, 43

CTSS operating system, 45 M.I.T./Massachusetts Institute of Technology (see AGC; GE 645; Lincoln Laboratory;

MULTICS

project;

Whirlwind

PMS, 260

M.I.T. network, 507

RT, 264

Monorobot, Monorobot XI, 39, 44

I)


Monroe Calculator, 46 Monroe Corporation, 46 Moore School of Electrical Engineering

PDP-8, 8S, (see

RT, 125, 127-131 (See also 338)

Test Computer (M.I.T. Lincoln Laboratory), 39, 45, 89 Mueller's Difference Engine, 46

MTC/Memory

MULTICS

project (M.I.T.), 45, 571 (Manchester University), 43, 277

275, 564

ISP,

(see

DYSEAC; PILOT; SEAC)

Pegasus (Ferranti), 44, 62, 170-183, 564 circuits, 171-174, 176 introduction, 181 ISP, 176-179, 182-183

179-181 packaging, 174-176, 179-182 logical design, 172-175,

Neher Laboratory, 200 Network of Computers, 504-512, 505-512 ARPA, 510-512 ComLogNet, 509-510 IBM ASP, 506 LRL, 507 M.I.T, 507

SABRE, 504 SAGE, 504 typical,

NOVA

of,

506-507

Philco 212, 44

PILOT

(National Bureau of Standards) 39, 43,

44, 75, 397-398, 440-445,

101

508-509 44

(see RW-40 and 400) Desk Calculator

ACE)

PUFFT,

California, Berkeley), 43-44, 79, 275,

291-300, 542

Calculator)

ONR/Office of Naval Research, 137 89 (University of

Illinois), 39,

43, 89

RAND

Corporation (see JOHNNIAC; JOSS) (Raytheon), 39

BIZMAC

RCA RCA RCA RCA RCA

= Raytheon

(see

PB-250;

PB-440) PB-250, 44, 74, 191

circuits,

America

(see

70 Series)

110, 43, 44 301 and 3301, 43 501 and 601, 43, 44, 225 1600, 184

Spectra 70, 561-562 I,

II,

and

III,

44

RW/Radio Wooldridge (see AN/UYK) RW-40 and 400 (Thompson, Ramo,

design philosophy, 477

808, 816, 44

8S, 8I, 8L, and 120-136, 396 applications, 120

of

SPECTRA

Wooldridge), 44, 53, 192, 400, 470-471,

PDP-1 (DEC), 44-45 PDP-4, 7, 9, and 15, 43-45 PDP-8,

II;

477-488

PB-440, 334

PDC

I,

Rice University computer, 45, 53

Pascal Calculator, 46

5,

20-32, 43-44, 49, 90,

BTSS)

SEAC

(National Bureau of Standards), 39, 43-45, 172, 192, 209-212, 440 SEL/Systems Engineering Laboratories, 44

SEL

Recomp Bell

University),

compiler, 45

RCA/Radio Corporation

PB/Packard

(See also

RAYDAC

Olivetti-Underwood (see Programma 101 Desk

556-560

275, 543, 546-548, 546

RT, 550-552 (See also BTSS) SDS 940 and 945 (SDS, University of

120

317-318 RT, 318 NPL/National Physical Laboratory, 45 ISP,

551-552

ISP, 544-545, 548-550,

PMS,

PMS, 237-238, 237 Programmed Console (Washington

applications, 316-317 introduction, 316

910, 920, 925, 930, 9300, 43, 44, 91, 291,

interrupt, 553-555 introduction, 542-543

ISP,

Laboratory), 44, 66, 315-319

92, 44, 120

542-560 history, 542-543

237-242 237-242

(LRL/Lawrence Radiation

ORDVAC

SDS SDS

interpreter,

(Olivetti-Underwood), 44, 216, 235,

ORACLE,

SDC/Systems Development Corp., 45 SDS/Scientific Data Systems => XDS/Xerox Data Systems (see SDS 910; SDS 940 and 945; Sigma 2 and 3; Sigma 5 and 7)

input-output, 543-545, 552-555

449

440 input-output, 444-445 ISP, 442-444 performance, 440-442 PMS 398, 440-442 applications,

Programma

39,

(See also

ENIAC)

Polymorphic (RW)

Texas, University

NORC,

43, 46, 95 (See also EDVAC;

343-347

microprogram, 345-346 packaging, 341-343 PMS, 343 RT, 343-345

Pennsylvania, University of (Moore School),

NBS/National Bureau of Standards

SD-2 (Librascope), 44, 334, 341-347 design, 341-343 interpreter, 550-552 introduction, 341

PDP-10 and 6, 43-45, 79, 170, PDP-12, LINC-8 (see PDP-8)

39

Motorola 1000, 44-45

Muse

M(core), 128-129 121, 124, 126, 128

5,

DEC

Pennsylvania, University of)

MOSAIC,

8L, and

81,

PMS, 20-21, 123-131,

481-482 480-482 ISP language, 486-488 PMS, 471, 477-480, 482-485

interrupt, ISP, 470,

810, 44 Sigma 2 and 3 (SDS => XDS), 43-44, 78 Sigma 5 and 7 (SDS => XDS), 43, 170, 396, 564 SILLIAC, 89 SIMSCRIPT language, 45 SIMULA language, 45 SNOBOL language, 45 SOL language, 45 SOLOMON, 315, 320 Soviet Academy of Sciences, 213 SPC-8 and 12, 44 SPECTRA 70 Series (RCA), 43 SS 80 I and II (UNIVAC), 43 Strela/Arrow (Russian), 44, 192, 213-215 ISP, 213-215 Stretch/IBM 7030, 43-45, 47, 91, 396-397, 421-439 arithmetic, 428-431 circuits, 433-438 D, 427-431 input-output, 421-422 interrupt, 423

introduction, 421

132-133

ISP,

422-424

input-output, 123 interpreter, 131

SABRE

SAGE/Semi-Automatic Ground Environment

look-ahead, 426-428

interrupt, 123 ISP, 22-33, 120-123, 127, 134-136

network, 45, 504 SCC/Scientific Control Corp. 650, 120 Schickhardt Calculator, 46

packaging, 432, 438-439 performance, 421-423, 425-426, 431-433

Logical design, 127-133

network (American

Airlines), 45,

504

K(P),

424-428

PMS, 421-423, 425-426


Stretch/IBM 7030, RT, 426-431 Subscriber Station (see ComLogNet)

SWAC,

39,

System/360

43 (see

IBM

of,

and 400) Turing machine, 23 39, 43-45,

274

UNIVAC UNIVAC UNIVAC

1050, 43, 44

490, 491, 492, I,

II, III,

and 494, 43-44 1005 I, II, and

WEIZAC, Whirlwind

UNIVAC UNIVAC

III,

UNIVAC UNIVAC UNIVAC

I

(M.I.T.), 10, 39, 43-45, 55,

1101 and 1102, 39, 43

interpreter, 140-141 introduction, 137-139

1103A, 39, 43, 44, 48, 62, 192,

ISP, 145 K, 139-143

M, 141 packaging, 141-143

1105, 39, 43 1108, 1107,

470

applications, 138

D, 142

and 1106,

10,

43-45, 62,

564 1206, 43 1212 (Military), 43 9200 and 9300, 43

170, 192, M.I.T.),

43, 89

58, 90, 137-145, 303,

205-208 ISP, 205-208

TRW/Thompson, Ramo, Wooldridge (see RW-40 TX-2 and TX-0 (Lincoln Laboratory,

II and III, 39, 43-45 418, 1218, and 1818, 43-44

1004 43, 44

System/360)

network, 506-507 Toronto University Computer, 44 TRAC language, 45 TRE, 39

Texas, University

UNIVAC UNIVAC UNIVAC UNIVAC

PMS,

90, 138-139

Wilkes'

microprogrammed computer example, 44, 335-340

design, 335-337 introduction, 335

ISP, 337-340

UNCOL

microprogram, 339-340 RT, 336

language, 8-9, 13

Army Ordnance Department, 92 UNIVAC, 39, 43-45, 48, 91, 157-169 U.S.

applications, 164-165 design constraints, 163

input-output, 158, 161-162 interpreter, 159-161 ISP, 157-160 performance, 164-168 PMS, 158 reliability, 165-169 RT, 157-160 T(io), 161-163 (See also SS 80 I and II)

Varian Associates (see under DMI) von Neumann/IAS/Institute for Advanced Studies, 39, 42, 44, 58, 89, 92-119, 152,

398

XDS/Xerox Data Systems

(see

SDS)

applications, 92-93

checking, 118 D, 96-111 design constraints, 92-93 input-output, 92, 117, 119 interpreter, 111-119 ISP, 111-119

M, 94-96

ZEBRA

(Standard Telephones and Cables, 200-204, 216

Ltd.), 44, 191-192,

introduction, 200 ISP,

200-204

PMS, 201 ZUSE Company,

39, 42

Subject Index

Page references in boldface refer to the Appendix, ISP descriptions, and PMS diagrams.

abbreviation/, 19, 607, 609

acceptance test, UNIVAC, 165-166

arithmetic element, Whirlwind, 142 arithmetic expression, 614

access-i-unit-operation, 633 access-time, 620-622

arithmetic-function-operation, 614 arithmetic organ, von Neumann, 98

accessing algorithm, 41

(See also D/data-operation) arithmetic unit, KDF 9, 263-266

accumulator, ZEBRA, 202 accumulator register, 59-60, 98

array instructions,

array processor [see P(array)]

[See also under action «-, 23-24,

Information Interchange, 593 assemble instruction, 457-458

ASCII/ American Standards Code

line)]

action-sequence, 23, 631 actual address, 76-81

assignments, associative

(See also physical address)

23, 607,

memory

bulk core

IBM IBM

609

attribute: value pair (see attribute; value)

address-range

[

address-size, P,

612-613

auto index register, 120-122, 134

447

Lehman computer, 456-457

24, 631-633 626-627

available space

],

list,

IPL

VI,

352-353

addresses-per-instruction, P, 57-63, 627 (See also instruction format)

addressing (see

memory

addressing;

memory

mapping; multiprogramming) addressing system, memory, 16

aerospace computer, 146-156 algorithm-encoding-efficiency, P, 627 alias/, 19, 607, 609

423

System/360, 591

(see computer) C(l Pc), 40-41, 63-70, 395 C(l Pc-nPio), 40-41, 63-70, 396-398 capital letters, 609 card, IBM, 617 carrier, 618 data-type, 629-631 carry, 98-99 casting out three, Stretch, 431

central processor [see P(c)]

channels [see

B

line:

character-base, 631-632

277-278 Manchester University, 340

character/char, 616 character generation instruction, 308 character string, 184-185

Atlas,

(See also index register) 6600, 474, 489-491

barrel,

CDC

and A, 25

base register,

(See also n-ary-boolean-operation)

Stretch,

(see bit)

base, 24, 55-56, 614, 616, 631

,

M/memory)

b

alphabet, 609, 613 alternation indefinite expression, 17, 610 |

(see

C/computer

RW-400, 477-479

address-expression, 631-632

memory

memory;

attribute-list,

availability,

(See also variable-length character string)

Stretch, 431

UNIVAC,

MIDAC, 210

bilinear switch,

160-161, 168-169

Whirlwind, 143-144

bench-mark, 52

applications:

P(io)]

checking:

base-data-type, 630-631

antecedent, 619

623-624



circuit level, 4

Lehman computer, 464-469

614, 633-635 binary-arithmetic-operation H binary-boolean-operation, 615, 633-635

NOVA, 316-317

binary-decimal conversion, 211

PDP-8, 120 PILOT, 440

binary machine, 87-88

Pegasus, 171-174, 176

UNIVAC

binary-operation, 28, 633

Stretch,

(See also ISP,

164-165 von Neumann, 92-93 Whirlwind I, 138 I,

IBM

bit/binary-digit, 611,

619

multiple-precision,

429-430 serial, 428-429 Stretch, 428-431 parallel,

433-438

609-610 cocomponent, 617 class,

block, 617

co-incident current memory [see M(core)] colon 19, 612-613, 631 (See also attribute: value pair) :

PMS

diagram;

PMS

level;

RT)

ZEBRA, 204 BNF/Backus-Normal Form (Backus-Naur

block transfer,

AGC, 151-152

6600, 494-495

PDP-8, 132-133

616-617

block diagram (see

arithmetic:

CDC

component count, 431-432

archival

area, 617,

circuits:

component count, 470-471

binary-value, 611 bit string, 317-318 (See also data-type, Stretch)

[see M(archival)]

,

System/360)

approximation—, 607-608, 610 architecture, 562 (See also ISP; under PMS)

memory

608-609,

by/byte, 616

D825, 447-448 adder, Pegasus, 174 addition, von Neumann, 98-99

-,,

ACE, 198

buzzer,

M(associative)] attribute, 19, 607, 612-613

adaptability:

D V A

bus, 10 (See also S/switch) business computer (see function)

for

[see look-aside



633-635 branch instruction, 595 breakouts, IPL VI, 350-351 buffer module, RW-400, 482-484

NOVA, 316-319

accuracy, HP 9100A, 246, 256 acoustic delay line, 96

M(delay 631-632

boolean-operations

Form), 9 boolean, 608, 615 boolean-expression, 615

,

combinatorial circuits, 5 comma, 611

commands, 608-610 (See also abbreviation; assignment, form; variable)

COMMENT,

608

comments, 608 communication computer (see function) communication multiplexing, 505 compiler, EULER, 391-392 complex, data-type, 631

complex number arithmetic, 246, 255-256 component: data-type, 629-631 PMS, 616-619 component-function, 617 component-name, 617 compound computer, 628 compound-link, 619-620

compound name

(see

name)

computer, 628 control, 146-156 duplex, 66 computer levels, 3-11 PDP-8, 126-127 computer model, 63-66

computer-space dimensions, 40 concatenation 24, 631-633

,

decimal digit, 616 decimal machine, 57, 87-88 decimal-name, 614 DECtape, 124-126

concurrency-type, 617 condition, 23, 631 condition codes, IBM 1800, 407 conditional micro-order, 336-337

IBM

delay,

1,

contextual addressing, 267-268 continuous-modulation, 618

B

SD-2, 341-343 desk calculators, 235-256 destination address,

Neumann, 111-119

controlled-operation, 624-625 conversion, 615-616

element-range/< ellipses.

.

.

,

626-628

memory, 75 >,

24,

631-633

608, 610

emulation, 562-563 encode, 16

encoding, 618 entity, 608, 611-612 error-rate, 617, 619 evoke operation →, 23, 631, 633 EXAMPLE, 608 excess three code, UNIVAC, 163 expansibility criteria, D825, 448

607-608, 611-612 607-608, 610

extended core store/ECS, CDC 6600, 473 external execute instruction, 458 extra codes, 597 AGC, 154-155 Atlas, 274-278

5000, 271-272

Lehman computer, 456-457

control-operation, 633

electrostatic

relational-expression) expression-variables, 608

D825, 447-450

RT)

59-60, 636-637 efficiency, processor,

count-expression; dimension-expression;

design philosophy:

(See also interpreter; K/control; control computer (see function)

228

effective address calculation process, ISP, 28,

line)]

descriptor, 79-81

624-625 ILLIAC IV, 322-323 Stretch, 424-428 Whirlwind, 139-142

control,

edit instruction,

optional, 613 (See also boolean-expression;

620

delay line [see under M(delay

Interchange Code, 592

ECL/Emitter Coupled Logic, 320

indefinite,

dequeue switch, 623-624 descendants, 619

addressable)]

EBCDIC/Extended Binary Coded Decimal

definite,

=

1800, 400-403

[see M(drum)] dynamic data types, 383 dynamic storage allocation, 383-384

expression, 608

definite expression, 607-608, 611-612 definition: (see assignment)

construction (see packaging) content addressable memory [see M(content

control-organ, von

616 D(Stretch), 427-431 data break, PDP-8, 124-126 data channel [see P(io)] IBM 7094, 523-525 SDS 900 series, 543, 546-548, 552-555 data-expression, 631 data field register, 120, 523 data flow, Stretch, 425-428 data-operation, 17, 23-36, 626 data-operation definition, ISP, 636-637 data-operations table, 633-635 data programs, IPL VI, 360 data structure, IPL VI, 351, 354 data-type, 23-36, 57 ISP, 629-631 P, 626-628 Stretch, 423-424 data-type format, ISP, 636-637 data-type-name, 629-631 digit,

decimal, 614

concurrency, 617-618

configurator,

drum

626 D/data-operation, 17, 23-36, d/decimal

(See also syspop)

ACE, 194-199

digital

computer

fabrication (see packaging) family tree of computer design, 39

digits,

609

fast

(see C/computer) digital differential analyzer, 304

Fourier transform, 73

225-226

conversion-arithmetic-operation, 633-634

dimension, 608, 615

features,

Cooley-Tukey algorithm, 73 cooling, 470

dimension-expression, 615 direct access communications channel, 900 series (see data channel)

fetch-execute cycle (see interpreter)

Pegasus, 181

UNIVAC, 163 memory [see

core cost,

direct

M(core)]

616-617, 619

memory

direction,

access,

PDP-8, 124-126

618

Lehman computer, 459

count-expression, 614

discrete-modulation, 618

country, 619 cross-point switch, 267 crossbar switch, [see S(crossbar); under S(cross-

disk, 74, 577,

CRT/Cathode Ray Tube

display [see under

T(CRT)]

cyclic

memory

cyclic switch,

file,

data-type, 631

617

BTSS, 297-300 control (see function) fixed point (see data-type)

579

display processor [see P(display)] distribution, switch, 623-624

number-data-type, 630-631 fixed structure network, flag bit,

IBM

504

1401, 226

floating point, 97

divergence, T, 625-626 divergence-rate, T, 625-626 divide step, SDS 900 series (see ISP)

Atlas, 277-278,

division:

283-285

B

5000, 268-270 HP 9100A, 243-256

IBM

7094, 527

memory)

nonrestoring, 107-111

KDF

9,

[see M(cyclic)]

restoring, 107-111

number-data-type, 630-631 SDS 900 series, 544-545, 549-551

current, 616

cycle-time (see

field,

file

directive instructions,

point)]

SDS

623-624

UNIVAC,

159

263-266


floating point, Stretch, 429-431,

UNIVAC

433

1103A, 208

Wilkes example, 335 form, 607, 610 format, data-type, 629-631 full-duplex, 617-618 function, 37, 40, 46-49 business, 47-48 C, 618 communication, 48 component, 617 control, 48 file control, 48 operation, 28 P, 626-627 scientific, 47 T, 625-626 terminal, 48-49 time-sharing, 49 fork instruction, 325, 457

CDC

functional units,

information length, 16 information-rate, 617-618

information units, 616-618

instruction-memory, P, 627-628

inhibit drivers [see M(core)]

instruction modification, 209-210

input-output:

instruction-set, 25

ACE, 197-199 Atlas, 274-283,

P,

(See also ISP) instruction-size, P,

117, 119 instruction:

DEC 338, 308-309 DEC 338, 307-308

general registers: 8-bit character computer, 184-187 Pegasus, 176-179 generations (first, second, third, and fourth), 39-40, 43-46

Gibson mix, 49-50

hexa-decimal-digit/hex, 616 hierarchy (see structure) switch, 623-624

631-632

Lehman computer, 457-461

instruction

backup 520-522

register,

IBM

7094,

instruction buffers, 84

ILLIAC

IV,

323-324

(See also look-ahead; look-aside) instruction decoding diagram, 122-123, 184

address/stack, 62-64, 257-261 stack:

B 5000, 267 memory (see M/memory)

B

KDF

high-level language,

5000, 268-273 9;

262-266

1+ 1

+

1+

general register (see general registers) 1 address, IBM 650, 220-223 index address, 58-60, 87-91

2 address, 60-61

617-618

RW-400,

1103A, 205-208 3 address, 60-61

MIDAC, 209-212 Strela,

213-215

general registers, 61, 64 (See also general registers) IBM 1800, 407-408, 410-411

i-unit-prefix

IBM-card, 617 iconoscope tube, 94 illegal instruction,

470, 480-482, 486-488

UNIVAC

data-type, 629-631

ISP, 25,

BTSS, 293

indefinite expression, 607-608, index*, 20, 613

n 610

index register, 59-60 information, 616 information base, 24, 55-56, 614, 616, 631 information-content, data-type, 629-631

interlace (see data channel,

interleaving (see

memory

+

1

636-637

address, 61, 191

SDS 900 variable

series,

544-545, 548-552 of addresses per

number

instruction, 63 instruction highway, ACE, 197

instruction interpretation process, ISP,

636-637

463-469

SDS 900

series)

interleaving)

interpretation-cycle, 22-36 (See also interpreter) interpreter,

22-36

AGC, 147-148

DEC 338, 305 EULER microprogrammed, 385-392 FORTRAN Machine, 366-379 IBM IBM IBM

1401, 229 1800, 408-409 7094, 522-523

ILLIAC

IV,

322-325

IPL VI, 351, 354-355, 359-362 ISP, 636-637 PDP-8, 131 Stretch (see instruction unit)

hyphen-name, 613-614

length, 616 i-unit-name, 616

integer-name, 614

integrated circuit memory (see M/memory) interaction controller, Lehman computer, 460 interaction function, Lehman computer,

AGC, 149-150

hyphen-, 25, 607

base-unit, 616

integer-name, 614

SDS 900

1

i-unit/information unit, 16, 616-618

+ —

address, 58-60, 64, 87-91

high-speed core history, 38-46, 617, 619

irate,

integer-data-type, 630-631

458-461

Instruction-execution, ISP, 25-36, 637 instruction execution process, ISP, 637 instruction-expression, 23, 631-632 instruction format:

Half-duplex, 617-618

+

integer-name, 614

interference, processor-memory, interflow, 151

instruction-efficiency, P, 626-627 instruction examples, ISP, 632, 635-637

graph-plot instructions, 308

integer-data-type, 630-631

control,

special,

and ISP, 607-615

626-627

instruction-source, K, 624-625 instruction unit, Stretch, 426-427

PDP-8, 123 PILOT, 444-445 SDS 900 series, 543-545, 552-555 Stretch, exchange, 421-422 UNIVAC I, 158, 161-162 input and output organ, von Neumann, 91,

ISP,

PMS

K,

IBM 1800, 405, 409-411 IBM 7094, 524-525 ILLIAC IV, 322, 327-328

6600, 473, 494

636-637 624-625 626

ISP,

285-289

BTSS, 297-300 D825, 454-455

data,

gate tubes, 112-119 general conventions,

instruction interpreter (see interpreter) instruction look-ahead (see look-ahead)

series,

550-552

UNIVAC, 159-161 von Neumann, 111-119 Whirlwind I, 140-141 interprocess communication, 41

interprogram communication, 81-83 interrupt/interprocess interrupts, 82-83, 411 Atlas,

B

274-283

5000, 267-272

D825, 452-453

Lehman computer, 458-461 PDP-8, 123 RW-400, 481-482 SDS 900 series, 553-555 Stretch, 423 interrupt-response-time, P, 626-627 intraprocess interrupt/trap, 82-83 (See also extra codes; trap)

I/O Bus: PDP-8, 124-126 SDS 900 series (see input-output)


large capacity store/LCS, 571-572,

ISP/Instruction-set Processor, 12, 22-33

ACE, 193-199 AGC, 152-155 Atlas, 276-279,

B

length, 616

M(p/primary memory),

level,

LINCtape, 124-126

CDC DEC

lineage, 617, 619 linear switch, 623-624

338, 305-309, 310-314

link,

8-bit character

computer, 184, 186-187

list,

HP

list

9100A, 243-249 650, 220-223

list

607, 611

384

processing, EULER, structure, IPL VI, 350

1401, 226-229, 231-234 1800, 407-416, 417-420

literal syllable,

7094, 523, 526-541

logic diagrams,

ZEBRA,

127,

logical structure (see ISP,

544-545, 548-550, 556-560

ISP conventions, 628-637 italics, 24, 608

IBM

IBM

System/360)

287-289

6600, 492-494 7094, 550-552

ILLIAC

IV, 323-324

424-428 574

Stretch, 397, 422,

look-aside

memory,

[See also

84,

457

K/control, 16-22 (See also control)

616 kernels, 464 k/kilo,

keyboard: HP 9100A, 235, 244-249, 251-253

memory)

Programma

101;

237-242

[See also T(keyboard)]

L/link, 16-22, 619-620 label,

612

labeled-entity, 612 language, 9

(see

memory map; multiprogramming)

464-466

M(associative), 76 [See also M(content addressable)]

medium, 618 memory, 620-622 access-time, 620-622

M(core), PDP-8; 128-130

cycle-time, 620-622 function, 620-622

M(cyclic), 73-74

information-rate, 620-622

M(delay line; ACE, Deuce), 191, 193-199 M(delay line; Pegasus), 173-174, 177 M(delay line; UNIVAC), 163 M(drum), 74

operations, 620-622

(See also look-aside)

M(electrostatic;

M(fixed-head

Whirlwind

disk),

M(fixed-head disk;

I),

141

ILLIAC

IV), 322,

card;

HP

card;

Programma

tape),

327-328

9100A), 248-249, 253 101),

74

IBM format), 126 tape; RW-400), 483 tape;

permanency, 620-621 620-621 primary, 621 portability,

[See also M(p)] processor state, 621

74

M(large storage; Whirlwind), 137-138, 141 M(magnetic card), 74

M(magnetic M(magnetic M(magnetic M(magnetic M(magnetic

142-143

marks, 609 master control program, B 5000, 267-268 master slave schemes, D825, 449 matrix multiply problem, Lehman computer,

M(bulk core), 74 M(content addressable), 74 join instruction,

I),

magnetic card [see M(magnetic card)] magnetic tape [see M(magnetic tape)] magnetic wire memory, 96 main line of computers, 87-91 maintenance: ILLIAC IV, 328-329 Pegasus, 181-182 UNIVAC, 165-169 Whirlwind I, 138-139, 142-143 manufacturer catalog number, 617 manufacturer name (see proper-name) manufacturer-type, 619

map

M(content addressable)]

M/memory, 16-22 (See also

5000), 269-271

film;

machine-independent language, B 5000; 267 macro-parallelism, 456, 463

look-ahead: Atlas, 281-285,

B

D825), 453-454 M(toggle switch; Whirlwind M(UNIVAC), 158, 164

M(thin

memory mapping; multiprogramming) logical design level, 5 FORTRAN Machine, 365-381 PDP-8, 127-133 Pegasus, 172-175, 179-181

CDC

200-204

M(stack;

(See also

SD-2, 343-347

Whirlwind, 140-141, 145 Wilkes example, 337-339

M(stack), 73

BTSS, 291

PILOT, 442-444 Programma, 237-242 RW-40, RW-400, 470, 480-482, 486-488 series,

M(s/secondary), 74

logical address, 76-81

134-136

213-215 Stretch, 422-424 UNIVAC, 157-160 UNIVAC 1103A, 205-208 von Neumann, 111-119

5000; 272

PDP-8, 127-133 logic equations, PDP-8, 127-133 logic technology, 40, 617-618

Pegasus, 176-179, 182-183

Strela,

B

623-624

location, S,

System/360, Model 30, 385-388 ILLIAC IV, 322-325, 330-333 IPL VI, 354-358, 361-362 KDF 9, 262-263 LGP-30, LGP-21, 217, 218-219 MIDAC, 209-212 NOVA, 317-318

PDP-8, 22-25, 26-27, 28-33, 120-123,

M(read only), 604-605 M(read only; capacitor; System/360; Model 30), 385-387 M(read only; HP 9100A), 235, 250-253 M(read only; rope; AGC), 146-147

port-to-port delay, 620

EULER, 383-385, 388-391 FORTRAN, 363-365

SDS 900

[See also T(punch)] M(queue), 73 M(random), 75

619-620 delay, 620

D825, 453

diskpak), 74 17, 24, 74

M(p; concurrency), 41, 76-81 M(p; size), 41 M(photostore; IBM), 507 M(punched card), 74

system, 3-4

BTSS, 292-297 6600, 472, 491-493, 497-503

M(magnetic tape; Uniservo), 157

M(moving head

length-type, data-type, 629-631

283-285

5000, 268-273

IBM IBM IBM IBM IBM

582-583

lattice (see structure)

237-242

secondary, 621 [See also M(s)] size,

620-622

technology, 620-622 (See also M/memory; memory memory access algorithm, 73 memory addressing: AGC, 155-156

organ)


142 multiplication, Whirlwind,

memory addressing: (cont.) SDS 900 series, 542, 549-550 memory bus, Stretch, 422, 426

multiplier,

(See also S/switch) memory declaration, 36

memory-expression, 631-632 interface connection, 543, 546-548, 555

SDS 900

memory memory Atlas,

CDC IBM

series,

operation-code-size, processor, 626-627

multiply step, SDS 900 series (see ISP) multiprocessing, 446-469 multiprogramming, 76, 274-275, 456-469

operation-expression, 631-635 }, 30-32, 631-632 operation-modifier/ { operation-rate, port, 617-618

Atlas,

BTSS, 291-295

289-290

(See also

IV, 322-324,

Stretch, 397,

multiprocessing;

n-ary operation, 633 name, 607, 609, 613-614

memory mapping, 77-80 (See also multiprogramming)

Neumann, 92-96

IBM 1800, 408 BTSS, 294-295

protection,

message concentrator, 120 message switching, 505 metanotation, 607-609 micro-operation, Wilkes, 335-337, 339 micro-order:

System/360, Model 30, 385-388

Lehman, 456

micro-programme, Wilkes, 335 [See also P(microprogram)] microprogram: control fields, 387 HP 9100A, 254-256 sequencing, 388 status bits, 388 symbolic representation, 388-389 [See also P(microprogram)] [see

component, 617 compound, 25, 614 hyphen, 613-614 phrase, 613-614 primitive, 613-614 proper, 607, 617 simple, 607, 613-614 name-expression, 613 nesting store, 263-266

P(microprogram)]

micro-subroutines, Wilkes, 339-340

MOBF/mean-operations-between-failure, 617-618

modular scheme, D825, 449-450 modulation, 618 monitor map, BTSS, 291-295 monitor mode, BTSS, 291-297 moving head disk, 74, 577, 579 Mp-concurrency, processor, 627-628 MTBF/mean-time-between-failure, 617-618 multiple addresses per instruction, 191 (See also instruction format) multiple data stream, 83-84 multiple instruction stream, 83-84 multiplex, 617-618

memory, IBM 7094

(See also S/switch) multiplication, 100-111

AGC, 152 UNIVAC,

157

B

5000, 272

coding, 193, 199 optional expression, 607, 613

optimum

II,

518-519

address) (see instruction format)

66

ILLIAC IV), 320-333 NOVA), 318-319

P(display), 72 P(io), 72,

303-304

P(io; analog/digital;

IBM

1800), 405,

409-416

P(language), 63, 73, 257

P(microprogram)/microprogram processor, 61, 71, 334
P(microprogram; SD-2), 341-347
P(microprogram; System/360, Model 30), 385, 388
P(microprogram; Wilkes example), 335-340
P(special algorithm), 66, 72-73, 301
P(stack) (see instruction format)

P(vector move), 72

P-concurrency, 627-628 packaging: 6600, 494-496

CDC

HP 9100A, 250, 252-253 Pegasus, 174-176, 179-182 SD-2, 341-343

616

Lehman computer, 462-463

call syllable,

B

5000, 272

operating system:

285-287 B 5000, 267-268 BTSS, 292-300 CDC 6600, 472, 475 D825, 450-455 Lehman computer, 461-463 operation, 616, 632-635 D, 626 K, 624-625 M, 620-622 P, 626-627 port, 627-628 Atlas, 279,

1

P(c/central processor), 17-22, 71

one-level store, Atlas, 279-283
one's complement, AGC, 150-152

operand

+

P(array;

network analysis problem, Lehman computer, 466-469 network computers, 447, 470-503 next, 24, 631 noisy mode floating-point, 422-423 nonary operation, 633 null, 607, 613 number, 608, 614 number-data-type, 630-631 number-name, 614 number representation, AGC, 150-152 number-set-name, 615

onion peeling,

(See also processor)
P(1 address) (see instruction format)
P(2 address) (see instruction format)
P(3 address) (see instruction format)

P(array;

[See also M(stack)]

octal-digit,

P/processor, 17

P(array),

mixed number, data-type, 630-631

multiplexer,

operator syllable,

P(n

network, 628

Wilkes, 335-337

microprogram processor

memory map;

n-ary-arithmetic-operation, 614, 633-635 n-ary-boolean-operation, 615, 633-635

BTSS, 291-295 IBM 7094, 523

micro-parallelism,

617

operation-time, 19

327-328

memory map:

violation,

operation-set,

parallel processing)

421-422

organ, von

operation-rate-set, 617

274-283

5000, 267-268

interleaving:

ILLIAC

operation, S, 623 T, 625-626

multiplier-quotient register, 59

B

6600, 473, 493 7094, 517-522

memory memory memory

615

Stretch, 432,

Whirlwind

I,

438-439 141-143

page: address, 120-134

(See also

memory mapping; multiprogramming)
Atlas, 274, 276,

279-283

BTSS, 291

mapping, 79-80 page address register,

Atlas,

279-283

parallel arithmetic (see arithmetic)

CDC

6600, 491-494 456-469 IPL VI, 359-360 parallel programs, parallelism, 456 parameter, 19, 611 (see attribute)

parallel-by-function,

parallel processing, 446,

parameter-set, 611


parentheses

PMS PMS PMS PMS

609

),

(

performance, 37, 49-52 CDC 6600, 470-471

notation, 19-22

quantity, 608, 615

primitives, 16-22 structure, 41

quit instruction, 457

queue memory

PILOT, 440-442
Stretch, 421-423, 425-426, 431-433

structure dimensions, 63-85 polar coordinate arithmetic, 246, 255-256 Polish notation, 270-271, 391

UNIVAC, 164-168

port, 16-18,

Lehman example, 456-457, 463-469

period ., 25, 609, 614 peripheral and control processors,

6600,

471-475, 489-491

ming)

diagram, 16-22 15-22

level, 9-10,

ACE, 191, 193, 198 AGC, 146-148 289-290 B 5000, 258-260, 268 BTSS, 275, 292 6600, 470, 471-475, 476, 489-494

computer models, 63-66
D825 and D830, 260, 450-451, 453-455
Deuce, 191
EULER, 382-392
FORTRAN machine, 365-366
HP 9100A, 235, 249-254
1401, 226

System/360, 563, 579-587, 602-606 IV, 321-322, 327-329 9,

260

Lehman Computer, 459-461 LGP-30|LGP-21, 217 M.I.T. network, 507

networks, 504, 505-512 PDP-8, 20-21, 121, 123-131, 124, 126-128

PILOT,

398, 440-442

SDS 900

Stretch, 421-423,

425-426

546-548

data-types, 626-628 encoding-efficiency, 626-627 function, 626

instruction-memory, 627 instruction-size,

626-627

program program program program program program

633-634

615 617-618 HP 9100A, 253 ILLIAC IV, 328-329 Lehman computer, 456-457 network, 505 Pegasus, 181-182 UNIVAC, 166-168 Whirlwind, 138-139 relocation registers, 80

relations,

memory mapping; multiprogram-

NOVA, 316

120

field register,

8-10

level,

reference table,

operator,

B

5000, 271-272

SDS 900

series, 542,

criteria,

D825, 448

proper-name, 607-617 protection and relocation registers, 80 status

word

CDC

(see processor state)

M(punched

DEC

(See also stack)

pyramid,

resume instruction, 458 reverse polish, 262-263 round-off, 104-107

Atlas,

287-289

CDC

6600, 491-494

FORTRAN Machine, HP 9100A, 250 IBM IBM IBM

6600, 474

card)]

338, 308-309

364-368, 375-381

1401, 229-230

1800, 405-409, 7094, 520-522

411-413

ILLIAC

IV, 326 352-354 KDF 9, 264 NOVA, 318 PDP-8, 125, 127-133 SD-2, 343-345 SDS 900 series, 550-552 Stretch, 426-431 UNIVAC, 157-160 Wilkes example, 336

IPL

(See also extra codes)

[see

replicated single-computer systems, 448 resource allocation diagram, 10

RT/register transfer level, 5-7

checking, Pegasus, 178 counter, Whirlwind, 140 entry mode, desk calculator, 235

push-pop instruction, 90, 138-139

608-609, 634

relational-i-unit-operations,

(See also

(See also P/processor) processor state, 24, 57-63

UNIVAC,

I,

,

615

repeat instruction, 207

punched card

1108, 11

< >

Mp-concurrency, 627-628

PSW/program

Whirlwind



ming) renaming, 632

Texas, University, 506-507

UNIVAC

=

relational-expression,

interrupt-response-time, 626-627 ISP, 635-637

S/switches, 67-69

158

relational-arithmetic-operations

reliability,

programming

series, 275, 543, 546,

RT)

processor, 626-628

544-545, 550

471, 477-480, 482-485

632

register transfer (see relation, 608

relational-operation, 615, 634

programmed

SD-2, 343

register,

process map, BTSS, 293 processing elements, ILLIAC IV, 321-322

program-switching-time, 626-627

pipeline processor, 84 Programma 101, 237, 237-238

RW40, RW-400,

629-631 primary computer, PILOT, 441-443 primary memory [see M(core), Mp-concurrency] primitive-name, 613-614 print column, 617 process, BTSS, 293-297 process control computer, IBM 1800, 399-420

program-switching-time, 626-627 serial, 83

ASP, 506

629-630

referent-expression, data-type, 629-630

163

parallel/parallel-by-word, 83-84

1800, 400-405, 404 7094, 517, 518, 519

EULER, 383-384

referent, data-type, 16,

operation-code size, 626-627 P-concurrency, 627

701, 515

ILLIAC

KDF

recursive procedure,

supply: Pegasus, 181

algorithm-encoding-efficiency, 626-627 concurrency, 41, 83-85, 626-627

ComLogNet, 509-510

IBM IBM IBM IBM IBM IBM

power, 616-617, 619

address-per-instruction, 627 address-size, 626-627

network, 511

Atlas, 277, 279-283,

CDC

(See also physical address) record, 617

precision, data-type,

pipeline processor, 84-85 PMS conventions, 615-628

ARPA

M, 620-622

UNIVAC,

memory mapping; multiprogram-

random access memory [see M(random)] range—, indefinite expression, 19, 610

T, 625 610 postulation, indefinite-expression,

power

BTSS, 291

PMS PMS

portability:

M(queue)]

readability, 618 real address, 76-81

port-to-port delay, L, 620

CDC

permanency: M, 620-622 S, 623 phrase-name, 613-614 physical address, 76-81 (See also

617-618

[see

VI,

(See also logical design level; microprogram)


S(crossbar; Mp-Pc; Lehman computer), 461
S(cross-point), 67-70
S(cross-point; B5000), 258, 267-268
S(cross-point; D825), 450-454
S(cross-point; non-hierarchy; RW-400), 478-480
S(duplex), 66-69
S(hierarchy), 67-70
S(Inter-memory transfer trunk; PILOT), 443
S(non-hierarchy), 68-69
S/sec/seconds, 616
S(simplex), 66-69
S/switch, 17-22, 41, 66-70

S(Telephone exchange), 506

CDC

S(trunk;

6600), 493

computer

CDC

secondary computer, PILOT, 443-444 segmentation, 77-81 (See also memory mapping; multiprogramming) Selectron memory, 95 semantics, 607-608 semi-colon ;, 611

(see interpretation-cycle)

sequential circuits, 5 serial arithmetic, 428-429

set,

607, 611

shared

614

simulation:

computers

(See also emulation; inter-

Lehman computer, 463-469 single data stream, 83-84 single instruction stream,

memory

83-84

(see look-aside)

SLT/Solid Logic Technology, 564, 603-604 small letters, 609 source address, ACE, 194-199 space, SD-2, 341 space „, 25, 607 spacing, 609 specialization, indefinite expression, 610 split instruction, 457 square root instruction, 241 stack:

5000, 260-261, 269-271

DEC

338, 308-309

EULER, 385

KDF

174

631-632 ory protection; multiprogramming) and forward network, 504

stored program digital computer (see computer)

613

structure, 37-38,

52-85

computer, 628

9,

260-261

and

458 Whirlwind, 142-143

set instruction,

test control,

tetrads, 112

three addresses per instruction, 193-194 (See also instruction format) time, 616 time chart, 43-46

time-sharing computer (see function; multipro-

hierarchy, 63-70 tree,

memory, technology)

temperature, 616-617, 619 terminate instruction, 457-458 test

memory mapping; mem-

gramming)

IBM 1800, 411 transducer, 625-626

65

timer,

65

subcomponents, 617

divergence, 625-626

subroutine calling instructions, PDP-8, 123, 135 subroutine file, BTSS, 299-300 subscripts (see base register; index register)

technology, 625 transduction, T, 625-626

609 subtraction, 99-100 superscripts/I, 609

transmission-operation, 633 transmit
