COMPUTER STRUCTURES: READINGS AND EXAMPLES
McGraw-Hill computer science series
RICHARD W. HAMMING, Bell Telephone Laboratories
EDWARD A. FEIGENBAUM, Stanford University
Bell and Newell: Computer Structures: Readings and Examples
Cole: Introduction to Computing
Gear: Computer Organization and Programming
Givone: Introduction to Switching Circuit Theory
Hellerman: Digital Computer System Principles
Kohavi: Switching and Finite Automata Theory
Liu: Introduction to Combinatorial Mathematics
Ralston: Introduction to Computer Science
Rosen: Programming Systems and Languages
Salton: Automatic Information Organization and Retrieval
Watson: Timesharing System Design Concepts
Wegner: Programming Languages, Information Structures, and Machine Organization
Computer Structures: Readings and Examples
C. Gordon Bell
Professor of Computer Science and Electrical Engineering
Carnegie-Mellon University
Allen Newell
University Professor
Carnegie-Mellon University
McGraw-Hill Book Company
New York, St. Louis, San Francisco, Düsseldorf, London, Mexico, Panama, Rio de Janeiro, Singapore, Sydney, Toronto
To Brigham, Laura, Paul
Computer Structures: Readings and Examples
Copyright © 1971 by McGraw-Hill, Inc. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.
Library of Congress Catalog Card Number 75-109245
07-004357-4
1234567890 HDBP 79876543210
This book was set in News Gothic by Graphic Services, Inc., printed on permanent paper by Halliday Lithograph Corporation, and bound by The Book Press, Inc. The designer was Elliot Epstein; the drawings were done by John Cordes, J. & R. Technical Services, Inc. The editors were Richard Dojny and J. W. Maisel. William P. Weiss supervised production.
Preface
The structures that we call computer systems continue to grow in complexity, in size, and in diversity. This book is linked firmly to the nature of this growth.
The book is about the upper levels of computer structure: about instruction sets, which define a computer system at the programming level; and about organizations of processors, memories, switches, input-output devices, controllers, and communication links, which provide the ultimate functioning system. These levels are just emerging into well-defined systems levels, with developed symbolic techniques of analysis and synthesis and accumulated engineering know-how, all expressed in a crystallized representation.
These aspects of computer systems have always existed, of course, but only in rudimentary form. The classical four-box picture of a computer (arithmetic unit, memory, input-output, and control) is certainly an effective organization of components to process information. But multiple processors, hierarchies of memories, and remote communications force the top level of organization into a distinct level, requiring analysis and rational design. Similarly, the 25 instructions of the IBM 701 computer (developed around 1953) is certainly an instruction set, indeed one worthy of study. But processors with dozens of registers and almost unlimited logical circuitry, again force the instruction set to become a topic of rational analysis and design.
This book is tied to the emergence of these upper levels of organization: eight years ago (a computer engineer's half dozen) would have been too early to write this book; eight years hence would be too late. Eight years ago the diversity and complexity of computer structures was not sufficient to justify the attention this book provides. This book would have been too thin. Eight years hence textbooks will exist that treat these levels systematically. This book will then appear too descriptive.
But right now, as these aspects of computer structure are emerging, and with systematic treatment still precluded, there is a need to make available material on these levels for systematic reference and study. Our choice has been to present a large set of examples, which illustrate the various design options and structural possibilities, both in instruction sets and in overall configurations. These examples are descriptions of actual computer systems, taken from the technical literature or from technical reports and manuals. Descriptions of actual systems are to be much preferred over idealized abstractions. The latter can reflect the real issues only after successful systematization.
Not only are the chapters about actual computers, they present much detail. The complexity of computers resides in part in their size and the multiplicity of their parts, e.g., to their having 200 instructions rather than 20, or having to service 50 Teletypes rather than 2. It seems essential to describe computer systems in their entirety, rather than via simplified vignettes. Again, this view stems from the existing state of the art. Eight years hence, it will not necessarily hold.
We fall from grace on all the above principles, providing occasionally descriptions of paper machines and partial descriptions of partial systems. But our feeling that detail and reality is important remains. This is why this book is so large; and fit for study rather than for reading.
The book presents a large number of examples. Variation needs to be presented along all the major dimensions that instruction sets and system configurations currently exhibit. Thus, as a glance at the table of contents will show, the examples in the book are hardly picked at random. The variation is empirical. It exists in the population of computers that have actually been built. This characteristic of the book stems, again, from our assessment that the upper levels of computer structure are still in an essentially descriptive and empirical state of development. However, as the book documents, ample variation occurs in existing computer systems. The evidence presented here should finally lay to rest the remarks (once echoed almost universally, and still heard occasionally) that nothing has happened in computer structure since the von Neumann machine.
Dimensions of variations imply a framework, for dimensions do not by themselves arise from a population of systems. They require the aid, witting or not, of a conceptual framework. As the first three chapters of the book testify, we have most wittingly created a framework, and have had no hesitation in imposing it throughout the book. However, in keeping with our view already expressed, this framework is primarily descriptive. It has come inductively from the common lore, from our own experiences as designers, and from the effort of putting this book together. This attempt at systematization has given rise to two notations: one for instruction sets (ISP) and the other for configurations of major components (PMS). But, again, these notations are primarily descriptive.
So much for what the book actually tries to provide. What are our goals for it? The first is educational. There are three distinct populations of professionals whose education is to be served by this book: the computer engineer, who will design computer systems; the computer scientist, who is concerned primarily with the programming level and with various abstract views of information processing; and the electrical engineer, who sees computer systems simply as one part of a larger physical technology.
For all of these, we see no sense in talking of elementary versus advanced treatments of computer structure. There is surely "less" versus "more," but consistent with our view of the current art, no vertical stratification of education in these aspects of instruction sets and device configurations is possible. It is sufficient, in the present day, for computer systems to become accepted as worthy of study in their own right.
This book will hardly make easy fare for undergraduate students, who do not have an instructor somewhat skilled in the art that is being taught. However, this book is meant for study. A good instructor can, we feel, develop an excellent course (or part thereof) in computer structures, taking this book as the basic material. In addition to the three introductory chapters, Chapter 5 (on the DEC PDP-8), by providing a complete example of a computer system with descriptions at all systems levels, helps to tie the aspects of computer structure discussed in this book to the view students will pick up from a traditional course in logical design.
It goes without saying that for the computer engineer and designer, the material of this book should be fully assimilated. In designing a new computer system, or subsystem thereof, he should be familiar with all that this book has to offer: the design choices, the structural variations possible, the experiments of the past and the design needs they attempted to satisfy. Given that systematic analysis does not yet exist, there is no substitute for extensive, critical understanding of the existing examples of designed systems. We assume the student of computer engineering comes to this book with a working knowledge of logical design. He should find it possible to realize many of the systems described in this book at the next lower levels of logic structure.
For the computer scientist, the levels of computer structure discussed in this book constitute a substantial part of what he should know about the physical devices that underlie his science. As we pass downward from these levels to lower ones (to register-transfer systems, sequential logic circuits, combinatory circuits, continuous circuits, and on down) the relevance of each level gradually fades. The levels of this book, along with the register-transfer level, constitute the main aspects of computer structure that the computer scientist must understand. It does not matter that they are, as yet, basically empirical and descriptive. The computer scientist undoubtedly will not be able to carry through the design of the systems described in this book in terms of the lower logic levels, but this is not necessary for an appropriate grasp of these upper levels of computer structure. Indeed, this is what it means for distinct systems levels to exist.
For the electrical engineer, this book undoubtedly presents more examples than he cares to know (or needs to). But an appropriate sampling, plus the overview presented in the first three chapters, is appropriate to give him some insight into the elaborate growth that has occurred on top of the basic digital technology created within electrical engineering.
The student of systems engineering may also find the material presented here useful, as an example of a class of complex systems which has evolved several distinct levels of representation. Again, the book undoubtedly presents too massive a dose of detail for him, but the overview in the first chapters, plus a sampling throughout the space of computer systems, should prove highly instructive.
We have goals for the book in addition to the educational ones. We think the book can serve as a useful reference for the practicing computer engineer. The time is past when every computer engineer knows about all computer systems because he has lived through all of computer history. That position is now reserved for those of us who are past forty (and still active). For the rest, a source book that provides the cumulated design experience of the field is a useful substitute, especially so if it contains enough detail so that a designer can reasonably evaluate the actual computer systems that embody a particular design alternative.
Behind the goal of the book as a guide for the practicing computer designer lies the feeling that the field of computer engineering needs to develop a sense of history and of looking to the past for guidance. The fantastic advance in basic logic technology (in speed, cost, and reliability) makes each day seem an absolutely new one. But, of course, it is not. Many alternative designs have been tried out in past systems, in ways relevant to current design. Thus, we have the goal of saving some of the past in a form accessible to the future needs of computer design. This goal is mixed with a certain archival feeling. Many of the systems in this book have never been documented, other than in manuals and various elementary how-to programming books.
A final goal comes from our feelings as computer scientists that the variety of computer systems is a phenomenon worthy of study in its own right. This book carries, therefore, an invitation to taxonomy: to asking how to classify the diversity of forms of computer systems that are coming into existence. Taxonomic endeavors usually take place in a field of natural systems, particularly biological systems. It may seem strange that a domain of artificial systems calls for taxonomic activity. But the demand for empirical classification exists whenever there is a population of significant size and rich structure. Rudimentary classification efforts have occurred for many populations of artifacts: for ships, for aircraft, for houses. This book should amply confirm that computer systems are complex and diverse enough, and undergoing enough continual proliferation and evolution, to command significant taxonomic endeavor.
Enough is said in the first two chapters about the new notations introduced in the book, so that nothing substantive need be added here. We apologize for inflicting new notation on the reader. We feel that good notations are really quite important for the aspects of computer structure described in this book. Much would be gained by the whole field of computers (by manufacturers, buyers, sellers, users, students, programmers, engineers, planners, and scientists) if relatively uniform notations came into common use. Although we have no illusions about the perfection of the notations we have introduced, we would be most happy if they cause a rise in concern for standard notations and nomenclature.
A large number of distinct systems are described in substantial detail. We have redescribed many of the systems in the common notation introduced in the book. The accuracy of all these descriptions is a major problem. Even where the papers are reproduced from the literature, this problem of accuracy remains, although then it is not ours alone. Even though we have taken pains to obtain accurate information on the systems and to portray them faithfully in our various descriptions and figures, there is no way we can be responsible for their ultimate accuracy. The PMS and ISP figures, in particular, cannot be guaranteed to be accurate representations of the systems they purport to describe. Ultimately, one would like to have simulation languages for such notations and to verify (up to the usual criteria of a debugged program) that a system given by, say, an ISP description, simulates the behavior of the target machine. But that day is still far off.
Our most fundamental acknowledgment is to the contributors to this volume, not only for the articles they have written, but for the computers they have designed and built, thereby creating a population of fascinating artifacts worthy of study. An additional reason for reprinting their articles rather than simply describing their computer systems is the importance of having available the views of the designers themselves about the nature of their systems.
The research on the basic ideas underlying the notations was supported by Advanced Research Projects Agency of the Office of the Secretary of Defense (F 44620-67-C-0058) and is monitored by the Air Force Office of Scientific Research.
We would like to extend an acknowledgment to the organizations that have produced all of these computers, oftentimes it would seem in defiance of the laws of economics. Perhaps, as the old saw has it, a computer manufacturer is simply a computer's way of breeding another computer. This might account for the tenacity shown by computer manufacturers in spawning the vast numbers of computer systems that provide our field of study. Within this general acknowledgment, we would like to extend a very specific one to all the people in these organizations who helped make information available to us: the manuals, photographs, dates, etc., that this book has demanded in such great quantity.
We are indebted to the students who have read and criticized the various PMS and ISP figures: Richard Dove, Wayne Kohl, Michael Knudsen, Paul Mobus, and Charles Pfferkorn. Ken Fitzgerald and Anita Jones of IBM were kind enough to read the introduction to the IBM System/360. Professor David L. Parnas initially reviewed the text and contents, thus providing many helpful suggestions. Our other colleagues, especially Professors Angel Jordan, Alan Perlis, Herbert Simon and Everard M. Williams deserve a special thanks for their patience and encouragement.
Finally, we would like to thank those who were a part of the machine that assembled the book: the editors of McGraw-Hill; Mrs. Mary Ross who assembled the bibliography, figures, and contributor articles; Mrs. Mildred Sisko who typed the PMS and ISP Appendix; and especially Mrs. Dorothy Josephson who not only typed nearly all drafts of the book, but also the final PMS figures, and ISP Appendices.
C. Gordon Bell
Allen Newell
Acknowledgments
R. H. Allmark and J. R. Lucking: Design of an Arithmetic Unit Incorporating a Nesting Store, Proceedings of the International Federation of Information Processing Congress 1962, pp. 694-698, North Holland Publishing Co., Amsterdam, Holland, by permission from American Federation of Information Processing Societies (AFIPS), Spartan Books, Washington, D.C.
R. L. Alonso, H. Blair-Smith, and A. L. Hopkins: Some Aspects of the Logical Design of a Control Computer, A Case Study, Transactions on Electronic Computers, vol. EC-12, no. 6, pp. 687-697, December, 1963, by permission of the authors and the Institute of Electrical and Electronics Engineers (IEEE).
James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and Robert J. Williams: D825 - A Multiple Computer System for Command and Control, Proceedings of the AFIPS Fall Joint Computer Conference, 1962, vol. 22, pp. 86-96, by permission from AFIPS, Spartan Books, Washington, D.C. The authors acknowledge: The authors wish to acknowledge the outstanding efforts of their many colleagues at Burroughs Laboratories who have contributed so well and in so many ways to all stages of D825 design, development, fabrication, and programming. It would be impossible to cite all of these efforts. The authors also wish to acknowledge the contributions of Mr. William R. Slack and Mr. William W. Carver, also of Burroughs Laboratories. Mr. Slack has been closely associated with the D825 from its original conception to its implementation in hardware and software. Mr. Carver made important contributions to the writing and editing of this paper.
George H. Barnes, Richard M. Brown, Maso Kato, David J. Kuck, Daniel L. Slotnick, and Richard A. Stokes: The ILLIAC IV Computer, Transactions on Computers, vol. C-17, no. 8, pp. 746-757, August 1968, by permission of the authors and the IEEE. The authors acknowledge: This work was supported in part by the Department of Computer Science, University of Illinois, Urbana, Illinois, and in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center, Griffiss Air Force Base, Rome, New York, under Contract USAF 30 (602)4144. The authors are pleased to acknowledge their indebtedness to the group at the Westinghouse Electric Corporation that initiated the parallel computer effort. The work of W. C. Borck, A. B. Carroll, J. R. Hudson, W. H. Leonard, R. C. McReynolds, and G. Shapiro formed the basis for the subsequent efforts. Of particular importance is the work of J. G. Gregory in tuning the conceptual design to the real world of technology.
Theodore R. Bashkow, Azra Sasson, and Arnold Kronfeld: System Design of a FORTRAN Machine, Transactions on Electronic Computers, vol. EC-16, no. 4, pp. 485-499, August 1967, by permission of the authors and the IEEE. The authors acknowledge: This research is supported by the Air Force Office of Scientific Research Contract AF19(628)-2798.
G. A. Blaauw and F. P. Brooks, Jr.: The Structure of System/360, Part I - Outline of the Logical Structure, IBM Systems Journal, vol. 3, no. 2, pp. 119-135, 1964, by permission from the IBM Systems Journal.
Erich Bloch: The Engineering Design of the Stretch Computer, Proceedings of the Eastern Joint Computer Conference, 1959, pp. 48-58, by permission of the author and the Institute of Electrical and Electronics Engineers. The author acknowledges: The efforts and contributions of many people have gone into the engineering design of the Stretch computer. To mention all would be impossible. However, the following individuals and their groups were responsible for the units indicated; Mr. R. T. Blosk for the Instruction Unit, Mr. J. F. Dirac for the Look-ahead Units, Messrs. J. A. Hipp and O. L. MacSorley for the Arithmetic Units, and Mr. L. O. Ulfsparre for the Memory Bus. The Systems Development was under the guidance of Messrs. S. W. Dunwell and R. E. Merwin.
Arthur W. Burks, Herman H. Goldstine, and John von Neumann: Preliminary Discussion of the Logical Design of an Electronic Computing Instrument, "Collected Works of John von Neumann," vol. V, pp. 34-79, General Editor: A. H. Taub, Macmillan Company, by permission from Pergamon Press, New York, 1963. The authors acknowledge: This report has been prepared in accordance with the terms of Contract W-36-034-ORD-7481 between the Research and Development Service, Ordnance Department, U.S. Army and the Institute for Advanced Study. The authors wish to express their thanks to Dr. John Tukey, of Princeton University, for many valuable discussions and suggestions.
John W. Carr III: UNIVAC Scientific (1103A) Instruction Logic, pp. 77-83; IBM 650 Instruction Logic, pp. 93-98; Instruction Logic of the Soviet Strela (Arrow), pp. 111-115; Instruction Logic of the MIDAC, pp. 115-121, chap. 2, Programming and Coding, "Handbook of Automation, Computation, and Control," vol. 2, edited by Eugene M. Grabbe, Simon Ramo, and Dean Wooldridge, Copyright © 1959 John Wiley & Sons, Inc., New York, reprinted by permission.
J. Presper Eckert, Jr., James R. Weiner, H. Frazer Welsh, and Herbert F. Mitchell: The UNIVAC System, American Institute of Electrical Engineers-Institute of Radio Engineers Conference, pp. 6-16, December, 1951, by permission of the authors and the IEEE. The authors acknowledge: The UNIVAC System has been an over-all company project and hundreds of people have participated. It is, therefore, difficult to acknowledge the contributions of individuals. However, special mention must be made of the contributions of Mr. H. Lukoff, Mr. E. I. Blumenthal, Mr. L. D. Wilson, and Mr. J. D. Chapline, Jr. To the Census Bureau a great debt of gratitude is owed for their continuous support of the project.
W. S. Elliott, C. E. Owen, C. H. Devonald, and B. G. Maudsley: The Design Philosophy of Pegasus, A Quantity-production Computer, Proceedings of the Institution of Electrical Engineers, London, Pt. B, vol. 103, Supplement 2, pp. 188-196, 1956, by permission of the Institution of Electrical Engineers. The authors acknowledge: The authors would like to acknowledge the contributions that Mr. C. Strachey and Dr. D. B. Gillies, of the National Research Development Corporation, and Dr. J. M. Bennett and Mr. T. G. H. Braunholtz, of Ferranti, Ltd., made to the logical design of Pegasus: particular thanks are due to Mr. C. Strachey for originating the order code. They also thank Ferranti, Ltd., and the National Research Development Corporation for permission to publish the paper.
R. R. Everett: The Whirlwind I Computer, Review of Electronic Digital Computers, Joint American Institute of Electrical Engineers-Institute of Radio Engineers Conference, pp. 70-74, February, 1952, by permission of the author and the IEEE.
Thomas W. Kampe: The Design of a General-purpose Microprogram-controlled Computer with Elementary Structure, Institute of Radio Engineers, Transactions on Electronic Computers, vol. EC-9, no. 2, pp. 208-213, June, 1960, by permission of the author and the IEEE. The author acknowledges: The author wishes to thank his co-designers, R. Compton and T. Hayata, for their assistance during the design of the SD-2 computer and for their suggestions on this paper.
T. Kilburn, D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner: One-level Storage System, Institute of Radio Engineers Transactions, vol. EC-11, no. 2, pp. 223-235, April, 1962, by permission of the authors and the IEEE. The authors acknowledge: The authors gratefully acknowledge the contributions made to this work by all members of the Atlas computer team at both Manchester University and Ferranti Ltd.
B. W. Lampson, W. W. Lichtenberger, and M. W. Pirtle: A User Machine in a Time-sharing System, Proceedings of the Institute of Electrical and Electronics Engineers, vol. 54, no. 12, pp. 1766-1774, December, 1966, by permission of the authors and the IEEE. The authors acknowledge: The work for this paper was supported in part by the Advanced Research Projects Agency, Department of Defense, Contract SD-185. The software portion of the system was designed and written in part by L. P. Deutsch, who is entitled to equal credit with the authors for the ideas in this paper. L. Barnes also contributed significantly to the final result.
M. Lehman: A Survey of Problems and Preliminary Results Concerning Parallel Processing and Parallel Processors, Proceedings of the Institute of Electrical and Electronics Engineers, vol. 54, no. 12, pp. 1889-1901, December, 1966, by permission of the author and the IEEE. The author acknowledges: This paper reports on a group activity in which each individual member had his own specific assignments and in addition participated in regular discussions on all aspects of the project. Credit is therefore due to all members of the group which, during the period covered by the contents of this paper, included G. C. Driscoll, M. Lee, A. P. J. Mullery, J. L. Rosenfeld, H. P. Schlaeppi, and M. Weitzman. I should also like to express my sincere thanks to Dr. H. A. Ernst for the constructive criticism, advice, and encouragement offered during preparation of this paper. My sincere thanks are also due to members of the Graphics and Design Department at the Thomas J. Watson Research Center, and in particular to G. Massi and Mrs. M. J. LaMarre for their preparation of the charts and figures. Last, my thanks to Mrs. J. Galto for her infinite patience in the repeated retypings of the manuscript.
A. L. Leiner, W. A. Notz, J. L. Smith, and A. Weinberger: PILOT, The NBS Multicomputer System, Proceedings of the Eastern Joint Computer Conference, 1958, pp. 71-75, by permission of the authors and the IEEE. The authors acknowledge: The authors wish to acknowledge the valuable contributions of their colleagues H. Loberman and W. Youden, who helped to develop the logical design and programming procedures for this system.
William Lonergan and Paul King: Design of the B 5000 System, Datamation, vol. 7, no. 5, pp. 28-32, May, 1961, by permission of, published and Copyrighted © 1961 by F. D. Thompson Publications, Inc., Greenwich, Conn.
Richard E. Monnier, Thomas E. Osborne, and David S. Cochran: The HP Model 9100A Computing Calculator. This chapter is a compilation of three articles: A New Electronic Calculator with Computerlike Capabilities, by Richard E. Monnier, pp. 3-9; Hardware Design of the Model 9100A Calculator, by Thomas E. Osborne, pp. 10-13; and Internal Programming of the 9100A Calculator, by David S. Cochran, pp. 14-16, which appeared in the Hewlett-Packard Journal, volume 20, no. 1, September, 1968, by permission of the Hewlett-Packard Journal.
R. E. Porter: The RW-400 - A New Polymorphic Data System, Datamation, vol. 6, no. 1, pp. 8-14, January/February, 1960, by permission of, published and Copyrighted © 1960 by F. D. Thompson Publications, Inc., Greenwich, Conn.
J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis: A Command Structure for Complex Information Processing, Western Joint Computer Conference 1958, by permission of the authors and the IEEE.
W. Y. Stevens: The Structure of System/360, Part II - System Implementations, IBM Systems Journal, vol. 3, no. 2, pp. 136-143, 1964, by permission from the IBM Systems Journal.
James E. Thornton: Parallel Operation in the Control Data 6600, Proceedings of the AFIPS Fall Joint Computer Conference, Pt. II, vol. 26, pp. 33-40, 1964, by permission from AFIPS, Spartan Books, Washington, D.C.
W. L. van der Poel: ZEBRA, A Simple Binary Computer, Proceedings of an International Conference on Information Processing, Paris, UNESCO House, June, 1959, pp. 361-365, by permission from AFIPS, Spartan Books, Washington, D.C.
Helmut Weber: A Microprogrammed Implementation of EULER on IBM System/360 Model 30, Communications of the Association for Computing Machinery, vol. 10, no. 9, pp. 549-558, September, 1967, Copyright © 1967 Association for Computing Machinery, Inc., by permission of the author and the Association for Computing Machinery, Inc. The author acknowledges: I wish to thank Jack Carman, who wrote the I/O Control Program and the Operating System linkage for the EULER system and Miss Sheila Morrison who helped prepare the figures. I am also grateful for the valuable criticism offered by the referee, Professor N. Wirth and W. C. McGee, as well as E. Satterthwaite.
M. V. Wilkes and J. B. Stringer: Micro-programming and the Design of the Control Circuits in an Electronic Digital Computer, Proceedings of the Cambridge Philosophical Society, Pt. 2, vol. 49, pp. 230-238, April, 1953, by permission of the authors and the Cambridge Philosophical Society, Cambridge, England. The authors acknowledge: The authors wish to express their thanks to Mr. A. L. Freedman and Mr. W. Renwick for assisting them in clarifying a number of points, and to Professor D. R. Hartree, F.R.S., for his generous help with the preparation of the paper.
J. H. Wilkinson: The Pilot ACE, by permission from Automatic Digital Computation, pp. 5-14, National Physical Laboratory, Teddington, England, March 25-28, 1953.
Joseph E. Wirsching: NOVA: A List-oriented Computer, Datamation, vol. 12, no. 12, pp. 41-43, December, 1966, by permission of, published and Copyrighted © 1966 by F. D. Thompson Publications, Inc., Greenwich, Conn. The author acknowledges: This work was performed under the auspices of the U.S. Atomic Energy Commission.
Several organizations have contributed to the writing and production of this book by giving us permission to use material from their publications. In many cases they have also supplied us with original copies. We have credited their text, tables, pictures, and diagrams when they are used. This cooperation has been invaluable. The specific organizations are:
Adams's Associates: Computer Characteristics Quarterly. (Adams, 1966-1968)
Computers and Automation magazine
Control Data Corporation, 8100 34th Avenue South, Minneapolis, Minnesota
Datamation magazine
Digital Equipment Corporation, 146 Main Street, Maynard, Massachusetts
Hewlett-Packard Company, 1501 Page Mill Road, Palo Alto, California
International Business Machines Corporation, White Plains and Poughkeepsie, New York
Massachusetts Institute of Technology, Cambridge, Massachusetts
National Science Foundation
Olivetti Underwood Corporation, 1 Park Avenue, New York, New York
Scientific Data Systems, 1649 Seventeenth Street, Santa Monica, California
Contributors
R. H. Allmark
R. L. Alonso
James P. Anderson
George H. Barnes
Theodore R. Bashkow
G. A. Blaauw
H. Blair-Smith
Erich Bloch
F. P. Brooks, Jr.
Richard M. Brown
Arthur W. Burks
John W. Carr III
David S. Cochran
C. H. Devonald
J. Presper Eckert, Jr.
D. B. G. Edwards
W. S. Elliott
T. O. Ellis
R. R. Everett
Herman H. Goldstine
Samuel A. Hoffman
A. L. Hopkins
Thomas W. Kampe
Maso Kato
T. Kilburn
Paul King
Arnold Kronfeld
David J. Kuck
B. W. Lampson
M. J. Lanigan
M. Lehman
A. L. Leiner
W. W. Lichtenberger
William Lonergan
J. R. Lucking
B. G. Maudsley
Herbert F. Mitchell
Richard E. Monnier
W. A. Notz
Thomas E. Osborne
C. E. Owen
M. W. Pirtle
R. E. Porter
Azra Sasson
J. C. Shaw
Joseph Shifman
H. A. Simon
Daniel L. Slotnick
J. L. Smith
W. Y. Stevens
Richard A. Stokes
J. B. Stringer
F. H. Sumner
James E. Thornton
W. L. van der Poel
John von Neumann
Helmut Weber
A. Weinberger
James R. Weiner
H. Frazer Welsh
M. V. Wilkes
J. H. Wilkinson
Robert J. Williams
Joseph E. Wirsching
Contents
Part 1
Preface
V
Contributors
xiii
The Structure Chapter
1
Chapter 2
of
The
Acknowledgments
Computers
Introduction
PMS
The
Chapter 3
and
ISP
Chapter 4
15
Processors with One Address per Instruction
89
Chapter 16 Chapter 17
Preliminary Discussion of the LogiDesign of an Electronic Com-
cal
Instrument
puting
Herman H. John von Neumann The DEC PDP-8 The Whirlwind
Chapter 5 Chapter 6
— Arthur
92
IBM
The
Chapter 7
Some Aspects
Chapter 9
Hopkins The SDS 910-9300
Stretch
J.
the
— Computer Erich Bloch
Series
The Design Philosophy
of Pegasus,
Structure
A Quantity-production Computer — W. S. Elliott, C. E. Owen, C. H. The
a "virtual" contents,
Structure I
— Outline
of of
Brooks,
171
Chapter 10 Chapter 39
An
— G.
A.
Blaauw and
8-bit-character
Parallel
F. P.
Jr.
Operation
Data 6600—James
System/360, the
Computer in E.
the
184
Control
Thornton
Logical
which means that because many of the computers are relevant
type to indicate a nonsequential mapping for computers placed out of "physical" order. virtual order.
Storage Kilburn, D. B. G. Edwards, M.
T.
Processors with a General-register State
Part
is
Chapter 34
Lanigan, and F. H. Summer The Engineering Design of
146
L.
157
— System
One-level
of the Logical Design
Devonald, and B. G. Maudsley
Chapter 43
Mitchell
Chapter 23
a
and A.
Section 2
Jr., James B. Weiner, H. Frazer Welsh, and Herbert F.
137
—
Chapter 42
The UNIVAC System—J. Presper Eckert,
1800
Control Computer: A Case Study R. L. Alonso, H. Blair-Smith,
of
Chapter 8
— Computer
R. R. Everett
Chapter 33
Chapter 41 120
I
The LGP-30 and LGP-21
IBM 650 Instruction Logic—John W. Can III The IBM 7094 I, II
W.
Goldstine, and
Burks,
This
37
Instruction-set Processor: Main-line computers
Section 1
1
The Computer Space
Descriptive
Systems
Part 2
3
to more than one The reader might read
part and section, we have used italic (reference) the book according to the
xvi
Contents
Part 3
The
Instruction-set Processor Level: Variations in the Processor
Section 1
Processors with Greater than One Address per Instruction
ACE—
Chapter 11
The
Chapter 12
ZEBRA, A Simple Binary Computer — W. L. van der Poel UNIVAC Scientific (1103A) Instruction Logic— John W. Carr HI The RW-400: A New Polymorphic
Chapter 13 Chapter 38
Pilot
Data System
Section 2
Chapter 19
— R.
H. Wilkinson
J.
191
Chapter 14
193
Chapter 15
Memory
Chapter 9
The LGP-30 and LGP-21
Chapter 11 Chapter 8
The
H.
ACE—
J.
Frazer
The Design Philosophy of Pegasus,
W. 217
James
R.
Elliott,
C.
E.
Owen,
C.
H.
IBM
Chapter 26
John W. Carr III NOVA: A List-oriented Computer—
Presper
650
Instruction
Logic220
Joseph E. Wirsching
Weiner,
and Herbert
Welsh,
S.
A
— Computer
Chapter 17
H. Wilkinson
UNIVAC System—J. Jr.,
213
Devonald, and B. G. Maudsley
van der Poel
Eckert,
Soviet
(Arrow)— John W. Carr HI
Quantity-production
Chapter 16
The
the
of
216
ZEBRA, A Simple Binary Computer
Pilot
209
III
Logic
E. Porter
The OLIVETTI Programma 101 Desk
L.
Instruction Strela
Processors Constrained by a Cyclic, Primary
— W.
Carr
205
Calculator
Chapter 12
W.
John
200
MID AC —
Instruction Logic of the
F.
Mitchell
Section 3
Chapter 18
Section 4
Chapter 19
Processors for Variable-length-string Data
The IBM 1401
225
Section 5
Chapter 21
The OLIVETTI
Programma
Chapter 36
An
8-bit-character
Computer
The
HP
Thomas
237
Processors with Stack Memories (Zero Addresses per Instruction)
Design of an Arithmetic Unit Incorporating a Nesting Store R. H.
for
—A
Multiple-computer System
Command and Control—James P.
Anderson,
Samuel
A.
Hoffman,
E.
Monnier,
and David
S.
243
257 Joseph Shifman, Williams
—
D825
E. Osborne,
Cochran
Model 9100A Computing
R. Lucking J. Design of the B 5000 SystemWilliam Lonergan and Paul King
235
Calculator — Richard
101
Allmark and
Chapter 22
Chapter 10
Desk Calculator Computers: Keyboard Programmable Processors with Small Memories
Desk Calculator Chapter 20
224
262
Chapter 30
A Command
and
Structure for
—
Information Processing A. Newell, H. A. Simon,
267 Chapter 32
Robert
/.
J.
Complex C.
T.
Shaw,
O. Ellis
Microprogrammed Implementation of Model
EULER on IBM System/ 360 30— Helmut Weber
Contents
Section 6
Chapter 23
Processors with Multiprogramming Ability
274
One-level Storage System— T. KilD. B. G. Edwards, M. J.
Chapter 24
burn,
Lanigan, and F. H. Sumner
Chapter 21
Part 4
The
of
Design
B 5000 System —
the
William Lonergan and Paul King User Machine in a Time-sharing
A
— B.
W. Lampson, W. W. Lichtenberger, and M. W. Pirtle
276
System
291
Instruction-set Processor Level: Special-function Processors
Section
1
Chapter 41 Chapter 43
Processors to Control Terminals and Secondary Memories (Input-output Processors)
The
IBM
The Part
7094
I,
— Outline
Brooks,
Section 2
Chapter 26
of
System/360,
of
the
I
Structure/G. A.
Blaauw and
NOVA: A
List-oriented
ILLIAC
George
Section 3
Chapter 28
F.
H.
IV
Barnes,
Stokes
ComputerRichard
334
Chapter 20
an Elec-
of a
Chapter 32
—
Chapter 30
Structure for
—
Design of a FORTRAN Machine Theodore R. Bashkow,
System
—
E.
Monnier,
Osborne, and David
S.
A Microprogrammed Implementation of EULER on IBM System/ 360 Model 30— Helmut Weber
348 Azra Sasson, and Arnold Kronfeld
Complex
Information Processing J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis
Chapter 31
— Richard
E.
341
Processors Based on a Programming Language
A Command
Model 9100A Computing
Cochran
General-purpose
Thomas W. Kampe
HP
Thomas 335
B. Stringer
The
Calculator
— Computer M. V. Wilkes and
The Design
320
M.
Microprogram-controlled Computer with Structure Elementary
Section 4
305
Brown, Maso Kato, David J. Kuck, Daniel L. Slotnick, and Richard E.
316
Microprogramming and the Design
J.
338 Display Processor
325
Computer-
of the Control Circuits in
Chapter 29
DEC
P.
Processors Defined by a Microprogram
tronic
IBM 1800
Jr.
Processors for Array Data
The
The
The
Logical
Joseph E. Wirsching
Chapter 27
Chapter 33 Chapter 25
II
Structure
303
Chapter 32 349
363
A Microprogrammed Implementation of EULER on IBM System/360 Model
30— Helmut Weber
382
Part 5
Appendix
PMS and ISP Notations
General Conventions
1 Basic Semantics
2 Metanotation
3 Basic Syntax
4 Commands: Assignments, Abbreviation, Variables, Forms
5 Indefinite Expressions
6 Lists and Sets
7 Definite Expressions
8 Attributes
9 Null Symbol and Optional Expression
10 Names
11 Numbers
12 Quantities, Dimensions, and Units
13 Boolean and Relations
PMS Conventions
1 Dimensions
2 General Units
3 Information Units
4 Component
5 Link (L)
6 Memory (M)
7 Switch (S)
8 Control (K)
9 Transducer (T)
10 Data-operations (D)
11 Processor (P)
12 Computer (C)
ISP Conventions
1 Data-types
2 Instruction
3 Operations
4 Processors
Bibliography
Name Index
Machine and Organization Index
Subject Index
Part 1
The structure of computers

Chapter 1

This book presents many examples of computer systems. It presents them in enough detail so that meaningful engineering study and analysis are possible. Most of these examples are presented by using the original descriptions of them in the technical literature. Others have been redescribed by us, especially where the original descriptions existed only in technical manuals. In both cases there are considerable discussion and analysis of the computer structures: what problems they were intended to solve, what solutions were adopted, and how these solutions have fared. Yet the emphasis has remained on detailed descriptions precise enough so that the systems themselves are available for independent study.

Why should one want to produce such a book? Collections of reprintings from the technical literature are common in many fields, e.g., "Programming Systems and Languages" [Rosen, 1967]. We have departed from this traditional exercise in two ways, both of which seem important to us. First, we have presented substantial amounts of detail: in effect, block diagrams of computer structures and the equivalents of programming manuals. These constitute neither good reading nor a way of communicating the "essential ideas" in the field. Second, we have introduced a system of notation and have used it not only in the parts we ourselves have written but also to provide additional (sometimes redundant) descriptions of computer systems in the reprinted articles. Why should there be a book like this? The reasons are several and require some background discussion.

Computer systems

Computer systems are one example of man's more complex artificial systems.¹ They have existed as successful engineering products long enough to undergo radical evolution and to give rise to a number of basic, unique technologies. They are sufficiently complex that they have given rise to a science, that is, to a continuing, institutionalized endeavor to understand what sort of beast it is that has been brought forth.²

Our fundamental interest is in the development of this science and technology of computers (one of us also likes to build computers). To understand why this particular book seems to us to be the right way to push this development at this particular time requires characterizing the current state of computer-systems technology.

A computer system is complex in several ways. Figure 1 shows the most important. There are at least four levels of system description, possibly five, that can be used for a computer. These are not alternative descriptions in the sense that anything said one way can be said another. On the contrary, each level arises from abstraction of the levels below it. Each does a job that the lower levels could not perform because of the unnecessary detail they would be forced to carry around.

A system (at any level) is characterized by a set of components, of which certain properties are posited, and a set of ways of combining components to produce systems. When formalized appropriately, the behavior of the systems is determined by the behavior of its components and the specific modes of combination used.

¹ We need not argue that they are his most complex system. That view is myopic. Setting aside quasi-natural systems, such as cities and economies, it is still the case that a modern aircraft carrier is more complex than a modern computer by any reasonable measure.

² Here uniqueness can be claimed, perhaps, since few other artifactual systems (again, excluding the quasi-natural ones) provide new phenomena that require sustained scientific investigation to understand them. There certainly is no science of aircraft carriers. But there is a computer science.

Fig. 1. Hierarchy of levels: computer structure. (Levels are listed from the top of the figure downward.)

Structures: Network/N, computer/C
Components: Processors/P, memories/M, switches/S, controls/K, transducers/T, data operators/D, links/L

Structure: Programs, subprograms
Components: State (memory cells), instructions, operators, controls, interpreter

Circuits: Arithmetic unit
Components: Registers, transfers, controls, data operators (+, -, etc.)

Circuits: Counters, controls, sequential transducer, function generator, register arrays
Components: Flip-flops: reset-set/RS, JK, delay/D, toggle/T; latch, delay, one shot

Circuits: Encoders, decoders, transfer arrays, data ops, selectors, distributors, iterative networks
Components: AND, OR, NOT, NAND, NOR

Circuits: Amplifiers, delays, attenuators, multivibrators, clocks, gates, differentiator
Active components: Relays, vacuum tubes, transistors
Passive components: Resistor/R, capacitor/C, inductor/L, diode, delay lines
states, inputs, outputs
Elementary circuit theory is an almost prototypic example. The components are R's, L's, C's, and voltage sources. The mode of combination is to run wires between the terminals of components, which corresponds to an identification of current and voltage at these terminals. The algebraic and differential equations of circuit theory provide the means whereby the behavior of a circuit can be computed from the properties of its components and the way the circuit is constructed.
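As a minimal sketch of what "behavior computed from components and their mode of combination" can look like, the following Python fragment computes the equivalent resistance of a network built only from resistors wired in series or in parallel, and the output of a voltage divider. The class and function names (`Resistor`, `Series`, `Parallel`, `resistance`, `divider_output`) are illustrative assumptions, not notation from this book, and only the purely resistive, steady-state case is covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resistor:
    """A primitive component, characterized only by its resistance in ohms."""
    ohms: float


@dataclass(frozen=True)
class Series:
    """Mode of combination: components wired end to end."""
    parts: tuple


@dataclass(frozen=True)
class Parallel:
    """Mode of combination: components wired across the same two terminals."""
    parts: tuple


def resistance(net) -> float:
    """Compute a network's resistance from its components and its structure."""
    if isinstance(net, Resistor):
        return net.ohms
    if isinstance(net, Series):
        return sum(resistance(p) for p in net.parts)
    if isinstance(net, Parallel):
        return 1.0 / sum(1.0 / resistance(p) for p in net.parts)
    raise TypeError(f"unknown component: {net!r}")


def divider_output(v_in: float, upper, lower) -> float:
    """Voltage across `lower` when `upper` and `lower` are in series across v_in."""
    r_up, r_low = resistance(upper), resistance(lower)
    return v_in * r_low / (r_up + r_low)


if __name__ == "__main__":
    net = Series((Resistor(100.0), Parallel((Resistor(200.0), Resistor(200.0)))))
    print(resistance(net))
    print(divider_output(15.0, Resistor(1000.0), Resistor(2000.0)))
```

Because `Series` and `Parallel` accept other networks as parts, a whole network can itself be used as a component of a larger one.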
There is a recursive feature to most system descriptions. A system, composed of components structured in a given way, may be considered a component in the construction of yet other systems. There are, of course, some primitive components whose properties are not explicable as the resultant of a system of the same type. For example, a resistor is not to be explained by a subcircuit but is taken as a primitive. Sometimes there are no absolute primitives, it being a matter of convention what basis is taken. For example, one can build logical design systems from many different primitive sets of logical operations (AND and NOT, NAND, OR and NOT, etc.).

A system level, as we have used the term in Fig. 1, is characterized by a distinct language for representing the system (that is, the components, modes of combination, and laws of behavior). These distinct languages reflect special properties of the types of components and of the way they combine. Otherwise, there would be no point in adopting a special representation. Nevertheless, these levels exist in the system analyst's way of describing the same ...

... to come to mind first, but card readers, card punches, and Teletype terminals are other examples. These devices obey laws of motion and are analyzed in units of mass, length, and time.
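The earlier remark that logical design systems can be built from many different primitive sets of operations can be checked by brute force. The sketch below, in Python, takes NAND alone as one basis and AND with NOT as another, builds the remaining operations from each, and compares truth tables. All function names are illustrative assumptions; logical values are represented as the integers 0 and 1.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """Primitive of the first basis: NAND on the values 0 and 1."""
    return 1 - (a & b)


# Basis 1: everything derived from NAND alone.
def not_from_nand(a: int) -> int:
    return nand(a, a)


def and_from_nand(a: int, b: int) -> int:
    return not_from_nand(nand(a, b))


def or_from_nand(a: int, b: int) -> int:
    return nand(not_from_nand(a), not_from_nand(b))


# Basis 2: AND and NOT taken as the primitives.
def and_(a: int, b: int) -> int:
    return a & b


def not_(a: int) -> int:
    return 1 - a


def or_from_and_not(a: int, b: int) -> int:
    # De Morgan: a OR b = NOT(NOT a AND NOT b)
    return not_(and_(not_(a), not_(b)))


def nand_from_and_not(a: int, b: int) -> int:
    return not_(and_(a, b))


def same_behavior(f, g, n_inputs: int) -> bool:
    """True if f and g agree on every combination of 0/1 inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))


if __name__ == "__main__":
    print(same_behavior(or_from_nand, or_from_and_not, 2))
    print(same_behavior(and_from_nand, and_, 2))
    print(same_behavior(nand, nand_from_and_not, 2))
```

Each basis yields components with identical behavior, which is the sense in which the choice of primitives is a matter of convention.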
The next level is the logic level. It is unique to digital technology, whereas the circuit level (and below) is what digital technology shares with the rest of electrical engineering. The behavior of a system is now described by discrete variables which take on only two values, called 0 and 1 (or + and -, true and false, high and low). The components perform logical functions: AND, OR, NOT, NAND, etc. Systems are constructed in the same way as at the circuit level, by connecting the terminals of components, which thereby identify their behavioral values. The laws of boolean algebra are used to compute the behavior of a system from the behavior and properties of its components.

The previous paragraph described combinatorial circuits whose outputs are directly related to the inputs at any instant of time. If the circuit has the ability to hold values over time (store information), we get sequential circuits. The problem that the combinatorial-level analysis solves is the production of a set of outputs at time t as a function of a number of inputs at the same time t. As described in textbooks, the analysis abstracts from any transport delays between input and output; however, in engineering practice the analysis of delays is usually considered to be still part of the combinatorial level. In Fig. 3 we show a combinatorial network formed from combinatorial elements which realize three boolean output expressions, O1, O2, and O3, as a function of the input boolean variables A and B. Note that in the symbolic representation of the structure we can write an expression that reflects the structure of the combinatorial network, but, on reduction, the boolean equations no longer reflect the actual structure of the combinatorial circuit but become a model to predict its behavior.

The representation of a sequential switching circuit is basically the same as that of a combinatorial switching circuit, although one needs to add memory components, such as a delay element (which produces as output at time t the input at time t − τ). Thus the equations that specify structure must be difference equations involving time. Again, there is a distinction (even in representation) between synchronous circuits and asynchronous circuits, namely, whether behavior can be represented by a sequence of values at integral time points (t = 1, 2, 3, ...) or must deal in continuous time. But this is a minor variation. Figure 4 gives a sequential logic circuit in both an algebraic and a graphical form and shows also the representation of the behavior of the system.

Now it is clear that logic circuits are simply a subspecies of general circuits. Indeed, to design the logic components one constructs circuit-level descriptions of them. For instance, Fig. 5 shows a circuit for a NAND (or NOR) gate plus a table of its behavior. It is evident that its behavior corresponds to that of the NAND gate only if certain restrictions hold; namely, that one does not look at the voltage (which is identified as the behavior variable in the logic circuit) during certain periods when it is transient ("settling down," to use the logic level phrase). Thus the logic level is an instance of the circuit level only in the same sense that the circuit level is an instance of Maxwell's equations as a limiting case in which certain features are deliberately ignored. One buys a great deal from the specialization to logic circuits, since one can compute the behavior of circuits at the logic level that are extremely complex at the circuit level. The techniques for doing so use an entirely different mathematical apparatus.

In general, we cross into another level when the representation at the previous level provides information that is no longer relevant. A lower level is concerned with explaining the behavior of a certain structure, whereas the next highest level takes the lower level as given (a primitive). The higher level is concerned not about internal behavior but only how primitives are combined.

A glance at Fig. 1 shows that we have described only the lower part of the logic level. There is another part, called the register-transfer level (or RT level). This is still an uncertain level, a matter ...
... level. The practicing logic designer (by now an institutionalized position, on a par with that of circuit designer) has sequential and combinatorial circuits as his basic analytic tools, and he attempts to design systems on the register-transfer level (e.g., central processors) with these as tools. The register-transfer level has emerged from the informal attempts to create a notation closer to the job to be done.

Recently there have been a number of efforts to construct formalized register-transfer systems. Most of them are built around the construction of a programming system or language that permits computer simulation of systems on the RT level. Although there is agreement on the basic components and types of operations, there is much less agreement on the representation of the laws of the system (corresponding to the production system in Fig.
Fig. 6. Register-transfer sublevel of the logic level: sum of integers. (In the figure, s is an abbreviation for start and r is an abbreviation for run.)
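The kind of computer simulation of an RT-level system mentioned above can be sketched in a few lines of Python. The example is only loosely modeled on the sum-of-integers machine named in the caption of Fig. 6: the register names S, I, start, and run follow the legible parts of the figure, but the exact transfers, the termination test against N, and the function name `sum_of_integers_rt` are assumptions made for illustration.

```python
def sum_of_integers_rt(n: int, max_clocks: int = 10_000):
    """Simulate a small register-transfer machine that forms 1 + 2 + ... + n.

    Each loop iteration is one clock event. All transfers within a clock
    event are computed from the register values held before that event.
    """
    if n < 0:
        raise ValueError("n must be non-negative")

    regs = {"S": 0, "I": 0, "start": 1, "run": 0}
    trace = []  # register contents after each clock event

    for _ in range(max_clocks):
        if regs["start"]:
            # start -> (S <- 0; I <- 0; start <- 0; run <- 1)
            regs = {"S": 0, "I": 0, "start": 0, "run": 1}
        elif regs["run"]:
            if regs["I"] == n:
                # (I = N) -> (run <- 0)
                regs = dict(regs, run=0)
            else:
                # (I <- I + 1; S <- S + I + 1), both read the old value of I
                old_i, old_s = regs["I"], regs["S"]
                regs = dict(regs, I=old_i + 1, S=old_s + old_i + 1)
        else:
            break  # machine has halted; no further transfers occur
        trace.append(dict(regs))

    return regs["S"], trace


if __name__ == "__main__":
    total, trace = sum_of_integers_rt(4)
    print(total)
    for step in trace:
        print(step)
```

The trace makes the level of description visible: the state is a handful of named registers, and behavior is a sequence of conditional transfers between them, with no reference to gates or voltages.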
Fig. 2a. [Chart: computer families by organization and year, 1940 to 1969, annotated with word or character sizes. Legend: reasonably compatible series; upward compatible; non-stored-program calculator.]
Fig. 2d. Time chart: pre-computer technology. [Legible entries include the Schickhardt, Pascal, and Leibniz calculators; the Jacquard punched-card loom; Babbage's Difference Engine and Analytical Engine; Hollerith punched cards (used for census); the use of Boolean algebra for switching circuits (Shannon); the Harvard Mark machines; the Bell Telephone Labs relay calculators; and ENIAC.]
Function
The most
striking fact about function
is
the existence of only a
and with only a few values. Perhaps we have taken a simplistic view of the functions that computers perform, but we think our computer space represents reality: To wit, there single dimension,
is
remarkably
little
shaping of computer structure to
fit
the func-
task.
The
latter
is
often carefully specialized to is mostly the amount of
the function to be performed. But this Mp, the amount of types of Ms, and the
Within of
limits, these are all
computer
(i.e.,
to
number and types of T's. items that can be attached to any type
any Pc) and are handled
in
an environment-
At the root of
independent way. Thus there is little specialization of computer types, but great specialization of particular configurations. That
which
this
tion to be performed. this lies the general-purpose nature of computers, the functional specialization occurs at the time of programming and not at the time of design. However, it might
in
assembled for a
seem
all
that specialized environments
erality, so that functional this
would not require
adaptation would
appears not to be so for two reasons.
tions of the
Pc
(as
defined in the ISP)
is
still
all
the gen-
be possible. But
First, the level of
operatoo basic to reflect the
should be the case indicates something about the nature of that it can be expressed adequately
the functional specialization in gross
PMS
There
terms, as
is still
more
—
more
bits of storage
to the story.
Some
and more data
rate.
functional specialization
indicated in the dimension. This depends primarily on two kinds of things beyond the reach of the configurational adapta-
exists, as
The
demands
kind of specialization offered by the environment (think of infor-
tion described above.
mation-transfer or conditional-transfer operations). Second, all environments ultimately require a variety of tasks in addition to
ruggedness, small size, etc. These have strong effects on design, but below the ISP and PMS levels. The second consists of demands
the main specialized task. These include at least language compilation or assembly, readable formatted output, debugging aids,
affects design at the
and other
utility routines.
By the time
these have been added, a
been generated. A second part is the differ-
for large
and has
first
consists of
for reliability,
amounts of processing power. One response to this again lower levels of logic, devices, and circuitry
little
impact on design at the ISP and
PMS
level.
But
substantial requirement for generality has
response
However, this is not the whole story. ence between the computer type and the
into the ISP. Large machines have data-types that are appropriate
specific configuration
is
also possible in terms of the data-types that are built
to their tasks (with operations to match),
and these
affect the
The computer space
Chapter 3
design. In fact, this effect cialization
shown
in the
the substance of the functional spe-
is
in the
of the
computer-space dimension.
look-ahead of Stretch (Chap. 34) and the n-instruction buffer CDC 6600 (Chap. 39). This might be considered a unique
one last part of the story, and it is the most Various groups of computer engineers have felt strongly from time to time that functional specialization should exist, and they have set out to create such machines. These efforts
functional specialization for scientific computation. It is too early to tell, but it is our impression that, although the needs for sci-
have often produced machines that were different from the existing main line of computers, i.e., were appropriately specialized.
a certain power, whatever the task domain. Physical limits on
Finally, there
interesting of
is
all.
But the net effect of almost
new
all
such attempts has been that the
idea was seen to be good in general for
was taken back
into the
main
all
computers and
line of computers. Thus,
what started
out to be a functional separation turned out to be simply a way to produce rapid development of a more universally applicable
A
the expansion of input/output example computer. facilities in creating a functionally specialized business machine, classic
is
which simply led to better I/O facilities will have more to say about such examples
for all
as
computers.
we discuss
We
computation initiated the exploration of concurrency and parallelism, we will eventually see them in all computers above entific
component speed and
signal propagation will
make
these tech-
niques universally attractive.
A better
case for permanent specialization can be made in the special algorithm computers, which compute the fast Fourier transform or do vector operations. Here we finally have systems
whose whole design
is responsive to a narrow class of problems. extend to the very special kinds of Pc parallelism exhib-
This
may
ited
by the ILLIAC IV (Chap.
generality in
27),
although there
is
substantial
such systems.
the values
computing it was felt by was a functional many major separation between business computing and scientific computing. 1 Scientific problems were
along the dimension.
Business. In the early days of electronic that there
Computer-system function Scientific.
The
first
machines were clearly designed for scientific Aberdeen Proving Grounds funded the early
calculations. In fact,
ENIAC
"large computing-small input/output"; business problems were "small computing-large input/output." Certainly most of the
the work sheet, and the program
had poor used the Pc input/output example, to control everything dynamically, actually catching the bits from running tapes on the fly (by executing well-timed small loops).
the instructions that the mathematician gave to the clerk. From a design standpoint, scientific computation has posed two
These design efforts for business computers resulted in the IBM 702 (and subsequently the IBM 705, 708, and 7080). This machine
work on the
for the
computation of
ballistic firing tables.
And
the image used frequently by the early computer designers was the computer as a statistical clerk, the arithmetic unit being the desk calculator, the
striking requirements. bers,
which has led
memory
The first is the word lengths
to
great accuracy of the
num-
of 36 to 60 bits (11 to 18
decimal digits of significance) and arises from the propagation of roundoff error during repeated arithmetic operations. The second is the emphasis on fast arithmetic operations, i.e., for arithmetic power. In the early machines the standard rule for estimating computation times was to count the number of multiplications in
be neglected. The arithmetic unit has where the floating point multiply is hardly more
existing computers, designed for scientific computation, facilities.
had two major innovations a
PMS
input/output.
The
scientific
Thus, the main effect at the ISP
is
The main PMS
used characters, and flexible
it
had
and voluminous
was immediately incorporated and then into all large
output for
scientific calculation.
The have
its
Thus the bifurcation was tempo-
specialization to characters as a basic type (as opposed
was already present
effect until 5 years later
in the
IBM
702 but did not
with the development of the
IBM
1401 (Chap. 18). The latter machine was adapted to business, both in being character-based and in being small enough so that small businesses could afford
it. It
was extremely successful (many thou-
is
the emphasis on
sands were produced) and certainly represents a successful func-
press for increased arithmetic processing
has led in recent
'Such feelings are still extant, but we are concerned here not with the validity of the feelings but with what they led to at a particular period
operations in the ISP.
the classic "statistical clerk"
The
latter feature
It
more
computers as separate input/output control (either Kio it was realized that there were also demands on input/
to long words)
level.
IBM:
or Pio), for
developed to expensive than floating point add. This requirement on fast arithmetic, however, has really been directed at the logical design level,
PMS
701, for
into scientific computers, e.g., into the 709,
rarily halted.
the adoption of long word lengths, floating point data-types (in addition to integers), and an extensive repertoire of arithmetic
for
structure that permitted
a program; all else could
not at the ISP or
The IBM
PMS
effect
design.
times to the development of various forms of Pc concurrency, as
of
computer development.
47
48
The structure
Part 1
of
computers
tional specialization for business.
However, it is interesting that the specialization has not been maintained, for the IBM System/360 (Chaps. 43 and 44) is again a single machine, although it has in essence two internal ISP's, one centered around characters and the other around floating point data-types, that is, a business and a scientific specialization residing side by side.¹
Control. The third functional value is a computer used for control in real time. Examples are process-control computers, aerospace computers, and laboratory instrument-control computers. The role of the computer is to act as a sophisticated control (K) in some larger physical process, and thus it plays a subordinate role. Their relatively late arrival was due to the high cost and unreliability of early computers, as well as to the lack of necessary interface equipment.

The functional specialization is seen most strongly in the word size, which reflects the appropriate numerical data-type. The numbers used in control processes are generated by physical devices and are rarely better than 0.1 percent accurate. Since elaborate arithmetic calculations are not called for, the numbers, and hence the word size, can be around 12 bits. Most control computers have been 12 to 18 bits/word.

A second specialization, again reflecting appropriate data-types, is that all control computers are binary and have boolean operations. This arises because many of the external conditions to be sensed and effected are binary in nature.

About the only other functional specialization of control computers is the interrupt capability² to allow them to respond to many potentially simultaneous external conditions in real time. This provides apparent parallelism, though still using a sequential processor. This is another possible example of functional specialization leading to reunification rather than divergence, for it has again been widely accepted that all general-purpose computers must have good interrupt capabilities. However, interrupts, though not existing in early computers, were developed to obtain good input/output facilities, not for control computers.

Chapters 7 and 29 give examples of aerospace computers, and Chap. 33 describes the IBM 1800, which is specifically designed for process control. As these examples show, a complex ISP is not necessarily required. This in part reflects the fact that control computers may retain their programs over their whole lifetime, so that programming and reprogramming is less important. (It is not absent, however, and so this is not a very strong functional adaptation.)

Communication. The functional specialization of communication could be taken as a subfunction of a control computer. The function is mainly to behave as a switch. In a message-switching application the computer transfers messages from terminals (and links) into primary (and sometimes secondary) memories and then transfers them to other terminals (and links). In message switching, messages are first stored and then forwarded. The computer in a telephone exchange functions as a very sophisticated switch control. Here the computer reads the off-the-hook signal, detects the dialed numbers, rings the dialed parties, and finally sets the switches to connect the telephones together. In some instances, when it answers information inquiries about new telephone numbers or reroutes calls to other phones, it functions as a memory. Thus a communications computer is functionally a switch or a control for a switch.

The main distinction between control computers and communications computers is that the task environment of the latter, since it consists of digitally encoded messages (even in the case of the voice telephone exchange), can be handled directly by the communications computer. That is, the communications computer can do the work of transshipment and storage as well as control. There are no pure examples of communications computers in this book. However, the Pio's serve essentially the same function within a single computer (Part 4, Sec. 1), and they can profitably be examined from this viewpoint.

File Control. We list this as a separate specialization only because, in actuality, a number of computers have been built to do exactly this task. The specialization is easily described: It is a communication computer with the messages being characters (since they are built for business), and with the large memory (the file) being considered to be part of the system. There are no examples of file-control computers in this book, but the early IBM 305 and UNIVAC file computers serve this function. An IBM 1800 is used as the control for a 10^12-bit photo-optical memory, for example.

¹ The story above has been told exclusively in terms of IBM machines. Although this does not distort the picture too strongly in terms of total movements of the field, since IBM dominated the market, concurrent developments were taking place throughout the field. UNIVAC I was the first computer built by a manufacturer and did not have the idiosyncrasies we ascribe to IBM; on the other hand, the marketing effort for it was nil.

² Apparently introduced in the UNIVAC 1103.
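The word-size argument in the Control paragraphs above (data rarely better than 0.1 percent accurate, hence words of around 12 bits) can be checked with a little arithmetic. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and it assumes an unsigned binary fraction with no sign bit and no guard bits for intermediate results.

```python
import math


def bits_for_relative_accuracy(relative_accuracy: float) -> int:
    """Smallest number of bits whose resolution (1 / 2**bits) is at
    least as fine as the given relative accuracy."""
    return math.ceil(math.log2(1.0 / relative_accuracy))


def resolution_percent(bits: int) -> float:
    """Resolution of one part in 2**bits, expressed in percent."""
    return 100.0 / (2 ** bits)


# 0.1 percent accuracy is one part in 1000, which needs 10 bits.
print(bits_for_relative_accuracy(0.001))  # 10

# A 12-bit word resolves one part in 4096.
print(resolution_percent(12))  # 0.0244140625
```

Under these assumptions a 12-bit word leaves about two bits of margin beyond the accuracy of the measured data, which is consistent with the 12 to 18 bits/word range quoted for control computers.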
Chapter 3 The computer space

Terminal. Since it is possible to obtain a separate computer system whose only function is to run a display, we have listed this as a separate functional specialization. In fact, it is better viewed (and almost always occurs) as a component of a larger computer system, i.e., as a special Pio. The DEC 338 is such a P.display and is described both later in this chapter and in detail in Chap. 25.

Time-sharing. The requirement to have a large number of users in simultaneous conversational interaction with a single large machine has bred a new specialization, that of the time-sharing computer. All the computers described above can be time-shared (even if they do not have interrupts or inherent multiprogramming). However, the emphasis on this mode of operation with the particular timing and flexibility requirements of human users doing general computing at consoles in multiple software systems has led to a number of innovations in design. The most important is the virtual-memory techniques for achieving multiprogramming (described in Part 3, Sec. 6). There is also substantially increased complexity of PMS structure to handle the integration of large files, swapping memories, and the huge software systems that seem to be endemic to time-sharing systems. It is still too early to tell whether any of the design responses will produce permanent specialization or will again simply be the first instigation of design features that will become universally used.

In summary, we see that there is functional specialization and that it translates mostly into total size of the machine and into the data-types available. Many of the other design aspects created in response to functional specialization have instead become the common property of all machines.

Performance

For a device that does a complex job, it is meaningless to ask for a single precise index of performance. It is like asking for the average speed of a given model of car over its lifetime without specifying who will own it, where he will drive it, and what sort of terrain he will encounter along the way. Notice that the difficulty is as much in the complexity of the task environment as in the complexity of the internal workings of the machine. Specify everything about the environment, and the performance can often be given in a single figure. It may be hard to determine, but it is at least well defined. If you know the terrain and road conditions perfectly and how the car was driven, then from the structure of the car it is possible to figure out the instantaneous velocity and from this to construct the average speed.

To put this in terms of computers, given a particular configuration for a computer system, given a particular program, and given a particular set of input data, it is possible to determine all aspects of the performance: whether it was correct, how long it took, how much space was used, and so on. But we are not interested in such specifics. We want to know how well the computer system performs, given some vague notion of the kind of task programs and data that will be used with it. Although we know that we cannot have adequate measures, we believe that there is something that can be said about the performance that tells us that a CDC 6600 is many times more powerful in actual performance than a PDP-8.

An interesting way to look at the problem of specifying performance is to play a simple game: We will give you a number, say 4. You are to give the best description of computer systems involving only that many parameters (equivalently, dimensions or attributes). That is, what is the best description of a computer that can be stated in four numbers? The game is easier to play if we speak of the dimensions, rather than the information content of the description (in bits, say).¹ We have still not defined "best," of course. It can be taken to mean the best prediction of the relative ordering of the computer system; better on the index means better on the same task.²

¹ It is not fair, of course, to invent tricks to encode many conceptually independent dimensions into a single one, just to beat the limit. On the other hand, composite dimensions, such as average operation time, are perfectly acceptable.

² Definitional precision is not appropriate, since we are not attempting to deal seriously with the technical questions of indices, only to illustrate the issues.

To start at the beginning, what single number would you give to characterize a computer's power? Such a question makes most people uncomfortable, since strong feelings exist for at least two kinds of numbers, dealing with speed and memory, respectively. If forced, we would probably settle for something related to processing speed. The cycle time of the primary memory is a possibility because for simple machines it determines (limits) the operation rate. It is a structural parameter, but that is no reason to avoid it as a performance index. The average number of instructions per second, or operations per second, is a better indicator. Since the latter does not take into account the size of the word being processed, perhaps average bits processed per second is the best single number. (We measure this number at the processor, and include both the instruction and data streams.)

To take an average we must adopt some weightings. The simplest scheme is simply to add all the instruction (or operation) times and divide by their number. This is equivalent to weighting them equally, the rare ones and the common ones. If we want to do better than that we need some data. Several sets of relative frequencies of instruction types, called "mixes," have been used in the literature. Table 2 gives four examples. The Gibson mix is
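The two averaging schemes described above, and the bits-per-second index, can be sketched in a few lines. This is only an illustration under stated assumptions: the instruction times and the mix weights below are hypothetical values chosen for the example (they are not taken from Table 2 or from any real machine), the function names are invented here, and the bits-per-second figure assumes that each operation processes exactly one word.

```python
def equal_weight_average(times_us):
    """Add all the operation times and divide by their number."""
    return sum(times_us.values()) / len(times_us)


def mix_weighted_average(times_us, mix):
    """Average operation time weighted by relative frequency (a "mix")."""
    total_weight = sum(mix.values())
    return sum(times_us[op] * weight for op, weight in mix.items()) / total_weight


def bits_per_second(average_time_us, word_bits):
    """Operations per second times word size (assumes one word per operation)."""
    operations_per_second = 1e6 / average_time_us
    return operations_per_second * word_bits


# Hypothetical operation times in microseconds and a hypothetical mix in percent.
times_us = {"add": 2.0, "multiply": 10.0, "load_store": 4.0}
mix = {"add": 30, "multiply": 10, "load_store": 60}

print(equal_weight_average(times_us))       # about 5.33 microseconds
print(mix_weighted_average(times_us, mix))  # 4.0 microseconds
print(bits_per_second(4.0, 36))             # 9000000.0 bits per second
```

In this made-up case the rare but slow multiply dominates the equal-weight average, while the mix-weighted figure is pulled toward the common load/store time, which is the reason for wanting frequency data in the first place.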
Table 2 Instruction-mix weights for evaluating computer power
Gibson 1
Arbuckle [1966] Fixed
+
/
Knight
-
10(25) 6
X
Knight (commercial)
(scientific) 2
25(45)
2
1
2
+ x
Floating Floating
/
—
5.6
2.0
-=-
Floating
10
9.5
Load/store
28.5
Indexing Conditional branch
22.5
25 (move)
Compare
20 24
Branch on character
10
13.2
4
Edit
7
I/O initiate
Other
72
18.7
74
¹ Published reference unknown.
² Extra weight for either indirect addressing or index registers.
probably the best known. The best source for such data comes from instruction counts of running programs. Knight takes the view (Fig. 3) that a single number can be used to indicate power, and his formula has been evaluated for some 300 computers [Knight, 1966]. His formula is the product of three factors: processing time, memory size (in words), and word length. The formula was derived (roughly) to measure power so that technological change could be modeled. Applying the formula is like measuring automotive-vehicle power as a product of speed, weight, and the number of wheels. (Such an indicator is roughly proportional to a car's momentum.) Thus, although it is a reasonable single-number indication for power, a computer buyer could not use it directly.

Taking averages, as in the case of mixes, suggests a more sophisticated approach. A collection of programs, called a "bench mark," is developed that does a variety of different tasks. Then the one number is the time it takes to do this collection. Such a bench mark generates its own frequencies of occurrence of the primitive instructions. It brings in a number of additional dimensions that affect performance: the instruction code, the size of Mp, programming skill, input/output devices, etc. It also carries with it an implicit frequency of different kinds of task demands (how much of the set involves compiling, how much number crunching, how much I/O, etc.).

There are severe practical problems in carrying out such measurements on many computers, since the problems must be coded and run on all the systems. It is somewhat easier if the task set is restricted to programs coded in a procedure-oriented language, such as FORTRAN, where all computers accept FORTRAN. Nevertheless, although it has often been done to compare two systems, only occasionally has it been done for even a modest number. We feel that for a general-purpose computer the compiler-derived bench mark is a reasonable single-performance number. Much actual use will be with the compiler, and good compilers produce code to rival hand coding, so that special features of the machine are utilized. Cox [1968] compares several, using hand coding and compilers for several tasks.

There is a difficulty with the bench-mark scheme that is inherent in its strongest advantage, that of doing a total problem and thus integrating all features of the computer. The number obtained depends not only on the type of computer, for example, an IBM 704, but on the exact configuration, for example, 16 kwords of Mp versus 32 kwords, and even on the operating system and the software (which version of FORTRAN). Thus, although the number perhaps comes closest to an adequate single-performance figure, it becomes much less of a parameter characterizing the structure of the computer than one characterizing a contingent total system.

Let us underscore again the distinction between the computer type and the particular configuration (possibly including basic software) assembled in a particular installation. Computer systems are designed with certain forms of variability. To specify a CDC 1604 is to specify many things, such as the ISP of the Pc, the cycle time of Mp, the K's used to control secondary memories (Ms), and interfaces to the external world. But it leaves open many other
Fig. 3. Knight's functional model algorithm to calculate P for any computer system. (Courtesy of Datamation, vol. 12, no. 9, September, 1966, page 42.)
[Figure: formula for the computing power P of a computing system in terms of the word length L (in bits), the total number of words in memory T, the time t_c for the Central Processing Unit to perform 1 million operations, and the time t_I/O the Central Processing Unit stands idle waiting for I/O to take place, together with semi-constant weighting factors whose values differ for scientific and commercial computation.]
things, e.g., the types and sizes of Ms and the size of Mp. On some computers it can even leave open part of the ISP (e.g., the multiply/divide options on many small machines), or the speed of the Pc and Mp (e.g., in the IBM System/360).

When we ask questions about computer systems, we should be clear whether we are talking about a computer "type," such as CDC 1604, or whether we are talking about a particular installation, with all the variability specified. It is possible to describe either with PMS and ISP, provided we recognize that the diagrams for the types represent maximal possibilities for assembling particular systems. This is how almost all the PMS and ISP diagrams in this book were prepared. From the point of view of our "number game," if we are talking about computer types, we might prefer numbers that do not depend on the particular configuration.

If two numbers were available for describing performance, what would they be? Clearly there are several directions to go. One could fractionate the bench mark, so that one has a bench mark for arithmetic-rich tasks and a bench mark for others (a composite of compiling and data processing). One could decompose the processing rate into, say, operations per second and word size (from which bits per second can be recaptured approximately). Alternatively, one could retain only a single rate number for processing and add a measure of the memory available, e.g., size of Mp (in bits). Of the three we would choose the latter, especially if we were talking about a particular installation rather than computer types, for which Mp size remains variable.

We can continue this game through several numbers. Table 3 shows some of our choices. Various parameters drop out or change only when they are decomposed into other parameters from which they can be recovered. Thus, initially Mp must be measured in bits, but when the word size is given, Mp is more reasonably measured in words. One of the reasons for exposing such a list is to emphasize its judgmental and approximate character. There is as yet no way to validate such proposals for brief descriptions.

Table 3 Performance parameters specification (as a function of an allowable number of parameters)
Number of parameters allowed: 1 ...
Parameters: Pc(i.rate:(b/s)); Pc(operation-rate:(op/s)); Mp(size:(b)); Pc(width:(b)); Mp(i.(words)); Ms(i.(words))

If we had bench marks, which are themselves only approximations at measuring performance, we might look at how well the parameters in Table 3 predict the bench marks. But there remain the difficulties of how to take into account the additional aspects of the total system (e.g., compiler efficiency) that are implied in the bench mark. Alternatively, one might want to construct a mixed description of bench-mark numbers and measurements of the kind in Table 3. Then the relationship between bench marks and these other measurements would become an indirect measure of the efficiency of the rest of the system.

We have discussed performance in a crude and cavalier way, but this accurately reflects the state of the art. There are no precise measures for performance. There are precise structure and performance measures of individual components (e.g., memory size, and speed and word length, and processor instruction times). When designers (and users) are faced with obtaining a certain total performance for a given cost, the only method is that of the bench mark, because the task is such a significant variable. If performance is to be increased, unless the task is sufficiently trivial, it is difficult to predict what effect changing even the most direct structural variables will have (e.g., memory speed).

Structure

We now turn from function and performance, which provide design constraints and objectives, to the dimensions of structure, which provide the space in which the design is actually cast. A structural dimension is one in which the designer can attain any of the values along the dimension by relatively direct means. Thus a machine is completely specified by listing all its values along the structural dimensions. From this, the system's function and its performance within that function can be determined.

What dimensions should be selected for structure? The viewpoint is distinctly different from that of performance, where one
averages and combines many features to summarize effective output. This tends to obscure structure. For structure, one wants maximally independent aspects which are easily obtained if selected as a design choice. For example, if the computer designer had only a single dimension to describe a computer, he would undoubtedly select the logic technology used in the Pc and K's. This tells him a good deal about many aspects of the computer's structure. In fact, the technology and the average bits processed per second by the Pc are correlated, and so each can be used to predict the other, though only imperfectly. If one is interested in performance, effective bits per second is preferred; if one is interested in design, technology is preferred.

The computer space in Table 1 presents our choice of the major structure dimensions. There is even less means to validate the choice of dimensions here than there is for performance. Nevertheless, there are a few hallmarks. Perhaps the most important is redundancy (the opposite side of the coin from independence, mentioned above). Several dimensions of structure may covary, so that giving any one of them is tantamount to giving the others. This covariation need not come from physical dependence; it may arise from the nature of an appropriate design and good engineering practice. Such a cluster of covarying dimensions is likely to indicate an important dimension (which one among the correlates is to be used is a secondary matter). Table 1 is organized in terms of such clusters, with one of each selected as the main representative and placed at the left.

A second hallmark derives from the hierarchical nature of computer systems. Generally a description of a system consists of the union of the description of its parts, plus a description of the interconnections. This is the basic style of PMS, for example. But there are a few features that affect the total system, i.e., affect many components. These are usually rather important. Technology is a prime example.

Yet a third clue is that the dimensions discriminate the actual population of computers. If all machines had single-address instructions, for instance, there would be no sense in using number of addresses per instruction as a dimension. Any computer engineer who had studied machines at all would know this to be true of all computers. Thus one looks for dimensions that spread the machines out evenly into a substantial number of categories.

If the dimensions of the space are known, a computer is supposed to be defined by a single point. For most existing computers this is actually the case. However, if a computer system were complicated enough, say consisting of several processors, each built with different technologies and having a different number of addresses per instruction, then such a representation would not be possible. For instance, the Rice University computer uses vacuum tubes, transistors, and integrated-circuit logic. But such complexities are rare; time and good engineering practice work against it. If it were necessary to consider such cases, then additional dimensions (e.g., for secondary and tertiary logic) could be added, or several points in the space for a given computer could be used.

The computer-structure space is thus our choice of the seven most important dimensions. It is our response, so to speak, to playing the number game, given only seven descriptors. They are arranged in order of importance, although clearly no simple way exists to validate such an order. But, if we were to have only three attributes to describe the structure of a computer system, we would pick logic technology, word size, and PMS structure (i.e., what processors exist with what functions).

At this point we are ready to proceed through the space, describing the various dimensions and discussing how the computer systems in this book illustrate various points along them. We take up each major dimension separately. A few of the correlated dimensions are accorded separate sections, but most are discussed along with the main dimension.

Technology

Computers are constrained by the physical technology from which they are constructed. It is not just that new technologies provide greater speed, size, and reliability at less cost, although of course they do that. But technologies dictate the kinds of structures that can be considered and thus come to shape our whole view of what a computer is. For instance, the emergence of the PMS level is due to advances in technology. Prior to transistor technology, it did not make sense to think of elaborate PMS structures. The costs of the various parts were too high and the reliabilities were too low. When, occasionally, such a machine was in fact designed, it invariably proved too far ahead of its time to succeed. An example in this book might be the RW-40, described in 1960 (Chap. 38). A more classic example is the Analytic Engine of Babbage, which he designed in 1844 and was never able to complete.¹ The technology of the time was entirely mechanical, and its crude state accounts for a large share of the failure. Thus the technology is by all odds the most important single attribute to know about the computer system.

¹ Thus, the first real digital computer established the precedent of failing by a large margin to meet the expected dates of completion and full operation.

Many technologies go into making up a computer. Each component type typically uses a different one. In current (so-called
third-generation) machines the Pc may use hybrid- and integrated-circuit technology for its logic, thin-film technology for the Pc generalized registers, core technology for the Mp, electromechanical technology for tapes and disks (with integrated circuits for logic), mechanical technology for card punches and typewriters, and even manual technology for mounting tapes and disk packs.

The existence of all these technologies poses major issues of systems balance, issues which are only imperfectly resolved. For example, it remains true in the current generation that input/output is not in balance with the internal structures. This is due to the crude state of terminal technology, so that it appears to cost too much to provide an appropriate solution.¹

¹ Although beside the point of the current discussion, one reason why these imbalances appear to be "permanent" is that the time constant for change in the technology is of the same order as the time constant for human beings (i.e., systems analysts, programmers, and users) to understand the imbalance. Before system imbalance is diagnosed and solved, the terms of the problem change, inducing new imbalances.

The heterogeneity of technologies is not a consequence of cost/benefit analysis; rather, each represents the forefront technology for the type of device shown. (There is, of course, cost/performance exchange for any component, but this is usually within a technology.) Thus there is a sense in which the leading technology can be used to represent them all. This is the technology used for the logic level and is the one listed in the computer space. If it is known that transistor logic is used in the Pc of a computer, it is a safe prediction that Mp is core, Ms is electromechanical, Tio is electromechanical printers and punches, etc. This reflects the fact that technology develops and hence becomes locked with calendar time. Thus a prediction is from logic technology to date and then to all other things known to be current at that date.

This correlation of date with technology is given in the computer space along with the generation. It can also be seen in the time chart. The correspondences must be taken as very rough only.

The technologies are listed in increasing power (and decreasing cost). The dates run in exactly the same order. The one exception is fluidics, which has been introduced very recently and is a special technology for ruggedness, reliability, and direct external coupling in certain control systems. (Small fluidic computers are at the early prototype stage.) Alongside the technology dimension we list the dimensions: Pc speed (operations per second), and cost (dollars per million operations), all of which vary directly (or inversely) with logic technology. In general, costs are extremely difficult to determine, especially when technological costs are of interest rather than market costs (which reflect numerous other factors). Nevertheless the effect of technology on costs has been so striking (while simultaneously pushing up performance along all other dimensions) that it seemed necessary to give a measure of cost in Table 1, no matter how crude.

We have indicated only a few of the dimensions that are correlated with technology. In fact, the only dimensions in Table 1 that are independent of technology are the word length and the Pc addresses/instruction. All the rest show dependence on technology. For some, such as memory speed and size, there is a direct correlation. For others, such as PMS structure and Pc concurrency, the development of more complex versions (the leading edge, so to speak) depends on technology, but there is free use of all versions that are in existence at any given time. There are still other dimensions of importance, not shown in Table 1, that have also changed with technology, e.g., electric-power consumption.

One way to see both what varies and what is independent of technology is to compare selected machines. For instance, Whirlwind (Chap. 6), a first-generation system, and the IBM 1800 (Chap. 33), a third-generation system, have reasonably similar ISP descriptions, if one ignores index registers, which were not invented at the time of Whirlwind's design. However, they have very different PMS structures. In Whirlwind, the early system, information transferred between Tio's and Ms was under program control of the Pc. The existing Pc registers and transfer gates were used because it was too expensive to have separate ones. In the 1800, which uses hybrid circuits, it is economical to have additional subsystems devoted to special functions; hence there are many Pio's operating independently of the main Pc.

It was not cost alone that limited the complexity of first-generation vacuum-tube systems. The large physical size of tubes introduced substantial transmission delays; their large power consumption added dependency on a cooling system; and their limited life and deteriorating nature constrained the number of tubes that could be used in a system requiring high reliability.

The IBM 700 scientific series (701, 704, 709, 7090, 7040, 7044, 7094 I and II) offers another comparison, where there is an evolving structure over time, hence across technologies, but where for reasons of compatibility the ISP's have remained almost constant (except for the 701). Again we see radical increases both in performance (Pc speed increases by a factor of 5 from the 701 to the 704 and another 10 to the 7094 II) and PMS complexity. But various other features, though not affecting compatibility, were locked in with the ISP and remained fairly constant. For example, Mp size
went
to
32
kw
(kilowords) early in the series with the 704; and
The computer space 55
Chapter 3
took a jerry-rigged modification to get 64 kw on a 7094 toward the end of the lifetime of the series (see Chap. 41, page 517). it
Throughout this section we have referred to technology as the dominant factor in the computer. Does this mean that computer development waits upon new fundamental windfalls? We have been lucky in getting the transistor and, to a lesser degree, the integrated circuit from external efforts. However, core memories were invented for the computer and resulted because of need. Read-only memories have also resulted both from development at the circuit level and from pressure above, requiring the memories to be developed. All the electromechanical secondary memories (i.e., magnetic tape, drums, disks, and photostores) have resulted from the computer's needs. Thus, although technology is dominant, the computer often forces the development.

Operation rate

The Pc operation rate is strongly correlated with logic technology, as we have indicated in the computer space. Our discussion about technology and generations is also about operation rate. The principal reason for the higher operation rate is faster logic technology. Technology also has a secondary effect on increasing speed. More reliable devices allow large computers to be built. Smaller devices allow higher device densities, thus decreasing stray capacitance and inductance and shortening transmission delays. Smaller components also allow increased interconnection density.

Operation rate is also relatively highly correlated with total performance. If we hold the structure and concurrency constant, the simplest way to increase performance is by increasing the clock rate. The increases in the performance/cost ratio over the past two decades of computer evolution have made their primary gains through higher operation rates. The two 16-bit computers already mentioned, Whirlwind (Chap. 6) and the IBM 1800 (Chap. 33), provide a nice comparison of the evolution. With a difference of 10 years and two generations, their cost ratio is ~10:1 whereas performance is ~1:5 and the internal clock rates are also ~1:5.¹

¹ However, it is not as dramatic an example as we could find. By picking a better third-generation example we might get a cost ratio of ~100:1 and a performance ratio of ~1:10.

Information structure: word length, information base, and data-types

All computers structure their information in a hierarchy of units, which we defined as an i-unit in Chap. 2. For example, the IBM System/360 starts with the bit; then the byte, which is 8 bits; then the word, which is 4 bytes; then the record, which is a variable number of words. In between, playing minor roles, are decimal digits (4 bits), the halfword, and the double word. A number of features of the design are related to this hierarchical organization of data. Before we consider them, we need to characterize the organization itself. One characteristic of this organization, the word length (in bits), gives most of the information, the rest of the hierarchy adding only a little. Let us see why this is so.

At the bottom there is the bit. Although other numbers of states are possible, and ternary (three-state) machines have been proposed occasionally, digital technology has developed exclusively in two-state devices, to handle binary encoded information. There are several reasons for this. The first is the requirement for high reliability and high signal-to-noise ratios in the basic devices. Generally a basic n-state device (that is, one not built up from other k-state devices) is realized by breaking a continuous physical dimension, such as voltage, current, or magnetic flux, into n discrete levels or regions. Reliability and signal-to-noise ratio then depend on keeping adequate separation. This is easiest to do with two states (e.g., in the limit they become on-off devices) and becomes progressively more difficult as n increases. The second reason is the simplicity of the logical design for binary representations. A basic device for combining two ternary digits must deal with 3 x 3 = 9 configurations, rather than 2 x 2 = 4 configurations for the binary case. This also gets worse as n increases. A final reason (the coup de grace, so to speak) is that no one has ever found striking advantages for the resulting processing structure in having more than two states. Thus there are no compelling reasons to suffer the first two disadvantages. In short, what might have been an important dimension on which to distinguish computers, namely, the number of states in the basic encoding, turns out instead to be one of the great uniformities in digital technology.

Information base. That the physical devices deal ultimately in bits does not imply that the information processing must be organized in terms of bits. It is possible to select an arbitrary base (one with any number of states) and construct the entire ISP in its terms. A base unit is represented physically, of course, as a set of bits. If one wanted a base 13 machine, for example, one would have to use at least 4 bits (with 16 states) to encode it. But no operations at the ISP level would refer to anything but base units and data structures built up from sets of base units, and there would be no way to manipulate directly the bits that represented the base. Thus, using a base other than binary obtains whatever advantages might accrue to n-state units, without any of the disadvantages at the device level.
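The two counts used in the argument above (the 9 versus 4 configurations, and the 4 bits needed for a base 13 unit) can be reproduced with a short calculation. The following is only an illustrative sketch; the function names are hypothetical and do not come from the text.

```python
def bits_per_base_unit(base: int) -> int:
    """Smallest number of bits whose states can encode one base-`base` unit."""
    if base < 2:
        raise ValueError("a base needs at least two states")
    # (base - 1).bit_length() equals ceil(log2(base)) for base >= 2
    return (base - 1).bit_length()


def unused_states(base: int) -> int:
    """Bit patterns left over when one base unit is encoded in whole bits."""
    return 2 ** bits_per_base_unit(base) - base


def two_digit_configurations(states: int) -> int:
    """Input configurations a device combining two n-state digits must handle."""
    return states * states


if __name__ == "__main__":
    print(bits_per_base_unit(13), unused_states(13))   # 4 bits, 3 unused patterns
    print(two_digit_configurations(3), two_digit_configurations(2))  # 9 versus 4
```

A base 10 unit also needs 4 bits under this count, which is why decimal digits appear as 4-bit units in the System/360 hierarchy mentioned earlier.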
Computers have been built with a variety of different bases, the main ones being binary, decimal, and character. The character has shifted between a 6-bit character and an 8-bit character (byte).¹ The arguments for bases other than binary (which represents the natural base of the computer) all hinge on the alphabets used externally by human beings and the desire to avoid conversions into a different representation inside the computer. With universal acceptance of higher languages, such as FORTRAN and ALGOL, this argument has also lost much of its force. In fact, all third-generation machines are binary. Nevertheless, in the fifties there was much controversy over which base to use, and the machines presented in this book exhibit all three bases.

¹ Seven bits have been proposed for communication purposes but have never been made the basis of a machine, as far as we know.

There is little difference between binary and decimal computers in their ISP organization. However, there is a great difference between these two and character machines. The latter are designed for handling text and are constructed to deal with variable-length strings of characters. Correspondingly, they deemphasize numerical computation. Both these decisions affect the ISP considerably. Thus, in the computer space we indicate the base dimension along with the word-length dimension. The two together make up a single dimension.

Word length. Let us now examine the role of word length. The word is the first major information unit above the base. It is defined as n bits for a binary computer or n digits for a decimal computer (character machines being excluded as not having a fixed word length). Sometimes there are intermediate units, but they always play a minor role and we can disregard them at this stage. As we noted earlier, the main determinant of word length has been the function of the total system: large word lengths for arithmetic systems, small word lengths for control systems (and character strings for business). Thus, only within narrow limits is the word length a free design choice. However, the interesting thing about word length is not so much its determinant as the way it affects other aspects of the total system design. This starts with a design decision that the unit of information transfer between components will be a word. As soon as this becomes the case, then registers in various components must hold a word, since that is what arrives or is to be transmitted. Thus the word becomes the information unit of the Mp, and most of the registers of the Pc hold one word. The instruction is designed to fit into one word, since that is the number of bits that is obtained "at once" and hence can be used to effect the next time increment of processing.

Once these basic features are set, others follow. An integer number of any smaller units, such as the character, should fit into a word, since otherwise a set of words will not provide a homogeneous sequence of subunits. (That is, only five 6-bit characters fit into 32 bits, so that a set of 32-bit words filled with 6-bit characters has a number of 2-bit holes in it. This can complicate algorithms that deal with long character strings.) The constraint of compatibility is not so strong with Ms, since speeds are slow enough to permit conversion algorithms (either hardware or software). Still, the system is simpler (and therefore usually will work better) if incommensurabilities of information units do not exist. Thus, to pick an example, the number of parallel tracks on magnetic tapes tends to divide evenly into the word length. IBM tapes for the 700 series of 36-bit machines have six data tracks; for the System/360, which has a 32-bit word, the tapes have eight data tracks.

There is an interesting correlation between the word length of a computer and the number of data-types that it makes available. As we saw in Chap. 2, the operations in a computer can be classified according to the type of data they operate upon. Each data-type tends to have a certain set of operations appropriate to it (for example, +, −, ×, and / for numbers) and the decision to include a data-type carries with it the decision to include its operations. Thus the number of operations tends to grow with the number of data-types. The total amount of hardware in a computer grows as the word size (because data paths are word-parallel)² and also as the number of operations. Thus machines with large word size tend to be large machines and have many data-types and many operations. ("Large" as an adjective for machines invariably means big and expensive, hence, given economics, capable of doing large amounts of processing.)

² The issue of bit-serial versus bit-parallel is discussed subsequently.

There are two additional, somewhat independent, features that support the relationship between word size, number of data-types, and size of computer. First, with a large system there will already be available many of the pieces necessary to add additional operations. That is, the marginal cost of a new operation goes down as the system grows. Therefore, given a large system, there is a tendency to add more operations. The number of operations per data-type is not easy to increase; rather, one adds new data-types. Second, with small word lengths, one cannot define many worthwhile data-types that will fit into a word, and multiple-word data-types are left to the programmer to define with software. With large word lengths there are many different worthwhile data-types that fit into the word, for instance, decompositions of the word into partial words, or into character strings. Each of these requires
additional operations, since the initial data-types involve the entire word or some large part of it (i.e., the word, address, and integer operations).

In sum, the word length stands as an indicator of many aspects of the machine. It not only tells something about the basic organization of many components but indicates how big the computer is, both in number of data-types and number of operations. Figure 2 shows time lines of well-known computers with their word length, with a special time line for the ones in this book. Five groups are suggested in the figure which classify these computers.¹ The classes overlap, and to separate a computer into one of two classes requires more knowledge (e.g., the number of data-types). For example, the 24-bit SDS 9300 and CDC 3200 appear in the same class with the 36-bit IBM 7090 just because both machines have floating point hardware and, in fact, perform comparably for arithmetic tasks.

¹ The class number is essentially [log2 (Mp word length) - 2].

The one design choice that makes word length have few of the consequences just described is making a computer bit-serial rather than bit-parallel. In many machines information transfers are conducted on a single bit stream (especially Pc-Mp transfers). Coincident with this is the construction of operations on a bit-by-bit basis. This works well for arithmetic and logical operations. Time is traded for hardware. The cost of the system becomes independent of word length, but the processing rates go down correspondingly. This design decision was an extremely important one when logic was expensive and unreliable. It has become less so in the current era, where processors and transfer paths are relatively few in number while both the cost and the reliability of components have improved. However, as large parallel processors are considered (~10^3 P's), bit-serial processors again become a serious design alternative. (See the serial computers of Part 3, Sec. 2.)

In summary, word length is an important dimension, and we find many characteristics either proportional to or inversely proportional to it. To be sure, these relations hold only for current design practice, as we have seen with the bit-serial designs. The main-line computers in Part 2 are ordered according to increasing word length.

Data-types. We have presented the number of data-types as being correlated with word length and also with computer size through the effect on number of operations. Although far from perfect, there is a rough order in which specific data-types are included in a computer. We have listed the main types in such an order in the data-type dimension of the computer space. (See Chap. 2 for their definitions.) To be located at a point on this dimension (say at floating point) means to have all the data-types below it on the dimension (i.e., word, address, integer, boolean). Occasionally machines which violate this have arisen. Decimal machines do not generally have boolean data-types, and there has been some attempt at machines with only floating point, without a separate integer type (e.g., the CDC G20²).

² Originally the Bendix G-20.

The reason behind this cumulation of data-types in a fixed order is that certain general tasks must be performed by any computer. It must transmit data between the Pc and Mp, and this transmission has nothing to do with the meaning or content of the data; thus there is always the "unit of transmission," which is the word (except on character machines). Next, all computers manipulate addresses to achieve generality (e.g., to compile), providing for a second data-type. Next come integers, since almost all algorithms make use of arithmetic (this could conceivably be absent in some communications computers), and on up to floating point numbers, multiple precision, and vector and string operations. At each stage the uses are more specialized so that lower ones cannot be eliminated, except for a few cases such as handling addresses as regular integers.

Addresses per instruction and processor state

The number of addresses in an instruction has been a traditional way of describing processors (i.e., their ISP's) and hence the computer systems containing these processors.³ We use it in Parts 2 and 3 to separate the different processors.

³ Although used mostly to describe Pc's, the description applies to any processor.

Originally the dimension was simple: one-, two-, three-, and four-address machines were constructed. It has become somewhat more complex. A "one plus one" machine has one address for data and one for determining the next instruction, and is to be distinguished from a two-address machine, which uses both addresses for data. Index registers and so-called general registers provide instruction schemes which lie somewhere between one- and two-address organizations. When processors admit several instruction formats or variable-length instructions, matters become even more complicated.

A correlated dimension in the computer space is the amount of processor state, that is, the number of bits that exist in the processor, as described in the ISP. This is the amount of information that can be held at the end of one instruction to provide the processing context for the next instruction. It consists of a number of status and mode bits (in modern machines packaged into registers, but in earlier machines simply scattered around in the processor), the next instruction address, the accumulator and other
arithmetic registers, the index registers, and other general registers making up a "scratch-pad" memory. It is a simpler descriptor of the ISP than addresses per instruction, since it is independent of the number and variety of instruction formats. It is easy to define processor state generally for any ISP, but difficult to define addresses per instruction.

The processor state is not the total number of bits in the processor, since there may be registers in the physical system that are used within the interpretation of one instruction but which carry no information between instructions. Address registers for obtaining operands from Mp are the most common such "underground" or "temporary" registers, but there can be others. We implied this distinction by defining processor state in terms of the ISP rather than the physical processor.

The correlation between the processor state and the number of addresses per instruction is not simple, since it rests on two separate issues. For the first, note that larger programs perform transformations on the state of Mp (or even Ms or Tio's) and are not concerned with the state of the processor. Processor state enters only because, in decomposing the total algorithm into a series of small steps, it is not possible (or efficient) to make each step a transformation from Mp to Mp. Basically, this happens because the instruction does not hold enough information to specify the Mp-to-Mp transformations. For example, if one wants to add two numbers, two operands are required, and an instruction must contain at least two addresses; if it does not, then an intermediate state (i.e., processor state) must be created to hold the information while the additional instructions are fetched. Thus, one-address organizations require the most processor state, with less for two- and three-address organizations. This consideration stops at three (two operands and a result) because only a few elementary operations are more than binary. The processor state cannot be eliminated entirely, however, since there must be at least an instruction address (a program register) to maintain continuity of the program.

The second source of correlation between processor state and addresses per instruction comes from differential access time to processor registers and to Mp. As long as there is an appreciable differential, substantial gain in processing power can be obtained from increasing processor state. This derives, again, from the structure of algorithms which generate intermediate results that are used almost immediately afterward and then are of no further interest. Rapid temporary storage and retrieval are beneficial under these conditions. Thus, working against higher address organization is the extra time to store in Mp results that need only temporary storage. Thus, also, index registers and general registers almost always imply increased processor state, although they need not do so logically (that is, the registers could exist in Mp and still have their effect on the instruction format).

With interrupts and multiprogramming the processor state gains additional significance, since it is the amount of information that has to be saved and restored when switching programs. For example, in the Honeywell H-800, an early three-address computer, the processor state per program consisted only of the program counter and index registers, and when io-halts occurred during processing, the Pc was switched immediately to another program. Eight programs could run concurrently (by having a total processor state of 64 program registers). In present computers with general-register state, often 25 to 100 words must be stored, which implies an appreciable time for switching contexts.

We can now consider briefly the different organizations according to addresses per instruction. To show the common similarities, we give in Fig. 4 a state diagram that can be used for all processors. In common is the basic idea of the stored program: Fetch an instruction, determine what the instruction is to do, then execute it (the fetch-execute cycle). Other than this, only a part of the state diagram will be applicable to a given processor type.

[Fig. 4. ISP interpretation state diagram. Some states are Mp controlled, the others Pc controlled. Note: Any state may be null.]

State name    Time in a state    Meaning
soq/oq        toq                Operation to determine the instruction q
saq/aq        taq                Access (to Mp) for the instruction q
so.o/o.o      to.o               Operation to decode the operation of q
sov.r/ov.r    tov.r              Operation to determine the variable address v
sav.r/av.r    tav.r              Access (to Mp) read the variable v
so/o          to                 Operation specified in q
sov.w/ov.w    tov.w              Operation to determine the variable address v
sav.w/av.w    tav.w              Access (to Mp) to write variable v

As shown in the computer space, the addresses-per-instruction dimension starts with zero addresses, then one address, then one plus indexing, one plus general registers, and on up to two, three, and variable addresses. However, from an expository viewpoint one should follow a different course, starting with single-address machines, then indexing, then two- and three-address machines, then general registers, and finally the zero-address and variable-address organizations. This not only puts the more common organizations first but makes it easy to relate the organizations to each other.

P(1 address) and P(1 + index address). These Pc's constitute most first-, second-, and simple third-generation computers. The earliest outline of the structure was the IAS computer (Chap. 4), which has come to be known as the von Neumann computer. Although fundamentally like the IAS computer, EDSAC's adaptation appears to be the closest prototype to this class. Although EDSAC is not described, it significantly influenced M.I.T.'s Whirlwind I (Chap. 6). A significant change to the IAS machine was the addition of the index register (called B-tubes) in the Manchester University machine in the early 1950s. The evolution can be seen by comparing the first and third generations using Whirlwind (Chap. 6) and IBM 1800 (Chap. 33) or looking at the IBM 701-7094 evolution in Part 6, Sec. 1.

Index registers are motivated by the frequent occurrence, in 1 address systems, of circuitous address calculations that involve first computing the address (e.g., the index of an array in Mp) and then planting it just ahead in the instruction stream in order to make use of it as an address. Providing a set of index registers introduces a second address into the instruction, even though of extremely limited function. Thus we classify processors with indexing as having (1 + x) addresses per instruction.¹

¹ An alternative view of index registers suggests that they double the number of data-types by allowing operations on vector data elements rather than just scalars. Indirect addressing, on the other hand, does not add to the addresses per instruction; rather, it introduces a second operation per instruction.

For the 1 address processor, the processor state (Mps) typically consists of the program counter (instruction location counter), an Accumulator/AC, a Multiplier-Quotient register/MQ (the extension of AC), and one or more Index registers/X/XR. With only one address in the instruction, the one arithmetic register, A, must be used for temporary results. Thus an effective-address integer (z) is computed as a function of the address part (v part) of the instruction (q) and the index registers. This process is typically z := v + X[j], where X[j] is the jth index register as specified in the instruction. There are several forms for the transmission operators between A and Mp.
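The effective-address rule z := v + X[j] and the fetch-execute cycle described above can be sketched as a toy interpreter. This is a minimal illustration and not the ISP of any machine in the book: the instruction encoding (a tuple of operation, address part v, and index-register number j), the operation names, and the convention that j = 0 means "no indexing" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OneAddressMachine:
    """Toy 1 + x address processor; its processor state is PC, A, and X."""
    mp: list                                    # primary memory Mp (instructions and data)
    pc: int = 0                                 # instruction location counter
    a: int = 0                                  # the single arithmetic register A
    x: list = field(default_factory=lambda: [0, 0, 0, 0])  # index registers
    halted: bool = False

    def effective_address(self, v: int, j: int) -> int:
        # z := v + X[j]; j == 0 is assumed here to mean "no indexing"
        return v if j == 0 else v + self.x[j]

    def step(self) -> None:
        op, v, j = self.mp[self.pc]             # fetch the instruction q
        self.pc += 1
        z = self.effective_address(v, j)        # determine the operand address
        if op == "LOAD":
            self.a = self.mp[z]
        elif op == "ADD":
            self.a += self.mp[z]
        elif op == "STORE":
            self.mp[z] = self.a
        elif op == "LDX":                       # X[j] := Mp[v], address part used unindexed
            self.x[j] = self.mp[v]
        elif op == "HALT":
            self.halted = True
        else:
            raise ValueError(f"unknown operation {op!r}")

    def run(self) -> None:
        while not self.halted:
            self.step()


program = [
    ("LDX", 7, 1),      # X[1] := Mp[7]
    ("LOAD", 8, 0),     # A := Mp[8]
    ("ADD", 9, 1),      # A := A + Mp[9 + X[1]]
    ("STORE", 12, 0),   # Mp[12] := A
    ("HALT", 0, 0),
    0, 0,               # addresses 5 and 6, unused
    2,                  # address 7: index value
    10,                 # address 8: first operand
    100, 200, 300,      # addresses 9 to 11: a small array
    0,                  # address 12: result
]
machine = OneAddressMachine(mp=list(program))
machine.run()
```

After the run, Mp[12] holds 10 + Mp[9 + 2] = 310. The array element is selected without modifying the ADD instruction itself, which is the circuitous step that index registers were introduced to avoid.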
3 sets of protection and relocation
virtual
Similar to above.
physical a
homo-
Similar to above. Simple, pure procedures with one data array area can be
memory.
More
implemented. (UNIVAC
similar to
page mapping.
Has not been used
108, PDP-10)
any conventional
computer.
registers.
Mapping,
in
1
Mv > Mp:
Memory page mapping. For each page (2^6 to 2^12 words) in a user's virtual memory, corresponding information is kept concerning the actual physical location in primary or secondary memory. If the map is in primary memory, it may be desirable to have "associative registers" at the processor-memory interface to remember previous reference to virtual pages, and their actual locations. Alternatively, a hardware map may be placed between the processor and memory to transform processor virtual addresses into physical addresses. Relatively expensive. Not as general as following method for implementing pure procedures. (Atlas, CDC-3500, SDS-940)

Memory page/segmentation mapping. Additional address space is provided beyond a virtual memory above by providing a segment number. This segment number addresses or selects the page tables. This allows a user an almost unlimited set of addresses. Both segmentation and page map look-up is provided in hardware. May be thought of as two-dimensional addressing. Expensive. Little experience to judge effectiveness. (GE 645, IBM 360/67)

Indirect references through a descriptor table to segments. All data are considered part of a descriptor array which is referred to by a number. A descriptor table indexed by the descriptor number is used to locate the array in Mp and give its size. An indirect reference must be made to the description table in Mp. (B 5500)
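The page-mapping and page/segmentation entries above can be illustrated with a small address-translation sketch. It is an assumption-laden illustration rather than a description of any of the machines named: the page size, the dictionary used as a page map, and the names `translate`, `translate_segmented`, and `PageFault` are all hypothetical.

```python
PAGE_SIZE = 2 ** 9   # words per page; the table gives a range of 2^6 to 2^12


class PageFault(Exception):
    """Raised when the referenced virtual page is not in primary memory."""


def translate(virtual_address: int, page_map: dict) -> int:
    """Map a virtual address to a physical address through a page map."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_map.get(page)        # None means the page is not in Mp
    if frame is None:
        raise PageFault(page)
    return frame * PAGE_SIZE + offset


def translate_segmented(segment: int, virtual_address: int, segment_table: dict) -> int:
    """Two-dimensional addressing: the segment number selects a page table."""
    return translate(virtual_address, segment_table[segment])


if __name__ == "__main__":
    page_map = {0: 5, 1: 2}           # virtual page -> physical page frame
    print(translate(513, page_map))   # page 1, offset 1 -> frame 2
    print(translate_segmented(7, 3, {7: page_map}))
```

A hardware map or a set of associative registers would perform the same lookup without a memory reference to the table; the dictionary here only stands in for that mechanism.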
z is encountered in the program, the information at Mp[z] will be obtained. There are still, however, two different ways to obtain the effect of a virtual memory.

First, one can operate interpretively, with a software system taking the place of hardware. That is, the programs of all the users are in a nonmachine language (e.g., a higher procedure-oriented language), and each access in the language is processed by the software interpreter before an access is made to Mp. It is clear that all the logical power of a memory mapping is available with this scheme. The only drawback is the loss of efficiency from the interpretation, which may range from a factor of 5 to 100. Consequently this scheme is used only in special circumstances, such as multiuser time-shared conversational algebraic languages.

The second scheme is to modify the code at the time it is placed in the Mp for a given run, so that all addresses in the code correspond to the actual Mp addresses used. That is, an assembly or translation operation is performed each time the program is placed in Mp. The advantage of this scheme is that no further address calculations are necessary. There are three disadvantages. Assembly operations are expensive so that, although the scheme is tolerable if the program is brought in once and run to completion, it is not tolerable if ... in ... out of Mp. In ...

Every reference Mv[z] takes place as

Mv[z] := (¬ ... → Mp[z]; ... → protection violation ← 1)

That ... The other two schemes ... and relocating registers) hardware. A protection and relocation register mechanism is used in four schemes of Table 6. These provide either one concatenated, one additive, two additive, or n additive register pairs for mapping a single program into one, one, two, or n nonadjacent blocks in Mp. The authors know of no schemes where more than three registers are used; this would really be akin to using a more general page map. Generally, these schemes restrict Mv ...
... II.) ... not important. It is, of course, also possible to handle square roots by iterative techniques. In fact, if $X$ is our estimate of $a^{1/2}$, then $X' = \tfrac{1}{2}(X + a/X)$ is ... one division per iteration. As will be seen below in our more detailed examination of the arithmetic organ we do not include a square- ...

5.5. ... part of our arithmetic organ requires little discussion at this point. It should be a parallel storage organ which ... contains.

5.6. ... of length $n$, the length of the largest carry sequence is on the average not in excess of $\log_2 n$. Let $p_n(v)$ designate the probability that a carry sequence is of length $v$ or greater in the sum of two binary words of length $n$. Then clearly $p_n(v) - p_n(v+1)$ is the probability that the largest carry sequence is of length exactly $v$, so that the average length is

$$a_n = \sum_{v=1}^{n} v\,[p_n(v) - p_n(v+1)] = \sum_{v=1}^{n} p_n(v),$$

since $p_n(v) = 0$ if $v > n$. We now show that $a_n \le \log_2 n$.

Indeed, $p_n(v)$ is the probability that the sum of two $n$-digit numbers contains a carry sequence of length $\ge v$. It is the sum of the probabilities of two mutually exclusive cases. In the first case the first $n - 1$ digits of the two numbers by themselves contain a carry sequence of length $\ge v$; this has the probability $p_{n-1}(v)$. In the second case the first $n - 1$ digits of the two numbers by themselves do not contain a carry sequence of length $\ge v$. In this case any carry sequence of length $\ge v$ in the total numbers (of length $n$) must end with the last digits of the total sequence. Hence these must form the combination 1, 1. The next $v - 1$ digits must propagate the carry, hence each of these must form the combination 1, 0 or 0, 1. (The combinations 1, 1 and 0, 0 do not propagate a carry.) The probability of the combination 1, 1 is 1/4, that of one of the alternative combinations 1, 0 or 0, 1 is 1/2. The total probability of this sequence is therefore $\tfrac{1}{4}\left(\tfrac{1}{2}\right)^{v-1} = \left(\tfrac{1}{2}\right)^{v+1}$. The remaining $n - v$ digits must not contain a carry sequence of length $\ge v$. This has the probability $1 - p_{n-v}(v)$. Thus the probability of the second case is $[1 - p_{n-v}(v)]/2^{v+1}$. Combining these two cases, the desired relation

$$p_n(v) = p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}}$$

obtains. The observation that $p_n(v) = 0$ if $v > n$ is trivial.

We see with the help of the formulas proved above that $p_n(v) - p_{n-1}(v)$ is always $\le 1/2^{v+1}$, and hence that the sum

$$p_n(v) = \sum_{i=v}^{n} [p_i(v) - p_{i-1}(v)]$$

is not in excess of $(n - v + 1)/2^{v+1}$, since there are $n - v + 1$ terms in the sum; moreover, each $p_n(v)$ is not greater than 1. Hence we have

$$p_n(v) \le \min\left[1, \frac{n - v + 1}{2^{v+1}}\right].$$

Finally we ... bound on $a_n = \sum_{v=1}^{n} p_n(v)$. Choose $K$ so that $2^K \le n \le 2^{K+1}$. Then

$$a_n = \sum_{v=1}^{K-1} p_n(v) + \sum_{v=K}^{n} p_n(v) \le \sum_{v=1}^{K-1} 1 + \sum_{v=K}^{n} \frac{n}{2^{v+1}} \le K - 1 + \frac{n}{2^K}.$$

This expression is clearly linear in $n$ in the interval $2^K \le n \le 2^{K+1}$, and it is $= K$ for $n = 2^K$ and $= K + 1$ for $n = 2^{K+1}$, i.e. it is $= \log_2 n$ at both ends of this interval. Since the function $\log_2 n$ is everywhere concave from below, it follows that our expression is $\le \log_2 n$ throughout this interval. Thus $a_n \le \log_2 n$. This holds for all $K$, i.e. for all $n$, and it is the inequality which we wanted to prove. For our case $n = 40$ we have $a_n \le \log_2 40$.
Having discussed the addition,
subtraction.
It is
— 5.3,
i.e.
to
make some
we can now
convenient to discuss at
of negative numbers,
and
in order to
an average
this point
do that
right,
go on
to the
our treatment it is
desirable
observations about the treatment of numbers in
general. digit aggregates, the left-most digit being
the sign digit, and the other digits genuine binary digits, with _1 2 2~ 39 (going from left to right). Our positional values 2 , 2~ accumulator will, however, treat the sign digit, too, as a binary ,
digit
.
.
.
,
with the positional value 2°
— at
least
when
an adder. For numbers between
and
1 this is
The
and
if
left-most digit will then be 0,
to represent a its
sign
+
sign, then the number
and 39 binary
Let us
now
numbers. The
is
it
functions as
clearly all right:
at this place
is
taken
correctly expressed with
digits.
consider one or more unrestricted 40 binary digit accumulator will add them, with the digit-adding
and the carrying mechanisms functioning normally and identically in all 40 positions. There is one reservation, however: If a carry originates in the left-most position, then it has nowhere to go from there (there being no further positions to the left) and is "lost". This means, of course, that the addend and the augend, both numbers between and 2, produced a sum exceeding 2, and the
accumulator, being unable to express a digit with a positional value 2 1 which would now be necessary, omitted 2. That is, the ,
it
is
It should be noted that our convention of placing the binary point immediately to the right of the left-most digit has nothing to do with the structure of the adder. In order to make this point
clearer we proceed to discuss the possibilities of positioning the binary point in somewhat more detail. We begin by enumerating the 40 digits of our numbers (words) In doing this we use an index h — 1, 40. might have placed the binary point just as well between and + 1, / = 0, = .0 corresponds 40. Note, that
left to right.
.
.
.
,
Now we digits
/
.
.
.
,
/'
to the position at the
extreme
;'
left
(there
no
is
digit
h
= =
0);
/
=
40 corresponds to the position at the extreme right (there is no position h = j + 1 = 41); and / — 1 corresponds to our above
/
choice.
Whatever our choice
of
/,
it
does not affect the correctness
is equally true for subtraction, below, but not for multiplication and division, cf. 5.8.) Indeed, we have merely multiplied all numbers by 2,_1 (as against our
cf.
previous convention), and such a "change of scale" has no effect on addition (and subtraction). However, now the accumulator is
an adder which allows errors that are integer multiples of 2' it is an adder modulo 2.'. We mention this because it is occasionally convenient to think
terms of a convention which places the
in
binary point at the right
Our numbers are 40
—
of the accumulator's addition. (This
length of about 5 for the longest carry sequence. (The actual value of a 40 is 4.62.) 5.7.
correctly, excepting a possible error 2. If several such additions are performed in succession, then the ultimate error may be any integer multiple of 2. That is, the accumulator is an
from n in
in
2
is
sum was formed
adder which allows errors that are integer multiples of 2 an adder modulo 2.
i>=K
last
1 it
turn to the question of getting an upper
I.v=iPn( v )-
u=
+
a probability,
n-c+n
r,
^
p„(v)
is
—
n
Preliminary discussion of the logical design of an electronic computing instrument
end
of the digital aggregate.
Then
= /'
40,
our numbers are integers, and the accumulator is an adder modulo 2 40 We must emphasize, however, that all of this, i.e. all attribu.
—
are purely convention i.e. it is solely the mathematician's interpretation of the functioning of the machine and not a physical feature of the machine. This convention will tions of values to
;',
necessitate measures that have to be
physical features of the
machine
—
i.e.
made
We
will use the convention
/
=
This being represent
all
so,
these
1, i.e.
our numbers
2.
Any
lie in
and
2.
numbers between
numbers modulo
by actual
become when we come to the
a physical and engineering reality only organs of multiplication.
2 and the accumulator adds modulo
effective
the convention will
real
and 2 can be used
to
number x agrees modulo
—
2 with one and only one number x between and 2 or, to be 5= 2. Since our modulo 2, x addition functions quite precise:
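The carry-length estimate of 5.6 can be checked numerically. The following is a minimal sketch, not part of the original report: it evaluates the recurrence $p_n(v) = p_{n-1}(v) + [1 - p_{n-v}(v)]/2^{v+1}$ with $p_n(v) = 0$ for $v > n$, and sums $a_n = \sum_{v=1}^{n} p_n(v)$. The function names `carry_prob` and `average_longest_carry` are illustrative choices.

```python
from functools import lru_cache
from math import log2


@lru_cache(maxsize=None)
def carry_prob(n: int, v: int) -> float:
    """p_n(v): probability that adding two random n-digit binary words
    produces a carry sequence of length >= v (for v >= 1)."""
    if v > n:
        return 0.0  # a carry sequence cannot be longer than the word
    # Either the first n-1 digits already contain such a sequence, or a
    # sequence of length v ends at the last digit and the other n-v digits
    # contain none.
    return carry_prob(n - 1, v) + (1.0 - carry_prob(n - v, v)) / 2 ** (v + 1)


def average_longest_carry(n: int) -> float:
    """a_n = sum of p_n(v) over v = 1..n."""
    return sum(carry_prob(n, v) for v in range(1, n + 1))


if __name__ == "__main__":
    n = 40
    print(f"a_{n} = {average_longest_carry(n):.2f}; bound log2({n}) = {log2(n):.2f}")
```

For $n = 40$ the computed average should fall below the bound $\log_2 40 \approx 5.3$, in line with the value 4.62 quoted in the text.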
The round-off procedures, which we can use in this connection, fall into two broad classes. The first class is characterized by its ignoring all digits beyond the $n$th, and even the $n$th digit itself, which it replaces by a 1. The second class is characterized by the procedure of adding one unit in the $(n + 1)$st digit, performing the carries which this may induce, and then keeping only the $n$ first digits.

When applied to a number of the form $(.\nu_1 \ldots \nu_n \nu_{n+1} \nu_{n+2} \ldots)$ (continued indefinitely), the effects of either procedure are easily estimated. In the first case we are dealing with $(.\nu_1 \ldots \nu_{n-1})$ plus a random number of the form $(.0 \ldots 0\,\nu_n \nu_{n+1} \ldots)$, i.e. random in the interval $0, 1/2^{n-1}$. Comparing with the rounded-off value $(.\nu_1 \ldots \nu_{n-1} 1)$, we have a difference random in the interval $-1/2^n, 1/2^n$. Hence its mean is 0 and its variance $\tfrac{1}{3} 2^{-2n}$. In the second case we are dealing with $(.\nu_1 \ldots \nu_n)$ plus a random number of the form $(.0 \ldots 0\,\nu_{n+1} \nu_{n+2} \ldots)$, i.e. random in the interval $0, 1/2^n$. The "rounded-off" value will be $(.\nu_1 \ldots \nu_n)$ increased by 0 or by $1/2^n$, according to whether the random number in question lies in the interval $0, 1/2^{n+1}$, or in the interval $1/2^{n+1}, 1/2^n$. Hence comparing with the "rounded-off" value, we have a difference random in the intervals $0, 1/2^{n+1}$ and $0, -1/2^{n+1}$, i.e. in the interval $-1/2^{n+1}, 1/2^{n+1}$. Hence its mean is 0 and its variance $\tfrac{1}{12} 2^{-2n}$.

If the number to be rounded off has the form $(.\nu_1 \ldots \nu_n \nu_{n+1} \nu_{n+2} \ldots \nu_{n+p})$ ($p$ finite), then these results are somewhat affected: the mean difference may deviate from 0 by amounts of the order $1/2^{n+p}$. In division we have the first situation, $x/y = (.\xi_1 \ldots \xi_n \xi_{n+1} \xi_{n+2} \ldots)$, i.e. $p$ is infinite. In multiplication we have the second one, $xy = (.\xi_1 \ldots \xi_n \xi_{n+1} \ldots \xi_{2n})$, i.e. $p = n$. Hence for the division both methods are applicable without modification. In multiplication a bias of the order of $1/2^{2n}$ may be introduced. We have seen that it is pointless to insist on removing biases of this size. We will therefore use the unmodified methods in this case, too. It should be noted that the bias in the case of multiplication can be removed in various ways. However, for the reasons set forth above, we shall not complicate the machine by introducing such corrections.

Thus we have two standard "round-off" methods, both unbiased to the extent to which we need this, and with the variances $\tfrac{1}{3} 2^{-2n}$ and $\tfrac{1}{12} 2^{-2n}$, that is, with the dispersions $(1/\sqrt{3})(1/2^n) = 0.58$ times the last digit and $(1/2\sqrt{3})(1/2^n) = 0.29$ times the last digit. The first one requires no carry facilities, the second one requires them.

Inasmuch as we propose to form the product $x'y'$ in the accumulator, which has carry facilities, there is no reason why we should not adopt the rounding scheme described above which has the smaller dispersion, i.e. the one which may induce carries. In the case, however, of division we wish to avoid schemes leading to carries since we expect to form the quotient in the arithmetic register, which does not permit of carry operations. The scheme which we accordingly adopt is the one in which $\nu_n$ is replaced by 1. This method has the decided advantage that it enables us to write down the approximate quotient as soon as we know its first $(n - 1)$ digits. It will be seen in 5.14 and 6.6.4 below that our procedure for forming the quotient of two numbers will always lead to a result that is correctly rounded in accordance with the decisions just made. We do not consider as serious the fact that our rounding scheme in the case of division has a dispersion twice as large as that in multiplication since division is a far less frequent operation.

A final remark should be made in connection with the possible, occasional need of carrying more than $n = 39$ digits. Our logical control is sufficiently flexible to permit treating $k$ ($= 2, 3, \ldots$) words as one number, and thus effecting $n = 39k$. In this case the round-off has to be handled differently, cf. Chapter 9, Part II. The multiplier produces all 78 digits of the basic 39 by 39 digit multiplication: The first 39 in the Ac, the last 39 in the AR. These must then be manipulated in an appropriate manner. (For details, cf. 6.6.3 and 9.9-9.10, Part II.) The divider works for 39 digits only: In forming $x/y$, it is necessary, even if $x$ and $y$ are available to $39k$ digits, to use only 39 digits of each, and a 39 digit result will appear. It seems most convenient to use this result as the first step of a series of successive approximations. The successive improvements can then be obtained by various means. One way consists of using the well known iteration formula (cf. 5.4). For $k = 2$ one such step will be needed, for $k = 3, 4$ two steps, for $k = 5, 6, 7, 8$ three steps, etc. An alternative procedure is this: Calculate the remainder, using the approximate, 39 digit, quotient and the complete, $39k$ digit, divisor and dividend. Divide this again by the approximate, 39 digit, divisor, thus obtaining essentially the next 39 digits of the quotient. Repeat this procedure until the full $39k$ desired digits of the quotient have been obtained.

5.13. We might mention at this time a complication which arises when a floating binary point is introduced into the machine. The operation of addition which usually takes at most 1/10 of a multiplication time becomes much longer in a machine with floating binary since one must perform shifts and round-offs as well as additions. It would seem reasonable in this case to place the time of an addition as about 1/3 to 1/2 of a multiplication. At this rate it is clear that the number of additions in a problem is as important a factor in the total solution time as are the number of multiplications. (For further details concerning the floating binary point, cf. 6.6.7.)

5.14. We conclude our discussion of the arithmetic unit with a description of our method for handling the division operation. To perform a division we wish to store the dividend in SR, the partial remainder in Ac and the partial quotient in AR. Before proceeding further let us consider the so-called restoring and non-restoring methods of division. In order to be able to make certain comparisons, we will do this for a general base $m = 2, 3, \ldots$.

Assume for the moment that divisor and dividend are both positive. The ordinary process of division consists of subtracting from the partial remainder (at the very beginning of the process this is, of course, the dividend) the divisor, repeating this until the former becomes smaller than the latter. For any fixed positional value in the quotient in a well-conducted division this need be done at most $m - 1$ times. If, after precisely $k = 0, 1, \ldots, m - 1$ repetitions of this step, the partial remainder has indeed become less than the divisor, then the digit $k$ is put in the quotient (at the position under consideration), the partial remainder is shifted one place to the left, and the whole process is repeated for the next position, etc. Note that the above comparison of sizes is only needed at $k = 0, 1, \ldots, m - 2$, i.e. before step 1 and after steps $1, \ldots, m - 2$. If the value $k = m - 1$, i.e. the point after step $m - 1$, is at all reached in a well-conducted division, then it may be taken for granted without any test, that the partial remainder has become smaller than the divisor, and the operations on the position under consideration can therefore be concluded. (In the binary system, $m = 2$, there is thus only one step, and only one comparison of sizes, before this step.) In this way this scheme, known as the restoring scheme, requires a maximum of $m - 1$ comparisons and utilizes the digits $0, 1, \ldots, m - 1$ in each place in the quotient. The difficulty of this scheme for machine purposes is that usually the only economical method for comparing two numbers as to size is to subtract one from the other. If the partial remainder $r_n$ were less than the dividend $d$, one would then have to add $d$ back into $r_n - d$ in order to restore the remainder. Thus at every stage an unnecessary operation would be performed. A more symmetrical scheme is obtained by not restoring. In this method (from here on we need not assume the positivity of divisor and dividend) one compares the signs of $r_n$ and $d$; if they are of the same sign, the dividend is repeatedly subtracted from the remainder until the signs become opposite; if they are opposite, the dividend is repeatedly added to the remainder until the signs again become like. In this scheme the digits that may occur in a given place in the quotient are evidently $\pm 1, \pm 2, \ldots, \pm(m - 1)$, the positive digits corresponding to subtractions and the negative ones to additions of the dividend to the remainder.

Thus we have $2(m - 1)$ digits instead of the usual $m$ digits. In the decimal system this would mean 18 digits instead of 10. This is a redundant notation. The standard form of the quotient must therefore be restored by subtracting from the aggregate of its positive digits the aggregate of its negative digits. This requires carry facilities in the place where the quotient is stored.

We propose to store the quotient in AR, which has no carry facilities. Hence we could not use this scheme if we were to operate in the decimal system.

The same objection applies to any base $m$ for which the digital representation in question is redundant, i.e. when $2(m - 1) > m$. Now $2(m - 1) > m$ whenever $m > 2$, but $2(m - 1) = m$ for $m = 2$. Hence, with the use of a register which we have so far contemplated, this division scheme is certainly excluded from the start unless the binary system is used.

Let us now investigate the situation in the binary system. We inquire if it is possible to obtain a quasi-quotient by using the non-restoring scheme and by using the digits 1, 0 instead of 1, $-1$. Or rather we have to ask this question: Does this quasi-quotient bear a simple relationship to the true quotient?

Let us momentarily assume this question can be answered affirmatively and describe the division procedure. We store the divisor initially in Ac, the dividend in SR and wish to form the quotient in AR.
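The restoring scheme described in 5.14 can be sketched for a general base $m$ as follows. This is an illustrative sketch under stated assumptions, not the report's procedure: it works on nonnegative integers with dividend < divisor (a proper fraction), it shifts the partial remainder left before producing each quotient digit, and it compares sizes directly instead of subtracting and restoring. The name `restoring_divide` and its parameters are hypothetical.

```python
from typing import List, Tuple


def restoring_divide(dividend: int, divisor: int, base: int, places: int) -> Tuple[List[int], int]:
    """Return the first `places` base-`base` fraction digits of dividend/divisor
    and the number of size comparisons that were needed."""
    if base < 2:
        raise ValueError("base must be at least 2")
    if not 0 <= dividend < divisor:
        raise ValueError("expects 0 <= dividend < divisor")

    digits: List[int] = []
    comparisons = 0
    remainder = dividend
    for _ in range(places):
        remainder *= base  # shift the partial remainder one place to the left
        k = 0
        # A comparison is needed only for k = 0 .. base-2; once k = base-1
        # the remainder is known to be smaller than the divisor.
        while k < base - 1:
            comparisons += 1
            if remainder < divisor:
                break
            remainder -= divisor
            k += 1
        digits.append(k)
    return digits, comparisons


if __name__ == "__main__":
    print(restoring_divide(1, 3, 2, 4))    # binary digits of 1/3
    print(restoring_divide(1, 3, 10, 4))   # decimal digits of 1/3
```

In base 2 the sketch makes exactly one comparison per quotient place, and in base $m$ at most $m - 1$, matching the counts given in the text.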
We now either add or subtract the contents of SR into Ac, according to whether the signs in Ac and SR are opposite or the same, and insert correspondingly a 0 or 1 in the right-hand place of AR. We then shift both Ac and AR one place left, with electronic shifters that are parts of these two aggregates.

At this point we interrupt the discussion to note this: multiplication required an ability to shift right in both Ac and AR (cf. 5.8). We have now found that division similarly requires an ability to shift left in both Ac and AR. Hence both organs must be able to shift both ways electronically. Since these abilities have to be present for the implicit needs of multiplication and division, it is just as well to make use of them explicitly in the form of explicit orders. These are the orders 20, 21 of Table 1, and of Table 2, Part II. It will, however, turn out to be convenient to arrange some details in the shifts, when they occur explicitly under the control of those orders, differently from when they occur implicitly under the control of a multiplication or a division. (For these things, cf. the discussion of the shifts near the end of 5.8 and in the third remark below on one hand, and in the third remark in 7.2, Part II, on the other hand.)

Let us now resume the discussion of the division. The process described above will have to be repeated as many times as the number of quotient digits that we consider appropriate to produce in this way. This number is likely to be 39 or 40; we will determine the exact
The orders fall into two groups: Those that specify operations which are performed within the computer and those that specify operations involved in getting data into and out of the computer. The former are more completely planned than the input and output operations, and hence they will be discussed more in detail than the latter (which are treated briefly in 6.8). For some of these operations the reasons for building them into the control have already been given. In this section we will give reasons for building the other operations into the control and will explain in the case of each operation what the control must do in order to cause it to be executed.

6.6.1. The addition operations [S(x) → Ac+, S(x) → Ac−, S(x) → Ah+, S(x) → Ah−, S(x) → Ac+M, S(x) → Ac−M, S(x) → Ah+M, S(x) → Ah−M] involve the following possible four steps:

First: Clear SR and transfer into it the number at S(x).

Second: Clear Ac if the order contains the symbol c; do not clear Ac if the order contains the symbol h.

Third: Add the number in SR or its negative (i.e. in our present system its complement with respect to $2^1$) into Ac. If the order does not contain the symbol M, use the number in SR or its negative according to whether the order contains the symbol + or −. If the order contains the symbol M, use the number in SR or its negative according to whether the sign of the number in SR and the symbol + or − in the order do or do not agree.

Fourth: Perform a complete carry.

Building the last four addition operations (those containing the symbol M) into the control is fairly simple: It calls only for one extra comparison (of the sign in SR and the + or − in the order, cf. the third step above), and it requires, therefore, only a few tubes more than required for the first four addition operations (those not containing the symbol M).

It should be noted that these operations can be programmed out of the other operations of Table 1 with correspondingly few orders (three for absolute value and five for minus absolute value), so that some further justification for building them in is required. The absolute value order is frequently required in connection with the orders L and R (see 6.6.7), while the minus absolute value order makes the detection of a zero very simple by merely detecting the sign of $-|N|$. (If $-|N| \ge 0$, then $N = 0$.)

6.6.2. The operation of S(x) → R involves the following two steps:

First: Clear SR, and transfer S(x) to it.

Second: Clear AR and add the number in the Selectron register into it.

The operation of R → Ac merits more detailed discussion, since there are alternative ways of removing numbers from AR. Such numbers could be taken directly to the Selectrons as well as into Ac, and they could be transferred to Ac in parallel, in sequence, or in sequence parallel. It should be recalled that while most of the numbers that go into AR have come from the Selectrons and thus need not be returned to them, the result of a division and the right-hand 39 digits of a product appear in AR. Hence while an operation for withdrawing a number from AR is required, it is relatively infrequent and therefore need not be particularly fast. We are therefore considering the possibility of transferring at least partially in sequence and of using the shifting properties of Ac and of AR for this. Transferring the number to the Selectron via the accumulator is also desirable if the dual machine method of checking is employed, for it means that even if numbers are only checked in their transit through the accumulator, nevertheless every number going into the Selectron is checked before being placed there.

6.6.3. The operation S(x) × R → Ac involves the following six steps:
First: Clear SR and transfer S(x) (the multiplicand) into it.

Second: Thirty-nine steps, each of which consists of the two following parts: (a) Add (or rather shift) the sign digit of SR into the partial product in Ac, or add all but the sign digit of SR into the partial product in Ac, depending upon whether the right-most digit in AR is 0 or 1, and effect the appropriate carries. (b) Shift Ac and AR to the right, fill the sign digit of Ac with a 0 and the digit of AR immediately right of the sign digit (positional value $2^{-1}$) with the previously right-most digit of Ac. (There are ways to save time by merging these two operations when the right-most digit in AR is 0, but we will not discuss them here more fully.)

Third: If the sign digit in SR is 1 (i.e. −), then inject a carry into the right-most stage of Ac and place a 1 into the sign digit of Ac.

Fourth: If the original sign digit of AR is 1 (i.e. −), then subtract the contents of SR from Ac.

Fifth: If a partial carry system was employed in the main process, then a complete carry is necessary at the end.

Sixth: The appropriate round-off must be effected. (Cf. Chapter 9, Part II, for details, where it is also explained how the sign digit of the Arithmetic register is treated as part of the round-off process.)

It will be noted that since any number held in Ac at the beginning of the process is gradually shifted into AR, it is impossible to accumulate sums of products in Ac without storing the various products temporarily in the Selectrons. While this is undoubtedly a disadvantage, it cannot be eliminated without constructing an extra register, and this does not at this moment seem worthwhile.

On the other hand, saving the right-hand 39 digits of the answer is accomplished with very little extra equipment, since it means connecting the $2^{-39}$ stage of Ac to the $2^{-1}$ stage of AR during the shift operation. The advantage of saving these digits is that it simplifies the handling of numbers of any number of digits in the computer (cf. the last part of 5.12). Any number of $39k$ binary digits (where $k$ is an integer) and sign can be divided into $k$ parts, each part being placed in a separate Selectron position. Addition and subtraction of such numbers may be programmed out of a series of additions or subtractions of the 39-digit parts, the carry-over being programmed by means of Cc → S(x) and Cc' → S(x) operations. (If the $2^0$ stage of Ac registers negative after the addition of two 39-digit parts, a carry-over has taken place and hence $2^{-39}$ must be added to the sum of the next parts.) A similar procedure may be followed in multiplication if all 78 digits of the product of the two 39-digit parts are kept, as is planned. (For the details, cf. Chapter 9, Part II.) Since it would greatly complicate the computer to make provision for holding and using a 78 digit dividend, it is planned to program $39k$ digit division in one of the ways described at the end of 5.12.

6.6.4. The operation of division Ac ÷ S(x) → R involves the four following steps:

First: Clear SR and transfer S(x) (the divisor) into it.

Second: Clear AR.

Third: Thirty-nine steps, each of which consists of the following three parts: (a) Sense the signs of the contents of Ac (the partial remainder) and of SR, and sense whether they agree or not. (b) Shift Ac and AR left. In this process the previous sign digit of Ac is lost. Fill the right-most digit of Ac (after the shift) with a 0, and the right-most digit of AR (before the shift) with 0 or 1, depending on whether there was disagreement or agreement in (a). (c) Add or subtract the contents of SR into Ac, depending on the same alternative as above.

Fourth: Fill the right-most digit of AR with a 1, and change its sign digit.

For the purpose of timing the 39 steps involved in division a six-stage counter (capable of counting to $2^6 = 64$) will be built into the control. This same counter will also be used for timing the 39 steps of multiplication, and possibly for controlling Ac when a number is being transferred between it and a tape in either direction (see 6.8).

6.6.5. The three substitution operations [At → S(x), Ap → S(x), and Ap' → S(x)] involve transferring all or part of the number held in Ac into the Selectrons. This will be done by means of gate tubes connected to the registering flip-flops of Ac. Forty such tubes are needed for the total substitutions, At → S(x). The partial substitutions Ap → S(x) and Ap' → S(x) require that the left-hand twelve digits of the number held in Ac be substituted in the proper places in the left-hand and right-hand orders, respectively. This may be done by means of extra gate tubes, or by shifting the number in Ac and using the gate tubes required for At → S(x). (This scheme needs some additional elaboration, when the order directing and the order suffering the substitution are the two successive halves of the same word; i.e. when the latter is already in FR at the time when the former becomes operative in CR, so that the substitution effected in the Selectrons comes too late to alter the order which has already reached CR, to become operative at the next step in FR. There are various ways to take care of this complication, either by some additional equipment or by appropriate prescriptions in coding. We will not discuss them here in more detail, since the decisions in this respect are still open.)

The importance of the partial substitution operations can hardly be overestimated. It has already been pointed out (3.3) that they allow the computer to perform operations it could not otherwise conveniently perform, such as making use of a function table stored in the Selectron memory. Furthermore, these operations remove a very sizeable burden from the person coding problems, for they make possible the coding of classes of problems in contrast to coding each individual problem separately. Because Ap → S(x) and Ap' → S(x) are available, any program sequence may be stated in general form (that is, without Selectron location designations for the numbers being operated on) and the Selectron locations of the numbers to be operated on substituted whenever that sequence is used. As an example, consider a general code for $n$th order integration of $m$ total differential equations for $p$ steps of independent variable $t$, formulated in advance. Whenever a problem requiring this rule is coded for the computer, the general integration sequence can be inserted into the statement of the problem along with coded instructions for telling the sequence where it will be located in the memory [so that the proper S(x) designations will be inserted into such orders as Cm → S(x), etc.]. Whenever this sequence is to be used by the computer it will automatically substitute the correct values of $m$, $n$, $p$ and $\Delta t$, as well as the locations of the boundary conditions and the descriptions of the differential equations, into the general sequence. (For the details of this particular procedure, cf. Chapter 13, Part II.) A library of such general sequences will be built up, and facilities provided for convenient insertion of any of these into the coded statement of a problem (cf. 6.8.4). When such a scheme is used, only the distinctive features of a problem need be coded.

6.6.6. The manner in which the control shift operations [Cm → S(x), Cm' → S(x), Cc → S(x), and Cc' → S(x)] are realized has been discussed in 6.4 and needs no further comment.

6.6.7. One basic question which must be decided before a computer is built is whether the machine is to have a so-called floating binary (or decimal) point. While a floating binary point is undoubtedly very convenient in coding problems, building it into the computer adds greatly to its complexity. By means of the operations already described a floating binary point can be programmed. While additional memory capacity is needed for this, it is probably less than that required by a built-in floating binary point since a different scale factor does not need to be remembered for each number.

To program a floating binary point involves detecting where the first zero occurs in a number in Ac. Since Ac has shifting facilities this can best be done by means of them. In terms of the operations previously described this would require taking the given number out of Ac and performing a suitable arithmetical operation on it: For a (multiple) right shift a multiplication, for a (multiple) left shift either one division, or as many doublings (i.e. additions) as the shift has stages. However, these operations are inconvenient and time-consuming, so we propose to introduce two operations (L and R) in order that this (i.e. the single left and right shift) can be accomplished directly. These operations make use of facilities already present in Ac and hence add very little equipment to the computer.
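A programmed floating binary point built from single left shifts can be sketched as follows. This is a simplified illustration, not the machine's actual L order: it assumes a nonnegative 39-digit fraction held as an integer magnitude (signs are ignored), and the names `FRACTION_BITS`, `shift_left`, and `normalize` are hypothetical.

```python
from typing import Tuple

FRACTION_BITS = 39                   # binary digits to the right of the sign digit
TOP_BIT = 1 << (FRACTION_BITS - 1)   # the digit with positional value 2**-1
MASK = (1 << FRACTION_BITS) - 1


def shift_left(magnitude: int) -> int:
    """One single left shift; the digit shifted out at the left is dropped."""
    return (magnitude << 1) & MASK


def normalize(magnitude: int) -> Tuple[int, int]:
    """Shift left until the leading fraction digit is 1.

    Returns (normalized magnitude, number of shifts). The shift count is the
    scale factor a program would have to remember alongside the number."""
    if magnitude == 0:
        return 0, 0
    shifts = 0
    while (magnitude & TOP_BIT) == 0:
        magnitude = shift_left(magnitude)
        shifts += 1
    return magnitude, shifts


def fraction_product(x: int, y: int) -> int:
    """Leading 39 digits of the product of two 39-digit fractions."""
    return (x * y) >> FRACTION_BITS


if __name__ == "__main__":
    half = TOP_BIT                   # the fraction 1/2, already normalized
    print(normalize(fraction_product(half, half))[1])
```

Multiplying two normalized fractions in this sketch yields a product that needs at most one further left shift, which is why a single shift order per operation often suffices.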
It should be noted that in many instances a single use of L and possibly of R will suffice in programming a floating binary point. For if the two factors in a multiplication have no superfluous zeros, the product will have at most one superfluous zero (if $\tfrac{1}{2} \le X < 1$ and $\tfrac{1}{2} \le Y < 1$, then $\tfrac{1}{4} \le XY < 1$). This is
SC
/>^SC
[12]
^
array
circuits
yGontrol
SE
circuits
/
»CC
Combinatorial circuits [9]
\
Fli
State system
P.-
'[14]
feed back)
Programming
Inverter
indicates
figure
of
still
level (ISP)
interpretation is given in Appendix 1 of this chapter and the specification of the programming machine. In addition, it constrains the physical machine's behavior to have a particular
The ISP
Multivibrator [I4J [active component)
[15]
is
R (passive component) X]
and behavior
look at these primitives (although
.
Transistor
[
Mp
We should
together as a C) at the register-transfer level.
[15]
Electrical circuits
to describe the internal structure
and Pc.
[10,11]
(data operation)
NAND
needed
level
CC
f£ c
(with
the
\
1
is
level
A
data
operation/
Switching
[13]
fControi-
ISP.
number of instance
The ISP has been
discussed earlier in the chapter.
Register-transfer level
DEC PDP-8
Fig. 6.
hierarchy of descriptions.
The C can also be represented at the register-transfer level by using PMS. Figure 4 (by DEC) shows the register-transfer level;
Abstract representations Figure 6 also
lists
some
of the
methods used
to represent the
physical computer abstractly at the different description levels. As mentioned previously, only a small part of the PDP-8 description tree
diagrams,
even
is
represented here.
etc.,
The many documents,
schematics,
which constitute the complete representation
of
computer include logic diagrams, wiring lists, circuit schematics and printed-circuit board layout masks, prothis small
duction description diagrams, production parts for
lists,
testing speci-
and diagnosing faults, and manuals programs modification, production, maintenance, and use. As the discusfor testing
fications,
down the abstract description tree, the reader will observe that the tree conveniently represents the constituent obnext highest jects of each level and their interconnection at the sion continues
level.
Each
level in the abstract-description tree will
be described
in order.
The
The Fig.
PMS
level
simplified 1.
PMS
The computer
tion of the
PMS
nounced than
in
structure in Fig. 3 has been reduced from is
small enough so that the physical delinea-
components, such as K's and larger systems.
S('Memory Bus, 'I/O
S's, is less proIn fact, in the case of the
Bus), the S's are actually within the
K and
127
128 Part 2
The
instruction-set processor: main-line
Section
computers
Processors with one address per instruction
1
Fig. 8. DEC PDP-8 register-transfer-level PMS diagram. (Registers shown: Link/L, Accumulator/AC, Memory buffer/MB, Memory address/MA, Program counter/PC, Instruction register/IR, and the core memory Mp, together with the data operations permitted on each register and the links to the console, the Memory Bus and the I/O Bus.)
only registers, operations, and L's are important at this level. We still lack information about the conditions under which operations are evoked.

Figure 8 is a PMS diagram of Pc-Mp registers. Here (although we do not bother with electrical pulse voltages and polarities) we show considerably more detail than in Fig. 4. We declare the Pc state (including the temporary register) within Pc. The figure also gives the permissible data operations, D, which are permitted on the registers. It should be clear from this that the logical design level for the registers and the operators can easily be reached. The K logic design cannot be reached until we use the programming level constraints (ISP), thus defining the conditions for evoking the data operators.

The core memory. The Mp structure is given in Fig. 8. A more detailed block diagram which shows the core stack with its twelve 64 X 64 1-bit core planes is needed. Such a diagram, though still a functional block diagram, takes on some of the aspects of a circuit diagram because a core memory is largely circuit-level details. The Mp (Fig. 9) consists of the component units: the two address decoders (which select 1 each of 64 outputs in the X and Y axis directions of the coincident-current memory); selection switches (which transform a coincident logic address into a high-current path to switch the magnetic cores); the 12 inhibit drivers (which switch a high current or no current into a plane when either a 0 or 1 is rewritten); 12 sense amplifiers (which take the induced low sense voltage from a selected core from a plane being switched or not switched and transform it into a 1 or 0); and the core stack, an array M[0:7777₈]. Since this is the only time the Mp is mentioned, Fig. 9 also includes the associated circuit-level hardware needed in the core-memory operation, such as power supplies, timing, and logic signal level conversion amplifiers. The timing signals are generated within Pc(K) and are shown
together with Pc's clock in Fig. 10. The process of reading a word from memory is:

1. A 12-bit selection address is established on the MA, a unique number among the 10000₈ addresses.
2. The read logic signal is made a 1.
3. A high-current path is switched through the X and Y selection switches, so that in the X and Y directions 64 x 12 cores have selection current: the current Ix = Iswitching/2, and the current Iy = Iswitching/2. Only one core in each plane, at the selected intersection, carries the full Iswitching.
4. If a core is switched to 0 (by having Iswitching amperes through it), then a 1 was present, and a 1 is read at the output of the sense amplifier of that plane. The time at which the sense amplifier is observed is tms (memory strobe). All 12 cores of the selected word are reset to 0, so the read in effect creates MB = M[MA] while clearing M[MA].
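The read process described above can be illustrated with a small Python sketch. It is only a behavioural model under stated assumptions: the class name `CoreStack`, its method names, and the choice of the upper 6 address bits for X and the lower 6 for Y are hypothetical, and the currents and timing are not modelled. It shows the two properties the text relies on: a read leaves the selected cores at 0, and a rewrite sets only those planes whose inhibit driver is not active.

```python
PLANES = 12   # one 64 x 64 plane per bit of the 12-bit word
SIDE = 64     # each address decoder selects 1 of 64 lines


class CoreStack:
    """Behavioural model of a coincident-current core stack M[0:7777 octal]."""

    def __init__(self):
        # planes[bit][x][y] is the state (0 or 1) of a single core
        self.planes = [[[0] * SIDE for _ in range(SIDE)] for _ in range(PLANES)]

    @staticmethod
    def decode(ma):
        """Split a 12-bit address into X and Y line numbers (the split is an assumption)."""
        if not 0 <= ma <= 0o7777:
            raise ValueError("address must fit in 12 bits")
        return ma >> 6, ma & 0o77

    def read(self, ma):
        """Destructive read: sense all 12 planes and leave the selected cores at 0."""
        x, y = self.decode(ma)
        mb = 0
        for bit in range(PLANES):
            sensed = self.planes[bit][x][y]  # a core switching to 0 means a 1 was present
            self.planes[bit][x][y] = 0
            mb |= sensed << bit
        return mb

    def write(self, ma, mb):
        """Write into cores already cleared by a read; inhibited planes stay at 0."""
        x, y = self.decode(ma)
        for bit in range(PLANES):
            inhibit = ((mb >> bit) & 1) == 0
            if not inhibit:
                self.planes[bit][x][y] = 1

    def read_restore(self, ma):
        """Full memory cycle: read into MB, then rewrite MB so the word survives."""
        mb = self.read(ma)
        self.write(ma, mb)
        return mb

    def store(self, ma, mb):
        """Clear the selected word with a read, then write the new value."""
        self.read(ma)
        self.write(ma, mb)
```

A word stored with `store(0o1234, 0o5252)` survives any number of `read_restore` calls, whereas a bare `read` returns it once and leaves zero behind.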
Fig. 2. Digit-delay circuit.
one
off,
deferring the charging of
volts
3ai sec
,
The
D2
be noted that the reset pulse
-*J
in cascade in Pegasus
digit, this
and charges up a storage condenser, C, t the end of the next clock pulse by a 'reset'
to charge the storage condenser. This merely has the effect of
-150 -150--150 volts
cut off at the end of the
computer supply whose amplitude and phasing clock pulse is shown in Fig. 3.
Output 2 Lood pins
volts
is
computer
D
pulse applied through
.
Input clock
+ 200
When V,
flows through diodes
which
100
;330/int
kJT.
volts
of Pegasus, a quantity-production
a further gating with a clock pulse.
digits from the gate input circuit are applied to the of the anode voltage of which falls, so building up a grid Vj,
[_,
volts
••>
current in the meantime continuing to flow through the diodes with little loss in the stored energy of L, since the voltage across L is low at this time.

The output cathode-follower V2 is caught at -10 volts in the negative direction by a diode; this safeguards the crystal-diode circuits driven by it in the event of failure of the h.t. supply or V2, and it removes residual ripple on the bottom of the input waveform, and thus reduces the back voltage and hence leakage in diodes of gates driven by the output. The second output through a diode can be used in conjunction with similar outputs from other circuits and a resistor (pins 3 and 4) to make an 'or' (up to about 16-way). In general, each output circuit has two available load resistors, disposed between direct and 'or' outputs according to a set of rules which are applied for each case. The number of units which can be driven by an output can vary between three and 16 according to circumstances; where more have to be driven than the rules allow, use is made of 'booster' cathode-followers available on one of the packages.

Some examples of the use of the logical circuits

Two examples will be given, the first being a simple arrangement (the staticizor) which is used frequently, and the second being a complicated arrangement (the adder/subtracter) which is used infrequently. The symbols used to indicate the circuit units are shown in Figs. 2c and 5b.

The staticizor. The function of a staticizor is to remember the fact that a digit occurred at a particular time, for an indefinite period, the method generally used in Pegasus being shown in Fig. 4. A digit delay with a twin 'and' gate input has its output connected to one of its inputs. It is turned on by gate 1, which causes a digit to circulate as long as the inputs to gate 2 remain positive. It is normally turned off by an inverted pulse (a '0' following a series of 1's) on one of the gate 2 inputs.

Fig. 4. The staticizor. (Annotations: the staticizor is set if these leads are positive; the staticizor is turned off if either of these leads is negative.)

The adder/subtracter. Figure 5 shows an adder/subtracter unit with inputs for X and Y and an output X + Y for the sum or X - Y for the difference. There are two further input control leads marked 'add' and 'subtract'. If the 'add' lead is held positive while the 'subtract' lead is held negative, the unit acts as an adder. If the 'subtract' lead is held positive and the 'add' lead negative, the unit acts as a subtracter. Carry suppression is controlled by the lead marked 'carry suppression'. Carries are allowed to propagate when this lead is held positive, so that a negative signal on this lead will suppress carry. Table 1 gives the digits appearing at the outputs of logical elements in the adder/subtracter unit for all combinations of input and carry digits when the unit is operating as an adder.

Fig. 5. The adder/subtracter. (Leads: X, Y, Add, Subtract, Carry suppression; output X + Y or X - Y, delayed one digit. Symbols: digit delay, inverter, cathode follower, AND gate.)

Arrangement of circuits based on packages

It was required to base the logical circuits on a standard size of package which could also be used for other circuits, e.g. a nickel-line 1-word store [Fairclough, 1956]. A unit which could accommodate three valves and had a 32-way plug was decided on; the

Table 1. Digits at various internal points of the adder/subtracter unit when set to add, for all combinations of the input and carry digits
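As a rough illustration of the adder/subtracter behaviour described for Fig. 5, the following Python sketch models a bit-serial unit working least significant digit first. It is a functional sketch only, not the Pegasus gate arrangement: the function names are hypothetical, subtraction is modelled with a borrow in place of the carry, and the assumption is made that a negative 'carry suppression' signal at a given digit time discards the carry produced at that digit time.

```python
def to_bits(value, width):
    """Return the digits of value, least significant first, as they arrive serially."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Reassemble an integer from digits given least significant first."""
    return sum(b << i for i, b in enumerate(bits))


def serial_add_sub(x_bits, y_bits, subtract=False, carry_allowed=None):
    """Add or subtract two serial digit streams, one digit per digit time.

    carry_allowed[i] False models a negative 'carry suppression' lead at
    digit time i: the carry (or borrow) generated there is not propagated.
    """
    if carry_allowed is None:
        carry_allowed = [True] * len(x_bits)
    carry = 0
    out = []
    for x, y, allowed in zip(x_bits, y_bits, carry_allowed):
        out.append(x ^ y ^ carry)  # sum and difference digits share this form
        if subtract:
            carry = ((1 - x) & y) | ((1 - x) & carry) | (y & carry)  # borrow
        else:
            carry = (x & y) | (x & carry) | (y & carry)
        if not allowed:
            carry = 0
    return out
```

For example, `from_bits(serial_add_sub(to_bits(5, 6), to_bits(3, 6)))` gives 8, and the same call with `subtract=True` gives 2. Suppressing the carry at one digit time splits the word into two independent fields at that point.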
The magnetic-drum store and the circuit packages used with it are described in another paper [Merry and Maudsley, 1956], as is the nickel-line store [Fairclough, 1956].

The mechanical design of the packages

General form

Each standard package consists of three main parts, namely the valve panel, the component panel and the plug. The valve panel is an aluminium pressing, there being three types: a 3-valve type, a 2-valve type and a blank. The package type number is marked on the panel by two dots according to the standard resistor colour code.

The component panel houses up to 100 components, including small transformers, chokes and coils, the panel and the handle being made in one piece from sheet insulating material. This design provides a minimum resistance to airflow over the valves and gives ample protection to the valves against accidental damage.

The plugs and sockets are used in multiples of eight connections. Most of the packages have four plugs providing 32 connections, but up to 64 are possible in each package. The plug contacts are made of brass and are heavily silver-plated. The socket uses a proprietary valve-holder contact, which can readily be replaced if damaged. This combination of plug and socket has a consistently low contact resistance (0.003 ohm at 1 amp); the insertion and withdrawal force is about 4 oz per contact.

The wiring of the packages

At present packages are wired and soldered by hand. The wiring is point-to-point, and within the limitations of layout for efficient performance, wire lengths are standardized for mass production on automatic wire-cutting and stripping machines. The symmetry of the eyelet positions makes it possible to use components which are preformed to a standard pitch and would allow for automatic preforming and insertion of components. Experimental packages have been produced by photo-etched wiring and dip soldering.

Specification of the computer Pegasus

Summary specification

A detailed specification would cover the ground of the programming manual [Pegasus Programming Manual, Ferranti Ltd., London] and would be out of place here.

Pegasus is a binary serial-digital computer. The word length is 42 binary digits, of which 39 digits are used for a number and its sign (negative numbers are represented by their complements with respect to two), one digit is used for a parity check and the other two are gap digits. The length of an order is 19 binary digits, so that one word may consist of two orders, the remaining digit being a 'stop-go' digit. If the 'stop-go' digit is a '0', the computer will stop before obeying the orders in the word, but will proceed unhindered if the digit is a '1'.

There is a 2-level store, a magnetic drum holding 5120 words and an immediate-access or computing store of 55 single-word magnetostriction delay lines. An order is made up of seven N-digits, three X-digits, six F-digits and three M-digits, the N-digits being the most significant and the M-digits the least significant. The N-digits allow 128 addresses in the immediate-access store (of which only 63 are used). The registers in this store are shown in Fig. 8. The X-digits refer to one of the accumulators, the registers corresponding to N-addresses 0-7. Thus the order code is a 2-address code with one address referring to only a limited part of the store. The F-digits indicate the function of the order. A list of functions and their corresponding F values are given in the appendix of this chapter. The M-digits indicate a modifier for the order: they select one of the accumulators, and the modification process is to add certain parts of the contents of the selected accumulator to the order before it is

Fig. 7. Standard package. (Label: valve mounting panel.)
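The order format given in the summary specification (seven N-digits, three X-digits, six F-digits and three M-digits, N most significant) can be sketched in Python as follows. The field widths and their order come from the text; everything else is an assumption made for illustration: the names `Order`, `encode_order`, `decode_order` and `unpack_word` are hypothetical, and the placement of the two orders and the 'stop-go' digit within the 39 digits (first order uppermost, 'stop-go' digit lowest) is a guess that the text does not confirm.

```python
from typing import NamedTuple


class Order(NamedTuple):
    n: int  # 7 N-digits: address in the computing store (0-127)
    x: int  # 3 X-digits: accumulator (0-7)
    f: int  # 6 F-digits: function, conventionally written as two octal digits
    m: int  # 3 M-digits: modifier accumulator (0-7)


def encode_order(order: Order) -> int:
    """Pack the four fields into a 19-digit order, N most significant."""
    if not (0 <= order.n < 128 and 0 <= order.x < 8
            and 0 <= order.f < 64 and 0 <= order.m < 8):
        raise ValueError("field out of range")
    return (order.n << 12) | (order.x << 9) | (order.f << 3) | order.m


def decode_order(bits: int) -> Order:
    """Split a 19-digit order into its N, X, F and M fields."""
    if not 0 <= bits < (1 << 19):
        raise ValueError("an order is 19 binary digits")
    return Order(n=(bits >> 12) & 0x7F, x=(bits >> 9) & 0x7,
                 f=(bits >> 3) & 0x3F, m=bits & 0x7)


def unpack_word(word39: int):
    """Split 39 digits into (first order, second order, stop-go digit).

    The layout used here is an assumption, not taken from the text.
    """
    if not 0 <= word39 < (1 << 39):
        raise ValueError("expected 39 binary digits")
    first = decode_order((word39 >> 20) & 0x7FFFF)
    second = decode_order((word39 >> 1) & 0x7FFFF)
    return first, second, word39 & 1
```

The field widths add up as the text requires: 7 + 3 + 6 + 3 = 19 digits per order, and 2 x 19 + 1 = 39 digits per word.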
All stored information
NAME OF REGISTER
ADDRESS OF REGISTER
parity digit,
correctly stored
ALWAYS ZERO SINGLE- WORO TRANSFER
ACCUMULATORS
BLOCK TRANSFERS TO AND FROM MAIN STORE
is the 3-address code, A + B → C. An examination of this form of code, however, shows that in many cases two of the addresses are the same, so that the order takes the 2-address form, A + B → A. A further examination shows that in a large proportion of cases the address A is confined to a very few addresses. This leads to the suggestion of a code of the form N + X → X, where X covers only a small part of the store while N covers the whole store. This will have the advantage of yielding a reasonably short order. In Pegasus two such orders are incorporated in one word, leaving sufficient digits to specify a modification register (a Mancunian B-line) in each order.

The extreme case of this code is, of course, the single-address code, where X is confined to one address, the accumulator. However, experience had convinced the programmers collaborating in the design of Pegasus that, with single-address codes, a large number of orders are concerned solely with transfers of numbers from one register to another; the single accumulator through which all numbers must pass and in which all operations have to be performed is a restriction. In the Manchester University computer the B-lines serve two very valuable but distinct purposes: they allow order modification and rudimentary arithmetic (such as counting) to be done without disturbing the accumulator. It was felt that fuller arithmetic and logical facilities on these B-lines would have been extremely valuable. The seven accumulators in Pegasus, used for modification and arithmetic, are a development of the B-line concept.

Having a large number of jump instructions greatly helps in organizing a programme. In particular, one order enables a jump to be made depending on the condition of an accumulator (being zero, for example), and another order on the complementary condition (being not zero). When only one of these orders is available it is necessary to think ahead to see whether or not the correct condition will be satisfied. Although the eight jump instructions included in the code were felt initially to be enough, it is now suggested by programmers that even more such orders would be helpful.

Special facilities for dealing with 'red tape'. The difficulties associated with the 2-level storage system have been greatly reduced by having an order-modification procedure which depends on the function of the order (Fig. 9). This method of modifying orders, used in conjunction with order 66 of the code (the unit-modify order), enables the counting through blocks of information to be done with relative ease. The use of the group-4 orders of the code enables counters to be set conveniently and a constant (up to 127) to be placed in an accumulator, the constant being the value of the N-digits of the order. Order 67 (the unit-count order) enables the counting of cycles of operations to be dealt with in a simple way. A jump can be programmed to take place automatically to another part of the programme when the required number of cycles has been performed.

The logical shift orders, 52 and 53, are also included to simplify 'red tape'. In particular, they are used for packing and unpacking words holding several items of information. As a result of including these various orders, the order code of Pegasus is quite large. It is worth remarking, however, that by a sensible grouping of the orders in the code the remembering of the code is a very simple task. A sensible arrangement of the code tends to reduce the amount of equipment needed to engineer it. For example, when the equipment for dealing with group 0 of the code has been allocated, groups 1 and 4 require the addition of only three gates.

Facilities for checking programmes. The features mentioned above make the computer easier to programme, and there are other facilities in Pegasus that make it easier to check out and develop new programmes. These include causing the machine to stop obeying orders, either under programme control or when the programme is in error. In particular, the machine stops if an order for writing in the main store is reached and an overflow indicator is set. A further aid when testing new programmes is the automatic punching out of all main-store addresses appearing in block-transfer orders. When this information is examined an indication of the course of a programme is readily obtained. The punching can be inhibited by a switch when a return to full-speed running is needed.

Machine rhythm

The logical design of Pegasus is built around a nucleus that deals with the simple arithmetic orders, groups 0, 1 and 4, of the code. This nucleus contains the control section, i.e. the order register and order decoding equipment, and the mill in which these orders are executed. The design for this nucleus could not begin until a basic rhythm dealing with the extraction from the computing store and the execution of such a pair was determined. When the outline of this nucleus was clear, the equipment for dealing with the remaining orders in the code was designed to fit it.
The following arguments led to the basic rhythm. Since the orders of groups 0, 1 and 4 are similar in many respects, for definiteness, it will be sufficient to consider a particular order, 11 of the code, say. This is an order which takes two numbers from the computing store and replaces one of them by their sum. It would take a prohibitive amount of equipment to extract these numbers, add them together and have the least significant digit of the sum available for replacing in the store in the same digit time as the least significant digits of the two components taken out of the store. In practice, some four digit times at least would be needed for this sequence of operations. Thus, it would be impossible to return the sum to the store in the same word time as the operands are extracted without having an entry point to each register which is in a different timing from the normal circulation entry. To produce two such entry points to each register would mean more equipment associated with each register, which was considered an uneconomical use of extra equipment. Instead, it was decided to delay the sum so that it could enter the register in the computing store in the next word time in standard timing. This involves one common delaying circuit instead of one for every register. Such an order therefore takes two word times to execute. It may be argued that this second word time could be made to overlap with the first word time for the next order. Two reasons oppose this: the new contents of the register being changed might be required by the next order; and two different sets of equipment for selecting a storage register would be needed if numbers were to be extracted from one and replaced in another register in the same word time.

Thus, the execution of a pair of orders taken from the computing store requires four word times. The reasons for opposing the overlapping of the execution of two orders also oppose the extraction of an order pair while the previous pair is being dealt with. Five word times are therefore needed for the process of extracting and obeying a pair of simple arithmetic orders. More time may be needed for some of the other orders in the code. The basic 3-beat rhythm is thus established:

a  Extract the order pair from the computing store.
b  Obey the first order of the pair.
c  Obey the second order.

The duration of beat (a) is one word time; beats (b) and (c) are each two word times long for orders in groups 0, 1, 4 and 6 of the code, but may be longer for other orders.

Times for typical operations

The times for the various arithmetic operations are:

Addition and subtraction    0.3 millisec
Multiplication              2.0 millisec
Division                    5.4 millisec

These times include an allowance for the time to extract the orders.

Some times for standard subroutines are:

Exponential function        24 millisec
Sine function               29 millisec
Logarithmic function        34 millisec

Finally, to give some indication of the time for a typical problem, a set of 50 simultaneous equations (with a single right-hand side) takes about 10¾ min. Of this time, 3 min 8 sec is for input, 7 min 17 sec is for calculation and 18 sec is for output.

Realizing the specification

The detailed logical design

It would take too long to describe fully the detailed logical design. One aspect is worth mentioning, however, namely the avoidance of all 'exceptions' in the results of orders. As an example of an exception consider the overflow indicators, which should be set whenever the final result of an order is outside the permissible range of numbers. In multiplication this can occur only when both the multiplier and the multiplicand are -1, and this is likely to occur very infrequently. Rather than provide equipment to sense this infrequent case, it is easier to put a footnote in the programming manual, where the overflow indicator is described, pointing out the exception. It was felt, however, that such exceptions should be avoided even at the expense of extra equipment or extra complication. For this and other reasons concerned with facilitating machine use, the logic of Pegasus is quite complicated.

The end-product of the detailed logical design is a series of diagrams with symbols corresponding to the circuit units of the packages, as shown, for example, in Fig. 5. The inputs and outputs of the units on these diagrams correspond to the pins of the sockets into which the packages plug. Thus, the wiring lists of connections of these pins can be produced from these logical diagrams. The first step in the production of these lists is to allocate a position in the cabinets to each logical circuit in such a way as to reduce the amount of wire needed. When the layout has been completed, the last stage of producing the wire lists can proceed.
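The basic 3-beat rhythm and the quoted addition time can be checked against each other with a short calculation, sketched here in Python. The word length of 42 digits and the beat durations come from the text. The digit time of 3 microseconds is an assumption introduced for this sketch (it is not stated in the passages above), and the function names are hypothetical.

```python
WORD_DIGITS = 42             # word length, from the summary specification
ASSUMED_DIGIT_TIME_US = 3.0  # assumption: not stated in the surrounding text


def pair_word_times(beat_a=1, beat_b=2, beat_c=2):
    """Word times to extract and obey one pair of simple arithmetic orders."""
    return beat_a + beat_b + beat_c


def average_order_time_ms(digit_time_us=ASSUMED_DIGIT_TIME_US):
    """Average time per order, sharing the extraction beat between the pair."""
    word_time_us = WORD_DIGITS * digit_time_us
    return pair_word_times() * word_time_us / 2 / 1000


def problem_time_seconds():
    """Input + calculation + output for the 50-equation example in the text."""
    return (3 * 60 + 8) + (7 * 60 + 17) + 18
```

Under the assumed digit time, a word time is 126 microseconds and the average comes to about 0.3 millisec per order, in line with the figure quoted for addition and subtraction. The three parts of the simultaneous-equations run add up to 643 sec, or about 10¾ min.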
General construction of machine

The main units are shown in Fig. 10.

Fig. 10. Main units. (Labels: Bay 1, logic packages; Bay 2, logic packages; Bay 3, logic packages; packaged monitor unit; programmers control panel; input equipment; fibre glass filter; drum packages.)

The package frame. This unit is a simple light-alloy frame supporting diecast light-alloy frame racks to which the back socket panels are fixed. The packages slide into grooves in the rack and plug into sockets at the back, a polarizing feature preventing the insertion of a package upside down. If electrical or magnetic screening is necessary between any packages, a special metal plate is inserted in slots in the cast rack and is fixed by a single screw in the back panel. Coded aluminium strips containing coloured plastic studs which identify the position of each package are fixed to the front of each casting.

Arrangement of the packages. There are 200 packages per cabinet, arranged in ten horizontal rows of 20 units per row. The metal valve panels are placed so that the edges almost touch. The component panel of each unit is in register with the unit in the corresponding position in each of the other rows, thereby providing vertical chimneys for cooling the components secured to these
panels. Warm air from the main source of heat, the valves, is prevented by the valve panels from reaching the more temperature-sensitive components, such as diodes, secured to the component panel.

The back panel wiring. For locating long signal wires between sockets a system of plastic strips is used, which hold the wires at definite positions given by the instructions on the wiring lists. The exact route of every wire is predetermined, thus making wiring and inspection more reliable and fault finding and maintenance easier.

Final assembly. The completely wired frame is assembled in its cabinet, which has already been fitted with the control and auxiliary supply circuit unit, heater transformers, fuses, cooling assembly and cableforms. The work of connecting the cableforms, heaters and earths can be done by relatively unskilled labour working to clearly written instructions and diagrams.

The cooling system. Each cabinet has its own cooling system as an integral part of the construction; there is therefore no difficulty in cooling cabinets added to existing computers. Two axial-flow turbo blowers are mounted in the base beneath an airtight pressure chamber, each providing 300 ft³/min of air at a total pressure head of 1 in (water gauge). The maximum temperature rise is 10° C.

The power supply. A separate cubicle houses metal rectifiers, shunt stabilizing valves and control circuits. The power is obtained from the mains through a motor-alternator set, the output of which is stabilized to 2%, the main purpose of this set being to act as a buffer against switching surges and other mains voltage variations. The valve heaters in the computer are energized from the stabilized alternator output, which is expected to extend the valve life.

Maintenance

General

All digital computers so far have a fault rate which cannot be ignored. When the best has been done in the choice of components, circuits and mechanical construction, attention must be paid to the following points to get the best out of a machine:

a  Rapid fault location
b  Getting the machine working again as soon as possible after locating a fault
c  Preventive maintenance

Fault location

There are parity-checking circuits on both the main and the high-speed stores. Errors of a single digit in the stores stop the machine. The fault can then be quickly located by examination of the monitors.

For other faults the general method is to run a test programme (assuming the fault is not in the main control) which will indicate the area of the fault. Detailed examination can then be carried out with the monitors.

All outputs of circuit units are readily accessible at monitoring sockets on the front of each package, and in addition about 80 points can be directly selected by switches from the monitoring position: these include all store lines and a number of key waveforms. Fault-finding is normally a matter of tracing 0's and 1's through the machine with reference to logical diagrams rather than electronic circuit diagrams. During commissioning not one case was found of the first machine doing other than what one would expect from the logical diagram (except for a very few cases of incorrect wiring).

A variety of triggers can be selected for the monitor time-bases, these including

a  Trigger at any word position within a drum revolution (128 different times selectable by switches)
b  Trigger at any word time of any selected order

These triggers and some other monitoring facilities are produced by 19 standard packages and are found to be well worth the extra equipment.

Fault repair

Once a faulty package has been located, the machine can be got working again immediately by replacement of the package with a spare; repair of the faulty package can be done at leisure with the aid of a package tester. With this equipment a package can quickly be given a series of standard tests; each is selected by switches, and the performance is measured either by observation of meters or a built-in oscillograph.

Preventive maintenance

The machine h.t. supplies are reduced while the test programmes are being run. This marginal testing shows up incipient faults such as deterioration in valves, crystal diodes or resistors. The machine is at present kept in good running order down to 10% margins
(the supplies are normally controlled to about 1% of nominal), although correct running at about 20% reduction has been observed. The majority of package replacements are done during routine maintenance.

Conclusions

The first machine has been computing regularly for only a few months and has been on regular preventive maintenance (about 1 hour per day) for a few weeks. Error-free runs of over 30 hours are common, and at the time of writing there has been no error for over 55 hours' running. The packaged method of construction of computers has proved to have great advantages in design, construction and operation.

References

ElliW56a; ElboR53; ElliW51, 52, 53, 56b; FairJ56; JohnD52; MerrI56; Pegasus Programming Manual, Ferranti Ltd., London; Pegasus Maintenance Manuals, Ferranti Ltd., London.
APPENDIX The Pegasus Order Code
00
x'
01
x'
02
x'
03
x'
04
x'
05 x 06
1
x'
= n =x+ n = -n =x- n =n—x —x&n = x E=£ n
26
11 n'
12 n'
13 n' 14 n'
15 n' 16 n'
30
17 Not allocated
21 (pq)' 22 (pq)'
= = =
23 (nq)'
=
n n
p
x •
x
+
+
2 39
2~
3S
2~
3S
q
+
nx
this
n
+
order assumes that any
overflow
q
tions
is
in 7.
due to operaClears overflow
unless n' overflows
< 24
25
y+
2
2^ 38
(— =
27 Not allocated
=x — n+x = -x — n —x = x— n = n&x = n^x
20 (pq)'
+
j
— -% < p'/n < % (rounded ;
single-
length division
07 Not allocated
10 n'
q'
- 38
(t)
-
x
+
2~ 38 q
p'/n
B
-1
C[0]^;
{
Go to head 0+ B string
m[b}-mb^b-i)
state
state
I
B— B-1;
Fig. 4.
10
L |
^ M
Thus Fig. 4 is a more detailed description of states and o.v'. Each horizontal pair of states (Fig. 4) corre-
sponds to a single scan of the states of type 1 instruction o.v, o.v, o, among states 2 and 3 correspond to the
M[B]— -.M[B]
iM[Btl]*;
up accord-
3.
o.v' in Fig. 3. Transition:
B— B +
string has terminated
;
>B
dress pointer registers. These point to the tail (or least significant digit), that
+0;
A
trecomp-.";
AC[0]
I^address register* the instruction location pointer
r
I
[l
1
A[l :3] ,
address[X[1:3];|,
:=
(
Address encoding for 1 of 16000 from a ter X. Indexina described below.
3
char value of regis-
[
APPENDIX
Processors for variable-length-string data
IBM 1401 ISP DESCRIPTION (Continued)
1
x [3]
'
+
**ooo 10
x
+
x[]] x ]000,
x[l :3][bcd. string})
Instruction Format op]
{3.ch})); B address set up or djzhar
next
«-
1
)
;
next B[l] ^-d^har;
;
active
-*
(b[2]
*- get,_,char)
;
active
-»
(B[3j
«-
get^char)
;
active
-»
(Bwaddress^present
«-
)
;
active
-»
(B^ddress^present
*- 0)
;
Bwaddress^jpresent
add index register to I or A
(d^char^present «-0;
(d,_,char «- get,_,char
d.jChar.^present
1
next
-1
(A[2] * 0)
active
next
^get^char; next
A^address^present
iM[l]
I or A address set up or d^char
d^char;
next
;
(A[2]
active-* (A^ddress^present ,
«_
char instruction
proceed to get an I or A address
*-0); next
-» B
(d^char ^-get^char; next A[i]
-imls -»
mis
-^
d^char^present
active
(
*- 0;
-*
next next 1
record whether B address is present next
add index register to B
(
d^char.jjresent .
eat
->ij;i
i-
.004 398 364 291
end accumulate reg ».-*«. ¥.2.»/% Removes condrtiorts end
.stars
machine.
The FIXED POINT mode displays numbers in the way they are most commonly written. The DECIMAL DIGITS wheel allows setting the number of digits displayed to the right of the decimal point anywhere from 0 to 9. Figure 2 shows a display of three numbers with the DECIMAL DIGITS wheel set at 5. The number 5.336 845 815 x 10^5 = 533 684.5815 is too big for FIXED POINT without reducing the DECIMAL DIGITS setting to 4 or less. If the number is too big for the DECIMAL DIGITS setting, the register involved reverts automatically to floating point to avoid an apparent overflow. In FIXED POINT display, the number displayed is rounded, but full
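The fixed-point rule in the paragraph above can be sketched in Python as follows. This is a hedged illustration rather than a description of the calculator's internal algorithm: the function name `display` is hypothetical, and the limit of 10 displayed digits is an assumption inferred from the example (six integer digits plus five decimals do not fit, while six plus four do).

```python
def display(value, decimal_digits, max_digits=10):
    """Format value in fixed point if it fits, otherwise in floating point.

    max_digits = 10 is an assumption inferred from the example in the text.
    """
    if not 0 <= decimal_digits <= 9:
        raise ValueError("DECIMAL DIGITS wheel is set from 0 to 9")
    integer_digits = len(str(int(abs(value))))  # at least 1, for a leading 0
    if integer_digits + decimal_digits <= max_digits:
        return f"{value:.{decimal_digits}f}"    # rounded to the wheel setting
    # Too big for the setting: revert to floating point for this register.
    return f"{value:.{max_digits - 1}e}"
```

With the number from the text, `display(533684.5815, 5)` reverts to floating point, while `display(533684.5815, 4)` returns the fixed-point string `533684.5815`.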
In FLOATING POINT, digits to the right of the decimal are grouped in threes.

A pull-out instruction card, Fig. 3, is located at the front of the calculator under the keyboard. The operation of each key is briefly

Fig. 3. Pull-out instruction card. (Panel headings: STORAGE, TO ENTER A PROGRAM, TO RECORD A PROGRAM, TO RUN A PROGRAM; the card carries a short description of each key.)
Special keys located in a block to the left of the digit keys are used to identify the lettered registers.

To store a number from the X register, the x→( ) key is used. The parenthesis indicates that another key depression, representing the storage register, is necessary to complete the transfer. For example, storing a number from the X register into register 8 requires two key depressions: x→( ) followed by 8. The X register remains unchanged. To store a number from the Y register, the y→( ) key is used.

The contents of the alpha registers are recalled to X simply by pressing the keys a, b, c, d, e, and f. Recalling a number from a numbered register requires the use of the y⇄( ) key to distinguish the recall procedure from digit entry. This key interchanges the number in the Y register with the number in the register indicated by the following keystroke, alpha or numeric, and is also useful in programs since neither number involved in the transfer is lost.

The CLEAR key sets the X, Y, and Z display registers and the f and e registers to zero. The remaining registers are not affected. The f and e registers are set to zero to initialize them for use with the ACC+ and ACC− keys, as will be explained. In addition, the CLEAR key clears the FLAG and the ARC and HYPER conditions, which often makes it a very useful first step in a program.

In converting from polar to rectangular coordinates, the angle is placed in Y and R in X, the conversion key is pressed, and the display shows y in Y and x in X.

ACC+ and ACC− allow addition or subtraction of vector components in the f and e storage registers. ACC+ adds the contents of the X and Y registers to the numbers already stored in f and e respectively; ACC− subtracts them. The RCL key recalls the numbers in the f and e registers to X and Y.
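The accumulate keys can be sketched as a small register model. This is an illustrative sketch only: the class and method names are hypothetical, the Z register and the other storage registers are not modelled, and Python floats stand in for the calculator's decimal arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Registers:
    """Hypothetical model of the X, Y display registers and the f, e accumulators."""
    x: float = 0.0
    y: float = 0.0
    f: float = 0.0
    e: float = 0.0

    def acc_plus(self) -> None:
        # ACC+ adds X to f and Y to e, as described in the text.
        self.f += self.x
        self.e += self.y

    def acc_minus(self) -> None:
        # ACC- subtracts X from f and Y from e.
        self.f -= self.x
        self.e -= self.y

    def rcl(self) -> None:
        # RCL recalls f and e to X and Y.
        self.x, self.y = self.f, self.e

    def clear(self) -> None:
        # CLEAR zeroes the display registers and the two accumulators.
        self.x = self.y = self.f = self.e = 0.0


def vector_sum(vectors):
    """Sum (x, y) component pairs with ACC+ and read the total back with RCL."""
    regs = Registers()
    for vx, vy in vectors:
        regs.x, regs.y = vx, vy
        regs.acc_plus()
    regs.rcl()
    return regs.x, regs.y
```

For example, `vector_sum([(3, 4), (1, -2)])` accumulates the two vectors component by component and recalls the total to X and Y.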
Illegal operations

A light to the left of the CRT indicates that an illegal operation has been performed. This can happen either from the keyboard or when running a program. Pressing any key on the keyboard will reset the light. When running a program, execution will continue but the light will remain on as the program is completed.

The illegal operations are:

Division by zero
√x where x < 0
Thus with a simple one core per bit system successive reads can be made at 1 μsec intervals and writes at 2 μsec.

A single-length word may represent a floating-point number with an 8-bit characteristic c and a 40-bit fractional part f; the value of the number is then f × 2^(c − 128). f is limited to the range ½ ≤ f < 1, or −1 ≤ f < −½, or f = 0 when c is also zero. All floating-point operations assume that operands are in this standard form and give correctly rounded results in standard form. Functions for the addition and subtraction of double-length floating-point numbers have been provided, as these give increased accuracy and stability in many matrix operations.
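The floating-point format described above, an 8-bit characteristic c and a fractional part f with value f × 2^(c − 128), can be checked with a small sketch. The function names are hypothetical, exact rationals stand in for the 40-bit fraction, and the bounds on f are taken as ½ ≤ f < 1 or −1 ≤ f < −½, with zero represented by f = 0 and c = 0.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def float_value(c: int, f: Fraction) -> Fraction:
    """Value of a number with 8-bit characteristic c and fractional part f."""
    if not 0 <= c < 256:
        raise ValueError("characteristic must fit in 8 bits")
    return f * Fraction(2) ** (c - 128)


def is_standard_form(c: int, f: Fraction) -> bool:
    """True if (c, f) satisfies the range limits on f quoted in the text."""
    if f == 0:
        return c == 0
    return HALF <= f < 1 or -1 <= f < -HALF
```

Under these assumptions `float_value(129, Fraction(3, 4))` is 3/2, and a fraction of ¼ is rejected by `is_standard_form` because it would have to be shifted left one place (with c reduced by one) to reach standard form.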
The arithmetic unit

As shown in Fig. 1, there are six full-length transistor flip-flop registers in the arithmetic unit; there are also two 8-bit registers used when performing floating-point operations. The main facilities associated with these registers are as follows.

W1, W2 and W3 are the three most accessible cells of the nesting store; transfers to the core part of the nesting store, being performed in parallel with the [...] time required to complete the functions.
W1 and W2, together with B1 and B2, form a double-length shifting register which may be used as two independent single-length shifting registers. B1 and B2 are the inputs to the 48-bit adder whose output may be routed to W1, W2, or to the characteristic difference register CD. The adder contains 13 carry-skip stages which reduce the carry propagation time to a maximum of 150 nsec. Subtraction is performed by adding the minuend's complement to the subtrahend with a carry inserted into the right-most adder stage.

W3 acts as a buffer between store control and the arithmetic unit, and together with B1 and B2, is used in nearly every function.

Arithmetic unit control interprets each instruction as a sequence of timed pulses along lines which activate the various transfers etc., between the registers. The sequences have been constructed so that many operations are performed simultaneously, reducing the overall time to a minimum; thus the function single-length fixed-point add is performed by:

i Transferring W1, W2, W3 to B2, B1 and Nb respectively, simultaneously commencing a read from the nesting store, clearing the carry inserted into the right-most adder stage and switching the adder's output to W1.

ii Adding and simultaneously transferring Nb to W2.

Each step takes 0.5 μsec and by the end of the last step, W3 has been refilled from the core nesting store.

Fig. 1. Block diagram of the arithmetic unit. Full lines represent information transfers; dotted lines represent control pulses. All registers are 48-bits long unless otherwise stated.

When determining the arrangement of transfer paths between the various registers, it was found sufficient to consider only the complicated or lengthy double-length functions which required complicated sequences; in particular the function for adding two double-length floating numbers had great influence. A similar arithmetic unit operating only on single-length numbers could be designed using only four full-length registers. At least five registers are required to perform the function which interchanges the contents of the two most accessible cells in the nesting store with those of the next most accessible pair. The sixth register enables all double-length arithmetic operations to be performed without writing information back into the nesting store during the function; this would have complicated the sequences and increased the time for the functions.

An overflow indication is set on fixed-point addition and subtraction if the sign of the result differs from that expected, and on floating-point operations if the characteristic exceeds the maximum allowable; shifting may also cause overflow.

To speed up multiplication and division, these functions are carried out in a separate unit employing the stored carry principle, but the results are finally assimilated within the arithmetic unit.

Shift control

Shifting operations are effected by transfers between W1 (and/or W2) and B1 (and/or B2), and back again. The shift transfer paths from the W to the B registers provide right shifts of 0, 1, 2, 5 or 8 places, and a left shift of 8 places; the paths from the B to the W registers provide the same shifts in the reverse direction. The two sets of shift paths are used alternately, those from the W registers being used first; all shifts are terminated using a path into the W registers. Shifts of a large number of places are accomplished by a series of shifts of eight places in the appropriate direction until the number of places remaining is less than eight; if necessary the number is then transferred back into the W registers: the remaining shifts, or the whole shift if the number of places is less than eight, is then completed by a transfer to the B registers and back again using two appropriate paths. With the shifts available, extension of the B registers by two bits at the right-most end enables any shift to be performed without loss of accuracy.

When a shift is to be performed, the number of places and the type of shift are transferred into a semiautonomous unit, called the shift control, which is then supplied with a string of command pulses by the arithmetic unit control; shift control then re-routes these pulses to perform the transfers necessary to obtain the shift. In double-length arithmetic shifts, the sign digit of the less significant word is by-passed.

When performing floating-point addition and subtraction, shifts are required to equalize the characteristics of the two numbers; the amount of shift is calculated by a modified subtraction, operating on the characteristic positions of the two numbers. After the addition, the shift required to restore the result to standard form is determined by logical circuits which interpret the pattern of bits in the result into shift information. The number of shifts performed during this standardising operation is made available to the arithmetic unit control for use in forming the correct characteristic of the result.

The character conversion operations to, and from, binary are accomplished by shift control, using a method involving successive shifting of the character word, and adding or subtracting portions of the radix word.

Examples of sequences

To illustrate the working of the arithmetic unit, two sequences are described.

−D (i.e. subtract the double-length fixed-point number in W1 and W2 from the number in W3 and the most accessible core register of the nesting store).

i Transfer W1, W2, W3 to B2, B1 and Nb respectively, simultaneously reading from the core nesting store.

ii A dummy pulse.

iii Transfer the complement of W2 to B2 (but setting the sign of B2 positive), transfer W3 directly to B1 (W3 has by now been filled with fresh data), switch the adder's output to W2, inserting a carry into the right-most adder stage, and read from the nesting store.

iv Add.

v Transfer the complement of W1 to B1 and Nb to B2, switch the adder's output to W1 and insert a carry into the right-most adder stage if W2 is negative.

vi Add, simultaneously clearing the sign of W2.

+F (i.e. add the two single-length floating numbers in W1 and W2).

i Store the characteristic of W1 in the eight-bit register C.

ii Transfer the complement of W1 to B1, transfer W2 to B2, switch the adder's output to CD, and add.

iii Clear the characteristic positions of W1, simultaneously transferring CD into the shift number register in shift control. This latter operation is such that the shift number register contains minus the difference in characteristics.

iv Clear the characteristic of W2, and if W1 is to be shifted, determined by the sign digit of CD, replace the contents of C by the characteristic of B2; thus C contains the larger characteristic.

v Supply control pulses to shift control and thus perform the required right-shift of either W1 or W2.

vi Having completed the shift, transfer W1, W2 and W3 to B2, B1 and Nb respectively, simultaneously switching the adder's output to W1, clearing the carry into the right-most adder stage and reading from the core nesting store.

vii Add the fractional parts, simultaneously transferring Nb to W2.

viii Supply control pulses to shift control so as to cause it to enter the standardization procedure and perform the shifts required.

ix Store the complement of the number of left-shifts performed in (viii) in the characteristic position of B2, transfer C to the characteristic position of B1, switch the adder to W1.

x Perform a special add operation which only affects the characteristic positions of W1.

The sum is thus formed in W1. Rounding the answer is carried out using two special control pulses which complete all floating-point operations; these call up logic to deal with the cases when the rounding operation necessitates re-standardization of the result.
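The shift scheme described under shift control (series of eight-place shifts, then a transfer to the B registers and back using two paths) can be explored with a short sketch. It rests on one reading of the text, labelled here as an assumption: shifts are written as signed numbers with right shifts positive, the W-to-B paths offer 0, 1, 2, 5, 8 right or 8 left, the B-to-W paths offer the same amounts in the opposite direction, and the two extra bits at the right-hand end of the B registers allow the first transfer to overshoot by at most two places. All names are hypothetical.

```python
W_TO_B = (0, 1, 2, 5, 8, -8)     # signed shifts, right positive (assumed reading)
B_TO_W = (0, -1, -2, -5, -8, 8)  # the same shifts in the reverse direction
EXTENSION_BITS = 2               # extra bits at the right-most end of B1 and B2


def two_path_plan(net_right: int) -> tuple[int, int]:
    """Pick a W-to-B and a B-to-W path whose shifts sum to net_right."""
    for first in W_TO_B:
        for second in B_TO_W:
            if first + second != net_right:
                continue
            # Places shifted past the final position by the first transfer;
            # these bits survive only if they fit in the extension bits.
            overshoot = first - max(net_right, 0) if first > 0 else 0
            if overshoot <= EXTENSION_BITS:
                return first, second
    raise ValueError(f"no two-path plan for a shift of {net_right}")


def plan_right_shift(places: int) -> list[int]:
    """Transfers (alternately W-to-B and B-to-W) for a right shift of `places`."""
    steps = [8] * (places // 8)          # series of eight-place shifts
    if len(steps) % 2:
        steps.append(0)                  # transfer back into the W registers
    remainder = places % 8
    if remainder:
        steps.extend(two_path_plan(remainder))
    return steps
```

Under these assumptions every net shift from 7 places left to 7 places right has a two-path plan with an overshoot of at most two places, which is consistent with the statement that two extension bits suffice; for instance a right shift of 3 comes out as 5 right followed by 2 left.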
Conclusions

The advantages of a machine incorporating a nesting store in the arithmetic unit are:

i The machine is simple to programme using the machine language.

ii Programmes are faster, since many main store transfers are eliminated, and the access time of the nesting store is virtually zero. They are more compact because less information is required to specify many instructions.

iii As the operation of the arithmetic unit is largely independent of the main store, their controls may readily be separated. This allows store control to process instructions whilst the arithmetic unit control processes a prior instruction, thereby leading to faster execution of the programme.

The main disadvantage is an increase in the order of complexity involved.

References

AllmR62; DaviG60; HaleA62
Chapter 22

Design of the B 5000 system¹

William Lonergan / Paul King

Computing systems have conventionally been designed via the 'hardware' route. Subsequent to design, these systems have been handed over to programming systems people for the development of a programming package to facilitate the use of the hardware. In contrast to this, the B 5000 system was designed from the start as a total hardware-software system. The assumption was made that higher level programming languages, such as ALGOL, should be used to the virtual exclusion of machine language programming, and that the system should largely be used to control its own operation. A hardware-free notation was utilized to design a processor with the desired word and symbol manipulative capabilities. Subsequently this model was translated into hardware specifications, at which time cost constraints were considered.

Design objectives

The fundamental design objective of the B 5000 system was the reduction of total problem through-put time. A second major objective was facilitation of changes both in programs and system configurations. Toward these objectives the following aspects of the total computer utilization problem were considered:

Statement of problems in higher-level machine-independent languages; efficiency of compilation of machine language; speed of compilation of machine language; program debugging in higher-level languages; problem set-up and load time; efficiency of system operation; ease of maintaining and making changes in existing programs, and ease of reprogramming when changes are made in a system configuration.

Design criteria

Early in the design phase of the B 5000 system the following principles were established and adopted:

Program should be independent of its location and unmodified as stored at object time; data should be independent of its location; addressing of memory within a program should take advantage of contextual addressing schemes to reduce redundancy; provisions should be made for the generalized handling of indexing and subroutines; a full complement of logical, relational and control operators should be provided to enable efficient translation of higher-level source languages such as ALGOL and COBOL; program syntax should permit an almost mechanical translation from source languages into efficient machine code; facilities should be provided to permit the system to largely control its own operation; input-output operations should be divorced from processing and should be handled by an operating system; multi-programming and true parallel processing (requires multiple processors) should be facilitated, and changes in system configuration (within certain broad limitations) should not require reprogramming.

System organization

The B 5000 system achieves its unique physical and operational modularity through the use of electronic switches which function logically like telephone crossbar switches. Figure 1 depicts the basic organization of the system as well as showing a maximum system.

Master control program

A master control program will be provided with the B 5000 system. It will be stored on a portion of the magnetic drum. During normal operations, a small portion of the MCP will be contained in core memory. This portion will handle a large percentage of recurrent system operations. Other segments of the MCP will be called in from the magnetic drum, from time to time, as they are required to handle less frequently-occurring events, or system situations. Whenever the system is executing the master control program, it is said to be in the Control State. All entries to the Control State are made via 'interrupts.' A special operation, which can only be executed when the system is in the Control State, is provided to permit control to return to the object program it was executing at the time the 'interrupt' occurred.

The following are a few typical occurrences which cause an automatic 'interrupt' in the system: An input-output channel is

¹Datamation, vol. 7, no. 5, pp. 28-32, May, 1961.
way around machine design, but they still must provide object coding to accomplish the storage and recall functions. In brief, conventionally designed computers, with or without automatic programming aids, require the wasteful expenditure of programming effort, memory capacity, and running time to overcome the limitations of their internal organization.

The problem is attacked directly in the B 5000 by incorporation of a "pushdown" stack, which completely eliminates the need for instructions (coded or compiled) to store or recall intermediate results.

In a B 5000 processor, the stack is composed of a pair of registers, the A and B registers, and a memory area. As operands are picked up by the programs, they are placed in the A register. If the A register already contains a word of information, that word is transferred to the B register prior to loading the operand into the A register. If the B register is also occupied by information, then the word in B is stored in a memory area defined by an address register S. Then the word in A can be transferred to B and the operand brought into the A register. The new word coming into the stack has pushed down the information previously held in the registers. As each pushdown occurs, the address in the S register is automatically increased by one. The information contained in the registers is the last information entered into the stack; the stack operates on a "last in-first out" principle. As information is operated on in the stack, operands are eliminated from the stack and results of operations are returned to the stack. As information is used up by operations being performed, it is possible to cause "pushups," i.e., a word is brought from the memory area in the stack addressed by the S register, and the address in the S register is decreased by one.

To eliminate unnecessary pushdowns and pushups, the A and B registers both have indicators used for remembering whether the registers contain information or are empty. When an operand is to be placed in the stack and either of the registers is empty, no pushdown into memory occurs. Also, when an operation leaves one or both of the registers empty, no automatic pushup occurs.

Polish notation

The Polish logician, J. Lukasiewicz, developed a notation which allows the writing of algebraic or logical expressions which do not require grouping symbols and operator precedence conventions. For example, parentheses are necessary as grouping symbols in the expression A(B + C) to convey the desired interpretation of the expression. In the expression A + B/C, the normal interpretation is A + (B/C), rather than (A + B)/C, because of the convention that the / operator is of higher precedence than the + operator.

The right-hand Polish notation used in the B 5000 is based on placing the operators to the right of their operands: A + B becomes AB + in Polish notation. A + B + C can be written either as AB + C +, or as ABC + +. In the expression ABC + +, the first + operator says to add the operands B and C. The second + operator says to add A to the sum of B and C. Returning to the first examples above, A(B + C) can be written as BC + A × or ABC + × in Polish. The second example is written as BC/A + or ABC/ +. The extension of Polish notation to handle equations is shown in the following example:

Conventional notation: Z = A(B − C)/(D + E)
Polish notation: ABC − × DE + / Z =

The stack in use

To illustrate the functioning of the stack, two simple examples are shown in Figs. 4 and 5. In the examples, the letters P, Q and R represent syllables in the program that cause the operands P, Q, and R to be picked up and placed in the stack. The symbols + and × represent syllables that cause the add and multiply operations to occur. The two examples represent different ways of writing P(Q + R) in Polish notation. The first example, in Fig. 4, does not require pushdowns or pushups. The second example, shown in Fig. 5, requires a pushdown in the execution of the syllable R, and a pushup in the execution of the syllable ×. The columns in the table represent the contents of the various registers after execution of the syllable listed in the first column.

Independence of addressing

One of the goals set in the design of the B 5000 was to make the programs independent of the actual memory locations of both the program itself and the data, in order to provide really automatic

Fig. 4. Stack operation for the Polish notation QR + P ×.

Fig. 5. Stack operation for the Polish notation PQR + ×.
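The two stack examples, QR + P × and PQR + ×, can be traced with a small simulation of the A and B registers and the memory area. This is a minimal sketch, not the B 5000's actual logic: `None` plays the role of the "empty" indicators, a Python list stands in for the memory area addressed by S, the result of an operation is assumed to be left in A, and all names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}  # "*" stands for the multiply syllable


class Stack:
    def __init__(self):
        self.a = None        # A register (None means empty)
        self.b = None        # B register (None means empty)
        self.memory = []     # memory area; its length plays the role of S
        self.pushdowns = 0
        self.pushups = 0

    def load(self, value):
        """Place an operand in the stack, pushing down only when A and B are full."""
        if self.a is not None:
            if self.b is not None:
                self.memory.append(self.b)
                self.pushdowns += 1
            self.b = self.a
        self.a = value

    def _pushup(self):
        self.pushups += 1
        return self.memory.pop()

    def operate(self, fn):
        """Apply a two-operand operator, pushing up only if a register is empty."""
        if self.a is None:
            self.a, self.b = self.b, None
        if self.a is None:
            self.a = self._pushup()
        if self.b is None:
            self.b = self._pushup()
        self.a, self.b = fn(self.b, self.a), None  # result in A is an assumption


def run(program, values):
    """Execute space-separated syllables; return (result, pushdowns, pushups)."""
    stack = Stack()
    for syllable in program.split():
        if syllable in OPS:
            stack.operate(OPS[syllable])
        else:
            stack.load(values[syllable])
    return stack.a, stack.pushdowns, stack.pushups
```

With P = 2, Q = 3, R = 4, `run("Q R + P *", ...)` needs no pushdown or pushup, while `run("P Q R + *", ...)` needs one pushdown (on R) and one pushup (on the multiply), matching the description of Figs. 4 and 5.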
Word mode program

The word mode of the B 5000 processor has four types of syllables. The syllable type is distinguished by the two high-order bits of each 12-bit syllable. The types of syllable and the identification bits are:

00: Operator Syllable
01: Literal Syllable
10: Operand Call Syllable
11: Descriptor Call Syllable

The first of these, the operator syllable, causes operations to be performed. The remaining ten bits of the operator syllable are the operation codes. There are approximately sixty different operations in the word mode. For those operations requiring an operand or operands, the processor checks for sufficient operands in the registers; if they are not there, pushups from the stack in memory occur automatically.

The literal syllable is used for placing constants in the stack to be used as operands. The ten bits of the literal syllable are transferred to the stack. This allows the program to contain integers less than 1,024 as constants.

The operand call syllable and the descriptor call syllable address locations in the program reference table. The purpose of the operand call syllable is to place an operand in the stack; the purpose of the descriptor call syllable is to place the address of an operand, a descriptor, in the stack. There are four situations that arise, depending on the word read from the program reference table.

1 The word is an operand.

2 The word is a descriptor containing the address of the operand.

3 The word is a descriptor containing the base address of the data area in which the operand resides.

4 The word is a program descriptor containing the base address of a subroutine.

For (1), the operand call syllable has completed its action by placing an operand in the stack. The descriptor call syllable will cause the construction of a descriptor of the operand, replacing the operand by the constructed descriptor. For (2), the operand call syllable then reads the operand from the cell addressed. The descriptor call syllable has completed its action. For (3), indexing of the descriptor by the item that is now the second item in the stack occurs. For an operand call syllable, the operand is obtained from the indexed address; for the descriptor call syllable, action is complete after the indexing. In the case of (4), subroutine entry occurs to the subroutine addressed. A word of the three previous types may be left in the registers upon return from the subroutine, in which instance the actions described above will take place, depending upon the type of syllable which initiated the subroutine.

Essentially, the four types of action that occur for an operand call syllable are obtaining an operand directly, indirectly, from an array, or by computation. Sometimes in the use of the call syllables, it is not known which type of action will occur for a particular syllable when the program is created. This is particularly true for call syllables in subroutines.

Programs in the word mode consist of strings of syllables which follow the rules of Polish notation. Variable length strings of call syllables and literal syllables, which place items of information in the stack, are followed by operator syllables which perform their operations on information in the stack.

The indexing features of the B 5000 allow generalized indexing and at the same time provide complete storage protection. Data areas and program segments of different programs may be intermingled, but a program is prevented from storing outside of its data areas. The method of indexing allows any of the 1,024 words of the program reference table to be considered index registers. Multilevel indexing is provided, i.e., indices of arrays can themselves be elements of arrays.

The subroutine control provided in the B 5000 allows nesting of subroutines (even recursive nesting, i.e., a subroutine that is a subroutine of itself) arbitrarily deep. Dynamic allocation of storage for parameter lists and temporary working storage simplifies the use of subroutines. Storage is automatically allocated and deallocated as required.

Character mode program

In the character mode of the B 5000 Processor, there is only one type of syllable, called the operator syllable. Program segments in the character mode are constructed of strings of these syllables. The character mode is designed to provide editing, formatting, comparison, and other forms of data manipulation. In doing so, the processor uses two areas of memory, the source and destination areas. When a program switches from word mode to character mode, two descriptors containing the base addresses of these areas are supplied. The source area or destination area may be changed at any time during character mode so that the program may act on several areas.

The character mode operator syllable is split into two 6-bit parts; the last part specifies the operation to be performed and the first part specifies the number of times the operation is to be performed. Operations are provided for transferring, deletion, comparison, and insertion of characters or bits. Also, there are operations which allow the repetition of syllable strings. This is quite useful for complex table look-up operations and for editing information which contains repeated patterns.

Conclusion

The Burroughs B 5000 system has been designed as an integrated hardware-software package which offers such benefits as savings in the memory space required to store equivalent object programs; multi-processing and parallel processing; and running identical programs on systems with different size memories and different system configurations with no loss in individual system efficiency.

References

LoneW61; BartR61; BockR63; CarlC63; MaheR61
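The syllable formats described in this chapter (a 12-bit word mode syllable whose two high-order bits give its type, and a character mode operator syllable split into two 6-bit parts) can be illustrated with a short decoding sketch. The function names are hypothetical, and treating the "first part" of a character mode syllable as its high-order six bits is an assumption; the actual operation code assignments are not given in the text and are not modelled.

```python
WORD_MODE_TYPES = {
    0b00: "operator",
    0b01: "literal",
    0b10: "operand call",
    0b11: "descriptor call",
}


def _check_syllable(syllable: int) -> None:
    if not 0 <= syllable < (1 << 12):
        raise ValueError("a syllable is 12 bits")


def decode_word_mode(syllable: int) -> tuple[str, int]:
    """Split a word mode syllable into its type and its remaining ten bits."""
    _check_syllable(syllable)
    kind = WORD_MODE_TYPES[syllable >> 10]  # two high-order bits
    rest = syllable & 0x3FF                 # operation code, literal, or table address
    return kind, rest


def decode_character_mode(syllable: int) -> tuple[int, int]:
    """Split a character mode syllable into (repeat count, operation)."""
    _check_syllable(syllable)
    repeat = syllable >> 6      # first part: number of times (assumed high-order)
    operation = syllable & 0x3F  # last part: operation to be performed
    return repeat, operation
```

Since ten bits remain after the type field, a literal syllable can carry any integer from 0 to 1,023, and a call syllable can address any of the 1,024 words of the program reference table.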
Section 6

Processors with multiprogramming ability

The processors in this section have features which allow multiple programs to exist in the primary memory at the same time. The programs can be executed alternately by a single processor without having to wait for new programs to be input. The cost is only that of changing the processor state, which involves only a few instructions at most (and only one instruction on some systems, such as the CDC 6600). Since programs are subject to numerous unpredictable delays within a single run for interchange with the external environment (either via Ms or T), substantial increases in Pc utilization can be achieved by multiprogramming. If more than a single processor has access to Mp, the system is called a multiprocessor system.

Time-shared computers are generally multiprogrammed. Alternatively, time-shared systems can be implemented by swapping programs, one at a time, into primary memory for interpretation. The Berkeley Time-Sharing System (Chap. 24) uses both multiprogramming and program swapping. The Burroughs B 5000 (Chap. 22) is an early computer to have multiprogram capability.

The idea of multiprogramming is so fundamental that it should be among the first concepts to be understood by the student of computing systems. A very nice review of memory mapping and storage allocation is presented in the paper Dynamic Storage Allocation Systems [Randell and Kuehner, 1968].

Atlas

The Atlas is one of the most important machines described in this book. The prototype was originally designed and constructed at Manchester University. The Atlas 1 and Atlas 2 were produced by Ferranti Corp. (prior to becoming part of I.C.T.¹). Atlas 1 is the most interesting; it incorporates most of the features of the Atlas prototype. The Lincoln Laboratory TX-2 [Clark, 1957] influenced some Atlas features: multiple index registers and interrupt processing of input/output devices. Atlas' detailed internal structure is described in a paper [Sumner et al., 1962].

Two original features, one-level storage and extracodes, have been copied in many other machines. A one-level store is common to most new computers which are time-shared or multiprogrammed; the scheme for memory paging in the SDS 940 is essentially that of Atlas. The extracodes feature allows ordinary machine operation codes to be used to call subroutines. Commonly used complex instructions (such as sin, cos, and monitor calls) can be written in a common operating system accessible to all users. Initially these subroutines were stored in a read-only memory. The ISP is straightforward and extremely nice. The extracode idea appears in the SDS 900 series and was used in the SDS 940 system for defining common-user instructions. The IBM System/360 SVC (supervisor call) instruction is an adaptation of the extracode.

Atlas was about the earliest computer to be designed with a software operating system and the idea of user machine in mind. The operating system has been nicely described [Kilburn et al., 1961] and evaluated [Morris et al., 1967].

In a letter to the authors of this book, F. H. Sumner makes the following comments on Atlas.

The initial ideas and the preliminary research on the Atlas computer system started in the Department of Computer Science of the University of Manchester in 1956. The team, under the direction of Professor T. Kilburn, was later supplemented by several members of the I.C.T. Computer Research Department, and the prototype machine was working in the department by the Autumn of 1961. The first production model became operational in January 1963.

The significant features of the system can be summarised as:

1 The provision of a virtual address field greater than the real address space.

2 The implementation of a "one-level" store using a mixture of core store and drum store.

3 The interrupt system and the method of peripheral control.

4 The realisation at the design stage that there would be a complex operating system and the provision in the hardware of specific features to assist such an operating system.

The method of peripheral control permitted the attachment of a large number of on-line peripherals with rapid response and entry into the operating system for a peripheral requiring attention. This, together with the multiprogramming features, makes the design ideal for the attachment of keyboards for the provision of multiaccess operation. In the original design, provision for several such on-line typewriters was made, but at the production stage it was decided to remove these as an economy measure. In view of the subsequent development of on-line operation, this was rather an unfortunate decision.

The Atlas computer at the University has now been in continuous operation for four years and it is expected to provide for the major [...]

During the period of its operation the provision of extensive monitoring and logging information has permitted the behaviour of the system to be studied in detail. The results of these studies have been extremely valuable in the design of a successor to the Atlas.

¹International Computers and Tabulators, U. K.
940 one
of the first commercially available combined hardwaresoftware time-sharing computers. 1 The description in Chap. 24 is concerned with the machine
as
it
appears to the user. That
is
described
in
the context
in
more modest than that of the IBM 360/67 GE 645 [Dennis, 1965; Daley and Dennis, 1968]. A number of instructions are apparently built in via the programmed operator calling mechanism, based on Atlas extracodes (Chap. 23). The software-defined instructions that of Atlas but
is
1966] and
al.,
emphasize the need
for
ing-point arithmetic
is
run.
hardware features. For example,
float-
needed when several computer-bound
The SDS 945
is
a
successor to the 940, with
increased capability but at a lower cost.
'Time-shared computers consist of both hardware and a complex software operating system.
Adams Computer
Characteristics Quarterly lists the deliveries of gen-
DEC PDP-6 hardware, October, 1964 (software in early 1965); SDS 940 hardware (and Berkeley software) April, 1966; GE 635, 645 hardware, May, 1965 (M.l.T.'s project MULTICS software, around eral-purpose time-shared computers as
in
a time-sharing system
The Berkeley Time-Sharing Computer (Fig. 1) is based on the SDS 930 (Chap. 24). The hardware modifications to the SDS
1969);
IBM System /360 Model 67 hardware, March, 1966
1968).
M(content addressable: flip flop) Mp(#0:3)'-
-,— K('Map)
Pc
2
Ms (magnetic tape).
S-r-K I— K
T(paper tape)-
l-K-S-T(Teletype)t— Pio
s(drum; 2 ys/w;
K— Msfmoving L-
Pio-Sf-K-CfPDP-Sj-pK
I
l-K-Cf'PDP-Sj-r-K ;— pi\
'rlp(core; 2
in
Part 3, Sec. 5,
page 257, Chap. 22.
A user machine
the hardware and the oper-
which they contribute to form a user machine. The 940 uses a memory map which is almost a subset of
programs are
The Burroughs B 5000 computer
is,
ating system software are both presented
slightly
Design of the B 5000 System
ability
930, together with the operating system software, were sold by Scientific Data Systems as the SDS 940. The operating system and hardware modifications for multiprogramming make the
[Arden et
part of the University's computing needs until 1971.
Processors with multiprogramming
1.75 ys/w;
I63RI1 w;
(2*1,1
—
1
w) 10
T(CRT; display)-
T(keyboard; CRT; display)1
parity) b/w)
time-shared-computer
10 .5 x
8
Pc('Modified SDS 930), see Chapter k2
Fig. 1. University of California (Berkeley)
1.3 x
head disk;
PMS
diagram.
w)
(software, around
275
Chapter 23
One-level storage system¹

T. Kilburn / D. B. G. Edwards / M. J. Lanigan / F. H. Sumner

¹IRE Trans., EC-11, vol. 2, pp. 223-235, April, 1962.

Summary. After a brief survey of the basic Atlas machine, the paper describes an automatic system which in principle can be applied to any combination of two storage systems so that the combination can be regarded by the machine user as a single level. The actual system described relates to a fast core store-drum combination. The effect of the system on instruction times is illustrated, and the tape transfer system is also introduced since it fits basically in through the same hardware. The scheme incorporates a "learning" program, a technique which can be of greater importance in future computers.

1. Introduction

In a universal high-speed digital computer it is necessary to have a large-capacity fast-access main store. While more efficient operation of the computer can be achieved by making this store all of one type, this step is scarcely practical for the storage capacities now being considered. For example, on Atlas it is possible to address 10^6 words in the main store. In practice on the first installation at Manchester University a total of 10^5 words are provided, but though it is just technically feasible to make this in one level it is much more economical to provide a core store (16,000 words) and drum (96,000 words) combination.

Atlas is a machine which operates its peripheral equipment on a time division basis, the equipment "interrupting" the normal main program when it requires attention. Organization of the peripheral equipment is also done by program so that many programs can be contained in the store of the machine at the same time. This technique can also be extended to include several main programs as well as the smaller subroutines used for controlling peripherals. For these reasons as well as the fact that some orders take a variable time depending on the exact numbers involved, it is not really feasible to "optimum" program transfers of information between the two levels of store, i.e., drum and core store, in order to eliminate the long drum access time of 6 msec. Hence a system has been devised to make the core drum store combination appear to the programmer as a single level of storage, the requisite transfers of information taking place automatically. There are a number of additional benefits derived from the scheme adopted, which include relative addressing so that routines can operate anywhere in the store, and a "lock out" facility to prevent interference between different programs simultaneously held in the store.

2. The basic machine

The arrangement of the basic machine is shown in Fig. 1. The available storage space is split into three sections: the private store, which is used solely for internal machine organization; the central store, which includes both core and drum store, in which all words are addressed and which is the store available to the normal user; and finally the tape store, which is the conventional backing-up large capacity store of the machine. Both the private store and the main core store are linked with the main accumulator, the B-store, and the B-arithmetic unit. However the drum and tape stores only have access to these latter sections of the machine via the main core store.

The machine order code is of the single address type, and a comprehensive range of basic functions are provided by normal engineering methods. Also available to the programmer are a number of extra functions termed "extracodes" which give automatic access to and subsequent return from a large number of built-in subroutines. These routines provide

1 A number of orders which would be expensive to provide in the machine both in terms of equipment and also time because of the extra loading on certain circuits. An example of this is the order: Shift accumulator contents ±n places where n is an integer.
2 The more complex mathematical operations, e.g., sin x, log x, etc.,
3 Control orders for peripheral equipments, card readers, parallel printers, etc.,
4 Input-output conversion routines,
5 Special programs concerned with storage allocation to different programs being run simultaneously, monitoring and costing purposes, and the detailed organization of drum and tape transfers.

All this information is permanently required and hence is kept in part of the private store termed the "fixed store" [Kilburn and Grimsdale, 1960a] which operates on a "read only" basis. This store consists of a woven wire mesh into which a pattern of small "linear" ferrite slugs are inserted to represent digital information. The information content can only be changed manually and will tend to differ only in detail between the different versions of the Atlas computer. In Muse this store is arranged in two units each of 4096 words, a unit consisting of 16 columns of 256 words, each word being 50 bits. The access time to a word in any one column is about 0.4 μsec. If a change of column address is required, this figure increases by about 1 μsec due to switching transients in the read amplifiers. Subsequent accesses in the new column revert to 0.4 μsec.

Fig. 1. Layout of basic machine. [Block diagram. Labels recoverable from the extracted text: fixed store (2 meshes, 4,096 words each); subsidiary store (1,024 words); B store (128 words); B-arithmetic unit; main accumulator; main core store (4 stacks, 4,096 words each); drum store (4 drums, 24,576 words each); 8 tape decks; peripheral equipments; central machine; information channels (two way) and address channels.]

The fixed store operates in conjunction with a subsidiary core store of 1024 words which provides working space for the fixed store programs, and has a cycle time of about 1.8 μsec. There are certain safeguards against a normal machine user gaining access to addresses in either part of the private store, though in effect he makes use of this store through the extracode facility.

The central store of the machine consists of a drum and core store combination, which has a maximum addressable capacity of about 10^6 words. In Muse the central store capacity is about 96,000 words contained on 4 drums. Any part of this store can be transferred in blocks of 512 words to/from the main core store, which consists of four separate stacks, each stack having a capacity of 4096 words.

The tape system provides a very large capacity backing store for the machine. The user can effect transfers of variable amounts of information between this store and the central store. In actual fact such transfers are organized by a fixed store program which initiates automatic transfers of blocks of 512 words between the tape store and the main core store. The system can handle eight tape decks running simultaneously, each producing or demanding a word on average every 88 μsec.

The main core store address can thus be provided from either the central machine, the drum, or the tape system. Since there is no synchronization between these addresses, there has to be a priority system to allocate addresses to the core store. The drum has top priority since it delivers a word every 4 μsec, the tape next priority since words can arise every 11 μsec from 8 decks, and the machine uses the core store for the rest of the available time. A priority system necessarily takes time to establish its priority, and so it has been arranged that it comes into effect only at each drum or tape request. Thus the machine is not slowed down in any way when no drum or tape transfers take place. The effect of drum and tape transfers on machine speed is given in Appendix 1.

To simplify the control commands given to the drum, tape, and peripheral equipment in the machine, the orders all take the form b→S or s→B and the identification of the required command register is provided by the address S. This type of storage is clearly widely scattered in the machine but is termed collectively the V-store.

In the central machine the main accumulator contains a fast adder [Kilburn et al., 1960b] and has built-in multiplication and division facilities. It can deal with fixed or floating point numbers and its operation is completely independent of the B-store and B-arithmetic unit. The B-store is a fast core store (cycle time 0.7 μsec) of 120 twenty-four bit words operating in a word selected partial flux switching mode [Edwards et al., 1960]. Eight "fast" B lines are also provided in the form of flip-flop registers. Of these, three are used as control lines, termed main, extracode, and interrupt controls respectively. The arrangement has the advantage that the control numbers can be manipulated by the normal B-type orders, and the existence of three controls permits the machine to switch rapidly from one to another without having to transfer control numbers to the core store. Main control is used when the central machine is obeying the current program, while the extracode control is concerned with the fixed store subroutines. The interrupt control provides the means for handling numerous peripheral equipments which "interrupt" the machine when they either require or are providing information.
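The priority rule given above for main core store addresses (drum first, then tape, then the central machine) can be sketched in a few lines. This is only an illustrative model: the function and source names are hypothetical, and the only figures taken from the text are the 88 μsec average word interval per tape deck and the eight decks.

```python
# Illustrative sketch only: names are hypothetical, not Atlas terminology.
# Sources of core store addresses in descending priority, as stated in the text.
PRIORITY = ("drum", "tape", "machine")


def grant(requests):
    """Return the highest-priority source among the pending requests, or None."""
    for source in PRIORITY:
        if source in requests:
            return source
    return None


def tape_word_interval_usec(per_deck_usec=88, decks=8):
    """Average spacing between tape words when all decks run simultaneously."""
    return per_deck_usec / decks


if __name__ == "__main__":
    print(grant({"machine", "tape"}))   # tape outranks the central machine
    print(grant({"machine"}))           # machine gets the store when nothing else asks
    print(tape_word_interval_usec())    # 88 / 8, the 11 usec figure in the text
```

The second helper simply shows where the 11 μsec tape figure comes from: eight decks, each averaging one word every 88 μsec.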
The remaining "fast" B lines are mainly used for organizational procedures, though B124 is the floating point accumulator exponent.

The operating speed of the machine is of the order of 0.5 × 10^6 instructions per second. This is achieved by the use of fast transistor logic circuitry, rapid access to storage locations, and an extensive overlapping technique. The latter procedure is made possible by the provision of a number of intermediate buffer storage registers, separate access mechanisms to the individual units of core store and parallel operation of the main accumulator and B-arithmetic units. The word length throughout the machine is 48 bits which may be considered as two half-words of 24 bits each. All store transfers between the central machine, the drum and tape stores are parity checked, there being a parity digit associated with each half-word. In the case of transfers within the central store (i.e., between main core store and drum) the parity digits associated with a given word are retained throughout the system. Tape transfers are parity checked when information is transferred to and from the main core store, and on the tape itself a check sum technique involving the use of two closely spaced heads is used.

The form of the instruction, which allows for two B-modifications, and the allocation of the address digits is shown in Fig. 2a. Half of the addressable store locations are allocated to the central store which is identified by a zero in the most significant digit of the address. (See Fig. 2b.) This address can be further subdivided into block address, and line address in a block of 512 words. The least significant digits, 0 and 1, make it possible to address 6 bit characters in a half word and digit 2 specifies the half word. The function number is split into several sections, each section relating to a particular set of operations, and these are listed in Fig. 2c. The machine orders fall into two broad classes, and these are

1 B codes: These involve operations between a B line specified by the Ba digits in the instruction and a core store line whose address can be modified by the contents of a B line determined by the Bm digits. There are a total of 128 B lines, one of which, B0, always contains zero. Of the other lines 90 are available to the machine user, 7 are special registers previously mentioned, and a further 30 are used by extracode orders.
2 A codes: These involve operations between the Accumulator and a core store line whose address can now be doubly [...]

[Fig. 2 residue: instruction format; the only surviving label is "Function 10 bits".]
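The address subdivision described above (store selection, block address, line address, half word, character) can be made concrete with a short sketch. It assumes a 24-bit address, which is an assumption here because Fig. 2 is not reproduced; the width is at least consistent with the roughly 10^6 addressable words quoted for the central store (an 11-bit block address and a 9-bit line address give 2^20 words). The function and field names are hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: addresses are 24 bits wide, with
# digit 23 as the most significant digit. Field names are hypothetical.

def decode_address(addr):
    """Split an address into the fields named in the text."""
    if not 0 <= addr < (1 << 24):
        raise ValueError("address must fit in 24 bits (assumed width)")
    return {
        # A zero in the most significant digit identifies the central store.
        "central_store": (addr >> 23) == 0,
        # Block address: which 512-word block is demanded.
        "block": (addr >> 12) & 0x7FF,
        # Line address: one of the 512 words inside the block.
        "line": (addr >> 3) & 0x1FF,
        # Digit 2 specifies the half word.
        "half_word": (addr >> 2) & 1,
        # Digits 0 and 1 select one of four 6-bit characters in the half word.
        "character": addr & 0b11,
    }


if __name__ == "__main__":
    example = (5 << 12) | (17 << 3) | (1 << 2) | 3
    print(decode_address(example))
```

Only the block field takes part in the page address register comparison described in Sec. 3; the line, half-word, and character fields select a position inside the page once it has been found.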
3. One-level store concept

The choice of system for the fast access store in a large scale computer is governed by a number of conflicting factors which include speed and size requirements, economic and technical difficulties. Previously the problem has been resolved in two extreme cases either by the provision of a very large core store, e.g., the 2.5 megabit [Papian, 1957] store at M.I.T., or by the use of a small core store (40,000 bits) expanded to 640,000 bits by a drum store as in the Ferranti Mercury [Lonsdale and Warburton, 1956; Kilburn et al., 1956] computer. Each of these methods has its disadvantages, in the first case, that of expense, and in the second case, that of inconvenience to the user, who is obliged to program transfers of information between the two types of store and this can be time consuming. In some instances it is possible for an expert machine user to arrange his program so that the amount of time lost by the transfers in the two-level storage arrangement is not significant, but this sort of "optimum" programming is not very desirable. Suitable interpretative coding [Brooker, 1960] can permit the two-level system to appear as one level. The effect is, however, accompanied by an effective loss of machine speed which, in some programs and depending on details of machine design, can be quite severe, varying typically, for example, between one and three.

The two-level storage scheme has obvious economic advantages, and inconvenience to the machine user can be eliminated by making the transfer arrangements completely automatic. In Atlas a completely automatic system has been provided with techniques for minimizing the transfer times. In this way the core and drum are merged into an apparent single level of storage with good performance and at moderate cost. Some details of this arrangement on the Muse are now provided.

The central store is subdivided into blocks of 512 words as shown by the address arrangements in Fig. 2b. The main core store is also partitioned into blocks of this size which for identification purposes are called pages. Associated with each of these core store page positions is a "page address register" (P.A.R.) which contains the address of the block of information at present occupying that page position. When access to any word in the central store is required the digits of the demanded block address are compared with the contents of all the page address registers. If an "equivalence" indication is obtained then access to that particular page position is permitted. Since a block can occupy any one of the 32 page positions in the core store it is necessary to modify some digits of the demanded block address to conform with the page positions in which an equivalence was obtained.

These processes are necessarily time consuming but by providing a by-pass of this procedure for instruction accesses (since, in general, instruction loops are all contained in the same block) then most of this time can be overlapped with a useful portion of the machine or core store rhythm. In this way information in the core store is available to the machine at the full speed of the core store and only rarely is the over-all machine speed affected by delays in the equivalence circuitry.

If a "not equivalence" indication is obtained when the demanded block address is compared with the contents of the P.A.R.'s then that address, which may have been B-modified, is first stored in a register which can be accessed as a line of the V-store. This permits the central machine easy access to this address. An "interrupt" also occurs which switches operation of the machine over to the interrupt control, which first determines the cause of the interrupt and then, in this instance, enters a fixed store routine to organize the necessary transfers of information between drum and core store.

A. Drum transfers

On each drum, one track is used to identify absolute block positions around the drum periphery. The records on these tracks are read into the registers which can be accessed as lines of the V-store and this permits the present angular drum position to be determined, though only in units of one block. In this way the time needed to transfer any block while reading from the drums can be assessed. This time varies between 2 and 14 msec since the drum revolution time is 12 msec and the actual transfer time 2 msec.

The time of a writing transfer to the drums has been reduced by writing the block of information to the first available empty block position on any drum. Thus the access time of the drum can be eliminated provided there are a reasonable number of empty blocks on the drum. This means, however, that transfers to/from the drum have to be carried out by reference to a directory and this is stored in the subsidiary store and up-dated whenever a transfer occurs.

When the drum transfer routine is entered the first action is to determine the absolute position on a drum of the required block. The order is then given to carry out the transfer to an empty page position in the core store. The transfer occurs automatically as soon as the drum reaches the correct angular position. The page address register in the vacant position in the core store is set to a specific block number for drum transfers. This technique simplifies the engineering with regard to the provision of this number from the drum and also provides a safeguard against transferring to the wrong block.

As soon as the order asking for a read transfer from the drum has been given the machine continues with the drum transfer program. It is now concerned with determining a block to be transferred back from the core store to the drum. This is necessary to ensure an empty core store page position when the next read transfer is required. The block in the core store to be transferred has to be carefully chosen to minimize the number of transfers in the program and this optimization process is carried out by a learning program, details of which are given in Sec. 5. The operation of this program is assisted by the provision of the "use" digits which are associated with each page position of the core store.

To interchange information between the core store and drums, two transfers, a read from and a write to the drum are necessary. These have to be done sequentially but could occur in either order. The technique of having a vacant page position in the core store permits a read transfer to occur first and thus allows the time for the learning program to be overlapped either into the waiting period for the read transfer or into the transfer time itself. In the time remaining after completion of the learning program an entry is made into the over-all supervisor program for the machine, and a decision is taken concerning what the machine is to do until the drum transfer is completed. This might involve a change to a different main program.

A program could ask for access to information in a page position while a drum or tape transfer is taking place to that page. This is prevented in Atlas by the use of a "lock out" (L.O.) digit which is provided with each Page Address Register. When a lock out digit is set at 1, access to that page is only permitted when the address has been provided either by the drum system, the tape system, or the interrupt control. The latter case permits all transfers from paper tape, punched card, and other peripheral equipments, to be handled without interference from the main program. When the transfer of a block has been completed the organizing program resets the L.O. digit to zero and access to that page position can then be made from the central machine. It is clear that the L.O. digit can also be used to prevent interference between programs when several different ones are being held in the machine at the same time.

In Sec. 3 it was stated that addresses demanding access to the core store could arise from three distinct sources, the central machine, the drum, and the tape. These accesses are complicated because of (1) the equivalence technique, and (2) the lock out digit. The various cases and the action that takes place are summarized in Table 1.

The provision of the Page Address Registers, the equivalence circuitry, and the learning program have permitted the core store and drum to be regarded by the ordinary machine user as a one-level store, and the system has the additional feature of "floating address" operation, i.e., any block of information can be stored in any absolute position in either core or drum store. The minimum access time to information in this store is obviously limited by the core store and its arrangement and this is now discussed.

B. Core store arrangement

The core store is split into four stacks, each with individual address decoding and read and write mechanisms. The stacks are then combined in such a way that common channels into the machine for the address, read and write digits are time shared between the various stacks. Sequential address positions occur in two stacks alternately and a page position which contains a block of 512 sequential addresses is thus arranged across two stacks. In this way it is possible to read a pair of instructions from consecutive addresses in parallel by increasing the size of the read channel. This permits two instructions to be completely obeyed in three store "accesses." The choice of this particular storage arrangement is discussed in Appendix 2.

The coordination of these four stacks is done by the "core stack coordinator" and some features of this are now discussed, starting with the operation of a single stack.

Table 1 Comparison of demanded block address with contents of the P.A.R.'s

Resultant state of equivalence and lock out circuits:

Source of address     Equivalence [E.Q.]                 Not equivalence [N.E.Q.]       Equivalence and lock out [E.Q. & L.O.]
                      (Lock out = 0)                                                    (Lock out = 1)
1. Central Machine    Access to required page position   Enter drum transfer routine    Not available to this program
2. Drum System        Fault condition indicated          Fault condition indicated      Access to required page position
3. Tape System        Fault condition indicated          Fault condition indicated      Access to required page position
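The page address register comparison and the cases of Table 1 can be sketched as a small lookup. This is an illustrative model under stated assumptions, not the Atlas circuitry: the class, function, and state names are hypothetical, the sequential loop only stands in for a comparison that the hardware performs against all registers at once, and the drum and tape rows follow the text's statement that a locked-out page is accessible only to the drum system, the tape system, or the interrupt control.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PAGES = 32  # page positions in the core store, as stated in the text


@dataclass
class PageAddressRegister:
    block: Optional[int] = None  # block held in this page; None means vacant
    lock_out: bool = False       # the L.O. digit


def compare(pars: List[PageAddressRegister], block: int) -> Tuple[str, Optional[int]]:
    """Compare a demanded block address with every P.A.R.

    Returns the resultant state ('EQ', 'EQ_LO' or 'NEQ') and the matching page.
    """
    for page, par in enumerate(pars):
        if par.block == block:
            return ("EQ_LO" if par.lock_out else "EQ", page)
    return ("NEQ", None)


# (source of address, resultant state) -> action, following Table 1.
TABLE_1 = {
    ("central", "EQ"): "access",
    ("central", "NEQ"): "enter drum transfer routine",
    ("central", "EQ_LO"): "not available to this program",
    ("drum", "EQ"): "fault",
    ("drum", "NEQ"): "fault",
    ("drum", "EQ_LO"): "access",
    ("tape", "EQ"): "fault",
    ("tape", "NEQ"): "fault",
    ("tape", "EQ_LO"): "access",
}


def request(pars, source, block):
    """Return the Table 1 action and the page position (if any) for a request."""
    state, page = compare(pars, block)
    return TABLE_1[(source, state)], page


if __name__ == "__main__":
    pars = [PageAddressRegister() for _ in range(PAGES)]
    pars[3] = PageAddressRegister(block=40)                 # ordinary resident block
    pars[7] = PageAddressRegister(block=99, lock_out=True)  # transfer in progress
    print(request(pars, "central", 40))
    print(request(pars, "central", 41))
    print(request(pars, "central", 99))
    print(request(pars, "drum", 99))
```

The "not equivalence" result for the central machine is the case that triggers the interrupt and the fixed store drum transfer routine described in Sec. 3A.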
C. Operation of a single stack of core store

The storage system employed is a coincident current M.I.T. system arranged to give parallel read out of 50 digits. The reading operation is destructive and each read phase of the stack cycle is followed by a write phase during which the information read out may be rewritten. This is achieved by a set of digit staticizors which are loaded during the read phase and are used to control the inhibit current drivers during the write phase. When new information is to be written into the store a similar sequence is followed, except that the digit staticizors are loaded with the new information during the read phase. A diagram indicating the different types of stack cycle is shown in Fig. 3.

There is a small delay WD (~100 mμsec) between the "stack request" signal, SR, and the start of the read phase to allow for setting of the address state and the address decoding. The output information from the store appears in the read strobe period, which is towards the end of the read phase. In general, the write phase starts as soon as the read phase ends. However, the start of the write phase may be held up until the new information is available from the central machine. This delay is shown as WW in Fig. 3c.

The interval TA between the stack request and the read strobe is termed the stack access time, and in practice this is approximately one third of the cycle time TC. Both TA and TC are functions of the storage system and assuming that WW is zero have typical values of 0.7 μsec and 1.9 μsec respectively. A holdup gate in the request channel prevents the next stack request occurring before the end of the preceding write phase.

Fig. 3. Basic types of stack cycle. (a) Read order (s → A). (b) Write order (a → S). (c) Read-write order (b + s → S). TA = access time; TC = cyclic time; WD = wait for address decoding and loading of address register; WW = wait for release of write hold-up. [Timing diagram showing stack request, read phase, read strobe, write phase, and write strobe for each case; not recoverable from the extracted text.]

D. Operation of the main core store with the central machine

A schematic diagram of the essentials of the main core store control system is shown in Fig. 4. The control signals SA1 and SA2 indicate whether the address presented is that of a single word or a pair of sequentially addressed instructions. Assuming that the flip-flop F is in the reset condition, either of these signals results in the loading of the buffer address register (B.A.R.). This loading is done by the signal B.A.B.A. which also indicates that the buffer register in the central machine has become free.

In dealing with the request the block address digits in the B.A.R. are first compared with the contents of all the page address registers. Then one of the indications summarized in Table 1 and indicated in Fig. 4 is obtained. Assuming access to the required store stack is permitted then a SET CSF signal is given which resets the flip-flop F. If this occurs before the next access request arises, then the speed of the system is not store-limited. In most cases SET CSF is generated when the equivalence operation on the demanded block address is complete, and the read phase of the appropriate stack (or stacks) has started. Until this time the information held in the B.A.R. must not be allowed to change.

In Fig. 5 a flow diagram is shown for the various cases which can arise in practice. When a single address request is accepted it is necessary to obtain an "equivalence" indication and form the page location digits before the stack request can be generated. The SET CSF signal then occurs as soon as the read phase starts. If a "not equivalent" or "equivalent and locked out" indication is obtained a stack request is not generated, and the contents of the B.A.R. are copied in to a line of the V-store before SET CSF is generated. When access to a pair of addresses is requested (i.e., an instruc-
282
Part 3
The
instruction-set processor level: variations in the processor
Buffer
address
register
I
Block oddress
address
|Line
Page address regO
[Page address reg
1
|Poge oddress reg
31
|
Not instruction oddress
Instruction
address 1
Equivalence Page
,Poge circuitry
~j
EQ
j
NEQ
digit
register
digits
r EQaiO Comparison circuit
sr.r
Right
Wrong
page
page
CSP Control
circuitry
Stack request
Stack
Stock address
Section 6
Processors with multiprogramming
ability
Chapter 23
It is necessary to ensure a certain minimum time between successive read strobes from the core store stacks to allow satisfactory operation of the parity circuits, which take about 0.4 μsec to check the information. This time could be reduced, but as it is only possible to get such a condition for a small part of the normal instruction timing cycle it was not thought to be an economical proposition. The basic machine timing is now discussed.

4. Instruction times

In high-speed computers, one of the main factors limiting speed of operation is the store cycle time. Here a number of techniques, e.g., splitting the core store into four separate stacks and extracting two instructions in a single cycle, have been adopted despite a fast basic cycle time of 2 μsec in order to alleviate this situation. The time taken to complete an instruction is dependent upon

1 The type of instruction (which is defined by the function digits)
2 The exact location of the instruction and operand in the core or fixed store since this can affect the access time
3 Whether or not the operand address is to be modified
4 In the case of floating point accumulator orders, the actual numbers themselves
5 Whether drum and/or tape transfers are taking place

The approximate times for various instructions are given in Table 2. These figures relate to the times between completing instructions when a long sequence of the same type of instruction is obeyed. While this method is not ideal, it is necessary because in practice obeying one instruction is overlapped in time with some part of three other instructions. This makes the detailed timing complicated, and so the timing sequence is developed slowly by first considering instructions obeyed one after another. It is convenient to make these instructions a sequence of floating point additions with both instruction and operand in the core store and with the operand address single B-modified.

[Table 2. Approximate instruction times, by type of instruction. Table body not recovered.]

To obey this instruction the central machine makes two requests to the core store, one for the instruction and the second for the operand. After the instruction is received in the machine the function part has to be decoded and the operand address modified by the contents of one of the B registers before the operand request can be made. Finally, after the operand has been obtained the actual accumulator addition takes place to complete the instruction. The time from beginning to end of one instruction is 6.05 μsec and an approximate timing schedule is as follows in Table 3.

If no other action is permitted in the time required to complete the instruction (steps 1 to 8 in Table 3), then the different sections of the machine are being used very inefficiently, e.g., the accumulator adder is only used for less than 1.1 μsec. However, the organization of the computer is such that the different sections such as store stacks, accumulator and B-arithmetic unit, can operate
Table 3 Timing sequence for floating point addition (instructions and operands in the core store)
[Table body not recovered.]

[Timing diagram. Recoverable labels: instruction request; equivalence; stack request; function decode; B modification; operand request; read; copy to acc.; accumulator busy; start second of pair; start next pair.]
Fig. 6. Timing diagram for a sequence of floating point addition orders. (Single-address modification.)
1 Element of first vector into accumulator. (Operand B-modified.)
2 Multiply accumulator by element of second vector. (Operand B-modified.)
3 Add partial product to accumulator.
4 Copy accumulator to store line containing partial product.
5 Alter count to select next elements and repeat.

The time for this loop with instructions and operands on the core store is 12.2 μsec. The value of the overlapping technique is shown by the fact that the time from starting the first instruction to finishing the second is approximately 10 μsec.

When the drum or tape systems are transferring information to or from the core store then the rate of obeying instructions which also use the core store will be affected. The effect is discussed in more detail in Appendix 1. The degree of slowing down is dependent upon the time at which a drum or tape request occurs relative to machine requests. It also depends on the stacks being used by the drum or tape and those used by the central machine. The approximate slowing down is by a factor of 25 per cent during a drum transfer and by 2 per cent for each active tape channel. (See Appendix 1.)

5. The drum transfer learning program

The organization of drum transfers has been described in Sec. 2A. After the transfer of the required block from the drum to the core store has been initiated, the organizing program examines the state of the core store, and if empty pages still exist, no further action is taken. However, if the core store is full it is necessary to arrange for an empty page to be made available for use at the next nonequivalence. The selection of the page to be transferred could be made at random; this could easily result in many additional transfers occurring, as the page selected could be one of those in current use or one required in the near future. The ideal selection, which would minimize the total number of transfers, could only be made by the programmer. To make this ideal selection the programmer would have to know (1) precisely how his program operated, which is not always the case, and (2) the precise amount of core store available to his program at any instant. This latter information is not generally available as the core store could be shared by other central machine programs, and almost certainly by some fixed store program organizing the input and output of information from slow peripheral equipments. The amount of core store required by this fixed store program is continuously varying [Kilburn et al., 1961].

The only way the ideal pattern of transfers can be approached is for the transfer program to monitor the behavior of the main program and in so doing attempt to select the correct pages to be transferred to the drum. The techniques used for monitoring are subject to the condition that they must not slow down the operation of the program to such an extent that they offset any reduction in the number of transfers required. The method described occupies less than 1 per cent of the operating time, and the reduction in the number of transfers is more than sufficient to cover this.
That part of the transfer program which organizes the selection of the page to be transferred has been called the "learning" program. In order for this program to have some data on which to operate, the machine has been designed to supply information about the use made of the different pages of the core store by the program being monitored. With each page of the core store there is associated a "use" digit which is set to "1" whenever any line in that page is accessed. The 32 "use" digits exist in two lines of the V-store and can be read by the learning program, the reading automatically resetting them to zero. The frequency with which these digits are read is governed by a clock which measures not real time but the number of instructions obeyed in the operation of the main program. This clock causes the learning program to copy the "use" digits to a list in the subsidiary store every 1024 instructions. The use of an instruction counter rather than a normal clock to measure "time" for the learning program is due to the fact that the operations of the main program may be interrupted at random for random lengths of time by the operation of peripheral equipments. With an instruction counter the temporal pattern of the blocks used will be the same on successive runs through the same part of the program. This is essential if the learning program is to make use of this pattern to minimize the number of transfers.

When a nonequivalence occurs and after the transfer of the required block has been arranged, the learning program again adds the current values of the "use" digits to the list and then uses this list to bring up to date two sets of times also kept in the subsidiary store. These sets consist of 32 values of t and T, one of each for each page of the core store. The value of t is the length of time since the block in that page has been used. The value of T is the length of the last period of inactivity of this block. The accuracy of the values of t and T is governed by the frequency with which the "use" digits are inspected.

The page to be written to the drum is selected by the application in turn of three simple tests to the values of t and T.

1 Any page for which t > T + 1, or
2 That page with t ≠ 0 and (T − t) max, or
3 That page with T max (all t = 0).

The first rule selects any page which has been currently out of use for longer than its last period of inactivity. Such a page has probably ceased to be used by the program and is therefore an ideal one to be transferred to the drum. The second rule ignores all pages with t = 0 as they are in current use, and then selects the one which, if the pattern of use is maintained, will not be required by the program for the longest time. If the first two rules fail to select a page the third ensures that if the page finally selected is wrong, in that it is immediately required again, then, as in this case, T will become zero and the same mistake will not be repeated.

For all the blocks on the drum a list of values of τ is kept. These values are set when the block is transferred to the drum:

τ = time of transfer − value of t for transferred page

When a block is transferred to the core store the value of τ is used to set the value of T:

T = time of transfer − value of τ for this block
  = length of last period of inactivity

For the block transferred from the drum t is set to 0.

In order to make its decision the learning program has only to update two short lists and apply at the most three simple rules; this can easily be done during the 2 msec transfer time of the block required as a result of the nonequivalence. As the learning program uses only fixed and subsidiary store addresses it is not slowed down during the period of the drum transfer.

The over-all efficiency of the learning program cannot be known until the complete Atlas system is working. However, the value of the method used has been investigated by simulating the behavior of the one-level store and learning program on the Mercury computer at Manchester University. This has been done for several problems using varying amounts of store in excess of the core store available. One of these was the problem of forming the product A of two 80th order matrices B and C. The three matrices were stored row by row each one extending over 14 blocks, only 14 pages of core store were assumed to be available. The method of multiplication was

b₁₁ × 1st row of C = partial answer to 1st row of A
b₁₂ × 2nd row of C + partial answer = second partial answer, etc.

Thus matrix B was scanned once, matrix C 80 times and each row of matrix A 80 times.

Several machine users were asked to spend a short time writing a program to organize the transfers for a general matrix multiplication problem. In no case when the method was applied to the above problem were fewer than 357 transfers required. A program written specifically for this problem which paid great attention to the distribution of the rows of the matrices relative to block divisions required 274 transfers. The learning program required 234 transfers; the gain over the human programmer was chiefly due to the fact that the learning program could take full advantage of the occasions when the rows of A existed entirely within one block.
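The three selection tests and the τ bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Atlas fixed-store program: the names `PageTimes`, `select_page`, `tau_on_transfer_to_drum` and `times_on_transfer_to_core` are invented here, times are assumed to be integer counts of sampling periods, and rule 1 is assumed to take the first qualifying page it meets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageTimes:
    t: int  # time since the block in this page was last used
    T: int  # length of that block's last period of inactivity


def select_page(pages: List[PageTimes]) -> int:
    """Return the index of the page to be written to the drum."""
    # Rule 1: any page out of use for longer than its last inactive period.
    for i, p in enumerate(pages):
        if p.t > p.T + 1:
            return i
    # Rule 2: ignore pages in current use (t == 0); maximise T - t.
    idle = [i for i, p in enumerate(pages) if p.t != 0]
    if idle:
        return max(idle, key=lambda i: pages[i].T - pages[i].t)
    # Rule 3: every page has t == 0; take the one with the largest T.
    return max(range(len(pages)), key=lambda i: pages[i].T)


def tau_on_transfer_to_drum(now: int, t: int) -> int:
    """tau = time of transfer - value of t for the transferred page."""
    return now - t


def times_on_transfer_to_core(now: int, tau: int) -> PageTimes:
    """T = time of transfer - tau for this block; t is set to 0."""
    return PageTimes(t=0, T=now - tau)
```

For instance, a block last used 4 periods before being written out at time 100 gets τ = 96; if it is brought back at time 150 its page starts with T = 54 and t = 0.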
Many other problems involving cyclic running of single or multiple sets of data were simulated, and in no case did the learning program require more transfers than an experienced human programmer.

A. Prediction of drum transfers

Although the learning program tends to reduce the number of transfers required to a minimum, the transfers which do occur still interrupt the operation of the program for from 2 to 14 msec as they are initiated by nonequivalence interrupts. Some or all of this time loss could be avoided by organizing the transfers in advance.

A very experienced programmer having sole use of the core store could arrange his own transfers in such a way that no unnecessary ones ever occurred and no time was ever wasted waiting for transfers to be completed. This would require a great deal of effort and would only be worthwhile for a program that was going to occupy the machine for a long time. By using the data accumulated by the learning program it is possible to recognize simple patterns in the use made by a program of the various blocks of the one-level store. In this way a prediction program could forecast the blocks required in the near future and organize the transfers. By recording the success or failure of these forecasts the program could be made self-improving. For the matrix multiplication problem discussed above the pattern of use of the blocks containing matrix C is repeated 80 times, and a considerable degree of success could be obtained with a simple prediction program.

6. Conclusions

A specific system for making a core-drum store combination appear as a single level store has been described. While this is the actual system being built for the Atlas machine the principles involved are applicable to combinations of other types of store. For example, a tunnel diode-fast core store combination for an even faster machine. An alternative which was considered for Atlas, but which was not as attractive economically, was a fast core-slow core store combination. The system too can be extended to three levels of storage, and indeed if 10^6 words of total storage had to be provided then it would be most economical to provide it on a third level of store such as a file drum.

The automatic system does require additional equipment and introduces some complexity, since it is necessary to overlap the time taken for address comparison into the store and machine operating time if it is not to introduce any extra time delays. Simulated tests have shown that the organization of drum transfers is reasonably efficient, and the other advantages which accrue, such as efficient allocation of core storage between different programs and store lock out facilities, are also invaluable. No matter how intelligent a programmer may be he can never know how many programs or peripheral equipments are in operation when his program is running. The advantage of the automatic system is that it takes into account the state of the machine as it exists at any particular time. Furthermore if, as in normal use, there is some sort of regular machine rhythm even through several programs, there is the possibility of making some sort of prediction with regard to the transfers necessary. This involves no more hardware and will be done by program. However, this stage will probably be left until results on the actual system are obtained.

It can be seen that the system is both useful and flexible in that it can be modified or extended in the manner previously indicated. Thus despite the increase in equipment, the advantages which are derived completely justify the building of this automatic system.

APPENDIX 1 ORGANIZATION OF THE ACCESS REQUESTS TO THE CORE STORE

There are three sources of access requests to the core store, namely the central machine, the drum, and the tape systems. In deciding how the sequence of requests from all three sources is to be serialized and placed in some sort of order, a number of facts have to be considered. These are

1 All three sources are asynchronous in nature.
2 The drum and tape systems can make requests at a fairly high rate compared with the store cycle time of approximately 2 μsec. For example, the drum provides a request every 4 μsec and the tape system every 11 μsec when all 8 channels are operative.
3 The drum and tape systems can only be stopped in multiples of a block length, i.e., 512 words. This means that any system devised for accessing the core store must deal with both the average rates of drum and tape requests specified in 2. Only the central machine can tolerate requests being stopped at any time and for any length of time.

From these facts a request priority can be stated which is

a Drum request.
b Tape request.
c Central machine request.
4 A machine request can be accepted by the core store, but because there is no place available to accept the core store information, its cycle is inhibited and further requests held up. In the case of successive division orders this time can be as long as 20 μsec, in which case 5 drum requests could be made. To avoid having an excessive amount of buffer storage for the drum two techniques are possible:
a When drums or tapes are operative do not permit machine requests to be accepted until there is a place available to put the information.
b Store the machine request and then permit a drum or tape request.
The latter scheme has been adopted because it can be accommodated more conveniently and it saves a small amount of time.
5 If the central machine is using the private store then it is desirable for drum and tape transfers to the core store not to interfere with or slow down the central machine in any way.
6 When the central machine, drum and tape are sharing the core store then the loss of central machine speed should be roughly proportional to the activity of the drum or tape systems. This means that drum or tape requests must "break" into the normal machine request channel as and when required.

The system which accommodates all these points is now discussed. Whenever a drum or tape request occurs inhibit signals are applied to the request channel into the core stack coordinator and also to the stack request channels from this coordinator. This results in a "freezing" of the state of flip-flop F (Fig. 5) and this state is then inspected (Fig. 7, point X). If the state is "busy" this means that a machine order has been stopped somewhere between the loading of the buffer address register (B.A.R.) and the stack request. Normally this time interval can vary from about 0.5 μsec if there are no stack request holdups, to 20 μsec in the case of certain accumulator holdups. In either case sufficient time is allowed after the inspection to ensure that the equivalence operation has been completed. If an equivalence indication is obtained all the information relevant to this machine order (i.e., the line address, page digits, stack(s) required and type of stack order) are stored for future reference. Use is made here of the page digit register provided to allow the by-pass on the equivalence circuitry for instruction accesses. The core store is then made free for access by the drum or the tape. If the core store had been found free on inspection, the above procedure is omitted.

[Flow diagram. Recoverable steps: drum/tape request; apply inhibits to stack request channels and to machine request channels (if these are not already applied); inspect state of flip-flop F (frozen), busy or free; wait for equivalence completed; store machine order; remove stack request; drum/tape priority; permit stack request; drum/tape access; stack request for drum/tape; inhibits reapply; is there a stored machine order?; allow to proceed (if possible); has the stack request of a stored machine order been stopped?; remove inhibits on machine request channels.]
Fig. 7. Drum and tape break in systems.

A drum or tape access (as decided by the priority circuit) to the core store then occurs, which removes the inhibits on the stack request channels. When the stack request for the drum or tape cycle is initiated these inhibits are allowed to reapply. At this stage (Fig. 7, point Y), if there is a stored machine order it is allowed to proceed if possible. The inhibits on the machine request channels are removed when the stack request for the stored machine order occurs. If there is no stored machine order this is done immediately, and the central machine is again allowed access to the core store. However, another drum or tape request can arise before the stack request of the stored machine order occurs, in particular because this latter order may still be held up by the central machine. If this is the case the drum or tape is allowed immediate access and a further attempt is made to complete the stored machine order when this drum or tape stack request occurs.

If the stored machine order was for an operand, the content of the page digit register will correspond to the location of this operand. The next machine request for an instruction pair will then almost certainly result in a "wrong page" indication. This is prevented by arranging that the next instruction pair access does not by-pass the equivalence circuitry.

The effect on the machine speed when the drum or tapes are transferring information to or from the core store is dependent upon two factors. First, upon the proportion of time during which the buffer register in the core coordinator is busy dealing with machine requests, and secondly, upon the particular stacks being used by the central machine and the drum or tape. If the computer is obeying a program with instructions and operands on the fixed or subsidiary store then the rate of obeying instructions is unaffected by drum or tape transfers. A drum or tape interrupt occurring when the B.A.R. is free prevents any machine address being accepted onto this buffer for 1.0 μsec. However, if the B.A.R. is busy then the next machine request to the core store is delayed until 1.8 μsec after the interrupt if different stacks are being used, or until 3.4 μsec after the interrupt if the stacks are the same.

When the machine is obeying a program with instructions and operands on the core store the slowing down during drum transfers can be by a factor of two if instructions, operands, and drum requests use the same stacks. It is also possible for the machine to be unaffected. The effect on a particular sequence of orders can be seen by considering the one discussed in Sec. 4 and illustrated in Fig. 6. In this sequence the instructions are on stacks 0 and 1 while the operands are on stacks 2 and 3. If the drum or tape is transferring alternately to stacks 0 and 1 then the effect of any interrupt within the 3.2 μsec of an instruction pair is to increase this time by between 0.5 and 3.4 μsec depending upon where the interrupt occurred. The average increase is 1.8 μsec and for a tape transfer with interrupts every 88 μsec the computer can obey instructions at 98 per cent of the normal rate. During drum transfers the interrupts occur every 4 μsec which would suggest a slowing down to 60 per cent of normal. However, for any regular sequence of orders the requests to the core store by the machine and by the drum rapidly become synchronized with the result in this particular case that the machine can still operate at 80 per cent of its normal speed.

APPENDIX 2 METHODS OF DIVISION OF THE MAIN CORE STORE

The maximum frequency with which requests can be dealt with by a single stack core store is governed by the cycle time of the store. If the store is divided into several stacks which can be cycled independently then the limit imposed on the speed of the machine by the core store is reduced. The degree of division which is chosen is dependent upon the ratio of core store cycle time to other machine operations and also upon the cost of the multiple selection mechanisms required.

Considering a sequence of orders in which both the instruction and operand are in the core store, then for a single stack store the limit imposed on the operating speed by the store is two cycle times per order, i.e., 4 μsec in Atlas. This is significantly larger than the limits imposed by other sections of the computer (Sec. 4). If the store is divided into two stacks and instructions and operands are separated, then the limit is reduced to 2 μsec which is still rather high. The provision of two stacks permits the addressing of the store to be arranged so that successive addresses are in alternate stacks. It is therefore possible by making requests to both stacks at the same time to read two instructions together, so reducing the number of access times to three per instruction pair. Unfortunately such an arrangement of the store means that operands are always on the same stacks as instruction pairs, and the limit imposed by the cycle time is still 2 μsec per order even if the two operand requests in the instruction pair are to different stacks and occur at the same time.

Division into any number of stacks with the addressing system working through each stack in turn cannot reduce the limit below 2 μsec since successive instructions normally occur in successive addresses and are therefore in the same stack. However, four stacks arranged in two pairs reduces the limit to 1 μsec as the operands can always be arranged to be on different stacks from the instruction pairs. In order to reduce the limit to 0.5 μsec it is necessary to have eight stacks arranged in two sets of four and to read four instructions at once, which would increase the complexity of the central machine.

The limit of 1 μsec is quite sufficient and further division with the stacks arranged in pairs only enables the limit to be more easily obtained by suitable location of the instructions and operands.
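The store-imposed limits quoted in Appendix 2 follow a simple pattern, which the short Python sketch below reproduces. The formula is a simplification introduced here, not one given in the text: it assumes a group of instructions is read in one cycle, that the matching operand requests need one further cycle, and that the two overlap completely only when operands lie on different stacks from the instructions. The function name `store_limit_us` and its parameters are hypothetical.

```python
def store_limit_us(cycle_us: float,
                   instructions_per_fetch: int,
                   operands_separate: bool) -> float:
    """Store-imposed time per order, in microseconds.

    cycle_us: core store cycle time (about 2 usec in Atlas).
    instructions_per_fetch: instructions read together in one cycle.
    operands_separate: True if operands are on different stacks from
        the instructions, so both fetches overlap in the same cycle.
    """
    if instructions_per_fetch < 1:
        raise ValueError("at least one instruction per fetch")
    # One cycle per group when the fetches overlap, otherwise two.
    cycles_per_group = 1 if operands_separate else 2
    return cycles_per_group * cycle_us / instructions_per_fetch


if __name__ == "__main__":
    cases = [
        ("single stack", 1, False),
        ("two stacks, instructions and operands separated", 1, True),
        ("two stacks, alternate addresses", 2, False),
        ("four stacks in two pairs", 2, True),
        ("eight stacks in two sets of four", 4, True),
    ]
    for name, k, separate in cases:
        print(f"{name}: {store_limit_us(2.0, k, separate)} usec per order")
```

Under these assumptions the five arrangements discussed give 4, 2, 2, 1 and 0.5 μsec per order, matching the figures in the text.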
The location of instructions and operands within the core store is under the control of the drum transfer program; thus when there
Chapter 24

A user machine in a time-sharing system¹

B. W. Lampson / W. W. Lichtenberger / M. W. Pirtle

Summary This paper describes the design of the computer seen by a machine-language programmer in a time-sharing system developed at the University of California at Berkeley. Some of the instructions in this machine are executed by the hardware, and some are implemented by software. The user, however, thinks of them all as part of his machine, a machine having extensive and unusual capabilities, many of which might be part of the hardware of a (considerably more expensive) computer.

Among the important features of the machine are the arithmetic and string manipulation instructions, the very general memory allocation and configuration mechanism, and the multiple processes which can be created by the program. Facilities are provided for communication among these processes and for the control of exceptional conditions.

The input-output system is capable of handling all of the peripheral equipment in a uniform and convenient manner through files having symbolic names. Programs can access files belonging to a number of people, but each person can protect his own files from unauthorized access by others.

Some mention is made at various points of the techniques of implementation, but the main emphasis is on the appearance of the user's machine.

¹Proc. IEEE, vol. 54, no. 12, pp. 1766-1774, December, 1966.

Introduction

A characteristic of a time-sharing system is that the computer seen by the user programming in machine language differs from that on which the system is implemented [Bright, 1964; Comfort, 1965; Forgie, 1965; McCullogh et al., 1965; Schwartz, 1964]. In fact, the user machine is defined by the combination of the time-sharing hardware running in user mode and the software which controls input-output, deals with illegal actions which may be taken by a user's program, and provides various other services. If the hardware is arranged in such a way that calls on the system have the same form as the hardware instructions of the machine [Lichtenberger and Pirtle, 1965], then the distinction becomes irrelevant to the user; he simply programs a machine with an unusual and powerful instruction set which relieves him of many of the problems of conventional machine-language programming [Lampson, 1965; McCarthy et al., 1963].

In a time-sharing system which has been developed by and for the use of members of Project Genie at the University of California at Berkeley [Lichtenberger and Pirtle, 1965], the user machine has a number of interesting characteristics. The computer in this system is an SDS 930, a 24 bit, fixed-point machine with one index register, multi-level indirect addressing, a 14 bit address field, and 32 thousand words of 1.75 μs memory in two independent modules. Figure 1 shows the basic configuration of equipment. The memory is interleaved between the two modules so that processing and drum transfers may occur simultaneously. A detailed description of the various hardware modifications of the computer and their implications for the performance of the overall system has been given in a previous paper [Lichtenberger and Pirtle, 1965].

Briefly, these modifications include the addition of monitor and user modes in which, for user mode, the execution of a class of instructions is prevented and replaced by a trap to a system routine. The protection from unauthorized access to memory has been subsumed in an address mapping scheme: both the 16 384 words addressable by a user program (logical addresses) and the 32 768 words of actual core memory (physical addresses) have been divided into 2048-word pages. A set of eight six-bit hardware registers defines a map from the logical address space to the real memory by specifying the real page which is to correspond to each of the user's logical pages. Implicit in this scheme is the capability of marking each of the user's pages as unassigned or read-only, so that any attempt to access such a page improperly will result in a trap.

All memory references in user mode are mapped. In monitor mode, all memory references are normally absolute. It is possible, however, with any instruction in monitor mode, or even within a chain of indirect addressing, to specify use of the user map. Furthermore, in monitor mode the top 4096 words are mapped through two additional registers called the monitor map. The mapping process is illustrated in Fig. 2.
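The user map just described can be illustrated with a small Python sketch. It is a model under stated assumptions, not a description of the hardware: the text does not give the bit layout of the six-bit map registers, so each register is represented here by a hypothetical `MapEntry` record holding a real page number (or `None` for an unassigned page) and a read-only flag. The names `MapTrap` and `translate` are likewise invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

PAGE_WORDS = 2048    # page size given in the text
LOGICAL_PAGES = 8    # 16 384 logical words / 2048
REAL_PAGES = 16      # 32 768 physical words / 2048


class MapTrap(Exception):
    """Stands in for the hardware trap on an improper access."""


@dataclass(frozen=True)
class MapEntry:
    real_page: Optional[int]   # None models an unassigned page
    read_only: bool = False    # assumed flag; the encoding is not in the text


def translate(user_map: List[MapEntry], logical_addr: int,
              write: bool = False) -> int:
    """Map a 14-bit logical address to a physical address, or trap."""
    if not 0 <= logical_addr < PAGE_WORDS * LOGICAL_PAGES:
        raise ValueError("logical address outside the 14-bit range")
    page, offset = divmod(logical_addr, PAGE_WORDS)
    entry = user_map[page]
    if entry.real_page is None:
        raise MapTrap("reference to an unassigned page")
    if write and entry.read_only:
        raise MapTrap("write to a read-only page")
    if not 0 <= entry.real_page < REAL_PAGES:
        raise ValueError("real page outside physical memory")
    return entry.real_page * PAGE_WORDS + offset
```

With logical page 0 mapped to real page 5, logical address 10 translates to 5 × 2048 + 10 = 10250; a reference to an unassigned page, or a store into a read-only one, raises `MapTrap` in place of the hardware trap.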
[Fig. 1. Configuration of equipment. Recoverable labels: SDS 930 CPU; two memory modules, 16K, 1.75 μsec; drum, 1.3 × 10^6 words, 5 × 10^5 words/sec; drum I/O; general I/O processor; magnetic tapes; P.T. reader; TTY interface; Teletypes; PDP-5; graphic display processor; keyboard displays; Rand tablet; graphic display and light pen; mass store, 5 × 10^8 words; remote computers.]

Another significant hardware modification is the mechanism for going between modes. Once the machine is in user mode, it can get to monitor mode under three circumstances:

1 If a hardware interrupt occurs
2 If a trap is generated by the user program as outlined
3 If an instruction with a particular configuration of two bits is executed. Such an instruction is called a system programmed operator (SYSPOP).

In case 3, the six-bit operation field is used to select one of 64 locations in absolute core. The current address of the instruction is put into absolute location zero as a subroutine link, the indirect address bit of this link word is set, and another bit is set, marking the memory location in the link word as having come from user-mapped memory. The system routine thus invoked may take a parameter from the word addressed by the SYSPOP, since its address field is not interpreted by the hardware. The routine will address the parameter indirectly through location zero and, because of the bit marking the contents of location zero as having come from user mode, the user map will be applied to the remainder of the address indirection. All calls on the system which are not inadvertent are made in this way.

A monitor mode program gets into user mode by transferring to an address with mapping specified. This means, among other things, that a SYSPOP can return to the user program simply by branching indirect through location zero.

As the above discussion has perhaps indicated, the mode-changing arrangements are very clean and permit rapid and natural transfers of control between user and system programs. Advantage has been taken of this fact to create a rather grandiose machine for the user. Its features are the subject of this paper.
user in the Berkeley time-sharing system, working at what he
thinks of as the hardware language level, has at his disposal a machine with a configuration and capability which can be con-
veniently controlled by the execution of machine instruction sequences. Its simplest configuration is very similar to that of a
standard medium-sized computer. In this configuration, the machine possesses the standard 930 complement of arithmetic and logic instructions and, in addition, a set of software interpreted monitor and executive instructions. The latter instructions, which will be discussed more fully in the following, do rather complex input-output of many different kinds, perform many frequently used table lookup and string processing functions, implement floating point operations, and provide for the creation of more complex machine configurations. Some examples of the instructions available are:

1 Load A, B, or X (index) registers from memory or store any of the registers. Indexing and indirect addressing are available on these and almost all other instructions. Double word load and store are also available.
2 The normal complement of fixed-point arithmetic and logic operations.
3 Skips on various arithmetic and logic conditions.
4 Floating point arithmetic and input-output. The latter is in free format or in the equivalent of Fortran E or F format.
5 Input a character from a teletype or write a block of arbitrary length on a drum file.
6 Look up a string in a hash-coded table and obtain its position in the table.
7 Create a new process and start it running concurrently with the present one at a specified point.
8 Redefine the memory of the machine to include a portion of that which is also being used by another program.

It should be emphasized that, although many of these instructions are software interpreted, their format is identical to the standard machine instruction format, with the exception of the one bit which specifies a system interpreted instruction. Since the system interpretation of these instructions is completely invisible to the machine user, and since these instructions do have the standard machine instruction format, the user and his program make no distinction between hardware and software interpreted instructions.

Some of the possible 192 operation codes are not legal in the user machine. Included in this category are those hardware instructions which would halt the machine or interfere with the input-output if allowed to execute, and those software interpreted instructions which attempt to do things which are forbidden to the program. Attempted execution of one of these instructions will result in an illegal instruction violation. The effect of an illegal instruction violation is described later.

Memory configuration

The memory size and organization of the machine is specified by an appropriate sequence of instructions. For example, the user may specify a machine which has 6K of memory with addresses from 0 to 13777₈; alternatively, he may specify that the 6K should include addresses 0 to 3777₈, 14000₈ to 17777₈, and 34000₈ to 37777₈. The user may also specify the size and configuration of the machine's secondary storage and, to a considerable extent, the structure of its input-output system. A full discussion of this capability will be deferred to a later section.

The next few paragraphs discuss the mechanism by which the user's program may specify its memory size and organization. This mechanism, known as the process map to distinguish it from the hardware memory address mapping, uses a (software) mapping register consisting of eight 6-bit bytes, one byte for each of the eight 2K blocks addressable by the 14 bit address field of an instruction. Each of these bytes either is 0 or addresses one of the 63 words in a table called the private memory table (PMT). Each user has his own private memory table. An entry in this table provides information about a particular 2K block of memory. The block may be either local to the user or it may be shared. If the block is local, the entry gives information about whether it is currently in core or on the drum. This information is important to the system but need not concern the user. If the block is shared, its PMT entry points to an entry in another table called the shared memory table (SMT). Entries in this table describe blocks of memory which are shared by several users. Such blocks may contain invariant programs and constants, in which case they will be marked as read-only, or they may contain arbitrary data which is being processed by programs belonging to two different users.

A possible arrangement of logical or virtual memory for a process is shown in Fig. 3. The nature of each page has been noted in the picture of the virtual memory; this information can also be obtained by taking the corresponding byte of the map and looking at the PMT entry specified by that byte. The figure shows a large amount of shared memory, which suggests that the process might be a compiler, sharing the code for compilation with other processes translating programs written in the same source language. Virtual pages one and two might hold tables and temporary storage which are unique to each separate compilation. Note that, although the flexibility of the map allows any block of code or data to appear anywhere in the virtual memory, it is certainly not true that a program can run regardless of which pages
... common to ... the routine and data base do not fit into 16K, or where several routines are concurrently employed, it may be necessary to make frequent adjustment to the map during execution.
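The process map lookup described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `translate`, `describe`, `PMTEntry`, and `MemoryViolation` are hypothetical, a zero map byte is assumed to mean that no memory is assigned to that 2K block, and the example map reproduces the 6K machine with addresses 0 to 3777₈, 14000₈ to 17777₈, and 34000₈ to 37777₈.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

BLOCK_WORDS = 2048   # one 2K block
MAP_BYTES = 8        # eight 6-bit bytes cover the 14-bit address field
UNASSIGNED = 0       # assumption: a zero byte means "no memory here"


@dataclass
class PMTEntry:
    """One private memory table entry (field names are hypothetical)."""
    shared: bool = False
    smt_index: Optional[int] = None   # used only when the block is shared
    in_core: bool = True              # used only when the block is local


class MemoryViolation(Exception):
    """Raised when an address falls in a block with no memory assigned."""


def translate(process_map: List[int], address: int) -> Tuple[int, int]:
    """Map a 14-bit address to (PMT index, offset within the 2K block)."""
    if not 0 <= address < MAP_BYTES * BLOCK_WORDS:
        raise ValueError("address does not fit in 14 bits")
    block, offset = divmod(address, BLOCK_WORDS)
    pmt_index = process_map[block]
    if pmt_index == UNASSIGNED:
        raise MemoryViolation(f"no memory assigned at {address:o} (octal)")
    return pmt_index, offset


def describe(pmt: Dict[int, PMTEntry], pmt_index: int) -> str:
    """Report the nature of a block, as the text does for Fig. 3."""
    entry = pmt[pmt_index]
    if entry.shared:
        return f"shared, SMT entry {entry.smt_index}"
    return "local, in core" if entry.in_core else "local, on drum"


# A 6K machine occupying virtual blocks 0, 3, and 7.
process_map = [1, 0, 0, 2, 0, 0, 0, 3]
pmt = {
    1: PMTEntry(),
    2: PMTEntry(in_core=False),
    3: PMTEntry(shared=True, smt_index=5),
}
```

With this map, `translate(process_map, 0o14000)` selects PMT entry 2 at offset 0, while an address such as 4000₈ falls in an unassigned block and raises the hypothetical `MemoryViolation`.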
Multiple processes

An important feature of the user machine allows the user program, which in the current context will be referred to as the controlling process, to establish one or more subsidiary processes. With a few minor exceptions, to be discussed, each subsidiary process has the same status as the controlling process. Thus, it may in turn establish a subsidiary process. It is therefore apparent that the user machine is in fact a multi-processing machine. The original suggestion which gave rise to this capability was made by Conway [Conway, 1963]; more recently the Multics system has included a multi-process capability [Corbato and Vyssotsky, 1965; Dennis and Van Horn, 1966; Saltzer, 1966].

A process is the logical environment for the execution of a program, as contrasted to the physical environment, which is a hardware processor. It is defined by the information which is required for the program to run; this information is called the state vector. To create a new process, a given process executes an instruction which has arguments specifying the state vector of the new process. This state vector includes the program counter, the central registers, and the process map. The new process may have a memory configuration which is the same as, or completely different from, that of the originating process. The only constraint placed on this specification is that the total memory available to the multi-process system is limited to 128K by the process mapping mechanism, which is common to all processes. Each user, of course, has his own 128K.

This facility was put into the system so that the system could control the user processes. It is also of direct value, however, for many user processes. The most obvious examples are input-output buffering routines, which can operate independently of the user's main program, communicating with it through memory and with interrupts (see the following). Whether the operation being buffered is large volume output to a disc or teletype requests for information about the progress of a running program, the degree of flexibility afforded by multiple processes far exceeds anything which could have been built into the input-output system. Furthermore, the overhead is very low: an additional process requires about 15 words of core, and process switching takes about 1 ms under favorable conditions. There are numerous other examples of the value of multiple processes; most, unfortunately, are too complex to be briefly explained.

A process may create a number of subsidiary processes, each of which is independent of the others and equivalent to them from the point of view of the originating process. Figure 4 shows two simple multi-process structures, one for each of two users. Note that each process has associated with it pointers to its controlling process and to one of its subsidiary processes. When a process has two immediate descendants, as in the case of processes 1.2 and 1.3, they are chained together on a ring. Thus, three pointers, up, down, and ring, suffice to define the process structure completely. The up pointers are, of course, redundant, but are convenient for the implementation. The process is identified by a process number which is returned by the system when it is created.

A complex structure such as that in Fig. 5 may result from the creation of a number of subsidiary processes. The processes in Fig. 5 have been numbered arbitrarily to allow a clear description of the way in which the pointers are arranged. Note that the user need not be aware of these pointers; they are shown here to clarify the manner in which the multiple process mechanism is implemented.

A process may destroy one of its subsidiary processes by executing the appropriate instruction. For obvious reasons this operation is not legal if the process being destroyed itself has subsidiary processes.
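The up, down, and ring pointers lend themselves to a short illustration. The following Python sketch is not the system's implementation: the class `Process` and the functions `fork`, `subsidiaries`, and `destroy` are hypothetical names, and process numbers are supplied by the caller here, whereas in the system they are returned when the process is created.

```python
class Process:
    """A node in the process structure (hypothetical names)."""

    def __init__(self, number, up=None):
        self.number = number
        self.up = up        # controlling process
        self.down = None    # one of the subsidiary processes, if any
        self.ring = self    # next sibling; a lone process is a ring of one


def fork(parent, number):
    """Create a subsidiary of `parent` and chain it onto the sibling ring."""
    child = Process(number, up=parent)
    if parent.down is None:
        parent.down = child
    else:
        first = parent.down
        child.ring = first.ring
        first.ring = child
    return child


def subsidiaries(parent):
    """Walk the ring once, starting from the down pointer."""
    found = []
    p = parent.down
    while p is not None:
        found.append(p.number)
        p = p.ring
        if p is parent.down:
            break
    return found


def destroy(parent, child):
    """Remove `child`; illegal if it still has subsidiary processes."""
    if child.up is not parent:
        raise ValueError("not a subsidiary of this process")
    if child.down is not None:
        raise ValueError("process being destroyed has subsidiary processes")
    prev = child
    while prev.ring is not child:   # find the predecessor on the ring
        prev = prev.ring
    if prev is child:               # it was the only subsidiary
        parent.down = None
    else:
        prev.ring = child.ring
        if parent.down is child:
            parent.down = child.ring
    child.up, child.ring = None, child
```

Only one down pointer is kept per process, so finding all subsidiaries means following the ring until it returns to its starting point; the up pointer is what makes the legality check in `destroy` a single comparison.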
3 If the process attempts to obtain new memory, scan upward through the process hierarchy until the topmost process is reached. If at any time during this scan a process is found for which the address causing the trap is legal, propagate the memory assigned to it down through the hierarchy to the process causing the trap.

Option 3 permits a process to be started with a subset of memory and later to reacquire some of the memory which was not given to it initially. This feature is important because the amount of memory assigned to a process influences the operating efficiency of the system and thus the speed with which it will be able to respond to teletypes or other real-time devices.

The input-output system

The user machine has a straightforward but unconventional set of input-output instructions. The primary emphasis in the design of these instructions has been to make all input-output devices interface identically with a program and to provide as much flexibility in this common interface as possible. Two advantages result from this uniformity: it becomes natural to write programs which are essentially independent of the environment in which they operate, and the implementation of the system is greatly simplified. To the user the former point is, of course, the important one.

It has been common, for example, for programs written to be controlled from a teletype to be driven instead from a file on, let us say, the drum. A command exists which permits the recognizer for the system command language and all of the subsystems to be driven in this way. This device is particularly useful for repetitive sequences of program assemblies and for background jobs which are run in the absence of the user. Output which normally goes to the teletype is similarly diverted to user files. Another application of the uniformity of the file system is demonstrated in some of the subsystems, notably the assembler and the various compilers. The subsystem may request the user to specify where he wishes the program listing to be placed. The user may choose anything from paper tape to drum to his own teletype. In the absence of file uniformity each subsystem would require a separate block of code for each possibility. In fact, however, the same input-output instructions are used for all cases.

The input-output instructions communicate with files. The system in turn associates files with the various physical devices. Programs, for the most part, do not have to account for the peculiarities of the various actual devices. Since devices differ widely in characteristics and behavior, the flexibility of the operations available on files is clearly critical. They must range from single-character input to the output of thousands of words.

A file is opened by giving its name as an argument to the appropriate instruction. Programs thus refer to all files symbolically, leaving the details of physical location and organization to the system. If authorized, a program may refer to files belonging to other users by supplying the name of the other user as well as the file name. The owner of a file determines who is authorized to access it. The reader may compare this file naming mechanism with a more sophisticated one [Daley and Neumann, 1965], bearing in mind the fact the file names can be of any length and can be manipulated (as strings of characters) by the program.

Access to files is, in general, either sequential or random in nature. Some devices (like a keyboard-display or a card reader) are purely sequential, while others (like a disk) may be either sequentially or randomly accessed. There are accordingly two major I/O interfaces to deal with these different qualities. The interface used in conjunction with a given file depends on whether the file was declared to be a random or a sequential file. The two major interfaces are each broken down into other interfaces, primarily for reasons of implementation. Although the distinction between sequential and random files is great, the subinterfaces are not especially visible to the user.

Sequential files

The three instructions CIO (character input-output), WIO (word input-output), and BIO (block input-output) are used to communicate with a sequential file. Each instruction takes as an operand a file number. This number is given to the program when it opens a file. At the time of opening a file it must be specified whether the file is to be read from or written onto. Whether any given device associated with the file is character-oriented or word-oriented is unimportant; the system takes care of all necessary character-to-word assembly or word-to-character disassembly.

There are actually three separate, full-duplex physical interfaces to devices in the sequential file mechanism. Generally, these interfaces are invisible to programs. They exist, of course, for reasons of system efficiency and also because of the way in which some devices are used. The interfaces are:

1 Character-by-character (basically for low-speed, character-oriented devices used for man-machine interaction)
2 Buffered block I/O (for medium-speed I/O applications)
3 Block I/O directly from user core (for high-speed situations)
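The device independence of CIO and WIO can be illustrated with a small Python sketch. Everything here is hypothetical: the classes `CharDevice`, `WordDevice`, and `FileSystem` are illustrative names, and the packing of three 8-bit characters into a 24-bit word is an assumption made for the example, not something the text specifies. The sketch covers reading only.

```python
CHARS_PER_WORD = 3   # assumption: three 8-bit characters per 24-bit word


def pack(chars):
    """Assemble up to three characters into one word, zero-padded."""
    chars = list(chars) + [0] * (CHARS_PER_WORD - len(chars))
    word = 0
    for c in chars:
        word = (word << 8) | (c & 0xFF)
    return word


def unpack(word):
    """Disassemble one word into its three characters."""
    return [(word >> shift) & 0xFF for shift in (16, 8, 0)]


class CharDevice:
    """A character-oriented source such as a teletype (hypothetical)."""
    def __init__(self, text):
        self._chars = iter(text.encode("ascii"))

    def next_char(self):
        return next(self._chars, None)


class WordDevice:
    """A word-oriented source such as a drum file (hypothetical)."""
    def __init__(self, words):
        self._words = iter(words)

    def next_word(self):
        return next(self._words, None)


class FileSystem:
    def __init__(self):
        self._by_name = {}
        self._open = []   # file number -> (device, characters not yet read)

    def register(self, name, device):
        self._by_name[name] = device

    def open(self, name):
        """Open a file by name and return its file number."""
        self._open.append((self._by_name[name], []))
        return len(self._open) - 1

    def cio(self, file_number):
        """Read one character, disassembling words when necessary."""
        device, pending = self._open[file_number]
        if isinstance(device, CharDevice):
            return device.next_char()
        if not pending:
            word = device.next_word()
            if word is None:
                return None
            pending.extend(unpack(word))
        return pending.pop(0)

    def wio(self, file_number):
        """Read one word, assembling characters when necessary."""
        device, pending = self._open[file_number]
        if isinstance(device, WordDevice) and not pending:
            return device.next_word()
        chars = []
        while len(chars) < CHARS_PER_WORD:
            c = self.cio(file_number)
            if c is None:
                break
            chars.append(c)
        return pack(chars) if chars else None


def read_all_chars(fs, file_number):
    """A program that neither knows nor cares what device it reads from."""
    out = []
    while (c := fs.cio(file_number)) is not None:
        out.append(c)
    return out
```

The function `read_all_chars` returns the same characters whether the file number refers to the character device or to a word device holding the same text, which is the property the section attributes to the uniform file interface.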
It should be pointed out that there is no particular relation between these interfaces and the three instructions CIO, WIO, and BIO. The interface used in a given situation is a function of the device involved and, sometimes, of the volume of data to be transmitted, not of the instruction. Any interface may be driven by any instruction.

Of the three subinterfaces under discussion, the last two are straightforward. The character-by-character interface is, however, somewhat different and deserves some elaboration. Devices associated with this interface are generally (but not necessarily) used for man-machine interaction. Consider the case of a person communicating with a program by means of a keyboard-display (or a teletype). He types on the keyboard and the information is transmitted to the computer. The program may wish to make an immediate response on the display screen. In many cases this response will consist of an echo of the same character, so that the user has the feeling of typing directly onto the screen (or onto the teleprinter).

So that input-output can be carried out when the program is not actually in main memory, the character-by-character input interface permits programs a choice of a number of echo tables; it further permits programs a choice of grade of service by permitting them to specify whether a given character is an attention (or break) character. Thus, for example, the program may specify that each character typed is to be echoed immediately and that all control characters are to result in activation of the program regardless of the number of characters in the input buffer. Alternatively, the program may specify that no characters are echoed and every character is a break character. By changing the specification the program can obtain an appropriate (and varying) grade of service without putting undue load on the system. Figure 6 shows the components of the character-by-character interface; responsibility for its operation is split between the interrupt called when the device signals for attention and the routine which processes the user's I/O request.

Fig. 6. The character-oriented interface. (Components shown: device, input and output interrupt routines, input and output buffers, echo table, and the user's program.)

The advantage of the full-duplex, character-by-character mode of operation is considerable. The character-by-character capability means that the user can interact with his program in the smallest possible unit, the character. Furthermore, the full-duplex capability permits, among other things, (1) the program to substitute characters or strings of characters as echoes for those received, (2) the keyboard and display to be used simultaneously (as, for example, permitting a character typed on a keyboard to pre-empt the operation of a process. In the case of typing information in during the output of information, a simple algorithm prevents the random admixture of characters which might otherwise result), and (3) the ready detection of transmission errors.

Instructions are included to enable the state of both input and output buffers to be sensed and perhaps cleared (discarding unwanted output or input). Of course, it is possible for a program to use any number of authorized physical devices; in particular, this includes those devices used as remote consoles. A mechanism is provided to permit output which is directed to a given device to be copied on all other devices which are output linked to it (and similarly for input). This is useful when communication among users is desired and in numerous other situations.

The sequential file has a structure somewhat similar to that of an ordinary magtape file. It consists of a sequence of logical records of arbitrary length and number. On some devices, such as a card reader or the teletype, a file may have only one logical record. The full generality is available for drum files, which are the ones most commonly used. The logical record is to be contrasted with the variable length physical record of magtape or the fixed length record of a card. Instructions are provided to insert or delete logical records and increase or decrease them in length. Other instructions permit the file to be "positioned" almost instantaneously to a specified logical record. This gives the sequential file greater flexibility than one which is completely unaddressable. This flexibility is only possible, of course, because the file is on a random-access device and the sequential structure is maintained by pointers. The implementation is discussed in the following.

When reading a sequential file, CIO and WIO return certain unusual data configurations when they encounter an end of record or end of file, and BIO terminates transmission on either of the conditions and returns the address of the last word transmitted. In addition, certain flag bits are set by the unusual conditions, and an interrupt may be caused if it has been armed.
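The logical-record structure of a sequential file, with positioning and end-of-record and end-of-file indications, can be sketched as follows. This Python fragment is an illustration only: the class `SequentialFile` and its method names are hypothetical, and the sentinels `EOR` and `EOF` stand in for the "unusual data configurations" that the text does not specify.

```python
EOR = object()   # hypothetical stand-in for the end-of-record indication
EOF = object()   # hypothetical stand-in for the end-of-file indication


class SequentialFile:
    """A file as a sequence of logical records of arbitrary length."""

    def __init__(self, records):
        self.records = [list(r) for r in records]
        self.record = 0   # index of the current logical record
        self.word = 0     # index of the next word within that record

    def position(self, record_number):
        """Position the file at the start of a given logical record."""
        if not 0 <= record_number <= len(self.records):
            raise IndexError("no such logical record")
        self.record, self.word = record_number, 0

    def wio(self):
        """Return the next word, or EOR / EOF when one is encountered."""
        if self.record >= len(self.records):
            return EOF
        current = self.records[self.record]
        if self.word < len(current):
            value = current[self.word]
            self.word += 1
            return value
        self.record += 1
        self.word = 0
        return EOR

    def insert_record(self, record_number, words):
        """Insert a new logical record before the given record."""
        self.records.insert(record_number, list(words))

    def delete_record(self, record_number):
        """Delete a logical record."""
        del self.records[record_number]
```

Reading a file built from the records `[1, 2]` and `[3]` yields 1, 2, `EOR`, 3, `EOR`, and then `EOF`; calling `position(1)` beforehand makes the next word read 3 without passing over the first record.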
The implementation of the sequential file scheme for auxiliary storage is illustrated in Fig. 7. Information is written on the drum in 256-word physical records. The locations of these records are kept track of in 64-word index blocks containing pointers to the data blocks. For the file shown, the first logical record is more than 256 words long but ends in the second 256-word block. The second logical record fits in the third 256-word block, and the third logical record, in the 4th data block, is followed by an end of file. If a file requires more than 64 index words, additional index blocks are chained together, both forward and backward. Thus, in order to access information in the file it is necessary only to know the location of the first index block. It may be worthwhile to point out that all users share the same drum. Since the system has complete control over the allocation of space on the drum, there is no possibility of undesired interaction among users.

Available space for new data blocks or index blocks is kept track of by a bit table, illustrated in Fig. 8. In the figure, each column represents one of the 72 physical bands on the drum allocated for the storage of file information. Each row represents one of the 64 256-word sectors around a band. Each bit in the table thus represents one of the 4608 data blocks available. The bits are set when a block is in use and cleared when the block becomes available. Thus, if a new data block is required, the system has only to read the physical position of the drum, use this position to index in the table, and search a row for the appearance of a 0. The column in which a 0 is found indicates the physical track on which a block is available. Because of the way the row was chosen, this block is immediately accessible. This scheme has two advantages over the alternative, which is to chain unused blocks together:

1 It is easy to find a block in an optimum position, using the algorithm just described.
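The bit-table search just described can be sketched in Python. This is a minimal illustration, not the system's code: the class `DrumBitTable` and its method names are hypothetical, and the fallback to later sectors when the chosen row is full is an assumption the text does not address.

```python
BANDS = 72     # columns: physical bands allocated to file storage
SECTORS = 64   # rows: 256-word sectors around a band


class DrumBitTable:
    """One bit per data block: 1 when in use, 0 when available."""

    def __init__(self):
        self.rows = [[0] * BANDS for _ in range(SECTORS)]

    def allocate(self, drum_position):
        """Find a free block in the row indexed by the drum position.

        Returns (band, sector). If the row is full, later sectors are
        tried in rotation order (an assumption made for this sketch).
        """
        for step in range(SECTORS):
            sector = (drum_position + step) % SECTORS
            row = self.rows[sector]
            for band, bit in enumerate(row):
                if bit == 0:
                    row[band] = 1   # mark the block as in use
                    return band, sector
        raise RuntimeError("no data blocks available")

    def release(self, band, sector):
        """Clear the bit when the block becomes available again."""
        self.rows[sector][band] = 0


table = DrumBitTable()
first = table.allocate(drum_position=5)
```

Because the row is chosen from the drum's current position, the block found lies in a sector that can be reached without waiting for most of a revolution; a chain of free blocks would give no such choice. The table holds 72 × 64 = 4608 bits, one per data block.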
Part 4

The instruction-set processor level: special-function processors

This part contains descriptions of processors that do not interpret general programming languages; that is, they are not Pc's. They are all P's, however, since they have an interpreter that determines not only the operations to be taken, given the current instruction, but the next instruction to be obtained.

A Pio (Sec. 1) is a processor that controls T and Ms components. It manages block or vector transmission between T or Ms and Mp.

A P.array (Sec. 2) processes both vectors and two-dimensional matrices. By recognizing these data as fundamental units, programs (or algorithms) can be expressed efficiently in terms of primitive operators. The chief advantage of these P's is their ability to take advantage of the data structure for parallel interpretation, thereby increasing processing speed.

A microprogram processor (Sec. 3) is designed to interpret and process a data-type which is a program. In effect, this processor is a computer within another computer, programmed to act as an interpreter.

A language processor (Sec. 4) interprets a data-type derived from the primitives of a programming language. In contrast, a conventional processor interprets a language based on fundamental hardware implementation primitives. The difference is clearly apparent as increased complexity of the language processors.
Section 1

Processors to control terminals and secondary memories (input-output processors)

The first three chapters of this section show the evolution of the IBM Data Channels (io processors) from 1958 (the 7094 II) to the present (the 1800, which came after the 360). The processor approach for controlling T and Ms components, while more general, should be contrasted with the specialized one-instruction controls in the B 5000 (Chap. 22) and Burroughs D825 (Chap. 36). The fourth chapter, on the DEC 338, shows a processor that controls cathode-ray-tube display consoles. The graphic terminals are the first T's of sufficient complexity to utilize a processor of their own. The first CRT displays used the Pc (e.g., on Whirlwind); then small Pc's were adapted to the task; the DEC 338 is one of the earliest special P.display's that appeared.

There is no example in this section of a specialized P for message concentration and switching. For computer systems multiple remote inputs are still recent enough so that either the main Pc handles the task, via specialized K, or small Pc's are committed to it. However, in the telephone industry there has been a very substantial development by the Bell System of the Electronic Switching System (ESS), which uses specialized C's to control switching (routing). In computer systems, we can expect the use of such specialized processors to increase in the near future.

The IBM 7094 II

The IBM 709, a member of the IBM 701-7094 II family, is one of the first computers to have an io processor (IBM name: Data Channel) in its structure. Chapter 41 discusses the two Data Channel types: the early 7607 and the later 7909. The 7909 Data Channel ISP, and a K which it controls, are given in Appendix 2 and 3 of Chap. 41. The principal difference is that Pc controls the Pio('7909) which in turn controls the K, which in turn controls a T or Ms; the Pc controls the Pio('7607) and the K; the K controls the T or Ms. The series is discussed in Part 6, Sec. 1, page 515.

The structure of System/360, Part I: outline of the logical structure

The io processors (Selector and Multiplexor Channels) in the System/360 have evolved from the IBM 701-7094 II Series. Part 6, Sec. 3 presents the ISP and PMS structures for these processors. Depending on the computer model, the implementations are realized by a microprogrammed processor interpreting a shared control program for both Pio's and Pc, or by a hardwired Pio. The multiple Pio's in a 360 Multiplexor Channel, though logically independent, are implemented as a single, shared physical processor.

The IBM 1800

The structure is discussed in Part 5, Sec. 2, and the Pio's in this structure are presented in Chap. 33, page 396.

The Digital Equipment Corporation DEC 338 display processor

The DEC 338 is an early P.display. It directly interprets a stored program to control a T.display. Earlier T.displays were controlled by Pc (Whirlwind, Chap. 6), or by a special K.display without stored-program capability, or by a general-purpose Pio. The last method outputs fixed length blocks containing data to be interpreted by T.display as points, vectors, characters, curved line segments, etc. The control of T.display first by Pc, then by a K, then by a Pio, and finally by a P.display has been observed as an evolution [Myer and Sutherland, 1968]. Myer and Sutherland also observe that the evolution is about to become a closed cycle because the generality of a Pc is needed to control a T.display.

Note that the 338 has a very extensive ISP. In fact, the P.display's ISP is more extensive than the companion Pc of the PDP-8 (Chap. 5). There are some display tasks which require a Pc, for example, compiling programs (pictures), calculating elaborate light-pen tracking figures, making coordinate and curved lines to straight-line vector approximation transformations, and communicating with other system components.
Another approach to the design of a P.display is based on a P.microprogram which is shared among many T.displays [Rose, 1967]. Yet another alternative, which has not yet been tried, is to incorporate a Pio (P.display) as a special mode in a conventional Pc. Thus the P would interpret either conventional Pc instructions or P.display instructions.

P.display is the interpreter for the output of pictures or graphics. The 338 utilizes data space efficiently simply because the data are long variable-length strings (word vectors). The instruction requires almost no space to specify the data operations and addresses; data are interpreted directly or immediately in the instruction rather than via instruction addresses.

Another feature which allows a program to be efficiently encoded is the stack mechanism for storing subroutine linkages. Subroutines in P.displays are actually programs which form part of a more complete picture. Subroutines are actually subpictures. Although the stack mechanism allows for recursive picture calls, the stack is used principally to save space and to allow multiple T.displays to use common picture programs.

A problem in the 338 which is common to all multi-P structures is intercommunication among the P's. Pc is the controlling P, as is the case with most Pc-Pio structures. The P('338) has no trap to itself but relies on an interrupt signal to Pc. The Pc processes both tasks which P.display might process, given an interrupt system, and other tasks beyond P.display's capability.

A clock should be built into the 338. The brightness or intensity of a picture is determined both electronically (see the mode instructions for controlling intensity) and by the rate at which the pictures are repeated. A clock would allow the time when pictures are started or drawn to be specified; thus the intensity would be independent of picture length.

The 338 requires more hardware than a simpler Pc. However, a large amount of this hardware is used to control the generation of characters and lines. The lines (vectors) are drawn using a DDA (Digital Differential Analyzer) technique. Perhaps one-half of the registers could be eliminated if the 338 were not a P. A simpler alternative approach using a similar computer, the PDP-9, was constructed by Bell Telephone Laboratories and DEC, making the display only a K.

A more elaborate Pc interrupt system with reduced overhead time would enable Pc to take on the specialized program control functions in the 338. Such a scheme might pass the program or instruction counter parameter directly from P.display to Pc. In this way, Pc or P.display would alternatively process part of a single instruction stream, depending on the task.

Despite the problems of this early P.display, it has a sophistication which successors appear to be following.
Chapter 25
The DEC 338 display computer

Introduction
The C(display; 'DEC 338) is a C('DEC PDP-8) with a P.display which can connect to a T(#1:8; display; CRT; area: 9.375 x 9.375 in.²). The PMS structure is shown in Fig. 1, Chap. 5, describing the PDP-8. The Pc ISP is given in Appendix 1 of Chap. 5.
The C('338), although designed to stand alone, is generally used as a satellite to a larger C, via an L(Dataphone). The rationale for using a C as a T is based on the bandwidth and storage requirements needed to maintain graphical picture displays. A human being manipulating pictures (rotation, scale change, and conversion of internal linked data structure to a picture structure) requires short response time; this requirement places high processing demands on larger C's. Thus this C(display) is a preprocessor for larger, more general C's.
The actual T(CRT) is a 16-inch CRT with a 9⅜-inch square viewing area covered by 1,024 x 1,024 (XY) points. The diameter of the points is ~0.015 inch. The spot is magnetically deflected and focused. All eight T(CRT)'s can be driven together or used independently. A photomultiplier connected through a fiber-optic bundle link is used as a light pen (a photosensitive sensor) to detect spots on the T. The light pen allows the P.display to detect whether a user has "pointed to" a displayed spot.
Pc and P.display access the same Mp; the total data rate available from Mp is one 12-bit word/1.5 microseconds. The instruction times of P.display are a function of the point plotting times of the T(CRT): 0.3 microsecond to the next incremental unintensified point (approximately 0.010 inch away); 1.2 microseconds to an incremental intensified point; and 35 microseconds to a point plotted at a random position.
The state (registers) of C.display is given in the ISP description of Appendix 1 of this chapter. There are four parts of the state: the control registers for Program Flow State, the Picture State (or position of beam), Console and Light-pen State, and Mp State. The instruction interpreter is fairly simple and is best described by the state diagram (Fig. 1). The instructions are given in Tables 1 and 2. The remainder of the chapter discusses the P.display instructions and the Pc instructions for communicating with P.display.
[1>— M[DAC+1];
Principle of operation
The actual picture is held stationary by repeatedly displaying (intensifying) a particular point, line, etc. The number of times a figure has to be displayed so that it appears stationary and does not flicker depends on the CRT phosphor, the figure, and environmental parameters. The generally accepted range is a plotting rate of 20 to 50 plots/second; thus a complete picture has to be drawn in 50 to 20 milliseconds. If we assume a 30-Hz plot rate, about 28,000 points can be plotted in vector mode (or 280 to 1,120 inches, depending on the spacing). About 1,000 characters can be displayed in 30 milliseconds using character mode.
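The refresh budget quoted above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original chapter: it uses only the per-point times given in the Introduction (1.2 microseconds per intensified incremental point, 35 microseconds per random point) and the 0.010-inch point spacing; the function and constant names are hypothetical.

```python
# Timing figures quoted in the chapter (microseconds per plotted point).
INTENSIFIED_STEP_US = 1.2   # incremental intensified point
RANDOM_POINT_US = 35.0      # point plotted at a random position
STEP_INCHES = 0.010         # approximate spacing of adjacent points


def points_per_frame(refresh_hz, us_per_point):
    """Number of whole points that fit in one refresh period."""
    frame_us = 1_000_000.0 / refresh_hz
    return int(frame_us // us_per_point)


def frame_budget(refresh_hz=30.0):
    """Summarize what can be drawn in one refresh period (illustrative only)."""
    vector_points = points_per_frame(refresh_hz, INTENSIFIED_STEP_US)
    return {
        "frame_ms": 1000.0 / refresh_hz,
        "vector_points": vector_points,
        "vector_inches": vector_points * STEP_INCHES,
        "random_points": points_per_frame(refresh_hz, RANDOM_POINT_US),
    }
```

At 30 Hz the frame lasts about 33 milliseconds, which gives a little under 28,000 intensified incremental points, or roughly 280 inches of line at unit spacing, in line with the figures in the text. Only about 950 random points fit in the same period, which is why the incremental modes matter.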
Fig. 1. DEC 338 instruction-interpretation state diagram. (Notes: 1, executed by Pc to start P.display; 2, executed by Pc to stop P.display; 3, data state "states"; 4, control state "states." State transitions occur approximately each Mp cycle.)

When the light pen is used, a display program is required to "track" the pen. The pen's position is determined by displaying points. The pen, of course, detects the points when it is present at the displayed point's position; therefore the program knows the location of the pen.
The parameters of interest for a display vary, depending on the application. However, the general parameters are:
Table 1 The DEC 338 control-mode instruction set
† Set; allow instruction bits to specify new value.
‡ A two-word instruction, second word contains low-order 12 bits for DAC (jump address).
¶ Skip can be for true or false.
§ Inhibit restoration of bits.
Instructions and their interpretation in P(display)
Two instruction-set types are interpreted in the P.display: Data State, in which instructions specify display information; and Control State, in which instructions specify program control information (e.g., jumps, modes, etc.). A state diagram for the interpretation process is given in Fig. 1.

Data-state instructions
There are seven instructions (which DEC calls modes) that can be executed while P.display is in data state. The instructions (modes) are really substates of data state. The instructions (actually like data) are interpreted for the mode. When all the data-mode instructions have been interpreted, an escape instruction returns the P.display to control state. A control instruction is issued to select a mode and simultaneously place the display in data state.

Table 2 The DEC 338 data-mode instruction set (columns: Mode; Instruction bits)

Increment mode. This mode is used to draw curves and alphanumeric characters and other small symbols. Two instructions are stored per word. An instruction will cause the beam position to be moved one, two, or three times, in 0.010-inch increments, in one of eight directions. Direction 0 is to the right, direction 1 is up and to the right, etc.
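The increment-mode description (one to three moves in one of eight directions) can be sketched as follows. This is an illustration under stated assumptions, not the 338's actual bit encoding: the text gives only directions 0 (right) and 1 (up and to the right), so the table below assumes the numbering continues counterclockwise in 45-degree steps, with y increasing upward and one raster unit per move. The names `DIRECTIONS` and `increment_move` are hypothetical.

```python
# Direction code -> (dx, dy) in raster units. Codes 0 and 1 follow the text;
# codes 2..7 ASSUME the numbering continues counterclockwise.
DIRECTIONS = [
    (1, 0), (1, 1), (0, 1), (-1, 1),
    (-1, 0), (-1, -1), (0, -1), (1, -1),
]


def increment_move(x, y, direction, count):
    """Return the beam positions visited by one increment-mode instruction."""
    if direction not in range(8):
        raise ValueError("direction must be 0..7")
    if count not in (1, 2, 3):
        raise ValueError("count must be 1, 2, or 3")
    dx, dy = DIRECTIONS[direction]
    points = []
    for _ in range(count):
        x += dx
        y += dy
        points.append((x, y))
    return points
```

For example, `increment_move(0, 0, 0, 3)` steps the beam three raster units to the right, and a short sequence of such instructions traces the strokes of a small character.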
Vector mode. The vector mode is used to draw straight-line segments. This two-word instruction causes the beam position to be moved along a line represented by an 11-bit delta y and an 11-bit delta x.

Vector continue mode. This mode is used to draw a straight line similar to vector mode but causes the line to be extended until an "edge" of the screen is encountered.

Short vector mode. The short vector mode is used to draw figures composed of short line segments. A one-word instruction specifies a 5-bit delta y and a 5-bit delta x quantity. It is transformed within the display to the same format as vector mode and operates in the same manner.

The preceding modes move the beam by counting the X and Y position registers. The counting is done at 1.2 microseconds per step on an intensified move and at 0.30 microsecond per step on a nonintensified move.

Point mode. Point mode is used for random point plotting. A two-word instruction specifies new Y and/or X coordinates to be placed into the Y and X position registers.

Graph-plot mode. This mode is used to draw curves of mathematical functions. A one-word instruction has data for the X or Y position register; at the same time, Y or X, respectively, is incremented by a count of one, two, four, or eight, depending on the scale factor.

Point and graph-plot modes operate at a rate depending upon the position of the new point with respect to the previous point. If a point is only one-eighth of the screen away, the delay for beam-settling time is 6 microseconds; otherwise the settling time is 35 microseconds.

Character generation option instructions. The alphanumeric characters or special symbols which make up a character set are stored in Mp in increment mode or short vector mode. These characters can be arbitrarily defined. A 6-bit (or 7-bit) character code in the instruction is used to locate a word in a table in Mp called the dispatch table. The base address of the table is specified by the Starting Address Register/SAR. SAR may be loaded by instructions from the Pc. The SAR represents the most significant 6 bits of a 15-bit memory address. The character code represents the least significant 6 (or 7) bits. A seventh SAR bit, corresponding to the octal position 100, is used with 6-bit characters as a case bit (i.e., uppercase or lowercase characters) and may be set or cleared with a control character.

A word in the dispatch table has the following format:
Bit 0: If bit 0 is a 0, bits 2 to 11 are combined with SAR to specify the address at which the character definition program starts. (The address bit 2 is common to both the SAR and bit 2 of the dispatch word and so may be specified in either place or in both places.) If bit 0 is a 1, bits 1 to 11 are used to perform a control function as specified by particular control instructions.
Bit 1: Determines the mode in which the character is to be displayed. If bit 1 is a 0, the increment mode is used to plot the character; if bit 1 is a 1, the short vector mode is used to plot the character.

Control-state instructions
There are six control-state instructions.

Parameter. Parameter is used to set values in scale, light-pen, and intensity registers.

Mode. Mode is used to set up the data-state mode (or data-mode instruction). Mode is also used to stop the display.

Conditional skip. The skip instruction tests the state of the P.display and the pushbuttons.

Miscellaneous. These instructions include both tests and additional parameter control.

Display jump and push-jump subroutine instructions. The display jump instruction has 15 address bits, so that a jump may be executed to any location in the display file within the 32-kw memory. The display subroutine instructions are push-jump (an extension of the jump instruction) and pop, the return from subroutine. The push-jump works as follows: The current state of the display (Light Pen Enable, Data Mode, Scale, and Intensity) is stored, along with the return address, in two successive locations in the first 4,096 words of memory. The locations are determined by the pushdown pointer, PDP. This pointer is initially set by a Pc instruction. The normal jump is then executed. To return from a subroutine, the pop instruction is executed. It has no address bits. Its function is to return the display to a previous state by sending the last words on the push-down stack back to the display.
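The push-jump and pop behavior just described can be sketched in a few lines. This is a behavioral illustration only, under assumptions the text does not settle: the stack is assumed to grow toward higher addresses, the saved state is modeled as a small record rather than a packed 12-bit word, and the `inhibit` argument of `pop` is a hypothetical stand-in for the "inhibit restore" bits. All class and field names are invented for the sketch.

```python
from dataclasses import dataclass, replace


@dataclass
class DisplayState:
    light_pen_enable: bool = False
    data_mode: int = 0
    scale: int = 0
    intensity: int = 0


class DisplaySketch:
    STACK_LIMIT = 4096  # the stack lives in the first 4,096 words of memory

    def __init__(self, stack_base):
        self.memory = {}        # address -> stored word (modeled loosely)
        self.pdp = stack_base   # push-down pointer, initially set by Pc
        self.dac = 0            # display address counter (next instruction)
        self.state = DisplayState()

    def push_jump(self, target):
        """Save state and return address in two successive words, then jump."""
        if self.pdp + 1 >= self.STACK_LIMIT:
            raise OverflowError("stack must stay in the first 4,096 words")
        self.memory[self.pdp] = replace(self.state)  # copy of current state
        self.memory[self.pdp + 1] = self.dac         # return address
        self.pdp += 2
        self.dac = target

    def pop(self, inhibit=()):
        """Return from a subroutine, restoring state unless inhibited."""
        self.pdp -= 2
        saved = self.memory[self.pdp]
        self.dac = self.memory[self.pdp + 1]
        for name in ("light_pen_enable", "data_mode", "scale", "intensity"):
            if name not in inhibit:
                setattr(self.state, name, getattr(saved, name))
```

Because the state is saved with every call, a subpicture may change scale or intensity freely and the caller sees no difference after the pop, unless the pop deliberately inhibits restoration of some field.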
The stack approach to subroutining as implemented on the 338 has certain advantages over the jump to subroutine instruction normally used in Pc's:
1 Memory space is conserved since return address locations are not required in each subroutine in memory.
2 A subroutine can be called any number of times before return to the main routine.
3 Since the state of the display is saved on the stack and subsequently restored, subroutines are truly transparent; that is, after the return they leave the state of the display program the same as before the subroutine call.
4 The subroutines can either retain the same state or change the state of the display by using one or more of the "inhibit restore" bits available in the pop instruction. The programmer can elect independently to inhibit restoration of mode, light pen, and scale, or intensity information.

Instructions in Pc for communicating with P(display)
Instructions in Pc communicate with P.display. The physical connection is by the S(I/O Bus). The in-out transfer instructions in Pc are used to initialize and read the state of P.display.

P.display state initialization from Pc instructions
Set Push Down Pointer from AC
Set Display Address Counter from AC
Set Push Button contents from AC
Set miscellaneous flag and status bits from AC
Set character generator SAR address from AC

P.display status to Pc instructions
Read Push Down Pointer into AC
Read X register into AC
Read Y register into AC
Read Display Address Counter into AC
Read Status words 1, 2, 3, 4, 5 into AC (60 miscellaneous bits of flags, modes, etc.)

Picture debugging modes. These modes aid picture debugging. A bit can be programmed and set to override the nonintensify bit in data-mode instructions. When this bit is a 1, all points and vectors are plotted, whether they are to be intensified or not. The search enable instruction forces the display to run until a particular instruction type is found. The instruction type is specified by the search enable instruction.
APPENDIX 1
DEC 338 DISPLAY PROCESSOR ISP DESCRIPTION
pb^clear
:=
i
pbj:lear
(PB
(Scale -»
light^penuchange
->
:
1 1
Ace
C(AcCi) to n
.r*+» +1 to
Ace Ace
< 0,
transfer control to n;
C(Acc)-2» If
C(Acc)
to
if
C(Acc)
>
proceed serially) Read next character on input mechanism into n
(i.e.,
On The
Effect of order
Send C(n)
to output
mechanism
0, ignore
Table 2
Notation: A, B, C, ... stand for the various registers in the arithmetical and control register units (see §3 of the text). 'C to D' indicates that the switching circuits connect the output of register C to the input of register D; '(D + A) to C' indicates that the output of register A is connected to the one input of the adding unit (the output of D is permanently connected to the other input), and the output of the adder to register C. A numerical symbol n in quotes (e.g., 'n') stands for the source whose output is the number n in units of the least significant digit.
Column headings: Arithmetical unit; Control register unit; Conditional flip-flop (Set, Use); Next micro-order.
operations called for in the control register unit. The fourth column shows which conditional flip-flop, if any, is to be set and the digit which is to be used to set it; for example, (1)Cs means that flip-flop number 1 is set by the sign digit of the number in register C, while (2)G1 means that flip-flop number 2 is set by the least significant digit of the number in register G. In the case of unconditional micro-orders columns 5 and 7 are blank and column 6 contains the address of the next micro-order to be executed. In the case of conditional micro-orders column 5 shows which flip-flop is used to operate the conditional switch and columns 6 and 7 give the alternative addresses to which control is to be sent when the conditional flip-flop contains a 0 or a 1 respectively.
Micro-orders 0 to 4 are concerned with the extraction of orders from the store. They serve to bring about the transfer of the order from the store to register E and then cause the five most significant digits of the order to be placed in register II with the result that control is transferred to one of the micro-orders 5 to 15, each of which corresponds to a distinct order in the machine order code. In this way the sequence of micro-orders needed to perform the particular operation called for is begun.
The way in which the various operations are performed can be followed from Table 2.
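The next-address rule described for columns 5 to 7 can be sketched as a small selection function. This is only an illustration of the rule as stated in the text; the record layout and the names `MicroOrder` and `next_address` are hypothetical, and the micro-order addresses used in any call are made up rather than taken from Table 2.

```python
from typing import NamedTuple, Optional, Sequence


class MicroOrder(NamedTuple):
    use_flip_flop: Optional[int]  # column 5; None for an unconditional micro-order
    next_if_0: int                # column 6
    next_if_1: Optional[int]      # column 7; None for an unconditional micro-order


def next_address(order: MicroOrder, flip_flops: Sequence[int]) -> int:
    """Pick the address of the next micro-order to execute."""
    if order.use_flip_flop is None:
        # Unconditional: columns 5 and 7 are blank, column 6 holds the address.
        return order.next_if_0
    # Conditional: the named flip-flop operates the conditional switch.
    if flip_flops[order.use_flip_flop]:
        return order.next_if_1
    return order.next_if_0
```

A conditional micro-order therefore costs no extra step: the branch is folded into the choice of successor, which is what lets a loop such as a multiplication step be written as a short cycle of micro-orders.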
In the section dealing with multipli-
assumed that numbers
lie in
—1
means the symbol in L ... 1L ... are impossible. (1L ...) Ten b operations are primarily involved ... 21) ... recalled.)
It is necessary to be able to detect the responsibility bit (b = 22), since there are occasions when the explicit structure of lists is important, and not just the information they designate. Finally, although the signal bit is just a single switch, it is necessary to have two symbols, one corresponding to "signal on" and the other to "signal off" (b = 26 and 27), so that the information in the signal can be retained for later use (b = 28 and 29).
The sense of the signal is not arbitrary. In general "off" is used to mean that a process "failed," "did not find," or the like. Thus, in operations b = 6 and 7, the failure to find a "stop interpretation" operation sets the signal to "off." Likewise, the end of a list will be symbolized by setting the signal to "off."

List operations
Both the "save" and "delete" operations are used to manipulate lists, but besides these, several others are needed. The three operations, b = 30, 31, 32, allow for search over list structures. They can be paraphrased as: "get the referent," "turn down the sublist," and "get the next word of the list." They all have in common that they replace a known symbol with an unknown symbol. This unknown symbol need not exist; that is, the symbol referred to may contain a b = 10 operation, which means that the end of the list has been reached. Consequently, the signal is always set "on" if the symbol is found, and "off" if the symbol is not found. One of the virtues of the common signal is apparent at this point, since, if the programmer knows that the symbol exists, he will simply ignore the signal. Instruction formats that provide for additional addresses for conditional transfers would force the programmer to attend to the condition even if it only meant leaving a blank space in the program.
To illustrate how these search operations work, Fig. 6 shows a list of lists, L300, and a known cell, L100. Cell L100 contains the reference to the list structure. The programmer does not know how the list is referenced. He wants to find the last symbol on the last list of the structure. His first step is (30, 1, L100), which replaces the reference by the name of the list, L300. He then searches down to the end of list L300 by doing a series of operations: (32, 1, L100). Each of these replaces one location on the list by the next one. In fact, a loop is required, since the length of the list is unknown. Hence, after each "find the next word" operation, he must transfer, on the basis of the signal, back to the same operation if the end of the list hasn't been reached. The net result, when the end of the list is reached, is that the location of the last word on list L300 rests in L100. Since in this example he wants to go down to the end of the sublist of the last word on the main list, he next performs (31, 1, L100). This operation replaces the location of the last word with the name of the last list, L700. Now the search down the sublist is repeated until the end is again reached; at this point the location of the last symbol on the last list is in L100, as desired. The sequence of code follows:

Location    Symbol (b, c, d)    Link
            30, 1, L100
L888        32, 1, L100
            4, 0, L888
            31, 1, L100
L999        32, 1, L100
            4, 0, L999

The list operations, b = 33 and 34, allow for inserting symbols in a list either before or after the symbol designated. The lists in this system are one-way: although there is always a way of finding the symbol that follows a designated symbol, there is no way of finding the symbol that precedes a designated symbol. The "insert before" operation does not violate this rule. In both operations,
The name of the CIA list for the program structure which is to be reactivated on completion or interruption of the current program structure is the second item on the L3 list, etc. Therefore, the L3 list is appropriately called the current CIA list. The "save" and "delete" operations are used to manipulate L3 analogously to their use with L2 previously described.
Appendix 3 gives a more complete schematic representation of the interpretation cycle. It has still been necessary to represent only selected b operations.
Data programs
In the section on list operations a search of a list was described. There the data were passive; the processing program dictated just what steps were taken in covering the list. Consider a similar situation, shown in Fig. 8, where there is a working cell, L100, which contains the name of a list, L300. L300 is a data program. There is a program that wants to process the data of L300, which is a sequence of symbols. This program knows L100. To obtain the first symbol of data, it does (6, 1, L100), that is, "execute the parallel program whose name is in L100." The result is to create a CIA list, L500, put its name in L100, and fire the program. Some sort of processing will occur, as indicated by the blank words of L300. Presumably this has something to do with determining what the data are, although it might be some bookkeeping on L300's experience as a data file. Eventually L mo is reached, which contains (0, 1, L800). This operation stops the interpretation, and returns control to the original processing program. The first symbol of data is defined to be 1L800. The processing program can designate this by 4L100, since the sequence of c = 4 prefixes in L100 and L500 pass along the interpretation until it ultimately becomes 1L800. Now the processing program can proceed with the data.
Before 8, 1, L100
It remains
15
16 17
18 19
Copy s into communication list, saving 1L Move * into communication list, saving 1L Move 1L into location of s, saving s. Move 1L into location of s, destroying s. Copy
location of s into communication
Create a
new symbol
in location of
.
1. .
2.
list,
saving 1L
saving
s,
Fetch the current instruction according to the current instruction address (CIA) of the current
21
22 23
Turn signal on Turn signal on
if s if s
Turn signal on if Turn signal on.
s
is
.
c to c
s.
=
If
1L
,
off if not.
1L
,
off if not; delete
responsible,
1L
Turn signal
Invert signal.
26 27
Copy Copy
28
Set signal according to
s.
29
Set signal according to
s;
off.
3.
signal into location of
s.
signal into location of
s,
= =
put CIA
which saving
delete
s.
and turn
Replace
to step
If
b
=
If
b
=
step
1.
31
= 10), leave s and turn signal off. symbol doesn't exist (b in d of s and turn signal on; if symbol the s symbol Replace by
by the symbol designated by
doesn't exist, leave s and turn signal
by the location
s,
signal on;
of the next
symbol after d of
"0, 4,
part of
d
If
b
If
b
If
b If
= = -
s
If
new CIA =
d,
CIA and go to step 4. new CIA = d part of s and go
turn the signal on, delete 1 save
CIA,
set a
2 replace 3 replace
b, c,
d by
CIA by
to
Insert
(move symbol from communication
list).
2. 1.
signal off, delete
CIA and go
to
4.
"popped up" CIA
Replace to step
and go to step
10 delete CIA.
is
go to step 3. Otherwise go to step 4.
s
the d part of s and go to step
rent instruction again,
and
34
after s
set a
operation: (Some of the b operations
no CIA "pops up" turn
step
of s)");
33
1L
it,
1.
off.
(f, replaced by if next symbol does not exist, leave s and turn signal off. Insert 1L before s (move symbol from communication list).
(s
3.
and go to step 3. d parts of the word at address
affect the interpretation cycle follow.)
if
s
reduce
s.
30
turn signal on
c
in the address register
Decode and execute the b
List Operations
Replace
If
4 replace c, d by the c, d and go to step 2. If c = 5 mark CIA "incomplete," save
If c
.
not.
off if
c
and go
25
32
at address d,
=
2 replace d by d part of the at address d, reduce c to c = 1 and continue. If c = 1 2 and continue.
to step put d in the address register and go
24
s
list.
Decode and execute the c operation: c = 3 replace d by d part of the word word
20
CIA
If
Signalling Operations
= =
THE INTERPRETATION CYCLE
APPENDIX 3
Communication List Operations 14
Processors based on a programming language
Section 4
instruction-set processor level: special-function processors
CIA by 1.
the
/
marked "incomplete" fetch the curmove 1L into address register and
4.
part of the current instruction
and go
Chapter 31
System design of a FORTRAN machine¹
Theodore R. Bashkow / Azra Sasson / Arnold Kronfeld
¹ IEEE Trans., EC-16, vol. 4, pp. 485-499, August, 1967.

Summary
A system design is given for a computer capable of direct execution of FORTRAN language source statements. The allowed types of statements are the FORTRAN DO, GO TO, computed GO TO, Arithmetic, READ, PRINT, arithmetic IF, CONTINUE, PAUSE, DIMENSION and END statements. Up to two subscripts are allowed for variables and no FORMAT statement is needed. The programmer's source program is converted to a slightly modified form while being loaded and placed in a Program Area in lower memory. His original variable names and statement numbers are retained in a Symbol Table in upper memory, which also serves as the data storage area. During execution of the program each FORTRAN statement is read and interpreted at basic circuit speeds since the machine is a hardware interpreter for these statements. The machine corresponds therefore to a "one-pass, load-and-go" compiler except, of course, that there is no translation to a different machine language. It is estimated that the control circuitry for this machine will require on the order of 10,000 diodes and 100 flip-flops. This does not include arithmetic circuitry.

Index Terms
Digital computer system, digital machine design, direct execution of FORTRAN, FORTRAN computer system, FORTRAN language machine, hardware interpreter.

Introduction
The algebraic languages, in particular FORTRAN in this country, have had enormous impact on the utilization of computers for scientific and engineering computation. They were designed in large part to overcome the annoyance of lengthy learning time and the laborious attention to detail needed to use a basic machine language.
These annoyances are overcome by providing a language which is closer to English in form, and freer of "bookkeeping" details, than the usual machine languages, and by providing a machine language program, called a compiler or translator, to convert from the source program written by a user to an object program executable by a computer. Thus the original drawbacks are overcome but the discrepancy between the external language of the user and the internal language of the machine leads to at least two others. The compilation run of the machine, during which the language translation is accomplished, is a waste of time and money to the user since he must pay for this time though he gets no problem answers from it. Secondly, the user has specified the logical flow and arithmetic details of his solution in the source language. However, when the machine "hangs up" or when he attempts to debug his program, all he finds displayed on the machine console is the machine language. (On large machines he gets equivalently an esoteric print-out in a symbolic form of machine language.) To overcome these difficulties one could use an interpretive translator of the source language instead, but the historical deficiencies of interpreters, loss of memory space and loss of speed of execution, have caused this solution to be shunned.
Another solution is also possible: design a machine which executes an algebraic language directly as its "machine language." This approach is based on a recognition that once the allowable syntax and associated semantics of language statements have been firmly specified it is a matter of choice whether to write a compiler, to write an interpreter or to build an interpreter out of hardware.
The software choice has been almost overwhelmingly to write a compiler. Since the choice of hardware interpreter, or machine, has not been made, and in fact has hardly been explored to any great extent, a study has been made in order to see if this choice leads to a system which is competitive with the usual software system.
It should be understood that such a machine has not been constructed. However, the design is sufficiently complete that construction seems feasible.²
² See final technical report for Contract AF 19(628)-2798.

Language: design philosophy
Since the machine language is to be an algebraic one it seemed reasonable to choose a simple subset of the most commonly used one, FORTRAN. This eliminates the necessity for inventing still another such language and allows attention to be focused on machine design. In fact, the subset chosen is quite close to that known as "Preliminary FORTRAN for the IBM 1620," which is complete enough to be quite useful, but which does not include
such innovations as subroutines, etc. In addition, the usual "built in" subroutines SIN (x), COS (x), etc., are not included. Their inclusion would require additional effort for their hardware implementation which did not appear to be worth expending at this time.
The FORTRAN statement types which are accepted by the machine as machine language are in the table that follows. Some familiarity with the FORTRAN language is assumed.

Statement / Comment

a = b
    The value of the arithmetic expression b is stored in the memory location referenced by the variable name a, which may have up to two subscripts.
GO TO n
    Program control is transferred to the statement numbered n.
GO TO (n1, n2, ..., nm), i
    Program control is transferred to one of the statements numbered n1, n2, ..., nm depending on the value of i at the time this statement is executed.
IF(e) n1, n2, n3
    Program control is transferred to the statement numbered n1 if the algebraic expression e is negative, to that numbered n2 if e is zero, and to that numbered n3 if e is positive.
PAUSE
    Program execution is halted until restarted by console switch.
DO n i = m1, m2, m3
    All statements following this one in the program, including the statement numbered n, are executed repeatedly. The first execution is with i equal to m1; i is incremented by the value of m3 before each succeeding execution. This continues until i is greater than m2 at which time program control is transferred either to the statement following n or to that statement required by the rules for DO nests. If m3 is not given it is understood to be 1.
CONTINUE
    This statement has the effect of the "no operation" instruction in conventional machines. Program control goes to the next statement in the program unless the CONTINUE is the last statement in the range of a DO. In this case normal DO sequencing takes place.
DIMENSION v, v, v, ...
    This statement has the effect of reserving memory space for the subscripted variables v. Each v stands for a variable name followed by parentheses enclosing one or two constants.
READ, List
PRINT, List
    These statements cause data to be read or printed, respectively, in accordance with the specified list of variables which may be subscripted; however, the "implied DO" feature has not been implemented. No FORMAT control is available with this machine, therefore no statement number need be given.
END
    This statement generates a control signal to start execution of the program.

No distinction is made in this machine between fixed (integer) and floating point (real) variables. These may have names of any length, starting with any alphabetic character. Fixed point constants may be specified, in a program or as data, as any combination of one to four numeric characters preceded by a + or - sign; however, these are converted to an internal decimal floating point number and so there are no restrictions on "mixed mode" expressions. Statement numbers must be unsigned fixed point constants, which are not so converted since they only affect program control and not arithmetic processing.
Floating point constants are specified in the form of a mantissa of one to four numeric symbols preceded by a decimal point (and a + or - sign). These are followed by the character E and a single (positive or negative) digit representing the power of ten in the usual scientific notation.
These constraints on number size and format are made to simplify certain circuits and could easily be relaxed if desired. The restriction to a two-subscript maximum for subscripted variables is similarly motivated.
Internally, all numerical data require three 8-bit words (Fig. 1). The first two words contain the four-digit mantissa, packed two digits per word in a 4-bit code for each digit. A decimal point is assumed to exist to the left of the most significant digit. The two most significant bits of the third word are zero. The third bit is 0 if the mantissa is positive, or 1 if it is negative, and similarly the fourth bit is 0 or 1 if the exponent is, respectively, positive or negative. The single exponent digit occupies the least significant four bits of this word. All other characters occupy a full 8-bit word of which the two most significant are 1's. Any numeric characters which are symbols of a variable, e.g., the "2" in AB2X, also occupy a full word of this type. Statement numbers are simply packed 2 digits per word and always occupy 2 full words.
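The three-word number format can be illustrated with a short packing routine. This is a sketch under assumptions that should be read as such: the "4-bit code for each digit" is taken to be ordinary 8421 BCD, the more significant digit of each pair is placed in the upper half of the word, and a sign bit of 1 is taken to mean negative. The function name `encode_number` is hypothetical.

```python
from typing import List


def encode_number(mantissa_digits: str, mantissa_negative: bool, exponent: int) -> List[int]:
    """Pack a four-digit mantissa and a one-digit exponent into three 8-bit words."""
    if len(mantissa_digits) != 4 or any(c not in "0123456789" for c in mantissa_digits):
        raise ValueError("mantissa must be exactly four decimal digits")
    if not -9 <= exponent <= 9:
        raise ValueError("exponent must be a single signed digit")
    d = [int(c) for c in mantissa_digits]
    word1 = (d[0] << 4) | d[1]   # two mantissa digits, assumed 8421 BCD
    word2 = (d[2] << 4) | d[3]
    # Third word: two leading zero bits, mantissa sign, exponent sign,
    # then the exponent digit in the least significant four bits.
    word3 = (int(mantissa_negative) << 5) | (int(exponent < 0) << 4) | abs(exponent)
    return [word1, word2, word3]
```

Under these assumptions the value +.5739E-4 packs into the words 0x57, 0x39, 0x14. None of the three words has both leading bits set, so a numeric word cannot be confused with a character word, whose two most significant bits are 1's.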
Fig. 1. + 0.5739 E-4 in three consecutive words in memory (Word 1, Word 2, ...).

Before proceeding with the description of the overall charac-
366
Part
4
I
The instruction-set processor
level: special-function
Section
processors
of the pointers correspond to indirect addresses. Figure 2 shows a sketch of the overall system control and Tables 2 to 7 show to what extent the original statements have been altered.

[Fig. 2. FORTRAN computer system. Blocks labeled in the figure: memory (Symbol Table, Program Area, I/O buffer), memory address register, memory buffer register, load, program, data, arithmetic unit, input-output (read/print).]

Loading a program

A program, which is punched in a paper tape, is loaded into the memory by energizing the tape read circuit, which reads a statement on the tape, including the end-of-statement symbol ±, into the I/O buffer. The read circuit is then de-energized. The least significant 6 bits of each word of the buffer hold the internal BCD representation of each symbol. A scan circuit (Fig. 3) now picks up each symbol in the statement from left to right and, as each symbol is decoded, it reacts as follows.

If the first symbol is a digit, control is turned over to a Statement Number Load circuit. This circuit shifts the statement number digit by digit into a register (SHR). The maximum allowable length of a statement number is 4 digits and all statement numbers are carried internally in this form, i.e., a programmer's statement number 13 is carried in 2 words as 0013. A search is now made of the Symbol Table area. One of three possibilities exists:

1 The statement number is not found in the Symbol Table. It is put into the Symbol Table followed by the value of the current Program location. The statement number is also put into the Program Area starting at this location and the Program Counter incremented appropriately, i.e., by 2 since two 8-bit words are used.

2 The statement number is found in the Symbol Table because it has been previously referred to by an IF or GO TO. The current value of the Program Counter is placed into the two memory locations following the statement number. (These were left blank when the statement number was previously processed.) The statement number is put into the Program Area and the Program Counter is incremented.

3 The statement number is found because it has been previously referred to by a DO statement. A description will be deferred until the DO statement loading is described, since the circuit's behavior is more meaningful in that context.

After a statement number has been processed in this fashion, or if the first symbol in the statement was not a digit (no statement number was assigned), the scan circuit continues to pick up each symbol from left to right until it is able to classify the statement as to type. It then turns over control to the appropriate loading circuit as indicated in Fig. 3.

[Fig. 3. Load processing sequence and control. The figure shows paper tape feeding the I/O buffer and the scan circuit, which passes control to the circuits that process the statement number and the DIMENSION, ARITHMETIC, DO, GO TO, COMPUTED GO TO, READ, PRINT, IF, PAUSE, CONTINUE, and END statements.]

All of these loading circuits put the statements into the Program Area after replacing variable names and statement number references in the program with addresses or pointers. They also replace reserved names such as GO TO or CONTINUE with a single 8-bit code (token). Each unique variable name in the program, however, is also stored in the Symbol Table once, using an 8-bit code for each symbol. For nonsubscripted variables the three words following the name are reserved for the data that will be associated with this name when the program is executed. Subscripted variable names are found in DIMENSION statements, which must precede the use of these variables in the program. In this case as many locations following the name are reserved as have been computed from the DIMENSION statement. The name in the Symbol Table is preceded by a special symbol α, to indicate that it is a subscripted variable. In addition, the first of the two subscript values in the DIMENSION statement is also stored immediately following the name. This number is needed during program execution for constructing the proper address of the element of the array specified¹ by a subscripted variable.

¹A pointer to the next available location in the Symbol Table is also stored for speed in Symbol Table searching.

The address of the data location replaces all symbols of the variable name in the Program Area except for the first. This symbol, which must be alphabetic, is retained in the Program Area as an indicator that this is indeed a variable. All special symbols such as (, ), +, -, etc. are simply stored sequentially in the Program Area in the 8-bit BCD form as they appear in the original statement. Statement numbers in IF and GO TO statements are similarly replaced by the address in the Symbol Table which holds the address in the Program Area of the statement having that number. Note that this is an indirect address to the statement. Statement numbers in DO statements are dealt with somewhat differently, as will be explained later.

Because variable names and statement number references can appear many times in a program, these searches of the Symbol Table are controlled by two special circuits, the Variable Match Unit (VMU) and the Statement Match Unit (SMU). These circuits indicate either that the name or statement number is already in the Symbol Table or that it is not. Thus the first appearance of a variable name, statement number, or reference to a statement number causes it to be put into the Symbol Table. Subsequent references merely utilize these previously assigned data or Program addresses. Therefore each name or statement number is stored in the Symbol Table only once, with an exception noted below.

In general, the programmer's statement is altered only in the above described fashion. However, for ease of execution the computed GO TO has its index parameter name, i.e., the "i" in GO TO (n1, n2, ..., nm), i, changed from the position following the parenthesis to a position preceding the parenthesis.

The DO statement requires the most complex loading algorithm. Basically, the idea is to place the DO statement itself, essentially unchanged, into the Program Area but to extract the range statement number (which specifies the last statement in the range of the DO) and put it into the Symbol Table. It is there preceded by a special symbol λ, designating it as being referenced by a DO, and followed by the Program Area address of the corresponding DO statement. The DO statement in the Program Area has its original statement number replaced by a special symbol, λ, and an internal address which is determined as follows (see Table 6).

If this DO is the only DO, or if it is the first of a nest of DO's, specifying a particular range statement number, then this internal address is the program address of the next statement outside the DO range, i.e., the address to which control should go when the DO or DO nest is satisfied. This outside address is found by the Statement Number Load circuit at the time the last statement in the range appears in the I/O buffer for loading. The circuit first detects that a matching statement number in the Symbol Table is preceded by a λ. It then extracts and saves the Program Area address of the first DO and the last DO, if there is a nest, or simply the only address if there is just one. The statement number is put in the Program Area as always. In addition, the Program Area address of the λ token of the last DO in the nest is also put in the Program Area immediately following it. In addition, a special flip-flop, the LSFF, is set. The loading circuit for each statement type allowed to be the last statement in a DO range tests this LSFF after it has loaded the statement into the Program Area. If it is on, the current contents of the Program Counter, the address of the next statement outside the DO range, are used as the internal address in the first (or only) DO of the nest. It should be noted that this DO range statement number together with its own Program Area location will also appear in the Symbol Table without a preceding λ. This is necessary because it is possible (and even legal in some cases!) to have an IF or GO TO refer to it also.

If this DO is one of a nest of DO's, the internal address is the Program Area address of the λ token of the next preceding DO statement. This is easily found by a Symbol Table search for the range statement number, since there is an entry in the Symbol Table corresponding to every DO statement. Thus for a DO nest three deep all ending in statement number 100, for example, there will be three entries in "DO nest order" of the number 0100, each followed by the corresponding DO statement Program Area address.

The method used to design the circuits which implement these functions is the same in each case. From the English language description of the function a sequential circuit state diagram is constructed. The circuit is then synthesized from the state diagram using established methods. The state diagrams of the Arithmetic Statement Loading circuits and the Variable Match Unit, which are used during Loading, are shown in the Appendix. The hardware implementation of the Variable Match Unit state diagram is also described there.

Executing a program

When the END statement signaling the end of a source program is encountered by the scan unit, the machine leaves its load mode, executes an automatic RESET, and enters the execution mode. (Reset forces the address 100 into the Program Counter.) Pressing the console start button causes statement execution to begin at the first executable statement, which is always found at memory address 100. There is a separate statement execution circuit for each statement type. In addition, the Statement Number processing circuit reacts to a digit as the first symbol in a statement. Each of these circuits is in an initial state when execution begins. One and only one can leave its initial state when the first symbol of a statement is read from memory. The responding circuit then retains control as it executes the statement until the | (end of statement symbol) is read from memory. It then returns to its initial state. The first symbol of the next statement, as indicated by the Program Counter, is read and causes some circuit to leave its initial state, etc. Thus the first symbol of a statement acts like the "operation code" portion of a conventional computer instruction word. The first symbol must be (since the load circuitry causes this) one of the 8-bit tokens for the various statement types, or a digit of a statement number, or the alphabetic character of the variable "a" on the left of the = symbol of an arithmetic statement. The tokens are represented in this paper as shown in Table 1.

Table 1

Statement type                  Token
GO TO n                         GOTO
GO TO (n1, n2, ..., nm), i      COMGOTO
IF (e) n1, n2, n3               IF
PAUSE                           PAUSE
DO n i = m1, m2, m3             DO
CONTINUE                        CONTINUE
READ                            READ
PRINT                           PRINT
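The opcode-like role of the first symbol can be illustrated with a small dispatch routine. This is only a sketch under stated assumptions: the token names come from Table 1, but representing tokens as Python strings, and the function name `select_circuit`, are hypothetical conveniences rather than anything specified in the paper, where each token is a single 8-bit code and the selection is done by hardware circuits leaving their initial states.

```python
# Token names as listed in Table 1. Modeling them as strings is an
# assumption made for illustration; the machine uses 8-bit codes.
STATEMENT_TOKENS = {
    "GOTO", "COMGOTO", "IF", "PAUSE", "DO", "CONTINUE", "READ", "PRINT",
}


def select_circuit(first_symbol: str) -> str:
    """Name the execution circuit that would respond to a statement's first symbol."""
    if first_symbol in STATEMENT_TOKENS:
        return first_symbol                 # a statement-type token
    if first_symbol.isdigit():
        return "STATEMENT_NUMBER"           # a digit of a statement number
    if first_symbol.isalpha():
        return "ARITHMETIC"                 # variable on the left of the = symbol
    raise ValueError(f"no circuit responds to {first_symbol!r}")


for symbol in ("GOTO", "7", "X"):
    print(symbol, "->", select_circuit(symbol))
```

Exactly one branch applies to any legal first symbol, which mirrors the statement that one and only one circuit can leave its initial state.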
It is possible, however, for the DO execution circuitry to leave its initial state either by reading of the DO or by reading of the λ token immediately following it. The former causes DO initialization, the latter causes DO indexing and testing, as will be described later. The action of the execution circuits is briefly given below.

Statement number processing

When the first symbol of a statement is a digit this circuit is energized. If there are only four digits (packed into two memory words) the circuit returns to its initial state and the remainder of the statement is executed. If there are eight digits (packed into four memory words), the last four digits (the address of the λ of the last, or only, DO in a nest) are saved in a register, SSAR. The LSFF is turned on, the circuit returns to its initial state, and the remainder of the statement is executed. If the remainder of the statement is not an IF, GO TO, or DO statement, the execution circuitry in control executes the statement and then tests for the LSFF being on. If it is on, the Program Counter contents are replaced with the SSAR contents, the LSFF is reset, and the circuit returns to its initial state. In this case the SSAR holds the program address of the λ token of the innermost DO. When this λ is read, DO indexing and testing take place. If the LSFF is off, the circuit returns to its initial state.

GO TO n

The GOTO token energizes this circuit. The four-digit address (packed into two memory words) immediately following the token is extracted. The contents of this address are put into the Program Counter and the circuit returns to its initial state.

Example.¹ GO TO 15$ (Table 2).

Computed GO TO

The COMGOTO token energizes this circuit. The initial alphabetic symbol of i, now immediately following the token, is read and discarded and the four-digit address immediately following is extracted. The contents of this address (the current value of i) are put into a register and decremented by one. If the result is zero, the four-digit address following the left parenthesis is extracted. The contents of this address are put into the Program Counter and the circuit returns to its initial state.

¹All examples are written as though this statement or statements were the first in the program.
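The indirect addressing used by GO TO can be sketched in a few lines. The sketch below is an illustration under assumptions, not the machine's actual memory layout: the class `SymbolTable`, its methods `refer` and `define`, and the function `execute_goto` are hypothetical names, and a Python list stands in for the Symbol Table words. It shows how a statement number referenced before its statement is loaded leaves a blank entry that is filled in later, and how GO TO then reaches its target through that entry.

```python
class SymbolTable:
    """Toy model: one slot per statement number, holding a Program Area address."""

    def __init__(self):
        self.entries = {}   # four-digit statement number -> slot index
        self.slots = []     # slot index -> Program Area address, or None if blank

    def refer(self, number: int) -> int:
        """Return the slot for a statement number, creating a blank one if needed."""
        key = f"{number:04d}"          # statement number 13 is carried as 0013
        if key not in self.entries:
            self.entries[key] = len(self.slots)
            self.slots.append(None)    # left blank until the statement is loaded
        return self.entries[key]

    def define(self, number: int, program_address: int) -> int:
        """Record where the numbered statement was placed in the Program Area."""
        slot = self.refer(number)
        self.slots[slot] = program_address
        return slot


def execute_goto(table: SymbolTable, slot: int) -> int:
    """Return the new Program Counter value: the contents of the referenced slot."""
    target = table.slots[slot]
    if target is None:
        raise LookupError("statement number was referenced but never loaded")
    return target


table = SymbolTable()
slot = table.refer(15)        # GO TO 15 loaded before statement 15 itself
table.define(15, 140)         # statement 15 later loaded at Program Area address 140
print(execute_goto(table, slot))
```

The GO TO statement stored in the Program Area holds only the slot, so a forward reference needs no patching of the statement itself.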
[Tables 2 to 7 appear here in the original; their contents are not recoverable from this copy.]
Arithmetic Statement execution. These storage areas for partial results are called d_i0, d_i1, where i specifies the "level" at which computation is taking place. i is equal to zero until a left parenthesis is encountered, which increases the current value of i by 1. An exception occurs if the left parenthesis immediately follows the = symbol. In this case the level remains at zero. It is also necessary to store control information which relates to these partial results.

Two control values are required at every level. The count of left parentheses at any i level is stored as a number, l_i. Before i is incremented, the incomplete arithmetic operations still required at the current level are indicated by giving an indicator t_i the value 1, 2, or 3. Also needed are indicators t_i^+ and t_i^* to distinguish + from - and * from /.

To clarify the significance of these control values an analysis will be made of the following expression, which contains some unneeded but legitimate sets of parentheses:

A = ((B + (C/((D + E*(F)))) + G))$

1 The circuit reads and saves the address of A, then reads and discards the =, which puts the circuit at the level i = 0. The first two left parentheses cause l_0 to be set to 2. The value of B is stored in d_00. The plus sign followed by a left parenthesis causes the indicator t_0 to be set to 1 to indicate the condition "B + (". Since we might in other cases find "B - (", t_0^+ is set to zero to indicate the plus sign.

2 The left parenthesis also causes i to be incremented to one and, since it is the only one at this level, l_1 is also set to 1. The value of C is stored in d_10. The division symbol followed by a left parenthesis causes t_1 to be set to 2 to indicate the condition "C/(". Since we might find "C*(" in other cases, t_1^* is set to 1 to indicate the division.

3 The left parenthesis also causes i to be incremented to 2 and the next left parenthesis increments l_2 to 2. The value of D is stored in d_20 and the value of E put into d_21, respectively. The multiplication symbol followed by a left parenthesis causes t_2 to be set to 3 to indicate the condition "D + E*(". t_2^+ and t_2^* are each set to zero to indicate the plus and multiplication symbols, respectively.

4 The left parenthesis before the F causes i to be incremented to 3 and l_3 to be set to 1. The value of F is placed in d_30. The Arithmetic Statement circuit¹ always puts the final value computed at any level into the arithmetic unit register, SR. It does this whenever l_i = 0 for any i. Clearly l_i must be decremented by one for each right parenthesis.

5 Therefore the first right parenthesis after the F causes l_3 to equal zero. This condition causes the value stored in d_30 to be placed in the SR. The value of i is decremented to 2.

6 t_2 being 3 (and t_2^* = 0) causes the computation, d_21 * SR, to be made and stored in d_21. Since t_2^+ is zero, the computation d_20 + d_21 is made and stored in d_20. The next two parentheses after F cause l_2 to equal zero. Therefore, this result is placed in the SR. The value of i is decremented to 1.

7 Since t_1 is equal to 2 and t_1^* is equal to 1, the computation d_10/SR is made and stored in d_10. The final parenthesis after F causes l_1 to equal zero. Therefore this result goes to the SR. i is decremented to zero.

8 Since t_0 is one and t_0^+ is zero, the computation, d_00 + SR, is made and the result is stored in d_00. The +G causes the computation d_00 + G to be made and the final two parentheses cause l_0 to be zero; therefore the value in d_00 is placed in SR. (If another right parenthesis were found, this would cause an error condition to be indicated.) The $ symbol causes the contents of SR to be stored at the previously saved memory address for A.

Any subscripted variable addresses are computed easily from the initial DIMENSION statement information, saved in the Symbol Table, and the current value of the subscripts. Assume the first data location for an array A(I, J) is stored at a location A_base + 1. If the DIMENSION statement read DIMENSION A(5, 10), then the computation A_base + 5 * (J - 1) + I gives the correct data address for any nonzero value of I and J. (This is true only if a complete data word is stored per memory word; in this machine the expression is slightly more complicated.)

In this machine the partial result locations d_i0 and d_i1 are actually 3 words long, of course, to accommodate the data. An additional word is used to store control information, where 4 bits are used for t_i, t_i^+, and t_i^* and the remaining 4 bits for the l_i count. The i counter therefore is actually incremented or decremented by 7 instead of one. Thus at any level, of which there can be 14 since the I/O buffer is 100 words long, the l_i count can be as great as 15. This is more than adequate since it allows for 210 left parentheses, which is much longer than the I/O buffer length.

Since the appearance of the Symbol Table and Program Area would add little to this discussion, an example will be omitted.

¹Basic circuit operation at any level is described in the earlier report. See page 363, footnote 2.
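The subscript address formula and the capacity figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch assuming one data word per memory word (the simplification the text itself states); the function name `element_address` and the constant names are hypothetical, chosen only for this illustration.

```python
def element_address(a_base: int, first_dim: int, i: int, j: int) -> int:
    """Address of A(I, J) using A_base + first_dim * (J - 1) + I."""
    return a_base + first_dim * (j - 1) + i


# DIMENSION A(5, 10): only the first subscript bound (5) enters the formula.
A_BASE = 200  # arbitrary base address chosen for the example
addresses = [element_address(A_BASE, 5, i, j)
             for j in range(1, 11) for i in range(1, 6)]

# Capacity of the partial-result storage described in the text.
WORDS_PER_LEVEL = 3 + 3 + 1          # d_i0, d_i1, and one control word
BUFFER_WORDS = 100                   # length of the I/O buffer
MAX_COUNT = 2 ** 4 - 1               # largest l_i that fits in 4 bits

levels = BUFFER_WORDS // WORDS_PER_LEVEL
max_left_parens = levels * MAX_COUNT

print(addresses[0], addresses[-1], levels, max_left_parens)
```

The fifty elements occupy fifty consecutive addresses starting at A_base + 1, and the level and parenthesis counts agree with the 14 and 210 given in the text.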
Conclusion

We have illustrated in some detail that a machine for direct translation of a simple algebraic language is possible. It would therefore seem that further investigation be made of the economic position of this solution vis-a-vis the software compiler solution. Unfortunately, the present authors are not sufficiently versed in compiler construction to make such a comparison.

The actual construction of such a machine as an independent unit is probably not reasonable except under particular circumstances in which only small one-shot scientific problems form the bulk of the computing. However, as an adjunct to a larger general purpose machine, it may well serve a need as a hardware interpreter for widely used higher level languages. As a result of a fairly complete design of the control circuits of this machine, it is estimated that 10,000 diodes and 100 flip-flops would be needed for these alone (not including arithmetic circuits). The design techniques used are simple and straightforward but rather expensive. These designs should probably only be considered for use with integrated circuitry.

References

AndeJ61; BashT64; International Business Machines Corporation, General Information Manual; FORTRAN, Form F28-807401, December, 1961; IBM 1620 FORTRAN: Preliminary Specifications, Form J29-4200-2, April, 1960

APPENDIX 1

The variable match unit (VMU)¹

The Symbol Table at the end of the load mode should contain all variable names used by the program, together with empty locations reserved for data associated with these names. The Program Area at the end of the load mode should have a program in which all variable names have been modified in that only the first letter is retained, followed by the Symbol Table address of the data associated with this name. Since any variable name may appear many times in a program, a search is required, during the loading, to see if the name already exists in the Symbol Table. The search of the Symbol Table (ST) consists of comparing each name there with the variable name in the statement being loaded. All statements are loaded by an appropriate circuit of Fig. 3 from the I/O buffer and into the Program Area of the memory. Therefore the variable name in the Statement exists physically in the I/O buffer. It is the function of the VMU to make this search when energized or "called" by the loading circuits for DIMENSION, DO, computed GO TO, READ, PRINT, IF and Arithmetic statements in which variable names appear. The output action of the VMU is to set either the OK, AOK or EOL flip-flops. These flip-flops respectively indicate that the ST either:

1 holds the variable in question as a result of previous loading, or

2 that the variable is subscripted and has been previously loaded by the DIMENSION statement loading circuit, or

3 that the End-of-List (EOL) token was found, indicating the absence of the variable in the ST.

The state diagram for this circuit is shown in Fig. 4. When the VMU is triggered by the START signal in state 0, the circuit goes to state 1; the next clock pulse sends it to state 2, from which it starts its search of the ST. In going from 1 to 2, the I/O Counter (CIO) contents are saved in register SCIO since the name may have to be scanned again. The Symbol Table Counter (STC) is initialized to 4095 since the ST is scanned sequentially downward. If a character of a variable name in the I/O buffer is found in the corresponding position of a name in the ST, the character is said to be matched. The VMU proceeds from state 2 to state 3 (Fig. 4) if the first character of the name under scan matches. Otherwise the state changes from 2 to 8, if the NO MATCH signal is given. The MATCH or NO MATCH signals are generated as a result of comparing the contents of the ST location undergoing the scan (the contents reside in the Memory Buffer Register, MBR) with the contents of the register COMP, which has the character from the I/O buffer. The first character is put into COMP by the calling circuit; thereafter the VMU picks them up in the 3-4 transition. The CIO and STC counters are incremented and decremented, respectively, and the VMU oscillates between states 3 and 4 as long as matching continues. This comparison process will terminate when either an arithmetic operator, S_o, is read from the I/O buffer, sending the circuit to state 6 from state 3, or the ST contents cause a NO MATCH signal with respect to the contents of the COMP unit, causing the transition from state 4 to 5. In state 6, if a digit is next read from the ST, corresponding in position to the appearance of the operator from the I/O buffer, clearly the names are the same and the OKFF is set to 1, and the transition from 6 to [...] is made. On the other hand, if another alphameric character in the ST corresponds to an operator, S_o, in the I/O buffer, the names are not the same and the transition from 6 to 5 is made. In state 5 the circuit just reads to the end of this nonmatching name in the ST. A digit at the end of the name causes the transition 5-7, during which the STC is stepped over the 3 data locations to the next ST entry and the CIO reinitialized ...

¹Symbols used in this Appendix are described in Table 8.
[Fig. 4. State diagram of the Variable Match Unit (VMU). The diagram is not recoverable from this copy; edge labels that survive include READ (STC), READ I/O, MATCH, NO MATCH, SET OKFF, SET AOKFF, SET EOLFF, CIO -> SCIO, and SCIO -> CIO.]
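The three outcomes of a VMU search can be summarized in a short functional sketch. It deliberately ignores the character-by-character state transitions and the downward scan from address 4095, and models the Symbol Table as a Python list; the names `Entry` and `vmu_search` are hypothetical and are not taken from the paper.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    name: str           # variable name as stored in the Symbol Table
    subscripted: bool   # True if the entry was loaded by a DIMENSION statement


def vmu_search(symbol_table: List[Entry], name: str) -> str:
    """Return the flip-flop the search would set: 'OK', 'AOK', or 'EOL'."""
    for entry in symbol_table:                 # entries are examined one after another
        if entry.name == name:                 # all characters matched and the name ended
            return "AOK" if entry.subscripted else "OK"
    return "EOL"                               # End-of-List token reached, name is absent


table = [Entry("ALPHA", False), Entry("B", True)]
for candidate in ("ALPHA", "B", "AL"):
    print(candidate, vmu_search(table, candidate))
```

Comparing whole names makes "AL" distinct from "ALPHA", which in the hardware corresponds to the requirement that the operator in the I/O buffer line up with the digit that ends the name in the ST.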
match
SCIO 0100 0000 1001 0101 -> STC
STC -+
from state
Therefore the execution of the above microsteps, in that order,
,
which indicates a unique state of the state diafinally the third comes from the control cycle counter.
MAR
READ
3.
AND
The output of each AND gate is a line indicating a unique microstep. The AND's feed OR gates, which actually energize the given
read cycle.
This signal causes the
Each
skeletal counter
address register. initiates a
,
in Fig. 6.
gate in this figure has 3 inputs except those not requiring input
CHANGE STATE
This signal causes the CIO to be incremented by one. CIO—> MAR This signal causes the CIO to be gated to the
READ This signal CHANGE STATE
should
we are in, the outputs of the decoder of the control cycle counter, and the input lines (S v S MATCH,
are:
fCIO
memory
it
use the outputs of the skeletal counter which will
MATCH
2 of Fig. 4, if a signal is present we are supposed to increment the CIO counter and then read the I/O buffer.
Consequently the microsteps required
present,
line information.
(Fig. 4).
to realize a circuit
will
the "/„ in the
is
indicate to us the state
ones are initially represented in a state diagram form, such as the state diagram for the loading of the Arithmetic Statement (Fig. 5)
MATCH signal
to state
CHANGE STATE

State 2 and MATCH: INCREASE CIO; CIO -> MAR; READ; CHANGE STATE
State 2 and NO MATCH: CHANGE STATE
State 5 and S_v: DECREASE STC; STC -> MAR; READ; CHANGE STATE
State 5 and d: DECREASE STC; DECREASE STC; DECREASE STC; SCIO -> CIO; CIO -> MAR; READ; CHANGE STATE

Fig. 6. State diagram implementation.
In state 0 of Fig. 4 a START VMU signal takes it to state 1. This is accomplished by the top AND of Fig. 6. The only microstep needed is CHANGE STATE. In state 1 of Fig. 4, the next clock pulse (after reaching state 1) causes a transition to state 2. In this case we need to save CIO contents in register SCIO (CIO -> SCIO), set the STC to 4095 (4095 -> STC, shown above in BCD form), and get the contents of the address now in the Symbol Table Counter (READ(STC)). This latter is implemented by the two microsteps STC -> MAR followed by a READ command to the core memory. This transition from 1 to 2 of Fig. 4 is accomplished by the next AND gates shown in Fig. 6. The next AND gates shown accomplish the transition from state 2 to 3 if there is a MATCH. The next AND accomplishes the transition from 2 to 8 if there is NO MATCH (in this case nothing need be done). Finally the lowest two groups of AND gates implement the required microsteps as the circuit changes from state 5 to 7 if a 4-bit digit code is sensed, or cause the circuit to remain in state 5 after decrementing the STC if an 8-bit variable code is read.
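The way a (state, input) pair selects an ordered list of microsteps can be mimicked with a lookup table. The sketch below is an illustration, not the gate-level design of Fig. 6: the microstep names and the transitions 2 to 3, 2 to 8, and 5 to 7 follow the text, while the dictionary layout, the register dictionary, and the function `run_transition` are assumptions made for this example. A READ is modeled simply by recording the address held in MAR.

```python
# Ordered microsteps for three of the transitions described in the text.
MICROSTEPS = {
    (2, "MATCH"): ["INCREASE CIO", "CIO -> MAR", "READ", "CHANGE STATE"],
    (2, "NO MATCH"): ["CHANGE STATE"],
    (5, "d"): ["DECREASE STC", "DECREASE STC", "DECREASE STC",
               "SCIO -> CIO", "CIO -> MAR", "READ", "CHANGE STATE"],
}

# State entered when CHANGE STATE is executed for each pair.
NEXT_STATE = {(2, "MATCH"): 3, (2, "NO MATCH"): 8, (5, "d"): 7}


def run_transition(state: int, signal: str, regs: dict) -> int:
    """Apply the microsteps for (state, signal) in order and return the new state."""
    for step in MICROSTEPS[(state, signal)]:
        if step == "INCREASE CIO":
            regs["CIO"] += 1
        elif step == "DECREASE STC":
            regs["STC"] -= 1
        elif step == "SCIO -> CIO":
            regs["CIO"] = regs["SCIO"]
        elif step == "CIO -> MAR":
            regs["MAR"] = regs["CIO"]
        elif step == "READ":
            regs["reads"].append(regs["MAR"])   # record the address that was read
        elif step == "CHANGE STATE":
            state = NEXT_STATE[(state, signal)]
    return state


regs = {"CIO": 10, "SCIO": 4, "STC": 4000, "MAR": 0, "reads": []}
print(run_transition(2, "MATCH", regs), regs["CIO"], regs["reads"])
print(run_transition(5, "d", regs), regs["STC"], regs["CIO"], regs["reads"])
```

In the hardware each table row corresponds to a group of AND gates whose inputs are the state, the control cycle count, and the input line; the table lookup plays the role of that decoding.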
Chapter 32
A microprogrammed implementation of EULER on IBM System/360 Model 30¹

Helmut Weber

Summary

An experimental processing system for the algorithmic language EULER has been implemented in microprogramming on an IBM System/360 Model 30 using a second Read-Only Storage unit. The system consists of a microprogrammed compiler and a microprogrammed String Language Interpreter, and of an I/O control program written in 360 machine language. The system is described and results are given in terms of microprogram and main storage space required and compiler and interpreter performance obtained. The role of microprogramming is stressed, which opens a new dimension in the processing of interpretive code. The structure and content of a higher level language can be matched by an appropriate interpretive language which can be executed efficiently by microprograms on existing computer hardware.
form which text
is
two
in a procedure-oriented language are usually
steps.
They are
first
translated into an equivalent
translation process is a data-invariant and flow-invariant operation. It consists of two parts an analytical part, which analyzes the higher level language text, and a generative part,
—
which builds up a
string of instructions that can be directly inter-
preted by a machine. The analytical part of the translator depends on the higher level language; the generative part depends on a set of instructions interpretable by a machine. Historically there
was only one this
it is
conceivable to compile a program written in a higher microprogram language string. This string
would undoubtedly contain substrings which occur over and over in the same sequence. We could call these substrings procedures and move them out of the main
string, replacing their occurrence symbol, followed by a parameter designator pointing to the particular procedure. Our object program then
by a procedure
call
set of instructions
by a machine,
its
which could be interpreted
"machine language." Figure
1
it is
quence of "procedure designators."
The process just described will
result in the definition of a string
language and the development of a microprogrammed interpretation system to interpret texts in this string language. is
IBM System/360
360 language. Programs written in a higher level language are compiled into string language text to be stored in main storage.
The
string language interpreter corresponds to the
effi-
family are
preted by wired-in logic. Therefore, in a certain sense the 360 language is not the "machine language" of these processors but the (efficiently interpretable) language in which the processors of
^Comm. ACM,
microprogram
outlines
microprogrammed machines. On them the "360 machine language" is interpreted not by wired-in logic but by an interpretive microprogram, stored in control storage, which in turn is inter-
vol. 10, no. 9, pp.
549-558, September, 1967.
situation
to the
Input Data
of the processors of the
The
similar to the System/360 case: the string language corresponds
scheme.
Some
From
only a final step to eliminate the call symbols and furnish an interpreting mechanism which interprets the remaining se-
more
efficiently interpretable; then the translated interpreted ("executed") by an interpretation mechanism. is
The
ciently
Now
level language into a
here
Programs written processed
the elementary operations of the machine as operators and the elements of the data flow and storage as operands.
takes on the appearance of a sequence of call statements.
Introduction
in
the System/360 family are compatible. The true "machine language" of these processors is their microprogram language. This language is on a lower level than the "360 language"; it contains
p
Analysis
Progrom
in
*
t
I
^
Higher-level
Text
i,
Language
(language dependent,
I
Generation
'
Intermediate
———
*
.
,
,
f
..
(machine J dependent) 1
I
Program
in
Chapter 32
which interprets 360 language texts. It consists of a recognizing part to read the next consecutive string element and to branch to an appropriate action routine and of action routines to execute the particular procedure called for by the string element. The essential difference between our situation and the 360 case is that the string language reflects the features of the particular higher level language as well as the features of the particular hardware better than the general purpose 360 language.

What is gained by defining this string language and by providing a microprogrammed interpreter for it?
From the method of definition described, it can be seen that the elements of the string language correspond directly to the elements of the higher level language after all simplifying data-invariant and flow-invariant transformations have been performed. But the elements of the string language are also well-adapted to the microprogram structure of the machine. Therefore, during the compiling process (see Fig. 2) only a minimum of generation is necessary to produce the string language text. The compiler is shorter and runs faster.

But the more important aspect is that object code execution is also faster. The string language interpreter in case 2 will be coded to take care of all necessary operations in a concise form, whereas in case 1 it will be necessary to compile a whole sequence of machine language instructions for an elementary operation in the higher level language. Examples of this are the compilation of 360 code for an add operation in COBOL of two numbers with different scaling factors or the compilation of machine instructions for table lookup or search operations, etc. In these cases the string language interpreter of Fig. 2 will execute a function much faster than the machine language interpreter of Fig. 1 will execute the equivalent sequence of machine language instructions. Therefore, object code execution will be faster in scheme 2.
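The organization described above, a recognizing part that fetches the next string element and branches to an action routine, can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and not the actual EULER string language or Model 30 microprogram: the element names LOAD, ADD_SCALED and HALT, the (element, operand) encoding, and the (digits, scale) representation of scaled decimal numbers are all assumptions. The sketch only shows how a single string element can stand for a whole operation, such as an add of two numbers with different scaling factors.

```python
# Hypothetical string-language interpreter: a recognizing part reads the next
# string element and branches to the action routine that does all the work.

def act_load(stack, operand):
    # operand is (digits, scale); the value it denotes is digits * 10**(-scale)
    stack.append(operand)

def act_add_scaled(stack, _operand):
    # One string element performs the whole scaled addition that a compiler
    # would otherwise expand into a sequence of machine instructions.
    d2, s2 = stack.pop()
    d1, s1 = stack.pop()
    scale = max(s1, s2)
    total = d1 * 10 ** (scale - s1) + d2 * 10 ** (scale - s2)
    stack.append((total, scale))

ACTIONS = {"LOAD": act_load, "ADD_SCALED": act_add_scaled}

def interpret(text):
    """Run a list of (element, operand) pairs and return the final stack."""
    stack = []
    for element, operand in text:         # recognizing part: fetch next element
        if element == "HALT":
            break
        ACTIONS[element](stack, operand)  # branch to the action routine
    return stack

# 12.5 is (125, 1) and 0.25 is (25, 2); their sum 12.75 is (1275, 2).
program = [("LOAD", (125, 1)), ("LOAD", (25, 2)),
           ("ADD_SCALED", None), ("HALT", None)]
result = interpret(program)
```

In this toy encoding the scaled add costs one fetch-and-dispatch step, which is the kind of saving the text attributes to scheme 2.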
If object code performance is not as much in demand as object storage space economy, the string language interpreter can also be written such that the string language is as tightly packed as
A microprogrammed implementation of EULER on IBM System/360 Model 30
Part 4 The instruction-set processor level: special-function processors
Section 4 Processors based on a programming language
I
|
and storage
is
new N:
allocated dynamically,
N-
begin
new
EULER
A
A;
The IBM 1800
Part 5 The PMS level
Section 2 Computers with one central processor and multiple input/output processors
[Figure residue: IBM 1800 configuration diagram. Legible labels include storage, data channels, card I/O, paper tape, printer and plotter, additional printers no. 5 through 8, optional features, digital output control (DOC), and analog-to-digital converter (ADC) models; the layout, feature numbers, and maximum counts are not recoverable.]
Chapter 33
2 An interval timer has counted a previously set time interval.
3 A magnetic-tape drive has completed a data transfer previously requested and is ready for another request.
4 An operator has initiated an interrupt from the Pc console.
5 A device such as a typewriter has just printed a character and is ready to receive the next one.

Primary-memory communication and data transmission with terminals and secondary memory

Two methods are used to transmit data between Mp and Ms, or Mp and T. First, low-speed devices are controlled directly by the program. Each character or word of data is transmitted to or from the Pc and onto T by means of an Execute I/O (XIO) instruction. The Pc program and device synchronization are accomplished by using the interrupt mechanism. Devices operating under direct program control include typewriter, printer, plotter, paper tape (reader and punch), analog-to-digital converters, contact sense, voltage-level sense, pulse counters, etc.

The second method of transferring data is via the Pio('Data Channel)'s. The Pio program is started by the XIO instruction of the Pc. The transfer of data words then proceeds under control of the specified Pio, completely asynchronous to and in parallel with Pc program operation. The Pio gains Mp access independent of Pc (Pc operation is suspended for one Mp cycle). During the cycle, the data are taken from or placed into core storage by Pio (via internal Pc control and registers). As soon as the Pio has been satisfied, which normally takes one cycle, the Pc proceeds. The logical state of the Pc, or the Instruction-set Processor, is not changed by Pio's access to Mp. This method of access is referred to as "cycle stealing." Devices (Ms and T) operating under Pio control include magnetic tapes, disks, line printer, card reader-punch, and the link to the IBM System/360. Some devices can operate under both Pc and Pio control, depending on their characteristics and the configuration, e.g., analog input, analog output, digital input, and digital output.

Process I/O, controls and transducers

Analog inputs. Analog-input equipment includes analog-to-digital converters, multiplexors, amplifiers, and signal conditioning equipment to handle various analog-input signals. The data input rates are up to 20,000 16-bit samples per second, with program selectable resolution and external synchronization. There can be 1,024 (via relay) and 256 (via high-speed solid state) multiplexed analog-input channels connected to a single K (analog-to-digital converter). The Configurator¹ (Fig. 2) shows the allowable inputs.

¹A descriptive name undoubtedly concocted by one of IBM's marketing departments.

Digital inputs. Digital Input provides up to 384 process interrupts; up to 1,024 bits of contact sense, digital input, or parallel register input; and 128 bits of event input counters as 1-, 8-, and 16-bit counting registers.

Analog outputs. Up to 128 analog outputs can be provided.

Digital outputs. Digital Outputs provide up to 2,048 bits of pulse output, contacts, and registers.

IO processors (data channels)

Pio('Data Channels) give a T or Ms the ability to communicate directly with Mp. For example, if an input unit requires a primary memory cycle to store data that it has collected, the Pio communicates directly with Mp and stores the data. The Pio's run even if Pc is waiting. The Pio's have two registers: a Word Count which is used to count the number of words being transferred in a block between a device and Mp memory; and a Channel Address which points to the next word transferred in a block. The Channel Address is also used to select the next instruction in the Pio program for the next block transfer task.

Two basic types of Pio's are used, nonchaining and chaining. The Pio's provide the ability to transfer either a single block (nonchaining) or multiple blocks (chaining) directly to Mp independent of Pc.

The central processor

Registers in the physical processor

Figure 4 shows the relationship of the registers in Pc, together with those in the Instruction-set Processor. Those registers accessible by the program are shown with an asterisk (*). All the registers are accessible from the console. A description of the functions of each register is given below.

Storage address register (SAR). All Pc references to Mp are selected by this 16-bit register. Pio references to Mp use the Channel Address Register (CAR) of the active Pio.

Instruction register (I). This 16-bit counter register holds the address of the next instruction.

Storage buffer register (B). This 16-bit register is used for buffering all word transfers with Mp.
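The data-channel mechanism described in this chapter (a Word Count, a Channel Address, chaining versus nonchaining, and cycle stealing) can be sketched as a small Python model. This is an illustration under stated assumptions, not the 1800's actual logic: the class and function names, the list used for Mp, and the `device_period` parameter (how often the device has a word ready) are all hypothetical.

```python
# Hypothetical model of a data channel (Pio) that steals Mp cycles from Pc.

class DataChannel:
    def __init__(self, memory, blocks, chaining=False):
        self.memory = memory          # shared primary memory (Mp), a list of words
        self.blocks = list(blocks)    # pending (start_address, words) block transfers
        self.chaining = chaining      # a chaining channel continues with the next block
        self.channel_address = 0      # address of the next word to be transferred
        self.word_count = 0           # words remaining in the current block
        self.data = []
        self._next_block()

    def _next_block(self):
        if self.blocks:
            start, words = self.blocks.pop(0)
            self.channel_address, self.data = start, list(words)
            self.word_count = len(self.data)

    def busy(self):
        return self.word_count > 0

    def steal_cycle(self):
        """Use one Mp cycle to store one input word."""
        self.memory[self.channel_address] = self.data.pop(0)
        self.channel_address += 1
        self.word_count -= 1
        if self.word_count == 0 and self.chaining:
            self._next_block()

def run(channel, pc_steps, device_period=3):
    """Interleave Pc work with stolen cycles; return (total_cycles, pc_steps_done)."""
    cycles = pc_done = 0
    while pc_done < pc_steps or channel.busy():
        if channel.busy() and cycles % device_period == 0:
            channel.steal_cycle()     # Pc is suspended for this one Mp cycle
        elif pc_done < pc_steps:
            pc_done += 1              # Pc proceeds; its logical state was untouched
        cycles += 1
    return cycles, pc_done

mp = [0] * 16
chained = DataChannel(mp, [(4, [7, 8]), (10, [9])], chaining=True)
total_cycles, pc_done = run(chained, pc_steps=5)
```

With these made-up inputs the Pc completes its five steps and the channel takes three extra cycles, one per word moved; a nonchaining channel given the same two blocks would stop after the first.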
[Figure residue: register diagram. Legible labels include console, core storage, SAR, interval timers, operation monitor, and a connection to input devices; the layout is not recoverable.]
N THEN
GO TO
FIN
If all
n
/-tasks started,
proceed with
K
¹This is not quite accurate. The simple MOP algorithm presented here does not explicitly interlock the split sequence. There is therefore a possibility that unnecessary task-calls may be queued during the execution of the split which is to generate the nth task. The probability of this is, however, small, while the degradation arising from an interlock could be significant, and the algorithm in the form given appears more economical.
Chapter 37 A survey of problems and preliminary results concerning parallel processing and parallel processors

6.3 Macro-parallelism

Commonly used numerical algorithms, data processing procedures, and computer programs are generally sequential in nature. The reason for this is largely historical, a consequence of the fact that the mechanisms, human, mechanical, and electronic, used in developing and executing these procedures have been incapable of significant parallel activity, other perhaps than the simultaneous, coordinated use of many humans. The advent of parallel processing systems thus calls for the modification of accepted techniques to expose any inherent parallelism. The resultant procedures must then be further adapted to make parallel tasks of such a magnitude that the overhead involved in their generation becomes insignificant. But the ultimate benefit from parallel execution will be obtained only by going back to the problems themselves. These must be analyzed anew. Algorithms must be developed that make it possible to exploit the parallel executing capability, by introducing into the mathematical and program model the physical parallelism that ultimately reflects the parallelism of the phenomena or system being studied. In this need to return to fundamentals, the situation is somewhat analogous to the early days of electronic computing, when attempts at commercial application were largely frustrated until it was realized that widespread application required the development of new techniques, rather than the adaptation and mechanization of existing procedures.

At the present time, however, our direct activity in problem analysis has concentrated mainly on the adaptation of existing numerical techniques for parallel processing, for problems in which the basic macro-parallelism was self-evident. These include, for example, linear algebra and the solution of elliptic partial differential equations. In these areas the extent and nature of the parallelism had previously led to proposals for vector processing systems such as Solomon [Slotnick et al., 1962; Gregory and McReynolds, 1963] and Vamp [Senzig and Smith, 1965]. Other areas in which the parallelism is self-evident but where vector processors prove less effective are those in which the algorithms model distinct physical activities such as in file processing and Monte Carlo techniques. For all significant problems investigated [Schlaeppi, 19??] it was possible to establish the existence of parallel tasks of such a length that tasking overheads could be expected to be negligible.

Other classes of problems have been studied, both in terms of the extension of existing algorithms and the development of new ones. In particular we refer to the extraction of polynomial roots [Shedler and Lehman, 1966], solution of equations [Shedler, 1966], and the solution of linear differential equations [Nievergelt, 1964], [Miranker and Liniger, 1967]. These various studies, not all of which were directly related to the present project, were more mathematical in nature, and to the best of our knowledge, no attempt has yet been made to develop efficient parallel computer programs. Thus, while numerical methods are beginning to emerge which enable the exploitation of macro-parallelism in the solution of time-limited problems, and from which it appears that significant reductions may be obtained in throughput times, much work remains to be done on re-programming the problems themselves.

7. Simulation

7.1 Simulation as a design tool

It has been our experience with simulation that its principal function as a design tool is to focus attention on features that require investigation and explanation. Many results, qualitative and quantitative, that are obtained during simulation experiments may also be obtained analytically. It is, however, the insight and understanding gained from the design of simulation experiments and the analysis of their results that draws attention to specific details and difficulties. The undeniable value of simulation in development and design is therefore quite different from that in system evaluation, where meaningful performance figures may be obtained when the work load is well defined.

7.2 The executing simulator

In the present study simulation was seen as fulfilling a number of additional functions. In particular it made available a usable working model of a parallel processing system. This would give potential users the incentive to undertake actual programming and to gain limited operational experience. An executing simulator was also required for the investigation of what is commonly regarded as the most immediate question in parallel processing, the extent of performance degradation due to storage-access interference and executive (queue-access) interference. Such an executing simulator is now operational and its use is discussed in the next section. We note parenthetically that a limitation of this type of simulator is its speed. For the evaluation of total system performance over any length of time, particularly when using a computer itself much slower than the simulated system, only gross, nonexecuting, simulation is reasonable [Katz, 1966].

The system presently modeled in the executing simulator includes the processors, switch, and Storage Modules of Fig. 1. The storage modules are accessed through a fully interleaved address
structure, though it is clear that in any realization interleaving will be partial, both to sustain high availability and to decrease storage interference between independent jobs. The individual processors have a System/360-like structure [Blaauw and Brooks, 1964] and execute an augmented subset of S/360 machine language. The nonstandard instructions added to the repertoire include the functions discussed in Section 4. The local store LSi, to be used also as an instruction buffer, is however not included in the model for which the interference results are quoted in the next section.

The simulator configuration is parameterized so that, for example, the numbers of storage modules and processors, instruction execution times (in storage cycles), and the nature of statistics gathered and printed may be selected for each run. The program itself is modular, and both system features and measurement facilities may be expanded or modified as required.

7.3 Simulator experiments

7.3.1 Kernels. Simulation experiments first concentrated on an investigation of storage interference arising in the execution of typical kernels from numerical analysis. The results indicated that under the limited condition of the experiments and for a storage module-to-processor ratio of two, interference would degrade performance by less than twenty percent, dropping to some five percent for storage module-to-processor ratio of eight. Addition of a local processor store and its use as an instruction buffer effectively eliminated interference, as expected, indicating that it had been substantially due to instruction-fetch interference.

These results were considered to have been generated under conditions too restrictive to permit generalization. In particular each set referred only to concurrent executions of a single loop. Thus more recent experiments have included many runs of a matrix-multiply subroutine and the solution of an electrical network problem using an appropriately modified version of the Jacobi variant of the Gauss-Seidel solution of a set of linear algebraic equations.

7.3.2 The matrix multiplication. The Matrix Multiply program was written in two versions. A classical sequential program excluding all the special instructions provided the standard on which measurement of the parallelism overhead and interference could be based. The second, parallel, program used the onion peeling rather than the MOP algorithm described in Sec. 7.2. The product matrix was partitioned by rows, with the computation of each comprising one task. The experiments were performed for square matrices of dimensions thirty-nine and forty with from one to sixteen processors and sixteen to sixty-four storage modules. Two sizes of matrices were used to isolate the effect of periodicities of array mapping commensurate with the address structure of the store, which demonstratively had significant influence on the results.

Instruction execution times for the most frequently executed instructions used in the experiment are given in Table 2. These times exclude the instruction fetch time (one instruction each fetch), since these are overlapped unless storage conflict occurs, when a request must be queued. The arithmetic operations may also include a data fetch (RX instructions) in which case a further store access time is required.

In the absence of an internal instruction buffer, processors executing the same program string interfere with each other continuously during instruction fetches. To minimize this effect for loops that are short relative to the width of the interleaving, it is profitable to unwind such loops by repetition so that the resultant string stretches as far as possible across the interleaved store. The program was unwound in this way. We note, however, that it is in fact better [Rosenfeld, 1965] to repeat the loop, appropriately modified, several times across the interleaved store, directing successive processors to successive, but unconnected, loops. This can decrease interference by as much as twenty percent over the previous case. Some results of the simulation are given in Table 3 and plotted in Figs. 5 and 6.

We note that running time (col. 4) is defined as the interval between the start of the first processor on its first task and the completion, by the last processor to finish, of its final task. Since an onion peel technique has been used for the splitting, there is an interval (of order 70 storage cycles) between the start of successive tasks. There is also an initial interval (87 memory cycles) in which the first processor initializes the program. Finally, the finish of processors is staggered and, in particular, for the sixteen-processor case, eight processors are assigned two tasks (rows) in succession, and eight, three tasks. The former processors will, of

Table 2 Execution time in storage cycles

Instruction                                  Execution time
Fixed Point Addition                         0.4
Floating Point Addition                      0.5
Floating Point Multiplication                1.0
Floating Point Division                      2.0
Terminate                                    25.0
Split                                        25.0
New Task Fetch (Part of Terminate)           25.0
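The row partitioning used in the matrix-multiply experiment can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, rows are handed out round-robin as a stand-in for the simulator's task queue (the paper's onion-peel splitting is not modeled), and the tasks are evaluated sequentially so that the result is deterministic.

```python
# Hypothetical sketch: row-partitioned matrix multiply, one task per product row.

def row_task(a, b, i):
    """Compute row i of the product a*b (one parallel task)."""
    inner, cols = len(b), len(b[0])
    return [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]

def assign_rows(n_rows, n_procs):
    """Hand out row tasks to processors in turn (assumed round-robin order)."""
    assignment = [[] for _ in range(n_procs)]
    for row in range(n_rows):
        assignment[row % n_procs].append(row)
    return assignment

def parallel_product(a, b, n_procs):
    """Evaluate every processor's rows; run sequentially here for determinism."""
    product = [None] * len(a)
    for rows in assign_rows(len(a), n_procs):
        for i in rows:
            product[i] = row_task(a, b, i)
    return product

# Forty rows on sixteen processors, as in the experiment described above.
tasks_per_processor = [len(rows) for rows in assign_rows(40, 16)]
small = parallel_product([[1, 2], [3, 4]], [[5, 6], [7, 8]], n_procs=2)
```

For forty rows and sixteen processors this assignment gives eight processors three rows and eight processors two, matching the staggered finish the text describes.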
Chapter 37
Table 3
A survey
of problems
and preliminary
results concerning parallel processing
and
parallel
processors
[Figure residue: plot with vertical axis in storage kilocycles (10 to 600); curves and caption not recoverable.]
level
Section 3
Computers
for multiprocessing
and
parallel
processing
[Figure residue: plot of storage kilocycles against number of processors, 64 storage modules, inner loop sizes of 2, 3, and 4 equations; curves not recoverable.]
Fig. 9. Total processor and throughput times in electrical network analysis, 64 storage modules.

which must be understood within the framework of a numerical analysis of the relaxation solutions. Figures 7, 8, and 9 present the basic performance data, throughput time, and total processor time, for a total of one hundred and forty-four cases. The variables are the number of processors in the system (12 cases), the size of the inner loop as represented by the number of currents (from 2 to 5) evaluated in the loop, and the number of interleaved storage modules (16, 32, 64).

These curves clearly indicate the reduction in throughput time to be obtained from the use of parallel processing, the consequent increase in processor cost due to interferences of various sorts, the resultant effect of diminishing returns, and the actual increase in throughput time, when too many processors chase too few equations and generally get seriously "into each other's way." For the smaller inner loops and when interference between processors is low, total processor times vary somewhat erratically. The causes for this are related to the relaxation pattern and the rate of convergence in each case. In fact there appears strong circumstantial evidence that an ad hoc procedure, which does not guarantee sequential evaluation of the equations, improves performance. This point, however, requires further study.

Figure 10 reproduces some of the results of the previous three figures for the case of a five-equation inner loop. Table 4 lists these same results as a percentage of the time using one processor and compares them with the reciprocal of the number of processors. Figure 11 indicates storage interference and parallel processing overheads as a function of the number of processors, with storage modularity again a parameter and an inner loop again comprising
Table 4 Run time for resistor network system relative to the run time using one processor, with a five equation inner loop
factor. Such a factor is intuitive and environment-sensitive, depending on the relative concern for speed and for costs of various sorts. For the present data we have chosen to display a function:

K / (throughput time × total processor time)

where K is a constant, throughput time a measure of the speed of computation, and total processor time a measure of the cost.

Any ultimate evaluation of a parallel processing system within a working environment depends on actual operating experience. This in turn requires the existence of a system and the interest of users. Only when usable systems become available will the concept of parallel processing in integrated systems be accurately evaluated.
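As a rough illustration of how such a factor trades speed against cost, the Python sketch below assumes the displayed function has the form K / (throughput time × total processor time); that form is a reading of the damaged original, not a confirmed formula. The function name, the value of K, and the three sample measurements are hypothetical and are not data from the paper.

```python
def effectiveness(throughput_time, total_processor_time, k=1.0):
    """Assumed form of the function: K / (throughput time * total processor time)."""
    return k / (throughput_time * total_processor_time)

# Hypothetical measurements in storage kilocycles (NOT from the paper):
# (number of processors, throughput time, total processor time)
runs = [(1, 600.0, 600.0), (4, 170.0, 680.0), (16, 90.0, 1440.0)]

scores = {n: effectiveness(t, p, k=1.0e6) for n, t, p in runs}
best = max(scores, key=scores.get)
```

Under this assumed form, adding processors raises the score only while the drop in throughput time outweighs the growth in total processor time, which is the diminishing-returns behavior the text describes.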
8. Conclusion

In this paper we have presented some thoughts on parallel processing. In particular we have chosen to survey the topic by including an extensive bibliography and some of the results of our work in this area. The discussion has had to be brief, but our intention has been to convey the picture of the potential that parallel processing systems offer for the future development of computing. The key to successful exploitation lies in a new, unified, and scientific approach to the entire problem of the design and usage of computing systems. The development of large, integrated systems raises many problems, but there can be no doubt that economic solutions to these will be found. Their development should comprise a significant part of the computer system architectural design effort of the next few years.

References

BlaaG64; BrigH64; ConwM63; CorbF65; DennJ66; DesmW64; DreyP58; FalkA64; GillS58; GregJ63; KatzJ66; LehmM65; LeinA59; McCuJ65; MiraW67; NievJ64; RoseJ65; SchlH??; ShedG66a, b; SlotD62; SmitR64; PL/I Language Specification, Form C28-6571

Bibliography

AlleM63; AmdaG62; AndeJ62, 65; ArdeB66; BaldF62; BlaaG64; BrigH64; BuchW62; BussB63; CoddE62; ComfW65; ConwM63; CorbF62, 65; CritA63; DaleR65; DennJ65, 66; DesmW64; DijkE65; DreyP58; ErnsH63; EstrG60, 63; EwinR64; FalkA64; ForgJ65; FranJ57; GillS58; GlasE65; GregJ63; HellH61, 66; KatzJ66; KinsH64; KnutD66; LehmM63a, 63b, 65; LeinA59; LourN59; MarcM63; McCaJ62; McCuJ65; MeadR63; MillW63; MiraW67; NievJ64; OssaJ65; PennJ62; RoseJ65; SchlH??; SeebR63; SenzD65; ShedG66a, 66b; SlotD62; SmitR64; SquiJ63; StraC59; VyssV65; WirtN66; IBM OS/360 PL/I Language Specification, Form C28-6571; Proc. IFIP 1962 "Symposium on Multi-Programming," 1963.
Section 4

Network computers and computer networks

The RW-400 and the CDC 6600 are actually computer networks by our definition of a computer (Chap. 2, page 17). Yet because of the restrictions on the quantity and location of the components in these structures, we still consider them to be computers. On the other hand, two or more computers which are separated physically, yet connected, constitute a computer network. Computer networks will appear in the future; it is important to understand the basis for them.

The RW-400 - a new polymorphic data system

Chapter 38 presents the RW-400 (also called the AN/FSQ-27), a later version of the Ramo-Wooldridge RW-40 originally designed in 1959. The diagram (page 478) gives an indication of the relationship and names of the components. The PMS structure in Fig. 1 has more configuration details. At least six RW-400's were built for military command and control applications (although the number of computers of a type in existence has little to do with a machine's worth or ability).

The RW-40 ISP as given in Appendix 1 of Chap. 38 is a good example of a processor with a two-address instruction set. The ISP does not have index registers; it has a small state consisting of the accumulator (A), a limited extended accumulator (B), the program counter (P), and about 6 state bits. The Pc is limited by its ability to address directly only a 1,024-word Mp. The ISP is undoubtedly sufficient for solving the kinds of problems encountered by the computer and compares favorably with Whirlwind and the IBM 1800.

The RW-40 introduced multiple parts for reliability [Rothman, 1959]. Multiple C's (or Mp-Pc and Mp-Pio) are provided for redundancy and capacity. However, the S('Central Exchange) which provides communication among the C's may not have redundant parts. The multiple-computer concept can be viewed as the forerunner to our present computer networks, in which the central switching element is the Telephone Exchange. Over a longer time span, the RW-400 may be most significant as a pioneer. However, the whole system, with the exception of the small Mp's, is nicely designed. The problem of low speed T(typewriter, display)'s is handled well by transferring data from Mp-Pc to Ms(drum) for concurrent and independent T and P activity. Similar solutions are common for managing T activity by using an M, local to particular T's, and local C's. The structure should be compared with the CDC 6600 (Chap. 39) and the network examples in Chap. 40.

The CDC 6400, 6500, 6600, 6416, and 7600

The CDC 6600 development began in 1960, using high-speed transistors and discrete components of the second generation. The first 6600 was delivered in September, 1964. Subsequent compatible successors included the 6400, in April, 1966, which was implemented as a conventional Pc (a single shared arithmetic function unit instead of the 10 D's); the 6500 in October, 1967, which uses two 6400 Pc's; and the 6416 in 1966, which has only peripheral and control processors. The first 7600, which is nearly compatible, was delivered in 1969. The dual processor 6700, consisting of two 6600 Pc's, was introduced in October, 1969. Subsequent modifications to the series in 1969 included the extension to 20 peripheral and control processors with 24 channels. CDC also marketed a 6400 with a smaller number of peripheral and control processors (e.g., 6415-7 with 7). Reducing the maximum PCP number to 7 also reduced the overall purchase cost by approximately $56,000 per processor.

The computer organization, technology, and construction are described in Chap. 39. ISP descriptions for both the Pc and Pc('Peripheral and Control Processors/PCP) are given in Appendices 1 and 2 of Chap. 39.

To obtain the very high logic speeds, the components are placed close together. The logic cards use a cordwood-type construction. The logic is direct-coupled transistor logic, with 5 nanoseconds propagation time and a clock of 25 nanoseconds. The fundamental minor cycle is 100 nanoseconds and the major cycle is 1,000 nanoseconds, also the memory cycle time. Since the component density is high (about 500,000 transistors in the 6600), the logic is cooled by conduction to a plate with Freon circulating through it.

This series is interesting from many aspects. It has remained the fastest operational computer for many years. Its large
472
Part 5
The
PMS
Section
level
why we consider the 6600
Each
to be fundamentally a network.
Cio (actually a general-purpose, 12-bit C) can easily serve the specialized Pio function for Cc.
The Mp
of
Cc
is
an Ms
for a Cio,
By having a powerful Cio, more complex input-output tasks can be handled without Cc intervention. These tasks can
4
Network computers and computer networks
write accesses to store results. valid
We would
agree that this
is
a
programs (e.g., look at a FORarithmetic statement), and it is probably valid for most
assumption for
TRAN
scientific
of course.
other programs as well.
include data-type conversion, error recovery, etc. The K's which
Cc has provisions for multiprogramming in the form of a protection and relocation address. The mapping is given in the
are connected to a Cio can also be less complex. Figure 2 has
ISP description for both
about the same information as Thorton's
/ECS).
Fig. 1
block diagram
A more detailed PMS diagram for the C('6400, '6416, '6500, and '6600) is given in Fig. 3. The interesting structural aspects can be seen from this diagram. The four configurations, 6400 ~ 6600, are included just by considering the pertinent parts of the structure. That is, a 6416 has no large Pc; a 6400 has a single straightforward Pc; a 6500 has two Pc's; and the 6600 has a single powerful Pc. The 6600 Pc has 10 D's, so that several parts of a single instruction stream can be interpreted in parallel. A 6600 Pc also has considerable M.buffer to hold instructions so that Pc need not wait for Mp fetches.

The implementation of the 10 Cio's can be seen from the PMS diagram (Fig. 3). Here, only one physical processor is used on a time-shared basis. Each 0.1 μs a new logical P is processed by the physical P. The 10 Mp's are phased so that a new access occurs each 0.1 μs. The 10 Mp's are always busy. Thus the rate is 10 × 12 b/μs or 120 megabits/s. This process of shifting a new Pc state into position each 0.1 μs has been likened to a barrel by CDC. A diagram of the process is shown in Fig. 4. The T's, K's, and M's are not given, although it should be mentioned that the following units are rather unique: a K for the management of 64 telegraph lines to be connected to a Cio; an Ms(disk) with four simultaneous access ports, each at 1.68 megachar/s data transfer rate, and a capacity of 168 megachar; an Ms(magnetic tape) with a K(#1:4) and S to allow simultaneous transfers to 4 Ms; the T(display) for monitoring the system's operation; K's to other C's and Ms's; and conventional T(card reader, punch, line printer, etc.).

Appendix 2, Chap. 39, has an ISP description of the PCP. Appendix 2 includes a figure which shows the instruction decoding and execution as well. The 6600 PCP is about the same as the early CDC 160. The PCP has an 18-bit A register because it has to process addresses for the large Cc. One interesting aspect of the 6600 which we question is the lack of communication among components at the ISP (programming) level. When Pc stops, it has no way of explicitly informing any other components. There are no interprocessor interrupts. An io device cannot interrupt a Pio, nor can Pio's communicate with one another except by polling. The state switching for Pc is, however, elegant, since a Pio can request Pc to stop a job, store all Mps, and resume a new task in one instruction. (The t.save + t.restore ~2 μs.)

The Cio's functions are data transmission between a peripheral device and the large Cc via the Cio's Mp with some data transformation or conversions; complete task management, including initiation, termination, and error handling; and management of Pc. The Cio's perform in about the same manner as the C('Attached Support Processor) in the N('360 ASP) (Chap. 40, page 506). The operating-system software is managed by a single fixed Cio. The remaining nine Cio's are free, and as tasks arise in the system, the Cio's assign themselves to particular tasks, carry out the tasks, and then free themselves to take on other tasks. The operating-system software resides in Mp(Pc) (that is, Cc) accessible to all Cio's and includes:

1 Parts of the operating system used by the Cio responsible for the system management
2 The variables which determine the state of a particular job, e.g., data pointers to Ms(disk, 'ECS), running time, list of jobs to do, etc.
3 Programs for the Cio's: task management programs (or programs to get the management program from Ms) which the Cio's use

ISP

The ISP description of the Pc is given in Appendix 1, Chap. 39. The Pc has a very clean, straightforward scientific-calculation oriented ISP. We can consider it a variation on the general register structure because the Pc state has three sets of general registers. Their use is explained both in Chap. 39 and its Appendix 1. This structure assumes that a program consists of several read accesses to a large array(s), a large number of operations on these accessed elements, followed by occasional
[Fig. 3. CDC 6400, 6416, 6500, and 6600 PMS diagram. Diagram notes: See Chapter 39 for operation. Only present in CDC 6500. No C('Central) in CDC 6416; CDC 6500 and CDC 6400 do not have K('Scoreboard), separate D's, and M('Instruction Stack).]
Part 5 | The PMS level    Section 4 | Network computers and computer networks

[Fig. 4 (labels only): 10 memories, 4096 words each, 12-bit; central memory.]
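The barrel arrangement described earlier, in which one physical processor serves 10 logical Cio's in rotating 0.1 μs slots, can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not CDC's implementation: the function names `run_barrel` and `aggregate_rate_mbit_per_s` are hypothetical, and only the slot time, word length, and processor count are taken from the text.

```python
from collections import deque

N_CIO = 10        # logical processors sharing one physical processor (from the text)
SLOT_NS = 100     # a new logical processor is served every 0.1 us (from the text)
WORD_BITS = 12    # bits per word of each Cio memory (from the text)


def run_barrel(n_slots):
    """Rotate the logical-processor states past the single physical processor.

    Returns the number of slots each logical Cio received.
    """
    barrel = deque(range(N_CIO))   # one state-register set per logical Cio
    served = [0] * N_CIO
    for _ in range(n_slots):
        cio = barrel[0]            # state currently in the processing position
        served[cio] += 1           # the physical processor does one step for it
        barrel.rotate(-1)          # shift the next state into position
    return served


def aggregate_rate_mbit_per_s():
    """Combined memory rate when the 10 memories are phased one slot apart."""
    cycle_ns = N_CIO * SLOT_NS                     # each memory is revisited every 1000 ns
    bits_per_us = N_CIO * WORD_BITS * 1000 // cycle_ns
    return bits_per_us                             # bits per microsecond equals Mbit/s
```

Calling `run_barrel(30)` gives every logical Cio three slots, and `aggregate_rate_mbit_per_s()` returns 120, which agrees with the 10 × 12 b/μs figure in the text.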
In a typical CDC system, one might expect to find the following assignment of PCP's to be:

1 Operating-system execution, including scheduling and management of Cc and all Cio's
2 Display of job status data on T(display)
3 Ms(disk) transfer management
4 T(printers, card reader, card punch)
5 L(#1:3; to:C.satellite)
6 Ms(magnetic tape)
7 T(64 Teletypes)
8 Free to be used with Ms(disk) and Ms(magnetic tape)
9 Free
10 Free

7600

The CDC 7600 system is an upward compatible member of the CDC 6000 series. Although the main Pc in the 7600 is compatible with the main Pc of the 6600, instructions have been added for controlling the io section and for communicating between Large Core Memories/LCM and Small Core Memory/SCM. It is expected to compute at an average rate of four to six times the C('6600).

The PMS structure (Fig. 5) is substantially different from that of the 6600. The C('7600 Peripheral Processing Unit/PPU), unlike the C('6600 Peripheral and Control Processor)'s, has a loose coupling with the main C. The PPU's are under control of the main C when transferring words into SCM via K('Input Output Section). The 15 C('PPU)'s have 8 input/output channels. These channels, which can run concurrently, provide the link between C('PPU) and peripheral Ms's and T's. Some of the PPU's are located in the same physical space as the Pc.

[Fig. 5. CDC 7600 computer PMS diagram.]
The 7600 Pc can be interrupted by a clock, the PPU's, and a trap condition within the Pc. A breakpoint address, BPA, can be set up within Pc such that, on the program reaching BPA, a trap is initiated. This interruption scheme is in contrast to that of the 6600, which could not be interrupted or trapped. The 7600 interrupt may be a reaction to the lack of intercommunication in the 6600.

Conclusions

Although the 6600 was somewhat behind its announced delivery schedule and represented a significant drain on the financial resources of CDC, it is now clear that it is a successful product. There have been instances of very large computers not being carried to completion either for financial or technical reasons. The 6600 seems to be the first large computer to achieve these marks of success. Here we are interested in the 6600 because it has held the "world's largest computer" title for so long.

Computer-network examples

In Chap. 40, we present examples of seven computer networks. There is a dearth of both computer networks and of papers on computer networks. This chapter takes examples from papers and from knowledge of several existing or proposed networks.
Chapter 38

The RW-400 - a new polymorphic data system¹

R. E. Porter

¹ Datamation, vol. 6, no. 1, pp. 8-14, January/February, 1960.

Summary

The RW-400 Data System, based upon modularly constructed, independently operating and flexibly connected components, is the logically evolved successor to conventional computer designs. It provides the means by which information processing requirements can be met with equipment capable of producing timely results at a cost commensurate with problem economic value. System obsolescence is minimized by the expandability in numbers and types of processing modules. Real time reliability is assured by component duplication techniques employed at minimum cost and by the advanced design in the system's manufacture. Man-machine communication facilities are program controlled for maximum flexibility. Parallel processing and parallel information handling modules increase the system's speed and adaptability when handling complex computing workloads. This polymorphic design truly represents an extension of man's intellect through electronics.

The RW-400 Data System is a new design concept. It was developed to meet the increasing demand for information processing equipment with adaptability, real-time reliability and power to cope with continuously-changing information handling requirements. It is a polymorphic system including a variety of functionally-independent modules. These are interconnectable through a program-controlled electronic switching center. Many pairs of modules may be independently connected, disconnected, and reconnected, in microseconds if need be, to meet continuously-varying processing requirements. The system can assume whatever configuration is needed to handle problems of the moment. Hence it is best characterized by the term "polymorphic" (having many shapes).

Rapid, program-controlled switching of many pairs of functionally-independent modules permits nondisruptive system expandability, operating reliability, simultaneous multi-problem processing capability, and man-machine intercommunication feasibility. These are only partially found in computers of conventional design.

Computer users have been forced heretofore to match problems to computer limitations. Problem changes posed serious reorientation and reprogramming difficulties. Changes from one computer to another model, due to growth in applications, often resulted in large expenditures of time and money. During maintenance or malfunction of a conventional computer its entire processing capacity is shut down. Real time processing reliability cannot be maintained on an around-the-clock basis. The conventional machine must process its problems serially. This serious limitation is only partially alleviated by time-sharing or computing-element-doubling designs. The high cost-per-hour of conventional computer operation rules out direct man-machine intercommunication during other than emergency situations.

The radically-new polymorphic design concept of the RW-400 Data System was evolved by Ramo-Wooldridge engineers to provide a practical solution to those information processing problems now inadequately handled by conventional computer designs. The RW-400 is a powerful new tool in the field of intellectronics, the extension of man's intellect by electronics.

System description

The RW-400 Data System contains an optional number and variety of functionally-independent modules. These communicate via a central electronic switching exchange. Each module is designed, within practical economic and functional limits, to maximize system adaptability over a wide range of problem types and sizes. This new design embodies the latest proven electronic design techniques, assuring high processing speeds and high equipment reliability. The RW-400's modularity assures reliable, round-the-clock processing of information with controllable computing capacity degradation during module maintenance or malfunction. Practical man-machine intercommunication is achieved in the RW-400 system by use of program-controlled information display and interrogation consoles. Figure 1 shows the over-all system design. Modules of various types communicate through a central exchange switching center.

Computing and buffering modules provide control for the system. These modules are self-controlled and make possible completely independent processing of two or more problems. One of the computer modules may be designated the master computer and in this role initiates and monitors actions of the entire system. An alert-interrupt network is provided to allow coordinated system action. Therefore, the system as applied to given information processing problems may change on a short range (microsecond) basis, thus providing, through programming, a self-organizing aspect to the system. In addition, the system may change through the years as the applications change. The most efficient and economical complement of equipment is applied to the problem at all times.

[Fig. 1. The RW-400 data system. Blocks shown: controlling, computing, buffering, display, switching center, interrogation, auxiliary storage, input-output.]
An RW-400 system is built around an expandable Central Exchange (CX) to which a number of primary modules may be attached. These are: Computer Modules (CM); self-instructed Buffer Modules (BM); Magnetic Tape Modules (TM); Magnetic Drum Modules (DM); Peripheral Buffer Modules (PB); and console communication Display Buffer Modules (DB). How many modules are put together in a system is entirely a function of system application. In addition to primary system modules, punched card, punched tape, high speed printing and control console devices are available. These handle nominal system input/output requirements. Additional man-machine communication devices such as interrogation, display and control consoles, may be included in the system as problem requirements dictate. A Tape Adapter (TA) module is available to provide compatibility with magnetic tape of other computers. Information generated at Flexowriter inquiry and recording stations may be directly received by the system via the Peripheral Buffer Module. This latter module also buffers the receipt of TWX and punched tape information.

The way in which a particular RW-400 Data System functions depends on the number and type of each module included. It may be initially composed of the minimum number and variety of modules needed to do a small problem or the initial part of some large but yet-to-be-defined problem. Such a system would work much like a conventional computer. It would probably include a buffer module and thus have a parallel data handling capability not found in the conventional design at a comparable price. The initial system installation may then be augmented by the timely addition of modules.
A buffer module (BM) has the capability to control its acquisition and dissemination of information independently. The buffer provides a computer module with parallel data handling capability without complicating the problem processing program with the conventional intermixture of arithmetic and housekeeping instructions. Information previously generated by the processing program may be appropriately disposed of within the system while processing continues. Data needed at a subsequent time in the processing may be retrieved from system storage in advance of need while processing progresses. The simultaneity of these operations not only materially increases over-all processing speed but also increases the practical utility of the less costly types of internal system storage such as a magnetic tape.

The computer (CM) or buffer (BM) modules, when acting in a controlling capacity, may initiate connection to an information storage or handling module during that part of the processing program when the two can work profitably in unison. The pair of modules thus interconnected neither affect nor are affected by other modules. Logical interlocks prevent unwanted cross talk among modules. An intermodule communication system lets controlling modules signal status or alert other such modules of their need to communicate. The decision by a module receiving an alert signal to permit interruption or to proceed is optional with that module. The optional interrupt feature is that needed to make the often-discussed but seldom-used program interrupt capability both useful and practical. Programs may thus permit interruptions only at convenient points in the processing sequence.

Modules may be assigned, under program control, to work together on a problem in proportion to its needs. As soon as a module's function is complete for a given problem, that module may be released for reassignment to some other task. The system is thus self-controlled to match processing capacity to each problem for the time necessary to do the job. Full system capacity may be brought to bear upon a very large problem when needed. This capacity may be apportioned among a number of smaller problems for simultaneous processing, program compilation, program checkout, module maintenance etc., when it is not needed for maximum system effort.

From the preceding system description, it is apparent that such equipment can be expanded from a modest initial installation into a very powerful and comprehensive information processing center as requirements warrant. More specific descriptions of principal system modules follow to give the reader a better feel for how this system might perform his information processing work.

The functional modules

The key to appreciative understanding of the power of the RW-400 lies in the knowledge of intermodule connection. It is appropriate to describe the Central Exchange (CX) unit first, then follow with descriptions of the various modules.

The central exchange

The Central Exchange performs the vital function of interconnecting a pair of modules whenever requested to do so by either a computer or a buffer module. Since internal programmed control is only possible within a computer or a buffer module, one of the interconnected pair of modules must be either a computer or a buffer. The time in which any connection may be made or broken is about 65 microseconds. An exchange has basic capacity to connect any of 16 computer or buffer modules to any of 64 auxiliary function modules. There is nothing sacred about the number 16 since it is possible to extend the CX module's interconnection matrix through design modification when need arises.

The CX is an expandable, program-controlled, electronic switching center capable of connecting or disconnecting any available pair of modules in roughly the time of one computer instruction execution. Figure 2 illustrates the permissible module interconnections within the Central Exchange. Every intersection on the illustration represents a possible connection between modules. The "x-ed" intersections indicate typical connections in force at any point in time.

The control logic of the CX module's connection table prevents more than one interconnection on any horizontal (controlling) or vertical (controlled) data path representation on the diagram. When connection is requested of the Central Exchange while one of the required modules is already carrying out a previous assignment, the requesting module can be programmed to sense this condition and wait until connection can be made without interference. Should waiting be undesirable, the requesting module can go on about its business and check back later to see when the desired connection can be made. There is an implication here, of course, that knowing the kind of a system he is dealing with, a programmer requests connections in advance of need whenever possible.

Provision for master-slave control is included via an Assignment Matrix established within the CX module by a computer module previously assigned to master status. Such a provision is necessary to preclude inadvertent connection requests from unchecked programs or malfunctioning control modules from affecting sets of modules simultaneously processing another problem. Connection requests are therefore essentially filtered through both an assignment and an interconnection validity matrix prior to being acted upon.
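The Central Exchange rules described above (one connection per controlling path, one per controlled path, a busy condition the requester can sense, and filtering through an assignment matrix) can be sketched in a few lines. This is an illustrative model only, assuming the matrix sizes given in the text: the class name `CentralExchange`, its method names, the status strings, and the representation of the Assignment Matrix as a set of allowed pairs are all hypothetical and do not come from the article.

```python
class CentralExchange:
    """Toy model of the CX connection table (names and return values are hypothetical)."""

    def __init__(self, n_controlling=16, n_controlled=64):
        self.n_controlling = n_controlling
        self.n_controlled = n_controlled
        self.by_controller = {}     # controlling module -> controlled module
        self.by_controlled = {}     # controlled module -> controlling module
        self.assignment = None      # None means the master has imposed no restrictions

    def set_assignment(self, allowed_pairs):
        """Master computer installs the (controller, controlled) pairs it allows."""
        self.assignment = set(allowed_pairs)

    def request(self, controller, controlled):
        if not 0 <= controller < self.n_controlling:
            raise ValueError("no such controlling module")
        if not 0 <= controlled < self.n_controlled:
            raise ValueError("no such controlled module")
        # Filter 1: the assignment matrix set up by the master.
        if self.assignment is not None and (controller, controlled) not in self.assignment:
            return "refused"
        # Filter 2: at most one connection per row and per column.
        if controller in self.by_controller or controlled in self.by_controlled:
            return "busy"
        self.by_controller[controller] = controlled
        self.by_controlled[controlled] = controller
        return "connected"

    def release(self, controller):
        """Break the controller's connection, if it has one."""
        controlled = self.by_controller.pop(controller, None)
        if controlled is not None:
            del self.by_controlled[controlled]
```

A requesting module that receives "busy" corresponds to the case in the text where the requester either waits or goes on about its business and checks back later.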
Multiply Accumulate wherein the contents of H are multiplied by G and added to A; and Transmit where the contents of G are stored in H.

The ten program control instructions are Store, Store Double Length Accumulator, Load Accumulator, Insert Mask in the S Register, Stop, Link Jump, Compare Jump, Tally Jump, Test Jump and a Multi-purpose Shift.

The five external instructions are those which cause data to be transmitted to or received from a device external to the computer. Each command is multi-purpose in nature and hence equivalent to several conventional external instructions. The commands are: Command Output, Data Input, Conditional Data Input, Data Output and Character Transfer. A comprehensive discussion of the variation of each of these commands is not pertinent to this article. Suffice it to say that commands are available for carrying out a wide variety of intermodule data communication.

The interrupt capability of a Computer Module is a logical generalization of the "trapping" feature found on several conventional computers. It permits the automatic interruption of a program, at the option of the program, when the computer module receives an "alert" that a condition requiring attention has arisen. It can be used to warn the program when an error of some type has occurred, minimize unproductive computer waiting time while another module completes its task, eliminate many programmed status test instructions and provide a convenient means of subjecting one computer module to the control of another. Program control of interruptions within a CM-400 is accomplished through the sense register S. This register may be filled with an interrupt
mask by means of the Insert S instruction. A bit by bit correspondence exists between the S register and the interrupt register I to which the alert lines are connected. A Test Jump instruction can be used to examine the coincidence between these registers of an alert signal in a bit position corresponding to a one in the S register mask. If an alert is received by the computer during the execution of an instruction, control will be transferred to memory location "0" at the end of the instruction if, and only if, (a) the sense bit corresponding to the alert is a "one," (b) the master sense bit is a "one," and (c) the instruction was not an "Insert S." The master sense bit in the S register may be programmed to permit the interrupt to take place according to the interrupt mask or to inhibit interrupt until the program can conveniently cope with it. All instructions being executed at the time an interrupt condition occurs are completed before the interruption is allowed to take place. Figure 3 schematically illustrates the Computer Module's primary registers and the interconnecting information paths.

Typical two-address addition and subtraction times are approximately 35 microseconds including memory access time. Multiplication takes about 80 microseconds, and division and square root about 130 and 170 microseconds respectively.

Before attempting to draw a comparison between a CM and a deluxe conventional computer the reader should bear in mind the trade offs in features versus cost; parallel processing versus sequential processing; independent information handling versus program complicating "housekeeping"; and real time system reliability versus periodic inoperability. The only valid comparison is that between the RW-400 Data System and a conventional computer applied to the same task. The contribution to the RW-400 system made by the Buffer Modules can be better assessed by the reader after the following description has been considered.

[Figure: RW-400 analysis console.]

The buffer module

A Buffer Module consists of two independent logical buffer units, each having 1024 words of random access magnetic core storage and a number of internal registers used in performing its functions when in the self-controlling mode. A Buffer Module may be connected to a Computer Module so that the Buffer's core storage is accessible to the computer as an extension of the computer's own storage. A Buffer may also serve as an intermediary device between a computer and another module, such as a tape or drum, to minimize time conventionally lost in data transfers. The Buffer is capable of recognizing and executing certain instructions stored in its own memory. It can therefore be left to perform data handling functions on its own while computer modules are otherwise occupied.

A Buffer Module may be connected to a Computer Module and the buffer 1024 word storage used as an indirectly addressed extension of the computer's own working storage. When the address 1023 (all ones) appears in the operand field of a computer instruction to be executed, the computer is signalled that the operand refers to some cell in buffer storage. The computer then uses the number in the buffer read register R (or in the case of a few instructions, the buffer write register W) as the effective address designated by the operand field of the instruction. Extended addressing may be used in either the first or second operand field of the instruction or in both operand fields. If extended addressing is used in only one operand field, the effective address designated by that field is the number in register R. A "1" is automatically added to the contents of the R register after the instruction is executed. If extended addressing is used in both operand fields of an instruction, the effective address of the first operand is the number in register R and the effective address of the second operand is one more than the number in register R. A "2" is automatically added to the contents of register R after the execution of this type of instruction. The R (or W) register may be preset to any desired initial condition by means of the computer's Command Output instruction. All the commands being executed by the computer must be stored within the computer module's storage and may not be in buffer cells addressed by the computer at execution time. The extended addressing and buffer register indexing may be used to materially simplify repetitive data acquisition operations.
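The extended addressing rule just described can be written out as a small function. This is a sketch under stated assumptions rather than a description of the actual hardware: the function name `resolve_operands` and the `("cm", n)` / `("buffer", n)` tuples are hypothetical, only the read register R is modelled (the text says a few instructions use W instead), and no wrap-around of R at the end of the 1024-word buffer is modelled because the article does not specify it.

```python
EXTENDED = 1023  # operand-field value (all ones in 10 bits) that selects buffer addressing


def resolve_operands(op1, op2, r):
    """Return (effective1, effective2, new_r) for a two-address instruction.

    Each effective address is ("cm", n) for the computer's own storage
    or ("buffer", n) for a cell in the attached buffer unit.
    """
    ext1 = op1 == EXTENDED
    ext2 = op2 == EXTENDED
    if ext1 and ext2:
        # Both fields extended: first operand at R, second at R + 1, then R advances by 2.
        return ("buffer", r), ("buffer", r + 1), r + 2
    if ext1:
        # Only the first field extended: R advances by 1 after the instruction.
        return ("buffer", r), ("cm", op2), r + 1
    if ext2:
        # Only the second field extended: R advances by 1 after the instruction.
        return ("cm", op1), ("buffer", r), r + 1
    # No extended addressing: R is left unchanged.
    return ("cm", op1), ("cm", op2), r
```

Repeated execution of the same instruction with one extended operand therefore steps through consecutive buffer cells, which is the repetitive data acquisition case the text mentions.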
The primary function of a Buffer Module is not, however, that of an auxiliary computer storage unit. The drum and tape modules more aptly serve this function in the RW-400 system. A Buffer Module is capable of operating autonomously and of controlling other modules such as Tape Modules, Drum Modules, Peripheral Buffers, Display Buffers, Printers or Plotters. This capability enables the Buffer Modules in a system to perform routine tape searching and data transferral tasks thereby freeing the Computer Modules to do more computing. In its "self-instruction" mode, the buffer executes its own internally stored program in much the same fashion as a computer. The memory of a Buffer Module will therefore be occupied by its own control programs as well as blocks of data which it is holding for transmission to other units. The buffer is used to acquire information from the relatively slower auxiliary storage and communication modules while the computer proceeds at high speed. Blocks of information retrieved in advance of computer need by the buffer may then be rapidly transferred to the computer's own storage or operated upon as they stand in the buffer via the indirect addressing capability of the computer.

Another feature of the buffer is its switching capability. Each Buffer Module is composed of two buffer units tied together. A unit function switching feature permits the employment of the two units together in an alternating mode of operation. Continuous information transfer from tape to computer, for example, may be accomplished without stopping the tape unit. A switching instruction executed simultaneously by both units of a Buffer Module causes whatever devices were connected to the first unit to be connected to the second and vice versa.

[Figure: RW-400 Buffer Module.]

Now that the functional controlling modules and the module interconnection concept have been discussed, the more conventional auxiliary storage modules available with the system may be described to round out the processing capability of the system.

The tape modules

A Tape Module consists of an altered Ampex FR-300 tape transport plus the necessary power supplies and control circuitry to effect information reading, writing and control. One inch mylar tape is used. Information is written on 16 channels two of which are clock channels. The remaining 14 channels consist of 13 information bits plus parity. The information reading or recording rate is 15,000 computer words per second. Data may be recorded on tape in variable blocks up to a maximum of 1024 words per block (the size of the storage available to hold the data in a sending or receiving module). Each block is preceded by a block identification which permits selective tape information searching by a Buffer Module. Single blocks imbedded in a tape file of other blocks can be overwritten. A two-stack head permits automatic verification of each block as it is written. Readback parity errors are automatically detected during the writing process. Thus dropout areas may be determined while the data is still available in a computer or buffer for recording elsewhere.

A description of the RW-400's tape handling capability would not be complete without mentioning the Tape Adapter (TA) module. This is a self-contained unit capable of performing the reading and writing of magnetic tapes in a format acceptable to the IBM 704 and 709 systems. The TA consists of an Ampex FR-300 half-inch digital tape transport, including dual gap head and servo control system; reading, writing and control circuits; and a module housing with its own blower and power supply.
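The alternating use of the two buffer units described earlier in this section, where a switching instruction swaps the devices attached to each unit so that tape-to-computer transfer never stops the tape, is what would now be called double buffering. The following is a minimal sketch of that idea, not the RW-400's control logic: the class `BufferModule`, its `switch` method, and the `stream` function are hypothetical names, and the 1024-word limit is the block and unit size given in the text.

```python
MAX_BLOCK_WORDS = 1024  # unit storage size and maximum tape block length (from the text)


class BufferModule:
    """Two buffer units whose attached devices can be exchanged in one step."""

    def __init__(self):
        self.attached = ["tape", "computer"]   # device attached to unit 0 and unit 1
        self.units = [None, None]              # contents of each unit, None when empty

    def switch(self):
        # Whatever was connected to the first unit is now connected to the second,
        # and vice versa.
        self.attached.reverse()


def stream(blocks):
    """Pass tape blocks to the computer, filling one unit while the other is emptied."""
    bm = BufferModule()
    delivered = []
    for block in blocks:
        if len(block) > MAX_BLOCK_WORDS:
            raise ValueError("block exceeds buffer unit storage")
        tape_unit = bm.attached.index("tape")
        cpu_unit = bm.attached.index("computer")
        bm.units[tape_unit] = list(block)          # tape fills one unit...
        if bm.units[cpu_unit] is not None:         # ...while the computer empties the other
            delivered.append(bm.units[cpu_unit])
            bm.units[cpu_unit] = None
        bm.switch()
    cpu_unit = bm.attached.index("computer")       # drain the last block after the final switch
    if bm.units[cpu_unit] is not None:
        delivered.append(bm.units[cpu_unit])
    return delivered
```

The blocks arrive at the computer in tape order, one step behind the tape, which is the price of never stopping the tape unit.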
The drum module

The Drum Module (DM) contains a magnetic drum with storage capacity of 8192 words. It may be connected to either a Computer or a Buffer Module through the Central Exchange. Average access time to the first word position on the drum is 8½ milliseconds. Successive words are transmitted at the rate of 60,000 computer words per second. The Drum Module is conventionally used as an intermediate item storage device to minimize tape handling time.

Special system communication modules

The external data and man-machine communication of the RW-400 Data System are handled via drum buffer modules. A wide variety of asynchronously operated equipment is speed matched and program controlled through the features designed into these special system communication modules.

The Peripheral Buffer (PB) provides input/output buffers for communication between Computer or Buffer Modules and relatively slow speed external devices such as Flexowriters, Plotters, Punched Tape Handlers, Teletype Lines and Keyboard Operated Equipment. The Peripheral Buffer stores its information in four pairs of bands which operate alternately as circulating registers. Each band contains eight input and eight output buffers for a total of 32 input buffers and 32 output buffers in each Peripheral Buffer Module. Each buffer is a drum band sector 64 computer words long. Conventionally one input and one output buffer sector are connected to each external device (such as a Flexowriter) to permit two-way communication between the external device and the RW-400 system.

The display buffer

A Display Buffer (DB) acts as a recirculating storage for the cathode ray tube display units in a Display Console. Information to be displayed is sent to the DB band associated with a particular display tube via the Central Exchange. The Display Buffer sends only status information back to other system modules upon request. The information displayed on any tube is controlled by the bit pattern sent to the Display Buffer. The display pattern is regener-

handled by the RW-400 system. In addition to the actual Cathode Ray Tube, numerical indicator, signal lamp and typewriter information outputs, several types of keyboard activated system control and parameter entry facilities are provided on the console. The total man-machine communication facility represented by each console is designed to be primarily a function of the computer control programs initiated by the analyst via his console.

A set of Display Control Keys generate messages which are recorded on a Peripheral Buffer sector for later interpretation and display generation by a computer program. A set of Process Step Keys are provided the analyst so that he can initiate preprogrammed system processing variations. Associated with the Process Step Keys is an overlay or "program card" which permits the assignment of a variety of meanings to the set of Process Step Keys. Insertion of the overlay by the analyst gives him a unique label for each Process Step Key and automatically cues the controlling computer to assign the corresponding set of programs to each key message. A Data Entry Keyboard is provided on the console so that the analyst can enter control parameters when asked to do so via the display devices.

A Joystick Lever affords the console operator a means of controlling the position of cross hair markers on the cathode ray display tubes. Associated with the joystick are control keys which may be used to send a message to the controlling computer specifying the coordinates of the cross hairs. Control programs may be written, for example, to act upon this information to reorient the display with respect to the area selected by the cross hair position.

A Light Gun is also provided as a means of selecting any point on the cathode ray tube displays. The gun emits a small beam of light. With the beam centered on a given point on the cathode ray display tube, pressing the trigger results in the automatic generation of a message to the Peripheral Buffer specifying the address in the Display Buffer containing the coordinates of the selected point. A set of Status and Error lights are contained on the Display Console to provide the console operator with over-all knowledge of the system and thus minimize conflicting control requests and intermodule interference. For example, a Buffer
ated 30 times per second to minimize image fading and flicker. The preceding explanation of the Display Buffer has little meaning
Peripheral may not be ready to accept a console key message until after certain previously requested control actions have been completed. The
to a reader unfamiliar with the features of the Display Console itself. This console is therefore described in more detail in the
that he
Status Lights indicate this condition to the console operator so may act accordingly.
following paragraphs.
The printer module Display consoles Display Consoles can give a problem "analyst" or "monitor" a visual picture of the status or results of any information being
The
Printer
Module (PR)
minute Anelex type a
Computer
basically a 160 column, 900 line per printer. It receives information from either
or a Buffer
is
module
via the Central Exchange. Indi-
The RW-400— a new polymorphic data system 485
Chapter 38 |
vidual characters to be printed are represented by a 6-bit code and are transmitted four to a computer word. Zero suppression,
CR
completion and information block end codes are included for format control. A plugboard is provided for flexibility in columnar
cards at the rate of 2,500 cards per minute.
line
data arrangement. Paper feed is controlled by means of a loop of 7-channel punched paper tape. Control of the printing operation has been arranged so that the connected control module may send
headings from one set of memory locations, stop sending information while going to a different part of the memory, and line
then proceed to send data from to
complete a
this
new
set of
memory
communicates with Computer or Ruffer modules via the It is capable of reading 80 column punched
Central Exchange.
using the Tape Adapter
References RothS59; WestG60
The punched card modules
The RW-400 Data System may be equipped with a high speed punched card reading module (CR) and an IRM card punch. The
is
the sources of large volumes of punched cards usually convert this
data into magnetic tape form which
locations
line of print.
The card punch
connected to the system through the Peripheral Ruffer Module (PR) since it is a relatively low speed device. Emphasis has not been placed on directly connected punched card equipment since
Module
may be more
(TA).
rapidly handled
APPENDIX 1 RW 40 ISP DESCRIPTION
Instruction Interpretation Process
Chapter 39
Parallel operation in the Control Data 6600¹
James E. Thornton

History
In the summer of 1960, Control Data began a project which culminated October, 1964 in the delivery of the first 6600 Computer. In 1960 it was apparent that brute force circuit performance and parallel operation were the two main approaches to any advanced computer.
This paper presents some of the considerations having to do with the parallel operations in the 6600. A most important and fortunate event coincided with the beginning of the 6600 project. This was the appearance of the high-speed silicon transistor, which survived early difficulties to become the basis for a nice jump in circuit performance.

System organization
The computing system envisioned in that project, and now called the 6600, paid special attention to two kinds of use, the very large scientific problem and the time sharing of smaller problems. For the large problem, a high-speed floating point central processor with access to a large central memory was obvious. Not so obvious, but important to the 6600 system idea, was the isolation of this central arithmetic from any peripheral activity.
It was from this general line of reasoning that the idea of a multiplicity of peripheral processors was formed (Fig. 1). Ten such peripheral processors have access to the central memory on one side and the peripheral channels on the other. The executive control of the system is always in one of these peripheral processors, with the others operating on assigned peripheral or control tasks. All ten processors have access to twelve input-output channels and may "change hands," monitor channel activity, and perform other related jobs. These processors have access to central memory, and may pursue independent transfers to and from this memory. Each of the ten peripheral processors contains its own memory for program and buffer areas, thereby isolating and protecting the more critical system control operations in the separate processors. The central processor operates from the central memory with relocating register and file protection for each program in central memory.

Peripheral and control processors
The peripheral and control processors are housed in one chassis of the main frame. Each processor contains 4096 memory words of 12 bits length. There are 12- and 24-bit instruction formats to provide for direct, indirect, and relative addressing. Instructions provide logical, addition, subtraction, shift, and conditional branching. Instructions also provide single word or block transfers to and from any of twelve peripheral channels, and single word or block transfers to and from central memory. Central memory words of 60 bits length are assembled from five consecutive peripheral words. Each processor has instructions to interrupt the central processor and to monitor the central program address.
To get this much processing power with reasonable economy and space, a time-sharing design was adopted (Fig. 2). This design contains a register "barrel" around which is moving the dynamic information for all ten processors. Such things as program address, accumulator contents, and other pieces of information totalling 52 bits are shifted around the barrel. Each complete trip around requires one major cycle or one thousand nanoseconds. A "slot" in the barrel contains adders, assembly networks, distribution network, and interconnections to perform one step of any peripheral instruction. The time to perform this step or, in other words, the time through the slot, is one minor cycle or one hundred nanoseconds. Each of the ten processors, therefore, is allowed one minor cycle of every ten to perform one of its steps. A peripheral instruction may require one or more of these steps, depending on the kind of instruction.
In effect, the single arithmetic and the single distribution and assembly network are made to appear as ten. Only the memories are kept truly independent. Incidentally, the memory read-write cycle time is equal to one complete trip around the barrel, or one thousand nanoseconds.

¹AFIPS Proc. FJCC, pt. 2, vol. 26, pp. 33-40, 1964.
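The barrel timing of the peripheral processors can be sketched in a few lines of Python. This is an illustrative model only, not the 6600 logic. The 100 ns minor cycle, the ten processors, and the 1000 ns major cycle come from the text; the function names `processor_in_slot` and `slot_visits_ns`, the numbering of processors from 0, and the choice that processor 0 occupies the slot at time zero are assumptions made for the example.

```python
MINOR_CYCLE_NS = 100                             # time through the slot (from the text)
PROCESSORS = 10                                  # peripheral processors sharing the slot
MAJOR_CYCLE_NS = MINOR_CYCLE_NS * PROCESSORS     # one complete trip around the barrel


def processor_in_slot(t_ns: int) -> int:
    """Index (0-9) of the processor whose state is in the slot at time t_ns.

    Assumption: processor 0 is in the slot during the first minor cycle.
    """
    return (t_ns // MINOR_CYCLE_NS) % PROCESSORS


def slot_visits_ns(processor: int, steps: int) -> list[int]:
    """Start times of the first `steps` slot visits of one processor.

    Each instruction step needs one pass through the slot, so successive
    steps of the same processor are one major cycle apart.
    """
    first = processor * MINOR_CYCLE_NS
    return [first + k * MAJOR_CYCLE_NS for k in range(steps)]


if __name__ == "__main__":
    # Which processor owns the slot in each of the first twelve minor cycles.
    print([processor_in_slot(i * MINOR_CYCLE_NS) for i in range(12)])
    # A three-step instruction on processor 3.
    print(slot_visits_ns(3, 3))
```

Under these assumptions a processor sees the shared slot once per major cycle, which is why the slot appears to each of the ten processors as its own arithmetic and its own distribution and assembly network.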
Input-output channels are bi-directional, 12-bit paths. One 12-bit word may move in one direction every major cycle, or 1000 nanoseconds, on each channel. Therefore, a maximum burst rate of 120 million bits per second is possible using all ten peripheral processors. A sustained rate of about 50 million bits per second can be maintained in a practical operating system. Each channel may service several peripheral devices and may interface to other systems, such as satellite computers.
Peripheral and control processors access central memory through an assembly network and a dis-assembly network. Since five peripheral memory references are required to make up one central memory word, a natural assembly network of five levels is used. This allows five references to be "nested" in each network during any major cycle. The central memory is organized in independent banks with the ability to transfer central words every minor cycle. The peripheral processors, therefore, introduce at most about 2% interference at the central memory address control.
A single real time clock, continuously running, is available to all peripheral processors.

Central processor
The 6600 central processor may be considered the high-speed arithmetic unit of the system (Fig. 3). Its program, operands, and results are held in the central memory. It has no connection to the peripheral processors except through memory and except for two single controls. These are the exchange jump, which starts or interrupts the central processor from a peripheral processor, and the central program address which can be monitored by a peripheral processor.
A key description of the 6600 central processor, as you will see in later discussion, is "parallel by function." This means that a number of arithmetic functions may be performed concurrently. To this end, there are ten functional units within the central processor. These are the two increment units, floating add unit, fixed add unit, shift unit, two multiply units, divide unit, boolean unit, and branch unit. In a general way, each of these units is a three address unit. As an example, the floating add unit obtains two 60-bit operands from the central registers and produces a 60-bit result which is returned to a register. Information to and from these units is held in the central registers, of which there are twenty-four. Eight of these are considered index registers, are of 18 bits length, and one of which always contains zero. Eight are considered address registers, are of 18 bits length, and serve to address the five read central memory trunks and the two store central memory trunks. Eight are considered floating point registers, are of 60 bits length, and are the only central registers to access central memory during a central program.
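The register organization just described (twenty-four central registers in three groups of eight) can be modelled as a small data structure. The sketch below is a toy illustration under stated assumptions, not a description of the hardware: the class name `CentralRegisters` and its method names are hypothetical, and the text does not say which index register always contains zero, so the example assumes it is register 0.

```python
MASK18 = (1 << 18) - 1   # width of index and address registers
MASK60 = (1 << 60) - 1   # width of floating point registers


class CentralRegisters:
    """Toy model of the twenty-four central registers."""

    def __init__(self) -> None:
        self.index = [0] * 8      # eight 18-bit index registers
        self.address = [0] * 8    # eight 18-bit address registers
        self.operand = [0] * 8    # eight 60-bit floating point registers

    def write_index(self, n: int, value: int) -> None:
        # Assumption: index register 0 is the one that always contains zero,
        # so writes to it are discarded.
        if n != 0:
            self.index[n] = value & MASK18

    def write_address(self, n: int, value: int) -> None:
        self.address[n] = value & MASK18

    def write_operand(self, n: int, value: int) -> None:
        self.operand[n] = value & MASK60

    def count(self) -> int:
        """Total number of central registers."""
        return len(self.index) + len(self.address) + len(self.operand)

    def total_bits(self) -> int:
        """Total storage held in the central registers."""
        return 18 * len(self.index) + 18 * len(self.address) + 60 * len(self.operand)


if __name__ == "__main__":
    regs = CentralRegisters()
    regs.write_index(0, 7)              # discarded under the zero-register assumption
    regs.write_index(1, (1 << 18) + 5)  # truncated to 18 bits
    print(regs.index[:2], regs.count(), regs.total_bits())
```

Masking on every write keeps each group at its stated width, so values wider than 18 or 60 bits are truncated rather than stored.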
In a sense, just as the whole central processor is hidden behind central memory from the peripheral processors, so, too, the ten functional units are hidden behind the central registers from central memory. As a consequence, a considerable instruction efficiency is obtained and an interesting form of concurrency is feasible and practical. The fact that a small number of bits can give meaningful definition to any function makes it possible to develop forms of operand and unit reservations needed for a general scheme of concurrent arithmetic.
Instructions are organized in two formats, a 15-bit format and a 30-bit format, and may be mixed in an instruction word (Fig. 4). As an example, a 15-bit instruction may call for an ADD,
absence of the two restraints. The instruction executions, in comparison, range from three minor cycles for fixed add, 10 minor cycles for floating multiply, to 29 minor cycles for floating divide.
To provide a relatively continuous source of instructions, one buffer register of 60 bits is located at the bottom of an instruction stack capable of holding 32 instructions (Fig. 5). Instruction words from memory enter the bottom register of the stack pushing up the old instruction words. In straight line programs, only the bottom two registers are in use, the bottom being refilled as quickly as memory conflicts allow. In programs which branch back to an instruction in the upper stack registers, no refills are allowed after the branch, thereby holding the program loop completely in the stack. As a result, memory access or memory conflicts are no longer involved, and a considerable speed increase can be had.
Five memory trunks are provided from memory into the central processor to five of the floating point registers (Fig. 6). One address register is assigned to each trunk (and therefore to the floating point register). Any instruction calling for address register result implicitly initiates a memory reference on that trunk. These instructions are handled through the scoreboard and therefore tend to overlap memory access with arithmetic. For example, a new memory word to be loaded in a floating point register can be brought in from memory but may not enter the register until all previous uses of that register are completed. The central registers, therefore, provide all of the data to the ten functional units, and receive all of the unit results. No storage is maintained in any unit.
Central memory is organized in 32 banks of 4096 words. Consecutive addresses call for a different bank; therefore, adjacent addresses in one bank are in reality separated by 32. Addresses may be issued every 100 nanoseconds. A typical central memory information transfer rate is about 250 million bits per second.
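The bank organization of central memory can be illustrated with a short Python sketch. The 32 banks, 4096 words per bank, 60-bit words, and 100 ns address issue interval come from the text. Everything else is an assumption for illustration: the function names are hypothetical, the mapping of the low-order address bits to the bank number is inferred from the statement that consecutive addresses call for different banks, and `busy_slots` (how many 100 ns issue slots a bank stays occupied after a reference) is a made-up parameter.

```python
BANKS = 32
WORDS_PER_BANK = 4096
WORD_BITS = 60
ISSUE_INTERVAL_NS = 100


def bank_of(address: int) -> int:
    """Bank selected by an address (assumes low-order interleaving)."""
    return address % BANKS


def word_in_bank(address: int) -> int:
    """Position of the addressed word inside its bank."""
    return address // BANKS


def issue_slots(addresses: list[int], busy_slots: int = 10) -> list[int]:
    """100 ns slot in which each address can be issued.

    A reference waits if its bank is still occupied by an earlier one.
    `busy_slots` is a hypothetical bank occupancy, not a figure from the text.
    """
    free_at = [0] * BANKS
    slot = 0
    issued = []
    for address in addresses:
        bank = bank_of(address)
        slot = max(slot, free_at[bank])
        issued.append(slot)
        free_at[bank] = slot + busy_slots
        slot += 1
    return issued


def peak_bits_per_second() -> int:
    """Upper bound if one 60-bit word moved on every 100 ns issue slot."""
    return WORD_BITS * (1_000_000_000 // ISSUE_INTERVAL_NS)


if __name__ == "__main__":
    print(issue_slots(list(range(8))))           # consecutive addresses: no waiting
    print(issue_slots([0, 32, 64, 96]))          # same bank every time: each one waits
    print(peak_bits_per_second())
```

With consecutive addresses every reference goes to a different bank and one address is issued per slot, while a stride of 32 keeps hitting the same bank and every reference waits for the previous one. The computed upper bound of 600 million bits per second is a ceiling under these assumptions, which is consistent with the lower typical rate of about 250 million bits per second quoted in the text.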
As mentioned before, the functional units are hidden behind the registers. Although the units might appear to increase hardware duplication, a pleasant fact emerges from this design. Each unit may be trimmed to perform its function without regard to others. Speed increases are had from this simplified design.
As an example of special functional unit design, the floating multiply accomplishes the coefficient multiplication in nine minor cycles plus one minor cycle to put away the result for a total of 10 minor cycles, or 1000 nanoseconds. The multiply uses layers of carry save adders grouped in two halves. Each half concurrently forms a partial product, and the two partial products finally merge while the long carries propagate. Although this is a fairly large complex of circuits, the resulting device was sufficiently smaller than originally planned to allow two multiply units to be included in the final design.
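The carry save technique mentioned for the multiply unit can be shown with a minimal Python sketch. It is a generic illustration of carry-save addition, not a model of the 6600 circuits: the function names are hypothetical, the default operand width of 48 bits is an assumption (the passage does not state the coefficient width), and splitting the work by the low and high halves of the multiplier is only one plausible reading of "grouped in two halves."

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 compressor: returns (sum, carry) with a + b + c == sum + carry.

    No carry propagates between bit positions; carries are only shifted left.
    """
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def reduce_to_two(terms: list[int]) -> tuple[int, int]:
    """Apply carry-save layers until only two terms remain."""
    terms = list(terms)
    while len(terms) > 2:
        a, b, c = terms.pop(), terms.pop(), terms.pop()
        terms.extend(carry_save_add(a, b, c))
    while len(terms) < 2:
        terms.append(0)
    return terms[0], terms[1]


def multiply(x: int, y: int, bits: int = 48) -> int:
    """Multiply non-negative x by y (y < 2**bits) using two carry-save halves."""
    half = bits // 2
    # Shifted copies of x selected by the low and high halves of the multiplier.
    low = [x << i for i in range(half) if (y >> i) & 1]
    high = [x << i for i in range(half, bits) if (y >> i) & 1]
    s_lo, c_lo = reduce_to_two(low)      # first half forms its partial product
    s_hi, c_hi = reduce_to_two(high)     # second half forms its partial product
    s, c = reduce_to_two([s_lo, c_lo, s_hi, c_hi])   # merge the two halves
    return s + c                          # the single long carry propagation


if __name__ == "__main__":
    print(carry_save_add(5, 3, 6))
    print(multiply(12345, 6789) == 12345 * 6789)
```

The point of the arrangement is that the long carry propagation happens once, in the final addition, while every earlier layer works on all bit positions independently.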
Fig. 7. 6600 printed circuit module.
Fig. 9. 6600 main frame section.

lines. Interconnections between chassis are made with coaxial cables.
Both maintenance and operation are accomplished at a programmed display console (Fig. 10). More than one of these consoles may be included in a system if desired. Dead start facilities bring the ten peripheral processors to a condition which allows information to enter from any chosen peripheral device. Such loads normally bring in an operating system which provides a highly sophisticated capability for multiple users, maintenance, and so on.
The 6600 Computer has taken advantage of certain technology advances, but more particularly, logic organization advances which now appear to be quite successful. Control Data is exploring advances in technology upward within the same compatible structure, and identical technology downward, also within the same compatible structure.

References
AllaR64; ClayB64

APPENDIX 1 CDC 6400, 6500, 6600 CENTRAL PROCESSOR ISP DESCRIPTION
Instruction Format
instruction    although 30 bits, most instructions are 15 bits; see Instruction Interpretation Process
fm             field of instruction; operation code or function
fmi            fm□i; extended op code
i              field of instruction; specifies a register or an extension to op code
j              field of instruction; specifies a register
k              field of instruction; specifies a register
jk             j□k; a shift constant (6 bits)
K              field of instruction
long_instruction := ((fm
"SAi Xj + K" (fm = 52) → (A[i] ←
Fig. 3. IBM 7094 central-processing-unit information flow. (Courtesy of International Business Machines Corporation.)
Divide-check. The Divide-Check Indicator is turned on, in fixed-point or floating-point division, if the magnitude of the number (dividend) in the AC is greater than or equal to the magnitude of the number in memory (divisor).
Input-output check. The Input-Output Check Indicator (I-O check) is turned on by the attempted execution of an input/output instruction without first selecting an input/output unit.
Transfer trap mode. The computer can be operated in a special Transfer Trap Mode. Operation in the Trap Mode permits the program to run at normal speed with interruptions of normal operation only at transfer points. At such points the location of the last sequential instruction is saved, and a transfer of control is made to a fixed location.
Sense switches. Six Sense Switches are located on the console. They may be turned on or off manually, and there are instructions which sense them.
Sense lights. Four Sense Lights are also on the console. Any one of these lights may be turned on, off, or the status tested by instructions.
Panel in-out switches. These 36 switches on the console may be read by an instruction.

Instruction-set interpretation
The basic computer clock cycle is 2.0 μs in 7094 I and 1.4 μs in 7094 II, as dictated by Mp. Within the single 2- (or 1.4-) microsecond cycle, up to 10 sequential register transfers and/or data operations can take place, each of which transfers information among the Pc's registers; several operations may occur simultaneously. In Pc four different cycles are used: instruction/I, execute/E, logic/L, and buffer/B. The cyclic sequence of an instruction is fixed, always beginning with an I cycle and progressing to E, L, or again I cycles, depending on the instruction. The number of cycles required for an instruction may vary from 1 (e.g., transfer) to 19 (e.g., double-precision floating-point divide).
Instruction cycle (I). The I cycle begins when IC furnishes the instruction location to Mp, via S('Multiplexor) (Fig. 3). The addressed instruction word taken from Mp goes to the Multiplexor Storage Bus. From the Multiplexor Storage Bus the instruction is read into the Storage Register where it is separated into the operation portion and the address portion of the instruction word. The operation portion of the Storage Register goes into the Instruction Register, where the operation code is decoded and the execute control circuitry is set up to perform the operation specified by the instruction. The address portion of the instruction word, now located in the Storage Register, may be used directly. Normally, however, it goes to the Address Register and then to the Multiplexor Address Switch to locate the appropriate data word in Mp. If the address is to be modified, it is routed from the Storage Register to the Index Adders for Index-register modification. The modified address is then brought to the Address Register and on to the Multiplexor Address Switch to locate the data word in core storage.
Concurrently, during the same instruction cycle, a second instruction, located at the immediately higher odd-numbered Mp address location, is brought to the Instruction Backup Register/IBR. While in the IBR, the odd-numbered instruction is partially decoded to determine if it meets certain criteria for concurrent execution, thus saving a second Mp reference. If the instruction in the IBR cannot be executed with the current instruction, it is ignored in the current I cycle and is brought into the Storage Register on the next I cycle.
Execution cycle (E). The execution (E) cycle is used when a reference to core storage is needed. All instructions requiring an operand have an E cycle following the I cycle. Indirect addressing of an instruction requires an extra E cycle. In other words, an instruction that normally goes from I to E will go to I, E, E if it is indirectly addressed.
Logic cycle (L). The L cycle is an execute cycle that does not require a reference to Mp. Many instructions use both E and L cycles for their completion; this is the case when information is required from storage and the instruction cannot be completed during an E cycle. Other instructions require no reference to storage and, therefore, use only I and L cycles.
Buffer cycle (B). A buffer (B) cycle is a null Pc cycle; it is used when the data channels get information from or put information into core storage. This information can be either data or data-channel commands. All demands for B cycles come from the channels themselves. Because of the nature of Ms's and T's, the demand for a B cycle takes precedence over an instruction being performed by Pc. If Pc is in its logic cycle, then both an L and B cycle occur simultaneously.
Instruction interpretation. Instruction flow diagrams for the instructions are given in Fig. 4. These
CAL, and CLS
Operations on AC and Mps (A
) V (BPT
3
1(
A BPT)
)
-> (P
(instruction
-»
(EM2
instruction
->
(EM3 (P
condition POP
:=
(
)
;
+ 1);
nstruct ion A (EM2
2))
A
(
i
nstruct ion A (EM3 • 3)))
(M[0] t-OvDlCP; P *-'00 g + popj;ode);
->
EOM
-> lO^i
nstruct on^xecut ion
;
POT
-> lO^i
nstruct ionjaxecut ion
;
PIN
-» lOjji
nstruct ionjexecut ion;
SKS
-> IO w
nstruct ionjaxecut ion
i
i
programmed operator; 64 user defined instructions called via subroutine link in M[0]; see the definition of the IO instruction set below
;
end Instruction_execution; not including Input Output instructions
)
Input-Output Control from the Pc
KT and KMs State
Devices consist of the following parts:
IO_Device[0:77777₈]      name (or address) of a specific IO device; the EOM command is first given to select the specific device; subsequent commands are implicitly to the selected device
IO_Output[0:77777₈]
IO_Input[0:77777₈]       Input and Output Data buffers associated with specific devices
IO_Ready[0:77777₈]       a bit for each device to denote when device is ready to transmit data
IO_Select[0:77777₈]      a bit within each device denoting it has been selected for an operation
io_unit<0:14>            the particular io device selected by the EOM command
IO Instruction Set
command to select or address the device: energize output M
EOM
-»
(io^uni
POT
~»
(lO^Selectllo^unit] A lOJteadyt io^uni t]
t
c-
•)
J
lOJJutputtiouunit]
(POT)):
(
]
-»
;
t
(PIN));
SKS -»(io _ unit *-e: next l
l
P
wait until ready sVip if signal is not set
1
(lO _Jselect[io U(unit] a IOJ*eady
wait until ready input data command
->
[
io^un
1
1
]
-»
{
«-P + 1);
iojunit
(interrupt
R —» (Interrupt
IET
-
IDT -*
(Interrupt (—i
enable interrupt; turn on mode
«- 1)j t-
0)
-»
P
disable interrupt; turn off
;
each alphabet are ordered
is
name
BILL :=
example
=^©
right (high).
3.4
and y
class(x)
>
a free
is
assigns the
... 9
»M< of
(GC
z
|;,:«
The characters
If
x:=y
Z
.
x
4.1
one of the following alphabets:
of
Commands: assignment,
4.
a sequence of characters written without spaces.
is
reset -time + constant X a) >
first-in-first-out
cocomponents: (input: component, output: component:
p
|
|
~constant)
|
dequeue: (constant ~constant)); |
component);
tmx / tm permanency: (decay transmit-destruct time-multiplexed / irreversible fixed until broken / cyclic permanent
subcomponents: ('control; 'input-buffer: M.i-unit; 'output-buffer:
|
|
|
|
|
|
|
moving
M.i-unit);
fixed manual); |
operation: (open close); |
hang-up-delay:
ft];
concurrency: (1|2); delay: delay(links);
concurrency-type: (simplex half-duplex full-duplex / duplex); |
|
L-initiator: initiator(links); i-rate: i-rate(link);
technology) delay: delay(link);
hang-up-delay: access-time /
A
ft];
constant
ta:
[t])
A gate-switch acts as a simple-link or as no connection. It is used to transmit information conditionally between the ports of two components. It can be used as a basic primitive to express the structure of other switches, including the simple-switch. The parameters will be discussed under the simple-switch.
A simple-switch consists of a set of potential links between a set of input and output components, with an operation (access) that can actualize some subset of the links. This is done according to an instruction called the address (which may or may not be held in a memory). For a switch, the cocomponent input and output ports are sometimes listed to specify the size of the switch.
An important parameter is the concurrency-type, which describes the various subsets that can be simultaneously realized.
7.3 simple-switch := component(
    cocomponents: (input/from: component-set, output/to: component-set, initiator: (cocomponent set));
    subcomponents: control, links: link-set, 'address: memory;
    operation: access;
    size: size(output(cocomponents));
    concurrency: integer;
    concurrency-type: (simplex | half-duplex | full-duplex / duplex | dual-simplex | dual half-duplex | dual full-duplex / dual-duplex | time-multiplexed-cross-point / 1-trunk | cross-point | dual-cross-point | k-trunk);
    hierarchy: (hierarchical | nonhierarchical / anarchical);
    location: (central | distributed);
    distribution: (radial | bussed / bus / chain / daisy chain);
    access-time / ta: switch-type(address / a, prior-address / p)

The values given correspond to practical alternatives: simplex, in which only a single simplex link may be established at a time; duplex, in which a single full-duplex may be established; cross-point (also dual-cross-point), which permits many links with true simultaneity; time-multiplexed-cross-point, in which functional simultaneity is established by means of rapid switching within the course of transmission of an i-unit (in essence the time-multiplexed cross-point has 1-trunk); and finally k-trunks, which permits k simultaneous conversations. We often use a duplex switch instead of simplex or half duplex switch in PMS diagrams, even though the latter would be more accurate.
Size is a redundant attribute derived from the cocomponent set.
As a rule, if there are n identical cocomponents each of which communicates with one another, there is no hierarchy. A telephone system is a typical nonhierarchical structure. Usually the switches internal to a computer are hierarchical in that there are n components of type a which communicate with m components of type b. The a's only communicate with the b's and vice versa; hierarchy does not determine the component initiating the dialogue.
The
location of a switch refers to whether the hardware
is
localized
within one of the components using the switch, whether it is separate (called central), or whether it is distributed through all the cocomponents.
An attribute that is not completely independent is distribution, which denotes whether the physical structure is a continuous bus or chain or is
623
624 Appendix
fed radially from a centralized component. See Fig. 13, Chap. for
common alternative physical structures. A major way of classifying simple-switches
cyclic, linear,
With each
etc.
random,
is
(a)
is
time-
their access
by
parameters in most
critical
and the prior address
(p),
which
switch the
state of the switch. Thus, represents the existing access time consists of a start-up time plus a time proportional to the
mag-
nitude of the difference between the prior address and the desired address. This differs from a linear switch, which only permits movement in one direction and must reset to an initial state if an address lower than the is
An
sought.
interleave
memory
is
one that
consists of
is
compound-switch
an array of switches whose links are connected are inputs to others and thus effects a total
some
so that the outputs of
in a bilinear
existing address (p)
A
page 67
that given the type of formula
determines the actual access time. The two switches are the address being sought
3,
which go from output to input component-sets.
set of links,
It
can be
defined as an extension of a simple-switch, since most parameters are defined identically for both.
Many combinations
The two most common
are possible.
of accessing arrangements
are given above.
A
cascade-switch
is
which each accessing of the next subswitch must take place after the prior one so that the access times add. A parallel-switch makes all the
one
in
accesses simultaneously, so that the total access time
is simply the access time of the subswitch that takes longest. (In both cases, there can be additional overhead time, but this can usually be allotted to the subswitches
and does not require separate terms
in the expressions for access time.)
a collection of random-access memories, depending on the relationship
between a and p access;
a^p
(usually a
mod 4—>
modular one, such
short access).
Random
= p mod 4) -* long means that the access
as (a
access
be only approxiindependent of both a and p. This constancy may mate (as in using a drum with its cyclic character ignored). Queues and
time
is
Control
8.
K =
Control /
8.1
8.2
simple-control
stacks differ from the other switches in having a degenerate addressing
simple-control compound-control
:
|
:
=
component
(
the state of the system such that the next link selected is determined by switch itself. Dequeues allows either of the two ends of a queue to be
cocomponents: controlled / object: component-set, "instruction:
accessed.
subcomponents: 'instruction: memory, working / w: memory,
Permanency refers to how long the switch maintains a link (or set of them) after establishing the link by an access operation. The three common values are (1) the destruction of the connection with the transmission of the i-unit across the link, (2) the maintenance of the connection permanently, and (3) the autonomous movement of the connection (as in disks and drums). The latter two give rise to the p used in the access formulas. Rarer is
a decay function, in which the link remains established for
been transmitted. Hang-up delay is given only for certain permanencies of fixed-until-broken and manual switches. A number of parameters derive directly from the properties of the set
lay,
size of the i-unit, the information-rate, the link de-
the direction of data flow, and the
component
that can initiate data
transmission (as opposed to initiating accessing). Finally, there nology, which
memory
is
not given in detail, since
much
of
it
is
is
operations: evoke / -», next-evoke / next, condition-operations;
controlled-operations: (controlled-component: operation)-list; instruction-source: (none data instruction); |
tech-
identical to
technology.
A simple-control is a logical circuit (usually sequential) that evokes operations in other components (the controlled, or object, components). Thus, its main operations are those of evoking and evoking-next (symbol → and next in ISP). However, it must also detect conditions on which such evoking depends, so that it has available additional operations, that are combined in an instruction-set (see ISP 2.1). These vary greatly in complexity, from boolean operations to arithmetic operations (such as counting the number of i-units processed). A major distinction is the source of the external instructions that can be given the control. At one extreme there
ized as
whose function examples
|
instruction-set)
priate i-unit has
— the
operations: data-operation;
some
period of time, or an irreversible connection, which can be set just once and from then on operates like a simple-link. Hang-up delay is the time taken to break a connection after the appro-
of ports or links
component-set, 'data: component-set;
S('I/0 BUS; location: K; from:P; to:K; half-duplex; initiators: P, K; switch-type:
S(cross-point; 16
random;
M; 6
(P
ta: 5/is;
+
concurrency:
is
that in
More complex
1)
K); concurrency; 6; location:
case
M)
is
may be
which
all
7.4
compound-switch
=
simple-switch
(
essor. It
is
commands).
A
control does not obtain
'address:
links: link-set,
subswitches: switch-set,
memory;
does have an instruction-set, which
technology
is
it
its
to set
it
own into
from a proc-
is
the ISP expression that
all
realized in a logic tech-
actions.
given, since controls are
nology, as given in the definition of component. Likewise, no function
access-time: (cascade: sum(access-time(subswitches))| parallel: max(access-time(subswitches))
No
component
the primary characteristic that distinguishes
shows what conditions evoke what subcomponents: control,
itself.
controls have a separate set of external instructions (often
called control characters or
action. This
The common
the external instruction comes via the data
next instruction, being dependent on an external :
none, as in a clock
to interrupt the system every millisecond.
) )
parameter
is
given, since there exists
no special vocabulary
the different subspecies of control tasks.
to designate
Appendix
examples
input: Pc; output:
K(Mp;
transducer-technology := (analog-digital converter bell buzzer TV camera / vidicon card reader card punch CRT display storage
Mp)
|
K(D(multiply)) 8.3
compound-control
:
=
|
|
CRT simple-control
D
1
|
document reader
subcomponents: alternatives: simple-K-set, 'instruction: memory,
keyboard
working: memory;
|
light
printer linear
|
|
|
film reader film writer |
|
|
|
light
A compound-control consists of a collection of alternative simple-controls and can be given as an extension of the simple-control. At any time, the control is one of these simple-controls. Determination of what simple-control is operative (often called the mode the control is in) is by a mode-instruction from some external component. This additional freedom requires a subcomponent, the control-state, to hold the current specification. (Thus it is rare, though possible, that the actual simple-K is determined by a sequence of mode-instructions, each determining some part of the
|
|
|
|
|
|
9.
A
outK(Instruction set processor/ISP; input:M.processor„state;
K(LCI/0
Bus)); M(read-write;
b/w
1
40 b; working);
/js/w))
Transducer
9.1 T = Transducer / simple-transducer | compound-transducer
9.2 simple-transducer := component

A simple-transducer is a pair of connected links that have different i-units and/or underlying carriers. As defined above, transduction is a digital operation, taking in an i-unit of the input link and producing an i-unit of the output link. Meaning is preserved; that is, only the encoding has changed. Preservation of meaning distinguishes transduction from data operation. The amount of information need not be preserved, so that information divergence is an additional characteristic of a transducer. It may be positive or negative, as the net number of bits is either increased or decreased. A simple-transducer is called a simplex, in that information flow is in one fixed direction only (as in a simple-link). Knowing the function of the transducer permits an inference of whether one interface of the transducer involves a human being. This inference can be derived from the port characteristics.
example
M(read only; 100 w; 36
|
button telephone dial thermocouple Lincoln Laboratory Wand)
divergence
control state.)
put: D, K(Mp),
|
|
|
|
A
example
joystick keys
|
|
by a sequence
|
pen continuous line plotter line printer/ actuator SRI mouse paper tape reader paper tape gun
|
control
gong
|
punch incremental point plotter pressure transducer speech synthesizer Rand tablet Sylvania tablet telephone dial push
instruction-source: mode-instructions)
is
|
|
display printed document display plasma display 3 reader / document reader document printer magnetic character
(
|
control
|
|
T(line printer; 1000 lines/m; 132 char/line; 8 bit/char)
T(paper tape; reader; 300 char/s; 8 b/char; width: 1 in.)
T(sense amplifier; i-rate: .5 w/s; 24 b/w; input: M(memory
(
cocomponents: input: component, output: component,
stack))
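The transducer examples above give their parameters as rates and i-unit sizes, from which a peak information rate in bits per second follows by simple multiplication. The sketch below works this out for the line printer and paper tape reader entries. The function names are hypothetical, and the line printer figure assumes every line is printed at its full 132 characters.

```python
def line_printer_bit_rate(lines_per_min: float, chars_per_line: int,
                          bits_per_char: int) -> float:
    """Peak bits per second for a line printer, assuming full-width lines."""
    return lines_per_min * chars_per_line * bits_per_char / 60.0


def char_device_bit_rate(chars_per_s: float, bits_per_char: int) -> float:
    """Bits per second for a character-serial device such as a tape reader."""
    return chars_per_s * bits_per_char


# T(line printer; 1000 lines/m; 132 char/line; 8 bit/char)
print(line_printer_bit_rate(1000, 132, 8))  # 17600.0
# T(paper tape; reader; 300 char/s; 8 b/char)
print(char_device_bit_rate(300, 8))         # 2400
```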
initiator:
(input output both); |
|
9.3
subcomponents: input: L, output: L, 'control;
compound-transducer
concurrency:
See port of component;
o-rate
console [i/t];
either
(amplification analog-digital angular|
|
linear attenuation electroluminescence electromagnetic |
|
|
|
electromechanical electromechanical-acoustic electro-optical |
|
mechanical-indentation photochemical xerographic) |
|
|
|
:
=
|
|
(airlines reservations stock |
|
quotation data collection) |
of simple-transducers. The two and the the kinds are full-duplex, which are extensions half-duplex simplest of the simple-transducer, wherein the direction of information flow can be
1;
=
|
|
A compound-transducer consists of a set
concurrency-type: simplex;
:
card reader-punch computer
|
|
transduction-technology
=
magnetic tape transport typewriter Teletype special purpose
[i];
portability: (portable not portable / fixed);
concurrency:
:
|
|
X
duplex);
console / processor console / console Dataphone keyboard-CRT card transport display diskpak drive film write-reader magnetic
i-unit(input)
divergence-rate / divergence
full |
+ integer)
compound-transducer-technology
transduction: port(output) «- port(input);
—
j
compound-transducer-technology;
1
1 1
segment
|
segments paging) |
= (fixed length page segments multiple length page segments variable length page segments :
|
|
|
segments);
P-concurrency:
/ serial
(serial
by
bit parallel / parallel |
by word
multiple instruction streams multiple data streams (arrays) pipeline processing instruction-memory |
|
|
);
instruction-memory := (none|l instruction look ahead |n instruction look ahead cache / look aside / slave memory)) |
possibilities as
really
leading number, since
1
(no relocation protect only
segmented-programming
Mp,
and the program-switching time, which is change context from one program to another. In simple operating regimes (standard batch processing) program-switching time is not an important parameter; it becomes so when interrupts are permitted. For interis
+
multiprogramming!
|
to
rupts, the response time
fixed |
|
operations can be performed per cycle time
an averaging of the various
n Pio monitor
-
swapped program
2 segment / pure impure segments
memories, since they
the cycle-time of
(
|
index registers and general
struction set);
for data
simple-processor
P|l P with interrupt |1 program with multiple
multiprogramming
and
long run limits the rate at which instructions and data can be accessed (and also determines the maximum throughput); the concur-
which
ad-
1
multiprogramming segmented-programming);
various amounts of
its own operation time and its own possibeing overlapped with other operations. Several parameters are
summarize
~2 w/ instruction;
/ instruction)
)
of the operations has
given that
is
(1
=
:
concurrent subprograms
named
bilities for
is
address / instruction;
500 kw/s; data-types: words; integer; 36 b/w
79 (LD AC ((A
formed from
(A
single bit vector
add
>
0) -» P P (Ov, M[B]
0:11> :=
jump
1)
need not be given
EXAMPLES OF REGISTERS FORMED BY CONCATENATION
LAC
B)
1)-(A
Lehman,
M., R. Eshed,
—A New
and
Z. Netter:
SABRAC IEEE
Generation Serial Computer,
Trans., vol.
EC-12, no.
6, pp.
618-628, Decem-
ber, 1963.
LehmM65
E.:
Changes
in
Computer
vol. 12, no. 9, pp.
Per-
M.: Serial
LehmM66
Mode Operation and
Parallel Processing, Proc.
Pt. 2, pp.
40-54,
E.: Evolving Computer Performance 1963-1967, Datamation, vol. 14, no. 1, pp. 31-35, January, 1968.
Lehman, speed
September, 1966.
KnutD66
L.,
Lebedev, S. A.: The High-speed Calculating Machine of the Academy of Sciences of the USSR, 1956. J. ACM, vol. 3, pp. 129-133,
partial translation
The Oracle Memory System,
formance, Datamation,
KnigK68
J.
LebeS56
1953. Knight, Kenneth
W. W. Lichtenberger, and in a Time-sharing
sharing System, AFIPS Proc. FJCC, 601-609, 1967.
Izdatelstvo
Argonne Natl. Lab., Proc. Symp. on Large Scale Digital Computing Machines, pp. 47-58, August,
KnigK66
Langdon,
pp.
available, 1956.
KleiR53
W.,
A User Machine
of a High-speed Transistor for the ASLT Current Switch, IBM J. of Res. and Dev., vol. 11, no. 1,
vol. 4, no.
Tsifrovie
Machines),
Digital
B.
Pirtle:
System, Proc. IEEE, vol. 54, no. 12, pp. 1766-1774, December, 1966.
April,
Kinslow, H. A.: The Time-sharing Monitor System, AFIPS Proc. FJCC, Pt. I, vol. 26, pp. 443-454, 1964.
C-17, no. 8, pp.
Lampson,
1962.
KinsH64
vol.
LampB66
279-294, 1961. KilbT62
ILLIAC IV Software and Application
J.:
Lampson, B. W.: Interactive Machine Language Programming, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 473-481, 1965.
222-225, October, 1961.
Payne, and D. J. Howarth: The AFIPS Proc. EJCC, vol. 20, pp.
November,
LampB65
I:
puter
Kuck, D.
Programming, IEEE Trans., 758-770, August, 1968.
1960. G. Edwards, and
Division,
1961.
High-
IFIP Cong. 1965,
631-633, 1965.
M.: A Survey of Problems and Preliminary Results Concerning Parallel Processing and Parallel Processors, Proc. IEEE, vol. 54, no. 12, pp. 1889-1901, December, 1966.
Lehman,
Knight, Kenneth
Knuth, D. E.: Additional Comments on a Problem in Concurrent Programming Control, Comm. ACM, vol. 9, no. 5, pp. 321-322, 1966.
LeinA54
A.
L.,
and
S.
N.
Alexander: System
neers, vol. EC-3, no. 1, pp. 1-10,
LeinA57
March, 1954.
Leiner, A. L, W. A. Notz, J. L. Smith,
and
A.
Weinberger: Organizing a Network of Computers to Meet Deadlines, Proc. EJCC, pp. 115-128, 1957.
Kroger, Marlin G., et al.: Computers in Command and Control, TR61-12, prepared for
DOD:ARPA by Digital Computer Application Study, Institute for Defense Analyses, Research
Leiner,
Organization of the DYSEAC, Professional Group on Electronic Computers, Institute of Radio Engi-
LeinA58
Leiner, A. L, W. A. Notz, J. L. Smith,
and
A.
Proc.
System,
LeinA59
NBS Multicomputer EJCC, pp. 71-75, 1958.
PILOT, The
Weinberger:
Leiner, A. L, W. A. Notz, J.
New
Weinberger: PILOT, A
System, 1959.
LichW65
ACM,
/.
vol. 6,
L.
Smith, and A.
Multiple
Computer
no. 3, pp.
313-335,
Pirtle: A Facility Man-Machine Inter-
in
Experimentation
action,
MaheR61
AFIPS
Proc.
FJCC,
Pt.
I,
vol.
27,
MarcM63
Lindquist, A. B., R. R. Seeber, and L. W. Comeau: A Time-sharing System Using an Associative Memory, Proc. IEEE, vol. 54, no. 12, pp.
Liptay,
85,
LoneW61
LonsK56
Lonsdale, vol. 103,
Lourie,
MelbA65
and
Digital
Supp.
2,
E. T.
Warburton: Mercury: A
Computer, Proc. IEE, pp. 174-183, 1956.
H. Schrimpf, R. Reach, Proc.
McCarthy, in
MendM66
MercR57
ture,"
M.I.T.
Press,
Merrl56
in
EJCC, pp.
a Multi-
MetrN52
75-81,
MMIW63
Cambridge, Mass.,
J.,
McCormick, Bruce
MiraW67
McCullough,
H.:
The
Illinois
Pattern Rec-
J. D., K.
H. Speierman,
and
F.
McPherson,
J.
L.,
Proc.
FORTRAN
and
J.,
A. W.
24-27,
April,
England: The
SDS
Real-time, Time-sharing Computer,
FJCC,
Mercer, Robert
J.:
vol.
29,
pp. 51-64,
Micro-programming, 157-171, 1957.
/.
1966.
ACM,
Merry, I. W., and B. G. Maudsley: The MagneticStore of the Computer Pegasus, Proc. IEE, B, vol.
103, Supp.
2,
pp. 197-202, 1956.
Metropolis, N., E. Richardson, H. B. Proc.
Klein, W. Orvedahl, J. R.
F.
Demuth, and
ACM,
J.
B.
Jackson:
Toronto Conf, pp. 13-17,
Miller,
W.
F.,
and
R. A.
Aschenbrenner. The
GUS
Miranker, W. L., and W. M. Liniger: Parallel Methods for the Numerical Integration of Ordi-
nary Differential Equations, Math, of Computa99, pp. 303-320, July, 1967.
MolnC67
and
S.
N. Alexander: Per-
Molnar, Charles E., Severo M. Ornstein, and Antharvedi Anne: The CHASM: A Macromodular
Computer
W.
Zurcher: Design for a Multiple User Multiprocessing System, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 611-617, 1965.
McPhJ51
AFIPS
A
vol. 8, pp.
tion, vol. 21, no.
ognition Computer-ILLIAC III, IEEE Trans., vol. EC-12, no. 5, pp. 791-813, December, 1963.
McCuJ65
7:
M. Pugmire: A Small
Multicomputer System, IEEE Trans., vol. EC-12, no. 6, pp. 671-676, December, 1963.
pp. 51-57, 1963.
McCoB63
J.
Direct Processing of
September, 1952.
R. Licklider:
for a
SIGMA
MANIAC,
S. Boilen, E. Fredkin, and J. C. A Time-sharing Debugging System Small Computer, AFIPS Proc. SJCC, vol. 23,
McCarthy,
and
J.,
Nash: The Ordvac,
P.
J.
37-43, December, 1951.
drum
1962.
McCaJ63
Mendelson, M.
Pt.
Management and The
Melbourne, A.
vol. 4, no. 2, pp.
Pt. B,
"Time Sharing Computer Systems the Computer of the Fu-
J.:
and
Statements, Computer J., 1965.
1959.
McCaJ62
E.,
Conf., pp.
and W. Kahn:
Arithmetic and Control Techniques
program Computer,
R.
Meagher,
604 Machine Description, IBM 38 pp., December, 1963.
Computer for the
F.:
K.,
IM.,
MeagR51
J.,
Lonergan, William, and Paul King: Design of the B 5000 System, Datamation, vol. 7, no. 5, pp. 28-32, May, 1961.
High Speed
LourN59
Sys.
ASLT: An Extension of Hybrid Miniaturization Techniques, IBM J. of Res. and Dev., vol. 11, no. 1, pp. 86-92, January, 1967. Lloyd, R. H.
R. M.:
Meade,
Proc. SJCC,
29-40, 1963.
Internal Mem.,
15-21, 1968.
vol. 7, no. 1, pp.
LloyR67
The Cache, IBM
II.
MeadR63
AIEE-IRE
Structural Aspects of the Sys-
S.:
J.
tem/360 Model
Marcotty, M. J., F. M. Longstaff, and A. P. M. Williams: Time-sharing on the Ferranti-Packard vol. 23, pp.
1774-1779, December, 1966. LiptJ68
J.:
Multiprocessor
FP6000 Computer System, AFIPS
pp.
589-598, 1965.
LindA66
Problems of Storage Allocation in Multiprogrammed System, Comm. ACM, vol. 4, no. 10, pp. 421-422, OctoMaher, R.
a
ber, 1961.
Lichtenberger, W., and M. W. for
formance of the Census Univac System, AIEE-IRE Conf., pp. 16-22, December, 1951.
for Analyzing
Proc. SJCC, vol. 30, pp.
MonnR68
Monnier, Richard tor
with
E.:
Neuron Models, AFIPS 393-401, 1967.
A New
Electronic Calcula-
Computerlike Capabilities, Hewlett-Packard J., vol. 20, no. 1, pp. 3-9, September, 1968.
MorrD67
Morris, Derrick, Frank H. Sumner, and Michael Wyld: An Appraisal of the Atlas Supervisor,
PapiW57
ACM
Proc.
MuntC62
2,
MyerT68
A
Processing Interpreter for
List
M.I.T., Instrumentation Lab.,
AGC4, MurtJ66
C. A.:
Muntz,
PatzW67
York, 1966.
PennJ62 and
T. H.,
E.
I.
On the Design Comm. ACM, vol. 11, no.
Sutherland:
and
and
The Logic Theory
H. A. Simon:
vol. IT-2, no. 3, pp.
J.
WJCC,
C.
Shaw, and H.
PikeJ52
PlugW61
A.
Simon: Empiri-
PortR60
J. C. Shaw, and H. A. Simon: The Elements of a Theory of Human Problem Solving, Psychology Rev., vol. 65, pp. 151-166, March, 1958.
RajcJ43
OssaJ65
731-733, December, 1964.
Hardware for Information Processing Systems: Today and in the Future, Proc. IEEE, vol. 54, no. 12, pp. 1820-1835, DecemOsborne, Thomas E.: Hardware Design of the Model 9100A Calculator, Hewlett-Packard J., vol.
RandB68
PadeA64
Padegs,
IBM PadeA68
Sys.
Padegs,
I,
A.: J.,
A.:
Structural
85,
pp. 22-29, 1968.
James
473-
vol. 5, no. 9, pp.
Input-Output Devices Used with pp. 36-38, Decem-
L.:
III.
Aspects of the Sys Extensions to Float
IBM
Sys.
J.,
vol. 7,
"SABRE"
Proc.
WJCC,
Porter, R.
no
Perry: American AirElectronic Reservations System,
and M. N.
Plugge, W. R., lines'
pp.
593-602, May, 1961.
The RW-400-A New Polymorphic
E.:
Rajchman,
Randell, B.,
J.,
no.
6,
Snyder, and Rudnick:
under terms
of
1,
RCA
OSRD
pp.
Labo-
contract
and
C. J.
RichR55
Kuehner: Dynamic Storage Comm. ACM, vol. 11, no. 5,
297-306, May, 1968.
Richards, R. K.: "Arithmetic Operations tal
Computers"
D.
in Digi-
Van Nostrand Company,
Inc.,
Princeton, N.J., 1955.
RobeJ58
Robertson, J. E.: A New Class of Digital Division Methods, IRE Trans., vol. EC-7, no. 3, pp. 218-222, September, 1958.
RobeL67
Roberts, Lawrence G.: Multiple Computer Networks and Intercomputer Communication, ACM Symp. on Operating System Principles, Gatlinburg, Tenn., Oct. 1-4, 1967.
RoseG67
Rose, Gordon
grammed
Channel Design Considerations vol. 3, no. 2, pp. 165-180, 1964
ing-point Architecture, 1,
231-241, 1965.
vol. 27, pp.
tem/360 Model
Pike,
pp.
pp. 10-13, September, 1968.
J. F., L. E. Mikus, and S. D. Dunten: Communications and Input-Output Switching in a Multiplex Computing System, AFIPS Proc.
Pt.
in
Allocation Systems,
Ossanna,
FJCC,
and T. Pearcey: Use of Multiprothe Design of a Low Cost Digital
J. P.,
OEM-sr-591.
Nisenoff, N.:
1,
Penny,
ratories Report,
Nievergelt, J.: Parallel Methods for Integrating Ordinary Differential Equations, Comm. ACM,
20, no.
Memory and Computer
Data System, Datamation, vol. 8-14, January/February, 1960.
Theory Machine,
ber, 1966.
0sboT68
Read-only
SEAC, AIEE-IRE-ACM Conf.,
pp. 218-230, February, 1957.
12, pp.
A.:
ber, 1952.
Newell, A.,
vol. 7, no.
NiseN66
vol. 6,
Computer, Comm. ACM,
61-79,
J. C.
cal Explorations of the Logic
NievJ64
Gilbert C. Vandling: Sys-
Microprogramming, Comno. 12, pp. 62-66, Decem-
of
476, September, 1962.
Shaw: Programming the Newell, A., Logic Theory Machine, Proc. WJCC, pp. 230-
Newell, A.,
Peacock,
gramming
240, February, 1957.
NeweA58
and
1 Control, to be published.
410-414, June, 1968.
Newell, A.,
Proc.
J.,
162-
vol. 30, no. 10, pp.
J.
Machine, IRE Trans., September, 1956.
NeweA57fo
Patzer, William
puter Design, ber, 1967.
PeacA??
Myer,
High-speed Computer Stores 2.5
N.:
tems Implications
Mem.
C: Highly Parallel Information Processing Systems, in "Advances in Computers," vol. 7, pp. 2-116, Academic Press, Inc., New Murtha,
6, pp.
NeweA57a
AGC
Cambridge, Mass., January, 1962.
of Display Processors,
NeweA56
67-75, 1967.
Natl. Meeting, pp.
Papian, W.
Megabits, Electronics, 167, October, 1957.
T.
Trans., vol.
A.:
"Intergraphic,"
A
Micropro-
Graphical-Interface Computer,
EC-16, no.
6, pp.
IEEE
773-784, Decem-
ber, 1967.
'According to E. F. Codd, this article has not been published as of Jan. 23, 1968. However, "Microprogram Control for System/360" by S. G. Tucker, IBM Sys. J., vol. 6, no. 4, 1967, has and covers the material that we think was intended to be in PeacA??.
RoseJ65
Marbles and Boxes, IBM Res. Yorktown Hts., N.Y., November, Rept.,
Rosenfeld, Project
1965.
RoseS67
nization for Array Processing,
J.:
Pt.
SerrR62
Rosen, Saul: "Programming Systems and Languages," McGraw-Hill Book Company, New
RoseS69
Rosen, Saul: Electronic Computers: A Historical Survey, Computing Surveys, vol. 7-36, March, 1969.
RosiR69
no.
1,
1,
ShanC38
pp.
SharW69
1969.
ShawJ58
F.:
RossH53
Ross, Harold D., Jr.: The Arithmetic Element of the IBM Type 701 Computer, Proc. IRE, vol. 41, no. 10, pp. 1287-1294, October, 1953.
RothS59
Rothman, Intern.
S.:
R/W 40
Data Processing System,
SaltJ66
ShedG66/)
SaxoJ63
Saxon,
"Programming the IBM 7090,"
A.:
J.
SchlH??
Schlaeppi, H.
P.:
Englewood
Cliffs, N.J.,
SchwJ64
SechR67
Schwartz,
J.
I.:
Sechler, R.
A. R. Strube,
397-411,
ASLT Circuit
and
vol.
Seeber, R. R., and A. B. Lindquist: Associative
Logic for Highly Parallel Systems, AFIPS Proc. FJCC, vol. 24, pp. 489-493, 1963.
SegaR61
Segal, R.
J.,
and H.
P.
SlutR51
W. Carl Borck, and Robert
Slutz,
Ralph
J.:
Engineering Experience with the
SEAC, AIEE-IRE Conf,
to Air Force Digital Data Communication System, AFIPS Proc. EJCC, vol. 20, pp. 264-278, 1961.
Senzig, D. N.,
and
R. V. Smith:
pp.
90-94, December,
1951.
SmitR64
S0I0M66
SquiJ63
V., and D. N. Senzig: Computer Organization for Array Processing, IBM Res. Rept. RC
Smith, R.
N.Y.,
December, 1964.
Solomon, Martin B., Jr.: Economies of Scale and the IBM System /360, Comm. ACM, vol. 9, no.
Computer Orga-
435-440, June, 1966.
J. S., and S. M. Polais: Programming and Design Considerations of a Highly Parallel Computer, AFIPS Proc. SJCC, vol. 23, pp. 395-
Squire,
400, 1963.
Guerber: Four Advanced
Computers— Key
SenzD65
L.,
McReynolds: The SOLOMON Computer, AFIPS Proc. FJCC, vol. 22, pp. 97-107, 1962. C.
6, pp.
SeebR63
RC
Turnbull:
IBM J. of Res. and Dev., 74-85, January, 1967.
Design,
11, no. 1, pp.
J. R.
Res. Rept.
Slotnick, Daniel
1330, Yorktown Hts., F.,
IBM
SlotD62
A General-purpose Time-sharing Proc. SJCC, vol. 25, pp.
Numerical Methods for
Shupe, P. D., and R. A. Kirsch: SEAC, Review of Three Years of Operation, Proc. EJCC, pp. 83-90, 1953.
Extensions of PL/l-like Lan-
System, AFIPS 1964.
Parallel
ShupP53
1963.
guages for Parallel Processing, with Programming Examples, in preparation.
S.:
1619, Yorktown Hts., N.Y., June, 1966.
L.:
Prentice-Hall, Inc.,
Shedler, G.
the Solution of Equations,
MAC-TR-30,
Accents, Proc.
S., and M. Lehman: Parallel Compuand the Solution of Polynomial Equa-
Shedler, G.
IBM Res. Rept. 1550, Yorktown Hts., N.Y., February, 1966.
Saltzer, J. H.: Traffic Control in a Multiplexed
Computers with European WJCC, pp. 14-17, 1957.
York,
1958.
ShedG66«
tions,
Samuel, Arthur
Com-
J. C., A. Newell, H. A. Simon, and T. O. A Command Structure for Complex Information Processing, Proc. WJCC, pp. 119-128,
Ramo-Wooldridge, Div. of Inc., Los Angeles,
July, 1966.
of
New
Ellis:
1959,
M.I.T. Tech. Rept.
"The Economics
Shaw,
tation
June, 1959.
F.:
1969.
Information Processing and
Computer System,
SamuA57
Sharpe, William
puters," Columbia University Press,
Thompson RamoWooldridge, Calif.,
Shannon, C. E.: A Symbolic Analysis of Relay and Switching Circuits, Trans. AIEE, vol. 57, pp.
on
Conf.
Auto-math
M. M. Astrahan, G. W. Patterson, and Pyne: The Evolution of Computing Machines and Systems, Proc. IRE, vol. 50, no. 5, pp. 1039-1058, May, 1962. B.
713-723, 1938.
Contemporary Concepts of Microprogramming and Emulation, Computing Surveys, vol. 1, no. 4, pp. 197-212, December,
Rosin, Robert
AFIPS Proc. FJCC,
117-128, 1965.
vol. 27, pp.
Serrell, R., I.
York, 1967.
I,
SteeT61
Steel, T.
WJCC, StevL52
B„
pp.
Stevens,
L.
Jr.:
A
First Version of
UNCOL,
Proc.
371-377, 1961. D.:
Engineering Organization of for the IBM 701 Electronic
Input and Output
AIEE-IRE-ACM Machine, Data-processing Conf., pp. 81-85, December, 1952. StevW64
Stevens, W. Y.: The Structure of System/360, Part II— System Implementations, IBM Sys. J.,
Walendziewicz,
neering, Anaheim, 30-31, 1962.
C: Time Sharing in Large Fast ComProc. ICIP, UNESCO, pp. 336-341, June,
WareW63a
F. H., G. Haley, and E. C. Y. Chen: The Central Control Unit of the "Atlas" Computer,
Sumner,
Taylor,
Norman
H.:
Evaluation of the Engineer-
ing Aspects of Whirlwind
AIEE-IRE
I,
ples of Operation,
WareW63b
A Review
Teager, Herbert M.: Rev.,
Computing
vol.
6,
of
AmdaG64a;
no. 5, pp.
R. N.,
Thompson,
and
WebeH67
Thornton, James E.: Parallel Operation Control Data 6600, AFIPS Proc. FJCC, Pt.
Tomasulo,
R.
An
M.:
in
the
II,
vol.
Efficient Algorithm
WeikM55
VandW52
WeikM61 for
VandW56
Sci.
Van der
L.:
Poel, W.
Res.,
Van der
Poel, W.
The
WestG60
30,
549-558, Sep-
Electronic Digital
Sec.
B, vol. 2,
A Survey of Domestic Electronic Computing Systems, Ballistic Research Decem-
Weik, Martin H.:
A Third Survey
Electronic Digital
Computing Systems,
BRL Rept.
Project No.
of
Domestic
Md.;
Ballistic
report
1010, Department of the
5B03-06-002 (1961).
Weik, Martin H., Jr.: A Fourth Survey of Domestic Electronic Digital Computer Systems,
West, George P., and Ralph J. Koerner: Communications within a Polymorphic Intellectronic
System, Proc. WJCC, pp. 225-230, 1960. Wilkinson,
Automatic
pp.
cal
Logical Principles of
Thesis,
WilkM51a
Amsterdam,
J.
H.:
Digital
"The Pilot ACE," pp. 5-14, Computation, National Physi-
Laboratory, Teddington,
England, March
UNESCO,
pp.
361-365,
The Best Way to Design An Automatic Calculating Machine, Manchester University Computer Inaugural Conf, July, 1951. PubWilkes, M. V:
lished by Ferranti Ltd.,
ZEBRA, A Simple Binary
L.:
Proc. ICIP,
June, 1959.
vol. 10, no. 9, pp.
Research Laboratories, Aberdeen, Md., 1227; processed by Defense DocumentaRept. tion Agency, Defense Supply Agency No. 42900, January, 1964.
1956.
Computer,
York, 1963.
25-28, 1953.
Some Simple Computers, VandW59
New
Ballistic
WilkJ53
A Simple
Computer, Appl. 367-400, 1952.
and Machine Design,"
Weik, M. H.:
Army
and
Unger, S. H.: A Computer Oriented toward Spatial Problems, Proc. IRE, vol. 46, no. 10, pp. 1744-1750, October, 1958. L.:
Inc.,
EULER on IBM System/360 Model
supersedes
WeikM64
Poel, W.
Sons,
Research Laboratories, Aberdeen,
G.:
Turing, Sara: "Alan M. Turing," W. Heffer
Van der
&
ber, 1955.
Sons, Ltd., Cambridge, England, 1959.
UngeS58
York, 1963.
Laboratories, Aberdeen, Md., Rept. 971,
1967.
TuriS59
New
Weber, Helmut: A Microprogrammed Implemen-
Digital
Microprogram Control for System/360, IBM Sys. J., vol. 6, no. 4, pp. 222-241, S.
and Programming," John
D825
1967. Tucker,
Inc.,
vol. 2, "Circuits
Comm. ACM,
Exploiting Multiple Arithmetic Units, IBM J. of Res. and Dev., vol. 11, no. 1, pp. 25-33, January,
TuckS67
Oct.
tember, 1967. Wilkinson: The
J. A.
26, pp. 33-40, 1964.
TomaR67
117-127,
pp.
Ware, W. H.: "Digital Computer Technology and
tation of
355-356,
Automatic Operating and Scheduling Program, AFIPS Proc. SJCC, vol. 23, pp. 41-49, 1963.
ThorJ64
Sons,
John Wiley
September-October, 1965.
ThomR63
&
Design,"
Conf., pp.
75-78, December, 1951.
TeagH65
Calif,
Ware, W. H.: "Digital Computer Technology and Design," vol. 1, "Mathematical Topics, PrinciWiley
657-662, 1962.
Proc. IFIP Cong. 1962, pp.
The D210 Magnetic ComComputer Engi-
E. T.:
puter, Proc. Conf. on Spaceborne
1959.
TaylN51
WaleE62
Strachey, puters,
SumnF62
Vyssotsky, V. A., F. J. Corbato, and R. M. Graham: Structure of the Multics Supervisor, AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 203-212, 1965.
136-143, 1964.
vol. 3, no. 2, pp.
StraC59
VyssV65
WilkM51b
London.
The Edsac Computer, AIEE-IRE Conf, pp. 79-83, December, 1951.
Wilkes, M. V:
WilkM52
V., D. J. Wheeler, and S. Gill: "The Preparation of Programs for a Digital Compu-
Wilkes, M.
Addison-Wesley Publishing Company, Reading, Mass., 1952.
ter,"
WilkM53
WilkM58fc
WMIF49
M. V., and J. B. Stringer: Microprogramming and the Design of the Control
Circuits in
an Electronic
Cambridge
Phil.
Digital
Computer,
March, 1949.
183-202,
Proc.
Soc, Pt. 2, vol. 49, pp. 230-238,
WirsJ66
1953.
W. Renwick, and D. J. Wheeler: The Design of the Control Unit of an Electronic Digital Computer, Proc. IEE, Pt. B, vol. 105, pp. 121-128, March, 1958. Wilkes, M.
F. C, and T. Kilburn: A Storage System Use with Binary-Digital Computing Machines, Proc. IEE, Pt. 3, vol. 96, pp. 81-100,
Williams, for
Wilkes,
April,
WilkM58a
Inc.,
Operating Experience, Proc. EJCC, pp. 91-95, 1953.
Wirsching, Joseph
WirtN66a
Proc.
Microprogramming,
V.:
tion of I,
Wilkes, M.
V.:
Slave Memories and Dynamic
V.:
Trans., vol.
12,
List-oriented
no.
12,
pp.
and H. Weber: EULER: A GeneralizaALGOL, and Its Formal Definition: Part
Comm. ACM,
vol. 9, no. 1, pp.
13-25, Janu-
The Growth of
Charles
R.:
WirtN66c
A Review
of
ORDVAC
Comm. ACM,
vol. 9, no. 2, pp.
89-99, Febru-
ary, 1966.
Interest in Micro-
1969.
and H. Weber: EULER: A GeneralizaALGOL, and Its Formal Definition: Part
Wirth, N.,
II,
EC-14, no.
programming: A Literature Survey, Computing Surveys, vol. 1, no. 3, pp. 139-145, September,
Williams,
NOVA: A
EJCC,
WirtN66b
Storage Allocation, IEEE 2, pp. 270-271, 1965.
WillC53
E.:
ary, 1966.
Wilkes, M.
Wilkes, M.
96, pp.
Wirth, N.,
tion of
WilkM69
in Pt. 2, vol.
1949.
Computer, Datamation, vol. 41-43, December, 1966.
V.,
pp. 18-20, 1958.
WilkM65
Same paper
April,
Wirth, N.:
A Note on "Program Structures" for Comm. ACM, vol. 9, no. 5,
Parallel Processing,
pp.
ZadeL63
320-321, May, 1966.
Lotfi A., and Charles A. Desoer: "Linear System Theory," McGraw-Hill Book Company,
Zadeh,
New
York, 1963.
Name Adams, Charles W.,
Burdette, E. W., 119
Adams
Burks, Arthur W., 86-119
42, 585 Associates, 42, 257, 580
Ainsworth, Ernest, 212 Alexander,
S.
N., 165,
469
Bussell, B.,
212
496 M. W., 469
Allard, R. W.,
Allen,
Allmark, R. H., 257, 262-266 Alonso, R. L., 146-156
Amdahl, Gene M., 259, 469, 561 Anderson, D. W., 587 Anderson, James
P.,
257, 348, 447-455, 469,
586 Anderson, S. F., 587 Anne, Antharvedi, 73 Arbuckle, R. A., 50 Arbuckle, T., 349
Arden, B. W., 81, 275, 469, 566, 571 Aschenbrenner, R. A., 469 Aspinall, D., 277 Astrahan, M. M., 42, 119, 144, 212, 223, 515
Babbage, Charles, 46 Backus, John, 9 Baldwin, F. R., 46 Baldwin, R. R., 469 Barnes, George H., 320-333 Bartlett, K. A., 504 Barton, R. S., 257, 273 Bashkow, Theodore R., 363-381 Basilewskii, Iu. la., 213 Beckman, F. S., 146 Belsky, M. A., 349 Benington, H. D., 504 Bernstein, A., 349 Bhushan, A., 507 Bibb, J., 469 Blaauw, G. A., 259, 426, 428, 464, 561, 588-601 Blair-Smith, H., 146-156 Bloch, Erich,
421-439
P.,
Boland, L. J., 587 Borck, W. Carl, 320, 463
Bouchon, Falcon, Jacques, 46 Boutwell, E., Jr., 334 Bowden, B. V., 42 Bright, H. S., 291, 456 Brooker, R. A., 279
Edwards, D. B. G., 276-290 Elbourne, R. D., 172, 212 Elliott, W. S., 171-183 Ellis, T. O., 257, 349-362 England, A. W., 396 England, W. A., 149 Ernst, H. A., 469 Eshed, R., 469 Estrin, Gerald, 119, 469 Evans, D. S., 171 Everett, R. R., 137-145, 504 Ewing, R. G, 469
Fagen, R.
E.,
496
Fagg, P., 385 Fairclough, J. W., 171, 174, 176, 385 Falkoff, A. D., 13, 458,
587
Fikes, Richard E., 571
587 J., 83, 340, 291, 469 Forgie, James Forrester, J. W., 75
Flynn, Michael
W.,
Fotheringham, John, 190 Frankovich, J. M., 469 Fredkin,
E.,
291
Fried, 45 Frizzell,
Clarence
E.,
525
Ill
Cray, Seymour, 471 Critchlow, A. J., 469
Galler, B. A., 81, 275, 469, 566, 571
Culler, Glen, 45
Daley, Robert
Gibson, C. T., 81, 587 Gibson, D. H., 574
C,
275, 297, 469, 517, 523, 571
Gibson,
W.
Gill, S.,
456
B.,
469
Darringer, John A., 13 Davies, D. W., 504
Glaser, E. L., 469 Goldschmidt, R. E., 587
Davies, P. M., 469
Goldstine, Herman H., 87-119 Grabbe, E. M., 205-215, 220-224 Graham, R. M., 469 Granito, G. D., 587
Davis, G. M., 257 Dean, R. F., 340, 587
Demuth, H.
B.,
Desmonde, W.
Bock, R. V., 257 S., 291
Boilen,
P., Jr., 146,
Crawford,
119
Dennis, Jack B., 81, 275, 295, 457, 469 Dent, B. A., 257
Blosk, R. T., 439
Brooks, F.
Campbell, Robert V. D., 42 Carlson, C. B., 257, 273 Carpenter, H. G., 171 Carr, J. W., III, 205-215, 220-224 Carter, W. C., 587 Casale, Charles T., 69, 155, 156, 396 Chase, George C., 42 Chen, E. C. Y., 274 Chen, T. C., 587 Chu, J. C., 119, 396 Clark, Wesley A., 274 Clayton, B. B., 496 Cochran, David S., 243-256, 439 Codd, E. F., 397, 439, 469 Comeau, L. W., 587 Comfort, W. T., 291, 469 Conti, Carl J., 563, 574 Conway, Melvin E., 295, 457 Corbato, Fernando J., 295, 457, 469, 517, 523, 571 Couleur, J., 469 Cox, Jerome R., Jr., 50
H.,
456
Desoer, Charles A., 7 Devonald, C. H., 171-183
Green, A., 156 Green, J., 392 Greene, J., 340, 587 Greenstadt, J. L., 525 Greenwald, Sidney, 212 Gregory, J. G., 315, 463 Grimsdale, R. L., 277, 587 Grosch, H. R. J., 585 Gruenberger, F. J., 89, 119
Dijkstra, E. W., 469 Doody, D. T., 385 Dorff, E. K., 496 Dreyfus, P., 456 Dunten, S. D., 469 Dunwell, S. W., 421
Index
Grumette, Murray, 525 Guerber, H. P., 509
259, 349, 423, 428, 464,
561, 588-601
Brown, J. L., 385 Brown, Richard M., 320-333 Buchholz, Werner, 396, 421, 428, 469, 515
Earle, J. G., 587 Eccles, W. H., 46 Eckert, J. Presper, Jr., 91, 157-169, 396
Haines, L. H., 392 Haley, A. C. D., 266
Haley, G., 274
Name index
Hamblin, C.
257
L.,
Lebedev,
Haney, Frederick M., 9 Hartley, D. F., 290 Hauck, E. A., 257 Haueter, R. C., 212 Hayata, Tomo, 344 Hellerman, H., 469 Herwitz, Paul S., 397 Hillegass, John R., 587 Hipp, J. A., 385 Hodges, Donald, 257 Hoffman, Samuel A., 257, 447-455, 469 Holland, John, 315, 320 Hollerith, H., 46
Hopkins, A. L., 146-156, 349 Hoskinson, E. A., 334 Howarth, D. J., 274
Hughes, E. S., Jr., 223 Huskey, H. D., 191, 193
Iverson,
Jackson,
Kenneth
469 71, 334,
Kato, Maso, 320-333 Katz,
J.
H.,
463
Kepler, Johannes, 46 Kilburn, T., 75, 274-290
King, Paul, 257, 267-273 Kinslow, H. A., 469 Kirsch, R. A., 212 Kister, J., 349
Kitov, A. I., 213 Klein, E. F., 119 Klein, R. J., Jr., 119
Knight, Kenneth E., 50-51
Knuth, D. E., 469
Koerner, Ralph J., 485 Kroger, Marlin G., 448 Kronfeld, Arnold, 363-381
Kuck, David
J.,
Kuehner, C.
J.,
320-333 274
77,
Lampson, B. W., 291-300 Landy, B., 290 Langdon, J. L., 581 Lanigan, M. J., 276-290 Laning, J. H., Jr., 146-156 Lauer, Hugh C., 571 Lawless, W. J., Jr., 146
Nievergelt, J., 463 Nisenoff, N., 42
Lichtenberger, W. W., 291-300 Licklider, J. C. R., 291 Lindquist, A. B., 469, 587
Notz,
Liniger,
Liptay,
W.
M., 463
J. S.,
W.
A.,
440-445, 449, 456
C,
O'Brien, T.
587
81, 275, 469, 566,
571
Oleksiak, R., 156
587
Lloyd, R. H. F.,
Oliver, G.,
Lonergan, William, 257, 267-273 Longstaff, F. M., 469 Lonsdale, K, 279 Lourie, N., 469 Low, P. R., 587 Lowry, E. S., 397 Lucking, J. R., 257, 262-266 Lukasiewicz, J., 270
469
Ornstein, Severo M., 73
Orvedahl, W., 119
Osborne, Thomas E., 243-256 Ossanna, J. F., 469 Owen, C. E., 171-183
Padegs, A., 587
D., 291, 456,
McCullough,
J.
McDonough,
E.,
Pascal, Blaise,
341-347
46
Patterson, G. W., 42, 119, 144, 212, 223
469
397
McReynolds, Robert C., 315, 320, 463 Maher, R. J., 273 Marcotte, A. U., 587 Marcotty, M. J., 469
Jones, P. D., 290 Jordan, F. W., 46
W„
349
Leibniz, Gottfried Wilhelm, 46 Leiner, A. L., 212, 440-445, 449, 456
MacLaren, M. Donald, 587 McPherson, J. L., 165, 169
Johnston, D. L., 171
Kahn,
P. G., 297,
Newell, A., 257, 349-362
McCarthy, J., 291, 469 McCormick, Bruce H., 315
587
Jacquard, Joseph Marie, 46 Johnston, A. St., 171
Kampe, Thomas W.,
Neumann,
213
Papian, W. N., 279 Parnas, David L., 13
119
B.,
J.
E., 13,
S. A.,
Lehman, M., 393, 446, 456-469
Mauchly, John W., 91 Maudsley, B. G., 171-183 Mauer, H., 156 Meade, R. M., 469
Meagher, R. E., 119 Melbourne, A. J., 392 Mendelson, M. J., 396 Mercer, Robert J., 340 Merry, I. W., 171, 176 Merwin-Daggett, Marjorie, 469, 517, 523, 571 Messina, B. U., 587
Patzer, William J., 340 Payne, R. B., 274
Peacock, A., 604 Pearcey, T., 469
Penny, J. P., 469
Perry, M. N., 504 Peterson, H. P., 469
Pike, James L., 212 Pirtle, M. W., 291-300 Pitkowsky, S. H., 574 Plugge, W. R., 504 Polais, S. M., 469 Poland, C. B., 469 Pomerene, James H., 397 Porter, R. E., 449, 477-488 Powers, D. M., 587 Preiss, R. J., 587 Pugmire, J. M., 392
Pyne, I. B., 42, 119, 144, 212, 223
Metropolis, N., 119 Mikus, L. E., 469
W.
469 463 Mitchell, Herbert F., 157-169 Molnar, Charles E., 73 Monnier, Richard E., 243-256 Montgomery, H. C, 587 Morris, Derrick, 274 Mueller, 46 Miller,
Miranker,
Muntz, C. Murtha, J.
F.,
W.
L.,
A., 155, 156
C, 320
Myer, T. H., 303
119 J. P., Naur, Peter, 9 Nash,
R. M., 290 Netter, Z., 469
Needham,
Rajchman, J., 111 Ramo, S., 205-215, 220-224
Randell, B., 77, 274
Reach, R., 469 Reinheimer, H. J., 587 Renwick, W., 346 Richards, R. K., 146, 150 Richardson,
J.
Robbins, R.
C,
Roberts,
R.,
119
171
Lawrence
De
G., 45,
349 Robertson, J. E., 431 Rochester, N., 515 Rose, Gordon A., 304, 469 Rosen, Saul, 3, 42 Rosenfeld, J., 468 Rosin, Robert F., 340, 649 Roberts, M.
V.,
504
Ross, Harold D.,
Rothman,
S.,
Jr.,
525
Stein, P.,
W. Y., 563, 587, 602-606 Stokes, Richard A., 320-333 Stotz, R. H., 507
Rudnick, 111, 119
Stevens,
295 Samuel, Arthur L., 42, 119, 144, 257 Sanderson, J. G., 469 Sasson, Azra, 363-381 Saxon, J. A., 525 Scalzi, C. A., 397 Scantlebury, R. A., 504 Schickhardt, Wilhelm, 46 Schlaeppi, H. P., 457, 463 Schmitt, W. J., 396 Schrimpf, H., 469 Schwartz, J. I., 291 Scott, N. R., 209 Sechler, R. F., 587 Seeber, R. R., 469, 587 Segal, R. J., 509 Senzig, D. N., 463, 469 Serrell, R., 42, 119, 144, 212, 223 Shannon, E. C., 46, 649 Sharpe, William F., 585 Shaw, J. C., 257, 349-362 Shedler, G. S., 463 Shifman, Joseph, 257, 447-455, 469 Shupe, P. D., 212 Simon, H. A., 257, 349-362 Slotnick, Daniel L., 315, 320-333, 463 Slutz, Ralph J., 210 Smith, J. L., 440-445, 449, 456 Smith, J. W., 587
Strachey,
Saltzer,
J.
von Neumann, John, 86-119 Vyssotsky, V. A., 295, 457, 469
349
Stevens, L. D., 525
470, 485
H.,
C, 469
344 J. B., 200, 335-340, Strube, A. R., 587
Stringer,
Sumner, Frank H., 274-290 Sussenguth, E. H., 13, 587 Sutherland, I. E., 303
Taub, A. H., 92 Taylor, Norman H., 144
Teager, Herbert M., 587 Thomas, C. E., 279
Thomas, L. X., 46 Thompson, R. N., 455
349
Unger, S. H., 320 Updike, B. M., 340, 587
Smith, R. V., 463, 469 Snyder, 111, 119
Solomon, Martin R., Jr., 561 Sparacio, F. J., 587
Speierman, K. H., 291, 456, 469 Squire, J. S., 469 Steel, T. B., Jr., 8
Van der Poel, W. L., 200-204 Van Derveer, E. J., 587 Vandling, Gilbert C., 340
Van Horn, E. C., 295, 457
Vareha, Albin L.,
Jr.,
Weber, Helmut, 257, 340, 348, 382-392, 469, 587 Weik, Martin H., Jr., 42 Weinberger, A., 440-445, 449, 456 Weiner, James R., 157-169 Wells, M., 349 Welsh, H. Frazer, 157-169 West, George P., 485 Westervelt, F. H., 81, 275, 469, 566, 571
Wilkes, M.
Turing, Sara, 191, 199 Turn, R., 469 Turnbull, J. R., 587
S.,
Walden, W., 349 Walendziewicz, E. T., 148, 156 Warburton, E. T., 279 Ward, J. E., 507 Ware, W. H., 650
Wheeler, D.
Thornton, James E., 489-503 Tomasulo, R. M., 587 Tonik, A. B., 396 Tucker, S. G., 340 Turing, Alan M., 23, 191, 193
Ulam,
571
J.,
V,
346 84, 139, 200, 214, 334-340,
344, 345, 396, 574
Wilkinson, J. A., 455 Wilkinson, J. H., 193-199 Wilkinson, P. T., 504 Williams, A. P. M., 469 Williams, Charles R., 119 Williams, F. C., 75 Williams, Robert J., 257, 447-455, 469 Wirsching, Joseph E., 316-319 Wirth, N., 257, 348, 383, 389, 392, 469 Witt, R. P., 172, 212 Wolf, K. A., 496 Wooldridge, D. E., 205-215, 220-224 Wright, M. V., 349 Wyld, Michael T., 274
Zadeh, Lotfi A., 7 Zemlin, R. A., 496 Zraket, C. A., 504 Zurcher, F. W., 291, 456, 469
Machine and Organization Index
Page references in boldface refer to the Appendix, ISP descriptions, and PMS diagrams.
Aberdeen Proving Grounds (see EDVAC; ENIAC; IAS)
B
ACE
(NPL/National Physical Laboratory), 43, 44, 74, 190, 193-199, 216
39,
ISP, 193-199
191, 193, 198
197-199 ADU/ Accumulation and Distribution Unit T(io),
ComLogNet) AEC/Atomic Energy Commission, 396 AGC/Apollo Guidance Computer (M.I.T. (see
Instrumentation Laboratory), 44, 89,
146-156 D(arithmetic), 150-152
design and construction, 148 interpreter, 147-148 introduction, 146
PMS, 146-148
283, and 300 (Burroughs), 43-44 2500, 2501, and 3500 (Burroughs), 43
B B 5000
(Burroughs), 43, 44, 79, 81, 257-261,
BESK,
language, 45
AMBIT
HIE,
II,
44
language, 45
(see
(see
performance, 470-471 PMS, 470, 471-475, 476, 489-494
RT, 491-494 8090 and 8092
CDC
under
CDC;
G-15; G-20)
CDC
(see
160, A,
CDP/Communications Data Processor ComLogNet) Census, Bureau
of, 157,
164-165
(Lincoln Laboratory), 43
Chasm
special purpose computer, 73
COBOL
BESM, 213
60 and 61 language, 45 Columbia University Calculator, 46
BINAC
COMIT
(Eckert-Mauchly), 43, 91, 163 (Business Information Technology),
language, 33, 45
ComLogNet,
CORC
BIZMAC I, II BTL MACRO
G) (see
C.E.C.E., 39
CG24
Bitran 6 (Fabri-Tek), 44
D825, operating system)
APL/A Programming Apollo (see AGC)
45,
509-510
language, 45
CPC/Card Programmed
(RCA), 39-43 language, 45
Calculator (IBM), 43,
88
CSIRAC, 89
BTSS/Berkeley Time Sharing System
Culler-Fried on line language, 45
(University of California, Berkeley), 44,
Language,
13,
45
Argonne Laboratory, 257 Arithmometer (L. X. Thomas), 46
ARPA/Advanced Research
Projects Agency,
291-300, 315 network, 510-512
Arrow
packaging, 494-496
43, 45,
44
AOSP/Automatic Operating and Scheduling program APEXC, 39
CDC
494-495 489
ISP, 472, 491-493, 497-503 operating system, 472-475
39, 89
BIT 480
AN/FSQ-27 (see RW-40 and 400) AN/GYK-3(V) (see D825 and D830) AN/UYK (RW => TRW), 71
6400, 6416, 6500, 6600, 6700, and 7600,
history, 470,
257-261, 325, 328 B 8500 and B 8501 (Burroughs), 43-44, 64, 257 Babbage's Analytic Engine, 42, 46, 53 Babbage's Difference Engine, 46 Baldwin Calculator, 46 BASIC (Dartmouth College), 45, 236
Bendix =>
1700, 44 3400, 3600, and 3800, 43-44, 348, 396
470-476, 489-503
42-43, 45-46
language, 13, 45, 73, 257, 267, 348
1604, 44, 89
circuits,
B 5500 (Burroughs), 43-45 B 6500 and B 7500 (Burroughs),
Air Force, 137
CDC CDC CDC CDC
43-45, 47, 71, 76, 79, 83, 120, 170, 397,
operating system, 267-268 PMS, 258-260, 268
ALGOL
ALWAC
(see Strela)
ASI 6000 (EMR), 44 Atlas (Manchester University, Ferranti), 43-45, 82, 91, 274-290 input-output, 274-283, 285-289 interrupt, 274, 276-277 introduction, 276 ISP, 276-279, 283-285 M(core), 280-283, 289-290 multiprogramming, 274-283 operating system, 279, 285-287 PMS, 277, 279-283, 289-290 RT, 287-289 ATLAS-1 and 2 (Ferranti), 43 AVIDAC, 39, 89
160, 170, 180, 250, 260, 263, 270, 273, 280,
Bell System, 303 Bell Telephone Laboratory computers, 39,
ISP, 152-155
ALPAK
diagrams.
267-273 design, 267 ISP, 268-273
introduction, 193
PMS,
PMS
45, 274-275, 291-300 input-output, 297-300 introduction, 291-292
D825 and D830
291-297 297-300
ISP,
design philosophy, 447-450 input-output, 454-455
M(files),
multiprogramming, 291-295 operating system, 292-300 PMS, 275, 292 T(io), 297 Burroughs
(see
B
2500;
B
5000;
ISP,
DASK, 89
B
5500;
IV)
California, University of, Berkeley (see
BTSS)
Carnegie-Mellon University, 120, 571
CDC/Control Data Corporation
(see G-15;
G-20) 160, A, G, 43, 44, 120
924, 3100, 3200, 3300, and 3500, 43-44,
79
453
operating system, 450-455 PMS, 260, 450-455
B 6500; B 8500; D825; Datatron 204, 205, and 220; E 101, 102, and 103; ILLIAC
CDC CDC
(Burroughs), 44, 45, 257-260,
446-455
Datamatic 1000 (Honeywell), 39, 43 DATANET 30 (GE), 43 Datatron 204, 205, and 220 (Burroughs), 39, 43, 44 DDP-19 (Honeywell), 43 DDP-24, 224, and 124 (Honeywell), 43-44 DDP-116, 316, 416, and 516 (Honeywell), 43-44, 512
DEC/Digital Equipment Corporation (see PDP-1) DEC 338, 260, 303-314, 396 interpreter, 305 introduction, 305
Machine and organization index
DEC
FORTRAN
338, ISP, 305-309, 310-314
121
PMS,
(See also
PDP-8)
Deuce (English Electric), 39, 43-45, 191 (See also ACE) DMI/Data Machine Inc. => Varian Associates, 44
DMI DMI
520/1 (Varian), 44
620 (Varian), 44 Postal and Telecommunications Services, 200 Dynamo language, 45 DYSEAC (National Bureau of Standards), 39, 43, 172, 440
Machine, 44, 348, 363-381 interpreter, 366-379 introduction, 363-364 ISP, 363-365 365-381 logical design, PMS, 365-366 RT, 364-368, 375-381 FX-1 (Lincoln Laboratory), 43-45
101, 102,
and 103 (Burroughs),
43,
44
EAI/Electronic Associates Inc., 44 EAI 640, 44 Eccles-Jordan Flip-Flop, 46
Eckert-Mauchly Computer Corporation => UNIVAC, 91 EDSAC I and II (Cambridge University), 39, 42-45, 58, 89, 139, 144, 196, 398
EDVAC/Electronic Discrete Variable Automatic Computer (University
of
Pennsylvania) 39, 42-45, 95 Eight-bit character computer, 170, 184-187,
224
G-15 (Bendix => CDC), 39, 43-44, 74, 191 G-20 (Bendix => CDC), 44, 57, 152 Gamma 60 (Machines Bull), 44, 456 GARDE 312 (GE), 43 GE 100/ERMA, 43 GE 115, 43 GE 205, 210, 215, 225, 235, 255, and 265, 43-44 GE 412, 435, 43-44 GE 635, 625, 43 GE 645 (General Electric), 43, 45, 79, 275 GE 4040, 4050, 4060, 4020, and 4050 II, 43 General Automation (see SPC-8) General Precision => CDC (see LGP-30) Genie project (see BTSS) George (University of New South Wales), 257 Gott Sei Danke, 346 GPS language, 45
H-200
=> ICT/International
Computers and Tabulators (see KDF 9) ENIAC/Electronic Numerical Integrator and (University of Pennsylvania),
39, 42-43, 45-47, 88, 113
ERA/Engineering Research Associates => UNIVAC,
UNIVAC
43, 192
1101, 1102;
UNIVAC
1103 A)
ERMA
(see
GE
and 1460, 43-45,
47, 61, 188,
231-234
PMS, 226
ISP, 184, 185, 186-187
(See also
1401, 1440,
224-234, 562-564 history, 225 interpreter, 229
RT, 229-230 1410 and 7010, 43, 44 1620, III, and 1710, 43-44, 225 1800 and 1130, 43-45, 48, 55, 90, 396, 399-420, 470, 575-576, 579, 583-586 input-output, 405, 409-411 interpreter, 408-409 introduction, 399-400 ISP, 407-416, 417-420 PMS, 400-405, 404 RT, 405-409, 411-413
IBM IBM IBM
IBM 2938, 45, 72 IBM 7030 (see Stretch) IBM 7070, 7072, 7074, 43, 44 IBM 7094 I, II, 7044, 7040, 7090,
709,
and
704, 30-32, 39, 43-45, 47, 54, 64, 70, 79, 91, 149, 303, 396, 422, 433, 515-541,
562-564 515-517 interpreter, 522-523 ISP, 523, 526-541 multiprogramming, 523 P(io), 524-525 PMS, 517-519 RT, 520-522 history,
EMR
Computer
433 1130 (see IBM 1800)
47, 87,
IBM IBM
ISP, 226-229,
introduction, 184
6130, 44 English Electric
702, 39, 43, 47, 87 705, 705 III, 708, and 7080, 39, 43-44,
introduction, 225-226
Dutch
E
IBM IBM
100)
ESS/Electronic Switching System (Bell System), 303 44, 73, 257, 348, 382-392 interpreter (microprogram), 385-392 introduction, 382-383
EULER,
ISP, 383-385, 388-391
PMS, 382-392
series: 110, 120, 125, 200, 400, 1200,
1250, 2200, 3200, 4200,
and 8200
(Honeywell), 43, 44, 58, 225
H-1400 and 1800 (Honeywell), 43 Harvard (see Marks) Hollerith Punched Cards, 46 Honeywell (see Datamatic 1000; DDP-19; DDP-24; DDP-116; H-200; H-1400) Host computer (see ARPA network) HP/Hewlett-Packard (see HP 9100A) HP 9100A, 44, 235-236, 243-256 D, 243-244, 254-256 ISP, 243-249 microprogram, 254-256 packaging, 250, 252-253 PMS, 235, 249-254 RT, 250 T, 243, 248, 253
IBM IBM IBM
Multiplying Calculator, 46 Stretch (see Stretch) System/360, 43-45, 61, 64, 303, 396
addressing, 565-566, 594
array processor, 576-579
base register, 594 (See also addressing above) bibliography, 587
branch instructions, 595 channel-to-channel adapter, 576 circuits, 564,
cost,
603-604
579-585
critique
by authors, 561-587
data types, 564-565, 590-594 design, 561-564, 588
Fabri-Tek (see Bitran FACT language, 45
direct control, 597
6)
Ferranti Corp. Ltd. => ICT/International Computers and Tabulators, 39 (See also Atlas; Mercury; Pegasus)
FLAC (Florida), 39 FOCAL (DEC) language,
FORMAC FORTRAN
236 (IBM) language, 45 (IBM),
IV language,
FORTRAN 45, 50, 73,
Advanced Studies machine von Neumann)
IAS/Institute for (see
IBM ASP/ Attached-Support Processor, 506 IBM 305 (disk), 43, 45 IBM 650, 39, 43, 44, 91, 216, 220-223 ISP,
IBM II,
348
FORTRAN
220-223
701, 39, 43-45, 47, 89, 515-516
floating point, 591-592 functional schematic, 589
general registers, 564-565 history, 561 (See also design above)
information formats (see data types above)
PMS, 515 (See also
(See also input-output below) emulation, 562-563
IBM
7094)
innovations, 562
IBM
System/360, input-output, 588, 598-601 [See also P(io; data channels) below; PMS
and
PMS
diagrams below] interpreter, 594-595, 604-605 interrupts, 596-597 introduction, 561, 588 ISP, 564-566, 588-601 logical structure, 588-601 (See also ISP above) M(content addressable), 571, 573-574 M(Large capacity store), 571-572, 582-583 M(read only), 604-605 (See also
microprogramming below; Models 30, 40, and 50 below) microprogramming, 563-564, 604-605 (See also Models 30, 40, and 50 below) model range, 561-564, 588, 602-606
IBM/System
360,
LARC (UNIVAC), 43-44,
and PMS diagrams, 580
PMS
T(print, punch,
design philosophy, 456-457
573 S(cross-point; time-multiplexed; BCU), SLT/Solid Logic Technology, 564, 603-604 storage protection (see multiprogramming
above) storage-to-storage channel, 576-577 SVC/Supervisor Call, 597
simulation, 463-469
Leprechaun (Bell Telephone Laboratories), 43 LGP-30, and LGP-21 (General
ASCII, 593 decimal, 593-594
Precision
EBCDIC, 592
ILLIAC
microprogram, 382-385, 388-392 RT, 386 67, 76, 79, 275, 561, 563, 571,
573-574
Model 75, 561, 563, 571 Model 85, 76, 561, 563, 574-575 Model 91, 561, 563, 575 Models 30, 40, and 50, 561, 563,
566, 568,
602-603 563, 571-572, 582-583, 602-603 585-587 multiprocessing, 456-469, multiprogramming, 565-566, 571, 573-574,
Mp,
9) Illinois), 39,
43-45,
320-330 input-output, 322, 327-328 interpreter, 322-325 introduction, 320-321 ISP, 322-325, 330-333 PMS, 321-322, 327-329 K(P), 322-323 RT, 326
LINC/Laboratory Instrument Computer
IBM ASP)
performance, 563, 579-587, 602-606 P(io; data channels), 573-574, 576-577, 598-601, 605-606
and PMS diagrams, 563, 566-579, 602-606 K(special controls), 576 Model 20, 567 Model 44, 569-571, 569 Model 67, 571, 573-574, 573 Model 75, 567, 571-572 Model 85, 574-575, 575 Model 91, 575 Models 30, 40, and 50, 65, 566-568, 566-567 Ms (data cell, disk, drum), 577, 579 576 P(special), 576-578, 576 S(c), 579, 581 (See also networks above; T(analog), 581 T(audio), 579 T(display), 579
of,
LRL
Livermore, California, 396-397 network, 507
MAC-16 (Lockheed
interpreter, 351, 354-355, ISP, 354-359, 361-362
Electronics), 44 language (University of Michigan), 45 MADM/Manchester Automatic Digital Machine, 39, 58 Manchester University, 39, 45, 340
MAD AGC)
44,
(See also Atlas;
MANIAC
348-362 design, 349-350
359-362
RT, 352-354 IPL VC, 257
MADM;
Mark
I;
Muse)
(University of California,
Michigan, University 209-212, 571
M.I.T.
262-263
II
I, II, III, and IV (Harvard), 39, 42-43, 46 Mathmatic language, 45 MEG, 39 Mercury (Ferranti), 39, 279
KDF
ISP,
and
Los Alamos), 39, 43, 89 I (Manchester University), 43
MIDAC
9 (English Electric), 44, 257-266
I
Mark Mark
Jacquard Punched Card Loom, 46 JOHNNIAC (RAND), 43-44, 78, 89 JOSS (RAND) language, 45, 78 JOVIAL (SDC) language, 45
introduction, 262
CG24; FX1; LINC; MTC; TX-0,
43
IPL I, II, III, IV, and V, 45, 257 IPL VI/Information Processing Language,
D, 263-266
IBM ASP)
(See also
LRL/Lawrence Radiation Laboratory,
under ILLIAC) computer (see ARPA, network)
578
P(array), 576-578,
PDP-8)
(see
Lincoln Laboratory (M.I.T.), 571
1.5 language, 45 Lockheed Electronics (see MAC-16) Los Alamos (see AEC)
(See also
45, 73, 257,
PMS
tape), 578-579,
University
LINC-8 (DEC)
TX-2) LISP 1.0 and
66, 72, 315,
Illinois,
44, 45, 74, 91, 192.
(M.I.T. Lincoln Laboratory), 43, 44, 120
Instrumentation Laboratory, M.I.T. (see Interdata, Model 3 and 4, 44, 184
networks, 576-579, 581, 598
Ms(magnetic
KDF
(University of
ILLIAC II (University of Illinois), 43 ILLIAC III (University of Illinois), 43, 351 ILLIAC IV (University of Illinois), 43-45, 47,
IMP
597-598 (See also
I
=> CDC),
216-219 ISP, 217, 218-219 PMS, 217
89
602-606 ISP, 385-388
Model
operating system, 461-463 performance, 456-457, 463-469
Leibniz Calculator, 46 LEO I and II, 39
variable-length character strings, 591
(See also Atlas;
457-461
instructions,
interrupt, 458-461 introduction, 456
PMS, 459-461
system implementations, 602-606 timer, 597
Model Model
569
Research),
application, 464-469
RT, 568, 570, 572, 603-604
(See also performance below) Model 20, 563-567 30, 236, 348, 382-392, 566-568,
396-397
44-45, 446, 456-469
581
T(telephone, typewriter), 579, 596-598 processor state, 564-565, 588,
ICT/International Computers and Tabulators, 91, 274
25, 184, 563, 567,
86,
Lehman Computer example (IBM
read),
of,
MAD, MIDAC,
(Michigan, University
of), 39, 44,
192,
192,
209-212 ISP, 209-212 MILSMAC, 347 MISTIC, 43
CTSS operating system, 45 M.I.T./Massachusetts Institute of Technology (see AGC; GE 645; Lincoln Laboratory;
MULTICS
project;
Whirlwind
PMS, 260
M.I.T. network, 507
RT, 264
Monorobot, Monorobot XI, 39, 44
I)
Monroe Calculator, 46 Monroe Corporation, 46 Moore School of Electrical Engineering
PDP-8, 8S, (see
RT, 125, 127-131 (See also 338)
Test Computer (M.I.T. Lincoln Laboratory), 39, 45, 89 Mueller's Difference Engine, 46
MTC/Memory
MULTICS
project (M.I.T.), 45, 571 (Manchester University), 43, 277
275, 564
ISP,
(see
DYSEAC; PILOT; SEAC)
Pegasus (Ferranti), 44, 62, 170-183, 564 circuits, 171-174, 176 introduction, 181 ISP, 176-179, 182-183
179-181 packaging, 174-176, 179-182 logical design, 172-175,
Neher Laboratory, 200 Network of Computers, 504-512, 505-512 ARPA, 510-512 ComLogNet, 509-510 IBM ASP, 506 LRL, 507 M.I.T., 507
SABRE, 504 SAGE, 504 typical,
NOVA
of,
506-507
Philco 212, 44
PILOT
(National Bureau of Standards) 39, 43,
44, 75, 397-398, 440-445,
101
508-509 44
(see RW-40 and 400) Desk Calculator
ACE)
PUFFT,
California, Berkeley), 43-44, 79, 275,
291-300, 542
Calculator)
ONR/Office of Naval Research, 137 89 (University of
Illinois), 39,
43, 89
RAND
Corporation (see JOHNNIAC; JOSS) (Raytheon), 39
BIZMAC
RCA RCA RCA RCA RCA
= Raytheon
(see
PB-250;
PB-440) PB-250, 44, 74, 191
circuits,
America
(see
70 Series)
110, 43, 44 301 and 3301, 43 501 and 601, 43, 44, 225 1600, 184
Spectra 70, 561-562 I,
II,
and
III,
44
RW/Radio Wooldridge (see AN/UYK) RW-40 and 400 (Thompson, Ramo,
design philosophy, 477
808, 816, 44
8S, 81, 8L, and 120-136, 396 applications, 120
of
SPECTRA
Wooldridge), 44, 53, 192, 400, 470-471,
PDP-1 (DEC), 44-45 PDP-4, 7, 9, and 15, 43-45 PDP-8,
II;
477-488
PB-440, 334
PDC
I,
Rice University computer, 45, 53
Pascal Calculator, 46
5,
20-32, 43-44, 49, 90,
BTSS)
SEAC
(National Bureau of Standards), 39, 43-45, 172, 192, 209-212, 440 SEL/Systems Engineering Laboratories, 44
SEL
Recomp Bell
University),
compiler, 45
RCA/Radio Corporation
PB/Packard
(See also
RAYDAC
Olivetti-Underwood (see Programma 101 Desk
556-560
275, 543, 546-548, 546
RT, 550-552 (See also BTSS) SDS 940 and 945 (SDS, University of
120
317-318 RT, 318 NPL/National Physical Laboratory, 45 ISP,
551-552
ISP, 544-545, 548-550,
PMS,
PMS, 237-238, 237 Programmed Console (Washington
applications, 316-317 introduction, 316
910, 920, 925, 930, 9300, 43, 44, 91, 291,
interrupt, 553-555 introduction, 542-543
ISP,
Laboratory), 44, 66, 315-319
92, 44, 120
542-560 history, 542-543
237-242 237-242
(LRL/Lawrence Radiation
ORDVAC
SDS SDS
interpreter,
(Olivetti-Underwood), 44, 216, 235,
ORACLE,
SDC/Systems Development Corp., 45 SDS/Scientific Data Systems => XDS/Xerox Data Systems (see SDS 910; SDS 940 and 945; Sigma 2 and 3; Sigma 5 and 7)
input-output, 543-545, 552-555
449
440 input-output, 444-445 ISP, 442-444 performance, 440-442 PMS 398, 440-442 applications,
Programma
39,
(See also
ENIAC)
Polymorphic (RW)
Texas, University
NORC,
43, 46, 95 (See also EDVAC;
343-347
microprogram, 345-346 packaging, 341-343 PMS, 343 RT, 343-345
Pennsylvania, University of (Moore School),
NBS/National Bureau of Standards
SD-2 (Librascope), 44, 334, 341-347 design, 341-343 interpreter, 550-552 introduction, 341
PDP-10 and 6, 43-45, 79, 170, PDP-12, LINC-8 (see PDP-8)
39
Motorola 1000, 44-45
Muse
M(core), 128-129 121, 124, 126, 128
5,
DEC
Pennsylvania, University of)
MOSAIC,
8L, and
81,
PMS, 20-21, 123-131,
481-482 480-482 ISP language, 486-488 PMS, 471, 477-480, 482-485
interrupt, ISP, 470,
810, 44 Sigma 2 and 3 (SDS => XDS), 43-44, 78 Sigma 5 and 7 (SDS => XDS), 43, 170, 396, 564 SILLIAC, 89 SIMSCRIPT language, 45 SIMULA language, 45 SNOBOL language, 45 SOL language, 45 SOLOMON, 315, 320 Soviet Academy of Sciences, 213 SPC-8 and 12, 44 SPECTRA 70 Series (RCA), 43 SS 80 I and II (UNIVAC), 43 Strela/Arrow (Russian), 44, 192, 213-215 ISP, 213-215 Stretch/IBM 7030, 43-45, 47, 91, 396-397, 421-439 arithmetic, 428-431 circuits, 433-438 D, 427-431 input-output, 421-422 interrupt, 423
introduction, 421
132-133
ISP,
422-424
input-output, 123 interpreter, 131
SABRE
SAGE/Semi-Automatic Ground Environment
look-ahead, 426-428
interrupt, 123 ISP, 22-33, 120-123, 127, 134-136
network, 45, 504 SCC/Scientific Control Corp. 650, 120 Schickhardt Calculator, 46
packaging, 432, 438-439 performance, 421-423, 425-426, 431-433
Logical design, 127-133
network (American
Airlines), 45,
504
K(P),
424-428
PMS, 421-423, 425-426
Stretch/IBM 7030, RT, 426-431 Subscriber Station (see ComLogNet)
SWAC,
39,
System/360
43 (see
IBM
of,
and 400) Turing machine, 23 39, 43-45,
274
UNIVAC UNIVAC UNIVAC
1050, 43, 44
490, 491, 492, I,
II, III,
and 494, 43-44 1005 I, II, and
WEIZAC, Whirlwind
UNIVAC UNIVAC
III,
UNIVAC UNIVAC UNIVAC
I
(M.I.T.), 10, 39, 43-45, 55,
1101 and 1102, 39, 43
interpreter, 140-141 introduction, 137-139
1103A, 39, 43, 44, 48, 62, 192,
ISP, 145 K, 139-143
M, 141 packaging, 141-143
1105, 39, 43 1108, 1107,
470
applications, 138
D, 142
and 1106,
10,
43-45, 62,
564 1206, 43 1212 (Military), 43 9200 and 9300, 43
170, 192, M.I.T.),
43, 89
58, 90, 137-145, 303,
205-208 ISP, 205-208
TRW/Thompson, Ramo, Wooldridge (see RW-40 TX-2 and TX-0 (Lincoln Laboratory,
II and III, 39, 43-45 418, 1218, and 1818, 43-44
1004 43, 44
System/360)
network, 506-507 Toronto University Computer, 44 TRAC language, 45 TRE, 39
Texas, University
UNIVAC UNIVAC UNIVAC UNIVAC
PMS,
90, 138-139
Wilkes'
microprogrammed computer example, 44, 335-340
design, 335-337 introduction, 335
ISP, 337-340
UNCOL
microprogram, 339-340 RT, 336
language, 8-9, 13
Army Ordnance Department, 92 UNIVAC, 39, 43-45, 48, 91, 157-169 U.S.
applications, 164-165 design constraints, 163
input-output, 158, 161-162 interpreter, 159-161 ISP, 157-160 performance, 164-168 PMS, 158 reliability, 165-169 RT, 157-160 T(io), 161-163 (See also SS 80 I and II)
Varian Associates (see under DMI)
XDS/Xerox Data Systems (see SDS)
von Neumann/IAS/Institute for Advanced Studies, 39, 42, 44, 58, 89, 92-119, 152, 398
applications, 92-93
checking, 118 D, 96-111 design constraints, 92-93 input-output, 92, 117, 119 interpreter, 111-119 ISP, 111-119
M, 94-96
ZEBRA (Standard Telephones and Cables, Ltd.), 44, 191-192, 200-204, 216
introduction, 200 ISP, 200-204 PMS, 201
ZUSE Company, 39, 42
Subject Index
Page references in boldface refer to the Appendix, ISP descriptions, and PMS diagrams.
abbreviation/, 19, 607, 609
acceptance test, UNIVAC, 165-166
arithmetic element, Whirlwind, 142 arithmetic expression, 614
access-i-unit-operation, 633 access-time, 620-622
arithmetic-function-operation, 614 arithmetic organ, von Neumann, 98
accessing algorithm, 41
(See also D/data-operation) arithmetic unit, KDF 9, 263-266
accumulator, ZEBRA, 202 accumulator register, 59-60, 98
array instructions,
array processor [see P(array)]
[See also under action «-, 23-24,
ASCII/American Standards Code for Information Interchange, 593
assemble instruction, 457-458
line)]
action-sequence, 23, 631 actual address, 76-81
assignments, associative
(See also physical address)
23, 607,
memory
bulk core
IBM IBM
609
attribute: value pair (see attribute; value)
address-range
[
address-size, P,
612-613
auto index register, 120-122, 134
447
Lehman computer, 456-457
24, 631-633 626-627
available space
],
list,
IPL
VI,
352-353
addresses-per-instruction, P, 57-63, 627 (See also instruction format)
addressing (see
memory
addressing;
memory
mapping; multiprogramming) addressing system, memory, 16
aerospace computer, 146-156 algorithm-encoding-efficiency, P, 627 alias/, 19, 607, 609
423
System/360, 591
(see computer) C(l Pc), 40-41, 63-70, 395 C(l Pc-nPio), 40-41, 63-70, 396-398 capital letters, 609 card, IBM, 617 carrier, 618 data-type, 629-631 carry, 98-99 casting out three, Stretch, 431
central processor [see P(c)]
channels [see
B
line:
character-base, 631-632
277-278 Manchester University, 340
character/char, 616 character generation instruction, 308 character string, 184-185
Atlas,
(See also index register) 6600, 474, 489-491
barrel,
CDC
and A, 25
base register,
(See also n-ary-boolean-operation)
Stretch,
(see bit)
base, 24, 55-56, 614, 616, 631
,
M/memory)
b
alphabet, 609, 613 alternation indefinite expression, 17, 610 |
(see
C/computer
RW-400, 477-479
address-expression, 631-632
memory
memory;
attribute-list,
availability,
(See also variable-length character string)
Stretch, 431
UNIVAC,
MIDAC, 210
bilinear switch,
160-161, 168-169
Whirlwind, 143-144
bench-mark, 52
applications:
P(io)]
checking:
base-data-type, 630-631
antecedent, 619
623-624
—
circuit level, 4
Lehman computer, 464-469
614, 633-635 binary-arithmetic-operation H binary-boolean-operation, 615, 633-635
NOVA, 316-317
binary-decimal conversion, 211
PDP-8, 120 PILOT, 440
binary machine, 87-88
Pegasus, 171-174, 176
UNIVAC
binary-operation, 28, 633
Stretch,
(See also ISP,
164-165 von Neumann, 92-93 Whirlwind I, 138 I,
IBM
bit/binary-digit, 611,
619
multiple-precision,
429-430 serial, 428-429 Stretch, 428-431 parallel,
433-438
609-610 cocomponent, 617 class,
block, 617
co-incident current memory [see M(core)] colon 19, 612-613, 631 (See also attribute: value pair) :
PMS
diagram;
PMS
level;
RT)
ZEBRA, 204 BNF/Backus-Normal Form (Backus-Naur
block transfer,
AGC, 151-152
6600, 494-495
PDP-8, 132-133
616-617
block diagram (see
arithmetic:
CDC
component count, 431-432
archival
area, 617,
circuits:
component count, 470-471
binary-value, 611 bit string, 317-318 (See also data-type, Stretch)
[see M(archival)]
,
System/360)
approximation—, 607-608, 610 architecture, 562 (See also ISP; under PMS)
memory
608-609,
by/byte, 616
D825, 447-448 adder, Pegasus, 174 addition, von Neumann, 98-99
-,,
ACE, 198
buzzer,
M(associative)] attribute, 19, 607, 612-613
adaptability:
D V A
bus, 10 (See also S/switch) business computer (see function)
for
[see look-aside
=©
633-635 branch instruction, 595 breakouts, IPL VI, 350-351 buffer module, RW-400, 482-484
NOVA, 316-319
accuracy, HP 9100 A, 246, 256 acoustic delay line, 96
M(delay 631-632
boolean-operations
Form), 9 boolean, 608, 615 boolean-expression, 615
,
combinatorial circuits, 5 comma, 611
commands, 608-610 (See also abbreviation; assignment, form; variable)
COMMENT, 608
comments, 608 communication computer (see function) communication multiplexing, 505 compiler, EULER, 391-392 complex, data-type, 631
complex number arithmetic, 246, 255-256 component: data-type, 629-631 PMS, 616-619 component-function, 617 component-name, 617 compound computer, 628 compound-link, 619-620
compound name
(see
name)
computer, 628 control, 146-156 duplex, 66 computer levels, 3-11 PDP-8, 126-127 computer model, 63-66
computer-space dimensions, 40 concatenation 24, 631-633
,
decimal digit, 616 decimal machine, 57, 87-88 decimal-name, 614 DECtape, 124-126
concurrency-type, 617 condition, 23, 631 condition codes, IBM 1800, 407 conditional micro-order, 336-337
IBM
delay,
1,
contextual addressing, 267-268 continuous-modulation, 618
B
SD-2, 341-343 desk calculators, 235-256 destination address,
Neumann, 111-119
controlled-operation, 624-625 conversion, 615-616
element-range/< ellipses.
.
.
,
626-628
memory, 75 >,
24,
631-633
608, 610
emulation, 562-563 encode, 16
encoding, 618 entity, 608, 611-612 error-rate, 617, 619 evoke operation →, 23, 631, 633 EXAMPLE, 608 excess three code, UNIVAC, 163 expansibility criteria, D825, 448
607-608, 611-612 607-608, 610
extended core store/ECS, CDC 6600, 473 external execute instruction, 458 extra codes, 597 AGC, 154-155 Atlas, 274-278
5000, 271-272
Lehman computer, 456-457
control-operation, 633
electrostatic
relational-expression) expression-variables, 608
D825, 447-450
RT)
59-60, 636-637 efficiency, processor,
count-expression; dimension-expression;
design philosophy:
(See also interpreter; K/control; control computer (see function)
228
effective address calculation process, ISP, 28,
line)]
descriptor, 79-81
624-625 ILLIAC IV, 322-323 Stretch, 424-428 Whirlwind, 139-142
control,
edit instruction,
optional, 613 (See also boolean-expression;
620
delay line [see under M(delay
Interchange Code, 592
ECL/Emitter Coupled Logic, 320
indefinite,
dequeue switch, 623-624 descendants, 619
addressable)]
EBCDIC/Extended Binary Coded Decimal
definite,
=
1800, 400-403
[see M(drum)] dynamic data types, 383 dynamic storage allocation, 383-384
expression, 608
definite expression, 607-608, 611-612 definition: (see assignment)
construction (see packaging) content addressable memory [see M(content
control-organ, von
616 D(Stretch), 427-431 data break, PDP-8, 124-126 data channel [see P(io)] IBM 7094, 523-525 SDS 900 series, 543, 546-548, 552-555 data-expression, 631 data field register, 120, 523 data flow, Stretch, 425-428 data-operation, 17, 23-36, 626 data-operation definition, ISP, 636-637 data-operations table, 633-635 data programs, IPL VI, 360 data structure, IPL VI, 351, 354 data-type, 23-36, 57 ISP, 629-631 P, 626-628 Stretch, 423-424 data-type format, ISP, 636-637 data-type-name, 629-631 digit,
decimal, 614
concurrency, 617-618
configurator,
drum
626 D/data-operation, 17, 23-36, d/decimal
(See also syspop)
ACE, 194-199
digital
computer
fabrication (see packaging) family tree of computer design, 39
digits,
609
fast
(see C/computer) digital differential analyzer, 304
Fourier transform, 73
225-226
conversion-arithmetic-operation, 633-634
dimension, 608, 615
features,
Cooley-Tukey algorithm, 73 cooling, 470
dimension-expression, 615 direct access communications channel, 900 series (see data channel)
fetch-execute cycle (see interpreter)
Pegasus, 181
UNIVAC, 163 memory [see
core cost,
direct
M(core)]
616-617, 619
memory
direction,
access,
PDP-8, 124-126
618
Lehman computer, 459
count-expression, 614
discrete-modulation, 618
country, 619 cross-point switch, 267 crossbar switch, [see S(crossbar); under S(cross-
disk, 74, 577,
CRT/Cathode Ray Tube
display [see under
T(CRT)]
cyclic
memory
cyclic switch,
file,
data-type, 631
617
BTSS, 297-300 control (see function) fixed point (see data-type)
579
display processor [see P(display)] distribution, switch, 623-624
number-data-type, 630-631 fixed structure network, flag bit,
IBM
504
1401, 226
floating point, 97
divergence, T, 625-626 divergence-rate, T, 625-626 divide step, SDS 900 series (see ISP)
Atlas, 277-278,
division:
283-285
B
5000, 268-270 HP 9100 A, 243-256
IBM
7094, 527
memory)
nonrestoring, 107-111
KDF
9,
[see M(cyclic)]
restoring, 107-111
number-data-type, 630-631 SDS 900 series, 544-545, 549-551
current, 616
cycle-time (see
field,
file
directive instructions,
point)]
SDS
623-624
UNIVAC,
159
263-266
Subject index
floating point, Stretch, 429-431,
UNIVAC
433
1103A, 208
Wilkes example, 335 457 form, 607, 610 format, data-type, 629-631 full-duplex, 617-618 function, 37, 40, 46-49 business, 47-48 C, 618 communication, 48 component, 617 control, 48 file control, 48 operation, 28 P, 626-627 scientific, 47 T, 625-626 terminal, 48-49 time-sharing, 49 fork instruction, 325,
CDC
functional units,
information length, 16 information-rate, 617-618
information units, 616-618
instruction-memory, P, 627-628
inhibit drivers [see M(core)]
instruction modification, 209-210
input-output:
instruction-set, 25
ACE, 197-199 Atlas, 274-283,
P,
(See also ISP) instruction-size, P,
117, 119 instruction:
DEC 338, 308-309 DEC 338, 307-308
general registers: 8-bit character computer, 184-187 Pegasus, 176-179 generations (first, second, third, and fourth), 39-40, 43-46
Gibson mix, 49-50
hexa-decimal-digit/hex, 616 hierarchy (see structure) switch, 623-624
631-632
Lehman computer, 457-461
instruction
backup 520-522
register,
IBM
7094,
instruction buffers, 84
ILLIAC
IV,
323-324
(See also look-ahead; look-aside) instruction decoding diagram, 122-123, 184
address/stack, 62-64, 257-261 stack:
B 5000, 267 memory (see M/memory)
B
KDF
high-level language,
5000, 268-273 9;
262-266
1+ 1
+
1+
general register (see general registers) 1 address, IBM 650, 220-223 index address, 58-60, 87-91
2 address, 60-61
617-618
RW-400,
1103A, 205-208 3 address, 60-61
MIDAC, 209-212 Strela,
213-215
general registers, 61, 64 (See also general registers) IBM 1800, 407-408, 410-411
i-unit-prefix
IBM-card, 617 iconoscope tube, 94 illegal instruction,
470, 480-482, 486-488
UNIVAC
data-type, 629-631
ISP, 25,
BTSS, 293
indefinite expression, 607-608, index*, 20, 613
n 610
index register, 59-60 information, 616 information base, 24, 55-56, 614, 616, 631 information-content, data-type, 629-631
interlace (see data channel,
interleaving (see
memory
+
1
636-637
address, 61, 191
SDS 900 variable
series,
544-545, 548-552 of addresses per
number
instruction, 63 instruction highway, ACE, 197
instruction interpretation process, ISP,
636-637
463-469
SDS 900
series)
interleaving)
interpretation-cycle, 22-36 (See also interpreter) interpreter,
22-36
AGC, 147-148
DEC 338, 305 EULER microprogrammed, 385-392 FORTRAN Machine, 366-379 IBM IBM IBM
1401, 229 1800, 408-409 7094, 522-523
ILLIAC
IV,
322-325
IPL VI, 351, 354-355, 359-362 ISP, 636-637 PDP-8, 131 Stretch (see instruction unit)
hyphen-name, 613-614
length, 616 i-unit-name, 616
integer-name, 614
integrated circuit memory (see M/memory) interaction controller, Lehman computer, 460 interaction function, Lehman computer,
AGC, 149-150
hyphen-, 25, 607
base-unit, 616
integer-name, 614
SDS 900
1
i-unit/information unit, 16, 616-618
+ —
address, 58-60, 64, 87-91
high-speed core history, 38-46, 617, 619
irate,
integer-data-type, 630-631
458-461
instruction-execution, ISP, 25-36, 637 instruction execution process, ISP, 637 instruction-expression, 23, 631-632 instruction format:
half-duplex, 617-618
+
integer-name, 614
interference, processor-memory, interflow, 151
instruction-efficiency, P, 626-627 instruction examples, ISP, 632, 635-637
graph-plot instructions, 308
integer-data-type, 630-631
control,
special,
and ISP, 607-615
626-627
instruction-source, K, 624-625 instruction unit, Stretch, 426-427
PDP-8, 123 PILOT, 444-445 SDS 900 series, 543-545, 552-555 Stretch, exchange, 421-422 UNIVAC I, 158, 161-162 input and output organ, von Neumann, 91,
ISP,
PMS
K,
IBM 1800, 405, 409-411 IBM 7094, 524-525 ILLIAC IV, 322, 327-328
6600, 473, 494
636-637 624-625 626
ISP,
285-289
BTSS, 297-300 D825, 454-455
data,
gate tubes, 112-119 general conventions,
instruction interpreter (see interpreter) instruction look-ahead (see look-ahead)
series,
550-552
UNIVAC, 159-161 von Neumann, 111-119 Whirlwind I, 140-141 interprocess communication, 41
interprogram communication, 81-83 interrupt/interprocess interrupts, 82-83, 411 Atlas,
B
274-283
5000, 267-272
D825, 452-453
Lehman computer, 458-461 PDP-8, 123 RW-400, 481-482 SDS 900 series, 553-555 Stretch, 423 interrupt-response-time, P, 626-627 intraprocess interrupt/trap, 82-83 (See also extra codes; trap)
I/O Bus: PDP-8, 124-126 SDS 900 series (see input-output)
large capacity store/LCS, 571-572,
ISP/Instruction-set Processor, 12, 22-33
ACE, 193-199 AGC, 152-155 Atlas, 276-279,
B
length, 616
M(p/primary memory),
level,
LINCtape, 124-126
CDC DEC
lineage, 617, 619 linear switch, 623-624
338, 305-309, 310-314
link,
8-bit character
computer, 184, 186-187
list,
HP
list
9100A, 243-249 650, 220-223
list
607, 611
384
processing, EULER, structure, IPL VI, 350
1401, 226-229, 231-234 1800, 407-416, 417-420
literal syllable,
7094, 523, 526-541
logic diagrams,
ZEBRA,
127,
logical structure (see ISP,
544-545, 548-550, 556-560
ISP conventions, 628-637 italics, 24, 608
IBM
IBM
System/360)
287-289
6600, 492-494 7094, 550-552
ILLIAC
IV, 323-324
424-428 574
Stretch, 397, 422,
look-aside
memory,
[See also
84,
457
K/control, 16-22 (See also control)
k/kilo, 616 kernels, 464
keyboard: HP 9100A, 235, 244-249, 251-253
memory)
Programma
101;
237-242
[See also T(keyboard)]
L/link, 16-22, 619-620 label,
612
labeled-entity, 612 language, 9
(see
memory map; multiprogramming)
464-466
M(associative), 76 [See also M(content addressable)]
medium, 618 memory, 620-622 access-time, 620-622
M(core), PDP-8; 128-130
cycle-time, 620-622 function, 620-622
M(cyclic), 73-74
information-rate, 620-622
M(delay line; ACE, Deuce), 191, 193-199 M(delay line; Pegasus), 173-174, 177 M(delay line; UNIVAC), 163 M(drum), 74
operations, 620-622
(See also look-aside)
M(electrostatic;
M(fixed-head
Whirlwind
disk),
M(fixed-head disk;
I),
141
ILLIAC
IV), 322,
card;
HP
card;
Programma
tape),
327-328
9100A), 248-249, 253 101),
74
IBM format), 126 tape; RW-400), 483 tape;
permanency, 620-621 620-621 primary, 621 portability,
[See also M(p)] processor state, 621
74
M(large storage; Whirlwind), 137-138, 141 M(magnetic card), 74
M(magnetic M(magnetic M(magnetic M(magnetic M(magnetic
142-143
marks, 609 master control program, B 5000, 267-268 master slave schemes, D825, 449 matrix multiply problem, Lehman computer,
M(bulk core), 74 M(content addressable), 74 join instruction,
I),
magnetic card [see M(magnetic card)] magnetic tape [see M(magnetic tape)] magnetic wire memory, 96 main line of computers, 87-91 maintenance: ILLIAC IV, 328-329 Pegasus, 181-182 UNIVAC, 165-169 Whirlwind I, 138-139, 142-143 manufacturer catalog number, 617 manufacturer name (see proper-name) manufacturer-type, 619
map
M(content addressable)]
M/memory, 16-22 (See also
5000), 269-271
film;
machine-independent language, B 5000; 267 macro-parallelism, 456, 463
look-ahead: Atlas, 281-285,
B
D825), 453-454 M(toggle switch; Whirlwind M(UNIVAC), 158, 164
M(thin
memory mapping; multiprogramming) logical design level, 5 FORTRAN Machine, 365-381 PDP-8, 127-133 Pegasus, 172-175, 179-181
CDC
200-204
M(stack;
(See also
SD-2, 343-347
Whirlwind, 140-141, 145 Wilkes example, 337-339
M(stack), 73
BTSS, 291
PILOT, 442-444 Programma, 237-242 RW-40, RW-400, 470, 480-482, 486-488 series,
M(s/secondary), 74
logical address, 76-81
134-136
213-215 Stretch, 422-424 UNIVAC, 157-160 UNIVAC 1103A, 205-208 von Neumann, 111-119
5000; 272
PDP-8, 127-133 logic equations, PDP-8, 127-133 logic technology, 40, 617-618
Pegasus, 176-179, 182-183
Strela,
B
623-624
location, S,
System/360, Model 30, 385-388 ILLIAC IV, 322-325, 330-333 IPL VI, 354-358, 361-362 KDF 9, 262-263 LGP-30, LGP-21, 217, 218-219 MIDAC, 209-212 NOVA, 317-318
PDP-8, 22-25, 26-27, 28-33, 120-123,
M(read only), 604-605 M(read only; capacitor; System/360; Model 30), 385-387 M(read only; HP 9100A), 235, 250-253 M(read only; rope; AGC), 146-147
port-to-port delay, 620
EULER, 383-385, 388-391 FORTRAN, 363-365
SDS 900
[See also T(punch)] M(queue), 73 M(random), 75
619-620 delay, 620
D825, 453
diskpak), 74 17, 24, 74
M(p; concurrency), 41, 76-81 M(p; size), 41 M(photostore; IBM), 507 M(punched card), 74
system, 3-4
BTSS, 292-297 6600, 472, 491-493, 497-503
M(magnetic tape; Uniservo), 157
M(moving head
length-type, data-type, 629-631
283-285
5000, 268-273
IBM IBM IBM IBM IBM
582-583
lattice (see structure)
237-242
secondary, 621 [See also M(s)] size,
620-622
technology, 620-622 (See also M/memory; memory memory access algorithm, 73 memory addressing: AGC, 155-156
organ)
142 multiplication, Whirlwind,
memory addressing: (cont.) SDS 900 series, 542, 549-550 memory bus, Stretch, 422, 426
multiplier,
(See also S/switch) memory declaration, 36
memory-expression, 631-632 interface connection, 543, 546-548, 555
SDS 900
memory memory Atlas,
CDC IBM
series,
operation-code-size, processor, 626-627
multiply step, SDS 900 series (see ISP) multiprocessing, 446-469 multiprogramming, 76, 274-275, 456-469
operation-expression, 631-635 }, 30-32, 631-632 operation-modifier/ { operation-rate, port, 617-618
Atlas,
BTSS, 291-295
289-290
(See also
IV, 322-324,
Stretch, 397,
multiprocessing;
n-ary operation, 633 name, 607, 609, 613-614
memory mapping, 77-80 (See also multiprogramming)
Neumann, 92-96
IBM 1800, 408 BTSS, 294-295
protection,
message concentrator, 120 message switching, 505 metanotation, 607-609 micro-operation, Wilkes, 335-337, 339 micro-order:
System/360, Model 30, 385-388
Lehman, 456
micro-programme, Wilkes, 335 [See also P(microprogram)] microprogram: control fields, 387 HP 9100A, 254-256 sequencing, 388 status bits, 388 symbolic representation, 388-389 [See also P(microprogram)] [see
component, 617 compound, 25, 614 hyphen, 613-614 phrase, 613-614 primitive, 613-614 proper, 607, 617 simple, 607, 613-614 name-expression, 613 nesting store, 263-266
P(microprogram)]
micro-subroutines, Wilkes, 339-340
MOBF/mean-operations-between-failure, 617-618
modular scheme, D825, 449-450 modulation, 618 monitor map, BTSS, 291-295 monitor mode, BTSS, 291-297 moving head disk, 74, 577, 579 Mp-concurrency, processor, 627-628 MTBF/mean-time-between-failure, 617-618 multiple addresses per instruction, 191 (See also instruction format) multiple data stream, 83-84 multiple instruction stream, 83-84 multiplex, 617-618
memory, IBM 7094
(See also S/switch) multiplication, 100-111
AGC, 152 UNIVAC,
157
B
5000, 272
coding, 193, 199 optional expression, 607, 613
optimum
II,
518-519
address) (see instruction format)
66
ILLIAC IV), 320-333 NOVA), 318-319
P(display), 72 P(io), 72,
303-304
P(io; analog/digital;
IBM
1800), 405,
409-416
P(language), 63, 73, 257
P(microprogram)/microprogram processor, 61, 71, 334 P(microprogram; SD-2), 341-347 P(microprogram; System/360, Model 30), 385, 388 P(microprogram; Wilkes example), 335-340 P(special algorithm), 66, 72-73, 301 P(stack) (see instruction format)
P(vector move), 72
P-concurrency, 627-628 packaging: 6600, 494-496
CDC
HP 9100A, 250, 252-253 Pegasus, 174-176, 179-182 SD-2, 341-343
616
Lehman computer, 462-463
call syllable,
B
5000, 272
operating system:
285-287 B 5000, 267-268 BTSS, 292-300 CDC 6600, 472, 475 D825, 450-455 Lehman computer, 461-463 operation, 616, 632-635 D, 626 K, 624-625 M, 620-622 P, 626-627 port, 627-628 Atlas, 279,
1
P(c/central processor), 17-22, 71
one-level store, Atlas, 279-283 one's complement, AGC, 150-152
operand
+
P(array;
network analysis problem, Lehman computer, 466-469 network computers, 447, 470-503 next, 24, 631 noisy mode floating-point, 422-423 nonary operation, 633 null, 607, 613 number, 608, 614 number-data-type, 630-631 number-name, 614 number representation, AGC, 150-152 number-set-name, 615
onion peeling,
(See also processor) P(1 address) (see instruction format) P(2 address) (see instruction format) P(3 address) (see instruction format)
P(array;
[See also M(stack)]
octal-digit,
P/processor, 17
P(array),
mixed number, data-type, 630-631
multiplexer,
operator syllable,
P(n
network, 628
Wilkes, 335-337
microprogram processor
memory map;
n-ary-arithmetic-operation, 614, 633-635 n-ary-boolean-operation, 615, 633-635
BTSS, 291-295 IBM 7094, 523
micro-parallelism,
617
operation-time, 19
327-328
memory map:
violation,
operation-set,
parallel processing)
421-422
organ, von
operation-rate-set, 617
274-283
5000; 267-268
interleaving:
ILLIAC
operation, S, 623 T, 625-626
multiplier-quotient register, 59
B
6600, 473, 493 7094, 517-522
memory memory memory
615
Stretch, 432,
Whirlwind
I,
438-439 141-143
page: address, 120-134
(See also
memory mapping; multiprogramming)
Atlas, 274, 276,
279-283
BTSS, 291
mapping, 79-80 page address register,
Atlas,
279-283
parallel arithmetic (see arithmetic)
CDC
6600, 491-494 456-469 IPL VI, 359-360 parallel programs, parallelism, 456 parameter, 19, 611 (see attribute)
parallel-by-function,
parallel processing, 446,
parameter-set, 611
parentheses
PMS PMS PMS PMS
609
),
(
performance, 37, 49-52 CDC 6600, 470-471
notation, 19-22
quantity, 608, 615
primitives, 16-22 structure, 41
quit instruction, 457
queue memory
PILOT, 440-442 Stretch, 421-423, 425-426, 431-433
structure dimensions, 63-85 polar coordinate arithmetic, 246, 255-256 Polish notation, 270-271, 391
UNIVAC, 164-168
port, 16-18,
Lehman example, 456-457, 463-469
period ., 25, 609, 614 peripheral and control processors,
6600,
471-475, 489-491
ming)
diagram, 16-22 15-22
level, 9-10,
ACE, 191, 193, 198 AGC, 146-148 289-290 B 5000, 258-260, 268 BTSS, 275, 292 6600, 470, 471-475, 476, 489-494
computer models, 63-66 D825 and D830, 260, 450-451, 453-455 Deuce, 191 EULER, 382-392 FORTRAN machine, 365-366 HP 9100A, 235, 249-254 1401, 226
System/360, 563, 579-587, 602-606 IV, 321-322, 327-329 9,
260
Lehman Computer, 459-461 LGP-30|LGP-21, 217 M.I.T. network, 507
networks, 504, 505-512 PDP-8, 20-21, 121, 123-131, 124, 126-128
PILOT,
398, 440-442
SDS 900
Stretch, 421-423,
425-426
546-548
data-types, 626-628 encoding-efficiency, 626-627 function, 626
instruction-memory, 627 instruction-size,
626-627
program program program program program program
633-634
615 617-618 HP 9100A, 253 ILLIAC IV, 328-329 Lehman computer, 456-457 network, 505 Pegasus, 181-182 UNIVAC, 166-168 Whirlwind, 138-139 relocation registers, 80
relations,
memory mapping; multiprogram-
NOVA, 316
120
field register,
8-10
level,
reference table,
operator,
B
5000, 271-272
SDS 900
series, 542,
criteria,
D825, 448
proper-name, 607-617 protection and relocation registers, 80 status
word
CDC
(see processor state)
M(punched
DEC
(See also stack)
pyramid,
resume instruction, 458 reverse polish, 262-263 round-off, 104-107
Atlas,
287-289
CDC
6600, 491-494
FORTRAN Machine, HP 9100A, 250 IBM IBM IBM
6600, 474
card)]
338, 308-309
364-368, 375-381
1401, 229-230
1800, 405-409, 7094, 520-522
411-413
ILLIAC
IV, 326 352-354 KDF 9, 264 NOVA, 318 PDP-8, 125, 127-133 SD-2, 343-345 SDS 900 series, 550-552 Stretch, 426-431 UNIVAC, 157-160 Wilkes example, 336
IPL
(See also extra codes)
[see
replicated single-computer systems, 448 resource allocation diagram, 10
RT/register transfer level, 5-7
checking, Pegasus, 178 counter, Whirlwind, 140 entry mode, desk calculator, 235
push-pop instruction, 90, 138-139
608-609, 634
relational-i-unit-operations,
(See also
(See also P/processor) processor state, 24, 57-63
UNIVAC,
I,
,
615
repeat instruction, 207
punched card
1108, 11
< >
Mp-concurrency, 627-628
PSW/program
Whirlwind
?£
ming) renaming, 632
Texas, University, 506-507
UNIVAC
=
relational-expression,
interrupt-response-time, 626-627 ISP, 635-637
S/switches, 67-69
158
relational-arithmetic-operations
reliability,
programming
series, 275, 543, 546,
RT)
processor, 626-628
544-545, 550
471, 477-480, 482-485
632
register transfer (see relation, 608
relational-operation, 615, 634
programmed
SD-2, 343
register,
process map, BTSS, 293 processing elements, ILLIAC IV, 321-322
program-switching-time, 626-627
pipeline processor, 84 Programma 101, 237, 237-238
RW40, RW-400,
629-631 primary computer, PILOT, 441-443 primary memory [see M(core), Mp-concurrency] primitive-name, 613-614 print column, 617 process, BTSS, 293-297 process control computer, IBM 1800, 399-420
program-switching-time, 626-627 serial, 83
ASP, 506
629-630
referent-expression, data-type, 629-630
163
parallel/parallel-by-word, 83-84
1800, 400-405, 404 7094, 517, 518, 519
EULER, 383-384
referent, data-type, 16,
operation-code size, 626-627 P-concurrency, 627
701, 515
ILLIAC
KDF
recursive procedure,
supply: Pegasus, 181
algorithm-encoding-efficiency, 626-627 concurrency, 41, 83-85, 626-627
ComLogNet, 509-510
IBM IBM IBM IBM IBM IBM
power, 616-617, 619
address-per-instruction, 627 address-size, 626-627
network, 511
Atlas, 277, 279-283,
CDC
(See also physical address) record, 617
precision, data-type,
pipeline processor, 84-85 PMS conventions, 615-628
ARPA
M, 620-622
UNIVAC,
memory mapping; multiprogram-
random access memory [see M(random)] range—, indefinite expression, 19, 610
T, 625 610 postulation, indefinite-expression,
power
BTSS, 291
PMS PMS
portability:
M(queue)]
readability, 618 real address, 76-81
port-to-port delay, L, 620
CDC
permanency: M, 620-622 S, 623 phrase-name, 613-614 physical address, 76-81 (See also
617-618
[see
VI,
(See also logical design level; microprogram)
Mp-Pc; Lehman computer), 461 67-70 S(cross-point; B5000), 258, 267-268 S(cross-point; D825), 450-454 S(cross-point; non-hierarchy; RW-400), 478-480 S(duplex), 66-69 S(hierarchy) 67-70 S(Inter-memory transfer trunk; PILOT), 443 S(non-hierarchy), 68-69 S/sec/seconds, 616 S(simplex), 66-69 S/switch, 17-22, 41, 66-70 S(crossbar;
S(cross-point),
S(Telephone exchange), 506
CDC
S(trunk;
6600), 493
computer
CDC
secondary computer, PILOT, 443-444 segmentation, 77-81 (See also memory mapping; multiprogramming) Selectron memory, 95 semantics, 607-608 semi-colon ;, 611
(see interpretation-cycle)
sequential circuits, 5 serial arithmetic, 428-429
set,
607, 611
shared
614
simulation:
computers
(See also emulation; inter-
Lehman computer, 463-469 single data stream, 83-84 single instruction stream,
memory
83-84
(see look-aside)
SLT/Solid Logic Technology, 564, 603-604 small letters, 609 source address, ACE, 194-199 space, SD-2, 341 space „, 25, 607 spacing, 609 specialization, indefinite expression, 610 split instruction, 457 square root instruction, 241 stack:
5000, 260-261, 269-271
DEC
338, 308-309
EULER, 385
KDF
174
631-632 ory protection; multiprogramming) and forward network, 504
stored program digital computer (see computer)
613
structure, 37-38,
52-85
computer, 628
9,
260-261
and
458 Whirlwind, 142-143
set instruction,
test control,
tetrads, 112
three addresses per instruction, 193-194 (See also instruction format) time, 616 time chart, 43-46
time-sharing computer (see function; multipro-
hierarchy, 63-70 tree,
memory, technology)
temperature, 616-617, 619 terminate instruction, 457-458 test
memory mapping; mem-
gramming)
IBM 1800, 411 transducer, 625-626
65
timer,
65
subcomponents, 617
divergence, 625-626
subroutine calling instructions, PDP-8, 123, 135 subroutine file, BTSS, 299-300 subscripts (see base register; index register)
technology, 625 transduction, T, 625-626
609 subtraction, 99-100 superscripts/I, 609
transmission-operation, 633 transmit