Porting and Optimizing Star; A Case Study of Suffering and Surfacing
F.H.D. van Batenburg 1); V. Bos; J.J.M. Riethoven; J.P. Abrahams; C. Pley 2)
1) Institute for Theoretical Biology, Kaiserstraat 63, Leiden, the Netherlands.
2) Leiden University, dept. of Biochemistry, PO Box 9502, 2300 RA Leiden, the Netherlands.
1. Abstract
This paper has three main parts. The first part is a description of our project STAR. It explains what our
Actually,
a closed zipper.
mirror
RNA
The last part narrates our strategy to optimise the program. the highest potential
building
blocks
the so-called
2. Introduction
with optimizing.
introduction
will
biochemical
problem
present a very which
At first,
by some particulars
however,
which
nucleotides
brief
in
in a very
blocks of a RNA
and the composition
to 4 different
particular string are of a RNA
types of nucleotides.
For
of these nucleotides
which
strand. For example each from A, C, G and U will command
our program
the
a particular
in which
the cell as encoded by that RNA
is
incorporation protein
of
amino-acid
chain. And these amino-acids blocks of all kinds of proteins.
and Manner
into
the
in turn are See fig. 1
[5].
Background In the beginning, research was focused on the composition and sequence of the nucleotides in the RNA-string only.
In order to explain our program STAR we will present a very superficial introduction to some aspects of cell and to the role of RNA
in particular.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1992 ACM 0-89791-478-3/92/0007-0265
APL Quote Quad
alone determines
the amino acid
sequence in the proteins,
and deficiencies
could often be
by
a
peculiar
RNA
arrangement
of
the
nucleotides. It appeared however that the nucleotide arrangement as such could not explain everything of the protein-synthesis
After all, this arrangement explained
What is RNA? Now most of you might be more or less familiar with DNA, which carries the genetic code in our cells, but what is RNA? RNA is very similar to DNA.
R is the
processes
triplet of combinations
of the
from Mavor
biochemistry,
a
this is
intends to sc,lve,
background
growing
2.1. Biological
only
o. a, determines the types of proteins which are produced in
the basic building
are arranged
It is the sequential arrangement
itself is in order. This
the program
(being
than DNA;
convenience they are characterised as A, U, C and G, which stand for the 4 different bases in the nucleotides.
This paper reports our strides with program STAR. In particular we elaborate upon our porting experiences and of the program
shorter
DNA)
of most of the production
string is restricted
a short introduction
as
opens the
is found in all cells of plants and animals.
sequence. The basic building
about our experiences
the cell
those cells. Such a RNA string is a long sequence of basic
gain.
followed unique.
much
of a p@ of the unzipped
basic controller
We will explain why we deviated from the general accepted those parts with
synthesis
RNA.
them and what we learned in the
process.
strategy to optimise
RNA
a single strand and usually
The second part pertains to the porting of our program to other platforms. It enumerates the errors of our ways, how to overcome
For
template.
the DNA
zipper and along a part of one of the DNA zipper strands it constructs a type of mirror-image. This mirror image is
program does and what makes it special compared to other similar programs.
we struggled
it is more or less of a cast of the DNA
A popular but adequate analogy is to imagine
process, For the production of proteins, RNA is responsible together with special enzymes, which attach themselves to
specific sites of the RNA. Now sometimes those enzymes seem unable to get a hold on a perfectly normal RNA
$1.50
van
Batenburg,
Bos, Reithoven,
Abrahams,
Pley
CACCCU
may glue to AGGGUG1)
and yield
an energy
gain of 15.9 kcal/mol (which is an expression of the attractive forces). After computation of all possible attractors, the program computes all types of combinations. For example if the sequence CUC is attracted to sequence
GAG next to it as well as to a similar
further
sequence GAG much
away, this yields two different
combinations.
The
first one requires a very sharp kink in the string (which by the way costs more energy) whereas the other one forms a very shallow loop. Or the string CACCCUC can either pair with ?AGGGUG (remember that the second string is written
reversed!),
leaving
the last C unpaired
or it may
pair with GAG???? elsewhere, leaving the first nucleotides CACC unpaired. The program computes the energy gain of each combination and yields the one with the highest gain. Our program
STAR differs
we do not compute [1]).
Our reasoning
WHAM!
Fig. 1: Protein production string.
by RNA (Mavor
The explanation
called the secondary primary
structure
arrangement
which
of RNA simply
& Manner [5]).
the
you well
know
leisurely
about your
gain.
of RNA? Imagine RNA as
thus forming
that conflict
does not compute
and the secondary
structure
of our
has a higher percentage
in some cases (Abrahams
of correct
[1]) and demands far
power.
Although STAR has other unique characteristics (for example it was the first program to predict a very special
quietly floating as a long, stretched, relaxed string, some spots of the string are attracted to other spots and tend to “base-pairing”,
Still our program
less computer
shoe laces, such long strings
have an unbelievable tendency to form knots and tangles everywhere. This is also true for RNA strings. Instead of
glue together by so-called
does not strand into
is not necessarily the one with the highest energy
predictions
in the fluid of the cell. As
alternatives
step. So our program
possible alternatives
prediction floating
string
one and so on. In each step, the program
with that particular
sequential
~
What is the secondary structure
in that
(Abrahams
from a stretched
excludes more and more potential
of nucleotides.
a long string
was that the RNA
!! turns immediately
next biggest
as opposed to the is
program
combinations
its spaghetti tangle in one big bang. On the contrary, we expect this to be a gradually evolving process in which the biggest attractions are formed first. Therefore, our program starts with establishing the biggest attraction first, then the
for this can be found in what is
structure
from Zuker’s
all possible
type of spaghetti tangle which biochemists call “pseudoknots”), this simulation-like computation is unique.
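The "biggest attraction first" idea described above can be sketched as follows. This is only an illustration in Python (STAR itself is written in APL), not the authors' implementation: the names `Stem` and `greedy_fold` are hypothetical, a candidate is assumed to conflict with an earlier choice only when it reuses a nucleotide position, and apart from the 15.9 kcal/mol quoted in the text the energy values in the usage note are made up.

```python
from typing import List, NamedTuple, Set


class Stem(NamedTuple):
    """A candidate pairing between two stretches of the RNA string (hypothetical type)."""
    left: range    # positions of the first stretch
    right: range   # positions of the stretch it glues to
    gain: float    # energy gain in kcal/mol; bigger means a stronger attraction


def greedy_fold(candidates: List[Stem]) -> List[Stem]:
    """Accept candidates from the biggest gain downwards.

    Each accepted stem excludes every later candidate that would reuse
    one of its nucleotide positions (an assumed notion of "conflict").
    """
    chosen: List[Stem] = []
    used: Set[int] = set()
    for stem in sorted(candidates, key=lambda s: s.gain, reverse=True):
        positions = set(stem.left) | set(stem.right)
        if positions & used:
            continue  # conflicts with a stronger attraction formed earlier
        chosen.append(stem)
        used |= positions
    return chosen
```

With the candidates `Stem(range(0, 6), range(20, 26), 15.9)`, `Stem(range(3, 6), range(8, 11), 4.0)` and `Stem(range(10, 13), range(15, 18), 5.0)`, the second candidate is dropped because it reuses positions 3 to 5 of the strongest one. Note that the result need not be the combination with the highest total gain, which is exactly the difference with the exhaustive approach.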
an
intricate tangle of spaghetti. Now, contrary to the spaghetti on your plate, the RNA tangle is very specific and is
called
the
secondary
structure.
The
shape
of
2.3. Program
that
secondary structure explains why some enzymes are able to bind to the RNA
at specific
Some fit perfectly
well
locations
The basic engine of STAR
and others cannot.
in the crevices
J. P. Abrahams
of the spaghetti
Our program
Star; The Basic STAR
structure of RNA. Analysis
of
RNA.
Engine
is designed to compute the secondary is not
the
first
program
The most famous program
How does that program
Zuker’s
Atari and Macintosh
how great those attractions
APL 92
in
ideas were invented by him
implementations
of STAR.
Finally,
all possible base-pairings
by
department in 1989. He designed and implemented the user interface as it is nowadays with menus and dialogues for
which
computes secondary structures. is that of Zuker [6]. work?
in APL department
and implemented in the program. In order to make this program accessible for others we decided to embed this basic engine within a more user-friendly interface. This was done by V. Bos, another student who worked at our
That is why STAR stands for Structural It
was programmed
as a student at our University
1985. Many of the biochemical
tangle, others do not.
2.2. Program Star; The User Interface
in a RNA
program
computes
GUGGGA,
string and it computes
person might write but
biochemists
reversed. So CACCCU
are. For example the sequence
that CACCCU
write
with AGGGUG
the
string
means that C glues
to G, A to U, three times C to G and finally
glues to
second
U to A.
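The reversed-writing convention for base-pairing can be checked mechanically. The following Python sketch is an illustration only: the function name `can_glue` is hypothetical, and it assumes that only the pairs named in the text (A with U, C with G) may glue, whereas a real folding program may admit further pairs.

```python
# Pairs named in the text; anything else is assumed not to glue in this sketch.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def can_glue(first: str, second: str) -> bool:
    """Return True if `first` pairs with `second` written in the usual direction.

    The first nucleotide of `first` is matched with the last nucleotide of
    `second`, the second with the one before last, and so on.
    """
    if len(first) != len(second):
        return False
    return all((a, b) in WATSON_CRICK for a, b in zip(first, reversed(second)))
```

Here `can_glue("CACCCU", "AGGGUG")` holds, because AGGGUG read backwards is GUGGGA, which matches CACCCU position by position.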
J.J. M. Riethoven functions
exported
STAR
3. Porting;
to the PC, using several
from M.v. Welie.
STAR was originally comparable
of fig.2 and fig.3.
This step raised various
1. Floppy transportation
Menu options of STAR
The
file,
drive
the nucleotides.
of
this
To our pleasure
This menu also
floppies;
sees fit.
Macintosh workspace,
of
and the second one
format
enables the user to view the string or to edit it as he/she core
to Macintosh
similar
to DOS-floppies.
machine
was
able
to read
DOS
This can be done by reading an
or by typing
The
to Macintosh.
The first one was the
floppies.
menu user must enter the RNA sequence
structure).
are very
This is very different from the way the Macintosh formats its floppies. Fortunately, shortly after our decision to port to Macintosh, we got a Macintosh IIcx.
.Priflt
(the primary
they
working.
Atari floppies are formatted 1---.
but
from Atari
problems.
from Atari
was to get the program
Priflt
ASCII
machines,
Our first step was to port STAR ., -----
Viell
In the [Primary]
to Macintosh
3.1. Atari → Macintosh
Edit .---.-.----.-
Fig.2:
developed on an Atari microcomputer
cheap. Unfortunately not many scientists use Ataris, so in order to make our program available to others we had to port it to Macintosh or to PC.
The central interface with the user is the menu with the alternatives [File], [Primary], [Secondary] and [Energy]. --see fig.2.
p#!LmJ_ ......... ............
& Lowlights
(1040 ST). This type of machine has a power which is
To get the idea we show some of the basics in the figures
Renane Delete -----.. ..
Highlights
the program
is invoked
by
the
[Secondary] menu. Here the user can instruct to calculate the secondary structure. Here the user can also instruct the
write
Atari
converted format.
——
could
read the Atari
and thus
the files. The
could not understand the contents of the though. So we had to program a system to workspaces
which
program to show the computed structure in various ways on screen (see fig.3) or to print it.
workspaces
more or less in a “transfer-like”
could
be read in the
Macintosh. This was not built from scratch, but was an adaptation from a home-made program COMMAPL which
All in all this step took us about 2 hours.
I
UGGIM U-,
the Macintosh
at least the directories
2. Different names of comparable (device-dependent) routines.
Although most programs use the standard APL functions, several device-dependent actions are performed by special locked system programs which are provided by APL.68000. Unfortunately, some functionally comparable programs have different names in Atari and Mac. See for example the list of fig.4.
Fig.3: Various views of secondary structure.

Atari      Macintosh   Function
BEE        WATCH       Set cursor busy
STCREATE   MCREATE     Create new file
STTIE      MTIE        Open file
STREAD     MREAD       Read file
STWRITE    MWRITE      Write file
Fig.4: Different names for comparable programs.

Of course it was evident that we should replace those machine-dependent modules by their Macintosh counterparts. So WATCH, MCREATE, MTIE, MREAD, MWRITE
and
many
more programs
were
imported
from
when
the work-
supplies for the Macintosh, and BEE, STCREATE, STTIE, STREAD, STWRITE and all the other Atari alternatives were removed. Unfortunately however, different names too. comparable
functions
those programs,
required
but the different
these differences,
replaced all menu-functions MDF-cover start with
we had to deal with the call to So the different names for not only
we discovered
us to replace
Those
numbering
system at the outside, but inside they convert
from
Simplified
cover
functions were dialogue-functions not only identical
that
not
~
machine-dependent For example the
different. MAKEDIALOG,
and
DIALOG
in name, but similar
v
for
of
that
MVRm.nw-(10 1)(102) v Z+WDF_GEIMENUIIEM
Z+MDF_GETMENUIIEM
LCGE2MENUV
/[l]Z+MVRmenu
V
. ..(145)
I
for their inputs as
comparable
system
Macintosh
v
MDF
ENABLE17EM
[l]ENAB&EM
input/output
numbering
sequential
examples:
[l]Z-GE777EM
well, 3. Different
use the
See fig.5.
were
ALERT
fimctions
the proper
machine.
Atari
convenient
with MDF-cover-functions.
functions are functions of which the name MDF to indicate a Machine-Dependent-Function.
particular
to look into all the programs and check if they called any one of those particular ST-functions. All in all this took us several hours. was
to
to and
names also forced us
to change the calls in the parent modules. This forced us
It
we decided
hide the differences of this information. On a “functional level” we opted for the Atari numbering system and we
spaces that APL.68000
functions
Fig.5:
user-interface of STAR relied heavily on menus and dialogue-boxes. In the menu functions again we
vMDF ENABLEHEM R MVRmenu[l tR],l [l~N4B LiImM
R
R V
$Rv
Cover for menu functions.
The
4. Font problems
had to tackle the problem of different names for comparable functions as outlined in the previous paragraph.
Moreover,
another
problem
o Identical
could load different
was that the
steering of those functions was different. with the following
The use of fonts posed another problem. SETFONT,
We had to deal
on-screen
categories:
number,
names with different
inputs as in:
151
(15= option
“view”,
position.
o Different
1171
(11 7 =option
names with different
“view”,
1 = enable)
(returns
is addressed
by line-
has different switch
and column
fonts.
to another
It has no
system font
Furthermore,
putting
two different
characters
on the same spot would result in a preempting of the last character with ATARI, and in an overstrike with the
outputs as in:
Atari: GETITEM
position
The Macintosh
program and their
though, Unfortunately those fonts are proportional and their on-screen position is addressed by the ~-
1 =enable)
Macintosh: ENABLEITEM
Atari fonts are nonproportional
SETFONT, but we could
Atari: ENABLEITEM
These
In the Atari we
fonts using the supplied
Macintosh. to solve.
2 numbers)
These font-differences
required
several days
Macintosh: GEZW3VU (returns 3 numbers)
o Different
names and different
5,
inputs as in:
SEliWENUS charactermatrix “ matrix
for Atari
and Macintosh.
We
discovered this immediately, as dialogue-boxes which were pleasingly centred on the Atari screen, appeared at
Macintosh: SLG’MENU
dlferences
Screen size was different
Atari:
numbers
Hardware
matrix..
the far right with the Macintosh. Another seemingly trivial problem was the difference in keyboards. A
matrix
DRA WMENUBAR
Macintosh has no [Insert] key, nor a [Delete] key.
The main discrepancies were due to different addressing of the menu options. The menu options of fig.2 are addressed by Atari as number 1, 2, 3, 4, ... etc. For example option [Files/Quit] is number 7 (notice that “Files” and horizontal lines have a number too) and option [Primary/Open] is number 11. With the Macintosh, each option is addressed by 2 numbers; the columns are numbered 10, 11, 12, ... and the rows in each column as 1, 2, 3 and so on. So option [Files/Quit] is number-pair 10 6 and option [Primary/Open] is number-pair 11 3.
Altogether these conversions were accomplished in...
o Porting from system to system 2 hours.
o Conversion of different names 4 hours.
o Conversion of menu differences 16 hours.
o Font problems 16 hours.
o Positioning of dialogue-boxes 1 hour.
o Various small function adaptations 1 hour.
So porting was not such a major effort. Nevertheless, the multitude of those small deviations between the two platforms required as you can see about 40 hours. The code
amounted to 231 functions;
altogether
1. The best one was ❑BOX which
3144 lines of which
a third (1285 lines) was comment.
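The two menu-numbering schemes mentioned for the Atari (one running number per menu title, option and separator line) and the Macintosh (a column/row pair) can be related by a small lookup table. The Python sketch below is an illustration under stated assumptions, not the MVRmenu table of the paper: the function name `build_menu_table` is hypothetical, the [Files] menu is assumed to have 6 rows because [Files/Quit] is row 6, and the 4 rows given to [Primary] are a made-up figure.

```python
from typing import Dict, List, Tuple


def build_menu_table(rows_per_menu: List[int],
                     first_column: int = 10) -> Dict[int, Tuple[int, int]]:
    """Map Atari running numbers to Macintosh (column, row) pairs.

    Every menu title consumes one running number of its own; the rows below
    it (options and horizontal lines) consume the following numbers.
    """
    table: Dict[int, Tuple[int, int]] = {}
    number = 0
    for offset, rows in enumerate(rows_per_menu):
        number += 1                      # the menu title itself
        for row in range(1, rows + 1):
            number += 1
            table[number] = (first_column + offset, row)
    return table


# Assumed layout: [Files] with 6 rows, [Primary] with 4 rows (hypothetical).
table = build_menu_table([6, 4])
print(table[7], table[11])
```

Under these assumptions the table reproduces both examples from the text: Atari number 7 becomes the pair 10 6 and Atari number 11 becomes the pair 11 3.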
matrix
2. Another one was ❑SS(source;searchstring;replacement) which replaces substrings.
3.2. The Lessons What is to be learned from this experience? 1. First of all we learned the hard way never again to call non-standard functions directly but substitute cover functions
instead.
starting
with
Those
cover
the characters
functions
itmF.
functions.
to scan all parent because there would
We realised that nothing is wrong with using such handsome powerful functions as ❑BOX and ❑SS, but that we should have hidden
had names
So MDF_READ
calling STREAD in the Atari implementation the Macintosh
the machine-dependent
was
screen-
and MREAD in
In this way, there was no need
utilities.
functions for potential changes, not be any reason to change them
functions,
we suggest to MicroAPL
machine
dependent
platforms
if they perform
functions similar
on
Therefore,
Contrary
those
related,
features
but below
not
too, but not
UTS-functions.
[1]
different
∇Z←L UTS_BOX R
'Z←❑BOX R' ❑EA 'Z←L ❑BOX R'
∇
Pc:
tasks. [1]
vZ+L UTS BOX R lL+ll ~~(•-~ lL&L~
[2]
Z-3(
-REL)CR
V
-b PC 3. The ‘overbar comma’ in the ATARI/Mac
next step was to port the Atari/Mac
be replaced everywhere
product to the PC.
system in use at our department APL2/PC
the availability
in an academic environment to colleagues
appropriate message was given and control returned to the main “menu-loop” for continuation. In such a case,
at that time, we opted for
version instead. The primary of a free runtime
version.
version had to
by ,[1].
4. In APL.68000 we had used ❑ERROR to trap errors in the main program. Whenever an error occurred, an
The first decision to take was which APL system to use for the PC version. Although STSC was the standard APL reason was
the particular action was terminated, but after the error message the main program continued and no variables
This is essential
where products are distributed
local to the main program
were lost.
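The error-trapping behaviour described here (report the error, abandon only the failing action, keep the main program and its variables alive) can be illustrated with a small loop. This Python sketch is not the ❑ERROR or ❑EA mechanism of either APL system; the names `menu_loop`, `actions` and `state` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def menu_loop(choices: Iterable[str],
              actions: Dict[str, Callable[[dict], None]],
              state: dict) -> List[str]:
    """Run each chosen action; on failure record a message and carry on.

    `state` plays the role of the variables local to the main program:
    it survives a failing action untouched by the error handling itself.
    """
    messages: List[str] = []
    for choice in choices:
        try:
            actions[choice](state)
        except Exception as err:          # trap the error at menu level
            messages.append(f"{choice}: {err}")
    return messages
```

A failing action in the middle of a session therefore costs the user one message, not the whole session.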
for free or for a moderate fee.
Transport
don’t know yet how to replace ❑ERROR by ❑EA without seriously affecting the system-design though.
Again, the first problem was to transport the basics into the PC. Fortunately, at that time APL.68000 came with a level II version with nested arrays which claimed to have )IN compatible
was that Atari
floppies
with
IBM.
Another
are formatted
fortunate
similar
APL2/PC has no ❑ERROR but has ❑EA instead. We
3.3.1. Physical
and )OUT
were
For example... Atari & Macintosh:
Although several biochemists applied for our Macintosh version, we got many requests for a PC version. So our
IBM’s
to file- and
but more in the nature of
we hid these functions
below MFD-functions,
that they name their identical
machine
their use like we did with
functions.
managements,
particularly
in future portings. Of course, now all MDF... functions have to be changed though, no more, no less. 2. To spare APL programmers this hunt for calls in parent
3.3. Porting
reshapes a vector to a
(and vice versa).
5. A very particular non-standard feature was the diamond. All our programs were heavily diamond-ed. As
tlhing
APL.68000
to DOS.
supports the diamond,
but as APL2/PC
does
not, we had to convert all programs. So we “)OUT’’-ed
STAR
from the Atari
to a floppy
and The first attack was to change the diamond by character ~ and add the following function. ❑l% ‘Z*L AR’ ‘Z*L’ Atari and Macintosh This was done in the original programs too, in order to keep the bulk of the three versions identical. This solved about half of our problems with diamond-ed statements.
“)IN’’-ed it into IBM’s APL2 at a PC. To our surprise and delight, most functions and variables got imported into APL2/PC
without
and )OUT
standardisation!
apparent problems.
3.3.2. First Incompatibilities:
Hurray
Nonstandard
Our initial joy was tempered
quickly
that
nonstandard
we
had
APL.68000,
used
although
standard (Kerf
several
for the )IN
Features
when we discovered features
some of them are optional
The
A
now
inconspicuously
in the ISO
sub-statements
[4]).
similar into
the
to
the
APL
“+ “
syntax:
and
Bos, Reithoven,
with
the very
the diamond
from A with regard to order.
van Batenburg,
blends
a line
separated by A is parsed from
right to the very left. Unfortunately different
is
from
works
Although
APL is
processed
from
statements, processed
right
to left
from
diamond-ed
left to right.
BE:+(O # 00/CILC
Unfortunately,
diamond-ed
among
were
sub-
checking
are
program
themselves
This characteristic
sub-statements
conditional tasks which enough for l-liners like: BE:+(1500 >pRNA)/CiLC+
compact rewritten
within
the sub-statements
often
used
small
and yet cohesive
❑+-’*
So we extended about
String too long’
V IFWN,l-U
A
had
to
be
the text-matrix
(variable
enablingldisabling
problem
The
User interface
was the user interface.
it completely,
using
auxiliary
Nevertheless, we were anxious to change possible and only replace or add functions
functions
and
versions tree.
For
MDF_RNAEAD.
were also included
We had to
specific
user-interface
as little as low in the
programs this
irrespective
which
WMENU
In
a charactermatrix
MENUOPTZONS
for
(O
disabledlenabled
MDF_CHECKHEM NUMBERS
no
action).
and checked onloff
MDF_ENABLEITEM
&
particular
alert-boxes
the message. The resultant
3, OUTPUT
are
on/off
through
and
MENUOPnONS,
used
by
program
menu and to walk
are
available
by
the
of the severity
in an appropriate
of
icon.
for Atari
code was hidden
beneath
Mac and PC.
WINDOWS
a window
with scrollbar.
and their
and sets checking on/off,
In APL2/PC however, output which extends beyond the screen will scroll over the top and be lost irrevocably.
These commands are set independent of the final menu painting and also independent of the polling of the user response. Once those commands are given, they lie dormant and their effect becomes apparent at the time the menu is painted on screen and at the time that the user is scrolling
and
those two
Often output exceeds the available screen size. This is no problem in APL.68000 where output is displayed in
by...
menu-numbers
the
from
items
NUMBERS
sign sets enabling
To meet this problem we used AP124 again and wrote a little program UTS_DISPLAY which used the pop-up facility [FYI: call 13] of AP124. For example,
the menu-options.
we had to change our design completely
to display
the secondary
structure
of a
RNA in lego-view we changed... ❑←rna_structure RNA_DISPLEGO rna_sequence into: UTS_DISPLAY rna_structure RNA_DISPLEGO rna_sequence
In the PC-version we had to use AP124. Although we expected that the introduction of the MDF-functions would ease the work of porting considerably, we were
In the Atari and Macintosh versions we included function UTS_DISPLAY: ❑FX ‘UTS_DISPLAY R’ ‘❑←R’ and inserted it at the appropriate places.
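The role of a cover function such as UTS_DISPLAY, one name for the callers and one implementation per platform, can be sketched as below. This Python illustration is not the APL code of STAR: the helper names, the `platform` argument and the page height of 24 lines are assumptions, and splitting output into pages merely stands in for the AP124 pop-up facility.

```python
from typing import Callable, Dict, List

Pages = List[List[str]]


def _display_windowed(lines: List[str]) -> Pages:
    """Atari/Macintosh stand-in: everything goes into one scrollable window."""
    return [list(lines)]


def _display_paged(lines: List[str], page_height: int = 24) -> Pages:
    """PC stand-in: cut the output into screen-sized pages so nothing scrolls away."""
    return [lines[i:i + page_height] for i in range(0, len(lines), page_height)]


_IMPLEMENTATIONS: Dict[str, Callable[[List[str]], Pages]] = {
    "atari": _display_windowed,
    "mac": _display_windowed,
    "pc": _display_paged,
}


def uts_display(lines: List[str], platform: str) -> Pages:
    """Cover function: callers never name a platform-specific routine directly."""
    return _IMPLEMENTATIONS[platform](lines)
```

Porting to a further platform then means adding one entry to the table, while every caller of `uts_display` stays unchanged.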
once
more. The reason was the afore mentioned difference between issuing the enabling fdisabling and onloff
APL.68000
MDF_ALERT
NUMBERS
enumerate
is
call
In APL2/PC we had to muddle along with AP124. This processor doesn’t display icons, so we wrote a little function that combined the severity code (as text) with
with the texts of
fig.2 in each row. User action is polled by... ...←MDF_GETMENUITEM which yields the number of the option chosen
mistaken;
ultimately
program,,. 3 ALERT ‘This message on screen’ The number at the left is an indicator
MENUOPTIONS k
programs DISABLEITEM
change variable is
The
2. ALERT-BOXES
were
The menu functions are already explained for Atari Macintosh. The menu is painted on screen by... MENUOPTIONS
platform.
this information
these
1. MENU’S
Here
only
information
about
in variable
For the PC however,
the message which is reflected
MDF_SHO
the
to paint a proper only along “enabled” options.
necessary to design anew were:
where
of
on/off
ENABLEITEM,
directly.
NUMBERS
stored the information
and checking
MDF_GEn7EM
and Macintosh
functions
This calls to...
MDF_CHECKITEM
is that in APL.68000
CHECKITEM
(but empty of course) to keep the same function
The
checking.
MENUOP7TONS
machine-programs
AP124.
compatibility
in the Atari-
of
information
program
not used, because the last two
functional hierarchy of the program-system. For example, the sharing and retraction of variables with the auxiliary processor was done by newly introduced functions MDF.RNAINIT
information with
and onloff
last two programs
difference
processor
with
mentioned
MENUOPnONS+MENUOPITONS
MENUOPnON,
Our biggest
checking.
MENUOP~ONS)
enabling/disabling
rewrite
we had to
MENUOFTIONS*MENUOPnONSlDF_ENABLE17EMNUMBERS
B1:→(1500>⍴RNA)/E1+1
E1:❑←'*String too long'
3.3.3. Main Incompatibility:
AP124
by storing the information
and onloff
changed the previously ...+MDF_GEliUENUIlliM
gave no solace here and all those
and comprehensive statements to clumsy 2-liners like:
Using
for menuoptions
1 0
effect.
about enablingldisabling
of the
was
and their
this time difference
dummy
Writing
this
(designing, PC, ATARI 4. FILE
tool
for
the
PC
took
about
characters specifying different options, and POS specifies the coordinates of the surrounding frame. The AP124 worked quite differently. For this reason we decided to interject a machine-independent level (higher
3 hours
implementing, testing), inserting it in the and Mac version about an hour or so.
SELECTOR
Whenever
than
one wishes to access a file in APL.68000,
one may present the user a so-called “file-selector-box”. This enables the user to look
for the drive,
name of the file in a convenient
way. In APL.68000
file selector-box vFILE.DEFA [Ij FILE+ZEXT
is displayed
objects:
a dialogue
ANSW*MDF_DL4LOG
INFO)
to describe
is performed
the Now
by:
DDM
that In matrix
DEFA ULTPATH
matrix
the Dialogue-Definition-Matrix.
for each platform
path and
with
ULTPA TH MDF_FILESELECTOR STPUIFILE
the APL.68000
various
DDM
own line with
TEXT
each object of the dialogue information.
The first
box has its
two elements
in
each line specify the position of the top-left corner and the following two elements specify size; all positions are expressed as line- and column-numbers: 2 81 17 Elwhat is your age?
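The layout of one line of the Dialogue-Definition-Matrix (two position elements, two size elements, one element for the type of object, then text) can be made concrete with a small parser. This Python sketch is an illustration only: the whitespace-separated encoding, the class `DialogObject`, the function `parse_ddm_row` and the type code `T` are all assumptions, not the representation used in STAR.

```python
from dataclasses import dataclass


@dataclass
class DialogObject:
    """One object of a dialogue box, as described by one DDM line (hypothetical type)."""
    line: int      # top-left corner: line number
    column: int    # top-left corner: column number
    height: int    # size in lines
    width: int     # size in columns
    kind: str      # element 5: the type of object
    text: str      # remaining elements: textual information


def parse_ddm_row(row: str) -> DialogObject:
    """Split a row into its four numbers, its type code and its text."""
    parts = row.split(maxsplit=5)
    line, column, height, width = (int(p) for p in parts[:4])
    kind = parts[4]
    text = parts[5] if len(parts) > 5 else ""
    return DialogObject(line, column, height, width, kind, text)


# "T" is a made-up type code for a plain text object.
print(parse_ddm_row("2 8 1 17 T What is your age?"))
```

Keeping the description of a dialogue in data of this kind is what allows each platform to supply its own painting routine while the dialogue definitions themselves stay identical.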
V
I
Fig.6:
Fileselector
for APL2/PC
.
._. ____
❑ i_. {OK
720
I 1 {CANCEL
As this dialogue shows, element 5 in DDM type of object,
❑
U
Im
--------------
(left) and for APL.68000
{an
our own file-selector
so we had to write
Using
of
M.v. Welie,
we
programmed
DDM
interfaces
with
several
objects: a set of exclusive
alternatives
of which
exactly 1 and only one can be activated (chosen). checkboxes: options which can be activated or not. exit buttons: options which will terminate the dialogue.
APL.68000
toolbox
All in all, we worked roughly...
o Physically porting from Atari to PC 3 hours.
o Conversion of various names 6 hours.
o Conversion of non-standard features like ❑BOX, ❑ERROR, ❑SS and overbar comma 20 hours.
o Diamond problem (often substantial rewrite of code) 15 hours.
o Menu system 50 hours.
o File-selector box 80 hours.
o Dialogue-boxes, output-windows, etcetera 200 hours.
All together 374 hours. Compared to the Mac porting this statistic shows that porting between Atari and Macintosh
are complex
frame;
to this one.
o
5. DIALOGUE-BOXES
-
the
ANSW+MDF_DL4LOG
see fig.6.
radiobuttons:
of surrounding
AP124 underneath:
file-selectors: a scroll-window with files, directories (including parent and root) and all available drives, an [OK] and [CANCEL] button and a field for manual
Dialogue-boxes
comer
are relative
lEXT
It now supports the basic features as seen in many other
potential
text.
scroll-box. of top-left
all other positions
[1] ,,.
input;
updatable
scroll-box.
+ position
using AP124 beneath:
vFILEu-DEFA ULTPA TH MFDFILESELECTOR
text.
with optionally
exit button.
-D horizontal support,
are for
(F a radio-button of family F (different families have different characters instead of character F). [ a check box. J vertical
lacks file-selector
specifies the
positions
For example:
non updatable
•l input field,
(right). APL2/PC
and the remaining
textual information.
4 815 7912
(numeric provides
or character)
for the function
was much easier. DIALOG as in:
ANSW*POS DIALOG INFO
Here
INFO
is
a
charactermatrix
with
a line
of
information for each object in the dialogue-box. Each line in INFO contains textual information preceded by 8
3.4. The
optimisation.
Lessons
according
What did we learn from this conversion?
3. “Don’t
1. Again we suffered from sins of our past. We had not complied with the APL standards well enough and used non-standard features like ❑BOX, ❑ERROR, ❑SS and overbar comma. Similarly to the machine dependent functions,
we had not hidden those non-standard
This we learned the hard way.
The use of non-standard
APL features should have been
in utility
functions.
This would
the need to search the parent functions 2. The diamond ISO
APL
8485
(Kerf
[4]).
The
of
[2]). But lacking such a structured
the poor man’s substitute
of the diamond
making
or tools to manage user-interfaces. low-level is
for high program
much
better.
intermediate
For
in
variable
DDM
extension,
In the literature
the criteria
always
the highest
how
applied
system to paint the objects associated DDM variable.
on
screen
those of
that another
much
5, The )IN
and )OUT
APL2/PC
working
between
one
timings
TRNATHRT
the
expects
calculates
various
and get the
is atrocious.
Macintosh
+
Macintosh
IICX
on different
by
improvement
could
yield
very probable
and
the highest
point to that part
structure.
Look
at the
structure
that secondary
to
optimisation
So we first concentrated and only
of
machines in fig.7. no coprocessor
136 secs
8 MHz
no coprocessor
140 secs
16 MHz
with coprocessor
20 secs
8 MHz
with coprocessor
140 secs
33 MHz
with coprocessor
11 secs
CPU measurements. of the secondary
structure
requires
for a long wait because he has given the computer
we
subscribe in
the
of 15, 2 hours instead of 1, 5 minutes instead of 2, neither of those improvements will excite the user very much or reduce his irritation considerably. In short, sometimes optimisation
with big gains can still be unimportant
respect to reduction
of user-irritation.
So it is our view
one should
indiscriminately,
the
2 rules
not optimise
not even a particular
highest potential gain (the official only consider
strategy). the highest
with
the
No, one should
respect to user-irritation
those parts with
with
everywhere
selection
first
and
user-irritation.
This brings us to a corollary of the “Don’t everywhere”-rule; it states: 3. Don’t optimise, but reduce irritation only.
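A minimal sketch of this selection rule, written in Python rather than APL and meant purely as an illustration. The part names, CPU figures and irritation ranks below are hypothetical and are not measurements from STAR. The sketch contrasts choosing the part with the highest potential CPU gain with choosing the part users complain about most.

```python
from dataclasses import dataclass


@dataclass
class ProgramPart:
    name: str
    cpu_seconds: float     # hypothetical CPU time spent in this part
    irritation_rank: int   # 1 = most complained about, larger = less irritating


def highest_gain_first(parts):
    """The 'official' strategy: pick the part that consumes the most CPU."""
    return max(parts, key=lambda p: p.cpu_seconds)


def highest_irritation_first(parts):
    """The corollary: pick the part users are most irritated by."""
    return min(parts, key=lambda p: p.irritation_rank)


# Hypothetical figures, for illustration only.
parts = [
    ProgramPart("structure computation", cpu_seconds=1800.0, irritation_rank=3),
    ProgramPart("menu jumps", cpu_seconds=20.0, irritation_rank=1),
    ProgramPart("dialogue-boxes", cpu_seconds=15.0, irritation_rank=2),
]

print(highest_gain_first(parts).name)        # structure computation
print(highest_irritation_first(parts).name)  # menu jumps
```

Under these assumed numbers the two strategies pick different parts, which is the point of the corollary: a long computation the user expects to wait for can be left alone while a short but annoying interaction is tackled first.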
do it yet (that is, not until clear and unoptimised
a working
a big
chew. So he is resigned to wait some time and builds only very little irritation. Whether it will take 30 minutes instead
measure all parts with
on making
then,
fully
condensed
1. “Don’t do it” 2. “(For experts only) Don’t you have a perfectly solution)”.
one.
factor should be
several minutes for small RNA strings to several hours for very long strings. However the user realises that he is in
viewpoint which Jackson’s (Jackson [3] p.232):
Afterwards
is the most important
8MHZ
Now the computation
Background respect
speed.
and
are a boon.
After all those conversions we had STAR running on Atari, Macintosh and PC. The Atari and the Macintosh versions worked fine, but the PC version was very slow; the user-dialogue in particular was atrocious. So we decided to do an optimizing step for the PC-version only.
With
in
are think
to gain
to compute
AT
Fig.7:
4. Optimisation
4.1.
to optimise We
the secondary
Atari
a handy
APL.68000
factor
This analysis would
which
IBM
in APL2/PC
where
gain
on those parts which
IBM 486
4. Absence of mouse-support
for
to our view the most conclusive
concentrate
A higher level
and programmed
if and where necessary.
According to the literature one should first measure how much time the various program parts consume, determine
some functions
we
one can gain some speed? And
the USER-IRRITATION.
the
The AP 124 is too
reason
spirit
often damages
According
is better than
productivity.
this
same as:
So one should
gain, is also long overdue
the
thing and that optimisation
however
nothing. 3. APL2/PC
in
design and that it costs time and money. only optimise
resultant code without the diamond showed however that a structured extension in APL is long overdue (v. Batenburg
acted
do it everywhere”.
most important
for replacements.
clumsiness
we
how to determine where to optimise and where not? The spirit behind Jackson’s rules is that a good design is the
have spared us
is not standard either, although optional
then
Why not optimise wherever
feature
beneath cover functions.
hidden
And
to a third rule which we formulate
do
it
program.
did we start thinking
about
4.2. The Program
According to the strategy as outlined above, our first step was not to measure the timings of the various program parts, but to “measure” the irritation of the various program parts. There was no need to do this in a systematic way because every user of the beta-version complained about (in decreasing order of irritation):
1. Jumps between menus in PC version.
2. Dialogue-boxes in PC version.
So we chose to optimise the highest-irritation part first: the jumps between menus and dialogue-boxes in the PC version. The next step was to determine how much CPU time is spent in that program part.
We used the I-beam to measure the CPU time:
    “17⌶FUNCTION” to mark a function for timing registration.
    “19⌶FUNCTION” to unmark that function.
    “18⌶FUNCTION” to read the consumed time.
    “24⌶0”/“24⌶1” to mark/unmark all functions for timing.
Some CPU measurements are presented in fig. 8.

Fig. 8: Execution profiles: times for jumps from menu to menu and for dialogue-boxes.

Apparently, menu-jumps require much time in BFS_MPOPUP and BFSMPOPUP. Dialogues are consuming much time in BFS_MPOPUP too, but also in BFSUBOX and BFS_UBOX. So these are the first culprits which should be analyzed for improvement.
In programs BFSMPOPUP and BFS_MPOPUP both, small arrays are created on the fly. For example in BFSMPOPUP:
    ...(2 2⍴9 15 0 15)∧.=C←...
Wherever those constructions appeared within loops, we removed them out of the loop; for example:
    TABFLAG←2 2⍴9 15 0 15
    B1:...
    ...TABFLAG∧.=C←...        (TABFLAG instead of 2 2⍴...)
    →(CONDITION)/B1
Other, more substantial statements were removed out of a loop in BFS_MPOPUP towards the (once executed) BFS_UMPOPUP.
    B1:(3>2>C)+(1 +-Pi >?)t [2]”” B-32z> C ,51:+(S >1-1+ 1)/Bl
Another improvement in the BFS...-programs was the following. To catch any unwanted result and send it into never-never land we frequently used (within often executed loops and/or programs):
    0 0⍴FUNCTION MATRIX
Experiments taught us that:
    DUMMY←FUNCTION MATRIX
yielded an improvement of 14% in function BFSSCREEN.

4.3. Result
Table fig.9 shows the result. Conspicuous is the deterioration of BFS_UMPOPUP: -232%. This was intentional; this function was called only once, whereas this change would improve BFS_MPOPUP which is called several times.
Another interesting observation is the deterioration of BFS_EXEC. It is probably due to very small fluctuations of memory management of the APL2/PC interpreter, as no changes were actually made to BFS_EXEC. Fig.9 shows that the deteriorations in BFS_UMPOPUP and BFS_EXEC, although relatively impressive, are not that great in absolute values. The improvements of BFS_MPOPUP and BFSMPOPUP are much more noticeable though.
The overall effect was slightly more than 25% improvement in speed. Although not very impressive, it was fairly noticeable. As we saw no regular way for more substantial improvements (apart from throwing away the AP124 and programming in C) we decided to stop at this point.

5. Conclusion
This case-study had several lessons for us in store, which could be useful to other projects as well.
First, we learned to be alert to machine-dependent aspects in our programs. In porting from Atari to Macintosh we suffered from several of such incompatibilities and we tackled them by hiding them underneath MDF-functions. In the subsequent porting to PC those changes proved worth their while.
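The idea of hiding machine-dependent details underneath MDF-functions can be sketched as follows. This is an illustration in Python, not the authors' APL code: the function names, the choice of path separators as the machine-dependent detail, and the file names are all assumptions made for the example.

```python
# Hypothetical cover function in the spirit of the MDF-functions: callers never
# see the platform-specific detail, so a port only has to touch this one place.

_SEPARATORS = {
    "atari": "\\",      # Atari TOS uses backslashes, like DOS
    "pc": "\\",         # DOS
    "macintosh": ":",   # classic Mac OS
}


def mdf_join_path(platform, parts):
    """Join path components with the separator of the given platform."""
    if platform not in _SEPARATORS:
        raise ValueError(f"unsupported platform: {platform!r}")
    return _SEPARATORS[platform].join(parts)


def data_file_path(platform):
    """Application code: contains no platform-specific knowledge itself."""
    return mdf_join_path(platform, ["STAR", "DATA", "EXAMPLE.SEQ"])


print(data_file_path("pc"))         # STAR\DATA\EXAMPLE.SEQ
print(data_file_path("macintosh"))  # STAR:DATA:EXAMPLE.SEQ
```

Adding a further platform then means adding one entry to the table, without searching the calling functions for replacements.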
Next, we experienced similar problems with nonstandard APL like ⎕BOX, ⎕SS and overbar comma. Again, we learned to hide those features underneath UTL-functions or to use the standard. We suffered (as we often did in the past) from the lack of a good structural construct in APL. The poor substitute of the diamond was unfortunately inappropriate in APL2/PC.
The dialogue functions in APL.68000 as well as in AP124 proved too low-level for great productivity. We introduced a higher level variable, the Dialogue-Definition-Matrix instead.
The menu-system constructed with AP124 proved too slow for basic PC’s. Contrary to the “official” strategy, we didn’t optimise the program at potential high-gain spots, but applied our rule “don’t optimise, but reduce irritation”. This led us to optimisation of the user dialogue instead of optimisation of the calculation part.

6. References
1. Abrahams, J.P., Berg, M.v.d., Batenburg, E.v. & Pleij, C. Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucleic Acids Research 18(10)3035-3044 (1990).
2. Batenburg, F.H.D.v. L1..2..3 considered harmful. QQ.21(4)330-337 (1991).
3. Jackson, M. Principles of program design. New York, Academic press (1975).
4. Kerf, J.L.F. de. Second generation APL and ISO APL Extended (APL 9x). APL-CAM J. 13(1)189-270 (1991).
5. Mavor, J.W. & Manner, H.W. General biology p.565; New York, MacMillan Company (1966).
6. Zuker, M. & Stiegler, P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acid Research 9(1)133-148 (1981).

Fig.9: CPU improvements after optimisation.

Functions        CPU before    CPU after    gain(%)
BFSSCREEN            240          205.6        14.3
BFSCLOSE             126          125.3         0.6
BFSFBG               436          362          17.0
BFS_UMPOPUP          162          538        -232.1
BFSMPOPUP           5636         4196          25.6
BFS_MPOPUP         12747.2       9200.8        27.8
BFS_EXEC              46           76         -65.2
...
TOTAL:             26348        21317.7        19.1
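The gain(%) column of fig.9 appears to be the relative CPU saving, 100 × (before − after) / before, so that a negative value marks a function that became slower. This reading is an assumption, but it reproduces the printed percentages. A short check in Python (used here for illustration only, with the before/after values as printed in fig.9):

```python
def gain_percent(cpu_before, cpu_after):
    """Relative CPU saving in percent; negative means the function got slower."""
    return 100.0 * (cpu_before - cpu_after) / cpu_before


# (before, after) pairs as printed in fig.9
rows = {
    "BFSSCREEN": (240, 205.6),
    "BFS_UMPOPUP": (162, 538),
    "BFS_MPOPUP": (12747.2, 9200.8),
    "TOTAL": (26348, 21317.7),
}

for name, (before, after) in rows.items():
    print(f"{name:12s} {gain_percent(before, after):7.1f}")
```

The large negative figure for BFS_UMPOPUP comes from a small absolute increase on a small base, which is why the relative deterioration looks more dramatic than its effect on the total.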