Porting and Optimizing Star - Association for Computing Machinery

Porting and Optimizing Star; A Case Study of Suffering and Surfacing

F.H.D. van Batenburg 1); V. Bos; J.J.M. Riethoven; J.P. Abrahams; C. Pley 2)

1) Institute for Theoretical Biology, Kaiserstraat 63, Leiden, the Netherlands.
2) Leiden University, dept. of Biochemistry, PO Box 9502, 2300 RA Leiden, the Netherlands.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1992 ACM 0-89791-478-3/92/0007-0265 $1.50

APL Quote Quad

1. Abstract

This paper actually has three main parts. The first part is a description of our project STAR. It explains what our program does and what makes it special compared to other similar programs. The second part pertains to the porting of our program to other platforms. It enumerates the errors of our ways, how to overcome them and what we learned in the process. The last part narrates our strategy to optimise the program. We will explain why we deviated from the generally accepted strategy to optimise those parts with the highest potential gain.

2. Introduction

This paper reports our strides with program STAR. In particular we elaborate upon our porting experiences and about our experiences with optimizing. At first, however, a short introduction of the program itself is in order. This introduction will present a very brief background in the biochemical problem which our program intends to solve, followed by some particulars which make our program unique.

2.1. Biological Background

In order to explain our program STAR we will present a very superficial introduction to some aspects of cell biochemistry, and to the role of RNA in particular.

What is RNA? Now most of you might be more or less familiar with DNA, which carries the genetic code in our cells, but what is RNA? RNA is very similar to DNA. A popular but adequate analogy is to imagine the DNA as a closed zipper. A special enzyme opens the zipper and along a part of one of the DNA zipper strands it constructs a type of mirror-image. This mirror image is our RNA. It is more or less a cast of a part of the unzipped DNA template. RNA is a single strand and usually much shorter than DNA (being only a cast of a part of the DNA).

RNA is found in all cells of plants and animals. RNA is the basic controller of most of the production processes in the cell. Among other things, it determines the types of proteins which are produced in those cells. Such a RNA string is a long sequence of basic building blocks, the so-called nucleotides, which are arranged in a very particular sequence. The basic building blocks of a RNA string are restricted to 4 different types of nucleotides. For convenience they are characterised as A, U, C and G, which stand for the 4 different bases in the nucleotides. It is the sequential arrangement of these nucleotides which determines the composition of the proteins in the cell as encoded by that RNA strand. For example each triplet of combinations from A, C, G and U will command the incorporation of a particular amino-acid into the growing protein chain. And these amino-acids in turn are the basic building blocks of all kinds of proteins. See fig. 1, from Mavor and Manner [5].

In the beginning, research was focused on the composition and sequence of the nucleotides in the RNA-string only. After all, this arrangement alone determines the amino acid sequence in the proteins, and deficiencies could often be explained by a peculiar arrangement of the nucleotides. It appeared however that the nucleotide arrangement as such could not explain everything of the protein-synthesis process. For the production of proteins, RNA is responsible together with special enzymes, which attach themselves to specific sites of the RNA. Now sometimes those enzymes seem unable to get a hold on a perfectly normal RNA string.

APL 92

Fig. 1: Protein production by RNA (Mavor & Manner [5]).

The explanation for this can be found in what is called the secondary structure, as opposed to the primary structure, which is simply the sequential arrangement of nucleotides.

What is the secondary structure of RNA? Imagine RNA as a long string floating leisurely in the fluid of the cell. As you well know about your shoe laces, such long strings have an unbelievable tendency to form knots and tangles everywhere. This is also true for RNA strings. Instead of quietly floating as a long, stretched, relaxed string, some spots of the string are attracted to other spots and tend to glue together by so-called "base-pairing", thus forming an intricate tangle of spaghetti. Now, contrary to the spaghetti on your plate, the RNA tangle is very specific and is called the secondary structure. The shape of that secondary structure explains why some enzymes are able to bind to the RNA at specific locations and others cannot. Some fit perfectly well in the crevices of the spaghetti tangle, others do not.

2.2. Program Star; The Basic STAR Engine

Our program is designed to compute the secondary structure of RNA. That is why STAR stands for Structural Analysis of RNA. It is not the first program which computes secondary structures. The most famous program is that of Zuker [6]. How does that program work? Zuker's program computes all possible base-pairings in a RNA string and it computes how great those attractions are. For example the sequence CACCCU may glue to AGGGUG 1) and yield an energy gain of 15.9 kcal/mol (which is an expression of the attraction forces). After computation of all possible attractors, the program computes all types of combinations. For example if the sequence CUC is attracted to a sequence GAG next to it as well as to a similar sequence GAG much further away, this yields two different combinations. The first one requires a very sharp kink in the string (which by the way costs more energy) whereas the other one forms a very shallow loop. Or the string CACCCUC can either pair with ?AGGGUG (remember that the second string is written reversed!), leaving the last C unpaired, or it may pair with GAG???? elsewhere, leaving the first nucleotides CACC unpaired. The program computes the energy gain of each combination and yields the one with the highest gain.

1) A "normal" person might write that CACCCU glues to GUGGGA, but biochemists write the second string reversed. So CACCCU with AGGGUG means that C glues to G, A to U, three times C to G and finally U to A.

Our program STAR differs from Zuker's program in that we do not compute all possible combinations in one step (Abrahams [1]). Our reasoning was that the RNA strand does not turn immediately from a stretched string into its spaghetti tangle in one big bang. On the contrary, we expect this to be a gradually evolving process in which the biggest attractions are formed first. Therefore, our program starts with establishing the biggest attraction first, then the next biggest one and so on. In each step, the program excludes more and more potential alternatives that conflict with that particular step. So our program does not compute all possible alternatives, and the secondary structure of our prediction is not necessarily the one with the highest energy gain. Still our program has a higher percentage of correct predictions in some cases (Abrahams [1]) and demands far less computer power.

Although STAR has other unique characteristics (for example it was the first program to predict a very special type of spaghetti tangle which biochemists call "pseudoknots"), this simulation-like computation is unique.

2.3. Program Star; The User Interface

The basic engine of STAR was programmed in APL by J. P. Abrahams as a student at our University department in 1985. Many of the biochemical ideas were invented by him and implemented in the program. In order to make this program accessible for others we decided to embed this basic engine within a more user-friendly interface. This was done by V. Bos, another student who worked at our department in 1989. He designed and implemented the user interface as it is nowadays, with menus and dialogues for the Atari and Macintosh implementations of STAR. Finally, J.J.M. Riethoven exported STAR to the PC, using several functions from M.v. Welie.
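As an aside on the engine of section 2.2, the stepwise idea (biggest attraction first, then exclude whatever conflicts with it) can be sketched as follows. This is only an illustration in Python, not the APL code of STAR. The names `glues`, `Attraction` and `greedy_structure` are hypothetical, only the C-G and A-U pairs mentioned in the footnote are assumed to attract, and "conflict" is simplified here to "uses a nucleotide that is already paired".

```python
from dataclasses import dataclass

# Assumption: only the pairs named in the footnote (C-G and A-U) attract.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def glues(first: str, second_reversed: str) -> bool:
    """Check a pairing where the second string is written reversed."""
    partner = second_reversed[::-1]
    return len(first) == len(partner) and all(
        (a, b) in PAIRS for a, b in zip(first, partner)
    )


@dataclass(frozen=True)
class Attraction:
    positions: frozenset  # nucleotide positions used by both halves
    gain: float           # energy gain in kcal/mol


def greedy_structure(candidates):
    """Establish the biggest attraction first, then the next biggest, and so on."""
    chosen = []
    used = set()
    for cand in sorted(candidates, key=lambda c: c.gain, reverse=True):
        # A candidate that reuses an already paired nucleotide is excluded.
        if used.isdisjoint(cand.positions):
            chosen.append(cand)
            used |= cand.positions
    return chosen
```

Because each step only discards alternatives, the result need not be the combination with the highest total gain, which matches the remark in section 2.2.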

The central interface with the user is the menu with the alternatives [File], [Primary], [Secondary] and [Energy]; see fig.2. To get the idea we show some of the basics in the figures of fig.2 and fig.3.

Fig.2: Menu options of STAR.

In the [Primary] menu the user must enter the RNA sequence (the primary structure). This can be done by reading an ASCII file, or by typing the nucleotides. This menu also enables the user to view the string or to edit it as he/she sees fit.

The core of the program is invoked by the [Secondary] menu. Here the user can instruct the program to calculate the secondary structure. Here the user can also instruct the program to show the computed structure in various ways on screen (see fig.3) or to print it.

Fig.3: Various views of secondary structure.

3. Porting; Highlights & Lowlights

STAR was originally developed on an Atari microcomputer (1040 ST). That type of machine has a power which is comparable to Macintosh machines, but they are very cheap. Unfortunately not many scientists use Ataris, so in order to make our program available to others we had to port it to Macintosh or to PC.

3.1. Atari → Macintosh

Our first step was to port STAR from Atari to Macintosh. This step raised various problems. The first one was the transportation from Atari to Macintosh workspace, and the second one was to get the program working.

1. Floppy transportation

Atari floppies are formatted similarly to DOS floppies. This is very different from the way the Macintosh formats its floppies. Fortunately, shortly after our decision to port to Macintosh, we got a Macintosh IIcx. The drive of this machine was able to read DOS floppies. To our pleasure the Macintosh could read the Atari floppies; at least the directories and the files. The Macintosh could not understand the contents of the workspaces though. So we had to program a system to write the Atari workspaces in a converted format which could be read in the Macintosh. This was not built from scratch, but was an adaptation from a home-made program COMMAPL, which writes workspaces more or less in a "transfer-like" format. All in all this step took us about 2 hours.

2. Different names of comparable (device dependent) routines

Although most programs use the standard APL functions, several device-dependent actions are performed by special locked system programs which are provided by APL.68000. Unfortunately, some functionally comparable programs have different names in Atari and Mac. See for example the list of fig.4.

    Atari       Macintosh   Function
    BEE         WATCH       Set cursor busy
    STCREATE    MCREATE     Create new file
    STTIE       MTIE        Open file
    STREAD      MREAD       Read file
    STWRITE     MWRITE      Write file

Fig.4: Different names for comparable programs.

Of course it was evident that we should replace those machine-dependent modules by their Macintosh counterparts. So WATCH, MCREATE, MTIE, MREAD, MWRITE and many more programs were imported from the workspaces that APL.68000 supplies for the Macintosh, and BEE, STCREATE, STTIE, STREAD, STWRITE and all the other Atari alternatives were removed. Unfortunately however, we had to deal with the calls to those programs too. So the different names for comparable functions required us not only to replace those programs, but the different names also forced us to change the calls in the parent modules. This forced us to look into all the programs and check if they called any one of those particular ST-functions. All in all this took us several hours.

3. Different input/output for comparable functions

We discovered that comparable machine-dependent functions were not only different in name, but for their inputs as well. This applied, for example, to the dialogue-functions MAKEDIALOG, ALERT and DIALOG.

The user-interface of STAR relied heavily on menus and dialogue-boxes. In the menu functions again we had to tackle the problem of different names for comparable functions as outlined in the previous paragraph. Moreover, another problem was that the steering of those functions was different. We had to deal with the following categories:

o Identical names with different inputs as in:
    Atari:     ENABLEITEM 15 1     (15 = option "view", 1 = enable)
    Macintosh: ENABLEITEM 11 7 1   (11 7 = option "view", 1 = enable)
o Different names with different outputs as in:
    Atari:     GETITEM   (returns 2 numbers)
    Macintosh: GETMENU   (returns 3 numbers)
o Different names and different inputs as in:
    Atari:     SETMENUS charactermatrix
    Macintosh: SETMENU numbers matrix ... DRAWMENUBAR

The main discrepancies were due to different addressing of the menu options. The menu options of fig.2 are addressed by Atari as number 1, 2, 3, 4, ... etc. For example option [Files/Quit] is number 7 (notice that "Files" and horizontal lines have a number too) and option [Primary/Open] is number 11. With the Macintosh, each option is addressed by 2 numbers; the columns are numbered 10, 11, 12, ... and the rows in each column as 1, 2, 3 and so on. So option [Files/Quit] is number-pair 10 6 and option [Primary/Open] is number-pair 11 3.

To deal with these differences, we decided to hide them. On a "functional level" we opted for the Atari numbering system and we replaced all menu-functions with MDF-cover-functions. Those MDF-cover functions are functions of which the name starts with MDF to indicate a Machine-Dependent-Function. Those cover functions use the Atari sequential numbering system at the outside, but inside they convert to the proper numbering system for the particular machine. See fig.5.

    Atari:
        ∇MDF_ENABLEITEM R
        [1] ENABLEITEM R
        ∇
        ∇Z←MDF_GETMENUITEM
        [1] Z←GETITEM
        ∇

    Macintosh:
        MVRmenu←(10 1)(10 2) ... (14 5)
        ∇MDF_ENABLEITEM R
        [1] ENABLEITEM MVRmenu[1↑R],1↓R
        ∇
        ∇Z←MDF_GETMENUITEM
        [1] Z←MVRmenu⍳⊂GETMENU
        ∇

Fig.5: Simplified examples of cover functions for menu functions.

4. Font problems

The use of fonts posed another problem. In the Atari we could load different fonts using the supplied program SETFONT. Atari fonts are nonproportional and their on-screen position is addressed by line- and column number. The Macintosh has different fonts. It has no SETFONT, but we could switch to another system font though. Unfortunately those fonts are proportional and their on-screen position is addressed by the pixel position. Furthermore, putting two different characters on the same spot would result in a preempting of the last character with the Atari, and in an overstrike with the Macintosh. These font-differences required several days to solve.

5. Hardware differences

Screen size was different for Atari and Macintosh. We discovered this immediately, as dialogue-boxes which were pleasingly centred on the Atari screen appeared at the far right with the Macintosh. Another seemingly trivial problem was the difference in keyboards. A Macintosh + has no [Insert] key, nor a [Delete] key.

Altogether these conversions were accomplished in...

o Porting from system to system 2 hours.
o Conversion of different names 4 hours.
o Conversion of menu differences 16 hours.
o Font problems 16 hours.
o Positioning of dialogue-boxes 1 hour.
o Various small function adaptations 1 hour.

So porting was not such a major effort. Nevertheless, that multitude of small deviations between the two platforms required, as you can see, about 40 hours.
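The conversion that such a menu cover function performs (Atari sequential numbers on the outside, Macintosh number-pairs inside) can be illustrated with a small Python sketch. It is not the APL of fig.5. The function name and the `rows_per_column` argument are hypothetical; the row counts used in the usage note are inferred only from the three examples in the text ([Files/Quit], [Primary/Open] and option "view"), so the real menus may be longer.

```python
FIRST_MAC_COLUMN = 10  # the text numbers the Macintosh columns 10, 11, 12, ...


def atari_to_mac(number, rows_per_column):
    """Convert an Atari sequential menu number to a Macintosh (column, row) pair.

    rows_per_column lists, per menu column, how many rows (options and
    horizontal lines) it holds; on the Atari the column title is numbered too.
    """
    title = 1  # Atari number of the current column title
    for index, rows in enumerate(rows_per_column):
        if title < number <= title + rows:
            return (FIRST_MAC_COLUMN + index, number - title)
        title += rows + 1  # skip this column's rows and the next title
    raise ValueError(f"{number} is a menu title or lies outside the menu")
```

With `rows_per_column = [6, 7]`, Atari number 7 maps to the pair 10 6, number 11 to 11 3 and number 15 to 11 7, in line with the examples above.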

The code amounted to 231 functions; altogether 3144 lines, of which a third (1285 lines) was comment.

3.2. The Lessons

What is to be learned from this experience?

1. First of all we learned the hard way never again to call non-standard functions directly but to substitute cover functions instead. Those cover functions had names starting with the characters MDF. So MDF_READ was calling STREAD in the Atari implementation and MREAD in the Macintosh implementation. In this way, there was no need to scan all parent functions for potential changes, because there would not be any reason to change them in future portings. Of course, now all MDF... functions have to be changed though, no more, no less.

2. To spare APL programmers this hunt for calls in parent functions, we suggest to MicroAPL that they give their machine dependent functions identical names on different platforms if they perform similar tasks.

3.3. Porting → PC

Although several biochemists applied for our Macintosh version, we got many requests for a PC version. So our next step was to port the Atari/Mac product to the PC.

The first decision to take was which APL system to use for the PC version. Although STSC was the standard APL system in use at our department at that time, we opted for IBM's APL2/PC version instead. The primary reason was the availability of a free runtime version. This is essential in an academic environment where products are distributed to colleagues for free or for a moderate fee.

3.3.1. Physical Transport

Again, the first problem was to transport the basics into the PC. Fortunately, at that time APL.68000 came with a level II version with nested arrays which claimed to have )IN and )OUT compatible with IBM. Another fortunate thing was that Atari floppies are formatted similarly to DOS. So we ")OUT"-ed STAR from the Atari to a floppy and ")IN"-ed it into IBM's APL2 at a PC. To our surprise and delight, most functions and variables got imported into APL2/PC without apparent problems. Hurray for the )IN and )OUT standardisation!

3.3.2. First Incompatibilities: Nonstandard Features

Our initial joy was tempered quickly when we discovered that we had used several nonstandard features of APL.68000, although some of them are optional in the ISO standard (Kerf [4]).

1. The best one was ⎕BOX, which reshapes a vector to a matrix (and vice versa).

2. Another one was ⎕SS(source;searchstring;replacement), which replaces substrings.

We realised that nothing is wrong with using such handsome powerful functions as ⎕BOX and ⎕SS, but that we should have hidden their use like we did with the machine-dependent functions. Contrary to those functions, these features are not related to file- and screen-management, but are more in the nature of utilities. Therefore, we hid these functions not below MDF-functions, but below UTS-functions. For example...

    Atari & Macintosh:
        ∇Z←L UTS_BOX R
        [1] 'Z←⎕BOX R' ⎕EA 'Z←L ⎕BOX R'
        ∇

    PC:
        ∇Z←L UTS_BOX R
        [1] lL+ll ~~(•-~ lL&L~
        [2] Z-3( -REL)CR
        ∇

3. The 'overbar comma' in the ATARI/Mac version had to be replaced everywhere by ,[1].

4. In APL.68000 we had used ⎕ERROR to trap errors in the main program. Whenever an error occurred, an appropriate message was given and control returned to the main "menu-loop" for continuation. In such a case, the particular action was terminated, but after the error message the main program continued and no variables local to the main program were lost. APL2/PC has no ⎕ERROR but has ⎕EA instead. We don't know yet how to replace ⎕ERROR by ⎕EA without seriously affecting the system-design though.

5. A very particular non-standard feature was the diamond. All our programs were heavily diamond-ed. As APL.68000 supports the diamond, but APL2/PC does not, we had to convert all programs. The first attack was to change the diamond by character ∆ and add the following function.

        ⎕FX 'Z←L ∆ R' 'Z←L'

This was done in the original Atari and Macintosh programs too, in order to keep the bulk of the three versions identical. This solved about half of our problems with diamond-ed statements.
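Why a dyadic ∆ function can stand in for the diamond only where the order of the sub-statements does not matter can be shown with a small model. This is a Python sketch, not APL: sub-statements are represented as callables, and the two helper names are hypothetical. It relies only on the rule that APL evaluates the right argument of a function before the left one, whereas diamond-separated sub-statements run from left to right.

```python
def run_diamond(*statements):
    """Model of 'a ◇ b ◇ c': sub-statements run from left to right."""
    for statement in statements:
        statement()


def run_delta(*statements):
    """Model of 'a ∆ b ∆ c': every right argument is evaluated first,
    so the sub-statements effectively run from right to left."""
    for statement in reversed(statements):
        statement()


def demo():
    """Return the execution order produced by both separators."""
    log = []
    run_diamond(lambda: log.append("first"), lambda: log.append("second"))
    diamond_order = list(log)
    log.clear()
    run_delta(lambda: log.append("first"), lambda: log.append("second"))
    return diamond_order, list(log)
```

Independent sub-statements give the same end result either way; a line that depends on the left-to-right order does not.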

The ∆ now blends inconspicuously into the APL syntax: similar to the "+", a line with sub-statements separated by ∆ is parsed from the very right to the very left. Unfortunately the diamond works differently from ∆ with regard to order. Although APL is processed from right to left, diamond-ed statements are processed from left to right. This characteristic was often used within the program for small conditional checking tasks, in which the sub-statements are compact and yet cohesive enough among themselves for 1-liners like:

        B1:→(1500>⍴RNA)/CALC ◇ ⎕←'* String too long'

Using ∆ gave no solace here and all those compact and comprehensive statements had to be rewritten to clumsy 2-liners like:

        B1:→(1500>⍴RNA)/E1+1
        E1:⎕←'* String too long'

3.3.3. Main Incompatibility: User interface

Our biggest problem was the user interface. We had to rewrite it completely, using auxiliary processor AP124. Nevertheless, we were anxious to change as little as possible and only replace or add functions low in the functional hierarchy of the program-system. For example, the sharing and retraction of variables with the auxiliary processor was done by newly introduced functions MDF_RNAINIT and MDF_RNAEND. These dummy functions were also included in the Atari and Macintosh versions (but empty of course) to keep the same function tree.

The specific user-interface programs which were necessary to design anew were:

1. MENUS

The menu functions are already explained for Atari and Macintosh. The menu is painted on screen by...

        MDF_SHOWMENU MENUOPTIONS

where MENUOPTIONS is a charactermatrix with the texts of fig.2 in each row. User action is polled by...

        ...←MDF_GETMENUITEM

which yields the number of the option chosen (0 for no action). Menu items are disabled/enabled and checked on/off by...

        MDF_ENABLEITEM NUMBERS
        MDF_CHECKITEM NUMBERS

Here NUMBERS enumerates the menu-numbers, and their sign sets enabling on/off and sets checking on/off. This calls the machine-programs ENABLEITEM, DISABLEITEM and CHECKITEM directly. These commands are set independent of the final menu-painting and also independent of the polling of the user-response. Once those commands are given, they lie dormant and their effect becomes apparent at the time the menu is painted on screen and at the time that the user is scrolling through the menu-options.

In the PC-version we had to use AP124. Although we expected that the introduction of the MDF-functions would ease the work of porting considerably, we were mistaken; ultimately we had to change our design completely once more. The reason was the aforementioned time difference between issuing the enabling/disabling and on/off checking information and their effect. The difference is that in APL.68000 this information is stored by the system. For the PC however, we had to do this ourselves, by storing the information about enabling/disabling and on/off checking in the text-matrix for menuoptions (variable MENUOPTIONS). This information is used by program MDF_SHOWMENU to paint a proper menu, and by MDF_GETMENUITEM to walk only along "enabled" options. So we changed the previously mentioned programs into:

        MENUOPTIONS←MENUOPTIONS MDF_ENABLEITEM NUMBERS
        MENUOPTIONS←MENUOPTIONS MDF_CHECKITEM NUMBERS

In the Atari and Macintosh versions the resultant MENUOPTIONS is not used, because there the last two programs call the machine-programs directly.

2. ALERT-BOXES

In APL.68000 alert-boxes are available by the program...

        3 ALERT 'This message on screen'

The number at the left is an indicator of the severity of the message, which is reflected in an appropriate icon. In APL2/PC we had to muddle along with AP124. This processor doesn't display icons, so we wrote a little function that combined the severity code (as text) with the message. The resultant code was hidden beneath MDF_ALERT for Atari, Mac and PC.

3. OUTPUT WINDOWS

Often output exceeds the available screen size. This is no problem in APL.68000, where output is displayed in a window with scrollbar. In APL2/PC however, output which extends beyond the screen will scroll over the top and be lost irrevocably. To meet this problem we used AP124 again and wrote a little program UTS_DISPLAY which used the pop-up facility [FYI: call 13] of AP124. For example, to display the secondary structure of a RNA in lego-view we changed...

        ⎕←rna_structure RNA_DISPLEGO rna_sequence

into:

        UTS_DISPLAY rna_structure RNA_DISPLEGO rna_sequence

In the Atari and Macintosh versions we included function UTS_DISPLAY:

        ⎕FX 'UTS_DISPLAY R' '⎕←R'

and inserted it at the appropriate places. Writing this tool for the PC took about 3 hours (designing, implementing, testing), inserting it in the PC, ATARI and Mac version about an hour or so.
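The PC menu design of item 1 above (the enable state travels with the menu data, and the sign of each number switches an option on or off) can be modelled in a few lines. This Python sketch is an illustration only: the dictionary stands in for the extra information stored in MENUOPTIONS, and both function names are hypothetical.

```python
def apply_signed(state, numbers):
    """Return a new state: a positive number switches that option on,
    a negative number switches it off (cf. MDF_ENABLEITEM NUMBERS)."""
    new_state = dict(state)
    for number in numbers:
        new_state[abs(number)] = number > 0
    return new_state


def next_enabled(enabled, current):
    """Walk to the next enabled option, wrapping around at the end.

    Disabled options are skipped; if nothing else is enabled the
    current option is returned unchanged.
    """
    numbers = sorted(enabled)
    start = numbers.index(current)
    for offset in range(1, len(numbers) + 1):
        candidate = numbers[(start + offset) % len(numbers)]
        if enabled[candidate]:
            return candidate
    return current
```

Returning a new state instead of changing hidden system state mirrors the form `MENUOPTIONS←MENUOPTIONS MDF_ENABLEITEM NUMBERS`: the effect stays dormant until the menu is painted or walked.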

Porting

dummy

and Optimisiwj

Star

Writing

this

(designing, PC, ATARI 4. FILE

tool

for

the

PC

took

about

characters specifying different options, and POS specifies the coordinates of the surrounding frame. The AP124 worked quite different. For this reason we decided to interject a machine-independent level (higher

3 hours

implementing, testing), inserting it in the and Mac version about an hour or so.

SELECTOR

Whenever

than

one wishes to access a file in APL.68CIO0,

one may present the user a so-called “file-selector-bc)x”. This enables the user to look

for the drive,

name of the tile in a convenient

way. In APL.68000

file selector-box vFILE.DEFA [Ij FILE+ZEXT

is displayed

objects:

a dialogue

ANSW*MDF_DL4LOG

INFO)

to describe

is performed

the Now

by:

DDM

that In matrix

DEFA ULTPATH

matrix

the Dialogue-Definition-Matrix.

for each platform

path and

with

ULTPA TH MDF_FILESELECTOR STPUIFILE

the APL.68000

various

DDM

own line with

TEXT

each object of the dialogue information.

The first

box has its

two elements

in

each line mecify the position of the top-left comer and the following two elements specify size; all positions are expressed as line- and column-numbers: 2 81 17 Elwhat is your age?

V

I

Directory: n :\*, x________________________________ Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx x x File XXXX Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx x x D~r xxxx C:\ STAR\* * Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxmxxn x x A: xXXXXXXXXXXXXXXXXXXXXX OK Xxxxxxxxxxxxxx xxx B : Xxxxxxxxxxxxxx xxx c : XXXXX CANCEL X xxx \

0 !--------’-----DESKTOP,INF RES-SHIPot)CC ,DOC 1--1 STIIR

:+:STRR----

----- —--,---------!---

lo_. _..-_._

Fig.6:

Fileselector

for APL2/PC

.

._. ____

❑ i_. {OK

720

I 1 {CANCEL

As this dialogue shows, element 5 in DDM type of object,



U

Im

--------------

(left) and for APL.68000

{an

our own file-seledor

so we had to write

Using

of

M.v. Welie,

we

programmed

DDM

All in all. , we worked roughly . . . . Physically porting from A-tari to PC 3 hours.

interfaces

with

several

0

objects: a set of exclusive

alternatives

Diamond

problem

(often substantial

15 hours. 0 Menu system 50 hours. 0 File-selector box 80 hours. output-windows, 0 Dialogue-boxes,

of which

exactly 1 and only one can be activated (chosen). checkboxes: options which can be activated or not options which will terminate the exit buttons: dialogue. -teXJ

rewrite

❑BOX, of code)

etcetera 200 hours.

All together 374 hours. Compared to the Mac porting this statistic shows that porting between Atari and Macintosh

&J@ M

APL.68000

toolbox

0 Conversion of various names 6 hours. of non-standard features like 0 Conversion ❑ERROR, ❑ ss and overbar comma 20 hours.

are complex

frame;

to this one.

o

5, DIALOGUE-BOXES

-

the

ANSW+MDF_DL4LOG

see fig.6.

radiobuttons:

of surrounding

AP124 underneath:

file-selectors: a scroll-window with tiles, directories (including parent and root) and all available drives, an [OK] and [CANCEL] button and a field for manual

Dialogue-boxes

comer

are relative

lEXT

It now supports the basic features as seen in many other

potential

text.

scroll-box. of top-left

all other positions

[1] ,,.

input;

updatable

scroll-box.

+ position

using AP124 beneath:

vFILEu-DEFA ULTPA TH MFDFILESELECTOR

text.

with optionally

exit button.

-D horizontal support,

are for

(F a radio-button of family F (different families have different characters instead of character F). [ a check box. J vertical

lacks tile-selector

specifies the

positions

For example:

non updatable

•l input field,

(right). APL2/PC

and the remaining

textual information.

-——_-——— ,--Ipiiq -— —— ----____

Xxxxxxxxxxxxxx Xxxxxxxxxxxxxx Xxxxxxxxxxxxxx Xxxxxxxxxxxxxx

XXX A xxx !tSTAR XXX ST ARPACK. EXE xxx Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

4 815 7912

(numeric provides

or character)

for the fimction

was much easier. DIALOG as in:

ANSW*POS DIALOG INFO

Here

INFO

is

a

charactermatrix

with

a line

of

information for each object in the dialogue-box. Each line in INFO contains textual information preceded by 8

APL Quote

Quad


3.4. The

optimisation.

Lessons

according

What did we learn from this conversion?

3. “Don’t

1. Again we suffered from sins of our past. We had not complied with the APL standards well enough and used non-standard features like ⎕BOX, ⎕ERROR, ⎕SS and overbar comma. Similarly to the machine dependent functions,

we had not hidden those non-standard

This we learned the hard way.

The use of non-standard

APL features should have been

in utility

functions.

This would

the need to search the parent functions 2. The diamond ISO

APL

8485

(Kerf

[4]).

The

of

[2]). But lacking such a structured

the poor man’s substitute

of the diamond

making

or tools to manage user-interfaces. low-level is

for high program

much

better.

intermediate

For

in

variable

DDM

extension,

In the literature

the criteria

always

the highest

how

applied

system to paint the objects associated DDM variable.

on

screen

those of

that another

much

5, The )IN

and )OUT

APL2/PC

working

between

one

timings

TRNATHRT

the

expects

calculates

various

and get the

is atrocious.

Macintosh

+

Macintosh

IICX

on different

by

improvement

could

yield

very probable

and

the highest

point to that part

structure.

Look

at the

structure

that secondary

to

optimisation

So we first concentrated and only

of

machines in fig.7.

Clock    Coprocessor         Time
8MHz     no coprocessor      136 secs
8MHz     no coprocessor      140 secs
16MHz    with coprocessor    20 secs
8MHz     with coprocessor    140 secs
33MHz    with coprocessor    11 secs

CPU measurements. of the secondary

structure

requires

for a long wait because he has given the computer

we

subscribe in

the

of 15, 2 hours instead of 1, 5 minutes instead of 2, neither of those improvements will excite the user very much or reduce his irritation considerably. In short, sometimes optimisation

with big gains can still be unimportant

respect to reduction

of user-irritation.

So it is our view

one should

indiscriminately,

the

2 rules

not optimise

not even a particular

highest potential gain (the official only consider

strategy). the highest

with

the

No, one should

respect to user-irritation

those parts with

with

everywhere

selection

first

and

user-irritation.

This brings us to a corollary of the “Don’t do it everywhere” rule; it states: 3. Don’t optimise, but reduce irritation only.

do it yet (that is, not until clear and unoptimised

a working

a big

chew. So he is resigned to wait some time and builds only very little irritation. Whether it will take 30 minutes instead

measure all parts with

on making

then,

fully

condensed

1. “Don’t do it” 2. “(For experts only) Don’t you have a perfectly solution)”.

APL 92

one.

factor should be

several minutes for small RNA strings to several hours for very long strings. However the user realises that he is in

viewpoint which Jackson’s (Jackson [3] p.232):

Afterwards

is the most important

8MHZ

Now the computation

Background respect

speed.

and

are a boon.

After all those conversions we had STAR running on Atari, Macintosh and PC. The Atari and the Macintosh versions worked fine, but the PC version was very slow; the user dialogue in particular was atrocious. So we decided to do an optimising step for the PC version only.

With

in

are think

to gain

to compute

AT

Fig.7:

4, Optimisation

4.1.

to optimise We

the secondary

Atari

a handy

APL.68000

factor

This analysis would

which

IBM

in APL2/PC

where

gain

on those parts which

IBM 486

4, Absence of mouse-support

for

to our view the most conclusive

concentrate

A higher level

and programmed

if and where necessary.

According to the literature one should first measure how much time the various program parts consume, determine

some functions

we

one can gain some speed? And

the USER-IRRITATION.

the

The AP 124 is too

reason

spirit

often damages

According

is better than

productivity.

this

same as:

So one should

gain, is also long overdue

the

thing and that optimisation

however

nothing. 3. APL2/PC

in

design and that it costs time and money. only optimise

resultant code without the diamond showed however that a structured extension in APL is long overdue (v. Batenburg

acted

do it everywhere”.

most important

for replacements.

clumsiness

we

how to determine where to optimise and where not? The spirit behind Jackson’s rules is that a good design is the

have spared us

is not standard either, although optional

then

Why not optimise wherever

feature

beneath cover functions.

hidden

And

to a third rule which we formulate

do

it

program.

did we start thinking

about


The

4.2.

According

Program

Other, more substantial

to the strategy as outlined

was not to measure parts,

but

program

the timings

to

“measure”

parts.

There

the was

above, our first step

of the various irritation

no

need

of to do

program

the various this

in

statements were removed

BFS_MPOPUP towards the BFS_UMPOPUP. B1:(3>2>C)+(1 +-Pi >?)t [2]”” B-32z> C ,51:+(S >1-1+ 1)/Bl

loop

(once

in

out of a executed)

a

systematic way because every user of the beta-version complained about (in decreasing order of irritation):

Another improvement in the BFS...-programs was the following. To catch any unwanted result and send it into never-never land we frequently used (within often executed loops and/or programs):

1. Jumps between menus in PC version.
2. Dialogue-boxes in PC version.

0 0⍴FUNCTION MATRIX

So we chose to optimise the jumps

between

the highest-irritation

menu’s

and dialogue-boxes

version. The next step was to determine time is spent in that program part.

Experiments

part first:

to

mark

a

how much CPU

function

MATRIX

yielded an improvement

of 14% in function

BFSSCREEN.

4,3. Result

We used the I-beam to measure the CPU time: “17⌶FUNCTION”

taught us that:

DUMMY←FUNCTION

in the PC

for

timing

Table fig.9 shows the result.

registration. “19⌶FUNCTION” to unmark

that function.

Conspicuous

“18⌶FUNCTION” to read the consumed time. “24⌶0”/“24⌶1”

to mark/unmark

all functions

is the deterioration

of BFS_UMPOPUP

-232 %.

was intentional; this function was called only once, whereas this change would improve BFS_MPOPUP which is

This

for timing.
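The mark / read / unmark cycle described for the I-beam timing facility can be illustrated with a small Python sketch. This is not the APL2/PC mechanism itself: the class `CpuTimer`, its methods and the function `draw_menu` are hypothetical names chosen for illustration, and CPU time is taken from Python's `time.process_time`.

```python
import time
from functools import wraps


class CpuTimer:
    """Hypothetical registry: mark a function, read its CPU time, unmark it."""

    def __init__(self):
        self.marked = set()      # names currently registered for timing
        self.consumed = {}       # accumulated CPU seconds per name

    def mark(self, name):
        self.marked.add(name)
        self.consumed.setdefault(name, 0.0)

    def unmark(self, name):
        self.marked.discard(name)

    def read(self, name):
        return self.consumed.get(name, 0.0)

    def timed(self, func):
        """Decorator: accumulate CPU time only while the function is marked."""
        name = func.__name__

        @wraps(func)
        def wrapper(*args, **kwargs):
            if name not in self.marked:
                return func(*args, **kwargs)
            start = time.process_time()
            try:
                return func(*args, **kwargs)
            finally:
                self.consumed[name] += time.process_time() - start

        return wrapper


timer = CpuTimer()


@timer.timed
def draw_menu(n):
    # Stand-in workload for a menu-drawing routine (assumption).
    return sum(i * i for i in range(n))


timer.mark("draw_menu")      # start registration
draw_menu(200_000)
timer.unmark("draw_menu")    # later calls are no longer counted
draw_menu(200_000)
print(timer.read("draw_menu"))
```

Marking only the functions under suspicion keeps the measurement overhead out of the rest of the program, which matches the idea of measuring only the high-irritation parts.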

called several times. Some CPU measurements

are presented in fig. 8. Another

—.—

interesting

observation

BFS_EXEC. It is probably

memory

1

of the APL2/PC

changes were actually

.+ ,: :;

[-

management

that

the

deterioration interpreter

made to BFSEXEC.

deterioration

although relatively

is the

of

due to very small fluctuations

of

impressive,

BFS_MPOPUP

Fig.9 and

in

as no shows

BFS_EXEC,

are not that great in absolute

values. The improvements of BFS_MPOPUP much more noticeable though. The

overall

effect

was

and BFSMPOPUP

slightly

more

than

are

25%

improvement in speed. Although not very impressive, it was fairly noticeable. As we saw no regular way for more substantial

improvements

AP124 and programming

(apart from

throwing

away the

in C) we decided to stop at this

point. Fig. 8: Execution

profiles:

times for jumps

from menu to

menu and for dialogue-boxes. 5. Conclusion

Apparently, menu-jumps require much time in BFS_MPOPUP and BFSMPOPUP. Dialogues are consuming much time in BFS_MPOPUP too, but also in wMs_uBox and BFS_UBOX. So these are the first culprits

which

should be analyzed

This case-study had several lessons for us in store, which could be useful to other projects

for First, we learned to be alert on machine-dependent

improvement.

tackled them by hiding them underneath MDF-functions. In the subsequent porting to PC those changes proved worth

...(2 2⍴9 15 0 15)∧.=C←...

those constructions

appeared within

loops,

aspects

in our programs. In porting from Atari to Macintosh we suffered from several of such incompatibilities and we

In both programs BFSMPOPUP and BFS_MPOPUP, small arrays are created on the fly. For example in BFSMPOPUP Wherever

as well.

their while.

we

removed them out of the loop; for example:
TABFLAG←2 2⍴9 15 0 15
B1:... TABFLAG∧.=C←...      ⍝ TABFLAG
→(CONDITION)/B1
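The same idea, building a small constant array once instead of on every pass through a loop, can be sketched in Python. The function names and the test data are hypothetical, and the table values 9 15 / 0 15 are only a reading of the garbled APL fragment above.

```python
# Assumed reading of the APL constant: a 2x2 table with rows (9, 15) and (0, 15).
TABFLAG = [(9, 15), (0, 15)]


def count_matches_slow(codes):
    """Rebuilds the small table on every iteration (the original pattern)."""
    hits = 0
    for c in codes:
        table = [(9, 15), (0, 15)]   # created on the fly, once per pass
        if c in table:
            hits += 1
    return hits


def count_matches_fast(codes):
    """Uses the table built once, outside the loop."""
    hits = 0
    for c in codes:
        if c in TABFLAG:
            hits += 1
    return hits


sample = [(9, 15), (1, 2), (0, 15), (9, 15)]
print(count_matches_slow(sample), count_matches_fast(sample))
```

Both functions return the same count; only the number of times the table is constructed differs, which is where an interpreter spends the extra time in an often executed loop.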


instead of 2 2p...


The dialogue functions in APL.68000 as well as in AP124 proved too low-level for great productivity. We introduced CPU

Functions

CPU

CPU

after

before

a higher

gain(%)

level

variable,

the Dialogue-Definition-Matrix

instead.

Installation

BFSSCREEN

240

205.6

14.3

BFSCLOSE

126

125.3-

0.6

BFSFBG

436

362

BFS_UMPOPUP

162

538

3679.8

3639.6

The menu-system constructed with AP124 proved too slow for basic PCs. Contrary to the “official” strategy, we didn’t optimise the program at potential high-gain spots, but applied

BFS UBOX

17.0

our rule

“don’t

optimise,

This led us to optimisation optimisation

-232.1

but reduce

of the user dialogue

of the calculation

irritation”. instead of

part.

1.1

6. References 1. Abrahams, J. P., Pleij, C. Prediction 12

10

16.7

BFSBG

o

0

0

BFS[MMWR

o

12

BFSFORMAT

including pseudoknotting, by computer simulation. Nucleic Acids Research 18(10)3035-3044 (1990). 2.

Batenburg,

5636

3.

BFS_MPOPUP

12747.2

BFS EXEC BFSREADSCREEN

4196

25.6

9200.8

27.8

46

76

-65.2

787.4

551.8

29.9

Jackson,

4, Kerf, J.L.F.

After

19228.6

choise

I

14046.6 I

205.6

14.3

BFSCLOSE

126

125.3

0.6

32

32

RNAmENUFUNC Subtotal:

398

End STAR

I

102.8

14.3

BFSCLOSE

126

125.3

0.6

246

I

New

de Second generation 9x). APL-CAM

5, Mavor,

J. W. & Manner,

APL and ISO APL

J. 13(1)189-270

H. W. General

sequences

information.

using

(1991).

biology

p.565;

folding

thermodynamics

of and

Nucleic Acid Research 9(1)133-148

I

120

I

design.

8.8

BFSSCREEN

Subtotal:

on program

0

362.9 I

Principles

(APL

RNA

harmful.

press (1975).

Extended

auxiliary (1981).

I

240

considered

New York, MacMillan Company (1966). 6. Zuker, M. & Stiegler, P. Optimal computer

26.9

BFSSCREEN

L1..2..3

(1991).

M.

Academic

large Subtotal:

F, H.D.v.

QQ.21(4)330-337 York,

BFSMPOPUP

Berg, M.v.d., Batenburg, E.v. & of RNA secondary structure,

228.1

TOTAL: 26348 21317.7
Fig.9: CPU improvements after optimisation.
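The gain percentages in Fig.9 are consistent with the relative reduction in CPU time, 100 × (before − after) / before. A minimal Python check, assuming the before/after pairs below are read correctly from the scattered table (the helper name `gain_percent` is ours):

```python
def gain_percent(before, after):
    """Relative CPU gain in percent; negative values mean a deterioration."""
    return round(100.0 * (before - after) / before, 1)


# (before, after) pairs as they appear in Fig.9; the pairing of numbers
# with function names is an assumption based on the extracted table.
measurements = {
    "BFSSCREEN": (240, 205.6),
    "BFS_UMPOPUP": (162, 538),
    "TOTAL": (26348, 21317.7),
}

for name, (before, after) in measurements.items():
    print(name, gain_percent(before, after))
# BFSSCREEN 14.3
# BFS_UMPOPUP -232.1
# TOTAL 19.1
```

The negative figure for BFS_UMPOPUP is the intentional deterioration mentioned in the text: work was moved into a function that runs only once so that BFS_MPOPUP, which is called several times, became faster.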

I

7.3 19.1

Next, we experienced similar problems with non-standard APL features like ⎕BOX, ⎕SS and overbar comma. Again, we learned to hide those features underneath UTL-functions or to use the standard. We suffered

(as we often did in the past) from the lack of

a good structural construct in APL. The poor substitute of the diamond was unfortunately inappropriate in APL2/PC.
