COMPUTER PROGRAMS


J. Appl. Cryst. (1993). 26, 745-747

A program to generate the Luzzati plot using Microsoft Excel.

By Y. W. CHEN, Centre for Protein Engineering, Medical Research Council Centre, Hills Road, Cambridge CB2 2QH, England

(Received 4 April 1993; accepted 3 June 1993)

© 1993 International Union of Crystallography. Printed in Great Britain - all rights reserved. Journal of Applied Crystallography, ISSN 0021-8898.

Abstract

The Luzzati plot (R factor against reciprocal resolution) is a useful way of assessing the reliability of a crystal structure. It is commonly employed in obtaining an upper limit to the error in atomic coordinates of a protein structure. The advantages of the spreadsheet program Microsoft Excel have been exploited and a simple procedure for producing high-quality Luzzati plots for non-centrosymmetric structures (such as proteins) has been developed.

Introduction

All crystal structures carry with them some degree of uncertainty. The agreement between a crystal structure and the observed data is represented by the R factor, which is defined as $R = \sum \bigl| |F_o| - |F_c| \bigr| / \sum |F_o|$. The R factors of biological macromolecules such as proteins (around 15-20%) are in general higher than those of small molecules (< 5%) owing to their enormous size, their inherent flexibility, which results in the possibility of multiple conformations, and the limited amount of available X-ray data. The publication of protein structures is always accompanied by an estimation of the error in atomic coordinates in the form of a Luzzati plot (Luzzati, 1952) as an indication of the quality of the structure model.

A Luzzati plot is a plot of R factor against reciprocal resolution ($1/d$) superimposed on a background of theoretical error lines. It is the generation of these background lines that makes the plot difficult for general plotting programs to handle. This problem is elegantly solved in Microsoft Excel because it allows multiple data sets to be plotted on a common pair of axes.

There are other ways to generate publication-quality Luzzati plots. For instance, the molecular-dynamics program X-PLOR (Brünger, 1993), which runs on workstations and main-frame computers, can produce output suitable for plotting with Mathematica. The procedure described here has the following advantages over the existing methods: (1) it is based on personal-computer programs and is user-friendly; (2) it generates high-quality yet easily customizable output; (3) it can incorporate multiple data sets on the same plot for comparison. The procedure has been developed primarily for protein structures (non-centrosymmetric); however, it is very easy to tailor the worksheet for centrosymmetric structures.

Methods

The basic requirements of this procedure are three files: Luzzati Data asym (Excel worksheet), Luzzati Plot template (Excel chart) and a data file.

The Luzzati Data asym worksheet (Fig. 1) contains the theoretical error data. The columns labelled $|\Delta r| \times |s|$ and R were taken from Table 2 of the original paper (Luzzati, 1952). For proteins, the three-dimensional non-centrosymmetric data were used. Each column of reciprocal-resolution ($s = 1/d = 2\sin\theta/\lambda$) values was obtained simply by dividing the $|\Delta r| \times |s|$ values by the respective theoretical error, $\Delta r$. The table consists of theoretical error data in the resolution range 10-2 Å ($1/d$ = 0.1-0.5 Å$^{-1}$). This table is used to generate the background reference lines for the plot.

Fig. 1. The Luzzati Data asym worksheet. Only part of the worksheet is shown.

The Luzzati Plot template (Fig. 2) was generated with the LINE CHART option in Excel and contains the reference theoretical error lines. Each column of reciprocal-resolution ($1/d$) values was plotted against the column of R factors in turn. Multiple lines were plotted on the same graph by the COPY-PASTE SPECIAL ... commands in Excel. This plot is linked to the Luzzati Data asym worksheet via an Excel link and it serves as the template over which experimental data are plotted.

Most crystallographic refinement programs or structure-analysis programs can generate a text file containing two columns, one of resolution and the other of R factor. This text data file is first converted into an Excel data worksheet. A new column is then defined such that it contains the reciprocal-resolution values (for an example, see Fig. 3). The reciprocal-resolution and the R-factor columns are then selected and plotted over the Luzzati Plot template. A typical printed output is illustrated in Fig. 4.
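The conversion of the two-column data file into reciprocal-resolution values can also be sketched outside Excel. The Python fragment below is a minimal illustration and not part of the original program: the function name `add_reciprocal_resolution` and the sample values are hypothetical, and it assumes each line of the file lists the resolution d (in Å) followed by the R factor.

```python
def add_reciprocal_resolution(text):
    """Parse 'resolution R-factor' lines and return (d, 1/d, R) tuples.

    Assumption (not from the paper): whitespace-separated columns, with
    d in Angstroms first and the R factor second. Lines that do not hold
    exactly two numbers (headers, blanks) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or malformed line
        try:
            d = float(fields[0])
            r_factor = float(fields[1])
        except ValueError:
            continue  # non-numeric line such as a column header
        if d <= 0:
            raise ValueError("resolution must be positive")
        # The new column described in Methods: reciprocal resolution 1/d
        rows.append((d, 1.0 / d, r_factor))
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only
    sample = "resolution R\n2.50 0.18\n2.00 0.21\n"
    for d, s, r_factor in add_reciprocal_resolution(sample):
        print(f"d = {d:.2f}  1/d = {s:.4f}  R = {r_factor:.4f}")
```

The second and third elements of each tuple correspond to the two columns that are selected and plotted over the template.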

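The construction of the background reference lines described above (dividing each $|\Delta r| \times |s|$ value by $\Delta r$) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the author's worksheet: the function name `reference_lines` is hypothetical, and the table values in the example are placeholders that would have to be replaced by the entries of Table 2 of Luzzati (1952).

```python
def reference_lines(products, r_values, delta_rs):
    """Return {delta_r: [(s, R), ...]} for each theoretical error delta_r.

    products -- tabulated |delta_r| x |s| values (from Luzzati's Table 2)
    r_values -- the R factor tabulated for each product, same length
    delta_rs -- theoretical errors (in Angstroms) for which lines are drawn

    Each reciprocal-resolution value is s = product / delta_r, as in Methods.
    """
    if len(products) != len(r_values):
        raise ValueError("products and r_values must have the same length")
    lines = {}
    for delta_r in delta_rs:
        if delta_r <= 0:
            raise ValueError("delta_r must be positive")
        lines[delta_r] = [(p / delta_r, r) for p, r in zip(products, r_values)]
    return lines


if __name__ == "__main__":
    # PLACEHOLDER numbers only: these are NOT the values of Luzzati's Table 2.
    placeholder_products = [0.02, 0.04, 0.06]
    placeholder_r = [0.10, 0.20, 0.30]
    for delta_r, points in reference_lines(
        placeholder_products, placeholder_r, [0.10, 0.20]
    ).items():
        print(delta_r, points)
```

Each entry of the returned dictionary plays the role of one column of reciprocal-resolution values in the worksheet, paired with the shared R column. Points falling outside the plotted range of $1/d$ would simply be clipped by the axis scaling.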

Customization

The program is suitable for plotting data with a resolution range of 10-2.0 Å. If a resolution range outside this is required, the scaling of the X axis can be changed. If error lines of a different $\Delta r$ interval are desired, the values of $\Delta r$ in the error-table worksheet can be changed and the Luzzati Data asym worksheet will be changed accordingly. In some cases, one would prefer to plot R factor against $\sin\theta$ rather than against $1/d$. In this case, the error-table worksheet needs to be redefined so that each column contains values of $\sin\theta$ rather than $1/d$, provided that the wavelength, $\lambda$, is known (since $1/d = 2\sin\theta/\lambda$, each value becomes $\sin\theta = \lambda/2d$). If two structures are to be included on one plot, the second data set can be incorporated using the COPY-PASTE SPECIAL commands again.

Graphs produced in Excel are easily customizable. Every aspect of the plot can be changed according to individual requirements, including fonts, scale, border, shading, colour, line weight, text annotation etc. Although the program has been developed with Microsoft Excel on a Macintosh, the worksheet and chart can also be used on Excel 3 (or 4) for Windows on an IBM-compatible computer.

Fig. 2. The Luzzati Plot template chart.

Fig. 3. An example data worksheet.

Fig. 4. An example of a printed output of the program. The two lines represent the plots for two different protein structures. The estimated error of the black line is 0.19 Å while that for the grey line is 0.22 Å.

Installation

Copies of the programs and example data files may be obtained from the author on request by sending a blank 3.5 in DD Macintosh-format floppy disk. The programs require a Macintosh computer (any model) and a copy of Microsoft Excel (version 3.0 or higher). All the files must reside in the same folder.

YWC is supported by a Croucher Foundation Scholarship, the Overseas Research Student Award and Imperial Chemical Industries PLC.

References

BRÜNGER, A. T. (1993). X-PLOR (version 3.1) Manual, pp. 181-183. New Haven: Yale Univ. Press.
LUZZATI, V. (1952). Acta Cryst. 5, 802-810.
