a GSAS script language for automated Rietveld ... - IUCr Journals

computer programs

Journal of Applied Crystallography
ISSN 0021-8898

Received 10 March 2011 Accepted 14 June 2011

gsaslanguage: a GSAS script language for automated Rietveld refinements of diffraction data

Sven C. Vogel

Manuel Lujan Jr Neutron Scattering Center, MS H805, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. Correspondence e-mail: sven@lanl.gov

© 2011 International Union of Crystallography. Printed in Singapore – all rights reserved

A description of the gsaslanguage software is presented. The software provides input to, and processes output from, the GSAS package. It allows the development of scripts for the automatic evaluation of large numbers of data sets and documents the refinement strategies employed, thus fostering the development of efficient refinement strategies. The use of the bash shell and standard Unix text-processing tools, available natively on Linux and Mac OS X platforms and via the free cygwin software on Windows systems, makes this software platform independent.

J. Appl. Cryst. (2011). 44, 873–877. doi:10.1107/S0021889811023181

1. Introduction

Automation of the analysis of diffraction data is arguably one of the biggest challenges since the advent of high-throughput facilities such as modern synchrotron or neutron facilities. Besides sheer sample throughput, parametric studies are now common, during which hundreds or thousands of runs are acquired at different pressures, temperatures, stresses or other external conditions. Similarly, for kinetic studies, data are taken at fixed short time intervals for prolonged periods, generating large numbers of runs that need to be analyzed. While Rietveld analysis (Rietveld, 1969; Young, 1995) has become the standard way of extracting crystallographic information from such data, automating the analysis is by no means trivial, and this is frequently the main obstacle for inexperienced or even experienced users of the Rietveld technique. It has been suggested that the parameters describing a series of data sets in a parametric study, e.g. the coefficient of thermal expansion or kinetic parameters of phase transformations, be incorporated into the refinement of the whole data set [e.g. Stinton & Evans (2007); Halasz et al. (2010)], but this technique has not been generalized and is not available in mainstream Rietveld packages.

The role of beamline personnel at large-scale facilities also includes, or should include, assisting users with data analysis. In view of the large numbers of user visits at many facilities, this task demands an efficient transfer of knowledge of the refinement strategy for the particular type of data from a particular instrument. The analysis of powder diffraction data utilizes a complex model with tens, or in some cases even hundreds, of parameters, which makes it stand out from other materials characterization techniques with much smaller numbers of parameters. For beginners, it is therefore often difficult to understand the complex interplay of the parameters modeling a diffraction data set. While the general rule that the next parameter to be varied should be the one deemed to contribute most to the difference curve also holds for specific instruments and techniques, the sequence and type of parameters that need to be varied can be considered specific to the instrument. Guidance on refinement strategies exists for users of the Rietveld method, e.g. McCusker et al. (1999), even for specialized topics like magnetic structure analysis using GSAS (Cui et al., 2006), but highly specialized instruments or sample environments will always require special details that need to be communicated to every user.

The learning curve for a particular Rietveld code can also be fairly steep for a beginner. While this issue can be partially circumvented by encapsulating the input- and output-specific problems in graphical user interfaces (GUIs) (e.g. Toby, 2001), such interfaces provide limited assistance for the automation and documentation of the analysis. Furthermore, some refinement techniques, such as the determination of site occupation factors as described by Heuer (2001), require repetitive refinements with slight variations of e.g. the d-spacing range used in the refinement, and might be in more widespread use if automation were more readily available. Similarly, it is occasionally necessary to try several proposed structures and compare their match with a given data set; automation can ensure that an identical refinement strategy is applied to each structural model. Frequently, documentation of the applied refinement strategy is not available and users may even find themselves unable to reproduce a particular successful refinement. Black-box automation of refinement with minimal user interaction, e.g. the analysis wizards for texture analysis from neutron time-of-flight data as implemented in MAUD (Wenk et al., 2010), greatly simplifies the data analysis, but may cause problems even for experienced users when the hard-coded refinement strategy does not work. Guidance from e.g. the correlation matrix, the evolution of the reduced χ² or the parameter shifts, which may indicate refinement problems, is typically not obvious to beginners but may contain very valuable indications as to which problems were encountered during the refinement.

Finally, there is no standard way of communicating and exchanging successful refinement strategies, which would be useful as a starting point for other users with similar analysis problems, or for an experienced user to advise another user remotely. In this paper, we describe an approach to address these problems, allowing users to focus on the actual refinement rather than having to deal with user-interface-specific obstacles, such as adding histograms or phase information to the refinement.

To address the above issues, we have developed a script language, called gsaslanguage, that allows one to program a refinement of diffraction data using the GSAS package (Larson & Von Dreele, 2004) as the back-end. These scripts can be re-used with minimal modifications, either for analysis of similar data sets during a parametric study or for similar experiments undertaken by other users. They generate an overview file documenting the refinement process, as well as initial graphs of any refined parameter versus run number. The scripts use the bash shell, both for the commands controlling the analysis and for programming the actual analysis, and are therefore available on Windows (using the cygwin package, freely available at http://www.cygwin.com), Mac and Linux systems without installing any software other than the collection of scripts, possibly LaTeX or gnuplot, and of course GSAS. The bash environment, in combination with Unix text-processing tools, allows validation of input (e.g. the existence of input files such as instrument parameter files or data files), powerful on-the-fly extensions (no compilation necessary) and efficient extraction of results from various GSAS output files (list files, parameter value files etc.). If other script languages are installed, e.g. Tcl/Tk, crystallographic codes written in these languages, such as EXPGUI, can be included. Using a script language also allows the implementation of fairly specific checks for common mistakes, such as refinement of the ZERO instrument parameter instead of DIFC in neutron time-of-flight refinements, or simultaneous variation of calibration parameters and lattice parameters. From such checks, the user can be provided with feedback and warnings. The bash shell adds very little overhead to the execution of the GSAS commands, allowing efficient analysis of large numbers of data sets. Conditional refinements, i.e. variation of certain phase-specific parameters only above a certain threshold for the weight/volume fraction of the phase, can be implemented.
It is hoped that widespread use of a tool like the one presented here will increase the number of researchers able to analyze large numbers of data sets efficiently, and therefore help to reduce the mountains of un-analyzed data that exist at large-scale diffraction facilities. Finally, command-line access to Rietveld refinement commands offers opportunities for distributed computing.
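The input validation and conditional refinement mentioned in this section can be sketched in bash. This is a minimal illustration and not code taken from gsaslanguage: the helper names `require_file` and `should_refine_phase`, the default threshold of 0.05 and the example weight fractions are all assumptions chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these names come from gsaslanguage.

# Succeed only if the given input file exists and is readable.
require_file() {
    if [ ! -r "$1" ]; then
        echo "WARNING: input file '$1' not found" >&2
        return 1
    fi
}

# Decide whether phase-specific parameters should be varied:
# succeed only if the weight fraction exceeds the threshold.
# awk is used because bash has no floating-point arithmetic.
should_refine_phase() {
    local fraction=$1 threshold=${2:-0.05}
    awk -v f="$fraction" -v t="$threshold" 'BEGIN { exit !(f > t) }'
}

# Example with assumed weight fractions for two phases.
for fraction in 0.62 0.03; do
    if should_refine_phase "$fraction"; then
        echo "fraction $fraction: vary phase-specific parameters"
    else
        echo "fraction $fraction: keep phase-specific parameters fixed"
    fi
done
```

In a real script, the branch that varies the parameters would call the corresponding refinement commands; here it only prints the decision so that the control flow can be tried without GSAS.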

2. Software used

2.1. The GSAS package

The GSAS package has been developed over several decades by Allen C. Larson and Robert B. Von Dreele (Von Dreele et al., 1982). Despite its vast number of features and models for analyzing diffraction data, it can be considered very robust and mature. It can refine nuclear and magnetic structures from both neutron (constant wavelength or time-of-flight) and X-ray (laboratory or synchrotron, constant wavelength or energy dispersive) data. Diffraction data from powders, textured samples and single crystals may be processed, including simultaneous refinement of up to 99 histograms of the same sample originating from different instruments or from different detectors on the same instrument. This allows, for instance, combined X-ray and neutron diffraction refinements (Williams et al., 1988) or simultaneous refinements of crystal structure and texture (Von Dreele, 1997). Besides structure refinement and quantitative phase analysis with up to nine phases, the GSAS program RAWPLOT can also be used to perform single-peak fits with any of the peak-profile functions defined in GSAS, a feature frequently used in the engineering community to interpret in situ loading experiments (Clausen, 2004). GSAS can work in Le Bail mode (Le Bail, 2005) to refine the cell parameters only, or may be used to simulate diffraction patterns. More recently, the GSAS package was extended to allow refinement of protein crystal structures from powder diffraction data (Von Dreele, 1999; Von Dreele et al., 2000). Last, but not least, the software is accompanied by a comprehensive manual including several tutorials. GSAS is therefore widely used in the international diffraction community.

It should be mentioned that GSAS itself provides some capabilities for automating series of runs. However, these require an initial refinement and do not offer the full flexibility and documentation of the scripting language presented here, e.g. conditional refinements.

2.1.1. The native GSAS user interface

GSAS is a collection of about 50 individual executables that perform various tasks during structure analysis, such as variation of parameters or computation of Fourier maps or bond lengths, as well as plotting of various results (e.g. measured data with fit and difference curves, pole figures, Fourier maps etc.). Apart from minimal graphical input via a mouse in some of the plotting programs, all input and output is done in text mode, either at the console or in the list file. The main program to control the analysis is EXPEDT. Using one-letter options, the user navigates menus within EXPEDT and other GSAS programs to modify e.g. parameter values, parameter refinement flags or models. This rather extensive menu tree is considered a major obstacle to the ease of use of GSAS, and EXPGUI (Toby, 2001) provides a GUI for many options of EXPEDT. The benefit of the text-mode interface, however, is the ability to save the required keyboard input in text files, which may be piped into EXPEDT (i.e. the text file simulates user input, albeit much faster than any human could type). This mechanism is heavily used by gsaslanguage, hiding most of the complexities of the native GSAS interface, similar to EXPGUI. While EXPGUI has features similar to a macro recorder and can be accessed in command-line scripts, it was not intended to 'program' an analysis and these capabilities of EXPGUI are mostly undocumented. The approach presented here, on the other hand, allows one to write scripts, including loops and conditional statements, both to document the refinement strategy and to automate refinements.

2.2. The bash shell

The bash shell (Bourne-again shell) is part of most Linux and Unix distributions, including Apple Mac OS and cygwin. It is a command language interpreter, allowing the execution of scripts. Because scripts are interpreted rather than compiled, all gsaslanguage scripts are distributed as source code, which allows users to modify them or use them as templates for user-specific extensions. Simple debugging is possible using e.g. echo (a print command) and read -p 'wait here' statements. The bash shell supports loops, conditional statements, simple string operations and simple checks of the file system (e.g. the existence of a file). In combination with Unix text-processing tools such as grep, sed and awk, fairly powerful code can be written efficiently. Since bash is the standard shell on many operating systems, extensive documentation and examples are available online. Installation of the cygwin package on computers running Windows is simple and is described in the documentation for gsaslanguage.
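The mechanism described in Section 2.1.1, saving keyboard input in a text file and feeding it to a text-mode program, can be sketched together with the bash features listed above. This is an illustration only: `run_with_keystrokes` is a hypothetical helper, the keystroke sequence is invented, and `cat` stands in for EXPEDT so that the sketch runs without GSAS installed. The actual EXPEDT invocation and menu keystrokes would have to be taken from the GSAS documentation.

```shell
#!/usr/bin/env bash
# Sketch: feed a saved keystroke file to a text-mode program.
# PROGRAM defaults to `cat` as a stand-in for EXPEDT (assumption);
# the keystrokes below are placeholders, not real EXPEDT menu options.
PROGRAM=${PROGRAM:-cat}

run_with_keystrokes() {
    local keyfile=$1
    # Simple file-system check before running anything.
    if [ ! -f "$keyfile" ]; then
        echo "ERROR: keystroke file '$keyfile' does not exist" >&2
        return 1
    fi
    # The text file simulates user input typed at the console.
    "$PROGRAM" < "$keyfile"
}

keyfile=$(mktemp)
# One menu choice per line, as a user would type them (invented values).
printf '%s\n' "a" "b" "x" > "$keyfile"

# Loop and simple string operation: build one output name per bank.
project="nickel"
for bank in 2 3; do
    echo "bank $bank -> ${project}_bank${bank}.out"
done

run_with_keystrokes "$keyfile"
rm -f "$keyfile"
```

With the stand-in program the keystrokes are simply echoed back, which is also a convenient way to inspect a keystroke file before piping it into the real program.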

3. gsaslanguage

3.1. Example: the GSAS nickel tutorial

As an introduction, the following listing shows the well known first tutorial in the GSAS manual, describing the refinement of nickel powder neutron diffraction data, familiar to most GSAS users from their first steps using this package.

The first line initializes a new GSAS project named nickel, providing a more instructive title that will be displayed in e.g. refinement plots. The second line adds the phase to the project, providing a name, the space group and the initial estimate for the lattice parameter. Alternatively, phase information can be read from GSAS EXP files directly, which is convenient in the case of more complex structures. Lines 4–6 add histograms two and three of the file nickel.raw, using the instrument parameter file inst_tof.prm and setting a d-spacing range of 0.4–1.4 Å for both histograms. A for loop is used to illustrate the usage of such code structures in bash; equivalently, two lines with gsas_add_histogram commands could have been used. With the addition of an atom for the nickel phase in line 8, the initialization of the project is complete. The first two refinement cycles are performed in line 10, fitting the parameters set to variable in GSAS by default, namely histogram scale factors (accounting for the count time) and background parameters for the default background function. Besides performing the equivalent tasks with the GSAS executables, these scripts also perform some initializations and input to code specific to gsaslanguage, such as for the generation of the documentation of the refinement. Lines 12–14 change the background function and the number of background parameters, followed by another five refinement cycles.

Lines 16–18 refine the diffractometer constants DIFC for both banks, which in a neutron time-of-flight experiment essentially recalibrates the sample position using the initial lattice parameter of the phase. This is an example of an instrument- or technique-specific (neutron time-of-flight in this example) refinement step that even very experienced X-ray users might not be aware of, so they may refine the ZERO offset instead, as is common during Bragg–Brentano constant-wavelength measurements. Similarly, in neutron time-of-flight experiments the peak-width parameters have different underlying physics from those for constant-wavelength data. For the former kind of data, σ1 describes the increase in emission time for neutrons of a given wavelength from the moderator, geometric sample broadening (i.e. from the diameter of the sample) and microstrain broadening, but not particle-size broadening. Consequently, in many cases, at least on low- to medium-resolution instruments, the peak profile parameter σ1 is the only parameter varied unless the particle size is in the nanometre range. For the beginner in refinement of neutron time-of-flight data, neither of these steps might be obvious, and hence simply by providing a script like this, one can communicate these important details.

As an example of atomic parameters, the thermal motion (Uiso in GSAS) for the first atom in phase 1 (the only phase in this case) is refined, using a damping factor of 5 (lines 24 and 25). Owing to the high correlation between thermal motion and absorption parameters, the variation in thermal motion is turned off before the absorption is refined, and then this parameter is turned back on for the final refinement. A comment, indicated by the hash character (#) in bash, is added before this command sequence to illustrate the possibility of also documenting the refinement strategy in the script itself.

The last command, gsas_done, concludes the script and generates an overview document describing the refinement. This document is an Acrobat PDF file, generated using the freely available LaTeX system. The document contains a description of each refinement step, i.e. which parameters were varied, which values or models were changed, what the reduced χ² was and what the final normalized variable shift was. After each step, plots of the data with fit, the difference curve (Iobs − Icalc) and reflection markers, the difference curve weighted by the statistical uncertainty of the data point [(Iobs − Icalc)/σ(Iobs)] (allowing a comparison of the differences between the experiment and the model, independent of the incident intensity), and the normal probability plot for each histogram are also included. This last compares the frequency of (Iobs − Icalc)/σ(Iobs) with the expected frequency based on a normal distribution and is useful for judging the quality of the refinement, including the statistical uncertainties of the measured data points [see e.g. Abrahams & Keve (1971) or International Tables for Crystallography (1989) for more details]. This part of the document allows one to judge whether each step of the refinement has converged and how much reduction in the difference between observed and calculated intensities has been accomplished by a given parameter variation. Fig. 1 shows an example of such a section in the documentation.

Figure 1 Example of a section in the refinement documentation generated by gsaslanguage, describing a single refinement step.

Using gnuplot, plots of the reduced χ² (Fig. 2) and final normalized variable shift versus refinement cycle (Fig. 3) are provided at the end of the document for a similar purpose. The graph of the final normalized variable shift versus refinement cycle lets the user easily identify parameter oscillations by the sawtooth pattern typical of this problem. Brief comments instruct the user on what to look for and what action to take, illustrating the ability to transfer expert Rietveld refinement knowledge to the user by this approach. In the last section of the document, the refined parameter values are listed with the estimated standard uncertainties. The software checks for values that are zero within their error bars and might be candidates for exclusion from the refinement. Negative isotropic displacement parameters, or site occupation factors greater than one or smaller than zero, are also listed and the user is warned.
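The parameter checks described above can be sketched with awk. This is not the gsaslanguage implementation: the function name `check_parameters`, the three-column input (parameter name, value, estimated standard uncertainty), the parameter names and the numerical values are all invented for illustration and do not reflect the layout of the GSAS PVE file.

```shell
#!/usr/bin/env bash
# Hypothetical sanity checks on a table "name value esd" (assumed layout).
check_parameters() {
    awk '
    {
        name = $1; value = $2; esd = $3
        # Magnitude of the value, for comparison with its uncertainty.
        mag = (value < 0) ? -value : value
        # Value is zero within its error bar: |value| <= esd.
        if (mag <= esd)
            print "NOTE: " name " is zero within its uncertainty"
        # Negative isotropic displacement parameter.
        if (name ~ /UISO/ && value < 0)
            print "WARNING: " name " is negative"
        # Site occupation factor outside the range 0 to 1.
        if (name ~ /FRAC/ && (value < 0 || value > 1))
            print "WARNING: " name " is outside the range 0 to 1"
    }' "$@"
}

# Invented example values, read from standard input.
check_parameters <<'EOF'
1_1_UISO  -0.0020  0.0005
1_1_FRAC   1.0400  0.0100
1_ABSORB   0.0100  0.0300
1_A        3.5238  0.0001
EOF
```

Keeping the checks in one small function makes it easy to add further instrument-specific warnings without touching the refinement script itself.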
The file with the parameter values and their estimated standard uncertainties (PVE file in GSAS) is automatically generated when using gsaslanguage and is available to the user after refinement for easier extraction of parameter values for plotting etc., simplifying access to another feature of GSAS that is unknown even to many experienced GSAS users.

3.2. Commands not included in the example

Many more commands than shown in the example above exist in gsaslanguage. Crystal phase information can be read from an EXP file instead of being provided atom by atom. Parameters such as atom parameters or lattice parameters can be modified. Phase flags, turning the inclusion of a given phase on or off for a given histogram, can be set with a simple command. Constraints for atom parameters and phase fractions can be added or deleted, and the model for thermal motion can be changed from isotropic to anisotropic and vice versa. To support the processing of series of runs, EXP files can be copied and the reference to a data file replaced with a new data file, allowing the use of the parameter values of the first EXP file as starting values for the second. Bond lengths can be calculated using the underlying GSAS functions. The command gsas_convert_cif converts a crystal structure stored in a CIF into a file in GSAS format utilizing the code of EXPGUI, as an example of merging the two projects. Using a single command, all types of Fourier maps known in GSAS can be created in a file format suitable for inspecting them in three dimensions, together with the refined crystal structure, using e.g. the VESTA package (Momma & Izumi, 2008); it is hoped that this will simplify the use of this powerful tool for crystal structure solution. Finally, routines exist to extract parameter values and their estimated standard deviations from the PVE files, plot them using gnuplot, and generate an overview plot as an Acrobat PDF file, containing all plots generated this way and providing quick graphical results for parametric studies. Fig. 4 shows examples of such plots, generated using the commands

The parameter names are the same as those given at the end of the refinement documentation, with the first number defining the phase, the second the atom within that phase, and the remainder identifying the actual variable. The nodelete option preserves the results of the parameter extraction from the GSAS result files in ASCII text files, relieving the user of the frequently cumbersome task of gathering that information for more sophisticated plotting.

3.3. Distribution and documentation

gsaslanguage and its documentation, including installation instructions for additional required software (GSAS itself, gnuplot and LaTeX, and cygwin on Windows platforms) and scripts for the nickel (neutron time-of-flight) and fluorapatite (laboratory X-ray) examples from the GSAS manual, are available for download from http://code.google.com/p/gsaslanguage/. The source code is included in the distribution. Apart from GSAS and the aforementioned software packages, installation is limited to unpacking the script files into a folder and adding this folder to the search path of the system. Users are invited to submit their own scripts or extensions for publication on this server and to contact the author at sven@lanl.gov for bug reports, feature requests or problems. Updated versions and more examples will be posted at the same location.
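Section 3.2 notes that, to process a series of runs, a project file can be copied and its reference to a data file replaced, so that the refined values of one run become the starting values of the next. The idea can be sketched as follows. This is a hedged illustration, not the gsaslanguage command: `copy_project_for_run`, the file names and the one-line project content are invented, and a real GSAS EXP file has a fixed-width record layout that this plain text substitution does not attempt to preserve.

```shell
#!/usr/bin/env bash
# Sketch: clone a refined project file for the next run in a series and
# point it at a new data file. All names and file contents are invented.
copy_project_for_run() {
    local old_exp=$1 old_data=$2 new_exp=$3 new_data=$4
    if [ ! -f "$old_exp" ]; then
        echo "ERROR: '$old_exp' not found" >&2
        return 1
    fi
    # Escape dots so that sed treats the old file name literally.
    local pattern=${old_data//./\\.}
    # '|' is the sed delimiter so that names containing '/' also work.
    sed "s|$pattern|$new_data|g" "$old_exp" > "$new_exp"
}

workdir=$(mktemp -d)
# Stand-in for a refined project file that references its data file.
echo "DATAFILE run0001.raw" > "$workdir/run0001.EXP"

# Each run starts from the project file of the previous run.
previous=0001
for run in 0002 0003; do
    copy_project_for_run "$workdir/run$previous.EXP" "run$previous.raw" \
                         "$workdir/run$run.EXP" "run$run.raw"
    previous=$run
done

cat "$workdir/run0003.EXP"
rm -rf "$workdir"
```

Chaining the copies in a loop means each refinement would start from the converged values of its predecessor, which is usually the point of a parametric series.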

Figure 2 Example of a plot showing the evolution of the goodness-of-fit indicator χ² generated by gsaslanguage.

Figure 3 Plot of the final normalized variable shift as a function of Rietveld refinement cycle.

Figure 4 Examples of the automatic plotting of parameter results (not from the nickel example).

4. Summary

The software described herein has been used successfully by the author and several users for the analysis of neutron time-of-flight data from the HIPPO beamline (Wenk et al., 2003) and laboratory X-ray diffraction data since 2009. Standard scripts exist, serving as templates for crystal structure analysis from HIPPO data, and these are now used virtually exclusively for HIPPO data analysis with GSAS. Many common steps during Rietveld analysis are automated with appropriate script commands, and extensions can be readily added. Besides its obvious uses for documentation and re-usability of refinement strategies, as well as automation of the analysis of large numbers of data sets, it is hoped that this collection of script files also serves as a basis for the development of scripted analysis of diffraction data by other users. The author believes that this skill is extremely valuable, given the advent of facilities and instrumentation that can generate diffraction data at ever faster speeds. While the acquisition and storage of such large quantities of data pose their own information technology problems, there is, to the best of the author's knowledge, very little software to interface between these vast amounts of data and the established software for analyzing them.

This work would not have been possible without the efforts of Drs Allen C. Larson and Robert B. Von Dreele to create and maintain GSAS. The author is indebted to Nina J. Lane for help with the manual and to Professor Björn Winkler for valuable comments on this manuscript. I thank Dr Brian Toby for provision of code to utilize the CIF import feature of EXPGUI and for valuable comments on this manuscript. The Lujan Neutron Scattering Center at the Los Alamos Neutron Science Center is funded by the Department of Energy, Office of Basic Energy Science. The Los Alamos National Laboratory is operated by Los Alamos National Security LLC under DOE contract No. DE-AC52-06NA25396.

References

Abrahams, S. C. & Keve, E. T. (1971). Acta Cryst. A27, 157–165.
Clausen, B. (2004). SMARTSWare Manual. Report LAUR 04-6581. Los Alamos National Laboratory, New Mexico, USA.
Cui, J., Huang, Q. & Toby, B. (2006). Powder Diffr. 21, 71–79.
Halasz, I., Dinnebier, R. E. & Angel, R. (2010). J. Appl. Cryst. 43, 504–510.
Heuer, M. (2001). J. Appl. Cryst. 34, 271–279.
International Tables for Crystallography (1989). Vol. IV, Section 4.3. Birmingham: Kynoch Press.
Larson, A. & Von Dreele, R. (2004). GSAS. Report LAUR 86-748. Los Alamos National Laboratory, New Mexico, USA.
Le Bail, A. (2005). Powder Diffr. 20, 316–326.
McCusker, L. B., Von Dreele, R. B., Cox, D. E., Louër, D. & Scardi, P. (1999). J. Appl. Cryst. 32, 36–50.
Momma, K. & Izumi, F. (2008). J. Appl. Cryst. 41, 653–658.
Rietveld, H. M. (1969). J. Appl. Cryst. 2, 65–71.
Stinton, G. W. & Evans, J. S. O. (2007). J. Appl. Cryst. 40, 87–95.
Toby, B. H. (2001). J. Appl. Cryst. 34, 210–213.
Von Dreele, R. B. (1997). J. Appl. Cryst. 30, 517–525.
Von Dreele, R. B. (1999). J. Appl. Cryst. 32, 1084–1089.
Von Dreele, R. B., Jorgensen, J. D. & Windsor, C. G. (1982). J. Appl. Cryst. 15, 581–589.
Von Dreele, R. B., Stephens, P. W., Smith, G. D. & Blessing, R. H. (2000). Acta Cryst. D56, 1549–1553.
Wenk, H.-R., Lutterotti, L. & Vogel, S. (2003). Nucl. Instrum. Methods Phys. Res. Sect. A, 515, 575–588.
Wenk, H., Lutterotti, L. & Vogel, S. (2010). Powder Diffr. 25, 283–296.
Williams, A., Kwei, G., Von Dreele, R., Raistrick, I. & Bish, D. (1988). Phys. Rev. B, 37, 7960–7962.
Young, R. (1995). The Rietveld Method. New York: Oxford University Press.

