GSI User's Guide - Developmental Testbed Center

Gridpoint Statistical Interpolation (GSI) Version 3.1 User’s Guide

Developmental Testbed Center National Center for Atmospheric Research National Centers for Environmental Prediction, NOAA Global Systems Division, Earth System Research Laboratory, NOAA

July, 2012

Foreword This User’s Guide describes the Gridpoint Statistical Interpolation (GSI) Version 3.1 data assimilation system, released on July 20, 2012. This version of GSI is based on the GSI repository from May 2012. As GSI is further developed, this document will be enhanced and updated to match the released version. This User’s Guide is designed for both beginners and experienced users. It includes eight chapters and two appendices: Chapter 1: Overview Chapter 2: Software Installation Chapter 3: Running GSI Chapter 4: GSI Diagnostics and Tuning Chapter 5: GSI Applications Chapter 6: GSI Theory and Code Structure Chapter 7: Observation and Background Error Statistics Chapter 8: Satellite Radiance Data Assimilation Appendix A: GSI tools Appendix B: GSI Namelist

Where Chapter 2 through Chapter 5 introduces basic skills on installing the GSI code, running cases, and checking analysis results. Chapter 6 through 8 and Appendices A and B present more details on the use of GSI and other advanced topics. For the latest version of this document, please visit the GSI User’s Website at http://www.dtcenter.org/com-GSI/users/index.php. Please send questions and comments to [email protected]. Editors: Ming Hu, Hui Shao, Donald Stark, Kathryn Newman Contributors to this guide: DTC: Chunhua Zhou, Xiang-Yu Huang NOAA/NCEP/EMC: John Derber, Russ Treadon, Mike Lueken, Wan-Shu Wu NCAR: Syed Rizvi, Zhiquan Liu, Tom Auligne, Arthur Mizzi NOAA/ESRL/GSD: Steve Weygandt, Dezso Devenyi, Joseph Olson

ii

Acknowlegements: We thank the U.S. Air Force Weather Agency and the National Oceanic and Atmospheric Administration for their support of this work. . This work is also facilitated by National Center for Atmospheric Research (NCAR). NCAR is supported by the National Science Foundation (NSF).

iii

Table of Contents

Table of Contents

Chapter 1: Overview .................................................................................................................. 1 Chapter 2: Software Installation ............................................................................................ 7 2.1 Introduction .................................................................................................................................... 7 2.2 Obtaining the Source Code ......................................................................................................... 7 2.3 Compiling GSI .................................................................................................................................. 8 2.3.1 Environment Variables .......................................................................................................................... 9 2.3.2 Configure and Compile ....................................................................................................................... 10 2.4 System Requirements and External Libraries .................................................................. 12 2.4.1 Compilers .................................................................................................................................................. 12 2.4.2 NetCDF and MPI ..................................................................................................................................... 13 2.4.3 LAPACK and BLAS ................................................................................................................................ 14 2.5 Directory Structure, Source code and Supplemental Libraries .................................. 15 2.6 Getting Help and Reporting Problems ................................................................................ 16 2.7 Modifying the GSI Build Environment ................................................................................. 16 2.7.1 Understanding the Build System .................................................................................................... 16 2.7.2 Configuration Resource File ............................................................................................................. 18 2.7.3 Modification Example ......................................................................................................................... 19 Chapter 3: Running GSI .......................................................................................................... 21 3.1 Input Files ..................................................................................................................................... 21 3.2 GSI Run Script .............................................................................................................................. 24 3.2.1 Steps in the GSI run script ................................................................................................................. 24 3.2.2 Customization of the GSI run script .............................................................................................. 25 3.2.2.1 Setting up the machine environment .................................................................................................... 25 3.2.2.2 Setting up the running environment ..................................................................................................... 27 3.2.2.3 Setting up an analysis case ......................................................................................................................... 27 3.2.3 Description of the sample script to run GSI ............................................................................... 29 3.3 Introduction to Frequently Used GSI Namelist Options ................................................ 41 3.4 Files in GSI Run Directory ........................................................................................................ 45 Chapter 4: GSI Diagnostics and Tuning ............................................................................ 48 4.1 Understanding Standard Output (stdout) .......................................................................... 48 4.2 Single Observation Test ............................................................................................................ 63 4.2.1 Setup a single observation test: ...................................................................................................... 63 4.2.2. Examples of single observation tests for GSI ........................................................................... 64 4.3 Control Data Usage ..................................................................................................................... 65 4.4 Domain Partition for Parallelization and Observation Distribution ........................ 69 4.5 Observation Innovation Statistics ......................................................................................... 70 4.5.1 Conventional observations ............................................................................................................... 71 4.5.2 Satellite radiance ................................................................................................................................... 74 4.6 Convergence Information ........................................................................................................ 77 4.7 Use bundle to configure control, state varaibles and background fields ................ 82 4.8 Analysis Increments .................................................................................................................. 83 4.9 Running Time and Memory Usage ........................................................................................ 83

iv

Table of Contents Chapter 5: GSI Applications .................................................................................................. 84 5.1. Assimilating Conventional Observations with GSI: ........................................................ 85 5.1.1: Run script ................................................................................................................................................ 85 5.1.2: Run GSI and check the run status ................................................................................................. 87 5.1.3: Check for successful GSI completion ........................................................................................... 89 5.1.4: Diagnose GSI analysis results ......................................................................................................... 91 5.1.4.1: Check analysis fit to observations ......................................................................................................... 91 5.1.4.2: Check the minimization ............................................................................................................................. 94 5.1.4.3: Check the analysis increment .................................................................................................................. 96 5.2. Assimilating Radiance Data with GSI .................................................................................. 97 5.2.1: Run script ................................................................................................................................................ 97 5.2.2: Run GSI and check run status ......................................................................................................... 99 5.2.3: Diagnose GSI analysis results ......................................................................................................... 99 5.2.3.1: Check file fort.207 ......................................................................................................................................... 99 5.2.3.2: Check the analysis increment ............................................................................................................... 101 5.3. Assimilating GPS Radio Occultation Data with GSI ...................................................... 103 5.3.1: Run script ............................................................................................................................................. 103 5.3.2: Run GSI and check the run status .............................................................................................. 103 5.3.3: Diagnose GSI analysis results ...................................................................................................... 104 5.3.3.1: Check file fort.212 ...................................................................................................................................... 104 5.3.3.2: Check the analysis increment ............................................................................................................... 105 5.4 Introduction to GSI hybrid analysis ................................................................................... 105 Summary ............................................................................................................................................. 107 Chapter 6: GSI Theory and Code Structure ................................................................... 108 6.1 GSI Theory ................................................................................................................................... 108 6.1.1 3DVAR equations: .............................................................................................................................. 108 6.1.2 Iterations to find the optimal results ........................................................................................ 109 6.1.3 Analysis variables .............................................................................................................................. 110 6.2 GSI Code Structure .................................................................................................................... 110 6.2.1 Main process ........................................................................................................................................ 110 6.2.2 GSI background IO (for 3DVAR) .................................................................................................. 113 6.2.3 Observation ingestion ...................................................................................................................... 114 6.2.4 Observation innovation calculation ........................................................................................... 115 6.2.5 Inner iteration ................................................................................................................................... 116 Chapter 7: Observation and Background Error Statistics ........................................ 117 7.1 Conventional Observation Errors ....................................................................................... 117 7.1.1 Getting original observation errors ........................................................................................... 117 7.1.2 Observation error adjustment and gross error check within GSI ................................. 118 7.2 Background Error Covariance ............................................................................................. 120 7.2.1 Processing of background error statistics .............................................................................. 121 7.2.2 Apply background error covariance .......................................................................................... 123 Chapter 8: Satellite Radiance Data Assimilation ......................................................... 125 8.1. Satellite radiance data ingest and distribution ............................................................. 125 8.1.1 link radiance BUFR files to GSI recognized names .............................................................. 125 8.1.2 GSI Code to ingest radiance data ................................................................................................. 128 8.1.3 information on ingesting and distribution .............................................................................. 131 8.2. Radiance observation operator .......................................................................................... 131 8.3. Radiance observation quality control ............................................................................. 132 8.4. Bias correction for radiance observations ..................................................................... 134 v

Table of Contents 8.4.1. Bias correction for satellite observations .............................................................................. 134 8.4.2. The GSI Bias correction procedure and configurations ................................................... 135 8.4.3 Namelist, satinfo, and coefficients for bias correction ....................................................... 137 8.4.4. Utility for Angle Bias Correction ................................................................................................. 140 8.4.5. Discussion of FAQ ............................................................................................................................. 143 8.5. Radiance data analysis monitoring ........................................................................................ 145 8.5.1 Organization and compilation ...................................................................................................... 145 8.5.2 Data Extraction ................................................................................................................................... 146 8.5.3 Image Generation ............................................................................................................................... 149 8.5.4 Additional notes ................................................................................................................................. 151

Appendix A: GSI tools ........................................................................................................... 153 A.1 BUFR Tools and SSRC .............................................................................................................. 153 A.2 Read GSI Diagnostic Files ....................................................................................................... 153 A.3 Read and Plot Convergence Information from fort.220 .............................................. 157 A.4 Plot Single Observation Test Result and Analysis Increment ................................... 158 Appendix B: GSI Namelist .................................................................................................... 159

vi

Overview

Chapter 1: Overview GSI history and background: The Gridpoint Statistical Interpolation (GSI) system is a unified variational data assimilation (DA) system for both global and regional applications. It was initially developed by the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) as a next generation analysis system based on the then operational Spectral Statistical Interpolation (SSI) analysis system. Instead of being constructed in spectral space like the SSI, the GSI is constructed in physical space and is designed to be a flexible, state-of-art system that is efficient on available parallel computing platforms. After initial development, the GSI analysis system became operational as the core of the North American Data Assimilation System (NDAS) in June 2006 and the Global Data Assimilation System (GDAS) in May 2007 at NOAA. Since then, the GSI system has been adopted in various operational systems, including the National Aeronautics and Space Administration (NASA) global atmospheric analysis system, the NCEP Real-Time Mesoscale Analysis (RTMA) system, the Hurricane WRF (HWRF), and the Rapid Refresh (RR) system. The number of groups involved in operational GSI development has also been expanded to include more development groups, including NASA Goddard Global Modeling and Assimilation Office (GMAO), NOAA Earth System Research Laboratory (ESRL), and National Center for Atmospheric Research (NCAR) Earth System Laboratory’s Mesoscale and Microscale Meteorology Division (MMM). GSI becomes community code: Starting in 2007, the Developmental Testbed Center (DTC) began collaborating with major GSI development groups to transform the operational GSI system into a community system and support distributed development. The DTC has complemented the development groups in providing GSI documentation, porting GSI to multiple platforms, and testing GSI in an independent and objective environment, while still maintaining functionally equivalent to operational centers. Working with NCEP Environmental Modeling Center (EMC), the DTC is maintaining a community GSI repository that is equivalent to the operational developmental repository and facilitates community users to develop GSI. Based on the repository, the DTC releases GSI code annually with intermediate bug fixes. The first community version of the GSI system was released by the DTC in 2009. Since then, the DTC has provided community support through the GSI helpdesk ([email protected]), community GSI webpage (http://www.dtcenter.org/com-GSI/users/index.php), and annual community GSI tutorial/workshop.

1

Overview GSI code management and Review Committee: In late 2010, a GSI review committee was formed to coordinate distributed development of the GSI system. The committee is composed of primary GSI research and operational centers, including NCEP/EMC, NASA/GMAO, NOAA/ESRL, NCAR/MMM, AFWA and DTC. As one of the committee members, the DTC represents the research community with GSI users from all over the world. The GSI Review Committee primarily steers distributed GSI development and community code management and support. The responsibilities of the Review Committee are divided into two major aspects: coordination and code review. The purpose and guiding principles of the Review Committee are as follows: Coordination and Advisory • Propose and shepherd new development • Coordinate on-‐going and new development • Process management • Community support recommendation GSI Code Review • Establish and manage a unified GSI coding standard followed by all GSI developers. • Establish and manage a process for proposal and commitment of new developments to the GSI repository. • Review proposed modifications to the code trunk. • Make decisions on whether code change proposals are accepted or denied for inclusion in the repository and manage the repository. • Oversee the timely testing and inclusion of code into the repository. The review committee is committed to facilitating the transition from research to operations (R2O). Prospective contributors of code should contact the DTC through the GSI helpdesk ([email protected] ). The DTC will help the prospective contributors go through a scientific review from the GSI Review Committee to avoid any potential conflict in code development. Upon the approval of the GSI review committee, the DTC will work with the prospective contributors in the preparation and integration of their code into the GSI repository. About this GSI release veriosn: As a critical part of the GSI user support, this GSI User’s Guide is provided to assist users in applying GSI to data assimilation and analysis studies. It was composed by the DTC and reviewed by the GSI Review Committee members. Please note the DTC currently focuses on testing and evaluation of GSI for limited area Numerical Weather Prediction (NWP) applications; however, the long-term plan includes transitioning to global forecast applications. This documentation describes the July 2012 version 3.1 release, which 2

Overview includes new capabilities and enhancements as well as bug fixes. This version of GSI is based on a reversion of the community GSI repository from May 2012. The GSI version 3.1 is primarily a three-dimensional variational (3D-Var) system with modules developed for advanced features, such as the four-dimensional variational (4DVar), ensemble hybrid variational analysis, and observation sensitivity capabilities. Coupled with forecast models and their adjoint models, GSI can be turned into a 4D-var system. Combined with an ensemble system, this GSI can also be used in an ensemblevariational hybrid data assimilation system (GSI-hybrid) with the appropriate configuration. An example of such an application is the current Global Data Assimilation System (GDAS) at NCEP, using the GSI-hybrid capability with NCEP’s global ensemble forecast system. Observations used by this version GSI: GSI is being used by various applications on multiple scales. Therefore, the types of observations GSI can assimilate vary from conventional to aerosol observations. Users should use observations with caution to fit their specific applications. The GSI version 3.1 can assimilate, but is not limited to, the following types of observations: Conventional observations: (including satellite retrievals)  Radiosondes  Pibal winds  Synthetic tropical cyclone winds  Wind profilers  Conventional aircraft reports  ASDAR aircraft reports  MDCARS aircraft reports  Dropsondes  MODIS IR and water vapor winds  GMS, JMA, METEOSAT and GOES cloud drift IR and visible winds  GOES water vapor cloud top winds  Surface land observations  Surface ship and buoy observation  SSM/I wind speeds  QuikScat and ASCAT wind speed and direction  SSM/I and TRMM TMI precipitation estimates  Doppler radial velocities  VAD (NEXRAD) winds  GPS precipitable water estimates  GPS Radio occultation refractivity and bending angle profiles  SBUV ozone profiles and OMI total ozone  SST  Tropical storm VITAL Satellite radiance/brightness temperature observations (instrument/satellite ID)  SBUV: n17, n18, n19  HIRS: metop-a, metop-b, n17, n19

3

Overview               

GOES_IMG: g11, g12 AIRS:aqua AMSU-A: metop-a, metop-b, n15, n18, n19 AMSU-B: metop-b, n17 MHS: metop-a, metop-b, n18, n19 SSMI: f14, f15 SSMIS: f16 AMSRE: aqua SNDR: g12 IASI: metop-a, metop-b GOME: metop-a, metop-b, OMI: aura SEVIR: m08, m09, m10 ATMS: NPP CRIS: NPP

What is new in this release version: The following lists some of the new functions and changes included in the GSI release version 3.1 versus version 3.0: 1. Development of new data assimilation techniques: • Addition of 4D capability for ensembles to allow several flavors of 4DVAR using ensembles; • Enhancement of Global hybrid function and implementation. Development of hybrid capability for NAM and HWRF. Addition of dual resolution capability for regional hybrid ensemble applications; • Addition of cloudy radiance assimilation; • Addition of the NCEP 4DVAR perturbation model as a tool in utility directory; • Incorporation of hybrid capability to sqrt(B) option. 2. Improvement to general data analysis techniques: • Updated and enhanced the utility Radiance Monitoring package for easily installation and better monitoring radiance observations; • Improved the use of the significant levels in radiosonde observation; • Extended FGAT capability to NAM and ARW NetCDF interface; • Added GSI Metguess bundle; • Added the Bi-Conjugate Gradient minimization option; • Added radiance bias correction spin-up of new instruments: the new channels are assigned as passive channels and monitored before they are used in the data assimilation. While they are passive, bias correction will be performed; • Improved the GSD cloud analysis scheme; • Improved surface data analysis for RAP applications.

4

Overview 3. Improvement of the use of observations and addition of new observations: • Enhanced assimilation of GPS-RO data; • Added climatological monthly zonal mean CO2 fields and allow these fields to impact the CRTM's radiance computation; • Inclusion of ATMS. Added a new fixed file for the ATMS filtering; • Added capability to do chemical data assimilation, including aerosol, PM2.5, and MODIS AOD; • Allow assimilation of Doppler wind Lidar data; • Improved data quality control process for avhrr, AMSU-A channel 5; • Modification of the specification of TCVITL surface pressure observational errors; • Added the capability to assimilate MLS ozone BUFR data. Added an option to revert ozone analysis to uni-variate above specified level in hybrid; • Additional quality control for satellite winds; • Inclusion of JMA water vapor cloud (type 250) winds; • Inclusion of the ability to use SEVIRI data; • Improvement of radar radial wind data analysis: ∗ Added observation subtype to discriminate between level-2, level-2.5, and level-3 observations; ∗ Turned off the use of level-2.5 radial winds in the CONUS domain but kept the level-2.5 data for Alaska; ∗ Increased the observation error for level-3 data in the CONUS domain. • Enhancements to RTMA: ∗ Added the analysis of: (i) 10-m wind gust; (ii) visibility; and (iii) planetary boundary layer height; ∗ Included the new report types (192-195, 292-295) that were created for surface observations missing a surface pressure report; ∗ Added the Hilbert Curve-based cross-validation. Please note due to the version update, some diagnostic files and static information (“fixed”) files have been changed as well. The structure of this User’s Guide: The User’s Guide is organized as follows. Chapter 1 provides a background introduction of GSI. Chapter 2 contains basic information about how to get started with GSI – including system requirements; required software (and how to obtain it); how to download GSI; and information about compilers, libraries, and how to build the code. Chapter 3 focuses on the input files needed to run GSI and how to configure and run GSI. Chapter 4 includes information about diagnostics and tuning of the system. Chapter 5 illustrates how to setup configurations to run GSI with conventional, radiance, and GPSRO data and how to diagnose the results. A brief introduction to the GSI hybrid application is also included in this chapter. Chapter 6 introduces the data assimilation and minimization theories behind the GSI system and the GSI code structure. Chapter 7 includes some basics on the 5

Overview background and observation errors. Finally, Chapter 8 introduces several topics on satellite radiance data assimilation using GSI, including configuration, bias correction, and monitoring. Appendix A introduces the community tools available for GSI users and Appendix B is a complete list of the GSI namelist with explanations.

6

Software Installation

Chapter 2: Software Installation 2.1 Introduction The DTC community GSI is a community distribution of NOAA’s operational GSI. The original operational code was developed at NOAA/NCEP for use on IBM supercomputers. The community GSI expands the portability of the operational code by adding a flexable build system and run script that allows GSI to be compiled and run on a variety of platforms. The current version of GSI builds and runs on Linux platforms using Intel or PGI compilers, IBM supercomputers using the xlf compiler, and Intel Macintosh computers using the PGI compiler. This chapter provides build instructions applicable only to the DTC community GSI. They do not pertain to NCEP’s operational version of the code because of the different build environment and operational requirements. However, it can be easily adopted for the same purpose. This chapter describes how to build and install the GSI on your computing resources. The process consists of a few steps: • Obtain the source code (section 2.2). o Obtain GSI from the community GSI web page. o Obtain the WRF model from the WRF project page. • Build the WRF model (see the WRF users guide). • Set the appropriate environment variables for the GSI build (section 2.3.1). • Configure and compile the GSI source code (section 2.3.2). This chapter is organized as follows: Section 2.2 describes how to obtain the source code. Section 2.3 covers setting up the build environment, configuring the build system, and compiling the code. Section 2.4 covers the system requirements (tools, libraries, and environment variable settings) and currently supported platforms in detail. Section 2.5 covers the directory structure and supplemental NCEP libraries included with the distribution. Section 2.6 discusses what to do if you have problems with the build and where to get help. And section 2.7 briefly illustrates how to port GSI to another platform. For beginning users, sections 2.2 and 2.3 provide the necessary overview needed to build GSI on most systems. The remainder of the chapter is provided for completeness and in case the user has any difficulty building the code. 2.2 Obtaining the Source Code The community GSI resources, including source code, build system, utilities, and documentation, are available from the DTC community GSI users website, located at

7

Software Installation http://www.dtcenter.org/com-GSI/users/index.php

The source code is available by first selecting the Download tab on the vertical menu located on the left of the page, and then selecting the GSI System submenu. New users must first register before downloading the source code. Returning users only need their registration email address to log in. After getting in the download page, select the link to the comGSI v3.1 tarball to download the most recent version, as of July 2012, of the source code. Selecting the newest release of the community GSI is critical for having the most recent capabilities, supplemental libraries, and code fixes. To analyze satellite radiance observations, GSI requires the CRTM coefficients from the most recent version of the CRTM library. As a convience to the user, and because of its large size, the latest CRTM coefficients are provided as a separate tarfile, and can be downloaded by selecting the link CRTM 2.0.5 coefficients tarball from the web page. The community GSI version 3.1 comes in a tar file named comGSI_v3.1.tar.gz. The tar file may be unpacked by using the UNIX commands: gunzip –d comGSI_v3.1.tar.gz tar -xvf comGSI_v3.1.tar

or the single Linux command tar -zxvf comGSI_v3.1.tar.gz

This creates the top level GSI directory comGSI_v3.1/. After downloading the source code, and prior to building, the user should check the known issues link on the DTC website to determine if any bug fixes or platform specific customizations are needed. Since the WRF model is needed for a successful GSI build, the WRF code should be downloaded and built prior to proceeding with the GSI build. The WRF code, and full WRF documentation, can be obtained from the WRF Users Page, http://www.mmm.ucar.edu/wrf/users/

following a registration process similar to that for downloading GSI. 2.3 Compiling GSI This section provides a quick overview of how to build GSI on most systems. Typically GSI will build “straight out of the box” on any system that successfully builds the WRF model. Should the user experience any difficulties with the default build, check the build environment against the requirements described in section 2.4.

8

Software Installation To proceed with the GSI build, it is assumed that the WRF model has already been built on the current system. GSI reads and writes I/O using the WRF I/O API libraries. These I/O libraries are created as part of the WRF build, and are linked into GSI during the GSI build process. In order to successfully link the WRF I/O libraries with the GSI source, it is crucial that both WRF and GSI are built using the same compilers. This means that if WRF is built with the Intel Fortran compiler, then GSI must also be built with the same Intel Fortran compiler. 2.3.1 Environment Variables Before configuring the GSI code to be built, at least one, and up to three environment variables may need to be set. The first variable WRF_DIR defines the path to the root of the WRF build directory. This one is mandatory. It tells the GSI build system where to find the WRF I/O libraries. The second environment variable specifies the local path to NetCDF library. The variable for this is called NETCDF. The third variable is LAPACK_PATH, which specifies the path to the LAPACK library on your system. These variables must be set prior to configuring the build. Be aware that these paths are dependent on your choice of compiler. In many cases, the latter two variables may already be defined as part of your login environment. This is typical on systems with only one compiler option such as IBM supercomputers using the xlf compiler. On systems with multiple compiler options, such as many Linux systems, these variables may be set as part of the module settings for each compiler. The process for setting the environment variables varies according to the login shell in use. To set the path variable WRF_DIR for csh/tcsh, type; setenv WRF_DIR /path_to_WRF_root_directory/

for bash/ksh, the syntax is; export WRF_DIR=/path_to_WRF_root_directory/

As one of the additional environment variables, the value of the NETCDF variable may be checked by “echo’ing” the variable using: echo $NETCDF

If the command returns with the response that the variable is undefined, such as NETCDF: Undefined variable.

it is then necessary to manually set this variable. If your system uses modules or a similar mechanism to set the environment, do this first. If a valid path is returned by the echo command, no further action is required.

9

Software Installation The last environment variable, LAPACK_PATH, defines the path to the LAPACK library. Typically, this variable will only need to be set on systems without a vender provided version of LAPACK. IBM systems typically come installed with the LAPACK equivalent ESSL library that links automatically. Likewise, the PGI compiler often comes with a vender provided version of LAPACK that links automatically with the compiler. Over the last three years, the GSI help staff have experienced only two situations where the LAPACK variable needed to be set. 1. On stripped down versions of the IBM operating system that come without the ESSL libraries. 2. Linux environments using Intel Fortran compiler. Of the two, the second of these is the most common. The Intel compiler usually comes with a vender provided mathematics library known as the “Mathematics Kernal Libraries” or MKL for short. While most installations of the Intel compiler typically come with the MKL libraries installed, the ifort compiler does not automatically load the library. It is therefore necessary to set the LAPACK_PATH variable to the location of the MKL libraries when using the Intel compiler. You may need to ask your system administrator for the correct path to these libraries. On the NOAA supercomputers, the Intel build environment can be specified through setting the modules. When this is done, the MKL library path is available through a local variable, MKL on Jet, and MKL_ROOT on Zeus. Machine Name Jet Zeus

MKL library path MKL MKL_ROOT

Supposing that the MKL library path is set to the environment variable MKL, then the LAPACK environment may be set for csh/tcsh with setenv LAPACK_PATH $MKL

and for bash/ksh by export LAPACK_PATH=$MKL

Once the environment variables have been set, the next step in the build process is to first run the configure script and then the compile script. 2.3.2 Configure and Compile Once the environment variables have been set, building the GSI source code requires two steps: 1. Run the configure script 10

Software Installation 2. Run the compile script Change into the comGSI_v3.1 directory and issue the configure command: ./configure

This command uses user input to create a platform specific configuration file called configure.gsi. The script starts by echoing the NETCDF and WRF_DIR paths set in the previous section. It then examines the current system and queries the user to select from multiple build options. On selecting an option, it reports a successful configuration with the banner: -----------------------------------------------------------------------Configuration successful. To build the GSI, type: compile ------------------------------------------------------------------------

Failure to get this banner means that the configuration step failed to complete. The most typical reason for a failure is an error in one of the paths set to the environment variables. On Linux 64 bit environments, the supported compiler options are: 1. 2. 3. 4.

Linux Linux Linux Linux

x86_64, x86_64, x86_64, x86_64,

PGI compilers (pgf90 & pgcc) PGI/gnu compiler (pgf90 & gcc) Intel compilers (ifort & icc) Intel/gnu compiler (ifort & gcc)

(dmpar,optimize) (dmpar,optimize) (dmpar,optimize) (dmpar,optimize)

It is important to note that on Linux systems the Fortran compiler and the C compiler do not have to match. This allows the build to use the GNU C-compiler in place of the Intel or PGI C-compiler. On IBM computers with the xlf compiler, there are two build options: 1. 2.

AIX 64-bit AIX 32-bit

(dmpar,optimize) (dmpar,optimize)

You may choose either option depending on your computer platform. After selecting a build option, run the compile script: ./compile

It is typically useful to capture the build information to a log file by redirecting the output. To conduct a complete clean, which removes ALL built files in ALL directories, as well as the configure.gsi, type: ./clean -a

A complete clean is strongly recommended if the compilation failed or if the configuration file is changed. 11


Following a successful compile, the GSI executable gsi.exe can be found in the run/ directory. If the executable is not found, check the compilation log file for the word “Error” with a capital “E” to determine the failure. 2.4 System Requirements and External Libraries The source code for GSI is written in FORTRAN, FORTRAN 90, and C. In addition, the parallel executables require some flavor of MPI and OpenMP for the distributed memory parallelism. Lastly the I/O relies on the NetCDF I/O libraries. Beyond standard shell scripts, the build system relies on the Perl scripting language and GNU Make. The basic requirements for building and running the GSI system are the following:        

FORTRAN 90+ compiler C compiler MPI v1.2+ OpenMP Perl NetCDF V3.6+ or V4.0+ LAPACK and BLAS mathematics libraries WRF V3.3.1+

Because all but the last of these tools and libraries are typically the purview of system administrators to install and maintain, they are lumped together here as part of the basic system requirements. 2.4.1 Compilers The DTC community GSI system successfully builds and runs on IBM AIX, Linux, and Mac Darwin platforms. The following compiler/OS combinations are supported: • •

•

IBM with xlf Fortran compiler Linux (both 32 and 64-bit) with o PGI pgf90 & pgcc o PGI pgf90 & Gnu gcc o Intel ifort & icc o Intel ifort & Gnu gcc Mac Darwin with PGI

Currently, GSI is only tested on the IBM and Linux compiler configurations due to availability of testing resources. Unforeseen build issues may occur when using older compiler and library versions. As always, the best results come from using the most recent version of compilers. 12


2.4.2 NetCDF and MPI GSI requires a number of support libraries not included with the source code. These libraries must be built with the compilers being used to build the remainder of the code. Because of this, they are typically bundled along with the compiler installation, and are subsequently referred to as system libraries. For our needs, the most important libraries amoung these are NetCDF, MPI and OpenMP. The NetCDF library is an exception to the rule of using the most recent version of libraries. GSI will work with either V3.6+ or the new V4. Recent updates to NetCDF V4 support backward compatability. The V4 series is significantly more difficult to install, so version 3.6+ may be preferred for that reason. The NetCDF libraries can be obtained from the Unidata website: http://www.unidata.ucar.edu

Typically, the NetCDF library is installed in a directory that is included in the users path such as /usr/local/lib. When this is not the case, the environment variable NETCDF, can be set to point to the location of the library. For csh/tcsh, the path can be set with the command: setenv NETCDF /path_to_netcdf_library/

For bash/ksh, the path can be set with the command: export NETCDF=/path_to_netcdf_library/

It is crucial that system libraries, such as NetCDF, be built with the same FORTRAN compiler, compiler version, and compatible flags, as used to compile the remainder of the source code. This is often an issue on systems with multiple FORTRAN compilers, or when the option to build with multiple word sizes (e.g. 32-bit vs. 64-bit addressing) is available. Many default Linux installations include a version of NetCDF. Typically this version is only compatible with code compiled using gcc. To build GSI, a version of the library must be built using your preferred Fortran compiler and with both C and FORTRAN bindings. If you have any doubts about your installation, ask your system administrator. GSI requires that a version of the MPI library with OpenMP support be installed. GSI is written as a distributed memory parallel code and these libraries are required to build the code. Just as with the NetCDF library, the MPI library must be built with the same FORTRAN compiler, and use the same word size option flags, as the remainder of the source code.

13

Software Installation Installing MPI on a system is typically a job for the system administrator and will not be addressed here. If you are running GSI on a computer at a large center, check the machines’ documentation before you ask the local system administrator. On Linux or Mac Darwin systems, you can typically determine whether MPI is installed by running the following UNIX commands:  

which mpif90 which mpicc

If any of these tests return with Command Not Found, there may be a problem with your MPI installation or the configuration of user’s account. Contact your system administrator for help if you have any questions. 2.4.3 LAPACK and BLAS The LAPACK and BLAS are open source mathematics libraries for the solution of linear algebra problems. The source code for these libraries is freely available to download from NETLIB at http://www.netlib.org/lapack/

Most commercial compilers provide their own optimized versions of these routines. These optimized versions of BLAS and LAPACK provide superior performance to the open source versions. On the IBM machines, the AIX compiler is often, but not always, installed with the Engineering and Scientific Subroutine Libraries or ESSL. In part, the ESSL libraries are highly optimized parallel versions of many of the LAPACK and BLAS routines. Because GSI was developed on an IBM platform, all of the required linear algebra routines are available from the ESSL libraries. On Linux or Mac Darwin systems, GSI supports both the Intel and PGI compilers. The Intel compiler has its own optimized version of the BLAS and LAPACK routines called the Math Kernel Library or MKL. The MKL libraries provide sufficient routines to satisfy the requirements of GSI. The PGI compiler typically comes with its own version of the BLAS and LAPACK libraries. Again, the PGI version of BLAS and LAPACK contain the necessary routines to build GSI. For PGI these libraries are loaded automatically. Should your compiler come without any vender supplied linear algebra libraries, it will be necessary to download and build your own local version of the BLAS and LAPACK libraries. Just as with the NetCDF and MPI libraries, the LAPACK/BLAS libraries must be built with the same FORTRAN compiler, and use the same word size option flags, as the remainder of the source code.

14

Software Installation 2.5 Directory Structure, Source code and Supplemental Libraries The GSI system includes the GSI source code, build system, supplemental libraries, fixed files, and run scripts. The following table lists the system components found inside of the main GSI directory. Directory Name src/main/ src/libs/ fix/

include/ lib/ run/ arch/ util/

Content GSI source code and makefile Source directory for supplemental libraries Fixed input files required by a GSI analysis, such as background error covariances, observation error tables; excluding CRTM coefficients Include file directory, created by the build system Directory containing compiled supplemental libraries created by the build Directory for executable gsi.exe and sample run script Build options and machine architecture specifics Tools

The src/ directory contains two subdirectories. The main/ directory holds the main GSI source code along with its makefiles, and the The libs/ holds the source code for the supplemental libraries. The fix/ directory holds the static or fixed input files used by GSI. These files include among other things, the background error covariances and the observation error tables. Starting with version 3 of GSI, the GSI tar file no longer contains the CRTM coefficients due to their size. The CRTM coefficients are now included as a separate tar file and may be stored separately. This will be discussed further in the next chapter. The include/ and lib/ directories are created by the build process to store the include and library files, respectively. The run/ directory contains the executable and a sample run script for conducting an analysis run. The arch/ directory contains the machine specific information used by the build system. The build system is discussed in detail in section 2.7. Lastly, the util/ directory contains tools for GSI diagnostics. For the convenience of the user, supplemental NCEP libraries for building GSI are included in the src/libs/ directory. These libraries are built when GSI is built. These supplemental libraries are listed in the table below. Directory Name Content Source code for supplemental libraries (under src/libs/. except WRF) bacio/ NCEP BACIO library bufr/ NCEP BUFR library crtm_gfsgsi_2.0/ JCSDA community radiative transfer model v2.0.5 gfsio/ Unformatted Fortran record for GFS IO gsdcloud/ GSD Cloud analysis library misc/ Misc support libraries nemsio/ NEMS I/O library 15

Software Installation sfcio/ sigio/ sp/ w3/ WRF/

NCEP GFS surface file i/o module NCEP GFS atmospheric file i/o module NCEP spectral - grid transforms NCEP W3 library (date/time manipulation, GRIB) IO API libraries are used by GSI (not with GSI tar file)

The one “library” in this table, which is not included with the source code, is WRF. GSI uses the WRF I/O API’s to read NetCDF files created as WRF output. Therefore a copy of successfully compiled WRF is required for the GSI build. Please note that only WRFV3.3.1 and WRFV3.4 are tested with this current version of GSI.

2.6 Getting Help and Reporting Problems Should the user experience any difficulty building GSI on their system, please first confirm that all the required software is properly installed (section 2.4). Next check that the external libraries exist and that their paths are correct. Lastly, check the resource file configure.gsi for errors in any of the paths or settings. Should all these check out, feel free to contact the community GSI supporters at [email protected] for assistance. At a minimum, when reporting code building problems to the helpdesk, please include with your email a copy of the build log, and the contents of the configure.gsi file.

2.7 Modifying the GSI Build Environment The GSI build system comes with build settings that should compile on most standard systems. Typically if the WRF model builds on a system, GSI will build there as well. Unfortunately because standardization is lacking when it comes to setting up Linux HPC environments, it may be necessary to customize the GSI build settings in order to account for eccentric computing environments. Typically build problems can be traced back to issues with the libraries, the compiler, or the support utilities such as cpp. These types of issues can usually be solved by customizing the default configuration file settings. This sets up an iterative process where the build parameters are modified, the compile script is run, build errors diagnosed, and the process repeated. 2.7.1 Understanding the Build System The GSI build system uses a collection of data files and scripts to create a configuration resource file that defines the local build environment.

16

Software Installation At the top most level there are four scripts. The clean script removes everything created by the build. The configure script takes local system information and queries the user to select from a collection of build options. The results of this are saved into a resource file called configure.gsi. Once the configure.gsi file is created, the actual build is initiated by running the compile script. The compile script then calls the top-level makefile and builds the source code. Name makefile arch/ clean configure compile

Content Top-level makefile Build options and machine architecture specifics Script to clean up the directory structure Script to configure the build environment for compilation. Creates a resource file called configure.gsi Script for building the GSI system. Requires the existence of the configure.gsi prior to running

The compile script uses the resource file configure.gsi to set paths and environment variables required by the compile. The configure script generates the resource file configure.gsi by calling the Perl script Config.pl, located in the arch/ directory. The The script Config.pl combines the build information from the files in the arch/ directory with machine specific and user provided build information to construct the configure.gsi resource file. A clean script is provided to remove the build objects from the directory structure. Running ./clean scrubs the directory structure of the object and module files. Running a clean-all ./clean –a removes everything generated by the build, including the library files, executables, and the configure resource file. Should the build fail, it is strongly recommended that the user run a ./clean –a prior to rerunning the compile script. The arch/ directory contains a number of files used to construct the configuration resource file configure.gsi. File name preamble configure.defaults

postamble

Description Uniform requirements for the code, such as word size, etc. Selection of compilers and options. Users can edit this file if a change to the compilation options or library locations is needed. Also, it can be edited to add a new compilation option if needed. Standard compilation (“make”) rules and dependencies

Most users will not need to modify any of these files unless experiencing significant build issues. Should a user require a significant customization of the build for their local computing environment, those changes would be saved to the configure.defaults file only after first testing these new changes in the temporary configure.gsi file.

17

Software Installation 2.7.2 Configuration Resource File The configuration resource file configure.gsi contains build information, such as compiler flags and paths to system libraries, specific to a particular machine architecture and compiler. To illustrate its contents, lets look at the resource for the Linux Intel build. The header describes the overall build environment • Linux x86 with 64 bit word size • Uses Intel Fortran and C compilers # Settings for Linux x86_64, Intel compiler (ifort & icc)

(dmpar,optimize)#

The link path points to an Intel version of NetCDF and the OpenMP libraries. LDFLAGS

=

-Wl,-rpath,/usr/local/netcdf3-ifort/lib -openmp

The code directory location: COREDIR

=

$HOME/comGSI_v3.1

Various environment variables: INC_DIR BYTE_ORDER

= =

$(COREDIR)/include LITTLE_ENDIAN

Compiler definitions • Intel ifort fortran compiler • Intel icc C compiler SFC SF90 SCC

= = =

ifort ifort -free icc

The include paths for GSI source code and NetCDF: INC_FLAGS

=

-module $(INC_DIR) -I $(INC_DIR) -I /usr/local/netcdf3-ifort/include

The default Fortran compiler flags for the main source code: FFLAGS_DEFAULT = FFLAGS =

-fp-model precise -assume byterecl -fpe0 -ftz -O3 $(FFLAGS_DEFAULT) $(INC_FLAGS) $(LDFLAGS) -DLINUX

The default Fortran compiler flags for the external libraries: FFLAGS_BACIO FFLAGS_BUFR CFLAGS_BUFR FFLAGS_CLOUD FFLAGS_CRTM FFLAGS_GFSIO FFLAGS_SFCIO FFLAGS_SIGIO FFLAGS_SP FFLAGS_W3 INC_FLAGS_W3 #

= = = = = = = = = = =

-O3 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -O3 -DUNDERSCORE -O3 $(FFLAGS_DEFAULT) -O2 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -O3 $(FFLAGS_DEFAULT) -module ../sigio

$(FFLAGS_i4r8)

$(FFLAGS_i4r4) $(FFLAGS_i4r4) $(FFLAGS_i4r4) $(FFLAGS_i4r8)

18

Software Installation The default CPP path and flags. If your system has multiple versions of cpp it may be necessary to specify the correct one here. CPP CPP_FLAGS CPP_F90FLAGS

= = =

cpp -C -P -D$(BYTE_ORDER) -D_REAL8_ -DWRF -DLINUX -traditional-cpp -lang-fortran

The MPI compiler definitions: DM_FC DM_F90 DM_CC

= = =

mpif90 mpif90 -free icc

The default C compiler flags: CFLAGS CFLAGS2 long'

= =

-O0 -DLINUX -DUNDERSCORE -DLINUX -Dfunder -DFortranByte=char -DFortranInt=int -DFortranLlong='long

The default library paths and names • Variable LAPACK_PATH needs to point to the MKL library location • The library names may be different on other systems MYLIBsys

= -L$(LAPACK_PATH) -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide

NetCDF path information • Older versions of NetCDF only have the single library –lnetcdf. If you are using an older version you may need to remove the first library name. NETCDFPATH NETCDFLIBS

= =

/usr/local/netcdf3-ifort -lnetcdff -lnetcdf $(NETCDF_PATH)

Do not modify anything below these NetCDF environment variables.

2.7.3 Modification Example To demonstrate how one would go about modifying the configuration resource file, the generic Linux/Intel configuration will be ported to build on an SGI Linux cluster called Zeus. Zeus comes with a vender supplied version of MPI, which necessitates modification of the MPI paths. The first change is that Zeus does not use the traditional MPI wrappers such as mpif90 to invoke the compiler. Instead the Intel compiler is called directly with use of an additional –lmpi flag. Therefore the DM compiler definitions become: DM_FC DM_F90 DM_CC

= = =

ifort ifort -free icc

Next, additional link flags for MPI are needed. These are in bold. LDFLAGS

=

-Wl,-rpath,/usr/local/netcdf3-ifort/lib -L$MPI_ROOT/lib -lmpi -openmp

Then add the path to the MPI include directory, along with the additional Fortran flag.

19


FFLAGS_DEFAULT =

-msse2 -fp-model precise -assume byterecl -fpe0 -ftz -I$MPI_ROOT/include

An equivalent include path for the C flags are also needed. CFLAGS

=

-O0 -DLINUX -DUNDERSCORE -I$MPI_ROOT/include

Lastly, a difference with the MKL libraries eliminates the need for the –lguide library. MYLIBsys

= -L$(LAPACK_PATH) -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core

These changes should be saved to the users configure.gsi resource file and tested. Once they are confirmed to work, they may be moved into the configure.defaults file located in the arch/ directory as a new build target. To save your new build configuration, open the file configure.defaults, located in the arch/ directory. You will notice that it contains a collection of platform/compiler specific entries. The first entry is for the IBM platform, using the xlf compiler with 64-bit word size. This entry is indicated by a label at the top of the block starting with the tag #ARCH. For the 64-bit IBM build, the tag is: #ARCH AIX 64-bit

#dmpar

The block for the 64-bit IBM build is immediately followed by the 32-bit IBM build entry, which is indicated by the tag: #ARCH AIX 32-bit

#dmpar

Each subsequent build specification is delinated by a similar tag. For our port of the generic Intel build to Zeus, locate the tag for the Linux/Intel build with 64 bit words. Its header looks like this: #ARCH Linux x86_64, Intel compiler (ifort & icc) # (dmpar,optimize)

Duplicate this entry and give it a unique name by modifying the ARCH entry. #ARCH Linux x86_64, Intel compiler SGI MPT (ifort & icc) # (dmpar,optimize)

Then update the variables to match the settings in the configure.gsi resource file tested previously, and save your changes. Now when you run the ./configure script, there will be a new build option for an SGI MPT build.

20

Running GSI

Chapter 3: Running GSI This chapter starts with discussing the input files required to run GSI. It then proceeds with a detailed explanation of a sample run script and some frequently used options from the GSI namelist. It concludes with samples of GSI products produced by a successful GSI run. 3.1 Input Files In most cases, three types of input files (background, observation, and fixed files) must be available to run GSI. In some special idealized cases, such as a pseudo single observation test, GSI can be run without any observations. •

Background or first guess field As with other data analysis systems, the background or first guess fields may come from a model forecast conducted separately or from a previous data assimilation cycle. The following is a list of the types of background fields supported by this release code: a) b) c) d) e) f) g)

WRF NMM input fields in binary format WRF NMM input fields in NetCDF format WRF ARW input fields in binary format WRF ARW input fields in NetCDF format GFS input fields in binary format NEMS-NMMB input fields RTMA input files (2-dimensional binary format)

Please note that WRF is a community model system, including two dynamical cores: the Advanced Research WRF (ARW) and the nonhydrostatic Mesoscale Model (NMM). The GFS (Global Forecast System), NEMS (National Environmental Modeling System)-NMMB (Nonhydrostatic Mesoscale Model B-Grid), and RTMA (Real-Time Mesoscale Analysis) are operational systems of NCEP. Currently, the DTC supports GSI for regional community models, such as WRF NMM and ARW. Therefore, most of the multiple platform tests were conducted using WRF netcdf background files (b, d). While the formal regression test suite conducts tests of all 7 background file formats (a-g) on an IBM platform, the NetCDF options are the most thoroughly tested. For the current release, the following tests have been thoroughly performed: 1. Background (a)-(g) were tested with regression cases on IBM 2. NMM NetCDF (b) and ARW NetCDF (d) were tested with simple cases on Linux with ifort and pgi, and Mac with pgi. 3. NMM NetCDF (b) and ARW NetCDF (d) were tested with cycled cases •

Observations GSI can analyze many types of observational data, including conventional data, satellite radiance observations, GSP, and radar data. A list of the default observation 21

Running GSI file names used in GSI and the corresponding observations included in the files is provided in the table below: GSI Name prepbufr satwnd amsuabufr amsubbufr radarbufr gpsrobufr ssmirrbufr tmirrbufr sbuvbufr hirs2bufr hirs3bufr hirs4bufr msubufr airsbufr mhsbufr ssmitbufr amsrebufr ssmisbufr gsnd1bufr l2rwbufr gsndrbufr gimgrbufr omibufr

iasibufr gomebufr mlsbufr tcvitl seviribufr atmsbufr crisbufr modisbufr

Content Conventional observations, including ps, t, q, pw, uv, spd, dw, sst, from observation platforms such as METAR, sounding, et al. satellite winds AMSU-A 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17,18, 19 and METOP-A/B AMSU-B 1b radiance (brightness temperatures) from satellites NOAA15, 16,17 Radar radial velocity Level 2.5 data GPS radio occultation observation Precipitation rate observations fromSSM/I Precipitation rate observations from TMI SBUV/2 ozone observations from satellite NOAA16, 17, 18, 19 HIRS2 1b radiance from satellite NOAA14 HIRS3 1b radiance observations from satellite NOAA16, 17 HIRS4 1b radiance observation from satellite NOAA 18, 19 and METOP-A/B MSU observation from satgellite NOAA 14 AMSU-A and AIRS radiances from satellite AQUA Microwave Humidity Sounder observation from NOAA 18, 19 and METOP-A/B SSMI observation from satellite f13, f14, f15 AMSR-E radiance from satellite AQUA SSMIS radiances from satellite f16 GOES sounder radiance (sndrd1, sndrd2, sndrd3 sndrd4) from GOES 11, 12, 13. NEXRAD Level 2 radial velocity GOES sounder radiance from GOES11, 12 GOES imager radiance from GOES 11, 12 Ozone Monitoring Instrument (OMI) observation NASA Aura

Example file names gdas1.t12z.prepbufr

Infrared Atmospheric Sounding Interfero-meter sounder observations from METOP-A/B The Global Ozone Monitoring Experiment (GOME) ozone observation from METOP-A/B

gdas1.t12z.mtiasi. tm00.bufr_d

Aura MLS stratospheric ozone data Synthetic Tropic Cyclone-MSLP observation SEVIRI radiance from MET-08,09,10 ATMS radiance from Suomi NPP CRIS radiance from Suomi NPP MODIS aerosol total column AOD observations from AQUA and TERRA

gdas1.t12z. mlsbufr.tm00.bufr_d gdas1.t12z.syndata.tcvitals.tm00 gdas1.t12z. sevcsr.tm00.bufr_d gdas1.t12z. atms.tm00.bufr_d gdas1.t12z. cris.tm00.bufr_d

gdas1.t12z.satwnd.tm00.bufr_d gdas1.t12z.1bamua.tm00.bufr_d gdas1.t12z.1bamub.tm00.bufr_d ndas.t12z. radwnd. tm12.bufr_d gdas1.t12z.gpsro.tm00.bufr_d gdas1.t12z.spssmi.tm00.bufr_d gdas1.t12z.sptrmm.tm00.bufr_d gdas1.t12z.osbuv8.tm00.bufr_d gdas1.t12z.1bhrs2.tm00.bufr_d gdas1.t12z.1bhrs3.tm00.bufr_d gdas1.t12z.1bhrs4.tm00.bufr_d gdas1.t12z.1bmsu.tm00.bufr_d gdas1.t12z.airsev.tm00.bufr_d gdas1.t12z.1bmhs.tm00.bufr_d gdas1.t12z.ssmit.tm00.bufr_d gdas1.t12z.amsre.tm00.bufr_d gdas1.t12z.ssmis.tm00.bufr_d gdas1.t12z.goesfv.tm00.bufr_d ndas.t12z.nexrad. tm12.bufr_d gdas1.t12z.goesnd.tm00.bufr_d gdas1.t12z.omi.tm00.bufr_d

gdas1.t12z.gome.tm00.bufr_d

22

Running GSI Remarks: 1. Because the current regional models do not have ozone as a prognostic variable, ozone data are not assimilated at the regional scale. 2. GSI can analyze all of the data types on the list, but each GSI run (for both operation or case study) only uses a subset of the data based on their quality, forecast impacts, and availability. Some data may be out-dated and not available, some are on monitoring mode, and some data may have quality issues during certain period. Users are encouraged to check the data quality issues prior to running an analysis. The following NCEP links provide resources that include data quality history: http://www.emc.ncep.noaa.gov/mmb/data_processing/Satellite_Historical_Documentation.htm http://www.emc.ncep.noaa.gov/mmb/data_processing/Non-satellite_Historical_Documentation.htm

3. For use with GSI, the observations are saved in the BUFR/PrepBUFR format (with NCEP specified features). The details on the BUFR/PreBUFR processing are introduced in the following website, which is supported and maintained by the DTC: http://www.dtcenter.org/com-GSI/BUFR/index.php GSI can be run without any observations to see how the moisture constraint modifies the first guess (background) field. GSI can be run in a pseudo single observation mode, which does not require any BUFR observation files. In this mode, users should specify observation information in the namelist section SINGLEOB_TEST (see Section 4.2 for details). As more data files are used, additional information will be added through the GSI analysis.

•

Fixed file (statistics and control files) The GSI system has a directory of static or fixed files (called fix/ ), which includes many files required in a GSI analysis. The following table lists fixed file names used in the GSI code, the content of the files, and corresponding example files in the regional test case:

File name used in GSI

Content

Example files in fix/

anavinfo berror_stats

Information file to set control and analysis variables background error covariance

errtable

Observation error table

anavinfo_arw_netcdf anavinfo_ndas_netcdf nam_nmmstat_na.gcv nam_glb_berror.f77.gcv nam_errtable.r3dv

23

Running GSI Observation data control file (more detailed explanation in Section 4.3) convinfo

Conventional observation information file satellite channel information file precipitation rate observation information file ozone observation information file

satinfo pcpinfo ozinfo

satbias_angle

global_convinfo.txt global_satinfo.txt global_pcpinfo.txt global_ozinfo.txt

Bias correction used by radiance analysis satellite scan angle dependent bias global_satangbias.txt

satbias_in

correction file satellite mass bias correction coefficient file

sample.satbias

Radiance coefficient used by CRTM (not under ./fix directory) IR surface emissivity coefficients EmisCoeff.bin EmisCoeff.bin Aerosol coefficients AerosolCoeff.bin AerosolCoeff.bin Cloud scattering and emission CloudCoeff.bin CloudCoeff.bin ${satsen}.SpcCoeff.bin ${satsen}.TauCoeff.bin

coefficients Sensor spectral response characteristics Transmittance coefficients

${satsen}.SpcCoeff.bin ${satsen}.TauCoeff.bin

Each operation system, such as GFS, NAM, and RTMA, has their own set of fixed files. Therefore, for each fixed file used in GSI, there may be several optional files in the fixed file directory fix/. For example, for the background error covariance file, both nam_nmmstat_na.gcv (which is from the NAM system) and regional_glb_berror.f77. gcv (which is from the global forecast system) can be used. We also prepared the same background error covariance files with different byte order for Linux users such as nam_nmmstat_na.gcv_Little_Endian and nam_glb_berror.f77.gcv_Little_Endian. 3.2 GSI Run Script 3.2.1 Steps in the GSI run script The GSI run script creates a run time environment necessary for running the GSI executable. A typical GSI run script includes the following steps: 1. Request computer resources to run GSI. 2. Set environmental variables for the machine architecture. 3. Set experimental variables (such as experiment name, analysis time, background, and observation). 4. Check the definitions of required variables. 5. Generate a run directory for GSI (sometime called working or temporary directory). 6. Copy the GSI executable to the run directory. 24

Running GSI 7. Copy the background file and link ensemble members if running the hybrid to the run directory. 8. Link observations to the run directory. 9. Link fixed files (statistic, control, and coefficient files) to the run directory. 10. Generate namelist for GSI. 11. Run the GSI executable. 12. Post-process: save analysis results, generate diagnostic files, clean run directory. Typically, users only need to modify specific parts of the run script (steps 1, 2, and 3) to fit their specific computer environment and point to the correct input/output files and directories. Section 3.2.2 covers each of these modifications for steps 1 to 3. Section 3.2.3 dissects a sample run script, designed to run on a Linux cluster, an IBM supercomputer, and a Linux workstation. Users should start with the provided run script and modify it for their own run environment and case configuration. 3.2.2 Customization of the GSI run script 3.2.2.1 Setting up the machine environment This section focuses on modifying the machine specific entries in the run script. This corresponds to step 1 of the run script. Specifically, this consists of setting Unix/Linux environment variables and selecting the correct parallel run time environment (batch system with options). Six specific parallel environments will be considered: IBM supercomputer using LSF (Load Sharing Facility) IBM supercomputer using LoadLevel Linux clusters using PBS (Portable Batch System) Linux clusters using LSF Linux workstation (with no batch system) Intel Mac Darwin workstation with PGI complier (with no batch system) Let us start with the batch queuing system. On large computing platforms, machine resources are allocated through a queuing system. A job (along with queuing options) is submitted to a queue, where the job waits until sufficient machine resources become available to execute the job. Once sufficient resources become available, the code executes until it completes or the machine resources have been exhausted. Two queuing systems will be listed below as examples: 1) IBM with LSF #BSUB #BSUB #BSUB #BSUB #BSUB

-P ???????? -a poe -x -n 12 -R "span[ptile=2]"

# # # # #

account number exlusive use of node (not_shared) number of total tasks how many tasks per node (up to 8)

25

Running GSI #BSUB #BSUB #BSUB #BSUB #BSUB

-J -o -e -W -q

gsi gsi.out gsi.err 00:02 regular

# # # # #

job name output filename (%J to add job id) error filename wall time queue

set -x # Set environment variables for IBM export MP_SHARED_MEMORY=yes export MEMORY_AFFINITY=MCM export BIND_TASKS=yes # Set environment variables for threads export SPINLOOPTIME=10000 export YIELDLOOPTIME=40000 export AIXTHREAD_SCOPE=S export MALLOCMULTIHEAP=true export XLSMPOPTS="parthds=1:spins=0:yields=0:stack=128000000" # Set environment variables for user preferences export XLFRTEOPTS="nlwidth=80" export MP_LABELIO=yes

2) Linux Cluster with PBS #$ #$ #$ #$ #$ #$ #$

-S /bin/ksh -N GSI_test -cwd -r y -pe comp 64 -l h_rt=0:20:00 -A ??????

The Linux workstation and Mac workstation generally do not run a batch system. In these cases, step 1 can be skipped. In both of the examples above, environment variables are set specifying system resource management, such as the number of processors, the name/type of queue, maximum wall clock time allocated for the job, options for standard out and standard error, etc. Some platforms need additional definitions to specify Unix environmental variables that further define the run environment. These variable settings can significantly impact the GSI run efficiency and accuracy of the GSI results. Please check with your system administrator for the optimal settings for your computer system. Note that while the GSI can be run with any number of processors, using more than (number_of_levels) times (number_of_variables) of processors, will not scale well.

26

Running GSI 3.2.2.2 Setting up the running environment There are only two options to define in this block. The option ARCH selects the machine architecture. It is a function of platform type, compiler, and batch queuing system. The option GSIPROC sets the processor count used in the run. This option also decides if the job is run as a multiple processor job or as a single processor run. We listed several choices of the option ARCH in the sample run script. One of these choices should be applicable to a user’s system. Please check with your system administrator about running parallel MPI jobs on your system. Option ARCH IBM_LSF IBM_LoadLevel LINUX LINUX_LSF LINUX_PBS DARWIN_PGI

Platform IBM AIX IBM AIX Linux workstation Linux cluster Linux cluster MAC DARWIN

Compiler xlf, xlc, … xlf, xlc, … Intel/PGI Intel/PGI Intel/PGI PGI

batch queuing system LSF LoadLevel mpirun if GSIPROC > 1 LSF PBS mpirun if GSIPROC > 1

# GSIPROC = processor number used for GSI analysis #-----------------------------------------------GSIPROC=1 ARCH='LINUX' # Supported configurations: # IBM_LSF,IBM_LoadLevel # LINUX, LINUX_LSF, LINUX_PBS, # DARWIN_PGI

3.2.2.3 Setting up an analysis case This section discusses setting up variables specific to the analysis case, such as analysis time, working directory, background and observation files, location of fixed files and CRTM coefficients, and the GSI executable file. A few cautions to be aware of are: • The time in the background file must be consistent with the time in the observation file used for the GSI run (there is a namelist option to turn this check off). • Even if their contents are identical, PrepBUFR/BUFR files will differ if they were created on platforms with different endian byte order specification (Linux intel vs. IBM). If users obtain PrepBUFR/BUFR files from an IBM system, these files must be converted before they can be used on a Linux system. Appendix A.1 discusses the conversion tool ssrc to byte-swap observation files. # ##################################################### # case set up (users should change this part) ##################################################### # # ANAL_TIME= analysis time (YYYYMMDDHH) # WORK_ROOT= working directory, where GSI runs

27

Running GSI # PREPBURF = path of PreBUFR conventional obs # BK_FILE = path and name of background file # OBS_ROOT = path of observations files # CRTM_ROOT= path of crtm coefficients files # FIX_ROOT = path of fix files # GSI_EXE = path and name of the gsi executable #-----------------------------------------------ANAL_TIME=2008051112 WORK_ROOT=./run_${ANAL_TIME}_arw_single BK_FILE=./bkARW/wrfout_d01_2008-05-11_12:00:00 OBS_ROOT=./GSI_DATA/obs PREPBUFR=${OBS_ROOT}/newgblav.gdas1.t12z.prepbufr.nr CRTM_ROOT=./comGSI/crtm/CRTM_Coefficients FIX_ROOT=./comGSI_v3.1/fix GSI_EXE=./comGSI_v3.1/run/gsi.exe

The next part of this block focuses on additional options that specify important aspects of the GSI configuration. Option bk_core indicates the specific WRF core used to create the background files and is used to specify the WRF core when generating the namelist. Option bkcv_option specifies the background error covariance to be used in the script. Two background error covariance matrices are provided with the release, one from NCEP global data assimilation (GDAS), and one form NAM data assimilation system (NDAS). Please check Section 7.2 for more details about GSI background error covariance. Option if_clean is to set if the run script needs to delete temporal intermediate files in the working directory after a GSI run is completed. #-----------------------------------------------# bk_core= which WRF core is used as background (NMM or ARW) # bkcv_option= which background error covariance and parameter will # be used (GLOBAL or NAM) # if_clean = clean :delete temporal files in working directory (default) # no : leave running directory as is (this is for debug only) bk_core=ARW bkcv_option=NAM if_clean=clean

If using hybrid function, the follow block of parameters related to ensemble forecasts also need to be set: #-----------------------------------------------# GSI hybrid options # if_hybrid = .true. then turn on hybrid data analysis # ntotmem = number of ensemble members used for hybrid # mempath = path of ensemble members # # please note that we assume the ensemble member are located # under ${mempath} with name wrfout_d01_${iiimem}. # If this is not the case, please change the following lines: # # if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then # ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem} # else # echo "member ${mempath}/wrfout_d01_${iiimem} is not exit" # fi

28

Running GSI # # if_hybrid=.false. ntotmem=40 mempath=/ensemble_forecast/2011060212/wrfprd

3.2.3 Description of the sample script to run GSI Listed below is an annotated run script (Courier New) with explanations on each function block. For further details on the first 3 blocks of the script that users need to change, check section 3.2.2.1, 3.2.2.2, and 3.2.2.3: #!/bin/ksh ##################################################### # machine set up (users should change this part) ##################################################### # # GSIPROC=1 ARCH='LINUX' # Supported configurations: # IBM_LSF,IBM_LoadLevel # LINUX, LINUX_LSF, LINUX_PBS, # DARWIN_PGI # ##################################################### # case set up (users should change this part) ##################################################### # # ANAL_TIME= analysis time (YYYYMMDDHH) # WORK_ROOT= working directory, where GSI runs # PREPBURF = path of PreBUFR conventional obs # BK_FILE = path and name of background file # OBS_ROOT = path of observations files # CRTM_ROOT= path of crtm coefficients files # FIX_ROOT = path of fix files # GSI_EXE = path and name of the gsi executable ANAL_TIME=2008051112 WORK_ROOT=./run_${ANAL_TIME}_arw_single BK_FILE=./bkARW/wrfout_d01_2008-05-11_12:00:00 OBS_ROOT=./GSI_DATA/obs PREPBUFR=${OBS_ROOT}/newgblav.gdas1.t12z.prepbufr.nr CRTM_ROOT=./comGSI/crtm/CRTM_Coefficients FIX_ROOT=./comGSI_v3.1/fix GSI_EXE=./comGSI_v3.1/run/gsi.exe #-----------------------------------------------# bk_core= which WRF core is used as background (NMM or ARW) # bkcv_option= which background error covariance and parameter will # be used (GLOBAL or NAM) # if_clean = clean:delete temporal files in working directory (default)

29

Running GSI #

no : leave running directory as is (this is for debug only) bk_core=ARW bkcv_option=NAM if_clean=clean

# #-----------------------------------------------# GSI hybrid options # if_hybrid = .true. then turn on hybrid data analysis # ntotmem = number of ensemble members used for hybrid # mempath = path of ensemble members # # please note that we assume the ensemble member are located # under ${mempath} with name wrfout_d01_${iiimem}. # If this is not the case, please change the following lines: # # if [ -r ${mempath}/wrfout_d01_${iiimem} ]; then # ln -sf ${mempath}/wrfout_d01_${iiimem} ./wrf_en${iiimem} # else # echo "member ${mempath}/wrfout_d01_${iiimem} is not exit" # fi # # if_hybrid=.false. ntotmem=40 mempath=/ensemble_forecast/2011060212/wrfprd

At this point, users should be able to run the GSI for simple cases without changing the scripts. However, some advanced users may need to change some of the following blocks for special applications, such as use of radiance data, cycled runs, or running GSI on a platform not tested by the DTC. ##################################################### # Users should NOT change script after this point ##################################################### #

The next block sets byte-order and run command to run GSI on multiple platforms. The ARCH is set in the beginning of the script. Option BYTE_ORDER tells the script the byte order of the machine, which will set up the right byte-order files for the CRTM coefficients and background error covariance. The choices depend on a combination of the platform (e.g., IBM vs. Linux). case $ARCH in 'IBM_LSF') ###### IBM LSF (Load Sharing Facility) BYTE_ORDER=Big_Endian RUN_COMMAND="mpirun.lsf " ;; 'IBM_LoadLevel') ###### IBM LoadLeve BYTE_ORDER=Big_Endian RUN_COMMAND="poe " ;; 'LINUX') BYTE_ORDER=Little_Endian

30

Running GSI if [ $GSIPROC = 1 ]; then #### Linux workstation - single processor RUN_COMMAND="" else ###### Linux workstation - mpi run RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach " fi ;; 'LINUX_LSF') ###### LINUX LSF (Load Sharing Facility) BYTE_ORDER=Little_Endian RUN_COMMAND="mpirun.lsf " ;; 'LINUX_PBS') BYTE_ORDER=Little_Endian #### Linux cluster PBS (Portable Batch System) RUN_COMMAND="mpirun -np ${GSIPROC} " ;; 'DARWIN_PGI') ### Mac - mpi run BYTE_ORDER=Little_Endian if [ $GSIPROC = 1 ]; then #### Mac workstation - single processor RUN_COMMAND="" else ###### Mac workstation - mpi run RUN_COMMAND="mpirun -np ${GSIPROC} -machinefile ~/mach " fi ;; * ) print "error: $ARCH is not a supported platform configuration." exit 1 ;; esac #

The next block checks if all the variables needed for a GSI run are properly defined. These variables should have been defined in the first 3 parts of this script. # ################################################################ # Check GSI needed environment variables are defined and exist # # Make if [ ! echo exit fi

sure ANAL_TIME is defined and in the correct format "${ANAL_TIME}" ]; then "ERROR: \$ANAL_TIME is not defined!" 1

# Make if [ ! echo exit fi

sure WORK_ROOT is defined and exists "${WORK_ROOT}" ]; then "ERROR: \$WORK_ROOT is not defined!" 1

# Make sure the background file exists

31

Running GSI if [ ! -r "${BK_FILE}" ]; then echo "ERROR: ${BK_FILE} does not exist!" exit 1 fi # Make if [ ! echo exit fi if [ ! echo exit fi

sure OBS_ROOT is defined and exists "${OBS_ROOT}" ]; then "ERROR: \$OBS_ROOT is not defined!" 1 -d "${OBS_ROOT}" ]; then "ERROR: OBS_ROOT directory '${OBS_ROOT}' does not exist!" 1

# Set the path to the GSI static files if [ ! "${FIX_ROOT}" ]; then echo "ERROR: \$FIX_ROOT is not defined!" exit 1 fi if [ ! -d "${FIX_ROOT}" ]; then echo "ERROR: fix directory '${FIX_ROOT}' does not exist!" exit 1 fi # Set the path to the CRTM coefficients if [ ! "${CRTM_ROOT}" ]; then echo "ERROR: \$CRTM_ROOT is not defined!" exit 1 fi if [ ! -d "${CRTM_ROOT}" ]; then echo "ERROR: fix directory '${CRTM_ROOT}' does not exist!" exit 1 fi # Make if [ ! echo exit fi

sure the GSI executable exists -x "${GSI_EXE}" ]; then "ERROR: ${GSI_EXE} does not exist!" 1

# Check to make sure the number of processors for # running GSI was specified if [ -z "${GSIPROC}" ]; then echo "ERROR: The variable $GSIPROC must be set to contain the number of processors to run GSI" exit 1 fi

The next block creates a working directory (workdir) in which GSI will run. The directory should have enough disk space to hold all the files needed for this run. This directory is cleaned before each run, therefore, save all the files needed from the previous run before rerunning GSI.

32

Running GSI # ################################################################### # Create the ram work directory and cd into it workdir=${WORK_ROOT} echo " Create working directory:" ${workdir} echo " Create working directory:" ${workdir} if [ -d "${workdir}" ]; then rm -rf ${workdir} fi mkdir -p ${workdir} cd ${workdir}

After creating a working directory, copy the GSI executable, background, observation, and fixed files into the working directory. If hybrid is choosen, link ensemble members to the working directory # ################################################################### echo " Copy GSI executable, background file, and link observation bufr to working directory" # Save a copy of the GSI executable in the workdir cp ${GSI_EXE} gsi.exe

Note: Copy the background file to the working directory as wrf_inout. The file wrf_inout will be overwritten by GSI to save analysis result. # Bring over background field #(it's modified by GSI so we can't link to it) cp ${BK_FILE} ./wrf_inout

Note: link ensemble members and rename them with the GSI recognized name: wrf_en${iiimem}. # Link ensember members if use hybrid if [ ${if_hybrid} = .true. ] ; then echo " link ensemble members to working directory" let imem=1 while (( $imem stdout 2>&1

;;

* ) ${RUN_COMMAND} ./gsi.exe > stdout 2>&1

;;

esac ################################################################## # run time error check ################################################################## error=$? if [ ${error} -ne 0 ]; then echo "ERROR: ${GSI} crashed exit ${error} fi

Exit status=${error}"

The following block saves the analysis results with an understandable name and adds the analysis time to some output file names. Among them, stdout contains runtime output of GSI and wrf_inout is the analysis results. # ################################################################## # # GSI updating satbias_in (only for cycling assimilation) # # Copy the output to more understandable names cp stdout stdout.anl.${ANAL_TIME} cp wrf_inout wrfanl.${ANAL_TIME} ln fort.201 fit_p1.${ANAL_TIME} ln fort.202 fit_w1.${ANAL_TIME} ln fort.203 fit_t1.${ANAL_TIME} ln fort.204 fit_q1.${ANAL_TIME}

39

Running GSI ln fort.207

fit_rad1.${ANAL_TIME}

The following block collects the diagnostic files. The diagnostic files are merged and categorized by outer loop and data type. Setting write_diag to true, directs GSI to write out diagnostic information for each observation station. This information is very useful to check analysis details. Please check Appendix A.2 for the tool to read and analyze these diagnostic files. # Loop over first and last outer loops to generate innovation # diagnostic files for indicated observation types (groups) # # NOTE: Since we set miter=2 in GSI namelist SETUP, outer # loop 03 will contain innovations with respect to # the analysis. Creation of o-a innovation files # is triggered by write_diag(3)=.true. The setting # write_diag(1)=.true. turns on creation of o-g # innovation files. # ls -l pe0*.* > listpe loops="01 03" for loop in $loops; do case $loop in 01) string=ges;; 03) string=anl;; *) string=$loop;; esac #

Collect diagnostic files for obs types (groups) below listall="conv amsua_metop-a mhs_metop-a hirs4_metop-a hirs2_n14 msu_n14 \ sndr_g08 sndr_g10 sndr_g12 sndr_g08_prep sndr_g10_prep sndr_g12_prep \ sndrd1_g08 sndrd2_g08 sndrd3_g08 sndrd4_g08 sndrd1_g10 sndrd2_g10 \ sndrd3_g10 sndrd4_g10 sndrd1_g12 sndrd2_g12 sndrd3_g12 sndrd4_g12 \ hirs3_n15 hirs3_n16 hirs3_n17 amsua_n15 amsua_n16 amsua_n17 \ amsub_n15 amsub_n16 amsub_n17 hsb_aqua airs_aqua amsua_aqua \ goes_img_g08 goes_img_g10 goes_img_g11 goes_img_g12 \ pcp_ssmi_dmsp pcp_tmi_trmm sbuv2_n16 sbuv2_n17 sbuv2_n18 \ omi_aura ssmi_f13 ssmi_f14 ssmi_f15 hirs4_n18 amsua_n18 mhs_n18 \ amsre_low_aqua amsre_mid_aqua amsre_hig_aqua ssmis_las_f16 \ ssmis_uas_f16 ssmis_img_f16 ssmis_env_f16"

for type in $listall; do count=`grep ${type}_${loop} listpe | wc -l` if [[ $count -gt 0 ]]; then cat pe*${type}_${loop}* > diag_${type}_${string}.${ANAL_TIME}

fi done done

The following scripts clean the temporal files # Clean working directory to save only important files ls -l * > list_run_directory if [ ${if_clean} = clean ]; then

40

Running GSI echo ' Clean working rm -f *Coeff.bin rm -f pe0* rm -f listpe rm -f obs_input.* rm -f siganl sigf03 rm -f xhatsave.* rm -f fsize_*

directory after GSI run' # all CRTM coefficient files # diag files on each processor # list of diag files on each processor # observation middle files # background middle files # some information on each processor # delete temperal file for bufr size

fi

The GSI successfully finishes and exits with 0: exit 0

3.3 Introduction to Frequently Used GSI Namelist Options The complete namelist options and their explanations are listed in Appendix B. For most GSI analysis applications, only a few namelist variables need to be changed. Here we introduce frequently used variables for regional analyses: 1. Set up the number of outer loop and inner loop To change the number of outer loops and the number of inner iterations in each outer loop, you need to modify the following three variables in the namelist: miter: number of outer loops of analysis. st niter(1): maximum iteration number of inner loop iterations for the 1 outer loop. The inner loop will stop when it reaches this maximum number, reaches the convergence condition, or when it fails to converge. nd niter(2): maximum iteration number of inner loop iterations for the 2 outer loop. If miter is larger than 2, repeat niter with larger index. 2. Set up the analysis variable for moisture If you want to change moisture variable used in analysis, the namelist variable is: qoption = 1 or 2:

If qoption=1, the moisture analysis variable is pseudo-relative humidity. The saturation specific humidity, qsatg, is computed from the guess and held constant during the inner loop. Thus, the RH control variable can only change via changes in specific humidity, q. If qoption=2, the moisture analysis variable is normalized RH. This formulation allows RH to change in the inner loop via changes to surface pressure (pressure), temperature, or specific humidity. 41

Running GSI

3. Set up the background file The following four variables define which background field will be used in the GSI analyses: regional: if true, perform a regional GSI run using either ARW or NMM inputs as the background. If false, perform a global GSI analysis. If either wrf_nmm_regional or wrf_mass_regional are true, it will be set to true. wrf_nmm_regional: if true, background comes from WRF NMM. When using other background fields, set it to false. wrf_mass_regional: if true, background comes from WRF ARW. When using other background fields, set it to false. netcdf: if true, wrf files are in NetCDF format, otherwise wrf files are in binary format. This option only works for performing a regional GSI analysis. 4. Set up the output of diagnostic files The following variables decide if the GSI will write out diagnostic results in certain loops: write_diag(1): if true, write out diagnostic data in the beginning of the analysis, so that we can have information on Observation – Background (O-B) st write_diag(2) : if true, write out diagnostic data in between the 1 and 2nd loop nd write_diag(3): if true, write out diagnostic data at the end of the 2 outer loop (the analysis if the outer loop number is 2), so that we can have information on Observation – Analysis (O-A) Please check appendix A.2 for the tools to read the diagnostic files. 5. Set up the link to the observation files The following variables link the observation files to GSI observation IOs: ndat: number of observation variables (not observation types). This number should be consistent with the observation variable lines in section OBS_INPUT. If adding a new observation variable, ndat must be incremented by one and one new line is added in OBS_INPUT. Based on dimensions of the variables in OBS_INPUT (e.g., dfile), the maximum value of ndat is 200 in this version. dfile(01)='prepbufr': observation file name. The observation file contains observations used for a GSI analysis. This file can include several observation variables from different observation types. The file name in this parameter will be read in by GSI. This name can be changed as long as the name in the link from the BUFR/PrepBUFR file also changes correspondingly.

42

Running GSI dtype(01)='ps': analysis variable name that GSI can read in and handle. As an example here, GSI will read all ps observations from the file prepbufr.

Please note this name should be consistent with that used in the GSI code. dplat(01): sets up the observation platform inside the file dfile. dsis(01): sets up data name (including both data type and platform name) used

inside GSI. Please see Section 4.3 and Section 8.1 for examples and explanation to these variables. 6. Set up observation time window In the namelist section OBS_INPUT, use time_window_max to set maximum half time window (hours) for all data types. In the convinfo file, you can use column twindow to set the half time window for a certain data type (hours). For conventional observations, only observations within the smaller window of these two will be kept to further processing. For others, observations within time_window_max will be kept to further processing. 7. Set up data thinning 1) Radiance data thinning Radiance data thinning is controlled through two GSI namelist variables in the section &OBS_INPUT. Below is an example of the section: &OBS_INPUT dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=60.0,dmesh(4)=60.0,dmesh(5)=120,time_window_max=1.5, dfile(01)='prepbufr', dtype(01)='ps', dplat(01)=' ', dsis(01)='ps', dval(01)=1.0, dthin(01)=0, dfile(11)='ssmirrbufr',dtype(11)='pcp_ssmi', dplat(11)='dmsp', dsis(11)='pcp_ssmi', dval(11)=1.0, dthin(11)=1, dfile(12)='tmirrbufr', dtype(12)='pcp_tmi', dplat(12)='trmm', dsis(12)='pcp_tmi', dval(12)=1.0, dthin(12)=1, dfile(13)='sbuvbufr', dtype(13)='sbuv2', dplat(13)='n16', dsis(13)='sbuv8_n16', dval(13)=1.0, dthin(13)=0, dfile(14)='sbuvbufr', dtype(14)='sbuv2', dplat(14)='n17', dsis(14)='sbuv8_n17', dval(14)=1.0, dthin(14)=0, dfile(15)='sbuvbufr', dtype(15)='sbuv2', dplat(15)='n18', dsis(15)='sbuv8_n18', dval(15)=1.0, dthin(15)=0, dfile(17)='hirs2bufr', dtype(17)='hirs2', dplat(17)='n14', dsis(17)='hirs2_n14', dval(17)=6.0, dthin(17)=1, dfile(18)='hirs3bufr', dtype(18)='hirs3', dplat(18)='n16', dsis(18)='hirs3_n16', dval(18)=0.0, dthin(18)=1, dfile(19)='hirs3bufr', dtype(19)='hirs3', dplat(19)='n17', dsis(19)='hirs3_n17', dval(19)=6.0, dthin(19)=1, dfile(20)='hirs4bufr', dtype(20)='hirs4', dplat(20)='n18', dsis(20)='hirs4_n18', dval(20)=0.0, dthin(20)=1, dfile(27)='msubufr', dtype(27)='msu', dplat(27)='n14', dsis(27)='msu_n14', dval(27)=2.0, dthin(27)=2, dfile(28)='amsuabufr', dtype(28)='amsua', dplat(28)='n15', dsis(28)='amsua_n15', dval(28)=10.0,dthin(28)=2, dfile(34)='amsubbufr', dtype(34)='amsub', dplat(34)='n15', dsis(34)='amsub_n15', dval(34)=3.0, dthin(34)=3, dfile(35)='amsubbufr', dtype(35)='amsub', dplat(35)='n16', dsis(35)='amsub_n16', dval(35)=3.0, dthin(35)=3, dfile(41)='ssmitbufr', dtype(41)='ssmi', dplat(41)='f15', dsis(41)='ssmi_f15', dval(41)=0.0, dthin(41)=4,

The two namelist variables that control the radiance data thinning are real array dmesh in the 1st line and the integer array dthin in the last column. The dmesh gives a set of the mesh sizes in unit km for radiance thinning grids, while the dthin defines if the data type it represents needs to be thinned and which thinning grid (mesh size) to use. If the value of dthin is: 43

Running GSI

• •

an integer less and equal to 0, no thinning is needed an integer larger than 0, this kind of radiance data will be thinned in a thinning grid with the mesh size defined as dmesh (dthin).

The following gives several thinning examples defined by the above sample &OBS_INPUT section: • • • •

Data type 1 (ps in prepbufr): no thinning because dthin(01)=0 Data type 11 (pcp_ssmi from dmsp): no thinning because dthin(01)=-1 Data type 17 (hirs2 from noaa 14): thinning in a 120 km grid because dthin(17)=1 and dmesh(1)=120 Data type 41 (ssmi from f15): thinning in a 60 km grid because dthin(41)=4 and dmesh(4)=60

2) Conventional data thinning The conventional data can also be thinned. However, the setup of thinning is not in the namelist. To give users a complete picture of data thinning, conventional data thininng is briefly introduced here. There are three columns, ithin, rmesh, pmesh, in the convinfo file to configure conventional data thinning: •

ithin:

• •

rmesh:

0 = no thinning; 1 = thinning with grid mesh decided by rmesh and pmesh

pmesh:

horizontal thinning grid size in km vertical thinning grid size in mb; if 0, then use background vertical grid.

8. Set up background error factor In the namelist section BKGERR, use vs to set up scale factor for vertical correlation length and hzscl to set up scale factors for horizontal smoothing. The scale factors for the variance of each analysis variables are set in anavinfo file. 9. Single oservation test To do single observation test, the following namelist option has to be set to true first: oneobtest=.true.

Then go to the namelist section SINGLEOB_TEST to set up the single observation location and variable to be tested, please see Section 4.2 for an example and details on the single observation test.

44

Running GSI 3.4 Files in GSI Run Directory When completed, GSI will create a number of files in the run directory. Below is an example of the files generated in the run directory from one of the GSI test case runs. This case was run to perform a regional GSI analysis with a WRF ARW NetCDF background using conventional (prepbufr), radiance data (AMSU-A, AMSU-B, HIR3, HIRS4, and MHS), and GPSRO data. The analysis time is 12Z 22 March 2011. Four processors were used. To make the run directory more readable, we turned on the clean option in the run script, which deleted all temporary intermediate files: amsuabufr amsubbufr anavinfo berror_stats convinfo diag_amsua_metop-a_anl.2011032212 diag_amsua_metop-a_ges.2011032212 diag_amsua_n15_anl.2011032212 diag_amsua_n15_ges.2011032212 diag_amsua_n18_anl.2011032212 diag_amsua_n18_ges.2011032212 diag_amsub_n17_anl.2011032212 diag_amsub_n17_ges.2011032212 diag_conv_anl.2011032212 diag_conv_ges.2011032212 diag_hirs3_n17_anl.2011032212 diag_hirs3_n17_ges.2011032212 diag_hirs4_metop-a_anl.2011032212 diag_hirs4_metop-a_ges.2011032212 diag_mhs_metop-a_anl.2011032212 diag_mhs_metop-a_ges.2011032212 diag_mhs_n18_anl.2011032212 diag_mhs_n18_ges.2011032212 errtable

fit_p1.2011032212 fit_q1.2011032212 fit_rad1.2011032212 fit_t1.2011032212 fit_w1.2011032212 fort.201 fort.202 fort.203 fort.204 fort.205 fort.206 fort.207 fort.208 fort.209 fort.210 fort.211 fort.212 fort.213 fort.214 fort.215 fort.217 fort.218 fort.219 fort.220

fort.221 gpsrobufr gsi.exe gsiparm.anl hirs3bufr hirs4bufr l2rwbufr list_run_directory mhsbufr ozinfo pcpbias_out pcpinfo prepbufr prepobs_prep.bufrtable satbias_angle satbias_in satbias_out satinfo stdout stdout.anl.2011032212 wrfanl.2011032212 wrf_inout

There are several important files that hold GSI analysis results and diagnostic information. We will introduce these files and their contents in detail in the following chapter. The following is a brief list of what these files contain: stdout.anl.2011032212/stdout: standard text output file, which is a copy of stdout with the analysis time appended. This is the most commonly used file to check the GSI analysis processes as well as basic and important information about the analyses. We will explain the contents of stdout in Section 4.1 and users are encouraged to read this file in detail to become familiar with GSI process order. wrfanl.2011032212/wrf_inout: analysis results if GSI completes successfully - only if using wrf for background. This is a copy of wrf_inout with the analysis time appended. The format is the same as the background file. diag_conv_anl.(time): binary diagnostic files for conventional and GPS RO observations at the final analysis step (analysis departure for each observation, please see Appendix A.2 for more information). diag_conv_ges.(time): binary diagnostic files for conventional and GPS RO observations at initial analysis step (background departure for each observation, please see Appendix A.2 for more information) 45

Running GSI diag_(instrument_satellite)_anl: diagnostic files for satellite

observations at final analysis step. diag_(instrument_satellite)_ges: diagnostic files for satellite

observations at initial analysis step. gsiparm.anl: namelist, generated by the run script. fit_(variable).(time): copies of fort.2?? with meaningful names (variable

name and analysis time). They are statistic results of observation departures from background and analysis results according to observation variables. Please see Section 4.5 for more details. fort.220: output from the inner loop minimization --> pcgsoi.f90. Please see Section 4.6 for details. anavinfo: info file to set up control variables, state variables, and background variables. Please see Section 4.9 for details. *info (convinfo, satinfo,…): info files that control data usage. Please see Section 4.3 for details. berror_stats and errtable: background error file (binary) and observation error file (text). *bufr: observation BUFR files linked to the run directory Please see Section 3.1 for details satbias_in: the input coefficients of radiance mass bias correction satbias_out: the output coefficients of radiance mass bias correction after the GSI run satbias_angle: the input coefficients of radiance angle bias correction list_run_directory : the complete list of files before cleaning the run directory. Please note that diag_(instrument_satellite_anl.(time) and diag_conv_anl.(time) contain important information like analysis increment for each observation (O-A), also, diag_conv_ges and diag_(instrument_satellite)_ges include observation innovation for each observation (O-B). These files can be very helpful in terms of getting the detailed impact of data on the analysis. A tool is provided to process these files, which is introduced in Appendix A.2. There are many intermediate files in this directory during the running stage; the complete list of files before cleaning is saved in a file list_run_directory. Some knowledge about the content of these files is very helpful for debugging if the GSI run crashes. Please check the following list for the meaning of these files: (Note: you may not see all the files in the above list because different observational data are used. Also, the fixed files prepared for a GSI run, such as CRTM coefficient and background error statistic files, are not included.) sigf03

Background files in binary format (typically sigf03,sigf06 and sigf09). This is a temporal file holding binary background files. When you see this file, at the minimum, a background file was successfully read in.

46

Running GSI siganl

Analysis results in binary format. When this file exists, the analysis part has finished.

pe????.(conv or instrument_satellite)_ (outer loop)

Diagnostic files for conventional and satellite observations at each outer loop and each sub-domains (????=subdomain id)

obs_input.????

Observation scratch files (one file is one observation type within whole analysis domain and time. ????=observation type id in namelist)

pcpbias_out

Output precipitation bias correction file

47

GSI Diagnostics and Tuning

Chapter 4: GSI Diagnostics and Tuning The purpose of this chapter is to help users understand and diagnose the GSI analysis. The chapter starts with an introduction to the content and structure of the GSI standard output (stdout). It continues with the use of a single observation to check the features of the GSI analysis. An introduction to observation usage control, analysis domain partition, fit files, and the optimization process will all be presented from information within the GSI output files (including stdout).

4.1 Understanding Standard Output (stdout) In Section 3.4, we listed the files present in the GSI run directory following a successful GSI analysis and briefly introduced the contents of several important files. Of these, stdout is the most useful because the most critical information about the GSI analysis can be obtained from the file. From stdout, users can check if the GSI has successfully completed, if optimal iterations look correct, and if the background and analysis fields are reasonable. The stdout file also contains the GSI standard output from all the processors. Understanding the content of this file can also be very helpful for users to find where and why the GSI failed if it crashes. The structure of stdout follows the typical steps in a meteorological data analysis system: 1. Read in all data and prepare analysis: a. Read in configuration (namelist) b. Read in background c. Partition domain and data for parallel analysis d. Read in observation(s) e. Read in constant fields (fixed files) 2. Optimal iteration (analysis) 3. Save analysis result In this section, the detailed structure and content of stdout are explained using an on-line example case. This case uses a WRF-ARW NetCDF file as the background and analyzes several observations typical for operations, including most conventional observation data, several radiance data (AMSU-A, AMSU-B, HIR3, HIRS4, and MHS), and GPSRO data. The case was run on the Linux cluster supercomputer, using 4 processors. To keep the output concise and make it more readable, most repeated content was deleted (shown by the blue dotted line). For the same reason, the accuracy of some numbers has been reduced to avoid line breaks in stdout. The following indicates the start of the GSI analysis. It includes some run time information of the analysis and control variables.

48


* . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . PROGRAM GSI_ANL HAS BEGUN. COMPILED 1999232.55 ORG: NP23 STARTING DATE-TIME JUL 13,2012 21:55:14.798 195 FRI 2456122 gsi_metguess_mod*init_: 2D-MET STATE VARIABLES: gsi_metguess_mod*init_: 3D-MET STATE VARIABLES: cw gsi_metguess_mod*init_: ALL MET STATE VARIABLES: cw INIT_IO: set IO server task to mype_io= 3 state_vectors*init_anasv: 2D-STATE VARIABLES ps state_vectors*init_anasv: 3D-STATE VARIABLES u tv tsen q oz cw state_vectors*init_anasv: ALL STATE VARIABLES u tv tsen q oz cw ps sst control_vectors*init_anacv: 2D-CONTROL VARIABLES ARE control_vectors*init_anacv: 3D-CONTROL VARIABLES ARE t q oz cw control_vectors*init_anacv: MOTLEY CONTROL VARIABLES control_vectors*init_anacv: ALL CONTROL VARIABLES ps t q oz sst stl sti INIT_IO: reserve units lendian_in= 15 and for little endian i/o

sst v p3d v p3d ps sf

sst vp

stl sf

sti vp cw

lendian_out=

66

Next is the content of all namelist variables used in this analysis. The 1st part shows the 4DVAR setups. Please note that while this version of the GSI includes some 4DVAR code, it is untested in this release. The general set up for the GSI analysis (3DVAR) is located in the &SETUP section of the GSI namelist. Please check Appendix B for definitions and default values of each namelist variable. GSI_4DVAR: nobs_bins = 1 SETUP_4DVAR: l4dvar= F SETUP_4DVAR: l4densvar= F SETUP_4DVAR: winlen= 3.00000000000000 SETUP_4DVAR: winoff= 3.00000000000000 SETUP_4DVAR: hr_obsbin= 3.00000000000000 SETUP_4DVAR: nobs_bins= 1 SETUP_4DVAR: ntlevs_ens= 1 SETUP_4DVAR: nsubwin,nhr_subwin= 1 SETUP_4DVAR: lsqrtb= F SETUP_4DVAR: lbicg= F SETUP_4DVAR: lcongrad= F SETUP_4DVAR: lbfgsmin= F

3

&SETUP GENCODE = 78.0000000000000 , FACTQMIN = 0.000000000000000E+000, FACTQMAX = 0.000000000000000E+000, DELTIM = 1200.00000000000 , DTPHYS = 3600.00000000000 , BIASCOR = -1.00000000000000 , BCOPTION = 1, DIURNALBC = 0.000000000000000E+000, NDAT = 75, NITER = 0, 2*10, 48*0, NITER_NO_QC = 51*1000000, MITER = 2, QOPTION = 2, NHR_ASSIMILATION = 3, MIN_OFFSET = 180, IOUT_ITER = 220, NPREDP = 6,

49

GSI Diagnostics and Tuning RETRIEVAL

= F,

0: / 0: &GRIDOPTS 0: &BKGERR 0: &ANBKGERR 0: &JCOPTS 0: &STRONGOPTS 0: &OBSQC 0: &SUPEROB_RADAR 0: &LAG_DATA 0: &HYBRID_ENSEMBLE 0: &RAPIDREFRESH_CLDSURF 0: &CHEM

This version of GSI attempts to read multi-time-level backgrounds, however we only have provided one. Therefor, there is some error information at the beginning of the background reading portion: CONVERT_NETCDF_MASS:

problem with flnm1 = wrf_inou1, Status =

-1021

We can ignore these errors for missing files wrf_inou1, wrf_inou2, to wrf_inou9 because we only ran 3DVAR with one background. Next, the background fields for the analysis are read and the maximum and minimum values of the fields at each vertical level are displayed. Here, only part of the variables znu and T are shown, and all other variables read by the GSI are listed only as the variable name in the NetCDF file (rmse_var = …). The maximum and minimum values are useful for a quick verification that the background fields have been read successfully. dh1 = iy,m,d,h,m,s= 0 dh1 = before rmse var T after rmse var T dh1 = rmse_var = T ndim1 = ordering = XYZ k,znu(k)= k,znu(k)= k,znu(k)= k,znu(k)= rmse_var=ZNW

3 2011

3

22

12

0

3 3 3 1 2 49 50

0.9990000 0.9960001 7.1999999E-03 2.3500000E-03

rmse_var=RDX rmse_var=RDY

50


rmse_var=MAPFAC_M rmse_var=XLAT rmse_var=XLONG rmse_var=MUB rmse_var=MU rmse_var=PHB rmse_var=T ordering=XYZ WrfType,WRF_REAL= ndim1= 3 staggering= N/A start_index= 1 end_index= 348 k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= rmse_var=QVAPOR

104

104

1 2 3

1 247 310.8738 311.2928 311.4995

48 49 50

702.1189 770.3488 837.6133

50

1

0

0

234.0430 235.4056 237.6656

280.8286 280.9952 281.3005

616.5657 684.2148 760.6921

651.2628 718.2207 792.9601

rmse_var=U rmse_var=V rmse_var=LANDMASK rmse_var=SEAICE rmse_var=SST rmse_var=IVGTYP rmse_var=ISLTYP rmse_var=VEGFRA rmse_var=SNOW rmse_var=U10 rmse_var=V10 rmse_var=SMOIS rmse_var=TSLB rmse_var=TSK rmse_var=QCLOUD rmse_var=QRAIN rmse_var=QSNOW rmse_var=QICE rmse_var=QGRAUP rmse_var=RAD_TTEN_DFI

51

GSI Diagnostics and Tuning The following example demonstrates how the analysis domain is partitioned into subdomains for each processor. 4 processors were used in this example (see Section 4.4 for more information): general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN:

task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1=

0 1 2 3

1 125 1 125

1 1 175 175

124 123 124 123

174 174 174 174

Information on the horizontal dimensions of the sub-domains and set grid related variables: in general_sub2grid_create_info, kbegin= 1 228 303 in general_sub2grid_create_info, kend= 76 302 INIT_GRID_VARS: number of threads 1 INIT_GRID_VARS: for thread 1 jtstart,jtstop = 176

77

153

152

227 1

Information on using vertical levels from guess fields to replace the number in anavinfo and the start index of each state variables in the bundle array: INIT_RAD_VARS: ***WARNING*** mxlvs from the anavinfo file is different from that of the guess 50 . Resetting maxlvs to match NSIG from guess. Vars in Rad-Jacobian (dims) -------------------------tv 0 q 50 oz 100 u 150 v 151 sst 152

30

Display the analysis and background file time (they should be the same): READ_wrf_mass_FILES: analysis date,minutes 2011 3 22 12 0 17472240 READ_wrf_mass_FILES: sigma guess file, nming2 0.000000000000000E+000 12 3 22 2011 17472240 READ_wrf_mass_FILES: sigma fcst files used in analysis : 3 3.00000000000000 1 READ_wrf_mass_FILES: surface fcst files used in analysis: 3 3.00000000000000 1 GESINFO: Guess date is 12 3 22 2011 0.000000000000000E+000 GESINFO: Analysis date is 2011 3 22 12 0 2011032212 3.00000000000000

Read in radar station information and prepare for radar radial velocity. This case didn’t have radar velocity data linked. There is warning information about opening the file but this will not impact the rest of the GSI analysis. RADAR_BUFR_READ_ALL:

problem opening level 2 bufr file "l2rwbufr"

Read in and show the content of the observation info files (see Section 4.3 for details). Here is part of the stdout from convinfo: READ_CONVINFO: tcp 112 0 1 3.00000 0 0 0 75.0000 5.00000 1.00000 75.0000 0.00000 0 0.00000 0.00000 0 READ_CONVINFO: ps 120 0 1 3.00000 0 0 0 4.00000 3.00000 1.00000 4.00000 0.30E-03 0 0.00000 0.00000 0

52

GSI Diagnostics and Tuning READ_CONVINFO: ps 180 0 1 3.00000 0 0 0 4.00000 3.00000 1.00000 4.00000 0.30E-03 0 0.00000 0.00000 0 READ_CONVINFO: t READ_CONVINFO: t

120 0 1 3.00000 0 0 0 8.00000 5.60000 1.30000 8.00000 0.10E-05 0 0.00000 0.00000 0 126 0 -1 3.00000 0 0 0 8.00000 5.60000 1.30000 8.00000 0.10E-02 0 0.00000 0.00000 0

READ_CONVINFO: gps 440 0 -1 3.00000 0 0 0 10.0000 10.0000 1.00000 10.0000 0.00000

0 0.00000 0.0000 0

Read in background fields from intermediate binary file sigf03 and partition the whole 2D field into subdomains for parallel analysis: READ_WRF_MASS_GUESS: open lendian_in= 15 to READ_WRF_MASS_GUESS: open lendian_in= 15 to ifld, temp1(im/2,jm/2)= 4 0.28100E+03 READ_WRF_MASS_GUESS: open lendian_in= 15 to ifld, temp1(im/2,jm/2)= 2 0.52546E+04 ifld, temp1(im/2,jm/2)= 3 0.28083E+03 gsi_metguess_mod*create_: alloc() for met-guess done at 0 in read_wrf_mass_guess at 0.1 in read_wrf_mass_guess at 1 in read_wrf_mass_guess, lm = 50 at 1 in read_wrf_mass_guess, num_mass_fields= 214 at 1 in read_wrf_mass_guess, nfldsig = 1 at 1 in read_wrf_mass_guess, num_all_fields= 214 at 1 in read_wrf_mass_guess, npe = 4 at 1 in read_wrf_mass_guess, num_loc_groups= 53 at 1 in read_wrf_mass_guess, num_all_pad = 216 at 1 in read_wrf_mass_guess, num_loc_groups= 54 READ_WRF_MASS_GUESS: open lendian_in= 15 to ifld, temp1(im/2,jm/2)= 1 0.93598E+05 ifld, temp1(im/2,jm/2)= 5 0.28130E+03 ifld, temp1(im/2,jm/2)= 6 0.28213E+03 ifld, temp1(im/2,jm/2)= 7 0.28372E+03

file=sigf03 file=sigf03 file=sigf03

file=sigf03

ifld, temp1(im/2,jm/2)= 212 0.37183E+00 ifld, temp1(im/2,jm/2)= 213 0.27570E+03 ifld, temp1(im/2,jm/2)= 214 0.27570E+03 in read_wrf_mass_guess, num_doubtful_sfct_all = in read_wrf_mass_guess, num_doubtful_sfct_all =

0 0

Display information on control variable array allocation. control_vectors: control_vectors: control_vectors: control_vectors: control_vectors: control_vectors:

length= 6710564 currently allocated= maximum allocated= number of allocates= number of deallocates= Estimated max memory used=

0

0 0 0 0.0 Mb

Show the source of observation error used in the analysis: CONVERR:

using observation errors from user provided table

Read in all the observations. The observation ingesting processes are distributed over all the processors with each processor reading in at least one observation type. To speed up reading some of the large datasets, one type can be read in by more than one (ntasks) processors. Before reading in the data from BUFR files, GSI resets the file status depending on whether the observation time matches the analysis time and how offtime_date is set. This step also checks for consistency between the satellite radiance data types in the BUFR files 53

GSI Diagnostics and Tuning and the usage setups in the satinfo files. The following shows stdout information from this step: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check: read_obs_check:

bufr bufr bufr bufr bufr bufr bufr bufr bufr bufr

file file file file file file file file file file

date date date date date date date date date date

is is is is is is is is is is

read_obs_check: bufr file date is read_obs_check: bufr file date is read_obs_check: bufr file date is not used

2011032212 2011032212 2011032212 0 0 0 0 0 2011032212 2011032212

prepbufr t prepbufr q prepbufr ps satwnd uv not used radarbufr rw not used tmirrbufr pcp_tmi not used omibufr omi not used gimgrbufr goes_img not used prepbufr uv amsuabufr amsua

2011032212 amsuabufr amsua 2011032212 amsubbufr amsub 0 amsubbufrears amsub

read_obs_check: bufr file date is 0 sbuvbufr sbuv2 not used data type hirs3_n16 not used in info file -- do not read file hirs3bufr data type hirs4_n18 not used in info file -- do not read file hirs4bufr read_obs_check: bufr file date is

0 iasibufr iasi

not used

The list of observation types that will be read in: number of READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS: READ_OBS:

extra processors read 19 hirs3 read 21 hirs4 read 36 amsub read 37 mhs read 38 mhs read 63 mhs read 1 ps read 2 t read 3 q read 4 pw read 6 uv read 10 sst read 11 gps_ref read 28 amsua read 31 amsua read 32 amsua read 61 hirs4 read 62 amsua

2 hirs3_n17 hirs4_metop-a amsub_n17 mhs_n18 mhs_metop-a mhs_n19 ps t q pw uv sst gps amsua_n15 amsua_n18 amsua_metop-a hirs4_n19 amsua_n19

using using using using using using using using using using using using using using using using using using

ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks= ntasks=

2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1

0 2 0 2 0 2 0 1 2 3 0 1 2 3 0 1 2 3

2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1

Display basic statistics for full horizontal surface fields: GETSFC:

enter with nlat_sfc,nlon_sfc= 0 0 and nlat,nlon= 247 348 GETSFC: set nlat_sfc,nlon_sfc= 247 348 ================================================================================ Status Var Mean Min Max sfcges2 FC10 1.000000000000E+00 1.000000000000E+00 1.000000000000E+00 sfcges2 SNOW 7.733149095057E+00 0.000000000000E+00 3.080491943359E+02 sfcges2 VFRC 5.299080975923E-02 0.000000000000E+00 8.883580780029E-01 sfcges2 SRGH 5.000000000008E-02 5.000000000000E-02 5.000000000000E-02 sfcges2 STMP 2.807239946831E+02 2.168921508789E+02 3.027479553223E+02 sfcges2 SMST 7.959968731205E-01 4.999999888241E-03 1.071443676949E+00 sfcges2 SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02 sfcges2 VTYP 1.554879240542E+01 1.000000000000E+00 2.400000000000E+01 sfcges2 ISLI 4.321164316627E-01 0.000000000000E+00 2.000000000000E+00 sfcges2 STYP 1.156151984736E+01 1.000000000000E+00 1.600000000000E+01 ================================================================================

54


Lloop over all data files to read in observations: READ_BUFRTOVS: file=hirs4bufr type=hirs4 ithin= 1 rmesh=120.000000 isfcalc= 0 ndata= READ_BUFRTOVS: file=hirs3bufr type=hirs3 ithin= 1 rmesh=120.000000 isfcalc= 0 ndata= READ_GPS: READ_GPS:

time outside window time outside window

sis=hirs4_metop-a 1178 ntask= 2 sis=hirs3_n17 1235 ntask= 2

-2.81666666666667 -2.95000000000000

nread=

189563

nread=

202426

skip this report skip this report

READ_GPS : file=gpsrobufr type=gps_ref ithin= 0 rmesh=120.000000 isfcalc= 0 ndata=

sis=gps 18130 ntask=

READ_PREPBUFR: file=prepbufr type=uv ithin= 0 rmesh=120.000000 isfcalc= 0 ndata= READ_BUFRTOVS: file=amsuabufr type=amsua ithin= 2 rmesh= 60.000000 isfcalc= 0 ndata=

sis=uv 71038 ntask= sis=amsua_n18 47704 ntask=

nread=

22566

nread=

91930

nread=

63825

1 1 1

Using the above output information, many details on the observations can ve obstained. For example, the last line (bold) indicates that subroutine “READ_BUFRTOVS” was called to read in NOAA-18 AMSU-A (sis=amsua_n18) from the BUFR file “amsuabufr” (file=amsuabufr). Furthermore, this kind of data has 63825 observations in the file (nread=63825) and 47704 in analysis domain and time-window (ndata=47704). The data was thinned on a 60 km coarse grid (rmesh=60.000000). The next step partitions observations into subdomains. The observation distribution is summarized below by listing the number of observations for each observation variable in each subdomain (see Section 4.4 for more information): OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA:

ps t q pw uv sst gps_ref hirs3 hirs4 amsua amsua amsua amsub mhs mhs hirs4 amsua mhs

n17 metop-a n15 n18 metop-a n17 n18 metop-a n19 n19 n19

2607 5172 4107 296 6640 0 3538 0 0 2564 1000 0 0 1438 0 246 656 936

2878 4743 4197 92 5439 0 5580 0 0 1333 2117 0 0 2942 0 1097 3503 4277

9565 13902 11998 475 18109 6 2277 30 28 133 0 58 57 0 54 0 0 0

3019 5590 4090 83 5725 3 6768 35 35 195 90 67 70 119 71 37 93 116

Information on ingesting background error statistics: m_berror_stats_reg::berror_read_bal_reg(PREBAL_REG): berror_stats". mype,nsigstat,nlatstat = 0 m_berror_stats_reg::berror_read_wgt_reg(PREWGT_REG): berror_stats". mype,nsigstat,nlatstat = 0 Assigned default statistics to variable oz Assigned default statistics to variable cw

get balance variables" 60 93 read error amplitudes " 60 93

55

GSI Diagnostics and Tuning From this point forward in the stdout, the output shows many repeated entries. This is because the information is written from inside the outer loop. Typically the outer loop is iterated twice. For each outer loop, the work begins with the calculation of the observation innovation. This calculation is done by the subroutine setuprhsall, which sets up the right hand side (rhs) of the analysis equation. This information is contained within the stdout file, which is discussed in the following sections: Start the first outer analysis loop: GLBSOI: jiter,jiterstart,jiterlast,jiterend= 2 1

1

1

Calculate observation innovation for each data type in the first outer loop: SETUPALL:,obstype,isis,nreal,nchanl=ps 0 2607 SETUPALL:,obstype,isis,nreal,nchanl=t 0 5172

ps

22

t

24

Read in CRTM coefficients for the first outer loop, compute radiance observation innovation, and save diagnostic information in the first outer loop. In stdout, only information related to radiance data are printed. The complete information can be found in the diagnostic files for each observation (for details see Appendix A.2): SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33 15 2564 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15 Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15 N_COEFFS=21600 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11 SETUPRAD: write header record for amsua_n15 5 30 8 0 0 15 0 13784 to file pe0000.amsua_n15_01 2011032212 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./hirs3_n17.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.02 N_CHANNELS=19 Read_ODPS_Binary(INFORMATION) : FILE: ./hirs3_n17.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=5 N_ABSORBERS=3 N_CHANNELS=19 N_COEFFS=84300 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11 SETUPRAD: write header record for hirs3_n17 5 30 8 0 0 15 0 13784 to file pe0002.hirs3_n17_01 2011032212 SETUPRAD:

write header record for mhs_n19 8 0 0 15 to file pe0000.mhs_n19_01 2011032212 obsdiags: Bytes per element= 91 obsdiags: length total, used= 189601 47597 obsdiags: Estimated memory usage= 16.5 Mb

0

5 13784

30

56

GSI Diagnostics and Tuning The inner iteration of the first outer loop is discussed in the example below. In this simple example, the maximum number of iterations is 10: Print Jo components (observation term for each observation type in cost function) at the beginning of the inner loop: Begin Jo table outer loop Observation Type surface pressure temperature wind moisture gps radiance Jo Global End Jo table outer loop

Nobs 15600 11913 36726 4416 8709 53749 Nobs 131113

Jo 1.1832900994269770E+04 1.9028504337386752E+04 4.0133809857345797E+04 4.6817824733959324E+03 1.8551360623628108E+04 4.8794660264558479E+04 Jo 1.4302301855058485E+05

Jo/n 0.759 1.597 1.093 1.060 2.130 0.908 Jo/n 1.091

Print cost function values for each inner iteration (see section 4.6 for more details): GLBSOI: START pcgsoi jiter= 1 pcgsoi: gnorm(1:2),b= 1.156412696509E+07 1.156412696509277448E+07 0.0000000000E+00 stprat 0.959014637164523 stprat 5.926551062234699E-014 Initial cost function = 1.430230185505848494E+05 Initial gradient norm = 3.400606852473948493E+03 Minimization iteration 0 grepcost J,Jb,Jo,Jc,Jl = 1 0 1.430230185505848494E+05 0.000000000000000000E+00 1.430230185505848494E+05 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 0 3.400606852473948493E+03 1.000000000000000000E+00 pcgsoi: cost,grad,step = 1 0 1.430230185505E+05 3.4006068524E+03 2.4398954427E-03 pcgsoi: gnorm(1:2),b= 1.870988434476E+06 1.8709884344764E+06 1.617924500590633419E-01 stprat 0.680228405155974 stprat 4.467539812170933E-015 Minimization iteration 1 grepcost J,Jb,Jo,Jc,Jl = 1 1 1.148077578695608245E+05 6.884228595044261567E+01 1.147389155836103746E+05 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 1 1.367840792810494804E+03 4.022343223284949310E-01 pcgsoi: cost,grad,step = 1 1 1.14807757869E+05 1.3678407928E+03 7.63011937910E-03 pcgsoi: gnorm(1:2),b= 1.701652049920E+06 1.7016520499209E+06 9.094936230309182967E-01 stprat 0.375750738871811 stprat 1.030833603466626E-016 Minimization iteration 10 grepcost J,Jb,Jo,Jc,Jl = 1 10 6.518198181954051688E+04 4.712017911695538714E+03 6.046996390784497635E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 10 3.526173233561088409E+02 1.036924698012588603E-01 pcgsoi: cost,grad,step = 1 10 6.5181981819E+04 3.526173233E+02 9.3430503337038E-03 Minimization final diagnostics

At the end of the 1st outer loop, print some diagnostics about the guess fields after adding the analysis increment to the guess and diagnostics about the analysis increment: pcgsoi: Updating guess ================================================================================ Status Var Mean Min Max analysis U 8.687820027834E+00 -5.993215723446E+01 7.019321109527E+01 analysis V -8.389957616632E-01 -7.818069678361E+01 9.543296554002E+01 analysis TV 2.402283737571E+02 1.913718225601E+02 3.039049699223E+02 analysis Q 1.527484398981E-03 1.000000000000E-07 1.836425129904E-02 analysis TSEN 2.399649493443E+02 1.913715400572E+02 3.008258699568E+02 analysis OZ 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis DUMY 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis DIV 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis VOR 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00

57

GSI Diagnostics and Tuning analysis PRSL 4.097302066597E+01 1.148126437759E+00 1.038437446031E+02 analysis PS 9.917365877107E+01 6.406160090103E+01 1.039817643172E+02 analysis SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02 analysis radb -3.450042141774E-03 -3.668133494481E+01 4.186399153656E+01 analysis pcpb 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 ================================================================================ wwww u -2.694876007555E-02 -1.101437973098E+01 9.323673570665E+00 wwww v -2.027691113373E-03 -1.030344877527E+01 1.236183232954E+01 wwww tv -2.574050797903E-01 -5.802385852340E+00 2.846798028189E+00 wwww tsen -2.569174827651E-01 -5.802377539978E+00 2.846838216350E+00 wwww q -3.328819711997E-06 -2.793073305892E-03 3.887865406977E-03 wwww oz 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 wwww cw 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 wwww p3d 1.060550728782E-02 -8.703821275248E-02 2.381276521565E-01 wwww ps 2.592449429408E-02 -8.703821275248E-02 2.381276521565E-01 wwww sst 2.945225217239E-03 -5.763484248069E-01 4.716956592383E-01

Information on the state variable array allocation after cleaning up major fields at the end of the inner loop. state_vectors: latlon11,latlon1n,latlon1n1,lat2,lon2,nsig= 1108800 1130976 126 176 50 state_vectors: length= 8936928 state_vectors: currently allocated= 0 state_vectors: maximum allocated= 4 state_vectors: number of allocates= 4 state_vectors: number of deallocates= 4 state_vectors: Estimated max memory used= 272.9 Mb

22176

Start the second outer loop. GLBSOI: jiter,jiterstart,jiterlast,jiterend= 2 1

2

1

Calculate observation innovations for each data type in the second outer loop: SETUPALL:,obstype,isis,nreal,nchanl=ps 0 2607 SETUPALL:,obstype,isis,nreal,nchanl=t 0 5172

ps t

22 24

Read in CRTM coefficients from the fix file for the second outer loop and save diagnostic information at the beginning of the second outer loop: SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33 15 2564 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15 Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15 N_COEFFS=21600 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./hirs4_metop-a.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.02 N_CHANNELS=19 Read_ODPS_Binary(INFORMATION) : FILE: ./hirs4_metop-a.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=5 N_ABSORBERS=3 N_CHANNELS=19 N_COEFFS=82700 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS

58


The output from the inner iterations in the second outer loop is shown below. In this example, the maximum number of iterations is 10: Print Jo components (observation term for each observation type in cost function) at the beginning of the inner loop: Begin Jo table outer loop Observation Type surface pressure temperature wind moisture gps radiance Jo Global End Jo table outer loop

Nobs 15620 11913 36796 4416 8860 84607 Nobs 162212

Jo 6.0353797873191743E+03 1.0272593707203929E+04 2.2776352264954159E+04 2.2085428390180150E+03 8.9698796153000221E+03 4.1586556746867529E+04 Jo 9.1849304960662834E+04

Jo/n 0.386 0.862 0.619 0.500 1.012 0.492 Jo/n 0.566

Print cost function values for each inner iteration (see section 4.6 for more details): GLBSOI: START pcgsoi jiter= 2 pcgsoi: gnorm(1:2),b= 6.04436368069045E+06 6.04436368069045E+06 0.000000000000000E+00 stprat 0.952177909610047 stprat 3.022462323621298E-014 Initial cost function = 9.709172928659517493E+04 Initial gradient norm = 2.458528763445824097E+03 Minimization iteration 0 grepcost J,Jb,Jo,Jc,Jl = 2 0 9.709172928659517493E+04 5.242424325932335705E+03 9.184930496066283376E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 0 2.458528763445824097E+03 1.000000000000000000E+00 pcgsoi: cost,grad,step = 2 0 9.70917292865E+04 2.45852876344E+03 2.0910838314E-03 pcgsoi: gnorm(1:2),b= 1.39687516886273E+06 1.39687516886273E+06 2.311037592468580E-01 stprat 0.326022963208770 stprat 3.222291707173269E-015 Minimization iteration 1 grepcost J,Jb,Jo,Jc,Jl = 2 1 8.445245812257242505E+04 5.411029823593031324E+03 7.904142829897940101E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 1 1.181894736794584333E+03 4.807325236000348778E-01 pcgsoi: cost,grad,step = 2 1 8.44524581225E+04 1.18189473679E+03 3.1026039720E-03 pcgsoi: gnorm(1:2),b= 6.2897182736819E+05 6.2897182736819E+05 4.5027060498202475E-01 stprat 1.600925505225033E-002 stprat 1.290408460808865E-015 Minimization iteration 10 grepcost J,Jb,Jo,Jc,Jl = 2 10 7.074054129933725926E+04 6.952319149296819887E+03 6.378822215004044119E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 10 3.275896517354244679E+02 1.332462148119354928E-01 pcgsoi: cost,grad,step = 2 10 7.07405412993E+04 3.27589651735E+02 4.3417883383E-03 Minimization final diagnostics

Because the outer loop is set to 2, the completion of the 2nd outer loop is the end of the analysis. The next step is to save the analysis results. Again, only a portion of variable T is shown and all other variables are listed according to variable name in the NetCDF file (rmse_var = …). The maximum and minimum values are useful information for a quick check of the reasonableness of the analysis: pcgsoi: Updating guess at 2 in wrwrfmassa at 3 in wrwrfmassa at 6 in wrwrfmassa iy,m,d,h,m,s= 2011 0 nlon,lat,sig_regional=

3 348

22 247

12

0

50

59

GSI Diagnostics and Tuning rmse_var=P_TOP ordering=0 WrfType,WRF_REAL= ndim1= 0 staggering= N/A start_index= 1 end_index1= 348 p_top= 1000.000 rmse_var=MUB ordering=XY WrfType,WRF_REAL= ndim1= 2 staggering= N/A start_index= 1 end_index1= 348 max,min MUB= 99645.99 max,min psfc= 103974.1 max,min MU= 3974.133 rmse_var=MU ordering=XY WrfType,WRF_REAL= ndim1= 2 staggering= N/A start_index= 1 end_index1= 348 k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= rmse_var=T

104

104 1

1

247

104

50

104

1 247 62950.09 64035.82 -2036.453 104

0 0

1 50

0 0

104

1 2

1 247 311.0948 311.5334

1 50 234.0167 235.6663

49 50

762.8171 828.8514

679.6612 754.9369

0 0 280.5982 280.7569 706.3553 777.9498

rmse_var=QVAPOR rmse_var=U rmse_var=V rmse_var=SEAICE rmse_var=SST rmse_var=TSK

Diagnostics of the analysis results after adding the analysis increment to the guess and diagnostics about the analysis increment: ================================================================================ Status Var Mean Min Max analysis U 8.684494834819E+00 -5.994932102119E+01 7.041893161310E+01 analysis V -8.368213328561E-01 -7.803787125994E+01 9.484754997846E+01 analysis TV 2.401499796179E+02 1.913068477406E+02 3.039191344395E+02 analysis Q 1.535424180027E-03 1.000000000000E-07 1.869392878573E-02 analysis TSEN 2.398851864729E+02 1.913065654009E+02 3.008630161153E+02 analysis OZ 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis DUMY 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis DIV 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis VOR 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 analysis PRSL 4.098357903854E+01 1.148194759980E+00 1.038787838788E+02 analysis PS 9.917332267920E+01 6.406096161559E+01 1.039929254169E+02 analysis SST 2.807239944456E+02 2.168921356201E+02 3.027479553223E+02 analysis radb -7.769461831636E-03 -4.254112280406E+01 4.610384289368E+01 analysis pcpb 0.000000000000E+00 0.000000000000E+00 0.000000000000E+00 ================================================================================ wwww u -3.325193015686E-03 -2.341066364048E+00 2.228906207570E+00 wwww v 2.174428807068E-03 -2.228629595272E+00 3.099467379467E+00 wwww tv -7.839413921041E-02 -1.677489609269E+00 1.525438097693E+00

60

GSI Diagnostics and Tuning wwww wwww wwww wwww wwww wwww wwww

tsen q oz cw p3d ps sst

-7.976378952151E-02 7.939568093893E-06 0.000000000000E+00 0.000000000000E+00 -1.374925497693E-04 -3.360918742725E-04 5.598866783354E-03

-1.677484898836E+00 -2.000539224496E-03 0.000000000000E+00 0.000000000000E+00 -3.162044407889E-02 -3.162044407889E-02 -4.125074055517E-01

1.525436684058E+00 2.329500623959E-03 0.000000000000E+00 0.000000000000E+00 2.258895669507E-02 2.258895669507E-02 3.498591820290E-01

Information on control variable allocation after cleaning up major fields at the end of the 2nd outer loop: state_vectors: latlon11,latlon1n,latlon1n1,lat2,lon2,nsig= 1108800 1130976 126 176 50 state_vectors: length= 8936928 state_vectors: currently allocated= 0 state_vectors: maximum allocated= 4 state_vectors: number of allocates= 8 state_vectors: number of deallocates= 8 state_vectors: Estimated max memory used= 272.9 Mb

22176

After completion of the analysis, the subroutine setuprhsall is called again if write_diag(3)=.true.,to calculate analysis O-A information. Because CRTM initialization is inside setuprhsall, we can see the section reading in the CRTM coefficients (the third time to see this information): SETUPALL:,obstype,isis,nreal,nchanl=ps 0 2607

ps

22

SETUPALL:,obstype,isis,nreal,nchanl=amsua amsua_n15 33 15 2564 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./amsua_n15.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=15^M amsua_n15 AntCorr RELEASE.VERSION: 1.04 N_FOVS=30 N_CHANNELS=15 Read_ODPS_Binary(INFORMATION) : FILE: ./amsua_n15.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=15 N_COEFFS=21600 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11 SETUPRAD: write header record for amsua_n15 5 30 8 0 0 15 0 13784 to file pe0000.amsua_n15_03 2011032212 INIT_CRTM: crtm_init() on path "./" Read_SpcCoeff_Binary(INFORMATION) : FILE: ./mhs_n19.SpcCoeff.bin; ^M SpcCoeff RELEASE.VERSION: 7.03 N_CHANNELS=5^M mhs_n19 AntCorr RELEASE.VERSION: 1.05 N_FOVS=90 N_CHANNELS=5 Read_ODPS_Binary(INFORMATION) : FILE: ./mhs_n19.TauCoeff.bin; ^M ODPS RELEASE.VERSION: 2.01 N_LAYERS=100 N_COMPONENTS=2 N_ABSORBERS=1 N_CHANNELS=5 N_COEFFS=10200 Read_EmisCoeff_Binary(INFORMATION) : FILE: ./EmisCoeff.bin; ^M EmisCoeff RELEASE.VERSION: 2.02 N_ANGLES= 16 N_FREQUENCIES= 2223 N_WIND_SPEEDS= 11 SETUPRAD: write header record for mhs_n19 5 30 8 0 0 15 0 13784 to file pe0000.mhs_n19_03 2011032212 obsdiags: Bytes per element= 91 obsdiags: length total, used= 189601 69044 obsdiags: Estimated memory usage= 16.5 Mb

61

GSI Diagnostics and Tuning Print Jo components (observation term for each observation type) after the analysis, which shows the fit of the analysis results to the data: Begin Jo table outer loop Observation Type surface pressure temperature wind moisture gps radiance Jo Global End Jo table outer loop

Nobs 15620 11913 36794 4417 8868 100900 Nobs 178512

Jo 5.6788282318981810E+03 9.6271243544834779E+03 2.1433478710069721E+04 2.0790935066632146E+03 8.1188601604511678E+03 3.2527599143479922E+04 Jo 7.9464984107045690E+04

Jo/n 0.364 0.808 0.583 0.471 0.916 0.322 Jo/n 0.445

The end of the GSI analysis (a successful analysis must reach this end, but to reach this end is not neccessarily a successful analysis): ENDING DATE-TIME JUL 13,2012 21:58:55.998 195 FRI 2456122 PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

For runs on IBM computers only, additional machine resource statistics are provided. This section gives very useful information about the computer resources used in the analysis. Linux computer platforms do not provide this sort of information. The user must look for other ways to find the time and memory information about the analysis run. 0:*****************RESOURCE STATISTICS******************************* 0: 0:The total amount of wall time = 551.066561 0:The total amount of time in user mode = 544.004598 0:The total amount of time in sys mode = 0.718078 0:The maximum resident set size (KB) = 2087516 0:Average shared memory use in text segment (KB*sec) = 3096668 0:Average unshared memory use in data segment (KB*sec) = 1082940860 0:Average unshared memory use in stack segment(KB*sec) = 0 0:Number of page faults without I/O activity = 34260 0:Number of page faults with I/O activity = 697 0:Number of times process was swapped out = 0 0:Number of times filesystem performed INPUT = 0 0:Number of times filesystem performed OUTPUT = 0 0:Number of IPC messages sent = 0 0:Number of IPC messages received = 0 0:Number of Signals delivered = 0 0:Number of Voluntary Context Switches = 1098 0:Number of InVoluntary Context Switches = 596 0:*****************END OF RESOURCE STATISTICS************************* 0:

62

GSI Diagnostics and Tuning 4.2 Single Observation Test A single observation test is a GSI run with only one (pseudo) observation at a specific location of the analysis domain. By examining the analysis increments, one can visualize the important features of the analysis, such as the ratio of background and observation variance and the pattern of the background error covariance. Therefore, the single observation test is the first check that users should do after successfully installing the GSI. 4.2.1 Setup a single observation test: To perform the single observation test with the GSI, the following GSI namelist variables need to be set, which should be done through editing the run script (run_gsi.ksh): Under the &SETUP section, turn on the single observation test: oneobtest=.true.,

under the &SINGLEOB_TEST section, set up single observation features like: maginnov=1.0, magoberr=1.0, oneob_type='t', oblat=20., oblon=285., obpres=850., obdattim= 2007081512, obhourset=0.,

Note: • Please check Section 3.4 for the explanation of each parameter. From these parameters, we can see that a useful observation in the analysis should include information like the observation type (oneob_type), value (maginnov), error (magoberr), location (oblat, oblong, obpres) and time (obdattim, obhourset). Users can dump out (ncdump) the global attributes from the NetCDF background file and set oblat=CEN_LAT, oblong=360-CEN_LON to have the observation in the center of the domain. • In the analysis, the GSI first generates a prepbufr file including only one observation based on the information given in the namelist &SINGLEOB_TEST section. To generate this prepbufr file, the GSI needs to read in a PrepBUFR table, which is not needed when generating a GSI analysis with real observations. The table can be found in the fix/ directory and needs to be copied to the run directory. Please check if the following lines are in the GSI run script before running the single observation test: bufrtable=${FIX_ROOT}/prepobs_prep.bufrtable cp $bufrtable ./prepobs_prep.bufrtable

63

GSI Diagnostics and Tuning 4.2.2. Examples of single observation tests for GSI To give users a taste of the single observation test, a single temperature observation (oneob_type='t') with 1 degree innovation (maginnov=1.0) and 1 degree observation error (magoberr=1.0) was used to perform the test.

A horizontal cross section (left column) and a vertical cross section (right column) of analysis increment of T, U, V, and Q from a single T observation 64

GSI Diagnostics and Tuning This single observation was located at the center of the domain. The results are shown with figures of the horizontal and vertical cross sections through the point of maximum analysis increment. The above figure was generated using NCL scripts, which can be found in the util/Analysis_Utilities/plots_ncl directory, introduced in Section A.4.

4.3 Control Data Usage Observation data used in the GSI analysis are controlled by three parts of the GSI system: 1. Link observation BUFR files to working directory in GSI run script: All BUFR/PrepBUFR observation files need to link to the working directory using GSI recognized names. The run script (run_gsi.ksh) makes these links after locating the working directory. Turning on or off these links can control the use of all the data types contained in the BUFR files. In Section 3.1, we provide a table listing all default GSI recognized observation file names and the corresponding examples of the observation BUFR files from NCEP. The following is the first 3 rows of the table as an example: GSI Name prepbufr satwnd amsuabufr

Content Conventional observations, including ps, t, q, pw, uv, spd, dw, sst, from observation platforms such as METAR, sounding, et al. satellite winds AMSU-A 1b radiance (brightness temperatures) from satellites NOAA-15, 16, 17,18, 19 and METOP-A/B

Example file names gdas1.t12z.prepbufr gdas1.t12z.satwnd.tm00.bufr_d gdas1.t12z.1bamua.tm00.bufr_d

The left column is the GSI recognized name (bold) and the right column are names of BUFR files from NCEP. In the run script, the following lines are used to link the BUFR files in the right column to the working directory using the GSI recognized names shown in the left column: # Link to the prepbufr data ln -s ${PREPBUFR} ./prepbufr # Link to the radiance data ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr

The GSI recognized default observation filenames are set up in the namelist section &OBS_INPUT, which certainly can be changed based on application needs (details see below). 2. In GSI namelist (inside run script), section &OBS_INPUT: In this namelist section, observation files (dfile) are linked to the observed variables used in the GSI analysis (dsis), for example: 65


dfile(01)='prepbufr', dtype(01)='ps', dfile(02)='prepbufr' dtype(02)='t', dfile(03)='prepbufr', dtype(03)='q',

dplat(01)=' ', dplat(02)=' ', dplat(03)=' ',

dsis(01)='ps', dsis(02)='t', dsis(03)='q',

dval(01)=1.0, dthin(01)=0, dval(02)=1.0, dthin(02)=0, dval(03)=1.0, dthin(03)=0,

dfile(28)='amsuabufr', dtype(28)='amsua', dfile(29)='amsuabufr', dtype(29)='amsua',

dplat(28)='n15', dplat(29)='n16',

dsis(28)='amsua_n15', dsis(29)='amsua_n16',

dval(28)=10.0, dthin(28)=2, dval(29)=0.0, dthin(29)=2,

Here, conventional observations “ps”, “t”, and “q” will be read in from file prepbufr and AMSU-A radiances from NOAA-15 and -16 satellites will be read in from amsuabufr. Deleting a particular line in &OBS_INPUT will turn off the use of this observation variable in the GSI analysis but other variables under the same type still can be used. For example, if we delete: dfile(28)='amsuabufr', dtype(28)='amsua',

dplat(28)='n15',

dsis(28)='amsua_n15',

dval(28)=10.0, dthin(28)=2,

Then, the AMSU-A observation from NOAA-15 will not be used in the analsys but the AMSU-A observations from NOAA-16 will still be used. The observation filename in dfile can be different from the sample script (run_gsi.ksh). If the filename in dfile has been changed, the link from the BUFR files to the GSI recognized name in the run script also need to be changed correspondly. For example, if we change the dfile(28): dfile(28)='amsuabufr_n15', dtype(28)='amsua', dfile(29)='amsuabufr', dtype(29)='amsua',

dplat(28)='n15', dplat(29)='n16',

dsis(28)='amsua_n15', dsis(29)='amsua_n16',

dval(28)=10.0, dthin(28)=2, dval(29)=0.0, dthin(29)=2,

and add a new link need to added in the run script: # Link to the radiance data ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamua.tm00.bufr_d amsuabufr ln -s ${OBS_ROOT}/le_gdas1.t12z.1bamua.tm00.bufr_d amsuabufr_n15

The GSI will read NOAA-16 AMSU-A observations from file amsuabufr and NOAA-15 AMSU-A observations from file amsuabufr_n15 based on the above changes to the run scripts and namelist. In this example, both amsuabufr and amsuabufr_15 are linked to the same BUFR file and NOAA-15 AMSU-A and NOAA-16 AMSU-A observations are still read in from the same BUFR file. If amsuabufr and amsuabufr_15 link to different BUFR files, then NOAA-15 AMSU-A and NOAA-16 AMSU-A will be read in from different BUFR files. Clearly, the changable filename in dfile gives GSI more capability to handle multiple data resources. 3. Use info files For each variable, observations can come from multiple platforms (data types or observation instruments). For example, surface pressure (ps) can come from METAR observation stations (data type 187) and Rawinsonde data (data type 120). There are several files named *info in the GSI system (located in /fix) to control the usage of 66

GSI Diagnostics and Tuning observations based on the observation platform. Below is a list of info files and their function: File name in GSI convinfo satinfo

ozinfo pcpinfo aeroinfo

Function and Content Control the usage of conventional data, including tcp, ps, t, q, pw, sst, uv, spd, dw, radial wind (Level 2 rw and 2.5 srw), gps Control the usage of satellite data. Instruments include AMSUA/B, HIRS3/4, MHS, ssmi, ssmis, iasi, airs, amsrem sndr, cris, etc. and satellites include NOAA 16, 17, 18, 19, aqua, GOES 11, 12, 13, METOP-A/B, NPP, etc. Control the usage of ozone data, including sbuv6, 8 from NOAA 14, 16, 17, 18, 19. omi_aura, gome_metop-a, mls_aura Control the usage of precipitation data, including pcp_ssmi, pcp_tmi Control the usage of aerosol data, including modis_aqua and modis_terra

The header of each info file includes an explanation of the content of the file. Here we discuss the most commonly used two info files: •

convinfo The convinfo is to control the usage of conventional data. The following is the part of the content of convinfo:

!otype tcp ps ps ps ps ps ps ps t t t t t t t t t t t

type sub iuse twindow numgrp ngroup nmiter gross ermax ermin var_b var_pg ithin rmesh pmesh npred 112 0 1 3.0 0 0 0 75.0 5.0 1.0 75.0 0.000000 0 0. 0. 0 120 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 132 0 -1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 180 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 181 0 1 3.0 0 0 0 3.6 3.0 1.0 3.6 0.000300 0 0. 0. 0 182 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 183 0 -1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 187 0 1 3.0 0 0 0 4.0 3.0 1.0 4.0 0.000300 0 0. 0. 0 120 0 1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.000001 0 0. 0. 0 126 0 -1 3.0 0 0 0 8.0 5.6 1.3 8.0 0.001000 0 0. 0. 0 130 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0 131 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0 132 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.001000 0 0. 0. 0 133 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0 134 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0 135 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0 180 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0 181 0 -1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0 182 0 1 3.0 0 0 0 7.0 5.6 1.3 7.0 0.004000 0 0. 0. 0

The meaning of each column is explained in the header of the file and listed also in the following table:

67

GSI Diagnostics and Tuning Column Name otype type sub iuse

Content of the column observation variables (t, uv, q, etc.) prepbufr observation type (if available) prepbufr subtype (not yet available) flag if to use/not use / monitor data =1, use data, the data type will be read and used in the analysis after quality control =0, read in and process data, use for quality control, but do NOT assimilate =-1, monitor data. This data type will be read in but not be used in the GSI analysis

twindow numgrp ngroup nmiter gross ermax ermin var_b var_pg ithin rmesh pmesh npred

time window (+/- hours) for data used in the analysis cross validation parameter - number of groups cross validation parameter - group to remove from data use cross validation parameter - external iteration to introduce removed data gross error parameter - gross error gross error parameter – maximum error gross error parameter – minimum error variational quality control parameter - b parameter variational quality control parameter - pg parameter Flag to turn on thinning (0, no thinning, 1 - thinning) size of horizontal thinning mesh (in kilometers) size of vertical thinning mesh Number of bias correction predictors

From this table, we can see that parameter iuse is used to control the usage of data and parameter twindow is to control the time window of data usage. Parameters gross, ermax and ermin are for gross quality control. •

satinfo: The satinfo file contains information about the channels, sensors, and satellites. It specifies observation error (cloudy or clean) for each channel, how to use the channels (assimilate, monitor, etc), and other useful information. The following is the part of the content of satinfo: !sensor/instr/sat amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 hirs3_n17 hirs3_n17 hirs3_n17 hirs3_n17

chan iuse 1 1 2 1 3 1 4 1 12 13 14 15 1 2 3 4

1 1 -1 1 -1 -1 -1 -1

error 3.000 2.000 2.000 0.600 1.000 1.500 2.000 3.000 2.000 0.600 0.530 0.400

error_cld 9.100 13.500 7.100 1.300 0.385 0.520 1.400 10.000 0.000 0.000 0.000 0.000

ermax 4.500 4.500 4.500 2.500

var_b 10.000 10.000 10.000 10.000

var_pg 0.000 0.000 0.000 0.000

3.500 4.500 4.500 4.500 4.500 2.500 2.500 2.000

10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000

0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

68

GSI Diagnostics and Tuning The meaning of each column is explained in the following table: Column Name sensor/ instr/sat chan iuse error error_cld ermax var_b var_pg

Content of the column Sensor, instrument, and satellite name Channel number for certain senor = 1, use this channel data =-1, don’t use this channel data Variance for each satellite channel Variance for each satellite channel if it is cloudy Error maximum for gross check to observations Possible range of variable for gross errors Probability of gross error

4.4 Domain Partition for Parallelization and Observation Distribution In the standard output file (stdout), there is an information block that lists information regarding each sub-domain partition, including domain number (task), starting point (istart,jstart), and dimension of the sub-domain (ilat1,jlon1). Here is an example from the case we showed in section 4.1. Please note that 4 processors were used in the analysis: general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN: general_DETER_SUBDOMAIN:

task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1= task,istart,jstart,ilat1,jlon1=

0 1 2 3

1 125 1 125

1 1 175 175

124 123 124 123

174 174 174 174

The standard output file (stdout) also has an information block that shows the distribution of different kinds of observations in each sub-domain. This block follows the observation input section. The following is the observation distribution of the case shown in section 4.1. From the case introduction, we know the prepbufr (conventional data), radiance BUFR files, and GPS BUFR files were used. In this list, the conventional observations (ps, t, q, pw, uv, and sst), GPS (gps_ref), and radiance data (amusa, amsub, hirs3/4, and mhs from Metop-a, NOAA 15, 17, 18, and 19) were distributed among 4 sub-domains: Observation OBS_PARA: ps OBS_PARA: t OBS_PARA: q OBS_PARA: pw OBS_PARA: uv OBS_PARA: sst OBS_PARA: gps_ref OBS_PARA: hirs3 OBS_PARA: hirs4 OBS_PARA: amsua OBS_PARA: amsua OBS_PARA: amsua OBS_PARA: amsub OBS_PARA: mhs OBS_PARA: mhs OBS_PARA: hirs4 OBS_PARA: amsua OBS_PARA: mhs

type

n17 metop-a n15 n18 metop-a n17 n18 metop-a n19 n19 n19

…number of observations in each subdomain 2607 2878 9565 3019 5172 4743 13902 5590 4107 4197 11998 4090 296 92 475 83 6640 5439 18109 5725 0 0 6 3 3538 5580 2277 6768 0 0 30 35 0 0 28 35 2564 1333 133 195 1000 2117 0 90 0 0 58 67 0 0 57 70 1438 2942 0 119 0 0 54 71 246 1097 0 37 656 3503 0 93 936 4277 0 116

69

GSI Diagnostics and Tuning This list is a good way to quickly check which kinds of data are used in the analysis and how they are distributed in the analysis domain.

4.5 Observation Innovation Statistics The GSI analysis gives a group of files named fort.2* (other than fort.220, see explanation on fort.220 in next section) to summarize observations fitting to the current solution in each outer loop. The content of each of these files is listed in the following table: File name fort.201 or fit_p1.analysis_time fort.202 or fit_w1.analysis_time fort.203 or fit_t1.analysis_time fort.204 or fit_q1.analysis_time fort.205 fort.206

Variables in file fit of surface pressure data

Ranges/units mb

fit of wind data

m/s

fit of temperature data

K

fit of q data

percent of guess qsaturation mm

fort.208 fort.209 fort.210 fort.211 fort.212

fit of pcp_ssmi, pcp_tmi fit of radar radial wind (rw) fit of lidar wind (dw) fit of srw1, srw2 fit of GPS data

fort.213 fort.214 fort.215

fit of conventional sst data Tropical cyclone central pressure Lagrangian data

fit of precipitation water data fit of ozone observations from sbuv6_n14 (, _n16, _n17, _n18), sbuv8_n16 (, _n17, _n18, _n19), omi_aura, gome_metop-a/b, mls_aura fort.207 or fit of satellite data including: fit_rad1.analysis_time amsua_n15(, n16, n17, n18, metop-a, aqua, n19), amsub_n15(, n16, n17,), hirs3_n16(, n17), hirs4_n18 (, metop-a, n19), mhs_n18(, metop-a, n19), hirs2_n14, iasi616_metop-a, airs281SUBSET_aqua, avhrr3_n16(, n17, n18), amsre_aqua, ssmi_f13(, f14, f15), ssmis_f16, msu_n14, sndr_g11(, g12, g13), imgr_g11(, g12, g13), sndrD1(, 2, 3, 4)_g11(, g12,g13),

fractional difference C

To help users understand the information inside these files, the following are some examples of the contents in these files and corresponding explanations. 70

GSI Diagnostics and Tuning 4.5.1 Conventional observations Example of files including single level data (fort.201, fort.205, fort.213) pressure levels (hPa)= 0.0 2000.0 it obs type stype count o-g 01 ps 120 0000 141 o-g 01 ps 180 0000 2210 o-g 01 ps 181 0000 725 o-g 01 ps 187 0000 12524 o-g 01 all 15600 o-g 01 ps rej 120 0000 2 o-g 01 ps rej 180 0000 16 o-g 01 ps rej 181 0000 49 o-g 01 ps rej 183 0000 6 o-g 01 ps rej 187 0000 38 o-g 01 rej all 111 o-g 01 ps mon 120 0000 2 o-g 01 ps mon 180 0000 115 o-g 01 ps mon 181 0000 257 o-g 01 ps mon 183 0000 1378 o-g 01 ps mon 187 0000 245 o-g 01 mon all 1997

bias 0.2326 0.2623 0.1493 0.5000 0.4476 -7.5859 0.4209 0.5711 -19.2918 -2.2101 -1.6233 0.5141 -0.1094 0.3725 -0.8872 -0.2670 -0.6028

rms 1.0637 1.1016 1.2380 1.0227 1.0455 8.7800 7.0383 57.0891 19.5258 7.4631 38.5608 0.5372 1.4396 1.2673 1.9598 1.2678 1.7815

cpen 1.6900 1.4846 2.0757 0.5437 0.7585 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.4911 5.4329 4.7674 0.0000 3.5709 1.3650

qcpen 1.6613 1.2854 1.8844 0.5061 0.6910 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.4911 4.4811 3.9711 0.0000 2.7001 1.1009

Example of files including multiple level data (fort.202, fort.203, fort.204)

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0 ---------------------------------------------------------------------------------------o-g 01 uv 220 0000 count 135 428 427 838 688 978 8061 o-g 01 uv 220 0000 bias -0.43 0.93 0.85 0.61 0.31 0.23 0.64 o-g 01 uv 220 0000 rms 3.62 3.68 4.21 4.13 5.61 5.33 4.84 o-g 01 uv 220 0000 cpen 0.69 0.92 1.21 1.30 1.74 1.53 1.39 o-g 01 uv 220 0000 qcpen 0.69 0.92 1.21 1.30 1.72 1.52 1.38 o-g 01 uv 223 0000 count 0 20 122 595 326 93 3169 o-g 01 uv 223 0000 bias 0.00 -1.27 0.38 0.14 0.00 0.92 -0.40 o-g 01 uv 223 0000 rms 0.00 4.83 4.94 4.35 5.56 5.66 4.81 o-g 01 uv 223 0000 cpen 0.00 2.34 2.07 1.14 1.37 1.07 1.48 o-g 01 uv 223 0000 qcpen 0.00 2.34 2.05 1.08 1.34 1.07 1.43 o-g o-g o-g o-g o-g o-g o-g o-g o-g o-g

01 01 01 01 01 01 01 01 01 01

o-g o-g o-g o-g o-g o-g o-g o-g o-g o-g

01 01 01 01 01 01 01 01 01 01

o-g o-g o-g o-g o-g

01 01 01 01 01

uv uv uv uv uv

rej rej rej rej rej

all all all all all 220 220 220 220 220

uv uv uv uv uv

rej rej rej rej rej mon mon mon mon mon

all all all all all 220 220 220 220 220

mon mon mon mon mon

all all all all all

0000 0000 0000 0000 0000

count bias rms cpen qcpen count bias rms cpen qcpen

1962 -0.31 3.02 0.45 0.45 0 0.00 0.00 0.00 0.00

1400 0.92 3.88 0.53 0.53 0 0.00 0.00 0.00 0.00

1488 0.56 4.68 0.85 0.84 0 0.00 0.00 0.00 0.00

2688 0.00 4.65 0.94 0.92 0 0.00 0.00 0.00 0.00

1014 0.21 5.59 1.62 1.60 0 0.00 0.00 0.00 0.00

1071 0.29 5.36 1.49 1.48 0 0.00 0.00 0.00 0.00

18363 0.27 4.74 1.09 1.08 292 2.81 8.69 0.00 0.00

0000 0000 0000 0000 0000

count bias rms cpen qcpen count bias rms cpen qcpen

34 10.52 17.64 0.00 0.00 3 -1.55 3.92 0.00 0.00

10 4.64 13.49 0.00 0.00 2 -2.31 3.68 0.00 0.00

8 2.94 23.91 0.00 0.00 5 7.31 15.18 0.00 0.00

19 6.31 44.74 0.00 0.00 3 0.04 2.94 0.00 0.00

2 24.83 61.69 0.00 0.00 11 -3.49 0.65 0.00 0.00

1 28.94 59.27 0.00 0.00 40 2.02 8.67 0.00 0.00

429 2.00 22.52 0.00 0.00 145 1.74 8.93 0.00 0.00

count bias rms cpen qcpen

4091 -0.92 2.92 1.32 1.26

9546 -0.25 3.22 1.28 1.24

1653 0.18 5.93 0.87 0.83

958 -4.74 11.09 0.68 0.60

51 63 -0.64 0.15 19.81 12.48 0.00 0.00 0.00 0.00

16727 -0.81 5.11 1.18 1.14

71


Please note 4 layers from 600 to 200 hPa have been deleted to make each row fit into one line. Only observation type 220 and 223 are shown as an example. The following table lists the meaning of each item file fort.201-213 except file fort.207: Name it

obs

type styp ptop pbot count bias rms cpen qcpen

explanation outer loop = 01: observation – background = 02: observation – analysis (after 1st outer loop) = 03: observation – analysis fields (after 2nd outer loop) observation variable (such as uv, ps) and usage of the type, which include: blank: used in GSI analysis mon: monitored, (read in but not assimilated by GSI). rej: rejected because of quality control in GSI prepbufr observation type (see BUFR User’s Guide For details) prepbufr observation subtype (not used now) for multiple level data: pressure at the top of the layer for multiple level data: pressure at the bottom of the layer The number of observations summarized under observation types and vertical layers Bias of observation departure for each outer loop (it) Root Mean Square of departure for each outer loop (it) Observation part of penalty (cost function) nonlinear qc penalty

The contents of the file fort.2* are calculated based on O-B or O-A for each observation. This detailed information about each observation is saved in the diagnostic files. For the content of the diagnostic files, please check the content of the array rdiagbuf in one of the setup subroutines for conventional data, for example, setupt.f90. We provide a tool in appendix A.2 to help users read in the information from the diagnostic files. These fit files give lots of useful information on how data are analyzed by the GSI, such as how many observations are used and rejected, what is the bias and rms for certin data types or for all observations, and how analysis results fit to the observation before and after analysis. Again, we use observation type 220 in fort.202 (fit. fit_w1.2011032212) as an example to illustrate how to read this information. The fit information forobservation type 220 (sounding observation) is listed in the next page. Like the previous example, 4 layers from 600 to 200 hPa were deleted to make each row fit into one line. Also, the qcpen lines have been deleted because they have the same values as cpen lines. Otherwise, all fit information of observation type 220 are shown.

72

GSI Diagnostics and Tuning ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0 ---------------------------------------------------------------------------------------o-g 01 uv 220 0000 count 135 428 427 838 688 978 8061 o-g 01 uv 220 0000 bias -0.43 0.93 0.85 0.61 0.31 0.23 0.64 o-g 01 uv 220 0000 rms 3.62 3.68 4.21 4.13 5.61 5.33 4.84 o-g 01 uv 220 0000 cpen 0.69 0.92 1.21 1.30 1.74 1.53 1.39 o-g o-g o-g o-g

01 01 01 01

uv uv uv uv

rej rej rej rej

220 220 220 220

0000 count 0000 bias 0000 rms 0000 cpen

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

292 2.81 8.69 0.00

o-g o-g o-g o-g

01 01 01 01

uv uv uv uv

mon mon mon mon

220 220 220 220


3 -1.55 3.92 0.00

2 -2.31 3.68 0.00

5 7.31 15.18 0.00

3 0.04 2.94 0.00

11 -3.49 0.65 0.00

40 2.02 8.67 0.00

145 1.74 8.93 0.00

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0 ---------------------------------------------------------------------------------------o-g 02 uv 220 0000 count 135 428 427 838 688 978 8061 o-g 02 uv 220 0000 bias -0.22 0.80 0.50 0.52 0.17 0.65 0.56 o-g 02 uv 220 0000 rms 3.14 2.93 2.99 2.76 4.75 4.90 3.97 o-g 02 uv 220 0000 cpen 0.49 0.58 0.59 0.59 1.26 1.29 0.90 o-g o-g o-g o-g

02 02 02 02

uv uv uv uv

rej rej rej rej

220 220 220 220


0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

292 2.97 8.94 0.00

o-g o-g o-g o-g

02 02 02 02

uv uv uv uv

rej rej rej rej

220 220 220 220


0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

292 2.97 8.94 0.00

ptop 1000.0 900.0 800.0 600.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 150.0 100.0 2000.0 ---------------------------------------------------------------------------------------o-g 03 uv 220 0000 count 135 428 427 838 688 978 8061 o-g 03 uv 220 0000 bias -0.20 0.75 0.50 0.47 0.17 0.63 0.57 o-g 03 uv 220 0000 rms 3.01 2.77 2.82 2.57 4.62 4.82 3.86 o-g 03 uv 220 0000 cpen 0.44 0.51 0.52 0.51 1.19 1.25 0.85 o-g o-g o-g o-g

03 03 03 03

o-g 03 o-g 03 o-g 03 o-g 03

uv uv uv uv

rej rej rej rej

220 220 220 220


0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

0 0.00 0.00 0.00

292 3.02 8.97 0.00

uv mon 220 0000 count uv mon 220 0000 bias uv mon 220 0000 rms uv mon 220 0000 cpen

3 -0.78 2.99 0.00

2 -1.45 2.61 0.00

5 7.89 15.59 0.00

3 0.00 2.58 0.00

11 -4.70 11.45 0.00

40 2.56 9.15 0.00

145 1.91 9.29 0.00

In loop section “o-g 01”, from “count” line, we can see there are 8061 sounding observations used in the analysis. Among them, 135 are within the 1000-1200 hPa layer. Also from the “count” lines, in the rejection and monitoring section, there are 292 observations rejected and 145 obervations being monitored. In the same loop section, from the “bias” line and “rms” lines, we can see the total bias and rms of O-B for soundings is 0.64 and 4.84. The bias and rms of each layer can also be found in the file. When reading bias and rms values from different loops, as shown with the comparsion in the following three lines: 73


o-g 01 o-g 02 o-g 03

uv uv uv

220 0000 220 0000 220 0000

rms rms rms

3.62 3.14 3.01

3.68 2.93 2.77

4.21 2.99 2.82

4.13 2.76 2.57

5.61 4.75 4.62

5.33 4.90 4.82

4.84 3.97 3.86

we can see the rms reduced from 4.84 (o-g 01, which is O-B) to 3.97 (o-g 02, which is O-A after 1st outer loop) and then to 3.86 (o-g 03, which is O-A after 2nd outer loop, the final analysis result). The reduction in the rms shows the observation type 220 (sounding) was used in the GSI analysis to modify the background fields to fit to the observations. Please note this example only used 10 iterations for each outer loop. 4.5.2 Satellite radiance The file fort.207 is statistic file for radiance data. Its contents include important information about the radiance data analysis. The first part of the file fort.207 lists the content of the file radinfo, which is the info file to control the data usage for radiance data. This portion begins with the line: RADINFO_READ:

jpch_rad=

2680

This shows there are 2680 channels listed in the radinfo file and the 2680 lines following this line include the detailed setups in the radinfo file for each channel. The second part of the file is a list of the coefficients for mass bias correction, which begins with the line: RADINFO_READ:

guess air mass bias correction coefficients below

Each channel has 5 coefficients listed in a line. Therefore, there are 2680 lines of mass bias correction coefficients for all channels though some of the coefficients are 0. The 3rd part of the fort.207 file is similar to other fit files with content repeated in 3 sections to give detailed statistic information about the data in stages before the 1st outer loop, between 1st and 2nd outer loop, and after 2nd outer loop. The results before the 1st outer loop are used here as an example to explain the content of the statistic results: •

Summaries for various statistics as a function of observation type

sat metop-a

type hirs4

penalty 274.88309655 qcpenalty 274.88309655

nobs 62 qc1 1

iland isnoice 1 9 qc2 qc3 52 458

icoast ireduce 0 17 qc4 qc5 129 0

ivarl nlgross 169 0 qc6 qc7 0 0

sat n15

type amsua

penalty 24874.97687020 qcpenalty 24874.97687020

nobs 4149 qc1 947

iland isnoice 673 1475 qc2 qc3 94 1277



rad total penalty_all= rad total qcpenalty_all= rad total failed nonlinqc=

48794.6602645585226 48794.6602645585226 0

74

GSI Diagnostics and Tuning The following table lists the meaning of each item in the above statistics: Name sat type penalty nobs iland isnoice icoast ireduce ivarl nlgross qcpenalty qc1-7 rad total penalty_all rad total qcpenalty_all rad total failed nonlinqc

explanation satellite name instrument type contribution to cost function from this observation type number of good observations used in the assimilation number of observations over land number of observations over sea ice and snow number of observations over coast number of observations that reduce qc bounds in tropics number of observations tossed by gross check number of observation tossed by nonlinear qc nolinear qc penalty from this data type number of observations whose quality control criteria has been adjusted by each qc method (1-7), details see Section 8.3 summary of penalty for all radiance observation types summary of qcpenalty for all radiance observation types summary of observation tossed by nonlinear qc for all radiance observation types

Note: one observation may include multiple channels, not all channels are used in the analysis. ● 1

Summaries for various statistics as a function of channel 2

3

1 2 3 4 5 6 7 8 9 10 12 15

1 2 3 4 5 6 7 8 9 10 12 15

amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15

61 62 63 64 65 66 67 69 70 71

2 3 4 5 6 7 8 10 11 12

hirs4_metop-a hirs4_metop-a hirs4_metop-a hirs4_metophirs4_metop hirs4_metop-a hirs4_metop-a hirs4_metop-a hirs4_metop-a hirs4_metop-a

4

5

6

7

8

9

10

11

1134 1104 1180 1266 1266 2293 2216 1826 629 19 859 760

157 186 9 0 3 4 1933 2323 3520 4130 3290 597

3.000 2.000 2.000 0.600 0.300 0.230 0.250 0.275 0.340 0.400 1.000 3.000

-0.6857597 -1.0022746 1.3881152 -0.1387697 -0.1162391 -1.5129871 -0.6290472 -0.7189088 -1.2802610 -0.9309823 0.7343006 1.8184975

0.4101223 -0.2686506 -1.0935757 -0.1315877 -0.2221524 -0.3181822 -0.4804044 -0.5710642 -0.8482793 -0.9484701 0.0869171 -0.9903618

0.2064464 0.2293813 0.3919319 0.2929980 0.9228904 2.2705673 3.3199152 3.5927614 4.5752183 4.7934543 0.0672267 0.3098348

2.1807583 2.1137854 2.1384668 0.4427612 0.3347665 0.3760536 0.5077016 0.5909604 0.8554846 0.9528062 0.5131227 2.5077433

2.1418465 2.0966439 1.8376976 0.4227555 0.2504335 0.2004405 0.1642332 0.1520519 0.1107981 0.0907966 0.5057077 2.3039010

30 43 58 7 7 8 8 7 36 58

32 19 3 5 3 0 0 1 2 3

0.600 0.530 0.400 0.360 0.460 0.570 1.000 0.600 1.200 1.600

0.0627054 0.1164134 -0.1779762 -0.4521126 -0.4749403 -0.2872839 1.6804195 1.1985978 0.4357090 0.5691709

0.8577433 0.8431586 0.1575476 -0.8968262 -0.8017649 -0.8601251 1.7202181 0.7861630 0.2180678 0.7044314

1.4361066 2.1301347 0.4386076 2.9409352 1.4791934 0.9927962 0.9008130 0.5144437 0.5664240 0.6934126

0.8908604 0.8934206 0.3442761 0.9013240 0.8109521 0.9251662 1.8297700 0.8915189 0.9345560 1.4570142

0.2406420 0.2954387 0.3061124 0.0899321 0.1217222 0.3407599 0.6236248 0.4204208 0.9087581 1.2754085

75

GSI Diagnostics and Tuning The following table lists the meaning of each column in above statistics: Column # 1 2 3 4 5 6 7 8 9 10 11

content series number of the channel in satinfo file channel number for certain radiance observation type radiance observation type (for example: amsua_n15) number of observations (nobs) used in GSI analysis within this channel number of observations (nobs) tossed by gross check within this channel variance for each satellite channel bias (observation-guess before bias correction) bias (observation-guess after bias correction) penalty contribution from this channel (observation-guess with bias correction)**2 standard deviation

● Final summary for each observation type o-g o-g o-g o-g o-g o-g o-g o-g o-g o-g o-g

it 01 01 01 01 01 01 01 01 01 01 01

rad rad rad rad rad rad rad rad rad rad rad

satellite n16 n17 n18 metop-a g11 g12 g11 g12 aqua n14 n15

instrument hirs3 hirs3 hirs4 hirs4 sndr sndr goes_img goes_img airs msu amsua

# read 0 202426 0 189563 0 0 0 0 0 0 128055

# keep 0 1235 0 1178 0 0 0 0 0 0 53932

# assim 0 0 0 264 0 0 0 0 0 0 14552

penalty 0.0000 0.0000 0.0000 274.88 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 24875.

qcpnlty 0.0000 0.0000 0.0000 274.88 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 24875.

cpen 0.0000 0.0000 0.0000 1.0412 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.7094

qccpen 0.0000 0.0000 0.0000 1.0412 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.7094

The following table lists the meaning of each column in the above statistics: Name it satellite instrument # read # keep # assim penalty qcpnlty cpen qccpen

Explanation stage (o-g 01 rad = before 1st outer loop for radiance data) satellite name (n15=noaa-15) instrument name (amsua) number of data (channel value) read in within analysis time window and domain number of data (channel value) after data thinning number of data (channel value) used in analysis (passed all qc process) contribution from this observation type to cost function nonlinear qc penalty from this data type penalty divided by (# assim) qcpnlty divided by (# assim)

Same as other fit files, a comparison between results from different outer loops can give us very useful information on how each channel and data type are analyzed in the GSI.

76

GSI Diagnostics and Tuning 4.6 Convergence Information There are two ways to check the convergence information for each iteration of the GSI: 1. Standard output file (stdout): The value of the cost function and norm of the gradient for each iteration are listed in the file stdout. The following is an example showing the first two iterations from the first outer loop: Minimization iteration 0 grepcost J,Jb,Jo,Jc,Jl = 1 0 1.430230185505848494E+05 0.000000000000000000E+00 1.430230185505848494E+05 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 0 3.400606852473948493E+03 1.000000000000000000E+00 pcgsoi: cost,grad,step = 1 0 1.43023018550E+05 3.40060685247E+03 2.43989544270E-03 pcgsoi: gnorm(1:2),b= 1.87098843447644E+06 1.87098843447644E+06 1.61792450059063E-01 stprat 0.680228405155974 stprat 4.467539812170933E-015 Minimization iteration 1 grepcost J,Jb,Jo,Jc,Jl = 1 1 1.148077578695608245E+05 6.884228595044261567E+01 1.147389155836103746E+05 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 1 1.367840792810494804E+03 4.022343223284949310E-01 pcgsoi: cost,grad,step = 1 1 1.148077578695E+05 1.36784079281E+03 7.6301193791E-03 pcgsoi: gnorm(1:2),b= 1.7016520499209E+06 1.7016520499209E+06 9.094936230309182967E-01 stprat 0.375750738871811 stprat 1.030833603466626E-016

The following are the first two iterations from the second outer loop: Minimization iteration 0 grepcost J,Jb,Jo,Jc,Jl = 2 0 9.709172928659517493E+04 5.242424325932335705E+03 9.184930496066283376E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 0 2.458528763445824097E+03 1.000000000000000000E+00 pcgsoi: cost,grad,step = 2 0 9.70917292865E+04 2.458528763445E+03 2.0910838314E-03 pcgsoi: gnorm(1:2),b= 1.39687516886273E+06 1.39687516886273E+06 2.311037592468580815E-01 stprat 0.326022963208770 stprat 3.222291707173269E-015 Minimization iteration 1 grepcost J,Jb,Jo,Jc,Jl = 2 1 8.445245812257242505E+04 5.411029823593031324E+03 7.904142829897940101E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 1 1.181894736794584333E+03 4.807325236000348778E-01 pcgsoi: cost,grad,step = 2 1 8.445245812257E+04 1.18189473679E+03 3.102603972078E-03 pcgsoi: gnorm(1:2),b= 6.28971827368193E+05 6.28971827368193E+05 4.502706049820247580E-01 stprat 1.600925505225033E-002 stprat 1.290408460808865E-015

We can see clearly the number of outer loop and the inner loops (Minimization iteration). The meaning of the names (bold) used in stdout are explained in the following: the values of cost function (J, or penalty), background term (Jb), observations term (Jo), dry pressure constraint term (Jc), and negative and excess moisture term (Jl). grad: inner product of gradients (norm of the gradient (Y*X)) J,Jb,Jo,Jc,Jl:

reduction: cost:

grad grad of the first step of the loop

the values of cost function, (=J) 77

GSI Diagnostics and Tuning stepsize (α) 2 2 gnorm(1:2): 1=(norm of the gradient) , 2= (norm of the gradient) b: parameter to estimate the new search direction stprat: convergence in stepsize estimation step:

2. Convergence information in file fort.220: In file fort.220, users can find more detailed information about each iterations. The following example uses the first two iterations to explain the meaning of each value: 1) J= 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.190285043373867524E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.185513606236281089E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.401338098573457984E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.487946602645584815E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.118329009942697698E+05 0.468178247339593267E+04 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.106814995402123112E-01 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.215772188906240879E-02 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.931339146384645329E-02 0.296198179729126945E-01 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.292918321075311759E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.982394058251658943E+07 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.281704458045555561E+06 0.274591776085362393E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.274229587308945897E+08 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.455292252088399070E+10 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

0.000000000000000000E+00 0.000000000000000000E+00 0.302472476475514677E+08 0.927054232191691408E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

2) S=-0.100000000000000018E-03 0.000000000000000000E+00 0.000000000000000000E+00 0.112072975062044600E-01 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.325642588289663153E-02 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

3) b=-0.115641269650927740E+04 0.000000000000000000E+00 0.000000000000000000E+00 0.402962980357327977E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.262337918646139989E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

4)

c= 0.115641269650927719E+08 0.000000000000000000E+00 0.000000000000000000E+00 0.359554103149527393E+08 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.805600766238803874E+08 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00

5)

stepsize estimates =

0.100000000000000005E-03 0.243989544270778218E-02

0.243989544270763757E-02

78

GSI Diagnostics and Tuning 6)

0.100000E-03

0.900000E-04 0.110000E-03 0.246429E-02

0.000000E+00

0.243990E-02

0.241550E-02

7) penalties

=

0.000000E+00 0.222277E+03 -0.221329E+03 0.259470E+05 -0.259470E+05

0.226543E+04 -0.259498E+05 -

8) penalty,grad ,a,b=

1 0 0.143023018550584849E+06 0.115641269650927745E+08 0.243989544270778198E-02 0.000000000000000000E+00

9) pnorm,gnorm, step?

1

0

0.100000000000000000E+01

0.100000000000000000E+01 good

J= 0.688422859504426132E+02 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.106235467037641590E+05 0.172586295556918779E+05 0.388542992087058658E+05 0.455285388277129181E+04 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.177114761746839726E+05 0.257381100579931913E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 S=-0.454000845565062042E-02 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.101217294711077131E-01 0.279345794069204220E-01 0.130410935418658683E-01 0.368341097286829303E-01 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.178110078442459234E-01 0.297635097075488013E-02 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 b=-0.986861745749417622E+04 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.142776494763965690E+06 0.140788735520069243E+06 0.231276409863045459E+06 0.226873427979176697E+05 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.160017159820703326E+06 0.585021953540974826E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 c= 0.217370023732255860E+07 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.141059386314876834E+08 0.503994470327305877E+07 0.177344337819967319E+08 0.615932975305519791E+06 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.898417210412935433E+07 0.196556776834889866E+09 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 0.000000000000000000E+00 stepsize estimates = 0.243989544270778198E-02 0.763011937910834613E-02 0.763011937910831204E-02 stepsize guesses = 0.243990E-02 0.219591E-02 0.268388E-02 0.000000E+00 0.763012E-02 0.755382E-02 0.770642E-02 penalties = 0.000000E+00 0.635648E+03 -0.606453E+03 0.767027E+04 -0.660560E+04 0.660417E+04 -0.660417E+04 penalty,grad ,a,b= 1 1 0.114807757869560824E+06 0.187098843447644310E+07 0.763011937910831189E-02 0.161792450059063342E+00 pnorm,gnorm, step? 1 1 0.802722240329134418E+00 0.161792450059063564E+00 good

79

GSI Diagnostics and Tuning For each inner iteration, there are 9 different outputs. The 1st iteration is labeled with numbers 1 to 9, with a detailed explanation below: 1 – 4) detailed information on the cost function (J=), the stpx (S=bpen(i)/cpen(i)), bpen (b=), cpen (c=). There are 32 (5 + number of observation types) items listed in each and the meanings of these items are: 1 contribution from background, satellite radiance bias, and precipitation bias 2 place holder for future linear linear term 3 contribution from dry pressure constraint term (Jc) 4 contribution from negative moisture constraint term (Jl/Jq) 5 contribution from excess moisture term (Jl/Jq) 6 contribution from negative gust constraint term 7 contribution from negative vis constraint term 8 contribution from negative pblh constraint term

9-32: contributions to Jo from different observation types: 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution contribution

from from from from from from from from from from from from from from from from from from from from from from from from

ps observation term t observation term w observation term q observation term spd observation term srw observation term rw observation term dw observation term sst observation term pw observation term pcp observation term oz observation term o3l observation term (not used) gps observation term rad observation term tcp observation term lagrangian tracer carbon monoxide modis aerosol aod level modis aero aod in-situ pm2_5 obs gust monoxide vis aerosol aod pb1h modis aero aod

It is suggested that the users check stpcalc.f90 for the code including the above information. 5) information on the step size estimates 6) information on the step size guesses 7) information on the penalties 8) and 9) information on the cost function and gradient which is explained in the following: 80

GSI Diagnostics and Tuning penalty: the cost function of all observations and background grad: inner product of gradients (norm of the gradient (Y*X)) a: stepsize (α) b: parameter to estimate the new search direction

penalty penalty of the first step of the loop grad gnorm: grad of the first step of the loop pnorm:

To evaluate the convergence of the iteration, we usually make plots based on the above information, such as the value of the cost function and the norm of the gradient. The following are example plots showing the evolution of the cost function and the norm of gradient in different outer loops:

Evolution of cost function (left column) and the norm of gradient (right column) in the first outer loop (top raw) and the second outer loop (bottom raw)

Please see Section A.3 for the information on how to read convergence information from fort.220 and how to make the above plots.

81

GSI Diagnostics and Tuning 4.7 Use bundle to configure control, state varaibles and background fields Since the GSI release version 3.0, the control varaibles, state varaibles, and background fields can be configured through a new info file named “anavinfo”. Different GSI applications need a different anavinfo file to setup the control variables, state variables, and background fields. In the ./fix directory of the release package, there are many example anavinfo files for different operational applications. Because this is a working in progress, users should use one of the sample anavinfo files instead of making a new one. The released GSI run script has added the link for this new info file. Users should be able to use this without any problem. Below is an example of an avaninfo file for an ARW case: met_guess:: !var level cw 30 # ql 30 # qi 30 # qr 30 # qs 30 # qg 30 :: state_vector:: !var level u 30 v 30 tv 30 tsen 30 q 30 oz 30 cw 30 p3d 31 ps 1 sst 1 ::

crtm_use 10 10 10 10 10 10

itracer amedge 0 no 0 no 0 no 0 no 1 no 1 no 1 no 0 yes 0 no 0 no

desc cloud_condensate cloud_liquid cloud_ice rain snow graupel

source met_guess met_guess met_guess met_guess met_guess met_guess met_guess met_guess met_guess met_guess

control_vector:: !var level itracer as/tsfc_sdv sf 30 0 1.00 vp 30 0 1.00 ps 1 0 0.50 t 30 0 0.70 q 30 1 0.70 oz 30 1 0.50 sst 1 0 1.00 cw 30 1 1.00 stl 1 0 1.00 sti 1 0 1.00 ::

orig_name cw ql qi qr qs qg

funcof u v tv tv,q q oz cw p3d p3d sst

an_amp0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0

source state state state state state state state state motley motley

funcof u,v u,v p3d tv q oz sst cw sst sst

There are three sections in this file: met_guess:: section to configure background fields state_vector:: section to configure state variables control_vector:: section to configure control variables

82

GSI Diagnostics and Tuning In each section, the 1st column sets up the variable name and 2nd column sets up the vertical levels. The 4th column in the section control_vector is the normalized scale factor for the background error variance. 4.8 Analysis Increments Analysis increments are defined as the fields of analysis minus background. A plot of analysis increments can help users to understand how the analysis procedure modifies the background fields according to observations, errors (background and observation), and other constraints. You can either calculate analysis–guess and plot the difference field or use the tools introduced in Appendix A.4 to make analysis increment figures for different analysis fields.

4.9 Running Time and Memory Usage In addition to analysis increments, run time and memory usage are other important features of an analysis system, especially for operational code like the GSI. Some computer operating systems provide CPU, Wall time, and memory usage after the job has completed, but an easy way to determine how much wall time is used for the GSI is to check the time tag between the files generated at the beginning and the end of the run, for example: Wall time of GSI analysis = time of wrfanl.2011032212 - time of convinfo Also, the GSI standard output file (stdout) gives the GSI start time and end time at the top and the end of the file. For example: * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . PROGRAM GSI_ANL HAS BEGUN. COMPILED 1999232.55 ORG: NP23 STARTING DATE-TIME JUL 13,2012 21:55:14.798 195 FRI 2456122

ENDING DATE-TIME JUL 13,2012 21:58:55.998 195 FRI 2456122 PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

This tells us this case started at 21:55:14.798 and ended at 21:58:55.998. Meaning GSI used 3 mintues and 41.2 second to finish For a GSI run using an IBM computer, there is a resource statistics section at the end of the stdout file, which gives information about run time and memory usage for the analysis (see example at the end of section 4.1).

83

GSI Applications

Chapter 5: GSI Applications In this chapter, the knowledge from the previous chapters will be applied to several GSI cases to show how to setup GSI with various different data sources, and how to properly check the run status and analysis results in order to determine if a particular GSI application was successful. Note the examples here only use the WRF ARW system – global experiments or WRF NMM runs are similar, but require different background and namelist files. It is assumed that the reader has successfully compiled GSI on a local machine, and has the following data available: 1. Background file • When using WRF, WPS and real.exe will be run to create a WRF input file: wrfinput__ 2. Conventional data • Real time NAM PREPBUFR data can be obtained from the server: ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nam/prod Note: NDAS prepbufr data was chosen to increase the amount of data 3. Radiance data and GPS RO data • Real time GDAS BUFR files can be obtained from the following server: ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/gfs/prod Note: GDAS data was chosen to get better coverage for radiance and GPS RO GSI Case Study The following case study will give users an example of a successful GSI run with various data sources. Users are welcome to download these example data from the GSI users’ webpage (online case for release version 3.1) or create a new background and get real-time observation data from the above server. The background and observations used in this case study are as follows: 1. Background files: wrfinput_d01_2011-03-22_12:00:00 • The horizontal grid spacing is 30-km with 51 vertical sigma levels

Figure 5.1: The terrain (left) and land mask (right) of the background used in this case study

84

GSI Applications

2. Conventional data: NAM PrepBUFR data from 12UTC 22 March 2011. • File: nam.t12z.prepbufr.tm00.nr 3. Radiance and GPS RO data: GDAS PREPBUFR data from 12 UTC 22 March 2011 • Files: gdas.t12z.1bmaua.tm00.bufr_d gdas.t12z.1bamub.tm00.bufr_d gdas.t12z.1bhrs4.tm00.bufr_d gdas.t12z.gpro.tm00.bufr_d This case study was run on a Linux cluster. The BUFR/PrepBUFR files have been byteswaped to little endian format using the tool introduced in section A.1: le_nam.t12z.prepbufr.tm00.nr le_gdas.t12z.1bmaua.tm00.bufr_d le_gdas.t12z.1bamub.tm00.bufr_d le_gdas.t12z.1bhrs4.tm00.bufr_d le_gdas.t12z.gpro.tm00.bufr_d Assume the background file is located at: /scratch1/NA30km/bk all observations are located at: /scratch1/NA30km/obs20110322 and the GSI release version 3.1 is located at /scratch1/comGSI_v3.1

5.1. Assimilating Conventional Observations with GSI:

5.1.1: Run script With GSI successfully compiled and background and observational data acquired, move to the ./run directory under ./comGSI_v3.1 to setup the GSI run following the sample script run_gsi.ksh. To successfully run the run_gsi.ksh script, it must be modified in several places: •

Set up batch queuing system To run GSI with multi-processors, a job queuing head has to be added at the beginning of the run_gsi.ksh script. The set up of the job queue is dependent on the machine and 85

GSI Applications the jobs control system. More examples of the setup are described in section 3.2.2.1. The following example is setup to run on a Linux cluster supercomputer with PBS. The job head is as follows: # Set the queueing options #PBS -l procs=4 #PBS -l walltime=0:20:00 #PBS -A ????? #PBS -N wrf_gsi #PBS -j oe

In order to find out how to setup the job head, a good method is to use an existing MPI job script and copy the job head over. •

Setup the number of processors and the job queue system used. For this example, ‘LINUX_PBS’ and 4 processors were used: GSIPROC=4 ARCH='LINUX_PBS'

•

Setup the case data, analysis time, and GSI fix, exe, and CRTM coefficients: Setup analysis time: ANAL_TIME=2011032212

Setup a working directory, which will hold all the analysis results. This directory must have correct write permissions, as well as enough space to hold the output. Also, for the following example, directory /scratch1 has to exist. WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_prepbufr

Set path to the background file: BK_FILE=/scratch1/NA30km/bk/wrfinput_d01_2011-03-22_12:00:00

Set path to the observation directory and the PrepBUFR file within the observation directory. All observations to be assimilated should be in the observation directory. OBS_ROOT=/scratch1/NA30km/obs20110322 PREPBUFR=${OBS_ROOT}/le_nam.t12z.prepbufr.tm00.nr

Set the GSI system used for this case: the fix and CRTM coefficients directories as well as the location of the GSI executable: CRTM_ROOT=/scratch1/crtm/CRTM_Coefficients FIX_ROOT=/scratch1/comGSI_v3.1/fix GSI_EXE=/scratch1/comGSI_v3.1/run/gsi.exe

86

GSI Applications •

Set which background and background error file to use: bk_core=ARW bkcv_option=NAM if_clean=clean

This example uses the ARW NetCDF background; therefore bk_core is set to ‘ARW’. The regional background error covariance files were also used in this case. Finally, the run scripts are set to clean the run directory to delete all temporary intermediate files. 5.1.2: Run GSI and check the run status Once the run scripts are set up properly for the case and machine, GSI can be run through the run scripts. On our test machine, the GSI run is submitted as follows: $ qsub < run_gsi.ksh

While the job is running, move to the working directory and check the details. Given the following working directory setup: WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_prepbufr

Go to directory /scratch1 to check the GSI run directory. A directory named ./gsiprd_2011032212_prepbufr should have been created. This directory is the run directory for this GSI case study. While GSI is still running, the contents of this directory should include: imgr_g12.TauCoeff.bin imgr_g13.SpcCoeff.bin imgr_g13.TauCoeff.bin

ssmi_f15.SpcCoeff.bin ssmi_f15.TauCoeff.bin ssmis_f16.SpcCoeff.bin

These files are CRTM coefficients that have been linked to this run directory through the GSI run scripts. Additionally, many other files are linked or copied to this run directory or generated during run, such as: stdout: wrf_inout: gsiparm.anl: prepbufr: convinfo: berror_stats: errtable:

standard out file background file GSI namelist PrepBUFR file for conventional observation data usage control for conventional data background error file observation error file 87

GSI Applications

The presence of these files indicates that the GSI run scripts have successfully setup a run environment for GSI and the GSI executable is running. While GSI is still running, checking the content of the standard output file (stdout) can monitor the stage of the GSI analysis: $ tail -f stdout grepgrad grad,reduction= 1 0 8.188107512799259666E+02 1.000000000000000000E+00 pcgsoi: cost,grad,step = 1 0 7.024833842077980808E+04 8.188107512799259666E+02 1.788853101746809213E-02 pcgsoi: gnorm(1:2),b= 3.574858851399047417E+05 3.574858851399045670E+05 5.332020690447850653E-01 stprat 0.114387353859176 stprat 4.672705475826555E-016 Minimization iteration 1 grepcost J,Jb,Jo,Jc,Jl = 1 1 5.825495408135202160E+04 2.145440277602700405E+02 5.804041005359175324E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 1 5.979012335995843159E+02 7.302068672950051686E-01 pcgsoi: cost,grad,step = 1 1 5.825495408135202160E+04 5.979012335995843159E+02 2.019904649670459587E-02 pcgsoi: gnorm(1:2),b= 2.237482294472646608E+05 2.237482294472649228E+05 6.258938849003770066E-01 stprat 0.226513888727740 stprat 1.049725208537512E-015

The above output lines show that GSI is in the inner iteration stage. It may take several minutes to finish the GSI run. Once GSI has finished running, the number of files in the directory will be greatly reduced from those during the run stage. This is because the run script was set to clean the run directory after a successful run. The important analysis result files and configuration files will remain in the run directory. Please check Section 3.4 for more details on GSI run results. Upon successful completion of GSI, the run directory looks as follows: anavinfo berror_stats convinfo diag_conv_anl.2011032212 diag_conv_ges.2011032212 errtable fit_p1.2011032212 fit_q1.2011032212 fit_rad1.2011032212 fit_t1.2011032212 fit_w1.2011032212 fort.201 fort.202 fort.203 fort.204 fort.205

fort.206 fort.207 fort.208 fort.209 fort.210 fort.211 fort.212 fort.213 fort.214 fort.215 fort.217 fort.218 fort.219 fort.220 fort.221 gsi.exe

gsiparm.anl l2rwbufr list_run_directory ozinfo pcpbias_out pcpinfo prepbufr prepobs_prep.bufrtable satbias_angle satbias_in satbias_out satinfo stdout stdout.anl.2011032212 wrfanl.2011032212 wrf_inout

88

GSI Applications 5.1.3: Check for successful GSI completion It is important to always check for successful completion of the GSI analysis. Completion of the GSI run without crashing does not guarantee a successful analysis. First, check the stdout file in the run directory to be sure GSI completed each step without any obvious problems. The following are several important steps to check: 1. Read in the avaninfo and namelist The following lines show GSI started normal and has read in the anavinfo and namelist: state_vectors*init_anasv: 2D-STATE state_vectors*init_anasv: 3D-STATE tv tsen q state_vectors*init_anasv: ALL STATE tv tsen q

VARIABLES ps VARIABLES u oz cw VARIABLES u oz cw

sst v p3d v p3d

GSI_4DVAR: nobs_bins = 1 SETUP_4DVAR: l4dvar= F SETUP_4DVAR: l4densvar= F SETUP_4DVAR: winlen= 3.00000000000000 SETUP_4DVAR: winoff= 3.00000000000000 SETUP_4DVAR: hr_obsbin= 3.00000000000000

2. Read in the background field The lines in stdout immediately following the namelist section, indicate that GSI is reading the background fields. Checking the range of the max and min values will indicate if certain background fields are normal. end_index= 348 247 50 max,min XLAT(:,1)= 13.41553 2.813988 max,min XLAT(1,:)= 47.02686 2.813988 xlat(1,1),xlat(nlon,1)= 2.813988 4.969986 xlat(1,nlat),xlat(nlon,nlat)= 47.02686 52.26216 rmse_var=XLONG ............... rmse_var=U ordering=XYZ WrfType,WRF_REAL= 104 104 ndim1= 3 staggering= N/A start_index= 1 1 1 end_index= 349 247 50 k,max,min,mid U= 1 20.05023 -21.65548 k,max,min,mid U= 2 20.64079 -22.58930 k,max,min,mid U= 3 21.84538 -24.38444 k,max,min,mid U= 4 24.33893 -27.59095 k,max,min,mid U= 5 27.30596 -29.83475 k,max,min,mid U= 6 28.86383 -31.97169 k,max,min,mid U= 7 30.24547 -33.28873

0

0 0 -6.996003 -7.982791 -9.791903 -12.01432 -14.94501 -14.71747 -12.40972

89

GSI Applications 3. Read in observational data Skipping through a majority of the content towards the middle of the stdout file, the following lines will appear: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA:

ps t q pw uv sst

2352 4617 3828 89 5704 0

2572 4331 3908 31 4835 0

8367 12418 11096 141 15025 2

2673 4852 3632 23 4900 0

This table is important to see if the observations have been read in, which types of observations have been read in, and the distribution of observations in each sub domain. At this point, GSI has read in all the data needed for the analysis. Following this table is the inner iteration information. 4. Inner iteration The inner iteration step in the stdout file will look as follows: Minimization iteration 0 grepcost J,Jb,Jo,Jc,Jl = 1 0 7.024833842077980808E+04 0.000000000000000000E+00 7.024833842077980808E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 1 0 8.188107512799259666E+02 1.000000000000000000E+00 pcgsoi: cost,grad,step = 1 0 7.024833842077980808E+04 8.188107512799259666E+02 1.788853101746809213E-02 pcgsoi: gnorm(1:2),b= 3.574858851399047417E+05 3.574858851399045670E+05 5.332020690447850653E-01 stprat 0.114387353859176 stprat 4.672705475826555E-016

Following the namelist set up, similar information will be repeated for each inner loop. In this case, 2 outer loops with 50 inner loops in each outer loop have been set. The last iteration looks like: Minimization iteration 40 grepcost J,Jb,Jo,Jc,Jl = 2 40 3.664942287987162126E+04 6.573582409713609195E+03 3.007584047015801480E+04 0.000000000000000000E+00 0.000000000000000000E+00 grepgrad grad,reduction= 2 40 7.094588989162899789E-03 2.206814011091630791E-04 pcgsoi: cost,grad,step = 2 40 3.664942287987162126E+04 7.094588989162899789E-03 4.844607290004876443E-02 PCGSOI: WARNING **** Stopping inner iteration *** gnorm 0.750736287079361097E-10 less than 0.100000000000000004E-09 Minimization final diagnostics

Clearly, the iteration met the stop threshold before meeting the maximum iteration number (50). As a quick check of the iteration: the J value should descend through each iteration. Here, the J has a value of 7.024833842077980808E+04 at the beginning and a value of 3.664942287987162126E+04 at the final iteration. This means the value has reduced by almost half, which is an expected reduction. 90

GSI Applications

5. Write out analysis results The final step of the GSI analysis procedure looks very similar to the portion where the background fields were read in: max,min MU= 3993.281 rmse_var=MU ordering=XY WrfType,WRF_REAL= ndim1= 2 staggering= N/A start_index= 1 end_index1= 348 k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T= k,max,min,mid T=

-2043.336

104

1 2 3 4 5 6 7

104

1 247 310.9237 311.3734 311.6932 312.7413 313.2800 313.7712 314.9586

1 50 233.9436 235.8086 238.1501 243.5058 242.0858 247.9757 250.0384

0 0 280.6658 280.8300 281.0092 281.4198 282.9984 285.4900 289.3155

6. As an indication that GSI has successfully run, several lines will appear at the bottom of the file: ENDING DATE-TIME JUL 14,2012 23:50:12.367 196 SAT 2456123 PROGRAM GSI_ANL HAS ENDED. IBM RS/6000 SP * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * .

GSI was mainly developed on an IBM machine, so the ‘IBM RS/6000 SP’ markers will still appear on a Linux machine. After carefully investigating each portion of the stdout file, it can be concluded that GSI successfully ran through every step and there were no run issues. A more complete description of the stdout file can be found in Section 4.1. However, it cannot be concluded that GSI did a successful analysis until more diagnosis has been completed.

5.1.4: Diagnose GSI analysis results

5.1.4.1: Check analysis fit to observations The analysis uses observations to correct the background fields to fit the observations under certain constraints. The easiest way to confirm the GSI analysis fits the observations better than the background is to check a set of files with names fort.2??, where ?? is a 91

GSI Applications number from 01 to 19. In the run scripts, several fort files also have been renamed as fit_t1 (q1, p1, rad1, w1).YYYYMMDDHH. Please check Section 4.5.1 for a detailed explanation of the fit files. The following are several examples of these fit files. •

fit_t1.2011032212 (fort.203) This file shows how the background and analysis fields fit to temperature observations. The contents of this file show three data types were used in the analysis: 120, 130, and 180. Also included are the number of observations, bias and rms of observation minus background (O-B) on each level for the three data types. The following is a part of the file:

ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0 -----------------------------------------------------------------------------------------------------------------------o-g 01 t 120 0000 count 185 527 573 995 1117 637 226 443 642 979 855 8692 o-g 01 t 120 0000 bias 0.62 0.88 -0.22 -0.20 -0.19 -0.38 -1.07 -0.89 -0.95 -0.99 -1.63 -0.92 o-g 01 t 120 0000 rms 2.62 2.57 2.01 1.31 0.90 1.02 1.66 1.71 1.94 1.86 2.67 2.27 o-g 01 t 130 0000 count 0 0 0 0 0 11 177 626 67 0 0 881 o-g 01 t 130 0000 bias 0.00 0.00 0.00 0.00 0.00 0.18 0.06 -0.02 -1.23 0.00 0.00 -0.09 o-g 01 t 130 0000 rms 0.00 0.00 0.00 0.00 0.00 0.96 0.97 1.48 2.28 0.00 0.00 1.46 o-g 01 t 180 0000 count 1260 28 0 0 0 0 0 0 0 0 0 1288 o-g 01 t 180 0000 bias 0.67 0.80 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.68 o-g 01 t 180 0000 rms 1.76 1.19 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.75 o-g 01 all count 1445 555 573 995 1117 648 403 1069 709 979 855 10861 o-g 01 all bias 0.67 0.88 -0.22 -0.20 -0.19 -0.37 -0.57 -0.38 -0.97 -0.99 -1.63 -0.66 o-g 01 all rms 1.89 2.52 2.01 1.31 0.90 1.02 1.40 1.58 1.97 1.86 2.67 2.16 --------------------------------------------------------------------------------------------------------------------------o-g 03 t 120 0000 count 185 527 573 995 1117 637 226 443 642 979 855 8692 o-g 03 t 120 0000 bias 0.37 0.51 -0.15 -0.04 -0.02 0.02 -0.24 -0.17 -0.03 -0.17 -0.31 -0.10 o-g 03 t 120 0000 rms 2.12 2.05 1.60 1.00 0.61 0.59 0.90 1.01 1.28 1.36 1.93 1.46 o-g 03 t 130 0000 count 0 0 0 0 0 11 177 626 67 0 0 881 o-g 03 t 130 0000 bias 0.00 0.00 0.00 0.00 0.00 0.19 -0.04 0.07 -0.33 0.00 0.00 0.02 o-g 03 t 130 0000 rms 0.00 0.00 0.00 0.00 0.00 0.66 0.76 1.10 1.47 0.00 0.00 1.07 o-g 03 t 180 0000 count 1260 28 0 0 0 0 0 0 0 0 0 1288 o-g 03 t 180 0000 bias 0.38 0.30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.38 o-g 03 t 180 0000 rms 1.46 0.76 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.45 o-g 03 all count 1445 555 573 995 1117 648 403 1069 709 979 855 10861 o-g 03 all bias 0.38 0.50 -0.15 -0.04 -0.02 0.03 -0.15 -0.03 -0.06 -0.17 -0.31 -0.03 o-g 03 all rms 1.56 2.01 1.60 1.00 0.61 0.59 0.84 1.06 1.29 1.36 1.93 1.43

For example: data type 120 has 1172 observations in layer 400.0-600.0 hPa, a bias of -0.19, and a rms of 0.90. The last column shows the statistics for the whole atmosphere. There are several summary lines for all data types, which is indicated by ‘all’ in the data types column. For summary O-B (which is “o-g 01” in the file), we have 10861 observations total, a bias of -0.66, and a rms of 2.16. Skipping ahead in the fort file, “o-g 03” columns (under ‘it’) show the observation minus analysis (O-A) information. Under the summary (‘all’) lines, it can be seen that there were 10861 total observations, a bias of -0.03, and a rms of 1.43. This shows that from the background to the analysis the bias reduced from -0.06 to -0.03, and the rms

92

GSI Applications reduced from 2.16 to 1.43. This is about a 34% reduction, which is a reasonable value for large-scale analysis. •

fit_w1.2011032212 (fort.202) This file demonstrates how the background and analysis fields fit to wind observations. This file (as well as fit_q1) are formatted the same as the above example. Therefore, only the summary lines will be shown for O-B and O-A to gain a quick view of the fitting:

ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0 -----------------------------------------------------------------------------------------------------------------------o-g 01 all count 1161 1175 1218 2195 1952 1080 557 1131 653 876 915 14812 o-g 01 all bias -0.31 1.01 0.58 0.08 -0.10 -0.05 -0.09 0.81 1.05 0.20 0.28 0.33 o-g 01 all rms 3.43 3.93 4.60 4.59 4.93 4.88 5.04 5.31 5.74 5.59 5.59 4.82 o-g 03 o-g 03 o-g 03

all all all

count bias rms

1163 0.28 2.69

1173 0.69 2.92

1223 0.32 2.85

2197 0.19 2.72

1955 0.15 2.87

1096 0.17 2.50

558 0.09 2.62

1132 0.32 3.02

651 0.47 3.65

881 0.22 4.24

915 0.55 4.90

14843 0.37 3.35

O-B: 14812 observations in total, bias is 0.33 and rms is 4.82 O-A: 14843 observations in total, bias is 0.37 and rms is 3.35 The total bias was not reduced, however the rms reduced from 4.82 to 3.35 (~30% reduction). •

fit_q1.2011032212 (fort.204) This file demonstrates how the background and analysis fields fit to moisture observations (relative humidity). The summary lines for O-B and O-A are as follows:

ptop 1000.0 950.0 900.0 850.0 800.0 700.0 600.0 500.0 400.0 300.0 0.0 0.0 it obs type styp pbot 1200.0 1000.0 950.0 900.0 850.0 800.0 700.0 600.0 500.0 400.0 300.0 2000.0 -----------------------------------------------------------------------------------------------------------------------o-g 01 all count 686 228 312 343 228 561 423 473 571 415 0 4240 o-g 01 all bias 1.37 -0.29 1.15 0.92 0.93 0.23 -4.64 -7.81 -8.29 -8.55 0.00 -2.84 o-g 01 all rms 12.92 17.73 16.92 15.55 15.36 15.72 19.37 21.03 21.42 17.97 0.00 17.61 o-g 03 o-g 03 o-g 03

all all all

count bias rms

686 229 312 0.20 -1.54 0.28 7.31 10.29 10.66

343 228 561 423 473 571 415 0.43 -0.50 0.66 -0.98 -0.71 -1.09 -3.94 9.83 10.64 11.57 14.67 14.34 14.23 13.01

0 0.00 0.00

4241 -0.64 11.94

O-B: 4240 observations in total and bias is -2.84 and rms is 17.61 O-A: 4241 observations in total and bias is -0.64 and rms is 11.94 The total bias and rms were reduced. •

fit_p1.2011032212 (fort.201) This file demonstrates how the background and analysis fields fit to surface pressure observations. Because the surface pressure is two-dimensional, the table is formatted

93

GSI Applications different than the three-dimensional fields shown above. Once again, the summary lines will be shown for O-B and O-A to gain a quick view of the fitting: it o-g 01

obs

o-g 03

type stype all all

count 13986

bias 0.4223

rms 1.0427

cpen 0.8230

qcpen 0.7487

14003

0.0172

0.6744

0.3537

0.3379

O-B: 13986 observations in total and bias is 0.4223 and rms is 1.0427 O-A: 14003 observations in total and bias is 0.0172 and rms is 0.6744 Both the total bias and rms were reduced. These statistics show that the analysis results fit to the observations closer than the background, which is what the analysis is supposed to do. How close the analysis fit to the observations is based on the ratio of background error variance and observation error. 5.1.4.2: Check the minimization In addition to the minimization information in the stdout file, GSI writes more detailed information into a file called fort.220. The content of fort.220 is explained in section 4.6. Below is an example of a quick check of the trend of the cost function and norm of gradient. The value should get smaller with each iteration step. In the run directory, the cost function and norm of the gradient information can be dumped into an output file by using the command: $ grep 'penalty,grad ,a,b=' fort.220 | sed -e 's/penalty,grad ,a,b=//g' > cost_gradient.txt

The file cost_gradient.txt includes 6 columns, however only the first 4 columns are shown below. The first 6 and last 6 lines read are: 1 1 1 1 1 1

0 1 2 3 4 5

2 2 2 2 2 2

35 36 37 38 39 40

0.702483384207798081E+05 0.582549540813520216E+05 0.510340800654954874E+05 0.451910522149672179E+05 0.421994383952682037E+05 0.403738674444606295E+05 … 0.366494229475844040E+05 0.366494229180964758E+05 0.366494229007449394E+05 0.366494228897850917E+05 0.366494228834806418E+05 0.366494228798716213E+05

0.670451046411596704E+06 0.357485885139904742E+06 0.223748229447264661E+06 0.100693261241731772E+06 0.726689181584937469E+05 0.565482157493619452E+05 0.654227325952160183E-03 0.328958749059858212E-03 0.215132065808595375E-03 0.140109917756355446E-03 0.873987085206934423E-04 0.503331929251514545E-04

The first column is the outer loop number and the second column is the inner iteration number. The third column is the cost function, and the forth column is the norm of

94

GSI Applications gradient. It can be seen that both the cost function and norm of gradient are descending with each iterations. To get a complete picture of the minimization process, the cost function and norm of gradient can be plotted using a provided NCL script located under: ./util/Analysis_Utilities/plot_ncl/GSI_cost_gradient.ncl. the plot is shown below:

Figure 5.2: Cost function and norm of gradient change with iteration steps

The above plots demonstrate that both the cost function and norm of gradient descend very fast in the first 10 iterations in both outer loops and drop very slowly after the 10th iteration. For more details on how to use the tool to make this plot, please see section A.3.

95

GSI Applications 5.1.4.3: Check the analysis increment The analysis increment gives us an idea where and how much the background fields have been changed by the observations. Another useful graphics tool that can be used to look at the analysis increment is located under: ./util/Analysis_Utilities/plot_ncl/Analysis_increment.ncl. The graphic below shows the analysis increment at the 15th level. Notice that the scales are different for each of the plots.

Figure 5.3: Analysis increment at the 15th level

It can be clearly seen that the U.S. CONUS domain has many upper level observations and the data availability over the ocean is very sparse. For more details on how to use the tool to make this plot, please see section A.4.

96

GSI Applications 5.2. Assimilating Radiance Data with GSI 5.2.1: Run script Adding radiance data into the GSI analysis is very straightforward after a successful run of GSI with conventional data. The same run scripts from the above section can be used to run GSI with radiance with or without PrepBUFR data. The key step to adding the radiance data is linking the radiance BUFR data files to the GSI run directory with the names listed in the &OBS_INPUT section of the GSI namelist. The following example adds the three following radiance BUFR files: AMSU-A: le_gdas1.t12z.1bamua.tm00.bufr_d AMSU-B: le_gdas1.t12z.1bamub.tm00.bufr_d HIRS4: le_gdas1.t12z.1bhrs4.tm00.bufr_d The location of these radiance BUFR files has been previously saved to the environment variable OBS_ROOT, therefore the following three lines can be inserted below the link to the PrepBUFR data in the script run_gsi.ksh: ln -s ${OBS_ROOT}/gdas1.t12z.1bamua.tm00.bufr_d amsuabufr ln -s ${OBS_ROOT}/gdas1.t12z.1bamub.tm00.bufr_d amsubbufr ln -s ${OBS_ROOT}/gdas1.t12z.1bhrs4.tm00.bufr_d hirs4bufr

If it is desired to run radiance data in addition to the conventional PrepBUFR data, the following link to the PrepBUFR should be left as is: ln -s ${PREPBUFR} ./prepbufr

Alternatively to analyze radiance data without conventional PrepBUFR data, this line can be commented out in the script run_gsi.ksh: ## ln -s ${PREPBUFR} ./prepbufr

In the following example, the case study will include both radiance and conventional observations. In order to link the correct name for the radiance BUFR file, the namelist section &OBS_INPUT should be referenced. This section has a list of data types and BUFR file names that can be used in GSI. The 1st column ‘dfile’ is the file name recognized by GSI. The 2nd column ‘dtype’ and 3rd column ‘dplat’ are the data type and data platform that are included in the file listed in ‘dfile’, respectively. More detailed information on data usage can be found in Section 4.3 and Section 8.1 for the radiance data specifically. For example, the following line tells us the AMSU-A observation from NOAA-17 should be in a BUFR file named as ‘amsuabufr’:

97

GSI Applications dfile(30)='amsuabufr',dtype(30)='amsua',dplat(30)='n17',dsis(30)='amsua_n17',dval(30)=0.0,dthin(30)=2,

With radiance data assimilation, two important setups, data thinning and bias correction, need to be checked carefully. The following is a brief description of these two setups:

•

Radiance data thinning The radiance data thinning is setup in the namelist section &OBS_INPUT. The following is a part of namelist in that section: dmesh(1)=120.0,dmesh(2)=60.0,dmesh(3)=60.0,dmesh(4)=60.0,dmesh(5)=120 dfile(30)='amsuabufr',dtype(30)='amsua',dplat(30)='n17',dsis(30)='amsua_n17',dval(30)=0.0,dthin(30)=2,

The first line of &OBS_INPUT lists multiple thinning grids as elements of the array demesh. For each line specifying a data type, the last element of that line is a specification such as, ‘dthin(30)=2’. This selects the mesh grid to be used for thinning. It can be seen that the data thinning option for NOAA-17 AMSU-A observations is 60 km because the dmesh(2) is set to 60 km. For more information about data thinning, please refer to section 3.3 and Section 8.1. •

Radiance data bias corretion The radiance data bias correction is very important for a successful radiance data analysis. In the run scripts, there are two files related to bias correction: SATANGL=${FIX_ROOT}/global_satangbias.txt cp ${FIX_ROOT}/sample.satbias ./satbias_in

The first file (global_satangbias.txt) provides GSI the coefficients for angle bias correction. The angle bias coefficients are calculated off line outside of GSI. The second file (sample.satbias) provides GSI the coefficients for mass bias correction. They are calculated from within GSI in the previous cycle. In the released version 3.1, most of the mass bias correction values in sample.satbias are 0. This means there was not a good estimate base for mass bias correction in this case. The angle bias file global_satangbias.txt is also out of date. These two files are provided in ./fix as an example of the bias correction coefficients. For the best results, it will be necessary for the user to generate their own bias files. The details of the radiance data bias correction are discussed in Section 8.4 of this document. For this case, the GDAS bias correction files were downloaded and saved in the observation directory. The run script should contain two lines to link to the bias correction coefficient files, in order to do this, change the following lines in run script: SATANGL=${FIX_ROOT}/global_satangbias.txt cp ${FIX_ROOT}/sample.satbias ./satbias_in

98

GSI Applications The first line sets the path to the angle bias coefficient file, and the second copies the mass bias coefficient file into the working directory. Once these links are set, we are ready to run the case. 5.2.2: Run GSI and check run status The process for running GSI is the same as described in section 5.1.2. Once run_gsi.ksh has been submitted, move into the run directory to check the GSI analysis results. For our current case, the run directory will look almost as it did for the conventional data case, the exception being the three links to the radiance BUFR files and new diag files for the radiance data types used. Following the same steps as in section 5.1.2, check the stdout file to see if GSI has run through each part of the analysis process successfully. In addition to the information outlined for the conventional run, the radiance BUFR files should have been read in and distributed to each sub domain: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA:

ps t q pw uv sst hirs4 amsua amsua amsua amsub hirs4 amsua

metop-a n15 n18 metop-a n17 n19 n19

2352 4617 3828 89 5704 0 0 2564 1000 0 0 246 656

2572 4331 3908 31 4835 0 0 1333 2117 0 0 1097 3503

8367 12418 11096 141 15025 2 28 133 0 58 57 0 0

2673 4852 3632 23 4900 0 35 195 90 67 70 37 93

When comparing this output to the content in step 3 of section 5.1.3, it can be seen that there are 7 new radiance data types that have been read in: HIRS4 from METOP-A and NOAA-19, AMSU-A from NOAA-15, NOAA-18, NOAA-19, and METOP-A, and AMSU-B from NOAA-17. The table above shows that most of the radiance data read in this case are AMSU-A from NOAA-15. 5.2.3: Diagnose GSI analysis results 5.2.3.1: Check file fort.207 The file fort.207 contains the statistics for the radiance data, similar to file fort.203 for temperature. This file contains important details about the radiance data analysis. Section 4.5.2 explains this file in detail. Below are some values from the file fort.207 to give a quick look at the radiance assimilation for this case study. The fort.207 file contains the following lines:

99

GSI Applications For O-B, the stage before the first outer loop: it satellite instrument #read o-g 01 rad n15 amsua 128055 o-g 01 rad n17 amsub 213920

#keep 53932 254

#assim 14552 0

For O-A, the stage after the second outer loop: o-g 03 rad o-g 03 rad

n15 n17

amsua amsub

128055 213920

53932 254

36345 0

penalty 24875. 0.000

qcpnlty 24875. 0.0000

9993.3 0.0000

9993.3 0.0000

cpen 1.7094 0.0000

qccpen 1.7094 0.0000

0.27496 0.0000

0.27496 0.0000

From the above information, it can be seen that AMSU-A data from NOAA-15 have 128055 observations within the analysis time window and domain. After thinning, 53932 of this data type remained, and only 14552 were used in the analysis. The penalty for this data decreased from 24875 to 9993.3 after 2 outer loops. It is also very interesting to see that the number of AMSU-A observations assimilated in the O-A calculation increase to 37345 from 14552. This is almost tripple times larger than the number of O-B. In this analysis, AMSU-B data from NOAA-17 have 213920 within the analysis time window and domain. After thinning, only 254 were left and none of them were used in the analysis. The statistics for each channel can be view in the fort.207 file as well. Below channels from AMSU-A NOAA-15 are listed as an example: For O-B, the stage before the first outer loop: 1 2 3 4 5 6 7 8 9 10 12 15

1 2 3 4 5 6 7 8 9 10 12 15

amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15

1134 1104 1180 1266 1266 2293 2216 1826 629 19 859 760

157 186 93 0 3 491 1933 2323 3520 4130 3290 597

3.000 2.000 2.000 0.600 0.300 0.230 0.250 0.275 0.340 0.400 1.000 3.000

-0.6857597 -1.0022746 1.3881152 -0.1387697 -0.1162391 -1.5129871 -0.6290472 -0.7189088 -1.2802610 -0.9309823 0.7343006 1.8184975

For O-A, the stage after the second outer loop: 1 2 3 4 5 6 7 8 9 10 12 15

1 amsua_n15 2 amsua_n15 3 amsua_n15 4 amsua_n15 5 amsua_n15 6 amsua_n15 7 amsua_n15 8 amsua_n15 9 amsua_n15 10 amsua_n15 12 amsua_n15 15 amsua_n15

2468 2403 2642 2824 2824 4088 4139 4139 4006 3520 1145 2147

356 421 182 0 0 5 10 10 143 629 3004 677

3.000 2.000 2.000 0.600 0.300 0.230 0.250 0.275 0.340 0.400 1.000 3.000

-0.7052460 -0.8484674 1.5872484 -0.0563292 0.0507068 -1.2792988 -0.1862899 -0.2879666 -0.6683869 -0.5653461 1.7278088 2.3030982

0.4101223 -0.2686506 -1.0935757 -0.1315877 -0.2221524 -0.3181822 -0.4804044 -0.5710642 -0.8482793 -0.9484701 0.0869171 -0.9903618

0.2064464 0.2293813 0.3919319 0.2929980 0.9228904 2.2705673 3.3199152 3.5927614 4.5752183 4.7934543 0.0672267 0.3098348

2.1807583 2.1137854 2.1384668 0.4427612 0.3347665 0.3760536 0.5077016 0.5909604 0.8554846 0.9528062 0.5131227 2.5077433

2.1418465 2.0966439 1.8376976 0.4227555 0.2504335 0.2004405 0.1642332 0.1520519 0.1107981 0.0907966 0.5057077 2.3039010

0.3474635 -0.1059943 -0.7332827 0.0350535 0.0043107 -0.0009505 -0.0170967 -0.0262668 -0.0845316 -0.2697082 0.1030682 -0.4271812

0.0926576 0.1397163 0.2142122 0.0958685 0.1646666 0.2258004 0.2394344 0.2453775 0.4433605 0.8731219 0.0421959 0.1395728

1.6449965 1.7843977 1.7468399 0.2964619 0.1623035 0.1226569 0.1334548 0.1501916 0.2511950 0.4392917 0.4321463 1.7659190

1.6078814 1.7812469 1.5854798 0.2943823 0.1622463 0.1226532 0.1323551 0.1478769 0.2365446 0.3467487 0.4196753 1.7134720

The second column is channel number for AMSU-A and the last column is the standard deviation for each channel. It can be seen that most of the channels fit to the observation more or less, but the standard deviation increased for channels 9 and 10. The increase in the standard deviation seems to be occurring because a much greater number of observations are used in the O-A calculation (column 3), not because the fit has gotten worse.

100

GSI Applications

5.2.3.2: Check the analysis increment

The same methods for checking the optimal minimization as demonstrated in section 5.1.4.2 can be used for radiance assimilation. Similar features to the conventional assimilation should be seen with the minimization. The figures below show detailed information of how the radiance data impact the analysis results. Using the same NCL script as in section 5.1.4.3, analysis increment fields are plotted comparing the analysis results with radiance and conventional to the analysis results with conventional assimilation only. This difference is only shown at level 49 and level 6, which represent the maximum temperature increment level (49) and maximum moisture increment level (6).

Figure 5.4: Analysis increment fields comparing to the analysis with PREPBUFR only at level 6

101

GSI Applications

Figure 5.5: Analysis increment fields comparing to the analysis with PREPBUFR only at level 49

In order to fully understand the analysis results, the following needs to be understood: 1. The weighting functions of each channel and the data coverage at this analysis time. There are several sources on the Internet to show the weighting function for the AMSU-A channels. Chanel 1 is the moisture channel, while the others are mainly temperature channels (Channels 2, 3 and 15 also have large moisture signals). Because a model top of 10 mb was specified for this case study, the actual impact should come from channels below channel 12. 2. The usage of each channel is located in the file named ‘satinfo’ in the run directory. The first two columns show the observation type and platform of the channels and the third column tells us if this channel is used in the analysis. Because a lot of amsua_n15 and amsua_n18 data were used, they should be checked in detail. In this case, Channel 11 and 14 from amsua_n15 and channel 9 and 14 from amsua_n18 were turned off. 3. Thinning information: a quick look at the namelist in the run directory: gsiparm.anl shows that both amsua_n15 and amsu_n18 using thinning grid 2, which is 60 km. In this case, the grid spacing is 30 km, which indicates to use the satellite observations every two grid-spaces, which may be a little dense. 4. Bias correction: radiance bias correction was previously discussed. It is very important for a successful radiance data analysis. The run scripts can only link to the old bias correction coefficients that are provided as an example in ./fix: SATANGL=${FIX_ROOT}/global_satangbias.txt cp ${FIX_ROOT}/ sample.satbias ./satbias_in

102

GSI Applications Users can also download the operation bias correction coefficients during experiment period as a starting point to calculate the coefficients suitable for their experiments. Radiance bias correction for regional analysis is a difficult issue because of the limited coverage of radiance data. This topic is out of the scope of this document, but this issue should be considered and understood when using GSI with radiance applications.

5.3. Assimilating GPS Radio Occultation Data with GSI 5.3.1: Run script The addition of GPS Radio Occultation (RO) data into the GSI analysis is similar to that of adding radiance data. In the example below, the RO data is used as refractivities. There is also an option to use the data as bending angles. The same run scripts used in sections 5.1.1 and 5.2.1 can be used with the addition of the following link to the observations: ln -s ${OBS_ROOT}/le_gdas1.t12z.gpsro.tm00.bufr_d gpsrobufr

For this case study, the GPS RO BUFR file was downloaded and saved in the OBS_ROOT directory. The file is linked to the name gpsrobufr, following the namelist section &OBS_INPUT: dfile(10)='gpsrobufr',dtype(10)='gps_ref',dplat(10)='',dsis(10)='gps',dval(10)=1.0,dthin(10)=0,

This indicates that GSI is expecting a GPS refractivity BUFR file named gpsrobufr. In the following example, GPS RO and conventional observations are both assimilated. Change the run directory name in the run scripts to reflect this test: WORK_ROOT=/scratch1/gsiprd_${ANAL_TIME}_gps_prepbufr

5.3.2: Run GSI and check the run status The process of running GSI is the same as described in section 5.1.2. Once run_gsi.ksh has been submitted, move into the working directory, gsiprd_2011032212_gps_prepbufr, to check the GSI analysis results. The run directory will look exactly the same as with the conventional data, with the exception of the link to the GPS RO BUFR files used in this case. Following the same steps as in section 5.1.3, check the stdout file to see if GSI has run through each part of the analysis process successfully. In addition to the information outlined for the conventional run, the GPS RO BUFR files should have been read in and distributed to each sub domain:

103

GSI Applications OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA: OBS_PARA:

ps t q pw uv sst gps_ref

2352 4617 3828 89 5704 0 3538

2572 4331 3908 31 4835 0 5580

8367 12418 11096 141 15025 2 2277

2673 4852 3632 23 4900 0 6768

Comparing the output to the content in section 5.1.3, it can be seen that the GPS RO refractivity data have been read in and distributed to four sub-domains successfully.

5.3.3: Diagnose GSI analysis results 5.3.3.1: Check file fort.212 The file fort.212 is the file for the fit of gps data in fractional difference. It has the same structure as the fit files for conventional data. Below is a quick look to be sure the GPS RO data were used: Observation – Background (O-B) ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0 -------------------------------------------------------------------------------------------------------------------------o-g 01 all count 4 87 155 548 942 675 411 497 618 855 1416 8709 o-g 01 all bias -0.50 -0.27 -0.18 -0.19 -0.11 -0.03 0.12 0.16 0.09 0.07 -0.06 -0.07 o-g 01 all rms 0.77 0.87 0.83 1.02 0.74 0.43 0.44 0.50 0.54 0.62 0.57 0.66

Observation – Analysis (O-A) ptop 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 50.0 0.0 it obs type styp pbot 1200.0 1000.0 900.0 800.0 600.0 400.0 300.0 250.0 200.0 150.0 100.0 2000.0 -------------------------------------------------------------------------------------------------------------------------o-g 03 all count 5 108 192 615 950 673 413 500 631 868 1417 8891 o-g 03 all bias -0.31 -0.11 -0.06 -0.01 -0.01 -0.04 0.00 0.01 0.01 -0.01 -0.01 0.01 o-g 03 all rms 0.43 0.62 0.79 0.77 0.54 0.29 0.20 0.21 0.24 0.28 0.40 0.48

It can be seen that most of the GPS RO data are located in the upper levels, with a total of 8709 observations used in the analysis during the 1st outer loop, and 8891 used to calculate O-A. After the analysis, the data bias reduced from -0.07 to 0.01, and the rms was reduced from 0.66 to 0.48. It can be concluded that the analysis with GPS RO data looks reasonable from these statistics.

104

GSI Applications 5.3.3.2: Check the analysis increment The same methods for checking the minimization in section 5.1.4.2 can be used for the GPS RO assimilation. The following figures give detailed information of how the new data impacts the analysis result. Using the NCL script used in section 5.1.4.3, analysis increment fields are plotted comparing the analysis results with GPS RO and conventional to the analysis results with conventional assimilation only for level 48, which represents the maximum temperature increment.

Figure 5.6: Analysis increment fields comparing to the analysis with PrepBUFR only at level 48

5.4 Introduction to GSI hybrid analysis The 3DVAR hybrid analysis is a new analysis option in the GSI system. It provides the ability to bring the flow dependent background error covariance into the analysis based on ensemble forecasts. If ensemble forecasts have been generated, setting up GSI to do a hybrid analysis is straightforward and only requires two changes in the run script in addition to the current 3DVAR run script:

105

GSI Applications Change 1: Link the ensemble members to the GSI run directory This change is to link the ensemble members to the GSI run directory and assign each ensemble member a name that GSI recognizes. For ensemble forecasts using WRF, GSI assumes each ensemble member has been named as follows: wrf_en001, wrf_en002, …, wrf_en011, wrf_en012, …, wrf_en040, … This naming convention is used for forecast files with WRF NMM binary and netcdf format as well as WRF ARW netcdf format. This version of GSI does not support WRF ARW binary as an ensemble member format. Another ensemble forecast format GSI supports is GEFS. Change 2 Setup the namelist options in section HYBRID_ENSEMBLE Users need to set l_hyb_ens=.true. to turn on hybrid ensemble analysis. Commonly used namelist options for the hybrid analysis are listed below: l_hyb_ens uv_hyb_ens

generate_ens n_ens beta1_inv

- if true, turn on hybrid ensemble option - if true, ensemble perturbation wind variables are u,v; otherwise, ensemble perturbation wind variables are stream function, velocity potential functions. - if true, generate internal ensemble based on existing background error - number of ensemble members. - (1/beta1), the weight given to the static background error covariance. 0 =0) ... read_loop: do while (ireadsb(lnbufr)==0) ...

!

End of bufr read loops enddo read_loop enddo read_subset call closbf(lnbufr)

The content of each observation needed by GSI can be found by searching the BUFR mnemonics (bold in following code sample), for example, the following lines of the code give a list of mnemonics included in the subroutine: hdr1b ='SAID FOVN YEAR MNTH DAYS HOUR MINU SECO CLAT CLON CLATH CLONH HOLS' if (atms) hdr1b ='SAID FOVN YEAR MNTH DAYS HOUR MINU SECO CLAT CLON CLATH CLONH HMSL' hdr2b ='SAZA SOZA BEARAZ SOLAZI' call ufbrep(lnbufr,data1b8,1,nchanl,iret,'TMBR') call ufbrep(lnbufr,data1b8,1,nchanl,iret,'TMBRST')

An explanation of each mnemonic can be easily found from the BUFR table used to generate this BUFR file. Users can get this BUFR table on-line, from decoding the BUFR file, or checking the BUFR file bufrtab.012 in the fix directory of the release package. For example, a search for SAZA SOZA in bufrtab.012, we found the following two lines: | SAZA | SOZA

| 007024 | SATELLITE ZENITH ANGLE | 007025 | SOLAR ZENITH ANGLE

| |

These lines tell us that GSI needs to read in satellite zenith angle and solar zenith angle for each observation profile. •

Data selection in reading process

In the data ingesting subroutine, only observations within the analysis domain (for regional applications) and time window are processed for the thinning. After establishing a coarse grid based on the setups in the parameters from OBS_INPUT, GSI starts a smart selection of radiance profiles for the coarse grid. This processing of radiance data thinning not only selects the nearest radiance profile in a coarse grid, but also considers the quality of the radiance profiles. Each observation needs to go through the following steps in the data thinning to only keep the radiance profile with the best quality for analysis: 1. Remove the profiles where the key channels are bad 2. Select a profile that has a larger number of good channels 129

Satellite Radiance Data Assimilation 3. Skip profiles that the Field of View (FOV) are out of range 4. Select profiles that are over better surface fields. Pick the observation in such order based on surface features: sea, sea ice, snow and land, mixed surface. 5. Select profile based on the data quality predictor •

Internal observation data format

After data thinning, the best quality radiance observation for each coarse grid is then saved with surface status (calculated from the background) in a two-dimensional array called “data_all”. The 1st dimension of the array saves all information about one observation and the 2nd one loops through the observations. The code that assigns the content of the array starts like: data_all(1 data_all(2 data_all(3 data_all(4

,itx)= ,itx)= ,itx)= ,itx)=

rsat t4dv dlon dlat

! ! ! !

satellite ID time grid relative longitude grid relative latitude

and ends like: do i=1,nchanl data_all(i+nreal,itx)=data1b8(i) end do

The code itself gives clear notation on the content of the 1st dimension of the array except for the last three lines. For example, it clearly tells us the first 4 items in the array are satellite ID (rsat), observation time (t4dv), and grid relative longitude (dlon) and latitude (dlat). However, there is no clear notation for data_all(i+nreal,itx), a little search for the array data1b8 indicates it contains the brightness temperature from all channels in an observation profile. After reading and processing all observations in the BUFR file and saving them in the data array “data_all”, this array is written to an intermediate binary file at the end of the subroutine read_bufrtovs. •

Observation count in stdout file

From the stdout file, we can see the following information counting the data during the data ingesting stage, an example from the case in Chapter 5: READ_BUFRTOVS: file=amsuabufr type=amsua ithin= 2 rmesh= 60.000000 isfcalc= 0 ndata=

sis=amsua_n15 53932 ntask=

nread=

128055

1

This tells us that the subroutine read_bufrtovs is reading NOAA-15 AMSU-A observations from file amsuabufr. There are 128055 observations (profile number * channels number) read in from the BUFR file and 53932 observations kept for further processing after data selection and thinning.

130

Satellite Radiance Data Assimilation 8.1.3 information on ingesting and distribution The analysis in GSI is done in each subdomain for MPI runs. The observation number in each sub-domain can be found in the stdout file. All data types are listed in the stdout file as shown in the following example, using the same example as section 5.2.2: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA: 3:OBS_PARA:

ps t q uv sst pw hirs4 amsua amsua amsua amsub hirs4 amsua

metop-a n15 n18 metop-a n17 n19 n19

2352 4617 3828 7859 96 89 0 2564 1000 0 0 245 656

2572 4331 3908 6216 239 31 0 1333 2117 0 0 1097 3503

8367 12418 11096 17360 564 141 28 133 0 58 58 0 0

2673 4852 3632 6126 150 23 35 195 90 67 70 37 93

Please note the number in each subdomain is the number of the radiance profiles, not the number of observed channels. Each profile includes many channels. For example, each HRIS observation has 19 channels, each MSU has 4 channels, each AMSU-A has 15 and AMSU-B has 5, each MHS has 5 and SSU has 3.

8.2. Radiance observation operator The observation operator for radiance observations is very complex and out of the scope of this user’s guide. Here, we only briefly introduce some features of the radiance observation operator. The Community Radiative Transfer Model (CRTM) developed by JCSDA is employed by the GSI system to transform control variables into simulated radiance or brightness temperatures. This operator can be illustrated by the following equation: y=K(x,z) where: y are simulated radiance observations; x are profiles of temperature, moisture, and ozone; K is the radiative transfer equation (CRTM); z are unknown parameters such as the surface emissivity, CO2 profile, methane profile, etc. In GSI, x (including surface condition) are calculated based on the background fields and then are put into the CRTM functions (K) to calculate the simulated radiance observations y. When unknowns in K(x, z) are too large, which may be from the formulation of K or unknown variables (z), observed radiance data cannot be reliably used and must be removed during quality control. Exmaples of this include when clouds, trace gases, or aerosols exist in the observed column. The description of radiance data quality control can be found in the next section. For advanced users needing to learn the details of the radiance 131

Satellite Radiance Data Assimilation observation operator in GSI, please check the corresponding subroutine listed in the right column of the section 6.2.4 table. Because GSI uses the CRTM functions as part of the radiance observation operator, the CRTM coefficients have to be available during the radiance data analysis. In the GSI release package, these CRTM coefficients are linked to the running directory by the run script before the GSI starts to run. The details of linking CRTM coefficients can be found in Chapter 3 in the introduction of the GSI run scripts. Please note that the GSI run script does not know which kind of radiance observations will be used in the analysis. The script links all the CRTM coefficients for the radiance observation types listed in the satinfo file. After reading in radiance observations from BUFR files, GSI recognizes which kind of radiance observations to be used and only reads in the corresponding coefficients needed. Therefore, users only need to check whether the CRTM coefficients of the user interested radiance data types are linked correctly. At the same time, users can ignore the warning information on the missing CRTM coefficients if those coefficients are for the radiance data types that are not used in the application.

8.3. Radiance observation quality control The quality control (QC) may be the most important aspect of satellite data assimilation. Unlike conventional observations from a prepbufr file, which includes the quality markers from the NCEP quality control process, the satellite radiance BUFR file does not include observation quality information. Instead, the quality control for radiance observations is inside the GSI. The GSI radiance data quality control starts right after the radiance observations are read in (such as in read_bufrtovs.f90). We can think of the processing of radiance data thinning as a part of the quality control because the thinning process selects the best quality observations. The major radiance data quality control step is after the calculation of the radiance observation departure in file setuprad.f90. Many QC steps are employed to capture problematic satellite data, which mainly come from the following 4 sources: • • • •

Instrument problems Clouds and precipitation simulation errors Surface emissivity simulation errors. Processing errors (e.g., wrong height assignment, incorrect tracking, etc)

In GSI, each instrument has its own quality control subroutine. All these subroutines are in the file qcmod.f90 and are listed as follows for reference:

132

Satellite Radiance Data Assimilation subroutine name qc_ssmi qc_seviri qc_ssu qc_goesimg qc_msu qc_irsnd qc_avhrr qc_amsua qc_mhs qc_atms

Quality Control for ssmi, amsre, and ssmis seviri data ssu data GOES image msu data ir sounder data(hirs, goessndr, airs, iasi, cris) avhrr and avhrr_navy amsua data amsub, mhs and hsb data atms data

After calculating the radiance observation departure from the background and bias correction, these QC functions are called for each instrument to either toss the bad (questionable) observations or inflate the low confidence observations. The number of filtered observations by these QC functions is summarized in the radiance fit file (fort.207) as 7 QC categories (steps). To help users understand the meanings of these numbers in the radiance fit file, we will briefly introduce these QC steps in subroutine qc_amusa in the following table. Please note these QC categories may have different meaning for different instruments: Category Quality Control steps

Action to observations

QC1 QC2

Toss channel 1-6, 16 Toss channel 1-6, 16

QC3 QC4 QC5 QC6 QC7

Cloud affected profile, (factch4 > 0.5) Inaccurate emissivity /surface temperature estimate over sea Cloud affected profile (Scattering index factch6 > 1.0) Inflate observation error over high terrain (>2000m) Inflate observation error over high terrain (>4000m) Retrieved could liquid water path > 1.0 Part of Scattering index > 1.0

Toss channel 1-7, 16 Inflate channel 7 observation error Inflate channel 8 observation error Count profiles Count profiles

Using the same example as section 4.5.2: sat n15

type amsua

penalty nobs 19769.16042371 4149 qcpenalty qc1 19769.16042371 883

iland 673 qc2 63

isnoice 1475 qc3 2127



Using the above table, we can understand numbers listed under qc1 to qc7. Listed below also includes the explanation of the numbers not in the above table, for a complete understanding of this part of the radiance fit file. For other portions of the fit file, please see the introduction in section 4.5.2.

133

Satellite Radiance Data Assimilation From the above example, we see there are 4149 NOAA-15 AMSU-A profiles after thinning, among which there are: 673 profiles over land (iland) 1475 profiles over snow or ice (isnoice) 268 profiles over coast (icoast) 1311 profiles within tropics that has reduced qc bounds (ireduce) 30453 channels that failed in the general gross check (ivarl) 0 channels that passed the general gross check but failed the nonlinear gross check (nlgross) 883 profiles were tossed because of cloud affect based on factch4 63 profiles were tossed because of inaccurate emissivity /surface temperature estimate over sea 2127 profiles were tossed because of cloud affect based on factch6 183 profiles have inflated observation error because of high terrain (>2000m) 0 profiles have inflated observation error because of high terrain (>4000m) 20 profiles meet criterion QC6 (part of qc 1) 46 profiles meet criterion QC7 (part of qc 3)

So, nearly ¾ of the observations were tossed because of cloud affects. 8.4. Bias correction for radiance observations Using bias correction to correct the system bias in the satellite radiance observations is one of the key steps to get a successful satellite radiance data assimilation. This section will introduce the basic theory of the GSI bias correction, the procedures and configurations of the bias correction in the GSI system, an explanation of the namelist, satinfo, and coefficients for bias correction, the use of the angle bias correction utility, and discussions of some common issues users encounter in the application of the GSI bias correction. 8.4.1. Bias correction for satellite observations Observation bias can systematically damage the data assimilation results and, consequently, the quality of the forecasting system. Biases in satellite observations are of particular concern because they may larger than the signal and damage the numerical weather prediction system in a very short period of time. Biases between the satellite observations and the model may come from the following sources: • satellite instrument itself (e.g. poor calibration or characterization, or adverse environmental effects); • radiative transfer model (RTM) linking the atmospheric state to the radiation measured by the satellite (e.g. errors in the physics or spectroscopy, or from nonmodeled atmospheric processes); • systematic errors in the background atmospheric state provided by the NWP model used for monitoring. 134

Satellite Radiance Data Assimilation

In GSI, satellite observation bias is represented as a linear regression based on N statedependent predictors Pi(x), with associated coefficients βi :

Since the bias correction is applied to the radiance departures, this is equivalent to using the modified definition of the observation operator: H̃ (x, β)=H(x)+ჼBC(β, x) The training of the bias correction consists in finding the vector β that allows the best fit between the NWP fields x and the observations. This is obtained by minimizing the following cost function:

For more details on the bias correction, please see the references listed below: 1. 2. 3. 4. 5.

Auligne T., A. P. McNally and D. P. Dee. 2007. Adaptive bias correction for satellite data in a numerical weather prediction system. Q. J. R. Meteorol. Soc. 133: 631-642. Derber JC, Wu W-S. 1998. The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Mon. Weather Rev. 126: 2287–2299. Harris BA, Kelly G. 2001. A satellite radiance-bias correction scheme for data assimilation. Q. J. R. Meteorol. Soc. 127: 1453–1468. Dee, D. P., 2004: Variational bias correction of radiance data in the ECMWF system. Pp. 97–112 in Proceedings of the workshop on assimilation of high-spectral-resolution sounders in NWP. 28 June–1 July 2004, ECMWF, Reading, UK. Dee, D. P. and S. M. Uppala, 2009, Variational bias correction of satellite radiance data in the ERAInterim reanalysis. Q. J. R. Meteorol. Soc. 135, 1830–1841.

8.4.2. The GSI Bias correction procedure and configurations In GSI, the bias correction for satellite radiance has two parts: one part is air mass bias correction, also called the predictive part of the bias correction; another part is angle dependent bias correction. Each part of bias correction has its own bias correction coefficient file: The satbias_angle file contains the angle dependent part of the brightness temperature bias for each channel/instrument/satellite. Also included in this file is the mean temperature lapse rate for each channel weighted by the weighting function for the given channel/instrument. ● The satbias_in file contains the coefficients for the predictive part of the bias correction. ●

135

Satellite Radiance Data Assimilation GSI will read in the coefficients from both satbias_angle and satbias_in files, combine them together with predictors to generate a system bias value for each channel, and then subtract this system bias from the observation innovation during the radiance observation operator calculation. During the minimization process, GSI will calculate the updated coefficients for the predictive part of the bias correction and save the updated coefficients in another file called “satbias_out”. The angle dependent bias coefficients are updated outside of GSI using a utility named gsi_angleupdate in the release package. These new mass and angle dependent bias coefficients should be used for the bias correction in the next cycle of the GSI analysis. To set up the bias correction for satellite radiance in the GSI system, users need to link the right coefficient files in the run directory and keep the coefficient files updated in cycles: Step 1, Link coefficient files for both mass bias correction and angle dependent bias into the GSI run directory before running the GSI executable. The coefficient files should come from the previous data assimilation cycle. However, if there is no previous data analysis cycle, the sample coefficient files can be copied from the directory ./fix within the community release version as a cold start. When using the run script with the released version, the following lines in the run script copy the coefficient files: SATANGL=${FIX_ROOT}/global_satangbias.txt SATINFO=${FIX_ROOT}/global_satinfo.txt ... cp $SATANGL satbias_angle cp $SATINFO satinfo # for satellite bias correction cp ${FIX_ROOT}/sample.satbias ./satbias_in

Within the directory ./fix, the sample angle dependent bias correction coefficients file is called global_satangbias.txt, and the file for mass bias correction coefficients is sample.satbias. Here, we also include the copy to the satinfo file because the bias correction needs information from the satinfo file. Step 2, Run GSI and save the output from the mass bias correction for next cycle After running the GSI, an updated coefficient file for the mass bias correction is generated in the run directory. This file is called “satbias_out”, which should be saved for the next cycle of the GSI analysis. There is a line commented out in the released GSI run script reserved for this purpose. The user should choose how to save the file for the next cycle: # GSI updating satbias_in # # cp ./satbias_out ${FIX_ROOT}/sample.satbias

Step 3, Run the angle dependent bias correction utility after GSI runs and save updated coefficients of angle dependent bias correction for use in the next cycle 136

Satellite Radiance Data Assimilation

The update of the coefficients for angle dependent bias correction is done by a stand alone application named gsi_angleupdate located under the directory ./util, but outside the GSI itself. This application reads in the diag files from the GSI analysis results and the old angle dependent bias coefficients, updates the coefficients and saves them as a new file called “satbias_ang.out”. We will introduce how to apply this utility in the next section. Please note the cycling of the coefficients to let the bias information accumulate during the data assimilation cycle is the key to getting the right bias correction.

8.4.3 Namelist, satinfo, and coefficients for bias correction To conduct the bias correction, GSI needs several pieces of information from different files: ● The satellite platform information from the GSI namelist ● The usage information for each channel from the satinfo file ● The coefficients from both mass and angle dependent bias correction coefficient files The following is a brief introduction to these files to help the user to understand the contents of each file and know how to check if the user interested satellite channels are correctly configured in these files. ● The satellite platform information from the GSI namelist The complete explanation of the GSI run script and most often used namelist options can be found in Chapter 3 of this guide. More details of setting up radiance data analysis in the run script are described in section 1 of this chapter. Users should make sure that user interested satellite instruments and platforms are in the list in &OBS_INPUT and have been correctly linked to the BUFR files. Also, the following is a list of GSI namelist options related to the bias correction: diag_rad

Default value .true.

passive_bc

.false.

Variable name

adp_anglebc .false.

Description logical to turn off or on the diagnostic radiance file (true=on) logical to turn off or on radiance bias correction for monitored channels option to perform variational angle bias correction

137

Satellite Radiance Data Assimilation ● The usage information for each channel from the satinfo file The GSI uses an information file called “satinfo” to control how to use each radiance channel. Detailed information about satinfo can be found in the GSI User’s Guide Section 4.3. The following is an example: !sensor/instr/sat amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 hirs3_n17 hirs3_n17

chan 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 2

iuse 1 1 1 1 1 1 1 1 1 1 -1 1 1 -1 1 -1 -1

error 3.000 2.000 2.000 0.600 0.300 0.230 0.250 0.275 0.340 0.400 0.600 1.000 1.500 2.000 3.000 2.000 0.600

error_cld 9.100 13.500 7.100 1.300 0.550 0.230 0.195 0.232 0.235 0.237 0.270 0.385 0.520 1.400 10.000 0.000 0.000

ermax 4.500 4.500 4.500 2.500 2.000 2.000 2.000 2.000 2.000 2.000 2.500 3.500 4.500 4.500 4.500 4.500 2.500

var_b 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000

var_pg 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

Users can easily understand the first 2 columns are sensor/instrument/satellite and channel number information. The 3rd column is the usage information, which has the following meanings: iuse -2 -1 0 1 2 3 4

Channel usage in GSI do not use monitor if diagnostics produced monitor and use in QC only use data with complete bias correction use data with no air mass bias correction use data with no angle dependent bias correction use data with no bias correction

For bias correction purposes, please make sure user interested channels are listed in the satinfo file and have been set to the correct usage flag. ● The coefficients from both mass and angle dependent bias correction coefficient files As previously introduced in this section, there are two bias correction coefficient files. These files include the bias correction coefficients for each channel:

138

Satellite Radiance Data Assimilation 1) satbias_in This file contains the coefficients for the predictive part of the bias correction (air mass bias correction coefficients). There is a sample for this file named “sample.satbias” in the GSI release package under the directory ./fix. All coefficients in this sample file are 0. Here, we use NOAA-15 AMSU-A from the satbias_out file from the radiance application case in Chapter 5 as example: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15 amsua_n15

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

0.472353 -0.677697 -2.631062 -0.470401 10.996354 -22.026905 -1.954080 -9.468913 -22.737061 -0.875332 0.000000 2.800800 0.000000 0.000000 -0.439501

-0.231512 0.382025 0.134578 2.121855 -0.762965 -1.543174 -0.293421 -1.490995 -2.195735 2.212551 0.000000 -4.042608 0.000000 0.000000 0.539856

0.291223 1.424922 2.968469 5.764014 1.787372 -1.397403 0.029899 -0.856006 0.247890 -0.392323 0.000000 0.060067 0.000000 0.000000 0.412582

0.000634 -0.000061 -0.004946 0.006496 0.082404 0.175626 -0.064129 -0.013090 -0.357354 -0.337414 0.000000 0.913834 0.000000 0.000000 -0.001741

-0.148959 0.016514 1.213581 1.333609 -1.661531 -11.384948 3.039958 -0.945916 -15.298422 -8.785395 0.000000 15.980004 0.000000 0.000000 0.158646

The first 3 columns are series number, the sensor/instrument/satellite, and channel number of each instrument. Columns 4 through 8 are 5 coefficients for the predictive (air mass) part of the bias correction, which has 5 predictors. 2) satbias_angle The satbias_angle contains the angle dependent part of the brightness temperature bias. There are two sample files for this in the GSI release package under the directory ./fix: global_satangbias.txt and nam_global_satangbias.txt. Here we only give two channels as examples from the file global_satangbias.txt: 1 amsua_n15 0.063 -0.666 -1.553 0.000 0.000 0.000 0.000 0.000 0.000 2 amsua_n15 -2.769 -0.676 -1.404 0.000 0.000 0.000 0.000 0.000 0.000

1 0.528768E-02 -0.200 -0.411 -0.588 -0.638 -0.523 -0.587 -0.593 -0.602 -0.766 -0.955 -1.635 -1.715 -1.783 -1.689 -1.507 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 2 0.290661E-02 -2.880 -2.583 -2.449 -2.218 -1.810 -0.697 -0.508 -0.464 -0.544 -0.790 -1.315 -1.318 -1.151 -0.826 -0.219 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

-0.493 -0.466 -0.482 -0.475 -1.080 -1.218 -1.149 -1.374 -1.473 -1.244 -1.233 -1.259 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.536 -1.242 -0.882 -0.788 -0.945 -1.108 -1.002 -1.364 0.086 0.631 1.121 1.807 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

139

Satellite Radiance Data Assimilation The first line for each channel looks like: 1 amsua_n15 2 amsua_n15

1 2

0.528768E-02 0.290661E-02

The columns are series number, the sensor/instrument/satellite, channel number of each instrument, and “T lap mean”, respectively. The next 90 numbers are coefficients for angle dependent bias correction. These numbers correspond to the number of the FOV per scan. For example, AMSU-A has 30 FOV per scan, which is using the first 30 numbers to represents the bias correction coefficients for each FOV position, while the AMSU-B has 90 FOV per scan, all 90 numbers are used to do bias correction. If there are some instruments in satinfo but not in satbias_in and satbias_angle, GSI will set 0 as the initial value for these instruments and write out updated coefficients for these new instruments in coefficient results files. 8.4.4. Utility for Angle Bias Correction The coefficients for correcting the angle dependent part of the brightness temperature bias in the GSI are calculated after each GSI run. The NCEP has developed a tool to calculate these coefficients and, for the first time, this community GSI release (v3.1) started to include this tool as a part of the release package. This tool is released as a directory named ./gsi_angupdate within ./comGSIv3.1/util . Please note the community GSI release package version 3.0 doesn’t have this tool. If users are using the GSI release 3.0 and need this tool, please download the tar file “gsi_angupdate.tar” from the GSI download page on-line. Once untarred, under the GSI directory “./comGSIv3/util”, you will see a new directory: ./gsi_angupdate, which includes the tool to update coefficients for radiance angle dependent bias correction. •

Compile

Inside the directory ./gsi_angupdate, type the following command: ./make then check if executable “gsi_angupdate.exe” exists in the same directory. Please note that before compiling this utility, the community GSI should already be compiled successfully. Please refer to Chapter 2 of this User’s Guide on how to configure and compile the community GSI. •

Run

Before running “gsi_angupdate.exe”, make sure that GSI with radiance data have finished successfully and the diagnostic files that hold O-B information have been generated in the 140

Satellite Radiance Data Assimilation GSI run directory. The executable ““gsi_angupdate.exe” will read in O-B information from the diagnostic files for each sensor to update coefficients for angle dependent bias correction of the sensor. To help users easily run this tool, a sample run script named run_gsi_angupdate.ksh is provided within the ./comGSIv3.1.run directory. If a user uses the tar file downloaded separately on-line, a similar run script can be found in the directory “./comGSI/util” with the code. This script is modified based on the GSI run script. The script has a similar structure. Please check section 3.2.2.1 for instructions on setting up the machine environment, and section 3.2.2.2 for setting up the run environment. The run environment portion is illustrated below: ##################################################### # machine set up (users should change this part) ##################################################### # # GSIPROC = processor number used for GSI analysis #-----------------------------------------------GSIPROC=1 ARCH='LINUX_PBS' # Supported configurations: # IBM_LSF, # LINUX, LINUX_LSF, LINUX_PBS, # DARWIN_PGI

In this script, only four parameters need to be set for a case study. These parameters have been explained clearly in the run script and illustrated below: # ##################################################### # case set up (users should change this part) ##################################################### # # ANAL_TIME= analysis time (YYYYMMDDHH) # WORK_ROOT= working directory, where angupdate executable runs # GSI_WORK_ROOT= GSI working directory, where GSI runs # GSI_ANGUPDATE_EXE = path and name of the gsi angupdate executable ANAL_TIME=2011032212 WORK_ROOT=./comGSIv3.1/run/angupdate_${ANAL_TIME} GSI_WORK_ROOT=./comGSIv3.1/run/arw_2011032212 GSI_ANGUPDATE_EXE=./comGSIv3.1/util/gsi_angupdate/gsi_angupdate.exe #

These parameters tell the analysis case time, where to find the GSI run directory and gsi_angupdate.exe, and where to run gsi_angupdate.exe. The run time information can be found in the stdout file. A successful run should end with the following information: PROGRAM GLOBAL_ANGUPDATE HAS ENDED.

IBM RS/6000 SP

141

Satellite Radiance Data Assimilation After a successful run, an updated coefficients file named “satbias_ang.out” should be found in the run directory.

•

Namelist

The namelist for gsi_angupdate.exe has two sections: setup and obs_input. Here, we only show and illustrate part of the namelist as an example. &setup jpch=2680,nstep=90,nsize=20,wgtang=0.008333333,wgtlap=0.0, iuseqc=1,dtmax=1.0, iyy1=${iy},imm1=${im},idd1=${id},ihh1=${ih}, iyy2=${iy},imm2=${im},idd2=${id},ihh2=${ih}, dth=01,ndat=50 / &obs_input dtype(01)='hirs3', dplat(01)='n17', dsis(01)='hirs3_n17', dtype(02)='hirs4', dplat(02)='metop-a', dsis(02)='hirs4_metop-a', dtype(03)='goes_img', dplat(03)='g11', dsis(03)='imgr_g11', dtype(04)='goes_img', dplat(04)='g12', dsis(04)='imgr_g12', dtype(05)='airs', dplat(05)='aqua', dsis(05)='airs281SUBSET_aqua', dtype(06)='amsua', dplat(06)='n15', dsis(06)='amsua_n15',

The section obs_input only has three columns, which have the same meaning as their counterparts in the GSI namelist, i.e., dtype and dplat specify the radiance instrument and the satellite name, respectively, and dsis indicates the radiance observation type with a name combining both the instrument and the satellite names. Most of the parameters in the section setup are different from the section setup in GSI namelist. We will explain these parameters below: jpch: total channel number in coefficients file : satbias_ang.in nstep: maximum number of FOV per scan iyy1,imm1,idd1,ihh1: start date: year, month, day, and hour iyy2,imm2,idd2,ihh2: end date: year, month, day, and hour dth: time interval between start and end date. If start date is not equal to

ndat: iuseqc:

end date, the code will loop based on dth through the period to process multiple cycles. Number of radiance observation types that can be processed, which is the dimension for parameters in section: obs_input >0 (i.e., 1), check variance. If errinv= (1 /(obs error)) is small (small = less than 1.e-6), the observation did not pass quality control. In this case, do not use this observation in computing the update to the angle dependent bias. 0, then nsmooth will be forced to zero. afact0

0.0

anistropy effect parameter, the range must be in 0.01.0.

covmap

.false.

if true, covariance map would be drawn

&JCOPTS

Constraint term in cost function (Jc)

ljcdfi

.false.

alphajc

10.0

switch_on_ derivatives

.false., …

tendsflag

.false.

if true, then compute horizontal derivatives of all state variables (to be used eventually for time derivatives, dynamic constraints and observation forward models that need horizontal derivatives) if true, compute time tendencies

ljcpdry

.false.

when .t. uses dry pressure constraint on increment

bamp_jcpdry

0.0

parameter for pdry_jc

eps_eer

-1.0

Errico-Ehrendofer parameter for q-term in energy norm

ljc4tlevs

.false.

when true and in 4D mode, apply any weak constraints over all time levels instead of just at a single time

&STRONGOPTS

if .false., uses original formulation based on wind, temp, and ps tends when .t. uses digital filter initialization of increments (4dvar) parameter for digital filter

Strong dynamic constraint

jcstrong

.false.

if .true., strong constraint on

jcstrong_option

1

=1 for slow global strong constraint =2 for fast global strong constraint =3 for regional strong constraint

nstrong

0

if > 0, then number of iterations of implicit normal mode initialization to apply for each inner loop iteration

period_max

1000000.0

cutoff period for gravity waves included in implicit normal mode initialization (units = hours)

period_width

1.0

defines width of transition zone from included to excluded gravity waves

nvmodes_keep

0

number of vertical modes to use in implicit normal mode initialization

baldiag_full

.false.

flag to toggle balance diagnostics for the full fields

167

GSI Namelist baldiag_inc

.false.

flag to toggle balance diagnostics for the analysis increment

hybens_inmc_

0

integer flag for strong constraint (various capabilities for hybrid):

option

=0: no TLNMC (consistent with jcstrong=.false.) =1: TLNMC on static increment only (or if non-hybrid run) =2: TLNMC on total increment for single time level only (or 3DHybrid) if 4D mode, TLNMC applied to increment in center of window =3: TLNMC on total increment over all time levels (if in 4D mode)

&OBSQC

Observation quality control variables

Parameters used for gross error checks are set in file convinfo (ermin, ermax, ratio) Parameters below used for nonlinear (variational) quality control dfact

0

factor for duplicate observation at same location for conventional data

dfact1

3.0

time factor for duplicate observation at same location for conventional data

erradar_inflate

1

radar error inflation factor

oberrflg

.false.

logical for reading in new observation error table (if set to true)

vadfile

'none'

character(10) variable holding name of VAD wind bufr file

noiqc

.false

logical flag to bypass OI QC (if set to true)

c_varqc

1

constant number to control variance qc turning on speed

blacklst

.false.

logical for reading in raob blacklist (if set to true)

use_poq7

.false.

Logical to toggle accept (.true.) or reject (.false.) SBUV/2 ozone observations flagged with profile ozone quality mark

hilbert_curve

.false.

option for hilbert-curve based cross-validation. works only with twodvar_regional=.true.

tcp_refps

1000.0

reference pressure for tcps oberr calculation (mb)

tcp_width

50.0

parameter for tcps oberr inflation (width, mb)

tcp_ermin

0.75

parameter for tcps oberr inflation (minimum oberr, mb)

tcp_ermax

5.0

parameter for tcps oberr inflation (maximum oberr, mb)

168

GSI Namelist qc_noirjaco3

.false.

controls whether to use O3 Jac from IR instruments

qc_noirjaco3_pole

.false.

controls wheter to use O3 Jac from IR instruments near poles

& OBS_INPUT

Controls input data

dfile(ndatmax)

‘ ‘

input observation file name

dtype(ndatmax)

‘ ‘

observation type

dplat(ndatmax)

‘ ‘

satellite (platform) id (for satellite data)

dsis(ndatmax)

‘ ‘

sensor/instrument/satellite flag from satinfo files

dthin(ndatmax)

1

satellite group

dval(ndatmax)

1.0

relative value of each profile within group relative weight for observation = dval/sum(dval) within grid box

dmesh(ndatmax)

1.0

thinning mesh for each group mesh size (km) for radiance thinning grid (used in satthin)

dsfcalc(10)

0,0,…

specifies method to determine surface fields within a FOV. when equal to one, integrate model fields over FOV. when not one, bilinearly interpolate model fields to FOV center.

time_window

time_window

time window for each input data file

(ndatmax)

_max

time_window_max

3

upper limit on time window for all input data

ext_sonde

.false.

logical for extended forward model on sonde data

NOTE: current value for ndatmax is 200. &SINGLEOB_TEST

Single observation test case setup

maginnov

1

magnitude of innovation for one observation

magoberr

1

magnitude of observational error

oneob_type

‘‘

observation type (t, u, v, etc.)

oblat

0

observation latitude

oblon

0

observation longitude

obpres

1000.0

observation pressure (hPa)

obdattim

2000010100 observation date (YYYYMMDDHH)

obhourset

0

observation delta time from analysis time

169

GSI Namelist pctswitch

.false.

SUPEROB_RADAR

if .true. innovation & oberr are relative (%) of background value (level ozone only)

Level 2 bufr file to radar wind superobs

del_azimuth

5.0

azimuth range for superob box (default 5 degrees)

del_elev

0.25

elevation angle range for superob box (default .05 degrees)

del_range

5000.0

radial range for superob box (default 5 km)

del_time

0.5

1/2 time range for superob box (default .5 hours)

elev_angle_max

5.0

max elevation angle (default of 5 deg)

minnum

50

minimum number of samples needed to make a superob

range_max

100000.0

max radial range in meters to use in constructing superobs (default 100km)

l2superob_only

.false.

if true, then process level 2 data creating superobs, then quit. (added for easier retrospective testing, since level 2 bufr files are very large and hard to work with)

LAG_DATA

Lagrangian data assimilation related variables

lag_accur

1.0e-6

Accuracy used to decide whether or not a balloon is on the grid

infile_lag

inistate_lag.dat

File containing the initial position of the balloon

lag_stepduration

900.0

Duration of one time step for the propagation model

lag_nmax_bal

1000

Maximum number of balloons at starting time

lag_vorcore_stderr_a

2.0e3

Observation error for vorcore balloon

lag_vorcore_stderr_b

0.0

error = b + a*timestep(in hours)

HYBRID_ENSEMBLE

Parameters for use with hybrid ensemble option

l_hyb_ens

.false.

if true, then turn on hybrid ensemble option

uv_hyb_ens

.false.

if true, then ensemble perturbation wind variables are u,v, otherwise, ensemble perturbation wind variables are stream, pot. Functions.

aniso_a_en

.false.

if true, then use anisotropic localization of hybrid ensemble control variable a_en.

170

GSI Namelist generate_ens

.true.

if true, then generate internal ensemble based on existing background error

n_ens

0

number of ensemble members.

nlon_ens

0

number of longitudes on ensemble grid (may be different from analysis grid nlon)

nlat_ens

0

number of latitudes on ensemble grid (may be different from analysis grid nlat)

jcap_ens

0

for global spectral model, spectral truncation

pseudo_hybens

.false.

if true, turn on pseudo ensemble hybrid for HWRF

merge_two_grid_

.false.

if true, merge ensemble perturbations from two forecast domains to analysis domain (one way to deal with hybrid DA for HWRF moving nest)

0

integer, used to select type of ensemble to read in for regional application. Currently takes values from 1 to 4

ensperts regional_ensemble_ option

=1: use GEFS internally interpolated to ensemble grid. =2: ensembles are WRF NMM format =3: ensembles are ARW netcdf format. =4: ensembles are NEMS NMMB format.

full_ensemble

.false.

if true, first ensemble perturbation on first guess istead of on ens mean

betaflg

.false.

if true, use vertical weighting on beta1_inv and beta2_inv

pwgtflg

.false.

if true, use vertical integration function on ensemble contribution of Psfc

jcap_ens_test

0

for global spectral model, test spectral truncation (to test dual resolution)

beta1_inv

1

-

1/beta1, the weight given to static background error covariance 0

GSI User's Guide - Developmental Testbed Center

GSI User's Guide - Developmental Testbed Center

Suggest Documents

GSI - Developmental Testbed Center

MET Users Guide - Developmental Testbed Center

WRF-NMM User's Guide - Developmental Testbed Center

The Developmental Testbed Center and its Winter Forecasting ...

Developmental Delay - ECTA Center

NPS-CKM Testbed - Defense Technical Information Center

GSI SLV

WinSpool/400 - (98) Users Guide - RJS Support Center

GSI SLV

P9390G/GSI

Laurenceau_2010 - Center for Developmental Science

Vollmer - Douglass Developmental Disabilities Center

IDIOT - Users Guide

HYESS Users Guide - CiteSeerX

SPM Users Guide

Iridium 9555 Users Guide

Users' Guide - TextSpeak

fx-5800P Users Guide

SCW Users Guide - Intel

The CVX Users' Guide

Quest Users Guide - CiteSeerX

SPM Users Guide

Wells Fargo Users Guide

Guide for Users