problems. Each spectrum in the ATR database was searched against all spectra in the transmission FTIR database using the
Validation of ATR Correction and Reverse ATR Correction Algorithms, Improved by Optimized Corrections Gregory M. Banik, Ph.D., Michelle D’Souza, Ph. D., Keith Kunitsky, and Robin O’Connor Bio-Rad Laboratories, Inc., Informatics Division, 2000 Market Street, Suite 1460, Philadelphia,PA, 19103-3212, USA
Spectroscopy
210431
Abstract To address wavelength-dependent differences in intensity between ATR and transmission FTIR spectra, a mathematical ATR correction can be applied to an ATR spectrum to provide a better match when searching against a transmission FTIR reference database. Conversely, the reverse ATR correction can also be applied mathematically to a transmission FTIR spectrum to provide a better match when searching against an ATR reference database. An original study validated the ATR correction and the reverse ATR correction algorithm in Bio-Rad’s KnowItAll Informatics System1 spectroscopy software. This study updates and compares search results from the original study2 to search results using Optimized Corrections, a patent pending technology recently added to the software that iteratively optimizes the query and reference spectra for each comparison by automatically applying multiple corrections.
Introduction Transmission FTIR versus ATR Spectra Transmission Fourier transform infrared (FTIR) spectra are often acquired using a KBr pellet containing a sample. Alternatively, an Attenuated Total Reflectance (ATR) apparatus can be used to acquire the spectrum since it is usually simpler and faster than other methods and offers superior performance for some sample types. Because of differences in peak intensity caused by a wavelength-dependent difference in the depth of penetration in ATR spectra, a transmission FTIR spectrum of a sample will have a different appearance from an ATR spectrum of the same sample. Database Searches & the Effect of Intensity Differences Reference databases such as the Sadtler FTIR collections and in-house databases have been created to allow spectroscopists to verify the identity of samples as well as to identify unknown samples. Searches against these databases are conducted using a Hit Quality Index (HQI) value on a 0 to 100% scale, with an HQI value of 100% indicating a perfect match between the query spectrum and a given reference database spectrum, an HQI value of 0 indicating no match between the query spectrum and a given reference database spectrum, and intermediate HQI values indicating an intermediate level of similarity between the query spectrum and a given database spectrum. Such HQI values can be calculated to compare a query spectrum to each record in a spectral reference database to generate a “hit list.” The list is presented in descending order of
HQI values to display those reference spectra that are most spectrally similar to the query spectrum first. Because of subtle differences that can occur with sample preparation, contamination, etc., the first entry in a database search hit list may not be the “correct” hit. Nevertheless, the “correct” hit usually occurs within the top 50 hits if the query spectrum and reference database spectra have been collected properly. Because of the differences in intensity between a transmission FTIR spectrum of a sample and an ATR spectrum of the same sample, searching an ATR spectrum against a reference database of transmission FTIR spectra or vice versa can lead to differences in the matching factor between the query spectrum and the database hits retrieved by the search software. ATR Correction To address the wavelength-dependent differences in intensity between ATR and transmission FTIR spectra, a mathematical ATR correction can be applied to an ATR spectrum to provide a better match when searching against a transmission FTIR reference database. Conversely, the reverse ATR correction can also be applied mathematically to a transmission FTIR spectrum to provide a better match when searching against an ATR reference database. Bio-Rad’s original study validated ATR correction (searching an ATR spectrum against a transmission FTIR reference database) and reverse ATR correction algorithms (searching a transmission FTIR spectrum against an ATR reference database) in Bio-Rad’s KnowItAll Informatics System.
Validation of ATR Correction and Reverse ATR Correction Algorithms, Improved by Optimized Corrections
Optimized Corrections Optimized Corrections is a new technology in Bio-Rad’s KnowItAll Informatics System spectroscopy software to optimize spectral search results. It iteratively optimizes the query and reference spectra for each comparison by automatically applying multiple corrections to compensate for differences between spectra caused by the variability of different instruments and accessories as well as other factors, including human error. The goal of the Optimized Corrections technology is to solve many of the spectral search issues that cannot be adequately addressed by traditional algorithms and manual methods. Optimized Corrections include: baseline correction, clipping, horizontal shift, vertical shift, intensity distortion, Raman-specific intensity distortion, and ATR correction if needed. This updated study validates the use of Optimized Corrections to improve the results of ATR Correction. Several ATR Correction methods exist that take into account various parameters such as the penetration depth, the refractive indices of the crystal and the sample, and the angle of incidence. Unfortunately, these parameters are often not available when the search is performed. A qualitative method that works reasonably well is to use the following conversion function: Ic = I ∙ (1 + (n - n0) / n0) where Ic is the corrected intensity, I is the original intensity, n is the wavenumber of the data point, and n0 is the wavenumber of the first data point in the spectrum. Experience has shown that the conversion function above needs to be modified by introducing an adjustment factor F: Ic = I ∙ (1 + F ∙ (n - n0) / n0) In addition to the variations in intensities corrected by the standard ATR correction, there are non-polarization effects that may cause the tops of higher peaks to be different between the query and the reference spectrum. To compensate for these variations, the following equation introduces a polarization adjustment factor P and is applied to all intensity values that lie above 50% of the maximum intensity of the spectrum: Ic = 0.5 + (I – 0.5) ∙ (1 – (1 – P) ∙ hp) where Ic is the corrected intensity, I is the original intensity, and hp is the height of the peak.
© 2017 Bio-Rad Laboratories, Inc.
Fig. 1. Transmission FTIR spectrum and ATR spectrum for methylestradiol 3-methyl ether.
Methods 306 steroid samples were acquired and the spectra measured using a Bio-Rad Digilab FTS spectrometer. ATR spectra were measured by attaching a DuraSamplIR™ II ATR device from Smiths Scientific to the Bio-Rad instrument. Transmission FTIR spectra were measured either by creating a KBr pellet containing the sample or by dissolving the sample in dichloromethane and creating a film on a KBr crystal by evaporating the solvent from the sample solution on the KBr crystal. For the 306 spectra of steroids that were obtained, 21 spectra contained obvious water contamination, two spectra had obvious baseline drift, and one additional spectrum contained a reactive chloroformate moiety that appeared to have decomposed. These 24 spectra were excluded from the original validation study, leaving 282 spectra pairs measured by both transmission FTIR as well as ATR. Two databases were created in the KnowItAll Informatics System software (version 8.0.2): one database containing transmission FTIR spectra only and the other containing ATR spectra only.
In this study, previously excluded compounds were revisited to see if the Optimized Corrections overcome some of the problems. Each spectrum in the ATR database was searched against all spectra in the transmission FTIR database using the Optimized Corrections algorithm in the KnowItAll software. The HQI value of the top hit was noted, as was the ordinal position and HQI of the “correct” hit in the hit list, that is, the compound from the transmission FTIR database that is identical to the query ATR spectrum sample. Ideally, the ordinal position of the “correct” hit resulting from searching an ATR spectrum against a transmission FTIR database after applying mathematical ATR correction would be 1. Similarly, each spectrum in the transmission FTIR database was searched against all spectra in the ATR database using the Optimized Corrections algorithm in the KnowItAll software. The HQI value of the top hit was noted, as was the ordinal position and HQI of the “correct” hit in the hit list, that is, the compound from the ATR database that is identical to the query transmission FTIR spectrum sample.
210431
Validation of ATR Correction and Reverse ATR Correction Algorithms, Improved by Optimized Corrections
• With Optimized Corrections, eight of the twenty four spectra excluded in the original study overcome search issues between the query and reference spectra and now show up as the first hit, showing the power of Optimized Corrections.
Reverse ATR Correction Validation Searching the 282 transmission FTIR spectra against the ATR database using Reverse ATR Correction resulted in 282 individual hit lists. Fig. 2. Transmission FTIR spectrum and ATR spectrum to which Optimized Corrections algorithm has been applied for methylestradiol 3-methyl ether.
• In original study, with ATR Corrections alone, the “correct” hit occurred in the first position in the hit list in 95.4% of the searches, in the first or second positions in 99.6% of the searches, and in the top three hits in 100% of the searches with one hit in the third position. • With Optimized Corrections, the “correct” hit occurred in the first position in the hit list in 98.2% of the searches, in the first or second positions in 99.3% of the searches, and in the top three hits in 100% of the searches with two hits in the third position. • With Optimized Corrections, seven of the 24 spectra excluded in the original study overcome search issues between the query and reference spectra and now show up as the first hit.
Fig. 3. Transmission FTIR spectrum to which Optimized Corrections algorithm has been applied and ATR spectrum for methylestradiol 3methyl ether.
Conclusion Results ATR Correction Validation
Searching the 282 ATR spectra against the transmission FTIR database using ATR Correction resulted in 282 individual hit lists. • In original study, with ATR Corrections alone, the “correct” hit occurred in the first position in the hit list in 92.9% of the searches, in the first or second positions in 98.6% of the searches, and in the top ten hits in 100% of the searches. The ordinal rankings of the four hits not in the top two positions were 3, 3, 4, and 6. • With Optimized Corrections, the “correct” hit occurred in the first position in the hit list in 98.9% of the searches and in the first or second positions in 100% of the searches. Bio-Rad Laboratories, Inc. Informatics Division
Website: www.knowitall.com
© 2017 Bio-Rad Laboratories, Inc.
Steroids all share a common framework, making this validation study particularly challenging. The previous validation study of the ATR Correction and Reverse ATR Correction algorithms in the KnowItAll Informatics System showed that these algorithms already worked very well. Nevertheless, while there was not a lot of room for improvement, the addition of Optimized Corrections to KnowItAll was shown to improve ATR Correction and Reverse ATR Correction significantly. By applying Optimized Corrections, the correct compound was identified using both algorithms in the vast majority of all searches performed, resulting in a “correct” hit being in the top one or two hits of a hit list almost 100% of the time.
Reference 1
KnowItAll Informatics System, version 2017; Bio-Rad Laboratories, Informatics Division, Philadelphia, PA, 2017.
2
Banik, G. and Scandone, M. “Validation of ATR Correction and Reverse ATR Correction Algorithms in the KnowItAll Informatics System,” Bio-Rad Laboratories, Inc., Informatics Division, Philadelphia, PA, [Technical Note] 2009.
Contact Us: www.knowitall.com/contactus 210431