Numerical design method for aberration-reduced concave grating spectrometers Wayne R. McKinney and Christopher Palmer
A general method is described for the design of single concave grating optical systems. Proper parameter and variable sets are defined, and a straightforward technique leads to a final set of optimized variable values which minimize a merit function based on spectroscopic performance.
When this work was done, both authors were with the Milton Roy Company, Analytical Products Division, 820 Linden Avenue, Rochester, New York 14625; C. Palmer is now with Bryn Mawr College, Physics Department, Bryn Mawr, Pennsylvania 19010, and W. McKinney is now with Lawrence Berkeley Laboratory, Center for X-Ray Optics, Berkeley, California 94720. Received 16 February 1987. 0003-6935/87/153108-11$02.00/0. © 1987 Optical Society of America.
APPLIED OPTICS / Vol. 26, No. 15 / 1 August 1987
I. Introduction
In 1882 Rowland [1-3] described the concave grating, which combines the diffracting and focusing properties of the spectrograph into one optical element. This produced a simpler design, which was more efficient in the vacuum UV. Although they were employed in several new design configurations, called mountings, gratings were fabricated by the same procedure, mechanical ruling, until 1967. At that time the convergence of two technical advances made a new method of construction feasible: the needs of the microelectronics industry led to the development of photoresist, a high-resolution recording medium, and highly coherent high-intensity light became available from the ion laser. Rudolph and Schmahl at the University of Goettingen [4,5] and Labeyrie and Flamand in France [6] constructed interferometers whose standing wave patterns intersected the surface of a photoresist-covered optical blank. Development of the photoresist gave surface relief in the form of sinusoidal grooves. This type of interferometrically ruled concave grating, or photoresist grating, was quickly named a holographic grating because of the similarity of the methods used to make it. It has since become a widely used spectroscopic optical component. Its major disadvantage, that its grooves are sinusoidal rather than triangular, causes it to have a peak diffraction efficiency that is nominally lower than that which a ruled grating can achieve. Ruled concave gratings, however, with the exception of those ruled by Harada et al. in Japan [7], have a triangular groove profile whose blaze angle varies across the grating. This causes a variation in efficiency which can nullify some of the ruled concave's efficiency advantage. In addition, the sinusoidal groove profiles can be modified to match more closely a triangular shape by ion milling processes [9].
The holographic method of grating manufacture, however, provides considerable advantages. The long coherence length of the laser source guarantees more uniform groove spacings, which leads to significantly reduced stray light levels. This is of particular interest to those who use the grating to monochromatize light from continuum sources. The most widely discussed advantage in the literature of concave gratings, however, is that one can generate groove placements which differ from those of the Rowland grating. The classical concave grating, known as a straight-grooved
grating, has grooves formed by the intersection of equally spaced parallel planes and the concave surface. This grating may be formed by conventional ruling or by the use of a plane wave interferometer. Its focal properties have been well described [10-14]. For source locations on the Rowland circle, the locus of horizontal focus also lies on that circle. The locus of vertical focus lies on a curve tangent to or near the Rowland circle. For holographic concaves made in spherical wave interferometers, however, the groove placements are determined by the intersection of the confocal fringe system generated by two point sources and the concave blank. This gives the designer the freedom to adjust the recording coordinates and, therefore, the focal curve locations to optimize the construction of the grating for its particular use. There is also the further freedom to modify a third-order aberration, such as coma, after the second-order focal curves have been adjusted. Optical systems using holographically constructed gratings have been shown to provide better images than those using conventional straight-grooved gratings [15].
Such modification of the classical gratings and their mountings has become known as aberration reduction. It should be realized that the aberrations of the concave grating are considerable; they are merely reduced by this process, not eliminated. Around the wavelength of construction, nearly perfect imaging can be achieved. This is to be expected when one considers the grating to be a simple hologram. Illumination of the finished grating from one of its recording source points generates a diffracted wave that converges to the other recording source point. Concave grating instruments must often be used over a wide wavelength range, however, and the greater this range, the larger the defect of focus becomes. The major uses of the aberration-reduced concave grating, therefore, are for moderate resolution monochromators and spectrographs of typically 5-20-nm bandpass. These designs are most often applied in UV and visible spectroscopy, where using only one component avoids the costs of aligning and mounting the two mirrors of the usual Czerny-Turner plane grating monochromator. Because of the difference in inherent cost between concave gratings and plane gratings of comparable size, the advantages of this trade-off must be considered carefully. In the case of spectrographs, the flattening of the focal plane which can be accomplished offers considerable additional justification for the use of the concave grating. Since the grating does not rotate as in the monochromator, one has more ability to locate the horizontal and vertical focal curves near the array detector. For these applications, geometrical optics has been shown to be adequate. The theory has been worked out in considerable detail and elegance [16-18] using optical path function analysis and ray tracing.
Unlike the classical grating, where mountings could be analytically derived (e.g., the Rowland circle spectrograph, Seya-Namioka monochromator, and Wadsworth spectrograph), the subject of mountings involving aberration-reduced concave gratings does not readily lend itself to the analytical approach. There are more variables and parameters than before, and these quantities are related in a complicated fashion. This work describes a numerical approach based on the theory of Namioka which attempts to maximize physical insight and keep the necessary ad hoc parts of the problem to a minimum. Suitable parameters are defined, and a systematic procedure for obtaining a final design is presented.
II. Grating System
We concern ourselves with two types of spectrometer, spectrographs and monochromators. Both contain diffractive optical elements, such as gratings, which spatially disperse a spectrum. The spectrograph contains a linear array detector on which the entire spectrum of interest is imaged, while the monochromator contains a single exit slit at which only a narrow band of diffracted wavelengths is imaged at a time. This latter system requires the relative movement of the grating with respect to fixed entrance and exit slits to bring different wavelengths into focus.
Fig. 1. Spectrograph geometry. The source point $A(\alpha, r)$ and the short- and long-wavelength ideal image points $B(\beta_S, r'_S)$ and $B'(\beta_L, r'_L)$, respectively, lie in the principal (x-y) plane. The origin O is the center of the grating.
A. Spectrograph Description and Geometry
We consider the spectrograph shown in Fig. 1, on which is imposed a Cartesian coordinate system whose origin O is at the grating center. We refer the position vectors of points to this origin. The positive x axis is coincident with the grating surface normal at its center, and the y and z axes lie in the horizontal and vertical planes, respectively. The x-y plane is the plane of symmetry, and it is called the principal plane. Light from source A is dispersed by the grating to form a spectrum on the detector whose ends are at B and B', the images of the shortest and longest wavelengths $\lambda_S$ and $\lambda_L$, respectively. The distance between the ends of the spectrum to be imaged (between points B and B') is the detector length w. The position vector of source point A is of length r, and it makes an angle $\alpha$ with the grating normal. The position vectors of image points B and B' are of length $r'_S$ and $r'_L$, respectively, and they make angles $\beta_S$ and $\beta_L$, respectively, with the grating normal. We invoke a convention of signed angles in which points in the +x +y quadrant of the principal plane make positive angles with the grating normal, and those in the +x -y quadrant make negative angles. Thus $\alpha$ is negative for this example, and $\beta_S$ and $\beta_L$ are positive.
B. Monochromator Description and Geometry
We employ the same coordinate system and signed angle convention as introduced for the spectrograph, but we replace the detector with an exit slit B to which only one wavelength band images at a time (see Fig. 2). The position vector length to source point A is again r, but that to image point B is now simply r'. The angle between these two vectors is constant for all wavelengths, i.e.,
$$\alpha - \beta = 2K, \tag{1}$$
where K is the half-deviation angle. Unlike the spectrograph, for which $\beta$ varies with $\lambda$ but $\alpha$ does not, both $\alpha$ and $\beta$ vary with $\lambda$ for the monochromator. Moreover, for spectrographs with linear detectors, the focal distance r' depends on $\lambda$, but r' is constant for monochromators. Both spectrometer types have fixed values of the source distance r.
C. Grating Construction Geometry
To create a holographic grating, we utilize a concave blank whose surface is covered in photoresist and illuminate it with two beams of coherent light from two pinholes. As mentioned previously, the interference intensity pattern at the blank will be recorded in the form of sinusoidal grooves. Figure 3 shows the construction geometry for such a grating. In practice, the monochromatic light from points C and D is from the same laser, requiring a beam splitter. The position vectors of these sources, relative to our coordinate system origin, are $r_C$ and $r_D$, respectively, and they make the (signed) angles $\gamma$ and $\delta$, respectively, with the grating normal. It is shown by Noda et al. [19] that these angles determine the groove spacing at the center of the grating, and that the restriction $\delta > \gamma$ is necessary, but we may otherwise place the construction sources C and D anywhere in the +x half of the principal plane. When the light from these sources is collimated (or equivalently $r_C = r_D = \infty$), and the angles $\gamma$ and $\delta$ are of equal magnitude and opposite sign, the grooves formed on the projection of the blank onto the plane tangent to it at its center are straight and equally spaced, so that this configuration of construction sources produces the classical grating by holographic means. In general, however, the distances $r_C$ and $r_D$ are finite, and the angles do not have equal absolute value; in such cases the grooves formed on the tangent plane are neither straight nor equally spaced throughout the entire grating surface. It is found that curved grating grooves allow for additional degrees of freedom in the grating system design, and we shall address ourselves to this general case of which the classical grating is a special case.
Fig. 2. Monochromator geometry. The source point $A(\alpha, r)$ and the ideal image point $B(\beta, r')$ lie in the principal plane. The origin O is the center of the grating and the point about which it rotates.
Fig. 3. Construction source geometry. The recording source points $C(\gamma, r_C)$ and $D(\delta, r_D)$ lie in the principal plane. At the center of the grating the grooves are parallel to the z axis.
III. Optimization Routine
An ideal spectrometer is one in which the rays from a point source of polychromatic light are diffracted so that different wavelengths are imaged to different point foci. The deviations of the actual diffracted rays from a suitably defined ideal location in a predetermined image plane are, therefore, a logical choice in constructing a quantitative expression of the merit of the optical system. A geometric ray trace of the system provides these ray deviations exactly, but the time required for an optimization procedure based on such a process quickly becomes prohibitive as the optical system becomes more complex and as parameters to be varied are added. We, therefore, follow most authors and use the wavefront aberration theory developed by Beutler [10] and expanded by Noda et al. [19]. The approach involves an infinite series, and through its truncation we reduce computation time but introduce inaccuracies relative to the exact ray trace. In spite of this approximation, we show that the wavefront aberration theory is more than adequate as a basis on which to develop an optical optimization procedure.
Fig. 4. Diffracted wavefronts. The locus of equal optical path along the rays diffracted from grating points O and P is not generally coincident with the reference wavefront, which is spherical and centered at the image point B.
A. Wavefront Aberration Theory
We consider Fig. 4, in which light from a source point A is to be diffracted by the grating points O and P so that the light of a given wavelength is imaged to the ideal focus B. We take O to be the center of the grating surface, while P is an arbitrary point on that surface. The diffracted rays ideally coincide at B; in other words, the diffracted wavefronts form spheres whose centers lie at point B. We need not require the optical paths AOB and APB to be identical, however, for in defining wavefronts as surfaces of constant phase [20], we can append a phase term to APB which is equal to an integral multiple of the diffracted wavelength. As given by Noda et al. [19], we see that
$$AOB = APB + Nm\lambda \tag{2}$$
is a suitable condition for perfect imaging, where N is the groove number along the y axis and m is the spectral order. That is, if the optical paths AOB and APB differ by an integral number of wavelengths, a new wavefront can be constructed which is spherical and centered at B. Actual optical systems rarely provide perfect imaging, and in general the diffracted wavefront is degraded from a sphere, resulting in the introduction of aberrations to the image. A measure of this degradation, which can be thought of as a discrepancy in the agreement of the left and right sides of Eq. (2), is called the optical path difference:
$$\Delta F = APB - AOB + Nm\lambda. \tag{3}$$
For perfect images, $\Delta F = 0$, but in general this does not occur. Our optimization procedure, then, is charged with minimizing the absolute magnitude of $\Delta F$. In accordance with Beutler and Namioka, we expand the optical path difference $\Delta F$ and the x coordinate of the grating point in terms of the pupil coordinates to obtain the following infinite power series:
$$\Delta F = \sum_i \sum_j F(ij)\, y^i z^j; \qquad x = \sum_i \sum_j a(ij)\, y^i z^j. \tag{4}$$
Here we have simplified our analysis with respect to that of Namioka by restricting our source and ideal image points to lie in the x-y plane. The explicit forms of the F(ij) coefficients we have derived match those of Namioka to the fourth order [19], and for completeness we have derived the fifth-order coefficients as well. Like Namioka, we have separated the holographic contributions to these terms from those due to a classically ruled grating via
$$F(ij) = M(ij) + \frac{\lambda}{\lambda_0} H(ij), \tag{5}$$
where $\lambda_0$ is the wavelength of the laser light used in constructing the holographic grating. The components of the M(ij) coefficient due to the source point A are given in Table I; the image point contribution can be found by replacing $\alpha$ and r with $\beta$ and r', and the holographic terms H(ij) are identical in form to the corresponding M(ij) term, except that $\alpha$, r, $\beta$, and r' are replaced with $\gamma$, $r_C$, $\delta$, and $r_D$, and the algebraic sign of terms involving $\delta$ and $r_D$ is changed.
Table I. Source Components of Power Series Terms, Second to Fifth Orders
$$
\begin{aligned}
M(20) &= -a(20)\cos\alpha + \tfrac{1}{2}\cos^2\alpha\; r^{-1},\\
M(02) &= -a(02)\cos\alpha + \tfrac{1}{2}\, r^{-1},\\
M(30) &= -a(30)\cos\alpha - a(20)\sin\alpha\cos\alpha\; r^{-1} + \tfrac{1}{2}\sin\alpha\cos^2\alpha\; r^{-2},\\
M(12) &= -a(12)\cos\alpha - a(02)\sin\alpha\cos\alpha\; r^{-1} + \tfrac{1}{2}\sin\alpha\; r^{-2},\\
M(40) &= -a(40)\cos\alpha + \left[\tfrac{1}{2}a(20)^2\sin^2\alpha - a(30)\sin\alpha\cos\alpha\right] r^{-1} + \tfrac{1}{2}a(20)\cos\alpha\,(1 - 3\sin^2\alpha)\; r^{-2}\\
&\quad + \left(\tfrac{3}{4}\sin^2\alpha - \tfrac{5}{8}\sin^4\alpha - \tfrac{1}{8}\right) r^{-3},\\
M(22) &= -a(22)\cos\alpha + \left[a(20)a(02)\sin^2\alpha - a(12)\sin\alpha\cos\alpha\right] r^{-1}\\
&\quad + \left[\tfrac{1}{2}\bigl(a(20) + a(02)\bigr)\cos\alpha - \tfrac{3}{2}a(02)\sin^2\alpha\cos\alpha\right] r^{-2} + \left(\tfrac{3}{4}\sin^2\alpha - \tfrac{1}{4}\right) r^{-3},\\
M(04) &= -a(04)\cos\alpha + \tfrac{1}{2}a(02)^2\sin^2\alpha\; r^{-1} + \tfrac{1}{2}a(02)\cos\alpha\; r^{-2} - \tfrac{1}{8}\, r^{-3},\\
M(50) &= -a(50)\cos\alpha + \left[a(20)a(30)\sin^2\alpha - a(40)\sin\alpha\cos\alpha\right] r^{-1}\\
&\quad + \left[\tfrac{1}{2}a(20)^2\sin\alpha\,(\sin^2\alpha - 2\cos^2\alpha) + \tfrac{1}{2}a(30)\cos\alpha\,(\cos^2\alpha - 2\sin^2\alpha)\right] r^{-2}\\
&\quad + a(20)\sin\alpha\cos\alpha\left(\tfrac{3}{2}\cos^2\alpha - \sin^2\alpha\right) r^{-3} - \left(\tfrac{3}{8}\sin\alpha - \tfrac{5}{4}\sin^3\alpha + \tfrac{7}{8}\sin^5\alpha\right) r^{-4},\\
M(32) &= -a(32)\cos\alpha + \left\{\left[a(20)a(12) + a(30)a(02)\right]\sin^2\alpha - a(22)\sin\alpha\cos\alpha\right\} r^{-1}\\
&\quad + \left[a(20)a(02)\sin\alpha\,(\sin^2\alpha - 2\cos^2\alpha) + \tfrac{1}{2}a(12)\cos\alpha\,(\cos^2\alpha - 2\sin^2\alpha) + \tfrac{1}{2}a(30)\cos\alpha\right] r^{-2}\\
&\quad + \left[\tfrac{3}{2}a(20)\sin\alpha\cos\alpha + a(02)\sin\alpha\cos\alpha\left(\tfrac{3}{2}\cos^2\alpha - \sin^2\alpha\right)\right] r^{-3} + \left(\tfrac{5}{4}\sin^3\alpha - \tfrac{3}{4}\sin\alpha\right) r^{-4},\\
M(14) &= -a(14)\cos\alpha + \left[a(02)a(12)\sin^2\alpha - a(04)\sin\alpha\cos\alpha\right] r^{-1}\\
&\quad + \left[\tfrac{1}{2}a(02)^2\sin\alpha\,(\sin^2\alpha - 2\cos^2\alpha) + \tfrac{1}{2}a(12)\cos\alpha\right] r^{-2} + \tfrac{3}{2}a(02)\sin\alpha\cos\alpha\; r^{-3} - \tfrac{3}{8}\sin\alpha\; r^{-4}.
\end{aligned}
$$
The straight-grooved and holographic terms, M(ij) and H(ij), are found via M + M' and H + H', respectively. M' is found from M by replacing $\alpha$ and r with $\beta$ and r'; H and H' are found from M and M' by replacing $\alpha$, $\beta$, r, and r' with $\gamma$, $\delta$, $r_C$, and $r_D$, and the algebraic sign of the H' terms is changed. The zeroth-order term M(00) vanishes when the source and image are at A and B, respectively, and the zeroth-order term H(00) vanishes when the two recording sources are at C and D. The first-order terms M(10) and H(10) vanish when the grating equation is satisfied.
It can be seen that the zeroth-order term in Eqs. (4) vanishes, since its contribution from APB is simply the distance from A to O plus the distance from O to B (= AOB), and this term is subtracted to yield zero; this expansion is then seen to be one about the grating center (coordinate origin), and an ideal focus is one for which all higher-order coefficients vanish as well. An application of Fermat's principle,
$$\frac{\partial \Delta F}{\partial y} = 0, \tag{6}$$
produces the grating equation
$$m\lambda = d(\sin\alpha + \sin\beta). \tag{7}$$
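The grating equation, Eq. (7), together with the constant-deviation condition of Eq. (1), fixes the diffraction geometry once the groove spacing is known. The following is a minimal sketch in Python, assuming angles in radians under the signed-angle convention of Sec. II.A; the function names and the numerical values are illustrative assumptions and are not taken from the design program described here.

```python
import math

def diffraction_angle(wavelength, d, alpha, m=1):
    """Spectrograph case: solve m*lambda = d*(sin(alpha) + sin(beta)) for beta.

    wavelength and d must share the same length unit; alpha is fixed.
    """
    s = m * wavelength / d - math.sin(alpha)
    if abs(s) > 1.0:
        raise ValueError("no propagating diffracted order for these inputs")
    return math.asin(s)

def monochromator_angles(wavelength, d, K, m=1):
    """Monochromator case: alpha - beta = 2K is held constant.

    Uses sin(alpha) + sin(beta) = 2*sin(phi)*cos(K) with phi = (alpha + beta)/2,
    so phi follows directly from the grating equation.
    """
    s = m * wavelength / (2.0 * d * math.cos(K))
    if abs(s) > 1.0:
        raise ValueError("wavelength cannot be reached at this deviation angle")
    phi = math.asin(s)
    return phi + K, phi - K   # (alpha, beta)

# Hypothetical example: 1200 grooves/mm, wavelengths in nm.
d_nm = 1.0e6 / 1200.0
beta = diffraction_angle(500.0, d_nm, math.radians(-10.0))
alpha_m, beta_m = monochromator_angles(500.0, d_nm, math.radians(35.0))
```

In the monochromator case the grating rotation angle is the only quantity that changes with wavelength, which is why a single arcsine suffices.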
Our task then is to minimize or eliminate the higher-order coefficients, starting with those of second order, F(20) (primary defocus) and F(02) (primary astigmatism). We found the relationship between the partial derivatives of $\Delta F$ with respect to y and z and the horizontal and vertical ray deviations in the image plane to be, respectively,
$$\delta y = \frac{r' - y\sin\beta - x\cos\beta}{r'\cos\beta - x}\left[(r' - y\sin\beta - x\cos\beta)\,\frac{\partial \Delta F}{\partial y} - z\sin\beta\,\frac{\partial \Delta F}{\partial z}\right], \tag{8}$$
$$\delta z = (r' - y\sin\beta - x\cos\beta)\,\frac{\partial \Delta F}{\partial z}. \tag{9}$$
These expressions are essentially those of Chrisp [21] except that they take the sag of the grating x(y,z) into account in their derivation; test cases show that for common grating sizes and curvatures, the dependence of these deviations on the sag is negligible. We may now more precisely define the object of our optimization routine as minimizing the absolute magnitude of these ray deviations over the wavelength range of interest. Whereas minimizing the power series terms in order of increasing exponential order is certainly not incorrect, we understand that it is possible to minimize the ray deviations through aberration balancing, in which nonzero F(ij) terms combine to produce negligible values of $\delta y$ and $\delta z$.
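As a check on the second-order coefficients, the sketch below evaluates F(20) and F(02) for a classical (straight-grooved) grating on a spherical blank, for which the holographic terms cancel. It assumes a(20) = a(02) = 1/(2R) for a sphere of radius R, and it uses the second-order source terms M(20) = -a(20) cos(alpha) + cos^2(alpha)/(2r) and M(02) = -a(02) cos(alpha) + 1/(2r), with the image terms obtained by replacing alpha and r with beta and r'. The function names and numerical values are illustrative assumptions.

```python
import math

def second_order_terms(angle, dist, a20, a02):
    """(20) and (02) contributions of one point (source or image)."""
    c = math.cos(angle)
    m20 = -a20 * c + 0.5 * c * c / dist
    m02 = -a02 * c + 0.5 / dist
    return m20, m02

def classical_f20_f02(alpha, r, beta, r_img, radius):
    """F(20) and F(02) for a straight-grooved grating on a spherical blank."""
    a = 1.0 / (2.0 * radius)   # assumption: spherical blank, a(20) = a(02)
    s20, s02 = second_order_terms(alpha, r, a, a)
    i20, i02 = second_order_terms(beta, r_img, a, a)
    return s20 + i20, s02 + i02

# Hypothetical Rowland-circle mounting: r = R cos(alpha), r' = R cos(beta).
R = 1000.0
alpha, beta = math.radians(-10.0), math.radians(25.0)
f20, f02 = classical_f20_f02(alpha, R * math.cos(alpha),
                             beta, R * math.cos(beta), R)
```

With source and image both on the Rowland circle, F(20) vanishes to rounding error while F(02) stays positive, which is the familiar astigmatism of the Rowland mounting.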
Except in a few special cases, not every aberration term up to and including those of fifth order can be considered simultaneously in minimizing Eqs. (8) and (9). The images considered in this wavefront aberration theory approach are only approximate, and it falls to a geometric ray trace to determine the accuracy of the aberration theory spot diagrams. In general, even in cases in which the discarded power series terms are of significant magnitude, minimization of $\delta y$ and $\delta z$ by considering second- and third-order terms is more than adequate in producing optimal designs of aberration-reduced grating systems.
B. Simplex Optimization Procedure
It is helpful to consider an optimization procedure as being equivalent to a curve-fitting algorithm in which the curve to be matched is, say, g = 0. The parameters of optimization are then altered in a systematic way until a set of parameter values is obtained which best approximates g = 0 over the range of the independent variable(s). We chose the downhill Simplex method of Nelder and Mead [22] over more well-known procedures (such as the method of steepest descent, for example) due to its operational simplicity as well as the advantage that only function evaluations (and not derivative evaluations) are required. The Simplex method involves the definition of a merit function which is used to qualitatively compare sets of parameters. For our purposes, we shall consider the better of two parameter sets as corresponding to the lower of two merits. Our merit function then will be positive definite (and ideally zero). The independent optimization parameters are used with the merit function to create a vector space of dimension one greater than the number of parameters. The extra dimension allows the merit function to form a hypersurface whose global minimum we seek. This is done by creating a Simplex polyhedron which rolls along this hypersurface until it reaches a minimum.
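The downhill Simplex search can be sketched in a few lines. What follows is a compact textbook form of the Nelder-Mead method written for illustration, not the program used in this work; the function name, the coefficients (reflection 1, expansion 2, contraction 1/2, shrink 1/2), the stopping tolerances, and the toy merit function are all assumptions.

```python
def nelder_mead(merit, seed, step=0.5, xtol=1e-8, ftol=1e-12, max_iter=2000):
    """Minimize merit(point) starting from a seed vertex (list of floats)."""
    n = len(seed)
    # Initial polyhedron: the seed plus one vertex displaced along each axis.
    simplex = [list(seed)]
    for i in range(n):
        vertex = list(seed)
        vertex[i] += step
        simplex.append(vertex)
    values = [merit(v) for v in simplex]

    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        best, worst = simplex[0], simplex[-1]
        size = max(abs(v[i] - best[i]) for v in simplex[1:] for i in range(n))
        if size < xtol and values[-1] - values[0] < ftol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_ref = merit(reflected)
        if f_ref < values[0]:
            expanded = along(2.0)
            f_exp = merit(expanded)
            if f_exp < f_ref:
                simplex[-1], values[-1] = expanded, f_exp
            else:
                simplex[-1], values[-1] = reflected, f_ref
        elif f_ref < values[-2]:
            simplex[-1], values[-1] = reflected, f_ref
        else:
            # Outside contraction if the reflection helped a little, else inside.
            contracted = along(0.5 if f_ref < values[-1] else -0.5)
            f_con = merit(contracted)
            if f_con < min(f_ref, values[-1]):
                simplex[-1], values[-1] = contracted, f_con
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [[b + 0.5 * (x - b) for x, b in zip(v, best)]
                                    for v in simplex[1:]]
                values = [values[0]] + [merit(v) for v in simplex[1:]]

    k = min(range(n + 1), key=lambda j: values[j])
    return simplex[k], values[k]

def toy_merit(p):
    # Non-negative and zero at the ideal point (1, -2); purely illustrative.
    return (p[0] - 1.0) ** 2 + 0.25 * (p[1] + 2.0) ** 2

point, value = nelder_mead(toy_merit, [0.0, 0.0])
```

Only merit evaluations are needed, which is the property cited above as the reason for preferring this method to gradient-based procedures.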
C. Merit Function
As previously stated, the ray deviation expressions, Eqs. (8) and (9), are suitable for use in a merit function, for an ideal optical system is one in which these deviations vanish for all grating point coordinates and at all wavelengths. In general, however, this will not happen. Thus we consider the sum of their absolute values as constituting a merit function:
$$W = \Delta y + f\,\Delta z. \tag{10}$$
Here $\Delta y$ and $\Delta z$ are the absolute magnitudes of the greatest values of $\delta y$ and $\delta z$ for all rays at all wavelengths (the maximum full width and full height of all images, respectively), and we call f the astigmatism factor. The presence of this factor, which is always non-negative, allows the designer to adjust how significant the vertical deviation is relative to horizontal deviation (resolution). Since the acceptable bandpass (horizontal deviation) is often much less than the acceptable vertical deviation, this factor is typically less than unity. Most of our optimizations involve astigmatism factors between one-tenth and one-quarter. The quantities $\delta y$ and $\delta z$ from Eqs. (8) and (9) are in real distance units (such as millimeters), but since our objective is to minimize the spectral bandpass of a monochromator or the resolution of a spectrograph, we convert the horizontal deviation to wavelength units via the reciprocal linear dispersion
$$\frac{\partial \lambda}{\partial l} = \frac{d\cos\beta}{m r'}, \tag{11}$$
where $(r', \beta)$ are the polar coordinates of the image of the given wavelength $\lambda$, and l is the real distance moved along the spectrum. Note that this may deliberately introduce variations in image size over wavelength for the sake of making the bandpass or resolution uniform. We do not feel that this negatively biases the optimization; in fact, we believe it to be an advantage. The maximum vertical deviation $\Delta z$ must also be (somewhat artificially) converted to wavelength units to add in the contribution to the merit function [Eq. (10)].
D. Variables of Optimization
In the optimization of a spectrograph, the positions of the source and detector can be varied to reduce aberrations in the image. The source coordinates $\alpha$ and r completely define the in-plane position of the source point A, but it is not so clear that there exist only two degrees of freedom in specifying the detector position. We choose $r'_S$, the length of the position vector of the image of the shortest wavelength, and the orientation of the detector relative to this position vector, called $\theta$, as optimization variables as well (see Fig. 5).
Fig. 5. Diffracted ray geometry. The position vectors $r'_L$ and $r'_S$ locate the ideal image points of the longest and shortest wavelengths, respectively, and w is their vector difference. $\theta$ is the angle measured from the extension of $r'_S$ to w.
In contrast to most authors, we do not choose the groove spacing d as an optimization variable or fixed parameter but allow it to be calculated from the angular distribution of the spectrum. Given $r'_S$ and $\theta$, then, we must specify the detector length to completely determine the angular separation of the images of the ends of the spectrum range. The geometry of use of the spectrograph is now completely determined. For the monochromator, however, we find that the restriction that the angular separation between entrance and exit slits be constant for all wavelengths removes one degree of freedom; consequently, only three variables of optimization are obtained from the mounting geometry. We consider here the entrance and exit slit position vector lengths, r and r', respectively, and the half-deviation angle K as the monochromator optical system variables.
Although the construction source coordinates $(r_C, \gamma)$ and $(r_D, \delta)$ might suggest themselves as potential variables of optimization, their effect on image quality is manifest in the H(ij) coefficients, and from a design perspective it is these coefficients that are more easily understood to influence the image, for each one is identified with a particular wavefront aberration. It is for this reason that we break with convention and choose the ten H(ij) coefficients as variables of optimization. Although this leads to slight complications later, we feel that the design process is more interactive when these coefficients are specified rather than the recording source point coordinates. It should be noted that only three of these H(ij) coefficients can be used in determination of the recording source coordinates, for the three corresponding equations and the groove frequency relation mentioned above form a system of four simultaneous nonlinear equations which utilize the four degrees of freedom the placement of these sources offers.
The Simplex optimization procedure requires a seed vertex about which it constructs an initial polyhedron according to well-defined rules. The designer must restrict himself to realistic initial values of those optimization variables he wishes to vary; for example, our in-plane analysis does not allow the entrance and exit slits of a monochromator to be coincident. Our optimization program permits the designer to choose the initial variable values or to invoke a routine which systematically chooses points in a region of the Simplex space, tests their individual merit values, and displays them in order of increasing merit so that one with a low merit may be chosen, thereby giving the Simplex procedure somewhat of a head start.
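The way the detector geometry fixes the angular separation of the spectrum ends, from which the groove spacing then follows, can be sketched as below. The sketch assumes that the detector orientation angle (written theta here) is measured counterclockwise from the extension of the position vector of B, assumes a positive spectral order, and uses plain bisection rather than a Newton-Raphson or regula falsi solver; all names and numerical values are hypothetical.

```python
import math

def angular_separation(r_s, theta, w):
    """Angle subtended at the grating center O by the two detector ends."""
    # In a frame whose first axis points from O to B, B = (r_s, 0) and
    # B' = (r_s + w*cos(theta), w*sin(theta)).
    return math.atan2(w * math.sin(theta), r_s + w * math.cos(theta))

def _asin(x):
    # Clamp so that rounding at the edge of the domain cannot raise.
    return math.asin(max(-1.0, min(1.0, x)))

def groove_spacing(alpha, lam_short, lam_long, separation, m=1):
    """Find d such that beta(lam_long) - beta(lam_short) equals separation."""
    if separation <= 0.0 or m <= 0:
        raise ValueError("sketch assumes m > 0 and a positive separation")
    sin_a = math.sin(alpha)

    def mismatch(d):
        beta_l = _asin(m * lam_long / d - sin_a)
        beta_s = _asin(m * lam_short / d - sin_a)
        return beta_l - beta_s - separation

    lo = m * lam_long / (1.0 + sin_a)   # smallest d: long wavelength at grazing exit
    if mismatch(lo) < 0.0:
        raise ValueError("separation is larger than the grating can provide")
    hi = 2.0 * lo
    while mismatch(hi) > 0.0:           # widen until the root is bracketed
        hi *= 2.0
    for _ in range(200):                # bisection on the bracketed root
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical spectrograph: lengths in mm, wavelengths and d in nm.
sep = angular_separation(200.0, math.radians(90.0), 88.0)
d_nm = groove_spacing(math.radians(-10.0), 200.0, 800.0, sep)
```

Because rotating the detector rigidly about O does not change the angle it subtends, the separation depends only on the three detector quantities, and the bisection relies on the angular spread being monotonic in d over the bracketed interval.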
E. Initialization Parameters
In addition to the variables of optimization, there exist parameters which we have chosen not to vary during the numerical design process. These include the spectral order m, the radius of the concave blank, the detector length (for spectrographs), and the dimensions of the projection of the grating onto the y-z plane. In many applications the spectral order is chosen to be either 1 or -1, and the grating radii, detector length, and grating dimensions are determined by existing hardware specifications and engineering constraints. The optical system design requires a careful choice between what to vary and what to hold constant to make the problem tractable. The effects of these parameters are often investigated by repeated iterations of the design method.
An initial value of the groove spacing d is also required, since by the grating equation we wish the diffraction angle $\beta$ to be the computed quantity. For spectrographs, it is evident that the detector length and groove spacing cannot be simultaneously specified, for the monotonicity of the angular spread of the spectrum as a function of d provides for a unique value of d for each set of values of $r'_S$, $\theta$, and w. For monochromators, the reciprocal linear dispersion, Eq. (11), may be specified instead of the groove spacing, or vice versa, but both cannot be invariant. Thus the groove spacing can be held constant for monochromators (but it need not be), and it cannot be held constant for spectrographs. For cases in which d is a calculated quantity, the nonlinear expression involving the angular spread of the spectrum (for spectrographs) or the dispersion (for monochromators) is solved via the Newton-Raphson or regula falsi methods for each iteration of the optimization.
F. Determination of the Recording Coordinates
In choosing the holographic coefficients as variables of optimization rather than the coordinates of the recording point sources, we must accept the possibility that a given set of H(ij) values, found via optimization, may not correspond to a set of recording coordinates. We interpret a failure of a method of solving systems of nonlinear equations in finding recording coordinates to indicate that the groove pattern which corresponds to the H(ij) coefficients in question cannot be constructed with simple point sources but perhaps requires additional focusing optics. We employ the Simplex routine again, this time to determine the recording coordinates, but since many holographic optimizations involve specification of the two second-order coefficients H(20) and H(02) as well as the coma term H(30), a method of algebraic back substitution for this case has been coded as well. It is found that