J Pharm Innov (2013) 8:1–10 DOI 10.1007/s12247-012-9141-y
RESEARCH ARTICLE
Latent Variables-Based Process Modeling of a Continuous Hydrogenation Reaction in API Synthesis of Small Molecules
Zhenqi Shi & Nikolay Zaborenko & David E. Reed
Published online: 6 January 2013 © Springer Science+Business Media New York 2013
Abstract
Introduction Continuous manufacturing can benefit from the use of the Quality by Design (QbD) strategy for robust process development and from the use of Process Analytical Technology (PAT) for real-time process monitoring and control. A successful implementation of QbD and PAT for continuous processes relies on a robust and information-rich process model as a basis for process understanding, monitoring, and control. Compared to first principles and other empirical models, a latent variables-based process model is capable of decomposing multidimensional process data into a few orthogonal latent variables and of providing accessible process understanding/visualization and control within the latent variable space. This study is an extension of our group’s earlier effort (Liu et al., J Pharm Innov 6:170–180, 2011) to explore the utility of latent variables-based process modeling in pharmaceutical manufacturing processes.
Methods The case presented here is the first application of latent variables-based modeling to a reaction process in small-molecule active pharmaceutical ingredient route synthesis, i.e., a continuous-flow hydrogenation. A particular reactor configuration and operation was used in this proof-of-concept study.
Results It was found that time-variant profiles of pressure in the flow tube reactor served as an effective indicator of gas–liquid interaction within the reactor, thus determining process outcomes, i.e., the extent of reaction and enantiomeric excess (ee), given the importance of process set points. In addition, a design space of process parameters predicted to produce optimal outcomes, i.e., extent of reaction greater than 98 % and ee higher than 93 %, was established in order to provide a flexible operation space for performing the reaction with desired process outcomes.
Conclusions The capabilities of latent variables-based process modeling have been demonstrated as applied to a continuous-flow hydrogenation reaction, with respect to both improved process understanding and the potential for process optimization and control. Future efforts will focus on further understanding the capabilities and limitations of such a methodology in a fully automated control scheme for continuous-flow reactions.
Keywords Continuous manufacturing · Latent-variable modeling · Hydrogenation · Design space

Z. Shi : D. E. Reed (*) ASR&D, Lilly Research Laboratories, Indianapolis, IN 46285, USA e-mail: [email protected]
N. Zaborenko CPR&D, Lilly Research Laboratories, Indianapolis, IN 46285, USA
Introduction
Hydrogenation is a commonly used reaction in bulk active pharmaceutical ingredient (API) synthesis of small molecules to reduce or saturate organic compounds, typically using molecular hydrogen as a reductant in the presence of a catalyst. Hydrogenations used in the pharmaceutical industry are often complicated processes, with end-product quality variables dependent on a variety of process parameters. They are typically performed batchwise at pressures of 10 to 100 psig, with elevated pressures increasing operational difficulty and raising process safety issues due to the large molar amounts of hazardous hydrogen gas present in a batch vessel headspace during commercial autoclave hydrogenation. Alternatively, the use of continuous manufacturing to perform flow hydrogenation in tubular reactors provides a safer and more efficient method of hydrogenation, from both R&D and manufacturing perspectives. Performing hydrogenations in a continuous-flow tubular reactor allows for extended processing without the need to depressurize, to empty and clean the reactor, or to refill it
with reaction mixture, as would be done for a batch. Thus, a much smaller in volume (often by a factor of 3 to 5) tube reactor is necessary to generate the same throughput. Additionally, it is possible to successfully operate a hydrogenation tube reactor as much as 97 % liquid filled. The high liquid fill ratio, combined with the smaller reactor for the same throughput, results in a typical factor of 40 reduction in the volume of hydrogen present in the reaction system at any given time as compared to a batch with the same production rate. This translates into significant safety advantages of continuous-flow processing for hydrogenations. With continuous processing attracting attention in the pharmaceutical industry, the application of QbD and PAT in continuous processing is becoming a heated topic. Considering the nature of continuous processing, QbD and PAT should be considered as integral parts of the process for two reasons. First, PAT is capable of providing real-time monitoring of the process by identifying potential process deviations, with QbD allowing for identification of potential causal factors. Second, they also allow for either feedback closed-loop control to perform any necessary midcourse correction, to maintain the process within a previously determined design space to consistently produce expected outcomes or feed-forward to divert a section of the flow stream away from forward processing, either to waste or for reprocessing [1]. To fulfill these functionalities, process modeling is a requisite part of the QbD and PAT implementation in the industry. The process model here refers to a typically multivariate and mathematical relationship between quality attributes and a variety of independent process parameters. 
These variables could be as simple as temperature and pressure readings of a reaction and process set points such as flow rate and catalyst loading ratio, or as complicated as spectrometer readings or model-predicted concentration of a specific species in the reaction. A successful process model could be considered as the ultimate basis for the process understanding, monitoring, and control. Process modeling is often used to provide understanding to ultimately enable process monitoring and control. Depending on the model use, process monitoring and control may have different requirements for the dataset appropriate for process models. For the case of defining an optimal set of starting process parameters, the original process parameters used to generate historical or design of experiment (DOE) datasets may be more relevant than the real-time process data. From an online process monitoring perspective, real-time process data are often considered to be a more direct representation of process signature than are the starting process parameters. If an in-process correction is needed, this often requires a process model based on both process parameters and real-time process data, since the purpose of the model is to enable adjustments to the process parameters based on process signature as indicated by the real-time process data.
As with any other modeling approaches, process modeling has two main categories: first principles-based (hard) models and empirical (soft) models. First principles-based models often utilize fundamental relations, such as differential equations of reaction kinetics, to derive analytical solutions, such as reaction rate constants, and thus gain an underlying understanding of the reaction. The advantage of hard models is that the understanding is direct and straightforward without further interpretation, while the disadvantage is that the analytical solutions for realistic scenarios could be complicated at best (if an analytical solution even exists). In comparison, empirical modeling utilizes available historic datasets to capture the underlying relations between quality attributes and independent process variables. Although the soft model does require experts to interpret the model, its advantages, such as eliminating complicated derivation of an analytical solution, especially for complex scenarios, deserve attention. Additionally, a soft model is able to include outputs from a hard model, while the reverse may not be true. Latent variables-based process modeling is a type of empirical modeling approach, relying on available historic datasets or ones generated by a DOE to capture the underlying relationship between quality attributes and independent process variables. The main advantage of this type of modeling is its capability of integrating multidimensional process variables and reducing the original dataset into a few independent latent variables. Each latent variable is a weighted summation of individual independent process variables. This modeling approach not only allows the multidimensional process understanding to be translated to a few latent variables’ space but also provides easy visualization of the mathematically complex nonlinear relationship between quality attributes and process parameters in the reduced latent variables’ space. 
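As a notational sketch of the weighted summation just described (the symbols follow common PLS notation and are assumptions here, not taken from the article): for one observation with J autoscaled process variables x_j, the score on latent variable a and the resulting prediction of a scaled quality attribute can be written as

```latex
t_a = \sum_{j=1}^{J} w^{*}_{ja}\, x_j, \qquad a = 1, \dots, A,
\qquad
\hat{y} = \sum_{a=1}^{A} q_a\, t_a
```

Here w*_{ja} is the weight of variable j on latent variable a, q_a is the loading of the quality attribute on that latent variable, and the number of latent variables A is much smaller than J.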
Early reported studies of process modeling of bulk API synthesis of small molecules primarily focused on modeling of a few independent variables based on the nature of the reaction or on the reduction of the number of independent variables due to prior knowledge [2–6]. Therefore, the visualization and control of these reported design spaces is often simplified to a few independent variables, which is difficult to extend to processes associated with a large number of them. The latent variables-based process modeling has been commonly used in other industries, such as petroleum, chemical, and food [7–9]. In the pharmaceutical industry, only a few studies have been published previously using this type of process modeling on dry product manufacturing [10–12]. To the best of the authors’ knowledge, no study has yet been reported using latent variables-based process modeling of the reaction of bulk synthesis of small molecules for pharmaceutical applications. Therefore, the goal of this study was to extend our group’s efforts [10] to continue the
exploration of the capability of the latent variables-based modeling approach to the reaction process of bulk API synthesis of small molecules. The case study presented here was a homogeneously catalyzed enantioselective hydrogenation in a small research-scale continuous-flow tubular reactor with gas–liquid flow. The ratio of volumetric gas flow rate to liquid flow rate was controlled by the frequency of backend pressure cycling of the system, and gas was introduced at the inlet through either an always-open or an intermittently open (pulsed) connection to the gas source. The process parameters that varied in this study included pulse frequency, temperature, pressure cycle time, reaction residence time, number of pressure cycles per residence time, desired ratio of gas to liquid in the reactor, volume of liquid per pressure cycle, and substrate and catalyst solution flow rates. Only four of these process parameters are mutually independent; these were selected to be pulse frequency, temperature, reaction residence time, and desired ratio of gas to liquid in the reactor. The remaining five process parameters are functions of the four independent ones and of several process and reactor parameters that were kept constant throughout this study, such as substrate-to-catalyst ratio, reactor volume, etc. In each experiment, pressure was monitored online at four different locations in the system (Fig. 1). Therefore, the real-time pressure profiles throughout the process were considered as a separate independent variable. The quality attributes for this specific hydrogenation process were the extent of reaction and the enantiomeric excess (ee).
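For readers less familiar with the second quality attribute, the conventional definition of enantiomeric excess can be written as a one-line calculation. The article does not restate this definition, so the function below is a generic sketch with hypothetical argument names rather than the authors' analytical method.

```python
def enantiomeric_excess(major, minor):
    """Return ee in percent from the amounts (or peak areas) of the two enantiomers.

    Conventional definition: ee = 100 * (major - minor) / (major + minor).
    The argument names are illustrative, not from the article.
    """
    total = major + minor
    if total <= 0:
        raise ValueError("total amount of enantiomers must be positive")
    return 100.0 * (major - minor) / total
```

For example, a 97.5:2.5 mixture of enantiomers corresponds to 95 % ee.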
Fig. 1 Cartoon of the reactor setup, illustrating the main automated and manual valves and the gas-flow regulating back-end setup
Considering the potentially complicated and multidimensional relationship between the quality attributes and the collection of process parameters, this study focused on answering two questions critical to implementing such process models for a continuous hydrogenation. The first question is whether the real-time pressure profiles can serve as a predictor of the final quality attributes, given all other process parameters. The second question, based on the dataset collected to date involving ranges of variation of those process parameters, is whether it is possible to identify an optimal set of starting process parameters that generates an ideal process, with outcomes close to 100 % extent of reaction and close to 95 % ee.
Materials and Methods
The Reaction Dataset
The reaction dataset used here for process modeling was drawn from a historical dataset involving variations in the aforementioned process parameters. The following modeling activities were conducted retrospectively. The range of each process parameter, except for the real-time pressure profile, and of each quality attribute is listed in Table 1. In total, 48 experiments were conducted, with each experiment corresponding to a set of process parameters for the continuous hydrogenation reaction and a set of resultant quality attributes.
Table 1 The range of each process parameter and quality attribute in the historical dataset

Parameter / attribute                          Min      Max      Mean       Standard deviation
Pulse frequency^a                              1        20       9.54       7.46
Temperature (°C)^a                             70       90       73.33      7.53
Cycle time (s)                                 544      4,353    2,687.60   924.00
Cycle number per residence time                6.09     6.62     6.35       0.25
Desired ratio of gas/liquid in the reactor^a   0.99     1.80     1.36       0.40
Desired residence time (h)^a                   1.00     8.00     4.75       1.71
Volume of liquid per cycle (mL)                1.79     3.27     2.68       0.53
Substrate pump rate (mL/min)                   0.026    0.21     0.049      0.043
Catalyst pump rate (mL/min)                    0.0039   0.031    0.0073     0.0065
Extent of reaction (%)                         43.50    99.60    88.49      12.60
Enantiomeric excess (ee)                       94.00    95.60    94.94      0.35

^a The independent process parameters. Others were a function of independent ones and other constant reaction factors

The reactor used in these experiments was a 316-L stainless-steel tube (1/8″ OD, 0.055″ ID, 61.6 ft long) coiled inside a hot-water bath heater (sufficient for the 70–90 °C temperature used in this study). High-pressure syringe pumps (100DX, Teledyne Isco, Lincoln, NE) were used to provide the reagent and catalyst solution flows at the requisite pressures (set point of 1,000 psi). Hydrogen gas was supplied from a regulated compressed gas cylinder, with hydrogen flow rate controlled by periodic removal of a known aliquot of gas from a backend collection vessel at reaction pressure (see Fig. 1 for the system schematic). At idealized operating conditions (i.e., constant and invariant supply of gas to the reactor) and with the given inner diameter of the tube and slow linear flow velocities, the gas–liquid flow profile would be expected to be stratified [13], which would result in fairly poor gas–liquid mass transfer. However, the practical operation created pressure fluctuations in the system due to the periodic removal of gas, each such cycle nearly instantaneously dropping the pressure in the collection vessel (and thus the reactor) by ∼2 %. The rate at which the gas was then resupplied to the reactor and the change in that rate over the course of the cycle were determined by the resistance to gas flow between gas supply and the reactor. Because of the small scale of these experiments and the high operating pressure, it was difficult to sufficiently restrict the gas flow to generate an even pressure rise from start to end of each pressure cycle. Often, this resulted in a fast pressure increase in the reactor, followed by a plateau, thus indicating that a large surge of gas occurred immediately following the aliquot removal, which was succeeded by little to no gas flow for the remainder of the pressure cycle before the next removal.
During such surges, the small inner diameter of the reactor results in high surface effects, i.e., the liquid surface tension would prohibit gas from moving with a greater local linear velocity than the liquid, thus resulting in segmented flow. If the gas and liquid
segments are very long relative to the tube diameter (ratio >3), gas–liquid mass transfer becomes very limited. For these reasons, a pulsating gas feed was used in some experiments to improve mass transfer.
Data Pretreatment
In order to integrate both the process parameters and the real-time pressure profile for process modeling, the following data pretreatments were conducted before any modeling activities. First, because the overall gas flow rate was controlled by the frequency of backend pressure cycling of the system relative to liquid pump flow rate, several pressure cycles occurred within the span of a residence time, as can be seen from the example shown in Fig. 2. Due to the range of process parameters, the number of cycles per residence time (i.e., the ratio between reaction residence time and cycle time) in this historical dataset varied from 6.09 to 6.62. For consistency, pressure profiles across seven cycles were chosen as the modeling inputs for all experiments. In other words, the seven complete pressure cycles preceding the time point at which a sample was collected for determination of the two quality attributes were used to represent the pressure profile for that experimental sample. This ensured that each data sample included the pressure profile across at least one reaction residence time prior to sample acquisition. Second, different sets of process parameters resulted in pressure profiles with different shapes and durations, as well as intermediate pressure data points between cycles that did not serve to represent the reaction (Fig. 2, left). In order to integrate all pressure profiles with their respective process parameters for any process modeling, alignment of the pressure profiles was necessary to ensure the same time interval per pressure cycle/residence time from experiment to experiment, with removal of the intercycle pressure points.
Therefore, those intermediate pressure points were removed before linear interpolation was performed for each residence
Fig. 2 Two example pressure profiles from sensor A before (left) and after (right) data pretreatment. Filled circles represent an experiment with pulsed gas feed. Open squares represent an experiment with continuous gas feed
time corresponding to the aforementioned seven pressure cycles. The interpolation index was considered as the extent of reaction completion per residence time. To linearly interpolate the pressure profiles, 1,000 data points per pressure cycle, or 7,000 data points per residence time, were used. An example of a pressure profile after alignment is shown in Fig. 2 (right). The original shape of the pressure profile was well maintained after the removal of intermediate pressure points and the alignment. The only difference is that the time scale per cycle from experiment to experiment was realigned. Because of the rescaling of the time span per pressure cycle (i.e., per residence time), the cycle time and the residence time for each experiment were intentionally included in the process parameters to account for the effect of the duration of each cycle/residence time.
Process Modeling on Pressure Profiles
Two modeling exercises were conducted in order to investigate the potential capability of the real-time pressure profiles to act as an indicator of vapor–liquid interaction and as a predictor for quality attributes as well. First, both pressure profiles at four sensor locations and four independent process parameters were combined to build a model correlating with the two quality attributes. The four pressure sensor locations were at the gas source preceding the regulator (sensor D), following the gas regulator for the system (sensor A), and immediately prior to (sensor B) and following (sensor C) the tubular reactor (Fig. 1). As this model used both data sources, it is referred to as the global model throughout the rest of the manuscript. Second, only the pressure profiles at four sensor locations were used to build a model to correlate with the quality attributes. As this model only used pressure profiles, it is referred to throughout the rest of the manuscript as the pressure model.
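The cycle alignment described under Data Pretreatment can be sketched in a few lines. This is a minimal illustration rather than the authors' in-house Matlab routine: the function names `resample_cycle` and `align_profile` and the (times, pressures) list layout are assumptions, and timestamps within a cycle are assumed to be strictly increasing. Only the 1,000 points per cycle and the seven cycles per sample come from the text.

```python
def resample_cycle(times, pressures, n_points=1000):
    """Linearly interpolate one pressure cycle onto n_points evenly spaced
    positions between its first and last timestamps."""
    if len(times) != len(pressures) or len(times) < 2:
        raise ValueError("need at least two (time, pressure) samples")
    t0, t1 = times[0], times[-1]
    resampled = []
    j = 0  # index of the sample just left of the current position
    for i in range(n_points):
        t = t0 + (t1 - t0) * i / (n_points - 1)
        # Advance the bracket until times[j] <= t <= times[j + 1]
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        span = times[j + 1] - times[j]
        frac = 0.0 if span == 0 else (t - times[j]) / span
        resampled.append(pressures[j] + frac * (pressures[j + 1] - pressures[j]))
    return resampled


def align_profile(cycles, n_points=1000):
    """Concatenate resampled cycles so every experiment has the same length,
    e.g. 7 cycles x 1,000 points = 7,000 points per sample."""
    aligned = []
    for times, pressures in cycles:
        aligned.extend(resample_cycle(times, pressures, n_points))
    return aligned
```

Because each cycle is stretched or compressed onto the same index, the cycle duration itself is lost, which is why the text keeps cycle time and residence time among the process parameters.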
Because of the different nature of the data between pressure profiles and process parameters, multiblock partial least
squares (MB-PLS) was used for both models by allocating process parameters and pressure profiles from each sensor into individual blocks. Autoscaling was used to pretreat each variable, followed by weighting each block by $1/\sqrt{k}$, where k is the number of variables per block. The comparison between these two models with respect to the capability to predict the final quality attributes, together with the block importance plot (BIP), was used to investigate whether the real-time pressure profiles carry any weight in predicting the two quality attributes.
Simulated Process Optimization
In order to identify an optimal set of process parameters for performing the hydrogenation reaction, all of the experiments in the historical dataset except for the two with the highest measured extent of reaction and ee were used as a calibration dataset to build a model correlating with the final quality attributes. The remaining two experiments served as the validation dataset. In addition, only the four linearly independent varying process parameters were used in this model. Because the four process parameters were the only data block used in this model, a common PLS model was used with autoscale pretreatment on each independent variable. After the model was built, the design space (i.e., the optimal operation space) was defined in the latent variable score space using iterative searches for those pairs of latent variable scores with predicted extent of reaction higher than 98 % and predicted ee higher than 93 %. Two approaches were used to validate this optimal operation space. First, the process parameters used in the validation dataset were input to the model to generate the predicted results. Second, a nonlinear optimization routine [10] was implemented within the latent variable score space defined earlier by the model to search for an optimal set of process parameters with predicted extent of reaction close to 100 % and predicted ee close to 95 %.
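The block pretreatment described for the MB-PLS models (autoscale each variable, then weight each block by 1/√k) can be illustrated as follows. This is a hedged sketch, not the ProMV implementation: the function names and the column-wise list layout are assumptions, and the sample standard deviation is assumed for autoscaling.

```python
import math
import statistics


def autoscale(column):
    """Mean-center one variable and scale it to unit (sample) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.stdev(column)
    if std == 0:
        return [0.0 for _ in column]  # a constant variable carries no information
    return [(x - mean) / std for x in column]


def scale_block(block):
    """Autoscale every variable in a block, then weight the block by 1/sqrt(k).

    `block` is a list of columns (one list of observations per variable);
    k is the number of variables in the block.
    """
    k = len(block)
    weight = 1.0 / math.sqrt(k)
    return [[weight * value for value in autoscale(column)] for column in block]
```

With this weighting, the summed variance of every block equals one, so a pressure block with 7,000 variables does not outweigh a process-parameter block with four variables simply because it is larger.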
Software and Data Processing
The MB-PLS models were built using ProMV (version 9.14, ProSensus Inc, Ancaster, ON). A series of Matlab (R2010a, The Mathworks, Natick, MA) routines was written in-house with the PLS_Toolbox 6.2 (Eigenvector Research, Inc., Manson, WA) to support this study.
Results and Discussions
Process Modeling of Pressure Profiles
MB-PLS is a type of PLS approach capable of decomposing multivariate data by assigning appropriate weights to individual variables according to the data source. Compared to the commonly used PLS, a significant advantage of MB-PLS is its capability of interpreting the individual importance of every block of independent variables in modeling the dependent variables [10, 14, 15]. This advantage is used here to interpret the potential ability of the pressure profiles to predict the final quality attributes, given the importance of process parameters. For the process model using both process parameters and pressure profiles (i.e., the global model), six principal components (PCs) were chosen as the optimal number based on the convergence between the calibration correlation coefficient (R2) and the cross-validation correlation coefficient (Q2; data not shown). The resultant correlation plots for both the extent of reaction and the ee are shown in Fig. 3a and b. Although the linearity of prediction on both quality attributes was demonstrated, certain data points were observed to deviate from the unity line. A wider variation was observed on the x-axis (observed quality attributes) than on the y-axis (predicted quality attributes), indicating the model’s inability to perform a
resolved prediction from sample to sample when those samples differed relatively slightly in the two quality attributes, especially in the ee. In addition, the BIP was shown in Fig. 3c. As can be seen, the process parameters had significant impact on the two quality attributes, showing a greater contribution than the pressure profiles to predict the two quality attributes. However, in the specific reactor configuration evaluated in this work, the pressure profiles should not be neglected, as their correlation with the quality attributes, while only half of that of the process parameters, was still significant. The process model using only pressure profiles (i.e., pressure model) was found to require up to eight PCs to reach a reasonable convergence between R2 and Q2 (data not shown). Each PC can be considered as a mathematical function of the pressure data across four sensors. The fact that the majority of the samples show good adherence to the unity line in correlation plots (Fig. 4a and b) indicates greatly improved model performance compared to that of the global model (Fig. 3) for the following reasons. First, the global model was dominated by the process parameters (Fig. 3c), requiring only six PCs to reach a convergence between R2 and Q2, but neglecting the potential correlation between the pressure profiles and the modeled reaction outcome. On the other hand, the pressure model, which applied only the pressure data as its input, required up to eight PCs, with the higher number of PCs being used to better capture the variations in the pressure profiles in order to correlate with the two quality attributes. Second, due to the dominance of the process parameters in the global model, the model relies only on those process parameters to predict the quality attributes. In contrast, the pressure model was able to use the dynamic pressure profiles to capture minor variations from experiment to experiment, which were unobservable in the process parameters. This could
Fig. 3 Model statistics for the global model incorporating both the independent process parameters and the pressure profiles: correlation plots for the extent of reaction (a) and the ee (b), and the BIP plot (c). The R2 for the extent of reaction and the ee were 0.8209 and 0.8349, respectively
Fig. 4 Model statistics for the pressure model incorporating only the pressure profiles: correlation plots for the extent of reaction (a) and the ee (b), and the BIP plot (c). The R2 for the extent of reaction and the ee were 0.9261 and 0.9275, respectively
be the primary reason that the majority of samples in the pressure model showed a much more resolved prediction than in the global model, differentiating between samples having similar extents of reaction and ee. Therefore, the pressure profile was found to have a strong correlation and predict the two quality attributes in this experimental setup. It must be noted here that these observations and correlations, especially pertaining to the correlation between the real-time pressure profiles and the quality attributes, are valid only for the specific reactor configuration and range of operating parameters used for this study. It is known from the chemistry development that chemical reaction does not require pressure oscillations; thus, the pressure profiles are not necessary for high conversion and ee. The pressure profiles reported here were a consequence of the valving system used to feed hydrogen into the continuous-flow reactor. If the valve restriction were reduced, then the pressure profile would not have a strong correlation with the two quality attributes. Because of the inability of this reactor system to permit gas to move past liquid, the pulse frequency was introduced as an extra process parameter in order to facilitate the gas–liquid mixing in the tubular reactor. For a different reactor configuration where 98 % of the reactor volume consists of large-inner-diameter tubing with bottom-to-top flow, gas will be able to freely move past the liquid, resulting in excellent gas–liquid mass transfer regardless of the gas introduction method or consequent pressure profiles. Conversely, the reactor used in this study may be operated such that only liquid would be able to flow through despite gas being available, resulting in the pressure cycles exhibiting perfect linearity despite no reaction occurring at all. Therefore, although it is fair to say that the predictive capability of the real-time pressure profile is
specific to the reactor design described herein and operated in a specific regime, in such a system, the pressure data can nevertheless provide pivotal process monitoring capability to assure the real-time process signature within the design space. The block importance plot for this pressure model was also investigated and shown in Fig. 4c. Without the dominant influence of the process parameters, it was possible to investigate more clearly the detailed contribution from individual pressure profiles. As can be seen, pressure profiles from sensor A carried the information most correlated with the two quality attributes, followed by sensors B, C, and D. This importance order was confirmed by the experts to be valid, considering the following two reasons. First, the pulsing of hydrogen gas flow was created after the location of sensor D and before the location of sensor A (Fig. 1). Therefore, the pressure profile from sensor A was the best representation of the pulsed gas flow. Considering the importance of pulsed pressure cycling in enabling sufficient gas–liquid interaction to facilitate mass transfer, thus allowing for ideal extent of reaction and ee at given process parameters, the dominant contribution by the pressure profile measured by sensor A is to be expected. Second, the pressure profile from sensor D was only reading the pressure of the gas storage tank, which does not affect the reaction, as the reactor pressure is regulated downstream from sensor D. The tank pressure was monitored solely to allow for monitoring the hydrogen supply to ensure sufficient gas for the system and did not result in any process signature inside the reactor. However, even if there were no pressure pulses, the same conversion and ee could be achieved if an appropriately sized valve existed and was used to continuously meter hydrogen gas into the flow tube reactor. Therefore, the pressure profiles are related to this particular experimental
Fig. 5 Results of the process model using four independent process parameters: correlation plots for the extent of reaction (a) and the ee ratio (b), and the W*C plot (c). The R2 for the correlation plots were 0.6135 and 0.6437 for the extent of reaction and the ee ratio, respectively
setup only and may not be significant for the outcomes of the actual hydrogenation reaction process using other reactor designs.
Simulated Process Optimization
The purpose of this simulated process optimization was to identify an optimal set of process parameters in order to produce ideal process outcomes. Thus, real-time pressure profiles were not used; instead, the four independent process parameters in the historical dataset were used to generate a PLS model to correlate with the two quality attributes. The model required up to two PCs to reach a convergence between R2 and Q2 (data not shown). The resultant correlation plots for both quality attributes are shown in Fig. 5a and b.
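The R2 and Q2 statistics used to choose the number of PCs can be illustrated with a small sketch. The article does not state which cross-validation scheme was used, so leave-one-out is an assumption here, and a single-predictor least-squares line (`fit_line`) stands in for the PLS fit purely to keep the example short; all function names are hypothetical.

```python
def r_squared(observed, predicted):
    """1 - SS_residual / SS_total, with SS_total taken about the observed mean."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares y = a + b*x; a stand-in for a PLS calibration."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return lambda xi: a + b * xi


def q_squared_loo(x, y, fit=fit_line):
    """Leave-one-out Q2: every sample is predicted by a model fitted without it."""
    held_out = []
    for i in range(len(x)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        held_out.append(model(x[i]))
    return r_squared(y, held_out)
```

In this kind of check, components are typically added while Q2 keeps rising alongside R2; a widening gap between the two suggests that additional components are fitting noise.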
Fig. 6 Score plot of the first vs. the second PC in the process model using four independent process parameters from the historical calibration dataset (filled circles), in which the simulated design space (filled triangles), the empirical validation (filled diamond), and the optimized process parameter (filled square) are demonstrated
In addition, the W*C plot was used to enhance understanding of the interrelationship among process parameters and quality attributes (Fig. 5c). First, the fact that the two quality attributes are distributed in two different quadrants indicates an inverse relationship between the effects of process parameters on extent of reaction and the effects of process parameters on ee. It also illustrates the necessity of using certain optimization approaches to identify a set of process parameters in order to achieve an ideal balance between the two quality attributes. Second, the residence time was found in the same quadrant and very close to the extent of reaction. This indicates a very strong and positive relationship between the residence time and the extent of reaction, which is to be expected for any irreversible reaction (such as a hydrogenation), with a longer residence time leading to higher conversion. Third, the fact that temperature was located in the
Fig. 7 Contour images of scores for the first and second PC in the process model using four independent process parameters, in which the simulated design space (asterisks), the empirical validation (filled diamond), and the optimized process parameter (filled square) are shown
opposite quadrant from the ee indicates an inverse relationship between the two. Such a relationship is dependent on the relative reaction kinetics of formation of the two product enantiomers. Kinetic data necessary for full first principles understanding have not been gathered, but this relationship has been experimentally observed under multiple scenarios in this reaction. Thus, this demonstrates the capability of the latent variables-based empirical process model to provide supplementary process understanding to that achieved by first principles approaches, potentially reducing the number of experiments required to determine optimal operating conditions.

In an earlier paper published by our group [10], an optimization routine was implemented to identify the optimal set of process parameters for a tablet compression process, a unit operation commonly used in the manufacturing of solid dosage forms. Although an optimal set of process parameters is necessary to launch a process, operating at a single optimal set point is less desirable than determining an optimal operating region (i.e., design space), considering the potential regulatory flexibility to operate within a design space. Therefore, an optimal operating region was identified here for this continuous hydrogenation reaction, while the optimal set of process parameters identified by an optimization routine was used later as a validation of the design space. A simulation routine was used to search the score space for the optimal operating region, in which the predicted extent of reaction is greater than 98 % and the predicted ee is higher than 93 %. The identified design space relative to the historical data points in the score space is shown in Fig. 6. The corresponding design spaces for the predicted extent of reaction and the ee are shown in Fig. 7a and b. Due to process development limitations, the design space was not experimentally verified.
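The search of the score space for a region meeting both thresholds can be sketched as a simple grid scan. The sketch below is hypothetical: the linear relations in `predict_outcomes`, the score range, and the grid resolution are invented for illustration and are not the fitted model of this study; only the two thresholds (extent of reaction above 98 % and ee above 93 %) come from the text.

```python
def predict_outcomes(t1, t2):
    """Hypothetical map from scores (t1, t2) to predicted outcomes in percent.

    The coefficients are invented. t1 raises the extent of reaction while
    lowering ee, mimicking the trade-off between the two quality attributes.
    """
    extent = 97.0 + 0.7 * t1 + 0.3 * t2
    ee = 92.0 - 0.4 * t1 + 0.9 * t2
    return extent, ee


def search_design_space(t_range=(-3.0, 3.0), steps=25,
                        min_extent=98.0, min_ee=93.0):
    """Scan a square grid in score space and keep points meeting both limits."""
    lo, hi = t_range
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    region = []
    for t1 in grid:
        for t2 in grid:
            extent, ee = predict_outcomes(t1, t2)
            if extent > min_extent and ee > min_ee:
                region.append((t1, t2))
    return region


region = search_design_space()
print(f"{len(region)} of {25 * 25} grid points satisfy both thresholds")
```

Such a scan is purely computational and only marks where the model predicts acceptable outcomes; it does not replace experimental verification of the region.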
However, the following two validation activities were conducted in order to demonstrate the validity of such an
optimal operation space. First, the two empirical experiments in the validation dataset with the highest measured extent of reaction and ee were projected into the score space. The scores corresponding to those two validation experiments were found to be within the previously identified design space. Second, the optimal set of process parameters identified through the optimization routine was also projected onto the score space, indicating that the optimized set of process parameters was also within the design space. The ranges of the process parameters within the simulated design space, the process parameters identified by the optimization routine, and the process parameters for the empirical experiments are listed in Table 2. As can be seen, the ranges of process parameters defined by the design space encompassed the process parameters identified by the optimization routine and the empirical process parameters. Therefore, both validation approaches confirmed the validity of this design space.

Table 2 The optimal process parameters

Parameter                                     Simulated design space   Optimized design point   Empirical validation point
Pulse frequency                               0∼25.82                  3.76                     14
Temperature (°C)                              57.60∼86.99              66.86                    70
Desired ratio of gas/liquid in the reactor    0.64∼2.13                1.66                     1.01
Desired residence time (h)                    6.19∼8.63                6.51                     8.00

Compared to the previously reported design spaces for API synthesis of small molecules based on several independent process parameters, the design space defined here in the principal component space has the following advantages. First, because each principal component score is a weight-averaged contribution of individual process parameters to the quality attributes, the design space in latent variable space is expected to provide a more comprehensive approach to describe the underlying multivariate relationship between the entirety of the process parameters and the quality attributes. Second, the latent-variable model-based design space decomposes a system of many process parameters to enable process visualization based upon only a few orthogonal principal components, which is expected to facilitate process understanding, monitoring, and control.
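The comparison summarized in Table 2 can be reproduced with a short check that each reported point lies inside the parameter ranges of the simulated design space. The values below are copied from Table 2; the dictionary keys and the helper `within` are names introduced here for illustration. Note that this is only a range check in the original parameter space, whereas the design space itself was identified in the score space.

```python
# Parameter ranges of the simulated design space (values from Table 2).
design_space = {
    "pulse_frequency": (0.0, 25.82),
    "temperature_C": (57.60, 86.99),
    "gas_liquid_ratio": (0.64, 2.13),
    "residence_time_h": (6.19, 8.63),
}

# Points reported in Table 2.
optimized_point = {
    "pulse_frequency": 3.76,
    "temperature_C": 66.86,
    "gas_liquid_ratio": 1.66,
    "residence_time_h": 6.51,
}
empirical_point = {
    "pulse_frequency": 14.0,
    "temperature_C": 70.0,
    "gas_liquid_ratio": 1.01,
    "residence_time_h": 8.00,
}


def within(point, ranges):
    """True if every parameter of the point lies inside its (low, high) range."""
    return all(low <= point[name] <= high
               for name, (low, high) in ranges.items())


print("optimized point inside ranges:", within(optimized_point, design_space))
print("empirical point inside ranges:", within(empirical_point, design_space))
```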
Conclusions

To the best of the authors’ knowledge, this study is the first to apply latent variables-based process modeling to the synthesis of a small molecule. By harnessing unique data sources for different intended purposes, the capabilities of latent variables-based process modeling have been demonstrated as applied to a continuous-flow enantioselective hydrogenation reaction. In the small-scale tubular reactor operating in the segmented flow or stratified flow regime, real-time reaction pressure profiles alone were found to be capable of predicting the outcomes of a hydrogenation reaction at a given set of process parameters. Although the pressure profiles are known not to be important to the chemical reaction itself, being only an outcome of the size of the control valves on this research-scale continuous reactor, the capability of chemometric modeling to capture such a correlation cannot be overlooked. Meanwhile, physical understanding of the process is also critical to enable accurate interpretation of model predictions and indicators. Moreover, the optimal process parameters and the related design space identified within the latent variable space also illustrate the benefit of process modeling for enhancing process understanding and optimizing process operation. A better understanding of the capabilities and limitations of this methodology will be necessary for the continued development of fully automated control schemes for continuous-flow reactions.

Acknowledgments The authors would like to acknowledge Dr. Martin D Johnson, Dr. Scott A May, and Dr. Michael E Kopach for their indispensable contributions and constructive suggestions on the reactor setup, which were very helpful in preparing this manuscript.
References

1. MacGregor JF, Bruwer M. A framework for the development of design and control spaces. J Pharm Innov. 2008;3:15–22.
2. Seibert KD, Sethuraman S, Mitchell JD, Griffiths KL, McGarvery B. The use of routine process capability for the determination of process parameter criticality in small-molecule API synthesis. J Pharm Innov. 2008;3:105–12.
3. Burt JL, Braem AD, Ramirez A, Mudryk B, Rossano L, Tummala S. Model-guided design space development for a drug substance manufacturing process. J Pharm Innov. 2011;6:181–92.
4. Ende D, Bronk KS, Mustakis J, O’Connor G, Santa Maria CL, Nosal R, Watson TJN. API quality by design example from the torcetrapib manufacturing process. J Pharm Innov. 2007;2:71–86.
5. Hallow DM, Mudryk BM, Braem AD, Tabora JE, Lyngberg OK, Bergum JS, Rossano LT, Tummala S. An example of utilizing mechanistic and empirical modeling in quality by design. J Pharm Innov. 2010;5:193–203.
6. Castagnoli C, Yahyah M, Cimarosti Z, Peterson JJ. Application of quality by design principles for the definition of a robust crystallization process for casopitant mesylate. Org Process Res Dev. 2010;14:1407–19.
7. MacGregor JF, Kourti T. Statistical process control of multivariate processes. Control Eng Pract. 1995;3:403–14.
8. Kourti T, MacGregor JF. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemom Intell Lab Syst. 1995;28:3–21.
9. Kourti T, Lee J, MacGregor JF. Experiences with industrial applications of projection methods for multivariate statistical process control. Comput Chem Eng. 1996;22(Suppl):S745–50.
10. Liu Z, Bruwer M, MacGregor JF, Rathore SSS, Reed DE, Champagne MJ. Modeling and optimization of a tablet manufacturing line. J Pharm Innov. 2011;6:170–80.
11. Muteki K, Swaminathan V, Sekulic SS, Reid GL. De-risking pharmaceutical tablet manufacture through process understanding, latent variable modeling, and optimization technologies. AAPS PharmSciTech. 2011;12:1324–34.
12. Garcia-Munoz S, Dolph S, Ward II HW. Handling uncertainty in the establishment of a design space for the manufacture of a pharmaceutical product. Comput Chem Eng. 2010;34:1098–107.
13. Perry RH, Green DW. Perry’s chemical engineers’ handbook. 7th ed. New York: McGraw-Hill; 1997. p. 6–26.
14. Westerhuis JA, Coenegracht PMJ. Multivariate modelling of the pharmaceutical two-step process of wet granulation and tableting with multiblock partial least squares. J Chemom. 1997;11:379–92.
15. Westerhuis JA, Kourti T, MacGregor JF. Analysis of multiblock and hierarchical PCA and PLS models. J Chemom. 1998;12:301–21.