A Fast Parameterized Model for Predicting PV System Performance under Partial Shade Conditions Bennet Meyers, Mark Mikofski, Mike Anderson SunPower Corporation, Performance Engineering, Richmond, CA, 94804 Abstract — Accurately modeling the performance of partially shaded photovoltaic systems is well-known to be a difficult problem. Power loss is not only nonlinear with shade coverage, but also has a strong dependence on system configuration and location of the shade on a system. This paper presents a parameterized shade loss model (called the “Fast Shade Model” or FSM) that allows for the calculation of system-level power loss based on three input parameters. This model was developed through the statistical analysis of hundreds of thousands of shade scenarios modeled with a cell-level, 2-diode model. Model validation was performed using real systems under shaded conditions. Index Terms — photovoltaic systems, modeling, shading, solar energy
I. INTRODUCTION
Understanding the performance of inverter-tied photovoltaic (PV) systems under partial shade conditions is critical to accurate energy prediction. Shading is known to cause a significant, nonlinear reduction in system power due to current and voltage mismatch effects[1]. In real systems, completely avoiding shade is often neither possible nor economically beneficial: on a space-constrained rooftop, there may be an incentive to maximize PV coverage on the roof, even into non-optimal irradiance locations. To enable optimal designs, PV installers and owners need information about system performance if they build into areas that experience shade throughout the year.
In this paper, we describe the full complexity of the partial shade performance prediction problem, present a method to reduce the dimensionality of the problem, and use the reduced factors to create a nonlinear performance model. Finally, we show this model's advantages in terms of accuracy and performance (speed) over existing methods of predicting system performance under shade conditions.
Understanding the behavior of a PV system subjected to shade or other irradiance variation is aided by the use of mathematical models that describe the current and voltage mismatch within the electrical architecture. SunPower has implemented a 5-parameter 2-diode cell model in Python based on previous work by Bishop[2], King et al.[3], and De Soto[4]. This model is a simplified version of De Soto's model, identical except that the dependence of Rshunt on irradiance was not implemented. This model will be referred to as the simplified five parameter model (“S5P”) throughout the paper. While this approach provides highly accurate results for precise shade patterns, it is computationally intensive for large systems and optimization studies. SunPower developed a parameterized model (the “Fast Shade Model” or FSM) to predict the instantaneous power loss due to
partial shade conditions, which can be easily integrated with our existing PV system performance simulation engine, called SimEng. A simplified, parameterized model of partial shade performance is significantly faster to evaluate than a cell-level model (S5P) of an entire array, allowing for quicker iterations of system design.
In this paper, we describe the methodology for creating a data set for power loss under shade conditions and the logic behind how the scenarios were selected. In addition, a justification for the dimensionality reduction of the problem to 3 independent variables is given. We then present how this data was used to train an artificial neural network (ANN) model, and how we selected the particular model used. Finally, we compare both the FSM and S5P predictions to actual data from a shaded system and compare the results to other common approaches for estimating system power loss due to shade.
We show that system performance under shaded conditions can be predicted to within acceptable accuracy using just three geometric parameters describing the interaction of shade and a PV system: the ratio of system area shaded to total system area (Rshade), the ratio of diode-protected cell strings impacted by shade to the total number of cell strings in the system (Rcs), and the ratio of module strings impacted by shade to the total number of parallel module strings in the system (Rms).
II. SIMULATION OF SHADE SCENARIOS
S5P is a “2-diode” model, implemented in Python, that represents the current and voltage response of a PV cell. Through its API, users are able to specify a system that is a collection of cells, organized into diode-protected substrings, modules, and strings of modules. The parameters for a SunPower cell in an E-series module were derived by fitting forward and reverse bias cell IV curve data, measured at SunPower. Irradiance and temperature values are set for each cell in the system.
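The paper does not list the S5P equations, so the following is only a minimal sketch of what a 2-diode cell model evaluates. It assumes the textbook two-diode equation with fixed ideality factors of 1 and 2, leaving five parameters (photocurrent, two saturation currents, series and shunt resistance). The function name and all parameter values are illustrative assumptions, not SunPower's fitted E-series values, and the reverse-bias breakdown term from Bishop's work is omitted.

```python
import math


def cell_current(v, i_ph=6.0, i_01=1e-10, i_02=1e-6,
                 r_s=0.005, r_sh=10.0, v_t=0.025693):
    """Solve the implicit two-diode equation for cell current (A) at voltage v (V).

    All default parameter values are illustrative assumptions.
    v_t is the thermal voltage kT/q near 25 degrees C.
    Intended only for cell-level voltages from 0 V to about 0.7 V.
    """
    def residual(i):
        v_j = v + i * r_s  # junction voltage including series-resistance drop
        return (i_ph
                - i_01 * (math.exp(v_j / v_t) - 1.0)          # diode 1, ideality 1
                - i_02 * (math.exp(v_j / (2.0 * v_t)) - 1.0)  # diode 2, ideality 2
                - v_j / r_sh                                   # shunt leakage
                - i)

    # residual() is strictly decreasing in i, so bisection on a bracket works.
    lo, hi = -100.0, i_ph + abs(v) / r_sh + 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


i_sc = cell_current(0.0)  # short-circuit current, close to i_ph
```

A system-level solver would combine many such cell curves in series and parallel, with bypass diodes around each cell string, which is where the computational cost described above comes from.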
The input to this model, therefore, is a vector with N elements, where N equals the total number of cells in the system. This is represented in code as a sparse dictionary. To model realistic shade geometries, a set of tools based on Shapely[5] was created to enable the user to define the geometries of a PV system and shade scenario. The tools then calculate the effective irradiance on each cell in the system based on the shade geometry to generate the input vector to S5P, assuming a reference irradiance of 1 sun (1000 W/m²), a “fully shaded” irradiance of 0.05 suns, and a linear relationship between percent shade coverage of a cell and effective irradiance, such that
$$E_{cell} = (E_{shade} - 1) \cdot A + E_{ref} \qquad (1)$$
where
Ecell = effective irradiance on a cell (suns)
Eshade = effective irradiance on a fully shaded cell = 0.05 suns
Eref = reference irradiance = 1 sun (1000 W/m²)
A = fractional area of a cell covered by shade (ratio)
Methods were created to automate the creation of certain basic shade types: rectangles, pipes with round caps, circles, and random polygons with 3–5 vertices. The goal was to explore a wide variety of possible shade geometries. (Additional discussion regarding data set domain space selection is provided in Section III.) By automating the creation of shade geometries, these basic shade objects were programmatically moved, expanded, and rotated to iterate over a wide variety of different shade scenarios. Then, these shade geometries were applied to 3 different system configurations: 3 strings of 10 modules, 10 strings of 10 modules, and 1 string of 8 modules. For each system configuration, the shade geometries were evaluated with both portrait and landscape module orientation. While the shapes themselves are very basic, they cover a wide range of aspect ratios, orientations, positions, and intersections with the various PV system elements. In total, 338,960 unique system/shade scenarios were generated, which is the “master data set.” (This is then split randomly into training and validation data sets, as described in Section IV.) A small set of examples of the shade geometries explored is shown in Fig. 1.
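The linear mapping of Eq. (1) and the sparse input described above can be sketched as follows. This is a minimal illustration, not the SunPower tooling: the function names and the dictionary layout (cell index mapped to irradiance, with unshaded cells omitted) are assumptions, and the shaded fractions are supplied directly rather than computed from Shapely geometry.

```python
E_REF = 1.0     # reference irradiance in suns (1000 W/m^2)
E_SHADE = 0.05  # effective irradiance of a fully shaded cell in suns


def effective_irradiance(shaded_fraction, e_shade=E_SHADE, e_ref=E_REF):
    """Eq. (1) in suns: linear blend between full sun and fully shaded."""
    if not 0.0 <= shaded_fraction <= 1.0:
        raise ValueError("shaded_fraction must be between 0 and 1")
    # Identical to (e_shade - 1) * A + e_ref when e_ref is 1 sun.
    return (e_shade - e_ref) * shaded_fraction + e_ref


def sparse_irradiance(shaded_fractions):
    """Keep only cells that receive some shade (hypothetical sparse layout).

    Cells missing from the result are taken to be at the reference irradiance.
    """
    return {cell: effective_irradiance(a)
            for cell, a in shaded_fractions.items() if a > 0.0}


# Cell 0 fully shaded, cell 1 half shaded, cell 2 unshaded.
cells = sparse_irradiance({0: 1.0, 1: 0.5, 2: 0.0})
```

A sparse layout keeps the input small for typical scenarios, where most of the cells in the array see full sun.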
To illustrate the importance of portrait versus landscape module orientation on the data, consider the two scenarios shown in Fig. 2. Both systems have the same number of modules and the same total system area. A pipe-like shade is applied that stretches across 80% of the width of the system. Table I shows the Rshade and Rcs values as well as the power loss predicted by S5P for these two scenarios. The slight difference in Rshade is due to the fact that the same width was used for both shade scenarios, but the respective shade patterns have different lengths, resulting in different total shade areas. However, the bigger difference is in Rcs: because the diode-protected cell strings are aligned with the long axis of the module, the portrait case results in more cell strings being affected by shade than the landscape case, resulting in a higher power loss for the portrait case.

TABLE I
RSHADE, RCS AND POWER LOSS FOR SCENARIOS IN FIG. 2

| | Portrait | Landscape |
| --- | --- | --- |
| Rshade | 0.057 | 0.087 |
| Rcs | 0.792 | 0.458 |
| Power Loss | 66% | 39% |

Fig. 1. A small selection of the shade geometries that were modeled using S5P. Each shapely.geometry.Polygon object represents a separate shade scenario. Scenarios are plotted on top of each other with arbitrary transparency for illustrative purposes only.

Fig. 2. A comparison of similar shade applied to a system in portrait (top) and landscape (bottom). S5P predicts a power loss of 66% for the top case and 39% in the bottom case because of different diode activation.

III. PARAMETER SELECTION
The systems modeled in the master data set are all made up of 96-cell standard SunPower modules. Considering the 3-string, 10-module configuration, there are 3 × 10 × 96 = 2880 cells in the system, each of which can take a unique effective irradiance value. (For an explanation of effective irradiance, see King et al.[6].) So, each simulation for this system configuration can be thought of as a function that takes a 2880-dimensional vector and outputs a scalar value representing the power of the system, or $f: \mathbb{R}^{2880} \rightarrow \mathbb{R}$. This vector represents the irradiance on each cell and will be
Fig. 3. IV curves of various irradiance patterns. (Top) A random distribution of irradiance on all cells. (Middle) A typical pipe shade pattern. (Bottom) System at STC with no shade. The top plot represents a vector, cI, which is not part of the subspace that the model needs to describe. The middle plot shows a vector that should be included in the model, and the bottom is a trivial case.
designated cI. Importantly, there are many possible such vectors in $\mathbb{R}^{2880}$ that do not represent realistic partial shade scenarios. A randomly selected vector in this space would represent a scenario with a random amount of irradiance on each cell in the PV system. (For more information on this concept see Olah's description of the MNIST data[7].) This “typical” vector does not describe realistic shade and is not part of the domain space the model needs to describe. In Fig. 3, the output of an S5P simulation of a random irradiance vector is compared to a system with a simple pipe shade pattern and an unshaded system. The patterns of irradiance values that represent realistic partial shade scenarios occupy a much smaller subspace of $\mathbb{R}^{2880}$. In other words, partial shade scenarios actually occupy a lower-dimensional subspace of the higher-dimensional space defined by all the cells within the system[7]. This structure is what we seek to describe through dimensionality reduction.
The idea of dimensionality reduction is common within the machine learning community, and a number of fully developed tools exist to do this, such as principal component analysis (PCA) or t-distributed stochastic neighbor embedding (t-SNE)[7]. However, while these techniques provide useful insight into the lower-dimensional structure of the data, the reduced dimensions are not necessarily physically meaningful nor easily predictable. To create a usable parametric model for shade performance, parameters must be chosen which not only describe the variation in the higher-dimensional data, but are also physical values that can be measured or estimated.
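The contrast between a “typical” random vector and a realistic shade vector can be made concrete with a small sketch. It assumes the 3 × 10 × 96 configuration discussed above; the cell ordering, the choice of which cells are shaded, and the helper name are hypothetical and only serve to show how much more structured a physical shade pattern is.

```python
import random

N_CELLS = 3 * 10 * 96  # 3 strings x 10 modules x 96 cells = 2880
E_SHADE = 0.05         # fully shaded effective irradiance in suns

# A "typical" vector: an independent random irradiance on every cell.
rng = random.Random(0)
random_vector = [rng.uniform(E_SHADE, 1.0) for _ in range(N_CELLS)]

# A simplified shade vector: full sun everywhere except one block of cells.
# Treating indices 96..119 as physically adjacent is an assumption.
shade_vector = [1.0] * N_CELLS
for cell in range(96, 120):
    shade_vector[cell] = E_SHADE


def distinct_levels(vector, ndigits=3):
    """Count distinct irradiance levels after rounding."""
    return len({round(value, ndigits) for value in vector})


random_levels = distinct_levels(random_vector)  # hundreds of distinct levels
shade_levels = distinct_levels(shade_vector)    # two levels: sun and shade
```

Both vectors live in the same 2880-dimensional space, but only the second resembles something a nearby object could cast onto an array.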
Based on knowledge of shade behavior and PV system performance, some statements can be made about the nature of the cell irradiance vectors, cI, that represent unique shading conditions. First, under mostly sunny conditions, shading due to nearby objects blocks the beam irradiance, leaving the majority of the diffuse sky dome. This limits the domain of cI to vectors that describe full-sun irradiance on the unshaded portion of an array and diffuse-only irradiance on the shaded portion. In addition, the solutions to the full mismatch calculations are not particularly sensitive to the value chosen for the diffuse portion. As shown in Fig. 4, choosing diffuse fractions from 0% to 25% affects normalized power in a nominal shade configuration by only 0.02%. This insensitivity implies that the analysis will not change considerably if cI are considered with an Eshade of 0.05 versus 0.2 suns.
Another reduction of the cI domain is possible by observing that shade location within an array affects performance based on local topology (how the shade intersects different system elements) but not global topology (the first module vs. the last, and string 1 vs. string 3). As an example, Fig. 5 illustrates the effect of taking a single shade geometry and moving it through a single string of modules. There is a clear repeating pattern as the shade crosses different cell strings, with a significant drop in power when the shade interacts with three cell strings instead of two (the shade has a width of 3 cells and the cell strings in this example are two cells wide). Therefore, analyzing system performance with the shade centered at x=18 is the same as when x=6, so including the cI which represents both cases is redundant.
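The repeating pattern described for Fig. 5 can be reproduced with a one-dimensional counting sketch. It reduces the circular shade to its horizontal extent (3 cells wide) and assumes that cell-string boundaries fall at even x positions (strings two cells wide); the function name and that boundary placement are assumptions for illustration, and no power calculation is involved.

```python
import math


def cell_strings_touched(x_center, shade_width=3.0, string_width=2.0):
    """Count cell strings overlapped by a shade spanning x_center +/- width/2.

    Assumes cell-string boundaries at multiples of string_width.
    """
    left = x_center - shade_width / 2.0
    right = x_center + shade_width / 2.0
    # Index of the string containing the left edge, and one past the string
    # containing the right edge (an edge exactly on a boundary is not counted).
    return math.ceil(right / string_width) - math.floor(left / string_width)


counts = {x: cell_strings_touched(float(x)) for x in (5, 6, 17, 18)}
```

Under these assumptions a shade centered at x=6 or x=18 overlaps two cell strings, while one centered at x=5 or x=17 overlaps three. The count depends only on where the shade sits relative to the nearest boundaries, not on which module it is in.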
Three parameters were chosen based on the previous insights combined with trial and error:
1) Rshade: The ratio of the area of the system covered by shade to the total array area
2) Rcs: The ratio of diode-protected cell strings impacted by shade to the total number of cell strings in the system
3) Rms: The ratio of module strings impacted by shade to the total number of parallel module strings in the system
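Given per-cell shaded fractions, the three ratios defined above can be computed as in the following sketch. The nested-list layout and function name are hypothetical, a cell string or module string is counted as “impacted” if any of its cells has a nonzero shaded fraction (an assumption, since no threshold is stated), and Rshade is approximated by the mean cell shaded fraction, which assumes equal-area cells and ignores gaps between them.

```python
def shade_parameters(system):
    """Return (Rshade, Rcs, Rms) for a nested description of a system.

    system: list of module strings
            -> each a list of modules
            -> each a list of diode-protected cell strings
            -> each a list of per-cell shaded fractions in [0, 1].
    """
    total_cells = 0
    shaded_area = 0.0
    total_cell_strings = 0
    shaded_cell_strings = 0
    shaded_module_strings = 0

    for module_string in system:
        string_hit = False
        for module in module_string:
            for cell_string in module:
                total_cell_strings += 1
                total_cells += len(cell_string)
                shaded_area += sum(cell_string)
                if any(a > 0.0 for a in cell_string):
                    shaded_cell_strings += 1
                    string_hit = True
        if string_hit:
            shaded_module_strings += 1

    return (shaded_area / total_cells,
            shaded_cell_strings / total_cell_strings,
            shaded_module_strings / len(system))


# Toy system: 2 module strings x 1 module x 3 cell strings x 4 cells.
unshaded = [0.0, 0.0, 0.0, 0.0]
toy_system = [
    [[[1.0, 0.5, 0.0, 0.0], unshaded, unshaded]],  # string 1: one cell string hit
    [[unshaded, unshaded, unshaded]],              # string 2: no shade
]
r_shade, r_cs, r_ms = shade_parameters(toy_system)
```

For the toy system this gives Rshade = 1.5/24, Rcs = 1/6, and Rms = 1/2. All three values fall between 0 and 1 regardless of system size.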
Fig. 4. The sensitivity of normalized power as calculated with S5P for different values of shaded effective irradiance, in the range 0.01–0.24. The overall variation in this range is less than 3 parts per 10,000.
Fig. 5. Circular shade (r = 1.5 cells) moving across a single string of modules at y = 5.5 cells. X position is in units of cells and designates the center point of the shade. Vertical grid lines represent individual modules. 72-cell modules were modeled here, unlike the master data set.
Plots of the simulated power from S5P versus these three factors for the entire master data set are shown in Fig. 6. These parameters can be easily generalized to arbitrary systems. By scaling all the factors to a range of 0 to 1, the shade conditions can be described for an arbitrary system configuration, including various numbers of strings, string lengths, and module sizes. In addition, obtaining these three parameters for use in simulations for a prospective site is becoming more
Fig. 6. Modeled power by S5P plotted against the three selected parameters.
reasonable with the development in the industry of tools and devices to estimate shade projection onto a 3D CAD model of a PV system. IV. MODEL DEVELOPMENT Artificial neural networks (ANNs) [8] and support vector regression (SVR) [9] were considered for the predictive model. Both ANNs and SVR are versatile supervised learning methods that are well suited for regression of high-dimensional, nonlinear data. A detailed comparison of these methods is beyond the scope of this paper, and more information can be found in the sources. In general, different model implementations were evaluated for fitness by taking a random 2/3 of the master data set to train the models (“training data”) and using the remaining 1/3 of the data to evaluate the model fitness (“validation data”). Using the terminology of machine learning, Rshade, Rcs, and Rms are the features of the data and normalized system power from S5P is the target. The features and target in the training data are used to optimize each model’s parameters, then the validation features are fed through the model, and the model predictions on the validation features are compared to the validation targets. The
model predictions are linearly regressed against the validation targets, and the coefficient of determination (R²) and root-mean-square error (RMSE) are calculated for each model iteration. These two factors describe the fitness of each model.
Before comparing iterations of ANN and SVR models, a “simple” ANN model was trained first. Each time a new model was tested on a randomized validation data set, the simple ANN was tested as well. The fitness factors derived from the simple ANN allowed us to evaluate how “easy” or “hard” each randomized validation data set was to model, and we used the difference between each model's fitness factors and the simple ANN's fitness factors to differentiate and rank the various model implementations.
In the end, an ANN with two hidden layers, 8 nodes in the first layer and 16 in the second, provided the best description of the data. A diagram of this network topology is shown in Fig. 7. Other technical details of the ANN are given in Table II. Fig. 8 shows the output of the finalized FSM versus the three input parameters.
V. MODEL VALIDATION PROCEDURE
The model was validated against production from a system in a well-characterized shade environment. The system consisted of 8 SunPower X21-335 panels connected in series to an SMA string inverter and an identical reference system nearby, which was not exposed to shade. The test system was surrounded by vertical pipes on the east, west and south sides, and the reference array was unshaded. Both arrays were mounted in the same orientation with the same wind shielding, as shown in Fig. 9. Before and after the pipe shade was applied to the test system, both systems were allowed to run without shade. This data was
Fig. 7. Network topology graph of the selected ANN for the final FSM. The network has three input nodes for the three features and one output node for the single target. There are two hidden layers with 8 nodes in the first and 16 nodes in the second.
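The 3-8-16-1 topology of Fig. 7 can be sketched as a plain forward pass. This is not the trained FSM: the weights below are random placeholders, the helper names are hypothetical, and applying the sigmoid to the output node as well as to the hidden layers is an assumption.

```python
import math
import random

LAYER_SIZES = [3, 8, 16, 1]  # inputs (Rshade, Rcs, Rms), two hidden layers, output


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(sizes, seed=0):
    """Create random placeholder weights and biases for each layer."""
    rng = random.Random(seed)
    return [
        ([[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
         [rng.gauss(0.0, 1.0) for _ in range(n_out)])
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(network, features):
    """Propagate one feature vector through every layer."""
    activations = list(features)
    for weights, biases in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]


def parameter_count(network):
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


network = init_network(LAYER_SIZES)
n_params = parameter_count(network)             # 32 + 144 + 17 = 193
prediction = forward(network, [0.1, 0.5, 1.0])  # untrained, so not meaningful
```

With 193 parameters and a fixed number of multiply-adds per evaluation, the cost of one prediction is independent of the number of cells in the system being described.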
Fig. 9. Shaded and reference arrays used for model validation.

TABLE II
TECHNICAL DETAILS OF SELECTED ANN

| Property | Value |
| --- | --- |
| Activation function | Sigmoid/Logistic |
| Cost function | Quadratic/MSE |
| Optimization | Newton Conjugate-Gradient [11], [12] |
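The random 2/3 to 1/3 split and the fitness metrics from Section IV, together with the quadratic cost in Table II, can be sketched in plain Python. The function names are hypothetical. R² is computed here as the squared Pearson correlation, which equals the coefficient of determination of a simple linear regression of predictions on targets; whether the authors computed it this way is an assumption.

```python
import math
import random


def split_indices(n, train_fraction=2.0 / 3.0, seed=0):
    """Randomly split range(n) into training and validation index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n * train_fraction))
    return indices[:n_train], indices[n_train:]


def mse(predictions, targets):
    """Quadratic cost: mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def rmse(predictions, targets):
    return math.sqrt(mse(predictions, targets))


def r_squared(predictions, targets):
    """Squared Pearson correlation between predictions and targets."""
    n = len(targets)
    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predictions, targets))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    return cov * cov / (var_p * var_t)


train_idx, val_idx = split_indices(9)
example_rmse = rmse([0.25, 0.45, 0.90], [0.20, 0.50, 0.90])
```

RMSE penalizes any departure from the y = x line, while R² only measures how well a straight line relates predictions to targets, so reporting both guards against a model that is well correlated but biased.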
Fig. 8. FSM modeled power plotted against the three selected parameters. Compare this to the charts in Fig. 6.
Fig. 10. SketchUp representation of shaded array. A combination of Ruby and Python scripting was used to generate cell-level shade data from this 3D model.
used to normalize the performance between the two systems. This way, the normalized power from the reference array provided a direct estimate of how much power the shaded system should have produced without the impact of shade. Instantaneous AC power was collected every 5 seconds using a high accuracy (±0.2%) Electro Industries Shark™ meter. Meteorological data was also collected, but this data was not required due to the comparative nature of the analysis. Video cameras were used to record images every minute of the day to determine shading.
To obtain percent shade on a per-cell basis, a model of the test system was created in SketchUp using geolocation and shadow casting features, shown in Fig. 10. This model was validated using video taken of the test setup, and then Ruby scripting in SketchUp was used to create images of the shade on the array at any date/time of interest. These images were subsequently analyzed using a Python script to determine the area of each cell shaded.
A single, clear day was chosen for model validation. For this day, cell-level shade data was generated at 5-minute intervals, beginning at 2.5 minutes after the hour. This way the instantaneous assessment of shade conditions could be compared to 5-minute average performance data. For example, a data point time stamped at 10:05AM represents average system performance from 10:00:05AM to 10:05:00AM and an estimation of cell-level shade at exactly 10:02:30AM. This 5-minute data was processed to provide the following: the power lost by the shaded system relative to the unshaded system, cell-level effective irradiance inputs for S5P, and the three parameters (Rshade, Rcs, and Rms) needed for the FSM.
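The reference-array normalization described above can be sketched as follows. The paper does not state how the two systems were normalized, so using the ratio of summed power over the unshaded period as a calibration factor is an assumption, as are the function names and the example numbers.

```python
def calibration_factor(test_unshaded, ref_unshaded):
    """Ratio of test to reference output while neither array is shaded."""
    return sum(test_unshaded) / sum(ref_unshaded)


def shade_loss(test_power, ref_power, factor):
    """Fractional power lost to shade for one 5-minute average."""
    expected_unshaded = factor * ref_power  # what the test array should produce
    return 1.0 - test_power / expected_unshaded


# Made-up averages (W): the test array reads about 1% lower than the reference.
k = calibration_factor([1000.0, 2000.0], [1010.0, 2020.0])
loss = shade_loss(test_power=1500.0, ref_power=2020.0, factor=k)  # about 0.25
```

Because the loss is a ratio against a simultaneously measured reference, irradiance and temperature effects common to both arrays cancel, which is why the meteorological data was not needed.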
Fig. 11. A comparison of actual (simulated by S5P) versus predicted, for four system-level, parameterized shade performance models. y = x lines are included for reference.

In addition to comparing FSM and S5P to the measured data, we also evaluated two other common approaches to estimating system-level shade loss: the shade impact factor model with SIF=2 (which is used by the California Energy Commission) [1] and the simplistic assumption that power loss equals irradiance loss, which is the implicit assumption in tools such as the SunEye from Solmetric, which provide the user with monthly “solar access” values[10]. These two models will be referred to as “SIF” and “LIN” (for linear) respectively.
VI. DISCUSSION
A. Assessment of FSM Accuracy Compared to S5P
The FSM shows a dramatic improvement in describing the output of S5P, as compared to the SIF and LIN approaches to estimating the performance impact of shade. Fig. 11 shows four actual versus predicted plots, with “actual” in this case being the output of the S5P model. In addition to FSM, SIF, and LIN, a full-factorial linear regression model based on the same parameters as FSM is also shown, to illustrate how the trained ANN in the FSM does an improved job of describing the nonlinear behavior of the data. The RMSE of each model relative to S5P is given in Table III. These are the RMSEs for the y=x lines shown in Fig. 11. The residuals for the FSM versus S5P fit have a standard deviation of 0.03 and a mean of zero, so we estimate the 95% confidence interval of FSM relative to S5P as ±6% (approximately two standard deviations). By
reducing the dimensionality of the problem, we have accepted an increase in uncertainty in exchange for a simpler, faster-to-evaluate model. While not a formal benchmark, preliminary speed tests show that FSM evaluates approximately 3 orders of magnitude faster than S5P for a small system with 768 cells, and the evaluation time for S5P scales with system size, while that of FSM does not. A rigorous benchmarking of performance, however, is beyond the scope of this paper.

TABLE III
COMPARISON OF RMSE RELATIVE TO S5P

| Model | RMSE |
| --- | --- |
| FSM | 0.0299 |
| Full-Factorial | 0.0534 |
| SIF | 0.1518 |
| LIN | 0.1635 |

B. Model Comparison to Measured Data
Fig. 12 shows a comparison of each model to the measured data over the course of a single day. While the SIF and LIN models greatly under-predict the amount of power lost due to shade, S5P and FSM both tend to over-predict the power loss for this system/shade configuration, especially in the tails of the day. The fact that FSM appears to be closer to the measured data than S5P should not be taken to mean it is a more accurate model. FSM attempts to predict the output of S5P, and the difference between the models is well within the expected 95% confidence interval
stated previously. For other system/shade configurations, FSM can predict greater losses than S5P. The total energy loss over the course of the day by the shaded system and the estimated energy loss by the four models are shown in Fig. 13. S5P predicts 4.6% more energy loss than was measured, while FSM predicts 2.5% more energy loss. In contrast, the SIF and LIN methods underestimate the energy loss by a significant amount. More work needs to be done to determine the cause of the discrepancy between the measured and S5P results. It is possible that light trapping within the laminate effectively reduces the size of the shade to a smaller geometry than was modeled. Additionally, the script for converting the 3D model of site shade to cell Ee did not take into account the white space between cells in SunPower modules. More validation against real data under different shade configurations
should be done to determine if the error we see is a consistent bias error across all conditions or if it is linked to the specific type of shade applied in this study.

Fig. 12. A comparison of S5P, FSM, SIF, and LIN models to measured data. Top figure shows system power as a ratio of unshaded power. Bottom figure shows the deviation of each model from the real normalized power.

Fig. 13. Percent loss in energy versus the unshaded system over the course of the single day shown in Fig. 12.

VII. CONCLUSION
We have developed a fast and accurate model for predicting the power loss in partially shaded PV systems. This model significantly reduces the RMS error as compared to linear shade loss models when referenced to S5P, while being much less computationally intensive than S5P. In addition, we show a unique approach for understanding the important factors in shade analysis through rapidly evaluating many shade configurations and collapsing the dimensionality of the problem down to three geometric parameters.
VIII. ACKNOWLEDGEMENTS
We would like to acknowledge the Python scientific computing community, whose hard work, excellent code, and detailed documentation made it possible to perform this analysis, and Daniel Dedrick for his helpful editing.
REFERENCES
[1] C. Deline, “Partially shaded operation of a grid-tied PV system,” in Conference Record of the 34th IEEE Photovoltaic Specialists Conference, 2009, pp. 001268–001273.
[2] J. W. Bishop, “Computer simulation of the effects of electrical mismatches in photovoltaic cell interconnection circuits,” Sol. Cells, vol. 25, no. 1, pp. 73–89, 1988.
[3] D. L. King, J. K. Dudley, and W. E. Boyson, “PVSIM©: A Simulation Program for Photovoltaic Cells, Modules, and Arrays,” Conf. Rec. 25th IEEE Photovolt. Spec. Conf., pp. 1295–1297, 1996.
[4] W. De Soto, “Improvement and Validation of a Model for Photovoltaic Array Performance,” University of Wisconsin-Madison, 2004.
[5] S. Gillies, “Shapely: Set-Theoretic Analysis and Manipulation of Planar Features in Python,” 2015. [Online]. Available: https://pypi.python.org/pypi/Shapely. [Accessed: 19-May-2016].
[6] D. L. King, W. E. Boyson, and J. A. Kratochvil, “Photovoltaic Array Performance Model,” Albuquerque, NM, 2004.
[7] C. Olah, “Visualizing MNIST: An Exploration of Dimensionality Reduction.” [Online]. Available: http://colah.github.io/posts/2014-10-Visualizing-MNIST/. [Accessed: 05-Mar-2016].
[8] M. Nielsen, “Neural Networks and Deep Learning,” Determination Press, 2015. [Online]. Available: http://neuralnetworksanddeeplearning.com/. [Accessed: 19-May-2016].
[9] F. Pedregosa, et al., “Scikit-learn User Guide: Support Vector Machines,” 2014. [Online]. Available: http://scikit-learn.org/stable/modules/svm.html. [Accessed: 19-May-2016].
[10] Solmetric, “Application Note: Understanding the Solmetric SunEye.” [Online]. Available: http://resources.solmetric.com/get/UnderstandingTheSolmetricSunEye-March2011.pdf. [Accessed: 19-May-2016].
[11] S. Nash, “Newton-Type Minimization via the Lanczos Method,” SIAM J. Numer. Anal., vol. 21, no. 4, pp. 770–788, 1984.
[12] E. Jones, et al., “SciPy: Open Source Scientific Tools for Python,” 2016. [Online]. Available: http://scipy.org. [Accessed: 19-May-2016].