ORCHARD AND TREE MAPPING AND DESCRIPTION USING STEREO VISION AND LIDAR

Michael Nielsen1*, David C. Slaughter2, Chris Gliever2, Shrini Upadhyaya2
1 Danish Technological Institute, Robot Technology, Forskerparken 10F, 5230 Odense, Denmark
2 Biological and Agricultural Engineering, University of California, One Shields Ave, Davis, CA 95616, USA
*Corresponding author. E-mail:
[email protected]

Abstract
A novel approach to orchard scanning is proposed and validated through automatic tree separation and height measurements. The experiments were carried out in a real orchard, recording data from a moving vehicle using lidar and stereo vision. The orchard was reconstructed using tilt-sensor-corrected GPS positions interpolated through encoder counts. The accuracy of the reconstruction and tree labelling was compared with and without tilt correction. Tilt correction improved the accuracy of the lidar reconstruction, while the stereo vision reconstruction depended on it. The measured heights correlated with ground truth at R2 = 0.83, with an average error down to 9 cm. The approach is suggested for yield estimation, dynamic control of string-thinning cylinders, and variable-rate application of pesticides.

Key words: sensors, vision, lidar, thinning, scanning.

1. Introduction
A method for describing the 3D spatial information of an orchard for automation and management purposes is described. Scanning and mapping orchards in 3D can serve many applications in fruit and nut production, such as precision spraying, controlling string thinners, predicting yields, or detecting damage and disease. Other works have focused on lidar scanning and volume estimation for LAI correlations (Polo et al. 2009). Further parameters, such as blossom density, tree height, and tree orientation, are needed to control the orientation and placement of string thinners (Miller et al. 2011). In earlier work, lidar maps were built assuming constant orientation and speed (Polo et al. 2009), which holds on an assembly line but not in a field experiment. Segmentation of individual trees is also mostly done manually, tree by tree. Pérez-Ruiz et al. (2012) used a tilt sensor to correct the positioning of transplants, but modelled the trajectory as a straight line.
The goal of this paper is to build a system that can scan an orchard while driving through it non-stop, correcting for tilting, wiggling, and turning around at each row end. Peaches are a popular fruit, with a 2010 total production of 1045 million kg (fresh and canned combined) and a per capita consumption of 3.8 kg in the USA alone (USDA 2010). The USA farm gate value of peaches in 2010 was almost US$615 million (USDA 2010). After apple, peach is the most widely grown tree fruit in the USA (Westwood 1993). Peach is naturally precocious, and most cultivars are self-fertile and produce an abundance of flowers and fruitlets that must be thinned in order to produce the large, high-quality fruit preferred by consumers (LaRue 1989). Currently, peaches are thinned by hand at a 2009 cost of US$1090/hectare, which represents approximately 18.5% of total production costs (Day 2009). Hand thinning costs are slightly higher than annual pruning costs and about one-third of hand harvest costs in California (Day 2009). In addition to the adverse impact of hand thinning on production costs, many growers are concerned that labor shortages may limit the long-term sustainability of manual thinning. Some simple mechanical systems have been developed to help peach growers mechanize the thinning of blossoms and fruitlets (Johnson 2012). These systems typically consist of a cylindrical array of strings or flexible rods that are rotated at high velocity to randomly knock a portion of the blossoms or fruitlets off the tree. While some economic advantage has been demonstrated (Duncan and Norton 2009), Johnson (2012) observed that shoots parallel to the row and pointing in the direction of travel had about twice as many flowers removed as shoots growing in other directions, and that the greatest benefit was in orchards with particularly heavy fruit set. Thus, to optimize the mechanization of fruitlet and blossom thinning, more advanced technologies are required that use machine learning and real-time sensing to gather information about the current status of the orchard and develop a custom thinning plan, based upon that information, that will produce an optimum yield of large, high-quality fruit.

2. Materials and Methods
A Trimble GPS, an odometer, a SICK lidar, a 5 MP stereo camera set, and a VectorNav tilt sensor were mounted on a Mule utility vehicle. The vehicle was also equipped with a halogen flood light and a custom-made xenon strobe that flashed a pattern onto the trees. Three rows of 37-38 peach trees each were scanned and reconstructed by fusing the data from the sensors. GPS positions were interpolated using odometer data, and the tilt data were used to correct the stereo and lidar maps. Lidar and stereo imagery were treated separately. The vehicle scanned the rows from both sides at 5-10 km/h. One row of 38 trees was measured manually using a total station land surveying device for comparison.
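The encoder-seeded interpolation of GPS positions can be sketched as follows. This is a minimal numpy sketch with illustrative names; piecewise-linear interpolation stands in for the spline fit used in the actual system, and the heading is taken as the first derivative of the interpolated path.

```python
import numpy as np

def interpolate_trajectory(gps_xy, gps_ticks, query_ticks):
    """Interpolate vehicle positions between GPS fixes using encoder counts.

    gps_xy      : (N, 2) GPS positions (m), one per fix
    gps_ticks   : (N,)  encoder count recorded at each GPS fix (monotonic)
    query_ticks : (M,)  encoder counts at which positions are wanted

    Returns (M, 2) positions and (M,) headings in radians.
    """
    x = np.interp(query_ticks, gps_ticks, gps_xy[:, 0])
    y = np.interp(query_ticks, gps_ticks, gps_xy[:, 1])
    # Heading = first derivative of the path w.r.t. the encoder parameter.
    dx = np.gradient(x, query_ticks)
    dy = np.gradient(y, query_ticks)
    heading = np.arctan2(dy, dx)
    return np.column_stack([x, y]), heading
```

Because the path is parameterized by encoder distance rather than time, the interpolation remains valid when the vehicle changes speed or stops between GPS fixes.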
Figure 1 (left) shows the setup on the vehicle. Figure 1 (middle) shows the sensor attachment, which was mounted at a 13-degree angle on the vehicle. The camera system, lidar, VectorNav, and GPS each have their own coordinate system. The offset from the GPS to the lidar was 20 x 40 cm, and to the camera system 22 x 43 cm (along the driving direction and to the side). Figure 1 (right) shows how the tree heights were defined: the height from the ground to the tallest main limb of the V structure.
Figure 1. Left to right: the setup on the Mule, the sensors in their different coordinate systems, and the measurement of ground truth heights.
Figure 2 shows the software structure used to extract information about the trees from the data. It consists of three types of modules: a tree sensor, a trajectory sensor, and an information module. Each can be exchanged for a different implementation. In this paper, two types of tree sensors are compared: a stereo vision based module and a lidar based module.

Tree sensor 1 – Stereo vision
The stereo camera reconstruction used a binocular version of the algorithm in Nielsen et al. (2012). It is a correlation-based algorithm that is fast and uses CUDA acceleration, so it can run in real time. Nielsen et al. (2012) demonstrated that correlation-based stereo vision was superior to global optimisation algorithms when reconstructing non-planar scenes such as trees. However, the algorithm sometimes had problems matching blossoms from one image to the other. In this experiment, this was addressed by projecting a random noise pattern with the strobe, a method typically used in bin-picking applications at short range over small areas. Consequently, a custom-made strobe had to be built; see Figure 3.
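The matching principle behind correlation-based stereo can be illustrated with a single-pixel disparity search by normalized cross-correlation. This is an illustrative numpy sketch, not the CUDA implementation of Nielsen et al. (2012); the window size and function names are assumptions.

```python
import numpy as np

def ncc_disparity(left, right, row, col, max_disp=32, half=3):
    """Disparity at one pixel of a rectified image pair, found by
    maximizing normalized cross-correlation over candidate shifts."""
    patch = left[row - half:row + half + 1, col - half:col + half + 1].astype(float)
    patch = patch - patch.mean()
    best, best_score = 0, -np.inf
    for d in range(max_disp):
        c = col - d
        if c - half < 0:
            break  # candidate window would leave the image
        cand = right[row - half:row + half + 1, c - half:c + half + 1].astype(float)
        cand = cand - cand.mean()
        denom = np.sqrt((patch ** 2).sum() * (cand ** 2).sum())
        score = (patch * cand).sum() / denom if denom > 0 else -np.inf
        if score > best_score:
            best, best_score = d, score
    return best
```

The projected noise pattern helps exactly this step: it gives otherwise ambiguous regions (e.g. clusters of similar blossoms) a unique local texture, so the correlation peak becomes unambiguous.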
Figure 2. Overall structure of the proposed method. It consists of tree sensors, a trajectory sensor, and an information module.
Figure 3. Pattern strobe made from a xenon flash head, an aluminium tube, a pattern transparency sandwiched between clear and frosted glass, and a view camera lens mounted for focussing.
Each image set resulted in a point cloud covering parts of 1-3 trees. These point clouds were combined using information from the trajectory module.

Tree sensor 2 – Lidar
The lidar produced single 180-degree scan lines that were combined using the trajectory module information in the same way as the stereo point clouds. However, since this point cloud included ground points, a structured RANSAC algorithm was used to separate tree points from ground points. Principal component analysis was performed iteratively on the point cloud to find the basis vectors spanning the ground plane. The 1st basis vector explained the length of the orchard row, the 2nd explained the direction crossing it, and the normal pointed up, like the trees (see Figure 5). The distance of the points to the plane formed by basis vectors 1 and 2 was used to remove the tops of the trees from the set, and the process was repeated until the plane was a tight fit.

Trajectory module
The trajectory module used data from the GPS and the encoder to derive the absolute position by fitting a spline to the GPS data, using the encoder counts as seeds. This way, a straight line was not assumed. The heading was given by the first derivative of the spline. The VectorNav data were used to compute the local deviation from the path by comparing the measurement at a given timestamp to the average orientation over 1 second. The built-in Kalman filter was set up to react to fast movements.

Information module
Individual trees were automatically segmented and described using an adapted Gaussian mixture model with an unknown number of clusters (k). The clusters described the size and orientation of the trees and were compared to ground truth heights. Each point was weighted inversely by its height above half the tree height, because the tops looked less like a Gaussian distribution than the bottoms. The fitting of Gaussian mixtures was done by testing two different values of k simultaneously and selecting the best model after applying heuristics. The algorithm used two heuristics: 1. the trees had to have a certain height, and 2. trees could not float on top of each other. The Gaussian mixture fitting returned a quality measure of how well the model fitted. To this log likelihood was added a complexity measure derived from the number of clusters present and the extent of the eigenvectors in the model.
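The iterative PCA step that separates ground points from tree points in the lidar cloud might be sketched as follows. The height threshold and iteration cap are illustrative assumptions, not values from the paper.

```python
import numpy as np

def separate_ground(points, height_thresh=0.2, n_iter=10):
    """Iteratively fit a ground plane by PCA and drop points above it.

    points : (N, 3) lidar points in world coordinates.
    Returns a boolean mask that is True for ground points.
    """
    mask = np.ones(len(points), dtype=bool)
    for _ in range(n_iter):
        pts = points[mask]
        centroid = pts.mean(axis=0)
        # PCA via SVD: the two largest-variance directions span the ground
        # plane; the smallest-variance direction approximates the normal.
        _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
        normal = vt[2]
        if normal[2] < 0:
            normal = -normal  # orient the normal upwards
        height = (points - centroid) @ normal
        new_mask = height < height_thresh  # drop points high above the plane
        if np.array_equal(new_mask, mask):
            break  # tight fit reached
        mask = new_mask
    return mask
```

Each pass removes tree tops from the fitting set, so the recovered plane converges onto the actual ground rather than a plane biased upwards by the canopy.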
The weight between this complexity measure and the log likelihood was a parameter, lambda, which denoted the expected separation between trees. Figure 4 shows an overview of the algorithm.
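The dual-k model search can be sketched with scikit-learn's (unweighted) Gaussian mixture and a simplified complexity penalty lam * k. The height weighting and tree-shape heuristics of the actual algorithm are omitted here, so this is a sketch of the selection step only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_tree_model(points, k1, k2, lam=1.0, seed=0):
    """Fit two candidate mixture models and keep the one with the best
    complexity-penalized log likelihood, mirroring the dual-k search of
    Figure 4 (penalty form lam * k is an illustrative simplification)."""
    best = None
    for k in (k1, k2):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(points)
        # score() is the mean per-sample log likelihood; scale back to total.
        penalized = gmm.score(points) * len(points) - lam * k
        if best is None or penalized > best[0]:
            best = (penalized, k, gmm)
    return best[1], best[2]
```

In the full algorithm the losing candidate is replaced by a new alternative k and the loop repeats until the stop criteria in Figure 4 are met.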
Figure 4. Tree fitting algorithm based on weighted Gaussian mixtures. Two models were fitted simultaneously and compared through log likelihood, complexity, and heuristics.
The Gaussian mean position and covariance were used to model the trees as cylinders, where the position, orientation, height, and diameter served as a rough estimate of each tree. The cylinder model can be exchanged for a better model of whatever tree type needs to be scanned.

3. Results
The result was a map of the 3-row section of the orchard, annotated with the number of trees in each row and their width, height, and orientation. The adapted GMM segmentation correctly segmented all 113 trees, without segmenting missing trees or splitting trees with a wide spread of the V structure. Figure 5 shows parts of the orchard reconstruction by lidar (left) and stereo vision (right). The lidar reconstructed entire trees, so the heights of the cylinders were compared to the ground truth heights.
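Deriving rough cylinder parameters from a fitted Gaussian component might look as follows; the 2-sigma extent factor used to convert standard deviations into height and diameter is an illustrative assumption.

```python
import numpy as np

def cylinder_from_gaussian(mean, cov, n_sigma=2.0):
    """Turn a fitted 3-D Gaussian (mean, covariance) into a rough cylinder:
    position, axis direction, height, and diameter."""
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    axis = vecs[:, 2]                  # largest-variance direction = trunk axis
    if axis[2] < 0:
        axis = -axis                   # point the axis upwards
    height = 2 * n_sigma * np.sqrt(vals[2])
    diameter = 2 * n_sigma * np.sqrt(vals[1])
    return mean, axis, height, diameter
```

The leaning of the axis away from vertical is what a string-thinner controller would use to adjust cylinder orientation tree by tree.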
Figure 6 shows the cylinders constructed from the lidar data for the row used in the ground truth evaluation. The stereo vision did not reconstruct the entire trees from the ground up, so height measurements were made using the tops of the trees and assuming a fixed bottom location.
Figure 5. (Left) The lidar point cloud reconstruction and ground model based on PCA. (Right) The camera reconstruction, where the ground was not in the field of view.
Figure 6. The 3D model of the orchard row, where each tree was given a separate color to show its labelling, with its height written above. The cylinders denote tree sizes and orientations.
The GMM correctly segmented all trees and automatically detected that one tree was missing in one row. There were also difficult cases where trees overlapped or a tree had two visible main trunks; these too were correctly classified. The heights correlated with ground truth at R2 = 0.83 with RMS = 9 cm for the lidar map, and at R2 = 0.60 with RMS = 16 cm for stereo vision; see Table 1 for details. This demonstrates the importance of the VectorNav adjustments, especially for the stereo vision, where lag or oversmoothing by the built-in Kalman filter affects a large point cloud instead of a single scan line. This suggests that point cloud registration would be beneficial.

4. Conclusions
The stereo vision data additionally provided material for colour- and texture-based analyses, such as blossom densities or disease detection, also called "rich 3D", where 3D points are associated with other features such as colour, texture, or thermal imagery. While the accuracy of the lidar measurements was affected by vehicle tilt, the vision approach depended on the tilt correction. Furthermore, the lidar provided data on the ground, which was not in the field of view of the camera. The results enable real-time control of string thinner orientation and can provide a map of the orchard for management purposes. Further research can test the proposed setup on full orchard scans in trials of yield estimation, deficiency models, and damage reports, or use the models for precision spraying.
The main modules are interchangeable with algorithms and sensors other than those proposed, such as tree structure models in place of the GMM-based cylinders, point cloud stitching for the stereo vision, or Kalman filtering instead of the spline trajectory.

Table 1. Tree segmentation for lidar and camera reconstructions, with and without adjustment using the tilt sensor. The camera approach is much more sensitive to the adjustment, and to inaccuracies in it.
                       LIDAR                         Camera
                  No adjustment   Adjustment   No adjustment   Adjustment
Correlation (R2)      0.80           0.85          0.32           0.60
RMS error (m)         0.12           0.09          0.20           0.16
P                     0.01           0.01          0.04           0.02
Segmented             38             38            38             38

Acknowledgements
This work was supported in part by the USDA Specialty Crop Research Initiative, Award No. 2008-51180-19561, the California Canning Peach Association, and the University of California, Davis.

References
Polo, J.R.R., Sanz, R., Llorens, J., Arno, J., Escola, A., Ribes-Dasi, M., Masip, J., Camp, F., Gracia, F., Solanelles, F., Palleja, T., Val, L., Planas, S., Gil, E., Palac, J. (2009) A tractor-mounted scanning LIDAR for the non-destructive measurement of vegetative volume and surface area of tree-row plantations: a comparison with conventional destructive measurements. Biosystems Engineering, 102, pp 128-134.
Pérez-Ruiz, M., Slaughter, D.C., Gliever, C., Upadhyaya, S.K. (2012) Tractor-based Real-time Kinematic-Global Positioning System (RTK-GPS) guidance system for geospatial mapping of row crop transplant. Biosystems Engineering, 111(1), pp 64-71.
Miller, S.S., Schupp, J.R., Baugher, T.A., Wolford, S.D. (2011) Performance of Mechanical Thinners for Bloom or Green Fruit Thinning in Peaches. HortScience, 46(1), pp 43-51.
USDA ERS-NASS (2010) Fruit and Tree Nuts Situation and Outlook Yearbook. Stock #89022. Economic Research Service, United States Department of Agriculture, Washington DC, USA.
Westwood, M.N. (1993) Temperate-Zone Pomology: Physiology and Culture. Third Edition. Timber Press, Inc., Portland, OR, USA.
LaRue, J.H., & Johnson, S.R. (1989) Peaches, Plums, and Nectarines: Growing and Handling for Fresh Market. Univ. of California, Cooperative Extension, Division of Agriculture and Natural Resources, Oakland, CA, USA. Publication number 3331.
Day, K.R., Klonsky, K.M., & De Moura, R.L. (2009) Sample Costs to Establish and Produce Fresh Market Peaches. Univ. of California Cooperative Extension, Oakland, CA, USA. Available at: coststudies.ucdavis.edu/files/peachesvs09.pdf
Johnson, S.R. (2012) Mechanical Thinning. Fruit Report. Univ. of California, Cooperative Extension, Division of Agriculture and Natural Resources, Oakland, CA, USA.
Duncan, R., Norton, M. (2009) Use of a String Blossom Thinner in Canning Peaches. Univ. of California, Cooperative Extension, Division of Agriculture and Natural Resources, Oakland, CA, USA. cestanislaus.ucdavis.edu/files/111753.pdf
Nielsen, M., Slaughter, D.C., Gliever, C. (2012) Vision-Based 3D Peach Tree Reconstruction for Automated Blossom Thinning. IEEE Transactions on Industrial Informatics, 8(1), pp 188-196.