Downloaded from SAE International by Jilin University, Tuesday, December 11, 2018
2018-01-0124
Published 03 Apr 2018
A Comprehensive Testing and Evaluation Approach for Autonomous Vehicles

Guojun Wang, Weiwen Deng, and Sumin Zhang, Jilin University
Jinsong Wang, General Motors LLC
Shun Yang, Jilin University

Citation: Wang, G., Deng, W., Zhang, S., Wang, J. et al., "A Comprehensive Testing and Evaluation Approach for Autonomous Vehicles," SAE Technical Paper 2018-01-0124, 2018, doi:10.4271/2018-01-0124.
Abstract

Performance testing and evaluation has always played an important role in the development of a vehicle, and this applies equally to autonomous vehicles. The complex nature of an autonomous vehicle, from architecture to functionality, demands even more rigorously quality- and quantity-controlled testing and evaluation than ever before. Most existing testing methodologies are task- or scenario-based and can only support single or partial functional testing. These approaches may be helpful at the initial stage of autonomous vehicle development; however, as the integrated autonomous system matures, they fall short of supporting comprehensive performance evaluation. This paper proposes a novel hierarchical and systematic testing and evaluation approach to bridge this gap. First, a three-dimensional evaluation model conforming to the functional architecture of autonomous vehicles is built, with each dimension representing one of the three key functional layers of an autonomous vehicle: sensing & perception, decision-making & planning, and control & execution. Each dimension has a set of carefully defined metrics, with weights determined by an entropy weights method. Then, considering the effect of the environment on vehicle functions, task-scenarios are determined for testing the performance of each dimension, and a hierarchical, systematic testing method is designed that can test each functional layer of the autonomous vehicle separately. Fuzzy comprehensive evaluation and TOPSIS are used to quantitatively evaluate the overall performance of autonomous vehicles under the defined task-scenarios. Finally, the methods are applied to evaluate three candidate vehicles in a simulated scenario in PanoSim. Compared with the traditional approach based on external task performance, the proposed approach not only provides convincing results on overall system performance but can also look into each key functional layer and provide insights into its performance, thereby offering better guidance for autonomous vehicle research.

Introduction

Autonomous driving, envisioned as the future driving style, has drawn a lot of attention from both the auto industry and academia in recent years. However, along with the optimism about autonomous driving come safety and reliability concerns: is an autonomous car, depicted as safer, actually safe in complicated road scenarios? To help address these concerns, product maturity must be demonstrated in a convincing way, based on comprehensive performance testing and evaluation results. Due to the complex nature of autonomous vehicles, from architecture to functionality, performance testing and evaluation plays an even more important role in guiding the development of autonomous vehicles than ever before. In a sense, the rapid development of autonomous vehicles benefits from a pre-established test and evaluation system [1]. Testing tools and methodologies capable of supporting the development of autonomous vehicles must be deployed at a faster pace to meet this need [2]. © 2018 SAE International. All Rights Reserved.
Several previous works provide good references, such as the DARPA Grand/Urban Challenge (DUC) and the European Land-Robot Trial robot competitions [3], [4], the "eVALUE" intelligent vehicle evaluation project [5], and, more recently, the "Grand Cooperative Driving Challenge 2016" [6]. Similar autonomous vehicle competitions have also been held in China: the National Science Foundation of China spent over 30 million dollars to support nine "Intelligent Vehicle Future Challenges" held in different Chinese cities from 2009 to 2017 [7]. These projects test and evaluate the safety performance and collaborative driving ability of autonomous vehicles in simulated and closed environments. Sun Yang studied the quantitative analysis of the driving trajectories of autonomous vehicles based on chaos theory [8]. Huang et al. studied the quantitative comprehensive testing and evaluation of autonomous vehicles based on specific tasks [9]. Li Li et al. proposed a test architecture combining scenarios and functions by analyzing the defects
of scenario-based and function-based testing, but gave no specific testing and evaluation methods [10]. Most of the above evaluation approaches were developed by individual research institutions trying to evaluate single or partial functions in their own settings. These traditional approaches are mostly qualitative evaluations of specific functions and can only support testing a single or limited number of autonomous functions. For comprehensive quantitative evaluation, the above research only assessed autonomous vehicles through their external driving performance in specific tasks or scenarios, which cannot reveal problems in specific functions. As the integrated autonomous system matures, these traditional approaches lack the comprehensive testing capability required to support the development of autonomous vehicle systems.

The technical route of this paper is organized as shown in Figure 1. First, the three-dimensional evaluation model is established by analyzing the functional architecture of the autonomous vehicle. Second, the task-scenario model is established by determining driving tasks and driving scenarios. Then, hierarchical testing technology is designed to test each functional layer based on the pre-established task-scenarios. Finally, a comprehensive evaluation method is proposed to quantitatively evaluate the performance of each dimension based on the results of hierarchical testing.

FIGURE 2 Autonomous vehicle hierarchical architecture
FIGURE 1 The technological route of the paper
Three-Dimension Performance Evaluation Model

As shown in Figure 2, the functional architecture of an autonomous vehicle is very often decomposed into three layers: the sensing & perception layer, the decision-making & planning layer, and the control & execution layer [11]. Therefore, to evaluate the performance of autonomous vehicles comprehensively, the performance of these three functional layers should be considered. On this basis, a three-dimensional evaluation model is presented, as shown in Figure 3: sensing & perception, decision-making & planning, and control & execution form the three dimensions of the evaluation model. Each dimension has a corresponding metric system and evaluation method. A score for each dimension can be calculated from the results of the pre-established task-scenarios. Finally, the scores of the three dimensions are integrated into a comprehensive quantitative performance evaluation of the vehicle.

FIGURE 3 Three-dimension performance evaluation model. The blue triangle represents the scores of the three dimensions for autonomous vehicle 1, and the red triangle the scores for autonomous vehicle 2.
Establishing Task-Scenario Model

Before conducting performance testing of an autonomous vehicle, it is necessary to determine typical driving tasks and design corresponding scenarios based on task requirements. A task-scenario model is then established by combining driving tasks and driving scenarios.

Driving Task

A driving task is the maneuvering behavior to be completed when driving a vehicle. Based on an analysis of the typical operating environments of vehicles and the key functional layers of an autonomous vehicle [12], driving tasks can be divided into four typical categories: car-following, collision-avoiding, intersection, and U-bend tasks. Corresponding scenarios can then be designed based on the predetermined tasks.

Driving Scenario

A driving scenario is a collection of external factors and information that affect vehicle driving, such as roads, weather, light, buildings, traffic signs, traffic lights, pedestrians, vehicles, and streetscape. It is not only rich and complex but also highly uncertain and unpredictable, which limits the application of key technologies in the real environment and prevents vehicles from driving autonomously and safely. Test scenario design must be studied on the basis of the autonomous vehicle's test tasks to ensure scientific, repeatable, and safe performance tests [13]. By analyzing the influence of the scenario on each functional layer, the scenario elements are classified and their attributes defined. Scenario elements are the external environment and traffic elements that influence driving safety. According to their motion relative to the earth, scenario elements are divided into static and dynamic elements, which include roads, buildings, weather, light, traffic
FIGURE 4 Scenario element database
facilities, the surrounding vehicles, and so on. Each element also has geometric properties (size, layout, shape, etc.), physical properties (position, speed, direction, number, density, etc.), and image properties (surface roughness, texture, materials, etc.). As shown in Figure 4, a scenario element database is formed, and the theoretical model of test scenarios is formed by combining various scenario elements.

Task-Scenario

As mentioned earlier, an autonomous driving system consists of three key functional layers, all of which are influenced by the external scenario. By analyzing the elements that influence each function and the driving task requirements, the scenario elements are combined to form a test scenario; finally, the driving tasks and driving scenarios are combined to realize the task-scenarios. The task-scenario modeling process is summarized in Figure 5.
Hierarchical Testing Technology

After determining the test task-scenarios, a phased and hierarchical test method is proposed to ensure that tests are scientific and repeatable. The testing technology successively tests the control & execution layer, the decision-making & planning layer, and the sensing & perception layer with corresponding task-files, each containing different content. The test steps are as follows:

1. First, determine the driving tasks and establish a task-scenario model based on them.
2. Then, set the corresponding task-file according to the functional layer.
FIGURE 5 Task-scenario modeling process
A. Control & Execution

The control & execution layer is responsible for controlling the vehicle's braking, throttle, and steering to track the trajectory planned by the decision-making & planning layer. Therefore, all trajectory points should be included in the task-file for this test; the task-file is shown in Table 1, where L denotes longitude, A denotes latitude, and H denotes altitude. The task-file contains the coordinates of all trajectory points in three-dimensional WGS format. The autonomous vehicle only needs navigation to track the trajectory and complete the test. Nodes 1 and n are the start and end points of the test, and other trajectory points can be set between them.
B. Decision-Making & Planning

The decision-making & planning layer uses the scenario information obtained by the sensing & perception layer to determine maneuver behaviors and plan the corresponding trajectory. Therefore, the task-file should contain the ground-truth information of the surrounding scenario elements, as shown in Table 2, where nodes 1, 2, …, n are element nodes. These nodes are no different from the trajectory points in the control & execution task-file, except that each element node also carries the surrounding scenario element information. An element node indicates that at least one scenario element changes at that node during the test; there is no element change in the space between element

TABLE 1 Control & execution layer task-file
Node Number | Longitude | Latitude | Altitude
1 | L1 | A1 | H1
2 | L2 | A2 | H2
… | … | … | …
n | Ln | An | Hn
3. Finally, import the task-file into the autonomous vehicle's computing module, which parses the task-file and directs the vehicle to complete the test.

The test method can be applied to both virtual tests and real field tests. Each layer's test method is described in detail below.
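As an illustrative sketch, a control & execution task-file like Table 1 could be stored as a small CSV and parsed before the test. The column layout, comment convention, and sample coordinates below are assumptions for illustration, not the paper's actual file format.

```python
import csv
import io

def parse_task_file(text):
    """Parse a control & execution task-file: one trajectory node per row,
    with node number, longitude, latitude and altitude (WGS coordinates)."""
    reader = csv.reader(io.StringIO(text))
    nodes = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header comment
        node, lon, lat, alt = row
        nodes.append({"node": int(node),
                      "lon": float(lon), "lat": float(lat), "alt": float(alt)})
    return nodes

# A hypothetical two-node task-file: start point (node 1) and end point (node 2).
sample = "# node,longitude,latitude,altitude\n1,125.283,43.817,210.0\n2,125.291,43.822,211.5\n"
trajectory = parse_task_file(sample)
```

The computing module would then feed these nodes to the navigation/tracking function one by one.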
TABLE 2 Decision-making & planning layer task-file

Node Number | Longitude | Latitude | Altitude | Road | Building/Streetscape | Transit Facilities | Vehicle/Pedestrian | Else
1 | L1 | A1 | H1 | 1A824521 | … | … | … | …
2 | L2 | A2 | H2 | 1A825521 | … | … | … | …
… | … | … | … | … | … | … | … | …
n | Ln | An | Hn | 10810520 | … | … | … | …
nodes. Every scenario element is represented by a combination of numbers and letters, such as 1A824521 for the road in node 1. Thus, the continuous scenario is transformed into discrete element-node information by the task-file. The scenario elements include roads, streetscape, traffic facilities, vehicles, pedestrians, and others. Because weather and light do not affect the decision-making & planning function, the task-file does not contain them. The detailed information for each scenario element is represented by a corresponding code such as "1A824521" or "10810520". For example, the detailed properties of the road element are shown in Table 3, where attribute A represents the road surface material (0 concrete, 1 sand stone, 2 grass, 3 desert, etc.). Attribute B indicates the slope of the road (0 means a slope of 0, 1 means 1%, and so on); because the road slope may exceed 10%, hexadecimal is adopted for this attribute. Attribute C indicates the adhesion coefficient of the road (0 means 0, 1 means 0.1, and so on). Attribute D indicates the topology of the road (0 straight road, 1 curved road, 2 intersection, 3 overpass, etc.). Attribute E indicates the road curvature at this point (0 means a curvature of 0, 1 means 0.1, 2 means 0.2, and so on). Attribute F indicates the road width at this point (0 means 1 m, 1 means 2 m, and so on). Attribute G indicates the number of lanes at this point (0 means 1 lane, 1 means 2 lanes, and so on). Attribute H indicates the road covering (0 none, 1 snow, 2 water, 3 ice, etc.).
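The attribute coding above can be made concrete with a short sketch that decodes a road element code such as "1A824521". The function and dictionary names are illustrative, but the digit semantics follow the text (attribute B is hexadecimal).

```python
ROAD_MATERIAL = {0: "concrete", 1: "sand stone", 2: "grass", 3: "desert"}
ROAD_TOPOLOGY = {0: "straight", 1: "curve", 2: "intersection", 3: "overpass"}
ROAD_COVERING = {0: "none", 1: "snow", 2: "water", 3: "ice"}

def decode_road(code):
    """Decode an 8-character road element code with attributes A..H as in Table 3.
    Attribute B (slope) is a hexadecimal digit; the others are decimal digits."""
    a, b, c, d, e, f, g, h = code
    return {
        "material": ROAD_MATERIAL[int(a)],
        "slope_percent": int(b, 16),   # hex, since slopes may exceed 10%
        "adhesion": int(c) / 10.0,     # 0 -> 0.0, 8 -> 0.8, ...
        "topology": ROAD_TOPOLOGY[int(d)],
        "curvature": int(e) / 10.0,
        "width_m": int(f) + 1,         # 0 -> 1 m, 5 -> 6 m, ...
        "lanes": int(g) + 1,           # 0 -> 1 lane, 2 -> 3 lanes, ...
        "covering": ROAD_COVERING[int(h)],
    }

road = decode_road("1A824521")
# road["slope_percent"] -> 10, road["lanes"] -> 3, road["covering"] -> "snow"
```

Decoding "1A824521" reproduces the node 1 road described in the text: a sand-stone, snow-covered, three-lane intersection with 10% slope, 0.8 adhesion, 0.4 curvature, and 6 m width.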
As shown in Table 3, the road in node 1 is a sand-stone road with a slope of 10%, an adhesion coefficient of 0.8, a curvature of 0.4, a road width of 6 m, and snow cover; the road structure is a three-lane intersection. Thus, all scenario information can be input to the test vehicle in the form of a task-file; based on the scenario information in the task-file and GPS, the vehicle must make behavioral decisions on its own and plan a feasible trajectory to guide the test.

C. Sensing & Perception

The sensing & perception layer can be tested once the decision-making & planning layer has been tested and performs well. Sensing & perception technology includes Lidar, Radar, camera, ultrasonic, GPS, and other sensors. Testing of this layer consists of two parts: first, the vehicle's sensor technologies are tested one by one; then, the real-time fusion performance of the sensors is tested.

Single Sensor Technology Testing. The single sensor test
scheme is shown in Figure 6.

TABLE 3 Road element information

Attribute | A | B | C | D | E | F | G | H
Value | 1 | A | 8 | 2 | 4 | 5 | 2 | 1

The HIL Simulator is responsible for running the vehicle model and scenario model in real time, which can be built with simulation software
FIGURE 6 Single Sensor testing scheme
such as PanoSim and Prescan. Target information and scenario information can be synchronized to the Target Simulator and Image Generation via CAN/Ethernet. The Target Simulator is responsible for sending target information to the sensor in a form the sensor can receive, such as laser beams to Lidar, millimeter waves to Radar, and sound waves to ultrasonic sensors. The Target Simulator can be one or more devices, such as the ABEx, ARTS9510, or R&S SMBV100A [14], [15], [16]. Image Generation is responsible for transforming the scenario information in the model into video frames and injecting them into the camera ECU; it can adopt an Environment Sensor Interface Unit [17]. For Lidar, Radar, ultrasonic, and GPS testing, the HIL Simulator sends target and location information to the Target Simulator, where it is converted and sent to the sensor in a form the sensor can receive. Finally, the target and location identification information obtained through the sensor ECU can be compared with the known target information in the scenario to evaluate the performance of the algorithms. For camera testing, there are two schemes. In the first, the actual on-board camera captures the virtual scenario and outputs video frames to the camera ECU for image recognition. In the second, the scenario information is not obtained through the camera lens; instead, the virtual scenario data is directly injected into the camera ECU through Image Generation, skipping the lens and imaging units.

Multi-Sensor Fusion Testing. The multi-sensor fusion performance can be tested once single-sensor testing is complete and each sensor performs well. In this test, the sensing & perception layer is given only the start-point and end-point coordinates.
Because previous tests have verified the vehicle's trajectory-tracking capability, decision-making ability, and single-sensor technology, this test mainly examines the vehicle's multi-sensor fusion perception capability. The task-file contains only the coordinates of the start point and end point.
Methodologies of Performance Evaluation
Autonomous vehicle performance evaluation consists of establishing the evaluation model, metric selection, metric preprocessing, weight calculation, metric quantification, and comprehensive evaluation. The evaluation process and corresponding methods are shown in Figure 7, in which principal component analysis (PCA) [18], the entropy weights method [12], and fuzzy comprehensive methods [1] are applied. When evaluating the performance of an autonomous vehicle, it is necessary to determine the metrics of each dimension of the evaluation model and establish the evaluation decision matrix X, as shown in Table 4, where Ai (1 ≤ i ≤ M) represents the candidates (the vehicles) to be evaluated, M is the number of candidates, Cj (1 ≤ j ≤ N) represents the metrics of the three dimensions, N is the number of metrics, and xij (1 ≤ i ≤ M, 1 ≤ j ≤ N) is the score of candidate Ai under metric Cj, with xij ≥ 0. The weight of metric Cj is ωj, with $\sum_{j=1}^{N} \omega_j = 1$.

A. Selection of Evaluation Metrics

When there are too many metrics in a dimension, we need to find the most effective metrics and exclude the redundant and secondary metric vectors by linear transformation. PCA is a well-studied method for reducing the dimension of multivariate data and has been applied practically in many fields [18]. The evaluation metrics of each dimension are determined as follows. Sensing & perception layer: because there are many sensors in this layer and the perception and recognition output of each sensor is different, the evaluation metrics must be determined from the characteristics of each sensor. Radar and Lidar mainly output the speed, distance, and angle of the target; the camera handles target classification, identification, and tracking; GPS is responsible for determining the location of the vehicle. The metrics are therefore determined as follows. Lidar and Radar: distance error, recognition rate, angle error. Camera: MOTP (multiple object tracking precision) and MOTA (multiple object tracking accuracy) for object tracking [19]; average precision for object classification [20]; PR (precision/recall) curves for semantic segmentation; rotation and translation error for visual odometry/SLAM; average orientation similarity (AOS) for object detection and orientation [21]. GPS: location error. Decision-making & planning layer: response time, trajectory safety, trajectory comfort, driving efficiency; for specific calculation methods please refer to [22]. Control & execution layer: tracking adaptability, tracking accuracy, tracking smoothness [23]. Please refer to the relevant literature for the calculation methods of all the above metrics.

B. Preprocessing of Evaluation Metrics

There are different types of metrics for the different key technologies of autonomous vehicles. The metrics have different dimensions, and some are max type (benefits, larger is better) while others are min type (costs, smaller is better). Therefore, all metrics need to be standardized to the same type with (1) or (2).
FIGURE 7 Evaluation procedure
$$x_{ij}^{+} = \frac{x_{ij} - \min_i\left(x_{ij}\right)}{\max_i\left(x_{ij}\right) - \min_i\left(x_{ij}\right)} \quad (1)$$

$$x_{ij}^{-} = \frac{\max_i\left(x_{ij}\right) - x_{ij}}{\max_i\left(x_{ij}\right) - \min_i\left(x_{ij}\right)} \quad (2)$$
where Equation (1) is applied to max-type metrics (type unchanged) and Equation (2) converts min-type metrics to max type. In this paper, all metrics are standardized to max type.
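As a sketch of Equations (1) and (2), a single helper can standardize both benefit (max-type) and cost (min-type) metric columns; applying it to the radar distance errors from the later case study reproduces the standardized scores reported there.

```python
def standardize(values, kind="max"):
    """Min-max standardize one metric column to max type (Equations 1 and 2).
    Benefit ('max') metrics keep their direction; cost ('min') metrics are
    flipped so that larger is always better. Assumes not all values are equal."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if kind == "max":
        return [(v - lo) / span for v in values]   # Equation (1)
    return [(hi - v) / span for v in values]       # Equation (2)

# Radar distance errors from the case study are cost-type (smaller is better);
# standardization gives the paper's (1, 0.603, 0) after rounding.
scores = standardize([6.32, 14.26, 26.32], kind="min")
```

The same helper applies unchanged to benefit metrics such as trajectory safety.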
TABLE 4 Evaluation decision matrix

   | C1 | C2 | … | CN
A1 | x11 | x12 | … | x1N
A2 | x21 | x22 | … | x2N
… | … | … | xij | …
AM | xM1 | xM2 | … | xMN
C. Computing of Evaluation Metric Weights

After determining the metrics, we need to assign a weight to each metric. Weight computation can adopt manual assignment, AHP, or entropy weight methods. However, directly assigned weights suffer from subjective deviation. In the AHP method, the weights are derived from a relative-importance matrix obtained by pairwise comparison of the metrics according to the relative importance assigned by an evaluation judge [24]. To further decrease subjective preference, we employ entropy weights [12]. Information entropy can determine the metric weight
parameters by calculating the amount of information. The entropy weights α = {α1, α2, …, αN} can be calculated through the following steps, where αj (j = 1, 2, …, N) is the weight of metric Cj.

1) Calculate the entropy value ej of metric Cj:

$$e_j = -k \sum_{i=1}^{M} f_{ij} \ln f_{ij}, \quad j = 1, 2, \ldots, N \quad (3)$$

where $f_{ij} = x_{ij} \big/ \sum_{i=1}^{M} x_{ij}$ and $k = 1/\ln M$; when $f_{ij} = 0$, set $f_{ij} \ln f_{ij} = 0$.

2) Calculate the degree of diversity dj of metric Cj:

$$d_j = 1 - e_j, \quad j \in [1, N] \quad (4)$$

3) Calculate the weight αj of metric Cj:

$$\alpha_j = \frac{d_j}{\sum_{j=1}^{N} d_j} = \frac{1 - e_j}{N - \sum_{j=1}^{N} e_j}, \quad j \in [1, N] \quad (5)$$

According to the additivity of information entropy, the weight of each dimension can be calculated directly by accumulating the metric weights of the corresponding dimension. For example, if {α1, α2, …, α8} are the weights of the metrics in the sensing & perception layer, the weight of the sensing & perception layer is

$$b_1 = \alpha_1 + \alpha_2 + \cdots + \alpha_8 \quad (6)$$

D. Comprehensive Evaluation

For comprehensive evaluation, fuzzy comprehensive evaluation methods and TOPSIS can be used to integrate the evaluation metrics. The fuzzy method transforms qualitative evaluation into quantitative evaluation based on the membership theory of fuzzy mathematics, which handles fuzzy, hard-to-quantify problems well and is suited to uncertain problems [1]. TOPSIS is a multi-criteria decision analysis method based on the concept that the chosen alternative should have the shortest geometric distance from the positive ideal solution (PIS) and the longest geometric distance from the negative ideal solution (NIS) [25].

Fuzzy Comprehensive Evaluation

1. Fuzzy comprehensive evaluation of a single dimension I of the evaluation model, where I = 1, 2, 3 represents the sensing & perception, decision-making & planning, and control & execution layers, respectively.

a) Determine the evaluation metric set of dimension I:

$$U = \{u_1, u_2, \ldots, u_i, \ldots, u_n\}$$

where ui (i = 1, 2, …, n) is an evaluation metric of dimension I and n is the number of metrics in this dimension.

b) Establish the evaluation grade set V:

$$V = \{v_1, v_2, \ldots, v_j, \ldots, v_m\}$$

where vj (j = 1, 2, …, m) is the j-th evaluation grade and m is the number of grades.

c) Establish the fuzzy matrix RI of dimension I: the fuzzy set (ri1, ri2, …, rij, …, rim) is obtained by evaluating the single metric ui (i = 1, 2, …, n), where rij is the membership grade of the i-th evaluation metric ui for the j-th evaluation grade vj. The fuzzy matrix RI of dimension I is

$$R_I = \left(r_{ij}\right)_I = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nm} \end{bmatrix}_I$$

d) Establish the comprehensive evaluation model of dimension I: the weight coefficient matrix AI of dimension I is

$$A_I = \left(\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_n\right)_I$$

where αi is the weight of the i-th evaluation metric in dimension I. The comprehensive evaluation result of dimension I is then

$$B_I = A_I R_I = \left(\alpha_1\ \alpha_2\ \cdots\ \alpha_n\right)_I \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nm} \end{bmatrix}_I = \left(b_{1I}\ b_{2I}\ \cdots\ b_{mI}\right) \quad (7)$$

where bjI (j = 1, 2, …, m) is the membership grade of dimension I for the j-th evaluation grade.
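The entropy-weight steps (Equations 3-5) can be sketched as follows; the 3 × 2 sample decision matrix is hypothetical.

```python
import math

def entropy_weights(X):
    """Entropy weights (Equations 3-5) for a decision matrix X
    (M candidates x N metrics). Entries are assumed non-negative,
    with every column summing to a positive value."""
    M, N = len(X), len(X[0])
    k = 1.0 / math.log(M)
    diversity = []
    for j in range(N):
        col_sum = sum(X[i][j] for i in range(M))
        e_j = 0.0
        for i in range(M):
            f_ij = X[i][j] / col_sum
            if f_ij > 0:                      # convention: f*ln(f) = 0 when f = 0
                e_j -= k * f_ij * math.log(f_ij)
        diversity.append(1.0 - e_j)           # degree of diversity d_j (Eq. 4)
    total = sum(diversity)
    return [d / total for d in diversity]     # normalized weights (Eq. 5)

# Hypothetical 3-candidate, 2-metric matrix; the weights sum to 1.
w = entropy_weights([[1.0, 0.2], [0.6, 0.9], [0.0, 0.5]])
```

Per Equation (6), the dimension weights are then simple sums over the metric weights belonging to each layer.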
2. Fuzzy-entropy comprehensive evaluation of the autonomous vehicle. Using the above methods, the evaluation results BI (I = 1, 2, 3) of the three dimensions are obtained and combined into the final comprehensive evaluation matrix R = (B1 B2 B3)T. The weight coefficient matrix of the three dimensions is A = (a1 a2 a3), where ai (i = 1, 2, 3) is calculated by accumulating the metric weights of the corresponding dimension. The comprehensive evaluation result B is then expressed as
$$B = AR = \left(a_1\ a_2\ a_3\right) \begin{pmatrix} B_1 \\ B_2 \\ B_3 \end{pmatrix} = \left(b_1\ b_2\ \cdots\ b_m\right) \quad (8)$$
where bj (j = 1, 2, …, m) is the membership grade of the comprehensive evaluation for the j-th evaluation grade. The comprehensive evaluation result can be expressed as a total score by assigning a score value to each evaluation grade. The corresponding score values are
μ = [μ1 μ2 ⋯ μk ⋯ μm]; then the specific score G of the comprehensive evaluation result can be calculated as

$$G = B\mu^{T} = \left(b_1\ b_2\ \cdots\ b_m\right) \cdot \left[\mu_1\ \mu_2\ \cdots\ \mu_k\ \cdots\ \mu_m\right]^{T} \quad (9)$$
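Equations (7)-(9) reduce to two matrix products, which can be sketched directly; the weights, membership matrix, and grade scores in the example are hypothetical.

```python
def fuzzy_evaluate(A, R, mu):
    """Fuzzy comprehensive evaluation (Equations 7-9): weight vector A,
    membership matrix R (metrics x grades), grade score values mu.
    Returns (B, G): the membership vector and the total score."""
    n, m = len(R), len(R[0])
    B = [sum(A[i] * R[i][j] for i in range(n)) for j in range(m)]  # B = A.R (Eq. 7/8)
    G = sum(B[j] * mu[j] for j in range(m))                        # G = B.mu^T (Eq. 9)
    return B, G

# Hypothetical example: two metrics, three grades (good/fair/poor) scored 100/70/40.
A = [0.6, 0.4]
R = [[0.7, 0.2, 0.1],
     [0.3, 0.5, 0.2]]
B, G = fuzzy_evaluate(A, R, [100, 70, 40])
```

Because the rows of R and the weights in A each sum to 1, B is itself a membership distribution over the grades, and G is its grade-weighted score.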
TOPSIS Comprehensive Evaluation. A new standardized decision matrix Z is obtained by combining the decision matrix X and the entropy weight vector α.
$$Z = \left(z_{ij}\right)_{M \times N} = \begin{bmatrix} \alpha_1 x_{11} & \alpha_2 x_{12} & \cdots & \alpha_N x_{1N} \\ \alpha_1 x_{21} & \alpha_2 x_{22} & \cdots & \alpha_N x_{2N} \\ \vdots & \vdots & & \vdots \\ \alpha_1 x_{M1} & \alpha_2 x_{M2} & \cdots & \alpha_N x_{MN} \end{bmatrix} \quad (10)$$
where xij (1 ≤ i ≤ M, 1 ≤ j ≤ N) are max type. The positive ideal point (PIP, z+) is

$$z^{+} = \left(z_1^{+}\ z_2^{+}\ \cdots\ z_N^{+}\right), \quad z_j^{+} = \max_{1 \le i \le M} z_{ij} \quad (11)$$
The negative ideal point (NIP, z−) is

$$z^{-} = \left(z_1^{-}\ z_2^{-}\ \cdots\ z_N^{-}\right), \quad z_j^{-} = \min_{1 \le i \le M} z_{ij} \quad (12)$$
The weighted Euclidean distances of each candidate from the positive and negative ideal points are computed as

$$D_i^{+} = \sqrt{\sum_{j=1}^{N} \left(z_{ij} - z_j^{+}\right)^2} = \sqrt{\sum_{j=1}^{N} \left(z_{ij} - \alpha_j\right)^2}, \quad 1 \le i \le M \quad (13)$$

$$D_i^{-} = \sqrt{\sum_{j=1}^{N} \left(z_{ij} - z_j^{-}\right)^2} = \sqrt{\sum_{j=1}^{N} z_{ij}^2}, \quad 1 \le i \le M \quad (14)$$

since the min-max standardization gives $z_j^{+} = \alpha_j$ and $z_j^{-} = 0$.
Compute the relative closeness of each candidate vehicle to the ideal solution. The relative closeness of alternative i with respect to the ideal solution is defined as

$$C_i = \frac{D_i^{-}}{D_i^{+} + D_i^{-}}, \quad 1 \le i \le M \quad (15)$$
Rank the candidate vehicles according to the relative closeness to the PIS. The best candidate vehicle is the one with the greatest relative closeness to the PIS.
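The TOPSIS steps (Equations 10-15) can be sketched as follows. The standardized scores are taken from the case-study table later in the paper, but the equal weights are a hypothetical stand-in for the entropy weights, so the closeness values will differ from the paper's.

```python
import math

def topsis_closeness(X_std, weights):
    """TOPSIS relative closeness (Equations 10-15) for a matrix of
    max-type standardized scores X_std (M x N) and a weight vector."""
    M, N = len(X_std), len(weights)
    Z = [[weights[j] * X_std[i][j] for j in range(N)] for i in range(M)]
    z_pos = [max(Z[i][j] for i in range(M)) for j in range(N)]   # PIP (Eq. 11)
    z_neg = [min(Z[i][j] for i in range(M)) for j in range(N)]   # NIP (Eq. 12)
    closeness = []
    for i in range(M):
        d_pos = math.sqrt(sum((Z[i][j] - z_pos[j]) ** 2 for j in range(N)))
        d_neg = math.sqrt(sum((Z[i][j] - z_neg[j]) ** 2 for j in range(N)))
        closeness.append(d_neg / (d_pos + d_neg))                # Eq. 15
    return closeness

# Standardized case-study scores for V1-V3, with hypothetical equal weights.
X = [[1, 0, 1, 0.667, 0, 0],
     [0.603, 1, 0.538, 0, 0.633, 0.286],
     [0, 0.375, 0, 1, 1, 1]]
C = topsis_closeness(X, [1 / 6] * 6)
```

Even under these equal weights, the resulting ranking is V3 > V2 > V1, consistent with the paper's conclusion.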
Case Study of Autonomous Vehicle Evaluation

This section demonstrates the performance evaluation methodologies by testing three candidate vehicles V1, V2, and V3. Each candidate vehicle is equipped with three sensors (camera, radar, and GPS). The test experiments are implemented through co-simulation of PanoSim and Matlab/Simulink; PanoSim is an integrated simulation platform for intelligent driving [26]. The collision-avoiding task-scenario is chosen for testing and is established in PanoSim as shown in Figure 8, where A and B are the test start and end points, E is the candidate vehicle, and T is the stationary target vehicle. The candidate vehicle must complete the lane-changing task and arrive at end point B. The weather is snowy and the road surface is slippery.
FIGURE 8 Testing task-scenario in PanoSim
FIGURE 9 Candidate vehicle test mode
Hierarchical Testing

PanoSim provides a vehicle dynamics model and various environment sensor models (camera, Lidar, Radar, ultrasonic, and GPS) for the development, testing, and verification of vehicle intelligence and ADAS technologies. In this test, camera, Radar, and GPS models were added to the candidate vehicles. As shown in Figure 9, the Automated Driver, Vehicle Model, Radar Model, Camera Model, and GPS Model are combined into a candidate vehicle. The Automated Driver module is made up of two functional sub-models, a decision-making & planning model and a control & execution model, while the Vehicle Model is the dynamics model. Different parameters of the sensor models, Automated Driver, and Vehicle Model are set for the three candidate vehicles. Different task-files can be imported into the Automated Driver through the From File module for the three layer tests.

Sensing & Perception. The Ground Truth module is
added to each candidate vehicle. It can output true value for location, obstacle and lane markers which can be used as base for computing scores for camera, radar and GPS. Sensor_ evaluation module is used for computing scores of sensing ﹠ perception. Distance errors and angle errors are chosen as radar metrics. Scores for radar evaluation metrics: Average distance errors for three candidates are Derror = ( 6.32
14.26
26.32 )
Because error is min type, the scores should be standardized as max types with Equation (2). max error
D
= (1
0.603
0)
Average angle errors for three candidates are Aerror = ( 0.17 © 2018 SAE International. All Rights Reserved.
0.09
0.14 )
After standardization, we get max Aerror = (0
0.375 )
1
Lane error is chosen as camera metric. Scores for camera evaluation metric: Average lane detection error for three candidates are Laneerror = ( 0.13
0.19
0.26 )
0.538
0)
After standardization, we get max Laneerror = (1
Location errors is chosen as GPS metric. Average location errors for three candidates are Locationerror = (1.4
0.9 )
2.4
After standardization, we get max Locationerror = ( 0.667
1)
0
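Equations (1) and (2) are not reproduced in this section, but the standardized scores above are consistent with standard min-max normalization: benefit (max-type) metrics map as (x − min)/(max − min), and cost (min-type) metrics as (max − x)/(max − min). A minimal sketch, assuming that is what Equations (1) and (2) specify:

```python
def standardize_max_type(scores):
    """Benefit metric (Equation (1), assumed): a higher raw value earns a higher score."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def standardize_min_type(scores):
    """Cost metric (Equation (2), assumed): a lower raw value earns a higher score."""
    lo, hi = min(scores), max(scores)
    return [(hi - s) / (hi - lo) for s in scores]

# Radar distance errors for V1-V3 from the case study (min type)
d_error = [6.32, 14.26, 26.32]
print([round(s, 3) for s in standardize_min_type(d_error)])    # → [1.0, 0.603, 0.0]

# GPS location errors for V1-V3 (min type)
loc_error = [1.4, 2.4, 0.9]
print([round(s, 3) for s in standardize_min_type(loc_error)])  # → [0.667, 0.0, 1.0]
```

Applying `standardize_max_type` to a benefit metric such as the trajectory safety scores (2.6, 5.7, 7.5) reproduces the (0, 0.633, 1) result reported below in the same way.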
Decision-Making & Planning. The Trajectory_evaluation module computes the scores of decision-making & planning. The variable TrajectoryIn holds the trajectory coordinates planned by the Automated Driver. The evaluation metric is trajectory safety. The trajectory safety scores for the three candidates are

TS = (2.6, 5.7, 7.5)

Because safety is a max-type (benefit) metric, the scores are standardized as max type with Equation (1):

TS^max = (0, 0.633, 1)
TABLE 5 Evaluation decision matrix

      Distance  Angle   Lane    Location  Trajectory  Tracking
      error     error   error   error     safety      accuracy
V1    1         0       1       0.667     0           0
V2    0.603     1       0.538   0         0.633       0.286
V3    0         0.375   0       1         1           1
Control & Execution. The Control_evaluation module computes the scores of control & execution. The variable ControlIn holds the real trajectory coordinates of the candidate vehicle, which serve as the baseline for computing scores. The evaluation metric is tracking accuracy. The tracking accuracy scores for the three candidates are

TA = (8.5, 8.7, 9.2)

After standardization, we get

TA^max = (0, 0.286, 1)
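The hierarchical scores above are aggregated with TOPSIS in the next subsection. As a rough sketch (Python; the weight vector and the ideal-point distances are taken from the paper's reported results, since Equations (3)-(5) and (10)-(15) are not reproduced in this section):

```python
# Metric weights a_j (taken from the paper's reported results of Equations (3)-(5))
weights = [0.1545, 0.1814, 0.1597, 0.1506, 0.1525, 0.2013]

# Standardized decision matrix X (rows V1-V3; columns: distance error,
# angle error, lane error, location error, trajectory safety, tracking accuracy)
X = [
    [1.0,   0.0,   1.0,   0.667, 0.0,   0.0],
    [0.603, 1.0,   0.538, 0.0,   0.633, 0.286],
    [0.0,   0.375, 0.0,   1.0,   1.0,   1.0],
]

# Weighted decision matrix Z: z_ij = a_j * x_ij (Equation (10))
Z = [[round(a * x, 4) for a, x in zip(weights, row)] for row in X]

# Every column of X peaks at 1 and bottoms out at 0, so the positive ideal
# point equals the weight vector and the negative ideal point is all zeros.
z_pos = weights
z_neg = [0.0] * len(weights)

# Relative closeness C_i = D_i^- / (D_i^+ + D_i^-), using the paper's
# reported weighted Euclidean distances to the ideal points.
d_pos = [0.3150, 0.2443, 0.2495]
d_neg = [0.2439, 0.2482, 0.3018]
closeness = [dn / (dp + dn) for dp, dn in zip(d_pos, d_neg)]
print([round(c, 4) for c in closeness])  # → [0.4364, 0.504, 0.5474]
```

Sorting the candidates by `closeness` reproduces the comprehensive ranking V3 > V2 > V1 reported below.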
Comprehensive Evaluation of Autonomous Driving

According to the results of hierarchical testing, the evaluation decision matrix X is established as shown in Table 5.

Computing of Evaluation Metrics Weights. According to Equations (3)-(5), the weights of all metrics are acquired:

a = (0.1545, 0.1814, 0.1597, 0.1506, 0.1525, 0.2013)

Comprehensive Evaluation. The TOPSIS method is used for comprehensive evaluation. According to Equation (10), we get

Z = (z_ij)_{3×6} =
( 0.1545  0       0.1597  0.1005  0       0      )
( 0.0932  0.1814  0.0859  0       0.0965  0.0576 )
( 0       0.068   0       0.1506  0.1525  0.2013 )

The positive ideal point (PIP, z+) is

z+ = (0.1545, 0.1814, 0.1597, 0.1506, 0.1525, 0.2013)

and the negative ideal point (NIP, z−) is

z− = (0, 0, 0, 0, 0, 0)

According to Equations (13) and (14), the weighted Euclidean distances of each candidate to the positive and negative ideal points are respectively computed as

D1+ = 0.3150, D2+ = 0.2443, D3+ = 0.2495
D1− = 0.2439, D2− = 0.2482, D3− = 0.3018

According to Equation (15), the relative closeness of each candidate vehicle to the ideal solution is

C1 = 0.4364, C2 = 0.5040, C3 = 0.5474

Ranking the candidates V1-V3 by their relative closeness to the PIP, the comprehensive performance ranking is V3 > V2 > V1.

Conclusion

In this paper, a comprehensive testing and evaluation method for autonomous vehicle performance is studied. By analyzing the functional architecture of autonomous vehicles, we proposed a three-dimensional evaluation model, with each dimension representing one of the key functional layers of autonomous vehicles. Each dimension is further attributed with a set of metrics to facilitate specification, analysis, evaluation, and measurement. In this way, the various complex functional tests of autonomous vehicles are simplified to the evaluation of the three key technologies.

Secondly, we created the scenario element database by analyzing the relationship between the key technologies of autonomous vehicles and the scenario elements, and then formulated four typical task-scenarios by combining various scenario elements.

In addition, we put forward a detailed test method for each key technology of autonomous vehicles. The method tests each key technology by designing different task-files. Compared to previous testing approaches based on external task performance, the proposed method can probe the key technologies inside the vehicle, and the test results provide better guidance for research and development.

Finally, we designed the performance evaluation method, including metrics selection, metrics preprocessing, weight calculation and comprehensive evaluation. With these works, three candidate vehicles were evaluated based on simulation in PanoSim.
References

1. Sun, Y., Tao, G., Xiong, G., and Chen, H., "The Fuzzy-AHP Evaluation Method for Unmanned Ground Vehicles," Applied Mathematics & Information Sciences 7(2):653-658, 2013, doi:10.12785/amis/070232.
2. Macias, F., "The Test and Evaluation of Unmanned and Autonomous Systems," White Sands Missile Range, NM, 2008.
3. Campbell, M., Egerstedt, M., How, J.P., and Murray, R.M., "Autonomous Driving in Urban Environments: Approaches, Lessons and Challenges," Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 368(1928):4649-4672, 2010, doi:10.1098/rsta.2010.0110.
4. Schneider, F.E. and Wildermuth, D., "Results of the European Land Robot Trial and their Usability for Benchmarking Outdoor Robot Systems," Lecture Notes in Computer Science, 408-409, 2011, doi:10.1007/978-3-642-23232-9_51.
5. Lesemann, M., "Testing and Evaluation Methods for ICT-Based Safety Systems," 15th World Congress on Intelligent Transport Systems and ITS America's 2008 Annual Meeting, 2008.
6. Kokogias, S., Svensson, L., Pereira, G.C., et al., "Development of Platform-Independent System for Cooperative Automated Driving Evaluated in GCDC 2016," IEEE Transactions on Intelligent Transportation Systems, 2017.
7. National Science Foundation of China, "Intelligent Vehicle Future Challenges: 2009-2017," Tech. Rep., Nat. Sci. Found. China, Beijing, China, 2015.
8. Sun, Y., Xiong, G., Ma, X., Gong, J., and Chen, H., "Quantitative Evaluation of Unmanned Ground Vehicle Trajectory Based on Chaos Theory," Advances in Mechanical Engineering 7(11), 2015, doi:10.1177/1687814015616929.
9. Huang, W.L., Wen, D., Geng, J., and Zheng, N.-N., "Task-Specific Performance Evaluation of UGVs: Case Studies at the IVFC," IEEE Transactions on Intelligent Transportation Systems 15(5):1969-1979, 2014, doi:10.1109/tits.2014.2308540.
10. Li, L., Huang, W.-L., Liu, Y., Zheng, N.-N., and Wang, F.-Y., "Intelligence Testing for Autonomous Vehicles: A New Approach," IEEE Transactions on Intelligent Vehicles 1(2):158-166, 2016, doi:10.1109/tiv.2016.2608003.
11. Li, L., Wen, D., Zheng, N.-N., and Shen, L.-C., "Cognitive Cars: A New Frontier for ADAS Research," IEEE Transactions on Intelligent Transportation Systems 13(1):395-407, 2012, doi:10.1109/tits.2011.2159493.
12. Meng, K.-W., Zhao, Y.-N., Gao, L., and Tan, H.-C., "Evaluation of the Intelligent Behaviors of Unmanned Ground Vehicles Based on Information Theory," CICTP 2015, 2015, doi:10.1061/9780784479292.037.
13. Sun, Y., "Quantitative Evaluation of Intelligent Levels for Unmanned Ground Vehicles," Ph.D. thesis, Beijing Institute of Technology, 2014.
14. Konrad Technologies, "Analog Bus Extension for PXI," https://www.konradtechnologies.com/files/konrad/media/datasheet/ABex_Technology_Brochure_Rev120_web.pdf
15. Rohde & Schwarz, "Automated Measurements of 77 GHz FMCW Radar Signals," https://cdn.rohde-schwarz.com/pws/dl_downloads/dl_application/application_notes/1ef88/1EF88_0e_Automated_Measurements_of_77_GHz_FMCW_Radar.pdf
16. Rohde & Schwarz, "Bring Satellites into Your Lab: GNSS Simulators from the T&M Expert," https://cdn.rohde-schwarz.com/pws/solution/wireless_and_mobile_communications/positioning___navigation_1/RohdeSchwarz_GNSS_fly_en_5215_5042_32_v0100_web2.pdf
17. dSPACE, "Developing Advanced Driver Assistance Systems (ADAS) and Functions for Autonomous Driving," https://www.dspace.com/shared/data/pdf/2017/ADAS_Brochure_2017_English.pdf
18. Geiger, B.C. and Kubin, G., "Relative Information Loss in the PCA," 2012 IEEE Information Theory Workshop, 2012, doi:10.1109/itw.2012.6404738.
19. Bernardin, K. and Stiefelhagen, R., "Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics," EURASIP Journal on Image and Video Processing 2008:1-10, 2008, doi:10.1155/2008/246309.
20. Van der Pol, E., Böing, A.N., Harrison, P., Sturk, A., and Nieuwland, R., "Classification, Functions, and Clinical Relevance of Extracellular Vesicles," Pharmacological Reviews 64(3):676-705, 2012, doi:10.1124/pr.112.005983.
21. Geiger, A., Lenz, P., and Urtasun, R., "Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite," 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, doi:10.1109/cvpr.2012.6248074.
22. Sun, H., "Studies on Trajectory Planning Considering Motion Uncertainties of Traffic Vehicles," Ph.D. thesis, Jilin University, 2016.
23. Zhao, P., "Research on Motion Control Approaches of Autonomous Vehicle in Urban Environments," Ph.D. thesis, University of Science and Technology of China, 2012.
24. Karayalcin, I.I., "The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation," European Journal of Operational Research 9(1):97-98, 1982, doi:10.1016/0377-2217(82)90022-4.
25. Hwang, C.L., Lai, Y.J., and Liu, T.Y., "A New Approach for Multiple Objective Decision Making," Computers & Operations Research 20(8):889-899, 1993.
26. "PanoSim: An Integrated Simulation Platform for Intelligent Driving," [Online]. Available: http://www.panosim.com/
Contact Information

For questions or to contact the authors, please email the corresponding author, Weiwen Deng: [email protected]
Acknowledgments The research is partially supported by National Key R & D Program of China (2016YFB0100904) and National Natural Science Foundation of China under grants of U1564211, 51475206 and 51605185, and by the development grants from Shenzhen Science, Technology and Innovation Commission (JCYJ20160229165300897 and ZDSYS20140509155229805).
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the copyright holder. Positions and opinions advanced in this paper are those of the author(s) and not necessarily those of SAE International. The author is solely responsible for the content of the paper. ISSN 0148-7191