Benchmarking Vulnerability Detection Tools for Web Services
ICWS 2010
Nuno Antunes, Marco Vieira {nmsa, mvieira}@dei.uc.pt
CISUC, Department of Informatics Engineering, University of Coimbra, Portugal
Outline
The problem
Benchmarking Approach
Benchmark for SQL Injection vulnerability detection tools
Benchmarking Example
Conclusions and Future Work
Web Services
Web Services are becoming a strategic component in a wide range of organizations
Web Services are extremely exposed to attacks
Any existing vulnerability will most probably be uncovered/exploited
Hackers are moving their focus to applications' code
Both providers and consumers need to assess services’ security
Common vulnerabilities in Web Services
300 Public Web Services analyzed
Vulnerability detection tools
Vulnerability Scanners
Easy and widely used way to test applications in search of vulnerabilities
Use fuzzing techniques to attack applications
Avoid the repetitive and tedious task of doing hundreds or even thousands of tests by hand
Static Code Analyzers
Analyze the code without actually executing it
The analysis varies depending on the tool's sophistication
Provide a way to highlight possible coding errors
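As a minimal illustration (not code from the benchmark's workload; the class, table, and column names are hypothetical), the kind of flaw both families of tools target is a SQL query built by string concatenation. A scanner would fuzz the customerId parameter with payloads such as ' OR '1'='1, while a static code analyzer flags the concatenation itself:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class CustomerQueries {

        // Vulnerable: client input is concatenated into the SQL string, so a
        // payload like "' OR '1'='1" changes the query logic (SQL Injection).
        public ResultSet getCustomerVulnerable(Connection con, String customerId)
                throws SQLException {
            Statement stmt = con.createStatement();
            return stmt.executeQuery(
                    "SELECT * FROM customer WHERE c_id = '" + customerId + "'");
        }

        // Not vulnerable: a parameterized query keeps data separate from SQL code.
        public ResultSet getCustomerSafe(Connection con, String customerId)
                throws SQLException {
            PreparedStatement ps =
                    con.prepareStatement("SELECT * FROM customer WHERE c_id = ?");
            ps.setString(1, customerId);
            return ps.executeQuery();
        }
    }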
Using vulnerability detection tools…
Tools are often expensive
Many tools can generate conflicting results
Due to time constraints or resource limitations, developers have to:
Select a tool from the set of tools available
Rely on that tool to detect vulnerabilities
However…
Previous work shows that the effectiveness of many of these tools is low
How to select the tools to use?
How to select the tools to use?
Existing evaluations have limited value:
By the limited number of tools used
By the limited representativeness of the experiments
Developers urgently need a practical way to compare alternative tools concerning their ability to detect vulnerabilities
The solution: Benchmarking!
Benchmarking vulnerability detection tools
Benchmarks are standard approaches to evaluate and compare different systems according to specific characteristics
Evaluate and compare the existing tools
Select the most effective tools
Guide the improvement of methodologies
Just as performance benchmarks have contributed to improving the performance of systems
Benchmarking Approach
Workload:
Work that a tool must perform during the benchmark execution
Measures:
Characterize the effectiveness of the tools
Must be easy to understand
Must allow the comparison among different tools
Procedure:
The procedures and rules that must be followed during the benchmark execution
Workload
Services used to exercise the vulnerability detection tools
Domain defined by:
Class of web services (e.g., SOAP, REST)
Types of vulnerabilities (e.g., SQL Injection, XPath Injection, file execution)
Vulnerability detection approaches (e.g., penetration testing, static analysis, anomaly detection)
Different types of workload can be considered:
Real workloads
Realistic workloads
Synthetic workloads
Measures
Computed from the information collected during the benchmark run
Relative measures
Can be used for comparison or for improvement and tuning
Different tools report vulnerabilities in different ways
Precision
Recall
F-Measure
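These measures follow the standard definitions, computed from the true positives (TP), false positives (FP), and false negatives (FN) observed during the benchmark run; the F-Measure is the harmonic mean of precision and recall:

\[
\text{Precision} = \frac{TP}{TP + FP}
\qquad
\text{Recall} = \frac{TP}{TP + FN}
\qquad
\text{F-Measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]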
Procedure
Step 1: Preparation
Select the tools to be benchmarked
Step 2: Execution
Use the tools under benchmarking to detect vulnerabilities in the workload
Step 3: Measures calculation
Analyze the vulnerabilities reported by the tools and calculate the measures
Step 4: Ranking and selection
Rank the tools using the measures
Select the most effective tool
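A minimal sketch (Java 16+) of how Steps 2 through 4 could be automated, assuming hypothetical Tool and Result types that do not come from the presentation: each tool is run against the workload, its reports are matched against the list of known vulnerabilities, and the tools are then ranked by F-Measure:

    import java.util.Comparator;
    import java.util.List;
    import java.util.Set;

    // Hypothetical interface: a tool analyzes the workload services and reports
    // the identifiers of the vulnerabilities it claims to have found.
    interface Tool {
        String name();
        Set<String> detect(List<String> workloadServices);
    }

    final class BenchmarkRunner {

        record Result(String tool, double precision, double recall, double fMeasure) {}

        // Steps 2 and 3: run one tool and compute its measures against the
        // known (seeded) vulnerabilities of the workload.
        static Result evaluate(Tool tool, List<String> workload, Set<String> knownVulns) {
            Set<String> reported = tool.detect(workload);
            long tp = reported.stream().filter(knownVulns::contains).count();
            long fp = reported.size() - tp;
            long fn = knownVulns.size() - tp;
            double precision = (tp + fp) == 0 ? 0 : (double) tp / (tp + fp);
            double recall = (tp + fn) == 0 ? 0 : (double) tp / (tp + fn);
            double f = (precision + recall) == 0 ? 0
                    : 2 * precision * recall / (precision + recall);
            return new Result(tool.name(), precision, recall, f);
        }

        // Step 4: rank the tools by F-Measure, best first.
        static List<Result> rank(List<Tool> tools, List<String> workload, Set<String> knownVulns) {
            return tools.stream()
                    .map(t -> evaluate(t, workload, knownVulns))
                    .sorted(Comparator.comparingDouble(Result::fMeasure).reversed())
                    .toList();
        }
    }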
A Benchmark for SQL Injection Vulnerability Detection Tools
This benchmark targets the following domain:
Class of web services: SOAP web services
Type of vulnerabilities: SQL Injection
Vulnerability detection approaches: penetration testing, static code analysis, and runtime anomaly detection
Workload composed of code from standard benchmarks:
TPC-App
TPC-W*
TPC-C*
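For illustration only, a workload service in this domain is a SOAP operation whose implementation issues SQL queries over client-supplied input. The JAX-WS-style sketch below is hypothetical (the service, table, and column names and the JDBC URL are invented, loosely in the style of the TPC-based services listed in the workload table that follows); it is not the actual benchmark code:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.jws.WebMethod;
    import javax.jws.WebService;

    // Hypothetical sketch of a workload operation: a JAX-WS SOAP service whose
    // implementation queries the database with client-supplied input.
    @WebService
    public class ItemLookupService {

        // Placeholder connection string; the real services have their own setup.
        private static final String DB_URL = "jdbc:postgresql://localhost/workload";

        @WebMethod
        public String itemTitle(String itemId) throws SQLException {
            try (Connection con = DriverManager.getConnection(DB_URL);
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT i_title FROM item WHERE i_id = ?")) {
                ps.setString(1, itemId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("i_title") : null;
                }
            }
        }
    }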
Workload

Benchmark   Service Name          Vuln. Inputs   Vuln. Queries   LOC    Avg. Cyclomatic Complexity
TPC-App     ProductDetail         0              0               121    5
TPC-App     NewProducts           15             1               103    4.5
TPC-App     NewCustomer           1              4               205    5.6
TPC-App     ChangePaymentMethod   2              1               99     5
TPC-C       Delivery              2              7               227    21
TPC-C       NewOrder              3              5               331    33
TPC-C       OrderStatus           4              5               209    13
TPC-C       Payment               6              11              327    25
TPC-C       StockLevel            2              2               80     4
TPC-W       AdminUpdate           2              1               81     5
TPC-W       CreateNewCustomer     11             4               163    3
TPC-W       CreateShoppingCart    0              0               207    2.67
TPC-W       DoAuthorSearch        1              1               44     3
TPC-W       DoSubjectSearch       1              1               45     3
TPC-W       DoTitleSearch         1              1               45     3
TPC-W       GetBestSellers        1              1               62     3
TPC-W       GetCustomer           1              1               46     4
TPC-W       GetMostRecentOrder    1              1               129    6
TPC-W       GetNewProducts        1              1               50     3
TPC-W       GetPassword           1              1               40     2
TPC-W       GetUsername           0              0               40     2
            Total                 56             49              2654   14
Enhancing the workload
To create a more realistic workload, we created new versions of the services
This way, for each web service we have:
One version without known vulnerabilities
One version with N vulnerabilities
N versions with one vulnerable SQL query each
This accounts for:
Services + versions: 80
Vulnerable inputs: 158
Vulnerable lines: 87
Step 1: Preparation
The tools under benchmarking:

Provider         Tool                        Technique
HP               WebInspect                  Penetration testing
IBM              Rational AppScan            Penetration testing
Acunetix         Web Vulnerability Scanner   Penetration testing
Univ. Coimbra    VS.WS                       Penetration testing
Univ. Maryland   FindBugs                    Static code analysis
SourceForge      Yasca                       Static code analysis
JetBrains        IntelliJ IDEA               Static code analysis
Univ. Coimbra    CIVS-WS                     Anomaly detection

Vulnerability scanners: VS1, VS2, VS3, VS4
Static code analyzers: SA1, SA2, SA3
Step 2: Execution
Results for Penetration Testing

Tool   % TP     % FP
VS1    32.28%   54.46%
VS2    24.05%   61.22%
VS3    1.9%     0%
VS4    24.05%   43.28%
Step 2: Execution
Results for Static Code Analysis and Anomaly Detection

Tool   % TP     % FP
CIVS   79.31%   0%
SA1    55.17%   7.69%
SA2    100%     36.03%
SA3    14.94%   67.50%
Step 3: Measures calculation
Benchmarking results
Tool      F-Measure   Precision   Recall
CIVS-WS   0.885       1           0.793
SA1       0.691       0.923       0.552
SA2       0.780       0.640       1
SA3       0.204       0.325       0.149
VS1       0.378       0.455       0.323
VS2       0.297       0.388       0.241
VS3       0.037       1           0.019
VS4       0.338       0.567       0.241
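As a sanity check, these values follow directly from the Step 2 rates; for example, for CIVS-WS (0% false positives and 79.31% of the known vulnerabilities detected):

\[
\text{Precision} = 1, \qquad \text{Recall} = 0.793, \qquad
\text{F-Measure} = \frac{2 \cdot 1 \cdot 0.793}{1 + 0.793} \approx 0.885
\]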
Step 4: Ranking and selection
Rank the tools using the measures
Select the most effective tool
Inputs (vulnerability scanners):

Criteria    1st    2nd              3rd    4th
F-Measure   VS1    VS4              VS2    VS3
Precision   VS3    VS4              VS1    VS2
Recall      VS1    VS2/VS4 (tied)          VS3

Queries (static code analyzers and anomaly detection):

Criteria    1st    2nd    3rd    4th
F-Measure   CIVS   SA2    SA1    SA3
Precision   CIVS   SA1    SA2    SA3
Recall      SA2    CIVS   SA1    SA3
Benchmark properties
Portability
Non-intrusiveness
Simple to use
Repeatability
Representativeness
Conclusions and future work
We proposed an approach to benchmark the effectiveness of vulnerability detection tools for web services
A concrete benchmark was implemented
Targeting tools able to detect SQL Injection vulnerabilities
A benchmarking example was conducted
Results show that the benchmark can be used to assess and compare different tools
Future work includes:
Extending the benchmark to other types of vulnerabilities
Applying the benchmarking approach to define benchmarks for other types of web services
Questions?
Nuno Antunes
Center for Informatics and Systems, University of Coimbra
[email protected]
Benchmark Representativeness
Influenced by the representativeness of the workload
May not be representative of all SQL Injection patterns
However, what is important is to compare tools in a relative manner
To verify this, we replaced the workload with a real workload
Consisting of a small set of third-party web services
Benchmark Representativeness
Inputs:

Criteria    1st               2nd    3rd    4th
F-Measure   VS1               VS4    VS2    VS3
Precision   VS3/VS4 (tied)           VS2    VS1
Recall      VS1               VS4    VS2    VS3

Queries:

Criteria    1st                2nd    3rd    4th
F-Measure   CIVS               SA2    SA1    SA3
Precision   CIVS               SA1    SA2    SA3
Recall      SA2/CIVS (tied)           SA1    SA3