Ontology-Based Semantic Image Interpretation

4 downloads 1983 Views 7MB Size Report
Sep 23, 2015 ... Relevance: ▷ semantic content based image retrieval via a query language; ... an ontology provides the formal semantics and constraints.
Ontology-Based Semantic Image Interpretation Ivan Donadello1,2 1 Fondazione 2 DISI

Luciano Serafini (Advisor)1

Bruno Kessler, Via Sommarive, 18 I-38123, Trento, Italy

University of Trento, Via Sommarive, 9 I-38123, Trento, Italy

September 23, 2015

1 / 31

Context

I

Huge diffusion of digital images in recent years;

I

lack of semantic based retrieval systems for images, that is no complex queries: “a person riding a horse on a meadow”;

I

semantic gap between numerical image features and human semantics;

I

need a method that automatically understands the semantic content of images.

Relevance: I

semantic content based image retrieval via a query language;

I

semantic content enrichment with Semantic Web resource.

2 / 31

Problem Statement Semantic Image Interpretation (SII) is the task of extracting a graph representing the image content;

3 / 31

Problem Statement Semantic Image Interpretation (SII) is the task of extracting a graph representing the image content; I nodes represent visible and occluded objects in the image and their properties;

3 / 31

Problem Statement Semantic Image Interpretation (SII) is the task of extracting a graph representing the image content; I nodes represent visible and occluded objects in the image and their properties; I arcs represent relations between objects;

3 / 31

Problem Statement Semantic Image Interpretation (SII) is the task of extracting a graph representing the image content; I nodes represent visible and occluded objects in the image and their properties; I arcs represent relations between objects; I alignment between visible object regions and nodes;

3 / 31

Problem Statement Semantic Image Interpretation (SII) is the task of extracting a graph representing the image content; I nodes represent visible and occluded objects in the image and their properties; I arcs represent relations between objects; I alignment between visible object regions and nodes; I an ontology provides the formal semantics and constraints that guide the graph construction;

3 / 31

Aim of the Doctoral Thesis

I

Define a theoretical reference framework for SII;

I

implementation of a system for SII; graph construction guided by mixing:

I

I I

I

numeric information (low-level features of the image); symbolic information (high-level constraints available in the ontology);

perform system evaluation on a ground truth of semantically interpreted images.

4 / 31

State-of-the-art on SII Logic-Based Works (2014) I

I

a first description of the image (basic object recognition and their relations) is given;

Neural Networks-based (NN) works (2015) I

Caption generation;

model generation (deduction or abduction) by exploiting the ontology.

5 / 31

State-of-the-art on SII Logic-Based Works (2014) I

I

a first description of the image (basic object recognition and their relations) is given;

Neural Networks-based (NN) works (2015) I

Caption generation;

model generation (deduction or abduction) by exploiting the ontology.

Limitations I

Logic-based works: no consideration for low-level features;

I

NN works: no formal semantics and a priori knowledge. 5 / 31

SII Pipeline

6 / 31

SII Pipeline

7 / 31

SII Pipeline

8 / 31

Our Vision of SII Finding the maximum of a joint search space composed of semantic features and image features.

9 / 31

Theoretical Framework Background Knowledge encoded in a Description Logic ontology O.

Labelled picture is a pair P = hS, Li where S are segments of the image, L are (weighted) labels from Σ.

10 / 31

The Partial Model I

A picture is a partial view of the real world;

I

A partial model Ip is a structure that can be extended to a model of O;

11 / 31

The Partial Model I

A picture is a partial view of the real world;

I

A partial model Ip is a structure that can be extended to a model of O; . A partial model of an ontology O is an interpretation Ip = (∆Ip , ·Ip ) of O: there exists a model I = (∆I , ·I ) with ∆Ip ⊆ ∆I and ·Ip is a restriction of ·I on ∆Ip .

I

11 / 31

The Partial Model I

A picture is a partial view of the real world;

I

A partial model Ip is a structure that can be extended to a model of O; . A partial model of an ontology O is an interpretation Ip = (∆Ip , ·Ip ) of O: there exists a model I = (∆I , ·I ) with ∆Ip ⊆ ∆I and ·Ip is a restriction of ·I on ∆Ip . A semantically interpreted picture is a triple (P, Ip , G)O ;

I

I

11 / 31

The Most Plausible Partial Model Many partial models for a picture

Searching for the partial model that best fits the picture content, i.e. the most plausible partial model. 12 / 31

The Semantic Image Interpretation Problem Formalization I

A cost function S assigns a cost to semantically interpreted pictures (P, Ip , G)O ;

I

S(P, Ip , G)O expresses the gap between low-level features of P and objects and relations encoded in Ip ;

I

the most plausible partial model Ip∗ minimizes S: Ip∗ = argmin S(P, Ip , G)O Ip |=p O

G⊆∆Ip ×S

I

the semantic image interpretation problem is the construction of (P, Ip∗ , G)O that minimizes S.

13 / 31

Case Study: Clustering-Based Cost Function

I

I

Task: part-whole recognition, i.e., discovery complex objects from their parts; part-whole recognition can be seen as a clustering problem; I

parts of the same object tend to be grouped together;

14 / 31

Case Study: Clustering-Based Cost Function

I

I

Task: part-whole recognition, i.e., discovery complex objects from their parts; part-whole recognition can be seen as a clustering problem; I

I

parts of the same object tend to be grouped together;

cost function as a clustering optimisation function.

14 / 31

Case Study: Clustering-Based Cost Function I

Clustering: grouping a set of input elements into groups (clusters) such that:

15 / 31

Case Study: Clustering-Based Cost Function I

Clustering: grouping a set of input elements into groups (clusters) such that:

I

clustering solution of (P, Ip , G)O is C = {Cd | d ∈ ∆Ip } where Cd = {G(d 0 ) | d 0 ∈ ∆Ip , hd, d 0 i ∈ hasPartIp };

I

d represents the composite object, the centroid of the cluster; 15 / 31

Case Study: Clustering-Based Cost Function Mixing numeric and semantic features: I

grounding distance δG (d, d 0 ): the Euclidean distance between the centroids of G(d) and G(d 0 );

I

semantic distance δO (d, d 0 ) is the shortest path in O:

I I

if Muzzle(d 0 ), Tail(d 00 ) then δO (d 0 , d 00 ) = 2; if Muzzle(d 0 ), Horse(d) then δO (d 0 , d) = 1;

16 / 31

Case Study: Clustering-Based Cost Function I

Inter-cluster distance Γ:

I

Intra-cluster distance Λ:

I

Cost function: S(P, Ip , G)O = α · Γ + (1 − α) · Λ 17 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

18 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

19 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

20 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

21 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

22 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

23 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

24 / 31

Minimising the Cost Function

The Clustering Part-Whole Algorithm (CPWA) approximates the minimum of the cost function.

25 / 31

Evaluation Comparing the predicted partial model with the ground truth, two measures: I grouping (GRP):

26 / 31

Evaluation Comparing the predicted partial model with the ground truth, two measures: I grouping (GRP):

I

complex-object type prediction (COP):

26 / 31

Evaluation Comparing the predicted partial model with the ground truth, two measures: I grouping (GRP):

I

complex-object type prediction (COP):

I

precision, the fraction of predicted pairs that are correct; recall, the fraction of correct pairs that are predicted.

I

26 / 31

Experiments and Results Experiments Setting I

Ground truth of 203 manually obtained labelled pictures on the urban scene domain;

I

manually built ontology with basic formalism of meronymy of the domain;

I

task: discovering complex objects from their parts in pictures.

Results

CPWA

precGRP 0.61

recGRP 0.89

F 1GRP 0.67

precCOP 0.73

recCOP 0.75

F 1COP 0.74

27 / 31

Experiments and Results Experiments Setting I

Ground truth of 203 manually obtained labelled pictures on the urban scene domain;

I

manually built ontology with basic formalism of meronymy of the domain;

I

task: discovering complex objects from their parts in pictures.

Results

CPWA Baseline I

precGRP 0.61 0.45

recGRP 0.89 0.71

F 1GRP 0.67 0.48

precCOP 0.73 0.66

recCOP 0.75 0.69

F 1COP 0.74 0.66

Baseline: clustering without semantics; 28 / 31

Experiments and Results Experiments Setting I Ground truth of 203 manually obtained labelled pictures on the urban scene domain; I manually built ontology with basic formalism of meronymy of the domain; I task: discovering complex objects from their parts in pictures. Results

CPWA++ CPWA Baseline I I

precGRP 0.67 0.61 0.45

recGRP 0.81 0.89 0.71

F 1GRP 0.71 0.67 0.48

precCOP 0.71 0.73 0.66

recCOP 0.82 0.75 0.69

F 1COP 0.86 0.74 0.66

Baseline: clustering without semantics; CPWA + +: improved version of CPWA; 29 / 31

Conclusions and Future Work

I

Theoretical framework for SII: partial model that minimizes a cost function;

I

cost function as a clustering optimization function;

I

clustering algorithm that approximates the cost function;

I

explicitly using semantics improves the results; future work:

I

30 / 31

Conclusions and Future Work

I

Theoretical framework for SII: partial model that minimizes a cost function;

I

cost function as a clustering optimization function;

I

clustering algorithm that approximates the cost function;

I

explicitly using semantics improves the results; future work:

I

I I I I

integrating of semantic segmentation algorithms; generalizing to other relations; extending the evaluation to a standard dataset; using general purposes ontologies;

30 / 31

Thanks for listening Questions?

31 / 31