Modeling the development of visual perception with computational vision


European Conference on Visual Perception 2015

Alessia Vignolo1,2, Nicoletta Noceti1, Alessandra Sciutti2, Francesco Rea2, Francesca Odone1, Giulio Sandini2

1 DIBRIS, Università degli Studi di Genova, Italy

2 RBCS, Istituto Italiano di Tecnologia, Italy

The long-term goal of our work is to model the development of visual perception with computational tools, bridging...

[Diagram labels: COGNITIVE SCIENCE, ROBOTICS, COMPUTER VISION]

BACKGROUND & MOTIVATIONS

The development of social skills is a complex process that eventually brings humans to full social awareness.

✦ Soon after birth, babies start perceiving the presence of potentially interacting agents by detecting biological motion in the scene (Simion et al., 2008)

✦ Later, visual perception evolves to decode more refined kinematic properties, allowing them to:

• Anticipate the partner’s goal during interaction (Kanakogi & Itakura, 2011)

• Appreciate different motion nuances, i.e. the manner of the motion (Lohan et al., 2014)

CONTRIBUTIONS

✦ Taking inspiration from the Two-Thirds Power Law to design a perceptually inspired model of visual motion perception

✦ Establishing a connection between cognitive science findings and empirical observations from video analysis

✦ Setting the basis for natural Human-Robot Interaction (HRI), following a developmental robotics approach (Sandini et al., 1997) and with specific reference to the earliest stages of development (limited perception capabilities!)

OUR PIPELINE

Video stream → Motion segmentation (OPTICAL FLOW) → Motion description (DYNAMIC FEATURES) → Motion modeling (MACHINE LEARNING)

THE TWO-THIRDS POWER LAW

✦ It governs the relationship between the shape (i.e. the trajectory) and the law of motion, and is considered a well-known invariant property of human movements, also related to the minimum-jerk principle (Viviani & Flash, 1995)

✦ Evidence of the law has been observed especially for drawing end-point movements (Lacquaniti et al., 1983; Terzuolo & Viviani, 1991), but also for eye motion (Viviani, 1997) and locomotion (Vieilledent et al., 2001), and the law has been used for movement prediction (Kandel et al., 2000).

$$V(t) = K(t)\left(\frac{R(t)}{1 + \alpha(t)R(t)}\right)^{\beta}$$

where $V(t)$ is the tangential velocity and $R(t)$ is the radius of curvature. Since the angular velocity is $A(t) = V(t)/R(t)$ and the curvature is $C(t) = 1/R(t)$, the case $\alpha(t) = 0$ gives

$$\alpha(t) = 0 \Rightarrow A(t) = K(t)\,C(t)^{1-\beta}$$

MOTION SEGMENTATION, DESCRIPTION and MODELING

Optical flow magnitude is thresholded and used to compute a motion saliency map, from which the moving region (arm) is detected.

The arm region is tracked over time to obtain a spatial trajectory on which dynamic features are computed:

$$F(p_i(t)) = [\hat{V}_i(t), \hat{C}_i(t), \hat{R}_i(t), \hat{A}_i(t)]$$

Estimated tangential velocity:

$$\hat{V}_i(t) = \sqrt{u_i(t)^2 + v_i(t)^2 + \Delta_t^2}$$

Estimated curvature:

$$\hat{C}_i(t) = \frac{\|\hat{V}_i(t) \times \hat{A}_i(t)\|}{\|\hat{V}_i(t)\|^3}$$

Estimated radius of curvature:

$$\hat{R}_i(t) = \frac{1}{\hat{C}_i(t)}$$

Estimated angular velocity:

$$\hat{A}_i(t) = \frac{\hat{V}_i(t)}{\hat{R}_i(t)}$$
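The estimated dynamic features can be sketched in a few lines of code. This is a minimal illustration and not the authors' implementation. It assumes that $u_i(t)$ and $v_i(t)$ are the optical-flow components of the tracked point, that the velocity is treated as the 3-D vector $(u, v, \Delta_t)$, and that the vector paired with the velocity in the curvature cross product is a finite-difference acceleration. The function name `dynamic_features` and the sample values are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean norm of a vector given as a tuple."""
    return math.sqrt(sum(x * x for x in a))

def dynamic_features(flow, dt=1.0):
    """Return one [V, C, R, A] list per frame, starting from the second frame.

    flow: sequence of (u, v) flow components of the tracked point (assumed input).
    dt:   temporal step between frames (the Delta_t term); must be positive.
    """
    features = []
    for (pu, pv), (u, v) in zip(flow, flow[1:]):
        velocity = (u, v, dt)
        # ASSUMPTION: acceleration is the frame-to-frame difference of the flow.
        acceleration = (u - pu, v - pv, 0.0)
        speed = norm(velocity)                      # estimated tangential velocity
        curvature = norm(cross(velocity, acceleration)) / speed ** 3
        radius = math.inf if curvature == 0.0 else 1.0 / curvature
        angular = speed / radius                    # estimated angular velocity
        features.append([speed, curvature, radius, angular])
    return features

# Constant flow: no change of direction, so the curvature is zero.
straight = dynamic_features([(3.0, 4.0)] * 3)
# Flow that rotates by 90 degrees between two frames: non-zero curvature.
turning = dynamic_features([(1.0, 0.0), (0.0, 1.0)])
```

With constant flow the curvature vanishes, the radius is infinite and the angular velocity is zero. Any change in flow direction produces a positive curvature.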

EXPERIMENTS on the TWO-THIRDS POWER LAW for DRAWING MOVEMENTS

✦ We analysed the validity of the Two-Thirds Power Law on videos of drawing movements, showing 5 subjects tracing ellipses of 3 different sizes and with 2 different orientations. For each case we acquired two videos showing ten repetitions of the drawing.

• Principal Component Analysis (PCA): the slope is estimated using the principal eigenvector of the covariance matrix of the data

• Levenberg-Marquardt (LM): both the slope and the k parameter are estimated by curve fitting in a least-squares sense

Two-way repeated-measures ANOVAs with ELLIPSE SIZE (3 levels: small, medium, big) and ORIENTATION (2 levels: straight, oblique) as factors, with Greenhouse-Geisser correction, did not reveal any significant effect of ellipse size or orientation on Beta, nor any interaction.

EXPERIMENTS on BIOLOGICAL MOTION CLASSIFICATION

What happens with a more general class of human movements?

✦ We acquired videos of 4 subjects performing sequences of actions typical of an interaction setting (such as pointing, repositioning or lifting objects, ...), and a set of videos showing non-biological dynamic events (toy cars, levers, pendulum, bouncing and rolling balls, ...). The scene is observed from two slightly different viewpoints to evaluate the level of invariance of our model

✦ An analysis of the beta exponent reveals that, in the current model, it cannot be the only criterion for classifying an event as biological or non-biological

• The average 1-beta exponent is 0.65 for view I and 0.63 for view II, but with higher std. dev.

• Some non-biological events (e.g. the pendulum) also conform to the Two-Thirds Power Law

✦ We resort to an SVM classifier equipped with a Multi-Cue kernel (Noceti & Odone, 2012) to model and then recognize classes of biological and non-biological dynamic events (Noceti et al., 2015)

Exp. scenario 1. Both training and test on view I: best acc. 89%

Exp. scenario 2. Training on view I and test on view II: best acc. 88%

[Figure: bar chart, vertical axis from 0 to 0.9; legend: Features centroid, Features histogram]
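The PCA-based slope estimate used in the drawing experiments can be illustrated on synthetic data. This is a minimal sketch and not the authors' code. It assumes that the fitted data are (log curvature, log angular velocity) pairs, so that the slope equals 1 - beta. The function names and the test ellipse, traced with harmonic motion, are hypothetical choices. An ellipse traced this way is known to satisfy the law exactly, with slope 2/3.

```python
import math

def pca_slope(xs, ys):
    """Slope of the principal axis of the 2-D point cloud (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    if sxy == 0.0:
        raise ValueError("principal axis is axis-aligned; slope is not defined here")
    # Largest eigenvalue of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    lam = 0.5 * (sxx + syy) + math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    # The principal eigenvector is proportional to (sxy, lam - sxx).
    return (lam - sxx) / sxy

def ellipse_log_samples(a, b, n=200):
    """Log curvature and log angular velocity along x = a cos t, y = b sin t."""
    log_c, log_a = [], []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        dx, dy = -a * math.sin(t), b * math.cos(t)      # velocity
        ddx, ddy = -a * math.cos(t), -b * math.sin(t)   # acceleration
        speed = math.hypot(dx, dy)
        curvature = abs(dx * ddy - dy * ddx) / speed ** 3
        angular = speed * curvature                     # A = V / R = V * C
        log_c.append(math.log(curvature))
        log_a.append(math.log(angular))
    return log_c, log_a

log_c, log_a = ellipse_log_samples(2.0, 1.0)
slope = pca_slope(log_c, log_a)   # estimate of 1 - beta
beta = 1.0 - slope
```

Unlike ordinary least squares on one variable, the principal-axis fit treats both log quantities symmetrically, which is a common reason to prefer it when both are measured with noise.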

CURRENT DEVELOPMENTS

Towards the estimation of actions affinity...

✦ The observed dynamic events can be roughly categorized according to their mutual affinity

✦ We compute the affinity matrix of the similarities between all observed events, and then infer a graph of predominant affinities that reveals the presence of semantic connections (Noceti et al., 2015)

Figure 7: A visual sketch of the action affinities inferred by means of the analysis (in green transitive actions, in red
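The idea of an affinity matrix followed by a graph of predominant affinities can be sketched as follows. The poster does not specify the similarity measure or the graph-inference rule, so everything here is an assumption made for illustration: a Gaussian similarity on hypothetical two-dimensional event descriptors, and a graph that links each event to its single most similar neighbour. The event names and descriptor values are invented for the example.

```python
import math

def affinity_matrix(descriptors, sigma=1.0):
    """Gaussian similarity between every pair of descriptors (assumed measure)."""
    n = len(descriptors)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(descriptors[i], descriptors[j]))
            aff[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return aff

def predominant_edges(aff):
    """Undirected edges linking each node to its most similar other node."""
    edges = set()
    for i, row in enumerate(aff):
        others = [j for j in range(len(row)) if j != i]
        if not others:
            continue
        best = max(others, key=lambda j: row[j])
        edges.add((min(i, best), max(i, best)))
    return sorted(edges)

# Hypothetical events and descriptors, for illustration only.
names = ["pointing", "lifting", "pendulum", "rolling ball"]
descriptors = [(0.0, 0.1), (0.1, 0.0), (2.0, 2.1), (2.1, 2.0)]

aff = affinity_matrix(descriptors)
edges = predominant_edges(aff)
named_edges = [(names[i], names[j]) for i, j in edges]
```

In this toy example the two human actions end up linked to each other, as do the two non-biological events. Real descriptors would come from the dynamic features of the observed events.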

Towards a developmental HRI...

✦ A prototype application of the model is currently under development on the iCub platform (Sandini et al., 1997)

✦ The implementation is designed to meet the efficiency requirements of an HRI application (e.g. real-time processing, robustness to low spatial and temporal resolution, ...)
