
TEAM PERFORMANCE IN PROFESSIONAL BASKETBALL: AN APPROACH BASED ON NEURAL NETWORKS AND GENETIC PROGRAMMING (Research in progress: very preliminary results)

José Manuel Sánchez Santos, Department of Applied Economics
Ana Belen Porto Pazos, Department of Information and Communication Technologies
Alejandro Pazos Sierra, Department of Information and Communication Technologies
University of A Coruña, Spain

XIII IASE and III ESEA CONFERENCE OF SPORTS Prague, May 2011

OUTLINE
• Motivation
• Methodological approach: a brief description
• Empirical strategy
  – Variables and data
  – Artificial Neural Networks: results
  – Genetic Programming: results
• Concluding remarks
• Further research

Motivation
• Aim:
  – Generic: to explore the possibilities of applying neural network and genetic programming techniques to the analysis of team performance in sports
  – Specific: to provide some insight into the relative significance of each input (play action) for the result of a basketball game
• Are new modeling approaches needed? Why ANNs and GP?
  – Powerful models for statistical data analysis (they capture non-linear relationships)
  – No theoretical evidence about the functional form
  – Novelty: only two previous works
    • Condon, Golden & Wasil (1999), Computers & Operations Research
    • Hall & Seaman (2009), International Journal of Sport Management and Marketing
• Why Spanish basketball?
  – Previous work as a reference: Sánchez et al. (2007), European Sport Management Quarterly

METHODOLOGICAL APPROACH

• Heuristic methods: the discovery of the functional form is automated
  – Based on neural networks (ANNs)
  – Based on evolutionary computation
    • Genetic algorithms
    • Evolutionary programming
    • Genetic programming (GP)
• Regression methods: the functional form is investigator-specified

ARTIFICIAL NEURAL NETWORKS (ANNs)

• A branch of artificial intelligence
• Originally inspired by neuroscience
• “A learning machine”: a computational method designed to operate like the biological nervous system
• An ANN consists of a set of interconnected processing neurons, called units, distributed across layers
• Feedforward neural networks, or Multilayer Perceptrons (MLPs)

[Figure: Single-output, three-layer feed-forward neural network, MLP (n, m, 1). Input layer: x0, X1 … Xn. Hidden layer: h0, H1 … Hm. Output layer: Y. Connections are labeled with weights w11 … wnm, hidden biases c1 … cm, and output bias b0.]

The unknown function to estimate from the available information:

$$f(x, Z) = F\left(b_0 + \sum_{j=1}^{m} G\left(c_j + \sum_{i=1}^{n} x_i w_{ij}\right) b_j\right)$$

where
F: output layer activation function
G: hidden layer activation function
n: number of input units
m: number of hidden units
x: inputs vector (i = 1…n)
Z: weights vector (parameters):
  b0: output bias
  cj: hidden unit biases (j = 1…m)
  wij: weight from input unit i to hidden unit j
  bj: weight from hidden unit j to the output
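The MLP (n, m, 1) mapping f(x, Z) = F(b0 + Σj bj·G(cj + Σi xi·wij)) can be evaluated directly from its definition. The sketch below is only illustrative: the slides do not state which activation functions were used, so G = tanh and F = identity are assumptions, and the weights are hand-picked for the example rather than taken from the study.

```python
import math

def mlp_forward(x, w, c, b, b0, G=math.tanh, F=lambda s: s):
    """Evaluate f(x, Z) = F(b0 + sum_j b[j] * G(c[j] + sum_i x[i] * w[i][j])).

    G = tanh and F = identity are assumed defaults, not the authors' choice.
    """
    n, m = len(x), len(c)
    # Hidden layer: each unit j applies G to its bias plus the weighted inputs.
    hidden = [G(c[j] + sum(x[i] * w[i][j] for i in range(n))) for j in range(m)]
    # Output layer: F applied to the output bias plus the weighted hidden units.
    return F(b0 + sum(b[j] * hidden[j] for j in range(m)))

# Tiny MLP (2, 2, 1) with hypothetical weights.
x = [1.0, 2.0]
w = [[0.5, -0.5],    # w[i][j]: weight from input i to hidden unit j
     [0.25, 0.25]]
c = [0.0, 0.0]       # hidden unit biases
b = [1.0, 1.0]       # hidden-to-output weights
b0 = 0.1             # output bias
y = mlp_forward(x, w, c, b, b0)
```

With these numbers the first hidden unit receives 1.0 and the second receives 0.0, so y equals 0.1 + tanh(1).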

Genetic Programming (GP)
• A branch of artificial intelligence
• An optimization technique, implemented as a computer program, based on Darwin’s notion of survival of the fittest
• A non-parametric procedure that searches for the best fit over a large set of alternative functional forms
• A search technique in which an initial set of programs/equations (the population) is recombined over successive generations, yielding better programs/equations
  – Objective: optimize programs/equations
  – Maximize fitness
  – Minimize error

Flow Diagram of the GP Architecture
• Assemble initial population of equations
• Evaluate fitness of each equation
• Was a fit equation found?
  – Yes: terminate search
  – No: was the maximum number of equations reached?
    • Yes: terminate search
    • No: breed new population of equations and return to the evaluation step
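The GP search loop (assemble a population, evaluate fitness, stop if a fit equation is found or the budget is exhausted, otherwise breed a new population) can be sketched as follows. This is a toy illustration, not the authors' implementation: the function set (+, −, ×), the tree encoding, the operators, the size cap, the use of a generation limit as the budget, and every parameter value are assumptions.

```python
import math
import operator
import random

OPS = [operator.add, operator.sub, operator.mul]   # assumed function set

def random_tree(rng, n_vars, depth):
    """Random equation: leaves are input variables or constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_vars))
        return ("const", rng.uniform(-1.0, 1.0))
    return ("op", rng.choice(OPS),
            random_tree(rng, n_vars, depth - 1), random_tree(rng, n_vars, depth - 1))

def evaluate(tree, x):
    if tree[0] == "x":
        return x[tree[1]]
    if tree[0] == "const":
        return tree[1]
    return tree[1](evaluate(tree[2], x), evaluate(tree[3], x))

def size(tree):
    return 1 if tree[0] != "op" else 1 + size(tree[2]) + size(tree[3])

def error(tree, data):
    """Mean squared error (lower error = higher fitness)."""
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    mse = total / len(data)
    return mse if math.isfinite(mse) else float("inf")

def replace_random_subtree(rng, tree, new):
    """Return a copy of tree with one randomly chosen subtree replaced by new."""
    if tree[0] != "op" or rng.random() < 0.3:
        return new
    if rng.random() < 0.5:
        return ("op", tree[1], replace_random_subtree(rng, tree[2], new), tree[3])
    return ("op", tree[1], tree[2], replace_random_subtree(rng, tree[3], new))

def random_subtree(rng, tree):
    while tree[0] == "op" and rng.random() < 0.6:
        tree = rng.choice(tree[2:])
    return tree

def genetic_programming(data, n_vars, pop_size=60, max_generations=30,
                        target_error=1e-6, max_size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, n_vars, 3) for _ in range(pop_size)]  # assemble
    history = []
    for _ in range(max_generations):               # budget check (assumed: generations)
        population.sort(key=lambda t: error(t, data))                   # evaluate fitness
        best = population[0]
        history.append(error(best, data))
        if history[-1] <= target_error:            # fit equation found: terminate
            break
        parents = population[: pop_size // 2]
        children = [best]                          # elitism: keep the best equation
        while len(children) < pop_size:            # breed the new population
            a, b = rng.choice(parents), rng.choice(parents)
            child = replace_random_subtree(rng, a, random_subtree(rng, b))  # crossover
            if rng.random() < 0.2:                                          # mutation
                child = replace_random_subtree(rng, child, random_tree(rng, n_vars, 2))
            children.append(child if size(child) <= max_size else a)
        population = children
    return best, history

# Toy target y = x0 * x1 + x0 (made up for the example).
data = [((a, b), a * b + a) for a in (0.5, 1.0, 1.5) for b in (0.5, 1.0, 1.5)]
best, history = genetic_programming(data, n_vars=2)
```

Because the best equation of each generation is copied unchanged into the next one, the best error recorded in `history` can never increase from one generation to the next.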

The production process in basketball
The variables:
• Unit of analysis: basketball game
• Output: the outcome of a game (ratio of points scored)
• Inputs: the play actions considered decisive for a team to win
  – Defensive pressure: rival field goal percentage (-), number of blocked shots (+), rival turnovers (+), personal fouls committed (-)
  – Efficiency in ball handling: assists (+), turnovers (-)
  – Home-court advantage
  – Rebounding capacity: defensive rebounds (+), offensive rebounds (+)
  – Shot effectiveness: percentage of field shots made (+), percentage of free throws scored (+), blocked shots received (-), personal fouls received (+)

Variables
OUTPUT
• Y (ratio of points scored)
INPUTS
• X1 (ratio of % of field shots)
• X2 (ratio of % of free throws)
• X3 (ratio of defensive rebounds)
• X4 (ratio of offensive rebounds)
• X5 (ratio of assists)
• X6 (ratio of personal fouls)
• X7 (ratio of steals)
• X8 (ratio of turnovers)
• X9 (difference in blocked shots)
• X10 (the team plays at home)
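As an illustration of how one observation (Y, X1 … X10) might be assembled from two box scores, consider the sketch below. The slides do not spell out the construction, so several things here are assumptions: that each ratio is the team's value divided by the rival's value, that X9 is the team's blocks minus the rival's, that X10 is coded 1/0, and all dictionary keys and numbers, which are hypothetical.

```python
def build_observation(team, rival, at_home):
    """Build Y and X1..X10 for one game (assumed: team value / rival value)."""
    def ratio(key):
        # Raises ZeroDivisionError if the rival's value is 0; a real pipeline
        # would need a rule for that case.
        return team[key] / rival[key]

    return {
        "Y": ratio("points"),
        "X1": ratio("fg_pct"),       # % of field shots
        "X2": ratio("ft_pct"),       # % of free throws
        "X3": ratio("def_reb"),      # defensive rebounds
        "X4": ratio("off_reb"),      # offensive rebounds
        "X5": ratio("assists"),
        "X6": ratio("fouls"),        # personal fouls
        "X7": ratio("steals"),
        "X8": ratio("turnovers"),
        "X9": team["blocks"] - rival["blocks"],   # difference, not a ratio
        "X10": 1 if at_home else 0,               # assumed 1/0 coding
    }

# Made-up box scores, for illustration only.
team = dict(points=84, fg_pct=0.48, ft_pct=0.75, def_reb=24, off_reb=10,
            assists=15, fouls=20, steals=8, turnovers=12, blocks=3)
rival = dict(points=80, fg_pct=0.40, ft_pct=0.80, def_reb=20, off_reb=8,
             assists=12, fouls=25, steals=6, turnovers=15, blocks=1)
obs = build_observation(team, rival, at_home=True)
```

Under this convention a value of Y above 1 means the team outscored its rival.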

Data
• Official statistics of the ACB (Basketball Clubs Association) League Championship
• Seasons: 2002-2003 and 2003-2004
• 18 teams competed in the ACB over 34 league days
• 9 games are played on each regular league day
• Total sample: 612 observations (number of games played in the two seasons)
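The sample size follows from the schedule. Assuming a double round-robin regular season (an assumption, though it is what 18 teams and 34 league days imply), the arithmetic is:

$$\text{league days} = 2\,(18 - 1) = 34, \qquad \text{games per league day} = \frac{18}{2} = 9, \qquad N = 2 \text{ seasons} \times 34 \times 9 = 612$$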

Artificial Neural Network Architecture for basketball problem
[Figure: Single-output, three-layer feed-forward neural network, MLP (10, 10, 1), with inputs X1 … X10, hidden units 1 … 10, and output Y.]

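BACKPROPAGATION ALGORITHM
[Diagram: the input layer feeds the output layer through the weights; the output is compared with the target to obtain the network error, which drives the weights adjustment.]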
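The backpropagation cycle (forward pass, comparison with the target, network error, weights adjustment) can be sketched for an MLP (n, m, 1) as follows. This is a minimal illustration under assumptions the slides do not confirm: tanh hidden units, a linear output, squared-error loss, plain stochastic gradient descent, and made-up toy data and hyperparameters.

```python
import math
import random

def train_mlp(data, n, m, epochs=500, lr=0.05, seed=0):
    """Train an MLP (n, m, 1) by backpropagation; return (MSE before, MSE after)."""
    rng = random.Random(seed)
    w = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(n)]  # input -> hidden
    c = [0.0] * m                                                       # hidden biases
    b = [rng.uniform(-0.5, 0.5) for _ in range(m)]                      # hidden -> output
    b0 = 0.0                                                            # output bias

    def forward(x):
        h = [math.tanh(c[j] + sum(x[i] * w[i][j] for i in range(n))) for j in range(m)]
        return h, b0 + sum(b[j] * h[j] for j in range(m))

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    before = mse()
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            err = y - t                          # network error: output minus target
            for j in range(m):
                # Error propagated back through hidden unit j (tanh' = 1 - h^2),
                # computed before b[j] is changed.
                delta = err * b[j] * (1.0 - h[j] ** 2)
                b[j] -= lr * err * h[j]          # weights adjustment, output layer
                c[j] -= lr * delta               # weights adjustment, hidden bias
                for i in range(n):
                    w[i][j] -= lr * delta * x[i]
            b0 -= lr * err
    return before, mse()

# Toy target t = 0.5*x1 - 0.3*x2 on a small grid (made up for the example).
data = [((a, b), 0.5 * a - 0.3 * b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
before, after = train_mlp(data, n=2, m=3)
```

Comparing `before` and `after` shows whether the repeated weight adjustments reduced the network error on the training data.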

Sensitivity analysis obtained by training 100 networks and using the TEKA and Garson methods

Rank   ANN-TEKA          ANN-Garson
1      x9    14.77%      x4    18.16%
2      x3    14.49%      x3    15.39%
3      x4    11.83%      x9    14.27%
4      x2    10.98%      x1    11.89%
5      x8    10.83%      x10    9.93%
6      x7     9.75%      x7     8.66%
7      x5     9.71%      x2     7.22%
8      x1     6.23%      x8     6.90%
9      x6     5.22%      x5     4.03%
10     x10    4.12%      x6     3.56%
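For reference, Garson's method apportions importance among the inputs using the absolute values of the connection weights. The sketch below follows one commonly cited formulation for a single-output network; the slides do not say which variant was used or how the 100 networks were combined, so this is an assumption-laden illustration with made-up weights, not the authors' procedure.

```python
def garson_importance(w, v):
    """Relative importance of each input for a single-output MLP.

    w[i][j]: weight from input i to hidden unit j.
    v[j]:    weight from hidden unit j to the output.
    Biases are ignored. Assumes every hidden unit has at least one
    non-zero incoming weight (otherwise a division by zero occurs).
    """
    n, m = len(w), len(v)
    # Total absolute incoming weight of each hidden unit.
    col_sums = [sum(abs(w[i][j]) for i in range(n)) for j in range(m)]
    # Share of hidden unit j attributable to input i, scaled by |v[j]|.
    raw = [sum(abs(w[i][j]) / col_sums[j] * abs(v[j]) for j in range(m))
           for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical weights for a network with 2 inputs and 2 hidden units.
w = [[2.0, 0.0],
     [2.0, 1.0]]
v = [1.0, 1.0]
shares = garson_importance(w, v)
```

Here the first input accounts for half of the first hidden unit and none of the second, so its share is 0.25 and the second input's share is 0.75.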

Genetic Algorithm
• Generate the initial population randomly (Generation 0)
• Evaluate the current generation
• Termination criterion satisfied?
  – Yes: end
  – No: breed the new population from the previous one by combining individuals through genetic operators, then evaluate again

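Equation that best fits the data
[Equation: the signs, fraction bars, and other operators linking the terms did not survive extraction. Surviving terms, in their original order: y; 0.0002377; 0.83936393*x1; 0.1879168; 0.2584619; x8*x1; x1; x2*x4; 0.001528*x1*(x4 … x10); x9*(x8 … x7); x4]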

Example of expression generated with logical operators

IF((X1>1.01911497),OR(if((X1X1),if(>((0.76183665),X9),TRUE,(X6