TEAM PERFORMANCE IN PROFESSIONAL BASKETBALL: AN APPROACH BASED ON NEURAL NETWORKS AND GENETIC PROGRAMMING (Research in progress-very preliminary results)
José Manuel Sánchez Santos, Department of Applied Economics
Ana Belen Porto Pazos, Department of Information and Communication Technologies
Alejandro Pazos Sierra, Department of Information and Communication Technologies
University of A Coruña, Spain
XIII IASE and III ESEA CONFERENCE OF SPORTS Prague, May 2011
OUTLINE
• Motivation
• Methodological approach: a brief description
• Empirical strategy
  – Variables and Data
  – Artificial Neural Networks: results
  – Genetic Programming: results
• Concluding remarks
• Further research
Motivation
• Aim:
  – Generic: to explore the possibilities of applying neural networks and genetic programming techniques to the analysis of team performance in sports
  – Specific: to provide some insight into the relative significance of each input (play action) for the result of a basketball game
• Is there a need for new modeling approaches? Why ANNs and GP?
  – Powerful models for statistical data analysis (non-linear relationships)
  – No theoretical evidence about the functional form
  – Novelty: only two previous works
    • Condon, Golden & Wasil (1999), Computers & Operations Research
    • Hall & Seaman (2009), International J. Sport Management and Marketing
• Why Spanish basketball?
  – Previous work as a reference: Sánchez et al. (2007), European Sport Management Quarterly
METHODOLOGICAL APPROACH
• Heuristic methods: the discovery of the functional form is automated
  – Based on neural networks (ANNs)
  – Based on evolutionary computation
    • Genetic algorithms
    • Evolutionary programming
    • Genetic programming (GP)
• Regression methods: the functional form is investigator-specified
ARTIFICIAL NEURAL NETWORKS (ANNs)
• A branch of artificial intelligence
• Originally inspired by neuroscience
• “A learning machine”: a computational method designed to operate like the biological nervous system
• An ANN consists of a set of interconnected processing neurons, called units, distributed across layers
• Feedforward neural networks or Multilayer Perceptron (MLP)
[Figure: single-output, three-layer feed-forward neural network, MLP (n, m, 1). Input layer: x0, X1, X2, …, Xn; hidden layer: h0, H1, …, Hm; output layer: Y. Labels: weights w11 … wnm, hidden biases c1 … cm, output bias b0.]
The unknown function to estimate from the available information:

$$\hat{y} = f(x, Z) = F\left(b_0 + \sum_{j=1}^{m} G\left(c_j + \sum_{i=1}^{n} x_i w_{ij}\right) b_j\right)$$

Where
F: output layer activation function
G: hidden layer activation function
n: number of input units
m: number of hidden units
x: inputs vector (i = 1…n)
Z: weights vector (parameters):
  b0: output bias
  cj: hidden unit biases (j = 1…m)
  wij: weight from input unit i to hidden unit j
  bj: weight from hidden unit j to the output
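The MLP function above can be evaluated directly. The following is a minimal sketch in plain Python; the choice of tanh for G and the identity for F, as well as the toy weights, are assumptions for illustration and are not taken from the slides.

```python
import math

def mlp_forward(x, w, c, b, b0, G=math.tanh, F=lambda z: z):
    """Evaluate y_hat = F(b0 + sum_j G(c_j + sum_i x_i * w_ij) * b_j).

    x  : list of n inputs
    w  : n x m nested list, w[i][j] = weight from input i to hidden unit j
    c  : list of m hidden unit biases
    b  : list of m hidden-to-output weights
    b0 : output bias
    G, F : hidden and output activations (tanh and identity are assumptions)
    """
    n, m = len(x), len(c)
    hidden = [G(c[j] + sum(x[i] * w[i][j] for i in range(n))) for j in range(m)]
    return F(b0 + sum(hidden[j] * b[j] for j in range(m)))

# Toy MLP(2, 2, 1) with made-up weights, only to exercise the formula.
y_hat = mlp_forward(x=[1.0, 0.5], w=[[0.2, -0.4], [0.6, 0.8]],
                    c=[0.0, 0.1], b=[1.0, -1.0], b0=0.5)
```

With these toy values the two hidden units receive 0.5 and 0.1, so the output equals 0.5 + tanh(0.5) - tanh(0.1).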
Genetic Programming (GP)
• A branch of artificial intelligence
• An optimization technique, in the form of a computer program, based on Darwin’s notion of survival of the fittest
• A non-parametric procedure that searches for the best fit over a large set of alternative functional forms
• A search technique in which an initial set of programs / equations (the population) is combined over successive generations, resulting in better programs / equations
  – Objective: optimize programs / equations
  – Maximize fitness
  – Minimize error
Flow Diagram of the GP Architecture
1. Assemble initial population of equations
2. Evaluate fitness of each equation
3. Was a fit equation found? Yes: terminate search. No: go to step 4.
4. Was maximum number of equations reached? Yes: terminate search. No: go to step 5.
5. Breed new population of equations and return to step 2
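The search loop in the flow diagram can be sketched as a toy symbolic regression in plain Python. Everything specific here is an assumption made for illustration: the operator set, the single variable `x`, truncation selection, the subtree crossover, the depth cap `MAX_DEPTH`, and the parameter names `max_equations` and `tol`. It is not the GP software used for the results in these slides.

```python
import math
import operator
import random

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
MAX_DEPTH = 6  # hypothetical cap that keeps bred equations small

def random_tree(rng, depth=3):
    """Grow a random equation over the variable 'x' and small constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    return (rng.choice(sorted(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0

def error(tree, data):
    """Sum of squared errors (lower is fitter); non-finite totals count as infinitely bad."""
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total if math.isfinite(total) else math.inf

def random_subtree(rng, tree):
    """Walk down from the root and stop at a random node."""
    while isinstance(tree, tuple) and rng.random() < 0.5:
        tree = rng.choice(tree[1:])
    return tree

def crossover(rng, a, b):
    """Copy of a with one random subtree replaced by a random subtree of b."""
    if not isinstance(a, tuple) or rng.random() < 0.3:
        return random_subtree(rng, b)
    op, left, right = a
    if rng.random() < 0.5:
        return (op, crossover(rng, left, b), right)
    return (op, left, crossover(rng, right, b))

def gp_search(data, pop_size=100, max_equations=5000, tol=1e-9, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]       # assemble initial population
    evaluated = 0
    while True:
        ranked = sorted(population, key=lambda t: error(t, data))  # evaluate fitness of each equation
        evaluated += len(population)
        best = ranked[0]
        # Stop if a fit equation was found or the equation budget is used up.
        if error(best, data) <= tol or evaluated >= max_equations:
            return best, evaluated                                 # terminate search
        parents = ranked[: max(2, pop_size // 4)]                  # keep the fittest quarter
        children = []
        while len(parents) + len(children) < pop_size:             # breed new population
            child = crossover(rng, rng.choice(parents), rng.choice(parents))
            if depth(child) <= MAX_DEPTH:
                children.append(child)
        population = parents + children

# Toy target y = x^2 + x sampled at five points.
data = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
best, evaluated = gp_search(data)
```

Because the fittest equations are carried over unchanged, the best error never gets worse from one generation to the next; whether the exact target is recovered depends on the seed and the budget.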
The production process in basketball
The variables:
Unit of analysis: basketball game
Output: the outcome of a game (ratio of points scored)
Inputs: those play actions considered decisive for a team to win (expected sign of the effect in parentheses)
• Defensive pressure:
  – Rival field goal percentage (-)
  – Number of blocked shots (+)
  – Rival turnovers (+)
  – Personal fouls committed (-)
• Efficiency in ball handling:
  – Assists (+)
  – Turnovers (-)
• Home-court advantage
• Rebounding capacity:
  – Defensive rebounds (+)
  – Offensive rebounds (+)
• Shot effectiveness:
  – Percentage of field shots made (+)
  – Percentage of free throws scored (+)
  – Blocked shots received (-)
  – Personal fouls received (+)
Variables
OUTPUT
  Y (ratio of points scored)
INPUTS
  X1 (ratio of field shot percentages)
  X2 (ratio of free throw percentages)
  X3 (ratio of defensive rebounds)
  X4 (ratio of offensive rebounds)
  X5 (ratio of assists)
  X6 (ratio of personal fouls)
  X7 (ratio of steals)
  X8 (ratio of turnovers)
  X9 (difference in blocked shots)
  X10 (the team plays at home)
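As an illustration of how one observation could be built from a box score, here is a minimal sketch. The dictionary keys, the direction of each ratio (team value divided by rival value), the 1/0 coding of X10, and all the numbers are assumptions; the slides do not specify them.

```python
RATIO_KEYS = [
    "field_goal_pct",   # X1
    "free_throw_pct",   # X2
    "def_rebounds",     # X3
    "off_rebounds",     # X4
    "assists",          # X5
    "personal_fouls",   # X6
    "steals",           # X7
    "turnovers",        # X8
]

def game_observation(team, rival, at_home):
    """Return (Y, [X1, ..., X10]) for one team in one game.

    team, rival : dicts of box-score statistics (hypothetical key names)
    at_home     : True if `team` is the home side
    """
    y = team["points"] / rival["points"]
    x = [team[key] / rival[key] for key in RATIO_KEYS]
    x.append(team["blocks"] - rival["blocks"])   # X9: difference, not ratio
    x.append(1 if at_home else 0)                # X10: assumed 1/0 dummy
    return y, x

# Made-up box score, for illustration only.
home = {"points": 80, "field_goal_pct": 50.0, "free_throw_pct": 75.0,
        "def_rebounds": 24, "off_rebounds": 10, "assists": 15,
        "personal_fouls": 20, "steals": 8, "turnovers": 12, "blocks": 3}
away = {"points": 75, "field_goal_pct": 40.0, "free_throw_pct": 80.0,
        "def_rebounds": 20, "off_rebounds": 8, "assists": 12,
        "personal_fouls": 22, "steals": 6, "turnovers": 14, "blocks": 1}
y, x = game_observation(home, away, at_home=True)
```

A ratio is undefined when the rival's count is zero, so a real pipeline would need a rule for that case.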
Data
• Official statistics of the ACB (Basketball Clubs Association) League Championship
• Seasons: 2002-2003 and 2003-2004
• 18 teams competed in the ACB over 34 league days
• 9 games are played on each regular league day
• Total sample: 612 observations (number of games played in the two seasons)
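The sample size follows from the figures above. The double round-robin format is an assumption here, although it is consistent with 18 teams playing 34 league days:

```latex
\begin{align*}
\text{league days per season} &= 2\,(18 - 1) = 34 \\
\text{games per league day}   &= 18 / 2 = 9 \\
\text{games per season}       &= 34 \times 9 = 306 \\
\text{total observations}     &= 2 \times 306 = 612
\end{align*}
```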
Artificial Neural Network Architecture for basketball problem
[Figure: single-output, three-layer feed-forward neural network, MLP (10, 10, 1): inputs X1 … X10, hidden units 1 … 10, output Y.]
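To give a sense of the model size relative to the 612 observations, the number of free parameters of an MLP (n, m, 1) can be counted. This sketch assumes bias terms on the hidden units and on the output, as in the general MLP formula.

```python
def mlp_parameter_count(n, m):
    """Free parameters of a single-output MLP(n, m, 1) with biases."""
    input_to_hidden = n * m      # weights w_ij
    hidden_biases = m            # biases c_j
    hidden_to_output = m         # weights b_j
    output_bias = 1              # bias b_0
    return input_to_hidden + hidden_biases + hidden_to_output + output_bias

n_params = mlp_parameter_count(10, 10)  # 100 + 10 + 10 + 1 = 121
```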
BACKPROPAGATION ALGORITHM
[Diagram: INPUT LAYER → WEIGHTS → OUTPUT LAYER; the output is compared with the TARGET to obtain the NETWORK ERROR, which drives the WEIGHTS ADJUSTMENT.]
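The cycle in the diagram (forward pass, error against the target, weight adjustment) can be sketched for a single-output MLP. The tanh hidden units, linear output, squared-error loss, learning rate, and single-sample updates are all assumptions for illustration; the slides do not state the training settings.

```python
import math

def forward(x, w, c, b, b0):
    """Return the hidden activations and the network output (linear output unit)."""
    hidden = [math.tanh(c[j] + sum(x[i] * w[i][j] for i in range(len(x))))
              for j in range(len(c))]
    return hidden, b0 + sum(h * bj for h, bj in zip(hidden, b))

def backprop_step(x, target, w, c, b, b0, lr=0.05):
    """One gradient-descent update on E = 0.5 * (y_hat - target)^2.

    Updates w, c and b in place; returns the new output bias and the
    error measured before the update.
    """
    hidden, y_hat = forward(x, w, c, b, b0)
    delta_out = y_hat - target                       # network error signal dE/dy_hat
    for j, h in enumerate(hidden):
        delta_h = delta_out * b[j] * (1.0 - h * h)   # error propagated back through tanh
        b[j] -= lr * delta_out * h                   # hidden-to-output weight
        c[j] -= lr * delta_h                         # hidden bias
        for i, xi in enumerate(x):
            w[i][j] -= lr * delta_h * xi             # input-to-hidden weight
    b0 -= lr * delta_out                             # output bias
    return b0, 0.5 * delta_out * delta_out

# Toy MLP(2, 2, 1) fitted to one made-up sample.
x, target = [1.0, 0.5], 1.2
w = [[0.1, -0.2], [0.3, 0.4]]
c, b, b0 = [0.0, 0.0], [0.3, -0.2], 0.1
errors = []
for _ in range(200):
    b0, err = backprop_step(x, target, w, c, b, b0)
    errors.append(err)
```

In practice the update would be applied over all training games, typically with a held-out set to decide when to stop.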
Sensitivity analysis obtained by training 100 networks and using the TEKA and Garson methods

Rank   ANN-TEKA         ANN-Garson
1      x9    14.77%     x4    18.16%
2      x3    14.49%     x3    15.39%
3      x4    11.83%     x9    14.27%
4      x2    10.98%     x1    11.89%
5      x8    10.83%     x10    9.93%
6      x7     9.75%     x7     8.66%
7      x5     9.71%     x2     7.22%
8      x1     6.23%     x8     6.90%
9      x6     5.22%     x5     4.03%
10     x10    4.12%     x6     3.56%
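For reference, one common formulation of Garson's weight-partitioning method is sketched below. The slides do not say which variant was used or how results were aggregated over the 100 networks, so this is an assumption-laden illustration and does not reproduce the reported percentages.

```python
def garson_importance(w, b):
    """Relative importance of each input for a single-output MLP.

    w[i][j] : weight from input i to hidden unit j
    b[j]    : weight from hidden unit j to the output
    Each hidden unit's output weight |b_j| is split among the inputs in
    proportion to |w_ij|; the shares are summed per input and normalized.
    Assumes every hidden unit has at least one non-zero incoming weight.
    """
    n, m = len(w), len(b)
    scores = [0.0] * n
    for j in range(m):
        column_total = sum(abs(w[i][j]) for i in range(n))
        for i in range(n):
            scores[i] += abs(w[i][j]) / column_total * abs(b[j])
    total = sum(scores)
    return [s / total for s in scores]

# Toy weights: 2 inputs, 2 hidden units.
importance = garson_importance([[1.0, 2.0], [3.0, 2.0]], [1.0, 1.0])
# importance == [0.375, 0.625]
```

Because the method works with absolute values, it ranks inputs by magnitude of influence and says nothing about the direction of the effect.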
Genetic Algorithm
1. Generate initial population randomly (Generation 0)
2. Evaluate current generation
3. Termination criterion satisfied? Yes: End. No: go to step 4.
4. Breed the new population from the previous one by combining individuals through genetic operators, then return to step 2
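Equation that best fits the data
[Equation: the operators and fraction bars between terms are not recoverable. Surviving terms, in order: y; 0.0002377; 0.83936393*x1; 0.1879168; 0.2584619; x8*x1; x1; x2*x4; 0.001528*x1*(x4 x10); x9*(x8 x7); x4]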
Example of expression generated with logical operators
IF((X1>1.01911497),OR(if((X1X1),if(>((0.76183665),X9),TRUE,(X6