Learning and search strategies in the context of hidden-action problems
6th World Congress of the International Microsimulation Association (IMA 2017)
Stephan Leitner and Friederike Wall
Faculty of Management and Economics, Alpen-Adria-Universität Klagenfurt
[email protected]
Motivation & contribution

Agency theory
• Captures delegation relationships between principal & agent and shapes them by contracts
• Makes very specific assumptions about the behavior of the involved individuals & the available pieces of information

Assumptions within agency models
• Might be seen as a virtue, as they allow for deriving solutions in rigorous closed-form modeling
• Might be seen as a fundamental weakness, as they might limit the theory's predictive power (i.e., agency theory might fail in explaining empirical phenomena)

Contribution
• We propose an agent-based variant of the standard hidden-action model
• We test for the level of performance achievable in situations with relaxed assumptions
Research agenda

The standard hidden-action model
• Provides an 'optimal' solution (here: focus on the second-best solution)
• The involved parties have all the information to achieve the 'optimal' solution in one period
Principal (P)
• Production function: $x = a \cdot \rho + \theta$
• Compensation function: $s = c_f + x \cdot p$
• P's utility function: defined by (a function of) the outcome and A's compensation, $U_P\big(X(a,\rho,\theta),\, S(c_f,x,p)\big)$
• P's maximization problem: maximize expected utility (via $S(\cdot)$),
  $$\max_{S(\cdot)} \; \mathrm{E}\big(U_P(x,s)\big)$$
  subject to
  • the participation constraint (PC): $\mathrm{E}\big(U_A(s,a)\big) \geq \bar{U}$
  • the incentive compatibility constraint (IC): $\dfrac{\partial}{\partial a}\left(\int V(s)\, f(x \mid a)\, \mathrm{d}x - G(a)\right) = 0$

Agent (A)
• A's utility function: utility from compensation minus disutility from exerting effort, $U_A(s,a) = V(s) - G(a)$
• A's maximization problem: maximize expected utility (via $a$), $\max_a \mathrm{E}\big(U_A(s,a)\big)$ (numerical sketch below)
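As a numerical sketch of these building blocks (not the original specification): the productivity of 50 and the fixed compensation of 0 follow the LEN parameterization used later in the talk, the exponential/quadratic forms match that parameterization, while the premium rate p and the risk-aversion coefficient eta are placeholder values of ours.

```python
import numpy as np

# Building blocks of the hidden-action model (sketch; p and eta are placeholders).
rho = 50.0    # productivity (input-output relation 1:50)
c_f = 0.0     # fixed compensation (normalized to 0)
p   = 0.3     # premium share of the outcome (hypothetical value)
eta = 0.5     # agent's risk-aversion coefficient (hypothetical value)

def outcome(a, theta):
    """Production function: x = a * rho + theta."""
    return a * rho + theta

def compensation(x):
    """Compensation function: s = c_f + x * p."""
    return c_f + x * p

def V(s):
    """A's utility from compensation (exponential, risk-averse)."""
    return -np.exp(-eta * s)

def G(a):
    """A's disutility from effort (quadratic)."""
    return 0.5 * a ** 2

def U_A(s, a):
    """A's utility: utility from compensation minus disutility from effort."""
    return V(s) - G(a)

def U_P(x, s):
    """P's utility: outcome minus compensation (linear, risk-neutral principal)."""
    return x - s
```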
Research Agenda
• Relax assumptions regarding information (make pieces of information unavailable) & transfer the standard model into a (multi-period*) agent-based model variant
• Allow the involved parties to search for the 'optimal' solution (over time)
• Allow the involved parties to learn and to store the learnings in a (limited) memory
• Investigate the convergence of the achieved solution to the 'optimal' solution (over time)
• Investigate parameters which drive the convergence of the achieved solution to the 'optimal' solution (here particularly the search strategy and the impact of the environment on the delegation relationship's outcome)

* We do not allow for the allocation of effort across periods.
Transferring the standard model

P's information
• Standard model: A's characteristics (reservation utility $\bar{U}$, productivity $\rho$, utility function $U_A$); the observed outcome (x); the entire 'action space' (A); the distribution of exogenous factors
• Agent-based model variant: A's characteristics and the observed outcome as in the standard model (=); only limited information about the 'action space' (≠); no information about the distribution of exogenous factors, but P learns about it over time (≠); additionally endowed with a mental horizon, an exploration propensity, and a (limited) memory

A's information
• Standard model: the observed outcome (x); A's private information (selected action, realized exogenous factor); the distribution of exogenous factors
• Agent-based model variant: the observed outcome and the private information as in the standard model (=); no information about the distribution of exogenous factors, but A learns about it over time (≠); endowed with a (limited) memory (a data-structure sketch follows below)
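The different information endowments of the agent-based variant could be encoded, for instance, as follows; the field names, types, and defaults (taken from the parameterization used later) are our own choices, not part of the original model:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PrincipalState:
    """P's information in the agent-based variant (illustrative encoding)."""
    reservation_utility: float               # known characteristic of A (U-bar)
    productivity: float                      # known characteristic of A (rho)
    observed_outcomes: List[float] = field(default_factory=list)  # x, one per period
    estimated_factors: List[float] = field(default_factory=list)  # estimated exogenous factors
    mental_horizon: float = 0.1              # share of the 'action space' P can oversee
    exploration_propensity: float = 0.25     # between 0 and 1
    memory: int = 3                          # periods kept in (limited) memory

@dataclass
class AgentState:
    """A's information in the agent-based variant (illustrative encoding)."""
    observed_outcomes: List[float] = field(default_factory=list)  # x, one per period
    selected_actions: List[float] = field(default_factory=list)   # private: chosen efforts
    realized_factors: List[float] = field(default_factory=list)   # private: realized exogenous factors
    memory: int = 3                          # periods kept in (limited) memory
```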
Sequence of events per timestep

[Diagram: P's 'action space' in period t, defined on the basis of P's state of information: the 'status-quo' effort-level $a^P_{t-1}$ (randomly drawn in t=1), an exploitation space and an exploration space, the minimum effort-level that fulfills incentive compatibility, the maximum effort-level that leads to the reservation utility, the area that can be overseen by P (mental horizon), and the memory holding the exogenous factor of t-1 (P: estimated, A: realized).]
1. P selects the search strategy for t (in the following: exploration): P decides based on the exploration propensity (0 to 1) and the estimated exogenous factor in t-1; the decision is based on a hurdle computed using the estimated exogenous factors' cumulative distribution function (decision rule sketched below).
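One possible reading of this decision rule is sketched below; the exact construction of the hurdle is not spelled out on the slide, so using the CDF value of the most recent estimate as the hurdle is our assumption.

```python
import numpy as np
from scipy.stats import norm

def choose_search_strategy(exploration_propensity, estimated_factors):
    """Step 1 (sketch): choose between exploration and exploitation.

    Assumes at least one estimated exogenous factor is available. The hurdle
    is read off the CDF of the estimated exogenous factors (here: a normal
    distribution fitted to the estimates); this exact rule is an assumption.
    """
    mu = np.mean(estimated_factors)
    sigma = np.std(estimated_factors) or 1.0   # fall back to 1.0 for a single estimate
    hurdle = norm.cdf(estimated_factors[-1], loc=mu, scale=sigma)
    # Explore whenever the exploration propensity (0 to 1) clears the hurdle (assumed rule).
    return "exploration" if exploration_propensity >= hurdle else "exploitation"
```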
2. P builds an expectation about the exogenous factor in t: the estimated exogenous factors are averaged over the last m periods; the parameter m defines the memory (sketched below).
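A minimal sketch of this expectation, with m as the memory parameter:

```python
def expected_factor(estimated_factors, m=None):
    """Step 2 (sketch): P's expectation of the exogenous factor in t, i.e. the
    average of the estimated exogenous factors over the last m periods.
    m=None stands for unlimited memory (all periods are used)."""
    window = estimated_factors if m is None else estimated_factors[-m:]
    return sum(window) / len(window)
```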
3. P randomly discovers 2 alternative effort-levels in the exploration space & evaluates them with respect to the expected increase in utility.
4. P decides for the 'utility-maximizing' effort-level $a^P_t$ (as the basis for the contract), offers the contract to A, and builds an expectation about the outcome in t (steps 3-4 sketched below).
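Steps 3 and 4 could be sketched as follows; evaluating candidates through a risk-neutral principal with a fixed premium rate is a simplification of ours, and the bounds search_low/search_high stand for the exploitation or exploration space selected in step 1:

```python
import numpy as np

def propose_effort(status_quo, search_low, search_high, expected_theta,
                   rho=50.0, premium=0.3, c_f=0.0, rng=None):
    """Steps 3-4 (sketch): randomly discover two alternative effort-levels in
    the current search space, evaluate them (and the status quo) by the
    principal's expected utility, and return the 'utility-maximizing'
    effort-level that becomes the basis of the contract offered to A."""
    rng = rng or np.random.default_rng()

    def expected_utility_P(a):
        x_expected = a * rho + expected_theta      # expected outcome (uses step 2's estimate)
        s_expected = c_f + premium * x_expected    # A's expected compensation
        return x_expected - s_expected             # principal keeps the residual (assumed)

    candidates = [status_quo] + list(rng.uniform(search_low, search_high, size=2))
    return max(candidates, key=expected_utility_P)
```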
5. A builds an expectation about the exogenous factor in t & selects an effort-level.
6. The exogenous factor and the outcome realize (for period t).
7. P observes the outcome, estimates the exogenous factor in t (based on the expected outcome from step 4 and the realized outcome from step 6) & stores the estimate in memory.
8. A observes the outcome & the exogenous factor (for t) and stores them in memory.
9. Utilities in period t realize.
Steps 1-9 are repeated until period T; in t+1, $a^P_t$ serves as the new 'status-quo' effort-level. The boundaries of the 'action space' change with P's state of information (as the expected exogenous factors are included in their definition). One period of this loop is sketched below.
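Putting steps 5-9 together, one period could look like the sketch below; it reuses the PrincipalState/AgentState classes from the earlier data-structure sketch, and the agent's grid-search best response as well as the compensation parameters are simplifying assumptions of ours.

```python
import numpy as np

def run_period(P, A, contract_effort, expected_theta, sigma_theta,
               rho=50.0, premium=0.3, c_f=0.0, eta=0.5, rng=None):
    """One timestep, steps 5-9 (sketch); P and A are PrincipalState and
    AgentState instances from the data-structure sketch above."""
    rng = rng or np.random.default_rng()

    # 5. A builds an expectation about the exogenous factor in t (from its own
    #    memory of realized factors) and selects an effort-level (grid search, assumed).
    theta_exp_A = np.mean(A.realized_factors[-A.memory:]) if A.realized_factors else 0.0
    def U_A(a):
        s = c_f + premium * (a * rho + theta_exp_A)
        return -np.exp(-eta * s) - 0.5 * a ** 2
    effort = max(np.linspace(0.0, contract_effort, 100), key=U_A)

    # 6. Exogenous factor and outcome realize (for period t).
    theta = rng.normal(0.0, sigma_theta)
    x = effort * rho + theta

    # 7. P observes the outcome, estimates the exogenous factor as the gap
    #    between the realized and the expected outcome, and stores it in memory.
    P.observed_outcomes.append(x)
    P.estimated_factors.append(x - (contract_effort * rho + expected_theta))

    # 8. A observes the outcome and the exogenous factor and stores them in memory.
    A.observed_outcomes.append(x)
    A.selected_actions.append(effort)
    A.realized_factors.append(theta)

    # 9. Utilities in period t realize.
    s = c_f + premium * x
    return x - s, -np.exp(-eta * s) - 0.5 * effort ** 2   # (P's utility, A's utility)
```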
Investigated scenarios

Benchmark scenario
• Results derived from the standard hidden-action model are used as the benchmark scenario (recall: second-best solution)

Agent-based model variant parameterized as LEN model
• Principal
  • Linear utility function
  • Mental horizon: set to 1/10 of the entire 'action space'
  • Exploration propensity: 0.25, 0.75 (exploitation-prone and exploration-prone Ps)
• Agent
  • Exponential utility function (risk-averse), quadratic disutility function
  • Fixed compensation and reservation utility normalized to 0
  • Input-output relation (productivity): 1:50
• Environment
  • Normally distributed exogenous factors
  • Impact of exogenous factors (standard deviation relative to the 'optimal' outcome x*): 0.05x*, 0.25x*, 0.45x*, 0.65x* (mean always set to 0)

Further parameters
• Periods per time path: 20
• Simulation runs per scenario: 700
• Memory: unlimited (all periods) and limited (3 periods); see the parameter sketch below
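For reference, these settings might be collected in a single parameter structure (the key names are ours; the values are the ones listed above, with the exploration propensity, the standard deviation, and the memory varied across scenarios):

```python
# Scenario parameters as listed on this slide (key names are our own choices).
SCENARIO = {
    "principal": {
        "utility": "linear",
        "mental_horizon": 0.1,                  # 1/10 of the entire 'action space'
        "exploration_propensity": [0.25, 0.75], # exploitation-prone vs. exploration-prone
    },
    "agent": {
        "utility": "exponential (risk-averse)",
        "disutility": "quadratic",
        "fixed_compensation": 0.0,
        "reservation_utility": 0.0,
        "productivity": 50.0,                   # input-output relation 1:50
    },
    "environment": {
        "exogenous_factor": "normal",
        "mean": 0.0,
        "sd_relative_to_optimal_outcome": [0.05, 0.25, 0.45, 0.65],
    },
    "simulation": {
        "periods_per_time_path": 20,
        "runs_per_scenario": 700,
        "memory": ["unlimited", 3],
    },
}
```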
Reported performance measure

Averaged normalized effort-level (selected by the agent)
• Appropriate performance measure due to stochastic dominance (a higher effort-level implies a higher performance)
• Free from noise (as the measure itself is free from exogenous factors)
• Notation:
  • a* = 'optimal' effort-level derived from the standard hidden-action model
  • t = timestep, T = maximum number of timesteps
  • r = simulation run, R = total number of simulation runs
  • $a^A_{tr}$ = effort-level selected by the agent in timestep t and simulation run r

$$\frac{1}{R} \sum_{r=1}^{R} \frac{a^{A}_{tr}}{a^{*}} \qquad \forall\, t = 1, \ldots, T$$

(computed as sketched below)
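Computed from the simulation output, the measure might look like this; the array layout is an assumption of ours:

```python
import numpy as np

def averaged_normalized_effort(a_A, a_star):
    """Averaged normalized effort-level per timestep (sketch).

    a_A    : array of shape (R, T) with the agent's selected effort-levels,
             one row per simulation run (the shape is our assumption).
    a_star : 'optimal' effort-level from the standard hidden-action model.
    Returns an array of length T: (1/R) * sum_r a_A[r, t] / a_star for each t.
    """
    a_A = np.asarray(a_A, dtype=float)
    return (a_A / a_star).mean(axis=0)
```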
Selected results - unlimited memory I

[Figure: averaged normalized effort-level (y-axis, 0.5 to 1) over timesteps 1-20 (x-axis) for the scenario with the impact of exogenous factors on outcome SD = 0.05x*. Annotation: a higher exploration propensity is significantly superior to a tendency towards exploitation. Confidence level: α = 0.01]
Selected results - unlimited memory II

[Figure: averaged normalized effort-level (y-axis, 0.5 to 1) over timesteps 1-20 (x-axis), one panel per impact of exogenous factors on outcome: SD = 0.05x*, 0.25x*, 0.45x*, 0.65x*. Annotation: a higher exploration propensity is significantly superior to a tendency towards exploitation. Confidence level: α = 0.01]
Selected results - unlimited memory III
• Achieved performance decreases as the impact of exogenous factors increases: more turbulent environments lead to a decrease in performance
• The choice of the search strategy does not affect the achieved performances in the first few periods
• Exploitation leads to superior performance
  • Effect only observable in a subset of time-periods, and
  • the extent of superiority decreases with an increasing impact of the environment on the outcome
Selected results - limited memory I

[Figure: averaged normalized effort-level (y-axis, 0.5 to 1) over timesteps 1-20 (x-axis), one panel per impact of exogenous factors on outcome: SD = 0.05x*, 0.25x*, 0.45x*, 0.65x*. Annotations: a higher exploration propensity is significantly superior to a tendency towards exploitation; performance tends to decrease after a peak. Confidence level: α = 0.01]
Selected results - limited memory II
• Performances are lower than those achieved with unlimited memory: particularly observable for scenarios with a high impact of the environment on outcomes
• The choice of the search strategy does not affect the achieved performances in the first few periods
• Performance tends to decrease after a peak is reached: particularly observable in scenarios with a relatively high impact of the environment on outcome
• Exploitation leads to superior performance
  • Effect only observable in a subset of time-periods, and
  • with an increasing impact of exogenous factors the effect vanishes: in the case of limited memory and a high impact of the environment on the outcome, the choice of search strategy becomes irrelevant
Limitations and future research, e.g.
• Include costs for exploration
• Make more pieces of information unavailable (or only observable with noise) for P (e.g., A's exact utility function, A's exact productivity)
• Include more sophisticated learning strategies
• Test for different mental horizons
Thank you for your attention! Any questions or comments?