Learning and search strategies in the context of hidden-action problems

6th World Congress of the International Microsimulation Association (IMA 2017)

Stephan Leitner and Friederike Wall
Faculty of Management and Economics
Alpen-Adria-Universität Klagenfurt
[email protected]


Motivation & contribution

Agency theory
• Captures delegation relationships between principal & agent and shapes them by contracts
• Makes very specific assumptions about the behavior of the involved individuals & the available pieces of information

Assumptions within agency models
• Might be seen as a virtue, as they allow for deriving solutions in rigorous closed-form modeling
• Might be seen as a fundamental weakness, as they might limit the theory’s predictive power (i.e., agency theory might fail in explaining empirical phenomena)

Contribution
• We propose an agent-based variant of the standard hidden-action model
• We test for the level of performance achievable in situations with relaxed assumptions


Research agenda

The standard hidden-action model
• Provides an 'optimal' solution (here: focus on the second-best solution)
• The involved parties have all the information needed to achieve the 'optimal' solution in one period

Production function: x = a·ρ + θ
Compensation function: s = c_f + x·p

Principal (P)
• P’s utility function: defined by (a function of) the outcome and A’s compensation, U_P(X(a, ρ, θ), S(c_f, x, p))
• P’s maximization problem: maximize expected utility via the compensation function S(·)
  max E(U_P(x, s))
  subject to
  • Participation constraint (PC): E(U_A(s, a)) ≥ Ū
  • Incentive compatibility constraint (IC), first-order condition: ∫ V(s) f_a(x | a) dx − G′(a) = 0

Agent (A)
• A’s utility function: utility from compensation minus disutility from exerting effort, U_A(s, a) = V(s) − G(a)
• A’s maximization problem: maximize expected utility via the effort level a
  max_a E(U_A(s, a))
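The two functional forms above can be stated in a few lines. The following sketch is purely illustrative: the concrete numbers (effort, shock, premium share) are hypothetical and only show the mechanics of the model.

```python
# Minimal sketch of the model's two basic functions (illustrative values only).

def outcome(a, rho, theta):
    """Production function: x = a * rho + theta."""
    return a * rho + theta

def compensation(x, c_f, p):
    """Compensation function: s = c_f + x * p (fixed pay plus premium share)."""
    return c_f + x * p

# Hypothetical numbers: effort 0.4, productivity 50 (the 1:50 relation used
# later in the parameterization), a small negative exogenous shock, no fixed
# pay, and a 30% premium share.
x = outcome(a=0.4, rho=50.0, theta=-1.5)   # ~18.5
s = compensation(x, c_f=0.0, p=0.3)        # ~5.55
print(x, s)
```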

Research agenda (cont.)
• Relax the assumptions regarding information (make pieces of information unavailable) & transfer the standard model into a (multi-period*) agent-based model variant
• Allow the involved parties to search for the 'optimal' solution (over time)
• Allow the involved parties to learn and to store the learnings in a (limited) memory
• Investigate the convergence of the achieved solution to the 'optimal' solution (over time)
• Investigate the parameters which drive this convergence (here particularly the search strategy and the impact of the environment on the delegation relationship’s outcome)

*We do not allow for allocation of effort across periods.


Transferring the standard model

STANDARD MODEL

P’s information
• A’s characteristics (Ū, ρ, U_A)
• Observed outcome (x)
• Entire 'action space' (A)
• Distribution of exogenous factors

A’s information
• Observed outcome (x)
• A’s private information: selected action, realized exogenous factor
• Distribution of exogenous factors

AGENT-BASED MODEL VARIANT

P’s information
• A’s characteristics (Ū, ρ, U_A) [=]
• Observed outcome (x) [=]
• Limited information about the 'action space' [≠]
• No information about the distribution of exogenous factors, but learns about it over time [≠]
• Endowed with mental horizon, exploration propensity, and (limited) memory [≠]

A’s information
• Observed outcome (x) [=]
• A’s private information: selected action, realized exogenous factor [=]
• No information about the distribution of exogenous factors, but learns about it over time [≠]
• Endowed with (limited) memory [≠]

Sequence of events per timestep

[Diagram: timeline from t−1 to t. The 'action space' is defined on the basis of P’s state of information: it ranges from the minimum effort-level that fulfills incentive compatibility to the maximum effort-level that leads to the reservation utility, and contains the exploitation space and the exploration space within the area that can be overseen by P (mental horizon). The 'status-quo' effort-level a_P^{t−1} is randomly drawn in t = 1. Exogenous factor in t−1: estimated by P, realized (and known) for A.]

1. P selects the search strategy in t (in the following: exploration). P decides based on the exploration propensity (0 to 1) and the estimated exogenous factor in t−1; the decision is based on a hurdle computed using the estimated exogenous factors’ cumulative distribution function.
2. P builds an expectation about the exogenous factor in t: the estimated exogenous factors are averaged over the last m periods; the parameter m defines the memory.
3. P randomly discovers 2 alternative effort levels in the exploration space & evaluates them with respect to the expected increase in utility.
4. P decides for the 'utility-maximizing' effort-level a_P^t (as the basis for the contract), offers the contract to A & builds an expectation about the outcome in t.
5. A builds an expectation about the exogenous factor in t & selects the effort-level.
6. Exogenous factor and outcome realize (for period t).
7. P observes the outcome, estimates the exogenous factor in t (based on the expected outcome from step 4 and the realized outcome from step 6) & stores it in memory.
8. A observes the outcome & the exogenous factor (for t) and stores them in memory.
9. Utilities in period t realize.

Steps 1–9 are repeated until period T. The boundaries of the search spaces change with P’s state of information (as expected exogenous factors are included in their definition).
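The nine steps above can be sketched as a simple per-run loop. This is a minimal illustration, not the paper’s implementation: the hurdle rule in step 1 is simplified to a plain draw against the exploration propensity, the search-space widths stand in for the mental-horizon boundaries, P’s evaluation uses the expected outcome as a crude proxy for expected utility, and A’s own optimization is stubbed out.

```python
import random
from statistics import mean

def simulate_run(T=20, m=3, exploration_propensity=0.25,
                 rho=50.0, sigma=1.0, seed=0):
    """One simulation run of steps 1-9 (illustrative simplification)."""
    rng = random.Random(seed)
    p_memory = []                    # P's stored estimates of the exogenous factor
    a_status_quo = rng.random()      # 'status-quo' effort-level, randomly drawn in t = 1
    efforts = []
    for t in range(T):
        # Step 1: select the search strategy. (The model derives a hurdle
        # from the estimated factors' CDF; we simplify to a plain draw.)
        explore = rng.random() < exploration_propensity
        # Step 2: expectation about the exogenous factor in t = average of
        # the estimates stored over the last m periods (m = memory).
        recent = p_memory[-m:]
        expected_theta = mean(recent) if recent else 0.0
        # Step 3: randomly discover 2 alternative effort levels; the
        # exploration space is wider than the exploitation space
        # (widths are illustrative stand-ins for the mental horizon).
        width = 0.2 if explore else 0.05
        candidates = [min(max(a_status_quo + rng.uniform(-width, width), 0.0), 1.0)
                      for _ in range(2)] + [a_status_quo]
        # Step 4: contract on the candidate with the highest expected
        # outcome (a crude proxy for P's expected-utility comparison).
        a_contract = max(candidates, key=lambda a: a * rho + expected_theta)
        # Steps 5-6: A exerts the contracted effort (stub for A's own
        # optimization); exogenous factor and outcome realize.
        theta = rng.gauss(0.0, sigma)
        x = a_contract * rho + theta
        # Step 7: P estimates the exogenous factor from the realized vs.
        # expected outcome and stores the estimate in memory.
        p_memory.append(x - a_contract * rho)
        # Steps 8-9: A stores its observations; utilities realize (omitted).
        a_status_quo = a_contract
        efforts.append(a_contract)
    return efforts

print(simulate_run()[-1])  # effort-level reached after T = 20 periods
```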

Investigated scenarios

Benchmark scenario
• Results derived from the standard hidden-action model are used as the benchmark scenario (recall: second-best solution)

Agent-based model variant parameterized as LEN model
• Principal
  • Linear utility function
  • Mental horizon: set to 1/10 of the entire 'action space'
  • Exploration propensity: 0.25, 0.75 (exploitation-prone and exploration-prone Ps)
• Agent
  • Exponential utility function (risk-averse), quadratic disutility function
  • Fixed compensation and reservation utility normalized to 0
  • Input-output relation (productivity): 1:50
• Environment
  • Normally distributed exogenous factors
  • Impact of exogenous factors (standard deviation relative to 'optimal' outcome x*): 0.05x*, 0.25x*, 0.45x*, 0.65x* (mean always set to 0)

Further parameters
• Periods per time path: 20
• Simulation runs per scenario: 700
• Memory: unlimited (all periods) and limited (3 periods)
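The scenario grid above (2 exploration propensities × 4 environment impacts × 2 memory settings) can be collected in a small configuration sketch. The class and field names here are our own shorthand, not from the original model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Scenario:
    """Parameterization of one investigated scenario (names are illustrative)."""
    periods: int = 20                      # periods per time path
    runs: int = 700                        # simulation runs per scenario
    memory: Optional[int] = None           # None = unlimited, 3 = limited
    exploration_propensity: float = 0.25   # 0.25 or 0.75
    mental_horizon: float = 0.1            # 1/10 of the entire 'action space'
    rho: float = 50.0                      # input-output relation 1:50
    sigma_rel: float = 0.05                # SD of exogenous factors relative to x*

# Full grid: 2 propensities x 4 environment impacts x 2 memory settings.
scenarios = [
    Scenario(exploration_propensity=ep, sigma_rel=s, memory=mem)
    for ep in (0.25, 0.75)
    for s in (0.05, 0.25, 0.45, 0.65)
    for mem in (None, 3)
]
print(len(scenarios))  # 16 scenarios
```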

Reported performance measure

Averaged normalized effort-level (selected by the agent)
• Appropriate performance measure due to stochastic dominance (higher effort-level —> higher performance)
• Free from noise (as the measure itself is free from exogenous factors)
• Notation:
  • a* = 'optimal' effort-level derived from the standard hidden-action model
  • t = timestep, T = maximum number of timesteps
  • r = simulation run, R = total number of simulation runs
  • a_Atr = effort-level selected by the agent in timestep t and simulation run r

  (1/R) · Σ_{r=1..R} (a_Atr / a*)   ∀ t = 1, …, T
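The measure is a direct average of normalized efforts across runs for a fixed timestep t; a one-line transcription, with hypothetical numbers in the usage example:

```python
def avg_normalized_effort(efforts, a_star):
    """Averaged normalized effort-level for one timestep t:
    (1/R) * sum over r of a_Atr / a*, where `efforts` holds a_Atr
    for a fixed t across the simulation runs r = 1..R."""
    return sum(a / a_star for a in efforts) / len(efforts)

# Hypothetical numbers: three runs, 'optimal' effort a* = 0.8.
print(avg_normalized_effort([0.6, 0.7, 0.8], a_star=0.8))  # ~0.875
```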


Selected results - unlimited memory I

[Figure: averaged normalized effort-level (y-axis, 0.5–1) over timesteps 1–20 (x-axis); impact of exogenous factors on outcome: SD = 0.05x*. Annotation: a higher exploration propensity is significantly superior to a tendency towards exploitation. Confidence level: α = 0.01]


Selected results - unlimited memory II

[Figure: averaged normalized effort-level (y-axis, 0.5–1) over timesteps 1–20 (x-axis), four panels for the impact of exogenous factors on outcome: SD = 0.05x*, 0.25x*, 0.45x*, 0.65x*. Annotation for SD = 0.05x*: a higher exploration propensity is significantly superior to a tendency towards exploitation. Confidence level: α = 0.01]


Selected results - unlimited memory III
• Achieved performance decreases as the impact of exogenous factors increases: more turbulent environments lead to a decrease in performance
• The choice of the search strategy does not affect the achieved performances in the first few periods
• Exploitation leads to superior performance
  • The effect is only observable in a subset of time-periods, and
  • the extent of superiority decreases with an increasing impact of the environment on the outcome


Selected results - limited memory I

[Figure: averaged normalized effort-level (y-axis, 0.5–1) over timesteps 1–20 (x-axis), four panels for the impact of exogenous factors on outcome: SD = 0.05x*, 0.25x*, 0.45x*, 0.65x*. Annotations for SD = 0.05x*: a higher exploration propensity is significantly superior to a tendency towards exploitation; performance tends to decrease after a peak. Confidence level: α = 0.01]


Selected results - limited memory II
• Performances are lower than the ones achieved with unlimited memory: particularly observable for scenarios with a high impact of the environment on outcomes
• The choice of the search strategy does not affect the achieved performances in the first few periods
• Performance tends to decrease after a peak is reached: particularly observable in scenarios with a relatively high impact of the environment on the outcome
• Exploitation leads to superior performance
  • The effect is only observable in a subset of time-periods, and
  • with an increasing impact of exogenous factors the effect vanishes: in the case of limited memory and a high impact of the environment on the outcome, the choice of the search strategy becomes irrelevant



Limitations and future research, e.g.
• Include costs for exploration
• Make more pieces of information unavailable (or only observable with noise) for P (e.g., A’s exact utility function, A’s exact productivity)
• Include more sophisticated learning strategies
• Test for different mental horizons

Thank you for your attention! Any questions or comments?
