Can Well-Being be Predicted? A Machine Learning Approach
Max Wilckens, Margeret Hall
Karlsruhe Service Research Institute, Karlsruhe Institute of Technology, Germany
[email protected]

Abstract

The study of well-being is an interdisciplinary field integrating aspects from psychology, economics, and the social and political sciences. Research is still struggling to provide a robust definition of well-being and to explain its variation and its dependence on, for example, personality, demographics, way of life and life events. In this study, several machine learning techniques, including kernel smoothing algorithms, neural networks and feature selection methods, are applied in order to expand the structural understanding of subjective well-being and its dependencies. Well-being data from a four-week study sequence with 362 participants is analyzed for non-parametric structures across thirteen predictor variables: the big five personality traits, a maximizer-satisficer scale, a fairness measure and six demographic variables. Neuroticism, extraversion and conscientiousness were confirmed as the most important predictors. Although the identified non-parametric structures do not lead to significantly higher prediction accuracy, 54% of the between-participant well-being variance was explained by the predictor set. Surprisingly, the cross-validated machine learning algorithms were not found to achieve higher accuracy than the linear model.

Keywords: well-being, big-five personality, maximizer-satisficer, fairness, machine learning, predictive analytics, non-parametric regression, neural network, extreme learning machine, k-nearest neighbor, feature selection, lasso regression, lazy lasso regression, small data

Contents

List of Figures
List of Tables
List of Abbreviations

1 Introduction
  1.1 Background
  1.2 Purpose of the Study
  1.3 Layout of the Study

2 Literature Review
  2.1 Defining Well-Being
    2.1.1 Perspectives on Well-being
      2.1.1.1 Eudemonia and Psychological Well-being
      2.1.1.2 Hedonism and Subjective Well-being
      2.1.1.3 Economic Well-being
    2.1.2 Well-Being Baseline
    2.1.3 The Influence of Positive and Negative Affect
    2.1.4 Equilibrium Theory
  2.2 Determinants of Well-Being
    2.2.1 Demographics / One's Background
    2.2.2 Personality Traits
    2.2.3 Life Situation
  2.3 Measuring Well-Being
  2.4 Machine Learning on Well-Being

3 Research Questions

4 Methodology
  4.1 Participants
  4.2 Apparatus and Materials
  4.3 Data Retrieval Procedure
  4.4 Analysis Procedure
    4.4.1 Comparison of Datasets
    4.4.2 Algorithms and Methods used
    4.4.3 Cross Validation and Testing

5 Results
  5.1 Descriptive Analysis
  5.2 Generalized Linear Model
  5.3 Kernel Smoothing Algorithms
    5.3.1 K-nearest Neighbor
    5.3.2 Non-parametric Regression
      5.3.2.1 LOESS
      5.3.2.2 Splines
      5.3.2.3 npreg
    5.3.3 Support Vector Machines (SVM)
  5.4 Neural Network Algorithms
    5.4.1 Stuttgart Neural Network Simulator (SNNS)
    5.4.2 Extreme Learning Machine (ELM)
  5.5 Feature Selection Algorithms
    5.5.1 Lasso and Elastic Net Regression
    5.5.2 Lazy Lasso Regression
  5.6 Accuracy Comparison

6 Evaluation
  6.1 Hypotheses
    6.1.1 Existence of Well-Being Baseline (Hypothesis 1)
    6.1.2 Predictability of Well-Being Baseline (Hypothesis 2)
    6.1.3 Characterization of Well-Being Trajectory (Hypotheses 3 & 4)
  6.2 Further Findings
  6.3 Limitations

7 Implications and Further Research

References

List of Figures

1.1 J-Curve
2.1 Objective vs. subjective happiness
2.2 Stocks and flows framework
2.3 SWB homeostasis
2.4 Well-being equilibrium definition
2.5 Sigmoid / logistic regression function
2.6 Neural network example (1 hidden layer with 5 hidden nodes)
2.7 Different kernel shapes
2.8 Kernel-smoothing example: Epanechnikov kernel with local linear regression
2.9 Support Vector Regression: Fitting inside the kernel
4.1 Participants' demographic structure
4.2 Independent and dependent variables
4.3 ANOVA Type-I
4.4 Caret cross-validation procedure
5.1 HFI distribution and density
5.2 Correlation matrix (absolute values)
5.3 GLM fitted with caret package
5.4 Variable importance in GLM (t-statistic)
5.5 GLM regression coefficients with standard error bars
5.6 GLM for participants' in-person well-being variance
5.7 RMSE for k-nearest neighbor using Euclidean metric
5.8 Variable importance for k-nearest neighbor using Euclidean metric
5.9 RMSE for gamLoess
5.10 RMSE for gamSplines
5.11 RMSE for npreg with least-squares cross-validation (left) and Kullback-Leibler cross-validation (right)
5.12 RMSE density plot for 10-fold cross-validation runs (kernel bandwidth selection upon least-squares cross-validation)
5.13 npreg predictors' partial regression influence
5.14 npreg accuracy for reduced predictor dimensionality
5.15 npreg accuracy for reduced predictor dimensionality
5.16 npreg predictors' partial regression influence for reduced predictor dimensionality (1)
5.17 npreg predictors' partial regression influence for reduced predictor dimensionality (2)
5.18 RMSE accuracy for support vector machine
5.19 RMSE accuracy for feedforward neural network
5.20 RMSE accuracy for extreme learning machine (ELM)
5.21 Cross-validation results for extreme learning machine (ELM) for 12,000 hidden nodes
5.22 RMSE accuracy for ELM in trajectory prediction problem
5.23 Lasso regression path (left) and RMSE accuracy (right)
5.24 Lazy Lasso Algorithm
5.25 RMSE accuracy for lazy lasso regression
5.26 Lazy lasso predictor weights
5.27 Lazy lasso: percentage of local lasso regressions with predictor coefficient unequal to zero
5.28 Accuracy comparison between deployed algorithms for well-being baseline prediction
5.29 RMSE accuracy gains with increased number of training points for neural network
5.30 RMSE accuracy gains with increased number of training points for npreg

List of Tables

2.1 Big Five Trait Taxonomy - Factor Definition (based on John, Naumann, & Soto, 2008)
4.1 Participants descriptive statistics
4.2 Descriptive statistics for dataset comparison
4.3 Applied algorithms
5.1 Weekly HFI correlation matrix
5.2 Explained variance of weekly HFI by the HFI average
5.3 Standard Deviation between and within participants' HFI trajectory
5.4 Predictor importance by group

List of Abbreviations

R2      Coefficient of Determination
RMSE    Root Mean Squared Error
AIC     Akaike Information Criterion
ANOVA   Analysis of Variance
CV      Cross-validation
ELM     Extreme Learning Machine
GAM     Generalized Additive Model
GLM     Generalized Linear Model
HFI     Human Flourishing Index
LOESS   Locally Weighted Scatterplot Smoothing
OLS     Ordinary Least Squares
PWB     Psychological Well-being
SD      Standard Deviation
SVM     Support Vector Machine
SWB     Subjective Well-being

1. Introduction

1.1 Background

What determines whether we judge a glass to be half full or half empty? Determinants of human well-being have been extensively addressed by research in recent years. The possibility of defining well-being as one of the most governing aspects of one's life attracts researchers to identify the factors influencing well-being and its impact on life, economy and society. The list of statistically significant correlational findings is already extensive¹ – reaching from one's sex life and health to personality, employment and education, to pets in households (Veenhoven, 2013). Even if correlational findings do not imply causality, there is a common understanding that personality and basic demographic characteristics have a determining or moderating influence on well-being, whereas variables such as health, longevity or productivity have been identified as positive outcomes of well-being (Diener, 2013). Due to a high degree of complexity, none of the conducted studies has thus far been able to describe computable dependencies that allow a well-being prediction.

¹ The world database of happiness by Veenhoven (2013) lists 12,564 correlational findings on happiness found in 1,203 publications.

This study aims for a prediction of individual well-being. To do so, it focuses on one's personality traits and basic demographic variables such as age, employment and gender. Being able to predict individual well-being based upon personality traits enables the isolation of other well-being factors in order to analyze their influence specifically. If personality and demography account for the foundational well-being base level, additional influences such as certain work environments or life situations could be researched independently. Consequently, the interdependencies between personality and well-being could be utilized to measure and monitor well-being more independently from conventional methods such as well-being questionnaires, and aid in finding a robust definition of well-being.

Well-being, or happiness as it is referred to in parts of the literature, has an extensive influence on human lives. Happy people have not only been found to live longer and healthier; they are also more productive and successful (Diener & Chan, 2011; Diener & Tay, 2013; Lyubomirsky, King, & Diener, 2005). Happiness is per se an aspired feeling, which makes us confident and satisfied with the life we live. Diener (2013, p. 665) argued that well-being "broadly mirrors the quality of life in societies beyond economic factors and thus reflect social capital, a clean environment, and other variables" (see also Diener & Seligman, 2004). Page and Vella-Brodrick (2008) identified close links between well-being and employee performance with considerations to their turnover rate, and proposed well-being as a "valuable tool" (p. 454) to measure the return on investment of employee enhancement programs and "track employee reaction to workplace changes" (p. 455). Moreover, high well-being values have been proven to be beneficial for turnover, customer loyalty, productivity and profitability (cf. Harter, Schmidt, & Keyes, 2003).

The social dimension of well-being is closely linked to social stability. In 1962, Davies outlined his theory of revolution, arguing that revolutionary forces are based on a "dissatisfied state of mind rather than the tangible provision of adequate or inadequate supplies of food, equality, or liberty" (p. 6) within the society's majority. Figure 1.1 illustrates the gap between personal expectations and reality, leading to dissatisfaction and finally revolution if the gap becomes too big (Davies, 1962). The described state of dissatisfaction is similar to low well-being values, since subjective well-being is mainly determined through life satisfaction. High well-being is therefore not only important for individuals themselves, but also for well-functioning societies.

Figure 1.1: J-Curve (based on Davies, 1962, p. 6). The figure plots satisfaction over time, with expectations and reality diverging from an acceptable to an unacceptable gap.

Consequently, well-being is increasingly discussed as an indicator for any kind of human environment, such as societies, institutions, companies, social interactions, services or even products. Measuring well-being enables the identification of effects of social change within societies on people's life satisfaction and happiness, and might therefore serve as a measure for quality of life. Bhutan was the first country to establish a national well-being measure besides economic indicators such as the gross national product (Thinley, 2011). Some political movements in different countries claim similar well-being indicators for their countries in order to support political decisions not necessarily leading to economic wealth, but to higher life satisfaction and well-being in society². In order to develop, understand and interpret well-being measures, deeper insights into the determinants of well-being and its internal structure are demanded.

² See e.g. Stiglitz, Sen, and Fitoussi (2009) for well-being indices in France and Waldron (2010) for subjective well-being measures by the UK Office for National Statistics.

Machine learning is used in this study to add to the scientific discussion about the definition of well-being, its predictors and the characteristics of its dependencies. To do so, this study aims for a prediction of well-being upon the structures identified within these dependencies via machine learning analysis. Predictive analytics aims to predict real-world variables based upon historic data. It relates to topics of 'big data,' since predictive analytics powers the information mining of those approaches. Computational power developed in recent years allows us to analyze and find non-parametric correlations within vast and complex amounts of data. The ability to trace correlational linkages on a numeric level, which a human brain or, more generally, rule-based engines are not able to grasp, enables a new understanding of those complex contexts within datasets. Initially, machine learning algorithms were developed for data compression³ upon underlying structures, but nowadays they also allow for an interpretation of the data's context (Hastie et al., 2009; Nilsson, 2005).

³ Compare for example mp3 and jpeg data compression techniques, email-spam detection or handwritten digit recognition (Hastie, Tibshirani, & Friedman, 2009).

Predictive machine learning algorithms not only learn from historic data by identifying the numeric linkages between variables over time, thereby providing an understanding of the context; they furthermore predict the future trajectories of certain variables. These algorithms utilize the basics of statistical analysis, such as Bayes' rule or Gaussian distributions, and scale them through repeated computational application. Numeric approaches calculate conditional probabilities, conduct multidimensional and multilayered regressions, and optimize parameters upon e.g. the maximum likelihood principle in recursive procedures, thereby achieving an expressiveness that has not been reached before (cf. Heckerman, 1996). What is, for example, achieved by interaction terms within linear models is implemented natively by most machine learning algorithms using, for example, local neighborhood estimators or multilayered regressions.
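To make the idea of a local neighborhood estimator concrete, the sketch below shows how a k-nearest-neighbor regression of a well-being score on personality and demographic predictors could be fitted and tuned with 10-fold cross-validation in the caret package, which is used for model fitting and cross-validation later in this study. The data frame `wellbeing`, the outcome column `hfi` and the tuning grid are hypothetical placeholders and not the study's actual setup.

```r
# Minimal illustrative sketch (hypothetical data and column names).
library(caret)

set.seed(42)

# 10-fold cross-validation, as used for model comparison in this study
ctrl <- trainControl(method = "cv", number = 10)

knn_fit <- train(
  hfi ~ .,                                    # well-being index regressed on all predictors
  data       = wellbeing,                     # assumed: one row per participant
  method     = "knn",                         # local neighborhood estimator
  preProcess = c("center", "scale"),          # k-NN is sensitive to predictor scales
  tuneGrid   = data.frame(k = seq(3, 25, 2)), # candidate neighborhood sizes (assumed grid)
  trControl  = ctrl,
  metric     = "RMSE"
)

knn_fit$results   # cross-validated RMSE for each neighborhood size k
```

Because the prediction for a new participant is an average over the most similar training participants, such an estimator picks up interactions between predictors without them having to be specified explicitly, which is the point made above about interaction terms.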

For social problems such as the present well-being prediction task, machine learning is still rarely applied. However, machine learning promises valuable insights for understanding the underlying structures determining well-being. Besides the predominant goal of machine learning algorithms to explain as much of the variance enclosed in the data as possible, a second goal arises: identifying the predominant predictors. Since social problems generally include a high number of predictors with relatively low explanatory power, profound interpretability is essential (Hainmueller & Hazlett, 2013). Consequently, this study applies a wide range of different machine learning algorithms to test their capabilities for the prediction of well-being on the one hand, and to identify the underlying dependencies between demographics, personality and well-being on the other.

1.2 Purpose of the Study

Within this study, machine learning algorithms are applied to improve the understanding of the correlative dependencies between well-being, personality traits and demographic data. The main intention of this study is the prediction of the future well-being level of individuals based upon the analyzed data. Moreover, the predictors' importance and non-parametric influences are analyzed in order to identify the underlying structures and to understand each predictor's partial influence on individual well-being and its development over time. Generally, the gained knowledge is intended to add to the scientific aim of a robust definition of well-being. This study reviews first results from a feasibility study on the predictability of well-being by Hall, Caton, and Weinhardt (2013, p. 21), amends the data set and aims to underpin their findings through predictive analytics with machine learning techniques. These include kernel smoothing algorithms such as k-nearest neighbor and local linear regressions, as well as neural network approaches.

1.3 Layout of the Study

This study consists of seven chapters. A general literature review of the relevant historical research is given in the following chapter. This includes definition attempts for well-being, identified determinants of well-being, the measurement of well-being and basic background on the machine learning techniques utilized in this study. Given this background, the study's research questions are derived, explained and condensed into four hypotheses formulated in chapter three. The fourth chapter explains the methodology by which the empirical survey is conducted and the data analyzed. Moreover, it includes an overview of the participants' demographics, the questionnaires and the applied algorithms. The cross-validation techniques used are explained in order to provide an understanding of the validity tests conducted. Results for the conducted algorithms are given in chapter five. For each analysis, a subsection first explains the applied algorithm in detail, including the definition and setting of parameters, and then outlines the obtained results. In order to evaluate the results with regard to the formulated hypotheses, chapter six aggregates the findings, before implications for further research are presented in the final, seventh chapter.

2. Literature Review

2.1 Defining Well-Being

Research has not yet come up with an agreed, finalized definition of well-being. This is especially astonishing given the long history of well-being literature: Aristotle already proposed attempts to define well-being more than 2000 years ago (Aristotle, 2002). However, over the centuries and especially in the last thirty years the attitude towards happiness has changed, and new dimensions of well-being have defined the common understanding. Moreover, the evaluation of individual well-being varies from concept to concept, including psychological well-being, subjective well-being and economically calculated well-being (Diener & Suh, 1997; Frey & Stutzer, 2002; Hall et al., 2013). The absence of a unified measurement of well-being supports the lack of a single definition (Dodge, Daly, Huyton, & Sanders, 2012; Ryan & Deci, 2001; Ryff, 1989).

Dodge et al. (2012) identified historic definition attempts, but outline that they focused rather on the dimensions than on the actual definition of well-being. The existence of a well-being baseline influenced by individual life satisfaction is widely supported (Brickman & Campbell, 1971; Headey & Wearing, 1991; Veenhoven, 1984). Additionally, positive and negative affect seem to influence individual well-being (Ryff, 1989). Diener (1994) accounted that well-being "comprises people's longer-term levels of pleasant affect, lack of unpleasant affect, and life satisfaction" (p. 103).

Moreover, it has to be questioned whether well-being actually exists as an objectively measurable 'real thing' or whether it might be a construct of several different factors, each measurable individually and all together misinterpreted as a measurable phenomenon. Nevertheless, Dodge et al. (2012) disagreed with the proposition of well-being as a construct and proposed "that well-being should be considered to be a state" (p. 226), in which the essential qualities in life are relatively stable (Headey & Wearing, 1991).

The aim is for well-being to finally be defined, with subjective, individually perceived well-being providing us the possibility to measure well-being (Frey & Stutzer, 2002). However, considering the historically proposed approaches towards a definition of well-being is important in order to understand the found dependencies, the applied measurements and hence this study's hypotheses regarding the nature of well-being.

2.1.1 Perspectives on Well-being

To understand well-being, it is important to differentiate between hedonistic and eudemonistic approaches. While hedonism is focused on the individual subjective perception of happiness, eudemonism describes well-being as an objective state in life, reached not by the notion that "one is pleased with one's life," but if the individual has "what is worth desiring and worth having in life" (Diener, 1984; Telfer, 1980; Waterman, 1993).

2.1.1.1 Eudemonia and Psychological Well-being

The eudemonistic, normative perspective has its roots in the work of Aristotle, who refused the hedonistic view of well-being as perceived happiness, because striving for hedonistic well-being would cause a life of gratification similar to that of "grazing animals" (Aristotle, 2002, p. 98). Instead he proclaimed well-being as an overall goal in life based on virtues and the achievement of what is desirable in life, nowadays also referred to as the feeling of personal expressiveness and self-realization (Waterman, 1993). The concept of eudemonia calls on individuals "to realize their full potentialities in order to achieve a good life" (Diener & Suh, 1997, p. 189). The eudemonic perspective is rather objective, since the standards defining 'being well' are assumed to be universal and equal for every single individual.

Closely linked is psychological well-being (PWB), comprising indicators that assess individual well-being from different psychological perspectives. Ryff and Keyes (1995) proposed six measures for psychological well-being, namely "positive evaluations of oneself and one's past life (self-acceptance), a sense of continued growth and development as a person (personal growth), the belief that one's life is purposeful and meaningful (purpose in life), the possession of quality relations with others (positive relations with others), the capacity to manage effectively one's life and surrounding world (environmental mastery), and a sense of self-determination (autonomy)" (p. 720). Significant correlations between these measures and other measures such as life satisfaction and self-esteem were found (Ryff, 1989). The underlying aim of psychological well-being is to cover positive human functioning as an important part of well-being. Again, this is closely linked to Aristotle's eudemonic perspective on well-being. However, the standards for being well according to psychological well-being are partially adapted to the individual's valuation (cf. e.g. 'self-acceptance', 'purpose in life'), in case self-reports and questionnaires are used to gather psychological well-being information.

Moreover, this perspective on well-being is supported by the possibilities of objective physiological measurements. Being able to detect the brain waves responsible for positive emotions and well-being allows for a technical definition and measurement of well-being. Thereby an objective, valuation-independent rating and measurement of well-being becomes possible. Figure 2.1 illustrates the differences between subjective and objective well-being as well as possible measures (Frey & Stutzer, 2002).

Figure 2.1: Objective vs. subjective happiness (based on Frey & Stutzer, 2002, p. 4). Objective happiness is assessed via physiological measures (e.g. brain waves), subjective happiness via psychological measures of affect and cognition/memory (e.g. experience sampling measures and global self-reports).

2.1.1.2 Hedonism and Subjective Well-being

Contrary to eudemonia, modern well-being literature revives the hedonistic perspective on well-being, as individual subjective well-being (SWB) measures have come to the fore. In this regard, well-being is defined "in terms of pleasure versus pain" (Ryan & Deci, 2001, p. 144), essentially equivalent to hedonism as initially introduced by Aristotle (2002). In contrast to Aristotle's attitude, each individual's perceived happiness is assessed as important, so that the individual's relative standards are applied for well-being measurements, which are hence subjective.

Nowadays, most studies on well-being correlation employ a subjective well-being measure in order to assess well-being upon the standards of the respondent (Diener, 1984). Frey and Stutzer (2002, p. 2) proposed that SWB is a "useful way out" of the well-being definition dilemma, because one can just "ask the individuals how happy they feel themselves to be." In one of the first meta-analyses on subjective well-being, Diener (1984) concluded that subjective well-being covers happiness and life satisfaction, as well as positive affect. The eudemonic, objective well-being conditions such as health, virtue and wealth are obviously not directly part of subjective well-being, but are seen as potential influences. These days, there is evidence in the literature that subjective well-being generally aggregates three interrelated components, namely positive affect, the absence of negative affect and overall life satisfaction (Diener, 1984). Many studies agreed with this classification and defined subjective well-being upon these components (see e.g. Dodge et al., 2012; Veenhoven, 2010).

In order to decouple the results from short-term effects, some studies apply the naturalistic experience-sampling method (ESM), in which "researchers assess respondents' SWB at random moments in their everyday lives, usually over a period of one to four weeks" (Diener, 2000). Kahneman, Krueger, Schkade, Schwarz, and Stone (2004) proposed the day reconstruction method (DRM) for the measurement of SWB. Different approaches to well-being measures are discussed in section 2.3.

Even if contrarily defined by Aristotle (2002), eudemonia or personal expressiveness is nowadays seen as one possible track to achieve hedonistic well-being, measured as subjective well-being (Telfer, 1980; Waterman, 1993). This view is supported by correlational dependencies between personal expressiveness and hedonic enjoyment identified by Waterman (1993). However, for the understanding of the feeling "happiness" both perspectives remain important. Ryan and Deci (2001) concluded that "the two traditions – hedonism and eudemonism – are founded on distinct views of human nature and of what constitutes a good society" (p. 143).

2.1.1.3 Economic Well-being

Besides PWB and SWB, economic well-being refers to external measures upon economic indicators, including for example income, wealth, social security and safety. It is based on the assumption that certain levels of these economic measures allow individuals to achieve personal fulfillment, which again results in well-being. According to the new welfare economics, individual choices are based on maximizing utility in order to achieve personal well-being. Utility, initially introduced as a cardinal value, was then interpreted as an ordinal measure indicating personal preferences for maximizing life satisfaction (cf. Frey & Stutzer, 2002). However, recent developments show that these self-concerned preferences might not be sufficient to explain changes in personal well-being (cf. Ng, 1997). It is for example argued that "people are not always able to choose the greatest amount of utility for themselves" (Frey & Stutzer, 2002, p. 22). Furthermore, Scitovsky (1976) found in his book The Joyless Economy that most pleasures are not achieved as a result of economic decisions. Following recent progress on the measurement of well-being, it is consequently demanded that "utility should be given content in terms of happiness, and that it [. . . ] should be measured" (Frey & Stutzer, 2002, p. 20). Moreover, the explicit measurement of well-being as utility would enable interpersonal comparison within economic theory (cf. Easterlin, 1974; Ng, 1997).

However, the conventional economic perspective on well-being is still popular, since it is relatively easy to measure, tangible and widely used to support political decision-making (cf. Diener & Seligman, 2004).

Economic well-being measures are not intended to provide insights about personal well-being levels, but about well-being on a more general, averaged or even national basis. Nevertheless, several studies support a correlation between economic indicators and subjective well-being, albeit rather on a macroeconomic scale (see e.g. Stevenson et al., 2008). Diener and Seligman (2004) explained the importance of economic measures for well-being particularly for the "early stages of economic development, when the fulfillment of basic needs was the main issue" (p. 1), but relativize this importance for highly developed countries. This assessment is based on the 'Easterlin paradox' describing a saturation point in the relationship between income and well-being on a national basis (cf. Easterlin, 1974). The finding has been confirmed several times (cf. e.g. Easterlin, 1995; Oswald, 1997) and is not only observed in comparisons between countries, but also in time-series analyses of averaged national data. Economically saturated countries such as the US do not obtain higher averaged well-being when income per capita rises over time (cf. Clark, Frijters, & Shields, 2008). The paradox is explained by the decreasing importance of additional income when basic needs are satisfied (cf. Stevenson et al., 2008).

Consequently, several countries including the US, France, Germany and the UK have initialized programs measuring well-being independently from economic measures (see e.g. Enquete-Kommission, 2013; Stiglitz et al., 2009; Waldron, 2010). Nevertheless, decision-making is still mainly based on the underlying idea of economic well-being that increased wealth and social status lead to higher well-being within the society. Economic well-being is therefore widely used as an argument in favor of economically beneficial development (cf. Diener & Seligman, 2004) and does not entirely reflect the individual's self-perceived well-being. Economic indicators are therefore not directly considered within this study; however, they might have influenced the applied subjective well-being measures.

2.1.2 Well-Being Baseline

Subjective well-being has been found to be 'fairly stable' for most people most of the time (Headey & Wearing, 1991; Veenhoven, 1984). This position is based on the finding of Brickman and Campbell (1971), who identified a baseline of well-being that an individual tends to return to after positive or negative external influences. Headey and Wearing (1991) explained this with an equilibrium of "stock levels, psychic income flows and subjective well-being" (p. 49). Stock levels include the individual's social background, networks and personality, while the flows of psychic income describe favorable and adverse events yielding satisfaction or distress.

Figure 2.2: Stocks and flows framework (based on Headey & Wearing, 1991, p. 56). Stocks (social background: sex, age, socio-economic status; personality: extraversion, neuroticism, openness; social network: intimate attachments, friendship network) and flows of psychic income (favorable events which yield satisfaction, adverse events which yield distress) jointly determine subjective well-being in terms of life satisfaction, positive affect and negative affect.

According to the model given in Figure 2.2, well-being in the dimensions of life satisfaction as well as positive and negative affect depends on the income flows covering all recent experiences in life and on the stock levels, which are stable in the medium term. Stock levels are thereby responsible for the stability of well-being, while positive and negative income flows cause fluctuation around the baseline. Furthermore, the stocks also provide the resources to cope with life experiences. Headey and Wearing (1991) therefore proposed that "favorable events and high levels of psychic income are due high stock levels" (p. 61). Consequently, the stocks including background, personality and social network influence well-being in two different ways: firstly, they determine the well-being baseline; secondly, they moderate the effect of life experiences on well-being (see also Figure 2.2) (Headey & Wearing, 1991).

As a result, each individual has their own level of subjective well-being depending on individual stocks and flows. The dependency on personality and social background provides each individual with different well-being values even if they experienced the exact same favorable or adverse events. Headey and Wearing (1991) concluded that the model "account[s] for stability and change in subjective well-being in the medium term; say, five to ten years" (p. 66), but also that stock levels (e.g. personality) may change in the long term, leading to changes of the well-being baseline.

Several other authors have empirically validated the baseline theory, nowadays also referred to as set-point theory. For example, Suh, Diener, and Fujita (1996) found within a study of 115 participants "that only life events during the previous 3 months influenced life satisfaction and positive and negative affect" and that their "magnitude drops quickly afterward" (p. 1095). Cummins (2009) described the essence of the theory as "a self-correcting process that maintains stability around set-points that differ between individuals" and that "SWB is neurologically maintained in a state of dynamic equilibrium" (p. 4). Nevertheless, Fujita and Diener (2005) also had to limit the idea of a well-being baseline as they analyzed data from a representative 17-year study from Germany. It was found that the life satisfaction of 9% of the respondents had changed more than 3 points on a 10-point scale from the first to the last five years on average. They concluded that "life satisfaction can and does change for some people, even in the face of significant stabilizing factors such as heritable disposition" (p. 162). They also supported the general idea of "a 'soft baseline' for life satisfaction, with people fluctuating around a stable set point that nonetheless does move for about a quarter of the population" (p. 162).
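The two roles of the stocks described above can be made explicit in a small formal sketch. The notation below is not taken from the cited sources; it is only one way, under the stated assumptions, to write down a person-specific baseline with event-driven fluctuation around it:

```latex
% Illustrative formalization (symbols are assumptions, not from the literature):
% SWB of person i at time t decomposes into a stable set point determined by the
% stocks S_i, plus the moderated effect of the current psychic income flows F_{i,t}.
\[
  \mathrm{SWB}_{i,t}
  \;=\; \underbrace{b(S_i)}_{\text{baseline from stocks}}
  \;+\; \underbrace{m(S_i)\, F_{i,t}}_{\text{moderated flow effect}}
  \;+\; \varepsilon_{i,t}
\]
% where b(.) sets the individual set point, m(.) captures how strongly events move
% SWB away from it, and \varepsilon_{i,t} is residual short-term variation.
```

Read this way, the prediction task of this study targets the baseline component, while the fluctuation term corresponds to the positive and negative affect discussed in the next section.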

8

2. Literature Review

with people fluctuating around a stable set point that

defensive range, in which the affects have a small in-

nonetheless does move for about a quarter of the pop-

fluence on SWB only, until the challenge becomes

ulation” (p. 162).

too strong and SWB drops out of the homeostatic defense range (see Figure 2.3). Cummins (2009) concluded if ”this condition is chronic, people experience

2.1.3 The Influence of Positive and Negative Affect

depression” (p. 15).

In 1991 Diener, Sandvik, and Pavot (1991) already

2.1.4 Equilibrium Theory

argued that ”well-being can be equated with the relative amount of time a person experiences positive vs.

Dodge et al. (2012) proposed an integrated approach

negative affect” (p. 136). The intensity of the affects

towards a definition of well-being, containing the

seems to be of less impact (see also Larsen, Diener,

baseline theory as well as the concepts of positive and

& Emmons, 1985; Lyubomirsky et al., 2005). Fol-

negative affect. They identified well-being as a state

lowing the idea of a ”set-point” for well-being Cum-

reached, when the individual’s resources fit with the

mins (2009) proclaimed that the theory does ”not

individual’s challenges, so that the equilibrium is sta-

attempt to account for the nature of the relationship

ble. The resources and challenges are thereby influ-

between SWB in dynamic equilibrium, and other de-

enced by long-term and short-term changes on the

mographic and psychological variables” (p. 4). To

psychological, social and physical field. Dodge et al.

cope with this, Cummins (2009) proposed a certain

(2012) described well-being to be stable accordingly

process of SWB management for positive and nega-

to their model, when ”individuals have the psycho-

tive affects, called SWB homeostasis.

logical, social and physical resources they need to The process addresses the question, of which level of

meet a particular psychological, social and/or phys-

affect has what impact on SWB. A high resilience of

ical challenge” (p. 230). If challenges and resources

the participant’s SWB against moderately challeng-

are out of balance, the individual well-being drops

ing life conditions has been found, leading to a wide

(see Figure 2.4 for a graphical representation).

Dominant Sources of SWB Control Setpoint

Homeostasis (Defensive range)

80 Set point range

Challenging conditions Upper threshold

Strong homeostatic defense

70

Lower threshold

SWB

50 No challenge

Strength of challenging agent

Figure 2.3: SWB homeostasis (based on Cummins, 2009, p. 5)

8

Very strong challenge

2.2. Determinants of Well-Being

Resources Psychological Social Physical

Wellbeing

9

Challenges

ciety, has social confidants, and possesses adequate

Psychological Social Physical

resources for making progress toward valued goals” (p. 295). This sections reviews the most relevant determinant in the three categories demographics, personality traits and the way of life in order to extend Diener’s picture (see Diener et al., 1999), but it does

Figure 2.4: Well-being equilibrium definition (based on Dodge et al., 2012, p. 230)

not claim completeness in this regard.

2.2.1 Demographics ground

To summarize, well-being definitions described in lit-

/

One’s

Back-

erature share the common idea of an individual level of well-being determined by personality, social and

One of the most discussed determinants of well-being

physical factors. Those determinants influence the

in recent years is the by Blanchflower and Oswald

general well-being level on the one hand, and oppose challenging conditions and corresponding affect on

(2008) identified correlation of well-being and age. During their study they examined 500,000 randomly

the other, in order to preserve the individual well-

sampled Americans and Europeans from the General

being level. Identified determinants for well-being

Social Surveys of the United States and the Euro-

are hence outlined in the following chapter.

barometer Surveys. According to which they have found a robust U-shape, reaching the minimum of well-being in the middle age (see also Clark & Os-

2.2 Determinants of Well-Being

wald, 2006). This finding is also supported by Stone,

Numerous studies aim to identify the most impor-

Schwartz, Broderick, and Deaton (2010) and Blanch-

tant determinants of well-being.

And even more

flower (2001). Frey and Stutzer (2002, p. 53) added

researchers have identified correlational linkages to

that ”the old have lower expectation and aspirations,”

well-being within their research (Veenhoven, 1984).

so that ”the gap between goals and achievement is

For example Sheldon and Hoon (2006) confirmed

smaller” and the perceived life satisfaction is con-

their hypothesis of psychological need-satisfaction,

sequently higher. They even reported older people

a positive Big Five trait profile, good personal

to be better adjusted ”to their conditions” and are

goal-progress, high self-esteem, positive social sup-

therefore happier. This would support a positive cor-

port, and a happiness-conducing cultural member-

relation of well-being with age. Age is therefore con-

ship would each uniquely correlate with SWB. How-

sidered to be important for our predictive analytics

ever, most studies agreed that ”mutual interaction

approach.

must be taken in account” and that ”causation may go in both directions (Frey & Stutzer, 2002, p. 103).”

Related is the finding that different generations, also referred to as cohorts, are characterized by differ-

Ryff and Keyes (1995) identified in their study auton-

ent average well-being levels. Blanchflower and Os-

omy, environmental mastery, personal growth, posi-

wald (2008) traced this finding to the circumstances, good or bad times, the cohorts experienced in their

tive relations with others, purpose in life and selfacceptance as key determinants for well-being. Di-

life. Interestingly, they have found evidence, ”that

ener, Suh, Lucas, and Smith (1999) reviewed the

successive American birth-cohorts have become pro-

last thirty years of SWB research and predicted that

gressively less happy between 1900 and today” (p.

”progress will have been even more rapid [thirty years

1740).

from now] than it has been in the past three decades” (p. 295). They draw a picture of their ’happy person’

Less important, but still discussed in research is the

and concluded that the ”person is blessed with a pos-

linkage between well-being and gender.

itive temperament, tends to look on the bright side of

there is no conclusive evidence of significant correla-

things, and does not ruminate excessively about bad

tion between sex and well-being (cf. Diener & Lucas,

events, and is living in an economically developed so-

1999). This finding is especially astonishing, since

9

However,

10

2. Literature Review

women suffer significantly more often from depres-

the US, which were explained by ”differences in the

sion and unpleasant affect (cf. for a literature review

norms governing the experience of emotion [. . . ] due

Fujita, Diener, & Sandvik, 1991). Diener and Lucas

to affective regulation” (p. 7). Moreover, regional

(1999, p. 292) identified one possible explanation as

differences have been found between European coun-

women experience ”both positive and negative emo-

tries as well as in worldwide comparisons (cf. Deaton,

tions more strongly and frequently than men.” The

2007). However, for the latter, results were found

finding is based on research by Fujita et al. (1991),

to highly correlate with the national gross domestic

in which sex accounted for only 1% of the well-being

product per capita. To summarize, a variable for the

variance, but for 13% of the variance of the intensity

location is included within the analysis. Location

of positive and negative affect.

refers to the cultural system the participant lives in and accounts therefore for cultural, geographic and

Further determinants such as the ethnic group and

ethnic differences.

religion have been tested for correlation. Taking religion Frey and Stutzer (2002, p. 59) for example sum-

2.2.2 Personality Traits

marized that ”the effect is not large.” Nevertheless, many studies found significant positive influences (cf.

Personality determines well-being on several levels.

Diener et al., 1999; Dolan, Peasgood, & White, 2008).

DeNeve and Cooper (1998) conduced a meta-analysis

Ellison (1991, p. 80) suggested that the ”positive in-

and identified 137 distinct personality constructs correlating with subjective well-being. The importance

fluence of religious certainty on well-being [. . . ] is direct and substantial: individuals with strong re-

of personality traits for well-being is explained with

ligious faith report higher levels of life satisfaction,

the top-down theory of well-being assuming that

greater personal happiness, and fewer negative psy-

there is a ”global propensity to experience things in a

chosocial consequences of traumatic life events.” Ac-

positive way” (Diener, 1984, p. 565), which is based

cording to the study, religion was found to explain 5%

on the individuals personality (DeNeve & Cooper,

- 7% of the well-being variance. Moreover, Diener et

1998).

al. (1999) concluded that religion ”may provide both

Contrariwise, bottom-up theories1 explain

well-being as a ”sum [of] the momentary pleasures

psychological and social benefit,” since it provides a

and pains” (Diener, 1984, p. 565), which is closely

”sense of meaning in daily life” and furthermore offers

linked to the theory of positive and negative affect

a ”collective identity and reliable social networks” (p.

(Diener et al., 1991). Additionally, Steel, Schmidt,

289).

and Shultz (2008) found that ”considerable advances have been made in the psychobiology of both SWB

Regarding ethnic differences within one country like

and personality” (p. 139) and ”that the two con-

the US it was found that the ”gap between the happi-

structs share common physical underpinnings” (p.

ness of the white and the black populations has nar-

139), which explain the correlational findings be-

rowed” (Frey & Stutzer, 2002, p. 56). The authors

tween personality and well-being. Beside these di-

traced this back to reduced discrimination. However,

rect psychobiologic linkages, it is argued, ”personal-

differences between ethnics can still be observed. Luttmer (2005) for example found that Hispanics

ity may help create life events that influence SWB” (p. 139). An often replicated finding in this regard

show significantly higher well-being values than other

is the linkage between sociability, a facet of extraver-

ethnics. Moreover, whites tend to show higher lev-

sion, and positive affect as a component of well-being

els of well-being than African American (cf. Dolan

(Steel et al., 2008). In general, extraversion2 is posi-

et al., 2008). However, in this study religion and

tively correlated to well-being, while neuroticism has

ethnic groups are not considered directly, as found

a negative influence (¸Sim¸sek & Koydemir, 2012; Steel

influences are comparably small. Nevertheless, it is

et al., 2008).

accounted for regional differences embodying those influences at least partially. Diener, Suh, Smith, and

1 For

a comparison of top-down and bottom-up well-being see also Headey, Veenhoven, and Wearing (1991). 2 ’Extraversion’ and ’extroversion’ are often used interchangeably within literature.

Shao (1995) found regional differences between the Pacific Rim countries (e.g. Japan, China, Korea) and

10

2.2. Determinants of Well-Being

11

Personality is measured and categorized in different

stability are the most important determinants of

dimensions and scales, of which the most common set

those three. This finding is also supported by Vit-

is the ”big five personality trait taxonomy” initially

tersø (2001). In order to perform predictive analytics

proposed by McCrae and Costa Jr. (1985) upon their

on well-being the big five must be considered as im-

previously published NEO (Neuroticism, Extraver-

portant predictors.

sion, Openness) model and the five personality facAnother dimension of personality to be considered

tors initially found by Norman (1963). The factors

is the differentiation between maximizing and satis-

originate from a set of 16 personality factors iden-

ficing individuals. Nenkov, Morrin, Ward, Hulland,

tified by Cattell (1947), of which the big five have

and Schwartz (2008) developed a short form of the

been proven to be replicable in many other studies

maximization scale, initially introduced by Schwartz

(see Goldberg, 1993). The big five scale measures

et al. (2002). The question addressed, whether ”peo-

personality in five dimensions, namely extraversion

ple [can] feel worse off as the options they face in-

vs. introversion, agreeableness vs. antagonism, con-

crease,” is related to well-being (Schwartz et al., 2002,

sciousness vs. lack of direction, neuroticism vs. emo-

p. 1178). It has been found, ”that maximizers re-

tional stability and openness vs. closeness to expe-

ported significantly less life satisfaction, happiness,

rience (John et al., 2008; John & Srivastava, 1999).

optimism, and self-esteem, and significantly more re-

This study utilizes the 44 single item scale proposed

gret and depression, than did satisfiers.” However,

by John, Donahue, and Kentle (1991) with a five-

it is also doubted that maximizing always prevents

point scale for each item. Each item adds to one of

from being well. In order to address this question,

the five factors; some of the items are reverse coded.

more advanced research exceeding correlational anal-

For the original conceptual definition of the five fac-

ysis has to be conducted.

tors see Table 2.1.

The third psychometric measure included in this Although different views regarding the importance

study is a fairness test. The questions are based on

of each personality trait occur, extraversion, agree-

a research by Schmitt and D¨orfel (1999), who found

ableness and neuroticism have been found to be the

a negative correlation between procedural injustice

most important determinants of well-being (DeNeve

and psychometric well-being. The correlational find-

& Cooper, 1998; Haslam, Whelan, & Bastian, 2009;

ings were moderated by justice sensitivity, which is

Steel et al., 2008). In 1998 DeNeve & Cooper pro-

referred to as fairness in this study. Schmitt, Goll-

posed that neuroticism and respectively emotional

witzer, Maes, and Arbach (2005) refined the find-

Factor labels and conceptual definitions:

Extraversion (Energy, Enthusiasm): Implies an energetic approach toward the social and material world and includes traits such as sociability, activity, assertiveness, and positive emotionality.

Agreeableness (Altruism, Affection): Contrasts a prosocial and communal orientation toward others with antagonism and includes traits such as altruism, tender-mindedness, trust, and modesty.

Conscientiousness (Constraint, Control of impulse): Describes socially prescribed impulse control that facilitates task- and goal-directed behavior, such as thinking before acting, delaying gratification, following norms and rules, and planning, organizing, and prioritizing tasks.

Neuroticism (Negative Emotionality, Nervousness): Contrasts emotional stability and even-temperedness with negative emotionality, such as feeling anxious, nervous, sad, and tense.

Openness (Originality, Open-Mindedness): Describes the breadth, depth, originality, and complexity of an individual's mental and experiential life.

Table 2.1: Big Five Trait Taxonomy - Factor Definition (based on John et al., 2008)


In order to measure the overall fairness sensitivity for participants, this study averages a simplified version of the questionnaire by Schmitt et al. (2005) over all three perspectives.

Concluding, many studies verify the importance of personality as a determinant of well-being. Consequently, the big five trait taxonomy as well as the maximizer vs. satisficer scale and the fairness score are included in this study in order to cover a broad range of personality factors.

2.2.3 Life Situation

Besides the rather static predictors personality and demographics, background decisions and circumstances in life have been found to be significantly correlated with well-being, including life satisfaction as well as positive and negative affect.

Firstly, employment has been found to be important. McKee-Ryan, Song, Wanberg, and Kinicki (2005) reviewed 104 studies and concluded that "unemployed individuals had lower psychological and physical well-being than did their employed counterparts" (p. 67). Frey and Stutzer (2002) supported the finding, explaining that "job satisfaction is a crucial part in life satisfaction" (p. 107). The measured drop in SWB is not only due to income losses, but also due to social effects, because "not having work leads to isolation, which makes it difficult or impossible to lead a satisfactory life" (p. 107). Lucas, Clark, Georgellis, and Diener (2004) continued to support the set-point theory, identifying an influence of employment as "individuals first reacted strongly to unemployment and then shifted back toward their former (or 'baseline') levels of life satisfaction" (p. 2). However, on average they found that "individuals did not completely return to their former levels of life satisfaction, even after they became re-employed" (p. 2). Unemployment influences the well-being baseline in the long run, even if "considerable individual differences in reaction and adaptation to unemployment" were found (Lucas et al., 2004, p. 2). Blanchflower (2001) supported this finding, but also outlined moderating geographic and demographic influences on this correlational finding. Differences between the examined countries in East Europe and the individuals' age were observed. Moreover, Frey and Stutzer (2002, p. 101) reminded that "individuals tend to evaluate their own situation relative to others," so that "both, the psychic and the social effects are mitigated" when "unemployment is seen to hit many persons one knows or hears of." This finding is supported by Shields and Price (2005), who analyzed the UK general health questionnaire and found that the effect of individual unemployment is neutralized in areas with the highest employment deprivation values (> 22%). In those areas the average unemployed person was "estimated to have at least the same level of psychological well-being as an equivalent employee" (p. 531). The mitigating effect is not only found for geographic areas, but also for partners. Employees with an unemployed partner show significantly lower well-being levels, whereas having an employed partner increases the well-being of the unemployed (cf. Clark, 2003). Dolan et al. (2008) argued in their literature review that those impacts result from "the extent to which the individual can substitute other activities for work, belong to non-work based social networks and are able to legitimize their unemployment" (p. 102). Generally, Frey and Stutzer (2002) pointed out the importance of one's reference group for the influence of unemployment on well-being and concluded that "unemployed people's well-being [...] depends on the strength of the social norm to work" (p. 102).

Closely linked to this insight are the findings on income. Contrary to the 'Easterlin paradox' (see Easterlin, 1974), which suggests well-being to be independent from income per capita as a measure of a society's economic development after a certain saturation point, Stevenson et al. (2008) found a "clear positive link" and "no evidence of a satiation point beyond which wealthier countries have no further increases in subjective well-being" (p. 1). Nevertheless, it has to be taken into account that these studies refer to countrywide averages and do not assess the dependency on an individual level. Stevenson and Wolfers (2013) refined their previous conclusion from 2008 and added in-country analyses. They found a roughly linear-log relationship, but still rejected the existence of a satiation point or a "critical income


level beyond which the well-being-income relationship is qualitatively different" (p. 598). Frey and Stutzer (2002) supported the finding of a positive correlation, but propose the existence of an aspiration level, from which well-being is constant with increasing income. Similar results have recently been found by Jorm and Ryan (2014), who analyzed eight research databases and found an increase of subjective well-being with increasing income per capita on a national basis, even if these gains decrease for richer countries. Poverty affects well-being when it affects basic needs, but once those are satisfied the linkage becomes less significant and more complex. It has also been found that income is judged in relation to one's social environment, so that high income only raises well-being when it is comparably high in relation to others (Frey & Stutzer, 2002, p. 85f.). Clark et al. (2008) supported this finding and argued that the macroeconomic finding by Easterlin (see 1974) is consistent with the found positive correlations of income and subjective well-being on an individual level, when relative income terms are used to explain the gains in utility. Diener et al. (1999) concluded that "wealthy people are only somewhat happier than poor people in rich nations, whereas wealthy nations appear much happier than poor ones."

Education is a further variable for which an influence has been found, even if the correlational findings seem to correlate significantly with those for income (cf. e.g. Clark, 2003; Diener, Sandvik, Seidlitz, & Diener, 1993; Witter, Okun, Stock, & Haring, 1984). High education leads to higher income, so that it has to be questioned whether "education may be only indirectly related to well-being" (Diener et al., 1999, p. 293). However, Blanchflower and Oswald (2004) found significant positive influences of higher education levels within their US and UK studies and concluded that "education is playing a role independently of income" (p. 1371). Diener et al. (1999) summarized that "education is more highly related to well-being for individuals with lower incomes" (p. 293), but the independent influences have not finally been identified yet (cf. Diener et al., 1993). Moreover, the influence is in particular found for low-income countries (cf. Dolan et al., 2008). Diener et al. (1999) proposed that education has an influence on well-being as it allows "individuals to make progress toward their goals or to adapt to changes in the world around them" (p. 293). Nevertheless, it is also seen that education raises expectations as well as aspirations and might therefore lower SWB due to a less satisfied life (cf. Diener et al., 1999). Because of the correlational findings between education and income, several studies accounting for both variables found negative influences of higher education on well-being (cf. Campbell, Converse, & Rodgers, 1976; Clark, 2003). Clark (2003) concluded that education either "raises expectations at the same time as outcomes" or is "endogenous, being chosen by people who are 'naturally' more difficult to please" (p. 331). In general, the findings regarding education differ crucially depending on whether correlated variables such as income and occupational status are accounted for.

Similar research has been conducted on the linkage between success and subjective well-being. Lyubomirsky et al. (2005) argued that the correlation is found because "success makes people happy, but also because positive affect engenders success" (p. 803). The latter is also supported by Wright and Cropanzano (2000) in the field of job performance, because well-being was found to be a predictor of job satisfaction (cf. also Harter et al., 2003). While there is agreement in the literature that higher levels of well-being lead to success in life (see also Diener & Tay, 2013), it is contrarily discussed whether success results in higher well-being. For example, Samuel, Bergman, and Hupka-Brunner (2013) found no evidence for adolescents that a "lack of educational and occupational success [...] leads to a decrease in well-being as hypothesized" (p. 90). On the other hand, Diener et al. (1999) concluded that success or the achievement of personal goals has a positive effect on subjective well-being and that people "react negatively when they fail to achieve goals" (p. 284). However, they also emphasized that those "goals serve as an important reference standard for the affect system" (p. 284) and may hence lead to higher or lower well-being values depending on whether the goal meets the person's needs and whether they are therefore valued (see also Brunstein, Schultheiss, & Grässmann, 1998). Moreover, it was found that "simply having valued goals independent of past success, was associated with higher life satisfaction" (p. 285), so that the actual successful


achievement of those goals might not be the predominant factor (see also Emmons & Diener, 1986). Since Diener's review in 1999, little insight has been gained regarding the relation of well-being and success; the dependencies between goal achievement, the valuation of goals and their influence on well-being are still insufficiently explained.

Furthermore, the linkage between well-being and health, not only mental but also physical health, is researched extensively. Within their meta-analysis, Diener et al. (1999) concluded that SWB is strongly correlated to health, but only if the health measure is self-reported. Correlation with objective health measured by physicians is considerably weaker. But they also note that the perception of health is influenced by the actual objective health, negative affect and the individual's personality (Diener et al., 1999). Although objective health plays this minor role for SWB, it is rated highest when "respondents are asked to judge the importance of various domains in life" (Diener et al., 1999, p. 293). One explanation given for this contradiction is that "people appear to be remarkably effective in coping, using cognitive strategies [...] that induce a positive image of their health condition" (Diener et al., 1999, p. 287). The subjectively perceived health is consequently higher than the actual one. Nevertheless, it has also been found that perceived health does sometimes not recover entirely after periods of serious illness. The more different diseases were diagnosed per test person, the weaker was the capability to recover after the drop of well-being (Diener et al., 1999). A positive correlation between health and well-being is also recently reported by Jorm and Ryan (2014) and Lacey et al. (2008), who examined the effect of illnesses on the recalibration of the quality of life, respectively well-being, scale. Within their study with patients and non-patients, no evidence could be found for a significant recalibration of the well-being scale due to the illnesses (Lacey et al., 2008). However, subjectively perceived health is considered important in this study in order to predict subjective well-being, since it reflects both the eudemonic virtue of a healthy life and the hedonistic perspective of perceived physical health.

Taking all determinants of SWB (demographic background, personality traits and life situation) together, the review shows that many different determinants explain a significant share of the SWB variance. Still, little research has actually addressed the linkages and moderating effects between those significant determinants in order to predict well-being sufficiently. Before the existing literature on predictive analytics on well-being is reviewed, the question of how to measure well-being has to be addressed. Within the reviewed studies on determinants, several different measurements have been used, yet every single study persistently discussed the same question of how to measure well-being.

2.3 Measuring Well-Being

Due to the complexity of defining well-being, there is no single right answer on how to measure well-being (see e.g. Larsen et al., 1985). Nevertheless, in order to study well-being and its determinants, a well-founded measuring approach has to be applied. Generally, most studies within the last two decades have used subjective well-being measures including "life satisfaction, the presence of positive mood, and the absence of negative mood" (Ryan & Deci, 2001, p. 144).

Within this study well-being is assessed via the Human Flourishing Index (HFI), developed as a conceptual framework by Huppert and So (2011). Well-being is considered as "positive mental health" (p. 837) and subjectively measured with a ten-feature questionnaire. The features are derived by inversion of "internationally agreed criteria for depression and anxiety" (p. 837), which are defined as the opposite of mental health, respectively well-being. Depression and anxiety were selected as they have the "highest prevalence in the population" (p. 842) and also because "the other categories of anxiety disorder (phobias, OCD, PTSD) do not have a polar opposite" (p. 842). The features include feelings as well as functioning and therefore account for both hedonistic and eudemonic well-being. Namely they are stated as: "competence, emotional stability, engagement, meaning, optimism, positive emotion, positive relationships, resilience, self-esteem, and vitality" (p. 837). A panel of three psychologists and one lay person developed each feature as the mirror opposite of


a symptom of one of the mental disorders, depression or anxiety.

Huppert and So (2011) validated the framework upon the European Social Survey (ESS). They found two underlying factors explaining 43% of the variance. Additional analysis evaluating the differences between the framework and the single-item measure life satisfaction from the ESS showed a high correlation between positive emotion and life satisfaction, which can be equated with happiness and hedonic well-being. In the resulting model explaining 52% of the variance, both items loaded on a third factor, which was hence explained as the hedonic part of the HFI. Consequently, Huppert and So (2011) concluded that the first two factors including "all other items are measuring eudemonic aspects of well-being" (p. 845) and explained the two eudemonic factors as positive characteristics and positive functioning. Hence, the HFI is a framework measuring hedonic well-being in terms of positive emotion, as well as the eudemonic constructs of positive characteristics and positive functioning.

In order to propose an operational definition of flourishing, Huppert and So (2011) defined the criterion for flourishing as "having all but one of the features of positive characteristics and all but one of the features of positive functioning, together with positive emotion" (p. 845). This again is founded upon the definitions for depression and anxiety requiring at least one of the inverted features to be present. Hall, Kimbrough, Haas, Weinhardt, and Caton (2012) emphasized that the operationalized definition suggested by Huppert and So (2011) "is an excellent representation of current well-being literature and its multidimensional properties" (p. 3). A mathematical representation of the operational definition is provided:

HFI = pe \cdot I_c \cdot I_f \cdot \left( \sum_{j=1}^{n} c_j + \sum_{k=1}^{m} f_k \right)

I_c = \begin{cases} 1, & \text{if } |P_c| \geq n - 1 \\ 0, & \text{else} \end{cases} \qquad
I_f = \begin{cases} 1, & \text{if } |P_f| \geq m - 1 \\ 0, & \text{else} \end{cases}

P_c = \{ c_j : c_j > 0 \}, \qquad P_f = \{ f_k : f_k > 0 \}

where pe stands for the question on positive emotion, c_j for the positive characteristics (vitality, resilience, optimism, happiness, self-esteem) and f_k for the positive functioning items (engagement, meaning, positive relationships, competence).
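As an illustration of this definition, a minimal R sketch of the HFI computation is given below; the function and argument names as well as the example scores are hypothetical placeholders, not the study's original analysis code.

# Sketch of the HFI computation (assumed inputs: pe = positive emotion score,
# characteristics = the five positive-characteristic scores, functioning = the
# four positive-functioning scores; values > 0 indicate the feature is present).
hfi <- function(pe, characteristics, functioning) {
  n  <- length(characteristics)
  m  <- length(functioning)
  Ic <- as.numeric(sum(characteristics > 0) >= n - 1)   # |P_c| >= n - 1
  If <- as.numeric(sum(functioning     > 0) >= m - 1)   # |P_f| >= m - 1
  pe * Ic * If * (sum(characteristics) + sum(functioning))
}

# Hypothetical participant: one missing characteristic still counts as flourishing
hfi(pe = 1, characteristics = c(1, 1, 0, 1, 1), functioning = c(1, 1, 1, 1))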

Hall et al. (2012) used the HFI framework for their approach towards the "gamification" of well-being measurement. They developed a Facebook game assessing the participants' well-being via the HFI framework. Participants were able to track their SWB over time and send their data, enriched by demographic information, to the authors for scientific purposes. The authors proved the feasibility of well-being measurement via social networks, concluding that "well-being games are a means to support the design and management of complex institutions and virtual communities" (Hall et al., 2012, p. 8).

Since the HFI intends to measure hedonic as well as eudemonic well-being, it also reduces the risk embedded in exclusively hedonic approaches. Veenhoven (1984) outlined that "people may in their heart know that they are disappointed with life, but repress that thought, because they cannot deal with its consequence" (p. 44-45). Moreover, the diversification of well-being over different features reduces the impact of a single measure. Overstatement and misinformation, widely reported in SWB measures, are therefore less likely (Veenhoven, 1984). Veenhoven (1984) moreover suggested including "non-verbal cues" (p. 46) and "expert ratings" (p. 47) in the assessment, but for simplicity this is not considered any further within this study. He also addressed the problems of contextual and response-type biases as well as differing participants' moods, which is why the well-being questions will be answered in different ways. For "overall well-being" he concluded that "happiness can be assessed only by asking people about it" and that "self-ratings are to be preferred to ratings by others" (Veenhoven, 1984, p. 62).

Other techniques address the problem of biased information, for example with a close link of the question to a certain event or activity. The "Day Reconstruction Method" (DRM) by Kahneman et al. (2004) identified the remembered well-being for each


activity and experience of the preceding day. The participants "first revive memories of the previous day by constructing a diary consisting of a sequence of episodes. Then they describe each episode by answering questions about the situation and about the feelings that they experienced" (p. 1776). The review of the previous day causes recent memories to lose dominance, so that errors and biases of recall are reduced (Kahneman et al., 2004). The survey part of this method is based on the experience sampling method (ESM) developed by Diener (2000), as feelings in different situations are aggregated toward an overall well-being measure. But deviating from the ESM, Kahneman et al. (2004) proposed that the DRM allows for measuring a sufficient number of different events during just one day and that it is therefore more efficient.

All currently discussed well-being measures have one characteristic in common: they aim for an average of the participants' well-being feelings, either over time and therefore different events, or via different dimensions measured on various scales. Diener, Emmons, Larsen, and Griffin (1985, p. 13) claimed that single-item measures are "generally less reliable over time than multi-item scales, are probably more susceptible to acquiescence response bias, are more likely to be affected by the particular wording of the item, may not be entirely suitable for parametric analysis since responses tend to be highly skewed, and do not provide an assessment of the separate components of subjective well-being." A multivariate measure is therefore also used within this thesis. Larsen et al. (1985) also suggested that life satisfaction should play a major role within the multivariate analysis, since it accounts for the rather stable parts of well-being and provides "high temporal reliability" (p. 14). The HFI embodies this finding, as life satisfaction, respectively positive emotion, is multiplicatively included in the HFI calculation.

Moreover, Kahneman and Krueger (2006) gave advice on how to measure well-being and reported findings on the influence of changing weather conditions. Weather has been found to be an important determinant of well-being (higher well-being on nicer days). On the other hand, according to Kahneman and Krueger (2006) this influence is eliminated if "individuals are first asked explicitly about the weather" (p. 6). In this study participants are consequently asked about the weather or the temperature outside before addressing the HFI questions.

2.4 Machine Learning on Well-Being

So far little research has been conducted on the explanation of well-being observations with machine learning techniques. Although machine learning has gained growing importance for the analysis of high-dimensional, non-linear data3, numerical problems in social science are rarely treated with machine learning so far. The following chapter reflects the rare examples of machine learning within the social sciences, especially with personality data as in this study, with due regard to the underlying principles of the various algorithms used.

Two different types of machine learning need to be distinguished: (1) Firstly, supervised learning, referring to a setup in which the dependencies of one dependent variable (or of a clustering) upon the other independent variables, also referred to as predictors, are learned. The dependent variable is present and known within the historic data, which is also referred to as the training set of data points. (2) Secondly and in contrast, unsupervised learning means the identification of clusters within the given historic data set. The dependent variable is not part of the training set and describes some sort of higher-order classification assigning each training data point to a specific cluster. Unsupervised learning is used to reduce complexity, gain understanding of the data and interpret complex data structures, while supervised learning is generally suitable for the prediction of future development and the closing of gaps within the dataset. (Hastie et al., 2009; Nilsson, 2005)

Both approaches for understanding contexts within data are computationally expensive, since the learning is based on repeated, sequential adaption of the algorithm's parameters. Thereby, depending on the algorithm's characteristics, local and global optima are found via a stepwise approach upon the objective function's gradient. (Hastie et al., 2009)

3 Compare public discussions on the fields of 'big data'.


Supervised learning has, for example, been applied by Minbashian, Bright, and Bird (2009), who analyzed the applicability and accuracy of neural networks in comparison to multiple regressions on the prediction of work performance upon the big five trait taxonomy (see chapter 2.2.2 Personality Traits). The authors were able to "identify the specific [...] nonlinear relationships" (p. 540) and found "superior performance of the neural networks" (p. 540) regarding a relative accuracy measure. However, both methods achieved comparable absolute accuracy values.

Neural networks, also referred to as artificial neural networks (ANN) and multi-layer perceptrons (MLP), are non-parametric classification and/or regression tools to solve supervised learning problems. They consist of a usually fully-connected network4 with several layers of nodes. The first layer is the input layer, including one node for each independent variable of the dataset. The last layer is the output layer, which consists of one node for each dependent variable, or for each class in classification problems. The layers in-between are "hidden layers." The number of hidden layers and the number of nodes per hidden layer are the most important model parameters. Each node performs a weighted linear combination of a bias constant (usually 1) and all node values of the previous layer. The sum is processed through a non-linear activation function, which is usually the sigmoid / logistic function, resulting in a node value between zero and one (see Figure 2.5). Therefore, inputs and outputs have to be standardized on a scale between zero and one before being processed by a neural network. The edge weights of the linear combination are computed via an iterative training procedure, comparing the predicted output with the training value and then adjusting the weights. Several different learning functions for different types of data and networks have been developed in recent years (e.g. backpropagation, scaled conjugate gradient descent). Figure 2.6 illustrates the basic structure of a neural network. For further details on the algorithms applied in this study see the methodology chapter (Anthony & Bartlett, 2009).

Figure 2.5: Sigmoid / logistic regression function
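A minimal R sketch of the node computation just described is given below; the three input values and the weights are arbitrary illustrations, and the snippet only demonstrates the sigmoid activation of a weighted sum, not the network implementations used later in this study.

# Logistic (sigmoid) activation: maps any real number to the interval (0, 1)
sigmoid <- function(z) 1 / (1 + exp(-z))

# One node: weighted linear combination of a bias constant and the previous layer
node_value <- function(inputs, weights, bias_weight) {
  sigmoid(bias_weight * 1 + sum(weights * inputs))
}

node_value(inputs = c(0.2, 0.7, 0.5), weights = c(0.4, -0.3, 0.8), bias_weight = 0.1)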

In the study on the predictability of work performance, Minbashian et al. (2009) found that "neural networks performed as well or better than MR [multiple regression] equations [...] with respect to a relational index of predictive accuracy" (p. 554). The use of neural networks is therefore recommended for prediction upon unknown and complex data, especially "when theory about the underlying functional relationships is not strong" (p. 554). The neural networks applied in their study achieved a prediction-original correlation of r = 0.55 with one hidden layer. Comparable research was conducted by Martínez, Rodríguez-Díaz, Licea, and Castro (2010), who used neural networks extended by fuzzy systems (namely an adaptive neuro-fuzzy inference system) with personality data as inputs in order to assign employees to different software engineering roles. The algorithm is capable of describing the roles upon personality profiles and is furthermore used to develop decision rules for the employee-role match. One of the first applications of neural networks to personality-related prediction problems is found in the study on workplace behavior by Collins and Clark (1993). They utilized the at that time newly developed neural networks for classification into a low and a high performance group. It was concluded that the neural network performed at least as well as other classification methods. Nowadays, neural networks also contribute to the understanding of human personality itself. Read et al. (2010) proposed a complex, highly integrative neural network structure in order to accurately model personality. Quek and Moskowitz (2007) focused on the analysis of event-contingent recorded data in order to identify personality patterns upon neural networks.

4 There are edges between all nodes of two adjacent layers.


Figure 2.6: Neural network example (1 hidden layer with 5 hidden nodes) - input layer with the independent variables, one hidden layer, and an output layer with the dependent variable

Nevertheless, neural networks lack well-interpretable results. While prediction accuracy and mathematical expressiveness are comparably high, the computed weights within the neural network are rarely interpretable, especially if the neural network consists of multiple hidden layers. For social problems, however, interpretability is an important consideration. Often, a high number of predictors explains a comparably small amount of variance, so that identifying the important predictors and individual influences is crucial for scientific results (cf. Hainmueller & Hazlett, 2013).

Kernel smoothing algorithms provide an alternative allowing for an individual inspection of single predictors. Kernel smoothing algorithms solve supervised non-parametric regression problems by estimation upon the nearest neighbors of the newly requested data point in the training sample (Nilsson, 2005). The kernel describes the selection of the neighborhood in which the estimation takes place. Thereby, the kernel size as well as the individual local fitting operations can be subjected to greater analysis and interpretation. Friedman (2006) provided a good mathematical definition of kernel smoothing.

Several different kernels and estimation methods have to be distinguished. Firstly, the kernels differ in shape. Uniform kernels are used for example in k-nearest neighbor; Gaussian, triangular or Epanechnikov kernels weight the neighbors' influence on the result upon their distance to the requested point, which is the center of the kernel. Different kernels embody various variables, as for example the number of neighbors included, or the bandwidth and scale of the kernel, determining the kernel's width. For each independent variable a kernel with different characteristics can be used. See Figure 2.7 for a graphical representation of possible kernel types (cf. Hofmann, Schölkopf, & Smola, 2008).

Secondly, within the kernel different methods can be applied to calculate the dependent variable for the kernel center (the requested point). Most common are, for example, the average or a local linear regression. Figure 2.8 illustrates an Epanechnikov kernel with a local linear regression estimator for one independent variable. The actual regression line results from repeated application of the kernel algorithm for each step of the input scale.
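To make the kernel idea concrete, the following base-R sketch implements an Epanechnikov kernel with a simple local (weighted) average estimator for one predictor; the bandwidth, the synthetic data and the function names are illustrative assumptions rather than the exact configuration applied later in this study.

# Epanechnikov kernel weight for the scaled distance u = (x - x0) / bandwidth
epanechnikov <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Kernel average smoother: weighted mean of y inside the kernel centered at x0
kernel_smooth <- function(x, y, x0, bandwidth = 0.1) {
  w <- epanechnikov((x - x0) / bandwidth)
  sum(w * y) / sum(w)
}

# Repeated application over a grid of requested points yields the smoothed line
set.seed(1)
x    <- runif(100)
y    <- sin(2 * pi * x) + rnorm(100, sd = 0.2)
grid <- seq(0, 1, by = 0.01)
fit  <- sapply(grid, function(x0) kernel_smooth(x, y, x0, bandwidth = 0.1))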

Other machine learning algorithms have also been used for predictions related to personality. For example, Chittaranjan, Blom, and Gatica-Perez (2011) used support vector machines (SVM) to predict the big five personality traits upon long-term recorded smart-phone data. SVM were initially developed to solve two-class supervised classification problems (see Cortes & Vapnik, 1995). Therefore, the algorithm computes a hyperplane separating the two classes by maximizing the minimum distance between the training points and the hyperplane. In order to describe non-parametric structures, SVMs apply kernel smoothing.


Figure 2.7: Different kernel shapes (uniform/rectangular, Epanechnikov, biweight, Gauss, triangular)

The hyperplanes of all kernel environments result in a non-parametric separation between the two classes. Friedman (2006) provided a good mathematical description of SVMs (see also Basak, Pal, & Patranabis, 2007; Cortes & Vapnik, 1995).

Further developments of the SVM perform regressions as well. This is referred to as support vector regression (SVR, cf. Vapnik, Golowich, & Smola, 1997).

Figure 2.8: Kernel-smoothing example: Epanechnikov kernel with local linear regression (kernel environment, data points in kernel, and kernel average smoother fit versus the original function)


Figure 2.9: Support Vector Regression: fitting inside the kernel (local fitting area with central prediction line, accepted range ε, and penalized variation outside it)

In comparison to the kernel smoothing algorithms mentioned before (e.g. local linear regression), the function inside the kernel of SVMs does not minimize the training error (e.g. least squares), but instead minimizes the generalization error bounds of the local linear function (Basak et al., 2007). All training points within an ε-environment of the linear function are excluded from the error calculation. All points exceeding this error bound are penalized (see Figure 2.9). Thereby the SVM generates a more general prediction model and is computationally less expensive than minimizing the entire training error (Basak et al., 2007).
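As a hedged sketch of how such an ε-insensitive regression could be fitted in R, the e1071 package provides an SVM implementation with an eps-regression mode; the synthetic data and the parameter values below are placeholders only, not the settings used in this study.

library(e1071)   # provides svm() with support vector regression

set.seed(1)
x <- matrix(rnorm(100 * 13), ncol = 13)   # placeholder predictor matrix
y <- rnorm(100)                           # placeholder outcome

# eps-regression: residuals inside the epsilon tube are not penalized
fit  <- svm(x, y, type = "eps-regression", kernel = "radial",
            epsilon = 0.1, cost = 1)
pred <- predict(fit, x)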

Social sciences often encounter the problem of comparably noisy and high-dimensional data, since questionnaire answers are based on subjective judgments and interpretations. Within machine learning, feature selection algorithms address this demand for a reduction of dimensionality and the identification of the most important predictors in a model. Many algorithms of varying complexity have been developed over the last decades. Two different approaches can be observed. Firstly, feature selection upon generalized linear models, as for example lasso regression, least angle regression or elastic net regression (Efron, Hastie, Johnstone, & Tibshirani, 2004; Tibshirani, 1996; Zou & Hastie, 2005). Those algorithms penalize the number of predictors included in the linear model. Secondly, this procedure can be applied inside a kernel smoothing environment. Thereby, the regularization can either be done on a global level or for each kernel individually, as for example in the lazy lasso algorithm (Vidaurre, Bielza, & Larrañaga, 2011).
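A sketch of the first, generalized-linear-model approach using the glmnet package is given below (lasso for alpha = 1, elastic net for intermediate alpha); the predictor matrix and outcome are synthetic placeholders rather than the study's data.

library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 13), ncol = 13)   # placeholder: 13 standardized predictors
y <- rnorm(100)                           # placeholder: outcome (e.g. HFI baseline)

lasso <- cv.glmnet(x, y, alpha = 1)       # lasso: cross-validated penalty strength
enet  <- cv.glmnet(x, y, alpha = 0.5)     # elastic net: mix of L1 and L2 penalties

coef(lasso, s = "lambda.min")             # predictors retained under the chosen penalty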


3. Research Questions

The research conducted tests hypotheses on the nature of personal well-being. Derived from the literature on well-being, each participant's well-being is assumed to follow certain, still unknown rules. The first assumption is based on the proposals by Brickman and Campbell (1971) and Headey and Wearing (1991) that well-being is characterized by a baseline which is constant over mid-term periods. It is assumed that their finding can be supported by this study's dataset. The first hypothesis is therefore as follows:

Hypothesis 1: Each individual has a well-being baseline, which is constant in the mid-run.

As stated in the literature review, well-being has been found to be dependent on personality and demographics (Diener et al., 1999; Veenhoven, 1984). Obviously, this does not include daily oscillations, but should explain a large proportion of the variance between participants' well-being baselines. Previous research focused on linear models and simple explanation of variance to prove dependencies. In this study several non-linear and non-parametric approaches are tested in order to predict the well-being baseline. It is questioned whether this approach upon demographics and personality leads to higher proportions of explained variance than simple linear models and whether the individual well-being level can be predicted sufficiently. The second hypothesis tested is therefore stated as follows:

Hypothesis 2: Each individual's well-being baseline is primarily dependent on and predictable by the individual's psychometric profile and demographic variables.

Besides the aim for a sufficient well-being prediction, the analysis with machine learning techniques is expected to provide a new perspective on the importance of the included single personality and demographic measures, as for example extraversion, neuroticism and agreeableness as well as age and education. Which of the determinants has a significant influence when analyzed by advanced statistical methods? Are interaction effects beyond the simple multiplication of determinants as in an ANOVA important? And which of the algorithms is applicable for a sufficient prediction without over-fitting the dataset? Therefore, this study broadly explores the power and the result characteristics of several machine learning techniques developed in recent years with regard to well-being and personality data.

Aside from the prediction of the well-being baseline, it is questioned how the actual well-being trajectory floats around the baseline and whether personality and demographics influence these short-term well-being oscillations. Two additional hypotheses are derived from this research question. Closely linked to the first hypothesis, it is expected that the individual well-being trajectory follows the baseline.

Hypothesis 3: Each individual's well-being trajectory floats around the well-being baseline.

Related to the suggestion that the baseline is influenced and predictable by demographics and psychometric measures (compare Hypothesis 2), it is questioned whether the trajectory's characteristics can be reduced to certain personality traits and demographics. Personality influences the way we deal with external triggers such as positive and negative affect. Only repeated measuring of well-being over a certain time frame allows for a comparison of different well-being trajectories. This study does not cover external influences themselves, but questions whether there is a linkage between personality and the well-being trajectory which can be explained by machine learning. The fourth hypothesis is consequently stated as follows:

Hypothesis 4: Each individual's future well-being trajectory can be approximated upon its well-being baseline, personality, demographics and historic well-being data.

To summarize, this study tests the capabilities of several machine learning algorithms in order to provide


insights regarding the dependencies between personal well-being (dependent variable) and personality as well as demographics (independent variables). This research is conducted from two different perspectives. On the one hand, the first and second hypotheses cover the fairly constant mid-term well-being. On the other hand, hypotheses three and four refer to short-term changes within a person's well-being trajectory.


4. Methodology

4.1 Participants

This study is based on an online survey with four sequential questionnaires and an overall number of 126 questions. A first dataset with 85 participants was generated by Hall et al. (2013) during a four-week period in February 2013. The participants were asked by email to answer one questionnaire each Wednesday over the given period. Out of the 85 initial respondents from the first questionnaire in week one, 66 participants completed all four questionnaires entirely. Nine participants aborted after week two and another four participants after week three. From six participants only single values are missing. Due to the small sample size it was decided to repeat the survey during February 2014, one year after the first series, in order to avoid seasonal influences. An additional dataset with 343 respondents for the first questionnaire has been generated. The questions and the setting for the four questionnaires were equal to the ones in 2013. Out of the 343 respondents, 296 participants completed all four questionnaires (see Table 4.1).

Measure      2013                 2014                   combined
N            85                   343                    428
N_complete   66                   296                    362
Gender       32 female, 34 male   133 female, 163 male   165 female, 197 male

Table 4.1: Participants' descriptive statistics

As expected, participants' demographics differ between the two samples. While the sample in 2013 is internationally well distributed over America, Asia and Europe, the sample in 2014 consists of 86% Germans, 5% English, 2% Asians and 2% Americans. Both samples are fairly well distributed between genders, but characterized by a high number of participants at ages between 18 and 35 (78% of all participants). Due to the university background within the social networks the surveys were distributed in, the majority of participants has a university degree (62% of all participants). Nevertheless, participants with lower educational degrees are covered as well. For histograms of the demographic variables see Figure 4.1.

Generally, this study is not dependent on a statistically representative sample of the society, since none of the analyses are based on input variable distributions. More important is the sample's completeness, in order to cover as many different input variable settings as possible and fully represent the field of analysis. The age as well as the education variable meets this requirement.

4.2 Apparatus and Materials

In order to measure the independent and dependent variables, empirical questionnaires have been used as described in the literature review. Next to the basic demographics (gender, age and location), education, employment and health have been asked with a single question each as indicators for different life situations. The personality traits are covered with three psychometric measures, namely the 44-item big five inventory test, the 13-item maximizer vs. satisficer test and a 3-item fairness measure (John et al., 1991; John & Srivastava, 1999; Nenkov et al., 2008; Schmitt & Dörfel, 1999).1 Moreover, the dependent variable personal well-being is measured by the human flourishing index (HFI) each week (see Huppert & So, 2011). These 10 weekly asked HFI questions are randomized in order to reduce the chance of recognition by the participants. All psychological and HFI questions were asked on a five- to nine-point Likert scale (cf. Likert, 1974). A weather control question at the beginning of each questionnaire eliminates external weather influences on the recorded well-being (cf. Kahneman & Krueger, 2006).

1 For the big five personality test and the maximizer-satisficer measure the constructs' goodness of fit has been proven by confirmatory factor analyses.

In order to reduce participants' workload, the three psychometric measures were distributed over four weeks. The questionnaires for the four weeks contained the following constructs and single questions:


Figure 4.1: Participants' demographic structure - (a) participants by location, (b) participants by gender, (c) participants by age, (d) participants by education (2013 and 2014 datasets)

1. First week: Weather control question, 13-item maximizer vs. satisficer test, 10-dimensional human flourishing index, demographics

2. Second week: Weather control question, 44-item big five inventory test, 10-dimensional human flourishing index

3. Third week: Weather control question, 3-item fairness test, 10-dimensional human flourishing index

4. Fourth week: Weather control question, 10-dimensional human flourishing index

The surveys were conducted with SurveyMonkey2, an online survey application. The gathered data is processed in Microsoft Excel and analyzed with R3, an open-source math and statistics language. R has been chosen since it is a "flexible and powerful language that many data analysts are now using" (Beaujean, 2013, p. 1). R code was implemented using RStudio4, an open-source R editor.

2 See http://www.surveymonkey.com
3 See http://www.r-project.org
4 See http://www.rstudio.com

4.3 Data Retrieval Procedure

Participants were invited via posts in social networks, emails as well as personal contacts to register with their email address in a SurveyMonkey registration form during the weeks before the first survey was sent. Registration is necessary in order to gather a sufficient number of participants answering the following four questionnaires on the specific


Wednesdays in February. As an incentive, each participant had the chance to win one out of two 30 Euro Amazon vouchers. Moreover, for each participant completing all four surveys, 20 cents were given to the Unicef Childhood Foundation for their projects in Syria. This combination of lottery and donation incentive was chosen to attract egoistically as well as altruistically attuned participants. The lottery incentive is well established and has been found to increase response rates significantly (e.g. Deutskens, de Ruyter, Wetzels, & Oosterveld, 2004). In contrast, the literature on donations to charity is inconsistent and mainly still based on paper surveys. Some studies reported a significant influence of donations on the response rate (cf. Deutskens et al., 2004; Robertson & Bellenger, 1978). Other research rejected the influence (cf. Furse & Stewart, 1982; Hubbard & Little, 1988).

To ensure anonymity, each registered participant was assigned a personal random identification code, which allows for an anonymous match of the four questionnaires per participant without the participants' email addresses or names. In addition, the participants were able to access their personal reports after completing the survey via this code anonymously from a KIT webserver.

Each Wednesday morning in February (starting on 5 February 2014), SurveyMonkey automatically sent emails with an individual survey link dedicated to the participant's code to the list of registered participants. The email contained the individual link to the current Wednesday questionnaire, a link to unregister from the mailing list in order to abort further participation, and a note that the survey must be answered until the end of the given Wednesday. The email was sent at 1:00am CET to ensure that American and Asian participants had sufficient time to answer the questionnaires. Additional reminders were sent at 3:30pm and 8:30pm CET. Answers were accepted until Thursday morning 6:00am for European participants and correspondingly for other time zones.

In order to provide a benefit for the participants, a personal report including the well-being trajectory and the psychometric measures was generated for each participant one week after the final questionnaire was completed. The participants were informed via email how to access their personal report from a password-protected webserver at Karlsruhe Institute of Technology (KIT) upon their personal code. Reports were completely anonymous.

4.4 Analysis Procedure

Gathered survey data has been analyzed using different statistical and machine learning approaches. In total, 13 independent variables and 4 HFI data points were calculated per participant and then standardized between zero and one for the descriptive analyses. In order to perform the machine learning algorithms, the data has further been normalized to zero mean and a standard deviation of one per variable. The variables include six demographics and seven psychometric measures, calculated upon the single items well described in the literature. An overview of the data dimensionality can be gained from Figure 4.2.
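The two rescaling steps mentioned here can be sketched in base R as follows; the small example data frame is hypothetical and only illustrates the transformations, not the original preprocessing script.

# Min-max standardization to [0, 1] for the descriptive analyses
minmax <- function(v) (v - min(v, na.rm = TRUE)) / (max(v, na.rm = TRUE) - min(v, na.rm = TRUE))

# z-normalization (zero mean, unit standard deviation) for the learning algorithms
znorm <- function(v) (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)

# Applied column-wise to a (hypothetical) data frame of predictor scores
example <- data.frame(age = c(23, 31, 28, 45), neuroticism = c(2.1, 3.4, 2.8, 4.0))
as.data.frame(lapply(example, minmax))
as.data.frame(lapply(example, znorm))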

The following analyses were conducted exclusively on the described 13 + 4 variables. The four HFI variables have been averaged for some analyses (in particular regarding the first and second hypotheses). Several incomplete responses had to be removed from the dataset. Participants were included within the analyses when, on the one hand, all 13 independent variables were available and, on the other hand, the HFI values for at least three weeks could be calculated upon the ten measured values. Responses with incomplete independent variables or fewer than three HFI data points have been excluded. Regarding the personality measures, two thirds of the single items per measure had to be available for the mean calculation. Before the basic characteristics of the selected machine learning algorithms are outlined in this chapter, the challenge of combining the two different datasets from 2013 and 2014 is addressed.

4.4.1 Comparison of Datasets

Since this study is based on two different datasets (2013 and 2014), which are combined in order to gain a sufficient number of participants for further analysis, the feasibility of this combination was tested first. In order to test possible influences of the differences between the two datasets, a dummy variable for the 2013 and 2014 dataset was introduced.


Figure 4.2: Independent and dependent variables

After controlling for variables highly correlated with well-being, such as neuroticism and extraversion (p < 0.001), this dummy variable "Dataset" had no significant influence on the well-being index averaged for each participant over the four weeks (p > 0.38). The null hypothesis of no influence by the dataset cannot be rejected. See Figure 4.3 for the detailed ANOVA / linear model. Levene's test for homogeneity was conducted for 'dataset' and shows no significance for rejecting the null hypothesis of equal variances (p > 0.31).

Levene's Test for Homogeneity of Variance (center = median)
          Df  F value  Pr(>F)
group      1   1.0175  0.3138
         360

                Df  Sum Sq  Mean Sq  F value    Pr(>F)
Neuroticism      1   4.648    4.648  322.456   < 2e-16  ***
Extraverted      1   0.749    0.749   51.942  3.63e-12  ***
Agreeableness    1   0.148    0.148   10.251   0.00149  **
Optimisim        1   0.104    0.104    7.209   0.00760  **
Conscientious    1   0.379    0.379   26.291  4.90e-07  ***
Maximizer        1   0.110    0.110    7.643   0.00601  **
Fairness         1   0.011    0.011    0.774   0.37956
Health           1   0.116    0.116    8.072   0.00476  **
Age              1   0.021    0.021    1.454   0.22876
Location         3   0.059    0.020    1.354   0.25684
Gender           1   0.104    0.104    7.211   0.00760  **
Education        1   0.006    0.006    0.419   0.51763
Job              1   0.032    0.032    2.252   0.13437
Dataset          1   0.011    0.011    0.744   0.38906
Residuals      345   4.973    0.014
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 4.3: ANOVA Type-I
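A sketch of how such a check could be reproduced in R is given below, using synthetic data, only two of the substantive predictors for brevity, and Levene's test from the car package; it mirrors the procedure behind Figure 4.3 without being the original script.

library(car)   # leveneTest()

set.seed(1)
d <- data.frame(
  HFI         = rnorm(362),
  Neuroticism = rnorm(362),
  Extraverted = rnorm(362),
  Dataset     = factor(rep(c("2013", "2014"), times = c(66, 296)))
)

# Type-I ANOVA with the dataset dummy entered after the substantive predictors
summary(aov(HFI ~ Neuroticism + Extraverted + Dataset, data = d))

# Levene's test for homogeneity of variance across the two datasets
leveneTest(HFI ~ Dataset, data = d, center = median)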


However, there is a shift in means between the two datasets of 0.067 (~7% of the entire scale between 0 and 1), respectively 0.48 SD. The power of the t-test for 'dataset' is therefore power = 0.54, which is not sufficient to eliminate the type-II error. The dependent variable's (HFI) mean and standard deviation for both datasets are given in Table 4.2.

Measure   2013     2014     combined
µ_HFI     0.4940   0.5617   0.5493
SD_HFI    0.2035   0.1915   0.1954

Table 4.2: Descriptive statistics for dataset comparison

Furthermore, the analysis shows no significant differences in variances between the datasets when compared with a variance F-test. This applies for the well-being index averaged for each participant over the four weeks (p = 0.44) as well as if the individual weeks' data is compared (p = 0.23).

To summarize, no evidence for a strong influence of the dummy variable 'dataset' was found. Consequently, the datasets are conjointly used within the following analyses.

4.4.2 Algorithms and Methods used

In this study different machine learning and statistical methods were used to test the proposed hypotheses (see chapter 3 for details). First of all, descriptive statistics provide an overview of the available dataset. This is especially important to test the first hypothesis, whether subjective well-being follows an individual baseline. Secondly, machine learning was utilized to test the second hypothesis and predict subjective well-being (measured as the human flourishing index) upon the 13 predictor variables. Kernel smoothing as well as support vector machine and neural network algorithms were implemented and tested on the prediction problem. Additionally, feature selection algorithms were deployed in order to identify the predictors' importance. Table 4.3 provides a list of the performed algorithms.

Due to the high number of algorithms and different configurations, details and parameter sets are drawn together with their findings in the corresponding subsections within the results chapter. A general introduction, background and related literature regarding the algorithms' application in social sciences is also given in chapter 2.4. Machine learning is not only utilized to solve the prediction problem resulting from hypothesis two, but specific algorithms also contribute to the evaluation of hypotheses three and four. Consequently, some of the algorithms mentioned above were used in different contexts, contributing to the hypotheses' evaluation.

4.4.3 Cross Validation and Testing

For prediction problems in which accuracy is the predominant target, over-fitting the data is the most crucial concern (Cawley & Talbot, 2010). Over-fitting refers to the fact that powerful algorithms might actually fit the sample in such a precise manner that the computed results are not generally valid anymore when tested on new data. The algorithms should explain the structural variance within the data, but should not fit the entire variance within the training set, including non-structural variance, also referred to as noise (Kuhn & Johnson, 2013).

The most common solution to prevent over-fitting is cross-validation. This study utilizes k-fold cross-validation for testing different algorithms. The sample is thereby divided into k equal subsamples, whereof k − 1 subsamples are used for training purposes and one for testing. Training and testing is repeated k times, so that each subsample is used once as a testing sample. By this procedure, the accuracy of each training round is always tested on data points not yet used for training. Finally, the results of the k training-testing loops are averaged in order to receive one performance measure for the applied set of parameters (e.g. root-mean-squared error or R²). (Arlot & Celisse, 2010; Berrueta, Alonso-Salces, & Héberger, 2007; Bouckaert & Frank, 2004; Kuhn, 2008)

To avoid possible influences by random division in

identify the predictors’ importance. Table 4.3 pro-

k subsamples, repeated k-fold cross-validation con-

vides a list of the performed algorithms.

ducts the described process (folding the sample and

Due to the high number of algorithms and different

testing each fold) several times.

configurations, details and parameter sets are drawn

cross-validation initially proposed by Burman (1989)

together with their findings in the corresponding sub-

increases the reproducibility, even if the variance

sections within the results’ chapter. A general intro-

over the cross-validation increases (cf. Braga-Neto &

27

Repeated k-fold

28

4. Methodology

Category

Category            Algorithms
Kernel Smoothing    K-nearest neighbor with local average smoother
                    Generalized additive model using loess (local linear regression)
                    Generalized additive model using splines (linear regression after non-parametric transformation of inputs)
                    Non-parametric regression (local linear regression with varied kernels)
                    Support vector regression
Neural Networks     Neural network with standard backpropagation learning
                    Neural network with scaled conjugate gradient learning (SCG)
                    Extreme learning machine (ELM)
Feature Selection   Lasso regression
                    Elastic net regression
                    Lazy lasso regression

Table 4.3: Applied algorithms

This study's results are based on two-times repeated 10-fold cross-validation, if not explicitly stated otherwise. Deviations occur if certain algorithms' low resource consumption allows for a more often repeated approach in order to enhance precision, or if resource limitations require fewer repetitions and folds in order to deliver results in reasonable time at all.

In this study cross-validation is conducted with the caret5 package in R for most algorithms (cf. Kuhn et al., 2014). The caret package already includes implementations of common algorithms, but also allows defining custom models and parameter sets. The package splits the data, loops over the given parameter sets for each fold and repeated fold, and passes the training data together with the current parameters to the original learning algorithms for each loop. The procedure is summarized in Figure 4.4 (Kuhn, 2008).

Figure 4.4: Caret cross-validation procedure (Kuhn, 2014):
1  Define sets of model parameter values to evaluate
2  for each parameter set do
3    for each resampling iteration do
4      Hold-out specific samples
5      [Optional] Pre-process the data
6      Fit the model on the remainder
7      Predict the hold-out samples
8    end
9    Calculate the average performance across hold-out predictions
10 end
11 Determine the optimal parameter set
12 Fit the final model to all the training data using the optimal parameter set

5 Short for "Classification and Regression Training."

More formally: 1 2 3 4 5 6 7 8 9 10 11 12

Define sets of model parameter values to evaluate for each parameter set do for each resampling iteration do Hold–out specific samples [Optional] Pre–process the data Fit the model on the remainder Predict the hold–out samples end Calculate the average performance across hold–out predictions end Determine the optimal parameter set Fit the final model to all the training data using the optimal parameter set

There are options for customizing almost every step of this process (e.g. resampling technique, Figure 4.4: Caret cross-validation procedure choosing the optimal parameters etc). To demonstrate (Kuhn, 2014) this function, the Sonar data from the mlbench package will be used. The Sonar data consist of 208 data points collected on 60 predictors. The goal is to predict the two classes (M for metal cylinder or R for rock). First, we split the data into two groups: a training28set and a test set. To do this, the createDataPartition function is used:

4.4. Analysis Procedure

29

of the entire data. RM SE is therefore more accurate and used in most cases for comparison in this study. The dependent variable HFI is standardized to zero mean and standard deviation one for all analyses in order to ensure comparability. Consequently, the RM SE can be interpreted as the root residual sum of squares well known from linear regression and ANOVA analyzes. Hence, the squared RM SE corresponds to the variance not explained by the model. If R2 values are given for comparison, they refer to the adjusted R2 calculated by the Wherry formula (cf. Yin & Fan, 2001). Besides cross-validation, bootstrapping is a wellknown alternative for validation (cf. Efron, 1979). The procedure does not split the dataset as for example k-fold cross validation does, but instead samples a subset with replacement of the same size as the available dataset, fits the model upon this training subset and tests it upon the remaining points. Due to replacements, 63.2% of the data is on average used for training. Sampling, fitting and testing is repeated several times for an averaged accuracy result (Kohavi, 1995). Bootstrapping has been tested on several of the applied algorithms, but was not found to result in significantly different accuracy. Additionally, the newer bootstrap .632+ validation method introduced by Efron and Tibshirani (1997) did not provide reliable results, since the available dataset is too small for the accuracy smoothing proposed by Efron and Tibshirani (1997), such that over-fitting occurred. The caret package also implements parallel computing using the R multithread package doMC. Using two to four cores simultaneously reduces time consumption accordingly.

The technique has widely

been used for the following analyses.

29
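To make this setup concrete, the following minimal R sketch wires the repeated 10-fold cross-validation described above into caret's train() and trainControl(). The data frame wb, its columns and the simulated values are illustrative stand-ins rather than the study's data or code.

# Minimal sketch of the validation scaffolding, assuming a data frame `wb`
# with the standardized averaged HFI in `hfi` and the predictors as columns.
library(caret)

set.seed(1)
wb <- data.frame(hfi          = rnorm(362),   # simulated stand-in values
                 neuroticism  = rnorm(362),
                 extraversion = rnorm(362))   # remaining predictors omitted here

ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 2)

fit <- train(hfi ~ ., data = wb,
             method    = "glm",      # benchmark model; other methods plug in here
             trControl = ctrl,
             metric    = "RMSE")

fit$results                          # cross-validated RMSE, Rsquared and their SDs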

5 Results

5.1 Descriptive Analysis

Descriptive statistics were reviewed in order to achieve a greater understanding of the well-being data. Firstly, the four weeks of HFI data were averaged per participant and compared to the weekly data. Correlation analysis shows that each week's HFI is highly correlated with the averaged HFI. The correlation coefficient between the four weeks' HFI and the averaged HFI lies between 0.89 and 0.94. The correlation matrix (see Table 5.1) also indicates higher correlation coefficients for consecutive weeks' HFIs (0.80 - 0.85) compared to other pairs of weeks (0.71 - 0.78). These scores support previous findings, as for example by Lucas, Diener, and Suh (1996), who reported a correlation coefficient of 0.77 for a test-retest well-being survey over four weeks.

HFI        Week 1   Week 2   Week 3   Week 4
Average    0.90     0.94     0.93     0.89
Week 1     -        0.82     0.76     0.71
Week 2              -        0.85     0.78
Week 3                       -        0.80
Week 4                                -

Table 5.1: Weekly HFI correlation matrix

Weekly HFI variance explained by HFI average
Week 1     79.96%
Week 2     88.72%
Week 3     86.21%
Week 4     79.76%
Average    83.66%

Table 5.2: Explained variance of weekly HFI by the HFI average

Upon the comparably high correlation coefficients, it can be concluded that the amplitude within each participant's HFI is rather small compared to the overall scale of well-being (between zero and one). This finding is supported by simple linear regressions between the averaged HFI per participant and each participant's weekly well-being values. Each regression includes the averaged HFI with an intercept as the independent variable and one week's HFI value as the dependent variable. As shown in Table 5.2, the averaged HFI per participant accounts for 83.66% of the variance within the weekly HFI data.

The high percentage of explained variance indicates a larger deviation between participants than within each participant's HFI trajectory. This can also be found within the standard deviations (see Table 5.3). The averaged standard deviation within each participant's HFI values (0.077) is 2.5 times smaller than the standard deviation between participants' averaged HFI values (0.1954).

                               Dataset
Measure                        2013     2014     combined
Avg. SD within participant     0.0787   0.0765   0.0769
SD between participants        0.2035   0.1915   0.1954
Ratio                          2.59     2.50     2.54

Table 5.3: Standard deviation between and within participants' HFI trajectory

Figure 5.1 provides a descriptive impression of the HFI distribution, in which the data are sorted by the averaged HFI per participant (Footnote 1: participants with n ≥ 3 HFI data points included). The solid dark line indicates the averaged HFI per participant; the error bars cover each participant's single weekly values from minimum to maximum. The sample is well distributed over the whole well-being scale from zero to one, with an average of 0.55, as presented in the density plot. The small peaks at zero and one result from special characteristics of the HFI, which has several input constellations leading to extremes at zero and one (see chapter 2.3).

For each individual HFI data point, the hour of the day has been recorded in order to control for possible influences. Except for a slight decrease for late evenings after midnight, no significant influence was observed. Moreover, the lower averages during the night are based on only a few values with high variance and are hence not further considered as outliers.
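Purely as an illustration, the dispersion measures of Table 5.3 can be computed along the following lines; the long-format data frame and its column names are hypothetical stand-ins, and the simulated values will not reproduce the reported numbers.

# Hedged sketch of the within- and between-participant standard deviations.
set.seed(1)
hfi_long <- data.frame(participant = rep(1:362, each = 4),
                       week        = rep(1:4, times = 362),
                       hfi         = runif(362 * 4))       # simulated weekly HFI

within_sd  <- aggregate(hfi ~ participant, data = hfi_long, FUN = sd)
avg_within <- mean(within_sd$hfi)                          # cf. 0.077 above

person_avg <- aggregate(hfi ~ participant, data = hfi_long, FUN = mean)
between_sd <- sd(person_avg$hfi)                           # cf. 0.1954 above

between_sd / avg_within                                    # cf. the ratio of about 2.5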

Figure 5.1: HFI distribution and density (left panel: HFI index per participant; right panel: density of the HFI index)

In order to check for multicollinearity, a graphical representation of the correlation matrix for all variables in the dataset is given in Figure 5.2. None of the input variables is highly correlated with the others (|r| ≤ 0.44 for all bivariate correlations). The strongest correlation was found between age and education. Additionally, the condition number of the input matrix is 15.6, indicating moderate but no strong dependencies (Belsley, 1991; Mason & Perreault Jr., 1991). As a result, multicollinearity is not considered further, so that multivariate models can be applied without prior feature reduction (cf. also Kuhn & Johnson, 2013).

Figure 5.2: Correlation matrix of all variables (absolute values)

5.2 Generalized Linear Model

In terms of advanced machine learning algorithms, the generalized linear model (GLM) is an important benchmark. Therefore, a GLM including all 13 predictors and the averaged HFI as the dependent variable is conducted with 10-times repeated 10-fold cross-validation in order to ensure a highly reliable and repeatable result. The GLM is a generalization of the standard linear regression (Footnote 2: also referred to as ordinary least squares, OLS) in order to allow for non-normally distributed dependent variables (cf. Nelder & Wedderburn, 1972), which is in this case not necessary, since the HFI variable has been normalized. However, the GLM is available in the caret R package for cross-validation (cf. Kuhn, 2008) and is consequently used instead of a standard linear model, with similar results. The optimization results in an R² of 0.54 and an RMSE of 0.68 (Footnote 3: referring to the normalized data with SD = 1), as given in Figure 5.3.

Generalized Linear Model
358 samples
13 predictors
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 322, 322, 322, 322, 322, 322, ...
Resampling results:
  RMSE    Rsquared   RMSE SD   Rsquared SD
  0.678   0.537      0.0834    0.108

Figure 5.3: GLM fitted with the caret package

The non-cross-validated standard linear model fitted to the entire dataset reaches an only slightly better RMSE of 0.66, so that over-fitting is an unfounded concern for this model. The results are relatively equal for both of the combined datasets: for 2013 an RMSE of 0.67 and for 2014 an RMSE of 0.69 is achieved.

Compared to the SD of the averaged HFI (normalized to SD = 1), the GLM predicts the dependent variable 32% more accurately than a simple average prediction. Each predictor's importance, measured by the absolute value of the t-statistic, is given in Figure 5.4.

Figure 5.4: Variable importance in the GLM (absolute t-statistic); predictors ordered by importance: Neuroticism, Extraversion, Conscientiousness, Health, Gender, Optimism, Maximizer, Age, Job, Agreeableness, Location, Fairness, Education
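An importance ranking of this kind can in principle be read off a caret GLM fit; the short sketch below reuses the simulated fit object from the earlier example and is not the study's actual model object.

library(caret)

varImp(fit)               # absolute t-statistics, rescaled to 0-100 by caret
summary(fit$finalModel)   # coefficient estimates and standard errors (cf. Figure 5.5)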

These results support previous research identifying neuroticism and extraversion as the by far most important factors (DeNeve & Cooper, 1998; Haslam et al., 2009; Steel et al., 2008), followed by conscientiousness and self-reported physical health. For the regression coefficients see Figure 5.5. As expected, neuroticism is negatively and extraversion positively correlated with the HFI. Regarding the categorical variables, gender is negatively correlated, meaning male participants tend to show lower well-being than female participants. Education, as well as fairness, location, age and job, have no significant influence (p > 0.1). Remarkable is the comparably strong negative correlation of the personally perceived health: the healthier participants judge themselves to be, the lower is their measured well-being index.

Figure 5.5: GLM regression coefficients with standard error bars

In order to test for possible interactions, the GLM was fitted with linear interaction terms. The non-cross-validated fit has an RMSE of 0.55 (compare the GLM without interactions: RMSE = 0.66) with a significant, positive interaction term for optimism × age (p < 0.05). However, if the GLM with interactions is 10-times repeated 10-fold cross-validated, the accuracy drops to RMSE = 0.83. Consequently, the interaction terms do not explain structural variance but rather over-fit the data.

The results mentioned so far refer to the general well-being prediction problem with the averaged well-being index per person as the dependent variable. Besides this prediction problem resulting from the second hypothesis, the GLM is also applied to provide basic knowledge for hypotheses three and four, which aim at an understanding of the in-person well-being variance (see chapter 3).

Generalized Linear Model
358 samples
13 predictors
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 323, 322, 322, 322, 322, 322, ...
Resampling results:
  RMSE    Rsquared   RMSE SD   Rsquared SD
  0.999   0.0242     0.21      0.0321

Coefficients:
                Estimate    Std. Error   t value   Pr(>|t|)
(Intercept)     -0.003128   0.052704     -0.059    0.9527
Neuroticism      0.039829   0.069191      0.576    0.5652
Extraverted     -0.059986   0.059779     -1.003    0.3163
Agreeableness    0.044546   0.060848      0.732    0.4646
Optimism         0.004190   0.056109      0.075    0.9405
Conscientious   -0.044208   0.059889     -0.738    0.4609
Maximizer        0.078849   0.059527      1.325    0.1862
Fairness        -0.064784   0.054269     -1.194    0.2334
Health          -0.020235   0.059328     -0.341    0.7333
Age             -0.053825   0.063193     -0.852    0.3949
Location         0.089265   0.053402      1.672    0.0955 .
Gender           0.002496   0.059480      0.042    0.9666
Education       -0.011433   0.062806     -0.182    0.8557
Job             -0.007346   0.055374     -0.133    0.8945
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 5.6: GLM for the participants' in-person well-being variance

The results (see Figure 5.6) indicate that no linear dependence exists between the 13 predictor variables and the dependent variable, which is the normalized standard deviation of the four HFI measures per participant. None of the predictors is significant (p > 0.05), and the overall 10-times repeated 10-fold cross-validated model explains less than 1% of the variance within the participants' HFI standard deviation (RMSE = 0.999).

A similar analysis was conducted on the slope of each participant's well-being trajectory. To do so, each participant's four HFI data points were separately fitted with a linear regression. The regression coefficient indicating the slope was then normalized and used as the dependent variable within the GLM. However, the resulting GLM does not explain any variance in the participants' well-being slope upon the 13 predictor variables (RMSE > 1). None of the predictors had a significant influence (p > 0.05).

5.3 Kernel Smoothing Algorithms

The following kernel smoothing algorithms are applied to solve the general prediction problem resulting from the second hypothesis, with the per-participant averaged HFI as the dependent variable and the 13 demographic and personality variables as predictors. All variables are normalized to zero mean and standard deviation one.

5.3.1 K-nearest Neighbor

The simplest kernel method is a uniform kernel, which includes the k nearest neighbors of the requested point in the analysis. For the k-nearest neighbor algorithm, the dependent variable's values of these k neighbors within the training set are averaged. In R the algorithm is implemented using the kknn package (cf. Hechenbichler & Schliep, 2004; Schliep & Hechenbichler, 2014).

The implemented algorithm allows for an adjustment of the metric by which the distances to the k nearest neighbors are calculated. By using the Minkowski distance, the l1 (Manhattan) metric, the l2 (Euclidean) metric and graduations in between can be applied through a distance parameter (1 for the Manhattan and 2 for the Euclidean metric; cf. Hechenbichler & Schliep, 2004). Furthermore, differing kernels, including the Gaussian, the Epanechnikov and the standard uniform (also referred to as rectangular) kernel, can be applied and compared.

The results show a slight superiority of the Euclidean metric for all kernels, which is why the l1 metric is not considered further. The prediction accuracy is best for the Epanechnikov kernel at k = 22 (RMSE_Epan. = 0.792). The Gaussian kernel and the uniform kernel perform best for k = 12 (RMSE_Gauss. = 0.794 and RMSE_Uni. = 0.796). Figure 5.7 provides a graphical representation. Nevertheless, all results are significantly worse than the GLM (RMSE = 0.678). These results already indicate that a static local structure might not be present within the data. However, the importance of the variables differs from the GLM's variable importance. As seen in Figure 5.8, neuroticism gains even more importance, while the demographics lose influence on the dependent variable HFI.
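A hedged sketch of this tuning via caret's interface to the kknn package is given below; the grid values, as well as the wb and ctrl objects, are the simulated stand-ins from the earlier sketches, not the study's configuration.

library(caret)

knn_grid <- expand.grid(kmax     = c(12, 22, 40),   # neighborhood sizes to compare
                        distance = 2,               # Minkowski parameter: 2 = Euclidean
                        kernel   = c("rectangular", "epanechnikov", "gaussian"))

knn_fit <- train(hfi ~ ., data = wb,
                 method    = "kknn",
                 tuneGrid  = knn_grid,
                 trControl = ctrl,
                 metric    = "RMSE")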

5.3.2 Non-parametric Regression

plied to solve the general prediction problem resulting from the second hypothesis including the per-

Non-parametric regression refers to algorithms,

participant averaged HFI as the dependent variable

which calculate a local linear regression within a

and the 13 demographic and personality variables as

kernel environment instead of averaging the nearest

predictors. All variables are normalized to zero mean

neighbors. Three different non-parametric regression

and standard deviation one.

algorithms have been tested, namely a Generalized Additive Model using LOESS, a Generalized Additive

5.3.1 K-nearest Neighbor

Model using Splines and Nonparametric Regression (see Hayfield & Racine, 2013).

The easiest kernel method is a uniform kernel, including the k-nearest neighbors of the requested point

5.3.2.1 LOESS

into the analysis. For the k-nearest neighbor algorithm the dependent variables’ value of these k neigh-

The LOESS (locally weighted scatterplot smoothing)

bors within the training set are averaged. In R the algorithm is implemented using the kknn package (cf.

algorithm (see Cleveland, 1979; Cleveland & Devlin, 1988) fits a linear or quadratic regression within a k-

Hechenbichler & Schliep, 2004; Schliep & Hechen-

nearest neighbor environment with a uniform shape.

bichler, 2014).

The kernel’s size is defined by parameter α, the pro-

The implemented algorithm allows for an adjustment

portion of training data points included in each ker-

of the metric, by which the distance for the k-nearest

nel. For α = 1 all training points are included in

neighbors are calculated. By using the Minkowski

every kernel, while α = 0.25 takes the 25% near-

distance the l1 - (Manhattan-), as well as the l2 -

est points of the entire training data into the kernel.

(Euclidian-) metric and graduations in-between can

LOESS consequently turns into a GLM for α = 1.

34

Figure 5.7: RMSE for the k-nearest neighbor algorithm using the Euclidean metric (Epanechnikov, rectangular and Gaussian kernels over the number of neighbors)

Figure 5.8: Variable importance for the k-nearest neighbor algorithm using the Euclidean metric; predictors ordered by importance: Neuroticism, Extraversion, Conscientiousness, Health, Agreeableness, Maximizer, Optimism, Age, Fairness, Education, Location, Gender, Job

Figure 5.9: RMSE for gamLoess (repeated cross-validation RMSE over the span parameter)

The distance calculation for the neighborhood definition is conducted with the tri-cube weight function (1 − (distance/max(distance))³)³ (Hastie, 2013). The algorithm is implemented using the caret package's gamLoess model. GamLoess implements the LOESS algorithm separately for each independent variable within a generalized additive model (GAM; cf. Hastie & Tibshirani, 1986; Wood, 2004). Due to high computational costs, only the linear regression was conducted. As seen in Figure 5.9, the accuracy converges towards the GLM's accuracy of 0.678 when α is close to one. However, an increase in accuracy cannot be observed when α is reduced. This result is in line with the previously mentioned low accuracy of the k-nearest neighbor algorithm. Noticeable is the RMSE drop for α = 0.32 (Footnote 4: for 10-fold cross-validation: 90% × 362 training points × 0.32 ≈ 103), which equals approximately 103 training points included in the local regression. Even this configuration (RMSE = 0.753) does not outperform the GLM.

5.3.2.2 Splines

A different smoothing can also be achieved using splines (cf. de Boor, 2001). Instead of using kernels, the independent variables are thereby steadily transformed using splines before being integrated in the GAM. The model is tuned upon the degrees-of-freedom (df) parameter, which controls the degrees of freedom of the spline function (the more degrees of freedom, the higher the adaption to local structures). Two degrees of freedom lead to a fit equivalent to linear regression (cf. also Hastie et al., 2009). Analogous to the gamLoess algorithm, the results demonstrate that an adaption to local structures does not increase the model's accuracy. The best fit is achieved for df = 2, the linear model already tested with the GLM (see Figure 5.10).

Even though a small improvement using splines was expected and not achieved, these results are not astonishing; splines fit each independent variable within the GAM independently and are not capable of modeling interdependencies (cf. Hastie et al., 2009).

Figure 5.10: RMSE for gamSplines (repeated cross-validation RMSE over the degrees of freedom)
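The two GAM variants can be tuned with caret's "gamLoess" and "gamSpline" methods roughly as sketched below; the grids and the wb and ctrl objects are again illustrative stand-ins, not the study's configuration.

library(caret)

loess_fit <- train(hfi ~ ., data = wb,
                   method   = "gamLoess",
                   tuneGrid = expand.grid(span   = seq(0.3, 1.0, by = 0.1),
                                          degree = 1),        # linear local fits only
                   trControl = ctrl, metric = "RMSE")

spline_fit <- train(hfi ~ ., data = wb,
                    method   = "gamSpline",
                    tuneGrid = expand.grid(df = c(2, 5, 10, 20)),
                    trControl = ctrl, metric = "RMSE")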

5.3.2.3 npreg

The most advanced kernel smoothing algorithm applied in this study is computed upon the np package in R (see Hayfield & Racine, 2007, 2013). The npreg function computes a kernel for each independent variable and applies a local linear regression within the kernel. The optimal kernel parameters are independently and data-driven optimized for each independent variable. Thereby a different bandwidth results for each of the independent variables (Hayfield & Racine, 2007, 2013). One of the most important advantages of this algorithm is that continuous as well as categorical, unordered variables (as present in this study) can be included in the regression (Racine & Li, 2004). The algorithm is consequently capable of predicting upon mixed datasets.

The algorithm moreover uses either local-linear regression (ll) or the local-constant estimator (lc) by Nadaraya (1964) and Watson (1964). The latter is an average smoother, similar to the k-nearest neighbor smoother, but in contrast computes different bandwidths and scale factors for each independent variable (Hayfield & Racine, 2007, 2013).

For each cross-validation run, the kernel bandwidth for each input variable is computed via Kullback-Leibler cross-validation (Hurvich, Simonoff, & Tsai, 1998) or least-squares cross-validation (Li & Racine, 2004; Racine & Li, 2004), which is applied to compare algorithms upon the RMSE in this study. In contrast, the Kullback-Leibler cross-validation compares different bandwidths upon the Akaike information criterion (AIC), which weighs the goodness of fit against the model's complexity. As a result of bandwidth selection and parameter comparison, two nested cross-validations with correspondingly high computational costs have to be performed in order to test each bandwidth specification on several folds. Since the algorithm was not available for the caret cross-validation package, a custom model was implemented.

The mentioned algorithm can be computed with a Gaussian, an Epanechnikov or a linear kernel for continuous input data. Categorical data is calculated with an Aitchison-Aitken or Li-Racine kernel (Aitchison & Aitken, 1976; Titterington, 1980). For this study, the categorical predictors (location, job and gender) were fitted upon the Aitchison-Aitken kernel only.
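The np workflow described above (data-driven bandwidth selection followed by a local-linear fit) looks roughly as follows; the small simulated data frame and its variables are placeholders for the study's mix of continuous and categorical predictors.

library(np)

set.seed(1)
dat <- data.frame(hfi          = rnorm(200),
                  neuroticism  = rnorm(200),
                  extraversion = rnorm(200),
                  gender       = factor(sample(c("female", "male"), 200, replace = TRUE)))

bw <- npregbw(hfi ~ neuroticism + extraversion + gender, data = dat,
              regtype  = "ll",             # local-linear estimator
              ckertype = "epanechnikov",   # kernel for continuous predictors
              bwmethod = "cv.ls")          # least-squares cross-validation

fit_np <- npreg(bws = bw)
summary(fit_np)                            # per-predictor bandwidths and fit statistics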

Figure 5.11: RMSE for npreg with least-squares cross-validation (left) and Kullback-Leibler cross-validation (right), for the local-constant (lc) and local-linear (ll) estimators with uniform, Epanechnikov and Gaussian continuous kernels

The results (see Figure 5.11) show that the local-linear regression is more accurate than the local-constant estimator and reaches the GLM performance for the Epanechnikov kernel with least-squares cross-validation (RMSE = 0.682; RMSE SD = 0.065). The uniform kernel with local-linear regression and Kullback-Leibler cross-validation does not reach sufficient accuracy (RMSE > 5), so that the corresponding data point is not included in the chart.

Besides the models' accuracy, the variance between the several cross-validation loops is an important aspect for evaluating a model's prediction capability. Upon 10-fold cross-validated model selection, Figure 5.12 shows the RMSE density plots. The Epanechnikov kernel provides the smallest variance between CV runs, followed by the Gaussian and then the linear kernel. For the local-constant estimator the variance is even smaller compared to the local-linear regression, but the latter performs better regarding the RMSE mean (see Figure 5.12).

The algorithm has also been tested with higher kernel orders (kernel orders 2 and 4), but no accuracy gains could be realized. Consequently, the following analyses apply second-order Epanechnikov kernels only.

Due to the variable bandwidth and scale estimations for the independent variables, npreg usually allows for an advanced analysis of the predictors' importance. Since the npreg algorithm does not predict the averaged well-being data more precisely than the GLM in this case, the variables' importance just reflects the GLM predictor importance. However, the graphical representation in Figure 5.13 presents the partial, almost linear (kernel bandwidths ≫ n) regressions. The predictors were abbreviated to simplify the analysis (Footnote 5: abbreviations: N - Neuroticism, E - Extraversion, A - Agreeableness, O - Optimism, C - Conscientiousness, M - Maximizer, F - Fairness, H - Health, Age - Age, L - Location, G - Gender, Edu - Education, J - Job).

The high dimensionality of the input data masks several non-linear linkages of certain independent variables. If less important independent variables are removed from the analysis, these linkages become visible. Figure 5.15 shows selected subsets of independent variables with the reached performance measures. All calculations were conducted upon least-squares cross-validation with local linear regression within Epanechnikov kernels to fit the bandwidths, and two-times repeated 8-fold cross-validation to evaluate the performance. Due to the computational costs, only a limited number of repetitions and subsets could be tested.

Certain subsamples of the input data achieve almost as good accuracy as the original model including all independent variables. This applies to the RMSE as well as to the RMSE standard deviation. For example, the subset of independent variables including the big five personality traits, health and the maximizer vs. satisficer test achieved an error of RMSE = 0.691, which is only one percent worse than the best full model fit. A graphical representation of the dependencies within this subsample fit is given in Figure 5.16.

Figure 5.12: RMSE density plots of the 10-fold cross-validation runs (kernel bandwidth selection upon least-squares cross-validation); panels for the Epanechnikov, Gaussian and uniform kernels with local-linear (ll) and local-constant (lc) estimators

The fact that subsamples of the independent variables reach similar accuracy leads to the conclusion that the correlation between the predictors has an influence when fitted locally.

The maximizer-satisficer measure has been found to have a U-shaped partial influence in many subsets, even if the overall model fits almost linearly (very large kernel bandwidth; see Figure 5.13). In contrast to the intuitive suggestion that maximizers have lower well-being than satisficers, maximizers seem to be happier than the average. This is further supported when age, as the predictor most correlated with the maximizer-satisficer variable, is included in the model (see Figure 5.17). Directly compared to the predictors conscientiousness and agreeableness, the maximizer-satisficer predictor explains less variance than conscientiousness (higher RMSE), but more than agreeableness (compare Figure 5.14).

The U-shaped relationship between age and well-being discussed in the literature (cf. Blanchflower & Oswald, 2008) could not be observed within the dataset. The overall model shows a small positive linear influence of age, but those results are not obtained from time-series-based measurement and are consequently not corrected for influences of different cohorts.

Moreover, the negative influence of self-perceived health already identified by the GLM was confirmed by the non-parametric regression. None of the calculated predictor subsets showed a positive influence of a healthy lifestyle, although a positive correlation has widely been discussed in the literature (Diener et al., 1999; Jorm & Ryan, 2014; Lacey et al., 2008).

An interesting observation was made when the predictors were reordered. The algorithm results in different accuracies for different predictor orders (cf. Figure 5.15), which are stable during cross-validation. The algorithm calculates different bandwidths for different predictor orders.

Figure 5.13: npreg predictors' partial regression influence (partial plots of the HFI against each of the 13 predictors)

Resampling results across tuning parameters:

  subset   RMSE    Rsquared   RMSE SD   Rsquared SD
  N E      0.735   0.459      0.0628    0.145
  N E C    0.697   0.507      0.0745    0.135
  N E M    0.715   0.491      0.0583    0.139
  N E A    0.726   0.471      0.0596    0.131

'regtype' was held constant at a value of ll
'ckertype' was held constant at a value of epanechnikov
'ckerorder' was held constant at a value of 2
'bwmethod' was held constant at a value of cv.ls

Figure 5.14: npreg accuracy for reduced predictor dimensionality

Figure 5.15: npreg accuracy for reduced predictor dimensionality (resampling results across a larger set of tested predictor subsets; 'regtype' = ll, 'ckertype' = epanechnikov, 'ckerorder' = 2, 'bwmethod' = cv.ls)

Figure 5.16: npreg predictors' partial regression influence for reduced predictor dimensionality (1); panels for the subset of the big five personality traits, health and the maximizer-satisficer scale

Figure 5.17: npreg predictors' partial regression influence for reduced predictor dimensionality (2); panels for the predictor subsets Maximizer/Neuroticism/Age, Neuroticism/Extraversion/Age, Neuroticism/Extraversion/Conscientiousness, Maximizer/Neuroticism/Conscientiousness/Extraversion, and Maximizer/Neuroticism/Conscientiousness/Agreeableness/Extraversion/Optimism/Age

5.3.3 Support Vector Machines (SVM)

This study's data was then tested for prediction accuracy using support vector machines (SVM; cf. Vapnik et al., 1997). SVMs solve kernel smoothing problems by minimizing the error bounds of a linear regression within a local kernel environment. The approach therefore does not differ significantly from the kernel smoothing algorithms mentioned previously. Consequently, it is not remarkable that the SVM does not provide any added value for the prediction of personal well-being upon the 13 predictor variables.

However, the obtained results are as follows. The SVM has only been tested with a Gaussian kernel, which is parameterized by a bandwidth parameter sigma specifying the inverse kernel width (Footnote 6: Gaussian kernel defined as k(x, x′) = exp(−σ ∗ ||x − x′||²)): the larger sigma is chosen, the smaller the kernel. Moreover, the SVM implementation allows specification of the cost of constraint violation via a parameter C, which is set to 1 by default and varied between 0.7 and 1.3 within this analysis. Due to the computational costs, results are calculated upon five-times repeated 10-fold cross-validation only.

The results in Figure 5.18 indicate that a large kernel leading to a linear model performs best. The cost of constraint violation does not have an influence on this result. When the kernel width is reduced with increasing sigma, the RMSE increases and approaches the dependent variable's standard deviation, which is normalized to one. For extremely small σ, however, the performance drops again. Reducing the influence of points at the far end of the predictor space is consequently found to be beneficial for accuracy.

Figure 5.18: RMSE accuracy for the support vector machine ((a) full sigma range; (b) zoom into small sigma)
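The sigma/C tuning above corresponds to caret's "svmRadial" method (kernlab backend); the sketch below uses the simulated wb and ctrl stand-ins and an illustrative grid, not the study's exact parameter values.

library(caret)

svm_grid <- expand.grid(sigma = c(0.01, 0.05, 0.1, 0.4, 0.8),   # inverse kernel width
                        C     = c(0.7, 1, 1.3))                 # cost of constraint violation

svm_fit <- train(hfi ~ ., data = wb,
                 method    = "svmRadial",
                 tuneGrid  = svm_grid,
                 trControl = ctrl,
                 metric    = "RMSE")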

5.4 Neural Network Algorithms

5.4.1 Stuttgart Neural Network Simulator (SNNS)

The neural networks applied in this study are implemented using the Stuttgart Neural Network Simulator (SNNS) package in R (see Bergmeir & Benitez, 2012, 2013). In order to perform the same cross-validated analyses as for the previously mentioned algorithms, a custom model was built to integrate a fully customizable version of the SNNS into the caret package (Kuhn, 2008).

The SNNS allows for a variety of different learning algorithms, of which standard backpropagation (SBP), the most common learning algorithm, also referred to as vanilla backpropagation (Rojas, 1996; Rumelhart, Hinton, & Williams, 1986), and scaled conjugate gradient (SCG) (Møller, 1993) have been applied. Both perform supervised learning for feedforward neural networks, but differ in the optimization routine. While SBP uses the first derivative of the goal function, SCG optimizes upon the second derivative, which is computationally more expensive, but generally "finds a better way to the (local) minimum" (Zell et al., 2013, p. 210). SCG is a combination of a conjugate gradient approach and ideas of the Levenberg-Marquardt algorithm (Bergmeir & Benitez, 2012; Marquardt, 1963). Regarding the different learning algorithms' performance and accuracy, no clear ranking persists in the literature so far. Consequently, comparable studies usually apply and compare several different learning algorithms in order to find the algorithms fitting the data best.

Due to the characteristics of neural computing, the dependent and independent variables have been normalized to zero mean and standard deviation one. The categorical variables (e.g. gender, age, education) were consequently transformed to numeric variables. The neural networks have been constructed with one to five hidden layers and 20 to 1000 nodes on each layer. For standard backpropagation, the parameters have been kept fixed at the level best for accuracy, which is associated with rather high computational costs that are acceptable due to the small sample: the learning rate at a low level of 0.1 and the maximum output difference at zero.
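Outside the caret wrapper, the two learning functions can be called directly through RSNNS roughly as follows; the matrices are simulated stand-ins and the layer sizes merely mirror the best-performing layout reported in this section.

library(RSNNS)

set.seed(1)
X <- matrix(rnorm(362 * 13), ncol = 13)   # stand-in normalized predictors
y <- rnorm(362)                           # stand-in standardized HFI

# Four hidden layers with 40 nodes each, scaled conjugate gradient learning
nn_scg <- mlp(X, y, size = c(40, 40, 40, 40),
              learnFunc = "SCG", maxit = 200, linOut = TRUE)

# Standard ("vanilla") backpropagation, learning rate 0.1 and dmax = 0
nn_sbp <- mlp(X, y, size = c(40, 40, 40, 40),
              learnFunc = "Std_Backpropagation",
              learnFuncParams = c(0.1, 0), maxit = 200, linOut = TRUE)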

The accuracy achieved with the different learning algorithms is given in Figure 5.19. None of the tested network layouts and none of the applied learning algorithms reaches better performance than the GLM. The neural network with four hidden layers of 40 hidden nodes each performed best and reached a minimum RMSE of 0.765 for the SCG learning function and an RMSE of 0.763 for the standard backpropagation learning function. Both learning functions provide very similar results.

Figure 5.19: RMSE accuracy for feedforward neural networks ((a) SCG learning algorithm; (b) standard backpropagation learning algorithm with learning rate = 0.1 and maximum difference = 0)

5.4.2 Extreme Learning Machine (ELM)

Standard feedforward neural networks as implemented by the Stuttgart Neural Network Simulator (see the previous section) generally face issues of slow learning speed (backpropagation) and customizable learning functions with a high number of crucial parameters to set. A newer method for fitting neural networks has therefore been developed: Extreme Learning Machines (ELM) fit single-hidden-layer feedforward neural networks upon mathematical, non-iterative solving only (Huang, Chen, & Siew, 2006). The input weights for each hidden node are randomly chosen and not adapted, so that their training is omitted. Training is only applied to the weights for the output calculation, which is computationally less costly and can consequently "run thousands times faster than [...] conventional methods" (Rajesh & Prakash, 2011, p. 35). By increasing the number of hidden nodes with random input weights, the ELM is theoretically as powerful as conventional neural networks and capable of approximating "any continuous target functions" (Rajesh & Prakash, 2011, p. 880).

The elmNN package in R (see Gosso, 2013) allows for the training of ELMs with different activation functions (e.g. the sigmoid function of standard neural networks). For this study, five activation functions have been tested for the hidden and the output nodes: the sigmoid (sig), the slightly steeper tan-sigmoid (tansig), the stepwise zero/one hard-limit function (hardlim), the stepwise minus-one/one symmetric hard-limit function (hardlims) and a pure linear function (purelin).

For a comparison of the activation functions with different numbers of hidden nodes see Figure 5.20. The pure linear activation function obviously explains the same variance as the GLM and once more leads to the best-fitting model.

All tests have been conducted with 20-times repeated 10-fold cross-validation. Since the hidden nodes' input weights are randomly set, a sufficient number of repeated analyses has to be performed in order to achieve a valid accuracy result.

Since the tansig, hardlim and hardlims activation functions were found to show decreasing RMSE with an increasing number of nodes at 5000 hidden nodes, a single five-times repeated 10-fold cross-validated analysis has been conducted with 12,000 hidden nodes.

Figure 5.20: RMSE accuracy for the extreme learning machine (ELM) by activation function ((a) full range of hidden nodes; (b) zoom for small numbers of hidden nodes)

However, it was still found that the sigmoid-based activation functions do not outperform the GLM (see Figure 5.21).

Extreme Learning Machine
358 samples
13 predictors
No pre-processing
Resampling: Cross-Validated (10 fold, repeated 5 times)
Summary of sample sizes: 324, 322, 322, 322, 322, 322, ...
Resampling results across tuning parameters:
  actfun     RMSE    Rsquared   RMSE SD   Rsquared SD
  sig        0.957   0.283      0.103     0.124
  hardlim    0.724   0.472      0.0863    0.134
  hardlims   0.727   0.469      0.0857    0.137
  tansig     0.800   0.388      0.0786    0.121
  purelin    0.676   0.531      0.0792    0.136
Tuning parameter 'nhid' was held constant at a value of 12000

Figure 5.21: Cross-validation results for the extreme learning machine (ELM) with 12,000 hidden nodes

Due to its computational efficiency in combination with comparable accuracy, the ELM was also applied to test for possible structures within each participant's well-being trajectory, as proposed by this study's fourth hypothesis (compare hypotheses 3 and 4). As already obtained from the GLM analysis, neither the variance in the participants' internal standard deviation nor in the internal regression coefficient (slope) of the linear trajectory smoothing could be explained (see Figure 5.22). All models upon the tested parameter sets result in a higher RMSE than the sample's standard deviation (RMSE > 1).

Figure 5.22: RMSE accuracy for the ELM in the trajectory prediction problem ((a) dependent variable: standard deviation; (b) dependent variable: regression coefficient)
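A rough sketch of the ELM fits compared above, using the elmNN interface referenced in the text (Gosso, 2013; the package is nowadays archived on CRAN), is given below; the data, node counts and prediction call are illustrative only.

library(elmNN)

set.seed(1)
dat <- data.frame(hfi = rnorm(362), matrix(rnorm(362 * 13), ncol = 13))

elm_lin <- elmtrain(hfi ~ ., data = dat, nhid = 100,  actfun = "purelin")
elm_sig <- elmtrain(hfi ~ ., data = dat, nhid = 5000, actfun = "sig")

head(predict(elm_sig, newdata = dat))   # in-sample predictions, for illustration only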

5.5 Feature Selection Algorithms

The algorithms selected here do not aim for an accurate prediction of the dependent variable. Instead, feature selection algorithms evaluate the importance of certain predictors for the output variable (see the literature review). The deployed kernel smoothing algorithms (see chapter 5.3) indicate that certain independent variables within this study do not have an important influence on well-being. To evaluate this in detail, two different feature selection algorithms were applied.

5.5.1 Lasso and Elastic Net Regression

The lasso regression is a basic feature selection algorithm for generalized linear models (GLM). In comparison to algorithms using other regularization norms, the lasso algorithm limits the sum of the absolute coefficients (the l1 norm) to a constant and therefore results in coefficients that become exactly zero (Tibshirani, 1996). Hastie et al. (2009) provided a good description of possible regularization norms and a comparison of feature selection algorithms. The lasso regression is parameterized by the fraction of the full model coefficients' l1 norm, defining a maximum threshold for the sum of the current regression coefficients' l1 norm. A fraction of one consequently results in the full GLM, while a fraction of zero forces all coefficients to zero. The algorithm is implemented using the lars and elasticnet packages in R (Hastie & Efron, 2013; Zou & Hastie, 2013) and is five-times repeated 10-fold cross-validated to ensure sufficient reproducibility. Figure 5.23 outlines the lasso regression path and accuracy.

As expected, the RMSE of the model approaches the GLM accuracy for the full solution. From the RMSE plot, a small improvement over the GLM can be observed if the fraction is set to 0.9, so that fairness and education are not part of the model. These variables explain no structural variance in the linear model and hence over-fit the data. The lasso path includes neuroticism as the first, extraversion as the second and conscientiousness as the third variable.

Figure 5.23: Lasso regression path (left) and RMSE accuracy (right)

Further developments of the lasso regression led to alternative norms for coefficient regularization. The elastic net regression (Zou & Hastie, 2005) allows for a continuous adjustment of the regularization norm between the l1 and the l2 norm by the parameter lambda. However, for this study the elastic net regression, including a parameterization for ridge regression, did not provide an improvement in accuracy or feature selection.
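The cross-validated lasso and elastic net fits map onto caret's "lasso" and "enet" methods (both wrapping the elasticnet package named above); the following is a hedged sketch with the simulated wb and ctrl stand-ins and illustrative grids.

library(caret)

lasso_fit <- train(hfi ~ ., data = wb,
                   method   = "lasso",
                   tuneGrid = expand.grid(fraction = seq(0.1, 1, by = 0.1)),
                   trControl = ctrl, metric = "RMSE")

enet_fit <- train(hfi ~ ., data = wb,
                  method   = "enet",
                  tuneGrid = expand.grid(fraction = c(0.5, 0.9, 1),
                                         lambda   = c(0, 0.01, 0.1)),
                  trControl = ctrl, metric = "RMSE")

plot(lasso_fit)   # RMSE over the fraction of the full solution (cf. Figure 5.23)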

5.5.2 Lazy Lasso Regression

The lazy lasso algorithm has been developed by Vidaurre et al. (2011) in order to combine kernel smoothing with lasso regression. The combination allows fitting non-linear functions upon only the locally most important independent variables. Since the algorithm implements the lasso algorithm mentioned before, it actually zeroes unimportant regression coefficients by fitting the local lasso regression with the lars R package (see Hastie & Efron, 2013). However, as the lazy lasso algorithm is not yet available as an R package, a simple version with a uniform kernel has been implemented for this study. The implementation follows the abstract algorithm by Vidaurre et al. (2011) given in Figure 5.24.

Additionally, the algorithm is cross-validated using the caret package in order to test different parameter sets. The parameters include the bandwidth parameter t for the uniform k-nearest neighbor kernel (the number of neighbors included) and a stopping parameter k, which defines the number of consecutive loops without performance improvement after which the algorithm aborts. For each iteration, the distances for the kernel calculation are weighted parameter-wise with the regression coefficients from the previous iteration; the first iteration starts without weighting. Vidaurre et al. (2011) argued that this approach "attaches more importance to relevant variables" (p. 539), because distances along irrelevant predictors are neglected. In order to parameterize the distance adjustment, δ_j is calculated as follows:

    δ_j = p · |β_j|^d / Σ_{j'=1}^{p} |β_{j'}|^d

This allows scaling the power of the adjustment by the distance adaption parameter d. For d = 1, δ is equal to the relative predictor weight as proposed by Vidaurre et al. (2011); for d = 0, δ equals 1 for each predictor, so that no adjustment of the kernel to the predictor weights takes place.
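One plausible reading of this re-weighting step, reduced to plain R, is sketched below; the function and argument names are invented for illustration and do not come from a package.

# Relative predictor weights delta_j from the previous iteration's coefficients
delta_weights <- function(beta, d = 1) {
  p <- length(beta)
  w <- abs(beta)^d
  if (sum(w) == 0) return(rep(1, p))       # first iteration: no weighting
  p * w / sum(w)
}

# Weighted Euclidean distance of the training points X (n x p) to a query x0
weighted_dist <- function(X, x0, delta) {
  sqrt(colSums((t(X) - x0)^2 * delta))
}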

As the algorithm performs feature selection upon the lasso regression, a criterion to define the number of predictors included in the local linear regression is necessary. Upon the residual standard error of each step of the lars path, Mallows' Cp statistic is calculated. Predictors are included in the final model as long as Cp is larger than the total number of predictors multiplied by a bias factor, which is bias = 1 in the standard configuration but may be parameterized. A larger bias factor results in a less complex model; a smaller bias factor includes more predictor variables.

Due to the feature selection, the accuracy achieved by the model is not comparable with the prediction models mentioned previously. However, the results of the parametric optimization can be obtained from Figure 5.25.

Algorithm: lazy lasso
  Input: training data set D with p variables and n data items
  Input: bandwidth τ and stopping criterion parameter κ
  Input: weight function g(·) and distance function d(·)
  Input: point x(l), whose response is to be predicted
  Output: set of coefficients β̂(l) and estimated response ŷ(l)
  Initialization: δj := 1 for j = 1, ..., p; overallBest := ∞; toStop := 0
  repeat
    calculate all distances di := dδ(x(i), x(l)), for i = 1, ..., n
    w(l) := sqrt(gτ(d))
    W(l) := n × n diagonal matrix with W(l)ii = w(l)i, for i = 1, ..., n
    Z := W(l) X;  v := W(l) y
    path := LARS(Z, v)
    β* := best(path; Z, v)
    δj := p · |βj| / Σ_{j'=1}^{p} |βj'|, for j = 1, ..., p
    score := evaluate(β*; Z, v)
    if score ≥ overallBest then toStop := toStop + 1
    else toStop := 0; overallBest := score; β̂(l) := β*
    end if
  until toStop = κ
  ŷ(l) := x(l)ᵀ β̂(l)

Figure 5.24: Lazy lasso algorithm (Vidaurre et al., 2011)

Figure 5.25: RMSE accuracy for lazy lasso regression ((a) d = 1; (b) bias = 1)

As expected, the kernel smoothing demonstrates once more that the best model is achieved for large kernels approaching the generalized linear model. The stopping parameter k was tested for the values k = 5 and k = 8 without noticeable differences, so that it is fixed to k = 5 for all further analyses.

The bias factor was, as expected, found to reduce the number of predictors included in the local linear regressions and consequently reduces the accuracy when increased. Different from the original expectations, the distance adaption factor d had a rather small influence on the model's accuracy. For medium-sized kernels (30-80 points), models with little distance scaling actually fitted the testing points better than the proposed distance scaling with d = 1. Moreover, those models generally included fewer variables on average.

In order to evaluate the predictors' importance, the final local regression coefficients for each testing point are saved (Footnote 7: the nominalTrainWorkflow function in the R caret package had to be adapted in order to return additional data from the prediction) and allow for later statistical analysis, such as counting the regressions with coefficients unequal to zero for each predictor or summing the absolute regression coefficients per parameter. However, since the best-performing model has a large kernel, those feature selection results are similar to the variable importance identified by the GLM. Hence, the assessment of the local predictor importance has been conducted on models with 30 to 80 points per kernel, even if those were not performing best in terms of accuracy. Figure 5.26 provides an overview of the predictor weights depending on the bias factor. Neuroticism is the predominant predictor, gaining even more importance if the restriction is tightened (higher bias). Extraversion and conscientiousness were found to be the second most important predictors. However, their influence decreases when the kernel size is shrunk and the prediction is consequently based on fewer neighbors. This differs from expectations, because a local analysis usually increases the relative importance of generally less important variables. Even for kernels with fewer than 30 points (< 10% of the sample size), neuroticism is the only important predictor. Extraordinarily increased weights for other predictors are not observed. However, the unrestricted model (bias = 0) for small kernels weights all predictors relatively equally, with five to 15 percent of the total predictor weight (Footnote 8: note in this regard that the lars algorithm, called for each local kernel environment individually, shifts the training points to zero mean and variance one for each predictor). As seen in Figure 5.26, this includes an increased weight for the location variable. However, this has to be treated with caution, because the underlying sample is not representative in this regard.

Figure 5.26: Lazy lasso predictor weights by bias factor ((a) t ∈ [30, 80]; (b) t ∈ [100, 200])

5.5. Feature Selection Algorithms

51

Since the lasso regression zeroes unimportant predictors when called with sufficient restriction via the bias variable, an analysis of the number of coefficients unequal to zero per predictor over all testing points is promising. Again, neuroticism, extraversion and conscientiousness stand out as the most often included predictors, followed by health and the maximizer-satisficer measure (see Figure 5.27). When fitted locally with small kernel sizes, the differences between predictors are, though, less distinct. For an average number of 2.5 predictors, neuroticism is included in only 40% of all locally fitted regressions with small kernels (30 - 80 points), while it is included in over 65% of the regressions with larger kernels. Correspondingly, variables not important in larger kernels are included in local regressions with smaller kernels more often. Nevertheless, this is likely to result from over-fitting the data, since those small kernels result in significantly less cross-validated accuracy (see Figure 5.27). In general, differences in the predictors' order with regard to the frequency of coefficients unequal to zero are not observed between different kernel sizes. This once more supports that the high predictor weight of the location variable for small kernels is due to irregularities in the dataset.
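Assuming a coefficient matrix like the one returned by the sketch above (one row per testing point, one column per predictor), the two summaries discussed here can be computed in a few lines of R; variable names are illustrative.

# How often each predictor enters a local regression, and its share of the
# total absolute weight (cf. Figures 5.26 and 5.27).
inclusion_pct <- colMeans(coefs != 0) * 100             # % of local regressions
weight_share  <- colSums(abs(coefs)) / sum(abs(coefs)) * 100
sort(round(inclusion_pct, 1), decreasing = TRUE)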

Figure 5.27: Lazy lasso: percentage of local lasso regressions with predictor coefficient unequal to zero, plotted against the average number of predictors per regression; (a) measured relative to the total number of regressions, (b) corrected with the total number of predictors per regression (left panels: t ∈ [30, 80]; right panels: t ∈ [100, 200]).

Group                              | Predictors                                              | RMSE contribution to full model | Variance explained as single predictor
Group 1 (most important)           | Neuroticism, Extraversion, Conscientiousness            | 0.40                            | 41%, 22%, 15%
Group 2 (moderately important)     | Maximizer, Health, Gender, Agreeableness, Optimism      | 0.04                            | 8 - 12%
Group 3 (less important)           | Age, Fairness, Job, Education, Location                 | 0                               | 0 - 8%

Table 5.4: Predictor importance by group. Note: Numbers in the third column indicate the difference between the RMSE of the model including the group as predictors and the model including only the more important groups; analysis conducted with the npreg algorithm.
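The group-wise RMSE contribution reported in Table 5.4 can be illustrated with the np package as follows. The data frame d and the column names are assumptions, and in-sample RMSE is used here only to keep the sketch short, whereas the study reports cross-validated values.

library(np)
rmse_np <- function(formula, data) {
  fit <- npreg(npregbw(formula, data = data))      # bandwidth selection + fit
  sqrt(mean((data$HFI - fitted(fit))^2))
}
rmse_group1  <- rmse_np(HFI ~ neuroticism + extraversion + conscientiousness, d)
rmse_group12 <- rmse_np(HFI ~ neuroticism + extraversion + conscientiousness +
                              maximizer + health + gender + agreeableness + optimism, d)
rmse_group1 - rmse_group12   # contribution of group 2 (reported as 0.04)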

However, the variables can be clustered into three groups by importance. On the one hand, they are fairly constant regarding predictor weights and the frequency of coefficients unequal to zero. On the other hand, the identified groups correspond with the finding from the npreg algorithm mentioned before (see Table 5.4). Neuroticism, extraversion and conscientiousness explain by far most of the variance; neuroticism alone already explains around 40%, if fitted with non-parametric regression. Extraversion and conscientiousness add another ∼ 10% of explained variance after controlling for neuroticism. The second group includes the maximizer-satisficer scale, health, optimism, agreeableness and gender. Especially for large kernels, the second group accounts for significantly more predictor weight than the remaining variables. Together with the first group, those variables explain approximately 47% of the variance between the averaged HFI per participant. The third group contains the remaining predictors fairness, education, job, location and age, which were found to have a rather small influence and explain very little variance after controlling for groups one and two. Within the third group, age and fairness are the most relevant predictors. This division into three clusters is supported by the findings of the npreg algorithm and furthermore corresponds with the separation in the linear lasso regression on the whole dataset.

While the lazy lasso algorithm is capable of effective feature selection and interpretation, it does not allow for an overall picture of a single predictor's influence as, for example, the npreg algorithm does. The kernel smoothing selects local environments around the predicted test points, but does not currently save the bandwidth information needed to compute the complete partial influence plot. Changes of local predictor importance along the predominant regression line of neuroticism could, for example, be subject to further research.

5.6 Accuracy Comparison

The proposed hypotheses two and four aim for a prediction of each participant's well-being baseline and the corresponding well-being trajectory upon the psychometric and demographic input variables. Different machine learning approaches have been tested. However, the algorithms do not achieve higher accuracy than the generalized linear model (GLM). A comparison of the conducted algorithms and the accuracy achieved can be obtained from Figure 5.28. Since this study's sample is comparably small for the number of predictors included in the prediction models, an accuracy test for a reduced sample size is advised in order to test for possible accuracy advantages from larger datasets. This test has been conducted for the neural network model (see chapter 5.4.1). The mentioned model was adapted to loop over different subsets of the sample and apply the cross-validated neural network algorithm on these subsets. Subsets including 50% to 100% of the original dataset were tested.


Figure 5.28: Accuracy comparison between deployed algorithms for well-being baseline prediction (values shown as cross-validated RMSE / explained variance, with the GLM accuracy as reference):
GLM 0.68 / 0.54; K-nearest neighbor 0.79 / 0.37; Local linear regression (LOESS) 0.68 / 0.53; Local linear regression (Splines) 0.68 / 0.53; Local linear regression (NPREG) 0.68 / 0.53; Support Vector Regression 0.70 / 0.51; Neural Network (SCG) 0.76 / 0.41; Neural Network (Backpropagation) 0.76 / 0.42; Extreme Learning Machine (linear) 0.68 / 0.54; Extreme Learning Machine (hardlim) 0.72 / 0.48; Extreme Learning Machine (sigmoid) 0.96 / 0.08.

The neural network was built with the two best performing parameter sets identified before (see chapter 5.4.1): three hidden layers with 100 nodes each and four layers with 40 nodes each. The results indicate that further increases of the sample size do not promise large accuracy improvements (see Figure 5.29). The RMSE curve already flattens for training sets larger than 80% of the data available (362 points). For further proof, the same analysis has been conducted with the npreg algorithm (see chapter 5.3.2.3). However, due to computational costs, not the full 13-variable predictor set but the seven most important predictors⁹ have been fitted. The results in Figure 5.30 support the implications previously mentioned. An extension of the dataset does not automatically lead to higher prediction accuracy. On the contrary, the npreg algorithm almost achieves the maximum accuracy achieved in this study with 60% of the training data already.

⁹ Big five traits, health and the maximizer vs. satisficer measure.
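A sketch of such a reduced-sample test with caret is given below: it loops over random subsets covering 50% to 100% of the data and records the cross-validated RMSE. The data frame dat and the outcome column HFI are assumed names, and a plain GLM stands in for the study's neural network so that the sketch stays self-contained.

library(caret)
set.seed(42)
ctrl      <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
fractions <- seq(0.5, 1.0, by = 0.1)
rmse_by_fraction <- sapply(fractions, function(p) {
  idx <- sample(nrow(dat), floor(p * nrow(dat)))       # random subset of size p * n
  fit <- train(HFI ~ ., data = dat[idx, ], method = "glm", trControl = ctrl)
  min(fit$results$RMSE)                                # cross-validated RMSE on the subset
})
plot(fractions, rmse_by_fraction, type = "b",
     xlab = "percentage of used data", ylab = "RMSE (repeated cross-validation)")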


Figure 5.29: RMSE accuracy gains with increased number of training points for the neural network (x-axis: percentage of used data; y-axis: RMSE from repeated cross-validation; curves for three and four hidden layers).

Figure 5.30: RMSE accuracy gains with increased number of training points for npreg (x-axis: percentage of used data; y-axis: RMSE from repeated cross-validation).


6. Evaluation

This study's rationale is the assumption that machine-learning techniques may contribute to the understanding and prediction of subjective well-being. As specified in the results chapter, the conducted algorithms do not provide higher prediction accuracies than the general linear model for the available dataset. However, the obtained results allow for a detailed analysis of the proposed hypotheses and deepen the understanding of well-being's internal structure and dependencies.

6.1 Hypotheses

6.1.1 Existence of Well-Being Baseline (Hypothesis 1)

According to the first hypothesis it is assumed that the available dataset would underpin the well-being baseline theory (cf. Headey & Wearing, 1991). As described, the measured standard deviation within each participant's well-being trajectory (four weeks) was 2.5 times smaller than the standard deviation between participants' well-being averages. Moreover, the averaged well-being value per participant explained 84% of the variance within the weekly individual well-being values on average (see linear regressions). Those high percentages of explained variance indicate that well-being is fairly stable over the analyzed period within the dataset. However, this analysis is obviously limited to the study period of four weeks, so that it has to be questioned whether stable subjective well-being would be confirmed by a longer study period and a higher frequency of well-being measurement. Considering the mentioned limitation, the proposed well-being baseline has been found and the theory by Headey and Wearing (1991) is supported. The first hypothesis is therefore accepted.
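The variance decomposition behind this hypothesis can be sketched in a few lines of R, assuming a long-format data frame wb with columns id, week and HFI (four rows per participant); the pooled regression is only an approximation of the per-participant figures reported above.

within_sd  <- mean(tapply(wb$HFI, wb$id, sd))    # average spread inside each four-week trajectory
between_sd <- sd(tapply(wb$HFI, wb$id, mean))    # spread of the participant averages
between_sd / within_sd                           # reported above as roughly 2.5

wb$baseline <- ave(wb$HFI, wb$id)                # each participant's average as baseline
summary(lm(HFI ~ baseline, data = wb))$r.squared # pooled counterpart of the 84% figure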

6.1.2 Predictability of Well-Being Baseline (Hypothesis 2)

Hypothesis 2, addressing the predictability of the identified well-being baseline, is this study's main aspect. As outlined in the results chapter, several different machine-learning algorithms have been applied to test for non-linear linkages between demographics as well as personality and the well-being baseline. However, none of the algorithms achieved higher accuracy results than the linear model when appropriately tested with sufficient cross-validation. Three possible causes would explain the obtained findings: (1) Firstly, the conducted algorithms might not be able to fit the existing structure within the data sufficiently. (2) Secondly, the existing dataset is too small to differentiate between structural variance and noise, so that cross-validation prevents finding existing structures. However, the accuracy analysis for smaller subsets does not indicate large accuracy gains by larger samples. (3) And thirdly, the linkages between personality as well as demographics and well-being are fairly linear and consequently well described by the generalized linear model (see chapter 5.2). Nevertheless, non-parametric regression approaches on subsets of the predictor space found non-linear structures within the data, as for example for the maximizer-satisficer test. These might result from the reduction of independent variables, since the remaining variables found to be non-linear will embody additional variance previously explained by predictors which are then excluded from the model.

The best fitting models achieved a cross-validated accuracy of RMSE = 0.68, which corresponds to 68% of the dependent variable's standard deviation (46% of the variance). The model consequently explains 32% of the standard deviation (54% of the variance), which cannot be regarded as sufficient prediction. However, this finding is in line with the achieved accuracy in the feasibility study by Hall et al. (2013). For a full list of the accuracy results see Figure 5.28.
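Assuming the dependent variable was standardized (standard deviation of one), the reported percentages follow directly from the cross-validated RMSE:

\[ \frac{\mathrm{RMSE}}{\sigma_y} = 0.68 \quad\Rightarrow\quad R^2 = 1 - \left(\frac{\mathrm{RMSE}}{\sigma_y}\right)^2 = 1 - 0.68^2 \approx 0.54 . \]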

According to the algorithms performed, neuroticism is the predominant variable, followed by extraversion and conscientiousness, which is in accordance with the existing literature. As novel measures in the well-being literature, the maximizer-satisficer scale and the participants' fairness perception have been tested for influences. The first mentioned is found to provide a reasonable contribution to the well-being baseline explanation, particularly if analyzed by non-parametric algorithms, since a local U-shaped curve has been found in some analyses. The latter did not explain additional variance and should consequently not be considered as relevant any further. The same concerns most of the demographic variables, except for gender and age. The participant's education, employment and location did not provide any value added, whereby it has to be noted that this study's sample is not sufficiently representative in regards to location. Generally, the predictors can be clustered by importance into three groups as outlined in Table 5.4.

6.1.3 Characterization of Well-Being Trajectory (Hypothesis 3 & 4)

While the existence of a well-being baseline per participant could at least partially be confirmed, each participant's well-being trajectory was not found to follow certain rules. However, the trajectory obviously floats around the mentioned baseline, as constructed using the average over the four weeks considered in this study. The third hypothesis can therefore be accepted, but does not provide large scientific value. A prediction of the trajectory, on the other hand, was not possible. All tested approaches to explain the variance of the trajectories' standard deviations and linear slopes between participants failed. The models explained less than 1% of the variance in the dataset. Thus, the fourth hypothesis could not be confirmed. The in-person well-being trajectory's standard deviation is not dependent on the considered personality factors and demographic variables. The algorithms tested include a simple generalized linear model and the extreme learning machine. Both each participant's in-trajectory standard deviation and each participant's in-trajectory linear regression coefficient have been tested as dependent variables for explained variance by the 13 independent variables (psychometrics and demographics). No variance between participants could be explained.
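A minimal R sketch of this test is given below, assuming the long-format data frame wb (columns id, week, HFI) introduced earlier and a data frame pred with one row of the thirteen predictors per participant; the study additionally ran an extreme learning machine on the same targets.

# Per-participant in-trajectory standard deviation and linear slope
traj <- do.call(rbind, lapply(split(wb, wb$id), function(p) {
  data.frame(id        = p$id[1],
             sd_hfi    = sd(p$HFI),
             slope_hfi = unname(coef(lm(HFI ~ week, data = p))["week"]))
}))
d4  <- merge(traj, pred, by = "id")
fit <- glm(sd_hfi ~ . - id - slope_hfi, data = d4)   # analogously for slope_hfi
summary(fit)                                         # the study found less than 1% explained variance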

6.2 Further Findings

Besides the proposed hypotheses, the applied machine-learning algorithms provided additional findings. To be mentioned in this regard is the found negative correlation of physical health and well-being, which conflicts with the common well-being literature (cf. Diener et al., 1999; Jorm & Ryan, 2014; Lacey et al., 2008). First assumptions that the influence might result from the fact that more important variables such as neuroticism¹ and conscientiousness are controlled first could not be confirmed. Even with health as the only predictor, a negative influence was found.

Moreover, the U-shaped dependency between age and well-being widely discussed in the literature (cf. Blanchflower & Oswald, 2008; Clark & Oswald, 2006) was not observed. However, the found slightly positive correlation corresponds with the findings by Blanchflower and Oswald (2008) before accounting for different cohorts. Moreover, Blanchflower and Oswald (2008) controlled for several variables not present in this study, such as income and children. A distinct consideration of different cohorts has not been conducted in this study; instead, the U-shape would have become visible through the applied kernel smoothing algorithms if it were available within the data. However, this study demonstrates that age generally is by far less important than the psychometric predictors.

The maximizer-satisficer scale has been found to be a reliable addition to the important big five personality traits (especially neuroticism, extraversion and conscientiousness). The predictor is only moderately correlated with the big five (r < 0.3 ∀ big five) and explains the same amount of variance as openness (∼ 10%). However, the identified U-shaped internal structure should be considered in further analysis, because fitting with linear models is not adequate to map those dependencies.

Furthermore, a significant influence of the gender variable was observed, indicating that male participants experience lower well-being values than female participants. This contradicts previous well-being studies (e.g. Diener et al., 1991, 1999) and was moreover not observed within the smaller 2013 dataset for the feasibility study by Hall et al. (2013). However, no conclusive findings about gender have been made to date, so that further research is advised in order to identify the independent effect of gender on personal well-being.

¹ Neuroticism and health are relatively highly correlated (r = 0.33).


6.3 Limitations

Empirical research is exposed to certain limitations. Most important in this regard is the selection of participants, which has been conducted via social networks and direct contact. This study's participants might therefore not necessarily represent the optimal sociological sample. However, an effort has been made to achieve a widespread sample, since the completeness of the data is more important than statistical representativeness in order to evaluate determinants and their importance for well-being prediction. Moreover, the underlying sample is a combination of two different sources using the same questionnaires and procedures on different participants with an interval of one year. Possible influences of this combination have not been found, but can also not be precluded in general.

The conducted measurement of well-being is furthermore limited, since this study is based on four single well-being measures over a period of four weeks only. Activities, events in the participants' lives and other short-term influences are not measured, so that those influences widely reported in the literature (see e.g. Diener et al., 1999; Veenhoven, 1984) are not covered by this study (cf. also the Day Reconstruction Method, Kahneman et al., 2004). The well-being trajectory obtained from each participant is consequently a sequence of snapshots only and does not reflect the entire well-being curve. However, a temporal analysis of the participants' responses did not result in any significant well-being differences over the hour of the reporting day, and the measurements have been conducted on Wednesdays, so that short-term variation is avoided. Moreover, the differences between in-trajectory and between-participant well-being variance allow for an identification of the well-being baseline. Nevertheless, four data points might not be enough in order to gather a well descriptive image of each participant's well-being trajectory.

The obtained well-being data is furthermore subjective, since the participants rate their own perception on the provided Likert scales. Consequently, the well-being measured has no demand for objectivity, and the results do not indicate whether a certain participant is actually well in different regards. Instead, the individual perception of the participant's own well-being is measured according to the state of research. Consequently, well-being is measured on an individual scale per participant in this study and is not intended to contribute to generalized national measures, such as a gross national well-being score.

The well-being prediction problems dealt with in the study are limited to demographic and psychometric predictors. These are the most discussed and have been found to have the highest influence on subjective well-being (Diener et al., 1999). However, many other correlates of subjective well-being have been identified (see e.g. Veenhoven, 1984) but are not covered by this study. Considering that the applied machine learning algorithms explained only roughly half of the variance, it has to be estimated that other factors account for a significant proportion of the variance, too, especially if analyzed with non-parametric tools in combination with the important predictors identified by this study. Significant correlations identified in the literature and not sufficiently considered by this study include, for example, religion, culture and marriage status.


7. Implications and Further Research

Machine learning has been proven to provide reasonable insight into the structure and dependencies of well-being. Even though it was not achieved to explain higher proportions of the variance between participants' well-being than by the general linear model, the findings on predictor importance and the applicability of several machine learning algorithms add to the scientific goal of a robust well-being definition. According to these findings, ordinary least squares regression provides the most appropriate prediction approximation upon psychometric and demographic variables.

However, since complex non-parametric regression algorithms upon psychometric and demographic variables do not achieve accuracies larger than 54% of explained variance, it can be concluded that other dependencies independent of the controlled variables must exist. Most important in this regard is probably the participant's life situation and activities, which are not covered adequately in this study. Moreover, it has to be questioned whether a four-week period is sufficient in order to obtain the mid-term well-being baseline assumed to be predominantly dependent on psychometrics and demographics. Further development of mobile applications and social networks might add to future well-being data collection on a larger scale, especially for long-term studies with high-frequency data retrieval. The human flourishing index seems to be a valuable tool in this regard.

This study's initial question for the reasons why some people judge the glass half full and others half empty can consequently not be answered in a more profound way than already described in the literature. Further research on well-being prediction is advised to broaden the predictor space, including for example participants' important life events and a more general perspective on the social background. It has to be questioned whether this information is accessible via anonymous online questionnaires. However, the found independence between HFI values and the hour of the day supports the use of online surveys. The obtained results are comparable, even if the participants choose the hour of the day for completing the questionnaire themselves.

Concerning the predictors' importance relevant for a well-being definition, further attention to the maximizer-satisficer scale is recommended. The mentioned predictor has been found to explain a significant amount of variance, which is even more relevant when analyzed locally. A U-shaped structure is observed, indicating that satisficers as well as extreme maximizers benefit from higher well-being than the average participant. Generally, the predominant importance of neuroticism is confirmed, followed by extraversion and conscientiousness.

The applied non-parametric machine learning algorithms significantly increased the developed picture of the well-being dependencies' internal structures. Today, most analyses on social problems do not challenge significances found by variance analysis and linear regression for underlying non-parametric structures, although those would probably add additional value to the ongoing scientific discussion. The applied methods, even if developed for big data assessment, have been proven to reveal interesting and new facets of this study's well-being prediction problem upon comparably small datasets. The topic of 'small data' analysis, including small samples with high dimensionality, recently evolved from the increased availability of individual, personal data gained for example from smart phones and social media activity. Social data availability will simplify the understanding of dependencies and underlying structures, but it will also demand easy-to-use, well-interpretable, but nevertheless powerful analysis procedures. It is consequently proposed that non-parametric tools and feature selection methods should be further developed and more often utilized in order to question popular, but simple regression results.


References

Aitchison, J., & Aitken, C. G. G. (1976). Multivariate Binary Discrimination by the Kernel Method. Biometrika, 63(3), 413–420.
Anthony, M., & Bartlett, P. L. (2009). Neural network learning: Theoretical foundations. Cambridge, England: Cambridge University Press.
Aristotle. (2002). Nicomachean Ethics (S. Broadie & C. Rowe, Eds.). Oxford, England: Oxford University Press.
Arlot, S., & Celisse, A. (2010). A Survey of Cross-Validation Procedures for Model Selection. Statistics Surveys, 4, 40–79.
Basak, D., Pal, S., & Patranabis, D. C. (2007). Support Vector Regression. Neural Information Processing – Letters and Reviews, 11(10), 203–224.
Beaujean, A. A. (2013). Factor Analysis using R. Practical Assessment, Research & Evaluation, 18(4), 1–11.
Belsley, D. A. (1991). A Guide to Using the Collinearity Diagnostics. Computer Science in Economics and Management, 4, 33–50.
Bentler, P. M. (2007). On tests and indices for evaluating structural models. Personality and Individual Differences, 42(5), 825–829.
Bergmeir, C., & Benitez, J. M. (2012). Neural Networks in R using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 46(7), 1–26.
Bergmeir, C., & Benitez, J. M. (2013). Package 'RSNNS' (R package manual Version 0.4-4). R-project.org. Retrieved from http://cran.r-project.org/web/packages/RSNNS/index.html
Berrueta, L. A., Alonso-Salces, R. M., & Héberger, K. (2007). Supervised pattern recognition in food analysis. Journal of Chromatography A, 1158, 196–214.
Blanchflower, D. G. (2001). Unemployment, Well-Being, and Wage Curves in Eastern and Central Europe. Journal of the Japanese and International Economies, 15(4), 364–402.
Blanchflower, D. G., & Oswald, A. J. (2004). Well-Being over Time in Britain and the USA. Journal of Public Economics, 88, 1359–1386.
Blanchflower, D. G., & Oswald, A. J. (2008). Is well-being U-shaped over the life cycle? Social Science and Medicine, 66(8), 1733–49.
Bouckaert, R., & Frank, E. (2004). Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. In H. Dai, R. Srikant, & C. Zhang (Eds.), Advances in knowledge discovery and data mining (pp. 3–12). Berlin, Heidelberg, Germany: Springer.
Braga-Neto, U. M., & Dougherty, E. R. (2004). Is Cross-Validation Valid for Small-Sample Microarray Classification? Bioinformatics, 20(3), 374–80.
Brickman, P., & Campbell, D. T. (1971). Hedonic Relativism and Planning the Good Society. In M. H. Appley (Ed.), Adaptation-level theory (pp. 287–305). New York: Academic Press.
Brunstein, J. C., Schultheiss, O. C., & Grässmann, R. (1998). Personal Goals and Emotional Well-Being: The Moderating Role of Motive Dispositions. Journal of Personality and Social Psychology, 75(2), 494–508.
Burman, P. (1989). A Comparative Study of Ordinary Cross-Validation, v-Fold Cross-Validation and the Repeated Learning-Testing Methods. Biometrika, 76(3), 503–514.
Campbell, A., Converse, P. E., & Rodgers, W. L. (1976). The Quality of American Life: Perceptions, Evaluations, and Satisfactions. New York, NY: Russell Sage Foundation.
Cattell, R. B. (1947). Conformation and Clarification of primary Personality Factors. Psychometrika, 12(3), 197–220.
Cawley, G. C., & Talbot, N. L. C. (2010). On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation. The Journal of Machine Learning Research, 11, 2079–2107.
Chittaranjan, G., Blom, J., & Gatica-Perez, D. (2011). Who's Who with Big-Five: Analyzing and Classifying Personality Traits with Smartphones. In 15th Annual International Symposium on Wearable Computers (pp. 29–36). IEEE.

Clark, A. E. (2003). Unemployment as a Social Norm: Psychological Evidence from Panel Data. Journal of Labor Economics, 21(2), 323–351.
Clark, A. E., Frijters, P., & Shields, M. A. (2008). Relative Income, Happiness, and Utility: An Explanation for the Easterlin Paradox and Other Puzzles. Journal of Economic Literature, 46(1), 95–144.
Clark, A. E., & Oswald, A. J. (2006). The curved relationship between subjective well-being and age (Tech. Rep. No. 26). Paris: PSE. Retrieved from http://halshs.archives-ouvertes.fr/halshs-00590404/
Cleveland, W. S. (1979). Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association, 74(368), 829–836.
Cleveland, W. S., & Devlin, S. J. (1988). Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting. Journal of the American Statistical Association, 83(403), 596–610.
Collins, J. M., & Clark, M. R. (1993). An Application of the Theory of Neural Computation To the Prediction of Workplace Behavior: An Illustration and Assessment of Network Analysis. Personnel Psychology, 46(3), 503–524.
Cortes, C., & Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20, 273–297.
Cummins, R. (2009). Subjective Wellbeing, Homeostatically Protected Mood and Depression: A Synthesis. Journal of Happiness Studies, 11(1), 1–17.
Davies, J. C. (1962). Towards a Theory of Revolution. American Sociological Review, 27(1), 5–19.
Deaton, A. (2007). Income, Aging, Health and Well-being around the World: Evidence from the Gallup World Poll (Tech. Rep.). Cambridge, MA: NBER.
de Boor, C. (2001). A Practical Guide to Splines (Revised ed.). New York, NY: Springer.
DeNeve, K. M., & Cooper, H. (1998). The Happy Personality: A Meta-Analysis of 137 Personality Traits and Subjective Well-Being. Psychological Bulletin, 124(2), 197–229.
Deutskens, E., de Ruyter, K., Wetzels, M., & Oosterveld, P. (2004). Response Rate and Response Quality of Internet-Based Surveys: An Experimental Study. Marketing Letters, 15(1), 21–36.
Diener, E. (1984). Subjective Well-Being. Psychological Bulletin, 95(3), 542–575.
Diener, E. (1994). Assessing Subjective Well-Being: Progress and Opportunities. Social Indicators Research, 31, 103–157.
Diener, E. (2000). Subjective Well-Being: The Science of Happiness and a Proposal for a National Index. American Psychologist, 55(1), 34–43.
Diener, E. (2013). The Remarkable Changes in the Science of Subjective Well-Being. Perspectives on Psychological Science, 8(6), 663–666.
Diener, E., & Chan, M. Y. (2011). Happy People Live Longer: Subjective Well-Being Contributes to Health and Longevity. Applied Psychology: Health and Well-Being, 3(1), 1–43.
Diener, E., Emmons, R., Larsen, R., & Griffin, S. (1985). Satisfaction with Life Scale. Journal of Personality Assessment, 49(1), 71–75.
Diener, E., & Lucas, R. E. (1999). Value as a Moderator in Subjective Well-being. Journal of Personality, 67(1), 158–184.
Diener, E., Sandvik, E., & Pavot, W. (1991). Happiness is the Frequency, not the Intensity of Positive Versus Negative Affect. In N. Schwarz, F. Strack, & M. Argyle (Eds.), Subjective well-being: An interdisciplinary perspective (pp. 119–139). Oxford, England: Pergamon Press.
Diener, E., Sandvik, E., Seidlitz, L., & Diener, M. (1993). The Relationship between Income and Subjective Well-being: Relative or Absolute? Social Indicators Research, 28, 195–223.
Diener, E., & Seligman, M. (2004). Beyond Money: Toward an Economy of Well-Being. Psychological Science in the Public Interest, 5(1), 1–31.
Diener, E., & Suh, E. (1997). Measuring Quality of Life: Economic, Social, and Subjective Indicators. Social Indicators Research, 40, 189–216.

Diener, E., Suh, E., Lucas, R. E., & Smith, H. L. (1999). Subjective Well-Being: Three Decades of Progress. Psychological Bulletin, 125(2), 276–302.
Diener, E., Suh, E., Smith, H. L., & Shao, L. (1995). National Differences in Reported Subjective Well-Being: Why Do They Occur? Social Indicators Research, 34(1), 7–32.
Diener, E., & Tay, L. (2013). A Scientific Review of the Remarkable Benefits of Happiness for Successful and Healthy Living (Report of the Well-Being Working Group). Royal Government of Bhutan.
Dodge, R., Daly, A., Huyton, J., & Sanders, L. (2012). The Challenge of Defining Wellbeing. International Journal of Well-being, 2(3), 222–235.
Dolan, P., Peasgood, T., & White, M. (2008). Do we really know what makes us happy? A review of the economic literature on the factors associated with subjective well-being. Journal of Economic Psychology, 29, 94–122.
Easterlin, R. (1974). Does Economic Growth Improve the Human Lot? Some Empirical Evidence. In P. A. David & M. W. Reder (Eds.), Nations and households in economic growth: Essays in honor of Moses Abramovitz (pp. 89–125). New York, NY: Academic Press.
Easterlin, R. (1995). Will Raising the Incomes of All Increase the Happiness of All? Journal of Economic Behavior and Organization, 27, 35–47.
Efron, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 7(1), 1–26.
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least Angle Regression. The Annals of Statistics, 32(2), 407–451.
Efron, B., & Tibshirani, R. (1997). Improvements on Cross-Validation: The .632+ Bootstrap Method. Journal of the American Statistical Association, 92(438), 548–560.
Ellison, C. G. (1991). Religious Involvement and Subjective Well-Being. Journal of Health and Social Behavior, 32(1), 80–99.
Emmons, R. A., & Diener, E. (1986). An interactional approach to the study of personality and emotion. Journal of Personality, 54, 371–384.
Enquete-Kommission. (2013). Schlussbericht: Wachstum, Wohlstand, Lebensqualität – Wege zu nachhaltigem Wirtschaften und gesellschaftlichem Fortschritt in der Sozialen Marktwirtschaft (Report). Berlin, Germany: Deutscher Bundestag.
Frey, B. S., & Stutzer, A. (2002). Happiness and Economics: How the Economy and Institutions Affect Human Well-Being. In Contemporary sociology. Princeton, NJ: Princeton University Press.
Friedman, J. H. (2006). Recent Advances in Predictive (Machine) Learning. Journal of Classification, 23(2), 175–197.
Fujita, F., & Diener, E. (2005). Life Satisfaction Set Point: Stability and Change. Journal of Personality and Social Psychology, 88(1), 158–64.
Fujita, F., Diener, E., & Sandvik, E. (1991). Gender Differences in Negative Affect and Well-Being: The Case for Emotional Intensity. Journal of Personality and Social Psychology, 61(3), 427–434.
Furse, D. H., & Stewart, D. W. (1982). Monetary Incentives Versus Promised Contribution to Charity: New Evidence on Mail Survey Response. Journal of Marketing Research, 19(3), 375–380.
Goldberg, L. R. (1993). The Structure of Phenotypic Personality Traits. American Psychologist, 48(1), 26–34.
Gosso, A. (2013). Package 'elmNN' (R package manual Version 1.0). R-project.org. Retrieved from http://cran.r-project.org/web/packages/elmNN/elmNN.pdf
Hainmueller, J., & Hazlett, C. (2013). Kernel Regularized Least Squares: Reducing Misspecification Bias with a Flexible and Interpretable Machine Learning Approach. Political Analysis, 2013, 1–26.
Hall, M., Caton, S., & Weinhardt, C. (2013). Well-being's Predictive Value. In A. A. Ozok & P. Zaphiris (Eds.), Proceedings of the 15th International Conference on Human-Computer Interaction (HCII) (pp. 13–22). Berlin, Germany: Springer.

Hall, M., Kimbrough, S. O., Haas, C., Weinhardt, C., & Caton, S. (2012). Towards the Gamification of Well-Being Measures. In 2nd Workshop on Analyzing and Improving Collaborative eScience with Social Networks (eSon), Proceedings of the 8th IEEE International Conference on eScience (eScience 2012) (pp. 1–8). Chicago, IL: IEEE.
Harter, J. K., Schmidt, F. L., & Keyes, C. L. M. (2003). Well-being in the Workplace and its Relationship to Business Outcomes: A Review of the Gallup Studies. In C. L. M. Keyes & J. Haidt (Eds.), Flourishing: The positive person and the good life (pp. 205–224). Washington D.C.: American Psychological Association.
Haslam, N., Whelan, J., & Bastian, B. (2009). Big Five traits mediate associations between values and subjective well-being. Personality and Individual Differences, 46(1), 40–42.
Hastie, T. (2013). Package 'gam' (R package manual Version 1.09). R-project.org. Retrieved from http://cran.r-project.org/web/packages/gam/gam.pdf
Hastie, T., & Efron, B. (2013). Package 'lars' (R package manual Version 1.2). R-project.org. Retrieved from http://cran.r-project.org/web/packages/lars/lars.pdf
Hastie, T., & Tibshirani, R. (1986). Generalized Additive Models. Statistical Science, 1(3), 297–318.
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements of Statistical Learning (2nd ed.). New York, NY: Springer.
Hayfield, T., & Racine, J. S. (2007). The np Package. R News, 27(5), 1–32. Retrieved from http://cran.r-project.org/web/packages/np/vignettes/np.pdf
Hayfield, T., & Racine, J. S. (2013). Package 'np' (R package manual Version 0.50-1). R-project.org. Retrieved from http://cran.r-project.org/web/packages/np/np.pdf
Headey, B. W., Veenhoven, R., & Wearing, A. J. (1991). Top Down Versus Bottom Up Theories of Subjective Well-being. Social Indicators Research, 24, 81–100.
Headey, B. W., & Wearing, A. J. (1991). Subjective Well-Being: A Stocks and Flows Framework. In N. Schwarz, F. Strack, & M. Argyle (Eds.), Subjective well-being: An interdisciplinary perspective (pp. 49–73). Oxford, England: Pergamon Press.
Hechenbichler, K., & Schliep, K. (2004). Weighted k-Nearest-Neighbor Techniques and Ordinal Classification (Discussion Paper). Munich, Germany: LMU.
Heckerman, D. (1996). A Tutorial on Learning With Bayesian Networks (Tech. Rep.). Redmond, WA: Microsoft Research.
Hofmann, T., Schölkopf, B., & Smola, A. J. (2008). Kernel Methods in Machine Learning. The Annals of Statistics, 36(3), 1171–1220.
Huang, G.-B., Chen, L., & Siew, C.-K. (2006). Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Transactions on Neural Networks, 17(4), 879–92.
Hubbard, R., & Little, E. L. (1988). Promised Contribution to Charity and Mail Survey Responses: Replication with Extension. Public Opinion Quarterly, 52, 223–230.
Huppert, F., & So, T. T. C. (2011). Flourishing Across Europe: Application of a New Conceptual Framework for Defining Well-Being. Social Indicators Research, 110(3), 837–861.
Hurvich, C. M., Simonoff, J. S., & Tsai, C.-L. (1998). Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. Journal of the Royal Statistical Society. Series B (Methodological), 60(2), 271–293.
John, O. P., Donahue, E. M., & Kentle, R. L. (1991). The Big Five Inventory – Versions 4a and 54 (Questionnaire). Berkeley, CA: University of California, Institute of Personality and Social Research.
John, O. P., Naumann, L. P., & Soto, C. J. (2008). Paradigm Shift to the Integrative Big Five Trait Taxonomy. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality: Theory and research (3rd ed., pp. 114–158). New York, NY: Guilford Press.
John, O. P., & Srivastava, S. (1999). The Big Five Trait Taxonomy: History, Measurement, and Theoretical Perspectives. In O. P. John, R. W. Robins, & L. A. Pervin (Eds.), Handbook of personality: Theory and research (pp. 102–138). New York, NY: Guilford Press.

Jorm, A. F., & Ryan, S. M. (2014). Cross-National and Historical Differences in Subjective Well-Being. International Journal of Epidemiology, 43(1), 1–11.
Kahneman, D., & Krueger, A. B. (2006). Developments in the Measurement of Subjective Well-Being. Journal of Economic Perspectives, 20(1), 3–24.
Kahneman, D., Krueger, A. B., Schkade, D. A., Schwarz, N., & Stone, A. A. (2004). A Survey Method for Characterizing Daily Life Experience: The Day Reconstruction Method. Science, 306(5702), 1776–80.
Kohavi, R. (1995). A study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2 (pp. 1137–1145). San Francisco, CA: Morgan Kaufmann Publishers Inc.
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software, 28(5), 1–26.
Kuhn, M. (2014). A Short Introduction to the caret Package (R package introduction). R-project.org. Retrieved from http://cran.r-project.org/web/packages/caret/vignettes/caret.pdf
Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. New York, NY: Springer.
Kuhn, M., Wing, J., Weston, S., Williams, A., Keefer, C., Engelhardt, A., . . . Mayer, Z. (2014). Package 'caret' (R package manual Version 6.0-22). R-project.org. Retrieved from http://cran.r-project.org/web/packages/caret/caret.pdf
Lacey, H. P., Fagerlin, A., Loewenstein, G., Smith, D. M., Riis, J., & Ubel, P. A. (2008). Are they really that happy? Exploring scale recalibration in estimates of well-being. Health Psychology, 27(6), 669–75.
Larsen, R., Diener, E., & Emmons, R. (1985). An evaluation of subjective well-being measures. Social Indicators Research, 17(1), 1–17.
Li, Q., & Racine, J. S. (2004). Cross-Validated Local Linear Nonparametric Regression. Statistica Sinica, 14, 485–512.
Likert, R. (1974). The Method of Constructing an Attitude Scale. In G. M. Maranell (Ed.), Scaling: A sourcebook for behavioral scientists (pp. 233–243). Chicago, IL: Aldine.
Lucas, R. E., Clark, A. E., Georgellis, Y., & Diener, E. (2004). Unemployment Alters the Set-Point for Life Satisfaction. Psychological Science, 15(1), 8–13.
Lucas, R. E., Diener, E., & Suh, E. (1996). Discriminant Validity of Well-Being Measures. Journal of Personality and Social Psychology, 71(3), 616–628.
Luttmer, E. F. P. (2005). Neighbors as Negatives: Relative Earnings and Well-Being. The Quarterly Journal of Economics, 120(3), 963–1002.
Lyubomirsky, S., King, L., & Diener, E. (2005). The Benefits of Frequent Positive Affect: Does Happiness Lead to Success? Psychological Bulletin, 131(6), 803–55.
Marquardt, D. W. (1963). An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics, 11(2), 431–441.
Martínez, L. G., Rodríguez-Díaz, A., Licea, G., & Castro, J. R. (2010). Big Five Patterns for Software Engineering Roles Using an ANFIS Learning Approach with RAMSET. In G. Idorov, A. Hernández Aguirre, & C. A. Reyes Garcia (Eds.), Advances in Soft Computing (pp. 428–439). Berlin, Heidelberg, Germany: Springer.
Mason, C. H., & Perreault Jr., W. D. (1991). Collinearity, Power, and Interpretation of Multiple Regression Analysis. Journal of Marketing Research, 28(3), 268–280.
McCrae, R. R., & Costa Jr., P. T. (1985). Updating Norman's "Adequate Taxonomy": Intelligence and Personality Dimensions in Natural Language and in Questionnaires. Journal of Personality and Social Psychology, 49(3), 710–721.
McKee-Ryan, F., Song, Z., Wanberg, C. R., & Kinicki, A. J. (2005). Psychological and Physical Well-Being During Unemployment: A Meta-Analytic Study. The Journal of Applied Psychology, 90(1), 53–76.

Minbashian, A., Bright, J. E. H., & Bird, K. D. (2009). A Comparison of Artificial Neural Networks and Multiple Regression in the Context of Research on Personality and Work Performance. Organizational Research Methods, 13(3), 540–561.
Møller, M. F. (1993). A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning. Neural Networks, 6(4), 525–533.
Nadaraya, E. A. (1964). On Estimating Regression. Theory of Probability and its Applications, 9(1), 141–142.
Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized Linear Models. Journal of the Royal Statistical Society. Series A (General), 135(3), 370–384.
Nenkov, G. Y., Morrin, M., Ward, A., Hulland, J., & Schwartz, B. (2008). A short form of the Maximization Scale: Factor structure, reliability and validity studies. Judgment and Decision Making, 3(5), 371–388.
Ng, Y.-K. (1997). A case for happiness, cardinalism, and interpersonal comparability. The Economic Journal, 107(445), 1848–1858.
Nilsson, N. J. (2005). Introduction to Machine Learning (Unpublished Textbook). Stanford, CA: Robotics Laboratory, Department of Computer Science, Stanford University. Retrieved from http://robotics.stanford.edu/~nilsson/MLBOOK.pdf
Norman, W. T. (1963). Toward an adequate taxonomy of personality attributes: Replicated factor structure in peer nomination personality ratings. The Journal of Abnormal and Social Psychology, 66(6), 574–583.
Oswald, A. J. (1997). Happiness and Economic Performance. The Economic Journal, 107(445), 1815–1831.
Page, K. M., & Vella-Brodrick, D. A. (2008). The 'What', 'Why' and 'How' of Employee Well-Being: A New Model. Social Indicators Research, 90(3), 441–458.
Quek, M., & Moskowitz, D. S. (2007). Testing Neural Network Models of Personality. Journal of Research in Personality, 41(3), 700–706.
Racine, J. S., & Li, Q. (2004). Nonparametric estimation of regression functions with both categorical and continuous data. Journal of Econometrics, 119(1), 99–130.
Rajesh, R., & Prakash, J. S. (2011). Extreme Learning Machines - A Review and State-of-the-art. International Journal of Wisdom Based Computing, 1(1), 35–49.
Raykov, T. (1998). On the Use of Confirmatory Factor Analysis in Personality Research. Personality and Individual Differences, 24(2), 291–293.
Read, S. J., Monroe, B. M., Brownstein, A. L., Yang, Y., Chopra, G., & Miller, L. C. (2010). A Neural Network Model of the Structure and Dynamics of Human Personality. Psychological Review, 117(1), 61–92.
Revelle, W. (2014). Package 'psych' (R package manual Version 1.4.3). R-project.org. Retrieved from http://cran.r-project.org/web/packages/psych/psych.pdf
Robertson, D. H., & Bellenger, D. N. (1978). A New Method of Increasing Mail Survey Responses: Contributions to Charity. Journal of Marketing, 15, 632–633.
Rojas, R. (1996). The Backpropagation Algorithm. In Neural networks (pp. 152–184).
Rosseel, Y., Oberski, D., Byrnes, J., Vanbrabant, L., Savalei, V., Merkle, E., . . . Barendse, M. (2014). Package 'lavaan' (R package manual Version 0.5-16). R-project.org. Retrieved from http://cran.r-project.org/web/packages/lavaan/lavaan.pdf
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning Representations by Back-Propagating Errors. Nature, 323(9), 533–536.
Ryan, R. M., & Deci, E. L. (2001). On Happiness and Human Potential: A Review of Research on Hedonic and Eudaimonic Well-Being. Annual Review of Psychology, 52, 141–166.
Ryff, C. D. (1989). Happiness Is Everything, or Is It? Explorations on the Meaning of Psychological Well-Being. Journal of Personality and Social Psychology, 57(6), 1069–1081.
Ryff, C. D., & Keyes, C. L. M. (1995). The Structure of Psychological Well-Being Revisited. Journal of Personality and Social Psychology, 69(4), 719–27.

Samuel, R., Bergman, M. M., & Hupka-Brunner, S. (2013). The Interplay between Educational Achievement, Occupational Success, and Well-Being. Social Indicators Research, 111(1), 75–96.
Schliep, K., & Hechenbichler, K. (2014). Package 'kknn' (R package manual Version 1.2-5). R-project.org. Retrieved from http://cran.r-project.org/web/packages/kknn/kknn.pdf
Schmitt, M., & Dörfel, M. (1999). Procedural injustice at work, justice sensitivity, job satisfaction and psychosomatic well-being. European Journal of Social Psychology, 29, 443–453.
Schmitt, M., Gollwitzer, M., Maes, J., & Arbach, D. (2005). Justice Sensitivity. European Journal of Psychological Assessment, 21(3), 202–211.
Schwartz, B., Ward, A., Monterosso, J., Lyubomirsky, S., White, K., & Lehman, D. R. (2002). Maximizing Versus Satisficing: Happiness Is a Matter of Choice. Journal of Personality and Social Psychology, 83(5), 1178–1197.
Scitovsky, T. (1976). The joyless economy: An inquiry into human satisfaction and consumer dissatisfaction (17th ed.). New York, NY: Oxford University Press.
Sheldon, K. M., & Hoon, T. H. (2006). The multiple determination of well-being: Independent effects of positive traits, needs, goals, selves, social supports, and cultural contexts. Journal of Happiness Studies, 8(4), 565–592.
Shields, M. A., & Price, S. W. (2005). Exploring the Economic and Social Determinants of Psychological Well-Being and Perceived Social Support in England. Journal of the Royal Statistical Society. Series A (Statistics in Society), 168(3), 513–537.
Şimşek, O. F., & Koydemir, S. (2012). Linking Metatraits of the Big Five to Well-Being and Ill-Being: Do Basic Psychological Needs Matter? Social Indicators Research, 112(1), 221–238.
Steel, P., Schmidt, J., & Shultz, J. (2008). Refining the Relationship Between Personality and Subjective Well-Being. Psychological Bulletin, 134(1), 138–61.
Stevenson, B., Becker, G., Blanchflower, D. G., Deaton, A., Easterlin, R., Graham, C., . . . Rayo, L. (2008). Economic Growth and Subjective Well-being: Reassessing the Easterlin Paradox (Working Paper No. 14282). Cambridge, MA: NBER. Retrieved from http://www.nber.org/papers/w14282
Stevenson, B., & Wolfers, J. (2013). Subjective Well-Being and Income: Is There Any Evidence of Satiation? American Economic Review, 103(3), 598–604.
Stiglitz, J., Sen, A., & Fitoussi, J.-P. (2009). Report by the Commission on the Measurement of Economic Performance and Social Progress (Report). Cambridge, MA: CMEPSP. Retrieved from http://www.stiglitz-sen-fitoussi.fr/documents/rapport_anglais.pdf
Stone, A. A., Schwartz, J. E., Broderick, J. E., & Deaton, A. (2010). A snapshot of the age distribution of psychological well-being in the United States. Proceedings of the National Academy of Sciences of the United States of America, 107(22), 9985–90.
Suh, E., Diener, E., & Fujita, F. (1996). Events and Subjective Well-Being: Only Recent Events Matter. Journal of Personality and Social Psychology, 70(5), 1091–102.
Telfer, E. (1980). Happiness. New York, NY: St. Martin's Press.
Thinley, J. (2011). Gross National Happiness: A Holistic Paradigm for Sustainable Development (Speech). New Delhi, India: Indian Parliament.
Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Titterington, D. M. (1980). A Comparative Study of Kernel-Based Density Estimates for Categorical Data. Technometrics, 22(2), 259–268.
Vapnik, V., Golowich, S. E., & Smola, A. (1997). Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9 (pp. 281–287). Boston, MA: MIT Press.

Veenhoven, R. (1984). Conditions of Happiness. Dordrecht, Netherlands: D. Reidel Publishing.
Veenhoven, R. (2010). Greater Happiness for a Greater Number. Journal of Happiness Studies, 11(5), 605–629.
Veenhoven, R. (2013). World Database of Happiness. Retrieved 20.11.2013, from http://worlddatabaseofhappiness.eur.nl/hap_cor/cor_fp.htm
Vidaurre, D., Bielza, C., & Larrañaga, P. (2011). Lazy lasso for local regression. Computational Statistics, 27(3), 531–550.
Vittersø, J. (2001). Personality traits and subjective well-being: emotional stability, not extraversion, is probably the important predictor. Personality and Individual Differences, 31, 903–914.
Waldron, S. (2010). Measuring Subjective Wellbeing in the UK (Tech. Report). London, England: Office for National Statistics, UK.
Waterman, A. S. (1993). Two Conceptions of Happiness: Contrasts of Personal Expressiveness (Eudaimonia) and Hedonic Enjoyment. Journal of Personality and Social Psychology, 64(4), 678–691.
Watson, G. S. (1964). Smooth Regression Analysis. Sankhyā: The Indian Journal of Statistics, Series A, 26(4), 359–372.
Witter, R. A., Okun, M. A., Stock, W. A., & Haring, M. J. (1984). Education and Subjective Well-Being: A Meta-Analysis. Educational Evaluation and Policy Analysis, 6(2), 165–173.
Wood, S. N. (2004). Stable and Efficient Multiple Smoothing Parameter Estimation for Generalized Additive Models. Journal of the American Statistical Association, 99(467), 673–686.
Wright, T. A., & Cropanzano, R. (2000). Psychological Well-Being and Job Satisfaction as Predictors of Job Performance. Journal of Occupational Health Psychology, 5(1), 84–94.
Yin, P., & Fan, X. (2001). Estimating R² Shrinkage in Multiple Regression: A Comparison of Different Analytical Methods. The Journal of Experimental Education, 69(2), 203–224.
Zell, A., Mamier, G., Vogt, M., Mache, N., Hübner, R., Döring, S., . . . Gatter, J. (2013). SNNS - User Manual (Computer software manual Version 4.2). Stuttgart, Germany: University of Stuttgart. Retrieved from http://www.ra.cs.uni-tuebingen.de/downloads/SNNS/SNNSv4.2.Manual.pdf
Zou, H., & Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 67(2), 301–320.
Zou, H., & Hastie, T. (2013). Package 'elasticnet' (R package manual Version 1.1). R-project.org. Retrieved from http://cran.r-project.org/web/packages/elasticnet/elasticnet.pdf