1Neuromuscular Research Laboratory/Warrior Human Performance Research Center, Department of Sports Medicine and Nutrition, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania; 2U.S. Army Public Health Center (Provisional), Aberdeen Proving Ground, Maryland; 3Department of Military and Emergency Medicine, Consortium for Health and Military Performance, Uniformed Services University of Health Sciences, Bethesda, Maryland; 4Human Dimension Division, HQ Army Training and Doctrine Command, Fort Eustis, Virginia; 5Thermal and Mountain Medicine and Nutrition Divisions, U.S. Army Research Institute of Environmental Medicine, Natick, Massachusetts; 6Behavioral Biology Branch, Walter Reed Army Institute of Research, Silver Spring, Maryland; and 7Division of Anesthesiology, Walter Reed National Military Medical Center, Bethesda, Maryland
ABSTRACT
Nindl, BC, Jaffin, DP, Dretsch, MN, Cheuvront, SN, Wesensten, NJ, Kent, ML, Grunberg, NE, Pierce, JR, Barry, ES, Scott, JM, Young, AJ, O'Connor, FG, and Deuster, PA. Human performance optimization metrics: consensus findings, gaps, and recommendations for future research. J Strength Cond Res 29(11S): S221–S245, 2015. Human performance optimization (HPO) is defined as "the process of applying knowledge, skills and emerging technologies to improve and preserve the capabilities of military members, and organizations to execute essential tasks." The lack of consensus for operationally relevant and standardized metrics that meet joint military requirements has been identified as the single most important gap for research and application of HPO. In 2013, the Consortium for Health and Military Performance hosted a meeting to develop a toolkit of standardized HPO metrics for use in military and civilian research, and potentially for field applications by commanders, units, and organizations.
Performance was considered from a holistic perspective as being influenced by various behaviors and barriers. To accomplish the goal of developing a standardized toolkit, key metrics were identified and evaluated across a spectrum of domains that contribute to HPO: physical performance, nutritional status, psychological status, cognitive performance, environmental challenges, sleep, and pain. These domains were chosen based on relevant data with regard to performance enhancers and degraders. The specific objectives at this meeting were to (a) identify and evaluate current metrics for assessing human performance within selected domains; (b) prioritize metrics within each domain to establish a human performance assessment toolkit; and (c) identify scientific gaps and the needed research to more effectively assess human performance across domains. This article provides a summary of 150 total HPO metrics across multiple domains that can be used as a starting point, the beginning of an HPO toolkit: physical fitness (29 metrics), nutrition (24 metrics), psychological status (36 metrics), cognitive performance (35 metrics), environment (12 metrics), sleep (9 metrics), and pain (5 metrics). These metrics can be particularly valuable as the military emphasizes a renewed interest in Human Dimension efforts, and leverages science, resources, programs, and policies to optimize the performance capacities of all Service members.
KEY WORDS human dimension strategy, human potential, performance enhancement, program evaluation, tactical athlete, total force fitness, holistic health
INTRODUCTION
Human performance optimization (HPO) is critical for military and combat readiness, and integral for ensuring national security. Leveraging HPO strategies has been identified by senior leadership as a top strategic and operational priority, as evidenced by the publication of a Chairman of the Joint Chiefs Instruction on Total Force Fitness (117) by the then Chairman of the Joint Chiefs, Admiral Mullen, and by the Army’s recent unveiling of the Army Human Dimension Strategy (47). At the “Total Fitness for the 21st Century” conference held at the Uniformed Services University in 2009, HPO was defined as “the process of applying knowledge, skills and emerging technologies to improve and preserve the capabilities of military members, and organizations to execute essential tasks.” In 2006 at a “Human Performance Optimization in the Department of Defense (DoD): Charting a Course for the Future” conference, the lack of consensus for operationally relevant and standardized metrics that meet joint military requirements was identified as the single most important issue for research and application of HPO. Establishing a “toolkit” of valid, reliable, field-expedient, and actionable HPO metrics would be of value to multiple DoD stakeholders: Warriors and operators, military leaders and commanders, medical personnel, policy makers, resource managers, program evaluators, and researchers (50). In the current era of fiscal austerity, the environment for prioritizing resource allocation will be competitive. A number of DoD efforts currently aim to improve and sustain health, performance, and resiliency of military personnel, and there is an increased focus on the human dimension of military readiness. A set of well-defined HPO outcome metrics will be critical for ensuring consistency in program evaluations to identify program efficacy and quantify return on investment. In 2013, the Consortium for Health and Military Performance hosted a meeting to develop a toolkit of standardized HPO metrics for use in military and civilian research, and potentially to be used in field applications by commanders, units, and organizations.
VOLUME 29 | NUMBER 11 | SUPPLEMENT TO NOVEMBER 2015
Human Performance Optimization Metrics
Performance was viewed from a holistic perspective as being influenced by various behaviors and barriers. Key metrics were identified and evaluated across a spectrum of HPO domains. The domains of interest included the following: physical performance, nutritional status, psychological status, cognitive performance, environmental challenges, sleep, and pain. These domains were chosen based on relevant data regarding enhancers and degraders of Warrior performance. Specifically, physical fitness, nutritional status, environmental factors (e.g., heat, cold, and altitude) and psychological state are all recognized as contributing to both physical and cognitive performance. Additionally, sleep status (e.g., adequacy and quality) and pain significantly impact cognitive performance. Workshop participants reviewed available metrics within each domain and developed a summary matrix for each domain to describe: (a) Methods/Metrics used; (b) Validity (Precision); (c) Reliability (Reproducibility); (d) Advantages; and (e) Disadvantages of each metric. The general consensus was that HPO should be viewed as an aggregate outcome influenced by multiple factors, and for optimal performance to be achieved, each factor or set of factors must have metrics for evaluation. The specific objectives at this meeting were to: (a)
identify and evaluate current metrics for assessing human performance within selected domains; (b) prioritize metrics within each domain to construct a human performance assessment toolkit; and (c) identify scientific gaps and needed research to more effectively standardize assessments and to optimize human performance across domains. This effort is in line with the Army’s concept of Human Dimension—which includes cognitive, physical, and social components, and is not unlike the framework of Total Force Fitness (117). All of our Warriors need to be able to learn more quickly, maintain attention more closely, physically perform at peak capacity, and function at higher levels than ever before. This will only be achieved by optimizing physical, cognitive, emotional, social, and psychological strength—and with standardized metrics to monitor changes. The necessity for metrics to directly assess human performance and the variables that modify and/or mediate performance is clear. Importantly, the HPO metric meeting took the concept of HPO and the Human Dimension beyond the classic domains/dimensions by including 2 significant barriers to HPO—sleep and pain.
Well-defined health-related (aerobic, muscular strength and endurance, flexibility, body composition) and skill-related (speed, agility, power, coordination, and balance) metrics are available for most components of physical fitness and performance. Both gold standard laboratory tests and field-expedient tests are available for most fitness components and many are well supported by the scientific literature. Table 1 (health related) and Table 2 (skill related) identify the tests that were prioritized along with their respective rated attributes. Considering all aspects of validity and reliability, usability and relevance, and expedience, aerobic fitness might be selected as the fitness component most important to assess because of its broad applicability within military environments. In fact, along with assessing muscular endurance, all the services’ regular physical fitness tests include an assessment of aerobic fitness by means of surrogate running tests. Along with an adjunct assessment of body composition, the services’ physical fitness tests measure 3 of 5 components of health-related fitness. Muscular strength and flexibility and all skill-related components of fitness are not routinely assessed, as they are not part of the services’ regular physical fitness tests. Table 3 provides a list of tests that are functional in nature and include task simulations, obstacle courses, Olympic lifts, and the Y-balance. In an effort to provide more meaningful information on combat and occupational fitness, most services are exploring the feasibility of implementing obstacle course or task simulations into their fitness assessment inventories. For example, the Marine Corps Combat Fitness Test, which was implemented in 2009 (9), includes a timed sprint of 880 yards, lifting a 30-lb ammunition can overhead from shoulder height repeatedly for 2 minutes, and a 300 m
TABLE 1. Health-related components of fitness.* Metric
Validity Reliability Application
Skinfolds Circumference
High High
L, C F
R U, R
High Low
GS Expedient
Resource intensive Motivation, familiarity
GS Safety
Equipment needed Precision
Expedient Functional Expedient
Tactical relevance Equipment needed Tactical relevance
GS High Medium Medium
L, F, C L, F, C
Medium Medium Medium Medium Medium Medium
U, L U, L U, L
Low Low Low
L, C F
R, C U, L
Low Low
Medium Medium
F, C
L, C
GS High Medium Medium
L, C L, C
R, C R, C
High Medium
GS Expedient
High Medium Medium Medium
L, C F
R, C U, L
Medium Low
Expedient Expedient
GS Medium
High High
U, L, C, R Medium U, L, C, R Medium
GS Expedient, lower back, hamstring specific Upper-body specific
Equipment needed, training Equipment needed, lower back, hamstring specific Upper-body specific Equipment needed, expense Equipment needed, reliability under testing conditions Equipment needed, training Estimate
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; GS = gold standard; ROM = range of motion; DXA = dual-energy x-ray absorptiometry; BIA = bioelectrical impedance analysis.
Shoulder mobility Body composition (58,111,113,121,132,133,142,166) DXA BIA
GS Medium
Aerobic fitness (121,158) Max V̇O2 test Time trial test Muscular strength (53,57,77,121,141,153) 1RM 1RM prediction Muscular endurance (77,121,138) Push-ups Pull-ups Sit-ups Flexibility (7,37,121,131,139,165) ROM Sit-and-reach
Intended user
Intended user
U, L U, L
Low Low
Field expedient Field expedient
L F, C F, C
R U, L, C U, L, C
High Low Low
GS Field expedient, leg symmetry Field expedient
Equipment needed, price NA Equipment needed
High Low
Accuracy of power measurement Field expedient
Equipment needed Provides estimate of power output Upper-body specific Equipment needed
Validity Reliability Application
Agility/coordination (77,121) T-test 300-m shuttle run Balance (73,98,121,140,143,150) Force plate Single leg stance Beam test Muscular power (1,8,35,77,112,121,161) Bench throws/jump squats Jump test
High Medium
High High
GS High Medium Medium Medium Medium GS High
High High
R U, L
High GS
High High
U, L R
300-m shuttle run
U, L
400-m run
U, L
U, L
Medicine ball throw Wingate test
Speed/reaction time (77,121) Sprint/dash
Low Field expedient; upper-body specific Medium Provides insight into anaerobic energy systems Low Field expedient and taxes anaerobic energy systems Low Field expedient and taxes anaerobic energy system Low
NA Track required Timing precision
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; GS = gold standard.
TABLE 2. Skill-related components of fitness.*
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; GS = gold standard.
Face Validity Measures total body strength Expedient, may provide injury risk insight F F F, C, L High Medium High Medium Medium High Obstacle course Olympic lifts Y-balance
maneuver under fire event. This comprehensive test appears to assess the components of muscular strength, muscular endurance, aerobic fitness, agility, balance, speed, and coordination.
Equipment needed, training, practicality Equipment needed, familiarity Injury risk for novice lifters Equipment needed Face Validity
Medium/ high U, L Medium U, L Medium U, L, R, C Medium U, L High Tests (9,84,118,121,123,129,135,175) Task simulation
Validity Reliability Application Metric
TABLE 3. Functional test metrics of military fitness.*
Intended user
Gaps. The fact that recurring physical fitness tests for military personnel fail to assess the majority of physical fitness components remains a major gap. For example, the National Strength and Conditioning Association’s second Blue Ribbon Panel of Military Physical Readiness: Military Physical Performance Testing rated muscular strength and power as the most critical fitness components required to successfully accomplish common military tasks (121), yet current military physical fitness testing only superficially, if at all, assesses those fitness components. Historically, aerobic fitness has been viewed as synonymous with overall military fitness, which led to an overemphasis on running during physical training programs, and thus an excess of overuse musculoskeletal injuries (124). For dismounted Warriors, consideration should also be given to assessing physical capacity under loaded conditions (i.e., load carriage). Other gaps include a lack of (a) criterion-based objective tests for occupational, functional, and combat-centric tasks and duties; and (b) clear interpretation, intervention, and application to monitor physical fitness profiles across the Warrior lifecycle (122). Positive New Directions. The concept of different tiers of testing was discussed. For example, tiered criterion approaches could be health-based or task-based tests that assess basic health risk (health criterion based) and functional fitness/occupational capabilities required of Warriors (task criterion based). Physical fitness outcome data in the form of spider graphs plotting individual strengths and weaknesses could be provided to the end user and compared with the unit or military occupational specialty (MOS) averages for actionable feedback and to inform future training programs. Incorporation of technology (laser timing gates, physiological sensing and monitoring applications, etc.) could provide greater applicability and ease of measurement.
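As an illustration of the comparison of individual results with unit or MOS averages described above, the following is a minimal sketch of the data that would sit behind such a spider graph. The component names, the 0 to 100 scoring scale, the function names, and all scores are hypothetical and are not drawn from the workshop findings.

```python
from statistics import mean

# Hypothetical component list and 0-100 scores, for illustration only.
COMPONENTS = ["aerobic", "strength", "endurance", "flexibility", "power", "agility"]

def unit_average(unit_scores):
    """Mean score per fitness component across all members of a unit."""
    return {c: mean(member[c] for member in unit_scores) for c in COMPONENTS}

def profile_gaps(individual, unit_avg):
    """Individual score minus the unit average for each component."""
    return {c: individual[c] - unit_avg[c] for c in COMPONENTS}

def weakest_components(gaps, n=2):
    """The n components where the individual trails the unit average most."""
    return sorted(gaps, key=gaps.get)[:n]

unit = [
    {"aerobic": 80, "strength": 60, "endurance": 70, "flexibility": 50, "power": 65, "agility": 75},
    {"aerobic": 70, "strength": 80, "endurance": 60, "flexibility": 60, "power": 75, "agility": 65},
    {"aerobic": 90, "strength": 70, "endurance": 80, "flexibility": 40, "power": 70, "agility": 70},
]
individual = unit[0]
avg = unit_average(unit)
gaps = profile_gaps(individual, avg)
priorities = weakest_components(gaps)
```

The per-component values in `avg` and `individual` are what each axis of a spider graph would plot, and `priorities` is one possible way to turn the profile into actionable feedback for a training program.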
Other emerging holistic approaches include the U.S. Army Special Operations Command’s (USASOC) Tactical Human Optimization, Rapid Rehabilitation and Reconditioning Program (THOR3) and the Naval Special Warfare Tactical Athlete Program, both of which incorporate a team consisting of physical therapists, strength and conditioning coaches, and dieticians to establish physical performance metrics with local units for use in monitoring training effectiveness. Recommendations and Issues. Physical Domain metrics are well defined in the literature and will facilitate completion of matrices associated with health-related and skill-related fitness components. Additional research on tiered fitness tests should be tailored across the Warrior lifecycle (accession, basic combat training, MOS, and special operations forces) with clearly designated testing intervals. Such a tiered testing paradigm would provide greater fidelity and utility for comprehensive fitness assessments that would include health, functional, and occupational aspects. More innovative approaches, greater use of technology, a departure from “field expediency” as the major criterion for determining physical fitness testing policies, and efforts to operationalize and leverage the latest breakthroughs in fitness assessment could prove to be truly transformative. The capacity of exercise testing expertise and subject matter knowledge within the military personnel system needs to be enhanced: the existing capabilities are insufficient to meet the needs. Best practices from USASOC’s THOR3 program could be replicated throughout the DoD in terms of establishing, implementing, and monitoring physical performance metrics to facilitate HPO.
Nutritional Status
The group concurred that metrics selected for “toolkits” should provide actionable guidance, and recommended that 2 time-dependent dimensions should be considered when selecting nutritional metrics to assess performance: Event/Mission/Training Readiness and Lifecycle Readiness. When assessed before, during, and after specific activities lasting hours (physical training, road march, combat patrol, etc.) or over the course of a multiday mission or training exercise lasting days to weeks, the Event/Mission/Training dimension metrics would provide signals indicating a need for performance optimizing countermeasures. In contrast, the Lifecycle dimension metrics would provide signals regarding chronic needs for performance and health optimizing countermeasures when assessed during a regularly recurring nutritional fitness assessment. Metrics for Event/Mission/Training Readiness and Performance. Changes in body weight, if properly assessed, can provide a valid, reliable, low-cost, low-burden, feasible means by which actionable information about hydration status and total energy balance could be obtained (Table 4). The limitation to this metric is that the context and timing must be clearly known and understood for accurate interpretation. For instance, body weight changes occurring over several hours of an event, or from one day to the next, are most reflective of differences in hydration, whereas body weight changes over several days to weeks can reflect adequacy of energy intake relative to requirements. Although dietary intake records could provide individual actionable guidance for nutrition countermeasures during the event/mission/training, the value would be limited by the burden of completing and reviewing/rating them. Metrics for Lifecycle Readiness and Performance. All Warriors should undergo regular assessments of nutritional fitness (Table 5). At a minimum, this would include recording
regular (monthly to annual) body weight and body composition assessments in a permanent record for following the Warrior through his/her career. Some form of individual dietary recording at recurring intervals should also be implemented to enable periodic professional review and evaluation of the quality of the Warrior’s diet. In addition to the above, regularly recurring (e.g., annually) evaluations of nutritionally relevant clinical metrics (e.g., blood markers, bone mineral density, and metabolic rate) would be part of a nutritional fitness assessment for selected Warriors identified as potentially “at risk” for various health conditions (e.g., obesity, metabolic syndrome, diabetes) as defined by established Clinical Practice Guidelines. Finally, an assessment of Warrior and Leader attitudes and knowledge about foods and supplements was identified as being a useful metric to guide training programs and gauge leader knowledge for identifying informational gaps. Gaps. To date, no validated questionnaire to identify attitudes and knowledge about foods and supplements is available. In addition, acquiring dietary pattern information is cumbersome; new technologies to make such information valid and less burdensome are needed. Environmental constraints also influence healthy eating. Although the military nutritional environmental assessment tool (m-NEAT) might provide some limited individual guidance during event/mission/training, the overall burden of implementing m-NEAT for this time dimension was large. However, Go for Green can provide more individual guidance for Warriors in selecting performance-enhancing foods. Once operational, Go for Green will be implemented throughout the DoD. Positive New Directions. Various new possibilities are under development or envisioned. First, a new technology for realtime, noninvasive measurement of individual hydration status is currently under development. 
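The timing-dependent interpretation of body weight changes described under Metrics for Event/Mission/Training Readiness and Performance could be sketched as follows. This is a minimal illustration only: the 48-hour cutoff separating "hours to one day to the next" from "several days to weeks" is an assumption made for the example, not a threshold stated by the group, and the function names are hypothetical.

```python
# Assumed cutoff (not from the article): intervals of up to 48 h are treated
# as short-term; longer intervals are treated as multiday.
SHORT_INTERVAL_HOURS = 48

def percent_weight_change(before_kg, after_kg):
    """Signed body weight change as a percentage of the starting weight."""
    if before_kg <= 0:
        raise ValueError("before_kg must be positive")
    return (after_kg - before_kg) / before_kg * 100.0

def likely_signal(elapsed_hours):
    """Which signal a weight change most plausibly reflects, given timing."""
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must be non-negative")
    if elapsed_hours <= SHORT_INTERVAL_HOURS:
        return "hydration"
    return "energy balance"

# Hypothetical example: weight measured before and after a 6-hour road march.
change = percent_weight_change(80.0, 78.4)
signal = likely_signal(6)
```

The sketch separates the size of the change from its interpretation, which reflects the point made in the text that the same measurement is only actionable when its context and timing are known.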
Also, emerging technology with ultrasound may in the future be able to quantify muscle glycogen levels with accurate, reliable, and noninvasive methods (81). Next, and as noted above, new approaches and technologies for individual dietary monitoring and analytics for dietary quality evaluation and feedback are critically needed for Warriors. Likewise, tools and surveys to assess Warrior and Leader nutrition knowledge and attitudes are needed to guide behavior and education. Recommendations and Issues. Despite the glamour of adopting terminology like “Warrior-athlete,” Warriors, leaders, and health care professionals often have unrealistic expectations regarding the real performance impact of acute nutritional interventions and supplements. If Warriors would generally consume a healthy, energy- and nutrient-adequate diet overall, the physical and cognitive tasks they perform during their missions should be relatively insensitive to acute dietary
Journal of Strength and Conditioning Research
Copyright © National Strength and Conditioning Association Unauthorized reproduction of this article is prohibited.
TABLE 4. Nutrition metrics.* Metric
Energy balance (23,81,87,104) Food/activity records Doubly labeled water
Medium/low Medium/low
Intended user
C, L
U, R, C
Ability to capture quantitative information High Values represent “free-living” conditions Low Quick measurement Medium Tracks all activity and sleep
F, C, L
Medium Medium
High Medium
F, C F
U, C, L R
C, L
C, R
F, C
Urine color
U, L, C
Instant results
No equipment required
Body weight Accelerometer Hydration status (3) Plasma osmolality
Thirst perception
High technical expertise required Reflect volume of fluid consumed rather than total body water May be altered by certain foods/vitamins Altered during prolonged activity/exercise
L, C
R, C
L, C
R, C
Regional body composition changes Medium Speed of measurement
Skinfold Circumference
Medium Medium
Medium Medium
L, C F
R, C U, L
Medium Portability Low Can be implemented anywhere
R, L
Assess environmental factors and policies at the community level Time commitment to complete assessment
R, L
Labels all foods and beverages on nutritional score and sodium content Limited oversight/regulatory authority
Access to healthy food Military Nutritional Environmental Assessment Tool (mNEAT) Go for Green
Radiation exposure, skilled technician required Influenced by hydration status/limitation with overweight/obese No measure of visceral fat Not sensitive to measure changes in lean mass vs. fat mass
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; GS = gold standard; USG = urine specific gravity; DXA = dual-energy x-ray absorptiometry; BIA = bioelectrical impedance analysis.
Body composition (87,111,113) DXA
Medium/low Medium/low
Medium In laboratory setting represents most precise assessment method Low Provides quantitative value
Time needed to analyze results Need for accurate protocol Limited generalizability
Recall bias
Metric Blood values (85,162) Cholesterol and lipid profile Vitamin D status Body composition (87,111,113) DXA
Intended user
C, L
U, R, C
C, L
U, R, C
L, C
R, C
L, C
R, C
L, C
R, C
U, L
C, L F
U, R, C R, L
Low Low
Estimate macro/micronutrient intake Recall bias Assess environmental factors and policies at the community level Time commitment to complete assessment
R, L
Labels all foods and beverages based on nutritional scoring algorithm; labels all foods and beverages based on sodium content Limited oversight/regulatory authority
BIA Skinfold Circumference Diet quality (104) Food records Military Nutritional Environmental Assessment Tool (m-NEAT) Go for Green
Medium/low Medium/low Unknown Unknown Unknown
Advantages Risk for heart disease
Medium Sun exposure and dietary intake
Disadvantages Large genetic component unaccounted for Different results depending on method used to
Regional body composition changes Radiation exposure, skilled technician required Medium Speed of measurement Influenced by hydration status/limitation with overweight/obese Medium Portability No measure of visceral fat Low Can be implemented anywhere Not sensitive to measure changes in lean mass vs. fat mass
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; GS = gold standard; DXA = dual-energy x-ray absorptiometry; BIA = bioelectrical impedance analysis.
TABLE 5. Nutrition lifecycle readiness and performance.*
manipulations. Nutritional fitness is a continuous process for maintaining health and performance over a Warrior’s career. Overall, body weight and circumference measurements remain the most feasible metrics to accurately and quickly assess Event/Mission/Training Readiness and Performance and Lifecycle Readiness and Performance.
Psychological Status
The group identified 6 aspects of this extensive domain relevant to HPO: Personality, Mental Health, Meaning and Purpose, Social, Appetitive Behaviors, and Stress. The group excluded “cognitive” content because that information is addressed in the Cognitive Domain section (see below). The topics included in this section were determined to be particularly relevant to HPO. Each of these aspects within the Psychological domain could be a domain in and of itself. Metrics for an HPO “toolkit” relevant to each of the 6 psychological subareas were identified. The components within this domain can be assessed by self-report instruments that can be administered as paper-and-pencil questionnaires, as questionnaires through electronic devices, or as structured interviews. Each task takes from several minutes to several hours to administer. Therefore, a full list of instruments to use is provided below, and a shortened list is provided in italics if time is limited.
Personality: Connor-Davidson Resilience Scale; Dispositional Resilience Scale; Hogan; Life Orientation Test-Revised; Locus of Control; NEO Personality Inventory-Revised; Propensity to Trust.
Mental Health: Beck Anxiety Inventory; Beck Depression Inventory; Neurobehavioral Symptom Inventory; Profile of Mood States; Periodic Health Assessments; Pre- and Postdeployment Health Assessments; PTSD Checklist; Structured Clinical Interview for DSM Disorders.
Meaning and Purpose: Brief COPE; Columbia Suicide Severity Rating Scale; Satisfaction With Life Scale; Quality of Life Scale; Calling and Vocation Questionnaire.
Social: Group Environment Questionnaire; Leader-Member Exchange; Medical Outcomes Survey-Social Support Survey; Multifactor Leadership Questionnaire; Multi-Source Assessment and Feedback; UCLA Loneliness Scale.
Appetitive: Alcohol Use Disorders Identification Test-C (AUDIT-C); Caffeine Consumption; Cut, Annoyed, Guilty, Eye Opener (CAGE); Fagerstrom Test for Nicotine Dependence; Fagerstrom Tolerance Test; Minnesota Nicotine Withdrawal Scale. Stress: Daily Hassles Scale; Life Event Checklist; Perceived Stress Scale; Shirom-Melamed Burnout Measure. Gaps. Several gaps exist in psychological assessments relevant to HPO. Historically, psychological measures have focused on personality and on negative characteristics/traits/states
related to mental health problems. This is because these types of measures are often used as screening tools to identify who needs mental or behavioral health assistance/treatment. They also are used to track improvements in mental and behavioral health with treatment or over time. More positive characteristics/traits/states can also be evaluated, and several relevant metrics appear in Table 6. In fact, metrics that assess positive rather than negative characteristics/traits/states and the like are needed for a more complete psychological assessment. With regard to HPO, another important gap is the lack of clearly defined psychometric properties for the diverse groups within the military. Positive New Directions. Many measures within the Psychological domain have been the focus of ongoing military laboratory and field studies, and are relevant to military personnel. Importantly, the conceptualization of psychological factors has broadened over the years to include behaviors that powerfully affect physical health, in addition to factors relevant to mental health per se (e.g., depression, PTSD, drug abuse, violence). One positive direction reflects the progress to destigmatize psychological factors (e.g., use of the term “behavioral health” and integration of behavioral health with family practice and medicine), but much work remains. Another positive direction is the focus on measures of resilience, the expansion of life orientation (e.g., optimism), behaviors affected by stress (e.g., drug use and abuse), and multidisciplinary measurement of stress. Several noninvasive, endocrinological and physiological biomarkers of stress can be used in conjunction with psychological measures to verify and validate stress responses. These biomarkers are not listed in this section because they are not psychological measures per se. Recommendations and Issues. 
Meaningful metrics to evaluate aspects of the Psychological domain should include more than one measure and, where possible, behavioral or biological measures in addition to self-report data. A solid, evidence-based “toolkit” with subcomponents for the Psychological domain would be valuable for researchers, clinicians, and Warriors. In addition, it would be valuable to compare responses in this domain with Sleep and Cognitive Performance assessments. Psychological measures that have robust impact on individuals and groups over time are important, including metrics relevant to positive and negative aspects of relationships (e.g., intimacy, communication, aggression, violence) and sociological measures that affect individual well-being (e.g., group dynamics and cohesiveness). Many of these related topics and themes are increasingly being appreciated, but they need to be considered and integrated into a broader “toolkit” relevant to psychological aspects of HPO.
Cognitive Performance
A plethora of measures exist for assessing and predicting cognitive performance. Both behavioral and self-report
Metric Personality Connor-Davidson Resilience (CD-RISC) (36) Dispositional Resilience Scale (DRS) (177)
Hogan (68)
Life Orientation Test-Revised (LOT-R) (29) Locus of control (145) NEO Personality Inventory-Revised (NEO-PI-R) (40) Propensity to Trust (146) Mental health Beck Anxiety Inventory (BAI) (13) Beck Depression Inventory-II (BDI) (14) Neurobehavioral Symptom Inventory (NSI) (31) Profile of Mood States (POMS) (134) Periodic Health Assessments (PHA) (18) Pre- and Postdeployment Health Assessments (119) Post-traumatic checklist (PCL) (174) Structured Clinical Interview for DSM Disorders (SCID) (65) Meaning and purpose Brief COPE (28) Columbia Suicide Severity Rating Scale (C-SSRS) (136)
Intended user
F, C, L
U, R, C, L
F, C, L
F, C, L
U, R, C, L $37/y for academic use $4/use for standard use U, R, C, L $25–$375
F, C, L
U, R, C, L
F, C, L F, C, L
U, R, C, L U, R, C, L
Free $37–$70
Good Good to excellent Poor
F, C, L
U, R, C, L
F, C, L
Good to excellent Good
Good Good
Good Good
Good to excellent Poor to Poor to acceptable acceptable Poor to Poor to acceptable acceptable Good Acceptable to good Good to Good to excellent excellent Acceptable Good
Good Good
Short self report
3 dimensions; short self report
More hardiness than resilience
Several scales
Many questions; copyrighted N/A
Short self report; includes optimism Short self report Measures personality
Developed for children Many questions
Online self scored
Fair validity and reliability
U, R, C, L
Short self report
F, C, L
U, R, C, L
Short self report
Mostly physical sensations Based on DSM-IV
F, C, L
U, R, C, L
F, C, L
U, R, C, L
F, C, L
U, R, C, L
Short self report; validated with OIF/ OEF Vets Widely used; well known Self report
Limited psychometrics
F, C, L
U, R, C, L
Self report
Limited psychometrics
F, C, L
U, R, C, L
F, C, L
R, C
Short self report; widely used Comprehensive
Based on DSM-5; limited psychometrics Time-consuming; requires training
F, C, L F, C, L
U, R, C, L U, R, C, L
Free Free
Short self report Widely used
Limited psychometrics Normative data from civilians; no stigma measure
Does not differentiate types of mTBI Not recently updated
Human Performance Optimization Metrics
TABLE 6. Psychological status.*
Satisfaction with Life Scale Acceptable Acceptable to (SWLS) (51) to excellent excellent Quality-of-Life Scale (QOLS) (22) Good Good Calling and Vocation Good Acceptable to Questionnaire (CVQ) (52) good Social Group Environment Mixed Acceptable to Questionnaire (GEQ) (26) good Leader-Member Exchange 7 Mixed Good Scale (LMX-7) (152) Good Excellent Medical Outcomes Survey— Social Support Survey (MOSSSS) (155) Multifactor Leadership Poor Poor Questionnaire (MLQ) (10) Multi-Source Assessment and Mixed Mixed Feedback (MSAF) (86) UCLA Loneliness Scale (147)
U, R, C, L
Short self report
Limited range of values
F, C, L F, C, L
U, R, C, L U, R, C, L
Free Free
Short self report Short self report
N/A Limited psychometrics
F, C, L
U, R, C, L
Short self report
F, C, L
U, R, C, L
F, C, L
U, R, C, L
Focus on communication Short self report
Normative data from civilians Lack of specificity
F, C, L
U, R, C, L
Good to excellent
F, C, L
Widely used instrument U, R, C, L Free (limited to Web based military users) U, R, C, L Free Short self report
Good to excellent Acceptable to good Good to excellent Good
F, C, L
U, R, C, L
Short self report
Narrow range of values
F, C, L
U, R, C, L
Brief self report
Acceptable to good Acceptable to good Good to excellent Good
F, C, L
U, R, C, L
Short self report
Questionable for heavier consumers Narrow range of values
F, C, L
U, R, C, L
F, C, L
U, R, C, L
F, C, L
U, R, C, L
Widely used short self Limited to smoking report Widely used short self Limited to smoking report Widely used short self Requires multiple report responses
Good Acceptable
F, C, L F, C, L
U, R, C, L U, R, C, L
Free Free
Easy to understand Short self report
Acceptable to good Acceptable
F, C, L
U, R, C, L
Short self report
Not recently updated Based on DSM-5; limited psychometrics Limited psychometrics
F, C, L
U, R, C, L
Screening tool
Limited psychometrics
Good Good Acceptable to good Acceptable Poor
Army centric Old psychometric data
F, C, L
Poor psychometrics
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; DSM = Diagnostic and Statistical Manual of Mental Disorders; mTBI = mild traumatic brain injury.
Appetitive behaviors Alcohol Use Disorders Identification Test-C (AUDIT) (5) Caffeine Consumption Questionnaire (107) Cut, Annoyed, Guilty, Eye Opener (CAGE) (61) Fagerstrom Test for Nicotine Dependence (FTND) (79) Fagerstrom Tolerance Questionnaire (FTQ) (63) Minnesota Nicotine Withdrawal Scale (MNWS) (88) Stress Daily Hassles Scale (DHS) (44) Life Event Checklist (LEC-5) (173) Perceived Stress Scale (PSS) (33) Shirom-Melamed Burnout Measure (SMBM) (156)
F, C, L
measures can predict scholastic and occupational performance. In terms of optimizing human performance in military populations, both behavioral and self-report measures should be captured at baseline and across the Warrior’s lifecycle to assess preparedness and provide monitoring. These data can then be used to identify individuals and cohorts that may benefit from advanced training and selective interventions to bolster operational effectiveness at both the individual and unit level.

Cognitive neuroscience has highlighted the role and importance of various cognitive processes for optimal functioning. For example, attention is implicated in performance for most, if not all, cognitive processes. Attention is necessary for processing information associated with tasks involving neutral stimuli and those with emotionally salient stimuli, regardless of whether intrinsically or extrinsically perceived (i.e., imagined vs. actual). Complicated tasks, although relying on attention, incorporate higher level cognitive processes (working memory, mental flexibility, decision making, etc.) to regulate both behavioral performance in response to the environment (operating machinery, adapting to the environment, etc.) and the internal milieu (thoughts, mood, emotions, attitude, memory, etc.). Maintenance of this highly complex, yet concerted, neurological system through behavioral and psychological hygiene is necessary for promoting optimal individual and group performance. Some of the commonly used cognitive tests with estimated psychometric properties (validity and reliability, testing duration) and utility (administrator training level, setting, population) were identified and critiqued. In addition, a substantial number of self-report cognitive instruments were identified as having value for assessing more esoteric cognitive constructs such as metacognition or creativity.
Overall, dozens of tests for the various cognitive domains are available. However, for many of the cognitive tests, pros and cons associated with psychometrics, utility, applicability, cost, and so on, were identified. In addition, gaps for applying such tests to HPO were noted. Computerized tests provide millisecond accuracy and are convenient for recording, transferring, and analyzing data. Although many individual computer-administered tests used for cognitive and clinical neuroscience studies do not have normative data, some do, such as the CANTAB (Cambridge Cognition), Automated Neuropsychological Assessment Metrics, Defense Automated Neurobehavioral Assessment, and CNS Vital Signs. Some of the available tests are normed and validated, but on nonmilitary populations. These batteries are commercially available and provide varying metrics of Attention, Memory, Concentration, Reaction Time, Executive Function, Decision Making, Social Cognition, and Induction as well as data for demographics, injury descriptives, mood, and screening for psychopathology. Apart from specific commercial batteries, a myriad of individual cognitive tasks are available. This report provides
only a few of those commonly used, each with some level of support in the literature. Many of the tasks are computerized, which makes them ideal for pairing with electrophysiological and psychophysiological measures such as eye-tracking, electrodermal skin conductance response, electroencephalography, and brain imaging to provide additional information for monitoring performance. Other tasks are standardized for pencil-and-paper administration and proprietary (Hopkins Verbal Learning Task). However, many others are not, and can be computerized and experimentally modified with appropriate citations to provide optimal sensitivity to performance and delineate the relevant neural mechanisms and cognitive processes. The approximate administration time for each task varies from 5 to 90 minutes (Table 7).

Attention: Simple Reaction Time (SRT); Odd Ball Task; Stroop; Psychomotor Vigilance Task (PVT); Attention Network Task (ANT); Procedural Reaction Time (PRT).
Short-Term/Working Memory: n-back; Digit Span; Hopkins Verbal Learning Task (HVLT); Sternberg Test; Rey-Osterrieth; Visual Spatial Working Memory tasks; Groton Maze-Learning Task.
Generalized Intelligence: Armed Services Vocational Aptitude Battery (ASVAB); Wechsler; Stanford-Binet; Wonderlic Cognitive Abilities Test (WCAT); Kaufman (KBIT); Cognitive Assessment System (CAS); Comprehensive Test of Non-Verbal Intelligence (CTONI).
Reasoning/Judgment/Decision Making/Problem Solving: Raven’s Matrices; Iowa Gambling Task (IGT); Wisconsin Card Sorting Task (WCST); Balloon Analogue Risk Task (BART).
Adaptability/Creativity: Remote Associates Test; Torrance Test of Creativity; Task-Change Paradigms; WCST; Situational Judgment Tasks; Attentional Shift Tasks.
Emotional Biases/Emotional Intelligence/Social Intelligence: Mind in the Eyes Test; Goleman; Affective Dot-Probe; Scenario Based Empathy Tasks; Minnesota Multiphasic Personality Inventory (MMPI); Conditional Reasoning Test; Implicit Association Test; Emotional Competence Inventory; Affective Priming Task.
Reflection/Insight/Metacognition: Beck Cognitive Insight Scale (BCIS); Mindful Attention Awareness Scale (MAAS); Gudjonsson Suggestibility Scale (GSS).

Gaps. One significant gap is the lack of psychometric properties for many of the tests, particularly over a wide range of skills, backgrounds, and ethnicities. Additionally, standardized/normative data for the military population are limited. Importantly, the amount of pretraining (number of practice sessions, days, weeks) required for detecting true differences over time is not clearly defined. Finally, cognitive tests available on multiple platforms (e.g., smartphone, laptop, iPad) and suitable for one-time vs. serial testing are needed, as the time required for practice testing remains a barrier to progress and specificity.
TABLE 7. Cognitive attention/attentional control metrics.* Metric
Intended user
F, C, L
R, C
Highly sensitive
Nonspecific, boredom
Widely used
F, C, L
R, C
C, P
Motivation, familiarity, lots of trials Language based
Low to moderate Moderate
Stroop (74,75,164)
Low to high
High High High
Moderate Moderate Moderate
F, L F, L F, L
R, C R, C R
Low Low Low
Variants, commonly used Face validity Highly sensitive Multicomponent
High High
Moderate Moderate to high High
F, L F, C, L
R, C R, C
C C, P
Low Low
Many versions Software needed Easy to administer Poor sensitivity
F, L, C
R, C
C, P
Prone to subjectivity Well-established, by evaluator uniquely captures cognitive domain Well-established, Paucity of easy to interpret computerized versions Military populations Repeatable
Go/No-Go (71,172) Vigilance Task (172) Attention Network Task (91) Short-term/Working Memory N-Back (92,93) Digit Span (56,148,171) Hopkins Verbal Learning (126,154) Rey-Osterrieth Complex Figure (43,168)
Groton Maze Learning (34,41,114) Reasoning/Mental Flexibility/Decision Making Raven’s Matrices (137,169)
Moderate to high
F, L, C
R, C
C, P
F, L
Moderate to high
F, L, C
R, C
Moderate to high
F, L, C
R, C
F, L
R, C
F, C
R, C
Independence Outdated research from language skills Easy to administer Typically used for clinical populations Well established Hard to interpret
Iowa Gambling Task (20,55) Wisconsin Card Sorting Task (WCST) (163)
Visual-Spatial WM (60,170)
Not standardized Nonspecific Boredom, lots of trials
Attention/Attentional Control Simple Reaction Time (54,99,172) Odd Ball Task (42,97)
Moderate to high Moderate to high Moderate
F, C, L
R, C
C, P
R, C
C, P
Measures various functions Brief
L, C
R, C
Easy to administer Language based
Low to high
U, R, C
Longitudinal data
F, L
Low Moderate
C F, L
R, C R
C C, P
Medium Low
F, L
Well established Can be modified for military populations Multiple versions
F, L
U, R, C, L
C, P
F, L
U, R, C, L
C, P
Easy to administer No standardized application Easy to administer No standardized application Multiple versions Arbitrary scoring
Moderate Low to Moderate Low to Moderate Moderate
Moderate Moderate
L, C L
R, C R, C
C, P C
Medium Low
Well established Boredom Easy to administer Applicability unclear
Multiple versions
L, C
R, C
Easy to administer Unclear how to use
Moderate Moderate Moderate
Low Moderate High
L L F, C
U, R, L U, R, L U, R, L
C, P C, P C, P
Low High Low
Easy to administer Unclear how to use Easy to administer Demand effects Easy to administer Demand effects
Adaptability/Creativity Remote Associates Test Low to Moderate Torrance Test of Low to High Creativity (103) Task-Change Paradigm High (78,105,110) WCST (163) Moderate Situational Judgment High Tasks (4,24,130)
Attention Shift Tasks (78,105) Emotional Bias/Emotional Intelligence/Social Intelligence Eyes Test of Emotional Intelligence (157) Goleman EI (120) Affective Dot-Probe (49,64) MMPI (167) Conditional Reasoning (16) Implicit Association (59,72) Emotional Competency (62,89) Reflection/Introspection/ Metacognition Metacognition (128) Beck Insight (144) Mindfulness (17,108)
Typically used for clinical populations Mostly for 5–17-year-olds Developed for patients with language and motor impairments
Lack of agreed scoring Lack of standardization Hard to interpret Unlimited scenarios needed; arbitrary scoring Arbitrary scoring
Lots of trials
Balloon Analogue Risk Moderate Task (90,172,176) Cognitive Assessment High System (178) Moderate to Comprehensive Test High Nonverbal Intelligence (96)
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; platform: C = computer, P = pencil/paper; response: P = performance, Q = questionnaire; GS = gold standard; specialized = requires specialized knowledge, tools, and/or methods to implement; WCST = Wisconsin Card Sorting Task.
Hard to interpret Lots of trials P P Low Low C C C R F F Low High Moderate High WCST (163) Shifting Attention (78,105)
Perceptual Learning/ Pattern Recognition Raven’s Matrices (137,169)
Moderate to high
R, C
Independence from language skills Well established Multiple versions
Outdated research
Positive New Directions. Cognitive metrics reflect brain functioning, which can be measured over time to detect changes associated with a myriad of intrinsic and extrinsic factors, as well as various training, prophylactic, and treatment interventions.

Recommendations and Issues. A neurocognitive toolkit for assessing performance should be established. The toolkit should provide the following: a description of battery requirements (platform, software, etc.); options for screening vs. serial assessment; multiple tests to choose from for each cognitive domain; established psychometrics; a user-friendly approach (i.e., minimal training requirement for the administrator); clear and concise instructions to the testee; the ability to integrate main confounding variables (sleepiness; exogenous distracters in the immediate environment; stress; etc.); and valuable and pleasing output of performance (across time; delta scored; etc.).

Cognitive measures are replete with limitations in their ability to predict real-world performance or capacity for performance. This highlights the need to identify both the generalizability of various cognitive processes and the ecological validity of the tests. Hence, the concomitant use of performance-based cognitive tests, self-report cognitive assessments, and measures of state (e.g., sleepiness) and trait characteristics (neuroticism, sensation seeking, etc.), many of which are delineated in the different sections of this paper, might improve the predictive and ecological validity of some instruments. The interplay between emotions and cognitive performance is well known. For example, both low and high levels of anxiety can impede cognitive/behavioral performance. Depression and sleep deprivation also mediate cognitive/behavioral performance, as do many other intrinsic and extrinsic factors. Therefore, accurate assessment of these factors should be a priority.
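As an illustration of the "delta scored" output and of the earlier point about detecting true differences over time, the sketch below computes a change from baseline and a reliable change index (RCI). The RCI is one common psychometric approach, not a method named in this paper, and the function names, the reaction-time values, the baseline standard deviation, and the test-retest reliability are all hypothetical.

```python
import math


def delta_score(baseline: float, follow_up: float) -> float:
    """Raw change from baseline (follow-up minus baseline)."""
    return follow_up - baseline


def reliable_change_index(baseline: float, follow_up: float,
                          sd_baseline: float, test_retest_r: float) -> float:
    """Change from baseline scaled by the standard error of the difference.

    Assumes the classical formulation: SEM = SD * sqrt(1 - r) and
    S_diff = sqrt(2) * SEM. |RCI| > 1.96 is conventionally read as a
    change unlikely to be due to measurement error alone.
    """
    if sd_baseline <= 0:
        raise ValueError("sd_baseline must be positive")
    if not 0 <= test_retest_r < 1:
        raise ValueError("test_retest_r must be in [0, 1)")
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)
    s_diff = math.sqrt(2.0) * sem
    return delta_score(baseline, follow_up) / s_diff


# Hypothetical example: mean reaction time (ms) at baseline and after a mission.
rci = reliable_change_index(baseline=300.0, follow_up=345.0,
                            sd_baseline=40.0, test_retest_r=0.8)
print(f"delta = {delta_score(300.0, 345.0):.0f} ms, RCI = {rci:.2f}")
```

In this hypothetical case the 45 ms slowing yields an RCI below 1.96, so it would not be flagged as a reliable change; a test with higher test-retest reliability would flag the same raw delta more readily, which is why established psychometrics matter for serial assessment.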
Challenge paradigms are an underutilized tool for accurately assessing Warrior cognitive performance. For example, a Warrior with compromised cognitive or attentional control and high trait anxiety might show decremented performance under stress challenge paradigms (e.g., public speaking, firing range), whereas a comrade with high trait anxiety, but strong attentional control, would likely perform at a much higher level. A “cognitive toolkit” would be best used under a range of challenge conditions, both experimentally and in conjunction with combat operations.

Environmental Challenges
Environmental extremes of heat, cold, and altitude can be a serious threat to HPO. Metrics for environmental challenges can be broken down into 2 categories: monitoring
metrics and preparedness metrics, both at the individual and unit levels.

Individual and Unit Monitoring. Metrics for monitoring individuals include various measures of body core temperature and oxygen saturation (Table 8), although for applications outside of the clinic or laboratory, only rectal temperature is currently recommended, and then only when ruling heat stroke in or out (48). When monitoring the unit, military physical profiles and injury rates, particularly of heat, cold, and/or altitude illness, can be used for risk mitigation. Heat illness can be further mitigated by measuring and classifying heat stress flag categories according to the wet bulb globe temperature index (48). Cold stress injury
can be alleviated by measuring both air temperature and wind speed to calculate wind chill ( mil/weather/windchill.pdf ), categorizing frostbite risk, and subsequently applying appropriate mitigation strategies (45). The risk of acute mountain sickness can be similarly mitigated by adherence to recommended altitude ascent and staging strategies (46).

Individual and Unit Preparedness. Metrics identified for evaluating individual preparedness to perform optimally in hot, cold, and high-altitude environments are exclusively “screening” tools. The top metrics include: (a) history of previous intolerance to a particular environment; (b) acclimatization status (i.e., recent residence in hot or
TABLE 8. Environmental metrics.* Metric
Validity Reliability Application
Monitoring metrics (individual) (46,48,151) Esophageal temperature Rectal temperature
High High
Intended user
C, L F, C, L
R, C R, C
Intestinal temperature
Medium Medium
F, C, L
R, C
Blood oxygen saturation Monitoring metrics (unit) (27,48,151) Profile rates
Medium Medium
F, C, L
R, C
Medium Rapid response Impractical Medium Field diagnostic Heat illness for heat stroke assessment only High Noninvasive Frequent redosing (pills) Medium Simple to use Impractical
Injury/illness rates
F, L
R, L
Link to doctrine (e.g., work/ rest)
High High
High Medium
F, C, L F, C, L
R, C, L R, C, L
Low Low
Simplicity Simplicity
Recent illness Medium Medium Preparedness metrics (unit) (30,159) SWET (MPT) High High ARMS (MPT) High High
F, C, L
R, C, L
Record keeping Adaptation/ decay unknown Associative
Low Low
Simplicity Simplicity
Digital interface Digital interface
Preparedness metrics (individual) (27,46,48,151),† History of intolerance Acclimatization
Record keeping; associative Record keeping; associative Potentially impractical
*GS = gold standard; WBGT = wet bulb globe temperature index; ARMS = altitude readiness management system; SWET = soldier water estimation tool; MPT = mission planning tool (part of Nett Warrior platform). †Other individual environmental preparedness metrics include body mass index, low fitness, hydration, sleep, energy balance, and personality (46,48,151)—all of which are covered as metrics in complementary domains.
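The two unit-level calculations mentioned under Individual and Unit Monitoring (the wet bulb globe temperature index for heat stress flags, and wind chill from air temperature and wind speed) can be sketched as follows. The paper does not give the formulas or thresholds; this sketch assumes the standard outdoor WBGT weighting, the 2001 U.S. National Weather Service wind chill formula, and commonly cited U.S. Army heat category cutoffs, all of which should be checked against current doctrine (45,48) before any real use. Function and constant names are hypothetical.

```python
def wbgt_outdoor_c(natural_wet_bulb_c: float, globe_c: float, dry_bulb_c: float) -> float:
    """Outdoor WBGT (deg C): 0.7 * natural wet bulb + 0.2 * black globe + 0.1 * dry bulb."""
    return 0.7 * natural_wet_bulb_c + 0.2 * globe_c + 0.1 * dry_bulb_c


def c_to_f(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


# ASSUMPTION: lower WBGT bounds (deg F) for heat categories 5..1 and their flag colors.
HEAT_CATEGORIES = [
    (90.0, 5, "black"),
    (88.0, 4, "red"),
    (85.0, 3, "yellow"),
    (82.0, 2, "green"),
    (78.0, 1, "white"),
]


def heat_category(wbgt_f: float) -> tuple[int, str]:
    """Return (category, flag color); (0, 'none') below the lowest threshold."""
    for lower_bound, category, flag in HEAT_CATEGORIES:
        if wbgt_f >= lower_bound:
            return category, flag
    return 0, "none"


def wind_chill_f(air_temp_f: float, wind_mph: float) -> float:
    """Wind chill (deg F) using the 2001 NWS formula.

    The formula is defined for air temperatures at or below 50 deg F
    and wind speeds above 3 mph.
    """
    if air_temp_f > 50.0 or wind_mph <= 3.0:
        raise ValueError("formula is defined for T <= 50 F and wind > 3 mph")
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * air_temp_f - 35.75 * v + 0.4275 * air_temp_f * v


# Hypothetical readings.
wbgt_c = wbgt_outdoor_c(natural_wet_bulb_c=25.0, globe_c=40.0, dry_bulb_c=30.0)
print(round(wbgt_c, 1), heat_category(c_to_f(wbgt_c)))
print(round(wind_chill_f(0.0, 15.0), 1))
```

Keeping the thresholds in a single table makes it straightforward to swap in whatever category boundaries the governing guidance specifies.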
high-altitude environments; northern or southern home of origin); and (c) recent illness (“multiple hit hypothesis”) (27,46,48,151). Several other metrics related to environmental preparedness (151) were also identified but are covered more fully in other domains (e.g., physical fitness, hydration status, body mass index). Proposed environmental metrics are listed in Table 8. Unit preparedness in extreme environments can be improved by the use of 2 mission planning tools recently made available (April 15, 2015) on the Nett Warrior platform (http:// The Soldier Water Estimation Tool (30) is a mobile application that affords simple, accurate estimation of unit water needs for a range of environments, clothing configurations, mission durations, and activities to reduce the likelihood of dehydration contributing to environmental illness and performance impairments (27,48). The Altitude Readiness Management System (159) is a mobile app that allows estimation of the prevalence and severity of acute mountain sickness as a function of time spent at any given altitude (46). Both decision aid software applications give unit leaders user-friendly options for optimizing human health and performance in various environmental conditions.

Gaps. Two significant gaps include (a) the need for real-time individual monitoring of physiological status and (b) biomarkers of individual preparedness. The former could involve noninvasive ways of measuring body core temperature or oxygen saturation, as well as the integration of physiological models of those inputs for decision making (21). The latter could include biomarkers of adaptation or susceptibility, or could include functional tests (e.g., heat tolerance testing, hypoxia sensitivity, cold tolerance sensitivity); however, these currently remain in development.

Recommendations and Issues.
Consideration must be given as to how individual or unit metrics will be integrated into the user community. A program manager with HPO expertise would be needed to help coordinate and identify the requirements writer(s), user advocate(s), and development pathway(s) for this collective effort.

Sleep
Sleep impacts all aspects of mental performance, most notably cognitive performance and emotional regulation. Sleep may indirectly impact physical performance via its direct impact on skill acquisition, motivation, and sensitivity to pain. Thus, optimizing sleep is essential for optimal human performance. Like any other biological function, sleep must be objectively measured for it to be appropriately managed and ultimately optimized. Subjective assessments of sleep amounts, such as sleep diaries, are time-consuming and therefore result in low compliance rates. Subjective assessments of one’s own performance capacity are not acceptable substitutes for monitoring sleep-wake since most
individuals overestimate their performance capacity, particularly under the conditions of chronic, insufficient sleep that characterize modern military operations. Under operational conditions, objective measurements must be reliable, useful, and nonobtrusive. For instance, wrist actigraphy is a mature technology that has been sufficiently validated and ruggedized to be ideally suited for measuring sleep in operational environments. Numerous commercially available devices exist, and most are wear-and-forget technologies. For the vast majority of individuals in nonclinical settings, the key sleep parameter that determines waking cognitive function is total recuperative sleep time (TrST) per 24 hours, and actigraphs measure that parameter very well. “Sleep quality” is not independent of TrST, because time spent awake (fragmented sleep), no matter how brief, reduces TrST, and actigraphs can measure this aspect of sleep as well.

Gaps. The primary critical gap to be addressed is whether the transformation of daily TrST into a generalized, operationally meaningful metric, such as cognitive effectiveness, requires further task-specific translation. The DoD developed and patented a biomathematical model that translates daily sleep-wake amounts into a general cognitive effectiveness estimate. Importantly, the Federal Aviation Administration considers this estimate to have sufficient validity to serve as a key component of fatigue risk management systems, which allows for alternative means of compliance with flight duty regulations. However, other operational communities have yet to accept biomathematical estimates as a valid index of operational readiness, and instead argue that such biomathematical model predictions must be validated for specific tasks/occupations/MOSs. In addition, currently no field-measurable biomarker exists that accurately indexes current level of sleep debt (nor is there a performance probe that is specific to sleep).
Such a metric would allow for rapid and immediate determination of current operational readiness and potentially obviate the need for continuous (24/7) wrist actigraphy.

Positive New Directions. Hardware-software systems currently exist that would allow for measurement and management of sleep to optimize performance. Efforts are underway to (a) further refine biomathematical model estimates so they are tailored to the individual (based on periodic short performance probes) and (b) model the impact of caffeine so that performance optimization guidance can be automated.

Recommendations and Issues. The main recommendation concerns the need to educate the DoD operational communities on the critical role of sleep in operational readiness (Table 9). The prevailing belief that sleep is not critical negatively impacts the operational community’s willingness to implement sleep measurement and management strategies and tactics. This education issue is currently the core of
Sleep measurement, objective and subjective PSG (standard sleep metrics: sleep onset latency, sleep staging, etc.) (106) Actigraphy (115) Sleep Diary (25)
Intended user
Gold Standard
Not fieldable, specialized
C, L
R, C
High (relative to PSG)
F, C, L
U, R, C
Low (relative to PSG)
F, C, L
U, R, C
C, L
R, C
C, L
R, C
C, L
R, C
F, L
U, R, L
F, L F, L
R R, L
Subjective Sleep Quality Scales— Medium (relative PSQI (70); PIRS (116); ISI (11) to PSG) Sleep-related functional impairment MSLT/MWT (daytime sleepiness) GS (109) Subjective Sleepiness Scales— Medium (relative ESS (94); KSS (95); SSS to MSLT/MWT) (80,83) Simple RT (6) High (relative to MSLT/MWT) Optalert, Perclos (2) Simulator Environments (e.g., driving (76); surgery (125); patient care (69))
Reliability Application
High High High to unknown Unknown
Low sleep onset accuracy, Medium/ Fieldable; high overestimation of TST, does not low specificity; user-record sleep stages friendly Subjective, low compliance Low Fieldable; high specificity; user-friendly Low Fieldable; medium Subjective, low compliance, specificity; user-proprietary friendly Gold Standard
Fieldable; cost-effective; user-friendly Medium Fieldable; cost-effective; user-friendly Medium Proprietary High Presumed “face-valid”
Not fieldable, specialized Lack of validation (?), unknown specificity, subjective, low compliance, proprietary Unknown specificity (presumed LOW) Unknown specificity, proprietary Unknown specificity, specialized
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; PSG = Polysomnography; GS = gold standard; PSQI = Pittsburgh Sleep Quality Index; PIRS = Pittsburgh Insomnia Rating Scale; ISI = Insomnia Severity Index; MSLT = Multiple Sleep Latency Test; MWT = Maintenance of Wakefulness Test; ESS = Epworth Sleepiness Scale; KSS = Karolinska Sleepiness Scale; SSS = Stanford Sleepiness Scale; RT = reaction time; specialized = requires specialized knowledge, tools, and/or methods to implement; TST = total sleep time.
TABLE 9. Sleep metrics.*
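To make the relationship between fragmented sleep and total sleep time per 24 hours concrete, the following sketch sums epoch-by-epoch sleep/wake scores of the kind an actigraph produces. It is only an illustration: the paper does not define how TrST is computed, so the sum of epochs scored as sleep is used as a simple stand-in, and the 60-second epoch length, the 1 = sleep / 0 = wake coding, the function names, and the example day are all assumptions.

```python
from typing import Sequence


def total_sleep_hours(epochs: Sequence[int], epoch_seconds: int = 60) -> float:
    """Hours scored as sleep in a record (1 = sleep, 0 = wake)."""
    sleep_epochs = sum(1 for e in epochs if e == 1)
    return sleep_epochs * epoch_seconds / 3600.0


def awakenings(epochs: Sequence[int]) -> int:
    """Count wake bouts that fall between the first and last sleep epoch."""
    sleep_idx = [i for i, e in enumerate(epochs) if e == 1]
    if not sleep_idx:
        return 0
    first, last = sleep_idx[0], sleep_idx[-1]
    count = 0
    previous = 1
    for e in epochs[first:last + 1]:
        if e == 0 and previous == 1:
            count += 1  # a new wake bout starts here
        previous = e
    return count


# Hypothetical 24-hour record in 1-minute epochs: 16 h awake, 200 min asleep,
# a 10-minute awakening, 220 min asleep, then 50 min awake.
day = [0] * 960 + [1] * 200 + [0] * 10 + [1] * 220 + [0] * 50

print(len(day), total_sleep_hours(day), awakenings(day))
```

In this example the sleep period spans 430 minutes, but the single 10-minute awakening leaves 7.0 hours scored as sleep, which is the sense in which time spent awake reduces the total.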
TABLE 10. Pain metrics.* Application
Intended user
In process
On study at present (Ongoing)
F, C, L
U, R, C
C, L
Connor-Davidson Resiliency Scale (177)
C, L
PROMIS: High PASTOR: Ongoing
C, L
F, C, L
Defense and Veterans Pain Rating Scale (DVPRS) (19) Brief Pain Inventory (32)
Multidimensional and Limited validity at functional focus present Penetration into VA/DoD R, C Medium Very well validated Time Logistics of administration R Medium Good validation Time Logistics of administration Unclear predictive utility Requires web R, C Free Patient reported interface Penetration Requires data Multidimensional storage Tracking over time Red Flag capability U, R, C, Free Most commonly used No functionality M scale Limited clinical utility
*Application: F = field, C = clinic, L = laboratory; intended user: U = user, R = researcher, C = clinician, L = leader; DoD = Department of Defense; GS = gold standard; PASTOR = Pain Assessment Screening Tool and Outcomes Registry; PROMIS = Patient Reported Outcomes Measurement Information System.
Numerical Rating Scale Visual Analog Scale (82)
the Army Surgeon General’s Performance Triad and other efforts within the DoD.

Pain
Pain is considered a barrier to human performance, as pain magnitude is likely to correlate directly with how limited an individual’s performance is (100). Currently, 50–92% of all helicopter aircrew experience low back pain (66) and 56–85% experience neck pain (149). Gironda et al. (67) reported that 47% of Operation Enduring Freedom (OEF)/Operation Iraqi Freedom (OIF) veterans entering the VA system reported having pain during their initial visit. Pain is a reality reinforced by various occupational requirements as well as participation in essential load carriage tasks (127). Thus, measures of pain and of the degree to which pain affects human performance, as well as the development of metrics of pain assessment, can be used to signal countermeasures to correct/optimize performance. Metrics of pain will allow an understanding of limits of mission-applied psychomotor tasks (Table 10). Such metrics must be multidimensional in terms of how they impact life activities (sleep, mood, activity, etc.), and be able to be used longitudinally to optimize benefit. Importantly, pain is underreported by active duty military personnel, and although measurements of pain may not depict limits in very high functioning Warriors (Special Forces, Rangers, etc.), the enduring but subtle impact that pain may impose goes unmeasured. This may in the long run severely limit the contribution of Warriors at later stages in their military careers. Currently, the best measures for pain include the Defense and Veterans Pain Rating Scale (DVPRS) and the Pain Assessment Screening Tool and Outcomes Registry (PASTOR)/Patient Reported Outcomes Measurement Information System (PROMIS) project.

Gaps. No true measure of “pain resiliency” exists; however, psychological components are key in gathering a sense of such resiliency. In addition, no objective biomarkers (genetic or other) are available to (a) measure pain or (b) predict the impact of pain on either a personal or population level.
Positive New Directions. An essential positive direction for HPO Pain Metrics is to actually measure pain in an accurate and meaningful way by integrating multidimensional pain tools that crosstalk with other data platforms. Pain should be assessed in a functional manner that considers both physical and mental impact.
Recommendations and Issues. Further research within the realm of biological systems will be essential for HPO Pain modeling. This includes using genomics and other biomarkers to correlate with perceptions of pain and to predict the impact of pain on a population and personal level. In terms of validation, classic tools such as the Brief Pain Inventory and various pain questionnaires have been used in research and clinical settings. However, they are time-consuming and require trained personnel to administer them, so they may not be desirable for an HPO toolkit. The DoD/Veterans Health Administration (VHA) has identified measurement tools of pain, especially the DVPRS and the PASTOR/PROMIS project. The DVPRS/Supplemental questionnaire is an easy and deployable pain measurement tool that focuses on function and the impact of pain. It can be an easy onsite tool and serve a tracking function. It has been validated in one recent study, and numerous studies are under way for further validation. Its standardization has already commenced with rollout to many DoD facilities and VHA hospitals. The PASTOR/PROMIS project is a DoD-funded project, whereby PROMIS pain measurement tools were essentially placed into a “militarized” program where pain can be measured in a multidimensional manner through a web-based program. It consists of well-validated tools (NIH PROMIS), but it has not itself been validated as a combination tool. It went “live” in 2014 and will be undergoing initial validation and alteration in 2015/2016. This represents a large step forward in how pain is being measured in terms of focusing on comorbid conditions of substance abuse and PTSD, adding demographics that are military specific, using validated pain interference measures, and also describing what analgesic modalities (pharmacological and nonpharmacological) have been tried by a specific patient. It is based on a longitudinal platform that is personalized to each patient. The military health system has embraced the platform and will likely use this program as the primary pain measurement tool/tracking system, both to provide individualized data and to serve as a larger population comparison data repository. PASTOR/PROMIS likely represents the best tool to measure and follow the impact of pain on human performance in the military.
One other future direction to consider is the development of systems that allow patients, Warriors, and others to monitor and respond to pain in a self-directed manner without always needing to seek care in a clinic. Such systems may be developed so that individuals can, by means of a convenient platform, measure, monitor, and respond to limiting pain levels as an early warning system.
HPO will continue to be important in the future, where Warriors will likely operate in “volatile, uncertain, complex, and ambiguous (VUCA)” (160) and “unconventional” environments. Thus, innovative approaches to optimize the military’s strongest asset, people, are and always will be needed. However, until validated and reliable metrics can be established, and a toolbox of metrics can be operationalized, the monitoring of various new training paradigms and learning scenarios, and appropriate evaluation of new technologies will remain a gap (122). This meeting centered on developing such a toolkit, or at least starting to build one. This paper provides a variety of metrics (150 total) across multiple domains that can be used as a starting point for an HPO toolkit. However, many of these metrics have not been completely developed with respect to their properties, validity, or reliability, nor have their ease of use (field expediency, requirements for equipment and technology, etc.) and personnel demands been quantified. Perhaps, once a sufficient toolkit of HPO metrics is developed and a solid understanding of their military applicability is attained, it will be possible to predict a functional outcome/capability in Warriors (resiliency score, ability to successfully perform an occupational task, etc.), as recently done with multiple health-related biomarkers and algorithms used to predict biological age among Dunedin Study cohorts (15). All these issues must be carefully evaluated and prioritized with regard to the specific objectives of the individual, unit, and/or organizational performance goals. Importantly, databases must be developed to track the metrics collected and must be made available to various groups, ranging from researchers to individuals to leaders. Warriors must be empowered to self-monitor their performance capabilities and capacities, and senior leaders must ensure the resources and environment are conducive to HPO.
ACKNOWLEDGMENTS
The authors would like to acknowledge the contribution of all working group members who attended the 2013 HPO Metrics Conference held at CHAMP. The authors would also like to acknowledge Matthew Moosey, Aaron Weisbrod, Margaret Baisley, and Kathryn Eklund for their assistance with detailed preparation of Tables in the Psychological Domain.
