Received: 16 July 2018
Accepted: 13 August 2018
DOI: 10.1002/hbm.24371
COMMENT

Reporting matters: Brain mapping with transcranial magnetic stimulation

Martin E. Héroux 1,2

1 Neuroscience Research Australia, Randwick, New South Wales, Australia
2 University of New South Wales, Randwick, New South Wales, Australia

Correspondence: Martin Héroux, Neuroscience Research Australia, Randwick, New South Wales 2031, Australia. Email: [email protected]
Transcranial magnetic stimulation (TMS) allows researchers to noninvasively probe the human brain. In a recent issue of Human Brain Mapping, Massé-Alarie, Bergin, Schneider, Schabrun, and Hodges (2017) used this technique to investigate the task-specific organization of the primary motor cortex for control of human forearm muscles. Specifically, TMS was used to create cortical topographical maps of four forearm muscles at rest, and, in one of these muscles, during isometric wrist extension and isometric grip. The authors were interested in how these maps differ between muscles, how they overlap, and how they change with different motor tasks. Key to their approach was the use of indwelling fine-wire electrodes to record motor evoked potentials elicited by magnetic stimulation, which revealed that the size of cortical maps is grossly overestimated when evoked potentials are recorded from electrodes placed on the skin surface.

In their paper, Massé-Alarie et al. (2017) set (statistical) significance at p < 0.05. Yet the authors interpret several p values above this threshold as statistical trends. Post hoc analyses were even performed for main effects that were not statistically significant:

"Although narrowly missing significance, there was a tendency for a main effect of Pairs of tasks for percentage of MEP peak overlap (F(1, 13) = 3.30; p = 0.053), which was explained by a tendency towards a greater percentage peak overlap in Rest-Ext (32.1 ± 9.6%) than Rest-Grip (14.3 ± 8.2%; p = 0.02; Fig. 4B), and by a tendency toward a greater percent peak overlap for Grip-Ext (26.8 ± 9.7%) compared to Rest-Grip (p = 0.09)." (p. 6126)

"Finally, although nonsignificant, there was a tendency toward a main effect for muscle pairs for percentage of MEP peak overlap (F(2, 18) = 3.23; p = 0.06). This was explained by a tendency toward a larger overlap between ECRBfw-surf (43.5 ± 8.7%) than ECRBfw-EDC (25.2 ± 7.5%; post hoc p = 0.03; Fig. 6C)." (pp. 6126–6127)

The authors later discuss these effects as if they were statistically significant:

"The findings that peak overlap was larger between ECRBsurf and ECRBfw than for ECRBfw and EDC implies that […]." (p. 6130)

Implicit in this type of interpretation is the assumption that statistical trends reflect real effects. However, additional data are more likely than not to turn a trend into a nonsignificant result (Wood, Freemantle, King, & Nazareth, 2014). Is this a problem? In this instance the spin is blatant and an astute reader can draw their own conclusions. However, spin in its various forms is so common (Bero, 2018; Chiu, Grundy, & Bero, 2017; Héroux, 2016) that readers and reviewers may be wooed by such biased interpretations. Why do I say biased? The authors are lenient in only one direction. In their paper, Massé-Alarie et al. (2017) report 11 p values that fall between 0.02 and 0.05. Why are these not reported as tending toward nonsignificance? Regardless of whether the p values were just above or just below the threshold of p = 0.05, an exact replication (i.e., same sample size and methods) of this study has only a 50% chance of reproducing these statistically significant (or near significant) effects (Button et al., 2013; Forstmeier, Wagenmakers, & Parker, 2017). As recently pointed out, p values are fickle (Cumming, 2014; Halsey, Curran-Everett, Vowler, & Drummond, 2015), especially when sample size is small (Button et al., 2013; Higginson & Munafò, 2016). Thus, how confident should we be about the results of Massé-Alarie et al. (2017)? Looking back, how confident should we be about our own work? This type of nuanced view is not common. But we need more of it, especially in noninvasive brain stimulation, where many published effects are simply not reproducible (Héroux, Taylor, & Gandevia, 2015; Héroux, Loo, Taylor, & Gandevia, 2017).

Another reporting matter in the paper by Massé-Alarie et al. (2017) is the all-too-common use of the standard error of the mean (SEM) to summarize data variability (Héroux, 2016; Héroux et al., 2017; Weissgerber, Milic, Winham, & Garovic, 2015). This is not what the SEM quantifies. But does it actually matter what measure is reported? Experts think so (Curran-Everett & Benos, 2004), as do I. Here are some of the above results reported with standard deviations:

"[…] which was explained by a tendency towards a greater percentage peak overlap in Rest-Ext (32.1 ± 35.9%) than Rest-Grip (14.3 ± 31.1%; p = 0.02; Fig. 4B), and by a tendency toward a greater percent peak overlap for Grip-Ext (26.8 ± 36.3%) compared to Rest-Grip (p = 0.09)."

Given that percentages are bounded between 0 and 100, what does 14.3 ± 31.1% actually mean? What do the underlying data look like? Reporting results with standard deviations or other appropriate measures of variability does not affect statistical tests (a significant result will remain a significant result), so let us not be afraid of them. It also provides the reader with a better sense of the underlying data, which is important to appropriately interpret study results and figures (Belia, Fidler, Williams, & Cumming, 2005; Curran-Everett & Benos, 2004; Drummond & Vowler, 2011).

Exploratory research is important to identify new avenues of research and test new hypotheses, and the paper by Massé-Alarie et al. (2017) raises many interesting questions about how the human primary motor cortex is organized. Nevertheless, I encourage the authors and others in the field to be mindful when reporting and interpreting study results, especially when sample sizes are relatively small. Let us heed the advice of experts and, as a field, strive toward publishing research that is less biased, and more reproducible and transparent.

© 2018 Wiley Periodicals, Inc.

ORCID

Martin E. Héroux
http://orcid.org/0000-0002-3354-7104
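A note on the standard-deviation values quoted above: a standard deviation can be recovered from a reported SEM as SD = SEM × √n. The short Python sketch below illustrates the conversion; it assumes n = 14, as implied by the F(1, 13) degrees of freedom reported by Massé-Alarie et al. (2017), and the condition labels are taken from their quoted results.

```python
import math

# Convert a reported SEM back to a standard deviation: SD = SEM * sqrt(n).
# n = 14 is an assumption inferred from the F(1, 13) degrees of freedom.
n = 14

reported_sems = {"Rest-Ext": 9.6, "Rest-Grip": 8.2, "Grip-Ext": 9.7}
for condition, sem in reported_sems.items():
    sd = sem * math.sqrt(n)
    print(f"{condition}: SEM {sem}% -> SD {sd:.1f}%")
```

This recovers the 35.9% and 36.3% quoted above; the Rest-Grip value comes out at 30.7% rather than 31.1%, presumably because the published SEM of 8.2 is itself rounded.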
REFERENCES

Belia, S., Fidler, F., Williams, J., & Cumming, G. (2005). Researchers misunderstand confidence intervals and standard error bars. Psychological Methods, 10, 389–396.

Bero, L. (2018). Meta-research matters: Meta-spin cycles, the blindness of bias, and rebuilding trust. PLoS Biology, 16, e2005972.

Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376.

Chiu, K., Grundy, Q., & Bero, L. (2017). 'Spin' in published biomedical literature: A methodological systematic review. PLoS Biology, 15, e2002173.

Cumming, G. (2014). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. New York, NY: Routledge.

Curran-Everett, D., & Benos, D. J. (2004). Guidelines for reporting statistics in journals published by the American Physiological Society. Journal of Applied Physiology, 97, 457–459.

Drummond, G. B., & Vowler, S. L. (2011). Show the data, don't conceal them. Journal of Physiology, 589, 1861–1863.

Forstmeier, W., Wagenmakers, E. J., & Parker, T. H. (2017). Detecting and avoiding likely false-positive findings: A practical guide. Biological Reviews of the Cambridge Philosophical Society, 92, 1941–1968.

Halsey, L. G., Curran-Everett, D., Vowler, S. L., & Drummond, G. B. (2015). The fickle P value generates irreproducible results. Nature Methods, 12, 179–185.

Héroux, M. E. (2016). Inadequate reporting of statistical results. Journal of Neurophysiology, 116, 1536–1537.

Héroux, M. E., Loo, C. K., Taylor, J. L., & Gandevia, S. C. (2017). Questionable science and reproducibility in electrical brain stimulation research. PLoS One, 12, e0175635.

Héroux, M. E., Taylor, J. L., & Gandevia, S. C. (2015). The use and abuse of transcranial magnetic stimulation to modulate corticospinal excitability in humans. PLoS One, 10, e0144151.

Higginson, A. D., & Munafò, M. R. (2016). Current incentives for scientists lead to underpowered studies with erroneous conclusions. PLoS Biology, 14, e2000995.

Massé-Alarie, H., Bergin, M. J. G., Schneider, C., Schabrun, S., & Hodges, P. W. (2017). "Discrete peaks" of excitability and map overlap reveal task-specific organization of primary motor cortex for control of human forearm muscles. Human Brain Mapping, 38, 6118–6132.

Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: Time for a new data presentation paradigm. PLoS Biology, 13, e1002128.

Wood, J., Freemantle, N., King, M., & Nazareth, I. (2014). Trap of trends to statistical significance: Likelihood of near significant P value becoming more significant with extra data. British Medical Journal, 348, g2215.
How to cite this article: Héroux ME. Reporting matters: Brain mapping with transcranial magnetic stimulation. Hum Brain Mapp. 2018;1–2. https://doi.org/10.1002/hbm.24371
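The claim that an exact replication has roughly a 50% chance of reproducing a just-significant effect can be checked with a short simulation. The sketch below is illustrative only, not the authors' analysis: it assumes a one-sample t-test with n = 14 (matching the F(1, 13) design) and a true effect placed exactly at the significance threshold.

```python
import math
import random
import statistics

random.seed(1)

n = 14                # sample size (an assumption, matching df = 13)
t_crit = 2.1604       # two-tailed critical t for alpha = 0.05 with df = 13
true_mean = t_crit / math.sqrt(n)   # true effect sitting exactly at threshold

n_replications = 20_000
significant = 0
for _ in range(n_replications):
    sample = [random.gauss(true_mean, 1.0) for _ in range(n)]
    t = statistics.mean(sample) / (statistics.stdev(sample) / math.sqrt(n))
    if abs(t) > t_crit:
        significant += 1

print(f"Exact replications reaching p < 0.05: {significant / n_replications:.2f}")
```

The proportion hovers around one half: under these assumptions, an exact replication of a threshold-level effect is about as likely to miss significance as to reach it.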