Journal of Pediatric Urology (2013) 9, 57–61
A multi-center study of pediatric uroflowmetry data using patterning software
Akihiro Kanematsu a,c,*, Shiro Tanaka b, Kazuyoshi Johnin d, Shina Kawai e, Shigeru Nakamura e, Masaaki Imamura a, Koji Yoshimura a, Yoshihide Higuchi c, Shingo Yamamoto c, Yusaku Okada d, Hideo Nakai e, Osamu Ogawa a
a Department of Urology, Kyoto University, Japan
b Division of Clinical Trial Design and Management, Translational Research Center, Kyoto University Hospital, 54, Shogoin Kawaracho, Sakyo, Kyoto, Japan
c Department of Urology, Hyogo College of Medicine, 1-1 Mukogawacho, Nishonomiya, Hyogo, Japan
d Department of Urology, Shiga University of Medical Science, Seta Tsukinowacho, Otsu, Shiga, Japan
e Department of Urology, Jichi Children’s Medical Center Tochigi, 3311-1 Yakushiji, Shimotsuke, Tochigi, Japan
Received 27 August 2011; accepted 6 December 2011 Available online 23 December 2011
KEYWORDS Uroflowmetry; Children; Enuresis; Daytime wetting; Epidemiology; Patterning; Software
Abstract
Objective: We created software for patterning uroflowmetry (UFM) curves and validated its utility.
Patients and Methods: The software patterns a given UFM curve based on four parameters: sex, voided volume, maximal flow rate, and amplitude of fluctuation. Using the software, 6 urologists from 4 institutes assessed 30 test curves. Further, 329 UFM curves obtained from children presenting to 3 institutes for daytime and/or nighttime wetting were assessed. Clinical presentation was divided into 3 groups: group A, daytime incontinence; group B, nonmonosymptomatic nocturnal enuresis without daytime wetting; and group C, monosymptomatic nocturnal enuresis.
Results: Using the software, inter-rater agreement ranged from 0.85 to 1.00 (mean, 0.93 ± 0.04). The software could pattern 310 out of 329 clinical curves. In each institute, the tower pattern was more prevalent with increasing severity of daytime symptoms, although not significantly. The merged data showed that the percent tower pattern significantly correlated with presence of daytime symptoms (groups A, B, and C, 29.7%, 27.0%, and 16.3%, respectively; p < 0.05). No correlation with daytime symptoms was noted for fluctuated (staccato and interrupted) and plateau patterns.
Conclusion: The software creates a common platform for evaluating pediatric UFM, enabling extraction of common and biased features of different cohorts, and their integration into one single cohort.
* Corresponding author. Department of Urology, Hyogo College of Medicine, 1-1 Mukogawacho, Nishonomiya, Hyogo, Japan. Tel.: +81 798 45 6366; fax: +81 798 45 6368. E-mail addresses: [email protected], [email protected] (A. Kanematsu), [email protected] (S. Tanaka), johnin@belle.shiga-med.ac.jp (K. Johnin), [email protected] (S. Kawai), [email protected] (S. Nakamura), [email protected] (M. Imamura), [email protected] (K. Yoshimura), [email protected] (Y. Higuchi), [email protected] (S. Yamamoto), [email protected] (Y. Okada), [email protected] (H. Nakai), [email protected] (O. Ogawa).
1477-5131/$36 © 2011 Journal of Pediatric Urology Company. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.jpurol.2011.12.001
Introduction
Uroflowmetry (UFM) is the most common diagnostic urodynamic procedure for evaluating children with lower urinary tract symptoms (LUTS) and is widely used for diagnosing children with daytime and nighttime wetting [1–4], which are highly prevalent conditions during childhood [5–8]. It has been postulated that UFM can detect underlying bladder problems in children by identifying certain urination patterns, classified as bell (normal), tower, plateau, staccato, and interrupted patterns according to standardized terminology from the International Children’s Continence Society (ICCS) [1,2].
Because the patterning itself had been incompletely standardized, in our previous report we proposed a method to define normal and abnormal patterns in accordance with the ICCS terminology [9]. This method consisted of two steps: 1) defining staccato and interrupted patterns, grouped together as the fluctuated pattern, when the amplitude of fluctuation exceeds the square root of the maximal flow rate (MFR); and 2) classifying the remaining curves according to the deviation of the MFR of a given curve from the median value of a published nomogram [10]. The method was expected to overcome the inter-observer difference seen with subjective patterning [11], and showed that objective patterning is feasible.
However, although logically simple, working and calculating with a nomogram is relatively time-consuming in clinical practice, and the method itself has not yet become readily usable for all clinicians. Thus, we developed internet-based automated calculating software, which enables physicians to obtain a reproducible patterning immediately in the clinic. We validated this software in two steps. First, we examined inter-observer difference among six urologists from four separate institutes. Then, clinical datasets deriving from three different institutes were assessed by the software and compared. Finally, these three datasets were merged into one single cohort to formulate an integrative database.
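The two-step patterning rule can be illustrated with a minimal sketch. It assumes the ICCS criterion that a curve counts as fluctuated when the amplitude of fluctuation exceeds the square root of the MFR, and a tower cutoff of MFR above 130% of the nomogram median. The function name, argument names, and the plateau cutoff are hypothetical: the plateau cutoff is left as a required argument, and the value used in the usage lines is purely illustrative. The nomogram median (which depends on sex and voided volume) must be supplied by the caller. This is not the authors' software.

```python
import math


def classify_pattern(mfr, fluctuation, nomogram_median_mfr,
                     plateau_ratio, tower_ratio=1.30):
    """Classify a uroflowmetry curve in two steps (illustrative sketch).

    mfr                  maximal flow rate (ml/s)
    fluctuation          amplitude of fluctuation (ml/s)
    nomogram_median_mfr  median MFR for this sex and voided volume,
                         looked up by the caller in a published nomogram
    plateau_ratio        lower cutoff as a fraction of the median
                         (hypothetical parameter, no default assumed)
    tower_ratio          upper cutoff as a fraction of the median
    """
    if mfr <= 0 or nomogram_median_mfr <= 0:
        raise ValueError("flow rates must be positive")

    # Step 1: staccato and interrupted curves are grouped as "fluctuated".
    if fluctuation > math.sqrt(mfr):
        return "fluctuated"

    # Step 2: compare the MFR with the nomogram median.
    ratio = mfr / nomogram_median_mfr
    if ratio > tower_ratio:
        return "tower"
    if ratio < plateau_ratio:
        return "plateau"
    return "normal"


# Usage with made-up numbers; 0.70 is an illustrative placeholder only.
HYPOTHETICAL_PLATEAU_RATIO = 0.70
print(classify_pattern(30.0, 2.0, 20.0, HYPOTHETICAL_PLATEAU_RATIO))
print(classify_pattern(16.0, 5.0, 20.0, HYPOTHETICAL_PLATEAU_RATIO))
```

Keeping the nomogram lookup outside the function separates the published reference data from the decision rule, so either can be replaced independently.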
interruption relied on subjective judgment. The remaining curves were classified according to deviation of MFR of a given curve from median value of a published nomogram [10]. MFR >130% and 300 ml) and 10 curves for underscaling (voided volume, < 20 ml). Therefore, 310 of 329 curves (94.2%) from three institutes (n = 98, 155 and 57) could be analyzed by the software.
As shown in Table 2, average age differed significantly between institutes. Otherwise, the demographics were roughly consistent among institutes, with mean age lowest in group A and highest in group C, in order of decreasing severity of daytime symptoms. In each institute, the tower pattern tended to be more prevalent in patients with daytime symptoms, although not significantly (Fig. 2). For the remaining patterns, there was no consistent distribution among the three institutes (Fig. 2).
Having a common platform, the three datasets could be merged into one single cohort; the percent tower pattern significantly correlated with presence of daytime symptoms (group A, B, and C, 29.7%, 27.0%, and 16.3%, respectively; p < 0.05, chi-square test, Fig. 3), whereas no correlation with daytime symptoms was noted for the fluctuated and plateau patterns.
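As an illustration of how a rate can be compared across three groups, the following sketch computes a Pearson chi-square statistic for a 2 × 3 table in plain Python. The function name is hypothetical, and the counts in the usage lines are invented for demonstration; they are not the study data. For three groups the test has 2 degrees of freedom, for which the p value has the closed form exp(−χ²/2); the sketch relies on that special case and does not generalize to other table sizes.

```python
import math


def chi_square_three_groups(events, totals):
    """Pearson chi-square test for a 2 x 3 table (event vs no event).

    events  number of curves with the pattern of interest in each group
    totals  number of curves in each group
    Returns (chi2, p). The p value uses the closed form valid only
    for 2 degrees of freedom, i.e. exactly three groups.
    """
    if len(events) != 3 or len(totals) != 3:
        raise ValueError("exactly three groups are required")
    total_events = sum(events)
    grand_total = sum(totals)
    if total_events == 0 or total_events == grand_total:
        raise ValueError("expected counts would be zero")

    chi2 = 0.0
    for e, n in zip(events, totals):
        # Cells of this group: (observed count, margin of that outcome).
        cells = ((e, total_events), (n - e, grand_total - total_events))
        for observed, margin in cells:
            expected = n * margin / grand_total
            chi2 += (observed - expected) ** 2 / expected

    p = math.exp(-chi2 / 2.0)  # survival function of chi-square, df = 2
    return chi2, p


# Invented counts for demonstration only (not taken from the study).
chi2, p = chi_square_three_groups([30, 25, 10], [100, 100, 100])
print(round(chi2, 3), p < 0.05)
```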
Inter-rater agreement
As shown in Table 1, inter-rater agreement ranged from 0.85 to 1.00 (mean, 0.93 ± 0.04). Disagreement occurred in
Discussion
The present results demonstrate that our software system provides a platform for reproducible evaluation of pediatric UFM curves, and enables comparison and integration of data from multiple institutes. This latter point is particularly important because thus far only expert opinion has existed in this area, and comparison of data from different institutes has remained unattainable [13]. The software presented in this report is presently available as an internet-based calculation system [12],
Statistical methods
Data are shown as mean ± standard deviation. Inter-rater agreement for each pair of the six urologists was evaluated by Cohen’s kappa coefficient and its 95% confidence interval. Distributions of age in the three groups were compared by ANOVA. The rate of tower pattern in the patient groups was assessed by chi-square test. A p value < 0.05 was taken to indicate statistical significance. All statistical analyses were performed using StatView version 5.0 (SAS Institute, Cary, NC, USA) and SAS version 9.2 (SAS Institute) software.
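For readers unfamiliar with Cohen's kappa, the following sketch shows how the coefficient is obtained for one pair of raters: observed agreement is corrected for the agreement expected by chance from each rater's category frequencies. The function name and the example ratings are hypothetical and are not taken from the study; the confidence interval reported in the paper is not computed here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same curves."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set")
    n = len(ratings_a)

    # Proportion of curves on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance from the marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one and the same category for every curve.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented ratings for six curves (for demonstration only).
rater_1 = ["normal", "normal", "tower", "tower", "plateau", "fluctuated"]
rater_2 = ["normal", "normal", "tower", "normal", "plateau", "fluctuated"]
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Applying the function to every pair among six raters gives 15 coefficients, one per pairwise comparison.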
Table 1 Summary of inter-rater agreement of six urologists.
Comparison            Kappa          95% confidence interval
Rater 1 vs Rater 2    0.90           0.77–1.00
Rater 1 vs Rater 3    0.85           0.69–1.00
Rater 1 vs Rater 4    0.95           0.86–1.00
Rater 1 vs Rater 5    1.00           1.00–1.00
Rater 1 vs Rater 6    0.95           0.86–1.00
Rater 2 vs Rater 3    0.95           0.86–1.00
Rater 2 vs Rater 4    0.95           0.86–1.00
Rater 2 vs Rater 5    0.90           0.77–1.00
Rater 2 vs Rater 6    0.95           0.86–1.00
Rater 3 vs Rater 4    0.90           0.77–1.00
Rater 3 vs Rater 5    0.85           0.69–1.00
Rater 3 vs Rater 6    0.90           0.77–1.00
Rater 4 vs Rater 5    0.95           0.86–1.00
Rater 4 vs Rater 6    1.00           1.00–1.00
Rater 5 vs Rater 6    0.95           0.86–1.00
Mean ± SD             0.93 ± 0.04
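The summary row of Table 1 can be checked from the 15 pairwise coefficients. The sketch below uses the rounded kappa values as printed in the table; because the inputs are rounded and the text does not state which standard deviation formula was used, both the population and the sample form are computed. Variable names are illustrative.

```python
import statistics

# Pairwise kappa values as printed in Table 1 (rounded to two decimals).
kappas = [
    0.90, 0.85, 0.95, 1.00, 0.95,  # Rater 1 vs Raters 2 to 6
    0.95, 0.95, 0.90, 0.95,        # Rater 2 vs Raters 3 to 6
    0.90, 0.85, 0.90,              # Rater 3 vs Raters 4 to 6
    0.95, 1.00,                    # Rater 4 vs Raters 5 and 6
    0.95,                          # Rater 5 vs Rater 6
]

mean_kappa = statistics.mean(kappas)
sd_population = statistics.pstdev(kappas)  # divides by n
sd_sample = statistics.stdev(kappas)       # divides by n - 1

print(f"mean = {mean_kappa:.2f}")
print(f"population SD = {sd_population:.3f}, sample SD = {sd_sample:.3f}")
```

Both standard deviations fall between 0.04 and 0.05, in line with the reported 0.93 ± 0.04 given the rounding of the inputs.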
Table 2 Summary of demographic data from three institutes.
Institute          I (n = 98)    II (n = 155)    III (n = 57)    Total (n = 310)
Sex (rate)
  M                71 (0.72)     112 (0.72)      32 (0.58)       215 (0.69)
  F                27 (0.28)     43 (0.28)       25 (0.44)       95 (0.31)
n (rate)
  A                40 (0.41)     65 (0.42)       33 (0.58)       138 (0.42)
  B                28 (0.29)     38 (0.25)       8 (0.14)        74 (0.25)
  C                30 (0.31)     52 (0.34)       16 (0.28)       98 (0.33)
Mean age (SD)
  A                8.1 (2.2)     7.6 (1.6)       8.5 (1.7)       7.9 (1.8)
  B                9.0 (2.0)     8.5 (1.7)       10.0 (1.4)      8.8 (1.8)
  C                9.2 (1.8)     9.4 (1.9)       10.1 (2.4)      9.6 (1.9)
  Total a          8.7 (1.8)     8.1 (1.7)       9.4 (2.1)       8.6 (1.9)
a p < 0.05 by ANOVA.
[Figure 2 image: y-axis, Rate (0.0 to 1.0); legend, Tower, Fluctuated, Plateau, Normal; panels, Institute I, Institute II, and Institute III, each showing groups A, B, and C; each panel marked N.S.]
Figure 2 Comparison of the patterns among three independent institutes. Tower pattern tended to be less prevalent in Group C (MNE), but this did not reach statistical significance.
readily usable for urologists from different facilities around the world. Six urologists from four different institutes tested this system. The mean inter-rater agreement (kappa) was 0.93, which far exceeded the agreement documented for subjective patterning, ranging from 0.21 to 0.64 [9], or 0.45 to 0.67 [11] in previous reports. There were three curves for which disagreement arose. One, concerning differentiation between tower and normal, may have been due to misuse of the software, because only one rater rated it as normal, which would not occur if correct data were entered into the software. The remaining two disagreements occurred over the definition of fluctuation. Because the ICCS committee itself did not define fluctuation, it is not surprising that different interpretations were made for curves with multiple peaks. For the remainder of the curves, the reproducibility of the judgment demonstrated the excellent reliability of the present system. There have been only a limited number of reports comparing UFM results [3,11,13,14].
p