Study design and recruitment
Data are from the Multi-center Infant Body Composition Reference Study (MIBCRS), a longitudinal, prospective, multinational study that followed infants from birth to 24 months in lower-middle-income (India, Pakistan and Sri Lanka), upper-middle-income (Brazil and South Africa) and high-income (Australia) countries. Details of the recruitment process are described elsewhere [4, 19, 20, 21]. Each participating country conducted the study in adherence to the International Ethical Guidelines for Biomedical Research Involving Human Subjects and obtained approval from its respective review committee. Written informed consent was obtained from the enrolled mothers, and data were collected from 2013 to 2019. The main cohort comprised 3–24 month data from 708 mother-infant pairs in Brazil, Pakistan, South Africa, and Sri Lanka; body composition data from this cohort were used to develop the equations (training data) and to validate them internally (validation data). An independent cohort of 250 infants (3–6 months) from Australia, India and South Africa was used for external validation of the developed equations (test data). During follow-up, children were fed according to Infant and Young Child Feeding (IYCF) guidelines.
The sample size for each study site was calculated to provide 90% power to detect FM and FFM values in boys and girls that were less than one standard deviation away from those of a reference study, which found a mean FM of 3.10 ± 0.5 kg and 3.05 ± 0.46 kg, and a mean FFM of 9.13 ± 1.06 kg and 8.99 ± 1.1 kg, for boys and girls, respectively [22].
Body composition assessed using the Deuterium Dilution (DD) technique
DD was utilized to calculate FM and FFM of infants at 3, 6, 12, 18, and 24 months of age in the development and validation group, and at 6 months of age in the test group. Details of the technique are provided elsewhere [23].
Anthropometry
Anthropometric data for this analysis were taken from the visits at 3, 6, 9, 12, 18 and 24 months in the development and validation group and from the 6-month visit in the test group. Standardized protocols for anthropometry were developed based on the WHO Multicentre Growth Reference Study (WHO-MGRS) protocol [24].
Infant weight was measured naked using a paediatric electronic scale (Seca 376; Hamburg, Germany). Length was measured using a Harpenden infantometer (300–1100 mm, accurate to 1 mm; Holtain Ltd, Crymych, Wales, UK) in all countries except India and Sri Lanka, where a Seca 417 (Hamburg, Germany) was used. A detailed protocol has been published [23].
Triceps skinfold thickness (TSFT) and subscapular skinfold thickness (SSFT) were measured on the left arm to the nearest 0.2 mm using a Holtain Tanner skinfold calliper (Holtain Ltd, Crymych, Wales, UK). Each skinfold thickness was read after 2 s, consistent with the WHO-MGRS methodology, with a maximum allowable difference (MAD) of 2 mm [24]. Mid-upper arm and head circumference were measured to the nearest 1 mm with a non-stretchable flexible tape (Seca 212; Hamburg, Germany), with a MAD of 5 mm.
Quality control in data collection
Anthropometry protocol training was undertaken in Johannesburg, South Africa, and anthropometry standardization sessions were subsequently undertaken locally at three-monthly intervals. Intra- and inter-observer technical errors of measurement were calculated and compared to the measurements obtained by the anthropometry supervisor (gold standard). A training workshop on the DD technique was held at St John’s Research Institute, Bangalore, India. The IAEA organized an inter-laboratory comparison for the analysis of deuterium enrichment among the laboratories responsible for analyzing samples.
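For illustration, the technical error of measurement (TEM) for paired repeat readings is commonly computed as the square root of the sum of squared differences divided by twice the number of pairs. The sketch below assumes that common definition; the exact formula used in the study is not stated here, and the function name and skinfold readings are hypothetical.

```python
import math


def technical_error_of_measurement(first, second):
    """TEM for paired repeat measurements: sqrt(sum(d^2) / (2 * n)).

    Assumes the commonly used two-measurement definition; the study's
    own formula may differ.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equally sized, non-empty series")
    squared_diffs = [(a - b) ** 2 for a, b in zip(first, second)]
    return math.sqrt(sum(squared_diffs) / (2 * len(first)))


# Hypothetical triceps skinfold readings (mm), not study data
observer = [8.2, 9.0, 7.4, 10.2]
supervisor = [8.0, 9.4, 7.4, 10.0]
tem = technical_error_of_measurement(observer, supervisor)
print(f"TEM = {tem:.3f} mm")
```

Comparing an observer against the supervisor in this way gives an inter-observer TEM; passing two series from the same observer gives the intra-observer equivalent.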
Statistical analysis
Data were collated and captured on the REDCap system [25], hosted at the University of the Witwatersrand, Johannesburg.
Sample split for model development
Participants from the 3 to 24-month cohort were split randomly into training (two-thirds) and validation (one-third) groups. The training data consisted of 942 sets of anthropometric and body composition measurements (collected from 310 girls) and 954 sets of measurements (collected from 340 boys). Observations corresponding to the 24-month visit were included only up to 26 months of age. The validation data set consisted of 441 sets of measurements (collected from 154 girls) and 500 sets of measurements (collected from 170 boys) from the same four country cohorts. Test data for external validation of the fitted model consisted of participants from three birth cohorts: Australia (21 girls and 30 boys, with one set of measurements each at 6 months), India (44 girls and 46 boys, with one set of measurements each at 6 months) and South Africa (120 sets of measurements collected from 59 girls and 88 sets of measurements collected from 50 boys, who provided data during at least one visit between 3 and 6 months). Additional information on the sample selection is provided in Supplementary Fig. 1. We describe the characteristics of our training, validation, and test data separately by sex using median and range (Tables 1 and 2).
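A participant-level split of this kind can be sketched as follows. This is a minimal illustration, assuming the split was made on participant identifiers so that all repeated measurements from one infant fall in the same group; the function name, seed and records are hypothetical, and any stratification used in the study is not represented.

```python
import random


def split_participants(participant_ids, train_fraction=2 / 3, seed=0):
    """Randomly assign unique participants to training and validation sets."""
    unique_ids = sorted(set(participant_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(unique_ids)
    n_train = round(len(unique_ids) * train_fraction)
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


# Hypothetical long-format records: (participant_id, visit_month)
records = [(pid, month) for pid in range(1, 10) for month in (3, 6, 12)]
train_ids, validation_ids = split_participants([pid for pid, _ in records])

# Every measurement follows its participant into one group only
train_rows = [r for r in records if r[0] in train_ids]
validation_rows = [r for r in records if r[0] in validation_ids]
```

Splitting on participants rather than on individual measurements avoids the same infant contributing to both model fitting and validation.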
Developing the prediction equations
An a priori power analysis was not conducted for developing the prediction equations; we therefore included longitudinal data from all infants (n = 650, observations = 1896) for whom data were available. Previous studies that estimated body composition using anthropometry had similar or smaller sample sizes [11]. The joint distribution of all variables was examined using scatterplots and Pearson correlations.
Linear mixed models were fitted to the training data, separately for girls and boys, to develop the prediction equations for FM (kg) and FFM (kg). Age was modelled using linear splines with knots at 9 and 18 months, and random intercepts were included to account for clustering of observations within individuals. The knots for age were selected based on visual inspection of trajectories for FM and FFM. Additionally, the models were adjusted for length (m), WFL (kg/m), TSFT, SSFT, and South Asian ethnicity. Adjusting for head and arm circumference did not influence the results, so these variables were not included in the final prediction equations. South Asian ethnicity was ascribed based on the country from which the participant was recruited. The 95% prediction interval was estimated by incorporating uncertainty in the random effects (for training data only), uncertainty in the fixed effects, and the residual variance of the outcome variable. A detailed description of the methodology is provided in Supplementary Note 1.
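To make the age term concrete, the sketch below builds a linear spline basis with knots at 9 and 18 months. It assumes a truncated-line parametrization (age, plus the positive part of age minus each knot), which is one common way to code linear splines; the parametrization actually used is described in Supplementary Note 1 and may differ. The function names are hypothetical and the slope values are placeholders, not estimates from the study.

```python
def linear_spline_terms(age_months, knots=(9.0, 18.0)):
    """Return [age, (age - 9)+, (age - 18)+] for a linear spline in age."""
    return [float(age_months)] + [max(0.0, age_months - k) for k in knots]


def spline_contribution(age_months, slopes):
    """Age contribution: base slope plus a change in slope after each knot."""
    terms = linear_spline_terms(age_months)
    return sum(b * t for b, t in zip(slopes, terms))


# Placeholder slopes (kg per month), NOT estimates from the study
placeholder_slopes = [0.30, -0.20, -0.05]

for age in (6, 12, 24):
    print(age, linear_spline_terms(age),
          round(spline_contribution(age, placeholder_slopes), 3))
```

Under this coding, the first coefficient is the slope before 9 months and each further coefficient is the change in slope at its knot; in the full model these terms would enter alongside the other covariates and a per-infant random intercept.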
Subsequently, the fitted model was internally validated on the ‘validation’ data and evaluated for the quality of predictions using: (a) error metrics (root mean squared error – RMSE; root mean square percentage error – RMSPE; mean absolute error – MAE; and mean absolute percentage error – MAPE) for the predicted values, and (b) the number of instances for which true values were outside the prediction interval. A summary of the different error metrics used is provided in Supplementary Note 2.
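The four error metrics and the interval coverage count can be sketched as below. The definitions follow common usage (error taken as predicted minus observed, percentage errors relative to the observed value) and are an assumption here; the study's exact definitions are given in Supplementary Note 2. Function names and values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """RMSE, RMSPE (%), MAE and MAPE (%) for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    relative = [e / o for e, o in zip(errors, observed)]  # observed must be non-zero
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "RMSPE": 100 * math.sqrt(sum(r * r for r in relative) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100 * sum(abs(r) for r in relative) / n,
    }


def count_outside_interval(observed, lower, upper):
    """Number of observed values falling outside their prediction interval."""
    return sum(1 for o, lo, hi in zip(observed, lower, upper) if not lo <= o <= hi)


# Hypothetical FM values (kg), not study data
observed = [2.0, 4.0]
predicted = [2.2, 3.6]
metrics = error_metrics(observed, predicted)
n_outside = count_outside_interval(observed, [1.5, 3.7], [2.5, 3.9])
```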
Validation of prediction equations
The fitted model was externally validated on the test data and evaluated for quality of predictions using the same error metrics as above. Additionally, systematic error in predictions was explored (on test data with observed and predicted values) using Bland-Altman plots.
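The quantities plotted in a Bland-Altman analysis can be sketched as follows, assuming the conventional definitions: the difference between predicted and observed values is plotted against their mean, with the mean difference (bias) and limits of agreement at bias ± 1.96 standard deviations. The function name and values are hypothetical.

```python
import statistics


def bland_altman_summary(observed, predicted):
    """Per-pair means and differences, plus bias and 95% limits of agreement."""
    differences = [p - o for o, p in zip(observed, predicted)]
    means = [(p + o) / 2 for o, p in zip(observed, predicted)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample standard deviation
    return {
        "means": means,              # x-axis of the plot
        "differences": differences,  # y-axis of the plot
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
    }


# Hypothetical FFM values (kg), not study data
summary = bland_altman_summary([1.0, 2.0, 3.0], [1.5, 2.0, 3.5])
```

A bias far from zero, or differences that trend with the mean, would point to systematic over- or under-prediction.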
Sensitivity analysis
First, model misspecification was assessed by repeating the analysis with models fitted (a) using a quadratic term for age, (b) using natural cubic splines for age, and (c) using natural cubic splines for all predictors. Natural splines are a family of piecewise cubic polynomials and were fitted with four degrees of freedom. These models were then compared using error metrics and the conditional Akaike Information Criterion [26]. Second, because the outcome variables were right-skewed, prediction error was assessed for the four model specifications after logarithmic transformation of the outcome variables. Third, prediction error was assessed for the linear spline model after replacing WFL as a predictor with (a) BMI (kg/m²) and (b) ponderal index (kg/m³). All analyses were performed in R 3.6.1 using the lme4 (v1.1-23), merTools (v0.5.2) and cAIC4 (v1.0) packages.
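The three weight-for-size indices differ only in the power of length in the denominator, as their units indicate. A minimal sketch, with a hypothetical function name and hypothetical input values:

```python
def size_indices(weight_kg, length_m):
    """Weight divided by length to the power 1, 2 and 3."""
    return {
        "WFL": weight_kg / length_m,       # weight-for-length, kg/m
        "BMI": weight_kg / length_m ** 2,  # body mass index, kg/m^2
        "PI": weight_kg / length_m ** 3,   # ponderal index, kg/m^3
    }


# Hypothetical infant: 8.0 kg and 0.80 m, not study data
indices = size_indices(8.0, 0.80)
```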