Establishing an appropriate Z score regression equation for Chinese pediatric coronary artery echocardiography: a multicenter prospective cohort study

Background Z score utility is emphasized in classifying coronary artery lesions in Kawasaki disease patients. The present study is, to date, the largest multicenter study of coronary artery diameter reference values and Z score regression equations in Chinese children, and is intended for use in Chinese pediatric echocardiography.
Methods A multicenter cohort of 852 healthy children between 1 month and 17 years of age was assembled; ten children were excluded because their ultrasound images were unclear or because they were lost to follow-up. Diameters of the right coronary artery, left coronary artery, and left anterior descending coronary artery were assessed using echocardiography. Data were body surface area (BSA)-corrected using BSA calculated via either the Stevenson BSA formula (BSAste) or the Haycock BSA formula (BSAhay). Coronary artery diameter reference values and Z score regression equations were established for use in the Chinese pediatric population.
Results No difference was observed between coronary artery diameter data corrected using BSAste or BSAhay. Of the five assessed regression models, the exponential model exhibited the best fit and was therefore selected as the basis for derivation of the SZ method. When comparing Z scores, those produced by the SZ method conformed to the standard normal distribution, while those produced by the D method did not. In addition, there was a statistically significant difference between Z scores produced by the SZ and D methods (P < 0.05).
Conclusions Coronary artery diameter reference values for echocardiography were successfully established for use in the Chinese pediatric population, and a Z score regression equation more suitable for clinical use in this population was successfully developed.


Background
Echocardiography is commonly used to evaluate pediatric patients for coronary artery disease (CAD). Kawasaki disease is the major cause of childhood-acquired CAD. Accurate pediatric reference values for echocardiographic monitoring of changes in coronary artery diameter contribute to establishing diagnosis and prognosis. The most recent American Heart Association guidelines recommend body surface area (BSA)-corrected coronary artery diameter as the gold standard, and emphasize Z score utility in evaluating coronary artery injury risk and in classifying coronary artery lesions in Kawasaki disease patients. The American Society of Echocardiography also recommends the use of Z scores in pediatric cardiology [1]. These scores indicate the distribution of measurements about the mean within a healthy population, and facilitate comparison of datasets exhibiting differential means and distributions (such as datasets deriving from children of different ages and sizes).
Coronary artery Z score regression equations have been derived for multiple healthy pediatric populations, including those of the United States [2], Canada [3,4], Japan [5], Singapore [6], Turkey [7], Korea [8], and China [9]. With the advent of reference value predictive models, many manufacturers have incorporated such models into echocardiography equipment-integrated software in order to allow rapid computation of coronary artery measurement Z scores. However, whether models developed in North America and/or using different BSA formulae are suitable for use in the Chinese pediatric population remains unclear.
The present study therefore aimed to: (1) establish coronary artery diameter reference values in a healthy Chinese pediatric population, and (2) compare the suitability for this population of a Z score regression equation derived from these data with that of an equation derived from a mostly Caucasian pediatric population [4].

Materials
A multicenter healthy Chinese pediatric cohort was assembled. Pediatric cardiologists and ultrasound experts from six provinces in Southern and Northern China participated in this study. All researchers reached a consensus on unified inclusion and exclusion criteria. The cohort consisted of 862 participants enrolled between July 1, 2016, and December 30, 2019. The children within this cohort were followed up for 6 months. Ten children were excluded because their ultrasound images were unclear or because they were lost to follow-up. There were 546 males and 296 females. Height (H), weight (W), and systolic / diastolic blood pressure (SBP / DBP) were recorded. At each center, echocardiographic data of each child were evaluated.

Inclusion criteria
(1) Age from 1 month to 18 years.
(3) Normal blood pressure [10].
(4) No cardiovascular abnormality on physical examination.
(5) Normal heart structure on clinical examination and echocardiography.
(6) Echocardiography performed for one of the following reasons: asymptomatic heart murmur, chest pain, suspicious chest X-ray or ECG findings, or family history of congenital heart disease.
This study was approved by the ethics committee of Shenzhen Children's Hospital. Signed informed consent was obtained from the children's parents.

Equipment and methods
The age, weight, and height of each patient were recorded at the time of the echocardiographic evaluation. Data were body surface area (BSA)-corrected using BSA calculated via either the Stevenson formula (BSAste) [11] or the Haycock formula (BSAhay) [12]. The studies were performed using a GE Vivid (GE Healthcare, Milwaukee, WI) or Philips iE33 (Philips Medical Systems, Bothell, WA) echocardiography system. All echocardiographic data were digitally recorded, allowing offline measurements. The probe frequency was set as high as possible, with a frame rate > 60 fps. Diameters of the right coronary artery (RCA), left coronary artery (LCA), and left anterior descending (LAD) coronary artery were measured by echocardiography [13] and defined as raw values (Fig. 1).
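The two BSA formulae named above can be sketched as follows. The coefficients are the commonly cited forms of the Haycock and Stevenson formulae (weight in kg, height in cm); they are not taken from this study, and the function names are illustrative.

```python
def bsa_haycock(weight_kg: float, height_cm: float) -> float:
    """Haycock formula: BSA (m^2) = 0.024265 * W^0.5378 * H^0.3964."""
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


def bsa_stevenson(weight_kg: float, height_cm: float) -> float:
    """Stevenson formula: BSA (m^2) = 0.0061 * H + 0.0128 * W - 0.1529."""
    return 0.0061 * height_cm + 0.0128 * weight_kg - 0.1529
```

For a hypothetical child of 20 kg and 110 cm (not a study participant), both functions return a BSA close to 0.78 m², which illustrates why the choice of formula may have little effect on the corrected values.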

Statistical analysis
General data: SPSS 11.0 software (SPSS, Inc., Chicago, IL, United States) was used for statistical analysis. Continuous variables (age, height, weight, blood pressure, heart rate, BSA) were expressed as mean (SD) and / or median.
The Anderson-Darling test was used to determine whether the data conformed to a normal distribution. Student's t test was used to compare two groups of data that conformed to a normal distribution; the Wilcoxon signed-rank test was used to compare two groups of data that did not. P values < 0.05 were considered statistically significant.
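As a minimal sketch of the normality check described above, the Anderson-Darling statistic can be computed with the Python standard library alone. The small-sample adjustment and the 5% critical value of 0.752 are the commonly tabulated values for the case where mean and SD are estimated from the sample; the function names are illustrative, and this is not the software used in the study.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Adjusted Anderson-Darling statistic A*^2 for normality,
    with mean and SD estimated from the sample itself."""
    n = len(sample)
    mu, sd = mean(sample), stdev(sample)
    std_norm = NormalDist()
    # Standardise, sort, and map each value to its standard normal CDF.
    cdf = [std_norm.cdf((x - mu) / sd) for x in sorted(sample)]
    s = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - s / n
    # Small-sample adjustment for estimated parameters.
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


def looks_normal(sample, critical_value=0.752):
    """True if normality is not rejected at roughly the 5% level."""
    return anderson_darling_normal(sample) < critical_value
```

The result of `looks_normal` would then decide between a parametric comparison (t test) and a rank-based one (Wilcoxon signed-rank test), as outlined in the text.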

Regression analysis
Five regression models (linear, logarithmic, power, exponential, and square root) were assessed for goodness of fit in R, based on adjusted R², residual standard error (RSE), and Akaike information criterion (AIC) values. The best-performing model (maximum adjusted R² and minimum RSE and AIC) was carried forward to derive a novel reference regression equation, termed the ShenZhen (SZ) method.
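The model comparison can be illustrated with a short sketch. It assumes each model is fitted by ordinary least squares after transforming BSA and/or diameter (for example, ln(diameter) = a + b × BSA for the exponential model), and that residuals are evaluated on the original diameter scale so that the three criteria are comparable across models. Both choices are assumptions of this sketch; the study's R code is not reproduced here.

```python
import math


def ols(x, y):
    """Simple least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Each model: (transform of BSA, transform of diameter, inverse transform).
MODELS = {
    "linear": (lambda x: x, lambda y: y, lambda f: f),
    "logarithmic": (math.log, lambda y: y, lambda f: f),
    "power": (math.log, math.log, math.exp),
    "exponential": (lambda x: x, math.log, math.exp),
    "square root": (math.sqrt, lambda y: y, lambda f: f),
}


def fit_statistics(bsa, diameter, k=2):
    """Fit all five models; return {name: (adjusted R^2, RSE, AIC)}."""
    n = len(bsa)
    mean_d = sum(diameter) / n
    tss = sum((d - mean_d) ** 2 for d in diameter)
    out = {}
    for name, (fx, fy, inv) in MODELS.items():
        tx = [fx(v) for v in bsa]
        a, b = ols(tx, [fy(d) for d in diameter])
        # Residual sum of squares on the original diameter scale.
        rss = sum((d - inv(a + b * t)) ** 2 for d, t in zip(diameter, tx))
        adj_r2 = 1.0 - (rss / (n - k)) / (tss / (n - 1))
        rse = math.sqrt(rss / (n - k))
        # Gaussian AIC: k coefficients plus the error variance.
        aic = n * (math.log(2 * math.pi * rss / n) + 1) + 2 * (k + 1)
        out[name] = (adj_r2, rse, aic)
    return out


def best_model(stats):
    """Name of the model with the lowest AIC."""
    return min(stats, key=lambda name: stats[name][2])
```

Because every model here has the same number of parameters and is scored on the same scale, the model with the lowest AIC also has the lowest RSE and the highest adjusted R², so selecting on AIC alone is sufficient in this sketch.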

Comparison of two regression equations
Both the SZ method and the North American Dallaire (D) method [4] were applied to the same set of raw data, and the resultant Z scores were compared.
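For orientation, a Z score is the distance of an observed diameter from the model-predicted mean, expressed in units of the model's standard deviation. The sketch below assumes an exponential model of the form ln(diameter) = intercept + slope × BSA with a constant residual SD on the log scale. The coefficients shown are hypothetical placeholders, not the published SZ or D values, and the exact scale on which either method computes Z scores is an assumption here.

```python
import math


def z_score_exponential(diameter_mm, bsa, intercept, slope, rse):
    """Z score under ln(diameter) = intercept + slope * BSA,
    with residual SD `rse` on the log scale (assumed constant)."""
    predicted_log = intercept + slope * bsa
    return (math.log(diameter_mm) - predicted_log) / rse


# Hypothetical coefficients for illustration only; NOT the SZ method values.
EXAMPLE = {"intercept": 0.5, "slope": 0.4, "rse": 0.15}
```

With these placeholder coefficients, a diameter equal to the predicted mean at a given BSA yields Z = 0, and a diameter one log-scale SD above it yields Z = 1. Applying two different coefficient sets to the same raw diameter gives the paired Z scores that the comparison relies on.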

Cohort demographic data
Overall, a pediatric cohort of 842 healthy subjects was assembled, comprising 67% males and 33% females, and spanning an age range of 1 month to 17 years (Table 1).
Echocardiography-determined coronary artery diameters corrected using BSAs calculated via two different formulae.

Regression model performance comparison
The R project for statistical computing was used to compare five regression models, including linear, logarithmic, power, exponential, and square root. Of these, the exponential model exhibited the best fit and was therefore selected as the basis for derivation of the SZ method. Exponential model performance based on BSAste-corrected values was as follows: lnyLCA = 1.0042 + 0.  (Table 3). Because the exponential model based on BSAhay-corrected values yielded the maximum adjusted R2 and minimum RSE and AIC, it was selected to construct a novel regression equation, herein referred to as the SZ method.

Relationship of coronary artery diameter to BSAhay, and Z score distribution
Coronary artery diameters increased non-linearly with BSAhay (Fig. 2A). Values were normally distributed (Fig. 2B), and the normal distribution was standard (P > 0.05) (Fig. 2C).

Comparison of SZ versus D method Z scores
When comparing Z scores resulting from the SZ versus D methods, SZ-derived Z scores conformed to a standard normal distribution, while D-derived Z scores (within certain BSAhay ranges) did not, and the difference between Z scores resulting from the two methods was significant (P < 0.05) (Table 4). Based on raw LCA diameter data, SZ method Z scores were larger than those of the D method when BSA ≤ 1.0 m², but smaller than those of the D method when BSA > 1.0 m² (P < 0.05). Similarly, based on raw LAD artery diameter data, SZ method Z scores were larger than those of the D method when BSA ≤ 1.3 m², but smaller than those of the D method when BSA > 1.3 m² (P < 0.05). Finally, based on raw RCA diameter data, SZ method Z scores were larger than those of the D method when BSA ≤ 0.7 m², but smaller than those of the D method when BSA > 0.7 m² (P < 0.05).

D method Z scores
Within certain BSAhay ranges, D method Z scores were non-normally distributed (Fig. 3).

Performance of the SZ versus D methods
Within certain BSAhay ranges, the SZ and D methods predicted different values (P < 0.05) (Fig. 4). Specifically, D method-predicted LCA diameter means were larger than those predicted by the SZ method when BSAhay > 0.9 m², D method-predicted LAD artery diameter means were smaller than those predicted by the SZ method when BSAhay ≤ 1.2 m², and D method-predicted means of both diameters were larger than those predicted by the SZ method when BSAhay > 0.6 m².
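The BSA thresholds reported above correspond to points where two predicted-mean curves cross. A minimal bisection sketch for locating such a crossover is given below. The two curves are arbitrary illustrative functions, not the SZ or D equations, and all names are hypothetical.

```python
import math


def crossover_bsa(curve_a, curve_b, lo, hi, tol=1e-6):
    """Bisection search for the BSA in [lo, hi] where two predicted-mean
    curves intersect. The difference must change sign on [lo, hi]."""

    def diff(x):
        return curve_a(x) - curve_b(x)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("curves do not cross within the given BSA range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(lo) * diff(mid) <= 0:
            hi = mid  # sign change lies in the lower half
        else:
            lo = mid  # sign change lies in the upper half
    return (lo + hi) / 2


# Hypothetical curves for illustration only (not the SZ or D equations).
def curve_one(bsa):
    return math.exp(0.6 + 0.5 * bsa)


def curve_two(bsa):
    return 0.3 + 3.0 * math.sqrt(bsa)
```

Calling `crossover_bsa(curve_one, curve_two, 0.2, 1.0)` returns the BSA at which the two illustrative curves predict the same mean diameter. Below that value one curve predicts larger means, and above it the other does.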

Discussion
Because China is large and exhibits regional diversity in individual lifestyle, height, and weight, a multicenter study was conducted to more accurately ascertain pediatric coronary artery diameter reference values. The present study is the largest such multicenter Chinese pediatric study to date, and includes coronary artery measurement data from six regions in Northern and Southern China. Study findings demonstrate that coronary artery diameters of a healthy Chinese pediatric population increase nonlinearly with an increase in BSA, and differ significantly from those of the Dallaire cohort. Specifically, Chinese pediatric coronary artery diameters are larger at lower BSA ranges, but smaller at higher BSA ranges, than those of North American children. In addition, relative to the D method, the SZ method mean reference value-predictive regression model exhibits lower Z scores at lower BSA ranges and higher Z scores at higher BSA ranges. This suggests that the SZ method may be more accurate than the D method in the Chinese pediatric population.
Previous studies suggested a linear relationship between coronary artery diameter and BSA, with equal variance across the BSA range [14][15][16][17]. However, multiple larger studies have found a non-linear correlation between coronary artery diameter and BSA in healthy pediatric populations [2][3][4][5][6][7][8][9][18]. The standard deviation of coronary artery diameter about the mean differs between BSA ranges, so linear regression methods that are inappropriate for the population introduce bias during Z score calculation. Therefore, researchers have sought to identify regression models able to more accurately represent the non-linear relationship between BSA and both the mean and the spread of coronary artery diameter. Specifically concerning heteroscedastic non-linear relationships, prior studies have evaluated the goodness of fit of various regression models, including quadratic, cubic polynomial, logarithmic, exponential, and square root [2][3][4][5][6][7][8][9].
In the present study, an exponential regression model exhibited the best fit (based on maximum R 2 and minimum RSE and AIC) and the standard test of normality was therefore applied to model residuals and Z scores. Results indicate that the model performs reliably. The present study therefore contributes to establishment of Chinese pediatric coronary artery diameter reference values, and provides data which may help overcome the lack of universality associated with single-center studies.
Up to 25 BSA calculation methods are available, including the Stevenson, Haycock, and Du Bois formulae [11,12,19,20]. The American Society of Echocardiography guidelines for quantitative pediatric echocardiography recommend use of the Haycock formula when calculating Z scores for cardiovascular structure measurements [21,22]. The Haycock formula may also be a better estimator of BSA for smaller children [4,22], although different BSA formulae did not result in different model Z scores in the present study. However, the Stevenson formula is commonly used within the Chinese healthcare system. Therefore, in order to determine the impact of formula choice on model Z scores, the present study derived regression models from a single original dataset (using BSA calculated via either the Stevenson or the Haycock formula), and compared the resultant Z scores. Because Z scores did not differ significantly, either BSA formula is appropriate for use in quantitative evaluation of echocardiography data in a Chinese pediatric population. The present study ultimately incorporated the Haycock formula into the SZ method regression model. This is consistent with the approach of the D method, which also uses the Haycock formula, and facilitates comparison of these two methods. In comparing Z scores resulting from the SZ and D methods, it was found that the SZ method produced Z scores closer to zero. This indicates that the SZ method may be more suitable than the D method for use in the Chinese pediatric population. Furthermore, predicted coronary artery diameter mean values and standard deviations differed between North American and Chinese pediatric populations, and the predicted mean value curve provided by the D method had a higher gradient than that provided by the SZ method.
When BSA is low, predicted mean values for North American children were lower than those for Chinese children, and when BSA is high, predicted mean values for North American children were higher than those for Chinese children. This indicates that using predictive regression models based on North American pediatric reference values may over-or underestimate coronary artery measurement Z scores in the Chinese pediatric population, which would negatively impact CAD diagnosis and treatment.
We suggest that the D method regression model used for the North American pediatric population is unsuitable for use in the Chinese pediatric population. Factors such as region and race should be taken into account when incorporating automatic calculation functions into echocardiographic equipment and when selecting predictive regression models for clinical applications. The SZ method regression model established during the present study may be more suitable for use in the Chinese pediatric population.
However, we acknowledge certain study limitations. For example, determination coefficients (R 2 values) are marginally lower than those determined by previous studies, especially for the RCA [4,9]. Determination coefficients for the LCA, LAD artery, and RCA obtained using the D method were higher than those obtained using the SZ method. Two possible reasons may account for this observation: differences between multiple participating study sites, or an unbalanced cohort gender ratio. Furthermore, coronary artery diameters were too small to avoid errors in measurement when enlarging images, and it is challenging to ensure inter-site consistency during a multicenter study. Mitigation strategies for such limitations should be considered when designing future studies.
Furthermore, the study cohort was skewed toward males and included few subjects older than 12 years of age, which limits generalization of these data to children > 12 years of age.

Fig. 4 A, B, C Means ± 2 Z scores predicted by the SZ and D methods for LCA, LAD, and RCA. The solid black line represents values predicted by the SZ method, the dotted blue line represents values predicted by the D method, and the solid red vertical line marks the BSA at which the two predicted mean curves intersect, that is, the point where the SZ and D models yield the same Z score prediction for that specific BSA.