Comparison of Military Health System Data Repository and American College of Surgeons National Surgical Quality Improvement Program-Pediatric

Background: Given the rarity of pediatric surgical disease, it is important to consider available large-scale data resources as a means to better study and understand relevant disease processes and their treatments. The Military Health System Data Repository (MDR) includes claims-based information for > 3 million pediatric patients who are dependents of members and retirees of the United States Armed Services, but has not been externally validated. We hypothesized that demographics and selected outcome metrics would be similar between the MDR and the previously validated American College of Surgeons National Surgical Quality Improvement Program-Pediatric (NSQIP-P) for several common pediatric surgical operations.
Methods: We selected five commonly performed pediatric surgical operations: appendectomy, pyeloplasty, pyloromyotomy, spinal arthrodesis for scoliosis, and facial reconstruction for cleft palate. Among children who underwent these operations, we compared demographics (age, sex, and race) and clinical outcomes (length of hospital stay [LOS] and mortality) in the MDR and NSQIP-P, including all available overlapping years (2012–2014).
Results: Age, sex, and race were generally similar between the NSQIP-P and MDR for appendectomy (NSQIP-P, n = 20,602 vs. MDR, n = 4363; median age 11 vs. 12 years; female 40% vs. 41%; white 75% vs. 84%), pyeloplasty (NSQIP-P, n = 786 vs. MDR, n = 112; median age 0.9 vs. 2 years; female 28% vs. 28%; white 71% vs. 80%), pyloromyotomy (NSQIP-P, n = 3827 vs. MDR, n = 227; median age 34 days vs. < 1 year; female 17% vs. 16%; white 76% vs. 89%), scoliosis surgery (NSQIP-P, n = 5743 vs. MDR, n = 95; median age 14.2 vs. 14 years; female 75% vs. 67%; white 72% vs. 75%), and cleft lip/palate repair (NSQIP-P, n = 6202 vs. MDR, n = 749; median age 1 vs. 1 year; female 42% vs. 45%; white 69% vs. 84%). LOS and 30-day mortality were also similar between datasets.
Conclusion: For the selected common pediatric surgical operations, patients included in the MDR were comparable to those included in the validated NSQIP-P. The MDR may comprise a valuable clinical outcomes research resource, especially for studying infrequent diseases with follow-up beyond the 30-day peri-operative period.


Background
Analysis of large medical databases holds substantial promise for improving healthcare delivery, including potential contributions to predictive modeling, surveillance, and other health improvement initiatives [1]. Likewise, national databases aggregate factors that otherwise might be too rare for meaningful analysis. However, the usefulness of such databases depends on generalizability and suitability as a representative sample [1], which can be evaluated in part by comparison against an existing validated resource.
One such validated database for pediatric surgical research is the National Surgical Quality Improvement Program-Pediatric (NSQIP-P) database, overseen by the American College of Surgeons in conjunction with the American Pediatric Surgical Association. This database collects 94 data points on surgical patients under 18 years of age, with additional data points for neonates under 30 days of age, and follows patient outcomes for 30 days post-operatively [2].
Also potentially useful for pediatric surgical research is the claims database maintained by the United States Military Health System (MHS), known as the MHS Data Repository (MDR). The MHS includes a universally insured population of 9.4 million military and civilian beneficiaries [3], including 3 million children, and has been called "America's 'undiscovered' laboratory for health services research" [4]. The beneficiary population is considered to be demographically representative of the adult U.S. population from age 18-64 years [5][6][7]. To date, however, its generalizability to the pediatric population has not been evaluated. Therefore, the aim of this research is to compare demographics and selected outcomes (mortality, length of stay, and readmission) in the MDR with those in a validated resource, the NSQIP-P database, for five common pediatric surgical procedures, and to highlight the relative advantages and disadvantages of each resource. Findings of this research will help to evaluate the utility of the MDR as a tool for population-level research in pediatric procedures.

Study population
This study included data from two sources: 1) the United States Department of Defense Military Health System Data Repository (MDR) and 2) the American College of Surgeons National Surgical Quality Improvement Program-Pediatric (NSQIP-P). The MDR is a claims database with records of healthcare delivered between 2005 and 2014 to over 3 million children who are dependents of active-duty personnel, retired service members, activated members of the National Guard and Reserve, civilian dependents of included personnel, and survivors or others entitled to care from the Department of Defense [4]. The MHS is separate from both the care provided for soldiers in combat zones and the Veterans Health Administration. Children are included in the MDR as dependents eligible for military-based insurance, and their follow-up extends until their parent(s) or guardian(s) leave active duty without retiring or until 18 years of age or college graduation. A summary of other defining characteristics of the MDR is available in previously published work [8].
The NSQIP-P is a prospective clinical database that includes data from 54 participating hospitals in North America; the data are abstracted by trained surgical clinical reviewers [9]. Patients under the age of 18 years who underwent selected general, neurosurgical, urological, otolaryngologic, plastic, and orthopedic procedures are eligible for selection by systematic sampling on an 8-day cycle. Only overlapping years, specifically 2012 to 2014, were included from both data sources.
We included five commonly performed operations across multiple disciplines that were captured by both the MDR and the NSQIP-P (2012-2014): appendectomy, pyeloplasty, pyloromyotomy, spinal arthrodesis for scoliosis, and facial reconstruction for cleft lip/palate (i.e. repair/reconstruction of unilateral/bilateral cleft lip/nasal deformity or palatoplasty). International Classification of Diseases, Ninth Revision (ICD-9) diagnosis and procedure codes and Current Procedural Terminology (CPT) codes were used to identify patients who underwent the above procedures (Additional file 1).
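The code-based cohort selection described above can be illustrated with a minimal sketch. The CPT codes shown are common codes for appendectomy and pyloromyotomy chosen only for illustration; they are not the study's code list, which is given in Additional file 1 and is not reproduced here. The claim record layout, the function name `identify_cohort`, and the age cut-off of under 18 years applied to both sources are likewise assumptions.

```python
# Illustrative CPT codes only; the study's actual list is in Additional file 1.
ILLUSTRATIVE_CPT = {
    "appendectomy": {"44950", "44960", "44970"},
    "pyloromyotomy": {"43520"},
}

# Overlapping years available in both data sources (2012 to 2014 inclusive).
STUDY_YEARS = range(2012, 2015)


def identify_cohort(claims, operation):
    """Return the set of patient IDs with a qualifying claim for `operation`.

    `claims` is a hypothetical list of dicts with keys
    'patient_id', 'year', 'age' (years), and 'cpt' (list of code strings).
    """
    codes = ILLUSTRATIVE_CPT[operation]
    patients = set()
    for claim in claims:
        in_window = claim["year"] in STUDY_YEARS
        is_child = claim["age"] < 18  # assumed pediatric cut-off
        has_code = not codes.isdisjoint(claim["cpt"])
        if in_window and is_child and has_code:
            patients.add(claim["patient_id"])
    return patients


# Toy claims, invented for this example.
toy_claims = [
    {"patient_id": "A", "year": 2013, "age": 11, "cpt": ["44970"]},
    {"patient_id": "B", "year": 2011, "age": 9, "cpt": ["44950"]},   # outside study years
    {"patient_id": "C", "year": 2014, "age": 0, "cpt": ["43520"]},
    {"patient_id": "D", "year": 2012, "age": 19, "cpt": ["44950"]},  # not pediatric
]

appendectomy_cohort = identify_cohort(toy_claims, "appendectomy")
pyloromyotomy_cohort = identify_cohort(toy_claims, "pyloromyotomy")
```

A real extraction would also match ICD-9 diagnosis and procedure codes, which this sketch omits for brevity.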
Demographic data included patient age, sex, and race (Asian, African American, white, other, and unknown). For dependents in the MDR with missing race data, race was assigned using the corresponding sponsor's value [10]. Outcomes included length of hospital stay and all-cause mortality occurring during the hospitalization or follow-up. Follow-up extended 30 days post-operatively for NSQIP-P and 30 days post-discharge, 90 days post-discharge, and beyond (until last follow-up) in the MDR.
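The sponsor-based race assignment for MDR dependents amounts to a simple fallback rule, sketched below. The function name `assign_race` and the treatment of a record with no race recorded for either the dependent or the sponsor as "unknown" are assumptions made for illustration; the source states only that the sponsor's value was used when the dependent's race was missing.

```python
from typing import Optional


def assign_race(dependent_race: Optional[str], sponsor_race: Optional[str]) -> str:
    """Return the dependent's recorded race, else the sponsor's, else 'unknown'."""
    for value in (dependent_race, sponsor_race):
        # Treat None and blank strings as missing.
        if value is not None and value.strip():
            return value.strip()
    return "unknown"  # assumed category when both values are missing


# Hypothetical examples.
own_value = assign_race("Asian", "white")      # dependent's own value wins
from_sponsor = assign_race(None, "white")      # falls back to the sponsor
neither = assign_race("", None)                # nothing recorded for either
```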
For each of the five aforementioned operations, we tabulated demographics and outcomes separately for each data source. Missing data were handled using a complete case approach. We did not conduct statistical tests or provide p-values comparing the two large databases, because the goal of the study was to describe the resources rather than to test a specific hypothesis. All analyses were performed using SAS v9.3 (SAS Institute, Inc., Cary, NC). This research was considered exempt by the institutional review boards of the Uniformed Services University of the Health Sciences and Partners Healthcare.
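As a rough illustration of a complete case tabulation, the sketch below drops any record with a missing demographic field and then summarizes the remainder. It is written in Python rather than SAS, and it is not the authors' code: the record layout, the function name `summarize`, and the choice to require all three fields to be present (rather than handling each variable separately) are assumptions.

```python
from statistics import median


def summarize(records):
    """Complete-case summary: drop records with any missing field, then tabulate."""
    fields = ("age", "sex", "race")
    complete = [r for r in records if all(r.get(f) is not None for f in fields)]
    n = len(complete)
    if n == 0:
        return {"n": 0}
    return {
        "n": n,
        "median_age": median(r["age"] for r in complete),
        "pct_female": round(100 * sum(r["sex"] == "F" for r in complete) / n, 1),
        "pct_white": round(100 * sum(r["race"] == "white" for r in complete) / n, 1),
    }


# Toy records for one operation in one data source, invented for this example.
toy_records = [
    {"age": 11, "sex": "F", "race": "white"},
    {"age": 13, "sex": "M", "race": "white"},
    {"age": 9, "sex": "M", "race": None},  # incomplete, so excluded
    {"age": 12, "sex": "F", "race": "Asian"},
]

summary = summarize(toy_records)
```

Running the same function once per operation and per data source would give the side-by-side descriptive layout the study reports, with no hypothesis tests applied.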

Results
A total of five procedures were assessed: appendectomy, pyeloplasty, pyloromyotomy, scoliosis operations, and cleft lip/palate repair. Overall, NSQIP-P had a greater number of patients undergoing each procedure. Across both data sources combined, overall mortality was low, with 1 death reported following 24,965 appendectomies, 9 following 5838 scoliosis operations, and 1 following 6951 cleft lip/palate repairs. There were no deaths following pyloromyotomy (n = 4054) or pyeloplasty (n = 898). Results stratified by operation are described in detail below.
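The pooled denominators quoted above are the sums of the per-source counts given in the abstract, which the short sketch below reproduces. The crude mortality per 10,000 operations is derived arithmetic added here for orientation only; it is not a figure reported by the authors, and the variable names are illustrative.

```python
# Per-source counts (NSQIP-P, MDR) as reported in the abstract.
counts = {
    "appendectomy": (20602, 4363),
    "pyeloplasty": (786, 112),
    "pyloromyotomy": (3827, 227),
    "scoliosis": (5743, 95),
    "cleft lip/palate": (6202, 749),
}

# Deaths across both sources combined, as reported in the Results.
deaths = {
    "appendectomy": 1,
    "pyeloplasty": 0,
    "pyloromyotomy": 0,
    "scoliosis": 9,
    "cleft lip/palate": 1,
}

# Pooled number of operations per procedure.
pooled = {op: nsqip + mdr for op, (nsqip, mdr) in counts.items()}

# Derived crude mortality per 10,000 operations (not reported in the source).
rate_per_10k = {op: 10000 * deaths[op] / pooled[op] for op in pooled}

for op in pooled:
    print(f"{op}: n = {pooled[op]}, deaths = {deaths[op]}, "
          f"crude mortality = {rate_per_10k[op]:.1f} per 10,000")
```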

Discussion
In this study comparing basic characteristics and outcomes of pediatric patients undergoing common procedures in the NSQIP-P and MDR, we found that age and sex were similar overall, as were mortality, length of hospital stay, and readmission rate. Race distribution for each procedure was similar between the two databases, with the MDR generally containing a slightly higher proportion of white patients, although the distribution varied between procedures. Based on these comparisons, we review the advantages and disadvantages of each database, in order to help guide researchers toward the most suitable resource for a given analysis.
Two important advantages of the MDR relative to NSQIP-P are follow-up duration and completeness of capture. First, the MDR does not limit outcomes to 30 days, whereas the NSQIP-P limits follow-up to 30 days post-operatively. For some outcomes, 30 days may be adequate follow-up; for others, it may be advantageous to study longer-term endpoints. For example, 90-day readmission rates were often notably higher than 30-day readmission rates in the MDR. Second, as (non-sampled) claims data, the MDR captures 100% of operations. In contrast, the NSQIP-P intentionally captures only a select number of operations in order to maximize sampling efficiency.
There are also important disadvantages to the MDR, compared with the NSQIP-P. The NSQIP-P is specifically designed as a quality improvement initiative, whereas the data in the MDR are captured as claims. As a clinical database, NSQIP-P captures specific post-operative outcomes, including surgical site infections, need for mechanical ventilation, pneumonia, and others. These outcomes have been validated in the NSQIP-P and may not be as reliably captured in the MDR, unless a specific reimbursement code was applied to the event.
As such, if the MDR is to be used, it is important to choose objective outcomes whenever possible; if the question of interest involves clinical endpoints captured by the NSQIP-P, the latter dataset may be preferable. Additionally, NSQIP-P tracks infants under 30 days old, whereas the MDR poses a challenge in tracking infants under 1 year of age, due to the lag time in establishing them in the database as new patients with unique identifiers. Furthermore, age is documented in integer years in the MDR, whereas the NSQIP-P records age in days for neonatal patients. Given these differences, for neonatal patients, the NSQIP-P may be a preferable resource.
Finally, although both NSQIP-P and the MDR are updated on a yearly basis, NSQIP-P data are more readily accessible by the medical community, being provided at no charge to researchers and participating hospitals. In contrast, access to the MDR is free but controlled by Federal oversight and requires a more extensive application process.
This study has several strengths, such as the inclusion of surgical procedures with markedly different rates of use and occurring in different body systems. Both databases draw from sufficiently large populations so as to capture infrequent procedures, and both have demonstrated suitability as tools for health services research. Therefore, comparing the MDR and the NSQIP-P provides useful information on the scope of these databases as well as the difference between them.
There are also important limitations to consider. As described above, the MDR is a claims database that is limited in its capture of outcomes and may not address subtler clinical findings. In addition, the number of operations captured differs between the databases, with the NSQIP-P containing many-fold more operations than the MDR. This may occur because of the different inclusion criteria for each database. Specifically, the NSQIP-P systematically samples patients who undergo selected operations at participating institutions. Conversely, the MDR contains a large cohort of children insured via TRICARE, and can be interrogated using claims codes to identify all patients who have undergone a given operation over time. Finally, this study assessed only a few notable outcomes (demographics, length of stay, mortality, and readmissions), which together serve as a strong foundation for comparison of the two databases, but which cannot constitute a full validation. Further research is needed to address each of these issues.

Conclusions
In conclusion, the MDR was found to be comparable to the NSQIP-P in several areas including patient demographics and several clinical outcomes following five common pediatric surgical operations. Additional comparison to other standard databases with other populations will further establish the MDR as a tool for robust health services research relevant to the general United States population.

Additional file
Additional file 1: Table S1. ICD-9 and CPT codes for included procedures and diagnoses.