Risk-stratification in febrile infants 29 to 60 days old: a cost-effectiveness analysis

Background Multiple clinical prediction rules have been published to risk-stratify febrile infants ≤60 days of age for serious bacterial infection (SBI), which is present in 8-13% of infants. We evaluate the cost-effectiveness of strategies to identify infants with SBI in the emergency department. Methods We developed a Markov decision model to estimate outcomes in well-appearing, febrile term infants, using the following strategies: Boston, Rochester, Philadelphia, Modified Philadelphia, Pediatric Emergency Care Applied Research Network (PECARN), Step-by-Step, Aronson, and clinical suspicion. Infants were categorized as low risk or not low risk using each strategy. Simulated cohorts were followed for 1 year from a healthcare perspective. Our primary model focused on bacteremia, with secondary models for urinary tract infection and bacterial meningitis. One-way, structural, and probabilistic sensitivity analyses were performed. The main outcomes were SBI correctly diagnosed and incremental cost per quality-adjusted life-year (QALY) gained. Results In the bacteremia model, the PECARN strategy was the least expensive strategy ($3671, 0.779 QALYs). The Boston strategy was the most cost-effective strategy and cost $9799/QALY gained. All other strategies were less effective and more costly. Despite low initial costs, clinical suspicion was among the most expensive and least effective strategies. Results were sensitive to the specificity of selected strategies. In probabilistic sensitivity analyses, the Boston strategy was most likely to be favored at a willingness-to-pay threshold of $100,000/QALY. In the urinary tract infection model, PECARN was preferred compared to other strategies and the Boston strategy was preferred in the bacterial meningitis model. Conclusions The Boston clinical prediction rule offers an economically reasonable strategy compared to alternatives for identification of SBI.

Balancing the costs of medical evaluation and treatment, particularly unnecessary hospitalization, with the risks of misdiagnosis poses a clinical challenge. Over the last 40 years, multiple clinical prediction rules have been published to identify a cohort of infants at low risk of SBI [11][12][13][14][15][16][17]. Older prediction rules require routine lumbar puncture and cerebrospinal fluid (CSF) testing [15][16][17]. More recent prediction rules do not require CSF testing for risk stratification and offer improved diagnostic accuracy [15][16][17][18]. These rules carry the potential to improve clinical outcomes, decrease variation in care, and reduce high costs associated with the evaluation and management of febrile infants [4,7]. The cost-effectiveness of published clinical prediction rules in this population is unknown. The decision to adopt a clinical prediction rule to evaluate febrile infants must be weighed against the cost and effectiveness of established risk-stratification strategies.
In this investigation, we use decision modeling techniques to evaluate the cost-effectiveness of different strategies to identify infants with SBI in the ED.

Study design
We created a decision-analytic Markov model to simulate a hypothetical cohort of infants 29-60 days old presenting to an ED using the following eight published risk-stratification strategies: Boston, Rochester, Philadelphia, Modified Philadelphia, Pediatric Emergency Care Applied Research Network (PECARN), Step-by-Step, Aronson, and clinical suspicion (Table 1) [11][12][13][14][15][16][17][18]. Selection of strategies was limited to those developed to identify infants at low risk of SBI and with published sensitivity and specificity data. All analyses used secondary data from the medical literature or from online US databases without identifiable patient information. The decision model was programmed in TreeAge Pro 2016 (TreeAge Software, Inc., Williamstown, MA).
Our base case was a 40-day-old term infant presenting to the ED with fever. A "base case" serves as the most likely scenario and lays the groundwork for model assumptions. All infants were assumed to be "well-appearing" and without localizing signs of infection. By selecting these characteristics, our base case infant met evaluation criteria and low-risk history criteria for each risk-stratification strategy. Infants < 29 days of age were excluded as most clinical prediction rules consider this age group to be inherently not low risk.
A Markov model consists of mutually exclusive "health states." Simulated individuals reside in one health state at a time and can transition between health states as designated in the model. For this study, we considered five health states: 1) well, 2) SBI with medical treatment, 3) misdiagnosed SBI with no medical treatment, 4) misdiagnosed well infant with medical treatment, and 5) death. Infants in the simulated population were initially categorized as low risk or not low risk using each strategy. Infants with SBI who were misidentified as low risk had an increased risk of death above baseline [29][30][31][32].
Those remaining alive underwent reevaluation and medical treatment. Infants who underwent medical treatment were assumed to make a full recovery. Infants without SBI who were miscategorized as not low risk were assumed to undergo hospitalization without complication. Low-risk criteria were defined according to each clinical prediction rule (Table 1). For clinical suspicion, low risk was defined as a risk of SBI < 1% as determined by the treating physician and was assumed to be assigned after history and physical exam but before obtaining diagnostic testing [18]. The threshold of 1% was selected based on a previous study [18]. For the Aronson strategy, we used a score < 2 to define low risk [14]. For the Boston strategy, empiric ceftriaxone administration for all low-risk infants was assumed to be protective against worsening infection and death [16].
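The cohort structure described above can be illustrated with a minimal Python sketch. It is not the TreeAge model used in this study: the function names, the single-cycle transition logic, and every numeric input are hypothetical placeholders, and background mortality is applied uniformly to all non-missed states as a simplifying assumption.

```python
STATES = ("well", "sbi_treated", "sbi_missed", "well_treated", "dead")


def initial_distribution(prevalence, sensitivity, specificity):
    """Split a cohort across starting health states after risk stratification.

    "Not low risk" is treated as a positive result, so sensitivity is the
    share of infants with SBI who are flagged and therefore treated.
    """
    return {
        "well": (1 - prevalence) * specificity,                # true negatives
        "sbi_treated": prevalence * sensitivity,               # true positives
        "sbi_missed": prevalence * (1 - sensitivity),          # false negatives
        "well_treated": (1 - prevalence) * (1 - specificity),  # false positives
        "dead": 0.0,
    }


def step(dist, p_death_missed, p_death_background):
    """Advance the cohort by one (hypothetical) model cycle."""
    new = {state: 0.0 for state in STATES}
    new["dead"] = dist["dead"]  # death is an absorbing state
    # Well and treated infants recover fully, apart from background mortality.
    for state in ("well", "sbi_treated", "well_treated"):
        new["dead"] += dist[state] * p_death_background
        new["well"] += dist[state] * (1 - p_death_background)
    # Missed SBI carries an elevated risk of death; survivors are
    # re-evaluated and treated.
    new["dead"] += dist["sbi_missed"] * p_death_missed
    new["sbi_treated"] += dist["sbi_missed"] * (1 - p_death_missed)
    return new


# Illustrative inputs only; none of these values come from Table 1 or Table 2.
cohort = initial_distribution(prevalence=0.02, sensitivity=0.90, specificity=0.60)
after_one_cycle = step(cohort, p_death_missed=0.10, p_death_background=0.0)
```

Because the five states are mutually exclusive and exhaustive, the state proportions sum to 1 before and after each cycle, which is a convenient sanity check when building a model of this kind.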

Model input variables
Input parameters for probabilities, costs, and outcomes are presented in Tables 1 and 2. For each variable, we included an estimated 95% probability range. We conducted a review of published literature to identify rates of outcomes for febrile infants and to identify measures of diagnostic accuracy for prediction rules (Table 1). Probabilities of outcomes from misdiagnosed bacteremia were derived from previously published literature, with ranges that accounted for variation among sources and uncertainty given paucity of data in the post antibiotic and vaccine eras [31]. All-cause mortality was estimated using U.S. National Center for Health Statistics life tables [38].
Sensitivity and specificity of each clinical prediction rule for bacteremia, UTI, and bacterial meningitis were hand-calculated by two of the authors (KAN, SR) based on published data [11][12][13][14][15][16][17][18][19][20][21][22][23]. When more than one study for a prediction rule was identified, the mean values for calculated sensitivity and specificity were used. Range was based on 95% confidence interval for prediction rules with a single data source. For prediction rules with externally validated data, the range was broadened to include values from all calculated 95% confidence intervals. Costs included direct medical costs of ED visits, diagnostic testing, and medical treatment (Table 2). Costs of hospitalization were based on the mean value for each diagnosis for children less than a year of age, as reported by the Healthcare Cost and Utilization Project [37]. All costs were adjusted to 2016 US dollars based on the medical cost component of the Consumer Price Index [26]. We selected 2016 as this was the most recent year available for US inpatient healthcare data from the Healthcare Cost and Utilization Project [37].
Health state utilities were assigned a value of 0-1, with 0 equivalent to death and 1 representing perfect health [28]. Utility values associated with various outcomes were drawn from the literature [24,25,39,40]. When quality-of-life studies were not available for this age group, we used quality-of-life estimates from older populations. Infant mortality was factored as a lifetime disutility, meaning that the lifetime loss of quality-adjusted life-years (QALYs) for each death was factored into the model. All costs and utilities were discounted at 3% per year, as recommended by the second Panel on Cost-Effectiveness in Health and Medicine [41].
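As an illustration of how diagnostic accuracy inputs of this kind might be derived, the sketch below computes sensitivity and specificity from 2x2 counts, averages point estimates across studies, and widens the range to span every study's 95% confidence interval. The Wilson score interval is an assumption made for this example (the interval method is not specified here), and all counts and function names are hypothetical.

```python
import math


def accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 table counts."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score 95% interval for a proportion (assumed method)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def pooled_estimate(studies):
    """Mean point estimate plus a range spanning every study's interval.

    studies: list of (successes, total) pairs for one metric, for example
    (true positives, infants with SBI) when pooling sensitivity.
    """
    points = [s / n for s, n in studies]
    intervals = [wilson_interval(s, n) for s, n in studies]
    mean = sum(points) / len(points)
    low = min(lower for lower, _ in intervals)
    high = max(upper for _, upper in intervals)
    return mean, low, high


# Hypothetical counts from two studies of the same rule.
sens, spec = accuracy(tp=9, fn=1, tn=60, fp=40)
mean_sens, low_sens, high_sens = pooled_estimate([(9, 10), (18, 20)])
```

With a single data source the same function returns that study's own interval, which matches the single-source case described above.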
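The effect of discounting on a lifetime disutility can be shown with a short sketch. It assumes annual discounting with the first year undiscounted, and it uses a hypothetical 78 years at full utility; neither the timing convention nor the life expectancy is taken from the model inputs.

```python
def discounted_qalys(yearly_utilities, rate=0.03):
    """Present value of a stream of yearly utilities.

    Year 0 is left undiscounted; this timing convention is an assumption.
    """
    return sum(u / (1 + rate) ** year for year, u in enumerate(yearly_utilities))


# Hypothetical: QALYs forgone when a death removes 78 years at full utility.
lifetime_loss = discounted_qalys([1.0] * 78)
undiscounted_loss = discounted_qalys([1.0] * 78, rate=0.0)
```

Under these assumptions the discounted loss is about 31 QALYs rather than 78, which shows why the discount rate matters when deaths are valued as lifetime losses.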

Cost-effectiveness analysis
A cost-effectiveness analysis was conducted from a healthcare perspective, considering costs as they related directly to health expenditures, and run over a hypothetical one-year time horizon [41]. This differs from a societal perspective, which incorporates a comprehensive assessment of costs and benefits [41]. The primary outcomes evaluated in this study were cost, effectiveness (SBI accurately diagnosed and treated), and cost-effectiveness for each strategy. Strategies were ranked by cost then compared in terms of cost, effectiveness, and incremental cost-effectiveness ratio (ICER). The ICER measures added cost for additional benefit to a population, measured in QALYs, and reflects the value of an intervention. QALYs serve as a composite measure of morbidity and mortality. We assumed a willingness-to-pay of $100,000/QALY gained, a commonly cited threshold for the US healthcare system [41]. A strategy was dominated by another strategy if it was both more costly and less effective. Preferred strategies were those with the highest ICER that did not exceed the willingness-to-pay threshold. Secondary outcomes included hospitalizations, lumbar punctures, and deaths. Findings are expressed as costs, QALYs gained, and cost per QALY gained.
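The ranking procedure described above can be sketched as follows. Strategy names and values are hypothetical, and the extended-dominance step is a standard addition that is not described in this section; it is included so that ICERs increase along the frontier.

```python
def efficiency_frontier(strategies):
    """Rank strategies by cost and compute ICERs along the frontier.

    strategies: dict mapping name -> (cost, qalys).
    Returns a list of (name, cost, qalys, icer); the cheapest entry has icer None.
    """
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    frontier = []
    for name, (cost, qalys) in ordered:
        # Dominated: at least as costly as a kept strategy and no more effective.
        if frontier and qalys <= frontier[-1][2]:
            continue
        frontier.append((name, cost, qalys))

    # Extended dominance (assumed step): drop a strategy whose ICER exceeds
    # the ICER of the next, more effective strategy.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            icer_cur = (cur[1] - prev[1]) / (cur[2] - prev[2])
            icer_next = (nxt[1] - cur[1]) / (nxt[2] - cur[2])
            if icer_cur > icer_next:
                del frontier[i]
                changed = True
                break

    result = [(frontier[0][0], frontier[0][1], frontier[0][2], None)]
    for prev, cur in zip(frontier, frontier[1:]):
        icer = (cur[1] - prev[1]) / (cur[2] - prev[2])
        result.append((cur[0], cur[1], cur[2], icer))
    return result


def preferred(frontier, wtp=100_000):
    """Most effective frontier strategy whose ICER does not exceed the threshold."""
    choice = frontier[0][0]
    for name, _, _, icer in frontier[1:]:
        if icer <= wtp:
            choice = name
    return choice


# Hypothetical costs (US dollars) and QALYs per infant.
example = {
    "rule_a": (3000, 0.770),
    "rule_b": (3100, 0.780),
    "rule_c": (3500, 0.775),  # costlier and less effective than rule_b
    "rule_d": (9000, 0.781),
}
frontier = efficiency_frontier(example)
best = preferred(frontier)
```

In this made-up example, rule_c is dominated by rule_b, and rule_d is not preferred because its ICER relative to rule_b is far above $100,000/QALY.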

Sensitivity analyses
We conducted one-way sensitivity analyses to determine if varying any individual parameter across its listed range substantially changed results. Threshold analyses determined the point at which changes to certain input parameters (i.e., disease prevalence, sensitivity or specificity of each diagnostic strategy, or cost of medical management) resulted in a substantial change in the preferred strategy. Structural sensitivity analyses evaluated 1) the impact of empiric ceftriaxone administration in strategies that included testing of urine, blood, and CSF, and 2) the potential impact of contaminated cultures on the cost-effectiveness of each strategy. Probabilistic sensitivity analyses estimated the effect of uncertainties in each parameter. For the probabilistic sensitivity analyses, each variable was assigned a distribution of possible values. Distributions were chosen to reflect the level of certainty, the characteristics of the parameter range, and methodological standards. β distributions were used for probabilities and quality adjustments; γ distributions were used for costs. We then used the model to run 1000 simulations for each strategy. For each individual simulation, the model randomly selected a different value for each variable from its assigned distribution. Findings from the probabilistic sensitivity analysis are reported as cost-effectiveness acceptability frontier curves [42]. These curves show the probability that the cost-effectiveness of optimal strategies will be less than or equal to a given $/QALY amount and reflect uncertainty in the model.
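A probabilistic sensitivity analysis of this general form can be sketched with the Python standard library. The toy cost and QALY equations, the two rules, and every mean and standard deviation below are hypothetical and far simpler than the Markov model; only the sampling pattern (β for probabilities, γ for costs, 1000 draws) mirrors the description above. Distribution parameters are obtained by the method of moments, which is an assumption of this example.

```python
import random


def beta_params(mean, sd):
    """Method-of-moments (alpha, beta) for a beta distribution."""
    common = mean * (1 - mean) / sd**2 - 1
    return mean * common, (1 - mean) * common


def gamma_params(mean, sd):
    """Method-of-moments (shape, scale) for a gamma distribution."""
    return (mean / sd) ** 2, sd**2 / mean


# Hypothetical rules; none of these values come from Table 1 or Table 2.
RULES = {
    "rule_a": {"sens": 0.95, "sens_sd": 0.02, "spec": 0.60, "spec_sd": 0.03, "test_cost": 300},
    "rule_b": {"sens": 0.85, "sens_sd": 0.04, "spec": 0.80, "spec_sd": 0.03, "test_cost": 500},
}


def run_psa(n_iter=1000, wtp=100_000, seed=0):
    """Share of iterations in which each rule has the highest net monetary benefit."""
    rng = random.Random(seed)
    wins = dict.fromkeys(RULES, 0)
    for _ in range(n_iter):
        # Parameters shared by all strategies are drawn once per iteration.
        prevalence = rng.betavariate(*beta_params(0.02, 0.005))
        hosp_cost = rng.gammavariate(*gamma_params(10_000, 2_000))
        best_name, best_nmb = None, float("-inf")
        for name, rule in RULES.items():
            sens = rng.betavariate(*beta_params(rule["sens"], rule["sens_sd"]))
            spec = rng.betavariate(*beta_params(rule["spec"], rule["spec_sd"]))
            admitted = prevalence * sens + (1 - prevalence) * (1 - spec)
            missed = prevalence * (1 - sens)
            # Toy outcome equations (assumed): missed cases cost more and lose more QALYs.
            cost = rule["test_cost"] + hosp_cost * admitted + 3 * hosp_cost * missed
            qalys = 0.80 - 0.01 * admitted - 0.50 * missed
            nmb = wtp * qalys - cost
            if nmb > best_nmb:
                best_name, best_nmb = name, nmb
        wins[best_name] += 1
    return {name: count / n_iter for name, count in wins.items()}


shares = run_psa()
```

Repeating `run_psa` over a grid of `wtp` values and plotting the share for whichever rule has the highest expected net benefit at each threshold would give a curve analogous to an acceptability frontier.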

Bacteremia
In the base-case analysis, the PECARN strategy was the least expensive (with a cost of $3671, and a gain of 0.779 QALYs per individual). Compared to the PECARN strategy, the Boston strategy cost $9799/QALY gained. All other strategies were dominated (Table 3). One-way sensitivity analyses demonstrated that the model was sensitive to mortality risk after misdiagnosis, bacteremia prevalence, and the sensitivity and specificity of PECARN, Modified Philadelphia, and Rochester strategies (Table 4, Fig. 2).  In a structural sensitivity analysis, when empiric ceftriaxone administration was removed from the Boston strategy, effectiveness of the Boston strategy decreased to 0.771 QALYs per individual. The PECARN strategy became the preferred strategy and Boston was dominated. Addition of ceftriaxone to the Philadelphia strategy did not change model outcomes. Addition of contaminated cultures with presumptive reevaluation and treatment of affected patients did not change model outcomes. Comparative clinical outcomes in a hypothetical population are shown in Table 5.
Probabilistic sensitivity analysis results are summarized as cost-effectiveness acceptability frontier curves, showing the uncertainty associated with the optimal options, calculated using the net monetary benefit framework, over a range of willingness-to-pay (or acceptability) thresholds, as shown in Fig. 3. In this analysis, the Boston strategy was the preferred strategy when the willingness-to-pay was >$10,000/QALY. At a willingness-to-pay of $100,000/QALY, the Boston strategy was the more cost-effective option in 20% of model iterations. A scatter-plot comparing the incremental cost and incremental effectiveness of the Boston strategy to the PECARN strategy demonstrated that the Boston strategy was below the $100,000/QALY threshold in 67% of model iterations (Fig. 4).
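For readers less familiar with the net monetary benefit framework, the relationship between net monetary benefit and the ICER can be written out as below. The symbols are introduced here for illustration and are not taken from the original text: \lambda is the willingness-to-pay threshold, E_k and C_k are the expected QALYs and cost of strategy k, and \Delta denotes Boston minus PECARN.

```latex
\begin{align*}
\mathrm{NMB}_k(\lambda) &= \lambda\,E_k - C_k \\
\mathrm{INMB}(\lambda) &= \lambda\,\Delta E - \Delta C \\
\text{for } \Delta E > 0:\quad \mathrm{INMB}(\lambda) > 0
  &\iff \frac{\Delta C}{\Delta E} < \lambda
\end{align*}
```

Under this notation, a simulated iteration counts as favoring the more effective strategy at a given threshold when its incremental net monetary benefit is positive, which is equivalent to its incremental cost-effectiveness ratio lying below \lambda.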

UTI
In the UTI model, the PECARN strategy was the least expensive strategy ($3422, 0.842 QALYs). All other strategies were more expensive and less effective (Table 6).
One-way sensitivity analyses demonstrated that the Step-by-Step strategy would be preferred if it had a specificity for UTI greater than 58% or if specificity of the PECARN strategy was less than 56%. Clinical suspicion was preferred if it had a specificity greater than 67%. Probabilistic sensitivity analyses indicated that at a threshold of $100,000/QALY, the PECARN strategy was preferred in 48% of model iterations.

Meningitis
For bacterial meningitis, the PECARN strategy was least expensive, and the Boston strategy was preferred (Table 7). One-way sensitivity analyses demonstrated that this model was sensitive to the specificity of each clinical prediction rule. The PECARN strategy was favored if its sensitivity for identifying bacterial meningitis was greater than 99.8%. Probabilistic sensitivity analysis indicated that at a threshold of $100,000/QALY, the Boston strategy was preferred in 45% of model iterations.

Discussion
We performed a cost-effectiveness analysis to compare commonly cited strategies for risk stratification in the evaluation of febrile infants, finding that Boston and PECARN strategies provided economically reasonable risk-stratification strategies compared to other published clinical prediction rules. Models for each type of SBI varied with respect to treatment costs and health risks after misdiagnosis; the PECARN strategy was favored in the UTI model while the Boston strategy was increasingly cost-effective with higher risk infection types. We found that the Boston strategy was the most cost-effective strategy in both the bacteremia and bacterial meningitis models. Despite lower sensitivity compared to other strategies, the protective effect offered by empiric ceftriaxone and the cost-savings introduced by a higher specificity outweighed the disutility and costs associated with universal lumbar puncture and CSF testing.
Alternatively, the PECARN strategy, which does not require CSF testing or empiric antibiotic administration, was an economical option in both models and may provide benefits not measured in this study, depending on individual risk tolerance or preferences.
We found that in the UTI model, most strategies had a sensitivity ≥90% and the risks associated with delayed antibiotics were less substantial than in the other models. Because of this, the benefits of empiric ceftriaxone had a smaller impact, and the model became more sensitive to the costs and disutility associated with admitting patients who were ultimately not diagnosed with UTI.
We found that there were no scenarios in which application of clinical suspicion alone was economically reasonable. In this strategy, infants with fever were assumed to undergo no diagnostic testing if the treating physician assessed their risk to be < 1%. Prior studies investigating variation in infant fever management have noted that a proportion of young febrile infants are discharged from pediatric EDs without additional testing [4]. Our model suggests that the benefits of decreased upfront resource utilization are outweighed by the increased risk and associated costs for the few missed infants with SBI.
To our knowledge, this is the first cost-effectiveness analysis of risk-stratification of febrile infants to evaluate the most recently reported clinical prediction rules for febrile infants. Lieu et al. demonstrated the benefits of outpatient management of low-risk infants with ceftriaxone using Boston and Philadelphia criteria [29]. In their study, sensitivity analyses indicated that treatment of low-risk infants with ceftriaxone would not be the preferred strategy if an alternative diagnostic strategy had ≥97% sensitivity. In our sensitivity analysis, the PECARN strategy would be preferred over the Boston strategy if its sensitivity were ≥98%. Our study builds on prior work by considering the impact of modern disease prevalence and epidemiology, as well as prediction rules with improved diagnostic accuracy.
Fig. 4 Probabilistic sensitivity analysis, PECARN vs Boston, bacteremia model. Results of the probabilistic sensitivity analysis represented as a scatterplot of the incremental cost (x-axis) and incremental effectiveness (y-axis) of the Boston strategy as compared to the PECARN strategy. The willingness-to-pay threshold of $100,000/quality-adjusted life-year is indicated by the diagonal dashed line.
By simulating the experience of large patient populations, Markov analyses identify strategies that benefit the most individuals most often and lend themselves to broader interpretations. Our study findings may also inform shared decision-making discussions. We found that most strategies have similar effectiveness (Table 3). Strategies that require cerebrospinal fluid testing or result in more frequent hospitalization are more costly but may provide benefits not measured in this study, depending on individual risk tolerance or preferences. One of the strengths of this study is the separate consideration of the three most common serious bacterial infections in infants. While their presentations can be similar and prior studies have analyzed them as a group, the prognosis and the consequences of misdiagnosis for each are substantially different, particularly for bacteremia and bacterial meningitis [8][30][31][32][33][34]. Recent investigators have also attempted to separate UTI from bacteremia and bacterial meningitis, using the term invasive bacterial infection for the latter [4][11][12][13][14]. In this study, developing disease-specific models allowed for a better understanding of how each rule performed across the spectrum of disease, from the low risk and low costs of UTI to the high risk and high costs of bacterial meningitis. The cost utilization of a new consensus guideline provided by the American Academy of Pediatrics is an additional consideration; at the time of this publication, no study has validated this decision rule to determine its performance characteristics [43].
Our findings are subject to limitations. Older prediction rules, such as the Philadelphia, Rochester and Boston criteria, were developed and validated during a period when invasive bacterial infection had a higher incidence [44]. In contrast, some recent rules, such as the Modified Philadelphia, Aronson, and Step-by-Step criteria, may be more reflective of present-day epidemiology but lack external validation. Local practices often do not strictly adhere to published protocols and our model was limited to strategies with published data [4]. We attempted to account for these factors by examining sensitivity and specificity ranges across their calculated 95% confidence intervals in sensitivity analyses. We did not consider the impact of increased outpatient visits associated with a larger proportion of infants categorized as low-risk. Given that our sensitivity analysis did not demonstrate that the model was sensitive to either the large costs of hospitalization or the relatively low costs of CSF testing, it is unlikely that the additional cost of an outpatient visit would change model outcomes.
We adapted utility values from the literature. Infant health state utilities are poorly defined and understudied [45]. It is possible that an older individual's experience with bacterial infection, lumbar puncture, or hospitalization is different from that of an infant. For this reason, selected utility values were varied over wide ranges. Varying these values did not change favored strategies.
We used a healthcare perspective and, as such, did not evaluate the perspective of families and caregivers, costs of missed or lost employment, or the disutility of caring for an ill child either in the hospital or at home. This, in turn, could bias results toward or against rules associated with higher hospitalization rates. We did not account for inherent risks of hospitalization and medical interventions, including iatrogenic complications and nosocomial infections, and subsequent associated costs. However, these would only increase the costs associated with strategies that require more frequent hospitalization and thereby not change the ultimate findings in our study. Medical costs and willingness-to-pay were based on the U.S. healthcare system and assumptions about the simulated population were drawn primarily from U.S.-based data. As such, it is difficult to apply this model to more resource-limited settings. Despite these limitations, this study demonstrates the value associated with application of clinical prediction rules in the emergency setting, and how we can effectively and efficiently evaluate young febrile infants from the perspective of clinicians and health systems.

Conclusion
In this cost-effectiveness analysis evaluating strategies for the risk-stratification of young febrile infants, we found that the Boston and PECARN clinical prediction rules are economically reasonable strategies compared to alternative strategies when considering outcomes of UTI, bacteremia, and bacterial meningitis. The Boston strategy was more effective and economically reasonable for bacteremia and bacterial meningitis, whereas the PECARN strategy was preferred in UTI. Our findings highlight the benefits of risk-stratification strategies that avoid potentially unnecessary hospitalizations, either with empiric antibiotic treatment or by maximizing sensitivity and specificity of the initial evaluation.