Comparison of three classes of Marginal Risk Set Model in predicting infant mortality among newborn babies at Kigali University Teaching Hospital, Rwanda, 2016

Background: The infant mortality rate (IMR) in Sub-Saharan Africa (SSA) remains the highest in the world. In the past decade, policies on reducing infant mortality in SSA were reinforced, and both infant mortality and parental death decreased markedly in some SSA countries. Analysing the risk of death or of acquiring a chronic disease may help medical practitioners and decision makers to better prevent infant mortality.
Methods: This study uses popular statistical re-sampling methods and one selected model for multiple events analysis to measure the survival outcomes of infants born in 2016 at Kigali University Teaching Hospital (KUTH) in Rwanda, a country of SSA, given maternal and child socio-economic and clinical covariates. The dataset comprises the newborns with correct information on the covariates of interest. The Bootstrap Marginal Risk Set Model (BMRSM) and the Jackknife Marginal Risk Set Model (JMRSM) were fitted on the available maternal and child socio-economic and clinical covariates, and their outcomes were compared with those of the Marginal Risk Set Model (MRSM) in order to measure the stability of the MRSM.
Results: A total of 2117 newborns had correct information on all the covariates; 82 babies died during the study period, 69 were stillborn and 1966 were censored. The BMRSM, JMRSM and MRSM gave close results for the significant covariates. In some instances the BMRSM displayed relatively higher standard errors for non-significant covariates, which emphasized their insignificance in the MRSM. The models revealed that female babies survive better than male babies. The risk is higher for babies whose parents are under 20 years old than for the other parental age groups; the risk decreases as the APGAR score increases, is lower for underweight babies than for babies with normal weight or overweight, and is lower for babies with a normal head circumference than for those with a relatively small head.
Conclusion: The results of the JMRSM were closer to those of the MRSM than those of the BMRSM. Newborns of mothers aged less than 20 years were at relatively higher risk of dying than those whose mothers were aged 20 years and above. Abnormal weight and head circumference increased the risk of infant mortality. Avoiding teenage pregnancy and providing clinical care, including an adequate dietary intake during pregnancy, would reduce the IMR in Kigali.


Background
The discrepancy in IMR and life expectancy between SSA and the other parts of the world has attracted several researchers. A World Bank report of 2011 pointed out that the IMR was 75/1000 in SSA versus 11/1000 in developed countries [1]. The same report pointed out that half of the ten million children who die every year are in SSA. The World Bank dataset from 1960 to 2005 suggests that low life expectancy at birth is more pronounced in Middle Africa than in the other sub-regions of SSA [2]. The World Bank records of 2017 indicated that the IMR was 51.50/1000 in SSA [3]: the Central African Republic had the highest IMR (87.60/1000), Mauritius had the lowest (11.60/1000), and the IMR in Rwanda was 28.90/1000. Several studies on factors that could lower infant mortality have been conducted and recommendations have been made, but the IMR remains a problem in SSA.
The multiple events model for infant mortality at the Kigali University Teaching Hospital analysed in [4] leaves open the question of whether the adopted model is stable. The main causes of instability may be correlation among the covariates or a relatively small sample size [5]. One way of assessing instability in survival regression models is the use of re-sampling techniques [6]. The analysis in [4] is a non-resampled model based on the primary dataset of the year 2016. The two observable events per subject are death and the occurrence of at least one of the common conditions that may also cause the long-term death of infants. It was found that the Marginal Risk Set Model (MRSM), also known as the Wei, Lin and Weissfeld Model (WLWM), fitted the data well. The WLWM is among the multiplicative methods for analysing ordered events found in [7]. Other multiplicative models include the Andersen-Gill Model (AGM) and the Prentice, Williams and Peterson Model (PWPM) [8].
The present study uses two popular nonparametric re-sampling methods, namely the bootstrap, which is based on random samples drawn with replacement [9], and the jackknife, which is based on leaving out one observation at a time [9]. The sample in [4] comprises 2117 newborns recorded in the year 2016. Long-term results could be assumed according to the stability potentially observed after re-sampling. Most manuscripts on re-sampling in survival analysis are limited to the re-sampled Cox proportional hazards model and to estimating standard errors of the survival and hazard functions, such as [6, 10-13], where the bootstrap is involved; [13-16], in which the jackknife is used; or [17-22], where hazard and survival functions with their respective standard errors are of interest. The present study analyses the bootstrap-based MRSM with 1000 replicates and the jackknife-based MRSM. The results are then compared to those of the MRSM.

Dataset
The time-to-event data of 2117 newborns at KUTH were recorded from 1 January to 31 December 2016. At KUTH, all newborns are recorded in registries with full details of the parents and the clinical outcomes of each newborn. The registry provides references to card indexes that contain information on the clinical condition of babies after leaving the hospital. KUTH, the site of interest in this study, is a central hospital to which most complicated childbirths countrywide are transferred. In 2016, KUTH recorded a relatively high incidence of stillbirths (69 stillborn babies, or 3.259%) and a relatively high infant mortality rate (3.873%). Table 1 summarises the information on newborns at KUTH during the study period.
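The percentages quoted above can be checked directly from the counts reported for the cohort (2117 newborns, 82 deaths, 69 stillbirths). The short Python check below is only an illustrative sketch; the variable names are hypothetical and are not taken from the study's analysis code.

```python
# Counts reported for the 2016 KUTH cohort (taken from the text).
total_newborns = 2117
deaths = 82
stillborn = 69

# Subjects with neither event are treated as censored.
censored = total_newborns - deaths - stillborn

# Percentages relative to all newborns in the dataset, rounded to 3 decimals.
stillborn_pct = round(100 * stillborn / total_newborns, 3)
death_pct = round(100 * deaths / total_newborns, 3)

print(censored, stillborn_pct, death_pct)
```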
The study is concerned with subjects with correct information on the covariates of interest. Two events per subject are observed, namely death and the incidence of at least one chronic disease or complication such as severe oliguria, severe prematurity, very low birth weight, macrosomia, severe respiratory distress, gastroparesis, hemolytic, trisomy, asphyxia and laparoschisis. Apart from the event status and the time to event, 11 covariates are recorded. They are subdivided into demographic covariates, which include the age and the place of residence of the parents; clinical covariates for mothers, which include obstetric antecedents, type of childbirth and previous abortion; and clinical covariates for babies, which include APGAR, gender, number of births at a time, weight, circumference of the head and height. Table 2 gives a description of the variables of interest.

Marginal risk set model
Assume that $h(t \mid x_i)$ is the hazard function of the survival time $T$ given the $p$ fixed covariates $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})$. Let $h_0(t)$ be the hazard function when $x_i = (0, 0, \dots, 0)$ for all $i$; then
$$h(t \mid x_i) = h_0(t) \exp(\beta' x_i), \tag{1}$$
where $\beta = (\beta_1, \beta_2, \dots, \beta_p)'$ is a $p$-dimensional vector of model parameters [23]. Define an indicator function as
$$\delta_{ij}(t) = \begin{cases} 1 & \text{if individual } i \text{ is at risk of the } j\text{th event,} \\ 0 & \text{otherwise.} \end{cases}$$
The marginal risk set model (MRSM), or Wei, Lin and Weissfeld Model (WLWM), assumes that events are unordered: each event has its own stratum and each data point appears in all strata [4, 24]. In other words, the $k$th time interval per subject is in the $k$th stratum, $k = 1, 2, \dots, n$.
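To illustrate the layout implied by "each data point appears in all strata", the following Python sketch expands per-subject records into one row per event stratum. It is a minimal sketch under stated assumptions: the `Subject` record, the `to_wlw_rows` helper, the toy follow-up times and the numbering of the two strata (1 = chronic condition, 2 = death) are all hypothetical and not taken from the study, and a subject without an observed event in a stratum is assumed to be censored at the end of follow-up.

```python
from typing import NamedTuple


class Subject(NamedTuple):
    """Hypothetical per-subject record (not the study's data format)."""
    subject_id: int
    follow_up: float      # end of observation, in days from birth
    event_times: tuple    # time of each event type, None if not observed


def to_wlw_rows(subjects, n_events):
    """Return one row per subject and per event stratum (marginal layout)."""
    rows = []
    for s in subjects:
        for j in range(n_events):
            t = s.event_times[j]
            observed = t is not None
            rows.append({
                "id": s.subject_id,
                "stratum": j + 1,
                # Time is always measured from study entry, in every stratum.
                "time": t if observed else s.follow_up,
                "status": int(observed),
            })
    return rows


# Toy data: subject 1 has a chronic condition at day 30 and no death;
# subject 2 experiences neither event during follow-up.
toy_subjects = [
    Subject(1, 365.0, (30.0, None)),
    Subject(2, 365.0, (None, None)),
]
toy_rows = to_wlw_rows(toy_subjects, n_events=2)
for row in toy_rows:
    print(row)
```

Every subject contributes exactly `n_events` rows, which is what distinguishes this marginal layout from conditional layouts in which a subject enters a later stratum only after the previous event.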
The hazard function for the $j$th event for individual $i$ is given by
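A standard way of writing this hazard in the Wei, Lin and Weissfeld formulation, using the at-risk indicator $\delta_{ij}(t)$ and the proportional hazards form defined above, is the following. The event-specific baseline hazard $h_{0j}(t)$ and the event-specific coefficient vector $\beta_j$ are assumptions taken from the usual WLW formulation; the notation may differ from the authors' own display.

```latex
h_{ij}(t \mid x_i) = \delta_{ij}(t)\, h_{0j}(t)\, \exp\!\left(\beta_j' x_i\right),
\qquad j = 1, 2, \dots, n .
```

Under this form, each stratum $j$ has its own baseline hazard, and a subject contributes to the risk set of stratum $j$ only while $\delta_{ij}(t) = 1$.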

Maximum likelihood and parameter estimation
Let $]0, \tau_i[$ be the interval of time in which individual $i$ is observed, with $n_i$ the number of events of individual $i$ along $]0, \tau_i[$. Assume that two events cannot occur simultaneously in continuous time. The probability density function for the outcome $n_i$ along $]0, \tau_i[$ is given by (3).
In (3), individual $i$ has $n_i$ events with $n_i \geq 0$ at times $t_{i1} \leq t_{i2} \leq \dots \leq t_{in_i}$.
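For orientation, the standard density of an event process observed on $]0, \tau_i[$ with events at $t_{i1}, \dots, t_{in_i}$ can be written as below. This is a sketch of the textbook form rather than a transcription of the authors' equation; the symbol $h_i(t)$, denoting the intensity (hazard) of the event process of individual $i$, is an assumed notation.

```latex
L_i = \left[ \prod_{j=1}^{n_i} h_i(t_{ij}) \right]
      \exp\!\left( - \int_{0}^{\tau_i} h_i(u)\, du \right),
\qquad n_i \geq 0 ,
```

where the empty product is taken to be 1 when $n_i = 0$, so that a subject with no event contributes only the survival term.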
The appropriate partial likelihood functions for tied time-to-event data are well described in [24] and [25] and include Breslow's, Efron's and Cox's techniques. The maximum likelihood estimates are given by system (4), where $\alpha$ is known as the baseline parameter vector while $\beta$ is the vector of model parameters. The Newton-Raphson method is one of the numerical methods used for solving system (4). The adequacy of the likelihood estimates is checked by finding the elements $\Im_{\alpha\alpha}$, $\Im_{\alpha\beta}$, $\Im_{\beta\alpha}$ and $\Im_{\beta\beta}$ of the information matrix $\Im$ and assuming that, as $n \to \infty$, $\hat{\Phi} - \Phi \to N(0, \Im^{-1}(\Phi))$ [4, 26].
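The Newton-Raphson iteration mentioned above can be sketched on a deliberately simplified problem. The Python code below is not the estimation procedure of the study: as a stated simplification it maximises the Cox partial likelihood for a single covariate with no tied event times, instead of solving the full system in $\alpha$ and $\beta$. The function names and the toy data are hypothetical.

```python
import math


def score_and_information(beta, times, status, x):
    """Score and observed information of the Cox log partial likelihood
    for one covariate, assuming no tied event times."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, status)):
        if d_i != 1:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_k, x_k in zip(times, x):
            if t_k >= t_i:  # subject k is still at risk at time t_i
                w = math.exp(beta * x_k)
                s0 += w
                s1 += w * x_k
                s2 += w * x_k * x_k
        mean = s1 / s0
        score += x[i] - mean
        info += s2 / s0 - mean * mean
    return score, info


def cox_newton_raphson(times, status, x, tol=1e-10, max_iter=50):
    """Iterate beta <- beta + score / information until the step is tiny."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, status, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Standard error from the inverse information at the final estimate.
    _, info = score_and_information(beta, times, status, x)
    return beta, 1.0 / math.sqrt(info)


# Hypothetical toy data: times in days, status 1 = event, 0 = censored.
toy_times = [2, 3, 5, 7, 11, 13, 17, 19]
toy_status = [1, 1, 0, 1, 1, 0, 1, 1]
toy_x = [1, 0, 1, 1, 0, 0, 1, 0]

beta_hat, se_hat = cox_newton_raphson(toy_times, toy_status, toy_x)
print(beta_hat, se_hat)
```

The inverse of the information at the final estimate plays the role of $\Im^{-1}$ in the asymptotic normality statement, here reduced to a single scalar.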
In the MRSM, $n$ is assumed to be the maximum number of events per subject while $\tau_k$, $k = 1, 2, \dots$ [...] for estimating the jackknife model parameters. The jackknife standard error is given by Eq. (7) found in the Appendix.
The results of the unadjusted JMRSM are relatively close to those of the unadjusted MRSM (Table 3) [...] level of the covariate abortion where significance is suggested by the MRSM. Following the recommendations of Parzen and Lipsitz [30], the $\chi^2$ test statistics suggest a higher performance of the JCPHM as compared to the CPHM and BCPHM, since the $\chi^2$ is relatively lower everywhere for the JCPHM.

Discussion
The overall results of the MRSM, BMRSM and JMRSM under the different approaches to handling ties (Tables 3, 4 and 5) are, as expected, not critically different. The Stata default method (Breslow) is therefore used in the analysis. The JMRSM is adopted for checking stability since its results are closer to those of the MRSM than those of the BMRSM. The similarity between the MRSM and the JMRSM suggests that the MRSM may be stable. The global analysis upholds the significant difference of all levels of the covariates age, gender, number and APGAR and of the intermediate levels of the covariates weight and head.
The re-sampled adjusted models with the Breslow technique for handling tied events suggest that the risk of death or of acquiring a chronic disease is lower for babies whose parents are aged 20 to 34 years than for babies whose parents are under 20 years old or 35 years and above. Basinga et al. [31] argue that unintended pregnancy induces abortion in Rwanda; their study suggests a relatively higher rate of unintended pregnancies among teenagers than in the other age ranges, which contributes, on the one hand, to the increase of the infant mortality rate. On the other hand, the study by Olausson et al. [32] confirms a relatively higher risk for teenage pregnancies due to biological immaturity. As for advanced maternal age, Lampinen et al. [33] point out that it is associated with relatively poorer pregnancy outcomes due to the observed higher incidence of chronic medical conditions among older women.
The results show that the risk for male babies is higher than that for female babies. This agrees with the usual better survival outcome of females reported in several manuscripts such as [34] or [35]. Multiple babies survive better than singleton babies; this, however, contradicts the results of studies conducted in Sub-Saharan Africa by Monden and Smits [36] and Pongou et al. [37], and may be due to the small number of multiple newborns recorded at KUTH during the year 2016. The survival outcomes of babies whose APGAR score is below 4/10 are poorer than those of babies with a higher APGAR score. Babies whose weight ranges from 2500 g to 4500 g survive better than those whose weight is below 2500 g or above 4500 g, while babies whose head circumference ranges from 32 cm to 36 cm survive better than those whose head circumference is below 32 cm. The results for APGAR, weight and head circumference agree with the recommendations of clinical medicine and related manuscripts such as [38].
The study shows that the BMRSM is close to the JMRSM and the MRSM for all significant covariates, but the BMRSM shows relatively higher standard errors for some non-significant covariates. The discrepancy between standard errors after re-sampling for covariates such as childbirth, weight, head and height suggests instability of the MRSM at these specific covariates and emphasizes their non-significance in the MRSM.
The present analysis is limited to eleven covariates. Unavailable covariates concerning parents that could improve the models are, for example, demographic

Conclusion
The Marginal Risk Set Model (MRSM) and its re-sampled versions using the bootstrap (BMRSM) and the jackknife (JMRSM) are described and compared using a dataset on infant mortality. The JMRSM and the MRSM displayed relatively close results. The risk is higher for babies whose parents are under 20 years old than for babies of older parents. Babies born with an APGAR score greater than or equal to 7/10 were found to have a better survival outcome than those born with an APGAR score below 4/10 or between 4/10 and 6/10. The risk is lower for underweight babies as compared to babies with normal weight and overweight. The survival outcomes of babies with a normal head circumference were found to be better than those of babies with a relatively small head. The study suggests that pregnancy among parents under 20 years old should be avoided, and that appropriate clinical care protecting the pregnancy against any cause of infant abnormality could help to lower infant mortality.

Appendix
Bootstrap and Jackknife re-sampling methods
Bootstrap
Consider the $p$ fixed covariates $x_i = (x_{i1}, x_{i2}, \dots, x_{in})$ in Eq. (2), where the $x_{ij}$, $i \in [1, p]$, are independent and identically distributed, possibly with distribution $F_\theta$, where $\theta$ is the statistical parameter of interest. Consider the distribution function $F_{R_n}$ of a random variable $R_n(x, F_\theta)$. A bootstrap method, as described in [9], consists of generating samples
$$x_i^{*1}, x_i^{*2}, \dots, x_i^{*B},$$
where the $x_i^{*k}$ are random samples of size $n$ drawn with replacement from the sample $x_i$.
The variables of $x_i^{*k}$ are independent and identically distributed with distribution $\hat{F}_{\theta,n}$ given $x$; $\hat{F}_{\theta,n}$ is an estimator of $F_\theta$ from $x_i$; $B$ is the number of bootstrap samples, also known as replications.
The estimated standard error of the bootstrap statistic of interest is given in Efron and Tibshirani [9] as
$$\widehat{se}_B = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right)^2}, \qquad \hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b),$$
where $\hat{\theta}^*(b)$ is an estimate of the statistic of interest from the $b$th bootstrap sample, $b = 1, 2, \dots, B$.
Jackknife
Consider the covariates $x_i$ in Eq. (2). Let $\theta$ be a statistic of interest. The jackknife samples consist of leaving out one observation at a time, which gives $n$ samples.
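The two re-sampling schemes can be sketched in a few lines of Python for a generic scalar statistic. This is an illustrative sketch only: the study refits the MRSM on each re-sample, whereas the code below applies a plain function (here the mean) to a hypothetical toy sample. The function names are hypothetical, and the jackknife standard error uses the standard formula $\sqrt{\frac{n-1}{n} \sum_i (\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)})^2}$, which is assumed to correspond to the Eq. (7) referred to in the text.

```python
import math
import random
import statistics


def bootstrap_se(sample, statistic, n_replicates=1000, seed=0):
    """Bootstrap standard error: standard deviation (divisor B - 1) of the
    statistic over B samples of size n drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = [
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_replicates)
    ]
    return statistics.stdev(estimates)


def jackknife_se(sample, statistic):
    """Jackknife standard error from the n leave-one-out estimates."""
    n = len(sample)
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    squared_dev = sum((est - center) ** 2 for est in leave_one_out)
    return math.sqrt((n - 1) / n * squared_dev)


# Hypothetical toy sample; the statistic of interest is the mean.
toy_sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]
se_boot = bootstrap_se(toy_sample, statistics.mean)
se_jack = jackknife_se(toy_sample, statistics.mean)
print(se_boot, se_jack)
```

For the sample mean, the jackknife standard error coincides with the usual $s/\sqrt{n}$, which gives a convenient sanity check; the bootstrap value depends on the random seed and on the number of replicates.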