Validity and reliability of a structured interview for early detection and risk assessment of parenting and developmental problems in young children: a cross-sectional study

Background Preventive child health care is well suited for the early detection of parenting and developmental problems. However, as far as the younger age group is concerned, there are no validated early detection instruments which cover both the child and its environment. Therefore, we have developed a broad-scope structured interview which assesses parents’ concerns and their need for support, using both the parental perspective and the experience of the child health care nurse: the Structured Problem Analysis of Raising Kids (SPARK). This study reports the psychometric characteristics of the SPARK.
Method A cross-sectional study of 2012 18-month-old children, living in Zeeland, a province of the Netherlands. Inter-rater reliability was assessed in 67 children. Convergent validity was assessed by comparing SPARK-domains with domains in self-report questionnaires on child development and parenting stress. Discriminative validity was assessed by comparing different outcomes of the SPARK between groups with different levels of socio-economic status and by performing an extreme-groups comparison. The user experience of both parents and nurses was assessed with the aid of an online survey.
Results The response rate was 92.1% for the SPARK. Self-report questionnaires were returned in the case of 66.9% of the remaining 1721 children. There was selective non-reporting: 33.1% of the questionnaires were not returned, covering 65.2% of the children with a high-risk label according to the SPARK (p < 0.001). Inter-rater reliability was good to excellent with intraclass correlations between 0.85 and 1.0 for physical topics; between 0.61 and 0.8 for social-emotional topics and 0.92 for the overall risk assessment. Convergent validity was unexpectedly low (all correlations ≤0.3) although the pattern was as expected. Discriminative validity was good. Users were satisfied with the SPARK and identified some topics for improvement.
Conclusion The SPARK discriminates between children with a high, increased and low risk of parenting and developmental problems. It does so in a reliable way, but more research is needed on aspects of validity and in other populations.

In the Netherlands, the law requires preventive child health care (CHC) to detect parenting and developmental problems at an early stage. However, as far as the younger age group is concerned, there are no validated early detection instruments which cover both the child and its environment. Therefore, we have developed the Structured Problem Analysis of Raising Kids (SPARK) [15]. The SPARK is a structured interview for early detection and risk assessment of parenting and developmental problems in young children. This instrument combines the perspectives of the parent(s) and the professional. The SPARK asks parents to voice any concerns and problems on a broad range of topics, and then to indicate the need for support perceived by both parent and CHC-professional, followed by a joint decision on subsequent care. It finishes with a structured overall risk assessment for parenting and developmental problems by the professional.
The development study of 1140 children shows that the SPARK is discriminative and practicable [15]. Before the SPARK can be implemented more widely in clinical practice, further study is needed on the psychometric characteristics of this instrument. As no criterion instrument ('gold standard') exists for early detection of parenting and developmental problems, criterion validity cannot be assessed. Therefore, we have assessed the SPARK on inter-rater reliability, convergent validity, discriminative validity, and the user experience of both parents and CHC-professionals.

Study design
We performed a cross-sectional study on all children living in the province of Zeeland and born between January 15 and July 31, 2006, a total of 2012 children. Once a month, all children who would reach the age of 18 months the following month were identified in the municipal population registry. The aim was to ensure that all eligible children could be contacted. The CHC nurse contacted parents for the regular check-up at the age of 18 months, which consisted of a home visit by the CHC nurse or a visit to the well-baby clinic by parent(s) and child, and included an information letter on the aim of the visit and the primary study (assessing the value of a structured interview during home visits and visits to the well-baby clinic). The visit started with the structured interview (SPARK), with the primary goal of deciding together with the parent(s) which type of (health) care was needed by child and parent(s). The interview was followed by a request (verbal and written) for informed consent to use the information recorded in the SPARK for scientific research. The order of the steps was chosen on purpose, as it would be complicated to discuss parenting problems and the care needed after informed consent had been denied. To prevent bias, the CHC nurses were not aware of the study goals of the validation study. The study protocol was approved by the Medical Ethical Review Committee of the University Medical Center Utrecht.
Reliability of the SPARK was assessed by the inter-rater agreement. In a random sample of 67 children a second CHC nurse was also present. Her function was to listen to the interview, without interfering, and to fill in the SPARK-form independently from the interviewing CHC nurse. Convergent validity was assessed by comparing SPARK-domains with domains in self-report questionnaires on child development and parenting stress which cover concepts also addressed in the SPARK. Parents who gave informed consent were requested to complete a set of questionnaires (described below). Discriminative validity was assessed by comparing different outcomes of the SPARK between groups with different levels of socio-economic status (SES) and by performing an extreme-groups comparison. We hypothesized that children from families with lower SES would report more problems and need for support, and that this group would include more children with a high and increased risk of parenting problems. The extreme-groups comparison was done by comparing the mean levels of concern and perceived need for support and the risk assessment between a) all children with a confirmed report to the child protective services between birth and the age of 18 months (n = 21), and b) the 'everything OK' group: a group of children with normal scores on all self-report questionnaires and no known risk factors (which include large family (≥ four children), single parent, young parent (<20 years at birth of child), very low educational background of parents, parents not speaking Dutch at home, unemployed or unemployable parents) [16,17]. As the latter group was very large (n = 912), we took a random sample from this group of three times the number of the reported group. Again, children with a confirmed report were expected to show more problems and a higher risk.
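The sampling step for the 'everything OK' comparison group can be illustrated with a minimal Python sketch. The text only states that a random sample of three times the size of the reported group (n = 21) was drawn from 912 children; the identifiers, the function name and the seed below are hypothetical, and simple random sampling without replacement is an assumption.

```python
import random


def sample_comparison_group(candidate_ids, n_reported, ratio=3, seed=None):
    """Draw ratio * n_reported children at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed only makes the sketch reproducible
    return rng.sample(list(candidate_ids), ratio * n_reported)


# Hypothetical IDs standing in for the 912 'everything OK' children.
everything_ok_ids = range(1, 913)

# 21 reported children, so the comparison group holds 3 * 21 = 63 children.
comparison_ids = sample_comparison_group(everything_ok_ids, n_reported=21, seed=2006)
print(len(comparison_ids))  # 63
```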

Instruments
The way the SPARK was conceived has been described in detail in a previous study [15]. The SPARK consists of 16 topics in the following order: infancy review (reviewing past issues and discussing any problems arising from the infant period that are still relevant); somatic health; motor development; language, speech and thought development; language use of parents (second language, mother tongue); emotional development; contact between the child and others (both children and adults); child behavior; parenting approach; developmental stimulation and early/preschool education; how the child spends its time; living environment in and outside the home; social contacts and informal support; day-care for the child; concerns communicated by others; family issues; and lastly a question about whether any topic has been forgotten or needs further attention. The SPARK uses a 3-step model: Step 1: detection of problems and concerns; Step 2: clarifying the characteristics and seriousness of problems and concerns in dialogue with the parents; Step 3: analysis and a decision on what to do next. For each topic, the CHC nurse starts with a short description of the topic with examples, and asks the parents if they have experienced any concerns, questions or problems in the last six months (Step 1). Parents are requested to assess the seriousness of these concerns on a five-point Likert scale presented on a printed card, ranging from "no concern at all" to "very concerned". If concerns are cited, respondents are asked to elaborate on the exact nature of concerns, questions or problems, and whether or not professional and/or informal help, if offered, has been sufficient. Each topic ends with the parents assessing their current perceived need for support, on a six-point Likert scale: 1) no help needed; 2) information wanted; 3) personal advice; 4) counselling; 5) intensive help; 6) immediate intervention required. The CHC professional then makes the same assessment (Step 2).
The information of steps 1-2 is recorded on a one-page form with a matrix structure: the first column includes all topics, followed by columns for each separate question: concerns / used support / support helped / current perceived need for support by parents / perceived need for support by nurse. After all the topics have been covered, the CHC nurse discusses with the parents the amount and content of care needed in the following months (Step 3), and notes this together with a description of the concern or problem on the second page, on which the possibilities for further care have been preprinted. Having done this, the CHC nurse ends the visit and subsequently makes an overall risk assessment on the third page, assigning the child a low, increased or high risk for parenting and development problems. The CHC nurse bases this overall risk assessment on the information from the interview, and on an elaboration of factors that might positively or negatively influence this risk assessment. This structured elaboration includes the observation of several factors, preprinted on the third page: the interaction between parent(s) and child(ren); growth and development of the child; manifest problems (both in the child such as existing illness, and in the family such as major life events, history of psychiatric illness, financial problems etc.); and living environment (hygiene, housing, family composition).
The set of self-report questionnaires on child development and parenting stress included a pre-stamped envelope addressed to the research team. The set consisted of the following questionnaires: 1) Ages and Stages Questionnaire (ASQ) version 2, 18-month version [18,19]. The ASQ consists of 30 questions on 5 domains: communication, gross motor, fine motor, problem solving and personal social. The ASQ has three answering options: 'yes', 'sometimes', 'not yet'. Domains have a range of 0 to 60. 2) The Ages and Stages Questionnaire: Social Emotional (ASQ:SE, 18-month version) also has three answering options: 'most of the time', 'sometimes' and 'rarely or never'. Parents are asked to tick off a checkbox if the item in question is a concern [20]. The ASQ:SE has a scoring range of 0 to 255 in the 18-month version.
3) The short validated Dutch version of the Parenting Stress Index [21], called 'Nijmeegse ouderlijke stress index-kort' (NOSIK) [22]. The NOSIK consists of 25 items using a 6-point Likert scale ranging from 'do not agree at all' to 'do completely agree', with a scoring range of 25 to 150. 4) A partly validated questionnaire on psychological and pedagogic problems in young children which is frequently used in preventive CHC in the Netherlands: the 'Kort Instrument voor de Psychologische en Pedagogische Probleem Inventarisatie' (KIPPPI) [23]. This self-report questionnaire consists of 70 items grouped into a total score, and 19 yes/no items on life events.
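The scoring ranges quoted for the ASQ domains (0 to 60) and the NOSIK (25 to 150) follow from simple sums, as the sketch below illustrates. It assumes the usual ASQ item values of 10, 5 and 0 points and six items per domain (30 questions over 5 domains); the item values are not stated in the text, and the function names are hypothetical.

```python
# Assumed ASQ item values; the text only gives the answer options and the domain range.
ASQ_POINTS = {"yes": 10, "sometimes": 5, "not yet": 0}


def asq_domain_score(answers):
    """Sum the six item scores of one ASQ domain (range 0 to 60)."""
    if len(answers) != 6:
        raise ValueError("an ASQ domain is assumed to have six items")
    return sum(ASQ_POINTS[answer] for answer in answers)


def nosik_total(item_scores):
    """Sum 25 items scored 1..6 on the Likert scale (range 25 to 150)."""
    if len(item_scores) != 25 or any(not 1 <= s <= 6 for s in item_scores):
        raise ValueError("the NOSIK needs 25 item scores between 1 and 6")
    return sum(item_scores)


print(asq_domain_score(["yes"] * 6))  # 60, the top of the domain range
print(nosik_total([1] * 25))          # 25, the bottom of the NOSIK range
print(nosik_total([6] * 25))          # 150, the top of the NOSIK range
```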
The 18-month versions of both ASQ and ASQ:SE have been translated into the Dutch language using a double forward, once backward procedure. The (minor) differences have been resolved in cooperation with the developer of these questionnaires. Although these translations of the ASQ and ASQ:SE have not been validated, the ASQ and ASQ:SE have proven to be practicable and valid in other countries than the USA [24][25][26], including the Netherlands (48-month version [27]). Additionally, data have been gathered on demographic variables: age of father and mother at birth of first child, level of education of both parents, current working status of both parents, language spoken at home. Both the SPARK and the self-report questionnaires have been scanned using Teleform. Socio-economic status (SES) has been assessed on neighborhood level: using the postal code for the house address of the child, each child has been assigned the SES-level of his or her neighborhood, using figures of Statistics Netherlands delivered by the Municipal Health Service of Zeeland. SES has been measured in 7 categories, from very low to very high. Most of the 155 postal code regions in Zeeland have a medium SES.
For the extreme-groups comparison, we checked with the child protective services (Advice and Reporting Centres for Child Abuse and Neglect, and Youth Care Agency) which children in our sample had a confirmed report between birth and the age of 18 months.
For assessment of the user experience of both parents and CHC-professionals, we adapted a short questionnaire on CHC-nurses' skills meant for increasing parents' parenting competences [28]. During November 2007, parents and CHC-nurses were asked to complete this questionnaire online for each visit using the password-protected online survey tool NetQ (http://netq.nl).

Statistical analysis
Reliability of the SPARK was assessed by the inter-rater agreement between the SPARK and a listen-only version as described above. We computed an intraclass correlation (ICC) using an 'observer nested within subject' approach [29]. We only did this for the risk assessment and the need for support on the different topics as perceived by the CHC professional, as the answers given by the parents would be scored identically. Convergent validity of the SPARK was assessed by computing Spearman correlations between the care need expressed by parents and by CHC professionals on the 16 topics with domains in the self-report questionnaires. Using a multitrait-multimethod matrix [30] we expected higher correlations between related domains, such as motor development in the SPARK and gross motor in the ASQ; child behavior with ASQ:SE total score and NOSIK etc; and low correlations between differing domains such as physically oriented domains in the SPARK and parenting stress (the NOSIK score). Solely for the purpose of assessing discriminative validity, we computed summary scores for concerns and perceived need for support by summing the scores for all topics and dividing by the number of topics. Thus, the scoring range of the summary scores was the same as with the original variables. Differences between postal code regions with different SES-levels on these summary scores for concerns or perceived need for support were tested using a Kruskal-Wallis test [31]. The extreme groups were compared using a Mann-Whitney U-test on concerns and perceived need for support, and a chi-square test on the risk assessment. Data-analysis was done using SPSS version 17. A p-value below 0.05 was considered significant.
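The analyses in this study were run in SPSS. Purely as an illustration of two of the computations described above, the following Python sketch shows a summary score (the mean over topics) and a Spearman correlation computed as the Pearson correlation of tie-averaged ranks. All function names and data values are hypothetical and are not taken from the study.

```python
from statistics import mean


def summary_score(topic_scores):
    """Sum of the topic scores divided by the number of topics."""
    return sum(topic_scores) / len(topic_scores)


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the tie
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical need-for-support scores (1..6) on 16 topics for one child.
print(summary_score([1] * 14 + [3, 4]))  # 1.3125, still on the 1..6 scale

# Hypothetical paired scores for five children.
print(spearman([1, 2, 2, 4, 6], [10, 20, 20, 30, 50]))
```

Because the summary score is a mean rather than a sum, it stays on the same scale as the per-topic scores, which is the property the text relies on.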

Results
During the study period 2012 eligible children were living in the province of Zeeland. No SPARK was received for 136 children (6.8%). For another 155 children, an incomplete SPARK was available. This group consisted of a) forms received with the comment 'no contact wanted by parents' (n = 24); b) forms with missing risk and/or consent data (n = 25); and c) forms for which no consent was obtained after administration of the SPARK (n = 106). Children for whom no SPARK was received, or an almost empty SPARK with the comment 'no contact wanted by parents', were counted as a non-response. Of the remaining 1721 children, self-report questionnaires were returned for 1152 children (66.9%). Characteristics of the study population are described in Table 1. Administration of the SPARK took on average 29 minutes (standard deviation 11 min.). Table 2 shows scores per domain on parents' concerns, needs assessment by parents and professional.

Reliability
Concerning inter-rater reliability, ICCs were very high for physical topics (0.85 to 1.0; see Table 3). For social-emotional topics, ICCs varied between 0.61 and 0.8. The ICC of the overall risk assessment was also very high: 0.92.
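To make the ICC concrete, the sketch below computes a one-way random-effects ICC for two raters per child. This is an assumption: the study cites an 'observer nested within subject' approach without giving the formula, and the one-way model is only one common formulation of such a design. The function name and the ratings are hypothetical and do not reproduce the study data.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC for n subjects rated by k raters each.

    ratings: list of per-subject tuples, one score per rater.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)
    """
    n = len(ratings)
    k = len(ratings[0])
    subject_means = [sum(r) / k for r in ratings]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (disagreement between raters).
    msw = sum(
        (score - m) ** 2 for r, m in zip(ratings, subject_means) for score in r
    ) / (n * (k - 1))
    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (msb - msw) / denominator


# Hypothetical risk ratings (interviewing nurse, listening nurse) for three children.
example = [(1, 1), (2, 3), (4, 4)]
print(icc_one_way(example))
```

The value approaches 1 when the two nurses agree on almost every child and drops as within-child disagreement grows relative to the differences between children.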

Validity
Convergent validity was low, with no correlations exceeding 0.3. Despite the low correlations, the pattern was as expected: higher scores (in this case above 0.1) were only found in domains that were expected to have higher correlations. Correlations above 0.2 include SPARK motor development with ASQ gross motor; SPARK language, speech and cognitive development with ASQ communication; SPARK child behavior with KIPPPI total score; SPARK family issues with KIPPPI life events (see Table 4). Domains of the NOSIK were not related to physically oriented SPARK domains, and significantly correlated to psychosocial domains. All correlations above 0.1 were significant at the 0.01 level. Analysis of groups based on SES-level showed that there was a highly significant difference in overall risk assessment (p < 0.001): there were relatively more children labeled as high risk in the lower SES groups compared to the groups with higher SES. There was also a small but significant difference in the level of parents' concerns between SES-levels (median value range: 1.29 to 1.67, p < 0.001), but not in the perceived need for support (parents: 1.07 to 1.16; nurses: 1.19 to 1.30). The extreme-groups comparison followed almost the same pattern: significant differences in overall risk assessment (p < 0.001) and parental concerns (median value 'reported': 1.93 versus 'everything OK': 1.32, p = 0.043). There was a discrepancy in the perceived need for support: the reported children's parents did not differ from the 'everything OK' children's parents (1.13 vs 1.07, p = 0.60), but the need for support as perceived by the CHC-nurse was far higher for the reported children's group (1.60 vs 1.19, p = 0.006). Table 5 shows the professional judgement of perceived need for support per domain, separately for the extreme groups and for the different SES-levels.
The judgment was dichotomized for better readability into mild support (percentage information wanted / personal advice / counselling) and intensive support (percentage intensive help/ immediate intervention required). The reported group differed from the 'everything OK' group mostly in the domains related to the parent and family (parenting approach, living environment, social contacts, day care for child, concerns communicated by others, family issues, was any topic forgotten?). Lower SES-groups differed in a similar way from the higher SES-groups. Furthermore, we found a difference in overall risk between children with and without completed self-report questionnaires. The group with completed questionnaires formed 66.9% of the total group, but included only 34.8% of the high risk labels. The group without questionnaires thus formed 33.1% of the total group, with 65.2% of the high risk labels. This difference in distribution is highly significant (p < 0.001).
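The size of this selective non-response can be made explicit with a short calculation on the four percentages reported above. The sketch below is only an illustration (the function name is hypothetical): it divides each group's share of the high-risk labels by its share of the children and then compares the two groups.

```python
def label_density(share_of_labels, share_of_children):
    """High-risk labels per child, relative to the overall average."""
    return share_of_labels / share_of_children


# Percentages taken from the text above.
without_questionnaires = label_density(0.652, 0.331)  # about 1.97
with_questionnaires = label_density(0.348, 0.669)     # about 0.52

# How much more often a child without returned questionnaires carries a high-risk label.
ratio = without_questionnaires / with_questionnaires
print(round(ratio, 2))  # 3.79
```

Under these figures, a high-risk label was roughly 3.8 times as frequent among children whose questionnaires were not returned as among children whose questionnaires were returned.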

User experience
The survey on user experience was completed for a total of 211 contacts. Parents reported on 100 contacts, CHC-professionals on 179 contacts. After removing incomplete surveys, 86 parent-completed and 177 CHC nurse-completed surveys remained. Completing the survey took parents on average 5.2 minutes, and nurses 7.5 minutes. Both parents and CHC-nurses were positive about using the SPARK (satisfied or very satisfied about the contact: parents 94.2%; nurses 91.5%). Nurses succeeded in using the structured approach of the SPARK reasonably well to very well in 92.1% of the contacts. Despite the fact that the SPARK structured the visit, most parents and CHC-nurses found the visit very relaxed (89.6% and 65.6%). More than half of the parents regarded the information given during the visit as useful (66.3%) and tailored to their needs (58.1%). The majority of the parents (95%) reported that all relevant topics had been sufficiently discussed. CHC-nurses reported that using the SPARK provided them with information they would not have collected without using such a structured instrument, especially regarding topics related to family matters (25.4% of the contacts), parenting approach (15.8%) and concerns communicated by others (11.9%). The results of the survey were discussed with the same expert group of CHC nurses that had helped develop the SPARK (n = 8) [15]. The results of the survey and this discussion resulted in the following comments on using and improving the SPARK. The SPARK supports the CHC-nurse in making difficult visits: it ensures that nothing is forgotten, and helps in asking tough questions. Asking for the concerns and needs of parents gives much additional information in families with problems, which helps in deciding what care should be offered to these families. However, in families where everything is OK, the SPARK was found to be too rigid.
Furthermore, the expert group reported that the wording of the answering categories of the question whether parents had experienced any concerns, questions or problems in the last six months needed improvement.

Discussion
This study assesses the psychometric properties of the SPARK, a structured interview developed to assess parenting and developmental problems in young children. The inter-rater reliability was found to be very good to excellent, especially for the overall risk assessment and the physical domains. The SPARK proved to be discriminative, distinguishing between areas with different SES-levels and between postal codes (representing both SES and urbanization). There were clear differences between extreme groups: children reported to the child protective services versus children with normal scores on all questionnaires. The only psychometric property that was below expectation was the convergent validity. Correlations of SPARK-domains with related domains in the self-report questionnaires were significant, but very low. Although they showed the expected pattern, no correlation exceeded 0.3. This lack of convergence is probably influenced by several aspects. Firstly, the content and the way of questioning differed quite a lot between the SPARK and the self-report questionnaires. Secondly, the majority of the children had no problems. Thirdly, the group that did not return the questionnaires included a large part of the children with a high risk. Both parents and CHC-nurses were positive about the SPARK. CHC-nurses reported that the SPARK gave practical information and supported them during visits with problem families. They also identified several areas of improvement for the SPARK: its rigid structure and the wording of some questions. Several authors support our opinion that an assessment of parents' concerns and their need for support should be done in dialogue with the parents [32][33][34]. One of the main features of the SPARK is direct interaction between parent and professional: the focus is on interactively discussing with parents the child's needs and development and their needs for parenting support.
The professional helps the parent to order and weigh concerns and problems. The only instrument that has a somewhat similar approach to the SPARK is the Parents' Evaluations of Developmental Status (PEDS) by Glascoe [33,35]. However, there are some major differences between the PEDS and the SPARK. The PEDS is a short 10-item questionnaire to be completed before a visit to a pediatric clinic, as a self-report or an interview [33,35]. The answers are then discussed by the nurse or pediatrician. The SPARK differs from the PEDS in that it is a conversation between parent and professional in order to clarify care needs and to jointly decide on subsequent care. Both the parents and the professional rate their perceived need for support, which is important in situations when parents are avoiding care and to reveal differences in the perceived need between parents and professional. Furthermore, the SPARK has a broader scope, including also the child's environment. Finally, the SPARK results in an overall assessment of risk for parenting and developmental problems. Whether the SPARK is preferable to self-report questionnaires needs to be determined. The duration of administering the SPARK is about double that of the regular time spent in a visit to the well-baby clinic. This will hamper implementation, in the Netherlands as well as in other countries. Further research is needed on whether implementing the SPARK is cost-effective. Three arguments are in favor of the SPARK: a) in our current study we observe a response bias, as especially the parents with a child labeled as high risk by the nurse did not return the self-report questionnaires, b) the interview gives nurses the possibility to ask not only about the child, but also about the (functioning of the) family.
Nurses reported that this part in particular gave them new information relevant for deciding which care and support should be offered, and c) in the Netherlands there is a growing aversion among parents to self-report questionnaires. Parents regard preventive child health care increasingly as a system for detection of child abuse and neglect, instead of as a care provider that supports parents of young children [36]. This threatens the high reach (>95%) that the Dutch system has traditionally had among children aged 0-4 years. The interactive procedure of the SPARK (i.e. listening to the parent and making a shared decision about subsequent care) may help in re-establishing the trust of parents in preventive child health care. This study has several limitations. The low convergent validity needs further attention. In addition to the reasons stated above, some other aspects play a role. Firstly, although the response rate for the self-report questionnaires was quite high, there was selective non-reporting: about two-thirds of the children with a label of high risk were part of the one-third that did not return questionnaires. This may have negatively influenced the convergent validity, as the group with expected high scores in both the SPARK and the self-report questionnaires did not contribute to the correlations. Interestingly, this lower response rate showed that the SPARK identifies a large group of children with high risk for parenting problems, which would have been missed by using only self-report questionnaires. Reasons for not returning the questionnaires are unknown, but may include causes as diverse as lack of skills to complete a self-report questionnaire, stress within the family, or not wanting to write about problems within the family. Secondly, we were limited in choosing suitable questionnaires as there is a lack of validated questionnaires for this age group in the Dutch language.
Some of the instruments used for assessing the convergent validity have been validated only partially (the KIPPPI, which is used extensively in the Netherlands) or have not been validated for this age group in the Netherlands (ASQ and ASQ:SE). This limits the interpretability of the convergent validity. Thirdly, the lack of convergence may also have been caused by the broad scope of the SPARK compared to the more limited self-report questionnaires. Another limitation is that, although the province of Zeeland resembles a large part of the Netherlands, it may not be representative of some highly urbanized areas elsewhere in the Netherlands. The validity and feasibility of the SPARK in urbanized, multi-ethnic areas should also be studied. Also, this was a cross-sectional study without follow-up. Further study is required to assess the predictive validity of the SPARK and long-term outcomes.

Conclusion
The SPARK is a structured interview that assesses parents' concerns and their need for support using both the parents' perspective and the experience of the CHC-nurse. The SPARK discriminates between children with a high, increased and low risk for parenting and developmental problems in a reliable way. The SPARK is practicable and provides useful information which helps to decide, together with the parents, what care is needed in a family. The users are satisfied, but there is room for improving the instrument. Several aspects of the SPARK such as predictive validity, construct validity, cost-effectiveness and discriminative validity in other samples require further study. If only self-report questionnaires are used, a large proportion of the children at high risk of parenting and developmental problems will be missed.