Although physical punishment is used by as many as 94% of parents in some countries, a growing number of countries have banned its use. Some American pediatricians oppose all use of corporal punishment, whereas others think there are situations where nonabusive spanking should remain a disciplinary option for parents. The author of one major literature review opposes all spanking,[5, 6] whereas five other recent literature reviews have reached less absolute conclusions [7–11]. In either case, pediatricians need research evidence comparing the effects of alternative disciplinary tactics with those of spanking to advise parents about which tactics to use instead of spanking. To provide that information, the present study is one of the first causally relevant longitudinal studies to compare the child outcomes of alternative disciplinary tactics with those of customary spanking. First, it attempts to duplicate the strongest causal evidence against customary spanking to date and investigates which of three types of nonphysical punishment are more effective at reducing antisocial behavior than customary spanking. Second, it investigates whether that evidence is a statistical artifact due to residual confounding from child effects on parents.
In the only scientific consensus conference on corporal punishment, co-sponsored by the American Academy of Pediatrics, both critics[7, 13, 14] and supporters[15, 16] of spanking bans acknowledged the weakness of the relevant scientific evidence. The conference consensus statements thus included appropriate cautions about corporal punishment, but fell short of a blanket opposition to all spanking.
Since the literature review published for that conference, there have been at least five published literature reviews with a diversity of conclusions[5, 8–11]. For example, Gershoff's meta-analysis supported an anti-spanking perspective, whereas Larzelere and Kuhn's meta-analysis concluded that child outcomes of physical punishment were more adverse than those of alternative disciplinary tactics only for overly severe or predominant use of physical punishment.
Gershoff's meta-analysis concluded that physical punishment was associated with 10 adverse outcomes, whereas immediate compliance was the only benefit associated with spanking. The 10 adverse mean effect sizes, however, were based on cross-sectional, retrospective, or prospective correlations for 60%, 26%, and 14% of the supporting studies, respectively. It is well known that meta-analyses of correlational evidence can produce precise but spurious results[18, 19]. As has been shown elsewhere, selection biases cause even prospective correlations to be biased against most corrective actions, whether implemented by parents or professionals. For example, hospitalization is associated with about a 30-fold increased risk of dying in Medicare patients, and psychotherapy is associated with a median 14-fold increased risk of suicide, compared to age-matched groups not receiving (or needing) those corrective actions. Just as the severity of the presenting condition accounts for the prospective longitudinal correlations between hospitalization and mortality and between psychotherapy and suicide, the severity of oppositional behavior in children may lead parents to use all disciplinary enforcements more often, not just spanking.
To adjust for this selection bias, the meta-analysis by Larzelere and Kuhn compared effect sizes for physical punishment and alternative tactics investigated in the same studies. The outcomes of physical punishment compared unfavorably with alternative disciplinary tactics only when it was the primary disciplinary method or was too severe (such as beating up a child or striking the face or head). The outcomes of customary spanking were neither better nor worse than for any alternative tactic, except for one study in which spanking reduced drug abuse more than nonphysical punishment. Customary spanking was defined as ordinary usage, without any emphasis on how severely it was used. Larzelere and Kuhn also identified an optimal type of nonabusive back-up spanking, used when a child responds defiantly to milder disciplinary tactics such as time out (based mostly on research on 2- to 6-year-olds). Under these conditions, back-up spanking led to less noncompliance or antisocial behavior than 10 of 13 alternative disciplinary tactics and produced outcomes equivalent to the other three tactics. The nine relevant studies included the only four randomized clinical trials of spanking, which yielded the strongest causal evidence about spanking in the scientific literature, albeit limited to enforcing compliance with time out in clinically defiant 2- to 6-year-olds [23–26]. Compliance with time out is a crucial component for effective implementation of most evidence-based psychosocial treatments for Oppositional Defiant Disorder, Conduct Disorder, and ADHD in young children[27, 28].
A second development since 1996 is new evidence that customary spanking predicts greater subsequent antisocial behavior longitudinally after controlling statistically for initial differences in antisocial behavior. A seminal study by Straus and his colleagues in 1997 provided the first evidence against customary spanking based on stronger causal evidence than unadjusted correlations. This improved causal evidence has led some to conclude that any use of spanking is invariably detrimental and should be opposed by all professionals.
We located 14 longitudinal studies that investigated whether physical punishment of children younger than 13 years old predicted subsequent antisocial behavior or aggression after controlling statistically for initial levels of those outcome variables. Seven of them lumped together nonabusive spanking with more severe forms of punishment, such as shaking, hitting with an object, or name-calling [30–36]. The remaining seven studies showed non-significant,[21, 37, 38] small,[1, 39, 40] or mixed effects of customary spanking on subsequent antisocial behavior or aggression. The small significant effects were found only for non-Hispanic European-Americans or in samples dominated by that group, with effect sizes of β = .05, .06, and .07. Significantly adverse outcomes emerged only in studies in which mothers reported spanking frequency in the past week. The studies also used maternal reports for the outcome variable except for Gunnoe and Mariner, which found contrasting effects for different subgroups. With a distinct source of information for the child outcome variable (i.e., child report), Gunnoe and Mariner found that customary spanking significantly reduced aggression in the following subgroups: all 4- to 7-year-olds, all African-Americans, and all girls, although spanking increased aggression in all 8- to 11-year-olds and in all European-Americans. They also replicated the usual small adverse effect of customary spanking on antisocial behavior when relying solely on parental report.
In sum, the correlational evidence against spanking that was considered weak evidence by most participants in the 1996 scientific consensus conference was replicated in Gershoff's meta-analysis. Since then, the causal evidence against spanking has been strengthened by seven studies that have found small, sometimes significant adverse effects of customary (non-severe) spanking on subsequent antisocial behavior, after controlling statistically for pre-existing antisocial scores. On the other hand, Larzelere and Kuhn's meta-analysis found that child outcomes of physical punishment were more adverse than those for alternative disciplinary tactics only when physical punishment was overly severe or the predominant disciplinary tactic. No published study has compared the outcomes of any alternative disciplinary tactic with those of customary spanking in statistically controlled longitudinal analyses, a gap addressed by this study.
In addition to investigating the ability of alternative disciplinary tactics to reduce antisocial behavior, this study will investigate whether the small adverse effects attributed to spanking in statistically controlled analyses could be due to residual confounding. Residual confounding in statistically controlled analyses explained why the summer Head Start program appeared to be detrimental according to a major early evaluation study[43, 44]. Statistical controls yield unbiased estimates of causal effects only when the process of selecting recipients for a corrective action is measured comprehensively and without measurement error[46, 47]. Statistically controlled studies with fallible measures of the selection process only reduce, rather than eliminate, the artifactual selection bias confounded with corrective actions[20, 48]. Accordingly, epidemiologists recognize that residual confounding remains when confounds are only partially controlled for statistically.
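The logic of residual confounding can be illustrated with a small simulation. The sketch below is purely illustrative and is not the analysis reported in this study: the variable names, the coefficients (0.6 and 0.7), the sample size, and the error levels are all hypothetical assumptions. Discipline is given no causal effect on later behavior, so any nonzero adjusted slope reflects selection bias that the covariate failed to remove.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def adjusted_slope(d, x, y):
    """OLS coefficient of d when y is regressed on d and the covariate x."""
    sdd, sxx, sdx = cov(d, d), cov(x, x), cov(d, x)
    sdy, sxy = cov(d, y), cov(x, y)
    return (sdy * sxx - sxy * sdx) / (sdd * sxx - sdx ** 2)


def simulate(error_sd, n=20000, seed=1):
    """Return (unadjusted, adjusted) slopes of later behavior on discipline.

    All coefficients are hypothetical; the true effect of discipline is zero.
    """
    rng = random.Random(seed)
    # Unobserved oppositional tendency of each child.
    trait = [rng.gauss(0, 1) for _ in range(n)]
    # Parents discipline more often when the child is more oppositional.
    discipline = [0.6 * t + rng.gauss(0, 1) for t in trait]
    # Later antisocial behavior depends on the trait only, not on discipline.
    later = [0.7 * t + rng.gauss(0, 1) for t in trait]
    # Fallible initial measure of the trait (error_sd = 0 means error-free).
    covariate = [t + error_sd * rng.gauss(0, 1) for t in trait]
    unadjusted = cov(discipline, later) / cov(discipline, discipline)
    return unadjusted, adjusted_slope(discipline, covariate, later)


if __name__ == "__main__":
    for error_sd in (0.0, 0.5, 1.0):
        raw, adj = simulate(error_sd)
        print(f"error_sd={error_sd}: unadjusted={raw:.3f}, adjusted={adj:.3f}")
```

Under these assumptions, the adjusted slope is close to zero only when the covariate is error-free. With a fallible covariate it shrinks relative to the unadjusted slope but stays positive, which is the pattern described as residual confounding.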
If the association between the frequency of spanking and subsequent antisocial behavior is due to child differences in initial levels of oppositional behavior, it follows that all disciplinary enforcements should show a similar association with antisocial behavior. This result would be consistent with Larzelere and Kuhn's meta-analysis that found no differences in effect sizes in comparisons between customary spanking and alternative disciplinary tactics. No statistically controlled longitudinal study of customary spanking has also investigated alternative disciplinary tactics that parents could use instead of spanking. This is therefore the first study to our knowledge that compares antisocial behavior outcomes of alternative disciplinary tactics vs. customary spanking after controlling statistically for pre-existing differences on antisocial behavior.
To compare the effects of three types of nonphysical punishment with the effects of spanking on subsequent antisocial behavior, we duplicated the study with the strongest causal evidence against customary spanking as closely as possible. Straus et al. was selected because it has reported the largest effect size associating spanking frequency with subsequent antisocial behavior. Their somewhat larger effect size might be partly explained because they chose to feature the cohort (out of five possible cohorts) with the largest longitudinal correlation between Wave-1 spanking and Wave-2 antisocial behavior (r = .29, compared to a mean of r = .22 in the other four cohorts in their Table 1, p. 764). By duplicating the strongest causal evidence against customary spanking, our study increases the likelihood of finding disciplinary alternatives with better child outcomes than spanking.
If alternative disciplinary tactics show the same adverse associations with subsequent antisocial behavior as shown by spanking, however, the small effect sizes could be due to residual confounding. To test that possibility, additional analyses determined whether the adverse outcomes remained significant after improving the measure of pre-existing differences. Complete removal of the confound of oppositional behavior in children requires that oppositional behavior be measured comprehensively and without measurement error. It follows that if the causal link between disciplinary punishments and antisocial behavior is artifactual due to residual confounding, then the adverse effects should become smaller and non-significant with improved measures of pre-existing antisocial behavior. Improvement in comprehensiveness will be evaluated in this study by comparing the trichotomous covariate used in Straus et al. with a continuous measure of the 6-item antisocial behavior scale and a 16-item measure of externalizing behavior problems. Structural equation modeling will also be used because it minimizes measurement error in the measure of pre-existing externalizing behavior problems.
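Why a coarser covariate should leave more residual confounding can be sketched with simulated data. The example below is a hypothetical illustration, not the NLSY analysis: the coefficients, the sample size, and the equal-sized cut points are assumptions, and the "continuous" covariate is taken to be an error-free measure so that coarsening is the only source of information loss. Discipline again has no true effect on later behavior.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def adjusted_slope(d, x, y):
    """OLS coefficient of d when y is regressed on d and the covariate x."""
    sdd, sxx, sdx = cov(d, d), cov(x, x), cov(d, x)
    sdy, sxy = cov(d, y), cov(x, y)
    return (sdy * sxx - sxy * sdx) / (sdd * sxx - sdx ** 2)


def coarsen(values, groups):
    """Recode values into `groups` roughly equal-sized ordered categories."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(k * n) // groups] for k in range(1, groups)]
    # The category is the number of cut points at or below the value.
    return [sum(v >= c for c in cuts) for v in values]


def compare_covariates(n=20000, seed=2):
    """Adjusted discipline slope for each covariate version (true effect is 0)."""
    rng = random.Random(seed)
    trait = [rng.gauss(0, 1) for _ in range(n)]
    discipline = [0.6 * t + rng.gauss(0, 1) for t in trait]  # hypothetical
    later = [0.7 * t + rng.gauss(0, 1) for t in trait]       # hypothetical
    covariates = {
        "dichotomous": coarsen(trait, 2),
        "trichotomous": coarsen(trait, 3),
        "continuous": trait,
    }
    return {name: adjusted_slope(discipline, x, later)
            for name, x in covariates.items()}


if __name__ == "__main__":
    for name, slope in compare_covariates().items():
        print(f"{name:>12}: adjusted slope = {slope:.3f}")
```

In this sketch the spurious adverse slope is largest with the dichotomous covariate, smaller with the trichotomous one, and near zero with the continuous one, which mirrors the prediction that apparent effects should shrink as the covariate improves.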
A final set of analyses will determine whether disciplinary tactics predict simple change scores in externalizing behavior problems in the same direction that they predict residualized change scores in the above analyses. Simulation studies have shown that most analyses of residualized change scores remain biased against corrective actions even though they control statistically for pre-existing differences[44, 50]. In contrast, the only known simulation study of simple gain scores found that such analyses are biased in favor of corrective interventions because of regression toward the mean. Several prominent methodologists have recommended analyses of simple gain scores instead of residualized gain scores for many situations [52–54]. A recent study of Canadian longitudinal data showed that analyses of residualized gain scores were biased against corrective actions implemented by both parents and professionals, whereas analyses of simple gain scores were biased in favor of those corrective actions in the same data. When confounding factors are completely corrected for, however, analyses of residualized gain scores and simple gain scores agree with each other[55, 56].
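The opposite biases of the two change-score analyses can be reproduced in a minimal simulation. The sketch below is an illustration under stated assumptions rather than a reproduction of any cited study: the trait/state decomposition, the standard deviations of 0.7, and the `selection` parameter are all hypothetical. The corrective action has no true effect, and `selection` controls how strongly it is triggered by the child's Wave-1 behavior (0 corresponds to random assignment, the simplest case with no confounding).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def adjusted_slope(d, x, y):
    """OLS coefficient of d when y is regressed on d and the covariate x."""
    sdd, sxx, sdx = cov(d, d), cov(x, x), cov(d, x)
    sdy, sxy = cov(d, y), cov(x, y)
    return (sdy * sxx - sxy * sdx) / (sdd * sxx - sdx ** 2)


def change_score_slopes(selection, n=20000, seed=3):
    """Return (residualized, simple_gain) slopes for an action with zero effect."""
    rng = random.Random(seed)
    action, wave1, wave2 = [], [], []
    for _ in range(n):
        trait = rng.gauss(0, 1)               # stable behavior problems
        behavior1 = trait + rng.gauss(0, 0.7)  # trait plus transient state
        # The action responds to actual Wave-1 behavior (hypothetical rule).
        action.append(selection * behavior1 + rng.gauss(0, 1))
        # Observed scores add measurement error to the actual behavior.
        wave1.append(behavior1 + rng.gauss(0, 0.7))
        wave2.append(trait + rng.gauss(0, 0.7) + rng.gauss(0, 0.7))
    # Residualized change: Wave-2 score regressed on action and Wave-1 score.
    residualized = adjusted_slope(action, wave1, wave2)
    # Simple gain: Wave-2 minus Wave-1 score regressed on action alone.
    gain = [y2 - y1 for y1, y2 in zip(wave1, wave2)]
    simple_gain = cov(action, gain) / cov(action, action)
    return residualized, simple_gain


if __name__ == "__main__":
    for selection in (1.0, 0.0):
        res, gain = change_score_slopes(selection)
        print(f"selection={selection}: residualized={res:.3f}, gain={gain:.3f}")
```

With selection on Wave-1 behavior, the residualized slope comes out positive (the action looks harmful) while the simple gain slope comes out negative (the action looks beneficial). With random assignment, both slopes are near zero and the two analyses agree.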
Finally, to evaluate the success of the covariate adjustments in this study, the results will include outcomes of psychotherapy for comparative purposes. If the results are due to residual confounding, the pattern of results should be similar for corrective actions by professionals as well as corrective disciplinary actions by parents.
Because some parents will have used no disciplinary tactics in the past week and others will have used multiple tactics, we also repeated all the above analyses with two additional variations. The first variation repeated the analyses for the subset of families that reported at least one disciplinary tactic during the past week. The second variation included all disciplinary tactics and psychotherapy in the same analyses of the full sample, thereby controlling for each other.
In summary, we first re-analyzed the strongest causal evidence against customary spanking, using the same National Longitudinal Survey of Youth (NLSY) cohort to investigate the apparent effect of spanking on subsequent antisocial behavior. Second, we repeated those analyses with each of three alternative disciplinary tactics: grounding, privilege removal, and sending children to their room. Third, those analyses were repeated while varying the adequacy of the covariate used for initial antisocial behavior. Covariates included a dichotomous measure, the trichotomous measure used by Straus et al., a continuous measure of the 6-item antisocial subscale, and a continuous measure of a 16-item scale of externalizing problems. Fourth, structural equation modeling was used to control for a latent factor of externalizing problems, thereby minimizing measurement error in the covariate. Fifth, we predicted simple changes in latent externalizing behavior problems, which should reverse the direction of the selection bias due to child differences if the effects are due to residual confounding associated with initial antisocial behavior. Sixth, we implemented all analyses for psychotherapy. Seventh, we repeated all these analyses for the subsample receiving any disciplinary tactics. Finally, we repeated the analyses in the full sample with all disciplinary tactics and psychotherapy as simultaneous predictors of antisocial behavior.