Replication of 'Marshmallow Test' Reveals Delayed Gratification May Be Less Predictive of Later Outcomes

Delaying gratification in childhood may not be as strong a predictor of later life outcomes as previous findings suggested.

Tyler W. Watts, PhD

In a replication of the famous “marshmallow test,” researchers found that a child's ability to delay gratification may not be as strong a predictor of outcomes in later life as was previously believed.

In the original experiment, conducted by Walter Mischel, PhD, a psychologist now with Columbia University, and colleagues, a researcher greeted a preschooler with a plate of treats to choose from. The child was then told that the researcher would need to leave for 15 to 20 minutes. Just before leaving, the researcher presented the child with a simple choice: either wait for the researcher to return and receive 2 marshmallows, or ring a bell that would prompt the researcher to return and limit the reward to 1 marshmallow.

Those findings showed that, in general, children who were less successful at resisting the more immediate gratification performed more poorly on a self-control task as adults. Mischel and colleagues determined that this sensitivity to so-called "hot stimuli" may persist throughout a person's lifetime.

Led by Tyler W. Watts, PhD, an assistant professor of research and postdoctoral scholar at New York University’s Steinhardt School of Culture, Education, and Human Development, the new study used a larger and more diverse sample to reexamine the test's prognostic abilities with regard to cognitive and behavioral outcomes.

"Our findings suggest that an intervention that alters a child's ability to delay, but fails to change more general cognitive and behavioral capacities, will probably have very small effects on later outcomes," Watts said in a statement. "If intervention developers hope to generate the kinds of improvements associated with the original marshmallow study, it is likely to be more fruitful to target the broader cognitive and behavioral abilities related to gratification delay."

The researchers used data from the National Institute of Child Health and Human Development’s (NICHD) Study of Early Child Care and Youth Development, which included 918 children with a valid measure of delay of gratification at 4.5 years (54 months). Much of the analysis focused on the cohort of children whose mothers had not completed a college education (n = 554).

Children in the new test were asked to wait only 7 minutes, and 55% of them waited the full time. Children with college-educated mothers hit the 7-minute cap 68% of the time, compared with 45% of the nondegree group (P <.001). All told, 23% of the children in the nondegree group waited less than 20 seconds before ringing the bell. The average wait time was 3.99 minutes (standard deviation [SD], 3.08) for the nondegree group, compared with 5.38 minutes (SD, 2.62) for the degree group.
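As a rough consistency check on these figures, the two subgroup values can be combined into size-weighted averages. The sketch below is illustrative only and is not part of the study's analysis. It assumes that every child outside the nondegree group (918 minus 554) belongs to the degree group, which the article does not state explicitly, and the helper name `pooled` is made up for this example.

```python
# Figures reported in the article.
N_TOTAL = 918
N_NONDEGREE = 554
# Assumption: all remaining children fall in the degree group.
N_DEGREE = N_TOTAL - N_NONDEGREE


def pooled(value_nondegree, value_degree):
    """Return the size-weighted average of a per-group value."""
    weighted_sum = N_NONDEGREE * value_nondegree + N_DEGREE * value_degree
    return weighted_sum / N_TOTAL


# Share of children who waited the full 7 minutes, by group.
share_full_wait = pooled(0.45, 0.68)
# Mean wait time in minutes, by group.
mean_wait_minutes = pooled(3.99, 5.38)

print(f"Implied overall share waiting 7 minutes: {share_full_wait:.1%}")
print(f"Implied overall mean wait: {mean_wait_minutes:.2f} minutes")
```

Under that assumption, the implied overall share comes out at about 54%, close to the 55% reported for the full sample. The small gap is plausibly due to rounding in the subgroup percentages.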

Because fewer of the children in the nondegree group hit the ceiling on the minutes-waited measure, that group “complements the sample of children included in the Mischel and Shoda studies. But because the subsample of children with college-educated mothers allows for a more direct replication of Mischel and Shoda’s famous work, we also present results for them, bearing in mind the limitations imposed by the substantial delay truncation,” the authors noted. They also wrote that “children in the NICHD study were given only the version of the task that Shoda and colleagues (1990) called the diagnostic condition (i.e., the children were not offered strategies and were able to see the treat as they waited).”

In comparing the outcome measures between the groups at Grade 1, the nondegree group scored 108.42 (SD, 13.71) and 49.15 (SD, 8.43) in the achievement and behavior composites, respectively, while the degree group scored 117.29 (SD, 13.47; P <.001) and 47.40 (SD, 7.87; P <.008), respectively. At age 15, the achievement composites were 101.23 (SD, 11.63) for the nondegree group and 112.72 (SD, 13.19) for the degree group (P <.001). The behavioral composites were 47.12 (SD, 9.37) and 44.50 (SD, 8.66), respectively (P <.001).

The bivariate association (β) between the time the child waited and academic achievement was 0.28 (standard error [SE] = .04; P <.001) for children in the nondegree group, which was considerably lower than the correlations from the original test of 0.57 for math and 0.42 for verbal scores. Watts and colleagues noted that when adding the controls that were measured prior to age 4.5 years to the model, “the standardized association fell to 0.10 (SE = 0.03, P = .002), and when concurrent 54-months controls were added, the association fell to a statistically nonsignificant 0.05 (SE = 0.03, P = .114).”

At both Grade 1 and age 15, when the results were controlled for child and family features, the coefficients for all 3 delay groups (<20 seconds; 20 seconds to 2 minutes; 2 minutes to 7 minutes) dropped by almost 50%. At age 15, the coefficients for all 3 groups fell between 0.23 and 0.30 and were not statistically significant (P = .752).

“Surprisingly, the addition of the background controls also flattened out the gradient of the prediction across the gratification-delay distribution. Relative to the less-than 20-[second] reference group, achievement differences for children who waited [for] more than 20 [seconds] but not the full 7 [minutes] were strikingly similar to the difference for children who waited [for] the full 7 [minutes],” the authors wrote.

Ultimately, while a child’s inability to wait for the treat did predict mathematics and reading skills in adolescence, these differences were small and disappeared altogether once the analysis controlled for socioeconomic status and environment.

"Given the attention these findings still receive when decisions are made about the skills early-intervention programs should target, we thought it was important to revisit the older work by replicating the original experiment using a newer sample and updated statistical methods,” Watts said. “Of course, these new findings should not be interpreted to suggest that gratification delay is completely unimportant, but rather that focusing only on teaching young children to delay gratification is unlikely to make much of a difference."

The study, “Revisiting the Marshmallow Test: A Conceptual Replication Investigating Links Between Early Delay of Gratification and Later Outcomes,” was published in Psychological Science.