Comparisons with the Control Students

The fact that children in the Treatment group were improving (and in ways that matched targeted goals of the Inquiry Project curriculum) strongly suggests that they were benefiting from that curriculum. However, there are alternative explanations. The existing science curriculum (not designed from a learning progression perspective, but with separate units on balance scales; materials; solids, liquids, and gases; and the water cycle) might also be effective in promoting an understanding of matter at the macroscopic level. Alternatively, some aspects of student improvement might stem primarily from maturation or from learning outside the science curriculum (e.g., increased mathematical facility, or learning about volume in the math curriculum). In either case, current work on LP would have no “value added.”

To test the LP hypothesis, we need to compare the progress students make with the LP curriculum to the progress they make with a more standard approach. More specifically, how much progress did students in the same schools make with their standard science curriculum?


When we interviewed the Control students at the end of grade 3, we found many of their conceptions remarkably similar to those of our Treatment students at the time of the pre-interview. This similarity provides further evidence that many key ideas in elementary children’s matter network are in great need of reconceptualization if children are to have an appropriate foundation for learning about the atomic-molecular theory in middle school.

Further, on the key tasks calling for reconceptualization (e.g., understanding weight as a fundamental property of material, constructing a concept of volume as occupied 3D space and differentiating contexts where it is relevant rather than area or weight, differentiating weight and density), the Control students’ developmental trajectories so far have been relatively flat, with no significant changes from the end of grade 3 to grade 4 on any of these key measures. This finding is consistent with the assumption that standard science curricula are not highly effective in fostering reconceptualizations.


By the end of grade 4, we found that the Treatment students did significantly better than the Control students across multiple measures that assess reconceptualization (i.e., beliefs about whether tiny things have weight or take up space; measuring volume; knowing that volume, not weight, is the relevant variable in water displacement; using ideas about heaviness of materials in spontaneous explanations; developing some intuitions about the similarities among diverse forms of matter; more tightly inter-relating weight and taking up space with amount of matter). In addition, although the differences between the two groups did not yet reach significance for two of the four density measures (namely, systematically using relational reasoning rather than direct perceptual cues in making inferences about which object is made of a heavier kind of material, and using a stacking strategy to equalize volumes before comparing the weights of objects to infer what material they could be made of), the differences were in the expected direction for both tasks. Further, when we considered the number of students showing a systematically correct pattern on at least one of these two tasks, the difference was clearly significant (68% in the Treatment group vs. 44% in the Control group).

Of course, making such direct comparisons is tricky, as it depends upon the comparability of the students in the samples to begin with, and our design did not allow us to interview the Control students at the beginning of grade 3. Is there any reason to believe that students in the Control group made less progress because they were “less good” students or because they were starting further behind the Treatment students? In fact, the evidence suggests the reverse.

To evaluate the assumption that the Control students might be “less good” students, we gathered grade 3 MCAS Reading and Math raw scores for the two groups of students and compared their “total” scores. We found no evidence for this assumption: at one school, the Control group’s “total” score was significantly higher than the Treatment group’s. At the other school, the two groups were quite comparable and included a diverse range of student abilities.