A Foolish Consistency

Damian Betebenner, Center for Assessment

Staff Presentation: March 7th, 2017

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines.

Ralph Waldo Emerson

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines and measurement experts.

Ralph Waldo Emerson (Damian Betebenner)

Polly wants a cracker.

Policy maker wants a valid & reliable indicator.

Questions

Background

Two Dogmas of Measurement: Reliability & Validity

Status Quo

Samples versus Populations

Convenience Samples

Random sampling is hardly universal. More typically, perhaps, the data in hand are simply the data most readily available. "Convenience samples" of this sort are not random samples. Still, researchers may quite properly be worried about replicability. The generic concern is the same as for random sampling: if the study were repeated, the results would be different. What, then, can be said about the results obtained?

David Freedman and Richard Berk (2008)

Accountability indicators as sampling

The result for a school for one year is just one observation from which to infer a school’s true score—what the school’s average would be if we could test an infinite number of students from the school’s catchment area an infinite number of times on all the test questions that might be asked.

Rich Hill (2002)

Superpopulation Fallacy

One way to treat uncertainty is to create an imaginary population from which the data are assumed to be a random sample. With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the dataset being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions. Convenience will not be denied; the source of the fiction is two-fold: (i) the population does not have any empirical existence of its own, and (ii) the sample was not in fact drawn at random.

David Freedman and Richard Berk (2008)

Samples versus Populations

To conclude on the basis of an assessment that a school is effective as an institution requires the assumption, implicit or explicit, that the positive outcome would appear with a student body other than the present one, drawn from the same population.

Lee Cronbach (1997)

Though correct, the statement is a red herring. Inferring effectiveness requires more than placing a confidence interval about a statistic. Indeed, one of the most challenging issues in growth modeling using student assessment data is in trying to make effectiveness claims based upon observational data. Unless certain design issues are met, judging a school to be effective based upon percent of proficient students is not defensible, with or without confidence bands. Moreover, if such confidence intervals embolden users into believing that it is safe to make school effectiveness claims, then perhaps their use should be avoided.

Damian Betebenner (2006)

Sports Statistics

The Emperor wears no clothes.

Alternatives