Instead, we typically use a carefully selected sample of the domain to make as valid an inference as possible in the 2 hours we have available for the examination.
Finally, the stakes we attach to any assessment should be in proportion to the validity of our inferences.
If I claim you’re good at juggling based on how well you describe the process of juggling, the validity of my inference is low.
Whereas if I claim you're good at juggling based on how well you juggle 3 balls for 1 minute, the validity of my inference is much higher.
Validity isn't a property of a test itself.
Instead, it relates to the inferences we draw as the result of a test.
If they all answered correctly, I could conclude—with reasonable validity—that they understand osmosis.
But any conclusions I draw about their wider understanding of science would be much less valid.
I could construct a fairly valid assessment of a 6-year-old's times-tables understanding by testing them on all their times-tables.
But if I want to gauge the understanding of a 16-year-old, I can't easily test them on everything they should know. This would take days.