Assessment Literacy for Teachers: Designing Tests That Actually Measure Learning
Most teachers design assessments by writing questions about what they covered. Assessment literacy means something more precise: designing assessments that validly and reliably measure what students have actually learned.
The difference matters more than most teachers realize.
Validity: Are You Measuring What You Think?
An assessment is valid when it measures the standard it's supposed to measure — not a student's ability to decode complex vocabulary, not their reading speed, not whether they paid attention to a specific example you used in class.
Check validity by comparing each question directly to the standard. "Write a claim supported by two pieces of evidence" tests argument writing. "Which of the following is a claim?" tests only recognition of claims. These are different skills, so make sure each question tests the one your standard actually names.
Reliability: Is the Score Consistent?
An assessment is reliable when students who know the material score high and students who don't score low, consistently, regardless of the specific day, version, or grader. Low reliability means scores vary for reasons unrelated to knowledge.
To improve reliability, use multiple questions per standard (a single question is too variable to trust), use rubrics for subjective responses (to reduce scorer inconsistency), and pilot questions to verify they perform as expected.
Alignment to Bloom's Levels
Check the cognitive level of your questions against your instructional goals. If you taught at the application and analysis level, assessments that only ask recall questions don't measure what you taught.
A quick check: sort your questions by Bloom's level. If 80% are recall and understanding questions but your instruction was analysis-heavy, your assessment has a validity problem.
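The quick check above can be sketched in a few lines of Python. This is only an illustration: the level labels, the `lower_level_share` helper, and the sample quiz are hypothetical, and the tagging of each question still has to be done by the teacher.

```python
from collections import Counter

# Assumed labels for the two lowest levels; adjust to match your own tagging.
LOWER_LEVELS = {"recall", "understanding"}

def lower_level_share(question_levels):
    """Return the fraction of questions tagged at the recall/understanding levels."""
    if not question_levels:
        raise ValueError("need at least one tagged question")
    counts = Counter(level.lower() for level in question_levels)
    lower = sum(counts[level] for level in LOWER_LEVELS)
    return lower / len(question_levels)

# Hypothetical 10-question quiz, one Bloom's label per question.
quiz = ["recall"] * 5 + ["understanding"] * 3 + ["application", "analysis"]
share = lower_level_share(quiz)
print(f"{share:.0%} of questions are recall or understanding")
# prints "80% of questions are recall or understanding"
```

If the share is far above the share of class time spent at those levels, that is the signal to rewrite some questions upward.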
Create assessments in seconds, not hours
Generate quizzes, exit tickets, and formative assessments aligned to your standards. Multiple formats, instant results.
Writing Better Multiple Choice
Multiple choice is overused and poorly designed in most classrooms. Better design principles: one clearly correct answer, distractors based on common misconceptions (not random wrong answers), stems that require students to understand the content rather than just recognize it, and no trick questions.
Avoid: "all of the above," "none of the above," and double-negative stems. These test test-taking skill, not content knowledge.
Performance Tasks for Complex Standards
Some standards can't be validly assessed by multiple choice: write a developed argument, deliver an oral presentation, design an experiment. Performance tasks are messier to grade but more valid for complex skills.
Anchor papers (scored examples of strong, middle, and weak work) increase scoring reliability for performance tasks. Score a class set twice with the rubric before finalizing grades — the re-scoring will catch inconsistencies.
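One way to see what the re-scoring caught is to compare the two passes paper by paper. The sketch below assumes rubric scores are whole numbers and that a gap of more than one point deserves a second look. Both the tolerance and the function names are illustrative choices, not a standard.

```python
def exact_agreement(first_pass, second_pass):
    """Fraction of papers that received the same rubric score in both passes."""
    if len(first_pass) != len(second_pass) or not first_pass:
        raise ValueError("both passes must score the same non-empty set of papers")
    matches = sum(a == b for a, b in zip(first_pass, second_pass))
    return matches / len(first_pass)

def papers_to_recheck(first_pass, second_pass, tolerance=1):
    """Indices of papers whose two scores differ by more than `tolerance` points."""
    return [
        i for i, (a, b) in enumerate(zip(first_pass, second_pass))
        if abs(a - b) > tolerance
    ]

# Hypothetical scores for six papers on a 4-point rubric.
first = [4, 3, 2, 4, 1, 3]
second = [4, 3, 3, 2, 1, 3]

print(f"exact agreement: {exact_agreement(first, second):.0%}")  # 4 of 6 papers match
print("re-read papers at index:", papers_to_recheck(first, second))  # index 3 only
```

Papers flagged here are good candidates to compare against the anchor papers before grades are final.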
LessonDraft builds alignment into lesson planning by connecting each activity to specific standards, so your assessments naturally align to what you've taught.
Using Assessment Data
The best-designed assessment is wasted if you don't use the data. After scoring, identify: which standards did most students meet? Which had the most students struggling? What patterns explain those struggles: a specific misconception, a problem with the question itself, or a gap in instruction?
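As a rough sketch of the first two questions, the snippet below computes the share of students meeting each standard. The data layout (each student's score per standard as a fraction of points earned), the 0.7 mastery cutoff, and the standard names are all assumptions made for illustration.

```python
def mastery_rates(scores_by_student, threshold=0.7):
    """Return {standard: fraction of students scoring at or above threshold}."""
    totals, met = {}, {}
    for standard_scores in scores_by_student.values():
        for standard, score in standard_scores.items():
            totals[standard] = totals.get(standard, 0) + 1
            if score >= threshold:
                met[standard] = met.get(standard, 0) + 1
    return {s: met.get(s, 0) / totals[s] for s in totals}

# Hypothetical class of four students and two standards.
scores = {
    "Student A": {"claims": 0.9, "evidence": 0.5},
    "Student B": {"claims": 0.8, "evidence": 0.6},
    "Student C": {"claims": 0.6, "evidence": 0.75},
    "Student D": {"claims": 1.0, "evidence": 0.4},
}

rates = mastery_rates(scores)
# List the weakest standards first so reteaching priorities are obvious.
for standard, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{standard}: {rate:.0%} of students met the standard")
```

The numbers only answer "which standards"; explaining why still means reading the student work.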
An item analysis (which questions did struggling students miss vs. strong students?) tells you whether questions are working as intended. If strong students also miss a question, the problem may be the question, not the students.
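A minimal sketch of an item analysis, assuming each response is scored 1 (correct) or 0 (incorrect) and that "strong" and "struggling" mean the top and bottom halves of the class by total score. That split, the 0.5 flag threshold, and the function name are simplifying assumptions. Other groupings are common.

```python
def item_analysis(responses):
    """For each question, return (number, p_strong, p_struggling, difference).

    `responses` is a list of per-student lists of 0/1 scores, one entry per
    question. With an odd class size the middle student is left out of both groups.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two students")
    strong, struggling = ranked[:half], ranked[-half:]
    report = []
    for q in range(len(responses[0])):
        p_strong = sum(student[q] for student in strong) / half
        p_struggling = sum(student[q] for student in struggling) / half
        report.append((q + 1, p_strong, p_struggling, p_strong - p_struggling))
    return report

# Hypothetical results: six students, three questions.
responses = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
]

for number, p_strong, p_struggling, diff in item_analysis(responses):
    flag = "  <- review this question" if p_strong < 0.5 else ""
    print(f"Q{number}: strong {p_strong:.2f}, struggling {p_struggling:.2f}, "
          f"difference {diff:.2f}{flag}")
```

In this made-up data, question 3 is missed by most of the strong group as well, which is the pattern that points at the question rather than the students.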
Assessment literacy makes you a more precise teacher. It's worth the investment.