Diagnostic assessment in K‒12 education

September 13, 2024

By Dr. Gene Kerns, Chief Academic Officer

Choosing a name for a product takes a lot of thought. One classic blunder supposedly involved the Chevrolet Nova, which did not sell well in Latin America because its name translates as “no go” in Spanish. This is actually an urban legend, but the story illustrates how language must be thoughtfully considered.

The name should make a product stand out and, ideally, speak to what it does. Names should also be honest. If a name claims or strongly suggests that a product does something, then it should do just that, so buyers are not misled.

This is why the name of the assessment product from one of Renaissance’s competitors in the interim assessment space irritates me. The product’s name is misleading. It implies that the tool does something that it simply does not do.

While I’d prefer not to name the competitor here, suffice it to say that the product I’m referencing will be familiar to many school and district leaders. This company’s flagship assessment tool includes the word “diagnostic” in its name.

Diagnostic assessment: Determining student skill mastery

If you name an assessment tool “diagnostic,” then it should do what this name implies: diagnose. How would you react if, after undergoing a diagnostic medical test, your physician said that you “probably” had whatever condition or illness you were being evaluated for—but she added that “the tests we have run so far cannot claim to be definitive…”?

Would you view this as a conclusive diagnosis? Or an incomplete one? I know I’d want something definitive.

In psychometrics, “diagnostic assessment” is about making a definitive call. It’s about determining whether students have or have not mastered specific skills (such as letter-sound correspondence) or exhibit certain profiles (such as characteristics of dyslexia). For this reason, the thresholds set for reliability and validity when an assessment claims to be diagnostic are exceptionally high. Diagnostic assessments are about precision, not about what’s “probable.”

Although our competitor’s assessment tool includes the word “diagnostic” in its name, its technical manual speaks more accurately when it discusses how “computer adaptive testing and the Rasch Item Response Theory model form a strong foundation for ensuring valid inferences are reported by [the assessment].” The manual adds that “from the ability level of the student and the difficulty level of these indicators, [the assessment] can make probabilistic inferences about what students know and are likely able to do.”
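To make “probabilistic inferences” concrete, here is a minimal Python sketch of the standard Rasch model, in which the chance of a correct answer depends only on the gap between student ability and item difficulty. The function name and all numeric values are illustrative assumptions, not taken from any vendor’s technical manual.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (one-parameter) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical values on the logit scale, chosen only for illustration.
student_ability = 0.5
item_difficulties = {"easier item": -1.0, "matched item": 0.5, "harder item": 2.0}

for label, difficulty in item_difficulties.items():
    p = rasch_probability(student_ability, difficulty)
    print(f"{label}: P(correct) = {p:.2f}")
# easier item: P(correct) = 0.82
# matched item: P(correct) = 0.50
# harder item: P(correct) = 0.18
```

Every output is a probability. The model estimates what a student is likely able to do, which is different from confirming mastery of a specific skill.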

Let me be clear. The three leading K–12 interim assessment providers—Renaissance included—all use computer-adaptive testing and Rasch modeling. Every word I just cited from our competitor’s technical manual is also true of Renaissance’s Star Assessments and of the third provider’s assessment. But, of the three market leaders, only one company labels its tool as “diagnostic.” Why is that?

If all three tests are designed in similar ways, then:

Two of the companies are under-representing what their tools can do in terms of “diagnosing”; or
One company is over-representing its tool’s capabilities.

It’s clearly the latter option.

Diagnostic assessment vs. screening tools

Why is this point about diagnostic assessment so important?

The impact of this misrepresentation of the product’s capabilities became apparent recently as I responded to an audience member in the chat during a webinar. The educator I was conversing with used the other company’s test. She remarked that “diagnostics should be used for those students noted as being at-risk by screening tools.”

Because she paired this remark with the name of the test she was using, it was clear that she believed the test was capable of diagnosis and was diagnosing her students’ specific needs. It was not.

The technical manual for the assessment notes, appropriately, that it can provide “valid inferences” or “probabilistic inferences about what students know and are likely able to do.” However, the specificity and reliability of the information provided do not rise to the truly diagnostic level. The educator had been misled by the product’s name, which (whether intentionally or not) overstates what the tool can do.

An example: Diagnostic assessment of essential reading skills

As I explain in my most recent Assessment Masterclass video, there are significant differences between:

General outcome measures, such as our Star Reading and Star Math assessments, and most other interim assessment tools on the market; and
Mastery-based measures, which can be used to diagnose at the skill level.

These two categories of assessment differ substantially, much the same way that the evaluation of key health metrics during a physical exam (blood pressure, pulse rate, etc.) differs from the detailed evaluation of a truly diagnostic procedure like an MRI or CAT scan of a targeted area. For this reason, the only Renaissance assessment tool that we position as having pre-configured diagnostic functions is Star Phonics, which truly can assess individual skills at the mastery/diagnostic level when necessary.

Why specifically target early grades reading skills?

We targeted these skills because, according to Louisa Moats and Carol Tolman, 70‒80 percent of struggling readers have difficulty reading words due to a deficit in phonological processing. If that area—early grades skills around phonological processing—is the chief source of eventual reading difficulty, then it is deserving of special attention and focus.

Star Phonics: Diagnosing individual phonics skills

Students’ first encounter with Star Phonics is through its screening function. As with other screeners, the tool covers a broad range of skills, with only a few items on any one skill. This approach aligns with earlier ideas that I covered in the Masterclass series, about the need for screening tools to be highly efficient.

If the screener reveals no areas of concern, then there is no need for additional work with Star Phonics. However, if the screener reveals that a student is struggling with a skill that has already been taught—and Star Phonics can be aligned to your reading scope and sequence to indicate this—then teachers can administer a diagnostic assessment.
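The decision described above can be sketched in a few lines of Python. This is only an illustration: the cutoff, the class and function names, and every category other than R-controlled vowels are hypothetical and do not describe how Star Phonics is implemented.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoff: the article does not state what score signals an area of concern.
CONCERN_CUTOFF = 0.6

@dataclass
class ScreenerResult:
    category: str
    proportion_correct: float  # share of the few screener items answered correctly
    already_taught: bool       # per the school's reading scope and sequence

def categories_needing_diagnostic(results: List[ScreenerResult]) -> List[str]:
    """Return categories that were taught but still show low screener performance."""
    return [
        r.category
        for r in results
        if r.already_taught and r.proportion_correct < CONCERN_CUTOFF
    ]

# Illustrative screener results for one student.
results = [
    ScreenerResult("Consonant digraphs", 1.0, already_taught=True),
    ScreenerResult("R-controlled vowels", 0.33, already_taught=True),
    ScreenerResult("Vowel teams", 0.33, already_taught=False),
]

print(categories_needing_diagnostic(results))  # ['R-controlled vowels']
```

In this sketch, a low score on a category that has not yet been taught does not trigger a diagnostic, which mirrors the role of the scope-and-sequence alignment.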

When it comes to determining mastery of skills (diagnosing), Classical Test Theory is the model used. This requires that a number of items on each skill, typically 5‒7, be presented to the student. Star Phonics’ diagnostic assessments follow this construct by targeting only specific skills that teachers choose, based on the results of the screening assessment. The focus of the diagnostic assessments’ design allows them to return specific, skills-related information.

Diagnostic assessment for R-controlled vowels

Let me illustrate this process using R-controlled vowels. This category of skills, along with 11 others, is assessed at a high level through the Star Phonics screening tool:

*Star Phonics screening categories (left column) and diagnostic skills areas (right column)*

Suppose a teacher has covered R-controlled vowels during regular instruction but subsequently finds, through low performance on that category on the screener, that a student is still struggling in this area. At this point, the screener has identified that R-controlled vowels represent an area of weakness, but not specifically which combinations are proving problematic for the student.

The teacher could then administer the Star Phonics diagnostic assessment related to R-controlled vowels. In contrast with the screener, the diagnostic assessment targets only the identified skill area, with a higher number of items.

As shown below, the resulting Diagnostic Report pinpoints the specific R-controlled vowel the student is struggling with: “ir.” This allows the teacher to focus instruction and practice on this particular vowel—ideally with some additional work on “ar” as well—rather than spending time reteaching the other R-controlled vowels, which the student has already mastered.

*Star Phonics Diagnostic Report, R-controlled vowels*
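The per-skill tally behind a mastery-level report of this kind can be sketched in Python, following the Classical Test Theory approach of presenting several items per skill. The responses, the six-items-per-skill design, the 80 percent cutoff, and the “er,” “or,” and “ur” entries are all hypothetical and are not taken from the actual report.

```python
# Hypothetical mastery cutoff: the article does not state the criterion used.
MASTERY_CUTOFF = 0.8

# Illustrative item responses (1 = correct, 0 = incorrect), six items per skill.
responses = {
    "ar": [1, 1, 0, 1, 1, 0],
    "er": [1, 1, 1, 1, 1, 1],
    "ir": [0, 1, 0, 0, 0, 1],
    "or": [1, 1, 1, 1, 1, 0],
    "ur": [1, 1, 1, 1, 0, 1],
}

def score_skills(responses, cutoff=MASTERY_CUTOFF):
    """Map each skill to (items correct, items given, mastered?)."""
    report = {}
    for skill, items in responses.items():
        correct, total = sum(items), len(items)
        report[skill] = (correct, total, correct / total >= cutoff)
    return report

report = score_skills(responses)

# Skills below the cutoff, weakest first.
needs_work = sorted(
    (skill for skill, (_, _, mastered) in report.items() if not mastered),
    key=lambda skill: report[skill][0] / report[skill][1],
)

for skill, (correct, total, mastered) in report.items():
    status = "mastered" if mastered else "needs work"
    print(f"{skill}: {correct}/{total} correct, {status}")
print("Focus instruction on:", needs_work)
```

With these made-up responses, “ir” (2 of 6 correct) is the clear weakness and “ar” (4 of 6) falls just below the cutoff, while the other three vowels meet the criterion.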

Supporting mastery for every student, every skill

When discussing Star Phonics, I often remark that it’s about “Every student, every skill, to the mastery level.” During one webinar an attendee asked, “Has this assessment been normed?” The presenter immediately responded, “No, it has not been normed, and it will not be.” This is because, when it comes to essential phonics skills, anything short of mastery for every single student would represent a failure.

This is a perfect example of a critical, criterion-referenced assessment. The established criterion is that all students master all of these skills to be successful in reading. In contrast, a norm-referenced assessment presupposes that some students will perform above the norm and others below it.

In the case of the critical foundational reading skills measured by Star Phonics, mastery of all skills by all students must be our goal.

How to make the most of diagnostic assessment in K‒12 education

To some degree, diagnostic assessments seem wonderful, don’t they? Educators appreciate the specificity of the feedback: this student needs help with this particular skill. But a key idea covered in my Masterclass discussion is that there is no single assessment tool that will do everything we need. There are only tradeoffs, because each form of assessment has its own strengths and weaknesses.

So, what disadvantage might there be to mastery-level assessments? They take a lot of time, so they should be used sparingly.

Consider that, with Star Phonics, we are focusing on a subset of skills that are highly predictive of reading success, rather than focusing on all aspects of reading. This comparatively narrow focus makes assessing these skills at the mastery level doable.

In contrast, Renaissance once offered a product called Accelerated Math. It included a skill-level focus with an appropriate number of items on each skill to allow it to receive perfect scores from the National Center for Response to Intervention as a mastery measure. It covered every math skill from kindergarten through Calculus, and it represented the most comprehensive mastery measure for mathematics ever developed.

However, a fairly consistent theme of educator feedback was that “it takes so long and becomes too much to manage.”

Does this mean we should never use diagnostic assessments/mastery measures? Certainly not. We simply need to use them sparingly, strategically, and with the most essential skills.

Renaissance: The right assessment at the right time

With this point in mind, let’s return to our opening discussion, about some providers overstating what their assessments are truly capable of doing. Don’t be swayed by a screening tool’s over-reaching promises of providing “diagnostic information” or the ability to “pinpoint exactly what each student needs.”

Within the interim assessment market, Renaissance’s two primary assessment tools—Star Assessments and FastBridge—stand out in two ways. First, they’re among the shortest screening tools in terms of administration time. In an earlier Masterclass, I explained how more isn’t always better—that is, how longer screening assessments with more items don’t always add value to the process.

I also explained that our shorter assessments are highly rated for both reliability and validity by external, objective reviewers.

Second, both Star and FastBridge are part of a larger, comprehensive ecosystem of assessment resources that includes Star Phonics, as well as our DnA formative assessment tool. Rather than being forced to use the competitor’s needlessly long, 45-minute screening tool with limited diagnostic/mastery-level abilities, you could—with Renaissance—use a 20-minute screening tool and then have an additional 25 minutes to use however you choose.

In some grades, this might be time devoted to administering Star Phonics to all students. In others, benchmark assessments or mastery probes targeting key standards could be administered using DnA. The choice is yours—you get to use the assessment tool that will best answer your questions about student learning.

For more insights on assessment, explore my complete Masterclass series.
