Establishing Validity in Assessments
Ivy Meighan
Ashford University
EDU602: Assessing Knowledge and
Skills in the Online Environment
Instructor: Lisa Reason
May 09, 2016
Validity, according to Phelan and Wren (2005-06), "refers to how
well a test measures what it is purported to measure." For example, suppose you want
to know how many miles it is from your home to the neighborhood supermarket, but you
do not start counting until you exit the gate of your residence instead of from your
garage. In that case, the mileage from home to the supermarket is not valid, because
you did not start counting until you were a quarter of a mile from home. Downing
(2003) offers another definition: "Validity refers to the evidence presented to
support or refute the meaning or interpretation assigned to assessment results. All
assessments require validity evidence and nearly all topics in assessment involve
validity in some way. Validity is the sine qua non of assessment, as without evidence
of validity, assessments in medical education have little or no intrinsic meaning."
Regardless of the form a test takes, its most important
aspect is how the results are used and how those results affect individuals
and society as a whole. Tests used for admission to schools or programs
or for educational diagnosis not only affect individuals, but also assign value
to the content being tested. A test that is perfectly appropriate and useful in
one situation may be inappropriate or insufficient in another. For example, a
test that may be sufficient for use in educational diagnosis may be completely
insufficient for use in determining graduation from high school.
Test validity, or the validation of a test, explicitly means
validating the use of a test in a specific context, such as college admission
or placement into a course. Therefore, when determining the validity of a test,
it is important to study the test results in the setting in which they are
used. In the previous example, in order to use the same test for educational
diagnosis as for high school graduation, each use would need to be validated
separately, even though the same test is used for both purposes. Validity is a matter of degree, not all or
none.
Content validity,
criterion-related validity, construct validity, and consequential validity are the types of evidence used to
establish validity.
Content validity refers to the "match between the test questions
and the content or subject area that they are intended to assess," which is
sometimes referred to as "alignment."
Criterion-related validity "looks at
the relationship between a test score and an outcome. For example, SAT™ scores
are used to determine whether a student will be successful in college.
First-year grade point average becomes the criterion for success. Looking at
the relationship between test scores and the criterion can tell you how valid
the test is for determining success in college" (College Board, n.d.).
Construct validity, which encompasses convergent and discriminant
evidence, concerns whether a test measures the construct it is supposed to
measure. For example, suppose a student is given a scientific problem to solve
on an assessment, and the question requires a great deal of reading and
comprehension. To succeed at solving the problem, the student must have strong
reading comprehension skills. In essence, the student is being assessed not
only on scientific knowledge but also on reading skill, so the assessment is
not measuring only what it is supposed to measure.
Consequential validity "describes the
aftereffects and possible social and societal results from a particular
assessment or measure. For an assessment to have consequential validity it must
not have negative social consequences that seem abnormal. If this occurs, it
signifies the test isn't valid and is not measuring things accurately." This
problem has plagued educators and test makers for as long as assessment has
mattered. Suppose a test is written by affluent individuals who have the means
to travel the world and therefore enjoy enrichment that less affluent
individuals lack. Because the two groups have different experiences, if the
test is given to both, the more affluent group will do extremely well compared
to the less affluent group. Such a test has a consequential validity problem.
“Establishing validity in assessments ensures opportunities
for growth, change, and improvement of teaching and learning in online
learning. Identify the purpose of your
assessments in terms of desired learning outcomes. Identify the story in your assessment
results, and tell that story effectively. Focus assessment summaries on
decisions that are based on the results.
Celebrate and broadcast good assessment results.
Analyze the possible causes of disappointing results: goals,
programs, or assessments themselves.”
(Week 2 Lesson Presentation 4)
I think the validity of online assessments comes down to whether
the assessment measures what it purports to measure. The same rules that govern
classroom assessment apply online: the assessment cannot diverge from the
objectives, or from what was taught.
References
1. Phelan, C., & Wren, J. (2005-06). Exploring reliability in academic assessment. UNI Office of Academic Assessment. Retrieved from https://www.uni.edu/chfasoa/reliabilityandvalidity.htm
2. College Board. (n.d.). Validity evidence. Retrieved from https://research.collegeboard.org/services/aces/validity/handbook/test-validity
3. Week 2 Lesson Presentation. Validity and strategy.
4. Downing, S. M. (2003). Validity: On the meaningful interpretation of assessment data. Medical Education, 37(9). Retrieved from http://onlinelibrary.wiley.com/doi/10.1046/j.1365-2923.2003.01594.x/full