In the previous blog, I reviewed the research literature on retesting applicants. To summarize the findings from Part 1:
- If someone takes a test again, his/her score will increase.
- If a group of individuals is retested, the rank-order will change.
- At least two months, but more realistically 6 months to a year, should be required between most retests.
- Given a candidate is willing, there seems to be no reason to limit retests. The issues are really whether to even allow a first retest and the time between retests.
- Under typical situations, where only a portion of the applicants may be taking the test a second time, the first administration will probably be the most valid; but there are many factors that may influence this conclusion. And, from the first point above, we would expect those taking the test a second time to have higher scores than the first-time examinees.
This month, our goal is to arrive at some recommendations for applied practice, based on professional and government guidelines, the public sector testing model, and the previously mentioned research findings. This will include a discussion of how we should determine a score for someone who is retested.
Standards and Guidelines
Logically, the first place we might turn for advice would be the Uniform Guidelines. There is a whole section of the Guidelines devoted to retesting; however, that section is only two lines long and so we can reproduce it here: “Users should provide a reasonable opportunity for retesting and reconsideration. Where examinations are administered periodically with public notice, such reasonable opportunity exists, unless persons who have previously been tested are precluded from retesting. The user may however take reasonable steps to preserve the security of its procedures.” Perhaps others may find this section to be more informative than I do, although it clearly reflects a preference for retesting.
The Society for Industrial and Organizational Psychology (SIOP) provides testing principles, although they are quite dated and currently under revision. The SIOP Principles view retesting as part of general fairness and equitable treatment. As such, reassessment is recommended when feasible, with adequate descriptions of the retesting policy including time intervals. The SIOP Principles also discuss corrective reassessment, which occurs when there has been a problem with the initial administration of the test battery. Similar statements appear in the Standards for Educational and Psychological Testing.
The Public Testing Model
Unfortunately, there are many public testing models. To simplify our discussion of practical recommendations, we will deal only with entry-level and promotional testing, ignoring for this blog the topics of annual and return-to-work testing. This leaves us with three basic situations or models:
- The highly competitive examination where all the candidates are assessed on the same day, the resulting scores are used to create a list, and hiring or promotion is by rank-order or top-down. Under this model, retests are usually not permitted, at least within the same jurisdiction, although an applicant could take a similar or the same test for multiple potential employers. Allowing retests under this approach to testing would lead to complaints of unfairness from those applicants who took the test only once.
- Pass-fail testing where the candidate must achieve a certain score either to be hired or to be allowed to continue in the selection process. Some examples of common pass-fail testing include clerical tests, typing or word processing tests, physical ability assessments for safety forces, and some general managerial exams. In this situation, the individual’s relative position does not matter, only whether they can achieve some minimum score on the assessment. In such situations the examination period is often flexible and thus not restricted to a single date, or may involve continuous recruitment. It is under this model that most of the questions arise concerning whether to allow retests, how many retests to allow, and the time interval between retests. Under this model, the organization would usually accept the final or last test, since once the candidate passes they would no longer be motivated to undergo retesting.
- The use of unproctored, internet-based testing. This has become a common model in the private sector, although it is far less common in the public sector. Under this model, an applicant could apply and take the same assessment multiple times; thus, retesting is a common occurrence and there is usually no time limit. It is here that the question of scoring, and of how to calculate a final or operational value for the test, is most likely to emerge. Among the options would be to use the first test, the last test, the average test score, the highest score in the string of attempts, or an estimate of the latent trait or true score.
Scoring
How should we score retests? Should we take the first score? The last? An average of the scores? The highest score? How can we be fair both to individuals who take a retest and to those who do not?
If I had answers to the above questions, I would be rich, and I am poor. In a perfect world, we would have well-developed procedures for estimating a latent trait or true score. Each retest would simply improve our estimate of the latent trait and increase our confidence in the obtained value. However, at least for practical use in the real world of public sector agencies, we are not yet in that perfect world, so we must deal with imperfect solutions.
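To make the simpler options concrete, the first, last, average, and highest scores raised above can each be computed in a few lines. What follows is only an illustrative sketch in Python: the function name `candidate_scores` and the sample scores are hypothetical, attempts are assumed to be recorded in chronological order, and no attempt is made to estimate a latent trait.

```python
from statistics import mean


def candidate_scores(attempts):
    """Summarize one candidate's attempts (oldest first) under four simple rules."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    return {
        "first": attempts[0],
        "last": attempts[-1],
        "average": mean(attempts),
        "highest": max(attempts),
    }


# Hypothetical candidate who took the same test three times.
summary = candidate_scores([72, 78, 75])
print(summary)
```

With these made-up scores the four rules give 72, 75, 75, and 78. The same candidate can land in different places depending on the rule, which is why the rule needs to be chosen and published in advance.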
For our Model 2 situation, the pass-fail decision, scoring is straightforward: we let the individual take the test until they pass, at which point retesting ends with a passing test score.
For knowledge tests under Models 1 and 3, it makes sense to argue that the most recent score is the most valid. For other tests, an average might make sense, depending upon the situation. The issue becomes more complicated when the retesting involves a battery, unless the rule is again to use just the most recent test score; at The University of Akron we accept the highest score on each component of the battery regardless of the number of retests.
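The difference between a "most recent" rule and a "highest score on each component" rule for a battery can be sketched as follows. This is a hypothetical illustration, not the actual procedure used at any institution: the function `battery_scores`, the component names, and the scores are all invented, and sittings are assumed to be listed oldest first.

```python
def battery_scores(attempts, rule="highest"):
    """Combine battery retests component by component.

    attempts: list of dicts mapping component name -> score, oldest first.
    rule: "highest" keeps the best score seen for each component;
          "latest" keeps the most recent score recorded for each component.
    """
    if rule not in ("highest", "latest"):
        raise ValueError(f"unknown rule: {rule}")
    combined = {}
    for attempt in attempts:
        for component, score in attempt.items():
            if rule == "latest" or component not in combined:
                combined[component] = score
            else:
                combined[component] = max(combined[component], score)
    return combined


# Hypothetical record: the candidate retook only two of three components.
record = [
    {"verbal": 61, "quantitative": 70, "writing": 4.0},
    {"verbal": 66, "quantitative": 68},
]
best = battery_scores(record, rule="highest")
latest = battery_scores(record, rule="latest")
print(best)
print(latest)
```

In this example the two rules agree on the verbal and writing components but differ on the quantitative component (70 versus 68), because the candidate scored lower on the retake.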
Suggestions or Recommendations
Before offering my recommendations, I should note that those who favor retesting appear to do so on general philosophical or ethical grounds. In particular, retesting is seen as fairer to the examinees. I would certainly agree that allowing for retesting is perceived by applicants as being fairer, in particular because it allows for multiple opportunities to perform and remediate.
What is less clear, however, is whether retesting allows for a more reliable or valid assessment of knowledge, skills, and abilities. We do know that scores will increase with retesting, but this does not mean the scores will be either more reliable or more valid. Contrary to what would appear to be popular opinion, retesting may increase rather than decrease adverse impact.
Given the existence of multiple public sector selection models, no one set of suggestions will apply to all possible situations, or even to most situations. Still, this is a blog, and I did indicate I would arrive at some general recommendations.
So the first set of suggestions would be:
- Where feasible, and in the public sector it may not always be possible, allow for retesting, as it is likely to be seen by candidates as fairer and more equitable and as giving an increased opportunity to perform.
- Where possible, use a different form of the test, rather than the exact same test. Such forms may be referred to as alternate or equivalent forms. IPMA-HR does offer multiple forms of many of its more popular assessments. For a previous blog on the topic of alternative test forms see Bob Burd, Multiple Forms of Written Exams.
- Clearly explain to all applicants and candidates the retest policies. This should include whether retests are allowed, how many are allowed, the time intervals, and how final scores will be calculated.
- There is no one method of obtaining a final score that will be optimal in all situations. Carefully consider the logic of arriving at an individual’s score and publicize it clearly. In most situations, given retesting is feasible, the most logical and defensible approach will be to accept the most recent score.
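One way to make a retest policy explicit enough to publish is to write it down as a small set of named settings. The sketch below is a hypothetical illustration only: the class `RetestPolicy`, the function `may_retest`, the field names, and the 180-day interval are assumptions chosen for the example, not a recommended standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetestPolicy:
    retests_allowed: bool
    max_retests: Optional[int]  # None means no cap on the number of retests
    min_interval_days: int      # required wait after the most recent attempt
    scoring_rule: str           # e.g. "most_recent", "highest", "average"


def may_retest(policy, previous_dates, today):
    """Return True if a candidate with these prior test dates may test today."""
    if not previous_dates:
        return True  # a first attempt is not a retest
    if not policy.retests_allowed:
        return False
    retests_so_far = len(previous_dates) - 1
    if policy.max_retests is not None and retests_so_far >= policy.max_retests:
        return False
    earliest = max(previous_dates) + timedelta(days=policy.min_interval_days)
    return today >= earliest


# Hypothetical policy: unlimited retests, 180-day wait, most recent score counts.
policy = RetestPolicy(True, None, 180, "most_recent")
too_soon = may_retest(policy, [date(2024, 1, 15)], date(2024, 5, 1))
eligible = may_retest(policy, [date(2024, 1, 15)], date(2024, 8, 1))
print(too_soon, eligible)
```

Each field corresponds to one of the items the policy explanation should cover: whether retests are allowed, how many, the time interval, and how the final score is calculated.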
Overall, professional and government texts favor the use of retesting but provide a minimum of practical advice. I did find two comments worth repeating and remembering:
- Retesting should lead to increased concern with the security of the tests.
- An unpleasant topic to think about is the need for corrective assessments, where something went wrong with the initial assessment. Although infrequently discussed in the literature, organizations should probably plan for such an unfortunate possibility.
References
Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, and Department of Justice. (1978). Uniform guidelines on employee selection procedures. Federal Register, Volume 43, Number 166, 38290-38315.
Society for Industrial and Organizational Psychology, Inc. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). College Park, MD. Telephone Number: (708)640-0068 http://www.siop.org/_principles/principles.pdf
American Educational Research Association (AERA); American Psychological Association (APA); National Council on Measurement in Education (NCME) (2014). Standards for Educational and Psychological Testing. http://www.apa.org/science/programs/testing/standards.aspx