Successive Hurdles, Test Weighting and Certification Rules: Part 4

Taken as a whole, the articles in this series present a picture of the challenges and potential pitfalls in developing effective selection instruments and test batteries. Instruments must be reliable and valid so that they support the selection of the best available work force, and they must also withstand legal scrutiny. Unfortunately, experience has shown that the local laws, statutes and/or civil service rules that provide the blueprint for how HR work is to be done are often in conflict with exam development and validation procedures. In particular, certification rules, which dictate how many candidates from a ranked list can be certified for a hiring authority to consider, can undo the efforts made to conform to professional standards.

Many of the individuals tasked with writing civil service rules, particularly in the infancy of merit systems, did not have a background in test development and statistics. Many systems focused on fairness and on avoiding the abuses of various forms of the spoils system or the good ol’ boy system, but they did not take into consideration statistical concepts related to test scores, and in particular whether meaningful differences existed between scores. In some instances, certification rules narrowly defined the group eligible for certification; in others, rules were modified in an attempt to address equal employment issues. These modifications often took the form of certification of the whole list, which meant the hiring agency could select anyone on the list to put through the final selection interviews.

Both approaches ignore what we know about test scores and test development. First of all, if certification rules are too narrow and allow only three to five names to be certified, they may be placing an emphasis on differences in test scores that are not truly meaningful. While limiting the number of candidates who can participate in the final interview or hiring interview process may be efficient, it may exclude candidates who are essentially as well qualified for the job as those who are included for selection. Secondly and conversely, if all candidates can be certified and are eligible to participate in hiring interviews, we have ignored the concept of test utility and the fact that well-constructed and valid selection instruments can be appropriate for ranking. This is particularly true if a criterion-related validity study has been conducted that demonstrates a correlation between test scores and job performance, as is the case for IPMA-HR entry-level tests.

To understand the first concern, that certification rules may unduly limit the number of candidates going on for final consideration, we need to look at whether differences in test scores translate into differences in expected job performance. In other words, can we expect someone with a 91 to perform better on the job than someone with a 90? Most of us can agree that this would be difficult to prove and probably should not be our focus in establishing certification rules. Rather, we should be looking at whether the top group certified for selection, when taken as a whole, is better than those not certified.

Again, this may be difficult to prove, but statistics and logic can be used to make our certification rules more defensible and work better for us. First of all, we should be able to agree that we should use whole scores and avoid setting the cutoff between certify and don’t certify at fractions of a point. If whole-point differences don’t translate into meaningful distinctions in scores and job performance, then how can fractions? So we want to avoid certification rules that would certify someone with an 89.5, but not someone with an 89.2.

In addition, since we do want the differences to be meaningful, we should look at the group certified as a range. Rather than comparing the lowest score in the range to the next score below it, we should compare the top score in the range to the first score that falls outside the range. That is, if your certification rule, as applied, would make 95 the top score certified and 88 the lowest score certified, you want to have some confidence that the next lowest score, say an 85, is significantly different from the 95.

Candidates have attempted to challenge pass points and certification rules by comparing the 88 and the 85 in our little example and saying there is no difference in these scores. We want to be sure that we are not making the claim that there is. What we are trying to say is that there is a difference between a 95 and an 85. Since we want to select top performers, we don’t want to include anyone in our certification group that could be considered significantly below our top group.
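
One way to put numbers on this comparison is score banding based on the standard error of the difference. This technique is swapped in here for illustration; nothing above prescribes a particular statistic. The sketch below assumes a hypothetical standard error of measurement of 3.0 points, a hypothetical list of whole scores, and a 1.96 multiplier (roughly 95 percent confidence). The function name certification_band is also made up for this example. A real standard error would have to come from your own test's technical documentation.

```python
import math


def certification_band(scores, sem, z=1.96):
    """Split scores into (certified, not_certified) using a band
    measured down from the top score.

    sem is the test's standard error of measurement (an assumed input).
    The standard error of the difference between two scores is
    sem * sqrt(2); scores within z * SED of the top score are treated
    as not significantly different from it.
    """
    if not scores:
        return [], []
    sed = sem * math.sqrt(2)       # standard error of the difference
    cutoff = max(scores) - z * sed  # lowest score still inside the band
    certified = sorted((s for s in scores if s >= cutoff), reverse=True)
    not_certified = sorted((s for s in scores if s < cutoff), reverse=True)
    return certified, not_certified


# Hypothetical eligible list built around the 95 / 88 / 85 example.
scores = [95, 93, 91, 90, 88, 85, 84, 80]
certified, not_certified = certification_band(scores, sem=3.0)
print("Certified:", certified)
print("Not certified:", not_certified)
```

With these assumed values the band reaches about 8.3 points below the top score, so the 88 is certified along with the 95 while the 85 is not. The claim being made is about the 95 versus the 85, not about the 88 versus the 85.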

If we are working with rules that allow broad groups to be certified, we will need to accept that not all of those certified will be top performers. On the other hand, when we are working with rules that allow only the top three to five scorers to be certified, our rules may be too narrow and may unnecessarily eliminate candidates with the potential to be top performers. While probability statistics can be difficult to compute, understand and apply, a look at the normal curve helps illustrate some of the things that we may want to consider when examining our current certification rules.


This is the final part of a four-part series on successive hurdles, test weighting and certification rules. If you’ve just joined us, we suggest you catch up with part 1, part 2 and part 3 first. In case you missed it, check out Robert Burd’s previous series, Item Analysis In Public Safety.
