As stated in the last article, appeals are typically more formal than complaints, and there are usually written rules and procedures that govern the handling of appeals. These rules typically spell out how an appellant files an appeal and how the agency responds to it.
The related rules range from general to very specific, and they can often be found in the agency's civil service rules or in a separate document of their own. At times, unions negotiate appeal procedures, though these are not typically a mandatory subject of bargaining. At one point in my career, I worked for a police agency that had been rocked by a cheating scandal. To prevent further cheating, the agency created a very onerous and expensive test-creation and appeal process that wreaked havoc on its ability to develop valid and reliable written tests.
This brings me back to a point I made in the previous article and will stress again here: if your written appeal procedures are not consistent with sound test development and validation strategies, make it a goal to get them rewritten and approved so that they are. Your rules should work for your test development and validation program, not against it.
I’ve seen a full range of appeals: some are broad enough to cover the entire test, while others are narrower, focusing on individual items. Broad appeals essentially claim that you have a bad test; narrower appeals claim that you have a bad item. Most systems follow the individual-item approach, with the appellant arguing that an item is not job-relevant, that his or her answer is just as good as the keyed answer, or that the item is so poorly written it should be thrown out. Internal candidates, having often lived with appeal procedures for years, become quite sophisticated in their use of the system. Civilians and other external candidates, on the other hand, are usually new to the appeal process and tend to be less sophisticated in their approach. In my experience, rather than tackling a single item or a few select items, this group often uses a shotgun approach, asking to review the whole test so they can go on a witch hunt.
Perhaps you have gathered by now that I am not a big fan of appeals, and I can honestly say I am not. Typically, the appellant has nothing to lose in the appeal process, gets extra time with the test or test items, and forces test developers into defending the instrument they created and how they created it. While test developers, whether in-house or working as consultants, should be able to articulate how an item was created and how it relates to the target job, in-house practitioners are often not given the opportunity to provide support for the inclusion of an item. Even when they are given that opportunity, a great deal of time can be consumed by the process, particularly if the system allows frivolous appeals. In addition, the individuals who judge appeals are often unfamiliar with test development procedures and may not know the test material. Further, most test writers want to reuse well-constructed tests that took months to create, and appeals can destroy the possibility of reusing a test.
In fact, the time and efficiency gained by being able to reuse tests should be the basis for an argument against appeals. I also believe that appeals are often based on a false premise: appellants claim that, at the time of the test, they possessed a body of knowledge that led them to answer the challenged item(s) in a specific way. As I have listened to and read appeals, it has become painfully clear to me that appellants often did not possess, at the time of the test, the knowledge on which their appeals are based. Beyond being another thing I dislike about appeals, this shows how unfair the appeal process can be to the test developer. I believe appeals should be made in writing when an appellant finishes the test and turned in before the appellant leaves the test room. After all, a test can only measure the knowledge possessed at a specific moment in time. Anything that occurs post-test, alters the individual's breadth of knowledge, and is then applied back to his or her test score tends to invalidate the final score.
Despite my experience regarding appeals and my campaigns to limit them and their impact on tests, I recognize that many jurisdictions still provide appeal processes. Very often, levels of management that may not be familiar with the test development and validation process believe an appeal process is necessary to ensure the "fairness" of a test and its acceptability to candidates. That being the case, rules are often made to guarantee that an appeal process is provided. Practitioners, out of necessity, must comply with these rules, but should also work to ensure that the rules comply with sound test development and validation procedures. One suggestion for mitigating the impact of appeals on the effectiveness of the test is to limit appellants' review to only those items they got incorrect and, rather than showing them the correct answer, require them to defend their chosen answer. Of course, when using outside test developers, it is important to publicize to candidates and management the test developer's rules for question challenges and ensure that all parties comply with those rules.
Scoring raises another issue related to appeals: it must be determined how successful appeals affect individual scores. Does everyone benefit from a successful appeal, or just the appellant? In my experience, everyone who had answered an item the same way the successful appellant did had a point added to their score. This meant that individuals who had failed to answer an item the way it was keyed could gain ground on the people who had selected the correct answer originally. Statistically, this means that the individuals with the lowest scores, that is, with the most items wrong, had the greatest odds of improving their scores. This issue proved to be another example of the unfairness of the appeal process and how it hurt the selection process.
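The rescoring policy described above is simple to state as code. Here is a minimal sketch (all names and data are hypothetical, not from any real system): when an appeal is upheld, every candidate whose response to the appealed item matches the answer accepted on appeal gains one point.

```python
def rescore(responses, scores, item, upheld_answer):
    # Grant one point to each candidate whose answer on the appealed
    # item matches the answer accepted on appeal.
    return {cand: pts + (1 if responses[cand].get(item) == upheld_answer else 0)
            for cand, pts in scores.items()}

# Hypothetical data: item 7 was keyed "a"; an appeal upholds "c" as creditable.
responses = {"low_scorer": {7: "c"}, "high_scorer": {7: "a"}}
scores = {"low_scorer": 60, "high_scorer": 90}

new_scores = rescore(responses, scores, item=7, upheld_answer="c")
# The low scorer gains a point while the high scorer, who chose the
# keyed answer, gains nothing -- the gap between them narrows by one.
```

Note that only candidates who chose the originally unkeyed answer benefit, which is precisely why the lowest scorers have the most to gain from this policy.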
Granted, there are times when a test developer writes a bad item or mis-keys an item. These situations, however, should be caught by the test writer before the test is actually used. This is one big advantage of using consultants or purchasing tests from professional test developers: the tests they prepare have had numerous trial runs to remove potential problems before the final product is put into use. If you are still developing exams in-house, you should learn from this example and find a method to try out your test before it is used on its intended population. This applies to entry-level as well as promotional exams.
If you want to develop a good test, it is important to write more items than you intend to use on the test. This will allow you to clean up the test using an item analysis and feedback from test takers. A thorough item analysis will show you which items may be mis-keyed, along with any items that are too easy, too hard, or negatively discriminating. Revising or removing these bad items and selecting only the best items from the item analysis will ensure you have the best test possible and help prevent the need for appeals. Providing candidates with information about the post-test analysis and review of items, and how that could affect their scores, may also prove useful in preventing appeals.
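A basic item analysis of the kind described above can be sketched in a few lines. This is a simplified illustration, not a substitute for proper psychometric software: difficulty is the proportion of candidates answering an item correctly, and discrimination is approximated here with a common upper-lower index, comparing the top-scoring half of the group against the bottom-scoring half (the halving method and the data are my own assumptions for illustration).

```python
def item_analysis(matrix):
    """matrix[i][j] is 1 if candidate i answered item j correctly, else 0.
    Returns one (difficulty, discrimination) pair per item: difficulty is
    the proportion correct; discrimination is the proportion correct in
    the top-scoring half minus that in the bottom-scoring half."""
    n = len(matrix)
    order = sorted(range(n), key=lambda i: sum(matrix[i]), reverse=True)
    top, bottom = order[: n // 2], order[n - n // 2:]
    stats = []
    for j in range(len(matrix[0])):
        p = sum(row[j] for row in matrix) / n
        d = (sum(matrix[i][j] for i in top) / len(top)
             - sum(matrix[i][j] for i in bottom) / len(bottom))
        stats.append((round(p, 2), round(d, 2)))
    return stats

# Hypothetical responses from four candidates on four items.
matrix = [[1, 1, 1, 0],
          [1, 1, 0, 0],
          [1, 0, 0, 1],
          [0, 0, 0, 1]]
stats = item_analysis(matrix)
# The last item discriminates negatively: high scorers miss it while low
# scorers get it -- the classic signature of a mis-keyed or badly written item.
```

An item with a strongly negative discrimination value is the first thing to check against the answer key before the test, or any appeal, goes forward.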
In that regard, appeals are similar to complaints in that they are best avoided. If you have the opportunity to rewrite appeal procedures, keep in mind some of the suggestions I have made in this article. Do your best to design an appeal process that requires potential appellants to write out their appeals and/or issues with the written exam before they leave the testing room, relying only on the knowledge they possessed at that time. I would also add points only to the test scores of those who can present a cogent argument, whether it concerns a particular answer or the entire item.
This post-test approach prevents teams of appellants from banding together to attack a test and pressing for appeals to be granted through the sheer volume of challenges to particular items. Remember, any good test should include a complete range of item difficulty. Some relatively easy items can help bolster test-taking confidence. Writing most items at medium difficulty, where about half the group answers correctly, maximizes the differentiation in scores. Rounding out your test with some items that very few people answer correctly establishes a ceiling, ensuring that only those with the most knowledge earn the highest scores. Again, I want to point out what should be apparent by now: a broad-ranging appeal process interferes with your efforts to use sound test development practices.
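Assembling a test with the difficulty mix described above can be sketched as a simple selection from an item pool. The difficulty bands below (easy at 0.8 proportion-correct or higher, hard below 0.4) and the pool itself are illustrative assumptions, not fixed standards:

```python
def select_items(pool, n_easy, n_medium, n_hard):
    # pool: list of (item_id, difficulty) pairs from an item analysis,
    # where difficulty is the proportion of candidates answering correctly.
    easy = [i for i, p in pool if p >= 0.8]
    medium = [i for i, p in pool if 0.4 <= p < 0.8]
    hard = [i for i, p in pool if p < 0.4]
    # Draw the target counts from each band: a few easy items for
    # confidence, mostly medium for differentiation, some hard for ceiling.
    return easy[:n_easy] + medium[:n_medium] + hard[:n_hard]

# Hypothetical pool: a few easy items, mostly medium, a couple of hard ones.
pool = [("q1", 0.9), ("q2", 0.85), ("q3", 0.5), ("q4", 0.55),
        ("q5", 0.45), ("q6", 0.2), ("q7", 0.3)]
chosen = select_items(pool, n_easy=1, n_medium=2, n_hard=1)
```

In practice you would also weigh each item's discrimination value, not just its difficulty, when making the final cut.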
Hopefully, this article has shown you some of the issues related to appeals and given you some suggestions on how to avoid them. I wish you well in dealing with the challenges practitioners face in handling and avoiding appeals. I have drawn on my years of experience for the tips provided here, and I would sum up by saying that, given the choice, I would prefer to rely on tests written by experts rather than those developed in-house. In the next article in this series, I will outline some of the ideas I used to try to improve our appeals system.