If you’d like to review the previous articles in this series, which were posted back in November, you can find them here: Part 1: Complaints & Appeals Related to Testing: An Overview and Part 2: Considering Your Appeals Process.

This is the third article in the series on complaints and appeals, and it is intended to give courage and hope to those of you in the HR profession who are dealing with rules governing complaints and appeals that do not support sound test development and validation procedures. If we are to improve the effectiveness of testing and the value of the work done in our profession, we must recognize that there are times we need to work to change rules that run contrary to sound practice. Being a change agent can be fraught with risk, but it can also produce rewards. Before moving forward with any effort to modify existing rules, it is critical to assess the climate in which you work and the impact appeal procedures have on the utility of the tests you use.

Some of the basics we know about test development and validation include the fact that tests measure only the KSAPs (knowledge, skills, abilities, and personal characteristics) an individual possesses at the time of testing. We also know that most tests used in Human Resources are either aptitude tests or achievement tests.

In general terms:

  • Aptitude tests measure one’s ability to learn and retain information over time; they are typically used for entry-level testing.
  • Achievement tests are designed to measure one’s knowledge of a particular subject after having received training and/or experience in that area.

Anything that occurs post-test to give candidates the opportunity to review the test and appeal test items changes the body of knowledge candidates can apply to the test, and therefore damages the test’s reliability. That is, we are now measuring candidates’ ability to conduct research and make cogent arguments about the quality of test items and their answers relative to the keyed answers. We can no longer determine what candidates knew or did not know at the time of the test. So when we change candidates’ scores based on appeals, we are giving them credit for information they may or may not have had during the test. That means we are no longer measuring what the test was intended to measure, and score alterations that reduce reliability also reduce the validity of the test.
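To make this concrete, here is a minimal simulation sketch in Python. It is purely illustrative; the score scale, error size, and appeal-credit formula are all assumptions, not data from any real testing program. The idea is simply that appeal credit driven by a candidate’s research-and-argument skill, rather than by test-day knowledge, weakens the relationship between scores and the knowledge the test was built to measure.

```python
import random
import statistics

random.seed(1)
N = 500  # hypothetical candidate pool

# True knowledge at the time of testing (0-100 scale, assumed distribution).
knowledge = [random.gauss(70, 8) for _ in range(N)]

# Observed test scores: knowledge plus a little measurement error.
test_scores = [k + random.gauss(0, 4) for k in knowledge]

# A separate trait: skill at researching items and arguing appeals.
# By assumption, it is unrelated to what a candidate knew on test day.
argument_skill = [random.random() for _ in range(N)]

# Scores after appeals: up to 8 points of credit, driven by argument skill.
appealed_scores = [s + 8 * a for s, a in zip(test_scores, argument_skill)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# The appealed scores track true knowledge less closely than the originals.
print(f"knowledge vs. original scores: {pearson(knowledge, test_scores):.3f}")
print(f"knowledge vs. appealed scores: {pearson(knowledge, appealed_scores):.3f}")
```

Run it and the second correlation comes out lower than the first: the adjusted scores now partly reflect appeal skill, which is exactly the contamination described above.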

So the ideal appeals system would be one that eliminates the negative impact bad systems can have on the test while still giving candidates an opportunity to satisfy their concerns about the adequacy of test items. To move in that direction, we need to thoroughly understand the impact of appeals on tests, and we must understand the motivation behind the rules that provide for appeals. We must also know who the stakeholders in the process are, along with their biases. As a general rule, it is relatively safe to say that most individuals not involved in the test development and validation field have a bias against written tests. I found this particularly interesting when I taught classes to managers on how to conduct valid selection interviews that conformed to the Uniform Guidelines on Employee Selection Procedures (UGESP, 1978).

When I asked class members how many liked tests, nobody raised a hand; when I asked how many were test development experts, again, nobody raised a hand; and when I asked who knew what the UGESP were, I got the same non-response. However, when I asked how many of the managers liked the interviews they had developed and thought they were doing a good job of selecting the best candidates, every hand went up.

Most of them were reluctant to believe me when I told them that research shows interviews that are not structured and developed using appropriate test development procedures can be less effective than random selection. I didn’t win many converts who suddenly thought written tests were great, but I did have the opportunity to point out the time and research that goes into designing and constructing written exams compared to the effort managers put into creating their interviews. I also got better cooperation when working with them to design structured interviews for their use.

This is just one example of the inroads test developers can make in helping test users better understand the process. I faced an even greater challenge, however, when working for a large police agency that was saddled with a crushing appeal system no one was happy with, one that destroyed the quality of the promotional written exams. While I succeeded in changing this process and making the written tests better for all concerned, I didn’t win any popularity contests. The process I went through in making the changes can serve as a practical case guide for anyone in a similar position.

When I first started with that agency, I was overwhelmed by the selection systems in place and by the promotional process. In particular, the rules required new selection procedures to be developed in-house and administered annually for Police Captain, Police Lieutenant, Police Sergeant, Corrections Lieutenant, and Corrections Sergeant. Each written exam had to be composed entirely of new items, and twice as many items had to be written as would be placed on the test. If a test was to contain 100 items, the test developer had to write 200. Across five written exams, that translated into 1,000 new items created annually. Once the items were completed, they were drawn at random from a hat and placed on the test in the order drawn.

After the written exam, all candidates were required to drop their original answer sheets in a locked box. In exchange, they were entitled to a copy of the items they missed, with all answer options and the keyed answer indicated. They then had two weeks to submit written notice of intent to appeal items along with a brief statement of the basis for each appeal. They were allowed to argue either that their answer was as good as or better than the keyed answer, or that the whole item was bad and should be thrown out.

Once all written appeals were submitted, a date and time was set for oral arguments. Appeals were heard by a panel of three members from other police agencies, each at least one rank above the rank for which the test was developed. It was my responsibility to select and train these members in the appeal process. It was also my responsibility to conduct the actual oral appeal proceedings, though I could not comment on the tests or test items; I was only allowed to address procedural questions. The first appeal hearing for Police Sergeant lasted from 0800 on the first day until 0200 on the second.

Appeal proceedings for the other ranks were not quite as lengthy because there were fewer candidates, but they were just as cumbersome and functioned in much the same way. A great deal of the oral argument centered on bringing in source material that contradicted the source material the test was based on. This was common practice when I started because the promotional posting did not identify any specific sources from which the test would be drawn.

In addition to writing 1,000 new items annually, I was tasked with designing new oral exams for four of the promotional classifications and an assessment center for Police Captain. The work on promotional exams was done amid my other duties, which required that I design, construct, and validate selection procedures for all entry-level positions, including Police Officer, Corrections Officer, Crime Scene Investigator, and Police Dispatcher.

After one year of experience with this process, I was able to start making inroads toward change. Along the way, I learned several things:

  • The entire process had been designed to prevent cheating.
  • There was a major distrust of civilians.
  • It was not actually a requirement that each test consist of all new items; the rule only prohibited using an item that had appeared on the previous year’s test.
  • No item bank had ever been created, simply because no one had thought of it.
  • The appeal process had been written by the Deputy Chief over the Personnel Bureau.
  • Some ranking officers in the agency did not like the current process and might be willing to listen to new proposals.
  • No one was being selected off the lists for Police Captain, Police Lieutenant, or Corrections Lieutenant.

Fortunately for me, many forces came into play that allowed me to dramatically alter the way the agency conducted its promotional processes. First, a new civilian personnel director had been hired, and he was eager to improve a personnel bureau that had previously been commanded by police captains with no training in personnel. Second, I was new and had been hired for my background in test development and validation, so I had credibility that isn’t often afforded in-house practitioners; for my first few years with the department, I was given a degree of authority usually reserved for consultants. Third, many of those considered the best and brightest in the department saw more bad than good in the appeal process, having been hurt by it in their climb through the ranks. Fourth, new money in the budget allowed me to hire an additional analyst for support. Finally, the Deputy Chief who had written the rules governing appeals gave notice that he was retiring.

While every one of those factors contributed to my success in changing procedures, the biggest motivation to improve the selection processes came from the United States Department of Justice. Within my first year with the agency, the DOJ filed suit alleging discrimination against minorities and women in the agency’s hiring and promotional practices. As a team, those of us in the personnel bureau worked to identify and contract with a nationally known expert in test validation to help in the department’s defense. After reviewing the current tests I had written using the content validity model, along with the older tests that had been designed without any particular support, the consultant concluded that my tests were valid and defensible while the earlier work was not. This gave my credibility within the department a huge boost and secured my tenure for at least the life of the lawsuit.

While it did not happen overnight, I was able to make significant changes to the appeal process. I did not get everything I pushed for, but I got enough to make the promotion process better for all involved, even if many of the officers did not embrace every change. The changes included:

  • Eliminating the requirement to write double the number of items to appear on the test, which shifted the focus from quantity to quality.
  • Establishing an item bank that allowed us to reuse items based on their item-analysis results (sketched below).
  • Extending the life of the eligibility lists for Police Captain, Police Lieutenant, and Corrections Lieutenant from one year to two; testing for these classes every other year allowed more time to develop the related selection instruments.
  • Eliminating the practice of placing items on the test in the order drawn, which allowed items measuring the same knowledge to be grouped together.
  • Creating a reading list and an annual testing schedule, so that everyone preparing for promotion knew what to study and the month each promotional written test would be given.
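The item bank is the change that lends itself most readily to a concrete sketch. The Python below is hypothetical, not the agency’s actual system: the field names and the statistical cutoffs (classical difficulty between .30 and .90, discrimination of at least .20) are assumptions chosen for illustration. It does encode the real rule I had discovered, that an item simply could not repeat from the prior year’s test, and it groups selected items by knowledge area instead of hat-draw order.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """One written-exam item, stored with its item-analysis statistics."""
    item_id: int
    knowledge_area: str    # e.g., "criminal law", "patrol procedures"
    difficulty: float      # proportion of candidates answering correctly
    discrimination: float  # item-total (point-biserial) correlation
    last_used_year: int    # 0 if the item has never been used

def select_items(bank, test_year, per_area):
    """Choose reusable items: statistically sound, absent from last year's
    test, and grouped by knowledge area rather than random draw order."""
    eligible = [
        item for item in bank
        if 0.30 <= item.difficulty <= 0.90        # assumed cutoff: not too easy or hard
        and item.discrimination >= 0.20           # assumed cutoff: separates strong from weak
        and item.last_used_year != test_year - 1  # the actual rule: no repeat from prior year
    ]
    selected = []
    for area in sorted({item.knowledge_area for item in eligible}):
        candidates = [i for i in eligible if i.knowledge_area == area]
        # Prefer the most discriminating items within each knowledge area.
        candidates.sort(key=lambda i: i.discrimination, reverse=True)
        selected.extend(candidates[:per_area])
    return selected

# Toy usage: item 2 is screened out (too easy, poor discrimination) and
# item 4 is screened out (used on last year's test).
bank = [
    Item(1, "criminal law", 0.75, 0.35, 2023),
    Item(2, "criminal law", 0.95, 0.10, 0),
    Item(3, "patrol procedures", 0.60, 0.28, 2022),
    Item(4, "patrol procedures", 0.55, 0.42, 2024),
]
for item in select_items(bank, test_year=2025, per_area=2):
    print(item.item_id, item.knowledge_area)
```

Even a bank this simple captures the shift in philosophy: item quality is tracked over time, and test construction becomes a selection problem rather than an annual writing marathon.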

In addition to these changes, I was able to revise the rules to permit one department member to sit on each promotional oral board and one to sit on each appeal panel. We were also able to select an in-house panel to review all proposed written items and determine whether they were suitable. These changes, together with the others, greatly reduced the number of appeals filed and brought the number of appeals granted down to an average of three or fewer.

Some of the lessons that you can take away from my experience include:

  • Look for the right time to encourage change.
  • Seek out command staff and managers who will support change.
  • Enlist any test developers or consultants that you contract with to encourage change.
  • Be prepared to demonstrate the negative impact of appeals on test results and the organization.
  • Be willing to compromise, try, and try again.
  • Identify the individuals who will fight change.
  • Identify the arguments against change and be prepared to address them.
  • Be patient and listen carefully to the opposition.

I hope that sharing this experience, the difficulties I went through, and the lessons I learned will encourage you to pursue the changes you believe are needed in your own appeal system, and give you some ideas about how to go about making them.