When I was in school — particularly elementary school, where the practice seemed to be more prevalent — it troubled me to witness one student copying off another during tests. I always thought this was unfair. I wish I could say that I was upset by cheating because it damaged the educational system, but in reality I was angry because of the impact on me personally.
I carried that impression into the HR world. Since many decisions are based on test results that affect the health and effectiveness of an organization, I felt even more justified in taking an active part in preventing cheating in this arena.
It pleased me a great deal to discover that there were ways to discourage copying beyond relying on diligent test proctors. Creating different forms of the same test was perhaps a devious method of discouraging copying. That said, we always announced to test takers that multiple forms of the test might be in use, so they would know that the person next to them might not have the same test.
I should also mention that, in addition to serving as an anti-cheating measure, having more than one version of a test allowed us to alternate their administration. That is, if we typically tested twice a year for a particular class, such as police officer, we could use form A in the spring and form B in the fall. Rotating the tests this way also ensured that we did not have a significant number of test takers benefiting from the "practice effect," whereby candidates become overly familiar with a particular test form through repeated exposure.
Interestingly, no one ever challenged this practice, but the profession itself has held different standards for using multiple forms of a test. We’ll go over three different strategies for providing multiple forms of a written instrument.
Comparable forms of a test consist of two or more versions of a test that measure the same KSAPs (knowledge, skills, abilities, and personal characteristics) but for which statistical similarity has not been demonstrated.
Although these tests contain questions that measure the same KSAPs, the actual questions are different. Agencies may alternate between comparable forms during different test administrations to increase test security.
Equivalent forms consist of two or more versions of a test that can be equated by the use of a scoring formula so that scores on each version can be directly compared. This is useful in the event an agency wishes to use multiple equivalent forms during a single test administration.
As with comparable forms, the actual questions on each form are different. Agencies may alternate between equivalent forms across test administrations to increase test security.
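The article does not specify what the scoring formula looks like, but one common textbook approach is linear (mean-sigma) equating, which maps one form's score scale onto the other's so that the two distributions share the same mean and standard deviation. The sketch below is a minimal illustration of that idea only; the score lists and the `equate_linear` helper are invented for the example.

```python
# A minimal sketch of linear (mean-sigma) equating, one common way to
# implement the kind of scoring formula that makes equivalent forms
# directly comparable. All scores below are invented for illustration.
from statistics import mean, stdev

def equate_linear(score_b, scores_a, scores_b):
    """Map a Form B raw score onto the Form A scale so that the two
    score distributions share the same mean and standard deviation."""
    slope = stdev(scores_a) / stdev(scores_b)
    return mean(scores_a) + slope * (score_b - mean(scores_b))

# Hypothetical raw scores from two comparable candidate groups
form_a = [62, 70, 75, 81, 88, 90, 94]
form_b = [58, 66, 72, 77, 83, 86, 91]

# A candidate scoring 77 on Form B receives an equated Form A score
print(round(equate_linear(77, form_a, form_b), 1))
```

In practice, agencies would use much larger samples, and more sophisticated equating designs (equipercentile equating, anchor-item designs) exist; this only shows the basic mechanics of a scoring formula.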
Parallel forms consist of two or more versions of a test that are statistically equivalent in terms of raw score means, standard deviations, error structures, and correlations with other measures for any given population.
Parallel forms can be used if agencies want to administer two different tests during the same test administration. IPMA-HR has indicated in its literature that it has developed and statistically equated tests that are considered parallel forms. Further, it recommends that agencies using these tests as parallel forms review the technical report to determine whether adjustments are necessary.
These three definitions represent a general hierarchy of comparability, with comparable forms offering the lowest level of demonstrated comparability and parallel forms the highest. The hierarchy also conveys the level of sophistication and time commitment involved in developing each type of multiple form.
From my point of view, the distinction between the types of multiple forms provides a guide for jurisdictions in determining when they are comfortable developing tests internally and when it would be more beneficial to enlist the services of outside sources such as IPMA-HR. What I discovered is this: it is usually wise to hire outside test developers to create instruments for large, high-profile recruitments that involve a significant amount of hiring and represent the highest potential for liability. Taking this a step further, the teams I worked with usually agreed to use outside resources for police officer, correctional officer, and firefighter testing.
These test developers frequently led the movement toward using more than one form of a test. Many recognized that jurisdictions in close proximity commonly used the same tests, exposing candidates to identical forms within relatively short time periods. To address the concerns of jurisdictions so situated, consultants and test publishers developed multiple forms of their tests.
I should also say that as a practitioner, I employed some of the common methods in use today for developing multiple forms, combining aspects of all three types described above. Generally, we would write test items to add to our item bank and then randomly select items by test category or subtest to make up our two tests. We would compare the final products by establishing reading levels for each and reviewing item analysis data to determine the difficulty and discrimination of each item. After test administration, we would compare measures of central tendency (mean, median, and mode), standard deviations, and the forms' correlation with each other. In retrospect, this review was not as sophisticated as current methodologies, but it did provide a degree of support for using each test and ranking their scores.
While my methods for developing and using multiple forms of a test were never challenged, it is clear to me from a brief review of the literature that more sophisticated methods for comparing tests do exist and should be used if possible. I am also convinced that agencies interested in using multiple forms of tests would be well advised to employ the services of a consultant or stock test publisher.
In summary, multiple forms of tests are useful and beneficial. Making sure that different forms of the same test are indeed comparable to the extent that they can be used interchangeably is a sophisticated and complicated process. Agencies interested in developing multiple forms of their tests have a wealth of information they can research to guide them in the process. As an alternative, however, they may opt to use the services of an outside test provider who is prepared to do this work for them.