Number 3 in the Validity of Public Safety Assessments Series

The idea for this primer series germinated from a simple question – “Could you do an article looking at the validity of tests used in public safety assessment?” In response, I decided to write a series of articles that aims to inform while keeping things simple. The blogs in this series are intended to cover:

  1. What are the characteristics of a good test?
  2. What are some authoritative references human resource and assessment professionals can rely upon in evaluating the worthiness of tests?
  3. What is validity?
  4. Are public safety assessments good tests and are they valid?

The first two blogs in the primer series have been published and are available by clicking the links above.

This is the third in the series and is intended to provide a basic introduction to the various kinds of validity evidence. By validity evidence, I do not mean the obvious distinction between the big four of:

  1. Content
  2. Criterion-related
  3. Construct
  4. Transfer or Transportability

Understanding the distinctions between the four types of validity listed above is important. However, in this blog, I mean something different by types of validity evidence. Because our ultimate goal is to answer the question of whether tests are valid for public sector assessment, we can consider the following five types of validity evidence as relevant:

  1. Local Validation Based on Criterion-Related Evidence
  2. Validity Generalization Evidence Based on Tests in General
  3. Validity Generalization Evidence Based on Specific Occupation
  4. Validity Generalization Evidence Based on Specific Test
  5. Other

Local Validation Based on Criterion-Related Evidence

I list local validation first because it is the type of study upon which the other types of validity evidence are built. This refers to a study done by an organization or jurisdiction using its own incumbents from the relevant job family. So, we could conduct a study in the City of Gotham of the ability of a test of mechanical comprehension to predict entry level firefighter performance. Such studies require the administration of a test or assessment (the predictor) and the collection of some type of criterion, usually turnover or job performance as rated by supervisors. Once we have a dataset, we can then calculate the correlation coefficient, or r, between predictor and criterion. The correlation coefficient can take on values ranging from -1 to 1, with typical results being in the range of .10 to .40. The obtained correlation provides an indicator of the utility of the assessment and of the degree to which the test accurately predicts job performance or turnover. A statistically significant correlation is then seen as evidence of the validity of the test or assessment.
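
For readers who like to see the arithmetic, here is a minimal sketch in Python of the correlation calculation described above. The function name and the scores and ratings are hypothetical, invented purely for illustration; they are not data from any actual validation study.

```python
import math

def pearson_r(predictor, criterion):
    """Pearson correlation between test scores and a job-performance criterion."""
    if len(predictor) != len(criterion) or len(predictor) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, criterion))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in criterion)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: mechanical comprehension scores and supervisor ratings
# for eight incumbent firefighters (made up for this example).
test_scores = [52, 61, 47, 70, 58, 66, 49, 73]
supervisor_ratings = [3.1, 3.4, 3.0, 3.2, 3.8, 3.3, 2.9, 3.6]

r = pearson_r(test_scores, supervisor_ratings)
print(f"Observed validity coefficient r = {r:.2f}")
```

A real local validation study would use a far larger sample and would also test whether the obtained r is statistically significant; this sketch covers only the coefficient itself.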

Validity Generalization Evidence Based on Tests in General

In 1977, Schmidt and Hunter developed a general solution to the problem of estimating the validity of a test, or type of test, based on evidence collected from a series of local, criterion-related validation studies. Researchers have been conducting validity generalization studies ever since. Validity generalization builds on the more general technique of meta-analysis: we calculate a mean, or average, correlation coefficient across the individual studies. The obtained value can then be corrected for unreliability of measurement and for range restriction. Uncorrected correlations are typically in the previously discussed range of .10 to .40, with corrected correlations providing much higher validity estimates.
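
The averaging and correction steps can be sketched in a few lines of Python. This is a simplified illustration, not the full Schmidt and Hunter procedure. It assumes a sample-size-weighted mean, the standard correction for attenuation due to criterion unreliability, and the Thorndike Case II formula for direct range restriction. The function names, the study values, the reliability of .60, and the standard deviation ratio of 1.5 are all hypothetical.

```python
import math

def weighted_mean_r(studies):
    """Sample-size-weighted mean correlation; studies is a list of (n, r) pairs."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n

def correct_for_criterion_unreliability(r, criterion_reliability):
    """Correction for attenuation: divide r by the square root of the reliability."""
    return r / math.sqrt(criterion_reliability)

def correct_for_range_restriction(r, u):
    """Thorndike Case II correction; u = unrestricted SD / restricted SD of the predictor."""
    return (r * u) / math.sqrt(1 + r * r * (u * u - 1))

# Hypothetical local studies: (sample size, observed correlation)
studies = [(120, 0.18), (85, 0.31), (240, 0.22), (60, 0.12)]

mean_r = weighted_mean_r(studies)
# Assumed values for illustration only: criterion reliability of .60, SD ratio of 1.5
r_reliability = correct_for_criterion_unreliability(mean_r, 0.60)
r_corrected = correct_for_range_restriction(r_reliability, 1.5)

print(f"Mean observed r:            {mean_r:.2f}")
print(f"After unreliability fix:    {r_reliability:.2f}")
print(f"After range restriction:    {r_corrected:.2f}")
```

Both corrections push a positive estimate upward, which is why corrected validities run higher than the uncorrected .10 to .40 range. Published validity generalization studies also estimate how much of the variation across studies is due to sampling error, which this sketch omits.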

Frank Schmidt and his colleagues have just completed a new validity generalization article based on 100 years of validation studies. This study provides data on the validity of both tests, for example work samples or interviews, and constructs, for example general intelligence.

Studies of validity generalization evidence based on tests in general do provide data relevant to the question of the validity of tests for public safety jobs. This is especially true if we are willing to accept the assumptions that the specific type of occupation does not make a difference and that all professionally developed tests are basically equivalent.

Validity Generalization Evidence Based on Specific Occupation

If we want data on the validity of tests for a specific occupation, we can conduct validity generalization studies targeting the specific job of interest. For example, we could ask what is the validity of a test for public safety jobs. Or, we could be more specific and ask what is the validity of a test for the occupation of entry level firefighter. Of course, the more specific we make our request, the fewer studies we will find. There have been a couple of meta-analytic studies published for public safety jobs and we will rely upon those when we discuss the evidence for validity in the next blog.

Validity Generalization Evidence Based on Specific Test

It is also possible to conduct validity generalization studies for a specific test. So, for example, we could ask what is the validity of the Myers-Briggs. Or, we could ask, what is the validity of the Bennett Mechanical Comprehension Test for use with firefighters. Again, such an analysis is likely to be based upon a relatively small number of studies and most likely to be conducted by the test publisher.

Other and Transportability

Validity evidence for public sector tests often relies upon a distinctive variation usually referred to as the transfer or transportability of local validation evidence from one jurisdiction to another. Many test companies rely upon this approach. Basically, the validity of a test is established in one jurisdiction or city, for example Gotham City. This validity is then transferred to a second locale, say Metropolis. The Uniform Guidelines provide detailed guidance on how this can be accomplished. Usually, it requires a demonstration that the jobs and situations are sufficiently similar between the original jurisdiction and the new agency. This can be done by comparing the job analyses for the two jurisdictions, often using some type of checklist approach. Although a transfer strategy usually relies upon criterion-related validity, it is possible to argue for the transportability of a test where the original evidence relied upon content validity.

IPMA-HR Assessment Services Assessments

All tests from IPMA-HR Assessment Services, including the safety forces examinations, are designed to be consistent with the professional principles for the development of tests and are carefully developed so as to be high-quality tests. If you have questions regarding the validity of any of the IPMA-HR assessments, please feel free to contact the staff at IPMA-HR Assessment Services at (703) 549-7100 or assessment@ipma-hr.org, or visit the website.

Final Thoughts

Our first three blogs have covered:

  1. What are the characteristics of a good test?
  2. What are some authoritative references human resource and assessment professionals can rely upon in evaluating the worthiness of tests?
  3. What is validity?

The next blog will be the final one in this series and will deal with:

  1. Are public safety assessments good tests and are they valid?

In the meantime, if you have any questions or thoughts for me, please email Dennis Doverspike at dennisdoverspike@gmail.com. As always, if you have a question you would like to see addressed in a future blog, please let us know.