Are Tests Valid for Public Safety Jobs?

Number 4 in the Validity of Public Safety Assessments Series

The idea for this primer series grew out of a simple question: “Could you do an article looking at the validity of tests used in public safety assessment?” In response, I decided to write a series of articles intended to inform while keeping things simple. The blogs in this series were intended to cover:

  1. What are the characteristics of a good test?
  2. What are some authoritative references human resource and assessment professionals can rely upon in evaluating the worthiness of tests?
  3. What is validity?
  4. Finally, trying to address the original question I was asked, are public safety assessments valid?

The first three blogs in the primer series have been published and are available by clicking the links above.

This is the fourth and final article in the series and is intended to answer the question regarding the validity of tests for public safety jobs. I define public safety jobs here as including police, fire, and emergency medical services (EMS). In addition, human resource professionals are usually interested in the use of tests in both entry level screening and for arriving at promotion decisions.

My discussion of validity will be organized according to the five types of validity evidence:

  1. Validity Generalization Evidence Based on Tests in General
  2. Validity Generalization Evidence Based on a Specific Occupation
  3. Validity Generalization Evidence Based on a Specific Test for an Occupation
  4. Local Validation Based on Criterion-Related Evidence
  5. Other

Validity Generalization Evidence Based on Tests in General

Our field now has approximately 100 years of personnel selection research and 40 years of validity generalization research, creating a sizable library of empirical studies. Thus, we can start to answer the question of the validity of public safety tests by looking at the general validity generalization research. As you may remember, validity generalization is a technique for cumulating and averaging local validation studies in order to arrive at conclusions regarding both the strength of the validity of an assessment and the degree to which that validity generalizes across jobs and locations. Analyses based on validity generalization techniques provide data relevant to the validity of tests for public safety jobs, as long as we are willing to accept two assumptions: that the specific type of occupation does not make a difference, and that all professionally developed tests are basically equivalent.
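To make the idea of cumulating and averaging concrete, here is a minimal Python sketch of a bare-bones validity generalization summary. It assumes the common approach of weighting each local validity coefficient by its sample size and then comparing the observed variance across studies with the variance expected from sampling error alone. The function name and the three studies are hypothetical and used only for illustration, and corrections for unreliability or range restriction are left out.

```python
from math import fsum


def bare_bones_summary(studies):
    """Summarize local validation studies.

    studies: list of (sample_size, observed_validity) pairs.
    Returns (mean_r, observed_var, sampling_var, residual_var).
    """
    total_n = sum(n for n, _ in studies)
    # Sample-size-weighted mean validity across studies
    mean_r = fsum(n * r for n, r in studies) / total_n
    # Sample-size-weighted variance of the observed validities
    observed_var = fsum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    # Variance expected from sampling error alone, based on the average N
    avg_n = total_n / len(studies)
    sampling_var = (1 - mean_r ** 2) ** 2 / (avg_n - 1)
    # Variance left over after sampling error is removed (never below zero)
    residual_var = max(observed_var - sampling_var, 0.0)
    return mean_r, observed_var, sampling_var, residual_var


# Hypothetical local studies: (sample size, test-performance correlation)
studies = [(100, 0.20), (200, 0.30), (100, 0.40)]
mean_r, observed_var, sampling_var, residual_var = bare_bones_summary(studies)
print(mean_r, observed_var, sampling_var, residual_var)
```

In this made-up example the weighted mean validity is .30 and the residual variance is zero, which is the pattern that would be read as the validity generalizing across settings.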

Given we accept the assumptions underlying validity generalization, we can reach several conclusions regarding the validity of assessments. More specifically, for the case of entry level screening:

  • In predicting job performance, general mental ability tests (also commonly known as cognitive or intelligence tests) achieve very high levels of validity.
  • Both the interview and integrity tests are also highly valid forms of assessment.
  • Personality tests have relatively low levels of validity, but can be useful when incremental validity and the low levels of adverse impact are considered.

Conclusions regarding promotional tests are more difficult to reach. The previous conclusions would still hold, but we would add the following:

  • Work sample and job knowledge tests also provide high levels of validity in predicting job performance in higher level jobs.
  • The assessment center has many positive features, including being popular with employees and many stakeholders. Assessment centers score high on both content and face validity. However, the assessment center as a whole tends to have validities that are similar to those of its component parts; the whole in this case is not greater than the sum of the parts. Given the cost of running an assessment center, many private sector organizations have moved to computer simulations of a day-in-a-life, which include realistic, online in-baskets. Thus, when discussing assessment centers, there is a trade-off between the many positive aspects on one side and the increased cost and limited gain in predictive validity on the other.

Validity Generalization Evidence Based on a Specific Occupation

Of course, an argument could be made that public safety occupations are highly unique and that validity generalization studies should be conducted targeting fire, police, and EMS work. Luckily, there have been published and unpublished analyses looking at the validity of tests for fire and police, at least for entry level jobs. However, as far as I know, no attempt has been made to summarize the empirical studies for EMS. In addition, fire supervisors have a greater opportunity to observe performance on the job than do supervisors of police personnel. Thus, we would expect higher reliabilities and validity for performance appraisals for fire than for police, which would result in a finding of higher test validities for fire than police.
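The reasoning about reliability can be sketched with the classical attenuation formula, under which the observed validity equals the true validity multiplied by the square root of the reliability of the performance ratings. The short Python example below is only an illustration: the function name, the true validity of .50, and the two reliability values are hypothetical numbers chosen for easy arithmetic, not estimates for fire or police ratings.

```python
from math import sqrt


def observed_validity(true_validity, criterion_reliability):
    """Classical attenuation: unreliable ratings shrink the observed validity."""
    return true_validity * sqrt(criterion_reliability)


# Hypothetical values: same test, same true validity, different rating reliability
more_reliable_ratings = observed_validity(0.50, 0.64)
less_reliable_ratings = observed_validity(0.50, 0.36)
print(round(more_reliable_ratings, 2), round(less_reliable_ratings, 2))
```

With these assumed numbers, the same test shows an observed validity of about .40 when ratings are more reliable and about .30 when they are less reliable, even though the underlying relationship is identical.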

Looking at the results for entry level, both for fire and police occupations:

  • General mental ability tests again obtain high levels of validity.
  • Mechanical comprehension also emerges as a strong predictor for fire entry.
  • Based on more limited data, interviews possess high levels of validity, and personality tests remain valid, though at relatively low levels.

For promotion, assessment centers are again valid in terms of face, content, and criterion-related validity. However, as previously discussed, the administration of tests in an assessment center setting is unlikely to add any incremental criterion-related validity beyond that obtained from simply administering the component tests. Thus, jurisdictions must weigh the value of the many positive aspects of using assessment centers against the increased costs.
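One way to see why an assessment center may add little incremental validity is the standard formula for the multiple correlation from two predictors. The Python sketch below is a simplified illustration that treats the component tests as one predictor and the overall assessment center score as a second; the function name and all of the correlations are hypothetical.

```python
from math import sqrt


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: validity of each predictor; r12: correlation between the predictors.
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


# Hypothetical validities: component tests .50, assessment center score .40
component_validity = 0.50
center_validity = 0.40

# Case 1: the two scores overlap heavily (assumed correlation of .80)
overlapping = multiple_r(component_validity, center_validity, 0.80)
# Case 2: the two scores are unrelated (assumed correlation of 0)
independent = multiple_r(component_validity, center_validity, 0.0)

print(round(overlapping - component_validity, 2))  # gain when scores overlap
print(round(independent - component_validity, 2))  # gain when scores are unrelated
```

Under these assumed values, the heavily overlapping assessment center score adds essentially nothing beyond the component tests, while an unrelated second predictor of the same validity would raise the combined validity to roughly .64.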

Validity Generalization Evidence Based on a Specific Test for an Occupation

I know of no validity generalization studies for specific tests with public safety occupations. Test publishers may have a small number of studies available from their clients and may be willing to share such data with you. I would encourage you to request any available validity evidence from your test provider. You will need such evidence if you are considering relying upon a transfer or transportability approach to validation.

Local Validation Based on Criterion-Related Evidence

Of course, you could do your own local validation study. If you have the will, the patience, the time, and the money, I would certainly encourage you to consider collecting your own local validation evidence. However, a large number of factors can affect local validation, and I would encourage you to consult with a qualified Industrial-Organizational Psychologist.


Other

Correlations with job performance are not the only way to evaluate the validity of a test. Both personality and interest inventories can be valuable and valid tools for predicting:

  • turnover, especially during the probationary period.
  • deviant or counter-productive work behaviors.
  • entry into specific areas or subdisciplines, such as fire as opposed to a concentration in EMS, or traffic rather than criminal investigation within police work.
  • resistance to stress or the ability to recover from stressful situations.


I have argued before that despite the frequency of litigation in public safety selection, there are few areas of human resources where we can claim to have been more diligent, built better tools, and had to perform at such a high level. In general, we can conclude that our available and frequently used tests are valid predictors of public safety job performance.

Brighter, more honest, harder working, and more socially adept applicants perform better in training and receive higher ratings from their supervisors. Incumbents with higher levels of job knowledge and who perform better in interviews tend to be more effective when promoted into higher level jobs. Well-designed assessments can identify those job applicants who are brighter, more honest, and more likely to be conscientious on the job.

Not only do public safety assessments tend to be valid, they also meet the criteria for good tests. Selection instruments have high levels of reliability and years of work have gone into designing tests to be fairer and more acceptable to candidates.

IPMA-HR Assessment Services Assessments

All tests from IPMA-HR Assessment Services, including the safety forces examinations, are designed to be consistent with professional principles for test development and are carefully developed to be of high quality. If you have questions regarding the validity of any of the IPMA-HR assessments, please feel free to contact the staff at IPMA-HR Assessment Services at (703) 549-7100, or visit the website.

If you have any questions or thoughts for me, please email Dennis Doverspike. As always, if you have a question you would like to see addressed in a future blog, please let us know.


About Dennis Doverspike

Dennis Doverspike, Ph.D., ABPP, is President of Doverspike Consulting LLC. He is certified as a specialist in Industrial-Organizational Psychology and in Organizational and Business Consulting Psychology by the American Board of Professional Psychology (ABPP), serves on the Board of the American Board of Organizational and Business Consulting Psychology, and is a licensed psychologist in the State of Ohio. Dr. Doverspike has over forty years of experience working with consulting firms and with public and private sector organizations. He is the author of 3 books and over 150 other professional publications. Dennis Doverspike received his Ph.D. in Psychology in 1983 from the University of Akron.
