What Are The Characteristics of a Good Test?

Part 1 in the Validity of Public Safety Assessments Series

The idea for this primer series germinated from a simple question – “Could you do an article looking at the validity of tests used in public safety assessment?” As my forgiving readership already knows, I have trouble containing my thoughts to a single entry. So, as I began to frame out how I would respond to the question of the validity of public safety assessments, the amount of material I wanted to cover started to grow exponentially. At some point, I decided it would be best to start from the beginning with a series of primers on topics related to validity, building up to an answer to the question “what is the validity of public safety assessments?”

So this blog post will be the first in a series looking at that question. Over a series of articles aimed to inform, but also intended to keep things simple, I will cover:

  1. What are the characteristics of a good test?
  2. What are some authoritative references human resource and assessment professionals can rely upon in evaluating the worthiness of tests?
  3. What is validity?
  4. Are public safety assessments good tests and are they valid?

This first article in the primer series deals with the question of what makes a good test. A good test can be defined as one that is:

  • Reliable
  • Valid
  • Practical
  • Socially Sensitive
  • Candidate Friendly

Briefly and simply, I will review the meaning of each of these characteristics.


Reliability

Reliability refers to the accuracy of the obtained test score, or to how close the obtained scores for individuals are to what their “true” scores would be, if we could ever know their true scores. Thus, reliability is the absence of measurement error: the less measurement error, the better. The reliability coefficient, similar to a correlation coefficient, is used as the indicator of the reliability of a test. The reliability coefficient can range from 0 to 1, and the closer to 1 the better. Generally, experts look for a reliability coefficient in excess of .70. However, many tests used in public safety screening are what are referred to as multi-dimensional. Interpreting the meaning of a reliability coefficient for a knowledge test based on a variety of sources requires a great deal of experience, and even experts are often fooled or offer incorrect interpretations. There are a number of types of reliability, but the type usually reported is internal consistency, or coefficient alpha. All things being equal, one should look for an assessment with strong evidence of reliability, where information is offered on the degree of confidence you can have in the reported test score.
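To make coefficient alpha a bit more concrete, here is a minimal sketch of how it is computed from a matrix of item scores. The `cronbach_alpha` helper and the applicant data are hypothetical, invented for illustration; they are not drawn from any real examination.

```python
# Illustrative sketch: coefficient alpha (Cronbach's alpha), the
# internal-consistency reliability estimate discussed above.
# All item responses below are made-up data.

def cronbach_alpha(items):
    """items: list of per-candidate item-score lists (rows = candidates)."""
    k = len(items[0])  # number of test items

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Variance of each item across candidates, and of the total score
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    total_var = variance([sum(row) for row in items])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five candidates answering a four-item test (1 = correct, 0 = incorrect)
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(scores), 2))  # → 0.75
```

With these toy numbers the estimate happens to exceed the .70 benchmark mentioned above, but as the paragraph notes, interpreting alpha for a multi-dimensional test requires expert judgment.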


Validity

Validity will be the topic of the third primer in this series. In the selection context, the term “validity” refers to whether there is an expectation that scores on the test will have a demonstrable relationship to job performance, or to other important job-related criteria. Validity may also be used interchangeably with related terms such as “job related” or “business necessity.” For now, we will simply note that there are a number of ways of evaluating validity, including:

  • Content
  • Criterion-related
  • Construct
  • Transfer or transportability
  • Validity generalization

A good test will offer extensive documentation of the validity of the test.


Practical

A good test should be practical. What constitutes a practical test? Well, it is a balancing of a number of factors, including:

  • Length – a shorter test is generally preferred
  • Time – a test that takes less time is generally preferred
  • Low cost – speaks for itself
  • Easy to administer
  • Easy to score
  • Differentiates between candidates – a test is of little value if all the applicants obtain the same score
  • Adequate test manual – comes with a manual offering sufficient information and documentation
  • Professionalism – is produced by test developers possessing high levels of expertise

The issue of the practicality of a test is a subjective judgment, which will be impacted by the constraints facing the public-sector jurisdiction. A test that may be practical for a large city with 10,000 applicants and a large budget may not be practical for a small town with 10 applicants and a minuscule testing budget.

Socially Sensitive

A consideration of the social implications and effects of the use of a test is critical in the public sector, especially for high-stakes jobs such as public safety occupations. The public safety assessment professional must be considerate of and responsive to multiple groups of stakeholders. In addition, in evaluating a test, it is critical that attention be given to:

  • Avoiding Adverse Impact – Recent events have highlighted the importance of balance in the demographics of safety force personnel. Adverse impact refers to differences in the passing rates on exams between males and females, or between minority and majority group members. Tests should be designed with an eye toward the minimization of adverse impact. Adverse impact is a complicated topic, and I addressed it in greater depth in previous blog posts here and here.
  • Universal Testing – The concept behind universal testing is that your exams should be able to be taken by the most diverse set of applicants possible, including those with disabilities and by those who speak other languages. Having a truly universal test is a difficult, if not impossible, standard to meet. However, organizations should strive to ensure that testing locations and environments are compatible with the needs of as wide a variety of individuals as possible. In addition, organizations should have in place committees and procedures for dealing with requests for accommodations.
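To make the passing-rate comparison behind adverse impact concrete, here is a small illustrative sketch based on the “four-fifths rule” from the federal Uniform Guidelines, a common rule of thumb (not discussed above) for flagging possible adverse impact. The applicant counts and the helper functions are made up for illustration only.

```python
# Illustrative sketch (hypothetical numbers): comparing group passing rates
# using the four-fifths (80%) rule of thumb. A focal group's selection rate
# below 80% of the highest group's rate is commonly treated as a flag for
# possible adverse impact, warranting closer professional review.

def selection_rate(passed, applied):
    """Proportion of applicants in a group who passed the exam."""
    return passed / applied

def impact_ratio(focal_rate, reference_rate):
    """Ratio of the focal group's passing rate to the reference group's rate."""
    return focal_rate / reference_rate

# Hypothetical applicant data
majority_rate = selection_rate(60, 100)  # 60 of 100 pass -> 0.60
minority_rate = selection_rate(21, 50)   # 21 of 50 pass  -> 0.42

ratio = impact_ratio(minority_rate, majority_rate)
flagged = ratio < 0.80  # below four-fifths suggests possible adverse impact

print(round(ratio, 2))  # → 0.7
print(flagged)          # → True
```

Note that the four-fifths rule is only a screening heuristic; a flagged result is the start of an analysis, not a legal conclusion.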

Candidate Friendly

One of the biggest changes in testing over the past twenty years has been the increased attention paid to the candidate experience. Thus, your tests should be designed to look professional and be easy to administer. Furthermore, the candidate should see a clear connection between the exams and the job. As candidates complete the selection battery, you want their reaction to be, “That was a fair test, I had an opportunity to prove why I deserve the job, and this is the type of organization where I would like to work.” One of my early sets of blog posts for IPMA-HR dealt with how the way you treat a candidate makes a difference, here, here and here.

Tests from IPMA-HR Assessment Services

All tests from IPMA-HR Assessment Services, including the safety forces examinations, are designed to be consistent with the professional principles for the development of tests and are carefully developed so as to be high-quality tests. If you have questions regarding pre-employment or promotional testing, please feel free to contact the staff at IPMA-HR Assessment Services or email Dennis Doverspike at dennisdoverspike@gmail.com. As always, if you have a question you would like to see addressed in a future blog, please let us know.

This entry was posted in Assessment, Public Safety Tests, and Validity by Dennis Doverspike.

About Dennis Doverspike

Dennis Doverspike, Ph.D., ABPP, is President of Doverspike Consulting LLC. He is certified as a specialist in Industrial-Organizational Psychology and in Organizational and Business Consulting Psychology by the American Board of Professional Psychology (ABPP), serves on the Board of the American Board of Organizational and Business Consulting Psychology, and is a licensed psychologist in the State of Ohio. Dr. Doverspike has over forty years of experience working with consulting firms and with public and private sector organizations. He is the author of 3 books and over 150 other professional publications. Dennis Doverspike received his Ph.D. in Psychology in 1983 from the University of Akron.
