Table of Contents
- 1. History and Variables Concerning Standards
- 2. Structure of Standards
- I. Governance and Structure
- II. Linkage with Job Requirement
- III. Assessment System Design
- IV. Structure
- V. Candidate Information
- VI. Candidate Processing
- VII. Test Development
- VIII. Test Administration
- IX. Test Security
- X. Scoring and Score
- XI. Appeals
- XII. Continual Maintenance
- XIII. Program Evaluation
Test Development
The ACRB was created in October 1995. It was made up of individuals recommended by Accredited Chiropractic Colleges and Universities. These members have completed a pilot program in Chiropractic Rehabilitation. The program had the same module levels (I and II) and certified diplomate status module-level examinations. The diplomate status also required a non-written performance examination. The final requirement for diplomate status was a written case study accepted for publication in a peer-reviewed journal.
Before the ACRB was created, a consensus study known as the DELPHI Process began in 1994. The DELPHI is a continual process with the ACRB. The following page will give a history of the process.
See exhibit 0.0010 (Instruction for item development, Angoff procedure, Item selection)
Test Items
The Board and College will supply a course outline to rehabilitation programs based on Delphi topics, with criteria-references per topic. Instructors responsible for teaching course material will then submit test items to the Board, supported by the criteria-references. The Board then compiles these test items from instructors to develop the examinations for all levels.
Item Development
The ACRB contacts the rehabilitation chairman for the continuing education program of the sponsoring Chiropractic Colleges and Universities. This chairman is responsible for gathering the course instructors' items for each one-hundred-hour program, reviewing all items received to ensure that each instructor has sent items specific to that instructor's topic matter with Delphi criteria-references, and confirming that the proper item construction format, described below, has been followed.
An item is the sum of three parts: the question, the answer, and the distractors. The question should be drawn from a subject within the topic-referenced material. The answer should relate directly to that subject and topic. Each distractor may be related to either the topic or the subject material, but not both. One distractor should not be related to either. The following example item is accompanied by a breakdown and explanation.
To best improve sensorimotor coordination, exercise should be performed until:
- (A) heart rate reaches aerobic maximum
- (B) post exercise soreness develops
- (C) passive care is no longer indicated
- (D) patient unable to maintain proper form
The sample question's topic is exercise, which is related to active care after an assessment. Sensorimotor coordination is the subject related to that topic. The answer "(D) patient unable to maintain proper form" is directly related to Proprioception and Coordination exercises. The distractors (A) and (B) are related to the topic, Exercise, but in no way related to the subject, Proprioception and Coordination exercises. The distractor (C) is in no way related to either the topic or the subject of the question.
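The item structure described above can be sketched as a small data structure. The following Python is an illustration only, not the Board's actual submission format. The names `Item`, `Distractor`, `relates_to`, and `follows_format` are hypothetical, and the check encodes one reading of the construction rule: each distractor relates to the topic, the subject, or neither, and exactly one relates to neither.

```python
from dataclasses import dataclass
from typing import List

# Allowed values for Distractor.relates_to (hypothetical labels).
RELATIONS = {"topic", "subject", "none"}


@dataclass
class Distractor:
    text: str
    relates_to: str  # "topic", "subject", or "none" (never both)


@dataclass
class Item:
    topic: str
    subject: str
    question: str
    answer: str
    distractors: List[Distractor]

    def follows_format(self) -> bool:
        """Check the construction rule as read here (an assumption):
        every distractor relates to the topic, the subject, or neither,
        and exactly one distractor relates to neither."""
        if any(d.relates_to not in RELATIONS for d in self.distractors):
            return False
        unrelated = sum(1 for d in self.distractors if d.relates_to == "none")
        return unrelated == 1


# The sample item from this section, encoded with the hypothetical labels.
sample = Item(
    topic="Exercise",
    subject="Sensorimotor coordination",
    question=("To best improve sensorimotor coordination, "
              "exercise should be performed until:"),
    answer="patient unable to maintain proper form",
    distractors=[
        Distractor("heart rate reaches aerobic maximum", "topic"),
        Distractor("post exercise soreness develops", "topic"),
        Distractor("passive care is no longer indicated", "none"),
    ],
)
```

Under this reading, the sample item passes the check because exactly one of its three distractors is unrelated to both the topic and the subject.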
Once the rehabilitation chairman for the continuing education program judges the items generated to be sufficient, the test items are mailed to the ACRB Test Development Committee Chairman.
Item Selection
All items are criteria-referenced content material. The items are sent to the chairperson of the Test Development and Security Committee. The items are then reviewed by the Test Development Committee. Items that are not referenced, that use references not from the original Delphi pool, or that are excessively lengthy, proprietary, statistical, and/or ambiguous are removed from the pool.
The remaining items are then rated by three to six (3-6) members of the Test Development Committee utilizing the Angoff method. Items rated 1 to 3 and 8 to 10 are eliminated, and the remaining items, rated 4 to 7, are categorized per Delphi topic from highest to lowest.
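The rating screen described above can be sketched as follows. This is a minimal illustration under stated assumptions. The text does not say how the three to six committee ratings are combined into one rating per item, so the sketch takes their mean. The function name `screen_items` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def screen_items(ratings_by_item, topic_by_item, low=4, high=7):
    """Keep items whose combined rating lies within [low, high] and group
    them per topic, ordered from highest to lowest rating.

    Assumption: the combined rating is the mean of the raters' scores, so
    a mean that falls between the bands (for example 3.5) is eliminated.
    """
    kept = defaultdict(list)
    for item_id, ratings in ratings_by_item.items():
        combined = mean(ratings)
        if low <= combined <= high:
            kept[topic_by_item[item_id]].append((item_id, combined))
    return {
        topic: sorted(items, key=lambda pair: pair[1], reverse=True)
        for topic, items in kept.items()
    }
```

For example, an item rated 9, 8, and 9 by three members would be eliminated, while items rated 7, 6, 7 and 5, 6, 5 in the same topic would both be kept, with the former listed first.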
Tests are generated based on a predetermined value per test and per topic area. Items in each topic area are selected such that their sum equals the value of the topic, and the sum of the topic values meets the predetermined value of the test. Items are then placed in random order within the test.
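The selection step described above can be sketched as a small subset-sum search followed by a shuffle. This is one possible approach, not the Committee's documented procedure. It assumes that each item carries a whole-number value and that a topic's items must sum exactly to the topic value. The function names `pick_items` and `assemble_test` are hypothetical.

```python
import random


def pick_items(pool, target):
    """Return a list of item ids from `pool`, a list of (id, integer value)
    pairs, whose values sum exactly to `target`, or None if none exists."""
    reachable = {0: []}  # sum -> one list of item ids producing that sum
    for item_id, value in pool:
        # Iterate over a snapshot so each item is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + value
            if new_total <= target and new_total not in reachable:
                reachable[new_total] = chosen + [item_id]
    return reachable.get(target)


def assemble_test(pools_by_topic, value_by_topic, seed=None):
    """Select items per topic to meet each topic value, then place the
    selected items in random order."""
    selected = []
    for topic, target in value_by_topic.items():
        chosen = pick_items(pools_by_topic[topic], target)
        if chosen is None:
            raise ValueError(f"no item combination reaches {target} for {topic}")
        selected.extend(chosen)
    random.Random(seed).shuffle(selected)
    return selected
```

Because every topic is filled to its own value, the total value of the assembled test equals the sum of the topic values, which is the predetermined value of the test.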
This test is then evaluated by the three to six (3-6) members of the Test Development Committee, who remove items that are answered incorrectly by the majority of the committee members and/or items that may convey a pattern within the test. These items are then replaced with topic items of equivalent value.
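The majority rule in the committee review above can be sketched as follows. The sketch assumes that "majority" means more than half of the members who took the test, and it covers only the incorrect-answer criterion, not the detection of patterns within the test. The function name `flag_for_removal` and the input layout are hypothetical.

```python
def flag_for_removal(committee_answers, answer_key):
    """Return the ids of items that more than half of the committee
    members answered incorrectly.

    committee_answers: one dict per member, mapping item id -> chosen option.
    answer_key: dict mapping item id -> correct option.
    """
    flagged = []
    for item_id, correct in answer_key.items():
        responses = [answers[item_id] for answers in committee_answers]
        wrong = sum(1 for response in responses if response != correct)
        if wrong > len(responses) / 2:
            flagged.append(item_id)
    return flagged
```

With three members, an item is flagged when at least two of them miss it. Each flagged item would then be replaced by an item of equivalent value from the same topic.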
Each test is then copyrighted and activated for applicant testing.
The Angoff Method
The Angoff method is the most commonly used approach to set standards on multiple-choice credentialing examinations (Sireci & Biskin, 1992; Fidler, 1996). The question asked by the Angoff method is, "What is the probability that a minimally competent candidate would answer this question correctly?" The standard setter is in effect asked to judge the difficulty of the item within some range for a minimally competent candidate. The estimated performance standard for a judge is determined by summing that judge's item difficulty estimates. The resulting average "Angoff rating" over judges is then used as the performance standard for the examination. In an example of the Angoff method, the judges' performance standard estimates range from 7.2 to 7.85 and average 7.63. Accordingly, the estimated performance standard is eight items correct.
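The calculation described above can be sketched in a few lines. The judge estimates below are invented for illustration and are not drawn from any ACRB examination. The function name `angoff_standard` is hypothetical.

```python
from statistics import mean


def angoff_standard(estimates_by_judge):
    """Each judge's standard is the sum of that judge's per-item probability
    estimates; the performance standard is the mean over judges."""
    judge_standards = [sum(estimates) for estimates in estimates_by_judge]
    return judge_standards, mean(judge_standards)


# Hypothetical estimates: three judges, four items each.
estimates = [
    [0.5, 0.75, 0.25, 0.5],   # judge 1 sums to 2.0
    [0.75, 0.75, 0.5, 0.5],   # judge 2 sums to 2.5
    [0.5, 1.0, 0.5, 0.25],    # judge 3 sums to 2.25
]
per_judge, standard = angoff_standard(estimates)
print(per_judge)  # [2.0, 2.5, 2.25]
print(standard)   # 2.25
```

The averaged value is rarely a whole number, so a rounding decision is still needed before it can serve as a count of items correct.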
The application of the standard setting methods described in this section rarely produces an average value that is an exact whole number. "Rounding", therefore, is an implicit part of standard setting activity, and "rounding up" or "rounding down" may have an impact on the pass/fail rate. How then should rounding be accomplished or considered in the process? A reasonable strategy might be for credentialing groups to focus on the consequences of rounding rather than on the pure act of rounding, per se. Under this view, for example, the credentialing agency might consider the effect of rounding on the pass/fail rate obtained from previous administrations of the examination. Assuming no changes in the content of the examination or in the ability level of the candidate population, the agency might make the rounding decision in a way that is most consistent with the past administration. Licensure organizations might also make the rounding decision that is consistent with the view that credentialing should involve the minimum degree of regulation necessary to ensure the competency of practitioners.
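The comparison suggested above, looking at the consequences of rounding rather than the rounding itself, can be sketched as follows. The past scores are invented for illustration, and the sketch assumes that a candidate passes when the score is at least the cut score. The function name `pass_rates_for_rounding` is hypothetical.

```python
import math


def pass_rates_for_rounding(average_standard, past_scores):
    """Compare pass rates on past scores when the averaged standard is
    rounded down versus rounded up.

    Assumption: a candidate passes when score >= cut score.
    """
    results = {}
    for label, cut in (("down", math.floor(average_standard)),
                       ("up", math.ceil(average_standard))):
        passed = sum(1 for score in past_scores if score >= cut)
        results[label] = (cut, passed / len(past_scores))
    return results


# Hypothetical scores from a previous administration.
past_scores = [6, 7, 7, 8, 9, 10, 7, 8]
print(pass_rates_for_rounding(7.63, past_scores))
# {'down': (7, 0.875), 'up': (8, 0.5)}
```

In this invented case the rounding direction moves the pass rate from 87.5 percent to 50 percent, which is the kind of consequence an agency could weigh against the pass rate of past administrations.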
The Angoff method is relatively easy to use, and calculation of the ratings is quite simple. In addition, the approach is relatively easy to modify for use with performance assessments and practical examinations (Hambleton & Plake, 1995). However, it has been reported that understanding the critical conceptualization known as the "minimally competent candidate" is often difficult or impossible for standard setting participants to acquire (National Academy of Education, 1993). In practice, participants typically need repeated references to a formal summary of the behaviors and performance indicators that represent this important construct.
Item Re‑Evaluation
The Test Development Committee Chairman monitors all applicant testing for incorrect results. If a pattern is found with any particular existing item, it is brought to the attention of the Test Development Committee for review. If the Committee deems fit, it will replace the item with one from the identical Delphi topic area with the same item value. The test is then copyrighted and activated for applicant testing. The new item is then monitored by the Test Development Committee Chairman. If the same pattern continues for the replacement item, the Committee may replace it with a different Delphi topic item of the same item value. The test is then copyrighted and activated for applicant testing.
Test Replacement
On a yearly basis, the Test Development Committee replaces the oldest test at each level with a newly created test for the same level.