1. Achievement of learning outcomes from simple to complex can be assessed.
2. Highly structured and clear tasks are provided.
3. A broad sample of achievement can be assessed.
4. Incorrect alternatives provide diagnostic information.
5. Scores are less influenced by guessing than true-false items.
6. Scores are more reliable than subjectively scored items (e.g. essays).
7. Scoring is easy, objective, and reliable.
8. Item analysis can reveal how difficult each item was and how well it discriminated between the stronger and weaker students in the class.
9. Achievement can be compared from class to class and year to year.
10. Can cover a lot of material very efficiently (about one item per minute of testing time for straightforward questions).
11. Items can be written so that students must discriminate among options that vary in degree of correctness.
12. Avoids the absolute judgments found in True-False tests.
1. Constructing good items is time consuming.
2. It is frequently difficult to find plausible distractors.
3. Can be ineffective for assessing some types of problem solving and the ability to organize and express ideas.
4. Real-world problem solving differs – a different process is involved in proposing a solution versus selecting a solution from a set of alternatives.
5. Scores can be influenced by reading ability.
6. There is a lack of feedback on individual thought processes – it is difficult to determine why individual students selected incorrect responses.
7. Students can sometimes read more into the question than was intended.
8. Items often focus on testing factual information and fail to test higher levels of cognitive thinking.
9. Sometimes there is more than one defensible “correct” answer.
10. They place a high degree of dependence on the instructor’s writing ability.
11. Does not provide an assessment of writing ability.
12. May encourage guessing.
2. Parts of a multiple choice question (Bull & McKenna, 2002)
A traditional multiple choice question (or item) is one in which a student chooses one answer from a number of choices supplied. A multiple choice question consists of:
stem - the text of the question
options - the choices provided after the stem (these include the key and the distractors)
key - the correct answer in the list of options
distractors - the incorrect answers in the list of options
3. Some examples of do’s and don’ts (Bull & McKenna, 2002; Kehoe, 1995; Zimmaro, 2004)
Begin writing items well ahead of the time when they will be used; this allows time for revision and peer review.
Before writing the stem, identify the single idea to be tested by that item. It should concern an important aspect of the content area, not trivia. In general, the stem should not pose more than one problem, although the solution to that problem may require more than one step.
Be sure that each item is independent of all other items (i.e. a hint to an answer should not be unintentionally embedded in another item).
Design each item/question so that it can be answered by 60-65% of the student cohort (Zimmaro, 2004:15).
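The 60-65% target can be checked against pilot or past results. The following is a minimal sketch, assuming each student's response to each item has been recorded as 1 (correct) or 0 (incorrect). The function names `facility` and `outside_target` and the pilot data are hypothetical; only the target band comes from the guideline above.

```python
def facility(responses):
    """Return the proportion of students answering each item correctly.

    `responses` is a list of rows, one per student; each row holds
    1 (correct) or 0 (incorrect) for every item, in item order.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[i] for row in responses) / n_students
        for i in range(n_items)
    ]


def outside_target(values, low=0.60, high=0.65):
    """Return the 0-based indexes of items whose facility is outside [low, high]."""
    return [i for i, p in enumerate(values) if not (low <= p <= high)]


# Hypothetical pilot data: 5 students (rows), 3 items (columns).
pilot = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 0, 0],
]
rates = facility(pilot)
print(rates)                  # proportion correct per item
print(outside_target(rates))  # items that may be too easy or too hard
```

With a real class the band is better treated as a rough guide than a pass/fail rule, since small cohorts give noisy proportions.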
3.1 Writing Stems
(i) Present a single, definite statement or direct question to be completed or answered by one of the several given choices
A. original stem
are made up of thousands of smaller units called monosaccharides
are NOT found in the aloe vera leaf
are created during photosynthesis
can be described by the chemical formula: CHHOH
B. improved stem
Polysaccharides of the plant cell wall are synthesized mainly in the endoplasmic reticulum
In Example A, there is no sense from the stem what the question is asking. Example B more clearly identifies the question and offers the student a set of homogeneous choices.
(ii) Avoid unnecessary and irrelevant material in the stem. It should be clear and unambiguous.
A. original stem:
Paul Muldoon, an Irish postmodern poet who uses experimental and playful language, uses which poetic genre in "Why Brownlee Left"?
B. improved stem
Paul Muldoon uses which poetic genre in "Why Brownlee Left"?
d. dramatic monologue
Example A contains material irrelevant to the question. This sort of material should not be used to make the answer less obvious, because it tends to place too much importance on reading comprehension as a determiner of the correct option.
(iii) Use clear, straightforward language in the stem of the item.
Questions that are constructed using complex or imprecise wording may become a test of reading comprehension rather than an assessment of whether the student knows the subject matter.
A. original stem
As the level of fertility approaches its nadir, what is the most likely ramification for the citizenry of a developing nation?
a decrease in the workforce participation rate of women
a dispersing effect on population concentration
a downward trend in the youth dependency ratio
a broader base in the population pyramid
an increased infant mortality rate
B. improved stem
A major decline in fertility in a developing nation is likely to produce a
decrease in the workforce participation rate of women
dispersing effect on population concentration
downward trend in the youth dependency ratio
broader base in the population pyramid
increased infant mortality rate
(iv) Use negatives sparingly in the stem. If negatives must be used, capitalize, underscore, embolden or otherwise highlight them. Negatives include words such as ‘except’ and ‘only’.
A. original stem
Which one of the following is not a symptom of osteoporosis?
B. improved stem
Which one of the following is a symptom of osteoporosis?
decreased bone density
raised body temperature
Negatives in the stem usually require that the answer be a false statement. Because students are likely in the habit of searching for true statements, this may introduce an unwanted bias.
(v) Put as much of the question in the stem as possible, rather than duplicating material in each of the options.
A. original stem
Theorists of pluralism have asserted which of the following?
The maintenance of democracy requires a large middle class.
The maintenance of democracy requires autonomous centres of countervailing power.
The maintenance of democracy requires the existence of a multiplicity of religious groups.
The maintenance of democracy requires a predominantly urban population.
The maintenance of democracy requires the separation of governmental powers.
B. improved stem
Theorists of pluralism have asserted that the maintenance of democracy requires
a large middle class
autonomous centres of countervailing power
the existence of a multiplicity of religious groups
a predominantly urban population
separation of governmental powers
Another example: if the point of an item is to associate a term with its definition, the preferred format would be to present the definition in the stem and several terms as options, rather than to present the term in the stem and several definitions as options.
(vi) Avoid irrelevant clues to the correct option in the stem.
Grammatical construction, for example, may lead students to reject options which are grammatically incorrect as the stem is stated. Perhaps more common and subtle, though, is the problem of common elements in the stem and in the answer.
Consider the following item:
What led to the formation of the States’ Rights Party?
a. The level of federal taxation
b. The demand of states for the right to make their own laws
c. The industrialization of the South
d. The corruption of federal legislators on the issue of state taxation
One does not need to know U.S. history in order to be attracted to the answer, b.
3.2 Writing Distractors
Writing distractors is more difficult than writing stems. They are called distractors because they are strategically designed to attract examinees who have not completely mastered the content and skills. This is not tricky, deceptive or unfair: the goal of testing is to find out who has learned the content and can apply the skills and who has not, perhaps along a continuum between the two. Students who have mastered the material should recognize the key (correct answer), and those who have not should not (Parkes).
(i) Decide on how many distractors to write
According to Nitko (2001), there is no magic number that you should use. A 1987 study by Owen & Freeman suggests that three choices are sufficient. Clearly, the higher the number of distractors, the less likely it is that the correct answer will be chosen through guessing (provided all alternatives are of equal difficulty) (Bull & McKenna, 2002). Be satisfied with three or four well-constructed options. Generally, the minimal improvement to the item due to that hard-to-come-by fifth option is not worth the effort to construct it (Kehoe, 1995).
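The effect of the number of options on guessing can be made concrete. The sketch below assumes blind guessing among equally attractive options (the condition noted above) and treats every item as an independent guess. The function names, the 20-item test and the pass mark of 10 are illustrative assumptions, not figures from the cited authors.

```python
from math import comb


def chance_of_guessing_item(n_options):
    """Probability of hitting the key by blind guessing on one item."""
    return 1 / n_options


def chance_of_reaching(pass_mark, n_items, n_options):
    """Probability of scoring at least `pass_mark` by blind guessing alone.

    Uses a binomial model: each item is an independent guess with the
    same chance of success.
    """
    p = chance_of_guessing_item(n_options)
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )


# Compare three, four and five options on a hypothetical 20-item test.
for n_options in (3, 4, 5):
    print(n_options, round(chance_of_reaching(10, 20, n_options), 4))
```

Under this model the chance of passing by guessing falls as options are added, but only if every extra distractor is plausible. An option that no student would choose leaves the effective number of options unchanged.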
(iii) Follow these hints to avoid test validity problems
Try to write items in which there is one and only one correct or clearly best answer, and one on which experts would agree.
Be sure wrong answer choices (distractors) are at least plausible.
For example, a distractor can be correct but not answer the question. However, the distractor must not be so close to the correct answer that it confuses students who really do know the answer. Incorporate common student misunderstandings or errors in distractors.
The position of the correct answer should vary randomly from item to item.
After the options are written, vary the location of the answer on as random a basis as possible. A convenient method is to flip two (or three) coins at a time where each possible Head-Tail combination is associated with a particular location for the answer. Students should be informed that the locations are randomized. (Testwise students know that for some instructors the first option is rarely the answer.)
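Where items are stored electronically, the coin flips can be replaced by a pseudo-random shuffle. This is a minimal sketch, assuming each item is held as a list of option texts plus the position of the key. The function name `shuffle_options`, the placeholder option texts and the seed value are hypothetical.

```python
import random


def shuffle_options(options, key_index, rng):
    """Shuffle the options and return (shuffled_options, new_key_index).

    `options` is the list of option texts; `key_index` is the 0-based
    position of the correct answer before shuffling.
    """
    order = list(range(len(options)))
    rng.shuffle(order)  # random permutation of the original positions
    shuffled = [options[i] for i in order]
    return shuffled, order.index(key_index)


rng = random.Random(2024)  # fixed seed so the same paper can be regenerated
options = ["key", "distractor 1", "distractor 2", "distractor 3"]
shuffled, key_pos = shuffle_options(options, 0, rng)
print(shuffled, "-> key is option", "ABCD"[key_pos])
```

Options with a natural order (numbers, dates, age periods) are usually better left in that order, so a shuffle like this suits only items whose options have no inherent sequence.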
Avoid overlapping alternatives.
For example, in the original form of this item, if either of the first two alternatives is correct, ‘C’ is also correct.
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
B. Preschool period
C. Before adolescence
D. During adolescence
E. After adolescence
1. During what age period is thumb-sucking likely to produce the greatest psychological trauma?
The length of the response options should be about the same within each item (preferably short).
Adherence to this rule avoids some of the more common sources of biased cueing. For example, we sometimes find ourselves increasing the length and specificity of the answer (relative to distractors) in order to ensure its truthfulness. This, however, becomes an easy-to-spot clue for the testwise student. The number of students choosing a distractor should depend only on deficits in the content area which the item targets and should not depend on cue biases or reading comprehension differences in ‘favour’ of the distractor.
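A rough automated check can flag items where the key stands out by length before a human reviewer looks at them. The sketch below assumes option length is measured in characters. The function name `key_is_length_cue` and the 1.5 ratio are arbitrary illustrative choices, not a published threshold.

```python
def key_is_length_cue(options, key_index, ratio=1.5):
    """Return True if the key is noticeably longer than every distractor.

    "Noticeably" means at least `ratio` times the length (in characters)
    of the longest distractor.
    """
    key_len = len(options[key_index])
    longest_distractor = max(
        len(text) for i, text in enumerate(options) if i != key_index
    )
    return key_len >= ratio * longest_distractor


# Hypothetical item in which the key (index 1) is padded with qualifications.
item = [
    "a tax dispute",
    "a demand by several states for the right to make their own laws",
    "industrial growth",
    "a corruption scandal",
]
print(key_is_length_cue(item, 1))
```

A flag from a check like this is only a prompt to re-read the item; the remedy is still to shorten the key or lengthen the distractors by hand.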
There should be no grammatical clues to the correct answer.
1. Albert Einstein was a:
1. Who was Albert Einstein?
A. An anthropologist.
B. An astronomer.
C. A chemist.
D. A mathematician.
Avoid excessive use of negatives and/or double negatives and words such as ‘always’, ‘never’, and ‘all’.
Avoid the use of ‘All of the above’, ‘both a. and e. above’, and ‘None of the above’ in the response alternatives when students are asked to choose the best answer.
In the case of ‘All of the above’, students need only partial information to answer the question. They need only know that two of the options are correct (in a question with four or more options) to determine that ‘All of the above’ is the correct answer choice. Conversely, they need only eliminate one answer choice as implausible to rule out ‘All of the above’. Similarly, when ‘None of the above’ is used as the correct answer choice, information is gained about students’ ability to detect incorrect answers, but the item does not reveal whether students know the correct answer to the question.
4. Reviewing the MCQs: guidelines (Cohen & Wollack, 2000)
Cohen and Wollack recommend these guidelines for reviewing individual questions/items before students sit the test.
1. Consider the item as a whole and whether
it measures knowledge or a skill component which is worthwhile and appropriate for the examinees who will be tested
there is a markedly better way to test what this item tests
it is of the appropriate level of difficulty for the examinees who will be tested.
2. Consider the stem and whether it
could be worded more simply, clearly or concisely.
3. Consider the alternatives and whether
they are parallel in structure
they fit logically and grammatically with the stem
they could be worded more simply, clearly or concisely
any are so inclusive that they logically eliminate another more restricted option from being a possible answer.
4. Consider the key and whether it
is the best answer among the set of options for the item
actually answers the question posed in the stem
is too obvious relative to the other alternatives (i.e., should be shortened, lengthened, given greater numbers of details, made less concrete).
5. Consider the distractors and whether
there is any way you could justify one or more as an acceptable correct answer
they are plausible enough to be attractive to examinees who are misinformed or ill-prepared
any one calls attention to the key (e.g., no distractor should merely state the reverse of the key or resemble the key very closely unless another pair of choices is similarly parallel or involves opposites).
5. Preparing Your Students for Taking Multiple-Choice Tests
(Dewey, 1998 in Zimmaro, 2004, Parkes)
1. Remind students of the learning outcomes and give them the exam map/blueprint so that they are not forced to guess what will be on a test. Exam questions should cover all the important ideas in the course.
2. Let students write multiple-choice items for revision purposes (not for the actual test). This puts them "behind the scenes" and helps them identify material that might be on the test and how it might be asked. It also allows you to gauge their depth of understanding and what issues/topics they are focusing on (which may not align with yours). (Parkes)
3. Give students practice in doing MCQs so they learn strategies for answering them and for managing the time spent on each question. Test items can then be modelled on, but should not copy, the practice questions. The practice questions can be done in the first 5 minutes of a lecture and then corrected, or offered online with immediate feedback on the correct answer (and why a chosen answer was incorrect).
4. Defeat the lazy student’s rules of thumb for taking MCQ tests. Inform students that these rules WON’T work because YOU know about them and have dealt with them in designing questions (Dewey, 1998).
Students’ rules of thumb (test-wise strategies), or how to avoid studying, and what to do about each:
a. Pick the longest answer
make sure the longest answer is correct only part of the time
try to make options equal length
b. When in doubt pick ‘c’
make sure the correct answer choice letter varies
c. Never pick an answer which uses the word ‘always’ or ‘never’ in it
make sure this option is correct part of the time or avoid using always and never in the option choices
d. If there are two answers which express opposites, pick one or the other and ignore other alternatives
sometimes offer opposites when neither is correct or offer two pairs of opposites
e. If in doubt, guess
use five alternatives instead of three or four to reduce guessing
f. Pick the scientific-sounding answer
use scientific sounding jargon in wrong answers
g. Don’t pick an answer which is too simple or obvious
h. Pick a word which you remember was related to the topic
when creating the distractors use terminology from the same area of the text as the right answer, but in distractors use those words incorrectly so the wrong answers are definitely wrong
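Rule b. ("When in doubt pick ‘c’") can be checked mechanically once the answer key for a whole test is written. This is a minimal sketch, assuming the key is stored as a string of letters, one per item. The function names, the sample key and the tolerance of twice the even share are hypothetical choices.

```python
from collections import Counter


def key_position_counts(answer_key, letters="ABCD"):
    """Count how often each option letter holds the correct answer."""
    counts = Counter(answer_key)
    return {letter: counts.get(letter, 0) for letter in letters}


def overused_positions(answer_key, letters="ABCD", tolerance=2.0):
    """Return letters that hold the key more than `tolerance` times
    as often as an even spread across positions would give."""
    expected = len(answer_key) / len(letters)
    counts = key_position_counts(answer_key, letters)
    return [letter for letter in letters if counts[letter] > tolerance * expected]


# Hypothetical 10-item answer key that leans heavily on option C.
answer_key = "CCACCBCCDC"
print(key_position_counts(answer_key))
print(overused_positions(answer_key))
```

On short tests some imbalance occurs by chance, so the tolerance is deliberately loose; the aim is only to catch the habitual favourite position that testwise students rely on.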
Bull, J. & McKenna, C. (2002). Computer Assisted Assessment Centre. Retrieved 8 February 2009 from http://www.caacentre.ac.uk/resources/objective_tests/index.shtml
Brown, G. & Pendlebury, M. (1992). Assessing Active Learning. Sheffield: CVCP, USDU.
Cohen, A., & Wollack, J. (2000). Handbook on test development: Helpful tips for creating reliable and valid classroom tests. Madison, WI: University of Wisconsin, Center for Placement Testing. Retrieved 13 October, 2003 from http://testing.wisc.edu/Handbook%20on%20Test%20Construction.pdf
Dewey, R. A. (1998, January 20). Writing multiple choice items which require comprehension. Retrieved November 3, 2003 from http://www.psywww.com/selfquiz/aboutq.htm
Kehoe, J. (1995). Writing multiple-choice test items. Practical Assessment, Research & Evaluation, 4(9). Retrieved July 29, 2008 from http://PAREonline.net/getvn.asp?v=4&n=9
Nitko, A. J. (2001). Educational assessment of students (3rd ed.). Columbus, OH: Merrill Prentice Hall.
Owen, S. & Freeman, R. (1987). What's wrong with three option multiple items? Educational & Psychological Measurement, 47, 513-22.
Parkes, J. Multiple Choice Test. Retrieved 20 September 2005 from http://www.flaguide.org/cat/mutiplechoicetest/multiple_choice_test7.php
Zimmaro, D. (2004). Writing Good Multiple-Choice Exams. Measurement and Evaluation Center, University of Texas, Austin.
1 Note that this version does not go into statistical analysis of MCQs.