The Quality of MCQs Used for Finals at the Department of Internal Medicine, College of Medicine

Authors

Omer A Elfaki
Khalid A Bahamdan
Suliman Al-Humayed

Theme

What is a good assessment and written performance assessment

Category

Student Assessment/Student Engagement

Institution

King Khalid University

Background

·  The MCQs exam in internal medicine for finals at the College of Medicine, KKU.

·  100 MCQs of the one-best-answer type with four options.

·  Some basic forms of item analysis have been carried out by the department before.

·  The data generated have not been used regularly to assess the quality of the questions or as feedback for quality improvement.

·  Aim of this study: to assess the quality of the MCQs used for finals in internal medicine in January 2012.


Conclusion

·  The MCQs exam was reliable (KR-20 = 0.79) and the overall difficulty of the questions was reasonable.

·  The discrimination power of most of the questions was acceptable; however, a relatively high proportion of the questions had unacceptable discrimination index values.

·  Distractor effectiveness was good compared with similar studies.

Take-home Messages

·  Most of the questions were of good quality.

·  A few questions need to be reviewed and improved before reuse in future exams.

·  Training some staff members in writing good-quality MCQs is recommended.

·  The results of the analysis should be used systematically as feedback to students and staff to improve the quality of learning, teaching and future assessment.


Summary of Work

·  Item analysis was done using Microsoft Excel 2007.

·  The total number of students in this batch was 58 and the total number of MCQs was 100.

·  The parameters obtained included the difficulty index, discrimination index, point biserial coefficient and exam reliability (KR-20 formula), in addition to distractor analysis.

Summary of Results

 

Parameter                                        Value (SD)
Reliability (KR-20)                              0.79
Mean difficulty                                  0.55 (0.24)
Mean discrimination index                        0.24 (0.20)
Mean rpbi                                        0.16 (0.12)
Mean number of functioning distractors per item  1.76
Mean score                                       54.6 (9.4)
Maximum score                                    71.0
Minimum score                                    31.0

Acknowledgement

·  The authors thank all faculty of the Department of Internal Medicine for supplying the required test items.

·  Special thanks are extended to the members of the examination committee for item review and selection.


References

1- Farley JK. The multiple choice test: writing the questions. Nurse Educator 1989; 14: 10-12, 39.

2- Considine J, Botti M, Thomas S. Design, format, validity and reliability of multiple choice questions for use in nursing research and education. Collegian 2005; 12: 19-24.

3- Skakun EN, Nanson EM, Kling S, Taylor WC. A preliminary investigation of three types of multiple choice questions. Med Educ 1979; 13: 91-96.

4- Schuwirth LWT, van der Vleuten CPM. Different written assessment methods: what can be said about their strengths and weaknesses? Med Educ 2004; 38(9): 974-979.

5- Kemp JE, Morrison GR, Ross SM. Developing evaluation instruments. In: Designing Effective Instruction. New York, NY: Macmillan College Publishing Company, 1994: 180-213.

6- Kehoe J. Basic item analysis for multiple choice tests. Practical Assessment, Research & Evaluation 1995; 4(10).

7- Crocker L, Algina J. Introduction to Classical and Modern Test Theory. New York: Holt, Rinehart and Winston, 1986.

8- Birnbaum L. Guidelines for writing multiple choice questions. Journal of Professional Exercise Physiology 2008; 6(4). Retrieved Feb 10, 2012 from: http://www.exercisephysiologists.com/JPEPFeb2008MCguidelines/index.html

9- Abdel-Hameed AA, Al-Faris EA, Alorainy IA, Al-Rukban MO. The criteria and analysis of good multiple choice questions in a health professional setting. Saudi Medical Journal 2005; 26(10): 1505-1510.

10- Chiavaroli N, Familari M. When majority doesn't rule: the use of discrimination indices to improve the quality of MCQs. Bioscience Education 2011; 17(8). Retrieved Feb 4, 2012 from: http://www.bioscience.heacademy.ac.uk/journal/vol17/beej-17-8.pdf

11- Gronlund NE, Linn RL. Measurement and Evaluation in Teaching (6th ed). New York: Macmillan, 1990.

12- Michigan State University, Academic Technology Services. Introduction to item analysis. Retrieved Feb 12, 2012 from: http://scoring.msu.edu/itanhand.html

13- Backhoff E, Larrazolo N, Rosas M. The level of difficulty and discrimination power of the Basic Knowledge and Skills Examination (EXHCOBA). Revista Electrónica de Investigación Educativa 2000; 2(1). Retrieved March 2, 2012 from: http://redie.uabc.mx/vol2no1/contents-backhoff.html

14- Ebel RL. Essentials of Educational Measurement (1st ed). New Jersey: Prentice Hall, 1972.

15- Haladyna TM, Downing SM. How many options is enough for a multiple-choice test item? Educ Psychol Meas 1993; 53(4): 999-1010.

16- Tarrant M, Ware J, Mohammed AM. An assessment of functioning and non-functioning distracters in multiple-choice questions: a descriptive analysis. BMC Med Educ 2009; 9: 40.

17- Haladyna TM, Downing SM. Validity of a taxonomy of multiple-choice item-writing rules. Appl Meas Educ 1989; 2(1): 51-78.

18- Rodriguez MC. Three options are optimal for multiple-choice items: a meta-analysis of 80 years of research. Educ Meas Issues Pract 2005; 24(2): 3-13.

Conclusion

The difficulty indices of the 100 MCQs taken by 58 medical students were calculated and found to be acceptable. Based on this parameter, the exam was considered to be of good quality.

The discrimination indices of the 100 MCQs were determined using Excel. Most of the questions were of acceptable quality and a few of them needed improvement. Only seven questions had an unacceptable discrimination index and must be completely replaced.

The presence of high-quality distractors is essential for the construction of good MCQs. Results from this study showed that the quality of the distractors was comparable to that reported in similar studies. Some of the items with nonfunctioning distractors (NFDs) needed to be revised.

Summary of Work

Multiple choice question items were taken from the summative assessment paper of internal medicine for graduating students in January 2012. A total of 100 test items were used. The MCQ items were written by individual teachers and vetted at the Department of Internal Medicine by the examination committee for item clarity, accuracy, content and structure. All of the items were type A MCQs consisting of a stem and four choices, from which the students were to select the one best answer. A correct answer was awarded one mark and there was no negative marking for incorrect answers.

Microsoft Excel 2007 was used to perform item analysis. The responses of each student to each of the MCQ items were analyzed; all 58 students attempted all the questions. The difficulty index (P), the item discrimination index (D), the point biserial correlation (rPB) and the reliability coefficient (KR-20) were computed.

P was defined as the proportion of examinees answering the item correctly and was calculated for each item as P = R/T, where R is the number of examinees who answered the item correctly and T is the total number of examinees who took the test.
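The study performed these calculations in Excel 2007; purely as an illustration, the same computation can be scripted. The sketch below assumes a hypothetical 0/1 mark matrix (scores) of 58 students by 100 items, filled here with random stand-in data:

import numpy as np

# Hypothetical 0/1 mark matrix: 58 students (rows) x 100 items (columns).
# The study kept these marks in an Excel sheet; random data stands in here.
rng = np.random.default_rng(0)
scores = rng.integers(0, 2, size=(58, 100))

# Difficulty index P = R/T per item: R correct responses out of T examinees.
T = scores.shape[0]
P = scores.sum(axis=0) / T

# Mean difficulty of the exam: the average of all item difficulty indices.
print(f"Mean difficulty: {P.mean():.2f}")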

The mean difficulty of the exam was calculated by taking the average of all item difficulty indices. D was calculated by ranking the students according to total score and selecting the top 27% and bottom 27% of students. D was determined using the formula D = (UG − LG) / n, where UG and LG are the numbers of examinees in the upper and lower groups, respectively, who answered the item correctly, and n is the number of students in each group. Based on Ebel's (1972) guidelines on classical test theory item analysis, items were categorized by their discrimination indices as shown in the following table (a code sketch of this calculation follows the table):

 

D value                   Quality      Recommendation
More than 0.39            Excellent    Retain
0.30 – 0.39               Good         Possibility for improvement
0.20 – 0.29               Mediocre     Need to check/review
0.00 – 0.19               Poor         Discard or review in depth
Less than 0.00 (negative) Worst        Definitely discard
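Again for illustration only (the study used Excel), a sketch of the ranking, 27% grouping and Ebel categorization, reusing the hypothetical scores matrix from the earlier sketch:

import numpy as np

def discrimination_indices(scores: np.ndarray) -> np.ndarray:
    """D = (UG - LG) / n from the top and bottom 27% of students by total score."""
    totals = scores.sum(axis=1)
    order = np.argsort(totals)               # students ranked by total score
    n = int(round(0.27 * scores.shape[0]))   # size of each 27% group
    lower = scores[order[:n]]                # bottom 27% of students
    upper = scores[order[-n:]]               # top 27% of students
    return (upper.sum(axis=0) - lower.sum(axis=0)) / n

def ebel_category(d: float) -> str:
    """Map a discrimination index onto Ebel's (1972) categories from the table."""
    if d > 0.39:
        return "Excellent: retain"
    if d >= 0.30:
        return "Good: possibility for improvement"
    if d >= 0.20:
        return "Mediocre: need to check/review"
    if d >= 0.00:
        return "Poor: discard or review in depth"
    return "Worst (negative): definitely discard"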

Another parameter analyzed for all the items was rPB, the point biserial correlation between performance on a single item and the total test score. KR-20 was used to estimate reliability, a measure of internal consistency that shows how well the individual test questions correlate with one another. It was computed as KR-20 = (k / (k − 1)) × (1 − Σ p_i q_i / σ²), where k is the number of items, p_i is the difficulty index of item i, q_i = 1 − p_i, and σ² is the variance of the examinees' total scores.
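A sketch of both statistics under the same hypothetical scores matrix; the study does not state whether the total score was corrected by excluding the item from its own criterion, so the leave-item-out variant below is an assumption:

import numpy as np

def kr20(scores: np.ndarray) -> float:
    """KR-20 reliability for a 0/1 mark matrix (students x items)."""
    k = scores.shape[1]                         # number of items
    p = scores.mean(axis=0)                     # item difficulty indices
    q = 1.0 - p
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)

def point_biserial(scores: np.ndarray) -> np.ndarray:
    """rPB of each item against the total score on the remaining items."""
    totals = scores.sum(axis=1)
    r = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        rest = totals - scores[:, j]            # leave the item out of its criterion
        r[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return r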

In addition, distractor analysis was done. A distractor was considered nonfunctioning (NFD) if it was selected by fewer than 5% of students. A frequency distribution was constructed for the 400 options of the 100 items (300 distractors and 100 correct responses). All distractors with a choice frequency of less than 5% were identified, and the items with 0, 1, 2 and 3 NFDs were counted.
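A sketch of the distractor analysis, assuming hypothetical raw choices (letters A–D) and an answer key; in the study these would come from the students' marked answer sheets:

import numpy as np

rng = np.random.default_rng(0)
options = np.array(list("ABCD"))

# Hypothetical raw choices (58 students x 100 items) and answer key.
answers = rng.choice(options, size=(58, 100))
key = rng.choice(options, size=100)

nfd_per_item = []
for j in range(answers.shape[1]):
    distractors = options[options != key[j]]   # the 3 incorrect options
    freqs = [(answers[:, j] == d).mean() for d in distractors]
    # A distractor is nonfunctioning (NFD) if chosen by < 5% of students.
    nfd_per_item.append(sum(f < 0.05 for f in freqs))

nfd_per_item = np.array(nfd_per_item)
print("Mean functioning distractors per item:", (3 - nfd_per_item).mean())
for k in range(4):
    print(f"Items with {k} NFDs: {(nfd_per_item == k).sum()}")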

Summary of Results

                          

Table 1. The distribution of the MCQs among the different ranges of difficulty indices.

Difficulty Index    No. of Questions    Degree of Difficulty
0 – 0.20            9                   Very difficult
0.21 – 0.40         24                  Moderately difficult
0.41 – 0.60         25                  Intermediate difficulty
0.61 – 0.80         27                  Moderately easy
0.81 – 1.00         15                  Very easy
Total               100

 

Table 2. The distribution of the MCQs among the different ranges of discrimination indices.

Discrimination Index    % of Questions
≤ 0.09                  23
0.10 – 0.19             17
0.20 – 0.29             25
0.30 – 0.39             13
> 0.39                  22
Total                   100%
