Research-based paper on language testing




(Case study: a vocational school in Majalengka)

Yella Dezas Perdani

Indonesia University of Education


Abstract: Language assessment is the term for the way teachers assess learners’ language ability. Reading is one of the abilities that teachers usually assess, and there are many ways to assess learners’ reading skill. Previous research and early interviews with teachers show that most teachers use the multiple-choice test format. They argue that the multiple-choice format is more efficient than other kinds of tests in terms of administration, time, and scoring. This study aims to find out whether the multiple-choice format is also efficient from the learners’ point of view. Its purpose is to examine how effective the format is for testing learners’ reading ability.

This study uses a quantitative method. It is conducted with grade XII learners at a vocational school in Majalengka. There are 59 participants: 26 are in the experiment group and are tested with a multiple-choice test, and 33 are in the control group and are tested with a constructed-response test. The data are collected through tests and analyzed with SPSS. The data show that the multiple-choice test is effective for assessing learners’ reading ability.

Keywords: Language assessment, Reading assessment, Multiple-choice format test, Constructed-response format test

Abstract (Indonesian version): Language assessment is the term for the way teachers assess students’ language ability. Reading is one of the abilities that teachers usually assess in English lessons. There are many ways to assess learners’ reading ability. Previous research and early interviews with teachers show that most teachers use multiple-choice tests. They argue that multiple-choice tests are more efficient than other types in terms of time and scoring. This study aims to find out whether multiple-choice tests are also efficient from the students’ point of view. Its purpose is to examine how effective they are for testing learners’ reading ability.

This study uses a quantitative method. It is conducted with grade XII learners at a vocational school in Majalengka. There are 59 participants: 26 are in the experiment group and are tested with a multiple-choice test, and 33 are in the control group and are tested with an essay test. The data are collected through tests and analyzed with SPSS. The data show that the multiple-choice test is effective for assessing learners’ reading ability.

Keywords (Indonesian version): language assessment, reading assessment, multiple-choice test, essay test


Language evaluation is the process of examining learners’ performance and comparing and judging their ability. It is used to determine whether learners have met the course objectives and how well they have done (Zimmaro, 2004). Several terms describe the process of learning evaluation. First, measurement is “the process of quantifying the observed performance of classroom learners” (Brown, 2010: 5). It draws on observation of all classroom activity during teaching and learning. Braun, Kanjee, Bettinger, and Kremer (2006: 17) regard measurement as the process of assigning numerical values to some concept of learning. It relates to the scoring of learners’ ability.

Second, evaluation is the process of judging elements of education such as programs, curricula, organizations, and institutions (Braun, Kanjee, Bettinger, and Kremer, 2006: 17). It is also “the process of making decision based on the result of the test” (Brown, 2010: 5). The national examination is an example of evaluation: it serves as an indicator of whether the education system is successful.

Third, assessment is the process of collecting, describing, and quantifying information about learners’ performance (Zimmaro, 2004). This is in line with Brown (2010: 3), who states that assessment covers a wide range of methodological techniques for gauging learners’ performance. Assessment therefore relates to learners’ performance. One component of assessment is the weight given to the different subject matter areas on the test. It should match the relative importance of the course objectives and the emphasis given to each subject area during instruction. In addition, Braun, Kanjee, Bettinger, and Kremer (2006: 17) argue that assessment is a vital component of any evaluation, especially in an educational context. It is also the process of obtaining information that is used to make educational decisions about learners and to give them feedback on their progress, strengths, and weaknesses.

The last term is testing. Testing is one of the assessment techniques (Brown, 2010: 3). It usually involves pencil or pen and paper and is carried out in formal circumstances. The atmosphere is usually intimidating for learners. Moreover, Braun, Kanjee, Bettinger, and Kremer (2006: 17) argue that testing is the process of administering a test to measure an objective by using a standard score. This study uses the term testing for language education evaluation.

Moreover, there are four kinds of tests according to Hughes (2003: 11-18): proficiency tests, achievement tests, diagnostic tests, and placement tests. A proficiency test measures learners’ ability in a language. An achievement test is used by teachers to assess whether learners have achieved the objectives of the course. A diagnostic test identifies learners’ strengths and weaknesses in order to determine which material needs to be taught and which does not. A placement test finds out learners’ level of ability. The type of test used in this study is the achievement test.

Observation shows that the tests in the school vary with the teacher and the skill tested. However, the most common tests are the multiple-choice test and the constructed-response test. In addition, the national examination is also a multiple-choice test, so the teachers want to train their learners to answer multiple-choice tests. Other kinds of tests are also used by the teachers. For example, in testing reading skill, the teachers use reading aloud at beginner level. The preparation of a test depends on its purpose. For example, some of the teachers give a test after every competency standard is achieved. They review the lesson first and then create the test based on the material given. However, observation and interviews also show that some of the teachers do not have a scoring format, especially for non-multiple-choice tests that need raters for scoring. They simply give scores based on their own judgment. It would be better for the teachers to have a scoring format to avoid bias and subjective scoring.

In the context of language education, especially English education, the four skills need to be tested by teachers. Reading is one of the skills tested in the school. Reading skill is necessary for learners for at least two reasons. First, reading is important for accomplishing assignments and for taking part in learning activities in the classroom (Hamdan, 2010). Without reading, learners cannot share ideas or participate well in discussions because they do not have enough information and knowledge to share. Second, reading is a prerequisite skill that serves as the basis for writing. Someone who does not read much will not write well, because a good writer is a good reader (Karbalaei, 2010).

To gain a better understanding of how well learners grasp the material given, reading assessment is needed. There are several types of tests that can be used to find out whether learners understand the material. Grabe (2008: 352) divides reading tests according to the purpose of the course. There are several purposes of testing reading, and in this study the purpose is reading-proficiency assessment (standardized testing). It includes cloze, gap-filling formats (rational cloze formats), c-tests (retain initial letters of words removed), cloze elide (remove extra word), text segment ordering, text gap, choosing from a “heading bank” for identified paragraphs, multiple-choice, sentence completion, matching (and multiple matching) techniques, classification into groups, dichotomous items (true / false / not stated, yes / no questions), editing, short answer, free recall, summary (1 sentence, 2 sentences, 5–6 sentences), information transfer (graphs, tables, flow charts, outlines, maps), project performance, skimming, and scanning.

Besides, Brown (2010: 228-257) divides reading tests into four types. The first, perceptive reading, tests learners’ basic reading ability. Its activities are reading aloud, written response, multiple-choice, and picture-cued items. The second, selective reading, includes multiple-choice (for form-focused criteria), matching tasks, picture-cued tasks, and gap-filling tasks. The third is interactive reading. It consists of cloze tasks, impromptu reading plus comprehension questions, short-answer tasks, editing (longer texts), scanning, ordering tasks, and information transfer (reading charts, maps, graphs, and diagrams). The fourth is extensive reading, which deals with longer texts. This study uses two test formats, the multiple-choice test and the constructed-response test. Both assess the learners’ reading skill.

A multiple-choice test offers several answer options for each question. The learners are asked to choose only one of the options as the right answer based on their knowledge (CANG, 2009). Multiple-choice items can be used to measure knowledge outcomes and various other types of learning outcomes. The format is the most widely used for measuring knowledge, comprehension, and application outcomes. The outcomes can range from the simplest to complex ones, depending on what the questions ask. In this study, the multiple-choice test asks the learners to choose one answer from four choices. Previous research has found that the multiple-choice format is the test most used by teachers (Cheung and Bucat, 2002; Currie and Chiramanee, 2010), because it is easy to administer and score. Moreover, in early interviews, some teachers argued that the multiple-choice test is also efficient in terms of scoring time.

There are several advantages and strengths of using a multiple-choice test. First, learning outcomes from simple to complex can be measured. Second, the tasks are highly structured and clear: the learners have to choose one correct answer from several options. Third, a broad sample of achievement can be measured. Fourth, incorrect alternatives provide diagnostic information. By looking at the learners’ answers, teachers can gauge their learners’ knowledge. Fifth, scores are less influenced by guessing than with true-false items because there are usually three, four, or five options. This study uses the 4-option format: the learners have to choose one of A, B, C, or D as the correct answer. Sixth, scores are more reliable than those of subjectively scored items because there is only one correct answer for every question. Seventh, scoring is easy, objective, and reliable. Eighth, item analysis can reveal how difficult each item is and how well it discriminates between the stronger and weaker learners in the class. Ninth, performance can be compared from class to class and year to year. Tenth, it can cover a lot of material very efficiently (about one item per minute of testing time). Eleventh, items can be written so that learners must discriminate among options that vary in degree of correctness. Twelfth, it avoids the absolute judgments found in true-false tests.
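The item analysis mentioned in the eighth point can be illustrated with a minimal sketch. It assumes the common classroom definitions: item difficulty as the proportion of learners who answer an item correctly, and item discrimination as the difference in that proportion between the top-scoring and bottom-scoring groups (here the top and bottom 27%, a conventional but not mandatory cut-off). The function name and the response data are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def item_analysis(
    responses: List[List[int]], group_fraction: float = 0.27
) -> List[Tuple[float, float]]:
    """Return (difficulty, discrimination) for each item.

    responses[s][i] is 1 if learner s answered item i correctly, else 0.
    """
    n_learners = len(responses)
    n_items = len(responses[0])

    # Rank learners by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)

    # Size of the upper and lower comparison groups (at least one learner).
    k = max(1, round(n_learners * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]

    results = []
    for i in range(n_items):
        # Difficulty: share of all learners who got the item right.
        difficulty = sum(r[i] for r in responses) / n_learners
        # Discrimination: upper-group share minus lower-group share.
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / k
        results.append((difficulty, discrimination))
    return results


# Hypothetical answers of six learners on three items (illustration only).
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
results = item_analysis(responses)
for number, (difficulty, discrimination) in enumerate(results, start=1):
    print(f"item {number}: difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
```

A difficulty value near 1 marks an easy item, and a discrimination value near 0 or below suggests that the item does not separate stronger from weaker learners.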

However, the multiple-choice test also has limitations. First, constructing good items is time consuming. The distractors have to be considered carefully so that the questions are neither too easy nor too difficult. Second, it is frequently difficult to find plausible distractors. Third, this item type is ineffective for measuring some types of problem solving and the ability to organize and express ideas. Fourth, real-world problem solving differs: proposing a solution involves a different process from selecting a solution from a set of alternatives. Fifth, the scores can be influenced by reading ability. Sixth, there is a lack of feedback on individual thought processes. It is difficult to determine why individual learners selected incorrect responses and what the learners actually understood about the text and questions given. Seventh, learners can sometimes read more into the question than is intended. Eighth, it often focuses on testing factual information and fails to test higher levels of cognitive thinking. Ninth, sometimes there is more than one defensible “correct” answer. Tenth, the items depend heavily on the learners’ reading ability and the teachers’ writing ability. Eleventh, it does not provide a measure of writing ability. Finally, it may encourage guessing, especially when the learners do not understand the questions or the options given.
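The effect of guessing can be made concrete with a short calculation. The sketch below assumes that a learner guesses every item blindly and independently, so the number of correct answers follows a binomial distribution. The 30 items and four options match the test this study describes; the function names and the threshold of 15 correct answers are illustrative assumptions.

```python
from math import comb


def chance_score(n_items: int, n_options: int) -> float:
    """Expected number of correct answers from blind guessing."""
    return n_items / n_options


def prob_at_least(k: int, n_items: int, n_options: int) -> float:
    """Probability of k or more correct answers from blind guessing (binomial)."""
    p = 1 / n_options
    return sum(
        comb(n_items, j) * p**j * (1 - p) ** (n_items - j)
        for j in range(k, n_items + 1)
    )


# Four-option items versus true-false (two-option) items on a 30-item test.
print(chance_score(30, 4))  # 7.5
print(chance_score(30, 2))  # 15.0

# Chance of reaching half marks (15 of 30) by guessing alone, four options.
print(prob_at_least(15, 30, 4))
```

Under these assumptions a blind guesser expects 7.5 correct answers on four-option items but 15 on true-false items, which is why the four-option format is less sensitive to guessing.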

The second test, given to the control group, is the constructed-response test. Unlike the multiple-choice test, it offers no answer options. According to Downing (2007: 2) and McClellan (2010), a constructed-response test is a task that requires learners to construct their answers rather than select from predetermined options. This study uses the essay format as the constructed-response test. In addition, constructed-response tests involve raters and scoring leaders. Raters are the people hired to score the constructed-response test. In this study, the rater is the researcher herself. The scoring leader is someone who has shown consistently strong scoring performance and who has the interpersonal qualities of a good mentor.

This kind of test has several strengths (Downing, 2007: 2). First, a constructed-response test takes less time to write because it does not need answer options or distractors as a multiple-choice test does. Second, it can be better suited to measuring fluid abilities. It demands higher levels of thinking from learners, so they try their best to answer the questions. Finally, it provides direct evidence of learning. The learners answer the questions based on their own understanding, without any help from options as in a multiple-choice test.

However, there are seven weaknesses of the constructed-response test (Haladyna, 1997). First, scoring is highly involved and laborious because there is no exact answer key as in a multiple-choice test. The raters have to concentrate on checking the answers and assigning scores. Second, the scoring is more subjective. The raters have to score each answer against what the question asks, because there is no exact scoring format as in a multiple-choice test. Third, biases in scoring threaten validity (handwriting, sentence length). The raters have to deal with the handwriting (clear or not, readable or not, tidy or not) and with how much the learners write (a long or a short answer). Fourth, the constructed-response test has lower reliability than the multiple-choice test. Fifth, equating alternate forms is more difficult because the technical process is still immature. Sixth, some studies indicate that multiple-choice (MC) tests have higher predictive validity than constructed-response (CR) tests. Seventh, writing ability influences learners’ test scores and so contaminates the measurement of what they have learned.

Those reasons reflect the teachers’ point of view. This study tries to capture the effectiveness of the multiple-choice format in measuring learners’ reading ability by looking at the learners’ scores on a multiple-choice test. Its purpose is to find out which format is more effective, the multiple-choice test or the constructed-response test. It investigates whether the multiple-choice format is effective in showing learners’ reading ability by comparing it with the constructed-response format.

Furthermore, this study is expected to be useful for teachers who are concerned with finding an appropriate format for assessing learners’ reading ability. The research findings are expected to contribute in several respects: theoretically, practically, and professionally. Theoretically, the study will add empirical support to the existing theories of language assessment in teaching English, especially regarding learners’ reading ability. Practically, the results will help to clarify and define more precisely how effective the multiple-choice format is in assessing reading ability. They will also help to develop the argument about previous theories of assessing reading. Professionally, this study will add to the understanding of reading assessment in teaching English.


This study uses quantitative research. Quantitative research relies on statistical description, that is, it deals with numbers and with a large number of participants or a large sample (Dawson, 2002: 15; Silverman, 2005: 8; Creswell, 2012: 19). The research problem, which concerns the effectiveness of the multiple-choice test in assessing the reading ability of grade XII learners in a vocational school, is used to direct the sample toward the hypotheses proposed in the study (Saedi, 2002; Creswell, 2012: 19). The data found in quantitative research are intended to be generalized.

The research is carried out at a vocational school in Majalengka, West Java. A vocational school is at the same level as a senior high school. The participants are two grade XII classes at the school. The first class is TKJB, or Teknik Komputer Jaringan class B, and the second class is RPLC, or Rekayasa Perangkat Lunak class C. They are chosen by random sampling. There are 59 learners, divided into two groups: the experiment group, which is the TKJB class, and the control group, which is the RPLC class. There are 26 learners in the experiment class and 33 learners in the control class (see Table 1).

The data in this research take the form of test results. Emilia (2011) says that testing is one of the research instruments. A test is usually given at the beginning and at the end of the lesson. In this study, the tests are given at the end of the lesson. The tests assess the learners’ capacity and ability and are relevant to the purpose of the lesson. They are written tests of the learners’ reading ability. The multiple-choice test is given to the experiment group, and the constructed-response test is given to the control group.

The questions in the multiple-choice test are based on material from VOA News, chosen because VOA uses Standard English and offers authentic material. There are 30 questions, each with four answer options (A, B, C, and D). In short, this study examines the effectiveness of the multiple-choice test in testing learners’ reading ability. The questions for the control group, in the form of a constructed-response test, are the same as those in the multiple-choice test. Both tests therefore have the same content and differ only in format.

Table 1

Group             Class   Test format                 Learners
Experiment group  TKJB    Multiple-choice test        26
Control group     RPLC    Constructed-response test   33
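The paper states that the scores are analyzed with SPSS but does not name the statistical test. As one plausible way to compare two independent groups of unequal size, the sketch below computes Welch's t statistic and its approximate degrees of freedom using only the Python standard library. The choice of Welch's test, the function name, and the score lists are assumptions made for illustration; the scores are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard error contributed by each group (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (na - 1) + vb**2 / (nb - 1))
    return t, df


# Hypothetical scores for two small groups (illustration only).
multiple_choice_scores = [70, 75, 80, 65, 85, 90]
constructed_response_scores = [60, 72, 68, 75, 58, 66, 70]

t, df = welch_t(multiple_choice_scores, constructed_response_scores)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The resulting t value would then be compared with the t distribution at the computed degrees of freedom to obtain a p-value, which a statistics package such as SPSS reports directly.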
