It was the second week of my seventh semester. It was Friday, and I had language assessment class at 07.30. There is one strict rule that must be obeyed: come on time. I arrived at 07.45, which made me well and truly late. The sad thing is that I was marked absent, but I still had a good reason to enter the class: the lesson itself.
When I entered the class, my teacher had just instructed his students to make a schedule, which would later be used to pair students who shared the same free time so they could discuss something related to the lesson. Of course, it was not real; it was more like a simulation. Still, I thought the way my teacher formed the groups was unique, and perhaps I should try to emulate it if I become a lecturer one day.
After every student had a partner, the very first thing to do was to read each other’s journals and comment on them by giving suggestions. After we finished commenting on one another’s journals, my teacher discussed what we had learned in the previous week.
In the first meeting I learned about the concept of language assessment and the kinds of assessment; in the second meeting I learned about test qualities. The term given to these qualities is validity. As I understand it, validity reflects the quality of a test: it shows whether or not the test matches the lessons that have been taught. In short, the test must examine what is supposed to be examined. Furthermore, the test must reflect the real ability or proficiency a person has achieved.
My teacher told the whole class that validity consists of the 3Cs: construct, content, and criterion. The first, construct validity, concerns the measurement of hidden knowledge and abilities, for example reading skill. We can judge a singer’s skill by hearing their voice, and then we know whether it is good or bad. However, we cannot directly observe the extent of a person’s reading skill, because it tends to be invisible.
The second, content validity, is a test quality which guarantees that a test matches the content. In other words, the content validity of a test can be called good if the test examines what should be examined. In a school context, the test must match the curriculum, the syllabus, or the lessons in the classroom; outside school, the opposite holds, and the test is not tied to classroom lessons. TOEFL, IELTS, and a painting competition are examples of content validity outside a school context.
The third and last is criterion-related validity. As its name suggests, it measures a person’s level according to a criterion. A student can be given an ‘A’ by his professor based on particular standards: for example, ‘A’ is given when the result is the best of the best, ‘B’ means ‘good’, ‘C’ means ‘enough’, and so on. This also applies to competitions, which commonly measure the participants’ skills according to certain criteria. Criterion-related validity includes concurrent validity, where the quality is seen by comparing several of a student’s test results. For example, a student got an ‘A’ in every quiz and in the mid test, but then unexpectedly got a ‘C’ in the final test. The question is, what caused the ‘C’? Was it caused by the student, or by a poor-quality test that the student therefore failed to cope with? There is also predictive validity, which concerns how well a test predicts a person’s future performance. It is commonly used in IELTS, TOEFL, TOEIC, or any similar test to determine a person’s level of language proficiency. For example, a score of 6 in IELTS is considered a standard for a person to study abroad, so if I get an IELTS score of 7, I know that I have adequate ability to continue my studies in the UK.
So, that is all I got from language assessment that day. It is important to know the quality of a test because a good test produces a result that can be trusted. A bad test, on the other hand, distorts the result, so the result can be expected to be bad as well.