Assessment, Examination, Evaluation: Are they interchangeable?

ACER news

In the last sharing session, held on 13 November 2020, ACER Indonesia, together with the Indonesian Educational Evaluation Association (HEPI) and the Indonesian Psychometric Association (APSIMETRI), raised the phenomenon of the terms examination, evaluation, testing, and assessment being used interchangeably, not only by the public but even among educators and practitioners. Once it is clear whether they really are interchangeable, a no less important question follows: why does it matter for us to know the difference?

Key points discussed in the session included the terminology itself and the terms’ characteristics, purposes, similarities, and differences. The first speaker, Bahrul Hayat, PhD, opened the session with background on how these words are not used exclusively in the education sector but also in many other contexts; the medical sector, for example, uses the word ‘examination’ for its own purposes. He then stated that his talk would draw on his reading of the literature on assessment, examination, and evaluation from the last couple of decades, and that the discussion would cover the terms only in their educational sense.

Mr Bahrul described the emergence and development of the three terms chronologically, starting with evaluation, which emerged along with the birth of science at the end of the 19th century. He traced it from the formalisation of the term by Edward Thorndike, to John Dewey’s call for evaluation in formal education in America, to Ralph Tyler’s thoughts on educational objectives and educational evaluation in his book, Basic Principles of Curriculum and Instruction (1949).

He went on to explain that in Indonesia, Tyler’s theory materialised in the 1975, 1984, and 1994 curricula, which consisted of national (educational) objectives, institutional objectives, and general and specific instructional objectives. (Educational) evaluation was the term used for collecting the data that show whether those objectives have been fulfilled. From the 1950s to the 1980s, the term evaluation carried two meanings: evaluation of an individual student’s learning outcomes and evaluation of an educational program. Today, evaluation still refers to the measurement of a program or policy, from the divisional level up to the national or systemic level. Only in the 1990s did assessment come into use in Indonesia, particularly for collecting data about individual learning outcomes, although the term itself carries a far wider meaning than that. Lastly, examination has a very specific meaning: it is a subset of today’s understanding of assessment, on the condition that the result of the assessment is used to make a decision such as promotion, admission, or certification. A pass/fail decision is an integral part of any examination. Nowadays, “assessment” is widely used to map the progress and competence of an individual or a system in a (learning) program. PISA is the most famous example of an assessment at a systemic level.

Speaking from the viewpoint of a government official, Professor Ali Saukah noted that the usage of these terms throughout written regulations is not yet consistent, and expressed his wish for a widely accepted and attested set of terms with proper definitions that everyone, especially academics and those working in the education sector, can always refer to.

Professor Ali suggested that an easy way to distinguish assessment from evaluation is to look at how each treats and uses its results. Both collect data and arrive at their results through a process of analysis. The result of an assessment is descriptive in nature: it informs the educator, or whoever is in charge of the system, about what should and should not be done to achieve a satisfactory outcome for the recipient. Evaluation, on the other hand, always entails a value. It always delivers a judgment about something, and it is also norm-referenced, using a benchmark.

So are they interchangeable? Both speakers agreed that, given their different characteristics, the terms are not substitutable. There is a need for a glossary that formally explains each term, because each serves a distinctly different purpose and each is equally vital in informing us about the suitability of the effort we put into any work and the outcome we gain from it.

