20. International Conference
Pedagogical tests development
Dr. Vadim Avanesov
18-19 March 2002, Almaty, Kazakhstan
This report presents the methodological aspects of test development and use in educational systems. In Western countries, test development is the province of the science of Educational Measurement. Its subject matter is the development of high-quality tests that measure students' level of knowledge. Today such tests are also used for ranking students, monitoring the educational process, adaptive learning and adaptive test control, and distance learning; generally speaking, tests are used in all modern educational technologies.
The test method remains relevant among other methods of assessment because of several competitive advantages.
The five key advantages are:
1. High scientific validity of the tests themselves, which allows an objective assessment of the examinees' knowledge level;
2. Technological efficiency (manufacturability) of test methods;
3. Measurement accuracy;
4. Common rules for conducting pedagogical test control and for adequate interpretation of testing results;
5. Compatibility of testing technology with other modern educational technologies.
A system of definitions and terminology, together with the form and content of tests, makes up the theoretical and methodological base of test methods. Methodological questions of test methods, quality check criteria and mathematical models of educational measurement are not covered by the report.
There is a hierarchy of five basic subordinated concepts. The author of the report has investigated the first three concepts, "assignment in test form", "test assignment" and "pedagogical test", along with the associated terminology (1), and then rigorously defined them in work (2). The two other system-forming concepts of the theory, "content" and "form", relate both to individual assignments and to tests as a whole. The results of the investigation are presented in detail in the author's papers (2) and (3).
Definition of a Test
A pedagogical test is a system of assignments of specific form, certain content, and ascending difficulty. The system is intended to give an objective assessment of the structure of students' knowledge and to measure its level.
A brief interpretation of the key concepts helps to understand the definition better.
System means that the test consists of system-forming assignments: all assignments belong to the same system of knowledge (for example, the same subject), are correlated with one another, and are arranged in a definite order. Assignments are arranged in ascending order, from the easiest to the most difficult. In other words, the arrangement of assignments by difficulty is one of the important system-forming characteristics of a test.
Prepared by ЦТ и МКО УГТУ-УПИ, 2005.
Avanesov V.S. Theory and Practice of Pedagogical Measurement (materials of publications)
Specific form means that test assignments are neither questions nor problems: they are assignments formulated as statements that become true or false depending on the answer given. Traditional questions, by contrast, cannot be true or false, and the answers to them are often vague and wordy, so a teacher has to spend considerable intellectual effort to establish whether an answer is correct. In this sense traditional questions and answers are non-technological, that is, they lack manufacturability, and should not be included in tests.
Certain content means that a test uses only assessment material that corresponds to the content of the educational subject; no other material may be included in a pedagogical test. Assessment of intellectual potential, for example, is a matter for psychological testing.
The content of a test exists, is stored and is transferred in one of the four forms of test assignment. Neither a test nor its content can exist in any form other than a test form.
Assignment difficulty is the only theoretically justified criterion in pedagogical testology for ordering homogeneous test content. A pedagogical test should not include any content unrelated to education (for instance, assessment of intellectual potential); that is a subject of psychological testing. Increasing difficulty can be compared with the hurdles on a stadium track where each hurdle is higher than the previous one: only a better-trained runner will manage to run the distance and clear all the hurdles.
Because the assignments are arranged by increasing difficulty, one examinee may fail at the first and easiest assignment, and others at later ones. A student with a medium level of knowledge will succeed with only about half of the test. Finally, only the best-prepared students will solve the most difficult assignments, placed at the end of the test.
The difficulty of an assignment may be determined in two ways: a) speculatively, on the basis of the assumed amount and character of the mental work needed to complete the assignment successfully; and b) empirically, by trying out the assignment and estimating the share of false answers. For many years the classical theory studied only empirical indicators of difficulty. Newer types of tests emphasize the nature of the students' mental work.
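The empirical approach (b) can be illustrated with a minimal sketch. It assumes that trial results are stored as a matrix of ones (true answer) and zeros (false answer), one row per student; the function names and the trial data are hypothetical and are not taken from the report.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Share of false answers for each assignment.

    responses[i][j] is 1 if student i answered assignment j correctly, else 0.
    A value near 0 means an easy assignment, a value near 1 a difficult one.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(1 - row[j] for row in responses) / n_students
        for j in range(n_items)
    ]


def order_by_difficulty(difficulties: List[float]) -> List[int]:
    """Indices of the assignments from the easiest to the most difficult."""
    return sorted(range(len(difficulties)), key=lambda j: difficulties[j])


# Hypothetical trial run: 4 students, 3 assignments.
trial = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
difficulties = item_difficulty(trial)
order = order_by_difficulty(difficulties)
```

Arranging the assignments in the order returned by `order_by_difficulty` gives the ascending-difficulty arrangement required by the definition of a test; with real data a much larger sample of students would be needed for stable estimates.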
An answer to a test assignment is a brief judgment connected in content and form with the content of the assignment. The answer to each assignment may be true or false. The designers of the test determine the criteria of correctness in advance. Grading the offered answers by degree of correctness is not common in practical testology, but if necessary an assignment can be designed in which all answers are true and differ only in degree of accuracy. The instruction for examinees would then be "Circle the number of the most accurate answer!"
The probability of a true answer to any assignment depends on the relation between the student's knowledge level and the difficulty of the assignment. This probability takes values from 0 to 1, provided that knowledge level and difficulty are expressed on comparable scales. Analysis of each student's answers to the test assignments reveals the level and structure of his knowledge. The more true answers are given, the higher the examinee's individual test score. The test score is usually associated with the "knowledge level" and is refined according to a model of pedagogical measurement. The same knowledge level can be obtained through answers to different assignments. For example, a student scores 10 points in a test consisting of 30 assignments. Most likely the score was obtained through true answers to the first 10, comparatively easy, assignments. The sequence of ones followed by zeros in this case ("1" followed by "0") is considered the right profile of the student's knowledge.
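The dependence of the probability of a true answer on knowledge level and assignment difficulty can be sketched as follows. The report does not name a particular measurement model; the one-parameter logistic (Rasch) model used here is an assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    """Probability of a true answer under the one-parameter logistic model.

    Assumption: ability and difficulty are expressed on the same (logit) scale.
    The result always lies strictly between 0 and 1.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When knowledge level equals difficulty, the chance of a true answer is one half.
p_equal = p_correct(0.0, 0.0)
# A better-prepared student has a higher chance on the same assignment.
p_strong = p_correct(2.0, 0.0)
p_weak = p_correct(-2.0, 0.0)
```

In this sketch only the difference between ability and difficulty matters, which is one way of expressing the requirement that the two be measured on comparable scales.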
The opposite situation, in which a student answers difficult assignments correctly and easy ones incorrectly, contradicts the logic of the test, so such a knowledge profile can be called inverted. An inverted profile is rare and is mostly caused by a flaw in the design of the test, where the assignments are not arranged by ascending difficulty. Provided that the test is designed properly, each profile reveals a structure of knowledge. This structure can be called elementary (as distinct from factor structures determined by the methods of factor analysis).
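The difference between a right and an inverted profile can be checked mechanically. The sketch below assumes that a profile is a list of ones and zeros in the order of ascending assignment difficulty; the function names and the count of inversions are illustrative assumptions, not measures defined in the report.

```python
from typing import List


def raw_score(profile: List[int]) -> int:
    """Individual test score: the number of true answers."""
    return sum(profile)


def is_right_profile(profile: List[int]) -> bool:
    """True when every one precedes every zero (ones followed by zeros)."""
    return all(a >= b for a, b in zip(profile, profile[1:]))


def count_inversions(profile: List[int]) -> int:
    """Number of pairs where an easier assignment is failed and a harder one solved."""
    inversions = 0
    zeros_seen = 0
    for answer in profile:
        if answer == 0:
            zeros_seen += 1
        else:
            inversions += zeros_seen
    return inversions


# The example from the text: 10 points in a test of 30 assignments.
right = [1] * 10 + [0] * 20
# The same score obtained only on the most difficult assignments.
inverted = [0] * 20 + [1] * 10
```

Both profiles give the same score of 10, yet only the first one is a right profile; the second has the largest possible number of inversions for that score, which illustrates why the score alone does not describe the structure of knowledge.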
Every educational institution should aim above all at forming sound individual knowledge structures, without gaps in knowledge, and at raising the level of knowledge. Japan and the rapidly developing Asia-Pacific countries evidently follow this principle. The level of knowledge depends mostly on the student's own work and abilities, whereas the structure of knowledge depends largely on proper organization of the educational process, an individual approach to teaching, the teacher's competence and skill, and objective control: in fact, on everything that we lack.
It is above all the content and form of a test that attract the attention of instructors. Content is defined as the reflection of a fragment, or a component, of a school subject in test form; form is defined as the way the components of an assignment are connected and ordered. The content of a test can exist, be stored and be transferred in one of the four main assignment forms. Neither a test nor its content can exist in any form other than a test form.
All test assignments known in theory and practice can be divided into four major groups. The first group consists of assignments with a choice of one or more true answers. An assignment offering a choice of answers (usually one true and several false) is better called an assignment with a choice of one true answer. For example:
1) prime number
2) composite number
3) both prime and composite number
4) neither prime nor composite number
Such assignments have one true answer and several false but plausible ones. The latter are called distractors (from "distract"); the number of distractors may vary from 1 to 5.
Alongside assignments with a choice of one answer, assignments with a choice of several true answers are now widespread. They are more difficult in content than one-answer assignments. They carry the instruction "Circle the numbers of all true answers".
1) atom 2) knowledge 3) being 4) liberty 5) development 6) quality 7) culture 8) revolution 9)
dialectics 10) quantity
1) tobacco 2) jewelry 3) grain 4) cars 5) petrol 6) sausages 7) bread 8) alcoholic beverages
The examinee has to determine which answers are true and which are false, and to judge whether the set of chosen answers is complete.
The second group consists of assignments in which the examinee supplies the answer, usually a single word or sign. The standard instruction is "Add".
The third group consists of assignments made up of elements arranged in two columns. Such assignments are preceded by the instruction "Identify the correspondence":
Name                  Meaning of name
1. Fauna              A) Doctor
2. Flora              B) Luck
3. Megaera            C) Fighter
4. Aesculapius        D) Faithful wife
5. Penelope           E) Wicked woman
6. Narcissus          F) Self-enamored
7. Prometheus         G) Vegetative world
                      H) Mysterious man
                      I) Man of striking beauty
Answers: 1_, 2_, 3_, 4_, 5_, 6_, 7_.
In the blank cells of the answer line the examinee enters the letter of the right answer from the second column.
1) Trade profit               A) Share
2) Entrepreneurship profit    B) Trade capital
3) Founder profit             C) Loanable funds
4) Dividend                   D) Industrial capital
                              E) Controlling block of shares
                              F) Variable capital
                              G) Constant capital
Answers: 1_, 2_, 3_, 4_.
The fourth group includes assignments of a procedural or algorithmic nature. Consider an assignment that tests historical knowledge of the events of February-October 1917, which considerably influenced not only the history of Russia but also the course of political events in the entire world. Naturally, students memorize facts while studying the course. But knowing history means not only knowing particular facts; above all it means knowing the historical process, in which the facts studied are ordered in time. Each assignment is preceded by the instruction "Identify the right sequence":
1. EVENTS OF FEBRUARY-OCTOBER 1917
VI Congress of the RSDRP
abdication of Tsar Nicholas II
arrival of Lenin
founding of the Petrograd Soviet
Kornilov revolt
end of dual power
II Congress of Soviets
The examinee enters the rank numbers in the boxes to the left of each element of the assignment. In computer testing the examinee works with a special software tool designed for this form of assignment: after a rank number is entered, the cursor automatically moves to the next box. A second example:
The Fatalist
Princess Mary
Maxim Maximovich
Author's introduction
Pechorin's journal
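Checking an answer to a sequence assignment of this kind can be sketched as follows. The sketch assumes that the entered rank numbers and the answer key are stored as mappings from element to rank and that the answer is scored as simply true or false; the element labels, the key and the function names are hypothetical and do not reproduce the correct order for either example above.

```python
from typing import Dict


def is_valid_ranking(entered: Dict[str, int]) -> bool:
    """True when the rank numbers 1..n are each used exactly once."""
    return sorted(entered.values()) == list(range(1, len(entered) + 1))


def score_ordering(entered: Dict[str, int], key: Dict[str, int]) -> int:
    """Return 1 if every element has the rank given in the key, otherwise 0."""
    if not is_valid_ranking(entered):
        return 0
    return int(entered == key)


# Hypothetical key and answers for an assignment with three elements.
key = {"element A": 2, "element B": 1, "element C": 3}
true_answer = {"element A": 2, "element B": 1, "element C": 3}
false_answer = {"element A": 1, "element B": 2, "element C": 3}
```

Scoring the whole sequence as one true or false answer keeps this form consistent with the other three forms; partial credit for a partly correct sequence would be a different design choice.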
The content of a test is an optimal reflection of the content of education in a system of test assignments. The words "optimal reflection" imply the need to select control material such that the answers to it indicate each student's preparedness with high probability (over 95%).
The requirement of optimal reflection entails a compulsory periodic revision of the goals and meaning of pedagogical activity. Until recently the practice of general secondary education was reduced to mastering a known list of Knowledge, Abilities and Skills; in the methodological literature these are sometimes referred to by their initial letters, KAS. The education ministry supplemented this with the controversial (if not harmful) idea of a so-called "educational minimum", which flatly contradicts the goals of genuine education and upbringing: full intellectual, cultural, moral, aesthetic and physical development. Orientation to the minimum, and checking of the minimum only, is a consequence of a bureaucratic approach to the management of education and of a false, totally minimalist educational policy.
Optimization of content has been a leading idea of both traditional and adaptive testing: to optimize testing means to measure the knowledge of the largest possible number of students quickly, with high quality, at minimal expense, with the smallest number of assignments and in the shortest time.
This idea is close to the task of improving the effectiveness of pedagogical activity through mass knowledge control. A generalization of an ideological kind seems appropriate here: the culture of testing is of interest, above all, to leaders who aim to increase that effectiveness.
Criteria for selecting the content of a test:
1. Correspondence of the content of the test to the goals of testing;
2. Importance of the tested knowledge within the general system of knowledge;
3. Correlation of content and form;
4. Correctness of the content of test assignments;
5. Representativeness of the content of the educational discipline in the content of the test;
6. Correspondence of the content of the test to the current state of science;
7. Completeness and balance of the content of the test;
8. Systematic organization of the content;
9. Variability of the content;
10. Correspondence between the degree of difficulty and the goal of the test.
The lack of scientific research on testing leads to the replacement of genuine tests by unscientific forms and methods. In Russia, for instance, instead of developing tests, funds drawn from the budget and from international loans are spent on developing pseudo-scientific control materials (CMs) used for the Unified State Examination.
A negative aspect of the Russian Unified State Examination is the unrealistic statement of its goals. For example: equal access to education cannot be provided while the general population grows poorer; fighting corruption is ineffective without an anti-corruption law; objective assessment of knowledge is impossible with low-quality tests. The essential issues of the Unified State Examination that have not been worked out at all are juridical, social (social consequences in particular), methodological and metric (issues of exact measurement).
The main lines of work to provide a scientific basis for the testing process are:
training specialists under the "Pedagogical Measurements" program;
post-graduate study and defense of theses on testing problems;
training the faculty of HEIs, secondary special educational institutions, and school teachers in the methodology of test control of knowledge;
publications on the subject.
A brief list of the author's publications
1. "Methodological and theoretical foundations of test control". Thesis for the degree of Doctor of Pedagogical Sciences. State university, 1994. 339 p.
2. "Composition of test assignments". Testing center, 2002. 240 p.
3. "Content of the test. Principles of developing the content of a test. Logical requirements for the content of a test. Knowledge as a subject of test control. Kinds of knowledge" // Managing schools, N 36, 38, 42, 46, 1999, and N 2, 2000.
4. "Basic concepts of testology" // Abstracts of reports of the participants of the workshop-school "Scientific problems of test control of knowledge", 14-18 March 1994. Center of Research of the problems of specialists' training quality, 1994, p. 105-108.
5. "Where will education go" // People's education, N 5, 2001, p. 26-31.
6. "How to overcome the precipice between secondary and high school?" // Managing schools, N 43, November 2000.
7. "Do we really want it?" // Russian Federation today, N 20, September 2001, p. 8-9.
8. "Principles of scientific organization of pedagogic control in high school". M., MISiS, 1989. p.
9. "Certification of tests in ministry fashion" // Official documents in education, N 32 (167), November 2001, p. 99-102.