Computer Adaptive Assessment Project


Glossary

Alberta Computer Adaptive Assessment Glossary

Assessment reliability – Assessment reliability refers to how consistently a test or other measurement instrument measures what it was designed to measure. In other words, are the results reproducible? One common approach to measuring test reliability is to statistically split the test items in half and then compare the results of the two “mini-tests.” Assessment reliability is a condition necessary for an assessment to be considered valid.
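
The split-half approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: items are split by odd and even position, the two half-test totals are compared with a Pearson correlation, and the result is adjusted with the Spearman-Brown formula, a commonly used correction that this glossary does not itself mention. The function names and the sample data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Project a half-test correlation onto the full-length test."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(responses):
    """responses: one list of 0/1 item scores per student."""
    first_half = [sum(r[0::2]) for r in responses]   # items 1, 3, 5, ...
    second_half = [sum(r[1::2]) for r in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(first_half, second_half))

# Hypothetical scores for three students on a four-item test.
sample = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
reliability = split_half_reliability(sample)
```

In this made-up sample the two half-tests rank the students identically, so the estimate comes out at its maximum; real data would normally give a lower value.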

Assessment validity – Assessment validity refers to how well a test or other measurement instrument measures student performance in terms of the knowledge and/or skills it was intended to measure. The three main types of validity are:

  1. content-related validity – assessment items should be representative of the larger area of knowledge and skills being measured.
  2. criterion-related validity – an assessment should have predictive value in terms of how well it correlates with other criteria or measures of student knowledge and skills. In other words, students who receive high scores in mathematics on a specific assessment should also do well on other mathematics tasks.
  3. construct-related validity – how well a test measures the psychological constructs related to the content that is being measured.

Generally, having items created and reviewed by experienced teachers helps to ensure the validity of assessments.

Classical Test Theory (CTT) – Classical Test Theory is a theory of educational and psychological measurement that assumes each test-taker has an observed score and a true score. An examinee’s observed score is composed of the true score plus unobservable measurement error. CTT dominates the construction and evaluation of most standardized tests because it is relatively easy to understand and rests on flexible assumptions. CTT and Item Response Theory are the two most common theories used in the construction of tests.
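
The decomposition of an observed score into a true score plus error can be illustrated numerically. The sketch below is not a procedure used in practice, because true scores and errors are never directly observable; the values are invented so that the errors are uncorrelated with the true scores. It also uses the standard CTT definition of reliability as the ratio of true-score variance to observed-score variance, which this glossary does not state.

```python
from statistics import pvariance

# Hypothetical values: in a real assessment only `observed` would be known.
true_scores = [10, 12, 14, 16, 18]
errors = [1, -1, 0, -1, 1]

# Observed score = true score + measurement error.
observed = [t + e for t, e in zip(true_scores, errors)]

def reliability(true_scores, observed):
    """Share of observed-score variance that is due to true scores."""
    return pvariance(true_scores) / pvariance(observed)

ratio = reliability(true_scores, observed)
```

Because the invented errors are uncorrelated with the true scores, the observed variance is the sum of the true-score variance and the error variance, which is the situation CTT assumes.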

Computer Adaptive Assessment (CAA) – In general terms, CAA is an innovative online form of assessment in which an examinee is presented test items in a sequence that depends on the correctness of the response to the previous item. Through this process, each examinee is administered a unique set of test items that provides an accurate measure of his or her ability. Items are selected from an item bank of developed, reviewed and field-tested assessment items specific to a course and grade. This process of selecting and administering items continues until the CAA system reaches a pre-specified level of accuracy for the student’s ability estimate. CAA is also referred to as Computer Adaptive Testing (CAT).
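
The select, administer, re-estimate and stop cycle described above can be sketched as follows. This is a minimal illustration, not the Alberta system's algorithm. It assumes a one-parameter (Rasch) response model, picks the unused item whose difficulty is closest to the current ability estimate, re-estimates ability as a posterior mean on a fixed grid with a standard normal prior, and stops once the posterior standard deviation falls to a target. The names `run_cat`, `estimate`, `se_target` and `max_items`, and all numeric defaults, are hypothetical.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0

def p_correct(theta, b):
    """Rasch (one-parameter) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate(responses):
    """Posterior mean and SD of ability under a standard normal prior.

    responses: list of (item_difficulty, was_correct) pairs.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.4, max_items=20):
    """bank: dict of item_id -> difficulty; answer: callable(item_id) -> bool."""
    remaining = dict(bank)  # copy so the caller's bank is not modified
    responses, administered = [], []
    theta, se = 0.0, float("inf")
    while remaining and len(administered) < max_items and se > se_target:
        # Choose the unused item whose difficulty is nearest the estimate.
        item_id = min(remaining, key=lambda i: abs(remaining[i] - theta))
        b = remaining.pop(item_id)
        responses.append((b, answer(item_id)))
        administered.append(item_id)
        theta, se = estimate(responses)
    return theta, se, administered

# Hypothetical five-item bank and a student who answers everything correctly.
bank = {"q1": -2.0, "q2": -1.0, "q3": 0.0, "q4": 1.0, "q5": 2.0}
theta, se, used = run_cat(bank, lambda item_id: True)
```

A correct answer pushes the estimate up, so the next item chosen is harder; an incorrect answer does the opposite. With a bank this small the loop usually ends because the items run out rather than because the accuracy target is met.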

Computer-Based Assessment (CBA) – CBA refers to administering assessments by computer rather than by paper-and-pencil methods. CAA is a form of CBA in that students are presented with items via computer.

Field-testing – Field-testing is the process of administering newly developed items (or items that have been significantly revised) to representative groups of students in order to obtain statistical feedback. Item response theory statistics obtained during field-testing are a prerequisite to using items in a CAA.

Formative assessment – Formative assessment helps monitor the progress of learning and the acquisition of learning outcomes during instruction; its purpose is to provide continuous feedback to both students and teachers on learning successes and failures.

Items – Items are test questions that are the measurement building blocks of assessments. A multiple-choice item contains a stem in which the question is presented and a series of alternatives from which the student is to select the correct response.

Item bank – An item bank is a repository or database of developed, reviewed and field-tested assessment questions.

Item development – Item development involves the processes and procedures for creating assessment questions. This practice typically involves item development workshops with experienced content experts (i.e., teachers) who create original assessment items that are then edited, reviewed, and field-tested with representative groups of students.

Item difficulty – Item difficulty refers to how challenging assessment questions are for students. In Classical Test Theory, item difficulty is expressed as the proportion of students who selected the correct response to the item (e.g., an item difficulty of 0.750 indicates that 75% of students selected the correct response). In other words, the easier the item, the higher the value between 0 and 1. In CAA, item difficulty is characterized by an item response theory (IRT) statistic called the “b-parameter,” which gives the position of an item on the ability continuum (e.g., a b-parameter of 0 denotes that the item is of average difficulty).
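
As a small worked example of the classical statistic, the sketch below computes the proportion of correct responses for each item and then orders the items from hardest to easiest. The item names and scores are invented for illustration.

```python
def classical_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0 to 1)."""
    if not item_scores:
        raise ValueError("need at least one response")
    return sum(item_scores) / len(item_scores)

# Hypothetical 0/1 scores from four students on two items.
scores_by_item = {
    "easy_item": [1, 1, 1, 0],
    "hard_item": [0, 1, 0, 0],
}

difficulties = {
    item: classical_difficulty(scores)
    for item, scores in scores_by_item.items()
}

# A lower proportion correct means a harder item, so sort ascending.
hardest_first = sorted(difficulties, key=difficulties.get)
```

Here `easy_item` has a difficulty of 0.75, matching the 75% example in the entry, even though the number is larger for the easier item.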

Item exposure – Item exposure refers to the number of times, and under what conditions, items drawn from an item bank have been administered (exposed). Item exposure is important information to track in a CAA environment to ensure that items are not overexposed. Items that are frequently exposed are retired from the item bank once they have reached a pre-determined exposure limit.
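
One simple way to track exposure and retire items at a limit is sketched below. The class name `ExposureTracker`, its methods, and the limit of two administrations are hypothetical; the glossary does not say how the actual system records exposure or what limit it uses, and this sketch counts administrations only, not the conditions under which they occurred.

```python
from collections import Counter

class ExposureTracker:
    """Counts administrations per item and retires items at a limit."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = Counter()
        self.retired = set()

    def record(self, item_id):
        """Register one administration of an item."""
        if item_id in self.retired:
            raise ValueError(f"item {item_id!r} has been retired")
        self.counts[item_id] += 1
        if self.counts[item_id] >= self.limit:
            self.retired.add(item_id)

    def available(self, item_ids):
        """Return the items that may still be administered, in order."""
        return [i for i in item_ids if i not in self.retired]

# Hypothetical usage: "q1" reaches the limit after two administrations.
tracker = ExposureTracker(limit=2)
tracker.record("q1")
tracker.record("q1")
tracker.record("q2")
still_usable = tracker.available(["q1", "q2", "q3"])
```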

Item Response Theory (IRT) – Item response theory refers to a theory of educational and psychological measurement that aims to describe item and examinee performance in relation to an ability scale called “theta.” For example, for a Mathematics 7 assessment, the theta scale would be “Mathematics 7 ability.” Item characteristics would be described statistically in terms of the probability of students selecting a correct response to the item as a function of Mathematics 7 ability. IRT provides the theoretical/statistical underpinnings of a CAA system.
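
The idea of a correct-response probability as a function of ability can be made concrete with a logistic item characteristic function. The sketch below uses the widely taught three-parameter logistic form; the glossary does not say which IRT model the Alberta system uses, so the choice is an assumption, as are the discrimination parameter `a` and the guessing parameter `c`, which the glossary does not mention. Only the difficulty parameter `b` appears in this glossary.

```python
import math

def icc(theta, b, a=1.0, c=0.0):
    """Probability of a correct response at ability `theta`.

    b: item difficulty (position on the ability scale)
    a: item discrimination (assumed parameter, slope of the curve)
    c: pseudo-guessing level (assumed parameter, lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item of average difficulty (b = 0) evaluated at three abilities.
curve = [icc(theta, b=0.0) for theta in (-1.0, 0.0, 1.0)]
```

With the default `a` and `c`, a student whose ability equals the item's difficulty has an even chance of answering correctly, and the probability rises as ability increases.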

Online assessment – Online assessment involves the use of computers and the Internet (or local area networks) to administer assessments to students. Examinees log on to a secure Web site where they take the assessment through an online application.

Paper-and-pencil tests – These assessments are paper-based tests. Large-scale paper-and-pencil tests generally utilize “bubble-sheets” on which students indicate their answers by filling in circles or “bubbles.” These answer sheets are then scanned with optical mark recognition machines that convert the student responses into electronic files, which are then processed and scored. Small-scale (e.g., in class) paper-and-pencil tests may be simply hand-scored by a teacher.

Psychometrics – Psychometrics refers to an area of science involving the measurement and evaluation of educational and psychological constructs such as knowledge, skills and abilities. Psychometrics provides the theoretical/statistical underpinnings of assessment, such as utilizing item response theory to assess students adaptively. Experts in the area of psychometrics are generally referred to as “psychometricians.”

Summative assessment – Summative assessment typically involves assessing groups of students at specific times of the school year to evaluate the performance of the learning system including instructional programs and student competency. Examples of summative assessments include provincial/territorial assessment programs (e.g., Alberta diploma examinations), entrance exams (e.g., LSAT, GRE), and national/international assessments (e.g., SAIP, TIMSS).

Student ability – Student ability refers to the level of knowledge and skills possessed by a student, relative to the specific constructs being measured. A student’s ability in an area is inferred from their achievement on an assessment designed to measure ability in that area. For example, students of high ability in mathematics at the Grade 7 level (i.e., students who excel in their classroom mathematics work) would likely receive a high score on a Mathematics 7 assessment.

Testing-on-demand – Testing-on-demand refers to the opportunity for students to take assessments throughout the school year, at the discretion of a teacher, rather than at set times during the school year.
