Measurement in educational research and assessment
Pre-established Instruments and Archival Data
All research studies, regardless of whether they are quantitative or qualitative, require the collection of data through some type of measurement. Data are any type of information collected for use in educational research or assessment. The different types of data include numerical information, verbal information, and graphic information. Data are collected by instruments such as tests or surveys, or by developing protocols for observations and interviews.
Understanding the design and the development of measuring instruments will help you not only select the appropriate tool for your study but also critically analyze the measures used in your school and in reading research studies. In addition, it is important for practitioners such as teachers, administrators, counselors, or school psychologists to understand the principles of measurement so they can more effectively use data to improve their practice and address accountability requirements.
Use of Correlation Coefficients in Evaluating Measures
Another type of descriptive statistic used in educational measurement is a correlation. As learned in Chapter One, correlations are measures of the relationship between two variables. Because some time has passed since Chapter One, let us review some basic ideas related to correlation.
When correlations are computed between two sets of scores on the same measure, one would expect them to be high (0.80 or higher). When correlations are computed between two different measures, one would expect the correlations to be moderately high (0.60 to 0.80) if the instruments are measuring the same construct (such as the example of the teacher self-report of effectiveness and principal evaluations). If the instruments measure related but different constructs, such as a measure of self-esteem and a measure of student achievement, the correlation coefficient would be expected to be lower.
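As a concrete illustration, the Pearson correlation coefficient behind these benchmarks can be computed directly from two lists of scores. The sketch below uses invented ratings for six teachers (self-reports of effectiveness and the matching principal evaluations, both on a 1-to-5 scale); the numbers are hypothetical, chosen only to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    ss_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (ss_x * ss_y)

# Hypothetical paired ratings for six teachers.
teacher_self = [3.2, 4.1, 2.8, 4.5, 3.9, 3.0]
principal_eval = [3.0, 4.3, 2.5, 4.4, 4.0, 3.2]

r = pearson_r(teacher_self, principal_eval)
print(f"r = {r:.2f}")  # a high positive correlation for these data
```

A coefficient near 1.0 here would suggest the two measures rank the teachers in nearly the same order, consistent with their tapping the same construct.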
Evaluating the Quality of Educational Measures: Reliability and Validity
Reliability and validity are the two criteria used to judge the quality of all pre-established quantitative measures. Reliability refers to the consistency of scores, that is, an instrument’s ability to produce “approximately” the same score for an individual over repeated testing or across different raters. For example, if we were to measure your intelligence (IQ), and you scored a 120, a reliable IQ test would produce a similar score if we measured your IQ again with the same test.
Equivalent-Form Reliability (Consistency across Different Forms)
Another type of reliability the measurement team might decide to establish for the instrument is equivalent-form reliability. This type of reliability is also referred to as alternate-form reliability. For this type of reliability, two forms of the same test are given to participants, and the scores for each are later correlated to show consistency across the two forms. Although they contain different questions, these two forms assess the exact same content and knowledge, and the norm groups have the same means and standard deviations.
Internal Consistency Reliability
Internal consistency refers to consistency within the instrument. The question addressed is whether the measure is consistently measuring the same trait or ability across all items on the test. The most common method to assess internal consistency is through split-half reliability. This is a form of reliability used when two forms of the same test are not available or it would be difficult to administer the same test twice to the sample group. In these cases, a split-half reliability would be conducted on the instrument. To assess split-half reliability, the instrument is given to a pilot sample, and following its administration, the items on the instrument are split in half! Yes, literally split in half: often, all the even- and all the odd-numbered items are grouped together, and a score for each half of the test is calculated. The scores on the two halves are then correlated to estimate the instrument's internal consistency.
All research studies, regardless of whether they are quantitative or qualitative, require the collection of data through the use of some type of measurement tool. Pre-established instruments refer to a category of measuring tools that have been developed and piloted, usually by someone other than the researcher conducting the current study.
An important feature of pre-established instruments is that they are usually standardized measures. Standardized measures include a fixed set of questions, a framework, and procedures for test administration. They also measure specific outcomes with results compared with a well-defined norm group that has been given the measure at a previous time during the instrument’s development.
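To make the norm-group comparison concrete, the sketch below interprets a single raw score against hypothetical norm-group parameters (a scale normed at mean 100, standard deviation 15, echoing the IQ example earlier in the chapter). It assumes the norm-group scores are roughly normally distributed and uses `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

# Hypothetical norm-group parameters from the instrument's development
# (e.g., a scale normed at mean 100, standard deviation 15).
norms = NormalDist(mu=100, sigma=15)

raw_score = 120
z = (raw_score - norms.mean) / norms.stdev   # standard score
percentile = norms.cdf(raw_score) * 100      # percent of norm group below

print(f"z = {z:.2f}, percentile = {percentile:.0f}")
```

Reporting results this way (standard scores and percentile ranks relative to the norm group) is what makes scores from a standardized measure comparable across test takers and administrations.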