funding sources
Since 1985, Behavioral Research and Teaching has received more than 23 million dollars in federal and state-level research, training, and development funds. An overview of current projects is provided below.
The purpose of this grant is to develop and validate an assessment that extends seamlessly from general education large-scale tests and that reflects growth within and across elementary and middle school on grade-level content standards.
We accomplish this outcome by (a) developing reliable alternate forms of tasks that are aligned with grade-level content standards, (b) vertically scaling them across grade groups in two bands (elementary and middle), (c) collecting criterion-related evidence to anchor interpretation to classroom practices, and (d) using growth modeling to document change, both across and within grades. Using a validation argument, we make the claim that with intensive instruction, this group of students can indeed change their learning trajectories to master grade-level content. To support this claim, we focus on collecting content-related evidence, reliability evidence, criterion-related evidence, and evidence of change over time.
We use four primary data analytic strategies: (a) universal design to create measures accessible for the widest range of students followed by a formal alignment analysis, (b) item response theory (IRT) to develop assessments and to create vertically equated scales, (c) IRT and classical test theory methods to evaluate the reliability and validity of the measures, and (d) growth curve modeling to understand the learning trajectories, with the criterion-related evidence helping us understand why students may or may not improve.
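To illustrate how IRT can be used to compare alternate forms, the following is a minimal sketch in Python. It assumes the one-parameter (Rasch) model, which is only one of several IRT models and is not necessarily the one used in this project. The function names and the item difficulty values are hypothetical and chosen for illustration only.

```python
import math
from typing import Sequence


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    theta is the student's ability and b is the item's difficulty,
    both expressed on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_form_score(theta: float, difficulties: Sequence[float]) -> float:
    """Expected number-correct score on a form for a student of ability theta."""
    return sum(rasch_probability(theta, b) for b in difficulties)


# Hypothetical item difficulties (in logits) for two alternate forms.
form_a = [-1.0, -0.5, 0.0, 0.5, 1.0]
form_b = [-0.9, -0.6, 0.1, 0.4, 1.0]

# Forms of similar difficulty should yield similar expected scores
# at each ability level.
for theta in (-1.0, 0.0, 1.0):
    score_a = expected_form_score(theta, form_a)
    score_b = expected_form_score(theta, form_b)
    print(f"theta={theta:+.1f}  form A={score_a:.2f}  form B={score_b:.2f}")
```

Because item difficulties and student abilities sit on a common scale, the expected scores of two forms can be compared at any ability level, which is the basic idea behind building forms of equivalent difficulty.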
This Progress Monitoring Center is designed to institute the next generation of curriculum-based measurement for school districts. By combining the test development technology of Item Response Theory (IRT) with the measurement sampling plans of curriculum-based measurement, this project can provide school districts with formative assessments in reading that have strong technical adequacy.
First, we plan to develop five skill area inventories in reading and scale them using Item Response Theory (IRT) so that alternate forms of equivalent difficulty can be developed: (a) letter names and sounds, (b) phoneme segmentation and blending, (c) word and sentence reading, (d) oral reading fluency of passages, and (e) comprehension. As part of the measurement system, we can then develop a screening instrument and use it to establish norms for the district in grades K-4 in the fall and in the spring.
Second, we propose building a data entry node on this web site where teachers can enter the values from their frequent measures and receive basic output (summarizing performance over time and reporting on various characteristics of performance like slope, variability, level change, and overlap). We plan to institute frequent measures in each building with 2-3 teachers, targeting students who are below the 10th percentile rank.
Third, we plan to develop training modules on the administration and scoring of the CBMs in these five skill areas and provide both teacher and student materials on a web site so that all measures can be downloaded easily. In this training, teachers are taught basic administration and scoring rules with short video clips and are then tested on their proficiency by scoring several samples; their scores are tracked and they are given feedback on their performance.
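The basic output described for the data entry node (slope, variability, level change, and overlap) can be sketched as follows. This is a minimal Python illustration, not the project's reporting system. The definitions are assumptions: slope is the least-squares trend across sessions, variability is the sample standard deviation, level change is the difference between phase means, and overlap is the proportion of intervention scores that do not exceed the highest baseline score. The function names and scores are hypothetical.

```python
from statistics import mean, stdev


def ols_slope(scores):
    """Least-squares slope of scores against session index (0, 1, 2, ...).

    Requires at least two scores.
    """
    n = len(scores)
    x_bar = (n - 1) / 2
    y_bar = mean(scores)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(scores))
    denominator = sum((x - x_bar) ** 2 for x in range(n))
    return numerator / denominator


def summarize(baseline, intervention):
    """Summarize frequent-measure scores from a baseline and an intervention phase."""
    highest_baseline = max(baseline)
    overlapping = sum(1 for y in intervention if y <= highest_baseline)
    return {
        "slope": ols_slope(intervention),                      # gain per session
        "variability": stdev(intervention),                    # spread of scores
        "level_change": mean(intervention) - mean(baseline),   # shift in average level
        "overlap": overlapping / len(intervention),            # share of overlapping scores
    }


# Hypothetical scores (for example, words read correctly per minute).
baseline = [20, 22, 21]
intervention = [23, 25, 27, 29]

summary = summarize(baseline, intervention)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Other definitions are in common use (for example, variability around the trend line rather than around the mean), so the choices here should be read as one possible convention.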
This project focuses on Instruction, Assessment, and Standards as well as Information Management to Support Decision Making. The application is designed to assist state departments in organizing measurement of state standards into a cohesive framework for decision-making that is responsive to the diversity of students with disabilities. In this project, we accomplish three outcomes so that the process of large-scale testing can become more meaningful for students with disabilities.
First, we develop a computerized model of alignment that includes several dimensions, allowing teachers to understand students' specific skills relative to state standards. We focus on the concurrence between the measures and the standards, the depth of knowledge being assessed in the measures, the range of knowledge being assessed (relative to the standards), and the balance with which the standards are represented in the measures.
Second, we assist teachers in deciding how students should participate in the state test: in the standard manner, with accommodations, with modifications, or by taking an alternate assessment. Rather than basing this decision on a subjective judgment of prior classroom applications (which may be unreliable), we provide a computerized curriculum-based measurement (CBM) system to help teachers decide the optimal manner of participation.
Third, we provide outcome reports that show relationships between various measures of achievement that are aggregated in a meaningful manner. We computerize a data entry and reporting system that allows outcomes to reflect both an idiographic and nomothetic perspective.
This project provides teachers a system for making decisions on how students should participate in large-scale testing programs (standard test, 2%, 1%) and what types of accommodations they should receive.
Assessment: The process for determining how students should participate in large-scale assessments is problematic. With five possible options, what data sources and criteria should IEP teams use? Almost no research exists on making decisions for the 1% other than that the student must be among those with the most significant cognitive disabilities; however, most states disavow (and appropriately so) making decisions on the basis of a disability. In the white paper, we articulated a series of questions that IEP teams had to address about receiving instruction in grade-level content, access skills to this content, use of communication systems, and degree of scaffolds needed for participation.
Accommodation: A need exists for accommodations in large-scale testing, though it is somewhat uncertain how and why they appear to be effective and with which students. The empirical support for them is inconsistent within and across subject areas, and even though they are sometimes effective, they also appear to be either inert (not working for anyone) or overly effective (working for everyone). It is not yet possible, therefore, to simply move research into practice through wide-scale adoption of specific accommodations that have passed the test of replicable empirical support. Yet large-scale testing requires their application. To bridge this gap, the research on accommodations has begun to focus on how teachers make the decision to recommend specific changes in testing.
Alternate achievement standards are designed to enable inferences to grade-level expectations that have been extensively prioritized but maintain high expectations for progress in the general curriculum and assume student performance is contingent on having the supports specified for the assessment. Inferences are stipulated. These assessments reflect changes that can be made in the levels of support or in the breadth, depth, or complexity of the standards that are being assessed. Support refers to the types of scaffolds, prompts, and assistive technologies used in the administration of the assessment. Breadth refers to the extensiveness of content and skills linked to the standards, curriculum, or assessments; depth refers to the level of complexity of cognitive functioning (i.e., factual-associative, conceptual-hierarchical, principles-causative) needed to be successful on performance standards.
In Oregon, each test includes both teacher Administration and Scoring Protocols and Student Materials. Each test comprises 11 tasks. Task 1 contains 10 prerequisite items, and Tasks 2 through 11 each contain five content items.
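As a quick check on the structure just described, the short Python sketch below tallies the tasks and items in one Oregon test. The variable names are illustrative only; the counts come directly from the description above.

```python
# Task 1 holds the prerequisite items; Tasks 2 through 11 hold the content items.
items_per_task = {1: 10}
items_per_task.update({task: 5 for task in range(2, 12)})

total_tasks = len(items_per_task)
total_items = sum(items_per_task.values())

print(total_tasks, total_items)  # prints "11 60"
```

Each test therefore contains 60 items in total: 10 prerequisite items and 50 content items.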
In Alaska, each test includes skills in subject areas that are linked to Extended Grade Level Expectations (ExGLEs), which articulate prerequisite skills for success on grade-level standards; Expanded Levels of Support represent even more basic skills for functioning with others.
In both states, the emphasis is on applying current measurement and analysis models to understanding accountability systems for a population that has never before been included. We address issues of validity as outlined in the 1999 Standards for Educational and Psychological Testing: content-related evidence that addresses alignment with standards, the internal structures of the instruments, criterion-related evidence that provides relations with external measures, and, finally, response processes that consider the manner of interacting with the tests.