CALIFORNIA STATE BOARD OF EDUCATION NOVEMBER 2012 AGENDA
Discussion Regarding Priorities for California's Future Assessment System.
SUMMARY OF THE ISSUE(S) California Education Code (EC) Section 60604.5 requires the State Superintendent of Public Instruction (SSPI) to develop recommendations, including a transition plan, for the reauthorization of the statewide pupil assessment system.
The California Department of Education (CDE) is providing the State Board of Education (SBE) with a preview of the SSPI’s purposes and guiding principles for the development of the new assessment system. Attachment 1, Considerations for Developing California’s Future Assessment System, is a discussion paper on the major components to consider and decisions that need to be made regarding California’s future assessment system.
RECOMMENDATION The CDE recommends that the SBE engage in discussions about priorities for California’s future assessment system and the resulting decisions that will need to be recommended regarding the content areas to assess, the type of assessment tools to be used and the overall scope of California’s assessment system.
BRIEF HISTORY OF KEY ISSUES Authorization for the Standardized Testing and Reporting (STAR) Program ends July 1, 2014. In preparation for the transition to a new testing program, the process defined in EC Section 60604.5 began in early 2012. Over the past several months, the CDE, the SBE, educational stakeholders, technical experts, and members of the public have been engaged in various discussions about the future of the assessment system in California. To facilitate the collaboration of these groups, the CDE created multiple opportunities to provide input and suggestions. These opportunities included the Statewide Assessment Reauthorization Work Group meetings, regional public meetings, an online survey, focus groups, and a special e-mail account for receiving comments on reauthorization from the public. The complete report on the information gathered will be submitted to the Legislature with the SSPI’s recommendations for the future statewide assessment system.
With approximately six million students in more than 11,000 schools, the CDE is responsible for assessing more students than any other state in the nation. It has been California’s priority to assess all students with its assessment programs, including students with disabilities and English learners. Our current state assessment system was originally designed in 1997. In 2001, the assessment system was modified to accommodate the reauthorization of the federal Elementary and Secondary Education Act (ESEA), also known as the No Child Left Behind Act (NCLB). The STAR Program was reauthorized in 2004, and was recently extended through July 1, 2014.
The reauthorization of ESEA in 2001 led to numerous changes in the STAR Program and accountability system. Those changes included:
Two new tests were developed: the California Modified Assessment (CMA) and the Standards-based Tests in Spanish (STS).
The CMA provided an assessment tool for measuring the achievement of low performing students with disabilities.
Science assessments were added to the STAR Program in grades five and eight.
A special, grade ten Life Science assessment was added to meet the requirement of testing students in science not less than once in grades ten through twelve.
The California Alternate Performance Assessment (CAPA), a test for severely disabled students, added science testing.
The California High School Exit Examination (CAHSEE) became part of the accountability system, assessing all students in ELA and mathematics in grade ten.
The California English Language Development Test (CELDT) began to be employed as an accountability measure.
In addition to the assessments required by NCLB, California has continued to provide statewide end-of-course (EOC) assessments in many subjects (see Attachment 2). These include:
Mathematics: Algebra I, Algebra II, Geometry, and Integrated Mathematics tests at three levels
Science: Biology, Chemistry, Earth Science, Physics, and Integrated Science at four levels
History–Social Science: Grade eight History–Social Science, U.S. History, and World History
With the exception of writing assessments in grades four and seven, all of the assessments have been based on selected response (multiple-choice) items. The primary purpose of the STAR Program at the beginning was to hold schools accountable for teaching students the knowledge and skills embodied in the California content standards. While individual pupil scores have been reported to parents, schools, and teachers, the primary accountability use for the data was intended to be a cross sectional view of the performance of groups of students. The current assessments measure how well students have learned the California content standards in grades two through eleven. The system is not designed to measure growth in achievement at the individual pupil level.
California’s Participation in the Smarter Balanced Assessment Consortium In June 2011, Governor Brown, State Superintendent Torlakson, and Board President Kirst agreed to join the Smarter Balanced Assessment Consortium (SBAC) as a governing state. This represents a significant commitment to a new set of more complex and richer assessments. California’s membership in the SBAC will allow assessment of student achievement with respect to the Common Core State Standards (CCSS) in ELA and mathematics using both selected response and constructed response items and performance tasks. Currently, the system is being designed to provide summative information at the end of each school year for grades three through eight and grade eleven, as well as to provide schools with optional formative assessment tools and interim assessments that can be customized by teachers to examine specific content that students are studying.
Recent Developments in Alternative Assessments California joined the National Center and State Collaborative (NCSC) consortium in September 2012 as a Tier II state. This decision resulted from numerous conversations with the SSPI’s assessment advisory group, members of the Advisory Commission on Special Education, and SBE staff and liaisons. The NCSC is responsible for developing alternate assessments based on alternate achievement standards for students with significant cognitive disabilities. Representing a Tier II state, the California team will:
Dedicate a staff member to coordinate the work.
Work directly with members of the Special Education Administrators of County Offices of Education (SEACO) and with directors of special education local plan areas (SELPA) to build a community of practice.
Meet directly with the field implementers every other month, with technology-supported meetings in between and as needed.
Deliver electronically the comprehensive curriculum, instruction, and professional development modules on the CCSS that are expected to be available from the NCSC by fall 2012.
California expects, as do other Tier II states, to develop an individualized plan to implement the professional development and curriculum and instruction resources, including formative assessment strategies and progress monitoring tools. The CDE’s Assessment Development and Administration Division and Special Education Division will collaborate on this project to provide support and information to the field and work with NCSC. It is expected that California will be able to adopt the NCSC-developed alternate assessment; however, that decision will need to follow piloting of the resulting resources.
Guiding Principles in Defining and Developing a New Assessment System The CDE has been seeking advice from its STAR/CAHSEE technical advisory group to provide ongoing evaluation of the assessment system. This advisory group consists of renowned assessment and psychometric professionals from higher education institutions throughout the nation as well as California local educational agencies’ (LEA) assessment and accountability administrators. Working together has resulted in the development of a set of guiding principles to consider when designing future assessments. These five principles serve to ensure the development of high-quality and fair assessments for California.
1. Assess subjects and learning in ways that promote high-quality instruction. In addition to mandated assessments, incorporate a variety of methods for measuring student achievement to provide achievement information on those subjects beyond SBAC that are critically important to the success of students. For example, teachers and administrators may need more resources and tools that help them select or build high-quality formative and interim assessments and performance tasks in multiple subject areas. Common to all assessments (summative, interim, and formative), a key goal of item development should be high student engagement. Quality items should not only measure student achievement, but should additionally lend themselves to good instruction.
2. Conform to rigorous industry standards for test development. The statewide summative assessments must be valid and reliable. Assessments with high-stakes outcomes for students or schools require the highest levels of comparability, reliability, and security. However, assessments of lesser consequence can be implemented at the local level. These assessments will not require the level of technical quality and security required for high-stakes statewide testing. They include formative and interim assessments, which can be administered more flexibly than the high-stakes summative assessments and can be scored locally.
At a minimum, the system should:
Create an assessment framework as a guide for test development. Such a document would clearly demonstrate the link between the content standards and the assessments designed to measure student achievement.
Ensure that no aspect of the system creates any bias with respect to race, ethnicity, culture, religion, gender, sexual orientation, and socioeconomic status. Insist that contractors provide documents revealing the procedures and analyses used to eliminate bias and documents showing their effectiveness.
Explore standard setting methodologies that incorporate multiple measurements of student learning in establishing proficiency.
3. Use resources efficiently and effectively. Time and money spent on assessment programs need to provide results commensurate with the investment. Student, teacher, and administrator time is precious and should be used as effectively as possible. Continuous improvement to the assessment system requires stakeholders to understand that a balance must be found between the costs of the system and the level of assessment desired. For example, if a given assessment needs to be made more informative and reliable, it is very likely that either the test will need to be lengthened or the number of standards assessed will need to be reduced. If the test is lengthened, testing time and overall cost will likely increase.
4. Provide for inclusion of all students. To ensure the effective participation of students with disabilities and English learners, all state assessments must be developed with these populations in mind. The system needs to provide an acceptable alternative for severely disabled students or for cases in which one type of test (e.g., a computer-based test) cannot be accessed by a particular student (e.g., the student is blind). A clearly articulated set of variations, accommodations, and modifications should be available for every assessment.
At a minimum, the system should:
Conform to the principles of universal design to ensure equity and access.
Consider linguistic complexity when developing assessments.
Provide appropriate assessments and accommodations as needed for all students with disabilities, including an alternate assessment for students with significant cognitive disabilities.
Incorporate research on assessment of English learners, students with disabilities, and economically disadvantaged students into the development of state assessment programs.
5. Provide information on the assessment system that is readily available and understandable to parents, teachers, schools, and the public. California educators must work to inform the public about the appropriate use and interpretation of the various types of test results. This is of greater importance than ever as the common core assessments go beyond the traditional standardized tests to include new types of items (e.g., performance tasks, extended response items), computer-adaptive assessments, interim assessments, and formative assessment tools. Information about the purpose of a test, interpretation of results, and appropriate uses of the test must be readily available. Likewise, teachers and parents will want ready access to cumulative information about the progress of students. The availability of longitudinal data and improvements to California’s student data system should be leveraged to provide ready access to assessment results.
At a minimum, the system should:
Provide information for each assessment that describes the purpose of the test, the relationship of the test to the content standards, and a guide to the interpretation and use of results.
Provide resources such as sample test items and student responses. Link items to content standards and levels of achievement.
Utilize technology to provide results that are easily interpreted by students, teachers, administrators, parents and guardians, and the general public. A reporting application should be developed that integrates results from multiple measures over time and allows users to analyze and compare data, whether from state or SBAC assessments.
SUMMARY OF PREVIOUS STATE BOARD OF EDUCATION DISCUSSION AND ACTION EC Section 60604.5 requires the SSPI to develop recommendations for the reauthorization of the statewide pupil assessment program, which includes a plan for transitioning to a system of high-quality assessments as defined in EC Section 60603. While the law specifically addresses the current STAR Program, the CDE’s position is that it is appropriate to consider other current California statewide assessments, including, but not limited to, the Early Assessment Program, which utilizes specific STAR assessments, and the CAHSEE.
In September, July, May, and March 2012, the SBE received updates regarding the statewide assessment reauthorization activities, including Work Group summaries.
In January 2012, the SBE was provided the requirements pursuant to EC Section 60604.5 and proposed activities to develop the SSPI’s recommendations, including a plan for transition, for the reauthorization of the statewide pupil assessment system.
FISCAL ANALYSIS (AS APPROPRIATE) The activities to develop the SSPI’s recommendations will stay within budgetary guidelines. Activities have included Work Group meetings, regional public meetings, focus group meetings, survey data collection, an e-mail account established for public input, and data analysis.
Attachment 2: Appendix C: Range of Assessments Required by State and Federal Laws and Proposed by the Smarter Balanced Assessment Consortium (SBAC)
Considerations for Developing California’s Future Assessment System Introduction Adoption of the Common Core State Standards (CCSS) for English–language arts (ELA) and mathematics in August 2010 along with the sunset of the Standardized Testing and Reporting (STAR) Program in July 2014 presents a set of challenges and opportunities for California’s assessment system. The Governor, the State Superintendent of Public Instruction (SSPI), the State Board of Education (SBE), and the Legislature have a unique opportunity to shape the future of California’s assessment system.
Adoption of the CCSS means that, at a minimum, the state will need to review and revise the current assessments in ELA and mathematics to align with the new standards. Additionally, the state has an opportunity to rethink the purposes of the assessment system, and to consider the various ways those purposes may be met.
A set of common national science standards (the Next Generation Science Standards) is currently under development, and the possible adoption of these standards will require consideration of the current set of science assessments and their appropriateness and alignment to the new standards.
There are many paths the state can take in transitioning to a new assessment system. This document outlines some of the major choices the state will need to make regarding student assessment, and the likely consequences of those choices.
Framing the Conversation In order to appropriately develop the next generation of California assessments, the state must first decide what information it wants from these tests.
The current generation of standardized tests essentially does one thing: it measures the achievement of individual students against a set of specific standards in that student’s grade level, for that particular content area. In addition, aggregations of these scores can tell us how specific groups of students are doing against these same content standards.
Before determining what California assessments should do, it’s important to understand what a test, by itself, does not do. It does not measure how much more a student has learned from year to year (although that is often presumed). And it cannot, by itself, say how well a school or district is doing in educating its students (although we do use the results in this way). Comparing schools in this way is not perfect, as students in California are not randomly distributed among the schools. As a result, scores of individual schools or districts represent the students enrolled at the time the test was administered, making such comparisons difficult. There are ways of compensating for this lack of random distribution, but these are difficult and California makes little attempt to do so.
Even so, we place great reliance on the movement of scores within districts and schools from one year to the next. This reliance is reasonable, since the student population of a given school or district probably changes little from year to year. It is because of this that we can have great confidence that the state’s steadily increasing test scores represent improvements in the quality of the education we are delivering in California.
With the adoption of the CCSS, the state has agreed that the next generation of tests will be different. The tests in ELA and mathematics are designed to place individual students along a continuum of knowledge. This will allow us to determine how much progress a student, or a group of students, is making from year to year. The scores will reflect progress along a continuous scale, not progress within an individual grade level.
Furthermore, this new generation of tests is designed to measure in greater depth just how much students know. It relies less on specific facts learned and more on tasks that require complex cognitive processes, such as analysis and evaluation. But this advance in assessments comes with a price. These assessments will take longer to administer and will be more expensive to develop and administer.
Since resources available for testing, either in terms of dollars or time, are unlikely to substantially increase, the state will face some difficult decisions about what subjects to test, when to administer those tests, how to administer those tests, and how many students and/or grades are to be tested.
For example, is it important to test in more subjects than ELA and mathematics? These currently are required by the federal government and comprise the totality of the federal accountability system. Yet we know from past experience that, to a degree that is alarming to many, what gets tested is what gets taught. In light of that, do we also need to have standardized tests in science, social studies, history, arts, foreign languages, and physical education in order to ensure that those subjects also receive the attention they deserve? How often should these be administered and to what group of students?
And most crucial of all, what do we plan to do with the results we receive? Are tests being administered so that we can inform parents about the progress of their individual child? If so, is the standardized test the best way to do that? Are tests being administered to see how well schools and districts are performing? If so, what standard of measure will we use to determine success or failure, and what will the consequences be for failing to meet those standards? Do we have in place an adequate system for accounting for the differences in student populations in making this judgment?
And if what we test shapes how we teach, are we developing tests that will help create the kind of instruction, both in terms of breadth and depth, that we want to see in our classrooms?
Do we expect statewide assessments to inform us about areas of knowledge an individual student might lack so that we can provide, early on, appropriate remedial help?
The answers to these questions can help determine which tests we administer, how often and to whom they are administered, and the format of the tests themselves.
Strengths and Weaknesses of the Current Assessment System The primary purpose of the STAR Program has been to hold schools accountable for teaching students the knowledge and skills embodied in the California content standards. The assessments have served as the basis for monitoring the progress of schools in improving student performance and for providing data for program evaluation. The current assessments are designed to measure how well students have learned the California content standards for their grade. The assessments are built from blueprints that delineate the grade level content standards to be tested in each subject, and the number of items to be developed for each standard. The system provides accountability information about the progress of successive cohorts of students for a given grade and subject. However, the system is not designed to measure growth in achievement from year to year for individual students.
With the exception of writing assessments administered in grades four and seven and as part of the California High School Exit Examination (CAHSEE), all of the STAR assessments are multiple-choice (selected response) tests.
Strengths One advantage of the paper and pencil multiple-choice assessments lies in their ability to be inexpensively developed, administered and scored. Further, they yield results that are especially reliable. Additionally, they provide secure measures of achievement. Moreover, the STAR assessments have been shown to have a high degree of alignment with the standards they are intended to measure and to be of high technical quality. Use of the multiple-choice approach has allowed California to offer the wide variety of tests that currently make up the STAR Program and to have a high level of reliability and objectivity in its accountability system.
Weaknesses, Limitations, and Unintended Consequences Despite strong alignment to the standards and a high level of reliability, the use of multiple-choice assessments has limited the types of knowledge and skills that are measured. The tests have been criticized for not measuring the standards in great enough depth. This is a fair criticism and is a reflection of the fact that the tests were designed to determine if the content standards were being taught in a given grade and subject for a particular school. The system has favored breadth over depth. This is demonstrated by the fact that the test blueprints generally include a small number of questions for any one standard.
The multiple-choice format also precludes measuring content standards that call for students to demonstrate complex processes, such as critical thinking and problem solving. There is a legitimate concern that an unintended consequence of using multiple-choice tests is that in-depth understanding of subject matter is devalued because it is not measured. Likewise, critical thinking and complex problem solving skills have the potential to become devalued because the STAR tests’ capacity to measure these attributes is limited.
Assessing more complex instructional concepts requires different types of test items that ask students to provide more complex responses and/or respond to more complex stimuli than the current assessments allow. These items require students to provide answers in the form of short responses consisting of a few words or sentences, or longer essay type responses in which students explain their understanding. Even more involved items are performance tasks that require students to complete a multifaceted assignment or project that demonstrates competence in a variety of areas. These types of items also have the benefit of informing and supporting instruction to a higher degree than is possible with multiple-choice assessments. To date, these types of assessments have been used to only a limited extent in various state summative assessments primarily because they are more costly to develop and score than multiple-choice assessments. The cost of using these types of items is elevated if they are part of high-stakes assessments where standardized administration and security are imperative.
The current system of assessments has also been criticized for negatively influencing instruction through the narrowing of the curriculum to only those subjects that are tested. Currently, ELA and mathematics are tested in every grade, two through eleven. In the elementary grades, science is tested less often than either of these subjects, and history–social science is tested even less.
It can be argued that pressure to perform well on the major components of the accountability program (ELA and mathematics) has led to less time spent on other components of the curriculum. Subjects that are not part of the current statewide assessment system include career technical education and visual and performing arts. Focus group interviews conducted with elementary school teachers by the CDE found substantial evidence confirming that the current system has the effect of narrowing instruction, particularly in lower performing schools.
Many have expressed a desire for diagnostic information to guide instructors in determining what to teach and how to teach it for individual students. The current statewide assessments are neither focused enough nor sufficiently detailed to provide this type of information. It has not been the purpose of the tests to do this, and to be valid for this purpose, testing would need to take place at different points in the school year and would likely consume more instructional time.
Another unintended consequence of the current system of assessments has been the devaluing or de-emphasis of assessments not associated with accountability. The statewide assessments, because of the high level of attention paid to the results and the high level of technical quality ascribed to them, are viewed as most important. This has inadvertently shifted importance away from informal assessments that have a variety of item types, such as constructed response items, performance tasks, and assessment projects.
Purposes of the New Assessment System The SSPI and the CDE are committed to designing an assessment system that includes a variety of assessment approaches and item types and whose primary purpose is to model and promote high-quality teaching and student learning activities. In accomplishing this purpose, the system can also:
Produce scores that can be aggregated for the purpose of holding schools and districts accountable for the progress of their students in learning the California academic content standards.
Provide assessments and/or assessment tools in multiple grades covering the full breadth of the curriculum to provide clear expectations and incentives for teaching the full curriculum.
The delineation of the purposes of the testing system has a direct impact on the types of assessments that should be developed. The validity of an assessment is based on its purpose. While the current STAR assessments are valid for comparing school and district performance, they are not valid for measuring individual student growth, providing diagnostic information, or for supporting instruction that develops 21st century skills.
However, not all of the assessments in a system need to be designed to serve all purposes. Some components of a comprehensively designed system may be valid and useful for one purpose, but not for another. In selecting the above purpose for the system, the SSPI and the CDE have begun to outline what will be assessed in terms of content as well as determine the types of instruments that will be used. What to test and how to test involve numerous trade-offs and choices, each with particular implications for validity as well as for the use of resources, principally teaching and learning time and money.
Determining What to Test The CCSS and the Smarter Balanced Assessment Consortium (SBAC) The CCSS will require a more integrated approach to delivering instruction across all subject areas. Specifically, the CCSS provide a consistent, clear description of what students are expected to learn, so teachers and parents know what they need to do to help them. The standards are designed to be robust and relevant to the real world, reflecting the knowledge and skills that our young people need for success in college and careers. With participation in the SBAC, California will have access to assessments that measure student achievement of the CCSS in grades three through eight and grade eleven.
California’s membership in the SBAC will allow assessment of student achievement with respect to the CCSS in ELA and mathematics using both selected and constructed response items and performance tasks. Currently, the system is being designed to provide summative information at the end of each school year as well as provide schools with optional formative assessment tools and interim assessments that can be customized by teachers to examine specific content that students are studying. The summative assessments will include at least one performance task incorporating real life applications and require students to demonstrate their critical thinking, analysis, and problem solving skills. It is anticipated that these assessments will form the core of the accountability system and will provide the bulk of the data used for the purposes of program evaluation and accountability.
Current thinking is that advances in computer based testing (CBT) and automated scoring of constructed responses will enable the new assessments to measure student achievement more accurately and also capture information about depth of learning and the ability to apply more complex skills. It is anticipated that the format of the new assessments will have the effect of focusing instruction on these important aspects of student learning. The performance tasks, in particular, are seen as providing examples of the kinds of activities students should be engaged in when learning the content outlined by the new standards.
Given the types of items being considered for the SBAC assessment system, the cost is likely to be greater per student to implement and operate than the current California Standards Tests (CSTs). This means that California will need to consider allocating additional resources for assessment, finding more efficient ways to assess subjects not included in SBAC, or reducing the number of grades and subjects assessed.
Changing the way tests are administered, scored, and reported has the potential for realizing greater efficiencies. For example, some statewide assessments could rely on LEA staff to score and report assessment results. Local scoring of results could also have the benefit of allowing the results to be incorporated into evaluations of student performance. Teachers involved in the scoring process would be provided a valuable professional development opportunity, particularly if the assessments were primarily performance tasks. The transition to a new system provides opportunities to think differently about how assessment fits into the entire educational endeavor. California will need to be as vigilant and astute as possible to maximize the benefit of assessment expenditures.
Choices beyond SBAC Which subjects should California assess at the statewide level beyond SBAC? The discussion that follows pre-supposes the implementation of the SBAC summative assessments. Currently, California is committed to its membership within SBAC, overseeing the development of the ELA and mathematics assessments to which the consortium is committed. However, if California chooses not to implement the SBAC assessments, the choices outlined below would still remain.
Accountability Considerations The current state assessment and accountability system includes a wide variety of assessments beyond those currently required by federal law. California is not required to implement assessments in every subject and every grade. The degree to which end-of-course (EOC) science, mathematics, and history–social science assessments are used in accountability measures influences the type of assessments that can be used as well as how they are administered and scored.
The distinction needs to be made between the general concept of accountability as opposed to specific accountability measures that have been developed to judge and compare the performance of schools. In general, accountability pertains to ensuring that the elements of the system are doing what they were intended to do. This can be done without making all results part of high-stakes accountability measures such as the Academic Performance Index (API) and Adequate Yearly Progress (AYP). In some ways, completion of a classroom performance task or a portfolio of student work may provide more authentic and valid information than is provided by a highly reliable and secure standardized test.
The SSPI and the CDE feel strongly that the teaching and learning of science, history–social science, and other subject areas not be compromised by virtue of the assessment system that is developed. Assessments in these subjects and in multiple grades should include techniques to inform and support instruction. While an SBAC-like testing system in every grade for every subject would incorporate more advanced item and assessment types than the current CSTs, the amount of money and student time invested in assessment will need to be considered. Some trade-offs are inevitable. For example, it may be that a subject is tested only in selected grades, or that the primary state assessment would be completion of a state-developed performance task that is produced and scored locally using state-developed prompts and rubrics.
Limiting the use of non-SBAC assessments and assessment results for high-stakes accountability purposes could have both financial and instructional benefits. High-stakes tests require a high degree of reliability and security. For example, test items must be kept confidential and secure until they are used, and many new items need to be developed each year. While these steps are essential for high-stakes accountability, they also incur high costs.
Lower stakes assessments that are locally reproduced, scored, and reported can greatly reduce the cost of assessment. For example, security and reliability costs would likely be less than they are currently. However, the results could still be appropriate for the purpose of adequately and fairly evaluating student performance. Local or regional scoring and calibration procedures can be designed to develop consistent scoring and prevent teachers from scoring their own students’ work, and a reasonable level of security can be guaranteed through state-level control of prompts and scoring rubrics. This is routinely done in many of the highest-achieving nations, such as Australia, Canada, and Singapore.
A major implication of the purposes envisioned for the new assessment system is the desire to promote high quality teaching and student learning activities. As discussed earlier, this purpose is poorly achieved using the current selected response assessments. The state could choose to develop sets of performance tasks and scoring rubrics for use at the local level, but treat these assessments as a lesser component of any future accountability system. This would greatly reduce costs associated with item development, scoring, and test security. Reducing or eliminating the contribution of these tests to high-stakes accountability measures would also allow the assessments to be scored locally by instructors teaching the subjects being assessed, allowing for high quality, high value professional development activities.
Allowing teachers to participate in the assessment and scoring process, particularly when constructed response items or performance tasks are used, can have a direct impact on instruction. When teachers are involved in administering and scoring these types of assessments, they recognize the depth to which students need to understand the content standards, and the kinds of skills students need to successfully demonstrate what they know and can do. Teachers involved in scoring writing assessments regularly report that one of the major benefits of participating in such activities is that it gives them an enhanced understanding of what is required from their students, and provides insights into how to improve instruction.
Interim and Formative Assessments What to test may also include decisions about supporting more in-depth assessment of the common core. California could elect to reduce or alter high school EOC testing in order to support the participation of LEAs in the interim assessments and formative tools being developed by SBAC. The SBAC interim assessment system will provide a series of benchmarks for assessing progress at different points in the school year. These will be an additional cost to the system.
The SBAC formative assessment tools will be embedded in instruction and will provide the most detailed information about where individual students are and what they need to learn with respect to specific standards. While these are only being developed for ELA and mathematics, the interdisciplinary aspect of the CCSS provides for the development and assessment of literacy and numeracy skills in science and social studies.
One of the choices might be whether to support richer and deeper assessment tools at the local level that can improve teaching and learning, or to continue administering high-stakes EOC assessments for use in accountability.
Reducing Redundant Testing The current assessment system arguably requires substantial investments of student and teacher time because of the redundant nature of some assessments. For example, the CAHSEE is currently administered to every high school student in California in grade ten. This assessment measures much of the same content that the STAR assessments measure in ELA and mathematics in grade ten and earlier. The grade 11 SBAC or corresponding SBAC interims at earlier grades have the potential to serve as a substitute for the CAHSEE.
Another example of redundancy is testing Life Science for all grade ten students as required by the current ESEA. This test may be given to students who are also taking an EOC biology test. Redesigning the assessment system provides an opportunity to reduce redundancy. However, each decision made will likely have a cascading effect on decisions regarding accountability as well as costs.
The SBAC assessments promise to reduce testing time through the use of Computer Adaptive Testing (CAT), which uses information gained during testing to better target questions. Additionally, it is expected that the grade eleven SBAC will serve as the measure of college readiness to ensure continuation of California’s Early Assessment Program.
Matrix Testing Another way to reduce testing time or to gather more information in the same amount of time is through matrix testing. Matrix testing expands the breadth of content measured by creating “blocks of questions,” each of which is taken by only a fixed proportion of students. This means that a relatively higher number of items are distributed across all students rather than all students each taking all items. However, for a test of a given length, using a matrix approach will reduce the comparability of individual student results. The benefit is that the results yield more performance information regarding groups of students. This in turn would support the purpose of informing teaching and learning.
Matrix testing is also an efficient way of focusing on the measurement of specific areas of the curriculum. For example, students might normally take a 60-item mathematics test in grade four. In this test, the students might normally answer 6 to 10 questions on fractions. This gives data of borderline reliability for any one student, and limited information about the population as a whole. However, using matrix testing, several “blocks of questions” could be randomly assigned to students. This would result in several times more information being assessed. The students could receive overall scores only, but information on groups of students would be enhanced.
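As an illustration only, the random assignment of question blocks described above could be sketched as follows. The number of blocks (4), the block size (8 fractions items, within the 6 to 10 range mentioned above), and the student count (1,000) are hypothetical values chosen for the example and do not reflect any CDE or SBAC design. The function names are likewise illustrative.

```python
import random

def assign_blocks(student_ids, num_blocks, seed=0):
    """Randomly assign each student to one block, keeping block sizes balanced."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    # Deal shuffled students round-robin so each block gets a near-equal share.
    return {sid: i % num_blocks for i, sid in enumerate(ids)}

def coverage(num_blocks, items_per_block):
    """Distinct items answered by one student vs. across the whole population."""
    return {
        "per_student": items_per_block,
        "population": num_blocks * items_per_block,
    }

# Hypothetical values (assumptions, not from the source document).
NUM_BLOCKS = 4
ITEMS_PER_BLOCK = 8
NUM_STUDENTS = 1000

assignment = assign_blocks(range(NUM_STUDENTS), NUM_BLOCKS)
block_sizes = [
    sum(1 for b in assignment.values() if b == k) for k in range(NUM_BLOCKS)
]
summary = coverage(NUM_BLOCKS, ITEMS_PER_BLOCK)

print("students per block:", block_sizes)
print("fractions items per student:", summary["per_student"])
print("fractions items across population:", summary["population"])
```

Under these assumed values, each student still answers the same number of fractions items, while the population as a whole responds to four times as many distinct items. Each item is answered by only a quarter of the students, which is why the added information applies to groups rather than to individual students.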
Matrix testing may not necessarily decrease the cost and complexity of testing. For example, more items may need to be developed, and scoring procedures will need to be carefully implemented to ensure the various blocks are scored and aggregated appropriately. However, matrix testing could allow students to tackle more ambitious items and tasks rather than only multiple choice items. Since each student may take a smaller number of items, some of them can be lengthier. The National Assessment of Educational Progress takes advantage of matrix testing and is able to address rigorous content in a comprehensive way.
Conclusions Time and money spent on assessment programs need to provide results commensurate with the investment. Student, teacher, and administrator time is valuable and should be invested as effectively as possible. A balance must be found between the costs of the system and the kind of assessment and reporting desired. For example, if a given assessment needs to be made more informative for instructional use or more reliable for accountability, it is very likely that either the test will need to be lengthened or the number of standards assessed will need to be reduced. If a test is lengthened, testing time and overall cost will likely be increased. Also, if California chooses to support the use of formative and interim assessments to replace some of its current summative high-stakes EOC assessments, how these assessments might be used in an accountability system will need to be seriously considered, and they will likely be used in different ways than our current set of assessments.
The emphasis on supporting and informing instruction calls for the use of new and innovative item types, and also involves greater support for formative and interim assessments. Focusing on this purpose of the assessment system and, secondarily, on the need for accountability will greatly aid in making choices about what and how to assess.