Assessment in Social and Educational Contexts




Chapter 1

  • Assessment in Social and Educational Contexts (Salvia, Ysseldyke & Bolt, 2012)
  • Dr. Julie Esparza Brown
  • SPED 512: Diagnostic Assessment
  • Winter 2013
  • Chapters 1, 11, 12, 13, and 14 are included in this presentation

AGENDA – Week 3

  • Questions for the Good of the Group
  • Instruction and Lab Time: Continue WJ-III
  • Break
  • Group activity to process Chapters 1, 11, 12, 13, and 14
  • PowerPoint overview of Chapters 1, 11, 12, 13, and 14

Individualized Support

  • Schools must provide support as a function of individual student need
    • To what extent is the current level of instruction working?
    • How much instruction is needed?
    • What kind of instruction is needed?
    • Are additional supports necessary?

Assessment Defined

  • Assessment is the process of collecting information (data) for the purpose of making decisions about students
    • e.g., what to teach, how to teach, and whether the student is eligible for special services

How Are Assessment Data Collected?

  • Assessment extends beyond testing and may include:
    • Record review
    • Observations
    • Tests
    • Professional judgments
    • Recollections

Why Care About Assessment?

  • A direct link exists between assessment and the decisions we make, and some of those decisions are highly consequential.
  • Thus, the procedures for gathering data are of interest to many people – and rightfully so.
    • Why might students, parents, and teachers care?
    • The general public?
    • Certification boards?

Common Themes Moving Forward

  • Not all tests are created equal
    • Differences in content, reliability, validity, and utility
  • Assessment practices are dynamic
    • Changes in the political, technological, and cultural landscape drive a continuous process of revision

Common Themes Moving Forward

  • The importance of assessment in education
    • Educators are faced with difficult decisions
    • Effective decision-making will require knowledge of effective assessment
  • Assessment can be intimidating, but significant improvements have happened and continue to happen
    • More confidence in the technical adequacy of instruments
    • Improvements in the utility and relevance of assessment practices
    • MTSS framework

Chapter 11

  • Assessment of Academic Achievement with Multiple-Skill Devices

Achievement Tests

  • Achievement Tests
    • Norm-referenced
      • Allow for comparisons between students
    • Criterion-referenced
      • Allow for comparisons between individual students and a skill benchmark.
  • Why do we use achievement tests?
    • Assist teachers in determining skills students do and do not have
    • Inform instruction
    • Academic screening
    • Progress evaluation

Classifying Achievement Tests

  • Achievement tests can be classified along two dimensions: how diagnostic they are and how many students can be tested at once
  • Diagnostic achievement (specificity of the information obtained)
    • High: Less efficient administration – dense content and numerous items allow teachers to uncover specific strengths and weaknesses
    • Low: More efficient administration – comparisons between students can be made, but very little power in determining strengths and weaknesses
  • Number of students who can be tested at one time
    • High: Efficient administration – typically only quantitative data are available
    • Low: Less efficient administration – allows for more qualitative information about the student

Considerations for Selecting a Test

  • Four Factors
    • Content validity
      • What the test actually measures should match its intended use
    • Stimulus-response modes
      • Students should not be hindered by the manner of test administration or required response
    • Standards used in state
    • Relevant norms
      • Does the student population being assessed match the population from which the normative data were acquired?

Tests of Academic Achievement

  • Peabody Individual Achievement Test (PIAT-R/NU)
  • Wide Range Achievement Test 4 (WRAT4)
  • Wechsler Individual Achievement Test 3 (WIAT-III)

Peabody Individual Achievement Test-Revised/Normative Update (PIAT-R/NU)

  • In general…
    • Individually administered; norm-referenced for K-12 students
  • Norm population
    • Most recent update was completed in 1998
      • Representative of each grade level
    • No changes to test structure

PIAT-R/NU

  • Subtests
  • Mathematics: 100 multiple-choice items assess students’ knowledge and application of math concepts and facts
  • Reading recognition: 100 multiple-choice items require students to match and name letters and words
  • General information: 100 questions presented orally. Content areas include social studies, science, sports, and fine arts.
  • Reading comprehension: 81 multiple-choice items require students to select an appropriate answer following a reading passage
  • Spelling: 100 items ranging in difficulty from kindergarten (letter naming) to high school (multiple-choice following verbal presentation)
  • Written expression: Split into two levels. Level 1 assesses pre-writing skills and Level II requires story writing following a picture prompt

PIAT-R/NU

  • Scores
    • For all subtests except written expression, each item is scored pass/fail
    • Raw scores converted to (see the conversion sketch below):
      • Standard scores
      • Percentile ranks
      • Normal curve equivalents
      • Stanines
    • 3 composite scores
      • Total reading
      • Total test
      • Written language
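A quick illustration of how these derived metrics relate to one another. This is a generic sketch, not a procedure from the PIAT-R/NU manual: it assumes the common standard-score scale (mean 100, SD 15) and a normal distribution, and the function name `derived_scores` is invented for illustration.

```python
# Hedged sketch: convert a standard score (assumed mean 100, SD 15) into the
# derived metrics listed above, assuming normally distributed scores.
from math import erf, sqrt

def derived_scores(standard_score, mean=100.0, sd=15.0):
    z = (standard_score - mean) / sd
    percentile = 100 * 0.5 * (1 + erf(z / sqrt(2)))   # percentile rank
    nce = 50 + 21.06 * z                              # normal curve equivalent
    stanine = min(9, max(1, round(2 * z + 5)))        # stanine: mean 5, SD 2, range 1-9
    return round(percentile, 1), round(nce, 1), stanine

print(derived_scores(85))    # one SD below the mean -> (15.9, 28.9, 3)
print(derived_scores(115))   # one SD above the mean -> (84.1, 71.1, 7)
```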

PIAT-R/NU

  • Reliability and Validity
    • Despite new norms, reliability and validity data are only available for the original PIAT-R (1989)
    • Previous reliability and validity data are likely outdated
      • Outdated tests may not be relevant in the current educational context

Wide Range Achievement Test 4 (WRAT4)

  • In general…
    • Individually administered
    • 15-45 minute test length depending on age (5-94 age range)
    • Norm-referenced, but covers a limited sample of behaviors in 4 content areas
  • Norm population
    • Stratified across age, gender, ethnicity, geographic region, and parental education

WRAT4

  • Subtests
  • Word Reading: The student is required to name letters and read words
  • Spelling: The student writes down words as they are read aloud
  • Math Computation: The student solves basic computation problems
  • Sentence Comprehension: The student supplies a missing word to complete a sentence (cloze-style items)
  • Scores
    • Raw scores converted to:
      • Standard scores, confidence intervals, percentiles, grade equivalents, and stanines
      • Reading composite available
  • Reliability
    • Internal consistency and alternate-form data are sufficient for screening purposes
  • Validity
    • Performance increases with age
    • WRAT4 is linked to other tests that have since been updated; additional evidence is necessary

Wechsler Individual Achievement Test- Third Edition (WIAT-III)

  • General
    • Diagnostic, norm-referenced achievement test
    • Reading, mathematics, written expression, listening, and speaking
    • Ages 4-19
  • Norm Population
    • Stratified sampling was used to sample within several common demographic variables:
      • Grade (pre-K through 12), age, race/ethnicity, sex, parent education, and geographic region

WIAT-III

  • Subtests and scores
    • 16 subtests arranged into 7 domain composite scores and one total achievement score (structure provided on next slide)
    • Raw scores converted to:
      • Standard scores, percentile ranks, normal curve equivalents, stanines, age and grade equivalents, and growth scale value scores.

WIAT-III Subtests

  • Composites and their subtests:
    • Basic Reading
      • Word Reading
      • Pseudoword Decoding
    • Reading Comprehension and Fluency
      • Reading Comprehension
      • Oral Reading Fluency
    • Early Reading Skills
    • Mathematics
      • Math Problem Solving
      • Numerical Operations
    • Math Fluency
      • Math Fluency – Addition, Subtraction, & Multiplication
    • Written Expression
      • Alphabet Writing Fluency
      • Spelling
      • Essay Composition
    • Oral Language
      • Listening Comprehension
      • Oral Expression

WIAT-III

  • Reliability
    • Adequate reliability evidence
      • Split-half (see the computation sketch after this slide)
      • Test-retest
      • Interrater agreement
  • Validity
    • Adequate validity evidence
      • Content
      • Construct
      • Criterion
      • Clinical Utility
  • Stronger reliability and validity evidence increase the relevance of information derived from the WIAT-III
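To make "split-half" concrete, here is a minimal sketch of how such a coefficient is computed in general; it is not taken from the WIAT-III manual, and the item-level scores below are invented. The two half-test totals are correlated and the Spearman-Brown formula corrects that correlation up to full test length.

```python
# Hedged sketch: split-half reliability with the Spearman-Brown correction,
# using invented item scores (1 = correct, 0 = incorrect) for 6 students.
from statistics import correlation  # Python 3.10+

item_scores = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 1, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
]

odd_totals = [sum(row[0::2]) for row in item_scores]   # odd-numbered items
even_totals = [sum(row[1::2]) for row in item_scores]  # even-numbered items

r_half = correlation(odd_totals, even_totals)          # correlation of the two halves
r_full = 2 * r_half / (1 + r_half)                     # Spearman-Brown correction

print(f"half-test r = {r_half:.2f}, split-half reliability = {r_full:.2f}")
```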

Getting the Most Out of an Achievement Test

  • Helpful but not sufficient – most tests allow teachers to find an appropriate starting point
  • What is the nature of the behaviors being sampled by the test?
    • Need to seek out additional information concerning student strengths and weaknesses
      • Which items did the student excel on? Which did he or she struggle with?
      • Were there patterns of responding?

Chapter Twelve

  • Using Diagnostic Reading Tests

Why Do We Assess Reading?

  • Reading is fundamental to success in our society, and therefore reading skill development should be closely monitored
  • Diagnostic tests can help to plan appropriate intervention
  • Diagnostic tests can help determine a student’s continuing need for special services

The Ways in Which Reading is Taught

  • The effectiveness of different approaches is heavily debated
  • Whole-word vs. code-based approaches
  • Over time, research has supported the importance of phonemic awareness and phonics

Skills Assessed by Diagnostic Approaches

  • Oral Reading
    • Rate of Reading (a rate-calculation sketch follows this list)
    • Oral Reading Errors
      • Teacher pronunciation/aid
      • Hesitation
      • Gross mispronunciation
      • Partial mispronunciation
      • Omission of a word
      • Insertion
      • Substitution
      • Repetition
      • Inversion
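Rate and errors are typically combined into words correct per minute (WCPM) and an accuracy percentage. The sketch below shows that arithmetic with invented numbers; it does not reproduce any particular test's scoring rules, and error definitions vary across instruments, as the list above suggests.

```python
# Hedged sketch: words correct per minute (WCPM) and accuracy from an oral
# reading sample. The passage length, error count, and time are invented.
def oral_reading_rate(words_attempted, errors, seconds):
    words_correct = words_attempted - errors
    wcpm = words_correct / (seconds / 60)              # words correct per minute
    accuracy = 100 * words_correct / words_attempted   # percent read correctly
    return round(wcpm, 1), round(accuracy, 1)

# e.g., a student reads 113 words in 60 seconds with 7 scorable errors
print(oral_reading_rate(113, 7, 60))   # -> (106.0, 93.8)
```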

Skills Assessed by Diagnostic Approaches (cont.)

  • Reading Comprehension
    • Literal comprehension
    • Inferential comprehension
    • Critical comprehension
    • Affective comprehension
    • Lexical comprehension

Skills Assessed by Diagnostic Approaches (cont.)

  • Word-Attack Skills (i.e., word analysis skills) – use of letter-sound correspondence and sound blending to identify words
  • Word Recognition Skills – “sight vocabulary”

Diagnostic Reading Tests

  • See Table 12.1
  • Group Reading Assessment and Diagnostic Evaluation (GRADE)
  • DIBELS Next
  • Test of Phonological Awareness – Second Edition: Plus (TOPA-2+)

GRADE (Williams, 2001)

  • Pre-school to 12th grade
  • 60 to 90 minutes
  • Assesses pre-reading, reading readiness, vocabulary, comprehension, and oral language
  • Norm group is missing some important demographic information
  • High total-score reliabilities (lower subscale reliabilities)
  • Adequate information to support the validity of the total score

DIBELS Next (Good and Kaminski, 2010)

  • Kindergarten-6th grade
  • Very brief administration (used for screening and monitoring)
  • First Sound Fluency, Letter Naming Fluency, Phoneme Segmentation Fluency, Nonsense Word Fluency, Oral Reading Fluency, and DAZE (comprehension)
  • Use of benchmark expectations or development of local norms (a local-norm sketch follows this list)
  • Multiple administrations necessary for making important decisions
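Where local norms are used instead of published benchmarks, each student's screening score is placed within the local distribution. The sketch below illustrates one common percentile-rank calculation with invented district scores; it is not the DIBELS Next benchmarking procedure.

```python
# Hedged sketch: percentile rank of a screening score within a local
# (e.g., district) distribution. The oral reading fluency scores are invented.
def local_percentile(score, local_scores):
    below = sum(s < score for s in local_scores)   # scores below the student's
    ties = sum(s == score for s in local_scores)   # scores equal to the student's
    return round(100 * (below + 0.5 * ties) / len(local_scores), 1)

district_orf = [18, 22, 25, 31, 34, 35, 40, 41, 47, 52, 58, 63, 71, 80, 95]
print(local_percentile(34, district_orf))   # -> 30.0 (about the 30th local percentile)
```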

TOPA 2+ (Torgesen & Bryant, 2004)

  • Ages 5 to 8
  • Phonemic awareness and letter-sound correspondence
  • Good norms description
  • Reliability better for kindergarteners than for more advanced students
  • Adequate overall validity

Chapter 13

  • Using Diagnostic Mathematics Measures

Why Do We Assess Mathematics?

  • Multiple-skill assessments provide broad levels of information, but lack specificity when compared to diagnostic assessments
  • More intensive assessment of mathematics helps educators:
    • Assess the extent to which current instruction is working
    • Plan individualized instruction
    • Make informed eligibility decisions

Ways to Teach Mathematics

  • < 1960: Emphasis on basic facts and algorithms, deductive reasoning, and proofs
  • 1960s: New Math; movement away from traditional approaches to mathematics instruction
  • 1980s: Constructivist approach – standards-based math. Students construct knowledge with little or no help from teachers
  • > 2000: Evidence supports explicit and systematic instruction (most similar to “traditional” approaches to instruction).

Behaviors Sampled by Diagnostic Mathematics Tests

  • National Council of Teachers of Mathematics (NCTM)
  • Content Standards
    • Number and operations
    • Algebra
    • Geometry
    • Measurement
    • Data analysis and probability
  • Process Standards
    • Problem solving
    • Reasoning and proof
    • Communication
    • Connections
    • Representation

Specific Diagnostic Math Tests

  • Group Mathematics Assessment and Diagnostic Evaluation (G●MADE)
  • KeyMath-3 Diagnostic Assessment (KeyMath-3 DA)

G●MADE

  • General
    • Group administered, norm-referenced, standards-based test
    • Used to identify specific math skill strengths and weaknesses
    • Students K-12
    • 9 levels of difficulty teachers may select from

G●MADE

  • Subtests
    • Concepts and communication
      • Language, vocabulary, and representations of math
    • Operations and computation
      • Addition, subtraction, multiplication, and division
    • Process and applications
      • Applying appropriate operations and computations to solve word problems

G●MADE

  • Scores
    • Raw scores converted to:
      • Standard scores, grade scores, stanines, percentiles, normal curve equivalents, and growth scale values.
  • Norm population
    • 2002 and 2003; nearly 28,000 students
    • Selected based on geographic region, community type, socioeconomic status, and representation of students with disabilities

G●MADE

  • Reliability
    • Acceptable levels of split-half and alternate-form reliability
  • Validity
    • Based on NCTM standards (content validity)
    • Strong criterion-related evidence

KeyMath-3 Diagnostic Assessment (KeyMath-3 DA)

  • General
    • Comprehensive assessment of math skills and concepts
    • Untimed, individually administered, norm-referenced test; 30-40 minutes
    • 4 years 6 months through 21 years

KeyMath-3 DA

  • Subtests
    • Numeration
    • Algebra
    • Geometry
    • Measurement
    • Data analysis and probability
    • Mental computation and estimation
    • Addition and subtraction
    • Multiplication and division
    • Foundations of problem solving
    • Applied problem solving

KeyMath-3 DA

  • Scores
    • Raw scores converted to:
      • Standard scores, scaled scores, percentile rank, grade and age equivalents, growth scale values
    • Composite scores
  • Norm population
    • 3,630 individuals
    • Ages 4 years 6 months through 21 years; demographic distribution approximates data reported in the 2004 census

KeyMath-3 DA

  • Reliability
    • Internal consistency, alternate-form, and test-retest reliability
    • Adequate for screening and diagnostic purposes
  • Validity
    • Adequate content and criterion-related validity evidence for all composite scores

Chapter 14

  • Using Measures of Oral and Written Language

Assessing Language Competence

  • When assessing language skills, it is important to break language down into processes and measure each one
    • Language occurs in both oral and written forms
      • Comprehension
      • Expression
    • Normal levels of comprehension ≠ normal expression
    • Normal levels of expression ≠ normal comprehension

Terminology: Language as Code

  • Phonology:
    • Hearing and discriminating word sounds
  • Semantics:
    • Understanding vocabulary, meaning, and concepts
  • Morphology and syntax:
    • Understanding the grammatical structure of language
  • Supralinguistics and pragmatics:
    • Understanding a speaker’s or writer’s intentions

Assessing Oral and Written Language

  • Why?
    • Ability to converse and express thoughts is desirable
    • Basic oral and written language skills underlie higher-order skills
  • Considerations in assessing oral language
    • Cultural diversity
      • Dialect differences are just that – differences, not errors
        • Disordered production of primary language or dialect should be considered when evaluating oral language
      • Are the norms and materials appropriate?
    • Developmental considerations
      • Be aware of developmental norms for language acquisition

Assessing Oral and Written Language

  • Considerations in assessing written language
    • Form and Content
      • Penmanship
      • Spelling
      • Style
    • May be best assessed by evaluating students’ written work and developing tests (vocabulary, spelling, etc.) that parallel the curriculum

Methods for Observing Language Behavior

  • Spontaneous language
    • Record what the child says while talking to an adult or playing with toys
    • Prompts may be used for older children
    • Analyze phonology, semantics, morphology, syntax, and pragmatics
  • Imitation
    • Require children to repeat words, phrases, or sentences produced by the examiner
    • Valid predictor of spontaneous production
    • Standardized imitation tasks often used in oral language assessment instruments
  • Elicited language
    • A picture stimulus is used to elicit language

Methods for Observing Language Behavior

  • Advantages and disadvantages of each method
    • Spontaneous
      • Advantages
        • Most natural indicator of everyday language performance
        • Informal testing environment
      • Disadvantages
        • Not a standardized procedure (more variability)
        • Time-intensive
    • Imitation
      • Advantages
        • Comprehensive
        • Structured and efficient administration
      • Disadvantages
        • Auditory memory may affect results
        • Hard to draw conclusions from accurate imitations
        • Boring for the child
    • Elicited language
      • Advantages
        • Interesting and efficient
        • Comprehensive
      • Disadvantages
        • Difficult to create valid measurement tools

Specific Oral and Written Language Tests

  • Test of Written Language – Fourth Edition (TOWL-4)
  • Test of Language Development: Primary – Fourth Edition (TOLD-P:4)
  • Test of Language Development: Intermediate – Fourth Edition (TOLD-I:4)
  • Oral and Written Language Scales (OWLS)

Test of Written Language – Fourth Edition (TOWL-4)

  • General
    • Norm-referenced
    • Designed to assess written language competence of students between the ages of 9 and 17
    • Two formats
      • Contrived
      • Spontaneous

TOWL-4

  • Subtests
    • Contrived
      • Vocabulary
      • Spelling
      • Punctuation
      • Logical sentences
      • Sentence combining
    • Spontaneous
      • Contextual conventions
      • Story composition

TOWL-4

  • Scores
    • Raw scores can be converted to percentile or standard scores
    • Three composite scores, including one overall score
      • Contrived writing
      • Spontaneous writing
      • Overall writing

TOWL-4

  • Norms
    • Three age ranges: 9-11, 12-14, and 15-17
    • Distribution approximates nationwide school-age population for 2005; however, insufficient data are presented to confirm this
  • Reliability
    • Variable data for internal consistency, stability, and inter-scorer agreement
    • 2 composites reliable for making educational decisions about students
  • Validity
    • Content, construct, and predictive validity evidence is presented
    • Validity of inferences drawn from data is somewhat unclear

Test of Language Development: Primary – Fourth Edition (TOLD-P:4)

  • General
    • Norm-referenced, untimed, individually administered test
    • 4-8 years of age
    • Used to:
      • Identify children significantly below their peers in oral language
      • Determine specific strengths and weaknesses
      • Document progress in remedial programs
      • Measure oral language in research studies

TOLD-P:4

  • Subtests
    • Picture vocabulary
    • Relational vocabulary
    • Oral vocabulary
    • Syntactic understanding
    • Sentence imitation
    • Morphological completion
    • Word discrimination
    • Word analysis
    • Word articulation
  • Scores
    • Raw scores converted to:
      • Age equivalents, percentile ranks, subtest scaled scores, and composite scores
    • Composite scores
      • Listening
      • Organizing
      • Speaking
      • Grammar
      • Semantics
      • Spoken language

TOLD-P:4

  • Norm population
    • 1,108 individuals across 4 geographic regions
    • Sample partitioned according to the 2007 census
  • Reliability
    • Adequate estimates of reliability
      • Coefficient alpha
      • Test-retest
      • Scorer difference
  • Validity
    • Adequate content, construct, and criterion-related validity evidence

Test of Language Development: Intermediate – Fourth Edition (TOLD-I:4)

  • General
    • Norm-referenced, untimed, individually administered test
    • 8-17 years of age
    • Used to:
      • Identify children significantly below their peers in oral language
      • Determine specific strengths and weaknesses
      • Document progress in remedial programs
      • Measure oral language in research studies

TOLD-I:4

  • Subtests
    • Sentence combining
    • Picture vocabulary
    • Word ordering
    • Relational vocabulary
    • Morphological comprehension
    • Multiple meanings
  • Norm population
    • 1,097 students from 4 geographic regions
    • Sample partitioned according to the 2007 census
  • Scores
    • Raw scores converted to:
      • Age equivalents, percentile ranks, subtest scaled scores, and composite scores
    • Composite scores
      • Listening
      • Organizing
      • Speaking
      • Grammar
      • Semantics
      • Spoken language

TOLD-I:4

  • Reliability
    • Adequate estimates of reliability
      • Coefficient alpha
      • Test-retest
      • Scorer difference
  • Validity
    • Adequate content, construct, and criterion-related validity evidence

Oral and Written Language Scales (OWLS)

  • General
    • Norm-referenced, individually administered assessment of receptive and expressive language
    • 3-21 years of age
  • Subtests
    • Listening comprehension
    • Oral expression
    • Written expression

OWLS

  • Norm population
    • 1,985 students matched to 1991 census data
  • Scores
    • Raw scores converted to:
      • Standard scores, age equivalents, normal-curve equivalents, percentiles, and stanines
      • Scores generated for each subtest, an oral language composite, and for a written language composite

OWLS

  • Reliability
    • Sufficient internal and test-retest reliability for screening, but not for making important decisions about individual students
  • Validity
    • Adequate criterion-related validity

