Strategies, Tips, and Tools for Facilitating Learning Outcomes Assessment

  • Jerry Rudmann, Irvine Valley College, February 2008
  • Student Learning Outcomes

Overview - Instruction

  • Fine-tuning assessment
    • Item analysis primer
    • Calibrating rubric scoring
    • Tips for writing surveys
  • Helpful technology tools
    • Clickers - promote active learning and record SLO information
    • PDF Acrobat forms - autoscoring and recording student input
    • Portfolios - making students responsible and reflective
    • Scanning - some ideas
    • eTablets
    • Rubric generators - a way to measure most anything
    • Excel templates
    • CCC Confer - makes dialogue easier
    • Calibrated Peer Review
    • Tracking software - organizing all this stuff
  • Several options / strategies for making SLOs meaningful
    • Address “robust” SLOs (overarching outcomes)
    • Problem focus
    • Less is better
    • Share SLOs with students
    • Use what you already have
    • Think of SLOs in the context of student development
    • Qualitative assessment is OK

Some Options / Strategies for Making SLOs Meaningful

    • Address “robust” SLOs (overarching outcomes)
    • Problem focus
    • Less is better
    • Share SLOs with students
    • Use what you already have
    • Think of SLOs in the context of student development
    • Qualitative assessment is OK
    • Others…

General Tip 1: Robust SLOs

  • Developed through faculty dialog
  • Behavioral/measurable
  • Real-world
  • Higher-level
  • Conditions
  • Performance Criteria
  • Global, over-arching
  • Scored with rubric

General Tip 2: Problem Focus Approach

  • What concepts or competencies do students have difficulty mastering?
  • Focus SLO activities on problem areas.

General Tip 3: Keep It Simple But Meaningful

  • Corollary - Often, less is better.

General Tip 4: Student Development Approach

  • Student development
    • Academic self-efficacy (Bandura)
    • Academic self-regulation
    • Campus involvement (Astin)
  • Mentoring professor studies
  • Student Services DO help student success
  • A Closer Look at Objective Tests
  • Test Items

Item Considerations

  • Reliability
  • Item Difficulty
  • Validity
  • Level of assessment (Bloom’s taxonomy)
  • Tips from ETS
  • Recommendations

What is an Assessment?

  • In the most general sense, an assessment is a sample of behavior
    • Achievement
    • Aptitude
    • Personality
  • For assessing SLOs, an assessment is a finite set of objective format items

Assessment and Item Analysis Concerns

  • We must consider the properties of the items that we choose
    • Is your assessment reliable?
    • Item difficulty level
    • Examine performance of distracters
    • Is your assessment valid?

Improving Reliability of Assessment

    • Use several similar items that measure a particular skill or area of knowledge.
    • Seek items giving high item/total test score correlations. Each item should correlate well with the assessment’s total score.

Said in another way…

    • When you are giving a test that is measuring a specific characteristic (e.g., knowledge of the steps toward completing a financial aid form; knowledge of theoretical perspectives in psychology; knowledge of the components of a neuron), the items on the test should be intercorrelated (i.e., have “internal consistency” - they relate with one another).
      • Only when the items relate to one another can we be confident that we are measuring the characteristic we intended to measure (i.e., when the test has “internal consistency,” it is a reliable measure of that characteristic and is more likely to be valid).

How Do We Determine Internal Reliability?

  • Examine correlations between each item score and the total test score—this is one way to assess “internal consistency”
    • You are correlating students’ “pass” vs. “fail” status on each item with students’ overall test scores.
      • This analysis indicates whether the item and total scores are assessing the behavior in the same way.
      • In general, items should be answered correctly by those obtaining high total scores (thus there should be a positive correlation).
      • In your final test, select only those items that have high positive internal correlations.
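The item/total correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `item_total_correlations` and the small response matrix are hypothetical, each item is scored 1 (pass) or 0 (fail), and the total used here includes the item itself (an uncorrected item/total correlation).

```python
from statistics import mean, pstdev


def item_total_correlations(responses):
    """Correlate each item's pass/fail scores with students' total scores.

    responses: one list per student, holding 0/1 scores for each item.
    Returns one Pearson correlation per item (None if it cannot be computed).
    """
    totals = [sum(student) for student in responses]
    sd_total = pstdev(totals)
    correlations = []
    for i in range(len(responses[0])):
        item = [student[i] for student in responses]
        sd_item = pstdev(item)
        if sd_item == 0 or sd_total == 0:
            # No variance (e.g., everyone passed the item): correlation undefined.
            correlations.append(None)
            continue
        covariance = mean(x * y for x, y in zip(item, totals)) - mean(item) * mean(totals)
        correlations.append(covariance / (sd_item * sd_total))
    return correlations


# Hypothetical data: 5 students (rows) by 3 items (columns).
example = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
]
for number, r in enumerate(item_total_correlations(example), start=1):
    print(f"Item {number}: r = {r:.2f}")
```

Items with low or negative correlations are the candidates to revise or drop before the final form of the test.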

Item Difficulty

  • Difficulty
    • Permits the construction of a test in a specific way with specific characteristics.
    • Difficulty is based on the proportion of persons who pass or correctly answer the item.
      • The greater the proportion, the easier the item
    • What is the optimum level of item difficulty?

Item Difficulty - Prediction

  • If you are assessing achievement, proficiency, or mastery of subject matter, AND the results will be used in studies or examinations for prediction, then you should strive for an average item difficulty of .50.
    • Each item should not deviate much from this value.
    • With .50 difficulty, more “discriminations” among students are possible, so the variance among the test scores is at its maximum, which leads to better reliability and validity.

Item Difficulty - Competency

  • If you are interested in classification (e.g., mastery or not of most of the material in the course), then you should use the proportion that represents that standard.
    • If you deem 80% on an exam as “mastering the material,” then you should use .80 as the average difficulty level
      • some items will be higher and some lower, but the average would be .80.
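Item difficulty, as defined above, is simply the proportion of students who answer an item correctly. The sketch below computes it per item and averages it so the result can be compared with a target such as .50 or .80. The function names and the response data are hypothetical.

```python
def item_difficulties(responses):
    """Return the proportion of students answering each item correctly.

    responses: one list per student, holding 0/1 scores for each item.
    A higher proportion means an easier item.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[i] for student in responses) / n_students
            for i in range(n_items)]


def average_difficulty(responses):
    """Average the item difficulties across the whole test."""
    difficulties = item_difficulties(responses)
    return sum(difficulties) / len(difficulties)


# Hypothetical data: 5 students (rows) by 3 items (columns).
example = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
]
target = 0.50  # use 0.80 instead if 80% is your mastery standard
print("Item difficulties:", item_difficulties(example))
print("Average difficulty:", round(average_difficulty(example), 2), "target:", target)
```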

Item Analysis: Validity

  • Test Validity: Relationship between total test scores and scores on an outside variable
  • Item Validity: Relationship between scores on each of the items and some external criterion.
    • Most are concerned with test validity, but test validity is a function of item validity.

More on Item Validity

  • Create external criterion groups, e.g., those with high scores (say the upper 27%) and those with low scores (say the lower 27%). Then find the items on the test (here, a test meant to predict school aptitude) that are passed by a significantly greater number in one group than in the other. These are the more effective items.
    • To select items, calculate the “discrimination index” (D), which is the difference between the number of correct responses in the high (H) and the low (L) groups. If 80 H scorers and 10 L scorers answered the item correctly, then D = H – L = 70. Select items with high positive D values (especially for achievement or aptitude tests) for the final form of the test. D can also be computed from the proportion correct in each group, which makes it independent of sample size.
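The proportion form of the discrimination index can be sketched as follows. This is an illustration under stated assumptions: the high and low groups are formed from total test scores (the upper and lower 27%), the function name `discrimination_index` is hypothetical, and so is the data.

```python
def discrimination_index(responses, item, fraction=0.27):
    """Return D for one item as a difference between proportions.

    responses: one list per student, holding 0/1 scores for each item.
    item: zero-based index of the item to analyze.
    fraction: share of students placed in each of the high and low groups.
    """
    ranked = sorted(responses, key=sum, reverse=True)  # highest totals first
    group_size = max(1, round(len(ranked) * fraction))
    high = ranked[:group_size]
    low = ranked[-group_size:]
    p_high = sum(student[item] for student in high) / group_size
    p_low = sum(student[item] for student in low) / group_size
    return p_high - p_low


# Hypothetical data: 10 students (rows) by 4 items (columns).
example = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1],
    [1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1],
    [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0],
]
for item in range(4):
    print(f"Item {item + 1}: D = {discrimination_index(example, item):.2f}")
```

A value near 1 means high scorers passed the item and low scorers did not; a value near 0 or below suggests the item does not separate the two groups.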

Tips for Writing Multiple-Choice Questions

  • You CAN test more than recognition and recall.
  • Applying Bloom’s Taxonomy.

Bloom’s Taxonomy & Objective Test Items

  • Levels of the taxonomy, from higher-order questions down to lower-order questions:
    • Create
    • Evaluate
    • Analyze
    • Apply
    • Understand
    • Remember

A Lower Order Question

  • Obsessive-compulsive disorder is characterized by which primary symptom?
  • Hallucination
  • Memory loss
  • Intense specific fear
  • Delusion
  • Unwanted repetitive thoughts*

Lower Order Question, Type 2

  • Which disorder is characterized by unwanted, intrusive thoughts and repetitive behavior?
  • Phobia
  • Obsessive-compulsive disorder*
  • Dissociative identity disorder
  • Major depressive disorder
  • Schizophrenia

Creating Higher-Order Questions

  • The question requires students to mediate their answers by doing an extra step they had not previously learned in their studies.
  • Students must transfer recalled knowledge to a new situation, break apart and reassemble concepts in new ways, or combine content of two areas in novel ways to answer a question.
  • Not always easy to distinguish between application and analysis questions
  • A student who misses deadlines in school while striving for perfection may be exhibiting symptoms of which of the following disorders?
  • Phobia
  • Obsessive-compulsive disorder*
  • Dissociative identity disorder
  • Major depressive disorder
  • Schizophrenia
  • Gene is always late for school because he spends an hour organizing his closet each morning. Which of the following treatments would be most effective for Gene’s problem?
  • In-depth interpretation of dreams
  • Electroconvulsive therapy
  • Medication affecting serotonin levels*
  • Systematic desensitization
  • Regular exposure to bright lights

Tips from ETS

  • Whenever possible write items using positive form.
  • Don’t include “teaching” in the stem.
  • Use plausible distracters.
  • Can you give a reason why each distracter is not an acceptable response?
  • The stem should be a complete question or statement.
  • The correct answer should be about the same length as the distracters.
  • Items should not ask trivial information. The point being tested should be one worth testing.

ETS Tips on Distracters

  • Should be reasonable.
  • May include misconceptions and errors typical of less prepared examinees.
  • May include truisms, rules-of-thumb that do not apply to or satisfy the problem requirements.
  • Negative stems should be avoided. Stems that include “EXCEPT” “NOT” “LEAST” can be difficult to process. Never use negatives in both the stem and in the options.
  • Avoid using “All of the above” as a distracter.

Conclusions and Recommendations

  • Take care when writing and/or selecting items from a test bank.
  • Look for at least some items that test higher levels of Bloom’s Taxonomy.
  • After the test, have your best students critique your test and find items needing revision.
  • When selecting software (clicker, scanner, survey, test) consider the item analysis capability that comes with the software – factor that in to your purchase decision.
    • Pass/fail rate for each item
    • Percentage breakdown for all distracters
    • Discrimination index (high versus low scorers)
    • Item correlation with total score
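If your software does not report a percentage breakdown for the distracters, it is easy to tally one yourself. The sketch below assumes each student's chosen option for one item is recorded as a letter; the function name `option_breakdown` and the data are hypothetical.

```python
from collections import Counter


def option_breakdown(choices, options="ABCDE"):
    """Return the percentage of students who chose each option of one item.

    choices: the option letter picked by each student.
    options: all options offered, so unchosen ones show up as 0 percent.
    """
    counts = Counter(choices)
    total = len(choices)
    return {option: 100 * counts.get(option, 0) / total for option in options}


# Hypothetical data: 10 students answering one item whose key is E.
example = list("EEEEEEACCD")
for option, percent in option_breakdown(example).items():
    print(f"{option}: {percent:.0f}%")
```

A distracter that almost nobody chooses may not be plausible, and one chosen more often than the keyed answer may signal a flawed item or a common misconception worth addressing.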

Final Word on Valid Assessment

    • Try using different methods of assessing learning
      • Converging evidence
      • This increases the overall validity of assessment
      • Example
        • Embedded assessment (multiple choice quizzes, exams)
        • Authentic assessment (students apply the skill)
        • Students self-rate their ability
        • Students post evidence in ePortfolio
  • Surveys

Surveys - SLO Uses

  • Students self-rate their competencies on program or college level learning outcomes.
  • Students’ satisfaction with various student services.

Types of Questions

  • Open-ended – respondents answer in own words
  • Closed-ended – respondent limited to a finite range of choices

Types of Questions

  • Open-ended
  • Closed-ended
    • Easier to code answers, process and analyze
    • Hard to write good closed-ended items

Item Format

  • Visual Analogue Scale
  • Food in the cafeteria is
  • Poor_ _ __ _ _ _ _ _ _ _ _ _ _ _ _ _Excellent
  • Likert Scale
  • Food in the cafeteria is outstanding!
  • SD D N A SA
  • (Strongly Disagree) (Disagree) (Neutral) (Agree) (Strongly Agree)

Nine tips for designing and deploying a survey

  • Don’t call it a survey
  • Provide a carefully worded rationale or justification at the beginning
  • Group items by common format
  • Start with more interesting items
  • Put demographic items last
  • Mix in negative wording to catch acquiescence (aka “response set”)
  • Automate scoring when possible
  • If asking for sensitive information, use procedures designed to assure anonymity
  • Always, always, always pilot test first
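Two of the tips above, mixing in negative wording and automating scoring, work together: negatively worded items must be reverse-scored before totals are computed, and a respondent who agrees with everything, including the negatively worded items, may be showing acquiescence. The sketch below assumes a 5-point scale coded 1 (Strongly Disagree) to 5 (Strongly Agree); the function names and the agreement threshold of 4 are hypothetical choices.

```python
def reverse_score(score, low=1, high=5):
    """Flip a rating on a low..high scale (5 becomes 1, 4 becomes 2, ...)."""
    return low + high - score


def score_survey(raw, negative_items, low=1, high=5):
    """Reverse-score the negatively worded items and leave the rest unchanged.

    raw: one respondent's ratings, in item order.
    negative_items: set of zero-based positions of negatively worded items.
    """
    return [reverse_score(rating, low, high) if position in negative_items else rating
            for position, rating in enumerate(raw)]


def looks_acquiescent(raw, negative_items, agree_threshold=4):
    """Flag a respondent who agreed with every item, negative ones included."""
    return bool(negative_items) and all(rating >= agree_threshold for rating in raw)


# Hypothetical respondent: items at positions 1 and 3 are negatively worded.
ratings = [5, 5, 4, 5]
negatives = {1, 3}
print("Scored:", score_survey(ratings, negatives))
print("Possible acquiescence:", looks_acquiescent(ratings, negatives))
```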

Survey Administration Methods

  • Face to Face
  • Written
    • Group administration
    • Mail
  • Computerized
    • Password protected
    • Validation rules
    • Branching and piping
  • Telephone
  • Focus groups can be especially insightful and helpful for program and institutional level learning outcome assessment.
  • Have your college researcher provide some background materials.
  • Focus Groups: A Practical Guide for Applied Research
  • By Richard A. Krueger, Mary Anne Casey
  • The RP Group sponsored several “drive in” workshops over the last few years.
  • Focus Groups

Goal for This Section

  • Technology Uses
  • Technology Tools Expected Outcome: Be able to select and use technology-based approaches to assess student learning outcomes

Assessment Challenges

  • Assessing Students in Large Classes
  • Assessing Performance at a Distance
  • Minimizing Subjectivity in Assessment
  • Creating Authentic Assessments
  • Engaging Students in Self-Evaluation
  • Accommodating Diverse Learning Styles
  • Assessing Conceptual Thinking
  • More Efficient Assessment

Technology Tools

  • CCC Confer (Web Conferencing)
  • Online Rubric Builders
  • eLumen (SLO Assessment/Tracking)
  • Calibrated Peer Review (CPR)
  • Classroom Responders (“Clickers”)
  • Scannable and Online Tests
  • ePortfolio
  • Adobe Acrobat Forms
  • Excel Spreadsheets

CCC Confer

  • Small-group work in project-based learning
  • Involving ALL instructors in the department’s SLO dialogue

CCC Confer Screen Shot


  • Way to measure the heretofore immeasurable: products and performances.
  • A rubric breaks the assessment into important components.
  • Each component is rated along a well-labeled scale.

Let’s Develop an Assessment Rubric for a Resume

  • Factor: Lists educational background
  • Rating scale for each factor:
    • Needs Improvement: 0 points
    • Satisfactory: 1 point
    • Excellent: 2 points

Chocolate Chip Cookie Rubric


Rubrics are Good!

  • Facilitate staff dialogue regarding satisfactory performance.
  • Create a more objective assessment.
  • Make expectations more explicit to the student.
  • Encourage metacognitive skill of self-monitoring own learning.
  • Facilitate scoring and reporting of data.

Online Discussion Rubric


Design Your Own Rubric

  • Please work in groups and use the worksheet in your packet to design a scoring rubric for assessing one of the following:
    • Coffee shop (not café)
    • Syllabi
    • Customer service at retail stores
    • Grocery stores
    • Online courses

Online Rubric Builders

  • Rubrics to guide and measure learning
  • Tools
    • Rubistar
    • Landmark Rubric Machine


  • Rubistar Art History Rubric

Rubric Builder Screen Shot

Adobe Acrobat Forms

  • Make form using MS Word
  • Import form and save as PDF form
  • Adjust the fields
  • Add fields to tally sub scores and total scores

How Do You Report Results?

eLumen to Assess SLOs

  • Reduce Time Spent Creating Reports
  • Assess Course, Program, and/or Degree-Level Outcomes
  • Share Assessment Rubrics Across Classes and Programs
  • View Individual or Aggregated Results
  • Use Online or Offline

Use Online or Offline

Criterion-Based Assessment

  • Rubrics are attached to each SLO

Rubrics Describe Criteria

  • Writes prose clearly
  • Excerpted from eLumen: A Brief Introduction by David Shupe, July 2007

Link to Multiple SLOs

  • Assignment: Read and write a response paper for the novel A Lesson Before Dying
  • SLOs linked to the assignment, each scored with its own rubric:
    • Writes prose clearly
    • Critically analyzes a text
    • Considers ethical aspects of a situation or text
  • Here, one assignment stands as evidence for 3 different SLOs

Library of Degree-Level SLOs


And Rubrics Link to SLOs


Science and Gen Ed SLOs/Rubrics

  • From the Science committee
  • From the faculty committee on critical thinking
  • From the faculty committee on communication skills

Scorecard for All Students in the Course


Class Scores by Student


Aggregated Data for Course


Course Aggregates by Program


Calibrated Peer Review

  • Web-based program that enables frequent writing assignments with minimal impact on instructor time
  • Uses peer review
  • Promotes deeper learning

Calibrated Peer Review in Psychology 101

Critical Thinking in Introductory Psych Course

  • SLO on Pseudoscience skepticism: Students will correctly identify non-scientific explanations of human behavior and explain why those explanations are not based upon science and do not provide reliable or valid explanations of behavior or predictions of future behavior.

The Pseudoscience Belief Test

  • Please rate how much you believe the following statements. Use the 7-point scale provided.
  • 1 – Do not believe in this at all.
  • 2 – I doubt very much that this is real.
  • 3 – I doubt that this is real.
  • 4 – I am unsure if this is real or not.
  • 5 – I believe that this may be real.
  • 6 – I believe that this is real.
  • 7 – I strongly believe this is real.
  • __ 1. A person’s personality can be easily predicted by their handwriting.
  • __ 2. A person can use their mind to see the future or read other people’s thoughts.
  • __ 3. A person’s astrological sign can predict a person’s personality and their future.
  • __ 4. An ape-like mammal, sometimes called Bigfoot, roams the forests of America.
  • __ 5. The body can be healed by placing magnets onto the skin near injured areas.
  • __ 6. Healing can be promoted by placing a wax candle in your ear and lighting it.
  • __ 7. A dinosaur, sometimes called the Loch Ness Monster, lives in a Scottish lake.
  • __ 8. Sending chain letters can bring you good luck; ignoring them can bring you bad luck.
  • __ 9. The government is hiding evidence of alien visitation at places such as Area 51.
  • __ 10. Voodoo curses are real and have been known to kill people.
  • __ 11. A broken mirror can bring you bad luck for many years.
  • __ 12. Houses can be haunted by the spirits of people who have died in tragic ways.
  • __ 13. Water can be accurately detected by people using “Y” shaped tree branches.
  • __ 14. Animals, such as cats and dogs, are sensitive to the presence of ghosts.
  • Adapted from Walker, Hoekstra, & Vogl (2002). Science education is no guarantee of skepticism. Skeptic, 9(3).

Critical Thinking Experiment Using a SLO as the Dependent Variable

  • Pseudoscience Belief Pre-test
  • Randomly assigned 90 students to one of two lessons:
    • Calibrated Peer Review lesson on graphology
    • Calibrated Peer Review lesson on a different topic
  • Pseudoscience Belief Post-test

CPR Procedure

  • Students read the assignment.
  • Students read the resource materials.
  • Students write a short essay in response to the materials: Why I do or do not believe graphology is a reliable, valid way to measure and predict personality.
  • Students are “calibrated,” that is, prepared to score essays written by their peers.
  • Students receive a detailed grade report for the assignment.

Graphology Belief Scores Statistical Summary

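  • Graphology treatment group: pre-test average 4.41, post-test average 2.33; paired t-test t(26) = 6.40, p < .01
  • Conditioning treatment group: pre-test average 4.12, post-test average 3.69; paired t-test t(25) = 1.31, p = ns
  • Comparison between groups at pre-test: t(51) = 0.67, p = ns
  • Comparison between groups at post-test: t(46.7) = 2.93, p < .01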
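For readers who want to run the same kind of pre/post comparison on their own SLO data, a paired t statistic can be computed with the Python standard library alone. This is a minimal sketch: the function name `paired_t` and the ratings below are hypothetical and are not the data from this study. The function returns only the t value and its degrees of freedom; the p value would then come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t, degrees of freedom) for paired pre-test and post-test scores.

    pre, post: scores for the same students, listed in the same order.
    Needs at least two students whose change scores are not all identical.
    """
    differences = [before - after for before, after in zip(pre, post)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical belief ratings (7-point scale) for five students.
pre_test = [5, 4, 6, 3, 5]
post_test = [3, 2, 5, 2, 2]
t_value, df = paired_t(pre_test, post_test)
print(f"t({df}) = {t_value:.2f}")
```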

Mean Pre and Post-Test Scores on Graphology Belief Question

Example Essay The Detection of a Pseudoscience: Graphology

  • Elaine Quigley’s posting on the website is littered with “red flags” that expose graphology as the pseudoscience/pseudopsychology that it is. While an attempt to promote graphology, Quigley’s posting fails to measure up to several of Cotton and Scalise’s guidelines for “baloney detection.” This paper will examine four areas in which graphology fails to live up to its claim of being “science.”
  • In an attempt to display graphology’s validity, Quigley cites the notion that it is “a very old and respected science.” The fact that it has existed for approximately 3,000 years is used to justify Quigley’s notion that graphology is a science. However, one educated in the definition of science knows that the age of a theory is not a factor used to determine its validity. In fact, there are many beliefs that have been around for thousands of years that cannot be tested and therefore cannot be deemed as scientifically reliable. Graphology is just one of many ideas that cannot be justified despite their age.
  • Quigley also fails to tell how the “science” of graphology has been tested and proven. Instead, she simply states that graphology is a “reliable indicator of personality and behavior” and expects her readers to accept this statement as fact. She also mentions that “the science is still being researched and expanded.” This is the extent to which she approaches the issues related to the research of graphology. Without explaining the testing that was done to prove the method’s reliability, how is one to know that graphology is indeed reliable? Indeed, the answer is simple. It is impossible to be sure of the reliability of a measure of personality if the measure itself cannot be tested.
  • In addition to not presenting methods for testing the claims of graphology, Quigley also fails to present evidence in support of its validity. Instead, she simply states that “it is not easy to explain how and why graphology works, nevertheless it continues to be used, respected and appreciated by many.” Could it be that the only “evidence” for the reliability of graphology is the satisfaction that its users experience? Unfortunately, being “used” and “accepted” are not characteristics required of a science.
  • Finally, the vast majority of information provided by Quigley is anecdotal and leads up to a sales pitch for her services. She provides vague stories about how graphology has been used to produce more successful hiring processes and personal relationships. The information is presented more as an advertisement than a scientific work. Quigley goes into more detail on her experience as a graphologist than she does on the aspects of graphology that would qualify it as a science.
  • In conclusion, it is quite clear that based on the evidence presented in this paper, graphology qualifies as a pseudoscience rather than a science. The claims of graphologist Elaine Quigley fail to show that graphology is indeed a science. Instead, she relies on the age of graphology and anecdotal evidence in support of graphology while ignoring issues related to methods for testing graphology’s claims and the results that have resulted in tests of its validity. Looking critically at “discoveries” is no doubt a useful tool that extends beyond the subject of graphology. The methods for recognizing pseudosciences compiled by Cotton and Scalise are certainly tools that would empower all people and prevent them from being fooled by pseudoscientific claims.

Questions and Answers for CPR Peer Reviewers

  • 1. Did the essay begin with a topic sentence?
  • 2. Was the essay free of spelling and grammatical errors?
  • 3. Did the essay present at least four (4) different reasons for supporting or denying the validity of graphology (or handwriting analysis) as a method of assessing personality and/or predicting behavior?
  • 4. Did the essay have balance? Although this may seem subjective, do you feel that it provided a balance among each of the points made? For example, was each point explained in the same amount of detail?
  • 5. Did the author's arguments seem convincing to you?
  • 6. Did the author conclude with any reflection about whether this assignment was or was not helpful to his or her learning? In other words, did the author indicate that this assignment might help him or her judge the validity of explanations of behavior encountered in the popular media (newspaper, radio, TV, magazines, etc.)?
  • 7. How would you rate this text? (Scale of 1 – 10)

Student’s Screen: Detailed Results

Instructor Screen: Student Progress

Instructor Screen: One Student’s Results

Instructor’s Screen: Student Results

  • SLO Data

  • UCLA
  • June 18-20, 2008
  • Calibrated Peer Review: A writing and critical thinking instructional tool. Arlene Russell, UCLA & Tim Su, CCSF

Classroom Responders

Renaissance Classroom Response System

  • PBS Demo

SLOs are here to stay (not a fad)

  • Absolutely here to stay
  • Probably here to stay
  • Unsure
  • Probably not here to stay
  • Will die out for sure

Most valuable tip is…

  • Concentrating SLO work on “robust” or overarching learning outcomes
  • Concentrating SLO work on skills students have difficulty mastering
  • Building SLOs around student development (self-efficacy, goal clarity, etc.)

Renaissance Learning for clicker training resources


Scanning Technology

  • Embedding Questions in Multiple Sections and Classes

Surveys and Tests

  • Online or Scannable
  • Surveys
    • Pre and post surveys of student self evaluation of progress
    • Gather stakeholder (faculty, business community leaders, advisory groups) input on expected learning outcomes
    • Student satisfaction with service (SSO)
  • Quizzes/Tests
    • Practice and graded

Some Survey Software Options

  • Scannable surveys and quizzes - Optical Mark Reader by Remark (OMR Remark)
    • Need software and a Fujitsu scanner
    • Use word processor to create scannable bubble-in surveys or answer sheets.
    • Produces item analysis output.
  • Online survey tools
    • eListen (Scantron Co.)

Excel Spreadsheets

  • Example of autoscoring and record keeping in a Japanese Program.


  • Advantages
    • Document artifacts of learning
    • Support diverse learning styles
    • Authentic assessment
    • Course, program, or degree-level tracking
    • Job skill documentation
  • Proprietary or Open Source
    • ePortfolio and Open Source Portfolio Assessment

  • Lock Assignments after submission
  • Random selection of assignments by learning objective
  • Anonymity of the student who produced the assignment and the instructor
  • Access to the work and the scoring rubrics
  • Reports aggregate scores; generate frequencies/means
  • Ability to download raw data which can be analyzed in another format

Open Source Portfolio

  • Aligned with Sakai
  • Admins or Faculty can structure and review work
  • Learning matrix documents levels of work


  • Calibrated Peer Review:
  • CCC Confer:
  • eListen:
  • eLumen:
  • ePortfolios:
    • Open Source Portfolio:
    • For others, see EduTools ePortfolio product comparison:
  • Online Rubric Builders
    • Rubistar:
    • Landmark Rubric Machine:
    • Coastline Rubric Builder:
  • Remark Survey Software:
  • Renaissance Classroom Responders:
  • SelectSurvey.Net
  • Which tool, if any, are you most likely to use for assessing your SLOs?
  • ePortfolio
  • Calibrated Peer Review (CPR)
  • On-line rubric generator
  • Scanning embedded items on Scantron answer sheets
  • Developing scannable forms using a product like Remark OMR
  • Survey software to capture students’ self-appraisals
  • Adobe PDF forms
  • None of these

Contact Info & Acknowledgements

  • Dr. Jerry Rudmann,
  • Professor of Psychology Irvine Valley College
  • Much of this slide show was adapted (with the express written permission) from Pat Arlington, Instructor/Coordinator Instructional Research Coastline Community College
