9 Assessing Outcomes
Key Topics
- Types of outcomes: A reminder
- Guiding questions
- Steps in planning an outcomes assessment
- Assessment of student learning outcomes
- Determining level of student outcomes assessment in student support units
- Components of student learning outcomes assessment plan
Introduction
Administrators and program sponsors design and deliver programs because they want to effect change in particular groups of people or to bring about larger organizational or social changes, such as improved campus climate, enhanced workplace satisfaction, or reduced food insecurity. Administrators want new students to feel a sense of connectedness to the other members of their class as a result of reading a common book because they know there is a relationship between feeling connected and success in college. The chemistry faculty wants to be sure that chemistry majors know and can do what they expect graduates to be able to do to ensure they will get good jobs or be accepted into graduate school. A human resources office provides many learning and development opportunities for staff with hopes for improved morale, increased skills, and greater work productivity in mind. A campus may provide cultural competence education to faculty, staff, and students with the goal of creating more culturally competent individuals. All of these efforts are geared toward having effects on the knowledge or skills, values, attitudes, behaviors, or status of participants in a program or on a larger social condition, such as workplace climate.
I have broken the discussion of outcomes assessment into two chapters. This chapter provides an overview of assessing outcomes in general. The next chapter addresses the more specific case of student learning outcomes assessment. Although the latter follows the same logic and principles of outcomes assessment described in this chapter, it is shaped by specific expectations of accrediting bodies that make it somewhat different.
A Reminder
In Chapter 5 I introduced the different types of outcomes with which program evaluators are concerned. The focus of this and the next chapter is on assessing the effects of policies, programs, and processes on traditional outcomes—the benefits participants, or groups of participants, accrue from a program, activity, or policy. These can include knowledge, skills, and behaviors, as well as outcomes from a particular policy or from processes such as a student evaluation of teaching system that seeks to affect faculty teaching. They can include many of what Suskie (2018) calls student success metrics. Although acknowledging that there is no widely agreed-upon definition of student success, Suskie (2018) thinks of student success as efforts that help students achieve their ultimate goals. These metrics include a wide range of supports and experiences that help students progress from first year to second year, earn good grades, transfer, and develop new skills. Outcomes can also include grander goals for a larger community, such as improved climate.
This chapter does not cover outcomes a program administrator might have for the program itself; those are likely best assessed through operation and implementation assessment. Although satisfaction and participation are not the same as the knowledge or skills accrued as a result of participation in a program, and are not considered traditional knowledge, skill, or behavioral outcomes, they are often used as indicators of outcomes.
Guiding Questions
Evaluation studies that seek to assess outcomes, like other evaluation activities, should be guided by a set of questions that reflect the fundamental purpose of the activity. These questions often focus specifically on the outcomes a program is intended to influence. As the sample questions below show, outcomes assessment can answer two types of questions depending on from whom you have data and what kind of data you have or can collect. The first purpose is to describe outcomes from programs and the relationships between participant characteristics and outcomes; the second is to determine the extent to which the program impacts or causes the outcomes. This is discussed in greater depth in Chapters 15, 16, and 17.
Some examples of generic guiding questions follow. Please note that you can and should name specific characteristics or variables of interest in the questions.
General Questions
At the most general level, outcomes assessment seeks to answer the following types of questions:
- What are the actual outcomes from participating in the program? What do participants know, or what can they do, at the end of the program?
- How has a process, such as a new student evaluation of teaching instrument, affected teaching?
- What are outcomes from policy X?
- To what extent does participation in a program contribute to the outcomes?
Questions: Participants Only
The following questions are appropriate if you are assessing outcomes for program participants absent any comparison group of non-participants. Without such a comparison group, your task is one of learning the extent to which participants achieved the desired outcomes:
- How well do participants perform on outcome measures?
- Did participants meet expected levels of performance on the outcomes (e.g., what did participants learn)?
- Are participant characteristics correlated to outcome attainment?
- Do the outcomes differ for different groups of participants?
Questions: Comparison Groups
If you have data from pre- and posttests of participants only, you can answer the following:
- Do participant scores change from the beginning to end of the program?
With data from participants and comparable non-participants, you have more flexibility in the types of questions you can ask and the degree to which you can attribute outcomes to the program itself. You might seek to answer questions such as these:
- Are outcomes different for program participants when compared to a similar group of non-participants?
- Is there a greater change from pre to posttest for program participants compared to those who do not participate in the program?
- To what extent are outcomes attributable to the intervention? That is, does the intervention cause the outcomes?
Planning an Outcome Assessment Project
The first task in doing an outcome assessment is to clearly identify expected outcomes and craft outcome statements. Ideally, this is done by program administrators or faculty members before a program is offered so that the task is to assess the extent to which outcomes are met. As noted in Chapter 5, if explicit outcomes do not exist, then it is up to the evaluator to deduce them from the program description, program logic model, or from discussions with administrators and stakeholders.
Crafting Outcome Statements
In keeping with the characteristics of good outcome statements described in Chapter 5, outcome statements should 1) identify the actual outcome and how the outcome is demonstrated, 2) be based on activities that could reasonably lead to the outcome, 3) be observable (measurable) in some way, and 4) be meaningful. In the words of Bresciani et al. (2004), outcomes must be meaningful, manageable, and measurable, meeting the 3 M standard.
The ABCD model, described in Chapter 5, is proposed as a simple but effective guide for writing outcomes. As a reminder, in the ABCD model, A stands for audience (participants), B for the desired behavior (the outcome), C for the condition (the activity or program designed to produce the outcome), and D signifies the degree of performance expected—the level of performance or the percentage of participants achieving the desired level of performance. See CampusLabs and Chapter 5.
Examples of outcome statements using the ABCD model follow; the first is for a hypothetical technology security training program for faculty and staff:
- As a result of participating in the training program (condition), 90% of faculty/staff member participants (audience) will be able to correctly identify examples of phishing (behavior) in an embedded quiz (how demonstrated).
- 90% of students who complete the library module of UNIV101 will be able to use library search functions to generate a list of 15 relevant sources on a topic of choice.
- 75% of students will correctly identify offices where they can go for assistance with various issues and questions.
Determining Expected Levels of Performance
As discussed earlier, one of the important, but often overlooked, aspects of assessing outcomes is identifying the level of performance (the standards) expected. Identifying expected levels of performance is critical because it allows you to compare measured outcomes against those standards and know whether the outcomes are being met at the expected level or rate. In the ABCD model, the expected level of performance is called degree. Ideally, expected standards of performance were specified before the program was implemented. If not, two common sources of standards are those set by external professional organizations and peer performance on the indicator of importance. If you do not have external sources to guide you in setting standards, think about current levels of performance and what would be reasonable to expect. For a faculty international curricular integration training program, planners might set two standards: at least three international sources integrated into the curriculum, implemented by 80% of faculty participants. These standards give you concrete measures to compare with actual performance to determine whether the program is achieving its outcomes at the desired level. This type of outcome also provides very direct and clear indications of what the program should be doing. Be careful about setting 100% as your standard. In some cases, full participation should be expected and can be achieved, but in many cases it is simply unrealistic to expect.
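To make the comparison of actual performance to a standard concrete, here is a minimal sketch in Python using entirely hypothetical data for the international curricular integration example above; the participant names, counts, and thresholds are invented for illustration only.

```python
# A minimal sketch with hypothetical data: compare observed performance to a
# pre-set standard of "at least three international sources for 80% of faculty."
sources_integrated = {          # hypothetical counts of sources integrated per participant
    "faculty_01": 4, "faculty_02": 3, "faculty_03": 1,
    "faculty_04": 5, "faculty_05": 2, "faculty_06": 3,
}

SOURCE_THRESHOLD = 3            # degree: sources each participant should integrate
PARTICIPANT_STANDARD = 0.80     # degree: share of participants expected to meet it

met = [n >= SOURCE_THRESHOLD for n in sources_integrated.values()]
share_met = sum(met) / len(met)

print(f"{share_met:.0%} of participants integrated {SOURCE_THRESHOLD}+ sources "
      f"(standard: {PARTICIPANT_STANDARD:.0%}) -> "
      f"{'standard met' if share_met >= PARTICIPANT_STANDARD else 'standard not met'}")
```

The point of the sketch is simply that, once a degree of performance is stated in advance, the judgment about whether the program met its outcome reduces to a direct comparison.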
Additional examples of outcome statements with embedded standards include:
- As a result of implementing a food pantry, 95% of students and staff will report having sufficient food, defined as having at least two meals a day.
- Eighty percent of staff who participate in college/university-provided sexual misconduct training will score 90% or better on a posttest about various actions that fall under sexual misconduct.
- As a result of training, 80% of student support programs will produce acceptable outcome assessment plans as determined by college assessment director review.
It is common to see outcomes written without standards. As indicated earlier, absence of expected performance standards makes decision making based on results harder. That said, absence of built-in standards also creates flexibility for the program administrator to make judgments free from pre-determined standards.
Mapping Outcomes onto Program Activities
The program logic model, namely its activities and outputs, describes the conditions that presumably lead to the outcomes. If you are developing a program from scratch, you will want to be sure that the program activities can reasonably lead to the stated outcomes. If there is no logic model, you still have to identify where the activities occur that will lead to the outcomes. In addition to assuring that the necessary opportunities for learning exist, the map also provides clues as to where you might collect data. If you are assessing outcomes that have already been established, you still want to identify where participants will learn or practice the knowledge, skills, or behaviors expressed in the outcomes identified as part of your evaluation. If there is no place in the program where the outcome is introduced, taught, and practiced, perhaps you need to add such an opportunity to the program activities. Participants can only be expected to demonstrate what they have had an opportunity to learn. Likewise, programs should be evaluated on outcomes that can reasonably be attributed to program activities.
This is an extremely important, but undervalued, activity. It is surprising how often outcomes and activities to meet outcomes do not match. The process of noting such discrepancies can help program providers decide what it is that they really want to accomplish and how they will do so.
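One simple way to make the mapping exercise systematic is to record, for each stated outcome, the activities where it is introduced or practiced and to flag outcomes with no matching activity. The sketch below does this in Python with a hypothetical outcome map; the outcome and activity names are invented for illustration.

```python
# Sketch with a hypothetical program: map each stated outcome to the activities
# where it is introduced or practiced, and flag outcomes no activity addresses.
outcome_map = {
    "identify phishing examples": ["security training module 2", "embedded quiz"],
    "use library search functions": ["UNIV101 library module"],
    "locate campus support offices": [],   # no activity yet addresses this outcome
}

for outcome, activities in outcome_map.items():
    if activities:
        print(f"'{outcome}' is addressed in: {', '.join(activities)}")
    else:
        print(f"WARNING: no program activity addresses '{outcome}'")
```

Whether done in a spreadsheet, a table, or code, the discipline is the same: every outcome should point to at least one activity, and every major activity should point to at least one outcome.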
Methods of Data Collection: Considerations about Research Design
Once you have identified the guiding questions and identified or written the outcome statements, you must figure out how data will be collected and from whom. It is also possible that the data you have or can get will determine the questions you can ask. It is a waste of time to ask questions for which data do not exist or cannot be collected. Either way, the guiding questions (what you want to know) and the data you have or can collect must match.
In the case of outcomes assessment, as indicated above, the questions you can ask, the conclusions you can draw from social science research (including evaluation research), and the degree of certainty with which outcomes (knowledge, skills, behaviors, attitudes) can be linked to the intervention all depend on the research design employed. The design of your study to assess outcomes will determine how you collect data, from whom, and what you can say about the findings. Conversely, with respect to program evaluation, the data to which you have access may determine the design you can use. Research designs are discussed in more depth in Chapter 15.
Key Determinants of Research Methods Employed
The answers to two key questions drive methodological decisions in evaluation studies. These topics are covered in more depth in Chapters 15-18. The questions are: 1) from whom can you collect data, or from whom do you have data (or from whom does it make sense to collect data), and 2) can you control who gets the intervention (the program) and when? If you have data only from participants in the program, then you use surveys, tests, or interviews to collect data from those participants with the purpose of describing the outcomes for them. The conclusions you can draw are mainly descriptive: e.g., 30% of participants scored at the novice level; 60% chose a major; 75% can identify where to get help; there is a correlation between first-generation status and outcome attainment. One-group pre- and posttest designs, which involve pre- and posttest data from program participants only, also yield descriptive data about participants and how well they perform on the assessment you ask them to complete. If you have pre- and posttest data, you may be able to conclude that participants' scores increased or decreased. Although it is tempting to use pre- and posttest data from participants to draw the conclusion that the program causes the outcome, there are many reasons doing so might lead to faulty decisions. See Chapters 16 and 17.
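As a concrete illustration of what a participants-only, descriptive analysis looks like, the following Python sketch uses hypothetical pretest scores, posttest scores, and performance levels; all values and level names are invented for illustration.

```python
# Descriptive, participants-only sketch (hypothetical scores): report the share of
# participants at each performance level and the average pre-to-post change.
import statistics

records = [  # hypothetical participant records: (pretest, posttest, level)
    (55, 70, "novice"), (62, 80, "developing"), (48, 66, "novice"),
    (71, 74, "proficient"), (60, 85, "developing"), (66, 90, "proficient"),
]

levels = [level for _, _, level in records]
for level in sorted(set(levels)):
    share = levels.count(level) / len(levels)
    print(f"{level}: {share:.0%} of participants")

pre = [p for p, _, _ in records]
post = [q for _, q, _ in records]
print(f"Mean change (post - pre): {statistics.mean(post) - statistics.mean(pre):.1f} points")
# Note: with no non-participant comparison group, this shows only that scores changed,
# not that the program caused the change.
```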
To be able to say that the program caused the outcome, you must be able to collect outcome data for a group that participated in the program and a similar group that did not. Ideally, to establish causation, you randomly assign people to a treatment (program) group and to a control group that receives no intervention, and you give both groups the same "test" at the beginning and the end. Pre- and posttests are often used in experimental designs comparing outcomes for participants and non-participants. These and additional designs that allow you to attribute varying degrees of causation are described in Chapter 17.
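The sketch below shows the basic shape of such a two-group comparison, assuming random assignment and using entirely hypothetical posttest scores; it assumes SciPy is installed and uses a simple independent-samples t-test as one possible analysis.

```python
# Sketch of a two-group comparison (hypothetical posttest scores) when people were
# randomly assigned to a program group and a control group.
from scipy import stats  # assumes SciPy is installed

program_scores = [78, 85, 74, 90, 82, 88, 76, 84]   # hypothetical posttest scores
control_scores = [70, 72, 68, 75, 71, 74, 69, 73]

t_stat, p_value = stats.ttest_ind(program_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# With random assignment and a sufficiently small p-value, a difference in means can
# more credibly be attributed to the program; see Chapter 17 for full designs.
```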
Most of the outcome assessments carried out in higher education, especially for student learning outcomes assessment, employ descriptive designs, the purpose of which is to describe the number or percentage of participants achieving at each level of performance on the desired outcomes. These assessments typically involve participants only and rarely have equivalent groups who did not participate. You may be able to compare outcome attainment for different groups of participants—tenure-track faculty compared to full-time, non-tenure-track faculty, for example. Additionally, it is possible to explore the relationship between demographic characteristics of participants and outcomes, even when simple pretest and posttest measures are employed. Although simple pre- and posttest designs involve a comparison, the most you can say when there is no non-participant comparison group is that outcome scores increased or decreased for participants in a particular program administered at a particular moment in time and for a particular group of students.
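Exploring whether outcome attainment is related to a participant characteristic can be as simple as a cross-tabulation with a chi-square test. The sketch below uses hypothetical counts for first-generation and continuing-generation participants; it assumes SciPy is installed, and the numbers are invented for illustration.

```python
# Sketch (hypothetical counts): is outcome attainment related to a participant
# characteristic such as first-generation status? A 2x2 table and chi-square test.
from scipy.stats import chi2_contingency  # assumes SciPy is installed

#         met outcome, did not meet
table = [[34, 16],   # first-generation participants (hypothetical)
         [52, 18]]   # continuing-generation participants (hypothetical)

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
# A small p-value suggests attainment differs by group; it still says nothing about
# whether the program itself caused the outcomes.
```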
Using designs with comparison groups of participants and non-participants is especially important if the goal is to draw summative conclusions about the efficacy of large, expensive programs in achieving their outcomes and if you want or need to establish that the program causes the outcome. Specific research designs and their implications for collecting, analyzing and reporting data are discussed in more depth in Chapters 15-17.
Qualitative data collection is hard to do in outcomes assessment if one stays true to the principles of qualitative research—and to the premises of outcomes assessment. Outcomes assessment is a product of a world view that is logical and rational and based on a postpositivist notion that a reality—e.g., an outcome—exists that can be observed. The qualitative paradigm, on the other hand, considers knowledge as constructed and given meaning by both the participant and the researcher. Qualitative studies of outcomes are useful if you want to identify things such as what participants get out of a program in their own words, regardless of the stated program outcomes, or how participants understand the program, but they are not particularly well suited to identifying whether participants met specific outcomes, at what level, and in a way that can be compared to other groups of participants. Goal-free evaluation, described in Chapter 18, is a qualitative method for getting at what participants actually learned from a program without regard to the stated outcomes.
Analyzing Data and Drawing Conclusions
I will say more about this in Chapters 15-18, but once you have the data, you need to analyze it and draw conclusions about the extent to which outcomes are met, and perhaps offer thoughts about why or why not. In drawing conclusions, one needs to distinguish between finding a statistically significant difference in the performance of two groups and achieving the desired degree of outcome attainment. One may find statistically significant differences even though the program is underperforming on the outcomes, or vice versa. You could find no statistically significant difference between sub-groups of participants, or between participants and non-participants, because all participants perform well. Assessing outcomes can be used as a summative activity to determine a program's effectiveness in meeting its outcomes.
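The distinction between statistical significance and attainment of the standard can be illustrated with a small sketch. Using hypothetical pre- and posttest scores and an invented standard of 75, the paired t-test below shows a statistically significant gain even though mean posttest performance remains below the standard; it assumes SciPy is installed.

```python
# Sketch (hypothetical scores): a statistically significant pre/post gain can coexist
# with performance that still falls short of the expected standard.
from scipy import stats  # assumes SciPy is installed

pre = [48, 52, 55, 50, 47, 53, 49, 51]      # hypothetical pretest scores
post = [58, 61, 66, 60, 57, 63, 59, 62]     # hypothetical posttest scores
STANDARD = 75                               # hypothetical expected level of performance

t_stat, p_value = stats.ttest_rel(pre, post)
mean_post = sum(post) / len(post)
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}; mean posttest = {mean_post:.1f}")
print("Significant gain" if p_value < 0.05 else "No significant gain",
      "but mean performance is below the standard" if mean_post < STANDARD
      else "and the standard is met")
```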
The next chapter discusses the specific case of student learning outcomes assessment.
Summary
In this chapter, I reviewed some of the principal activities involved in outcomes assessment. The types of questions an assessment of outcomes can answer depend on from whom one has data and what kind of data one has. Most of the outcomes assessment done in higher education involves only program participants and allows the evaluator to describe outcome attainment. More complex research designs involving non-participants and manipulation of the intervention are necessary to determine whether the program causes the outcome. These designs are discussed in Chapters 15-18.