5 Outcomes
Key Topics
- Definition of outcome
- Different types of outcomes and examples
- Ways of expressing and measuring outcomes
- Characteristics of good outcome statements
Introduction
Most programs are goal driven and outcome focused (even if neither is formally stated). They ultimately seek to accomplish something. Goals are typically broad and general in nature and enacted through many smaller, more specific, measurable outcomes. Classic evaluation texts define outcomes as specific statements of the effects of an intervention or program on participants. (Note: The terms goals and outcomes are often used interchangeably. See Chapter 10.) Within human-oriented enterprises such as education, intended outcomes usually involve knowledge, skills, attitudes, and behaviors acquired by program participants. However, higher education administrators are often concerned about other types of outcomes as well, so it is important to distinguish among types of outcomes.
Historically, colleges and universities have been graded on inputs (e.g., attributes of incoming students, amount of research dollars received) and on what they provided for participants (e.g., the services a college provides, what courses it delivered, how many sections were offered, how many activities students could get involved in, how many grant dollars were secured) rather than on outcomes—what participants know or can do as a result of participating in the program or how behavior, skills, and knowledge have been influenced by a program. Until the student learning assessment movement emerged in the late 1980s, colleges focused on teaching (what colleges did to students) rather than learning (what students learned in college). The impact of college was assumed. Quality was, and still is, often equated with measures of institutional status and resources. Likewise, institutions have routinely been concerned with participant satisfaction with services offered (including teaching) and with mere numbers of participants. When outcomes have been measured, they often reflected aggregate, high-level results, such as those described by Howard Bowen (1997) in his classic study of the value of higher education to individuals and society. For example, Bowen found college graduates to be healthier than non-college graduates. Economists and sociologists have long touted the economic benefits of college attendance in terms of lifetime earnings.
Beginning in the 1980s, the outcomes assessment movement changed some of this as accrediting bodies began requiring colleges and universities to focus on the knowledge, skill, and behavioral outcomes achieved through the academic and co-curricular programs they offer. To bring evaluation efforts into alignment with accreditors’ emphasis on student learning outcomes and broad measures of success such as retention and graduation rates, it is important to consider the differences among the types of outcomes programs seek to achieve. I introduce outcomes here and return to them in Chapters 9 and 10.
Types of Outcomes
For most evaluation experts, outcomes of social programs/interventions are what participants or beneficiaries know, believe, or can do as a result of participating in a program or because of a policy. There are, however, other types of outcomes. Following Henning and Roberts (2016, 2024) and Rossi et al. (2004), outcomes can be grouped into several types: (1) Operational/implementation outcomes are those associated with program operations, what Rossi et al. (2004) label as outcomes for program design and delivery, and participation. (2) “True” outcomes are those associated with what participants know, believe, or can do as a result of participating in a program. (3) Aggregate program outcomes are the cumulative effects of meeting outcomes in individual programs or workshops. The latter are broader and longer-term outcomes (Henning and Roberts, 2016, 2024). For example, a diversity training program might lead to specific, immediate outcomes for the individual attendees but also to the aggregate program outcome of a more inclusive campus. Most colleges and universities also have expectations that their actions will have cumulative effects resulting in larger, grander, longer-term outcomes, such as graduates who are employed, are critical consumers of information, and are healthier, happier, and more socially aware citizens (Bowen, 1997).
In the following sections, I will say a bit more about each of the three broad types of outcomes identified above. In addition, since colleges and universities often focus their outcome efforts on participant satisfaction as an indicator of success, I will also discuss satisfaction as an outcome. Programs and policies also have unintended and emergent outcomes that evaluators should not ignore. The following distinction among types of outcomes is important for several reasons, not the least of which is that it is typically far easier to collect data for some types of outcomes, such as satisfaction and participation, than it is to capture what participants know, believe, or can do as a result of participating in programs. It is, however, with the latter that outcomes-oriented assessment is most concerned.
Operational and Implementation Outcomes
Program administrators often have goals and outcomes for the way their programs operate—the design, delivery, and utilization of a program (Bresciani et al., 2004; Rossi et al., 2004). Suskie (2015, 2018) calls these efforts things an institution or program does to or for its faculty, staff, or students. Colleges and universities provide things like courses, programs, services (e.g., advising), staff, opportunities, and training. CampusLabs (n.d.) defines these outcomes as “what a program or process is to do, achieve, or accomplish for its own improvement and/or in support of institutional or divisional goals; generally numbers, needs, or satisfaction-driven.” (Italics added by Twombly.) An example of an operational goal is a goal to increase the number of advisors by five and reduce the advisor-to-student ratio from 1:700 to 1:300. Program administrators often have expectations for these kinds of outcomes and must report progress on them. Henning and Roberts (2016) describe operational outcomes as administrative or service outcomes and say they are “metrics that document how well the operational aspects of a program or an activity are functioning…” (p. 88). Following Rossi et al. (2004), I divide operational outcomes into two main types: participation and use, and design and delivery.
Participation and Use
Participation and use are commonly tracked operational outcomes. Program sponsors might have goals to improve access to and participation in a program. Even if improvement is not a goal, offices are typically required to report use data, and they frequently do. That is, they track usage.
Many efforts in higher education are about increasing use and participation. It is important to have students, faculty, and staff participate in a program and for program planners to know how many people—and who—programs reach. Programs cannot achieve their intended outcomes if people do not participate. For example, for a college or university common book program to meet its objectives, students must read the book. A goal to have 60% of students read the common book is a program participation or use goal. It is an operational outcome. Reading the book is essential. The number of students who read the common book or the number of students who participate in study abroad or use tutoring services are measures of participation and use. They are not, however, measures of what students learned from reading the common book or studying abroad but can be used as measures of student success.
It will perhaps not be a surprise that there are exceptions to every rule. Examples of such exceptions are college attendance, retention, and completion rates. College attendance, retention, and completion are often considered outcomes in their own right, resulting from or correlated with activities provided by the institution. Sometimes retention and graduation are used as proxies for learning. A college degree sends a signal that holders know and are able to do certain things.
Expectations for Program Design and Delivery
A second broad category of operational and implementation outcomes focuses on expectations for a program’s design and delivery. To achieve outcomes, programs must be designed and delivered well. They must have adequate resources—space, funding, people, technology, etc. Often, the focus of this type of evaluation is on satisfaction with various aspects of program delivery. An office may have implicit goals for competent staff to deliver good content. At the University of Kansas, all faculty and staff are required to complete online sexual harassment training. Operational outcomes for this program might involve the number who participate and are satisfied with the quality of the material and the mode of delivery. Unless the content (design) is clear, accurate, and communicated well, the program may not achieve its stated goals. So, units may have explicit or implicit outcomes that content be accurate and understandable. Collectively, operational outcomes are important because they remind leaders to attend to the quality of intended services to ensure that a program can reasonably meet its learning outcomes. In a fully online program, technology use has to be seamless for intended participants. No matter how good a required sexual harassment or alcohol training is, if the platform is not accessible and usable, it can’t have its intended learning and behavioral outcomes. This type of outcome will be explored in greater detail in Chapter 8.
Operational goals and outcomes for program design and delivery can be assessed in a dedicated operation/implementation evaluation, but they are also often assessed as part of a comprehensive outcomes assessment to help contextualize actual outcomes for participants. Many key indicators of program operations are regularly monitored and reported.
Attitude, Behavior, Skill, and Knowledge Outcomes
When traditional evaluation texts talk about assessing outcomes, they are typically most concerned with the effects of program activities on target audiences or populations. The Kellogg Foundation’s definition of outcomes is particularly helpful:
Outcomes are specific changes in attitudes, behaviors, knowledge, skills, status, or level of functioning expected to result from program activities and which are most often expressed at an individual level (W.K. Kellogg Foundation Logic Model Development Guide, p. 8; italics in original).
Referring specifically to learning outcomes, CampusLabs defines student learning outcomes as
cognitive skills that students develop through department interactions; related to measurable, transferable skill development. They are statements indicating what a participant (usually students) will know, think, or be able to do as a result of an event, activity, program, etc. (CampusLabs, n.d.)
Henning and Roberts (2016) call these outcomes learning and development outcomes. Bresciani et al. (2004), echoing Suskie (2009), sum up the difference between operational and learning outcomes this way: “It [outcomes] is not what you are going to do to the student [or faculty or staff member], but rather what you want the student [faculty or staff member] to know or do as a result of an initiative, course or activity” (p. 11). (Bracketed sections added by Twombly.)
The outcomes colleges and universities most want to achieve relate to participant knowledge (e.g., more knowledge of what constitutes sexual assault, knowledge of chemistry), skills (e.g., the ability to write an appropriate resumé, to choose appropriate courses, to identify phishing email messages), and behaviors (e.g., thinking critically, communicating well). Student learning outcomes are a particular type of outcome established for the purpose of determining what students know, value, and can do as a result of participating in a learning activity, whether in the curriculum or co-curriculum. Chapter 10 addresses the specific case of student learning outcomes.
Aggregate Program Outcomes
Educational organizations often have broader goals in mind when they implement programs than the immediate outcomes of those programs for individual participants. These outcomes are what Henning and Roberts (2016) call aggregate program outcomes and the Kellogg Foundation calls long-term outcomes. In other words, the effect of programs can be greater than the sum of their outcomes for individuals. For example, faculty members may be able to demonstrate specific outcomes of a diversity training workshop, but the ultimate goal is a more culturally sensitive faculty and fewer incidents of microaggressions. Larger campus goals may include things like reducing food insecurity, enhancing career readiness and sense of belonging, improving campus safety and climate, and increasing faculty morale, retention, student health and wellbeing, and inclusiveness.
When applied to participation in higher education more generally, program outcomes may be even larger and more abstract than the examples used above. Some of the larger social conditions higher education attempts to address include providing a vehicle for social mobility, increasing health and wellness, advancing social and economic equity, and preparing informed citizens for participation in democracy (Bowen, 1997).
Satisfaction
Satisfaction is often used as a proxy for attitude, skill, learning, or behavioral outcomes in higher education, but its use as a measure of success, or as an outcome, is debated. Bresciani et al. (2004) argue strenuously that higher education must get beyond the use of satisfaction as a proxy for learning. Accreditors generally agree. Although satisfaction is important in the highly competitive environment in which higher education exists, one would be hard pressed to find a college or university that expressly names satisfaction with its services as a desired outcome. Satisfaction, critics argue, is not the same as learning and is therefore not a desirable measure for student learning outcomes assessment.
Is satisfaction ever an appropriate proxy for outcomes? Bresciani et al. (2004) answered in the negative, arguing that satisfaction is a measure of contentment, not effectiveness. On the other hand, some fields, such as social welfare and health care, regularly use satisfaction as a “pragmatic or clinically relevant indicator of the success of social welfare and behavioral health programs” (Fraser and Wu, 2016, p. 762), as well as a means of giving voice to program participants. Michael Middaugh (2010), originator of widely used evaluative tools in higher education such as the Delta Cost Project, would likely agree with Fraser and Wu (2016). He sees measures of satisfaction as key to understanding who attends a particular college and their experiences in college. Middaugh (2010) also argues that faculty and staff satisfaction is important to measure as part of determining institutional effectiveness.
Use of satisfaction as an outcome assumes that there is a relationship between satisfaction with a program or service, degree of program engagement, and the program achieving its desired outcomes. The pro-satisfaction camp argues that participants who are satisfied are more likely to have been fully engaged and to have achieved program outcomes than those who were less satisfied, less involved, and got less out of the program (Fraser & Wu, 2016). This is easy to think about using the example of medicine: presumably satisfaction with one’s primary care physician can be linked to positive health outcomes.
Although I did not find studies of the relationship between satisfaction and outcomes in higher education, Fraser and Wu (2016) conducted such a study in the field of social welfare. Their findings were mixed on whether satisfaction predicts outcome attainment. Nevertheless, the study led Fraser and Wu to make several recommendations for using satisfaction in ways that better represent outcomes. First, they argue that satisfaction should not be used as the only measure of quality and that its use should be accompanied by other measures. Second, when satisfaction is used, Fraser and Wu (2016) argue that three types of questions should be asked: 1) questions about satisfaction with multiple, distinct service elements (as opposed to a single “overall, how satisfied were you?” item), 2) items asking participants whether they would recommend the program, and 3) items asking participants to self-report change or growth as a result of participation or use (p. 772).
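To make these recommendations concrete, the sketch below lays out a minimal satisfaction instrument organized around Fraser and Wu's three question types. The advising-center setting and the item wordings are hypothetical illustrations, not items drawn from their study.

```python
# A minimal sketch of a satisfaction instrument following Fraser and Wu's
# (2016) three question types. The service elements and wording are
# hypothetical placeholders for an advising center, not items from their study.
satisfaction_items = [
    # 1) Satisfaction with multiple, distinct service elements
    {"type": "element", "text": "How satisfied were you with your advisor's availability?"},
    {"type": "element", "text": "How satisfied were you with the accuracy of the information you received?"},
    {"type": "element", "text": "How satisfied were you with the appointment scheduling process?"},
    # 2) Willingness to recommend the program
    {"type": "recommend", "text": "Would you recommend the advising center to another student?"},
    # 3) Self-reported change or growth as a result of participation
    {"type": "self_reported_change", "text": "As a result of advising, I am better able to choose courses that fit my degree plan."},
]
```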
I guarantee that you will use satisfaction as a proxy measure of effectiveness of your programs or be party to evaluation efforts that do. In fact, I would argue that capturing satisfaction with services in ways recommended by Fraser and Wu (2016) might be perfectly acceptable in many functional areas such as disability services, academic advising, or counseling where services are individualized and participation variable. When you use satisfaction, just be aware that satisfaction is not the same as actual learning, skill development, or attitudinal or behavioral change.
Generally speaking, satisfaction is used too frequently as a substitute measure of outcomes, likely because it is easier to measure than more complex learning (Bresciani et al., 2004). If possible, you should not rely on satisfaction alone as a sufficient measure of outcomes and program effectiveness. Regional accreditors, in particular, do not look especially favorably on outcomes assessment that is based primarily on satisfaction measures. That said, assessment that uses satisfaction as the outcome—especially when heeding Fraser and Wu’s advice for a multi-pronged approach—is better than no assessment at all. Additionally, assessing satisfaction may be appropriate and the best alternative for some units. If and when you do use satisfaction, make an effort to heed Fraser and Wu’s three-pronged approach to measuring it to strengthen your argument that satisfaction is a reasonable proxy for learning.
Unintended and Emergent Outcomes
Programs have unintended and emergent consequences: outcomes that cannot be anticipated when a program is implemented. These outcomes are often not evident in program planning or predicted by the literature. An example of an unintended consequence (presumably) would be the case in which a public university adopts more stringent admissions standards that end up excluding populations of students it says it wants to attract. In another example, a study by researchers at the College Board found that students with advantages, such as parents who graduated from college, are more likely than students without those advantages to use College Scorecard data to choose a college. The researchers’ early conclusions were that although the Scorecard was not intended to “exacerbate inequalities,” it is possible that it is unintentionally doing so (Supiano, 2016). When first enacted, the Affordable Care Act was intended to benefit individuals without health insurance or at risk of losing it. However, graduate students as a group have been negatively affected: to avoid having to pay health care costs that are not budgeted, universities have restricted the number of hours graduate students can work to keep them under the ACA threshold. Presumably, the Obama Administration did not intend for graduate students to be negatively affected by the law.
Unintended outcomes are unexpected and are frequently identified in the process of conducting normal assessment and evaluation activities, often when qualitative methods, such as interviews, are employed. The fact that unintended outcomes are not identified in advance does not mean they are unimportant and should be ignored. Nor are they always negative. In fact, unintended consequences can be positive and are often among the most interesting outcomes of an evaluation. An example of a positive unintended consequence: as a result of all but the most essential businesses and educational institutions shutting down early in the Coronavirus pandemic, smog and environmental pollution decreased because people were not driving to work. Fewer people got the flu or common colds because they stayed home and wore masks.
Hussey and Smith (2003) argue that there are also “contingent and emergent” learning outcomes that are rarely acknowledged but that are important to the ultimate learning that occurs in a classroom (or other learning activity). It is highly likely that over the course of a 15-week semester some learning will occur that was not intended. These outcomes may be closely related to the intended outcomes and make a positive contribution to learning in that they “broaden, elaborate, and increase sophistication” (Hussey & Smith, 2003, p. 364). Although the precise definition of emergent learning outcomes is fuzzy, the concept is not: it is hard to identify precisely, ahead of time, all of the learning that will occur in the classroom or from any program.
Writing “Good” Outcome Statements
If you are developing a logic model for a new program or writing outcomes for an existing program, then you will be in a position to write your own outcome statements. Outcome statements are formal expressions of the results expected from a program, how those outcomes will be demonstrated, and at what level or standard. On the other hand, if you are working with an existing program, it may already have written or agreed-upon outcomes. On occasion, you will run into a program that does not have specifically defined, written outcomes (they may be implicit but unstated), and you may have to define them in the process of conducting an assessment or evaluation.
Identifying and formulating specific, clearly articulated, and relevant outcomes that are in some way measurable is hard work; converting outcomes into good outcome statements frequently takes multiple attempts. Chapter 10 will address writing good outcome statements specifically for student learning outcomes.
The following sections cover several crucial aspects of defining outcomes in useful ways.
Sources of Outcomes
Where does one find outcomes for an existing program? An external evaluator would expect program administrators to have established the program outcomes in the process of developing and implementing the program. Presumably the goals and outcomes of a program seek to address a problem or unmet need and are expressed in the logic model or program description. In the case of student learning outcomes, academic programs are required to specifically identify learning outcomes. There are cases, however, in which goals and outcomes are not specifically stated and in which evaluators have to construct or reconstruct outcomes. If outcomes are not explicitly stated in a program description, they may be derived by working backwards from a program’s activities, from interviews with program administrators and sponsors, and from a program logic model, if one exists. For an existing program, it is helpful to sketch out the logic model, but evaluators may have to work with program administrators to do this and to identify specific outcomes. When creating outcomes, remember that stakeholders also have expectations for what the program should accomplish. Sub-groups of stakeholders may have different outcomes in mind, so it is important to collect these various perceptions of outcomes. It should also be mentioned that some evaluation proceeds without regard to goals or outcomes. This form of evaluation is, not surprisingly, called “goal-free evaluation” and is discussed briefly in Chapter 18.
“Good” Outcomes Statements
Outcomes and the measures used to capture them must meet some basic criteria (adapted from Bresciani et al., 2004; CampusLabs, n.d.; Henning & Roberts, 2024; Rossi et al., 2004). I will come back to these in Chapters 9 and 10 in reference to student learning outcomes.
- Outcome statements identify the actual outcome expected and also how the outcome is demonstrated. For example: after completing the cybersecurity training, 90% of faculty and staff can successfully identify a phishing email message in an embedded quiz. This outcome statement tells us what the intervention is (cybersecurity training) and what the outcome is (being able to identify phishing emails). It also tells us what the target for success is (90%), how it will be demonstrated (recognizing a phishing email), and how it will be assessed (an embedded quiz). Examples of good and better outcome statements are found in Chapters 9 and 10, in the CampusLabs guide for writing outcomes (CampusLabs, n.d.), and in any of the dozens of books written about outcomes assessment.
- The outcomes measured must be reasonable and based on program activities that are implemented. It should be reasonable to expect that the program as designed and delivered could affect the outcome in some way. It would be unreasonable, for example, to expect an outcome of student legal services to be correct course selection. Similarly, it is unreasonable to expect any single student affairs or co-curricular program to affect student retention in a significant way. It is much more reasonable to identify specific outcomes a program such as recreation services can directly affect: health, wellness, or engagement. It may be reasonable to expect a university’s collection of purposeful activities to have an impact on retention, and it may be a reasonable expectation for a division of student affairs to positively affect retention through its many programs.
- To reiterate a point made earlier, good outcomes also must be meaningful. Holding constituents accountable for and collecting data on insignificant outcomes is a waste of time and effort.
- Outcomes must be observable and measurable; evaluators must be able to collect some sort of data to know whether they have been met or not. Outcomes such as weight change or income are easily observable. If the phenomenon of interest is not directly observable, you have to identify measures that can represent the outcome of interest. In other words, you have to operationalize the outcome, and this can sometimes be difficult. One semester the students in my class wanted to count participation as a huge portion of their grade. The question then became: how would participation be defined, observed, and measured? Their initial idea was that I (the professor) could do it by observing every student in every class meeting and could grade them based on my assessment of their verbal contributions and nonverbal cues, facial expressions, and so on! This would have been impossible. Ultimately, the class and I determined that we could define participation, but observing and measuring it was a problem. As a result, its percentage of the final grade was reduced and it was defined very simply as class attendance. Student development outcomes are also hard to operationalize in observable ways.
The latter is an example of operationally defining a concept or outcome so that it can be measured. This is the topic to which I now turn.
Operationalizing Outcomes
To be useful in assessment, outcomes must be observable and measurable, as noted in the final point above. The key to constructing good outcomes that capture what they intend to measure, and do so reliably, is identifying them in a way that you can observe or measure. The focus of this discussion is on operationally defining outcomes in ways that enable you to know when they are achieved. Crucial measurement properties of outcome measures, such as validity and reliability, will not be discussed here, although both are important.
Outcomes must be defined in such a way as to clearly indicate what data need to be collected to know they have been met. Operationalizing an outcome answers the question of how you will know the phenomenon when you see it. The measurement—e.g., a specific sense of belonging scale—becomes the definition of sense of belonging. A first step is to identify the outcome—what is expected—and then identify how the outcome, including its component parts, will be observed and measured. Operational definitions often take the shape of a “test,” inventory, or scale consisting of several items that together measure a concept. The Wechsler intelligence scales (IQ tests) were long considered the measure—the operational definition—of intelligence until Howard Gardner came along and identified multiple types of intelligence that are not necessarily captured by standard IQ tests.
Weight lost or dollars earned are easily defined and measured. However, many of the outcomes with which higher education is concerned, such as leadership, sense of belonging, or career readiness, are not directly observable and must be operationally defined to be assessed. As described below, in order to measure sense of belonging, it is necessary to identify components that are reasonable indicators of belonging and that can be measured, such as interaction with faculty or participation in events. These components often come from theory, prior research, or existing validated instruments.
The point is that complex, abstract concepts such as critical thinking, cognitive development, college engagement, satisfaction with work, and leadership must be concretely defined in order to be measured when they are named as outcomes of a program. This is usually accomplished through the measurement tool selected.
Outcomes with multiple dimensions are discussed in more detail below. It should be noted that all operational definitions take a stand in one way or another, reflecting the perspectives of the evaluator. They represent certain ideas and not others. This does not mean they are good or bad. Rather, it means that you need to understand what they include and what is left out. Definitions of measures are often included in an explanation of the limitations of a study.
Outcomes with Multiple Dimensions
Many outcomes have multiple components or dimensions. They are often represented by global or summary measures of a complex concept (e.g., sense of belonging, college readiness, work satisfaction) with multiple component parts each of which must be identified, operationalized, and measured. College readiness is an example of an abstract outcome with multiple components or dimensions. You must have some way of knowing college readiness when you see it. To come up with an operational definition of college readiness you must identify and name specific components and indicators you will use to determine whether students are college ready or not. Dimensions or indicators of “college readiness” might include rigor of academic coursework taken in high school (number of Advanced Placement courses, college prep curriculum, etc.), high school performance (grade point average), social and emotional readiness, and high school involvement (extracurricular activities). Evaluators could calculate an overall number to represent college readiness—likely this would be an average or a total score of individual scores on the underlying dimensions of college readiness. Evaluators would likely look to the literature on college readiness to see how others have operationally defined it.
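To make this concrete, here is a minimal sketch of how such an overall readiness number might be computed. The dimensions, scalings, and weights below are hypothetical illustrations for this example, not an established readiness index.

```python
# Hypothetical operational definition of "college readiness": each dimension
# is rescaled to 0-1 and then combined as a weighted average. The dimensions,
# ranges, and weights are illustrative assumptions.

def scale(value, lo, hi):
    """Rescale a raw indicator to the 0-1 range, clamping at the ends."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

def readiness(student):
    dimensions = {                       # (scaled score, weight)
        "coursework_rigor": (scale(student["ap_courses"], 0, 8), 0.30),
        "hs_performance":   (scale(student["gpa"], 0.0, 4.0),   0.40),
        "involvement":      (scale(student["activities"], 0, 5), 0.15),
        "social_emotional": (scale(student["sel_score"], 1, 5),  0.15),
    }
    return sum(score * weight for score, weight in dimensions.values())

student = {"ap_courses": 4, "gpa": 3.4, "activities": 2, "sel_score": 4}
print(f"readiness index: {readiness(student):.2f}")  # 0.66 on a 0-1 scale
```

Whether readiness is better represented as a weighted average, a total, or separate dimension scores is itself an operational decision the evaluator must be prepared to defend.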
Student engagement (in college) is another abstract and multidimensional outcome that must be operationalized. (It can also be a predictor of other outcomes.) The National Survey of Student Engagement (2018) defines several dimensions of engagement (NSSE calls them engagement themes): Level of Academic Challenge, Learning with Peers, Experiences with Faculty, and Campus Environment (NSSE, n.d.). Each of these themes consists of multiple specific, individual items that, when combined, allow researchers and evaluators to report engagement levels for each theme and to combine them into an overall engagement score.
In another example, if a private, religiously affiliated college has as part of its mission to enhance spirituality, it may want to assess spirituality as an outcome. Before the college can do that, it has to operationally define spirituality—identify characteristics of spirituality in ways that are observable and measurable. The college must answer the question: how would it know spirituality when it sees it? In this case, measures might include the number of times per week, month, or year a student attends church, for example. Or, one could use a spirituality scale consisting of several items to define and measure spirituality. Various scales have been developed to define and measure similarly complex concepts such as cultural competence. Leadership skills, communication skills, and career readiness are yet other examples of complex outcomes with multiple dimensions that must be operationalized before one can assess them.
The previous examples mostly applied to college students; however, the same challenge exists when assessing faculty or staff. Work satisfaction is a function of satisfaction with numerous dimensions of work, such as salary, workplace relationships, benefits, and working conditions. Employees could be satisfied with some aspects of their work but not others. To fully and accurately assess faculty and staff work satisfaction, all of these dimensions should be measured. In short, student engagement and work satisfaction are more complex than just asking students if they are engaged or employees if they are satisfied. Failure to assess all dimensions of such complex constructs may result in faulty assessment conclusions. However, when creating one’s own measures of complex constructs or borrowing someone else’s, one needs to understand the perspectives captured in the scales.
A note on measurement:
From a measurement point of view, statisticians recommend using multiple items to measure a complex construct such as sense of belonging. Not only do multiple items capture the complexity of the construct, but they also enhance the chance that you are really measuring what you intend to measure. So, for example, each of the NSSE themes includes multiple items. The scores on the items are totaled or averaged to obtain a composite theme score. This score is assumed to be stronger than any single item.
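A minimal sketch of this computation appears below. The four items and the 1-4 response scale are hypothetical stand-ins for a theme's items, not actual NSSE survey questions.

```python
# Averaging multiple item scores into one composite theme score,
# assuming a hypothetical four-item theme rated 1 (never) to 4 (very often).

def theme_score(item_scores):
    """Average several item scores into a single composite theme score."""
    return sum(item_scores) / len(item_scores)

respondent = {
    "worked_with_peers_on_projects": 3,
    "explained_material_to_others": 4,
    "prepared_for_exams_with_peers": 2,
    "discussed_ideas_outside_class": 3,
}

# The composite is assumed to be a stronger measure than any single item
# because it captures more facets of the underlying construct.
print(theme_score(list(respondent.values())))  # 3.0
```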
A Single Measure of a Complex Outcome
Occasionally evaluators and researchers will use one indicator or measure to represent a multidimensional construct. For example, one could use a single measure, high school class rank or grade point average, to measure college readiness. Similarly, you could use one global item, “How satisfied are you with your work?”, to represent work satisfaction. When one does this, it is important to clarify what the single measure is and to know the limitations of using it. In education, two of the best examples of using one measure to represent a more complex construct are “free and reduced lunch” and its higher education equivalent, “Pell eligibility.” These are usually treated as input or demographic measures in social science research rather than as outcomes, but they are examples of how a single indicator—family income—is used as a proxy for the more complex concept, socioeconomic status.
Concluding Thoughts
The important takeaway is that the measurement and the definition are inextricably linked. The measurement becomes the definition, and the definition is specified in the measurement. Evaluators (and researchers) must always be clear about how they define and measure inputs as well as outcomes. When a measure of one dimension is used as a measure for a broader construct, as with Pell eligibility, it is especially important for the evaluator to explain how the outcome is defined and to acknowledge any limitations of the measure used. Although there are times when using only one measure of a multidimensional construct is unavoidable, one should strive to measure all the underlying dimensions if at all possible. Using one measure does not help administrators or program sponsors know specifically which aspect of a multidimensional construct matters or is achieved, and in some cases it may not fully represent participants.
Additional Dimensions of Outcomes
Each of the types of outcomes discussed above can have at least two additional dimensions worth considering. Outcomes can be short- or long-term, and direct or indirect, or both.
Short and Long-term Outcomes
As discussed earlier, programs can have both short-term outcomes and long-term (aggregate program) outcomes. This is illustrated in the chapter on logic models (Chapter 7). An example of a short-term (immediate) outcome might be that, as a result of participating in a training program, individual faculty members can identify sexual harassment and differentiate it from discrimination or assault. The longer-term outcome for the individual might be that they behave appropriately and are not reported for offenses to the Office for Civil Rights and/or Title IX office. For the institution, the longer-term aggregate outcome might be that training leads to a decrease in the incidence of faculty members reported for engaging in sexual harassment within five years.
Direct and Indirect Outcomes
Another way to think about outcomes is that some programs seek to directly affect those who participate and indirectly influence others who benefit secondarily. When programs train faculty or staff to learn a new set of skills, they seek to directly affect the faculty members and staff who participate; but in so doing they often seek to indirectly affect a larger group of people with whom the faculty or staff interact. For example, the many development workshops geared toward enhancing faculty members’ online teaching skills seek to directly impact faculty participants but indirectly affect the students taught by faculty members. Housing administrators expect Resident Assistant (RA) training programs to directly affect RA behavior in some way, but they also expect RA training to have indirect effects on a larger population: the students under their charge, and the residence population in general. The same is true for training programs for academic advisors. It is important to keep these different kinds of outcomes in mind.
Writing Good Outcome Statements
Good, clear outcome statements are at the heart of many program evaluation activities. As a result, it is important to know how to write them. The formula for writing them differs a bit depending on the type of outcome involved. To reiterate a point made above, good outcomes must meet the 3 M standard: they must be meaningful, manageable, and measurable (Bresciani, 2004). I will come back to this in Chapters 9 and 10, but a good model for crafting specific outcome statements is described briefly below.
Expressing Outcome Achievement
Good outcome statements involve some expression of degree of attainment. Before discussing two commonly used models for crafting outcome statements, it is useful to consider the following ways in which outcomes can be expressed:
- Number and proportion (e.g., percentage) of a target group that exhibits a behavior, attitude, or skill or performs at each level of attainment. This can be embedded in the outcome statement itself.
- 75% of participants can define compassionate communication correctly.
- Strength of intended outcome. How engaged are students? Are students highly engaged, moderately engaged, not very engaged?
- 5% of the participants report they are highly engaged….
- Level or degree of achievement. What is the degree of attainment? A variation of the strength of attainment, level indicates, for example, whether participants demonstrate novice, emerging, or expert levels. The rubrics professors use to grade student work typically have multiple levels of performance on essential components of an assignment: unacceptable to outstanding, for example. The levels are usually accompanied by a verbal description of characteristics of the level.
- Participants demonstrate proficiency when asked to use the skills in a role play.
- Percentage and level of performance can be combined: Percentage of participants who performed at each level, for example.
- 60% of participants score at the highly proficient level when asked to demonstrate use of skills in a role play.
- Change and growth. In addition to learning the level or degree of outcome attainment, administrators and program sponsors are often interested in change (or lack thereof) in level of performance on outcomes from one point in time to another. A center for teaching excellence might expect faculty member or graduate teaching assistant knowledge and use of good pedagogical skills to increase as a result of participating in a faculty development exercise. Use of terms such as growth, increase, decrease, and change in outcomes requires that evaluators have measures of outcomes at two or more different times and that they compare before and after measures to identify growth or change. In order to know if there is a change from freshman to senior year, it is necessary to have a measure from each year. For short-term programs this is often done using pretest and posttest measures of knowledge or skills for a single event or program (see Chapter 17, and the sketch following this list). Although it is tempting to want to measure “change,” it is hard to do, so be thoughtful about whether you can actually do it before writing it into an outcome statement.
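The sketch below illustrates two of the expressions above under assumed data: the percentage of participants performing at each level, and average pretest-to-posttest change. The scores and level labels are hypothetical.

```python
from collections import Counter

# Hypothetical rubric levels assigned to ten participants in a role play.
levels = ["novice", "proficient", "proficient", "expert", "novice",
          "proficient", "expert", "proficient", "proficient", "expert"]

# Percentage of participants performing at each level.
for level, n in Counter(levels).items():
    print(f"{level}: {100 * n / len(levels):.0f}%")   # novice: 20% ...

# Change and growth: the same participants' quiz scores before and after
# a workshop. Measuring change requires a measure at two points in time.
pre  = [55, 60, 72, 40, 65, 58, 70, 62, 50, 66]
post = [70, 68, 80, 55, 75, 60, 78, 70, 64, 72]
gains = [b - a for a, b in zip(pre, post)]
print(f"average gain: {sum(gains) / len(gains):.1f} points")  # 9.4 points
```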
Models for Writing Outcomes
A common and very useful model for writing knowledge, skill, and behavioral outcomes is the ABCD model (CampusLabs, n.d.): (A) stands for audience, (B) for behavior, (C) for condition, and (D) for degree. This model is particularly useful because it guides the author to explicitly link the outcome to the activity intended (the condition) to produce it, thereby avoiding a common error: an outcome that does not match the program or action intended to produce it. The model is clear and easy to use. Even though the ABCD model is most often applied to student learning outcomes, it can easily be applied to operational or aggregate program outcomes.
An example of an outcomes statement (the desired result for the people or group of people who participate) for a hypothetical training program to help faculty to internationalize a course might be:
As a result of participating in the training program (condition), 75% (degree) of faculty member participants (audience) will identify (behavior) three (degree) relevant international examples that they incorporate into their syllabus (behavior).
I have changed the order in which ABCD is enacted by putting the condition first and degree second. I do this because it focuses attention on the activity meant to achieve the outcome, to ensure that the activity and outcome are related. In the example above, identifying relevant international examples is a behavioral outcome. Incorporating them into the syllabus serves a dual purpose: it is a behavioral outcome, but it also specifies how the outcome is to be demonstrated. In a review of participants’ syllabi, the new examples and how they are integrated should be identifiable. Seventy-five percent and three examples represent target performance expectations. They are performance standards by which you can determine whether the workshop was successful or not.
The ABCD model represents a significant and positive step in how outcome statements can and should be written. Just look around at the various convoluted ways in which many outcomes are written! With its model, CampusLabs has provided an easy formula that results in significantly improved outcome statements, eliminating many of the problems with vague, poorly worded ones. Although the model provides the form, it does not fill in the ideas; you still have to supply those, but you can focus on the ideas without having to worry about form.
There are other similar formulas for writing outcome statements, one of which is called SWiBAT. Although the “S” applies to students, it could easily be rewritten as FWiBAT, substituting faculty or any other group, such as staff, for students. Spelled out, this formula suggests that outcome statements be written as “Students (faculty, staff, parents, etc.) will be able to do X.” Both ABCD and SWiBAT require the outcome author to specify the desired outcome or behavior. Expected outcomes or behaviors can be guided by learning frameworks such as the commonly employed Bloom’s taxonomy or alternatives such as the Medicine Wheel Taxonomy or the Taxonomy of Significant Learning (Henning & Roberts, 2024; Suskie, 2018). I should add that these taxonomies of learning are not simply outcomes but rather fundamental ways of conceptualizing knowledge that should be embedded in program construction, not simply applied post hoc for assessment purposes.
Outcome Statements for Operational Outcomes
There is no particular formula for writing outcome statements for program design, delivery, or participation. Some examples include:
- The Advising Center will add 10 new advisors over the next 3 years.
- By hiring X more advisors, the advisor-to-advisee load will decrease from 1:700 to 1:300 (see the arithmetic sketch below).
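As a quick check on the arithmetic behind the second statement, the sketch below computes how many additional advisors a unit would need to reach a target ratio. The student headcount of 21,000 is a hypothetical assumption chosen to match the 1:700 starting ratio.

```python
import math

students = 21_000                            # hypothetical advisee headcount
current_advisors = students // 700           # 30 advisors at a 1:700 ratio
target_ratio = 300                           # desired advisees per advisor

needed = math.ceil(students / target_ratio)  # 70 advisors for 1:300
print(f"hire {needed - current_advisors} more advisors")  # hire 40 more advisors
```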
Aggregate or Overall Program Outcomes
The ABCD model comes in handy again for writing aggregate program outcomes, albeit at a more general, longer-term level. Aggregate program outcome statements might look like the following:
- As a result of cultural competence education provided to faculty and staff, the incidence of reported microaggressions in the classroom will decrease 20% over a two-year period.
Or:
- As a result of cultural competence education, in five years the campus will be rated as an inclusive environment by faculty, staff, and students.
Crafting good outcome statements takes practice. ABCD provides a good model, but it assumes that outcomes have already been identified and defined. You have to know what your program hopes to accomplish before you can write the outcome statements.
Summary
Understanding the different types of outcomes becomes crucial as I move into describing specific types of evaluations. Operation/implementation evaluation is primarily concerned with operational outcomes, whereas outcomes assessment is about outcomes for individuals: what they learn, what they can do, or what their attitudes are as a result of participating in a program. Abstract outcomes must be operationalized (defined by the measurement). Good outcome statements identify the actual outcome expected, are reasonable and based on program activities, and are observable and measurable so that the outcome can be reliably assessed.