8 Assessing Program Operations and Implementation
Key Topics
- Definition of operation (implementation) assessment and monitoring
- Appropriate questions about a program’s design, delivery, and utilization
- Methods to answer questions about program design, delivery, and utilization
Introduction
It is one thing to plan a program to address a need or problem on campus. It is quite another to implement that program faithfully and well. Webster’s Dictionary defines the verb to implement as to “carry out, accomplish; especially: to give practical effect to and ensure of actual fulfillment by concrete measures” (Webster’s Dictionary Online). As used in reference to program development, use, and evaluation, to implement is to enact the planned program effectively so as to accomplish one’s goals. Operation/implementation evaluation questions typically center on aspects of the program that must be implemented to achieve the stated outcomes. These include how a program is designed, how it is delivered, and who uses it—the concrete components of a program or intervention necessary for it to have its intended effects.
Failures of the operational aspects of a program—the design, delivery, and utilization of the services offered—are quite common and can keep a program from achieving its intended outcomes. Perhaps the program is not funded at the necessary level, is not delivered as planned, or its staff members are not well trained. When this happens, participants may not receive the intended services. Or the people for whom the program is designed may not participate. It is unreasonable to expect a program to achieve its outcomes if it is not designed and delivered competently and as planned, or if the intended participants do not or cannot participate.
Assessing program operations and implementation is an essential component of developing and maintaining effective programs. It is nearly impossible to design perfect programs. Program planners and administrators need to know what is working and what is not working well to improve programs so that they achieve their goals, and this is what operation/implementation assessment does.
Various terms are used to describe the type of evaluation considered in this chapter. Some authors call it process evaluation while others use the term implementation evaluation. Still others simply refer to it as formative evaluation. Henning and Roberts (2016, 2024) use the term program operations to refer to the typical focus of this type of evaluation. For the purposes of this book, I will use the term operation/implementation evaluation, and I will often spell out the three main categories of program characteristics with which this form of evaluation typically deals: program design, delivery, and utilization. Finally, as discussed below, operation/implementation evaluation can stand alone and be the main focus of an evaluation OR it can be combined with outcomes assessment.
Operation & Implementation Assessment
This chapter will discuss operation/implementation assessment primarily as a free-standing form of evaluation. Questions about the operational aspects of a program may also be included as part of any outcome assessment. In this case, the operational data can help to explain why a program does or does not produce the intended outcomes.
Rossi et al. (2004) stress that
Ascertaining how well a program is operating … is an important and useful form of evaluation, known as program process evaluation. (A widely used alternative label is implementation evaluation.) It does not represent a single distinct evaluation procedure, but rather, a family of approaches, concepts, and methods. The defining theme of program process evaluation (or simply process evaluation) is a focus on the enacted program itself—its operations, activities, functions, performance, component parts, resources, and so forth. (Italics in the original. p. 170; bold by Twombly.)
Henning and Roberts (2024) define operational outcomes in a somewhat narrower way, as administrative or service outcomes (p. 90). For them, operational outcomes are “metrics that document how well the operational aspects of a program or activity are functioning…” (p. 90).
Operation and implementation assessment focuses on aspects of program functioning: the design of activities, the method and quality of delivery, and who and how many participate in a program or use its services. Think about all the things that go into designing and delivering one of the programs with which you are associated. Activities must be designed, and adequate resources must be provided, including presenters or trainers to deliver the program. The program has to be advertised to the intended audience, and it must be offered. The trainers or leaders need to be well prepared and good communicators. Software must function properly and be usable by participants. People must participate. In short, to achieve intended learning or program outcomes, activities must be implemented well, and the intended participants must participate. Program operation/implementation assessment focuses on these aspects of the design and delivery of the program. It is internally focused on the program itself.
When program administrators assess program operations, they generally have formative purposes in mind. That is, they seek information about a program’s operations in order to improve its functioning. As mentioned above, assessment of program operations is concerned with three aspects of a program: (1) program organization and design, (2) program delivery (e.g., quality of program materials and methods of delivery), and (3) who and how many participate in the program or use the service (Do the intended participants actually participate?). I refer to the latter as program utilization or participation. Within these three broad categories of concerns, almost any question may be of interest to evaluators.
Approaches
There are at least three approaches to assessing program operations and implementation. Each approach assumes that a program has a defined set of activities leading to some sort of desired outcomes for participants in the short run and a larger impact on people and organizations in the long run. Programs can range from traditional programs, such as new student orientation or a faculty training program, to things such as the implementation of a learning management system or a dashboard to communicate institutional data. Operation and implementation assessment is not typically appropriate for assessing something like an administrative position.
- A one-time, stand-alone evaluation for which the focus is the design, delivery, and participation or utilization of a program. This approach is particularly useful for new programs. An example of such an evaluation is when, at the end of the first year of a new transfer orientation program, a campus collects information about participation in the program and about the program’s design and delivery: Was it designed and carried out as planned? Were the materials used effective? If not, what was missing and why? Did the intended audience participate? A main goal of operation/implementation evaluation for multi-site programs is to be sure that the program was implemented in the same way across all sites and each time it was offered. Evaluation experts refer to this as fidelity of implementation.
- Program operation monitoring, a routine collection and review of data that serve as indicators of program implementation. Although I am making a distinction between monitoring and other approaches to operation/implementation evaluation, they are often conducted simultaneously and are certainly not mutually exclusive. The main difference is that monitoring involves routine collection of data on a defined set of (typically) quantitative indicators, such as use, whereas a one-time evaluation typically collects a broader range of data about a program’s design, delivery, and use and may only be conducted once or periodically. These differences are discussed in greater depth later in this chapter.
- An integrated approach in which assessment of program operations and participation is conducted alongside outcomes assessment. In this case, results of asking operation and implementation questions help administrators understand outcome data.
To reiterate, the rationale behind assessing a program’s operations is that a program must be delivered and must function well to achieve its intended outcomes. Likewise, without implementation evaluation, it is difficult to know why a program failed to achieve its intended outcomes. In sum, it is difficult to evaluate the intended outcomes of a program without first being confident that it has been designed well, delivered as planned, and is reaching the intended audience at the expected level.
Guiding Questions
For this section, I use the sexual misconduct training program outlined in the previous chapter to illustrate the kinds of questions (that is, questions about the design, delivery, and utilization of program services and administration) that might be relevant for an assessment of operations and implementation. Before getting to the case example, it is useful to consider some of the common things that can go wrong in program design and implementation that could negatively affect a program’s ability to achieve its intended outcomes.
An Example
As a reminder, a comprehensive university campus just implemented a sexual misconduct training program for residence hall staff and fraternity and sorority leaders in response to a needs assessment indicating that students don’t really know what actions fall under the heading of sexual misconduct or how and where to report incidents of such misconduct. The intervention seeks first to educate RAs and fraternity and sorority leaders, who will then hold workshops for the students with whom they work. The focus of the training is (1) what sexual misconduct is and (2) how and where students should report incidents. The ultimate goal is that all students will know what constitutes sexual misconduct and where to report incidents, and will feel safe reporting them. In the long term, the number of incidents should decrease, even if the number reported increases in the short run. Since the program was recently implemented, you plan to do a formative implementation evaluation. Let’s assume that organizers set a target that 90% of RAs and student leaders will participate and that 90% of these will provide training to their residents or group members. These are intended operational outcomes.
What Could Go Wrong?
In the real world, there are many obstacles to effective program operation and implementation that can impede the effectiveness of a program in achieving its goals (Rossi et al., 2004). The following list of obstacles or common errors may help you to think about the type of questions an operation/implementation evaluation could address:
Program Design and Delivery
- The program implemented does not meet needs or is designed for a different audience than the one to which it is delivered. (Ideally, a good logic model will prevent this, but to do so, a logic model must actually have been developed.)
- The program is not implemented as designed, or services are provided incompletely or not at all. The term fidelity is often used to describe the extent to which a program is implemented in accordance with the program’s intent and original design.
- A program’s materials or activities are poorly designed and delivered (e.g., instruction is poor, materials are unclear).
- The required delivery is so complicated that it is difficult to implement the activities correctly, as intended, or in all sites. Early in the student learning outcomes assessment movement, many units developed elaborate assessment plans that collapsed because they were too time-consuming and complicated to enact.
- A multi-site program is implemented very differently at each site or by different presenters or leaders.
- A program is planned but there are insufficient resources to support full implementation. Faculty members at the University of Kansas often complained that the Blackboard LMS did not do things they wanted because the full program was not implemented.
What Could Go Wrong? Participation and Use
Obstacles to participation or use of services include:
- Political, social, economic, and cultural issues may affect participation and use. For example, if an intervention runs counter to cultural beliefs, members of the intended audience may refuse to participate or use the intervention as intended.
- The program implementation and service delivery do not facilitate participation by the intended population. An example is an app-based program targeted at individuals who do not have access to WiFi, or a program that is offered at a time when the target population is unable to attend.
- One group participates at a higher rate than other groups, or the participants are not the intended ones. For example, if the majority of students who take advantage of a summer bridge program are high-achieving students, this can skew the effects of the program.
- Intended participants do not know about the program or do not know they are eligible.
- Participants do not participate fully. This is a potential problem in long-term, activity-intensive programs like the Wonder weight loss program in which many university employees participate. If individuals sign up but don’t watch all of the videos or adhere to the other program activities, then the outcomes of the program may be diminished. Partial participation is also a problem in short-term voluntary programs.
These are some common obstacles that can affect the implementation of any intervention and inhibit successful outcome attainment. As such, they provide some ideas about the kinds of things evaluators might focus on in an operation/implementation assessment. Not all may be relevant to any one program, but it is usually easy to identify those that are key to the effective implementation of a specific program and to focus on those factors in an operation/implementation assessment. To reiterate, the overall purpose of paying attention to program operations is program improvement to maximize program effect.
Potential Guiding Questions
Following are some potential questions to guide an assessment of the program’s operations and implementation:
- Participation:
- How many student leaders and residence hall staff participated?
- Are there identifiable groups of intended participants who did not participate? (For example, maybe the housing staff in the hypothetical Jayhawk Complex did not participate.) Participation is key. The program can’t achieve its intended outcomes if staff members do not attend.
- If some portion of the intended population did not participate, why not?
- Design and delivery:
- Is the program being delivered as designed?
- Were the stated activities delivered? As frequently as planned?
- Was the program delivered well? Were presentations clearly communicated and of high quality?
- Are participants receiving the intended amount of content? And, if the program is multi-site, are all sites delivering the same amount and quality of content?
- Were the individuals who led the training sessions knowledgeable and well prepared?
- Was the program delivered as designed and in the same way in all locations? For example, were all of the training sessions the same for all residential complexes and all student leaders (same content, same quality of presentations)?
- Does the program have adequate resources and are those resources used efficiently? Are shortcuts being taken due to lack of resources?
- Is the program offered at a time and place the intended participants can make use of it?
- Are facilities suitable?
- Are materials available and of high quality?
- Overall satisfaction and early outcomes:
- Are participants satisfied with the program? Would they recommend participation to others? If not, why not? Satisfaction can provide useful information about whether and to what extent a program is being implemented well. Gathering satisfaction data can be particularly helpful if you also learn why participants are or are not satisfied (Middaugh, 2010).
- Outcomes: Do early outcomes suggest that the program is working? Do 95% of attendees recognize what constitutes sexual misconduct and where students should report it? Do the trained students convey the information to their groups? Why or why not? Do student leaders receive the necessary support to do so? How do participants rate their learning?
Offices will also have stated operational outcomes, such as increasing staff, lowering wait times, or serving more people. These goals for the program itself can and definitely should be assessed as part of an assessment of operations.
To repeat, the rationale for program operation/implementation assessment is that if the design and delivery of a program’s activities are lacking or participation is not what is intended, the program can’t be effective in addressing the problem it was designed to address. Providing information on which to base improvement is the goal.
Methods
The methods used to collect data to inform an operation/implementation evaluation are as diverse as the questions asked. At the outset it is important to note that some aspects of implementation assessment can be, and typically are, done in conjunction with outcomes assessment. Most of the above guiding questions can be answered using the following methods, described in more detail in previous chapters and in Chapters 15-18:
- Document review
- Existing data
- Data against which to benchmark program performance
- Interviews/focus groups
- Surveys
Assessing program operations is important early in a program’s life but remains important throughout. Planning for all aspects of operation/implementation evaluation, including data collection, is (or should be) heavily dependent on what program stakeholders want to know, leading Owen (2007) to call operation/implementation evaluations interactive evaluations. The idea behind interactive evaluations is that the more stakeholders are involved, the more the results will respond to what they need and the more likely they will be to use the results to make improvements.
Operation and implementation evaluations are more likely to employ descriptive methods (surveys, document review, interviews, focus groups, and observations) and very unlikely to use more complicated designs such as true experiments. Formal research methods, such as surveys, can be used; they are discussed in Chapter 16 and in texts such as Introduction to Educational Research (Mertler, 2016) and Schuh et al. (2016). Qualitative methods are well suited to some operation/implementation questions.
Monitoring Program Implementation
According to Rossi et al. (2004), “[M]onitoring is the systematic and continual documentation of key aspects of program performance that assesses whether the program is operating as intended or according to some appropriate standard….” (p. 171). Program administrators likely assess some aspects of program operations when they monitor or routinely examine certain kinds of data, such as use rates. Monitoring program operations is useful for programs that are ongoing or offered repeatedly.
Method
Monitoring program operations typically involves identifying key aspects—indicators—on which the program’s services can be assessed systematically and routinely. These indicators tend to be quantitative and to fall into the categories of participation, money spent, or satisfaction, all of which are particularly well suited to monitoring, but they could well include other indicators of program design and delivery (e.g., attendance at a common book program or the cost of activities). The first step is to identify the important indicators of program design, delivery, and participation that merit monitoring.
Once identified, data on the indicators must be routinely collected and stored in accessible forms. Many colleges and universities have some form of academic information management system. These systems typically store data on indicators such as student credit hours generated, number and demographics of students, major, measures of student ability (such as ACT scores), amount of student indebtedness, and time to degree. They also maintain databases on faculty, teaching loads, course enrollments, and a host of other academic indicators. These databases have the advantage of being continually updated to reflect current information and are used by administrators, such as deans and department chairs, for academic planning. For example, a department chair can use these data to monitor enrollment trends across years and could potentially use them as a basis for making changes to an academic program. These types of data are also used in program review as the basis for making an argument about program quality and resource requests.
Student affairs and student success units will likely have access to some types of automatically collected data about the programs with which they are concerned but may need to keep their own data on relevant indicators. Some institutions use ID swipe technology to record student participation in events and appointments. This allows such databases to be used to monitor participation, although the methodology comes with some privacy and surveillance concerns discussed in Chapter 11. Indicators such as program participation, office visits, and satisfaction can also be monitored.
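Monitoring of this kind does not require elaborate infrastructure. The sketch below, a minimal illustration in Python, shows one way a unit might log observations on a few indicators term by term and flag any that fall short of a pre-set target. The indicator names, targets, file name, and values are hypothetical and added only for illustration; they are not drawn from any actual program.

```python
import csv
from pathlib import Path

LOG_FILE = Path("program_monitoring.csv")  # hypothetical local data store


def record_indicator(term, indicator, value):
    """Append one observation (e.g., an attendance rate or a satisfaction mean)."""
    new_file = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["term", "indicator", "value"])  # header on first write
        writer.writerow([term, indicator, value])


def flag_below_target(targets):
    """Return (term, indicator, value) rows that fall short of their target."""
    flagged = []
    with LOG_FILE.open() as f:
        for row in csv.DictReader(f):
            target = targets.get(row["indicator"])
            if target is not None and float(row["value"]) < target:
                flagged.append((row["term"], row["indicator"], float(row["value"])))
    return flagged


# Example use with hypothetical values: log two indicators for one term,
# then flag anything below its target.
record_indicator("Fall 2024", "training_attendance_rate", 0.60)
record_indicator("Fall 2024", "satisfaction_mean", 4.1)
print(flag_below_target({"training_attendance_rate": 0.90, "satisfaction_mean": 4.0}))
```

In practice, the same idea can be carried out in a spreadsheet or an institutional database; the point is simply that routine collection against explicit targets makes periodic review straightforward.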
Administering repeated surveys is another option for monitoring. When shared services centers were implemented at the University of Kansas to consolidate business, purchasing, and human resources support services for departments, one unit’s dean collected monthly survey data from department chairs about performance of the service centers. This allowed him to track perceptions and attitudes about various aspects of operation over time.
College campuses are required to monitor campus crime data and report them annually to the public. Over time, these data can be used by a campus to monitor trends and as a springboard for identifying areas for action. Most state university governing boards require each university under their purview to provide yearly data on a set of important performance indicators. Even data from the Integrated Postsecondary Education Data System (IPEDS) could be used over time to monitor some aspects of a college’s functioning.
Monitoring is suitable for some questions about implementation assessment but not all. For example, it may be possible to record and monitor students’ use of the advising center and their satisfaction with it, but it is not as easy to monitor student experiences with the advising itself. The latter requires interviewing advisors or students, or at a minimum surveying them, over time; this is more difficult but not impossible to do. Rarely is implementation monitoring sufficient as a basis for program modification. It works best as a complement to interviews and surveys that capture more in-depth information about program implementation and as a trigger to explore concerning data trends in more depth.
Drawing Conclusions
Assuming the evaluation plan has been carried out and data have been collected, evaluators must draw some conclusions: Is the program working as designed? Are delivery methods effective? Should changes or modifications to the program be made and, if so, what changes? Ideally, the sponsors of the operation/implementation evaluation would have set some performance expectations ahead of time. To explore the challenges of drawing conclusions, I come back to the example of the sexual misconduct training program.
As a result of the implementation assessment of the RA/student leader training program, imagine that:
- Participants thought the instructional materials used were understandable and of high quality. Means on Likert items ranged from 3.7 to 4.3 (out of 5).
- Participants thought the staff delivering the program were knowledgeable, but only a slight majority (55%) were satisfied with the program delivery methods.
- Intended participants were aware of the program.
- The program was offered at a convenient time.
- About 60% of the RAs and student leaders participated.
Once evaluators have information on questions of delivery and participation such as these, they need some basis for deciding whether changes ought to be made and, if so, which ones. This is where having a set of performance standards or targets established prior to collecting evaluation data comes into play. As an administrator, the question facing you is whether, according to the hypothetical data from the sexual misconduct training program above, operation and implementation performance met expectations. In this case, program planners had hoped that 90% of RAs and student leaders would attend, but only 60% did. Presumably program administrators would use this information as a basis for finding out why attendance was low and then implement changes to increase participation rates. The next year it would be reasonable to expect a greater percentage to participate. Had they not set attendance expectations ahead of time, they would have had no basis for deciding whether 60% attendance was good or bad.
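To make the judgment concrete, a brief sketch (again in Python) lays the hypothetical results above alongside performance targets. The 90% attendance target comes from the chapter’s example; the satisfaction and materials targets are assumptions added purely for illustration.

```python
# Observed results from the hypothetical implementation assessment above
observed = {
    "attendance_rate": 0.60,             # 60% of RAs/student leaders attended
    "delivery_satisfaction_rate": 0.55,  # 55% satisfied with delivery methods
    "materials_mean_low": 3.7,           # lowest Likert mean on materials items
}

# Targets: attendance comes from the example; the others are hypothetical
targets = {
    "attendance_rate": 0.90,
    "delivery_satisfaction_rate": 0.80,  # assumed target
    "materials_mean_low": 4.0,           # assumed target
}

# Compare each observed value to its target and note which merit follow-up
for indicator, target in targets.items():
    value = observed[indicator]
    status = "met target" if value >= target else "follow up"
    print(f"{indicator}: observed {value} vs. target {target} -> {status}")
```

A simple table or report like this does not decide anything by itself; it only makes visible which pre-set expectations were and were not met, so that the follow-up conversation can focus on the shortfalls.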
It is relatively easy to set standards for components, such as participation, that are easily measured and captured. Governing boards may set standards for indicators of program performance such as enrollment, graduation rates, and employment rates. When these indicators are lower than desired, they typically prompt further study to find out the reasons.
It can be more difficult to set standards for whether program activities are delivered faithfully or well, or for what percentage of respondents should be satisfied. In the sexual misconduct training example, you would also need to decide whether the mean scores on items regarding instructional materials and quality are acceptable or merit changes. When program administrators have no established performance targets, they often make a judgment based on their own expectations.
It is generally not a good idea to make major program changes based on a very small number of complaints or because one trainer received low satisfaction ratings one time. It may be helpful to weigh the potential consequences of making or not making program changes in light of the data; it is easy to overreact and make changes that cause more problems than they solve. If only 60% of RAs hold discussions about the common book, you would want to work toward greater involvement of RAs, but this can be done over time. In contrast, one might want to make changes in response to only a few complaints of RAs not providing the intended sexual misconduct training in their halls, as the consequences could be great.
When operation/implementation evaluation is done in concert with outcome assessment, administrators and evaluators are often interested in identifying the factors about program design, delivery, and use that might be affecting the outcomes. The knowledge gained might lead to program changes but also can be used simply to understand the outcomes.
Summary
To be effective, programs must be designed and delivered well and must reach the intended participants. The family of evaluation activities used to assess the extent to which programs are designed and delivered effectively is classified as operation and implementation evaluation. These evaluation activities may be stand-alone, one-time evaluation studies or ongoing monitoring studies, or they may accompany more comprehensive evaluations of outcomes. They focus on questions of program design, delivery, and use and are crucial in establishing that a program is in the best position to achieve its specified outcomes.