2 Evaluation in Higher Education

Key Topics

  • Unique history of assessment in higher education
  • Accountability
  • Program planning and improvement

Background

Program evaluation as an activity and field of study emerged out of the programs sponsored as part of President Lyndon Johnson's War on Poverty and Great Society initiatives of the 1960s (Owen, 2007; Rossi et al., 2004). Under the auspices of these national initiatives, the U.S. government poured millions of dollars into efforts to improve education and health and to promote the well-being of individuals and communities, particularly society's neediest members. With government investment came questions about whether these efforts were achieving desired outcomes. The Elementary and Secondary Education Act of 1965 played a particularly important role, as legislators demanded evidence that financial investments were yielding the intended improvements in preK-12 schools. Social scientists, realizing that the accounting and auditing tools commonly used for evaluation in the for-profit world were insufficient for evaluating the impact of social programs, began to develop evaluation approaches and methodological tools better suited to educational and social programs. Graduate programs followed, training students in the specific sub-discipline of evaluation, and the academic field of program evaluation was born (see Fitzpatrick, Sanders, & Worthen, 2004; Owen, 2007; and Rossi et al., 2004 for more detailed histories).

Colleges and universities have enacted evaluation and quality control activities in ways that differ markedly from preK-12 schools, health care, or social work. Institutional accreditation has, for more than a century, been the main form of quality assurance and accountability for U.S. colleges and universities. Institutional accreditation, which emerged in the late 19th and early 20th centuries, has served the country well by assuring the public that colleges and universities receiving federal funds meet minimum standards. Similarly, specialized accreditation has overseen quality assurance for professional programs. In recent decades, institutional accreditation bodies have increasingly been called on to "enforce" and audit compliance with various federal mandates, including regulations on the credit hour (a common measure of academic effort), faculty qualifications, complaint reporting, and gainful employment of graduates. Both specialized and institutional accreditation rest on the judgments of experts, but while the former focuses in depth on a particular program, the latter does not make judgments about specific programs.

Academic program review emerged in the 1970s and 1980s as state systems of higher education sought to engage in centralized planning and oversight of public colleges and universities. Program review is a process in which colleges and universities periodically examine the status of each academic program. Accrediting agencies now expect both public and private institutions to conduct regular reviews of their academic programs. State higher education systems also began around this time to monitor certain indicators of institutional success.

In the mid-1980s, a series of national reports about the state of learning at the preK-12 and postsecondary levels gave rise to a focus on assessment of student learning outcomes in postsecondary education as a basis for improving the teaching and learning process. One of these reports, A Time for Results, authored by the National Governors' Association (1986), expressed concern that a bachelor's degree was no guarantee of basic literacy and that employers were dissatisfied with the skills of college graduates. The report argued for using multiple measures to assess what undergraduates were actually learning. Since the late 1980s, regional and specialized accreditation bodies have required evidence of outcomes assessment activity and of the use of data to improve the teaching and learning process. Meeting this requirement has not been easy; student learning outcomes assessment remains one of the accreditation criteria on which colleges and universities most often fall short. Regardless, assessment of student learning outcomes is today the main type of routine evaluation activity on college campuses.

Other forces have also spurred the growth of evaluation in higher education. One is federal research and training grants, many of which require some form of external evaluation. Another is the availability of "big data." Traditional forms of evaluation are being supplemented with a plethora of data from learning management systems, institutional records, and accompanying data management platforms, which can be used to monitor important indicators of institutional performance, inform administrative practice, and support evaluation and benchmarking. The language of data-driven or data-informed decision making is pervasive in higher education today (Taylor, 2020).

Within the broad quality framework set by regional and specialized accreditation, engaging in evaluation activities has become an institutional reality for 21st century colleges and universities.

Forces Motivating Assessment and Evaluation

Assessment and evaluation serve multiple purposes in colleges and universities. Accountability often takes pride of place in public discourse surrounding assessment and evaluation in higher education, overshadowing the role of evaluation in program planning, design, and improvement. Although the former may preoccupy senior-level administrators, it is the latter, with its focus on continual improvement, that occupies the attention of most college and university faculty and administrators.

Evaluation for Accountability

Many of the arguments for doing evaluation center on accountability. Suskie (2015) defines accountability as "demonstrating to your stakeholders the effectiveness of your college, program, service, or initiative in meeting its responsibilities…" (p. 58). Transparency, which simply means providing "clear, concise, easy to find and relevant" information to stakeholders, is a key aspect of accountability (Suskie, 2015, p. 58). Additionally, accountability is often associated with external oversight (Hazelkorn, 2021). Pressures on colleges and universities to be accountable for improving learning outcomes and to be transparent about their operations have intensified in recent decades. Government officials at all levels, including federal officials in both Republican and Democratic administrations, state legislators, taxpayers, and parents who question the cost and value of higher education all want to hold higher education accountable. Calls for accountability are not new, but colleges and universities, particularly state-supported ones, seem to have met "the perfect storm" in recent years.

The economic recession of 2008, growing state-level anti-tax sentiment, the Covid-19 pandemic, inflation, predictions of a declining student pool, and coordinated attacks on the value of higher education have made it even more difficult for states to maintain their commitment to funding higher education at past levels (assuming they want to do so). Reduced state funding results in higher tuition at public universities, which in turn may contribute to student indebtedness. Private universities face their own, largely enrollment-driven, financial pressures. Collectively, these influences pressure colleges and universities to demonstrate their value.

To hold postsecondary institutions accountable, both the Obama Administration and private foundations like the Lumina Foundation zeroed in on retention and graduation rates as accountability markers (U.S. Department of Education, 2011), arguing that the U.S. needed more college-educated citizens in order to be competitive in the global market. In fact, under the Obama Administration, the federal government proposed a controversial system to rate colleges and universities based on access, affordability, and quality (Kamenetz, 2014). The resulting College Scorecard still exists as an information tool for parents and prospective students (https://collegescorecard.ed.gov). Requirements to track "gainful employment" of graduates emerged during this period as well. The Trump administration set its sights on accreditation, and the Biden administration on sexual misconduct and enhancing transgender and LGBTQ+ rights, among other issues. Predatory for-profit college practices have also influenced accountability demands. Accrediting agencies have ratcheted up their accountability emphasis as a result.

There has been no shortage of critics from within higher education itself. Scholars such as Arum and Roksa (2010) argued that students do not develop critical thinking skills in college, which the public then interpreted to mean that colleges of all kinds are failing those who attend them. Numerous authors argue that colleges and universities have failed, and continue to fail, to serve students from underrepresented populations equitably (e.g., Hamilton and Nielsen, 2020). In response to these collective critiques, colleges and universities have identified policies and programs to increase the percentage of students who are admitted, retained, and graduated with critical thinking skills and good prospects of employment, all while keeping the costs of college down.

Amidst the chaos and uncertainty facing higher education today, internal and external stakeholders turn to what they perceive to be rational solutions to higher education's problems. Accountability and its tools, assessment and evaluation, are seemingly rational responses meant to ensure that parents and legislators get something for their investment, namely employment. Later in the book, I introduce critiques of the accountability argument that position assessment as a public relations tool more concerned with the appearance of rational action than with results.

A whole host of new challenges, including the impending "demographic cliff," challenges to the value of a college education, attacks on what is being taught, and continued struggles for funding, ensures that the pressure on colleges and universities to demonstrate value will continue. In early 2024, for example, the U.S. Department of Education asked accrediting bodies to hold colleges and universities more accountable (Knott, 2024). Budget stress leads administrators to adopt what Hamilton and Nielsen (2020) call austerity logics, which further drive accountability expectations and, with them, the urgency of assessment efforts. This puts colleges and universities and their various academic and support programs under significant pressure to demonstrate the worth of what they do. In tight economic times, student affairs and co-curricular programs are often even more vulnerable than academic programs because they may be seen as competing with academic programs for limited resources. Colleges and universities can no longer, if they ever could, ignore the question of whether and how well their programs work.

Evaluation for Planning, Design, and Improvement

Accountability is hardly the only purpose for program evaluation. Rather, program evaluation is a healthy and useful process that should be embedded in routine organizational practice. In fact, one characteristic of an effective college or university is the existence of a culture of evidence that informs continual institutional improvement (Middaugh, 2009; Suskie, 2014).

Evaluation assists administrators to:

  1. Identify the nature and scope of problems requiring interventions.
  2. Provide the foundation for good program design.
  3. Gather information that will lead to program improvements.
  4. Gather data that will help administrators and staff know whether and to what extent intended and unintended outcomes are being achieved.
  5. Help an institution demonstrate the cultures of evidence and betterment that are key to institutional accreditation (Community Toolbox, 2016; Suskie, 2014).

From this point of view, program evaluation activities should be part of the normal program planning and development cycle (Kellogg Foundation; Rossi et al., 2004; Work Group for Community Health and Development). Stated differently, employing assessment and evaluation skills is essential to implementing and conducting effective programs that achieve the goals of higher education. Although assessment and evaluation also serve accountability purposes, that should not be their only, and perhaps not their primary, use. For assessment and evaluation to become a normal part of administrative work in higher education, they must be practical, responsive to the needs of evaluation sponsors, and useful. To serve these purposes, administrators must understand evaluation methods, including their strengths and limitations, and engage in effective evaluation activities.

Challenges of Assessment and Evaluation in Higher Education

Assessing and evaluating programs in higher education presents unique challenges. First, problems in colleges and universities are often inadequately understood before interventions are put into place (Kezar et al., 2015). When problems are not fully understood, the programs or interventions created may not be the right ones to address the actual problem. It is common for administrators to learn about interesting programs from colleagues at other universities (for example, at conferences) and try to implement them lock, stock, and barrel on their home campuses without knowing whether a problem exists that the program can address or whether conditions are the same as those on the other campus. The result is often a program that fails, or does not succeed, because it is mismatched with need, is implemented poorly, or is antithetical to the culture of the adopting college or university. On my campus, a former provost decided that the university had to implement a particular student early warning system and simply decreed that it do so immediately, without a careful needs assessment or the involvement of crucial stakeholders. Needless to say, it failed and has been replaced by a different system.

Even if programs are put in place to respond to a specific need, they often lack explicitly stated goals and objectives (let alone identified standards of performance). They may have insufficient resources to be implemented as planned. Programs in higher education, outside of those funded by state or federal grants, frequently do not have a clear beginning and end date. And even if programs do end, termination may not be related to program performance. Therefore, evaluation in its truest sense, making summative decisions about a program's impact and effectiveness, may be less of a priority, and more challenging, than assessment for continual improvement.

Moreover, programs often serve cultural, symbolic, political, or public relations purposes unrelated to outcomes, meaning that there may be little pressure to attend to evaluation results. A University of Kansas summer transition program intended for high school students who could benefit from an extended orientation provides such an example. Year after year, studies showed that the summer program made no difference in first-semester grade point average or retention rates when participants were compared to a matched sample of students who did not participate. (A plausible hypothesis for the findings of no effect is that the program did not attract the students for whom it was intended.) Despite the findings of no difference, the program was retained. One could argue it was retained because it generated some money, provided summer employment for faculty and students, was liked by participants, and simply looked good symbolically. The university was perceived to be doing its part to create a level playing field for students who might benefit from a boost. Eventually the program ended. It is not clear what role, if any, evaluation studies played in that decision.
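
To make the matched-sample comparison described above concrete, here is a minimal sketch in Python of how such an outcome comparison might be run. All numbers are hypothetical, and the sketch assumes the matching itself (on characteristics such as high school GPA and admission test scores) has already been done; it simply tests whether participants and their matched peers differ on the two outcomes the studies examined.

    # A minimal, hypothetical sketch of a matched-sample outcome comparison.
    # The data are invented for illustration; a real analysis would use
    # institutional records for participants and matched non-participants.
    from scipy import stats

    # Hypothetical first-semester GPAs for participants and matched peers
    participant_gpa = [2.8, 3.1, 2.5, 3.4, 2.9, 3.0, 2.7, 3.2]
    matched_gpa = [2.9, 3.0, 2.6, 3.3, 2.8, 3.1, 2.6, 3.2]

    # Two-sample t-test: is the difference in mean GPA distinguishable from zero?
    t_stat, p_gpa = stats.ttest_ind(participant_gpa, matched_gpa)
    print(f"GPA: t = {t_stat:.2f}, p = {p_gpa:.3f}")

    # Hypothetical retention counts (retained, not retained) for each group,
    # compared with a chi-square test of independence
    retention_table = [[72, 28],   # participants
                       [70, 30]]   # matched sample
    chi2, p_ret, dof, _ = stats.chi2_contingency(retention_table)
    print(f"Retention: chi-square = {chi2:.2f}, p = {p_ret:.3f}")

Large p-values from tests like these are what a "no difference" finding of the kind reported year after year would look like in practice.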

Somewhat related to this argument is the notion that higher education is a status-oriented sector that equates prestige with quality. It is hard to encourage elite institutions in particular to engage in the hard work of assessment and evaluation when they benefit from being highly placed in various ranking schemes that are based on inputs rather than outcomes (see Diver, 2022). Unfortunately, student learning has little bearing on institutional rankings.

There are also methodological constraints on conducting assessment and evaluation, as discussed in Chapter 17. College and university offices don't have huge budgets to pay for external evaluations (except for externally funded grants). The result is that most evaluations in colleges and universities are internal: administrators evaluate their own programs. Program administrators often lack the time and training to build evaluations into their planning; conducting evaluation is not their main job. Moreover, it is very difficult, if not inappropriate in many cases, to use experimental designs (often considered the gold standard) to evaluate higher education programs. I will say more about this later.

As a consequence, most evaluations in higher education are commissioned or conducted by administrators, faculty, and staff, often on their own programs. It is, however, becoming more common for institutions and individual units, especially at large research universities, to hire their own student learning outcomes assessment specialists. At the University of Kansas, for example, the offices of Student Affairs, the Center for Teaching Excellence, Undergraduate Studies, the School of Pharmacy, the School of Education and Human Sciences, and the University Career Center, to name a few, have hired individuals with various assessment responsibilities. Not only does the University of Kansas Center for Teaching Excellence have multiple learning outcomes specialists, it now has a data analyst who can help academic departments use existing data to inform decisions.

Internal evaluations have implications for program evaluation, including, but not limited to, the following:

  • Program evaluations are often conducted by busy faculty and administrators who are not trained as professional evaluators and for whom evaluation is not their main role. As a consequence, it is essential to plan evaluations that are meaningful and that can reasonably be designed and carried out with available time and resources (Suskie, 2018).
  • Administrator-evaluators must work with the data and skills available to them. Methods such as experimental and causal-comparative designs require specific kinds of data, mastery of sophisticated research designs and statistical methods, considerable planning, and much time and effort. Although methodological rigor is the ideal, evaluators are often forced to settle for "good enough."
  • Additionally, the nature of the enterprise makes some methodological approaches difficult, if not impossible, to use. Relatively rarely, for example, is it possible to set up true experiments by randomly assigning students (or faculty or staff members) to control and treatment groups.
  • Evaluating one’s own program(s) comes with political and ethical challenges that need to be recognized. These challenges are briefly considered in Chapter 4.

Think about this course, and this book, as an intervention or program attempting to address some of the challenges brought on by "in-house" assessment and evaluation activities. Since you will very likely be called on to evaluate your own programs, or to decide whether to devote resources to full-time assessment staff, this book seeks to help you become an informed user of assessment and evaluation tools.
