A New Age of Implementation: Guiding Principles for Implementing Performance Assessment Systems


Gary Chapin is senior associate for Quality Performance Assessment at the Center for Collaborative Education. Laurie Gagnon is director of Quality Performance Assessment at the Center for Collaborative Education. Virgel Hammonds is chief learning officer of KnowledgeWorks and former superintendent of Regional School Unit 2 in Maine.

In an examination of the conditions required for the successful implementation of performance assessment, the authors draw on a range of personal experience and other insights to guide practitioners and policymakers.

During the 2000s, some practitioners and education researchers aimed to produce pedagogy that embodied the ideals of equity of opportunity and outcome, including initiatives such as personalized learning, competency-based learning, flexible pathways, attention to habits and dispositions, and authenticity in learning and assessment. Taken together, these initiatives provide ways for students to learn and demonstrate their learning, directly address the transferrable skills that are the foundation for all learning regardless of content, shape learning so that it is genuinely relevant to each student, and elevate student agency as a value. At the center of all of these practices is performance assessment, which can be defined as “multi-step assignments with clear criteria, expectations, and processes that measure how well a student transfers knowledge or applies complex skills to create or refine an original product” (CCE 2012).1

Performance assessments come in many forms – artistic performances, labs, exhibitions of research, internships, and portfolios. Students show us not only that they know something, but also that they know how to use that knowledge (or skill). Brian Stecher (2010), writing for the Stanford Center for Opportunity Policy in Education, expresses a definition that is broader and somewhat more elegant. He writes that performance assessment is “judging student achievement on the basis of relatively unconstrained responses to relatively rich stimulus materials.” A lot is unsaid in Stecher’s definition, but he captures the ethical imperative, pointing not only to the technical aspects of performance assessment, but also to the values and cultural changes it implies.

Building on the authentic assessment work of the Boston Pilot Schools (CCE 2004), in 2008 the Center for Collaborative Education (CCE) began developing a system to design and implement performance assessment that was research-based and educator-driven and that achieved a high level of technical quality. We conducted research and worked with a cohort of educators to bridge research and practice. The initiative culminated in the assessment model Quality Performance Assessment (QPA).2

What Do We Know About Successful Implementation?

Gregg Palmer, principal of Falmouth High School in Maine and an early champion of standards-based reform in that state, once said, “The most dangerous time for any innovation is when you try to scale it.”3 For the purposes of this conversation, “scaling” and “implementing” are synonymous, defined as enacting an innovation in a new space. Our experience lends credence to Palmer’s view that, “When an innovation fails, it is most often the implementation – rather than the innovation – that goes awry.”

In the early 2000s, a number of states attempted to foster Comprehensive Local Assessment Systems using standards-based criteria and reporting. In Maine, the effort (in which two of the authors, Gary and Virgel, participated) was indeed comprehensive, but not sustainable. The strict demands of validity and reliability took far too much time, and the recording requirements created so much documentation that it was common in Maine to quip that the effort “died under the weight of its own paper.” Other efforts, as reported by Tung and Stazesky (2010), have ended because funding ended, because of leadership changes, or because of political pressure at the local and state level.

Since then, we have learned more about implementing performance assessment in schools. Innovation funders, partners, state agencies, and curriculum leaders have been paying attention to which methods of implementation seem to produce the best outcomes. We are not as far developed as our colleagues in the health field, who have seen the birth of a subsidiary field of study, Implementation Science, which looks at the uptake of research findings into routine healthcare practice.4 Still, qualitative research,5 along with the experience of the authors, suggests that four key considerations, if taken into account, will increase the probability of a successful implementation effort:

  • Shared moral vision and leadership driving policy and practice
  • Abundant, informative, and compassionate communication to build understanding and harness public will
  • An insistence on technical quality, fostered by sustained professional development
  • A collaborative culture within which to build capacity

Moral Vision and Leadership

Any change process involves necessary loss and disruption, and a huge amount of effort and learning by educators and stakeholders. Effective implementation requires that everyone in the system comprehend its necessity on rational, emotional, and moral levels. Student achievement, graduation, and college placement rates, while necessary to cite, are never enough on their own to drive change. We must also surface the moral and ethical commitments to equity that will move our schools toward implementing performance assessment. These commitments must become the foundation for supporting policy and decisions around practice. As leaders, we will know that we’ve succeeded not when people tell us they want to transform schools, but when they tell us they would be appalled if the change process went off the rails.

In 2006, when Gary was a curriculum teacher-leader at Hall-Dale High School in Farmingdale, Maine, the superintendent and principal moved to implement a standards-based reporting system. Their first step was not to change policy, but to begin working with teachers. First, the most enthusiastic were given access to professional development, and then word began spreading. More went for trainings. Book groups formed. Teachers, leaders, students, and parents traveled to districts engaged in similar work. At the forefront of all discussions was the question: Why is this necessary? When the time came to throw the switch and change the grading policy for the school, the school committee met with a group of 60 teachers. During the two-hour meeting, one teacher summed it up for all when he said, “Let us do the right thing.”


When Virgel arrived as superintendent of Regional School Unit 2 (RSU 2), which comprises five towns and to which Hall-Dale High School belongs, he began his tenure with a conversation tour of the large and far-flung district. As he recalled in an interview, “I needed to get to know the communities, the schools. What has made us successful? What hasn’t? What do we need to target? I made a commitment to go to every community. Homes. Patios. Barns. Town fairs” (Center for Best Practice, 2012). The frequent gatherings tended to be small, and listening happened on both sides. For Virgel, change happened one conversation at a time.

The district had learned early on that gathering parents into an auditorium and speaking at them from the stage was decidedly not the way to invite cultural change to a school district. Smaller meetings, in which parents could talk with students and teachers – rather than administrators – and, not incidentally, eat lasagna, turned out to be much more successful. Importantly, the leaders and teachers at RSU 2 realized that people aren’t afraid of change. Rather, as Michael Fullan (2006) has pointed out, they are afraid of loss.6

Faced with the shift to a performance assessment system, educators may feel a loss of competence as they move from being masters of content to being facilitators of learning. Or they may fear losing the precision (and the illusion of accuracy) that traditional grades can convey, and the uncertainty that results. High-performing students may fear the loss of a hierarchy based on those grades and of the GPA-based honors attached to them. Parents of high-performing students may similarly fear the possible loss of the advantage that their children accrue in the current system (Kohn 1998), especially when it comes to college admissions and scholarships. Some may be fearful because, even if they agree that the current system is flawed, they aren’t sure of the advantages of a new system and don’t want to “experiment” on their children. And administrators and innovators may fear the reactions of their communities and the pushback that may result.

None of this fear is baseless, though it is insufficient cause to withdraw from change. In the many conversations leading up to a transformation, everything is to be gained by avoiding treating anyone as if they are the enemy. Every stakeholder should be treated with compassion. In RSU 2, curriculum leaders met with individual parents for dozens of hours, discussing the foundations and subtleties of the system, and trying to assuage concerns. When RSU 2 formed an implementation committee, they not only invited parents on board, they invited the two most vocal skeptics of the new system, representatives of a much larger group. The committee met every two weeks for the school year, and set the conditions for the initial launch the following September.

Technical Quality

Our advocacy for QPA is inspired by its potential to support equity. QPA also has at its core a commitment to technical quality – that is, an assurance that the assessments deployed are valid and reliable. A performance assessment system must have an effect on student learning that is considerable, measurable, and demonstrable. The mechanism for ensuring this level of technical quality is a sustained professional development effort such as the Performance Assessment for Competency Education (PACE)7 pilot, which New Hampshire launched in 2014.

In PACE, both local and cross-district collaborative processes are the foundation of technical quality evidence and are guided by the National Center for the Improvement of Educational Assessment (Center for Assessment). Key processes to ensure the technical quality of the assessments, scoring, and the overall annual determinations that result from the local assessment systems include: 

  • Content Area Leads, selected from among PACE district educators, lead their peers in developing the PACE Common Performance Tasks. The Content Area Leads work closely with Center for Assessment staff, who also conduct technical reviews of the tasks.
  • Local calibration and double-scoring of the PACE Common Performance Tasks occur during the school year. Educators further analyze results and student work samples during summer cross-district calibration sessions.
  • Annual determinations are based on achievement level descriptors (ALDs) written by teachers. Teachers apply the ALDs in a survey that matches each student’s body of work to the appropriate descriptor, and samples of student work from the local assessment system are also analyzed.
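
To make the idea of double-scoring concrete, the sketch below shows one common way to summarize how closely two scorers agree: the share of exact matches, and Cohen's kappa, which corrects that share for agreement expected by chance. This is an illustration only. The source does not say which agreement statistics PACE or the Center for Assessment use, and the function names, the 1-4 rubric scale, and the scores are all hypothetical.

```python
from collections import Counter


def exact_agreement(scores_a, scores_b):
    """Share of pieces of work on which two scorers gave the same rubric level."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Agreement between two scorers, corrected for chance agreement."""
    n = len(scores_a)
    observed = exact_agreement(scores_a, scores_b)
    freq_a = Counter(scores_a)
    freq_b = Counter(scores_b)
    # Chance agreement: probability both scorers pick the same level independently.
    expected = sum(freq_a[level] * freq_b[level] for level in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used one identical level throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: two teachers score the same eight pieces of student work (levels 1-4).
teacher_1 = [3, 2, 4, 3, 1, 2, 3, 4]
teacher_2 = [3, 2, 3, 3, 1, 2, 4, 4]

print(f"exact agreement: {exact_agreement(teacher_1, teacher_2):.2f}")
print(f"Cohen's kappa:   {cohens_kappa(teacher_1, teacher_2):.2f}")
```

A low value on either measure would suggest that scorers are reading the rubric differently, which is the kind of discrepancy calibration sessions are meant to surface and discuss.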

Teacher involvement is a cornerstone of PACE’s technical quality, along with support from the Center for Assessment and its additional psychometric analysis of the resulting data. Together, they redefine traditional psychometrics for a locally driven reciprocal accountability system (NHDOE 2016).

Collaborative Culture

While Virgel was superintendent of RSU 2, the district joined with nearly two dozen others to form the Maine Cohort for Customized Learning (MCCL). Their goal was to bring together energy, expertise, and resources. Pooling funds to share professional development costs proved vital. The MCCL arranged a years-long agreement with the Re-inventing Schools Coalition to provide professional development to the member districts’ collective faculties. Similarly, when looking for standards-based reporting software, the districts reached out to the designers of the Empower software package and worked with them to customize the software to the MCCL’s needs. Finally, in designing proficiencies and standards, the cohort created content area committees, drawing expertise from all member districts.

As a central part of its work supporting performance assessment, the Center for Collaborative Education has fostered collaborative networks in Rhode Island, Vermont, Oregon, Massachusetts, and New Hampshire (through the aforementioned PACE). CCE itself works as part of a national collaborative cohort of 12 organizations called the Assessment for Learning Project (ALP). Each ALP organization is engaged in a learning pilot around some aspect of Assessment for Learning. One, for example, is looking into the power and quality of feedback (or feedforward) for kids. Another is examining a place-based, culturally responsive approach to habits and dispositions. CCE has developed a system of micro-credentials around performance assessment, and is piloting them as part of a system of professional development with districts in Rhode Island, Kentucky, and Georgia. ALP’s participating organizations gather online, and occasionally in person, to provide feedback, insights, encouragement, and support. It is a genuine learning community.

Districts must also build within themselves a culture that supports educator, student, and community collaboration. In many districts this has taken the form of some sort of formal model, for example Professional Learning Communities (PLCs) or Critical Friends Groups (CFGs). Through practice in these models, educators engage in intentional, structured conversations that center on student work, data, and practice. The protocols of these models allow for a safe place in which that level of vulnerability is possible. Over time, PLC practice stops being novel, stops being something teachers do, and becomes the culture of the school.

A New Age of Implementation

In implementing QPA across New England and beyond, we have learned how to support schools, districts, and states as they create local systems of performance assessment. We have seen that curriculum-embedded performance assessment operates as a leverage point for many of the practices that make up competency-based and personalized learning. The vision of performance assessment, with its relatively unconstrained responses and rich stimulus materials, is one of broad possibility and permission for students to take agency over their learning. The challenge is embodying this vision in all of our schools, in the widest variety of contexts, and in ways that ensure equity of opportunity and outcome for every student.

All students need, deserve, and have a right to our care, attention, and best efforts. Our nation requires a citizenry with the capacity to thrive in (and hold on to) the democratic republic of the 21st century. The broader range of possibilities allowed by performance assessment, and the associated careful use of data required, mean that our biases and assumptions will be less likely to close our eyes to the ways students can be successful. The commitment to authenticity and the call to allow students to become co-conspirators in the construction of their learning mean that they will be better prepared to move about in the world and shape it for their own futures.

1 See also Brown & Mevs 2012.

2 See CCE 2012. For more information on QPA, see http://cce.org/work/instruction-assessment/quality-performance-assessment.

3 Personal conversation with Gary Chapin, 2007.

4 For more on implementation science, see https://www.fic.nih.gov/researchtopics/pages/implementationscience.aspx

5 See Tung & Stazesky (2010) and Center for Best Practice (2013).

6 In a keynote at a 2007 conference in Maine, discussing the book Breakthrough (co-written by Fullan, Hill, and Crévola; see References).

7 For more information on PACE visit https://www.education.nh.gov/assessment-systems/pace.htm

References

Brown, C., and P. Mevs. 2012. Quality Performance Assessment: Harnessing the Power of Teacher and Student Learning. Boston, MA: Center for Collaborative Education.

Center for Best Practice. 2012. The long conversation, or ‘It’s hard, but worth it. Did I mention that it’s hard?’: RSU 2 student-centered learning implementation case study. Maine Department of Education.

Center for Best Practice. 2013. Threads of implementation: A thematic review of six case studies of Maine school districts implementing proficiency-based systems. Maine Department of Education.

Center for Collaborative Education. 2004. How pilot schools authentically assess student mastery. Boston, MA: Center for Collaborative Education. 

Center for Collaborative Education. 2012. Quality Performance Assessment: A Guide for Schools and Districts. Boston, MA: Center for Collaborative Education.

Fullan, M., P. Hill, and C. Crévola. 2006. Breakthrough. Thousand Oaks, CA: Corwin.

Kohn, A. 1998. Only for my kid: How privileged parents undermine school reform. Phi Delta Kappan (April).

New Hampshire Department of Education. 2016. Moving from good to great in New Hampshire: Performance assessment of competency education (PACE).

Rush, B. 1987. Of the mode of education proper in a republic. The Founders’ Constitution. Chicago, IL: University of Chicago Press.

Stecher, B. 2010. Performance Assessment in an Era of Standards-Based Educational Accountability. Stanford Center for Opportunity Policy in Education. Stanford, CA: Stanford University.

Tung, R., and P. Stazesky. 2010. Including performance assessments in accountability systems: A review of scale-up efforts. Boston, MA: Center for Collaborative Education.