25 June 2019

Evaluation: moving beyond well-intentioned fads and towards a new professionalism

Nicholas Gruen

NICHOLAS GRUEN: Public sector policymaking should involve better expertise, resourcing, independence and transparency.

Working in and around government for over three decades I’ve grown increasingly wary of fads.

Remember the book Reinventing Government which proposed government working more like the private sector, without carefully setting out how to do it? Today it’s design thinking, putting "users" or the intended beneficiaries of programs at the centre of program design and delivery. It’s great that we’re trying to do so.

But we’ve been talking like this for at least two decades – since the Third Way fad came and went. (Remember how "one size fits all" wouldn’t do anymore? But again, we couldn’t work out how to do it.) And while promising demonstrations of design thinking proliferate – as they did for Reinventing Government and the Third Way, proving that there are some real prospects in the fad – we’re still miles from big systemic improvements. We’ve got some inklings of what we want, but we’re still stumped about how to do it.

The fads themselves are a symptom of the deeper, careerist malaise. The big civil service career rewards go to strategisers of high policy – orchestrators of fads from the top. Yet the knowledge we’ll need to transform systems around their users’ needs comes mostly from below – from workers in the field and those they serve.

Towards a new professionalism

We’ve been here before. As services in health and education were ramped up from the late 19th century on, we built professions to take us from fairly clear ideas of what we wanted, to working out the "how".

Those professions were far from perfect, freighted as they were with the baggage of the governing class – a particular problem where users were from another class or ethnicity. They also erected barriers to entry. This increased costs and could nurture a complacency that delayed the response to emerging evidence – for instance, that hand washing reduced hospital infections.

On the other hand, professions were and remain the paradigm institution for fostering increasingly knowledgeable "communities of practice". They did develop expertise in difficult practical tasks in the field. And the autonomy their status gave them provided some ballast for their expertise to influence outcomes, alongside the changeable imperatives of their managers.

Today managers direct the traffic. Current managerialism offers measurement and accountability aplenty. That certainly suggests objective standards. In principle this could offer an antidote to professional complacency[1]. But it’s mostly driven by what Jakobsen et al call the “political demand for account giving”. Thus, the motive force behind monitoring and evaluation in the civil service is bureaucrats’ and politicians’ joint need to be seen to hold the system to account, not the need for program or system-wide learning[2].

In response, I propose a new agency – the evaluator-general. Perhaps because this name implies some borrowing from the status of the auditor-general, this has been taken to involve centralising monitoring and evaluation from above. That’s technically true, but ironic nevertheless, for my objective is to decentralise and empower the knowledge of those in the field (both those employed to deliver services and those in the community) so that, to the extent that it is independently tested and validated against the evidence, it is given substantially greater weight than it is now.

And, for as long as monitoring and evaluation remain under their sole direction, no amount of wishful thinking will prevent the institutional imperatives of senior managers and politicians from driving its design, operation and reporting. Expertise – in evaluation and from the field – needs a seat at that table.

Operationalising the demarcation between delivery and understanding impact

Consider the agencies that provide foundational information and integrity in Britain – for instance, the National Audit Office, the Meteorological Office, and the Office for Budget Responsibility. They have independence to insulate them from influence by those within politics and the bureaucracy whose circumstances force them into an intense preoccupation with "messaging". If our system is to function, let alone learn, they must "tell it like it is".

Although such independence has hitherto existed only at the level of agencies, I envisage the evaluator-general as being an institution through which a new demarcation is operationalised at all levels of the hierarchy, between delivering programs on the one hand and understanding their impacts on the other.

Thus a line agency directed by a political officeholder would deliver a program – or commission its delivery from competing providers – but the evaluator-general would independently oversee and resource the program’s "nervous system" – its monitoring and evaluation.

For this to work well, those delivering and those monitoring and evaluating services would need to collaborate closely. Physically, they would work alongside each other within the delivery agency and in the field. But the evaluator-general would have ultimate responsibility for monitoring and evaluation in the event of disagreement between its own and the delivery agency’s officers. And, subject to privacy safeguards, the monitoring and evaluation system’s outputs would be regularly published with appropriate comment and analysis.

Monitoring and evaluation would have the primary objective of helping those delivering services measure, understand and thus continually improve their impact. Accountability to those ‘above’ the service deliverers in the management hierarchy would be built in the first instance from this self-accountability of those in the field, who would now have an expert critical friend, or in Adam Smith’s words, an impartial spectator. Toyota revolutionised the efficiency and quality of car manufacture similarly – by building its own production system around the self-accountability of production teams.

Where cooperation was poor, the efficacy of the system would be degraded, though I doubt it would be less useful than what we have now. Moreover, it would be visible and so, one hopes, corrected.

Thus the new arrangements are intended not just to give some ballast to evaluation expertise and the knowledge of those in the field against the changeable institutional imperatives of senior managers and politicians, but to do so by reference to validation against the evidence. This would also strengthen the efficacy of the profession involved and stiffen its discipline against complacency.

The objectives of the new arrangements

The finely disaggregated transparency of performance information made possible by these arrangements would support:

  • the intrinsic motivation of those in the field to optimise their impact;
  • public transparency to hold practitioners, their managers and agencies to account[3];
  • more expert and disinterested estimates of the long‑term impact of programs to enable a long‑run "investment approach" to services; and
  • a rich "knowledge commons" in human services and local solutions that could tackle the "siloing" of information and effort within agencies.

With journalism and political debate increasingly given over to spin, the public sector can strengthen its own independence from this process by strengthening the expertise, resourcing, independence and transparency of the evidence base on which it proceeds.


  1. See Gruen, N, 2019, “Accountability: from above and below” (forthcoming).
  2. As Jakobsen et al put it, the “performance metrics are politically viewed as a legitimate and necessary” way of satisfying “political demand for account giving”. Jakobsen, M.L., Baekgaard, M., Moynihan, D.P. and van Loon, N., 2017. Making sense of performance regimes: Rebalancing external accountability and internal learning, Perspectives on Public Management and Governance, 1(2), pp.127-141 at 128. In consequence, form often trumps substance.
  3. By publicly identifying success as it emerged, an evaluator-general would place countervailing pressure on agencies to more fully embrace evidence-based improvements, even where this disturbed the web of acquired habits and vested interests that entrench incumbency.


Nicholas Gruen is a Visiting Professor at the Policy Institute, King's College London, CEO of Lateral Economics and the Chair of Open Knowledge Australia.

This blog piece was originally published in The Mandarin.
