31 July 2025
State capability, effectiveness, and policy reform
Vanessa Hirneis and Professor Michael Sanders
What Germany can learn from the UK’s What Works Movement

On 18 July, Chancellor Friedrich Merz travelled to London, where he and Prime Minister Keir Starmer signed a treaty between Germany and the UK, cementing the countries’ friendship and bilateral cooperation. The agreement covers shared priorities such as defence, migration, infrastructure, the economy and freer movement of goods and people - areas of growing importance in an era of volatility, uncertainty, complexity and ambiguity. Yet beyond these headline commitments lies another opportunity: learning from each other how to “do government” more effectively. In particular, Germany could benefit from Britain’s “test, learn, adapt” culture: using rigorous, real‑world trials to ensure that policies and internal processes actually deliver the results they’re intended to achieve.
Germany is in dire need of political reform, and its administration currently lacks some of the essential tools to deliver such reform effectively and sustainably. Inequality is rising, the economy is stalling, infrastructure is outdated and climate targets are being missed, while the world is seeing its most widespread conflict since the Second World War. Citizens’ trust in the state’s capacity to solve these pressing problems - and, by extension, in their democratic institutions - is severely damaged. Meanwhile, Merz faces a highly risk-averse government apparatus, weighed down by overly complex laws and excessive bureaucracy. In light of this, media manager and supervisory board member Julia Jäkel, former federal ministers Peer Steinbrück and Thomas de Maizière, and the former President of the Federal Constitutional Court, Andreas Voßkuhle, founded the politically independent “initiative for an effective state”, offering the public a book-length set of proposals for reform. Their report (the result of work by around 50 experts across seven thematic working groups) was presented by the Federal President and the initiative’s patron, Frank-Walter Steinmeier, earlier this month in Berlin. Many of its suggestions aim to strike the fragile balance between making public services leaner and safeguarding those who need them most. At the same time - despite plenty of references to the need for KPIs, ‘model regions’ and digital innovation - many questions remain unanswered about how to systematically and rigorously measure and test ‘what works’, so that such substantial reforms truly deliver and endure.
How to Measure ‘What Works’?
Evidence‑based policymaking applies scientific methods to government activities and treats interventions as an opportunity to learn: we begin with a clear hypothesis and theory of change, then draw on administrative data, randomised trials and quasi‑experimental designs to understand not just whether a policy delivers, but why, for whom and under what conditions. Crucially, causal inference underpins this approach: in a randomised controlled trial (RCT), for instance, participants are randomly allocated either to a treatment group, which receives an intervention, or to a control group, which does not. This random allocation ensures that, on average, the two groups are alike in all respects before the intervention, so any difference in outcomes afterwards can be confidently attributed to the intervention itself rather than to other factors. A well-studied method taken from medicine and applied to public policy, the RCT is widely regarded as the most reliable way of isolating the effect of a policy or programme, because it eliminates confounding by design. Other well-established forms of policy evaluation, such as pre‑post comparisons, correlation‑based analyses, or newer approaches such as the ‘model regions’ proposed in the initiative’s report, can suggest whether outcomes moved in the right direction, but by themselves they cannot reliably tell us whether the change was driven by the policy or by an external trend, selection bias or a concurrent programme. Causal methods, by contrast, give policymakers the confidence to scale up interventions that truly work and to retire or redesign those that don’t. This way, government can direct resources towards programmes with proven impact, reduce risk by piloting new initiatives, and make decisions more transparent and accountable.
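To make this logic concrete, the sketch below simulates a small RCT in Python. Everything in it is hypothetical - a fictional back-to-work programme, an invented sample and a made-up five-percentage-point effect - but it illustrates the core mechanic: because assignment is random, a simple difference in means between the two groups recovers the true effect, with no statistical adjustment needed to strip out confounders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical trial: 2,000 jobseekers, half randomly assigned to a
# fictional back-to-work programme (treatment), half to business as
# usual (control).
n = 2000
treated = rng.permutation(np.repeat([0, 1], n // 2))

# Each person has an unobserved baseline re-employment probability;
# randomisation balances it across groups, so it cannot confound the
# comparison.
baseline = rng.normal(loc=0.40, scale=0.10, size=n)

# Assume the programme raises the re-employment probability by
# 5 percentage points (the "true" effect we hope to recover).
true_effect = 0.05
p_employed = np.clip(baseline + true_effect * treated, 0, 1)
employed = rng.binomial(1, p_employed)

# The estimator: a simple difference in means between the groups.
effect_estimate = (employed[treated == 1].mean()
                   - employed[treated == 0].mean())

# A two-sample t-test gauges whether the difference could be chance.
t_stat, p_value = stats.ttest_ind(employed[treated == 1],
                                  employed[treated == 0])

print(f"Estimated effect: {effect_estimate:+.3f} (true: {true_effect:+.3f})")
print(f"p-value: {p_value:.4f}")
```

Quasi-experimental designs follow the same logic but replace the coin flip with a credible source of as-if randomness, such as an eligibility cut-off or a staggered rollout.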
Building on this foundation of causal evidence, the United Kingdom offers a particularly well-developed model for integrating evaluation into policymaking. A central feature is the What Works Network - independent centres, beginning with the National Institute for Health and Care Excellence (NICE) in 1999, followed by the Education Endowment Foundation in 2011. Today there are 12 centres covering areas from health and education to housing and crime. While they vary in size and remit, each is committed to using high-quality evidence to inform policy and practice, guided by the core question: “If we do X, what happens to Y?” In addition to these independent centres, government departments each maintain their own evaluation teams and budgets; the Treasury’s “Magenta Book” sets out methodological standards for evaluation in government; and Cabinet Office guidance embeds a “test, learn, adapt” cycle into everyday policymaking, making experimental evaluation standard practice. Independent expert bodies such as the Behavioural Insights Team and our very own Experimental Government Team at the King’s College London Policy Institute further enhance this infrastructure, promoting rigorous, transparent and actionable evidence across government.
Will ‘What Works’ Finally Make Its Way to Germany?
At present, Germany shows real potential to integrate scientific methods into the policy process, but it still lacks a coherent, system‑wide approach. Within the Federal Chancellery, the governmental unit “Effective Governance” (German: Wirksam Regieren), established by the Merkel government in 2015, represented an important first step. Yet, unlike the United Kingdom’s Evaluation Taskforce, which sits jointly in the Cabinet Office and the Treasury, it has no legal mandate, dedicated budget, authority to coordinate across departments or guaranteed access to the data it would need to run large-scale randomised trials. Although the unit can offer advice, it cannot itself commission or oversee impact evaluations across ministries, nor is there a shared evaluation framework spanning the entire federal administration. As a result, evaluations are often conducted by independent research organisations, occur only sporadically and remain confined to a few areas such as health, development cooperation and social care.
For this to change, Germany needs targeted legal reforms that remain largely unaddressed in public discourse. Foremost among these is a Research Data Act that allows GDPR‑compliant linkage of administrative datasets for evaluation purposes, together with clear data‑access rights, unified methodological standards and the formal embedding of these forms of evaluation in both budgeting and legislative processes. Without this legal framework, evaluation will remain an ad hoc exercise. With it, evaluation becomes a strategic, government‑wide capability.
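To give a flavour of what such linkage involves in practice, here is a minimal, purely illustrative Python sketch: two invented administrative extracts are joined on a salted, hashed pseudonym, so that an evaluator can measure outcomes without ever handling raw identifiers. The field names, datasets and salt handling are all assumptions made for the example; genuine GDPR-compliant linkage would additionally require a legal basis, access controls and, typically, a trusted intermediary or secure research environment.

```python
import hashlib
import pandas as pd

# Fictional administrative extracts; all records are invented.
benefits = pd.DataFrame({
    "national_id": ["DE001", "DE002", "DE003"],
    "in_programme": [True, False, True],
})
employment = pd.DataFrame({
    "national_id": ["DE001", "DE002", "DE003"],
    "employed_after_12m": [True, False, False],
})

# A secret salt, held only by the data trustee, prevents
# re-identification by hashing guessed IDs. (Illustrative only.)
SALT = "secret-held-by-data-trustee"

def pseudonymise(national_id: str) -> str:
    """Derive a stable pseudonym from an identifier plus the salt."""
    return hashlib.sha256((SALT + national_id).encode()).hexdigest()

for df in (benefits, employment):
    df["pseudo_id"] = df["national_id"].map(pseudonymise)
    df.drop(columns="national_id", inplace=True)  # raw IDs never leave

# The evaluator receives only pseudonymised tables and links on pseudo_id.
linked = benefits.merge(employment, on="pseudo_id")
print(linked.groupby("in_programme")["employed_after_12m"].mean())
```

The design point is the separation of roles: whoever holds the salt can link records, while the evaluator sees only pseudonyms - the kind of arrangement a Research Data Act would need to put on a statutory footing.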
Yet optimists may point to signs that change could be on the horizon. The report’s authors highlight the government’s stated intention to introduce key performance indicators to “make the effectiveness of [policies and] laws verifiable”. The coalition agreement explicitly pledges to “evaluate all funding programmes in terms of their purposefulness and effectiveness” and to ensure that data is made accessible for strategic management, modelling and impact monitoring. Even more tellingly, the newly appointed Digital Minister, Dr Karsten Wildberger, declared in his address at the report’s launch at Bellevue Palace:
“Overall, we have too little focus on implementation and impact. It is not enough to define measures, make laws and tick them off. What matters is implementation, clarity and measuring effectiveness. Good governance must be effective governance. This should apply in [my ministry]: We measure what works.”
But whether words will be followed by deeds (and the relevant policy changes to enable rigorous evaluations) remains to be seen.