The Evaluator’s Imperative: Understanding whether interventions were actually designed to solve the right problem.


In most development programs, solutions precede problems, and evaluators should not perpetuate this.


In international development, we often see well-intentioned projects that achieve all their stated outputs but fail to create lasting, systemic change. We focus intently on evaluating effectiveness and efficiency—Did the project deliver what was promised, and at what cost?—yet we frequently miss the most critical question of all: Was the intervention designed to solve the right problem?

The reality is that much of the development sector suffers from a fundamental design flaw: solutions often precede problems. Development practitioners frequently arrive with pre-packaged tools, often because of organizational mandates or donor preferences, and then try to fit them to a perceived local issue, bypassing the complex, time-consuming work of deep problem analysis. This process—sometimes described in public policy as "organized anarchy" or the "garbage can model"—creates programs that are perfectly executed but irrelevant to the underlying structural failures they were meant to address.

The Consequences of Superficial Diagnosis

This mechanical, compliance-driven approach carries a heavy cost: wasted capital, lost time, and a deepening crisis of credibility when programs inevitably falter. Evaluators must not perpetuate this cycle. By mechanically assessing performance metrics before rigorously understanding the failure that led to the problem, we risk becoming score-keepers for flawed strategies. This shifts our work from providing strategic, actionable solutions to merely pleasing evaluation commissioners. To move the needle, we must elevate the standard of relevance assessment from simple alignment with national priorities to rigorous failure diagnosis—a diagnostic that determines whether the initial project design was fundamentally fit for purpose.

1. The Core Diagnostic: Market, State, Implementation, and Regulatory Failure

The most fundamental task for any evaluator is to classify the core failure that produced the policy problem. This classification determines whether the intervention's "tool" was ever appropriate for the job.

Most of us sat through an economics class at some point and heard about market failures and public goods. In the end, understanding a social problem comes down to understanding which kind of failure produced it. This initial taxonomy therefore goes beyond pure economics to include state action and delivery capacity, providing a holistic view of systemic breakdown.

Market Failure

  • Simple Definition: When private markets fail to yield socially optimal outcomes (e.g., public goods, information asymmetry, monopoly).

  • How to Recognize (Diagnostics): Under-provision of basic services (WASH); low uptake of beneficial goods (vaccines); high prices due to monopoly; widespread negative externalities (e.g., pollution).

  • Appropriate Policy Tools: Pigouvian taxes/subsidies, regulation, competitive procurement, public provision, information disclosure, and certification schemes.

State/Governance Failure

  • Simple Definition: When government action misallocates or distorts resources due to political capture, corruption, poor incentives, or weak capacity.

  • How to Recognize (Diagnostics): Rent-seeking; contradictory or unstable rules; low execution rates despite adequate funding; systemic leakages in cash transfers or resource flows.

  • Appropriate Policy Tools: Independent regulators, enhanced transparency and open contracting, civil service reform, e-procurement systems, and robust participatory oversight.

Implementation Failure

  • Simple Definition: When sound policies fail at delivery due to weak systems, logistics, or misalignment of frontline incentives. This is the gap between "policy on paper" and reality.

  • How to Recognize (Diagnostics): Chronic stockouts despite national policy; staff absenteeism; last-mile bottlenecks in supply chains or service delivery.

  • Appropriate Policy Tools: Delivery chain analysis, logistics/IT investments (e.g., LMIS), supportive supervision, performance management systems, and clear micro-planning.

Regulatory Failure

  • Simple Definition: When rules designed to protect the public interest either over- or under-regulate, or create perverse, unintended incentives.

  • How to Recognize (Diagnostics): Outdated standards; regulatory capture by industry; overly burdensome rules that stifle legitimate growth (e.g., complex licensing pushing small businesses into the informal sector).

  • Appropriate Policy Tools: Sunset clauses for review; risk-based regulation; independent review boards; co-regulatory models involving public and private actors.

Elaborating the Mismatch:

The critical insight lies in exposing the mismatch. For example, if a community faces a lack of clean water (a perceived Implementation Failure in maintenance), the standard intervention is often a new logistics grant for spare parts. But if the real problem is that local elites are diverting maintenance funds with impunity (a Governance Failure), the new logistics system will fail. The evaluation must pivot from asking, "Did the logistics system deliver?" to asking, "Did the intervention address the underlying corruption?" This failure classification is the bedrock of a relevant evaluation.
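
To make this mismatch test concrete, here is a minimal sketch in Python. The category and tool names are illustrative shorthand drawn from the table above, and the function is a hypothetical diagnostic aid, not a standard method:

```python
# Minimal sketch: encode the failure taxonomy and flag tool–failure mismatches.
# Category and tool names are illustrative shorthand from the table above;
# a real diagnostic would be richer and context-specific.

APPROPRIATE_TOOLS = {
    "market_failure": {"pigouvian_tax", "subsidy", "regulation",
                       "public_provision", "information_disclosure"},
    "governance_failure": {"independent_regulator", "open_contracting",
                           "civil_service_reform", "participatory_oversight"},
    "implementation_failure": {"delivery_chain_analysis", "lmis_investment",
                               "supportive_supervision", "micro_planning"},
    "regulatory_failure": {"sunset_clause", "risk_based_regulation",
                           "independent_review_board", "co_regulation"},
}

def check_fit(diagnosed_failure: str, intervention_tools: set[str]) -> list[str]:
    """Return the intervention tools that do NOT match the diagnosed failure."""
    appropriate = APPROPRIATE_TOOLS.get(diagnosed_failure, set())
    return sorted(intervention_tools - appropriate)

# The water example above: funds are diverted (a governance failure),
# but the intervention shipped a logistics fix (implementation tools).
print(check_fit("governance_failure",
                {"lmis_investment", "delivery_chain_analysis"}))
# -> ['delivery_chain_analysis', 'lmis_investment']: every tool misses the failure
```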

2. Deeper Lenses for Root-Cause Analysis

Beyond the initial, broad classification, a professional evaluation must deploy advanced frameworks to understand the problem's true context and depth. This ensures our recommendations target the structural determinants, not just the visible surface-level issues.

A. Root Cause vs. Symptom: The Danger of Treating the Surface, and the Time-Horizon Misfit

An effective intervention addresses the Root Cause, which is the upstream determinant driving multiple issues. Too often, interventions only treat the Symptom, which is a narrow, proximate manifestation.

Consider chronic child stunting in a rural region. The symptom is malnutrition. A typical intervention might treat the symptom with targeted food supplementation. However, a deep diagnostic reveals a chain of causality:

  • Proximate Cause: Low dietary intake.

  • Intermediate Causes: Poor sanitation, caregiver knowledge deficits (Information Failure), and seasonal food price volatility.

  • Underlying Root Causes: Structural poverty, gendered norms that limit women's decision-making, and Time Horizon Misfit—where short political cycles lead to underinvestment in long-term, high-return sectors like Early Childhood Education (ECE).

If we evaluate the supplementation program and find it efficient, but fail to note that it left the structural root causes unaddressed, the evaluation has provided limited strategic value, as stunting will inevitably recur once the program ends. The evaluator must check for evidence of long-term planning (e.g., earmarked funds or intergenerational impact assessments) to counteract this systemic time horizon misfit.
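
A minimal sketch, using hypothetical layer and cause labels that mirror the bullets above, shows how an evaluator might make this chain explicit and report which layers an intervention actually touches:

```python
# Minimal sketch: represent the stunting causal chain as ordered layers and
# report which layers an intervention touches. Labels mirror the list above
# and are illustrative only.

CAUSAL_CHAIN = [
    ("proximate", {"low_dietary_intake"}),
    ("intermediate", {"poor_sanitation", "caregiver_knowledge_gaps",
                      "food_price_volatility"}),
    ("root", {"structural_poverty", "gendered_norms", "time_horizon_misfit"}),
]

def coverage_report(intervention_targets: set[str]) -> dict[str, bool]:
    """For each layer, does the intervention touch at least one cause in it?"""
    return {layer: bool(causes & intervention_targets)
            for layer, causes in CAUSAL_CHAIN}

# Targeted food supplementation addresses only the proximate cause:
print(coverage_report({"low_dietary_intake"}))
# -> {'proximate': True, 'intermediate': False, 'root': False}
# An evaluation that stops at "efficient delivery" never surfaces the False rows.
```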

B. Principal–Agent and Collective Action Problems: Incentive Alignment

Policy problems frequently boil down to misaligned incentives and human behavior.

  • Principal–Agent Problems: This occurs when the interests of the Agent (e.g., a teacher, a doctor, a street-level bureaucrat) diverge from the interests of the Principal (e.g., the parent, the patient, the government). The diagnostic signals are high staff absenteeism, deliberate gaming of performance metrics to hit targets without improving quality, or low effort because pay is not linked to outcomes. Solutions must focus on establishing transparent feedback loops, credible monitoring systems (like biometric attendance linked to supportive coaching), and designing contracts that reward intrinsic motivation and patient-centered care.

  • Collective Action Failures: Here, the problem is not misaligned personal incentives, but the difficulty of coordinating shared effort. The diagnostic is the "tragedy of the commons" syndrome—for instance, community members benefiting from a shared water system but each preferring to free-ride on the essential Operation and Maintenance (O&M) costs. Solutions require creating strong coordinating institutions (like water user committees) and implementing matching funds or transparent fee schedules to make cooperation the most rational choice.
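
A stylized numerical sketch (all payoffs hypothetical) makes the free-riding logic in the O&M example explicit:

```python
# Stylized public-goods sketch: each of N households decides whether to pay an
# O&M fee; the shared water system works only if enough of them contribute.
# All numbers are hypothetical, chosen only to expose the incentive.

N = 10          # households sharing the water system
FEE = 5         # annual O&M contribution per contributor
BENEFIT = 20    # value of a working system to each household
THRESHOLD = 7   # minimum contributors needed to keep the system running

def payoff(i_contribute: bool, others_contributing: int) -> int:
    total = others_contributing + (1 if i_contribute else 0)
    system_works = total >= THRESHOLD
    return (BENEFIT if system_works else 0) - (FEE if i_contribute else 0)

# If exactly THRESHOLD other households already pay, free-riding dominates:
print(payoff(True, THRESHOLD))   # 15: I pay, and the system works anyway
print(payoff(False, THRESHOLD))  # 20: I free-ride, and the system still works
# When every household reasons this way, contributions fall below THRESHOLD
# and the system fails: the signature of a collective action failure.
```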

C. The Complex Systems Perspective: Matching Style to Context

Some problems arise not from a single faulty node but from the interaction of feedback loops, delays, and emergent behavior within a system.

  • System Failure Types: This includes Coordination Failure (silos leading to client fatigue or duplication), Learning Failure (a system that fails to use routine data or M&E findings to adapt), or Resilience Failure (where the system cannot flex to handle expected shocks like annual floods or currency depreciation impacting medicine imports).

  • Problem Context (Cynefin): This lens is essential for M&E design. Is the issue merely Complicated (solvable with expert analysis and a phased plan, e.g., building a bridge), or is it truly Complex (where cause and effect are only clear in retrospect, requiring iterative pilots, small safe-to-fail probes, and rapid learning/adaptation, e.g., changing ingrained sanitation behaviors)? Evaluating a complex problem with a rigid, linear M&E plan appropriate for a complicated engineering task is a recipe for irrelevance and missing necessary course corrections. Evaluators must assess if the project design allows for adaptive phasing and reversibility (option value) before large sunk costs are incurred.
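
As a minimal sketch, assuming the illustrative context labels and design features below, an evaluator could run a simple context-to-design mismatch check:

```python
# Minimal sketch: flag mismatches between the problem context (Cynefin) and
# the M&E design. Context labels and design features are illustrative only.

REQUIRED_DESIGN = {
    "complicated": {"expert_analysis", "phased_plan"},
    "complex": {"iterative_pilots", "safe_to_fail_probes",
                "rapid_learning_loops"},
}

def design_gaps(context: str, design_features: set[str]) -> set[str]:
    """Return the design features the context demands but the plan lacks."""
    return REQUIRED_DESIGN.get(context, set()) - design_features

# A rigid, linear plan applied to a complex behavior-change problem:
print(design_gaps("complex", {"expert_analysis", "phased_plan"}))
# -> {'iterative_pilots', 'safe_to_fail_probes', 'rapid_learning_loops'}
# Every adaptive feature the complex context requires is missing.
```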

D. Political Economy, Voice, and Capacity: Who Holds the Power?

System failures are deeply rooted in political realities and institutional abilities.

  • Accountability Gaps (Hirschman's Exit–Voice–Loyalty): This framework diagnoses the structural relationship between service providers and users. In situations where service users have no alternatives (can't 'exit' to private provision) and their complaints are ignored (weak 'voice'), the problem is fundamentally one of suppressed accountability. The critical need is for interventions that strengthen grievance redress mechanisms, client charters, and social accountability, rather than just basic service delivery improvements.

  • Policy Capacity: We must diagnose not just the failure type, but the capacity to solve it. Is the capacity gap primarily:

    1. Analytical: Lack of evidence, modeling, or evaluation to inform policy?

    2. Operational: Inability to manage programs, logistics, or execute budgets effectively?

    3. Political: Inability to build necessary coalitions, negotiate with stakeholders, or secure long-term budget commitments?

Identifying the binding capacity constraint is crucial. For instance, a program may have excellent Analytical design, but if it fails due to a lack of Political support (budget cuts, ministerial opposition), the evaluation should focus on strategies for coalition building and advocacy, not just on fixing the Operational weaknesses.
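
A minimal sketch, with hypothetical capacity scores on a 0–10 scale, of how an evaluator might surface the binding constraint:

```python
# Minimal sketch: the binding constraint is the weakest of the three capacity
# dimensions. Scores are hypothetical (0 = absent, 10 = strong).

def binding_constraint(scores: dict[str, float]) -> str:
    """Return the capacity dimension with the lowest score."""
    return min(scores, key=scores.get)

# A program with strong analytical design but weak political backing:
assessment = {"analytical": 9.0, "operational": 6.5, "political": 2.0}
print(binding_constraint(assessment))  # -> 'political'
# The useful recommendation targets coalition building and advocacy,
# not further polish on the already-strong analytical design.
```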

Conclusion: The Evaluator’s New Mandate for Systemic Change

The era of simply checking off outputs is over. To serve decision-makers and the communities we work with, evaluators must embrace a mandate for rigorous policy diagnosis.

We need to shift our focus from validating the success of a chosen solution to interrogating the fitness of the solution for the diagnosed failure. This is the only way to ensure that development resources are deployed effectively and strategically.

True strategic evaluation moves beyond a final score on effectiveness and provides a clear diagnostic: "The intervention, while well-executed, failed because it used a Market Failure tool (a subsidy) to address a Governance Failure (political capture of the supply chain), or because it applied a Complicated project plan to a Complex problem context."

By adopting a robust, multi-layered problem taxonomy, evaluators can provide the strategic intelligence decision-makers need to know exactly what to do differently, transforming our role from score-keepers into indispensable agents of systemic and sustainable change.
