Evidence submitted by the Institute for
Innovation & Valuation in Health Care (NICE 18)
1. THE INSTITUTE FOR INNOVATION & VALUATION IN HEALTH CARE (INNOVAL-HC)
1.1 The Institute for Innovation & Valuation
in Health Care (InnoVal-HC) welcomes the opportunity to submit
a response to the Health Committee's inquiry into aspects of the
work of the National Institute for Health and Clinical Excellence
(NICE).
1.2 InnoVal-HC ("the Institute";
www.innoval-hc.com) is an independent not-for-profit scientific
organisation dedicated to research into the principles of economic
evaluation of health care technologies and their application.
The Institute was founded in June 2005 and has since been formally
associated with the University of Applied Economic Sciences Ludwigshafen,
Germany.
1.3 The Institute's remit includes conducting
analyses and research into the methods and ethical foundations
of health economic evaluations, the mechanisms of delivering
and financing health care, the valuation of innovative technologies,
procedures, and products, and the acceptability of technologies
based on their cost-benefit and cost-effectiveness ratios.
1.4 The Institute does not operate as a
contract research organisation. As a matter of principle, the
Institute accepts support exclusively under a policy of unrestricted
educational grants. To date, the Institute has received support
from policy makers', payers', providers', physicians', patients',
and pharmacists' organisations, as well as from the pharmaceutical
industry.
2. OVERVIEW
We would particularly like to draw the attention
of the Committee to the following points:
NICE's international standing
The logic of cost-effectiveness
NICE accountability for reasonableness
Concluding remarks and recommendations
3. INTERNATIONAL PERSPECTIVE
3.1 Internationally, NICE's technology appraisal
programme is broadly considered a role model for health technology
assessments including economic evaluation. Following the House
of Commons Health Select Committee report of June 2002, NICE commissioned
the WHO Regional Office for Europe to carry out a review of its
Technology Appraisal Programme. The WHO review team described
key principles of the NICE approach as "the use of best available
evidence in decision-making, transparency, consultation, inclusion
of all key stakeholders, and responsiveness to change." They
concluded that, "in all of these areas, it is clear that
NICE is setting a new, international benchmark, for which it can
and should be congratulated" (Hill et al., 2003).
3.2 Further to this, NICE has assumed a
leading role internationally by fostering methodological advances
such as the use of probabilistic sensitivity analyses (intended
to capture decision uncertainty) and mixed treatment comparison
techniques (in order to enable indirect comparisons of technologies
in the absence of head-to-head studies).
3.3 Against this background, guidance issued
by NICE, as well as the underlying technology assessments and
appraisals, have attracted much attention internationally. For
example, the US National Guideline Clearinghouse routinely lists
recommendations by NICE on its website (www.guideline.gov). In
addition, policy makers in jurisdictions other than the UK (such
as the US and Germany) have engaged in debate about the adoption
of NICE-like processes in the context of National Health Technology
Assessment (HTA) programmes.
3.4 In the context of this international
debate, we conducted a qualitative study of the robustness of
the NICE approach (Schlander, 2007a). Here we report on some of
our key observations.
4. THE LOGIC OF COST-EFFECTIVENESS
4.1 The logic of cost-effectiveness as adopted
by NICE, in contrast to traditional cost-benefit analysis, does
not represent an application of standard economic theory (eg,
Birch and Donaldson, 2003; Birch and Gafni, 2006). The approach
was rather developed by decision analysts with an operations research
background, striving to transfer methods to optimise the efficiency
of manufacturing processes to the production of health (cf. Torrance,
2006). Specifically, NICE has chosen to use cost-utility analysis, a
variant of cost-effectiveness analysis, as its reference
case, with Quality-Adjusted Life Years (QALYs) as a universal
and comprehensive measure of health-related outcomes (NICE, 2004).
4.2 It is a fundamental and well established
principle of decision analysis that "the identification and
structuring of objectives essentially frames the decision being
addressed. It sets the stage for all that follows" (Keeney
and Raiffa, 1993). To be relevant, analytic decision support relies
on prior clarification of values and objectives to be pursued
(Keeney, 1992). To a large extent, then, applying the logic of cost-effectiveness
to inform health care resource allocation decisions hinges on
the assumption that "the principal objective of the National
Health Service (NHS) ought to be to maximise the aggregate improvement
in the health status of the whole community" (Culyer, 1997;
earlier, for instance, Weinstein and Stason, 1977). While it appears
trivial that health care services (should) produce health, it
is by no means self-evident that one can leap from here to
an assumed "principal objective" of collectively financed
health care, namely simply to maximise some construct (QALYs or
otherwise) of health-related consequences.
4.3 In fact, there is little if any evidence
that the maximisation view (sometimes justified with an asserted
"consensus in the literature" without specifying sources;
see Torrance, 2006) is shared by the general population (Coast,
2004). On the contrary, a rapidly growing body
of studies collectively shows that this assumption is "empirically
flawed" (Dolan et al., 2005; and others). Controversial
issues revolve around (but are not limited to) a higher social
priority for interventions when the severity of the patient's
condition increases, with life-saving interventions most highly
valued (this is sometimes referred to as "the rule of rescue",
cf. Jonsen, 1986; Hadorn, 1991; Nord, 1999; Ubel, 2000; McKie
and Richardson, 2003), and for people in so-called double jeopardy
(ie, with more than one condition causing impairment) who have
fewer QALYs to gain from successful interventions compared to otherwise
healthy individuals (cf. Singer et al., 1995; Harris, 1995;
McKie et al., 1996). As a consequence, there has been a
call for more research into "empirical ethics"
by leading health economists (eg, Richardson and McKie, 2005).
4.4 The maximisation assumption has also
been critiqued from a normative perspective. Concerns prominently
include the implied valuation of human life as a function of health
status, as opposed to viewing the value of life as a dimension
distinct from health (Harris, 1987; Arnesen and Nord, 1999; and
many others).
4.5 In the absence of a gold standard against
which to judge the criterion validity of the logic of cost-effectiveness,
it has been proposed to use the so-called reflective equilibrium
approach to examine the social acceptability of the resulting
rankings of health care programmes (Daniels, 2001; Nord, 1992).
Thus the problems involved in the application of standard decision
rules derived from the logic of cost-effectiveness are perhaps
best illustrated using an example: Assuming the (incremental)
cost per QALY gained was, for example, approximately £3,600
for sildenafil in erectile dysfunction (Stolk et al., 2000),
approximately £7,000 for pharmacotherapy of children with
attention deficit hyperactivity disorder (NICE, 2006), and >£120,000
for beta-interferons and glatiramer in multiple sclerosis (NICE,
2002), would this ranking reflect the comparative social desirability
of these interventions (cf. McGregor, 2003)?
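
The ranking implied by these figures can be made explicit with a minimal sketch. It uses only the three cost-per-QALY values quoted in 4.5 (with the multiple sclerosis figure entered at its lower bound of £120,000); the function names and the budget of £360,000 are hypothetical and chosen purely for illustration.

```python
# Cost per QALY gained (GBP), as quoted in paragraph 4.5.
# The multiple sclerosis value is a lower bound (">£120,000").
icers = {
    "sildenafil (erectile dysfunction)": 3_600,
    "pharmacotherapy (ADHD in children)": 7_000,
    "beta-interferons / glatiramer (multiple sclerosis)": 120_000,
}


def league_table(cost_per_qaly):
    """Rank interventions from lowest to highest cost per QALY."""
    return sorted(cost_per_qaly.items(), key=lambda item: item[1])


def qalys_per_budget(cost_per_qaly, budget):
    """QALYs gained if a (hypothetical) budget is spent on one intervention only."""
    return {name: budget / cost for name, cost in cost_per_qaly.items()}


ranking = league_table(icers)
for rank, (name, cost) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: £{cost:,} per QALY")

# Hypothetical budget, used only to show what the decision rule implies.
gains = qalys_per_budget(icers, 360_000)
```

A decision rule that maximises aggregate QALYs would fund these interventions in exactly this order, which is the ranking whose social acceptability the paragraph above calls into question.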
4.6 Far from representing a phenomenon encountered
in England and Wales only, the issue of counterintuitive rankings
was a major obstacle already faced by the protagonists of
cost-effectiveness analysis for resource allocation in the Oregon
Health Plan (cf. Hadorn, 1991). It is a conspicuous observation
that reviews of the usefulness of such rankings ("QALY league
tables") by many health economists have addressed a variety
of technical issues in detail but have not paid attention to the
issue of the validity of such rankings (eg, Drummond et al.,
1993; Mauskopf et al., 2003).
4.7 Importantly, the issue of counterintuitive
rankings should not be confused with the problem of distorted
human judgments due to "heuristics and biases" (Gilovich
et al., 2002), as moral intuitions in the sense of reflected
values and beliefs cannot be invalidated simply on grounds of
their incompatibility with competing normative claims. Of note,
it has even been argued by philosophers that there may exist an
irreducible pluralism at the foundations of normative ethics (cf.
Nagel, 1979).
5. NICE ACCOUNTABILITY FOR REASONABLENESS
5.1 Recognising both the difficulty democratic
societies have in achieving consensus on distributive principles for
health care and the need for legitimacy of allocation decisions,
Norman Daniels and James Sabin (2002) proposed a framework for
institutional decision-making, which they call "accountability
for reasonableness" (A4R). In order to narrow the scope of
controversy, A4R relies on "fair deliberative procedures
that yield a range of acceptable answers" and consists of
four conditions.
5.1.1 Publicity, ie, resource allocation
decisions must be public, including the grounds for making them.
Transparency should open decisions and their rationales for scrutiny
by all affected, not just the members of the decision-making group.
5.1.2 Relevance, ie, "the grounds
for decisions must be ones that fair-minded people can agree are
relevant to meeting health care needs fairly under reasonable
resource constraints." Arguments should rest on scientific
evidence, though not necessarily of a specific kind, and appeal
to the notion of "fair equality of opportunity." Although
Daniels and Sabin acknowledge that stakeholder participation
may improve deliberation about complicated matters, they believe
it is neither a necessary nor a sufficient condition of A4R.
5.1.3 Revisions and appeal, ie, there
must be an institutional mechanism to engage a broader segment
of society in the process, providing those affected by a decision
with an opportunity to reopen deliberation, and offering decision-makers
an option to revise funding decisions in light of further arguments.
5.1.4 Enforcement entails some form
of regulation to make sure that the first three conditions are
met.
5.2 Seeking to combine legitimacy and pragmatism,
and realizing that utilitarianism "has next to nothing to
offer in eradicating health inequalities" (Rawlins and Dillon,
2005), NICE put aside questions of whether matters of content can
"be resolved solely with a reference to 'due process'"
(Hasman and Holm, 2005) and has explicitly subscribed to the principles
of accountability for reasonableness (Rawlins and Dillon, 2005).
At the same time, NICE reaffirmed its preference for cost-utility
analysis with QALYs "as its principal (though not only) measure
of health gain."
5.3 A preliminary case study of a recent
NICE Technology Appraisal (No. 98; see www.nice.org.uk) focused
on the processes adopted by NICE. It confirmed the high (albeit
not perfect) level of transparency, predictability, and the participatory
nature of the NICE approach (Schlander, 2007b). While largely
in agreement with the positive WHO review (Hill et al.,
2003), the analysis also indicated a need for further in-depth
inquiry.
5.4 A subsequent more comprehensive in-depth
review focusing on the technology assessment report informing
NICE Technology Appraisal No. 98 did not confirm the expected
robustness of the NICE evaluation process, revealing a striking
number of limitations and anomalies (Schlander, 2007c). Collectively
these left the assessment open to critique regarding all essential
components of a technology review question, namely the population
studied, the choice of interventions, the clinical and economic
criteria used, as well as the study designs and selection criteria
(cf. CRD, 2001). Furthermore, the structure of the economic model
itself was prone to distortion and bias in various ways, and an
unsettling number of consistency problems were identified within
the assessment report. As a consequence, the assessment did not
fully consider the best available evidence and was unable to identify
any differences in clinical effectiveness between the treatment
options evaluated.
5.5 A number of underlying problems were
suggested to explain the observed limitations, notably including
an insufficient integration of clinical and economic perspectives,
a high level of standardisation that forces the problem to
fit a preconceived solution approach (including [but not limited
to] the use of QALYs as the effectiveness measure), and issues related
to the technical quality of the assessment itself (Schlander,
2007a).
5.6 Process-related observations may be
compared to the conditions of accountability for reasonableness:
5.6.1 Publicity. The overall process
was well structured and followed well-defined timelines with predictable
opportunities for (some) stakeholders to provide input; key documents
were continuously published on the NICE website. Major limitations
of transparency were related to the use of commercial-in-confidence
information (a situation on which NICE has since taken action),
the economic model developed by the assessment group, and decision-making
criteria beyond cost-effectiveness used by the appraisal committee.
Designating economic models as "proprietary" insulates
a major component of technology assessments from public scrutiny
and does not meet established standards of good economic modelling
practice (eg, Philips et al., 2004; Brennan and Akehurst,
2000). It might be added that this practice prevents academic
debate as well and, therefore, is not conducive to the further
development of health economic evaluation methods. As admitted
by NICE (cf. above, 5.2), quasi-utilitarian maximisation of QALY
gains irrespective of their distribution does not provide
a sufficient basis for health care resource allocation in tune
with social preferences. Thus, it is a critical transparency issue
that decision criteria other than cost-effectiveness have not
(yet) been codified by NICE.
5.6.2 Relevance. In the absence of codified
criteria for fairness and with its heavy (albeit not exclusive)
reliance on cost-effectiveness benchmarks, the specific NICE approach
may be characterised as an "efficiency-first" strategy
(cf. Richardson and McKie, 2006). It has been argued by observers
that this approach in practice will result in the marginalization
of other factors "as outside of NICE's terms of reference"
(Redwood, 2006). It seems unlikely that the current approach will
adequately capture social preferences for health care
provision. A current example nicely illustrating these issues
is the debate about the cost-effectiveness of expensive drugs
to treat patients with rare disorders ("orphan drugs").
Given the high fixed (ie, volume-independent) and low variable
cost structure of the pharmaceutical industry, applying the logic
of cost-effectiveness would inevitably deprive these patients
of any chance to receive effective treatment (cf. McCabe et
al., 2005, 2006; Hughes, 2005, 2006). While not meant to dismiss
any need to make thorny trade-off decisions, this example
may serve to illustrate the role of budgetary impact for reimbursement
decision-making, which NICE has repeatedly denied taking
into consideration (Rawlins and Culyer, 2004; Pearson and Rawlins,
2005), despite at least some indications to the contrary (Dakin
et al., 2006). While this position taken by NICE appears
questionable on both theoretical and pragmatic grounds, it is
evident that recognition of the relevance of budgetary impact
would have fatal implications for any attempt to interpret the
logic of cost-effectiveness in a normative way (Donaldson et
al., 2002; Schlander, 2003, 2005).
5.6.3 Revisions and appeal. NICE provisions
for appeal are more restrictive than those provided for by A4R.
Appeals are limited to specific grounds and do not allow debate to be
reopened. It seems unlikely that these limitations are compensated
for by opportunities for (invited) consultees and commentators
to provide inputs during the process, owing to the relatively
short windows of opportunity compared to the massive amount of
data to be reviewed and due to their limited transparency (cf.
above, 5.6.1).
5.6.4 Enforcement. There is no indication
that NICE has implemented an effective quality assurance system
for technology assessments. The design of effective provisions would
have to take into account that conventional peer-review processes
are unlikely to be up to the task of assessing the quality of economic
evaluation models (Brennan and Akehurst, 2000; Hill et al.,
2000).
5.6.5 Implementation. Following Hasman
and Holm (2005), proper enforcement of decisions should ensure
that reasoning is "decisive in priority setting and not merely
a theoretical exercise". Although NICE and the NHS have made
substantial efforts to improve actual implementation of guidance,
there remain issues in this area as well (cf. Sheldon et al.,
2004; Freemantle, 2004). It has been suggested that guidance may
be "more likely to be adopted when there is strong professional
support, a stable and convincing evidence base" and that
"guidance needs to be clear and reflect the clinical context"
(Sheldon et al., 2004), conditions that were arguably
not fulfilled in the case of Technology Appraisal No. 98 (Schlander,
2007a,c).
5.7 NICE has established a "Citizens Council"
to provide input "on the topics it wants the council to discuss"
and to ensure that its "value judgments resonate broadly
with the public" (Rawlins and Culyer, 2004), while maintaining
that its guidance "is based on clinical and cost-effectiveness
evidence" (NICE, 2007). The Citizens Council has shown some
concern for considerations of social justice but endorsed NICE's
approach, concluding that "cost-utility analysis is necessary
but should not be the sole basis for decisions on cost-effectiveness"
(NICE, 2005a,b). It might be worthwhile to explore in more depth
whether the Citizens Council was confronted with the issue of
cost-per-QALY rankings such as those cited above (see 4.5), ie,
with the logic that providing 10 people with a utility gain of
0.1 for the rest of their life (equivalent to sildenafil treatment
for men with erectile dysfunction) is indeed considered equivalent
to saving the life of a single (otherwise healthy) person.
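
The equivalence implied by unweighted QALY aggregation can be written out as a short worked calculation. This is a sketch under assumptions that are not stated in the text above: a common remaining life expectancy of T years for every person concerned, no discounting, and a gain of 1.0 on the utility scale for the person whose life is saved.

```latex
\[
\underbrace{10 \times 0.1 \times T}_{\text{ten people, utility gain of } 0.1 \text{ each}}
\;=\; T \;=\;
\underbrace{1 \times 1.0 \times T}_{\text{one life saved, full health}}
\qquad \text{(QALYs)}
\]
```

Under these assumptions both programmes yield T QALYs, so a rule that simply sums QALYs is indifferent between them.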
5.8 Summing up, there are good reasons to be
suitably impressed by the attempts by NICE to ensure rigorous
systematic reviews, objective economic evaluation, stakeholder
participation, and transparency of process as well as value judgments.
This notwithstanding, NICE is still in its infancy (cf. Williams,
2004), and, in our conclusion, there remains a long
way to go before the conditions of accountability for reasonableness
will have been met.
6. CONCLUDING REMARKS AND RECOMMENDATIONS
6.1 At this point in time, our observations
do not confirm "NICE's use of cost effectiveness as
an exemplar of a deliberative process", as one of its founding
fathers recently claimed (Culyer, 2006). In our conclusion, a
more balanced perspective would seem commendable, as there is
reason for concern as to the robustness of NICE health technology
assessment processes as well as their specific focus on "efficiency"
in terms of aggregated QALY maximisation.
6.2 In particular, in our view it would
seem justified to (re)consider (a) more flexible approaches in
terms of process as well as analytic procedures (enabling the
problem-solving strategy to be adapted to the clinical decision problem
at hand), (b) the extent of reliance on QALYs as the (exclusive?)
clinical effectiveness measure, (c) the level of integration of
clinical and economic perspectives, and (d) the implementation of
an effective quality assurance system for technology assessments.
From an international perspective, we further note that the value
judgments of NICE are not universally shared.
Professor Michael Schlander
InnoVal-HC, Eschborn, Germany
16 March 2007