Properties of the valuation system


There are two aspects here. One is the repeatability and stability of the valuations at the group level. In this respect the repeatability of 15D valuations seems good (Sintonen 1995).

The other aspect is the test-retest reliability of the total scores generated by these valuations. In this respect the test-retest reliability of the 15D scores over 14 days was clearly better than that of EQ-5D scores (based on UK TTO valuations) among COPD patients (Stavem 1999).


There is no gold standard for a valuation system, that is, for how to measure the values and from whom.

Usually the valuations for preference-based generic instruments have been elicited from population samples with reference to hypothetical health states, which the respondents imagine themselves to be in. To be valid for QALY calculations, the values should reflect a reasonable trade-off between quality and length of life. It has therefore been argued on theoretical grounds that standard gamble (SG) or time trade-off (TTO) valuations are the most valid for QALY calculations.

Recently, doubts have increasingly been raised about whether valid valuations can be elicited unless the respondents are themselves in the health states being valued, or have at least experienced them at some point. It has been claimed that the typical approach followed so far is so hypothetical and unreal that the validity of the valuations obtained is questionable.

However, it has not been studied, for example, how well the average scores produced by widely used preference-based generic instruments (such as the 15D, HUI Mark 3, EQ-5D based on UK TTO valuations, AQoL and SF-6D) in different age groups of the general population reflect the direct TTO valuations of own health status in those age groups.

To find this out, we recently carried out an extensive population survey. It showed that, of the scores produced by the instruments mentioned above, the 15D scores best reflect the general population’s TTO valuations of their own health states. To the extent that these valuations are valid, the 15D scores are the most valid for QALY calculations.

Another test with COPD patients showed that their 15D scores agreed better with SG and TTO valuations of the patients’ own health than did the EQ-5D scores based on UK TTO valuations (Stavem 1999). Thus, if SG or TTO valuations of own health are taken as the gold standard, the 15D scores are more valid for QALY calculations than these EQ-5D scores.

We also recently carried out an extensive study among hospital patients. It showed that the relationship between the TTO scores of patients’ own health states and their 15D scores is linear, and that the agreement between them is good at the aggregate level. Thus, to the extent that the former are valid for QALY calculations, so are the 15D scores, without any transformation (Honkalampi and Sintonen 2010).

On the other hand, Nord (1992) has argued that person trade-off (PTO) valuations are the gold standard, especially where resource allocation decisions are concerned. Comparisons by Nord (1996) across preference-based generic instruments (15D, EQ-5D based on UK TTO valuations, HUI Mark 1, HUI Mark 2, QWB, Rosser/Kind) have shown that, apart from the old two-dimensional Rosser/Kind index scores, the 15D scores agree best with the “rules of thumb” valuations based on PTO. Nord (1999) goes as far as to conclude that, apart from the 15D and Rosser/Kind, the other instruments mentioned above “are completely unusable, at least as stand-alone aids, in comparisons of treatment programs for patient groups that differ with respect to the severity of their condition”.

Validity of Finnish valuations elsewhere

So far the users of the 15D in different countries around the world have mostly applied the Finnish valuations in their studies. Valuations are now also available from Denmark (Wittrup-Jensen and Pedersen 2008) and Norway (Michel et al. 2018), and they are very similar to the Finnish ones.

This raises the question of how valid the Finnish valuations are elsewhere. So far there is no direct evidence on this, but a lot of indirect evidence can be obtained from the EuroQol project, which was originally launched to explore whether health state valuations elicited with basic methods similar to those used in the 15D are similar across a number of European countries. Rigorous comparisons based on 11 valuation studies from six European countries (Finland, Great Britain, Germany, The Netherlands, Spain and Sweden) led to the conclusion that

“There is a considerable degree of agreement between health state valuations from several European countries. Spanish deviations may be explained by subtle methodological and/or linguistic differences that we were not able to account for, rather than cultural differences. Hence in Western industrialised countries it appears unnecessary to replicate expensive valuation studies in each country in order to arrive at valid preference-based HRQoL instruments” (Sintonen et al. 2003).

A thorough comparison of Finnish and U.S.-based VAS valuations led to the conclusion that there were small and inconsistent differences in valuations, but that these are unlikely to affect the results of international studies (Johnson et al. 2000).

Minimum Clinically Important Change or Difference

It is often asked how a change in the 15D score should be interpreted: what is the minimum important change (MIC) over time, or the minimum important difference (MID) cross-sectionally, in the 15D score? Based on extensive patient data, it has been estimated that the generic MICs and MIDs of 15D scores are ±0.015.

Recommendation: Follow-up studies using the 15D should report the mean change in the 15D score, its statistical significance, its relationship to the MIC, and the distribution of the changes in the 15D scores across the following five categories: >0.035 for "much better", 0.015 to 0.035 for "slightly better", >-0.015 and <0.015 for "much the same (no change)", -0.035 to -0.015 for "slightly worse" and <-0.035 for "much worse" (see Alanne et al. 2015).
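As an illustration only, the recommended reporting can be sketched in a few lines of Python. The function names (classify_change, summarise_changes) are hypothetical and not part of any official 15D software. The sketch assumes that the boundary values ±0.015 and ±0.035 belong to the "slightly better" and "slightly worse" categories, and it rounds each change to four decimals to avoid floating-point noise at the boundaries; both are assumptions of this example, not statements from the source. Testing the statistical significance of the mean change is left out.

```python
from collections import Counter

MIC = 0.015    # generic minimum important change reported for the 15D score
LARGE = 0.035  # threshold separating "slightly" from "much" better/worse


def classify_change(change):
    """Map a change in the 15D score (follow-up minus baseline) to a category."""
    d = round(change, 4)  # assumption: rounding guards against float noise
    if d > LARGE:
        return "much better"
    if d >= MIC:
        return "slightly better"
    if d > -MIC:
        return "much the same (no change)"
    if d >= -LARGE:
        return "slightly worse"
    return "much worse"


def summarise_changes(baseline, follow_up):
    """Return the mean change, whether it reaches the MIC, and category counts."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    mean_change = sum(changes) / len(changes)
    return {
        "mean_change": mean_change,
        "mean_reaches_mic": abs(round(mean_change, 4)) >= MIC,
        "distribution": Counter(classify_change(c) for c in changes),
    }


if __name__ == "__main__":
    # Made-up scores for five patients, purely for demonstration.
    baseline = [0.900, 0.850, 0.800, 0.950, 0.700]
    follow_up = [0.950, 0.870, 0.805, 0.930, 0.650]
    summary = summarise_changes(baseline, follow_up)
    print(f"Mean change: {summary['mean_change']:+.3f}")
    for category, count in summary["distribution"].items():
        print(f"{category}: {count}")
```

In this made-up sample each patient falls into a different category while the mean change stays well below the MIC, which is one reason to report the distribution of changes alongside the mean.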


Wittrup-Jensen KU, Pedersen KM. Modelling Danish Weights for the 15D Quality of Life Questionnaire by Applying Multi-Attribute Utility Theory (MAUT). Health Economics Papers 2008:7, University of Southern Denmark.

Michel YA, Augestad LA, Rand K. Comparing 15D Valuation Studies in Norway and Finland-Challenges When Combining Information from Several Valuation Tasks. Value Health 2018; 21(4): 462-470. doi: 10.1016/j.jval.2017.09.018.

Alanne S, Roine RP, Räsänen P, Vainiola T, Sintonen H. Estimating the minimum important change in the 15D scores. Qual Life Res 2015; 24(3): 599-606.

Honkalampi T, Sintonen H. Do the 15D scores and time trade-off (TTO) values of hospital patients’ own health agree? Int J Technol Assess Health Care 2010; 26(1): 117-23.

Johnson JA, Ohinmaa A, Murti B, Sintonen H, Coons SJ. Comparison of Finnish and U.S.-based visual analog scale valuations of the EQ-5D measure. Med Decis Making 2000; 20: 282-289.

Nord E. Cost-value analysis in health care. Making sense out of QALYs. Cambridge: Cambridge University Press; 1999.

Nord E. Health status index models for use in resource allocation decisions. A critical review in the light of observed preferences for social choice. Int J Technol Assess Health Care 1996; 12: 31-44.

Nord E. Methods for quality adjustment of life years. Soc Sci Med 1992; 34: 559-569.

Sintonen H. The 15D instrument of health‑related quality of life: properties and applications. Ann Med 2001; 33: 328‑336 and Sintonen H. The 15D‑measure of health‑related quality of life. II. Feasibility, reliability and validity of its valuation system. National Centre for Health Program Evaluation, Working Paper 42, Melbourne 1995 (can be downloaded from

Sintonen H, Weijnen T, Nieuwenhuizen M, Oppe S, Badia X, Busschbach J, Greiner W, Krabbe P, Ohinmaa A, Roset M, de Charro F. Comparison of EQ-5D VAS valuations: analysis of background variables. In: Brooks R, Rabin R, de Charro F (eds.) The measurement and valuation of health status using EQ-5D: A European perspective. Evidence from the EuroQol BIOMED Research Programme. Dordrecht: Kluwer; 2003: 81-101.

Stavem K. Reliability, validity and responsiveness of two multiattribute utility measures in patients with chronic obstructive pulmonary disease. Qual Life Res 1999; 8: 45-54.