Abstract
Introduction: The hip and knee replacement priority criteria tool (HKPT) is 1 of 5 tools developed by the Western Canada Waiting List Project for setting priorities among patients awaiting elective procedures. We set out to assess the validity of the HKPT priority criteria score (PCS) and map the maximum acceptable waiting times (MAWTs) for patients to levels of urgency.
Methods: Two studies were used to assess convergent and discriminant validity. In study 1, consecutive patients on a waiting list for hip or knee arthroplasty were assessed by orthopedic surgeons from the 4 provinces in Western Canada, using the HKPT and data on patient age, gender, joint site, type of surgery (primary or revision), 2 measures of surgeon-rated patient urgency, and diagnosis. In study 2, 6 patients were videotaped during a consultation interview with the surgeon and were assessed by a group of experts. We measured function with the PCS and the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC).
Results: In study 1, we assessed 394 patients, and in study 2, 19 raters assessed the 6 patients. Correlations between the PCS and other measures of physician-rated urgency were strong, ranging from 0.78 to 0.89. For a subgroup of 60 patients, correlation between the PCS and function as measured with the WOMAC was 0.48, and correlation was greater (0.45–0.56) between items measuring similar constructs (e.g., pain at rest) than those measuring different constructs (0.21–0.40). In study 2, median MAWTs ranged from 4 to 24 weeks for 5 levels of urgency based on PCS percentiles.
Conclusions: Results from this study support the validity of the PCS as a measure of surgeon-rated urgency for hip or knee arthroplasty. Evaluative studies are needed to assess the validity and acceptability of the tools and the establishment of MAWTs in clinical practice.
Long waiting lists for joint arthroplasty continue to be a major concern in Canada, with median waiting times from specialist assessment to surgery ranging from approximately 11 to 28 weeks.1–3 Currently, waiting lists are managed by individual orthopedic surgeons, and the order of patients is not rationalized or prioritized within or across physician lists using a transparent and standardized approach.4,5 Most surgeons use broad, ill-defined categories such as urgent, semiurgent and routine to prioritize patients in the queue. However, evidence has shown that order in the queue bears little relation to the severity of patient symptoms such as pain and disability.4
In a recent review of Canadian waiting lists and wait times, McDonald and colleagues6 concluded that, with some exceptions, “wait lists do not provide a fair and transparent basis for managing patients or allocating resources.” A key recommendation was the development and use of standardized measures, based on clinical urgency and capacity to benefit, to assess patient priority. To address the problems of waiting list management, the Western Canada Waiting List (WCWL) Project developed 5 tools designed to provide an explicit, transparent and fair method for prioritizing patients on waiting lists.7
Priority criteria are currently being used in New Zealand and parts of the United Kingdom, but evidence of their reliability and validity is minimal.8,9 In this paper we address aspects of construct validity of the hip and knee replacement priority criteria tool (HKPT), designed to rank order patients scheduled for primary or revision hip and knee replacement.10
Traditionally, there are 3 types of validity: construct, content and criterion-related. This concept has evolved into a more unified view with construct validity as the foundation of validity inquiry, a foundation that subsumes construct, content and criterion-related evidence.11–13 Validation on the basis of only a single type of evidence is no longer considered sufficient. Further, the tools themselves are no longer to be validated, but rather the inferences about score meaning or interpretation. Messick13 has defined validity as the process of evaluating the degree to which the empirical evidence and theoretical rationales support interpretations and actions based on a score or other indicator. This broader conceptualization of validity is increasingly used by health services researchers,14 and is adopted as a framework in this paper.
The HKPT comprises 7 criteria developed by a panel of practitioners and researchers.15 A weighted sum results in a priority criteria score (PCS) intended to rank patients in order of urgency. We examine here aspects of the validity of the PCS as a measure of patient urgency. Specifically, our research questions were:
What is the congruence between the PCS and other indicators of patient urgency?
What are the convergent and discriminant validity characteristics of the PCS in relation to a patient-rated measure of health status?
What are physician-rated maximum acceptable waiting times (MAWTs) for different levels of patient urgency?
Methods
To address these questions, 2 studies were designed. Both have been described in a separate paper on tool development and reliability.10 The first collected data on consecutive patients seen by orthopedic surgeons who completed an assessment of priority criteria for each patient. The second study used data obtained from clinicians who evaluated videotaped consultation interviews between an orthopedic surgeon and 6 patients of differing urgency who were on a waiting list for hip or knee replacement. Group means were compared using the 2-tailed t test for independent samples. A p value of less than 0.05 was considered significant.
Study 1
From December 1999 to May 2000, data were collected on consecutive patients seen by 17 orthopedic surgeons from across 4 provinces in Western Canada. The HKPT was completed by the surgeon at the patient visit. In addition to measures of patient urgency and health status, data obtained included age, gender, joint site, type of surgery (primary or revision), 2 measures of surgeon-rated patient urgency, and diagnosis. Diagnoses were coded using the ICD-9.
The HKPT comprises 7 criteria, each with 3 to 4 severity levels measuring: pain on motion; pain at rest; ability to walk without significant pain; functional limitations; abnormal findings on physical examination related to the affected joint; potential for progression of the disease documented by radiographic findings; and threat to role and independence. A panel of orthopedic surgeons, researchers and clinicians developed the criteria, which were adapted from the New Zealand clinical priority assessment criteria (CPAC).16 Weights were determined by regression analysis and clinical judgement, and the PCS was calculated by summing the weighted items.
Surgeons were also asked to rate their patients on 2 measures of urgency:
A 100-mm visual analogue scale (VAS) with anchors of 0 (not urgent at all) and 100 (extremely urgent: just short of an emergency)
A 5-point Likert scale on relative urgency scored from 1 (much less urgent than the average patient) to 5 (much more urgent than the average patient).
A subset of surgeons collected patient data using the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC),17 a disease-specific questionnaire that consists of 3 subscales measuring: pain, stiffness and physical function. The WOMAC tool is widely used to measure function and symptoms in patients with osteoarthritis of the hip or knee. It was not developed to rank relative urgency for intervention. It consists of 24 items (5 pain, 2 stiffness and 17 physical function) with the degree of severity or difficulty measured on a 5-point Likert scale (0 = none, 1 = mild, 2 = moderate, 3 = severe, 4 = extreme). Subscale scores were derived from the summation of items for each dimension and transformed to a 0–100 scale, with the higher score reflecting worse function.
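The subscale transformation described above can be illustrated with a minimal sketch. It assumes a simple linear rescaling (raw sum divided by the maximum possible sum, multiplied by 100); the text does not give the exact formula, and the function name and the sample responses are hypothetical.

```python
# Item counts per WOMAC subscale, as stated in the text.
WOMAC_ITEMS = {"pain": 5, "stiffness": 2, "function": 17}
MAX_ITEM_SCORE = 4  # 5-point Likert scale scored 0 (none) to 4 (extreme)


def womac_subscale_score(subscale, responses):
    """Sum the item responses and rescale to 0-100 (higher = worse).

    Assumption: linear rescaling by the maximum possible raw sum.
    """
    expected = WOMAC_ITEMS[subscale]
    if len(responses) != expected:
        raise ValueError(f"{subscale} needs {expected} items, got {len(responses)}")
    if any(r not in range(MAX_ITEM_SCORE + 1) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    raw = sum(responses)
    return 100.0 * raw / (expected * MAX_ITEM_SCORE)


# Hypothetical patient whose 5 pain items are rated moderate to severe.
pain_score = womac_subscale_score("pain", [3, 2, 3, 2, 3])
```

Under this assumption each subscale is comparable on the same 0–100 range regardless of how many items it contains.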
The congruence between the PCS and other measures of patient urgency was evaluated by correlational analysis. As a guideline to assess the strength of a relationship, an r value greater than 0.3 was the minimal correlation considered important.18 It was expected that correlations between surgeon-rated measures of urgency would be strong (≥0.8) and that correlations between surgeon and patient measures would be moderate (at least 0.5). Analysis of convergent and discriminant validity between the PCS and the WOMAC was based on the approach described by Campbell and Fiske.19 We hypothesized that correlations between patient- and surgeon-rated variables measuring similar constructs (e.g., pain at night) would be moderate and positive (convergent validity) and that correlations between variables measuring the same construct would be higher than correlations between variables measuring different constructs (discriminant validity). For example, the correlation between patient- and surgeon-rated pain at night should be higher than the correlation between pain at night and pain on walking. To determine conceptually similar variables, the WOMAC criteria and subscales were matched to the priority scoring criteria.
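The convergent and discriminant comparison can be sketched as follows. This is an illustration only: the ratings are invented, the variable names are hypothetical, and the use of the Pearson coefficient is an assumption, since the text does not name the correlation statistic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented ratings for 8 patients (not study data).
surgeon_pain_rest = [1, 2, 2, 3, 4, 1, 3, 4]  # surgeon-rated criterion
patient_pain_rest = [0, 2, 1, 3, 4, 1, 2, 3]  # patient-rated item, same construct
patient_pain_walk = [3, 1, 4, 2, 3, 2, 4, 3]  # patient-rated item, different construct

convergent = pearson_r(surgeon_pain_rest, patient_pain_rest)
discriminant = pearson_r(surgeon_pain_rest, patient_pain_walk)

# The hypothesized pattern: same-construct correlation exceeds
# the different-construct correlation.
supports_validity = convergent > discriminant
```

With real data the same comparison would be repeated for each matched pair of criteria, and the full set of correlations inspected rather than a single pair.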
Study 2
After informed consent, 6 patients on a waiting list for hip or knee replacement surgery (3 hip, 3 knee), representing a range of urgency, were recruited from participating doctors’ practices and videotaped during a clinical interview with an orthopedic surgeon. The interviews were conducted by 3 surgeons, including one of the authors (G.A.). The surgeon posed questions normally covered in an initial patient history-taking and conducted an examination of the hip or knee. Videotapes also included x-rays of the affected joint and an explanation of the findings. Panel members and their colleagues viewed the videotapes and independently scored each patient using the HKPT. They also rated the urgency for each patient on the 100-mm VAS. Waiting time was defined as the time from consultation with the surgeon to the surgical procedure. MAWT from the physician’s perspective was measured by the question: “In your clinical judgement, what should be the maximum waiting time for this patient?”
Convergent validity was assessed by correlational analysis of the PCS, VAS and MAWT. To establish preliminary MAWTs for different levels of urgency, the distribution of MAWTs was mapped onto 5 groups, based on the PCS expressed in percentiles. Percentiles allow a comparison of relative performance with respect to 2 different variables, in this case, urgency scores and MAWTs.
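The mapping of MAWTs onto PCS-based urgency groups can be sketched as below. The ratings are invented, the function name is hypothetical, and the choice of the "inclusive" percentile method is an assumption; the text does not specify how percentile cut points were computed.

```python
from bisect import bisect_left
from statistics import median, quantiles


def median_mawt_by_urgency(ratings):
    """Group (pcs, mawt_weeks) pairs into PCS quintiles and return the
    median MAWT per group. Group 1 is the least urgent fifth, group 5
    the most urgent fifth.
    """
    pcs_values = [pcs for pcs, _ in ratings]
    # Four cut points at the 20th, 40th, 60th and 80th percentiles.
    cuts = quantiles(pcs_values, n=5, method="inclusive")
    groups = {level: [] for level in range(1, 6)}
    for pcs, mawt in ratings:
        # A score at or below a cut point falls in the lower group.
        level = bisect_left(cuts, pcs) + 1
        groups[level].append(mawt)
    return {level: median(vals) for level, vals in groups.items() if vals}


# Invented ratings: (PCS, maximum acceptable waiting time in weeks).
example_ratings = [
    (10, 26), (20, 24), (30, 20), (40, 16), (50, 12),
    (60, 12), (70, 8), (80, 6), (90, 4), (100, 2),
]
mawt_by_level = median_mawt_by_urgency(example_ratings)
```

Because the cut points are derived from the sample itself, the resulting groups are sample-dependent, which is why percentile-based thresholds need to be re-estimated for each population in which they are applied.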
Results
Study 1
The sample comprised 406 patients (240 [59%] female), ranging in age from 17 to 93 years (mean [and standard deviation] 66.8 yr [13.1]). One hundred and ninety-three (48%) were on a waiting list for hip replacement and 213 (52%) for knee replacement. Primary arthroplasty was scheduled in 369 (91%) and revision arthroplasty in 37 (9%). Of patients scheduled for primary arthroplasty, 336 (91%) had a diagnosis of osteoarthritis, 16 (4%) had rheumatoid arthritis, and 17 (5%) had other conditions. The analysis was based on 394 patients for whom complete data for the priority criteria and VAS urgency were available. To determine the effect of volume of patients assessed on urgency ratings, surgeons were grouped into high (> 20) and low (≤20) providers split by the median number of patients assessed. On t testing there was no difference in the PCS or VAS urgency between the groups.
The WOMAC was available for a subgroup of 60 patients. There was no significant difference in gender, type of surgical procedure or urgency measure (PCS, VAS, relative urgency) for patients with and without WOMAC data. However, the 60 patients for whom the WOMAC was available were significantly younger than the other patients (mean 61.4 and 67.8 yr respectively).
The mean (and standard deviation [SD]) PCS was 48.39 (20.61) and VAS urgency 59.85 (21.63) (Table 1). Mean WOMAC subscale scores ranged from 58.82 (function) to 64.92 (pain). Analysis using t tests for independent samples showed that female patients had a significantly higher mean PCS (50.25) and VAS (61.81) than male patients (45.26 and 56.53 respectively). Patients requiring revision arthroplasty had a significantly higher mean PCS (63.87) and VAS (79.71) than patients scheduled to undergo primary replacement (46.46 and 57.81 respectively).
The multiple correlation (R) between the combined 7 priority criteria and the VAS urgency was 0.82, whereas the correlation of the PCS with the 5-point relative urgency scale was 0.78. To compare distributions of the PCS and VAS and identify outliers, PCS and VAS scores were grouped into 5 equal groups based on percentiles (Table 2). For example, patients with a PCS at or below the 20th percentile (i.e., a PCS ≤ 28) would be among the 20% least urgent, whereas those with a PCS above the 80th percentile (PCS > 64) would be among the 20% most urgent. Similarly, patients with a VAS score of 41 or less would be the least urgent and those with a VAS greater than 79 would be the most urgent. Outliers were defined as cases classified in the 20% most urgent group by the VAS and the 20% least urgent by the PCS. Only 1 outlier was identified (PCS 22, VAS 83). This patient was scheduled for revision arthroplasty because of recurrent dislocation and was rated as “much more urgent than the average patient” on the relative urgency scale. No patient was in the 20% most urgent group based on the PCS and in the 20% least urgent based on the VAS.
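The outlier rule can be expressed as a short check. The cut points below are the 20th and 80th percentile values reported above for study 1; the function name and return labels are hypothetical, and the sketch is not the authors' analysis code.

```python
# Percentile cut points reported in the text for study 1.
PCS_P20, PCS_P80 = 28, 64
VAS_P20, VAS_P80 = 41, 79


def discordance(pcs, vas):
    """Flag cases where the PCS and VAS disagree at the extremes.

    Returns "vas_high_pcs_low" for the outlier definition used in the
    text (20% most urgent by VAS, 20% least urgent by PCS),
    "pcs_high_vas_low" for the reverse pattern, and "none" otherwise.
    """
    if vas > VAS_P80 and pcs <= PCS_P20:
        return "vas_high_pcs_low"
    if pcs > PCS_P80 and vas <= VAS_P20:
        return "pcs_high_vas_low"
    return "none"


# The single outlier reported in the text (PCS 22, VAS 83).
reported_case = discordance(22, 83)
```

Applied to each patient in turn, this check reproduces the two comparisons described in the paragraph: the outlier definition and its reverse.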
Fig. 1 compares the distribution of the PCS and VAS urgency scores across the 5 levels of urgency relative to the average patient. Only 216 patients were included because the relative urgency question was removed after revision of the HKPT form. Although both scales were positively correlated with relative urgency, the PCS had a greater range of scores than the VAS urgency for all but the middle levels of relative urgency (Fig. 1). Thirty-two (15%) patients were rated as “much more urgent” and 65 (30%) “more urgent” than the average patient. Of those rated as “much more urgent” than the average patient, 27 (84%) were above the 80th percentile for the VAS and 25 (78%) were above the 80th percentile for the PCS.
The correlation between the PCS and the WOMAC function subscale was 0.48. To assess convergent and discriminant validity, 3 priority criteria were tested against WOMAC criteria and subscales measuring similar constructs: pain on motion, pain at rest and function. Convergent validity was moderate, with correlations ranging from 0.45 to 0.56 for similar constructs, and these were greater than the correlations between measures of different constructs (0.21–0.40) (Table 3).
Study 2
Nineteen experts (14 orthopedic surgeons, 4 other physicians and 1 physical therapist) rated the 6 patient interviews using the HKPT, the VAS and a MAWT. Correlations between the PCS and VAS urgency rating were high (0.89), whereas MAWTs were moderately correlated with the PCS (–0.67) and the VAS urgency findings (–0.74). When we mapped MAWTs to 5 levels of urgency based on the PCS by grouping patients into quintiles according to the PCS scores (e.g., patients with a PCS > 80 were rated among the 20% most urgent and those with a PCS < 30 were rated among the 20% least urgent), we found that both the mean and median MAWTs decreased with increasing levels of urgency (Table 4). Median MAWTs ranged from 4 weeks (most urgent group) to 24 weeks (least urgent group).
Discussion
The HKPT was designed to capture clinical judgement and the complex decision-making process involved in assessing urgency for joint arthroplasty. Although there is no standard against which to test the PCS, one way of measuring convergent validity is to compare the PCS to other methods of physician judgement of urgency (i.e., urgency measured on a VAS and urgency relative to average patients in their practice). In study 1, the correlation between the PCS and relative urgency was strong (0.78) and the 7 priority criteria combined were highly correlated with the VAS (R = 0.82). Compared with 5 levels of relative urgency, the PCS and VAS discriminated fairly well between extreme levels of urgency, although there was some overlap of the PCS between adjacent levels of relative urgency, particularly in the middle group. For example, the PCS ranged from 7 to 70 for patients rated “about as urgent” as other patients in practice. For cases rated “much more urgent” and “much less urgent,” the range of PCS scores was generally less, with the exception of 1 outlier.
Outliers were defined as patients in the 20% most urgent group as measured by the VAS and also in the 20% least urgent group based on the PCS. An examination of the presence and characteristics of outliers is important to the validity of the PCS as a measure of urgency, particularly where high-stakes decisions are made on the basis of a priority score. For example, in New Zealand, where thresholds for the CPAC determine access to surgery,20 errors in the conceptualization and measurement of urgency could have significant consequences.
Potential limitations of the VAS urgency and relative urgency measures are the variations in the range of the VAS for different surgeons and in case mix in different surgeons’ practices against which each patient is assessed. Although we found no difference in priority ratings among groups of surgeons based on the number of patients assessed, there is some evidence that perception of priority could be subject to the effects of actual wait time in orthopedic units. In a Swedish study of assessment and prioritization of identical simulated referrals for orthopedic consultation, units with longer waiting times assigned patients a lower priority than other units.21 The level of agreement for indications for joint arthroplasty also varies among physicians.22–25 Unfortunately, an objective criterion against which to test the PCS as a measure of urgency does not exist. However, the criteria do reflect the basis of physicians’ best judgements of urgency.
In the comparative analysis of the PCS with the WOMAC, which were developed for differing reasons, evidence generally supported the convergent and discriminant validity of the PCS. Similar criteria measuring pain and function in the HKPT and WOMAC were moderately correlated (0.45–0.56) and more strongly related than items measuring different constructs (e.g., pain at rest and pain walking). The overall correlation of the PCS and WOMAC function was moderate (0.48) in contrast to the findings of Derrett and associates,8 who reported low correlations (0.29) between the New Zealand CPAC and a condition-specific tool in patients wait-listed for hip and knee replacement surgery. In addition, they found little relationship between the hip and knee CPAC and patient benefit, as measured by improvement in health-related quality of life.
In study 2, correlations between the PCS and VAS urgency were high, whereas correlations between perceived MAWT and the PCS (–0.67) and VAS (–0.74) were moderately strong. For use in the management of waiting lists, MAWTs need to be established for different levels of urgency. Little work has been done in this area. Clinically reasonable waiting times in Canada have been assessed largely through physician opinion surveys with no allowance for differing levels of urgency. For example, physicians responding to the Fraser Institute surveys estimated clinically reasonable waiting times of 6.5 weeks from the consultation to the surgical procedure.26
Naylor and Williams27 used 4 groups of time frames to determine urgency ratings for a waiting list for hip or knee replacement surgery, 0–4 weeks for the most urgent group and 26–52 weeks for the least urgent. Results from our study provided preliminary median MAWTs for varying levels of urgency, based on the PCS. These ranged from 4 weeks for the most urgent to 24 weeks for the least urgent, highly comparable to the waiting time suggested by Naylor and Williams.
A limitation of our MAWT estimates comes from the use of a limited number of patients, assessed in a simulated clinical situation. It is also important to note that although most of the percentiles for the PCS were similar for both studies, normative scores, such as percentiles, are sample-dependent and should be collected on a representative sample of the population in which the scores will be used. A further limitation is that these are surgeons’, not patients’, views.
The rationale for using priority scoring criteria is to improve fairness, explicitness and transparency, and to provide more consistent access to surgery. However, a criticism has been either weak or no evidence of the validity of techniques to rank patients in order of urgency for various procedures.8,9,20,28–32 Validation is a continuous process of evaluating evidence over time. Accordingly, we acknowledge that the HKPT needs to be tested in a wide range of clinical populations. Our work is continuing on validity testing and the establishment of MAWTs in clinical practice. Prioritizing patients based on clinical urgency rather than simple queuing should result in relative improvement in their clinical outcome after surgery. Evidence of the relationship of the PCS to patient outcomes would provide important support for the validity of the PCS as a prioritization tool. Although evidence suggests that patients who have worse preoperative functional status may have comparatively worse pain and function 1–2 years after arthroplasty,33,34 the impact of waiting on patient outcomes is unclear.35,36 Further research is needed to understand the relationships and possible interactions between patient urgency, length of waiting time and patient benefit. In addition, the impact of short-term variations in symptoms on patients’ priority rankings is unknown and would be an important aspect to assess for the fair use of the tool. Implementation should involve continuous monitoring and an evaluation of the effects of implementation on patient outcomes, case mix, patterns of resource use, gaming (i.e., playing the system) and impact on the patient–doctor relationship.20,31,37,38
Conclusions
Our preliminary results show support for the validity of the PCS as a measure of physician-rated urgency. Although only 1 case was identified as an outlier, the implication for implementation of scoring tools is that continuous monitoring and evaluation are needed to determine validity in clinical practice. Results also support the convergent and discriminant validity of the PCS in relation to similar dimensions in the WOMAC.
Acknowledgements
Members of the Steering Committee of the Western Canada Waiting List Project are as follows: Dr. Tom Noseworthy, Department of Community Health Sciences, University of Calgary (Chair); Dr. Morris L. Barer, Centre for Health Services and Policy Research, and Department of Health Care and Epidemiology, University of British Columbia, Vancouver; Dr. Charlyn Black, Centre for Health Services and Policy Research, University of British Columbia, Vancouver; Ms. Lauren Donnelly, Acute and Emergency Services Branch, Saskatchewan Health, Regina; Dr. David Hadorn, Western Canada Waiting List Project; Dr. Isra Levy, Health Programs, Canadian Medical Association, Ottawa; Mr. Steven Lewis, Access Consulting, Saskatoon; Mr. John McGurran, Western Canada Waiting List Project, and Department of Public Health Sciences, University of Toronto; Dr. Sam Sheps, Department of Health Care and Epidemiology, University of British Columbia, Vancouver; Dr. Mark C. Taylor, Department of Surgery, University of Manitoba, Winnipeg; Mr. Laurie Thompson, Health Services Utilization and Research Commission, Saskatoon; Mr. Darrell Thomson, British Columbia Medical Association, Vancouver; Ms. Barbara Young, Clinical Evaluation Services, Calgary Regional Health Authority, Calgary.
The Western Canada Waiting List Project was supported by a financial contribution from the Health Transition Fund (Health Canada) as Project NA489. The views expressed herein do not necessarily represent the official policy of federal, provincial or territorial governments.
We are indebted to the 19 partner organizations from the 4 western Canadian provinces for their ongoing support throughout the project: British Columbia Medical Association; Capital Health Region (Victoria); Vancouver/Richmond Health Board; British Columbia Ministry of Health; University of British Columbia, Centre for Health Services and Policy Research; Alberta Medical Association; Capital Health Authority (Edmonton); Calgary Regional Health Authority; Alberta Health and Wellness; University of Alberta, Department of Public Health Sciences; Saskatchewan Medical Association; Regina Health District; Saskatoon District Health; Saskatchewan Health; Health Services Utilization and Research Commission; Winnipeg Regional Health Authority; Manitoba Health; Manitoba Centre for Health Policy and Evaluation; Canadian Medical Association.
We wish to acknowledge the members of the hip and knee replacement panel who contributed to the development of the hip and knee replacement surgery priority criteria tool: Dr. Ted Findlay, Dr. Donald Garbuz, Dr. Robert Glasgow, Ms. Karin Greaves, Dr. David Hedden, Dr. Mary Hurlburt, Dr. Bill Johnston, Dr. Stewart McMillan, Dr. Jack Reilly, Dr. Anne Sclater, Dr. Kenneth Skeith and Dr. Lowell van Zuiden. We thank colleagues of the panel members and the patients who participated in the pilot testing and reliability work in Winnipeg, Regina, Saskatoon, Calgary, Edmonton and Vancouver. Finally, we thank Ms. Elaine Dunn and Ms. Anne-Marie Pedersen for their contributions to data collection.
Footnotes
Reprint requests to: Dr. Tom W. Noseworthy, Chair, Western Canada Waiting List Project Steering Committee, University of Calgary, Heritage Medical Research Bldg., 3330 Hospital Dr. NW, Calgary AB T2N 4N1; fax 403 210-9378; tnosewor{at}ucalgary.ca
Competing interests: None declared.
Accepted September 5, 2003.