Abstract
Background: Meta-analyses may be prone to generating misleading results because of a paucity of experimental studies (especially in surgery); publication bias; and heterogeneity in study design, intervention and the patient population of included studies. When investigating a specific clinical or scientific question on which several relevant meta-analyses may have been published, value judgments must be applied to determine which analysis represents the most robust evidence. These value judgments should be specifically acknowledged. We designed the Veritas plot to explicitly explore important elements of quality and to facilitate decision-making by highlighting specific areas in which meta-analyses are found to be deficient. Furthermore, as a graphic tool, it may be more intuitive than when similar data are presented in a tabular or text format.
Methods: The Veritas plot is an adaptation of the radar plot, a graphic tool for the description of multiattribute data. Key elements of meta-analytical quality such as heterogeneity, publication bias and study design are assessed. Existing qualitative methods such as the Assessment of Multiple Systematic Reviews (AMSTAR) tool have been incorporated in addition to important considerations when interpreting surgical meta-analyses such as the year of publication and population characteristics. To demonstrate the potential of the Veritas plot to inform clinical practice, we apply the Veritas plot to the meta-analytical literature comparing the incidence of 30-day stroke in off-pump coronary artery bypass surgery and conventional coronary artery bypass surgery.
Results: We demonstrate that a visually stimulating and practical evidence-synthesis tool can direct the clinician and scientist to a particular meta-analytical study to inform clinical practice. The Veritas plot is also cumulative, allowing the quality of evidence to be assessed over time.
Conclusion: We have presented a practical graphic application for scientists and clinicians to identify and interpret variability in meta-analyses. Although further validation of the Veritas plot is required, it may have the potential to contribute to the implementation of evidence-based practice.
Identifying the highest quality evidence from the literature and applying it to clinical practice has always been a challenge for clinicians. The term “meta-analysis” was developed to describe the methodological framework for the quantitative and systematic combination of results from previous research to reach conclusions about that body of research.1 Meta-analysis has progressed quickly and now occupies an important place in surgical research. Proponents of meta-analysis argue that robust analyses of well-conducted randomized controlled trials (RCTs) sit at the top of the “evidence hierarchy”2 and should carry more weight than other study designs when determining the effectiveness of clinical interventions.2–4
The challenges of applying meta-analysis in surgery
The application of meta-analysis in surgery is challenging and can be particularly problematic. In the following sections we discuss important methodological controversies that need to be considered by clinicians and scientists seeking to implement evidence-based medicine to achieve improvements in the quality of health care delivery and clinical care.
The “garbage in–garbage out” effect
The dependence of meta-analysis on the quality and characteristics of included studies was famously described as “garbage in–garbage out.”5 The application of meta-analysis to nonexperimental studies in particular is controversial, as the results of nonexperimental studies are vulnerable to bias by confounding factors.6 This is particularly important in surgical meta-analysis; the nature of surgical interventions often makes it difficult to perform well-conducted RCTs, as it would be unethical to randomly assign patients to groups in which they may be subject to potentially harmful risks.7 This is reflected in the relative paucity of surgical experimental studies compared with medical studies.8,9 Most literature assessing the efficacy of surgical interventions consists of retrospective case series, with RCTs accounting for less than 10% of the total.8,10–12 Although there are guidelines to assist with the preparation of both randomized (QUOROM statement)13 and observational studies (MOOSE statement),14 these are primarily targeted at researchers conducting meta-analyses and do not facilitate rapid assessment of the quality and type of included studies.
Heterogeneity or the “apples and oranges” effect
Heterogeneity in study design, intervention or patient population can result in the comparison of “apples and oranges” and may substantially affect the conduct and results of a meta-analysis.15 The perception of statistical heterogeneity often influences meta-analysts and clinicians in important decisions. These decisions include whether data extracted from different studies are similar enough to combine and whether the treatment advocated is applicable to the general targeted population.16
The recognition of heterogeneity can also be considered a positive finding. It widens the spectrum of the meta-analytical results and may highlight factors that influence outcome that were not observable in individual trials. Furthermore, if the effect is consistent even across discrepant studies, the case for the causality of the treatment is strengthened. Finally, if a meta-analysis is performed before beginning a new study, heterogeneity may help the investigator improve the study design by incorporating an understanding of these other factors.17 A commonly used measure of the degree of heterogeneity is the I2 statistic; however, this is often not calculated.18
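The I2 statistic mentioned above can be derived from Cochran's Q, the weighted sum of squared deviations of each study's effect estimate from the pooled estimate. The sketch below is our own illustration (the function name and inputs are ours, not part of the Veritas method) of the standard formula I2 = max(0, (Q − df)/Q) × 100%:

```python
def i_squared(effects, variances):
    """Cochran's Q and the I^2 statistic for between-study heterogeneity.

    effects: per-study effect estimates; variances: their variances.
    Returns (Q, I^2), where I^2 is the percentage of total variation
    across studies attributable to heterogeneity rather than chance.
    """
    weights = [1.0 / v for v in variances]           # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2
```

As a rough convention, I2 values near 0% suggest the studies are estimating a common effect, while large values flag the "apples and oranges" problem discussed above.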
Publication bias and the “file drawer problem”
The exclusion of studies from meta-analyses can bias the results. This can occur for 2 reasons: first, because of a flawed, incomplete literature search and second, because of publication bias or the “file drawer problem.”19 Publication bias is an inherent issue in meta-analyses. Studies are often not published in indexed journals and are consequently not included in meta-analyses if they do not report original, positive or statistically significant findings. If publication bias occurs, the subsequent meta-analysis of published literature may be misleading as it may not accurately reflect the available evidence.20
The simplest and most commonly used method to detect publication bias is an informal examination of a funnel plot. Formal tests for publication bias exist, but in practice few meta-analyses have assessed or adjusted for the presence of publication bias. A recent assessment of the quality of systematic reviews stated that only 6.5% of studies in high-impact general journals and 3.2% in specialty journals reported that a funnel plot had been examined.21
Variations in individual study characteristics may substantially affect one or all of these previously mentioned primary dimensions of meta-analytical quality. Secondary meta-analytical controversies include the chronology of meta-analyses and the patient population included. Often these factors are not considered in sufficient depth and may result in poor inferences being drawn from meta-analyses.
Study population
The contribution of the study population of individual studies to the results of a meta-analysis is important, as it is a major determinant of heterogeneity. For example, if the targeted intervention is only beneficial in high-risk groups such as the elderly,22 it would be hazardous to apply results of such a meta-analysis to low-risk groups. Furthermore, greatly differing event rates among populations reported in different studies can potentially bias the meta-analysis.23,24 This is particularly important in surgical meta-analysis because event rates can vary substantially depending on patient comorbidities.
Year of publication
The chronology of meta-analysis is important for a number of reasons. The year a study is published is a significant determinant of heterogeneity, as population characteristics may change over time, and immature technology and technical expertise may translate into unfavourable outcomes in early studies. These factors need to be taken into account, especially in surgical specialties, where new technologies and techniques are continuously developed and the learning curve is progressively overcome.25 Finally, as the pool of evidence accumulates with time, the summary results reported by meta-analyses will be refined.26
Meta-analysis is further complicated because attempts to adhere to all dimensions of primary and secondary meta-analytical quality will often suggest contradictory study inclusion and exclusion criteria and encourage different meta-analytical conduct. Consequently, meta-analysis, especially surgical meta-analysis, is often an exercise in compromise. For example, inclusion of only experimental trials may exclude the most contemporary literature, whereas an attempt to include all available evidence may introduce heterogeneity. Even a statistically robust meta-analysis built on a comprehensive literature search requires qualitative value judgments about the relative importance of these dimensions of quality. This does not necessarily demean the effectiveness of meta-analysis as a tool for literature review, evidence synthesis and clinical decision-making; however, it is imperative that these value judgments are explained and justified. Unfortunately, this is often not the case.18,21
Similarly, when investigating a specific clinical question on which several relevant meta-analyses may have been published, value judgments must be applied to determine which of the meta-analyses represents the most robust evidence. These value judgments should also be explicitly acknowledged.
Quality assessment tools of meta-analyses
More than 20 tools are available to assess the quality of systematic reviews.27 Nevertheless, most of the available instruments are not widely used. Several are lengthy and include complicated instructions. One of the most commonly used is the Overview Quality Assessment Questionnaire (OQAQ).28 An improved version of this tool is the recently developed and validated Assessment of Multiple Systematic Reviews (AMSTAR) tool,29 an 11-item instrument that assesses key attributes of a well-conducted meta-analysis. It assesses whether an a priori design was provided before the conduct of the meta-analysis, whether there was any duplicate study selection or data extraction and whether a comprehensive literature search was used. Furthermore, it asks whether the authors made an attempt to source grey literature. It looks at whether a list of included and excluded studies was provided and whether characteristics of the included studies were provided. Also, it determines whether the quality of the included studies was assessed, documented and used appropriately in formulating questions. Finally, it evaluates whether the methods used to combine the findings of the studies were appropriate, whether publication bias was assessed and whether conflicts of interest were stated. Each question has 4 responses: “yes,” “no,” “can’t answer” and “not applicable.” A “yes” gives a score of 1; any other response results in a score of 0. The overall score is out of 11.29 The AMSTAR tool, however, has its own limitations, including minimal use to date, lack of graphical output and failure to fully assess important dimensions of quality (e.g., how the study deals with heterogeneity). Furthermore, it is conceivable that in different clinical specialties the relative importance of dimensions of meta-analytical quality may differ.
For example, in a rapidly evolving specialty, it may be considered more important to include the most contemporary literature, whereas if several well-conducted randomized studies exist the quality of included studies may be considered more important. The AMSTAR tool is presented in Appendix 1.
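The AMSTAR scoring rule described above (1 point per "yes" across 11 items) is simple enough to sketch. The function name and input format below are our own illustrative choices, not part of the AMSTAR tool itself:

```python
AMSTAR_ITEMS = 11  # the AMSTAR checklist has 11 items

def amstar_score(responses):
    """Score an AMSTAR checklist.

    responses: 11 answers, each one of "yes", "no", "can't answer" or
    "not applicable". A "yes" earns 1 point; every other response earns 0,
    so the overall score is out of 11.
    """
    if len(responses) != AMSTAR_ITEMS:
        raise ValueError("AMSTAR requires exactly 11 item responses")
    return sum(1 for r in responses if r == "yes")
```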
The Veritas plot
We designed the Veritas plot to address the limitations of previous scoring systems, explicitly explore important elements of quality and facilitate decision-making by highlighting specific areas in which meta-analyses addressing similar clinical questions are found to be deficient. Furthermore, as a graphic tool we hoped that the Veritas plot would be more intuitive than presenting similar data in a tabular or text format.
In this paper, we demonstrate how the Veritas plot can be applied to an important example from the cardiothoracic literature. In addition, we discuss the strengths, limitations and potential for further development of the Veritas plot.
Methods
In this section we illustrate the key methodological steps required to create a Veritas plot, using an example from the cardiothoracic literature. By comparing the quality of meta-analyses that investigate the effect of off-pump coronary bypass techniques on the incidence of stroke, we hope to demonstrate the role of the Veritas plot in clinical decision-making.
The Veritas plot can be considered to be a variant of the radar or radial plot. A radar plot is a graphic display for comparing estimates that have differing precisions.30 Radar plots have existed for many years and are an important descriptive tool for multiattribute data. In general, the common feature of radar plots is that they are a circular graphing method and have a series of spokes or rays projecting from a central point, with each ray representing a different variable label. The values of the variables are encoded into the lengths of the rays, and the values so plotted are sometimes connected to form an enclosed figure.31 In the Veritas plot, the rays represent dimensions of meta-analytical quality.
In our example, we included the 6 dimensions of meta-analytical quality that most influence the final outcome. As important elements of meta-analytical quality, we included assessments of whether the studies consider publication bias and heterogeneity, and we have considered the design of studies included in the analysis. We included the AMSTAR score as one of our dimensions of quality, as it is a strong tool for assessing the quality and comprehensiveness of the literature review. Finally, we included the baseline population event risk and year of publication as further dimensions of meta-analytical quality, as meta-analyses addressing the effect of rapidly evolving technology on infrequently occurring side-effects could easily be biased by these factors. Our methods included the following steps.
1. Clinical question
A clinical question is raised (e.g., what is the difference in outcome of 30-day stroke between cardiac surgery patients undergoing conventional coronary artery bypass (CCAB) and off-pump coronary artery bypass (OPCAB) surgery?).
2. Search strategy
A search strategy must be conducted to investigate the clinical question. In our example, we used an expert search strategy as described by Kelly and colleagues32 to search for any relevant meta-analyses published up to and including March 2008. We searched the following databases: MEDLINE, EMBASE, CINAHL and Google Scholar. We searched for meta-analyses either specifically examining the outcome of 30-day stroke or several outcomes, including 30-day stroke. We limited the search to include “meta-analysis,” “human only studies” and “English language.” The results of our search are shown in Figure 1.
3. Ranking of studies
The studies identified require ranking. In our example, we identified 7 studies and ranked them according to 6 categories: study type, publication bias, heterogeneity, year of publication, risk of population included and AMSTAR score. The characteristics of the studies included are presented in Table 1.
The scoring system worked as follows. When analyzing each category, the study with the best score received n points, where n = the number of studies. The second-best study received n − 1 points, and so on. In the case of 2 studies performing equally well, the study with the next highest score would receive n − 2 points.
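The ranking scheme above can be sketched in code. This is our own illustrative reading of the scheme (the function name is ours): tied studies share the higher point value, and the study after a tie skips ahead, so 2 studies tied for best both receive n points and the next receives n − 2:

```python
def rank_points(values, higher_is_better=True):
    """Assign Veritas-style rank points to per-category study scores.

    With n studies, the best receives n points, the next n - 1, and so on.
    Tied studies share the same points; the study after a tie skips ahead
    (competition-style ranking).
    """
    n = len(values)
    ordered = sorted(values, reverse=higher_is_better)
    # points = n minus the position of the first study with this value
    return [n - ordered.index(v) for v in values]
```

For example, with category scores [10, 10, 5] for 3 studies, the 2 tied studies each receive 3 points and the third receives 1 point (n − 2).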
4. Implementation of data set in a radar plot
We constructed the radar plots using Microsoft Excel software (Microsoft Corp.).
5. Formatting the Veritas plot
We created the Veritas plot as a variant of the radial plot. Individual Veritas plots for all of the studies are shown in Figure 2. We showed the score of a study in each dimension of quality with a solid line, which can be compared with the triangles that represent the mean score across all 7 studies in each of the dimensions of quality. This way individual study performance in each dimension of quality can be evaluated.
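For readers wishing to reproduce a plot of this kind outside a spreadsheet, a minimal sketch using Python's matplotlib (an assumption on our part; the authors built their plots in Microsoft Excel) might look like the following. The dimension names and scores are illustrative placeholders; the solid line traces one study's scores and the triangle markers the across-study means:

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen; omit this line for interactive use
import matplotlib.pyplot as plt

# Illustrative rank points for one study and the across-study means
dimensions = ["Study type", "Publication bias", "Heterogeneity",
              "Year of publication", "Population risk", "AMSTAR"]
study_scores = [7, 5, 6, 4, 3, 7]
mean_scores = [4, 4, 4, 4, 4, 4]

# One ray (angle) per dimension of meta-analytical quality
angles = [2 * math.pi * i / len(dimensions) for i in range(len(dimensions))]

ax = plt.subplot(polar=True)
# Repeat the first point so each trace closes into a polygon
ax.plot(angles + angles[:1], study_scores + study_scores[:1], "-")  # solid line: study
ax.plot(angles + angles[:1], mean_scores + mean_scores[:1], "^")    # triangles: means
ax.set_xticks(angles)
ax.set_xticklabels(dimensions)
```

A call to `plt.savefig(...)` or `plt.show()` would then render the figure.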
We conceived the Veritas score as a summary statistic. It is calculated by taking the mean of the scores in the 6 dimensions of quality for each study. It is represented in the plots shown in Figure 2 by dashed lines. It can be compared with the mean Veritas score across all 7 studies (represented by the squares in Fig. 2) to give an overall estimate of the study quality. In Figure 3, the Veritas scores for all of the studies are plotted. For ease of interpretation, the scores of the individual studies in each of the dimensions of quality were not plotted. This figure can be used to compare overall quality in the included studies across all 6 selected dimensions of quality.
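The Veritas score computation described above — each study's mean across the 6 dimensions of quality, compared with the mean Veritas score across all studies — can be sketched as follows (the function name and input format are our own):

```python
from statistics import mean

def veritas_scores(dimension_scores):
    """Compute Veritas scores from per-dimension rank points.

    dimension_scores: maps each study to its points in the 6 quality
    dimensions. Returns (per-study Veritas scores, mean Veritas score
    across all studies); a study whose score exceeds the overall mean
    can be described as "better than average".
    """
    per_study = {study: mean(points)
                 for study, points in dimension_scores.items()}
    overall = mean(per_study.values())
    return per_study, overall
```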
6. Interpretation of the Veritas plot
The greater the score of a study in a specific quality dimension, and hence the further the solid plot is from the central pole, the better its performance in this quality dimension. If the solid plot lies outside of the triangles, which represent the mean score across all 7 studies in each dimension, then it could be said to perform “better than average” in this dimension. The Veritas score is shown with a dashed plot. Similarly, if it lies outside of the squares, which represent the mean Veritas score across all 7 studies, then the study could be said to be “better than average.” It is important to note that although the plot is effective in demonstrating the dimensions of quality in which the study performs well, the Veritas score has not been validated as a summary statistic. The Veritas plot is intended to facilitate value judgment in the quality assessment of meta-analysis to inform clinical practice and is not intended to replace such judgments. Consequently the Veritas score must be interpreted with caution.
Results
The results of our search strategy are shown in Figure 1. We identified 16 meta-analyses comparing OPCAB with CCAB surgery. Of these, 7 meta-analyses studied stroke.22,33–38 One of the studies considered RCTs and observational studies separately.36
We developed individual Veritas plots for each of the 7 studies (Fig. 2). The meta-analysis by Cheng and colleagues35 scored the highest in 4 of the 6 dimensions and had the highest Veritas score: 7.17.
Discussion
The advantages of well-conducted meta-analyses are that they allow objective appraisal of the evidence in comparison with traditional narrative reviews, provide a more precise estimate of a treatment effect and may explain heterogeneity among the results of individual studies. The application of meta-analysis in surgery, however, is problematic, and surgical meta-analyses are often vulnerable to bias. Patient population characteristics, technological development and evolving surgical expertise have the potential to substantially affect event rates, which can bias the results of meta-analyses.23,24
There are numerous instances where meta-analyses have pooled results from small trials with disparate results and, as such, have produced conflicting evidence.39–43 Furthermore, results have been generated that conflicted with those of subsequent large RCTs.44–47 When this occurs, the reliability of the evidence is questioned, resulting in poorly guided clinical decisions. Consequently, doubts have been raised about the reliability of using meta-analyses to guide clinical practice.48,49 Although even advocates of meta-analysis argue that such studies are not a substitute for RCTs,50 meta-analyses may be a useful guide to clinical decision-makers until unequivocal experimental evidence is available.6 However, if meta-analysis is to continue to have a role in surgical decision-making, surgeons need to be able to assess, compare and communicate the quality of meta-analyses, particularly in areas where several meta-analyses are available.
Strengths of the Veritas plots
We have presented a novel application of radar plots that clinicians can use when they are presented with results from several meta-analyses. Our Veritas plot enables the clinician to assess key attributes that evaluate the quality of a meta-analysis. This tool helps with the analytical process and allows clinicians to decide on the translation of results from the meta-analyses to patient care.
In addition to text or tables, research results can also be presented in graphic formats. Graphic displays are particularly suitable for illustrating relations and trends concisely.
Our tool has several advantages. First, it incorporates the different facets that constitute a well-conducted meta-analysis without the need for extensive narrative. It considers important elements of quality such as heterogeneity, publication bias, quality scoring, year of publication and the risk stratification of the group. It builds on the strengths of current quality-assessment tools such as the AMSTAR.27–29 It is an invaluable tool for appraising the ever-increasing number of meta-analyses in fields such as OPCAB surgery; hence it has cumulative potential. It is also versatile: in meta-analyses where attributes such as year of publication are not important, these can be removed. We believe that it is also a quick and simple tool that can be applied across any surgical discipline. Further benefit occurs by virtue of its graphic representation; our method makes it easier for the clinician to appraise the quality of meta-analyses.51
We believe that this tool builds on the foundations of the evidence-based medicine hierarchy and allows for effective communication between researchers and clinicians.52 It also allows interstudy variation to be assessed effectively and the findings of the study to be applied accordingly to daily surgical practice.
Limitations of the Veritas plots
Further validation of the Veritas plot by appraisers of meta-analyses is needed to assess its validity, reliability and perceived utility. Its application needs to be verified across specialties, and its user-friendliness should be confirmed. Furthermore, this tool is subjective; authors in other fields may dispute the emphasis we have placed on the attributes in our Veritas plot. The issue of scaling and appropriate weighting of attributes will need further validation.
At present this tool cannot compare the quality of studies from different clinical areas. Of equal importance is the fact that it relies on a study being ranked relative to others. Consequently, all dimensions of quality need to be assessed and obtained.
Conclusion
The Veritas plot is a practical, novel and useful aid in the quality assessment of several meta-analyses studying the same outcome. It was designed to consider important elements of meta-analytical quality such as heterogeneity, publication bias, study design, quality scoring, year of publication and population characteristics. The method is suitable, however, for adaptation to a variety of questions in evidence synthesis.
We therefore invite colleagues to consider applying and adapting the Veritas plot as a component of the processes of synthesizing and reporting the findings of multiple meta-analyses assessing similar outcomes.
Footnotes
Competing interests: None declared.
Contributors: Dr. Panesar was responsible for fundamental conceptual development, intellectual content, statistical analysis, data collection and interpretation and manuscript drafting. Dr. Rao was responsible for fundamental conceptual development, intellectual content, statistical analysis, data interpretation and manuscript and figure drafting. Drs. Vecht, Mirza, Netuveli, Morris and Rosenthal were responsible for data collection and preliminary analysis and provided important intellectual content. Drs. Netuveli and Morris are statisticians. Lord Darzi was responsible for providing important intellectual content throughout the manuscript’s production and for final approval of the version to be published. The Veritas Plot was conceived by Dr. Athanasiou, whose involvement was critical to every phase of this work. All authors read and approved the final manuscript.
- Accepted January 29, 2009.