Step 3: Appraise

Your clinical question type will often determine which study design is most appropriate. However, it's not enough to assume that the evidence is high quality based purely on the type of study or its position in the evidence hierarchy.

The third step is to appraise the evidence that you have located for its validity (closeness to the truth), risk of bias (trustworthiness of results) and applicability to your patient (usefulness in clinical practice).

Watch the video below for a comprehensive introduction to critical appraisal.

Cochrane Common Mental Disorders 2019, 1. Introduction to critical appraisal
Running time: 9:33 minutes

Validity, bias and applicability

Validity refers to how rigorously a study was conducted and the extent to which the conclusions are true within the context that the research was undertaken. More valid research designs, methods and procedures ensure that the study results are less biased and the conclusions are more generalisable.

Bias refers to intentional and unintentional errors in the study design, research methods, data collection, and reporting of results during a research project.

It's important to know the criteria to use to critically evaluate the integrity, reliability, and applicability of health‐related research. This requires an understanding of the different categories of bias and the impact that these biases and uncertainty (random error) have on the outcomes of a study.

Conflicts of interest may also influence research reports, particularly the conclusions drawn from results. You need to decide whether to rely on the authors’ interpretation of their findings, as presented in the discussion of the article, or to form your own interpretation from the results.

Applicability refers to whether or not the study can be applied to your specific clinical setting and individual patient. This is where you will need to use your clinical expertise and professional judgment.

You can learn more about how to assess applicability in Step 4: Apply.

GRADE (Grading of Recommendations, Assessment, Development and Evaluations) is a transparent framework for developing and presenting summaries of evidence and provides a systematic approach for making clinical practice recommendations. It is the most widely adopted tool for grading the quality of evidence.

The GRADE approach helps you evaluate the strength of a recommendation to your patient. It also helps you identify the key factors that drive the direction and strength of that recommendation, and consider their role in shared decision-making.

Interpreting statistical data

To implement EBP, knowledge of statistical calculations is not required. However, the ability to interpret statistical results is essential.

There are common statistical terms that are used when discussing the results of a study. Having a good understanding of these terms will help you understand the results of the study and appraise it effectively.  Becoming familiar with the terminology is more important than understanding the calculations.

The examples below show how risk and adverse effects are calculated and give a brief overview of forest plots.

In an example randomised controlled trial, patients with back pain were assigned to either drug X (experimental) or a placebo (control). The results after 5 months are listed in the table below.

 

| Group | Outcome: No impact on back pain | Outcome: Relief from back pain | Non-event rate |
|---|---|---|---|
| Experimental (drug X) | 2 | 98 | Experimental non-event rate = 2/100 = 0.02 (2%) |
| Control (placebo) | 3 | 97 | Control non-event rate = 3/100 = 0.03 (3%) |

Modified risk: 2 out of 100 patients (2%) in the drug X (experimental) group had no impact on back pain.

Starting risk: 3 out of 100 patients (3%) in the placebo (control) group had no impact on back pain.

 

Absolute risk reduction

An absolute comparison of risks tells you how much lower the modified risk is than the starting risk in absolute terms. It is calculated by subtracting the modified risk from the starting risk.

Example:

Absolute risk reduction = Risk of no impact on back pain without treatment (control group) - Risk of no impact on back pain after treatment (experimental group)

Absolute risk reduction = 3% - 2% = 1% (0.03 - 0.02 = 0.01)

This means that for every 100 patients treated with drug X for 5 months (instead of a placebo), one fewer patient will have no impact on back pain.
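The subtraction above can be sketched in a few lines of Python. This is only an illustration using the counts from the example table; the function names `risk` and `absolute_risk_reduction` are invented for this sketch and do not come from any statistics library.

```python
from fractions import Fraction


def risk(events: int, total: int) -> Fraction:
    """Proportion of patients in a group who had the outcome of interest."""
    if total <= 0:
        raise ValueError("total must be a positive number of patients")
    return Fraction(events, total)


def absolute_risk_reduction(starting_risk: Fraction, modified_risk: Fraction) -> Fraction:
    """Starting (control) risk minus modified (experimental) risk."""
    return starting_risk - modified_risk


# Counts taken from the example trial: patients with no impact on back pain.
starting_risk = risk(3, 100)   # placebo (control) group
modified_risk = risk(2, 100)   # drug X (experimental) group

arr = absolute_risk_reduction(starting_risk, modified_risk)
print(f"ARR = {float(arr):.2f} ({float(arr):.0%})")  # ARR = 0.01 (1%)
```

Exact fractions are used here instead of floating-point numbers so that small rounding errors do not creep into later steps that build on the absolute risk reduction.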

 

Relative risk reduction

A relative comparison of risks tells you how much lower the modified risk is relative to the starting risk and reports the proportion or percentage of change. It divides the absolute risk reduction by the starting risk. A relative risk reduction can often make a treatment effect look more impressive than the absolute figures suggest.

Example:

Relative risk reduction = Absolute risk reduction / Risk of no impact on back pain without treatment (control group)

Relative risk reduction = 0.01/0.03 = 0.33 = 33%

Drug X lowers the risk of no relief from back pain by 33%.
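To see why a relative figure can look more impressive than an absolute one, the short Python sketch below compares the trial from the text with a second set of risks. The second population (30% and 20%) is hypothetical and invented purely for illustration, and the function name `relative_risk_reduction` is likewise an assumption of this sketch.

```python
from fractions import Fraction


def relative_risk_reduction(starting_risk: Fraction, modified_risk: Fraction) -> Fraction:
    """Absolute risk reduction divided by the starting (control) risk."""
    if starting_risk == 0:
        raise ValueError("RRR is undefined when the starting risk is zero")
    return (starting_risk - modified_risk) / starting_risk


scenarios = {
    # Risks from the example trial in the text.
    "trial in the text": (Fraction(3, 100), Fraction(2, 100)),
    # Hypothetical higher-risk population, NOT from the source.
    "hypothetical population": (Fraction(30, 100), Fraction(20, 100)),
}

for label, (starting, modified) in scenarios.items():
    arr = starting - modified
    rrr = relative_risk_reduction(starting, modified)
    print(f"{label}: ARR = {float(arr):.0%}, RRR = {float(rrr):.0%}")
```

Both scenarios give the same relative risk reduction of about 33%, yet the absolute reduction is 1% in the first and 10% in the second. This is why it helps to read the relative and absolute figures together.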

 

Number needed to treat

The number needed to treat is the number of patients that need to be treated with the experimental treatment, rather than the control treatment, for one additional patient to experience a good outcome.

Example:

Number needed to treat = 1 / Absolute risk reduction

Number needed to treat = 1 / 0.01 = 100 patients

Treating 100 patients with drug X for 5 months prevents one additional patient from having no impact on back pain.
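A minimal Python sketch of this calculation follows. It assumes the common convention of rounding the number needed to treat up to the next whole patient, and the function name `number_needed_to_treat` is made up for this example.

```python
import math
from fractions import Fraction


def number_needed_to_treat(starting_risk: Fraction, modified_risk: Fraction) -> int:
    """1 / absolute risk reduction, rounded up to a whole patient (assumed convention)."""
    arr = starting_risk - modified_risk
    if arr <= 0:
        raise ValueError("NNT only applies when the treatment lowers the risk")
    return math.ceil(1 / arr)


# Risks from the example trial: 3/100 on placebo, 2/100 on drug X.
nnt = number_needed_to_treat(Fraction(3, 100), Fraction(2, 100))
print(f"NNT = {nnt}")  # NNT = 100
```

The risks are passed as exact fractions on purpose. With ordinary floating-point numbers, 0.03 - 0.02 may not come out as exactly 0.01, and rounding up could then give a number needed to treat that is one patient too high.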

In an example randomised controlled trial, patients with back pain were assigned to either drug X (experimental) or a placebo (control). The adverse effects reported after 5 months are listed in the table below.

| Group | Outcome - YES adverse effect | Outcome - NO adverse effect | Totals |
|---|---|---|---|
| Experimental group (drug X) | 25 | 87 | 112 |
| Control group (placebo) | 15 | 73 | 88 |

| Term | Calculation | What this means |
|---|---|---|
| Experimental event rate (EER) = outcome present / total in experimental group | 25 / 112 = 22% | 22% of patients in the treatment group reported at least one adverse reaction. |
| Control event rate (CER) = outcome present / total in control group | 15 / 88 = 17% | 17% of patients in the placebo group reported at least one adverse reaction. |

Absolute risk increase (ARI) is the difference between the rates of events in the experimental and control group.

Absolute risk increase = EER - CER

22% - 17% = 5%

There is an absolute increase of 5% for adverse reactions in the treatment group compared to the placebo group.

Relative risk increase (RRI) is the absolute risk increase expressed as a proportion of the event rate in the control group.

Relative risk increase is often a larger number than the ARI and therefore may tend to exaggerate the difference.

Relative risk increase = (EER - CER) / CER

5% / 17% = 29%

There is a relative increase of 29% for adverse reactions in the treatment group compared to the placebo group.

Number needed to harm (NNH) is the number of patients that a clinician would have to treat with the experimental treatment over the specific period of time for one additional patient to experience an adverse outcome.

Number needed to harm = 1 / ARI

1 / 0.05 = 20

20 patients have to be treated with drug X for one additional patient to have an adverse reaction. Note that if the result is not a whole number you need to round up, never down!
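The adverse-effect measures can also be computed directly from the counts in the 2x2 table, as in the hedged Python sketch below. The function name `harm_measures` and the dictionary keys are assumptions made for this example. Because the sketch works from the unrounded event rates rather than the rounded 22% and 17%, its results differ slightly from the worked figures in the text (ARI of about 5.3%, RRI of about 31%, and an NNH of 19 after rounding up).

```python
import math
from fractions import Fraction


def harm_measures(exp_events: int, exp_total: int, ctl_events: int, ctl_total: int) -> dict:
    """EER, CER, ARI, RRI and NNH from the counts in a 2x2 table."""
    eer = Fraction(exp_events, exp_total)   # experimental event rate
    cer = Fraction(ctl_events, ctl_total)   # control event rate
    ari = eer - cer                         # absolute risk increase
    if cer == 0:
        raise ValueError("RRI is undefined when the control event rate is zero")
    if ari <= 0:
        raise ValueError("NNH only applies when the treatment raises the risk")
    rri = ari / cer                         # relative risk increase
    nnh = math.ceil(1 / ari)                # rounded up, as the text advises
    return {"EER": eer, "CER": cer, "ARI": ari, "RRI": rri, "NNH": nnh}


# Counts from the adverse-effects table: 25 of 112 on drug X, 15 of 88 on placebo.
measures = harm_measures(25, 112, 15, 88)
for name in ("EER", "CER", "ARI", "RRI"):
    print(f"{name} = {float(measures[name]):.1%}")
print(f"NNH = {measures['NNH']}")
```

Rounding the event rates before subtracting, as in the worked example, is easier by hand. Carrying the exact values through to the final step avoids compounding rounding error, which matters most when the absolute risk increase is small.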

A forest plot is a graphic representation of a meta-analysis. It shows critical information including the overall effect, relative risk and the level of heterogeneity between studies.

For more information about forest plots, you can visit the Systematic Reviews guide.

Appraising a systematic review

While systematic reviews are one of the highest levels of evidence, they still need to be appraised for their quality. The evidence that you can use to appraise systematic reviews is primarily found in the methods section of a published study. The article abstract may also include some of this information.

Watch the videos below to learn more about evaluating systematic reviews and meta-analyses.

Cochrane Common Mental Disorders 2019, 2. Systematic reviews and meta analysis
Running time: 29:19 minutes

Appraising a primary study

Primary studies will need to be scrutinised more heavily, as they have not undergone any form of pre-appraisal.

The evidence that you can use to appraise primary research is found primarily in the methods section of a published study. The article abstract may also include some of this information. This is where the investigators address the issue of bias, both conscious and unconscious.

The final assessment of a study's validity will not always be a "yes" or "no" decision. Instead, the assessment is made on a continuum ranging from strong studies that are very likely to yield accurate conclusions to weak studies that are very likely to yield biased conclusions. Inevitably, the judgment as to where a study lies on this continuum involves some subjectivity.
