Richard Armour, Brett Williams
Learning outcomes
Both quantitative and qualitative research methodologies may be applied in paramedic research as we seek to understand not just the ‘what’ of paramedic practice, but also the ‘why’. When conducting and reviewing either type of research, though, it is critical that we are satisfied the research is accurate and as free from error as is feasibly possible. A careful, considered methodological approach, with particular attention paid to the concepts of reliability and validity, will assist in achieving these goals.
As overall concepts, reliability is concerned with the consistency or repeatability of the results, tests or measurements used within research1, while validity examines whether the research measures what it set out to measure and how applicable it is to a wider population2. Although they share similar overarching principles in ensuring the accuracy and trustworthiness of research, reliability and validity are approached differently in quantitative and qualitative research.
When applying the concept of reliability to quantitative research, we can consider reliable research to be that which utilises methods, instruments or measurements that produce consistent results, and whose results can be reproduced if the study is repeated under the same conditions.3,4 Assessing reliability allows us to appreciate how much variability in the measured results is due to errors in measurement, and how much reflects a true result.5 Both reliability and validity are central elements of psychometrics and measurement.
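One common way to formalise this split, borrowed from classical test theory rather than stated in this chapter, is to treat each observed score as a true score plus random measurement error. Assuming the error is uncorrelated with the true score, reliability can be sketched as the share of observed variance attributable to true variance. The symbols X (observed score), T (true score) and E (measurement error) are introduced here for illustration only.

```latex
X = T + E, \qquad \operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)

\text{reliability} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
                   = 1 - \frac{\operatorname{Var}(E)}{\operatorname{Var}(X)}
```

Under this sketch, a reliability of 0.80 would mean that roughly 80% of the variability in measured results reflects true differences and 20% reflects measurement error.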
High-quality quantitative research will demonstrate how the methods and measurement instruments chosen to answer the research question are reliable in obtaining consistent results. This may include selecting instruments that have previously been shown to be reliable, or creating and then validating instruments with sufficient internal consistency, stability and equivalence, the typical measures by which we may evaluate reliability. These properties are often examined using measurement frameworks such as classical test theory or item response theory.
Internal consistency, or homogeneity, assesses how reliable a measurement is by estimating how well items within the measurement instrument produce the same results as other items assessing the same construct within the same instrument.6,7 Stability examines the consistency of results obtained using the same instrument over time.8 This can be assessed using methods such as test-retest and parallel-form reliability testing. Equivalence is concerned with the consistency among multiple users of an instrument, or the inter-rater reliability.8 Equivalence allows us to understand how precise reported results may be and of what value they are in answering the research question.8,9
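Internal consistency is often summarised with a single coefficient. The sketch below uses Cronbach's alpha, a widely used statistic that this chapter does not itself name, so treat the choice as an assumption. The function name, the data layout and the questionnaire scores are all hypothetical and serve only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate Cronbach's alpha for a multi-item instrument.

    responses: one list per respondent, each holding that respondent's
    score on every item of the instrument (hypothetical data layout).
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items")

    # Sample variance of each item across respondents.
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 4-item questionnaire answered by 5 respondents (scores 1 to 5).
example_responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(example_responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together. What counts as sufficient depends on the instrument and its intended use.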
If we are satisfied that the results obtained are reliable, we must also consider whether they are valid. Broadly, the validity of quantitative research relates to how well the chosen methodology does, or does not, support discovering a true answer to the research question posed and whether the results can be extrapolated to our own setting.10 We can subdivide validity in quantitative research into internal and external validity.
The internal validity of a study is concerned with how well a study controls extraneous variables to allow for the establishment of a causal relationship between the independent and dependent variables.11 When evaluating the internal validity of a quantitative study, we must consider how appropriate the methodology is for answering the research question, whether the sampling and data analysis plans are appropriate and whether the stated conclusions match the reported results in the context of the sample obtained.11,12 Threats to the internal validity of a study will inherently vary depending on the research question and chosen methodology (see Table 18.1).
| Threat | Rationale |
|---|---|
| History | Encompasses the events a study participant or researcher experiences during the research, including staff changes, symptom exacerbations or factors as simple as listening to news stories related to the research. Less likely to affect short-term research, but may affect study outcomes over the longer term. |
| Maturation | Involves any biological changes occurring over time, including fatigue, wound healing and disease progression. |
| Testing | Testing may produce reactive responses from participants. Initial tests may provide clues to participants as to what the outcome of interest is, while during repeated testing, they may attempt to learn the ‘correct’ answers and modify their responses accordingly. |
| Instrument decay | The decay of any instrument used to measure outcomes is of concern to internal validity. An example is the decay of an automated non-invasive blood pressure monitor in a long-term study of blood pressure control. |
| Statistical regression | The concept of regression to the mean, in which participants who score highly on initial testing are likely to score lower the next time they are examined, with or without intervention, and vice versa. Thus, if participants are selected for research based on high or low scores, they will likely demonstrate a regression to the mean regardless of the intervention. |
| Selection | Potential bias in the selection of participants in the control and experimental groups. |
| Mortality | Includes not only mortality but study attrition and drop-out rate. May often be higher in experimental groups as these tend to require more commitment than control groups, leaving only highly motivated individuals in the experimental group and producing a skewed view of benefit. |
Source: Flannelly KJ, Flannelly LT, Jankowski KR. Threats to the internal validity of experimental and quasi-experimental research in healthcare. Journal of Health Care Chaplaincy. 2018 July; 24(3):107–30. Online. Available: http://doi.org/10.1080/08854726.2017.1421019.
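The statistical regression threat in Table 18.1 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: every participant has a stable true score, each test adds independent random measurement noise, and no intervention is applied. The function name, the score distribution and the parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=10_000, top_fraction=0.1, seed=42):
    """Return mean first-test and second-test scores for the top scorers.

    Assumes a stable true score per participant plus independent noise
    on each test, with no intervention between the two tests.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    first_test = [score + rng.gauss(0, 10) for score in true_scores]
    second_test = [score + rng.gauss(0, 10) for score in true_scores]

    # Select participants purely on the basis of a high first-test score.
    n_selected = int(n * top_fraction)
    ranked = sorted(range(n), key=lambda i: first_test[i], reverse=True)
    selected = ranked[:n_selected]

    return (
        mean(first_test[i] for i in selected),
        mean(second_test[i] for i in selected),
    )


first_mean, second_mean = simulate_regression_to_mean()
print(f"Selected group, first test mean:  {first_mean:.1f}")
print(f"Selected group, second test mean: {second_mean:.1f}")
```

Although nothing was done to the selected group, its second-test mean falls back towards the population mean of 50. A study that recruited on high initial scores could mistake this drop for a treatment effect.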
Although internal validity is a key priority in research, the importance of external validity in fields such as paramedicine cannot be overstated.13 External validity is concerned with whether we can apply results with good reliability and internal validity to patients within our own clinical setting.14 To determine whether a study has sufficient external validity, we must consider whether the sample obtained in the research is generalisable to the population we serve in our current setting, paying particular attention to factors such as age, sex, presence or absence of co-morbidities and shared backgrounds.
The concepts of reliability and validity as they exist in the quantitative paradigm are perhaps unsuitable, or unwieldy, to apply to qualitative research.15 While quantitative research follows structured and preset methods, qualitative research must remain flexible to themes that develop during the research.15 Despite this need for flexibility, it is important that we can be satisfied qualitative research is sufficiently rigorous. This led Guba and Lincoln to propose trustworthiness as a more applicable framework for assessing and preparing reliable and valid qualitative research.16 Although additional criteria have subsequently been added, the original framework proposed by Guba and Lincoln, encompassing the domains of credibility, dependability, transferability and confirmability, remains the most commonly applied.16
Credibility closely mirrors the quantitative concept of internal validity, as it relates to how the researcher demonstrates that the reported results are grounded in the reality of the participants’ responses.17 Taking steps to improve credibility in qualitative research ultimately ensures that the participants’ voices are appropriately reflected in the research product (see Table 18.2).17
| Guba and Lincoln Criteria | Strategies to Improve/Evaluate Criteria |
|---|---|
| Credibility17 | |
| Dependability18 | |
| Transferability19 | |
| Confirmability20 | |
Source: Noble H, Smith J. Issues of validity and reliability in qualitative research. Evidence-Based Nursing. 2015 April; 18(2): 34–5. Online. Available: http://doi.org/10.1136/eb-2015-102054.
The concept of dependability is concerned with the repeatability of results.21 However, the paradigm of qualitative research inherently lends itself to results that may not be reproducible.21 Instead, dependability in qualitative research is focused on the stability and consistency of the methods by which the researcher has conceptualised the research, collected the data and interpreted the findings.21
Transferability in qualitative research shares similarities with external validity in quantitative research, in that it relates to the extent to which the study’s findings could be applied elsewhere.19 Ultimately, though, it is not the role of the researcher to demonstrate how their results can be applied elsewhere. Rather, it is their role to provide sufficient context and evidence for readers to determine whether the results could be applied externally.19
Finally, confirmability seeks to establish how confident we are that the reported results are based on truth and not subject to the biases of the researchers.20 Some argue that confirmability is demonstrated when credibility, dependability and transferability are sufficiently demonstrated.20 In essence, when we consider whether the research demonstrates confirmability, we must ask whether another qualitative researcher working with the same data would draw similar conclusions.
The air ambulance service you currently work with is considering implementing the use of thoracic ultrasound for the evaluation of pneumothorax during flight, as they have found that the use of a stethoscope is limited by the noise of the aircraft. They discover a study by Roline and colleagues which may assist in determining if this is a feasible option.22
Roline and colleagues performed a prospective feasibility study of point-of-care ultrasound (POCUS) during flight to evaluate the lung sliding sign.22 Although a useful tool, POCUS must be interpreted by clinicians and so suffers from variable inter-rater reliability. In their methods section, Roline and colleagues outline that clinicians in the study received a 15-minute lecture, participated in a 60-minute hands-on training session and demonstrated competency on a volunteer model prior to participating.22 The use of standardised training is a recognised method for improving equivalence, and so reducing the impact of inter-rater variability on the reliability of quantitative research.
With these things considered, you decide that this study has taken sufficient steps in its methods to limit the impact of inter-rater variability on the reported outcomes, and you continue to assess the validity of the study.
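Inter-rater reliability of the kind considered in this case can also be quantified when two clinicians interpret the same scans. The sketch below uses Cohen's kappa, a common chance-corrected agreement statistic chosen here as an assumption. It is not taken from Roline and colleagues, and the ratings are hypothetical rather than data from their study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of cases")

    n_cases = len(rater_a)

    # Proportion of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n_cases

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n_cases**2

    if expected == 1:
        # Both raters used a single identical label throughout; kappa is
        # undefined, and returning 1.0 is a convention adopted in this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical interpretations of 10 ultrasound clips by two clinicians.
clinician_1 = ["sliding"] * 6 + ["absent"] * 4
clinician_2 = ["sliding"] * 5 + ["absent"] * 5
kappa = cohens_kappa(clinician_1, clinician_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the clinicians agree on 9 of 10 clips, but because half of that agreement would be expected by chance, kappa is lower than the raw agreement of 0.9.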
You are reviewing the use of intravenous fluids in paramedic practice and considering the relative pros and cons of different formulations. You come across a study by Self and colleagues comparing the use of balanced crystalloids versus saline in non-critically ill adult patients in the emergency department.23 You wonder whether this can be applied to paramedic practice.
In their study, Self and colleagues conducted a single-centre, pragmatic, multiple-crossover trial comparing balanced crystalloids such as lactated Ringer’s solution or Plasma-Lyte with saline among adults treated with intravenous crystalloids in the emergency department who were hospitalised outside the intensive care unit.23 The primary outcome of interest was hospital-free days, with secondary outcomes including major adverse kidney events within 30 days. The authors found no difference in hospital-free days (median 25 days, adjusted odds ratio 0.98, 95% CI 0.92–1.04, p = 0.41), and a lower rate of major adverse kidney events with the use of balanced crystalloids (adjusted odds ratio 0.82, 95% CI 0.70–0.95, p = 0.01).
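When reading results such as these, a quick check is whether each 95% confidence interval for the odds ratio includes 1, the value indicating no difference between groups. The sketch below applies that check to the two intervals reported above. The helper name and data structure are hypothetical, and the check is a reading aid rather than a substitute for the authors' analysis.

```python
def ci_excludes_null(lower, upper, null_value=1.0):
    """Return True when the interval does not contain the null value."""
    return not (lower <= null_value <= upper)


# Adjusted odds ratios and 95% confidence intervals as reported in the text.
reported_results = {
    "hospital-free days": (0.98, 0.92, 1.04),
    "major adverse kidney events": (0.82, 0.70, 0.95),
}

for outcome, (odds_ratio, lower, upper) in reported_results.items():
    if ci_excludes_null(lower, upper):
        verdict = "interval excludes 1"
    else:
        verdict = "interval includes 1"
    print(f"{outcome}: OR {odds_ratio} ({lower} to {upper}), {verdict}")
```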
At face value, this study may have some relevance to paramedicine, as the patient population seen in the emergency department is not dissimilar to the population seen by paramedics. However, when assessing external validity it is important to move beyond this general assumption. Self and colleagues examined a patient population with a median age of 54 (interquartile range [IQR] 37–67), equal sex distribution, a predominantly white background (78.2%) and a moderate preexisting comorbid burden, who were generally suffering from medical complaints.23
Having examined the patient population studied by Self and colleagues,23 you compare these data against the data available for your local service. Finding that the study population is similar to your current patient population, you consider that this study may have some external validity in your current setting.
It is paramount when conducting or critically appraising research that we are satisfied the results are consistent and precise, while ensuring the methods used to obtain these results logically assist in providing a true answer to the research question. Critical to achieving these goals are the concepts of reliability and validity. Although applied differently to accommodate the different paradigms of quantitative and qualitative research, reliability and validity are critical in ensuring the production and implementation of high-quality research.
1. Al-Jundi A, Sakka S. Critical appraisal of clinical research. Journal of Clinical and Diagnostic Research. 2017 May; 11(5):JE01–5. Online. Available: http://doi.org/10.7860/JCDR/2017/26047.9942.
2. Patino CM, Ferreira JC. Internal and external validity: can you apply research study results to your patients? Jornal Brasileiro de Pneumologia. 2018 May; 44(3):183. Online. Available: http://doi.org/10.1590/S1806-37562018000000164.
3. Matheson GJ. We need to talk about reliability: making better use of test-retest studies for study design and interpretation. PeerJ. 2019; 7:e6918. Online. Available: http://doi.org/10.7717/peerj.6918.
4. Lachin JM. The role of measurement reliability in clinical trials. Clinical Trials. 2004; 1(6):553–566. Online. Available: http://doi.org/10.1191/1740774504cn057oa.
5. Gosall N, Gosall G. The doctor’s guide to critical appraisal. Pastest, Cheshire, United Kingdom; 2015: 134.
6. McCrae RR, Kurtz JE, Yamagata S, Terracciano A. Internal consistency, retest reliability and their implications for personality scale validity. Personality and Social Psychology Review. 2011 Feb; 15(1):28–50. Online. Available: http://doi.org/10.1177/1088868310366253.
7. Boateng GO, Neilands TB, Frongillo EA, Melgar-Quiñonez HR, Young SL. Best practices for developing and validating scales for health, social and behavioural research: a primer. Frontiers in Public Health. 2018; 6:149. Online. Available: http://doi.org/10.3389/fpubh.2018.00149.
8. Heale R, Twycross A. Validity and reliability in quantitative studies. Evidence-Based Nursing. 2015 July; 18(3):66–67. Online. Available: http://doi.org/10.1136/eb-2015-102129.
9. Burns M. How to establish interrater reliability. Nursing. 2014 Oct; 44(10):56–58. Online. Available: http://doi.org/10.1097/01.NURSE.0000453705.41413.c6.
10. Slack MK, Draugalis JR. Establishing the internal and external validity of experimental studies. American Journal of Health-System Pharmacy. 2001 Nov; 58(22):2173–2181.
11. Flannelly KJ, Flannelly LT, Jankowski KR. Threats to the internal validity of experimental and quasi-experimental research in healthcare. Journal of Health Care Chaplaincy. 2018 July; 24(3):107–130. Online. Available: http://doi.org/10.1080/08854726.2017.1421019.
12. Leung L. Validity, reliability and generalizability in qualitative research. Journal of Family Medicine and Primary Care. 2015 Jul; 4(3):324–327. Online. Available: http://doi.org/10.4103/2249-4863.161306.
13. Steckler A, McLeroy KR. The importance of external validity. American Journal of Public Health. 2008 Jan; 98(1):9–10. Online. Available: http://doi.org/10.2105/AJPH.2007.126847.
14. Murad M, Katabi A, Benkhadra R, Montori VM. External validity, generalisability, applicability and directness: a brief primer. BMJ Evidence-Based Medicine. 2018 Jan; 23(1):17–19. Online. Available: http://doi.org/10.1136/ebmed-2017-110800.
15. Cypress BS. Rigor or reliability and validity in qualitative research: perspectives, strategies, reconceptualization and recommendations. Dimensions of Critical Care Nursing. 2017 Jul; 36(4):253–263. Online. Available: http://doi.org/10.1097/DCC.0000000000000253.
16. Morse JM, Barrett M, Mayan M, Olson K, Spiers J. Verification strategies for establishing reliability and validity in qualitative research. International Journal of Qualitative Methods. 2002 Jun; 1(2):13–22. Online. Available: http://doi.org/10.1177/160940690200100202.
17. Noble H, Smith J. Issues of validity and reliability in qualitative research. Evidence-Based Nursing. 2015 Apr; 18(2):34–35. Online. Available: http://doi.org/10.1136/eb-2015-102054.
18. Forero R, Nahidi S, De Costa J, Mohsin M, Fitzgerald G, Gibson N, McCarthy S, Aboagye-Sarfo P. Application of four-dimension criteria to assess rigour of qualitative research in emergency medicine. BMC Health Services Research. 2018 Feb; 18:120. Online. Available: http://doi.org/10.1186/s12913-018-2915-2.
19. Hadi MA, Closs SJ. Ensuring rigour and trustworthiness of qualitative research in clinical pharmacy. International Journal of Clinical Pharmacy. 2016 Jun; 38(3):641–646. Online. Available: http://doi.org/10.1007/s11096-015-0237-6.
20. Korstjens I, Moser A. Practical guidance to qualitative research. Part 4: trustworthiness and publishing. European Journal of General Practice. 2018 Dec; 24(1):120–124. Online. Available: http://doi.org/10.1080/13814788.2017.1375092.
21. Johnson JL, Adkins D, Chauvin S. A review of the quality indicators of rigor in qualitative research. American Journal of Pharmaceutical Education. 2020 Jan; 84(1):7120. Online. Available: http://doi.org/10.5688/ajpe7120.
22. Roline CE, Heegaard WG, Moore JC, Joing SA, Hildebrandt DA, Biros MH, et al. Feasibility of bedside thoracic ultrasound in the helicopter emergency medical services setting. Air Medical Journal. 2013 May; 32(3):153–157. Online. Available: http://doi.org/10.1016/j.amj.2012.10.013.
23. Self WH, Semler MW, Wanderer JP, Wang L, Byrne DW, Collins SP, et al. Balanced crystalloids versus saline in noncritically ill adults. New England Journal of Medicine. 2018; 378:819–828. Online. Available: http://doi.org/10.1056/NEJMoa1711586.