Introduction Contemporary validity testing theory holds that validity lies in the extent to which a proposed interpretation and use of test scores is justified, the evidence for which is dependent on both quantitative and qualitative research methods. Despite this, we hypothesise that development and validation studies for assessments in the field of health primarily report a limited range of statistical properties, and that a systematic theoretical framework for validity testing is rarely applied. Using health literacy assessments as an exemplar, this paper outlines a protocol for a systematic descriptive literature review of the types of validity evidence being reported and of whether that evidence is reported within a theoretical framework.
Methods and analysis A systematic descriptive literature review of qualitative and quantitative research will be used to investigate the scope of validation practice in the rapidly growing field of health literacy assessment. This review method employs a frequency analysis to reveal potentially interpretable patterns of phenomena in a research area; in this study, patterns in types of validity evidence reported, as assessed against the criteria of the 2014 Standards for Educational and Psychological Testing, and in the number of studies using a theoretical validity testing framework. The search process will be consistent with the Preferred Reporting Items for Systematic Reviews and Meta-analyses statement. Outcomes of the review will describe patterns in reported validity evidence, methods used to generate the evidence and theoretical frameworks underpinning validation practice and claims. This review will inform a theoretical basis for future development and validity testing of health assessments in general.
Ethics and dissemination Ethics approval is not required for this systematic review because only published research will be examined. Dissemination of the review findings will be through publication in a peer-reviewed journal, at conference presentations and in the lead author’s doctoral thesis.
- validity testing theory
- health literacy
- health assessment
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Strengths and limitations of this study
This is the first systematic literature review to examine types of validity evidence for a range of health literacy assessments within the framework of the authoritative reference for validity testing theory, The Standards for Educational and Psychological Testing.
The review is grounded in the contemporary definition of validity as a quality of the interpretations and inferences made from measurement scores rather than as solely based on the properties of a measurement instrument.
The search for the review will be limited only by the end search date (March 2019) because health literacy is a relatively new field and no publications are expected from earlier than about 30 years ago.
All definitions of health literacy and all types of health literacy assessment instruments will be included.
A limitation of the review is that the search will be restricted to studies published and instruments developed in the English language, and this may introduce an English language and culture bias.
Introduction
Historically, the focus of validation practice has been on the statistical properties of a test or other measurement instrument, and this has been adopted as the basis of validity testing for individual and population assessments in the field of health.1 However, advancements in validity testing theory hold that validity lies in the justification of a proposed interpretation of test scores for an intended purpose, the evidence for which includes but is not limited to the test’s statistical properties.2–7 Therefore, to validate means to investigate, through a range of methods, the extent to which a proposed interpretation and use of test scores is justified.7–9 The term ‘test’ in this paper is used in the same sense as Cronbach uses it in his 1971 Test Validation chapter8 to refer to all procedures for collecting data about individuals and populations. In health, these procedures include objective tests (eg, clinical assessments) and subjective tests (eg, patient questionnaires) or a combination of both and may involve quantitative (eg, questionnaire) or qualitative methods (eg, interview). The act of testing results in data that require interpretation. In the field of health, such interpretations are usually used for making decisions about individuals or populations. The process of validation needs to provide evidence that these interpretations and decisions are credible, and a theoretical framework to guide this process is warranted.1 2 10
The authoritative reference for validity testing theory comes from education and psychology: the Standards for Educational and Psychological Testing (the Standards).3 The Standards define validity as ‘the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests’ and state that ‘the process of validation involves accumulating relevant evidence to provide a sound scientific basis for the proposed score interpretations’ (p.11).3 A test’s proposed score interpretation and use is described in Kane’s argument-based approach to validation as an interpretation/use argument (IUA; also called an interpretive argument).11 12 Validity testing theory requires test developers and users to generate and evaluate a range of validity evidence such that a validity argument can determine the plausibility of the IUA.3 7 9 11 12 Despite this contemporary stance on validity testing theory and practice, the application of validity testing theory and methodology is not common practice for individual and population assessments in the field of health.1 Furthermore, there are calls for developers, users and translators/adapters of health assessments to establish theoretically driven validation plans for IUAs such that validity evidence can be systematically collected and evaluated.1 2 7 10
The Standards provide a theoretical framework that can be used or adapted to form a validation plan for development of a new test or to evaluate the validity of an IUA for a new context.1 2 Based on the notion that construct validity is the foundation of test development and use, the theoretical framework of the Standards outlines five sources of evidence on which validity arguments should be founded: (1) test content, (2) response processes, (3) internal structure, (4) relationship of scores to other variables and (5) validity and the consequences of testing (table 1).3
Validity testing in the health context
Two of the five sources of validity evidence defined by the Standards (internal structure and relationship of scores to other variables) have a focus on the statistical properties of a test. However, the other three (test content, response processes and consequences of testing) are strongly reliant on evidence based on qualitative research methods. Greenhalgh et al have called for more credence and publication space to be given to qualitative research in the health sciences.13 Zumbo and Chan (p.350, 2014) call specifically for more validity evidence from qualitative and mixed methods.1 It is time to systematically assess if test developers and users in health are generating and integrating a range of quantitative and qualitative evidence to support inferences made from these data.1
In chapter 1 of their book, Zumbo and Chan report the results of a systematic search of validation studies from the 1960s to 2010. Results from this search for the health sciences categories of ‘life satisfaction, well-being or quality of life’ and ‘health or medicine’ show that there has been a dramatic increase since the 1990s in the publication of validation studies, which primarily produce what is classified as construct validity.1 Given this was a snapshot review of validation practice during these years, the authors do not delve into the methods used to generate evidence for construct validity. However, Barry et al, in a systematic review investigating the frequency with which psychometric properties were reported for validity and reliability in health education and behaviour (also published in 2014), found that the primary methods used to generate evidence for construct validity were factor analysis, correlation coefficient and χ².14 This limited view of construct validity as simply correlation between items or tests measuring the same or similar constructs is at odds with the Standards, where evaluation and integration of evidence from perhaps several other sources (ie, test content, response processes, internal structure, relationships with theoretically predicted external variables, and intended and unintended consequences) is needed to determine the degree to which a construct is represented by score interpretations (p.11).3
This literature review will examine validity evidence for health literacy assessments. Health literacy is a relatively new area of measurement, and there has been a rapid development in the definition and measurement of this multi-dimensional concept.15–18 Health literacy is now a priority of the WHO,19 and many countries have incorporated it into health policy,20–24 and are including it in national health surveys.25–27
Definitions of health literacy range from those for functional health literacy (ie, a focus on comprehension and numeric abilities) to multi-dimensional definitions such as that used by the WHO: ‘the cognitive and social skills which determine the motivation and ability of individuals to gain access to, understand and use information in ways which promote and maintain good health’.28 The general purpose of health literacy assessment is to determine pathways to facilitate access to and improve understanding and use of health information and services, as well as to improve or support the health literacy responsiveness of health services.28–31 However, these two uses of data (in general, to improve patient outcomes and to improve organisational procedures) may require evaluative integration of different types of evidence to justify score interpretations to inform patient interventions or organisational change.3 7 9 11 32 A strong and coherent evidence-based conception of the health literacy construct is required to support score interpretations.14 33–35 Decisions that arise from measurements of health literacy will affect individuals and populations and, as such, there must be strong argument for the validity of score interpretations for each measurement purpose.
To enhance the quality and transparency of the proposed systematic descriptive literature review, this protocol paper outlines the scope and purpose of the review.36 37 Using the theoretical framework of the five sources of validity evidence of the Standards, and health literacy assessments as an exemplar, the results of this systematic descriptive literature review will indicate current validation practice. The assumptions that underlie this literature review are that, despite the advancement of contemporary validity testing theory in education and psychology, a systematic theoretical framework for validity testing has not been applied in the field of health, and that validation practice for health assessments remains centred on general psychometric properties that typically provide insufficient evidence that the test is fit for its intended use. The purpose of the review is to investigate quantitative and qualitative validity evidence reported for the development and testing of health literacy assessments to describe patterns in the types of validity evidence reported,38–45 and identify use of theory for validation practice. Specifically, the review will address the following questions:
What is being reported as validity evidence for health literacy assessment data?
Do the studies place the validity evidence within a validity testing framework, such as that offered by the Standards?
Methods and analysis
This review is designed to provide the basis for a critique of validation practice for health literacy assessments within the context of the validity testing framework of the Standards. It is not an evaluation of the specific arguments that authors have made about validity from the data that have been gathered for individual measurement instruments. The review is intended to quantify the types of validity evidence being reported, so a systematic descriptive literature review was chosen as the most appropriate review technique. Described by King and He (2005)42 as belonging towards the qualitative end of a continuum of review techniques, a descriptive literature review nevertheless employs a frequency analysis to reveal interpretable patterns in a research area; in this review, these are patterns in the types of validity evidence being reported for health literacy assessments and in the number of studies that refer to a validity testing framework. A descriptive literature review can include qualitative and quantitative research and is based on a systematic and exhaustive review method.38–41 43 44 The method for this review will be guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.46
This literature review is not an assessment of participant data but a collation of reported validity evidence. As such, the focus is not on the participants in the studies but on the evidence presented in support of the validity of interpretations and uses of health literacy assessment data. This means that it will be the type of study that is considered for inclusion rather than the type of study participant. Inclusion criteria are as follows:
Development/application/validation studies about health literacy assessments: We expect to find many papers that describe the development and initial validation studies of health literacy assessments. Papers that use an existing health literacy assessment to measure outcomes but do not claim to conduct validity testing will not be included. Studies of comparison (eg, participant groups) or of prediction (eg, health literacy and hospital admissions) will be included only if the authors openly claim that the study results contribute validation evidence for the health literacy assessment instrument.
Not limited by date: There will be no start date to the search, so that papers about validation and health literacy assessments from the early days of health literacy measurement will be included. Health literacy is a relatively new concept and the earliest papers are expected to date back only about 30 years. The end search date was March 2019.
Studies published and health literacy assessments developed in the English language: Due to resource limitations, the search will be restricted to studies published in the English language and instruments developed in the English language. Translated instruments will be excluded. We realise that these exclusions introduce an English language and culture bias, and we recommend a similar descriptive review of published studies about health literacy assessments developed in or translated to other languages.
Qualitative and quantitative research methods: Given that comprehensive validity testing includes both qualitative and quantitative methods, studies employing either or both will be included.
All definitions of health literacy: Definitions of health literacy have been accumulating over the past 30 years and reflect a range of health literacy testing methods as well as contexts, interpretations and uses of the data. We include all definitions of health literacy and all types of health literacy assessment instruments, which may include objective, subjective, uni-dimensional and multi-dimensional measurement instruments.
Systematic reviews and other types of reviews captured by the search will not be included in the analysis. However, before these reviews are excluded, their reference lists will be checked for articles that may have been missed by the database search. Predictive, association or other comparative studies that do not explicitly claim in the abstract to contribute validity evidence will also not be included. Instruments developed in languages other than English, and translation studies, will be excluded as noted previously.
Systematic electronic searches of the following databases will be conducted in EBSCOhost: MEDLINE Complete, Global Health, CINAHL Complete, PsycINFO and Academic Search Complete. EMBASE will also be searched. The electronic database search will be supplemented by searching for dissertations and theses through proquest.com, dissertation.com and openthesis.org. Reference lists of pertinent systematic reviews that are identified in the search will be scanned, as well as article reference lists and the authors’ personal reference lists, to ensure all relevant articles have been captured. The search terms will use medical subject headings and text words related to types of assessment instruments, health literacy, validation and validity testing. Peer reviewed full articles and examined theses will be included in the search.
An expert university librarian has been consulted as part of planning the literature search strategy. The strategy will focus on health literacy, types of assessment instruments, validation and validity, and methods used to determine the validity of interpretation and use of data from health literacy assessments. The search terms have been determined through scoping searches and examining search terms from other measurement and health literacy systematic reviews. The database searches were completed in March 2019 and the search terms used are described in online supplementary file 1.
Literature search results will be saved and the titles and abstracts downloaded to EndNote Reference Manager X9. Duplicates will be removed, and the titles and abstracts of the remaining results will be screened against the inclusion and exclusion criteria. The full texts of articles that seem to meet the eligibility criteria or that are potentially eligible will then be obtained and screened. Excluded articles and reasons for exclusions will be recorded. The PRISMA flow diagram will be used to document the review process.46
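The bookkeeping behind the screening steps can be illustrated with a minimal Python sketch. It is not the authors' procedure, which is carried out in reference management software. The record fields, the title-and-year duplicate rule, the sample titles and the exclusion reasons are all hypothetical, and the sketch collapses title/abstract and full-text screening into a single stage for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    title: str
    year: int
    # Hypothetical field: None means the record is retained after screening.
    exclusion_reason: Optional[str] = None


def normalise(title: str) -> str:
    """Lower-case and drop non-alphanumeric characters so near-identical titles match."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def deduplicate(records):
    """Keep the first record for each (normalised title, year) pair (an assumed rule)."""
    seen = set()
    unique = []
    for record in records:
        key = (normalise(record.title), record.year)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def flow_counts(identified, screened):
    """Tally the numbers that a flow diagram of the review process would report."""
    reasons = Counter(r.exclusion_reason for r in screened if r.exclusion_reason)
    return {
        "identified": len(identified),
        "duplicates_removed": len(identified) - len(screened),
        "screened": len(screened),
        "excluded": sum(reasons.values()),
        "included": sum(1 for r in screened if r.exclusion_reason is None),
        "exclusion_reasons": dict(reasons),
    }


# Invented example records, for illustration only.
records = [
    Record("Development of a health literacy questionnaire", 2013),
    Record("Development of a Health Literacy Questionnaire.", 2013),
    Record("Health literacy and hospital admissions", 2015, "no validity claim"),
    Record("A translated health literacy scale", 2017, "translated instrument"),
]
unique = deduplicate(records)
counts = flow_counts(records, unique)
```

Recording the exclusion reason on each record, rather than in a separate log, keeps the flow counts and the list of reasons derivable from the same data.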
The data extraction framework will be adapted from tables in Hawkins et al 2 (p.1702) and Cox and Owen (p.254).47 Data extraction from eligible articles will be conducted by one reviewer (MH) and comprehensively checked by a second reviewer (GE).
Subjective and objective health literacy assessments will be identified along with those that combine objective and subjective items or scales. Data to be extracted will include the date and source of publication; the context of the study (eg, country, type of organisation/institution, type of investigation, representative population); statements about the use of a theoretical validity testing framework; the types of validity evidence reported; the methods used to generate the evidence; and the validation claims made by the authors of the papers, as based on their reported evidence.
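As an illustration only, the data items listed above could be held in a structured record such as the Python sketch below. The class and field names are assumptions made for this example and do not reproduce the extraction framework adapted from the cited tables. The five evidence categories follow the sources of evidence named in the Standards, and the example study is invented.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class InstrumentType(Enum):
    OBJECTIVE = "objective"
    SUBJECTIVE = "subjective"
    COMBINED = "combined"


class EvidenceSource(Enum):
    TEST_CONTENT = "test content"
    RESPONSE_PROCESSES = "response processes"
    INTERNAL_STRUCTURE = "internal structure"
    RELATIONS_TO_OTHER_VARIABLES = "relationship of scores to other variables"
    CONSEQUENCES = "validity and the consequences of testing"


@dataclass
class EvidenceItem:
    source: EvidenceSource  # which of the five sources the evidence falls under
    method: str             # method used to generate it, as reported by the authors


@dataclass
class ExtractionRecord:
    # All field names are hypothetical labels for the items listed in the text.
    publication_year: int
    publication_source: str
    country: str
    organisation_type: str
    investigation_type: str
    population: str
    instrument_type: InstrumentType
    framework_statement: Optional[str] = None  # verbatim statement, if any
    evidence: List[EvidenceItem] = field(default_factory=list)
    validation_claims: List[str] = field(default_factory=list)

    def uses_framework(self) -> bool:
        """True when the study states a theoretical validity testing framework."""
        return bool(self.framework_statement)


# Invented example study, for illustration only.
example = ExtractionRecord(
    publication_year=2016,
    publication_source="Hypothetical Journal of Health Measurement",
    country="Australia",
    organisation_type="hospital",
    investigation_type="instrument development",
    population="adult outpatients",
    instrument_type=InstrumentType.SUBJECTIVE,
    evidence=[
        EvidenceItem(EvidenceSource.TEST_CONTENT, "expert review"),
        EvidenceItem(EvidenceSource.INTERNAL_STRUCTURE, "factor analysis"),
    ],
    validation_claims=["scores support screening decisions"],
)
```

Storing each piece of evidence as a source and method pair allows one study to contribute several methods and several types of evidence without duplicating the study-level fields.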
Data synthesis and analysis
A descriptive analysis of extracted data, as based on the theoretical framework of the Standards, will be used to identify patterns in the types of validity evidence being reported, the methods used to generate the evidence and theoretical frameworks underlying validation practice. Where possible and relevant to the concept of validity, changes in validation practice and assessment of health literacy over time will be explored. It is possible that one study may use more than one method and generate more than one type of validity evidence. Statements about a theoretical underpinning to the generation of validity evidence will be collated.
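The frequency analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis plan itself: the dictionary layout, the study identifiers and the three example studies are invented, and each study is counted once per source of evidence even when it reports several methods under that source.

```python
from collections import Counter

# The five sources of validity evidence named in the Standards.
SOURCES = (
    "test content",
    "response processes",
    "internal structure",
    "relationship of scores to other variables",
    "validity and the consequences of testing",
)

# Invented studies, for illustration only.
studies = [
    {"id": "S1", "year": 2005, "framework": None,
     "evidence": [("internal structure", "factor analysis"),
                  ("relationship of scores to other variables", "correlation coefficient")]},
    {"id": "S2", "year": 2012, "framework": None,
     "evidence": [("internal structure", "factor analysis"),
                  ("internal structure", "item response theory")]},
    {"id": "S3", "year": 2018, "framework": "the Standards",
     "evidence": [("test content", "expert review"),
                  ("response processes", "cognitive interviews"),
                  ("internal structure", "factor analysis")]},
]


def source_frequencies(studies):
    """Number of studies reporting each source (a study counts once per source)."""
    counts = {source: 0 for source in SOURCES}
    for study in studies:
        # A label outside SOURCES raises KeyError, which flags a coding error.
        for source in {src for src, _method in study["evidence"]}:
            counts[source] += 1
    return counts


def method_frequencies(studies):
    """Number of times each method is reported across all studies."""
    return Counter(method for study in studies for _src, method in study["evidence"])


def framework_count(studies):
    """Number of studies that state a theoretical validity testing framework."""
    return sum(1 for study in studies if study["framework"])


by_source = source_frequencies(studies)
by_method = method_frequencies(studies)
with_framework = framework_count(studies)
```

Initialising every source to zero keeps sources that no study reports visible in the output, which matters when the pattern of interest is which types of evidence are absent.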
Patient and public involvement
Patients and the public were not involved in the development or design of this literature review.
Discussion
With the increasing use of health assessment data for decision-making, the health of individuals and populations relies on test developers and users to provide evidence for validity arguments for the interpretations and uses of these data. This systematic descriptive literature review will collate existing validity evidence for health literacy assessments developed in English and identify patterns of reporting frequency according to the five sources of evidence in the Standards, and establish if the validity evidence is being placed within a theoretical framework for validation planning.3 The potential implications of this review include finding that, when assessed against the Standards’ theoretical framework, current validation practice in health literacy (and possibly in health assessment in general) has limited capacity for determining valid score interpretation and use. The Standards’ framework challenges the long-held perception in health assessment that validity refers to an assessment tool rather than to the interpretation of data for a specific use.48 49
The validity of decisions based on research data is a critical aspect of health services research. Our understanding of the phenomena we research is dependent on the quality of our measurement of the constructs of interest, which, in turn, affects the validity of the inferences we make and actions we take from data interpretations.6 7 Too often the measurement quality is considered separate to the decisions that need to be made.6 50 However, questionable measurement (perhaps through use of an instrument that was developed using suboptimal methods, was inappropriately applied or through gaps in validity testing) cannot lead to valid inferences.3 50 To make appropriate and responsible decisions for individuals, communities, health services and policy development, we must consider the integrity of the instruments, and the context and purpose of measurement, to justify decisions and actions based on the data.
A limitation of the review is that the search will be restricted to studies published and instruments developed in the English language, and this may introduce an English language and culture bias. A similar review of health literacy assessments developed in or translated to other languages is warranted. A further limitation is that we rely on the information authors provide in identified articles. It is possible that some authors have an incomplete understanding of the specific methods they are using and reporting, and may not accurately or clearly provide details on validity testing procedures employed. Documentation for decisions made during data extraction will be kept by the researchers.
Health literacy is a relatively new area of research. We are fortunate to be at the start of a burgeoning field and can include all publications about validity testing of English-language health literacy assessments. The inclusion of the earliest to the most recent publications provides the opportunity to understand changes and advancements in health literacy measurement and methods of analysis since the introduction of the concept of health literacy. Using health literacy assessments as an exemplar, the outcomes of this review will guide and inform a theoretical basis for the future practice of validity testing of health assessments in general to ensure, as far as is possible, the integrity of the inferences made from data for individual and population benefits.
The authors acknowledge and thank Rachel West, Deakin University Liaison Librarian, for her expertise and advice during the preparation of this systematic literature review.
Contributors MH and RHO conceptualised the research question and analytical plan. Under supervision from RHO, MH led the development of the search strategy, selection criteria, data extraction criteria and analysis method, which was then comprehensively assessed and checked by GRE. MH drafted the initial manuscript and led subsequent drafts. GRE and RHO read and provided feedback on manuscript iterations. All authors approved the final manuscript. RHO is the guarantor.
Funding MH is funded by a National Health and Medical Research Council (NHMRC) of Australia Postgraduate Scholarship (APP1150679). RHO is funded in part through a National Health and Medical Research Council (NHMRC) of Australia Principal Research Fellowship (APP1155125).
Competing interests None declared.
Patient consent for publication Not required.
Ethics approval Ethics approval is not required for this systematic review because only published research will be examined. Dissemination will be through publication in a peer-reviewed journal and at conference presentations, and in the lead author’s doctoral thesis.
Provenance and peer review Not commissioned; externally peer reviewed.