What are Patient Reported Outcomes II
Kevin here! After a hiatus, I’m back with a weekly-ish newsletter on how health-care quality is measured. Click here for Part 1 on patient-centered healthcare and measuring its effectiveness using patient-reported outcomes (PROs) and PRO performance measures (PRO-PMs).
In Part 1, I covered how patient-reported outcome measures (PROMs), the standardized surveys that ask about patients’ perceptions of function, pain, or other personal dimensions of health, can be used.
For Part 2, I’m covering what is required to take a survey and convert it into a performance measure.
Meaningful Change
Beyond the requirement that a PROM be psychometrically valid and reliable (that is, it tracks what it intends to measure, like health and function, and it can replicate those findings across populations, so a response won’t swing wildly within a few days or a week), PRO-PMs have a further requirement: determining which changes are meaningful for purposes of comparison between units of measurement. This is typically done using meaningful change thresholds.
Typical performance measures, such as 30-day readmission rates after hospitalization, compare an outcome (here, the readmission rate) across hospitals. Typically, after risk adjustment and the inclusion of confidence intervals, hospitals are compared to see if there are any statistically significant differences in performance. The benchmark can be a national observed readmission rate or the performance of other hospitals, but there is typically no concept of ‘clinical meaningfulness.’
PRO-PMs require slightly more nuance due to the inclusion of the patient’s voice and the need to interpret changes in a PROM score. A statistically significant change in average PROM scores is not the same as a clinically meaningful change in average PROM scores for patients, since patients may need a larger change to feel an impact on their functional status or quality of life.
Clinically meaningful changes typically fall into three categories: the minimal clinically important difference (MCID), the patient acceptable symptom state (PASS), and the substantial clinical benefit (SCB). The MCID is the smallest change in PROM score that is clinically important. The PASS is the PROM score at which a patient is satisfied with their function or quality of life, depending on the domain measured. The SCB is the PROM score at which a patient feels “much improvement,” or the “upper threshold of outcome improvement.” Choosing which threshold to use depends, again, on the purpose of measurement. For purposes of appropriateness, MCID may be most relevant to track how often patients achieve at least some improvement in their condition. For public reporting and patient decision-making, PASS may be most appropriate to ensure patients are satisfied with their treatment outcome. For accountability of health systems, hospitals, and/or clinicians, SCB may be most appropriate because it would allow comparisons of how well the unit of interest treats conditions.
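To make the three thresholds concrete, here is a minimal sketch of how a unit's achievement rates might be tallied. Everything in it is an assumption for illustration: the 0-100 scale, the threshold values, and the choice to treat MCID and SCB as changes from baseline while treating PASS as an absolute follow-up score. None of the numbers are published thresholds for any real instrument.

```python
from dataclasses import dataclass

# Hypothetical thresholds for an imaginary 0-100 PROM (higher = better).
# Placeholder values only, not taken from any validated instrument.
MCID_CHANGE = 10.0  # smallest change from baseline counted as important
SCB_CHANGE = 25.0   # change from baseline counted as substantial benefit
PASS_SCORE = 70.0   # follow-up score at which patients report being satisfied


@dataclass
class PatientScores:
    baseline: float
    follow_up: float


def classify(p: PatientScores) -> dict:
    """Flag which thresholds a single patient meets."""
    change = p.follow_up - p.baseline
    return {
        "mcid": change >= MCID_CHANGE,
        "scb": change >= SCB_CHANGE,
        "pass": p.follow_up >= PASS_SCORE,
    }


def achievement_rates(patients: list) -> dict:
    """Share of patients meeting each threshold (the raw PRO-PM numerator / denominator)."""
    flags = [classify(p) for p in patients]
    return {k: sum(f[k] for f in flags) / len(patients) for k in ("mcid", "scb", "pass")}
```

The same set of patients can produce quite different rates depending on which threshold is chosen, which is why the purpose of measurement has to be settled first.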
Methods for determining these thresholds generally fall into two categories: anchor-based approaches and distribution-based approaches. A distribution-based approach takes a population that receives a PROM and a treatment and defines a meaningful change statistically, based on the variance of the population’s change in scores. For MCID, this is typically half a standard deviation. An anchor-based approach ties changes in the PROM score to an anchor question, such as asking a patient to rate their improvement and/or satisfaction. Changes in each patient’s score are compared with that patient’s response to the anchor question, and thresholds are determined using statistical methods such as ROC curve analysis, for example by choosing the cut-point at which sensitivity and specificity are equal.
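Both approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a validated method: the function names are hypothetical, the distribution-based version takes half the standard deviation of change scores (some analyses use the baseline standard deviation instead), and the anchor-based version reduces the anchor question to a yes/no "improved" answer and simply picks the observed cut-point where sensitivity and specificity are closest.

```python
from statistics import stdev


def distribution_mcid(changes: list) -> float:
    """Distribution-based MCID: half a standard deviation of the change scores."""
    return 0.5 * stdev(changes)


def anchor_mcid(changes: list, improved: list) -> tuple:
    """Anchor-based MCID: scan candidate cut-points along the ROC curve.

    `improved` holds each patient's yes/no answer to the anchor question.
    Returns (cut_point, sensitivity, specificity) for the cut-point where
    sensitivity and specificity are closest to equal.
    """
    positives = [c for c, a in zip(changes, improved) if a]
    negatives = [c for c, a in zip(changes, improved) if not a]
    if not positives or not negatives:
        raise ValueError("need both improved and not-improved patients")

    best = None
    for cut in sorted(set(changes)):
        # A patient is "predicted improved" when their change is at or above the cut.
        sens = sum(c >= cut for c in positives) / len(positives)
        spec = sum(c < cut for c in negatives) / len(negatives)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, cut, sens, spec)
    _, cut, sens, spec = best
    return cut, sens, spec
```

In practice the two methods will often give different numbers for the same data, which is one reason qualitative confirmation with patients is useful.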
A final way to validate clinically meaningful changes, regardless of approach, is to use qualitative methods such as focus groups and consensus-based approaches like the “Delphi method.” This typically involves asking patients, through clinical vignettes or semi-structured interviews, to confirm whether quantitatively derived thresholds align with what they perceive as “truly important and meaningful.”12
Operational Difficulties
Because they require surveying patients, PRO-PMs are significantly more burdensome to collect and measure than existing quality measures drawn from electronic medical records and administrative claims data. The burdens include standardizing PROMs and collecting PROMs. Due to these difficulties, some programs, such as the CMS Comprehensive Care for Joint Replacement bundled payment program, have incentivized the collection of PROMs from participating hospitals by rewarding extra quality points to hospitals that submit PROMs to CMS, thereby potentially increasing reconciliation payments.
In many academic medical centers, PROMs have been a tool for clinical research, with the same historical PROMs often used in clinical registries for many years. Clinician-scientists may be opposed to changing PROMs at the risk of losing information that could be used within their registries for future research. Additionally, increasing the number of PROMs a patient receives can reduce response rates. Consensus-based bodies that have typically endorsed performance measures in healthcare, such as the NQF, have taken the position of not endorsing specific PROMs, which makes standardizing PROMs within the industry more difficult.
One approach to removing this barrier is to avoid standardizing PROMs and instead standardize the approach to determining meaningful change. As previously mentioned, anchor-based approaches require a common, universal anchor. If a common anchor is used across multiple PROMs and the same approach is used to determine a meaningful change threshold, such as the MCID, then institutions could continue to use their existing PROMs while still allowing meaningful change to be interpreted in a PRO-PM. While different instruments will have different quantitative MCIDs for the same anchor, depending on each PROM’s psychometrics, the interpretation will still be the same.
Collecting PROMs still presents a significant source of burden and a substantial barrier to adoption. Patients already face a large burden in the paperwork they need to fill out in the course of receiving care. This can include insurance forms due to prior authorization requirements and paperwork in a physician’s office for a history and physical examination. Hospitals also send out patient satisfaction surveys to fulfill existing federal requirements. Adding more surveys for tracking PRO-PMs can lead to frustration and low response rates. Low response rates may lead to non-responder bias and create skewed results.
Reducing patient burden is a common approach to increasing the collection of PROMs. One approach is reducing the number of items in a PROM. Shortened versions of PROMs have been developed, including converting the HOOS from 40 items into the HOOS JR with 6 items. Another approach is the use of computer-adaptive tests, which modify the questions according to a patient’s responses, significantly reducing the number of questions while giving researchers more flexibility in the outcomes measured. Examples include PROMIS-CAT. The increasing use of electronic medical records also allows health systems to integrate PROMs directly into the care delivery process, with patient portals sending PROMs at set time periods in a more automated fashion than pen-and-paper methods.16 This has several benefits, including not only the ability to increase response rates but also promoting the use of PROM scores in clinical care, such as shared decision making.17 However, significant concerns remain about differences between modalities of administering surveys, especially the equity of responses from vulnerable populations who may not find internet-based surveys as accessible. This may require adjusting responses based on modality or population.
One novel, alternative approach could be determining the minimum sample needed to be representative. Using power analysis and baseline data on PROM performance, such as how often a baseline patient population achieves MCID or other clinical thresholds, the minimum adequately powered sample size can be determined. Patients in a specific time period can then be randomly sampled for follow-up response to PROMs. A minimal sample can be much less of a burden to follow up on, using paper, electronic, and phone collection methods. Using all three methods on a smaller sample can potentially increase response rates, though inflating the sample size for expected response rates may also be feasible, as long as non-response bias is considered and the responders are representative of the sample. Demographic information and treatment data can also be analyzed to determine whether the randomly selected sample is representative of the entire patient population. There is precedent for this in performance measurement: the patient-reported surveys used in the National Committee for Quality Assurance Healthcare Effectiveness Data and Information Set (HEDIS) allow for oversampling and for adjustment or imputation for non-response in the final creation of performance measures.
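As a rough sketch of the sampling idea, the snippet below sizes a sample to detect a difference between a baseline MCID achievement rate and a target rate, inflates it for an expected response rate, and draws a simple random sample. The function names and default values are hypothetical, and the formula is the standard normal-approximation sample size for a one-sample, two-sided test of a proportion. A real design would also need to handle risk adjustment and check responders for non-response bias.

```python
import math
import random
from statistics import NormalDist


def required_responses(p0: float, p1: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Completed responses needed to detect a rate of p1 against a baseline rate p0.

    Normal-approximation formula for a one-sample, two-sided test of a proportion.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)


def patients_to_contact(n_responses: int, expected_response_rate: float) -> int:
    """Inflate the sample so the expected number of responders still meets the target."""
    return math.ceil(n_responses / expected_response_rate)


def draw_sample(patient_ids: list, n: int, seed=None) -> list:
    """Simple random sample, without replacement, of patients to follow up."""
    rng = random.Random(seed)
    return rng.sample(patient_ids, min(n, len(patient_ids)))
```

For example, `required_responses(0.70, 0.80)` gives the number of completed surveys needed to detect a move from a 70% to an 80% MCID achievement rate, and `patients_to_contact(...)` with a 50% expected response rate doubles that number of patients to approach.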
Conclusion
Designing PRO-PMs requires a careful understanding of the measurement purpose, choosing the appropriate PROs, identifying appropriate meaningful change thresholds, and overcoming barriers to the collection of PROMs. The design of PRO-PMs can be advanced by allowing clinicians to collect PROMs with fewer items, or by using a universal anchor so they can keep collecting their existing PROMs while still allowing a standard interpretation of meaningful change. Barriers associated with low response rates can be addressed by allowing adjustment for non-response and by using a minimal sample size that is appropriately powered. PRO-PMs can be a powerful tool to make quality assessment of the health care system much more patient-centered.
For part III, I’ll propose how you might go about designing a PRO-PM for something like improvement after joint replacement and look at a measure that was recently proposed by CMS for use in the near future!