XVI.Add.II.2.8. Factors influencing the choice of data source(s)

The choice of data source(s) for RMM effectiveness studies should be determined by the following factors to ensure study objectives are feasible:

Scope and research question: A good understanding of the eligible data sources is needed to verify whether the key variables and information required to answer the research question are available. This applies in particular to secondary use of routinely collected healthcare data, as these data were not collected to answer the research question. The data source’s strengths and limitations should be considered in the study design.
Accessibility of data sources: Access and conditions for collaboration with data source owners should be clarified.
Information on exposure and outcome: The validity of information on exposure and outcome data should be verified.
Availability and timeliness: Pre-existing healthcare data are more likely to be readily available for analysis than data from primary data collection. Timelines for the entire process, from data delivery to availability of the data for secondary use, including lag times, should be considered. The frequency with which the database is refreshed may also be a limitation. Low exposure in the period following product launch in healthcare may make it difficult to recruit study participants in primary data collection and to detect, e.g., changes in prescribing trends based on secondary use data.
Prevalence of outcomes of interest: Routinely collected data tend to provide large sample sizes, which may be relevant for studying rare exposures and rare outcomes.
Observation period: To detect changes over time or delayed effects of RMM, data must be collected over a period long enough for RMM dissemination and implementation at healthcare level to have taken place. As the complete medical history may not be available in databases, the extent of left and/or right truncation should be considered, e.g. where claims data contain no information outside the respective insurance period.
Representativeness of the study population: The extent to which the study population represents the entire population should be assessed. For example, where claims databases are used, the population with a specific health insurance may differ inherently from the entire population, which may introduce bias. Survey studies are prone to selection bias that may affect the generalisability of results. The approach to RMM effectiveness evaluation includes evaluating intended outcomes of RMM and, as appropriate, unintended outcomes associated with the use of the concerned and other medicinal products (see GVP Module XVI, Figure XVI.3). Where unintended outcomes are evaluated, the study population should preferably not be limited to the population targeted by the RMM for the concerned medicinal product, but should be expanded to include populations where unintended outcomes (see GVP Module XVI, Table XVI.5) may be expected.
Completeness of the data: Where data were initially collected for a purpose other than the research question, the extent of missing or incomplete data should be considered, e.g. for indication of medicines use, co-morbidities, co-medication, patient monitoring, smoking, diet, body mass index or family history of disease.
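
As an illustration only, two of the factors above (observation period and completeness of the data) could be screened during a feasibility assessment along the following lines. This is a minimal sketch and not part of the guidance: the record layout, the field names (`obs_start`, `obs_end`, `indication`, `bmi`, `smoking`), the function names and the example values are all hypothetical, and "truncation" is approximated simply as an observable period that starts after, or ends before, the planned study window.

```python
from datetime import date


def missing_fraction(records, variables):
    """Return, per variable, the share of records where the value is absent or None."""
    total = len(records)
    if total == 0:
        return {v: 0.0 for v in variables}
    return {
        v: sum(1 for r in records if r.get(v) is None) / total
        for v in variables
    }


def truncation_shares(records, study_start, study_end):
    """Return the share of records whose observable period does not cover the
    study window: starting after study_start (left) or ending before study_end (right)."""
    total = len(records)
    if total == 0:
        return {"left": 0.0, "right": 0.0}
    left = sum(1 for r in records if r["obs_start"] > study_start)
    right = sum(1 for r in records if r["obs_end"] < study_end)
    return {"left": left / total, "right": right / total}


# Hypothetical example records (e.g. one row per patient in a claims extract).
records = [
    {"indication": "hypertension", "bmi": 27.1, "smoking": None,
     "obs_start": date(2019, 1, 1), "obs_end": date(2023, 12, 31)},
    {"indication": None, "bmi": None, "smoking": "never",
     "obs_start": date(2021, 6, 1), "obs_end": date(2023, 12, 31)},
    {"indication": "hypertension", "bmi": 31.4, "smoking": None,
     "obs_start": date(2019, 1, 1), "obs_end": date(2022, 3, 31)},
    {"indication": "heart failure", "bmi": None, "smoking": "current",
     "obs_start": date(2018, 1, 1), "obs_end": date(2023, 12, 31)},
]

completeness = missing_fraction(records, ["indication", "bmi", "smoking"])
truncation = truncation_shares(records, date(2020, 1, 1), date(2023, 12, 31))
print(completeness)
print(truncation)
```

Such summaries only describe the data source. Whether a given level of missingness or truncation is acceptable depends on the research question and should be justified in the study protocol.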

The ENCePP Guide on Methodological Standards in Pharmacoepidemiology [8] provides further guidance on assessing study feasibility.