A hitchhiker’s guide to data review in an ongoing (“live”) study – part 2: data review for interim analysis

If you are thinking of a clinical trial, you might ask yourself why and how data review for interim analysis can be a topic for a whole article.

In a clinical trial, interim analyses planned a priori are defined in the protocol, including their nature and the reason for performing them. Data for such an analysis are then usually source-verified by the field monitor, and central data review concerns only such verified data. After a tightly coordinated query process, in which field monitors are usually involved again to support the sites, the database can be locked for the respective analysis. At the time of database lock, the data are usually “locked” by the data managers (i.e. sites can change them in the eCRF only upon a reasonable and verified request) and thus usually do not change after the interim analysis has been performed.

Challenges for data review and interim analyses in a non-interventional study (NIS)

However, the situation is much more complicated in the non-interventional setting. In a NIS, the following characteristics may present a challenge for data review and interim analyses:

  • The observational plan of a NIS only rarely defines interim analyses a priori and, if it does, the description is usually very vague. Often the need for an interim analysis is determined by how well enrollment goes and/or by congress and publication plans.

  • Patient enrollment in a NIS cannot be steered as it can in a clinical trial; therefore the timing of the interim analysis may be difficult to predict far ahead.

  • A NIS usually involves many more patients than a clinical trial, and data are usually less standardized (e.g. there is no central lab, so different units and reference ranges have to be handled).

  • A NIS uses data that originate from the sites’ routine; they are not generated for the purposes of the study and do not necessarily follow a visit schedule. Thus, automatic plausibility checks (e.g. in the eCRF) often cannot be programmed as stringently as in a clinical trial, and much more data need to be reviewed manually.

  • A NIS also must fit into the sites’ routines and schedules; thus time windows for data entry are rarely very stringent, and data review queries cannot always be answered within narrow time frames.

  • The proportion of data subject to source data verification (SDV) in a NIS is often rather low. In some NIS, only particular items are verified by the monitors; in others, only a random sample of patient data is subject to monitoring; and in yet others, there might be no on-site monitoring at all. Thus, in most NIS the central data review is performed on data that were not source-verified and are therefore usually of lower quality than in a clinical trial.

So what does this mean? Certainly, you could perform 100% SDV in a NIS and clean the data for your interim analysis as you would for a clinical trial, i.e. lock the data and unlock upon request only. However, this is an unrealistic scenario in the vast majority of NIS: in a study with thousands of patients, dealing with data unlock requests would likely be more than a full-time job.

Therefore, different approaches to central data review seem more efficient in the NIS setting. Importantly, there is no single strategy that always fits; different studies and constellations require an individualized approach. In addition, the continuous and batch-wise review approaches described below are most effective when they go hand in hand in a coordinated manner.

Continuous data review

For studies with many patients, it is usually advisable to perform a data review of basic and/or critical data on a regular basis.

Assessing and closing failed edit checks on a regular basis often provides a good foundation for continuous data review. This not only prevents workload from piling up for sites and data reviewers, but is also essential for gauging the sites’ acceptance of the eCRF and possibly identifying ways to improve it.

For instance, only if you review data regularly will you notice whether a majority of sites have trouble with a given edit check or question. You can then react before people get frustrated with the study.
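As a rough illustration of spotting such a problematic edit check, the following minimal sketch counts, for each check, the share of sites with at least one failure. The export format, the site and check identifiers, and the 50% threshold are all assumptions made for this example; they do not come from any particular eCRF system.

```python
from collections import defaultdict

# Hypothetical export of failed edit checks as (site_id, check_id) pairs.
# All identifiers are illustrative only.
failed_checks = [
    ("site_01", "CHK_WEIGHT_RANGE"),
    ("site_02", "CHK_WEIGHT_RANGE"),
    ("site_03", "CHK_WEIGHT_RANGE"),
    ("site_01", "CHK_VISIT_DATE"),
    ("site_01", "CHK_VISIT_DATE"),
]
all_sites = {"site_01", "site_02", "site_03", "site_04"}


def widespread_failures(failures, sites, threshold=0.5):
    """Return {check_id: share of sites} for checks failing at more than
    `threshold` of all sites (the threshold is an assumed cut-off)."""
    sites_per_check = defaultdict(set)
    for site_id, check_id in failures:
        # A set counts each site once, however often the check fired there.
        sites_per_check[check_id].add(site_id)
    return {
        check: len(affected) / len(sites)
        for check, affected in sites_per_check.items()
        if len(affected) / len(sites) > threshold
    }


widespread = widespread_failures(failed_checks, all_sites)
print(widespread)
```

A check that fails repeatedly at a single site (here the hypothetical `CHK_VISIT_DATE`) points to a site-level issue, whereas a check that fails at most sites is more likely a problem with the check or the question itself.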

Continuous data review also has direct implications for pharmacovigilance (see Part 1 – Data review for pharmacovigilance purposes) and can lead to the identification of unreported (“hidden”) adverse drug reactions or other events of interest.

Batch-wise data review

This approach is especially useful when data have to be reviewed longitudinally (i.e. in order to review data obtained at visit 2 you also need to check data obtained at visit 1).

This is often the case when the interim analysis should include patients who have reached a given relevant study milestone (e.g. when a given number of patients have reached a specific study visit). The batch then contains those patients whose data sets have a defined degree of completeness and can be analyzed in a meaningful way.
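Selecting such a batch can be sketched as follows. This is a minimal example that assumes completeness means "all visits up to the milestone visit are documented"; the data structure, the patient IDs and the target number of patients are hypothetical, and a real study would define completeness in its own terms.

```python
# Hypothetical visit records: patient_id -> set of documented visit numbers.
completed_visits = {
    "P001": {1, 2, 3},
    "P002": {1, 2},
    "P003": {1, 3},  # visit 2 missing, so the data set counts as incomplete
    "P004": {1, 2, 3},
}


def select_batch(visits_by_patient, milestone_visit):
    """Return sorted IDs of patients with all visits 1..milestone_visit documented."""
    required = set(range(1, milestone_visit + 1))
    return sorted(
        patient
        for patient, visits in visits_by_patient.items()
        if required <= visits  # subset test: every required visit is present
    )


def milestone_reached(batch, target_patients):
    """True once the batch contains at least the planned number of patients."""
    return len(batch) >= target_patients


batch = select_batch(completed_visits, milestone_visit=3)
print(batch, milestone_reached(batch, target_patients=2))
```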

There are many circumstances in which patient data that have already been reviewed may need to be reviewed again, e.g. when the data were reviewed up to a certain visit for a previous interim analysis and need to be cleaned again prior to the final analysis. In such cases, it may be beneficial to program listings that display only newly entered or changed data.

However, keep in mind that the plausibility of certain data entries at the last visit may only be judged when also looking at the data from the first visit. Therefore, listings and data presentations that allow for both a quick overview of new entries and the whole picture may be the most useful approach.
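One possible shape for such a listing is sketched below. It assumes that each entry carries a last-modified date (for example from an audit trail); the row layout, item names and dates are hypothetical. Patients without any new or changed entry since the last review are skipped, while for the remaining patients the full history is shown with the new rows flagged.

```python
from datetime import date

# Hypothetical rows: (patient_id, visit, item, value, last_modified).
entries = [
    ("P001", 1, "weight_kg", 82.0, date(2023, 1, 10)),
    ("P001", 2, "weight_kg", 28.0, date(2023, 6, 2)),
    ("P002", 1, "weight_kg", 64.5, date(2023, 1, 15)),
    ("P002", 2, "weight_kg", 65.0, date(2023, 2, 20)),
]


def review_listing(rows, last_review):
    """Return full visit histories, with rows modified after `last_review`
    marked as new, for patients who have at least one such row."""
    by_patient = {}
    for patient, visit, item, value, modified in rows:
        by_patient.setdefault(patient, []).append(
            {"visit": visit, "item": item, "value": value,
             "new": modified > last_review}
        )
    return {
        patient: sorted(history, key=lambda row: row["visit"])
        for patient, history in by_patient.items()
        if any(row["new"] for row in history)
    }


listing = review_listing(entries, last_review=date(2023, 3, 1))
for patient, history in listing.items():
    for row in history:
        flag = "*" if row["new"] else " "  # "*" marks new or changed entries
        print(flag, patient, row["visit"], row["item"], row["value"])
```

In this made-up data, the flagged weight at visit 2 only looks implausible next to the unflagged value from visit 1, which is why the earlier row is kept in the listing.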

Summary and recommendations

  • Even though a NIS does not require an interim analysis defined in the protocol, pre-defining interim analyses and planning them well ahead is certainly beneficial.

  • Precisely defining the goal of an interim analysis and focusing only on those data that are needed to achieve this goal are strongly advisable: this saves time and effort (also for the sites) and is more cost-effective. It is also advisable to align the data review activities with the development of the statistical analysis plan.

  • Align your data review activities with the activities of your CRAs and centralized monitoring; if possible, coordinate site visits in a way that helps the sites resolve queries raised during medical data review.

  • Align any other data review activity with your medical data review, e.g. it might be advisable to reconcile the safety and clinical databases along with the data review in order to avoid conflicting or duplicate queries to the sites.

  • Keep cool! After all, it is an interim analysis, so a data status of “as clean as possible” is adequate in most cases. Potentially incomplete or implausible data can be explained and justified, and the benefit of resolving them usually does not outweigh the time needed to close all queries or the effort and costs associated with locking entered data.

  • Make realistic time plans, not only for your data review teams but also, and foremost, for your sites: give them time to review queries and correct data or provide answers.
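The reconciliation between the safety and clinical databases recommended above can be sketched in a very simplified form. The example assumes that an event is identified by patient ID, event term and onset date and that both databases can be exported in this shape; these keys and all values are hypothetical. Real reconciliation usually has to cope with coded terms and near-matches, which this exact comparison does not attempt.

```python
# Hypothetical event keys: (patient_id, event term, onset date as ISO string).
clinical_db = {
    ("P001", "nausea", "2023-02-01"),
    ("P002", "headache", "2023-03-12"),
    ("P003", "rash", "2023-04-05"),
}
safety_db = {
    ("P001", "nausea", "2023-02-01"),
    ("P003", "rash", "2023-04-07"),  # onset date differs between the databases
}


def reconcile(clinical, safety):
    """Return the events that appear in only one of the two databases."""
    return {
        "only_in_clinical": sorted(clinical - safety),
        "only_in_safety": sorted(safety - clinical),
    }


result = reconcile(clinical_db, safety_db)
for source, events in result.items():
    for event in events:
        print(source, *event)
```

Reviewing both lists together before raising queries helps to send one query per discrepancy: in this made-up data the rash of patient P003 appears in both lists with different onset dates, which calls for a single clarification rather than two separate queries.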

Part 3 of the article series “A hitchhiker’s guide to data review in an ongoing (“live”) study” will be out soon.

Picture: NASA

