Improving Data Operations at Clinical Research Sites

By Kyle Given, Executive Vice President of Account Management, Transformative Pharmaceutical Solutions
Whose favorite activity in a clinical trial is answering data queries? I would guess that the answer is a resounding “not me” for almost every site staff member who has ever been involved in a clinical trial.
So why, then, is data management such a significant, and aggravating, part of the study coordinator’s responsibilities? Well, let’s start with the fact that patient data is voluminous and highly variable within and across patients. It is hard to keep it all straight, especially if a site has not been trained on certain data management concepts or given the tools needed to visualize that data.
Also, given resource constraints at many sites, data operations are often deprioritized, which builds up a backlog of data cleaning and potentially generates more queries. Yet high-quality, consistent, and reliable data is essential to addressing each protocol’s efficacy and safety objectives with a high level of scientific rigor.
Let’s start with the first issue, available resources. Most study coordinators are clinically trained, not trained to be data experts. This does not mean they lack an awareness of the logical ways that clinical data interact, but they may not have the same level of knowledge and experience as a data manager who ultimately oversees the data quality in a clinical trial.
More importantly, clinical trial budgets often underestimate (or largely ignore) the amount of time that will be spent entering, curating, and cleaning patient data. It’s almost impossible to anticipate this volume of work before a trial starts. As a result, data cleaning is deprioritized at many sites, and data query volumes increase.
The second challenge is that sites do not have effective data visualization tools to facilitate the review of large quantities of patient data. Most electronic data capture (EDC) or e-source platforms present the data one module at a time, so the reviewer can only see a limited slice of data at any given point. They are set up for data entry, not data review. In addition, the data is often captured in multiple systems, so cross-referencing adds to the complexity of the review.
Sponsors and CROs have medical and central data review teams that combine data across separate data sets and present it over time so that a data reviewer can easily spot inconsistencies. Furthermore, many of these systems now apply statistical signals so that outlier data is flagged for the end user to review. Does it really make sense that the secondary reviewers of clinical data have better tools than the primary reviewers?
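To make the idea concrete, here is a minimal sketch in Python of what "combine data sets, view them over time, and flag outliers" could look like. It is an illustration only, not a description of any sponsor's or vendor's system: the records, the field names (`systolic_bp`, `alt`), the helper names, and the choice of a modified z-score with a 3.5 cutoff are all assumptions made for the example. A real review tool would likely compare each patient against their own baseline rather than pooling everyone together.

```python
from statistics import median

# Hypothetical records from two separate systems (all values are invented).
vitals = [
    {"patient": "P01", "day": 1, "systolic_bp": 118},
    {"patient": "P01", "day": 8, "systolic_bp": 122},
    {"patient": "P01", "day": 15, "systolic_bp": 120},
    {"patient": "P01", "day": 22, "systolic_bp": 212},  # possible entry error
    {"patient": "P02", "day": 1, "systolic_bp": 125},
    {"patient": "P02", "day": 8, "systolic_bp": 119},
    {"patient": "P02", "day": 15, "systolic_bp": 123},
]
labs = [
    {"patient": "P01", "day": 1, "alt": 30},
    {"patient": "P02", "day": 1, "alt": 28},
]


def combine_by_patient_day(*sources):
    """Merge records from several sources into one row per (patient, day)."""
    timeline = {}
    for source in sources:
        for rec in source:
            key = (rec["patient"], rec["day"])
            row = timeline.setdefault(key, {})
            for field, value in rec.items():
                if field not in ("patient", "day"):
                    row[field] = value
    # Sort by patient, then study day, so each patient reads as a timeline.
    return dict(sorted(timeline.items()))


def flag_outliers(timeline, field, threshold=3.5):
    """Return (patient, day, value) where the modified z-score exceeds threshold.

    Assumption: values are pooled across patients for simplicity.
    """
    points = [(key, row[field]) for key, row in timeline.items() if field in row]
    values = [value for _, value in points]
    med = median(values)
    mad = median([abs(value - med) for value in values])
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [
        (patient, day, value)
        for (patient, day), value in points
        if 0.6745 * abs(value - med) / mad > threshold
    ]


timeline = combine_by_patient_day(vitals, labs)
flagged = flag_outliers(timeline, "systolic_bp")
print(flagged)
```

In this invented data set, only the day 22 reading of 212 for patient P01 is flagged. A reviewer paging through one module at a time could easily miss a value like that, but it stands out immediately once the readings are lined up side by side.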
There is a potential solution to this problem: sponsors could deploy centralized Data Operation Specialists (DOS) to remove the burden of data review and cleaning from the clinical trial site. This concept was already put into practice during the pandemic, when sites fell behind in data collection and cleaning activities. If this solution works well in an urgent situation, why not consider it as a more permanent strategy that is deployed proactively?
Consider a reality where each site has a DOS who partners with them to clean and process all of their clinical trial data with the best available data visualization software. Imagine a world where the site actually gets support to clean the sponsor’s clinical data, and the data gets captured and cleaned in real time, thus avoiding a huge source of stress between sponsors and their clinical trial sites.




