Data Collaboration

Data Management

We create end-to-end database solutions across various database platforms. With IRB approval, we can enrich your current research data with clinical data from Epic. We also have complete access to the PHIS repository and can build databases from its data, which comes from pediatric hospitals across the country. Our in-house database professionals can assist you with all your data management needs.

Services

  • Database creation and design
  • Importing your existing data
  • Data manipulation (pivoting data, aggregation, etc.), cleansing (cleaning up dirty input) and standardization (creating valid lists of values)
  • Data integration with Epic data to create a single combined data set
  • Algorithm implementation to calculate results from raw data
  • Securing your data, including addressing PHI concerns and limiting general access to those on the IRB
  • Backup and recovery so you know your data is protected
  • Performance tuning
  • PHIS repository extractions
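
As a rough illustration of the pivoting, aggregation, and standardization services listed above, the sketch below uses only the Python standard library. The sample rows, the field names, the `SEX_CODES` mapping, and the helper functions are all hypothetical and chosen for this example; they do not describe the team's actual tools or data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format input: one row per (patient, measurement) pair.
rows = [
    {"patient": "A", "measure": "height_cm", "value": "120", "sex": " f "},
    {"patient": "A", "measure": "weight_kg", "value": "25", "sex": "Female"},
    {"patient": "B", "measure": "height_cm", "value": "131", "sex": "M"},
]

# Standardization: map free-text entries onto a valid list of values.
# This mapping is an assumption made for the example.
SEX_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def standardize_sex(raw):
    """Return a code from the valid list, or "UNKNOWN" for unmapped input."""
    return SEX_CODES.get(raw.strip().lower(), "UNKNOWN")


def pivot(long_rows):
    """Pivot long rows into one wide record per patient."""
    wide = defaultdict(dict)
    for row in long_rows:
        record = wide[row["patient"]]
        record["sex"] = standardize_sex(row["sex"])      # cleansing + standardization
        record[row["measure"]] = float(row["value"])     # text to numeric
    return dict(wide)


wide = pivot(rows)

# Aggregation: average height across patients that have a height recorded.
mean_height = mean(r["height_cm"] for r in wide.values() if "height_cm" in r)

print(wide["A"])
print(mean_height)
```

Unmapped entries are kept as "UNKNOWN" rather than dropped, so they can be reviewed instead of silently disappearing.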

Case Studies

Multi-Database consolidation: We collaborated with a team that had data in Epic, REDCap, Oracle, and Excel files. We consolidated the data into a single system to support their needs. Data flows between the various components to support different functionality. The work has helped the customer with their grant applications, paper writing, recruitment, and patient follow-up.

Data Quality: We worked with a team that had been collecting data for many years and had worked hard to clean it manually. We took the data from their homegrown solutions and loaded it into our tools, which helped identify and clean up problems that had been missed in their original work. The team is now working with a much cleaner data set. Additionally, they can enter data into the system more quickly than they could with the previous solution.

Data Warehousing

We currently house the NCH Clarity clinical dataset and provide a de-identified portal through which researchers can obtain counts for cohort identification before their next study. We enable data warehousing solutions for your large-scale data projects and can enrich your data from multiple sources (geographic, census, PHIS).

Services

  • De-identified portal for determining patient cohort before starting research.
  • Custom data warehouse creation and design.
  • Data enrichment: we can combine external and internal sources and enrich them with clinical data.
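
To illustrate the idea of a portal that returns only counts for cohort identification, here is a minimal sketch using Python's built-in `sqlite3` module. The `encounters` table, its columns, the sample rows, and the `cohort_count` function are hypothetical and exist only for this example; they are not the schema or interface of the actual portal.

```python
import sqlite3

# Hypothetical de-identified table: patient_key is an arbitrary surrogate key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounters (patient_key INTEGER, age_years INTEGER, diagnosis TEXT)"
)
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?)",
    [
        (1, 4, "asthma"),
        (1, 5, "asthma"),
        (2, 9, "asthma"),
        (3, 7, "diabetes"),
        (4, 15, "asthma"),
    ],
)


def cohort_count(connection, diagnosis, min_age, max_age):
    """Return only the number of distinct patients matching the criteria.

    No row-level data leaves this function, only an aggregate count.
    """
    row = connection.execute(
        "SELECT COUNT(DISTINCT patient_key) FROM encounters "
        "WHERE diagnosis = ? AND age_years BETWEEN ? AND ?",
        (diagnosis, min_age, max_age),
    ).fetchone()
    return row[0]


print(cohort_count(conn, "asthma", 0, 12))
```

Counting distinct patient keys rather than rows keeps a patient with several encounters from being counted more than once.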

Case Studies

PEDSnet: We participate in the PEDSnet data collaborative. This is a network of children’s hospitals that have normalized their data for research. The data goes through a multi-layer modeling and cleansing process to produce standardized data sets. The data has been used to support a variety of studies across many different conditions.

More information is available here: https://pedsnet.org/

Honest Broker

Our team acts on behalf of your team to extract health information, de-identify it, and provide it to research investigators in such a manner that it would not be reasonably possible for the investigators or others to identify the corresponding patients/subjects directly or indirectly, ensuring the integrity of PHI.

Services

  • Be the honest broker over health information between the covered entity and the investigator.
  • Collect the health information, de-identify it, and provide it to the investigator in such a manner that it would not be reasonably possible for the PI or others to identify the patients/subjects directly or indirectly.
  • Provide limited data sets by removing any HIPAA-defined direct identifiers.
  • The anonymized information provided to investigators by the honest broker may include linkage codes to permit easier organization of the information and/or linking for subsequent inquiries. However, any information linking this re-identification code to the patient’s identity must be retained by the honest broker, secured and separate from research documents. All subsequent inquiries are conducted through the honest broker.
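
The linkage-code arrangement described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the team's implementation: the `HonestBroker` class, the `DIRECT_IDENTIFIERS` set (which is not the full HIPAA list), the field names, and the use of random hex codes are all hypothetical.

```python
import secrets

# Assumed direct identifiers for this sketch; NOT the full HIPAA list.
DIRECT_IDENTIFIERS = {"mrn", "name", "specimen_id"}


class HonestBroker:
    """Holds the linkage table; investigators only ever see the study codes."""

    def __init__(self):
        # Linkage table kept by the broker, separate from research documents.
        self._code_by_mrn = {}

    def _code_for(self, mrn):
        # Reuse the same random code for a patient across requests.
        if mrn not in self._code_by_mrn:
            self._code_by_mrn[mrn] = secrets.token_hex(8)
        return self._code_by_mrn[mrn]

    def release(self, records):
        """Strip direct identifiers and attach a linkage code to each record."""
        released = []
        for record in records:
            clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
            clean["study_id"] = self._code_for(record["mrn"])
            released.append(clean)
        return released

    def reidentify(self, study_id):
        """Broker-only lookup from a linkage code back to the original MRN."""
        for mrn, code in self._code_by_mrn.items():
            if code == study_id:
                return mrn
        raise KeyError(study_id)


broker = HonestBroker()
first = broker.release([{"mrn": "123", "name": "Pat Doe", "diagnosis": "asthma"}])
later = broker.release([{"mrn": "123", "name": "Pat Doe", "specimen_type": "blood"}])
print(first[0]["study_id"] == later[0]["study_id"])
```

Because the code is random rather than derived from the MRN, it cannot be reversed without the broker's table, and a follow-up request for the same patient still lines up with the earlier file.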

Case Study

We worked with a PI who needed to combine Epic data with specimen data. The resulting data set was not allowed to contain patient identifiers, specimen identifiers, or other PHI. We provided the PI with a file that contained all of the relevant clinical and specimen information while protecting PHI. New identifiers were generated for the file, which allows the PI to return and request more data from us while still maintaining confidentiality. We retain the ability to link each newly generated identifier back to the original PHI, so we can add more data and return it to the PI without the PI ever having to know the original identifiers.

Reporting Services

The Data Services team can use multiple reporting tools, such as SQL Server Reporting Services (SSRS) and QlikView.

Services

  • Create static & dynamic reports using SSRS
  • Create QlikView reports.

Case Study

SSRS: A research group came to us with a complicated data set crossing multiple applications. They had two distinct reporting requests. First, they needed quick access to the most recent information about a patient before meeting with that patient in clinic or making a follow-up phone call. We developed a series of reports that consolidated the data across multiple systems to show the most up-to-date information about the patient they were about to interact with. Second, they needed historical data across many patients for cohort identification and study purposes. We produced reports that allow them to filter by a cross-section of criteria to find the patients eligible for their study. They can then export the data for more detailed analysis.

Electronic Data Capture

Data needs to be captured for clinical trials and research studies. Many groups accomplish this with paper forms or Excel files. We support systems that do the same work in an online electronic format. Data is captured in a web-based system that enables data consistency, data quality, auditing, and security. Existing data can be imported into these systems. When data collection is complete, the data can be exported for analysis in statistical packages. We use tools that are common across the research industry, which makes it possible to collect the same data at other sites using the same tools.

Services

  • Build and support for the REDCap and OpenClinica platforms
  • Integration of Epic data into REDCap
  • The ability to send surveys to parents and patients
  • Support for multi-site collaboration

Service Information

  • 650 active users of REDCap at NCH.
  • Over 3,000 institutions running REDCap

Big Data

RISI has deployed an institutional Hadoop cluster to provide an infrastructure for big data solutions in storage and analytics. We design algorithms and develop scalable and reliable applications to store and analyze large amounts of structured and unstructured data using distributed computing.

We store and analyze large amounts of structured, unstructured, and genomic data on distributed, fault-tolerant, highly available systems. Our team uses machine learning, natural language processing, and data mining algorithms to analyze big datasets in a timely fashion and provides access to the results in the cloud. We provide real-time streaming of information from high-throughput systems by storing and analyzing data in a distributed environment. Our technology allows data to be stored in SQL, NoSQL, graph, and schema-free formats on distributed file systems.

Services

  • Amazon AWS setup
  • Storing large amounts of information on fault-tolerant, highly available systems in tabular, graph or NoSQL databases
  • Designing algorithms to extract and analyze information in big databases
  • Storage and real-time analysis of large volumes of streaming data
  • Develop and/or refine algorithms to automate research data workflow processes

Case Studies

Big data storage and access management (such as genomics and clinical data): We provide solutions to easily store and manage terabytes of data. High-throughput instruments generate large volumes of data. Solutions currently under evaluation include a cloud-based approach, distributed data management using a Hadoop-like system, and traditional high-performance computing.

Genome Archiving Communication System: Researchers at Nationwide Children’s Hospital complete a first-of-its-kind project to evaluate a large-scale genomic data management system on the scale of up to one million genomes.