Data Collaboration

Data Management

We can create end-to-end database solutions on almost any database platform (SQL Server, Oracle, MySQL, PostgreSQL). We manage, protect, and secure your sensitive data, and we can enrich your current research data with clinical data from Epic (with IRB approval). We also have complete access to the PHIS repository and can build databases from this data, which comes from pediatric hospitals across the country. Our in-house certified database professionals can assist you with all your data management needs.


  • Database creation and design
  • Importing your existing data (ETL)
  • Data manipulation (pivoting, aggregation, etc.), cleansing (cleaning up dirty input), and standardization (creating valid lists of values)
  • Data integration with Epic data to create a single combined data set
  • Algorithm implementation to calculate results from raw data
  • Securing your data, including addressing PHI concerns and limiting access to those on the IRB (restricting what can be done with the data)
  • Backup and recovery so you know your data is protected
  • Performance tuning
  • PHIS repository extractions
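
As a rough illustration of the cleansing, standardization, aggregation, and pivoting steps listed above, here is a minimal sketch in plain Python. The column names, the sample rows, and the list of valid values are all made up for the example; a real project would typically do this work inside the database or an ETL tool.

```python
from collections import defaultdict

# Hypothetical raw rows, as they might arrive from a spreadsheet import.
RAW_ROWS = [
    {"patient": "A1", "sex": " f ", "visit_year": "2013", "weight_kg": "12.5"},
    {"patient": "A1", "sex": "Female", "visit_year": "2014", "weight_kg": "14.0"},
    {"patient": "B2", "sex": "M", "visit_year": "2013", "weight_kg": ""},
    {"patient": "B2", "sex": "male", "visit_year": "2014", "weight_kg": "20.5"},
]

# Standardization: map free-text entries onto a valid list of values (assumed).
SEX_VALUES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def clean_row(row):
    """Cleansing: trim whitespace, standardize codes, convert numbers."""
    weight_text = row["weight_kg"].strip()
    return {
        "patient": row["patient"].strip(),
        "sex": SEX_VALUES.get(row["sex"].strip().lower(), "UNKNOWN"),
        "visit_year": int(row["visit_year"]),
        "weight_kg": float(weight_text) if weight_text else None,
    }


def mean_weight_by_year(rows):
    """Aggregation: average weight per year, skipping missing values."""
    by_year = defaultdict(list)
    for row in rows:
        if row["weight_kg"] is not None:
            by_year[row["visit_year"]].append(row["weight_kg"])
    return {year: sum(vals) / len(vals) for year, vals in by_year.items()}


def pivot_weight(rows):
    """Pivoting: one record per patient, one column per visit year."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row["patient"]][row["visit_year"]] = row["weight_kg"]
    return dict(wide)


cleaned = [clean_row(r) for r in RAW_ROWS]
print(mean_weight_by_year(cleaned))
print(pivot_weight(cleaned))
```

Keeping each step in its own function makes it easier to review the rules one at a time and to rerun them when new data arrives.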

Case Studies

PHIS: PHIS is a large data warehouse run by the Children’s Hospital Association that contains de-identified clinical data from forty-four pediatric hospitals around the US. While we do not manage this data internally, we do pull it for researchers on request. We can export the data in numerous formats or into multiple database types, and we can aggregate it as needed.

Infectious Disease Case Study: The Infectious Disease group has a significant amount of survey data in REDCap. Our team linked this REDCap data to Epic data by finding, for each REDCap response, the corresponding visit in Epic. We built data validation and automated REDCap data loading to compare the two data sets and load them against each other, including quality checks and automated business rules and algorithms. This helps reduce data errors and improves the consistency of the results that support the research study.
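
To give a feel for this kind of linking and validation, the sketch below matches each survey response to the nearest visit for the same patient and rejects responses that fail a quality check. It is not the logic used in the study: the field names, the sample records, and the seven-day matching window are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical extracts; real REDCap and Epic field names will differ.
REDCAP_RESPONSES = [
    {"record_id": 1, "mrn": "1001", "survey_date": date(2014, 3, 4)},
    {"record_id": 2, "mrn": "1002", "survey_date": date(2014, 5, 20)},
    {"record_id": 3, "mrn": "", "survey_date": date(2014, 6, 1)},
]
EPIC_VISITS = [
    {"visit_id": "V1", "mrn": "1001", "visit_date": date(2014, 3, 3)},
    {"visit_id": "V2", "mrn": "1001", "visit_date": date(2014, 9, 15)},
    {"visit_id": "V3", "mrn": "1002", "visit_date": date(2013, 1, 10)},
]

MAX_GAP_DAYS = 7  # assumed business rule: survey within a week of the visit


def gap_days(response, visit):
    """Absolute number of days between a survey response and a visit."""
    return abs((visit["visit_date"] - response["survey_date"]).days)


def link_response(response, visits, max_gap_days=MAX_GAP_DAYS):
    """Return (visit_id, issue); exactly one of the two is None."""
    if not response["mrn"]:
        return None, "missing MRN"
    candidates = [v for v in visits if v["mrn"] == response["mrn"]]
    if not candidates:
        return None, "no visits for patient"
    nearest = min(candidates, key=lambda v: gap_days(response, v))
    gap = gap_days(response, nearest)
    if gap > max_gap_days:
        return None, f"nearest visit is {gap} days away"
    return nearest["visit_id"], None


def load_and_validate(responses, visits):
    """Split responses into linked records and rejected records."""
    linked, rejected = [], []
    for response in responses:
        visit_id, issue = link_response(response, visits)
        if issue is None:
            linked.append({"record_id": response["record_id"], "visit_id": visit_id})
        else:
            rejected.append({"record_id": response["record_id"], "issue": issue})
    return linked, rejected


linked, rejected = load_and_validate(REDCAP_RESPONSES, EPIC_VISITS)
print(linked)
print(rejected)
```

Rejected records carry a reason, so they can be reviewed and corrected rather than silently dropped.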

Data Warehousing

We currently house the complete NCH Clarity clinical dataset and provide a de-identified portal through which you can obtain cohort counts for your next study. We enable true data warehousing solutions for your large-scale data projects and can enrich your data from multiple sources (geographic, census, PHIS).


  • Integrating your research data into i2b2 for data exploration. i2b2 (Informatics for Integrating Biology and the Bedside) is an informatics framework that enables clinical researchers to use existing de-identified NCH clinical data for discovery research (cohort identification). With IRB approval, we can extract identified clinical data for IRB-approved researchers and host it in a secured location for your investigative use. See more at
  • Custom data warehouse creation and design.
  • Data enrichment: we can draw on external and internal sources and enrich your data with PHI or non-PHI clinical data (based on IRB approval)

Case Studies

RDMF: A large repository of identified Epic data oriented toward research and research data. This star-schema data warehouse is used by RISI to enrich clinical data with other data such as genetic information and survey data. After receiving IRB approval, we can extract the exact population that was identified in i2b2 for your cohorts.

i2b2: Allows researchers to easily obtain counts of cohorts that meet their study requirements, for use in grant or IRB submissions. i2b2 lets the researcher determine whether Epic contains a large enough population that will qualify for their study. It is an easy-to-use front end for exploring de-identified data within NCH’s Epic/Clarity without the complexity that comes with clinical data.

Honest Broker

Our team acts on behalf of the covered entity to collect health information, de-identify it, and provide it to our research investigators in such a manner that it would not be reasonably possible for the investigators or others to identify the corresponding patients/subjects directly or indirectly, thereby protecting PHI.


  • Be the honest broker over health information between the covered entity and the investigator.
  • Collect the health information, de-identify it, and provide it to the investigator in such a manner that it would not be reasonably possible for the PI or others to identify the patients/subjects directly or indirectly.
  • Provide limited data sets by removing any HIPAA-defined direct identifiers.
  • The anonymized information provided to investigators by the honest broker may include linkage codes to permit easier organization of the information and/or linking for subsequent inquiries. However, any information linking this re-identification code to the patient’s identity must be retained by the honest broker, secured and separate from research documents. All subsequent inquiries are conducted through the honest broker.

Case Study

We worked with a PI who needed to combine Epic data with specimen data. The resulting data set was not allowed to contain patient identifiers, specimen identifiers, or other PHI. We provided the PI with a file containing all of the relevant clinical and specimen information while keeping PHI out of it. New identifiers were generated for the file, which allows the PI to return and request more data from us while still maintaining confidentiality. We retain the ability to trace each newly generated identifier back to the original PHI, so we can add more data and return it to the PI without the PI ever needing to know the original identifiers.
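
The identifier replacement described here can be sketched roughly as follows. This is an illustrative outline only, not the team's actual process: the `HonestBroker` class, the field names, and the set of direct identifiers are hypothetical, and the codes are random (generated with Python's `secrets` module) so that they cannot be reversed without the broker's private crosswalk.

```python
import secrets

# Assumed set of fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"mrn", "name", "specimen_id"}


class HonestBroker:
    """Hypothetical broker that swaps real identifiers for study codes."""

    def __init__(self):
        # (kind, original id) -> study code; kept by the broker only.
        self._crosswalk = {}

    def _code_for(self, kind, original_id):
        key = (kind, original_id)
        if key not in self._crosswalk:
            self._crosswalk[key] = f"{kind}-{secrets.token_hex(8)}"
        return self._crosswalk[key]

    def release(self, rows):
        """Return copies of rows with identifiers replaced by study codes."""
        released = []
        for row in rows:
            safe = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
            safe["patient_code"] = self._code_for("P", row["mrn"])
            safe["specimen_code"] = self._code_for("S", row["specimen_id"])
            released.append(safe)
        return released

    def resolve(self, code):
        """Broker-side lookup from a study code back to the original id."""
        for (_kind, original_id), known in self._crosswalk.items():
            if known == code:
                return original_id
        raise KeyError(code)


broker = HonestBroker()
first = broker.release(
    [{"mrn": "1001", "name": "Example Patient", "specimen_id": "SP-9", "result": "positive"}]
)
# A later request for the same patient reuses the same patient code.
second = broker.release(
    [{"mrn": "1001", "name": "Example Patient", "specimen_id": "SP-12", "result": "negative"}]
)
print(first[0])
print(second[0])
```

Because the same patient always receives the same code, later extracts can be joined to earlier ones by the investigator, while only the broker can map a code back to the original record.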

Reporting Services

The Data Services team can use multiple tools, such as SQL Server Reporting Services (SSRS), Power Pivot, and Power View (available with Excel 2013).


  • Create static & dynamic reports using SSRS
  • Dynamic Reporting using Power Pivot (Excel)
  • Dynamic Reporting using Power View

Case Study

SSRS: We can create reports that accept multiple input parameters, all of them optional, which gives the researcher flexibility in filtering the same dataset. We can also use SSRS to create geographic maps that drill down from country to state (and even to county, although PHI remains a concern) and show aggregated data at each level. Maps are color-coded to show penetration of an area.
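
One common way to make every report parameter optional is to treat a missing parameter as "no filter" inside the query itself. The sketch below shows that pattern using Python's built-in `sqlite3` module rather than SSRS; the table, the column names, and the counts are invented for the example, and an actual SSRS report would express the same idea in its dataset query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE encounters (state TEXT, county TEXT, dx TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?, ?)",
    [
        # Made-up illustrative rows, not real data.
        ("OH", "County A", "asthma", 40),
        ("OH", "County B", "asthma", 10),
        ("OH", "County A", "flu", 25),
        ("KY", "County C", "asthma", 5),
    ],
)

# Each filter is skipped when its parameter is NULL (None in Python).
QUERY = """
SELECT state, SUM(n) AS total
FROM encounters
WHERE (:state IS NULL OR state = :state)
  AND (:dx IS NULL OR dx = :dx)
GROUP BY state
ORDER BY state
"""


def totals_by_state(state=None, dx=None):
    """Aggregate counts per state, applying only the filters supplied."""
    return conn.execute(QUERY, {"state": state, "dx": dx}).fetchall()


print(totals_by_state())                      # no filters
print(totals_by_state(dx="asthma"))           # one filter
print(totals_by_state(state="OH", dx="flu"))  # both filters
```

The same query serves every combination of filters, so a single report definition can cover the unfiltered view as well as each narrowed view.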

Electronic Data Capture

Data needs to be captured for clinical trials and research studies. Many groups accomplish this with paper forms or Excel files. We support systems that let you do the same work in an online electronic format. Data is captured in a web-based system that provides data consistency, data quality, auditing, and security. Existing data can be imported into these systems, and once data collection is complete, the data can be exported for analysis in statistical packages. Because these tools are commonly used across the research industry, the same data can also be collected at other sites that use them.


  • Build and support for the REDCap platform
  • Integration of Epic data into REDCap
  • The ability to send surveys to parents and patients
  • Supports multi-site collaboration
  • Build and support for the OpenClinica platform

Service Information

  • 260 REDCap Research Projects in use
  • Over 2,000 institutions running REDCap
  • 18 OpenClinica Research Projects in use