Workstream B: Standards
Standards and Ontologies for Integration, Analysis, and Exchange of Global Coherent Datasets
Lead: Jessie Tenenbaum, Duke Translational Medicine Institute
Participants: James Brenton, Kimberly Hartwell, Cameron Neylon, Jonathan Rees, Susanna-Assunta Sansone, Philippe Rocca-Serra, Jim Davies, Steve Harris
Introduction
A prerequisite for the success of the Sage Commons is a standardized approach to data organization and annotation. The Standards and Ontology Project aims to identify:
- What metadata elements should be captured regarding the experimental setting, the resulting data, and the analyses
- What standards to use to capture those data elements
- Which data elements should draw from which ontologies
- What scope of data types should be included in each release
Activities before the Congress
For two specific datasets (BxH ApoE -/- and TCGA glioblastoma):
- Enumerate all data elements to be captured in a structured format (content)
- Identify data types for each data element
- Ideally leverage existing standardized common data elements (CDEs), e.g. from caDSR
- Identify ontologies to use where appropriate (semantics)
- Use an existing tool (likely ISA-Creator) to annotate these two datasets as proof of concept
- Clarify scope including experiment types (e.g. RNAi, drug screening, case-control) and modalities (e.g. gene expression, genotyping, genome sequencing, CNV, etc.)
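To make the first few steps above concrete (enumerating data elements, assigning each a data type, and noting which ontology its values should draw from), here is a minimal sketch in Python. It is an illustration only, not the format used by ISA-Creator or caDSR: the class `DataElement`, the function `validate`, the element names, and the ontology labels are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """One metadata element: its name, data type, and (optionally) a source ontology."""
    name: str
    data_type: type
    ontology: Optional[str] = None  # hypothetical label, not a real ontology identifier
    required: bool = True


# Hypothetical element list; a real list would come from the enumeration step above.
ELEMENTS = [
    DataElement("experiment_type", str, ontology="EXAMPLE_EXPERIMENT_ONTOLOGY"),
    DataElement("modality", str, ontology="EXAMPLE_ASSAY_ONTOLOGY"),
    DataElement("sample_count", int),
    DataElement("notes", str, required=False),
]


def validate(record, elements=ELEMENTS):
    """Return a list of problems found in one annotation record (empty if none)."""
    problems = []
    for el in elements:
        if el.name not in record:
            if el.required:
                problems.append(f"missing required element: {el.name}")
            continue
        value = record[el.name]
        if not isinstance(value, el.data_type):
            problems.append(
                f"{el.name}: expected {el.data_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Flag anything not in the agreed element list.
    known = {el.name for el in elements}
    for key in sorted(set(record) - known):
        problems.append(f"unknown element: {key}")
    return problems


good = {"experiment_type": "case-control", "modality": "gene expression", "sample_count": 12}
bad = {"experiment_type": "RNAi", "sample_count": "12"}

print(validate(good))
print(validate(bad))
```

The sketch checks only presence and data type. Checking that a value is actually a term from the named ontology would require a lookup against that ontology, which is deliberately left out here.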
Activities at the Congress
- A presentation to communicate the issues this group is addressing and an overview of progress to date (without going into so much detail that even diehard geeks’ eyes glaze over)
- Distribute information on how interested attendees can give feedback and get involved.
Activities after the Congress
- Evaluate the annotation tool used for the proof-of-concept annotation for its suitability as the tool of choice for contributors
- Extend proof-of-concept annotations to the other ~4 datasets (?) available on Sage. Metadata elements for ovarian cancer are a key priority, given the imminent arrival of the TCGA set and the proposed prospective study profiling ovarian tissues before platinum-based chemotherapy in first relapse.
- Integrate the annotation tool with a metadata repository from which to draw common data elements