Latest revision as of 19:04, 2 June 2015

Goals and Objectives:

Aim-3: Develop an online informatics portal and data warehouse for ontology-based, annotated plant genome data and plant genomes.

Deliverables: A centralized portal for common reference ontologies for plants and the associated data sets. Novel data store and web user interface.

3.1 Planteome Web Portal Development

Drupal portal will host the AmiGO browser, the ontology database (similar to the one developed by the PO and the GO), and a BioMart
Transition to AmiGO 2.0 with new features

3.2 Planteome Data Warehouse Development

Novel data warehouse for storing both the ontologies and annotation data based on NoSQL (e.g. MongoDB, http://www.mongodb.org, and Apache™ Hadoop®, http://hadoop.apache.org)
Integrate the MapReduce algorithm to increase scalability and performance
Investigate using HDF (Hierarchical Data Format) as a storage format for any numerical or sequence-based data
Create an efficient way to add annotations incrementally to the database (not possible in the current AmiGO database)
Implementation of OLAP (Online Analytical Processing) data cubes (http://en.wikipedia.org/wiki/OLAP_cube)
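
To make the MapReduce, incremental-update, and data-cube items in 3.2 more concrete, here is a minimal sketch in plain Python. It is not the project's design. The record layout, the placeholder term IDs, and the function names (`map_phase`, `reduce_phase`, `add_annotation`, `roll_up`) are all assumptions made for illustration, and an in-memory dictionary stands in for whatever NoSQL store is eventually chosen.

```python
from collections import defaultdict

# Hypothetical annotation records: (ontology term ID, species, evidence code).
# The term IDs are placeholders, not real Plant Ontology identifiers.
ANNOTATIONS = [
    ("PO:0000001", "Oryza sativa", "IEA"),
    ("PO:0000001", "Zea mays", "IMP"),
    ("PO:0000002", "Oryza sativa", "IEA"),
    ("PO:0000001", "Oryza sativa", "IMP"),
]


def map_phase(record):
    """Map step: emit one ((term, species), 1) pair per annotation."""
    term, species, _evidence = record
    yield (term, species), 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def add_annotation(cube, record):
    """Incremental update: a new record touches a single cell of the cube."""
    for key, value in map_phase(record):
        cube[key] = cube.get(key, 0) + value


def roll_up(cube, axis):
    """OLAP-style roll-up: keep one dimension (0 = term, 1 = species)."""
    totals = defaultdict(int)
    for key, count in cube.items():
        totals[key[axis]] += count
    return dict(totals)


# Build the two-dimensional cube (term x species) from scratch.
cube = reduce_phase(pair for rec in ANNOTATIONS for pair in map_phase(rec))
by_term = roll_up(cube, 0)
by_species = roll_up(cube, 1)
```

Because each record maps to exactly one key, appending an annotation with `add_annotation` does not require recomputing the whole cube, which is the property the incremental-update goal asks for.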

3.3 Integration with the iPlant infrastructure

Initial design and testing will happen locally at the Center for Genome Research and Biocomputing at Oregon State University
Use of virtual machine (VM) images in the iPlant cloud computing environments
Utilization of high performance computing resources, such as:
- The supercomputer 'Stampede' at Texas Advanced Computing Center (TACC)
- Use of iRODS at iPlant for data file storage and retrieval
- Image hosting via Bisque hosted on the iPlant infrastructure (See 3.4, below)
Interaction with resources such as CoGE, Bisque, and the Integrated Breeding Platform (IBP)

3.4 Library of Publicly-Accessible, Annotated Digital Images

Design a relational data schema to support the large-scale storage of annotated images (and their associated metadata)
Image library main goal: A training set for a new auto-segmentation and annotation active-learning algorithm
Support other visual analysis tools and the integration of image data with ontology data
Will also function as a home for community-contributed image data
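
One possible shape for the relational schema mentioned in 3.4 is sketched below using Python's built-in `sqlite3` module. The table and column names (`image`, `image_annotation`, the bounding-box fields) and the sample rows are hypothetical. The actual schema, database engine, and segmentation format were not specified on this page.

```python
import sqlite3

# Hypothetical schema: one row per image, one row per annotated region.
SCHEMA = """
CREATE TABLE image (
    image_id    INTEGER PRIMARY KEY,
    uri         TEXT NOT NULL,      -- where the image file is hosted
    contributor TEXT,               -- community member who supplied it
    species     TEXT
);
CREATE TABLE image_annotation (
    annotation_id INTEGER PRIMARY KEY,
    image_id      INTEGER NOT NULL REFERENCES image(image_id),
    term_id       TEXT NOT NULL,    -- ontology term describing the region
    x INTEGER, y INTEGER, width INTEGER, height INTEGER
);
CREATE INDEX idx_annotation_term ON image_annotation(term_id);
"""


def images_for_term(conn, term_id):
    """Return the URIs of all images with a region annotated to term_id."""
    rows = conn.execute(
        "SELECT DISTINCT i.uri FROM image AS i "
        "JOIN image_annotation AS a ON a.image_id = i.image_id "
        "WHERE a.term_id = ? ORDER BY i.uri",
        (term_id,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder data; the term IDs are not real ontology identifiers.
conn.executemany(
    "INSERT INTO image (image_id, uri, contributor, species) VALUES (?, ?, ?, ?)",
    [
        (1, "leaf_001.png", "example-user", "Oryza sativa"),
        (2, "root_001.png", "example-user", "Zea mays"),
    ],
)
conn.executemany(
    "INSERT INTO image_annotation (image_id, term_id, x, y, width, height) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "PO:0000001", 10, 20, 100, 80),
        (1, "PO:0000002", 0, 0, 50, 50),
        (2, "PO:0000002", 5, 5, 40, 40),
    ],
)
conn.commit()
```

Keeping annotations in their own table lets one image carry many labelled regions, and the index on `term_id` supports the "all images for this ontology term" query that a training-set export would likely need.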

3.5 Application Programming Interface (APIs)

Develop publicly available APIs for both internal and external data access to ontology terms and annotations
Extend the existing lightweight web services providing Plant Ontology terms, synonyms, and definitions to the Planteome APIs, including direct web service access to annotated data
Potential Users:
- Gramene project: information about annotations and ontologies
- DOE KBase project (http://kbase.science.energy.gov/)
- iPlant tools and services.

Integrate our data with other external APIs, for example:
- EBI (the Gene Expression Atlas, Ensembl Plants, IntAct)
- ERA-CAPS (genotype-to-phenotype data)
- DOE KBase
- GCP Integrated Breeding Platform
- Agave on iPlant, which provides web-focused developer access to the iPlant data store and other integration services, including a direct link to high-performance computing systems such as those at TACC
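
As an illustration of the lightweight term service described in 3.5 (terms, synonyms, and definitions), the sketch below shows how a lookup could return a JSON payload. It is not the existing Plant Ontology web service or the Planteome API. The function name `lookup_term`, the response fields, and the sample record are assumptions, and the in-memory dictionary stands in for the real data store.

```python
import json

# Placeholder term store; the ID and text are illustrative, not real PO data.
TERMS = {
    "PO:0000001": {
        "name": "example term",
        "synonyms": ["example synonym"],
        "definition": "Placeholder definition.",
    },
}


def lookup_term(term_id, terms=TERMS):
    """Return an (HTTP-style status code, JSON body) pair for a term ID."""
    record = terms.get(term_id)
    if record is None:
        return 404, json.dumps({"error": "term not found", "id": term_id})
    body = {"id": term_id, **record}
    return 200, json.dumps(body, sort_keys=True)


status, body = lookup_term("PO:0000001")
```

Returning a status code alongside the body keeps the lookup logic independent of any particular web framework, so the same function could sit behind internal tools and external clients alike.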

Participants

Jaiswal Lab (OSU, BPP): Justin Elser
Mungall Group (Lawrence Berkeley National Laboratory): Chris Mungall (Co-PI), Seth Carbon
Zhang Lab (OSU, EECS): Eugene Zhang (Co-PI), Botong Qu (CS Ph.D. student)

Link to Data storage and AmiGO 2 Working Group Meetings

Data storage and AmiGO2 Working Group: Difference between revisions
