Goals and Objectives:

Aim-3: Develop an online informatics portal and data warehouse for ontology-based, annotated plant genome data and plant genomes.

Deliverables: A centralized portal for common reference ontologies for plants and the associated data sets. Novel data store and web user interface.

The Drupal portal will host the AmiGO browser, the ontology database (similar to those developed by the PO and the GO), and a BioMart instance
Transition to AmiGO 2.0 with new features

Novel data warehouse for storing both the ontologies and the annotation data, based on NoSQL and distributed-processing technologies (e.g., MongoDB, http://www.mongodb.org, and Apache™ Hadoop®, http://hadoop.apache.org)
Integrate the MapReduce programming model to increase scalability and performance
Investigate using HDF (Hierarchical Data Format) as a storage format for any numerical or sequence-based data
Create an efficient way to add annotations incrementally to the database (not possible in the current AmiGO database)
Implement OLAP (Online Analytical Processing) data cubes (http://en.wikipedia.org/wiki/OLAP_cube)
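
As a rough illustration of how the MapReduce model could apply to annotation data, the sketch below counts annotations per ontology term in plain Python. The record fields, identifiers, and function names are hypothetical and are not taken from the project; a real deployment would run the map and reduce phases on a framework such as Hadoop rather than in a single process.

```python
from collections import defaultdict

# Hypothetical annotation records; identifiers and fields are illustrative only.
ANNOTATIONS = [
    {"object": "gene_a", "term": "TERM:0001", "species": "species_x"},
    {"object": "gene_b", "term": "TERM:0001", "species": "species_y"},
    {"object": "gene_c", "term": "TERM:0002", "species": "species_x"},
]


def map_phase(record):
    """Emit one (key, 1) pair per annotation, keyed by ontology term."""
    yield record["term"], 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def count_annotations_per_term(records):
    """Run map, shuffle, and reduce in-process over the given records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(count_annotations_per_term(ANNOTATIONS))
```

Because each map call looks at a single record and each reduce call at a single key, both phases can in principle be spread across many workers. Changing the key emitted by `map_phase` (for example to a term and species pair) would give the kind of grouped counts that an OLAP-style roll-up needs.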

Initial design and testing will happen locally at the Center for Genome Research and Biocomputing at Oregon State University
Use of virtual machine (VM) images in the iPlant cloud computing environments
Utilization of high-performance computing and data resources, such as:
- The supercomputer 'Stampede' at the Texas Advanced Computing Center (TACC)
- iRODS at iPlant for data file storage and retrieval
- Image hosting via Bisque hosted on the iPlant infrastructure (See 3.4, below)
Interaction with resources such as CoGe, Bisque, and the Integrated Breeding Platform (IBP)

Design a relational data schema to support the large-scale storage of annotated images (and their associated metadata)
Main goal of the image library: a training set for a new active-learning algorithm for auto-segmentation and annotation
Support other visual analysis tools and the integration of image data with ontology data
Will also function as a home for community-contributed image data
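
One possible shape for the relational image schema is sketched below using SQLite from the Python standard library. The table names, column names, and sample rows are assumptions made for illustration only; the actual schema, database engine, and metadata fields are not specified here.

```python
import sqlite3

# Hypothetical schema: one row per image, one row per ontology annotation.
SCHEMA = """
CREATE TABLE image (
    image_id    INTEGER PRIMARY KEY,
    uri         TEXT NOT NULL UNIQUE,   -- location of the image file
    contributor TEXT,                   -- community member who supplied it
    species     TEXT
);
CREATE TABLE image_annotation (
    annotation_id INTEGER PRIMARY KEY,
    image_id      INTEGER NOT NULL REFERENCES image(image_id),
    term_id       TEXT NOT NULL,        -- ontology term identifier
    region        TEXT                  -- optional segment, e.g. a bounding box
);
CREATE INDEX idx_annotation_term ON image_annotation(term_id);
"""


def build_demo_db():
    """Create an in-memory database with one image and two annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO image (image_id, uri, contributor, species) VALUES (?, ?, ?, ?)",
        (1, "example/leaf_001.png", "demo_user", "species_x"),
    )
    conn.executemany(
        "INSERT INTO image_annotation (image_id, term_id, region) VALUES (?, ?, ?)",
        [(1, "TERM:0001", "10,10,50,50"), (1, "TERM:0002", None)],
    )
    conn.commit()
    return conn


def images_for_term(conn, term_id):
    """Return the URIs of all images annotated with the given term."""
    rows = conn.execute(
        "SELECT i.uri FROM image AS i "
        "JOIN image_annotation AS a ON a.image_id = i.image_id "
        "WHERE a.term_id = ? ORDER BY i.uri",
        (term_id,),
    )
    return [row[0] for row in rows]
```

Keeping annotations in their own table lets a single image carry many ontology terms and optional segment regions, which is what a training set for segmentation would likely need.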

Development of publicly available APIs for both internal and external data access to ontology terms and annotations
Extend the existing lightweight web services providing Plant Ontology terms, synonyms, and definitions to the Planteome APIs, including direct web service access to annotated data
Potential Users:
- Gramene project: information about annotations and ontologies
- DOE KBase project (http://kbase.science.energy.gov/)
- iPlant tools and services
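
To make the idea of a lightweight term service concrete, the sketch below shows how a lookup returning a term's name, synonyms, and definition as JSON might behave. The function name, response fields, status codes, and sample term are all hypothetical; the real Planteome API routes and payloads are not described here.

```python
import json

# Hypothetical in-memory term store standing in for the ontology database.
TERMS = {
    "TERM:0001": {
        "name": "example term",
        "synonyms": ["sample term"],
        "definition": "A placeholder definition.",
    },
}


def term_response(term_id, store=TERMS):
    """Return an (HTTP-style status, JSON body) pair for a term lookup."""
    term = store.get(term_id)
    if term is None:
        return 404, json.dumps({"error": "term not found", "id": term_id})
    body = {"id": term_id, **term}
    return 200, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    status, body = term_response("TERM:0001")
    print(status, body)
```

An external client would only need to parse the JSON body, so the same response shape could serve both internal tools and outside consumers.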

Jaiswal Lab (OSU, BPP): Justin Elser
Mungall Group (Lawrence Berkeley National Laboratory): Chris Mungall (Co-PI), Seth Carbon
Zhang Lab (OSU, EECS): Eugene Zhang (Co-PI), Botong Qu (CS Ph.D. student)