Ontology Working Group Meeting Tuesday, Feb 17th, 2015

From Planteome.org
Revision as of 23:28, 20 April 2015 by Cooperl (talk | contribs)

Who: PJ, LC, CM

1. Discussion of the GitHub Repository: https://github.com/Planteome

  • Intro to the GitHub site for the Planteome
    • The procedure will be to do a check out. Committing is a two-step process: make changes and commit them locally, then do a "push" to synchronize with the server version.
    • Branching: It is different than forking
  • What is 'Forking': (we probably won't need to be doing this)
    • You can copy a version of the ontology into your own user space, make changes there, and then make a "pull request"
    • An administrator can review the pull request, look at the diffs, and select the parts that are accepted. Pull requests can also be voted on by the project members
    • Works best if the files are easily "diffable": OBO works well, while OWL is less well suited to this. Changes may be coming to OWL to make this work better.
    • Different than doing a check out
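
To illustrate why a line-oriented format such as OBO is easy to review in a pull request, here is a minimal sketch using Python's standard difflib module. The two stanza revisions are illustrative only (the term ID and the intersection_of lines are taken from the logical definition example in these minutes, and the helper name changed_lines is hypothetical); this is not the actual TO file or the group's tooling.

```python
import difflib

# Hypothetical "before" revision of a stanza, one tag-value pair per line.
OLD_STANZA = [
    "[Term]",
    "id: TO:0000024",
    "name: ligule length",
]

# Hypothetical "after" revision: two logical-definition lines were added.
NEW_STANZA = OLD_STANZA + [
    "intersection_of: affects_quality PATO:0000122 ! length",
    "intersection_of: attribute_of PO:0020105 ! ligule",
]


def changed_lines(old, new):
    """Return only the added ('+ ') and removed ('- ') lines between two revisions."""
    return [
        line
        for line in difflib.ndiff(old, new)
        if line.startswith(("+ ", "- "))
    ]


if __name__ == "__main__":
    # Each change shows up as one whole line, which is what a reviewer sees in a diff.
    for line in changed_lines(OLD_STANZA, NEW_STANZA):
        print(line)
```

Because each OBO tag-value pair sits on its own line, an edit appears as a small set of added or removed lines that a reviewer can accept or reject individually.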

2. Overview and update on the Plant Trait Ontology

- The Trait Ontology can be accessed in various ways:

Updates on the Trait Ontology:

  • Early work (~2005-2011) on the TO was done by PJ and CWT at Gramene, primarily to describe QTLs of rice
  • Since 2012, the Trait Ontology has not been funded under the PO grant
  • A large amount of interest from the community drove the TO forward (see: Crop Plant Trait Ontology Workshop, OSU, 2012)
  • A significant amount of work has been done to make the TO compatible with the PO and other reference ontologies
  • In Summary:
    • Revised definition of the root term: plant trait
    • Many changes to eight top level classes, new terms added, revised names of others
    • Renamed and added new upper level terms to align the anatomy and morphology trait (TO:0000017) class with the plant anatomical entity (PO:0025131) branch of the PO
    • Many new child terms added
    • Definitions revised to contain references to the entity term (from PO or GO) and quality term from PATO
    • Xrefs added to the SourceForge (SF) trackers
    • Full details of the changes have been documented on the Trait Ontology Wiki page and the Trait Ontology SourceForge tracker
  • Changes are currently stored on the SVN, with notes as to the nature of the changes.

TO on GitHub

  • A version of the TO was branched or forked out to GitHub in April 2014 (https://github.com/Planteome/plant-trait-ontology) and a large number (~680) of new terms from the TRY database were added to the ontology. This version is now out of sync with the main version.
  • LC, CM and JE will organize a call to discuss how best to set up the new repository. It needs to have a standard structure.
  • CM suggests that we use the script here [1]. It creates a number of extraneous files and does not currently support OBO files, although it can be modified.
  • Need to consider whether to transfer the history; this is fairly involved.

Logical Definitions; i.e. Cross Products

  • Chris has done some work on setting up logical definitions (also called Cross Products)
    • Cross products are a way to 'decompose' the TO terms into the element entities (from PO, GO, ChEBI or other ontology) and qualities (from PATO).
    • This allows automatic classification by the reasoner - so the computer can make inferences across the ontology
    • The cross product file (trait_xp.obo) is currently stored on our SVN; it was last updated 16 months ago
  • Example: ligule length (TO:0000024)

id: TO:0000024 ! ligule length
intersection_of: BATTO:0000001 ! biological attribute
intersection_of: affects_quality PATO:0000122 ! length
intersection_of: attribute_of PO:0020105 ! ligule

  • ligule: PO:0020105 (is_a cardinal organ part; part_of vascular leaf)
  • length: quality from PATO (PATO:0000122)
  • BATTO: now called "OBA" (Ontology of Biological Attributes), a larger ontology which unifies trait ontologies across all types of life.
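
The decomposition above can be sketched in code. This is a minimal, assumption-laden example that reads the ligule length stanza and separates the genus (the bare intersection_of line) from the relation/filler pairs. The function name parse_logical_definition and the returned dictionary layout are hypothetical, and the parser handles only the tags shown in this example, not the full OBO format.

```python
# The stanza from the ligule length example, as plain text.
STANZA = """\
id: TO:0000024 ! ligule length
intersection_of: BATTO:0000001 ! biological attribute
intersection_of: affects_quality PATO:0000122 ! length
intersection_of: attribute_of PO:0020105 ! ligule
"""


def parse_logical_definition(stanza):
    """Split a cross-product stanza into its id, genus, and (relation, filler) pairs."""
    result = {"id": None, "genus": None, "differentia": []}
    for raw in stanza.splitlines():
        # Text after '!' is a human-readable comment in these lines; drop it.
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "id":
            result["id"] = value
        elif tag == "intersection_of":
            parts = value.split()
            if len(parts) == 1:
                # A bare class ID is the genus (here, the biological attribute class).
                result["genus"] = parts[0]
            elif len(parts) == 2:
                # "relation filler" pairs point at the quality (PATO) and entity (PO).
                result["differentia"].append((parts[0], parts[1]))
    return result


if __name__ == "__main__":
    print(parse_logical_definition(STANZA))
```

With the stanza split this way, the entity (PO:0020105) and quality (PATO:0000122) are available as separate values, which is the information a reasoner uses for automatic classification.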

In upcoming versions of AmiGO2, it is anticipated that we will be able to create actual relationships between the TO terms and the component entities (from PO, GO, ChEBI or other ontology) and qualities (from PATO). The existing references will then be redundant.

3. Panzea Project: Curating Maize Diversity with MaizeGDB

  • Recent work with Mary Schaeffer of MaizeGDB was presented at the Plant and Animal Genome Conference in January 2015.
  • Link to Abstract
  • Link to presentation on Google Docs: Curating Maize Diversity
  • Panzea project ([2]) at Cornell, led by Edward Buckler ([3]): breeder data for a large number of GWAS and QTL studies.
  • Collaborative process of curating a set of data from 8 published papers (see list: References for the Panzea Project Annotations)
  • Panzea data is stored as free text in a large relational database, in the GDPDM schema (GDPDM Home Page)

Traits are not linked to ontology terms but are based on the community's name for them; for example:

Trait                        No. locations  Trait protocol
20KernelWeight               7              weight of 20 kernels
AGPase                       1              ADP-Glc pyrophosphorylase (EC 2.7.7.27)
AlaAT                        1              Ala aminotransferase (EC 2.6.1.2)
CobDiameter                  8              diameter of cob at halfway point of length, no kernels
CobWeight                    8              mass of cob only, no kernels
CS                           1              citrate synthase (EC 2.3.3.1)
DaystoSilk                   10             Days from planting date to silking date (cb)
DaysToTassel                 10             Days from planting date to tasseling date (ca)
EarDiameter                  8              diameter of ear at 1/5 length point with kernels on it or the widest point
EarHeight                    11             distance from soil surface to the highest ear-bearing node
EarLength                    8              length of cob from base to tip
EarRankNumber                7              number of rows along length
EarRowNumber                 8              number of rows around circumference
EarWeight                    8              mass of cob + kernels
GDDAnthesis-SilkingInterval  11             NULL
GDDDaystoSilk                10             NULL
GDDDaystoTassel              7              NULL

  • Eighty TO traits have been revised or new ones added to annotate 300,000 lines in the MaizeGDB database with the ontology terms
  • Germplasm includes mapping panels (200 IBM and 5000 NAM recombinant inbred) and public inbred lines, such as the Goodman diversity panel (~302 inbreds).
  • Metadata about the assays are recorded in the spreadsheets
  • Eventually it will be possible to search the MaizeGDB interface using the ontology terms (TO, PO, PATO, CO, etc.)

CM: Recommend a more precise prefix to be used for the Crop Ontologies- such as CGIAR_CO, in order to add it to the central registry of OBO ontologies. May want to use the Assay ontology, combined with the Unit ontologies.

The plan is to eventually host these annotations at the Planteome to support more advanced searches, at least at the population level, with associations (QTLs and linkage disequilibrium) and links to MaizeGDB, where the bulk of the data will be stored.
