Ontology Working Group Meeting, Tuesday May 26th, 2015

From Planteome.org

Agenda

Who: LC, MAL, CM, PJ, BS

Absent: EA, JE, etc.

Update from Barry on the UN Initiative:

  • UN is adopting ontologies to describe indicators for sustainable development
  • See additional notes on Ontology_Working_Group_Meeting_Tuesday,_Feb_24th,_2015
  • Barry is working with Ramona and Pier B (ENVO) on metrics for environmental conditions
  • Goals have been published; the link still needs to be found
  • Wide-ranging set of goals encompassing agriculture, health and environment
  • 'Plant Stress Ontology' is of interest for describing stressors of plants in agriculture, forestry and fisheries
  • The Plant Experimental Conditions Ontology (PECO/EO) and the Plant Stress Ontology overlap in some aspects

Report from Bioversity team

Marie-Angelique:

  • Leo is working on adding columns in the trait template for entity and attribute; these will be useful for linking to the reference ontologies
    • LC: How will the reference to the TO be included? Is there a column for that?
  • One trait dictionary per crop, in Excel format with defined columns, is uploaded to the CO system and then exported as OBO and SKOS
  • Not sure how to deal with disease resistance: what are the entity and attribute? Could link to PDO terms for the disease name
  • Question of sharing the trait dictionary files: they should be stored on GitHub, at least for the four focus crops (cassava, wheat, maize, rice)
  • Currently the trait dictionaries are stored in Dropbox, waiting for curator validation and approval
  • Ideally, the working versions could be stored on GitHub

Discussion of the id space for Crop Ontologies:

  • Having 20-30 separate id spaces (or namespaces) is possible but not convenient, as each one would have to be registered in the OBO library
  • A single common prefix would be preferable, can be something like CO_IBP:XXXXXXX
  • They could still be distributed as separate files and then combined as needed for the Planteome uses.
  • The numbering scheme could have a leading number with the taxon id (from NCBI?), or use the existing numbering scheme (e.g. cassava CO_334:xxxxxxx)
  • Existing ids have been defined by the CO community e.g. CO_334:xxxxxxx
    • MAL: It should not be much of an issue; most people are not using the OBO file
    • Example proposed: CO_IBP:334xxxxxxx or CO_IBP:334xxxx, with a zero-padded number. Four digits would provide a maximum of 9999 terms per crop, which should be plenty?
  • PJ: Introduce a namespace for the taxon? It could be coded into the OBO file; an OBO namespace could be created
  • In the OWL version, an "only in taxon" restriction can be used
  • This would allow all the ontologies to be merged, with the different namespaces
  • How would we deal with species that have more than one crop ontology? This applies only to wheat and banana
  • All the codes are listed on the GCP Pantheon Page: [1]. This is out of date, as the GCP project has ended
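
The CO_IBP:334xxxx proposal above can be sketched in Python as follows. This is only an illustration of the numbering idea, not an agreed scheme: the CO_IBP prefix and the four-digit local number come from the example in these notes, while the function names to_common_id and find_collisions, and any crop code other than 334, are hypothetical.

```python
import re

# Assumption: common prefix taken from the example in the notes;
# it is not a registered OBO ID space.
COMMON_PREFIX = "CO_IBP"
# Assumption: four local digits, i.e. local numbers 0..9999 per crop.
LOCAL_DIGITS = 4

_CO_ID = re.compile(r"^CO_(\d+):(\d+)$")


def to_common_id(old_id: str) -> str:
    """Map an existing ID such as 'CO_334:0000123' to 'CO_IBP:3340123'."""
    match = _CO_ID.match(old_id)
    if match is None:
        raise ValueError(f"not a Crop Ontology ID: {old_id!r}")
    crop_code = match.group(1)
    local = int(match.group(2))
    # Refuse local numbers that would not fit in the fixed-width field.
    if local >= 10 ** LOCAL_DIGITS:
        raise ValueError(f"{old_id!r} does not fit in {LOCAL_DIGITS} digits")
    return f"{COMMON_PREFIX}:{crop_code}{local:0{LOCAL_DIGITS}d}"


def find_collisions(old_ids):
    """Return {new_id: [old_ids]} for every new ID produced more than once."""
    seen = {}
    for old in old_ids:
        seen.setdefault(to_common_id(old), []).append(old)
    return {new: olds for new, olds in seen.items() if len(olds) > 1}
```

Running find_collisions over the IDs of all vocabularies before merging them would flag duplicate IDs up front, which is the situation that broke the website in the past.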


  • PJ: Question: could a version of TermGenie be created for the Crop Ontology vocabularies?
    • For the shorter term, TermGenie will focus on the reference ontologies; it may be difficult to fit in the species-specific ontologies
  • This would only work on the ones which are maintained in the OBO format, not in Excel.

Discussion of the hierarchy of the CO Vocabs

  • The CO vocabularies all have some inherent structure, with some or all of these grouping terms:
    • agronomic (or 'agronomical') traits
    • morphological traits
    • growth and development traits
    • stress (biotic and/or abiotic) traits
    • physiological traits
    • quality traits
    • phenological traits
    • fertility traits
    • biochemical traits
    • hybrid traits (?? rice)
  • Ideally, these grouping terms could be pulled from the reference trait (TO) or anatomy/development (PO) ontology. If we just put them all in the same number space, there will be 20 or so terms with the same (or slight variations of the same) name. In the past, the website was broken when more than one vocabulary had a term with the same ID#.
  • It will not matter what the upper hierarchy is if all the terms have an XP logical definition referring to the PO/GO/PATO terms, because the mapping can then be done automatically. So we do not have to worry about the hierarchy. We would just have to exclude the upper-level grouping terms on import to the joined file, or they would need to be renamed or referenced to the TO. This was suggested at the CO meeting and was rejected.
  • The emphasis in the Planteome is not on building these ontologies, as long as the TDs are enriched and have the XPs. They should perhaps not even be called ontologies.
  • Even if users are not interested in browsing the hierarchy, the underlying hierarchy needs to be there in order to query the individual crop vocabularies, but it can be generated based on the XPs.
  • Effort should be focused on making sure all the TDs have the terms that are needed and an XP to the reference trait ontology.
  • The upper level categories are integrated into the trait template, but perhaps can be imported as a flat file, without the upper level terms. There is no data attached.
  • Method and scale are attached to the individual terms
  • Mapping using a formal definition
  • MAL is testing the AgreementMakerLight (AML) tool (http://agreementmaker.org/) to derive the mappings, but still needs to get the resulting mappings into the OWL files. Starting with the wheat ontology: import as OWL or OBO, then add mappings based on terms, structure or words.
    • The wheat file has manually created Xrefs to TO which can be used to double check the AML mappings
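
The idea discussed above, that the hierarchy can be generated from the XPs rather than maintained by hand, can be sketched as a lookup on (entity, attribute) pairs. This is a minimal illustration under stated assumptions: the identifiers in REFERENCE_TRAITS and CROP_TERMS are placeholders rather than real PO/PATO/TO/CO IDs, the dictionary keys "entity" and "attribute" are assumed column names, and group_by_reference is a hypothetical helper, not part of any Planteome tool.

```python
from collections import defaultdict

# Placeholder reference traits keyed by their (entity, attribute) XP.
# None of these are real ontology IDs.
REFERENCE_TRAITS = {
    ("PO:leaf", "PATO:length"): "TO:leaf_length",
    ("PO:seed", "PATO:mass"): "TO:seed_mass",
}

# Placeholder crop-specific terms, each carrying its own XP.
CROP_TERMS = [
    {"id": "CROP_A:0001", "entity": "PO:leaf", "attribute": "PATO:length"},
    {"id": "CROP_B:0007", "entity": "PO:leaf", "attribute": "PATO:length"},
    {"id": "CROP_A:0002", "entity": "PO:seed", "attribute": "PATO:mass"},
    {"id": "CROP_A:0003", "entity": "PO:root", "attribute": "PATO:color"},
]


def group_by_reference(crop_terms, reference_traits):
    """Place each crop term under the reference trait with the same XP.

    Returns (grouped, unmapped): grouped maps a reference trait ID to the
    crop term IDs beneath it; unmapped lists crop terms whose XP has no
    matching reference trait and therefore needs curator attention.
    """
    grouped = defaultdict(list)
    unmapped = []
    for term in crop_terms:
        key = (term.get("entity"), term.get("attribute"))
        parent = reference_traits.get(key)
        if parent is None:
            unmapped.append(term["id"])
        else:
            grouped[parent].append(term["id"])
    return dict(grouped), unmapped


grouped, unmapped = group_by_reference(CROP_TERMS, REFERENCE_TRAITS)
```

Under this approach the crop-specific grouping terms (agronomic, morphological and so on) are not needed to place a term; the unmapped list shows where a TD still lacks a usable XP or where the reference ontology lacks a term.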

To-Do:

  • Should create a page on the wiki with a list of the existing ontologies and the ones that are proposed. This should list all the CO vocabs and which ones are maintained in OBO and which ones are derived. (Ones in blue are OBO).
    • LC: How are the OBO format ones and the corresponding trait dictionaries kept in sync? Is someone working on the OBO files to keep them up to date?
  • CM: Need a document describing the workflow or pipeline; it would help us to better understand the process.
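
On LC's question about keeping the OBO files and the trait dictionaries in sync, one simple check would be to compare the term IDs on both sides. The sketch below is only an assumption about how such a check might look; obo_term_ids and sync_report are hypothetical names, and the check looks at IDs only, not at names or definitions.

```python
def obo_term_ids(obo_text):
    """Collect the id: values of all [Term] stanzas in OBO-format text."""
    ids = set()
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
        elif in_term and line.startswith("id:"):
            ids.add(line[len("id:"):].strip())
    return ids


def sync_report(dictionary_ids, obo_text):
    """Compare trait dictionary IDs with the IDs found in an OBO file."""
    dict_ids = set(dictionary_ids)
    obo_ids = obo_term_ids(obo_text)
    return {
        "missing_from_obo": sorted(dict_ids - obo_ids),
        "missing_from_dictionary": sorted(obo_ids - dict_ids),
    }
```

Both lists being empty would mean the two files contain the same set of term IDs.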

Report from OSU

  • Laurel, Justin E and Pankaj
  • Progress on AmiGO2 browser development: http://dev.planteome.org/amigo
    • Report from All Hands meeting: Working on creating a Planteome importer OWL file to simplify the loading of the ontologies via the GOlr Makefile. Working on a catalog XML file to redirect URIs in the importer OWL. (JE)
    • The issue where loading both the Plant Environment (EO) and the Plant TO caused the EO to disappear from AmiGO appears to be fixed
  • From Justin Elser: Let me know if any other ontologies are missing. I am going to start working on how to get a smaller slice of the NCBI one loaded up.
    • Is including the NCBI one a priority? How is the slice being created? It looks like only 68 terms so far
  • Missing from current version:
    • GO: BP, MF and CC?
    • ChEBI
    • Need to remove Uberon
    • Can we just have a slice of the Cell Ontology?

Updates on the transition to Github: https://github.com/Planteome

  • Model is a separate repository for each ontology under the organization of the Planteome

Plant Trait Ontology

  • Plant Trait Ontology repo has been created, along with the SVN history - testing is underway
  • File was renamed "plant-trait-ontology.obo"
  • TO development browser (http://palea.cgrb.oregonstate.edu/amigo/TO_dev) is now set up to read from the OBO file on GitHub, rather than the one in the SVN
    • This updates nightly and is a useful tool for the curators
    • Now also includes the PDO
  • To-Do:
    • Need to move the other associated files from the SVN: cross product files, trait lists, OWL file, etc.
    • Will need to change where collaborating sites are pulling the TO from: BioPortal, EBI OLS, OBO Foundry, Gramene (?), others?

Action Items:

  • Move the rest of the ontology files from the SVN folder (JE): which ones are the priority?
  • Remove the other test repo "planteome-ref-ontologies" (JE). LC transferred the only comment (from CM) to the plant-trait-ontology repo.

Other Repositories on GitHub:

  • Need to rename the CGIAR/IBP prefix for their files. EA will clarify with her colleague at the IBP. - Any news? None yet

From the last meeting:

  • Discussion of including the trait dictionaries in the GitHub repos:
  • The files are managed as Excel files with the users, and these can be converted to a simple OBO file using MAL's script (for loading on the CO site)
  • Could be uploaded as the OBO or a TSV file, which is closer to the source and works well with GitHub
  • Could load the OBO, TSV and SKOS files together in the repo and use the wiki and issue trackers
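
To make the TSV-to-OBO step concrete, here is a minimal sketch of such a conversion. It is not MAL's script, whose details are not recorded in these notes. The column names id, name and definition, the namespace value and the function names are assumptions chosen for illustration; the real trait template has its own defined columns.

```python
import csv
import io


def escape_obo(text):
    """Escape backslashes and double quotes for an OBO quoted string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def tsv_to_obo(tsv_text, namespace):
    """Convert tab-separated trait rows into simple OBO [Term] stanzas.

    Assumed columns: id, name, definition (definition may be empty).
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    stanzas = []
    for row in reader:
        lines = [
            "[Term]",
            f"id: {row['id']}",
            f"name: {row['name']}",
            f"namespace: {namespace}",
        ]
        definition = (row.get("definition") or "").strip()
        if definition:
            # Empty brackets: no definition cross-references are supplied.
            lines.append(f'def: "{escape_obo(definition)}" []')
        stanzas.append("\n".join(lines))
    return "format-version: 1.2\n\n" + "\n\n".join(stanzas) + "\n"
```

Keeping the TSV as the source file in the repository and generating the OBO from it would keep diffs on GitHub readable, since each trait occupies one line.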

Updates on new hires

  • Post-doc position at OSU- Six candidates were interviewed and an offer has been made; the new person should join in late June or early July.
  • Post-doc position at Aberystwyth University, UK- GG is advertising, closing date is mid-June
  • Post-doc position at NYBG- No news on this yet

Updates on work on Plant Disease Ontology and future plans for Plant Stress Ontology (Laurel)

Tabled for next meeting

Discussion of format for meetings

  • Rotate hosting and presenting?

Schedule for future meetings:

  • Next Ontology meetings- Every other Tuesday 8:15 AM (GMT-7:00) Pacific Time
    • June 9th, 2015
    • June 23rd, 2015
    • July 7th- Laurel is on vacation, but someone else can host?
    • July 21st