Latest revision as of 19:34, 7 March 2016

Planteome Ontology WG Zoom Meeting

Date: Tuesday Mar. 1st, 2016
Time: 8:15am PST (GMT-8)
Connection details: Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/762470789

Links to recordings:
- File:Ontology WG Meeting 3-1-2016 audio.m4a
- File:Ontology WG Meeting 3-1-2016 video.mp4

Attendees: LC, AM, PJ, CM, JE

Changes and updates in the IBP files

Review of progress with building the Planteome version of AmiGO with the CO classes merged in

Three files are currently integrated: rice, cassava, lentil
wheat: newly mapped; will be added soon, once the ID-number issues are resolved
- Latest changes: added the crop common name to all terms, removed the categorical classes (e.g., Agronomical Traits), converted all CO class names to lower case
The mappings to TO (is_a) are on the trait class, but some remain on the cassava variables, as seen in the browser. These were originally added because the variable was not connected, but they can now be removed.
In the cassava file, the variable has an is_a relationship to the trait, because the variable_of, method_of and scale_of relations are not being displayed. This is a workaround to get the variable to show as a child (should revert to the actual relationships once this is sorted out in AmiGO).

Note: funny synonyms in AmiGO cause weird behavior with autocomplete (not sure what this means..)

Questions and discussions:
- Have we decided on how and where the CO<->TO mappings will be stored? At the Corvallis meeting we hacked something into a pseudo-TD5 file.
  - Currently the mappings are maintained in a Google spreadsheet, which is then converted to an OBO file (see below).

From CM: Marie, looks like you are generating some files such as: lentil_withcropname.obo --- which have the bridging axioms merged in. This looks great. Not sure if this is experiments or ready to go, but this is exactly what we need. We should just decide on a standard naming convention, update the READMEs, and maybe set up travis for each CO repo.

From MAL:

About the creation of the lentil file: I have a script in Java that takes a TD v5 (in Excel format) and creates an OBO file. I use Apache POI and obo2owl for that. Then I have a second script that adds the mappings to the OBO file. I get the mappings from a spreadsheet document on Google Drive. Eventually the mappings will be stored directly in the TD v5.
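The two steps MAL describes (TD rows to OBO stanzas, then adding the mappings) could be sketched roughly as below. This is a minimal Python illustration, not MAL's Java code: the column names (`id`, `name`, `crop`), the placeholder IDs, and the helper `td_rows_to_obo` are all hypothetical, and a real TD v5 template has many more columns.

```python
def td_rows_to_obo(rows, mappings):
    """Turn trait-dictionary rows into OBO [Term] stanzas.

    rows     -- list of dicts with hypothetical keys 'id', 'name', 'crop'
    mappings -- dict from CO trait ID to TO ID (the bridging is_a links)
    """
    stanzas = []
    for row in rows:
        lines = [
            "[Term]",
            f"id: {row['id']}",
            # Lower-case the class name and append the crop common name.
            f"name: {row['name'].lower()} ({row['crop']})",
        ]
        to_id = mappings.get(row["id"])
        if to_id:
            # Only mapped traits get an is_a link to TO.
            lines.append(f"is_a: {to_id}")
        stanzas.append("\n".join(lines))
    return "\n\n".join(stanzas) + "\n"


# Placeholder IDs, for illustration only.
rows = [{"id": "CO_000:0000001", "name": "Plant Height", "crop": "lentil"}]
mappings = {"CO_000:0000001": "TO:0000000"}
obo_text = td_rows_to_obo(rows, mappings)
print(obo_text)
```

Keeping the mappings in a separate dictionary mirrors the current setup, where they live in a spreadsheet apart from the TD.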

PJ's Vision:

Vision: TD --> CSV --> Chris' tool --> merged version --> read-only crop specific
Want crop-specific groups to be able to pull down a slice of the TO that includes all the crop-specific terms and the associated TO hierarchy, and none of the traits that are not specific to their crop of choice. His goal is to bypass the CO OBO and use just the trait dictionary (TD), avoiding CO.obo and reducing overhead for MAL.
MAL doesn't edit the CO.obo, so the overhead of the current approach is already reduced.
- No explanation of what "Chris' tool" is, or how this would actually work
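One way to read the "slice" idea is as an ancestor closure: start from the TO terms a crop is mapped to and keep everything reachable upward through is_a. The sketch below assumes that reading; the function name `crop_slice`, the toy IDs, and the dictionary representation of the hierarchy are hypothetical, and it says nothing about how "Chris' tool" works.

```python
def crop_slice(crop_to_ids, is_a):
    """Return the mapped TO terms plus all of their is_a ancestors.

    crop_to_ids -- iterable of TO IDs that the crop's traits map to
    is_a        -- dict from a term ID to a list of its parent IDs
    """
    keep = set()
    stack = list(crop_to_ids)
    while stack:
        term = stack.pop()
        if term in keep:
            continue  # already visited via another path
        keep.add(term)
        stack.extend(is_a.get(term, []))
    return keep


# Toy hierarchy: T3 -> T2 -> T1, and an unrelated branch T4 -> T1.
toy_is_a = {"T3": ["T2"], "T2": ["T1"], "T4": ["T1"]}
print(sorted(crop_slice(["T3"], toy_is_a)))
```

Terms outside the closure (T4 in the toy data) are left out, which matches the goal of excluding traits not relevant to the chosen crop.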

Chris argued for having PURLs for each of the CO OBO files
PURL "prefixes" (e.g., /ibp-cassava-traits/) are replaced with the true link to raw.githubusercontent.com/planteome etc.
We need to be clear on our naming for the PURLs.
PURLs are stored here: https://github.com/OBOFoundry/purl.obolibrary.org/blob/master/config/to.yml
From the chat: MAL should use ontology:to/ibp-FOO-traits/FOO.owl
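The prefix replacement mentioned above can be pictured as a simple rewrite rule. The sketch below is only an illustration of that idea: `resolve_purl` is a hypothetical helper, the target URL is a placeholder on example.org, and the notes do not describe how purl.obolibrary.org applies its configuration in practice.

```python
def resolve_purl(path, entries):
    """Rewrite a PURL path using the first matching prefix entry.

    entries -- list of (prefix, replacement) pairs, checked in order
    """
    for prefix, replacement in entries:
        if path.startswith(prefix):
            # Swap the prefix for the real location, keep the rest.
            return replacement + path[len(prefix):]
    return None  # no rule applies


# Placeholder target; the real one would point at the GitHub raw files.
entries = [("/ibp-cassava-traits/", "https://example.org/ibp-cassava-traits/")]
print(resolve_purl("/ibp-cassava-traits/cassava.owl", entries))
```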

Update on AmiGO APIs- Justin Elser

API code changes have been merged into our version of AmiGO (dev.planteome.org)
These APIs will live on AmiGO; bisque, AISO, and anyone else who wants to can use them for pulling definitions and for autocomplete
Does not yet work on browser.planteome.org: a Solr schema change means all data must be reloaded first

Still fixing some bugs:

Small issue with the main readme page not showing up, but the API itself is working
- i.e., http://dev.planteome.org/api doesn't work
- http://dev.planteome.org/api/entity/term/GO:0022008 works and returns JSON
Still testing speed and fixing the readme page
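A client such as an annotation tool might call the term endpoint along these lines. This is a sketch using only the Python standard library; the URL pattern is taken from the working example above, while the helper names are made up and the shape of the returned JSON is not described in these notes, so the result is returned unparsed beyond JSON decoding.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "http://dev.planteome.org/api"


def term_url(term_id, base=API_BASE):
    """Build the term-lookup URL, keeping the colon in the ID as-is."""
    return f"{base}/entity/term/{quote(term_id, safe=':')}"


def fetch_term(term_id, base=API_BASE, timeout=10):
    """Fetch a term record and decode the JSON body (needs network access)."""
    with urlopen(term_url(term_id, base), timeout=timeout) as resp:
        return json.load(resp)


print(term_url("GO:0022008"))
```

`fetch_term` is deliberately not called here, since the dev server may be unavailable or slow while the remaining bugs are being fixed.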

Other AmiGO Issues:

PURLs needed again (catalogs, in XML format, can fix this)
do a redirect within the catalogs
put a catalog on GitHub that maps all logical definitions onto the dev versions; the catalog can also ensure that Protege pulls the dev versions of all the imported ontologies
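For reference, an XML catalog of the kind discussed is a list of `uri` entries that redirect an ontology IRI to another location. The sketch below generates one with Python's standard library; the redirect target is a placeholder on example.org and `build_catalog` is a hypothetical helper, so treat it as an illustration of the file's shape rather than the catalog the group will publish.

```python
import xml.etree.ElementTree as ET

# Namespace used by OASIS XML catalogs.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(redirects):
    """Return catalog XML mapping each ontology IRI to a dev location.

    redirects -- dict from the published IRI to the URL to load instead
    """
    root = ET.Element("catalog", {"xmlns": CATALOG_NS, "prefer": "public"})
    for name, target in redirects.items():
        ET.SubElement(root, "uri", {"name": name, "uri": target})
    return ET.tostring(root, encoding="unicode")


catalog_xml = build_catalog(
    # Placeholder target for the dev version of PO.
    {"http://purl.obolibrary.org/obo/po.owl": "https://example.org/dev/po.owl"}
)
print(catalog_xml)
```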

More frequent releases will ensure that TO doesn't reference updated PO terms before the PO is released; the idea is to eventually release PO/TO/etc. simultaneously.
Ideally releases would be monthly

iPlant mirror working at: http://draco.cyverse.org/amigo

- some issues, such as slow response; raise this with the iPlant folks.

Visit to IRRI March 7th-11th

Leo, Marie-Angelique and Austin will visit IRRI Monday March 7th to Friday March 11th
Goals: TD revision and data annotation
- MAL and Leo will work on the trait dictionary
- AM: annotate 140K+ germplasm entries with PD and TO mapping data

Discussion from email on the need for an annotation tool:
- EA: "We discussed the visit of Marie, Léo and Austin and the agenda we prepared. I indicated Mau that the team would like to see how IRRI data annotations could go on Planteome. Mau asked where Planteome stands with the annotations tools because this what IRRI is waiting for long. Would you have any points you want the team to share with IRRI on that aspect?"
MAL/LV: "We need a simple tool that scientists will be able to use to easily annotate their data. Adhoc scripts are short term and one-shot solutions that will serve Planteome but not necessarily serve IRRI because: The GAF2 format that amiGO2 requires is not known by the breeding community and the annotated data will be published on Planteome and not in the IRRI information systems. We should take the opportunity of being at IRRI to provide some guidance on how to include the ontologies in their data model. Annotating the data at the source is beneficial for everyone. IRRI information systems will have interoperable and discoverable data. And Planteome can easily add future annotations from the IRRI systems."

Discussion:

From PJ: MAL/LV - forms submitted for international travel; if approved, it will be covered. American carriers are required, so PJ will look into other routes.

Notes:

Goal is to have the data annotated, and to have the trait dictionary in its final version
- add breeding traits to the TD; currently only genebank traits are in there.
Data annotation: identify datasets for Planteome
genebank data, breeding database, "breeding for rice" (similar to data in BMS)
- http://www.irgcis.irri.org:81/grc/AboutIRGCIS.htm

3K rice genomes are in another database; PJ: no trait data, only sequencing data,
unless the seed packets of the sequenced varieties have IRGC phenotypes in the collection. Add EVERYTHING to Planteome:
- environment, phenotypes, observed scales/variables, original location, location assay, treatment metadata
ANYTHING they have to include, we will include; the very first pass will be similar to cassava and lentil
Anything that doesn't fit in the first pass: get the data, and we can evaluate how to make the additional data fit the GAF2 form (they might not be comfortable with giving ALL the data, but get what we can; access to images would be great)

IRGC packets http://www.irgcis.irri.org:81/grc/SearchData.htm
IR64 mutant collection: another database, plus any other high-throughput phenotyping data
64K mutant lines with phenotype data
C3-C4 mutant collection: any data on this would be awesome to get as well.

Genomic & Opensource Breeding Informatics Initiative (GOBII)

What is GOBII (http://cbsugobii05.tc.cornell.edu/wordpress/)? Learn as much about it as possible
- creates field books and templates in a standard way for phenotype collection; more about providing tools than collecting data. Push for them to use the TO/CO ontologies in these GOBII tools

Tentative Schedule

Monday Day 1: morning - Austin present Planteome
Tuesday Day 2: show how lentil and cassava data have been annotated
GAF2 format, column 16: not unstructured; it includes data relationships
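To make the column 16 point concrete: a GAF 2.0 line has 17 tab-separated columns, and column 16 (the annotation extension) holds structured `relation(ID)` expressions rather than free text. The sketch below assembles such a line; the field values are placeholders, the snake_case column names are chosen for this example, and the line it prints is not a complete, valid annotation.

```python
# The 17 GAF 2.0 columns, in order (names chosen for this sketch).
GAF2_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "term_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def gaf2_line(**fields):
    """Join the given fields into one tab-separated GAF 2.0 line."""
    unknown = set(fields) - set(GAF2_COLUMNS)
    if unknown:
        raise ValueError(f"unknown GAF2 fields: {sorted(unknown)}")
    # Columns that were not supplied are left empty.
    return "\t".join(fields.get(col, "") for col in GAF2_COLUMNS)


gaf_example = gaf2_line(
    db="EXAMPLE_DB",                 # placeholder database name
    db_object_id="ACC-0001",         # placeholder germplasm accession
    term_id="TO:0000000",            # placeholder trait term
    annotation_extension="part_of(PO:0000000)",  # column 16, placeholder
)
print(gaf_example.split("\t")[15])
```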

Visit to genebank, fields, see the phenotyping platform
They are doing some automated phenotyping with UAVs (plant physiology, soil physiology)

Bottom line: Look at all the data they are collecting, and see if we can get those things integrated in our workflow.
We want to attempt to get a meeting with "homebase".

Philippines is 16 hours ahead of Corvallis: could meet Wednesday (0800) Philippines = Tuesday (1600) Corvallis
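The Wednesday 0800 / Tuesday 1600 pairing corresponds to a 16-hour difference between the Philippines (UTC+8) and Corvallis on PST (GMT-8, as given for this meeting). A quick check with fixed offsets, assuming the call falls on Wednesday 9 March 2016, before US daylight saving time begins:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets; PST matches the GMT-8 stated for this meeting.
PHT = timezone(timedelta(hours=8), "PHT")
PST = timezone(timedelta(hours=-8), "PST")

# Assumed date: Wednesday of the IRRI visit week.
meeting_irri = datetime(2016, 3, 9, 8, 0, tzinfo=PHT)
meeting_corvallis = meeting_irri.astimezone(PST)

offset_gap = meeting_irri.utcoffset() - meeting_corvallis.utcoffset()
print(meeting_corvallis.isoformat(), offset_gap)
```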

Austin: work with their staff to get the phenotyping data format.
Find the exact IDs of their varieties: need the EXACT seed packet ID and publicly available IDs to link out to from Planteome; basically, need to find out how they store data.

Phenotype RCN meeting February 26-28, 2016

PJ attended

Status of second year funding

PJ: first-year report accepted; will follow up with NSF to find out the funding status
PJ update:
- Annual report approved
- Once money for the second year is released: amendment to the subcontracts (Elizabeth, Chris, etc.)

Following items tabled for next meeting:

Recent updates on the TO

revisions to equivalence axioms- occurs_in and composition
Stem and culm
anther morphology traits, incl. anther number
biochemical branch

OBA development- Chris Mungall

If there is time I would like to walk through the procedure we use to develop OBA.

Part of this was covered in the tutorial on template-based ontology development, but we had to rush this part due to lack of time:

https://github.com/Planteome/protege-tutorial/tree/master/template-examples

For OBA, the source of the ontology is primarily in TSVs, found here:

https://github.com/obophenotype/bio-attribute-ontology/tree/master/src/ontology/modules

The design patterns are specified here:

https://github.com/obophenotype/bio-attribute-ontology/tree/master/src/ontology/patterns

Together these are used to build the ontology with equivalence axioms, with the entire ontology hierarchy being inferred automatically, e.g:

http://www.ebi.ac.uk/ols/beta/ontologies/oba/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FOBA_VT0000017
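The template-based approach can be pictured as filling a design pattern once per TSV row to produce an equivalence axiom. The sketch below only illustrates that idea: the pattern text, the column names (`defined_class`, `attribute`, `entity`), and the placeholder IDs are assumptions for this example and are not copied from the OBA repository, and the actual build uses dedicated ontology tooling rather than string formatting.

```python
import csv
import io

# Hypothetical pattern: an attribute that inheres in some entity.
PATTERN = "{attribute} and ('inheres in' some {entity})"


def expand_pattern(tsv_text, pattern=PATTERN):
    """Map each defined class in the TSV to its filled-in class expression."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["defined_class"]: pattern.format(**row) for row in reader}


# One-row TSV module with placeholder IDs.
tsv_module = "defined_class\tattribute\tentity\nEX:0000001\tEX:attr\tEX:entity\n"
axioms = expand_pattern(tsv_module)
for cls, expr in axioms.items():
    print(f"{cls} EquivalentTo: {expr}")
```

Because every class gets a logical definition this way, a reasoner can then place the classes in the hierarchy automatically, which is the point made above.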

Upcoming Meetings and Workshops

Biocuration 2016, April 10th-14th, 2016; Geneva, Switzerland

MAL is going
- key dates
- Abstract submitted: https://easychair.org/conferences/submission.cgi?a=10690939;submission=2641703

GARNet/Egenis Workshop: Integrating Large Data into Plant Science, April 21st-22nd 2016, Dartington Hall, Totnes, Devon

Elizabeth and George are going, EA will present Planteome as part of her talk

Meeting in Montpellier, 9-13 May 2016

Link to tentative agenda/website: https://sites.google.com/a/cgxchange.org/cropontologycommunity/home

BioOntologies SIG of the Intelligent Systems for Molecular Biology (ISMB); July 8-12, 2016, Orlando, Florida

Dates: July 8th and 9th, with July 9th being the “Phenotype Day”, focused on the systematic description of phenotypes.
- Short papers, up to 4 pages (will be published in JBMS)
- Poster abstracts, up to 1 page
- Flash updates, up to 1 page

7th International Conference on Biological Ontology and BioCreative 2016 Aug 1st to 4th, Corvallis, OR

Link to program: ICBO + BioCreative Program
Link to Easy Chair site: https://easychair.org/conferences/conference_info.cgi?a=10776589


Mar 1st, 2016 Ontology Working Group Meeting: Difference between revisions