Thursday April 16th, 2015: Difference between revisions

Latest revision as of 18:05, 18 May 2015

Who: LC, PJ, JP, JE, CM, JD, XX
Not available: EA, EZ, BS, GG, SN

Zoom Connection Details: Join from PC, Mac, iOS or Android: https://zoom.us/j/996259332

Link to recording- Video: File:All-Hands Meeting 4-16-15.mp4
Link to recording- Audio: File:All-Hands Meeting 4-16-15.m4a

A. General Comments and Updates:

Introductions:

Leo Valette, new Crop Ontology curator and coordinator at Bioversity, with Elizabeth Arnaud (for more info please see: Ontology_Working_Group_March_6th_2015)

- Curating the trait dictionaries, used by plant breeders

- New hire at CO: Marie Angelique Laporte has accepted the position as an Ontology Engineer and will be joining the first week of May

John Doonan, Director National Plant Phenomics Centre, at University of Aberystwyth, UK

- High throughput phenotyping - gene discovery and plant breeding. Integration of phenomic and genomic data

- Semi-automated phenotyping system, associated with Plant Breeding department

- Background in developmental genetics, but works closely with George Gkoutos, expert in ontologies and bioinformatics

- George's news: postdoc position has been advertised, and BBSRC has agreed to extend it as part of the collaboration with CGIAR on tropical grasses (Need to check?)

- Hands on facility for large scale phenotyping

- Goal is to integrate ontologies into the phenotyping workflow.

Chris Mungall, LBL - background in bioinformatics, involved with Gene Ontology, Phenotype ontology, PATO with George Gkoutos, human and mammalian systems, created Uberon anatomy ontology

- Assist with deployment of AmiGO2 browser.

Xu Xu - In Sinisa's group, works on Image annotation - From Xu Xu : "hello all, I am Xu Xu. Sorry my mic is not working. Just an introduction here: I am Sinisa's student working in image annotation group, with Justin. I will be working on integrating AISO to BisQue. Happy to work with all of you!"

Update on hires at OSU: PJ: Software Developer position has not had too many applicants, PJ met a potential candidate who is a spouse of a BPP department member, Biology Masters, recently from Yahoo.

Ontology curator position - will start evaluating applicants and hopefully can make an offer to someone by mid-May.

No news on the position at NYBG.

B. Update from IT group- Data Store, AmiGO2

Justin Elser, Chris Mungall, with Seth Carbon

AmiGO2 install

A dev version of AmiGO2 has been installed and loaded with a set of OWL files, including some that may not be needed: http://dev.planteome.org
Still in progress; need to do a fair bit of configuration and organization of the ontology project files.
Loading the ontologies takes about 25 minutes, but that does not include the NCBI taxon ontology, which takes too much memory and doesn't finish on palea.
Looking at loading a small subset (plants?) of the taxonomy

Ontologies loaded:

ChEBI, GO, PATO, Cell, PO, TO, ECO, GOREL, also Uberon, Uberon/phenoscape-anatomy, ncbi_taxonomy (114)
Need to include the Plant Experimental Conditions ontology (EO), and take out the CL and Uberon files
Several of the features, such as QuickGO, are not working. QuickGO is apparently hard-wired for GO, so we will not be able to adopt it.
'Mappings' aka dbxrefs to SourceForge and other ontologies like PATO are not working either.

Filters are not being saved. Compared with the GO version: it does not work there either, so it was apparently designed this way.

CM: In the future, we will have one uber file- "planteome.owl" that will contain all the planteome core and partner (PATO, ChEBI, GO etc) ontologies.

Also could use only a slice of the GO, as there are a lot of non-plant relevant terms.

PJ: Could use the GR_tax-ontology.obo file that has many of the plant species, and should have all the pathogens- Note: GR_tax-ontology.obo has only green plants and red algae.

GR_tax-ontology.obo should have refs to NCBI as well, would have to be maintained here, has not been worked on for over two years

CM: We could extract a slice of the NCBI taxonomy, if we can define our criteria e.g. everything that we have an annotation to, should have all the plants and the pests and pathogens.
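
As a rough illustration of CM's suggestion, the sketch below keeps every taxon that has an annotation plus all of its ancestors, and drops everything else. It is only a sketch under stated assumptions: the `parent_of` map is a toy hierarchy invented for this example (not the real NCBI lineage), and `taxonomy_slice` is a hypothetical helper, not part of AmiGO2 or any existing loader.

```python
def taxonomy_slice(parent_of, annotated_taxa):
    """Return the annotated taxa plus all of their ancestors up to the root."""
    keep = set()
    for taxon in annotated_taxa:
        current = taxon
        # Walk up the tree until we pass the root or hit a node already kept.
        while current is not None and current not in keep:
            keep.add(current)
            current = parent_of.get(current)
    return keep


# Toy child -> parent map (assumption: simplified, NOT the real NCBI lineage).
parent_of = {
    "root": None,
    "Viridiplantae": "root",
    "Poaceae": "Viridiplantae",
    "Oryza sativa": "Poaceae",
    "Zea mays": "Poaceae",
    "Fungi": "root",
    "Magnaporthe oryzae": "Fungi",
    "Metazoa": "root",
    "Homo sapiens": "Metazoa",
}

# Taxa that appear in at least one annotation (hypothetical: a plant and a pathogen).
annotated = {"Oryza sativa", "Magnaporthe oryzae"}

kept = taxonomy_slice(parent_of, annotated)
print(sorted(kept))
```

With this criterion, unannotated branches such as "Zea mays" or "Homo sapiens" in the toy data are left out, while the pathogen and its ancestors are retained.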

Test set of annotations loaded

Loaded around 200,000 annotations from various ontologies, out of a total of ~ 4 million
Load times seem much better than the old AmiGO, at least for these annotations
Problems were found in some of the files, specifically in column 16 (annotation extensions) and in column 5 (ontology ID), where spaces are causing problems
JE corrected the issues he found in the files, but these revised files need to be committed back to the SVN- Done
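
A check like the following could flag the kind of whitespace problem described above before a file is loaded. This is a minimal sketch, assuming the .assoc files are tab-delimited in the GAF style with "!" comment lines; the function name `find_whitespace_problems` and the sample rows are invented for illustration and are not taken from the actual files.

```python
def find_whitespace_problems(lines, columns=(5, 16)):
    """Return (line_number, column_number, value) for checked fields containing whitespace."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        if line.startswith("!") or not line.strip():
            continue  # skip comment/header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        for col in columns:
            if col <= len(fields):
                value = fields[col - 1]  # columns are 1-based in the notes
                if value != value.strip() or " " in value:
                    problems.append((line_no, col, value))
    return problems


# Hypothetical sample rows (17 tab-separated fields each); values are made up.
good_row = ["GR_gene", "GENE0001", "gene1", "", "PO:0009005", "GR_REF:1", "IMP",
            "", "A", "", "", "gene", "taxon:4530", "20150401", "GR", "", ""]
bad_row = list(good_row)
bad_row[4] = "PO: 0009005"  # stray space inside the ontology ID (column 5)

lines = ["!gaf-version: 2.0", "\t".join(good_row), "\t".join(bad_row)]
issues = find_whitespace_problems(lines)
print(issues)
```

Running such a check over every file in the table below would give a quick yes/no for the "Issues?" column, at least for this one class of problem.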

Ontology	Species	filename	Issues?
PO-Anatomy	rice	po_anatomy_gene_oryza_gramene.assoc	no
PO-Anatomy	rice	po_anatomy_gene_oryza_poc.assoc	yes
PO-Anatomy	rice	po_anatomy_qtl_oryza_gramene.assoc	no
PO-Anatomy	maize	po_anatomy_gene_zea_MaizeGDB.assoc	no
PO-Anatomy	grape	po_anatomy_gene_vitis_poc.assoc	yes
PO-Anatomy	maize	po_anatomy_stock_zea_MaizeGDB.assoc	no
PO_Growth_Stage	rice	po_growth_gene_oryza_gramene.assoc	no
PO_Growth_Stage	rice	po_growth_qtl_oryza_gramene.assoc	no
PO_Growth_Stage	grape	po_growth_gene_vitis_poc.assoc	yes
Plant_Ontology	rice	po_ontology_IMP_gene_oryza_poc.assoc	yes
Trait_Ontology	Arabidopsis	to_diversity_arabidopsis.assoc	yes
Trait_Ontology	?	to_Protein_association.assoc	no
Trait_Ontology	?	to_QTL_association.assoc	no
Trait_Ontology	Rice	to_diversity_rice.assoc	no
Trait_Ontology	?	to_Gene_association.assoc	no
Plant_EO	?	eo_protein.assoc	no
Plant_EO	rice	eo_diversity_rice.assoc	no
Plant_EO	?	eo_qtl.assoc	no
Plant_EO	Arabidopsis	eo_diversity_arabidopsis.assoc	no
Plant_EO	?	eo_gene.assoc	no
Gene_Ontology	rice	go_gramene_oryza.assoc	no

Need to standardize the names of the annotation files

Browsing in AmiGO2

Type term into search box on front page, auto-fill feature will be useful
Column sorting can be customized
NCBI ids are not displayed correctly as the NCBI tree is not loaded
Need to check on "Source"- where is it pulling that from? It looks like it is using the namespaces from the terms.

"Panther Gene Families"- We can customize this and can use our own Inparanoid gene families, as Panther does not have too many plants represented
Inparanoid gene families- algorithm creates superclusters, based on ~70- 100 genomes, species by species comparison
Can input any standard format such as New Hampshire, phyloXML, etc.; could call it "Plant Gene Families" or similar
Can try with a subset- e.g. Arabidopsis, rice and maize and associate with the GO annotations

PJ: We can include the other outgroups but not serve the animal genes; if we did serve them, we would also have to serve all the GO human/mouse etc. annotations

Start with displaying only the plant data, but include all the species in the analysis. Can decide later if we want to include it as well
JE will do a baseline GO annotation for all the plant genomes, based on Interpro

PJ- Goal is a "Pan-Gene" database - centralized resource for nomenclature and management

Need an annotation platform- one idea is a "wiki" type platform
May look at the GO tool "Noctua"; CM can demo it on a future call. See link to [Noctua on GitHub]

C. Update from AISO/BisQue group- Justin Preece:

Please fill in notes here....

Personnel:

Yao Zhou has left Sinisa's lab
Xu Xu is currently a full time GRA on the project.

D. Update from Ontologies Working Group:

The OWG has been meeting roughly biweekly, around everyone's travel schedules- see the page for more details on the recent meetings.

Discussion about storing Ontologies and Associations at GitHub:

Working on plans for moving the ontologies to GitHub, still not completely decided how to do this. See meeting notes GitHub Mtg 4-3

Re: associations: PJ met a person (another Justin!) from Github on the plane, they are interested in hosting large datasets and offering them through APIs etc

Question about whether or not we need to maintain the history? PJ: Ideally we would keep the history with the ontology files, not so important about the association files.
LC: Theoretically this is possible, but it is complicated. On the 4-3 Github call, JE said that he would reorganize the files on the SVN and try importing the history.
PJ: Suggest someone should send an email to Github and ask them how best to transfer the SVN history....Who is doing this??
One option would be to maintain the SVN and run a cron job to sync it to GitHub or vice versa, but this does not solve anything and adds a lot of extra maintenance work.
PJ can ask his contact "GitHub-Justin" if we have questions.
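
For reference, one common way to carry SVN history into Git is the `git svn clone` command. The sketch below only builds and prints such a command; it does not run it, and it is not confirmed as the approach the group will take. The repository URL, target directory, and authors file are placeholders, and `git_svn_clone_command` is a hypothetical helper.

```python
import shlex


def git_svn_clone_command(svn_url, target_dir, authors_file=None, std_layout=True):
    """Build (but do not run) a `git svn clone` command that imports SVN history."""
    cmd = ["git", "svn", "clone"]
    if std_layout:
        # Assumes the SVN repo uses the conventional trunk/branches/tags layout.
        cmd.append("--stdlayout")
    if authors_file:
        # Maps SVN usernames to Git author names and emails.
        cmd.append(f"--authors-file={authors_file}")
    cmd += [svn_url, target_dir]
    return cmd


# Placeholder values (assumptions, not the project's real repository).
cmd = git_svn_clone_command(
    "https://svn.example.org/ontologies",
    "ontologies-git",
    authors_file="authors.txt",
)
print(shlex.join(cmd))
```

If the files on the SVN are reorganized first, as JE proposed, the layout option would need to match whatever structure results.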

Note from after the call: see this link for [Announcing Git Large File Storage (LFS)]

Summary of Goals:

PJ: The goal is to have a version control system on Github for the ontology files (and possibly the annotations). Currently we are using the SVN on our local servers, publicly-available for read access, approved developers can get write access.

Each change generates a version # for tracking purposes.
To release the ontologies and data on the database and browser, the team JE/CM would take a snapshot or branch it

Updates on various collaborative projects:

Panzea dataset annotation in collaboration with MaizeGDB

large GWAS dataset 385,000 lines in MaizeGDB database
annotating with TO and PO terms
new interface at MaizeGDB includes some of the PO terms; adding additional ontology terms and adding TO
working with their developers so their users can browse the ontology hierarchy

Plant Disease Ontology

(will become part of Plant Stress Ontology)

new undergrad helper working on adding diseases
will also tie into the Panzea dataset as they also have disease traits

Working with new collaborating database group PHI-Base (Pathogen-Host Interaction database), an initiative from Rothamsted in England
Large set of manually curated literature
Covers wide range of plant species (and animals)
They are requesting terms and will make cross links to and from their database

E. Other Comments:

Leo has been working with JE and his local Sys Admin to get the SVN access working- maybe fixed by moving to GitHub

Next meeting Thursday May 21st


@@ Line 1: / Line 1: @@
-Who:
-Not available:
-* Link to recording- Video:
+* Who: LC, PJ, JP, JE, CM, JD, XX
-* Link to recording- Audio:
+* Not available:  EA, EZ, BS, GG, SN
+'''Zoom Connection Details:  Join from PC, Mac, iOS or Android: https://zoom.us/j/996259332'''
+* Link to recording- Video: [[File:All-Hands Meeting 4-16-15.mp4|thumbnail|All-Hands_Meeting_4-16-15.mp4]]
+* Link to recording- Audio: [[File:All-Hands Meeting 4-16-15.m4a|thumbnail|All-Hands_Meeting-Audio_4-16-15.m4a]]
 == A. General Comments and Updates:==
+=== Introductions: ===
+*Leo Valette, New Crop Ontology curator and coordinator at Bioversity, with Elizabeth Arnaud (for more info please see: [[Ontology_Working_Group_March_6th_2015]]
+- Curating the trait dictionaries, used by plant breeders
+- New Hire at CO: Marie Angelique Laporte has accepted the position as an Ontology Engineer, will be joining the first week of May
+* John Doonan, Director National Plant Phenomics Centre, at University of Aberystwyth, UK
+- High throughput phenotyping - gene discovery and plant breeding. Integration of phenomic and genomic data
+- Semi-automated phenotyping system, associated with Plant Breeding department
+- Background in developmental genetics, but works closely with George Gkoutos, expert in ontologies and bioinformatics
+- George's news: postdoc position has been advertised, and BBSRC has agreed to extend it as part of the collaboration with CGIAR on tropical grasses (''Need to check?'')
-== B. Update from IT group- Data Store, AmiGO2, Justin Elser:==
+- Hands on facility for large scale phenotyping
+- Goal is integrate ontologies into phenotyping workflow.
+Chris Mungall, LBL
+- background in bioinformatics, involved with Gene Ontology, Phenotype ontology, PATO with George Gkoutous, human and mammalian systems, created Uberon anatomy ontology
+- Assist with deployment of AmiGO2 browser.
+Xu Xu
+- In Sinisa's group, works on Image annotation
+- From Xu Xu : "hello all, I am Xu Xu. Sorry my mic is not working.  Just an introduction here: I am Sinisa's student working in image annotation group, with Justin. I will be working on integrating AISO to BisQue. Happy to work with all of you!"
+Update on hires at OSU:
+PJ: Software Developer position has not had too many applicants, PJ met a potential candidate who is a spouse of a BPP department member, Biology Masters, recently from Yahoo.
+Ontology curator position - will start evaluating applicants and will hopefully can offer to someone by mid May......
+No news on the position at NYBG.
+== B. Update from IT group- Data Store, AmiGO2==
+* Justin Elser, Chris Mungall, with Seth Carbon
+==AmiGO2 install==
+* A dev version of the AmiGO2 has been installed and loaded with a set of OWL files, includes some others that may not be needed.  http://dev.planteome.org
+* Still in progress, need to do a fair bit of configuration and organization of the ontology project files.
+* Loading the ontologies takes about 25 minutes, but does not include the ncbi taxon ontology, which takes too much memory and doesn't finish on palea.
+* Looking at loading a small subset (plants?) of the taxon
+===Ontologies loaded:===
+* ChEBI, GO, PATO, Cell, PO, TO, ECO, GOREL, also Uberon, Uberon/phenoscape-anatomy, ncbi_taxonomy (114)
+* Need to include the Plant Experimental Conditions ontology (EO), and take out the CL and Uberon files
+* Several of the features such as QuickGO are not working ''QuickGo apparently is hard wired for GO, so we will not be able to adopt it''
+* 'Mappings' aka dbxrefs to SourceForge and other ontologies like PATO are not working either.
+* Filters are not being saved- compare with the GO version- ''It actually does not work there either, apparently designed this way.''
+CM: In the future, we will have one uber file- "planteome.owl" that will contain all the planteome core and partner (PATO, ChEBI, GO etc) ontologies.
+* Also could use only a slice of the GO, as there are a lot of non-plant relevant terms.
+PJ: Could use the GR_tax-ontology.obo file that has many of the plant species, and should have all the pathogens- ''Note: GR_tax-ontology.obo has only green plants and red algae.''
+* GR_tax-ontology.obo should have refs to NCBI as well, would have to be maintained here, has not been worked on for over two years
+CM: We could extract a slice of the NCBI taxonomy, if we can define our criteria e.g. everything that we have an annotation to, should have all the plants and the pests and pathogens.
+===Test set of annotations loaded ===
+* Loaded around 200,000 annotations from various ontologies, out of a total of ~ 4 million
+* Load times seem much better than the old AmiGO, at least for these annotations
+* Problems were found in some of the files, specifically coming from the col 16- annotation extensions and also in column 5 (ontology ID), where spaces are causing problems
+* JE corrected the issues he found in the files, but these revised files need to be committed back to the SVN- ''Done''
+{| class="wikitable"
+|-
+! Ontology !! Species !! filename !! Issues?
+|-
+| PO-Anatomy ||rice || po_anatomy_gene_oryza_gramene.assoc || no
+|-
+| PO-Anatomy || rice || po_anatomy_gene_oryza_poc.assoc || yes
+|-
+| PO-Anatomy || rice || po_anatomy_qtl_oryza_gramene.assoc || no
+|-
+| PO-Anatomy || maize || po_anatomy_gene_zea_MaizeGDB.assoc || no
+|-
+| PO-Anatomy || grape || po_anatomy_gene_vitis_poc.assoc || yes
+|-
+| PO-Anatomy ||  maize ||po_anatomy_stock_zea_MaizeGDB.assoc || no
+|-
+| PO_Growth_Stage || rice || po_growth_gene_oryza_gramene.assoc || no
+|-
+| PO_Growth_Stage || rice || po_growth_qtl_oryza_gramene.assoc || no
+|-
+| PO_Growth_Stage || grape ||  po_growth_gene_vitis_poc.assoc || yes
+|-
+| Plant_Ontology || rice || po_ontology_IMP_gene_oryza_poc.assoc || yes
+|-
+| Trait_Ontology || Arabidopsis || to_diversity_arabidopsis.assoc || yes
+|-
+| Trait_Ontology || ? || to_Protein_association.assoc || no
+|-
+| Trait_Ontology || ? || to_QTL_association.assoc || no
+|-
+| Trait_Ontology || Rice || to_diversity_rice.assoc  || no
+|-
+| Trait_Ontology || ? || to_Gene_association.assoc || no
+|-
+| Plant_EO || ? || eo_protein.assoc || no
+|-
+| Plant_EO || rice ||  eo_diversity_rice.assoc || no
+|-
+|Plant_EO || ? || eo_qtl.assoc || no
+|-
+|Plant_EO || Arabidopsis || eo_diversity_arabidopsis.assoc || no
+|-
+| Plant_EO || ? || eo_gene.assoc || no
+|-
+| Gene_Ontology || rice || go_gramene_oryza.assoc || no
+|}
+''Need to standardize the names of the annotation files''
+===Browsing in AmiGO2===
+* Type term into search box on front page, auto-fill feature will be useful
+* Column sorting can be customized
+* NCBI ids are not displayed correctly as the NCBI tree is not loaded
+* Need to check on "Source"- where is it pulling  that from? ''It looks like it is using the namespaces from the terms.''
+* "Panther Gene Families"- We can customize this and can use our own Inparanoid gene families, as Panther does not have too many plants represented
+* Inparanoid gene families- algorithm creates superclusters, based on ~70- 100 genomes, species by species comparison
+* Can input any standard format such as New Hampshire, phyloXML, etc call it "Plant Gene Families" etc
+* Can try with a subset- e.g. Arabidopsis, rice and maize and associate with the GO annotations
+PJ: can include the other out groups, but do not serve the animal genes, but then we have to serve all the GO human/mouse etc annotations
+* Start with displaying only the plant data, but include all the species in the analysis.  Can decide later if we want to include it as well
+* JE will do a baseline GO annotation for all the plant genomes, based on Interpro
+PJ- Goal is a "Pan-Gene" database - centralized resource for nomenclature and management
+* Need an annotation platform- one idea is a "wiki" type platform
+* May look at GO tool "Noctua", CM can demo on a future call see Link to [[https://github.com/geneontology/noctua Noctua on GitHub]]
 == C. Update from AISO/BisQue group- Justin Preece:==
+''Please fill in notes here....''
+Personnel:
+* Yao Zhou has left Sinisa's lab
+* Xu Xu is currently a full time GRA on the project.
+== D. Update from Ontologies Working Group:==
+The OWG has been meeting roughly biweekly, around everyone's travel schedules- see the page for more details on the recent meetings.
+===Discussion about storing Ontologies and Associations at GitHub:===
+* Working on plans for moving the ontologies to GitHub, still not completely decided how to do this. See meeting notes [[Ontology_Working_Group_GitHub_Meeting,_Tuesday_April_3rd,_2015|GitHub Mtg 4-3]]
+*Re: associations: PJ met a person (another Justin!) from Github on the plane, they are interested in hosting large datasets and offering them through APIs etc
+* '''Question about whether or not we need to maintain the history?''' PJ: Ideally we would keep the history with the ontology files, not so important about the association files.
+* LC: Theoretically this is possible, but it is complicated.  On the 4-3 Github call, JE said that he would reorganize the files on the SVN and try importing the history.
+* PJ: Suggest someone should send an email to Github and ask them how best to transfer the SVN history....''Who is doing this??''
+* One option would be to maintain the SVN and run a chron job to the Github or visa versa, but this does not solve anything and adds a lot of extra work for maintenance.
+* PJ can ask his contact "GitHub-Justin" if we have questions.
+Note from after the call: see this link for [[https://github.com/blog/1986-announcing-git-large-file-storage-lfs Announcing Git Large File Storage (LFS)]]
+====Summary of Goals: ====
+PJ: The goal is to have a version control system on Github for the ontology files (and possibly the annotations).  Currently we are using the SVN on our local servers, publicly-available for read access, approved developers can get write access.
+* Each change generates a version # for tracking purposes.
+* To release the ontologies and data on the database and browser, the team JE/CM would take a snapshot or branch it
+===Updates on various collaborative projects:===
+==== Panzea dataset annotation in collaboration with MaizeGDB====
+* large GWAS dataset 385,000 lines in MaizeGDB database
+* annotating with TO and PO terms
+* new interface at Maize GDB includes some of the PO terms, adding additional ontology terms and adding TO
+* working with their developers so their users can browse the ontology hierarchy
-== D. Update from Ontologies Working group:==
+=== Plant Disease Ontology ===
+(will become part of Plant Stress Ontology)
+* new undergrad helper working on adding diseases
+* will also tie into the Panzea dataset as they also have diseases traits
+* Working with new collaborating Database group [http://www.phi-base.org/ PHI-Base Pathogen] - Host Interaction database- Initiative from Rothamstad in England
+* Large set of manually curated literature
+* Covers wide range of plant species (and animals)
+* They are requesting terms and will make cross links to and from their database
 ==E. Other Comments:==
+Leo has been working with JE and his local Sys Admin to get the SVN access working- maybe fixed by moving to GitHub
-==Next meeting Thursday May 14th==
+==Next meeting Thursday May 21st==
