Crop Ontology Pages: Difference between revisions

From Planteome.org
No edit summary
 
(3 intermediate revisions by 2 users not shown)
Line 3: Line 3:
The Crop Ontology website consists of 5 sections:
The Crop Ontology website consists of 5 sections:
# Phenotypes and Traits Ontology
# Phenotypes and Traits Ontology
##crop specific trait ontologies. contains also the methods and the scales of measurement
## crop list: Maize, Rice, Wheat, Cassava, Cowpea, Chickpea, Potato, Sorghum, Soybean, Barley, Common Bean, Groundnut, Pearl millet, Pigeon Pea, Musa, Sweet Potato, Yam, Oat, Vitis
# Plant Anatomy & Development Ontology
# Plant Anatomy & Development Ontology
##Banana anatomy
##Plant Ontology (PO)
##Wheat Plant Anatomy and Development Ontology: Defines growth stages of wheat
# General Germplasm Ontology
# General Germplasm Ontology
##FAO/IPGRI Multi-Crop Passport Descriptor ontology : Adapted from FAO/Bioversity Multi-Crop Passport Descriptors, 2004
##GCP Germplasm ontology: General germplasm descriptors
##ICIS germplasm methods ontology: adaptation of the ICIS germplasm methods table
# Location and Environmental Ontology
# Location and Environmental Ontology
##Country and Location ontology: Official ISO 3166-1 alpha-2, alpha-3 and numeric country codes
##Crop Research ontology: Describes experimental design, environmental conditions and methods associated with the crop study/experiment/trail and their evaluation
# Structural and Functional Genomic Ontology
# Structural and Functional Genomic Ontology
##Bioversity molecular markers ontology: Adapted from Descriptors for Genetic Markers Technologies (2004)


===There are 3 ways of creating “ontologies” on the Crop Ontology website===
===There are 3 ways of creating “ontologies” on the Crop Ontology website===
Line 20: Line 31:
=List of Crops and IDs=
=List of Crops and IDs=
According to the project timeline, the crops that will be linked to the reference ontologies are:
According to the project timeline, the crops that will be linked to the reference ontologies are:
{| border="1" class="sortable"
{| {{table}}
|-
| align="center" style="background:#f0f0f0;"|'''CROPS'''
! Crops
| align="center" style="background:#f0f0f0;"|'''Crop Ids'''
! Prefixes
| align="center" style="background:#f0f0f0;"|'''Prefixes'''
| align="center" style="background:#f0f0f0;"|'''Map to prefixes recommended by Chris'''
|-
|-
| Maize
| Maize||322||CO_322||IBPCO:322xxxx
| 322
|-
|-
| Rice
| Rice||320||CO_320||IBPCO:320xxxx
| 320
|-
|-
| Wheat
| Wheat||321||CO_321||IBPCO:321xxxx
| 321 (traits) and 121 (anatomy)
|-
|-
| Cassava
| Cassava||334||CO_334||IBPCO:334xxxx
| 334
|-
|-
| Cowpea
| Cowpea||340||CO_340||IBPCO:340xxxx
| 340
|-
|-
| Chickpea
| Chickpea||338||CO_338||IBPCO:338xxxx
| 338
|-
|-
| Potato
| Potato||330||CO_330||IBPCO:330xxxx
| 330
|-
|-
| Sorghum
| Sorghum||324||CO_324||IBPCO:324xxxx
| 324
|-
|-
| Soybean
| Soybean||336||CO_336||IBPCO:336xxxx
| 336
|-
|-
| Barley
| Barley||323||CO_323||IBPCO:323xxxx
| 323
|-
|-
| Common Bean
| Common Bean||335||CO_335||IBPCO:335xxxx
| 335
|-
|-
| Groundnut
| Groundnut||337||CO_337||IBPCO:337xxxx
| 337
|-
|-
| Pearl Millet
| Pearl millet||327||CO_327||IBPCO:327xxxx
| 327
|-
|-
| Pigeon Pea
| Pigeon Pea||341||CO_341||IBPCO:341xxxx
| 341
|-
|-
| Musa
| Musa||325 (traits) and 125 (anatomy)||CO_325 and CO_125||IBPCO:325xxxx and IBPCO:125xxxx
| 325 (traits) and 125 (anatomy)
|-
|-
| Sweet Potato
| Sweet Potato||331||CO_331||IBPCO:331xxxx
| 331
|-
|-
| Yam
| Yam||333||CO_333||IBPCO:333xxxx
| 333
|-
|-
| Other crops ?????
| Other crops ???||||||
|  
|}
|}


Note: The trait ontologies start with 3XX whereas the anatomy ontologies start with 1XX.
=Trait Dictionary V5=
 
== Mapping TD to OBO ==
==Trait Template==
==Structure of the Trait Template V4==
Currently, the Trait Template V4 is uploaded onto the CO website.
 
The Trait Template is used to gather information on traits along with their methods and scales of measurement.
 
An (unique) ID is created by the system for each trait, method and scale (several IDs are created when the scale is categorical).
 
** But for now, if two methods or scales are the same (e.g. seed length and leaf length are measured in centimeters), different IDs are created, which is far from being ideal or even ontologically correct.
 
===Trait Template V4 columns: ===
* IBFieldbook
* Name of submitting scientist
* Institution
* Language of submission (only in ISO 2 letter codes)
* Date of submission
* Crop
 
For each trait, some meta-information is added (5 first columns)
* Name of Trait
* Abbreviated name
* Synonyms (separate by commas)
* Trait ID for modification, Blank for New
* Description of Trait
* How is this trait routinely used?
* Trait Class
* Method ID for modification, Blank for new
* Name of Method
* Describe how measured (method)
* Growth Stage
* Bibliographic Reference
* Comments
* Scale ID for modification, Blank for new
*Type of Measure (Continuous, Discrete or Categorical)
*For Continuous: units of measurement
*For Continuous: reporting units (if different from measurement)
*For Continuous: minimum
*For Continuous: maximum
*For Discrete: Name of scale or units of measurement
*For Categorical: Name of rating scale
*For Categorical: Class 1 - value = meaning
*For Categorical: Class 2 - value = meaning
*For Categorical: Class 3 - value = meaning
*For Categorical: Class 4 - value = meaning
*For Categorical: Class 5 - value = meaning
 
==Structure of the Trait Template V5 (Léo is still working on it with Julian Pietragalla from CIMMYT) ==
Note the structure can  still evolve. Template V5 is still a draft
 
{| {{table}}
{| {{table}}
| align="center" style="background:#f0f0f0;"|'''Column'''
| align="center" style="background:#f0f0f0;"|'''TDv5 Elements'''
| align="center" style="background:#f0f0f0;"|'''Description'''
| align="center" style="background:#f0f0f0;"|'''Description'''
| align="center" style="background:#f0f0f0;"|'''Stored in CO website'''
| align="center" style="background:#f0f0f0;"|'''OBO element'''
|-
| Curation||Comments for curation||NO||NO
|-
|-
| Curation||Comments for curation
| Trait ID||Term ID for trait as generated by the system. Use an existing ID to modify data for that trait. If left blank the system will automatically generate a new ID.||Yes||Ontology ID (trait namespace)
|-
|-
| Scientist||Name of scientist submitting variable
| Trait||Trait name (property)||Yes||name (trait namespace)
|-
|-
| Institution||Name of institution submitting variable
| Entity||A trait must follow the convention "Trait" = "Entity" + "Attribute", eg for "grain colour", attribute = "grain"||Yes||Crossproduct to PO
|-
|-
| Language||2 letter ISO code for language in which variable data is submitted
| Attribute||eg for "grain colour", entity = "colour"||Yes||Crossproduct to PATO
|-
|-
| Date||Date the variable was submitted
| Trait synonyms||Acronym/abbreviated name.||Yes||synonym (trait namespace)
|-
|-
| Crop||Crop name
| Trait abbreviation||Trait name abbreviation. If several abbrevations, separate with comas. The recommended abbreviation must be the first in the list.||Yes||synonym (trait namespace)
|-
|-
| Trait ID||Term ID for trait as generated by the system. Use an existing ID to modify data for that trait. If left blank the system will automatically generate a new ID.
| Trait description||Textual description of trait.||Yes||def (trait namespace)
|-
|-
| Trait||Trait name (property)
| Trait class||General class to which trait belongs. Concensus trait classes are morphological, phenological, agronomical, physiological, abiotic stress, biotic stress, biochemical, quality traits and fertility traits||Yes||is_a
|-
|-
| Entity||A trait must follow the convention "Trait" = "Entity" + "Attribute", eg for "grain colour", attribute = "grain"
| Trait status||Status of the trait. Trait can be tagged as recommended, standard for , deprecated, legacy||Yes||is_obsolete
|-
|-
| Attribute||eg for "grain colour", entity = "colour"
| Trait Xref||Cross reference of the trait e.g., Xref to TO||Yes||xref
|-
|-
| Trait synonyms||Acronym/abbreviated name.
| Method ID|| Term ID for method as generated by the system. Use an existing ID to modify data for that method. If left blank the system will automatically generate a new ID. ||Yes||id
|-
|-
| Trait abbreviation||Recommended trait name abbreviation
| Method|| (Short) method name. ||Yes||name (method namespace)
|-
|-
| Trait abbreviation synonyms||Other trait name abbreviations
| Method description|| Textual description of method. ||Yes||def (method namespace)
|-
|-
| Trait description||Textual description of trait.
| Formula|| For computational methods, express the formula using variable names ||Yes||NO
|-
|-
| Trait class||General class to which trait belongs.
| Method class|| Measurement, Counting, Estimation, Computation ||Yes||NO
|-
|-
| Trait status||recommended, standard, obsolete,
| Method reference|| Biobliographical reference describing method. ||Yes||Xref
|-
|-
| Trait Xref||Cross reference of the trait e.g., Xref to TO
| Scale id||Term ID for scale as generated by the system. Use an existing ID to modify data for that scale. If left blank the system will automatically generate a new ID.||Yes||id
|-
|-
| Method ID|| Term ID for method as generated by the system. Use an existing ID to modify data for that method. If left blank the system will automatically generate a new ID.
| Scale name||[Guidelines to be defined for naming the scale]||Yes||name (scale namespace)
|-
|-
| Method|| (Short) method name.
| Scale class||Numerical, Nominal, Ordinal, Text, Code, Time, Duration||Yes||NO
|-
|-
| Method description|| Textual description of method.
| Decimal places||For numerical, number of decimal places of report||Yes||NO
|-
|-
| Formula|| For computational methods, express the formula using variable names
| Lower limit||Minimum value (used for validation) for numerical and date||Yes||NO
|-
|-
| Method class|| Measurement, Counting, Rating, Estimation, Scoring, Computation
| Upper limit||Maximum value (used for validation).||Yes||NO
|-
|-
| Method reference|| Biobliographical reference describing method.
| Scale Xref||Cross reference to the scale, eg to a unit repository like "improved UO"||Yes||Xref
|-
|-
| Scale id||Term ID for scale as generated by the system. Use an existing ID to modify data for that scale. If left blank the system will automatically generate a new ID.
| Category n||If the scale is categorical, class value and meaning of the n-th category. It possible to create as many category columns as necessary, as long as they are called "Category "||Yes||Yes
|-
|-
| Scale name||[Guidelines to be defined for naming the scale]
| Variable ID||Term ID for method as generated by the system. Use an existing ID to modify data for that method. If left blank the system will automatically generate a new ID.||Yes||Ontology ID (variable namespace)
|-
|-
| Scale class||Numerical, Nominal, Ordinal, Text, Code, Time, Duration
| Variable name (abbreviated)||Name of the variable following the convetion __. Variable name must be unique.||Yes||synonym (variable namespace)
|-
|-
| Decimal places||For numerical, number of decimal places of report
| Variable synonyms||Other names given to this variable||Yes||synonym (variable namespace)
|-
|-
| Lower limit||Minimum value (used for validation) for numerical and date
| Context of use||Indication of how trait is routinely used. If several context of use, separate with ","||Yes||NO
|-
|-
| Upper limit||Maximum value (used for validation).
| Growth stage|| Growthstage at which measurement is made. Follow standards. If variable used in time series, leave blank ||Yes||NO
|-
|-
| Scale Xref||Cross reference to the scale, eg to a unit repository like "improved UO"
| Variable status||Obsolete, Legacy, Standard for , Recommended, Experimental, Revision||Yes||is_obsolete
|-
|-
| Categorie i||Class value and meaning of the category "i" with "i" a positive integer
| CV Term ID||ID generated by BMS||Yes||NO
|-
|-
| Variable name||Name of the variable, it follows a naming convention described in guidelines
| Variable Xref||||Yes||Xref
|-
|-
| Variable synonyms||Other names given to this variable
| Scientist||Name of scientist submitting variable.||Yes||NO
|-
|-
| Context of use||Indication of how trait is routinely used. If several context of use, separate with ","
| Institution||Name of institution submitting variable||Yes||NO
|-
|-
| Growth stage|| Growthstage at which measurement is made. Follow standards. If variable used in time series, leave blank
| Language||2 letter ISO code for language in which variable data is submitted.||Yes||NO
|-
|-
| Variable status||Obsolete, Legacy, Standard // institution1, institution2, Recommended, Experimental, Revision
| Date||Date the variable was submitted.||Yes||NO
|-
|-
| CV Term ID||ID generated by BMS
| Crop||Crop name.||Yes||Yes
|-
|-
| Variable Xref||
| Full variable name||||||customized name. For new variables -> concatenation of  by  in
|-
| variable definition||||||T:  /n M:  /n S:
|}
|}
=OBO files=
=OBO files=
==OBO file structure==
==OBO file structure==

Latest revision as of 09:55, 10 September 2015

Introduction to The Crop Ontology

The Crop Ontology website consists of 5 sections:

  1. Phenotypes and Traits Ontology
    1. Crop-specific trait ontologies, which also contain the methods and the scales of measurement
    2. crop list: Maize, Rice, Wheat, Cassava, Cowpea, Chickpea, Potato, Sorghum, Soybean, Barley, Common Bean, Groundnut, Pearl millet, Pigeon Pea, Musa, Sweet Potato, Yam, Oat, Vitis
  2. Plant Anatomy & Development Ontology
    1. Banana anatomy
    2. Plant Ontology (PO)
    3. Wheat Plant Anatomy and Development Ontology: Defines growth stages of wheat
  3. General Germplasm Ontology
    1. FAO/IPGRI Multi-Crop Passport Descriptor ontology: Adapted from FAO/Bioversity Multi-Crop Passport Descriptors, 2004
    2. GCP Germplasm ontology: General germplasm descriptors
    3. ICIS germplasm methods ontology: adaptation of the ICIS germplasm methods table
  4. Location and Environmental Ontology
    1. Country and Location ontology: Official ISO 3166-1 alpha-2, alpha-3 and numeric country codes
    2. Crop Research ontology: Describes experimental design, environmental conditions and methods associated with the crop study/experiment/trial and their evaluation
  5. Structural and Functional Genomic Ontology
    1. Bioversity molecular markers ontology: Adapted from Descriptors for Genetic Markers Technologies (2004)

There are 3 ways of creating “ontologies” on the Crop Ontology website

  1. Upload a Trait Dictionary (TD) in Excel. (Note: the TD is only used to store and load information about traits, so it cannot be used to upload ontologies that belong to the other categories.)
  2. Upload an OBO file. OBO files can be used to create ontologies in any of the 5 sections.
  3. Build an ontology from scratch using the website's dedicated interface. Are there any examples of these?

Storage and Downloading

  • Once the files are uploaded or terms are submitted through the web interface, the data are stored in a NoSQL database (Google App Engine Datastore).
  • The ontologies can be downloaded in various formats: CSV, OBO, SKOS and JSON. The structures of the different formats are described in the following sections.
  • Note that because there is no exact mapping between the different formats, some compromises have been made. Not all the information is available in every format.

List of Crops and IDs

According to the project timeline, the crops that will be linked to the reference ontologies are:

{| {{table}}
! Crops !! Crop IDs !! Prefixes !! Map to prefixes recommended by Chris
|-
| Maize||322||CO_322||IBPCO:322xxxx
|-
| Rice||320||CO_320||IBPCO:320xxxx
|-
| Wheat||321||CO_321||IBPCO:321xxxx
|-
| Cassava||334||CO_334||IBPCO:334xxxx
|-
| Cowpea||340||CO_340||IBPCO:340xxxx
|-
| Chickpea||338||CO_338||IBPCO:338xxxx
|-
| Potato||330||CO_330||IBPCO:330xxxx
|-
| Sorghum||324||CO_324||IBPCO:324xxxx
|-
| Soybean||336||CO_336||IBPCO:336xxxx
|-
| Barley||323||CO_323||IBPCO:323xxxx
|-
| Common Bean||335||CO_335||IBPCO:335xxxx
|-
| Groundnut||337||CO_337||IBPCO:337xxxx
|-
| Pearl millet||327||CO_327||IBPCO:327xxxx
|-
| Pigeon Pea||341||CO_341||IBPCO:341xxxx
|-
| Musa||325 (traits) and 125 (anatomy)||CO_325 and CO_125||IBPCO:325xxxx and IBPCO:125xxxx
|-
| Sweet Potato||331||CO_331||IBPCO:331xxxx
|-
| Yam||333||CO_333||IBPCO:333xxxx
|-
| Other crops ???||||||
|}
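
The relationship between crop IDs, website prefixes and the mapped IBPCO patterns in the table above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and are not part of the Crop Ontology website, and the dictionary holds just a few rows of the table. The 3XX/1XX distinction follows the Musa row (325 for traits, 125 for anatomy).

```python
# Illustrative sketch; names below are hypothetical helpers, not a site API.

# A few crop IDs copied from the table above.
CROP_IDS = {
    "Maize": 322,
    "Rice": 320,
    "Wheat": 321,
    "Cassava": 334,
    "Musa": 325,
}


def co_prefix(crop_id: int) -> str:
    """Return the prefix column of the table, e.g. CO_322."""
    return f"CO_{crop_id}"


def ibpco_pattern(crop_id: int) -> str:
    """Return the mapped ID pattern of the table, e.g. IBPCO:322xxxx."""
    return f"IBPCO:{crop_id}xxxx"


def ontology_kind(crop_id: int) -> str:
    """Classify an ID by its leading digit: 3XX = traits, 1XX = anatomy."""
    first_digit = str(crop_id)[0]
    return {"3": "traits", "1": "anatomy"}.get(first_digit, "unknown")


if __name__ == "__main__":
    for crop, crop_id in CROP_IDS.items():
        print(crop, co_prefix(crop_id), ibpco_pattern(crop_id), ontology_kind(crop_id))
```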

Trait Dictionary V5

Mapping TD to OBO

{| {{table}}
! TDv5 Elements !! Description !! Stored in CO website !! OBO element
|-
| Curation||Comments for curation||No||No
|-
| Trait ID||Term ID for trait as generated by the system. Use an existing ID to modify data for that trait. If left blank the system will automatically generate a new ID.||Yes||Ontology ID (trait namespace)
|-
| Trait||Trait name (property)||Yes||name (trait namespace)
|-
| Entity||A trait must follow the convention "Trait" = "Entity" + "Attribute", e.g. for "grain colour", entity = "grain"||Yes||Crossproduct to PO
|-
| Attribute||e.g. for "grain colour", attribute = "colour"||Yes||Crossproduct to PATO
|-
| Trait synonyms||Acronym/abbreviated name.||Yes||synonym (trait namespace)
|-
| Trait abbreviation||Trait name abbreviation. If there are several abbreviations, separate them with commas. The recommended abbreviation must be the first in the list.||Yes||synonym (trait namespace)
|-
| Trait description||Textual description of trait.||Yes||def (trait namespace)
|-
| Trait class||General class to which trait belongs. Consensus trait classes are morphological, phenological, agronomical, physiological, abiotic stress, biotic stress, biochemical, quality traits and fertility traits||Yes||is_a
|-
| Trait status||Status of the trait. A trait can be tagged as recommended, standard for , deprecated, legacy||Yes||is_obsolete
|-
| Trait Xref||Cross reference of the trait, e.g. Xref to TO||Yes||xref
|-
| Method ID||Term ID for method as generated by the system. Use an existing ID to modify data for that method. If left blank the system will automatically generate a new ID.||Yes||id
|-
| Method||(Short) method name.||Yes||name (method namespace)
|-
| Method description||Textual description of method.||Yes||def (method namespace)
|-
| Formula||For computational methods, express the formula using variable names||Yes||No
|-
| Method class||Measurement, Counting, Estimation, Computation||Yes||No
|-
| Method reference||Bibliographical reference describing method.||Yes||Xref
|-
| Scale id||Term ID for scale as generated by the system. Use an existing ID to modify data for that scale. If left blank the system will automatically generate a new ID.||Yes||id
|-
| Scale name||[Guidelines to be defined for naming the scale]||Yes||name (scale namespace)
|-
| Scale class||Numerical, Nominal, Ordinal, Text, Code, Time, Duration||Yes||No
|-
| Decimal places||For numerical scales, number of decimal places to report||Yes||No
|-
| Lower limit||Minimum value (used for validation) for numerical and date||Yes||No
|-
| Upper limit||Maximum value (used for validation).||Yes||No
|-
| Scale Xref||Cross reference to the scale, e.g. to a unit repository like "improved UO"||Yes||Xref
|-
| Category n||If the scale is categorical, class value and meaning of the n-th category. It is possible to create as many category columns as necessary, as long as they are called "Category "||Yes||Yes
|-
| Variable ID||Term ID for variable as generated by the system. Use an existing ID to modify data for that variable. If left blank the system will automatically generate a new ID.||Yes||Ontology ID (variable namespace)
|-
| Variable name (abbreviated)||Name of the variable following the convention __. Variable name must be unique.||Yes||synonym (variable namespace)
|-
| Variable synonyms||Other names given to this variable||Yes||synonym (variable namespace)
|-
| Context of use||Indication of how trait is routinely used. If there are several contexts of use, separate them with ","||Yes||No
|-
| Growth stage||Growth stage at which measurement is made. Follow standards. If the variable is used in a time series, leave blank||Yes||No
|-
| Variable status||Obsolete, Legacy, Standard for , Recommended, Experimental, Revision||Yes||is_obsolete
|-
| CV Term ID||ID generated by BMS||Yes||No
|-
| Variable Xref||||Yes||Xref
|-
| Scientist||Name of scientist submitting variable.||Yes||No
|-
| Institution||Name of institution submitting variable||Yes||No
|-
| Language||2 letter ISO code for language in which variable data is submitted.||Yes||No
|-
| Date||Date the variable was submitted.||Yes||No
|-
| Crop||Crop name.||Yes||Yes
|-
| Full variable name||||||customized name. For new variables -> concatenation of by in
|-
| variable definition||||||T: /n M: /n S:
|}
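
The table says that the lower and upper limits are "used for validation" and that decimal places control reporting for numerical scales. The Python sketch below shows one way this could work. It is an assumption about behaviour, not a description of the website's code, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NumericalScale:
    """Subset of the TDv5 scale columns (hypothetical container)."""
    name: str
    decimal_places: Optional[int] = None
    lower_limit: Optional[float] = None
    upper_limit: Optional[float] = None


def validate(scale: NumericalScale, value: float) -> List[str]:
    """Return a list of problems; an empty list means the value is accepted."""
    problems = []
    if scale.lower_limit is not None and value < scale.lower_limit:
        problems.append(f"{value} is below the lower limit {scale.lower_limit}")
    if scale.upper_limit is not None and value > scale.upper_limit:
        problems.append(f"{value} is above the upper limit {scale.upper_limit}")
    return problems


def report(scale: NumericalScale, value: float) -> str:
    """Format a value with the number of decimal places set on the scale."""
    if scale.decimal_places is None:
        return str(value)
    return f"{value:.{scale.decimal_places}f}"


if __name__ == "__main__":
    cm = NumericalScale(name="cm", decimal_places=1, lower_limit=0, upper_limit=300)
    print(validate(cm, 120.0), report(cm, 120.0))
    print(validate(cm, -5.0))
```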

OBO files

OBO file structure

OBO files are created using the obo2owl library (which should sound familiar ;-)).

The main issue with the Trait Template is that different IDs can be created for the same “thing” (e.g. the term “cm” gets a different ID for each trait that is measured in cm), so different OBO terms can have the same name but different IDs.

This caused problems when our users tried to import the OBO files into a Chado database.

  • Consequently, I created a small script that creates OBO files in which all the names are unique within a given namespace. The script has not been integrated into the website yet, but it will be soon. Example of an OBO file on GitHub. (MAL)
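
The script itself is not shown on this page. As an illustration of the idea only, the Python sketch below shows one possible way to make names unique within a namespace: keep the first term seen for each (namespace, name) pair and map the IDs of later duplicates onto it. The function name, the term layout and the IDs used in the example are all made up.

```python
from typing import Dict, List, Tuple


def merge_duplicate_names(terms: List[dict]) -> Tuple[List[dict], Dict[str, str]]:
    """Collapse terms that share a name within the same namespace.

    Each term is a dict with 'id', 'name' and 'namespace' keys.
    Returns the de-duplicated terms and a map from every original ID
    to the ID that was kept.
    """
    canonical: Dict[Tuple[str, str], str] = {}
    unique_terms: List[dict] = []
    id_map: Dict[str, str] = {}
    for term in terms:
        # Names are compared case-insensitively, ignoring surrounding spaces.
        key = (term["namespace"], term["name"].strip().lower())
        if key not in canonical:
            canonical[key] = term["id"]
            unique_terms.append(term)
        id_map[term["id"]] = canonical[key]
    return unique_terms, id_map


if __name__ == "__main__":
    # Made-up IDs: two scales named "cm" created for two different traits.
    terms = [
        {"id": "S1", "name": "cm", "namespace": "scale"},
        {"id": "S2", "name": "cm", "namespace": "scale"},
        {"id": "M1", "name": "cm", "namespace": "method"},
    ]
    unique, id_map = merge_duplicate_names(terms)
    print([t["id"] for t in unique], id_map)
```

Any relationship that pointed at a dropped ID (for example scale_of) would need to be rewritten through the returned map.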
{| {{table}}
! TD column !! OBO element
|-
| Crop||Used for setting up the namespace
|-
| Name of Trait||name
|-
| Trait ID for modification, Blank for New||Term + id
|-
| Description of Trait||definition
|-
| Trait Class||Term + id + name + is_a
|-
| Language of submission (only in ISO 2 letter codes)||We have only English OBO files.
|-
| Name of submitting scientist||created_by
|-
| Institution||Not used
|-
| Date of submission||creation_date
|-
| Abbreviated name||synonym [EXACT]
|-
| Synonyms (separate by commas)||synonym [EXACT]
|-
| How is this trait routinely used||Not used
|-
| Name of method||name
|-
| Method ID for modification, Blank for New||Term + id + method_of
|-
| Describe how measured (method)||definition
|-
| Bibliographic Reference||xref
|-
| Growth Stage||Not used
|-
| Comments||Not used
|-
| Scale ID for modification, Blank for New||Term + id + scale_of
|-
| Type of Measure (Continuous, Discrete or Categorical)||Not used
|-
| For Continuous: units of measurement||name
|-
| For Discrete: Name of scale or units of measurement||name
|-
| For Categorical: Name of rating scale||name
|-
| For Categorical: Class 1 - value = meaning||Term + name + is_a
|-
| For Categorical: Class 2 - value = meaning||Term + name + is_a
|-
| For Categorical: Class 3 - value = meaning||Term + name + is_a
|-
| For Categorical: Class 4 - value = meaning||Term + name + is_a
|}
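
To make the mapping above concrete, here is a minimal Python sketch that turns the trait columns of one Trait Template row into an OBO [Term] stanza. It is not the website's exporter. The function names are hypothetical, the namespace is simply set to the crop name (the page only says the crop is "used for setting up the namespace"), and the ID of the trait class term is passed in as an argument because the template itself only carries the class name.

```python
def quote(text: str) -> str:
    """Wrap text in double quotes, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def trait_to_obo(row: dict, trait_class_id: str) -> str:
    """Render the trait part of a Trait Template row as an OBO stanza."""
    lines = [
        "[Term]",
        f"id: {row['Trait ID']}",
        f"name: {row['Name of Trait']}",
        f"namespace: {row['Crop']}",  # assumption: crop name used as-is
    ]
    if row.get("Description of Trait"):
        lines.append(f"def: {quote(row['Description of Trait'])} []")
    # Abbreviated name and synonyms both map to EXACT synonyms.
    names = [row.get("Abbreviated name", "")]
    names += row.get("Synonyms (separate by commas)", "").split(",")
    for name in names:
        name = name.strip()
        if name:
            lines.append(f"synonym: {quote(name)} EXACT []")
    lines.append(f"is_a: {trait_class_id}")
    if row.get("Name of submitting scientist"):
        lines.append(f"created_by: {row['Name of submitting scientist']}")
    if row.get("Date of submission"):
        lines.append(f"creation_date: {row['Date of submission']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # All values below are made up for illustration.
    row = {
        "Crop": "Wheat",
        "Trait ID": "CO_321:0000001",
        "Name of Trait": "Grain colour",
        "Abbreviated name": "GC",
        "Synonyms (separate by commas)": "kernel colour, seed colour",
        "Description of Trait": "Colour of the mature grain.",
    }
    print(trait_to_obo(row, trait_class_id="CO_321:0000002"))
```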


SKOS files

SKOS file structure

{| {{table}}
! TD column !! SKOS
|-
| Crop||Used for setting up the namespace
|-
| Name of Trait||skos:prefLabel
|-
| Trait ID for modification, Blank for New||Local Name (URI) + rdf:type skos:Concept
|-
| Description of Trait||skos:definition
|-
| Trait Class||rdf:type skos:Concept + skos:broader
|-
| Language of submission (only in ISO 2 letter codes)||xml:lang of the literals. SKOS files are multilingual (if we have the info)
|-
| Name of submitting scientist||foaf:Person (not set so far but will be soon)
|-
| Institution||foaf:Organization (not set so far but will be soon)
|-
| Date of submission||dc:date (not set so far but will be soon)
|-
| Abbreviated name||skos:altLabel
|-
| Synonyms (separate by commas)||skos:altLabel
|-
| How is this trait routinely used||Not used
|-
| Name of method||skos:prefLabel
|-
| Method ID for modification, Blank for New||Local Name (URI) + rdf:type skos:Concept
|-
| Describe how measured (method)||skos:definition
|-
| Bibliographic Reference||Not set (dc:bibliographicCitation?)
|-
| Growth Stage||skos:related (not set so far)
|-
| Comments||skos:editorialNote
|-
| Scale ID for modification, Blank for New||Local Name (URI) + rdf:type skos:Concept
|-
| Type of Measure (Continuous, Discrete or Categorical)||Not used
|-
| For Continuous: units of measurement||skos:prefLabel
|-
| For Discrete: Name of scale or units of measurement||skos:prefLabel
|-
| For Categorical: Name of rating scale||skos:prefLabel
|-
| For Categorical: Class 1 - value = meaning||Local Name (URI) + rdf:type skos:Concept
|-
| For Categorical: Class 2 - value = meaning||Local Name (URI) + rdf:type skos:Concept
|-
| For Categorical: Class 3 - value = meaning||Local Name (URI) + rdf:type skos:Concept
|-
| For Categorical: Class 4 - value = meaning||Local Name (URI) + rdf:type skos:Concept
|}
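
As with the OBO mapping, the SKOS mapping can be illustrated with a short Python sketch that writes the trait columns of one row as Turtle. This is a sketch under assumptions, not the website's exporter. The base URI is a placeholder, the function names are hypothetical, and the ID of the broader trait class concept is passed in as an argument.

```python
PREFIXES = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
BASE_URI = "http://example.org/crop-ontology/"  # placeholder, not the real base URI


def literal(text: str, lang: str) -> str:
    """Return a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def trait_to_turtle(row: dict, trait_class_id: str) -> str:
    """Render the trait part of a Trait Template row as one Turtle subject."""
    # The ISO 2 letter code becomes the language tag of every literal.
    lang = row.get("Language of submission", "EN").lower()
    statements = [
        "a skos:Concept",
        f"skos:prefLabel {literal(row['Name of Trait'], lang)}",
    ]
    if row.get("Description of Trait"):
        statements.append(f"skos:definition {literal(row['Description of Trait'], lang)}")
    # Abbreviated name and synonyms both map to skos:altLabel.
    labels = [row.get("Abbreviated name", "")]
    labels += row.get("Synonyms (separate by commas)", "").split(",")
    for label in labels:
        label = label.strip()
        if label:
            statements.append(f"skos:altLabel {literal(label, lang)}")
    statements.append(f"skos:broader <{BASE_URI}{trait_class_id}>")
    if row.get("Comments"):
        statements.append(f"skos:editorialNote {literal(row['Comments'], lang)}")
    subject = f"<{BASE_URI}{row['Trait ID']}>"
    return f"{subject}\n    " + " ;\n    ".join(statements) + " ."


if __name__ == "__main__":
    # All values below are made up for illustration.
    row = {
        "Trait ID": "CO_321:0000001",
        "Name of Trait": "Grain colour",
        "Abbreviated name": "GC",
        "Language of submission": "EN",
    }
    print(PREFIXES + trait_to_turtle(row, trait_class_id="CO_321:0000002"))
```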