Crop Ontology Pages
Revision as of 21:05, 8 June 2015
Introduction
The Crop Ontology website has 5 sections:
- General Germplasm Ontology
- Phenotypes and Traits Ontology
- Structural and Functional Genomic Ontology
- Location and Environmental Ontology
- Plant Anatomy & Development Ontology
There are 3 ways of creating “ontologies” on the Crop Ontology website.
- First, one can choose to upload a Trait Dictionary in Excel. The TD is only used to store and load information about traits, thus it cannot be used for uploading ontologies that would belong to the other categories (Note,
- OBO files can be uploaded as well. They can be used to create ontologies in any of the 5 sections.
- Finally, ontologies can be built from scratch using the website's dedicated interface.
Once the files are uploaded or some terms are submitted using the web interface, the data are stored in a NoSQL database (Google App Engine Datastore). The ontologies can be downloaded in various formats: CSV, OBO, SKOS and JSON. The structures of the different formats are described in the following sections. Note that because there is no exact mapping between the different formats, some compromises have been made: not all the information is available in every format.
Trait Template
Structure of the Trait Template V4
So far, Trait Template V4 is the version uploaded to the CO website.
The Trait Template is used to gather information on traits along with their methods and scales of measurement. For each trait, some meta-information is added (the first 5 columns). A unique ID is created by the system for each trait, method and scale (several IDs are created when the scale is categorical).
For now, however, if two methods or scales are the same (e.g. seed length and leaf length are both measured in centimeters), different IDs are still created, which is far from ideal or even ontologically correct.
The Trait Template V4 columns are:
- IBFieldbook
- Name of submitting scientist
- Institution
- Language of submission (only in ISO 2 letter codes)
- Date of submission
- Crop
- Name of Trait
- Abbreviated name
- Synonyms (separate by commas)
- Trait ID for modification, Blank for New
- Description of Trait
- How is this trait routinely used?
- Trait Class
- Method ID for modification, Blank for new
- Name of Method
- Describe how measured (method)
- Growth Stage
- Bibliographic Reference
- Comments
- Scale ID for modification, Blank for new
- Type of Measure (Continuous, Discrete or Categorical)
- For Continuous: units of measurement
- For Continuous: reporting units (if different from measurement)
- For Continuous: minimum
- For Continuous: maximum
- For Discrete: Name of scale or units of measurement
- For Categorical: Name of rating scale
- For Categorical: Class 1 - value = meaning
- For Categorical: Class 2 - value = meaning
- For Categorical: Class 3 - value = meaning
- For Categorical: Class 4 - value = meaning
- For Categorical: Class 5 - value = meaning
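The categorical columns above each hold a single "value = meaning" cell. As a minimal sketch of how such cells could be read (this is not the website's own parser; the function names and the sample row are hypothetical, and only the column names come from the list above):

```python
CLASS_COLUMN = "For Categorical: Class {n} - value = meaning"


def parse_class(cell):
    """Split one 'value = meaning' cell into a (value, meaning) pair."""
    value, sep, meaning = cell.partition("=")
    if not sep or not value.strip():
        raise ValueError(f"expected 'value = meaning', got {cell!r}")
    return value.strip(), meaning.strip()


def parse_categorical_scale(row, max_classes=5):
    """Return the (value, meaning) pairs of the non-empty class columns of a row."""
    classes = []
    for n in range(1, max_classes + 1):
        cell = (row.get(CLASS_COLUMN.format(n=n)) or "").strip()
        if cell:
            classes.append(parse_class(cell))
    return classes


# Hypothetical row, for illustration only.
example_row = {
    "For Categorical: Name of rating scale": "Vigor 1-3",
    "For Categorical: Class 1 - value = meaning": "1 = low",
    "For Categorical: Class 2 - value = meaning": "2 = intermediate",
    "For Categorical: Class 3 - value = meaning": "3 = high",
    "For Categorical: Class 4 - value = meaning": "",
}
```

Calling `parse_categorical_scale(example_row)` yields one pair per filled-in class, which is also the point where the system would create one ID per category.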
Structure of the Trait Template V5 (Léo is still working on it with Julian Pietragalla from CIMMYT)
Note: the structure can still evolve. Template V5 is still a draft.
Column | Description |
---|---|
Curation | Comments for curation |
Scientist | Name of scientist submitting variable. |
Institution | Name of institution submitting variable |
Language | 2 letter ISO code for language in which variable data is submitted. |
Date | Date the variable was submitted. |
Crop | Crop name. |
Trait ID | Term ID for trait as generated by the system. Use an existing ID to modify data for that trait. If left blank the system will automatically generate a new ID. |
Trait | Trait name (property) |
Entity | A trait must follow the convention "Trait" = "Entity" + "Attribute", e.g. for "grain colour", entity = "grain" |
Attribute | e.g. for "grain colour", attribute = "colour" |
Trait synonyms | Acronym/abbreviated name. |
Trait abbreviation | Recommended trait name abbreviation |
Trait abbreviation synonyms | Other trait name abbreviations |
Trait description | Textual description of trait. |
Trait class | General class to which trait belongs. |
Trait status | recommended, standard, obsolete, |
Trait Xref | Cross reference of the trait e.g., Xref to TO |
Method ID | Term ID for method as generated by the system. Use an existing ID to modify data for that method. If left blank the system will automatically generate a new ID. |
Method | (Short) method name. |
Method description | Textual description of method. |
Formula | For computational methods, express the formula using variable names |
Method class | Measurement, Counting, Rating, Estimation, Scoring, Computation |
Method reference | Bibliographical reference describing method. |
Scale id | Term ID for scale as generated by the system. Use an existing ID to modify data for that scale. If left blank the system will automatically generate a new ID. |
Scale name | [Guidelines to be defined for naming the scale] |
Scale class | Numerical, Nominal, Ordinal, Text, Code, Time, Duration |
Decimal places | For numerical scales, number of decimal places to report |
Lower limit | Minimum value (used for validation) for numerical and date |
Upper limit | Maximum value (used for validation). |
Scale Xref | Cross reference to the scale, eg to a unit repository like "improved UO" |
Categorie i | Class value and meaning of the category "i" with "i" a positive integer |
Variable name | Name of the variable, it follows a naming convention described in guidelines |
Variable synonyms | Other names given to this variable |
Context of use | Indication of how trait is routinely used. If there are several contexts of use, separate them with "," |
Growth stage | Growth stage at which measurement is made. Follow standards. If variable used in time series, leave blank |
Variable status | Obsolete, Legacy, Standard // institution1, institution2, Recommended, Experimental, Revision |
CV Term ID | ID generated by BMS |
Variable Xref |
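Two of the V5 columns lend themselves to a short illustration: the "Trait" = "Entity" + "Attribute" convention, and the lower and upper limits used for validation. The sketch below is only an assumption about how these could be applied (inclusive bounds, a single space as separator); the function names are hypothetical and the draft template does not prescribe them.

```python
def compose_trait_name(entity, attribute):
    """Build a trait name following "Trait" = "Entity" + "Attribute"."""
    return f"{entity.strip()} {attribute.strip()}"


def within_limits(value, lower=None, upper=None):
    """Check a numerical value against optional lower and upper limits.

    Assumption: both limits are inclusive, and a missing limit (None)
    means that side is not validated.
    """
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True
```

For example, `compose_trait_name("grain", "colour")` gives "grain colour", and `within_limits(250, lower=0, upper=200)` rejects the value.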
OBO files
OBO file structure
OBO files are created using the obo2owl library (which should sound familiar ;-)). The main issue with the Trait Template is that, because different IDs can be created for the same “thing” (e.g. the term “cm” will get different IDs if different traits are measured in cm), different OBO terms can have the same name (but different IDs).
That was a problem when our users tried to import the OBO files into a Chado database.
Consequently, I created a little script that creates OBO files in which all the names are unique within a given namespace. The script has not been integrated into the website yet, but it will be soon. Example of an OBO file on GitHub.
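The script itself is not shown here, so the following is only one possible way to obtain unique names per namespace, not the approach actually used: whenever a name occurs more than once in a namespace, the term ID is appended to it. The term dictionaries and the `EX:` IDs are hypothetical.

```python
from collections import Counter


def make_names_unique(terms):
    """Return copies of the terms with names made unique per namespace.

    Each term is a dict with 'id', 'name' and 'namespace' keys. A name that
    is shared by several terms in one namespace gets ' (<id>)' appended;
    since IDs are unique, the resulting names are unique too.
    """
    counts = Counter((t["namespace"], t["name"]) for t in terms)
    result = []
    for term in terms:
        renamed = dict(term)
        if counts[(term["namespace"], term["name"])] > 1:
            renamed["name"] = f'{term["name"]} ({term["id"]})'
        result.append(renamed)
    return result


# Hypothetical terms: two scales both named "cm" in the same namespace.
example_terms = [
    {"id": "EX:0000003", "name": "cm", "namespace": "ExampleTrait"},
    {"id": "EX:0000007", "name": "cm", "namespace": "ExampleTrait"},
    {"id": "EX:0000001", "name": "seed length", "namespace": "ExampleTrait"},
]
```

Here the two "cm" terms become "cm (EX:0000003)" and "cm (EX:0000007)", while "seed length" is left unchanged. Merging identical scales into a single term would be the ontologically cleaner alternative, but it changes IDs rather than names.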
TD column | OBO element |
---|---|
Crop | Used for setting up the namespace |
Name of Trait | name |
Trait ID for modification, Blank for New | Term + id |
Description of Trait | definition |
Trait Class | Term +id + name + is-a |
Language of submission (only in ISO 2 letter codes) | We have only English OBO files. |
Name of submitting scientist | created_by |
Institution | Not used |
Date of submission | creation_date |
Abbreviated name | synonym [EXACT] |
Synonyms (separate by commas) | synonym [EXACT] |
How is this trait routinely used | Not used |
Name of method | name |
Method ID for modification, Blank for New | Term + id + method_of |
Describe how measured (method) | definition |
Bibliographic Reference | xref |
Growth Stage | Not used |
Comments | Not used |
Scale ID for modification, Blank for New | Term + id + scale_of |
Type of Measure (Continuous, Discrete or Categorical) | Not used |
For Continuous: units of measurement | name |
For Discrete: Name of scale or units of measurement | name |
For Categorical: Name of rating scale | name |
For Categorical: Class 1 - value = meaning | Term + name + is-a |
For Categorical: Class 2 - value = meaning | Term + name + is-a |
For Categorical: Class 3 - value = meaning | Term + name + is-a |
For Categorical: Class 4 - value = meaning | Term + name + is-a |
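To make the mapping above concrete, here is a minimal sketch that turns the trait columns of one Trait Dictionary row into an OBO `[Term]` stanza. It is not the website's exporter: the function names, the sample row and the `EX:` IDs are hypothetical, and using the crop name directly as the namespace is an assumption (the table only says the crop is used for setting up the namespace).

```python
def obo_quote(text):
    """Escape backslashes and double quotes for a quoted OBO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def trait_to_obo(row, trait_id, class_id):
    """Render the trait part of a TD row as an OBO [Term] stanza."""
    definition = obo_quote(row["Description of Trait"])
    lines = [
        "[Term]",
        f"id: {trait_id}",
        f"name: {row['Name of Trait']}",
        f"namespace: {row['Crop']}",  # assumption: crop name used as-is
        f'def: "{definition}" []',
    ]
    # Abbreviated name and synonyms both map to EXACT synonyms.
    synonyms = [row["Abbreviated name"]]
    synonyms += row["Synonyms (separate by commas)"].split(",")
    for synonym in synonyms:
        if synonym.strip():
            lines.append(f'synonym: "{obo_quote(synonym.strip())}" EXACT []')
    lines.append(f"is_a: {class_id}")  # Trait Class becomes the parent term
    lines.append(f"created_by: {row['Name of submitting scientist']}")
    lines.append(f"creation_date: {row['Date of submission']}")
    return "\n".join(lines)


# Hypothetical row, for illustration only.
example_row = {
    "Crop": "Example crop",
    "Name of Trait": "Plant height",
    "Description of Trait": "Height of the plant from soil to tip.",
    "Abbreviated name": "PH",
    "Synonyms (separate by commas)": "height, plant length",
    "Name of submitting scientist": "A. Scientist",
    "Date of submission": "2015-06-08",
}
```

`print(trait_to_obo(example_row, "EX:0000001", "EX:0000100"))` prints a stanza with one `synonym:` line per abbreviation or synonym. Method and scale rows would follow the same pattern, with `method_of` and `scale_of` relationships in place of `is_a`.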
SKOS files
SKOS file structure
TD column | SKOS |
---|---|
Crop | Used for setting up the namespace |
Name of Trait | skos:prefLabel |
Trait ID for modification, Blank for New | Local Name (URI) + rdf:type skos:Concept |
Description of Trait | skos:definition |
Trait Class | rdf:type skos:Concept + skos:broader |
Language of submission (only in ISO 2 letter codes) | xml:lang of the literals. SKOS files are multilingual (if we have the info) |
Name of submitting scientist | foaf:Person (not set so far but will be soon) |
Institution | foaf:Organization (not set so far but will be soon) |
Date of submission | dc:date (not set so far but will be soon) |
Abbreviated name | skos:altLabel |
Synonyms (separate by commas) | skos:altLabel |
How is this trait routinely used | Not used |
Name of method | skos:prefLabel |
Method ID for modification, Blank for New | Local Name (URI) + rdf:type skos:Concept |
Describe how measured (method) | skos:definition |
Bibliographic Reference | Not set (dc:bibliographicCitation?) |
Growth Stage | skos:related (not set so far) |
Comments | skos:editorialNote |
Scale ID for modification, Blank for New | Local Name (URI) + rdf:type skos:Concept |
Type of Measure (Continuous, Discrete or Categorical) | Not used |
For Continuous: units of measurement | skos:prefLabel |
For Discrete: Name of scale or units of measurement | skos:prefLabel |
For Categorical: Name of rating scale | skos:prefLabel |
For Categorical: Class 1 - value = meaning | Local Name (URI) + rdf:type skos:Concept |
For Categorical: Class 2 - value = meaning | Local Name (URI) + rdf:type skos:Concept |
For Categorical: Class 3 - value = meaning | Local Name (URI) + rdf:type skos:Concept |
For Categorical: Class 4 - value = meaning | Local Name (URI) + rdf:type skos:Concept |
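The SKOS mapping can be illustrated the same way. The sketch below serializes the trait columns of one row as Turtle using plain string formatting; it is not the website's exporter. The base URI `http://example.org/co/`, the local names and the sample row are hypothetical, and the columns marked "not set so far" in the table are left out.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
LANG_COLUMN = "Language of submission (only in ISO 2 letter codes)"


def turtle_literal(text, lang):
    """Return a language-tagged Turtle literal with basic escaping."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'


def trait_to_turtle(row, local_name, broader_local_name,
                    base="http://example.org/co/"):
    """Render the trait part of a TD row as a skos:Concept in Turtle."""
    lang = row[LANG_COLUMN].strip().lower()
    statements = [
        ("a", "skos:Concept"),
        ("skos:prefLabel", turtle_literal(row["Name of Trait"], lang)),
        ("skos:definition", turtle_literal(row["Description of Trait"], lang)),
    ]
    # Abbreviated name and synonyms both map to skos:altLabel.
    labels = [row["Abbreviated name"]]
    labels += row["Synonyms (separate by commas)"].split(",")
    for label in labels:
        if label.strip():
            statements.append(("skos:altLabel", turtle_literal(label.strip(), lang)))
    # Trait Class becomes the broader concept.
    statements.append(("skos:broader", f"<{base}{broader_local_name}>"))
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in statements)
    return f"{SKOS_PREFIX}\n\n<{base}{local_name}>\n    {body} ."


# Hypothetical row, for illustration only.
example_row = {
    LANG_COLUMN: "EN",
    "Name of Trait": "Plant height",
    "Description of Trait": "Height of the plant from soil to tip.",
    "Abbreviated name": "PH",
    "Synonyms (separate by commas)": "height",
}
```

Because every literal carries the language tag of the submission, rows for the same trait submitted in different languages could share one concept URI and differ only in their labels, which is what makes the SKOS output multilingual.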
List of Crops and IDs
According to the project timeline, the crops that will be linked to the reference ontologies are:
Crops | Prefixes |
---|---|
Maize | 322 |
Rice | 320 |
Wheat | 321 (traits) and 121 (anatomy) |
Cassava | 334 |
Cowpea | 340 |
Chickpea | 338 |
Potato | 330 |
Sorghum | 324 |
Soybean | 336 |
Barley | 323 |
Common Bean | 335 |
Groundnut | 337 |
Pearl Millet | 327 |
Pigeon Pea | 341 |
Musa | 325 (traits) and 125 (anatomy) |
Sweet Potato | 331 |
Yam | 333 |
Other crops ????? |
The trait ontologies start with 3XX whereas the anatomy ontologies start with 1XX.
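The prefix rule can be expressed as a small lookup. This is only a sketch: the prefixes are copied from the table above, while the function names are hypothetical.

```python
# Prefixes copied from the table above.
TRAIT_PREFIXES = {
    "Maize": 322, "Rice": 320, "Wheat": 321, "Cassava": 334,
    "Cowpea": 340, "Chickpea": 338, "Potato": 330, "Sorghum": 324,
    "Soybean": 336, "Barley": 323, "Common Bean": 335, "Groundnut": 337,
    "Pearl Millet": 327, "Pigeon Pea": 341, "Musa": 325,
    "Sweet Potato": 331, "Yam": 333,
}
ANATOMY_PREFIXES = {"Wheat": 121, "Musa": 125}


def ontology_kind(prefix):
    """Classify a prefix: 3XX is a trait ontology, 1XX an anatomy ontology."""
    if 300 <= prefix <= 399:
        return "trait"
    if 100 <= prefix <= 199:
        return "anatomy"
    raise ValueError(f"unknown prefix range: {prefix}")


def crop_for_prefix(prefix):
    """Return the crop registered for a prefix, or None if it is not listed."""
    kind = ontology_kind(prefix)
    table = TRAIT_PREFIXES if kind == "trait" else ANATOMY_PREFIXES
    for crop, value in table.items():
        if value == prefix:
            return crop
    return None
```

For instance, `ontology_kind(121)` returns "anatomy" and `crop_for_prefix(121)` returns "Wheat".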