AmiGO2 install
GOLR install:
GOLR has a few prerequisites, which appear to be Maven and OWLTools. To install Maven, I followed the instructions at http://blog.gluster.org/2013/08/yum-install-maven-yes-you-can/, which worked without error.
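Before going further, it can help to confirm that the tools used on this page are actually on the PATH. The following is a minimal sketch, not part of the original procedure; the function name check_prereqs is made up for illustration.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
# check_prereqs is a hypothetical helper, not part of AmiGO or OWLTools.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "MISSING: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

For the steps on this page, it would be called as: check_prereqs java mvn svn git wget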
Then, to install OWLTools, I did the following:
cd /lemma/justin
svn co http://owltools.googlecode.com/svn/trunk/ owltools
This creates the owltools directory in my space on lemma.
To compile OWLTools, run:
mvn clean package
This will take about half an hour.
Grab the 3.6 version of SOLR from apache and unpack:
wget http://archive.apache.org/dist/lucene/solr/3.6.2/apache-solr-3.6.2.tgz
tar zxvf apache-solr-3.6.2.tgz
Pull down the latest version of AmiGO:
cd /data/www/planteome_dev
git clone https://github.com/geneontology/amigo.git amigo
Go back to the SOLR directory and replace the config with those from GOLR:
cd /lemma/justin/SOLR/apache-solr-3.6.2/example/solr/conf
cp schema.xml schema.xml.org
cp solrconfig.xml solrconfig.xml.org
cp /data/www/planteome_dev/amigo/golr/solr/conf/s* ./
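The backup-and-copy step can also be wrapped in a small function so that running it a second time does not overwrite the .org backups with the GOLR files. This is a sketch only; install_golr_conf is a hypothetical name, and the two directories are passed in as arguments rather than hard-coded.

```shell
# Back up the stock Solr config files, then copy in the GOLR versions.
# $1 = Solr conf directory, $2 = GOLR conf directory from the AmiGO checkout.
# install_golr_conf is a hypothetical helper name.
install_golr_conf() {
    solr_conf=$1
    golr_conf=$2
    for f in schema.xml solrconfig.xml; do
        # Keep the first backup; do not overwrite it on a second run.
        if [ -f "$solr_conf/$f" ] && [ ! -e "$solr_conf/$f.org" ]; then
            cp "$solr_conf/$f" "$solr_conf/$f.org" || return 1
        fi
    done
    # Same glob as the manual step: every file starting with "s".
    cp "$golr_conf"/s* "$solr_conf"/ || return 1
}
```

With the paths used on this page, the call would be: install_golr_conf /lemma/justin/SOLR/apache-solr-3.6.2/example/solr/conf /data/www/planteome_dev/amigo/golr/solr/conf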
I don't see any edits that need to be made at this point. I think when we move to additional ontologies, we will have to modify schema.xml to account for new relationship types. Cross that bridge when we get there.
Start SOLR(GOLR):
cd ../../    # you should now be in the example directory
java -jar start.jar
Now that it seems to start up fine, daemonise it so that it runs as a service at boot (note that any service needs to be okayed by Chris before it is started):
Add the init.d script as outlined on this page
Had to modify that script as follows to make it work on CentOS:
[elserj@palea init.d]$ diff solr solr.org
16,19d15
< # Source function library.
< [ -f /etc/rc.d/init.d/functions ] || exit 0
< . /etc/rc.d/init.d/functions
<
24c20
< daemon "cd /data/www/planteome_dev/SOLR/example; java -jar start.jar &> /var/log/solr/solr.log &"
---
> daemon --chdir='/data/www/plantome_dev/example' --command "java -jar start.jar" --respawn --output=/var/log/solr/solr.log --name=solr --verbose
40c36
< ps -aef | grep start.jar | grep -v grep | awk '{print $2}' | xargs -i++ kill -9 ++
---
> daemon --stop --name=solr --verbose
At this point, SOLR is running. The next step is to load data into it.
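To confirm that the service really came up, something along these lines could be used to poll it over HTTP. This is a sketch under assumptions: wait_for_solr is a made-up helper, curl is assumed to be installed, and the URL to poll (for example http://localhost:8983/solr/ for the default Jetty port of the example setup) is a guess that should be adjusted to whatever the instance actually serves.

```shell
# Poll a URL until it answers with HTTP success or the attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2).
# wait_for_solr is a hypothetical helper name.
wait_for_solr() {
    url=$1
    tries=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl return non-zero on HTTP errors such as 404 or 500;
        # --max-time keeps a hung connection from blocking the loop.
        if curl -sf --max-time 5 -o /dev/null "$url"; then
            echo "Solr is answering at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "No answer from $url after $tries attempts" >&2
    return 1
}
```

Example call: wait_for_solr http://localhost:8983/solr/ 30 2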
Load data into GOLR instance
First we need to pull down the latest AmiGO code from git:
cd /data/www/planteome_dev
git clone https://github.com/geneontology/amigo.git amigo
The metadata folder contains the YAML files that tell OWLTools how to populate the fields in GOLR.
Edit the golr/Makefile to fit our server:
[elserj@palea golr]$ diff Makefile Makefile.org
18c18
< MAVEN_EXE ?= /usr/bin/mvn
---
> MAVEN_EXE ?= ~/local/src/java/apache-maven-3.0.4/bin/mvn
21c21
< SOLR_URL ?= http://localhost:8983
---
> SOLR_URL ?= http://localhost:8080/solr/
24,27c24,27
< OWLTOOLS_ROOT ?= /nfs0/BPP/Jaiswal_Lab/justin/owltools/
< BBOP_JS_ROOT ?= /data/www/planteome_dev/amigo/external/
< #PANTHER_FILES_DIR ?= ~/local/src/svn/geneontology.org/trunk/experimental/trees/panther_data/
< #SOLR_DATA_ROOT ?= /srv/solr
---
> OWLTOOLS_ROOT ?= ~/local/src/git/owltools/
> BBOP_JS_ROOT ?= ~/local/src/git/bbop-js/
> PANTHER_FILES_DIR ?= ~/local/src/svn/geneontology.org/trunk/experimental/trees/panther_data/
> SOLR_DATA_ROOT ?= /srv/solr
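Because these Makefile variables are assigned with ?=, GNU make only applies the default when the variable is not already defined, and variables from the environment count as defined. So, as an alternative to editing the file, the same values could be supplied from the environment. The sketch below is untested against the AmiGO Makefile; golr_make and MAKE_BIN are hypothetical names introduced here. Note that this cannot reproduce the two lines that were commented out (PANTHER_FILES_DIR and SOLR_DATA_ROOT); those would keep their defaults.

```shell
# Run a target of the golr Makefile with our server's values supplied
# through the environment instead of by editing the Makefile.
# $1 = path to the golr directory, $2 = make target.
# golr_make and MAKE_BIN are hypothetical; MAKE_BIN only exists so that a
# different make binary (or a stub) can be substituted.
golr_make() {
    MAVEN_EXE=/usr/bin/mvn \
    SOLR_URL=http://localhost:8983 \
    OWLTOOLS_ROOT=/nfs0/BPP/Jaiswal_Lab/justin/owltools/ \
    BBOP_JS_ROOT=/data/www/planteome_dev/amigo/external/ \
    "${MAKE_BIN:-make}" -C "$1" "$2"
}
```

Example call: golr_make /data/www/planteome_dev/amigo/golr load-ontology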
Try to run the load-ontology part of the Makefile:
make load-ontology
This will take some time, as it appears to update OWLTools first and then load the ontologies.
At this point, I got an error when trying to load the ontologies, and worked with Seth to figure it out.
The errors were mostly caused by memory issues. I had to comment ncbitaxon.owl out of the list of ontologies that get loaded, and the memory on palea was raised to 20GB. We may eventually have to look at running the load on one of the cluster nodes, as they have much more RAM. That would require figuring out how to open up the SOLR port to the internal network, though.
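Since the load failed for lack of memory, a quick pre-flight check of the machine's RAM before starting a long load may save time. This is a sketch only: mem_total_gb and have_ram_gb are made-up helper names, it reads /proc/meminfo (Linux only), and it checks total machine memory, which is not the same thing as the Java heap available to OWLTools.

```shell
# Print total RAM in whole GB, read from a meminfo-style file.
# $1 = file to read (default /proc/meminfo, Linux only).
# Both functions are hypothetical helpers, not part of AmiGO or OWLTools.
mem_total_gb() {
    awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}

# Return 0 if total RAM is at least $1 GB.
# $2 (optional) is passed through to mem_total_gb.
have_ram_gb() {
    total=$(mem_total_gb "$2") || return 1
    [ -n "$total" ] && [ "$total" -ge "$1" ]
}
```

Example: have_ram_gb 20 && make load-ontology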
At this point, data for an initial test is loaded into SOLR. The next step is to get AmiGO2 up and running off that instance.
This page will be updated as I progress.