Svn migration to git
Migration of the ontology repository from svn to git
Motivation
After much discussion with Chris Mungall, it was decided to migrate all relevant files from the POC svn repo to a new planteome repository in [github http://github.com].
Not all files need to be moved. Most notably, the association files are too large to be hosted on github, so we are still looking for a solution for those.
Also, ontology files other than plant_ontology.obo were stored in a collaborators_ontology folder. We decided to restructure the repo so that these files sit at the same level as plant_ontology.obo.
Dumping the svn repo
To keep the revision info for the files, we need to dump the full history from svn and then filter it down to the specific files we want to move to git.
On Palea
cd /data/svnrepos/Poc
svnadmin dump . > /lemma/justin/Poc_svn_migration/Poc_svn_dump
This will dump all repo information including revision info and commit messages to a file named Poc_svn_dump. I stored this on lemma as the file is large (>25GB).
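Before filtering, it can help to check which paths the dump actually contains, since the filter step needs exact repository paths. A minimal sketch, assuming the standard dump format in which each changed path is recorded on a `Node-path:` header line; the function name `list_dump_paths` is made up for illustration and was not part of the original procedure:

```shell
# Hypothetical helper: print every distinct path recorded in an svn dump file.
# Assumes the standard dump format, where each changed path appears on a
# line of the form "Node-path: <path>".
list_dump_paths() {
    # -a makes grep treat the dump as text even though it may hold binary data
    grep -a '^Node-path: ' "$1" | sed 's/^Node-path: //' | sort -u
}

# Example (not run here): show only the paths that mention trait.obo
# list_dump_paths /lemma/justin/Poc_svn_migration/Poc_svn_dump | grep trait.obo
```

On a dump of this size the scan reads the whole file, so expect it to take a while.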
Filtering the dump
Note: the following was run on one of the cluster compute nodes because I didn't want to swamp palea while working on it.
I decided to test this out with trait.obo first as a proof of concept. Other files should be similar, with changes to the include statement.
cd /lemma/justin/Poc_svn_migration
cat Poc_svn_dump | svndumpfilter include trunk/ontology/collaborators_ontology/gramene/traits/trait.obo --drop-empty-revs --renumber-revs > trait.obo_filtered_dump
The svndumpfilter command will go through the dump file and only output revision info for the included path, trait.obo in this case. The full path needs to be given in the include argument, otherwise it outputs an empty repo. The other arguments drop the revisions that end up empty and renumber the remaining ones so that they start at 1.
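Since the other files only differ in the include path, the filter step could be wrapped in a small loop. This is a sketch rather than the commands that were actually run: the function name `filter_paths` and the output naming (basename plus `_filtered_dump`, matching `trait.obo_filtered_dump` above) are assumptions, and any path other than the trait.obo one has to be supplied by whoever runs it.

```shell
# Hypothetical wrapper around the svndumpfilter call shown above.
# Usage: filter_paths DUMP_FILE REPO_PATH [REPO_PATH...]
# Writes one <basename>_filtered_dump file per path into the current directory.
filter_paths() {
    dump_file=$1
    shift
    for repo_path in "$@"; do
        out_file="$(basename "$repo_path")_filtered_dump"
        # Same options as the single-file command: keep only this path,
        # drop revisions that become empty, and renumber the rest.
        svndumpfilter include "$repo_path" --drop-empty-revs --renumber-revs \
            < "$dump_file" > "$out_file" || return 1
    done
}

# Example (not run here):
# cd /lemma/justin/Poc_svn_migration
# filter_paths Poc_svn_dump trunk/ontology/collaborators_ontology/gramene/traits/trait.obo
```

Each path triggers a full pass over the dump, which is slow for a file this large. svndumpfilter include also accepts several paths in one call, but that produces a single combined dump instead of one file per path.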
Create a new svn repo to hold the filtered data
cd /data/svnrepos
sudo svnadmin create /data/svnrepos/svn2git
sudo chown -R apache: svn2git
Copy the filtered dump to the new repo
Before we can load the revisions into the new repo, we need to create the folder that the files in the filtered dump live in, or the load will fail with an error:
cd ~/palea_svn
svn co http://palea.cgrb.oregonstate.edu:/svn/svn2git
cd svn2git
mkdir -p trunk/ontology/collaborators_ontology/gramene/traits
svn add trunk
svn commit --username=elserj
Then back on palea (svn server):
cd /data/svnrepos
sudo svnadmin load ./svn2git < /lemma/justin/Poc_svn_migration/trait.obo_filtered_dump
Fix the file path (from the svn client):
svn up
svn mv trunk/ontology/collaborators_ontology/gramene/traits/trait.obo ./
svn up
svn rm trunk
svn commit
At this point, we should have a clean svn repo that contains only the filtered files, along with all of their revision info.
Copy the svn repo to github
Checkout the repo from github