Installing libsequence and analysis tools on OSX

After being relegated to the margins of biology for almost a century, the field of population genetics is now moving closer to the center of mainstream biology in the era of next-generation sequencing (NGS). In addition to providing genome-wide DNA variation data to study classical questions in evolution that are motivated by population genetic theory, NGS now permits population-genetics-based techniques like GWAS (genome-wide association studies) and eQTL (expression quantitative trait locus) analysis to allow biologists to identify and map functional regions across the genome.

Handling big data requires industrial-strength bioinformatics tool-kits, which are increasingly available for many aspects of genome bioinformatics (e.g. the Kent source tree, SAMtools, BEDtools, etc.). However, for molecular population genetics at the command line, researchers have a more limited palette of tools that can be used for genome-scale data, including VariScan, the Perl PopGen modules, and the libsequence C++ library.

Motivated by a recent question on BioStar and a request from a colleague to generate a table of polymorphic sites across a bacterial genome, I’ve recently had a play with the last of these: Kevin Thornton’s libsequence and its accompanying analysis toolkit, which includes a utility called compute that is a “mini-DNAsp for the Unix command-line.” As ever, getting this installed on OSX was more of a challenge than desired because of a couple of dependencies (including Boost and GSL), but in the end it was do-able as follows:

$ wget http://sourceforge.net/projects/boost/files/boost/1.47.0/boost_1_47_0.tar.gz
$ tar -xvzf boost_1_47_0.tar.gz
$ cd boost_1_47_0
$ ./bootstrap.sh
$ sudo ./b2 install
$ cd ..

$ wget http://molpopgen.org/software/libsequence/libsequence-1.7.3.tar.gz
$ tar -xvzf libsequence-1.7.3.tar.gz
$ cd libsequence-1.7.3
$ ./configure
$ make
$ sudo make install
$ cd ..

$ wget ftp://ftp.gnu.org/gnu/gsl/gsl-1.15.tar.gz
$ tar -xvzf gsl-1.15.tar.gz
$ cd gsl-1.15
$ ./configure --disable-shared --disable-dependency-tracking
$ make
$ sudo make install
$ cd ..

$ wget http://molpopgen.org/software/analysis/analysis-0.8.0.tar.gz
$ tar -xvzf analysis-0.8.0.tar.gz
$ cd analysis-0.8.0
$ ./configure
$ make
$ sudo make install
$ cd ..
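
The libsequence, GSL and analysis steps above all follow the same download, unpack, configure, build and install pattern (Boost differs because it uses bootstrap.sh and b2). As a rough sketch only, that pattern could be wrapped in a small shell helper. The function names src_dir_from_url and install_from_tarball are made up for this illustration and are not part of any of these packages, and the helper assumes each tarball unpacks into a directory named after the file.

```shell
# Hypothetical helper: derive the unpacked directory name from a tarball URL,
# e.g. .../gsl-1.15.tar.gz -> gsl-1.15 (assumes the tarball unpacks to that name).
src_dir_from_url() {
    basename "$1" .tar.gz
}

# Hypothetical helper: download, unpack, configure, build and install one
# autotools-style package. Arguments after the URL are passed to ./configure.
install_from_tarball() {
    url=$1
    shift
    dir=$(src_dir_from_url "$url")
    wget "$url" &&
    tar -xzf "$dir.tar.gz" &&
    # The subshell keeps the caller's working directory unchanged.
    (cd "$dir" && ./configure "$@" && make && sudo make install)
}

# Usage, mirroring the steps in this post (nothing is run until you call it):
#   install_from_tarball http://molpopgen.org/software/libsequence/libsequence-1.7.3.tar.gz
#   install_from_tarball ftp://ftp.gnu.org/gnu/gsl/gsl-1.15.tar.gz --disable-shared --disable-dependency-tracking
#   install_from_tarball http://molpopgen.org/software/analysis/analysis-0.8.0.tar.gz
```

Chaining the steps with && means a failed download or build stops that package's install rather than carrying on to the next command.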

With this recipe, compute and friends should now be happily installed in /usr/local/bin/.
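
A quick way to confirm the install worked is to ask the shell where it finds the program. The snippet below is a minimal sketch: check_tool is a hypothetical helper name, and compute is the only tool name taken from this post, so add the other analysis programs you care about yourself.

```shell
# Hypothetical helper: report whether a named program can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
        return 1
    fi
}

# compute is the utility named in this post.
check_tool compute || echo "check that /usr/local/bin is on your PATH"
```

If compute is reported as missing even though the install finished without errors, the first thing to look at is whether /usr/local/bin appears in the output of echo $PATH.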

Notes: This protocol was developed on a MacBook Air Intel Core 2 Duo running OSX 10.6.8.

9 Comments

  1. timflutre

    Bio++ (http://biopp.univ-montp2.fr/) seems to be also an interesting library, although I never used it: “Bio++ is a set of C++ libraries for Bioinformatics, including sequence analysis, phylogenetics, molecular evolution and population genetics. Bio++ is fully Object Oriented and is designed to be both easy to use and computer efficient.” According to the documentation, there is no dependency, and the build system uses CMake.

  2. boudin

    hehe…doesn’t work

    $ compute
    compute: error while loading shared libraries: libsequence.so.19: cannot open shared object file: No such file or directory

  3. caseybergman

    Hi @boudin – can you send the complete transcript or overview of your installation? Just having this last fragment doesn’t really help answer why you are having troubles.

  4. Hi Casey,

    Have you used Homebrew much? I’m finding it works much better than MacPorts etc. (not that you used either of those here), and updating the software is as easy as a ‘brew update && brew upgrade’.

    Doing a:

    brew install boost && brew install gsl

    followed by:

    wget -c http://molpopgen.org/software/libsequence/libsequence-1.7.5.tar.gz && wget -c http://molpopgen.org/software/analysis/analysis-0.8.3.tar.gz

    and associated extraction and compilation has things up and running in no time. Although this still requires manual steps for libsequence and analysis, the automated install of the earlier stages makes it much easier to keep the software up-to-date.

    Cheers,

    Steve

  5. caseybergman

    Thanks for this! I have tried homebrew for a few things, but still mostly using Fink because it has bioperl. I will be sure to give homebrew a proper test drive one day.

  6. I forgot to add that you need to download either the ICU4C binary or source (which will need building and installing) from http://site.icu-project.org/download/50 before doing the boost install. Boost.Regex needs the ICU GCC components.

    Downloading the binaries and copying them to /usr/local worked fine for me 🙂

  7. Hi,

    Thank you so much for trying to make it simple for others. I followed your recommendations; however, I get an error while trying to “make” libsequence. I am on OSX 10.9.5, and I installed boost and zlib and aliased g++ to gcc (not clang, but gcc 4.9.1).
    The error is quite obscure to me… suggestions would be welcome.
    Thanks!
    Annabelle
    First error (expected class member or base class name):
    libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -O2 -std=c++11 -Wall -W -Woverloaded-virtual -Wnon-virtual-dtor -Wcast-qual -Wconversion -Wsign-conversion -Wsign-promo -Wsynth -ffor-scope -DHAVE_HTSLIB -DNDEBUG -g -O2 -std=c++11 -MT IOhelp.lo -MD -MP -MF .deps/IOhelp.Tpo -c IOhelp.cc -o IOhelp.o >/dev/null 2>&1
    depbase=`echo hts/bamrecord.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
    /bin/sh ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.. -g -O2 -std=c++11 -Wall -W -Woverloaded-virtual -Wnon-virtual-dtor -Wcast-qual -Wconversion -Wsign-conversion -Wsign-promo -Wsynth -ffor-scope -DHAVE_HTSLIB -DNDEBUG -g -O2 -std=c++11 -MT hts/bamrecord.lo -MD -MP -MF $depbase.Tpo -c -o hts/bamrecord.lo hts/bamrecord.cc &&\
    mv -f $depbase.Tpo $depbase.Plo
    libtool: compile: g++ -DHAVE_CONFIG_H -I. -I.. -g -O2 -std=c++11 -Wall -W -Woverloaded-virtual -Wnon-virtual-dtor -Wcast-qual -Wconversion -Wsign-conversion -Wsign-promo -Wsynth -ffor-scope -DHAVE_HTSLIB -DNDEBUG -g -O2 -std=c++11 -MT hts/bamrecord.lo -MD -MP -MF hts/.deps/bamrecord.Tpo -c hts/bamrecord.cc -fno-common -DPIC -o hts/.libs/bamrecord.o
    hts/bamrecord.cc:203:5: warning: declaration does not declare anything [-Wmissing-declarations]
    std::unique_ptr __block;
    ^~~~~~~~~~~~~~~~~~~~~~~
    hts/bamrecord.cc:220:8: error: expected class member or base class name
    __block(nullptr),
    ^
    :33:17: note: expanded from here
    #define __block __attribute__((__blocks__(byref)))
    ^

  8. caseybergman

    Hi Annabelle –

    These instructions are pretty old and might be out of date. I would suggest following the instructions here: https://github.com/molpopgen/libsequence and if you are still having trouble open an issue here: https://github.com/molpopgen/libsequence/issues

    Best regards,
    Casey

  9. Hi Casey,

    Thanks for your advice. I finally succeeded in installing libsequence. The up-to-date protocol I would recommend is the following:

    I used Homebrew with “sudo brew install libsequence” (http://www.molecularecologist.com/2013/09/analytical-software-management-for-your-mac-homebrew-to-the-rescue/). Homebrew checks for potential dependencies.

    I first got an error from sudo brew install, and had to change the permissions, as suggested here: http://digitizor.com/2014/06/29/fix-cowardly-refusing-sudo-error-brew/.

    Best regards,
    Annabelle


1 Trackbacks/Pingbacks

  1. Getting libsequence and Boost to Play Nice | Monkeyologist 03 04 13
