Express Beta Diversity (EBD)
by Donovan Parks and Rob Beiko
-------------------------------------------------------------------------------
EBD is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
EBD is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with EBD. If not, see <http://www.gnu.org/licenses/>.
Installation:
-------------------------------------------------------------------------------
EBD is a command-line program written in C++. To install EBD, download
and uncompress it with the unzip command:
unzip EBD_1_0_1.zip
To compile EBD on OSX or Linux simply type 'make' from within the source
directory of EBD. The resulting executable will be in the bin directory.
A precompiled executable for Windows is provided in the bin directory.
Please note that even under Windows, EBD must be run from the command-line
(i.e., the DOS prompt).
The ClusterTree program can be used to generate hierarchical cluster trees
from the dissimilarity matrices produced by EBD. The source code is contained
in the cluster_tree directory. Again, just run 'make' to build on OSX or Linux.
An executable is provided for Windows.
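As an illustration of what building a hierarchical cluster tree from a
dissimilarity matrix involves, here is a minimal Python sketch of UPGMA
(average-linkage) clustering. This README does not say which clustering
method ClusterTree uses, so UPGMA is an assumption chosen for illustration.
The function name, the nested-list matrix and the sample names are
hypothetical and do not reflect EBD's file formats.

```python
def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples."""
    # Active clusters: id -> (subtree, number of samples in it).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise dissimilarities keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the two closest clusters.
        (i, j), _ = min(d.items(), key=lambda kv: kv[1])
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster is
        # the size-weighted average of the two old distances.
        new_d = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Drop every entry that involves a merged cluster.
        d = {pair: v for pair, v in d.items()
             if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical 3-sample dissimilarity matrix (symmetric, zero diagonal).
names = ["A", "B", "C"]
dist = [[0.0, 0.2, 0.8],
        [0.2, 0.0, 0.6],
        [0.8, 0.6, 0.0]]
tree = upgma(names, dist)
print(tree)
```

Samples A and B are the closest pair, so they are joined first and C is
attached to that pair last.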
ProjectTree can be used to project a tree onto a specific set of taxa. EBD
assumes a tree contains only taxa present in at least one sample. The source
code is contained in the project_tree directory. Again, just run 'make' to
build on OSX or Linux. An executable is provided for Windows.
Program usage:
-------------------------------------------------------------------------------
Usage: EBD [OPTIONS]
Calculates taxon- and phylogenetic-based beta diversity measures.
Options:
  -h, --help            Produce help message.
  -l, --list-calc       List all supported calculators.
  -u, --unit-tests      Execute unit tests.
  -t, --tree-file       Tree in Newick format (if phylogenetic beta-diversity is desired).
  -s, --seq-count-file  Sequence count file.
  -d, --diss-file       File to write dissimilarity matrix to.
  -c, --calculator      Desired calculator (e.g., Bray-Curtis, Canberra).
  -w, --weighted        Indicates that sequence abundance data should be used.
  -m, --mrca            Apply 'MRCA weightings' to each branch (experimental).
  -r, --strict-mrca     Restrict calculator to MRCA subtree.
  -y, --count           Use count data as opposed to relative proportions.
  -x, --max-data-vecs   Maximum number of profiles (data vectors) to have in memory at once (default = 1000).
  -a, --all             Apply all calculators, then cluster the calculators at the specified threshold.
  -b, --threshold       Correlation threshold for clustering calculators (default = 0.8).
  -o, --output-file     File to write clusters to (default = clusters.txt).
  -v, --verbose         Provide additional information on program execution.
Examples of Use:
./ExpressBetaDiversity -t input.tre -s seq.txt -d output.txt -c Bray-Curtis -w
./ExpressBetaDiversity -t input.tre -s seq.txt -a -b 0.9 -o clusters.txt
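To make the -c and -y options more concrete, here is a minimal Python sketch
of the textbook Bray-Curtis and Canberra dissimilarities between two
samples, computed once on raw counts and once on relative proportions. It
only illustrates the standard formulas and is not EBD's C++ implementation,
which may differ in detail (for example in normalization). The function
names and the example counts are hypothetical.

```python
def to_proportions(counts):
    """Convert raw sequence counts to relative proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def bray_curtis(x, y):
    """Bray-Curtis: sum |x_i - y_i| / sum (x_i + y_i)."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def canberra(x, y):
    """Canberra: sum |x_i - y_i| / (x_i + y_i), skipping taxa absent from both."""
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y) if a + b > 0)


# Hypothetical counts for three taxa in two samples of different depth.
sample_a = [10, 0, 30]
sample_b = [5, 5, 10]

bc_counts = bray_curtis(sample_a, sample_b)
bc_props = bray_curtis(to_proportions(sample_a), to_proportions(sample_b))
cb_counts = canberra(sample_a, sample_b)

print(bc_counts, bc_props, cb_counts)
```

Because the two samples differ in total sequence count (40 versus 20), the
Bray-Curtis value on raw counts (0.5) is larger than on relative proportions
(0.25). This is the kind of difference the choice between count data and
relative proportions controls.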
Verifying software installation:
-------------------------------------------------------------------------------
A set of unit tests is included to verify proper installation of the EBD
software. The unit tests can be run with:
./ExpressBetaDiversity -u
The software should not be used if any of the unit tests fail.
Projecting a phylogenetic tree:
-------------------------------------------------------------------------------
EBD assumes a tree spans only the taxa present in your set of samples. The
program ProjectTree can be used to ensure this is true. It can be run as
follows:
./ProjectTree
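The idea of projecting a tree onto a set of taxa can be sketched in a few
lines of Python: remove every leaf that is not in the set, then collapse
internal nodes left with a single child. This is only an illustration of the
general technique, not the ProjectTree source code. The Node class, the
function names and the choice to add a collapsed node's branch length to its
remaining child are assumptions.

```python
class Node:
    """Minimal tree node: a name, a branch length and a list of children."""

    def __init__(self, name="", length=0.0, children=None):
        self.name = name
        self.length = length
        self.children = children or []


def project(node, keep):
    """Prune the tree in place to the leaves named in `keep`.

    Returns the new subtree root, or None if no kept leaf remains.
    """
    if not node.children:
        return node if node.name in keep else None
    kept = [c for c in (project(ch, keep) for ch in node.children)
            if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Collapse a single-child node, preserving the path length.
        only = kept[0]
        only.length += node.length
        return only
    node.children = kept
    return node


def to_newick(node):
    """Serialize a subtree in Newick form (without the trailing ';')."""
    label = f"{node.name}:{node.length:g}"
    if node.children:
        return "(" + ",".join(to_newick(c) for c in node.children) + ")" + label
    return label


# Hypothetical tree: ((A:1,B:2)X:3,(C:4,D:5)Y:6)root:0;
tree = Node("root", 0.0, [
    Node("X", 3.0, [Node("A", 1.0), Node("B", 2.0)]),
    Node("Y", 6.0, [Node("C", 4.0), Node("D", 5.0)]),
])

# Taxon B is absent from every sample, so it is projected out.
projected = project(tree, {"A", "C", "D"})
newick = to_newick(projected) + ";"
print(newick)
```

In this example, leaf B is removed and node X is collapsed, so A ends up
attached to the root with branch length 1 + 3 = 4.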