PICA
Phenotype Investigation with Classification Algorithms (PICA) is a Python framework for testing genotype-phenotype association algorithms.
PICA was developed by Norman MacDonald (norman@cs.dal.ca) and is released under the Creative Commons Attribution-ShareAlike 3.0 License.
Command-line Interface
Use the option -h for help with any command.
- train: Train a given data mining algorithm and write the resulting model to a file.
  Example usage:
    train.py --algorithm cpar.CPARTrainer \
        --samples examples/genotype_prokaryote.profile \
        --classes examples/phenotype.profile \
        --targetclass THERM \
        --output output.rules
- test: Test a previously trained model with a given classification algorithm.
  Example usage:
    test.py --algorithm cpar.CPARClassifier \
        --samples examples/genotype_prokaryote.profile \
        --classes examples/phenotype.profile \
        --targetclass THERM \
        --model_filename output.rules \
        --model_accuracy mi
- crossvalidate: Run replicated cross-validation with the given training and classification algorithms.
  Example usage:
    crossvalidate.py --training_algorithm cpar.CPARTrainer \
        --classification_algorithm cpar.CPARClassifier \
        --accuracy_measure mi \
        --replicates 10 \
        --folds 5 \
        --samples examples/genotype_prokaryote.profile \
        --classes examples/phenotype.profile \
        --targetclass THERM \
        --output_filename results.txt \
        --metadata examples/taxonomic_confounders_propagated.txt
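To make the meaning of the replicates and folds options concrete, the following is a minimal sketch of how replicated k-fold cross-validation partitions a sample set. It is a generic illustration written for this page, not PICA's own code: the function name replicated_kfold, the seed parameter and the interleaved fold assignment are all assumptions.

```python
import random


def replicated_kfold(sample_ids, replicates=10, folds=5, seed=0):
    """Yield (replicate, fold, train_ids, test_ids) tuples.

    Hypothetical helper for illustration only; PICA's internal
    partitioning may differ.
    """
    rng = random.Random(seed)
    ids = list(sample_ids)
    for r in range(replicates):
        shuffled = ids[:]
        rng.shuffle(shuffled)  # a fresh random ordering for every replicate
        for f in range(folds):
            # Fold f holds out every folds-th sample, starting at offset f.
            test_ids = shuffled[f::folds]
            held_out = set(test_ids)
            train_ids = [s for s in shuffled if s not in held_out]
            yield r, f, train_ids, test_ids


# 10 replicates of 5-fold cross-validation over 20 hypothetical samples.
splits = list(replicated_kfold(range(20), replicates=10, folds=5))
print(len(splits))
```

Each replicate reshuffles the samples, so the 10 x 5 settings used in the command above correspond to 50 train/test rounds, with every sample held out exactly once per replicate.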
Python API
The following shortened example sets up a paired test between mutual information (MI) and conditionally weighted mutual information (CWMI). The CWMIRankFeatureSelector class selects each set of features, and the LIBSVM interface is used to test each set.
See util/batch_validate.py for a more complete example of setting up a programmatic comparison.
# Shortened example: samples, confounders_filename and root_output are
# assumed to be defined earlier and are not shown here.
test_configurations = []
trainer = libSVMTrainer(kernel_type="LINEAR", C=5)
classifier = libSVMClassifier()
# Build one test configuration per feature-ranking score.
for score in ("mi", "cwmi"):
    feature_selector = CWMIRankFeatureSelector(
        confounders_filename=confounders_filename,
        scores=(score,),
        features_per_class=10,
        confounder="order")
    tc = TestConfiguration(score, feature_selector, trainer, classifier)
    test_configurations.append(tc)
# Cross-validate all configurations with 10 replicates of 5 folds.
crossvalidator = CrossValidation(samples=samples,
                                 parameters=None,
                                 replicates=10,
                                 folds=5,
                                 test_configurations=test_configurations,
                                 root_output=root_output)
crossvalidator.crossvalidate()
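For readers unfamiliar with the "mi" score, the sketch below shows one plain way to compute the mutual information between a presence/absence feature and a class label, and to rank features by it. It is a textbook formulation for illustration only and makes no claim about how CWMIRankFeatureSelector is implemented. The function names, the gene names and the toy data are hypothetical, and the conditional weighting used by "cwmi" is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(feature, labels):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), expressed with raw counts
        mi += (count / n) * log2(count * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(profiles, labels, top=10):
    """Return the names of the `top` features with the highest MI."""
    scores = {name: mutual_information(column, labels)
              for name, column in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Toy data (hypothetical): four organisms, three genes, one phenotype.
labels = ["YES", "YES", "NO", "NO"]
profiles = {
    "gene_a": [1, 1, 0, 0],  # perfectly tracks the phenotype
    "gene_b": [1, 0, 1, 0],  # independent of the phenotype
    "gene_c": [1, 1, 1, 0],  # partially informative
}
ranking = rank_features(profiles, labels, top=3)
print(ranking)
```

On this toy data gene_a scores 1 bit, gene_b scores 0 bits and gene_c falls in between, so the ranking is gene_a, gene_c, gene_b.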