Difference between revisions of "PICA"
From Bioinformatics Software
Revision as of 20:15, 12 April 2010
Phenotype Investigation with Classification Algorithms (PICA) is a Python framework for testing genotype-phenotype association algorithms.
Command-line Interface
Use the option -h for help with any command.
- train: Train a given data mining algorithm and write the resulting model to a file.
Example usage:
train.py --algorithm cpar.CPARTrainer \
    --samples examples/genotype_prokaryote.profile \
    --classes examples/phenotype.profile \
    --targetclass THERM \
    --output output.rules
- test: Test a given model with a classification algorithm.
Example usage:
test.py --algorithm cpar.CPARClassifier \
    --samples examples/genotype_prokaryote.profile \
    --classes examples/phenotype.profile \
    --targetclass THERM \
    --model_filename output.rules \
    --model_accuracy mi
- crossvalidate: Train and test over multiple replicates and folds with the given training and classification algorithms.
Example usage:
crossvalidate.py --training_algorithm cpar.CPARTrainer \
    --classification_algorithm cpar.CPARClassifier \
    --accuracy_measure mi \
    --replicates 10 \
    --folds 5 \
    --samples examples/genotype_prokaryote.profile \
    --classes examples/phenotype.profile \
    --targetclass THERM \
    --output_filename results.txt \
    --metadata examples/taxonomic_confounders_propagated.txt
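To make the --replicates and --folds options above more concrete, here is a minimal sketch of how replicated k-fold splitting is commonly done. It is not taken from PICA's source: the function name replicated_folds, the seed parameter, and the sample identifiers are hypothetical, and PICA's own splitting may differ (for example, it may stratify by class).

```python
import random

def replicated_folds(sample_ids, replicates=10, folds=5, seed=0):
    """Yield (replicate, fold, train_ids, test_ids) tuples.

    Hypothetical helper for illustration only; not part of PICA.
    Each replicate reshuffles the samples, then every fold is held
    out once for testing while the remaining samples form the
    training set.
    """
    rng = random.Random(seed)
    ids = list(sample_ids)
    for replicate in range(replicates):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        for fold in range(folds):
            # Every folds-th sample, starting at offset `fold`, is held out.
            test_ids = shuffled[fold::folds]
            held_out = set(test_ids)
            train_ids = [s for s in shuffled if s not in held_out]
            yield replicate, fold, train_ids, test_ids

# Made-up sample identifiers standing in for organisms in a profile file.
example_ids = ["org%d" % i for i in range(20)]
splits = list(replicated_folds(example_ids, replicates=10, folds=5))
```

With 10 replicates and 5 folds, this produces 50 train/test splits, so each configuration is trained and tested 50 times.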
Python API
The shortened example below sets up a paired test between mutual information (mi) and conditionally weighted mutual information (cwmi). It uses the CWMIRankFeatureSelector class to select features with each score and the LIBSVM interface to train and test on each set of features.
See util/batch_validate.py for a more detailed example of setting up a programmatic comparison.
# Imports are omitted here; samples, confounders_filename and root_output
# are assumed to be defined beforehand (see util/batch_validate.py).
test_configurations = []
trainer = libSVMTrainer(kernel_type="LINEAR", C=5)
classifier = libSVMClassifier()

# Build one test configuration per feature-ranking score.
for score in ("mi", "cwmi"):
    feature_selector = CWMIRankFeatureSelector(confounders_filename=confounders_filename,
                                               scores=(score,),
                                               features_per_class=10,
                                               confounder="order")

    tc = TestConfiguration(score, feature_selector, trainer, classifier)
    test_configurations.append(tc)

# Run every configuration over the same replicates and folds.
crossvalidator = CrossValidation(samples=samples,
                                 parameters=None,
                                 replicates=10,
                                 folds=5,
                                 test_configurations=test_configurations,
                                 root_output=root_output)
crossvalidator.crossvalidate()
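For readers unfamiliar with the "mi" score used above, the following is a minimal sketch of plain mutual information between one discrete feature and the class labels, measured in bits. It is an illustration under stated assumptions, not PICA's implementation: the function name mutual_information and the toy data are hypothetical, and the conditionally weighted variant ("cwmi") is not shown.

```python
from collections import Counter
from math import log

def mutual_information(feature, labels):
    """Mutual information (in bits) between two equal-length discrete sequences.

    Hypothetical helper for illustration only; not part of PICA.
    """
    n = len(feature)
    joint = Counter(zip(feature, labels))   # counts of (x, y) pairs
    count_x = Counter(feature)              # marginal counts of x
    count_y = Counter(labels)               # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (float(c) / n) * log(float(c) * n / (count_x[x] * count_y[y]), 2)
    return mi

# Toy data: gene presence/absence in four organisms and a made-up phenotype.
phenotype = ["YES", "YES", "NO", "NO"]
associated_gene = [1, 1, 0, 0]     # tracks the phenotype exactly
unrelated_gene = [1, 0, 1, 0]      # independent of the phenotype

mi_associated = mutual_information(associated_gene, phenotype)
mi_unrelated = mutual_information(unrelated_gene, phenotype)
```

On this toy data, the gene that tracks the phenotype exactly scores 1 bit and the independent gene scores 0, which is why ranking features by this score favors those most informative about the target class.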