Toucan Review
Toucan is a Java front-end application for analysing the cis-regulatory
logic of coregulated genes. It provides an interface to online services
that perform the actual calculations. One of these services,
"ModuleSearcher", can be used to find the optimal cis-regulatory module
(CRM), i.e. the best combination of transcription factor binding sites,
in the regulatory sequences of a set of co-regulated genes.
Toucan software is available from:
http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html
Advantages:
* Uses a genetic algorithm and an A* algorithm to find the best
combination of models;
* The A* algorithm (based on branch-and-bound search) is claimed to
find the best possible solution;
* The genetic algorithm is more customizable than our implementation:
the user can specify the mutation probability (in ours, always one
model of the module is mutated) and the percentage of survivors
(always 50% in our current implementation);
* Optionally, the resulting module may contain multiple copies of one
model;
* Optionally, sites may or may not overlap;
* Uses a distance between models (based on Kullback-Leibler
divergence), which helps cluster similar models into classes;
* Uses a background model to calculate the score.
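To illustrate the model distance mentioned above, here is a minimal sketch in Python. It assumes that each model is a position frequency matrix given as a list of columns over A, C, G, T, that both models have the same width, and that the distance is the symmetrised Kullback-Leibler divergence averaged over columns. The function names, the pseudocount and the example matrices are hypothetical; the exact formula used by Toucan is not given in this review.

```python
import math

def smooth(column, pseudocount):
    """Add a pseudocount and renormalise so that no probability is zero."""
    total = sum(column) + pseudocount * len(column)
    return [(x + pseudocount) / total for x in column]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def model_distance(model_a, model_b, pseudocount=0.01):
    """Symmetrised KL divergence, averaged over the columns of two models.

    Assumption: both models are non-empty and have the same width; real
    tools usually also try different alignments (shifts) of the models.
    """
    if not model_a or len(model_a) != len(model_b):
        raise ValueError("models must be non-empty and of equal width")
    total = 0.0
    for col_a, col_b in zip(model_a, model_b):
        p = smooth(col_a, pseudocount)
        q = smooth(col_b, pseudocount)
        total += 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return total / len(model_a)

# Hypothetical two-column models (columns are frequencies of A, C, G, T).
model_1 = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
model_2 = [[0.1, 0.1, 0.1, 0.7], [0.1, 0.7, 0.1, 0.1]]

print(model_distance(model_1, model_1))  # identical models
print(model_distance(model_1, model_2))  # models differing in one column
```

Models whose pairwise distance falls below some chosen threshold could then be grouped into one class, so that near-duplicate models are not treated as independent members of a module.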
Disadvantages:
* The Java implementation possibly slows down computations
significantly. A C++ implementation may be considerably faster;
* Solving an NP-complete problem with the A* algorithm takes a huge
amount of CPU time and memory;
* CMA (CompositeModuleAnalyst) accepts two sequence sets (experimental
and control), and its total score tends to the maximum when the
experimental set contains many sequences that fit the found module
while the control set contains few. The ModuleSearcher implementation
just sums the scores of the individual sequences to produce the
sequence set score;
* CMA accepts sequence sets with expression values and tries to
maximize the correlation between the CRM score on each sequence and
its expression value. Thus CMA can work with microarray experiments;
* CMA tries to find not just the best set of matrices, but also a
cutoff value for each matrix;
* CMA is able to find the best boolean expression over the input
models. For example, the result can look like this:
not (p(M1)) and (p(M2) or p(M3)) and p(M4),
where p(Mi) indicates whether model Mi occurs in the current sequence.
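The boolean expression above can be sketched in Python as follows. This is only an illustration: each sequence is represented by the set of model names that have at least one site on it, and the helper names and the example data are hypothetical. It shows how such a rule separates an experimental set from a control set; it does not reproduce the CMA scoring function.

```python
def p(model, hits):
    """True if the model occurs at least once on the sequence.

    `hits` is the set of model names found on that sequence (assumed to
    have been computed beforehand, e.g. with per-matrix cutoffs).
    """
    return model in hits

def example_rule(hits):
    """not (p(M1)) and (p(M2) or p(M3)) and p(M4)"""
    return (not p("M1", hits)) and (p("M2", hits) or p("M3", hits)) and p("M4", hits)

def fraction_matching(rule, sequences):
    """Fraction of sequences in a set that satisfy the rule."""
    if not sequences:
        return 0.0
    return sum(1 for hits in sequences if rule(hits)) / len(sequences)

# Hypothetical data: one set of model hits per sequence.
experimental = [{"M2", "M4"}, {"M3", "M4"}, {"M1", "M2", "M4"}]
control = [{"M1", "M4"}, {"M2"}, {"M2", "M4"}]

print(fraction_matching(example_rule, experimental))  # 2 of 3 sequences match
print(fraction_matching(example_rule, control))       # 1 of 3 sequences match
```

A rule is a good candidate when it matches a large fraction of the experimental set and a small fraction of the control set, which is the behaviour described for the CMA total score.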
Comparison of Toucan and CompositeModuleAnalyst
CMA is faster, because it is written in C++
CMA includes a Metropolis algorithm
CMA can work with microarray experiments
CMA finds cutoff values for best module
CMA can find boolean expressions
CMA has no client-server implementation
CMA has fewer features in its genetic algorithm
CMA cannot guarantee finding the best module
CMA cannot handle modules with repeated models
CMA cannot group models into classes