A simple example of using the d2-vlmc pipeline.
Example data:
Four test samples are used in this example. The files are small (only about 10 KB each) and are in “fa” format. They are compressed into a zip file (d2-vlmc-example-data), which can be downloaded from here.
Concrete steps of the d2-vlmc pipeline:
The d2-vlmc pipeline comprises the following steps. The results of each step can be found here.
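The commands in the steps below read from and write to several folders under vlmc-pipeline-test/. As a minimal sketch, assuming the scripts do not create these folders themselves (this is not stated in the original), the layout can be prepared up front:

```shell
# Assumption: the pipeline scripts expect their output folders to exist already.
# Folder names are taken from the -t, -p, -m and -o arguments used in the steps.
root=vlmc-pipeline-test
mkdir -p "$root/tuplecount" \
         "$root/markovfile" \
         "$root/dissfile/d2S" \
         "$root/dissfile/d2Star"
```

mkdir -p is harmless if the folders are already present, so this can be rerun safely.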
- Step 1: counting tuples
for((i=1;i<=10;i++));
do
echo $i
./TupleCount10bp.py -l pipeline.txt -k $i -t vlmc-pipeline-test/tuplecount/
done
- Step 2: calculating Markov probability
python ./VLMC327proliulin.py -i pipeline1.txt -t ./vlmc-pipeline-test/tuplecount/ -K 5.0 -p ./vlmc-pipeline-test/markovfile/
python ./VLMC327proliulin.py -i pipeline2.txt -t ./vlmc-pipeline-test/tuplecount/ -K 5.0 -p ./vlmc-pipeline-test/markovfile/
python ./VLMC327proliulin.py -i pipeline3.txt -t ./vlmc-pipeline-test/tuplecount/ -K 5.0 -p ./vlmc-pipeline-test/markovfile/
python ./VLMC327proliulin.py -i pipeline4.txt -t ./vlmc-pipeline-test/tuplecount/ -K 5.0 -p ./vlmc-pipeline-test/markovfile/
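The four Step 2 commands differ only in the list file (pipeline1.txt to pipeline4.txt), so they can be produced by a loop. The following is a sketch only; the helper name markov_commands is hypothetical, and it prints the commands rather than running them so they can be reviewed first:

```shell
# Print one Step 2 command per list file (pipeline1.txt ... pipeline4.txt).
# markov_commands is a hypothetical helper, not part of the pipeline.
markov_commands() {
  for n in 1 2 3 4; do
    echo "python ./VLMC327proliulin.py -i pipeline$n.txt -t ./vlmc-pipeline-test/tuplecount/ -K 5.0 -p ./vlmc-pipeline-test/markovfile/"
  done
}

markov_commands          # review the generated commands
# markov_commands | sh   # uncomment to execute them
```

Printing first keeps the sketch safe to try in a folder where the pipeline scripts are not yet installed.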
- Step 3: calculating dissimilarity matrix
for((i=2;i<=9;i++));
do
echo $i
#d2S command
./calculatedissimiliraty.py -l pipeline.txt -k $i -r 0 -d d2S -m vlmc-pipeline-test/markovfile/ -o vlmc-pipeline-test/dissfile/d2S/pipelinek"$i"d2S
#d2Star command
./calculatedissimiliraty.py -l pipeline.txt -k $i -r 0 -d d2Star -m vlmc-pipeline-test/markovfile/ -o vlmc-pipeline-test/dissfile/d2Star/pipelinek"$i"d2Star
done
- Step 4: generating clustering tree
1: write a clustering.r file
source('/home/yingwang/weinan/vlmc/ClusterTreeby_upgma.R');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek2d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek3d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek4d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek5d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek6d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek7d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek8d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek9d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2S/pipelinek10d2S.dissimilaritymatrix.txt','pipelined2S.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek2d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek3d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek4d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek5d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek6d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek7d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek8d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek9d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
ClusterTreebyupgma('../vlmc-pipeline-test/dissfile/d2Star/pipelinek10d2Star.dissimilaritymatrix.txt','pipelined2Star.tree.nwk');
2: change to the folder where the clustering trees will be stored, then run the script from there.
Rscript ../clustering.r
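The clustering.r file from sub-step 1 is highly repetitive, so it can also be generated by a loop. The sketch below makes two assumptions that differ from the listing above: k runs over 2 to 9 to match the Step 3 loop, and each tree is written to its own file (hypothetical names such as pipelinek2d2S.tree.nwk), on the assumption that ClusterTreebyupgma would otherwise overwrite a shared output file:

```shell
# Generate clustering.r with one ClusterTreebyupgma call per measure and k.
# Assumptions: k = 2..9 as in Step 3; per-k output names are hypothetical.
out=clustering.r
echo "source('/home/yingwang/weinan/vlmc/ClusterTreeby_upgma.R');" > "$out"
for d in d2S d2Star; do
  for k in 2 3 4 5 6 7 8 9; do
    in_file="../vlmc-pipeline-test/dissfile/$d/pipelinek${k}${d}.dissimilaritymatrix.txt"
    tree_file="pipelinek${k}${d}.tree.nwk"
    echo "ClusterTreebyupgma('$in_file','$tree_file');" >> "$out"
  done
done
```

The generated file can then be run with Rscript from the tree folder, as in sub-step 2.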