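Visualizing QIIME results with Krona

This came about because the charts produced by QIIME's bundled plot_taxa_summary.py script are too ugly, so I decided to do something about it.

I had hoped to find a ready-made solution on biostar, but there was none, so I had to build one myself.

Here is the answer I posted on biostar.org: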

Hi, hope this answer is not too late. :)

convert biom to tsv format

$ biom convert -i single_sample.biom -o single_sample.tsv --to-tsv --table-type "OTU table" --header-key taxonomy

remove the first two rows, which are comments

# take a look at single_sample.tsv
$ head single_sample.tsv 
# Constructed from biom file
#OTU ID H2O taxonomy
346085  140.0   Bacteria; Proteobacteria; Alphaproteobacteria; Caulobacterales; Caulobacteraceae; Brevundimonas; Brevundimonas_bullata
10298   2.0 Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Photorhabdus; Photorhabdus_temperata
122823  3.0 Bacteria; Proteobacteria; Betaproteobacteria; Burkholderiales; Oxalobacteraceae; Massilia; EF516371_s
130468  2.0 Bacteria; Proteobacteria; Alphaproteobacteria; Sphingomonadales; Sphingomonadaceae; Sphingopyxis; Sphingopyxis_witflariensis
139977  38.0    Bacteria; Proteobacteria; Alphaproteobacteria; Sphingomonadales; Erythrobacteraceae; Erythrobacter; Erythrobacter_flavus
121751  2.0 Bacteria; Firmicutes; Clostridia; Clostridiales; Lachnospiraceae; Catonella; JX096343_s
96934   10.0    Bacteria; Proteobacteria; Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae; Pseudomonas; Pseudomonas_guguanensis
95181   4.0 Bacteria; Proteobacteria; Gammaproteobacteria; Pseudomonadales; Moraxellaceae; Acinetobacter; Acinetobacter_radioresistens

# remove the comments
$ grep -v "^#" single_sample.tsv > sample_no_comment.tsv
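
The following steps assume the table holds exactly one sample, so that the count sits in column 2 and the taxonomy in column 3. One way to check this is to count the columns of the "#OTU ID" header line. This is only a sketch: the name count_samples is made up for illustration, and it assumes the last column is the taxonomy added by --header-key taxonomy.

```shell
# count_samples is a hypothetical helper, not part of biom or Krona.
# It reads the "#OTU ID" header row of a biom-converted TSV and prints
# the number of sample columns (all columns minus OTU ID and taxonomy).
count_samples() {
    awk -F '\t' '/^#OTU ID/ { print NF - 2; exit }' "$1"
}

# Example with a tiny hand-made table holding one sample called H2O.
tmp=$(mktemp)
printf '# Constructed from biom file\n#OTU ID\tH2O\ttaxonomy\n346085\t140.0\tBacteria; Proteobacteria\n' > "$tmp"
count_samples "$tmp"
rm -f "$tmp"
```

If this prints anything other than 1, the cut step would leave more than one count column in front of the taxonomy, so the table should be split per sample first.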

remove the first column

$ cut -f 2- sample_no_comment.tsv > sample.tsv
$ head -n2 sample.tsv 
140.0   Bacteria; Proteobacteria; Alphaproteobacteria; Caulobacterales; Caulobacteraceae; Brevundimonas; Brevundimonas_bullata
2.0 Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Photorhabdus; Photorhabdus_temperata

replace the "; " with "\t"

$ sed -i 's/;\s*/\t/g' sample.tsv
$ head -n2 sample.tsv 
140.0   Bacteria    Proteobacteria  Alphaproteobacteria Caulobacterales Caulobacteraceae    Brevundimonas   Brevundimonas_bullata
2.0 Bacteria    Proteobacteria  Gammaproteobacteria Enterobacteriales      Enterobacteriaceae   Photorhabdus    Photorhabdus_temperata
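
Note that \s and \t in the sed expression are GNU extensions, and BSD/macOS sed also wants an argument after -i. If that is a problem, the same split can be sketched with awk, which understands "\t" inside strings. The name split_taxonomy is just an illustration, and this version only eats spaces after the ";" (not every kind of whitespace).

```shell
# split_taxonomy is a hypothetical helper: it replaces every ";" plus any
# following spaces with a single tab, reading files or standard input.
split_taxonomy() {
    awk '{ gsub(/;[ ]*/, "\t"); print }' "$@"
}

# Example on one line shaped like the sample.tsv rows above.
printf '140.0\tBacteria; Proteobacteria; Alphaproteobacteria\n' | split_taxonomy
```

Unlike sed -i, this writes to standard output, so redirect it to a new file rather than editing sample.tsv in place.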

now we can make a pie chart with ktImportText (one of the Krona tools)

$ ktImportText -n enjoy_your_life -o sample.krona.html sample.tsv
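
If the chart comes out empty or looks odd, it may help to check the input file. In the usage shown here, each line of sample.tsv should be a count followed by tab-separated taxonomy levels. The check below is a rough sketch under that assumption; check_krona_text is a made-up name, and it only accepts plain decimal counts such as 140.0.

```shell
# check_krona_text is a hypothetical helper, not a Krona tool.
# It reports lines whose first tab-separated field is not a plain number,
# or that have no taxonomy columns, and returns non-zero if any are found.
check_krona_text() {
    awk -F '\t' '
        BEGIN { bad = 0 }
        $1 !~ /^[0-9]+(\.[0-9]+)?$/ { printf "line %d: count is not a number: %s\n", NR, $1; bad = 1 }
        NF < 2 { printf "line %d: no taxonomy columns\n", NR; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Example with a tiny well-formed file.
tmp=$(mktemp)
printf '140.0\tBacteria\tProteobacteria\n2.0\tBacteria\tFirmicutes\n' > "$tmp"
check_krona_text "$tmp" && echo "looks fine"
rm -f "$tmp"
```

A line that still contains "; " instead of tabs shows up as a bad count, which usually means the sed step was skipped.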

open sample.krona.html in your web browser and you'll see this:

[Figure: Krona snapshot of the enjoy_your_life chart]

all in one command (optional)

$ awk -F '\t|; *' 'BEGIN{OFS="\t"} FNR > 2 {$1=""; print $0}' single_sample.tsv | cut -f 2- > sample.tsv