How to find a species' position in the NCBI Taxonomy from its Latin name
- March 3, 2020
- Notes
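Problem description:
I want to know which order, family, and genus a given species is assigned to in the NCBI classification system. For a single species this can be looked up by hand on the NCBI website, but how can it be done when there are a great many species?
While reading the ete3 documentation earlier, I came across a feature that does this: http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html. I may need it soon, so I am recording the code I used here. (First, install ete3: I have Anaconda3 on my Windows 10 machine, so running pip install ete3 in a Command Prompt window is all it takes.)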
- Single species, using pomegranate (Punica granatum) as an example
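from ete3 import NCBITaxa

# On first use, NCBITaxa() downloads the NCBI taxonomy dump and builds a local database
ncbi = NCBITaxa()

# Map the Latin name to its NCBI taxid(s)
name2taxid = ncbi.get_name_translator(["Punica granatum"])
for name, taxids in name2taxid.items():
    # Taxids from the root down to the species itself
    lineage = ncbi.get_lineage(taxids[0])
    names = ncbi.get_taxid_translator(lineage)
    for taxid in lineage:
        print(names[taxid])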
Output
root
cellular organisms
Eukaryota
Viridiplantae
Streptophyta
Streptophytina
Embryophyta
Tracheophyta
Euphyllophyta
Spermatophyta
Magnoliophyta
Mesangiospermae
eudicotyledons
Gunneridae
Pentapetalae
rosids
malvids
Myrtales
Lythraceae
Punica
Punica granatum
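The question at the top asks specifically for the order, family, and genus, while the loop above prints every level of the lineage. Below is a minimal sketch of how the lineage could be filtered by rank, assuming ete3's `get_rank` method (which maps taxids to rank names). The helper names `pick_ranks` and `lookup_ranks` are hypothetical, and the sample ids and ranks in the demo are placeholders for illustration, not real NCBI taxids.

```python
WANTED_RANKS = ("order", "family", "genus")


def pick_ranks(lineage, ranks, names, wanted=WANTED_RANKS):
    """Return {rank: name} for the wanted ranks that occur in a lineage."""
    picked = {}
    for taxid in lineage:
        rank = ranks.get(taxid)
        if rank in wanted:
            picked[rank] = names[taxid]
    return picked


def lookup_ranks(species_name, wanted=WANTED_RANKS):
    """Hypothetical wrapper: query the local ete3 taxonomy database."""
    from ete3 import NCBITaxa  # imported here so pick_ranks works without ete3

    ncbi = NCBITaxa()
    name2taxid = ncbi.get_name_translator([species_name])
    if not name2taxid:
        return {}  # name not found in the database
    taxid = list(name2taxid.values())[0][0]
    lineage = ncbi.get_lineage(taxid)
    ranks = ncbi.get_rank(lineage)              # {taxid: rank name}
    names = ncbi.get_taxid_translator(lineage)  # {taxid: scientific name}
    return pick_ranks(lineage, ranks, names, wanted)


# Demo on placeholder data (ids 1-5 are made up, not real NCBI taxids)
demo_lineage = [1, 2, 3, 4, 5]
demo_ranks = {1: "no rank", 2: "order", 3: "family", 4: "genus", 5: "species"}
demo_names = {1: "root", 2: "Myrtales", 3: "Lythraceae", 4: "Punica",
              5: "Punica granatum"}
print(pick_ranks(demo_lineage, demo_ranks, demo_names))
```

With ete3 installed and the database built, `lookup_ranks("Punica granatum")` should return the same kind of dictionary from the real data. Keeping the filtering in a separate function means it can be checked without touching the database.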
- Multiple species: put the Latin names in a text file, one per line
Lumnitzera littorea
Punica granatum
Heimia myrtifolia
Sonneratia alba
Epilobium ulleungensis
Code
import sys
from ete3 import NCBITaxa

input_file = sys.argv[1]
output_file = sys.argv[2]

ncbi = NCBITaxa()

with open(input_file, "r") as fr, open(output_file, "w") as fw:
    for line in fr:
        species_name = line.strip()
        # Map the Latin name to its NCBI taxid(s)
        name2taxid = ncbi.get_name_translator([species_name])
        for name, taxids in name2taxid.items():
            lineage = ncbi.get_lineage(taxids[0])
            names = ncbi.get_taxid_translator(lineage)
            # One comma-separated lineage per line, from the root down to the species
            fw.write(",".join(names[taxid] for taxid in lineage) + "\n")
            print(species_name + ":", "OK")

# Usage
python .\get_species_placement_in_NCBI.py .\Organism_name.txt placement.txt

# Output
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Combretaceae,Lumnitzera,Lumnitzera littorea
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Punica,Punica granatum
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Heimia,Heimia myrtifolia
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Sonneratia,Sonneratia alba
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Onagraceae,Onagroideae,Epilobieae,Epilobium,Epilobium ulleungensis
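One thing to watch for with a long list: a name that `get_name_translator` does not recognize (a typo, or a synonym NCBI does not hold) is simply absent from the returned dictionary, so the script writes no row for it and prints no warning. The following is a rough sketch of how such names could be reported before processing. The helpers `read_names`, `split_found_missing`, and `check_names` are hypothetical, and the sketch assumes the returned dictionary is keyed by the names exactly as they were passed in.

```python
def read_names(lines):
    """Strip whitespace and drop blank lines from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def split_found_missing(species_names, name2taxid):
    """Split names into those present in name2taxid and those absent from it."""
    found = [name for name in species_names if name in name2taxid]
    missing = [name for name in species_names if name not in name2taxid]
    return found, missing


def check_names(path):
    """Hypothetical wrapper: look up every name in the file with one query."""
    from ete3 import NCBITaxa  # imported here so the helpers work without ete3

    ncbi = NCBITaxa()
    with open(path, "r") as fr:
        species_names = read_names(fr)
    # get_name_translator accepts a list, so all names go in a single call
    name2taxid = ncbi.get_name_translator(species_names)
    return split_found_missing(species_names, name2taxid)
```

Calling `check_names("Organism_name.txt")` and printing the second element of the result would list the names that need checking by hand on the NCBI website.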