How to find a species' position in the NCBI Taxonomy from its Latin name
- March 3, 2020
- Notes
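Problem description:
For a given species, I want to know which order, family, and genus it is assigned to in the NCBI classification system. A single species can be looked up by hand on the NCBI website, but how can this be done when there are many species?
While reading the ete3 documentation some time ago I came across a feature for exactly this: http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html. I may need it soon, so I am recording the code I use here. (The first step is installing ete3: I have Anaconda3 on my Windows 10 machine, and running pip install ete3 in a command prompt window is all it takes.)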
- Single species, using pomegranate (Punica granatum) as an example
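from ete3 import NCBITaxa

ncbi = NCBITaxa()
# Map the scientific name to its NCBI taxid(s)
name2taxid = ncbi.get_name_translator(["Punica granatum"])
for name, taxids in name2taxid.items():
    # Taxids on the path from the root down to the species
    lineage = ncbi.get_lineage(taxids[0])
    # Map every taxid in the lineage back to its scientific name
    names = ncbi.get_taxid_translator(lineage)
    for taxid in lineage:
        print(names[taxid])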
Output
root
cellular organisms
Eukaryota
Viridiplantae
Streptophyta
Streptophytina
Embryophyta
Tracheophyta
Euphyllophyta
Spermatophyta
Magnoliophyta
Mesangiospermae
eudicotyledons
Gunneridae
Pentapetalae
rosids
malvids
Myrtales
Lythraceae
Punica
Punica granatum
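The lineage above lists every level, but the original question only asks for the order, family, and genus. A minimal sketch of how those three could be picked out, assuming ete3's `get_rank` returns a taxid-to-rank dictionary with rank strings such as "order", "family", and "genus". The helper names `pick_ranks` and `lookup` are made up for this note and are not part of ete3. As far as I know, `NCBITaxa()` also downloads and builds a local copy of the taxonomy database the first time it is used, so the first call can be slow.

```python
def pick_ranks(lineage, names, ranks, wanted=("order", "family", "genus")):
    """Return {rank: name} for the wanted ranks that occur in a lineage.

    lineage: list of taxids, root first
    names:   dict taxid -> scientific name
    ranks:   dict taxid -> rank string
    """
    wanted_set = set(wanted)
    found = {}
    for taxid in lineage:
        rank = ranks.get(taxid)
        if rank in wanted_set:
            found[rank] = names[taxid]
    return found


def lookup(species_name):
    """Hypothetical helper: order/family/genus of one species via ete3."""
    # Imported here so pick_ranks can be used without ete3 installed
    from ete3 import NCBITaxa

    ncbi = NCBITaxa()
    matches = list(ncbi.get_name_translator([species_name]).values())
    if not matches:
        return {}  # name not found in the local taxonomy database
    lineage = ncbi.get_lineage(matches[0][0])
    names = ncbi.get_taxid_translator(lineage)
    ranks = ncbi.get_rank(lineage)
    return pick_ranks(lineage, names, ranks)


# Example call (needs ete3 and its local taxonomy database):
# print(lookup("Punica granatum"))
```

Going by the lineage printed above, the pomegranate lookup should come back with Myrtales, Lythraceae, and Punica, although I have not pasted a real run here.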
- Multiple species: put the Latin names in a text file, one per line
Lumnitzera littorea
Punica granatum
Heimia myrtifolia
Sonneratia alba
Epilobium ulleungensis
Code
import sys
from ete3 import NCBITaxa

input_file = sys.argv[1]   # text file with one species name per line
output_file = sys.argv[2]  # one comma-separated lineage is written per species

ncbi = NCBITaxa()

with open(input_file, "r") as fr, open(output_file, "w") as fw:
    for line in fr:
        species_name = line.strip()
        name2taxid = ncbi.get_name_translator([species_name])
        # A name that cannot be matched yields an empty dict and is skipped
        for name, taxids in name2taxid.items():
            lineage = ncbi.get_lineage(taxids[0])
            names = ncbi.get_taxid_translator(lineage)
            # Join the lineage names with commas, root first
            fw.write(",".join(names[taxid] for taxid in lineage) + "\n")
            print(species_name + ":", "OK")

# Usage
python .\get_species_placement_in_NCBI.py .\Organism_name.txt placement.txt
# Output
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Combretaceae,Lumnitzera,Lumnitzera littorea
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Punica,Punica granatum
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Heimia,Heimia myrtifolia
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Sonneratia,Sonneratia alba
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Onagraceae,Onagroideae,Epilobieae,Epilobium,Epilobium ulleungensis
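One thing to watch with a long list: a name that ete3 cannot match produces no output line, so a misspelled name drops out without any warning. Below is a rough sketch of one way to list such names. It assumes `get_name_translator` keys its result by the names exactly as they were passed in, which is my understanding of ete3 and worth checking against your installed version. `read_names`, `split_found_missing`, and `resolve` are hypothetical helper names, not ete3 functions.

```python
def read_names(path):
    """Read one species name per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as fr:
        return [line.strip() for line in fr if line.strip()]


def split_found_missing(species_names, name2taxid):
    """Split names into (name, first taxid) pairs and names with no match."""
    found = [(n, name2taxid[n][0]) for n in species_names if n in name2taxid]
    missing = [n for n in species_names if n not in name2taxid]
    return found, missing


def resolve(path):
    """Hypothetical helper: look up every name in the file with one query."""
    # Imported here so the helpers above work without ete3 installed
    from ete3 import NCBITaxa

    ncbi = NCBITaxa()
    species_names = read_names(path)
    name2taxid = ncbi.get_name_translator(species_names)
    return split_found_missing(species_names, name2taxid)


# Example call (needs ete3 and its local taxonomy database):
# found, missing = resolve("Organism_name.txt")
# print("not found:", missing)
```

Passing the whole list to `get_name_translator` in a single call also avoids one database query per species, which may matter once the list gets long.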