How to find a species' position in the NCBI Taxonomy from its Latin name

Problem description:

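I want to know which order, family, and genus a given species is assigned to in the NCBI classification system. For a single species I can search the NCBI website by hand, but how can this be done when there are many species?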

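While reading the ete3 documentation a while ago, I came across a feature that does this: http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html. I will probably need it soon, so I am recording the code I use here. (First, install ete3: I have Anaconda3 on my Windows 10 machine, so running pip install ete3 in a Command Prompt window was all it took.)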

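  • Single species, using pomegranate (Punica granatum) as an example
from ete3 import NCBITaxa

# On first use, ete3 downloads the NCBI taxonomy dump and builds a local database
ncbi = NCBITaxa()

# Map the Latin name to its NCBI taxid(s)
name2taxid = ncbi.get_name_translator(["Punica granatum"])
for name, taxids in name2taxid.items():
    # Taxids along the path from the root down to the species
    lineage = ncbi.get_lineage(taxids[0])
    names = ncbi.get_taxid_translator(lineage)
    for taxid in lineage:
        print(names[taxid])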

Output

root
cellular organisms
Eukaryota
Viridiplantae
Streptophyta
Streptophytina
Embryophyta
Tracheophyta
Euphyllophyta
Spermatophyta
Magnoliophyta
Mesangiospermae
eudicotyledons
Gunneridae
Pentapetalae
rosids
malvids
Myrtales
Lythraceae
Punica
Punica granatum
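
The lineage above lists every level, while the original question only asks for the order, family, and genus. A minimal sketch of how those could be picked out is below. It assumes `ncbi` is an ete3 `NCBITaxa` instance and relies on its `get_rank` method to label each taxid; the function name `pick_ranks` and the `wanted` parameter are my own, not part of ete3.

```python
def pick_ranks(ncbi, species_name, wanted=("order", "family", "genus")):
    """Return {rank: scientific name} for the requested ranks of one species.

    `ncbi` is assumed to be an ete3 NCBITaxa instance (or any object that
    offers the same four methods used here).
    """
    taxids = ncbi.get_name_translator([species_name]).get(species_name)
    if not taxids:
        return {}  # name not present in the local taxonomy database

    lineage = ncbi.get_lineage(taxids[0])
    ranks = ncbi.get_rank(lineage)              # taxid -> rank, e.g. "family"
    names = ncbi.get_taxid_translator(lineage)  # taxid -> scientific name

    # Keep only the levels whose rank was asked for
    return {ranks[t]: names[t] for t in lineage if ranks.get(t) in wanted}


# Usage with the real database (needs ete3 and its local taxonomy database):
#   from ete3 import NCBITaxa
#   print(pick_ranks(NCBITaxa(), "Punica granatum"))
```

Judging from the lineage printed above, pomegranate should come back as Myrtales, Lythraceae, and Punica, though I have only sketched this and the exact ranks depend on the taxonomy version that ete3 downloaded.
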
  • Multiple species: put the Latin names in a text file, one per line
Lumnitzera littorea
Punica granatum
Heimia myrtifolia
Sonneratia alba
Epilobium ulleungensis

Code

import sys
from ete3 import NCBITaxa

input_file = sys.argv[1]
output_file = sys.argv[2]

ncbi = NCBITaxa()

# Open both files in one with statement so each is closed automatically
with open(input_file, "r") as fr, open(output_file, "w") as fw:
    for line in fr:
        species_name = line.strip()
        if not species_name:
            continue  # skip blank lines
        name2taxid = ncbi.get_name_translator([species_name])
        for name, taxids in name2taxid.items():
            lineage = ncbi.get_lineage(taxids[0])
            names = ncbi.get_taxid_translator(lineage)
            # One comma-separated lineage per line, from root to species
            fw.write(",".join(names[taxid] for taxid in lineage) + "\n")
        # An empty result means the name is not in the local database
        print(species_name + ":", "OK" if name2taxid else "not found")

# Usage
python .\get_species_placement_in_NCBI.py .\Organism_name.txt placement.txt

# Output
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Combretaceae,Lumnitzera,Lumnitzera littorea
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Punica,Punica granatum
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Heimia,Heimia myrtifolia
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Lythraceae,Sonneratia,Sonneratia alba
root,cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliophyta,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Myrtales,Onagraceae,Onagroideae,Epilobieae,Epilobium,Epilobium ulleungensis
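
With a long species list, two things might be worth changing: `get_name_translator` accepts a whole list, so all names can be looked up in one call, and names that NCBI does not recognize can be collected instead of silently skipped. The sketch below shows one way to do that. The helper names `lineage_rows` and `rows_to_csv` are hypothetical, `ncbi` is again assumed to be an ete3 `NCBITaxa` instance, and the standard `csv` module is used so that any taxon name containing a comma gets quoted.

```python
import csv
import io


def lineage_rows(ncbi, species_names):
    """Look up all names in one call; return (rows, missing).

    rows    -- one list of lineage names (root to species) per name found
    missing -- the input names that the taxonomy database did not recognize
    """
    name2taxid = ncbi.get_name_translator(species_names)
    rows, missing = [], []
    for name in species_names:
        taxids = name2taxid.get(name)
        if not taxids:
            missing.append(name)
            continue
        lineage = ncbi.get_lineage(taxids[0])
        names = ncbi.get_taxid_translator(lineage)
        rows.append([names[t] for t in lineage])
    return rows, missing


def rows_to_csv(rows):
    """Serialize the rows as CSV text, quoting fields that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Usage with the real database (needs ete3 and its local taxonomy database):
#   from ete3 import NCBITaxa
#   with open("Organism_name.txt") as fr:
#       species = [line.strip() for line in fr if line.strip()]
#   rows, missing = lineage_rows(NCBITaxa(), species)
#   print(rows_to_csv(rows), end="")
#   print("not found:", missing)
```

Rows can differ in length because lineages differ in depth (the Epilobium line above has two more levels than the others), so the result is comma-separated text rather than a table with fixed columns.
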