To convert BLAST XML output to GFF3, first convert it to tabular format, then use blast92gff3.pl to turn the tabular file into GFF3. For the XML-to-tab step you can use the open-source toolkit Biopython: Bio/Blast/NCBIXML.py can parse the XML document, so you only need to write a short program that prints the fields you want. Reference program (the BLAST XML file is passed as the first command-line argument, and the tab-separated lines go to standard output):

    import sys
    from Bio.Blast import NCBIXML

    with open(sys.argv[1]) as file_handle:
        # One record per query sequence in the BLAST XML output
        for record in NCBIXML.parse(file_handle):
            # Queries with no match have an empty alignment list and print nothing
            for align in record.alignments:
                for hsp in align.hsps:
                    # Percent identity over the aligned length
                    pct_identity = hsp.identities * 100.0 / hsp.align_length
                    # Non-identical columns (this count also includes gap columns)
                    non_identical = hsp.align_length - hsp.identities
                    # Output all values, one HSP per line
                    print("%s\t%s\t%f\t%s\t%d\t%s\t%s\t%s\t%s\t%s\t%s\t%s" % (
                        record.query_id, align.hit_id, pct_identity,
                        hsp.align_length, non_identical, hsp.gaps,
                        hsp.query_start, hsp.query_end,
                        hsp.sbjct_start, hsp.sbjct_end,
                        hsp.expect, hsp.bits))
Of course, this program requires Biopython to be installed; there is a documentation page that describes it.
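To make the second step (tab to GFF3) less of a black box, here is a rough sketch of what such a conversion involves. This is not the logic of blast92gff3.pl, just an illustration. The function name `tab_line_to_gff3`, the `source` and `feature_type` defaults, and the choice to anchor the feature on the hit sequence are all assumptions made for this example. It expects the 12-column layout printed by the program above and assumes the query ID contains no spaces.

```python
def tab_line_to_gff3(line, source="blast", feature_type="match"):
    """Turn one 12-column tabular BLAST line into one GFF3 line.

    Hypothetical helper for illustration; the feature is placed on the hit
    (subject) sequence and the query is recorded in the Target attribute.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 12:
        raise ValueError("expected 12 tab-separated columns, got %d" % len(cols))

    query_id, hit_id = cols[0], cols[1]
    q_start, q_end = int(cols[6]), int(cols[7])
    s_start, s_end = int(cols[8]), int(cols[9])
    evalue = cols[10]

    # GFF3 requires start <= end; a reversed subject range is treated as minus strand.
    strand = "+" if s_start <= s_end else "-"
    start, end = min(s_start, s_end), max(s_start, s_end)

    # Target=<id> <start> <end> links the feature back to the query sequence.
    target = "Target=%s %d %d" % (query_id, min(q_start, q_end), max(q_start, q_end))

    # GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
    return "\t".join([hit_id, source, feature_type, str(start), str(end),
                      evalue, strand, ".", target])


if __name__ == "__main__":
    example = "q1\tchr1\t98.000000\t100\t2\t0\t1\t100\t500\t401\t1e-50\t180.0"
    print("##gff-version 3")
    print(tab_line_to_gff3(example))
```

The E-value goes into the score column here, and the phase column is "." because these are not CDS features. For real data, blast92gff3.pl remains the tool suggested above.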