HISAT2 runs out of memory when building the GRCh38 transcriptome index

Error

Ran out of memory; automatically trying more memory-economical parameters

Solution

First, check the manual on the HISAT2 website, which contains this statement:

If you use --snp, --ss, and/or --exon, hisat2-build will need about 200GB RAM for the human genome size as index building involves a graph construction.
Otherwise, you will be able to build an index on your desktop with 8GB RAM.

Also note the --known-splicesite-infile option of hisat2:

With this mode, you can provide a list of known splice sites, which HISAT2 makes use of to align reads with small anchors.

You can create such a list using python hisat2_extract_splice_sites.py genes.gtf > splicesites.txt, where hisat2_extract_splice_sites.py is included in the HISAT2 package, genes.gtf is a gene annotation file, and splicesites.txt is a list of splice sites with which you provide HISAT2 in this mode. Note that it is better to use indexes built using annotated transcripts (such as genome_tran or genome_snp_tran), which works better than using this option. It has no effect to provide splice sites that are already included in the indexes.

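So there are two ways to solve this. The first is to request more memory and rebuild the index. The second is to build the index without splice sites and supply them at alignment time instead, although, as the manual notes, this does not work as well as the first.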
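The two options can be sketched as small shell functions. This is only a sketch under assumptions: the file names (genome.fa, genes.gtf, the read files) and the function names are hypothetical, and the HISAT2_BUILD and HISAT2 variables are just a convenience for pointing at a particular install. The flags themselves (--ss, --exon, -x, --known-splicesite-infile, -1, -2, -S) are the ones documented in the HISAT2 manual.

```shell
# Paths to the HISAT2 executables; override if they are not on PATH.
HISAT2_BUILD="${HISAT2_BUILD:-hisat2-build}"
HISAT2="${HISAT2:-hisat2}"

# Option 1: index with annotated splice sites and exons.
# Per the manual this needs about 200GB RAM for the human genome.
# $1 = genome FASTA, $2 = splice sites, $3 = exons, $4 = index basename
build_tran_index() {
    "$HISAT2_BUILD" --ss "$2" --exon "$3" "$1" "$4"
}

# Option 2a: plain index without annotation (about 8GB RAM per the manual).
# $1 = genome FASTA, $2 = index basename
build_plain_index() {
    "$HISAT2_BUILD" "$1" "$2"
}

# Option 2b: supply the known splice sites at alignment time instead.
# $1 = index basename, $2 = splice sites, $3/$4 = paired reads, $5 = output SAM
align_with_splicesites() {
    "$HISAT2" -x "$1" --known-splicesite-infile "$2" -1 "$3" -2 "$4" -S "$5"
}

# Example usage (hypothetical file names):
#   hisat2_extract_splice_sites.py genes.gtf > splicesites.txt
#   hisat2_extract_exons.py genes.gtf > exons.txt
#   build_tran_index genome.fa splicesites.txt exons.txt genome_tran
#   build_plain_index genome.fa genome
#   align_with_splicesites genome splicesites.txt reads_1.fq reads_2.fq out.sam
```

Both extraction scripts ship with the HISAT2 package. Whichever option you pick, the splice site list only has to be generated once from the annotation.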

Checking the available memory on the node

Use the top command:

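If you like to work out the available memory by hand, here is an approximate formula: free (fourth line of the top output) + buffers (fourth line) + cached (fifth line). By this formula, the available memory on this server is 530668 + 79236 + 4231276 = 4841180 KB, roughly 4.6 GB, which is far below what the index build needs.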
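The same arithmetic can be done in the shell. The sketch below assumes a Linux node where /proc/meminfo exposes the MemFree, Buffers and Cached fields in kB; the function names are made up for illustration.

```shell
# Sum free + buffers + cached (all in KB), following the formula above.
approx_available_kb() {
    echo $(( $1 + $2 + $3 ))
}

# The numbers read from top on this server; prints 4841180.
approx_available_kb 530668 79236 4231276

# Same estimate taken directly from a meminfo-style file
# (defaults to /proc/meminfo; the anchored pattern skips SwapCached).
meminfo_available_kb() {
    awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } END { print sum }' "${1:-/proc/meminfo}"
}

# Example usage on the node itself:
#   meminfo_available_kb
```

On recent kernels the MemAvailable field in /proc/meminfo is usually a better estimate than this sum, but the sum matches the rule of thumb used here.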
