SOAPdenovo-Trans is a de novo transcriptome assembler based on the SOAPdenovo framework, adapted to alternative splicing and to differing expression levels among transcripts. The assembler provides a more accurate, complete, and faster way to construct full-length transcript sets.
SOAPdenovo-Trans is designed for transcript assembly. It runs on 64-bit Linux systems. For animal transcriptomes such as mouse, about 30-35 GB of memory is required.
1.04 | 2014-04-22 15:00:00 +0800 (Tue, 22 Apr 2014) Fixes a number of 'segmentation fault' errors on different kinds of data. (Thanks to Chris Boursnell (Twitter: @chrisboursnell) for fixing the bugs.)
1.03 | 2013-07-19 12:00:00 +0800 (Fri, 19 Jul 2013) Adds RPKM calculation (Reads Per Kilobase of assembled transcript per Million mapped reads).
The configuration file in SOAPdenovo-Trans is mostly the same as in SOAPdenovo, except that there is no "rank" parameter. The configuration file tells the assembler where to find the read files and the relevant information about them. "example.config" demonstrates how to organize this information and write a configuration file.
The configuration file has one section for global information, followed by one or more library sections. Right now only "max_rd_len" is included in the global information section. Any read longer than max_rd_len will be cut to this length. Information about each library, and about the sequencing data generated from it, should be placed in the corresponding library section. Each library section starts with the tag [LIB] and includes the following items:
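The layout described above (a global section, then one [LIB] section per library) can be sketched with a small Python helper. This is only an illustration: the helper name build_config is hypothetical, and the library keys used here (avg_ins, reverse_seq, asm_flags, q1, q2) and the read file names are assumptions borrowed from SOAPdenovo-style configuration files, not items listed in this section. Check "example.config" for the authoritative set.

```python
def build_config(max_rd_len, libraries):
    """Return config text: the global section first, then one [LIB] section per library."""
    lines = [f"max_rd_len={max_rd_len}"]  # the only global item mentioned in this README
    for lib in libraries:
        lines.append("[LIB]")  # every library section starts with this tag
        for key, value in lib.items():
            lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Assumed keys and hypothetical file names, for illustration only.
example = build_config(
    max_rd_len=100,
    libraries=[
        {
            "avg_ins": 200,
            "reverse_seq": 0,
            "asm_flags": 3,
            "q1": "reads_1.fq",
            "q2": "reads_2.fq",
        }
    ],
)
print(example, end="")
```

The printed text can be saved to a file and passed to the assembler as its configuration file.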
Once the configuration file is available, the simplest way to run the assembler is:
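As a rough sketch of a single-command run, the Python snippet below only builds and prints the command line. It assumes the "all" subcommand with "-s" (configuration file) and "-o" (output prefix), as used in the SOAPdenovo family; the helper name all_command, the binary name chosen, and the prefix "asm_out" are illustrative assumptions. Confirm the options against the help text of your binary before running.

```python
import shlex


def all_command(binary, config_file, output_prefix):
    """Build the argument list for a one-shot assembly run (assumed flags: -s, -o)."""
    return [binary, "all", "-s", config_file, "-o", output_prefix]


cmd = all_command("SOAPdenovo-Trans-31mer", "example.config", "asm_out")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```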
Users can also run the assembly process step by step:
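A step-by-step run might be scripted as in the sketch below. The four step names (pregraph, contig, map, scaff) and their flags are assumptions taken from the SOAPdenovo family of tools and are not confirmed by this section; step_commands is a hypothetical helper, and the binary name and prefix are placeholders.

```python
def step_commands(binary, config_file, prefix):
    """Return one argument list per assumed assembly step, in execution order."""
    return [
        [binary, "pregraph", "-s", config_file, "-o", prefix],  # build the k-mer graph
        [binary, "contig", "-g", prefix],                       # derive contigs from the graph
        [binary, "map", "-s", config_file, "-g", prefix],       # map reads back to contigs
        [binary, "scaff", "-g", prefix],                        # build scaffolds
    ]


steps = step_commands("SOAPdenovo-Trans-31mer", "example.config", "asm_out")
for step in steps:
    print(" ".join(step))
    # To actually run each step: import subprocess; subprocess.run(step, check=True)
```

Running the steps separately makes it possible to rerun a later step without repeating the earlier ones.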
NOTE: SOAPdenovo-Trans has two versions: SOAPdenovo-Trans-31mer and SOAPdenovo-Trans-127mer.
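A wrapper script could pick between the two binaries from the chosen k-mer size. The sketch below assumes, from the binary names alone, that the 31mer build handles k up to 31 and the 127mer build handles k up to 127; these limits and the helper name pick_binary are assumptions, not statements from this README.

```python
def pick_binary(kmer):
    """Choose a binary name from the k-mer size (limits inferred from the names)."""
    if kmer <= 0:
        raise ValueError("k-mer size must be positive")
    if kmer <= 31:
        return "SOAPdenovo-Trans-31mer"
    if kmer <= 127:
        return "SOAPdenovo-Trans-127mer"
    raise ValueError("k-mer size above 127 is not covered by either binary")


print(pick_binary(25))
print(pick_binary(63))
```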
These files are output as assembly results:
    *.contig     contig sequence file
    *.scafSeq    scaffold sequence file
There are some other files that provide useful information for advanced users, which are listed in Appendix B.
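For a quick look at the results, sequence lengths can be tallied with a few lines of Python. This sketch assumes the *.contig and *.scafSeq files are FASTA-formatted (header lines starting with ">"); read_fasta_lengths is a hypothetical helper and the sample records are made up for illustration.

```python
def read_fasta_lengths(lines):
    """Map each FASTA record name to its total sequence length."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Made-up records standing in for the contents of a *.scafSeq file.
sample = [">scaffold1 example", "ACGTAC", "GT", ">scaffold2", "ACG"]
lengths = read_fasta_lengths(sample)
print(lengths)
```

With a real result file, pass the open file handle instead of the sample list, for example `with open("asm_out.scafSeq") as fh: read_fasta_lengths(fh)`, where "asm_out" stands for whatever output prefix was used.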