TEtranscripts FAQ

This page contains some frequently asked questions and commonly encountered issues with TEtranscripts.
– You can also visit our page on Github to see if your question has been addressed there.
– Otherwise, you can contact us, and we will try our best to help.

Commonly encountered issues

Where can I find a transposable element GTF file for the organism or genome build that I want to use for my analysis?
Error: 'NoneType' object has no attribute 'reference_start' [Exception type: AttributeError, raised in TEtranscripts:288]
Error: 'module' object has no attribute 'AlignmentFile' [Exception type: AttributeError, raised in TEtranscripts:199]
Error in parametricDispersionFit(means, disps) : Parametric dispersion fit failed. Try a local fit and/or a pooled estimation.
Why did the differential analysis fail on the test files that were provided?
How do I know whether my library is stranded (or the directionality of strandedness)?
Are there special instructions for paired-end libraries?

Where can I find a transposable element GTF file for the organism or genome build that I want to use for my analysis?

You can see what GTF files are already available here.
If what you're looking for is not there, we can help you generate the file, but you would need to provide well-curated annotations, including genomic locations (relative to the genome build of interest) and specific information about each TE (e.g. type or element name).
Please don’t hesitate to contact us, and we will do our best to help.

How do I resolve the following error?

Error: 'NoneType' object has no attribute 'reference_start' [Exception type: AttributeError, raised in TEtranscripts:288]

This error arises when TEtranscripts fails to find the aligned mates of a paired-end read in the alignment file. TEtranscripts requires the aligned mates to be adjacent to each other in the BAM file.

If you have sorted your BAM file, you can check how it was sorted using the following command:

$ samtools view -H [BAM file] | grep "SO:"

If your BAM file is sorted by coordinate (SO:coordinate), please specify the --sortByPos option when running TEtranscripts.
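
If you prefer to script this check, the sketch below reads the sort order from header text such as the output of samtools view -H. The function name bam_sort_order and the sample header are made up for illustration; they are not part of TEtranscripts.

```python
def bam_sort_order(header_text):
    """Return the SO value from the @HD line of SAM header text, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Hypothetical header text, as printed by `samtools view -H`.
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422"
print(bam_sort_order(header))  # prints "coordinate"
```

A result of "coordinate" means that --sortByPos is needed, as described above.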

How can I resolve the following error?

Error: 'module' object has no attribute 'AlignmentFile' [Exception type: AttributeError, raised in TEtranscripts:199]

This error arises when an old version of the pysam library is used. TEtranscripts depends on pysam version 0.8 or higher, and is incompatible with older versions.
To check your pysam version, run the following commands at your command prompt:
$ python
>>> import pysam
>>> pysam.__version__
If the output is not '0.8' or higher (e.g. '0.8.3'), then you will need to install a newer version of pysam.
Note: As of TEToolkit version 2.0+, you will need a pysam library version above 0.9.
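
Comparing version strings by eye is error-prone (for example, '0.10' is newer than '0.9', although it sorts earlier as text). The following is a minimal sketch of a numeric comparison; the helper names version_tuple and meets_minimum are hypothetical and not part of pysam or TEtranscripts.

```python
import re


def version_tuple(version):
    """Turn a string such as '0.8.3' into (0, 8, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is numerically at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


try:
    import pysam
    print(pysam.__version__, meets_minimum(pysam.__version__, "0.8"))
except ImportError:
    print("pysam is not importable from this Python interpreter")
```

Replace "0.8" with the minimum version required by your TEToolkit release.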

I installed pysam version 0.8 (or higher), but why do I still get the same error as above?

This is probably because both the newer version (0.8 or higher) and an older version are installed, and Python is calling the older version by default.

If you installed pysam through pip, you can type:
$ pip list
This should show you the version of the package that was installed with pip.
To find the location of your pysam installed through pip, you can type:
$ pip show pysam

If you installed from source (e.g. python setup.py install), you can run the following commands:
$ python
>>> import pysam
>>> pysam.__version__
'0.8.3'
>>> pysam.__file__
'/lib/python2.6/site-packages/pysam-0.8.3-py2.6-linux-x86_64.egg/pysam/__init__.pyc'
This will tell you where the default pysam library is installed.

If your newer version of pysam is not listed, this likely means that it is not on the Python search path. You can confirm this by running the following:
$ python
>>> import sys
>>> print("\n".join(sys.path))
For example, if pysam.__file__ produces this:
'/lib/python2.6/site-packages/pysam-0.8.3-py2.6-linux-x86_64.egg/pysam/__init__.pyc'
then '/lib/python2.6/site-packages/pysam-0.8.3-py2.6-linux-x86_64.egg' should be in the output above.

If you do not find the folder containing your pysam installation, then you will need to add it to the search path. You can do this by setting your PYTHONPATH variable.
This can be done either by running the following command:
$ export PYTHONPATH=/{path to your pysam version 0.8}:$PYTHONPATH

or by adding the command above to your .bashrc file.
The former enables Python to use the newer pysam library for the current session only, while the latter enables it for all future sessions. The session-only approach is useful if the newer pysam library might cause conflicts with other programs.

If your folder is in the search path, but listed after the folder with the default pysam, you can follow the procedure in the paragraph above to ensure that Python chooses the newer version of pysam as its default.
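
The checks in this section can also be scripted. The sketch below uses the standard importlib module to report which file Python would load for a module, and where a given folder sits in the search path. The helper names module_location and path_position are hypothetical.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file Python would load for top-level module `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def path_position(directory, search_path=None):
    """Return the index of `directory` in the search path (lower wins), or None if absent."""
    if search_path is None:
        search_path = sys.path
    for index, entry in enumerate(search_path):
        if entry == directory:
            return index
    return None


print(module_location("pysam"))
```

If two folders both contain pysam, the one with the lower position in the search path is the one Python imports.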

How do I resolve the following error?

Error in parametricDispersionFit(means, disps) : Parametric dispersion fit failed. Try a local fit and/or a pooled estimation.

This error is thrown when DESeq is unable to fit a dispersion model to your data. To address this, you can change the following line in the .R file generated by TEtranscripts:

cds <- estimateDispersions(cds)

to:

cds <- estimateDispersions(cds,fitType="local")
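
If you would rather not edit the file by hand, the same substitution can be sketched in Python. This assumes the generated script contains the line exactly as shown above; the function name use_local_fit is hypothetical, and you would still need to read and write the .R file yourself.

```python
def use_local_fit(r_source):
    """Switch the estimateDispersions call in R source text to fitType="local"."""
    old = "cds <- estimateDispersions(cds)"
    new = 'cds <- estimateDispersions(cds,fitType="local")'
    # Text that was already modified does not contain `old`, so it is left unchanged.
    return r_source.replace(old, new)


example = "cds <- estimateDispersions(cds)\n"
print(use_local_fit(example), end="")
```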

I was running TEtranscripts with the provided test files, and got the following error. How can I resolve this?

CRITICAL @ Wed, 02 Dec 2015 18:20:21: Error in running differential analysis!
CRITICAL @ Wed, 02 Dec 2015 18:20:21: Error: [Errno 13] Permission denied
CRITICAL @ Wed, 02 Dec 2015 18:20:21: [Exception type: OSError, raised in subprocess.py:1335]

This error is likely caused by the inability of TEtranscripts to run the R code for differential analysis. Here are things that you should check:

– The gene/TE abundance table generated by TEtranscripts (***.cntTable) is in the output folder and contains counts for genes and TEs in the libraries processed.
– The R code generated by TEtranscripts (***.R) is in the output folder.
– Rscript is in your PATH variable.

To check if Rscript could be found, you can type the following:

$ which Rscript

If you get the following results:

/usr/bin/which: no Rscript in (...)

That means that the system could not find the Rscript program (which is installed alongside R).
This is an issue encountered when using RStudio (especially in Mac OS X), where the R and Rscript programs are not added to the PATH variable upon installation.
If you've installed R/RStudio on Mac OS X in a standard location, you can direct the system to the Rscript program by doing the following:

$ ln -s /Library/Frameworks/R.framework/Resources/bin/Rscript /local/home/usr/bin

This assumes that /local/home/usr/bin is in your PATH variable; otherwise, substitute a directory that is.
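
The same lookup can be done from Python, which is useful for checking the environment TEtranscripts itself runs in. This is a minimal sketch using the standard shutil.which function; the wrapper name find_program is hypothetical.

```python
import shutil


def find_program(name):
    """Return the full path of `name` if it is found on PATH, otherwise None."""
    return shutil.which(name)


location = find_program("Rscript")
if location is None:
    print("Rscript was not found on PATH")
else:
    print("Rscript found at", location)
```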

Which --stranded parameter should I be using?

With an increasing number of RNAseq libraries using directional/stranded library prep protocols, this parameter becomes critical in ensuring the quantification is done correctly. Here are a few examples of library prep protocols that would go with the corresponding --stranded parameter:

--stranded no

Any unstranded RNAseq library protocol (e.g. NuGens Ovation RNAseq v2)

--stranded forward (TEtranscripts version 2.1.4 or later) or --stranded yes (TEtranscripts version 2.1.3 or earlier)

--stranded reverse

If you are uncertain whether your library is stranded, we recommend that you run one of your libraries using all three --stranded parameters, and then compare the counts of a highly expressed gene (e.g. GAPDH) between the three modes.

If the gene count in the --stranded no run is approximately twice that of the other two runs (Example 1), then your library is likely to be unstranded, and the correct parameter to use is --stranded no.

Example 1
GeneA
--stranded no: 100
--stranded forward/yes: 50
--stranded reverse: 50
Conclusion: use --stranded no

If the gene count in either the --stranded yes/forward run (Example 2) or the --stranded reverse run (Example 3) is approximately equal to that of the --stranded no run, then the correct parameter to use is the one that gave you the value closest to the --stranded no run.

Example 2
GeneA
--stranded no: 100
--stranded forward/yes: 99
--stranded reverse: 1
Conclusion: use --stranded forward (for TEtranscripts version 2.1.3 or earlier, use --stranded yes)

Example 3
GeneA
--stranded no: 100
--stranded forward/yes: 3
--stranded reverse: 97
Conclusion: use --stranded reverse
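
The comparison in the three examples above can be sketched as a small function. The name suggest_stranded and the 20% tolerance are assumptions chosen for illustration, not values used by TEtranscripts; a result of None means the counts fit none of the three patterns and should be inspected by hand.

```python
def suggest_stranded(no, forward, reverse, tolerance=0.2):
    """Suggest a --stranded value from one gene's counts in the three modes."""
    if no <= 0:
        return None
    fwd_frac = forward / no
    rev_frac = reverse / no
    # Unstranded: each stranded run sees roughly half of the unstranded count.
    if abs(fwd_frac - 0.5) <= tolerance and abs(rev_frac - 0.5) <= tolerance:
        return "no"
    # Stranded: one run recovers nearly all counts, the other almost none.
    if fwd_frac >= 1 - tolerance and rev_frac <= tolerance:
        return "forward"
    if rev_frac >= 1 - tolerance and fwd_frac <= tolerance:
        return "reverse"
    return None


print(suggest_stranded(100, 50, 50))  # Example 1: prints "no"
print(suggest_stranded(100, 99, 1))   # Example 2: prints "forward"
print(suggest_stranded(100, 3, 97))   # Example 3: prints "reverse"
```

For TEtranscripts version 2.1.3 or earlier, a result of "forward" corresponds to --stranded yes.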

Are there special instructions for paired-end libraries?

For paired-end libraries, it is recommended that only alignments from properly paired reads are present in the input BAM file, i.e. each read 1 alignment should have only a single read 2 alignment.

For example, if read 1 matches three genomic locations (A, B, C) and read 2 also matches three genomic locations (A', B', C'), then all three pairs of alignments can be used (and should be in the BAM file). However, if alignment C of read 1 is matched with more than one alignment of read 2 (e.g. C' and C*), then alignment C should be discarded (as there are unmatched alignments between read 1 and read 2).

STAR only outputs properly paired alignments by default, while Bowtie2 requires the --no-mixed parameter to be used.
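
As a rough first check on an existing alignment file, you can inspect the FLAG field of each record (the second column of samtools view output). The sketch below tests whether a record is paired with both mates aligned, using the standard SAM flag bits. It is an illustration only, not the check that TEtranscripts performs internally, and it does not detect a read 1 alignment that is matched with several read 2 alignments. The function name has_aligned_mate is hypothetical.

```python
# Flag bits as defined in the SAM format specification.
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def has_aligned_mate(flag):
    """Return True if the record is paired and both the read and its mate are aligned."""
    if not flag & PAIRED:
        return False
    if flag & UNMAPPED or flag & MATE_UNMAPPED:
        return False
    return True


for flag in (99, 147, 73):
    print(flag, has_aligned_mate(flag))
```

Flags 99 and 147 are a typical read 1/read 2 pair with both mates aligned, while 73 marks a read whose mate did not align.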
