.. role:: bash(code)
   :language: bash

.. _sensitive-filtering:

Sensitive Filtering of Neoantigens
==================================

The sensitive filtering module of the TIminer pipeline selects expressed neoantigens by taking allele-specific expression into account. As shown in the following figure, the sensitive filtering is applied to the mutated binding peptides predicted by NetMHCpan and takes as additional input the list of DNA mutations (VCF format) and the RNA-seq reads (FASTQ format). The filtering consists of three main computational steps: (1) sensitive mapping of the RNA-seq reads with HISAT2 [1]_; (2) calculation of the RNA-seq read coverage for each mutation with the ASEReadCounter tool from the Genome Analysis Toolkit (GATK) [2]_; and (3) selection of the expressed mutated peptides, i.e. those whose mutation has a read coverage greater than or equal to 5 counts. The sensitive filtering of neoantigens can be activated in the TIminer pipeline launched from the command line by specifying the **--sensitivefiltering** parameter:

:bash:`usage: python TIminerPipeline.py [-h] --input INPUT --out OUT [--database DATABASE] [--threadcount THREADCOUNT] [--sensitivefiltering]`


.. figure:: _static/tciaScheme_sensitive.png

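The coverage threshold of step (3) can be illustrated with a minimal Python sketch. It is not the TIminer implementation: it assumes a tab-separated table with the ``contig``, ``position`` and ``altCount`` columns that GATK ASEReadCounter writes, and it assumes that the threshold is applied to the read count of the mutated allele. The function name ``expressed_mutations`` and the example rows are hypothetical.

.. code-block:: python

   import csv
   import io

   MIN_COVERAGE = 5  # threshold stated in the text above


   def expressed_mutations(table, min_coverage=MIN_COVERAGE, count_column="altCount"):
       """Return {(contig, position): count} for mutations meeting the threshold.

       Assumption: `table` is a tab-separated file object with a header row,
       and `count_column` holds the read count used for filtering.
       """
       reader = csv.DictReader(table, delimiter="\t")
       kept = {}
       for row in reader:
           count = int(row[count_column])
           if count >= min_coverage:  # "greater than or equal to 5 counts"
               kept[(row["contig"], int(row["position"]))] = count
       return kept


   # Hypothetical read-count table with three mutations
   example = io.StringIO(
       "contig\tposition\trefAllele\taltAllele\trefCount\taltCount\ttotalCount\n"
       "chr1\t1000\tA\tG\t12\t7\t19\n"
       "chr2\t2000\tC\tT\t30\t4\t34\n"
       "chr3\t3000\tG\tA\t0\t5\t5\n"
   )
   expressed = expressed_mutations(example)
   print(expressed)

In this sketch the mutation on ``chr2`` is discarded because only 4 reads support it, while the one on ``chr3`` is kept because the threshold is inclusive.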

Output files
============

The *filtered-neoantigens* directory contains the files:

- *subjectID_neoantigens_sensitiveFiltered.txt*: a :download:`file <_static/neoantigens_filtered_sensitive.txt>` for each subject containing only the expressed neoantigens; it has the same columns as the files generated by NetMHCpan, plus an additional field reporting the expression read coverage for the mutated allele.

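The per-subject files can be collected with a few lines of Python. The sketch below relies only on the file naming pattern described above; it assumes that the *filtered-neoantigens* directory sits directly inside the pipeline output directory, and the helper ``find_filtered_files`` and the subject IDs are hypothetical.

.. code-block:: python

   from pathlib import Path
   import tempfile

   SUFFIX = "_neoantigens_sensitiveFiltered.txt"


   def find_filtered_files(out_dir):
       """Map each subject ID to the path of its sensitive-filtered file.

       Assumption: the files live in <out_dir>/filtered-neoantigens.
       """
       folder = Path(out_dir) / "filtered-neoantigens"
       return {
           path.name[: -len(SUFFIX)]: path  # strip the suffix to get the subject ID
           for path in sorted(folder.glob("*" + SUFFIX))
       }


   # Demonstration on a temporary directory with two empty placeholder files
   with tempfile.TemporaryDirectory() as tmp:
       folder = Path(tmp) / "filtered-neoantigens"
       folder.mkdir()
       for subject in ("subjectA", "subjectB"):
           (folder / (subject + SUFFIX)).write_text("")
       found = find_filtered_files(tmp)
       subjects = sorted(found)
       print(subjects)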

Sensitive filtering functions
=============================

**Single-Subject data analysis**

.. automodule:: TIminer.TIminerAPI
   :members: sensitiveFilterNeoantigen

**Multiple-Subject data analysis**

.. automodule:: TIminer.TIminerAPI
   :members: sensitiveFilterNeoantigenDir


References
==========

.. [1] Kim, D., Langmead, B. and Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
.. [2] McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).