BamQuery: a proteogenomic tool for the genome-wide exploration of the immunopeptidome

Python: 3.6

BamQuery is available on GitHub and can also be used through its web interface.

A Quick Intro:

MHC class I–associated peptides (MAPs), collectively referred to as the immunopeptidome, define the immune self for CD8+ T cells and play a pivotal role in cancer immunosurveillance. MAPs were long thought to be generated solely by the degradation of canonical proteins, but recent advances in proteogenomics (genomically informed proteomics) have shown that ∼10% of MAPs originate from allegedly noncoding genomic sequences. Among these sequences, endogenous retroelements (EREs) are under intense scrutiny as a possible source of tumor-specific antigens (TSAs).

With the increasing number of cancer-oriented immunopeptidomic and proteogenomic studies comes the need to accurately attribute an RNA expression level to each MAP identified by mass spectrometry. Here, we introduce BamQuery (BQ), a computational tool that counts all reads able to code for any MAP in any RNA-seq dataset chosen by the user and annotates each MAP with all available biological features. Using BQ, we found that most canonical MAPs can derive from an average of two different genomic regions, whereas most tested ERE-derived MAPs can be generated by numerous (median of 682) different genomic regions and RNA transcripts.
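
To make the idea of "reads able to code for a MAP" concrete, here is a toy Python sketch. It is not BamQuery's implementation: it simply enumerates every nucleotide sequence that translates to a peptide under the standard genetic code, then counts how many reads from an in-memory list contain one of them on either strand. The function names (`coding_sequences`, `count_coding_reads`) and the example reads are hypothetical and exist only for illustration.

```python
from itertools import product

# Standard genetic code, listed for the 64 codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(a + b + c)


def coding_sequences(peptide):
    """Yield every nucleotide sequence that translates to `peptide`."""
    for codons in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(codons)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (ACGT only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_coding_reads(peptide, reads):
    """Count reads containing a coding sequence for `peptide` on either strand."""
    targets = set(coding_sequences(peptide))
    targets |= {reverse_complement(s) for s in targets}
    k = 3 * len(peptide)
    count = 0
    for read in reads:
        # Slide a window of the coding length across the read.
        if any(read[i:i + k] in targets for i in range(len(read) - k + 1)):
            count += 1
    return count


# Hypothetical example: the dipeptide "MW" has a single coding sequence, ATGTGG.
example_reads = ["GGATGTGGCC", "TTCCACATAA", "ACGTACGTAC"]
example_count = count_coding_reads("MW", example_reads)
```

The number of candidate coding sequences grows multiplicatively with peptide length (six codons each for L, S and R), so brute-force enumeration like this is only practical for very short peptides; it is meant to illustrate the concept, not to process real 8 to 11-residue MAPs or real BAM files.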

We show that published ERE MAPs considered TSA candidates can be coded by many genomic regions other than those previously studied, resulting in high, previously undetected expression in normal tissues. Similarly, we show that some mutated neoantigens previously published as presumably specific anti-cancer targets can in fact be generated by non-mutated, non-coding RNA-seq reads that are widely expressed in normal tissues. In light of these observations, we conclude that BQ could become an essential tool in any TSA identification or validation pipeline in the near future.

BamQuery is developed by Maria Virginia Ruiz Cuevas at the Institute for Research in Immunology and Cancer (IRIC).