One of our aims is to develop and improve methods for analyzing cancer genomes and epigenomes. This includes methods for working with high-throughput sequencing datasets (WGS, WES, amplicon sequencing, ChIP-seq, RNA-seq) and approaches for working with motifs in DNA sequences (motif discovery for TFBSs, analysis of repetitive elements).

Software developed by members of the lab

Analysis of ChIP-seq data

CHIPIN - normalization of the ChIP-seq signal without spike-in data when matched RNA-seq data are available
L. Polit, et al. BMC Bioinformatics, accepted for publication. Link to the paper
LILY - detection of super-enhancers in cancer samples
V. Boeva, et al. Nature Genetics, 2017, 49(9):1408-1413. PMID: 28740262
HMCan - detection of chromatin modifications in ChIP-seq data (specifically in cancer genomes)
Ashoor et al., Bioinformatics, 2013, 29(23):2979-2986. PMID: 24021381.
HMCan-diff - detection of differential chromatin modifications in ChIP-seq data (by applying a correction to copy number alterations, HMCan-diff allows comparison of different cancer cell lines or cancer cells vs normal cells)
Ashoor et al., Nucleic Acids Research, 2017, 45(8):e58. PMID: 28053124.
MICSA - detection of transcription factor binding sites in ChIP-seq data using information about de novo identified binding motifs
Boeva et al., Nucleic Acids Research, 2010, 38(11):e126. PMID: 20375099.
Nebula - a web-server for advanced ChIP-seq data analysis
Boeva et al., Bioinformatics, 2012, 28(19):2517-9. PMID: 22829625.

Analysis of DNA sequencing data (WGS, WES, ultra-deep targeted sequencing data)

FREEC & Control-FREEC - detection of copy number alterations (specifically in cancer genomes) using whole genome or whole exome sequencing data
Boeva, et al., Bioinformatics, 2012, 28(3):423-5. PMID: 22155870.
Boeva, et al., Bioinformatics, 2011, 27(2):268-9. PMID: 21081509.
QuantumClone - clonal reconstruction method for whole genome or whole exome sequencing data
Deveau et al., Bioinformatics, accepted for publication. Link to the paper
SV-Bay - structural variant detection in cancer genomes using a Bayesian approach with correction for GC-content and read mappability
Iakovishina et al., Bioinformatics, 2016, 32(7):984-92. PMID: 26740523.
ONCOCNV - detection of copy number changes in high-depth/amplicon sequencing data
Boeva et al., Bioinformatics, 2014, 30(24):3443-3450. PMID: 25016581.
SVDetect - detection of genomic structural variations from paired-end and mate-pair sequencing data (in collaboration with B. Zeitouni)
Zeitouni, et al., Bioinformatics, 2010, 26(15):1895-6. PMID: 20639544.

Sequence analysis

ChIPmunk - de novo motif discovery (in collaboration with I. Kulakovskiy)
Kulakovskiy, et al., Bioinformatics, 2010, 26(20):2622-3. PMID: 20736340.
AhoPro - evaluation of over-representation of one or more given motifs in DNA sequences
Boeva et al., Algorithms for Molecular Biology, 2007, 2:13. PMID: 17927813.
TandemSwan - detection of fuzzy tandem repeats in DNA sequences
Iakovishina et al., Bioinformatics. 2016. pii: btv751. [Epub ahead of print] PMID: 26740523.

Data visualization

SegAnnDB - interactive genomic data segmentation framework (in collaboration with T.D. Hocking)
Hocking, et al., Bioinformatics, 2014, 30(11):1539-1546. PMID: 24493034.

Simulation and mapping accuracy assessment of sequencing reads

RNFtools - RNF: a general framework to evaluate NGS read mappers (in collaboration with G. Kucherov)
Brinda, et al., Bioinformatics, 2016, 32(1):136-9. PMID: 20736340.
Simulator of cancer ChIP-seq reads
Ashoor et al., Bioinformatics, 2013, 29(23):2979-2986. PMID: 24021381.
TGSim - simulates cancer genomes by adding large structural variants to the reference genome
Iakovishina et al., Bioinformatics, 2016, 32(7):984-92. PMID: 26740523.