ONCOCNV - a package to detect copy number changes in deep sequencing data

ONCOCNV: Detection of copy number changes in deep sequencing data
Prediction of copy number gains and losses in amplicon sequencing or exome-sequencing data

Introduction

ONCOCNV is a package to detect copy number changes in deep sequencing data. It was developed by OncoDNA in collaboration with the Bioinformatics Laboratory of Institut Curie (Paris), and is now supported by the group of Valentina Boeva at Inserm U1016.

ONCOCNV automatically computes, normalizes, and segments copy number profiles, then calls copy number alterations. The baseline is constructed from control samples supplied by the user. Any positive number of control samples is accepted, but we recommend using at least three. The more the better.

ONCOCNV can be applied to exome-seq data. You just need to provide probe coordinates instead of amplicon coordinates, and you will get beautiful copy number profiles for your data.

Input for CNV detection: aligned single-end or paired-end data in the BAM format.
Output: Annotation of genes with copy number changes + visualization of the profile (.png).

Citation: Boeva,V. et al. (2014) Multi-factor data normalization enables the detection of copy number aberrations in amplicon sequencing data. Bioinformatics, 30(24):3443-3450.

SeqAnswers forum: http://seqanswers.com

Downloads

Read about the requirements in the README file.

Download the latest version of ONCOCNV.

Data access

Raw or processed input files:

Spreadsheet with the raw read count data for the control samples (dataset A; generated by ONCOCNV_getCounts.pl): Control.stats.txt
Spreadsheet with the normalized read count data for the control samples with additional information: amplicon length, GC-content, PC1, PC2, PC3, standard deviation (dataset A; generated by processControl.v5.3.R): Control.stats.Processed.txt
Spreadsheet with the raw read count data for the tumor samples (dataset A; generated by ONCOCNV_getCounts.pl): Test.stats.samplesA1_A3.txt and Test.stats.samplesA4_A8.txt
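
Before feeding one of these spreadsheets to another tool, it can help to check its shape. The following is a minimal sketch, not part of ONCOCNV: the helper name table_shape is hypothetical, and the snippet assumes the files are tab-delimited text with a single header line, which should be verified against the downloaded files.

```shell
#!/bin/sh
# Sketch: report the shape of a read-count table.
# ASSUMPTION: the file is tab-delimited text with one header line;
# the actual layout written by ONCOCNV_getCounts.pl may differ.

# table_shape FILE
# Prints "<data rows> <columns>"; the column count is taken from the header line.
table_shape() {
    awk -F '\t' '
        NR == 1 { cols = NF }
        END {
            rows = (NR > 0) ? NR - 1 : 0   # do not count the header line
            print rows, cols + 0
        }
    ' "$1"
}

# Example: runs only if the file has been downloaded to the current directory.
if [ -f Control.stats.txt ]; then
    table_shape Control.stats.txt
    head -n 3 Control.stats.txt   # header plus the first two data rows
fi
```

If the reported column count is 1, the file is probably not tab-delimited and the field separator passed to awk needs adjusting.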

Result files:

Archive with the results of ONCOCNV, ADTEx, NextGENe on 8 samples from dataset A: Dataset_A_results_3tools.zip

Contacts

The following members of the ONCOCNV working group are pleased to answer any questions or address any concerns you may have about the ONCOCNV software:

Visualization

Example of output .png file:

Additionally, you can visualize the output per chromosome. In this case, gene names will appear on the graph. To run the script:

cat perChrVisualization.R | R --slave --args myTestSample.profile.txt 17
or
cat perChrVisualization.R | R --slave --args myTestSample.profile.txt chr17
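
Each call above plots a single chromosome. To plot several chromosomes in one go, the same command can be wrapped in a loop. The sketch below is only an illustration: the function names per_chr_cmd and plot_chromosomes and the DRY_RUN switch are hypothetical and not part of ONCOCNV, and the script assumes perChrVisualization.R and the profile file are in the current directory.

```shell
#!/bin/sh
# Sketch: run the per-chromosome visualization for several chromosomes.
# ASSUMPTION: perChrVisualization.R and the profile file are in the current
# directory. The function names and DRY_RUN are illustrative only.

# per_chr_cmd PROFILE CHR
# Prints the command line for one chromosome, in the same form as shown above.
per_chr_cmd() {
    printf 'cat perChrVisualization.R | R --slave --args %s %s\n' "$1" "$2"
}

# plot_chromosomes PROFILE CHR [CHR ...]
# Runs the visualization once per chromosome; with DRY_RUN=1 it only prints
# the commands, so the list can be checked before R is started.
plot_chromosomes() {
    profile=$1
    shift
    for chr in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            per_chr_cmd "$profile" "$chr"
        else
            cat perChrVisualization.R | R --slave --args "$profile" "$chr"
        fi
    done
}

# Dry run for chromosomes 7 and 17: prints the two commands without running R.
DRY_RUN=1 plot_chromosomes myTestSample.profile.txt 7 17
```

Dropping DRY_RUN=1 from the last line runs the plots for real. Chromosome names can be given with or without the chr prefix, as in the two commands above.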

Blue vertical lines mark the predicted breakpoints. Orange vertical lines mark breakpoints that came out of the initial segmentation and were later corrected by a t-test; you may ignore them.
Example of output .png file: