Équipe IMAGeS - Images, Modélisation, Apprentissage, Géométrie et Statistique

Seminar of 18 October 2018, 14:00

Thursday, 18 October 2018, 14:00

Bayesian interaction and difference detection in Hi-C data using generalized additive models and fused lasso

Speaker: Yannick Spill (BSC)

3C-like experiments, such as 4C or Hi-C, have been fundamental to understanding genome organization. Thanks to these technologies, it is now known, for example, that Topologically Associating Domains (TADs) and chromatin loops are implicated in the dynamic interplay of gene activation and repression, and that their disruption can have dramatic effects on embryonic development. However, the analysis of Hi-C experiments is both statistically and computationally demanding. Most methods are hindered by high noise, large quantities of data and inadequate modelling of spatial dependency. In this talk, I will present a new way to represent Hi-C data, which leads to a more detailed classification of paired-end reads and, ultimately, to a new normalization and interaction detection method. This method, called Binless, uses a generalized additive model framework and makes extensive use of sparse fused lasso regression in a Bayesian setting. Binless is resolution-agnostic and adapts to the quality and quantity of available data. I will demonstrate its ability to call interactions and differences using a large-scale benchmark, and dwell on the difficulties and open questions that remain from both the theoretical and the applied perspectives.