Supervision team information
Lab: Norwegian Centre for Molecular Biosciences and Medicine (NCMBM), UiO
Team: Computational Biology & Gene Regulation
Supervisor: Anthony Mathelier
IBV supervisor: Eivind Valen
Co-supervisors: Anthony Mathelier, Damla Övek Baydar, and Ieva Rauluseviciute
E-mail address: anthony.mathelier@ncmbm.uio.no
Internship place: NCMBM, UiO, Oslo, Norway
Keywords: transcription factors (TFs), TF-DNA interactions, deep learning, UniBind
Subject description
Transcription factors (TFs) are key proteins involved in transcriptional regulation through their specific binding at transcription factor binding sites (TFBSs). Hence, accurately mapping TF-DNA interactions is critical to understanding gene regulation. Our group develops and maintains the key open-access resources JASPAR [1] and UniBind [2-3] to model TF-DNA interactions and map them across species. Currently, UniBind stores TFBSs predicted from ChIP-seq data and position weight matrices. We aim to expand it with predictions derived from state-of-the-art deep learning models that predict ChIP-seq signal at base-pair resolution (BPNet [4]). With our collaborators from the Kundaje lab at Stanford, we are currently expanding the JASPAR database to incorporate trained BPNet models. The candidate will implement a pipeline that uses the trained BPNet models to expand the TFBSs stored in UniBind, following our ChIP-eat approach [2-3]. This project provides an optimal learning environment, exposing the student to software development and computational approaches for managing and analyzing high-throughput sequencing data.
Bibliographic references
1. I. Rauluseviciute*, R. Riudavets-Puig*, R. Blanc-Mathieu, J.A. Castro-Mondragon, K. Ferenc, V. Kumar, R.B. Lemma, J. Lucas, J. Chèneby, D. Baranasic, A. Khan, O. Fornes, S. Gundersen, M. Johansen, E. Hovig, B. Lenhard+, A. Sandelin+, W.W. Wasserman+, F. Parcy+, A. Mathelier+. (2024) JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles. Nucleic Acids Research. doi:10.1093/nar/gkad1059.
2. Gheorghe,M., Sandve,G.K., Khan,A., Chèneby,J., Ballester,B. and Mathelier,A. (2019) A map of direct TF–DNA interactions in the human genome. Nucleic Acids Res, 47, e21–e21.
3. Puig,R.R., Boddie,P., Khan,A., Castro-Mondragon,J.A. and Mathelier,A. (2021) UniBind: maps of high-confidence direct TF-DNA interactions across nine species. BMC Genomics, 22, 482.
4. Avsec, Ž., Weilert, M., Shrikumar, A. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet 53, 354–366 (2021). https://doi.org/10.1038/s41588-021-00782-6
Most used techniques during the internship: Python programming, deep learning, large-scale analysis of omics data, gene regulation knowledge