Machine learning on climate, health and biological data
Machine learning is seeing increasing use within the life sciences, where complex relations typically hold between a large number of interacting entities. This is the case both at micro scale, between and within cells, and at macro scale, where, for example, the WHO regards climate change as the single biggest threat to human health because of the many ways climate affects human and animal health. At both scales, carefully crafted machine learning strategies are crucial to handle the typically limited amount of data, to learn robust models that generalise well across relevant contexts, and to ensure explainability. This overall theme encompasses two more focused sub-themes, described in more detail below.
NB! Applicants are asked to apply for one of these sub-themes. Please indicate clearly which sub-theme you have chosen for your proposal by using one of the codes SCML1 or SCML2.
Mentoring and internship will be offered by a relevant external partner.
Theme SCML1. Machine learning-based modeling of how climate change is affecting global health
- Contact person: Geir Kjetil Ferkingstad Sandve
- Keywords: Climate and health, Time series, Deep learning, Global health, Machine learning software
- Research group: Scientific Computing and Machine Learning (SCML)
It is well established that climate change will have a range of direct and indirect effects on global health. An example is how precipitation and temperature affect mosquito populations and the incidence of a disease like malaria. Precise quantitative models could allow early warning of epidemics and predict spatial expansion of disease, facilitating mitigation through improved resource allocation and interventions. The relations between climate and health are, however, highly complex and varying.
We are interested in the development of improved machine learning methodology for predicting future disease incidence informed by climate projections, providing modeling-based decision support, or predicting the effects of interventions designed to reduce disease burden. Methodology ranging from classic time series forecasting to tailored deep learning models is of interest, as well as development of open-source platforms to improve transparency, reproducibility and software reuse. The theme is connected to a large, transdisciplinary climate and health collaboration at UiO.
Topics from methodological research:
- Time series forecasting
- Deep Learning for Time Series Analysis
- Geospatial Machine Learning
- Machine learning platform/framework design
Topics from natural sciences or technology:
- Climate science
- Global health
- Climate change preparedness
External partners:
- The Norwegian Computing Center (NR)
- The Norwegian Meteorological Institute (met.no)
- SINTEF
Theme SCML2. Explainable Data-Fusion-based Machine Learning Models for Cellular Insights in Dementia-Associated Neurodegenerative Diseases
- Contact person: Pooya Zakeri
- Keywords: Explainable Machine Learning, Data Fusion, Dementia Pathogenesis, Neurodegenerative Diseases, Cellular Heterogeneity
- Research group: Scientific Computing and Machine Learning (SCML)
Neurodegenerative diseases associated with dementia, including but not limited to Alzheimer's Disease, affect millions worldwide, leading to cognitive decline and a significant impact on quality of life. Recent studies have highlighted the roles of various cell types, such as neurons, astrocytes, microglia, and oligodendrocytes, in disease progression. However, there is no consensus among studies, leaving the full spectrum of these cells' activation states, particularly in humans, largely unknown.
This proposal aims to develop explainable, data-fusion-based machine learning models to synthesize insights into the diverse activation states of these cells in dementia. By integrating single-nucleus transcriptomics datasets from different laboratories and technologies, we seek to systematically investigate unique cellular patterns associated with dementia-related diseases and to identify key human-specific cellular signatures. By applying advanced machine learning techniques for data integration, the research aims to produce biologically interpretable outputs, enhancing our understanding of cellular heterogeneity in dementia-related neurodegeneration and paving the way for novel therapeutic targets.
Relevant topics from methodological research:
- Explainable Machine Learning Techniques for Biological Data
- Machine Learning-Based Data Fusion Approaches for Omics Data
- Single-Cell State Identification and Classification
- Approaches to Polygenic Risk Modeling in Heterogeneous Disease Contexts
- Dimensionality Reduction for High-Dimensional Single-Cell Studies
- Scalable Data Integration Techniques for Single-Cell Data Across Multiple Datasets
- Clustering Algorithms for Single-Cell Data Analysis
Relevant topics from natural sciences or technology:
- Single-Cell Transcriptomics in Neurodegenerative Disease
- Human Brain Atlas Development Using Single-Cell Data and Machine Learning Techniques
- Gene Prioritization in Neurodegenerative Diseases
- Predictive Modeling of Disease Progression in Neurodegenerative Disorders
- Biomarker Discovery for Early Dementia Diagnosis
- Molecular Pathways in Dementia Pathogenesis
- Neuroinflammation and Microglial Activation
- Explainable Machine Learning for Identifying Key Drivers in Neurodegenerative Disease
External partners:
- Oslo University Hospital (OUH)
- Integreat: Norwegian Centre for Knowledge-Driven Machine Learning
- Norwegian Institute of Public Health (NIPH)
- Norwegian Centre for Mental Disorders Research (NORMENT)
- Healthcare providers and research institutions specializing in neurodegenerative diseases
- AI and data science organizations focused on medical applications
- Biotech and genomics companies with access to large-scale datasets
- Pharmaceutical companies invested in neurodegenerative disease treatments and biomarker discovery