Our research areas are computational biology and bioinformatics, where we are interested in solving mathematical and computational problems arising from the analysis of functional genomics data.

Workflow of a study combining functional genomics data generation, computational analysis, and experimental validation. From Talukdar et al. (2016).

We are particularly interested in causal inference and scientific machine learning, modelling approaches where existing knowledge about a system is encoded in a qualitative causal diagram or a set of stochastic differential equations, and machine learning and large data are used to “fill the gaps”, that is, to parametrize the model and perform statistical inference.

In biology, our main interest is in systems biology, where we try to understand how genetic variation between individuals influences variation in gene regulation, gene regulatory networks, and health and disease outcomes.

Topics we are particularly interested in at the moment are:

  • Causal gene regulatory network inference
  • Causal inference for stochastic processes
  • Causal models for cellular differentiation and gene expression dynamics
  • Causal models for dimensionality reduction
  • Branching processes and stochastic processes on trees
  • Geometric deep learning

Projects

MultiomIcs-based Risk stratification of Atherosclerotic CardiovascuLar disease (2023-2027)

Atherosclerotic cardiovascular disease (ASCVD) is the leading cause of mortality worldwide. Aside from asymptomatic manifestations, the first sign of clinically significant ASCVD is often a severe clinical event, such as stroke or myocardial infarction (MI). Thus, identifying individuals at high risk is crucial in preventing the fatal consequences of ASCVD. Current risk prediction models based on traditional risk factors, such as SCORE2, have limitations since they do not encompass all mechanisms and intermediary phenotypes leading to ASCVD. In particular, current risk models fail to consider the disturbance of gene regulatory networks (GRNs) caused by genetic risk factors and diverse longitudinal exposures accumulating during a person’s lifetime. Furthermore, the current models predict the combined risk of coronary artery disease (CAD), peripheral artery disease (PAD) and ischemic stroke, despite mounting evidence of the heterogeneity of the underlying disease mechanisms. To capture the missing aspects of current ASCVD risk scores, the MIRACLE project brings together unique data resources and expertise to provide novel multiomics-based prediction models of ASCVD. We aim to (1) integrate the globally largest CAD, PAD, and stroke GWAS information to identify genetic loci that differ between or are shared by these diseases and their subtypes, (2) identify sex-specific subtypes of ASCVD patients using transcriptomic phenotyping of plaques and circulating biomarkers, (3) generate functionally informed polygenic risk scores by combining experimental fine-mapping and gene prioritization approaches with integrative GRN and deep learning modelling, and (4) derive novel risk prediction models incorporating polygenic risk and circulating biomarkers. Providing a new gold standard for prediction models that accurately risk-stratify stroke and MI represents a technological breakthrough allowing for earlier diagnoses and treatments of ASCVD.

The project is funded through the EIC Pathfinder Challenge: Cardiogenomics.

See the official project homepage for more information.

Group members involved
Project partners

New technologies for target discovery in neuropsychiatric disorders (2022-2027)

Neuropsychiatric disorders (NPDs) are major causes of human suffering and of loss of life and productivity all over the world. Currently used pharmacotherapies against NPDs have low efficacy and specificity, and were introduced 50-100 years ago, mainly based on accidental findings. However, recent molecular genetic studies have revealed strongly associated genetic loci, enriched in brain pathways and proteins, and possible new therapeutic targets for NPDs. Moreover, revolutionary technological breakthroughs have been reported in computational and experimental molecular life sciences. These technologies will be refined, applied and combined in the NeuroConvergence project for systematic identification and testing of new molecular targets in the treatment of NPDs.

More information can be found in the NFR project bank.

Group members involved
  • Ammar Malik
Project partners

Intelligent systems for personalized and precise risk prediction and diagnosis of non-communicable diseases (2021-2024)

This project will create intelligent systems for personalized and precise risk prediction and diagnosis of non-communicable diseases using multi-omics data. To do so, it will develop, implement and validate novel algorithms for structure learning and inference in large-scale, multi-organ causal Bayesian gene networks, building on computational methods that we have developed previously to infer, characterize and validate gene regulatory networks in complex diseases.

More information can be found in the NFR project bank.

Group members involved
Project partners

Scalable causal gene network inference via genetic node ordering (2015-2017)

The aim of this project is to reconstruct causal, global and high-quality gene networks from large-scale omics data to understand how the genotype determines the phenotype.

For more information, see the UKRI project bank.

Group members involved
  • Lingfei Wang
Software

The main outcome of this project was the Findr software.