As a bioinformatics lab, we are primarily interested in the development of new mathematical and computational methods applied to biological problems. Our main research topics are:
Bayesian statistics for genomics data
The development of large-scale genomics data sets, in particular those from micro-arrays, has been plagued by doubts over the reliability and accuracy of the data generated by these methods. Although the technology is better understood now, current error models are still relatively limited.
Using the wealth of data stored in public repositories and calibration data sets, we are studying the impact of technical effects on the final data, and building general error models to account for the correlations introduced by these effects.
We are developing data processing strategies for expression micro-array data, with three aims: the methods must preserve (when possible) the complex structure of gene expression, in particular with regard to possible cross-hybridisation and the relative contribution of alternative transcripts to the overall expression of a gene; they must include all experimental data and account for multiple sources of noise (probe location, probe sequence, ...); and they must provide a comprehensive error model which offers variance-covariance estimates between the gene expression values.
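To make the last aim concrete, here is a minimal sketch of a variance-covariance estimate between gene expression values. It computes the plain unbiased sample covariance across replicates and is not the error model described above, which also has to account for probe-level noise sources. The function name `covariance_matrix`, the gene names and the numbers are all hypothetical.

```python
from typing import Dict, List


def covariance_matrix(expression: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Unbiased sample variance-covariance matrix between genes.

    `expression` maps a gene name to its expression values measured
    across the same set of replicates (same order for every gene).
    """
    genes = list(expression)
    if not genes:
        raise ValueError("no genes supplied")
    n = len(expression[genes[0]])
    if n < 2 or any(len(values) != n for values in expression.values()):
        raise ValueError("need at least two replicates, and the same number per gene")

    means = {g: sum(expression[g]) / n for g in genes}
    cov: Dict[str, Dict[str, float]] = {}
    for a in genes:
        cov[a] = {}
        for b in genes:
            # Sum of products of deviations from the mean, divided by n - 1.
            cov[a][b] = sum(
                (x - means[a]) * (y - means[b])
                for x, y in zip(expression[a], expression[b])
            ) / (n - 1)
    return cov


# Illustrative values only: three genes measured in three replicates.
example = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.0],
    "geneC": [3.0, 2.0, 1.0],
}
cov = covariance_matrix(example)
```

The diagonal entries are the per-gene variances; the off-diagonal entries show which genes vary together across replicates and are the quantities that an assumption of independence between genes would ignore.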
Reconstruction of regulatory networks
The organisation of genes into pathways and/or networks provides invaluable information for the understanding of a biological process. Unfortunately, reverse-engineering such networks from expression patterns alone is one of the biggest challenges in systems biology. We are working on extending reconstruction methods to include error models which take into account the non-independence between gene expression values.
We are interested in two aspects of regulatory networks. First, in collaboration with the Lieberam Lab, we are exploring methods to identify the most important relations between genes when the primary experimental evidence is restricted to co-expression. We are working on ways to include covariance and correlation between expression patterns to provide the most informative starting point for network reconstruction. Second, we are interested in modelling small regulatory networks by various methods, and in the information flow in biological networks.
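As an illustration of using correlation between expression patterns as a starting point, the sketch below ranks gene pairs by Pearson correlation and keeps those above a threshold as candidate edges. This is a generic co-expression baseline, not the method under development with the Lieberam Lab; the function names, the threshold of 0.9 and the example profiles are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return sxy / sqrt(sxx * syy)


def coexpression_edges(
    profiles: Dict[str, List[float]], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Gene pairs with |r| >= threshold, strongest correlation first."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    edges.sort(key=lambda edge: abs(edge[2]), reverse=True)
    return edges


# Illustrative expression profiles over four conditions.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 3.0],
}
edges = coexpression_edges(profiles)
```

Negative correlations are kept because repression is as informative as activation for network reconstruction. A hard threshold treats every correlation as equally reliable, which is the limitation that an error model for non-independent expression values is meant to address.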
Automation
When used in model animals with genetic modifications, behavioural assays provide a system-level functional readout of genetic alterations that were previously characterised only at the cellular level. Unfortunately, these assays are often time-consuming and, as they are usually not standardised, difficult to compare. While the physical automation of these assays partially removes these obstacles, the results (typically short movies showing the animal's behaviour in highly controlled conditions) have to be quantified automatically in order to gain the maximum unbiased information from the experiment. We are interested in the development of algorithms and software for tracking the vertical motion of D. melanogaster flies in tubes, and in statistical methods to average and compare trajectories from different animals.
In collaboration with the Williams Lab, we are exploring the automation of negative geotaxis phenotypic assays for D. melanogaster. In particular, processing the movies that record the motion of the animals in the tubes requires the development of specialised heuristic algorithms to reconstruct individual trajectories, because of the severe occlusion problem caused by the large number of animals in each tube. We are involved in the development of such algorithms and their practical implementation, as well as in the mining of future data sets to identify robust and informative numerical quantities that describe the trajectories and are suitable for a quantitative statistical analysis of the animals’ motions.
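To give a flavour of the trajectory reconstruction problem, the sketch below links per-frame detections into tracks with a greedy nearest-neighbour rule and then computes one simple numerical quantity, the mean vertical speed. It is a deliberately naive baseline, not the specialised algorithm mentioned above: it does not handle occlusion, and the names `link_detections`, `mean_vertical_speed` and `max_jump`, the coordinate convention (y increasing upwards) and the example coordinates are all assumptions.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y); y is assumed to increase upwards


def link_detections(frames: List[List[Point]], max_jump: float) -> List[List[Point]]:
    """Greedily link per-frame detections into trajectories.

    Each open track takes the closest unclaimed detection in the next
    frame if it lies within `max_jump`; otherwise the track is closed.
    Detections left unclaimed start new tracks.
    """
    finished: List[List[Point]] = []
    active: List[List[Point]] = []
    for detections in frames:
        unclaimed = list(detections)
        still_active: List[List[Point]] = []
        for track in active:
            last = track[-1]
            best, best_dist = None, max_jump
            for point in unclaimed:
                dist = hypot(point[0] - last[0], point[1] - last[1])
                if dist <= best_dist:
                    best, best_dist = point, dist
            if best is None:
                finished.append(track)  # lost, e.g. occluded or out of view
            else:
                unclaimed.remove(best)
                track.append(best)
                still_active.append(track)
        still_active.extend([point] for point in unclaimed)
        active = still_active
    return finished + active


def mean_vertical_speed(track: List[Point], frame_interval: float) -> float:
    """Net vertical displacement per unit time over the whole track."""
    if len(track) < 2:
        return 0.0
    return (track[-1][1] - track[0][1]) / ((len(track) - 1) * frame_interval)


# Illustrative detections for two animals over three frames.
frames = [
    [(10.0, 0.0), (30.0, 0.0)],
    [(10.0, 5.0), (30.0, 2.0)],
    [(11.0, 10.0), (30.0, 4.0)],
]
tracks = link_detections(frames, max_jump=8.0)
speeds = [mean_vertical_speed(track, frame_interval=0.5) for track in tracks]
```

When two animals cross or overlap, a greedy rule like this can swap identities or break a track in two, which is why heuristics designed for crowded tubes are needed in practice.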