Rob (or “Dr. Robert Beiko”, if you want to be all formal about it) is an Associate Professor and Canada Research Chair in Bioinformatics in the Faculty of Computer Science at Dalhousie University. Before coming to Dal in 2006, he was a postdoc in the lab of Mark Ragan at the University of Queensland in Brisbane, Australia. And before that, he completed a PhD in Biology at the University of Ottawa (1998-2003). Although all of his formal training was in biology, an interest in machine-learning approaches, algorithms for identifying important evolutionary events, and visualization of biological data has ultimately led him to put down stakes in Computer Science and collaborate with some of the best in the business here.
The primary question we ask in the Beiko lab is “What are the microbes DOING to each other?” This leads us in a number of different directions. For example, from the postdoc to the present, there has been a continuous thread of stories about lateral gene transfer or LGT (or HGT, if you’re one of those people). Although we informally refer to this process as “gene sharing”, it’s really anything but, as typically the donor cell needs to die before the recipient can take up its DNA. As with everything else in biology, there are exceptions: LGT appears to be important in biofilm formation, as massive gene transfer appears to reinforce the diversity of many species. There are some amazing stories of LGT in the evolution of microbes, including massive gene transfer between thermophilic organisms, and an emerging story of acidophiles that may be even crazier. And since a great deal of microbial diversity remains unexplored, who knows what stories are yet to be discovered?
Discovering LGT events is not a trivial matter. We have developed several efficient algorithms for detecting these events through the comparison of phylogenetic trees; most recently, we have used these methods to build “supertrees” of >200 microorganisms and quickly identify the “highways” of gene sharing that connect them. This is an ongoing project, and we would like to scale up this method to incorporate the thousands of microbial genomes that are available, and reluctantly include more complex organisms as well.
There is a persistent question about whether LGT mechanisms evolved to serve the purpose of gene transfer, or whether gene transfer is a side effect of other mechanisms. In either case, its implications are profound. Figuring that LGT might be particularly interesting and important within microbial communities, we started to look for evidence that specific functions may have been transferred in, for instance, the human gut. And guess what: we found it. The “big story” of how LGT influences the formation of microbial communities is slowly emerging, and there is a great deal of work yet to be done.
The microbes that form communities can interact in numerous ways, beyond those that are shaped by LGT. Stories are emerging now of remarkable dependencies between organisms, where organism 1 will start the job (metabolically speaking) and organism 2 will finish it, providing the results to other members of a community. These tight interactions emerge through a combination of gene mutation, gene loss, and LGT. We are now trying to explore the rules that might govern these processes. In support of this, we have developed methods to assign sequences from a metagenomic sample to the most likely originating genome. By attaching functional genes (the “what”) to organisms (the “who”) we hope to discover these associations. We have also developed and evaluated methods that express the similarity of microbial samples, to try and determine which environmental features influence them most strongly.
The age of massive DNA sequencing allows us not only to view individual microbes, but also to consider the similarity of microbes across space, time, and environmental conditions. We have developed the GenGIS software to allow interactive visualization and analysis of microbial samples, and to retrieve data from a growing list of online resources. Although GenGIS was initially conceived for microbial applications, others have used it to study the evolution and ecology of lizards, kangaroo apples, goats, viruses, and human populations. Here, we are using GenGIS to support biomonitoring of threatened sites, and to track emerging pathogens.
We have applied machine-learning techniques to a number of problems in genomics, including the classification of promoters (regulatory regions of genes), the association of genotype with phenotype, and, most recently, genome-wide association studies in mice. At the moment we’re also watching the protein function prediction competitions with interest, as these methods can improve the identification of the “what”, and even the current sophisticated methods seem to leave much room for improvement.
If any of this interests you, I encourage you to get in contact with me!