Transforming data into knowledge relevant to health and disease

A search for new junior group leaders for BIH and IRI

The process began last summer, and after five months has finally reached what is likely to be the culminating step: a choice of new junior group leaders. The means by which such groups are established vary between institutes and programs, and this case is anything but typical because it involves three major parties and two positions. One group will likely be established at the MDC under the roof of the Berlin Institute of Health, whose aim is to unify research at the MDC and Charité, and the second is a joint call with Berlin's Integrative Research Institute for the Life Sciences. IRI, as this program is called, is five years old and draws together both BIH partners and Humboldt-Universität.

The aim of both searches is to find scientists who will help sift through huge amounts of data – hopefully including information related to clinical studies – to produce insights that will ultimately aid in the diagnosis, treatment and prevention of disease. That's a very wide mandate, and the call that went out in October attracted 33 applicants with diverse interests and expertise. The ideal candidate would be someone with an outstanding record of interdisciplinary post-doctoral research, aiming to take one of the most important steps in a scientific career: establishing an independent group.

A committee consisting of experts from the three institutes culled the list down to eight, seven of whom accepted an invitation to come to Berlin in late February for rounds of presentations and talks with the final selection committee and individual scientists. Cornelia Maurer of the MDC, who has helped organize the process from the beginning, said that the next steps will probably go quickly.

"It's usually not like filling a position for a professorship, where candidates may be considering several offers for positions at the same time," she says. "That's often a very long process. In this case, based on the quality of the applications and the importance of the positions for the individuals, the committee will probably be able to make the selections and offers quite quickly. So hopefully the groups can be established quite soon."

* * * * *

On Thursday, Feb. 12, the seven candidates had half an hour to make an impression by giving presentations and fielding a few questions in an open session. The room in the Festsaal on the Campus Mitte – home to units of the Charité, HU, and the site where the new BIMSB building is under construction – was packed.

Erwin Böttinger, CEO of BIH, speaking at a recent event at the MDC. Photo: BIH/Thomas Rafalzyk.

Prof. Erwin Böttinger, recently appointed as the new CEO of BIH, opened the day with a brief overview of expectations for the development of both biomedical research and the infrastructures in Berlin devoted to it. "We are seeing an incredible advance of information science into medicine and the life sciences that will characterize the 21st century," Böttinger stated. "We're now confronting a situation where millions of exomes and genomes have been sequenced and need to be understood. An additional, extremely important development regards the changing role of patients and other individuals in the research process. Health and biological data is moving from institutional hosts into the 'Cloud', raising new issues regarding permission and access. We need to be ready to embrace these changes and proactively create a community that will be ahead of the game in dealing with all of this data."

Berlin provides a great "ecosystem" for the life sciences, he added, through the development of BIH, which represents a unique integration of a federally funded institute with academics and universities, funded at the state level. "This will permit us to make truly major investments in the life sciences and medicine, and we are forming interdisciplinary communities around key topics such as health data science."

Böttinger and Nils Blüthgen of the Charité also provided an overview of the developing structure of the Campus Mitte, which is promoting direct interactions between the partners of BIH and IRI. The site is steadily becoming a meeting point between people and projects and is stimulating the type of collaborations – through a clustering of both technological platforms and expertise – that are needed for successful translational research.

* * * * *

The diversity of topics and approaches covered in the candidates' talks reflects the enormous range of data that scientists are trying to integrate in the closely related fields of health-related and basic biological research. Yet there were a number of common themes. One was the way scientists are moving from a rather generic view of the "reference genome" to variations of gene sequences found in individuals and tissues, and the ways they contribute to pathophysiological processes. Beyond the sequence level, a huge focus is being placed on understanding how the landscape of the genome is altered in diverse types of healthy and diseased cells, providing access to molecules so that sequences can be transcribed into RNAs and then expressed as proteins. Signatures of these processes are crucial to understanding the biological mechanisms underlying diseases, finding new means of diagnosing them, designing new treatments and understanding their effects. Another important theme is deriving information from images of cells and tissues. Making progress in all of these areas will require new methods to understand data produced by high-throughput experiments – and in many cases developing such approaches in the first place.

Anaïs Bardet, currently a postdoc at the FMI in Basel, presented her work on a fundamental chemical change that occurs globally across the genome and the way it influences the expression of DNA sequences. Molecules called DNA methyltransferases attach methyl tags to sequences – particularly when the nucleotides "C" and "G" lie next to each other. This epigenetic process influences whether proteins called transcription factors or other molecules can gain access to a region of DNA and what happens when they do. Patterns of methylation change as cells specialize, during the development of cancer, and in a range of other diseases. Bardet has developed techniques to block the activity of the methyltransferases and carry out other manipulations to control the tagging of specific sequences. She is studying the effects that this has on the activity of transcription factors – some of which respond to changes in DNA methylation, while others apparently do not. She is using the data she obtains to try to develop computational methods of predicting whether specific molecules are sensitive to changes in methylation that occur in a range of biological contexts. These processes are crucial, for example, in controlling the growth of both normal tissues and tumors.

Fabian Spill, a postdoc at MIT in Boston, is trying to develop models to clarify the way cells receive and respond to changes in the mechanical forces they experience during development, migration, and other situations. Such changes are obviously crucial in metastasis, as cancer cells leave a tumor and invade other tissues. As cells move through "looser" or "stiffer" environments, new signaling pathways are activated to change the expression of genes and thus the shape, behavior, and other features of cells. While biochemical signaling networks have been explored in depth, Spill says, their relationship to external influences such as mechanical stress is poorly understood. He has been studying the responses of pathways involving the genes yap and taz, whose deregulation contributes to the development of many types of cancer, when cells are grown on stiff or soft matrices. He is developing a hypothesis that cancer cells might overcome "stiff" barriers by hiding signals that they would normally activate in such environments. Modeling the synergies between cellular mechanosensing and biochemistry might lead to new therapeutic approaches that target them.

Davide Risso, a postdoc at the University of California in Berkeley, is working on problems related to individual variation that will have to be clarified for the development of "personalized medicine." These are approaches that take individual differences in genomes and lifestyle factors into account when making diagnoses, assessing disease risk, designing therapies, etc. A related area of Risso's work revolves around basic biological questions such as clarifying how many types of cells are present in tissues such as the brain and how they respond to changes. An enormous issue in these areas is to combine data from many types of experiments and distinguish between effects that are due to real biological differences and those that stem from the way experiments have been set up and conducted. Scientists call the latter "unwanted" variation, and a major challenge is to find ways to remove it when carrying out analyses. Risso has developed new computational approaches and, in a recent case, applied them to the study of a class of cells called L5 neurons. The methods have allowed him to distinguish gene signatures that break such cells into three distinct sub-types; he has also managed to chart their developmental stages and relate them to other types of neighboring cells. Many of the bioinformatic tools he has created have been made accessible to the community at large. This type of work will be crucial to clustering data to understand cellular processes in a wide range of diseases.

Another candidate working on the relationship between individual variation and disease is Birte Kehr, who currently works at the company deCODE in Iceland. deCODE has taken advantage of a huge amount of health and genomic data collected from Iceland's population, whose features have produced an amazing resource to study these types of questions. Sharp insights are emerging from the combination of huge amounts of sequence data with extensive records on the health, lifestyle, and genealogy of the individuals from whom it has been obtained. This unique set of data has enabled Kehr and her colleagues to carry out studies of the variation between individual genomes – which contain huge numbers of original insertions, deletions, and rearrangements of sequences. Little has been known about the extent of such changes and their true effects on individual health risks. One of Kehr's projects has been to identify an insertion of a sequence into a gene called CASP8. This molecule plays an important role in triggering cell death in cases where it is appropriate; disruptions of the gene or similar factors are often found in cancer. Other work has identified new risk factors in the development of several types of tumors.

Dagmar Kainmueller, currently working at the MPI for Molecular Cell Biology and Genetics in Dresden, is a specialist in computational approaches to extracting data from various imaging methods. This type of work is a crucial step in the development of high-throughput or large-scale biomedical imaging. In the past, analyzing images obtained in biological experiments or medical procedures often required painstaking and time-consuming work, which was an obstacle to deriving information by comparing images from different patients or samples. Kainmueller has worked on projects ranging from three-dimensional studies of the worm C. elegans – a favorite model of developmental biologists – to scans of the vertebrae of developing mice. She is developing potent new methods for "machine learning" that extend the current capacities of "neural networks": approaches that help a machine train itself to carry out analyses that imitate a scientist's knowledge and judgments. While such networks have been used in science for many years, it is still not completely understood how they work. Kainmueller's projects are helping advance this methodology, and she is applying it to questions such as the identification of cell types and borders from microscope data and the analysis of images captured from scans of animals – similar to those obtained in patient CTs.

Martin Kircher is a postdoc at the University of Washington and a former member of the lab of Svante Pääbo in Leipzig, known for its work on the genomes of Neandertals and other fossil species. The connection lies in the study of very short DNA sequences that can be obtained from both fossils and various types of patient tissues. He is using next-generation sequencing approaches to carry out two main types of research that are linked in a fascinating way: on the one hand, the data he is producing may be highly relevant in the diagnosis of cancer and other diseases in patients; on the other, it sheds light on the structure of the genome and the biology surrounding it. Blood and other tissues contain plenty of small fragments of DNA shed by tumors or cells affected by other types of diseases. Identifying, quantifying and comparing these sequences may produce profiles that will be extremely useful in diagnosing patients and studying the effects of therapies. Kircher is also using DNA fragments to study crucial aspects of the architecture of genomes. The ends of the fragments provide information about where molecules such as transcription factors or histones are bound to the DNA, and their positions directly influence which genes are transcribed in different types of cells and tissues, both in health and disease. This provides a completely different type of information that can be combined with other information about genome architecture to characterize developmental and pathological processes.

The final speaker of the day was Pierre Cauchy, currently a postdoc at the University of Birmingham. Cauchy's recent work has focused primarily on the way transcription factors such as Ets1 bind to DNA during the development of blood cells and the aberrations found in diseases such as myeloid leukemias. Blood has long been a premier system used to study the differentiation of cells from a stem state to their final, differentiated forms, and leukemias represent a case where this progression is disrupted – revealing links between stem cell and cancer processes. Ets1 is particularly important in the development of crucial immune cells called T cells, which groups at the MDC and elsewhere are actively engineering in hopes of creating immunotherapies. Understanding the way transcription factors influence these processes is crucial to understanding the diseases and designing new therapies, but it has been difficult to combine different types of data to describe the interactions between the molecules and DNA sequences such as enhancers that may lie far from genes yet control their output. Cauchy has taken two approaches to these questions. The first is to characterize the mechanisms that regulate genes in disease and to study what happens when they are disturbed; the second is to carry out similar procedures in normal, healthy cells. His work has identified "footprints" that reveal sequences where transcription factors are bound – under a variety of natural and manipulated conditions – and is exposing the relationships between some of these complex events and the mechanisms that guide cell development.

All in all, the talks made for an extremely interesting day, giving the listeners a broad view of many diverse approaches to the computational analysis of biomedical data. Each speaker presented one piece of the huge issue confronting today's research: integrating vast amounts of information in ways that will allow us to distinguish between general and unique individual patterns, expose basic biological mechanisms, and intervene in ways that will promote both knowledge and real benefits for patients. At the moment, the challenge for each researcher is to find an approach and a model that will carry these efforts the farthest, in the interdisciplinary and interinstitutional context of Berlin. The immediate decision is in the hands of the selection committee: all those who attended the event – particularly the candidates! – are now eagerly awaiting its decision.