The Dahlquist Lab pursues three distinct but related research projects. The common thread among these projects is that they employ the techniques of bioinformatics, biomathematics, and genomics and the perspective of systems biology. All of these projects involve undergraduate students and interdisciplinary collaborations with other faculty at LMU: Dr. John David N. Dionisio (Department of Electrical Engineering and Computer Science) and Dr. Ben G. Fitzpatrick (Department of Mathematics). These research projects have also been brought into the classroom in such courses as Biology/Computer Science 367: Biological Databases, Biology 368: Bioinformatics Laboratory, Biology/Mathematics 388: Biomathematical Modeling, and Biology 478: Molecular Biology of the Genome. For lab protocols, see the Dahlquist Lab wiki at OpenWetware.org.
The Global Transcriptional Response of Saccharomyces cerevisiae to Cold Shock and Recovery
The complete sequencing of the human genome and those of other major model organisms, along with the invention of high-throughput methods to measure gene expression, has propelled biology into the genomics era. Where once biologists could only study genes one at a time, DNA microarrays now routinely generate data for thousands of genes in a single experiment. My research harnesses the power of DNA microarray and other types of genomic data to elucidate the systems-level properties of gene regulatory networks in the budding yeast, Saccharomyces cerevisiae. Yeast responds to environmental stresses through characteristic programs of gene expression. The transcriptional responses to heat shock and a variety of other stressors, such as changes in nutrient availability, osmolarity, and oxidative stress, have been well characterized. However, the response to cold shock has been less well studied. Previous studies on the transcriptional response of budding yeast to cold shock have revealed that the response can be divided into a set of early response genes (after 15 minutes to 2 hours of cold temperatures) and late response genes (after 12 to 60 hours of cold temperatures) (Sahara et al., 2002 J Biol Chem 277:50015; Kandror et al., 2004 Mol Cell 13:771; Schade et al., 2004 Mol Biol Cell 15:5492). The late response genes include the environmental stress response (ESR) genes, which are induced by many environmental stresses and are regulated by the Msn2p/Msn4p transcription factors (Schade et al., 2004). While these studies have begun to characterize the transcriptional response of yeast to cold shock, they have not determined which transcription factors are responsible for inducing the early response genes, which are the genes that respond uniquely to cold shock.
Our lab has conducted a screen of transcription factor deletion strains for impaired growth at cold temperatures. We have used DNA microarrays to measure the global transcriptional response of Saccharomyces cerevisiae to cold shock and subsequent recovery in the wild type BY4741 strain and in several transcription factor deletion strains identified in our screen. DNA microarrays measure the mRNA levels of all approximately 6000 yeast genes simultaneously, giving a snapshot of all transcriptional activity in the cells at one time. DNA microarrays were initially obtained through the Genome Consortium for Active Teaching. In our experiments, yeast cells were grown to early log phase at 30°C, then shifted to 13°C for 60 minutes, and then shifted back to 30°C for another 60 minutes. Samples were collected before cold shock (t0), after 15, 30, and 60 minutes of cold shock (t15, t30, and t60), and after 30 and 60 minutes of recovery at 30°C (t90 and t120). Yeast samples were collected for four independent replicates of the time course experiment for each strain. Total RNA was purified from each of the samples. The mRNA was amplified, labeled with Cy3 and Cy5 dyes by the indirect method, and hybridized to DNA microarrays according to the manufacturers’ protocols provided with the reagents. Data were normalized using the limma package in R, and statistical tests were performed to determine which genes had significant differences in expression at each timepoint within and between strains (see also the Dahlquist Lab site at OpenWetware.org for protocols). These data are then used as input to the mathematical model described below.
GRNmap: Gene Regulatory Network Modeling and Parameter Estimation (GRNmap Web Site)
Gene expression is a complex biological process in which cells first transcribe the genes encoded in their DNA into an intermediary known as mRNA. The cell then translates the mRNAs into proteins. Transcription factors are regulatory proteins that increase or decrease the rate at which a cell transcribes a gene. Genome-wide location analysis has determined the relationships between transcription factors and their target genes on a global scale in budding yeast, Saccharomyces cerevisiae (Lee et al., 2002 Science 298:799; Harbison et al., 2004 Nature 431:99). While these data have identified properties of the network topology, they do not reveal the dynamics of the network's behavior. In the first iteration of our modeling efforts, we used ordinary differential equations to model how the expression of genes in the cell changes over time for a medium-scale gene regulatory network of twenty-one transcription factors controlling the environmental stress response in yeast. The expression levels of the individual transcription factors were modeled using mass balance ordinary differential equations with a sigmoidal production function. Each equation includes a production rate, a degradation rate, weights that denote the magnitude and type of influence of the connected transcription factors (activation or repression), and a threshold of expression. The weights were optimized to published gene expression data from yeast exposed to the environmental stress of cold shock (Schade et al., 2004) using a penalized least squares approach. Model predictions fit the experimental data well, within the 95% confidence interval. The modeling has given new insights into the gene regulatory network controlling the cold shock response in yeast and was published in the Bulletin of Mathematical Biology in 2015.
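The model structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the GRNmap code (which is written in MATLAB): the function names, the forward Euler integrator, the convention that weights[i][j] is the influence of regulator j on gene i, and the L2 form of the penalty are all assumptions made for the example.

```python
import math


def sigmoid(weights_row, expression, threshold):
    """Sigmoidal production term in (0, 1) for one gene.

    Positive weights model activation, negative weights model repression.
    """
    net_input = sum(w * x for w, x in zip(weights_row, expression)) - threshold
    return 1.0 / (1.0 + math.exp(-net_input))


def derivatives(expression, production, degradation, weights, thresholds):
    """Mass balance: production rate times sigmoid, minus degradation."""
    return [
        production[i] * sigmoid(weights[i], expression, thresholds[i])
        - degradation[i] * expression[i]
        for i in range(len(expression))
    ]


def simulate(initial, production, degradation, weights, thresholds,
             t_end, dt=0.01):
    """Forward Euler integration from t = 0 to t_end; returns final levels."""
    levels = list(initial)
    for _ in range(int(round(t_end / dt))):
        rates = derivatives(levels, production, degradation,
                            weights, thresholds)
        levels = [x + dt * r for x, r in zip(levels, rates)]
    return levels


def penalized_least_squares(predicted, observed, weights, alpha):
    """Squared error between model and data plus an L2 penalty on weights.

    The L2 penalty and the name alpha are assumptions for illustration.
    """
    error = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    penalty = alpha * sum(w * w for row in weights for w in row)
    return error + penalty
```

In a parameter estimation loop, an optimizer would repeatedly call simulate with candidate weights and thresholds and keep the values that minimize penalized_least_squares against the measured time course.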
We have continued to develop the modeling software, called GRNmap, which is written in MATLAB. We have added several new features to the model. The user can choose between the original sigmoidal production function and a Michaelis-Menten production function. The model can now use replicate data and data from both the wild type and deletion strains as input. However, the large number of developers and the long time span of development led to a code base that was difficult to revise and adjust. We therefore refactored the script-based software, which relied on global variables, into a function-based package that uses an object to carry relevant information from function to function. This modular approach allows for cleaner, less ambiguous code and increased maintainability. In addition, we have added a simple user interface, removing the need for users to edit MATLAB code. Finally, after the code was refactored and tested, we used the MATLAB compiler to create an executable file that can be run on any Windows machine without a MATLAB license, increasing the accessibility of our program. We follow open development best practices using our GitHub repository; the code and executable are available under an open source license from the GRNmap web site.
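The refactoring pattern described above, in which a single object replaces global variables and is passed from function to function, might look like the following in Python. This is only a sketch of the general idea: GRNmap itself is MATLAB, and the class name, field names, and step functions here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RunContext:
    """Hypothetical container that replaces a set of global variables."""
    production_function: str = "sigmoid"
    expression: dict = field(default_factory=dict)
    messages: list = field(default_factory=list)


def read_input(context, raw_rows):
    """Store expression values on the context instead of in globals."""
    for gene, values in raw_rows:
        context.expression[gene] = [float(v) for v in values]
    context.messages.append(f"read {len(raw_rows)} genes")
    return context


def check_settings(context):
    """Validate the user's choice of production function."""
    allowed = {"sigmoid", "michaelis-menten"}
    if context.production_function not in allowed:
        raise ValueError(
            f"unknown production function: {context.production_function}"
        )
    context.messages.append("settings ok")
    return context


def run_pipeline(raw_rows, production_function="sigmoid"):
    """Each step receives the context and hands it to the next step."""
    context = RunContext(production_function=production_function)
    context = read_input(context, raw_rows)
    context = check_settings(context)
    return context
```

Because every function states its inputs explicitly, each step can be tested in isolation, which is the maintainability gain the refactoring was after.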
GRNsight: a Web Application and Service for Visualizing Models of Gene Regulatory Networks (GRNsight Web Site)
GRNsight is an open source web application for visualizing models of small- to medium-scale gene regulatory networks. A gene regulatory network (GRN) consists of genes, transcription factors, and the regulatory connections between them, which govern the level of expression of mRNA and protein from those genes. GRNs can be mathematically modeled and simulated by applications such as GRNmap, described above, a MATLAB program that estimates the parameters and performs forward simulations of a differential equations model of a GRN. Computer representations of GRNs, such as the models output by GRNmap, are in the form of a tabular spreadsheet (adjacency matrix) that is not easily interpretable. Ideally, GRNs should be displayed as diagrams (graphs) detailing the regulatory relationships (edges) between the genes (nodes) in the network. To address this need, we developed GRNsight.
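To make the relationship between the adjacency matrix and the graph concrete, here is a small Python sketch that converts a weighted adjacency matrix into a list of edges. It is not GRNsight code. It assumes that columns are regulators and rows are targets, that a zero entry means no connection, and that the sign of a weight distinguishes activation from repression; the gene names are placeholders.

```python
def adjacency_to_edges(genes, matrix):
    """Convert a weighted adjacency matrix into a list of graph edges.

    Assumed layout: matrix[target][regulator]. A positive weight is read
    as activation, a negative weight as repression, and zero as no edge.
    """
    edges = []
    for target_index, row in enumerate(matrix):
        for regulator_index, weight in enumerate(row):
            if weight == 0:
                continue
            kind = "activation" if weight > 0 else "repression"
            edges.append(
                (genes[regulator_index], genes[target_index], kind, weight)
            )
    return edges


# Placeholder three-gene network, not real data.
genes = ["A", "B", "C"]
matrix = [
    [0.0, 0.0, -1.5],  # A is repressed by C
    [0.8, 0.0, 0.0],   # B is activated by A
    [0.0, 0.3, 0.0],   # C is activated by B
]
for regulator, target, kind, weight in adjacency_to_edges(genes, matrix):
    print(f"{regulator} -> {target} ({kind}, weight {weight})")
```

Each tuple corresponds to one arrow in the diagram, so a graph layout tool only needs the gene list for the nodes and this edge list for the connections.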
XMLPipeDB: A Reusable, Open Source Tool Chain for Building Relational Databases from XML Sources (XMLPipeDB Web Site)
GenMAPP (Gene Map Annotator and Pathway Profiler) is a tool for viewing and analyzing DNA microarray and other types of high-throughput data on "MAPPs" representing biological pathways or other functional groupings of genes. GenMAPP has graphics tools for drawing MAPPs, and also has an underlying Gene Database that allows users to give the genes placed on the MAPP an identifier from a public database. This feature makes it possible for users to import gene expression data into GenMAPP and color-code the genes according to the data. MAPPFinder works with GenMAPP to determine which Gene Ontology terms are over-represented among genes changed in the imported expression data. Although GenMAPP is now considered “legacy” software and is no longer supported, the LMU Bioinformatics Group, headed by Dr. Dahlquist and Dr. John David N. Dionisio of the Department of Electrical Engineering and Computer Science, has extended its life by providing new and updated Gene Databases for use with GenMAPP using the open source XMLPipeDB software suite.
XMLPipeDB is a reusable, open source tool chain for automatically building relational databases from an XML schema (XSD). "XML" stands for "Extensible Markup Language." Bioinformatics data is typically provided in XML format because an accompanying XSD or DTD document gives a complete description of the format of the XML data, which makes it less idiosyncratic than other formats and easy to read programmatically. While XMLPipeDB is a general-purpose tool that can be used for any type of XML data, we have used it to create and update Gene Databases for GenMAPP. The software uses UniProt as the main data source, is robust to changes in source file formats, uses XML sources wherever possible, takes advantage of existing open source tools, and limits the manual manipulation of the data. The XMLPipeDB software suite includes three individual programs: XSD-to-DB, XMLPipeDB Utilities, and GenMAPP Builder. XSD-to-DB reads an XSD (XML schema) and automatically generates the SQL schema file, Java classes, and Hibernate mappings needed to create a relational database. XMLPipeDB Utilities is a general-purpose library for performing simple database functions such as importing XML data into the relational database and running simple queries. GenMAPP Builder is a downstream application that exports the data as a GenMAPP-formatted Gene Database. GenMAPP Builder has been used to create Gene Databases for several species, including Arabidopsis thaliana, Bordetella pertussis, Burkholderia cenocepacia, Chlamydia trachomatis, Escherichia coli, Helicobacter pylori, Leishmania infantum, Leishmania major, Mycobacterium smegmatis, Mycobacterium tuberculosis, Plasmodium falciparum, Pseudomonas aeruginosa, Salmonella typhimurium, Shewanella oneidensis, Shigella flexneri, Sinorhizobium meliloti, Staphylococcus aureus, Streptococcus pneumoniae, and Vibrio cholerae. XMLPipeDB source code and Gene Databases are available for download from our GitHub site.
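The idea of deriving a relational schema from an XSD can be illustrated with a toy Python example. This is not how XSD-to-DB is implemented (the suite generates Java classes and Hibernate mappings); it is a simplified sketch that handles only top-level elements containing a flat sequence of simple-typed children. The type mapping, the added id column, the reliance on the xs: prefix, and the example schema are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical mapping from XSD built-in types to SQL column types.
# It assumes the schema uses the conventional "xs:" prefix.
SQL_TYPES = {
    "xs:string": "TEXT",
    "xs:int": "INTEGER",
    "xs:integer": "INTEGER",
    "xs:decimal": "NUMERIC",
    "xs:date": "DATE",
}


def xsd_to_sql(xsd_text):
    """Emit one CREATE TABLE statement per top-level complex element."""
    root = ET.fromstring(xsd_text)
    statements = []
    for element in root.findall(f"{XS}element"):
        sequence = element.find(f"{XS}complexType/{XS}sequence")
        if sequence is None:
            continue  # skip elements without a flat child sequence
        columns = ["id INTEGER PRIMARY KEY"]  # surrogate key, an assumption
        for child in sequence.findall(f"{XS}element"):
            sql_type = SQL_TYPES.get(child.get("type"), "TEXT")
            columns.append(f"{child.get('name')} {sql_type}")
        statements.append(
            f"CREATE TABLE {element.get('name')} (" + ", ".join(columns) + ");"
        )
    return statements


# Made-up schema for illustration only; not a real UniProt XSD.
EXAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="gene">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="symbol" type="xs:string"/>
        <xs:element name="length" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

for statement in xsd_to_sql(EXAMPLE_XSD):
    print(statement)
```

Because the table definitions are derived from the schema rather than written by hand, a change in the source format only requires rerunning the generator on the new XSD, which is the kind of robustness to format changes that the tool chain aims for.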
XMLPipeDB development began in Spring 2006 as a group project in a special studies course in Bioinformatics (CMSI 698/BIOL 498) team-taught by Drs. Dahlquist and Dionisio. This work is described in a publication in the ACM SIGCSE journal, and XMLPipeDB is featured in our cross-listed and team-taught course, BIOL/CMSI 367: Biological Databases (e.g., see the 2015 course wiki).
Last modified: 1/12/16