The Dahlquist Lab performs four distinct but related research projects. The common thread among these projects is that they employ the techniques of bioinformatics and genomics and the perspective of systems biology. All of these projects involve undergraduate or Master's level students, and most involve interdisciplinary collaborations with other faculty at LMU (see: People). These research projects have also been brought into the classroom (see: Courses). For lab protocols, see the Dahlquist Lab site at OpenWetware.org.
GenMAPP (Gene Map Annotator and Pathway Profiler) is a tool for viewing and analyzing DNA microarray and other types of high-throughput data on “MAPPs” representing biological pathways or other functional groupings of genes. GenMAPP has graphics tools for drawing MAPPs, but also has an underlying Gene Database which allows users to give the genes placed on the MAPP an identifier from a public database. This feature makes it possible for users to import gene expression data into GenMAPP and color-code the genes according to the data. MAPPFinder works with GenMAPP to determine which Gene Ontology terms are over-represented among genes changed in the imported expression data.
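To make the idea of an "over-represented" Gene Ontology term concrete, the following is a minimal sketch of one common way to score it: a one-sided hypergeometric test. This is an illustration only, and it is not claimed to be the exact statistic that MAPPFinder reports. The function name and its parameters are hypothetical.

```python
from math import comb


def overrepresentation_p_value(n_total, n_changed, n_in_term, n_changed_in_term):
    """Probability of seeing at least `n_changed_in_term` changed genes in a
    term, if `n_changed` genes were drawn at random from `n_total` genes of
    which `n_in_term` are annotated to the term (hypergeometric upper tail).

    Illustrative sketch; all names here are hypothetical.
    """
    upper = min(n_changed, n_in_term)
    denominator = comb(n_total, n_changed)
    tail = sum(
        comb(n_in_term, k) * comb(n_total - n_in_term, n_changed - k)
        for k in range(n_changed_in_term, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the function, not real data.
    print(overrepresentation_p_value(10, 3, 4, 2))
```

A small p-value suggests that the term contains more changed genes than expected by chance. In practice, testing many terms at once also calls for a multiple-testing correction, which this sketch omits.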
The Dahlquist Lab has been working to address two limitations of GenMAPP: (1) GenMAPP is only as useful as the number of MAPPs available; and (2) GenMAPP can only be used with species for which a GenMAPP-formatted Gene Database is available. Vassar College students Meredith Braymer ’04 and Jessica Heckman ’05 built a complete set of 120 metabolic pathway MAPPs for yeast, based on the pathway information available from the Saccharomyces Genome Database (SGD). These MAPPs have been made available for download from the GenMAPP.org web site for use by the entire research community.
Currently, the main GenMAPP development group only supports Gene Databases for eleven species, none of which are bacteria or plants, because the group uses Ensembl as the main data source. To expand the number of species available for analysis in GenMAPP and to make our system more robust to changes in source data formats, my collaborator and I have created XMLPipeDB, a reusable, open source tool chain for automatically building relational databases from an XML schema (XSD). “XML” stands for “eXtensible Markup Language.” There is a growing trend in the bioinformatics field to provide data in XML format because an accompanying XSD or DTD document gives a complete description of the format of the XML data, which makes the data less idiosyncratic than other formats and easy to read programmatically. While XMLPipeDB is a general-purpose tool that can be used for any type of XML data, we have used it to solve the Gene Database creation problems faced by the main GenMAPP group. The software uses UniProt as the main data source, which opens up the possibility of creating Gene Databases for hundreds of species. It is also robust to changes in source file formats, uses XML sources wherever possible, takes advantage of existing open source tools, and limits the manual manipulation of the data. The XMLPipeDB software suite includes three individual programs: XSD-to-DB, XMLPipeDB Utilities, and GenMAPP Builder. XSD-to-DB reads an XSD (XML schema) and automatically generates the SQL schema file, Java classes, and Hibernate mappings needed to create a relational database. XMLPipeDB Utilities is a general-purpose library for performing simple database functions such as importing XML data into the relational database and running simple queries. GenMAPP Builder is a downstream application that exports the data as a GenMAPP-formatted Gene Database.
GenMAPP Builder has been used to create a Gene Database for Escherichia coli K12 (available from GenMAPP.org and SourceForge.net) and for Arabidopsis thaliana (beta version available from SourceForge.net).
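The general idea of deriving a relational schema from an XSD can be sketched in a few lines. The example below is a toy illustration in Python, not the XSD-to-DB implementation (which generates Java classes and Hibernate mappings as well as SQL). The sample schema, the type mapping, and the function name are all assumptions made for the example, and it handles only top-level elements whose complex type is a flat sequence of simple elements.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed, deliberately simplified mapping from XSD built-in types to SQL
# column types. It relies on the schema using the conventional "xs:" prefix.
SQL_TYPES = {"xs:string": "VARCHAR(255)", "xs:int": "INTEGER", "xs:date": "DATE"}

# Hypothetical schema invented for this sketch; not a real UniProt XSD.
EXAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="gene">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="symbol" type="xs:string"/>
        <xs:element name="taxon_id" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def xsd_to_sql(xsd_text):
    """Return CREATE TABLE statements, one per top-level complex element."""
    root = ET.fromstring(xsd_text)
    statements = []
    for element in root.findall(f"{XS}element"):
        sequence = element.find(f"{XS}complexType/{XS}sequence")
        if sequence is None:
            continue  # skip elements that are not flat complex types
        columns = ["id INTEGER PRIMARY KEY"]
        for child in sequence.findall(f"{XS}element"):
            sql_type = SQL_TYPES.get(child.get("type"), "TEXT")
            columns.append(f"{child.get('name')} {sql_type}")
        body = ",\n  ".join(columns)
        statements.append(f"CREATE TABLE {element.get('name')} (\n  {body}\n);")
    return "\n\n".join(statements)


if __name__ == "__main__":
    print(xsd_to_sql(EXAMPLE_XSD))
```

A real schema such as the one UniProt publishes contains nested types, attributes, and repeated elements, which is where an object-relational mapping layer becomes useful.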
The Global Transcriptional Response of Saccharomyces cerevisiae to Cold Shock and Recovery
The complete sequencing of the human genome and those of other major model organisms, along with the invention of high-throughput methods to measure gene expression, has propelled biology into the genomics era. Where once biologists could only study genes one at a time, DNA microarrays now routinely generate data for thousands of genes in a single experiment. My research harnesses the power of DNA microarray and other types of genomic data to elucidate the systems-level properties of gene regulatory networks in the budding yeast, Saccharomyces cerevisiae. Yeast responds to environmental stresses through characteristic programs of gene expression. The transcriptional responses to heat shock and to a variety of other stressors, such as changes in nutrient availability, osmolarity, and oxidative stress, have been well characterized. However, the response to cold shock has been less well studied. Previous studies on the transcriptional response of budding yeast to cold shock have revealed that the response can be divided into a set of early response genes (after 15 minutes to 2 hours of cold temperatures) and late response genes (after 12 to 60 hours of cold temperatures) (Sahara et al., 2002 J Biol Chem 277:50015; Kandror et al., 2004 Mol Cell 13:771; Schade et al., 2004 Mol Biol Cell 15:5492). The late response genes include the environmental stress response (ESR) genes induced by many environmental stresses and are regulated by the Msn2p/Msn4p transcription factors (Schade et al., 2004). While these studies have begun to characterize the transcriptional response of yeast to cold shock, they have not determined which transcription factors are responsible for the induction of the early response genes, the genes uniquely responding to cold shock.
The global transcriptional response to the environmental stress of cold shock and subsequent recovery in Saccharomyces cerevisiae has been measured using DNA microarrays.
DNA microarrays measure the mRNA levels of all of the roughly 6000 genes in yeast simultaneously, giving a snapshot of all transcriptional activity in the cells at one time. DNA microarrays were obtained through the Genome Consortium for Active Teaching. DNA microarrays obtained in this manner are for the express use of undergraduates. Vassar College students Meredith Braymer ’04 and Philopose Mulugeta ’07 and LMU students Heather King ’06, Matthew Mejia ’07, Wesley Citti ’08, Robert Hybki ’08, Elizabeth Liu ’08, Olivia Sakhon ’08, Kevin Entzminger ’09, Stephanie Kuelbs ’09, and Kenny Rodriguez ’09 have all performed DNA microarray experiments related to this project. Wild type yeast cells (BY4741) were grown to early log phase at 30°C, then shifted to 13°C for 60 minutes, and then shifted back to 30°C for another 60 minutes. Samples were collected before cold shock (t0), after 15, 30, and 60 minutes of cold shock (t15, t30, and t60), and after 30 and 60 minutes of recovery at 30°C (t90 and t120). Yeast samples were collected for four independent replicates of the time course experiment. Total RNA was purified from each of the samples. The mRNA was amplified, labeled with Cy3 and Cy5 dyes by the indirect method, and hybridized to DNA microarrays according to the manufacturers’ protocols provided with the reagents (see also the Dahlquist Lab site at OpenWetware.org for protocols). These data are currently being analyzed.
Gene expression is a complex biological process in which cells first transcribe their genes encoded in the DNA into an intermediary known as mRNA. Then the cell translates the mRNAs into proteins. Transcription factors are regulatory proteins which increase or decrease the rate at which a cell transcribes a gene. Recently, genome-wide location analysis has determined the relationships between transcription factors and their target genes on a global scale in budding yeast, Saccharomyces cerevisiae (Lee et al., 2002 Science 298:799; Harbison et al., 2004 Nature 431:99). While these data have identified properties of the network topology, they do not reveal the dynamics of the behavior of the network. Using differential equations, we have modeled how the concentrations of proteins in the cell change over time for a subset of a real gene expression network of twenty-one transcription factors controlling the environmental stress response in yeast. The differential equations governing the rate of change of concentration for each protein in the network were based on a sigmoidal function. A weight parameter determines how each transcription factor affects the transcriptional and translational rate of its target gene. The weights were optimized against experimentally derived gene expression data from yeast exposed to the environmental stress of cold shock. Sensitivity analysis was performed to understand the behavior of the different parameters in the model. We then used the model to generate a simulated gene expression dataset giving the steady-state concentrations of each protein after a period of time has elapsed. The simulated data indicated which transcription factors have a greater impact on the overall dynamics of the network. Then each gene in the network was systematically deleted in silico to determine how the steady-state concentrations of the proteins in the network changed after the deletions. Nathan Wanner '07 began this project in collaboration with Dr. Erika Camacho (who is now at Arizona State University) and won third prize at the San Diego Consortium for Systems Biology Symposium: Systems to Synthesis, held at the Salk Institute on January 19, 2007. Currently this work is being performed by Stephanie Kuelbs '09 in collaboration with Dr. Ben Fitzpatrick (LMU Mathematics) as part of the NSF-Interdisciplinary Training for Undergraduates in Biological and Mathematical Sciences (UBM) project at LMU: Analysis of Stress in Biological Systems.
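The modeling approach described above can be illustrated with a small sketch. The exact equations used in the lab's model are not given here, so the code below assumes one common sigmoidal form, dx_i/dt = P_i / (1 + exp(-(sum_j w_ij x_j - b_i))) - d_i x_i, where x_i is the concentration of protein i, w_ij is the weight of regulator j on target i, and P_i, d_i, and b_i are assumed production, degradation, and threshold parameters. The two-gene network, all parameter values, and the function names are hypothetical, and the simple Euler integrator stands in for whatever solver the actual model uses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(weights, production, degradation, threshold, x0,
             dt=0.01, steps=5000, deleted=()):
    """Integrate the assumed sigmoidal network model with forward Euler and
    return the concentrations after `steps` steps. Genes whose indices are
    listed in `deleted` are held at zero (an in silico deletion)."""
    n = len(x0)
    x = [0.0 if i in deleted else x0[i] for i in range(n)]
    for _ in range(steps):
        new_x = []
        for i in range(n):
            if i in deleted:
                new_x.append(0.0)
                continue
            drive = sum(weights[i][j] * x[j] for j in range(n)) - threshold[i]
            dxdt = production[i] * sigmoid(drive) - degradation[i] * x[i]
            new_x.append(x[i] + dt * dxdt)
        x = new_x
    return x


if __name__ == "__main__":
    # Hypothetical two-gene network: gene 0 activates gene 1.
    weights = [[0.0, 0.0], [4.0, 0.0]]
    params = dict(production=[2.0, 2.0], degradation=[1.0, 1.0],
                  threshold=[0.0, 0.0], x0=[0.0, 0.0])
    print("wild type:", simulate(weights, **params))
    print("gene 0 deleted:", simulate(weights, deleted={0}, **params))
```

Comparing the two runs shows how deleting an activator lowers the steady-state level of its target, which is the kind of comparison the in silico deletion analysis makes across the whole network.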
Identifying Soil Bacteria and Biochemical Pathways in the Ballona Wetlands for the Bioremediation of Organic Pollutants
A multi-disciplinary collaborative of faculty from the Departments of Biology, Chemistry & Biochemistry, Natural Science, and Civil Engineering & Environmental Science has been awarded a Merck/AAAS Undergraduate Science Research Grant to study the chemical and biological aspects of pollution in the nearby Ballona Wetlands. As the last remaining wetlands of significant size in Los Angeles County, the Ballona Wetlands is not only a valuable environmental resource for southern California, but also an outstanding teaching and research laboratory for LMU. It is especially well-suited for projects involving undergraduates: its location less than a mile from the LMU campus allows easy access for sampling, it provides a unique venue to study an urban-wildlands interface in all its dimensions, it reinforces the practical contribution that research can make, and it underscores the interdisciplinary nature of the real world. The project described below has been undertaken in collaboration with Dr. Carl R. Urbinati (LMU Biology).
The Ballona Wetlands in Los Angeles County are contaminated with organic pollutants including polychlorinated biphenyls (PCBs), polyaromatic hydrocarbons (PAHs), and single-ring aromatics (e.g., toluene) from urban run-off. Initially, to determine whether biochemical pathways exist in the wetlands to degrade toluene, Wesley Citti '08 attempted to enrich bacteria from the soil that could metabolize toluene. We obtained a single, pure environmental isolate from media with citrate as the sole carbon source but not from media with toluene as the sole carbon source. Wesley isolated genomic DNA from the environmental strain and used PCR to amplify a variable region of the 16S rRNA gene, which he then subcloned into a plasmid vector. DNA sequencing identified the environmental isolate as Pseudomonas putida F1. Pseudomonas species, including Pseudomonas putida F1, are known to degrade toluene. Jeff McGowan '08 and Kara Taylor '09 have both attempted to use PCR to amplify and clone the todC1 gene, which encodes one subunit of toluene dioxygenase, the first enzyme in the toluene degradation pathway, from the isolated Pseudomonas putida F1 strain, but have been unsuccessful so far.
To assess the diversity of soil bacteria in the wetlands, Wesley also isolated genomic DNA from soil samples collected from four sites in the wetlands, amplified a variable region of the 16S rRNA gene using PCR, and constructed a sub-genomic library for each site. Jeff, and then Kara, used PCR screening to identify clones with inserts and purified plasmid DNA, which was sent out to be sequenced. To date, the students have screened over 450 clones, resulting in the sequencing and analysis of 88 clones from one of the libraries, and Kara has begun to screen a second library from a different wetlands site. Thus far, the most abundant taxonomic group represented among the sequences is Proteobacteria, with subgroups Gammaproteobacteria and Alphaproteobacteria being most prominent.
Last Modified: 8/27/08