Drosophila DNase I Footprint Database (v1.1)
 
Background
This page provides access to results of the systematic curation and genome annotation of 1,367 DNase I footprints for the fruitfly D. melanogaster reported in Bergman, Carlson and Celniker (2005) Bioinformatics 21:1747-1749. These data have been extracted from 201 primary references and provide a non-redundant set of high-quality binding site information for 87 transcription factors and 101 target genes in one of the most important model systems. Unlike previous work, this dataset has been generated from a single experimental data type, represents all available developmental stages (including anterior-posterior, mesoderm and imaginal disk patterning), and is linked explicitly to finished genome sequence coordinates. This dataset should provide a useful resource for computational analyses of transcription factor binding site biology in the genus Drosophila.
 
Browse the Drosophila DNase I Footprint Database (v1.1)
Browse by Target: A hypertext document sorted by target gene with links to FlyBase, PubMed and the UCSC genome browser showing alignments of footprints with D. yakuba, D. pseudoobscura, and Anopheles gambiae.

Browse by Factor: A hypertext document sorted by binding factor with links to FlyBase, PubMed and the UCSC genome browser showing alignments of footprints with D. yakuba, D. pseudoobscura, and Anopheles gambiae.
 
Download the Drosophila DNase I Footprint Database (v1.1)
Download GFF: A text file in the Sanger Institute's GFF v2 format with additional target, factor and PubMed ID fields. This file is identical to Supplemental File 3 in Bergman, Carlson and Celniker (2005) Bioinformatics 21:1747-1749 with the addition of footprint IDs (FPIDs). Coordinates use UCSC's "half-open zero-based" system (e.g. a start/end pair of 100 and 200 represents bases 101-200 of the genome sequence). The strand for genome sequence coordinates is purposefully not reported, since a binding site is a double-stranded feature.
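
The coordinate convention described above can be sketched in a few lines of Python. This is a minimal illustration, not code distributed with the database: the function names are hypothetical, the example record is made up, and the only assumptions are that records are tab-delimited with start and end in columns 4 and 5 (as in GFF v2) and that those values follow the "half-open zero-based" system.

```python
def parse_gff_interval(line):
    """Return (seqname, start, end) from one tab-delimited GFF v2 record.

    GFF v2 stores the sequence name in column 1 and the start and end
    in columns 4 and 5.
    """
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[3]), int(fields[4])


def to_one_based_closed(start, end):
    """Convert a zero-based, half-open interval to one-based, inclusive."""
    if start < 0 or end < start:
        raise ValueError(f"invalid interval: {start}, {end}")
    return start + 1, end


def interval_length(start, end):
    """Length in bp of a zero-based, half-open interval."""
    return end - start


# Made-up record for illustration only; not a line from the database.
example = "chr2L\tsource\tfeature\t100\t200\t.\t.\t.\tattributes"
seqname, start, end = parse_gff_interval(example)
print(seqname, to_one_based_closed(start, end), interval_length(start, end))
```

With this convention the footprint length is simply end minus start, so the pair 100 and 200 corresponds to bases 101-200 and a length of 100 bp.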

Download Sequences: A text file of sequences in FASTA format corresponding to footprinted regions on the plus strand of the D. melanogaster Release 3 genome sequence. A text file of sequences in FASTA format corresponding to footprinted regions extended by ±30 bp on the plus strand of the D. melanogaster Release 3 genome sequence can be found here.
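
For readers who want to regenerate footprint sequences, with or without flanking sequence, from their own copy of the genome, the following is a minimal sketch. It assumes the chromosome arm is already loaded as a plain Python string and that the coordinates follow the zero-based, half-open convention used in the GFF file. The function name and the toy sequence are hypothetical, and this is not necessarily how the distributed FASTA files were produced.

```python
def extract_with_flank(chrom_seq, start, end, flank=0):
    """Return the plus-strand sequence of a zero-based, half-open interval,
    extended by `flank` bases on each side and clamped to the sequence ends.
    """
    if start < 0 or end < start or end > len(chrom_seq):
        raise ValueError(f"invalid interval: {start}, {end}")
    left = max(0, start - flank)
    right = min(len(chrom_seq), end + flank)
    return chrom_seq[left:right]


# Toy sequence for illustration only: a 4 bp "footprint" between two
# 40 bp runs of flanking sequence.
toy = "A" * 40 + "CCGG" + "T" * 40
footprint = extract_with_flank(toy, 40, 44)
extended = extract_with_flank(toy, 40, 44, flank=30)
print(footprint, len(extended))
```

Because Python slices are also zero-based and half-open, the GFF start and end values can be used as slice bounds directly, with no off-by-one adjustment.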

Download SQL: A dump of the MySQL database (and documentation) used for data entry and storage. The database is derived from the GadFly schema.

NB: The GFF file (not the Sequences file or the MySQL dump) defines the information contained in v1.1.
 
Browse & Download the Literature (v1.1)
Browse PubMed (1984-1994) (1995-2004): Hypertext summaries of the 201 primary references that form the experimental foundation for this database.

Download XML: A tar.gz archive of the 201 PubMed abstracts for literature mining.

An archive of the 131 primary references in .pdf format is available on request.
 
Release Log
Information about FlyReg releases can be found here.
FlyReg v1.0 can be found here.
FlyReg v2.0 can be found here.
 
Related Resources
The Drosophila DNase I footprint database has been included in the BioMed Central Catalog of Databases on the web.
FlyReg v1.0 has been incorporated as a track in the UCSC Drosophila melanogaster genome browser.

Dan Pollard (UC-Berkeley) has made position weight matrices (PWMs) for many of the transcription factors in this dataset.
Dmitri Papatsenko's group (UC-Berkeley) has curated binding site data for blastoderm transcription factors.
Michael Zhang's group (CSHL) has curated an independent set of binding site data at the Drosophila Binding Site Database.
 
Acknowledgements
We thank Nicholas Blanchard for assistance with literature curation and FlyBase Cambridge for access to the Drosophila offprint collection. We are grateful for the information, sequences, coordinates, and/or footprint data provided by Mariana Bienz, Olivier Cuvier, Kirsten Guss, Thomas Hader, Craig Hart, Michael Hoch, Herbert Jackle, Judith Lengyel, Guo-Jen Liao, Kevin Moses, Rolando Riviera-Pomar, Maria Saenz-Robles, Matthew Scott, William Stumph, Andrew Travers, Joseph Weiss, Cheng-Cai Zhang and Keji Zhao.
 
Citation
If you use this dataset in your research, please report the version number and cite Bergman, C.M., J.W. Carlson and S.E. Celniker (2005) Drosophila DNase I footprint database: A systematic genome annotation of transcription factor binding sites in the fruitfly, D. melanogaster. Bioinformatics 21:1747-1749.

If you intend to redistribute these data, please note the version number and provide a link to this page.
 
Contact Information
Email Casey Bergman (casey.bergman@manchester.ac.uk) with questions, comments or corrections.
 
This page was last updated 11-Apr-2014