Data Collections at Indiana University
FlyBase
euGenes
wFleaBase
National Gene Vector Laboratories Toxicology database
CIFASD Data Repository
Indiana Spatial Data Service
CLIOH
FlyBase
Copyright: Genetics Society of America
Funding: FlyBase is supported by the National Human Genome Research
Institute (U.S.A.), grant P41-HG00739. Additional support for FlyBase is
provided by the Medical Research Council (U.K.), grant G9535792MB, and the
Indiana Genomics Initiative
Stewardship: FlyBase Consortium
Access Rights: FlyBase is a public database and can be freely used for
research purposes. Data can be downloaded via FTP or searched at the web
interface
FTP address: ftp://flybase.org
HTTP address: http://flybase.org
Contact: flybase-help (at) morgan.harvard.edu
FlyBase is a database of genetic and molecular data for Drosophila (the fruit fly) and includes data files, documents, images and indices. Data is curated, annotated and maintained by the FlyBase Consortium, which includes biologists and computer scientists at Harvard University, Indiana University and the University of Cambridge. FlyBase represents genetic and genomic data on Drosophila drawn from the genome centers, published scientific literature, accessions from nucleic acid, protein and other databases, written personal communications and bulk submissions. The data resides on a multi-protocol server. Details on the types of data hosted in FlyBase are available at: http://www.flybase.org/docs/lk/refman/refman-A.html#A.1.
The home page of FlyBase provides a list of data classes and a selection of search tools available for each data class. The query tools permit field-specific searches, combinatorial queries and menu-driven selection of controlled vocabulary terms. Each query produces an organized list of hits, from which single or multiple items can be retrieved. A synopsis report is produced first, and the user is then offered several options for more extensive reports. The search interface has a link to the Search Tools section of the FlyBase Reference Manual. Users who need specialized help, e.g. with building searches that the existing tools do not permit or with running a query across data sets, are referred to an email address.
euGenes
Funding: euGenes is supported by the National Science Foundation and the
Indiana University Center for Genomics and Bioinformatics. Funding is
provided via NSF awards DBI-0090782 and DBI-9982851 to IUBio Archive.
Stewardship: euGenes uses automated data collection rather than curation
by people.
Access rights: Data is freely available for research purposes. Data can be
downloaded via FTP or searched at the web interface
FTP address:
ftp://iubio.bio.indiana.edu/eugenes/
HTTP address:
http://iubio.bio.indiana.edu:8089/
Contact: eugenes (at) iubio.bio.indiana.edu
euGenes is a genome information system and database that provides a common summary of the genes and genomes of eukaryotes (a group that includes all higher organisms as well as fungi and some algae). This information is automatically extracted and updated from various public genome databases. The euGenes summarization seeks to organize and integrate diverse sources into a common structure with a common interface, so that the data can be searched and retrieved or extracted for further use in projects needing eukaryote genome data.
The summary information available for 150,000 genes of human and model eukaryote organisms provides essential data and links to fuller information from source databases. euGenes is a good entry point for a user who doesn't need all the details of an organism, or who has expertise with one organism and wants to find gene data on related organisms. Each organism has a section in euGenes, listing source data and the derived and computed summaries, genome features and homologies. Computations and tables of comparison across all species are also provided. The web interface for search, retrieval, map display and summaries of genome data can be accessed at http://iubio.bio.indiana.edu/eugenes/. The results of searches are available as hypertext documents, in spreadsheet table formats for data manipulation and in database formats including XML.
wFleaBase
Copyright: Genomic data is copyrighted by the Daphnia Genomics Consortium.
Reference records are taken from the BIOSIS database and are the copyright
of BIOSIS.
Funding: wFleaBase is supported by the National Science Foundation (NSF),
the National Institutes of Health (NIH), the Center for Genomics and
Bioinformatics (CGB) at Indiana University and the IU Genome Informatics
Lab.
Stewardship: Gilbert, D., V.R. Singan and J.K. Colbourne
Access rights: wFleaBase is freely distributed to the scientific community
on the condition that it will not be used for commercial gain by any
individual or organization. Database can be searched at the web interface
or downloaded via FTP
FTP address: ftp://wfleabase.org/daphnia/
HTTP address: http://wfleabase.org
Contact: daphnia (at) iubio.bio.indiana.edu
wFleaBase is a database with the infrastructure to curate, archive and share genetic, molecular and functional genomic data and protocols for an emerging model organism, the water-flea or Daphnia. The main sources of data for wFleaBase are direct submissions from Daphnia Genomics Consortium (DGC) members and from research at large genome sequencing centers.
wFleaBase is developed at the Genome Informatics Lab at Indiana University and is a project of the Daphnia Genomics Consortium (DGC). wFleaBase is designed to be a resource where users can search and retrieve sequence data for genes of ecological importance, or find putative genes modulating traits of interest based on their homologies to functionally characterized genes in other model organisms. wFleaBase is an organized repository of Daphnia-specific sequences with standard bioinformatic tools to facilitate gene discovery. These tools include BLAST analyses and links to gene reports for other eukaryotic genomic models via the euGenes database. The service is built using tested genome database components and open source software that is common to several other databases.
wFleaBase uses Lucegene to support rapid search and retrieval of the sequence database, of BLAST table entries, of Daphnia Medline references and of Daphnia web documents. Lucegene, based on the Lucene search system, is an open-source part of the GMOD project. A major benefit of Lucegene is the large variety of data formats that can be added to the search system with minimal work. For instance, the formats currently supported in wFleaBase include simple text, XML (Medline abstracts and gene sequence annotation), HTML, tabular data, bio-formats (Fasta, Genbank, EMBL) and the gene object data used by euGenes. Search terms, such as "magna" to retrieve all sequences from this species, can be entered in a Search box at the head of all web pages. The search is refined at the main wFleaBase search page by specifying the search library (sequences, references, documents or BLAST tables) and the library fields containing the queried term. Options are also available to detail the output format, and each result is hyperlinked to the source document for easy access to the data. On a separate web page (Batch download), users can recover multiple records obtained from complex queries and save the results to a file.
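The idea of refining a search by library and by field can be illustrated with a minimal sketch. This is not Lucegene or Lucene code; the library names follow the description above, but the records, field names and the `build_index` and `search` helpers are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical records; ids, field names and values are illustrative only.
LIBRARIES = {
    "sequences": [
        {"id": "seq1", "species": "Daphnia magna", "description": "heat shock protein"},
        {"id": "seq2", "species": "Daphnia pulex", "description": "hemoglobin"},
    ],
    "references": [
        {"id": "ref1", "title": "Hemoglobin induction in Daphnia magna", "species": ""},
    ],
}


def build_index(records):
    """Map (field, lowercased token) to the set of record ids containing it."""
    index = defaultdict(set)
    for record in records:
        for field, value in record.items():
            if field == "id":
                continue
            for token in value.lower().split():
                index[(field, token)].add(record["id"])
    return index


# One inverted index per search library.
INDEXES = {name: build_index(records) for name, records in LIBRARIES.items()}


def search(term, library, fields=None):
    """Return sorted ids in one library whose chosen fields contain the term.

    With fields=None every indexed field of the library is searched.
    """
    term = term.lower()
    hits = set()
    for (field, token), ids in INDEXES[library].items():
        if token == term and (fields is None or field in fields):
            hits |= ids
    return sorted(hits)
```

Under these assumptions, `search("magna", "sequences")` looks in every field of the sequence library, while `search("magna", "references", fields=["species"])` restricts the same term to a single field of the reference library.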
National Gene Vector Laboratories Toxicology database
Funding: Data repository is funded by the NIH.
Stewardship: NGVL Coordinating Center
Access rights: Data is freely available for research purposes
HTTP address:
http://www.ngvl.org/pharmtox/
Contact: lrubin (at) iupui.edu
The National Gene Vector Laboratories Toxicology database is a database of information on animal studies that have been conducted to study the biodistribution of vector to different target organs and to evaluate potential toxic effects associated with the use of various vector systems. The purpose of the database is to inform gene therapy investigators about the conclusions of prior toxicology studies and to facilitate the cross-referencing of relevant studies in support of new Investigational New Drug (IND) applications.
The studies are generally small, randomized single-dose or dose-escalation studies using various animal models, from mouse to non-human primate. The study parameters capture the test system, vector information, dose procedures, clinical observations, macroscopic/microscopic pathology, histopathology, tissue collection for PCR analysis, clinical pathology (hematology/chemistry analysis) and a summary of the relevant findings.
These studies are paramount in moving from preclinical animal studies to clinical trials. With this in mind, it is anticipated that by sharing this data, investigators in the gene therapy field may find a study in the database that is relevant to their specific vector and study of interest. Investigators can then use this data by acquiring a letter of cross-reference from the Study PI, which allows the FDA to review the data and assess whether a specific study still needs to be conducted.
CIFASD Data Repository
Copyright: Collaborative Initiative on Fetal Alcohol Spectrum Disorders
(CIFASD) Consortium
Funding: Data repository is funded by NIH/NIAAA
Stewardship: Informatics Core for the CIFASD Consortium
Access rights: Data has not yet been made publicly available
Contact: cifasdic (at) iu.edu
The CIFASD Data Repository is used to collect, maintain and distribute data generated by the participants in the Consortium for the Collaborative Initiative on Fetal Alcohol Spectrum Disorders. The repository is the collective outcome of contributions from basic researchers, behavioral scientists and clinical investigators, and it is developed and maintained by the Informatics Core. The Informatics Core is responsible for working with the other consortium participants to define a Data Dictionary that standardizes data collection, enabling the transfer of data to and from the CIFASD Data Repository, consulting on how to establish local data management systems, providing software tools and consulting to consortium participants, and producing status reports on the progress of the projects within the consortium. The CIFASD Data Repository is hosted on Indiana University's supercomputing facilities.
Indiana Spatial Data Service
Funding: Multiple sources
Access rights: Data is freely available
HTTP address:
http://discover-g.uits.indiana.edu:8290/website/isds/index.html
Contact: uitsgis (at) indiana.edu (for technical support regarding data contents and
format) and store-admin (at) iu.edu (for technical support regarding data transfer)
The Indiana Spatial Data Service (ISDS) and the Indiana Spatial Data Portal (ISDP) form a geographic information system that provides online access to topographic maps, aerial photos and digital elevation data for Indiana. In addition, the ISDP hosts several local data sets, including aerial photography for Bartholomew, Hamilton, Marion, Monroe and Wayne Counties. Maps and aerial photos can be viewed via the interactive map viewer or by means of a GIS software application. The system uses Oracle databases, ArcSDE and ArcIMS to serve imagery in real time to a variety of desktop and server-based GIS clients. Most data sets are available to the public for download and have no use restrictions.
The interface for the Indiana Spatial Data Portal includes interactive map components that simplify the process of identifying the files a user may need and selecting them for download. Interactive indices are available for statewide aerial photos, topographic maps, digital elevation models, and high-resolution aerial photos for Indianapolis and Marion County. Users who want to retrieve multiple files from the spatial data collection can download and install an automated multi-file download tool.
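One way to picture what an interactive index does is as a lookup from an area of interest to the files whose footprints overlap it. The sketch below is only an illustration of that idea and does not describe how the ISDP is implemented; the `Tile` type, the `tiles_for_area` helper, the filenames and the coordinates are all hypothetical.

```python
from typing import NamedTuple


class Tile(NamedTuple):
    """A file in the collection with its bounding box in decimal degrees."""
    filename: str
    west: float
    south: float
    east: float
    north: float


# Hypothetical index entries; filenames and coordinates are made up.
TILE_INDEX = [
    Tile("tile_a.tif", -86.6, 39.1, -86.5, 39.2),
    Tile("tile_b.tif", -86.5, 39.1, -86.4, 39.2),
    Tile("tile_c.tif", -86.2, 39.7, -86.1, 39.8),
]


def tiles_for_area(west, south, east, north, index=TILE_INDEX):
    """Return filenames of tiles whose bounding box overlaps the given area."""
    return [
        t.filename
        for t in index
        if t.west < east and t.east > west and t.south < north and t.north > south
    ]
```

A multi-file download could then loop over the returned filenames. For example, `tiles_for_area(-86.55, 39.12, -86.45, 39.18)` selects the two adjacent tiles that the area straddles and skips the distant one.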
Cultural Digital Library Indexing our Heritage (CLIOH)
Funding: Funding for CLIOH is provided by the Institute of Museum and Library Services (IMLS)
and the IU School of Informatics.
Access rights: Data can be freely accessed
HTTP address: http://clioh.informatics.iupui.edu/
Contact: clioh (at) cs.iupui.edu
CLIOH is a searchable digital archive housing original multimedia acquired from threatened world heritage cultural sites. Currently, the collection contains multimedia from the Mayan city of Uxmal in Yucatan, Mexico, and from Angel Mounds, a Native American Mississippian site in the U.S.A. Visitors to CLIOH will find information on the history, anthropology, archeology, science, mathematics and culture of ancient civilizations for use in developing reports, investigative studies and other curriculum materials. The CLIOH archive is designed to comply with standards approved by museums and libraries, such as the Dublin Core and the Open Archives Initiative.
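For readers unfamiliar with Dublin Core, the following minimal sketch builds a simple Dublin Core record in the oai_dc XML form used with the Open Archives Initiative protocol. The two namespace URIs are the standard ones; the `dublin_core_record` helper and the item values are hypothetical and are not taken from the CLIOH archive.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for Dublin Core elements and the oai_dc container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def dublin_core_record(fields):
    """Build an oai_dc element from a dict of Dublin Core element names to values."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical item; the values are illustrative, not taken from CLIOH.
record = dublin_core_record({
    "title": "Example photograph, Uxmal",
    "type": "Image",
    "coverage": "Uxmal, Yucatan, Mexico",
})
xml_text = ET.tostring(record, encoding="unicode")
```

Here `xml_text` holds the serialized record, with each entry of the dict appearing as a `dc:`-prefixed child element. Describing every item with the same small set of elements is what lets external harvesters index an archive without knowing its internal schema.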




