
Biochemical pathway knowledge database and visualization tools

BioPathAt is a visual interface that allows the knowledge-based analysis of genome-scale data by integrating biochemical pathway maps (BioPathAtMAPS module) with a manually scrutinized gene-function database (BioPathAtDB module) for the model plant Arabidopsis thaliana. The BioPathAt tool has been described in a recent article in Phytochemistry:

Comprehensive post-genomic data analysis approaches integrating biochemical pathway maps • Phytochemistry, Volume 66, Issue 4, February 2005, Pages 413-451

B. Markus Lange and Majid Ghassemian

http://www.sciencedirect.com/science/journal/00319422

Downloads:

BioPathAtMAPS, BioPathAtDB and BioPathMetDB files can be downloaded from the Lange laboratory's public folder:

http://www.ibc.wsu.edu/research/lange/public_folder

Biochemical pathway maps (BioPathAtMAPS)

Example Map for Version 2 (in “BioPathAt Examples” folder): Glycerolipid biosynthesis in leaves

The experimental evidence regarding the presence of pathways in Arabidopsis thaliana was evaluated and matched with the occurrence predicted from the apparent coding capacity of the fully sequenced genome. Maps were generated to represent current knowledge regarding numerous pathways involved in light perception, metabolism, protein trafficking, and signal transduction (BioPathAtMAPS module). In BioPathAt, biochemical pathways are treated as modules that can be reassembled in various ways, so that separate maps can be used to visualize the connections between pathways in different biological contexts.

Gene/enzyme function database (BioPathAtDB)

It is noteworthy that experimental evidence is available for only about 10% of all genes in A. thaliana, and only another 40% of A. thaliana genes display sufficient homology to those of other organisms for a sequence-based annotation to be feasible. However, recent publications have underscored that the annotation of gene sequences in the public databases as having a particular or putative function in a specific biochemical pathway must be viewed with considerable caution. Integrative genome annotation involves keyword and sequence-based searches against public databases, the use of algorithms that predict the subcellular localization of enzymes, and a manual evaluation of available annotation based on published literature and knowledge about the tissue-specific mRNA expression.

A gene list for genes encoding the proteins represented in the BioPathAtMAPS module was compiled using literature keyword and sequence-based searches in the TAIR A. thaliana database (http://www.arabidopsis.org/Blast/). If no A. thaliana gene for a protein of interest was annotated based on biochemical data, the gene from the nearest relative (putative ortholog) was used as the reference protein sequence to identify the A. thaliana gene. For this purpose, a protein sequence database covering biochemical pathways in all plants was generated with data from NCBI (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein). These protein sequences were then compared (BLASTp; Altschul et al., 1990) with the predicted A. thaliana proteins. The BLASTp alignments were processed with PERL scripts to extract the top 10 hits. A separate BLASTp search was run against full-length cDNA databases (http://rarge.gsc.riken.go.jp/blast/blast.pl; http://signal.salk.edu/dblast.html).

Genes encoding members of enzyme families in A. thaliana were aligned, using the CLUSTALW algorithm (Thompson et al., 1994), with those homologs from other plants for which a biochemical function had already been established. BLASTp results and sequence alignments were manually scrutinized for the quality of hits. The subcellular localization of proteins was predicted using the PSORT (http://psort.nibb.ac.jp/form.html) and TargetP (http://www.cbs.dtu.dk/services/TargetP/) programs, and was manually updated when experimental data were in disagreement with the computational prediction.
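
The step of extracting the top 10 BLASTp hits per query can be illustrated with a short sketch. This is not the PERL code used for BioPathAtDB. It is a minimal Python example that assumes the alignments are available in the 12-column tab-separated BLAST output (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The function name top_hits and the choice to rank by bit score are assumptions made for illustration.

```python
import csv
from collections import defaultdict


def top_hits(tabular_lines, n=10):
    """Return {query_id: [(subject_id, evalue, bitscore), ...]} with the
    n best-scoring subjects per query.

    Assumes 12-column tab-separated BLAST output; this input format is an
    assumption, not something stated for the original pipeline.
    """
    # For each query, remember only the best-scoring alignment per subject,
    # because one subject can be reported several times (multiple HSPs).
    best = defaultdict(dict)
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip comment lines and malformed rows
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        previous = best[query].get(subject)
        if previous is None or bitscore > previous[1]:
            best[query][subject] = (evalue, bitscore)

    result = {}
    for query, subjects in best.items():
        # Rank subjects by bit score, highest first, and keep the first n.
        ranked = sorted(subjects.items(), key=lambda kv: kv[1][1], reverse=True)
        result[query] = [(s, e, b) for s, (e, b) in ranked[:n]]
    return result
```

The function accepts any iterable of lines, so an open file handle can be passed directly. The resulting short lists are what would then go to manual inspection; the automated ranking does not replace that step.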

Metabolite database (BioPathMetDB)

Example Metabolite Database (in “BioPathAt Examples” folder): Glycerolipid biosynthesis in leaves

(in development)

For all metabolites represented on the BioPathAt pathway maps, information regarding their structure, molecular composition, mass, and alternative names was collected from various sources and is provided in an EXCEL spreadsheet. Links to relevant small molecule databases (e.g., KEGG, PubChem) are given as well.
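
As a rough illustration of the kind of record described above, the following Python sketch models one metabolite entry and derives an average molecular mass from its molecular formula. The field names, the Metabolite class, and the small table of standard atomic weights are assumptions chosen for this example; they do not reflect the actual column layout of the BioPathMetDB spreadsheet.

```python
import re
from dataclasses import dataclass, field

# Standard average atomic weights for a few common elements (illustrative subset).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007,
                "O": 15.999, "P": 30.974, "S": 32.06}


def average_mass(formula):
    """Sum average atomic weights for a simple formula such as 'C3H8O3'.

    Handles only plain element/count pairs (no parentheses or charges) and
    raises KeyError for elements missing from AVERAGE_MASS.
    """
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += AVERAGE_MASS[element] * (int(count) if count else 1)
    return total


@dataclass
class Metabolite:
    """Hypothetical record layout; not the real spreadsheet schema."""
    name: str
    formula: str
    alternative_names: list = field(default_factory=list)
    database_links: dict = field(default_factory=dict)  # e.g. {"KEGG": "..."}

    @property
    def mass(self):
        return average_mass(self.formula)


glycerol = Metabolite(name="glycerol", formula="C3H8O3",
                      alternative_names=["glycerin"])
```

Deriving the mass from the formula, rather than storing it separately, keeps the two values consistent when an entry is corrected.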

Disclaimer:

This site provides information and resources for educational purposes with the hope that they might be beneficial to others. Please credit properly if you intend to use the downloadable files. Commercial use of these materials is prohibited without prior written permission.

Contact: B.M. Lange (lange-m@wsu.edu).

Requests regarding copyright and the authorization of third parties to reproduce or otherwise use all or part of the website contents (including figures and tables) are referred to the Elsevier Global Rights Department (http://www.elsevier.com/locate/permissions). Click on “obtain permission” to proceed.