The Proteomic Analysis Workbench

The Proteomic Analysis Workbench is a collection of software tools to improve proteomic data analysis.  The tools are written in Python, a multi-platform programming language.  Python is arguably one of the easiest languages to learn, and its source code is very readable.

If you can use a word processor or open a web browser, you can run this software.  You do have to install Python (version 2.x, not 3.x) and download the source code for the software tools.  You do not have to bother with web servers, web interfaces, or third-party libraries.  You do not have to type command lines unless you want to.

The FASTA utilities help maintain and prepare FASTA-formatted protein databases for use by search programs such as SEQUEST or MASCOT.  These tools keep "the answers in the back of the book" up to date. The distribution includes a detailed User’s Guide, a Quick Reference Guide, and the programs.
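
As an illustration of the kind of file these utilities work with, here is a minimal sketch of reading a FASTA-formatted protein database, plain or GZipped.  It is not taken from the distributed programs: the function name read_fasta and the rule that a ".gz" file name means a compressed file are assumptions made for this example.

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain or GZipped FASTA file.

    Hypothetical helper for illustration only; not part of the distribution.
    """
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    handle = opener(path, "rb")
    try:
        header, chunks = None, []
        for raw in handle:
            line = raw.decode("ascii", "replace").strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header line closes the previous protein entry.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence lines may be wrapped; collect and join them.
                chunks.append(line)
        if header is not None:
            yield header, "".join(chunks)
    finally:
        handle.close()
```

For example, sum(1 for entry in read_fasta("my_database.fasta.gz")) would count the protein entries in a file, where "my_database.fasta.gz" is a placeholder name.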


July 2012 fix: A bug reading GZipped FASTA files has been fixed.

Summer 2012 update: Some changes in data structures improve memory usage when processing the large NCBI nr or UniProt TrEMBL databases. A bug in the sequence reversing of "" was fixed.

Summer 2014 update: Increased array dimensions for gi-to-taxon mapping.

Fall 2014 update: Updated FTP addresses for UniProt databases.

Additional FASTA tools for processing Ensembl databases, and programs for processing SEQUEST searches from bottom-up proteomics experiments, are available upon email request.