Where to find such gene and protein sequences?


To date, public research institutions provide full access to known gene and protein sequences in large databases that anyone can query through a web interface. The largest and most widely used of these databases is hosted at the National Center for Biotechnology Information (NCBI) in Bethesda, MD, USA. Following the link:

http://www.ncbi.nlm.nih.gov/

one can enter the search term “human alcohol dehydrogenase” in the search box. Sequence information can be retrieved by choosing links such as “Protein: Sequence database” and “CoreNucleotide: Core subset of nucleotide sequence records” on the results page that opens automatically. Selecting any of the individual entries listed there gives access to an information sheet such as:



Besides a great deal of data of interest to scientists, these sheets contain four pieces of information that are important for our project (marked in yellow):

LOCUS: gives a unique identifier that allows direct retrieval of the sequence from the database [here: CAA53961]
DEFINITION: provides the protein name [here: alcohol dehydrogenase]
SOURCE: indicates the organism from which the sequence originates [here: Homo sapiens (human)]
ORIGIN: provides the requested gene/protein sequence
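
For readers who want to pull these four fields out of a saved information sheet automatically, the following is a minimal sketch in Python. It assumes the sheet has been saved as plain text in the usual flat-file layout (keyword at the start of the line, sequence lines after ORIGIN, record closed by "//"). The function name parse_record and the sample record are illustrative assumptions: the identifier, name and organism are taken from the example above, but the sequence is a made-up placeholder, not the real CAA53961 sequence.

```python
import re

# Abridged, illustrative record in flat-file layout.
# Identifier, name and organism come from the example in the text;
# the sequence below is a made-up placeholder, NOT the real sequence.
SAMPLE_RECORD = """\
LOCUS       CAA53961
DEFINITION  alcohol dehydrogenase
SOURCE      Homo sapiens (human)
ORIGIN
        1 mkvlaagtcw sdlhvidgkf
       21 pgaleh
//
"""


def parse_record(text):
    """Extract LOCUS id, DEFINITION, SOURCE and the ORIGIN sequence.

    Simplification: only the first line of each field is read, so
    multi-line DEFINITION entries are truncated in this sketch.
    """
    fields = {}
    sequence_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_origin:
            # Sequence lines carry position numbers and spaces; keep letters only.
            sequence_parts.append(re.sub(r"[^A-Za-z]", "", line))
            continue
        keyword, _, value = line.partition(" ")
        if keyword == "ORIGIN":
            in_origin = True
        elif keyword == "LOCUS":
            # The identifier is the first token on the LOCUS line.
            fields["locus"] = value.split()[0]
        elif keyword in ("DEFINITION", "SOURCE"):
            fields[keyword.lower()] = value.strip()
    fields["sequence"] = "".join(sequence_parts).upper()
    return fields


record = parse_record(SAMPLE_RECORD)
print(record["locus"], "|", record["definition"], "|", record["source"])
print(record["sequence"])
```

Indented sub-entries (for example an ORGANISM line below SOURCE) are skipped by this sketch, because only lines that begin with one of the four keywords are read.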

We searched for different alcohol dehydrogenase (ADH) protein sequences using either a keyword search or a sequence comparison approach (BLAST algorithm). The following sequences are currently used: