The GenPept format is used to represent annotated protein sequences.
It is derived from the widely used GenBank format and, like GenBank,
uses several keys and subkeys to describe sequence characteristics.
Each sequence entry starts with the key LOCUS and ends with a line
containing //. A file can contain more than one entry.

Example: 

LOCUS       CAA00606                 609 aa            linear   PAT 31-AUG-1993
DEFINITION  albumin [Homo sapiens].
ACCESSION   CAA00606
VERSION     CAA00606.1  GI:412163
DBSOURCE    embl accession A06977.1
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens.
FEATURES             Location/Qualifiers
     source          1..609
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     Protein         1..609
                     /product="albumin"
     sig_peptide     1..24
                     /note="albumin"
     mat_peptide     25..609
                     /product="albumin"
     CDS             1..609
                     /coded_by="A06977.1:76..1905"
                     /db_xref="GOA:P02768"
                     /db_xref="UniProtKB/Swiss-Prot:P02768"
ORIGIN      
        1 mkwvtfisll flfssaysrg vfrrdahkse vahrfkdlge enfkalvlia faqylqqcpf
       61 edhvklvnev tefaktcvad esaencdksl htlfgdklct vatlretyge madccakqep
      121 ernecflqhk ddnpnlprlv rpevdvmcta fhdneetflk kylyeiarrh pyfyapellf
      181 fakrykaaft eccqaadkaa cllpkldelr degkassakq rlkcaslqkf gerafkawav
      241 arlsqrfpka efaevsklvt dltkvhtecc hgdllecadd radlakyice nqdsissklk
      301 eccekpllek shciaevend empadlpsla adfveskdvc knyaeakdvf lgmflyeyar
      361 rhpdysvvll lrlaktyett lekccaaadp hecyakvfde fkplveepqn likqncelfe
      421 qlgeykfqna llvrytkkvp qvstptlvev srnlgkvgsk cckhpeakrm pcaedylsvv
      481 lnqlcvlhek tpvsdrvtkc cteslvnrrp cfsalevdet yvpkefnaet ftfhadictl
      541 sekerqikkq talvelvkhk pkatkeqlka vmddfaafve kcckaddket cfaeegkklv
      601 aasqaalgl
//
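
The entry layout shown above (a LOCUS line, an ORIGIN key followed by
numbered sequence lines, and a terminating //) can be read with a few
lines of code. The following is a minimal sketch, not a full parser:
the function name parse_genpept is hypothetical, it assumes the second
and third fields of the LOCUS line are the entry name and the length in
residues (as in the example), and it ignores all other keys, including
the FEATURES table.

```python
def parse_genpept(text):
    """Split GenPept text into entries.

    Returns a list of dicts with the LOCUS name, the declared
    length and the concatenated sequence of each entry.
    """
    records = []
    current = None      # entry being built, or None between entries
    in_origin = False   # True while reading sequence lines after ORIGIN

    for line in text.splitlines():
        if line.startswith("LOCUS"):
            # Assumption: fields are "LOCUS <name> <length> aa ..."
            fields = line.split()
            current = {
                "locus": fields[1],
                "length": int(fields[2]),
                "sequence": "",
            }
            in_origin = False
        elif current is None:
            continue  # skip anything outside an entry
        elif line.startswith("//"):
            # End of entry: store it and wait for the next LOCUS
            records.append(current)
            current = None
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif in_origin:
            # Sequence line: a position number, then blocks of residues
            current["sequence"] += "".join(line.split()[1:])

    return records
```

Because a new entry is opened at each LOCUS line and closed at each //,
the same loop handles files containing one entry or many. Comparing
len(entry["sequence"]) with entry["length"] gives a simple consistency
check on each parsed entry.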