ERYTHCRUORIN    Erythrocruorin family signature 
 Type of fingerprint: COMPOUND with 4  elements
Links:
   PRINTS; PR00188 PLANTGLOBIN; PR00612 ALPHAHAEM; PR00613 MYOGLOBIN
   PRINTS; PR00814 BETAHAEM; PR01906 FISHGLOBIN
   PRINTS; PR01907 WORMGLOBIN 
   PROSITE; PS01033 GLOBIN
   INTERPRO; IPR002336
   PDB; 1ECA; 1ASH
   SCOP; 1ECA; 1ASH
   CATH; 1ECA; 1ASH

 Creation date 30-SEP-1996; UPDATE 14-JUN-1999

   1. COWAN, J.A.
   Inorganic Biochemistry: An Introduction.
   VCH PUBLISHERS, 1993, NEW YORK.

   2. KAIM, W. AND SCHWEDERSKI, B.
   Bioinorganic Chemistry: Inorganic Elements in the Chemistry of Life.
   WILEY, 1991, CHICHESTER.

   3. KAPP, O.H., MOENS, L., VANFLETEREN, J., TROTMAN, C.N.A., SUZUKI, T.
   AND VINOGRADOV, S.N.
   Alignment of 700 globin sequences: Extent of amino acid substitution
   and its correlation with variation in volume.
   PROTEIN SCI. 4 2179-2190 (1995).

   4. MOENS, L., VANFLETEREN, J., VAN DE PEER, Y., PEETERS, K., KAPP, O., 
   CZELUZNIAK, J., GOODMAN, M., BLAXTER, M. AND VINOGRADOV, S.N.
   Globins in nonvertebrate species: dispersal by horizontal gene transfer
   and evolution of the structure-function relationships.
   MOL.BIOL.EVOL. 13 324-333 (1996).

   5. VINOGRADOV, S.N.
   The structure of invertebrate extracellular hemoglobins (erythrocruorins
   and chlorocruorins).
   COMP.BIOCHEM.PHYSIOL. 82 1-15 (1985).

   6. GOLDBERG, D.E.
   The enigmatic oxygen-avid hemoglobin of Ascaris.
   BIOESSAYS 17 177-182 (1995).

   7. TROTMAN, C.N., MANNING, A.M., BRAY, J.A., JELLIE, A.M., MOENS, L.
   AND TATE, W.P.
   Interdomain linkage in the polymeric hemoglobin molecule of Artemia.
   J.MOL.EVOL. 38 628-636 (1994).

   Globins are haem-containing proteins involved in dioxygen binding and/or 
   transport [1,2]. Hundreds of globin sequences are known [3]. It has been 
   proposed that all globins have evolved from a family of ancestral, ~17kDa 
   haemoproteins that displayed the globin fold and functioned as redox 
   proteins [4]. The globin superfamily includes vertebrate haemoglobins (Hb); 
   vertebrate myoglobins (Mb); invertebrate globins; plant leghaemoglobins; and 
   bacterial flavohaemoglobins. 
  
   Haemoglobins (Hbs) transport dioxygen in the blood.
   Erythrocruorins (Ec) are extracellular Hbs found freely dissolved in the 
   blood of annelids and arthropods. Ec molecules exist as aggregates of up to
   200 small globin-like subunits, some of which are disulphide-bonded and not
   all of which contain haem [5]. Nematodes (e.g., Ascaris) possess an octameric
   Hb, each subunit containing two globin-like domains. Ascaris Hb binds
   oxygen four orders of magnitude more tightly than does human Hb [6]. The
   brine shrimp Artemia has evolved the longest known concatenation of globin
   domains: each subunit contains 9 globin-like domains, connected by linking
   peptides [7]. Artemia possesses three types of dimeric haemoglobins: HbI
   (alpha+alpha); HbII (alpha+beta); and HbIII (beta+beta). 
  
   The 3D structures of a number of Ecs are known. The protein is largely 
   alpha-helical, eight conserved helices (A to H) providing the scaffold for 
   a well-defined haem-binding pocket. The imidazole ring of the "proximal" His 
   residue provides the fifth haem iron ligand; the other axial haem iron 
   position remains essentially free for O(2) coordination. Many Ecs lack
   the "distal" His and Val residues that are conserved in vertebrate globins.
  
   ERYTHCRUORIN is a 4-element fingerprint that provides a signature for the 
   erythrocruorins. The fingerprint was derived from an initial alignment
   of 17 sequences (both N- and C-terminal globin-like domains of GLB_ASCSU 
   and GLB_PSEDC were used to construct the alignment): motif 1 spans helices B
   and C; motif 2 corresponds to helix F, and includes the invariant proximal
   His residue; motif 3 includes helix G; and motif 4 spans helix H. Three
   iterations on OWL28.2 were required to reach convergence, at which point a
   true set comprising 62 sequences was identified. Numerous partial matches
   were also found, most of which are members of the globin superfamily.
  
   An update on SPTR37_9f identified a true set of 64 sequences, and 104
   partial matches.

  SUMMARY INFORMATION
     64 codes involving  4 elements
     25 codes involving  3 elements
     79 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    4|  64   64   64   64  
    3|  24   24   14   13  
    2|  53   42   22   41  
   --+---------------------
     |   1    2    3    4  
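
   The summary and the index can be cross-checked. If each index row counts,
   per motif, the sequences that matched that many elements (an assumption
   about how the index is built, though the figures above are consistent
   with it), then every row should sum to (number of codes) x (number of
   elements). A minimal Python sketch, with the values transcribed by hand
   from this entry and illustrative names throughout:

```python
# Values transcribed from SUMMARY INFORMATION and COMPOSITE FINGERPRINT INDEX.
# Keys are the number of elements (motifs) matched.
summary = {4: 64, 3: 25, 2: 79}
index = {
    4: [64, 64, 64, 64],
    3: [24, 24, 14, 13],
    2: [53, 42, 22, 41],
}


def check_index(summary, index):
    """Return {n_elements: (observed_row_total, expected_row_total)}.

    Assumption: a sequence matching n elements contributes exactly n
    counts to row n, so the expected total is n * number_of_codes.
    """
    return {n: (sum(index[n]), n * summary[n]) for n in summary}


report = check_index(summary, index)
for n, (observed, expected) in sorted(report.items(), reverse=True):
    status = "ok" if observed == expected else "MISMATCH"
    print(f"{n} elements: row total {observed}, expected {expected} ({status})")
```

   A mismatch in any row would point to a transcription error in either
   table rather than to a problem with the fingerprint itself.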

True positives..
 Q25218         Q25219         Q25217         GLBI_CHITP     
 GLBC_CHITH     GLBK_CHITH     GLBH_CHITP     GLBH_CHITH     
 GLBF_CHITH     GLBE_CHITH     GLBI_CHITH     GLBD_CHITH     
 Q94445         Q25215         Q94442         Q27303         
 Q25216         GLBV_CHITP     GLB7_CHITH     O02368         
 GLBZ_CHITP     GLBZ_CHITH     GLB9_CHITH     GLB6_CHITH     
 Q94444         GLB2_CHITH     GLBP_CHITH     Q94443         
 GLB3_CHITP     GLB3_CHITH     P91600         P91595         
 GLB4_CHITH     O02369         P92191         O02370         
 P91592         P91594         P91593         GLBT_CHITH     
 GLB_BURLE      GLB_APLLI      GLB_DOLAU      GLB_APLJU      
 O02567         GLB_APLKU      GLB_CERRH      GLB_NASMU      
 GLB4_TYLHE     GLB2_TYLHE     GLB2_NIPBR     GLB_TUBTU      
 O02004         GLBC_CAUAR     GLB3_LUMTE     Q27430         
 GLB2_LUMTE     GLBH_CAEEL     Q27302         GLB1_PHESE     
 GLB3_TYLHE     GLB1_LUMTE     O61233         O07944         
Subfamily:  Codes involving 3 elements
 Subfamily True positives..
 GLB8_CHITH     GLBY_CHITP     GLBX_CHITH     GLBW_CHITP     
 GLBW_CHITH     GLB1_CHITH     GLB3_LAMSP     GLBC_NIPBR     
 Q26506         GLBA_ANATR     GLB_PSEDC      Q25689         
 GLB4_LUMTE     GLB1_PARCH     GLBA_SCAIN     Q17154         
 Q93101         GLBH_TRICO     Q17155         O02480         
 O61234         GLB1_GLYDI     Q27126         HBF1_URECA     
 HBB2_XENLA     
Subfamily:  Codes involving 2 elements
 Subfamily True positives..
 GLB1_LUCPE     GLB_BUSCA      Q65553         O61603         
 O55574         O82467         O30764         HBAM_RANCA     
 Q17286         GLBM_ANATR     Q54296         GLB4_GLYDI     
 GLBB_RIFPA     GLB_ASCSU      Q17153         HBA3_XENLA     
 HBA4_XENLA     O30766         HBA3_XENTR     HBA3_RANCA     
 HBA5_XENLA     Q17156         RYNR_PIG       O21882         
 P89459         O30765         O85168         O86363         
 Q54299         HBA_TRAST      Y06B_MYCTU     GLBD_CAUAR     
 HBPL_TRETO     HBPL_PARAD     Q42785         Q20798         
 O81116         Q09164         Q24409         O24520         
 HBP2_CASGL     O07407         O76242         GLB3_LUCPE     
 Q17157         HBA_LIOMI      Q89815         O76243         
 HBB_LATCH      GLB_ISOHY      HMPA_ALCEU     HBA3_PLEWA     
 O77003         Q50585         GLB_PAREP      P96645         
 PPS1_BACSU     Q24367         Q94543         GLB7_ARTSX     
 HD_HUMAN       HD_MOUSE       HD_RAT         O83423         
 GLB2_LUCPE     Q43236         Q43296         Q85438         
 VP7_RDV        GLBB_SCAIN     HD_FUGRU       DCOR_NEUCR     
 Q55179         Q85265         POLG_PVYHU     LGBA_PHAVU     
 O04939         Q03972         HYPF_AZOCH     


  PROTEIN TITLES
   Q25218           KC HBVIIB-G PRECURSOR - KIEFFERULUS CORNISHI.
   Q25219           KC HBVIIB-H PRECURSOR - KIEFFERULUS CORNISHI.
   Q25217           KC HBVIIB-E PRECURSOR - KIEFFERULUS CORNISHI.
   GLBI_CHITP       GLOBIN CTT-VIIB-8 PRECURSOR - CHIRONOMUS THUMMI PIGER (MIDGE
   GLBC_CHITH       GLOBIN CTT-VIIB-3 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDG
   GLBK_CHITH       GLOBIN CTT-VIIB-10 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MID
   GLBH_CHITP       GLOBIN CTT-VIIB-7 PRECURSOR - CHIRONOMUS THUMMI PIGER (MIDGE
   GLBH_CHITH       GLOBIN CTT-VIIB-7 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDG
   GLBF_CHITH       GLOBIN CTT-VIIB-6 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDG
   GLBE_CHITH       GLOBIN CTT-VIIB-5/CTT-VIIB-9 PRECURSOR - CHIRONOMUS THUMMI T
   GLBI_CHITH       GLOBIN CTT-VIIB-8 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDG
   GLBD_CHITH       GLOBIN CTT-VIIB-4 PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDG
   Q94445           TENTANS ORF'S (A-E) FOR HEMOGLOBIN PRECURSOR (A-E) - CHIRONO
   Q25215           KC HBVIIIB-A PRECURSOR - KIEFFERULUS CORNISHI.
   Q94442           TENTANS ORF'S (A-E) FOR HEMOGLOBIN PRECURSOR (A-E) - CHIRONO
   Q27303           KC HBVIIB-C PRECURSOR - KIEFFERULUS CORNISHI.
   Q25216           KC HBVIIB-B PRECURSOR - KIEFFERULUS CORNISHI.
   GLBV_CHITP       GLOBIN CTT-V PRECURSOR (HBV) - CHIRONOMUS THUMMI PIGER (MIDG
   GLB7_CHITH       GLOBIN CTT-VIIA - CHIRONOMUS THUMMI THUMMI (MIDGE).
   O02368           GLOBIN VIIA.1 - CHIRONOMUS THUMMI THUMMI (MIDGE).
   GLBZ_CHITP       GLOBIN CTT-Z PRECURSOR (HBZ) - CHIRONOMUS THUMMI PIGER (MIDG
   GLBZ_CHITH       GLOBIN CTT-Z PRECURSOR (HBZ) - CHIRONOMUS THUMMI THUMMI (MID
   GLB9_CHITH       GLOBIN CTT-IX PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDGE).
   GLB6_CHITH       GLOBIN CTT-VI PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDGE).
   Q94444           TENTANS ORF'S (A-E) FOR HEMOGLOBIN PRECURSOR (A-E) - CHIRONO
   GLB2_CHITH       GLOBIN CTT-II BETA PRECURSOR - CHIRONOMUS THUMMI THUMMI (MID
   GLBP_CHITH       GLOBIN CTT-E/E' PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDGE)
   Q94443           TENTANS ORF'S (A-E) FOR HEMOGLOBIN PRECURSOR (A-E) - CHIRONO
   GLB3_CHITP       GLOBIN CTP-III (ERYTHROCRUORIN III) - CHIRONOMUS THUMMI PIGE
   GLB3_CHITH       GLOBIN CTT-III PRECURSOR (ERYTHROCRUORIN III) - CHIRONOMUS T
   P91600           GLOBIN CTT 3-1 - CHIRONOMUS THUMMI THUMMI (MIDGE).
   P91595           GLOBIN CPA E - CHIRONOMUS PALLIDIVITTATUS (MIDGE).
   GLB4_CHITH       GLOBIN CTT-IV PRECURSOR - CHIRONOMUS THUMMI THUMMI (MIDGE).
   O02369           GLOBIN XII - CHIRONOMUS THUMMI THUMMI (MIDGE).
   P92191           GLOBIN CPA 4-2 - CHIRONOMUS PALLIDIVITTATUS (MIDGE).
   O02370           GLOBIN XI - CHIRONOMUS THUMMI THUMMI (MIDGE).
   P91592           GLOBIN CPA 3-1 - CHIRONOMUS PALLIDIVITTATUS (MIDGE).
   P91594           GLOBIN CPA 3-2 - CHIRONOMUS PALLIDIVITTATUS (MIDGE).
   P91593           GLOBIN CPA F - CHIRONOMUS PALLIDIVITTATUS (MIDGE).
   GLBT_CHITH       GLOBIN CTT-IIIA - CHIRONOMUS THUMMI THUMMI (MIDGE).
   GLB_BURLE        GLOBIN (MYOGLOBIN) - BURSATELLA LEACHII (RAGGED SEA HARE).
   GLB_APLLI        GLOBIN (MYOGLOBIN) - APLYSIA LIMACINA (SLUG SEA HARE).
   GLB_DOLAU        GLOBIN (MYOGLOBIN) - DOLABELLA AURICULARIA (SEA HARE).
   GLB_APLJU        GLOBIN (MYOGLOBIN) - APLYSIA JULIANA (SEA HARE).
   O02567           MYOGLOBIN - APLYSIA JULIANA (SEA HARE).
   GLB_APLKU        GLOBIN (MYOGLOBIN) - APLYSIA KURODAI (KURODA'S SEA HARE).
   GLB_CERRH        GLOBIN (MYOGLOBIN) - CERITHIDEA RHIZOPHORARUM (WATER SNAIL) 
   GLB_NASMU        GLOBIN (MYOGLOBIN) - NASSA MUTABILIS (SEA SNAIL).
   GLB4_TYLHE       GLOBIN IIC, EXTRACELLULAR (ERYTHROCRUORIN) - TYLORRHYNCHUS H
   GLB2_TYLHE       GLOBIN IIA, EXTRACELLULAR (ERYTHROCRUORIN) - TYLORRHYNCHUS H
   GLB2_NIPBR       MYOGLOBIN (GLOBIN, BODY WALL ISOFORM) - NIPPOSTRONGYLUS BRAS
   GLB_TUBTU        GLOBIN, EXTRACELLULAR MONOMERIC - TUBIFEX TUBIFEX (SLUDGE WO
   O02004           HEMOGLOBIN - DAPHNIA MAGNA.
   GLBC_CAUAR       GLOBIN C, COELOMIC - CAUDINA ARENICOLA (SEA CUCUMBER) (MOLPA
   GLB3_LUMTE       GLOBIN III, EXTRACELLULAR PRECURSOR (ERYTHROCRUORIN) (GLOBIN
   Q27430           GLOBIN - CAENORHABDITIS REMANEI.
   GLB2_LUMTE       GLOBIN II, EXTRACELLULAR (ERYTHROCRUORIN) (GLOBIN AIII) (GLO
   GLBH_CAEEL       PUTATIVE GLOBIN-LIKE PROTEIN - CAENORHABDITIS ELEGANS.
   Q27302           GLOBIN - CAENORHABDITIS BRIGGSAE.
   GLB1_PHESE       GLOBIN I, EXTRACELLULAR (ERYTHROCRUORIN) - PHERETIMA SIEBOLD
   GLB3_TYLHE       GLOBIN IIB, EXTRACELLULAR (ERYTHROCRUORIN) - TYLORRHYNCHUS H
   GLB1_LUMTE       GLOBIN I, EXTRACELLULAR (ERYTHROCRUORIN) (GLOBIN D) - LUMBRI
   O61233           HEMOGLOBIN CHAIN D1 PRECURSOR - LUMBRICUS TERRESTRIS (COMMON
   O07944           PRISTINAMYCIN I SYNTHASE 3 AND 4 - STREPTOMYCES PRISTINAESPI
 
   GLB8_CHITH       GLOBIN CTT-VIII - CHIRONOMUS THUMMI THUMMI (MIDGE).
   GLBY_CHITP       GLOBIN CTT-Y PRECURSOR (HBY) - CHIRONOMUS THUMMI PIGER (MIDG
   GLBX_CHITH       GLOBIN CTT-X - CHIRONOMUS THUMMI THUMMI (MIDGE).
   GLBW_CHITP       GLOBIN CTT-W PRECURSOR (HBW) - CHIRONOMUS THUMMI PIGER (MIDG
   GLBW_CHITH       GLOBIN CTT-W PRECURSOR (HBW) - CHIRONOMUS THUMMI THUMMI (MID
   GLB1_CHITH       GLOBIN CTT-I/CTT-IA PRECURSOR (ERYTHROCRUORIN) - CHIRONOMUS 
   GLB3_LAMSP       GIANT HEMOGLOBIN AIII CHAIN - LAMELLIBRACHIA SP. (DEEP-SEA G
   GLBC_NIPBR       GLOBIN, CUTICULAR ISOFORM PRECURSOR - NIPPOSTRONGYLUS BRASIL
   Q26506           A POLYPEPTIDE CHAIN OF CHLOROCRUORIN PRECURSOR - SABELLASTAR
   GLBA_ANATR       GLOBIN I ALPHA CHAIN - ANADARA TRAPEZIA (ARK CLAM).
   GLB_PSEDC        EXTRACELLULAR GLOBIN PRECURSOR - PSEUDOTERRANOVA DECIPIENS (
   Q25689           'HEMOGLOBIN, ABNORMAL' PRECURSOR - PSEUDOTERRANOVA DECIPIENS
   GLB4_LUMTE       GLOBIN IV, EXTRACELLULAR (ERYTHROCRUORIN) (GLOBIN A) - LUMBR
   GLB1_PARCH       GLOBIN I - PARACAUDINA CHILENSIS (SEA CUCUMBER).
   GLBA_SCAIN       GLOBIN II, A CHAIN (HBII-A) - SCAPHARCA INAEQUIVALVIS (ARK C
   Q17154           TWO-DOMAIN CHAIN OF THE POLYMERIC HEMOGLOBIN (INTRACELLULAR)
   Q93101           NERVE MYOGLOBIN - APHRODITE ACULEATA.
   GLBH_TRICO       GLOBIN-LIKE HOST-PROTECTIVE ANTIGEN PRECURSOR - TRICHOSTRONG
   Q17155           ALPHA CHAIN OF THE TETRAMERIC HEMOGLOBIN (INTRACELLULAR) - B
   O02480           HEMOGLOBIN B CHAIN - SCAPHARCA INAEQUIVALVIS (ARK CLAM).
   O61234           HEMOGLOBIN CHAIN D2 PRECURSOR - LUMBRICUS TERRESTRIS (COMMON
   GLB1_GLYDI       GLOBIN, MAJOR MONOMERIC COMPONENT - GLYCERA DIBRANCHIATA (BL
   Q27126           F-I HEMOGLOBIN - URECHIS CAUPO (INNKEEPER WORM) (SPOONWORM).
   HBF1_URECA       HEMOGLOBIN F-I - URECHIS CAUPO (INNKEEPER WORM) (SPOONWORM).
   HBB2_XENLA       HEMOGLOBIN BETA-2 CHAIN (MINOR) (LARVAL BETA-II-GLOBIN) (B2G
 
   GLB1_LUCPE       HEMOGLOBIN I (HB I) - LUCINA PECTINATA (CLAM).
   GLB_BUSCA        GLOBIN (MYOGLOBIN) - BUSYCON CANALICULATUM (CHANNELED WHELK)
   Q65553           UL36 - BOVINE HERPESVIRUS 1.
   O61603           EYELID - DROSOPHILA MELANOGASTER (FRUIT FLY).
   O55574           VP80 - LEUCANIA SEPARATA NUCLEAR POLYHEDROSIS VIRUS (LSNPV).
   O82467           PEROXISOMAL TARGETING SIGNAL TYPE 1 RECEPTOR - ARABIDOPSIS T
   O30764           POLYKETIDE SYNTHASE MODULES 1 AND 2 - STREPTOMYCES CAELESTIS
   HBAM_RANCA       HEMOGLOBIN ALPHA-TYPE CHAIN, HEART MUSCLE - RANA CATESBEIANA
   Q17286           HEMOGLOBIN (HETERODIMERIC) - BARBATIA VIRESCENS.
   GLBM_ANATR       GLOBIN, MINOR - ANADARA TRAPEZIA (ARK CLAM).
   Q54296           POLYKETIDE SYNTHASE - STREPTOMYCES HYGROSCOPICUS.
   GLB4_GLYDI       GLOBIN, MONOMERIC COMPONENT M-IV (GMH4) - GLYCERA DIBRANCHIA
   GLBB_RIFPA       GIANT HEMOGLOBINS B CHAIN - RIFTIA PACHYPTILA (TUBE WORM).
   GLB_ASCSU        EXTRACELLULAR GLOBIN PRECURSOR - ASCARIS SUUM (PIG ROUNDWORM
   Q17153           HEMOGLOBIN (2 DOMAIN) - BARBATIA LIMA.
   HBA3_XENLA       HEMOGLOBIN ALPHA-3 CHAIN (ALPHA-T3) - XENOPUS LAEVIS (AFRICA
   HBA4_XENLA       HEMOGLOBIN ALPHA-4 CHAIN (ALPHA-T4) - XENOPUS LAEVIS (AFRICA
   O30766           POLYKETIDE SYNTHASE MODULES 4 AND 5 - STREPTOMYCES CAELESTIS
   HBA3_XENTR       HEMOGLOBIN ALPHA-3 CHAIN (LARVAL) - XENOPUS TROPICALIS (WEST
   HBA3_RANCA       HEMOGLOBIN ALPHA-III CHAIN, LARVAL - RANA CATESBEIANA (BULL 
   HBA5_XENLA       HEMOGLOBIN ALPHA-5 CHAIN (ALPHA-T5) - XENOPUS LAEVIS (AFRICA
   Q17156           BETA CHAIN OF THE TETRAMERIC HEMOGLOBIN (INTRACELLULAR) - BA
   RYNR_PIG         RYANODINE RECEPTOR, SKELETAL MUSCLE (SKELETAL MUSCLE CALCIUM
   O21882           HYPOTHETICAL 104.5 KD PROTEIN - BACTERIOPHAGE SK1.
   P89459           VERY LARGE TEGUMENT PROTEIN - HERPES SIMPLEX VIRUS (TYPE 2).
   O30765           POLYKETIDE SYNTHASE MODULE 3 - STREPTOMYCES CAELESTIS.
   O85168           SYRINGOMYCIN SYNTHETASE - PSEUDOMONAS SYRINGAE (PV. SYRINGAE
   O86363           HYPOTHETICAL 43.3 KD PROTEIN - MYCOBACTERIUM TUBERCULOSIS.
   Q54299           POLYKETIDE SYNTHASE - STREPTOMYCES HYGROSCOPICUS.
   HBA_TRAST        HEMOGLOBIN ALPHA CHAIN - TRAGELAPHUS STREPSICEROS (GREATER K
   Y06B_MYCTU       HYPOTHETICAL 139.6 KD PROTEIN CY338.11C PRECURSOR - MYCOBACT
   GLBD_CAUAR       GLOBIN D, COELOMIC - CAUDINA ARENICOLA (SEA CUCUMBER) (MOLPA
   HBPL_TRETO       HEMOGLOBIN - TREMA TOMENTOSA.
   HBPL_PARAD       NONLEGUME HEMOGLOBIN I - PARASPONIA ANDERSONII, AND PARASPON
   Q42785           LEGHEMOGLOBIN - GLYCINE MAX (SOYBEAN).
   Q20798           F55A11.3 PROTEIN - CAENORHABDITIS ELEGANS.
   O81116           LEGHEMOGLOBIN - TREMA ORIENTALIS.
   Q09164           CYCLOSPORIN SYNTHETASE (CYSYN) (EC 6.-.-.-) - TOLYPOCLADIUM 
   Q24409           MUSASHI - DROSOPHILA MELANOGASTER (FRUIT FLY).
   O24520           LEGHEMOGLOBIN - ARABIDOPSIS THALIANA (MOUSE-EAR CRESS).
   HBP2_CASGL       HEMOGLOBIN II - CASUARINA GLAUCA (SWAMP OAK).
   O07407           ALCOHOL DEHYDROGENASE - MYCOBACTERIUM TUBERCULOSIS.
   O76242           NEURAL GLOBIN - CEREBRATULUS LACTEUS (MILKY RIBBON WORM).
   GLB3_LUCPE       HEMOGLOBIN III (HB III) - LUCINA PECTINATA (CLAM).
   Q17157           DELTA CHAIN OF THE HOMODIMERIC HEMOGLOBIN (INTRACELLULAR) - 
   HBA_LIOMI        HEMOGLOBIN ALPHA CHAIN - LIOPHIS MILIARIS.
   Q89815           UL37 - BOVINE HERPESVIRUS 1.
   O76243           BODY WALL GLOBIN - CEREBRATULUS LACTEUS (MILKY RIBBON WORM).
   HBB_LATCH        HEMOGLOBIN BETA CHAIN - LATIMERIA CHALUMNAE (LATIMERIA) (COE
   GLB_ISOHY        GLOBIN (MYOGLOBIN) - ISOPARORCHIS HYPSELOBAGRI.
   HMPA_ALCEU       FLAVOHEMOPROTEIN (HEMOGLOBIN-LIKE PROTEIN) (FLAVOHEMOGLOBIN)
   HBA3_PLEWA       HEMOGLOBIN ALPHA CHAIN, LARVAL - PLEURODELES WALTLII (IBERIA
   O77003           MYOGLOBIN - BIOMPHALARIA GLABRATA (BLOODFLUKE PLANORB).
   Q50585           HYPOTHETICAL 122.4 KD PROTEIN CY19G5.06 - MYCOBACTERIUM TUBE
   GLB_PAREP        GLOBIN-3 (MYOGLOBIN) - PARAMPHISTOMUM EPICLITUM.
   P96645           YDDH PROTEIN - BACILLUS SUBTILIS.
   PPS1_BACSU       PEPTIDE SYNTHETASE 1 - BACILLUS SUBTILIS.
   Q24367           INSCUTEABLE - DROSOPHILA MELANOGASTER (FRUIT FLY).
   Q94543           NEM (NEM) - DROSOPHILA MELANOGASTER (FRUIT FLY).
   GLB7_ARTSX       GLOBIN E7, EXTRACELLULAR - ARTEMIA SP. (BRINE SHRIMP).
   HD_HUMAN         HUNTINGTIN (HUNTINGTON'S DISEASE PROTEIN) (HD PROTEIN) - HOM
   HD_MOUSE         HUNTINGTIN (HUNTINGTON'S DISEASE PROTEIN HOMOLOG) (HD PROTEI
   HD_RAT           HUNTINGTIN (HUNTINGTON'S DISEASE PROTEIN HOMOLOG) (HD PROTEI
   O83423           HYPOTHETICAL 124.0 KD PROTEIN - TREPONEMA PALLIDUM.
   GLB2_LUCPE       HEMOGLOBIN II (HB II) - LUCINA PECTINATA (CLAM).
   Q43236           LEGHEMOGLOBIN - VIGNA UNGUICULATA (COWPEA).
   Q43296           LEGHEMOGLOBIN - VIGNA UNGUICULATA (COWPEA).
   Q85438           NONSTRUCTURAL PROTEIN - RICE DWARF VIRUS (RDV).
   VP7_RDV          NONSTRUCTURAL PROTEIN PNS7 - RICE DWARF VIRUS (RDV).
   GLBB_SCAIN       GLOBIN II, B CHAIN (HBII-B) - SCAPHARCA INAEQUIVALVIS (ARK C
   HD_FUGRU         HUNTINGTIN (HUNTINGTON'S DISEASE PROTEIN HOMOLOG) (HD PROTEI
   DCOR_NEUCR       ORNITHINE DECARBOXYLASE (EC 4.1.1.17) (ODC) - NEUROSPORA CRA
   Q55179           HYPOTHETICAL 57.1 KD PROTEIN - SYNECHOCYSTIS SP. (STRAIN PCC
   Q85265           POLYPROTEIN - POTATO VIRUS Y.
   POLG_PVYHU       GENOME POLYPROTEIN [CONTAINS: N-TERMINAL PROTEIN (P1); HELPE
   LGBA_PHAVU       LEGHEMOGLOBIN A - PHASEOLUS VULGARIS (KIDNEY BEAN) (FRENCH B
   O04939           LEGHEMOGLOBIN - PHASEOLUS VULGARIS (KIDNEY BEAN) (FRENCH BEA
   Q03972           LEGHEMOGLOBIN 2 - PHASEOLUS VULGARIS (KIDNEY BEAN) (FRENCH B
   HYPF_AZOCH       TRANSCRIPTIONAL REGULATORY PROTEIN HYPF - AZOTOBACTER CHROOC

  SCAN HISTORY
   OWL28_2    3  1370  NSINGLE
   SPTR37_9f  3  1000  NSINGLE

  INITIAL MOTIF SETS

   ERYTHCRUORIN1  Length of motif = 23  Motif number = 1
   Erythrocruorin motif I - 1
                            PCODE         ST  INT
   VRAVFDDLFKHYPTSKALFERVK  GLB4_LUMTE    35   35
   SLHFWKEFLHDHPDLVSLFKRVQ  GLB1_PHESE    28   28
   GQAIFQELFALDPNAKGVFGRVN  GLB3_TYLHE    32   32
   GIALWKSMFAQDNDARDLFKRVH  GLB2_TYLHE    31   31
   GLDFLVALFEKFPDSANFFADFK  GLB_APLLI     25   25
   EVEILAAVFAAYPDIQNKFSQFA  GLBE_CHITH    38   38
   SVGILYAVFKADPTIQAAFPQFV  GLBZ_CHITH    35   35
   GLELWKGILREHPEIKAPFSRVR  GLB1_LUMTE    28   28
   SQAIWRATFAQVPESRSLFKRVH  GLB2_LUMTE    30   30
   GRLLLTKLAKDIPDVNDLFKRVD  GLB3_LUMTE    53   53
   GLELFTKYFHENPQMMFIFGYSG  GLP2_GLYDI    28   28
   GREFYKYFFTNHQDLRKYFKGAE  GLBH_TRICO    44   44
   GNAFFRYFFTNFPDLRVYFKGAE  GLBH_CAEEL    31   31
   GIDLYKHMFENYPPLRKYFKNRE  GLB_ASCSU     44   44
   GIDLYKHMFENYPSMREAFKDRE  GLB_ASCSU    193  193
   GIDLYKHMFEHYPAMKKYFKHRE  GLB_PSEDC     44   44
   GIDLYKHMFEHYPHMRKAFKGRE  GLB_PSEDC    193  193

   ERYTHCRUORIN2  Length of motif = 24  Motif number = 2
   Erythrocruorin motif II - 1
                             PCODE         ST  INT
   LDDTLVLQSHLGHLADQHIQRKGV  GLB4_LUMTE    84   26
   LDEDDTFTVQLAHLKAQHTERGTK  GLB1_PHESE    77   26
   LEDPKALQEELKHLARQHRERSGV  GLB3_TYLHE    81   26
   LTDEPVLNAQLEHLRQQHIKLGIT  GLB2_TYLHE    80   26
   AANAGKMSAMLSQFAKEHVGFGVG  GLB_APLLI     78   30
   TSNAAAVNSLVSKLGDDHKARGVS  GLBE_CHITH    94   33
   IDDLPNIGKHVDALVATHKPRGVT  GLBZ_CHITH    85   27
   LDTPDMLAAQLAHLKVQHVERNLK  GLB1_LUMTE    77   26
   LDQPATLKEELDHLQVQHEGRKIP  GLB2_LUMTE    79   26
   LDDPPALDAALDHLAHQHEVREGV  GLB3_LUMTE   102   26
   MDNAKQMAGTLHALGVRHKGFGDI  GLP2_GLYDI    79   28
   YDNEMIFRAFVRDTIDRHVDRGLD  GLBH_TRICO    97   30
   YTNEEVFKGYVRETINRHRIYKMD  GLBH_CAEEL    84   30
   YDDRETFNAYTRELLDRHARDHVH  GLB_ASCSU     97   30
   YDDEETFHMYVHELMERHERLGVQ  GLB_ASCSU    246   30
   YDDRETFDAYVGELMARHERDHVK  GLB_PSEDC     97   30
   YDDEPTFDYFVDALMDRHIKDDIH  GLB_PSEDC    246   30

   ERYTHCRUORIN3  Length of motif = 17  Motif number = 3
   Erythrocruorin motif III - 1
                      PCODE         ST  INT
   YFRGIGEAFARVLPQVL  GLB4_LUMTE   111    3
   YFDLFGTQLFDILGDKL  GLB1_PHESE   103    2
   YFDEMEKALLKVLPQVS  GLB3_TYLHE   108    3
   MFNLMRTGLAYVLPAQL  GLB2_TYLHE   106    2
   QFENVRSMFPGFVASVA  GLB_APLLI    104    2
   QFGEFRTALVAYLQANV  GLBE_CHITH   120    2
   QFNNFRAAFIAYLKGHV  GLBZ_CHITH   111    2
   FFDIFLKHLLHVLGDRL  GLB1_LUMTE   103    2
   YFDAFKTAILHVVAAQL  GLB2_LUMTE   105    2
   HFKKFGEILATGLPQVL  GLB3_LUMTE   129    3
   FFPALGMCLLDAMEEKV  GLP2_GLYDI   106    3
   LWKEFWSIYQKFLESKG  GLBH_TRICO   123    2
   LWMAFFTVFTGYLESVG  GLBH_CAEEL   110    2
   VWTDFWKLFEEYLGKKT  GLB_ASCSU    125    4
   HWTDFWKLFEEFLEKKS  GLB_ASCSU    274    4
   VWNHFWEHFIEFLGSKT  GLB_PSEDC    125    4
   QWHEFWKLFAEYLNEKS  GLB_PSEDC    274    4

   ERYTHCRUORIN4  Length of motif = 15  Motif number = 4
   Erythrocruorin motif IV - 1
                    PCODE         ST  INT
   NVDAWNRCFHRLVAR  GLB4_LUMTE   131    3
   DQAAWRDCYAVIAAG  GLB1_PHESE   124    4
   NSGAWDRCFTRIADV  GLB3_TYLHE   128    3
   DKEAWAACWDEVIYP  GLB2_TYLHE   127    4
   ADAAWTKLFGLIIDA  GLB_APLLI    126    5
   VAAAWNKALDNTFAI  GLBE_CHITH   142    5
   VEAAWGATFDAFFGA  GLBZ_CHITH   133    5
   DFGAWHDCVDQIIDG  GLB1_LUMTE   124    4
   DREAWDACIDHIEDG  GLB2_LUMTE   126    4
   DALAWKSCLKGILTK  GLB3_LUMTE   149    3
   WAAAYREISDALVAG  GLP2_GLYDI   130    7
   QKAAFDAIGTRFNDE  GLBH_TRICO   146    6
   QKAAWMALGKEFNAE  GLBH_CAEEL   132    5
   TKQAWHEIGREFAKE  GLB_ASCSU    147    5
   TKHAWAVIGKEFAYE  GLB_ASCSU    296    5
   TKHAWQEIGKEFSHE  GLB_PSEDC    147    5
   EKHAWSTIGEDFAHE  GLB_PSEDC    298    7

  FINAL MOTIF SETS

   ERYTHCRUORIN1  Length of motif = 23  Motif number = 1
   Erythrocruorin motif I - 3
                            PCODE         ST  INT
   EADILYAVFKAYPDIQAKFPQFA  Q25218        38   38
   EADILYAVFKAYPDIQAKFPQFA  Q25219        38   38
   EADILYAVFKAYPDIQAKFPQFA  Q25217        38   38
   EVEILAAVFAAYPDIQNKFPQFA  GLBI_CHITP    38   38
   EVDILAAVFAAYPDIQAKFPQFA  GLBC_CHITH    38   38
   EVEILAAVFAAYPDIQNKFSQFA  GLBK_CHITH    38   38
   EVDILAAVFAAYPDIQAKFPQFA  GLBH_CHITP    38   38
   EVDILAAVFAAYPDIQAKFPQFA  GLBH_CHITH    38   38
   EVEILAAVFAAYPDIQNKFSQFA  GLBF_CHITH    38   38
   EVEILAAVFAAYPDIQNKFSQFA  GLBE_CHITH    38   38
   EVEILAAVFAAYPDIQNKFPQFA  GLBI_CHITH    38   38
   EVDILAAVFAAYPDIQAKFPQFA  GLBD_CHITH    38   38
   EVDILAAVFTANPDIQARFPQFA  Q94445        38   38
   EADILYAVFKAYPDIQAKFPQFA  Q25215        38   38
   EVDILAAIFAANPDIQARFSQFA  Q94442        40   40
   EVDILYAVFKAYPDIQNKFSQFA  Q27303        38   38
   EVDILYAVFKAYPDIQNKFSQFA  Q25216        38   38
   EVDILAAVFKAYPDIQAKFPQFA  GLBV_CHITP    38   38
   EVEILAAVFTAYPDIQARFPQFA  GLB7_CHITH    22   22
   EVEILAAVFTAYPDIQARFPQFA  O02368        38   38
   EVDILYTVFKAYPDIQARFPQFA  GLBZ_CHITP    38   38
   EVDILYTVFKAYPDIQARFPQFA  GLBZ_CHITH    38   38
   EVDILAAVFSDHPDIQARFPQFA  GLB9_CHITH    38   38
   EVDILYAVFKAYPDIMAKFPQFA  GLB6_CHITH    37   37
   EVDILAAVFKDHPDIQARFPQFA  Q94444        38   38
   EVDILYYIFKANPDIMAKFPQFA  GLB2_CHITH    37   37
   SVGILYAVFKADPTIQAAFPQFV  GLBP_CHITH    35   35
   EVDILYYIFKANPDIMAKFPQFV  Q94443        37   37
   PVGILYAVFKADPSIMAKFTQFA  GLB3_CHITP    20   20
   PVGILYAVFKADPSIMAKFTQFA  GLB3_CHITH    35   35
   PVGILYAVFKADPSIMAKFTQFA  P91600        35   35
   STGILYAVFKADSSIQAAFPQFV  P91595        35   35
   AVGILYAVFKADPSIQAKFTQFA  GLB4_CHITH    35   35
   EVDILYSIFAANPDIQARFPQFA  O02369        39   39
   SVGILYAVFKADPSIQAKFSQFA  P92191        35   35
   EVDILYAIFKANPDIQARFPQFA  O02370        39   39
   PVGILYAVFKADPSIMAKFTQFA  P91592        35   35
   SVGILYAVFKADPSIQTKFTQFA  P91594        35   35
   PVGILYACLKADPSIQEKFPQFA  P91593        38   38
   GVEILYFFLNKFPGNFPMFKKLG  GLBT_CHITH    28   28
   GDNFLIALFEAFPDSANFFGDFK  GLB_BURLE     25   25
   GDAFLVALFEKFPDSANFFADFK  GLB_APLLI     25   25
   GDNFLIALFEAYPDSPNFFADFK  GLB_DOLAU     25   25
   GASFLVALFTQFPESANFFNDFK  GLB_APLJU     25   25
   GDSFLVALFTQFPESANFFNDFK  O02567        26   26
   GDAFLLSLFEKFPNNANYFADFK  GLB_APLKU     25   25
   GATLFSLLFKQFPDTRNYFTHFG  GLB_CERRH     28   28
   SAAMFGLLFEKYPDTKKHFKTFD  GLB_NASMU     28   28
   GRLLFEELFEIDGATKGLFKRVN  GLB4_TYLHE    32   32
   GIALWKSMFAQDNDARDLFKRVH  GLB2_TYLHE    31   31
   GKDFYKFFFTNHPDLRKYFKGAE  GLB2_NIPBR    26   26
   GLKLWNSIFRDAPEIRGLFKRVD  GLB_TUBTU     28   28
   APQVLFRFVKAHPEYQKMFSKFA  O02004        70   70
   VTDVFIRIFAYDPSAQNKFPQMA  GLBC_CAUAR    35   35
   GRLLLTKLAKDIPDVNDLFKRVD  GLB3_LUMTE    53   53
   GNGFYQYFFTNFPDLRVYFKGAE  Q27430        31   31
   SQAIWRATFAQVPESRSLFKRVH  GLB2_LUMTE    30   30
   GNAFFRYFFTNFPDLRVYFKGAE  GLBH_CAEEL    31   31
   GNGFYQYFFTNFPDLRVYFKGAE  Q27302        31   31
   SLHFWKEFLHDHPDLVSLFKRVQ  GLB1_PHESE    28   28
   GQAIFQELFALDPNAKGVFGRVN  GLB3_TYLHE    32   32
   GLELWKGILREHPEIKAPFSRVR  GLB1_LUMTE    28   28
   GLELWRDIIDDHPEIKAPFSRVR  O61233        46   46
   VVARLAAHLAGRPDLADKVTVDA  O07944      2053 2053

   ERYTHCRUORIN2  Length of motif = 24  Motif number = 2
   Erythrocruorin motif II - 3
                             PCODE         ST  INT
   ESNLAAVNNLVSKLGADHKARGVT  Q25218        94   33
   ESNLAAVNNLVSKLGADHKARGVT  Q25219        94   33
   AANLSAVYNLVSKLGADHKARGVT  Q25217        94   33
   ESNASAVNSLVSKLGDDHKARGVS  GLBI_CHITP    94   33
   ESNASAVNSLVSKLGDDHKARGVS  GLBC_CHITH    94   33
   ESNASAVNSLVSKLGDDHKARGVS  GLBK_CHITH    94   33
   QANLSAVYALVSKLGVDHKARGIS  GLBH_CHITP    94   33
   QANLSAVYALVSKLGVDHKARGIS  GLBH_CHITH    94   33
   DSNAAAVNSLVSKLGDDHKARGVS  GLBF_CHITH    94   33
   TSNAAAVNSLVSKLGDDHKARGVS  GLBE_CHITH    94   33
   ESNASAVNSLVSKLGDDHKARGVS  GLBI_CHITH    94   33
   ASNAAAVEGLLNKLGSDHKARGVS  GLBD_CHITH    94   33
   ESNAPAVQTLVGQLAASHKARGIS  Q94445        94   33
   ASNLGAINNIVSKLGADHNGRGVT  Q25215        94   33
   AANAPALQTLVGQLAASHKARGIP  Q94442        96   33
   EANAGAIQNIVSKFGADHNARGVT  Q27303        94   33
   EANAVAIQNIVSKFGADHNARGVT  Q25216        94   33
   EANLSAVYGLVKKLGVDHKNRGIT  GLBV_CHITP    94   33
   ESNAPAVQTLVGQLAASHKARGIS  GLB7_CHITH    78   33
   ESNAPAVQTLVGQLAASHKARGIS  O02368        94   33
   ESNLSAIYGLISKMGTDHKNRGIT  GLBZ_CHITP    94   33
   ESNLSAIYGLISKMGTDHKNRGIT  GLBZ_CHITH    94   33
   ESNAPAMATLINELSTSHHNRGIT  GLB9_CHITH    94   33
   DANIPAIQNLAKELATSHKPRGVS  GLB6_CHITH    93   33
   EANRPAMNTLTNELATNHHNRGIS  Q94444        94   33
   SANMPAMETLIKDMAANHKARGIP  GLB2_CHITH    93   33
   IDDLPNIGKHVDALVATHKPRGVT  GLBP_CHITH    85   27
   EANRPAMVTLINEMAANHKARKIP  Q94443        93   33
   IGELPNIEADVNTFVASHKPRGVT  GLB3_CHITP    70   27
   IGELPNIEADVNTFVASHKPRGVT  GLB3_CHITH    85   27
   IGELPNIDGDVNTFVASHKPRGVT  P91600        85   27
   IDDLPNIGKHVDALVATHKPRGVT  P91595        85   27
   IGDLPNIDGDVTTFVASHTPRGVT  GLB4_CHITH    85   27
   DSGVSAAKTLINEVAASHKGRGVS  O02369        95   33
   VGDLPNISGDVDTFVASHKPRGAT  P92191        85   27
   ESGISAAKTLINALGASHRGRGIS  O02370        95   33
   IGDLPSIEGDVDTFVTSHKPRGVT  P91592        85   27
   ISELPNIDADVDAFVATHKPRSVT  P91594        85   27
   IYELPDMERDVDTFVASHKPRGIT  P91593        88   27
   GSDMGGAKALLNQLGTSHKAMGIT  GLBT_CHITH    81   30
   AADAGKMAGMLDQFSKEHVGFGVG  GLB_BURLE     78   30
   AADAGKMSAMLSQFAKEHVGFGVG  GLB_APLLI     78   30
   AADAGKMAAMLDQFSKEHAGFGVG  GLB_DOLAU     78   30
   AADAGKMGSMLQQFATEHAGFGVG  GLB_APLJU     78   30
   AADAGKMGSMLQQFATEHAGFGVG  O02567        79   30
   AADAGKMSAMLSQFASEHVGFGVG  GLB_APLKU     78   30
   MDDADCMNGLALKLSRNHIQRKIG  GLB_CERRH     81   30
   VDDGECVLGLAKKLSRNHTARGVT  GLB_NASMU     81   30
   LGDSDTLNSLIDHLAEQHKARAGF  GLB4_TYLHE    81   26
   LTDEPVLNAQLEHLRQQHIKLGIT  GLB2_TYLHE    80   26
   FDNEDVFRAFCRETIDRHVGRGLD  GLB2_NIPBR    79   30
   LDDQAAFDAQLAHLKSQHAERNIK  GLB_TUBTU     77   26
   LFSQELMANQLNALGGAHQPRGAT  O02004       123   30
   ELDSDILPELLATLARTHDLNKVG  GLBC_CAUAR    87   29
   LDDPPALDAALDHLAHQHEVREGV  GLB3_LUMTE   102   26
   YTNEEVFKAYVRETVNRHRIYKMD  Q27430        84   30
   LDQPATLKEELDHLQVQHEGRKIP  GLB2_LUMTE    79   26
   YTNEEVFKGYVRETINRHRIYKMD  GLBH_CAEEL    84   30
   FTNEEVFKAYVRETINRHRIYKMD  Q27302        84   30
   LDEDDTFTVQLAHLKAQHTERGTK  GLB1_PHESE    77   26
   LEDPKALQEELKHLARQHRERSGV  GLB3_TYLHE    81   26
   LDTPDMLAAQLAHLKVQHVERNLK  GLB1_LUMTE    77   26
   LDTPDMLAAQLAHLKVQHVERNLK  O61233        95   26
   TLDTAALRAALADVTARHEALRTV  O07944      2519  443

   ERYTHCRUORIN3  Length of motif = 17  Motif number = 3
   Erythrocruorin motif III - 3
                      PCODE         ST  INT
   QFGEFRTALVAYLQAHV  Q25218       120    2
   QFGEFRTALVAYLQAHV  Q25219       120    2
   QFGEFRTALVAYLQAHV  Q25217       120    2
   QFGEFRTALVAYLQANV  GLBI_CHITP   120    2
   QFGEFRTALVAYLSNHV  GLBC_CHITH   120    2
   QFGEFRTALVAYLQANV  GLBK_CHITH   120    2
   QFGEFRTALVSYLQAHV  GLBH_CHITP   120    2
   QFGEFRTALVSYLQAHV  GLBH_CHITH   120    2
   QFGEFRTALVAYLQANV  GLBF_CHITH   120    2
   QFGEFRTALVAYLQANV  GLBE_CHITH   120    2
   QFGEFRTALVAYLSNHV  GLBI_CHITH   120    2
   QFGEFRTALVSYLSNHV  GLBD_CHITH   120    2
   QFNEFRASLVSYLQANV  Q94445       120    2
   QFGEFRTALMAYLQAHV  Q25215       120    2
   QFGEFRTSLVAYLQANV  Q94442       122    2
   QFGEFRTALFAYLQAHV  Q27303       120    2
   QFGEFRTALFAYLQAHV  Q25216       120    2
   QFNEFKTALISYLSSHV  GLBV_CHITP   120    2
   QFNEFRAGLVSYVSSNV  GLB7_CHITH   104    2
   QFNEFRAGLVSYVSSNV  O02368       120    2
   QFNEFRTALVSYISSNV  GLBZ_CHITP   120    2
   QFNKFRTALVSYISSNV  GLBZ_CHITH   120    2
   QFNEFRSSLVSYLSSHA  GLB9_CHITH   120    2
   QFTEFRTALFTYLKAHI  GLB6_CHITH   119    2
   QFNEFRASMTSYLSHHT  Q94444       120    2
   QFNEFRASLVSYLQSKV  GLB2_CHITH   119    2
   QFNNFRAAFIAYLKGHV  GLBP_CHITH   111    2
   QFNEFRASLVSYLQSHV  Q94443       119    2
   QLNNFRAGFVSYMKAHT  GLB3_CHITP    96    2
   QLNNFRAGFVSYMKAHT  GLB3_CHITH   111    2
   QLNNFRAGFVSYMKAHT  P91600       111    2
   QFNNFRAAFIGYLKGHV  P91595       111    2
   QLNNFRAGFVSYMKAHT  GLB4_CHITH   111    2
   QFNAFRVSLTAYLADHV  O02369       121    2
   QLNNFRSAFVSYMKAHT  P92191       111    2
   QFNEFRASLITYLSQNV  O02370       121    2
   QLNNFRAGFVSYMKAHT  P91592       111    2
   QLNNFRAGFVGYMKAHT  P91594       111    2
   QLDNFRAGFVTYMKAHT  P91593       114    2
   QFDQFRQALTELLGNLG  GLBT_CHITH   107    2
   QFENVRSMFPGFVSSVA  GLB_BURLE    104    2
   QFENVRSMFPGFVASVA  GLB_APLLI    104    2
   QFQNVSAMFPGFVASIA  GLB_DOLAU    104    2
   QFQNVRSMFPGFVASLS  GLB_APLJU    104    2
   QFQNVRSMFPGFVASLS  O02567       105    2
   QFENVRSMFPAFVASLS  GLB_APLKU    104    2
   RFGEMRQVFPNFLDEAL  GLB_CERRH    107    2
   DFKLMRSIFGEFLDKAT  GLB_NASMU    107    2
   YFKEFGKALNHVLPEVA  GLB4_TYLHE   108    3
   MFNLMRTGLAYVLPAQL  GLB2_TYLHE   106    2
   LWKAFWSVWVAFLESKG  GLB2_NIPBR   105    2
   FVNELLAVLPDYLGTKL  GLB_TUBTU    107    6
   MFEQFGGILEEVLAEEL  O02004       149    2
   HYNLFAKVLMEALQAEL  GLBC_CAUAR   113    2
   HFKKFGEILATGLPQVL  GLB3_LUMTE   129    3
   LWMAFFTVFTGYLGSTG  Q27430       110    2
   YFDAFKTAILHVVAAQL  GLB2_LUMTE   105    2
   LWMAFFTVFTGYLESVG  GLBH_CAEEL   110    2
   LWMAFFTVFTGYLESTG  Q27302       110    2
   YFDLFGTQLFDILGDKL  GLB1_PHESE   103    2
   YFDEMEKALLKVLPQVS  GLB3_TYLHE   108    3
   FFDIFLKHLLHVLGDRL  GLB1_LUMTE   103    2
   FFDIFLKHLLHVLGDRL  O61233       121    2
   LATEGRASLFMVLQAAF  O07944      2722  179

   ERYTHCRUORIN4  Length of motif = 15  Motif number = 4
   Erythrocruorin motif IV - 3
                    PCODE         ST  INT
   VAAAWNQALDNTFAI  Q25218       142    5
   VAAAWNQALDNTFAI  Q25219       142    5
   VAAAWNHALDNTYAV  Q25217       142    5
   VAAAWNKALDNTFAI  GLBI_CHITP   142    5
   VAAAWNKALDNTYAI  GLBC_CHITH   142    5
   VAAAWNKALDNTFAI  GLBK_CHITH   142    5
   VAAAWNHALDNTYAV  GLBH_CHITP   142    5
   VAAAWNHALDNTYAV  GLBH_CHITH   142    5
   VAAAWNKALDNTFAI  GLBF_CHITH   142    5
   VAAAWNKALDNTFAI  GLBE_CHITH   142    5
   VAAAWNKALDNTYAI  GLBI_CHITH   142    5
   VAAAWNKALDNTMAV  GLBD_CHITH   142    5
   VAAAWTQGLDNIYGL  Q94445       142    5
   VAAAWNHALDNTMEI  Q25215       142    5
   VAAAWNQALDNLFFV  Q94442       144    5
   VAAAWNQAVDNTFTI  Q27303       142    5
   VAAAWNQAVDNVFVV  Q25216       142    5
   VAAAWEHALENTYTV  GLBV_CHITP   142    5
   AESAWTAGLDNIFGL  GLB7_CHITH   126    5
   AESAWTAGLDNIFGL  O02368       142    5
   VAAAWTHALDNVYTA  GLBZ_CHITP   142    5
   VAAAWTHALDNVYTA  GLBZ_CHITH   142    5
   TADAWTHGLDNIFGM  GLB9_CHITH   142    5
   TETAWTLALDTTYAM  GLB6_CHITH   141    5
   TAAAWTHGLDNIFDA  Q94444       142    5
   LGAAWTQGLDNVFNM  GLB2_CHITH   141    5
   VEAAWGATFDAFFGA  GLBP_CHITH   133    5
   LGAAWTQGLDNAFTM  Q94443       141    5
   AEAAWGATLDTFFGM  GLB3_CHITP   117    4
   AEAAWGATLDTFFGM  GLB3_CHITH   132    4
   AEAAWGATLDTFFGM  P91600       132    4
   VEAAWGATFDAFFGA  P91595       133    5
   AEAAWGATLDAFFGM  GLB4_CHITH   132    4
   VAQAWEKGLDNVYFV  O02369       143    5
   SESAWGATLDAFFGA  P92191       132    4
   VAQAWEKGFNNVYFI  O02370       143    5
   SESAWGATLDTFFGM  P91592       132    4
   AESAWGATLDTFFGA  P91594       132    4
   SESAWGASLDNFFGM  P91593       135    4
   NIGAWNATVDLMFHV  GLBT_CHITH   127    3
   ADAAWGKLFGLIIDA  GLB_BURLE    126    5
   ADAAWTKLFGLIIDA  GLB_APLLI    126    5
   ADAAWGKLFGLIIDA  GLB_DOLAU    126    5
   ADAAWNSLFGLIISA  GLB_APLJU    124    3
   GDAAWNSLFGLIISA  O02567       125    3
   ADDAWNKLFGLIVAA  GLB_APLKU    124    3
   VKGAWDALLAYLQDN  GLB_CERRH    131    7
   MKSAWDALLGVLIEN  GLB_NASMU    131    7
   NPEAWNHCFDGLVDV  GLB4_TYLHE   128    3
   DKEAWAACWDEVIYP  GLB2_TYLHE   127    4
   QKAAWDKLGTVFNDE  GLB2_NIPBR   127    5
   DFKAWSECLGVITGA  GLB_TUBTU    124    0
   ARQAWKNGLAALVAG  O02004       173    7
   TRDAWAKAFSVVQAV  GLBC_CAUAR   137    7
   DALAWKSCLKGILTK  GLB3_LUMTE   149    3
   QKAAWMALGKEFNAE  Q27430       132    5
   DREAWDACIDHIEDG  GLB2_LUMTE   126    4
   QKAAWMALGKEFNAE  GLBH_CAEEL   132    5
   QKAAWMALGKEFNAE  Q27302       132    5
   DQAAWRDCYAVIAAG  GLB1_PHESE   124    4
   NSGAWDRCFTRIADV  GLB3_TYLHE   128    3
   DFGAWHDCVDQIIDG  GLB1_LUMTE   124    4
   DFGAWHDCVDQIIDG  O61233       142    4
   LAAAARHCFDLTTEL  O07944      3621  882
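
   Each motif row above gives the matched sequence, the protein code
   (PCODE), and two integers, ST and INT. The numbers are consistent with
   ST being the start position of the motif in the protein and INT being
   the gap between the end of the preceding motif and the start of this
   one. That reading is inferred from the values and not stated in this
   entry. A minimal Python sketch that parses rows and checks the
   relationship for GLB4_LUMTE (initial motif sets); the names MotifHit,
   parse_row and chain_is_consistent are illustrative only:

```python
from typing import List, NamedTuple


class MotifHit(NamedTuple):
    sequence: str   # matched motif residues
    pcode: str      # protein code (PCODE column)
    start: int      # ST column
    interval: int   # INT column


def parse_row(row: str) -> MotifHit:
    """Parse one whitespace-separated motif row: SEQUENCE PCODE ST INT."""
    sequence, pcode, st, interval = row.split()
    return MotifHit(sequence, pcode, int(st), int(interval))


def chain_is_consistent(hits: List[MotifHit]) -> bool:
    """Check ST(n+1) == ST(n) + len(motif n) + INT(n+1) along one protein.

    Assumption: INT is the gap since the end of the previous motif.
    `hits` must be ordered motif 1..N for a single protein (or domain).
    """
    for prev, cur in zip(hits, hits[1:]):
        if prev.start + len(prev.sequence) + cur.interval != cur.start:
            return False
    return True


# GLB4_LUMTE rows from motifs 1 to 4 of the initial motif sets.
rows = [
    "VRAVFDDLFKHYPTSKALFERVK GLB4_LUMTE 35 35",
    "LDDTLVLQSHLGHLADQHIQRKGV GLB4_LUMTE 84 26",
    "YFRGIGEAFARVLPQVL GLB4_LUMTE 111 3",
    "NVDAWNRCFHRLVAR GLB4_LUMTE 131 3",
]
hits = [parse_row(r) for r in rows]
consistent = chain_is_consistent(hits)
print("GLB4_LUMTE motif positions consistent:", consistent)
```

   For two-domain proteins such as GLB_ASCSU and GLB_PSEDC, the rows of
   each domain would need to be checked as separate chains.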
