EGFTGF          Type I EGF signature
 Type of fingerprint: COMPOUND with 4  elements
Links:
   PRINTS; PR00010 EGFBLOOD; PR00011 EGFLAMININ; PR01089 NEUREGULIN
   INTERPRO; IPR001336
   PROSITE; PS00022 EGF
   PFAM; PF00008 EGF
   PDB; 1EGF
   SCOP; 1EGF
   CATH; 1EGF

 Creation date 03-SEP-1991; UPDATE 14-JUN-1999

   1. CAMPBELL, I.D., BARON, M., COOKE, R.M., DUDGEON, T.J., FALLON, A.,
   HARVEY, T.S. AND TAPPIN, M.J.
   Structure-function relationship in Epidermal Growth Factor (EGF)
   and Transforming Growth Factor-alpha (TGF-alpha).
   BIOCHEM.PHARMACOL. 40(1) 35-40 (1990).

   Epidermal growth factors and transforming growth factors belong to a
   general class of proteins that share a repeat pattern involving a number
   of conserved Cys residues. Growth factors are involved in cell recognition
   and division [1]. The repeating pattern, especially of cysteines (the
   so-called EGF repeat), is thought to be important to the 3D structure of
   the proteins, and hence to their recognition by receptors and other
   molecules.
   
   The EGF motif is found frequently in nature, particularly in extracellular
   proteins. The spacing of the conserved cysteines is variable, however, and
   for this reason two further EGF, or EGF-like, motifs have been classified
   (see signatures EGFBLOOD and EGFLAMININ).
  
   EGFTGF is a 4-element fingerprint that provides a signature for type I
   EGF repeats. The fingerprint was derived from an initial alignment of 9
   sequences (after Campbell et al. [1]). The motifs include 6 conserved
   cysteines believed to be involved in disulphide bond formation; motifs 3
   and 4 span the region encoded by PROSITE pattern EGF (PS00022). Four
   iterations on OWL12.0 were required to reach convergence, at which point a
   true set comprising 12 sequences was identified; these included epidermal,
   transforming and viral growth factors.
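
   The count of 6 conserved cysteines can be checked against the motif
   alignments listed in this entry. The following is a minimal sketch, not
   part of the PRINTS record: it uses a subset of four sequences per motif
   copied from the initial motif sets, and the names INITIAL_MOTIFS and
   conserved_cys_columns are hypothetical helpers introduced for illustration.

```python
# Motif sequences copied from the INITIAL MOTIF SETS of this entry
# (a subset: EGF_RAT, TGFA_RAT, GRFA_MYXV, GRFA_VACCC), one list per motif.
INITIAL_MOTIFS = {
    1: ["NSNTGCPPSYDGYCLN", "SHFNKCPDSHTQYCFH", "KRIKLCNDDYKNYCLN", "PAIRLCGPEGDGYCLH"],
    2: ["GVCMYVES", "GTCRFLVQ", "GTCFTVAL", "GDCIHARD"],
    3: ["VDRYVCNCVIGY", "EEKPACVCHSGY", "SLNPFCACHINY", "IDGMYCRCSHGY"],
    4: ["IGERCQHRDL", "VGVRCEHADL", "VGSRCQFINL", "TGIRCQHVVL"],
}


def conserved_cys_columns(seqs):
    """Return the 1-based alignment columns where every sequence has a Cys."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("motif sequences must all have the same length")
    return [i + 1 for i in range(length) if all(s[i] == "C" for s in seqs)]


# Conserved Cys columns per motif, and the total across the fingerprint.
cys_by_motif = {m: conserved_cys_columns(s) for m, s in INITIAL_MOTIFS.items()}
total_conserved_cys = sum(len(cols) for cols in cys_by_motif.values())

for motif, cols in sorted(cys_by_motif.items()):
    print(f"motif {motif}: conserved Cys at columns {cols}")
print("total conserved Cys:", total_conserved_cys)
```

   For these sequences the conserved columns are distributed 2, 1, 2 and 1
   across motifs 1 to 4. The same check applied to the final motif sets may
   give a smaller count, since more divergent sequences such as NOTC_BRARE
   are included there.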
  
   An update on SPTR37_9f identified a true set of 22 sequences, and 21
   partial matches.

  SUMMARY INFORMATION
     22 codes involving  4 elements
      3 codes involving  3 elements
     18 codes involving  2 elements

   COMPOSITE FINGERPRINT INDEX
  
    4|  22   22   22   22  
    3|   3    0    3    3  
    2|  10    2   12   12  
   --+---------------------
     |   1    2    3    4  
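
   The summary counts and the composite index can be cross-checked against
   each other. The sketch below assumes one reading of the index, namely that
   each row gives, per motif, the number of sequences matching that many
   elements; this interpretation and the helper name row_is_consistent are
   assumptions made for illustration, not statements from the entry.

```python
# Values transcribed from SUMMARY INFORMATION and COMPOSITE FINGERPRINT INDEX.
codes_by_elements = {4: 22, 3: 3, 2: 18}
index_rows = {
    4: [22, 22, 22, 22],
    3: [3, 0, 3, 3],
    2: [10, 2, 12, 12],
}


def row_is_consistent(n_elements, row):
    """Under the assumed reading, a sequence matching n motifs adds 1 to
    exactly n of the 4 columns, so the row must sum to n * (number of codes)."""
    return sum(row) == n_elements * codes_by_elements[n_elements]


consistency = {n: row_is_consistent(n, row) for n, row in index_rows.items()}

# Sequences matching fewer than 4 elements are the partial matches.
partial_matches = codes_by_elements[3] + codes_by_elements[2]

print(consistency)
print("partial matches:", partial_matches)
```

   Under that reading all three rows are consistent with the summary, and the
   3-element and 2-element codes together account for the 21 partial matches
   reported for the SPTR37_9f update.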

True positives..
 GRFA_VACCC     GRFA_VACCV     O57166         Q86607         
 Q89066         Q89756         GRFA_VARV      TGFA_MOUSE     
 TGFA_RAT       TGFA_HUMAN     TGFA_PIG       Q15577         
 O14944         Q61521         BTC_MOUSE      EGF_RAT        
 EGF_HUMAN      BTC_HUMAN      EGF_MOUSE      GRFA_MYXVL     
 NOTC_BRARE     GRFA_SFVKA     
Subfamily:  Codes involving 3 elements
 Subfamily True positives..
 NOTC_XENLA     NTC1_MOUSE     NTC1_RAT       
Subfamily:  Codes involving 2 elements
 Subfamily True positives..
 Q21340         NOTC_DROME     Q25253         Q19350         
 O88281         O16004         O61240         Q21756         
 YREC_VIBCH     O00306         Q99940         FBN1_BOVIN     
 NTC3_MOUSE     FBN1_HUMAN     O88840         FBN1_MOUSE     
 NTC4_MOUSE     O35442         


  PROTEIN TITLES
   GRFA_VACCC       GROWTH FACTOR - VACCINIA VIRUS (STRAIN COPENHAGEN).
   GRFA_VACCV       GROWTH FACTOR - VACCINIA VIRUS (STRAIN WR).
   O57166           GROWTH FACTOR PROTEIN - VACCINIA VIRUS.
   Q86607           GROWTH FACTOR - VACCINIA VIRUS.
   Q89066           GARCIA-1966 LEFT NEAR-TERMINAL REGION - VARIOLA VIRUS.
   Q89756           HOMOLOG OF VACCINIA VIRUS CDS C11R - VARIOLA VIRUS.
   GRFA_VARV        GROWTH FACTOR - VARIOLA VIRUS.
   TGFA_MOUSE       TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
   TGFA_RAT         TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
   TGFA_HUMAN       TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
   TGFA_PIG         TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
   Q15577           TRANSFORMING GROWTH FACTOR-ALPHA PRECURSOR - HOMO SAPIENS (H
   O14944           EPIREGULIN - HOMO SAPIENS (HUMAN).
   Q61521           EPIREGULIN - MUS MUSCULUS (MOUSE).
   BTC_MOUSE        BETACELLULIN PRECURSOR (BTC) - MUS MUSCULUS (MOUSE).
   EGF_RAT          PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
   EGF_HUMAN        PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
   BTC_HUMAN        BETACELLULIN PRECURSOR (BTC) - HOMO SAPIENS (HUMAN).
   EGF_MOUSE        PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
   GRFA_MYXVL       GROWTH FACTOR (MGF) - MYXOMA VIRUS (STRAIN LAUSANNE).
   NOTC_BRARE       NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN PRECURSOR - BRACHYDAN
   GRFA_SFVKA       GROWTH FACTOR - SHOPE FIBROMA VIRUS (STRAIN KASZA) (SFV).
 
   NOTC_XENLA       NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG PRECURSOR (XOTCH PROT
   NTC1_MOUSE       NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR (MOTCH PR
   NTC1_RAT         NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR - RATTUS 
 
   Q21340           K08E5.3 PROTEIN - CAENORHABDITIS ELEGANS.
   NOTC_DROME       NEUROGENIC LOCUS NOTCH PROTEIN PRECURSOR - DROSOPHILA MELANO
   Q25253           NOTCH HOMOLOG SCALLOPED WINGS (SCL) - LUCILIA CUPRINA (GREEN
   Q19350           SIMILAR TO EGF-LIKE REPEATS. NCBI GI: 1125776 - CAENORHABDIT
   O88281           MEGF6 - RATTUS NORVEGICUS (RAT).
   O16004           NOTCH HOMOLOG - LYTECHINUS VARIEGATUS (SEA URCHIN).
   O61240           HRNOTCH PROTEIN - HALOCYNTHIA RORETZI (SEA SQUIRT).
   Q21756           HYPOTHETICAL 39.1 KD PROTEIN - CAENORHABDITIS ELEGANS.
   YREC_VIBCH       HYPOTHETICAL 41.3 KD PROTEIN - VIBRIO CHOLERAE.
   O00306           NOTCH4 - HOMO SAPIENS (HUMAN).
   Q99940           NOTCH4 - HOMO SAPIENS (HUMAN).
   FBN1_BOVIN       FIBRILLIN 1 PRECURSOR (MP340) - BOS TAURUS (BOVINE).
   NTC3_MOUSE       NEUROGENIC LOCUS NOTCH 3 PROTEIN - MUS MUSCULUS (MOUSE).
   FBN1_HUMAN       FIBRILLIN 1 PRECURSOR - HOMO SAPIENS (HUMAN).
   O88840           MUTANT FIBRILLIN-1 - MUS MUSCULUS (MOUSE).
   FBN1_MOUSE       FIBRILLIN 1 PRECURSOR - MUS MUSCULUS (MOUSE).
   NTC4_MOUSE       NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 4 PRECURSOR (TRANSFOR
   O35442           NOTCH4 - MUS MUSCULUS (MOUSE).

  SCAN HISTORY
   OWL12_0        4    50   NSINGLE
   OWL17_1        1    25   NSINGLE
   OWL18_0        1    45   NSINGLE
   OWL19_1        1    75   NSINGLE
   OWL26_0        1   300   NSINGLE
   SPTR37_9f      2   200   NSINGLE

  INITIAL MOTIF SETS

 EGFTGF1  Length of motif = 16  Motif number = 1
 Type I EGF repeat motif I - 1
                   PCODE          ST   INT
 NSNTGCPPSYDGYCLN  EGF_RAT         1     1
 NSYPGCPSSYDGYCLN  EGF_MOUSE     977   977
 NSDSECPLSHDGYCLH  EGF_HUMAN     971   971
 SHFNKCPDSHTQYCFH  TGFA_RAT       41    41
 KRIKLCNDDYKNYCLN  GRFA_MYXV      32    32
 KHVKVCNHDYENYCLN  GRFA_SHFVK     28    28
 PAIRLCGPEGDGYCLH  GRFA_VACCC     40    40
 PAIRLCGPEGDGYCLH  GRFA_VACCV     40    40
 SHFNDCPDSHTQFCFH  TGFA_HUMAN     42    42

 EGFTGF2  Length of motif = 8  Motif number = 2
 Type I EGF repeat motif II - 1
           PCODE          ST   INT
 GVCMYVES  EGF_RAT        18     1
 GVCMHIES  EGF_MOUSE     994     1
 GVCMYIEA  EGF_HUMAN     988     1
 GTCRFLVQ  TGFA_RAT       57     0
 GTCFTVAL  GRFA_MYXV      49     1
 GTCFTIAL  GRFA_SHFVK     45     1
 GDCIHARD  GRFA_VACCC     56     0
 GDCIHARD  GRFA_VACCV     56     0
 GTCRFLVQ  TGFA_HUMAN     58     0

 EGFTGF3  Length of motif = 12  Motif number = 3
 Type I EGF repeat motif III - 1
               PCODE          ST   INT
 VDRYVCNCVIGY  EGF_RAT        26     0
 LDSYTCNCVIGY  EGF_MOUSE    1002     0
 LDKYACNCVVGY  EGF_HUMAN     996     0
 EEKPACVCHSGY  TGFA_RAT       65     0
 SLNPFCACHINY  GRFA_MYXV      60     3
 SITPFCVCRINY  GRFA_SHFVK     56     3
 IDGMYCRCSHGY  GRFA_VACCC     64     0
 IDGMYCRCSHGY  GRFA_VACCV     64     0
 EDKPACVCHSGY  TGFA_HUMAN     66     0

 EGFTGF4  Length of motif = 10  Motif number = 4
 Type I EGF repeat motif IV - 1
             PCODE          ST   INT
 IGERCQHRDL  EGF_RAT        38     0
 SGDRCQTRDL  EGF_MOUSE    1014     0
 IGERCQYRDL  EGF_HUMAN    1008     0
 VGVRCEHADL  TGFA_RAT       77     0
 VGSRCQFINL  GRFA_MYXV      72     0
 EGSRCQFINL  GRFA_SHFVK     68     0
 TGIRCQHVVL  GRFA_VACCC     76     0
 TGIRCQHVVL  GRFA_VACCV     76     0
 VGARCEHADL  TGFA_HUMAN     78     0

  FINAL MOTIF SETS

 EGFTGF1  Length of motif = 16  Motif number = 1
 Type I EGF repeat motif I - 2
                   PCODE          ST   INT
 PAIRLCGPEGDGYCLH  GRFA_VACCC     40    40
 PAIRLCGPEGDGYCLH  GRFA_VACCV     40    40
 PAIRLCGPEGDGYCLH  O57166         40    40
 PAIRLCGPEGDGYCLH  Q86607         40    40
 PAIRLCGPEGNGYCFH  Q89066         40    40
 PAIRLCGPEGNGYCFH  Q89756         40    40
 PAIRLCGPEGDRYCFH  GRFA_VARV      40    40
 SHFNKCPDSHTQYCFH  TGFA_MOUSE     41    41
 SHFNKCPDSHTQYCFH  TGFA_RAT       41    41
 SHFNDCPDSHTQFCFH  TGFA_HUMAN     42    42
 SHFNDCPDSHSQFCFH  TGFA_PIG       42    42
 SHFNDCPDSHTQFCFH  Q15577         42    42
 VSITKCSSDMNGYCLH  O14944         63    63
 VQITKCSSDMDGYCLH  Q61521         56    56
 THFSRCPKQYKHYCIH  BTC_MOUSE      64    64
 NSNTGCPPSYDGYCLN  EGF_RAT       974   974
 NSDSECPLSHDGYCLH  EGF_HUMAN     971   971
 GHFSRCPKQYKHYCIK  BTC_HUMAN      64    64
 NSYPGCPSSYDGYCLN  EGF_MOUSE     977   977
 KRIKLCNDDYKNYCLN  GRFA_MYXVL     32    32
 KAICTCPPGYTGSACN  NOTC_BRARE    394   394
 KHVKVCNHDYENYCLN  GRFA_SFVKA     28    28

 EGFTGF2  Length of motif = 8  Motif number = 2
 Type I EGF repeat motif II - 2
           PCODE          ST   INT
 GDCIHARD  GRFA_VACCC     56     0
 GDCIHARD  GRFA_VACCV     56     0
 GDCIHARD  O57166         56     0
 GDCIHARD  Q86607         56     0
 GICIHARD  Q89066         56     0
 GICIHARD  Q89756         56     0
 GICIHARD  GRFA_VARV      56     0
 GTCRFLVQ  TGFA_MOUSE     57     0
 GTCRFLVQ  TGFA_RAT       57     0
 GTCRFLVQ  TGFA_HUMAN     58     0
 GTCRFLVQ  TGFA_PIG       58     0
 ATCRFLVH  Q15577         58     0
 GQCIYLVD  O14944         79     0
 GQCIYLVD  Q61521         72     0
 GRCRFVVD  BTC_MOUSE      80     0
 GVCMYVES  EGF_RAT       991     1
 GVCMYIEA  EGF_HUMAN     988     1
 GRCRFVVA  BTC_HUMAN      80     0
 GVCMHIES  EGF_MOUSE     994     1
 GTCFTVAL  GRFA_MYXVL     49     1
 GVCRESED  NOTC_BRARE    840   430
 GTCFTIAL  GRFA_SFVKA     45     1

 EGFTGF3  Length of motif = 12  Motif number = 3
 Type I EGF repeat motif III - 2
               PCODE          ST   INT
 IDGMYCRCSHGY  GRFA_VACCC     64     0
 IDGMYCRCSHGY  GRFA_VACCV     64     0
 IDGMYCRCSHGY  O57166         64     0
 IDGMYCRCSHGY  Q86607         64     0
 IDGMYCRCSHGY  Q89066         64     0
 IDGMYCRCSHGY  Q89756         64     0
 IDGMYCRCSHGY  GRFA_VARV      64     0
 EEKPACVCHSGY  TGFA_MOUSE     65     0
 EEKPACVCHSGY  TGFA_RAT       65     0
 EDKPACVCHSGY  TGFA_HUMAN     66     0
 EDKPACVCHSGY  TGFA_PIG       66     0
 EDKPACVCHSGY  Q15577         66     0
 MSQNYCRCEVGY  O14944         87     0
 MREKFCRCEVGY  Q61521         80     0
 EQTPSCICEKGY  BTC_MOUSE      88     0
 VDRYVCNCVIGY  EGF_RAT       999     0
 LDKYACNCVVGY  EGF_HUMAN     996     0
 EQTPSCVCDEGY  BTC_HUMAN      88     0
 LDSYTCNCVIGY  EGF_MOUSE    1002     0
 SLNPFCACHINY  GRFA_MYXVL     60     3
 LGGYSCECVPGY  NOTC_BRARE   1162   314
 SITPFCVCRINY  GRFA_SFVKA     56     3

 EGFTGF4  Length of motif = 10  Motif number = 4
 Type I EGF repeat motif IV - 2
             PCODE          ST   INT
 TGIRCQHVVL  GRFA_VACCC     76     0
 TGIRCQHVVL  GRFA_VACCV     76     0
 TGIRCQHVVL  O57166         76     0
 TGIRCQHVVL  Q86607         76     0
 TGIRCQHVVL  Q89066         76     0
 TGIRCQHVVL  Q89756         76     0
 TGIRCQHVVL  GRFA_VARV      76     0
 VGVRCEHADL  TGFA_MOUSE     77     0
 VGVRCEHADL  TGFA_RAT       77     0
 VGARCEHADL  TGFA_HUMAN     78     0
 VGARCEHADL  TGFA_PIG       78     0
 VGARCEHADL  Q15577         78     0
 TGVRCEHFFL  O14944         99     0
 TGLRCEHFFL  Q61521         92     0
 FGARCERVDL  BTC_MOUSE     100     0
 IGERCQHRDL  EGF_RAT      1011     0
 IGERCQYRDL  EGF_HUMAN    1008     0
 IGARCERVDL  BTC_HUMAN     100     0
 SGDRCQTRDL  EGF_MOUSE    1014     0
 VGSRCQFINL  GRFA_MYXVL     72     0
 VGERCEGDVN  NOTC_BRARE   1258    84
 EGSRCQFINL  GRFA_SFVKA     68     0
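
   The ST (start) and INT (interval) columns in the motif sets above appear
   to be related through the motif lengths: in this entry, INT for motifs 2
   to 4 equals the gap between the end of the previous motif and the start of
   the current one, and INT for motif 1 equals its start position. The sketch
   below tests that reading on three sequences from the final motif sets. The
   relationship is inferred from the listed values rather than taken from the
   PRINTS format definition, and expected_intervals is a hypothetical helper.

```python
# Motif lengths for EGFTGF1..EGFTGF4, as stated in the motif set headers.
MOTIF_LENGTHS = [16, 8, 12, 10]

# (ST, INT) pairs for motifs 1..4, copied from the FINAL MOTIF SETS.
ROWS = {
    "TGFA_HUMAN": [(42, 42), (58, 0), (66, 0), (78, 0)],
    "GRFA_MYXVL": [(32, 32), (49, 1), (60, 3), (72, 0)],
    "NOTC_BRARE": [(394, 394), (840, 430), (1162, 314), (1258, 84)],
}


def expected_intervals(starts, lengths=MOTIF_LENGTHS):
    """Derive INT values from start positions under the assumed rule:
    first motif -> its own start; later motifs -> start minus the
    (start + length) of the preceding motif."""
    intervals = [starts[0]]
    for i in range(1, len(starts)):
        prev_end = starts[i - 1] + lengths[i - 1]
        intervals.append(starts[i] - prev_end)
    return intervals


# Collect any sequence whose listed INT values disagree with the rule.
mismatches = []
for code, pairs in ROWS.items():
    starts = [st for st, _ in pairs]
    listed = [interval for _, interval in pairs]
    if expected_intervals(starts) != listed:
        mismatches.append(code)

print("mismatches:", mismatches)
```

   The large intervals for NOTC_BRARE (430, 314 and 84 residues) show that
   its four motif matches are spread across the sequence, whereas in the
   growth factors the motifs are contiguous or nearly so.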
