EGFTGF Type I EGF signature
Type of fingerprint: COMPOUND with 4 elements
Links:
PRINTS; PR00010 EGFBLOOD; PR00011 EGFLAMININ; PR01089 NEUREGULIN
INTERPRO; IPR001336
PROSITE; PS00022 EGF
PFAM; PF00008 EGF
PDB; 1EGF
SCOP; 1EGF
CATH; 1EGF
Creation date 03-SEP-1991; UPDATE 14-JUN-1999
1. CAMPBELL, I.D., BARON, M., COOKE, R.M., DUDGEON, T.J., FALLON, A.,
HARVEY, T.S. AND TAPPIN, M.J.
Structure-function relationship in Epidermal Growth Factor (EGF)
and Transforming Growth Factor-alpha (TGF-alpha).
BIOCHEM.PHARMACOL. 40(1) 35-40 (1990).
Epidermal growth factors and transforming growth factors belong to a
general class of proteins that share a repeat pattern involving a number
of conserved Cys residues. Growth factors are involved in cell recognition
and division [1]. The repeating pattern, especially of cysteines (the
so-called EGF repeat), is thought to be important to the 3D structure of
the proteins, and hence to their recognition by receptors and other
molecules. The EGF motif is found frequently in nature, particularly in
extracellular proteins. The spacing of the conserved cysteines, however, is
variable, and for this reason two further EGF, or EGF-like, motifs have
been classified (see signatures EGFBLOOD and EGFLAMININ).
EGFTGF is a 4-element fingerprint that provides a signature for type I
EGF repeats. The fingerprint was derived from an initial alignment of 9
sequences (after Campbell et al. [1]). The motifs include 6 conserved
cysteines believed to be involved in disulphide bond formation; motifs 3
and 4 span the region encoded by PROSITE pattern EGF (PS00022). Four
iterations on OWL12.0 were required to reach convergence, at which point a
true set comprising 12 sequences was identified: these included epidermal,
transforming and viral growth factors.
An update on SPTR37_9f identified a true set of 22 sequences, and 21
partial matches.
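The count of 6 conserved cysteines can be checked against the initial motif
sets listed in this entry. The following Python sketch is illustrative only
and is not part of PRINTS: the names INITIAL_MOTIFS and
conserved_cysteine_columns are hypothetical, and the sequences are copied
from the INITIAL MOTIF SETS section of this entry.

```python
# Initial motif alignments, copied from INITIAL MOTIF SETS in this entry.
# The dictionary and function names are hypothetical, chosen for illustration.
INITIAL_MOTIFS = {
    "EGFTGF1": [
        "NSNTGCPPSYDGYCLN", "NSYPGCPSSYDGYCLN", "NSDSECPLSHDGYCLH",
        "SHFNKCPDSHTQYCFH", "KRIKLCNDDYKNYCLN", "KHVKVCNHDYENYCLN",
        "PAIRLCGPEGDGYCLH", "PAIRLCGPEGDGYCLH", "SHFNDCPDSHTQFCFH",
    ],
    "EGFTGF2": [
        "GVCMYVES", "GVCMHIES", "GVCMYIEA", "GTCRFLVQ", "GTCFTVAL",
        "GTCFTIAL", "GDCIHARD", "GDCIHARD", "GTCRFLVQ",
    ],
    "EGFTGF3": [
        "VDRYVCNCVIGY", "LDSYTCNCVIGY", "LDKYACNCVVGY", "EEKPACVCHSGY",
        "SLNPFCACHINY", "SITPFCVCRINY", "IDGMYCRCSHGY", "IDGMYCRCSHGY",
        "EDKPACVCHSGY",
    ],
    "EGFTGF4": [
        "IGERCQHRDL", "SGDRCQTRDL", "IGERCQYRDL", "VGVRCEHADL",
        "VGSRCQFINL", "EGSRCQFINL", "TGIRCQHVVL", "TGIRCQHVVL",
        "VGARCEHADL",
    ],
}


def conserved_cysteine_columns(rows):
    """Return the 1-based columns in which every aligned sequence has Cys."""
    return [i + 1 for i, column in enumerate(zip(*rows)) if set(column) == {"C"}]


cys_columns = {
    name: conserved_cysteine_columns(rows)
    for name, rows in INITIAL_MOTIFS.items()
}
total_conserved_cys = sum(len(cols) for cols in cys_columns.values())
print(cys_columns, total_conserved_cys)
```

On the initial alignment this gives two fully conserved Cys columns in
motif 1, one in motif 2, two in motif 3 and one in motif 4, i.e. 6 in total.
The same strict test on the final motif sets would report fewer, since the
NOTC_BRARE row of motif 1 lacks Cys at one of those columns.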
SUMMARY INFORMATION
22 codes involving 4 elements
3 codes involving 3 elements
18 codes involving 2 elements
COMPOSITE FINGERPRINT INDEX
   4|  22  22  22  22
   3|   3   0   3   3
   2|  10   2  12  12
  --+-----------------
    |   1   2   3   4
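The index can be cross-checked against SUMMARY INFORMATION. The Python
sketch below assumes that each row gives, for the sequences matching that
many elements, the number of them that match each of motifs 1 to 4; this
reading is an assumption drawn from the figures themselves. Under it, every
row must sum to (elements matched) x (number of codes). The names SUMMARY,
INDEX and row_is_consistent are hypothetical.

```python
# Elements matched -> number of codes, from SUMMARY INFORMATION.
SUMMARY = {4: 22, 3: 3, 2: 18}

# Elements matched -> hits for motifs 1..4, from COMPOSITE FINGERPRINT INDEX.
INDEX = {
    4: [22, 22, 22, 22],
    3: [3, 0, 3, 3],
    2: [10, 2, 12, 12],
}


def row_is_consistent(n_elements, n_codes, motif_hits):
    """A code matching n elements contributes exactly n motif hits to its row."""
    return sum(motif_hits) == n_elements * n_codes


checks = {n: row_is_consistent(n, SUMMARY[n], INDEX[n]) for n in SUMMARY}
print(checks)
```

All three rows pass: 88 = 4 x 22, 9 = 3 x 3 and 36 = 2 x 18. The zero in row
3 indicates that none of the three 3-element matches includes motif 2.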
True positives..
GRFA_VACCC GRFA_VACCV O57166 Q86607
Q89066 Q89756 GRFA_VARV TGFA_MOUSE
TGFA_RAT TGFA_HUMAN TGFA_PIG Q15577
O14944 Q61521 BTC_MOUSE EGF_RAT
EGF_HUMAN BTC_HUMAN EGF_MOUSE GRFA_MYXVL
NOTC_BRARE GRFA_SFVKA
Subfamily: Codes involving 3 elements
Subfamily True positives..
NOTC_XENLA NTC1_MOUSE NTC1_RAT
Subfamily: Codes involving 2 elements
Subfamily True positives..
Q21340 NOTC_DROME Q25253 Q19350
O88281 O16004 O61240 Q21756
YREC_VIBCH O00306 Q99940 FBN1_BOVIN
NTC3_MOUSE FBN1_HUMAN O88840 FBN1_MOUSE
NTC4_MOUSE O35442
PROTEIN TITLES
GRFA_VACCC GROWTH FACTOR - VACCINIA VIRUS (STRAIN COPENHAGEN).
GRFA_VACCV GROWTH FACTOR - VACCINIA VIRUS (STRAIN WR).
O57166 GROWTH FACTOR PROTEIN - VACCINIA VIRUS.
Q86607 GROWTH FACTOR - VACCINIA VIRUS.
Q89066 GARCIA-1966 LEFT NEAR-TERMINAL REGION - VARIOLA VIRUS.
Q89756 HOMOLOG OF VACCINIA VIRUS CDS C11R - VARIOLA VIRUS.
GRFA_VARV GROWTH FACTOR - VARIOLA VIRUS.
TGFA_MOUSE TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
TGFA_RAT TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
TGFA_HUMAN TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
TGFA_PIG TRANSFORMING GROWTH FACTOR ALPHA PRECURSOR (TGF-ALPHA) (EGF-
Q15577 TRANSFORMING GROWTH FACTOR-ALPHA PRECURSOR - HOMO SAPIENS (H
O14944 EPIREGULIN - HOMO SAPIENS (HUMAN).
Q61521 EPIREGULIN - MUS MUSCULUS (MOUSE).
BTC_MOUSE BETACELLULIN PRECURSOR (BTC) - MUS MUSCULUS (MOUSE).
EGF_RAT PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
EGF_HUMAN PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
BTC_HUMAN BETACELLULIN PRECURSOR (BTC) - HOMO SAPIENS (HUMAN).
EGF_MOUSE PRO-EPIDERMAL GROWTH FACTOR PRECURSOR (EGF) [CONTAINS: EPIDE
GRFA_MYXVL GROWTH FACTOR (MGF) - MYXOMA VIRUS (STRAIN LAUSANNE).
NOTC_BRARE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN PRECURSOR - BRACHYDAN
GRFA_SFVKA GROWTH FACTOR - SHOPE FIBROMA VIRUS (STRAIN KASZA) (SFV).
NOTC_XENLA NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG PRECURSOR (XOTCH PROT
NTC1_MOUSE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR (MOTCH PR
NTC1_RAT NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR - RATTUS
Q21340 K08E5.3 PROTEIN - CAENORHABDITIS ELEGANS.
NOTC_DROME NEUROGENIC LOCUS NOTCH PROTEIN PRECURSOR - DROSOPHILA MELANO
Q25253 NOTCH HOMOLOG SCALLOPED WINGS (SCL) - LUCILIA CUPRINA (GREEN
Q19350 SIMILAR TO EGF-LIKE REPEATS. NCBI GI: 1125776 - CAENORHABDIT
O88281 MEGF6 - RATTUS NORVEGICUS (RAT).
O16004 NOTCH HOMOLOG - LYTECHINUS VARIEGATUS (SEA URCHIN).
O61240 HRNOTCH PROTEIN - HALOCYNTHIA RORETZI (SEA SQUIRT).
Q21756 HYPOTHETICAL 39.1 KD PROTEIN - CAENORHABDITIS ELEGANS.
YREC_VIBCH HYPOTHETICAL 41.3 KD PROTEIN - VIBRIO CHOLERAE.
O00306 NOTCH4 - HOMO SAPIENS (HUMAN).
Q99940 NOTCH4 - HOMO SAPIENS (HUMAN).
FBN1_BOVIN FIBRILLIN 1 PRECURSOR (MP340) - BOS TAURUS (BOVINE).
NTC3_MOUSE NEUROGENIC LOCUS NOTCH 3 PROTEIN - MUS MUSCULUS (MOUSE).
FBN1_HUMAN FIBRILLIN 1 PRECURSOR - HOMO SAPIENS (HUMAN).
O88840 MUTANT FIBRILLIN-1 - MUS MUSCULUS (MOUSE).
FBN1_MOUSE FIBRILLIN 1 PRECURSOR - MUS MUSCULUS (MOUSE).
NTC4_MOUSE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 4 PRECURSOR (TRANSFOR
O35442 NOTCH4 - MUS MUSCULUS (MOUSE).
SCAN HISTORY
OWL12_0 4 50 NSINGLE
OWL17_1 1 25 NSINGLE
OWL18_0 1 45 NSINGLE
OWL19_1 1 75 NSINGLE
OWL26_0 1 300 NSINGLE
SPTR37_9f 2 200 NSINGLE
INITIAL MOTIF SETS
EGFTGF1 Length of motif = 16 Motif number = 1
Type I EGF repeat motif I - 1
PCODE ST INT
NSNTGCPPSYDGYCLN EGF_RAT 1 1
NSYPGCPSSYDGYCLN EGF_MOUSE 977 977
NSDSECPLSHDGYCLH EGF_HUMAN 971 971
SHFNKCPDSHTQYCFH TGFA_RAT 41 41
KRIKLCNDDYKNYCLN GRFA_MYXV 32 32
KHVKVCNHDYENYCLN GRFA_SHFVK 28 28
PAIRLCGPEGDGYCLH GRFA_VACCC 40 40
PAIRLCGPEGDGYCLH GRFA_VACCV 40 40
SHFNDCPDSHTQFCFH TGFA_HUMAN 42 42
EGFTGF2 Length of motif = 8 Motif number = 2
Type I EGF repeat motif II - 1
PCODE ST INT
GVCMYVES EGF_RAT 18 1
GVCMHIES EGF_MOUSE 994 1
GVCMYIEA EGF_HUMAN 988 1
GTCRFLVQ TGFA_RAT 57 0
GTCFTVAL GRFA_MYXV 49 1
GTCFTIAL GRFA_SHFVK 45 1
GDCIHARD GRFA_VACCC 56 0
GDCIHARD GRFA_VACCV 56 0
GTCRFLVQ TGFA_HUMAN 58 0
EGFTGF3 Length of motif = 12 Motif number = 3
Type I EGF repeat motif III - 1
PCODE ST INT
VDRYVCNCVIGY EGF_RAT 26 0
LDSYTCNCVIGY EGF_MOUSE 1002 0
LDKYACNCVVGY EGF_HUMAN 996 0
EEKPACVCHSGY TGFA_RAT 65 0
SLNPFCACHINY GRFA_MYXV 60 3
SITPFCVCRINY GRFA_SHFVK 56 3
IDGMYCRCSHGY GRFA_VACCC 64 0
IDGMYCRCSHGY GRFA_VACCV 64 0
EDKPACVCHSGY TGFA_HUMAN 66 0
EGFTGF4 Length of motif = 10 Motif number = 4
Type I EGF repeat motif IV - 1
PCODE ST INT
IGERCQHRDL EGF_RAT 38 0
SGDRCQTRDL EGF_MOUSE 1014 0
IGERCQYRDL EGF_HUMAN 1008 0
VGVRCEHADL TGFA_RAT 77 0
VGSRCQFINL GRFA_MYXV 72 0
EGSRCQFINL GRFA_SHFVK 68 0
TGIRCQHVVL GRFA_VACCC 76 0
TGIRCQHVVL GRFA_VACCV 76 0
VGARCEHADL TGFA_HUMAN 78 0
FINAL MOTIF SETS
EGFTGF1 Length of motif = 16 Motif number = 1
Type I EGF repeat motif I - 2
PCODE ST INT
PAIRLCGPEGDGYCLH GRFA_VACCC 40 40
PAIRLCGPEGDGYCLH GRFA_VACCV 40 40
PAIRLCGPEGDGYCLH O57166 40 40
PAIRLCGPEGDGYCLH Q86607 40 40
PAIRLCGPEGNGYCFH Q89066 40 40
PAIRLCGPEGNGYCFH Q89756 40 40
PAIRLCGPEGDRYCFH GRFA_VARV 40 40
SHFNKCPDSHTQYCFH TGFA_MOUSE 41 41
SHFNKCPDSHTQYCFH TGFA_RAT 41 41
SHFNDCPDSHTQFCFH TGFA_HUMAN 42 42
SHFNDCPDSHSQFCFH TGFA_PIG 42 42
SHFNDCPDSHTQFCFH Q15577 42 42
VSITKCSSDMNGYCLH O14944 63 63
VQITKCSSDMDGYCLH Q61521 56 56
THFSRCPKQYKHYCIH BTC_MOUSE 64 64
NSNTGCPPSYDGYCLN EGF_RAT 974 974
NSDSECPLSHDGYCLH EGF_HUMAN 971 971
GHFSRCPKQYKHYCIK BTC_HUMAN 64 64
NSYPGCPSSYDGYCLN EGF_MOUSE 977 977
KRIKLCNDDYKNYCLN GRFA_MYXVL 32 32
KAICTCPPGYTGSACN NOTC_BRARE 394 394
KHVKVCNHDYENYCLN GRFA_SFVKA 28 28
EGFTGF2 Length of motif = 8 Motif number = 2
Type I EGF repeat motif II - 2
PCODE ST INT
GDCIHARD GRFA_VACCC 56 0
GDCIHARD GRFA_VACCV 56 0
GDCIHARD O57166 56 0
GDCIHARD Q86607 56 0
GICIHARD Q89066 56 0
GICIHARD Q89756 56 0
GICIHARD GRFA_VARV 56 0
GTCRFLVQ TGFA_MOUSE 57 0
GTCRFLVQ TGFA_RAT 57 0
GTCRFLVQ TGFA_HUMAN 58 0
GTCRFLVQ TGFA_PIG 58 0
ATCRFLVH Q15577 58 0
GQCIYLVD O14944 79 0
GQCIYLVD Q61521 72 0
GRCRFVVD BTC_MOUSE 80 0
GVCMYVES EGF_RAT 991 1
GVCMYIEA EGF_HUMAN 988 1
GRCRFVVA BTC_HUMAN 80 0
GVCMHIES EGF_MOUSE 994 1
GTCFTVAL GRFA_MYXVL 49 1
GVCRESED NOTC_BRARE 840 430
GTCFTIAL GRFA_SFVKA 45 1
EGFTGF3 Length of motif = 12 Motif number = 3
Type I EGF repeat motif III - 2
PCODE ST INT
IDGMYCRCSHGY GRFA_VACCC 64 0
IDGMYCRCSHGY GRFA_VACCV 64 0
IDGMYCRCSHGY O57166 64 0
IDGMYCRCSHGY Q86607 64 0
IDGMYCRCSHGY Q89066 64 0
IDGMYCRCSHGY Q89756 64 0
IDGMYCRCSHGY GRFA_VARV 64 0
EEKPACVCHSGY TGFA_MOUSE 65 0
EEKPACVCHSGY TGFA_RAT 65 0
EDKPACVCHSGY TGFA_HUMAN 66 0
EDKPACVCHSGY TGFA_PIG 66 0
EDKPACVCHSGY Q15577 66 0
MSQNYCRCEVGY O14944 87 0
MREKFCRCEVGY Q61521 80 0
EQTPSCICEKGY BTC_MOUSE 88 0
VDRYVCNCVIGY EGF_RAT 999 0
LDKYACNCVVGY EGF_HUMAN 996 0
EQTPSCVCDEGY BTC_HUMAN 88 0
LDSYTCNCVIGY EGF_MOUSE 1002 0
SLNPFCACHINY GRFA_MYXVL 60 3
LGGYSCECVPGY NOTC_BRARE 1162 314
SITPFCVCRINY GRFA_SFVKA 56 3
EGFTGF4 Length of motif = 10 Motif number = 4
Type I EGF repeat motif IV - 2
PCODE ST INT
TGIRCQHVVL GRFA_VACCC 76 0
TGIRCQHVVL GRFA_VACCV 76 0
TGIRCQHVVL O57166 76 0
TGIRCQHVVL Q86607 76 0
TGIRCQHVVL Q89066 76 0
TGIRCQHVVL Q89756 76 0
TGIRCQHVVL GRFA_VARV 76 0
VGVRCEHADL TGFA_MOUSE 77 0
VGVRCEHADL TGFA_RAT 77 0
VGARCEHADL TGFA_HUMAN 78 0
VGARCEHADL TGFA_PIG 78 0
VGARCEHADL Q15577 78 0
TGVRCEHFFL O14944 99 0
TGLRCEHFFL Q61521 92 0
FGARCERVDL BTC_MOUSE 100 0
IGERCQHRDL EGF_RAT 1011 0
IGERCQYRDL EGF_HUMAN 1008 0
IGARCERVDL BTC_HUMAN 100 0
SGDRCQTRDL EGF_MOUSE 1014 0
VGSRCQFINL GRFA_MYXVL 72 0
VGERCEGDVN NOTC_BRARE 1258 84
EGSRCQFINL GRFA_SFVKA 68 0
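The ST and INT columns above are related by simple arithmetic. The Python
sketch below assumes that ST is the start position of a motif in the
sequence, that INT for motif 1 equals its ST, and that INT for each later
motif is the gap between the end of the preceding motif and the start of the
current one. This reading is inferred from the listed values. The names
MOTIF_LENGTHS, STARTS and intervals are hypothetical.

```python
# Motif lengths for EGFTGF1..EGFTGF4, as stated in the motif set headers.
MOTIF_LENGTHS = [16, 8, 12, 10]

# ST values for motifs 1..4, copied from FINAL MOTIF SETS for four sequences.
STARTS = {
    "GRFA_VACCC": [40, 56, 64, 76],
    "EGF_RAT": [974, 991, 999, 1011],
    "GRFA_MYXVL": [32, 49, 60, 72],
    "NOTC_BRARE": [394, 840, 1162, 1258],
}


def intervals(starts, lengths):
    """Derive INT values from ST values under the assumption stated above."""
    result = [starts[0]]  # motif 1: interval taken as equal to its start
    for prev_start, prev_len, start in zip(starts, lengths, starts[1:]):
        # Gap between the end of the previous motif and the start of this one.
        result.append(start - (prev_start + prev_len))
    return result


derived = {code: intervals(st, MOTIF_LENGTHS) for code, st in STARTS.items()}
for code, values in derived.items():
    print(code, values)
```

The derived values agree with the INT columns listed for these four
sequences. The large intervals for NOTC_BRARE (430, 314 and 84) show that
its four motif matches are spread across distant parts of the sequence
rather than lying within a single repeat.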