>Q65209 ~~~~~~Protein MGF 100-2L~~~
MGNKESKYLEMCSEEAWLNIPNIFKCIFIRKLFYNKWLKYQEKKLKKSLKLLSFYHPKKDFVGIRDMLHMAPGGSYFITD
NITEEFLMLVVKHPEDGSAEFTKLCLKGSCIVIDGYYYDTLHIFLSETPDIYKYPLIRYDR
>Q65210 ~~~~~~Protein MGF 100-3L~~~
MGNRLIRSYLPNTVMSIEDKQNKYNETIEDSKICNKVYIKQSGKIDKQELTRIKKLGFFYSQKSDHEIERMLFSMPNGTF
LLTDDATNENIFIVQKDLENGSLNIAKLEFKGKALYINGKDYYSLENYLKTFEDFYKYPLIYNKNK
>P18560 ~~~~~~Protein MGF 110-1L~~~
MLGLQIFTLLSIPTLLYTYEIEPLERTSTPPEKELGYWCTYANHCRFCWDCQDGICRNKAFKNHSPILENNYIANCSIYR
RNDFCIYYITSIKPHKTYRTECPQHINHERHEADIRKWQKLLTYGFYLAGCILAVNYIRKRSLQTVMYLLVFLVISFLLS
QLMLYGELEDKKHKIGSIPPKRELEHWCTHGKYCNFCWDCQNGICKNKAFKNHPPIGENDFIRYDCWTTHLPNKCSYEKI
YKHFNTHIMECSQPTHFKWYDNLMKKQDIM
>A9JLI2 ~~~~~~Protein MGF 110-1L~~~
MLGLQIFTLLSIPTLLYTYEIEPLERTSTPPEKEFGYWCTYANHCRFCWDCQDGICRNKAFKNHSPILENDYIANCSIYR
RNDFCIYHITSIKPHKTYRTECPQHINHERHEADIRKWQKLLTYGFYLAGCILAVNYIRKRSLQTVMYLLVFLVISFLLS
QLMLYGELEDKKHKIGSIPPKRELEHWCTHGKYCNFCWDCQNGICKNKAFKNHPPIGENDFIRYDCWTTHLPNKCSYEKI
YKHFDTHIMECSQPTHFKWYDNLMKKQDIM
>P18559 ~~~~~~Protein MGF 110-2L~~~
MRFFSYLGLLLAGLTSLQGFSTDNLLEEELRYWCQYVKNCRFCWTCQDGLCKNKVLKDMSSVQEHSYPMEHCMIHRQCKY
IRDGPIFQVECTMQTSDATHLINA
>A9JLI3 ~~~~~~Protein MGF 110-2L~~~
MRFFSYLGLLLAGLTSLQGFSTDNLLEEELRYWCQYVKNCRFCWTCQDGLCKNKVLKDMSSVQEHSYPMKHCMIHRQCKY
IRDGPIFQVECTIQTSDATHLINA
>P18558 ~~~~~~Protein MGF 110-4L~~~
MLVIFLGILGLLANQVLGLPTQAGGHLRSTDNPPQEELGYWCTYMESCKFCWECAHGICKNKVNESMPLIIENSYLTSCE
VSRWYNQCTYSEGNGHYHVMDCSNPVPHNRPHRLGRKIYEKEDL
>A9JLI5 ~~~~~~Protein MGF 110-4L~~~
MLVIFLGILGLLANQVLGLPTQAEGHLRSTDNPPQEELGYWCTYMESCKFCWECEHGICKNKVNRSMPWIIENSYLTSCE
VSRWYNQCTYDEGNGHYHVMDCSNPVPHNRPHRLGRKIYEKEDL
>P18557 ~~~~~~Protein MGF 110-5L~~~
MLVIFLGILGLLANQVSSQLVGQLHPTENPSENELEYWCTYMECCQFCWDCQNGLCVNKLGNTTILENEYVHPCIVSRWL
NK
>A9JLI7 ~~~~~~Protein MGF 110-5L~~~
MLVIILGVIGLLANQVLGLPTQAGGHLRSTDNPPQEELGYWCTYMESCKFCWECAHGICKNKVNESMPLIIENSYLTSCE
VSRWYNQCTYSEGNGHYHVMDCSDPVPHNRPHRLLMKIYEKEDL
>P68744 ~~~~~~Protein MGF 110-6L~~~
MLVIFLGILGLLASQVSSQLVGQLRPTEDPPEEELEYWCAYMESCQFCWDCQDGTCINKIDGSAIYKNEYVKACLVSRWL
DKCMYDLDKGIYHTMNCSQPWSWNPYKYFRKEWKKDEL
>P0DJZ0 ~~~~~~Host-modulation protein 11K~~~
MQNNTTGMDTKSLKNCGQPKAVCTHCKHSPPCPQPGCVTKRPPVPPRLYLPPPVPIRQPNTKDIDNVEFKYLTRYEQHVI
RMLRLCNMYQNLEK
>P03588 ~~~ORF1a~~~Replication protein 1a~~~
MSSSIDLLKLIAEKGADSQSAQDIVDNQVAQQLSAQIEYAKRSKKINVRNKLSIEEADAFRDRYGGAFDLNLTQQYHAPH
SLAGALRVAEHYDCLDSFPPEDPVIDFGGSWWHHFSRRDKRVHSCCPVLGVRDAARHEERMCRMRKILQESDDFDEVPNF
CLNRAQDCDVQADWAICIHGGYDMGFQGLCDAMHSHGVRVLRGTVMFDGAMLFDREGFLPLLKCHWQRDGSGADEVIKFD
FENESTLSYIHGWQDLGSFFTESVHCIDGTTYLLEREMLKCNIMTYKIIATNLRCPRETLRHCVWFEDISKYVGVSIPED
WSLNRWKCVRVAKTTVREVEEIAFRCFKESKEWTENMKAVASILSAKSSTVIINGQAIMAGERLDIEDYHLVAFALTLNL
YQKYEKLTALRDGMEWKGWCHHFKTRFWWGGDSSRAKVGWLRTLASRFPLLRLDSYADSFKFLTRLSNVEEFEQDSVPIS
RLRTFWTEEDLFDRLEHEVQTAKTKRSKKKAKVPPAAEIPQEEFHDAPESSSPESVSDDVKPVTDVVPDAEVSVEVPTDP
RGISRHGAMKEFVRYCKRLHNNSESNLRHLWDISGGRGSEIANKSIFETYHRIDDMVNVHLANGNWLYPKKYDYTVGYNE
HGLGPKHADETYIVDKTCACSNLRDIAEASAKVSVPTCDISMVDGVAGCGKTTAIKDAFRMGEDLIVTANRKSAEDVRMA
LFPDTYNSKVALDVVRTADSAIMHGVPSCHRLLVDEAGLLHYGQLLVVAALSKCSQVLAFGDTEQISFKSRDAGFKLLHG
NLQYDRRDVVHKTYRCPQDVIAAVNLLKRKCGNRDTKYQSWTSESKVSRSLTKRRITSGLQVTIDPNRTYLTMTQADKAA
LQTRAKDFPVSKDWIDGHIKTVHEAQGISVDNVTLVRLKSTKCDLFKHEEYCLVALTRHKKSFEYCFNGELAGDLIFNCV
K
>P27752 ~~~ORF1a~~~Replication protein 1a~~~
MASSLDLLKLISERGADSRGASDIVEQQAVKQLLEQVDYSKRSKKINIRNKLTPDEENAFRARYGGAFDLNLTQQYNAPH
SLAGALRIAEHYDCLSSFPPLDPIIDFGGSWWHHYSRKDTRIHSCCPVLGVRDAARHEERLCRMRKLLQECDDREDLPDF
CIDRAESCSVQADWAICIHGGYDMGYTGLCEAMHSHGVRILRGTIMFDGAMLFDNEGVLPLLKCRWMKSGKGKSEVIKFD
FMNESTLSYIHSWTNLGSFLTESVHVIGGTTYLLERELLKCNIMTYKIVATNLKCPKETLRHCVWFENISQYVAVNIPED
WNLTHWKPVRVAKTTVREVEEIAFRCFKENKEWTENMKAIASILSAKSSTVIINGQAIMAGERLNIDEYHLVAFALTMNL
YQKYENIRNFYSEMEWKGWVNHFKTRFWWGGSTATSSTGKIREFLAGKFPWLRLDSYKDSFVFLSKISDVKEFENDSVPI
SRLRSFFSSEDLMERIELELESAQKRRREKKKKEVEKIDEEEFQDAIDIPNDAVRDDAKPEKEPKPEVTVGAEPTGPEEA
SRHFAIKEFSDYCRRLDCNAVSNLRRLWAIAGCDGRTARNKSILETYHRVDDMINLHYPGGQWLYPKKYDYEVGFNDSGL
GPKFDDELYVVDKSCICANYQVLSKNTDSLKAPSCKISLCDGVAGCGKTTAIKSASNIAEHLVVTANKKSAEDVREALFP
HNPSSEIAFKVIRTADSALMHGLPRCKRLLVDEAGLLHYGQLLAVAALCKCQSVLAFGDTEQISFKSRDATFRLKYGDLQ
FDSRDIVTETWRCPQDVISAVQTLKRGGNRTSKYLGWKSHSKVSRSISHKEIASPLQVTLSREKFYLTMTQADKAALVSR
AKDFPELDKAWIEKHIKTVHEAQGVSVDHAVLVRLKSTKCDLFKTEEYCLVALTRHKITFEYLYVGMLSGDLIFRSIS
>Q83264 ~~~ORF1a~~~Replication protein 1a~~~
MATSSFNINELVASHGDKGLLATALVDKTTHEQLEEQLQHQRRGRKVYIRNVLGVKDSEVIRNRYGGKYDLHLTQQEFAS
QGLAGALRLCGTLDCLDSFPSSGLRQDLVLDFGGSWVTHYLRGHNVHCCSPCLGIRDKMRHSERLMNMRKIILNDPQQFD
GRQPDFCTQPAADCKVQAHFAISIHGGYDMGFRGLCEAMNAHGTTILKGTMMFDGAMMFDDQGVIPELNCQWRKIRSAFS
ETEDVTPLVGKLNSTVFSRVRKFKTMVAFDFINESTMSYVHDWENIKSFLTDQTYSYRGMTYGIERCVIHAGIMTYKIIG
VPGMCPPELIRHCIWFPSIKDYVGLKIPASQDLVEWKTVRILTSTLRETEEIAMRCYNDKKAWMEQFKVILGVLSAKSST
IVINGMSMQSGERIDINDYHYIGFAILLHTKMKYEQLGKMYDMWNASSISKWFAALTRPLRVFFSSVVHALFPTLRPREE
KEFLIKLSTFVTFNEECSFDGGEEWDVISSAAYVATQAVTDGKILAAQKAEKLAEKLAQPVSEVSDSPETSSQTPDDTAD
VCGKEREVSELDSLSAQTRSPITRVAERATAMLEYAAYEKHLHDTTVSNLKRIWNMAGGDDKRSFLEGNLKFVFDSYFTV
DPMVNIHFSTGRWVRPVPEGIVYPVGYNERGLGPKSDGELYIVNSECVICNSESLSTVYGRSLQTPTGTISQVDGVAGCG
KTMPIKSIFEPSTDMIVTANKKSAQDVRMALFKSSDSKEACTFVRTADSVLLNECPTVSRVLEDEVVLLHFGQLCAVMSK
LKAVRAICFGDAEQIAFSSRDASFDMRFSKIIPDETSDADTTFRSPQDVVPLVRLMATKALPKGTHSKYTKWVSQSKVRR
SVTSRAIASVTLVDLDSSRFYITMTQADKASLISRAKEMNLPKTFWNERIKTVHESQGISEDHVTLVRLKSTKCDLFKQF
SYCLVALTRHKVTFRYEYCGVLNGDLIASVARA
>P89680 ~~~~~~Replication protein 1a~~~
MDSLSLPTVCDVVVPALNVDNLIRDYVSNVRADDSNNVSRFLGEVALKEIKSQVDTSNGDFQKLNVGFRLTPDEKNALKA
NFPGLEIAFRDSCHSSHSFAAAHRVCETLNIYNRFKTKTERIIDLGGNYVTHAKQGRSNVHSCCPILDVRDGARHTDRYI
SLAASVENRHRELPVDFCCHKFEECDVKAPFAMAIHSISDIPISTVATHCVRRGVRKLIASVMMDPLMMLYDKGHIPLLN
VDWEKEDVEETGKTLIHFHFVDAPGLSYSHDFDTLSQYMITNQVIVNNSYSFRVERTACLSGVYIVEMTLSMTDGHSLAY
LKPMRDVSCAWLSSLRKKVFVKLAVPISAEWYTEQFEVRHALMDESLVRYVSEAAFRQFSKTKDPETLVQYIATMLSSSS
NHVVINGITMRSGSPIKFDEYVPLAVTFYVMAAWRYKMIAPGIDAVKTRTEKNVDILDEKGLIKEDFNVVNALLEDAGLI
EPKLPHFTDVIKNAGLRGFGKKTIEKTRDDVLLIKPRSLLREVIYTVRSVFGLTILDSDYNLVSGVPSHMKATHVWSVFV
GNLAFPSCLNVNECVNELLVNHMEMMEETKKEETRQQAFKDARDRALMTIAKAIEKDQTIKDGLLPILDLCKIKEELTAA
SNSLSLTPEAIEQTDSRLMKASGSDVNPYADSIKEAIHYFNEVEVANTRNLRSLGIYLGWSIPKNKQTYDALQGRNESVR
VYVPYENKWYPSAPTSQYERAMTVDGYFSLPMEFEAITDGCRREISKYHLLVVDDSCIFCSGQRMIPALEAALKLVPTFK
ITIVDGVAGCGKTTHLKKIARIDSSAAGSPDLVLTSNRSSSDELKEVIDCPDVMKYRIRTVDSYLMLKSWFSAERLLFDE
CFLTHAGCVYAAATLAQVKEVIAFGDTEQVPFISRLPEFRMEHHKVKGKISCTTDHLQMSKRCHSLFKENFLQKQDCEVS
ERCSSLAELCPIQSVIQIQPERDVLYMTHTRADKETLMRIPGMPKDRIKTTHEAQGETWDHVVMFRLSKTTNLLHSGKGP
DLGPCHNLVAISRHRKSFRYFTVAPHDNDDQIVKCINYARSLSSGDLDGVRVLI
>P0C783 ~~~ORF2b~~~Suppressor of silencing 2b~~~
MELNVGAMTNVELQLARMVEAKKQRRRSHKQNRRERGHKSPSERARSNLRLFRFLPFYQVDGSELTGSCRHVNVAELPES
EASRLELSAEDHDFDDTDWFAGNEWAEGAF
>Q66125 ~~~ORF2b~~~Suppressor of silencing 2b~~~
MDVLTVVVSTADLHLANLQEVKRRRRRSHVRNRRARGYKSPSERARSIARLFQMLPFHGVDPVDWFPDVVRSPSVTSLVS
YESFDDTDWFAGNEWAEGSF
>Q8UYT3 ~~~ORF2b~~~Suppressor of silencing 2b~~~
MASIEIPLHEIIRKLERMNQKKQAQRKRHKLNRKERGHKSPSEQRRSELWHARQVELSAINSDNSSDEGTTLCRFDTFGS
KSDAICDRSDWCLDQ
>Q89705 ~~~~~~Protein MGF 300-1L~~~
MVSLTTCCLKNIVNQHACVENTVLLYHLGLRWNCKTLYQCTQCNGVNYTNSHSDQCKNKDLFLMKVIVKKNLAVTRTLLS
WGASPEYARLFCRNTEEEQALNVQHVADVSSSKILERLTMSYKENDEQLLITFYLLNLSTKFSTNLREQVRFNIVSYIIC
DLAIHQTFKTFYAKNYSLSTLYCIFLAIYYKLYTALRKMVKIYPGLKRFAYLIGFMFDDETVMETYNSTDDEISECKNRI
IAIKGYYGNIHCRSDIDHMYAFSQNDYW
>P0C9L6 ~~~~~~Protein MGF 300-4L~~~
MLSLFNIALKALIMKHNVEFLKRDKEILTHLGLCCKDYVIIDKCSECGNICPNGQQHGDTCININYLLIYAVKRDNYMLA
YRLLSWGANEKFANCFRRPLPNLKPPLPKKELTPKEIKQLAYDHFHNDSELITIFEVFRKCRHINDCLNFFYKKNTEFEI
YFARLHVYSKTFYGKGWYWFCIFIAVKHSMEHALKKITKTFTPTFYNKTTLNLVLFLSACFYENVEWMKNFFYKGNKKIQ
QRMLNYGMEWAATHGKVRTFICCYTLGGTASLKLYQKAYQNEKFMIMALCSYLGNIQINNPWESLNPYTMVQNKEKFLPL
KFSEETQYFYI
>P0C9P6 ~~~~~~Protein MGF 360-11L~~~
MLPSLQSLTKKVLAGQCVSVDHYHILKCCGLWWHNGPIMLHIRRNRIFIRSTCFSQGIELNIGLMKAVKENNHDLIKLFT
EWGADINYGMICALTENTRDLCKELGAKEYLEREYILKIFFDTTRDKTSNNIIFCHEVFSNNPNLRIIDNLDLRGEIMWE
LRGLMEITFMLDHDDSFSTVLTKYWYAIAVDYDLKDAIRYFYQKYPRLHRWRLMCALFYNNVFDLHELYEIERVRMDIDE
MMHIACVQDYSYSAIYYCFIMGANINQAMLVSIQNYNLGNLFFCIDLGANAFEEGKALAEQKGNYLIANALSLKHYNPVI
SLLSVVTDPEKINRMLKNYHSINMGIFLDYEQR
>P0C9Q5 ~~~~~~Protein MGF 360-14L~~~
MLSLQTLAKKVVACNYLSSDYDYMLQRFGLWWDLGPIHLCNNCKQIFSYKHLQCFSEDDLCLEAALVKAVKSDNLELIRL
FVDWGANPEYGLIRVPAVHLKRLCTELGGLTPVSEPRLLEILKEVAKLKSCAGVLLGYDMFCHNPLLETVTRTTLDTVTY
TCSNIPLTGDTAHHLLTKFWFALALRHNFTKAIHYFYKRHKNHLYWRVACSLYFNNIFDIHELCREKEICISPNLMMKFA
CLRKKNYAAIYYCYRLGASLDYGMNLSIYNNNTLNMFFCIDLGATDFDRAQRIAHKTYMYNLSNIFLVKQLFSRDVTLAL
DVTEPQEIYDMLKSYTSKNLKRAEEYLTAHPEIIVID
>Q65141 ~~~~~~Protein MGF 360-15R~~~
MVLIEFLTGFFYLYGKRLFSISKVMDMICLDYYTIIPAPLAMMLAARIKNYDLMKRLHEWEISVDYALLVVDDVPSIDFC
LSLGAKSPTRAQKRQLLRDNTFNPVYKYLMNCSGFPTRREKNIPCDVQCERLQKNIIKELVFNCSVLLEMVLHTEREYAY
ALHCAAKHNQLPILMYCWQQSTDAESILLKTCCSDKNINCFNYCILYGGAQNLNAAMVEAAKHDARMLINYCVMLGGRSL
NEAKETAAMFGHIECAQHCFELQSYVMDALNADDAD
>P23163 ~~~~~~Protein MGF 360-16R~~~
MLSLQTIAKMAVATNTYSKYHYPILKVFGLWWKNSTLNGPIKICNHCNNIMVGEYPMCYNHGMSLDIALIRAVKERNISL
VQLFTEWGGNIDYGALCANTPSMQRLCKSLGAKPPKGRMYMDALIHLSDTLNDNDLIRGYEIFDDNSVLDCVNLIRLKIM
LTLKARIPLMEQLDQIALKQLLQRYWYAMAVQHNLTTAIHYFDNHIPNIKPFSLRCALYFNDPFKIHDACRTVNMDPNEM
MNIACQQDLNFQSIYYSYILGADINQAMLMSLKYGNLSNMWFCIDLGADAFKEAGALAGKKKKSVTAHIRS
>P23164 ~~~~~~Protein MGF 360-19R~~~
MPSTLQVLAKKVLALGEHKENEHISREYYYHILKCCGLWWHEAPIILCFDGSEQMMIKTPIFEEGILLNTALMKAVQENN
YELINLFTEWGANINYGLISINTEHARDLCRKLGAKEMLERNEVIQIVFKTLDDITSSNIILCHELFTNNPLLENVNMGE
MRMIIHWRMKNLTNLLLNNDSISEILTKFWYGIAVKYNLKDAIQYFYQRFMDFNEWRVTCALSFNNVNDLHKMYITEKVH
MNNDEMMNLACSIQDRNLSTIYYCFLLGANINQAMLTSVLNYNIFNLFFCIDLGADAFEEGKTLAKQKGYNEIVEILSLD
IIYSPNTDFSSKIEPEHISSLLKNFYPKNLFAFDRCNPGLYYS
>P23167 ~~~~~~Protein MGF 360-1L~~~
MPSTLQALTKKVLATQPVFKDDYCILERCGLWWHEAPITIHHTCIDKQILIKTASFKHGLTLNVALMKAVQENNHGLIEL
FTEWGADISFGLVTVNMECTQDLCQKLGARKALSENKILEIFYNVQYVKTSSNIILCHELLSDNPLFLNNAQLKLRIFGE
LDTLSINFTLDNISFNEMLTRYWYSMAILYKLTEAIQYFYQRYSHFKDWRLICGVAYNNVFDLHEIYNKEKTNIDIDEMM
QLACMYDCNYTTIYYCCMLGADINRAMITSVMNFCEGNLFLCIDLGADAFEESMEIASQTNNWILINILLFKNYSPDSSL
LSIKTTDPEKINALLDEEKYKSKNMLIYEESLFHIYGVNI
>P23168 ~~~~~~Protein MGF 360-20R~~~
MPTPLSLQALAKKVLATQHISKDHLYFEILWFMVAFFDAYSL
>Q65133 ~~~~~~Protein MGF 360-5L~~~
MNSLQVLTKKVLIENKAFSEYHEDDIFILQQLGLWWTQSPGILIKLQGRSGKRINIPERS
>P23162 ~~~~~~Protein MGF 360-8L~~~
MLSLQTLAKKAVAKQSVPEEYHYILKYCGLWWQNKPISLCHYCNYVILSSTPFKGELLHLDVALIMAIKENNYDVIRLFT
EWGANIYYGLTCARTEQTQELCRKLGAKDGLNNKEIFAGLMRHKTSNNIILCHEIFDKNPMLEALNVQEMGEEIHRELKL
FIFYILDNVPMNIFVKYWYAIAVKYKLKRAIFFFYQTYGHLSMWRLMCAIYFNNVFDLHEIYEQKIVHMDIDKMMQLACM
QDYNFLTIYYCFVLGADIDQAITVTQWHYHTNNLYFCKDLKDLKQNTLTARPLLLPNITDPKKIYTMLKNYLPTSSNSL
>Q65137 ~~~~~~Protein MGF 360-9L~~~
MDMDEMLRWACRKNYNYLTIYYCCVALGADINQAMFHSIQFYNIGNIFFCIDLGANAFEEGKTLAHQKDNSFIASMLSLN
CYSMNDSLSLKETDPEVIKRMLKDYHSKNLSIAHKHYINDGFNDI
>P24745 ~~~~~~Uncharacterized 38.0 kDa protein in P143-LEF5 intergenic region~~~
MASSLQSKWICLRLNDAIIKRHVLVLSEYADLKYLGFEKYKFFEYVIFQFCNDPQLCKIIENNYNYCMQIFKAPADDMRD
IRHNIKRAFKTPVLGHMCVLSNKPPMYSFLKEWFLLPHYKVVSLKSESLTWGFPHVVVFDLDSTLITEEEQVEIRDPFVY
DSLQELHEMGCVLVLWSYGSRDHVAHSMRDVDLEGYFDIIISEGSTVQEERSDLVQNSHNAIVDYNLKKRFIENKFVFDI
HNHRSDNNIPKSPKIVIKYLSDKNVNFFKSITLVDDLPTNNYAYDFYVKVKRCPTPVQDWEHYHNEIIQNIMDYEQYFIK
>A0A7H0DNE2 ~~~~~~3 beta-hydroxysteroid dehydrogenase/Delta 5-->4-isomerase~~~
MAVYAVTGGAGFLGRYIVKLLISADDVQEIRVIDIVEDPQPITSKVKVINYIQCDINDFDKVREALDGVNLIIHTAALVD
VFGKYTDNEIMKVNYYGTQTILAACVDLGIKYLIYTSSMEAIGPNKHGNPFIGHEHTLYDISPGHVYAKSKRMAEQLVMK
ANNSVIMNGAKLYTCCLRPTGIYGEGDKLTKVFYEQCKQHGNIMYRTVDDDAVHSRVYVGNVAWMHVLAAKYIQYPGSAI
KGNAYFCYDYSPSCSYDMFNLLLMKPLGIEQGSRIPRWMLKMYACKNDMKRILFRKPSILNNYTLKISNTTFEVRTNNAE
LDFNYSPIFDVDVAFERTRKWLEESE
>O57245 ~~~~~~3 beta-hydroxysteroid dehydrogenase/Delta 5-->4-isomerase~~~
MAVYAVTGGAGFLGRYIVKLLISADDVQEIRVIDIVEDPQPITSKVKVINYIQCDINDFDKVREALDGVNLIIHTAALVD
VFGKYTDNEIMKVNYYGTQTILAACVDLGIKYLIYTSSMEAIGPNKHGDPFIGHEHTLYDISPGHVYAKSKRMAEQLVMK
ANNSVIMNGAKLYTCCLRPTGIYGEGDKLMKVFYEQCKQHGNIMYRTVDDNAVHSRVYVGNAAWMHVLAAKYIQYPGSEI
KGNAYFCYDYSPSCSYDMFNLLLMKPLGIEQGSRIPRWMLKMYACKNDMKRILFRKPSLLNNYTLKISNTTFEVRTNNAE
LDFNYSPIFNVDVAFERTRKWLEESE
>P21097 ~~~~~~3 beta-hydroxysteroid dehydrogenase/Delta 5-->4-isomerase~~~
MAVYAVTGGAGFLGRYIVKLLISADDVQEIRVIDIVEDPQPITSKVKVINYIQCDINDFDKVREALDGVNLIIHTAALVD
VFGKYTDNEIMKVNYYGTQTILAACVDLGIKYLIYTSSMEAIGPNKHGDPFIGHEHTLYDISPGHVYAKSKRMAEQLVMK
ANNSVIMNGAKLYTCCLRPTGIYGEGDKLTKVFYEQCKQHGNIMYRTVDDNAVHSRVYVGNAAWMHVLAAKYIQYPGSKI
KGNAYFCYDYSPSCSYDMFNLLLMKPLGIEQGSRIPRWMLKMYACKNDMKRILFRKPSLLNNYTLKISNTTFEVRTNNAE
LDFNYSPIFDVDVAFKRTRKWLEESE
>P26670 ~~~~~~3 beta-hydroxysteroid dehydrogenase/Delta 5-->4-isomerase~~~
MAVYAVTGGAGFLGRYIVKLLISADDVQEIRVIDIVEDPQPITSKVKVINYIQCDINDFDKVREALDGVNLIIHTAALVD
VFGKYTDNEIMKVNYYGTQTILAACVDLGIKYLIYTSSMEAIGPNKHGDPFIGHEHTLYDISPGHVYAKSKRMAEQLVMK
ANNSVIMNGAKLYTCCLRPTGIYGEGDKLTKVFYEQCKQHGNIMYRTVDDDAVHSRVYVGNVAWMHVLAAKYIQYPGSEI
KGNAYFCYDYSPSCSYDMFNLLLMKPLGIEQGSRIPRWMLKMYACKNDMKRILFRKPSLLNNYTLKISNTTFEVRTNNAE
LDFNYSPIFNVDVAFERTRKWLEESE
>Q89187 ~~~~~~3 beta-hydroxysteroid dehydrogenase/Delta 5-->4-isomerase~~~
MEAIGPNKHGNPFIGHEHTLYDISPGHVYAKSKRMVEQLVTKANNSVIMNGAKLYTCCLRPTGIYGGGDKLTKVFYEQCT
QHGNIMYRTVDDDAVHSRVYVGNVAWMHVLAVKYIQYPGSEIKGNAYFCYGYSPSCSYDMFNLLLMKPLGIEQGSRIPRW
MLKMYTCKNDMKRILFRKPSLLNNYTLKISNTTFEARQQCRTRFQLLPYL
>Q89869 ~~~~~~Protein MGF 505-10R~~~
MFSLQELCRKNIYILPYPLGKHVLQQLGLYWKGHGSLQRIGDDHVLLQQDLIFSINEALRMAAEEGNNEVVKLLLLWEGN
LHYAIIGALEGDRYDLIHKYYEQIGDCHKILPLIQDPQIFEKCHELSNSCNIRCLLEHAVKHNMLSILQKHKDQIRLHMA
LTQILFELACHERKNDIIRWIGYSLHIYHLETIFDVAFAHKNLSLYVLGYELLMHKVNTEAANIDLPNLLSYHLRTAAAG
GLLNFMLETIKHGGCVDKTVLSAAIRYKHRKIVAHFIHQVPRKTVKKLLLYAVQARAPKKTLNLLLSSLNYAVHTITKQL
VHNVINYSSTLVVKLLLMRRKRKLNLVDAVLARLVKYSTYTDIVQFMGEFSVSPERVIKMAARESRTFLIEMISKAAWGN
HPQTLIHHLKHLTNTMKPQSGKDLIIYTIHYIYLNSNMLVAEEEKNIFKLAKFYANHNAVNRFKQICEDYYILDARFKTL
ILECFEIAVQKNYPRIANIVDDYIRFLFYRGNITEEEIREAYSLKDAEVYVDLKWLQQGEMV
>Q65208 ~~~~~~Protein MGF 505-11L~~~
MFSLQKKALQHIYMTPENASQLTKDLLQHLGLYWNGPVIKMDTVVHLHNKIFSNRSVLKYALAKQANITIIKTLVLWVEP
EYALAQALKHNRKDVLECIFSYHLTTPKYHHIMHLTSSQELFEFFHLFICKSKNYNARMECLLYAATLYNFPNILEKNRE
YIIRHSIGNTLFAIACKERHIHLIAWFVTAGVLDTYDDSTLFNTAFRLGDYSLLEVACDLPIIYPDYLIISMMQTAIQKN
YFRFFKKLLTHFNIYRQIIITDAAYYNRRKILLLLLNQNVFNNFAILCALSAAIKGHASKKTLNLLISQLDSQMTVIDSV
YYSIIKYNNIDCIPLLMQIKTFRLETLVSIAVHGDNIDIIAACKAFLPKDTLYHLVLKMAIISRNHKLFKLYTEKENPMY
IFTTIKAIISNLVNYTVVQAVAIEYLREFHREKQLPIVPLLMVLAEHNYITKFKKACYAANMSDQKVKRALIKCLFIASQ
KNYCQIFKYCFGSLLKVLSKHERVKFFNSVVFAKKLASYYDHQNMIHLIDSLIERFRYLIKD
>Q89702 ~~~~~~Protein MGF 505-2R~~~
MFSLQDLCRKHLFILPDVFGEHVLQRLGLYWRCHGSLQRIGDDHILIRRDLILSTNEALRMAGEEGNNEVVKLLLLWKGN
LHYAVIGALQGDQYDLIHKYENQIGDFHFILPLIQDANTFEKCHALERFCGVSCLLKHATKYNMLPILQKYQEELSMRAY
LHETLFELACLWQRYDVLKWIEQTMHVYDLKIMFNIAISKRDLTMYSLGYIFLFDRGNTEATLLTQHLEKTAAKGLLHFV
LETLKYGGNIDTVLTQAVKYNHRKLLDYFLRQLPRKHIEKLLLLAVQEKASKKTLNLLLSHLNYSVKRIKKLLRYVIEYE
STLVIKILLKKRVNLIDAMLEKMVRYFSATKVRTIMDELSISPERVIKMAIQKMRTDIVIHTSYVWEDDLERLTRLKNMV
YTIKYEHGKKMLIKVMHGIYKNLLYGEREKVMFHLAKLYVAQNAATQFRDICKDCYKLDVARFKPRFKQLILDCLEIVTK
NLAIVSWKS
>Q89642 ~~~~~~Protein MGF 505-3R~~~
MSSSLQELCRKKLPDCILPEFFDDYVLQLLGLHWQDHGSLQRIEKNQILVQQEPIHINEALKVAASEGNYEIVELLLSWE
ADPRYAVVGALESKYYDLVYKYYDLVKDCHDILPLIQNPETFEKCHELNNPCSLKCLFKHAVIHDMLPILQKYTYFLDGW
EYCNQMLFELACSKKKYEMVVWIEGVLGIGKVTSLFTIAISNRDLHLYSLGHLIILERMQSCGQDPTFLLNHFLRDVSIK
GLLPFVLKTIEYGGSKEIAITLAKKYQHKHILKYFETGKC
>Q89768 ~~~~~~Protein MGF 505-4R~~~
MFSLQDICRKYLFQLPDSFDEYTLQVLGLYWEKHGSLQRIRKDAVFVQRNLIISINEALRIAASEGNGRVVKLLLSWEGN
FHYVIIGALEGDHYDLIHKYGSQIEDYHMILSSIHNANTFEKCHELSNCDMWCLIQNAIKYNMLPILQKHRNILTHEGEN
QELFEMACEEQKYDIVLWIGQTLMLNEPEFIFDIAFERIDFSLLTMGYSLLFNNKMSSIDIHDEEDLISLLTEHLEKAAT
KGCFFFMLETLKHGGNVNMAVLSKAVEYNHRKILDYFIRQKCLSRKDIEKLLLVAISNSASKKTLNLLLSYLNHSVKNII
GKIVQSVLKNGDFTIIIFLKKKKINLVEPALIGFINYYYSYCFLEQFIHEFDIRPEKMIKMAARKGKLNMIIEFLNEKYV
HKDDLGAIFKFLKNLVCTMKHKKGKETLIVLIHKIYQVIQLETKEKFKLLRFYVMHDATIQFISMYKDCFNLAGFKPFLL
ECLDIAIKKNYPDMIRNIETLLKCE
>Q89777 ~~~~~~Protein MGF 505-5R~~~
MFSLQEICRKNIYFLPDWLSEHVIQRLGLYWEKHGSLQRIGDDYVLIQQDLIIPINEALRMAGEEGNDEVVQLLLLWEGN
IHYAIIGALESDHYSLIRKLYDQIEDCHDILPLIQDPKIFEKCHELDKFCNILCLVLHAVKNDMLCILQEYKMHLSGEDI
QVVFETACRSQKNDIVSWMGQNIAIYNSGVIFDIAFDKMNVSLLSIGYTLLFNHHINNTNENINSLLTQHLEWAAGMGLL
HFMLETLKYGGDVTIIVLSEAVKYDHRKILDYFLRRKNLYQEDLEELLLLAIRADCSKKTLNLLLSYLNYSINNIRKKIL
QCVKEYETTVIIKILWKRKINLIEPILADFIGYHSYTYMVDFMREFSIHPEKMIKMAARESREDLIIKFSKKVCKEPKDR
LHYLKSLVYTMRHKEGKQLLIYTIHNLYKACHLESKEMFNLARFYARHNAVIQFKSICHDLSKLNINIKNLLLECLGIAI
KKNYFQLIKTIETDMRYE
>Q89925 ~~~~~~Protein MGF 505-7R~~~
MFSLQDLCRKNTFFLPNDFSKHTLQRLGLYWKEHGSVHRIEKDSIMIQNELVLSINDALQLAGEEGDTDVVQLLLLWEGN
LHYAIIGALKTENYNLVCEYHSQIQDWHILLPLIQDPETFEKCHDLSLGCDLICLLQHAVKCDMLSILVKYKEDLLNVRI
RHRTQSLFVLACENRRFEIIEWIGQNLSIPEPEAIFSIAIVTKDVELFSLGYKIIFDYMQRQGIFQLTNVVRMLLLNRHI
GMAIEKGLLPFILETLKYGGSVKRALSYAVIDNKRKIIDYLVRHENIPRGTIERLLHLAVKKQSSRKTLNLLLSYINYKV
KNVKKLLEHVVKYNSTLVIRILLEKKKNLLDATLTRYVKDSTYFQVKEFMQDFSISPEKFIKIAVREKRNVLIKGISEDI
WENPAERIRNLKQIVCTIKYESGRQFLINIIHTIYQSYSLKPEEILKLATFYVKHNATTHFKDLCKYLWLNRGTESKKLF
LECLEIADEKEFPDIKSIVSEYINYLFTAGAITKEEIMQVYALEYAMY
>Q89740 ~~~~~~Protein MGF 505-9R~~~
MFSLQDLCRKNLFLPLEPLGKHVVQRLGLYWEGHGSLKRVGDCFICVDKIWILSIHKAIQIAASEGNENIVKLFLLWKGS
LQYAIIGALEGRQYDLIQKYYNQIGDCHENLPLIQDPEIYERCHELNVTCTFQCLFQHAIRDNMLPIFQKYGEDLNGNRR
MVQLLYEMACRLQNYDIIKWIGFNLHVYNLEAIFSIAFVRKDLTLYSLGYMLLLGRMSTEDRNFISIITRHLEYASKKGL
FDFVLESLKYGGQVDTVLFQAVKYNHRKILAHFIHEIPRETVEKLILHAVESRASRKTFNLLLSSINYCVNPFVKKLLHT
VVKHKYMLIIKLLLERPKKKINLVDAALFKLVKYSTYAEIVKFMKEFSVDPERVVKMAARLMRVDLIKKISNDAWENKLE
RIKHLKQMVNTMNHRNGKNLLMYNIHNITGYTCLNTKEAFNLTRFYAVHNATCLFKEMCKSCFVHDKIQFRELLEDCLHI
ANRHDYIQIAETADECIKYIDLITPK
>P20194 ~~~~~~Uncharacterized protein A-100~~~
MVSPQTRKEEELLEKQNSVFYLLTLGRKPYGSYLHIKIELDEDEKLEKEIYADNIKLENELRQLKRLYEVYQSVEIDDAQ
KAIQKEALLTIAKILSVFDF
>Q9J558 ~~~~~~Protein A11 homolog~~~
MTGLMVTDITNIAKEYNLTAFSEDVYPCNKNYELTNGQLSALKTINVVLTTRSDNYEKDVTYNDDDDHDRCIVSEIGSHH
SFNDEKDNYIQSNNIQQTPSLSAVFDDNKRVHLLEQEIAELRKKKTKSKNLLDFTNTLFNKNPLRIGILNKRAIILNYAS
MNNSPLTMEDLEACEDEEIENMYISIKQYHEVHKKKLIVTNIISILISVIEQLLVRIGFDEIKGLSKEVTSTIIDLEIGE
DCEQLATKMGVANNPVINISLFILKIFIRRINIL
>Q9J555 ~~~~~~Virion membrane protein A13 homolog~~~
MGIIDTFVITAVTVIIFCLLIYAAYKRYKCIPSPDDRDKVLKSTLNDDTLFNQTLTPDQVKALHRLVTSSI
>Q9J552 ~~~~~~Virion membrane protein A16 homolog~~~
MGQHVSNITVIATPQAPETKYLRVEYTGGYDDEYIRFFEAENIHSGDIGSEISPPFCLTRDTTVKQCASFLSPEAKKKFV
IVPGEPCKSLSFRPGSILDLQKIPYGTESYVLDGTRCRFINIDYLYTDPDIKRCCNKESDKDCPEIFSNNYETDHCDTIM
SSICLQTPGSLPCREWLEKKREVAFDTYMKVCSDHLDANYCSDFVDYTRPDNFGYSDAAILSYCSKHRNNPNCWCVTTPK
NDKLFSLELALGPKVCWLHECTDKSKDRKYLLFDQDVQRTNCKYIGCNINVDTLRLRNSVAELIAKCGGSIAEDTVLGDD
SYNKEAKLPSFFSIIPVCIVLLCLFVLFYFLRIYDAKVINSNTINVYRK
>Q6RZG6 ~~~~~~Virion membrane protein A16~~~
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGSLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRAKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
SAVGDVHHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRPKIKTNDINVRRR
>O93122 ~~~~~~Virion membrane protein A16~~~
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGSLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRAKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
SAVGDVHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRTKIKTNDINVRRR
>Q77TI4 ~~~~~~Virion membrane protein A16~~~
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGSLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRAKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
SAVGDVHHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRPKIKTNDINVRRR
>Q8LTE1 ~~~~~~Minor capsid protein A1~~~
MAKLETVTLGNIGKDGKQTLVLNPRGVNPTNGVASLSQAGAVPALEKRVTVSVSQPSRNRKNYKVQVKIQNPTACTANGS
CDPSVTRQAYADVTFSFTQYSTDEERAFVRTELAALLASPLLIDAIDQLNPAYWTLLIAGGGSGSKPDPVIPDPPIDPPP
GTGKYTCPFAIWSLEEVYEPPTKNRPWPIYNAVELQPREFDVALKDLLGNTKWRDWDSRLSYTTVRGCRGNGYIDLDATY
LATDQAMRDQKYDIREGKKPGAFGNIERFIYLKSINAYCSLSDIAAYHADGVIVGFWRDPSSGGAIPFDFTKFDKTKCPI
QAVIVVPRA
>P09677 ~~~~~~Minor capsid protein A1~~~
MAKLNQVTLSKIGKNGDQTLTLTPRGVNPTNGVASLSEAGAVPALEKRVTVSVAQPSRNRKNFKVQIKLQNPTACTRDAC
DPSVTRSAFADVTLSFTSYSTDEERALIRTELAALLADPLIVDAIDNLNPAYWAALLVASSGGGDNPSDPDVPVVPDVKP
PDGTGRYKCPFACYRLGSIYEVGKEGSPDIYERGDEVSVTFDYALEDFLGNTNWRNWDQRLSDYDIANRRRCRGNGYIDL
DATAMQSDDFVLSGRYGVRKVKFPGAFGSIKYLLNIQGDAWLDLSEVTAYRSYGMVIGFWTDSKSPQLPTDFTQFNSANC
PVQTVIIIPSL
>Q6QGT3 ~~~A1~~~Protein A1~~~
MVISAEKQTVILKMAADFNFYGKRLRATKLEVCDDISKMVYDTTKHSTAICDWLEANKPAKPKAAKVAKAIKNDERPEAA
GIVSSTVEQWEVKQGKRFIITSIQNNTFPHKNFLASLEQYAQFIGADLLVSKFIYNKNGFQNGEGADGIRYDSAFDKYIC
NKNVFLNNRRFAFMAEINVLPTADYPLSGFAETATALNLEGLAIGAAKITAESVPALKGEIVRRMYSTGTATLKNYIQQK
AGQKAEALHNFGALIVEFDEDGEFFVRQLETMDESGVFYDLNTCATPAGCYETTGHVLGLQYGDIHAEKLDEECAAASWG
HGDTYGLVDILKPKYQFVHDVHDFTSRNHHNRASGVFLAKQYAAGRDKVLDDLIDTGRVLESMERDFSQTIIVESNHDLA
LSRWLDDRNANIKDDPANAELYHRLNAAIYGAIAEKDDTFNVLDYALRKVAGCEFNAIFLTTDQSFKIAGIECGVHGHNG
INGSRGNPKQFKKLGKLNTGHTHTASIYGGVYTAGVTGSLDMGYNVGASSWTQTHIITYANGQRTLIDFKNGKFFA
>P23541 ~~~A2~~~Protein A2~~~
MTNAKTAKFAWNEENTQKAVSMYQQLINENGLDFANSDGLKEIAKAVGAASPVSVRSKLTSAKAYQKSDKPRKVGGGSSI
RKAHYVRVIAKHAIDSGIIKDADDLASLESAKLETLDAVAQLLGVADEVKQAAGE
>P24765 ~~~~~~Protein A40~~~
MNKHKTDYAGYACCVICGLIVGIIFTATLLKVVERKLVHTPSIDKTIKDAYIREDCPTDWISYNNKCIHLSTDRKTWEEG
RNACKALNPNSDLIKIETPNELSFLRSIRRGYWVGESEILNQTTPYNFIAKNATKNGTKKRKYICSTTNTPKLHSCYTI
>P31037 ~~~~~~Protein A49R~~~
MDEAYYSGNLESVLGYVSDMHTELASISQLVIAKIETIDNDILNKDIVNFIMCRSNLDNPFISFLDTVYTIIDQENYQTE
LINSLDDNEIIDCIVNKFMSFYKDNLENIVDAIITLKYIMNNPDFKTTYAEVLGSRIADIDIKQVIRENILQLSNDIRER
YL
>Q01220 ~~~~~~Protein A52~~~
MDIKIDISISGDKFTVTTRRENEERKKYLPLQKEKTTDVIKPDYLEYDDLLDRDEMFTILEEYFMYRGLLGLRIKYGRLF
NEIKKFDNDAEEQFGTIEELKQKLRLNSEEGADNFIDYIKVQKQDIVKLTVYDCISMIGLCACVVDVWRNEKLFSRWKYC
LRAIKLFINDHMLDKIKSILQNRLVYVEMS
>Q9J563 ~~~~~~Protein A6 homolog~~~
MEKFRNAFIEFYELSKKYLENSTGQKVYEVNYDNDIDSFLTVFPILESKIGGDINCAMSDETVILAMQQVEFKMFTFWYM
RSAANVKSMLNKITDKETKEQFIRIFKDMLVYAKVITSINNMYSNMKKDTNEIVQDLKKILGIVSLIKSANNEHQAYKIL
MENNSFIIRTINKVLADSNYIIKIIALFNTDVVSDKIKLEEYKDVFSFSKENVIFGIKCFCDITIDGIDQINNKYVSFFK
KVLPNIILFQTSCVKTTQFVNIFSKLSSIVYSEILTNERLHVLFSEIMASFKTKVSVEDLKKRKVNNIQGLISEISNNRE
MYKNIFVEEYEKHKTTLISIVQCITDNYNINYKENAVDIEFIFDFIQEHYISKL
>Q9J560 ~~~~~~Protein A9 homolog~~~
MSCYSSILNSISTLAFLQVASNVIELVRHCIMHFCETRIRCNTLAFVILKILITMVIYFMIGLGLFYLAKNGTEAE
>P11191 ~~~abc2~~~Anti-RecBCD protein 2~~~
MPAPLYGADDPRRCSGNSVSEVLDKFRKNYDLIMSLPQETKEEKEFRHCIWLAEKEERERIYQTAIRPFRKATYTKFIEI
DPRLRDYRSRYGAISNN
>P10447 2.7.10.2~~~ABL~~~Tyrosine-protein kinase transforming protein Abl~~~
ENLLAGPSENDPNLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGP
VSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSPESRFNTLAELVHHHSTVADGLIT
TLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMK
EIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLV
GENHLVKVADFGLSRLMTGDTYTAHAGTKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYEL
LEKDYRMERPEGCPEKVYELMRACWQWNPSDRPAFAEIH
>P41482 ~~~~~~Protein Ac102~~~
MIASINDTDMDTDDNMSQARRNRRNRPPARPSAQTQMAAVDMLQTINTAASQTAASLLINDITPNKTESLKILSTQSVGA
RSLLEPMQANASTIKLNRIETVNVLDFLGSVYDNTIQVIVTE
>P24730 ~~~~~~Protein Ac132~~~
MSDKTPTKKGGSHAMTLRERGVTKPPKKSEKLQQYKKAIAAEQTLRTTADVSSLQNPGESAVFQELERLENAVVVLENEQ
KRLYPILDTPLDNFIVAFVNPTYPMAYFVNTDYKLKLECARIRSDLLYKNKNEVAINRPKISSFKLQLNNVILDTIETIE
YDLQNKVLTITAPVQDQELRKSIIYFNILNSDSWEVPKYMKKLFDEMQLEPPVILPLGL
>P12828 ~~~~~~Protein AC18~~~
MERLLNQLNLGVLPYITTKDIEDRLRDKIVAKAKLAFIKDCFEAVVCENGGLFVLTGGAAVTCHIDDDRSALKCIDFDYY
GFCAKMFCNLQTNLQKCVDQHYAELDVLTRQVYMSDPLVVLKCYQNGAYRLNGQINLHLNRHIKCIKTQYNDEFDLVRFA
LQIDITSADGVDEYTDNGVKITTAPLSFNVFFVNVRIMKRPFNADRCIKNFSLFGNEYHVLVSSLQRVLNDQLMCLLKDI
FTNKFDYKIERRLKHLKRLFANLPAESYNSCVNDHTDMCLYKEQNETITNFVKKILDISGPALGCRKLMHIYLTTDTFSG
QLPAYLTHYVNYPHKSLCDQNWKRFMSCIFSLY
>P21287 ~~~~~~Protein Ac34~~~
MTTVAVNAPLPPPLVELCNRRPIPTPRIISLQRQLISTPVVKNYQADVQEAINDFKRLNITPGHLGEVIDTMGQQGKLLP
EIIEADDDFKVNQTRNLSCKTVEYLNFLENDKLFRCRLCYTHADWLWCDFHRNHAYRGTRDITCNNYVEHLNSDMGVVML
IEEYFYCLSSCNFKQDAKRALQTLTKFESLSDLMASYNFSTPDLDTNAYELMDFE
>Q91J24 ~~~C4~~~Protein C4~~~
MGNLISTSCFNSKEKFRSQISDYSTWYPQPGQHISIRTFRELNPAPTSSPTSTRTETQLNGGNSRSTVEVLEEVNRQLTT
HMPRR
>P0C2W9 ~~~AC4~~~Protein AC4~~~
MGNLTSTCLFSSRENTAAKINDSSTWYPQQGQHISIQTFRELNRLPTSRRTSTKTEILLYGENSRSTVEVLEEVAKHLTT
LQQRR
>P41458 ~~~~~~Protein AC54~~~
MCSTKKPIKLDLCASVKLTPFKPMRPPKPMQCWIHPRRANCKVTRPRNNYSDPDNENDMLHMTVLNSVFLNEHAKLYYRH
LLRNDQAEARKTILNADSVYECMLIRPIRTEHFRSVDEAGEHNMSVLKIIIDAVIKYIGKLADDEYILIADRMYVDLIYS
EFRAIILPQSAYIIKGDYAESDSESGQSVDVCNELEYPWKLITANNCIVSTDESRQSQYIYRTFLLYNTVLTAILKQNNP
FDVIAENTSISIIVRNLGSCPNNKDRVKCCDLNYGGVPPGHVMCPPREITKKFFHYAKWVRNPNKYKRYSELIARQSETG
GGSASLRENVNNQLHARDVSQLHLLDWENFMGEFSSYFGLHAHNV
>P41467 ~~~~~~Protein Ac66~~~
MQRWPKYGGTDVNTRTVHDLLNTINTMSARIKTLERYEHALREIHKVVVILKPSANTHSFEPDALPALIMQFLSDFAGRD
INTLTHNINYKYDYNYPPAPVPAMQPPPPPPQPPAPPQPPYYNNYPYYPPYPFSTPPPTQPPESNVAGVGGSQSLNQITL
TNEEESELAALFKNMQTNMTWELVQNFVEVLIRIVRVHVVNNVTMINVISSITSVRTLIDYNFTEFIRCVYQKTNIRFAI
DQYLCTNIVTFIDFFTRVFYLVMRTNFQFTTFDQLTQYSNELYTRIQTSILQSAAPLSPPTVETVNSDIVISNLQEQLKR
ERALMQQISEQHRIANERVETLQSQYDELDLKYKEIFEDKSEFAQQKSENVRKIKQLERSNKELNDTVQKLRDENAERLS
EIQLQKGDLDEYKNMNRQLNEDIYKLKRRIESTFDKDYVETLNDKIESLEKQLDDKQNLNRELRSSISKIDETTQRYKLD
AKDIMELKQSVSIKDQEIAMKNAQYLELSAIYQQTVNELTATKNELSQVATTNQSLFAENEESKVLLEGTLAFIDSFYQI
IMQIEKPDYVPISKPQLTAQESIYQTDYIKDWLQKLRSKLSNADVANLQSVSELSDLKSQIISIVPRNIVNRILKENYKV
KVENVNAELLESVAVTSAVSALVQQYERSEKQNVKLRQEFEIKLNDLQRLLEQNQTDFESISEFISRDPAFNRNLNDERF
QNLRQQYDEMSSKYSALETTKIKEMESIADQAVKSEMSKLNTQLDELNSLFVKYNRKAQDIFEWKTSMLKRYETLARTTA
ASVQPNVE
>Q06669 ~~~~~~Protein Ac75~~~
MSNLMKNFFTELVKSTTFTTKVSVVKTTLSNWLCEQVYPDKDFSLKLKRVVNMFLNNEIENNKIYKLVETVDSSNKLSRR
QVDFLIHALLNNVSVTFTLHRFVDDNVLTQDELSFLANFLVTKMDEAYQLPAY
>Q06690 ~~~~~~Protein Ac76~~~
MNLYLLLGALAIFSLVYDKKENSIFLYLLILFLVFIIVSPAIISKNTESTVEDIPSHKAKSVRKKLEIEQALDAILNKNT
SSID
>E5EYS5 ~~~~~~Anti-CBASS protein Acb1~~~
MKLSDIDKGTYAAVKFDDSTLDMFQALQQIMELFNPVPRDKLHSTICFSRVKIPYIPLTEKMPIGSTHKLEVFEHNGKRA
LVVLLDSPYLESRHEYANILGATFDFPTYNPHVTLAYDIGAMEIPKHGVTGNPVVITHEYTEDLDLNWKP
>A0A868BQY3 ~~~~~~Anti-CBASS protein Acb1~~~
MKKLSEFGKGLYVAAKFSESTLDALEELQRSLKLPNPVPRDKLHTTIVYSRVNVPYKVASGSFEIADKGKLTVFETQSGN
RALVLEMDSDYLSARHSYAKALGASYDYPDYRPHITLSYNIGVLNFSGEYKVPVVLDREYSEELDLEWSDKD
>Q6WHI1 ~~~~~~Anti-CBASS protein Acb1~~~
MKLSKFLESQGTYVGIKLDEASKQELVRLQKTLRLKNPLDPDKFHVTVLYSRKQIDVPVVDTTFVATVEQVDCWKTQDGK
HAVVAKMSCPALVERHEDLISIGGTHDYPDYTPHVTLSYDDVITPMFINTEVRLVDEYIEPLDLEWVANND
>I7K316 ~~~~~~Anti-CBASS protein Acb1~~~
MLTFDEAAEGLYVAAKYSELTLDAIEMLQRELKVPNPVPREKIHSTICYSRVMVPYSVASGSFEVATSGHLEVWNHGSSP
LLVLVLDSDYLRCRHQYARAIGATHDFQDYTPHITLSYNVGVLKYPKEKYPIRVVLDREYKEPLKLDWADDLK
>D9ICP2 ~~~~~~Anti-CBASS protein Acb1~~~
MKSFIEVAKPVVDSGIYMCAKFDQASCEALAQVQKLLGVENPVSAEKLHTTIVYSRKTVDLFPASGISEPARLVDVEKWD
TKYGNTIVGVLESDYLHSRFNDAMDAGATYDFDNYKPHVTLAYDSRIEDISGVKRLLTLPVDLTIIKEDAESLDLDKTVE
DITEHIEHRGESS
>M1EAP5 ~~~~~~Anti-CBASS protein Acb1~~~
MMKFQDFSSGLYVAAKFSDSTLDEIENLQRELKVPNPVPRHKIHSTICYSRVNVPYVVSTGSFEVANSGELQVWDTQDGR
TLVLVLDSDYLKFRHNYARALGATHDFDDYSPHITLSYNVGPAQFSGIVQVPVILDREYKEPLKINWTEDLK
>E3SEW0 ~~~~~~Anti-CBASS protein Acb1~~~
MTLNEVSEGLYVAAKFSELTLDALENLQRSLKVPNPVPREKFHSTICYSRVNIPYVVASGSFEVANSGHLEVWKTDDGST
LVLVLDSDYLSCRHMYARALGATHDFPDYTPHITLSYNVGPLTYKGEVQIPVVLDREYKEPLKLNWSADLK
>P04533 ~~~~~~Anti-CBASS protein Acb1~~~
MMEFKDFSTGLYVAAKFSELTLDALEELQRSLRVPNPVPREKIHSTICYSRVNVPYVPSSGSFEVASSGHLEVWKTQDGS
TLVLVLDSEYLRCRHMYARALGATHDFDDYTPHITLSYNVGPLSFSGDVQIPVVLDREYKEPLKLDWADDLK
>A0A8G1LR08 ~~~~~~Anti-Pycsar protein Apyc1~~~
MRHTMELTMVGTGSAFSKKFYNNSALVEFSNGYRLLIDCGHSVPKGLHALGFPLDSLDGILITHTHADHIGGLEEVALYN
KFVLGGRKIDLLVPNTLVESLWENSLKGGLRYPDEGSPEPELSDYFTVRSLKTSSYGVAHTQIEENMAVRLYPTVHVSHM
DSYAVGLVDRGEDKVFYSSDTIFDEYLIDYALTYSWVFHDCQFFTGGVHASLDELLSYISEEDQSRVFLMHYGDNMEDFF
TKAGMMRFALQGRTYIL
>A0A217ER65 ~~~~~~Anti-Pycsar protein Apyc1~~~
MLDTTKLTMVGTGSAFSKKFYNNSALVTFTNGYNLLIDCGHSVPKGLHDLGFPLESLDGILITHTHADHIGGLEEVALYN
KFVLGGRKIDLLVPEPLVEPLWNDSLNGGLRYDDSRELELDDYFTVRSLKTSDCGAARTQIDENIAFTLYTTLHVSHMKS
YAVGLIDRGEEKVFYSSDTVFDEYLLDYALTMFPWVFHDCQLFTGGVHASLDELLGYTRYIPEKQQNKIFLMHYGDNVEE
FIGKTGRMRFAEQGREIIL
>A0A516KNH9 ~~~~~~Anti-Pycsar protein Apyc1~~~
MLDTTKLTMVGTGSAFSKKFYNNSALVTFTNGYKLLIDCGHSVPKGLHDLGFPLESLDGILITHTHADHIGGLEEVALYN
KFVLGGRKIDLLVPEPLVEPLWNDSLNGGLRYDDSRELELDDYFTVRSLKTSDCGAARTQIDENIAFTLYTTLHVSHMKS
YAVGLIDRGEEKVFYSSDTVFDEYLLDYALTMFPWVFHDCQLFTGGVHASLDELLGYTRYIPEKQQNKIFLMHYGDNVEE
FIGKTGRMRFAEQGREIIL
>Q6TM72 ~~~~~~Anti-CRISPR protein 30~~~
MIAQQHKDTVAACEAAEAIAIAKDQVWDGEGYTKYTFDDNSVLIQSGTTQYAMDADDADSIKGYADWLDDEARSAEASEI
ERLLESVEEE
>P06022 ~~~C~~~Late transcription activator C~~~
MQHDLFEHDPAIRQLIGHIDNIPAPELESRWPRSVVDLIDVLENELKRQNVSNPRELARKQAVALSCFLGGRQFYIPCGD
TILTALRDDLLYCQFNGRNMEELRRQYRLSQPQIYQIIARQRKLHTRRHQPDLFSPETPK
>P07693 4.4.1.42~~~~~~S-Adenosylmethionine lyase~~~
MIFTKEPAHVFYVLVSAFRSNLCDEVNMSRHRHMVSTLRAAPGLYGSVESTDLTGCYREAISSAPTEEKTVRVRCKDKAQ
ALNVARLACNEWEQDCVLVYKSQTHTAGLVYAKGIDGYKAERLPGSFQEVPKGAPLQGCFTIDEFGRRWQVQ
>P21290 3.6.1.13~~~~~~Protein ADP-ribose pyrophosphatase ORF38~~~
MRNAAGLFMIIEPDKAVLLCARRAYRSANAPAADMNDTFLEKISIPRGHRDCCDAKVYETAVREFVEETGRFFDSAFIYK
LPFTLQWKDDGVTYKYLIYVGVVRGNLINVNAKPNTYTVKLLPGTFGNDYRIMLKPRRFNCEIARSLAIVPLNKYFNYMN
DKQLITYDYSNYIEFFDFVRSVKARFDNRQLQDFFYATLKKIDNDAPQKLHALRRV
>P24935 ~~~~~~Adenovirus death protein~~~
MTGSTIAPTTDYRNTTATGLTSALNLPQVHAFVNDWASLDMWWFSIALMFVCLIIMWLICCLKRRRARPPIYRPIIVLNP
HNEKIHRLDGLKPCSLLLQYD
>Q37976 3.4.24.-~~~ply~~~L-alanyl-D-glutamate peptidase~~~
MTSYYYSRSLANVNKLADNTKAAARKLLDWSESNGIEVLIYETIRTKEQQAANVNSGASQTMRSYHLVGQALDFVMAKGK
TVDWGAYRSDKGKKFVAKAKSLGFEWGGDWSGFVDNPHLQFNYKGYGTDTFGKGASTSNSSKPSADTNTNSLGLVDYMNL
NKLDSSFANRKKLATSYGIKNYSGTATQNTTLLAKLKAGKPHTPASKNTYYTENPRKVKTLVQCDLYKSVDFTTKNQTGG
TFPPGTVFTISGMGKTKGGTPRLKTKSGYYLTANTKFVKKI
>Q37979 3.4.24.-~~~ply~~~L-alanyl-D-glutamate peptidase~~~
MALTEAWLIEKANRKLNAGGMYKITSDKTRNVIKKMAKEGIYLCVAQGYRSTAEQNALYAQGRTKPGAIVTNAKGGQSNH
NYGVAVDLCLYTNDGKDVIWESTTSRWKKVVAAMKAEGFKWGGDWKSFKDYPHFELCDAVSGEKIPAATQNTNTNSNRYE
GKVIDSAPLLPKMDFKSSPFRMYKVGTEFLVYDHNQYWYKTYIDDKLYYMYKSFCDVVAKKDAKGRIKVRIKSAKDLRIP
VWNNIKLNSGKIKWYAPNVKLAWYNYRRGYLELWYPNDGWYYTAEYFLK
>A6QL29 ~~~~~~Avian agnoprotein 1a~~~
MSTPARDPNTAGTAALSPFSTPNHELRAPGPGEAHSPFTPTAAPGSQPAGSLSDPEDGPDPTFNFYIQGHRRRPYDRQNR
FGKLESEIRETKSQLETLRQELKHLQADVDDLKETVYAAGTSTASTSVPPSQPNSPTPTATTPEASPAAPTTESTETTGP
SVATNATEPSESRPAR
>P03086 ~~~~~~Agnoprotein~~~
MVLRQLSRKASVKVSKTWSGTKKRAQRILIFLLEFLLDFCTGEDSVDGKKRQRHSGLTEQTYSALPEPKAT
>P03084 ~~~~~~Agnoprotein~~~
MVLRRLSRQASVKVRRSWTESKKTAQRLFVFVLELLLQFCEGEDTVDGKRKKPERLTEKPES
>P0DOE2 ~~~aimP~~~Protein AimP~~~
MKKVFFGLVILTALAISFVAGQQSVSTASASDEVTVASAIRGA
>P0DOE3 ~~~aimR~~~AimR transcriptional regulator~~~
MIKNECEKDNQLAARLAKLAGYEKVNGFYKFVNTPEKEMENLGGLLKIVKNLFPDSEEQLLSEYFLELDPNKKCARQSVE
YSDINQWDTLTDKIIINLCNSKNSTSQEWGKVYSLHRKLNKNEISLNDAIRESGKCKIKSAEMLFFSNAMLMYAYLNIGE
FGLMKSTSKLLEFDDLPEGFIKESFKSRVSMLEANISLNENSLLEARQHSNRAIENSNVNRICFFAYLTIGNTLIFEDYD
EAKKAYIKGQKYAKNPVHQEMLDGALCFLSNIWKKENQWVNYNSDNIKYLQLRAFYYINQGNIEEATEILDELSSRDQDE
NELGFYYYYKGLISQDKTDYYKSIRYFKKSDDKYFIQLPLLQLERMGADLELLNLISI
>O64094 ~~~aimR~~~AimR transcriptional regulator~~~
MELIRIAMKKDLENDNSLMNKWATVAGLKNPNPLYDFLNHDGKTFNEFSSIVNIVKSQYPDREYELMKDYCLNLDVKTKA
ARSALEYADANMFFEIEDVLIDSMISCSNMKSKEYGKVYKIHRELSNSVITEFEAVKRLGKLNIKTPEMNSFSRLLLLYH
YLSTGNFSPMAQLIKQIDLSEISENMYIRNTYQTRVHVLMSNIKLNENSLEECREYSKKALESTNILRFQVFSYLTIGNS
LLFSNYELAQENFLKGLSISVQNENYNMIFQQALCFLNNVWRKENKWINFESDSIMDLQEQAHCFINFNENSKAKEVLDK
LDLLVHNDNELAMHYYLKGRLEQNKACFYSSIEYFKKSNDKFLIRLPLLELQKMGENQKLLELLLL
>P31748 2.7.11.1~~~V-AKT~~~AKT kinase-transforming protein~~~
AREETLIIIPGLPLSLGATDTMNDVAIVKEGWLHKRGEYIKTWRPRYFLLKNDGTFIGYKERPQDVDQRESPLNNFSVAQ
CQLMKTERPRPNTFIIRCLQWTTVIERTFHVETPEEREEWATAIQTVADGLKRQEEETMDFRSGSPSDNSGAEEMEVSLA
KPKHRVTMNEFEYLKLLGKGTFGKVILVKEKATGRYYAMKILKKEVIVAKDEVAHTLTENRVLQNSRHPFLTALKYSFQT
HDRLCFVMEYANGGELFFHLSRERVFSEDRARFYGAEIVSALDYLHSEKNVVYRDLKLENLMLDKDGHIKITDFGLCKEG
IKDGATMKTFCGTPEYLAPEVLEDNDYGRAVDWWGLGVVMYEMMCGRLPFYNQDHEKLFELILMEEIRFPRTLGPEAKSL
LSGLLKKDPTQRLGGGSEDAKEIMQHRFFANIVWQDVYEKKLSPPFKPQVTSETDTRYFDEEFTAQMITITPPDQDDSME
CVDSERRPHFPQFSYSASGTA
>P17595 ~~~~~~Replication protein alpha-A~~~
MASDEIVRNLISREEVMGNLISTASSSVRSPLHDVLCSHVRTIVDSVDKKAVSRKHEDVRRNISSEELQMLINAYPEYAV
SSSACESGTHSMAACFRFLETEYLLDMVPMKETFVYDIGGNWFSHMKFRADREIHCCCPILSMRDSERLETRMMAMQKYM
RGSKDKPLRLLSRYQNILREQAARTTAFMAGEVNAGVLDGDVFCENTFQDCVRRVPEGFLKTAIAVHSIYDIKVEEFASA
LKRKGITQAYGCFLFPPAVLIGQKEGILPSVDGHYLVENGRIKFFFANDPNAGYSHDLKDYLKYVEKTYVDIEDGVFAIE
LMQMRGDTMFFKITDVTAAMYHMKYRGMKRDETFKCIPLLKNSSVVVPLFSWDNRSLKITSGLLPRTLVEQGAAFIMKNK
EKDLNVAVLKNYLSAVNNSYIFNGSQVRDGVKIAPDLISKLAVTLYLREKVYRQRENSIISYFEQEMLHDPNLKAMFGDF
LWFVPNTLSSVWKNMRKSLMEWFGYAEFDLTTFDICDPVLYVEIVDRYKIIQKGRIPLGEFFDCHEECENYELREKEKND
LAVKMAQKVTGTVTECEKDLGPLVQPIKEILVQLVMPNLVRALCRPRSPTSPLDLKSIPGSTPSHSSSDSEHSMTEEASC
TIAGSVPTWEIATRKDLTFQRIDEDMSRRTGMPPRPKVTSSYNMNARAEFLYYQLCSVICERAQILSVIEDFRQNLIFSD
KVAVPLNARFYSFQSLRPGWVFKTPSHSEVGHSYAVHFDFKTIGTDLEESLAFCRMVPISWDKSGKYIATTPHFPERHGY
YVICDNTKLCNNWLIYNKLVDVYALVADRPLRFELIDGVPGCGKSTMILNSCDIRREVVVGEGRNATDDLRERFKRKKNL
NSKTANHRVRTLDSLLLAEGPCVPQADRFHFDEALKVHYGAIMFCADKLGASEILAQGDRAQLPMICRVEGIELQFQSPD
YTKTIINPKLRSYRIPGDVAFYLSAKEFYKVKGIPQKVITSNSVKRSLYARGETTPERFVSLLDVPVRKDTHYLTFLQAE
KESLMSHLIPKGVKKESISTIHEAQGGTYENVILVRLQRTPNEIYPGGPRSAPYIVVGTSRHTKTFTYCSVTDDKLLLDI
ADVGGIAHTPIRTFESHIV
>P12726 2.4.2.31~~~alt~~~NAD(+)--arginine ADP-ribosyltransferase~~~
MELITELFDEDTTLPITNLYPKKKIPQIFSVHVDDAIEQPGFRLCTYTSGGDTNRDLKMGDKMMHIVPFTLTAKGSIAKL
KGLGPSPINYINSVFTVAMQTMRQYKIDACMLRILKSKTAGQARQIQVIADRLIRSRSGGRYVLLKELWDYDKKYAYILI
HRKNVSLEDIPGVPEISTELFTKVESKVGDVYINKDTGAQVTKNEAIAASIAQENDKRSDQAVIVKVKISRRAIAQSQSL
ESSRFETPMFQKFEASAAELNKPADAPLISDSNELTVISTSGFALENALSSVTAGMAFREASIIPEDKESIINAEIKNKA
LERLRKESITSIKTLETIASIVDDTLEKYKGAWFERNINKHSHLNQDAANELVQNSWNAIKTKIIRRELRGYALTAGWSL
HPIVENKDSSKYTPAQKRGIREYVGSGYVDINNALLGLYNPDERTSILTASDIEKAIDNLDSAFKNGERLPKGITLYRSQ
RMLPSIYEAMVKNRVFYFRNFVSTSLYPNIFGTWMTDSSIGVLPDEKRLSVSIDKTDEGLVNSSDNLVGIGWVITGADKV
NVVLPGGSLAPSNEMEVILPRGLMVKVNKITDASYNDGTVKTNNKLIQAEVMTTEELTESVIYDGDHLMETGELVTMTGD
IEDRVDFASFVSSNVKQKVESSLGIIASCIDIANMPYKFVQG
>O03979 3.5.1.28~~~PAL~~~Lysin~~~
MGVDIEKGVAWMQARKGRVSYSMDFRDGPDSYDCSSSMYYALRSAGASSAGWAVNTEYMHAWLIENGYELISENAPWDAK
RGDIFIWGRKGASAGAGGHTGMFIDSDNIIHCNYAYDGISVNDHDERWYYAGQPYYYVYRLTNANAQPAEKKLGWQKDAT
GFWYARANGTYPKDEFEYIEENKSWFYFDDQGYMLAEKWLKHTDGNWYWFDRDGYMATSWKRIGESWYYFNRDGSMVTGW
IKYYDNWYYCDATNGDMKSNAFIRYNDGWYLLLPDGRLADKPQFTVEPDGLITAKV
>P32762 3.5.1.28~~~HBL~~~Lytic amidase~~~
MDIDRNRLRTGLPQVGVQPYRQVHAHSTGNRNSTVQNEADYHWRKDPELGFFSHVVGNFRIMQVGPVNNGSWDVGGGWNA
ETYAAVELIESHSTKEEFMADYRLYIELLRNLADEAGLPKTLDTDDLAGIKTHEYCTNNQPNNHSDHVDPYPYLASWGIS
REQFKQDIENGLSAATGWQKNGTGYWYVHSDGSYSKDKFEKINGTWYYFDGSGYMLSDRWKKHTDGNWYYFDQSGEMATG
WKKIADKWYYFDVEGAMKTGWVKYKDTWYYLDAKEGAMVSNAFIQSADGTGWYYLKPDGTLADKPEFTVEPDGLITVK
>P13304 ~~~rI~~~Antiholin~~~
MALKATALFAMLGLSFVLSPSIEANVDPHFDKFMESGIRHVYMLFENKSVESSEQFYSFMRTTYKNDPCSSDFECIERGA
EMAQSYARIMNIKLETE
>P03217 3.1.-.-~~~BGLF5~~~Shutoff alkaline exonuclease~~~
MADVDELEDPMEEMTSYTFARFLRSPETEAFVRNLDRPPQMPAMRFVYLYCLCKQIQEFSGETGFCDFVSSLVQENDSKD
GPSLKSIYWGLQEATDEQRTVLCSYVESMTRGQSENLMWDILRNGIISSSKLLSTIKNGPTKVFEPAPISTNHYFGGPVA
FGLRCEDTVKDIVCKLICGDASANRQFGFMISPTDGIFGVSLDLCVNVESQGDFILFTDRSCIYEIKCRFKYLFSKSEFD
PIYPSYTALYKRPCKRSFIRFINSIARPTVEYVPDGRLPSEGDYLLTQDEAWNLKDVRKRKLGPGHDLVADSLAANRGVE
SMLYVMTDPSENAGRIGIKDRVPVNIFINPRHNYFYQVLLQYKIVGDYVRHSGGGKPGRDCSPRVNIVTAFFRKRSPLDP
ATCTLGSDLLLDASVEIPVAVLVTPVVLPDSVIRKTLSTAAGSWKAYADNTFDTAPWVPSGLFADDESTP
>P04294 3.1.-.-~~~~~~Alkaline nuclease~~~
MESTVGPACPPGRTVTKRPWALAEDTPRGPDSPPKRPRPNSLPLTTTFRPLPPPPQTTSAVDPSSHSPVNPPRDQHATDT
ADEKPRAASPALSDASGPPTPDIPLSPGGTHARDPDADPDSPDLDSMWSASVIPNALPSHILAETFERHLRGLLRGVRAP
LAIGPLWARLDYLCSLAVVLEEAGMVDRGLGRHLWRLTRRGPPAAADAVAPRPLMGFYEAATQNQADCQLWALLRRGLTT
ASTLRWGPQGPCFSPQWLKHNASLRPDVQSSAVMFGRVNEPTARSLLFRYCVGRADDGGEAGADTRRFIFHEPSDLAEEN
VHTCGVLMDGHTGMVGASLDILVCPRDIHGYLAPVPKTPLAFYEVKCRAKYAFDPMDPSDPTASAYEDLMAHRSPEAFRA
FIRSIPKPSVRYFAPGRVPGPEEALVTQDQAWSEAHASGEKRRCSAADRALVELNSGVVSEVLLFGAPDLGRHTISPVSW
SSGDLVRREPVFANPRHPNFKQILVQGYVLDSHFPDCPPHPHLVTFIGRHRTSAEEGVTFRLEDGAGALGAAGPSKASIL
PNQAVPIALIITPVRIDPEIYKAIQRSSRLAFDDTLAELWASRSPGPGPAAAETTSSSPTTGRSSR
>Q2HR95 3.1.-.-~~~~~~Shutoff alkaline exonuclease~~~
MEATPTPADLFSEDYLVDTLDGLTVDDQQAVLASLSFSKFLKHAKVRDWCAQAKIQPSMPALRMAYNYFLFSKVGEFIGS
EDVCNFFVDRVFGGVRLLDVASVYAACSQMNAHQRHHICCLVERATSSQSLNPVWDALRDGIISSSKFHWAVKQQNTSKK
IFSPWPITNNHFVAGPLAFGLRCEEVVKTLLATLLHPDEANCLDYGFMQSPQNGIFGVSLDFAANVKTDTEGRLQFDPNC
KVYEIKCRFKYTFAKMECDPIYAAYQRLYEAPGKLALKDFFYSISKPAVEYVGLGKLPSESDYLVAYDQEWEACPRKKRK
LTPLHNLIRECILHNSTTESDVYVLTDPQDTRGQISIKARFKANLFVNVRHSYFYQVLLQSSIVEEYIGLDSGIPRLGSP
KYYIATGFFRKRGYQDPVNCTIGGDALDPHVEIPTLLIVTPVYFPRGAKHRLLHQAANFWSRSAKDTFPYIKWDFSYLSA
NVPHSP
>P24731 3.1.-.-~~~AN~~~Alkaline nuclease~~~
MFASLTSEQKLLLKKYKFNNYVKTIELSQAQLAHWRSNKDIQPKPLDRAEILRVEKATRGQSKNELWTLLRLDRNTASAS
SNSSGNMLQRPALLFGNAQESHVKETNGIMLDHMREIIESKIMSAVVETVLDCGMFFSPLGLHAASPDAYFSLADGTWIP
VEIKCPYNYRDTTVEQMRVELGNGNRKYRVKHTALLVNKKGTPQFEMVKTDAHYKQMQRQMYVMNAPMGFYVVKFKQNLV
VVSVPRDETFCNKELSTENNAYVAFAVENSNCARYQCADKRRLSFKTHSCNHNYSGQEIDAMVDRGIYLDYGHLKCAYCD
FSSDSRETCDSVLKREHTNCKSFNLKHKNFDNPTYFDYVKRLQSLLKSHHFRNDAKTLAYFGYYLTHTGTLKTFCCGSQN
SSPTKHDHLNDCVYYLEIK
>P59632 ~~~3a~~~ORF3a protein~~~
MDLFMRFFTLGSITAQPVKIDNASPASTVHATATIPLQASLPFGWLVIGVAFLAVFQSATKIIALNKRWQLALYKGFQFI
CNLLLLFVTIYSHLLLVAAGMEAQFLYLYALIYFLQCINACRIIMRCWLCWKCKSKNPLLYDANYFVCWHTHNYDYCIPY
NSVTDTIVVTEGDGISTPKLKEDYQIGGYSEDRHSGVKDYVVVHGYFTEVYYQLESTQITTDTGIENATFFIFNKLVKDP
PNVQIHTIDGSSGVANPAMDPIYDEPTTTTSVPL
>P0DTC3 ~~~3a~~~ORF3a protein~~~
MDLFMRIFTIGTVTLKQGEIKDATPSDFVRATATIPIQASLPFGWLIVGVALLAVFQSASKIITLKKRWQLALSKGVHFV
CNLLLLFVTVYSHLLLVAAGLEAPFLYLYALVYFLQSINFVRIIMRLWLCWKCRSKNPLLYDANYFLCWHTNCYDYCIPY
NSVTSSIVITSGDGTTSPISEHDYQIGGYTEKWESGVKDCVVLHSYFTSDYYQLYSTQLSTDTGVEHVTFFIYNKIVDEP
EEHVQIHTIDGSSGVVNPVMEPIYDEPTTTTSVPL
>Q65202 3.1.21.-~~~~~~Probable AP endonuclease~~~
MFGAFVSHRLWSDSGCTTTCITNSIANYVAFGEQIGFPFKSAQVFIAGPRKAVINIQEDDKVELLKMIVKHNLWVVAHGT
YLDVPWSRRSAFVTHFIQQELLICKEVGIKGLVLHLGAVEPELIVEGLKKIKPVEGVVIYLETPHNKHHTYKYSTMEQIK
ELFLRIRNTRLKQIGLCIDTAHIWSSGVNISSYNDAGQWLRSLENIHSVIPPSHIMFHLNDAATECGSGIDRHASLFEGM
IWKSYSHKIKKSGLYCFVEYITRHQCPAILERNLGSSMQLQTALTAEFTTLKSLLK
>P0C9C6 3.1.21.-~~~~~~Probable AP endonuclease~~~
MFGAFVSHRLWSDSGCTTTCITNSIANYVAFGEQIGFPFKSAQVFIAGPRKAVINIQEDDKVELLKMIVKHNLWVVAHGT
YLDVPWSRRSAFVTHFIQQELLICKEVGIKGLVLHLGAVEPELIVEGLKKIKPVEGVVIYLETPHNKHHTYKYSTMEQIK
ELFLRIRNTRLKQIGLCIDTAHIWSSGVNISSYNDAGQWLRSLENIHSVIPPSHIMFHLNDAATECGSGIDRHASLFEGM
IWKSYSHKIKQSGLYCFVEYITRHQCPAILERNLGSSMQLQTALTAEFTTLKSLLK
>Q6QGS2 3.1.-.-~~~~~~DNA-(apurinic or apyrimidinic site) endonuclease~~~
MVIYEGNRFVAICRPGLLANYLNQVSPEYKGVINIYEGKAHYKINAVVARELAFQFLTFASCDIQVMGEALIAENEDDFI
NIFRKVYTERLIMKGAYIQSTAESIETAFRKVAQ
>A0A345MJY6 ~~~~~~Anti-Pycsar protein Apyc1~~~
MLHTTQIRMVGTGSAFSKKFYNNSALVTFTNGYNLLIDCGHSVPKGLHDADIPLESIDGILITHTHADHIGGLEEVALYN
KFVLGGRKIDLLVPNTLVESLWENSLKGGLRYSDTYDDLSLSDYFTVRSLKTFTSGAARTQLEENIAIKLYPTFHVSHMA
SYAVGLEDRGEDKVFYSSDTIFDEYLIDYALTYSWVFHDCQFFTGGVHASLDELLNYIPEEDQDRVFLMHYGDNMEDFFT
KTGRMRFALQGRTYIL
>P0DTL0 ~~~~~~Anti-Pycsar protein Apyc1~~~
MAHTTQLTMVGTGSAFSKKFYNNSALVQFTNGYNLLIDCGHSVPKGLHDLGIPLESIDGILITHTHADHIGGLEEVALYN
KFVLGGRKIDLLVPETLVEPLWENSLKGGLRYPDEDSPEPELSDYFTVRSLKTSDYGVAHTQIEENMAVRLYPTVHVSHM
DSYAVGLVDRGEDKVFYSSDTIFDEYLIDYALTYPWVFHDCQFFTGGVHASLDELLNYIPEEDQARVFLMHYGDNLEDFF
NKTGRMRFALQGRTYIL
>P42485 ~~~~~~Apoptosis regulator Bcl-2 homolog~~~
MEGEELIYHNIINEILVGYIKYYINDISEHELSPYQQQIKKILTYYDECLNKQVTITFSLTSVQEIKTQFTGVVTELFKD
LINWGRICGFIVFSAKMAKYCKDANNHLESTVITTAYNFMKHNLLPWMISHGGQEEFLAFSLHSDMYSVIFNIKYFLSKF
CNHMFFRSCVQLLRNCNLI
>Q07818 ~~~LMW5-HL~~~Apoptosis regulator Bcl-2 homolog~~~
MEGEELIYHNIINEILVGYIKYYINDISEHELSPYQQQIKKILTYYDECLNKQVTITFSLTSVQEIKTQFTGVVTELFRD
LINWGRICGFIVFSARMAKYCKDANNHLESTVITTAYNFMKHNLLPWMISHGGQEEFLAFSLHSDMYSVIFNIKYFLSKF
CNHMFFRSCVQLLRNCNLI
>Q07819 ~~~~~~Apoptosis regulator Bcl-2 homolog~~~
MEGEELIYHNIINEILVGYIKYYMNDISEHELSPYQQQIKKILTYYDECLNKQVTITFSLTNAQEIKTQFTGVVTELFKD
LINWGRICGFIVFSARMAKYCKDANNHLESTVITTAYNFMKHNLLPWMISHGGQEEFLAFSLHSDIYSVIFNIKYFLSKF
CNHMFLRSCVQLLRNCNLI
>Q6VZT9 ~~~~~~Apoptosis regulator Bcl-2 homolog~~~
MDPSVKKDEIYYTILNIIQNYFIEYCTGKNRNFHVEDENTYIIVKNMCDIILRDNIVEFRKDIDRCSDIENEIPEIVYDT
IHDKITWGRVISIIAFGAYVTKVFKEKGRDNVVDLMPDIITESLLSRCRSWLSDQNCWDGLKAYVYNNKKFYYVTRYFRV
AAFIITSLAVINLFL
>Q9J5G4 ~~~~~~Apoptosis regulator Bcl-2 homolog~~~
MASSNMKDETYYIALNMIQNYIIEYNTNKPRKSFVIDSISYDVLKAACKSVIKTNYNEFDIIISRNIDFNVIVTQVLEDK
INWGRIITIIAFCAYYSKKVKQDTSPQYYDGIISEAITDAILSKYRSWFIDQDYWNGIRIYKNYSYIFNTASYCIFTASL
IIASLAVFKICSFYM
>F5HGJ3 ~~~vBCL2~~~Apoptosis regulator Bcl-2 homolog~~~
MDEDVLPGEVLAIEGIFMACGLNEPEYLYHPLLSPIKLYITGLMRDKESLFEAMLANVRFHSTTGINQLGLSMLQVSGDG
NMNWGRALAILTFGSFVAQKLSNEPHLRDFALAVLPVYAYEAIGPQWFRARGGWRGLKAYCTQVLTRRRGRRMTALLGSI
ALLATILAAVAMSRR
>P89884 ~~~vBCL2~~~Apoptosis regulator Bcl-2 homolog~~~
MSHKKSGTYWATLITAFLKTVSKVEELDCVDSAVLVDVSKIITLTQEFRRHYDSVYRADYGPALKNWKRDLSKLFTSLFV
DVINSGRIVGFFDVGRYVCEEVLCPGSWTEDHELLNDCMTHFFIENNLMNHFPLEDIFLAQRKFQTTGFTFLLHALAKVL
PRIYSGNVIYV
>Q00901 2.4.2.-~~~~~~Mono-ADP-ribosyltransferase C3~~~
MKGIRKSILCLVLSAGVIAPVTTSIVQSPQKCYACTVDKGSYADTFTEFTNVEEAKKWGNAQYKKYGLSKPEQEAIKFYT
RDASKINGPLRANQGNENGLPADILQKVKLIDQSFSKMKMPQNIILFRGDDPAYLGPEFQDKILNKDGTINKTVFEQVKA
KFLKKDRTEYGYISTSLMSAQFGGRPIVTKFKVTNGSKGGYIDPISYFPGQLEVLLPRNNSYYISDMQISPNNRQIMITA
MIFK
>P15879 2.4.2.-~~~C3~~~Mono-ADP-ribosyltransferase C3~~~
MKGLRKSILCLVLSAGVIAPVTSGMIQSPQKCYAYSINQKAYSNTYQEFTNIDQAKAWGNAQYKKYGLSKSEKEAIVSYT
KSASEINGKLRQNKGVINGFPSNLIKQVELLDKSFNKMKTPENIMLFRGDDPAYLGTEFQNTLLNSNGTINKTAFEKAKA
KFLNKDRLEYGYISTSLMNVSQFAGRPIITKFKVAKGSKAGYIDPISAFAGQLEMLLPRHSTYHIDDMRLSSDGKQIIIT
ATMMGTAINPK
>P39510 ~~~arn~~~Anti-restriction endonuclease~~~
MIIDSQSVVQYTFKIDILEKLYKFLPNLYHSIVNELVEELHLENNDFLIGTYKDLSKAGYFYVIPAPGKNIDDVLKTIMI
YVHDYEIEDYFE
>P32267 ~~~asiA~~~10 kDa anti-sigma factor~~~
MNKNIDTVREIITVASILIKFSREDIVENRANFIAFLNEIGVTHEGRKLNQNSFRKIVSELTQEDKKTLIDEFNEGFEGV
YRYLEMYTNK
>P24759 ~~~~~~A-type inclusion protein A25~~~
MEVTNLIEKCTKHSKDFATEVKKLWNDELSSESGLSRKTRNVIRNILRDITKSLTTDKKSKCFRILERSTINGEQIKDVY
KTIFNNGVDVESRINTTGKYVLFTVMTYVAAELRLIKSDEIFALLSRFFNMICDIHRKYGCGNMFVGIPAALIILLEIDH
INKLFSVFSTRYDAKAYLYTEYFLFLNINHYLLSGSDLFINVAYGAVSFSSPISVPDYIMEALTFKACDHIMKSGDLKYT
YAFTKKVKDLFNTKSDSIYQYVRLHEMSYDGVSEDTDDDDEVFAILNLSIDSSVDRYRNRVLLLTPEVASLRKEYSDVEP
DYKYLMDEEVPAYDKHLPKPITNTGIEEPHATGGDEDQPIKVVHPPNNDKDDAIKPYNPLEDPNYVPTITRTAIGIADYQ
LVINKLIEWLDKCEEECGNSGEFKTELEEAKRKLTELNAELSDKLSKIRTLERDSVYKTERIDRLTKEIKEHRDIQNGTD
DGSDLLEIDKKTIRELRESLDREREMRSELEKELDTIRNGKVDGSCQRELELSRMWLKQRDDDLRAEIDKRRNVEWELSR
LRRDIKECDKYKEDLDKAKTTISNYVSKISTLESEIAKYQQDRDTLSVVRRELEEERRRVRDLESRLDECTRNQEDTQEV
DALRSRIRELENKLTDCIESGGGNLTEISRLQSKISDLERQLSECRENATEISRLQSRISDLERQLNDCRRNNETNAETE
RDATS
>A0A0S0MX59 ~~~~~~5-NmdU N-acetyltransferase~~~
MIVVRKALPEEHKEILQVAKQSKYTKDFSNQVMFSSEAAYNKGWIHVAEHEGEIRGFYCIREKVRAPETVLYFIGVAQEA
KGLGLGKKLIEHIMATTRHRRLTLNVNKQNEEARAFYDRLGFTVAGESLGGEGLALFKEW
>A0A385DVS7 ~~~~~~Auxiliary capsid protein~~~
MVISINQVRQLYVAKALKANTAALTTAGDIVPKADTAKTTLYFQSMSPAGIVASDKINLKHVLYAKATPSEALAHKLVRY
SVTLDADVSATPVAGQNYILRLAFRQYIGLSEEDQYFKYGEVIARSGMTASDFYKKMAISLAKNLENKTESTPLVNIYLI
SAAAASTDVPVTSATKESDLTATDYNQIIIEETEQPWVLGMMPQAFIPFTPQFLTITVDGEDRLWGVATVVTPTKTVPDG
HLIADLEYFCMGARGDIYRGMGYPNIIKTTYLVDPGAVYDVLDIHYFYTGSNESVQKSEKTITLVAVDDGSHTAMNALIG
AINTASGLTIATL
>P27269 ~~~V2~~~Protein V2~~~
MWDPLLNEFPESVHGFRCMLAIKYLQSVEETYEPNTLGHDLIRDLISVVRARDYVEATRRYNHFHARLEGSPKAELRQPI
QQPCCCPHCPRHKQATIMDVQAHVPKAQNIQNVSKP
>P20201 ~~~~~~Uncharacterized protein B-129~~~
MTESDVDSGSKKYLSNHKGIFIHVTLEELKRYHQLTPEQKRLIRAIVKTLIHNPQLLDESSYLYRLLASKAISQFVCPLC
LMPFSSSVSLKQHIRYTEHTKVCPVCKKEFTSTDSALDHVCKKHNICVS
>P68830 ~~~B2~~~Protein B2~~~
MPSKLALIQELPDRIQTAVEAAMGMSYQDAPNNVRRDLDNLHACLNKAKLTVSRMVTSLLEKPSVVAYLEGKAPEEAKPT
LEERLRKLELSHSLPTTGSDPPPAKL
>Q992I9 ~~~B2~~~Protein B2~~~
MQSKLALIQELPDRIQKAVEVVLAMSYQEAPNNVRRDLDNLQACLNKAKQTVNRMVTSLLDKPSMAAYLEGKPLPEERPT
LEERLRKLELSREPPPTRSDPAPAKL
>P68831 ~~~B2~~~Protein B2~~~
MPSKLALIQELPDRIQTAVEAAMGMSYQDAPNNVRRDLDNLHACLNKAKLTVSRMVTSLLEKPSVVAYLEGKAPEEAKPT
LEERLRKLELSHSLPTTGSDPPPAKL
>Q9IMM3 ~~~B2~~~Protein B2~~~
MTNMSCAYELIKSLPAKLEQLAQETQATIQTLMIADPNVNKDLRAFCEFLTVQHQRAYRATNSLLIKPRVAAALRGEELD
LGEADVAARVRQLKQQLAELEMEIKPGHQQVAQVSGRRKAAAAAPVAQLGRVGVVNE
>P0CK58 ~~~BALF1~~~Apoptosis regulator BALF1~~~
MNLAIALDSPHPGLASYTILPRPFYHISLKPVSWPDETMRPAKSTDSVFVRTPVEAWVAPSPPDDKVAESSYLMFRAMYA
VFTRDEKDLPLPALVLCRLIKASLRKDRKLYAELACRTADIGGKDTHVRLIISVLRAVYNDHYDYWSRLRVVLCYTVVFA
VRNYLDDHKSAAFVLGAIAHYLALYRRLWFARLGGMPRSLRRQFPVTWALASLTDFLKSL
>P03228 ~~~BARF1~~~Secreted protein BARF1~~~
MARFIAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPEIEVSWFKLGPGEEQVLIGRMHHDVIFIEWPFRGFFDI
HRSANTFFLVVTAANISHDGNYLCRMKLGETEVTKQEHLSVVKPLTLSVHSERSQFPDFSVLTVTCTVNAFPHPHVQWLM
PEGVEPAPTAANGGVMKEKDGSLSVAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ
>P0CW72 ~~~BARF1~~~Secreted protein BARF1~~~
MARFIAQLLLLASCVAAGQAVTAFLGERVTLTSYWRRVSLGPEIEVSWFKLGPGEEQVLIGRMHHDVIFIEWPFRGFFDI
HRSANTFFLVVTAANISHDGNYLCRMKLGETEVTKQEHLSVVKPLTLSVHSERSQFPDFSVLTVTCTVNAFPHPHVQWLM
PEGVEPAPTAANGGVMKEKDGSLSVAVDLSLPKPWHLPVTCVGKNDKEEAHGVYVSGYLSQ
>P03225 ~~~BDLF2~~~Protein BDLF2~~~
MVDEQVAVEHGTVSHTISREEDGVVHERRVLASGERVEVFYKAPAPRPREGRASTFHDFTVPAAAAVPGPEPEPEPHPPM
PIHANGGGETKTNTQDQNQNQTTRTRTNAKAEERTAEMDDTMASSGGQRGAPISADLLSLSSLTGRMAAMAPSWMKSEVC
GERMRFKEDVYDGEAETLAEPPRCFMLSFVFIYYCCYLAFLALLAFGFNPLFLPSFMPVGAKVLRGKGRDFGVPLSYGCP
TNPFCKVYTLIPAVVINNVTYYPNNTDSHGGHGGFEAAALHVAALFESGCPNLQAVTNRNRTFNVTRASGRVERRLVQDM
QRVLASAVVVMHHHCHYETYYVFDGVGPEFGTIPTPCFKDVLAFRPSLVTNCTAPLKTSVKGPNWSGAAGGMKRKQCRVD
RLTDRSFPAYLEEVMYVMVQ
>P03224 ~~~BDLF3~~~Glycoprotein BDLF3~~~
MAHARDKAGAVMAMILICETSLIWTSSGSSTASAGNVTGTTAVTTPSPSASGPSTNQSTTLTTTSAPITTTAILSTNTTT
VTFTGTTVTPVPTTSNASTINVTTKVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNATTLAPTLSSKGTSNA
TKTTAELPTVPDERQPSLSYGLPLWTLVFVGLTFLMLILIFAAGLMMSAKNKPLDEALLTNAVTRDPSLYKGLV
>P14353 ~~~bel1~~~Protein Bel-1~~~
MDSYEKEESVASTSGIQDLQTLSELVGPENAGEGELTIAEEPEENPRRPRRYTKREVKCVSYHAYKEIEDKHPQHIKLQD
WIPTPEEMSKSLCKRLILCGLYSAEKASEILRMPFTVSWEQSDTDPDCFIVSYTCIFCDAVIHDPMPIRWDPEVGIWVKY
KPLRGIVGSAVFIMHKHQRNCSLVKPSTSCSEGPKPRPRHDPVLRCDMFEKHHKPRQKRPRRRSIDNESCASSSDTMANE
PGSLCTNPLWNPGPLLSGLLEESSNLPNLEVHMSGGPFWEEVYGDSILGPPSGSGEHSVL
>P14355 ~~~bel3~~~Protein Bel-3~~~
MEIGVMQLCNMLIRLKGAVKQGAWHHLYLTTKLCSFIEPLWQTSPILGLEKDILLMVTKQLWKLMDLREEVTRRGCGGMS
LETRENKEESITGKEVKNLITQILLLLIDVPGMRDTRFLNCPHSLLPLTSNAELLKHCLMAGKWSPKAEMIILAAERSEH
>O93036 ~~~bet~~~Protein Bet~~~
MASKYPEEGPITEGVEEDFNSHSTSGLDLTSVGKNPEHPRRILLVLQHLIAYAEATIKKQDVPGPLLPILSPYVMAWDNP
QNVVTRLVNLGESWKKYLLSPGWKDCGERDLTMLTRELLVPGIGLVQIAATLTKTYVLMCNGRCITGSRTDPDCDPLFCK
LLCWKQNIQDPRECNLEEWCLYSLDPEHDPLWDPKMIVRRHRNLLPYCMRPFLIWMNYISHNPLTQQCIMMKTLNMLWRA
QADDPSDVASLYPRVKVFKASHFDIFGSASGNSEERVSWAKENSHRGEYSLLPSSDDEEEEMSEREELLCHINQCQQKLF
YPGGTTDVLGMESNVWLTKFVNIKFPKGTKVILPDGRKFIACDPELKPLLQELKFLDRATSESSDSE
>P89873 ~~~bet~~~Protein Bet~~~
MDSYEKEESVASTSGIQDLQTLSELVGPENAGEGELTIAEEPEENPRRPRRYTKREVKCVSYHAYKEIEDKHPQHIKLQD
WIPTPEEMIAQKVQNQDLGTILSFDVTCLKSITSLGRNDPGDDPSIMSHVLPVVTPWPMSQDHYAPTLFGILDRYYQGYL
KSPATYQTWKFTCQVDPSGKRFMETQFWVPPLGQVNIQFYKNYQILTCCQAVDPFANIFHGTDEEMFDIDSGPDVWCSPS
LCFKVIYEGAMGQKQEQKTWLCRLGHGHRMGACDYRKVDLYAMRQGKENPYGDRGDAALQYAYQVKRGCKAGCLASPVLN
YKALQFHRTIMADFTNPRIGEGHLAHGYQAAMEAYGPQRGSNEERVWWNVTRNQGKQGGEYYREGGEEPHYPNTPAPHRR
TWDERHKVLKLSSFATPSDIQRWTTKALPYGWKVVTESGNDYTSRRKIRTLTEMTQDEIRKRWESGYCDPFIDSGSDSDG
PF
>P14347 ~~~BFRF2~~~Late gene expression regulator BFRF2~~~
MALFLARHTLSGTGAGCHGRGPAPDVSEVDLTLQALGERGFSRLLDLGLACLDLSYVEMREFVVWGRPPASEAAVASTPG
SLFRSHSSAYWLSEVERPGGLVRWARSQTSPSSLTLAPHLGPSLLSLSVVTGGGCGAVAFCNAFFLAYFLVVRSVFPAFS
DRIAAWICDRSPFCENTRAVARGYRGLVKRFLAFVFERSSYDPPLLRQNSRPVERCFAIKNYVPGLDSQSCVTVPSFSRW
AQSHASELDPREIRDRVTPATAPSFVADHASALLASLQKKASDTPCGNPIQWMWYRLLVNSCLRSAHCLLPIPAVSEGGR
KTGGGVGEELVGAGGPCLSRDVFVAIVSRNVLSCLLNVPAAGPRAYKCFRSHASRPVSGPDYPPLAVFCMDCGYCLNFGK
QTGVGGRLNSFRPTLQFYPRDQKEKHVLTCHASGRVYCSNCGSAAVGCQRLAEPPSARSGWRPRIRAVLPHNAAYELDRG
SRLLDAIIPCLGPDRTCMRPVVLRGVTVRQLLYLTLRTEARAVCSICQQRQAPEDARDEPHLFSSCLEVELPPGERCAGC
RLYQTRYGTPAAQAHPPGEAGGGFSRQSPAS
>P03208 ~~~BILF1~~~G-protein coupled receptor BILF1~~~
MLSTMAPGSTVGTLVANMTSVNATEDACTKSYSAFLSGMTSLLLVLLILLTLAGILFIIFVRKLVHRMDVWLIALLIELL
LWVLGKMIQEFSSTGLCLLTQNMMFLGLMCSVWTHLGMALEKTLALFSRTPKRTSHRNVCLYLMGVFCLVLLLIIILLIT
MGPDANLNRGPNMCREGPTKGMHTAVQGLKAGCYLLAAVLIVLLTVIIIWKLLRTKFGRKPRLICNVTFTGLICAFSWFM
LSLPLLFLGEAGSLGFDCTESLVARYYPGPAACLALLLIILYAWSFSHFMDSLKNQVTVTARYFRRVPSQST
>P03218 ~~~BILF2~~~Glycoprotein BILF2~~~
MTHLVLLLCCCVGSVCAFFSDLVKFENVTAHAGARVNLTCSVPSNESVSRIELGRGYTPGDGQLPLAVATSNNGTHITNG
GYNYSLTLEWVNDSNTSVSLIIPNVTLAHAGYYTCNVTLRNCSVASGVHCNYSAGEEDDQYHANRTLTQRMHLTVIPATT
IAPTTLVSHTTSTSHRPHRRPVSKRPTHKPVTLGPFPIDPWRPKTTWVHWALLLITCAVVAPVLLIIIISCLGWLAGWGR
RRKGWIPL
>P08986 ~~~imm~~~Immunity protein~~~
METLVAGSIFMVLVSGVLAIIIYMLPWFIALMRGSKSTVGIFFASLLFNWSIIGWFITFIWSIAGETKKSAQPNQVIIIR
EKE
>P0C724 ~~~BKRF4~~~Tegument protein BKRF4~~~
MAMFLKSRGVRSCRDRRLLSDEEEETSQSSSYTLGSQASQSIQEEDVSDTDESDYSDEDEEIDLEEEYPSDEDPSEGSDS
DPSWHPSDSDESDYSESDEDEATPGSQASRSSRVSPSTQQSSGLTPTPSFSRPRTRAPPRPPAPAPVRGRASAPPRPPAP
VQQSTKDKGPHRPTRPVLRGPAPRRPPPPSSPNTYNKHMMETTPPIKGNNNYNWPWL
>P30117 ~~~BKRF4~~~Tegument protein BKRF4~~~
MAMFLKSRGVRSCRDRRLLSDEEEETSQSSSYTLGSQASQSIQEEDVSDTDESDYSDEDEEIDLEEEYPSDEDPSEGSDS
DPSWHPSDSDESDYSESDEDEATPGSQASRSSRVSPSTQQSSGLTPTPSFSRPRTRAPPRPPAPAPVRGRASAPPRPPAP
VQQSTKDKGPHRPTRPVLRGPAPRRPPPPSSPNTYNKHMMETTPPIKGNNNYNWPWL
>Q3KSS1 ~~~BKRF4~~~Tegument protein BKRF4~~~
MAMFLKSRGVRSCRDRRLLSDEEEETSQSSSYTLGSQASQSIQEEDVSDTDESDYSDEDEEIDLEEEYPSDEDPSEGSDS
DPSWHPSDSDESDYSESDEDEATPGSQASRSSRVSPSTQQSSGLTPTPSFSRPRTRAPPRPPAPAPVRGRASAPPRPPAP
VPQSTKDKVPNRPTRPVLRGPAPRRPPPPSSPNTYNKHMMETTPPIKGNNNYNWPWL
>P0C717 ~~~BLRF2~~~Tegument protein BLRF2~~~
MSAPRKVRLPSVKAVDMSMEDMAARLARLESENKALKQQVLRGGACASSTSVPSAPVPPPEPLTARQREVMITQATGRLA
SQAMKKIEDKVRKSVDGVTTRNEMENILQNLTLRIQVSMLGAKGQPSPGEGTRPRESNDPNATRRARSRSRGREAKKVQI
SD
>Q3KST5 ~~~BLRF2~~~Tegument protein BLRF2~~~
MSAPRKVRLPSVKAVDMSMEDMAARLARLESENKALKQQVLRGGACASSTSVPSAPVPPPEPLTARQREVMITQATGRLA
SQAMKKIEDKVRKSVDGVTTRNEMENILQNLTLRIQVSMLGAKGQPSPGEGTRLRESNDPNATRRARSRSRGREAKKVQI
SD
>P03493 ~~~M~~~Matrix protein 2~~~
MLEPLQILSICSFILSALHFMAWTIGHLNQIKRGVNLKIQIRNPNKEAINREVSILRHNYQKEIQAKETMKKILSDNMEV
LGDHIVVEGLSTDEIIKMGETVLEVEELQ
>P0C0X4 ~~~M~~~Matrix protein 2~~~
MLEPFQILSICSFILSALHFMGWTIGHLNQIKRGVNLKIRIRNPNKETINREVSILRHSYQKEIQAKETIKEVLSDNMER
LSDHIVIEGLSAEEIIKMGETVLEVEELH
>P03192 ~~~BMRF2~~~Protein BMRF2~~~
MFSCKQHLSLGACVFCLGLLASTPFIWCFVFANLLSLEIFSPWQTHVYRLGFPTACLMAVLWTLVPAKHAVRAVTPAIML
NIASALIFFSLRVYSTSTWVSAPCLFLANLPLLCLWPRLAIEIVYICPAIHQRFFELGLLLACTIFALSVVSRALEVSAV
FMSPFFIFLALGSGSLAGARRNQIYTSGLERRRSIFCARGDHSVASLKETLHKCPWDLLAISALTVLVVCVMIVLHVHAE
VFFGLSRYLPLFLCGAMASGGLYLGHSSIIACVMATLCTLTSVVVYFLHETLGPLGKTVLFISIFVYYFSGVAALSAAMR
YKLKKFVNGPLVHLRVVYMCCFVFTFCEYLLVTFIKS
>P0C739 ~~~BNLF2a~~~Protein BNLF2a~~~
MVHVLERALLEQQSSACGLPGSSTETRPSHPCPEDPDVSRLRLLLVVLCVLFGLLCLLLI
>O64174 1.17.4.1~~~bnrdF~~~Ribonucleoside-diphosphate reductase subunit beta~~~
MTKIYDAANWSKHEDDFTQMFYNQNVKQFWLPEEIALNGDLLTWKYLGKNEQDTYMKVLAGLTLLDTEQGNTGMPIVAEH
VDGHQRKAVLNFMAMMENAVHAKSYSNIFLTLAPTEQINEVFEWVKNNRFLQKKARTIVSVYKTIKKNDEISLFKGMVAS
VFLESFLFYSGFYYPLYFYGQGKLMQSGEIINLIIRDEAIHGVYVGLLAQEIYKKQTPQKQKELYAWALNLLQELYENEL
EYTEDVYDQVGLAPDVKKFIRYNANKALNNLGFDHWFEEEDVNPIVINGLNTKTKSHDFFSTKGNGYKKATVEPLKDSDF
IFTEKGCIQ
>P22499 ~~~bof~~~Modulator protein~~~
MKKRYYTVKHGTLRALQEFADKHNVEVRREGGSKALRMYRPDGKWRTVVDFKTNSVPQGVRDRAFEEWEQIIIDNALLLN
AD
>P26814 ~~~bor~~~Lipoprotein bor~~~
MKKMLLATALALLITGCAQQTFTVQNKPAAVAPKETITHHFFVSGIGQKKTVDAAKICGGAENVVKTETQQTFVNGLLGF
ITLGIYTPLEARVYCSQ
>P19060 ~~~6~~~Baseplate wedge protein gp6~~~
MANTPVNYQLTRTANAIPEIFVGGTFAEIKQNLIEWLNGQNEFLDYDFEGSRLNVLCDLLAYNTLYIQQFGNAAVYESFM
RTANLRSSVVQAAQDNGYLPTSKSAAQTEIMLTCTDALNRNYITIPRGTRFLAYAKDTSVNPYNFVSREDVIAIRDKNNQ
YFPRLKLAQGRIVRTEIIYDKLTPIIIYDKNIDRNQVKLYVDGAEWINWTRKSMVHAGSTSTIYYMRETIDGNTEFYFGE
GEISVNASEGALTANYIGGLKPTQNSTIVIEYISTNGADANGAVGFSYADTLTNITVININENPNDDPDFVGADGGGDPE
DIERIRELGTIKRETQQRCVTATDYDTFVSERFGSIIQAVQTFTDSTKPGYAFIAAKPKSGLYLTTVQREDIKNYLKDYN
LAPITPSIISPNYLFIKTNLKVTYALNKLQESEQWLEGQIIDKIDRYYTEDVEIFNSSFAKSKMLTYVDDADHSVIGSSA
TIQMVREVQNFYKTPEAGIKYNNQIKDRSMESNTFSFNSGRKVVNPDTGLEEDVLYDVRIVSTDRDSKGIGKVIIGPFAS
GDVTENENIQPYTGNDFNKLANSDGRDKYYVIGEINYPADVIYWNIAKINLTSEKFEVQTIELYSDPTDDVIFTRDGSLI
VFENDLRPQYLTIDLEPISQ
>P19061 ~~~7~~~Baseplate wedge protein gp7~~~
MTVKAPSVTSLRISKLSANQVQVRWDDVGANFYYFVEIAETKTNSGENLPSNQYRWINLGYTANNSFFFDDADPLTTYII
RVATAAQDFEQSDWIYTEEFETFATNAYTFQNMIEMQLANKFIQEKFTLNNSDYVNFNNDTIMAALMNESFQFSPSYVDV
SSISNFIIGENEYHEIQGSIQQVCKDINRVYLMESEGILYLFERYQPVVKVSNDKGQTWKAVKLFNDRVGYPLSKTVYYQ
SANTTYVLGYDKIFYGRKSTDVRWSADDVRFSSQDITFAKLGDQLHLGFDVEIFATYATLPANVYRIAEAITCTDDYIYV
VARDKVRYIKTSNALIDFDPLSPTYSERLFEPDTMTITGNPKAVCYKMDSICDKVFALIIGEVETLNANPRTSKIIDSAD
KGIYVLNHDEKTWKRVFGNTEEERRRIQPGYANMSTDGKLVSLSSSNFKFLSDNVVNDPETAAKYQLIGAVKYEFPREWL
ADKHYHMMAFIADETSDWETFTPQPMKYYAEPFFNWSKKSNTRCWINNSDRAVVVYADLKYTKVIENIPETSPDRLVHEY
WDDGDCTIVMPNVKFTGFKKYASGMLFYKASGEIISYYDFNYRVRDTVEIIWKPTEVFLKAFLQNQEHETPWSPEEERGL
ADPDLRPLIGTMMPDSYLLQDSNFEAFCEAYIQYLSDGYGTQYNNLRNLIRNQYPREEHAWEYLWSEIYKRNIYLNADKR
DAVARFFESRSYDFYSTKGIEASYKFLFKVLYNEEVEIEIESGAGTEYDIIVQSDSLTEDLVGQTIYTATGRCNVTYIER
SYSNGKLQWTVTIHNLLGRLIAGQEVKAERLPSFEGEIIRGVKGKDLLQNNIDYINRSRSYYVMKIKSNLPSSRWKSDVI
RFVHPVGFGFIAITLLTMFINVGLTLKHTETIINKYKNYKWDSGLPTEYADRIAKLTPTGEIEHDSVTGEAIYEPGPMAG
VKYPLPDDYNAENNNSIFQGQLPSERRKLMSPLFDASGTTFAQFRDLVNKRLKDNIGNPRDPENPTQVKIDE
>P19062 ~~~8~~~Baseplate wedge protein gp8~~~
MNDSSVIYRAIVTSKFRTEKMLNFYNSIGSGPDKNTIFITFGRSEPWSSNENEVGFAPPYPTDSVLGVTDMWTHMMGTVK
VLPSMLDAVIPRRDWGDTRYPDPYTFRINDIVVCNSAPYNATESGAGWLVYRCLDVPDTGMCSIASLTDKDECLKLGGKW
TPSARSMTPPEGRGDAEGTIEPGDGYVWEYLFEIPPDVSINRCTNEYIVVPWPEELKEDPTRWGYEDNLTWQQDDFGLIY
RVKANTIRFKAYLDSVYFPEAALPGNKGFRQISIITNPLEAKAHPNDPNVKAEKDYYDPEDLMRHSGEMIYMENRPPIIM
AMDQTEEINILFTF
>P10927 ~~~9~~~Baseplate protein gp9~~~
MFIQEPKKLIDTGEIGNASTGDILFDGGNKINSDFNAIYNAFGDQRKMAVANGTGADGQIIHATGYYQKHSITEYATPVK
VGTRHDIDTSTVGVKVIIERGELGDCVEFINSNGSISVTNPLTIQAIDSIKGVSGNLVVTSPYSKVTLRCISSDNSTSVW
NYSIESMFGQKESPAEGTWNISTSGSVDIPLFHRTEYNMAKLLVTCQSVDGRKIKTAEINILVDTVNSEVISSEYAVMRV
GNETEEDEIANIAFSIKENYVTATISSSTVGMRAAVKVIATQKIGVAQ
>P10928 ~~~~~~Baseplate wedge protein gp10~~~
MKQNINIGNVVDDGTGDYLRKGGIKINENFDELYYELGDGDVPYSAGAWKTYNASSGQTLTAEWGKSYAINTSSGRVTIN
LPKGTVNDYNKVIRARDVFATWNVNPVTLVAASGDTIKGSAVPVEINVRFSDLELVYCAPGRWEYVKNKQIDKITSSDIS
NVARKEFLVEVQGQTDFLDVFRGTSYNVNNIRVKHRGNELYYGDVFSENSDFGSPGENEGELVPLDGFNIRLRQPCNIGD
TVQIETFMDGVSQWRSSYTRRQIRLLDSKLTSKTSLEGSIYVTDLSTMKSIPFSAFGLIPGEPINPNSLEVRFNGILQEL
AGTVGMPLFHCVGADSDDEVECSVLGGTWEQSHTDYSVETDENGIPEILHFDSVFEHGDIINITWFNNDLGTLLTKDEII
DETDNLYVSQGPGVDISGDVNLTDFDKIGWPNVEAVQSYQRAFNAVSNIFDTIYPIGTIYENAVNPNNPVTYMGFGSWKL
FGQGKVLVGWNEDISDPNFALNNNDLDSGGNPSHTAGGTGGSTSVTLENANLPATETDEEVLIVDENGSVIVGGCQYDPD
ESGPIYTKYREAKASTNSTHTPPTSITNIQPYITVYRWIRIA
>P10929 ~~~~~~Baseplate wedge protein gp11~~~
MSLLNNKAGVISRLADFLGFRPKTGDIDVMNRQSVGSVTISQLAKGFYEPNIESAINDVHNFSIKDVGTIITNKTGVSPE
GVSQTDYWAFSGTVTDDSLPPGSPITVLVFGLPVSATTGMTAIEFVAKVRVALQEAIASFTAINSYKDHPTDGSKLEVTY
LDNQKHVLSTYSTYGITISQEIISESKPGYGTWNLLGAQTVTLDNQQTPTVFYHFERTA
>Q6QGE3 ~~~~~~Baseplate tube protein p140~~~
MFYSLMRESKIVIEYDGRGYHFDALSNYDASTSFQEFKTLRRTIHNRTNYADSIINAQDPSSISLAINFSTTLIESNFFD
WMGFTREGNSLFLPRNTPNIEPIMFNMYIINHNNSCIYFENCYVSTVDFSLDKSIPILNVGIESGKFSEVSTFRDGYTIT
QGEVLPYSAPAVYTNSSPLPALISASMSFQQQCSWREDRNIFDINKIYTNKRAYVNEMNASATLAFYYVKRLVGDKFLNL
DPETRTPLIIKNKYVSITFPLARISKRLNFSDLYQVEYDVIPTADSDPVEINFFGERK
>P09425 ~~~~~~Baseplate wedge protein gp25~~~
MANINKLYSDIDPEMKMDWNKDVSRSLGLRSIKNSLLGIITTRKGSRPFDPEFGCDLSDQLFENMTPLTADTVERNIESA
VRNYEPRIDKLAVNVIPVYDDYTLIVEIRFSVIDNPDDIEQIKLQLASSNRV
>Q6WHH0 ~~~~~~Probable baseplate hub protein gp334~~~
MFEMRSANPIENFVVSKIHIRGLDFAASVENEITHMEIYESLNGLVSGMFMFKDSIGVVDTIRMTGFEAIDVEFASYVGE
QANRVYQKSFRATGISRMPARTGGFETVLVRFTNNLLTLNDYVKRPYVFKKTSISNIIKAILDNLGDEKPEYEIETSLYQ
RDFVTKIGKPYDIIKSIVDHASTDVNNSCKFMFYEDRDSVKFASLGSIRDKEYEYIIRKGADTGDGKWTSGNTNTITALR
VVVKEQSNMHEISSGLFGSRTYSHSLIRKKLTTKDVRRNDYIAQVGILNDRAHMYTNELEFASEVPETEQPLNSIQLLPN
DGFYEHDNKHPLGSIHGVSLMEETYLKAKQIIVEIPGNTNITVGDVVFLDYHAVTGENHSSLDASGRWIVHELKHRVEPN
SFITTLELSSDSSVNIAIAGSKK
>P17172 ~~~~~~Baseplate central spike complex protein gp27~~~
MSMLQRPGYPNLSVKLFDSYDAWSNNRFVELAATITTLTMRDSLYGRNEGMLQFYDSKNIHTKMDGNEIIQISVANANDI
NNVKTRIYGCKHFSVSVDSKGDNIIAIELGTIHSIENLKFGRPFFPDAGESIKEMLGVIYQDRTLLTPAINAINAYVPDI
PWTSTFENYLSYVREVALAVGSDKFVFVWQDIMGVNMMDYDMMINQEPYPMIVGEPSLIGQFIQELKYPLAYDFVWLTKS
NPHKRDPMKNATIYAHSFLDSSIPMITTGKGENSIVVSRSGAYSEMTYRNGYEEAIRLQTMAQYDGYAKCSTIGNFNLTP
GVKIIFNDSKNQFKTEFYVDEVIHELSNNNSVTHLYMFTNATKLETIDPVKVKNEFKSDTTTEESSSSNKQ
>P85227 ~~~~~~Virion protein 3~~~
MAIATNNSRVYASLQLKNKQDSMYLAIGKTTPWTNEDAPPAPDPTTTTLTEVIGYKKVARVSLCREYLPSDDSKYPVVSY
GSRKFTLIPDEDGYKEQAWMVYVEAEITGDELPTGTFRQVGIHTDLVSKASSEKKALLPTDVTDAGILQFFENRQQQNRT
SDVILKEKFIITMENKKSVKQ
>P08558 ~~~P~~~Baseplate hub protein gp44~~~
MSNTVTLRADGRLFTGWTSVSVTRSIESVAGYFELGVNVPPGTDLSGLAPGKKFTLEIGGQIVCTGYIDSRRRQMTADSM
KITVAGRDKTADLIDCAAVYSGGQWKNRTLEQIARDLCAPYGVTVRWELSDKESSAAFPGFTLDHSETVYEALVRASRAR
GVLMTSNAAGELVFSRAASTATDELVLGENLLTLDFEEDFRDRFSEYTVKGYARANGAEGDDIDAKSIVSRKGTATDSDV
TRYRPMIIIADSKITAKDAQARALREQRRRLAKSITFEAEIDGWTRKDGQLWMPNLLVTIDASKYAIKTTELLVSKVTLI
LNDQDGLKTRVSLAPREGFLVPVESDRKNRKGGDSNGGIDALVEDYYRRHPEKTPPWKE
>Q9T1V4 ~~~~~~Baseplate puncturing device gp45~~~
MERVNDSALNRLLTPLMRRVRLMLARAVVNVINDGRKVQNLQVGLLDDEESDEVERLQNYGHFSVPLPGAEALIACVGAQ
RDQGIAVVVEDRRYRPTNLEPGDAGIYHHEGHRIRLTKDGRCIITCKTVEVYADESMTVDTPRTTFTGDVEIQKGLGVKG
KSQFDSNITAPDAIINGKSTDKHIHRGDSGGTTGPMQ
>Q9T1V3 ~~~~~~Baseplate protein gp46~~~
MTDLAIIWTNGRGDIAQDGIDMLTDDSLTTDVTISLFTDRRALDSDTLPDGSDDRRGWWGDSYRDRPIGSRLWLLSREKA
TPDTLERARGYAEEALEWLKTAGRVSAINVRAEQLHQGWLYLYIALTLPDGSVIPYEFKAAFNGV
>Q9T1V2 ~~~~~~Baseplate protein gp47~~~
MAYSPPTLSSLIARTEQNIEQRLPGSWPQAREKTLSAIAYAQAGLAAGCHEHISWVGRQIIPSTADEDELLEHCRFWGVR
RKQATAASGPLTVTTSAATTIPAGTRWQRADGVVYSLADTIVIDRAGTTEITVTALAAGEAGNTGENTLLTLITPVACVV
SDAITVKGFSGGADIESAAELLSRLEYRVQYPPFGGNQFDYVRWAREVSGVTRAWCFPTWKGGGTVGVTFVMDNRSNIFP
QPADVERVADYIAGHTDPITGLIVGQPDGVNVTVFAPKAKPVNPRIYISPKTAELKQAITNAINTMFFNEVMPGGALAPS
RIIRAVAGVTGLDDFEVRFPTEIQRSENTELLTAGTIEWL
>Q9T1V1 ~~~~~~Baseplate protein gp48~~~
MAVTPWQTAFLQLLPSGLAWNKSPDSKLSALAQAISDVIATAADDARQMLRERFPSTSRWYLGEWESFLGLPDCTSENGT
LSERQRAAANKMRMTGNLSRRFYEWLAAQYGFTVRLTDSTEGQWVTQVNIYGIKNYRNATVLDNVLTPLRVYESGALECL
LEKYKPAHQIYKFVYHDGDN
>P13339 ~~~~~~Baseplate tail-tube junction protein gp48~~~
MAIVKEITADLIKKSGEKISAGQSTKSEVGTKTYTAQFPTGRASGNDTTEDFQVTDLYKNGLLFTAYNMSSRDSGSLRSM
RSNYSSSSSSILRTARNTISSTVSKLSNGLISNNNSGTISKSPIANILLPRSKSDVDTSSHRFNDVQESLISRGGGTATG
VLSNIASTAVFGALESITQGIMADNNEQIYTTARSMYGGAENRTKVFTWDLTPRSTEDLMAIINIYQYFNYFSYGETGKS
QYAAEIKGYLDDWYRSTLIEPLSPEDAAKNKTLFEKMTSSLTNVLVVSNPTVWMVKNFGATSKFDGKTEIFGPCQIQSIR
FDKTPNGNFNGLAIAPNLPSTFTLEITMREIITLNRASLYAGTF
>P16011 ~~~~~~Baseplate wedge protein gp53~~~
MLFTFFDPIEYAAKTVNKNAPTIPMTDIFRNYKDYFKRALAGYRLRTYYIKGSPRPEELANAIYGNPQLYWVLLMCNDNY
DPYYGWITSQEAAYQASIQKYKNVGGDQIVYHVNENGEKFYNLISYDDNPYVWYDKGDKARKYPQYEGALAAVDTYEAAV
LENEKLRQIKIIAKSDINSFMNDLIRIMEKSYGNDK
>P13341 ~~~~~~Baseplate tail-tube junction protein gp54~~~
MYSLEEFNNQAINADFQRNNMFSCVFATTPSTKSSSLISSISNFSYNNLGLNSDWLGLTQGDINQGITTLITAGTQKLIR
KSGVSKYLIGAMSQRTVQSLLGSFTVGTYLIDFFNMAYNSSGLMIYSVKMPENRLSYETDWNYNSPNIRITGRELDPLVI
SFRMDSEACNYRAMQDWVNSVQDPVTGLRALPQDVEADIQVNLHSRNGLPHTAVMFTMHSISVSAPELSYDGDNQITTFD
VTFAYRVMQAGAVDRQRALEWLESAAINGIQSVLGNSGGVTGLSNSLSRLSRLGGTAGSISNINTMTGIVNSQSKILGAI
>Q6QGE9 ~~~~~~Baseplate hub protein pb3~~~
MKKILDSAKNYLNTHDKLKTACLIALELPSSSGSAATYIYLTDYFRDVTYNGILYRSGKVKSISSHKQNRQLSIGSLSFT
ITGTAEDEVLKLVQNGVSFLDRGITIHQAIINEEGNILPVDPDTDGPLLFFRGRITGGGIKDNVNTSGIGTSVITWNCSN
QFYDFDRVNGRYTDDASHRGLEVVNGTLQPSNGAKRPEYQEDYGFFHSNKSTTILAKYQVKEERYKLQSKKKLFGLSRSY
SLKKYYETVTKEVDLDFNLAAKFIPVVYGVQKIPGIPIFADTELNNPNIVYVVYAFAEGEIDGFLDFYIGDSPMICFDET
DSDTRTCFGRKKIVGDTMHRLAAGTSTSQPSVHGQEYKYNDGNGDIRIWTFHGKPDQTAAQVLVDIAKKKGFYLQNQNGN
GPEYWDSRYKLLDTAYAIVRFTINENRTEIPEISAEVQGKKVKVYNSDGTIKADKTSLNGIWQLMDYLTSDRYGADITLD
QFPLQKVISEAKILDIIDESYQTSWQPYWRYVGWNDPLSENRQIVQLNTILDTSESVFKNVQGILESFGGAINNLSGEYR
ITVEKYSTNPLRINFLDTYGDLDLSDTTGRNKFNSVQASLVDPALSWKTNSITFYNSKFKEQDKGLDKKLQLSFANITNY
YTARSYADRELKKSRYSRTLSFSVPYKFIGIEPNDPIAFTYERYGWKDKFFLVDEVENTRDGKINLVLQEYGEDVFINSE
QVDNSGNDIPDISNNVLPPRDFKYTPTPGGVVGAIGKNGELSWLPSLTNNVVYYSIAHSGHVNPYIVQQLENNPNERMIQ
EIIGEPAGLAIFELRAVDINGRRSSPVTLSVDLNSAKNLSVVSNFRVVNTASGDVTEFVGPDVKLAWDKIPEEEIIPEIY
YTLEIYDSQDRMLRSVRIEDVYTYDYLLTYNKADFALLNSGALGINRKLRFRIRAEGENGEQSVGWATI
>P51772 ~~~X~~~Baseplate protein X~~~
MKTFALQGDTLDAICVRYYGRTEGVVETVLAANPGLAELGAVLPHGTAVELPDVQTAPVAETVNLWE
>P03209 ~~~BRLF1~~~Replication and transcription activator~~~
MRPKKDGLEDFLRLTPEIKKQLGSLVSDYCNVLNKEFTAGSVEITLRSYKICKAFINEAKAHGREWGGLMATLNICNFWA
ILRNNRVRRRAENAGNDACSIACPIVMRYVLDHLIVVTDRFFIQAPSNRVMIPATIGTAMYKLLKHSRVRAYTYSKVLGV
DRAAIMASGKQVVEHLNRMEKEGLLSSKFKAFCKWVFTYPVLEEMFQTMVSSKTGHLTDDVKDVRALIKTLPRASYSSHA
GQRSYVSGVLPACLLSTKSKAVETPILVSGADRMDEELMGNDGGASHTEARYSESGQFHAFTDELESLPSPTMPLKPGAQ
SADCGDSSSSSSDSGNSDTEQSEREEARAEAPRLRAPKSRRTSRPNRGQTPCPSNAAEPEQPWIAAVHQESDERPIFPHP
SKPTFLPPVKRKKGLRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVG
SLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESM
TEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLF
>Q3KSS7 ~~~BRLF1~~~Replication and transcription activator~~~
MRPKKDGLEDFLRLTPEIKKQLGSLVSDYCNVLNKEFTAGSVEITLRSYKICKAFINEAKAHGREWGGLMATLNICNFWA
ILRNNRVRRRAENAGNDACSIACPIVMRYVLDHLIVVTDRFFIQAPSNRVMIPATIGTAMYKLLKHSRVRAYTYSKVLGV
DRAAIMASGKQVVEHLNRMEKEGLLSSKFKAFCKWVFTYPVLEEMFQTMVSSKTGHLTDDVKDVRALIKTLPRASYSSHA
GQRSYVSGVLPACLLSTKSKAVETPILVSGADRMDEELMGNDGGASHTEARYSESGQFHAFTDELESLPSPTMPLKPGAQ
SADCGDSSSSSSDSGNSDTEQSEREEARAEAPRLRAPKSRRTSRPNRGQTPCPSNAEEPEQPWIAAVHQESDERPIFPHP
SKPTFLPPVKRKKGLRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVG
SLTPAPVPKPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLNHPPPRGHLDELTTTLESM
TEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLF
>Q01012 ~~~~~~Putative transcription activator BRLF1 homolog~~~
MQRIPLWLVRSTHCLILLFQDDVQVRKSCLEPFLFLSPERKREIHQLLVAFNQSLVTPTQDEEKILSDIQRACLQIAEDL
KHLNPFTGLLLDLNLYTLWTLLRNYKTKQRSQPVNSTVVSRYAHHVVKYIMQRLVYTTDRLFLTAPTSGIVLPVPLANAI
FNLLSHCRKKCTGLWRNYGTEKSVLMGLGKEITLCYQALNESGIVSTTLAAFIKLSFPTISIPNLFKPMFQSCKGNQDNF
PDICTQGSVIRRPHQGVFGDTFPIPDPLMREISENSFKKFSTANISTLLQNPKEILEMDPFDPRIGGFPLNKEETATPLK
DSSFSNPTFINTGAANTLLPAASVTPALESLFSPTHFPCMSDESIASTSHVPLDNNISLPTLVKTNFPLKRKRQSRNIDP
NTPRRPRGRPKGSKTKKRPTCSPALFQSSDIPTDSLHVKCPEMLPTVPQNEFCDSSNIQPCTSSSVLENDNLVPINEAET
DDNILATILQDLYDLPAPPVLCSHENQTLEIDNNVDIEDLGLSFPMSLQDFLNDE
>Q1HVD3 ~~~BTRF1~~~Uncharacterized protein BTRF1~~~
MLKCKQPGARFIHGAVHLPSGQIVFHTIHSPTLASALGLPGENVPIPALFRASGLNVRESLPMTNMRAPIISLARLILAP
NPYILEGQLTVGMTQDNGIPVLFARPVIEVKSGPESNIKASSQLMIAEDSCLNQIAPFSASEHPAFSMVESVKRVRVDEG
ANTRRTIRDILEIPVTVLSSLQLSPTKSILKKAPEPPPPEPQATFDAAPYARIFYDIGRQVPKLGNAPAAQVSNVLIANR
SHNSLRLVPNPDLLPLQHLYLKHVVLKSLNLENIVQDFEAIFTSPSDTISEAETKAFEKLVEQAKNTVENIVFCLNSICS
TSTLPDVVPDVNNPNISLALEKYFLMFPPSGTIMRNVRFATPIVRLLCQGAELGTMAQFLGKYIKVKKETGMYTLVKLYY
LLRI
>P0C734 ~~~BTRF1~~~Uncharacterized protein BTRF1~~~
MLKCKQPGARFIHGAVHLPSGQIVFHTIHSPTLASALGLPGENVPIPALFRASGLNVRESLPMTNMRAPIISLARLILAP
NPYILEGQLTVGMTQDNGIPVLFARPVIEVKSGPESNIKASSQLMIAEDSCLNQIAPFSASEHPAFSMVESVKRVRVDEG
ANTRRTIRDILEIPVTVLSSLQLSPTKSILKKAPEPPPPEPQATFDATPYARIFYDIGRQVPKLGNAPAAQVSNVLIANR
SHNSLRLVPNPDLLPLQHLYLKHVVLKSLNLENIVQDFEAIFTSPSDTISEAETKAFEKLVEQAKNTVENIVFCLNSICS
TSTLPDVVPDVNNPNISLALEKYFLMFPPSGTIMRNVRFATPIVRLLCQGAELGTMAQFLGKYIKVKKETGMYTLVKLYY
LLRI
>P24649 ~~~~~~DNA-binding protein~~~
MVYRRRRRSSTGATYGLTRRRRSSAGITRRRRSSGYRRRPGRPRTYRRSRSRSLTSRRSYRTRYY
>P46081 ~~~~~~Non-toxic nonhemagglutinin type C~~~
MDINDDLNINSPVDNKNVVIVRARKTNTFFKAFKVAPNIWVAPERYYGEPLDIAEEYKLDGGIYDSNFLSQDSERENFLQ
AIIILLKRINNTISGKQLLSLISTAIPFPYGYIGGGYSSPNIFTFGKTPKSNKKLNSLVTSTIPFPFGGYRETNYIESQN
NKNFYASNIIIFGPGSNIVENNVIYYKKNDAENGMGTMAEIVFQPLLTYKYNKFYIDPAMELTKCLIKSLYFLYGIKPSD
NLVVPYRLRTELDNKQFSQLNIIDLLISGGVDLEFINTNPYWFTNSYFPNSIKMFEKYKNIYKTEIEGNNAIGNDIKLRL
KQKFQINVQDIWNLNLNYFCQSFNSIIPDRFSNALKHFYRKQYYTMDYTDNYNINGFVNGQINTKLPLSNKNTNIISKPE
KVVNLVNENNISLMKSNIYGDGLKGTTEDFYSTYKIPYNEEYEYRFNDSDNFPLNNISIEEVDSIPEIIDINPYKDNSDN
LVFTQITSMTEEVTTHTALSINYLQAQITNNENFTLSSDFSKVVSSKDKSLVYSFLDNLMSYLETIKNDGPIDTDKKYYL
WLKEVFKNYSFDINLTQEIDSMCGINEVVLWFGKALNILNTSNSFVEEYQDSGAISLISKKDNLREPNIEIDDISDSLLG
LSFKDLNNKLYEIYSKNIVYFKKIYFSFLDQWWTEYYSQYFELICMAKQSILAQESLVKQIVQNKFTDLSKASIPPDTLK
LIRETTEKTFIDLSNESQISMNRVDNFLNKASICVFVEDIYPKFISYMEKYINNINIKTREFIQRCTNINDNEKSILINS
YTFKTIDFKFLDIQSIKNFFNSQVEQVMKEILSPYQLLLFASKGPNSNIIEDISGKNTLIQYTESIELVYGVNGESLYLK
SPNETIKFSNKFFTNGLTNNFTICFWLRFTGKNDDKTRLIGNKVNNCGWEIYFEDNGLVFEIIDSNGNQESVYLSNIIND
NWYYISISVDRLKDQLLIFINDKNVANVSIDQILSIYSTNIISLVNKNNSIYVEELSVLDNPITSEEVIRNYFSYLDNSY
IRDSSKSLLEYNKNYQLYNYVFPETSLYEVNDNNKSYLSLKNTDGINISSVKFKLINIDESKVYVQKWDECIICVLDGTE
KYLDISPENNRIQLVSSKDNAKKITVNTDLFRPDCITFSYNDKYFSLSLRDGDYNWMICNDNNKVPKGAHLWILES
>P18640 ~~~~~~Botulinum neurotoxin type C~~~
MPITINNFNYSDPVDNKNILYLDTHLNTLANEPEKAFRITGNIWVIPDRFSRNSNPNLNKPPRVTSPKSGYYDPNYLSTD
SDKDTFLKEIIKLFKRINSREIGEELIYRLSTDIPFPGNNNTPINTFDFDVDFNSVDVKTRQGNNWVKTGSINPSVIITG
PRENIIDPETSTFKLTNNTFAAQEGFGALSIISISPRFMLTYSNATNDVGEGRFSKSEFCMDPILILMHELNHAMHNLYG
IAIPNDQTISSVTSNIFYSQYNVKLEYAEIYAFGGPTIDLIPKSARKYFEEKALDYYRSIAKRLNSITTANPSSFNKYIG
EYKQKLIRKYRFVVESSGEVTVNRNKFVELYNELTQIFTEFNYAKIYNVQNRKIYLSNVYTPVTANILDDNVYDIQNGFN
IPKSNLNVLFMGQNLSRNPALRKVNPENMLYLFTKFCHKAIDGRSLYNKTLDCRELLVKNTDLPFIGDISDVKTDIFLRK
DINEETEVIYYPDNVSVDQVILSKNTSEHGQLDLLYPSIDSESEILPGENQVFYDNRTQNVDYLNSYYYLESQKLSDNVE
DFTFTRSIEEALDNSAKVYTYFPTLANKVNAGVQGGLFLMWANDVVEDFTTNILRKDTLDKISDVSAIIPYIGPALNISN
SVRRGNFTEAFAVTGVTILLEAFPEFTIPALGAFVIYSKVQERNEIIKTIDNCLEQRIKRWKDSYEWMMGTWLSRIITQF
NNISYQMYDSLNYQAGAIKAKIDLEYKKYSGSDKENIKSQVENLKNSLDVKISEAMNNINKFIRECSVTYLFKNMLPKVI
DELNEFDRNTKAKLINLIDSHNIILVGEVDKLKAKVNNSFQNTIPFNIFSYTNNSLLKDIINEYFNNINDSKILSLQNRK
NTLVDTSGYNAEVSEEGDVQLNPIFPFDFKLGSSGEDRGKVIVTQNENIVYNSMYESFSISFWIRINKWVSNLPGYTIID
SVKNNSGWSIGIISNFLVFTLKQNEDSEQSINFSYDISNNAPGYNKWFFVTVTNNMMGNMKIYINGKLIDTIKVKELTGI
NFSKTITFEINKIPDTGLITSDSDNINMWIRDFYIFAKELDGKDINILFNSLQYTNVVKDYWGNDLRYNKEYYMVNIDYL
NRYMYANSRQIVFNTRRNNNDFNEGYKIIIKRIRGNTNDTRVRGGDILYFDMTINNKAYNLFMKNETMYADNHSTEDIYA
IGLREQTKDINDNIIFQIQPMNNTYYYASQIFKSNFNGENISGICSIGTYRFRLGGDWYRHNYLVPTVKQGNYASLLEST
STHWGFVPVSE
>Q9LBR2 ~~~ntnha~~~Non-toxic nonhemagglutinin type D~~~
MDINDDLNINSPVDNKNVVIVRARKTNTFFKAFKVAPNIWVAPERYYGEPLDIAEEYKLDGGIYDSNFLSQDSERENFLQ
AIITLLKRINNTISGKQLLSLISTAIPFPYGYVGGGYSSPNIFTFGKTPKSNKKLNSLVTSTIPFPFGGYRETNYIESQN
NKNFYASNIVIFGPGSNIVENNVICYKKNDAENGMGTMAEILFQPLLTYKYNKFYIDPAMELTKCLIKSLYFLYGIKPSD
DLVVPYRLRTELDNKQFSQLNIIDLLISGGVDLEFINTNPYWFTNSYFSNSIKMFEKYKNIYETEIEGNNAIGNDIKLRL
KQKFQNSVQDIWNLNLNYFSKEFNSIIPDRFSNALKHFYRKQYYTMDYGDNYNINGFVNGQINTKLPLSDKNTNIISKPE
KVVNLVNANNISLMKSNIYGDGLKGTTEDFYSTYKIPYNEEYEYRFNDSDNFPLNNISIEEVDSIPEIIDINPYKDNSDD
LLFTQITSTTEEVITHTALPVNYLQAQIITNENFTLSSDFSKVVSSKDKSLVYSFLDNLMSYLETIKNDGPIDTDKKYYL
WLKEVFKNYSFDINLTQEIDSSCGINEVVIWFGKALNILNTSNSFVEEYQNSGPISLISKKDNLSEPNIEIDDIPDSLLG
LSFKDLNNKLYEIYSKNRVYFRKIYFNFLDQWWTEYYSQYFELICMAKQSILAQESVVKQIIQNKFTDLSKASIPPDTLK
LIKETTEKTFIDLSNESQISMNRVDNFLNKASICVFVEDIYPKFISYMEKYINNINIKTREFIQRCTNINDNEKSILINS
YTFKTIDFKFLNIQAIKNFFNSQVEQVMKEMLSPYQLLLFATRGPNSNIIEDISGKNTLIQYTESVELVYGVNGESLYLK
SPNETVEFSNNFFTNGLTNNFTICFWLRFTGKDDDKTRLIGNKVNNCGWEIYFEDNGLVFEIIDSNGNQESVYLSNVINN
NWYYISISVDRLKDQLLIFINDKNVANVSIEQILNIYSTNVISLVNKNNSIYVEELSVLDKPVASEEVIRNYFSYLDNSY
IRDSSKSLLEYNKNYQLYNYVFPETSLYEVNDNNKSYLSLKNTDGINIPSVKFKLINIDESKGYVQKWDECIICVSDGTE
KYLDISPENNRIQLVSSKDNAKKITVNTDLFRPDCITFSYNDKYFSLSLRDGDYNWMICNDNNKVPKGAHLWILKS
>P19321 ~~~botD~~~Botulinum neurotoxin type D~~~
MTWPVKDFNYSDPVNDNDILYLRIPQNKLITTPVKAFMITQNIWVIPERFSSDTNPSLSKPPRPTSKYQSYYDPSYLSTD
EQKDTFLKGIIKLFKRINERDIGKKLINYLVVGSPFMGDSSTPEDTFDFTRHTTNIAVEKFENGSWKVTNIITPSVLIFG
PLPNILDYTASLTLQGQQSNPSFEGFGTLSILKVAPEFLLTFSDVTSNQSSAVLGKSIFCMDPVIALMHELTHSLHQLYG
INIPSDKRIRPQVSEGFFSQDGPNVQFEELYTFGGLDVEIIPQIERSQLREKALGHYKDIAKRLNNINKTIPSSWISNID
KYKKIFSEKYNFDKDNTGNFVVNIDKFNSLYSDLTNVMSEVVYSSQYNVKNRTHYFSRHYLPVFANILDDNIYTIRDGFN
LTNKGFNIENSGQNIERNPALQKLSSESVVDLFTKVCLRLTKNSRDDSTCIKVKNNRLPYVADKDSISQEIFENKIITDE
TNVQNYSDKFSLDESILDGQVPINPEIVDPLLPNVNMEPLNLPGEEIVFYDDITKYVDYLNSYYYLESQKLSNNVENITL
TTSVEEALGYSNKIYTFLPSLAEKVNKGVQAGLFLNWANEVVEDFTTNIMKKDTLDKISDVSVIIPYIGPALNIGNSALR
GNFNQAFATAGVAFLLEGFPEFTIPALGVFTFYSSIQEREKIIKTIENCLEQRVKRWKDSYQWMVSNWLSRITTQFNHIN
YQMYDSLSYQADAIKAKIDLEYKKYSGSDKENIKSQVENLKNSLDVKISEAMNNINKFIRECSVTYLFKNMLPKVIDELN
KFDLRTKTELINLIDSHNIILVGEVDRLKAKVNESFENTMPFNIFSYTNNSLLKDIINEYFNSINDSKILSLQNKKNALV
DTSGYNAEVRVGDNVQLNTIYTNDFKLSSSGDKIIVNLNNNILYSAIYENSSVSFWIKISKDLTNSHNEYTIINSIEQNS
GWKLCIRNGNIEWILQDVNRKYKSLIFDYSESLSHTGYTNKWFFVTITNNIMGYMKLYINGELKQSQKIEDLDEVKLDKT
IVFGIDENIDENQMLWIRDFNIFSKELSNEDINIVYEGQILRNVIKDYWGNPLKFDTEYYIINDNYIDRYIAPESNVLVL
VQYPDRSKLYTGNPITIKSVSDKNPYSRILNGDNIILHMLYNSRKYMIIRDTDTIYATQGGECSQNCVYALKLQSNLGNY
GIGIFSIKNIVSKNKYCSQIFSSFRENTMLLADIYKPWRFSFKNAYTPVAVTNYETKLLSTSSFWKFISRDPGWVE
>P03206 ~~~BZLF1~~~Lytic switch protein BZLF1~~~
MMDPNSTSEDVKFTPDPYQVPFVQAFDQATRVYQDLGGPSQAPLPCVLWPVLPEPLPQGQLTAYHVSTAPTGSWFSAPQP
APENAYQAYAAPQLFPVSDITQNQQTNQAGGEAPQPGDNSTVQTAAAVVFACPGANQGQQLADIGVPQPAPVAAPARRTR
KPQQPESLEECDSELEIKRYKNRVASRKCRAKFKQLLQHYREVAAAKSSENDRLRLLLKQMCPSLDVDSIIPRTPDVLHE
DLLNF
>Q3KSS8 ~~~BZLF1~~~Lytic switch protein BZLF1~~~
MMDPNSTSEDVKFTPDPYQVPFVQAFDQATRVYQDLGGPSQAPLPCVLWPVLPEPLPQGQLTAYHVSAAPTGSWFPAPQP
APENAYQAYAAPQLFPVSDITQNQLTNQAGGEAPQPGDNSTVQPAAAVVLACPGANQEQQLADIGAPQPAPAAAPARRTR
KPLQPESLEECDSELEIKRYKNRVASRKCRAKFKHLLQHYREVASAKSSENDRLRLLLKQMCPSLDVDSIIPRTPDVLHE
DLLNF
>P19194 ~~~ORF3~~~Internal scaffolding protein VP3~~~
MKFRTIYDEERPAPVLECKDESLCLAYQCTETSIEKLVKLANQNPSYLHAFAGDPTRQPEYGECPSPLDYQDALEIVARG
EEAFYSLPANIRVNFSNPMEFLSWLEDPANYDEVEKLGLLDPEKVQIRKSKLQKDQKEEVSSEEK
>P03296 ~~~~~~Protein C10~~~
MDIYDDKGLQTIKLFNNEFDCIRNDIRELFKHVTDSDSIQLPMEDNSDIIENIRKILYRRLKNVECVDIDSTITFMKYDP
NDDNKRTCSNWVPLTNNYMEYCLVIYLETPICGGKIKLYHPTGNIKSDKDIMFAKTLDFKSKKVLTGRKTIAVLDISVSY
NRSMTTIHYNDDVDIDIHTDKNGKELCYCYITIDDHYLVDVETIGVIVNRSGKCLLVNNHLGIGIVKDKRISDSFGDVCM
DTIFDFSEARELFSLTNDDNRNIAWDTDKLDDDTDIWTPVTEDDYKFLSRLVLYAKSQSDTVFDYYVLTGDTEPPTVFIF
KVTRFYFNMPK
>Q76ZJ3 ~~~~~~Truncated CrmB protein~~~
MKSVLYSYILFLSCIIINGRDIAPHAPSDGKCKDNEYKRHNLCPGTYASRLCDSKTNTQCTPCGSGTFTSRNNHLPACLS
CNGRRDRVTRLTIESVNALPDIIVFSKDHPDARHVFPKQNVE
>Q91J26 ~~~C2~~~Protein C2~~~
MENHVSLKVVSPALYYAIQDLRAHTNNFLKNQKMKPLSPGHYIIQPSANSKVRSLITKQQHPRKVTLPCNCHFTIHHECN
RGFSHRGTYYSPSGNKFRGIRECTESTVYETPMVREIRANLSTEDTNPIQLQPPESVESSQVLDRANDNRIEQDIDWTPF
LEGLEKETRDILG
>P25695 ~~~~~~Protein C42~~~
MSAIALYLEINKLRLKIDEPMQLAIWPQLFPLLCDEHQSVQLNTDVLINFMMHVARKSQNTILNNNAAIASQYAAGNADV
VAAPASAQPTPRPVINLFARANAAAPAQPSEELINMRRYRNAARKLIHHYSLNSTSSTEYKISDVVMTMIFLLRSEKYHS
LFKLLETTFDDYTCRPQMTQVQTDTLLDAVRSLLEMPSTTIDLTTVDIMRSSFARCFNSPIMRYAKIVLLQNVALQRDKR
TTLEELLIERGEKIQMLQPQQYINSGTEIPFCDDAEFLNRLLKHIDPYPLSRMYYNAANTMFYTTMENYAVSNCKFNIED
YNNIFKVMENIRKHSNKNSNDQDELNIYLGVQSSNAKRKKY
>A0A7H0DN92 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISDARLKTLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSTIHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGIIIIAIFLQVSDHKNVYFQKIVNQLDSIRSANMSAPFDSVFYLDNLL
PSTLDYFTYLGTTINHSADAAWIIFPTPINIHSDQLSKFRTLLSSSNHEGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPVRENYFMKWLSDLREACFSYYQKYIEGNKTFAIIAIVFVFILTAILFLMSQRYSREKQN
>Q8V4Y0 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISDTRLKTLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSTIHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGIIIIAIFLQVSDHKNVYFQKIVNQLDSIRSANMSAPFDSVFYLDNLL
PSTLDYFTYLGTTINHSADAAWIIFPTPINIHSDQLSKFRTLLSSSNHEGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPVRENYFMKWLSDLREACFSYYQKYIEGNKTFAIIAIVFVFILTAILFLMSQRYSREKQN
>Q6RZI9 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISNARLKPLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRSANTSAPFDSVFYLDNLL
PSTLDYFTYLGTTINHSADAAWIIFPTPINIHSDQLSKFRTLLSSSNHDGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPARDNYFMRWLSDLRETCFSYYQKYIEGNKTFAIIAIVFVFILTAILFLMSRRYSREKQN
>O57211 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISNARLKPLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSSLRIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRSTNTSAPFDSVFYLDNLL
PSKLDYFTYLGTTINHSADAVWIIFPTPINIHSDQLSKFRTLLSSSNHDGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPARENYFMRWLSDLRETCFSYYQKYIEGNKTFAIIAIVFVFILTAILFFMSQRYSREKQN
>P20508 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISNARLKPLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRSANTSAPFDSVFYLDNLL
PSTLDYFTYLGTTIKHSADAVWIIFPTPINIHSDQLSKFRTLLSSSNHDGKPYYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPARENYFMRWLSDLRETCFSYYQKYIEGNKTFAIIAIVFVFILTAILFLMSRRYSREKQN
>Q9JFA1 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISNARLKPLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRSANTSAPFDSVFYLDNLL
PSKLDYFTYLGTTINHSADAVWIIFPTPINIHSDQLSKFRTLLSSSNHEGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTPPARENYFMRWLSDLRETCFSYYQKYIEENKTFAIIAIVFVFILTAILFFMSRRYSREKQN
>P04195 ~~~~~~Cell surface-binding protein OPG105~~~
MPQQLSPINIETKKAISNARLKPLDIHYNESKPTTIQNTGKLVRINFKGGYISGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVLDHKNVYFQKIVNQLDSIRSANTSAPFDSVFYLDNLL
PSKLDYFTYLGTTINHSADAVWIIFPTPINIHSDQLSKFRTLLSSSNHDGKPHYITENYRNPYKLNDDTQVYYSGEIIRA
ATTSPARENYFMRWLSDLRETCFSYYQKYIEENKTFAIIAIVFVFILTAILFFMSRRYSREKQN
>P0DSY1 ~~~~~~Cell surface-binding protein OPG105~~~
MSQQLSPINIETKKAISNARLKPLNIHYNESKPTTIQNTGKLVRINFKGGYLSGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRTANTSAPFDSVFYLDNLL
PSKLDYFKYLGTTINHSADAVWIIFPTPINIHSDQLSKFRTLLSLSNHEGKPHYITENYRNPYKLNDDTEVYYSGEIIRA
ATTSPARENYFMRWLSDLRETCFSYYQKYIEGNKTFAIIAIVFVYILTAILFLMSRRYSREKQN
>P0DSY2 ~~~~~~Cell surface-binding protein OPG105~~~
MSQQLSPINIETKKAISNARLKPLNIHYNESKPTTIQNTGKLVRINFKGGYLSGGFLPNEYVLSSLHIYWGKEDDYGSNH
LIDVYKYSGEINLVHWNKKKYSSYEEAKKHDDGLIIISIFLQVSDHKNVYFQKIVNQLDSIRTANTSAPFDSVFYLDNLL
PSKLDYFKYLGTTINHSADAVWIIFPTPINIHSDQLSKFRTLLSLSNHEGKPHYITENYRNPYKLNDDTEVYYSGEIIRA
ATTSPARENYFMRWLSDLRETCFSYYQKYIEGNKTFAIIAIVFVYILTAILFLMSRRYSREKQN
>P03279 ~~~L1~~~Pre-hexon-linking protein IIIa~~~
MMQDATDPAVRAALQSQPSGLNSTDDWRQVMDRIMSLTARNPDAFRQQPQANRLSAILEAVVPARANPTHEKVLAIVNAL
AENRAIRPDEAGLVYDALLQRVARYNSGNVQTNLDRLVGDVREAVAQRERAQQQGNLGSMVALNAFLSTQPANVPRGQED
YTNFVSALRLMVTETPQSEVYQSGPDYFFQTSRQGLQTVNLSQAFKNLQGLWGVRAPTGDRATVSSLLTPNSRLLLLLIA
PFTDSGSVSRDTYLGHLLTLYREAIGQAHVDEHTFQEITSVSRALGQEDTGSLEATLNYLLTNRRQKIPSLHSLNSEEER
ILRYVQQSVSLNLMRDGVTPSVALDMTARNMEPGMYASNRPFINRLMDYLHRAAAVNPEYFTNAILNPHWLPPPGFYTGG
FEVPEGNDGFLWDDIDDSVFSPQPQTLLELQQREQAEAALRKESFRRPSSLSDLGAAAPRSDASSPFPSLIGSFTSTRTT
RPRLLGEEEYLNNSLLQPQREKNLPPAFPNNGIESLVDKMSRWKTYAQEHRDVPGPRPPTRRQRHDRQRGLVWEDDDSAD
DSSVLDLGGSGNPFAHLRPRLGRMF
>P12537 ~~~L1~~~Pre-hexon-linking protein IIIa~~~
MMQDATDPAVRAALQSQPSGLNSTDDWRQVMDRIMSLTARNPDAFRQQPQANRLSAILEAVVPARANPTHEKVLAIVNAL
AENRAIRPDEAGLVYDALLQRVARYNSGNVQTNLDRLVGDVREAVAQRERAQQQGNLGSMVALNAFLSTQPANVPRGQED
YTNFVSALRLMVTETPQSEVYQSGPDYFFQTSRQGLQTVNLSQAFKNLQGLWGVRAPTGDRATVSSLLTPNSRLLLLLIA
PFTDSGSVSRDTYLGHLLTLYREAIGQAHVDEHTFQEITSVSRALGQEDTGSLEATLNYLLTNRRQKIPSLHSLNSEEER
ILRYVQQSVSLNLMRDGVTPSVALDMTARNMEPGMYASNRPFINRLMDYLHRAAAVNPEYFTNAILNPHWLPPPGFYTGG
FEVPEGNDGFLWDDIDDSVFSPQPQTLLELQQREQAEAALRKESFRRPSSLSDLGAAAPRSDASSPFPSLIGSLTSTRTT
RPRLLGEEEYLNNSLLQPQREKNLPPAFPNNGIESLVDKMSRWKTYAQEHRDVPGPRPPTRRQRHDRQRGLVWEDDDSAD
DSSVLDLGGSGNPFAHLRPRLGRMF
>P36712 ~~~L1~~~Pre-hexon-linking protein IIIa~~~
MQRPAIIAERAPNLDPAVLAAMQSQPSGVTASDDWTAAMDRIMALTARSPDAFRQQPQANRFSAILEAVVPSRTNPTHEK
VLTIVNALLDSKAIRKDEAGLIYNALLERVARYNSTNVQANLDRMGTDVKEALAQRERFHRDGNLGSLVALNAFLSTQPA
NVPRGQEDYTNFISALRLMVTEVPQSEVYQSGPDYFFQTSRQGLQTVNLTQAFKNLQGLWGVRAPVGDRSTLSSLLTPNS
RLLLLLIAPFTNTNSLSRDSYLGHLVTLYREAIGQAQVDEQTYQEITSVSRALGQEDTGSLEATLNFLLTNRRQQVPPQY
TLNAEEERILRYVQQSVSLYLMREGATPSAALDMTARNMEPSFYASNRAFINRLMDYLHRAAAMNGEYFTNAILNPHWLP
PPGFYTGEFDLPEGNDGFLWDDVTDSLFSPAVIGHHGKKEAGDEGPLLDSRASSPFPSLTSLPASVNSGRTTRPRLTGES
EYLNDPILFPVRDKNFPNNGIESLVDKMSRWKTYAQERREWEERQPRPVRPPRQRWQRRKKGAHAGDEGSDDSADDSSVL
DLGGSGNPFAHLRPQGCIGSLY
>P03274 ~~~L3~~~Pre-protein VI~~~
MEDINFASLAPRHGSRPFMGNWQDIGTSNMSGGAFSWGSLWSGIKNFGSTIKNYGSKAWNSSTGQMLRDKLKEQNFQQKV
VDGLASGISGVVDLANQAVQNKINSKLDPRPPVEEPPPAVETVSPEGRGEKRPRPDREETLVTQIDEPPSYEEALKQGLP
TTRPIAPMATGVLGQHTPVTLDLPPPADTQQKPVLPGPSAVVVTRPSRASLRRAASGPRSMRPVASGNWQSTLNSIVGLG
VQSLKRRRCF
>P24937 ~~~L3~~~Pre-protein VI~~~
MEDINFASLAPRHGSRPFMGNWQDIGTSNMSGGAFSWGSLWSGIKNFGSTVKNYGSKAWNSSTGQMLRDKLKEQNFQQKV
VDGLASGISGVVDLANQAVQNKINSKLDPRPPVEEPPPAVETVSPEGRGEKRPRPDREETLVTQIDEPPSYEEALKQGLP
TTRPIAPMATGVLGQHTPVTLDLPPPADTQQKPVLPGPTAVVVTRPSRASLRRAASGPRSLRPVASGNWQSTLNSIVGLG
VQSLKRRRCF
>P35988 ~~~L3~~~Pre-protein VI~~~
MEDINFSSLAPRHGTRPYMGTWNEIGTSQLNGGAFNWNSIWSGLKNFGSTIKTYGTKAWNSQTGQMLRDKLKDQNFQQKV
VDGLASGINGVVDIANQAVQKKIANRLEPRPDEVMVEEKLPPLETVPGSVPTKGEKRPRPDAEETLVTHTTEPPSYEEAI
KQGAALSPTTYPMTKPILPMATRVYGKNENVPMTLELPPLPEPTIADPVGSVPVASVPVASTVSRPAVRPVAVASLRNPR
SSNWQSTLNSIVGLGVKSLKRRRCY
>P16139 ~~~L3~~~Pre-protein VI~~~
MEDINFASLAPRHGSRPFMGTWNEIGTSQLNGGAFSWSSLWSGIKNFGSSIKSFGNKAWNSNTGQMLRDKLKDQNFQQKV
VDGLASGINGVVDIANQALQNQINQRLENSRQPPVALKQRPTPEPEEVEVEEKLPPLETAPPLPSKGEKRPRPELEETLV
VESREPPSYEQALKEGASYPMTRPIGSMARPVYGKEKTPVTLELPPPAPTVPPMPTPTLGTNVPRLAAPTVAVATPARRV
RGANWQSTLNSIVGLGVKSLKRRRCY
>P03280 ~~~L4~~~Pre-hexon-linking protein VIII~~~
MSKEIPTPYMWSYQPQMGLAAGAAQDYSTRINYMSAGPHMISRVNGIRAHRNRILLEQAAITTTPRNNLNPRSWPAALVY
QESPAPTTVVLPRDAQAEVQMTNSGAQLAGGFRHRVRSPGQGITHLKIRGRGIQLNDESVSSSLGLRPDGTFQIGGAGRS
SFTPRQAILTLQTSSSEPRSGGIGTLQFIEEFVPSVYFNPFSGPPGHYPDQFIPNFDAVKDSADGYD
>P24936 ~~~L4~~~Pre-hexon-linking protein VIII~~~
MSKEIPTPYMWSYQPQMGLAAGAAQDYSTRINYMSAGPHMISRVNGIRAHRNRILLEQAAITTTPRNNLNPRSWPAALVY
QESPAPTTVVLPRDAQAEVQMTNSGAQLAGGFRHRVRSPGQGITHLTIRGRGIQLNDESVSSSLGLRPDGTFQIGGAGRP
SFTPRQAILTLQTSSSEPRSGGIGTLQFIEEFVPSVYFNPFSGPPGHYPDQFIPNFDAVKDSADGYD
>P36713 ~~~L4~~~Pre-hexon-linking protein VIII~~~
MSKDIPTPYMWSFQPQMGLAAGAAQDYSSKMNWLSAGPHMISRVNGVRARRNQILLEQAALTATPRNQLNPPSWPAALIY
QENPPPTTVLLPRDAQAEVHMTNAGAQLAGGARHSFRYKGRTEPYPSPAIKRVLIRGKGIQLNDEVTSPLGVRPDGVFQL
GGSGRSSFTARQAYLTLQSSSSAPRSGGIGTLQFVEEFTPSVYFNPFSGSPGHYPDAFIPNFDAVSESVDGYD
>P11822 ~~~L4~~~Pre-hexon-linking protein VIII~~~
MSKEIPTPYMWSYQPQMGLAAGASQDYSSRMNWLSAGPHMIGRVNGIRATRNQILLEQAALTSTPRSQLNPPNWPAAQVY
QENPAPTTVLLPRDAEAEVQMTNSGAQLAGGSRHVRFRGRSSPYSPGPIKRLIIRGRGIQLNDEVVSSLTGLRPDGVFQL
GGAGRSSFTPRQAYLTLQSSSSQPRSGGIGTLQFVEEFVPSVYFNPFSGAPGLYPDDFIPNYDAVSESVDGYD
>P03282 ~~~IX~~~Hexon-interlacing protein~~~
MSANSFDGSIVSSYLTTRMPPWAGVRQNVMGSSIDGRPVLPANSTTLTYETVSGTPLETAASAAASAAAATARGIVTDFA
FLSPLASSAASRSSARDDKLTALLAQLDSLTRELNVVSQQLLDLRQQVSALKASSPPNAV
>P03281 ~~~IX~~~Hexon-interlacing protein~~~
MSTNSFDGSIVSSYLTTRMPPWAGVRQNVMGSSIDGRPVLPANSTTLTYETVSGTPLETAASAAASAAAATARGIVTDFA
FLSPLASSAASRSSARDDKLTALLAQLDSLTRELNVVSQQLLDLRQQVSALKASSPPNAV
>P32539 ~~~IX~~~Hexon-interlacing protein~~~
MSGSMEGNAVSFKGGVFSPYLTTRLPAWAGVRQNVMGSNVDGRPVAPANSATLTYATVGSSVDTAAAAAASAAASTARGM
AADFGLYNQLAASRSLREEDALSVVLTRMEELSQQLQDLFAKVALLNPPANAS
>Q70LC6 ~~~~~~Major capsid protein 1~~~
MAKLNRKLRQDSTDRYKTKLYLWRNLGGLIPEDMAISVTESITADWKQYNDMMSKVRNETLDILKTNKVATEDYIGYIAF
AEELAHQVWKNKNSSPDPNTANEASKTDLESKYSDVYGLDVTVLDAIYNAVIPIIMGGGS
>Q5UQL7 ~~~~~~Capsid protein 1~~~
MAGGLLQLVAYGAQDVYLTGNPMITFFKVVYRRHTNFAVESIEQFFGGNLGFGKKSSAEINRSGDLITQVFLKVTLPEVR
YCGDFTNFGHVEFAWVRNIGHAIVEETELEIGGSPIDKHYGDWLQIWQDVSSSKDHEKGLAKMLGDVPELTSISTLSWDV
PDNTVLKPSYTLYVPLQFYFNRNNGLALPLIALQYHQVRIYVKFRQADQCYIASDAFKSGCGNLQLDDVSLYVNYVFLDT
EERRRFAQVSHEYLIEQLQFTGEESAGSSNSAKYKLNFNHPVKAIYWVTKLGNYQGGKFMTYDPVCWENARENAAKLLLL
AQYDLDDWGYFQEPGGYECEGNDGRSYVGDCGVQYTAVDPSNPSEEPSYIFNDTTTAEAFDGSLLIGKLAPCVPLLKRNK
DVDLKDKVEGIIRIHTDFENDRMKYPEVEKITRNDLTLHDLSVPISKYDVDNRVDYIKKFDVTVWQHNNFGLLIDGSGNP
THEAELQLNGQPRQSKRGGIWYDTVNPTVHHTKSPRDGVNVFSFALNPEEHQPSCTCNFSRIDTAQLNLWFQHFTNHKFA
DVFADNDNKVLIFAVNYNVLRMLSGMAGLAYSN
>A0A6M3VZT9 ~~~~~~Major capsid protein 1~~~
MSVVTTRARIAETLTEKHTLGIEKVVATDSWRVGITSREKKLERINISAEISRRIQDEAIAYARNKGIPYLPGINGIAWK
LLRLKWLGYTDQINVVMRTVPAEWRDFLTQIMENTQMESMYSELRKVRV
>Q914J4 ~~~~~~Major capsid protein 1~~~
MAGRQSHKKIDVRNDTSTRYKGKLYGIFVNYMGEKYAQQLVENMYSNYNDVFVEIYNKMHNALRPTLVKLAGAGATFPLW
QLVNEAIYAVYLTHKETASFLVTKYVARGVPAMTVKTLLAEVGNQLKELVPAVAEQIGSVTLDHTNVVSTVDNIVTSMPA
LPNSYAGVLMKTKVPTVTPHYAGTGTFSSMESAYKALEDIERGL
>A0A1W6I187 ~~~~~~Capsid protein VP4~~~
MSESVTQQVFNFAVTKSQPFGGYVYSTNLTASTSSAVTSTQLTPLNLSITLGQITLSGNSLVIPATQIWYLTDAYVSVPD
YTNITNGAEADGVILIYKDGVKLMLTTPLISSMSISNPARTHLAQAVKYSPQSILTMYFNPTKPATASTSYPNTVYFTVV
VVDFSYAQNPARAVVSANAVM
>A0A346LU62 ~~~~~~Major capsid protein VP4~~~
MPSRRTGITTEDAITKYSVKAKTEQTAYKNATKDMVVQAQNIMNFYSVVNQALIPWLNAHGVGGNLRILYRQLANEYVKV
LNTKQSGEVIKRLKIALRHKYWLRGLDEAMLDEFMDYIDSLKSTTTNYIIFNMQSSK
>Q70LC7 ~~~~~~Major capsid protein 2~~~
MVNKKYRQDSKDRYQYKQYIYRSIGGIVPPEMAETVTANQTAQWEAGFTPYHKLRLAIKEICKTDGIPNIKWGMYIAFGE
KLLKSYLKMKAGSASSDMIAEYINNAISAFSSRTGISQETAQKIADFITSNY
>A0A6M3VWZ7 ~~~~~~Major capsid protein 2~~~
MSVEVYRQKIEKGGYSAAYEATRRYEREEIEVLSWSSRWESAWSKFGEAVKALGKIEGAPRALVIAKVQEALAYMSKPLP
NMKLAMAAAVQAVRACEQLPGMNRERCLDAVAGALGVAKDWIRREMTGGGGGGGGGGGGGGGAVV
>Q914J5 ~~~~~~Major capsid protein 2~~~
MARRNRRLSSASVYRYYLKRISMNIGTTGHVNGLSIAGNPEIMRAIARLSEQETYNWVTDYAPSHLAKEVVKQISGKYNI
PGAYQGLLMAFAEKVLANYILDYKGEPLVEIHHNFLWELMQRQSGAGLGVTSGFIYTFVRKDGKPVTVDMSKVLTEIEDA
LFKLVKK
>A0A1W6I162 ~~~~~~Capsid protein VP10~~~
MLSLDNYSYVHNITTQTNIDLSSQQTIHLASINGKGYIIFLRFFCEGSSACFTNVKFSVKANGLVLYSFRYIQLLELGQA
IATAIPSSSQGFSTLLSNYNVLISSPIGTLPQLTLYDSYDNRYGAMLQPAFPLPFVNTLSLDVDILPVSQSSYDPIPYSL
NDNQISTNAPTGKGNISIEYLLYNCLV
>A0A346LU63 ~~~~~~Major capsid protein VP5~~~
MARKRTSKNDPLRMYLNYVRKLQTMGDAYDESAKYRIANFENGFKSLHMVENEFKQYLANVIDEAIKSGASPQDLPYVNE
IKLALMKIFTSWLKYSNEKLGANEIAINVAGTATMTLTENLYGTRVSCEEAVSLINSIFAVWVGVEPFEAEEREGACLVT
PRSPLPPVPISSPTGFSAPIQEVLQAKSPEEIIGVKGGA
>Q27YG7 ~~~ORF4~~~ORF4 polyprotein~~~
MQNPTQTMHIYDMPLRVIAGLSTLAKTTEEDDNTSTGIVVSEVGEPQVVNHPAWIDPFVAYQLRAPRKNITPDFIFGRAD
IGNAFSAFLPRRFSAPAVGTRLVVDPVFTYQQRTVLGLYNYFHADFYYIVHVPAPLGTGIYLKIYAPEFDTTTVTRGIRF
KPSASPTIALSVPWSNDLSTVETSVGRVGQSGGSIVIETIEDNSNETVNTPLSITVWCCMANIKATGYRHADTSAYNEKG
MNFISVPVPKPPVPPTKPITGEEQADNEVTAEGGKLVQELVYDHSAIPVAPVVETQAEQPEVPVSLVATRKNDTGHLATK
WYDFAKISLSNPANMNWTTLTIDPYNNVTLSRDGESMVLPWRRNVWTTGSKSIGYIRTMVAQINIPRPPQISGVLEVKDS
INNSSISLVEFGGKVEIPIIPKVMNGLATTASLPRHRLNPWMRTAESKVELQYRIIAFNRTSDIADLNVSVLLRPGDSQF
QLPMKPDNNVDTRHFELVEALMYHYDSLRIRGEEQSLPENAPNAVSNPQQFITPATALSAEEYNVHEALGETEELELDEF
PVLVFKGNVPVDSVTSIPLDLATIYDFAWDGEQNAISQKFQRFAHLIPKSAGGFGPVIGNYTITANLPTGVAGRILHNCL
PGDCVDLAVSRIFGLKSLLGVAGTAVSAIGGPLLNGLVNTAAPILSGAAHAIGGNVVGGLADAVIDIGSNLLTPKEKEQP
SANSSAISGDIPISRFVEMLKYVKENYQDNPVFPTLLVEPQNFISNAMTALKTIPIEVFANMRNVKVERNLFDRTVVPTV
KEATLADIVIPNHMYGYILRDFLQNKRAFQSGTKQNVYFQQFLTVLSQRNIRTHITLNDITSCSIDSESIANKIERVKHY
LSTNSSGETTEEFSRTDTGLLPTTTRKIVLGESKRRTERYVAETVFPSVRQ
>P19726 ~~~~~~Major capsid protein~~~
MASMTGGQQMGTNQGKGVVAAGDKLALFLKVFGGEVLTAFARTSVTTSRHMVRSISSGKSAQFPVLGRTQAAYLAPGENL
DDKRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVESKYNENIEGLG
TATVIETTQNKAALTDQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSYSAILAALMPNAANYAALIDPEKGSIR
NVMGFEVVEVPHLTAGGAGTAREGTTGQKHVFPANKGEGNVKVAKDNVIGLFMHRSAVGTVKLRDLALERARRANFQADQ
IIAKYAMGHGGLRPEAAGAVVFKVE
>P19727 ~~~~~~Minor capsid protein~~~
MASMTGGQQMGTNQGKGVVAAGDKLALFLKVFGGEVLTAFARTSVTTSRHMVRSISSGKSAQFPVLGRTQAAYLAPGENL
DDKRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVESKYNENIEGLG
TATVIETTQNKAALTDQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSYSAILAALMPNAANYAALIDPEKGSIR
NVMGFEVVEVPHLTAGGAGTAREGTTGQKHVFPANKGEGNVKVAKDNVIGLFMHRSAVGTVKLRDLALERARRANFQADQ
IIAKYAMGHGGLRPEAAGAVVFQSGVMLGVASTVAASPEEASVTSTEETLTPAQEAARTRAANKARKEAELAAATAEQ
>P03135 ~~~VP1~~~Capsid protein VP1~~~
MAADGYLPDWLEDTLSEGIRQWWKLKPGPPPPKPAERHKDDSRGLVLPGYKYLGPFNGLDKGEPVNEADAAALEHDKAYD
RQLDSGDNPYLKYNHADAEFQERLKEDTSFGGNLGRAVFQAKKRVLEPLGLVEEPVKTAPGKKRPVEHSPVEPDSSSGTG
KAGQQPARKRLNFGQTGDADSVPDPQPLGQPPAAPSGLGTNTMATGSGAPMADNNEGADGVGNSSGNWHCDSTWMGDRVI
TTSTRTWALPTYNNHLYKQISSQSGASNDNHYFGYSTPWGYFDFNRFHCHFSPRDWQRLINNNWGFRPKRLNFKLFNIQV
KEVTQNDGTTTIANNLTSTVQVFTDSEYQLPYVLGSAHQGCLPPFPADVFMVPQYGYLTLNNGSQAVGRSSFYCLEYFPS
QMLRTGNNFTFSYTFEDVPFHSSYAHSQSLDRLMNPLIDQYLYYLSRTNTPSGTTTQSRLQFSQAGASDIRDQSRNWLPG
PCYRQQRVSKTSADNNNSEYSWTGATKYHLNGRDSLVNPGPAMASHKDDEEKFFPQSGVLIFGKQGSEKTNVDIEKVMIT
DEEEIRTTNPVATEQYGSVSTNLQRGNRQAATADVNTQGVLPGMVWQDRDVYLQGPIWAKIPHTDGHFHPSPLMGGFGLK
HPPPQILIKNTPVPANPSTTFSAAKFASFITQYSTGQVSVEIEWELQKENSKRWNPEIQYTSNYNKSVNVDFTVDTNGVY
SEPRPIGTRYLTRNL
>P21942 ~~~AR1~~~Capsid protein~~~
MPKRDLPWRSMPGTSKTSRNANYSPRARIGPRVDKASEWVHRPMYRKPRIYRTLRTADVPRGCEGPCKVQSYEQRHDISH
VGKVMCISDVTRGNGITHRVGKRFCVKSVYILGKIWMDENIKLQNHTNSVMFWLVRDRRPYGTPMDFGHVFNMFDNEPST
ATVKNDLRDRYQVLHKFYGKVTGGQYASNEQAIVKRFWKVNNHVVYNHQEAGKYENHTENALLLYMACTHASNPVYATLK
IRIYFYDSLMN
>P27737 ~~~~~~Capsid protein~~~
MAAVLNLQLKVDASLKAFLGAENRPLHGKTGATLEQILESIFANIAIQGTSEQTEFLDLVVEVKSMEDQSVLGSYNLKEV
VNLIKAFKTTSSDPNINKMTFRQVCEAFAPEARNGLVKLKYKGVFTNLFTTMPEVGSKYPELMFDFNKGLNMFIMNKAQQ
KVITNMNRRLLQTEFAKSENEAKLSSVSTDLCI
>P24029 ~~~~~~Capsid protein VP1~~~
MSKIPQHYPGKKRSAPRHVFIQQAKKKKQTNPAVYHGEDTIEEMDSTEAEQMDTEQATNQTAEAGGGGGGGGGGGGGGGG
VGNSTGGFNNTTEFKVINNEVYITCHATRMVHINQADTDEYLIFNAGRTTDTKTHQQKLNLEFFVYDDFHQQVMTPWYIV
DSNAWGVWMSPKDFQQMKTLCSEISLVTLEQEIDNVTIKTVTETNQGNASTKQFNNDLTASLQVALDTNNILPYTPAAPL
GETLGFVPWRATKPTQYRYYHPCYIYNRYPNIQKVATETLTWDAVQDDYLSVDEQYFNFITIENNIPINILRTGDNFHTG
LYEFNSKPCKLTLSYQSTRCLGLPPLCKPKTDTTHKVTSKENGADLIYIQGQDNTRLGHFWGEERGKKNAEMNRIRPYNI
GYQYPEWIIPAGLQGSYFAGGPRQWSDTTKGAGTHSQHLQQNFSTRYIYDRNHGGDNEVDLLDGIPIHERSNYYSDNEIE
QHTAKQPKLRTPPIHHSKIDSWEEEGWPAASGTHFEDEVIYLDYFNFSGEQELNFPHEVLDDAAQMKKLLNSYQPTVAQD
NVGPVYPWGQIWDKKPHMDHKPSMNNNAPFVCKNNPPGQLFVKLTENLTDTFNYDENPDRIKTYGYFTWRGKLVLKGKLS
QVTCWNPVKRELIGEPGVFTKDKYHKQIPNNKGNFEIGLQYGRSTIKYIY
>P03591 ~~~ORF3b~~~Capsid protein~~~
MSSSQKKAGGKAGKPTKRSQNYAALRKAQLPKPPALKVPVVKPTNTILPQTGCVWQSLGTPLSLSSFNGLGARFLYSFLK
DFVGPRILEEDLIYRMVFSITPSHAGTFCLTDDVTTEDGRAVAHGNPMQEFPHGAFHANEKFGFELVFTAPTHAGMQNQN
FKHSYAVALCLDFDAQPEGSKNPSFRFNEVWVERKAFPRAGPLRSLITVGLFDEADDLDRH
>P05673 ~~~ORF3b~~~Capsid protein~~~
MSSSQKKAGGKAGKPTKRSQNYAALRKAQLPKPPALKVPVVKPTNTILPQTGCVWQSLGTPLSLSSFNGLGVRFLYSFLK
DFAGPRILEEDLIYRMVFSITPSYAGTFCLTDDVTTEDGRAVAHGNPMQEFPHGAFHANEKFGFELVFTAPTHAGMQNQN
FKHSYAVALCLDFDAQPEGSKNPSYRFNEVWVERKAFPRAGPLRSLITVGLLDEADDLDRH
>P03590 ~~~ORF3b~~~Capsid protein~~~
MSSSQKKAGGKAGKPTKRSQNYAALRKAQLPKPPALKVPVAKPTNTILPQTGCVWQSLGTPLSLSSFNGLGVRFLYSFLK
DFTGPRILEEDLIYRMVFSITPSHAGTFCLTDDVTTEDGRAVAHGNPMQEFPHGAFHANEKFGFELVFTAPTHAGMQNQN
FKHSYAVALCLDFDAQPEGSKNPSYRFNEVWVERKAFPRAGPLRSLITVGLLDEADDL
>P24264 ~~~ORF3b~~~Capsid protein~~~
MSSSQKKAGGKAGKPTKRSQNYAALRKARLPKPPALKVPVAKPTNTILPQTGCVWQSLGTPLSLSSFNGLGVRFLYSFLK
DFAGPRILEEDLIYRMVFSITPSHAGTFCLTDDVTTEDGRAVAHGNPMQEFPHGAFHANEKFGFELVFTAPTHAGMQNQN
FKHSYAVALCLDFDAQPEGSKNPSYRFNEVWVERKAFPRAGPLRSLITVGLLDEADDLDRH
>B2BNE1 3.6.4.13~~~S3~~~Major inner capsid protein VP3~~~
MPRRPRRNAKTSDKAEDAQTLVAPAANASVSSTVNTTTSPTLAAGNESQQRAGIDPNQAGSAGVGDAAPSSRVDNDGDVI
TRPTSDSIAAIANATKPAAVINNAQATALVPTSNPHAYRCNVCNAEFPSMSAMTEHLRTSHRDEPSTLLATPVINAAIQA
FLQAWDGLRLLAPDVSSEALSKYLDSTVDSSPDLIVEDQGLCTSFMLIDNVPASHLSPELIGFTWFMQMYQMTPPLPEGA
VNRIVCMTNWASLGDPSRGIEVRLPPPTDNTVHAYKTVLSQGYVASSQFSPLTFRANTLLMLTQFVLSNLKINKSSTFTS
DVTTLTVGRMICSFEARPELLALAYPGRAVLPVNTKNAQFLATAIPDRIGRIDRANLIGGEVSASVECMELCDSLTLYIR
ENYLMLLRSMHQDPTRIVQIVNECARNLLNSSIPVNLRPSILCPWFASTADLRLQQAIHLVNISSNTAAALPQVEALSSL
LRSVTPLVLNPTILTNAITTISESTTQTISPISEILRLLSPTGNDYAAFWKCIASWAYNGLVQTVLSEDAFPDSSQSITH
LPSMWKCMLLTLAAPMTSDPHSPVKVFMSLANLLAQPEPIVINVDGMHQTTPASQFSHPGVWPPGFINPAQIPVAQAPLL
RAFADHIHANWPQPSDFEYGSAAQGSGNLFIPPNRMVYPWPNAPLPRMTVAATFDSAMSQWISTTIAFFIRVVNAPIMAP
TVNDLTRRTITGVLTAMRQVKTMTPFYIQHMCPTELAVLGSITLVPPFQVPFTRLVQNDAITNVLVARVDPTQRGDAAVD
IRATHATFSAALPVDPASIVVAMLCGQTPTNLIPSHHYGKAFAPLFTSNAMFTRNQRAVITREALVCARSIVAQCQDDGF
NVPRPLAGLRQFDITSAAAAEIWHAVNDAFKTAFDIDGALLDGMGLYGDPRIADISVAYLQYDGRVTREHVPPDQSFIHR
ALLTTENTFLAEMNLFNVGAGDIFLIQTPTNGNWAPMVPVAHPPFARGGPNVNVVGNHGTLAMRPNGLEPQLIDNAGVPR
DIAGDWIYPIDVLQVSVSTFRDYVWPLVVAGRVRVRIEIPHYVYTTHYHQPQTTFTDAQLVETWLAGIDPTGIPPIPFSI
PIPQVGACITSRRVYHVFAAQNNNNSLFSTNSSSIATVFGEDAGVSPARWPALVDPNYQFGTNELPNRITLYGSLFRYNF
TYPSLSGVMFMRSAE
>P04329 3.4.23.44~~~alpha~~~Capsid protein alpha~~~
MVRNNNRRRQRTQRIVTTTTQTAPVPQQNVPKQPRRRRNRARRNRRQGRAMNMGALTRLSQPGLAFLKCAFAPPDFNTDP
GKGIPDRFEGKVVTRKDVLNQSINFTANRDTFILIAPTPGVAYWVADVPAGTFPISTTTFNAVNFPGFNSMFGNAAASRS
DQVSSFRYASMNVGIYPTSNLMQFAGSITVWKCPVKLSNVQFPVATTPATSALVHTLVGLDGVLAVGPDNFSESFIKGVF
SQSVCNEPDFEFSDILEGIQTLPPANVTVATSGQPFNLAAGAEAVSGIVGWGNMDTIVIRVSAPTGAVNSAILKTWACLE
YRPNPNAMLYQFGHDSPPCDEVALQEYRTVARSLPVAVIAAQNASMWERVKSIIKSSLAMASNVPGPIGIAASGLSGLSA
LFEGFGF
>Q9YUC8 ~~~Cap~~~Capsid protein~~~
MWGTSNCACAKFQIRRRYARPYRRRHIRRYRRRRRHFRRRRFTTNRVYTLRLTRQFQFKIQKQTTSVGNLIFNADYITFA
LDDFLQAVPNPHALNFEDYRIKLAKMEMRPTGGHYTVQSNGFGHTAVIQDSRITKFKTTADQTQDPLAPFDGAKKWFVSR
GFKRLLRPKPQITIEDLTTANQSAALWLNSARTGWIPLQGGPNSAGTKVRHYGIAFSFPQPEQTITYVTKLTLYVQFRQF
APNNPST
>P03602 ~~~ORF3b~~~Capsid protein~~~
MSTSGTGKMTRAQRRAAARRNRWTARVQPVIVEPLAAGQGKAIKAIAGYSISKWEASSDAITAKATNAMSITLPHELSSE
KNKELKVGRVLLWLGLLPSVAGRIKACVAEKQAQAEAAFQVALAVADSSKEVVAAMYTDAFRGATLGDLLNLQIYLYASE
AVPAKAVVVHLEVEHVRPTFDDFFTPVYR
>O11840 ~~~ORF2-ORF3~~~Capsid readthrough protein~~~
MSSEGRYMTWKDMSHNKFMTDRWARVSDVVSVIKQSHAMDLSKAANLSIIKTALAGLGSGWTDNNPFVSPMTRFPQTLTM
YGALVLYVNLSDPEFALIMTKVSTLTDSGLADNASANVRRDVVSGNKAESSGKTAGTNENSAYTLTVSLAGLAQALRLEE
LMWTRDKFEDRLKLPWTPVQGRTSPPGQXQLAAARVTAHIRAAKRALLYPGDSPEWVGWKHFYPPPPYDVYDVPPLDIIN
AKLAADDIGGLVTPTPASSHGLPFEVSEEVEQANRNSLWLTVGLLLAALAVGIGVAAYHRKKLQSRLRELKLLWGSTGGS
GGGGGFDTELYMRATDTVSLGTTLSEHAASAPSGLRHRPAATDSGPHEALPFEVWVFDNLAVVYDSIGMSDLFYTVREFV
GVFNGEFEGLIELLESPDDDDGVYTNAPRDTAIDAYESQENYDRIDIETVLIERRINLKKLLLEEAELERRERDMTMIAD
EEQRTLLHRLESSRVEATHAVAKAEADARAAVAMAALASKEANDYDSKMAFDRSCKEQELRLRELEVNSMPSKTERYVHT
GIQGGAQLAGAMAVGAMLRRGAGSSSQTVSSGANIGSRSQSLTRGRSASQPLSSVGGSTRGVNNNISNTNLVRAGNSAEV
SAGRSTNSGNSNFWSKLRVGEGWSKYSVERAATRAQRAIVLPAPPSAPAG
>A7XXC2 ~~~~~~Major capsid protein~~~
MRVPININNALARVRDPLSIGGLKFPTTKEIQEAVAAIADKFNQENDLVDRFFPEDSTFASELELYLLRTQDAEQTGMTF
VHQVGSTSLPVEARVAKVDLAKATWSPLAFKESRVWDEKEILYLGRLADEVQAGVINEQIAESLTWLMARMRNRRRWLTW
QVMRTGRITIQPNDPYNPNGLKYVIDYGVTDIELPLPQKFDAKDGNGNSAVDPIQYFRDLIKAATYFPDRRPVAIIVGPG
FDEVLADNTFVQKYVEYEKGWVVGQNTVQPPREVYRQAALDIFKRYTGLEVMVYDKTYRDQDGSVKYWIPVGELIVLNQS
TGPVGRFVYTAHVAGQRNGKVVYATGPYLTVKDHLQDDPPYYAIIAGFHGLPQLSGYNTEDFSFHRFKWLKYANNVQSYL
PPFPPKVEL
>P08767 ~~~F~~~Capsid protein F~~~
MSNVQTSAEREIVDLSHLAFDCGMLGRLKTVSWTPVIAGDSFELDAVGALRLSPLRRGLAIDSKVDFFTFYIPHRHVYGD
QWIQFMRDGVNAQPLPSVTCNRYPDHAGYVGTIVPANNRIPKFLHQSYLNIYNNYFRAPWMPERTEANPSNLNEDDARYR
FRCCHLKNIWSAPLPPETKLAEEMGIESNSIDIMGLQAAYAQLHTEQERTYFMQRYRDVISSFGGSTSYDADNRPLLVMH
TDFWASGYDVDGTDQSSLGQFSGRVQQTFKHSVPRFFVPEHGVMMTLALIRFPPISPLEHHYLAGKSQLTYTDLAGDPAL
IGNLPPREISYRDLFRDGRSGIKIKVAESIWYRTHPDYVNFKYHDLHGFPFLDDAPGTSTGDNLQEAILVRHQDYDACFQ
SQQLLQWNKQARYNVSVYRHMPTVRDSIMTS
>Q9T1S4 ~~~~~~Major capsid protein~~~
MANNLESNISQIVLKKFLPGFMSDIVLCKTVDRQLLSGEINSNTGDSVSFKRPHQFKSERTETGDITGKDKNGLFSAKAT
GKVGKYITVAVEWTQIEEALKLNQLDQILSPIHERMVTDLETELAHFMMNNGALSLGSPNTAIKKWADVAQTASFIKDIG
IKTGENYAIMDPWSAQRLADAQSGLHAADQLVRTAWENAQISGNFGGIRALMSNGLASRKQGDFDGAITVKTAPNVDYLS
VKDSYQFTVALTGATPSKTGFLKAGDQLKFTSTHWLNQQSKQTLYNGSTAMSFTATVLEETNSTASGDVTVKLSGVPIYD
EKNSQYNAVDAKVKAGDAVSIIGTAKQQMKPNLFYNKFFCGLGTIPLPKLHSLDSAVATYEGFSIRVHKYADGDANKQMM
RFDLLPAYVCFNPHMGGQFFGNP
>A0A385DVU6 ~~~~~~Major capsid protein~~~
MAGKLGKFQMLGFQHWKGLTSDNHLGAIFQQAPQKATNLMVQLLAFYRGKSLDTFLNSFPTREFEDDNEYYWDVIGSSRR
NIPLVEARDENGVVVAANAANVGVGTSPFYLVFPEDWFADGEVIVGNLNQVYPFRILGDARMEGTNAVYKVELMGGNTQG
VPAERLQQGERFSIEFAPVEKELSRKVGDVRFTSPVSMRNEWTTIRIQHKVAGNKLNKKLAMGIPMVRNLESGKQVKDTA
NMWMHYVDWEVELQFDEYKNNAMAWGTSNRNLNGEYMNFGKSGNAIKTGAGIFEQTEVANTMYYNTFSLKLLEDALYELS
ASKLAMDDRLFVIKTGERGAIQFHKEVLKTVSGWTTFVLDNNSTRVVEKVQSRLHSNALSAGFQFVEYKAPNGVRVRLDV
DPFYDDPVRNKILHPMGGVAFSYRYDIWYIGTMDQPNIFKCKIKGDNEYRGYQWGIRNPFTGQKGNPYMSFDEDSAVIHR
MATLGVCVLDPTRTMSLIPAILQG
>P19192 ~~~ORF1~~~Capsid protein VP1~~~
MAKGRKLPSVMKNRFSEVPTATIRRSSFDRSHGYKTTFDMDYLVPFFVDEVLPGDTFSLSETHLCRLTTLVQPIMDNIQL
TTQFFFVPNRLLWDNWESFITGGDEPVAWTSTNPANEYFVPQVTSPDGGYAENSIYDYFGLPTKVANYRHQVLPLRAYNL
IFNEYYRDENLQESLPVWTGDADPKVDPTTGEESQEDDAVPYVYKLMRRNKRYDYFTSALPGLQKGPSVGIGITGGDSGR
LPVHGLAIRSYLDDSSDDQFSFGVSYVNASQKWFTADGRLTSGMGSVPVGTTGNFPIDNVVYPSYFGTTVAQTGSPSSSS
TPPFVKGDFPVYVDLAASSSVTINSLRNAITLQQWFEKSARYGSRYVESVQGHFGVHLGDYRAQRPIYLGGSKSYVSVNP
VVQNSSTDSVSPQGNLSAYALSTDTKHLFTKSFVEHGFVIGLLSATADLTYQQGLERQWSRFSRYDYYWPTFAHLGEQPV
YNKEIYCQSDTVMDPSGSAVNDVPFGYQERYAEYRYKPSKVTGLFRSNATGTLDSWHLSQNFANLPTLNETFIQSNTPID
RALAVPDQPDFICDFYFNYRCIRPMPVYSVPGLRRI
>Q37993 ~~~9~~~Major capsid protein~~~
MANKITTFLSGQTGKQISNIDLLNSIRTRASADYQADIPVLEGARINHATVPYQDFQKHANEFFTALVNRIGSTVIKALT
YENPLAIFKSETFEFGDTLQEIYVHPAEKKTYDAKSDVSPFKFADTDIEAFYHTLNNENYYERTFERAWIQKAFVSDMAF
DEFVDKMFTSLLSSDTLDEYQAVRVYLRNHLRKSLIQTLKGNDKKITVAGTKIDETKQDFVVDFNQSLINLSKRFTIPSR
TTFNNPVGVPNMTAIEDQYLVISAEFSTHLDMLLANAFNMDKASVLARTIVVDDFEKFTGEGANNGRKPVAFLISAKSII
NKDKLVHMEAIRNPRNMTYNYFYHHHYMTSLSLFENIHFWYVEEA
>P69540 ~~~VIII~~~Capsid protein G8P~~~
MKKSLVLKASVAVATLVPMLSFAAEGDDPAKAAFDSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFTSKAS
>P03611 ~~~~~~Capsid protein~~~
ASNFTQFVLVNDGGTGNVTVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSAQNRKYTIKVEVPKVATQTVGGVELPVA
AWRSYLNLELTIPIFATNSDCELIVKAMQGLLKDGNPIPSAIAANSGIY
>P69539 ~~~VIII~~~Capsid protein G8P~~~
MKKSLVLKASVAVATLVPMLSFAAEGDDPAKAAFDSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFTSKAS
>P03614 ~~~~~~Capsid protein~~~
MASNFEEFVLVDNGGTGDVKVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSANNRKYTVKVEVPKVATQVQGGVELPV
AAWRSYMNMELTIPVFATNDDCALIVKALQGTFKTGNPIATAIAANSGIY
>P03642 ~~~F~~~Capsid protein F~~~
MSNVQTSADRVPHDLSHLVFEAGKIGRLKTISWTPVVAGDSFECDMVGAIRLSPLRRGLAVDSRVDIFSFYIPHRHIYGQ
QWINFMKDGVNASPLPPVTCSSGWDSAAYLGTIPSSTLKVPKFLHQGYLNIYNNYFKPPWSDDLTYANPSNMPSEDYKWG
VRVANLKSIWTAPLPPDTRTSENMTTGTSTIDIMGLQAAYAKLHTEQERDYFMTRYRDIMKEFGGHTSYDGDNRPLLLMR
SEFWASGYDVDGTDQSSLGQFSGRVQQTFNHKVPRFYVPEHGVIMTLAVTRFPPTHEMEMHYLVGKENLTYTDIACDPAL
MANLPPREVSLKEFFHSSPDSAKFKIAEGQWYRTQPDRVAFPYNALDGFPFYSALPSTELKDRVLVNTNNYDEIFQSMQL
AHWNMQTKFNINVYRHMPTTRDSIMTS
>P07234 ~~~~~~Capsid protein~~~
MATLRSFVLVDNGGTGNVTVVPVSNANGVAEWLSNNSRSQAYRVTASYRASGADKRKYAIKLEVPKIVTQVVNGVELPGS
AWKAYASIDLTIPIFAATDDVTVISKSLAGLFKVGNPIAEAISSQSGFYA
>P82889 ~~~VIII~~~Capsid protein G8P~~~
MDFNPSEVASQVTNYIQAIAAAGVGVLALAIGLSAAWKYAKRFLKG
>P49861 ~~~5~~~Major capsid protein~~~
MSELALIQKAIEESQQKMTQLFDAQKAEIESTGQVSKQLQSDLMKVQEELTKSGTRLFDLEQKLASGAENPGEKKSFSER
AAEELIKSWDGKQGTFGAKTFNKSLGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDLLAQGRTSSNALEYVREEVFTNNA
DVVAEKALKPESDITFSKQTANVKTIAHWVQASRQVMDDAPMLQSYINNRLMYGLALKEEGQLLNGDGTGDNLEGLNKVA
TAYDTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALLKDNEGRYIFGGPQAFTSNIMWGLPVVPTKAQA
AGTFTVGGFDMASQVWDRMDATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSSGS
>P03619 ~~~VIII~~~Capsid protein G8P~~~
MKKSVVAKIIAGSTLVIGSSAFAADDATSQAKAAFDSLTAQATEMSGYAWALVVLVVGATVGIKLFKKFVSRAS
>P03620 ~~~VIII~~~Capsid protein G8P~~~
MRVLSTVLAAKNKIALGAATMLVSAGSFAAEPNAATNYATEAMDSLKTQAIDLISQTWPVVTTVVVAGLVIRLFKKFSSK
AV
>A0A0H5AXT3 ~~~~~~Major capsid protein~~~
MAAPDPYKPGKYNDPAGGVESSIGPQTQTQYWMKQALIDARKEAYFGQLASVFGMPKHYGKKIVRMHYIPLLDDRNINDQ
GIDASGATIANGNLYGSSRDVGTIVGKMPTLTEVGGRVNRVGFKRVLLEGKLEKYGFFREYTQESLDFDSDAELDMHVGR
EMLKGANEMTEDLLQIDLLNSAGVVRYPGAATQDSEVDATTEVTYDSLMRLNIDLDNNRAPKGTKMITGTRMIDTRTIPG
CRPLYCGSELIPTLKAMKDNHGNPAFISIEKYAAGGNTFIGEIGAIDQFRIIINPQMMHWQGAGKAVDPAADGYHDFNDK
YSIFPMLVISSEAFTTVGFQTDGKNVKFKIYNKKPGEATADRLDPYGEMGFMSIKWYYGFMVYRPEWIALIKTVARL
>A0A0U5AF03 ~~~~~~Major capsid protein~~~
MSQISKTHSRLAGRNAKPFDLKNITNDAVASLRRIGLVFDHAVVQDQIKALAKAGAFRSGSAMDSNFTAPVTTPSIPTPI
QFLQTWLPGLVKVMTAARKIDEIIGIDTVGSWEDQEIVQGIVEPAGTAVEYGDHTNIPLTSWNANFERRTIVRGELGMMV
GTLEEGRASAIRLNSAETKRQQAAIGLEIFRNAIGFYGWQSGLGNRTYGFLNDPNLPPFQTPPSQGWSTADWAGIIGDIR
EAVRQLRIQSQDQIDPKAEKITLALATSKVDYLSVTTPYGISVSDWIEQTYPKMRIVSAPELSGVQMKAQEPEDALVLFV
EDVNAAVDGSTDGGSVFSQLVQSKFITLGVEKRAKSYVEDFSNGTAGALCKRPWAVVRYLGI
>C0HJH8 ~~~~~~Major capsid protein~~~
MPLPVQGNYTNFARLTNEQKTVWSLQFWRQARNAAFINMFLGTDANSMIQQITELRRDEKGARAVITLIADMVGDGVVGD
NQLEGNEEALTAFDTVIQLDQMRAANVHEGRMADQRSIVNFRTTSRDMLAYWLADRMDQLAFLSLAGVSYAYRTNGALRG
SSPFPNLTFAADVTPPSANRRLRWDGTNKVLVPNAATSDVTAADTPSYALLVNLKAYAKTKYIRGLRGDGGEEMYHVFLD
PLAMAKLKLDPDYIANLRSGYTRGNVNPLFKGGIVTVDGLVIHEFRHVYNTRGMAPGAKWGASGNVDGCSMLFCGAQALG
FADIGNPRWVEKEFDYDNKHGISVAKILGFLKPQFPSIYEDGNTEDFGVINVYVAA
>D6RRG1 ~~~ORF4~~~Major capsid protein~~~
MRPIPSLQNNFEYTDLTEPMILIPNVWGLTQQLGIFGVDRTTQESVTLEEITKSFGLMEDIHRGARHQVGRDYDRQMRTF
AVPHFTYDDYITPRDIQGKRAYGKQELETLDQVRMRKLERLRGTHAATMEFARMHTLVTGKPYTPNNTVGGATGYDWYQE
FGKTRFEVNFELDTPTTNILEKSELVYAHMQDEAYTGGVVGDVIAICSPEFFSKLISHPTVVEAYKYYASQPQILRERLR
ARGFDARYREFYFGNVLYIEYRGGFQGRPGGEKRRYVPAGEAVFIPGSGTEDLFKTFFAPASKFEHVNTPGEESYAFEYV
DPKGEFLEINSETNFINVLMYPQLVVKGKAA
>D3WAC5 ~~~~~~Capsid protein~~~
MNKPDLIEKQNRLAELKENNVSLKSQISGFEVKNAIEDLPKVQELEKTLSENSIEIIKIENELNAQEEKPKGKDKMTNFI
ESQNAVTEFFDVLKKNSGKSEIKNAWSAKLAENGVTITDTTFQLPRKLVESINTALLNTNPVFKVFHVTNVGALLVSRSF
DSANEAQVHKDGQTKTEQAATLTIDTLEPVMVYKLQSLAERVKRLQMSYSELYNLIVAELTQAIVNKIVDLALVEGDGTN
GFKSIDKEADVKKIKKITTKAKSAGKTPFADAIEEAVDFVRPTAGRRYLIVKTEDRKALLDELRQATANANVRIKNDDTE
IASEVGVDEIIVYTGSKALKPTVLVDQKYHIDMQDLTKVDAFEWKTNSNMILVETLTSGHVETYNAGAVITVS
>P69541 ~~~VIII~~~Capsid protein G8P~~~
MKKSLVLKASVAVATLVPMLSFAAEGDDPAKAAFNSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFTSKAS
>Q05223 ~~~~~~Probable major capsid protein gp17~~~
MAVNPDRTTPFLGVNDPKVAQTGDSMFEGYLEPEQAQDYFAEAEKISIVQQFAQKIPMGTTGQKIPHWTGDVSASWIGEG
DMKPITKGNMTSQTIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDNAAINGTDSPFPTFLAQTTKEVSLV
DPDGTGSNADLTVYDAVAVNALSLLVNAGKKWTHTLLDDITEPILNGAKDKSGRPLFIESTYTEENSPFRLGRIVARPTI
LSDHVASGTVVGYQGDFRQLVWGQVGGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEYAFHCNDKDAFVKLTN
VDATEA
>A9CRA7 ~~~~~~Major capsid protein~~~
MPQGITKTSNQIIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEGEKIPTDILETKKRE
AKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKLTVNADITKLNGLQSAIDKFNDEDLEPM
VLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLEAGTAILAKKGAVKLILKRDFFLEVARDAS
TKTTALYSDKHYVAYLYDESKAVKITKGSGSLEM
>B2ZYY5 ~~~~~~Putative major capsid protein~~~
MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWADKP
GAYWVGEGQKIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIA
QSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERIYDRNSDSLDGLPVVNLKSSNLK
RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVAITCNYACSIAYR
>P03612 ~~~~~~Capsid protein~~~
MASNFTQFVLVDNGGTGDVTVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSAQNRKYTIKVEVPKVATQTVGGVELPV
AAWRSYLNMELTIPIFATNSDCELIVKAMQGLLKDGNPIPSAIAANSGIY
>Q9T1W1 ~~~T~~~Major capsid protein~~~
MIVTPASIKALMTSWRKDFQGGLEDAPSQYNKIAMVVNSSTRSNTYGWLGKFPTLKEWVGKRTIQQMEAHGYSIANKTFE
GTVGISRDDFEDDNLGIYAPIFQEMGRSAAVQPDELIFKLLKDGFTQPCYDGQNFFDKEHPVYPNVDGTGSAVNTSNIVE
QDSFSGLPFYLLDCSRAVKPLIFQERRKPELVARTRIDDDHVFMDNEFLFGASTRRAAGYGFWQMAVAVKGDLTLDNLWK
GWQLMRSFEGDGGKKLGLKPTHIVVPVGLEKAAEQLLNRELFADGNTTVSNEMKGKLQLVVADYL
>Q9T0R9 ~~~~~~Capsid protein~~~
MAKLQAITLSGIGKNGDVTLNLNPRGVNPTNGVAALSEAGAVPALEKRVTISVSQPSRNRKNYKVQVKIQNPTSCTASGT
CDPSVTRSAYADVTFSFTQYSTDEERALVRTELKALLADPMLIDAIDNLNPAY
>Q859Q5 ~~~~~~Major capsid protein~~~
MLNYNAPTDGQKSSIDGANSDQMQTFFWLKKAIITARKEQYFMPLASVTNMPKHYGKTIKVYEYVPLLDDRNINDQGIDA
SGATIVNGNLYGSSKDIGNITSKLPLLTENGGRVNRVGFTRIAREGSIHKFGFFYEFTQESIDFDSDDGLMEHLSRELMN
GATQITEAVLQKDLLAAAGTVLYAGAATSDATITGEGSTPSVVSYKNLMRLDQILTENRTPTQTTIITGSRMIDTKVIGA
TRVMYVGSELVPELKAMKDLFGNKAFIETQHYADAGTIMNGEVGSIDKFRIIQVPEMLHWAGAGAQATGANPGYRTSMVS
GQEHYDVYPMLVVGDDSFTSIGFQTDGKSLKFTVMTKMPGKETADRNDPYGETGFSSIKWYYGILVKRPERLALIKTVAP
L
>P25477 ~~~N~~~Capsid proteins~~~
MRQETRFKFNAYLSRVAELNGIDAGDVSKKFTVEPSVTQTLMNTMQESSDFLTRINIVPVSEMKGEKIGIGVTGSIASTT
DTAGGTERQPKDFSKLASNKYECDQINFDFYIRYKTLDLWARYQDFQLRIRNAIIKRQSLDFIMAGFNGVKRAETSDRSS
NPMLQDVAVGWLQKYRNEAPARVMSKVTDEEGRTTSEVIRVGKGGDYASLDALVMDATNNLIEPWYQEDPDLVVIVGRQL
LADKYFPIVNKEQDNSEMLAADVIISQKRIGNLPAVRVPYFPADAMLITKLENLSIYYMDDSHRRVIEENPKLDRVENYE
SMNIDYVVEDYAAGCLVEKIKVGDFSTPAKATAEPGA
>P26747 ~~~5~~~Major capsid protein~~~
MALNEGQIVTLAVDEIIETISAITPMAQKAKKYTPPAASMQRSSNTIWMPVEQESPTQEGWDLTDKATGLLELNVAVNMG
EPDNDFFQLRADDLRDETAYRRRIQSAARKLANNVELKVANMAAEMGSLVITSPDAIGTNTADAWNFVADAEEIMFSREL
NRDMGTSYFFNPQDYKKAGYDLTKRDIFGRIPEEAYRDGTIQRQVAGFDDVLRSPKLPVLTKSTATGITVSGAQSFKPVA
WQLDNDGNKVNVDNRFATVTLSATTGMKRGDKISFAGVKFLGQMAKNVLAQDATFSVVRVVDGTHVEITPKPVALDDVSL
SPEQRAYANVNTSLADAMAVNILNVKDARTNVFWADDAIRIVSQPIPANHELFAGMKTTSFSIPDVGLNGIFATQGDIST
LSGLCRIALWYGVNATRPEAIGVGLPGQTA
>P85500 3.4.-.-~~~~~~Capsid polyprotein~~~
MKTNRAYSTLEVKALDDEKRVITGIASTPSPDRMQDVVEPKGAQFKLPIPFLWQHNHDEPIGHVTEAKVTQKGIEVSVQL
TQVEEPGKLKDRLDEAWQSIKSGLVRGLSIGFSAKEFEQIPGSWGLRFLSWEWFELSAVTIPANAEATITSVKSIDREQR
AALGIKSVPVVRVTPAGASAIKTKTIKVPKPQEGNDMKTTAEQIAEFEATRVTKAAEMEAIMTKAAEAGETLDAEQSEQF
DTLEAEIAAIDKHIGRLKQMQKAQAANAKPVTEEAGAQRMANVKALDFKEVQVRAKNTQKLEPGIAFARAAKCLALGHLE
HRDAIGIAKSLYDGQDSIIAATQRLVTKAAVAAATTSDATWAGPLVGDETSVFADFVEYLRPQTILGRFGTNGIPSLRRV
PFRVPLIGQTSGGDGYWVGEGQAKPLTKFDFERKTLEPLKVANIAVATMEVIRDSSPSADVIIRDQLAAALRERLDIDFI
DPAKAAVAGVSPASILNGVAGIPSSGNTADDVRADIRALFNAFIAANNAPTSGVWLMPATTALALSLMQNPLGQAEFPGI
SMTGGTLFGLPVIVSEYIPTASAGAVVALVNASDIYLGDEGGVDLSMSTEASLQMDNAPDNPTTASTVLVSLWQRNLVGF
RAERAINWARRRASAVAYLTGVNWGAA
>P03621 ~~~VIII~~~Capsid protein G8P~~~
MKAMKQRIAKFSPVASFRNLCIAGSVTAATSLPAFAGVIDTSAVESAITDGQGDMKAIGGYIVGALVILAVAGLIYSMLR
KA
>P03623 ~~~VIII~~~Capsid protein G8P~~~
MQSVITDVTGQLTAVQADITTIGGAIIVLAAVVLGIRWIKAQFF
>Q6Y7S3 ~~~~~~Major capsid protein~~~
MTIEKNLSDVQQKYADQFQEDVVKSFQTGYGITPDTQIDAGALRREILDDQITMLTWTNEDLIFYRDISRRPAQSTVVKY
DQYLRHGNVGHSRFVKEIGVAPVSDPNIRQKTVSMKYVSDTKNMSIASGLVNNIADPSQILTEDAIAVVAKTIEWASFYG
DASLTSEVEGEGLEFDGLAKLIDKNNVINAKGNQLTEKHLNEAAVRIGKGFGTATDAYMPIGVHADFVNSILGRQMQLMQ
DNSGNVNTGYSVNGFYSSRGFIKLHGSTVMENELILDESLQPLPNAPQPAKVTATVETKQKGAFENEEDRAGLSYKVVVN
SDDAQSAPSEEVTATVSNVDDGVKLSINVNAMYQQQPQFVSIYRQGKETGMYFLIKRVPVKDAQEDGTIVFVDKNETLPE
TADVFVGEMSPQVVHLFELLPMMKLPLAQINASITFAVLWYGALALRAPKKWARIKNVRYIAV
>P13849 ~~~8~~~Major capsid protein~~~
MRITFNDVKTSLGITESYDIVNAIRNSQGDNFKSYVPLATANNVAEVGAGILINQTVQNDFITSLVDRIGLVVIRQVSLN
NPLKKFKKGQIPLGRTIEEIYTDITKEKQYDAEEAEQKVFEREMPNVKTLFHERNRQGFYHQTIQDDSLKTAFVSWGNFE
SFVSSIINAIYNSAEVDEYEYMKLLVDNYYSKGLFTTVKIDEPTSSTGALTEFVKKMRATARKLTLPQGSRDWNSMAVRT
RSYMEDLHLIIDADLEAELDVDVLAKAFNMNRTDFLGNVTVIDGFASTGLEAVLVDKDWFMVYDNLHKMETVRNPRGLYW
NYYYHVWQTLSVSRFANAVAFVSGDVPAVTQVIVSPNIAAVKQGGQQQFTAYVRATNAKDHKVVWSVEGGSTGTAITGDG
LLSVSGNEDNQLTVKATVDIGTEDKPKLVVGEAVVSIRPNNASGGAQA
>P07579 ~~~P8~~~Major outer capsid protein~~~
MLLPVVARAAVPAIESAIAATPGLVSRIAAAIGSKVSPSAILAAVKSNPVVAGLTLAQIGSTGYDAYQQLLENHPEVAEM
LKDLSFKADEIQPDFIGNLGQYREELELVEDAARFVGGMSNLIRLRQALELDIKYYGLKMQLNDMGYRS
>P03641 ~~~F~~~Capsid protein F~~~
MSNIQTGAERMPHDLSHLGFLAGQIGRLITISTTPVIAGDSFEMDAVGALRLSPLRRGLAIDSTVDIFTFYVPHRHVYGE
QWIKFMKDGVNATPLPTVNTTGYIDHAAFLGTINPDTNKIPKHLFQGYLNIYNNYFKAPWMPDRTEANPNELNQDDARYG
FRCCHLKNIWTAPLPPETELSRQMTTSTTSIDIMGLQAAYANLHTDQERDYFMQRYHDVISSFGGKTSYDADNRPLLVMR
SNLWASGYDVDGTDQTSLGQFSGRVQQTYKHSVPRFFVPEHGTMFTLALVRFPPTATKEIQYLNAKGALTYTDIAGDPVL
YGNLPPREISMKDVFRSGDSSKKFKIAEGQWYRYAPSYVSPAYHLLEGFPFIQEPPSGDLQERVLIRHHDYDQCFQSVQL
LQWNSQVKFNVTVYRNLPTTRDSIMTS
>P15794 ~~~II~~~Major capsid protein P2~~~
MRSFLNLNSIPNVAAGNSCSIKLPIGQTYEVIDLRYSGVTPSQIKNVRVELDGRLLSTYKTLNDLILENTRHKRKIKAGV
VSFHFVRPEMKGVNVTDLVQQRMFALGTVGLTTCEIKFDIDEAAAGPKLSAIAQKSVGTAPSWLTMRRNFFKQLNNGTTE
IADLPRPVGYRIAAIHIKAAGVDAVEFQIDGTKWRDLLKKADNDYILEQYGKAVLDNTYTIDFMLEGDVYQSVLLDQMIQ
DLRLKIDSTMDEQAEIIVEYMGVWSRNGF
>P03630 ~~~~~~Capsid protein~~~
MSKTIVLSVGEATRTLTEIQSTADRQIFEEKVGPLVGRLRLTASLRQNGAKTAYRVNLKLDQADVVDCSTSVCGELPKVR
YTQVWSHDVTIVANSTEASRKSLYDLTKSLVATSQVEDLVVNLVPLGR
>P22535 ~~~III~~~Major capsid protein P3~~~
MAQVQQLTPAQQAALRNQQAMAANLQARQIVLQQSYPVIQQVETQTFDPANRSVFDVTPANVGIVKGFLVKVTAAITNNH
ATEAVALTDFGPANLVQRVIYYDPDNQRHTETSGWHLHFVNTAKQGAPFLSSMVTDSPIKYGDVMNVIDAPATIAAGATG
ELTMYYWVPLAYSETDLTGAVLANVPQSKQRLKLEFANNNTAFAAVGANPLEAIYQGAGAADCEFEEISYTVYQSYLDQL
PVGQNGYILPLIDLSTLYNLENSAQAGLTPNVDFVVQYANLYRYLSTIAVFDNGGSFNAGTDINYLSQRTANFSDTRKLD
PKTWAAQTRRRIATDFPKGVYYCDNRDKPIYTLQYGNVGFVVNPKTVNQNARLLMGYEYFTSRTELVNAGTISTT
>P03616 ~~~~~~Capsid protein~~~
AQLQNLVLKDREATPNDHTFVPRDIRDNVGEVVESTGVPIGESRFTISLRKTSNGRYKSTLKLVVPVVQSQTVNGIVTPV
VVRTSYVTVDFDYDARSTTKERNNFVGMIADALKADLMLVHDTIVNLQGVY
>G9M952 ~~~~~~Major capsid protein~~~
MAEKSTKNETALLVAQSAKSALQDFNHTYSKSWTFGDKWDNSNTMFETFVNKFLFPKINETLLIDIALGNRFNWLAKEQD
FIGQYSEEYVIMDTVPINMDLSKNEELMLKRNYPRMATKLYGSGIVKKQKFTLNNNDTRFNFQTLADATNYALGVYKKKI
SDINVLEEKEMRAMLVDYSLNQLSESNVRKATSKEDLASKVFEAILNLQNNSAKYNEVHRASGGAIGQYTTVSKLKDIVI
LTTDSLKSYLLDTKIANTFQVAGIDFTDHVISFDDLGGVFKVTKDIVVSSDESVNFLRAYGDYQTHKGDTIPVGSVFTYD
VSKLSEFKDSVEEIKPKSDLYAFILDINSIKYKRYTKGMLKQPFYNGEFDEVTHWIHYYSFKAISPFFNKILITDQDVTP
RTE
>G9M973 ~~~~~~Major capsid protein~~~
MAEKSTKNETALLVAQSAKSALQDFNHTYSKSWTFGDKWDNSNTMFETFVNKFLFPKINETLLIDIALGNRFNWLAKEQD
FIGQYSEEYVIMDTVPINMDLSKNEELMLKRNYPRMATKLYGSGIVKKQKFTLNNNDTRFNFQTLADATNYALGVYKKKI
SDINVLEEKEMRAMLVDYSLNQLSESNVRKATSKQDLASKVFEAILNLQNNSAKYNEVHRASGGAIGQYTTVSKLKDIVI
LTTDSLKSYLLDTKIANTFQVAGIDFTDHVISFDDLGGVFKVTKDIVVSSDESVAFLRAYGDYQTHKGDTIPVGSVFTYD
VSNLSEFKSNVEEIKPKSDLYAFILDINSIKYKRYTKGMLKQPFYNGEFDEVTHWIHYYSFKAISPFFNKILITDQDVTP
RTE
>P03615 ~~~~~~Capsid protein~~~
MAKLETVTLGNIGKDGKQTLVLNPRGVNPTNGVASLSQAGAVPALEKRVTVSVSQPSRNRKNYKVQVKIQNPTACTANGS
CDPSVTRQAYADVTFSFTQYSTDEERAFVRTELAALLASPLLIDAIDQLNPAY
>P69170 ~~~~~~Capsid protein~~~
ASNFTQFVLVNDGGTGNVTVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSAQNRKYTIKVEVPKVATQTVGGVELPVA
AWRSYLNMELTIPIFATNSDCELIVKAMQGLLKDGNPIPSAIAANSGIY
>V5XWI9 ~~~~~~Major capsid protein~~~
MTIEKNLSDVQQKYADQFQEDVVKSFQTGYGITPDTQIDAGALRREILDDQITMLTWTNEDLIFYRDISRRPAQSTVVKY
DQYLRHGNVGHSRFVKEIGVAPVSDPNIRQKTVSMKYVSDTKNMSIASGLVNNIADPSQILTEDAIAVVAKTIEWASFYG
DASLTSEVEGEGLEFDGLAKLIDKNNVINAKGNQLTEKHLNEAAVRIGKGFGTATDAYMPIGVHADFVNSILGRQMQLMQ
DNSGNVNTGYSVNGFYSSRGFIKLHGSTVMENELILDESLQPLPNAPQPAKVTATVETKQKGAFENEEDRAGLSYKVVVN
SDDAQSAPSEEVTATVSNVDDGVKLSISVNAMYQQQPQFVSIYRQGKETGMYFLIKRVPVKDAQEDGTIVFVDKNETLPE
TADVFVGEMSPQVVHLFELLPMMKLPLAQINASITFAVLWYGALALRAPKKWARIKNVRYIAV
>V5XVW4 ~~~~~~Major capsid protein~~~
MTIEKNLSDVQQKYADQFQEDVVKSFQTGYGITPDTQIDAGALRREILDDQITMLTWTNEDLIFYRDISRRPAQSTVVKY
DQYLRHGNVGHSRFVKEIGVAPVSDPNIRQKTVSMKYVSDTKNMSIASGLVNNIADPSQILTEDAIAVVAKTIEWASFYG
DASLTSEVEGEGLEFDGLAKLIDKNNVINAKGNQLTEKHLNEAAVRIGKGFGTATDAYMPIGVHADFVNSILGRQMQLMQ
DNSGNVNTGYSVNGFYSSRGFIKLHGSTVMENELILDESLQPLPNAPQPAKVTATVETKQKGAFENEEDRAGLSYKVVVN
SDDAQSAPSEEVTATVSNVDDGVKLSISVNAMYQQQPQFVSIYRQGKETGMYFLIKRVPVKDAQEDGTIVFVDKNETLPE
TADVFVGEMSPQVVHLFELLPMMKLPLAQINASITFAVLWYGALALRAPKKWARIKNVRYIAV
>P85987 ~~~~~~Major capsid protein~~~
MAAYQTYTMAGIKEDFADWVSNISPEYTPLISMIRKFPVHNTMFQWQWDVLKDVDTENQHNEASDAKDVELTPTTVVQNY
VQIMRKVVFVSDSANAVSSHGREKELFYQLKKAAKELKRDNEGIFLLKDRAGDAGSATKPRLTASFGSLIDASMKKTADL
DEATLFEMTAKLYTEGADPTLIMYHPSNANFFASLQEKSGTRMRIFENDKRFVKQVEYIVDPLGQELKCIPNRWCPEDAT
YIFNPSDLGMAVLRAPKKVALAKSGSAEKYMIEQEVGFRLNNPKAAALIIGKYKEGGNGGGESVKS
>P85990 ~~~~~~Major capsid protein~~~
MENITRELFDQYISRQAQLNRVSPAAVAAKFAVDPTVQQKLEAAAQESDSFLSKINVFGVTQQIGQKVLIGSKGPLAGVN
NSTTTRRNPADNSKMEPYDYMCRKVNYDYGISYEQLDAWAHQPNFQPLISSAMARQMSLDRIMIGFNGTSYADPSNRAAN
PLLQDCGIGFLEKIRKEAPHRVISNITVTSRDEDNKIITKGTYGNVSAAVYDAKNSLMDEWHKRNPDNVVILAGDLLTTS
NFPAINAMSQTNPNTEMLAGQLIVAQERVGNMPTFIAPFFPVNGILITPFKNLSIYYQRGGLRRTIKEEPEYNRVATYQS
SNDDFVVEDYGNVAFIDGITFAQPENGG
>P85989 ~~~~~~Major capsid protein~~~
MSKKLVTEEMRTQWLPVLEKKSEQIQPLTAENVSVRLLQNQAEWNAKNLGESEGPSSVNANVGKWQPVLIDMAKRLAPNN
IAMDFFGVQPLAGPDGQIFALRARQGVGDASNTQQSRKELFMEEAQTNYSGDQTTVHSGDPSGFSQADIEGSGTEVSSYG
KAMDTVKAEQLGSPTQPWARVGITIQKATVTAKSRGLYADYSHELRQDMMAIHGEDVDAILSDVMVTEIQAEMNREFIRT
MNFTAVRFKKFGTNGVVDVAADVSGRWALEKWKYLVFMLEVEANGVGVDTRRGKANRVLCSPNVASALAMAGMLDYSPAL
NVQAQLAVDPTGQTFAGVLSNGMRVYIDPYAVAEYITLAYKGATALDAGIYFAPYVPLEMYRTQGETTFAPRMAFKTRYG
IAANPFVQIPANQDPQVYVTEDGIAKDTNVYFRKGLIKNLY
>P09673 ~~~~~~Capsid protein~~~
MAKLNQVTLSKIGKNGDQTLTLTPRGVNPTNGVASLSEAGAVPALEKRVTVSVAQPSRNRKNFKVQIKLQNPTACTRDAC
DPSVTRSAFADVTLSFTSYSTDEERALIRTELAALLADPLIVDAIDNLNPAY
>Q38582 ~~~~~~Major capsid protein~~~
MAYTKISDVIVPELFNPYVINTTTQLSAFFQSGIAATDDELNALAKKAGGGSTLNMPYWNDLDGDSQVLNDTDDLVPQKI
NAGQDKAVLILRGNAWSSHDLAATLSGSDPMQAIGSRVAAYWAREMQKIVFAELAGVFSNDDMKDNKLDISGTADGIYSA
ETFVDASYKLGDHESLLTAIGMHSATMASAVKQDLIEFVKDSQSGIRFPTYMNKRVIVDDSMPVETLEDGTKVFTSYLFG
AGALGYAEGQPEVPTETARNALGSQDILINRKHFVLHPRGVKFTENAMAGTTPTDEELANGANWQRVYDPKKIRIVQFKH
RLQA
>Q6QGD8 ~~~~~~Major capsid protein~~~
MTIDINKLKEELGLGDLAKSLEGLTAAQKAQEAERMRKEQEEKELARMNDLVSKAVGEDRKRLEEALELVKSLDEKSKKS
NELFAQTVEKQQETIVGLQDEIKSLLTAREGRSFVGDSVAKALYGTQENFEDEVEKLVLLSYVMEKGVFETEHGQRHLKA
VNQSSSVEVSSESYETIFSQRIIRDLQKELVVGALFEELPMSSKILTMLVEPDAGKATWVAASTYGTDTTTGEEVKGALK
EIHFSTYKLAAKSFITDETEEDAIFSLLPLLRKRLIEAHAVSIEEAFMTGDGSGKPKGLLTLASEDSAKVVTEAKADGSV
LVTAKTISKLRRKLGRHGLKLSKLVLIVSMDAYYDLLEDEEWQDVAQVGNDSVKLQGQVGRIYGLPVVVSEYFPAKANSA
EFAVIVYKDNFVMPRQRAVTVERERQAGKQRDAYYVTQRVNLQRYFANGVVSGTYAAS
>P03622 ~~~VIII~~~Capsid protein G8P~~~
SGVGDGVDVVSAIEGAAGPIAAIGGAVLTVMVGIKVYKWVRRAM
>P03618 ~~~VIII~~~Capsid protein G8P~~~
AEGDDPAKAAFDSLQASATEYIGYAWAMVVVIVGATIGIKLFKKFASKAS
>P69171 ~~~~~~Capsid protein~~~
MASNFTQFVLVNDGGTGNVTVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSAQNRKYTIKVEVPKVATQTVGGVELPV
AAWRSYLNMELTIPIFATNSDCELIVKAMQGLLKDGNPIPSAIAANSGIY
>P04866 ~~~~~~Capsid protein~~~
MPNVSLTAKGGGHYIEDQWDTQVVEAGVFDDWWVHVEAWNKFLDNLRGINFSVASSRSQVAEYLAALDRDLPADVDRRFA
GARGQIGSPNYLPAPKFFRLDKRTIAELTRLSRLTDQPHNNRDIELNRAKRATTNPSPPAQAPSENLTLRDVQPLKDSAL
HYQYVLIDLQSARLPVYTRKTFERELALEWIIPDAEEA
>O72120 ~~~ORF2~~~Capsid protein~~~
MARYLELNPQNYSDEEYDYDSYNPFPNFEKNLASHYGTDFVPRINLDDFFLDDEDFEFCDDPLNCCFPDYLASLGEEEFI
YEGDEPYIVLKHQLVSSTMWDDGTFTYPILPPFKTSSISYFLPKPGEVLHRCLMAVAKGMDPDLQVAVGTEFQFRAESDS
SHPPDITTEDQGTVVATGPQPSAPAMATLATAATGTMPEEWKNFFSYYTTINWATTDETGKVLFVQNLAPRMNPFLDHIA
KMYTGWSGSMEVRFTISGSGVFGGKVAAVLVPPGISTEGGTNLLQFPHVLVDARQTEPVIFTIPDIRTQLWHDMHDTSTS
HLVILVYNDLVNPFQGGENGTSCTITVETRGGTDFEFHLLKPPTRKMIFGADPSRLIPRRSQFWEGNRLPGVITSFVCLP
RMFQANRHFDCKRQTFGWSRPVHKGIEVRVDATNKDAANTTDIGIHVVTARNAIKSDIPDGWPDYYRTGEQVYNNTTQTF
QEVKESVMGSAVPDSTATAMTWHHLPTVVFGHGTAVGSKTTNSKVLSGNFYAIGNFDQSGNIKLYPSYWIAKEQSAGGAP
IGAYEDMVKRIDVLPTAQTTGGNFPVAFVSKFASSHNGNGVSVYNSQILTTSALLAQDVYDIGPNALAVFKIKGSGGYWF
DLGISADGFSYVGGGNLNFSSLQFPLEATYVGMASLHNKLQYNLGGSATTL
>P03542 ~~~ORF IV~~~Capsid protein~~~
MAESILDRTINRFWYNLGEDCLSESQFDLMIRLMEESLDGDQIIDLTSLPSDNLQVEQVMTTTEDSISEEESEFLLAIGE
TSEEESDSGEEPEFEQVRMDRTGGTEIPKEEDGEGPSRYNERKRKTPEDRYFPTQPKTIPGQKQTSMGMLNIDCQTNRRT
LIDDWAAEIGLIVKTNREDYLDPETILLLMEHKTSGIAKELIRNTRWNRTTGDIIEQVIDAMYTMFLGLNYSDNKVAEKI
DEQEKAKIRMTKLQLCDICYLEEFTCDYEKNMYKTELADFPGYINQYLSKIPIIGEKALTRFRHEANGTSIYSLGFAAKI
VKEELSKICDLSKKQKKLKKFNKKCCSIGEASTEYGCKKTSTKKYHKKRYKKKYKAYKPYKKKKKFRSGKYFKPKEKKGS
KQKYCPKGKKDCRCWICNIEGHYANECPNRQSSEKAHILQQAEKLGLQPIEEPYEGVQEVFILEYKEEEEETSTEESDGS
STSEDSDSD
>P04383 ~~~ORF4~~~Capsid protein~~~
MENKGEKIAMNPTVQTLAQKGDKLAVKLVTRGWASLSTNQKRRAEMLAGYTPAILAFTPRRPRMTNPPPRTSRNSPGQAG
KSMTMSKTELLSTVKGTTGVIPSFEDWVVSPRNVAVFPQLSLLATNFNKYRITALTVKYSPACSFETNGRVALGFNDDAS
DTPPTTKVGFYDLGKHVETAAQTAKDLVIPVDGKTRFIRDSASDDAKLVDFGRIVLSTYGFDKADTVVGELFIQYTIVLS
DPTKTAKISQASNDKVSDGPTYVVPSVNGNELQLRVVAAGKWCIIVRGTVEGGFTKPTLIGPGISGDVDYESARPIAVCE
LVTQMEGQILKITKTSAEQPLQWVVYRM
>P54089 ~~~VP1~~~Capsid protein~~~
MARRARRPRGRFYAFRRGRWHHLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGVIFLTE
GLILPKNSTAGGYADHMYGARVAKISVNLKEFLLASMNLTYVSKIGGPIAGELIADGSKSQAAENWPNCWLPLDNNMPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWRKGGQPMTGGIAYATGKMRPDEQQYPAMPPDPPIITTTTAQGTQVRCMNSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMSAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDAWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWETQP
>P54090 ~~~VP1~~~Capsid protein~~~
MERRARRPRGRFYAFRRGRWNHLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGVIFLTE
GLILPKNSTAGGYADHMYGARVAKISVNLKEFLLASMNLTYVSKLGGPIAGELIADGSKSEAAENWPNCCVPLDNNVPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWRKGGQPMTGGIAYATGKMRPDEQQYPAMPPDPPIITSTTAQGTQVRCMNSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMGAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDSWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWGTQP
>P54087 ~~~VP1~~~Capsid protein~~~
MARRARRPRGRFYAFRRGRWHHLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGVIFLTE
GLILPKNSTAGGYADHMYGARVAKISVNLKEFLLASMNLTYVSKIGGPIAGELIADGSKSQAAENWPNCWLPLDNNVPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWRKGEQPMTGGIAYATGKMRPDEQQYPAMPPDPPIITSTTAQGTQVRCMYSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMGAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDSWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWGTQP
>Q99153 ~~~VP1~~~Capsid protein~~~
MARRARRPRGRFYSFRRGRWHHLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGVIFLTE
GLILPKNSTAGGYADHMYGARVAKISVNLKEFLLASMNLTYVSKIGGPIAGELIADGSKSQAADNWPNCWLPLDNNVPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWQKGGQPMTGGIAYATGKMRPDEQQYPAMPPDPPIITATTAQGTQVRCMNSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMSAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDAWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWGTQP
>Q9IZU5 ~~~VP1~~~Capsid protein~~~
MARRARRPRGRFYAFRRGRWHHLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGVIFLTE
GLILPKNSTAGGYADHMYGARVAKISVNLKEFLLASMNLTYVSKIGGPIAGELIADGSKSQAAENWPNCWLPLDNNVPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWRKGEQPMTGGIAYATGKMRLDEQQYPAMPPDPPIITTTTAQGTQVRCMNSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMRAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDSWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWGSQP
>P54088 ~~~VP1~~~Capsid protein~~~
MARRARRPRGRFYAFRRGRWHNLKRLRRRYKFRHRRRQRYRRRAFRKAFHNPRPGTYSVRLPNPQSTMTIRFQGIIFLTE
GLILPKNSTAGGYADHLYGARVAKISVNLKEFLLASMNLTYVSKIGGPIAGELIADGSQSQAAQNWPNCWLPLDNNVPSA
TPSAWWRWALMMMQPTDSCRFFNHPKQMTLQDMGRMFGGWHLFRHIETRFQLLATKNEGSFSPVASLLSQGEYLTRRDDV
KYSSDHQNRWRKGEQPMTGGIAYATGKMRPDEQQYPAMPPDPPIITATTAQGTQVRCMNSTQAWWSWDTYMSFATLTALG
AQWSFPPGQRSVSRRSFNHHKARGAGDPKGQRWHTLVPLGTETITDSYMSAPASELDTNFFTLYVAQGTNKSQQYKFGTA
TYALKEPVMKSDAWAVVRVQSVWQLGNRQRPYPWDVNWANSTMYWGSQP
>P03601 ~~~ORF3b~~~Capsid protein~~~
MSTVGTGKLTRAQRRAAARKNKRNTRVVQPVIVEPIASGQGKAIKAWTGYSVSKWTASCAAAEAKVTSAITISLPNELSS
ERNKQLKVGRVLLWLGLLPSVSGTVKSCVTETQTTAAASFQVALAVADNSKDVVAAMYPEAFKGITLEQLTADLTIYLYS
SAALTEGDVIVHLEVEHVRPTFDDSFTPVY
>Q66012 ~~~ORF3~~~Capsid protein~~~
MMVRKGAATKAPQQPKPKAQQQPGGRRRRRGRSMEPVSRPLNPPAAVGSTLKAGRGRTAGVSDWFDTGMITSYLGGFQRT
AGTTDSQVFIVSPAALDRVGTIAKAYALWRPKHWEIVYLPRCSTQTDGSIEMGFLLDYADSVPTNTRTMASSTSFTTSNV
WGGGDGSSLLHTSVKSMGNAVTSALPCDEFSNKWFKLSWSTPEESENAHLTDTYVPARFVVRSDFPVVTADQPGHLWLRS
RILLKGSVSPSTNL
>P69475 ~~~CP~~~Capsid protein~~~
MAYNPITPSKLIAFSASYVPVRTLLNFLVASQGTAFQTQAGRDSFRESLSALPSSVVDINSRFPDAGFYAFLNGPVLRPI
FVSLLSSTDTRNRVIEVVDPSNPTTAESLNAVKRTDDASTAARAEIDNLIESISKGFDVYDRASFEAAFSVVWSEATTSK
A
>P03561 ~~~AR1~~~Capsid protein~~~
MSKRPGDIIISTPGSKVRRRLNFDSPYRNRATAPTVHVTNRKRAWVNRPMYRKPTMYRMYRSPDIPRGCEGPCKVQSFEQ
RDDVKHLGICKVISDVTRGPGLTHRVGKRFCIKSIYILGKIWLDETIKKQNHTNNVIFYLLRDRRPYGNAPQDFGQIFNM
FDNEPSTATIKNDLRDRFQVLRKFHATVVGGPYGMKEQALVKRFYRLNHHVTYNHQEAGKYENHTENALLLYMACTHASN
PVYATLKIRIYFYDSIGN
>Q66154 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRPRRGSRSASSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCKPGYTFSSITLNPPKI
DRGSYYGKRLLLPDSVTEFDKKLVSRIQIRVNPLPKFDSTVWVTVRKVPASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLYDLSAMRADIGDMRKYAVLVYSKDDALETDELVLHVDVEHQRIPTSGVLPV
>P21368 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNHRRRPRRGSRSAPSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCRPGYTFTSITLKPPKI
DRESYYGKRLLLPDSVTEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVPASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLFDLSAMRADIGDMRKYAVLVYSKDDALETDELVLHVDIEHQRIPTSGVLPV
>Q00259 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRPRRGSRSAPSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCRPGYTFTSITLKPPKI
DRGSYYGKRLLLPDSVTEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVSASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLYDLSAMRADIGDMRKYAVLVYSKDDALETDELVLHVDIEHQRIPTSGVLPV
>P69466 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRPRRGSRSAPSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCRPGYTFTSITLKPPKI
DRGSYYGKRLLLPDSVTEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVPASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLYDLSAMRADIGDMRKYAVLVYSKDDALETDELVLHVDIEHQRIPTSGVLPV
>Q06934 ~~~ORF3b~~~Capsid protein~~~
MDKSGSPNASRTSRRRRPRRGSRSASGADAGLRALTQQMLKLNKTLAIGRPTLNHPTFVGSESCKPGYTFTSITLKPPEI
EKGSYFGRRLSLPDSVTDYDKKLVSRIQIRINPLPKFDSTVWVTVRKVPSSSDLSVAAISAMFGDGNSPVLVYQYAASGV
QANNKLLYDLSEMRADIGDMRKYAVLVYSKDDNLEKDEIVLHVDVEHQRIPISRMLPT
>P16489 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRPRRGSRSAPSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCKPGYTFTSITLKPPKI
DRGSYYGKRLLLPDSVTEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVPASSDLSVAAISAMFADGASPVLVYQYAAFGV
QANNKLLYDLSAMRADIGDMRKYAVLVYSKDDALETDELVLHVDIEHQRIPTSRVLPV
>Q00261 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRPRRGSRSASSSADANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCRPGYTFTSITLKPPKI
DRGSYYGKRLLLPDSVTEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVLASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLYDLSAMRADIGDMRKYAILVYSKDDALETDELVLHVDIEHQRIPTSGVLPV
>P03605 ~~~ORF3b~~~Capsid protein~~~
MDKSGSPNASRTSRRRRPRRGSRSASGADAGLRALTQQMLRLNKTLAIGRPTLNHPTFVGSESCKPGYTFTSITLKPPEI
EKGSYFGRRLSLPDSVTDYDKKLVSRIQIRINPLPKFDSTVWVTVRKVPSSSDLSVAAISAMFGDGNSPVLVYQYAASGV
QANNKLLYDLSEMRADIGDMRKYAVLVYSKDDKLEKDEIVLHVDVEHQRIPISRMLPT
>Q83253 ~~~ORF3b~~~Capsid protein~~~
MDKSGSPNASRTSRRRRPRRGSRSASGADAGLRALTQQMLKLNKTLAIGRPTLNHPTFAGSESCKPGYTFTSITLKPPEI
EKGSYFGRRLSLPDSVTDYDKKQVSRIQIRINPLPKFDSTVWVTVRKVPSSSDLSVAAITAMFGDGKSPVLVYQYAASGV
QANNKLLYNLSEMRADIGDMRKYAVLVYSKDDKLEKDEIVLHVDVEHQRIPISRMLPT
>P24147 ~~~ORF3b~~~Capsid protein~~~
MDKSGSPNASRTSRRRRPRRGSRSASGADAGLRALTQQMLKLNRTLAIGRPTLNHPTFVGSESCKPGYTFTSITLKPPEI
EKGSYFGRRLSLPDSVTDYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVPSSSDLSVAAISAMFGDGNSPVLVYQYAASGV
QANNKLLYDLSEMRADIGDMRKYAVLVYSKDDKLEKDEIALHVDVEHQRIPISRMLPT
>P18027 ~~~ORF3b~~~Capsid protein~~~
MDKSESTSAGRNRRRRLRRGSRSASSSSDANFRVLSQQLSRLNKTLAAGRPTINHPTFVGSERCKPGYTFTSITLRPPKI
DRESYYGKRLLLPDSVMEYDKKLVSRIQIRVNPLPKFDSTVWVTVRKVSASSDLSVAAISAMFADGASPVLVYQYAASGV
QANNKLLYDLSAMRADIGDMRKYAVLVYSKDDTLETDELVLHVDVEHQRIPTSGVLPV
>P15183 ~~~ORF2~~~Capsid protein~~~
MALVSRNNNMRTLAKLAAPLATAGTRTIVDNKEAIWNGVKWIWGKLPKGKKGKNGNGALIAHPQAFPGAIAAPISYAYAV
KGRKPRFQTAKGSVRITHREYVSVLSGTNGEFLRNNGTGPNNDFSINPLNPFLFPWLVNIAANFDQYKFNSLRFEYVPLV
NTTTNGRVALYFDKDSEDPGPDDRAALANYAHLSEISPWAITKLTVPTDNVKRFISDTSSGDPKLINLGQFGWVAYSGPT
AELGDIFVEYTVDLFEAQPTSPLLESLFRESASSVQTRMGLPYFSLEVASATDLVWQARVPGTYVVTIIFNSTVGGLTPS
ISGGGTINSSFSVSTAGSSAYVANITIRVNANLSLSGLTGATNAQLFAVRAITENAVQVV
>Q6TS43 ~~~~~~Capsid protein VP1~~~
MHSTNNNSNKRNNEEKHKQPEIDSSANNGEGTSGTRAQTVGDTATEAGVRNETEAGASTRRQTDGTGLSGTNAKIATASS
ARQADVEKPADVTFTIENVDDVGIMQQKKPPTVVQSRTDVFNEQFANEALHPTTKVIFNGLDVNTEVQPLSDDFKQISDP
KGYLTYSVKYEDQFTKKDKLRASEADDRIVGPTVNLFKYGAAVVNIDLNRDFFDTATGIDLTKGIPLVQDLLVPIGVTAG
AEQSAEYVSGLLMVLFKVMTDNRLVIVGETTTPMSNTLSTVVNNVLRTTYHNNVGVNPALLRDFTQVNWLNRDITNMLQQ
AGTKYGLGLTETRLDYVRLVKTIVGHALNIDHFAASVLNINLRALMEANVTADDRIKALQAHSMISTQFHGPNQGALRPE
LAFDHDHIIRCLMLAAANYPRLEGIIVQINTGYVASANVIRPVSEKRYFPENLEQNQSAARLVSAVKARASEADISSIHL
AIAREVSPMFNVHELKKIAESFEDPSSIVVVLEFILFALFFPTEFNRIKGDIQNVLLLFFSRWYPVEYGIFVQRGATYTI
NAAGEFEFSGRNEKWDQALYLSEHFPALFSDVPLAGANTIIAIMRLFTPQGFLRTDDLAIAANFPRASRNPQTYIPYTNQ
RGTVTNEFASRFRTIVATLANVVNERAVQDDMQKATRSCTKQWLRHLETQFDNIAVAHTDHLSVVYATMSNFMLNFTNNF
SGNHATFKPDQYVITSPEGSYKPIIERQGETVDGLTIIDTSIVWPILCQCTYPLVRQSGKGVDAVSIMEEIVYPDPSTTL
SQSLSVAQVLSKLTLPDAFINMILSGGDSVVMRTYQTEADDDLDEGIRMTTYDQYLSHIRERLHITNVPDPIYITGASTP
DQIAASVQATHVAVVLYQSGVINGPASTYLRENEVLVVMPDYYDVVSRFANANLQMNNNRYHESVLEIADIFDQADFIQT
SDAVRQLRALMPTLSTSQIRHAIERIAQITDVDSTDYGKLTLRFLGTLTRSLKMQNAQIRRIRPDGTVLRYDDQIDIEAF
RWSRYFLDELQLRRLSVGLRLITNPRIARRFNGVRIMYLTDDDPDPDFVPDVPEGYVAVQYAHRLFSSSLANKRNRVTYT
HPPTGMAYPSPTGRPHVHMTINERAGMSKLVADNIIASVIKSNWVVDILDIEYTAEVMTPSEGYTQHVDAESIMTAPKGK
LFHLQFMDGLLRPEPSAFDPPASGEDMRLIYPLQPISVARSMRAIVNHNEVDRPRGAVAPSSYEMDTGTLSRNGDLLYSP
VANGQVGIPKLEVDHISFSNVVSMMTANIRTGDDMAVERVNPDDVRAINIRNA
>Q00686 ~~~~~~Capsid protein~~~
MDDETKKLKNKNKETKEGDDVVAAESSFSSVNLHIDPTLITMNDVRQLSTQQNAALNRDLFLTLKGKHPNLPDKDKDFRI
AMMLYRLAVKSSSLQSDDDATGITYTREGVEVDLSDKLWTDVVFNSKGIGNRTNALRVWGRTNDALYLAFCRQNRNLSYG
GRPLDAGIPAGYHYLCADFLTGAGLTDLECAVYIQAKEQLLKKRGADDVVVTNVRQLGKFNTR
>P0C6J7 ~~~C~~~Capsid protein~~~
MDINASRALANVYDLPDDFFPKIDDLVRDAKDALEPYWKSDSIKKHVLIATHFVDLIEDFWQTTQGMHEIAESLRAVIPP
TTTPVPPGYLIQHEEAEEIPLGDLFKHQEERIVSFQPDYPITARIHAHLKAYAKINEESLDRARRLLWWHYNCLLWGEAQ
VTNYISRLRTWLSTPEKYRGRDAPTIEAITRPIQVAQGGRKTTTGTRKPRGLEPRRRKVKTTVVYGRRRSKSRERRAPTP
QRAGSPLPRSSSSHHRSPSPRK
>P27404 ~~~ORF2~~~Capsid protein~~~
MCSTCANVLKYYDWDPHIKLVINPNKFLHVGFCDNPLMCCYPELLPEFGTMWDCDQSPLQVYLESILGDDEWSSTHEAID
PVVPPMHWDEAGKIFQPHPGVLMHHLICKVAEGWDPNLPLFRLEADDGSITTPEQGTMVGGVIAEPNAQMSTAADMATGK
SVDSEWEAFFSFHTSVNWSTSETQGKILFKQSLGPLLNPYLTHLAKLYVAWSGSVDVRFSISGSGVFGGKLAAIVVPPGI
DPVQSTSMLQYPHVLFDARQVEPVIFSIPDLRSTLYHLMSDTDTTSLVIMVYNDLINPYANDSNSSGCIVTVETKPGPDF
KFHLLKPPGSMLTHGSIPSDLIPKSSSLWIGNRFWSDITDFVIRPFVFQANRHFDFNQETAGWSTPRFRPITITISVKES
AKLGIGVATDYIVPGIPDGWPDTTIPGELVPVGDYAITNGTNNDITTAAQYDAATEIRNNTNFRGMYICGSLQRAWGDKK
ISNTAFITTGTVDGAKLIPSNTIDQTKIAVFQDTHANKHVQTSDDTLALLGYTGIGEEAIGADRDRVVRISVLPERGARG
GNHPIFHKNSIKLGYVIRSIDVFNSQILHTSRQLSLNHYLLSPDSFAVYRIIDSNGSWFDIGIDNDGFSFVGVSSIGKLE
FPLTASYMGIQLAKIRLASNIRSVMTKL
>P27405 ~~~ORF2~~~Capsid protein~~~
MCSTCANVLKYYDWDPHFRLIINPNKFLPIGFCDNPLMCCYPDLLPEFGTVWDCDQSPLQIYLESILGDDEWASTHEAID
PSVPPMHWDSAGKIFQPHPGVLMHHLIGEVAKAWDPNLPLFRLEADDGSITTPEQGTAVGGVIAEPSAQMSTAADMASGK
SVDSEWEAFFSFHTSVNWSTSETQGKILFKQSLGPLLNPYLEHLSKLYVAWSGSIEVRFSISGSGVFGGKLAAIVVPPGV
DPVQSTSMLQYPHVLFDARQVEPVIFTIPDLRSTLYHVMSDTDTTSLVIMVYNDLINPYANDSNSSGCIVTVETKPGPDF
KFHLLKPPGSVLTHGSIPSDLIPKSSSLWIGNRYWTDITDFVIRPFVFQANRHFDFNQETAGWSTPRFRPITITISEKNG
SKLGIGVATDYIIPGIPDGWPDTTIADKLIPAGDYSITTGEGNDIKTAQAYDTAAVVKNTTNFRGMYICGSLQRAWGDKK
ISNTAFITTAIRDGNEIKPSNTIDMTKLAVYQDTHVEQEVQTSDDTLALLGYTGIGEEAIGSNRDRVVRISVLPEAGARG
GNHPIFYKNSIKLGYVIRSIDVFNSQILHTSRQLSLNHYLLPPDSFAVYRIIDSNGSWFDIGIDSEGFSFVGVSDIGKLE
FPLSASYMGIQLAKIRLASNIRSRMTKL
>P27406 ~~~~~~Capsid protein~~~
MCSTCANVLKYYDWDPHFKLVINPNNFLSVGFCSNPLMCCYPELLPEFGTVWDCDRSPLEIYLESILGDDEWASTFDAVD
PVVPPMHWGAAGKIFQPHPGVLMHHLIGKVAAGWDPDLPLIRLEADDGSITAPEQGTMVGGVIAEPSAQMSTAADMATGK
SVDSEWEAFFSFHTSVNWSTSETQGKILFKQSLGPLLNPYLEHLAKLYVAWSGSIEVRFSISGSGVFGGKLAAIVVPPGV
DPVQSTSMLQYPHVLFDARQVEPVIFCLPDLRSTLYHLMSDTDTTSLVIMVYNDLINPYANDANSSGCIVTVETKPGPDF
KFHLLKPPGSMLTHGSIPSDLIPKTSSLWIGNRYWSDITDFVIRPFVFQANRHFDFNQETAGWSTPRFRPISVTITEQNG
AKLGIGVATDYIVPGIPDGWPDTTIPGELIPAGDYAITNGTGNDITTATGYDTADIIKNNTNFRGMYICGSLQRAWGDKK
ISNTAFITTATLDGDNNNKINPCNTIDQSKIVVFQDNHVGKKAQTSDDTLALLGYTGIGEQAIGSDRDRVVRISTLPETG
ARGGNHPIFYKNSIKLGYVIRSIDVFNSQILHTSRQLSLNHYLLPPDSFAVYRIIDSNGSWFDIGIDSDGFSFVGVSGFG
KLEFPLSASYMGIQLAKIRLASNIRSPMTKL
>Q66915 ~~~ORF2~~~Capsid protein~~~
MCSTCANVLKYYNWDPHFKLVINPNKFLSIGFCDNPLMCCYPELLPEFGTVWDCDQSPLQIYLESILGDDEWSSTYEAID
PVVPPMHWNEAGKIFQPHPGVLMHHIIGEVAKAWDPNLPLFRLEADDGSITAPEQGTVVGGVIAEPSSQMSTAADMASGK
SVDSEWEAFFSFHTSVNWSTSETQGKILFKQSLGPLLNPYLEHLSKLYVAWSGSVEVRFSISGSGVFGGKLAAIVVPPGV
DPIQSTSMLQYPHVLFDARQVEPVIFTIPDLRSTLYHLMSDTDTTSLVIMVYNDLINPYANDSNSSGCIVTVETKPGSDF
KFHLLKPPGSMLTHGSVPSDLIPKTSSLWIGNRFWSDITDFVIRPFVFQANRHFDFNQETAGWSTPRFRPITVTISEKNG
AKLGVGVATDFIVPGIPDGWPDTTIGEKLVPAGDYAITNGSGNDITTANQYDAADIIRNNTNFKGMYICGSLQRAWGDKK
ISNTAFITTATVEGNDLIPSNVIDQTKIAIFQDNHVQDEVQTSDDTLALLGYTGIGEEAIGANRERVVRISTLPETGARG
GNHPIFYKNSIKLGYVIRSIDVFNSQILHTSRQLSLNHYLLPPDSFAVYRIIDSNGSWFDVGIDFDGFSFVGVSDVGKLE
FPLTASYMGIQLAKIRLASNIRSTMTKL
>P12870 3.4.23.44~~~alpha~~~Capsid protein alpha~~~
MVNNNRPRRQRAQRVVVTTTQTAPVPQQNVPRNGRRRRNRTRRNRRRVRGMNMAALTRLSQPGLAFLKCAFAPPDFNTDP
GKGIPDRFEGKVVSRKDVLNQSISFTAGQDTFILIAPTPGVAYWSASVPAGTFPTSATTFNPVNYPGFTSMFGTTSTSRS
DQVSSFRYASMNVGIYPTSNLMQFAGSITVWKCPVKLSTVQFPVATDPATSSLVHTLVGLDGVLAVGPDNFSESFIKGVF
SQSACNEPDFEFNDILEGIQTLPPANVSLGSTGQPFTMDSGAEATSGVVGWGNMDTIVIRVSAPEGAVNSAILKAWSCIE
YRPNPNAMLYQFGHDSPPLDEVALQEYRTVARSLPVAVIAAQNASMWERVKSIIKSSLAAASNIPGPIGVAASGISGLSA
LFEGFGF
>P04864 ~~~~~~Capsid protein VP1~~~
MAPPAKRARRGLVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYAAYLRSGKNPYLYFSPADQRFIDQTKDATDWGGK
IGHYFFRAKKAIAPVLTDTPDHPSTSRPTKPTKRSKPPPHIFINLAKKKKAGAGQVKRDNQAPMSDGAVQPDGGQPAVRN
ERATGSGNGSGGGGGGGSGGVGISTGTFNNQTEFKFLENGWVEITANSSRLVHLNMPESENYKRVVVNNMDKTAVKGNMA
LDDTHVQIVTPWSLVDANAWGVWFNPGDWQLIVNTMSELHLVSFEQEIFNVVLKTVSESATQPPTKVYNNDLTASLMVAL
DSNNTMPFTPAAMRSETLGFYPWKPTIPTPWRYYFQWDRTLIPSHTGTSGTPTNIYHGTDPDDVQFYTIENSVPVHLLRT
GDEFATGTFFFDCKPCRLTHTWQTNRALGLPPFLNSLPQSEGATNFGDIGVQQDKRRGVTQMGNTDYITEATIMRPAEVG
YSAPYYSFEASTQGPFKIPIAAGRGGAQTDENQAADGDPRYAFGRQHGQKTTTTGETPERFTYIAHQDTGRYPAGDWIQN
INFNLPVTNDNVLLPTDPIGGKTGINYTNIFNTYGPLTALNNVPPVYPNGQIWDKEFDTDLKPRLHVNAPFVCQNNCPGQ
LFVKVAPNLTNEYDPDASANMSRIVTYSDFWWKGKLVFKAKLRASHTWNPIQQMSINVDNQFNYLPNNIGAMKIVYEKSQ
LAPRKLY
>P24840 ~~~~~~Capsid protein VP1~~~
MAPPAKRARRGLVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYAAYLRSGKNPYLYFSPADQRFIDQTKDAKDWGGK
IGHYFFRAKKAIAPVLTDTPDHPSTSRPTKPTKRSKPPPHIFINLAKKKKAGAGQVKRDNLAPMSDGAVQPDGGQPAVRN
ERATGSGNGSGGGGGGGSGGVGISTGTFNNQTEFKFLENGWVEITANSSRLVHLNMPESENYKRVVVNNMDKTAVKGNMA
LDDIHVQIVTPWSLVDANAWGVWFNPGDWQLIVNTMSELHLVSFEQEIFNVVLKTVSESATQPPTKVYNNDLTASLMVAL
DSNNTMPFTPAAMRSETLGFYPWKPTIPTPWRYYFQWDRTLIPSHTGTSGTPTNVYHGTDPDDVQFYTIENSVPVHLLRT
GDEFATGTFFFDCKPCRLTHTWQTNRALGLPPFLNSLPQSEGATNFGDIGVQQDKRRGVTQMGNTDYITEATIMRPAEVG
YSAPYYSFEASTQGPFKTPIAAGRGGAQTDENQAADGDPRYAFGRQHGQKTTTTGETPERFTYIAHQDTGRYPEGDWIQN
INFNLPVTNDNVLLPTDPIGGKTGINYTNIFNTYGPLTALNNVPPVYPNGQIWDKEFDTDLKPRLHVNAPFVCQNNCPGQ
LFVKVAPNLTNEYDPDASANMSRIVTYSDFWWKGKLVFKAKLRASHTWNPIQQMSINVDNQFNYVPNNIGAMKIVYEKSQ
LAPRKLY
>Q90125 ~~~VP~~~Capsid protein VP1~~~
MSFFKNQLIHRARPGYRIIPESTVTEDIELGTIGEETPLLSEGVITAVEEGAIGLPEVAIGVAGAIGTHAHEWWRDRYAF
KSVLTGNYTDLKGNPLKPRNAIPEKIKQLGKKIFQGDFNRAFPDNLKLETEKEKADLLRYYNHNRRLAGLSEAYPQGKGY
AYAKSQKVLEAERRGLTVPGYKYLGPGNSLNRGQPINQIDEDAKEHDEAYDKVKTSQEVSRADNTFVNKALDHVVNAINF
KETPGNAFGAAIGAIGIGTKQAIEKYSGVIYPSVSGMSRHINPRYINQPNWKDYIAEGNSKNWVGYSNLPDDFFQEETLS
DSPMQEATKRKADSPAVETPAKKGTTGVNVNSQSTDPQNPSSSGATTDLDVTMAMSLPGTGSGTSSGGGNTQGQDVYIIP
RPFSNFGKKLSTYTKSHKFMIFGLANNVIGPTGTGTTAVNRLLTTCLAEIPWQKLPLYMNQSEFDLLPPGSRVVECNVKV
IFRTNRIAFETSSTVTKQATLNQISNVQTAIGLNKLGWGINRAFTAFQSDQPMIPTATTAPKYEPVTGDTGYRGMIADYY
GADSTNDTAFGNAGNYPHHQVSSFTFLQNYYCMYQQTNQGTGGWPCLAEHLQQFDSKTVNNQCLIDVTYKPKMGLIKSPL
NYKIIGQPTVKGTISVGDNLVNMRGAVVTNPPEATQNVAESTHNLTRNFPADLFNIYSDIEKSQVLHKGPWGHENPQIQP
SVHIGIQAVPALTTGALLINSSPLNSWTDSMGYIDVMSSCTVMEAQPTHFPFSTEANTNPGNTIYRINLTPNSLTSAFNG
LYGNGATLGNV
>Q82462 ~~~~~~Capsid protein~~~
MGDAGVASQRPHNRRGTRNVRVSANTVTVNGRRNQRRRTGRQVSPPDNFTAAAQDLAQSLDANTVTFPANISSMPEFRNW
AKGKIDLDSDSIGWYFKYLDPAGATESARAVGEYSKIPDGLVKFSVDAEIREIYNEECPVVTDVSVPLDGRQWSLSIFSF
PMFRTAYVAVANVENKEMSLDVVNDLIEWLNNLADWRYVVDSEQWINFTNDTTYYVRIRVLRPTYDVPDPTEGLVRTVSD
YRLTYKAITCEANMPTLVDQGFWIGGQYALTPTSLPQYDVSEAYALHTLTFARPSSAAALAFVWAGLPQGGTAPAGTPAW
EQASSGGYLTWRHNGTTFPAGSVSYVLPEGFALERYDPNDGSWTDFASAGDTVTFRQVAVDEVVVTNNPAGGGSAPTFTV
RVPPSNAYTNTVFRNTLLETRPSSRRLELPMPPADFGQTVANNPKIEQSLLKETLGCYLVHSKMRNPVFQLTPASSFGAV
SFNNPGYERTRDLPDYTGIRDSFDQNMSTAVAHFRSLSHSCSIVTKTYQGWEGVTNVNTPFGQFAHAGLLKNEEILCLAD
DLATRLTGVYPATDNFAAAVSAFAANMLSSVLKSEATSSIIKSVGETAVGAAQSGLAKLPGLLMSVPGKIAARVRARRAR
RRAARAN
>O12792 ~~~ORF2~~~Capsid polyprotein VP90~~~
MASKSNKQVTVEVSNNGRSRSKSRARSQSRGRDKSVKITVNSRNRARRQPGRDKRQSSQRVRNIVNKQLRKQGVTGPKPA
ICQRATATLGTVGSNTSGTTEIEACILLNPVLVKDATGSTQFGPVQALGAQYSMWKLKYLNVKLTSMVGASAVNGTVSGV
SLNPTTTPTSTSWSGLGARKHLDVTVGKNATFKLKPSDLGGPRDGWWLTNTNDNASDTLGPSIEIHTLGRTMSSYKNEQF
TGGLFLVELASEWCFTGYAANPNLVNLVKSTDNQVSVTFEGSAGSPLIMNVPEGSHFARTVLARSTTPTTLARAGERTTS
DTVWQVLNTAVSAAELVTPPPFNWLVKGGWWFVKLIAGRTRTGSRSFYVYPSYQDALSNKPALCTGSTPGGMRTRNPVTT
TLQFTQMNQPSLGHGEAPAAFGRSIPAPGEEFKVVLTFGAPMSPNANNKQTWVNKPLDAPSGHYNVKIAKDVDHYLTMQG
FTSIASVDWYTIDFQPSEAPAPIQGLQVLVNSSKKADVYAIKQFVTAQTNNKHQVTSLFLVKVTTGFQVNNYLSYFYRAS
ATGDATTNLLVRGDTYTAGISFTQGGWYLLTNTSIVDGAMPPGWVWNNVELKTNTAYHMDKGLVHLIMPLPESTQMCYEM
LTSIPRSRASGHGYESDNTEYLDAPDSADQFKEDIETDTDIESTEDEDEADRFDIIDTSDEEDENETDRVTLLSTLVNQG
MTMTRATRIARRAFPTLSDRIKRGVYMDLLVSGASPGNAWSHACEEARKAAGEINPCTSGSRGHAE
>Q82446 ~~~ORF2~~~Capsid polyprotein VP90~~~
MASKSDKQVTVEVNNNGRNRSKSRARSQSRGRGRSVKITVNSHNKGRRQNGRNKYQSNQRVRKIVNKQLRKQGVTGPKPA
ICQRATATLGTIGTNTTGATEIEACILLNPVLVKDATGSTQFGPVQALGAQYSMWKLKYLNVKLTSMVGASAVNGTVLRI
SLNPTSTPSSTSWSGLGARKHMDVTVGRNAVFKLRPSDLGGPRDGWWLTNTNDNASDTLGPSIEIHTLGKTMSSYKNEQF
TGGLFLVELASEWCFTGYAANPNLVNLVKSTDHEVNVTFEGSKGTPLIMNVAEHSHFARMAEQHSSISTTFSRAGGDATS
DTVWQVLNTAVSAAELVAPPPFNWLIKGGWWFVKLIAGRTRTGTKQFYVYPSYQDALSNKPALCTGGVTGGVLRTTPVTT
LQFTQMNQPSLGHGEHTATIGSIVQDPSGELRVLLTVGSIMSPNSADRQVWLNKTLTAPGTNSNDNLVKIAHDLGHYLIM
QGFMHIKTVEWYTPDFQPSRDPTPIAGMSVMVNITKKADVYFMKQFKNSYTNNRHQITSIFLIKPLADFKVQCYMSYFKR
ESHDNDGVANLTVRSMTSPETIRFQVGEWYLLTSTTLKENNLPEGWVWDRVELKSDTPYYADQALTYFITPPPVDSQILF
EGNTTLPRISSPPDNPSGRYMESHQQDCDSSDDEDDCENVSEETETEDEEDEDEDDEADRFDLHSPYSSEPEDSDENNRV
TLLSTLINQGMTVERATRITKRAFPTCAEKLKRSVYMDLLASGASPSSAWSNACDEARNVGSNQLAKLSGDRGHAE
>Q9IFX1 ~~~ORF2~~~Capsid polyprotein VP90~~~
MASKSDKQVTVEVNNNGRSRSKSRARSQSRGRGRSVKITVNSHNKGRRQNGRNKYQSNQRVRKIVNKQLRKQGVTGPKPA
ICQTATATLGTIGSNTTGATEIEACILLNPVLVKDATGSTQFGPVQALGAQYSMWKLKYLNVRLTSMVGASAVNGTVVRI
SLNPTSTPSSTSWSGLGARKHLDVTVGKNAVFKLKPSDLGGPRDGWWLTNTNDNASDTLGPSIEIHTLGQTMSSYQNTQF
TGGLFLVELSSAWCFTGYAANPNLVNLVKSTDKSVNVTFEGSAGTPLIMNVPEHSHFARTAVEHSSLSTSLSRAGGESSS
DTVWQVLNTAVSAAELVTPPPFNWLVKGGWWFVKLIAGRARTGARRFYVYLSYQDALSNKPALCTGGVPASARQSNPVRT
TLQFTQMNQPSLGHGATPMTFGRSIPEPGEQFRVLLTVGPPMAPNTANSQNWVNKTIVPPENQYTVKIGIDLEHYTTMQG
FTPVESVSWYTADFQPSDEPSPIPGLYARVNNTKKADVYGVQQFKSSHTNNRHQITSVFLVRVTTSFQVINYTSYFIRGA
ESGSNVSNLKIRDQTYHTPLQFTQGKWYLLTSTVMHDGPTSSGWVWMNQELTNNIAYRVDPGMMYLITPPPAASQLYFEL
HTVLPQARSEEPETYVDAPLPEEPPIEEEETDSDFESTEDENDEVDRFDLHPSSESDDDDVENDRATLLSTLLNQGISVE
RATRITNGAFPTRAARVRRSVYNDLLVSGLSPGAAWSHACEQARRAGDNHDLQLSGSRDHAE
>Q3YPH4 3.1.1.4~~~VP1~~~Minor capsid protein VP1~~~
MPPIKRQPRGWVLPGYRYLGPFNPLDNGEPVNNADRAAQLHDHAYSELIKSGKNPYLYFNKADEKFIDDLKDDWSIGGII
GSSFFKIKRAVAPALGNKERAQKRHFYFANSNKGAKKTKKSEPKPGTSKMSDTDIQDQQPDTVDAPQNTSGGGTGSIGGG
KGSGVGISTGGWVGGSHFSDKYVVTKNTRQFITTIQNGHLYKTEAIETTNQSGKSQRCVTTPWTYFNFNQYSCHFSPQDW
QRLTNEYKRFRPKAMQVKIYNLQIKQILSNGADTTYNNDLTAGVHIFCDGEHAYPNASHPWDEDVMPDLPYKTWKLFQYG
YIPIENELADLDGNAAGGNATEKALLYQMPFFLLENSDHQVLRTGESTEFTFNFDCEWVNNERAYIPPGLMFNPKVPTRR
VQYIRQNGSTAASTGRIQPYSKPTSWMTGPGLLSAQRVGPQSSDTAPFMVCTNPEGTHINTGAAGFGSGFDPPNGCLAPT
NLEYKLQWYQTPEGTGNNGNIIANPSLSMLRDQLLYKGNQTTYNLVGDIWMFPNQVWDRFPITRENPIWCKKPRADKHTI
MDPFDGSIAMDHPPGTIFIKMAKIPVPTASNADSYLNIYCTGQVSCEIVWEVERYATKNWRPERRHTALGMSLGGESNYT
PTYHVDPTGAYIQPTSYDQCMPVKTNINKVL
>C1IWT2 3.1.1.4~~~VP1~~~Minor capsid protein VP1~~~
MPPIKRQPGGWVLPGYKYLGPFNPLDNGEPVNKADRAAQSHDKSYSELIKSGKNPYLYFNKADEKFIDDLKNDWSLGGII
GSSFFKLKRAVAPALGNKERAQKRHFYFANSNKGAKKSKNNEPKPSTSKMSENEIQDQQPSEPNDGQRGGGGGATGSVGG
GKGSGVGISTGGWVGGSYFTDSYVITKNTRQFLVKIQNNHQYKTESIIPSNGGGKSQRCVSTPWSYFNFNQYSSHFSPQD
WQRLTNEYKRFRPKGMHVKIYNLQIKQILSNGADVTYNNDLTAGVHIFCDGEHAYPNATHPWDEDVMPELPYQTWYLFQY
GYIPTIHELAEMEDSNAVEKAIALQIPFFMLENSDHEVLRTGESAEFNFNFDCEWINNERAFIPPGLMFNPLVPTRRAQY
IRRNGNTQASTSRVQPYAKPTSWMTGPGLLSAQRVGPAASDTAAWMVGVDPEGANINSGRAGVSSGFDPPAGSLRPTDLE
YKVQWYQTPAGTNNDGNIISNPPLSMLRDQTLYRGNQTTYNLCSDVWMFPNQIWDRYPVTRENPIWCKQPRSDKHTTIDP
FDGSIAMDHPPGTIFIKMAKIPVPSNNNADSYLNIYCTGQVSCEIVWEVERYATKNWRPERRHTALGLGIGGADEINPTY
HVDKNGAYIQPTTWDMCFPVKTNINKVL
>C5IY46 3.1.1.4~~~VP1~~~Minor capsid protein VP1~~~
MPPIKRQPGGWVLPGYKYLGPFNPLENGEPVNKADRAAQAHDKSYSELIKSGKNPYLYFNKADEKFIDDLKDDWSLGGII
GSSFFKLKRAVAPALGNKERAQKRHFYFANSNKGAKKTKNNEPKPGTSKMSENEIQDQQPSDSMDGQRGGGGGATGSVGG
GKGSGVGISTGGWVGGSYFTDSYVITKNTRQFLVKIQNNHQYKTELISPSTSQGKSQRCVSTPWSYFNFNQYSSHFSPQD
WQRLTNEYKRFRPKGMHVKIYNLQIKQILSNGADTTYNNDLTAGVHIFCDGEHAYPNATHPWDEDVMPELPYQTWYLFQY
GYIPVIHELAEMEDSNAVEKAICLQIPFFMLENSDHEVLRTGESTEFTFNFDCEWINNERAYIPPGLMFNPLVPTRRAQY
IRRNNNPQTAESTSRIAPYAKPTSWMTGPGLLSAQRVGPATSDTGAWMVAVKPENASIDTGMSGIGSGFDPPQGSLAPTN
LEYKIQWYQTPQGTNNNGNIISNQPLSMLRDQALFRGNQTTYNLCSDVWMFPNQIWDRYPITRENPIWCKKPRSDKHTTI
DPFDGSLAMDHPPGTIFIKMAKIPVPSNNNADSYLNIYCTGQVSCEIVWEVERYATKNWRPERRHTTFGLGIGGADNLNP
TYHVDKNGTYIQPTTWDMCFPVKTNINKVL
>P03149 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMTLATWVGNNLQDPA
SRDLVVNYVNTNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRDRGRSPRR
RTPSPRRRRSQSPRRRRSQSRESQC
>P03148 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMTLATWVGNNLEDPA
SRDLVVNYVNTNVGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRDRGRSPRR
RTPSPRRRRSPSPRRRRSQSRESQC
>P0C693 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMTLATWVGNNLEDPA
SRDLVVNYVNTNMGLKIRQLLWFRISYLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRDRGRSPRR
RTPSPRRRRSQSPRRRRSQSRESQC
>Q81102 ~~~C~~~Capsid protein~~~
MDIDTYKEFGASVELLSFLPSDFFPSIRDLLDTAFALHREALESPEHCSPHHTALRQAIVCWGELMNLATWVGSNLEDPA
SRELVVSYVNVNMGLKIRQLLWFHISCLTFGRETVLEYLVSVGVWIRTPQAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSKSPRRRRSQSRESQC
>P69706 ~~~C~~~Capsid protein~~~
MDIDPYKEFGASVELLSFLPSDFFPSIRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMNLATWVGSNLEDPA
SRELVVSYVNVNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSQSPRRRRSQSRESQC
>Q76R61 ~~~C~~~Capsid protein~~~
MDIDPYKEFGASVELLSFLPSDFFPSIRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMNLATWVGSNLEDPA
SRELVVSYVNVNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSQSPRRRRSQSRESQC
>P12901 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPNHTALRQAILCWGELMTLASWVGNNLEDPA
SREQVVNYVNTNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSQSPRRRRSQSPASQC
>P03147 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTAAALYRDALESPEHCSPHHTALRQAILCWGDLMTLATWVGTNLEDPA
SRDLVVSYVNTNVGLKFRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSQSPRRRRSQSRESQC
>P03146 ~~~C~~~Capsid protein~~~
MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPHHTALRQAILCWGELMTLATWVGVNLEDPA
SRDLVVSYVNTNMGLKFRQLLWFHISCLTFGRETVIEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVRRRGRSPRRRT
PSPRRRRSQSPRRRRSQSRESQC
>P29326 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MRPRPILLLLLMFLPMLPAPPPGQPSGRRRGRRSGGSGGGFWGDRVDSQPFAIPYIHPTNPFAPDVTAAAGAGPRVRQPA
RPLGSAWRDQAQRPAVASRRRPTTAGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPL
SPLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISFWPQTTTTPTSVDMNSITSTDVRILVQPGI
ASELVIPSERLHYRNQGWRSVETSGVAEEEATSGLVMLCIHGSLVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRV
SRYSSTARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGVGEIGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRP
VVSANGEPTVKLYTSVENAQQDKGIAIPHDIDLGESRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEY
DQSTYGSSTGPVYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLSTIQQYSKTFFVLPLRGKLSFWEAGTTKAGYPYN
YNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLAPHSALALLEDTLDYPARAHTFDDFCPECRPLGLQGCAF
QSTVAELQRLKMKVGKTREL
>Q81871 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MRPRPILLLLLMFLPMLPAPPPGQPSGRRRGRRSGGSGGGFWGDRADSQPFAIPYIHPTNPFAPDVTAAAGAGPRVRQPA
RPLGSAWRDQAQRPAAASRRRPTTAGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPL
SPLLPLQDGTNTHIMATEASNYAQYRVVRATIRYRPLVPNAVGGYAISISFWPQTTTTPTSVDMNSITSTDVRILVQPGI
ASEHVIPSERLHYRNQGWRSVETSGVAEEEATSGLVMLCIHGSLVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRV
SRYSSTARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGVGEIGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRP
VVSANGEPTVKLYTSVENAQQDKGIAIPHDIDLGESRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEY
DQSTYGSSTGPVYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLSTTQQYSKTFFVLPLRGKLSFWEAGTTKAGYPYN
YNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLAPHSALALLEDTMDYPARAHTFDDFCPECRPLGLQGCAF
QSTVAELQRLKMKVGKTREL
>Q9IVZ8 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MRSRALLFLLFVLLPMLPAPPAGQPSGRRRGQAGCGGGFWGDRVDSQPFALPYIHPTNPFASDIPAAAGTGARPRQPIRP
LGSAWRDQSQRPAASTRRRPAPAGASPLTAVAPAPDTAPVPDADSRGAILRRQYNLSTSPLTSTIATGTNFVLYAAPLSP
LLPLQDGTNTHIMATEASNYAQYRVVRATIRYRPLVPNAVGGYAISISFWPQTTTTPTSVDMNSITSTDVRILVQPGIAS
ELVTPSERLHYRNQGWRSVETSGVAEEEATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRVSR
YSSSARHKLRRGPDGTAELTTTAATRFMKDLHFTGTNGVGEVGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRPVV
SANGELTVKLYTSVENAQQDKGVAIPHDIDLGESRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEYDQ
TTYGSSTNPMYVSDTVTFVNVATGAQGVSRSLDWSKVTLDGRPLTTIQQYSKTFYVLPLRGKLSFWEAGTTKAGYPYNYN
TTASDQILIENAAGHRVCISTYTTNLGSGPVSVSAVGVLAPHSALAALEDTADYPARAHTFDDFCPECRALGLQGCAFQS
TVGELQRLKMKVGKTREY
>Q68985 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MGPRPILLLFLMFLPMLLAPPPGQPSGRRRGRRSGGSGGGFWGDRVDSQPFAIPYIHPTNPFAPNVTAAAGAGPRVRQPV
RPLGSAWRDQAQRPAAASRRRPTTAGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPL
SPLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISFWPQTTPTPTSVDMNSITSTDVRILVQPGI
ASELVIPSERLHYRNQGWRSVETSGVAEEEATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRV
SRYSSTARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGVGEIGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRP
VVSANGEPTVKLYTSVENAQQDKGIAIPNDIDLGESRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEY
DQSTYGSSTGPVYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLSTIQQYSKIFFVLPLRGKLSFWEAGTTRPGYPYN
YNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLGPHSALALLEDTLDYPARAHTFDDFCPECRPLGLQGCAF
QSTVAELQRLKMKVGKTREL
>P33426 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MRPRPILLLLLMFLPMLPAPPPGQPSGRRRGRRSGGSGGGFWGDRVDSQPFAIPYIHPTNPFAPDVTAAAGAGPRVRQPA
RPLGSAWRDQAQRPAAASRRRPTTAGAAPLTAVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPL
SPLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISFWPQTTTTPTSVDMNSITSTDVRILVQPGI
ASELVIPSERLHYRNQGWRSVETSGVAEEEATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRV
SRYSSTARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGVGEIGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRP
VVSANGEPTVKLYTSVENAQQDKGIAIPHDIDLGESRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEY
DQSTYGSSTGPVYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLSTIQQYSKTFFVLPLRGKLSFWEAGTTKAGYPYN
YNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLAPHSVLALLEDTMDYPARAHTFDDFCPECRPLGLQGCAF
QSTVAELQRLKMKVGKTREL
>Q9YLQ9 ~~~ORF2~~~Pro-secreted protein ORF2~~~
MRPRAVLLLLFVLLPMLPAPPAGQPSGRRRGRRSGGAGGGFWGDRVDSQPFALPYIHPTNPFAADVVSQPGAGTRPRQPP
RPLGSAWRDQSQRPSAAPRRRSAPAGAAPLTAVSPAPDTAPVPDVDSRGAILRRQYNLSTSPLTSSVASGTNLVLYAAPL
NPLLPLQDGTNTHIMATEASNYAQYRVVRATIRYRPLVPNAVGGYAISISFWPQTTTTPTSVDMNSITSTDVRILVQPGI
ASELVIPSERLHYRNQGWRSVETTGVAEEEATSGLVMLCIHGSPVNSYTNTPYTGALGLLDFALELEFRNLTPGNTNTRV
SRYTSTARHRLRRGADGTAELTTTAATRFMKDLHFAGTNGVGEVGRGIALTLFNLADTLLGGLPTELISSAGGQLFYSRP
VVSANGEPTVKLYTSVENAQQDKGITIPHDIDLGDSRVVIQDYDNQHEQDRPTPSPAPSRPFSVLRANDVLWLSLTAAEY
DQTTYGSSTNPMYVSDTVTLVNVATGAQAVARSLDWSKVTLDGRPLTTIQQYSKTFYVLPLRGKLSFWEAGTTKAGYPYN
YNTTASDQILIENAAGHRVAISTYTTSLGAGPTSISAVGVLAPHSALAVLEDTIDYPARAHTFDDFCPECRTLGLQGCAF
QSTIAELQRLKMKVGKTRES
>Q25BH4 ~~~~~~Major capsid protein~~~
MVEIDDGVEMAVGIFVVIILAANLLPTAFDQIFNASTSSWNSDVTTLWELLPLLSVVGLLLMFVYWARKAGKM
>Q50LE5 ~~~Segment-1~~~Capsid protein precursor~~~
MKQNDTKKTTQRRNSKKYSSKTNRGTKRAPRDQEVGTGAQESTRNDVAWYARYPHILEEATRLPFAYPIGQYYDTGYSVA
SATEWSKYVDTSLTIPGVMCVNFTPTPGESYNKNSPINIAAQNVYTYVRHMNSGHANYEQADLMMYLLAMDSLYIFHSYV
RKILAISKLYTPVNKYFPRALLVALGVDPEDVFANQAQWEYFVNMVAYRAGAFAAPASMTYYERHAWMSNGLYVDQDVTR
AQIYMFKPTMLWKYENLGTTGTKLVPLMMPKAGDNRKLVDFQVLFNNLVSTMLGDEDFGIMSGDVFKAFGADGLVKLLAV
DSTTMTLPTYDPLILAQIHSARAVGAPILETSTLTGFPGRQWQITQNPDVNNGAIIFHPSFGYDGQDHEELSFRAMCSNM
ILNLPGEAHSAEMIIEATRLATMFQVKAVPAGDTSKPVLYLPNGFGTEVVNDYTMISVDKATPHDLTIHTFFNNILVPNA
KENYVANLELLNNIIQFDWAPQLYLTYGIAQESFGPFAQLNDWTILTGETLARMHEVCVTSMFDVPQMGFNK
>P83664 ~~~~~~Capsid protein~~~
MALSFKNSSGVLKAKTLKDGFVTSSDIETTVHDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQSVTNSERLNNNKKNN
KTQRFKLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWSNKRYEDNKNTINFNEDDDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYEHESDLVAPMVRQVTKRGNISDVMDIVGKDIIAKKY
EEIVKDRSSIGIGTKYNDILDEFKDIFNKIDSSSLDSTIKNCFNKIDGE
>P83549 ~~~~~~Capsid protein~~~
MALSFKNSSGVLKAKTLKDGFVTSSDIETTVHDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQSVTNSERLNNNKKNN
KTQRFKLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWSNKRYEDNKNTINFNEDDDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYEHESDLVAPMVRQVTKRGNISDVMDIVGKDIIAKKY
EEIVKDRSSIGIGTKYNDILDEFKDIFNKIDSSSLDSTIKNCFNKIDGE
>P83550 ~~~~~~Capsid protein~~~
MALSFKNSSGVLKAKTLKDGFVTSSDIETTVHDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQSVTNSERLNNNKKNN
KTQRFKLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWSNKRYEDNKNTINFNEDDDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYEHESDLVAPMVRQVTKRGNISDVMDIVGKDIIAKKY
EEIVKDRSSIGIGTKYNDILDEFKDIFNKIDSSSLDSTIKNCFNKIDGE
>P83666 ~~~~~~Capsid protein~~~
MALSFKNSSGVLKAKTLKDGFVTSSDIETTVHDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQSVTNSERLNNNKKNN
KTQRFKLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWSNKRYEDNKNTINFNEDDDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYEHESDLVAPMVRQVTKRGNISDVMDIVGKDIIAKKY
EEIVKDRSSIGIGTKYNDILDEFKDIFNKIDSSSLDSTIKNCFNKIDGE
>P83665 ~~~~~~Capsid protein~~~
MALSFKNSSGVLKAKTLKDGFVTSSDIETTVHDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQSVTNSERLNNNKKNN
KTQRFKLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWSNKRYEDNKNTINFNEDDDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYEHESDLVAPMVRQVTKRGNISDVMDIVGKDIIAKKY
EEIVKDRSSIGIGTKYNDILDEFKDIFNKIDSSSLDSTIKNCFNKIDGE
>P03713 ~~~E~~~Major capsid protein~~~
MSMYTTAQLLAANEQKFKFDPLFLRLFFRESYPFTTEKVYLSQIPGLVNMALYVSPIVSGEVIRSRGGSTSEFTPGYVKP
KHEVNPQMTLRRLPDEDPQNLADPAYRRRRIIMQNMRDEELAIAQVEEMQAVSAVLKGKYTMTGEAFDPVEVDMGRSEEN
NITQSGGTEWSKRDKSTYDPTDDIEAYALNASGVVNIIVFDPKGWALFRSFKAVKEKLDTRRGSNSELETAVKDLGKAVS
YKGMYGDVAIVVYSGQYVENGVKKNFLPDNTMVLGNTQARGLRTYGCIQDADAQREGINASARYPKNWVTTGDPAREFTM
IQSAPLMLLADPDEFVSVQLA
>B1PS80 ~~~ORF5~~~Capsid protein~~~
MSESKAETPSKSAEKGVASLSTSAPPSSTTPTAQAKQTPPPVATTARPMASRLPRTIAAEGGGTEKKQSHLAEDRIAQYL
PKQDAVDHSNLAALLQPFTEASYEREFDVKVNGIASKAELTAVAETWASRLNVPKENSAVLAQEIAIHCYHNGSSEQTDF
NLKSSQVAGLNLEAAVGVIKEILTLRQFAAYYATFVWNWGIKNEIPPANWVAKGYTDETKYAAFDTFSYVGSPLGLRITP
TRKPTNNEYMAASVNAREKIIQSRGKGMVTNSPMFSDGTTHQGIPLHPKLPLS
>P19899 ~~~ORF4~~~Capsid protein~~~
MAMVKRINNLPTVKLAKQALPLLANPKLVNKAIDVVPLVVQGGRKLSKAAKRLLGAYGGNISYTEGAKPGAISAPVAISR
RVAGMKPRFVRSEGSVKIVHREFIASVLPSSDLTVNNGDVNIGKYRVNPSNNALFTWLQGQAQLYDMYRFTRLRITYIPT
TGSTSTGRVSLLWDRDSQDPLPIDRAAISSYAHSADSAPWAENVLVVPCDNTWRYMNDTNAVDRKLVDFGQFLFATYSGA
GSTAHGDLYVEYAVEFKDPQPIAGMVCMFDRLVSLSEVGSTIKGVNYIADRDVITTGGNIGVNINIPGTYLVTIVLNATS
IGPLTFTGNSKLVGNSLNLTSSGASALTFTLNSTGVPNSSDSSFSVGTVVALTRVRMTITRCSPETAYLA
>P06448 ~~~V1~~~Capsid protein~~~
MSTSKRKRGDDSNWSKRVTKKKPSSAGLKRAGSKADRPSLQIQTLQHAGTTMITVPSGGVCDLINTYARGSDEGNRHTSE
TLTYKIAIDYHFVADAAACRYSNTGTGVMWLVYDTTPGGQAPTPQTIFAYPDTLKAWPATWKVSRELCHRFVVKRRWLFN
METDGRIGSDIPPSNASWKPCKRNIYFHKFTSGLGVRTQWKNVTDGGVGAIQRGALYMVIAPGNGLTFTAHGQTRLYFKS
VGNQ
>P07302 ~~~~~~Capsid protein VP1~~~
MAPPAKRAKRGWVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYDQYIKSGKNPYLYFSAADQRFIDQTKDAKDWGGK
VGHYFFRTKRAFAPKLATDSEPGTSGVSRAGKRTRPPAYIFINQARAKKKLTSSAAQQSSQTMSDGTSQPDGGNAVHSAA
RVERAADGPGGSGGGGSGGGGVGVSTGSYDNQTHYRFLGDGWVEITALATRLVHLNMPKSENYCRIRVHNTTDTSVKGNM
AKDDAHEQIWTPWSLVDANAWGVWLQPSDWQYICNTMSQLNLVSLDQEIFNVVLKTVTEQDSGGQAIKIYNNDLTACMMV
AVDSNNILPYTPAANSMETLGFYPWKPTIASPYRYYFCVDRDLSVTYENQEGTIEHNVMGTPKGMNSQFFTIENTQQITL
LRTGDEFATGTYYFDTNPVKLTHTWQTNRQLGQPPLLSTFPEADTDAGTLTAQGSRHGATQMEVNWVSEAIRTRPAQVGF
CQPHNDFEASRAGPFAAPKVPADVTQGVDREANGSVRYSYGKQHGENWAAHGPAPERYTWDETNFGSGRDTRDGFIQSAP
LVVPPPLNGILTNANPIGTKNDIHFSNVFNSYGPLTAFSHPSPVYPQGQIWDKELDLEHKPRLHITAPFVCKNNAPGQML
VRLGPNLTDQYDPNGATLSRIVTYGTFFWKGKLTMRAKLRANTTWNPVYQVSVEDNGNSYMSVTKWLPTATGNMQSVPLI
TRPVARNTY
>P03137 ~~~~~~Capsid protein VP1~~~
MAPPAKRAKRGWVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYDQYIKSGKNPYLYFSAADQRFIDQTKDAKDWGGK
VGHYFFRTKRAFAPKLATDSEPGTSGVSRAGKRTRPPAYIFINQARAKKKLTSSAAQQSSQTMSDGTSQPDSGNAVHSAA
RVERAADGPGGSGGGGSGGGGVGVSTGSYDNQTHYRFLGDGWVEITALATRLVHLNMPKSENYCRIRVHNTTDTSVKGNM
AKDDAHEQIWTPWSLVDANAWGVWLQPSDWQYICNTMSQLNLVSLDQEIFNVVLKTVTEQDLGGQAIKIYNNDLTACMMV
AVDSNNILPYTPAANSMETLGFYPWKPTIASPYRYYFCVDRDLSVTYENQEGTVEHNVMGTPKGMNSQFFTIENTQQITL
LRTGDEFATGTYYFDTNSVKLTHTWQTNRQLGQPPLLSTFPEADTDAGTLTAQGSRHGTTQMGVNWVSEAIRTRPAQVGF
CQPHNDFEASRAGPFAAPKVPADITQGVDKEANGSVRYSYGKQHGENWASHGPAPERYTWDETSFGSGRDTKDGFIQSAP
LVVPPPLNGILTNANPIGTKNDIHFSNVFNSYGPLTAFSHPSPVYPQGQIWDKELDLEHKPRLHITAPFVCKNNAPGQML
VRLGPNLTDQYDPNGATLSRIVTYGTFFWKGKLTMRAKLRANTTWNPVYQVSAEDNGNSYMSVTKWLPTATGNMQSVPLI
TRPVARNTY
>Q9YPS5 ~~~AR1~~~Capsid protein~~~
MPKRNYDTAFSTPMSNVRRRLTFDTPLSLPATAGSVPASAKRRRWTNRPMWRKPRYYRLYRSPDVPRGCEGPCKVQSFEA
KHDISHVGKVICVTDVTRGMGITHRVGKRFCVKSIWVTGKIWMDENIKTKNHTNTVMFKLVRDRRPFGTPQDFGQVFNMY
DNEPSTATVKNDLRDRYQVVRKFQATVTGGQYASKEQAIVSKFYRVNNYVVYNHQEAAKYENHTENALLLYMACTHASNP
VYATLKIRIYFYDSISN
>P12871 3.4.23.44~~~alpha~~~Capsid protein alpha~~~
MVSKAARRRRAAPRQQQRQQSNRASNQPRRRRARRTRRQQRMAATNNMLKMSAPGLDFLKCAFASPDFSTDPGKGIPDKF
QGLVLPKKHCLTQSITFTPGKQTMLLVAPIPGIACLKAEANVGASFSGVPLASVEFPGFDQLFGTSATDTAANVTAFRYA
SMAAGVYPTSNLMQFAGSIQVYKIPLKQVLNSYSQTVATVPPTNLAQNTIAIDGLEALDALPNNNYSGSFIEGCYSQSVC
NEPEFEFHPIMEGYASVPPANVTNAQASMFTNLTFSGARYTGLGDMDAIAILVTTPTGAVNTAVLKVWACVEYRPNPNST
LYEFARESPANDEYALAAYRKIARDIPIAVACKDNATFWERVRSILKSGLNFASTIPGPVGVAATGIKGIIETIGSLWV
>Q83884 ~~~ORF2~~~Capsid protein VP1~~~
MMMASKDATSSVDGASGAGQLVPEVNASDPLAMDPVAGSSTAVATAGQVNPIDPWIINNFVQAPQGEFTISPNNTPGDVL
FDLSLGPHLNPFLLHLSQMYNGWVGNMRVRIMLAGNAFTAGKIIVSCIPPGFGSHNLTIAQATLFPHVIADVRTLDPIEV
PLEDVRNVLFHNNDRNQQTMRLVCMLYTPLRTGGGTGDSFVVAGRVMTCPSPDFNFLFLVPPTVEQKTRPFTLPNLPLSS
LSNSRAPLPISSMGISPDNVQSVQFQNGRCTLDGRLVGTTPVSLSHVAKIRGTSNGTVINLTELDGTPFHPFEGPAPIGF
PDLGGCDWHINMTQFGHSSQTQYDVDTTPDTFVPHLGSIQANGIGSGNYVGVLSWISPPSHPSGSQVDLWKIPNYGSSIT
EATHLAPSVYPPGFGEVLVFFMSKMPGPGAYNLPCLLPQEYISHLASEQAPTVGEAALLHYVDPDTGRNLGEFKAYPDGF
LTCVPNGASSGPQQLPINGVFVFVSWVSRFYQLKPVGTASSARGRLGLRR
>Q84122 ~~~CP~~~Capsid protein~~~
MSYTITDPSKLAYLSSAWADPNSLINLCTNSLGNQFQTQQARTTVQQQFADVWQPVPTLTSRFPAGAGHFRVYRYEPILE
PLITFLMGTFDTRNRIIEVRNPQNPTTTETLDATRRVDDATVAIRSAINNLLNELVRGNGMYNQVSFETMSGLTWTSS
>P03578 ~~~CP~~~Capsid protein~~~
MSYTITDPSKLAYLSSAWADPNSLINLCTNSLGNQFQTQQARTTVQQQFADVWQPVPTLTSRFPAGAGYFRVYRYDPILD
PLITFLMGTFDTRNRIIEVENPQNPTTTETLDATRRVDDATVAIRSAINNLLNELVRGTGMYNQVSFETISGLTWTSS
>B3VML3 ~~~~~~Capsid protein~~~
MARLPKRKNRRNEKKKNANASRVQNVPRTFGLWKSTERIKYTTELKYLNSKCRAIRLHPDLVANNSFPTYCSAWKIDQVE
FEFVSYMSPLAGHVGCVFFVVIPAKGLNSRISADEAESLQSAILWDEKGRLKITPISGPISRHPWTNLSQVVTPPQIPKG
STDGERQDLQSGYYLIFDSRKLFGKDLVDKQSVLGELSLTITATYWTSLS
>Q9J7Z0 3.4.23.44~~~alpha~~~Capsid protein alpha~~~
MVSRTKNRRNKARKVVSRSTALVPMAPASQRTGPAPRKPRKRNQALVRNPRLTDAGLAFLKCAFAAPDFSVDPGKGIPDN
FHGRTLAIKDCNTTSVVFTPNTDTYIVVAPVPGFAYFRAEVAVGAQPTTFVGVPYPTYATNFGAGSQNGLPAVNNYSKFR
YASMACGLYPTSNMMQFSGSVQVWRVDLNLSEAVNPAVTAITPAPGVFANFVDKRINGLRGIRPLAPRDNYSGNFIDGAY
TFAFDKSTDFEWCDFVRSLEFSESNVLGAATAMKLLAPGGGTDTTLTGLGNVNTLVYKISTPTGAVNTAILRTWNCIELQ
PYTDSALFQFSGVSPPFDPLALECYHNLKMRFPVAVSSRENSKFWEGVLRVLNQISGTLSVIPGPVGTISAGVHQLTGMY
M
>Q11213 ~~~~~~Capsid protein VP1~~~
MAPPAKRARRGLVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYAAYLRSGKNPYLYFSPADQRFIDQTKDAKDWGGK
IGHYFFRAKKAIAPVLTDTPDHPSTSRPTKPTKRSKPPPHIFINLAKKKKAGAGQVKRDNLAPMSDGAVQPDGGQPAVRN
ERATGSGNGSGGGGGGGSGGVGISTGTFNNQTEFKFLENGWVEITANSSRLVHLNMPESENYRRVVVNNMDKTAVNGNMA
LDDIHAQIVTPWSLVDANAWGVWFNPGDWQLIVNTMSELHLVSFEQEIFNVVLKTVSESATQPPTKVYNNDLTASLMVAL
DSNNTMPFTPAAMRSETLGFYPWKPTIPTPWRYYFQWDRTLIPSHTGTSGTPTNIYHGTDPDDVQFYTIENSVPVHLLRT
GDEFATGTFFFDCKPCRLTHTWQTNRALGLPPFLNSLPQSEGATNFGDIGVQQDKRRGVTQMGNTNYITEATIMRPAEVG
YSAPYYSFEASTQGPFKTPIAAGRGGAQTDENQAADGNPRYAFGRQHGQKTTTTGETPERFTYIAHQDTGRYPEGDWIQN
INFNLPVTNDNVLLPTDPIGGKTGINYTNIFNTYGPLTALNNVPPVYPNGQIWDKEFDTDLKPRLHVNAPFVCQNNCPGQ
LFVKVAPNLTNEYDPDASANMSRIVTYSDFWWKGKLVFKAKLRASHTWNPIQQMSINVDNQFNYVPSNIGGMKIVYEKSQ
LAPRKLY
>P17455 ~~~~~~Capsid protein VP1~~~
MAPPAKRARRGLVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYAAYLRSGKNPYLYFSPADQRFIDQTKDAKDWGGK
IGHYFFRAKKAIAPVLTDTPDHPSTSRPTKPTKRSKPPPHIFINLAKKKKAGAGQVKRDNLAPMSDGAVQPDGGQPAVRN
ERATGSGNGSGGGGGGGSGGVGISTGTFNNQTEFKFLENGWVEITANSSRLVHLNMPESENYRRVVVNNMDKTAVNGNMA
LDDIHAQIVTPWSLVDANAWGVWFNPGDWQLIVNTMSELHLVSFEQEIFNVVLKTVSESATQPPTKVYNNDLTASLMVAL
DSNNTMPFTPAAMRSETLGFYPWKPTIPTPWRYYFQWDRTLIPSHTGTSGTPTNIYHGTDPDDVQFYTIENSVPVHLLRT
GDEFATGTFFFDCKPCRLTHTWQTNRALGLPPFLNSLPQSEGATNFGDIGVQQDKRRGVTQMGNTNYITEATIMRPAEVG
YSAPYYSFEASTQGPFKTPIAAGRGGAQTDENQAADGNPRYAFGRQHGQKTTTTGETPERFTYIAHQDTGRYPEGDWIQN
INFNLPVTNDNVLLPTDPIGGKTGINYTNIFNTYGPLTALNNVPPVYPNGQIWDKEFDTDLKPRLHVNAPFVCQNNCPGQ
LFVKVAPNLTNEYDPDASANMSRIVTYSDFWWKGKLVFKAKLRASHTWNPIQQMSINVDNQFNYVPSNIGGMKIVYEKSQ
LAPRKLY
>P03136 ~~~~~~Capsid protein VP1~~~
MAPPAKRAKRGWVPPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYDQYIKSGKNPYLYFSPADQRFIDQTKDAKDWGGK
VGHYFFRTKRAFAPKLSTDSEPGTSGVSRPGKRTKPPAHIFVNQARAKKKRASLAAQQRTLTMSDGTETNQPDTGIANAR
VERSADGGGSSGGGGSGGGGIGVSTGTYDNQTTYKFLGDGWVEITAHASRLLHLGMPPSENYCRVTVHNNQTTGHGTKVK
GNMAYDTHQQIWTPWSLVDANAWGVWFQPSDWQFIQNSMESLNLDSLSQELFNVVVKTVTEQQGAGQDAIKVYNNDLTAC
MMVALDSNNILPYTPAAQTSETLGFYPWKPTAPAPYRYYFFMPRQLSVTSSNSAEGTQITDTIGEPQALNSQFFTIENTL
PITLLRTGDEFTTGTYIFNTDPLKLTHTWQTNRHLACLQGITDLPTSDTATASLTANGDRFGSTQTQNVNYVTEALRTRP
AQIGFMQPHDNFEANRGGPFKVPVVPLDITAGEDHDANGAIRFNYGKQHGEDWAKQGAAPERYTWDAIDSAAGRDTARCF
VQSAPISIPPNQNQILQREDAIAGRTNMHYTNVFNSYGPLSAFPHPDPIYPNGQIWDKELDLEHKPRLHVTAPFVCKNNP
PGQLFVHLGPNLTDQFDPNSTTVSRIVTYSTFYWKGILKFKAKLRPNLTWNPVYQATTDSVANSYMNVKKWLPSATGNMH
SDPLICRPVPHMTY
>P07299 3.1.1.4~~~~~~Minor capsid protein VP1~~~
MSKKSGKWWESDDKFAKAVYQQFVEFYEKVTGTDLELIQILKDHYNISLDNPLENPSSLFDLVARIKNNLKNSPDLYSHH
FQSHGQLSDHPHALSSSSSHAEPRGENAVLSSEDLHKPGQVSVQLPGTNYVGPGNELQAGPPQSAVDSAARIHDFRYSQL
AKLGINPYTHWTVADEELLKNIKNETGFQAQVVKDYFTLKGAAAPVAHFQGSLPEVPAYNASEKYPSMTSVNSAEASTGA
GGGGSNSVKSMWSEGATFSANSVTCTFSRQFLIPYDPEHHYKVFSPAASSCHNASGKEAKVCTISPIMGYSTPWRYLDFN
ALNLFFSPLEFQHLIENYGSIAPDALTVTISEIAVKDVTDKTGGGVQVTDSTTGRLCMLVDHEYKYPYVLGQGQDTLAPE
LPIWVYFPPQYAYLTVGDVNTQGISGDSKKLASEESAFYVLEHSSFQLLGTGGTASMSYKFPPVPPENLEGCSQHFYEMY
NPLYGSRLGVPDTLGGDPKFRSLTHEDHAIQPQNFMPGPLVNSVSTKEGDSSNTGAGKALTGLSTGTSQNTRISLRPGPV
SQPYHHWDTDKYVTGINAISHGQTTYGNAEDKEYQQGVGRFPNEKEQLKQLQGLNMHTYFPNKGTQQYTDQIERPLMVGS
VWNRRALHYESQLWSKIPNLDDSFKTQFAALGGWGLHQPPPQIFLKILPQSGPIGGIKSMGITTLVQYAVGIMTVTMTFK
LGPRKATGRWNPQPGVYPPHAAGHLPYVLYDPTATDAKQHHRHGYEKPEELWTAKSRVHPL
>Q9PZT0 3.1.1.4~~~vp~~~Minor capsid protein VP1~~~
MSKESGKWWESDDKFAKAVYQQFVEFYEKVTGTDLELIQILKDHYNISLDNPLENPSSLFDLVARIKNNLKNSPDLYSHH
FQSHGQLSDHPHALSSSSSHAEPRGENAVLSSEDLHKPGQVSVQLPGTNYVGPGNELQAGPPQSAVDSAARIHDFRYSQL
AKLGINPYTHWTVADEELLKNIKNETGFQAQVVKDYFTLKGAAAPVAHFQGSLPEVPAYNASEKYPSMTSVNSAEASTGA
GGGGSNPVKSMWSEGATFSANSVTCTFSRQFLIPYDPEHHYKVFSPAASSCHNASGKEAKVCTISPIMGYSTPWRYLDFN
ALNLFFSPLEFQHLIENYGSIAPDALTVTISEIAVKDVTDKTGGGVQVTDSTTGRLSMLVDHEYKYPYVLGQGQDTLAPE
LPIWVYFPPQYAYLTVGDVNTQGISGDSKKLASEESAFYVLEHSSFQLLGTGGTATMSYKFPPVPPENLEGCSQHFYEMY
NPLYGSRLGVPDTLGGDPKFRSLTHEDHAIQPQNFMPGPLVNSVSTKEGDSSNTGAGKALTGLSTGTSQNTRISLRPGPV
SQPYHHWDTDKYVPGINAISHGQTTYGNAEDKEYQQGVGRFPNEKEQLKQLQGLNMHTYFPNKGTQQYTDQIERPLMVGS
VWNRRALHYESQLWSKIPNLDDSFKTQFAALGGWGLHQPPPQIFLKILPQSGPIGGIKSMGITTLVQYAVGIMTVTMTFK
LGPRKATGRWNPQPGVYPPHAAGHLPYVLYDPTATDAKQHHRHGYEKPEELWTAKSRVHPL
>P36310 ~~~~~~Capsid protein VP1~~~
MAPPAKRAKRGWVPPGYKYLGPGNSLNQGEPTNPSDAAAKEHDEAYDQYIKSGKNPYLYFSPADQRFIDQTKDAKDWGGK
VGHYFFRTKRAFAPKLSTDSEPGTSGVSTAGKRTKPPAHIFINQARAKKKRTSLAAQQRTQTMSDGTDQSDSGNAVQSAA
RVERAADGPGGSGGGGSGGGGVGVSTGSYDNQTHYKFLGDGWVEITAYSTRMVHLNMPKSENYCRVRVHNTNDTGTASHM
AMDDAHEQIWTPWSLVDANAWGVWFQPSDWQYISNNMIHINLHSLDQELFNVVIKTVTEQNTGAEAIKVYNNDLTAAMMV
ALDSNNILPYTPAIDNQETLGFYPWKPTIPSPYRYYFSCDRNLSVTYKDEAGTITDTMGLASGLNSQFFTIENTQRINLL
RTGDEYATGTYYFDTEPIRLTHTWQTNRHLGQPPQITELPSSDTANATLTARGYRSGLTQIQGRNDVTEATRVRPAQVGF
CQPHDNFETSRAGPFKVPVVPADITQGLDHDANGSLRYTYDKQHGQSWASQNNKDRYTWDAVNYDSGRWTNNCFIQSVPF
TSEPNANQILTNRDNLAGKTDIHFTNAFNSYGPLTAFPHPAPIYPQGQIWDKELDLEHKPRLHTQAPFVCKNNAPGQLLV
RLAPNLTDQYDPNSSNLSRIVTYGTFFWKGKLTLKAKMRPNATWNPVFQISATNQGTNDYMSIERWLPTATGNITNVPLL
SRPVARNTY
>P18546 ~~~~~~Capsid protein VP1~~~
MAPPAKRARGLTLPGYKYLGPGNSLDQGEPTNPSDAAAKEHDEAYDKYIKSGKNPYFYFSAADEKFIKETEHAKDYGGKI
GHYFFRAKRAFAPKLSETDSPTTSQQPEVRRSPRKHPGSKPPGKRPAPRHIFINLAKKKAKGTSNTNSNSMSENVEQHNP
INAGTELSATGNESGGGGGGGGGRGAGGVGVSTGTFNNQTEFQYLGEGLVRITAHASRLIHLNMPEHETYKRIHVLNSES
GVAGQMVQDDAHTQMVTPWSLIDANAWGVWFNPADWQLISNNMTEINLVSFEQEIFNVVLKTITESATSPPTKIYNNDLT
ASLMVALDTNNTLPYTPAAPRSETLGFYPWLPTKPTQYRYYLSCIRNLNPPTYTGQSQQITDSIQTGLHSDIMFYTIENA
VPIHLLRTGDEFSTGIYHFDTKPLKLTHSWQTNRSLGLPPKLLTEPTTEGDQHPGTLPAANTRKGYHQTINNSYTEATAI
RPAQVGYNTPYMNFEYSNGGPFLTPIVPTADTQYNDDEPNGAIRFTMDYQHGHLTTSSQELERYTFNPQSKCGRAPKQQF
NQQAPLNLENTNNGTLLPSDPIGGKSNMHFMNTLNTYGPLTALNNTAPVFPNGQIWDKELDTDLKPRLHVTAPFVCKNNP
PGQLFVKIAPNLTDDFNADSPQQPRIITYSNFWWKGTLTFTAKMRSSNMWNPIQQHTTTAENIGNYIPTNIGGIRMFPEY
SQLIPRKLY
>O56129 ~~~Cap~~~Capsid protein~~~
MTYPRRRYRRRRHRPRSHLGQILRRRPWLVHPRHRYRWRRKNGIFNTRLSRTFGYTVKATTVRTPSWAVDMMRFNIDDFV
PPGGGTNKISIPFEYYRIRKVKVEFWPCSPITQGDRGVGSTAVILDDNFVTKATALTYDPYVNYSSRHTIPQPFSYHSRY
FTPKPVLDSTIDYFQPNNKRTQLWLRLQTSRNVDHVGLGTAFENSIYDQDYNIRVTMYVQFREFNLKDPPLKP
>Q8JVC1 ~~~p2~~~Capsid protein~~~
MAAPVLYGGAGGTATGPGDMRRSLMHEKKQVFAELRREAQALRVAKEARGKMSVWDPSTREGARGYREKVVRFGRQIASL
LQYFENMHSPALDIIACDKFLLKYQIYGDIDRDPAFGENTMTAEVPVVWDKCEVEVKLYAGPLQKLMSRAKLVGAAREGI
PNRNDVAKSTGWNQDQVQKFPDNRMDSLISLLEQMQTGQSKLTRLVKGFLILLEMAERKEVDFHVGNHIHVTYAIAPVCD
SYDLPGRCYVFNSKPTSEAHAAVLLAMCREYPPPQFASHVSVPADAEDVCIVSQGRQIQPGSAVTLNPGLVYSSILTYAM
DTSCTDLLQEAQIIACSLQENRYFSRIGLPTVVSLYDLMVPAFIAQNSALEGARLSGDLSKAVGRVHQMLGMVAAKDIIS
ATHMQSRTGFDPSHGIRQYLNSNSRLVTQMASKLTGIGLFDATPQMRIFSEMDTADYADMLHLTIFEGLWLVQDASVCTD
NGPISFLVNGEKLLSADRAGYDVLVEELTLANIRIEHHKMPTGAFTTRWVAAKRDSALRLTPRSRTAHRVDMVRECDFNP
TMNLKAAGPKARLRGSGVKSRRRVSEVPLAHVFRSPPRRESTTTTDDSPRWLTREGPQLTRRVPIIDEPPAYESGRSSSP
VTSSISEGTSQHEEEMGLFDAEELPMQQTVIATEARRRLGRGTLERIQEAALEGQVAQGEVTAEKNRRIEAMLSARDPQF
TGREQITKMLSDGGLGVREREEWLELVDKTVGVKGLKEVRSIDGIRRHLEEYGEREGFAVVRTLLSGNSKHVRRINQLIR
ESNPSAFETEASRMRRLRADWDGDAGSAPVNALHFVGNSPGWKRWLENNNIPSDIQVAGKKRMCSYLAEVLSHGNLKLSD
ATKLGRLVEGTSLDLFPPQLSSEEFSTCSEATLAWRNAPSSLGVRPFAQEDSRWLVMAATCGGGSFGIGKLKSLCKEFSV
PKELRDALRVKYGLFGGKDSLE
>P29153 ~~~ORF3~~~Major capsid protein~~~
MPTRSRSKANQRRRRPRRVVVVAPSMAQPRTQSRRPRRRNKRGGGLNGSHTVDFSMVHGPFNGNATGTVKFGPSSDCQCI
KGNLAAYQKYRIVWLKVVYQSEAAATDRGCIAYHVDTSTTKKAADVVLLDTWNIRSNGSATFGREILGDQPWYESNKDQF
FFLYRGTGGTDVAGHYRISGRIQLMNASL
>P17522 ~~~ORF3~~~Major capsid protein~~~
MSTVVVKGNVNGGVQQPRMRRRQSLRRRANRVQPVVMVTAPGQPRRRRRRRGGNRRSRRTGVPRGRGSSETFVFTKDNLV
GNTQGSFTFGPSLSDCPAFKDGILKAYHEYKITSILLQFVSEASSTSSGSIAYELDPHCKVSSLQSYVNKFQITKGGAKT
YQARMINGVEWHDSSEDQCRILWKGNGKSSDSAGSFRVTIKVALQNPK
>Q9WDG3 ~~~CP~~~Capsid protein~~~
MAYTVSSANQLVYLGSVWADPLELQNLCTSALGNQFQTQQARTTVQQQFSDVWKTIPTATVRFPATGFKVFRYNAVLDSL
VSALLGAFDTRNRIIEVENPQNPTTAETLDATKRVDDATVAIRASISNLMNELIRGTGMYNQALFEMASGFTWATIPYT
>Q9WDG4 ~~~CP~~~Capsid protein~~~
MSYSITSPSQFVFLSSVWADPIELLNFGTNSLGNQFQTQQARTTVQQQFSEVWKPFPQSTVRFPGDVYKVYRYNAVLDPL
ITALLGSFDTRNRIIEVENQQNPTTAETLDATRRVDDATVAIRSAINNLVNELVRGTGLYNQNTFESMSGLVWTSAPAS
>Q9WDG5 ~~~CP~~~Capsid protein~~~
MAYTVSSANQLVYLGSVWADPLELQNLCTSALGNQFQTQQARTTVQQQFSDVWKTIPTATVRFPATGFKRFRYNAVLDSL
VSALLGAFDTRNRIIEVENPQNPTTAETLDATMRVDDATVAIRASISPIMNELVRGTGMYNQALFESASGLTWATTP
>P69509 ~~~CP~~~Capsid protein~~~
MAYTVSSANQLVYLGSVWADPLELQNLCTSALGNQFQTQQARTTVQQQFSDVWKTIPTATVRFPATGFKVFRYNAVLDSL
VSALLGAFDTRNRIIEVENPQNPTTAETLDATRRVDDATVAIRASISNLMNELVRGTGMYNQALFESASGLTWATTP
>P16596 ~~~~~~Coat protein~~~
MSKSSMSTPNTAFPAITQEQMSSIKVDPTSNLLPSQEQLKSVSTLMVAAKVPAASVTTVALELVNFCYDNGSSAYTTVTG
PSSIPEISLAQLASIVKASGTSLRKFCRYFAPIIWNLRTDKMAPANWEASGYKPSAKFAAFDFFDGVENPAAMQPPSGLI
RSPTQEERIANATNKQVHLFQAAAQDNNFTSNSAFITKGQISGSTPTIQFLPPPE
>P89036 ~~~ORF3~~~Capsid protein~~~
MNRNGATPTRGRGKRAIPNPPRRRARGKSVERGSTPLQYVTTLGPSRPRMGQGQGWQKLSHEEIILQVNSSTAADTIQTI
PIIPRLSVPAGDKPIYSGSAPHLRTIGSAFAIHRWRALSFEWIPSCPTTTPGNLVLRFYPNYSTETPKTLTDLMDSESLV
LVPSLSGKTYRPKIETRGNPPELRNIDATAFSALSDEDKGDYSVGRLVVGSSKQAVVIQLGLLRMRYSAEMRGATSISGV
SA
>Q02106 ~~~~~~Capsid protein~~~
MSGEQTEQISKDKAVAAEQARKEQIAEGKKAAESPEVERRKKNIAEIAKLNEKAREAKKQATEQEETTTSLLERFNLLKE
WHLNQQVNNKVKNPAMESETEPALADELKPDMSNLFARPTVTDLQKMKWNAESNKMATADDMAFIEAEFQSLGVPKENLA
KVMWTLTRYCVGASSSQYLDPKGEEEKLCGGVTRAALIACIKKRSTLRKVCRLYAPIVWNYMLVNNVPPEDWQSKGYTEE
TKFAAFDTFDFVMNPAAIQPLEGLIRSPTKAEIIANETHKRIALDRNANNERFANLGSEITGGKFGCRVGTKWRESKCDN
G
>P31798 ~~~~~~Coat protein~~~
MSAPASTTQPIGSTTSTTTKTAGATPATASGLFTIPDGDFFSTARAIVASNAVATNEDLSKIEAIWKDMKVPTDTMAQAA
WDLVRHCADVGSSAQTEMIDTGPYSNGISRARLAAAIKEVCTLRQFCMKYAPVVWNWMLTNNSPPANWQAQGFKPEHKFA
AFDFFNGVTNPAAIMPKEGLIRPPSEAEMNAAQTAAFVKITKARAQSNDFASLDAAVTRGRITGTTTAEAVVTLPPP
>P17782 ~~~~~~Coat protein~~~
MSAPASTTQATGSTTSTTTKTAGATPATASGLFTIPDGDFFSTARAIVASNAVATNEDLSKIEAIWKDMKVPTDTMAQAA
WDLVRHCADVGSSAQTEMIDTGPYSNGISRARLAAAIKEVCTLRQFCMKYAPVVWNWMLTNNSPPANWQAQGFKPEHKFA
AFDFFNGVTNPAAIMPKEGLIRPPSEAEMNAAQTAAFVKITKARAQSNDFASLDAAVTRGRITGTTTAEAVVTLPPP
>P22955 ~~~ORF2~~~Capsid protein~~~
MSSKAPKKSKQRSQPRNRTPNTSVKTVAIPFAKTQIIKTVNPPPKPARGILHTQLVMSVVGSVQMRTNNGKSNQRFRLNP
SNPALFPTLAYEAANYDMYRLKKLTLRYVPLVTVQNSGRVAMIWDPDSQDSAPQSRQEISAYSRSVSTAVYEKCSLTIPA
DNQWRFVADNTTVDRKLVDFGQLLFVTHSGSDGIETGDIFLDCEVEFKGPQPTASIVQKTVIDLGGTLTSFEGPSYLMPP
DAFITSSSFGLFVDVAGTYLLTLVVTCSTTGSVTVGGNSTLVGDGRAAYGSSNYIASIVFTSSGVLSTTPSVQFSGSSGV
SRVQMNICRCKQGNTFILG
>Q85433 ~~~~~~Subgenomic capsid protein VP60~~~
MEGKARITPQGEAAGTATTASVPGTTTDGMDPGVVATTSVVTTENASTSVATAGIGGPPQQVDQQETWRTNFYYNDVFTW
SVADAPGSILYTVQHSPQNNPFTAVLSQMYAGWAGGMQFRFIVAGSGVFGGRLVAAVIPPGIEIGPGLEVRQFPHVVIDA
RSLEPVTITMPDLRPNMYHPTGDPGLVPTLVLSVYNNLINPFGGSTSAIQVTVETRPSEDFEFVMIRAPSSKTVDSVTPA
GLLTTPVLTGVGTDNRWNCQIVGLQPVPGGLSTCNRHWNLNGSTYGWSSPRFTDIDHRRGASQPGGNNVLQFWYANAGSA
VDNPICQVAPDGFPDMSFVPLNGPNVPTAGWVGFGAIWNSNSGAPNVTTVQAYELGFATGAPNNLQPATNTSGSQIVAKS
IYAVSTGANQNPAGLFVMASGVISTPTARAITYTPQPDRIVNAPGTPAAAPVGKNVPIMFASVVRRTGDVNAEAGSDNGT
QYGTGSQPLPVTIGLSLNNYSSALTPGQFFVWQLNFASGFMEIGLNVDGYFYAGTGASTTLIDLTELIDIRPVGPRPSTS
TLVFNLGGATSGFSYV
>Q9QEE3 ~~~CP~~~Capsid protein~~~
MSYNITSSNQYQYFAAMWAEPTAMLNQCVSALSQSYQTQAARDTVRQQFSNLLSAIVTPNQRFPETGYRMYINSAVLKPL
YESLMKSFDTRNRIIETEEESRPSASEVANATQRVDDATVAIRSQIQLLLNELSNGHGLMNRAEFEVLLPWATAPAT
>Q9WDG7 ~~~CP~~~Capsid protein~~~
MSYNITSSNQYQYFAAMWAEPQAMLNQCVSALSQSYQTQAARDTVRQQFSNLLSAIVTPNQRFPESGYRVYINSAVLKPL
YEALMKSFDTRNRIIETEEESRPSASEVANATQRVDDATVAIRSQIQLLLNELSNGHGLMNRAEFEVLLPWTTAPAT
>P03580 ~~~CP~~~Capsid protein~~~
MSYNITNSNQYQYFAAVWAEPTPMLNQCVSALSQSYQTQAGRDTVRQQFANLLSTIVAPNQRFPDTGFRVYVNSAVIKPL
YEALMKSFDTRNRIIETEEESRPSASEVANATQRVDDATVAIRSQIQLLLNELSNGHGYMNRAEFEAILPWTTAPAT
>P03607 ~~~ORF3~~~Capsid protein~~~
MSGLFHHRTKPREIRAFVMATRLTKKQLAQAIQNTLPNPPRRKRRAKRRAAQVPKPTQAGVSMAPIAQGTMVKLRPPMLR
SSMDVTILSHCELSTELAVTVTIVVTSELVMPFTVGTWLRGVAQNWSKYAWVAIRYTYLPSCPTTTSGAIHMGFQYDMAD
TLPVSVNQLSNLKGYVTGPVWEGQSGLCFVNNTKCPDTSRAITIALDTNEVSEKRYPFKTATDYATAVGVNANIGNILVP
ARLVTAMEGGSSKTAVNTGRLYASYTIRLIEPIAAALNL
>P03581 ~~~CP~~~Capsid protein~~~
MAYSIPTPSQLVYFTENYADYIPFVNRLINARSNSFQTQSGRDELREILIKSQVSVVSPISRFPAEPAYYIYLRDPSIST
VYTALLQSTDTRNRVIEVENSTNVTTAEQLNAVRRTDDASTAIHNNLEQLLSLLTNGTGVFNRTSFESASGLTWLVTTTP
RTA
>Q8V9P2 ~~~~~~Major capsid protein~~~
MAKGHTSRSYSQRYAKWQAKFNAFSNPTVASTILSNVSPVAQQNFQTNVPKFTSVNENVSAVLTQYGITGPNRAIYQGFG
LKVARALNRIGSGPALVNMINGLKGYYISAFNANPQVLDAVVNIITGSPTGYVS
>Q91D83 ~~~alpha~~~Capsid protein alpha~~~
MVRKGDKKLAKPPTTKAANSQPRRRATQRRRSGRADAPLAKASTITGFGRATNDVHISGMSRIAQAVVPAGTGTDGKIVV
DSTIVPELLPRLGHAARIFQRYAVETLEFEIQPMCPANTGGGYVAGFLPDPTDNDHTFDALQATRGAVVAKWWESRTVRP
QYTRTLLWTSTGKEQRLTSPGRLVLLCVGSNTDVVNVSVMCRWSVRLSVPSLETPEDTTAPITTQAPLHNDSINNGYTGF
RSILLGATQLDLAPANAVFVTDKPLPIDYNLGVGDVDRAVYWHLRKKAGDTQVPAGYFDWGLWDDFNKTFTVGAPYYSDQ
QPRQILLPAGTLFTRVDSEN
>P36285 ~~~ORF2~~~Capsid protein~~~
MATTHTLLSFDDLEFLLHRKDLTDLYGERCGTLNLVINPYELFLPDELDDDCCDDPFNCCFPDVYASIGTEYSYIDPPEL
IHEEHCATNGTWPNGDPCEPILPPFTITGTHHYYATKPGEVVSGILSKLGSSWDPSLRSTADVSNSFTFRAESDGPGSAE
IVTEEQGTVVQQQPAPAPTALATLATASTGKSVEQEWMTFFSYHTSINWSTVESQGKILYSQALNPSINPYLDHIAKLYS
TWSGGIDVRFTVSGSGVFGGKLAALLVPPGVEPIESVSMLQYPHVLFDARQTEPVIFTIPDIRKTLFHSMDETDTTKLVI
NPYENGVENKTTCSITVETRPSADFTFALLKPPGSLIKHGSIPSDLIPRNSAHWMGNRWWSTISGFSVQPRVFQSNRHFD
FDSTTTGWSTPYYVPIEIKIQGKVGSNNKWFHVIDTDKALVPGIPDGWPDTTIPDETKATNGNFSYGESYRAGSTTIKPN
ENSTHFKGTYICGTLSTVEIPENDEQQIKTEAEKKSQTMYVVTADFKDTIVKPQHKISPQKLVVYFDGPEKDLTMSATLS
PLGYTLVDEQPVGSVSSRVVRIATLPEAFTQGGNYPIFYVNKIKVGYFDRATTNCYNSQILMTSQRLAEGNYNLPPDSLA
VYRITDSSSQWFDIGINHDGFSYVGLSDLPNDLSFPLTSTFMGVQLARVKLASKVKAHTITAK
>B4YNG0 ~~~~~~Putative capsid protein V20~~~
MSNSAIPLNVVAVQEPRLELNNERTWVVVKGGQQVTYYPFPSTSFSSNQFNFICNPPSAQTVLDRLVFIQVPYDITFTAN
PSHAGITENLLQPGRDAFRAFPISSITNTLNATINGFPVNIELAQIIHALSRYHTPLKVKNGWMSMQPSFEDNYQSYRDA
DGANNNPLGVFTSAAGLSELPRGSYTMNVVTNTTTTARITGVLYEQVFLPPFLWDGEQAGGLANLTSLTFNWVLNNNLAR
IWSHSDITNDVSGNSTIGSMNISFQQPSMYLGFVTPRLNIPIPPRITYPYFKLSRYTTQFQNTLAPNASSTFKSNVVQLD
SIPRKLYLFVKQSDNVIYQNLNNQITTPDVFLQINNLNLTWNNQQGILSGASSQNLYDFSVQNGYNKTWSEFNGVTQQFN
GVSGQPTKVIGLEGGIVCLELGKDVGLRDDEAEGVIGNFNLQVQMTVTNTNQYVTVTPDMYIVAVYDGTLVISNTSAMAS
IGVASKEEVLNARITHGVSYNELQRIYGGDFFSSFKNFLGKVGNVAGKVNNFLKDSKIASSVLGAIPHPYAQVPGQILKN
VGYGESHVGGGKKKGGVLIGGRQLTKAELRKELKM
>P11333 ~~~ORF1~~~Capsid protein VP1~~~
MKKKMSKLNARVHDFSMFKGNHIPRSKIHIPHKTIRAFNVGEIIPIYQTPVYPGEHIKMDLTSLYRPSTFIVPPMDDLIV
DTYAFAVPWRIVWKDLEKFFGENSDSWDVKNAPPVPDIVAPSGGWDYGTLADHFGITPKVPGIRVKSLRFRAYAKIINDW
FRDQNLSSECALTLDSSNSQGSNGSNQVTDIQLGGKPYIANKYHDYFTSCLPAPQKGAPTTLNVGGMAPVTTKFRDVPNL
SGTPLIFRDNKGRTIKTGQLGIGPVDAGFLVAQNTAQAANGERAIPSNLWADLSNATGISISDLRLAITYQHYKEMDARG
GTRYVEFTLNHFGVHTADARLQRSEFLGGHSQSLLVQSVPQTSSTVEKMTPQGNLAAFSETMIQNNYLVNKTFTEHSYII
VLAVVRYKHTYQQGIEADWFRGQDKFDMYDPLLANISEQPVKNREIMVQGNSQDNEIFGFQEAWADLRFKPNSVAGVMRS
SHPQSLDYWHFADHYAQLPKLSSEWLKEDYKNVDRTLALKASDNTPQLRVDFMFNTIAEKPMPLYSTPGLRRI
>A0A6M3VXD3 ~~~~~~Major capsid protein~~~
MAKGRTPRSFSQRYGKWNAKFTAFSNPTVASTILTNVAPIAQGNFQTNVPKFTSVNEQVSAVLTQYGVTGPSRAIYQGYG
LKVARALNRIGAGPALTNMVAGLKAYYVSAYGANPEILDAVTNIILGSPTGYVS
>P03606 ~~~~~~Capsid protein~~~
MAKQQNNRRKSATMRAVKRMINTHLEHKRFALINSGNTNATAGTVQNLSNGIIQGDDINQRSGDQVRIVSHKLHVRGTAI
TVSQTFRFIWFRDNMNRGTTPTVLEVLNTANFMSQYNPITLQQKRFTILKDVTLNCSLTGESIKDRIINLPGQLVNYNGA
TAVAASNGPGAIFMLQIGDSLVGLWDSSYEAVYTDA
>Q9Q3G5 ~~~ORF2~~~Capsid polyprotein VP90~~~
MAAMADKVVVKKTTTRRRGRSNSRSRSRSRSRSRTKKTVKIIEKKPEKSILKKIDQAERRDAKQLRRIRKKVQGPPVNSR
MTTVVTLGQITGNKDNTLERKHKCFLNPLLMKSQETGQTATPLSVRASQYNLWKLSRLHVRLIPLAGKANILGSVVFLDL
EQEANTAGPESVDTIKARPHVEVPIGSKTVWKVHPRSALGPRQGWWNVDPGDSPTDSLGPALNMWTYLQTVNALQSAGGT
QTPYTSALFLVEVLVTYEFSNYGPKPALSQMVSDSFPPASGSTATLKNTSDGAVAIQLSGAIARKMEEVEPKGRRSNAQT
SGVGEVFWAVSTEVVNTVADAIPGWGWLLKGGWFVLRKIFGAANDQNGTYLIYSSVADAQGDNRIYTSVKQTQLTSSRIN
LVQLTQPNVNQAAVGGSVGAANSIYLPLPQADDQYTPYFVYNFQGERVSTTETGVFCLAAIPAATTSSRYNNQITTPSIG
YRNASGTGTSFLLDAASWWNILDVTQTGVLFGQPRLGVGVMQTMKTLKQHIKDYTEPAIQKYYPGTTNLDEQLKQRLNLA
EGDPVISMGDTNGRRAALFYRTSDEKYILFFSTTEDPGAQYQNLKMLYFWNWSYSDTKQQFLDHLRTVQFANLDDSQPAP
YDSDDDDLSDVTSLFEQADLGDETDFKFNMSIQTSKHLEEEKNYWKNQCERMMMEKALSGTSQPLVRFEKAGPRADQSSA
SGHS
>P23627 ~~~ORF3b~~~Capsid protein~~~
MAQNGTGGGSRRPRRGRRNNNNNNSTARDKALLALTQQVNRLANIASSSAPSLQHPTFIASKKCRAGYTYTSLDVRPTRT
EKDKSFGQRLIIPVPVSEYPKKKVSCVQVRLNPSPKFNSTIWVSLRRLDETTLLTSENVFKLFTDGLAAVLIYQHVPTGI
QPNNKITFDMSNVGAEIGDMGKYALIVYSKDDVLEADEMVIHIDIEHQRIPSLQRSRCDSTRMHDVRRR
>P11795 ~~~ORF2~~~Capsid protein~~~
MAMTTRNNNNVLAISKKQLGVLAASAAVGALRNHISESSPALLQSAVGLGKKALNKVRNRRKQGNQQIITHVGGVGGSIM
APVAVSRQLVGSKPKFTGRTSGSVTVTHREYLTQVNNSSGFVVNGGIVGNLLQLNPSNGTLFSWLPAIASNFDQYSFNSV
VLHYVPLCGTTEVGRVALYFDKDSQDPEPADRVELANFGVLKETAPWAEAMLRIPTDKVKRYCNDSATVDQKLIDLGQLG
IATYGGAGTNAVGDVFISYSVTLYFPQPTNTLLSTRRLDLTGSLADATGPGYLVLTRTPTVLTHTFRVTGTFNLSGGLRC
LTSLTLGATGAVVINDILAIDNVGTASAYFLNCTVSSLPATVTFTTTGISSATVNVVRGTRANVVNLL
>P06663 ~~~ORF4~~~Capsid protein~~~
MENDPRVRKFASDGAQWAIKWQKKGWSTLTSRQKQTARAAMGIKLSPVAQPVQKVTRLSAPVALAYREVSTQPRVSTARD
GITRSGSELITTLKKNTDTEPKYTTAVLNPSEPGTFNQLIKEAAQYEKYRFTSLRFRYSPMSPSTTGGKVALAFDRDAAK
PPPNDLASLYNIEGCVSSVPWTGFILTVPTDSTDRFVADGISDPKLVDFGKLIMATYGQGANDAAQLGEVRVEYTVQLKN
RTGSTSDAQIGDFAGVKDGPRLVSWSKTKGTAGWEHDCHFLGTGNFSLTLFYEKAPVSGLENADASDFSVLGEAAAGSVQ
WAGVKVAERGQGVKMVTTEEQPKGKWQALRI
>P03579 ~~~CP~~~Capsid protein~~~
MPYTINSPSQFVYLSSAYADPVQLINLCTNALGNQFQTQQARTTVQQQFADAWKPVPSMTVRFPASDFYVYRYNSTLDPL
ITALLNSFDTRNRIIEVDNQPAPNTTEIVNATQRVDDATVAIRASINNLANELVRGTGMFNQAGFETASGLVWTTTPAT
>P69687 ~~~CP~~~Capsid protein~~~
MSYSITTPSQFVFLSSAWADPIELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDSDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLIVELIRGTGSYNRSSFESSSGLVWTSGPAT
>P03574 ~~~CP~~~Capsid protein~~~
MSYSITTPSHFVFLSSAWADPIELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDRDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLMVELIRGTGSYNRSSFESSSGLVWTSGPAT
>P03575 ~~~CP~~~Capsid protein~~~
MSYSITSPSQFVFLSSVWADPIELLNVCTSSLGNQFQTQQARTTVQQQFSEVWKPFPQSTVRFPGDVYKVYRYNAVLDPL
ITALLGTFDTRNRIIEVENQQSPTTAETLDATRRVDDATVAIRSAINNLVNELVRGTGLYNQNTFESMSGLVWTSAPAS
>P03573 ~~~CP~~~Capsid protein~~~
MSYNITTPSQFVFLSSAWADPLELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDRDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLIVELIRGTGSYNRSSFESSSGLVWTSGPAT
>P69508 ~~~CP~~~Capsid protein~~~
MSYSITTPSQFVFLSSAWADPIELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDSDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLVVELIRGTGSYNRSSFESSSGLVWTSGPAT
>P69507 ~~~CP~~~Capsid protein~~~
MSYSITTPSQFVFLSSAWADPIELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDSDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLVVELIRGTGSYNRSSFESSSGLVWTSGPAT
>P03571 ~~~CP~~~Capsid protein~~~
MSYSITTPSQFVFLSSAWADPIELINLCTNALGNQFQTQQARTVVQRQFSEVWKPSPQVTVRFPDSDFKVYRYNAVLDPL
VTALLGAFDTRNRIIEVENQANPTTAETLDATRRVDDATVAIRSAINNLVVELIRGTGSYNRSSFESSSGLVWNSGPAT
>Q4U3U9 ~~~MCP-1~~~Major capsid protein~~~
MSGDPKIEATAVAKTKEDTNKVDIKAMRQFVEIKGGIENEIYNPENATTYFYREVRRSVPFIKVPEIIKPNGSVNFGGQC
QFNIPRCGDYLLNLTLYIEIPKIELNITETAAVPGGAAASLHRVGWVKNLAHQLVKEIKLNIDDTTVVNIDTAFLDMWSE
FMTDNGKFDGYKEMIGGSKELWDLDDKVAGAGTIILPLPLFFSRDSGLALPLSSLISSDISVEVLLRKWDEVLLLEDINN
SKNKVIKADDIKDGEPKLTSIKLIGTYAVASKYEVNKTRCDDRRMLIEKPFKASEYELPVFDSGKLDEPIPINMEKINGA
IKALFFAVKNTSRSNEHSVYKVGLPIPGTLRIDGGSLHALSEIGINLAEHVFVPSLPIEYYSYQQPYEHAKRIPNNRDLF
MYSFCLDVGDVDPMGSINPKLLAKRLTFQIKPTKDLAEATKNKQTFKFITYALCNRLVSIRQGKFTLLSQSMYGEDGEDE
>Q9Q1T6 ~~~CP~~~Capsid protein~~~
MSYPITSPSQFVFLSSVWADPIELLNVCTNSLGNQFQTQQARTTVQQQFSEVWEPFPQSTVRFPGDVYKVYRYNAVLDPL
ITALLGTFDTRNRIIEVENRQSPTTAETLDATRRVDDATVAIRSAINNLVNELVRGTGLYNQNTFESMSGLVWTSAPAS
>Q88894 ~~~~~~Capsid protein~~~
MCAVTVVPDPTCCGTLSFKVPKDAKKGKHLGTFDIRQAIMDYGGLHSQEWCAKGIVNPTFTVRMHAPRNAFAGLSIACTF
DDYKRIDLPALGNECPPSEMFELPTKVFMLKDADVHEWQFNYGELTGHGLCNWANVATQPTLYFFVASTNQVTMAADWQC
IVTMHVDMGPVIDRFELNPTMTWPIQLGDTFAIDRYYEAKEIKLDGSTSMLSISYNFGGPVKHSKKHAISYSRAVMSRNL
GWSGTISGSVKSVSSLFCTASFVIFPWECEAPPTLRQVLWGPHQIMHGDGQFEIAIKTRLHSAATTEEGFGRLGILPLSG
PIAPDAHVGSYEFIVHINTWRPDSQVHPPMFSSSELYNWFTLTNLKPDANTGVVNFDIPGYIHDFASKDATVTLASNPLS
WLVAATGWHYGEVDLCISWSRSKQAQAQEGSVSITTNYRDWGAYWQGQARIYDLRRTEAEIPIFLGSYAGATPSGALGKQ
NYVRISIVNAKDIVALRVCLRPKSIKFWGRSATLF
>Q88897 ~~~CP~~~Capsid protein~~~
MGDMYDESFDKSGGPADLMDDSWVESVSWKDLLKKLHSIKFALQSGRDEITGLLAALNRQCPYSPYEQFPDKKVYFLLDS
RANSALGVIQNASAFKRRADEKNAVAGVTNIPANPNTTVTTNQGSTTTTKANTGSTLEEDLYTYYKFDDASTAFHKSLTS
LENMELKSYYRRNFEKVFGIKFGGAAASSSAPPPASGGPIRPNP
>Q9DHA8 ~~~ORF1~~~Probable capsid and replication-associated protein~~~
MAYGWWRRRRRRWRRWRRRPWRRRWRTRRRRPARRRGRRRNVRRRRRGGRWRRRYRRWKRKGRRRKKAKIIIRQWQPNYR
RRCNIVGYIPVLICGENTVSRNYATHSDDTNYPGPLGGGMTTDKFTLRILYDEYKRFMNYWTASNEDLDLCRYLGVNMYF
FRHPDVDFIIKINTMPPFLDTELTAPSIHPGMLALDKRARWIPSLKSRPGKKHYIKIRVGAPKMFTDKWYPQTDLCDMVL
LTVYATAADMQYPFGSPLTDSVVVNFQVLQSMYDQNISILPDQKSEREKLLTQITDYIPFYNTTQTIAQLKPFIDAGNIT
PNTPATTWGSYINTTKFTTAATTTYTYPGTTTTTVTMLTCNDSWYRGTVYNDKIKNLPKQAATLYSKATKTLLGNTFTTD
DHTLEYHGGLYSSIWLSPGRSYFETTGAYTDIKYNPFTDRGEGNMLWIDWLSKKNMNYDKVQSKCLISDLPLWAAAYGYV
EFCAKSTGDQNIHMNARLLIRSPFTDPQLLVHTDPTKGFVPYSLNFGNGKMPGGSSNVPIRMRAKWYPTLFHQQEVLEAL
AQSGPFAYHSDIKKVSLGMKYRFKWIWGGNPVRQQVVRNPCKETHSSGNRVPRSLQIVDPKYNSPELTFHTWDFRRGLFG
PKAIQRMQQQPTTTDIFSAGHKRPRRDTEVYHSSQEGEQKESLLFPPVKLLRRVPPWEDSQQEESGSQSSEEETQTVSQQ
LKQQLQQQRILGVKLRLLFNQVQKIQQNQDINPTLLPRGGDLASLFQIAP
>P27256 ~~~V1~~~Capsid protein~~~
MSKRPGDIIISTPVSKVRRRLNFDSPYSSRAAVPIVQGTNKRRSWTYRPMYRKPRIYRMYRSPDVPRGCEGPCKVQSYEQ
RDDIKHTGIVRCVSDVTRGSGITHRVGKRFCVKSIYFLGKVWMDENIKKQNHTNQVMFFLVRDRRPYGNSPMDFGQVFNM
FDNEPSTATVKNDLRDRFQVMRKFHATVIGGPSGMKEQALVKRFFKINSHVTLFIFIQEAAKYENHTENALLLYMACTHA
SNPVYATMKIRIYFYDSISN
>P09508 ~~~ORF3~~~Major capsid protein~~~
MNTVVGRRIINGRRRPRRQTRRAQRPQPVVVVQTSRATQRRPRRRRRGNNRTGRTVPTRGAGSSETFVFSKDNLAGSSSG
AITFGPSLSDCPAFSNGMLKAYHEYKISMVILEFVSEASSQNSGSIAYELDPHCKLNSLSSTINKFGITKPGKRAFTASY
INGTEWHDVAEDQFRILYKGNGSSSIAGSFRITIKCQFHNPK
>Q66222 ~~~CP~~~Capsid protein~~~
MVYNITSSNQYQYFAAMWAEPTAMLNQCVSALSQSYQTQAARDTVRQQFSNLLSAIVTPNQRFPEAGYRVYINSAVLKPL
YESLMKSFDTRNRIIETEEESRPSASEVANATQRVDDATVAIRSQIQLLLNELSNGHGLMNRAEFEVLLPWATAPAT
>B3VMP4 ~~~~~~Capsid fiber protein~~~
MMVSFTARAKSNVMAYRLLAYSQGDDIIEISHAAENTIPDYVAVKDVDKGDLTQVNMYPLAAWQVIAGSDIKVGDNLTTG
KDGTAVPTDDPSTVFGYAVEEAQEGQLVTLVISRSKEISIEVDDIKDAGDTGKRLLKINTPSGARNIIIENEDAKALING
ETTNTNKKNLQDLLFSDGNVKAFLQATTTDENKTALQQLLVSNADVLGLLSGNPTSDNKINLRTMIGAGVPYSLPAATTT
TLGGVKKGAAVTASTATDVATAVKDLNSLITVLKNAGIIS
>P03277 ~~~L3~~~Hexon protein~~~
MATPSMMPQWSYMHISGQDASEYLSPGLVQFARATETYFSLNNKFRNPTVAPTHDVTTDRSQRLTLRFIPVDREDTAYSY
KARFTLAVGDNRVLDMASTYFDIRGVLDRGPTFKPYSGTAYNALAPKGAPNSCEWEQTEDSGRAVAEDEEEEDEDEEEEE
EEQNARDQATKKTHVYAQAPLSGETITKSGLQIGSDNAETQAKPVYADPSYQPEPQIGESQWNEADANAAGGRVLKKTTP
MKPCYGSYARPTNPFGGQSVLVPDEKGVPLPKVDLQFFSNTTSLNDRQGNATKPKVVLYSEDVNMETPDTHLSYKPGKGD
ENSKAMLGQQSMPNRPNYIAFRDNFIGLMYYNSTGNMGVLAGQASQLNAVVDLQDRNTELSYQLLLDSIGDRTRYFSMWN
QAVDSYDPDVRIIENHGTEDELPNYCFPLGGIGVTDTYQAIKANGNGSGDNGDTTWTKDETFATRNEIGVGNNFAMEINL
NANLWRNFLYSNIALYLPDKLKYNPTNVEISDNPNTYDYMNKRVVAPGLVDCYINLGARWSLDYMDNVNPFNHHRNAGLR
YRSMLLGNGRYVPFHIQVPQKFFAIKNLLLLPGSYTYEWNFRKDVNMVLQSSLGNDLRVDGASIKFDSICLYATFFPMAH
NTASTLEAMLRNDTNDQSFNDYLSAANMLYPIPANATNVPISIPSRNWAAFRGWAFTRLKTKETPSLGSGYDPYYTYSGS
IPYLDGTFYLNHTFKKVAITFDSSVSWPGNDRLLTPNEFEIKRSVDGEGYNVAQCNMTKDWFLVQMLANYNIGYQGFYIP
ESYKDRMYSFFRNFQPMSRQVVDDTKYKEYQQVGILHQHNNSGFVGYLAPTMREGQAYPANVPYPLIGKTAVDSITQKKF
LCDRTLWRIPFSSNFMSMGALTDLGQNLLYANSAHALDMTFEVDPMDEPTLLYVLFEVFDVVRVHQPHRGVIETVYLRTP
FSAGNATT
>P04133 ~~~L3~~~Hexon protein~~~
MATPSMMPQWSYMHISGQDASEYLSPGLVQFARATETYFSLNNKFRNPTVAPTHDVTTDRSQRLTLRFIPVDREDTAYSY
KARFTLAVGDNRVLDMASTYFDIRGVLDRGPTFKPYSGTAYNALAPKGAPNPCEWDEAATALEINLEEEDDDNEDEVDEQ
AEQQKTHVFGQAPYSGINITKEGIQIGVEGQTPKYADKTFQPEPQIGESQWYETEINHAAGRVLKKTTPMKPCYGSYAKP
TNENGGQGILVKQQNGKLESQVEMQFFSTTEATAGNGDNLTPKVVLYSEDVDIETPDTHISYMPTIKEGNSRELMGQQSM
PNRPNYIAFRDNFIGLMYYNSTGNMGVLAGQASQLNAVVDLQDRNTELSYQLLLDSIGDRTRYFSMWNQAVDSYDPDVRI
IENHGTEDELPNYCFPLGGVINTETLTKVKPKTGQENGWEKDATEFSDKNEIRVGNNFAMEINLNANLWRNFLYSNIALY
LPDKLKYSPSNVKISDNPNTYDYMNKRVVAPGLVDCYINLGARWSLDYMDNVNPFNHHRNAGLRYRSMLLGNGRYVPFHI
QVPQKFFAIKNLLLLPGSYTYEWNFRKDVNMVLQSSLGNDLRVDGASIKFDSICLYATFFPMAHNTASTLEAMLRNDTND
QSFNDYLSAANMLYPIPANATNVPISIPSRNWAAFRGWAFTRLKTKETPSLGSGYDPYYTYSGSIPYLDGTFYLNHTFKK
VAITFDSSVSWPGNDRLLTPNEFEIKRSVDGEGYNVAQCNMTKDWFLVQMLANYNIGYQGFYIPESYKDRMYSFFRNFQP
MSRQVVDDTKYKDYQQVGILHQHNNSGFVGYLAPTMREGQAYPANFPYPLIGKTAVDSITQKKFLCDRTLWRIPFSSNFM
SMGALTDLGQNLLYANSAHALDMTFEVDPMDEPTLLYVLFEVFDVVRVHRPHRGVIETVYLRTPFSAGNATT
>P11820 ~~~L3~~~Hexon protein~~~
MATPSMMPQWSYMHIAGQDASEYLSPGLVQFARATDTYFSLGNKFRNPTVAPTHDVTTDRSQRLTLRFVPVDREDTAYSY
KVRFTLAVGDNRVLDMASTYFDIRGVLDRGPSFKPYSGTAYNSLAPKTAPNPCEWKDNNKIKVRGQAPFIGTNINKDNGI
QIGTDTTNQPIYADKTYQPEPQVGQTQWNSEVGAAQKVAGRVLKDTTPMLPCYGSYAKPTNEKGGQASLITNGTDQTLTS
DVNLQFFALPSTPNEPKAVLYAENVSIEAPDTHLVYKPDVAQGTISSADLLTQQAAPNRPNYIGFRDNFIGLMYYNSTGN
MGVLAGQASQLNAVVDLQDRNTELSYQLMLDALGDRSRYFSMWNQAVDSYDPDVRIIENHGVEDELPNYCFPLGGSAATD
TYSGIKANGQTWTADDNYADRGAEIESGNIFAMEINLAANLWRSFLYSNVALYLPDSYKITPDNITLPENKNTYAYMNGR
VAVPSALDTYVNIGARWSPDPMDNVNPFNHHRNAGLRYRSMLLGNGRYVPFHIQVPQKFFAIKNLLLLPGSYTYEWNFRK
DVNMILQSSLGNDLRVDGASVRFDSINLYANFFPMAHNTASTLEAMLRNDTNDQSFNDYLCAANMLYPIPSNATSVPISI
PSRNWAAFRGWSFTRLKTKETPSLGSGFDPYFTYSGSVPYLDGTFYLNHTFKKVSIMFDSSVSWPGNDRLLTPNEFEIKR
TVDGEGYNVAQCNMTKDWFLIQMLSHYNIGYQGFYVPESYKDRMYSFFRNFQPMSRQVVNTTTYKEYQNVTLPFQHNNSG
FVGYMGPTMREGQAYPANYPYPLIGQTAVPSLTQKKFLCDRTMWRIPFSSNFMSMGALTDLGQNMLYANSAHALDMTFEV
DPMDEPTLLYVLFEVFDVVRIHQPHRGVIEAVYLRTPFSAGNATT
>P03278 ~~~L3~~~Hexon protein~~~
MATPSMLPQWSYMHIAGQDASEYLSPGLVQFAQATESYFNIGNKFRNPTVAPTHDVTTERSQRLQLRFVPVDREDTQYSY
KTRFQLAVGDNRVLDMASTYFDIRGTLDRGASFKPYSGTAYNSFAPKSAPNNTQFRQANNGHPAQTIAQASYVATIGGAN
NDLQMGVDERQLPVYANTTYQPEPQLGIEGWTAGSMAVIDQAGGRVLRNPTQTPCYGSYAKPTNEHGGITKANTQVEKKY
YRTGDNGNPETVFYTEEADVLTPDTHLVHAVPAADRAKVEGLSQHAAPNRPNFIGFRDCFVGLMYYNSGGNLGVLAGQSS
QLNAVVDLQDRNTELSYQMLLANTTDRSRYFSMWNQAMDSYDPEVRVIDNVGVEDEMPNYCFPLSGVQIGNRSHEVQRNQ
QQWQNVANSDNNYIGKGNLPAMEINLAANLWRSFLYSNVALYLPDNLKFTPHNIQLPPNTNTYEYMNGRIPVSGLIDTYV
NIGTRWSPDVMDNVNPFNHHRNSGLRYRSQLLGNGRFCDFHIQVPQKFFAIRNLLLLPGTYTYEWSFRKDVNMILQSTLG
NDLRVDGATVNITSVNLYASFFPMSHNTASTLEAMLRNDTNDQSFNDYLSAANMLYPIPPNATQLPIPSRNWAAFRGWSL
TRLKQRETPALGSPFDPYFTYSGTIPYLDGTFYLSHTFRKVAIQFDSSVTWPGNDRLLTPNEFEIKISVDGEGYNVAQSN
MTKDWFLVQMLANYNIGYQGYHLPPDYKDRTFSFLHNFIPMCRQVPNPATEGYFGLGIVNHRTTPAYWFRFCRAPREGHP
YPQLALPPHWDPRHALRDPERKFLCDRTLWRIPFSSNFMSMGSLTDLGQNLLYANAAHALDMTFEMDPINEPTLLYVLFE
VFDVARVHQPHRGVIEVVYLRTPFSAGNATT
>P42671 ~~~L3~~~Hexon protein~~~
MTALTPDLTTATPRLQYFHIAGPGTREYLSEDLQQFISATGSYFDLKNKFRQTVVAPTRNVTTEKAQRLQIRFYPIQTDD
TPNSYRVRYSVNVGDSWVLDMGATYFDIKGVLDRGPSFKPYGGTAYNPLAPREAIFNTWVESTGPQTNVVGQMTNVYTNQ
TRNDKTATLQQVNSISGVVPNVNLGPGLSQLASRADVDNIGVVGRFAKVDSAGVKQAYGAYVKPVKDDGSQSLNQTAYWL
MDNGGTNYLGALAVEDYTQTLSYPDTVLVTPPTAYQQVNSGTMRACRPNYIGFRDNFINLLYHDSGVCSGTLNSERSGMN
VVVELQDRNTELSYQYMLADMMSRHHYFALWNQAVDQYDHDVRVFNNDGYEEGVPTYAFLPDGHGAGEDNGPDLSNVKIY
TNGQQDKGNVVAGTVSTQLNFGTIPSYEIDIAAATRRNFIMSNIADYLPDKYKFSIRGFDPVTDNIDPTTYFYMNRRVPL
TNVVDLFTNIGARWSVDQMDNVNPFNHHRNWGLKYRSQLLGNSRYCRFHIQVPQKYFAIKNLLLLPGTYTYEWVLRKDPN
MILQSSLGNDLRADGAQIVYTEVNLMANFMPMDHNTSNQLELMLRNATNDQTFADYLGAKNALYNVPAGSTLLTINIPAR
TWEGMRGWSFTRLKASETPQLGAQYDVGFKYSGSIPYSDGTFYLSHTFRSMSVLFDTSINWPGNDRLLTPNLFEIKRPVA
TDSEGFTMSQCDMTKDWFLVQMATNYNYVYNGYRFWPDRHYFHYDFLRNFDPMSRQGPNFLDTTLYDLVSSTPVVNDTGS
QPSQDNVRNNSGFIAPRSWPVWTAQQGEAWPANWPYPLIGNDAISSNQTVNYKKFLCDNYLWTVPFSSDFMYMGELTDLG
QNPMYTNNSHSMVINFELDPMDENTYVYMLYGVFDTVRVNQPERNVLAMAYFRTPFATGNAV
>P22776 ~~~~~~Hexon protein p72~~~
MASGGAFCLIANDGKADKIILAQDLLNSRISNIKNVNKSYGKPDPEPTLSQIEETHLVHFNAHFKPYVPVGFEYNKVRPH
TGTPTLGNKLTFGIPQYGDFFHDMVGHHILGACHSSWQDAPIQGTAQMGAHGQLQTFPRNGYDWDNQTPLEGAVYTLVDP
FGRPIVPGTKNAYRNLVYYCEYPGERLYENVRFDVNGNSLDEYSSDVTTLVRKFCIPGDKMTGYKHLVGQEVSVEGTSGP
LLCNIHDLHKPHQSKPILTDENDTQRTCSHTNPKFLSQHFPENSHNIQTAGKQDITPITDATYLDIRRNVHYSCNGPQTP
KYYQPPLALWIKLRFWFNENVNLAIPSVSIPFGERFITIKLASQKDLVNEFPGLFIRQSRFIPGRPSRRNIRFKPWFIPG
VINEISLTNNELYINNLFVTPEIHNLFVKRVRFSLIRVHKTQVTHTNNNHHDEKLMSALKWPIEYMFIGLKPTWNISDQN
PHQHRDWHKFGHVVNAIMQPTHHAEISFQDRDTALPDACSSISDISPVTYPITLPIIKNISVTAHGINLIDKFPSKFCSS
YIPFHYGGNAIKTPDDPGAMMITFALKPREEYQPSGHINVSRAREFYISWDTDYVGSITTADLVVSASAINFLLLQNGSA
VLRYST
>Q6WHE1 ~~~~~~Major capsid protein~~~
MNLTEKWKDLLEAEGADMPEIATATKQKIMSKIFENQDRDINNDPMYRDPQLVEAFNAGLNEAVVNGDHGYDPANIAQGV
TTGAVTNIGPTVMGMVRRAIPQLIAFDIAGVQPMTGPTSQVFTLRSVYGKDPLTGAEAFHPTRQADASFSGQAAASTIAD
FPTTGAATDGTPYKAEVTTSGGDVSMRYFLALGAVTLAVAGQMTATEYTDGVAGGLLVEIDAGMATSQAELQENFNGSSN
NEWNEMSFRIDKQVVEAKSRQLKAQYSIELAQDLRAVHGLDADAELSGILANEVMVELNREIVNLVNSQAQIGKSGWTQG
AGAAGVFDFSDAVDVKGARWAGEAYKALLIQIEKEANEIGRQTGRGNGNFIIASRNVVSALSMTDTLVGPAAQGMQDGSM
NTDTNQTVFAGVLGGRFKVYIDQYAVNDYFTVGFKGSTEMDAGVFYSPYVPLTPLRGSDSKNFQPVIGFKTRYGVQVNPF
ADPTASATKVGNGAPVAASMGKNAYFRRVFVKGL
>P04535 ~~~~~~Major capsid protein~~~
MTIKTKAELLNKWKPLLEGEGLPEIANSKQAIIAKIFENQEKDFQTAPEYKDEKIAQAFGSFLTEAEIGGDHGYNATNIA
AGQTSGAVTQIGPAVMGMVRRAIPNLIAFDICGVQPMNSPTGQVFALRAVYGKDPVAAGAKEAFHPMYGPDAMFSGQGAA
KKFPALAASTQTTVGDIYTHFFQETGTVYLQASVQVTIDAGATDAAKLDAEIKKQMEAGALVEIAEGMATSIAELQEGFN
GSTDNPWNEMGFRIDKQVIEAKSRQLKAAYSIELAQDLRAVHGMDADAELSGILATEIMLEINREVVDWINYSAQVGKSG
MTLTPGSKAGVFDFQDPIDIRGARWAGESFKALLFQIDKEAVEIARQTGRGEGNFIIASRNVVNVLASVDTGISYAAQGL
ATGFSTDTTKSVFAGVLGGKYRVYIDQYAKQDYFTVGYKGPNEMDAGIYYAPYVALTPLRGSDPKNFQPVMGFKTRYGIG
INPFAESAAQAPASRIQSGMPSILNSLGKNAYFRRVYVKGI
>P03276 ~~~L2~~~Penton protein~~~
MQRAAMYEEGPPPSYESVVSAAPVAAALGSPFDAPLDPPFVPPRYLRPTGGRNSIRYSELAPLFDTTRVYLVDNKSTDVA
SLNYQNDHSNFLTTVIQNNDYSPGEASTQTINLDDRSHWGGDLKTILHTNMPNVNEFMFTNKFKARVMVSRSLTKDKQVE
LKYEWVEFTLPEGNYSETMTIDLMNNAIVEHYLKVGRQNGVLESDIGVKFDTRNFRLGFDPVTGLVMPGVYTNEAFHPDI
ILLPGCGVDFTHSRLSNLLGIRKRQPFQEGFRITYDDLEGGNIPALLDVDAYQASLKDDTEQGGDGAGGGNNSGSGAEEN
SNAAAAAMQPVEDMNDHAIRGDTFATRAEEKRAEAEAAAEAAAPAAQPEVEKPQKKPVIKPLTEDSKKRSYNLISNDSTF
TQYRSWYLAYNYGDPQTGIRSWTLLCTPDVTCGSEQVYWSLPDMMQDPVTFRSTSQISNFPVVGAELLPVHSKSFYNDQA
VYSQLIRQFTSLTHVFNRFPENQILARPPAPTITTVSENVPALTDHGTLPLRNSIGGVQRVTITDARRRTCPYVYKALGI
VSPRVLSSRTF
>P12538 ~~~L2~~~Penton protein~~~
MRRAAMYEEGPPPSYESVVSAAPVAAALGSPFDAPLDPPFVPPRYLRPTGGRNSIRYSELAPLFDTTRVYLVDNKSTDVA
SLNYQNDHSNFLTTVIQNNDYSPGEASTQTINLDDRSHWGGDLKTILHTNMPNVNEFMFTNKFKARVMVSRLPTKDNQVE
LKYEWVEFTLPEGNYSETMTIDLMNNAIVEHYLKVGRQNGVLESDIGVKFDTRNFRLGFDPVTGLVMPGVYTNEAFHPDI
ILLPGCGVDFTHSRLSNLLGIRKRQPFQEGFRITYDDLEGGNIPALLDVDAYQASLKDDTEQGGGGAGGSNSSGSGAEEN
SNAAAAAMQPVEDMNDHAIRGDTFATRAEEKRAEAEAAAEAAAPAAQPEVEKPQKKPVIKPLTEDSKKRSYNLISNDSTF
TQYRSWYLAYNYGDPQTGIRSWTLLCTPDVTCGSEQVYWSLPDMMQDPVTFRSTRQISNFPVVGAELLPVHSKSFYNDQA
VYSQLIRQFTSLTHVFNRFPENQILARPPAPTITTVSENVPALTDHGTLPLRNSIGGVQRVTITDARRRTCPYVYKALGI
VSPRVLSSRTF
>Q65190 ~~~~~~Penton protein H240R~~~
MAVNIIATRAAPKMASKKEHQYCLLDSQEKRHGHYPFSFELKPYGQTGANIIGVQGSLTHVIKMTVFPFMIPFPLQKTHI
DDFIGGRIYLFFKELDMQAVSDVNGMQYHFEFKVVPVSPNQVELLPVNNKYKFTYAIPVVQYLTPIFYDLSGPLDFPLDT
LSVHVDSLSNHIQLPIQNHNLTTGDRVFISGYKHLQTIELCKNNKIFIKYIPPLSSEKIKLYIPKNRIRIPLYFKSLKNV
>P19896 ~~~~~~Capsid vertex protein~~~
MAKINELLRESTTTNSNSIGRPNLVALTRATTKLIYSDIVATQRTNQPVAAFYGIKYLNPDNEFTFKTGATYAGEAGYVD
REQITELTEESKLTLNKGDLFKYNNIVYKVLEDTPFATIEESDLELALQIAIVLLKVRLFSDAASTSKFESSDSEIADAR
FQINKWQTAVKSRKLKTGITVELAQDLEANGFDAPNFLEDLLATEMADEINKDILQSLITVSKRYKVTGITDSGFIDLSY
ASAPEAGRSLYRMVCEMVSHIQKESTYTATFCVASARAAAILAASGWLKHKPEDDKYLSQNAYGFLANGLPLYCDTNSPL
DYVIVGVVENIGEKEIVGSIFYAPYTEGLDLDDPEHVGAFKVVVDPESLQPSIGLLVRYALSANPYTVAKDEKEARIIDG
GDMDKMAGRSDLSVLLGVKLPKIIIDE
>A0A385DV85 ~~~~~~Cargo protein 1~~~
MAKKKIKRRGKMPPNIFDTGGQSWGQQSSGQFSNAFKGENLGNSIGSIGGAVGGIAQAGISNAQIADTSGIEAQNKAQKN
MVVGASSNDDLMSEWGSWNKVKDDYSWKDVRGGSTGQRVTNTIGAAGQGAAAGASVGGPIGAIVGGVVGLGSAIGGWLGG
NRKAKRKAKKLNKEAKEANERALTSFETRADNIDTQNDFNMLANFSAYGGPLEFGSGAIGYEFDNRYLNNQEMSAVAKQR
LTSLPNSFQALPEMNTYNAFAEGGGLSREKNYGSKKKPYPSVPSGDFAGPHRSYPIPTKADARDALRLAGLHGNESVRRK
VLAKYPSLKAFGGSLFDSVVGNNFNQSFTQGIQGMFQQEPEQTVQAANIAKDGGDIKIKEKNKGKFTAYCGGKVTEACIR
KGKNSSNPTTRKRATFAQNARNWNAFGGWLNTQGGDFTNGVTFINEGGSHEENPYQGIQIGVDPEGAPNLVEQGEVVYDD
YVFSDRMEIPDDIRKEYKLRGKTFAKAAKSAQRESEERPNDPLSTKGLQAAMERIATAQEEARQRKEAHREGNEYPSMFA
YGGDTNPYGLALEDPMSVEELEALMVQSGETGEIAPEGNNGNRQTWTRYAPIIGSGLASLSDLFSKPDYDSADLISGVDL
GAEAVGYAPIGNYLSYRPLDRDFYINKMNQQAAATRRGLMNTSGGNRLNAQAGILAADYNYGQNMGNLARQAEEYNQQLR
ERVEAFNRGTNMFNTETGLKASMFNAESRNAAKRARLGQATTVAQLRQGIKDQDAARRSANITNFLQGLGDMGWENEQAN
WLDTLAKSGVLKMNTKGEYTGGTKKAKGGKVRTKKKKGLTYG
>A0A385DVT8 ~~~~~~Cargo protein 2~~~
MANFSFVSGAKFRPFSYQEMLQPLQAYTQEYNTIQEGMGELGTKADVFERMANEQTDPQAYAMYKQYSNDLAAQAESLAK
QGLTPASRQGLIDMKRRYSSEIVPIEQAYKRRQELVDEQRKLQAQDSTLLFDRPASTLSLDELISNPALSPQSYSGALLS
KQVGTAAQNLAKEVRENPRKWRTILGNQYYETIMQKGFRPEEIMQAVQNNPEASPILQGIVEDAVGSSGIRNWNDENILN
RAYDYARQGLWNAVGETQYQTLSNKAYDYAMQERLAAAKKGKTEGTPSAVFRSVPKTKVDGDKKTTELNDELQFIQQLRA
NPSMINEEVERVNPGYPTQYGVNVGGGIYKVKPHAERLQQIIKKYDMKDGNMDQLEQKLQADIRSSAVRDFIYKPNITQS
DLISQVIKENARTLGAATESTGLYELDDNRKGDPIKLKNISDYFTGDNDISYDPEVGLIINATKDGKTKSAVIDPELIDD
ADRSVAGYMNNINVLLENGYDVEAQRYINTMMNYIYGKFNTLAKRQSNTDSKLE
>P85309 ~~~~~~Capsid protein~~~
MALSFKNSSGSIKAKRIKDGLVNANDIETTVIDFSYEKPDLSSVDGFSLKSLLSSDGWHIVVAYQCVTNSEQLNNNKKNN
KTQKFRLFTFDIIVIPGLKPNKSKNVVSYNRFMALCIGMICYHKKWKVFNWTRKSYEDNTSTIDFNEDEDFMNKLAMSAG
FSKEHKYHWFYSTGFEYTFDIFPAEVIAMSLFRWSHRVELKIKYTHESDLVEPMVRQLTKKGTISDVMDIIGKSTIAKRY
EEIVKDRSSTGIGTKYNDVLDEFKEIIKKINSSSLDSTIKDCLKDIEE
>Q5K5C1 3.4.22.-~~~~~~Executioner caspase~~~
MSICYYDTVGREKRLLIINQRALALPNRTISSDGTCCDGDEYLLVDTFTKLNFKVQTIRNASKIVLETTVRNYIEKNVKR
VACYFVVVLNDGNDADTILTTDGTYSLSELYALFTLYTVRAIPKVFLIQSCLGAKIDRSHCDRASCQCDQEEDAHPTSVC
HDIFVNTVRRVIHACSRKSNGSTTTTETKCSDVATVVLTSPHTEETIIVYLRIEAYLRYGDTKCGCFMIEKFCKNLIKYG
TRSSVHTTITMVQNEMQITDPKHVPIVQMNCTKLLFLGDENHIIMEEY
>P25783 3.4.22.50~~~VCATH~~~Viral cathepsin~~~
MNKILFYLFVYGVVNSAAYDLLKAPNYFEEFVHRFNKDYGSEVEKLRRFKIFQHNLNEIINKNQNDSAKYEINKFSDLSK
DETIAKYTGLSLPIQTQNFCKVIVLDQPPGKGPLEFDWRRLNKVTSVKNQGMCGACWAFATLASLESQFAIKHNQLINLS
EQQMIDCDFVDAGCNGGLLHTAFEAIIKMGGVQLESDYPYEADNNNCRMNSNKFLVQVKDCYRYITVYEEKLKDLLRLVG
PIPMAIDAADIVNYKQGIIKYCFNSGLNHAVLLVGYGVENNIPYWTFKNTWGTDWGEDGFFRVQQNINACGMRNELASTA
VIY
>P41721 3.4.22.50~~~VCATH~~~Viral cathepsin~~~
MNKILFYLFVYAVVKSAAYDPLKAPNYFEEFVHRFNKNYSSEVEKLRRFKIFQHNLNEIINKNQNDSAKYEINKFSDLSK
DETIAKYTGLSLPTQTQNFCKVILLDQPPGKGPLEFDWRRLNKVTSVKNQGMCGACWAFATLGSLESQFAIKHNELINLS
EQQMIDCDFVDAGCNGGLLHTAFEAIIKMGGVQLESDYPYEADNNNCRMNSNKFLVQVKDCYRYIIVYEEKLKDLLPLVG
PIPMAIDAADIVNYKQGIIKYCFDSGLNHAVLLVGYGVENNIPYWTFKNTWGTDWGEDGFFRVQQNINACGMRNELASTA
VIY
>Q89501 ~~~~~~CD2 homolog~~~
MIIIVIFLMCLKIVLNNIIIWSTLNQTVFLNNIFTINDTYGGLFWNTYYDNNRSNFTYCGIAGNYCSCCGHNISLYNTTN
NCSLIIFPNNTEIFNRTYELVYLDKKINYTVKLLKSVDSPTITYNCTNSLITCKNNNGTNVNIYLIINNTIVNDTNGDIL
NYYWNGNNNFTATCMINNTISSLNETENINCTNPILKYQNYLSTLFYIIIFIVSGLIIGIFISIISVLSIRRKRKKHVEE
IESPPPSESNEEDISHDDTTSIHEPSPREPLLPKPYSRYQYNTPIYYMRPSTQPLNPFPLPKPCPPPKPCPPPKPCPPPK
PCPPPKPCSPPKPCRPPKPCPPPKPCPPPKPCPPPKPCPPSKPCPSPESYSPPKPLPSIPLLPNIPPLSTQNISLIHVDR
II
>P0C9V9 ~~~~~~CD2 homolog~~~
MIIILIFLIIPNIVLSIDYWVSFNKTIILDSNITNDNNDINGVSWNFLNNSLNTLATCGKAGNFCECSNYSTSLYNIAHN
CSLTIFPHNDVFGTPYQVVWNQIINYTIKLLTPVTPPNITYNCTNFLITCKKNNGTNTIIYFNINDTNVKYTNESILEYN
WNNSNFNNFTATCIINNTINSSNDTQTIDCINTLLSSYLDFFQVASYMFYMIIFIATGIIASIFISIITFLSLRKRKKHV
EEIESPSPSESNEEEQCQHDDTTSIHEPSPREPLLPKPYSRYQYNTPIYYMRPLTQPLNPSPLPKLCPPPKPCPPPKPCP
PPKPCPPPKPCPSSESCSPPESYSLPKPLPNIPLLPNIPPLSTQNISLIHVDRII
>P29882 ~~~BBRF2~~~Cytoplasmic envelopment protein 1~~~
MASGKHHQPGGTRSLTMQKVSLRVTPRLVLEVNRHNAICVATNVPEFYNARGDLNIRDLRAHVKARMISSQFCGYVLVSL
LDSEDQVDHLNIFPHVFSERMILYKPNNVNLMEMCALLSMIENAKSPSIGLCREVLGRLTLLHSKCNNLDSLFLYNGART
LLSTLVKYHDLEEGAATPGPWNEGLSLFKLHKELKRAPSEARDLMQSLFLTSGKMGCLARSPKDYCADLNKEEDANSGFT
FNLFYQDSLLTKHFQCQTVLQTLRRKCLGSDTVSKIIP
>P10191 ~~~UL7~~~Cytoplasmic envelopment protein 1~~~
MAAATADDEGSAATILKQAIAGDRSLVEAAEAISQQTLLRLACEVRQVGDRQPRFTATSIARVDVAPGCRLRFVLDGSPE
DAYVTSEDYFKRCCGQSSYRGFAVAVLTANEDHVHSLAVPPLVLLHRFSLFNPRDLLDFELACLLMYLENCPRSHATPST
FAKVLAWLGVAGRRTSPFERVRCLFLRSCHWVLNTLMFMVYVKPFDDEFVLPHWYMARYLLANNPPPVLSALFCATPTSS
SFRLPGPPPRSDCVAYNPAGIMGSCWASEEVRAPLVYWWLSETPKRQTSSLFYQFC
>P09301 ~~~~~~Cytoplasmic envelopment protein 1~~~
MQRIRPYWIKFEQTGGAGMADGMSGINIPSILGCSVTIDNLLTRAEEGLDVSDVIEDLRIQAIPRFVCEAREVTGLKPRF
LANSVVSLRVKPEHQETVLVVLNGDSSEVSCDRYYMECVTQPAFRGFIFSVLTAVEDRVYTVGVPPRLLIYRMTLFRPDN
VLDFTLCVILMYLEGIGPSGASPSLFVQLSVYLRRVECQIGPLEKMRRFLYEGVLWLLNTLMYVVDNNPFTKTRVLPHYM
FVKLLNPQPGTAPNIIKAIYSCGVGQRFDLPHGTPPCPDGVVQVPPGLLNGPLRDSEYQKSVYFWWLNRTMVTPKNVQLF
ETYKNSPRVVK
>P0CK53 ~~~BGLF2~~~Cytoplasmic envelopment protein 2~~~
MASAANSSREQLRKFLNKECLWVLSDASTPQMKVYTATTAVSAVYVPQIAGPPKTYMNVTLIVLKPKKKPTYVTVYINGT
LATVARPEVLFTKAVQGPHSLTLMYFGVFSDAVGEAVPVEIRGNPVVTCTDLTTAHVFTTSTAVKTVEELQDITPSEIIP
LGRGGAWYAEGALYMFFVNMDMLMCCPNMPTFPSLTHFINLLTRCDNGECVTCYGAGAHVNILRGWTEDDSPGTSGTCPC
LLPCTALNNDYVPITGHRALLGLMFKPEDAPFVVGLRFNPPKMHPDMSRVLQGVLANGKEVPCTAQPWTLLRFSDLYSRA
MLYNCQVLKRQVLHSY
>P0CK55 ~~~BGLF2~~~Cytoplasmic envelopment protein 2~~~
MASAANSSREQLRKFLNKECLWVLSDASTPQMKVYTATTAVSAVYVPQIAGPPKTYMNVTLIVLKPKKKPTYVTVYINGT
LATVARPEVLFTKAVQGPHSLTLMYFGVFSDAVGEAVPVEIRGNPVVTCTDLTTAHVFTTSTAVKTVEELQDITPSEIIP
LGRGGAWYAEGALYMFFVNMDMLMCCPNMPTFPSLTHFINLLTRCDNGECVTCYGAGAHVNILRGWTEDDSPGTSGTCPC
LLPCTALNNDYVPITGHRALLGLMFKPEDAPFVVGLRFNPPKMHPDMSRVLQGVLANGKEVPCTAQPWTLLRFSDLYSRA
MLYNCQVLKRQVLHSY
>P16800 ~~~~~~Cytoplasmic envelopment protein 2~~~
MAWRSGLCETDSRTLKQFLQEECMWKLVGKSRKHREYRAVACRSTIFSPEDDSSCILCQLLLLYRDGEWIICFCCNGRYQ
GHYGVNHVHRRRRRICHLPTLYQLSFGGPLGPASIDFLPSFSQVTSSMTCDGITPDVIYEVCMLVPQDEAKRILVKGHGA
MDLTCQKAVTLGGAGAWLLPRPEGYTLFFYILCYDLFTSCGNRCDIPSMTRLMAAATACGQAGCSFCTDHEGHVDPTGNY
VGCTPDMGRCLCYVPCGPMTQSLIHNEEPATFFCESDDAKYLCAVGSKTAAQVTLGDGLDYHIGVKDSEGRWLPVKTDVW
DLVKVEEPVSRMIVCSCPVLKNLVH
>P10200 ~~~~~~Cytoplasmic envelopment protein 2~~~
MAQLGPRRPLAPPGPPGTLPRPDSRAGARGTRDRVDDLGTDVDSIARIVNSVFVWRVVRADERLKIFRCLTVLTEPLCQV
ALPNPDPGRALFCEIFLYLTRPKALRLPPNTFFALFFFNRERRYCAIVHLRSVTHPLTPLLCTLTFARIRAATPPEETPD
PTTEQLAEEPVVGELDGAYLVPAKTPPEPGACCALGPGAWWHLPSGQIYCWAMDSDLGSLCPPGSRARHLGWLLARITNH
PGGCESCAPPPHIDSANALWLSSVVTESCPCVAPCLWAKMAQCTLAVQGDASLCPLLFGHPVDTVTLLQAPRRPCITDRL
QEVVGGRCGADNIPPTSAGWRLCVFSSYISRLFATSCPTVARAVARASSSDPE
>P0CK51 ~~~BBLF1~~~Cytoplasmic envelopment protein 3~~~
MGALWSLCRRRVNSIGDVDGGIINLYNDYEEFNLETTKLIAAEEGRACGETNEGLEYDEDSENDELLFLPNKKPN
>P13200 ~~~~~~Cytoplasmic envelopment protein 3~~~
MGAELCKRICCEFGTTPGEPLKDALGRQVSLRSYDNIPPTSSSDEGEDDDDGEDDDNEERQQKLRLCGSGCGGNDSSSGS
HREATHDGSKKNAVRSTFREDKAPKPSKQSKKKKKPSKHHHHQQSSIMQETDDLDEEDTSIYLSPPPVPPVQVVAKRLPR
PDTPRTPRQKKISQRPPTPGTKKPAASLPF
>P04289 ~~~~~~Cytoplasmic envelopment protein 3~~~
MGLSFSGARPCCCRNNVLITDDGEVVSLTAHDFDVVDIESEEEGNFYVPPDMRGVTRAPGRQRLRSSDPPSRHTHRRTPG
GACPATQFPPPMSDSE
>Q68980 ~~~~~~Cytoplasmic envelopment protein 3~~~
MGLSFSGARPCCCRNNVLITDDGEVVSLTAHDFDVVDIESEEEGNFYVPPDMRVVTRAPGRQRLRSSDPPSRHTHRRTPG
GACPATQFPPPMSDSE
>P13294 ~~~~~~Cytoplasmic envelopment protein 3~~~
MGLAFSGARPCCCRHNVITTDGGEVVSLTAHEFDVVDIESEEEGNFYVPPDVRVVTRAPGPQYRRPSDPPSRHTRRRDPD
VARPPATLTPPLSDSE
>Q98325 ~~~~~~Viral CASP8 and FADD-like apoptosis regulator~~~
MSDSKEVPSLPFLRHLLEELDSHEDSLLLFLCHDAAPGCTTVTQALCSLSQQRKLTLAALVEMLYVLQRMDLLKSRFGLS
KEGAEQLLGTSFLTRYRKLMVCVGEELDSSELRALRLFACNLNPSLSTALSESSRFVELVLALENVGLVSPSSVSVLADM
LRTLRRLDLCQQLVEYEQQEQARYRYCYAASPSLPVRTLRRGHGASEHEQLCMPVQESSDSPELLRTPVQESSSDSPEQT
T
>Q01043 ~~~~~~Cyclin homolog~~~
MADSPNRLNRAKIDSTTMKDPRVLNNLKLRELLLPKFTSLWEIQTEVTVDNRTILLTWMHLLCESFELDKSVFPLSVSIL
DRYLCKKQGTKKTLQKIGAACVLIGSKIRTVKPMTVSKLTYLSCDCFTNLELINQEKDILEALKWDTEAVLATDFLIPLC
NALKIPEDLWPQLYEAASTTICKALIQPNIALLSPGLICAGGLLTTIETDNTNCRPWTCYLEDLSSILNFSTNTVRTVKD
QVSEAFSLYDLEIL
>P41684 3.2.1.14~~~CHIA~~~Chitinase~~~
MLYKLLNVLWLVAVSNAIPGTPVIDWADRNYALVEINYEATAYENLIKPKEQVDVQVSWNVWNGDIGDIAYVLFDEQQVW
KGDAESKRATIKVLVSGQFNMRVKLCNEDGCSVSDPVLVKVADTDGGHLAPLEYTWLENNKPGRREDKIVAAYFVEWGVY
GRNFPVDKVPLPNLSHLLYGFIPICGGDGINDALKTIPGSFESLQRSCKGREDFKVAIHDPWAAVQKPQKGVSAWNEPYK
GNFGQLMAAKLANPHLKILPSIGGWTLSDPFYFMHDVEKRNVFVDSVKEFLQVWKFFDGVDIDWEFPGGKGANPSLGDAD
GDAKTYILLLEELRAMLDDLEAQTGRVYELTSAISAGYDKIAVVNYAEAQKSLGKIFLMSYDFKGAWSNTDLGYQTTVYA
PSWNSEELYTTHYAVDALLKQGVDPNKIIVGVAMYGRGWTGVTNYTNDNYFSGTGNGPGSGTWEDGVVDYRQIQKDLNNY
VYTFDSAAQASYVFDKSKGDLISFDSVDSVLGKVKYVDRNKLGGLFAWEIDADNGDLLNAINAQFKPKDEL
>B3FIW8 ~~~~~~Chimallin~~~
MIRDTATNTTQTQAAPQQAPAQQFTQAPQEKPMQSTQSQPTPSYAGTGGINSQFTRSGNVQGGDARASEALTVFTRLKEQ
AVAQQDLADDFSILRFDRDQHQVGWSSLVIAKQISLNGQPVIAVRPLILPNNSIELPKRKTNIVNGMQTDVIESDIDVGT
VFSAQYFNRLSTYVQNTLGKPGAKVVLAGPFPIPADLVLKDSELQLRNLLIKSVNACDDILALHSGERPFTIAGLKGQQG
ETLAAKVDIRTQPLHDTVGNPIRADIVVTTQRVRRNGQQENEFYETDVKLNQVAMFTNLERTPQAQAQTLFPNQQQVATP
APWVASVVITDVRNADGIQANTPEMYWFALSNAFRSTHGHAWARPFLPMTGVAKDMKDIGALGWMSALRNRIDTKAANFD
DAQFGQLMLSQVQPNPVFQIDLNRMGETAQMDSLQLDAAGGPNAQKAAATIIRQINNLGGGGFERFFDHTTQPILERTGQ
VIDLGNWFDGDEKRDRRDLDNLAALNAAEGNENEFWGFYGAQLNPNLHPDLRNRQSRNYDRQYLGSTVTYTGKAERCTYN
AKFIEALDRYLAEAGLQITMDNTSVLNSGQRFMGNSVIGNNMVSGQAQVHSAYAGTQGFNTQYQTGPSSFY
>A0A482GDX1 ~~~~~~Chimallin~~~
MGLDVRNNGNDNVEIRAAETRTAQRADEALETAADFAGQPKVTHTMRTINRTLSRRISRNTGSEQVLNLRRLMEKYLEDT
RFKDDFIFVAVDPNQYSVPYPTLVVMSGAKVGDHNHFFGYVLPLVAGLAPLPRREEQGPHGNILVPRTWVDNLNGTFINE
VMAAMYAAIGGKSNGTARIAGLAVVTNEITAESAHLATTLLSAADNAIQTAIEIRLGDKLGLPQFNLGMMASDQPISSVQ
YNTSGMQDSDIVGNPVRSDITVTISNRIRQAMSDYDSQQRLVATTGYIDLTYSPQNPTFNQGPVLVNGYPVPPTVQYQPR
YVMTSAYPLELDAFTPNTFVLGLIGTIATLNSGMAWAQSLISNAARGIGPHNPGALAMVLDPEVTAPLDLSTQTNEQIYK
FLQQVLYPSLLISIDVPEEGEYSWLLRMIPAAEKIYTGKVEGEVREISEGYKALYRAFDDVTLGCFSKKYQYGLPLVYAT
GNRIPLGHYNHQDGHRHDIRDMDDLYMMNITNPDTVEAWEDSFDRTDMTMSQRVVARHEIIDRVLSGSWEQTGWAMRYDF
DPLALQALIEAAADAGFTIRPENIQHLAGTAVRGNMAARARGLGNISGNIYARSDRPNVGVNNMGGAFNLF
>F8SJT5 ~~~~~~Chimallin~~~
MQQTQQGPKVQTQTLQGGAGNLNSIFQRSGRTDGGDARASEALAVFNKLKEEAIAQQDLHDDFLVFRFDRDQNRVGYSAL
LVVKRAAINGQQVIVTRPLVMPNDQITLPTKKLTIQNGMHQETIEAEADVQDVFTTQYWNRICDSIRQQTGKHDAMVINA
GPTVIPADFDLKDELVLKQLLIKSVNLCDDMLAKRSGEQPFSVAMLKGTDETLAARLNFTGKPMHDSLGYPIRSDILVSL
NRVKKPGQQENEFYEAEDKLNQVSCFVNLEYTPQPQQAIYGAPQQTQQLPPLTPAIVITDVRQAEWLKANTMELYLFALS
NAFRVTANQSWARSLLPQLGKVKDMRDIGAIGYLSRLAARVETKTETFTDQNFAELLYNMVRPSPVFMSDLNRFGDNAAI
ENVFIDALGGVNQQRAVAAIIAGVNNLIGGGFEKFFDHNTMPIIQPYGTDIQLGYYLDGEGEKQDRRDLDVLGALNASDG
NIQEWMSWYGTQCNVAVHPELRARQSKNFDRQYLGNSVTYTTRAHRGIWNPKFIEALDKAIASVGLTVAMDNVAQVFGAQ
RFSGNLAIADYAVTGTAQVSSGLVSNGGYNPQFGVGQGSGFY
>P08557 ~~~N~~~DNA circularization protein N~~~
MFEDALNAVNAVRDKTGGGRKTTGKGTFRNVPFLVIEEQKQAGGRRLVKREYPLRDTGGVNDLGKKLRSRTFSACILNSN
AETARDEAGALMDALDAPGSGELVHPDFGTVDVMVDSWECRTKADELNYYAFTVTVYPSLQDTAPDAETDTSAAVPAQAV
AVTGSLGDTLSSVWQTVKDGTAAATAVMEAVTGVIDDISDAVDNLGVTQTVSGLMGSLSAMKGSVTSLINQPAMLASSLM
GALSGVSSLCDTRTAFSTWNRLAQRFERRHAATAGRQGTITTSYNSPVAEKNIATLNYVMLAAAQTYRAEAASQALTAAL
DFSRRMDNAARAPVLDAPSTTTGTASGASSTSATVTQGQLQLTAITPDGGFSQVSFSDSGTATPPVFESVSDIEKTTAML
GAALDSVILTASEQGFSTDSVQLTQLRLLVVADLEKRGLQLAGSESHHLPETLPAMVALYRFTGNSRNWQRLARRNGISN
PLFVPGGVSIEVINE
>O80164 ~~~~~~Sliding clamp~~~
MKLSKDTIAILKNFASINSGILLSQGKFIMTRAVNGTTYAEANISDEIDFDVALYDLNSFLSILSLVSDDAEISMHTDGN
IKIADTRSTVYWPAADKSTIVFPNKPIQFPVASVITEIKAEDLQQLLRVSRGLQIDTIAITNKDGKIVINGYNKVEDSGL
TRPKYSLTLTDYDGSNNFNFVINMANMKIQPGNYKVMLWGAGDKVAAKFESSQVSYVIAMEADSTHDF
>P04525 ~~~~~~Sliding clamp~~~
MKLSKDTTALLKNFATINSGIMLKSGQFIMTRAVNGTTYAEANISDVIDFDVAIYDLNGFLGILSLVNDDAEISQSEDGN
IKIADARSTIFWPAADPSTVVAPNKPIPFPVASAVTEIKAEDLQQLLRVSRGLQIDTIAITVKEGKIVINGFNKVEDSAL
TRVKYSLTLGDYDGENTFNFIINMANMKMQPGNYKLLLWAKGKQGAAKFEGEHANYVVALEADSTHDF
>Q65389 ~~~DNA-C~~~Cell cycle link protein~~~
MEFWESSAMPDDVKREIKEIYWEDRKKLLFCQKLKSYVRRILVYGDQEDALAGVKDMKTSIIRYSEYLKKPCVVICCVSN
KSIVYRLNSMVFFYHEYLEELGGDYSVYQDLYCDEVLSSSSTEEEDVGVIYRNVIMASTQEKFSWSDCQQIVISDYDVTL
L
>Q9WIJ4 ~~~DNA-C~~~Cell cycle link protein~~~
MGLKYFSHLPEELREKIVHDHLQQERKKEFLEKAIEDSCRRHVSLLKSDPSPSEMYSLSKFLDSLADYVGRQFNTRCLIK
WRKDVPANIKFQVMEEQHLRLYGFLDMDDLSCRELLPPEEDDDITYEDGMIVNCSELDKLFAALGIRVVYITVSNNCICT
PLNKDIVIS
>P41469 2.1.1.57~~~~~~Cap-specific mRNA (nucleoside-2'-O-)-methyltransferase~~~
MLQQKLNKLKDGLNTFSSKSVVCARSKLFDKRPTRRPRCWRKLSEIDKKFHVCRHVDTFLDLCGGPGEFANYTMSLNPLC
KAYGVTLTNNSVCVYKPTVRKRKNFTTITGPDKSGDVFDKNVVFEISIKCGNACDLVLADGSVDVNGRENEQERLNFDLI
MCETQLILICLRPGGNCVLKVFDAFEHETIQMLNKFVNHFEKWVLYKPPSSRPANSERYLICFNKLVRPYCNNYVNELEK
QFEKYYRIQLKNLNKLINLLKI
>P19270 ~~~~~~Coat protein TP1~~~
MIIIKTKNREYTIDENKEINSISAADTICPYRAYRSYHRKIPIKIDDCKEARKIAGSMTHYVILRQLKEQGCETEKIVDP
ELKFRADAICDGDALLRLRQTSIGTGMRLISGN
>P19271 ~~~~~~Coat protein TP2~~~
MKVLRLGFTDKYIEKQEPIDVVFDKYQPLGYVAVVELPSRIPWIIEIQRREWIERFITMPRDIFRELSFDIIILRRKLEP
TPQYRMIRDIVSDLRQSRGYASGMVILPNGLTYDGDLLEGIEVMEGVDVIAYTLGLIDF
>P19272 ~~~~~~Coat protein TP3~~~
MVEIKLVNKEIIKFGLALGIVNELNEYLYAAMPPMYDILSKLYGYGRTVNAMLYSVLFNLLESRIDTLRRLDLSKFFAVV
AYMDAVKDTVELAINGYAYVTTSGFTVYPASQITGTYESDFSQAKAVPANTTATQVWFAPKKFALITQNKIYTFLAPYWF
>P15158 ~~~~~~Coat protein~~~
MDSSEVVKVKQASIPAPGSILSQPNTEQSPAIVLPFQFEATTFGTAETAAQVSLQTADPITKLTAPYRHAQIVECKAILT
PTDLAVSNPLTVYLAWVPANSPATPTQILKLRVYGGQSFVLGGAISAAKTIEVPLNLDSVNRMLKDSVTYTDTPKLLAYS
RAPTNPSKIPTASIQISGRIRLSKPMLIAN
>Q04754 ~~~~~~Major capsid protein~~~
MTVVLDSKDLARIDEEYKADSQVWSYLTGGNGVTQRFRGHNEVRINKLSGFVDATAYKRGQDNARKTISVGKETVKLTHE
DWFGYDLDQFDMDENGAYTVENVVREHNKMITIPHRDKVAVQKLFDSAAKKATDSITKDNALDAYDTAEAYMFDNEVPGG
FVMFVSSAYYTALKQSAAVTRTFSTDGTMVINGIDRRVAQLDGGVPIVRVSSDRLKGLGITNHVNFILTPLSAIAPIVKY
DSVSVIDPSTDRSGNRWTIKGLSYYDAIVLDNAKKGIYVAATAGV
>P36351 ~~~~~~Coat protein~~~
MDSSEVVKVKQASIPAPGSILSQPNTEQSPAIVLPFQFEATTFGTAETAAQVSLQTADPITKLTAPYRHAQIVECKAILT
PTDLAVSNPLTVYLAWVPANSPATPTQILRVYGGQSFVLGGAISAAKTIEVPLNLDSVNRMLKDSVTYTDTPKLLAYSRA
PTNPSKIPTASIQISGRIRLSKPMLIAN
>P23629 ~~~~~~Coat protein~~~
MSKKAVPPIVKAQYELYNRKLNRAIKVSGSQKKLDASFVGFSESSNPETGKPHADMSMSAKVKRVNTWLKNFDREYWENQ
FASKPIPRPAKQVLKGSSSKSQQRDEGEVVFTRKDSQKSVRTVSYWVCTPEKSMKPLKYKEDENVVEVTFNDLAAQKAGD
KLVSILLEINVVGGAVDDKGRVAVLEKDAAVTVDYLLGSPYEAINLVSGLNKINFRSMTDVVDSIPSLLNERKVCVFQND
DSSSFYIRKWANFLQEVSAVLPVGTGKSSTIVLT
>Q86993 ~~~~~~Coat protein~~~
MAPKRSRRSNRRAGSRAAATSLVYDTCYVTLTERATTSFQRQSFPTLKGMGDRAFQVVAFTIQGVSAAPLMYNARLYNPG
DTDSVHATGVQLMGTVPRTVRLTPRVGQNNWFFGNTEEAETILAIDGLVSTKGANAPSNTVIVTGCFRLAPSELQSS
>P17574 ~~~~~~Coat protein~~~
MGRGKVKPNRKSTGDNSNVVTMIRAGSYPKVNPTPTWVRAIPFEVSVQSGIAFKVPVGSLFSANFRTDSFTSVTVMSVRA
WTQLTPPVNEYSFVRLKPLFKTGDSTEEFEGRASNINTRASVGYRIPTNLRQNTVAADNVCEVRSNCRQVALVISCCFN
>P03608 ~~~~~~Coat protein~~~
MEIDKELAPQDRTVTVATVLPAVPGPSPLTIKQPFQSEVLFAGTKDAEASLTIANIDSVSTLTTFYRHASLESLWVTIHP
TLQAPTFPTTVGVCWVPANSPVTPAQITKTYGGQIFCIGGAINTLSPLIVKCPLEMMNPRVKDSIQYLDSPKLLISITAQ
PTAPPASTCIITVSGTLSMHSPLITDTST
>P20125 ~~~~~~Coat protein~~~
MEIDKELAPQDRTVTVATVLPTVPGPSPFTIKQPFQSEVLFAGTKDAEASLTIANIDSVSTLTTFYRHASLESLWVTIHP
TLQAPAFPTTVGVCWVPANSPVTPTQITKTYGGQIFCIGGAINTLSPLIVKCPLEMMNPRVKDSIQYLDSPKLLISITAQ
PTAPPASTCIITVSGTLSMHSPLITDTST
>Q7Y5D9 ~~~~~~Collar protein p132~~~
MSTENRVIDLVVDENVPYGLLMQFMDVDDSVYPSTSKPVDLTDFSLRGSIKSSLEDGAETVASFTTAIVDAAQGVASISL
PVSAVTTIASKASKERDRYNPRQRLAGYYDVIITRTAVGSAASSFRIMEGKVYISDGVTQ
>O48448 ~~~~~~Tail completion protein gp17~~~
MTWKLASRALQKATVENLESYQPLMEMVNQVTESPGKDDPYPYVVIGDQSSTPFETKSSFGENITMDFHVWGGTTRAEAQ
DISSRVLEALTYKPLMFEGFTFVAKKLVLAQVITDTDGVTKHGIIKVRFTINNN
>P11112 ~~~~~~Tail completion protein gp15~~~
MFGYFYNSSFRRYATLMGDLFSNIQIKRQLESGDKFIRVPITYASKEHFMMKLNKWTSINSQEDVAKVETILPRINLHLV
DFSYNAPFKTNILNQNLLQKGATSVVSQYNPSPIKMIYELSIFTRYEDDMFQIVEQILPYFQPHFNTTMYEQFGNDIPFK
RDIKIVLMSAAIDEAIDGDNLSRRRIEWSLTFEVNGWMYPPVDDAEGLIRTTYTDFHANTRDLPDGEGVFESVDSEVVPR
DIDPEDWDGTVKQTFTSNVNRPTPPEPPGPRT
>Q6QGE0 ~~~~~~Tail completion protein p143~~~
MSLSDLARQIIKEQLDTASRSENNKNTVVYSVETGLKDPTRDGTVAQVSFKFSKPVSQDLLNIRTASILKAVSSSLDLSG
DLGALENLIQATAGKKSSVGKKRSTGRVQVNFGDPSDVEDGYSGAVTGASGRFVSNSNMKIILEIVAKEYLIKDMKKAGA
PLKFRTGRFANSLKIKDVMLRDSETSKGSPELNVTYNYMTRPYSVFNPAVSTYRRLSLRPYPGARNPQKLIGEAIAKAAR
DLIHSRYKIKVNQGT
>Q38200 ~~~com~~~Translational activator com~~~
MKSIRCKNCNKLLFKADSFDHIEIRCPRCKRHIIMLNACEHPTEKHCGKREKITHSDETVRY
>Q38621 ~~~com~~~Translational activator com~~~
MKSIRCKNCNKLLFKADSFDHIEIRCPRCKRHIIMLNACEHPTEKHCGKREKITHSDETVRY
>P14269 ~~~L2~~~Pre-core protein X~~~
MALTCRLRFPVPGFRGRMHRRRGMAGHGLTGGMRRAHHRRRRASHRRMRGGILPLLIPLIAAAIGAVPGIASVALQAQRH
>Q2KS10 ~~~L2~~~Pre-core protein X~~~
MALTCRLRFPVPGFRGRMHRRRGMAGHGLTGGMRRAHHRRRRASHRRMRGGILPLLIPLIAAAIGAVPGIASVALQAQRH
>P03267 ~~~L2~~~Core-capsid bridging protein~~~
MSKRKIKEEMLQVIAPEIYGPPKKEEQDYKPRKLKRVKKKKKDDDDDELDDEVELLHATAPRRRVQWKGRRVRRVLRPGT
TVVFTPGERSTRTYKRVYDEVYGDEDLLEQANERLGEFAYGKRHKDMLALPLDEGNPTPSLKPVTLQQVLPTLAPSEEKR
GLKRESGDLAPTVQLMVPKRQRLEDVLEKMTVEPGLEPEVRVRPIKQVAPGLGVQTVDVQIPTTSSTSIATATEGMETQT
SPVASAVADAAVQAAAAAASKTSTEVQTDPWMFRVSAPRRPRRSRKYGTASALLPEYALHPSIAPTPGYRGYTYRPRRRA
TTRRRTTTGTRRRRRRRQPVLAPISVRRVAREGGRTLVLPTARYHPSIV
>P24938 ~~~L2~~~Core-capsid bridging protein~~~
MSKRKIKEEMLQVIAPEIYGPPKKEEQDYKPRKLKRVKKKKKDDDDELDDEVELLHATAPRRRVQWKGRRVKRVLRPGTT
VVFTPGERSTRTYKRVYDEVYGDEDLLEQANERLGEFAYGKRHKDMLALPLDEGNPTPSLKPVTLQQVLPALAPSEEKRG
LKRESGDLAPTVQLMVPKRQRLEDVLEKMTVEPGLEPEVRVRPIKQVAPGLGVQTVDVQIPTTSSTSIATATEGMETQTS
PVASAVADAAVQAVAAAASKTSTEVQTDPWMFRVSAPRRPRGSRKYGAASALLPEYALHPSIAPTPGYRGYTYRPRRRAT
TRRRTTTGTRRRRRRRQPVLAPISVRRVAREGGRTLVLPTARYHPSIV
>P0DSV7 ~~~~~~Cytokine response-modifying protein B~~~
MKSVLYLYILFLSCIIINGRDAAPYTPPNGKCKDTEYKRHNLCCLSCPPGTYASRLCDSKTNTQCTPCGSGTFTSRNNHL
PACLSCNGRCNSNQVETRSCNTTHNRICECSPGYYCLLKGSSGCKACVSQTKCGIGYGVSGHTSVGDVICSPCGFGTYSH
TVSSADKCEPVPNNTFNYIDVEITLYPVNDTSCTRTTTTGLSESILTSELTITMNHTDCNPVFREEYFSVLNKVATSGFF
TGENRYQNISKVCTLNFEIKCNNKGSSFKQLTKAKNDDGMMSHSETVTLAGDCLSSVDIYILYSNTNAQDYETDTISYRV
GNVLDDDSHMPGSCNIHKPITNSKPTRFL
>P03222 ~~~CVC1~~~Capsid vertex component 1~~~
MDVHIDNQVLSGLGTPLLVHLFVPDTVMAELCPNRVPNCEGAWCQTLFSDRTGLTRVCRVFAARGMLPGRPSHRGTFTSV
PVYCDEGLPELYNPFHVAALRFYDEGGLVGELQIYYLSLFEGAKRALTDGHLIREASGVQESAAAMQPIPIDPGPPGGAG
IEHMPVAAAQVEHPKTYDLKQILLEITQEENRGEQRLGHAGSPALCLGLRLRAGAETKAAAETSVSKHHPALENPSNIRG
SAGGEGGGGRAGTGGTVGVGSGALSRVPVSFSKTRRAIRESRALVRGIAHIFSPHALYVVTYPELSAQGRLHRMTAVTHA
SPATDLAEVSILGAPEREFRFLISVALRISASFREKLAMQAWTAQQEIPVVIPTSYSRIYKNSDLIREAFFTVQTRVSWE
SCWVKATISNAPKTPDACLWIDSHPLYEEGASAWGKVIDSRPPGGLVGAASQLVALGTDGHCVHLATTSDGQAFLVLPGG
FVIKGQLALTPEERGYILARHGIRREQ
>P16799 ~~~CVC1~~~Capsid vertex component 1~~~
METHLYSDLAFEARFADDEQLPLHLVLDQEVLSNEEAETLRYVYYRNVDSAGRSTGRAPGGDEDDAPASDDAEDAVGGDR
AFDRERRTWQRACFRVLPRPLELLDYLRQSGLTVTLEKEQRVRMFYAVFTTLGLRCPDNRLSGAQTLHLRLVWPDGSYRD
WEFLARDLLREEMEANKRDRQHQLATTTNHRRRGGLRNNLDNGSDRRLPEAAVASLETAVSTPFFEIPNGAGTSSANGDG
RFSNLEQRVARLLRGDEEFIYHAGPLEPPSKIRGHELVQLRLDVNPDLMYATDPHDRDEVARTDEWKGAGVSRLREVWDV
QHRVRLRVLWYVNSFWRSRELSYDDHEVELYRALDAYRARIAVEYVLIRAVRDEIYAVLRRDGGALPQRFACHVSRNMSW
RVVWELCRHALALWMDWADVRSCIIKALTPRLSRGAAAAAQRARRQRERSAPKPQELLFGPRNESGPPAEQTWYADVVRC
VRAQVDLGVEVRAARCPRTGLWIVRDRRGRLRRWLSQPEVCVLYVTPDLDFYWVLPGGFAVSSRVTLHGLAQRALRDRFQ
NFEAVLARGMHVEAGRQEPETPRVSGRRLPFDDL
>P10201 ~~~CVC1~~~Capsid vertex component 1~~~
MNAHLANEVQTISATARVGPRSLVHVIISSECLAAAGIPLAALMRGRPGLGTAANFQVEIQTRAHATGDCTPWCTAFAAY
VPADAVGELLAPVVPAHPGLLPRASSAGGLFVSLPVVCDAQGVYDPYAVAALRLAWGSGASCARVILFSYDELVPPNTRY
AADSTRIMRVCRHLCRYVALLGAAAPPAAKEAAAHLSMGLGESASPRPQPLARPHAGAPADPPIVGASDPPISPEEQLTA
PGGDTTAAQDVSIAQENEEILALVQRAVQDVTRRHPVRARTGRAACGVASGLRQGALVHQAVSGGAMGAADADAVLAGLE
PPGGGRFVAPAPHGPGGEDILNDVLTLTPGTAKPRSLVEWLDRGWEALAGGDRPDWLWSRRSISVVLRHHYGTKQRFVVV
SYENSVAWGGRRARPPLLSSALATALTEACAAERVVRPHQLSPAGQAELLLRFPALEVPLRHPRPVLPPFDIAAEVAFTA
RIHLACLRALGQAIRAALQGGPRISQRLRYDFGPDQRAWLGEVTRRFPILLENLMRAVEGTAPDAFFHTAYALAVLAHLG
GRGGRGRRVVPLGDDLPARFADSDGHYVFDYYSTSGDTLRLNNRPIAVAMDGDVSKREQSKCRFMEAVPSTAPRRVCEQY
LPGESYAYLCLGFNRRLCGIVVFPGGFAFTINIAAYLSLSDPVARAAVLRFCRKVSSGNGRSR
>P89440 ~~~CVC1~~~Capsid vertex component 1~~~
MNAHFANEVQYDLTRDPSSPASLIHVIISSECLAAAGVPLSALVRGRPDGGAAANFRVETQTRAHATGDCTPWRSAFAAY
VPADAVGAILAPVIPAHPDLLPRVPSAGGLFVSLPVACDAQGVYDPYTVAALRLAWGPWATCARVLLFSYDELVPPNTRY
AADGARLMRLCRHFCRYVARLGAAAPAAATEAAAHLSLGMGESGTPTPQASSVSGGAGPAVVGTPDPPISPEEQLTAPGG
DTATAEDVSITQENEEILALVQRAVQDVTRRHPVRARPKHAASGVASGLRQGALVHQAVSGGALGASDAEAVLAGLEPPG
GGRFASRGGPRAAGEDVLNDVLTLVPGTAKPRSLVEWLDRGWEALAGGDRPDWLWSRRSISVVLRHHYGTKQRFVVVSYE
NSVAWGGRRARPPRLSSELATALTEACAAERVVRPHQLSPAAQTALLRRFPALEGPLRHPRPVLQPFDIAAEVAFVARIQ
IACLRALGHSIRAALQGGPRIFQRLRYDFGPHQSEWLGEVTRRFPVLLENLMRALEGTAPDAFFHTAYALAVLAHLGGQG
GRGRRRRLVPLSDDIPARFADSDAHYAFDYYSTSGDTLRLTNRPIAVVIDGDVNGREQSKCRFMEGSPSTAPHRVCEQYL
PGESYAYLCLGFNRRLCGLVVFPGGFAFTINTAAYLSLADPVARAVGLRFCRGAATGPGLVR
>F5HB39 ~~~CVC1~~~Capsid vertex component 1~~~
MDAHAINERYVGPRCHRLAHVVLPRTFLLHHAIPLEPEIIFSTYTRFSRSPGSSRRLVVCGKRVLPGEENQLASSPSGLA
LSLPLFSHDGNFHPFDISVLRISCPGSNLSLTVRFLYLSLVVAMGAGRNNARSPTVDGVSPPEGAVAHPLEELQRLARAT
PDPALTRGPLQVLTGLLRAGSDGDRATHHMALEAPGTVRGESLDPPVSQKGPARTRHRPPPVRLSFNPVNADVPATWRDA
TNVYSGAPYYVCVYERGGRQEDDWLPIPLSFPEEPVPPPPGLVFMDDLFINTKQCDFVDTLEAACRTQGYTLRQRVPVAI
PRDAEIADAVKSHFLEACLVLRGLASEASAWIRAATSPPLGRHACWMDVLGLWESRPHTLGLELRGVNCGGTDGDWLEIL
KQPDVQKTVSGSLVACVIVTPALEAWLVLPGGFAIKGRYRASKEDLVFIRGRYG
>P03233 ~~~CVC2~~~Capsid vertex component 2~~~
MALSGHVLIDPARLPRDTGPELMWAPSLRNSLRVSPEALELAEREAERARSERWDRCAQVLKNRLLRVELDGIMRDHLAR
AEEIRQDLDAVVAFSDGLESMQVRSPSTGGRSAPAPPSPSPAQPFTRLTGNAQYAVSISPTDPPLMVAGSLAQTLLGNLY
GNINQWVPSFGPWYRTMSANAMQRRVFPKQLRGNLNFTNSVSLKLMTEVVAVLEGTTQDFFSDVRHLPDLQAALILSVAY
LLLQGGSSHQQRPLPASREELLELGPESLEKIIADLKAKSPGGNFMILTSGNKEARQSIAPLNRQAAYPPGTFADNKIYN
LFVGAGLLPTTAALNVPGAAGRDRDLVYRIANQIFGEDVPPFSSHQWNLRVGLAALEALMLVYTLCETANLAEAATRRLH
LSSLLPQAMQRRKPAMASAGMPGAYPVQTLFRHGELFRFIWAHYVRPTVAADPQASISSLFPGLVLLALELKLMDGQAPS
HYAINLTGQKFDTLFEIINQKLLFHDPAAMLAARTQLRLAFEDGVGVALGRPSPMLAAREILERQFSASDDYDRLYFLTL
GYLASPVAPS
>P16726 ~~~CVC2~~~Capsid vertex component 2~~~
MSLLHTFWRLPVAVFFEPHEENVLRCPERVLRRLLEDAAVTMRGGGWREDVLMDRVRKRYLRQELRDLGHRVQTYCEDLE
GRVSEAEALLNQQCELDEGPSPRTLLQPPCRPRSSSPGTGVAGASAVPHGLYSRHDAITGPAAAPSDVVAPSDAVAASAA
AGASSTWLAQCAERPLPGNVPSYFGITQNDPFIRFHTDFRGEVVNTMFENASTWTFSFGIWYYRLKRGLYTQPRWKRVYH
LAQMDNFSISQELLLGVVNALENVTVYPTYDCVLSDLEAAACLLAAYGHALWEGRDPPDSVATVLGELPQLLPRLADDVS
REIAAWEGPVAAGNNYYAYRDSPDLRYYMPLSGGRHYHPGTFDRHVLVRLFHKRGVIQHLPGYGTITEELVQERLSGQVR
DDVLSLWSRRLLVGKLGRDVPVFVHEQQYLRSGLTCLAGLLLLWKVTNADSVFAPRTGKFTLADLLGSDAVAGGGLPGGR
AGGEEEGYGGRHGRVRNFEFLVRYYIGPWYARDPAVTLSQLFPGLALLAVTESVRSGWDPSRREDSAGGGDGGGAVLMQL
SKSNPVADYMFAQSSKQYGDLRRLEVHDALLFHYEHGLGRLLSVTLPRHRVSTLGSSLFNVNDIYELLYFLVLGFLPSVA
VL
>P10209 ~~~CVC2~~~Capsid vertex component 2~~~
MDPYCPFDALDVWEHRRFIVADSRNFITPEFPRDFWMSPVFNLPRETAAEQVVVLQAQRTAAAAALENAAMQAAELPVDI
ERRLRPIERNVHEIAGALEALETAAAAAEEADAARGDEPAGGGDGGAPPGLAVAEMEVQIVRNDPPLRYDTNLPVDLLHM
VYAGRGATGSSGVVFGTWYRTIQDRTITDFPLTTRSADFRDGRMSKTFMTALVLSLQACGRLYVGQRHYSAFECAVLCLY
LLYRNTHGAADDSDRAPVTFGDLLGRLPRYLACLAAVIGTEGGRPQYRYRDDKLPKTQFAAGGGRYEHGALASHIVIATL
MHHGVLPAAPGDVPRDASTHVNPDGVAHHDDINRAAAAFLSRGHNLFLWEDQTLLRATANTITALGVIQRLLANGNVYAD
RLNNRLQLGMLIPGAVPSEAIARGASGSDSGAIKSGDNNLEALCANYVLPLYRADPAVELTQLFPGLAALCLDAQAGRPV
GSTRRVVDMSSGARQAALVRLTALELINRTRTNPTPVGEVIHAHDALAIQYEQGLGLLAQQARIGLGSNTKRFSAFNVSS
DYDMLYFLCLGFIPQYLSAV
>P89448 ~~~CVC2~~~Capsid vertex component 2~~~
MDPYYPFDALDVWEHRRFIVADSRSFITPEFPRDFWMLPVFNIPRETAAERAAVLQAQRTAAAAALENAALQAAELPVDI
ERRIRPIEQQVHHIADALEALETAAAAAEEADAARDAEARGEGAADGAAPSPTAGPAAAEMEVQIVRNDPPLRYDTNLPV
DLLHMVYAGRGAAGSSGVVFGTWYRTIQERTIADFPLTTRSADFRDGRMSKTFMTALVLSLQSCGRLYVGQRHYSAFECA
VLCLYLLYRTTHESSPDRDRAPVAFGDLLARLPRYLARLAAVIGDESGRPQYRYRDDKLPKAQFAAAGGRYEHGALATHV
VIATLVRHGVLPAAPGDVPRDTSTRVNPDDVAHRDDVNRAAAAFLARGHNLFLWEDQTLLRATANTITALAVLRRLLANG
NVYADRLDNRLQLGMLIPGAVPAEAIARGASGLDSGAIKSGDNNLEALCVNYVLPLYQADPTVELTQLFPGLAALCLDAQ
AGRPLASTRRVVDMSSGARQAALVRLTALELINRTRTNTTPVGEIINAHDALGIQYEQGPGLLAQQARIGLASNTKRFAT
FNVGSDYDLLYFLCLGFIPQYLSVA
>Q2HRB3 ~~~CVC2~~~Capsid vertex component 2~~~
MLTSERSYLRYPKNRRWTEAGRFWAPHPENVLFIHKPTMEETRRVALGLRSQLVRNRERKTKAHLLSLELDRLVQVHDSR
VRVINADIDAVKQMIGNMTWSDNIDMPQSRSHEPPLVTSPPQASHRNFTVAIVPGDPHFSVDRDLRGELMPTLYMNQNQW
LPSFGPWFISLTDNAMQRRVFPKELKGTVNFQNSTSLKLISHTLTTVASTTADFFADARHLTDTQAALCLVNAYFCQKTS
RQLPATPDDLLADLPQKLDLLITQLKQESGPGDFSFTYSNPQERASLAPLNKESRYPTAFFQRHKLHAMMAKAGLFPHNK
GTGAPGTAPAMDLVFAITSAMFGSDIPPFSAYQWNLRAGIVALEVFILAYGLLEFGQVARGHPNRRLNLVSLLGPKFQPG
ALPDPNAPMLKRGQLFSFISEHYIIPTLQANPNAPVSFIFPGIILAALEARSTVSHKQPGPFVNLTGSRFNEIFEILNQQ
LTFRDPLALLQARTALRLATEEGLDVLLSHPSPPTLLQEIIKSQFGGGDDYDRAYFMVLGCLPVVLAVV
>F5HBX1 ~~~~~~Chemokine vCXCL1~~~
MRLIFGALIISLTYMYYYEVHGTELRCKCLDGKKLPPKTIMLGNFWFHRESGGPRCNNNEYFLYLGGGKKHGPGVCLSPH
HPFSKWLDKRNDNRWYNVNVTRQPERGPGKITVTLVGLKE
>Q88939 ~~~orfA~~~Retroviral cyclin~~~
MDIPVEFLTAQEPLSYGHIPPVYWKELLNWIDRILTHNQATPNTWEATHMVLLKLHGTLSFSNPAQLPLVAAACLQIAAK
HTEAHSRLADPDYITMLGDGVYTKPSLLLTETMALFIVGGHVGAYTLAACDWLLGSLPFSQAENDLLHPYMYHYIKLSYR
HRTPDYHSSPALRAAVVIAAAVKGADLLEMNMLFIMMYHLTHISTASLSLGLTHFTAALQRQINLDFAEAEQREAAERRA
LLEREREQQLQEARERLDDVMAVLEAEVAITITTATEGTDAEDTSEVDVINVVDPIG
>O55779 ~~~P/V/C~~~Protein C~~~
MMASILLTLFRRTKKKYKRHTDDQASNNQVPKTGQEHGRTSCRAPVENMNRLRGECLRMMEVLKEEMWRIYPVLLPQMEL
LDKECQTPELGQKTQMTYNWTQWLQTLYTMIMEENVPDMDLLQALREGGVITCQEHTMGMYVLYLIQRCCPMLPKLQFLK
KLGKLI
>P35977 ~~~P/V/C~~~Protein C~~~
MSKTDWNASGLSRPSPSAHWPSRKLWQHGQKYQTTQDRSEPPAGKRRQAVRVSANHASQQLDQLKAVHLASAVRDLERAM
TTLKLWESPQEISRHQALGYSVIMFMITAVKRLRESKMLTLSWFNQALMVIAPYQEETMNLKTAMWILANLIPRDMLSLT
GDLLPSLWGSGLLMLKLQKEGRSTSS
>Q00794 ~~~P/V/C~~~Protein C~~~
MSKTDWNASGPSRPSPSAHWPSGKLWQHGQKYQTTQDRSGPPTRRRRQAVRVSANHASQQLDQLKAVHLASAVRDLERAM
TTLKLWESPQEISRHQALGYSVIMFMITAVKRLRESKMLTLSWFNQALMVTAPSQKETMNLKTAMWILANLIPRDMLSLT
GDLLPSLWGSGLPMLKLQKEGRSTSS
>P28055 ~~~P/V/C~~~Protein C~~~
MPSFLRGILKPKERHHENKNHSQVSSDSLTSSYPTSPPKLEKTEAGSMVSSTTQKKTSHHAKPTITTKTEQSQRRPKIID
QVRRVESLGEQVSQKQRHMLDSLINKVYTGPLGEELVQTLYLRIWAMKETPESMKILQMREDIRDQYLRMKTERWLRTLI
RGKKTKLRDFQKRYEEVHPYLMMERVEQIIMEEAWKLAAHIVQE
>P06164 ~~~P/V/C~~~Protein C~~~
MLKTIKSWILGKRDQETSHLTSHRPSTSLNSYSAPTPKRTRQTAMKSTQGTRDSARQSTNLNPKQQKQAKKIVDQLTKID
SLGHHTNVPQRQRIEMLIRRLYREEIGEEAAQIVELRLWSLEESPEAAQILTMEPKSRKVLITMKLERWIRTLLRGKCDN
LKMFQSRYQEVMPFLQQNKMETVMMEEAWNLSVHLIQDIPA
>P06165 ~~~P/V/C~~~Protein C~~~
MLKTIKSWILGKRNQEINQLISPRPSTSLNSYSAPTPKKTYRKTTQSTQEPSNSAPPSVNQKSNQQKQVKKLVDQLTKID
SLGHHTNVQQKQKIEILIRKLYREDLGEEAAQIVELRLWSLEESLEASQILKMEPKTRRILISMKLERWIRTLLRGKCDN
LQMFQARYQEVMSYLQQNKVETVIMEEAWNLSVHLIQDQ
>P35948 ~~~P/V/C~~~Protein C~~~
MSTKAWNASRLSGPDPSTPWSLKKPLQHGSRPPKGKRLTVCPPTRPKQTIRISASHASQQLDQAKAACLAVTIRDLEEAT
AVMRSWEHSLVTPQCIAPRYSIIMFMITAVKRLRESKMLTLSWFNQALMMVSKSGEEMRNLRTAMWILANLIPREVLPLT
GDLLPSLQQQEPPMLKQ
>P04861 ~~~P/V/C~~~Protein C'~~~
MASATLTAWIKMPSFLKKILKLRGRRQEEESRSRMLSDSSMLSCRVNQLTSEGTEAGSTTPSTLPKDQALPIEPKVRAKE
KSQHRRPKIIDQVRRVESLGEQASQRQKHMLETLINKIYTGPLGEELVQTLYLRIWAMEETPESLKILQMREDIRDQVLK
MKTERWLRTLIRGEKTKLKDFQKRYEEVHPYLMKEKVEQVIMEEAWSLAAHIVQE
>P69738 ~~~P/V/C~~~Protein C'~~~
MASATLTAWIKMPSFLKKILKLRGRRQEDESRSRMLSDSSMLSCRVNQLTSEGTEAGSTTPSTLPKDQALLIEPKVRAKE
KSQHRRPKIIDQVRRVESLGEQASQRQKHMLETLINKIYTGPLGEELVQTLYLRIWTMEETPESLKILQMREDIRDQVLK
MKTERWLRTLIRGEKTKLKDFQKRYEEVHPYLMKEKVEQVIMEEAWSLAAHIVQE
>P04862 ~~~P/V/C~~~C' protein~~~
MASATLTAWIKMPSFLKKILKLRGRRQEDESRSRMLSDSSMLSCRVNQLTSEGTEAGSTTPSTLPKDQALLIEPKVRAKE
KSQHRRPKIIDQVRRVESLGEQASQRQKHMLETLINKIYTGPLGEELVQTLYLRIWAMEETPESLKILQMREDIRDQVLK
MKTERWLRTLIRGEKTKLKDFQKRYEEVHPYLMKEKVEQVIMEEAWSLAAHIVQE
>Q86609 ~~~P~~~Protein C'~~~
MIIWILPCRMPMNLRKDERINISKTSSSKIKEINQLRHIIRKKNRQIQILIIMLNILRCCHRMKE
>P32817 3.1.3.-~~~~~~mRNA-decapping protein D10~~~
MGEYYKNKLLLRPSVYSDNIQKIKLVAYEYGKLHAKYPLSVIGIMKTIDDKFVLCHRYNSFLFSEIAFTKDKRRKIRLFK
KYSKYMSNIERDILSYKLSLPNNYNTNHIDIIFPGGKIKDLESITNCLVREIKEELNIDSSYLAICKNCFVYGSIYDRLI
DKDFEVIALYVETDLTSRQILNRFIPNREIKGISFIDARDINKDYLYTNVIKYIINAVRTSASNS
>P32097 3.1.3.-~~~~~~mRNA-decapping protein D10~~~
MNERLQYSHLMNYITKHNRKLSKTYTWNDDSQRVSATGFSNQRLRWSNKTSICLILSTLDNKFIACSRKHSFLYSEIVRC
RSVFRKKRLFLTYTRFLKKKERPFLSSKLNVPLDDPGTEHNDIIFPGGLPKNEEDPIMCLSREIKEEINIDSKDIYIDSR
FFVHLFIEDLLSNRVYETILFLGNTTLTSNEILNNFLANREIKSLVFLDALEKGLMCDVLRYVLAISQLKCFGSTGDKTE
LLYDKVTESEKKMAPRYGFD
>O48499 ~~~~~~Protein D14~~~
MAVDSREKGKRGEYQVRDILRERTGLEWERVPGSGAFGQSHGLKGDIYLPPQSGHISKYCFEVKWYKDDNISSNLFNVGE
STLEKWWQQCSREGEQMNSKPALIFKKDRGQWLIALDSSDPMVDNLMSRTHMVLNKKDMEIVIGLFEPWLHHASVEDLIK
>P25951 ~~~D4L~~~Core protein D4~~~
MELDIKKLTDLLQNNKTVCPCDIKKIYDERFIVLEKGRCMIRNIHVYSSAARFDNKTMFGVIKYLYKHNEAIIDMLFPSK
TVYNSIREIAPDYTVSISTDDTEPSNPVTVLLNIFNSFRFGKKDAPVSYYYLPFGKDVHDVIS
>Q8V2R6 ~~~~~~27 kDa core protein~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DTGRQVRWCSTSNHISEDIPEDIHTDKFVIYDIYTFDSFKNKRLVFVQVPTSLGDDSYLTNPLLSPYYRNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFAFAWFNGVLENEKVLDTYKKVSDLI
>Q775T6 ~~~~~~27 kDa core protein~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DTGRQVRWCSTSNHISEDIPEDIHTDKFVIYDIYTFDSFKNKRLVFVQVPTSLGDDSYLTNPLLSPYYRNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFAFAWFNGVLENEKVLDTYKKVSDLI
>Q6VZQ4 ~~~~~~27 kDa core protein~~~
MDIVTDKNIGSNFLADSNNRIYILIGDTDNVIDKYLVSILGKIEFYYVYEITVEDSKLINTFVTSNLLCPIKNKFNIKIY
HDYKKVIGSCILNVDGKFTRYKDPSKLHVYVFCYRYNNCLNTCTMVKCHELLYPEKEIIVDGYKINDMSFFYTNPEIIKQ
HTDIKDYETLYKNIFLRRELNRVILGKPSDLIETLKEIVTINSEDIWKVIVSNDIFDSRDVIKLINFDYDREDFLSFVRA
WYSNQLNNCKEDNNKIEKVYEIVRNSI
>P0DTA9 ~~~~~~27 kDa core protein~~~
MEIHRLNSYTSVDYLCNSSNNVYILLGDTDEFINKRIILLMNNIELYYVYEISVNDEDELYHSFITSNVVCPIKQRINLM
LYKEYKKVIGSCVINNEGNIKMYSQPDKLHVYVLCYRCNGDIKTITMIKCHQLLKPEKEIVIDGYQVNDSSFFYTSPNLI
KQINMDKSDLFYKNILLRKEINCLIRKQESSNLYCILNKHIVSLSDTDIWKVIISDELFDSSDIEKLVKFDYDRDKFHAF
VRAWYSGQLSNCKEENETIKTVYEMIEKRI
>P0DTB0 ~~~~~~27 kDa core protein~~~
MEIHRLNSYTSVDYLCNSSNNVYILLGDTDEFINKRIILLMNNIELYYVYEISVNDEDELYHSFITSNVVCPIKQRINLM
LYKEYKKVIGSCVINNEGNIKMYSQPDKLHVYVLCYRCNGDIKTITMIKCHQLLKPEKEIVIDGYQVNDSSFFYTSPNLI
KQINMDKSDLFYKNILLRKEINCLIRKQESSNLYCILNKHIVSLSDTDIWKVIISDELFDSSDIEKLVKFDYDRDKFHAF
VRAWYSGQLSNCKEENETIKTVYEMIEKRI
>Q6RZJ4 ~~~~~~27 kDa core protein~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSEYIVTPDKYGSLCVELNGSSFQHGGRYIEVEEFI
DAGRQVRWCSTSNHISEDIPEDIHTDKFVIYDIYTFDAFKNKRLVFVQVPPSLGDDSYLTNPLLSPYYRNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFVFAWFNGVSENEKVLDTYKKVSNLI
>P25952 ~~~D5R~~~Core protein D3 homolog~~~
MNTTILIHDDDIQVNDLKENKTFLLLSEHNERIIDKLCSCLLPIIFYCDYITSPDDEGTLETRILSSSYMIRDKYVNVEE
FITAGLPLSWCVNLPEKAHSTASDSLIIRDVLYYKKDWIRILLIQCPSAIYTDEELLIDPFKLPRHPPELFKNVTLRSYV
NGLLFYPTSSPLYALLSHVVTTFIIKHITCVTKHDEKLITTCYDKGRFNAFVYAWYNSQISDDVVENEKVKNLFALVKAR
I
>Q8V3L7 ~~~~~~27 kDa core protein~~~
MDIVVIRDDIYPIFNNEDKIVLLLGNHQEFISNFISKINIHALFYCKYSIIPDEIGTLNVSIIESSHKIRGRYINVEEFI
SLLYPIQLCSKYTYKNDIDHDTMFIHDIIFFNNTWVRILFIEFLGIIDKQYETCIINPYLVKDNYKIFKNILLASIINNI
IFDKNSTLIELINKLYTRYHIDKYIMTCIVKYNDYNNIKLIYHCYNRNKFNAFIYAWFRSQITCDSIEENEKVERMFNNI
SKRI
>P0DSQ9 ~~~D3R~~~Core protein D3~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSKYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DDGRQVRWCSTSNHISEDIPEDIHTNKFIIYDIYTFDSFKNKRLVFVQVPTSLGDDSYLTNPLLSPYYCNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFAFAWFNGVLENEKVLDTYKKVSDLI
>P0DSR0 ~~~D3R~~~Core protein D3~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSKYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DDGRQVRWCSTSNHISEDIPEDIHTNKFIIYDIYTFDSFKNKRLVFVQVPTSLGDDSYLTNPLLSPYYCNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFAFAWFNGVLENEKVLDTYKKVSDLI
>Q6QGG9 ~~~D5~~~Putative transcription factor D5~~~
MSKLNWNVEGVTESLKAKATALGVDVISQEQVAAIAAELAAETGKDVTARSVGSKLRKEGFEVQKANEVQKSPWTPEQEA
ELVDFLNAHAGQYTYAEIAAAVAGGQFGAKQVQGKILSLEMTASVKPTEKAAAVRSFTPDEETDFVNQVVAGATIEAIAA
HFGRNIKQIRGKALSLLREGRIAAMPVQETSSAKTREDLLEGLDLVNMTVAEIAEKTGKSERGVKSMLSRRGLVAKDYDG
AAKRAKLDAKAAAAE
>P20215 ~~~~~~Protein D-63~~~
MSKEVLEKELFEMLDEDVRELLSLIHEIKIDRITGNMDKQKLGKAYFQVQKIEAELYQLIKVS
>P21970 3.6.-.-~~~~~~Putative Nudix hydrolase FPV054~~~
MFDISREQQNMLEKNKDCVITFETNRERITIENTNIKDILSDRRIHIFALCITSDNIPIIGIRRTSFMYQSVISKRRSFS
EILAVDINHLKYMYNNEIKEICIRSIVPFTYSGFNNFEELVLLGGRVKNKESIYQCLSRELSEESDGILTIKTFGNKILK
LTIEDKILRRTFYGYCIVCFIDQLYSEIIKPLYNIEIKELGSLFDRSSNEKYEYLHFIYNTLLTYKYGGVL
>P32098 3.1.3.-~~~D9R~~~mRNA-decapping protein D9~~~
MTTFETPRETVFIESVDSIPQSKKTHVFAICVTVDNKPIVAARRSSFVFQEITMNMNPPIVVTISKHLTNYMYNNEIKEI
KRKLQKGSAPIYKTSFEELILLGGKLNKSETIDDCIRREIKEETDSKLTIKSIGTTCVKITITDKLFNRKYVNYCKLCYI
DELMEEVISFVIYNVEIRKLKSLLDCDNNDKFNYLRFIYNTLLYSK
>P0DOU6 3.1.3.-~~~D9R~~~mRNA-decapping protein D9~~~
MGITMDEEVIFETPRELISIKRIKDIPRSKDTHVFAACITSDGYPLIGARRTSFAFQAILSQQNSDSIFRVSTKLLRFMY
YNELREIFRRLRKGSINNIDPHFEELILLGGKLDKKESIKDCLRRELKEESDERITVKEFGNVILKLTTQDKLFNKVYIG
YCMSCFINQSLEDLSHTSIYNVEIRKIKSLNDCINDDKYEYLSYIYNMLVNSK
>P51715 2.1.1.72~~~~~~DNA N-6-adenine-methyltransferase~~~
MMTKSNTKKSDKDLWATPWWVFHYAEQYFNIKFDLDTCAMEHNTKVKNFITPEQNTLTADWQGRYCWMNPPYSNPLPFVL
RAISQSVLHNKTVVMLLNVDGSTKWFDMCVRNAKEIVYITNSRIPFINNETGEETDQNNKPQMLVLFEPKAPYGSLKSSY
VSLHEMKEKGMLQ
>Q71TL0 2.1.1.72~~~dmt~~~DNA N-6-adenine-methyltransferase~~~
MKELCYGSVCSGIEAASIAWEPLGMRPAWFAEIEPFPSAVLAHRWPHVANLGDMTKLAKKVLAGEIESPDVLVWGTPCQA
FSIAGLRGGLDDERGALTLKYVELANAIDDKRSESFLKPTVIVWENVPGVLSSADNAFGCFLAGLAGEDAPFEPGDRPES
GKSNAFWRWDGKTGCHAPKWPQCGCIYGPQRKVAWRILDAQYFGVAQRRRRVFVVASARTDLDPATVLFEFEGVRRNIAP
RRKKKEIASAIIANGAAISGESLNPCLHADMPPSMKSTKAVNAFRMAAFGEYIDDETASTVKARDFKDATDLAVFSSTGA
GFWSEGHGTLRAREQESHEHLVTLAFPERMSGTQHAATKNTSPSLMAKNPTAVCYEVRNAEVAVRRLTPVECERLQGFPD
GHTLIPTEKRKKVNSDELAYLRNHYPDLSEEEAAMLAADGPRYKAIGNSMAIPVMRWIGDRITKAVCRQKEGSETKERKV
KPAAEFERSIFKWAGGKFGVLEQIFRYLPEGKRLIEPFVGGGAVFTNAGYQENLLNDVNADLINFYKTLQREAHSLITLA
HRFFQDYNTQEGYLAVRNAFNKQVYDDLHRAAAFLFLNRHCFNGLTRYNQAGEFNVGYGKYKTPYFPLQEMEAFLGAEGR
SEFVCGDFAAVIEAAGEGDVIFCDPPYEPLPNTEGFTNYSGHDFKFEEQKRLVSLLTDAHRRGAKVLITNSGAPNIRELY
HDSGFRVEPLFARRSVSCKGDTRGVAHDVLGILL
>Q38156 2.1.1.72~~~~~~DNA N-6-adenine-methyltransferase~~~
MKDFNDIETIDFAETGCSFTREAIASGGYYQALKTPTCKEISGRRYKGTNTPDAVRDLWSTPREVIAYLEGRYGKYDLDA
AASEENKVCEKFYSQETNCLKRWWGKNKHVWLNPPYSRPDIFVKKAIEQMEHNNQIDMLLPADNSTAWFTEARQNAAEII
WIEADLTEDIDGNEYARSGRLAFISGETGKAVDGNNKGSVIFIMRELKEGEVQQTHYIPITSICPSVKNKRAKVRKVD
>O21970 ~~~darA~~~Defense against restriction protein A~~~
MEQFNINKGMTIKPGLDVLPPPVTDDEYRALMAGEDRYLMTESNTLEEIEATFFYDTPIHWCATDLLEAISSTRLQLHRT
MQAFVRALNQKLNGTGISAGSDKTGDVAQSGARAIGGAEIGRARNVNGLPVLPAIIPLSDGQTISILFHSPTAENRITNS
DTLVAFQFLLNKKDVTHTVAPMSGRDMTLAQVTMKLANLAEKNSAKFQRAQKKKKALVDEITQLQADSDQKEDAMSDLAD
QVAAVEGQKADLEQKINAVASEADSLYEENERLQGEIDRLNRTGGRDTIAPAGMTGGHSRALTDRLASIKNRMHMDGEAT
LSNGASMKQFIGDGEGYIQLTDPDGSVYMIKAKSIQGVDMADAIGKLFKAYKAGNVSEYLVQPEEHKPENVEPESAEDTG
SSSPEPEVSVGAYRYALQMRPAAPGAIPEGNKAILPRPDEGDPYYEYARYGIATYDTPLSDQQMSEYDLKLLPREDSFDF
LAKTLTNGPFGKYAQKALELATNSPDEFRVMLKTQFQKTFPNIAFPGGAGTEKMVQSMINALQAEVGEITQPEPAPAQPD
ETVSEADAEANKAIEYLNNVMDMQSTDMAEIRNARGNVREAIAALQTAGRFEENEELVNGAARHLADLLVAIQKAGVAA
>A0A2L0V156 ~~~datZ~~~dATP triphosphohydrolase~~~
MMTMTPRDLMRAQYVTRWQIVPTTRSQSVAQHSWAVAMLAMNLWCRRTGGQPGEAATDVELGKIAVMALWHDAPEVFTGD
INTPTKIFLNASNALDELENTAGDGYLESMDGGPIRTCVKIADFLEAMYWLMEHGDGHYANNQLHGLNERFHQYLNEHAP
LWRDSAVALWKELSDVNAETTTFQRVNYLKANDA
>A0A7U3TCA2 ~~~datZ~~~dATP triphosphohydrolase~~~
MTLQITETYERLRASHISRWGIVQTTYPQNIAEHMWRVWLLCRDWGAAAGMPQHTVRQACEFALVHDLAEIRTGDAPTPH
KTPELKELLAGIEAQIVPEVAELEATMAPEARELWKFCDTAEAVLFLKVNGLGAHAYDVQHLLMEQMKRRLMDSVLDVEV
QDELMFQFERTIKKT
>A0A2H5BHG9 ~~~datZ~~~dATP triphosphohydrolase~~~
MDKLNIRDILRAQDVTRWQIVRTKKQSVAEHTFAVQAVLMRLVPLLISTYTAPMKVGFEERLLCECIMGAFWHDIPEVIT
GDIASPVKRLIRDGGDITPLDDLEKKVDPAFIKCYTAAKPLTLAIIKCADLMEMVYHLNEYGDQRANSHSWRVQHGINNA
FHEHIHNCSENFPAFKWDVAHGLLIEMLDPSQETDIDSIVNGI
>P08773 2.1.2.8~~~~~~Deoxycytidylate 5-hydroxymethyltransferase~~~
MISDSMTVEEIRLHLGLALKEKDFVVDKTGVKTIEIIGASFVADEPFIFGALNDEYIQRELEWYKSKSLFVKDIPGETPK
IWQQVASSKGEINSNYGWAIWSEDNYAQYDMCLAELGQNPDSRRGIMIYTRPSMQFDYNKDGMSDFMCTNTVQYLIRDKK
INAVVNMRSNDVVFGFRNDYAWQKYVLDKLVSDLNAGDSTRQYKAGSIIWNVGSLHVYSRHFYLVDHWWKTGETHISKKD
YVGKYA
>P00814 3.5.4.12~~~CD~~~Deoxycytidylate deaminase~~~
MKASTVLQIAYLVSQESKCCSWKVGAVIEKNGRIISTGYNGSPAGGVNCDNYAAIEGWLLNKPKHTIIQGHKPECVSFGT
SDRFVLAKEHRSAHSEWSSKNEIHAELNAILFAARNGSSIEGATMYVTLSPCPDCAKAIAQSGIKKLVYCETYDKNKPGW
DDILRNAGIEVFNVPKLNWENISEFCGE
>P16006 3.5.4.12~~~CD~~~Deoxycytidylate deaminase~~~
MKASTVLQIAYLVSQESKCCSWKVGAVIEKNGRIISTGYNGSPAGGVNCCDYAAEQGWLLNKPKHAIIQGHKPECVSFGS
TDRFVLAKEHRSAHSEWSSKNEIHAELNAILFAARNGSSIEGATMYVTLSPCPDCAKAIAQSGIKKLVYCETYDKNKPGW
DDILRNAGIEVFNVPKKNLNKLNWENINEFCGE
>P32270 3.6.4.12~~~dda~~~ATP-dependent DNA helicase dda~~~
MTFDDLTEGQKNAFNIVMKAIKEKKHHVTINGPAGTGKTTLTKFIIEALISTGGTGIILAAPTHAAKKILSKLSGKEAST
IHSILKINPVTYEENVLFEQKEVPDLAKCRVLICDEVSMYDRKLFKILLSTIPPWCTIIGIGDNKQIRPVEPGENTAYIS
PFFTHKDFYQCELTEVKRSNAPIIDVATDVRNGKWNYDKVVDGHGVRGFTGDTALRDFMVNYFSIVKSLDDLFENRVMAF
TNKSVDKLNSIIRKKIFETDKDFIVGEIIVMQEPLFKTYKIDGKPVSEIIFNNGQLVRIIEAEYTSTFVKARGVPGEYLI
RHWDLTVETYGDDEYYREKIKIISSDEELYKFNLFLAKTAETYKNWNKGGKAPWSDFWDAKSQFSKVKALPASTFHKAQG
MSVDRAFIYTPCIHYADVELAQQLLYVGVTRGRYDVFYV
>A7XXC1 ~~~~~~Decoration protein~~~
MDKVKLFQTIGRVEYWERVPRLHAYGVFALPFPMDPDVNWAQWFTGPHPRAFLVSIHKYGPKAGHVYPTNLTDEDALLNV
IGMVLDGHDYENDPNVTVTLKAAVPIEYVQQDPQAPALQPHQAVLDAAEVLKLKVIKGHYFFDYTR
>P36275 ~~~shp~~~Head decoration protein~~~
MVTKTITEQRAEVRIFAGNDPAHTATGSSGISSPTPALTPLMLDEATGKLVVWDGQKAGSAVGILVLPLEGTETALTYYK
SGTFATEAIHWPESVDEHKKANAFAGSALSHAALP
>Q38581 ~~~~~~Decoration protein gp12~~~
MSKRIPRFLRNIQLPAGPQGPKGDPGPKGDTGAKGADGFGTEAQYNDIISRLEALEQSSGGGTT
>Q6QGD6 ~~~N5~~~Decoration protein~~~
MIDYSGLRTIFGEKLPESHIFFATVAAHKYVPSYAFLRRELGLSSAHTNRKVWKKFVEAYGKAIPPAPPAPPLTLSKDLT
ASMSVEEGAALTLSVTATGGTGPYTYAWTKDGSPIPDASGATYTKPTAAAEDAGSYKVTVTDSKQVSKDSTTCAVTVNPT
VPGG
>P03712 ~~~D~~~Capsid decoration protein~~~
MTSKETFTHYQPQGNSDPAHTATAPGGLSAKAPAMTPLMLDTSSRKLVAWDGTTDGAAVGILAVAADQTSTTLTFYKSGT
FRYEDVLWPEAASDETKKRTAFAGTAISIV
>A0A5P1KKQ4 ~~~~~~Depolymerase, capsule K47-specific~~~
MDQDIKTVIQYPVGATEFDIPFDYLSRKFVRVSLVSDDNRRLLSNITEYRYVSKTRVKLLVETTGFDRVEIRRFTSASER
IVDFSDGSVLRASDLNVSQIQSAHIAEEARDAALMAMPQDDAGNLDARNRRIVRLAPGIAGTDAVNKDQLDTTLGEAGGI
LSDMKDLEGEIHDYIEKFADDTALVRGVAWVYNLGSADGGETVITINKSTRTYAVPYIEVNGSRQEVGYHYSFDLETQQI
TLATPLKAGDFVMVMTTESQLPVETLLASSVGAASIGTATGETVEERLTRLYGHFVHPETYGAVGDGITDDRVALQRSLD
VAYENALNGTGPSTVRWSGDYMVSLNPNSLGVSGELAAGRSALCIRPGVSIEGKGTVRLDPSFTGSQSGAVITNWAGPAD
DCSIKDIRIYGGKDVATGTGITGILILDSQRVVISDVKVLNSTAGGIYLRKGATEGLYGCSFSKVSGCTVDNAGYIGIQM
ERPYDNTVIGNTINRCEDNGIDVFGNVNDATVTGIAQSTLITGNNIRDVLNGVFIESCGNTNITGNYIADFRSSGVIYNR
INSAANDNSLTSNVLIGASGASAGVSFKNSVGYCTVASNRIQNSDYGIRCVGGGITGLNILPNTMKNIAKTLLFVEARNN
GLVKSRMSTQFYEGAQVGGIPSNTSPRGVPHRFPSRLSYIVDIQPFWATEQGTREDNFERAKGTLASITGWGSKCALYDT
IVAGDTVVSLNSSSVAVGEYLEINAEVYKVTSVSATYAVVRKWTGSDYTAGDYAAVIISNPSYIIRRVQWGEQ
>K9L8K6 ~~~~~~Depolymerase, capsule K63-specific~~~
MALYREGKAAMAADGTVTGTGTKWQSSLSLIRPGATIMFLSSPIQMAVVNKVVSDTEIKAITTNGAVVASSDYAILLSDS
LTVDGLAQDVAETLRYYQSQETVIADAVEFFKNFDFDSLQDLANQINADSESAQSSAAAAAASENAAKTSENNAKSSEVA
AENARDQVQQIINDAGDASTLVVLANPDGFRHIGRCKDIATLRTIEPVESRQVIEVLSYYNGLAQGGGTFWYDPNDSVTE
DNGGSCIVTNGGKRWKRIIDGAVDVLSFGAKPDDISFDSAPHIQAALDNHDAVSLYGRSYYIGSPIYMPSRTVFDGMGGK
LTSIAPSTAGFMAGSIFAPGNYHPDFWEEVPKVAATTTLGSANITLADPNIVNVGDIIRLSSTTGVLSAGFFVSEYLQMA
RVLSKTGNVITIDGPVESQLTLVAANANQPGYLARFNKPLFCCVDSIIQNIEIDTWDYWTADSATYNVKFSNIWGSAKAV
AYGNTFCRSLFEDIRIVFSGRVSELAFGSHDTNLVRITAIASPKGLSASVVFGWAESGRRCTIDTFSIMLNANANPSTVI
RVSGHRDSLIKNGSIYVHNNTNNILSVENYGTTADGVRPDCDNITFENVSIFVTGSSAVVCDVYKSADNSVIKNVAFKNI
KYFGPTPSVALYRARGTLANFVKGVQANISSDTGGAIVLSNSENNVLTFTGPVSVTSLVSAAAKNTLSIRNYARSSAKAN
NFTQESTLNVTDTTANAVSKEFTYPAGSLRINDKIKLSLGGSTAGTVGKKTVQVGFIGSDGAFKYVELAALATDQVYWTM
EVEISFLRTTNSQTNELETSAIITSFLSKGAATGAALSGSRALAVVSDLALSNFVVQVRAWKENAADGLSLSRMNLQLED
LTA
>A0A068Q5Q5 4.-.-.-~~~~~~Depolymerase, capsule K1-specific~~~
MALIRLVAPERVFSDLASMVAYPNFQVQDKITLLGSAGGDFTFTTTASVVDNGTVFAVPGGYLLRKFVGPAYSSWFSNWT
GIVTFMSAPNRHLVVDTVLQATSVLNIKSNSTLEFTDTGRILPDAAVARQVLNITGSAPSVFVPLAADAAAGSKVITVAA
GALSAVKGTYLYLRSNKLCDGGPNTYGVKISQIRKVVGVSTSGGVTSIRLDKALHYNYYLSDAAEVGIPTMVENVTLVSP
YINEFGYDDLNRFFTSGISANFAADLHIQDGVIIGNKRPGASDIEGRSAIKFNNCVDSTVKGTCFYNIGWYGVEVLGCSE
DTEVHDIHAMDVRHAISLNWQSTADGDKWGEPIEFLGVNCEAYSTTQAGFDTHDIGKRVKFVRCVSYDSADDGFQARTNG
VEYLNCRAYRAAMDGFASNTGVAFPIYRECLAYDNVRSGFNCSYGGGYVYDCEAHGSQNGVRINGGRVKGGRYTRNSSSH
IFVTKDVAETAQTSLEIDGVSMRYDGTGRAVYFHGTVGIDPTLVSMSNNDMTGHGLFWALLSGYTVQPTPPRMSRNLLDD
TGIRGVATLVAGEATVNARVRGNFGSVANSFKWVSEVKLTRLTFPSSAGALTVTSVAQNQDVPTPNPDLNSFVIRSSNAA
DVSQVAWEVYL
>P0DTN8 ~~~~~~Depolymerase, capsule K2-specific~~~
MALYREGKAAMAADGTVTGTGTKWQSSLSLIRPGATIMFLSSPIQMAVVNKVVSDTEIKAITTSGAVVASTDYAILLSDS
LTVDGLAQDVAETLRHYQSQETVIADAVEFFKSFDFDSLQNLANQIKADSESAESSAAAAAASESKAKTSEDNAKSSENA
AKNSEVAAETTRDQIQQIIDNAGDQSTLVVLAQPDGFDSIGRVSSFAALRNLKPKKSGQHVLLTSYYDGWAAENKMPTGG
GEFISSIGTATDDGGYIAAGPGYYWTRVVNNNSFTAEDFGCKTTATPPPNFNVLPAELFDNTAMMQAAFNLAISKSFKLN
LSTGTYYFESSDTLRITGPIHIEGRPGTVFYHNPSNKANPKTDAFMNISGCSAGRISSINCFSNSYLGKGINFDRSVGDN
RKLVLEHVYVDTFRWGFYVGEPECINQIEFHSCRAQSNYFQGIFIESFKEGQEYGHSAPVHFFNTICNGNGPTSFALGAT
YKTTKNEYIKVMDSVNDVGCQAYFQGLSNVQYIGGQLSGHGSPRNTSLATITQCNSFIIYGTDLEDINGFTTDGTAITAD
NIDAIESNYLKDISGAAIVVSSCPGFKIDSPHIFKIKTLSTIKLMNNTYNYEIGGFTPDEALKYNVWDANGLATNRISGV
IHPRLVNSRLGINSVAFDNMSNKLDVSSLIHNETSQIVGLTPSTGSNVPHTRKMWSNGAMYSSTDLNNGFRLNYLSNHNE
PLTPMHLYNEFSVSEFGGSVTESNALDEIKYIFIQTTYANSGDGRFIIQALDASGSVLSSNWYSPQSFNSTFPISGFVRF
DVPTGAKKIRYGFVNSANYTGSLRSHFMSGFAYNKRFFLKIYAVYNDLGRYGQFEPPYSVAIDRFRVGDNTTQMPSIPAS
SATDVAGVNEVINSLLASLKANGFMSS
>A0A7U3TBV3 3.6.1.9~~~~~~dATP/dGTP diphosphohydrolase~~~
MPATVAELQAEIAAWIHPLNPDRRPGGTIAKLLEEIGELIASDRAHDPLEVADVLILALDLATLLGVDVTEAIRAKLAIN
RARSWARADNGAMRHIPGSDTPSFP
>A0A2H5BHG5 3.6.1.9~~~~~~dATP/dGTP diphosphohydrolase~~~
MSGNKFDQEKVDLHVLDPFFIEGTARVAQFGEQKYGRSNWMQGLTQTRIINAIKRHIAQIEKGEDIDEESGFHHAYHAAW
GCQVLAYQHRNGQTHLDDRRWSESVRDADTKIGTCEGISGHTVPCMYEGPFGRLHPVDGEKDGLVMSFVQSEADEGTAQG
DPIQQLQQMISEWADQVYPDRTVENALTKMMLHEIPELLHGKAMDPAEFADVAILLFDVAHLQGIDIAQAMREKMEINQA
RDWKIDPATGLMSHVKPKGMMETIRDAVQGIGDAARIMAAPSRLLNGNNMELAEYTAETLPKPPEKPWEITIPNWALKTE
EGTHIKLRAGSGVFGRVKYQSQCILMHKNMRKRSHLETYYCATIKDTVTEDEYNVPWSEIEPWTN
>P32092 3.1.3.-~~~~~~mRNA-decapping protein g5R~~~
MDTAMQLKTSIGLITCRMNTQNNQIETILVQKRYSLAFSEFIHCHYSINANQGHLIKMFNNMTINERLLVKTLDFDRMWY
HIWIETPVYELYHKKYQKFRKNWLLPDNGKKLISLINQAKGSGTLLWEIPKGKPKEDESDLTCAIREFEEETGITREYYQ
ILPEFKKSMSYFDGKTEYKHIYFLAMLCKSLEEPNMNLSLQYENRIAEISKISWQNMEAVRFISKRQSFNLEPMIGPAFN
FIKNYLRYKH
>P04331 ~~~9~~~Tail knob protein gp9~~~
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSNQYNWFNRKSRVYEMSKVTFMGFRENKPYVSVSLPIDKLYSASYIMF
QNADYGNKWFYAFVTELEFKNSAVTYVHFEIDVLQTWMFDMKFQESFIVREHVKLWNDDGTPTINTIDEGLSYGSEYDIV
SVENHKPYDDMMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQPLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKELKLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEALEIDTGDKWGGFT
KDQESKLMMYPYCVTEITDFKGNHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQDYNADSALSGGNRLTASLDSSLIN
NNPNDIAILNDYLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISAGASAAGGSALGMASSVTGMTSTAGNAVLQMQAMQ
AKQADIANIPPQLTKMGGNTAFDYGNGYRGVYVIKKQLKAEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNYVQTKDCF
ISGDINNNDLQEIRTIFDNGITLWHTDNIGNYSVENELR
>D3WAD3 ~~~~~~Distal tail protein~~~
MVRQYKIHTNLDGTDDKVWDVTNGKVRFYQPSNLGLQSTNNIWQSNGIGVMGTRSITQPQIEFKLETFGESLEENYQLMK
DFVNDILSKKFVTLEYQTEIFQVYADLALADVTKTEGYGKNGTFSEKITFDIITKWYTYENLTFDKIQNGKVIAGMSKIY
GGTAPGNYKYIKGTSYTYYGESDIDRLSRWDIKEEIFSFMGILYPKLPKTPAGVRFLDDIGNEYTAIVFKTEQVQDYILI
NTDVNDETYQGWKGTTALNLFPVMDFERYRTRIIEKGQMELINLSKAEFKIKRKADFV
>O48459 ~~~~~~Distal tail protein~~~
MNIYDILDKVFTMMYDGQDLTDYFLVQEVRGRSVYSIEMGKRTIAGVDGGVITTESLPARELEVDAIVFGDGTETDLRRR
IEYLNFLLHRDTDVPITFSDEPSRTYYGRYEFATEGDEKGGFHKVTLNFYCQDPLKYGPEVTTDVTTASTPVKNTGLAVT
NPTIRCVFSTSATEYEMQLLDGSTVVKFLKVKYGFNTGDTLVIDCHERSVTLNGQDIMPALLIQSDWIQLKPQVNTYLKA
TQPSTIVFTEKFL
>Q6QGE8 ~~~~~~Distal tail protein pb9~~~
MRLPDPYTNPEYPGLGFESVNLVDNDPMIRDELPNGKVKEVKISAQYWGINISYPELFPDEYAFLDSRLLEYKRTGDYLD
VLLPQYEAFRVRGDTKSVTIPAGQKGSQIILNTNGTLTGQPKAGDLFKLSTHPKVYKITNFSSSGNVWNISLYPDLFITT
TGSEKPVFNGILFRTKLMNGDSFGSTLNNNGTYSGISLSLRESL
>P04392 2.1.1.72~~~DAM~~~DNA adenine methylase~~~
MLGAIAYTGNKQSLLPELKSHFPKYNRFVDLFCGGLSVSLNVNGPVLANDIQEPIIEMYKRLINVSWDDVLKVIKQYKLS
KTSKEEFLKLREDYNKTRDPLLLYVLHFHGFSNMIRINDKGNFTTPFGKRTINKNSEKQYNHFKQNCDKIIFSSLHFKDV
KILDGDFVYVDPPYLITVADYNKFWSEDEEKDLLNLLDSLNDRGIKFGQSNVLEHHGKENTLLKEWSKKYNVKHLNKKYV
FNIYHSKEKNGTDEVYIFN
>P39232 ~~~dmd~~~Antitoxin Dmd~~~
MELVKVVFMGWFKNESMFTKEITMMKDDVQWATTQYAEVNKALVKAFIDDKKVCEVDCRG
>Q38167 3.1.3.89~~~dmp~~~5'-deoxynucleotidase~~~
MNQVKTNITRNFPHISRVMIWDLDGTIINSFHRVAPCFDSEGNLDLNKYKNEACKHDLIMQDTLLPLVTYMRQCMNDANT
LNIICTARLMSKSDYYYLRKQGLRGRGDSNIRVFSRDTLHKYFEADKVSEIYHSKDAVYKSYYFGLFKQLYPNADFTMID
DHKGVLSAAASYGFKTLDAQAVNDILSIGVTLIGETFIDESLEDDNDYQFLADRLQLCWEGMTEEERAEYSCSPQQYIEK
LKVA
>P03264 ~~~DBP~~~DNA-binding protein~~~
MASREEEQRETTPERGRGAARRPPTMEDVSSPSPSPPPPRAPPKKRLRRRLESEDEEDSSQDALVPRTPSPRPSTSTADL
AIASKKKKKRPSPKPERPPSPEVIVDSEEEREDVALQMVGFSNPPVLIKHGKGGKRTVRRLNEDDPVARGMRTQEEKEES
SEAESESTVINPLSLPIVSAWEKGMEAARALMDKYHVDNDLKANFKLLPDQVEALAAVCKTWLNEEHRGLQLTFTSNKTF
VTMMGRFLQAYLQSFAEVTYKHHEPTGCALWLHRCAEIEGELKCLHGSIMINKEHVIEMDVTSENGQRALKEQSSKAKIV
KNRWGRNVVQISNTDARCCVHDAACPANQFSGKSCGMFFSEGAKAQVAFKQIKAFMQALYPNAQTGHGHLLMPLRCECNS
KPGHAPFLGRQLPKLTPFALSNAEDLDADLISDKSVLASVHHPALIVFQCCNPVYRNSRAQGGGPNCDFKISAPDLLNAL
VMVRSLWSENFTELPRMVVPEFKWSTKHQYRNVSLPVAHSDARQNPFDF
>P03265 ~~~DBP~~~DNA-binding protein~~~
MASREEEQRETTPERGRGAARRPPTMEDVSSPSPSPPPPRAPPKKRMRRRIESEDEEDSSQDALVPRTPSPRPSTSAADL
AIAPKKKKKRPSPKPERPPSPEVIVDSEEEREDVALQMVGFSNPPVLIKHGKGGKRTVRRLNEDDPVARGMRTQEEEEEP
SEAESEITVMNPLSVPIVSAWEKGMEAARALMDKYHVDNDLKANFKLLPDQVEALAAVCKTWLNEEHRGLQLTFTSKKTF
VTMMGRFLQAYLQSFAEVTYKHHEPTGCALWLHRCAEIEGELKCLHGSIMINKEHVIEMDVTSENGQRALKEQSSKAKIV
KNRWGRNVVQISNTDARCCVHDAACPANQFSGKSCGMFFSEGAKAQVAFKQIKAFMQALYPNAQTGHGHLLMPLRCECNS
KPGHAPFLGRQLPKLTPFALSNAEDLDADLISDKSVLASVHHPALIVFQCCNPVYRNSRAQGGGPNCDFKISAPDLLNAL
VMVRSLWSENFTELPRMVVPEFKWSTKHQYRNVSLPVAHSDARQNPFDF
>P03227 ~~~DBP~~~Major DNA-binding protein~~~
MQGAQTSEDNLGSQSQPGPCGYIYFYPLATYPLREVATLGTGYAGHRCLTVPLLCGITVEPGFSINVKALHRRPDPNCGL
LRATSYHRDIYVFHNAHMVPPIFEGPGLEALCGETREVFGYDAYSALPRESSKPGDFFPEGLDPSAYLGAVAITEAFKER
LYSGNLVAIPSLKQEVAVGQSASVRVPLYDKEVFPEGVPQLRQFYNSDLSRCMHEALYTGLAQALRVRRVGKLVELLEKQ
SLQDQAKVAKVAPLKEFPASTISHPDSGALMIVDSAACELAVSYAPAMLEASHETPASLNYDSWPLFADCEGPEARVAAL
HRYNASLAPHVSTQIFATNSVLYVSGVSKSTGQGKESLFNSFYMTHGLGTLQEGTWDPCRRPCFSGWGGPDVTGTNGPGN
YAVEHLVYAASFSPNLLARYAYYLQFCQGQKSSLTPVPETGSYVAGAAASPMCSLCEGRAPAVCLNTLFFRLRDRFPPVM
STQRRDPYVISGASGSYNETDFLGNFLNFIDKEDDGQRPDDEPRYTYWQLNQNLLERLSRLGIDAEGKLEKEPHGPRDFV
KMFKDVDAAVDAEVVQFMNSMAKNNITYKDLVKSCYHVMQYSCNPFAQPACPIFTQLFYRSLLTILQDISLPICMCYEND
NPGLGQSPPEWLKGHYQTLCTNFRSLAIDKGVLTAKEAKVVHGEPTCDLPDLDAALQGRVYGRRLPVRMSKVLMLCPRNI
KIKNRVVFTGENAALQNSFIKSTTRRENYIINGPYMKFLNTYHKTLFPDTKLSSLYLWHNFSRRRSVPVPSGASAEEYSD
LALFVDGGSRAHEESNVIDVVPGNLVTYAKQRLNNAILKACGQTQFYISLIQGLVPRTQSVPARDYPHVLGTRAVESAAA
YAEATSSLTATTVVCAATDCLSQVCKARPVVTLPVTINKYTGVNGNNQIFQAGNLGYFMGRGVDRNLLQAPGAGLRKQAG
GSSMRKKFVFATPTLGLTVKRRTQAATTYEIENIRAGLEAIISQKQEEDCVFDVVCNLVDAMGEACASLTRDDAEYLLGR
FSVLADSVLETLATIASSGIEWTAEAARDFLEGVWGGPGAAQDNFISVAEPVSTASQASAGLLLGGGGQGSGGRRKRRLA
TVLPGLEV
>P17147 ~~~DBP~~~Major DNA-binding protein~~~
MSHEELTALAPVGPAAFLYFSRLNAETQEILATLSLCDRSSSVVIAPLLAGLTVEADFGVSVRTPVLCYDGGVLTKVTSF
CPFALYFHHTQGIVAFTEDHGDVHRLCEDARQKYALEAYMPEADRVPTDLAALCAAVGCQASETTVHVVVGNGLKEFLFA
GQLIPCVEEATTVRLHGGEAVRVPLYPPTLFNSLQLDAEADEVSLDARSAFVEARGLYVPAVSETLFYYVYTSWCQSLRF
SEPRVLIEAALRQFVHDSQQSVKLAPHKRYLGYMSQRLSSLEKDHLMLSDAVVCELAFSFASVFFDSAYQPAESMLFSEW
PLVTNATDHRDLIRALTELKLHLSTHVAALVFSANSVLYQHRLVYLQSSARHPSAGGTASQETLLKAIQFTNGLSAACED
VYNDARKVLKFQGAPLKDERYGPQHLALVCGTCPQLVSGFVWYLNRVSVYNTGLSGSSTLTNHLVGCAAGLCEACGGTCC
HTCYQTAFVRVRTRLPVVPKQPKKEPCVITVQSRFLNDVDILGSFGRRYNVDAKDGGLDGKGDDGVPGGGAGGGGGRDVS
GGPSDGLGGGRGGGGGGDSGGMMGRGGRMLGASVDRTYRLNRILDYCRKMRLIDPVTGEDTFSAHGKSDFVAVFSALNKF
VDDEALGFVSEVRLKSSRDEVAGATQAFNLDLNPYAVAFQPLLAYAYFRSVFYVIQNVALITATSYIVDNPLTTNLVSKW
MTQHFQSIHGAFSTTSSRKGFLFTKQIKSSKNSDHDRLLDFRLYAQGTYAVVPMEIKLSRLSVPTLIMVRVKNRPIYRAG
KGNAGSVFFRRDHVPRRNPAKGCLGFLLYRHHERLFPECGLPCLQFWQKVCSNALPKNVPIGDMGEFNAFVKFLVAVTAD
YQEHDLLDVAPDCVLSYVESRFHNKFLCYYGFKDYIGSLHGLTTRLTTQNHAQFPHVLGASPRFSSPAEFALHVKGLKTA
GVPAPMAATVARESLVRSVFEHRSLVTVPVSVEKYAGINNSKEIYQFGQIGYFSGNGVERSLNVSSMSGQDYRFMRQRYL
LATRLADVLIKRSRRENVLFDADLIKNRVMLALDAENLDCDPEVMAVYEILSVREEIPASDDVLFFVDGCEALAASLMDK
FAALQEQGVEDFSLENLRRVLDADAQRLTDAAGGEVHDLSALFAPSGVGAASGVGGGGLLLGESVAGNSICFGVPGETGG
GCFLVNAGEDEAGGVGGSSGGGGGSGLLPAKRSRL
>P04296 ~~~DBP~~~Major DNA-binding protein~~~
METKPKTATTIKVPPGPLGYVYARACPSEGIELLALLSARSGDSDVAVAPLVVGLTVESGFEANVAVVVGSRTTGLGGTA
VSLKLTPSHYSSSVYVFHGGRHLDPSTQAPNLTRLCERARRHFGFSDYTPRPGDLKHETTGEALCERLGLDPDRALLYLV
VTEGFKEAVCINNTFLHLGGSDKVTIGGAEVHRIPVYPLQLFMPDFSRVIAEPFNANHRSIGEKFTYPLPFFNRPLNRLL
FEAVVGPAAVALRCRNVDAVARAAAHLAFDENHEGAALPADITFTAFEASQGKTPRGGRDGGGKGAAGGFEQRLASVMAG
DAALALESIVSMAVFDEPPTDISAWPLFEGQDTAAARANAVGAYLARAAGLVGAMVFSTNSALHLTEVDDAGPADPKDHS
KPSFYRFFLVPGTHVAANPQVDREGHVVPGFEGRPTAPLVGGTQEFAGEHLAMLCGFSPALLAKMLFYLERCDGAVIVGR
QEMDVFRYVADSNQTDVPCNLCTFDTRHACVHTTLMRLRARHPKFASAARGAIGVFGTMNSMYSDCDVLGNYAAFSALKR
ADGSETARTIMQETYRAATERVMAELETLQYVDQAVPTAMGRLETIITNREALHTVVNNVRQVVDREVEQLMRNLVEGRN
FKFRDGLGEANHAMSLTLDPYACGPCPLLQLLGRRSNLAVYQDLALSQCHGVFAGQSVEGRNFRNQFQPVLRRRVMDMFN
NGFLSAKTLTVALSEGAAICAPSLTAGQTAPAESSFEGDVARVTLGFPKELRVKSRVLFAGASANASEAAKARVASLQSA
YQKPDKRVDILLGPLGFLLKQFHAAIFPNGKPPGSNQPNPQWFWTALQRNQLPARLLSREDIETIAFIKKFSLDYGAINF
INLAPNNVSELAMYYMANQILRYCDHSTYFINTLTAIIAGSRRPPSVQAAAAWSAQGGAGLEAGARALMDAVDAHPGAWT
SMFASCNLLRPVMAARPMVVLGLSISKYYGMAGNDRVFQAGNWASLMGGKNACPLLIFDRTRKFVLACPRAGFVCAASSL
GGGAHESSLCEQLRGIISEGGAAVASSVFVATVKSLGPRTQQLQIEDWLALLEDEYLSEEMMELTARALERGNGEWSTDA
ALEVAHEAEALVSQLGNAGEVFNFGDFGCEDDNATPFGGPGAPGPAFAGRKRAFHGDDPFGEGPPDKKGDLTLDML
>Q2HRD3 ~~~DBP~~~Major DNA-binding protein~~~
MALKGPQTLEENIGSAAPTGPCGYLYAYVTHNFPIGEASLLGNGYPEAKVFSLPLLHGLTVESDFPLNVKAVHKKIDATT
ASVKLTSYHREAIVFHNTHLFQPIFQGKGLEKLCRESRELFGFSTFVEQQHKGTLWSPEACPQLPCANEIFMAVIVTEGF
KERLYGGKLVPVPSQTTPVHIGEHQAFKIPLYDEDLFGPSRAQELCRFYNPDISRYLHDSIFTGIAQALRVKDVSTVIQA
SERQFVHDQYKIPKLVQAKDFPQCASRGTDGSTLMVIDSLVAELGMSYGLSFIEGPQDSCEVLNYDTWPIFENCETPDAR
LRALEVWHAEQALHIGAQLFAANSVLYLTRVAKLPQKNQRGDANMYNSFYLQHGLGYLSEATVKENGASAFKGVPVSALD
GSSYTLQHLAYASSFSPHLLARMCYYLQFLPHHKNTNSQSYNVVDYVGTAAPSQMCDLCQGQCPAVCINTLFYRMKDRFP
PVLSNVKRDPYVITGTAGTYNDLEILGNFATFREREEEGNPVEDAPKYTYWQLCQNITEKLASMGISEGGDALRTLIVDI
PSFVKVFKGIDSTVEAELLKFINCMIKNNYNFRENIKSVHHILQFACNVYWQAPCPVFLTLYYKSLLTVIQDICLTSCMM
YEQDNPAVGIVPSEWLKMHFQTMWTNFKGACFDKGAITGGELKIVHQSMFCDLFDTDAAIGGMFAPARMQVRIARAMLMV
PKTIKIKNRIIFSNSTGAESIQAGFMKPASQRDSYIVGGPYMKFLNALHKTLFPSTKTSALYLWHKIGQTTKNPILPGVS
GEHLTELCNYVKASSQAFEEINVLDLVPDTLTSYAKIKLNSSILRACGQTQFYATTLSCLSPVTQLVPAEEYPHVLGPVG
LSSPDEYRVKVAGRSVTIVQSTLKQAVSTNGRLRPIITVPLVVNKYTGSNGNTNVFHCANLGYFSGRGVDRNLRPESVPF
KKNNVSSMLRKRHVIMTPLVDRLVKRIVGINSGEFEAEAVKRSVQNVLEDRDNPNLPKTVVLELVKHLGSSCASLTEEDV
IYYLGPYAVLGDEVLSLLSTVGQAGVPWTAEGVASVIQDIIDDCELQFVGPEEPCLIQGQSVVEELFPSPGVPSLTVGKK
RKIASLLSDLDL
>P13215 ~~~DBP~~~Major DNA-binding protein~~~
MSNEELSALAPVGPAAYVYFTKTNHEMNEVLATLSLCDSSSPVVIAPLLMGLTVDQDFCTSVRTPVVCYDGGVLTKVTSF
CPFALYFYNTQGIVDFSEPHGDVQRLCDETRQRYAIESYMPEEGRAPTDLAALCTAAGCDPQEVLVHVVVGNGMKEFMYA
GQLIPCFEEAAPTRLNDCDAVRVPLYPPTLFGSLQADVDSDELSLDKRSSFVESRGLYVPAVSETLFYYVYTSWCQALRF
SETKVLIEAALKQFVNDSQQSVKLAPHKKYFGYTSQKLSSLEKDHLMLSDAVICELGFSFASVFLDSAYGASDSMVYSEW
PVVVNATDHRDLIRALTELKLHLSTHISALLFSCNSILYHNRLVYLTSNKNASGTGASQEVLLKSIHFANGLTGLCEDTY
NDARKLIKCSGVVAKDERYAPYHLSLICGTCPQLFSAFIWYLNRVSVYNTGLTGSSTLSNHLIGCSSSLCGACGGTCCHT
CYNTAFVRVQTRLPQMPRLPKKEPSVVVMQSRFLNDVDVLGTFGRRYSAESKEASLDAKADEGSASTSNRTASSSVDRTH
RLNRILDYCKKMRLIDSVTGEDTMTINGRSDFINLVSSLNKFVDDEAMSFVSEVRMKSNRDEVLGATQAFNLDLNPFAVS
FSPILAYEYYRVIFAIIQNVALITATSYIVDNPLTTSLVSRWVTQHFQSIHGAFSTTSSRKGFLFIRNVKSSKNADHDRL
PDFKLYARGTYSVISMEIKLSRLSVPSLLMFRVKNRPISKASKGTTAHVFFRREHVPKKNPVKGCLGFLLYKYHDKLFPD
CGFSCLQFWQKVCANALPKNVNIGDMGEFNNFVKFVISVTADYNEHDLIDVPPDCMLNYLENRFHNKFLCFYGFKDYIGT
LHGLTTRLTYQNHAQFPYLLGESPNFASAADFALRLKDLKATGVTAPLASTVTRESLMRTIFEQRSLVTVSFSIEKYAGV
NNNKEIYQFGQIGYFSGNGVERSLNTNSIGGQDYKFMRQRCILATKLSDVLIKRSRRDNVLFDEDIIKNRVMAALDSENL
DVDPELMAMYEILSTREEIPERDDVLFFVDGCQAVADSLMEKFSRLQEMGVDDFSLVNLQQVLDSRPECGGGGGEVHDLS
ALFTAASGEAVGNSVGLNARGGEHAFDEDCGLLPAKRGRL
>P35970 6.5.1.1~~~LIG~~~DNA ligase~~~
MLNQFPGQYSNNIFCFPPIESETKSGKKASWIICVQVVQHNTIIPITDEMFSTDVKDAVAEIFTKFFVEEGAVRISKMTR
VTEGKNLGKKNATTVVHQAFKDALSKYNRHARQKRGAHTNRGMIPPMLVKYFNIIPKTFFEEETDPIVQRKRNGVRAVAC
QQGDGCILLYSRTEKEFLGLDNIKKELKQLYLFIDVRVYLDGELYLHRKPLQWIAGQANAKTDSSELHFYVFDCFWSDQL
QMPSNKRQQLLTNIFKQKEDLTFIHQVENFSVKNVDEALRLKAQFIKEGYEGAIVRNANGPYEPGYNNYHSAHLAKLKPL
LDAEFILVDYTQGKKGKDLGAILWVCELPNKKRFVVTPKHLTYADRYALFQKLTPALFKKHLYGKELTVEYAELSPKTGI
PLQARAVGFREPINVLEII
>P00970 6.5.1.1~~~~~~DNA ligase~~~
MILKILNEIASIGSTKQKQAILEKNKDNELLKRVYRLTYSRGLQYYIKKWPKPGIATQSFGMLTLTDMLDFIEFTLATRK
LTGNAAIEELTGYITDGKKDDVEVLRRVMMRDLECGASVSIANKVWPGLIPEQPQMLASSYDEKGINKNIKFPAFAQLKA
DGARCFAEVRGDELDDVRLLSRAGNEYLGLDLLKEELIKMTAEARQIHPEGVLIDGELVYHEQVKKEPEGLDFLFDAYPE
NSKAKEFAEVAESRTASNGIANKSLKGTISEKEAQCMKFQVWDYVPLVEIYSLPAFRLKYDVRFSKLEQMTSGYDKVILI
ENQVVNNLDEAKVIYKKYIDQGLEGIILKNIDGLWENARSKNLYKFKEVIDVDLKIVGIYPHRKDPTKAGGFILESECGK
IKVNAGSGLKDKAGVKSHELDRTRIMENQNYYIGKILECECNGWLKSDGRTDYVKLFLPIAIRLREDKTKANTFEDVFGD
FHEVTGL
>P00969 6.5.1.1~~~~~~DNA ligase~~~
MMNIKTNPFKAVSFVESAIKKALDNAGYLIAEIKYDGVRGNICVDNTANSYWLSRVSKTIPALEHLNGFDVRWKRLLNDD
RCFYKDGFMLDGELMVKGVDFNTGSGLLRTKWTDTKNQEFHEELFVEPIRKKDKVPFKLHTGHLHIKLYAILPLHIVESG
EDCDVMTLLMQEHVKNMLPLLQEYFPEIEWQAAESYEVYDMVELQQLYEQKRAEGHEGLIVKDPMCIYKRGKKSGWWKMK
PENEADGIIQGLVWGTKGLANEGKVIGFEVLLESGRLVNATNISRALMDEFTETVKEATLSQWGFFSPYGIGDNDACTIN
PYDGWACQISYMEETPDGSLRHPSFVMFRGTEDNPQEKM
>A0A7H0DNE6 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCDIYHASGYKEKSKLIRDFITDRDDTDTYLIIKLLLPGLDDRMYNMNDKQIIKLYSIIFKQSQEDMLQD
LGYGYIGDTIRTFFKENTEIRPRDKSILTLEEVDSFLTTLSSVTKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKA
GPRYVLNAISPHAYDVFRKSNNLKEIIENAAKQNLDSISISVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQ
VHKKNNEFAFFSRNMKPVLSHKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVPLPFGSLGIHKKKEYKNSNMCLFVFD
CLYFDGFDMTDIPLYERRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKI
KRDYLNEGSMADSADLVVLGAYYGKGGKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRVLQDQLTMVKINKDPKKI
PEWLVVNKIYIPDFVVDDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKSLNSYI
>Q9YMV2 6.5.1.1~~~LIG~~~DNA ligase~~~
MENHDSFYKFCQLCQSLYDADDHQEKRDALERHFADFRGSAFMWRELLAPAESDAAADRELTLIFETILSIERTEQENVT
RNLKCTIDGAAVPLSRESRITVPQVYEFINDLRGSGSRQERLRLIGQFAAGCTDEDLLTVFRVVSDHAHAGLSAEDVMEL
VEPWERFQKPVPPALAQPCRRLASVLVKHPEGALAEVKYDGERVQVHKAGSRFKFFSRTLKPVPEHKVAGCREHLTRAFP
RARNFILDAEIVMVDGSGEALPFGTLGRLKQMEHADGHVCMYIFDCLRYNGVSYLNATPLDFRRRVLQDEIVPIEGRVVL
SAMERTNTLSELRRFVHRTLATGAEGVVLKGRLSSYAPNKRDWFKMKKEHLCDGALVDTLDLVVLGAYYGTGRNCRKMSV
FLMGCLDRESNVWTTVTKVHSGLADAALTALSKELRPLMAAPRDDLPEWFDCNESMVPHLLAADPEKMPVWEIACSEMKA
NIGAHTAGVTMRFPRVKRFRPDKDWSTATDLQEAEQLIRNSQENTKKTFARLATTYDGPSPNKKLKLN
>O57250 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCDIYHASGYKEKSKLIRDFITDRDDKYLIIKLLLPGLDDRIYNMNDKQIIKLYSIIFKQSQEDMLQDLG
YGYIGDTIRTFFKENTEIRPRDKSILTLEEVDSFLTTLSSVTKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKAGP
RYVLNAISPHAYDVFRKSNNLKEIIENASKQNLDSISISVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQVH
KNNNEFAFFSRNMKPVLSHKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVPLPFGSLGIHKKKEYKNSNMCLFVFDCL
YFDGFDMTDIPLYERRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKIKR
DYLNEGSMADSADLVVLGAYYGKGAKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRELQDQLKMIKINKDPKKIPE
WLVVNKIYIPDFVVEDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKS
>P20492 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCDIYHASGYKEKSKLIRDFITDRDDKYLIIKLLLPGLDDRIYNMNDKQIIKLYSIIFKQSQEDMLQDLG
YGYIGDTIRTFFKENTEIRPRDKSILTLEDVDSFLTTLSSVTKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKAGP
RYVLNAISPNAYDVFRKSNNLKEIIENSSKQNLDSISISVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQVH
KNNNEFAFFSRNMKPVLSHKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVPLPFGSLGIHKKKEYKNSNMCLFVFDCL
YFDGFDMTDIPLYERRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKIKR
DYLNEGSMADSADLVVLGAYYGKGAKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRVLQDQLTMIKINKDPKKIPE
WLVVNKIYIPDFVVEDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKS
>P16272 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCDIYHASGYKEKSKLIRDFITDRDDKYLIIKLLLPGLDDRIYNMNDKQIIKLYSIIFKQSQEDMLQDLG
YGYIGDTIRTFFKENTEIRPRDKSILTLEDVDSFLTTLSSVTKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKAGP
RYVLNAISPNAYDVFRKSNNLKEIIENASKQNLDSISISVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQVH
KNNNEFAFFSRNMKPVLSHKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVPLPFGSLGIHKKKEYKNSNMCLFVFDCL
YFDGFDMTDIPLYERRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKIKR
DYLNEGSMADSADLVVLGAYYGKGAKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRVLQDQLTMVKINKDPKKIPE
WLVVNKIYIPDFVVEDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKS
>P0DOO3 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCAIYHASGYKEKSKLIRDFITDRDDKYLIIKLLLPGLDDRIYNMNDKQIIKIYSIIFKQSQKDMLQDLG
YGYIGDTISTFFKENTEIRPRNKSILTLEDVDSFLTTLSSITKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKAGP
RYVLNAISPHAYDVFRKSNNLKEIIENESKQNLDSISVSVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQVH
KNNNEFAFFSRNMKPVLSYKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVQLPFGSLGIHKKKEYKNSNMCLFVFDCL
YFDGFDMTDIPLYKRRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKIKR
DYLNEGSMADSADLVVLGAYYGKGAKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRVLQDQLTMVKINKDPKKIPE
WLVVNKIYIPDFVVEDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKS
>P0DOO4 6.5.1.1~~~~~~DNA ligase~~~
MTSLREFRKLCCAIYHASGYKEKSKLIRDFITDRDDKYLIIKLLLPGLDDRIYNMNDKQIIKIYSIIFKQSQKDMLQDLG
YGYIGDTISTFFKENTEIRPRNKSILTLEDVDSFLTTLSSITKESHQIKLLTDIASVCTCNDLKCVVMLIDKDLKIKAGP
RYVLNAISPHAYDVFRKSNNLKEIIENESKQNLDSISVSVMTPINPMLAESCDSVNKAFKKFPSGMFAEVKYDGERVQVH
KNNNEFAFFSRNMKPVLSYKVDYLKEYIPKAFKKATSIVLDSEIVLVDEHNVQLPFGSLGIHKKKEYKNSNMCLFVFDCL
YFDGFDMTDIPLYKRRSFLKDVMVEIPNRIVFSELTNISNESQLTDVLDDALTRKLEGLVLKDINGVYEPGKRRWLKIKR
DYLNEGSMADSADLVVLGAYYGKGAKGGIMAVFLMGCYDDESGKWKTVTKCSGHDDNTLRVLQDQLTMVKINKDPKKIPE
WLVVNKIYIPDFVVEDPKQSQIWEISGAEFTSSKSHTANGISIRFPRFTRIREDKTWKESTHLNDLVNLTKS
>Q5UPZ0 6.5.1.2~~~~~~DNA ligase~~~
MDIIKKIKKSKSLWAILPSLNATDIEEALSVSSEYYHNTGTSLISDQEYDILMDRLKELNPSSKIFAQVGAPVKGKKVKL
PFWMGSMNKIKADEKAVNKWLNEYSGPYVISDKLDGISCLLTIKNNKTKLYTRGDGTYGQDITHLLGLINIDIGLLEEID
QDIAIRGELIMSKKNFEKYQEIMANARNMVGGIVNSKPESVNKDHAADVDLIFYEVIKPNDKLSRQLKILKEWGLKVVYY
NIYKTFDVNILESVLSERKKKSGYEIDGIIVTDNNKHVRNISGNPSYSFAFKGDTPTIDTVVKRVIWTPSKDGVLVPRIK
FKKVRLSNVDLEYTTGFNAKYIVDNKIGSGAIINVVRSGDVIPYITHVVKPAKKPDLPNIEYVWDKNGVNIILADINDNE
TVIIKRLTKFMRNIGAENISEGITTRLVEAGFDTIPKIINMTEEDFLTIDGFQERLAEKIYNNLQNSLDNLDILTLMDAS
NIFGRGFGTKKFKKILDVYPNIVNQYTKETDNIWRKKLLDIEGFDTITVNKFLGEMPNFQKFYKVINKTITIKPYISEVN
SEGIFQNQTVVFTGFRNADWQKFIENEGGKVSGSVSKNTSLLVYNDGEESSAKYQKAKQLGIKTMTKSSFSKKFEK
>P04531 2.7.4.13~~~1~~~Deoxynucleotide monophosphate kinase~~~
MKLIFLSGVKRSGKDTTADFIMSNYSAVKYQLAGPIKDALAYAWGVFAANTDYPCLTRKEFEGIDYDRETNLNLTKLEVI
TIMEQAFCYLNGKSPIKGVFVFDDEGKESVNFVAFNKITDVINNIEDQWSVRRLMQALGTDLIVNNFDRMYWVKLFALDY
LDKFNSGYDYYIVPDTRQDHEMDAARAMGATVIHVVRPGQKSNDTHITEAGLPIRDGDLVITNDGSLEELFSKIKNTLKV
L
>Q6QGP4 2.7.4.13~~~dnk~~~Deoxynucleoside-5'-monophosphate kinase~~~
MSVLVGLHGEAGSGKDTVAKLIIDWCNDTYPTCLSRRYSFAKPVYELASVILGVTPEFLGERRGKEIDQWFTVTQSQLER
ARDVWFKYGIDKFEDFSYVWPIFEEKYLNPQQLISENKEDGLYSLFISPRKMLQLVGTELGRQLVHERIWLIILEQSIAK
DDPDVAVITDVRFPNEGELLRETNHLDMDSLLVNVVPAEQKFTIKSDHPSESGIPAKYITHELVNKFDGINNLKLEVYNF
CDLELEPLVG
>Q67472 2.1.1.37~~~~~~Probable DNA (cytosine-5)-methyltransferase~~~
MRILDLFSGTHSVPKACAQREGWSCVTVDLADSDYNVDVLEWDYTKDLKPREFDVVWASPPCRYFSKLRESNIGRGGMTK
KSVKEDLETKGLPLLRRAMEIIAYLQPKKFIVENPDTGRMKEYITEWPHYVVDYCAYSDWGYRKRTRLWTDIEGFVPKTC
AGKESCPNMERNPSSGRWRHVLATDAGGRGRKGTTRRLRYRVPPAIIVELLDLC
>Q06259 2.7.11.1~~~doc~~~Protein kinase doc~~~
MRHISPEELIALHDANISRYGGLPGMSDPGRAEAIIGRVQARVAYEEITDLFEVSATYLVATARGHIFNDANKRTALNSA
LLFLRRNGVQVFDSPELADLTVGAATGEISVSSVADTLRRLYGSAE
>Q65214 ~~~~~~Uncharacterized protein DP60R~~~
MSSIWPPQKKVFTVGFITGGVTPVMVSFVWPAAQPQKKINYSRKKKYFRPRSFYKKNVSF
>Q65212 ~~~~~~Protein DP71L~~~
MGGRRRKKRTNDTKHVRFAAAVEVWEADDIERKGPWEQVAVDRFRFQRRIASVEELLSTVLLRQKKLLEQQ
>D1L2X0 ~~~~~~Depolymerase 1, capsule K3-specific~~~
MDQDIKTIIQYPTGATEFDIPFDYLSRKFVRVSLVSDDNRRLLRNITEYRYVSKTRVKILVDTTGFDRVEIRRFTSASER
VVDFSDGSVLRANDLNVSQLQSSHIAEEARDSALLAMPEDDAGNLDARNRKIVRLAPGEVGTDAVNKDQLDTTLGEAGGI
LSEIKQTQQDIGDYIEEFANGTTYLKNIVMVYNQGSANGGETAIVLDRTDEVFAVPVIYINGDRQEVGFQFSYDNTTKTI
TLVKPLQRGDFVVMLTSEGTHSLASILAGPDGASRIGASGGLTVQQAIDRILESHIILPQWFGARGDWSDTTNTGTDDTA
AFEAAIEMAVTLNYTEVYVPAGSYLITRELNLNGGNRNTRLGARIRGAGWASSVLVFKAPAPETPCISIIGTPGSHTSKG
VEKLLIKSHPSTTGQGIGILCRNTCFAHVTEFLIANMNIGICLENYMTAGSFTEFCYFTNGRLFTNNINIQFNKVQNGDN
SFHGSNFKNIQNQVKKNGGIGVQIVGNPQGAAYLYNMTWDMQFFGGAGCIAMDLNYCNTDYVGGKLTGESDLTFKADGNS
RFDFHGRFDSIGKVIWDTATEGARTGGTYVFANRTSLENAVMTNADAGRLPAGVSARPLPADWADRNNNGVYPATFHIRG
PNIESMGFVTYDSPGNGFFFGQLPFQGNIKDFNPRFWFNSGGTSFNTAATAYTMKLVNGEGLYFSDTTIRALSDGNVSLG
EWNRRFKEARFTSWDIGTSIVPTSTATKDIGTNSNRIRDIYLANSPNVTSDSRKKSFIKPIDEALLDAWETIPFSQWKLN
DSVAEKGSEARWHVGYIAQKVEESLQAFGLNAGDYGLITIGDDGFMLRMEECLVVEAAMIRRKLGLTYK
>A0A0U3DL17 ~~~~~~Depolymerase 1, capsule K47-specific~~~
MDQDIKTVIQYPVGATEFDIPFDYLSRKFVRVSLVTDENRRLLSNITEYRYVSKTRVKLLVETDGFDRVEIRRLTSASER
VVDFSDGSVLRAADLNVSQIQSAHIAEEARDAALMAMPQDDAGNLDARNRRIVRLAPGVEGTDAINKNQLDTTLGDAGGI
LSDMKDLEGEIHDYIEKFADDTALVRGVAWVYNLGSADGGETVITINKSTRTYAVPYIEVNGSRQEVGYHYSFDLETQQI
TLATPLKAGDFVMVMTTESQLPVETLLASSVGASSIGTATGETVEERLTRLYGHFVHPETYGAVGDGITDDRVALQRSLD
VAYENALNGTGPSTVRWSGDYMVSLNPNSLGVSGELAAGRSALCIRPGVSIEGKGTVRLDPSFTGSQSGAVITNWAGPVD
DCSIKDIRIYGGKDVATGTGITGILILDSQRVVISDVKVLNSTAGGIYLRKGATEGLYGCSFSKVSGCTVDNAGYIGIQM
ERPYDNTVIGNTINRCEDNGIDVFGNVNDATVTGIAQSTLITGNNIRDVLNGVFIESCGNTNITGNYIADFRSSGVIYNR
INSAANDNSLTSNVLIGASGASAGVSFKNSVGYCTVASNRIQNSDYGIRCVGGGITGLNILPNTMKNIAKTLLFVEARNN
GLVKSRMSTQFYEGAQVGGIPSNTSPRGVPHRFPSRLSYIVDIQPFWATEQGTREDNFERAKGTLASITGWGSKCALYDT
IVAGDTVVSLNSSSVAVGEYLEINAEVYKVTSVSATYAVVRKWTGSDYTAGDYAAVIISNPSYIIRRVQWGEQ
>D1L2X1 ~~~~~~Depolymerase 2, capsule K21-specific~~~
MLDNFNQPKGSTIGVLKDGRTIQEAFDSLPRLESFSGSTATDKLRAAITLGVSEVAIGPVEGNGGRPYEFGDVVIPYPLR
IVGCGSQGINVTKGTVLKRSAGASFMFHFTGEGQAQRPMGGGLFNINLNGDTATALGDIIKVTQWSYFKANNCAFQNMAG
WGIRLKDVMESNISGNLFRRLGGPSGGGILFDDVRSAVTDNVNNLHIEDNTFALMSGPWIGSTANSNPDLIWIVRNKFEF
DGTPAAPNTVDSYVLDFQQLSRAFIQDNGFTHFTTERNRYVGVLRVGATAVGTIKFEDNLLFACESAGLIAGGIVVSRGN
VNNQGSATTAIKQFTNTSSKLCKLERVINVQSNGNVSVGQQILPDGYINMAELPGNTRLPSEYDADGETTSVLRVPANTQ
VRQWSVPKMYKDGLTVTKVTVRAKGAAAGAILSLQSGSTVLSTKSIDAGVWKNYVFYVKANQLQETLQLRNTGTADVLAD
GMVFGKVDYIDWDFAIAPGTLAAGAKYTTPNQSYLDVAGMRVQAVSIPMFDGPTTGLQVWVEATSANGSFVVVMKNDTGS
ELVTTVTRCRVRAFVS
>A0A0U3C9T3 ~~~~~~Depolymerase 1, capsule K47-specific~~~
MLNNLNQPKGSTIGVLKDGRTIQQAIDGLENPVHYVKDVSITPSALLAVAVEAARLGRTVEFGPGHYTNQGQPFEVDFPL
NLDVPVGTFLDFPIIIRGKTVKTVRSVATNLTAAQCPAGTTVIAGDFSAFPVGSVVGVKLGDNTNGSASYNNEAGWDFTT
VAAASNTSITLSTGLRWAFDKPEVFTPEYAVRYSGQLSRSSYFIPGDYTSGLNVGDIIRVENIDGTDGVHGNKEYFEMLK
VSSIDSSGITVETRLRYTHVNPWIVKTGLVKGSSVTGGGRLKRLEVRGVDTPKVNNVDVDRLIVGLCYNIDVGEITSRGV
GEPSSVNFTFCFGRGFLYNVRASGSVSTTDNSALKLMSCPGLIINNCSPHNSTSTGSQGDYGFYVDAYYSPYWCWNDGMS
INGIVTETPRSAVTRALWLFGLRGCSVSNLSGAQVFLQGCAKSVFSNIVTPDNLLELRDLSGCIVSGMANNALVLGCWNS
TFDLTLFGIGSGSNLNIALRAGAGVTHPETGVPTTLGKNNTFNVKSFSQSSLAVTLSIAQQERPIFGAGCVDVDSANKSV
ALGSNVIVPTMLPLALTKGIDSGSGWVGGRTKGGIWFDGNYRDAAVRWNGQYVWVADNGSLKAAPTKPDSDSPSNGVVIG
P
>P0DTK4 2.7.7.7~~~~~~DNA-directed DNA polymerase~~~
MSYLQPPLFTTVSSDWVAPDLSTLPSWEGAKRVAIDCETRDPDLRKLGPGAGRRPNSYITGISFAIEDGPGGYLPIRHEG
GGNLPLEGVLAYLRAQAKVFTGDLVGANLPYDLDFLAGDGIEFERVRYFRDIQIADPLICELHDSYSMQAIAERWGFHGK
DEALLRAAAVDYGIDPKKDMWMLPAKFVGKYAEEDTRLPLNILRRQEREIDEQDLWGVYNLESKLLPILTGLRRRGVRID
CDRLDMIERWALEKETEALAQVRSITGHRIAVGDVWKPEVIAPALEHIGIKLNKTSQGKPNIDKELLGSIDHPVADLLER
ARKVNKLRTTFASSVRDHMVNGRLHGTFNQLRRQKDDESDGTAGAAYGRLSSEHPNLQQQPARDEFAMMWRAIYLPEEGQ
HWASNDYSQQEPRMAVHYACLAKDLIGHQAWLSAIEARDKYRNDPNTDNHQMMADMAGIKRKDAKEIYLGLSYGMGGAKM
CRKLGLPTMMAVRGPRFQLFDVNSPEGQRLVAEGARRFEAAGPEGQALLDTFDHKVPFIKKLAKACEARAKAVGYITTLS
GRRCRFPKDKDGNYDWTHKGLNRLIQGSSADQTKMAMVACAEAGLDIIIQVHDEIAFSVHDMKEAAEAAHIMRTCTPLEL
PSKVDVEIGQSWGHSMGWDGNPPS
>P42494 2.7.7.7~~~~~~Repair DNA polymerase X~~~
MLTLIQGKKIVNHLRSRLAFEYNGQLIKILSKNIVAVGSLRREEKMLNDVDLLIIVPEKKLLKHVLPNIRIKGLSFSVKV
CGERKCVLFIEWEKKTYQLDLFTALAEEKPYAIFHFTGPVSYLIRIRAALKKKNYKLNQYGLFKNQTLVPLKITTEKELI
KELGFTYRIPKKRL
>Q7T6Y4 2.7.7.7~~~~~~Probable DNA polymerase family X~~~
MNSKIIEQFNLLEKQVDAEYLNSKVENDLKEETMNRFRLKSIKKALSILKNLDFEITDANDVKGIPGIGAGTIKRIKEIL
ETGKLHDLKDKFSPEKQKQIEGIQELENVIGIGSSTAKKLISQYGIRSVDDLKKAIETGKVKVSTSIMLGLKYYGIVQRD
IPRKEITAIEKLLSKEAHKIDPDLEIIICGSYRRGKKTSGDIDVLMYHPKMKTSKEMLHPEKFDLEPYFNLYIDRLTEKG
FLIDDITFNPNKKYMGFCKYKLNPVRRIDIRFIPYNSLAPAMLYFTGPMELNTKMRSAAKKRKMILNEYGLFKTDKNGAQ
IPLDTKSEADIFHALGMDYLTPQQRELYSSGKIH
>P03261 2.7.7.7~~~POL~~~DNA polymerase~~~
MALVQAHRARRLHAEAPDSGDQPPRRRVRQQPPRAAPAPARARRRRAPAPSPGGSRAPPTSGGPPASPLLDASSKDTPAA
HRPPRGTVVAPRGCGLLQVIDAATNQPLEIRYHLDLARALTRLCEVNLQELPPDLSPRELQTMDSSHLRDVVIKLRPPRA
DIWTLGSRGVVVRSTITPLEQPDGQGQAAEVEDHQPNPPGEGLKFPLCFLVRGRQVNLVQDVQPVHRCQYCARFYKSQHE
CSARRRDFYFHHINSHSSNWWREIQFFPIGSHPRTERLFVTYDVETYTWMGAFGKQLVPFMLVMKFGGDEPLVTAARDLA
VDLGWDRWEQDPLTFYCITPEKMAIGRQFRTFRDHLQMLMARDLWSSFVASNPHLADWALSEHGLSSPEELTYEELKKLP
SIKGTPRFLELYIVGHNINGFDEIVLAAQVINNRSEVPGPFRITRNFMPRAGKILFNDVTFALPNPRSKKRTDFLLWEQG
GCDDTDFKYQYLKVMVRDTFALTHTSLRKAAQAYALPVEKGCCAYQAVNQFYMLGSYRSEADGFPIQEYWKDREEFVLNR
ELWKKKGQDKYDIIKETLDYCALDVQVTAELVNKLRDSYASFVRDAVGLTDASFNVFQRPTISSNSHAIFRQIVFRAEQP
ARSNLGPDLLAPSHELYDYVRASIRGGRCYPTYLGILREPLYVYDICGMYASALTHPMPWGPPLNPYERALAARAWQQAL
DLQGCKIDYFDARLLPGVFTVDADPPDETQLDPLPPFCSRKGGRLCWTNERLRGEVATSVDLVTLHNRGWRVHLVPDERT
TVFPEWRCVAREYVQLNIAAKERADRDKNQTLRSIAKLLSNALYGSFATKLDNKKIVFSDQMDAATLKGITAGQVNIKSS
SFLETDNLSAEVMPAFEREYSPQQLALADSDAEESEDERAPTPFYSPPSGTPGHVAYTYKPITFLDAEEGDMCLHTLERV
DPLVDNDRYPSHLASFVLAWTRAFVSEWSEFLYEEDRGTPLEDRPLKSVYGDTDSLFVTERGHRLMETRGKKRIKKHGGN
LVFDPERPELTWLVECETVCGACGADAYSPESVFLAPKLYALKSLHCPSCGASSKGKLRAKGHAAEGLDYDTMVKCYLAD
AQGEDRQRFSTSRTSLKRTLASAQPGAHPFTVTQTTLTRTLRPWKDMTLARLDEHRLLPYSESRPNPRNEEICWIEMP
>P04495 2.7.7.7~~~POL~~~DNA polymerase~~~
MDSSHLRDVVIKLRPPRADIWTLGSRGVVVRSTVTPLEQPDGQGQAAEVEDHQPNPPGEGLKFPLCFLVRGRQVNLVQDV
QPVHRCQYCARFYKSQHECSARRRDFYFHHINSHSSNWWREIQFFPIGSHPRTERLFVTYDVETYTWMGAFGKQLVPFML
VMKFGGDEPLVTAARDLAANLGWDRWEQDPLTFYCITPEKMAIGRQFRTFRDHLQMLMARDLWSSFVASNPHLADWALSE
HGLSSPEELTYEELKKLPSIKGIPRFLELYIVGHNINGFDEIVLAAQVINNRSEVPGPFRITRNFMPRAGKILFNDVTFA
LPNPRSKKRTDFLLWEQGGCDDTDFKYQYLKVMVRDTFALTHTSLRKAAQAYALPVEKGCCAYQAVNQFYMLGSYRSEAD
GFPIQEYWKDREEFVLNRELWKKKGQDKYDIIKETLDYCALDVQVTAELVNKLRDSYASFVRDAVGLTDASFNVFQRPTI
SSNSHAIFRQIVFRAEQPARSNLGPDLLAPSHELYDYVRASIRGGRCYPTYLGILREPLYVYDICGMYASALTHPMPWGP
PLNPYERALAARAWQQALDLQGCKIDYFDARLLPGVFTVDADPPDETQLDPLPPFCSRKGGRLCWTNERLRGEVATSVDL
VTLHNRGWRVHLVPDERTTVFPEWRCVAREYVQLNIAAKERADRDKNQTLRSIAKLLSNALYGSFATKLDNKKIVFSDQM
DAATLKGITAGQVNIKSSSFLETDNLSAEVMPAFQREYSPQQLALADSDAEESEDERAPTPFYSPPSGTPGHVAYTYKPI
TFLDAEEGDMCLHTLERVDPLVDNDRYPSHLASFVLAWTRAFVSEWSEFLYEEDRGTPLEDRPLKSVYGDTDSLFVTERG
HRLMETRGKKRIKKHGGNLVFDPERPELTWLVECETVCGACGADAYSPESVFLAPKLYALKSLHCPSCGASSKGKLRAKG
HAAEGLDYDTMVKCYLADAQGEDRQRFSTSRTSLKRTLASAQPGAHPFTVTQTTLTRTLRPWKDMTLARLDEHRLLPYSE
SRPNPRNEEICWIEMP
>P42489 2.7.7.7~~~DPOL~~~DNA polymerase beta~~~
MISIMDRSEIVARENPVITQRVTNLLQTNAPLLFMPIDIHEVRYGAYTLFMYGSLENGYKAEVRIENIPVFFDVQIEFND
TNQLFLKSLLTAENIAYERLETLTQRPVMGYREKEKEFAPYIRIFFKSLYEQRKAITYLNNMGYNTAADDTTCYYRMVSR
ELKLPLTSWIQLQHYSYEPRGLVHRFSVTPEDLVSYQDDGPTDHSIVMAYDIETYSPVKGTVPDPNQANDVVFMICMRIF
WIHSTEPLASTCITMAPCKKSSEWTTILCSSEKNLLLSFAEQFSRWAPDICTGFNDSRYDWPFIVEKSMQHGILEEIFNK
MSLFWHQKLDTILKCYYVKEKRVKISAEKSIISSFLHTPGCLPIDVRNMCMQLYPKAEKTSLKAFLENCGLDSKVDLPYH
LMWKYYETRDSEKMADVAYYCIIDAQRCQDLLVRHNVIPDRREVGILSYTSLYDCIYYAGGHKVCNMLIAYAIHDEYGRI
ACSTIARGKREHGKYPGAFVIDPVKGLEQDKPTTGLDFASLYPSLIMAYNFSPEKFVASRDEAKSLMAKGESLHYVSFHF
NNRLVEGWFVRHNNVPDKMGLYPKVLIDLLNKRTALKQELKKLGEKKECIHESHPGFKELQFRHAMVDAKQKALKIFMNT
FYGEAGNNLSPFFLLPLAGGVTSSGQYNLKLVYNFVINKGYGIKYGDTDSLYITCPDSLYTEVTDAYLNSQKTIKHYEQL
CHEKVLLSMKAMSTLCAEVNEYLRQDNGTSYLRMAYEEVLFPVCFTGKKKYYGIAHVNTPNFNTKELFIRGIDIIKQGQT
KLTKTIGTRIMEESMKLRRPEDHRPPLIEIVKTVLKDAVVNMKQWNFEDFIQTDAWRPDKDNKAVQIFMSRMHARREQLK
KHGAAASQFAEPEPGERFSYVIVEKQVQFDIQGHRTDSSRKGDKMEYVSEAKAKNLPIDILFYINNYVLGLCARFINENE
EFQPPDNVSNKDEYAQRRAKSYLQKFVQSIHPKDKSVIKQGIVHRQCYKYVHQEIKKKIGIFADLYKEFFNNTTNPIESF
IQSARFMIQYSDGEQKVNHSMKKMVEQRATLASKPAGKPAGNPAGNPAGNALMRAIFTQLITEEKKIVQALYNKGDAIHD
LLTYIINNINYKIATFQTKQMLTFEFSSTHVELLLKLNKTWLILAGIHVAKKHLQALLDSYNNEPPSRTFIQQAIEEECG
SIKPSCYDFIS
>B7SSM2 2.7.7.7~~~2~~~DNA polymerase~~~
MSRKMFSCDFETTTKLDDCRVWAYGYMEIGNLDNYKIGNSLDEFMQWVMEIQADLYFHNLKFDGAFIVNWLEQHGFKWSN
EGLPNTYNTIISKMGQWYMIDICFGYRGKRKLHTVIYDSLKKLPFPVKKIAKDFQLPLLKGDIDYHTERPVGHEITPEEY
EYIKNDIEIIARALDIQFKQGLDRMTAGSDSLKGFKDILSTKKFNKVFPKLSLPMDKEIRKAYRGGFTWLNDKYKEKEIG
EGMVFDVNSLYPSQMYSRPLPYGAPIVFQGKYEKDEQYPLYIQRIRFEFELKEGYIPTIQIKKNPFFKGNEYLKNSGVEP
VELYLTNVDLELIQEHYELYNVEYIDGFKFREKTGLFKDFIDKWTYVKTHEEGAKKQLAKLMLNSLYGKFASNPDVTGKV
PYLKDDGSLGFRVGDEEYKDPVYTPMGVFITAWARFTTITAAQACYDRIIYCDTDSIHLTGTEVPEIIKDIVDPKKLGYW
AHESTFKRAKYLRQKTYIQDIYVKEVDGKLKECSPDEATTTKFSVKCAGMTDTIKKKVTFDNFAVGFSSMGKPKPVQVNG
GVVLVDSVFTIK
>P03680 2.7.7.7~~~2~~~DNA polymerase~~~
MKHMPRKMYSCDFETTTKVEDCRVWAYGYMNIEDHSEYKIGNSLDEFMAWVLKVQADLYFHNLKFDGAFIINWLERNGFK
WSADGLPNTYNTIISRMGQWYMIDICLGYKGKRKIHTVIYDSLKKLPFPVKKIAKDFKLTVLKGDIDYHKERPVGYKITP
EEYAYIKNDIQIIAEALLIQFKQGLDRMTAGSDSLKGFKDIITTKKFKKVFPTLSLGLDKEVRYAYRGGFTWLNDRFKEK
EIGEGMVFDVNSLYPAQMYSRLLPYGEPIVFEGKYVWDEDYPLHIQHIRCEFELKEGYIPTIQIKRSRFYKGNEYLKSSG
GEIADLWLSNVDLELMKEHYDLYNVEYISGLKFKATTGLFKDFIDKWTYIKTTSEGAIKQLAKLMLNSLYGKFASNPDVT
GKVPYLKENGALGFRLGEEETKDPVYTPMGVFITAWARYTTITAAQACYDRIIYCDTDSIHLTGTEIPDVIKDIVDPKKL
GYWAHESTFKRAKYLRQKTYIQDIYMKEVDGKLVEGSPDDYTDIKFSVKCAGMTDKIKKEVTFENFKVGFSRKMKPKPVQ
VPGGVVLVDDTFTIK
>Q38087 2.7.7.7~~~~~~DNA-directed DNA polymerase~~~
MKEFYLTVEQIGDSIFERYIDSNGRERTREVEYKPSLFAHCPESQATKYFDIYGKPCTRKLFANMRDASQWIKRMEDIGL
EALGMDDFKLAYLSDTYNYEIKYDHTKIRVANFDIEVTSPDGFPEPSQAKHPIDAITHYDSIDDRFYVFDLLNSPYGNVE
EWSIEIAAKLQEQGGDEVPSEIIDKIIYMPFDNEKELLMEYLNFWQQKTPVILTGWNVESFDIPYVYNRIKNIFGESTAK
RLSPHRKTRVKVIENMYGSREIITLFGISVLDYIDLYKKFSFTNQPSYSLDYISEFELNVGKLKYDGPISKLRESNHQRY
ISYNIIDVYRVLQIDAKRQFINLSLDMGYYAKIQIQSVFSPIKTWDAIIFNSLKEQNKVIPQGRSHPVQPYPGAFVKEPI
PNRYKYVMSFDLTSLYPSIIRQVNISPETIAGTFKVAPLHDYINAVAERPSDVYSCSPNGMMYYKDRDGVVPTEITKVFN
QRKEHKGYMLAAQRNGEIIKEALHNPNLSVDEPLDVDYRFDFSDEIKEKIKKLSAKSLNEMLFRAQRTEVAGMTAQINRK
LLINSLYGALGNVWFRYYDLRNATAITTFGQMALQWIERKVNEYLNEVCGTEGEAFVLYGDTDSIYVSADKIIDKVGESK
FRDTNHWVDFLDKFARERMEPAIDRGFREMCEYMNNKQHLMFMDREAIAGPPLGSKGIGGFWTGKKRYALNVWDMEGTRY
AEPKLKIMGLETQKSSTPKAVQKALKECIRRMLQEGEESLQEYFKEFEKEFRQLNYISIASVSSANNIAKYDVGGFPGPK
CPFHIRGILTYNRAIKGNIDAPQVVEGEKVYVLPLREGNPFGDKCIAWPSGTEITDLIKDDVLHWMDYTVLLEKTFIKPL
EGFTSAAKLDYEKKASLFDMFDF
>P04415 2.7.7.7~~~~~~DNA-directed DNA polymerase~~~
MKEFYISIETVGNNIVERYIDENGKERTREVEYLPTMFRHCKEESKYKDIYGKNCAPQKFPSMKDARDWMKRMEDIGLEA
LGMNDFKLAYISDTYGSEIVYDRKFVRVANCDIEVTGDKFPDPMKAEYEIDAITHYDSIDDRFYVFDLLNSMYGSVSKWD
AKLAAKLDCEGGDEVPQEILDRVIYMPFDNERDMLMEYINLWEQKRPAIFTGWNIEGFDVPYIMNRVKMILGERSMKRFS
PIGRVKSKLIQNMYGSKEIYSIDGVSILDYLDLYKKFAFTNLPSFSLESVAQHETKKGKLPYDGPINKLRETNHQRYISY
NIIDVESVQAIDKIRGFIDLVLSMSYYAKMPFSGVMSPIKTWDAIIFNSLKGEHKVIPQQGSHVKQSFPGAFVFEPKPIA
RRYIMSFDLTSLYPSIIRQVNISPETIRGQFKVHPIHEYIAGTAPKPSDEYSCSPNGWMYDKHQEGIIPKEIAKVFFQRK
DWKKKMFAEEMNAEAIKKIIMKGAGSCSTKPEVERYVKFSDDFLNELSNYTESVLNSLIEECEKAATLANTNQLNRKILI
NSLYGALGNIHFRYYDLRNATAITIFGQVGIQWIARKINEYLNKVCGTNDEDFIAAGDTDSVYVCVDKVIEKVGLDRFKE
QNDLVEFMNQFGKKKMEPMIDVAYRELCDYMNNREHLMHMDREAISCPPLGSKGVGGFWKAKKRYALNVYDMEDKRFAEP
HLKIMGMETQQSSTPKAVQEALEESIRRILQEGEESVQEYYKNFEKEYRQLDYKVIAEVKTANDIAKYDDKGWPGFKCPF
HIRGVLTYRRAVSGLGVAPILDGNKVMVLPLREGNPFGDKCIAWPSGTELPKEIRSDVLSWIDHSTLFQKSFVKPLAGMC
ESAGMDYEEKASLDFLFG
>P19822 2.7.7.7~~~~~~DNA polymerase~~~
MKIAVVDKALNNTRYDKHFQLYGEEVDVFHMCNEKLSGRLLKKHITIGTPENPFDPNDYDFVILVGAEPFLYFAGKKGIG
DYTGKRVEYNGYANWIASISPAQLHFKPEMKPVFDATVENIHDIINGREKIAKAGDYRPITDPDEAEEYIKMVYNMVIGP
VAFDSETSALYCRDGYLLGVSISHQEYQGVYIDSDCLTEVAVYYLQKILDSENHTIVFHNLKFDMHFYKYHLGLTFDKAH
KERRLHDTMLQHYVLDERRGTHGLKSLAMKYTDMGDYDFELDKFKDDYCKAHKIKKEDFTYDLIPFDIMWPYAAKDTDAT
IRLHNFFLPKIEKNEKLCSLYYDVLMPGCVFLQRVEDRGVPISIDRLKEAQYQLTHNLNKAREKLYTYPEVKQLEQDQNE
AFNPNSVKQLRVLLFDYVGLTPTGKLTDTGADSTDAEALNELATQHPIAKTLLEIRKLTKLISTYVEKILLSIDADGCIR
TGFHEHMTTSGRLSSSGKLNLQQLPRDESIIKGCVVAPPGYRVIAWDLTTAEVYYAAVLSGDRNMQQVFINMRNEPDKYP
DFHSNIAHMVFKLQCEPRDVKKLFPALRQAAKAITFGILYGSGPAKVAHSVNEALLEQAAKTGEPFVECTVADAKEYIET
YFGQFPQLKRWIDKCHDQIKNHGFIYSHFGRKRRLHNIHSEDRGVQGEEIRSGFNAIIQSASSDSLLLGAVDADNEIISL
GLEQEMKIVMLVHDSVVAIVREDLIDQYNEILIRNIQKDRGISIPGCPIGIDSDSEAGGSRDYSCGKMKKQHPSIACIDD
DEYTRYVKGVLLDAEFEYKKLAAMDKEHPDHSKYKDDKFIAVCKDLDNVKRILGA
>P00581 2.7.7.7~~~5~~~DNA-directed DNA polymerase~~~
MIVSDIEANALLESVTKFHCGVIYDYSTAEYVSYRPSDFGAYLDALEAEVARGGLIVFHNGHKYDVPALTKLAKLQLNRE
FHLPRENCIDTLVLSRLIHSNLKDTDMGLLRSGKLPGKRFGSHALEAWGYRLGEMKGEYKDDFKRMLEEQGEEYVDGMEW
WNFNEEMMDYNVQDVVVTKALLEKLLSDKHYFPPEIDFTDVGYTTFWSESLEAVDIEHRAAWLLAKQERNGFPFDTKAIE
ELYVELAARRSELLRKLTETFGSWYQPKGGTEMFCHPRTGKPLPKYPRIKTPKVGGIFKKPKNKAQREGREPCELDTREY
VAGAPYTPVEHVVFNPSSRDHIQKKLQEAGWVPTKYTDKGAPVVDDEVLEGVRVDDPEKQAAIDLIKEYLMIQKRIGQSA
EGDKAWLRYVAEDGKIHGSVNPNGAVTGRATHAFPNLAQIPGVRSPYGEQCRAAFGAEHHLDGITGKPWVQAGIDASGLE
LRCLAHFMARFDNGEYAHEILNGDIHTKNQIAAELPTRDNAKTFIYGFLYGAGDEKIGQIVGAGKERGKELKKKFLENTP
AIAALRESIQQTLVESSQWVAGEQQVKWKRRWIKGLDGRKVHVRSPHAALNTLLQSAGALICKLWIIKTEEMLVEKGLKH
GWDGDFAYMAWVHDEIQVGCRTEEIAQVVIETAQEAMRWVGDHWNFRCLLDTEGKMGPNWAICH
>P0C691 ~~~P~~~Protein P~~~
MPQPLKQSLDQSKWLREAEKQLRVLENLVDSNLEEEKLKPQLSMGEDVQSPGKGEPLHPNVRAPLSHVVRAVTTDLPRLG
NKLPARHHLGKLSGLYQMKGCTFNPEWKVPDISDTHFDLEVINECPSRNWKYLTPAKFWPKSISYFPVQAGVKPKYPDNV
MQHESIVGKYLTRLYEAGILYKRISKHLVTFKGQPYNWEQQHLVNQHQIPDGATSSKINGRQENRRRRTPIKSTCRQNDT
KRDSDMVGQVSNNRSRIRPCANNGGDKHPPATGSLACWGRKASRVIKSGSSRDSSASVDSRRRSKGPRGFSTLPRRETTG
NDHHSSDISNSVEATTRRRSTPGESITLGDSSIIPDGTSCASDKDSSPKEENVWYLRGNTSWPNRITGKLFLVDKNSRNT
TEARLVVDFSQFSKGKNAMRFPRYWSPNLSTLRRILPVGMPRISLDLSQAFYHLPLNPASSSRLAVSDGQWVYYFRKAPM
GVGLSPFLLHLFTTALGSEISRRFNVWTFTYMDDFLLCHPNARHLNSISHAVCSFLQELGIRINFDKTTPSPVTEIRFLG
YQIDEHFMKIEESRWKELRTVIKKIKVGEWYDWKCIQRFVGHLNFVLPFTKGNIEMLKPMYAAITNQVNFSFSSAYRTLL
YKLTMGVCKLRINPKSSVPLPRVATDATPTHGAISHITGGSAVFAFSKVRDIHIQELLMTCLARIMIKPRCLLSDSTFVC
HKRYQTLPWHFAVLAKQLLKPIQLYFVPSKYNPADGPSRHRPPDWTAFPYTPLSKAIYIPHRLCGT
>P03198 2.7.7.7~~~BALF5~~~DNA polymerase catalytic subunit~~~
MSGGLFYNPFLRPNKGLLKKPDKEYLRLIPKCFQTPGAAGVVDVRGPQPPLCFYQDSLTVVGGDEDGKGMWWRQRAQEGT
ARPEADTHGSPLDFHVYDILETVYTHEKCAVIPSDKQGYVVPCGIVIKLLGRRKADGASVCVNVFGQQAYFYASAPQGLD
VEFAVLSALKASTFDRRTPCRVSVEKVTRRSIMGYGNHAGDYHKITLSHPNSVCHVATWLQDKHGCRIFEANVDATRRFV
LDNDFVTFGWYSCRRAIPRLQHRDSYAELEYDCEVGDLSVRREDSSWPSYQALAFDIECLGEEGFPTATNEADLILQISC
VLWSTGEEAGRYRRILLTLGTCEDIEGVEVYEFPSELDMLYAFFQLIRDLSVEIVTGYNVANFDWPYILDRARHIYSINP
ASLGKIRAGGVCEVRRPHDAGKGFLRANTKVRITGLIPIDMYAVCRDKLSLSDYKLDTVARHLLGAKKEDVHYKEIPRLF
AAGPEGRRRLGMYCVQDSALVMDLLNHFVIHVEVAEIAKIAHIPCRRVLDDGQQIRVFSCLLAAAQKENFILPMPSASDR
DGYQGATVIQPLSGFYNSPVLVVDFASLYPSIIQAHNLCYSTMITPGEEHRLAGLRPGEDYESFRLTGGVYHFVKKHVHE
SFLASLLTSWLAKRKAIKKLLAACEDPRQRTILDKQQLAIKCTCNAVYGFTGVANGLFPCLSIAETVTLQGRTMLERAKA
FVEALSPANLQALAPSPDAWAPLNPEGQLRVIYGDTDSLFIECRGFSESETLRFADALAAHTTRSLFVAPISLEAEKTFS
CLMLITKKRYVGVLTDGKTLMKGVELVRKTACKFVQTRCRRVLDLVLADARVKEAASLLSHRPFQESFTQGLPVGFLPVI
DILNQAYTDLREGRVPMGELCFSTELSRKLSAYKSTQMPHLAVYQKFVERNEELPQIHDRIQYVFVEPKGGVKGARKTEM
AEDPAYAERHGVPVAVDHYFDKLLQGAANILQCLFDNNSGAALSVLQNFTARPPF
>P08546 2.7.7.7~~~~~~DNA polymerase catalytic subunit~~~
MFFNPYLSGGVTGGAVAGGRRQRSQPGSAQGSGKRPPQKQFLQIVPRGVMFDGQTGLIKHKTGRLPLMFYREIKHLLSHD
MVWPCPWRETLVGRVVGPIRFHTYDQTDAVLFFDSPENVSPRYRQHLVPSGNVLRFFGATEHGYSICVNVFGQRSYFYCE
YSDTDRLREVIASVGELVPEPRTPYAVSVTPATKTSIYGYGTRPVPDLQCVSISNWTMARKIGEYLLEQGFPVYEVRVDP
LTRLVIDRRITTFGWCSVNRYDWRQQGRASTCDIEVDCDVSDLVAVPDDSSWPRYRCLSFDIECMSGEGGFPCAEKSDDI
VIQISCVCYETGGNTAVDQGIPNGNDGRGCTSEGVIFGHSGLHLFTIGTCGQVGPDVDVYEFPSEYELLLGFMLFFQRYA
PAFVTGYNINSFDLKYILTRLEYLYKVDSQRFCKLPTAQGGRFFLHSPAVGFKRQYAAAFPSASHNNPASTAATKVYIAG
SVVIDMYPVCMAKTNSPNYKLNTMAELYLRQRKDDLSYKDIPRCFVANAEGRAQVGRYCLQDAVLVRDLFNTINFHYEAG
AIARLAKIPLRRVIFDGQQIRIYTSLLDECACRDFILPNHYSKGTTVPETNSVAVSPNAAIISTAAVPGDAGSVAAMFQM
SPPLQSAPSSQDGVSPGSGSNSSSSVGVFSVGSGSSGGVGVSNDNHGAGGTAAVSYQGATVFEPEVGYYNDPVAVFDFAS
LYPSIIMAHNLCYSTLLVPGGEYPVDPADVYSVTLENGVTHRFVRASVRVSVLSELLNKWVSQRRAVRECMRECQDPVRR
MLLDKEQMALKVTCNAFYGFTGVVNGMMPCLPIAASITRIGRDMLERTARFIKDNFSEPCFLHNFFNQEDYVVGTREGDS
EESSALPEGLETSSGGSNERRVEARVIYGDTDSVFVRFRGLTPQALVARGPSLAHYVTACLFVEPVKLEFEKVFVSLMMI
CKKRYIGKVEGASGLSMKGVDLVRKTACEFVKGVTRDVLSLLFEDREVSEAAVRLSRLSLDEVKKYGVPRGFWRILRRLV
QARDDLYLHRVRVEDLVLSSVLSKDISLYRQSNLPHIAVIKRLAARSEELPSVGDRVFYVLTAPGVRTAPQGSSDNGDSV
TAGVVSRSDAIDGTDDDADGGGVEESNRRGGEPAKKRARKPPSAVCNYEVAEDPSYVREHGVPIHADKYFEQVLKAVTNV
LSPVFPGGETARKDKFLHMVLPRRLHLEPAFLPYSVKAHECC
>P04293 2.7.7.7~~~~~~DNA polymerase catalytic subunit~~~
MFSGGGGPLSPGGKSAARAASGFFAPAGPRGASRGPPPCLRQNFYNPYLAPVGTQQKPTGPTQRHTYYSECDEFRFIAPR
VLDEDAPPEKRAGVHDGHLKRAPKVYCGGDERDVLRVGSGGFWPRRSRLWGGVDHAPAGFNPTVTVFHVYDILENVEHAY
GMRAAQFHARFMDAITPTGTVITLLGLTPEGHRVAVHVYGTRQYFYMNKEEVDRHLQCRAPRDLCERMAAALRESPGASF
RGISADHFEAEVVERTDVYYYETRPALFYRVYVRSGRVLSYLCDNFCPAIKKYEGGVDATTRFILDNPGFVTFGWYRLKP
GRNNTLAQPAAPMAFGTSSDVEFNCTADNLAIEGGMSDLPAYKLMCFDIECKAGGEDELAFPVAGHPEDLVIQISCLLYD
LSTTALEHVLLFSLGSCDLPESHLNELAARGLPTPVVLEFDSEFEMLLAFMTLVKQYGPEFVTGYNIINFDWPFLLAKLT
DIYKVPLDGYGRMNGRGVFRVWDIGQSHFQKRSKIKVNGMVNIDMYGIITDKIKLSSYKLNAVAEAVLKDKKKDLSYRDI
PAYYAAGPAQRGVIGEYCIQDSLLVGQLFFKFLPHLELSAVARLAGINITRTIYDGQQIRVFTCLLRLADQKGFILPDTQ
GRFRGAGGEAPKRPAAAREDEERPEEEGEDEDEREEGGGEREPEGARETAGRHVGYQGARVLDPTSGFHVNPVVVFDFAS
LYPSIIQAHNLCFSTLSLRADAVAHLEAGKDYLEIEVGGRRLFFVKAHVRESLLSILLRDWLAMRKQIRSRIPQSSPEEA
VLLDKQQAAIKVVCNSVYGFTGVQHGLLPCLHVAATVTTIGREMLLATREYVHARWAAFEQLLADFPEAADMRAPGPYSM
RIIYGDTDSIFVLCRGLTAAGLTAVGDKMASHISRALFLPPIKLECEKTFTKLLLIAKKKYIGVIYGGKMLIKGVDLVRK
NNCAFINRTSRALVDLLFYDDTVSGAAAALAERPAEEWLARPLPEGLQAFGAVLVDAHRRITDPERDIQDFVLTAELSRH
PRAYTNKRLAHLTVYYKLMARRAQVPSIKDRIPYVIVAQTREVEETVARLAALRELDAAAPGDEPAPPAALPSPAKRPRE
TPSPADPPGGASKPRKLLVSELAEDPAYAIAHGVALNTDYYFSHLLGAACVTFKALFGNNAKITESLLKRFIPEVWHPPD
DVAARLRTAGFGAVGAGATAEETRRMLHRAFDTLA
>P07917 2.7.7.7~~~~~~DNA polymerase catalytic subunit~~~
MFSGGGGPLSPGGKSAARAASGFFVPAGPRGAGRGPPPCLRQNFYNPYLAPVGTQQKPTGPTQRHTYYSECDEFRFIAPR
VLDEDAPPEKRAGVHDGHLKRAPKVYCGGDERDVLRVGSGGFWPRRSRLWGGVDHAPAGFNPTVTVFHVYDILENVEHAY
GMRAAQFHARFMDAITPTGTVITLLGLTPEGHRVAVHVYGTRQYFYMNKEEVDRHLQCRAPRDLCERMAAALRESPGASF
RGISADHFEAEVVERTDVYYYETRPALFYRVYVRSGRVLSYLCDNFCPAIKKYEGGVDATTRFILDNPGFVTFGWYRLKP
GRNNTLAQPRVPMAFGTSSDVEFNCTADNLAIEGGMSDLPAYKLMCFDIECKAGGEDELAFPVAGHPEDLVIQISCLLYD
LSTTALEHVLLFSLGSCDLPESHLTELAARGLPTPVVLEFDSEFEMLLAFMTLVKQYGPEFVTGYNIINFDWPFLLAKLT
DIYKVPLDGYGRMNGRGVFRVWDIGQSHFQKRSKIKVNGMVNIDMYGIITDKIKLSSYKLNAVAEAVLKDKKKDLSYRDI
PAYYAAGPAQRGVIGEYCIQDSLLVGQLFFKFLPHLELSAVARLAGINITRTIYDGQQIRVFTCLLRLADQKGFILPDTQ
GRFRGAGGEAPKRPAAAREDEERPEEEGEDEDEREEGGGEREPEGARETAGRHVGYQGARVLDPTSGFHVNPVVVFDFAS
LYPSIIQAHNLCFSTLSLRADAVAHLEAGKDYLEIEVGGRRLFFVKAHVRESLLSILLRDWLAMRKQIRSRIPQSSPEEA
VLLDKQQAAIKVVCNSVYGFTGVQHGLLPCLHVAATVTTIGREMLLATREYVHARWAAFEQLLADFPEAADMRAPGPYSM
RIIYGDTDSIFVLCRGLTAAGLTAVGDKMASHISRALFLPPIKLECEKTFTKLLLIAKKKYIGVIYGGKMLIKGVDLVRK
NNCAFINRTSRALVDLLFYDDTVSGAAAALAERPAEEWLARPLPEGLQAFGAVLVDAHRRITDPERDIQDFVLTAELSRH
PRAYTNKRLAHLTVYYKLMARRAQVPSIKDRIPYVIVAQTREVEETVARLAALRELDATAPGDEPAPPAALPCPAKRPRE
TPSHADPPGGASKPRKLLVSELAEDPAYAIAHGVALNTDYYFSHLLGVACVTFKALFGNNAKITESLLKRFIPEVWHPPD
DVAARLRAAGFGAVGAGATAEETRRMLHRAFDTLA
>P04292 2.7.7.7~~~~~~DNA polymerase catalytic subunit~~~
MFSGGGGPLSPGGKSAARAASGFFAPAGPRGAGRGPPPCLRQNFYNPYLAPVGTQQKPTGPTQRHTYYSECDEFRFIAPR
VLDEDAPPEKRAGVHDGHLKRAPKVYCGGDERDVLRVGSGGFWPRRSRLWGGVDHAPAGFNPTVTVFHVYDILENVEHAY
GMRAAQFHARFMDAITPTGTVITLLGLTPEGHRVAVHVYGTRQYFYMNKEEVDRHLQCRAPRDLCERMAAALRESPGASF
RGISADHFEAEVVERTDVYYYETRPALFYRVYVRSGRVLSYLCDNFCPAIKKYEGGVDATTRFILDNPGFVTFGWYRLKP
GRNNTLAQPRAPMAFGTSSDVEFNCTADNLAIEGGMSDLPAYKLMCFDIECKAGGEDELAFPVAGHPEDLVIQISCLLYD
LSTTALEHVLLFSLGSCDLPESHLNELAARGLPTPVVLEFDSEFEMLLAFMTLVKQYGPEFVTGYNIINFDWPFLLAKLT
DIYKVPLDGYGRMNGRGVFRVWDIGQSHFQKRSKIKVNGMVNIDMYGIITDKIKLSSYKLNAVAEAVLKDKKKDLSYRDI
PAYYATGPAQRGVIGEYCIQDSLLVGQLFFKFLPHLELSAVARLAGINITRTIYDGQQIRVFTCLLRLADQKGFILPDTQ
GRFRGAGGEAPKRPAAAREDEERPEEEGEDEDEREEGGGEREPEGARETAGRHVGYQGAKVLDPTSGFHVNPVVVFDFAS
LYPSIIQAHNLCFSTLSLRADAVAHLEAGKDYLEIEVGGRRLFFVKAHVRESLLSILLRDWLAMRKQIRSRIPQSSPEEA
VLLDKQQAAIKVVCNSVYGFTGVQHGLLPCLHVAATVTTIGREMLLATREYVHARWAAFEQLLADFPEAADMRAPGPYSM
RIIYGDTDSIFVLCRGLTAAGLTAMGDKMASHISRALFLPPIKLECEKTFTKLLLIAKKKYIGVIYGGKMLIKGVDLVRK
NNCAFINRTSRALVDLLFYDDTVSGAAAALAERPAEEWLARPLPEGLQAFGAVLVDAHRRITDPERDIQDFVLTAELSRH
PRAYTNKRLAHLTVYYKLMARRAQVPSIKDRIPYVIVAQTREVEETVARLAALRELDAAAPGDEPAPPAALPSPAKRPRE
TPSHADPPGGASKPRKLLVSELAEDPAYAIAHGVALNTDYYFSHLLGAACVTFKALFGNNAKITESLLKRFIPEVWHPPD
DVAARLRAAGFGAVGAGATAEETRRMLHRAFDTLA
>P20509 2.7.7.7~~~~~~DNA polymerase~~~
MDVRCINWFESHGENRFLYLKSRCRNGETVFIRFPHYFYYVVTDEIYQSLSPPPFNARPLGKMRTIDIDETISYNLDIKD
RKCSVADMWLIEEPKKRSIQNATMDEFLNISWFYISNGISPDGCYSLDEQYLTKINNGCYHCDDPRNCFAKKIPRFDIPR
SYLFLDIECHFDKKFPSVFINPISHTSYCYIDLSGKRLLFTLINEEMLTEQEIQEAVDRGCLRIQSLMEMDYERELVLCS
EIVLLRIAKQLLELTFDYVVTFNGHNFDLRYITNRLELLTGEKIIFRSPDKKEAVHLCIYERNQSSHKGVGGMANTTFHV
NNNNGTIFFDLYSFIQKSEKLDSYKLDSISKNAFSCMGKVLNRGVREMTFIGDDTTDAKGKAAAFAKVLTTGNYVTVDED
IICKVIRKDIWENGFKVVLLCPTLPNDTYKLSFGKDDVDLAQMYKDYNLNIALDMARYCIHDACLCQYLWEYYGVETKTD
AGASTYVLPQSMVFEYRASTVIKGPLLKLLLETKTILVRSETKQKFPYEGGKVFAPKQKMFSNNVLIFDYNSLYPNVCIF
GNLSPETLVGVVVSTNRLEEEINNQLLLQKYPPPRYITVHCEPRLPNLISEIAIFDRSIEGTIPRLLRTFLAERARYKKM
LKQATSSTEKAIYDSMQYTYKIVANSVYGLMGFRNSALYSYASAKSCTSIGRRMILYLESVLNGAELSNGMLRFANPLSN
PFYMDDRDINPIVKTSLPIDYRFRFRSVYGDTDSVFTEIDSQDVDKSIEIAKELERLINNRVLFNNFKIEFEAVYKNLIM
QSKKKYTTMKYSASSNSKSVPERINKGTSETRRDVSKFHKNMIKTYKTRLSEMLSEGRMNSNQVCIDILRSLETDLRSEF
DSRSSPLELFMLSRMHHSNYKSADNPNMYLVTEYNKNNPETIELGERYYFAYICPANVPWTKKLVNIKTYETIIDRSFKL
GSDQRIFYEVYFKRLTSEIVNLLDNKVLCISFFERMFGSKPTFYEA
>P06856 2.7.7.7~~~~~~DNA polymerase~~~
MDVRCINWFESHGENRFLYLKSRCRNGETVFIRFPHYFYYVVTDEIYQSLSPPPFNARPLGKMRTIDIDETISYNLDIKD
RKCSVADMWLIEEPKKRSIQNATMDEFLNISWFYISNGISPDGCYSLDEQYLTKINNGCYHCDDPRNCFAKKIPRFDIPR
SYLFLDIECHFDKKFPSVFINPISHTSYCYIDLSGKRLLFTLINEEMLTEQEIQEAVDRGCLRIQSLMEMDYERELVLCS
EIVLLRIAKQLLELTFDYVVTFNGHNFDLRYITNRLELLTGEKIIFRSPDKKEAVYLCIYERNQSSHKGVGGMANTTFHV
NNNNGTIFFDLYSFIQKSEKLDSYKLDSISKNAFSCMGKVLNRGVREMTFIGDDTTDAKGKAAAFAKVLTTGNYVTVDED
IICKVIRKDIWENGFKVVLLCPTLPNDTYKLSFGKDDVDLAQMYKDYNLNIALDMARYCIHDACLCQYLWEYYGVETKTD
AGASTYVLPQSMVFEYRASTVIKGPLLKLLLETKTILVRSETKQKFPYEGGKVFAPKQKMFSNNVLIFDYNSLYPNVCIF
GNLSPETLVGVVVSTNRLEEEINNQLLLQKYPPPRYITVHCEPRLPNLISEIAIFDRSIEGTIPRLLRTFLAERARYKKM
LKQATSSTEKAIYDSMQYTYKIVANSVYGLMGFRNSALYSYASAKSCTSIGRRMILYLESVLNGAELSNGMLRFANPLSN
PFYMDDRDINPIVKTSLPIDYRFRFRSVYGDTDSVFTEIDSQDVDKSIEIAKELERLINNRVLFNNFKIEFEAVYKNLIM
QSKKKYTTMKYSASSNSKSVPERINKGTSETRRDVSKFHKNMIKTYKTRLSEMLSEGRMNSNQVCIDILRSLETDLRSEF
DSRSSPLELFMLSRMHHSNYKSADNPNMYLVTEYNKNNPETIELGERYYFAYICPANVPWTKKLVNIKTYETIIDRSFKL
GSDQRIFYEVYFKRLTSEIVNLLDNKVLCISFFERMFGSKPTFYEA
>P0DOO5 2.7.7.7~~~~~~DNA polymerase~~~
MDVRCINWFESHGENRFLYLKSRCRNGETVFIRFPHYFYYVVTDEIYQSLAPPPFNARPMGKMRTIDIDETISYNLDIKD
RKCSVADMWLIEEPKKRNIQNATMDEFLNISWFYISNGISPDGCYSLDDQYLTKINNGCYHCGDPRNCFAKEIPRFDIPR
SYLFLDIECHFDKKFPSVFINPISHTSYCYIDLSGKRLLFTLINEEMLTEQEIQEAVDRGCLRIQSLMEMDYERELVLCS
EIVLLQIAKQLLELTFDYIVTFNGHNFDLRYITNRLELLTGEKIIFRSPDKKEAVHLCIYERNQSSHKGVGGMANTTFHV
NNNNGTIFFDLYSFIQKSEKLDSYKLDSISKNAFSCMGKVLNRGVREMTFIGDDTTDAKGKAAVFAKVLTTGNYVTVDDI
ICKVIHKDIWENGFKVVLSCPTLTNDTYKLSFGKDDVDLAQMYKDYNLNIALDMARYCIHDACLCQYLWEYYGVETKTDA
GASTYVLPQSMVFGYKASTVIKGPLLKLLLETKTILVRSETKQKFPYEGGKVFAPKQKMFSNNVLIFDYNSLYPNVCIFG
NLSPETLVGVVVSSNRLEEEINNQLLLQKYPPPRYITVHCEPRLPNLISEIAIFDRSIEGTIPRLLRTFLAERARYKKML
KQATSSTEKAIYDSMQYTYKIIANSVYGLMGFRNSALYSYASAKSCTSIGRRMILYLESVLNGAELSNGMLRFANPLSNP
FYMDDRDINPIVKTSLPIDYRFRFRSVYGDTDSVFTEIDSQDVDKSIEIAKELERLINSRVLFNNFKIEFEAVYKNLIMQ
SKKKYTTMKYSASSNSKSVPERINKGTSETRRDVSKFHKNMIKIYKTRLSEMLSEGRMNSNQVCIDILRSLETDLRSEFD
SRSSPLELFMLSRMHHLNYKSADNPNMYLVTEYNKNNPETIELGERYYFAYICPANVPWTKKLVNIKTYETIIDRSFKLG
SDQRIFYEVYFKRLTSEIVNLLDNKVLCISFFERMFGSRPTFYEA
>A0A2H5BHJ5 ~~~dpoZ~~~DNA polymerase DpoZ~~~
MAFPNIEQYPQISVDCESTGLEWYKEDRAFGVSIYLPTGDAEYYDIRKDRNAFHWMKDNLWKAKKIVNHNIKFDIHMLRA
TGINLNPANCECTMIRAALIDEHLLKYDLDSLLKKYLKMSKDNDIYADLAQIFGGQPTRKVQILNLHRAPVDLVARYANI
DTEGAYKLWEWQEGEIERQDLHQVWQLERRLFRHIVEMERRGIRIDPNEANRRAEELDRVTAETVAELNRLAGFEVNPNP
SGSIKKLFNPKQNEAGIWVARDGTPLPKTDSGAPSLGAKSLESMTDPCAKLILKARKLNKTKDTFIRGHVLGHAIQNGQD
WFVHPNINQTKSDTGDGSEGTGTGRLSYTRPALQQIPSRDKEIASIVRPIFLPDRGQKWSYGDLDQHEFRIFAHYANPKD
IIEAYAQNPDLDMHQIVADLTGMPRSATKAGEANAKQINLGMVFNMGAGELASQMGLPFTIESVDFGDHVHDLKKAGPET
LEIVENYYAKVQGVKEMARKARTIAKSRGYVRTLMGRHIRFPRGMFTYKASGLIFQGTAGDLNKLNICNIAEYLESECPY
NRLLLNIHDEYSVSLEDDGKEIKHLKELQGLVQHRPELRVPIRIDFSHPAPNWWLATRADLATK
>G3FFN8 ~~~dpoZ~~~DNA polymerase DpoZ~~~
MKLDWERTGRRMGFIDLSKYEVWSYDTECTGLQYKVDKVFGFSIATPDGQSGYFDVREQPESLQWLAEQVEPYKGTIVCH
NASFDYRMSLHSGIKLPLSQIDDTGIRACCINEHESTIFPWTRGRAGDYSLDYLAKKYVGAQKYAEIYDELAALFGGKAT
RKTQMPNLYRAPSGLVRKYACPDAELTLELWLEQEELIKKRGLERIVAFERKVMPTLIRTEARGVRVDLDYAEQAIFKMD
GVVRENQAKMFALAGREFNPNSPKQVREVFGAKEEGGVWKSRDGTILERTATGNPCLDADALRSMTDPLAAAVLELRSNI
KTKDTFLAKHVVEHSVGGRVYPNINQMKGEDGGTGTGRLSYTGPALQQIPSRNKRIAAIIKPAFLPEEGQLWLDSDMASF
EVRIFAHLVAAYNPAIAKAYAENPELDLHQWVGDLMGIPRNASYSGQPNAKQMNLGMIFNRGDGAVADSLGMPWEWCEFT
DKKGELIRYKKAGREAKSIIAAYHSQIQGVKTLATRAQKIAEERGWIQTAHGRRLRFPNGYKSYKASGILIQATAADENK
ENWLRIEDALGSDGSMILNTHDSYSMSVDENWKPIWERVKKAVERQTLRVPLLLEFDGVGKNWAEAKGLIDVH
>Q08FX8 ~~~~~~Apoptosis regulator DPV022~~~
MEAAIEFDEIVKKLLNIYINDICTMGEKRLLNNYEKSILDRIYKSCEYIKKNYELDFNSMYNQININDITTSDIKSKIIE
SLLIDSRPSVKLATLSFISLIAEKWGEKNRTKIMEILSNEIVEKISNNGKDFIDFIDRDDDDIVDDYVLITNYLKITIFG
AILGITAYYICKYLLKSIF
>Q89743 ~~~DR7L~~~Uncharacterized protein DR7~~~
MRHLPFHGMPLRVQMFCAFFIRSETTDKNKATPTITFMVSCCFVWVKRLFYRVGRIHHVQSLTYARPITALDSCLYVCCG
YGEKLQPVGFVKSYVTNSQLDTLRVLLVGKDGAVYVHHMRAARLCRLASSTTEFTRRGLQRDAVTYEEDLELPDQRMCGT
NARHLFDVIAAAADEHNLLTVGGLCQTHAGVSCNLLETVGDPWTAVPAARMTLTVPQVQYRLWPEARRDLRRHLYAGHPL
GPWLVCGVLSRERETQKPSPPIRTTVGNVPTPGPREVEIAWVVLTLAGPLLAFWPDTGKICRLANSFSTLWKMGPRAMRG
HWTYSAPGRHLPGDAWPLCEHVRPQWKASEEESVPGLDATEVK
>P13320 ~~~dsbA~~~Double-stranded DNA-binding protein~~~
MAKKEMVEFDEAIHGEDLAKFIKEASDHKLKISGYNELIKDIRIRAKDELGVDGKMFNRLLALYHKDNRDVFEAETEEVV
ELYDTVFSK
>P00588 2.4.2.36~~~~~~Diphtheria toxin~~~
MLVRGYVVSRKLFASILIGALLGIGAPPSAHAGADDVVDSSKSFVMENFSSYHGTKPGYVDSIQKGIQKPKSGTQGNYDD
DWKGFYSTDNKYDAAGYSVDNENPLSGKAGGVVKVTYPGLTKVLALKVDNAETIKKELGLSLTEPLMEQVGTEEFIKRFG
DGASRVVLSLPFAEGSSSVEYINNWEQAKALSVELEINFETRGKRGQDAMYEYMAQACAGNRVRRSVGSSLSCINLDWDV
IRDKTKTKIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEFHQTALEHPELSELKTVTGTNPVFAGANYAAWAVNVAQV
IDSETADNLEKTTAALSILPGIGSVMGIADGAVHHNTEEIVAQSIALSSLMVAQAIPLVGELVDIGFAAYNFVESIINLF
QVVHNSYNRPAYSPGHKTQPFLHDGYAVSWNTVEDSIIRTGFQGESGHDIKITAENTPLPIAGVLLPTIPGKLDVNKSKT
HISVNGRKIRMRCRAIDGDVTFCRPKSPVYVGNGVHANLHVAFHRSSSEKIHSNEISSDSIGVLGYQKTVDHTKVNSKLS
LFFEIKS
>P0DTK3 2.1.2.-~~~~~~Deoxyuridylate hydroxymethyltransferase~~~
MKVIKTRNVQQALPEALYQLSFEGVRRDSRNGPVFMFPEPVTTVYLRPAERVLFWAERDANPFFHLMESLWMLGGRNDVE
YVARFVDRMRSYSDDGLTFHGAYGFRWRQHFFEDQLPKIIAALKENRDDRRQVLSMWDADADLGRQGKDLPCNLQAIFQI
ACDGRLDMTVTNRSNDLIWGAYGANAVHFSYLHEYVARSVGVEQGIYRQVSANFHAYEEVLNKVAPLADLAANPMTGTET
PDPYAAGIAEPYPLMSTDPEEWNQELMMFLSEPDAVGFRDPFFRRVAIPMMKAHKAFKQTSNPSRFDAALAELDNVAATD
WKLAGVEWIERRRAAFEARKARAMDDGVAYE
>P31654 2.1.2.-~~~~~~Deoxyuridylate hydroxymethyltransferase~~~
MYTFSGSKPTQLYMDILSTVIKEGDVLAPRGKRIKEIRPVMIEFKNPIRRTTFLKGRNINPFFQVAESLWILAGRSDVGF
LLDYNKNMGQFSDDGVFFNAPYGERLRFWNRSDANNFIYNPLDQLRDVYEKIKADPDTRQAVAVIYNPLFDNINNDTKDR
PCNLLLSFKLRNGKLDLSVYNRSNDLHWGTFGANLCQFSTILEAMATWLGVEVGSYYQITDSLHVYLDDYGAKITEDIQK
VYGVNLETDEAPVVEDFEELLDPNPRMSYTMNELNQFLNEYFGIVDSLMRDDETYMHDGDCAQMLLNQIRHEPDEYIKVT
LLLMVAKQALNRGMKDTVATAMNWIPLCEIKVSALRFLYKSYPEYINQYDLDEKLMKYIRREE
>C9DGK1 2.1.2.-~~~~~~Deoxyuridylate hydroxymethyltransferase~~~
MNPFNIISDTGVSAAVYAGLSRAKCTGQYKECRNGGSTFLRNVKFHITDPRNRNLTLNGRKSNIFQMVAETFWVMSGSGN
IKEFLEFFLPRAPQYSDDGINWHGAYGPRMYAHNQLQSAIDLLIKDKDTRRAYVMIADPTLDSAPAIEAAYGVGHSPKDV
PCNREIHINIIEDKLCMKVIQRSGDMLFGTGSINPFEFTFLQELLSEATGYALGDYQWDVTDAHYYKAFEDQVNDVLRSE
QTFWPNDGKPLGTRFTSATKMQEFFAGVVRVWVKQINRLIDLGDAHYAINDLFADYGVLPEGRLRDYAKMVTFYIAAKQG
EIEGDFKALLTNIPTNTDLGQAILTSPFRKFGVVLGD
>Q9J592 3.1.3.-~~~~~~Probable dual specificity protein phosphatase H1 homolog~~~
MDEKQLYKHIITKSTNTCVKFTPRDITKITDYVYLGNYRNVIELPNKTFFKYIVNVSMLKYKLKRTDITVLHFPLEDNDT
VSISKHIDAVTYVLKKCESLKIPVLVHCMAGINRSSAMIMGYLMEIRDKNIPFVIYFLYIYHELKYIRGAFIENKSFLNQ
IIDKYI
>Q85297 3.1.3.-~~~~~~Probable dual specificity protein phosphatase H1 homolog~~~
MDKKSLYENVLLKSTGALPKARVPTKMMRVTDYVYLGNYNDAKAAPTSGIGFKYILNLTTEKYTIKNSSITIIHMPLVDD
EYTDLTKYFDYATTFLSNCEDKHYPVLVHCMAGVNRSGAIIMAYLMSRKSKDIPAFMYFLYIYHSIREQRGAFLENPSFR
RQIIEKYIINEHKLKLFG
>P80994 3.1.3.-~~~~~~Probable dual specificity protein phosphatase OPG106~~~
MDKKSLYKYLLLRSTGDIHRAKSPTMMTRVTNNVYLGNYKNAMEAPSSEVKFKYILNLTMDKYSFTNSNINIIHVPMVDD
TSTDISIYFDDITAFLSKCDQRNEPVLVHCAAGVNRSGAMILAYLMSKNKESSPMLYFLYVYHSMRDLRGAFVENPSFKR
QIIEKYVIDKN
>P20495 3.1.3.-~~~~~~Dual specificity protein phosphatase OPG106~~~
MDKKSLYKYLLLRSTGDMHKAKSPTIMTRVTNNVYLGNYKNAMDAPSSEVKFKYVLNLTMDKYTLPNSNINIIHIPLVDD
TTTDISKYFDDVTAFLSKCDQRNEPVLVHCAAGVNRSGAMILAYLMSKNKESSPMLYFLYVYHSMRDLRGAFVENPSFKR
QIIEKYVIDKN
>P07239 3.1.3.-~~~~~~Dual specificity protein phosphatase OPG106~~~
MDKKSLYKYLLLRSTGDMHKAKSPTIMTRVTNNVYLGNYKNAMDAPSSEVKFKYVLNLTMDKYTLPNSNINIIHIPLVDD
TTTDISKYFDDVTAFLSKCDQRNEPVLVHCAAGVNRSGAMILAYLMSKNKESLPMLYFLYVYHSMRDLRGAFVENPSFKR
QIIEKYVIDKN
>P0DOQ5 3.1.3.-~~~~~~Dual specificity protein phosphatase OPG106~~~
MDKKSLYKYLLLRSTGDMRRAKSPTIMTRVTNNVYLGNYKNAMNAPSSEVKFKYVLNLTMDKYTLPNSNINIIHIPLVDD
TTTDISKYFDDVTAFLSKCDQRNEPVLVHCVAGVNRSGAMILAYLMSKNKESSPMLYFLYVYHSMRDLRGAFVENPSFKR
QIIEKYVIDKN
>P0DOQ6 3.1.3.-~~~H1L~~~Dual specificity protein phosphatase H1~~~
MDKKSLYKYLLLRSTGDMRRAKSPTIMTRVTNNVYLGNYKNAMNAPSSEVKFKYVLNLTMDKYTLPNSNINIIHIPLVDD
TTTDISKYFDDVTAFLSKCDQRNEPVLVHCVAGVNRSGAMILAYLMSKNKESSPMLYFLYVYHSMRDLRGAFVENPSFKR
QIIEKYVIDKN
>Q65199 3.6.1.23~~~~~~Deoxyuridine 5'-triphosphate nucleotidohydrolase~~~
MATNFFIQPITQEAEAYYPPSVITNKRKDLGVDVYCCSDLVLQPGLNIVRLHIKVACEHMGKKCGFKIMARSSMCTHERL
LILANGIGLIDPGYVGELMLKIINLGDTPVQIWAKECLVQLVAQGDHVPDHINILKRNQIFPLFAPTPRGEGRFGSTGEA
GIMRT
>O48500 3.6.1.23~~~DUT~~~Deoxyuridine 5'-triphosphate nucleotidohydrolase~~~
MIKIKLTHPDCMPKIGSEDAAGMDLRAFFGTNPAADLRAIAPGKSLMIDTGVAVEIPRGWFGLVVPRSSLGKRHLMIANT
AGVIDSDYRGTIKMNLYNYGSEMQTLENFERLCQLVVLPHYSTHNFKIVDELEETIRGEGGFGSSGSK
>P03195 3.6.1.23~~~DUT~~~Deoxyuridine 5'-triphosphate nucleotidohydrolase~~~
MEACPHIRYAFQNDKLLLQQASVGRLTLVNKTTILLRPMKTTTVDLGLYARPPEGHGLMLWGSTSRPVTSHVGIIDPGYT
GELRLILQNQRRYNSTLRPSELKIHLAAFRYATPQMEEDKGPINHPQYPGDVGLDVSLPKDLALFPHQTVSVTLTVPPPS
IPHHRPTIFGRSGLAMQGILVKPCRWRRGGVDVSLTNFSDQTVFLNKYRRFCQLVYLHKHHLTSFYSPHSDAGVLGPRSL
FRWASCTFEEVPSLAMGDSGLSEALEGRQGRGFGSSGQ
>P17374 3.6.1.23~~~~~~Deoxyuridine 5'-triphosphate nucleotidohydrolase~~~
MFNMNINSPVRFVKETNRAKSPTRQSPYAAGYDLYSAYDYTIPPGERQLIKTDISMSMPKFCYGRIAPRSGLSLKGIDIG
GGVIDEDYRGNIGVILINNGKCTFNVNTGDRIAQLIYQRIYYPELEEVQSLDSTNRGDQGFGSTGLR
>P04382 1.5.1.3~~~frd~~~Dihydrofolate reductase~~~
MIKLVFRYSPTKTVDGFNELAFGLGDGLPWGRVKKDLQNFKARTEGTIMIMGAKTFQSLPTLLPGRSHIVVCDLARDYPV
TKDGDLAHFYITWEQYITYISGGEIQVSSPNAPFETMLDQNSKVSVIGGPALLYAALPYADEVVVSRIVKRHRVNSTVQL
DASFLDDISKREMVETHWYKIDEVTTLTESVYK
>Q9J5C8 ~~~~~~Protein E11 homolog~~~
MELVNILLESESERVKLYYDIPPKKSLRTKCEVDRAVKYFISVIKKYIKLKESTFYVVVKDTTLFTYKYDKGELTPVDNT
YYTFSKELASTDYSSSEITSICFTITDDMSISVKPKTGYIVKVRSDNSRYY
>P24795 ~~~~~~Uncharacterized protein gp14~~~
MNNETKFTPKDLDEELVKAKMLERMHDVIETAISKGFSAREALEIMTREIHLIRDEVLLHNKKAHNNIVCRELGVDDSAV
IPQRQYLCALMRGSRH
>P03254 ~~~~~~Early E1A protein~~~
MRHIICHGGVITEEMAASLLDQLIEEVLADNLPPPSHFEPPTLHELYDLDVTAPEDPNEEAVSQIFPDSVMLAVQEGIDL
LTFPPAPGSPEPPHLSRQPEQPEQRALGPVSMPNLVPEVIDLTCHEAGFPPSDDEDEEGEEFVLDYVEHPGHGCRSCHYH
RRNTGDPDIMCSLCYMRTCGMFVYSPVSEPEPEPEPEPEPARPTRRPKLVPAILRRPTSPVSRECNSSTDSCDSGPSNTP
PEIHPVVPLCPIKPVAVRVGGRRQAVECIEDLLNESGQPLDLSCKRPRP
>P10407 ~~~~~~Early E1A protein~~~
MRHLRDLPDEEIIIASGSEILELVVNATMGDDHPEPPTPFGTPSLHDLYDLEVDVPEDDPNEKAVNDLFSDAALLAAEEA
SSPSSDSDSSLHTPRHDRGEKEIPGLKWEKMDLRCYEECLPPSDDEDEQAIQNAASHGVQAVSESFALDCPPLPGHGCKS
CEFHRINTGDKAVLCALCYMRAYNHCVYSPVSDADDETPTTESTLSPPEIGTSPSDNIVRPVPVRATGRRAAVECLDDLL
QGGDEPLDLCTRKRPRH
>P03255 ~~~~~~Early E1A protein~~~
MRHIICHGGVITEEMAASLLDQLIEEVLADNLPPPSHFEPPTLHELYDLDVTAPEDPNEEAVSQIFPDSVMLAVQEGIDL
LTFPPAPGSPEPPHLSRQPEQPEQRALGPVSMPNLVPEVIDLTCHEAGFPPSDDEDEEGEEFVLDYVEHPGHGCRSCHYH
RRNTGDPDIMCSLCYMRTCGMFVYSPVSEPEPEPEPEPEPARPTRRPKMAPAILRRPTSPVSRECNSSTDSCDSGPSNTP
PEIHPVVPLCPIKPVAVRVGGRRQAVECIEDLLNEPGQPLDLSCKRPRP
>P03259 ~~~~~~Early E1A protein~~~
MRTEMTPLVLSYQEADDILEHLVDNFFNEVPSDDDLYVPSLYELYDLDVESAGEDNNEQAVNEFFPESLILAASEGLFLP
EPPVLSPVCEPIGGECMPQLHPEDMDLLCYEMGFPCSDSEDEQDENGMAHVSASAAAAAADREREEFQLDHPELPGHNCK
SCEHHRNSTGNTDLMCSLCYLRAYNMFIYSPVSDNEPEPNSTLDGDERPSPPKLGSAVPEGVIKPVPQRVTGRRRCAVES
ILDLIQEEEREQTVPVDLSVKRPRCN
>P10541 ~~~~~~Early E1A protein~~~
MRMLPDFFTGNWDDMFQGLLETEYVFDFPEPSEASEEMSLHDLFDVEVDGFEEDANQEAVDGMFPERLLSEAESAAESGS
GDSGVGEELLPVDLDLKCYEDGLPPSDPETDEATEAEEEAAMPTYVNENENELVLDCPENPGRGCRACDFHRGTSGNPEA
MCALCYMRLTGHCIYSPISDAEGESESGSPEDTDFPHPLTATPPHGIVRTIPCRVSCRRRPAVECIEDLLEEDPTDEPLN
LSLKRPKCS
>P10542 ~~~~~~Early E1A protein~~~
MRMLPDFFTGNWDDMFQGLLEAEHPFDFPEPSQAFEEISLHNLFDVELDESEGDPNEEAVDGMFPNWMLSEDHSADSGAA
SGDSGVGEDLVEVNLDLKCYEEGLPPSGSEADEAEERAEEEETAVSNYVNIAEGASQLVLDCPENPGRGCRACDFHRGSS
GNPEAMCALCYMRLTGHCIYSPISDAEGECELGSNEETELPCSLTATAPVRPTPCRVSCRRRPAVDCIEDLLEEDPTDEP
LNLSLKRPKSS
>P03244 ~~~E1B~~~E1B 55 kDa protein~~~
MERRNPSERGVPAGFSGHASVESGGETQESPATVVFRPPGNNTDGGATAGGSQAAAAAGAEPMEPESRPGPSGMNVVQVA
ELFPELRRILTINEDGQGLKGVKRERGASEATEEARNLTFSLMTRHRPECVTFQQIKDNCANELDLLAQKYSIEQLTTYW
LQPGDDFEEAIRVYAKVALRPDCKYKISKLVNIRNCCYISGNGAEVEIDTEDRVAFRCSMINMWPGVLGMDGVVIMNVRF
TGPNFSGTVFLANTNLILHGVSFYGFNNTCVEAWTDVRVRGCAFYCCWKGVVCRPKSRASIKKCLFERCTLGILSEGNSR
VRHNVASDCGCFMLVKSVAVIKHNMVCGNCEDRASQMLTCSDGNCHLLKTIHVASHSRKAWPVFEHNILTRCSLHLGNRR
GVFLPYQCNLSHTKILLEPESMSKVNLNGVFDMTMKIWKVLRYDETRTRCRPCECGGKHIRNQPVMLDVTEELRPDHLVL
ACTRAEFGSSDEDTD
>P03243 ~~~~~~E1B 55 kDa protein~~~
MERRNPSERGVPAGFSGHASVESGCETQESPATVVFRPPGDNTDGGAAAAAGGSQAAAAGAEPMEPESRPGPSGMNVVQV
AELYPELRRILTITEDGQGLKGVKRERGACEATEEARNLAFSLMTRHRPECITFQQIKDNCANELDLLAQKYSIEQLTTY
WLQPGDDFEEAIRVYAKVALRPDCKYKISKLVNIRNCCYISGNGAEVEIDTEDRVAFRCSMINMWPGVLGMDGVVIMNVR
FTGPNFSGTVFLANTNLILHGVSFYGFNNTCVEAWTDVRVRGCAFYCCWKGVVCRPKSRASIKKCLFERCTLGILSEGNS
RVRHNVASDCGCFMLVKSVAVIKHNMVCGNCEDRASQMLTCSDGNCHLLKTIHVASHSRKAWPVFEHNILTRCSLHLGNR
RGVFLPYQCNLSHTKILLEPESMSKVNLNGVFDMTMKIWKVLRYDETRTRCRPCECGGKHIRNQPVMLDVTEELRPDHLV
LACTRAEFGSSDEDTD
>P10546 ~~~~~~E1B 55 kDa protein~~~
MERPNPSVGGIYSGLHDNGPVENPAAEEEGLRLLAGAASARSGSSAGGGGGGGGGGEPEGRSGSSNGIVTEPDPEEGTSS
GQRGEKRKLENDGADFLKELTLSLMSRCYPESVWWADLEDEFKNGNMNLLYKYGFEQLKTHWMEPWEDWELALNMFAKVA
LRPDTIYTIKKTVNIRKCAYVIGNGAVVRFQTFDRVVFNCAMQSLGPGVIGMSGVTFNNVRFAADGFNGKVFASTTQLTL
HGVFFQNCSGVCVDSWGRVSARGCTFVGCWKGLVGQNKSQMSVKKCVFERCILAMVVEGQARIRHNAGSENVCFLLLKGT
ASVKHNMICGTGHSQLLTCADGNCQTLKVIHVVSHQRRPWPVFEHNMLMRCTMHLGARRGMFSPYQSNFCHTKVLMETDA
FSRVWWSGVFDLTIELYKVVRYDELKARCRPCECGANHIRLYPATLNVTEQLRTDHQMLSCLRTDYESSDED
>P03247 ~~~E1B~~~E1B protein, small T-antigen~~~
MEAWECLEDFSAVRNLLEQSSNSTSWFWRFLWGSSQAKLVCRIKEDYKWEFEELLKSCGELFDSLNLGHQALFQEKVIKT
LDFSTPGRAAAAVAFLSFIKDKWSEETHLSGGYLLDFLAMHLWRAVVRHKNRLLLLSSVRPAIIPTEEQQQEEARRRRRQ
EQSPWNPRAGLDPRE
>P03246 ~~~~~~E1B protein, small T-antigen~~~
MEAWECLEDFSAVRNLLEQSSNSTSWFWRFLWGSSQAKLVCRIKEDYKWEFEELLKSCGELFDSLNLGHQALFQEKVIKT
LDFSTPGRAAAAVAFLSFIKDKWSEETHLSGGYLLDFLAMHLWRAVVRHKNRLLLLSSVRPAIIPTEEQQQQQEEARRRR
QEQSPWNPRAGLDPRE
>P10544 ~~~~~~E1B protein, small T-antigen~~~
MEFWSELQSYQSLRRLLELASARTSSCWRFIFGSTLTNVIYRAKEDYSSRFAELLSFNPGIFASLNLGHHSFFQEIVIKN
LDFSSPGRTVSGLAFICFILDQWSAQTHLSEGYTLDYMTMALWRTLLRRKRVLGCSPAQPPHGLDPVREEEEEEEEEENL
RAGLDPQTEL
>Q65200 ~~~~~~Inner membrane protein pE248R~~~
MGGSTSKNSFKNTTNIISNSIFNQMQNCISMLDGKNYIGVFGDGNILNHVFQDLNLSLDTSCVQKHVNEENFITNLSNQI
TQNLKDQEVALTQWMDAGHHDQKTDIEENIKVNLTTTLIQNCVSSLSGMNVLVVKGNGNIVENATQKQSQQIISNCLQGS
KQAIDTTTGITNTVNQYSHYTSKNFFDFIADAISAVFKNIMVAAVVIVLIIVGFIAVFYFLHSRHRHEEEEEAEPLISNK
VLKNAAVS
>P41483 ~~~~~~Virus envelope protein E25~~~
MWGIVLLIVLLILFYLYWTNALNFNSLTESSPSLGQSSDSVELDENKQLNVKLNNGRVANLRIAHGDNKLSQVYIAEKPL
SIDDIVKEGSNKVGTNSVFLGTVYDYGIKSPNAASTSSNVTMTRGAANFDIKEFKSMFIVFKGVTPTKTVEDNGMLRFEV
DNMIVCLIDPNTAPLSEREVRELRKSNCTLVYTRNAAAQQVLLENNFTVINAEQTAYLKNYKSYREMN
>P12827 ~~~~~~Protein E26~~~
MESVQTRLCASSNQFAPFKKRQLAVPVGSVNSLTHTITSTTVTSVIPKNYQEKRQKICHIISSLRNTHLNFNKIQSVHKK
KLRHLQNLLRKKNEIIAELVRKLESAQKKTTHRNISKPAHWKYFGVVRCDNTIRTIIGNEKFVRRRLAELCTLYNAEYVF
CQARADGDKDRQALASLLTAAFGSRVIVYENSRRFEFINPDEIASGKRLIIKHLQDESQSDINAY
>P41702 ~~~~~~Occlusion-derived virus envelope protein E27~~~
MKRIKCNKVRTVTEIVNSDEKIQKTYELAEFDLKNLSSLESYETLKIKLALSKYMAMLSTLEMTQPLLEIFRNKADTRQI
AAVVFSTLAFIHNRFHPLVTNFTNKMEFVVTETNDTSIPGEPILFTENEGVLLCSVDRPSIVKMLSREFDTEALVNFEND
NCNVRIAKTFGASKRKNTTRSDDYESNKQPNYDMDLSDFSITEVEATQYLTLLLTVEHAYLHYYIFKNYGVFEYCKSLTD
HSLFTNKLRSTMSTKTSNLLLSKFKFTIEDFDKINSNSVTSGFNIYNFNK
>P27311 ~~~~~~Early E3A 12.5 kDa protein~~~
MTSGEAERLRLTHLDHCRRHKCFARGSGEFCYFELPEEHIEGPAHGVRLTTQVELTRSLIREFTKRPLLVERERGPCVLT
VVCNCPNPGLHQDLCCHLCAEYNKYRN
>P68976 ~~~~~~Early 3 14.7 kDa protein~~~
MTESLDLELDGINTEQRLLERRKAASERERLKQEVEDMVNLHQCKRGIFCVVKQAKLTYEKTTTGNRLSYKLPTQRQKLV
LMVGEKPITVTQHSAETEGCLHFPYQGPEDLCTLIKTMCGIRDLIPFN
>P06498 ~~~~~~Early E3B 14.6 kDa protein~~~
MKFTVTFLLIICTLSAFCSPTSKPQRHISCRFTRIWNIPSCYNEKSDLSEAWLYAIISVMVFCSTILALAIYPYLDIGWK
RIDAMNHPTFPAPAMLPLQQVVAGGFVPANQPRPTSPTPTEISYFNLTGGDD
>Q910M3 ~~~~~~Early 3 Conserved Region 1-alpha protein~~~
MSNSSNSTSLSNFSGIGVGVILTLVILFILILALLCLRVAACCTHVCTYCQLFKRWGQHPR
>P68978 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MRYMILGLLALAAVCSAAKKVEFKEPACNVTFKSEANECTTLIKCTTEHEKLIIRHKDKIGKYAVYAIWQPGDTNDYNVT
VFQGENRKTFMYKFPFYEMCDITMYMSKQYKLWPPQKCLENTGTFCSTALLITALALVCTLLYLKYKSRRSFIDEKKMP
>P11323 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MGAILVVLALLSLLGLGSANLNPLDHDPCLDFDPENCTLTFAPDTSRLCGVLIKCGWDCRSVEITHNNKTWNNTLSTTWE
PGVPQWYTVSVRGPDGSIRISNNTFIFSEMCDLAMFMSRQYDLWPPSKENIVAFSIAYCLVTCIITAIICVCIHLLIVIR
PRQSNEEKEKMP
>P04494 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MIRYIILGLLTLASAHGTTQKVDFKEPACNVTFAAEANECTTLIKCTTEHEKLLIRHKNKIGKYAVYAIWQPGDTTEYNV
TVFQGKSHKTFMYTFPFYEMCDITMYMSKQYKLWPPQNCVENTGTFCCTAMLITVLALVCTLLYIKYKSRRSFIEEKKMP
>P68979 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MRYMILGLLALAAVCSAAKKVEFKEPACNVTFKSEANECTTLIKCTTEHEKLIIRHKDKIGKYAVYAIWQPGDTNDYNVT
VFQGENRKTFMYKFPFYEMCDITMYMSKQYKLWPPQKCLENTGTFCSTALLITALALVCTLLYLKYKSRRSFIDEKKMP
>P35771 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MGPILVLLVLLSLLEAGSANYDPCLDFDPENCTLTFAPDTSRICGVLIKCGWECRSVEITHNNKTWNNTLSTTWEPGVPQ
WYTVSVRGPDGSIRISNNTFIFSKMCDLAMFMSKQYSLWPPSKDNIVTFSIAYCLCACLLTALLCVCIHLLVTTRIKNAN
NKEKMP
>P68980 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MGPILVLLVLLSLLEPGSANYDPCLDFDPENCTLTFAPDTSRICGVLIKCGWECRSVEITHNNKTWNNTLSTTWEPGVPE
WYTVSVRGPDGSIRISNNTFIFSEMCDLAMFMSKQYSLWPPSKDNIVTFSIAYCLCACLLTALLCVCIHLLVTTRIKNAN
NKEKMP
>P68981 ~~~~~~Early E3 18.5 kDa glycoprotein~~~
MGPILVLLVLLSLLEPGSANYDPCLDFDPENCTLTFAPDTSRICGVLIKCGWECRSVEITHNNKTWNNTLSTTWEPGVPE
WYTVSVRGPDGSIRISNNTFIFSEMCDLAMFMSKQYSLWPPSKDNIVTFSIAYCLCACLLTALLCVCIHLLVTTRIKNAN
NKEKMP
>P15133 ~~~~~~Pre-early 3 receptor internalization and degradation alpha protein~~~
MIPRVLILLTLVALFCACSTLAAVAHIEVDCIPPFTVYLLYGFVTLILICSLVTVVIAFIQFIDWVCVRIAYLRHHPQYR
DRTIADLLRIL
>P03250 ~~~~~~Early 3 receptor internalization and degradation beta protein~~~
MKRSVIFVLLIFCALPVLCSQTSAPPKRHISCRFTQIWNIPSCYNKQSDLSEAWLYAIISVMVFCSTIFALAIYPYLDIG
WNAIDAMNHPTFPVPAVIPLQQVIAPINQPRPPSPTPTEISYFNLTGGDD
>Q9DHS8 ~~~~~~Protein E3 homolog~~~
MDLLSCTVNDAEIFSLVKKEVLSLNTNDYTTAISLSNRLKINKKKINQQLYKLQKEDTVKMVPSNPPKWFKNYNCDNGEK
HDSKLEQKNHIPNHIFSDTVPYKKIINWKDKNPCIVLNEYCQFTCRDWSIDITTSGKSHCPMFTATVIISGIKFKPAIGN
TKREAKYNASKITMDEILDSVIIKF
>P04489 ~~~~~~Probable early E4 11 kDa protein~~~
MIRCLRLKVEGALEQIFTMAGLNIRDLLRDILRRWRDENYLGMVEGAGMFIEEIHPEGFSLYVHLDVRAVCLLEAIVQHL
TNAIICSLAVEFDHATGGERVHLIDLHFEVLDNLLE
>P03239 ~~~~~~Early 4 ORF6 protein~~~
MTTSGVPFGMTLRPTRSRLSRRTPYSRDRLPPFETETRATILEDHPLLPECNTLTMHNVSYVRGLPCSVGFTLIQEWVVP
WDMVLTREELVILRKCMHVCLCCANIDIMTSMMIHGYESWALHCHCSSPGSLQCIAGGQVLASWFRMVVDGAMFNQRFIW
YREVVNYNMPKEVMFMSSVFMRGRHLIYLRLWYDGHVGSVVPAMSFGYSALHCGILNNIVVLCCSYCADLSEIRVRCCAR
RTRRLMLRAVRIIAEETTAMLYSCRTERRRQQFIRALLQHHRPILMHDYDSTPM
>P89079 3.6.1.23~~~E4~~~E4-ORF1~~~
MAESLYAFIDSPGGIAPVQEGTSNRYTFFCPESFHIPPHGVVLLHLKVSVLVPTGYQGRFMALNDYHARDILTQSDVIFA
GRRQELTVLLFNHTDRFLYVRKGHPVGTLLLERVIFPSVKIATLV
>P03242 ~~~~~~Early 4 ORF1 protein~~~
MAAAVEALYVVLEREGAILPRQEGFSGVYVFFSPINFVIPPMGAVMLSLRLRVCIPPGYFGRFLALTDVNQPDVFTESYI
MTPDMTEELSVVLFNHGDQFFYGHAGMAVVRLMLIRVVFPVVRQASNV
>P03241 ~~~~~~Early 4 ORF3 protein~~~
MIRCLRLKVEGALEQIFTMAGLNIRDLLRDILIRWRDENYLGMVEGAGMFIEEIHPEGFSLYVHLDVRAVCLLEAIVQHL
TNAIICSLAVEFDHATGGERVHLIDLHFEVLDNLLE
>P03240 ~~~~~~Early 4 ORF4 protein~~~
MVLPALPAPPVCDSQNECVGWLGVAYSAVVDVIRAAAHEGVYIEPEARGRLDALREWIYYNYYTERAKRRDRRRRSVCHA
RTWFCFRKYDYVRRSIWHDTTTNTISVVSAHSVQ
>P03238 ~~~~~~Early 4 ORF6/7 control protein~~~
MTTSGVPFGMTLRPTRSRLSRRTPYSRDRLPPFETETRATILEDHPLLPECNTLTMHNAWTSPSPPVEQPQVGQQPVAQQ
LDSDMNLSELPGEFINITDERLARQETVWNITPKNMSVTHDMMLFKASRGERTVYSVCWEGGGRLNTRVL
>Q9J5C4 ~~~~~~Protein E6 homolog~~~
MDFIRRKYLIYTVENNINFFTQELADKISNFCLNHVVAINYIIKKYHKSVLTKDIFNNTNFYIFLHFIRDCETYDIVLKS
SFDVTLLYLNQLVKNYTSFTDFIDIYKQQSNTLLDDKRFLFVTKLSPYFQDIISVNFSTELNPLFHLNEPIKDLEIIYSK
LFKETRFIKVDRISVLRLLIWAYSLKMDTGMKFDDNDSHDLYTILQKTGPVVSSIMTETFKEFVFPKNSTTSYWLFMKER
IYNDEKVYTNEPAITIYEKVLSYIYSEIKQARVNKNMLKVVYMLDSDSEIKKFMLELIYGIPGDILSIIDERDETWKSYF
VDFYHDNFIDDKTFTSANRFYDDLFNVIAKIDPEKFDIRRDIESIFRTDATLVKRFDDMKINSTYVSQMIYQTQNVDLLA
LENKKLCQIYNKDTEYAIKEYNTYLYLNEDNPIVIYKGELKKLSDLDLNSPSIVFSLFSKSLLKYYLDSKLASLGLIIEN
YKDDIILKIITGSSCLQNFTSFIVYATCNDKSILKSVVRTIINHFKVAIIILFKQFLQENIYYVNEYLDNTKHLSKNDKK
FILQIINGNYD
>P03191 ~~~BMRF1~~~DNA polymerase processivity factor BMRF1~~~
METTQTLRFKTKALAVLSKCYDHAQTHLKGGVLQVNLLSVNYGGPRLAAVANAGTAGLISFEVSPDAVAEWQNHQSPEEA
PAAVSFRNLAYGRTCVLGKELFGSAVEQASLQFYKRPQGGSRPEFVKLTMEYDDKVSKSHHTCALMPYMPPASDRLRNEQ
MIGQVLLMPKTASSLQKWARQQGSGGVKVTLNPDLYVTTYTSGEACLTLDYKPLSVGPYEAFTGPVAKAQDVGAVEAHVV
CSVAADSLAAALSLCRIPAVSVPILRFYRSGIIAVVAGLLTSAGDLPLDLSVILFNHASEEAAASTASEPEDKSPRVQPL
GTGLQQRPRHTVSPSPSPPPPPRTPTWESPARPETPSPAIPSHSSNTALERPLAVQLARKRTSSEARQKQKHPKKVKQAF
NPLI
>P0C6Z1 ~~~BHRF1~~~Apoptosis regulator BHRF1~~~
MAYSTREILLALCIRDSRVHGNGTLHPVLELAARETPLRLSPEDTVVLRYHVLLEEIIERNSETFTETWNRFITHTEHVD
LDFNSVFLEIFHRGDPSLGRALAWMAWCMHACRTLCCNQSTPYYVVDLSVRGMLEASEGLDGWIHQQGGWSTLIEDNIPG
SRRFSWTLFLAGLTLSLLVICSYLFISRGRH
>P03182 ~~~BHRF1~~~Apoptosis regulator BHRF1~~~
MAYSTREILLALCIRDSRVHGNGTLHPVLELAARETPLRLSPEDTVVLRYHVLLEEIIERNSETFTETWNRFITHTEHVD
LDFNSVFLEIFHRGDPSLGRALAWMAWCMHACRTLCCNQSTPYYVVDLSVRGMLEASEGLDGWIHQQGGWSTLIEDNIPG
SRRFSWTLFLAGLTLSLLVICSYLFISRGRH
>P0C736 ~~~BHRF1~~~Apoptosis regulator BHRF1~~~
MAYSTREILLALCIRDSRVHGNGTLHPVLELAARETPLRLSPEDTVVLRYHVLLEEIIERNSETFTETWNRFITHTEHVD
LDFNSVFVEIFHRGDPSLGRALAWMAWCMHACRTLCCNQSTPYYVVDLSVRGMLEASEGLDGWIHQQGGWSTLIEDNIPG
SRRFSWTLFLAGLTLSLLVICSYLFISRGRH
>Q1HVF7 3.1.21.-~~~EBNA1~~~Epstein-Barr nuclear antigen 1~~~
MSDEGPGTGPGNGLGQKEDTSGPDGSSGSGPQRRGGDNHGRGRGRGRGRGGGRPGAPGGSGSGPRHRDGVRRPQKRPSCI
GCKGAHGGTGAGGGAGAGGAGAGGAGAGGAGAGGAGAGGAGAGGAGAGGAGAGGAGAGGAGAGGGAGAGGAGAGGAGAGG
GAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGAGAGGAGAGGGAGAGGGAGAGGGAGA
GGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGGGAGAGG
GAGAGGGGRGRGGSGGRGRGGSGGRGRGGSGGRRGRGRERARGGSRERARGRGRGRGEKRPRSPSSQSSSSGSPPRRPPP
GRRPFFHPVAEADYFEYHQEGGPDGEPDMPPGAIEQGPADDPGEGPSTGPRGQGDGGRRKKGGWYGKHRGEGGSSQKFEN
IAEGLRLLLARCHVERTTEDGNWVAGVFVYGGSKTSLYNLRRGIGLAIPQCRLTPLSRLPFGMAPGPGPQPGPLRESIVC
YFIVFLQTHIFAEGLKDAIKDLVLPKPAPTCNIKVTVCSFDDGVDLPPWFPPMVEGAAAEGDDGDDGDEGGDGDEGEEGQ
E
>P03211 3.1.21.-~~~EBNA1~~~Epstein-Barr nuclear antigen 1~~~
MSDEGPGTGPGNGLGEKGDTSGPEGSGGSGPQRRGGDNHGRGRGRGRGRGGGRPGAPGGSGSGPRHRDGVRRPQKRPSCI
GCKGTHGGTGAGAGAGGAGAGGAGAGGGAGAGGGAGGAGGAGGAGAGGGAGAGGGAGGAGGAGAGGGAGAGGGAGGAGAG
GGAGGAGGAGAGGGAGAGGGAGGAGAGGGAGGAGGAGAGGGAGAGGAGGAGGAGAGGAGAGGGAGGAGGAGAGGAGAGGA
GAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGGGAGGAGAGGGAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGGGAGAG
GAGAGGGGRGRGGSGGRGRGGSGGRGRGGSGGRRGRGRERARGGSRERARGRGRGRGEKRPRSPSSQSSSSGSPPRRPPP
GRRPFFHPVGEADYFEYHQEGGPDGEPDVPPGAIEQGPADDPGEGPSTGPRGQGDGGRRKKGGWFGKHRGQGGSNPKFEN
IAEGLRALLARSHVERTTDEGTWVAGVFVYGGSKTSLYNLRRGTALAIPQCRLTPLSRLPFGMAPGPGPQPGPLRESIVC
YFMVFLQTHIFAEVLKDAIKDLVMTKPAPTCNIRVTVCSFDDGVDLPPWFPPMVEGAAAEGDDGDDGDEGGDGDEGEEGQ
E
>Q3KSS4 3.1.21.-~~~EBNA1~~~Epstein-Barr nuclear antigen 1~~~
MSDEGPGTGPGNGLGQKEDSSGPEGSGGSGPQRRGGDNHGRGRGRGRGRGGGRPGAPGGSGSGPRHRDGVRRPQKRPSCI
GCKGAHGGTGSGAGAGGAGAGGAGAGGGAGAGGGAGGAGGAGGAGAGGGAGAGGGAGGAGGAGAGGGAGAGGGAGGAGAG
GGAGGAGGAGAGGGAGAGGGAGGAGAGGGAGGAGGAGAGGGAGAGGAGGAGGAGAGGAGAGGGAGGAGGAGAGGAGAGGA
GAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGEEAGGAGAGGGAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGGGAGAG
GAGAGGGGRGRGGSGGRGRGGSGGRGRGGSGGRRGRGRERARGRSRERARGRGRGRGEKRPRSPSSQSSSSGSPPRRPPP
GRRPFFHPVGDADYFEYLQEGGPDGEPDVPPGAIEQGPTDDPGEGPSTGPRGQGDGGRRKKGGWFGKHRGQGGSNPKFEN
IAEGLRVLLARSHVERTTEEGNWVAGVFVYGGSKTSLYNLRRGIALAVPQCRITPLSRLPFGMAPGPGPQPGPLRESIVC
YFMVFLQTHIFAEVLKDAIKDLVMTKPAPTCNIKVTVCSFDDGVDLPPWFPPMVEGAAAEGDDGDDGDEGGDGDEGEEGQ
E
>P12978 ~~~EBNA2~~~Epstein-Barr nuclear antigen 2~~~
MPTFYLALHGGQTYHLIVDTDSLGNPSLSVIPSNPYQEQLSDTPLIPLTIFVGENTGVPPPLPPPPPPPPPPPPPPPPPP
PPPPPPPPSPPPPPPPPPPPQRRDAWTQEPSPLDRDPLGYDVGHGPLASAMRMLWMANYIVRQSRGDRGLILPQGPQTAP
QARLVQPHVPPLRPTAPTILSPLSQPRLTPPQPLMMPPRPTPPTPLPPATLTVPPRPTRPTTLPPTPLLTVLQRPTELQP
TPSPPRMHLPVLHVPDQSMHPLTHQSTPNDPDSPEPRSPTVFYNIPPMPLPPSQLPPPAAPAQPPPGVINDQQLHHLPSG
PPWWPPICDPPQPSKTQGQSRGQSRGRGRGRGRGRGKGKSRDKQRKPGGPWRPEPNTSSPSMPELSPVLGLHQGQGAGDS
PTPGPSNAAPVCRNSHTATPNVSPIHEPESHNSPEAPILFPDDWYPPSIDPADLDESWDYIFETTESPSSDEDYVEGPSK
RPRPSIQ
>P12977 ~~~EBNA3~~~Epstein-Barr nuclear antigen 3~~~
MDKDRPGPPALDDNMEEEVPSTSVVQEQVSAGDWENVLIELSDSSSEKEAEDAHLEPAQKGTKRKRVDHDAGGSAPARPM
LPPQPDLPGREAILRRFPLDLRTLLQAIGAAATRIDTRAIDQFFGSQISNTEMYIMYAMAIRQAIRDRRRNPASRRDQAK
WRLQTLAAGWPMGYQAYSSWMYSYTDHQTTPTFVHLQATLGCTGGRRCHVTFSAGTFKLPRCTPGDRQWLYVQSSVGNIV
QSCNPRYSIFFDYMAIHRSLTKIWEEVLTPDQRVSFMEFLGFLQRTDLSYIKSFVSDALGTTSIQTPWIDDNPSTETAQA
WNAGFLRGRAYGIDLLRTEGEHVEGATGETREESEDTESDGDDEDLPCIVSRGGPKVKRPPIFIRRLHRLLLMRAGKRTE
QGKEVLEKARGSTYGTPRPPVPKPRPEVPQSDETATSHGSAQVPEPPTIHLAAQGMAYPLHEQHGMAPCPVAQAPPTPLP
PVSPGDQLPGVFSDGRVACAPVPAPAGPIVRPWEPSLTQAAGQAFAPVRPQHMPVEPVPVPTVALERPVYPKPVRPAPPK
IAMQGPGETSGIRRARERWRPAPWTPNPPRSPSQMSVRDRLARLRAEAQVKQASVEVQPPQLTQVSPQQPMEGPLVPEQQ
MFPGAPFSQVADVVRAPGVPAMQPQYFDLPLIQPISQGAPVAPLRASMGPVPPVPATQPQYFDIPLTEPINQGASAAHFL
PQQPMEGPLVPEQWMFPGAALSQSVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEGPWVPEQWMFQGAPPSQGTDV
VQHQLDALGYTLHGLNHPGVPVSPAVNQYHLSQAAFGLPIDEDESGEGSDTSEPCEALDLSIHGRPCPQAPEWPVQEEGG
QDATEVLDLSIHGRPRPRTPEWPVQGEGGQNVTGPETRRVVVSAVVHMCQDDEFPDLQDPPDEA
>Q3KST2 ~~~EBNA3~~~Epstein-Barr nuclear antigen 3~~~
MDKDRPGDPALDDNMEEEVPSTSVVQEQVSAGDWENVLIELSDSSSEKEAEDAQLEPAQKGTKRKRVDHDAGGSAPARPM
LPPQPDLPGREAILRRFPLDLRTLLQAIGAAATRIDTRAIDQFFGSQISNTEMYIMYAMAIRQAIRDRRRNPASRRDQAK
WRLQTLAAGWPMGYQAYSSWMYSYTDHQTAPTFVQLQATLGCTGGRRCHVTFSAGTFKPPRCTPGDRQWLYVQSSVGNIV
QSCNPRYSIFFDYMAIHRSLTKIWEEVLTPDQRVSFMEFLGFLQRTDLSYIKSFVSDALGTTSIQTPWIDDNSSTETAQA
WNAGFLRGRAYGLDLLRTEGEHVEGATGETREESEDTESDGDDEDLPCIVSRGGPKVKRPPIFIRRLHRLLLMRAGKRTE
QGKEVLEKARGSTYGTPRPPVPKPRPEVPQSDETATSHGSAQVPEPPTIHLAAQGMAYPLHEQRGMAPCPVAQAPHTPLP
PVSPGDQLPGVSSDGRVACAPVPAPAGPIVRPWEPSLTQAAGQAFAPVRPQHMPVEPVPVPTVALERPVYPKPVRPAPPK
IAMQGPGETSGIRRARERWRPAPWTPNPPRSPSQMSVRDRLARLRAEAQVKQASVEVQPPQVTQVSPQQPMEGPGAPFSQ
VADVVHTPGVPAMQPQYFDLPLIQPISQGAPVAPLRASMGPVPPVPATQPQYFDIPLTEPINQGASAAHFLPQQPMEGPL
VPEQWMFPGAALSQSVRPGVAQSQYFDLPLTQPITHGAPAAHFLHQPPMEGPWVPEQWMFQGAPPSQGTDVVQHQLDALG
YPLHALNHPGVPVSPAVNQYHLSQAAFGLPIDEDESGEGSDTSEPCEALDLSIHGRPCPQAPEWPVQGEGGQDATEVLDL
SIHGRPRPRTPEWPVQGESGQNVTDHEPRRVVVSAIVHMCQDDEFPDLQDPPDEA
>P03203 ~~~EBNA4~~~Epstein-Barr nuclear antigen 4~~~
MKKAWLSRAQQADAGGASGSEDPPDYGDQGNVTQVGSEPISPEIGPFELSAASEDDPQSGPVEENLDAAAREEEEPHEQE
HNGGDDPLDVHTRQPRFVDVNPTQAPVIQLVHAVYDSMLQSDLRPLGSLFLEQNLNIEEFIWMCMTVRHRCQAIRKKPLP
IVKQRRWKLLSSCRSWRMGYRTHNLKVNSFESGGDNVHPVLVTATLGCDEGTRHATTYSAGIVQIPRISDQNQKIETAFL
MARRARSLSAERYTLFFDLVSSGNTLYAIWIGLGTKNRVSFIEFVGWLCKKDHTHIREWFRQCTGRPKAAKPWLRAHPVA
IPYDDPLTNEEIDLAYARGQAMNIEAPRLPDDPIIVEDDDESEEIEAESDEEEDKSGMESLKNIPQTLPYNPTVYGRPAV
FDRKSDAKSTKKCRAIVTDFSVIKAIEEEHRKKKAARTEQPRATPESQAPTVVLQRPPTQQEPGPVGPLSVQARLEPWQP
LPGPQVTAVLLHEESMQGVQVHGSMLDLLEKDDEVMEQRVMATLLPPVPQQPRAGRRGPCVFTGDLGIESDEPASTEPVH
DQLLPAPGPDPLEIQPLTSPTTSQLSSSAPSCAQTPWPVVQPSQTPDDPTKQSRPPETAAPRQWPMPLRPIPMRPLRMQP
IPFNHPVGPTPHQTPQVEITPYKPTWAQIGHIPYQPTPTGPATMLLRQWAPATMQTPPRAPTPMSPPEVPPVPRQRPRGA
PTPTPPPQVPPVPRQRPRGAPTPTPPPQVLPTPMQLALRAPAGQQGPTKQILRQLLTGGVKKGRPSLKLQAALERQAAAG
WQPSPGSGTSDKIVQAPIFYPPVLQPIQVMGQGGSPTAMAASAVTQAPTEYTRERRGVGPMPPTDIPPSKRAKIEAYTEP
EMPHGGASHSPVVILENVGQGQQQTLECGGTAKQERDMLGLGDIAVSSPSSSETSNDE
>Q8AZK7 ~~~EBNA-LP~~~Epstein-Barr nuclear antigen leader protein~~~
MGDRSEGPGPTRPGPPGIGPEGPLGQLLRRHRSPSPTRGGQEPRRVRRRVLVQQEEEVVSGSPSGPRGDRSEGPGPTRPG
PPGIGPEGPLGQLLRRHRSPSPTRGGQEPRRVRRRVLVQQEEEVVSGSPSGPRGDRSEGPGPTRPGPPGIGPEGPLGQLL
RRHRSPSPTRGGQEPRRVRRRVLVQQEEEVVSGSPSGPRGDRSEGPGPTRPGPPGIGPEGPLGQLLRRHRSPSPTRGGQE
PRRVRRRVLVQQEEEVVSGSPSGPRGDRSEGPGPTRPGPPGIGPEGPLGQLLRRHRSPSPTRGGQEPRRVRRRVLVQQEE
EVVSGSPSGPRGDRSEGPGPTRPGPPGIGPEGPLGQLLRRHRSPSPTRGGQEPRRVRRRVLVQQEEEVVSGSPSGPRGDR
SEGPGPTRPGPPGIGPEGPLGQLLRRHRSPSPTRGGQEPRRVRRRVLVQQEEEVVSGSPSGPLRPRPRPPARSLREWLLR
IRDHFEPPTVTTQRQSVYIEEEEDED
>P03204 ~~~EBNA6~~~Epstein-Barr nuclear antigen 6~~~
MESFEGQGDSRQSPDNERGDNVQTTGEHDQDPGPGPPSSGASERLVPEESYSRDQQPWGQSRGDENRGWMQRIRRRRRRR
AALSGHLLDTEDNVPPWLPPHDITPYTARNIRDAACRAVKQSHLQALSNLILDSGLDTQHILCFVMAARQRLQDIRRGPL
VAEGGVGWRHWLLTSPSQSWPMGYRTATLRTLTPVPNRVGADSIMLTATFGCQNAARTLNTFSATVWTPPHAGPREQERY
AREAEVRFLRGKWQRRYRRIYDLIELCGSLHHIWQNLLQTEENLLDFVRFMGVMSSCNNPAVNYWFHKTIGNFKPYYPWN
APPNENPYHARRGIKEHVIQNAFRKAQIQGLSMLATGGEPRGDATSETSSDEDTGRQGSDVELESSDDELPYIDPNMEPV
QQRPVMFVSRVPAKKPRKLPWPTPKTHPVKRTNVKTSDRSDKAEAQSTPERPGPSEQSSVTVEPAHPTPVEMPMVILHQP
PPVPKPVPVKPTPPPSRRRRGACVVYDDDVIEVIDVETTEDSSSVSQPNKPHRKHQDGFQRSGRRQKRAAPPTVSPSDTG
PPAVGPPAAGPPAAGPPAAGPPAAGPPAAGPPAAGPRILAPLSAGPPAAGPHIVTPPSARPRIMAPPVVRMFMRERQLPQ
STGRKPQCFWEMRAGREITQMQQEPSSHLQSATQPTTPRPSWAPSVCALSVMDAGKAQPIESSHLSSMSPTQPISHEEQP
RYEDPDAPLDLSLHPDVAAQPAPQAPYQGYQEPPAPQAPYQGYQEPPPPQAPYQGYQEPPAHGLQSSSYPGYAGPWTPRS
QHPCYRHPWAPWSQDPVHGHTQGPWDPRAPHLPPQWDGSAGHGQDQVSQFPHLQSETGPPRLQLSLVPLVSSSAPSWSSP
QPRAPIRPIPTRFPPPPMPLQDSMAVGCDSSGTACPSMPFASDYSQGAFTPLDINATTPKRPRVEESSHGPARCSQATAE
AQEILSDNSEISVFPKDAKQTDYDASTESELD
>Q4ZJY8 ~~~~~~Protease inhibitor Egf1.0~~~
MYIDTGIMSNNIFLFAFFALVGLTRIEAMPTKGSEGTWDVDYEDQEHTGITCRENEHYNSTRIECEEECNDRNNKLCYRF
QQFCWCNEGYIRNSSHICVKLEDCLKDEEQKSETLASSANNDSSKRLEDDLKLFSHDSVSHTSLEPETQAQKFNGIIDQE
TLDLVFGKPENSWAENKPLETKTQAQKFNGKIDQETLDLVFGKPKNSSAEKKPLETETQAQKFNGIIDQETLD
>Q4ZJZ3 ~~~O1~~~Protease inhibitor Egf1.5a~~~
MYIDTGIMSNNIFLFAFFALVGLTRIEAMPTKGSEGTWDVDYEDQEHTGITCRENEHYNSTRIECEDECNDRNNKLCYRF
QQFCWCNEGYIRNSSHICVKLEDCLKDEEQKSETLASSANNDSSKRLEDDLKLFSHDSVSHTSLEPETQAQKFNGIINEE
TLDLVFGKPENSWAENKPLETKTQAQKFNGIINEETLDLVFGKPENSWAENKPLETETQAQKFNGIINEETLDLVFGKPE
NSWAENKPLETKTQAQKFNGIINEETLDLVFGKPENSWAENKPLETKTQAQKFNGIINEETLDLVFGKPENSWAENKPLE
TKTQTQKFNGIIDQYTRSIVFVF
>P18569 2.4.1.-~~~EGT~~~Ecdysteroid UDP-glucosyltransferase~~~
MTILCWLALLSTLTAVNAANILAVFPTPAYSHHIVYKVYIEALAEKCHNVTVVKPKLFAYSTKTYCGNITEINADMSVEQ
YKKLVANSAMFRKRGVVSDTDTVTAANYLGLIEMFKDQFDNINVRNLIANNQTFDLVVVEAFADYALVFGHLYDPAPVIQ
IAPGYGLAENFDTVGAVARHPVHHPNIWRSNFDDTEANVMTEMRLYKEFKILANMSNALLKQQFGPNTPTIEKLRNKVQL
LLLNLHPIFDNNRPVPPSVQYLGGGIHLVKSAPLTKLSPVINAQMNKSKSGTIYVSFGSSIDTKSFANEFLYMLINTFKT
LDNYTILWKIDDEVVKNITLPANVITQNWFNQRAVLRHKKMAAFITQGGLQSSDEALEAGIPMVCLPMMGDQFYHAHKLQ
QLGVARALDTVTVSSDQLLVAINDVLFNAPTYKKHMAELYALINHDKATFPPLDKAIKFTERVIRYRHDISRQLYSLKTT
AANVPYSNYYMYKSVFSIVMNHLTHF
>P07059 3.1.21.8~~~denA~~~Endonuclease II~~~
MKEIATEYSFIKYTELELDDNGSIKQLSIPNKYNVIYAIAINDELVYIGKTKNLRKRINYYRTAINRKDKTSDSTKSALI
HSALKEGSKVEFYARQCFNLSMTNELGTMTIATIDLEEPLFIKLFNPPWNIQHKKK
>P39250 3.1.21.9~~~denB~~~Endonuclease IV~~~
MQKTNPGLQRLFQIPTFTLSNSDLTCEMKVKIADTARYSLKQNPNQDKAEVIERCRIAVYAEFFVADWLSGYVNKGQEDV
DDPYTYAWDVLAHPKYCGLRVEVKTHQTDSRWISVTTGCSGEYPYGSGINLGPILNHQVADCIIIFNTKEIHPGVIQYTP
KFIGDREDLRKVVRKSNYNGWYLSI
>P04418 3.2.2.17~~~~~~Endonuclease V~~~
MTRINLTLVSELADQHLMAEYRELPRVFGAVRKHVANGKRVRDFKISPTFILGAGHVTFFYDKLEFLRKRQIELIAECLK
RGFNIKDTTVQDISDIPQEFRGDYIPHEASIAISQARLDEKIAQRPTWYKYYGKAIYA
>P13340 3.1.-.-~~~~~~Recombination endonuclease VII~~~
MLLTGKLYKEEKQKFYDAQNGKCLICQRELNPDVQANHLDHDHELNGPKAGKVRGLLCNLCNAAEGQMKHKFNRSGLKGQ
GVDYLEWLENLLTYLKSDYTQNNIHPNFVGDKSKEFSRLGKEEMMAEMLQRGFEYNESDTKTQLIASFKKQLRKSLK
>Q6QGD3 3.1.-.-~~~~~~Putative endonuclease~~~
MGYYSVGIPFLYFKIPFTNSIRLCILYSLIGRTSPIIERKQEKLMLYMGVGDSMGRQKLTIKDINIRLADRGIQIVGEYV
NQRTKTVFKCQRAHVWEATPHSILHMRRGCPHCSNNTISLDEVSNRISNIGYTLLSSYTNAKTKLHLRCNNGHDCFITLD
GLTQGKRCPYCSLKWENGGFLYIMSFSSGTKVGISLYPEKRLNEVKRESGFSDLYLFTMYHLPDRETALDLEKEVHREYY
NKNCGFSNFTGSTEFFNVAPEDIVIFLNSFGLEEYGH
>Q6QGE6 3.1.-.-~~~~~~Putative endonuclease~~~
MGKKITKQDRESQISNICNDKNLSFVGWIGEYTNIKSTLTLKCNKCYYTWSPRLDNFLRITAQKCPACAGKARWTKEERE
EQIKSKCAEKGYNFLSWSSTYINKDSKIILKCLKDGCIWDVSIHHFINHDTGCPDCASGGFNPNIPATFYIQKLTYQGTH
FLKFGITGKDVLERMRQQSNKSLCEHSVIFSHTFSYGSMARGLEKVVKDSVNTGVLDKRILPDGYTETCHYSELETILSL
TNSFIQENIR
>Q6QGD4 3.1.21.-~~~~~~Nicking endonuclease~~~
MATNTKYKRDAISIMRDGIKSRYSKDGCCAICGSSEDLELHHYHTISQLIKKFAKELQLDFTDENIVLSNREAFYKKYEH
ELVRDVVTLCQHHHQLLHKVYTKEPPLFSANKQKAWVQKQKDKIQNPQEKTQVKTETKSGFARFL
>P00641 3.1.21.2~~~3~~~Endonuclease I~~~
MAGYGAKGIRKVGAFRSGLEDKVSKQLESKGIKFEYEEWKVPYVIPASNHTYTPDFLLPNGIFVETKGLWESDDRKKHLL
IREQHPELDIRIVFSSSRTKLYKGSPTSYGEFCEKHGIKFADKLIPAEWIKEPKKEVPFDRLKRKGGKK
>Q38653 3.5.1.28~~~~~~Endolysin~~~
MVKYTVENKIIAGLPKGKLKGANFVIAHETANSKSTIDNEVSYMTRNWKNAFVTHFVGGGGRVVQVANVNYVSWGAGQYA
NSYSYAQVELCRTSNATTFKKDYEVYCQLLVDLAKKAGIPITLDSGSKTSDKGIKSHKWVADKLGGTTHQDPYAYLSSWG
ISKAQFASDLAKVSGGGNTGTAPAKPSTPAPKPSTPSTNLDKLGLVDYMNAKKMDSSYSNRDKLAKQYGIANYSGTASQN
TTLLSKIKGGAPKPSTPAPKPSTSTAKKIYFPPNKGNWSVYPTNKAPVKANAIGAINPTKFGGLTYTIQKDRGNGVYEIQ
TDQFGRVQVYGAPSTGAVIKK
>Q3ZFI3 3.5.1.28~~~~~~Endolysin~~~
MAKVQFTKRQETSQFFVHCSATKANMDVGVREIRQWHKEQGWLDVGYHFIIRRDGTVEAGRDQDAVGSHVKGYNSTSVGV
CLVGGIDAKGNPEANFTPQQMSALNGVLHELRGTYPKAVIMAHHDVAPKACPSFDLQRWVKTGELVTSDRG
>D1L2U8 3.5.1.28~~~~~~Endolysin~~~
MAKVQFTKRQETSQIFVHCSATKATMDVGVREIRQWHKEQGWLDVGYHFIIRRDGTVEAGRDQDAVGSHVKGYNSTSVGV
CLVGGIDAKGNPEANFTPAQMQALRSLLVELKVQYTGAVLMAHHDVAPKACPSFDLKRWWEKNELVTSDHG
>Q7Y2C0 3.2.1.17~~~~~~SAR-endolysin~~~
MNKPLRGAALAAALAGLVALEGSETTAYRDIAGVPTICSGTTAGVKMGDKATPEQCYQMTIKDFQRFERIVLDAIKVPLN
VNEQTALTFFCYNVGPVCTTSTAFKRFNQGRATEGCQALAMWNKVTINGQKVVSKGLVNRRNAEIKQCLEPSSQYSSLLW
>O64203 3.-.-.-~~~~~~Endolysin A~~~
MTLIVTRDHAQWVHDMCRARAGNRYGYGGAFTLNPRDTTDCSGLVLQTAAWYGGRKDWIGNRYGSTESFRLDHKIVYDLG
FRRLPPGGVAALGFTPVMLVGLQHGGGGRYSHTACTLMTMDIPGGPVKVSQRGVDWESRGEVNGVGVFLYDGARAWNDPL
FHDFWYLDAKLEDGPTQSVDAAEILARATGLAYNRAVALLPAVRDGLIQADCTNPNRIAMWLAQIGHESDDFKATAEYAS
GDAYDTRTDLGNTPEVDGDGRLYKGRSWIMITGKDNYRDFSRWAHGRGLVPTPDYFVVHPLELSELRWAGIGAAWYWTVE
RPDINALSDRRDLETVTRRINGGLTNLDDRRRRYNLALAVGDQLLTLIGDDDELADPTIQRFIREIHGALFNTVVTQSPY
GDPQNPDGSEPRSNLWQLHELIKNGDGMGHARYVEESARAGDLRELERVVRAAKGLGRDRSPEFIARARNVLAQIEAANP
EYLQAYIARNGAL
>Q9T1X2 3.2.1.17~~~lys~~~SAR-endolysin~~~
MAGIPKKLKAALLAVTIAGGGVGGYQEMTRQSLIHLENIAYMPYRDIAGVLTVCVGHTGPDIEMRRYSHAECMALLDSDL
KPVYAAIDRLVRVPLTPYQKTALATFIFNTGVTAFSKSTLLKKLNAGDYAGARDQMARWVFAAGHKWKGLMNRREVEMAI
WNIRGADDLRQ
>Q37875 3.2.1.17~~~~~~SAR-endolysin~~~
MKGKTAAGGGAICAIAVMITIVMGNGNVRTNQAGLELIGNAEGCRRDPYMCPAGVWTDGIGNTHGVTPGVRKTDQQIAAD
WEKNILIAERCINQHFRGKDMPDNAFSAMTSAAFNMGCNSLRTYYSKARGMRVETSIHKWAQKGEWVNMCNHLPDFVNSN
GVPLRGLKIRREKERQLCLTGLVNE
>P27359 3.2.1.17~~~R~~~SAR-endolysin~~~
MPPSLRKAVAAAIGGGAIAIASVLITGPSGNDGLEGVSYIPYKDIVGVWTVCHGHTGKDIMLGKTYTKAECKALLNKDLA
TVARQINPYIKVDIPETMRGALYSLLYNVGAGNFRTSTLLRKINQGDIKGACDQLRRWTYAGGKQWKGLMTRREIEREIC
LWGQQ
>P09963 3.2.1.17~~~~~~Endolysin~~~
MMQISSNGITRLKREEGERLKAYSDSRGIPTIGVGHTGKVDGNSVASGMTITAEKSSELLKEDLQWVEDAISSLVRVPLN
QNQYDALCSLIFNIGKSAFAGSTVLRQLNLKNYQAAADAFLLWKKAGKDPDILLPRRRRERALFLS
>P11187 3.2.1.17~~~~~~Endolysin~~~
MQISQAGINLIKSFEGLQLKAYKAVPTEKHYTIGYGHYGSDVSPRQVITAKQAEDMLRDDVQAFVDGVNKALKVSVTQNQ
FDALVSFAYNVGLGAFRSSSLLEYLNEGRTALAAAEFPKWNKSGGKVYQGLINRRAQEQALFNSGTPKNVSRGTSSTKTT
PKYKVKSGDNLTKIAKKHNTTVATLLKLNPSIKDPNMIRVGQTINVTGSGGKTHKVKSGDTLSKIAVDNKTTVSRLMSLN
PEITNPNHIKVGQTIRLS
>P13559 3.2.1.17~~~XV~~~Endolysin~~~
MQYTLWDIISRVESNGNLKALRFEPEYYQRRMERGDWDNSIIQNIRAANKCSLGTARMIYCSSWGAVQIMGFNLYLNGAF
NLSVAHFMENEAYQVNEFRRFLLKNGLTEYTPERLASDKAARVKFAKVYNGAESYADLILQACQFYGVK
>B8QIR1 3.5.1.28~~~LysH5~~~Endolysin LysH5~~~
MQAKLTKKEFIEWLKTSEGKQYNADGWYGFQCFDYANAGWKALFGLLLKGVGAKDIPFANNFDGLATVYQNTPDFLAQPG
DMVVFGSNYGAGYGHVAWVIEATLDYIIVYEQNWLGGGWTDGVQQPGSGWEKVTRRQHAYDFPMWFIRPNFKSETAPRSV
QSPTQASKKETAKPQPKAVELKIIKDVVKGYDLPKRGSNPNFIVIHNDAGSKGATAEAYRNGLVNAPLSRLEAGIAHSYV
SGNTVWQALDESQVGWHTANQIGNKYGYGIEVCQSMGADNATFLKNEQATFQECARLLKKWGLPANRNTIRLHNEFTSTS
CPHRSSVLHTGFDPVTRGLLPEDKRLQLKDYFIKQIRAYMDGKIPVATVSNDSSASSNTVKPVASAWKRNKYGTYYMEES
ARFTNGNQPITVRKVGPFLSCPVGYQFQPGGYCDYTEVMLQDGHVWVGYTWEGQRYYLPIRTWNGSAPPNQILGDLWGEI
S
>P00720 3.2.1.17~~~E~~~Endolysin~~~
MNIFEMLRIDERLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELDKAIGRNCNGVITKDEAEKLFNQDVDAAVRGILR
NAKLKPVYDSLDAVRRCALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSIWYNQTPNRAKRVITTFRTGTWDA
YKNL
>Q6QGP7 3.4.24.-~~~lys~~~L-alanyl-D-glutamate peptidase~~~
MSFKFGKNSEKQLATVKPELQKVARRALELSPYDFTIVQGIRTVAQSAQNIANGTSFLKDPSKSKHITGDAIDFAPYING
KIDWNDLEAFWAVKKAFEQAGKELGIKLRFGADWNASGDYHDEIKRGTYDGGHVELV
>P00806 3.5.1.28~~~~~~Endolysin~~~
MARVQFKQRESTDAIFVHCSATKPSQNVGVREIRQWHKEQGWLDVGYHFIIKRDGTVEAGRDEMAVGSHAKGYNHNSIGV
CLVGGIDDKGKFDANFTPAQMQSLRSLLVTLLAKYEGAVLRAHHEVAPKACPSFDLKRWWEKNELVTSDRG
>P03706 4.2.2.n2~~~R~~~Endolysin~~~
MVEINNQRKAFLDMLAWSEGTDNGRQKTRNHGYDVIVGGELFTDYSDHPRKLVTLNPKLKSTGAGRYQLLSRWWDAYRKQ
LGLKDFSPKSQDAVALQQIKERGALPMIDRGDIRQAIDRCSNIWASLPGAGYGQFEHKADSLIAKFKEAGGTVREIDV
>P0DTM4 ~~~env~~~Envelope glycoprotein gp95~~~
MEAVIKAFLTGYPGKTSKKDSKEKPLATSKKDPEKTPLLPTRVNYILIIGVLVLCEVTGVRADVHLLEQPGNLWITWANR
TGQTDFCLSTQSATSPFQTCLIGIPSPISEGDFKGYVSDNCTTLGTDRLVSSADFTGGPDNSTTLTYRKVSCLLLKLNVS
MWDEPHELQLLGSQSLPNITNIAQISGITGGCVGFRPQGVPWYLGWSRQEATRFLLRHPSFSKSTEPFTVVTADRHNLFM
GSEYCGAYGYRFWNMYNCSQVGRQYRCGNARSPRPGLPEIQCTRRGGKWVNQSQEINESEPFSFTVNCTASSLGNASGCC
GKAGTILPGKWVDSTQGSFTKPKALPPAIFLICGDRAWQGIPSRPVGGPCYLGKLTMLAPKHTDILKVLVNSSRTGIRRK
RSTSHLDDTCSDEVQLWGPTARIFASILAPGVARAQALREIERLACWSVKQANLTTSFLGDLLDDVTSIRHAVLQNRAAI
DFLLLAHGHGCEDVAGMCCFNLSDHSESIQKKFQLMKEHVNKIGVDSDPIGSWLRGLFGGIGEWAVHLLKGLLLGLVVIL
LLVVCLPCLLQIVCGNIRKMINNSISYHTEYKKLQKACGQPESRIV
>P19557 ~~~env~~~Envelope glycoprotein~~~
MDQDLDGAERGERGGGSEELLQEEINEGRLTAREALQTWINNGEIHPWVLAGMLSMGVGMLLGVYCQLPDTLIWILMFQL
CLYWGLGETSRELDKDSWQWVRSVFIIAILGTLTMAGTALADDDQSTLIPNITKIPTKDTEPGCTYPWILILLILAFILG
ILGIILVLRRSNSEDILAARDTIDWWLSANQEIPPKFAFPIILISSPLAGIIGYYVMERHLEIFKKGCQICGSLSSMWGM
LLEEIGRWLARREWNVSRVMVILLISFSWGMYVNRVNASGSHVAMVTSPPGYRIVNDTSQAPWYCFSSAPIPTCSSSQWG
DKYFEEKINETLVKQVYEQAAKHSRATWIEPDLLEEAVYELALLSANDSRQVVVENGTDVCSSQNSSTNKGHPMTLLKLR
GQVSETWIGNSSLQFCVQWPYVLVGLNNSDSNISFNSGDWIATNCMHPITLNKSAQDLGKNFPRLTFLDGQLSQLKNTLC
GHNTNCLKFGNKSFSTNSLILCQDNPIGNDTFYSLSHSFSKQASARWILVKVPSYGFVVVNDTDTPPSLRIRKPRAVGLA
IFLLVLAIMAITSSLVAATTLVNQHTTAKVVERVVQNVSYIAQTQDQFTHLFRNINNRLNVLHHRVSYLEYVEEIRQKQV
FFGCKPHGRYCHFDFGPEEVGWNNSWNSKTWNDLQDEYDKIEEKILKIRVDWLNSSLSDTQDTFGLETSIFDHLVQLFDW
TSWKDWIKIIIVIIVLWLLIKILLGMLRSCAKVSQNYQHLPAEEEDGDTEPESSPARGDPASGSLYENWLNKIGESKNDA
YRVWTEEYNSLRILFATCRWDLLTPQLLQLPFFLLTLLLKLLWDIFRHAPILNLKGWTVGQGGTSGQQQPPDFPYVNWTG
SREQNNPEGGLDSGAWYEGLRGSQ
>P51519 ~~~env~~~Envelope glycoprotein~~~
MPKERRSRRRPQPIIRWVSLTLTLLALCQPIQTWRCSLSLGNQQWMTTYNQEAKFSISIDQILEAHNQSPFCPRSPRYTL
DFVNGYPKIYWPPPQGRRRFGARAMVTYDCEPRCPYVGADHFDCPHWDNASQADQGSFYVNHQILFLHLKQCHGIFTLTW
EIWGYDPLITFSLHKIPDPPQPDFPQLNSDWVPSVRSWALLLNQTARAFPDCAICWEPSPPWAPEILVYNKTISGSGPGL
ALPDAQIFWVNTSLFNTTQGWHHPSQRLLFNVSQGNALLLPPISLVNLSTVSSAPPTRVRRSPVAALTLGLALSVGLTGI
NVAVSALSHQRLTSLIHVLEQDQQRLITAINQTHYNLLNVASVVAQNRRGLDWLYIRLGFQSLCPTINEPCCFLRIQNDS
IIRLGDLQPLSQRVSTDWQWPWNWDLGLTAWVRETIHSVLSLFLLALFLLFLAPCLIKCLTSRLLKLLRQAPHFPEISFP
PKPDSDYQALLPSAPEIYSHLSPTKPDYINLRPCP
>P03380 ~~~env~~~Envelope glycoprotein~~~
MPKERRSRRRPQPIIRWVSLTLTLLALCRPIQTWRCSLSLGNQQWMTAYNQEAKFSISIDQILEAHNQSPFCAKSPRYTL
DSVNGYPKIYWPPPQGRRRFGARAMVTYDCEPRCPYVGADRFDCPHWDNASQADQGSFYVNHQILFLHLKQCHGIFTLTW
EIWGYDPLITFSLHKIPDPPQPDFPQLNSDWVPSVRSWALLLNQTARAFPDCAICWEPSPPWAPEILVYNKTISSSGPGL
ALPDAQIFWVNSSSFNTTQGWHHPSQRLLFNVSQGNALLLPPISLVNLSTASSAPPTRVRRSPVAALTLGLALSVGLTGI
NVAVSALSHQRLTSLIHVLEQDQQRLITAINQTHYNLLNVASVVAQNRRGLDWLYIRLGFQSLCPTINEPCCFLRIQNDS
IILRGDLQPLSQRVSTDWQWPWNWDLGLTAWVRETIHSVLSLFLLALFLLFLAPCLIKCLTSRLLKLLRQAPHFPEISLT
PKPDSDYQALLPSAPEIYSHLSPVKPDYINLRPCP
>P31627 ~~~env~~~Envelope glycoprotein~~~
MDAGASYMRLTGEENWVEVTMDEEKERKGKDVQQGKYRPQVSKPIINRDTNTSFAYKGIFLWGIQITMWILLWTNMCVRA
EDYITLISDPYGFSPIKNVSGVPVTCVTKEFARWGCQPLGAYPDPEIEYRNVSQEIVKEVYQENWPWNTYHWPLWQMENV
RYWLKENIAENKKRKNSTKKGIEELLAGTIRGRFCVPYPFALLKCTKWCWYPAEIDQETGRARKIKINCTEARAVSCTEE
MPLASIHRAYWDEKDRESMAFMNIRACDSNLRCQKRPGGCVEGYPIPVGANIIPENMKYLRGQKSQYGGIKDKNGELKLP
LTVRVWVKLANVSTWVNGTPPYWQNRINGSKGINGTLWGQLSGMHHLGFNLSQTGKWCNYTGKIKIGQETFSYHYKPNWN
CTGNWTQHPVWQVMRDLDMVEHMTGECVQRPQRHNITVDRNQTITGNCSVTNWDGCNCSRSGNYLYNSTTGGLLVIICRN
NNTITGIMGTNTNWTTMWRIYRNCSGCENATLDRKETGTLGGVANKNCSLPHKNESNKWTCAPRQREGKTDSLYIAGGKK
FWTREKAQYSCENNIGELDGMLHQQILLQKYQVIKVRAYTYGVIEMPENYAKTRIINRRKRELSHTRKKRGVGLVIMLVI
MAIVAAAGASLGVANAIQQSYTKAAVQTLANATAAQQDALEATYAMVQHVAKGVRILEARVARVEAITDRIMLYQELDCW
HYHQYCVTSTRADVAKYINWTRFKDNCTWQQWERELQGYDGNLTMLLRESARQTQLAEEQVRRIPDVWESLKEVFDWSGW
FSWLKYIPIIVVGLVGCILIRAVICVCQPLVQIYRTLSTPTYQRVTVIMEKRADVAGENQDFGDGLEESDDSKTDQKVTV
QKAWSRAWELWQNSPWKEPWKRSLLKLLILPLTMGIWINGRLGEHLKNKKERVDCETWGKGD
>P27427 ~~~P4~~~Envelope glycoprotein~~~
MVDSTIRLVATIFLISLTQQIEVCNKAQQQGPYTLVDYQEKPLNISRIQIKVVKTSVATKGLNFHIGYRAVWRGYCYNGG
SLDKNTGCYNDLIPKSPTESELRTWSKSQKCCTGPDAVDAWGSDARICWAEWKMELCHTAKELKKYSNNNHFAYHTCNLS
WRCGLKSTHIEVRLQASGGLVSMVAVMPNGTLIPIEGTRPTYWTEDSFAYLYDPAGTEKKTESTFLWCFKEHIRPTTELS
GAVYDTHYLGGTYDKNPQFNYYCRDNGYYFELPANRLVCLPTSCYKREGAIVNTMHPNTWKVSEKLHSASQFDVNNVVHS
LVYETEGLRLALSQLDHRFATLSRLFNRLTQSLAKIDDRLLGTLLGQDVSSKFISPTKFMLSPCLSTPEGDSNCHNHSIY
RDGRWVHNSDPTQCFSLSKSQPVDLYSFKELWLPQLLDVNVKGVVADEEGWSFVAQSKQALIDTMTYTKNGGKGTSLEDV
LGYPSGWINGKLQGLLLNGAISWVVVIGVVLVGVCLMRRVF
>P32541 ~~~env~~~Envelope glycoprotein~~~
MVSIAFYGGIPGGISTPITQQSEKSKCEENTMFQPYCYNNDSKNSMAESKEARDQEMNLKEESKEEKRRNDWWKKGMFLL
CLAGTTGGILWWYEGLPQQHYIGLVAIGGRLNGSGQSNAIECWGSFPGCRPFQNYFSYETNRSMHMDNNTATLLEAYHRE
ITFIYKSSCTDSDHCQEYQCKKVNLNSSDSSNSVRVEDVTNTAEYWGFKWLECNQTENFKTILVPENEMVNINDTDTWIP
KGCNETWARVKRCPIDILYGIHPIRLCVQPPFFLVQEKGIADTSRIGNCGPTIFLGVLEDNKGVVRGDYIACNVRRLNIN
RKDYTGIYQVPIFYTCTFTNITSCNNEPIISVIMYETNQVQYLLCNNNNSNNYNCVVQSFGVIGQAHLELPRPNKRIRNQ
SFNQYNCSINNKTELETWKLVKTSGVTPLPISSEANTGLIRHKRDFGISAIVAAIVAATAIAASATMSYVALTEVNKIME
VQNHTFEVENSTLNGMDLIERQIKILYAMILQTHADVQLLKERQQVEETFNLIGCIERTHVFCHTGHPWNMSWGHLNEST
QWDDWVSKMEDLNQEILTTLHGARNNLAQSMITFNTPDSIAQFGKDLWSHIGNWIPGLGASIIKYIVMFLLIYLLLTSSP
KILRALWKVTSGAGSSGSRYLKKKFHHKHASREDTWDQAQHNIHLAGVTGGSGDKYYKQKYSRNDWNGESEEYNRRPKSW
VKSIEAFGESYISEKTKGEISQPGAAINEHKNGSGGNNPHQGSLDLEIRSEGGNIYDCCIKAQEGTLAIPCCGFPLWLFW
GLVIIVGRIAGYGLRGLAVIIRICIRGLNLIFEIIRKMLDYIGRALNPGTSHVSMPQYV
>P16082 ~~~env~~~Envelope glycoprotein~~~
MVSIAFYGGIPGGISTPITQQSEKSKYEENTMFQPYCYNNDSKNSMAESKEARDQEMNLKEESKEEKRRNDWWKIGMFLL
CLAGTTGGILWWYEGLPQQHYIGLVAIGGRLNGSGQSNAIECWGSFPGCRPFQNYFSYETNRSMHMDNNTATLLEAYHRE
ITFIYKSSCTDSDHCQEYQCKKVNLNSSDSSNSVRVEDVTNTAEYWGFKWLECNQTENFKTILVPENEMVNINDTDTWIP
KGCNETWARVKRCPIDILYGIHPIRLCVQPPFFLVQEKGIADTSRIGNCGPTIFLGVLEDNKGVVRGDYTACNVSRLNIN
RKDYTGIYQVPIFYTCTFTNITSCNNEPIISVIMYETNQVQYLLCNNNNSNNYNCVVQSFGVIGQAHLELPRPNKRIRNQ
SFNQYNCSINNKTELETWKLVKTSGITPLPISSEANTGLIRHKRDFGISAIVAAIVAATAIAASATMSYVALTEVNKIME
VQNHTFEVENSTLNGMDLIERQIKILYAMILQTHADVQLLKERQQVEETFNLIGCIERTHVFCHTGHPWNMSWGHLNEST
QWDDWVSKMEDLNQEILTTLHGARNNLAQSMITFNTPDSIAQFGKDLWSHIGNWIPGLGASIIKYIVMFLLIYLLLTSSP
KILRALWKVTSGAGSSGSRYLKKKFHHKHASREDTWDQAQHNIHLAGVTGGSGDKYYKQKYSRNDWNGESEEYNRRPKSW
VKSIEAFGESYISEKTKGEISQPGAAINEHKNGSGGNNPHQGSLDLEIRSEGGNIYDCCIKAQEGTLAIPCCGFPLWLFW
GLVIIVGRIAGYGLRGLAVIIRICTRGLNLIFEIIRKMLDYIGRALNPGTSHVSMPQYV
>O56861 ~~~env~~~Envelope glycoprotein gp130~~~
MEQEHVMTLKEWMEWNAHKQLQKLQSTHPELHVDIPEDIPLVPEKVPLKMRMRYRCYTLCATSTRIMFWILFFLLCFSIV
TLSTIISILRYQWKEAITHPGPVLSWQVTNSHVTMGGNTSSSSRRRRDIQYHKLPVEVNISGIPQGLFFAPQPKPIFHKE
RTLGLSQVILIDSDTITQGHIKQQKAYLVSTINEEMEQLQKTVLPFDLPIKDPLTQKEYIEKRCFQKYGHCYVIAFNGNK
VWPSQDLIQDQCPLPPRFGNNLKYRNHTIWKYYIPLPFKVSSNWTRVESYGNIRIGSFKVPDEFRQNATHGIFCSDALYS
NWYPRDLPSSVQQSFAQAYITKVLMKRKKQPTLRDIAFPKELSPVGSGMLFRPINPYDICNMPRAVLLLNKTYYTFSLWE
GDCGYYQHNLTLHPACKNFNRTRQDHPYACRFWRNKYDSESVQCYNNDMCYYRPLYDGTENTEDWGWLAYTDSFPSPICI
EEKRIWKKNYTLSSVLAECVNQAMEYGIDEVLSKLDLIFGNLTHQSADEAFIPVNNFTWPRYEKQNKQQKTSCERKKGRR
QRRSVSTENLRRIQEAGLGLANAITTVAKISDLNDQKLAKGVHLLRDHVVTLMEANLDDIVSLGEGIQIEHIHNHLTSLK
LLTLENRIDWRFINDSWIQEELGVSDNIMKVIRKTARCIPYNVKQTRNLNTSTAWEIYLYYEIIIPTTIYTQNWNIKNLG
HLVRNAGYLSKVWIQQPFEVLNQECGTNIYLHMEECVDQDYIICEEVMELPPCGNGTGSDCPVLTKPLTDEYLEIEPLKN
GSYLVLSSTTDCGIPAYVPVVITVNDTISCFDKEFKRPLKQELKVTKYAPSVPQLELRVPRLTSLIAKIKGIQIEITSSW
ETIKEQVARAKAELLRLDLHEGDYPEWLQLLGEATKDVWPTISNFVSGIGNFIKDTAGGIFGTAFSFLGYVKPVLLGFVI
IFCIILIIKIIGWLQNTRKKDQ
>P11261 ~~~env~~~Envelope glycoprotein~~~
MEGPTHPKPSKDKTFSWDLMILVGVLLRLDVGMANPSPHQIYNVTWTITNLVTGTKANATSMLGTLTDAFPTMYFDLCDI
IGNTWNPSDQEPFPGYGCDQPMRRWQQRNTPFYVCPGHANRKQCGGPQDGFCAVWGCETTGETYWRPTSSWDYITVKKGV
TQGIYQCSGGGWCGPCYDKAVHSSITGASEGGRCNPLILQFTQKGRQTSWDGPKSWGLRLYRSGYDPIALFSVSRQVMTI
TLPQAMGPNLVLPDQKPPSRQSQIESRVTPHHSQGNGGTPGITLVNASIAPLSTPVTPASPKRIGTGNRLINLVQGTYLA
LNVTNPNKTKDCWLCLVSRPPYYEGIAVLGNYSNQTNPPPSCLSDPQHKLTISEVSGQGSCIGTVPKTHQALCKKTQKGH
KGTHYLAAPSGTYWACNTGLTPCISMAVLNWTSDFCVLIELWPRVTYHQPEYVYTHFDKTVRLRREPISLTVALMLGGLT
VGGIAAGVGTGTKALLETAQFGQLQMAMHTDIQALEESISALEKSLTSLSEVVLQNRRGLDILFLQEGGLCAALKEECCF
YADHTGLVRDNMAKLRERLKQRQQLFDSQQGWFEGWFNKSPWFTTLISSIMGPLLILLLILLFGPCILNRLVQFVKDRIS
VVQALILTQQYQQIKQYDPDQP
>P14351 ~~~env~~~Envelope glycoprotein gp130~~~
MAPPMTLQQWIIWKKMNKAHEALQNTTTVTEQQKEQIILDIQNEEVQPTRRDKFRYLLYTCCATSSRVLAWMFLVCILLI
IVLVSCFVTISRIQWNKDIQVLGPVIDWNVTQRAVYQPLQTRRIARSLRMQHPVPKYVEVNMTSIPQGVYYEPHPEPIVV
KERVLGLSQILMINSENIANNANLTQEVKKLLTEMVNEEMQSLSDVMIDFEIPLGDPRDQEQYIHRKCYQEFANCYLVKY
KEPKPWPKEGLIADQCPLPGYHAGLTYNRQSIWDYYIKVESIRPANWTTKSKYGQARLGSFYIPSSLRQINVSHVLFCSD
QLYSKWYNIENTIEQNERFLLNKLNNLTSGTSVLKKRALPKDWSSQGKNALFREINVLDICSKPESVILLNTSYYSFSLW
EGDCNFTKDMISQLVPECDGFYNNSKWMHMHPYACRFWRSKKNEKEETKCRDGETKRCLYYPLWDSPESTYDFGYLAYQK
NFPSPICIEQQKIRDQDYEVYSLYQERKIASKAYGIDTVLFSLKNFLNYTGTPVNEMPNARAFVGLIDPKFPPSYPNVTR
EHYTSCNNRKRRSVDNNYAKLRSMGYALTGAVQTLSQISDINDENLQQGIYLLRDHVITLMEATLHDISVMEGMFAVQHL
HTHLNHLKTMLLERRIDWTYMSSTWLQQQLQKSDDEMKVIKRIARSLVYYVKQTHSSPTATAWEIGLYYELVIPKHIYLN
NWNVVNIGHLVKSAGQLTHVTIAHPYEIINKECVETIYLHLEDCTRQDYVICDVVKIVQPCGNSSDTSDCPVWAEAVKEP
FVQVNPLKNGSYLVLASSTDCQIPPYVPSIVTVNETTSCFGLDFKRPLVAEERLSFEPRLPNLQLRLPHLVGIIAKIKGI
KIEVTSSGESIKEQIERAKAELLRLDIHEGDTPAWIQQLAAATKDVWPAAASALQGIGNFLSGTAQGIFGTAFSLLGYLK
PILIGVGVILLVILIFKIVSWIPTKKKNQ
>P03393 ~~~env~~~Glycoprotein 55~~~
MKGPAFSKPLKDKINPWGPLIVLGILIRAGVSVQHDSPHQVFNVTWRVTNLMTGQTANATSLLGTMTDAFPMLHFDLCDL
IGDDWDETGLECRTPGGRKRARTFDFYVCPGHTVPTGCGGPREGYCGKWGCETTGQAYWKPSSSWDLISLKRGNTPKDRG
PCYDSSVSSGVQGATPGGRCNPLVLKFTDAGKKASWDSPKVWGLRLYRPTGIDPVTRFSLTRQVLNIGPRIPIGPNPVII
GQLPPSRPVQVRLPRPPQPPPTGAASMVPGTAPPSQQPGTGDRLLNLVQGAYQALNLTNPDKTQECWLCLVSGPPYYEGV
AVLGTNSNHTSALKEKCCFYADHTGLVRDSMAKLRKRLTQRQKLFESSQGWFEGSFNRSPWFTTLISTIMGLLIILLLLL
ILLLWTLHS
>P21415 ~~~env~~~Envelope glycoprotein~~~
MVLLPGSMLLTSNLHHLRHQMSPGSWKRLIILLSCVFGGGGTSLQNKNPHQPMTLTWQVLSQTGDVVWDTKAVQPPWTWW
PTLKPDVCALAASLESWDIPGTDVSSSKRVRPPDSDYTAAYKQITWGAIGCSYPRARTRMASSTFYVCPRDGRTLSEARR
CGGLESLYCKEWDCETTGTGYWLSKSSKDLITVKWDQNSEWTQKFQQCHQTGWCNPLKIDFTDKGKLSKDWITGKTWGLR
FYVSGHPGVQFTIRLKITNMPAVAVGPDLVLVEQGPPRTSLALPPPLPPREAPPPSLPDSNSTALATSAQTPTVRKTIVT
LNTPPPTTGDRLFDLVQGAFLTLNATNPGATESCWLCLAMGPPYYEAIASSGEVAYSTDLDRCRWGTQGKLTLTEVSGHG
LCIGKVPFTHQHLCNQTLSINSSGDHQYLLPSNHSWWACSTGLTPCLSTSVFNQTRDFCIQVQLIPRIYYYPEEVLLQAY
DNSHPRTKREAVSLTLAVLLGLGITAGIGTGSTALIKGPIDLQQGLTSLQIAIDADLRALQDSVSKLEDSLTSLSEVVLQ
NRRGLDLLFLKEGGLCAALKEECCFYIDHSGAVRDSMKKLKEKLDKRQLERQKSQNWYEGWFNNSPWFTTLLSTIAGPLL
LLLLLLILGPCIINKLVQFINDRISAVKILVLRQKYQALENEGNL
>C1JJY3 ~~~ORF4~~~Envelope protein~~~
MSVNRSSIKSLLMVFMIVSSSLLAPVGGAAADEFRTPAASDTSPEAGECSNLDDFIMFLSVGRINADSCSRQAYVDAAVQ
DMKDSDANQTKVDIYSAAAGVKGGSETWAAPYDNYLNDTESIAWMKAESAIAQSYSEGESKTEAKVAAKAAIADYYATKQ
KNLIEQWNFANAQMFTLREQARMEDGISRNYVEPAYRNVEKTNSPDYSLAYSNTTVEKSLVDGTTVNTTGVSMDVTVQHT
TVSDVATVSSGPVRAGKYNNQYNEWKATYYSWSVEPASPSQDTLYAVHFQPYADRWQRIVDMNGALQSEADNFVNATWDD
YDTGQINASDVLSANTAMSEYGVRSGSESEGLWRSTAALSMMGYDTPNLNNSGMMTVEYKNVQHTGLLMAKNAPNGSWQV
NTTYNTSNIDGPVFMATTEGTKLDFADGEEFTIVGMTAKDGTAVNSTQTTKYRYKTANTTELLEVQNQLIELRQEIEDRE
PEAGGFFGSGSTDTMLVGLLALAGVLLLAQSNNRGGRR
>P23064 ~~~env~~~Envelope glycoprotein gp62~~~
MGKFLATLILFFQFCPLILGDYSPSCCTLTIGVSSYHSKPCNPAQPVCSWTLDLLALSADQALQPPCPNLVSYSSYHATY
SLYLFPHWIKKPNRNGGGYYSASYSDPCSLKCPYLGCQSWTCPYTGAVSSPYWKFQQDVNFTQEVSRLNINLHFSKCGFP
FSLLVDAPGYDPIWFLNTEPSQLPPTAPPLLPHSNLDHILEPSIPWKSKLLTLVQLTLQSTNYTCIVCIDRASLSTWHVL
YSPNVSVPSSSSTPLLYPSLALPAPHLTLPFNWTHCFDPQIQAIVSSPCHNSLILPPFSLSPVPTLGSRSRRAVPVAVWL
VSALAMGAGVAGGITGSMSLASGKSLLHEVDKDISQLTQAIVKNHKNLLKIAQYAAQNRRGLDLLFWEQGGLCKALQEQC
CFLNITNSHVSILQERPPLENRVLTGWGLNWDLGLSQWAREALQTGITLVALLLLVILAGPCILRQLRHLPSRVRYPHYS
LINPESSL
>P03383 ~~~env~~~Envelope glycoprotein gp63~~~
MGNVFFLLLFSLTHFPLAQQSRCTLTIGISSYHSSPCSPTQPVCTWNLDLNSLTTDQRLHPPCPNLITYSGFHKTYSLYL
FPHWIKKPNRQGLGYYSPSYNDPCSLQCPYLGCQAWTSAYTGPVSSPSWKFHSDVNFTQEVSQVSLRLHFSKCGSSMTLL
VDAPGYDPLWFITSEPTQPPPTSPPLVHDSDLEHVLTPSTSWTTKILKFIQLTLQSTNYSCMVCVDRSSLSSWHVLYTPN
ISIPQQTSSRTILFPSLALPAPPSQPFPWTHCYQPRLQAITTDNCNNSIILPPFSLAPVPPPATRRRRAVPIAVWLVSAL
AAGTGIAGGVTGSLSLASSKSLLLEVDKDISHLTQAIVKNHQNILRVAQYAAQNRRGLDLLFWEQGGLCKAIQEQCCFLN
ISNTHVSVLQERPPLEKRVITGWGLNWDLGLSQWAREALQTGITILALLLLVILFGPCILRQIQALPQRLQNRHNQYSLI
NPETML
>P03378 ~~~env~~~Envelope glycoprotein gp160~~~
MKVKGTRRNYQHLWRWGTLLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDARAYDTEVHNVWATHACVPTDPNP
QEVVLGNVTENFNMWKNNMVEQMQEDIISLWDQSLKPCVKLTPLCVTLNCTDLGKATNTNSSNWKEEIKGEIKNCSFNIT
TSIRDKIQKENALFRNLDVVPIDNASTTTNYTNYRLIHCNRSVITQACPKVSFEPIPIHYCTPAGFAILKCNNKTFNGKG
PCTNVSTVQCTHGIRPIVSTQLLLNGSLAEEEVVIRSDNFTNNAKTIIVQLNESVAINCTRPNNNTRKSIYIGPGRAFHT
TGRIIGDIRKAHCNISRAQWNNTLEQIVKKLREQFGNNKTIVFNQSSGGDPEIVMHSFNCRGEFFYCNTTQLFNNTWRLN
HTEGTKGNDTIILPCRIKQIINMWQEVGKAMYAPPIGGQISCSSNITGLLLTRDGGTNVTNDTEVFRPGGGDMRDNWRSE
LYKYKVIKIEPLGIAPTKAKRRVVQREKRAVGIVGAMFLGFLGAAGSTMGAVSLTLTVQARQLLSGIVQQQNNLLRAIEA
QQHLLQLTVWGIKQLQARVLAVERYLRDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEDIWDNMTWMQWEREIDNYTNT
IYTLLEESQNQQEKNEQELLELDKWASLWNWFSITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTRL
PVPRGPDRPDGIEEEGGERDRDRSVRLVDGFLALIWEDLRSLCLFSYRRLRDLLLIAARTVEILGHRGWEALKYWWSLLQ
YWIQELKNSAVSWLNATAIAVTEGTDRVIEVAQRAYRAILHIHRRIRQGLERLLL
>P03375 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYTLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSANFTDNAKTIIVQLNQSVEINCTRPNNNTRKSIRIQRGPGRAFVTI
GKIGNMRQAHCNISRAKWNNTLKQIDSKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTW
STKGSNNTEGSDTITLPCRIKQIINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWR
SELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWNNMTWMEWDREINNYTS
LIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKLFIMIVGGLVGLRIVFAVLSVVNRVRQGYSPLSFQTH
LPIPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGAYRAIRHIPRRIRQGLERILL
>P04582 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYFGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
ISTSKRGKVQKEYAFFYKLDIIPIDNDTTSYTLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLDTSVEINCTRPNNNTRKKIRIQRGPGRAFVTI
GKIGNMRQAHCNISRAKWNATLKQIDSKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWSTKGS
NNTEGSDTITLPCRIKQIINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYK
YKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIEGQQHL
LQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWNNMTWMEWDREINNYTSLIHSL
IEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKLFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTHLPNPR
GPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLLQYWSQ
ELKNSAVNLLNATAIAVAEGTDRVIELVQAAYRAIRHIPRRIRQGLERILL
>Q73372 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEIRKNWQHLRGGILLLGMLMICSAAKEKTWVTIYYGVPVWREATTTLFCASDAKAYDTEVHNVWATHACVPTDPNP
QEVVLGNVTENFNMWKNNMVDQMHEDIISLWDESLKPCVKLTPLCVTLNCTNLNITKNTTNPTSSSWGMMEKGEIKNCSF
YITTSIRNKVKKEYALFNRLDVVPIENTNNTKYRLISCNTSVITQACPKVSFQPIPIHYCVPAGFAMLKCNNKTFNGSGP
CTNVSTVQCTHGIRPVVSTQLLLNGSLAEEDIVIRSENFTDNAKTIIVQLNESVVINCTRPNNNTRRRLSIGPGRAFYAR
RNIIGDIRQAHCNISRAKWNNTLQQIVIKLREKFRNKTIAFNQSSGGDPEIVMHSFNCGGEFFYCNTAQLFNSTWNVTGG
TNGTEGNDIITLQCRIKQIINMWQKVGKAMYAPPITGQIRCSSNITGLLLTRDGGNSTETETEIFRPGGGDMRDNWRSEL
YKYKVVRIEPIGVAPTRAKRRTVQREKRAVGIGAVFLGFLGAAGSTMGAASVTLTVQARLLLSGIVQQQNNLLRAIEAQQ
HMLQLTVWGIKQLQARVLALERYLRDQQLMGIWGCSGKLICTTSVPWNVSWSNKSVDDIWNNMTWMEWEREIDNYTDYIY
DLLEKSQTQQEKNEKELLELDKWASLWNWFDITNWLWYIRLFIMIVGGLIGLRIVFAVLSIVNRVRQGYSPLSFQTLLPA
SRGPDRPEGTEEEGGERDRDRSGPLVNGFLALFWVDLRNLCLFLYHLLRNLLLIVTRIVELLGRRGWEALKYWWNLLQYW
SQELKNSAVSLLNATAIAVAEGTDRVIKIVQRACRAIRNIPTRIRQGLERALL
>P12488 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGIKKNYQHLWRWGGMMLLGILMICSATDKLWVTVYYGVPVWKEANTTLFCASDAKAYDTEIHNVWATHACVPTDPN
PQELVMGNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCHDFNATNATSNSGKMMEGGEMKNCSFNIT
TSIRDKMQKEYALFYKLDIVPIDNDKTNTRYRLISCNTSVITQACPKVTFEPIPIHYCAPAGFAILKCNNKKFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSENFTNNVKTIIVQLNESVEINCTRPNNNTRKRITMGPGRVYYTTGQ
IIGDIRRAHCNLSRSKWENTLKQIVTKLRVQFKNKTIVFNRSSGGDPEIVMHSFNCGGEFFFCNTTQLFNSTWYRNTTGN
ITEGNSPITLPCRIKQIINMWQEVGKAMYAPPIRGQIKCSSNITGLLLTRDGGNNNETTDTEIFRPGGGNMRDNWRSELY
KYKVVKIEPLGVAPTKAKRRVVQREKRAVGLGALFLGFLGAAGSTMGAASLTLTVQARLLLSGIVQQQNNLLMAIEAQQH
MLELTVWGIKQLQARVLAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLSDIWDNMTWMEWEREIDNYTNLIYS
LIEDSQIQQEKNEKELLELDKWASLWNWFNITNWLWYIKIFIMIVGGLIGLRIVFAVLSIVNRVRQGYSPLSFQTRLPGR
RGPDRPEGIEEEGGERDRDRSSPLVDGFLALFWVDLRSLFLFSYHRLRDLLLIVTRIVELLGRRGWEVLKYWWNLLQYWS
QELKNSAVSLLNATAIAVGERTDRAIEVVQRAFRAILHIPRRIRQGLERALQ
>P03377 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWKWGTMLLGILMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLGNATNTNSSNTNSSSGEMMMEKGEIK
NCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYTLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNG
TGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSANFTDNAKTIIVQLNQSVEINCTRPNNNTRKSIRIQRGPGR
AFVTIGKIGNMRQAHCNISRAKWNATLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTW
FNSTWSTEGSNNTEGSDTITLPCRIKQFINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNNNNGSEIFRPGGGDM
RDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGARSMTLTVQARQLLSGIVQQQNNL
LRAIEAQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWNNMTWMEWDREI
NNYTSLIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPL
SFQTHLPTPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKY
WWNLLQYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL
>P05879 ~~~env~~~Envelope glycoprotein gp160~~~
MAMRAKGIRKNCQHLWRWGTMLLGMLMICSAAANLWVTVYYGVPVWKEATTTLFCASDAKAYDTEAHNVWATHACVPTNP
NPQEVVLENVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTDLNTNNTTNTTELSIIVVWEQRGKGEM
RNCSFNITTSIRDKVQREYALFYKLDVEPIDDNKNTTNNTKYRLINCNTSVITQACPKVSFEPIPIHYCTPTGFALLKCN
DKKFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSENFTNNAKTIIVQLNVSVEINCTRPNNHTRKRVTL
GPGRVWYTTGEILGNIRQAHCNISRAQWNNTLQQIATTLREQFGNKTIAFNQSSGGDPEIVMHSFNCGGEFFYCNSTQLF
NSAWNVTSNGTWSVTRKQKDTGDIITLPCRIKQIINRWQVVGKAMYALPIKGLIRCSSNITGLLLTRDGGGENQTTEIFR
PGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGMLGAMFLGFLGAAGSTMGATSMALTVQARQLLSGI
VQQQNNLLRAIKAQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGFWGCSGKLICTTAVPWNASWSNKTLDQIWNNMTW
MEWDREIDNYTHLIYTLIEESQNQQEKNQQELLQLDKWASLWTWSDITKWLWYIKIFIMIVGGLIGLRIVFAVLSIVNRV
RQGYSPLSFQTLLPNPRGPDRPEGTEEGGGERGRDGSTRLVHGFLALVWDDLRSLCLFSYHRLRDLLLIVARIVELLGRR
GWEVLKYWWNLLQYWSQELKNSAVSLVNVTAIAVAEGTDRVIEVVQRIYRAFLHIPRRIRQGFERALL
>P04578 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTI
GKIGNMRQAHCNISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTW
STEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWR
SELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWNHTTWMEWDREINNYTS
LIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKLFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTH
LPTPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL
>P04624 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHAGVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYTLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKKIRIQRGPGRAFVTI
GKIGNMRQAHCNISRAKWNATLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTW
STEGSNNTEGSDTITLPCRIKQFINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNNNNGSEIFRPGGGDMRDNWR
SELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLLCTTAVPWNASWSNKSLEQIWNHTTWMEWDREINNYTS
LIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKLFIMIVGGLVGLRIVFAVLSVVNRVRQGYSPLSFQTH
LPIPRGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQEAYRAIRHIPRRIRQGLERILL
>P20871 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGIRKNYQHLWKGGILLLGTLMICSAVEKLWVTVYYGVPVWKETTTTLFCASDAKAYDTEVHNVWATHACVPTDPNP
QEVVLENVTEDFNMWKNNMVEQMQEDVINLWDQSLKPCVKLTPLCVTLNCKDVNATNTTSSSEGMMERGEIKNCSFNITK
SIRDKVQKEYALFYKLDVVPIDNKNNTKYRLISCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGKGQCKNV
STVQCTHGIRPVVSTQLLLNGSLAEEKVVIRSDNFTDNAKTIIVQLNESVKINCTRPSNNTRKSIHIGPGRAFYTTGEII
GDIRQAHCNISRAQWNNTLKQIVEKLREQFNNKTIVFTHSSGGDPEIVMHSFNCGGEFFYCNSTQLFNSTWNDTEKSSGT
EGNDTIILPCRIKQIINMWQEVGKAMYAPPIKGQIRCSSNITGLLLTRDGGKNESEIEIFRPGGGDMRDNWRSELYKYKV
VKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGARSMTLTVQARQLLSGIVQQQNNLLRAIEAQQHMLQL
TVWGIKQLQARVLAVERYLKDQQLMGIWGCSGKLICTTAVPWNTSWSNKSLDSIWNNMTWMEWEKEIENYTNTIYTLIEE
SQIQQEKNEQELLELDKWASLWNWFGITKWLWYIKIFIMIVGGLIGLRIVFSVLSIVNRVRQGYSPLSFQTLLPATRGPD
RPEGIEEEGGERDRDRSGQLVNGFLALIWVDLRSLFLFSYHRLRDLLLTVTRIVELLGRRGWEILKYWWNLLQYWSQELK
NSAVSLLNATAIAVAEGTDRIIEVVQRVYRAILHIPTRIRQGLERALL
>Q70626 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLRRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPN
PQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGGMIMEKGEIKNCSFN
ISTSIRGKVQKEYAFFYKHDIIPIDNDTTSYTLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCT
NVSTVQCTHGIKPVVSTQLLLNGSLAEEEVVIRSANLTDNVKTIIVQLNQSVEINCTRPNNNTRKRIRIQRGPGRTFVTI
GKIGNMRQAHCNISRAKWNNTLKQIASKLREQYGNNKTIIFKQSSGGDLEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTW
STEGSNNTEGSDTITLPCRIKQIINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGNNNNGSEIFRPGGGDMRDNWR
SELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWNHTTWMEWDREINNYTS
LIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGHSPLSFQTH
LPTPGGPDRPEGIEEEGGERDRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL
>P04583 ~~~env~~~Envelope glycoprotein gp160~~~
MRVREIQRNYQNWWRWGMMLLGMLMTCSIAEDLWVTVYYGVPVWKEATTTLFCASDAKSYETEVHNIWATHACVPTDPNP
QEIELENVTEGFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTNVNGTAVNGTNAGSNRTNAELKMEIGEVK
NCSFNITPVGSDKRQEYATFYNLDLVQIDDSDNSSYRLINCNTSVITQACPKVTFDPIPIHYCAPAGFAILKCNDKKFNG
TEICKNVSTVQCTHGIKPVVSTQLLLNGSLAEEEIMIRSENLTDNTKNIIVQLNETVTINCTRPGNNTRRGIHFGPGQAL
YTTGIVGDIRRAYCTINETEWDKTLQQVAVKLGSLLNKTKIIFNSSSGGDPEITTHSFNCRGEFFYCNTSKLFNSTWQNN
GARLSNSTESTGSITLPCRIKQIINMWQKTGKAMYAPPIAGVINCLSNITGLILTRDGGNSSDNSDNETLRPGGGDMRDN
WISELYKYKVVRIEPLGVAPTKAKRRVVEREKRAIGLGAMFLGFLGAAGSTMGAASLTLTVQARQLLSGIVQQQNNLLRA
IEAQQHLLQLTVWGIKQLQARVLAVERYLQDQRLLGMWGCSGKHICTTFVPWNSSWSNRSLDDIWNNMTWMQWEKEISNY
TGIIYNLIEESQIQQEKNEKELLELDKWASLWNWFSISKWLWYIRIFIIVVGGLIGLRIIFAVLSLVNRVRQGYSPLSLQ
TLLPTPRGPPDRPEGIEEEGGEQGRGRSIRLVNGFSALIWDDLRNLCLFSYHRLRDLLLIATRIVELLGRRGWEALKYLW
NLLQYWGQELKNSAISLLNTTAIAVAECTDRVIEIGQRFGRAILHIPRRIRQGFERALL
>P19551 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKEKYQHLWRWGWKWGIMLLGILMICSATENLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVCATHACVPTDPN
PQEVILVNVTENFDMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVNLKCTDLKNDTNTNSSNGRMIMEKGEIKNCSFN
ISTSIRNKVQKEYAFFYKLDIRPIDNTTYRLISCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNDKTFNGTGPCTNV
STVQCTHGIRPVVSTQLLLNGSLAEEEGVIRSANFTDNAKTIIVQLNTSVEINCTRPNNNTRKSIRIQRGPGRAFVTIGK
IGNMRQAHCNISRAKWMSTLKQIASKLREQFGNNKTVIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTWST
EGSNNTEGSDTITLPCRIKQFINMWQEVGKAMYAPPISGQIRCSSNITGLLLTRDGGKNTNESEVFRPGGGDMRDNWRSE
LYKYKVVKIETLGVAPTKAKRRVVQREKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIEAQ
QHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQFWNNMTWMEWDREINNYTSLI
HSLIDESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTHLP
NRGGPDRPEGIEEEGGERDRDRSVRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLLQY
WSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGAYRAIRHIPRRIRQGLERIL
>P05877 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGIRRNYQHWWGWGTMLLGLLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATQACVPTDPNP
QEVELVNVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTDLRNTTNTNNSTANNNSNSEGTIKGGEMK
NCSFNITTSIRDKMQKEYALLYKLDIVSIDNDSTSYRLISCNTSVITQACPKISFEPIPIHYCAPAGFAILKCNDKKFSG
KGSCKNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSENFTDNAKTIIVHLNESVQINCTRPNYNKRKRIHIGPGRAF
YTTKNIIGTIRQAHCNISRAKWNDTLRQIVSKLKEQFKNKTIVFNQSSGGDPEIVMHSFNCGGEFFYCNTSPLFNSTWNG
NNTWNNTTGSNNNITLQCKIKQIINMWQEVGKAMYAPPIEGQIRCSSNITGLLLTRDGGKDTDTNDTEIFRPGGGDMRDN
WRSELYKYKVVTIEPLGVAPTKAKRRVVQREKRAAIGALFLGFLGAAGSTMGAASVTLTVQARLLLSGIVQQQNNLLRAI
EAQQHMLQLTVWGIKQLQARVLAVERYLKDQQLLGFWGCSGKLICTTTVPWNASWSNKSLDDIWNNMTWMQWEREIDNYT
SLIYSLLEKSQTQQEKNEQELLELDKWASLWNWFDITNWLWYIKIFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSLQT
RPPVPRGPDRPEGIEEEGGERDRDTSGRLVHGFLAIIWVDLRSLFLFSYHHRDLLLIAARIVELLGRRGWEVLKYWWNLL
QYWSQELKSSAVSLLNATAIAVAEGTDRVIEVLQRAGRAILHIPTRIRQGLERALL
>P12490 ~~~env~~~Truncated surface protein~~~
MRAKGTRKNYQHLWRWGTMLLGMLMICSAAEQLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNP
QEVVLQNVTENFNMWKNNTVEQMHEDIISLWDQSLKPCVKSTPLCVTLNCTDLTNATYANGSSEERGEIRNCSFNVTTII
RNKIQKEYALFYRLDIVPIDKDNTSYTLINCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCTNVSTV
QCTHGIKPVVSTQLLLNGSLAEGEVVIRSENFTNNAKTIIVQLNKSVEINCTRPNNNTKKGIAIGPGRTLYAREKIIGDI
RQAHCNISKAKWNDTLKQIVTKLKEQFRNKTIVFNQSSGGDPEIVMHSFNCGGEFFYCKTTQLFNSTWLFNSTWNDTERS
DNNETIIIPCRIKQIINSGRK
>P18799 ~~~env~~~Envelope glycoprotein gp160~~~
MRAREKERNCQNLWKWGIMLLGMLMTCSAAEDLWVTVYYGVPIWKEATTTLFCASDAKAYKKEAHNIWATHACVPTDPNP
QEIELENVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTDELRNSKGNGKVEEEEKRKNCSFNVRDKR
EQVYALFYKLDIVPIDNNNRTNSTNYRLINCDTSTITQACPKISFEPIPIHFCAPAGFAILKCRDKKFNGTGPCSNVSTV
QCTHGIRPVVSTQLLLNGSLAEEEIIIRSENLTNNVKTIIVQLNASIVINCTRPYKYTRQRTSIGLRQSLYTITGKKKKT
GYIGQAHCKISRAEWNKALQQVATKLGNLLNKTTITFKPSSGGDPEITSHMLNCGGDFFYCNTSRLFNSTWNQTNSTGFN
NGTVTLPCRIKQIVNLWQRVGKAMYAPPIEGLIKCSSNITGLLLTRDGGANNSSHETIRPGGGDMRDNWRSELYKYKVVK
IEPIGVAPTKARRRVVEREKRAIGLGAVFLGFLGAAGSTMGAASVTLTVQARQLMSGIVHQQNNLLRAIEAQQHLLQLTV
WGIKQLQARVLAVERYLRDQQLLGIWGCSGRHICTTNVPWNSSWSNRSLDEIWQNMTWMEWEREIDNYTGLIYSLIEESQ
IQQEKNEKELLELDKWASLWNWFSITKWLWYIKLFIMIVGGLIGLRIVFAVLSVVNRVRQGYSPLSFQTLLPVPRGPDRP
EEIEEEGGERGRDRSIRLVNGLFALFWDDLRNLCLFSYHRLRDSILIAARIVELLGRRGWEALKYLWNLLQYWSQELRNS
ASSLLDTIAIAVAERTDRVIEVVQRACRAILNVPRRIRQGLERLLL
>P19549 ~~~env~~~Envelope glycoprotein gp160~~~
MRARETRKNYQCLWRWGTMLLGMLMICSAAENLWVTVYYGVPVWKDATTTLFCASDAKAYDTEVHNVWATHACVPTDPNP
QEVVLGNVTENFNMWKNNMVDQMHEDIVSLWDQSLKPCVKLTPLCVTLNCTDYLGNATNTNNSSGGTVEKEEIKNCSFNI
TTGIRDKVQKAYAYFYKLDVVPIDDDNTNTSYRLIHCNSSVITQTCPKVSFEPIPIHYCAPAGFAILKCNNKKFSGKGQC
TNVSTVQCTHGIKPVVSTQLLLNGSLAEEEVVIRSDNFTNNAKTILVQLNVSVEINCTRPNNNRRRRITSGPGKVLYTTG
EIIGDIRKAYCNISRAKWNKTLEQVATKLREQFGNKTIVFKQSSGGDPEIVMHSFNCRGEFFYCNTTKLFNSTWNENSTW
NATGNDTITLPCRIKQIINMWQEVGKAMYAPPIEGQIRCSSNITGLLLTRDGGGDKNSTTEIFRPAGGNMKDNWRSELYK
YKVVKIEPLGVAPTKAKRRVVQREKRAVGVIGAMFLGFLGAAGSTMGAASITLTVQARKLLSGIVQQQNNLLRAIEAQQH
LLQLTVWGIKQLQARVLAVERYLRDQQLLGIWGCSGKLICTTTVPWNTSWSNKSLDKIWNNMTWMEWEREIDNYTSLIYT
LLEESQNQQEKNEQELLELDKWASLWNWFSITNWLWYIRIFIMIVGGLIGLRIIFAVLSIVNRVRQGYSPLSFQTLIPAQ
RGPDRPEGIEEGGGERDRDRSTRLVNGFLALFWDDLRSLCLFSYHRLTDLLLIVARIVELLGRRGWEVLKYWWNLLLYWS
QELKNSAVSLLNATAIAVAEGTDRVIEVVQRVGRAILHIPTRIRQGFERALL
>P05878 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGSGRNYQHLWRWGTMLLGILMICSAAEQLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNIWATHACVPTDPNP
QEVVLGNVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTNLRNDTSTNATNTTSSNRGKMEGGEMTNC
SFNITTSIRSKVQKEYALFYKLDVVPIDNTSYTLINCNTSVITQACPKVSFEPIPIHYCARWFAILNCNNKKFNGTGPCT
NVSTVQCTHGIRPVVSTHLLLNGSLAEEEVVLRSENFTDNAKTIIVQLKEAVEINCTRPNNNTTRSIHIGPGRAFYATGD
IIGDIRQAHCNISRAKWNNTLKQIVIKLRDQFENKTIIFNRSSGGDPEIVMHSFNCGGEFFYCNSTQLFSSTWNGTEGSN
NTGGNDTITLPCRIKEIINMWQEVGKAMYAPPIKGQVKCSSNITGLLLTRDGGNSKNGSKNENTEIFRPGGGDMRDNWRS
ELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGTIGAMFLGFLGAAGSTMGATSMTLTVQARLLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARVLAVERYLRDQQLLGIWGCSGKLICTTTVPWNTSWSNKSLDKIWGNMTWMEWEREIDNYTS
LIYTLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYIKIFIMIVGGLVGLRIVFTVLSIVNRVRQGYSPLSFQTR
LPSQRGPDRPEGIEEEGGERDRDRSGRLVDGFLAIIWVDXRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSQELRNSAVSFVNATAIAVAEGTDRVIELLQRAFRAILHIPTRIRQGLERALQ
>Q9Q714 ~~~env~~~Envelope glycoprotein gp160~~~
METQRNYPSLWRWGTLILGMLLICSVVGNLWVTVYYGVPVWKEAKTTLFCASDAKAYDTERHNVWATHACVPTDPNPQEM
VLENVTETFNMWVNDMVEQMHTDIISLWDQSLKPCVKLTPLCVTLDCSSVNATNVTKSNNSTDINIGEIQEQRNCSFNVT
TAIRDKNQKVHALFYRADIVQIDEGERNKSDNHYRLINCNTSVIKQACPKVSFEPIPIHYCAPAGFAILKCNGKKFNGTG
PCTNVSTVQCTHGIRPVVSTQLLLNGSLAEVEEVIIRSKNITDNTKNIIVQLNEPVQINCTRTGNNTRKSIRIGPGQAFY
ATGDIIGDIRRAYCNISGKQWNETLHKVITKLGSYFDNKTIILQPPAGGDIEIITHSFNCGGEFFYCNTTKLFNSTWTNS
SYTNDTYNSNSTEDITGNITLQCKIKQIVNMWQRVGQAMYAPPIRGNITCISNITGLILTFDRNNTNNVTFRPGGGDMRD
NWRSELYKYKVVKIEPLGVAPTEARRRVVEREKRAVGMGAFFLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQSNLLR
AIQAQQHMLQLTVWGIKQLQARVLAVERYLKDQQLLGIWGCSGKLICTTNVPWNSSWSNKSLDEIWDNMTWMEWDKQINN
YTDEIYRLLEVSQNQQEKNEQDLLALDKWANLWNWFSITNWLWYIRIFIMIVGGIIGLRIVFAVLSIVNRVRQGYSPLSL
QTLIPNQRGPDRPREIEEEGGEQDRDRSIRLVNGFLPLVWEDLRNLCLFSYRRLRDLLSIVARTVELLGRRGWEALKLLG
NLLLYWGQELKNSAISLLNTTAIAVAEGTDRIIELVQRAWRAILHIPRRIRQGFERALL
>P31872 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGIRRNCQHLWIWGTMLFGMWMICSAVEQLWVTVYYGVPVWKEATTTLFCASDAKAYSTEAHKVWATHACVPTNPNP
QEVVLENVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCIDKNITDWENKTIIGGGEVKNCSFNITTSI
RDKVHKEYALFYKLDVVPIKSNNDSSTYTRYRLIHCNTSVITQACSKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCT
NVSTVQCTHGIRPVVSTQLLLNGSLAEEEIVIRSENFTDNAKTIIVHLNESVEINCTRPNNNVRRRHIHIGPGRAFYTGE
IRGNIRQAHCNISRAKWNNTLKQIVEKLREQFKNKTIVFNHSSGGDPEIVTHSFNCGGEFFYCDSTQLFNSTWNVTGIST
EGNNNTEENGDTITLPCRIKQIINMWQGVGKAMYAPPIGGQIRCSSNITGLLLTRDGGNSSSREEIFRPGGGNMRDNWRS
ELYKYKVVKIEPLGVAPTKAKRRVVQREKRAVGAIGAMFLGFLGAAGSTMGAASLTLTVQARQLLSGIVQQQNNLLRAIE
AQQHLLQLTVWGIKQLQARVLAVERYLRDQQLLGIWGCSGKLICTTTVPWNASWSNKSMDQIWNNMTWMEWEREIDNYTS
LIYNLIEESQNQQEKNEQELLELDKWASLWNWFSITNWLWYIKIFIMIVGGLVGLRIVFSVLSIVNRVRQGYSPLSFQTH
LPTPRGPDRPEGTEEEGGERDRDRSVRLVHGFLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWNLL
QYWSKELKNSAVGLLNAIAIAVAEGTDRVIEVVQRICRAIIHIPRRIRQGLERALL
>P05880 ~~~env~~~Envelope glycoprotein gp160~~~
MRVKGIMRNCQHLWIWGTMLFGMWMICSAVEQLWVTVYYGVPVWKEATTTLFCASDAKAYSTEAHNVWATHACVPTDPNP
QEVILGNVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCIDKNITDWKNTTIIGGGEVKNCSFNITTSR
RDKVHKEYALFYKLDVVPIKGDNNSSRYRLINCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCTNVS
TVQCTHGIRPVVSTQLLLNGSLAEEEIVIRSENFTDNAKTIIVHLNESVEINCTRPYNNVRRSLSIGPGRAFRTREIIGI
IRQAHCNISRAKWNNTLKQIVEKLREQFKNKTIVFNHSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWNGTDIKGDNKN
STLITLPCRIKQIINMWQGVGKAMYAPPIQGQIRCSSNITGLLLTRDGGNSSSREEIFRPGGGNMRDNWRSELYKYKVVR
IEPLGVAPTKAKRRVVQREKRAVGTIGAMFLGFLGAAGSTMGAGSLTLTVQARQLLSGIVQQQNNLLRAIDAQQHLLQLT
VWGIKQLQARVLAVERYLRDQQLLGIWGCSGKLICTTTVPWNASWSNKSMNQIWDNLTWMEWEREIDNYTSIIYSLIEES
QNQQGKNEQELLELDKWASLWNWFDITNWLWYIKIFIMIVGGLIGLRIVFTVLSIVNRVRQGYSPLSFQTHLPTPRGPDR
PEGIEEEGGERDRDRSVRLVHGFLALIWDDLRSLCLFSYHRLRDLLLIVKRIVELLGRRGWEALKYWWNLLQYWSKELKN
SAVGLLNAIAIAVAEGTDRVIEVVQRICRAIIHIPRRIRQGLERALL
>P35961 ~~~env~~~Envelope glycoprotein gp160~~~
MRATEIRKNYQHLWKGGTLLLGMLMICSAAEQLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNP
QEVKLENVTENFNMWKNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCTDLRNATNTTSSSWETMEKGEIKNCSFNIT
TSIRDKVQKEYALFYNLDVVPIDNASYRLISCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCTNVST
VQCTHGIRPVVSTQLLLNGSLAEEEIVIRSENFTNNAKTIIVQLNESVVINCTRPNNNTRKSINIGPGRALYTTGEIIGD
IRQAHCNLSKTQWENTLEQIAIKLKEQFGNNKTIIFNPSSGGDPEIVTHSFNCGGEFFYCNSTQLFTWNDTRKLNNTGRN
ITLPCRIKQIINMWQEVGKAMYAPPIRGQIRCSSNITGLLLTRDGGKDTNGTEIFRPGGGDMRDNWRSELYKYKVVKIEP
LGVAPTKAKRRVVQREKRAVGLGALFLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQNNLLRAIEAQQHLLQLTVWGI
KQLQARVLAVERYLRDQQLLGIWGCSGKLICTTTVPWNTSWSNKSLNEIWDNMTWMKWEREIDNYTHIIYSLIEQSQNQQ
EKNEQELLALDKWASLWNWFDITKWLWYIKIFIMIVGGLIGLRIVFVVLSIVNRVRQGYSPLSFQTHLPAQRGPDRPDGI
EEEGGERDRDRSGPLVDGFLAIIWVDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWGVLKYWWNLLQYWIQELKNSAVS
LLNATAIAVAEGTDRVIEILQRAFRAVLHIPVRIRQGLERALL
>P12487 ~~~env~~~Envelope glycoprotein gp160~~~
MRVRGIERNCQNLWKWGIMLLGILMTCSNADNLWVTVYYGVPVWKEATTTLFCASDAKSYKTEAHNIWATHACVPTDPNP
QEIELENVTENFNMWRNNMVEQMHEDIISLWDQSLKPCVKLTPLCVTLNCIDEVMENVTMKNNNVTEEIRMKNCSFNITT
VVRDKTKQVHALFYRLDIVPIDNDNSTNSTNYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCRDKRFNGTGPC
TNVSTVQCTHGIRPVVSTQLLLNGSLAEEEIIIRSENLTNNAKIIIVQLNESVAINCTRPYRNIRQRTSIGLGQALYTTK
TRSIIGQAYCNISKNEWNKTLQQVAIKLGNLLNKTTIIFKPSSGGDPEITTHSFNCGGEFFYCNTSGLFNSTWDISKSEW
ANSTESDDKPITLQCRIKQIINMWQGVGKAMYAPPIEGQINCSSNITGLLLTRDGGTNNSSNETFRPGGGDMRDNWRSEL
YKYKVVKIEPLGVAPTRAKRRVVEREKRAIGLGAMFLGFLGAAGSTMGARSLTLTVQARQLLSGIVQQQNNLLRAIEAQQ
HLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTTVPWNSSWSNRSLNDIWQNMTWMEWEREIDNYTGLIY
RLIEESQTQQEKNEQELLELDKWASLWNWFNITQWLWYIKIFIMIVGGLIGLRIVFAVLSLVNRVRQGYSPLSFQTLLPA
PRGPDRPEGIEEEGGERGRDRSIRLVNGFSALIWDDLRNLCLFSYHRLRDLILIAARIVELLGRRGWEALKYLWNLLQYW
SRELKNSASSLLDTIAIAVAEGTDRVIEIVRRACRAVLHIPTRIRQGLERLLL
>P04580 ~~~env~~~Envelope glycoprotein gp160~~~
MRAREIERNCPNLWKWGIMLLGILMICSAADNLWVTVYYGVPVWKEATTTLFCASDAKSYKTEAHNIWATHACVPTDPNP
QEIELENVTENFNMWRNNMVEQIHEDIISLWDQSLKPCVKLTPLCVTLNCTDESDEWMGNVTGKNVTEDIRMKNCSFNIT
TVVRDKTKQVHALFYRLDIVPIDNDNSTNSTNYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCRDKRFNGTGP
CTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEIIIRSENLTNNAKIIIVQLNESVAINCTRPYKNTRQSTPIGLGQALYTT
RGRTKIIGQAHCNISKEDWNKTLQRVAIKLGNLLNKTTIIFKPSSGGDAEITTHSFNCGGEFFYCNTSGLFNSTWNINNS
EGANSTESDNKLITLQCRIKQIINMWQGVGKAMYAPPIEGQINCSSNITGLLLTRDGGTNNSSNETFRPGGGDMRDNWRS
ELYKYKVVKIEPLGVAPTKAKRRVVEREKRAIGLGAMFLGFLGAAGSTMGAASVTLTVQARQLMSGIVQQQNNLLRAIEA
QQHLLQLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTTVPWNSSWSNRSLNDIWQNMTWMEWEREIDNYTGL
IYRLIEESQTQQEKNEQELLELDKWASLWNWFNITQWLWYIKIFIMIVGGLIGLRIVFAVLSLVNRVRQGYSPLSFQTLL
PAPREPDRPEGIEEEGGERGRDRSIRLVNGFSALIWDDLRNLCLFSYHRLRDLILIAARIVELLGRRGWEALKYLWNLLQ
YWSRELRNSASSLLDTIAIAVAEGTDRVIEIVRRTYRAVLNVPTRIRQGLERLLL
>P05883 ~~~env~~~Envelope glycoprotein gp160~~~
MKGSKNQLLIAIVLASAYLIHCKQFVTVFYGIPAWRNASIPLFCATKNRDTWGTIQCLPDNDDYQEITLNVTEAFDAWNN
TVTEQAVEDVWNLFETSIKPCVKLTPLCVAMNCTRNMTTWTGRTDTQNITIINDTSHARADNCTGLKEEEMIDCQFSMTG
LERDKRKQYTEAWYSKDVVCDNNTSSQSKCYMNHCNTSVITESCDKHYWDAMRFRYCAPPGFALLRCNDTNYSGFAPNCS
KVVAATCTRMMETQTSTWFGFNGTRAENRTYIYWHGKDNRTIISLNNFYNLTMHCKRPGNKTVLPITFMSGFKFHSQPVI
NKKPRQAWCWFEGQWKEAMQEVKETLAKHPRYKGNRSRTENIKFKAPGRGSDPEVTYMWTNCRGESLYCNMTWFLNWVEN
RTGQKQRNYAPCRIRQIINTWHRVGKNLYLPPREGELTCNSTVTSIIANIDAGDQTNITFSAEAAELYRLELGDYKLVEI
TPIGFAPTSVKRYSSAHQRHTRGVFVLGFLGFLATAGSAMGAASLTLSAQSRTLLAGIVQQQQQLLDVVKRQQEMLRLTV
WGTKNLQARVTAIEKYLKDQAQLNSWGCAFRQVCHTSVPWVNDTLTPDWNNMTWQEWEQKVRYLEANISQSLEQAQIQQE
KNMYELQKLNSWDVFTNWLDFTSWVRYIQYGVYVVVGIVALRIVIYIVQMLSRLRKGYRPVFSSPPGYIQQIHIHKDQEQ
PAREETEEDVGSNGGDRSWPWPIAYIHFLIRLLIRLLTGLYNICRDLLSRISPILQPIFQSLQRALTAIRDWLRLKAAYL
QYGCEWIQEAFQALARTTRETLAGAGRDLWRALQRIGRGILAVPRRIRQGAELALL
>P04577 ~~~env~~~Envelope glycoprotein gp160~~~
MMNQLLIAILLASACLVYCTQYVTVFYGVPTWKNATIPLFCATRNRDTWGTIQCLPDNDDYQEITLNVTEAFDAWNNTVT
EQAIEDVWHLFETSIKPCVKLTPLCVAMKCSSTESSTGNNTTSKSTSTTTTTPTDQEQEISEDTPCARADNCSGLGEEET
INCQFNMTGLERDKKKQYNETWYSKDVVCETNNSTNQTQCYMNHCNTSVITESCDKHYWDAIRFRYCAPPGYALLRCNDT
NYSGFAPNCSKVVASTCTRMMETQTSTWFGFNGTRAENRTYIYWHGRDNRTIISLNKYYNLSLHCKRPGNKTVKQIMLMS
GHVFHSHYQPINKRPRQAWCWFKGKWKDAMQEVKETLAKHPRYRGTNDTRNISFAAPGKGSDPEVAYMWTNCRGEFLYCN
MTWFLNWIENKTHRNYAPCHIKQIINTWHKVGRNVYLPPREGELSCNSTVTSIIANIDWQNNNQTNITFSAEVAELYRLE
LGDYKLVEITPIGFAPTKEKRYSSAHGRHTRGVFVLGFLGFLATAGSAMGAASLTVSAQSRTLLAGIVQQQQQLLDVVKR
QQELLRLTVWGTKNLQARVTAIEKYLQDQARLNSWGCAFRQVCHTTVPWVNDSLAPDWDNMTWQEWEKQVRYLEANISKS
LEQAQIQQEKNMYELQKLNSWDIFGNWFDLTSWVKYIQYGVLIIVAVIALRIVIYVVQMLSRLRKGYRPVFSSPPGYIQQ
IHIHKDRGQPANEETEEDGGSNGGDRYWPWPIAYIHFLIRQLIRLLTRLYSICRDLLSRSFLTLQLIYQNLRDWLRLRTA
FLQYGCEWIQEAFQAAARATRETLAGACRGLWRVLERIGRGILAVPRRIRQGAEIALL
>P20872 ~~~env~~~Envelope glycoprotein gp160~~~
MCGRNQLFVASLLASACLIYCVQYVTVFYGVPVWRNASIPLFCATKNRDTWGTIQCLPDNDDYQEIALNVTEAFDAWNNT
VTEQAVEDVWSLFETSIKPCVKLTPLCVAMRCNSTTAKNTTSTPTTTTTANTTIGENSSCIRTDNCTGLGEEEMVDCQFN
MTGLERDKKKLYNETWYSKDVVCESNDTKKEKTCYMNHCNTSVITESCDKHYWDTMRFRYCAPPGFALLRCNDTNYSGFE
PNCSKVVAATCTRMMETQTSTWFGFNGTRAENRTYIYWHGRDNRTIISLNKFYNLTVHCKRPGNKTVVPITLMSGLVFHS
QPINRRPRQAWCWFKGEWKEAMKEVKLTLAKHPRYKGTNDTEKIRFIAPGERSDPEVAYMWTNCRGEFLYCNMTWFLNWV
ENRTNQTQHNYVPCHIKQIINTWHKVGKNVYLPPREGQLTCNSTVTSIIANIDGGENQTNITFSAEVAELYRLELGDYKL
IEVTPIGFAPTPVKRYSSAPVRNKRGVFVLGFLGFLTTAGAAMGAASLTLSAQSRTLLAGIVQQQQQLLDVVKRQQEMLR
LTVWGTKNLQARVTAIEKYLKDQAQLNSWGCAFRQVCHTTVPWVNDTLTPDWNNMTWQEWEQRIRNLEANISESLEQAQI
QQEKNMYELQKLNSWDVFGNWFDLTSWIKYIQYGVYIVVGIIVLRIVIYVVQMLSRLRKGYRPVFSSPPAYFQQIHIHKD
REQPAREETEEDVGNSVGDNWWPWPIRYIHFLIRQLIRLLNRLYNICRDLLSRSFQTLQLISQSLRRALTAVRDWLRFNT
AYLQYGGEWIQEAFRAFARATGETLTNAWRGFWGTLGQIGRGILAVPRRIRQGAEIALL
>P31789 ~~~env~~~Envelope glycoprotein~~~
MVQKMLTSRGLFLILTMLNLSQVPSIMGEQRWAILSTFPKPMPVRHDAIVFPKFFTTNKTVDLPYLLYDPTAPLGENRSL
LEQGSLCFQINGPGNCINLTARALGKFNEHRGWVSTTQDTSNVEITFTNRTFWQEVNWVNGTFLPPNFSDKEHLHQPKIA
PHCSLEDEGLILPWSDCQSSIIRWVDQSKTFSFSPNMIDDPEKEFVMKKGLFIQDFRMHPFHKWVLCGVNGSCTELNPLI
FIQGGAVGKASFTGISRFAQYWGIHDASQDSYGYTNTSVEITGFNKTLVNQINYPSTPVCVYPPFLFILSNDSFEVCSND
SCWISQCWDVTKNTRAMVARIPRWIPVPVETPSTLSMFRQKRDFGITAAMIIAISASAAAATAAGYAMVSAVSGTKLNQL
SADLADAITVQTSASTKLKGGLMILNQCLDLAEEQIGVLHQMAQLGCERKLEALCITSVQYENFTYAANLSRQLSLYLAG
NWSERFDETLEALIAAVLKINSTRMDLSLTEGLSSWISSAFSYFKEWVGVGLFGVATCCGLVVMLWLVCKLRTQQTRDKV
VITQALAAIEQGASPEIWLSMLKN
>P31621 ~~~env~~~Envelope glycoprotein~~~
MPKRRAGFRKGWYARQRNSLTHQMQRMTLSEPTSELPTQRQIEALMPYAWNEAHVQPPVTPTNILIMLLLLLQRVQNGAA
AAFWAYIPDPPMIQSLGWDREIVPVYVNDTSLLGGKSDIHISPQQANISFYGLTTQYPMCFSYQSQHPHCIQVSADISYP
RVTISGIDEKTGKKSYGNGTGPLDIPFCDKHLSIGIGIDTPWTLCRARVASVYNINNANATFLWDWAPGGTPDFPEYRGQ
HPPIFSVNTAPIYQTELWKLLAAFGHGNSLYLQPNISGTKYGDVGVTGFLYPRACVPYPFMLIQGHMEITLSLNIYHLNC
SNCILTNCIRGVAKGEQVIIVKQPAFVMLPVEIAEAWYDETALELLQRINTALSRPKRGLSLIILGIVSLITLIATAVTA
CVSLAQSIQAAHTVDSLSYNVTKVMGTQEDIDKKIEDRLSALYDVVRVLGEQVQSINFRMKIQCHANYKWICVTKKPYNT
SDFPWDKVKKHLQGIWFNTNLSLDLLQLHNEILDIENSPKATLNIADTVDNFLQNLFSNFPSLHSLWKTLIGLGIFVIII
AIVIFVFPCVVRGLVRDFLKMRVEMLHMKYRTMLQHRHLMELLKNKERGAAGDDP
>P03386 ~~~env~~~Envelope glycoprotein~~~
MESTTLSKPFKNQVNPWGPLIVLLILGGVNPVTLGNSPHQVFNLTWEVTNGDRETVWAITGNHPLWTWWPDLTPDLCMLA
LHGPSYWGLEYRAPFSPPPGPPCCSGSSDSTPGCSRDCEEPLTSYTPRCNTAWNRLKLSKVTHAHNGGFYVCPGPHRPRW
ARSCGGPESFYCASWGCETTGRASWKPSSSWDYITVSNNLTSDQATPVCKGNEWCNSLTIRFTSFGKQATSWVTGHWWGL
RLYVSGHDPGLIFGIRLKITDSGPRVPIGPNPVLSDRRPPSRPRPTRSPPPSNSTPTETPLTLPEPPPAGVENRLLNLVK
GAYQALNLTSPDKTQECWLCLVSGPPYYEGVAVLGTYSNHTSAPANCSVASQHKLTLSEVTGQGLCIGAVPKTHQVLCNT
TQKTSDGSYYLAAPTGTTWACSTGLTPCISTTILDLTTDYCVLVELWPRVTYHSPSYVYHQFERRAKYKREPVSLTLALL
LGGLTMGGIAAGVGTGTTALVATQQFQQLQAAMHDDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLKEGGLCAALK
EECCFYADHTGLVRDSMAKLRERLSQRQKLFESQQGWFEGLFNKSPWFTTLISTIMGPLIILLLILLFGPCILNRLVQFI
KDRISVVQALVLTQQYHQLKTIEDCKSRE
>P08360 ~~~env~~~Envelope glycoprotein~~~
MEGPAFSKSPKDKTIERAFLGVLGILFVTGGLASRDNPHQVYNITWEVTNGEQDTVWAVTGNHPLWTWWPDLTPDLCMLA
LHGPTHWGLDNHPPYSSPPGPPCCSGDAGAVSGCARDCDEPLTSYSPRCNTAWNRLKLARVTHAPKEGFYICPGSHRPRW
ARSCGGLDAYYCASWGCETTGRAAWNPTSSWDYITVSNNLTSSQATKACKNNGWCNPLVIRFTGPGKRATSWTTGHFWGL
RLYISGHDPGLTFGIRLKVTDLGPRVPIGPNPVLSDQRPPSRPVPARPPPPSASPSTPTIPPQQGTGDRLLNLVQGAYLT
LNMTDPTRTQECWLCLVSEPPYYEGVAVLREYTSHETAPANCSSGSQHKLTLSEVTGQGRCLGTVPKTHQALCNRTEPTV
SGSNYLVAPEGTLWACSTGLTPCLSTTVLNLTTDYCVLVELWPKVTYHSPDYVYTQFEPGARFRREPVSLTLALLPEGLT
MGGIAAGVGTGTTALVATQQFQQLQAAMHNDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLKEGGLCAALKEECCF
YADHTGLVRDSMAKLRERLNQRQKLFESGQGWFEGLFNRSPWFTTLISTIMGPLIVLLLILLFGPCILNRLVQFVKDRIS
VVQALVLTQQYHQLKPIEYEP
>P03390 ~~~env~~~Envelope glycoprotein~~~
MACSTLPKSPKDKIDPRDLLIPLILFLSLKGARSAAPGSSPHQVYNITWEVTNGDRETVWAISGNHPLWTWWPVLTPDLC
MLALSGPPHWGLEYQAPYSSPPGPPCCSGSSGSSAGCSRDCDEPLTSLTPRCNTAWNRLKLDQVTHKSSEGFYVCPGSHR
PREAKSCGGPDSFYCASWGCETTGRVYWKPSSSWDYITVDNNLTTSQAVQVCKDNKWCNPLAIQFTNAGKQVTSWTTGHY
WGLRLYVSGRDPGLTFGIRLRYQNLGPRVPIGPNPVLADQLSLPRPNPLPKPAKSPPASNSTPTLISPSPTPTQPPPAGT
GDRLLNLVQGAYQALNLTNPDKTQECWLCLVSGPPYYEGVAVLGTYSNHTSAPANCSVASQHKLTLSEVTGRGLCIGTVP
KTHQALCNTTLKIDKGSYYLVAPTGTTWACNTGLTPCLSATVLNRTTDYCVLVELWPRVTYHPPSYVYSQFEKSYRHKRE
PVSLTLALLLGGLTMGGIAAGVGTGTTALVATQQFQQLHAAVQDDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLK
EGGLCAALKEECCFYADHTGLVRDSMAKLRERLTQRQKLFESSQGWFEGLFNRSPWFTTLISTIMGPLIILLLILLFGPC
ILNRLVQFVKDRISVVQALVLTQQYHQLKPLEYEP
>P26804 ~~~env~~~Envelope glycoprotein~~~
MACSTLSKSPKDKIDPRDLLIPLILFLSLKGARSAAPGSSPHQVYNITWEVTNGDRETVWAISGNHPLWTWWPVLTPDLC
MLALSGPPHWGLEYQAPYSSPPGPPCCSGSSGNVAGCARDCNEPLTSLTPRCNTAWNRLKLDQVTHKSSEGFYVCPGSHR
PREAKSCGGPDSFYCASWGCETTGRVYWKPSSSWDYITVDNNLTSNQAVQVCKDNKWCNPLAIRFTNAGKQVTSWTTGHY
WGLRLYVSGQDPGLTFGIRLSYQNLGPRIPIGPNPVLADQLSFPLPNPLPKPAKSPPASSSTPTLISPSPTPTQPPPAGT
GDRLLNLVQGAYQALNLTNPDKTQECWLCLVSGPPYYEGVAVLGTYSNHTSAPANCSVASQHKLTLSEVTGRGLCIGTVP
KTHQALCNTTLKAGKGSYYLVAPTGTMWACNTGLTPCLSATVLNRTTDYCVLVELWPRVTYHPPSYVYSQFEKSHRHKRE
PVSLTLALLLGGLTMGGIAAGVGTGTTALVATQQFQQLHAAVQDDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLK
EGGLCAALKEECCFYADHTGLVRDSMAKLRERLSQRQKLFESSQGWFEGWFNRSPWFTTLISTIMGPLIILLLILLFGPC
ILNRLVQFVKDRISVVQALVLTQQYHQLKPLEYEPQ
>P26803 ~~~env~~~Envelope glycoprotein~~~
MACSTLSKSPKDKIDPRDLLIPLILFLSLKGARSAAPGSSPHQVYNITWEVTNGDRETVWAISGNHPLWTWWPDLTPDLC
MLALSGPPHWGLEYRAPYSSPPGPPCCSGSSGNRAGCARDCDEPLTSLTPRCNTAWNRLKLDQVTHKSSGGFYVCPGSHR
PRKAKSCGGPDSFYCASWGCETTGRAYWKPSSSWDYITVDNNLTTNQAAQVCKDNKWCNPLAIQFTNAGKQVTSWTIGHY
WGLRLYVSGQDPGLTFGIRLKYQNLGPRVPIGPNPVLADQLSFPLPNPLPKPAKSPSASNSTPTLISPSPAPTQPPPAGT
GDRLLNLVQGAYQALNLTNPDKTQECWLCLVSAPPYYEGVAVLGTYSNHTSAPANCSAGSQHKLTLSEVTGQGLCIGTVP
KTHQALCNTTLKTGKGSYYLVAPAGTMWACNTGLTPCLSATVLNRTTDYCVLVELWPRVTYHPPSYVYSQFEKSYRHKRE
PVSLTLALLLGGLTMGGIAAGVGTGTTALVATQQFQQLHAAVQDDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLK
EGGLCAALKEECCFYADHTGLVRDSMAKLRERLTQRQKLFESSQGWFEGLFNRSPWFTTLISTIMGPLIILLLILLFGPC
ILNRLVQFVKDRISVVQALVLTQQYHQLKPLEYEPQ
>P21436 ~~~env~~~Envelope glycoprotein~~~
MDRPALPKSIKDKTNPWGPIILGILIMLGGALGKGSPHKVFNLTWEVYNQEYETVWATSGSHPLWTWWPTLTPDLCMLAQ
LAKPSWGLSDYPPYSKPPGPPCCTTDNNPPGCSRDCNGPLTYLTPRCSTAWNRLKLVLTTHHLNQGFYVCPGPHRPRHAR
NCGGPDDFYCAHWGCETTGQAYWKPSSSWDYIRVSNNASSSDATTACKNNNWCSPLAISFTDPGKRATSWTSGFTWGLRL
YISGHPGLIFGVRLKISDLGPRVPIGPNPVLSEQRPPSQPEPARLPPSSNLTQGGTPSAPTGPPQEGTGDRLLDLVQGAY
QALNATSPDKTQECWLCLVSSPPYYEGVAVVGPYSNHTTAPANCSADSQHKLTLSEVTGKPLPRKGSQDPPGPVQYHSGA
RQKYSLSGGSRGTMWACNTGLTPCLSTAVLNLTTDYCVLVELWPRVTYHSLDFVYRQVEGRTRYQREPVSLTLALLLGGL
TMGGIAAGVGTGTSALVKTQQFEQLHAAIQADLKEVESSITNLEKSLTSLSEVVLQNRRGLDLLFLEKGGLCAALKEECC
FYADHTGLVRDSMAKLRERLNQRQKLFEAGQGWFEGLFNRSPWLTTLISTIMGPLIILLLILMFGPCILNRLVQFVKDRI
SVVQALVLTQQYHQLKPLEHGRAIVK
>P03385 ~~~env~~~Envelope glycoprotein~~~
MARSTLSKPLKNKVNPRGPLIPLILLMLRGVSTASPGSSPHQVYNITWEVTNGDRETVWATSGNHPLWTWWPDLTPDLCM
LAHHGPSYWGLEYQSPFSSPPGPPCCSGGSSPGCSRDCEEPLTSLTPRCNTAWNRLKLDQTTHKSNEGFYVCPGPHRPRE
SKSCGGPDSFYCAYWGCETTGRAYWKPSSSWDFITVNNNLTSDQAVQVCKDNKWCNPLVIRFTDAGRRVTSWTTGHYWGL
RLYVSGQDPGLTFGIRLRYQNLGPRVPIGPNPVLADQQPLSKPKPVKSPSVTKPPSGTPLSPTQLPPAGTENRLLNLVDG
AYQALNLTSPDKTQECWLCLVAGPPYYEGVAVLGTYSNHTSAPANCSVASQHKLTLSEVTGQGLCIGAVPKTHQALCNTT
QTSSRGSYYLVAPTGTMWACSTGLTPCISTTILNLTTDYCVLVELWPRVTYHSPSYVYGLFERSNRHKREPVSLTLALLL
GGLTMGGIAAGIGTGTTALMATQQFQQLQAAVQDDLREVEKSISNLEKSLTSLSEVVLQNRRGLDLLFLKEGGLCAALKE
ECCFYADHTGLVRDSMAKLRERLNQRQKLFESTQGWFEGLFNRSPWFTTLISTIMGPLIVLLMILLFGPCILNRLVQFVK
DRISVVQALVLTQQYHQLKPIEYEP
>P31794 ~~~env~~~Envelope glycoprotein~~~
MESTTLSKPFKNQVNPWGPLIVLLILGRVNPVALGNSPHQVFNLSWEVTNEDRETVWAITGNHPLWTWWPDLTPDLCMLA
LHGPSYWGLEYQAPFSPPPGPPCCSRSSGSTPGCSRDCEEPLTSYTPRCNTAWNRLKLSKVTHAHNEGFYVCPGPHRPRW
ARSCGGPESFYCASWGCETTGRASWKPSSSWDYITVSNNLTSGQATPVCKNNTWCNSLTIRFTSLGKQATSWVTGHWWGL
RLYVSGHDPGLIFGIRLKITDSGPRVPIGPNPVLSDQRPPSQPRSPPHSNSTPTETPLTLPEPPPAGVENRLLNLVKGAY
QALNLTSPDRTQECWLCLVSGPPYYEGVAVLGTYSNHTSAPANCSVASQHKLTLSEVTGRGLCVGAVPKTHQALCNTTQN
TSGGSYYLAAPAGTIWACNTGLTPCLSTTVLNLTTDYCVLVELWPRVTYHSPSYVYHQFEGRAKYKREPVSLTLALLLGG
LTMGGIAAGVGTGTTALVATQQLQAAVHDDLKEVEKSITNLEKSLTSLSEVVLQNRRGLDLLFLKEGGLCAALKEECCFY
ADHTGVVRDSMAKLRERLNQRQKLFESGQGWFERLFNGSPWFTTLISTIMGPLIVLLLILLLGPCILNRLVQFVKDRISV
VQALVLTQQYHQLKSIDPEEMESRE
>Q85646 ~~~env~~~Envelope glycoprotein gp70~~~
MPNHQSGSPTGSSDLLLDGKKQRAHLALRRKRRREMRKINRKVRRMNLAPIKEKTAWQHLQALIFEAEEVLKTSQTPQTS
LTLFLALLSVLGPPPVSGESYWAYLPKPPILHPVGWGNTDPIRVLTNQTIYLGGSPDFHGFRNMSGNVHFEEKSDTLPIC
FSFSFSTPTGCFQVDKQVFLSDTPTVDNNKPGGKGDKRRMWELWLTTLGNSGANTKLVPIKKKLPPKYPHCQIAFKKDAF
WEGDESAPPRWLPCAFPDQGVSFSPKGALGLLWDFSLPSPSVDQSDQIKSKKDLFGNYTPPVNKEVHRWYEAGWVEPTWF
WENSPKDPNDRDFTALVPHTELFRLVAASRYLILKRPGFQEHEMIPTSACVTYPYVILLGLPQLIDIEKRGSTFHISCSS
CRLTNCLDSSAYDYAAIIVKRPPYVLLPVDIGDEPWFDDSAIQTFRYATDLIRAKRFVAAIILGISALIAIITSFAVATT
ALVKEMQTATFVNNLHRNVTLALSEQRIIDLKLEARLNALEEVVLDLGQDVANLKTRMSTRCHANYDFICVTPLPYNASE
SWERTKAHLLGIWNDNEISYNIQELTNLIGDMSKQHIDTVDLSGLAQSFANGVKALNPLDWTQYFIFIGVGALLLVIVLM
IFPIVFQCLAKSLDQVQSDLNVLLLKKKKGGNAAPAAEMVELPRVSYT
>P03374 ~~~env~~~Envelope glycoprotein gp70~~~
MPNHQSGSPTGSSDLLLSGKKQRPHLALRRKRRREMRKINRKVRRMNLAPIKEKTAWQHLQALISEAEEVLKTSQTPQNS
LTLFLALLSVLGPPPVTGESYWAYLPKPPILHPVGWGSTDPIRVLTNQTMYLGGSPDFHGFRNMSGNVHFEGKSDTLPIC
FSFSFSTPTGCFQVDKQVFLSDTPTVDNNKPGGKGDKRRMWELWLHTLGNSGANTKLVPIKKKLPPKYPHCQIAFKKDAF
WEGDESAPPRWLPCAFPDKGVSFSPKGALGLLWDFSLPSPSVDQSDQIKSKKDLFGNYTPPVNKEVHRWYEAGWVEPTWF
WENSPKDPNDRDFTALVPHTELFRLVAASRHLILKRPGFQEHEMIPTSACVTYPYAILLGLPQLIDIEKRGSTFHISCSS
CRLTNCLDSSAYDYAAIIVKRPPYVLLPVDIGDEPWFDDSAIQTFRYATDLIRAKRFVAAIILGISALIAIITSFAVATT
ALVKEMQTATFVNNLHRNVTLALSEQRIIDLKLEARLNALEEVVLELGQDVANLKTRMSTRCHANYDFICVTPLPYNATE
DWERTRAHLLGIWNDNEISYNIQELTNLISDMSKQHIDAVDLSGLAQSFANGVKALNPLDWTQYFIFIGVGALLLVIVLM
IFPIVFQCLAKSLDQVQSDLNVLLLKKKKGGNAAPAAEMVELPRVSYT
>P07575 ~~~env~~~Envelope glycoprotein~~~
MNFNYHFIWSLVILSQISQVQAGFGDPREALAEIQQKHGKPCDCAGGYVSSPPINSLTTVSCSTHTAYSVTNSLKWQCVS
TPTTPSNTHIGSCPGECNTISYDSVHASCYNHYQQCNIGNKTYLTATITGDRTPAIGDGNVPTVLGTSHNLITAGCPNGK
KGQVVCWNSRPSVHISDGGGPQDKARDIIVNKKFEELHRSLFPELSYHPLALPEARGKEKIDAHTLDLLATVHSLLNASQ
PSLAEDCWLCLQSGDPVPLALPYNDTLCSNFACLSNHSCPLTPPFLVQPFNFTDSNCLYAHYQNNSFDIDVGLASFTNCS
SYYNVSTASKPSNSLCAPNSSVFVCGNNKAYTYLPTNWTGSCVLATLLPDIDIIPGSEPVPIPAIDHFLGKAKRAIQLIP
LFVGLGITTAVSTGAAGLGVSITQYTKLSHQLISDVQAISSTIQDLQDQVDSLAEVVLQNRRGLDLLTAEQGGICLALQE
KCCFYANKSGIVRDKIKNLQDDLERRRRQLIDNPFWTSFHGFLPYVMPLLGPLLCLLLVLSFGPIIFNKLMTFIKHQIES
IQAKPIQVHYHRLEQEDSGGSYLTLT
>P03396 ~~~env~~~Envelope glycoprotein gp95~~~
MEAVIKAFLTGYPGKTSKKDSKEKPLATSKKDPEKTPLLPTRVNYILIIGVLVLCEVTGVRADVHLLEQPGNLWITWANR
TGQTDFCLSTQSATSPFQTCLIGIPSPISEGDFKGYVSDTNCSTVGTDRLVLSASITGGPDNSTTLTYRKVSCLLLKLNV
SMWDEPPELQLLGSQSLPNVTNITQVSGVAGGCVYFAPRATGLFLGWSKQGLSRFLLRHPFTSTSNSTEPFTVVTADRHN
LFMGSEYCGAYGYRFWEIYNCSQTRNTYRCGDVGGTGLPETWCRGKGGIWVNQSKEINETEPFSFTANCTGSNLGNVSGC
CGEPITILPLGAWIDSTQGSFTKPKALPPAIFLICGDRAWQGIPSRPVGGPCYLGKLTMLAPNHTDILKILANSSRTGIR
RKRSVSHLDDTCSDEVQLWGPTARIFASILAPGVAAAQALREIERLACWSVKQANLTTSLLGDLLDDVTSIRHAVLQNRA
AIDFLLLAHGHGCEDVAGMCCFNLSDHSESIQKKFQLMKKHVNKIGVDSDPIGSWLRGIFGGIGEWAVHLLKGLLLGLVV
ILLLLVCLPCLLQFVSSSIRKMINSSINYHTEYRKMQGGAV
>O92955 ~~~env~~~Envelope glycoprotein gp95~~~
MEAVIKAVLTGYPGETSKKDSKKKPPATSKKDPEKTPLLPTRVNYILIIGVLVLCEVTGVRADVHLLEQPGNLWITWASR
TGQTDFCLSTQSATSPFQTCLIGIPSPISEGDFKGYVSDNCTTLEPHRLVSRGIPGGPENSTTLTYQKVSCLLLKLNVSL
LDEPSELQLLGSQSLPNITNITRIPSVAGGCIGFTPYDSPAGVYGWDRREVTHILLTDPGNNPFFDKASNSSKPFTVVTA
DRHNLFMGSEYCGAYGYRFWEMYNCSQMRQNWSICQDVWGRGPPENWCTSTGGTWVNQSKEFNETEPFSFTVNCTGSNLG
NVSGCCGEPITILPPEAWVDSTQGSFTKPKALPPAIFLICGDRAWQGIPSRPVGGPCYLGKLTMLAPNHTDILKILANSS
RTGIRRKRSVSHLDDTCSDEVQLWGPTARIFASILAPGVAAAQALKEIERLACWSVKQANLTTSLLGDLLDDVTSIRHAV
LQNRAAIDFLLLAHGHGCEDVAGMCCFNLSDHSESIQKKFQLMKEHVNKIGVDSDPIGSWLRGLFGGIGEWAVHLLKGLL
LGLVVILLLVVCLPCLLQIVCGNIRKMINNSISYHTEYKKLQKAYGQPESRIV
>Q87041 ~~~env~~~Envelope glycoprotein gp130~~~
MAPPMTLQQWIIWNKMNKAHEALQNSTTVTDQQKEQIILEIQNEEVRPTRKDKIRYLLYTCCATSSRVLAWMLLVCVLLI
VVLVSCFLTISRIQWNRDIQVLGPVIDWNVTQRAVYQPLQTRRIARSLRMQHPVPKYIEVNMTSIPQGVYYEPHPEPIVV
TERVLGLSQVLMINSENIANNANLTQEVKKLLAEVVNEEMQSLSDVMIDFEIPLGDPRDQEQYIHRKCYQEFAHCYLVKY
KTPKSWPTEGLIADQCPLPGYHAGLSYKPQSIWDYYIKVEITRPANWSSQAVYGQARLGSFYVPKGIRQNNYSHVLFCSD
QLYSKWYNIENSIEQNEKFLLNKLDNLTTGSSLLKKRALPKEWSSQGKNALFKEINVLDVCSKPELVILLNTSYYSFSLW
EGDCNFTKNMISQLVPECEGFYNNSKWMHMHPYACRFWRSKNEKEETKCRPGEKEKCLYYPYQDSLESTYDFGFLAYQKN
FPAPICIEQQEIRDKDYEVYSLYQECKLASKVHGIDTVLFSLKNFLNHTGRPVNEMPNARAFVGLVDPKFPPSYPNVTRE
HYTSCNNRKRRSTDNNYAKLKSMGYALTGAVQTLSQISDINDENLQQGIYLLRDHVITLMEATLHDISVMEGMFAVQHLH
THLNHLKTMLLERRIDWTYMSSAWLQQQLQKSDDEMKVIKRIAKSLVYYVKQTYNSPTATAWEIGLYYELTIPKHVYLNN
WNVVNIGHLVQSAGQLTHVTIAHPYEIINKECTETKYLHLKDCRRQDYVICDVVEIVQPCGNSTDTSDCPVWAEAVKEPF
VQVNPLKNGSYLVLASSTDCQIPPYVPSIVTVNETTSCYGLNFKKPLVAEERLGFEPRLPNLQLRLPHLVGIIAKIKGLK
IEVTSSGESIKDQIERAKAELLRLDIHEGDTPAWIQQLAAATKDVWPAAASALQGIGNFLSGAAHGIFGTAFSLLGYLKP
ILIGVGVILLIILIFKIVSWIPTKKKSQ
>P08810 ~~~env~~~Envelope glycoprotein gp160~~~
MGCLGNQLLIAILLLSVYGIYCTQYVTVFYGVPAWRNATIPLFCATKNRDTWGTTQCLPDNGDYSELALNVTESFDAWEN
TVTEQAIEDVWQLFETSIKPCVKLSPLCITMRCNKSETDRWGLTKSSTTITTAAPTSAPVSEKIDMVNETSSCIAQNNCT
GLEQEQMISCKFTMTGLKRDKTKEYNETWYSTDLVCEQGNSTDNESRCYMNHCNTSVIQESCDKHYWDTIRFRYCAPPGY
ALLRCNDTNYSGFMPKCSKVVVSSCTRMMETQTSTWFGFNGTRAENRTYIYWHGRDNRTIISLNKYYNLTMKCRRPGNKT
VLPVTIMSGLVFHSQPINDRPKQAWCWFGGKWKDAIKEVKQTIVKHPRYTGTNNTDKINLTAPGGGDPEVTFMWTNCRGE
FLYCKMNWFLNWVEDRDVTTQRPKERHRRNYVPCHIRQIINTWHKVGKNVYLPPREGDLTCNSTVTSLIANIDWTDGNQT
SITMSAEVAELYRLELGDYKLVEITPIGLAPTDVKRYTTGGTSRNKRGVFVLGFLGFLATAGSAMGAASLTLTAQSRTLL
AGIVQQQQQLLDVVKRQQELLRLTVWGTKNLQTRVTAIEKYLKDQAQLNAWGCAFRQVCHTTVPWPNASLTPDWNNDTWQ
EWERKVDFLEENITALLEEAQIQQEKNMYELQKLNSWDVFGNWFDLASWIKYIQYGIYVVVGVILLRIVIYIVQMLAKLR
QGYRPVFSSPPSYFQXTHTQQDPALPTREGKEGDGGEGGGNSSWPWQIEYIHFLIRQLIRLLTWLFSNCRTLLSRAYQIL
QPILQRLSATLRRVREVLRTELTYLQYGWSYFHEAVQAGWRSATETLAGAWRDLWETLRRGGRWILAIPRRIRQGLELTL
L
>P05884 ~~~env~~~Envelope glycoprotein gp160~~~
MGCLGNQLLIAILLLSVYGIYCTQYVTVFYGVPAWRNATIPLFCATKNRDTWGTTQCLPDNGDYSELALNVTESFDAWEN
TVTEQAIEDVWQLFETSIKPCVKLSPLCITMRCNKSETDRWGLTKSSTTITTAAPTSAPVSEKIDMVNETSSCIAQNNCT
GLEQEQMISCKFTMTGLKRDKTKEYNETWYSTDLVCEQGNSTDNESRCYMNHCNTSVIQESCDKHYWDTIRFRYCAPPGY
ALLRCNDTNYSGFMPKCSKVVVSSCTRMMETQTSTWFGFNGTRAENRTYIYWHGRDNRTIISLNKYYNLTMKCRRPGNKT
VLPVTIMSGLVFHSQPLTDRPKQAWCWFGGKWKDAIKEVKQTIVKHPRYTGTNNTDKINLTAPGGGDPEVTFMWTNCRGE
FLYCKMNWFLNWVEDRDVTTQRPKERHRRNYVPCHIRQIINTWHKVGKNVYLPPREGDLTCNSTVTSLIANIDWTDGNQT
SITMSAEVAELYRLELGDYKLVEITPIGLAPTDVKRYTTGGTSRNKRGVFVLGFLGFLATAGSAMGAASFRLTAQSRTLL
AGIVQQQQQLLGVVKRQQELLRLTVWGTKNLQTRVTAIEKYLEDQAQLNAWGCAFRQVCHTTVPWPNASLTPDWNNDTWQ
EWERKVDFLEENITALLEEAQIQQEKNMYELQKLNSWDVFGNWFDLASWIKYIQYGIYVVVGVILLRIVIYIVQMLAKLR
QGYRPVFSSPPSYFQXTHTQQDPALPTREGKEGDGGEGGGNSSWPWQIEYIHFLIRQLIRLLTWLFSNCRTLLSRAYQIL
QPILQRLSATLRRIREVLRTELTYLQYGWSYFHEAVQAGWRSATETLAGAWGDLWETLRRGGRWILAIPRRIRQGLELTL
L
>P12492 ~~~env~~~Envelope glycoprotein gp160~~~
MGCLGNQLLIALLLVSVLEICCVQYVTVFYGVPAWKNATIPLFCATKNRDTWGTTQCLPDNDDYSELAINVTEAFDAWDN
TVTEQAIEDVWNLFETSIKPCVKLTPLCIAMRCNKTETDRWGLTGNAGTTTTAITTTATPSVAENVINESNPCIKNNSCA
GLEQEPMIGCKFNMTGLNRDKKKEYNETWYSRDLICEQSANESESKCYMHHCNTSVIQESCDKHYWDAIRFRYCAPPGYA
LLRCNDSNYLGFAPNCSKVVVSSCTRMMETQTSTWFGFNGTRAENRTYIYWHGKSNRTIISLNKYYNLTMRCRRPENKTV
LPVTIMSGLVFHSQPINERPKQAWCWFEGSWKKAIQEVKETLVKHPRYTGTNDTRKINLTAPAGGDPEVTFMWTNCRGEF
LYCKMNWFLNWVEDRDQKGGRWKQQNRKEQQKKNYVPCHIRQIINTWHKVGKNVYLPPREGDLTCNSTVTSLIAEIDWIN
SNETNITMSAEVAELYRLELGDYKLIEITPIGLAPTSVRRYTTTGASRNKRGVFVLGFLGFLATAGSAMGAASVTLSAQS
RTLLAGIVQQQQQLLDVVKRQQELLRLTVWGTKNLQTRVTAIEKYLKDQAQLNSWGCAFRQVCHTTVPWPNETLVPNWNN
MTWQEWERQVDFLEANITQLLEEAQIQQEKNMYELQKLNSWDIFGNWFDLTSWIRYIQYGVLIVLGVIGLRIVIYVVQML
ARLRQGYRPVFSSPPAYVQQIPIHKGQEPPTKEGEEGDGGDRGGSRSWPWQIEYIHFLIRQLIRLLTWLFSSCRDWLLRS
YQILQPVLQSLSTTLQRVREVIRIEIAYLQYGWRYFQEAVQAWWKLARETLASAWGDIWETLGRVGRGILAIPRRIRQGL
ELTLL
>P19503 ~~~env~~~Envelope glycoprotein gp160~~~
MGCLGNQLLIALLLLSASGIYCVQYVTVFYGIPAWRNATVPLFCATKNRDTWGTTQCLPDNGDYSELAINVTEAFDAWDN
TVTEQAIEDVWNLFETSIKPCVKLTPLCITMRCNKSETDRWGLTGTPAPTTTQTTTTQASTTPTSPITAKVVNDSDPCIK
INNCTGLEQEPMVSCKFNMTGLKRDKKREYNETWYSRDLVCEQNSNETDSKCYMNHCNTSVIQESCDKHYWDAIRFRYCA
PPGYALLRCNDSNYSGFAPNCTKVVVSSCTRMMETQTSTWFGFNGTRAENRTYIYWHGRSNRTIISLNKYYNLTMRCRRP
GNKTVLPVTIMSGLVFHSQPINERPKQAWCWFGGEWKKAIQEVKETLVKHPRYTGTNKTEQIKLTAPGGGDPEVTFMWTN
CRGEFLYCKMNWFLNWVENIQNGSRWTSQNQKERQRRNYVPCHIRQIINTWHKVGKNVYLPPREGDLTCNSTVTSLIAEI
DWINGNETNITMSAEVAELYRLELGDYKLVEITPIAFAPTSVKRYTTTGASRNKRGVFVLGFLGFLATAGSAMSAASVTL
SAQSRTLLAGIVQQQQQLLDVVKRQQELLRLTVWGAKNLQTRVTAIEKYLKDQAQLNSWGCAFRQVCHTTVPRPNDTLTP
NWNNMTWQEWEKQVNFLEANITQSLEEAQIQQEKNTYELQKLNSWDIFGNWFDLTSWIKYIQYGVLIVLGVIGLRIVIYV
VQMLARLRQGYRPVFSSPPAYVQQIPIQTGQELPTKEGEEGDGGGRGGNRSWPWQIEYIHFLIRQLIRLLTWLFSSCRDW
LLRNCQTLQPVLQSLSRTLQRAREVIRVQIAYLQYGWRYLQEAAQAWWKFVRETLASAWRDLWETLGRVGRGILAIPRRI
RQGLELTLL
>P28977 ~~~Segment 4~~~Envelope glycoprotein~~~
MFLQTALLLLSLGVAEPDCNTKTATGPYILDRYKPKPVTVSKKLYSATRYTTSAQNELLTAGYRTAWVAYCYNGGLVDSN
TGCNARLLHYPPSRDELLLWGSSHQCSYGDICHDCWGSDSYACLGQLDPAKHWAPRKELVRRDANWKFAYHMCNIDWRCG
VTTSPVFFNLQWVKNEVKVSTLLPNGSTVEHSAGEPLFWTEKDFSYLVKDNFEIQREEVKISCFVDPDYWVGERKTKKAF
CQDGTNFFEVTSHQFCHQYACYNFSKDELLEAVYKERAHEKSKDLPFGNKSWTVVTASIDDLHALSAAQAFELEGLRASF
AELDSRFRQLSEILDTVISSIAKIDERLIGRLIKAPVSSRFISEDKFLLHQCVDSVANNTNCVGDSAYVDGRWTHVGDNH
PCTTVVDEPIGIDIYNFSALWYPSAAEVDFRGTVQSEDGWSFVVKSKDALIQTMMYTKNGGKGTSLTDLLDYPSGWLKGQ
LGGLLYGNIGVYLLIAFAFVLLIRLIKSAGLC
>P03379 ~~~env~~~Envelope glycoprotein gp160~~~
MASKESKPSRTTWRDMEPPLRETWNQVLQELVKRQQQEEEEQQGLVSGKKKSWVSIDLLGTEGKDIKKVNIWEPCEKWFA
QVVWGVLWVLQIVLWGCLMWEVRKGNQCQAEEVIALVSDPGGFQRVQHVETVPVTCVTKNFTQWGCQPEGAYPDPELEYR
NISREILEEVYKQDWPWNTYHWPLWQMENMRQWMKENEKEYKERTNKTKEDIDDLVAGRIRGRFCVPYPYALLRCEEWCW
YPESINQETGHAEKIKINCTKAKAVSCTEKMPLAAVQRVYWEKEDEESMKFLNIKACNISLRCQDEGKSPGGCVQGYPIP
KGAEIIPEAMKYLRGKKSRYGGIKDKNGELKLPLSVRVWVRMANLSGWVNGTPPYWSARINGSTGINGTRWYGVGTLHHL
GYNISSNPEGGICNFTGELWIGGDRFPYYYKPSWNCSQNWTGHPVWHVFRYLDMTEHMTSRCIQRPKRHNITVGNGTITG
NCSVTNWDGCNCTRSGNHLYNSTSGGLLVIICRQNSTITGIMGTNTNWTTMWNIYQNCSRCNNSSLDRTGSGTLGTVNNL
KCSLPHRNESNKWTCKSQRDSYIAGRDFWGKVKAKYSCESNLGGLDSMMHQQMLLQRYQVIRVRAYTYGVVEMPQSYMEE
RGENRRSRRNLQRKKRGIGLVIVLAIMAIIAAAGAGLGVANAVQQSYTRTAVQSLANATAAQQEVLEASYAMVQHIAKGI
RILEARVARVEALVDMMVYQELDCWHYQHYCVTSTRSEVANYVNWTRFKDNCTWQQWEEEIEQHEGNLSLLLREAALQVH
IAQRDARRIPDAWKAIQEAFNWSSWFSWLKYIPWIIMGIVGLMCFRILMCVISMCLQAYKQVKQIRYTQVTVVIEAPVEL
EEKQKRNGDGTNGCASLEHERRTSHRSFIQIWRATWWAWKTSPWRHNWRTMPYITLLPILVIWQWMEENGWNGENQHKKK
KERVDCQDREQMPTLENDYVEL
>P35954 ~~~env~~~Envelope glycoprotein gp160~~~
MASKESKPSRTTRRGMEPPLRETWNQVLQELVKRQQQEEEEQQGLVSGKKKSWVSIDLLGTEGKDIKKVNIWEPCEKWFA
QVVWGVLWVLQIVLWGCLMWEVRKGNQCQAEEVIALVSDPGGFQRVQHVETVPVTCVTKNFTQWGCQPEGAYPDPELEYR
NISREILEEVYKQDWPWNTYHWPLWQMENMRQWMKENEKEYKERTNKTKEDIDDLVAGRIRGRFCVPYPYALLRCEEWCW
YPESINQETGHAEKIKINCTKAKAVSCTEKMSLAAVQRVYWEKEDEESMKFLNIKACNISLRCQDEGKSPGGCVQGYPIP
KGAEIIPEAMKYLRGKKSRYGGIKDKNGELKLPLSVRVWVRMANLSGWVNGTPPYWSARINGSTGINGTRWYGIGTLHHL
GCNISSNPERGICNFTGELWIGGDKFPYYYTPSWNCSQNWTGHPVWHVFRYLDMTEHMTSRCIQRPKRHNITVGNGTITG
NCSVTNWDGCNCTRSGNHLYNSTSGGLLVIICRQNSTITGIMGTNTNWTTMWNIYQNCSRCNNSSLDRTGSGTLGTVNNL
KCSLPHRNESNKWTCKSQRDSYIAGRDFWGKVKAKYSCESNLGGLDSMMHQQMLLQRYQVIRVRAYTYGVVEMPQSYMEA
QGENKRSRRNLQRKKRGIGLVIVLAIMAIIAAAGAGLGVANAVQQSYTRTAVQSLANATAAQQEVLEASYAMVQHIAKGI
RILEARVARVEALVDRMMVYQELDCWHYQHYCVTSTRSEVANYVNWTRFKDNCTWQQWEEEIEQHEGNLSLLLREAALQV
HIAQRDARRIPDAWKAIQEAFNWSSWFSWLKYIPWIIMGIVGLMCFRILMCVISMCLQAYKQVKQIRYTQVTVVIEAPVE
LEEKQKRNGDGTNGCASLERERRTSHRSFIQIWRATWWAWKTSPWRHNWRTMPYITLLPILVIWQWMEENGWNGENQHKK
KKERVDCQDREQMPTLENDYVEL
>Q88938 ~~~env~~~Envelope glycoprotein~~~
MDTPGSLQVIAIISLLLVGGASQPATFLEKALPTDGPSLETIEHKTEMVNTTRSEEQSPVRPSKTRQQLIDETPEICANA
WVIRLITEFPTELGNMSQKQKTIAIQVHNTTMTMEETVFSLVSHVNKKNYEIHNVSGICTKYQLVPGNFTCSTRKCISQT
KEKRIISTTVKDTYEVYLPFAWSQKPTGGDKYPEPQIGYNTGTGRLNQWNKDEFVIKQCRKKRGKRQITVPNSTLSPTGT
TDFTKFTPNPISPNSTALNELEQKTTPIGTEQPFNNEKWQNLIFGNIVTKMDPQCEAELFQQFNISDKTVQVEFKVTSLP
GQNISCQAIYNTEHGINIENKNCVISLIKENRKIKAHAYITRTGSYEWYAQQVTSKGIIQEVRNLVTIVECECPIVKPLP
QGGIIPLTMPMRVLTNPSPILIHSALKFDLSKFGLSPCSFSPMEWQTYITKPLKRAMHGFEVHQRKKRDLGIGLHSTLNS
WWNGANSLGLTVESADRQKYDQKILKVLQNLAVQQRTDVKNQQTLGKALETPIYTITLQLADSLTAAILKHEQQQNVGIT
CKDIAILTVTQIATYLRDIQHEHLPVWFIEQITNQILLPVGQVIMPEITAPPILNPLIGWNQSVLVIGLTHQLTITTVQQ
PLYKAANMGNFQDWTPFPPFILANKTHGFSIDCPIMRNSFLCHTLPTPVKLSEWERSTSTIYQTSPQVWITPEGKACLNH
RNITVQDRTCLINKPGCFIPKHPWSAGKQTIVPTQYIQQNFVPDTIDTEDNQTRVLQKEMIEAISKAKRDYGVLKQGQIA
LIRHHEAITTILGQEATYSIKETQALISSIEQEAWYNNLFSWYDGSVWSQLQLIIVVITCTIPLLWVLNTCLFFKLRRAI
RRERDNNIVVEYQAQTRGRRTHMTEPITKKQRAKLLRHAKTNRRLPRSLRATPAVSAFEMVTFDPQEETVEINRIDPSHE
NNDHGGPMNMAPIISADSYALPTPYITIMLDRELLNQGMRKVITLLNDPAREVFNKAYNLVTTNHFTLAYGCDESAGWVN
QHAEYMGKPVIVTLAGLVITPVGLAWIPLPQQEPLEKLFMVPNSMPHVTVAMADYHETKEMGKIVKDINNEELLLVKPQL
FKWGPEGFFVACPLVIRGVVTGHSLLHIACPATAVQAEGT
>Q65150 ~~~~~~Lectin-like protein EP153R~~~
MYFKKKYIGLIDKNCEKKILDDSSTIKICYILIGILIGTNMITLIYNFIFWDNYIKCYRNNDKMFYCPNDWVGYNNICYY
FSNGSFSKNYTAASNFCRQLNGTLANNDTNLLNLTKIYNNQSMYWVNNTVILRGDNKYSQKVNYTDLLFICGK
>P0CA64 ~~~~~~Lectin-like protein EP153R~~~
MFSNKKYIGLIDKYCEKKILDDSSTIKICYILIGILIGTNMITLIYNFIFWENYITCNQKDKTFYCPKDWVGYNNVCYYF
GNDEKNYNNASNYCKQLNSTLTNNNTNLVNLTKTLNLTKTYNHESNYWVNYSLIKNESVLLRNSGYYKKQKHVSLLYICS
K
>P17151 ~~~~~~Early phosphoprotein p84~~~
MDLPTTVVRKYWTFANPNRILHQSVNQTFDVRQFVFDTARLVNCVDGDGKVLHLNKGWLCATIMQHGEASAGAKTQQGFM
SIDITGDGELQEHLFVRGGIVFNKSVSSVVGSSGPNESALLTMISENGNLQVTYVRHYLKNHGESSSGGGGCGAASTASA
VCVSSLGGSGGTRDGPSAEEQQRRRQEQRHEERRKKSSSSAGGGGGGGAGGGGGGGGSGGQHSSDSANGLLRDPRLMNRQ
KERRPPPSSENDGSPPLREAKRQKTTAQHEGHGGGGKNETEQQSGGAGGGGGGGSGRMSLPLDTSEAVAFLNYSSSSSAV
SSSSNNHHHHHHHHNAVTDVAAGTDGALLLPIERGAVVSSPSSTSPSSLLSLPRPSSAHSAGETVQESEAAATAAAAGLM
MMRRMRRAPAEAAEAPPQSEEENDSTTPVSNCRVPPNSQESAAPQPPRSPRFDDIIQSLTKMLNDCKEKRLCDLPLVSSR
LLPETSGGTVVVNHSSVARTAAAVSAAGVGPPAAACPPLVTTGVVPSGSVAGVAPVAAAIETPAAPPRPVCEIKPYVVNP
VVATAAAASNSSSSSSAPLPPPPPPSGGRRGRARNNTRGGGGGGGGRNSRRQAASSSSSSSRRSRRRNNRHEDEEDNDPL
LRLSQVAGNGRRRGPSFLEDGLEIIDPSEEAAIAAASIAAFFDD
>P00534 2.7.10.1~~~V-ERBB~~~Tyrosine-protein kinase transforming protein erbB~~~
MKCAHFIDGPHCVKACPAGVLGENDTLVWKYADANAVCQLCHPNCTRGCKGPGLEGCPNGSKTPSIAAGVVGGLLCLVVV
GLGIGLYLRRRHIVRKRTLRRLLQERELVEPLTPSGEAPNQAHLRILKETEFKKVKVLGSGAFGTVYKGLWIPEGEKVKI
PVAIKELREATSPKANKEILDEAYVMASVDNPHVCRLLGICLTSTVQLITQLMPYGCLLDYIREHKDNIGSQYLLNWCVQ
IAKGMNYLEERRLVHRDLAARNVLVKTPQHVKITDFGLAKLLGADEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSY
GVTVWELMTFGSKPYDGIPASEISSVLEKGERLPQPPICTIDVYMIMVKCWMIDADSRPKFRELIAEFSKMARDPPRYLV
IQGDERMHLPSPTDSKFYRTLMEEEDMEDIVDADEYLVPHQGFFNSPSTSRTPLLSSLSATSNNSATNCIDRNGQGHPVR
EDSFVQRYSSDPTGNFLEESIDDGFLPAPEYVNQLMPKKPSTAMVQNQIYNNISLTAISKLPMDSRYQNSHSTAVDNPEY
LNTNQSPLAKTVFESSPYWIQSGNHQINLDNPDYQQDFLPNETKPNGLLKVPAAENPEYLRVAAPKSEYIEASA
>P04308 3.6.4.-~~~~~~Early transcription factor 70 kDa subunit~~~
MNTGIIDLFDNHVDSIPTILPHQLATLDYLVRTIIDENRSVLLFHIMGSGKTIIALLFALVASRFKKVYILVPNINILKI
FNYNMGVAMNLFNDEFIAENIFIHSTTSFYSLNYNDNVINYNGLSRYNNSIFIVDEAHNIFGNNTGELMTVIKNKNKIPF
LLLSGSPITNTPNTLGHIIDLMSEETIDFGEIISRGKKVIQTLLNERGVNVLKDLLKGRISYYEMPDKDLPTIRYHGRKF
LDTRVVYCHMSKLQERDYMITRRQLCYHEMFDKNMYNVSMAVLGQLNLMNNLDTLFQEQDKELYPNLKINNGVLYGEELV
TLNISSKFKYFINRIQTLNGKHFIYFSNSTYGGLVIKYIMLSNGYSEYNGSQGTNPHMINGKPKTFAIVTSKMKSSLEDL
LDVYNSPENDDGSQLMFLFSSNIMSESYTLKEVRHIWFMTIPDTFSQYNQILGRSIRKFSYADISEPVNVYLLAAVYSDF
NDEVTSLNDYTQDELINVLPFDIKKLLYLKFKTKETNRIYSILQEMSETYSLPPHPSIVKVLLGELVRQFFYNNSRIKYN
DSKLLKMVTSVIKNKEDARNYIDDIVNGHFFVSNKVFDKSLLYKYENDIITVPFRLSYEPFVWGVNFRKEYNVVSSP
>P20636 ~~~~~~Early transcription factor 82 kDa subunit~~~
MRYIVSPQLVLQVGKGQEVERALYLTPYDYIDEKSPIYYFLRSHLNIQQPEIVKRHILLTLRMTQLKGYLGNLLDIKDDI
IIYSHKNNLEYSYVDNTIFNPFVYTQKKTLLKNDSFLYNVYPGACDFLVIWVARACDTSIPEFGSYEDVDNNIIKFETML
MEVFPQLDLDITVESKFNNIFRTNLKLTGLKKIIQRVQDLDINYKSLLSRYDEHFINMTGNHFILNDEQLNLSIWDLDGT
LALSSDGDTVMINNVKLFTDLVSDIDTQMERIKGDITYKVHLATPINSRIKLDIETSFIFIETATNNILLSSDKKISIIL
AKNHISIKVKNHIPNIEKYFTFLVIAINAMFNSVQKSADFTKVETVYWSRICQNTKNKNRKPIIINYLDPGMKKISNNFY
KSDEKEVFINDNGIMFTCMDPLGKYNKVGFLNIFHDMRKYCIPCCFLHDQSHRSTFSSCVHQIDVEKKIVSPYILNFGKV
VTESKMSFLPIIFDAFLNDGMTANMEQDNKRLKETSGYHIVRCCAGDDIVRLRTTSDIIQFVNEDKNILIVNDMVYFPMN
ASDIGKKIHILIQEIVHEVMIVKKKESSDKIDFFPPNYKLLKDLFPKQTIQTPIQSDAGMVLTTDGFYIDGKLFNEDLSS
KYVTFTKNVIASDAVAKYFSPLFKYVISEAKDRFIKTWMINIMIHMNVDPNNIIPTLEKYYPNSGRAQIN
>Q89525 3.6.4.13~~~~~~Early transcription factor large subunit homolog~~~
MAYPELDAADFLQQLARRKEFKSLISPPVDQKELIRDLRAHFVQIGGPGCEKGGRAFFPCDPYASPFPSIKGLQLHNAQL
FVQNFQNPNTPYSRLLLNWQTGTGKSIAAIAIARQFMNHYMNFIENAPWIFVVGFTRAIIQTEMLRRPELGFVSYKEVAE
LHRLLHIAKQSGSTTSVESRHLNGFVSTLKRRLTDRNRGGFFQFYGYKEFASKLFNITSKGEEKNFDVLSLFHRSDEAED
TLNENDISQFVQKISEAETNGLIRVNQKIMEQLRGGLLIADEIHNVYNIQERNNYGIALQYVLDAFPPHQAPRAVFMSAT
PVTGSVMEYVDLLNLLVPRHELPNGQPLQRQQLFDSSGHSVKWKKDALALVERLSTGRVSFLLDTNTNFYPERIFAGKML
SYKDETLPYLHFIECPMSEYQLETLKQLGPDPKISSNAYSIYDMVFPNPKFSKQTEPKAYGLFNSTETPTALSMASTDWL
LENGVQIIEPSRRAPFNVSGSFLSLQPPTHISGLAFYSGKYTQMMKDILSIIRQGRGKILIYHNRVRMSGVLILQEILQS
NGILNEVSSPVGTTRCSICAAIRDEHTHSDHQFIPVRFTILHSEIEPAVRERSLALFNASSNLEGHQLRILIGSKVIVEG
LNFQAVRYEMIMSLPLDIPRLIQVFGRVVRKNSHMELPPSERNVTIYLYVSTTPDGGPELAKYAQKLKEYILIQEGDKAL
RKHAIDGFTNQIKIDKPMLESLPLSPSITPANVGATVLNTFEAYGYGEQEVKTISNIIISLFMARPVWTYSELWKAVSTP
KLIQGITIDNKLFSEDNFALALISLCYSKNQCKELWIQNRLCTIMHVPAKPEHLYVAAVLNHKKEPVLDIETYIRDFQLP
AMHSIRITKYLEHSQTKEPFQVLYEKFQKDFQDEPMEQVLIHYPASFHYTMLEALIIDNLAGMGALVEVYKKFFIAFSKK
DIQPFPDIFKIISHVPGDDNTLVGYATEDSVRLITSREDKTWHEIPLYMLNINVKRKENDIVIGYMESKGKALKFKIRPP
IQVLKKNEITDIRMLNRGAVCETRGREEQQKIADQLGISLNLTKISAIKLCLLIRNNLLQKEMEARNQPNGMQDGIRWFY
LFNDKMPSLVHTS
>P0C9W4 ~~~~~~Putative membrane protein 162~~~
MYYPAVQVLIGIILVDNFNTEFLSSEKKNCKTDTDCKDKGHHCVRGTCTDISCLEAVKQDIKDINLDPSIRSCNYTPDFY
TFNSTTADLQSPFGKTRIDYGSIYTSDWSSIDHCQSLCCKHNDCIGWEFDKIESSHGGECYFYTNPHPALKNSNDTTIMG
IARNIL
>P0C9W5 ~~~~~~Putative membrane protein 164~~~
MYHPVVQVLIGLILVIILILGFYHLKKKSCKTDTDCKDNGHHCVRGTCTDISCLEAVKQDIKGINLDPSIRSCNYTPGFY
TFNSTTADLQSPFGKTRIYYGDVYATWSSVDYCQSLCLQHNGCIAWEFDDEKESPGQGSCYFYTNSHPALKNSNDTTIMG
IARNIL
>P0C9W6 ~~~~~~Putative membrane protein 165~~~
MYLVLLIAVILFIIVILMIFLISGLFYPEQEPALPISPPKKKCKTDTDCKDKGHHCVGGICTNMSCLDAIKYDIKDIKLD
PNIRSCNYTPKFYKFSNTTADLQSPFGKTRIDDAELYDPHSGEDFCQRLCLDRKDCIGWEFDQYYAKTTGECYFYIDPHP
ALKSKNDAVLAIARKVS
>P0C9W7 ~~~~~~Envelope protein 166~~~
MFYPVVQVLIGIILVIILILGFYHMKHKPPKKKCKTDTDCKDKGHHCVRGTCTDKSCLEAAKQDIKDIKLDPTIRSCDYA
PGFYRFNATTADLQSPFGKTRIDLGRVWTTWSKEDEYCQSLCLQRKGSIGWEFDELSLGGVGNCYCYTNSHPVLKNSNNT
TVMGIARNVL
>P0C9W8 ~~~~~~Envelope protein 167~~~
MYLVLLIAIILFITIILVIFLISGLFYPEQNPLLPISPPKKKCKIDVDCKDNGHHCVGGFCTKMNCLEAAKYDIKGIKLD
PNIRSCNYTPKFYKFSSTADPQSPFGKSRIEYGELYDPHSGEEFCESLCANYPGCISWEYDQISGKTTGNCYFYRNPHPA
LKYKSDAVMAIPRKVL
>P0C9W9 ~~~~~~Envelope protein 168~~~
MFYPVVQILIGIILVIILILGFYHLKRKPPKKKCKTDTDCKDKGHHCVRGACTDKSCLEAVKQDIKDIKLDPTIRSCDYA
PGFYRFNATTADLQSPFGKTRIDLGKVWTTWSKEDEYCQSLCLQHKGSIGWEFDEMSLRGEGNCYCYTNSHPALKNSNNT
TVMGIARNVL
>P0C9X0 ~~~~~~Envelope protein 169~~~
MKKYIKMYLVLLIAIILFITILVIFLISGLFYPEQNPLLPISPPKKKCKTDTDCKDKGHHCVGGFCTNMSCVEAAKYDIK
DIKIDPNIRSCNYTPKFYKFSNTTADLQSPFGKSRIDYGWIYSPHSNEDSCQSFCANYPEGCIAWEYDQFSGTTTGECFL
YTNPHPVLKYKNGATVMAIPRKVL
>P28981 ~~~~~~Envelope protein UL45 homolog~~~
MEDYKLLQLETATVDAQAPPLPTKTVPVFAPPLSTPPQPNELVYTKRRRTKRKAKCRCLFFTMGMFALGVLMTTAILVST
FILTVPIGALRTAPCPAETFGLGDECVRPVLLNASSNTRNISGVGAVCEEYSEMAASNGTAGLIMSLLDCLNVGDSESVM
NKLNLDDTQLAYCNVPSFAECYTKGFGVCYAARPLSPLGELIYKARQALRLDHIIPFPR
>P06483 ~~~~~~Envelope protein UL45~~~
MAFRASGPAYQPLAPAASPARARVPAVAWIGVGAIVGAFALVAALVLVPPRSSWGLSPCDSGWQEFNAGCVAWDPTPVEH
EQAVGGCSAPATLIPRAAAKHLAALTRVQAERSSGYWWVNGDGIRTCLRLVDSVSGIDEFCEELAIRICYYPRSPGGFVR
FVTSIRNALGLP
>Q8SCY1 ~~~~~~Peptidoglycan hydrolase gp181~~~
MAKKVTLPKGQTGATGTTLGQAGNILDLSDVDDIFGDTPKAKKGSPVTEFFNGIKQGLFDSVKPQQALKAFMRSAAPDGF
SRMFGVYEDTMSTIRDVKDSVERTSASDLLFLTREAQDLAVKLKDKVPASVFDRLNNRLESQIENYKYAEDSNRNYKEIR
RRMEAERDEDELKSAIDQVTLVQRDLAIKAEQGEVKRFAIGQAERGIRDKVSADRFDWMAKAMGQTVDNLSKLASYNEQV
NYSIQKKGLEIQFRSMLHLRKIAQQTEATMELLNNGFAALVRNTGIPDHKKSSMKDLVGFNAAQRVSSSFVDNAIQTLPN
FLGNFGSAVTNNATRFANENIRNFADAVRAGNMFGADAWENRYNIAGQFAGSYLGDWTRNSVIPVLGRMARPGIERFSNN
YLGGRHNQASYLLDNFPAWTQEYMNNYQNTYGARGILRDIMAPFIPQFTLQDRLKTGSYQTIGQDSGFNQLTQRTIVEAI
PGYLSRLLQETRMIRTGRDDITREVFDLSTGSFMSVEDSAANTERRLVSRSTVRGVSGALTDVLEAFDPNKELSIDARKA
LTERLIRDANMNKRFDPEAYARAGGYDRSKVSGETIKELTEYIKRQYNIGADGRMASTNENFARRQEISTLFLDVRNFSR
DPIKEIERLTNAGKTDQLREMGIIITEQGIDRINYPRIWELMSSEVKYEGWGNDNPFDRYSDTPPSDNGGPEQLSPLGDQ
NNPHFIGPMYQSQTERKMRQAQEQVIRASKLAQEQANKGYDYAQQKVSLATDYVRDNVPNQFSELRRNVNIPASMNFGDL
RDQLYGQAGDMYSRAVNYSNQLNGYSDIINQSIADLYTKANTFTPVIKGMDFLNGNLIDINTGMIVEKISDITGEIKNQA
GITVVTAQEVATGLYNQRGDLLTKATDIASQLRDKAAQIAGDARERLTQGLDNVSDMAKDWYIPGREEAVILGRDLLAGE
YIDTATNQIITNLKDAKGTIINRAGDVVVTAQELAKGLIRSDGFNLRRNIADTSNWIQRNVLGGGSTTQKIFNAMGTVAN
KAKDFTIGLGKDILSNRDAYLPGMLKPVLQKVKLKAGEYYTTAGNLLKSFDEINGPVLDRDGNIVVDEEQISELINSDGS
KHTAAKSKGLFRTGLGNLARGYANMSMRYWKWLGKKSVDTAKGMAGLGYKLLGSPFKKRFSAFTGKVETQIDKKALDTTT
DQLLAGIWEELRNQKPDANKPRRGSWQDLTSRVSDTLNGKNNGDEETTESKGLFGKLGDTLKNIFGKKKGDEEDEGLLED
LGLGGKKGGKWAAARQILGRGALAIGGGALSTAAAYASFGGTGASTNDKLAGAAIVTSNPIMWALKDFLIKPVLGWRGSQ
KFKDDLISYRMMQYGATTTDQMNKVTELEQLVSSVATRGGDASFDVRALNARDIIKIFGYGADDGPAIMRLANWIDFRFK
PIFEAWLKGLSKINRSDVDISEVDSKVPNELKGQLIRSVSFPYEGNTPYLVLNNPFGEEDLSIDVASIQMKEKELLNKYS
ATEKTSVAPKATSSSFKESTTDVINDTITTIKSKSTDITNWFRDSTIGKAIKAVSPVESIRKMVTTTVDTIIPKANASDS
LTSLQALRVHAYGMQGLDLAAVNGLLSIESLVNDKMRVANGKATYTGDIEELIKWTGQAFGMVTTSDGPDRVKVVDWLYR
RFLPVFKAFIVTARSVSTSITLSQIETLTATQRVQIANAIMGATDDEGISIWKAPSIFNIVGDMDSVEDLAKISLDEIKK
EAETEVAEAPGKSKSAQIAGKNDAASGRSFASRIIDNVKSTFNSATTKVTNWMENTSARVSQVIGRAREGVTDTYYTAKY
KLGAGGELTPTGQTYGQLATGNGGVWENIPMPQSNKSRDAAQATFKAVSEMTGVPVELLNIFCGIESSFNYNAKAPTSSA
AGWFQFIKSTWKGMLAKYGAKFGIPADDENGSLRFDPRINALMGAMFLRDNYEYLENALGRAPTDVDLYLAHFMGPAGAR
KFLTRDQNSIGAEIFPDQARANRSIFFKTDGSARTLGEIYQVMENKVAKFRTGGGKNANSQSLGKPKSTEELMNDAATAK
QKDMATDKELIGGAADTSITDSSNNKIGLGKIMSGMASPLRTNAPSMMLPGAPSSATDVSSGQQPVVDTGAATQATVRAS
QIEEQRKVVTSQDKAMLDIASEQLSVLKQFHADILNYIKNKAANPSAQTGQEQANTIAPSQRPGRVVDNRPLPIRLR
>Q7Y2C9 3.2.1.17~~~~~~Peptidoglycan hydrolase gp36~~~
MAESQRASQELGINVGQTQLQPGQSARRGVRDSEVNYSGPSVGSQILDGILGAGQQIAGKWFEHNVQQEVLRGERARMAG
EAEEAVDSNVLAKPFVKGGWRKQDYRIAQADFSLKMQRFIANKGREMTPEEFRKYLSQEATHVLDSTEGMNPNDALQALA
QQQKAEEQLFGMQAKAYMDWSIDQAARGFRTQGNSILAKAVQAQATGDELSRQLSLEEAGLFYTNIMTSEDIPLEVRDKV
GMQFLAASLDMNQRGIYEGLRDAGFLDSMSFDDRRALNGLYEKSKAQTRAKESMATLRADADFQQRVANGAITDLAEVEA
YSRGMVEEGRWSDAQAISFMTKAMTGLGNAQRMQGIMAALEAGDINALHTLGTNVTEALEQWDKMQAANGSSLTDRLVQG
TQLGLRLGTFPKTYGESVGSAVRMIQAAKEGEANPELVNTLNSIFEQVASAQEINPSAGNVMLSGIPEAEQGAVAWALKQ
MKMGIAPAQALREFSANAEVVKQMDEFEKGQNTKAFKDNLGKQVNDKFVNNIFGRAWNMLTGESDLSNNEAVLSMYRRAT
IDEANWLASDRKHAGLLTSDTGREALLEIAAANVRNRTIQVGEGRNLKEGDLFSRRDSAPLILPRGTTAEQLFGTNDTET
IGTVLAEQHKPHVEGLLGYKSVVAFEYDRTSGSLLAVEYDENGVALDRTRVDPQAVGKEVLKRNADKLNAMRGAEYGANV
KVSGTDIRMNGGNSAGMLKQDVFNWRKELAQFEAYRGEAYKDADGYSVGLGHYLGSGNAGAGTTVTPEQAAQWFAEDTDR
ALDQGVRLADELGVTNNASILGLAGMAFQMGEGRARQFRNTFQAIKDRNKEAFEAGVRNSKWYTQTPDRAEAFIKRMAPH
FDTPSQIGVDWYSAATAE
>P26746 ~~~4~~~Peptidoglycan hydrolase gp4~~~
MQIKTKGDLVRAALRKLGVASDATLTDVEPQSMQDAVDDLEAMMAEWYQDGKGIITGYVFSDDENPPAEGDDHGLRSSAV
SAVFHNLACRIAPDYALEATAKIIATAKYGKELLYKQTAISRAKRAPYPSRMPTGSGNSFANLNEWHYFPGEQNADSTTP
HDEGNG
>P07582 ~~~P5~~~Peptidoglycan hydrolase gp5~~~
MSKDSAFAVQYSLRALGQKVRADGVVGSETRAALDALPENQKKAIVELQALLPKAQSVGNNRVRFTTAEVDSAVARISQK
IGVPASYYQFLIPIENFVVAGGFETTVSGSFRGLGQFNRQTWDRLRRLGRNLPAFEEGSAQLNASLYAIGFLYLENKRAY
EASFKGRVFTHEIAYLYHNQGAPAAEQYLTSGRLVYPKQSEAAVAAVAAARNQHVKESWA
>Q9XJR8 ~~~VII~~~Peptidoglycan hydrolase P7~~~
MINKTTIKTVLITLGVLAAVNKVSALRSVKRLIS
>P27380 4.2.2.n1~~~VII~~~Transglycosylase~~~
MSGALQWWETIGAASAQYNLDPRLVAGVVQTESSGNPRTTSGVGAMGLMQLMPATAKSLGVTNAYDPTQNIYGGAALLRE
NLDRYGDVNTALLAYHGGTNQANWGAKTKSYPGKVMKNINLLFGNSGPVVTPAAGIAPVSGAQEMTAVNISDYTAPDLTG
LTMGAGSPDFTGGASGSWGEENIPWYRVDKHVANAAGSAYDAVTDAVSAPVEAAGNYALRGVVIIAAVAIVVVGLYFLFQ
DEINSAAMKMIPAGKAAGAAAKALA
>P03726 4.2.2.n1~~~~~~Peptidoglycan transglycosylase gp16~~~
MDKYDKNVPSDYDGLFQKAADANGVSYDLLRKVAWTESRFVPTAKSKTGPLGMMQFTKATAKALGLRVTDGPDDDRLNPE
LAINAAAKQLAGLVGKFDGDELKAALAYNQGEGRLGNPQLEAYSKGDFASISEEGRNYMRNLLDVAKSPMAGQLETFGGI
TPKGKGIPAEVGLAGIGHKQKVTQELPESTSFDVKGIEQEATAKPFAKDFWETHGETLDEYNSRSTFFGFKNAAEAELSN
SVAGMAFRAGRLDNGFDVFKDTITPTRWNSHIWTPEELEKIRTEVKNPAYINVVTGGSPENLDDLIKLANENFENDSRAA
EAGLGAKLSAGIIGAGVDPLSYVPMVGVTGKGFKLINKALVVGAESAALNVASEGLRTSVAGGDADYAGAALGGFVFGAG
MSAISDAVAAGLKRSKPEAEFDNEFIGPMMRLEARETARNANSADLSRMNTENMKFEGEHNGVPYEDLPTERGAVVLHDG
SVLSASNPINPKTLKEFSEVDPEKAARGIKLAGFTEIGLKTLGSDDADIRRVAIDLVRSPTGMQSGASGKFGATASDIHE
RLHGTDQRTYNDLYKAMSDAMKDPEFSTGGAKMSREETRYTIYRRAALAIERPELQKALTPSERIVMDIIKRHFDTKREL
MENPAIFGNTKAVSIFPESRHKGTYVPHVYDRHAKALMIQRYGAEGLQEGIARSWMNSYVSRPEVKARVDEMLKELHGVK
EVTPEMVEKYAMDKAYGISHSDQFTNSSIIEENIEGLVGIENNSFLEARNLFDSDLSITMPDGQQFSVNDLRDFDMFRIM
PAYDRRVNGDIAIMGSTGKTTKELKDEILALKAKAEGDGKKTGEVHALMDTVKILTGRARRNQDTVWETSLRAINDLGFF
AKNAYMGAQNITEIAGMIVTGNVRALGHGIPILRDTLYKSKPVSAKELKELHASLFGKEVDQLIRPKRADIVQRLREATD
TGPAVANIVGTLKYSTQELAARSPWTKLLNGTTNYLLDAARQGMLGDVISATLTGKTTRWEKEGFLRGASVTPEQMAGIK
SLIKEHMVRGEDGKFTVKDKQAFSMDPRAMDLWRLADKVADEAMLRPHKVSLQDSHAFGALGKMVMQFKSFTIKSLNSKF
LRTFYDGYKNNRAIDAALSIITSMGLAGGFYAMAAHVKAYALPKEKRKEYLERALDPTMIAHAALSRSSQLGAPLAMVDL
VGGVLGFESSKMARSTILPKDTVKERDPNKPYTSREVMGAMGSNLLEQMPSAGFVANVGATLMNAAGVVNSPNKATEQDF
MTGLMNSTKELVPNDPLTQQLVLKIYEANGVNLRERRK
>P04521 3.1.11.-~~~~~~Exonuclease subunit 1~~~
MKILNLGDWHLGVKADDEWIRGIQIDGIKQAIEYSKKNGITTWIQYGDIFDVRKAITHKTMEFAREIVQTLDDAGITLHT
IVGNHDLHYKNVMHPNASTELLAKYPNVKVYDKPTTVDFDGCLIDLIPWMCEENTGEILEHIKTSSASFCVGHWELNGFY
FYKGMKSHGLEPDFLKTYKEVWSGHFHTISEAANVRYIGTPWTLTAGDENDPRGFWMFDTETERTEFIPNNTTWHRRIHY
PFKGKIDYKDFTNLSVRVIVTEVDKNLTKFESELEKVVHSLRVVSKIDNSVESDDSEEVEVQSLQTLMEEYINAIPDITD
SDREALIQYANQLYVEATQ
>P11108 3.1.11.-~~~~~~Probable exonuclease subunit 1~~~
MRILFSADHHIKLGQDKVPKEWQKRRFLMLGERLNDIFHNHNCDLHIAGGDILDVADPSSEEIELLEQFMSRLDHPGKIF
TGNHEMLTKTISCLYHYAGVINKVTSGKWEVITKPYRSPEFDIVPYDEIHKAKWKPPVSKLCFTHVRGEIPPHVKPEIDL
TKYNCYDTVIAGDLHSYTNSQTIGSTRLLYPGSPLTTSFHRERTKGTNGCFIIDTDTLKVEWIELGDLPQLIRKTIGAGE
EMEPSDYDRVVYEVTGDVVQLKSIKDSDLLDKKINHRVTKDAKLNLVDLDMLGELELYFREVEKLSQGDIDRILARAAKY
VKDYN
>P04522 3.1.11.-~~~~~~Exonuclease subunit 2~~~
MKNFKLNRVKYKNIMSVGQNGIDIQLDKVQKTLITGRNGGGKSTMLEAITFGLFGKPFRDVKKGQLINSTNKKELLVELW
MEYDEKKYYIKRGQKPNVFEITVNGTRLNESASSKDFQAEFEQLIGMSYASFKQIVVLGTAGYTPFMGLSTPARRKLVED
LLEVGTLAEMDKLNKALIRELNSQNQVLDVKKDSIIQQIKIYNDNVERQKKLTGDNLTRLQNMYDDLAKEARTLKSEIEE
ANERLVNIVLDEDPTDAFNKIGQEAFLIKSKIDSYNKVINMYHEGGLCPTCLSQLSSGDKVVSKIKDKVSECTHSFEQLS
THRDNLKVLVDEYRDNIKTQQSLANDIRNKKQSLIAAVDKAKKVKAAIEKASSEFIDHADEIALLQEELDKIVKTKTNLV
MEKYHRGILTDMLKDSGIKGAIIKKYIPLFNKQINHYLKIMEADYVFTLDEEFNETIKSRGREDFSYASFSEGEKARIDI
ALLFTWRDIASIVSGVSISTLILDEVFDGSFDAEGIKGVANIINSMKNTNVFIISHKDHDPQEYGQHLQMKKVGRFTVMV
>P11109 3.1.11.-~~~~~~Probable exonuclease subunit 2~~~
MSKITIKTLKFSNVMSYGKDIVIHFDKNPVTQLIGGNGLGKSTIATVIEELFYNKNSRGIKKDALFSWNAPKKEYDMHAY
FSKDEDEYELHKVVKSTAKVTLIKNGEDISGHTATQTYKMIEEIMGGDFQTFTKLIYQSVGSNLDFLKATDATRKAFLVN
LFNQEQYKEMSETIKADRKEIANTLNNLQGQMAVITKILNGKNNLGTLQEPVEVPEFDEEPLAQELTESKIKAALAKSQE
ANITKLRNLDKAVQVAEQSFEPFKNLPAPTDQNEEISSVTRDLTIVTSRASEVKKRYQKFKQEASNTECPTCGTHLNTTA
AQKAMDMARVEYDPLFKEKQSLEAKLEQLKKEQLEYVAYTRAKDALDKAVVARDEFKNSMSDASFEELNVQILQVQIRQL
EQEIADGRSKVAIAKEHNATVELANAKYKAKLEQIEKAEAEMTEITSKLDGVSEAVADLDILIAALKNLVGYKLEHSVKV
FEELINKYLSIMTGGKFALGFELDETKLQVVIFNDGNRTSMENCSTGQQSRINLATLLAIRMLLTSISKVNINLLFLDEV
ISFIDTKGLDTLVELLNEEESLNSIIVSHGHTHPLAHKITVKKDAEGFSYLE
>P04536 3.1.11.1~~~dexA~~~Exodeoxyribonuclease~~~
MFDFIIDFETMGSGEKAAVIDLAVIAFDPNPEVVETFDELVSRGIKIKFDLKSQKGHRLFTKSTIEWWKNQSPEARKNIA
PSDEDVSTIDGIAKFNDYINAHNIDPWKSQGWCRGMSFDFPILVDLIRDIQRLNGVSENELDTFKLEPCKFWNQRDIRTR
IEALLLVRDMTTCPLPKGTLDGFVAHDSIHDCAKDILMMKYALRYAMGLEDAPSEEECDPLSLPTKR
>P03697 3.1.11.3~~~exo~~~Exonuclease~~~
MTPDIILQRTGIDVRAVEQGDDAWHKLRLGVITASEVHNVIAKPRSGKKWPDMKMSYFHTLLAEVCTGVAPEVNAKALAW
GKQYENDARTLFEFTSGVNVTESPIIYRDESMRTACSPDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYMAQVQY
SMWVTRKNAWYFANYDPRMKREGLHYVVIERDEKYMASFDEIVPEFIEKMDEALAEIGFVFGEQWR
>P00638 3.1.11.3~~~6~~~Exonuclease~~~
MALLDLKQFYELREGCDDKGILVMDGDWLVFQAMSAAEFDASWEEEIWHRCCDHAKARQILEDSIKSYETRKKAWAGAPI
VLAFTDSVNWRKELVDPNYKANRKAVKKPVGYFEFLDALFEREEFYCIREPMLEGDDVMGVIASNPSAFGARKAVIISCD
KDFKTIPNCDFLWCTTGNILTQTEESADWWHLFQTIKGDITDGYSGIAGWGDTAEDFLNNPFITEPKTSVLKSGKNKGQE
VTKWVKRDPEPHETLWDCIKSIGAKAGMTEEDIIKQGQMARILRFNEYNFIDKEIYLWRP
>Q91DM1 ~~~GP2a~~~Envelope small membrane protein~~~
MGLVWSLISNSIQTIIADFAISVIDAALFFLMLLALAVVTVFLFWLIVAIGRSLVARCSRGARYRPV
>P0C6Y6 ~~~GP2b~~~Envelope small membrane protein~~~
MGSLWSKISQLFVDAFTEFLVSVVDIAIFLAILFGFTVAGWLLVFLLRVVCSALLRSRSAIHSPELSKVL
>A0MD31 ~~~GP2b~~~Envelope small membrane protein~~~
MGSLWSKISQLFVDAFTEFLVSVVDIVIFLAILFGFTVAGWLLVFLLRVVCSALLRSRSAIHSPELSKVL
>P20220 ~~~~~~Protein F-112~~~
MAQTLNSYKMAEIMYKILEKKGELTLEDILAQFEISVPSAYNIQRALKAICERHPDECEVQYKNRKTTFKWIKQEQKEEQ
KQEQTQDNIAKIFDAQPANFEQTDQGFIKAKQ
>P20222 ~~~~~~Protein F-93~~~
MKSTPFFYPEAIVLAYLYDNEGIATYDLYKKVNAEFPMSTATFYDAKKFLIQEGFVKERQERGEKRLYLTEKGKLFAISL
KTAIETYKQIKKR
>A0A0S0N8M3 ~~~~~~Flavin-dependent lyase~~~
MKTDVIVVGAGLFGSIAAKALAQAGLAVVGVDDSRPGAGSLPAACLMKPSWFSSMGKDKFEPSLELLDRIYGVKDISFKV
GLLRATVHWCDPAQILGDEEVPVYREKVTALTRTSSGWAVSLEGREAALEARSVVVAAGVWTSELVRSQALGGLVGRAGV
AFRWQDMQLEEQFISPWAPYRQTVGFNISPTEVWVGDGSAIKPENWNQDRQNVSYSRCAQAIDRAGFGDQEAGRVKALYG
IRPYIAGVKPCLLEEVEPGLWALTGGAKNGTISAGWAASELVRRIK
>P06229 3.1.11.-~~~~~~Flap endonuclease~~~
MSKSWGKFIEEEEAEMASRRNLMIVDGTNLGFRFKHNNSKKPFASSYVSTIQSLAKSYSARTTIVLGDKGKSVFRLEHLP
EYKGNRDEKYAQRTEEEKALDEQFFEYLKDAFELCKTTFPTFTIRGVEADDMAAYIVKLIGHLYDHVWLISTDGDWDTLL
TDKVSRFSFTTRREYHLRDMYEHHNVDDVEQFISLKAIMGDLGDNIRGVEGIGAKRGYNIIREFGNVLDIIDQLPLPGKQ
KYIQNLNASEELLFRNLILVDLPTYCVDAIAAVGQDVLDKFTKDILEIAEQ
>P20345 ~~~~~~Pre-neck appendage protein~~~
MSTKPELKRFEQFGEMMVQLYERYLPTAFDESLTLLEKMNKIIHYLNEIGKVTNELIEEWNKVMEWILNDGLEDLVKETL
ERWYEEGKFADLVIQVIDELKQFGVSVKTYGAKGDGVTDDIRAFEKAIESGFPVYVPYGTFMVSRGIKLPSNTVLTGAGK
RNAVIKFMDSVGRGESLMYNQNVTTGNENIFLSSFTLDGNNKRLGQGISGIGGSRESNLSIRACHNVYIRDIEAVDCTLH
GIDITCGGLDYPYLGDGTTAPNPSENIWIENCEATGFGDDGITTHHSQYINILNCYSHDPRLTANCNGFEIDDGSRHVVL
SNNRSKGCYGGIEIKAHGDAPAAYNISINGHMSVEDVRSYNFRHIGHHAATDPQSVSAKNIVASNLVSIRPNNKRGFQDN
ATPRVLAVSAYYGVVINGLTGYTDDPNLLTETVVSVQFRARNCSLNGVGLTGFSNSDNGIYVIGGSRGGDAVNISNVTLN
NSGRYGVSIGSGIENVSITNISGIGDGINSPVALVSTINSNPEISGLSSIGYPTAARVAGTDYNDGLTLFNGAFRASTTS
SGKIHSEGFIMGSTSGCEASVSKSGVLTSSSSKTSSERSLIAGSSTSEAKGTYNTILGSLGAVADEQFAALISASQSRAS
GNHNLILSSYGINTTGSYKVNGGFEKINWELDSLNGRIKARDTVTGGNTWSDFAEYFESLDGQVIETGYLVTLEKGKIRK
AEKGEKIIGVISETAGFVLGESSFEWQGAVLKNEFGGIIYEEVTTEDGVKFKRPLPSPDFDPNKNYIPRSQRREWHVVGL
LGQIAVRIDETVKQGHGIDAVGGVATDGDNFIVQEITTPYTKEKGYGVAIVLVK
>P10930 ~~~~~~Short tail fiber protein gp12~~~
MSNNTYQHVSNESRYVKFDPTDTNFPPEITDVQAAIAAISPAGVNGVPDASSTTKGILFIPTEQEVIDGTNNTKAVTPAT
LATRLSYPNATETVYGLTRYSTNDEAIAGVNNESSITPAKFTVALNNAFETRVSTESSNGVIKISSLPQALAGADDTTAM
TPLKTQQLAIKLIAQIAPSETTATESDQGVVQLATVAQVRQGTLREGYAISPYTFMNSSSTEEYKGVIKLGTQSEVNSNN
ASVAVTGATLNGRGSTTSMRGVVKLTTTAGSQSGGDASSALAWNADVIQQRGGQIIYGTLRIEDTFTIANGGANITGTVR
MTGGYIQGNRIVTQNEIDRTIPVGAIMMWAADSLPSDAWRFCHGGTVSASDCPLYASRIGTRYGGNPSNPGLPDMRGLFV
RGSGRGSHLTNPNVNGNDQFGKPRLGVGCTGGYVGEVQIQQMSYHKHAGGFGEHDDLGAFGNTRRSNFVGTRKGLDWDNR
SYFTNDGYEIDPESQRNSKYTLNRPELIGNETRPWNISLNYIIKVKE
>P03742 ~~~~~~Long-tail fiber protein gp35~~~
MEKFMAEFGQGYVQTPFLSESNSVRYKISIAGSCPLSTAGPSYVKFQDNPVGSQTFSAGLHLRVFDPSTGALVDSKSYAF
STSNDTTSAAFVSFMNSLTNNRIVAILTSGKVNFPPEVVSWLRTAGTSAFPSDSILSRFDVSYAAFYTSSKRAIALEHVK
LSNRKSTDDYQTILDVVFDSLEDVGATGFPRGTYESVEQFMSAVGGTNDEIARLPTSAAISKLSDYNLIPGDVLYLKAQL
YADADLLALGTTNISIRFYNASNGYISSTQAEFTGQAGSWELKEDYVVVPENAVGFTIYAQRTAQAGQGGMRNLSFSEVS
RNGGISKPAEFGVNGIRVNYICESASPPDIMVLPTQASSKTGKVFGQEFREV
>P03743 ~~~~~~Long-tail fiber protein gp36~~~
MADLKVGSTTGGSVIWHQGNFPLNPAGDDVLYKSFKIYSEYNKPQAADNDFVSKANGGTYASKVTFNAGIQVPYAPNIMS
PCGIYGGNGDGATFDKANIDIVSWYGVGFKSSFGSTGRTVVINTRNGDINTKGVVSAAGQVRSGAAAPIAANDLTRKDYV
DGAINTVTANANSRVLRSGDTMTGNLTAPNFFSQNPASQPSHVPRFDQIVIKDSVQDFGYY
>M1EAS5 ~~~~~~Long tail fiber protein Gp37~~~
MATIKQIQFKRTKVAGSRPTAAQLAEGELAINLKDRTIFTKDDLNQIIDLGFAKGGEVSGDITQIGNYTQTGNYNLTGDA
TISGKTTTSTLDVGSVSDLRQTNFRPVLSTTTGSNFIISNSGGLIKPITLTVEGTATSSNTILRHSVDTTVAASGFIDSI
NVSLNPTDGALVTALNGTVNIGSSLKTPKLSVSGAETALGDYSISIGDNDTGFKWNSDGVFSLVTDSNSIYTYSRDRTYS
NRPTNFRYTSDFDATTPALAPPGTWLASVETAIDGNAYGDGMSYLGYKDTAGYSFYFRGGGTFNVASKGGFNVDTAAAFA
KTVDVSDILTCSSIIKAKGPGQVDVTSAGNIALGGTIQWVPSYMSGSPNRARDTIATAAWGDADQRINVLETSDPHGWWY
YIQRAGSGSSSPTGIEGRVNGSWQASDLISDNTLRVAGAFTCTRRNSAGWGDNAGWYAGATPVVANQGNVQEMDPGVGGF
YPGFAQYNYNGTGWNQAFVLGLLGQGVQRWRRGVLALRGDGPVDAGQQIARWYFSQEDGSLESEGPLKAPSVQAGQITSF
GVNVTNALGSASIAIGDNDTGLRWGGDGIVQIVANNAIVGGWNSTDIFTEAGKHITSNGNLNQWGGGAIYCRDLNVSSDR
RIKKDIKAFENPVDILSTIGGYTYLIEKGFNEDGSQAYEESAGLIAQEVEAVLPRLVKISNDGTKDVKRLNYNGITALNT
AAINVHTKEINELKKQLKELKDIVKFLTK
>P03744 ~~~~~~Long-tail fiber protein gp37~~~
MATLKQIQFKRSKIAGTRPAASVLAEGELAINLKDRTIFTKDDSGNIIDLGFAKGGQVDGNVTINGLLRLNGDYVQTGGM
TVNGPIGSTDGVTGKIFRSTQGSFYARATNDTSNAHLWFENADGTERGVIYARPQTTTDGEIRLRVRQGTGSTANSEFYF
RSINGGEFQANRILASDSLVTKRIAVDTVIHDAKAFGQYDSHSLVNYVYPGTGETNGVNYLRKVRAKSGGTIYHEIVTAQ
TGLADEVSWWSGDTPVFKLYGIRDDGRMIIRNSLALGTFTTNFPSSDYGNVGVMGDKYLVLGDTVTGLSYKKTGVFDLVG
GGYSVASITPDSFRSTRKGIFGRSEDQGATWIMPGTNAALLSVQTQADNNNAGDGQTHIGYNAGGKMNHYFRGTGQMNIN
TQQGMEINPGILKLVTGSNNVQFYADGTISSIQPIKLDNEIFLTKSNNTAGLKFGAPSQVDGTRTIQWNGGTREGQNKNY
VIIKAWGNSFNATGDRSRETVFQVSDSQGYYFYAHRKAPTGDETIGRIEAQFAGDVYAKGIIANGNFRVVGSSALAGNVT
MSNGLFVQGGSSITGQVKIGGTANALRIWNAEYGAIFRRSESNFYIIPTNQNEGESGDIHSSLRPVRIGLNDGMVGLGRD
SFIVDQNNALTTINSNSRINANFRMQLGQSAYIDAECTDAVRPAGAGSFASQNNEDVRAPFYMNIDRTDASAYVPILKQR
YVQGNGCYSLGTLINNGNFRVHYHGGGDNGSTGPQTADFGWEFIKNGDFISPRDLIAGKVRFDRTGNITGGSGNFANLNS
TIESLKTDIMSSYPIGAPIPWPSDSVPAGFALMEGQTFDKSAYPKLAVAYPSGVIPDMRGQTIKGKPSGRAVLSAEADGV
KAHSHSASASSTDLGTKTTSSFDYGTKGTNSTGGHTHSGSGSTSTNGEHSHYIEAWNGTGVGGNKMSSYAISYRAGGSNT
NAAGNHSHTFSFGTSSAGDHSHSVGIGAHTHTVAIGSHGHTITVNSTGNTENTVKNIAFNYIVRLA
>Q6QGF0 ~~~~~~Straight fiber protein pb4~~~
MISNNAPAKMVLNSVLTGYTLAYIQHSIYSDYDVIGRSFWLKEGSNVTRRDFTGIDTFSVTINNLKPTTTYEVQGAFYDS
IIDSELLNAQIGINLSDKQTFKMKSAPRITGARCESEPVDVGVGAPIVYIDTTGEADYCTIELKDNSNANNPWVKYYVGA
LMPTIMFGGVPIGSYKVRISGQISLPDGVTIDSSGYYEYPNVFEVRYNFVPPAAPINIVFKAARIADGKERYDLRVQWDW
NRGAGANVREFVLSYIDSAEFVRTGWTKAQKINVGAAQSATIISFPWKVEHKFKVSSIAWGPDAQDVTDSAVQTFILNES
TPLDNSFVNETGIEVNYAYIKGKIKDGSTWKQTFLIDAATGAINIGLLDAEGKAPISFDPVKKIVNVDGSVITKTINAAN
FVMTNLTGQDNPAIYTQGKTWGDTKSGIWMGMDNVTAKPKLDIGNATQYIRYDGNILRISSEVVIGTPNGDIDIQTGIQG
KQTVFIYIIGTSLPAKPTSPAYPPSGWSKTPPNRTSNTQNIYCSTGTLDPVTNQLVSGTSWSDVVQWSGTEGVDGRPGAT
GQRGPGMYSLAIANLTAWNDSQANSFFTSNFGSGPVKYDVLTEYKSGAPGTAFTRQWNGSAWTSPAMVLHGDMIVNGTVT
ASKIVANNAFLSQIGVNIIYDRAAALSSNPEGSYKMKIDLQNGYIHIR
>Q775D6 ~~~mtd~~~Tail fiber receptor-binding protein~~~
MSTAVQFRGGTTAQHATFTGAAREITVDTDKNTVVVHDGATAGGFPLARHDLVKTAFIKADKSAVAFTRTGNATASIKAG
TIVEVNGKLVQFTADTAITMPALTAGTDYAIYVCDDGTVRADSNFSAPTGYTSTTARKVGGFHYAPGSNAAAQAGGNTTA
QINEYSLWDIKFRPAALDPRGMTLVAGAFWADIYLLGVNHLTDGTSKYNVTIADGSASPKKSTKFGGDGSAAYSDGAWYN
FAEVMTHHGKRLPNYNEFQALAFGTTEATSSGGTDVPTTGVNGTGATSAWNIFTSKWGVVQASGCLWTWGNEFGGVNGAS
EYTANTGGRGSVYAQPAAALFGGAWNGTSLSGSRAALWYSGPSFSFAFFGARGVCDHLILE
>Q858F5 3.2.1.-~~~~~~Tail spike protein~~~
MTVSTEVDHNDYTGNGVTTSFPYTFRIFKKSDLVVQVVDLNENITELILDTDYTVTGAGGYTCGDVVLSSPLANGYQISI
SRELPVTQETDLRNQGKFFAEVHENAFDKLTMLIQQVRSWLSLALRKPSFVANYYDALGNYIRNLRDPSRPQDAATKNYV
DNLSEGNNSYADNLFSRTLRVPEKINTLPSSLDRANKIPAFDSNGNAIVIIPQSGSASDVLIELAKPSGSGLVGFSHSNN
YNPGMVGEKLQNVVYPTDAPFYAPTDGTSDATTALQSAITHCEGKNAVLCINKSFSVSDSLSISSPLCVFAMNEQCGIVS
SAPAGHAAVIFNGDNICWNGGFIRGLNQPSSSTIRQDGVLLNGNDCVLDNVSINGFFAKGLHTSNADGSGVGIRDYGTRN
TISKCRVEYNKFGISLEGKDGWVLGNYVSNHYRMSSEAKPWDDTSNYWDGIVGGGEWLGVATGYLIDGNEFEDNGQSGIY
AGGNGGIFAKNRITNNHIHGNWNRGIDFGVVQRLANSDVYENIITDNIVHNNRAANIWLAGVRDSIINNNNSWFTDDYRS
MFAGNFDACVCLTLADGGEKAAPTGNQVNGNRCKTLESDDQISGFTLNITDTARGNQVRDNVLSPIGEAYIPNPELYAVN
NIDIPTEFAFTPQLIGGSGVTLGNSSGKLTANGNVFSLSLSISAQSVSSPSGSLTIGYIPGLSGTSVRHHNVRTEFYNNL
NTTMQRAQPYVNIGDSADQLRVYRLADGLSKDDLLEYFMSNSDLRMVGDIEIEPYNFSRSVTVVGHSFCTSDVMSTELNR
LLGTDIYNFARGGASDVEVAMSQEAITRQYAPVGGSIPASGSVALTPTEVGIFWNGATGKCIFGGIDGTFSTTLVNAGTG
ETQLVFTRDSAGSAVSVSTTATFAMRPYTRFNTNTIPAGRKHSLHRDDIYIVWGGRNSTDYTRYVSELHTMVANMHTQRF
VICPEFPYDTETTGTTGATNLAALNNNLKADFPDNYCQISGVDLLQNFKSKYNPAYAGDVTDIANGITPRSLREDNLHPS
ETLQPNGLYIGAKVNADFIAQFIKSKGWGG
>P49714 3.2.1.129~~~~~~Tail spike protein~~~
MIQRLGSSLVKFKSKIAGAIWRNLDDKLTEVVSLKDFGAKGDGKTNDQDAVNAAMASGKRIDGAGATYKVSSLPDMERFY
NTRFVWERLAGQPLYYVSKGFINGELYKITDNPYYNAWPQDKAFVYENVIYAPYMGSDRHGVSRLHVSWVKSGDDGQTWS
TPEWLTDMHPDYPTVNYHCMSMGVCRNRLFAMIETRTLAKNELTNCALWDRPMSRSLHLTGGITKAANQRYATIHVPDHG
LFVGDFVNFSNSAVTGVSGDMKVATVIDKDNFTVLTPNQQTSDLNNAGKNWHMGTSFHKSPWRKTDLGLIPRVTEVHSFA
TIDNNGFVMGYHQGDVAPREVGLFYFPDAFNSPSNYVRRQIPSEYEPDAAEPCIKYYDGVLYLITRGTRGDRLGSSLHRS
RDIGQTWESLRFPHNVHHTTLPFAKVGDDLIMFGSERAENEWEAGAPDDRYKASYPRTFYARLNVNNWNADDIEWVNITD
QIYQGDIVNSSVGVGSVVVKDSFIYYIFGGENHFNPMTYGDNKDKDPFKGHGHPTDIYCYKMQIANDNRVSRKFTYGATP
GQAIPTFMGTDGIRNIPAPLYFSDNIVTEDTKVGHLTLKASTSANIRSEMQMEGEYGFIGKSVPKDKPTGQRLIICGGEG
TSSSSGAQITLHGSNSSNAKRITYNGNEHLFQGAPIMPAVDNQFAAGGPSNRFTTIYLGSDPVTTSDADHKYGISSINTK
VLKAWSRVGFKQYGLNSEAERNLDSIHFGVLAQDIVAAFEAEGLDAIKYGIVSFEEGRYGVRYSEVLILEAAYTRHRLDK
LEEMYATNKIS
>Q04830 3.2.1.129~~~~~~Tail spike protein~~~
MSTITQFPSGNTQYRIEFDYLARTFVVVTLVNSSNPTLNRVLEVGRDYRFLNPTMIEMLVDQSGFDIVRIHRQTGTDLVV
DFRNGSVLTASDLTTAELQAIHIAEEGRDQTVDLAKEYADAAGSSAGNAKDSEDEARRIAESIRAAGLIGYMTRRSFEKG
YNVTTWSEVLLWEEDGDYYRWDGTLPKNVPAGSTPETSGGIGLGAWVSVGDAALRSQISNPEGAILYPELHRARWLDEKD
ARGWGAKGDGVTDDTAALTSALNDTPVGQKINGNGKTYKVTSLPDISRFINTRFVYERIPGQPLYYASEEFVQGELFKIT
DTPYYNAWPQDKAFVYENVIYAPYMGSDRHGVSRLHVSWVKSGDDGQTWSTPEWLTDLHPDYPTVNYHCMSMGVCRNRLF
AMIETRTLAKNALTNCALWDRPMSRSLHLTGGITKAANQRYATIHVPDHGLFVGDFVNFSNSAVTGVSGDMTVATVIDKD
NFTVLTPNQQTSDLNNAGKNWHMGTSFHKSPWRKTDLGLIPSVTEVHSFATIDNNGFAMGYHQGDVAPREVGLFYFPDAF
NSPSNYVRRQIPSEYEPDASEPCIKYYDGVLYLITRGTRGDRLGSSLHRSRDIGQTWESLRFPHNVHHTTLPFAKVGDDL
IMFGSERAENEWEAGAPDDRYKASYPRTFYARLNVNNWNADDIEWVNITDQIYQGGIVNSGVGVGSVVVKDNYIYYMFGG
EDHFNPWTYGDNSAKDPFKSDGHPSDLYCYKMKIGPDNRVSRDFRYGAVPNRAVPVFFDTNGVRTVPAPMEFTGDLGLGH
VTIRASTSSNIRSEVLMEGEYGFIGKSIPTDNPAGQRIIFCGGEGTSSTTGAQITLYGANNTDSRRIVYNGDEHLFQSAD
VKPYNDNVTALGGPSNRFTTAYLGSNPIVTSNGERKTEPVVFDDAFLDAWGDVHYIMYQWLDAVQLKGNDARIHFGVIAQ
QIRDVFIAHGLMDENSTNCRYAVLCYDKYPRMTDTVFSHNEIVEHTDEEGNVTTTEEPVYTEVVIHEEGEEWGVRPDGIF
FAEAAYQRRKLERIEARLSALEQK
>O09496 4.-.-.-~~~kflA~~~Tail spike protein~~~
MAKLTKPKTEGILHKGQSLYEYLDARVLTSKPFGAAGDATTDDTEVIAASLNSQKAVTISDGVFSSSGINSNYCNLDGRG
SGVLSHRSSTGNYLVFNNPRTGRLSNITVESNKATDTTQGQQVSLAGGSDVTVSDVNFSNVKGTGFSLIAYPNDAPPDGL
MIKGIRGSYSGYATNKAAGCVLADSSVNSLIDNVIAKNYPQFGAVELKGTASYNIVSNVIGADCQHVTYNGTEGPIAPSN
NLIKGVMANNPKYAAVVAGKGSTNLISDVLVDYSTSDARQAHGVTVEGSDNVINNVLMSGCDGTNSLGQRQTATIARFIG
TANNNYASVFPSYSATGVITFESGSTRNFVEVKHPGRRNDLLSSASTIDGAATIDGTSNSNVVHAPALGQYIGSMSGRFE
WRIKSMSLPSGVLTSADKYRMLGDGAVSLAVGGGTSSQVRLFTSDGTSRTVSLTNGNVRLSTSSTGYLQLGADAMTPDST
GTYALGSASRAWSGGFTQAAFTVTSDARCKTEPLTISDALLDAWSEVDFVQFQYLDRVEEKGADSARWHFGIIAQRAKEA
FERHGIDAHRYGFLCFDSWDDVYEEDANGSRKLITPAGSRYGIRYEEVLILEAALMRRTIKRMQEALAALPK
>P12528 3.2.1.-~~~9~~~Tail spike protein~~~
MTDITANVVVSNPRPIFTESRSFKAVANGKIYIGQIDTDPVNPANQIPVYIENEDGSHVQITQPLIINAAGKIVYNGQLV
KIVTVQGHSMAIYDANGSQVDYIANVLKYDPDQYSIEADKKFKYSVKLSDYPTLQDAASAAVDGLLIDRDYNFYGGETVD
FGGKVLTIECKAKFIGDGNLIFTKLGKGSRIAGVFMESTTTPWVIKPWTDDNQWLTDAAAVVATLKQSKTDGYQPTVSDY
VKFPGIETLLPPNAKGQNITSTLEIRECIGVEVHRASGLMAGFLFRGCHFCKMVDANNPSGGKDGIITFENLSGDWGKGN
YVIGGRTSYGSVSSAQFLRNNGGFERDGGVIGFTSYRAGESGVKTWQGTVGSTTSRNYNLQFRDSVVIYPVWDGFDLGAD
TDMNPELDRPGDYPITQYPLHQLPLNHLIDNLLVRGALGVGFGMDGKGMYVSNITVEDCAGSGAYLLTHESVFTNIAIID
TNTKDFQANQIYISGACRVNGLRLIGIRSTDGQGLTIDAPNSTVSGITGMVDPSRINVANLAEEGLGNIRANSFGYDSAA
IKLRIHKLSKTLDSGALYSHINGGAGSGSAYTQLTAISGSTPDAVSLKVNHKDCRGAEIPFVPDIASDDFIKDSSCFLPY
WENNSTSLKALVKKPNGELVRLTLATL
>Q9XJP3 3.2.1.-~~~~~~Tail spike protein~~~
MTDIITNVVIGMPSQLFTMARSFKAVANGKIYIGKIDTDPVNPENQIQVYVENEDGSHVPASQPIVINAAGYPVYNGQIV
KFVTEQGHSMAVYDAYGSQQFYFQNVLKYDPDQFGPDLIEQLAQSGKYSQDNTKGDAMIGVKQPLPKAVLRTQHDKNKEA
ISILDFGVIDDGVTDNYQAIQNAIDAVASLPSGGELFIPASNQAVGYIVGSTLLIPGGVNIRGVGKASQLRAKSGLTGSV
LRLSYDSDTIGRYLRNIRVTGNNTCNGIDTNITAEDSVIRQVYGWVFDNVMVNEVETAYLMQGLWHSKFIACQAGTCRVG
LHFLGQCVSVSVSSCHFSRGNYSADESFGIRIQPQTYAWSSEAVRSEAIILDSETMCIGFKNAVYVHDCLDLHMEQLDLD
YCGSTGVVIENVNGGFSFSNSWIAADADGTEQFTGIYFRTPTSTQSHKIVSGVHINTANKNTAANNQSIAIEQSAIFVFV
SGCTLTGDEWAVNIVDINECVSFDKCIFNKPLRYLRSGGVSVTDCYLAGITEVQKPEGRYNTYRGCSGVPSVNGIINVPV
AVGATSGSAAIPNPGNLTYRVRSLFGDPASSGDKVSVSGVTINVTRPSPVGVALPSMVEYLAI
>Q0PDK6 ~~~~~~Tail spike protein~~~
MRSLSKNIWIMSNFEVLLTTLTGDRDAGGCRYWNGSFEEPLSPGETATYQFYADASHPNSKYVINDNKVIFKEPEQFGGD
YRMFTIIDTELGRSEQDGRYIIATCEPAWYELDDDFIENKRPTDKTPQEALDLLLEGTRYEGEVDPDLQMRASDSFYQVS
VKEAVIQFFQTWGGYFKDTIEFDGNEITRRVIKSLKRRGAANGVRFEHDKDITSIRRNITGYPKTALYPYGASLEMTDED
GNATGGYTRYIDIKDVVWSKAAGDPVDKPAGQAWIGDPDLLKVFGRPMPDGSLRHRWTRWQNEDITTPEELIEAAYYDLI
NNVSKAIVNYNVEVAPKNVKLGDTCIILDREFATPIELEADVIKMTYFTDRPEEGATIELGEYLNLDDIERQIGAINDRF
KKNEGKWNTGGGITEVTDGAFPDKKPPKPSNITVKSMFQSIFLSWNYDPSSYIAGYRVYASKTPGFTPTDNDIIFEGKMS
GYEHKVGNNETWYYRLSTVNYHKTESDMSDQYSAQSERIMNDDILFGAITADKLANLAVTADKISRSFDEGNIFPGSLLK
PSQFYDYPNTKHSVTDKTFNELTITQVIDDSTRLRFGISGQNKAPNNIQRMPLEQGQIYTLSFEVKRNNTTDLSYIHLKK
PDNTFMLLSSSLKDISAYPSDEFVRVDLQFTAPETSSKFTLGFGGYNRNGNVTPCSFVIRKVQVRKGTEIKDYGYSPYDI
MLVDGAIYSDYIQSAAINSAHIADAAIGSAAIQNAAIKNAHLGTAIVDTANIKDAAITSAKIKELSADLITSGTINAINI
TGSLIRGGRFEPLNESSAYTSYIEGDKIYQRVNTINYGGTPKKTYNDLTITSGGVTMVYGGRDTGTDEPLREMTLRDGSL
YMSGNKNDTGYSEIQMYCNDDDVVNAPKSIFRQKLGGVNRMSLHSYYNSVPGSYNVQMDTRNTSIYSHFANGWQLTADGK
IMIMSKNGQGIYLDPSNDNVAGQVTMHVGNAGQAGGLIQPVIAGAPHYSITDIHGALQTNMSVMLYQYNFVMPSKGDAYA
SKYNDIPPPFECENIFMVVPTVFGGYSDHVHATITSQSSSGFRCYVRGTGSSTGILGRTFQVRFLIFYEVN
>P03748 ~~~~~~Tail fiber protein~~~
MANVIKTVLTYQLDGSNRDFNIPFEYLARKFVVVTLIGVDRKVLTINTDYRFATRTTISLTKAWGPADGYTTIELRRVTS
TTDRLVDFTDGSILRAYDLNVAQIQTMHVAEEARDLTTDTIGVNNDGHLDARGRRIVNLANAVDDRDAVPFGQLKTMNQN
SWQARNEALQFRNEAETFRNQAEGFKNESSTNATNTKQWRDETKGFRDEAKRFKNTAGQYATSAGNSASAAHQSEVNAEN
SATASANSAHLAEQQADRAEREADKLENYNGLAGAIDKVDGTNVYWKGNIHANGRLYMTTNGFDCGQYQQFFGGVTNRYS
VMEWGDENGWLMYVQRREWTTAIGGNIQLVVNGQIITQGGAMTGQLKLQNGHVLQLESASDKAHYILSKDGNRNNWYIGR
GSDNNNDCTFHSYVHGTTLTLKQDYAVVNKHFHVGQAVVATDGNIQGTKWGGKWLDAYLRDSFVAKSKAWTQVWSGSAGG
GVSVTVSQDLRFRNIWIKCANNSWNFFRTGPDGIYFIASDGGWLRFQIHSNGLGFKNIADSRSVPNAIMVENE
>P13390 3.4.21.-~~~ltf~~~Side tail fiber protein pb1~~~
MAITKIILQQMVTMDQNSITASKYPKYTVVLSNSISSITAADVTSAIESSKASGPAAKQSEINAKQSELNAKDSENEAEI
SATSSQQSATQSASSATASANSAKAAKTSETNANNSKNAAKTSETNAASSASSASSFATAAENSARAAKTSETNAGNSAQ
AADASKTAAANSATAAKTSETNAKKSETAAKTSETNAKTSENKAKEYLDMASELVSPVTQYDWPVGTNNNSVYVKIAKLT
DPGAVSCHLTLMITNGGNYGSSYGNIDFVEISARGLNDARGVTSENITKFLSVRRLGSPNLAWDNQLRYGLVEGDGYFEV
WCYQRAFIKETRVAVLAQTGRTELYIPEGFVSQDTQPSGFIESLAARIYDQVNKPTKADLGLENAMLVGAFGLGGNGLSY
SSVQSNVDLINKLKANGGQYWRAARESGANVDINDHGSGFYSHCGDTHAAINVQYNTGIVKVLATTDRNLASDIVYANTL
YGTANKPSKSDVGLGNVTNDAQVKKAGDVMSGDLDIRKETPSIRLKSTQGNAHLWFMNNDGGERGVIWSPPNNGSLGEIH
IRAKTSDGTSTGDFIVRHDGRIEAKDAKISYKISSRTAEFSNDDTNTAATNLRVSGKQHTPIMLVRDSDSNVSVGFKLNN
MNAKLLGIDIDGDLAFGENPDHKQNSKIVTRKMMDAGFSVAGLMDFTNGFAGPWEAKNISDQELDLNSLMIKKSDPGSIR
VYQCVSAGGGNNITNKPSGIGGNFILYVESIRKVGDTDFTNRQRLFGTDLNREFTRYCSNGTWSAWRESVVSGMNQDVSV
KSMSVSGRLSGNELSVGGAGVLNGNLGVGGGATSKMPSSDKGIVIGRGSIVREGGEGRLILSSSGGTDRLLQLRPAGATS
LDNQVEISCTSASAGDTKISFGQGAAIRCNNAGSPIISAKAGQMIYFRPNGDGISEGQMILSPNGDLVVKGGVNSKEIDV
TASQSLPLKETTATTGIGVNFIGDSATECSFGIENTAGGSAVFHNYTRGASNSVTKNNQLLGGYGSRPWLGSTYTEHSNA
ALHFLGAGDTSATNHGGWIRLLVTPKGKTISDRVPAFRLSDNGDLWLVPDGAMHSDLGLVRSIETLNAAVPRFNAPSIQD
GRGLKIVAPQAPEIDLIAPRGSGASAPAIRAMWCDGSLADTTRYIGATQPGSTFYIGASGHDGEKFDSMRGSVAIKSAGG
WGPTSTPTQVVLETCESGSISRLPRWGVDHNGTLMPMADNRYNLGWGSGRVKQVYAVNGTINTSDARLKNDVRAMSDPET
EAAKAIAKEIGFWTWKEQADMNDIREHCGLTVQRAIEIMESFGLDPFKYGFICYDKWDEHTVVSEYGPANEDGTENPIYK
TIPAGDHYSFRLEELNLFIAKGFEARLSAIEDKLGM
>P18771 ~~~~~~Long-tail fiber proximal subunit~~~
MAEIKREFRAEDGLDAGGDKIINVALADRTVGTDGVNVDYLIQENTVQQYDPTRGYLKDFVIIYDNRFWAAINDIPKPAG
AFNSGRWRALRTDANWITVSSGSYQLKSGEAISVNTAAGNDITFTLPSSPIDGDTIVLQDIGGKPGVNQVLIVAPVQSIV
NFRGEQVRSVLMTHPKSQLVLIFSNRLWQMYVADYSREAIVVTPANTYQAQSNDFIVRRFTSAAPINVKLPRFANHGDII
NFVDLDKLNPLYHTIVTTYDETTSVQEVGTHSIEGRTSIDGFLMFDDNEKLWRLFDGDSKARLRIITTNSNIRPNEEVMV
FGANNGTTQTIELKLPTNISVGDTVKISMNYMRKGQTVKIKAADEDKIASSVQLLQFPKRSEYPPEAEWVTVQELVFNDE
TNYVPVLELAYIEDSDGKYWVVQQNVPTVERVDSLNDSTRARLGVIALATQAQANVDLENSPQKELAITPETLANRTATE
TRRGIARIATTAQVNQNTTFSFADDIIITPKKLNERTATETRRGVAEIATQQETNAGTDDTTIITPKKLQARQGSESLSG
IVTFVSTAGATPASSRELNGTNVYNKNTDNLVVSPKALDQYKATPTQQGAVILAVESEVIAGQSQQGWANAVVTPETLHK
KTSTDGRIGLIEIATQSEVNTGTDYTRAVTPKTLNDRRATESLSGIAEIATQVEFDAGVDDTRISTPLKIKTRFNSTDRT
SVVALSGLVESGTLWDHYTLNILEANETQRGTLRVATQVEAAAGTLDNVLITPKKLLGTKSTEAQEGVIKVATQSETVTG
TSANTAVSPKNLKWIAQSEPTWAATTAIRGFVKTSSGSITFVGNDTVGSTQDLELYEKNSYAVSPYELNRVLANYLPLKA
KAADTNLLDGLDSSQFIRRDIAQTVNGSLTLTQQTNLSAPLVSSSTGEFGGSLAANRTFTIRNTGAPTSIVFEKGPASGA
NPAQSMSIRVWGNQFGGGSDTTRSTVFEVGDDTSHHFYSQRNKDGNIAFNINGTVMPININASGLMNVNGTATFGRSVTA
NGEFISKSANAFRAINGDYGFFIRNDASNTYFLLTAAGDQTGGFNGLRPLLINNQSGQITIGEGLIIAKGVTINSGGLTV
NSRIRSQGTKTSDLYTRAPTSDTVGFWSIDINDSATYNQFPGYFKMVEKTNEVTGLPYLERGEEVKSPGTLTQFGNTLDS
LYQDWITYPTTPEARTTRWTRTWQKTKNSWSSFVQVFDGGNPPQPSDIGALPSDNATMGNLTIRDFLRIGNVRIVPDPVN
KTVKFEWVE
>P03714 ~~~FII~~~Head-tail connector protein FII~~~
MADFDNLFDAAIARADETIRGYMGTSATITSGEQSGAVIRGVFDDPENISYAGQGVRVEGSSPSLFVRTDEVRQLRRGDT
LTIGEENFWVDRVSPDDGGSCHLWLGRGVPPAVNRRR
>P03709 ~~~Fi~~~DNA-packaging protein FI~~~
MTKDELIARLRSLGEQLNRDVSLTGTKEELALRVAELKEELDDTDETAGQDTPLSRENVLTGHENEVGSAQPDTVILDTS
ELVTVVALVKLHTDALHATRDEPVAFVLPGTAFRVSAGVAAEMTERGLARMQ
>Q65163 1.8.3.2~~~~~~FAD-linked sulfhydryl oxidase~~~
MLHWGPKYWRSLHLYAIFFSDAPSWKEKYEAIQWILNFIESLPCTRCQHHAFSYLTKNPLTLNNSEDFQYWTFAFHNNVN
NRLNKKIISWSEYKNIYEQSILKTIEYGKTDFIGAWSSL
>P41480 1.8.3.2~~~~~~FAD-linked sulfhydryl oxidase~~~
MIPLTPLFSRYKDSYLLYSFRLIDLLRASKSTHLTKLLSSQATYLYHFACLMKYKDIQKYEVQQLIEWAINASPDMDLQQ
FRIEFMDKTTELNLRSCQPKSFTYTFTTIWDTMHFLSLIIDDMVYTRDKSSLDFVMQQLKTMKVLFYNVFFILQCAMCRD
HYMNVKGFIIYHIELIEIALDKEKYGTDITFVDSYQQETAGADVAVVSNNMLMKNLMAYVSMTFHNHINDYKWIQRNKKP
PAHYERMTWGEYKKLLNLQ
>P69037 ~~~FP~~~Protein FP25~~~
MDQFEQLINVSLLKSLIKTQIDENVSDNIKSMSEKLKRLEYDNLTDSVEIYGIHDSRLNNKKIRNYYLKKICALLDLNFK
HVIESSFDKNHIVAKLCDATRAKEWQTKSRERRLKNFNLNINYDGPVKIFVAATAEQKLLLKKTRDALLPFYKYISICKN
GVMVRRDEKSRVFIVKNEQNIEYLKANKYYAFHSDSVDNFESENDSEKMLQNLI
>Q5UQ00 3.2.2.23~~~~~~Probable formamidopyrimidine-DNA glycosylase~~~
MPEGPEVALTADILEKYFKGKTLEYIDFISGRYSKSEPEGYDDFIANLPLKVSNVDTKGKFLWFELFDPNDKSNKWYIWN
TFGLTGMWSLFEAKYTRAVLSFDNELMAYFSDMRNFGTFKFSNSEKELKRKLNELGPDFLKNDDIDISKIKKYKQPIVAL
LMDQKKIGSGLGNYLVAEILYRAKIDPHKLGSNLTDQEIENLWYWIKYETKLAYDSNHIGYMVNLENESSKIGRKNYHPN
IHPTEKEFDFLVYRKKKDPNGNKVIADKIIGSGKNKRTTYWAPAIQK
>Q6QGJ4 1.5.1.3~~~frd~~~Dihydrofolate reductase~~~
MITAMYAVGPNGEFGLRGKLPWGSFKEELDAFYSQLDVLNPDNIIIGAGTYLALPYAVRERMIGASDLFIRADRPLPDDI
THDIYTPISMIGDTLPTFLKDQQTVVLGGANLLLEMYQHGHIESAFVSTIFSEQKLEADTHLDSMILDYNYESTRLVYAV
GANSDNSLRFVQELVTY
>P11128 ~~~P6~~~Fusion protein P6~~~
MSIFSSLFKVIKKVISKVVATLKKIFKKIWPLLLIVAIIYFAPYLAGFFTSAGFTGIGGIFSSIATTITPTLTSFLSTAW
SGVGSLASTAWSGFQSLGMGTQLAVVSGAAALIAPEETAQLVTEIGTTVGDIAGTIIGGVAKALPGWIWIAAGGLAVWAL
WPSSDSKE
>P29791 ~~~F~~~Fusion glycoprotein F0~~~
MATTTMRMIISIILISTYVPHITLCQNITEEFYQSTCSAVSRGYLSALRTGWYTSVVTIELSKIQKNVCNGTDSKVKLIK
QELERYNNAVAELQSLMQNEPTSSSRAKRGIPESIHYTRNSTKKFYGLMGKKRKRRFLGFLLGIGSAIASGVAVSKVLHL
EGEVNKIKNALLSTNKAVVSLSNGVSVLTSKVLDLKNYIDKELLPKVNNHDCRISNIATVIEFQQKNNRLLEIAREFSVN
AGITTPLSTYMLTNSELLSIINDMPITNDQKKLMSVCQIVRQQSYSIMSVLREVIAYVVQLPLYGVIDTPCWKLHTSPLC
TTDNKEGSNICLTRTDRGWYCDNAGSVSFFPQAETCKVQSNRVFCDTMNSLTLPTDVNLCNTDIFNSKYDCKIMTSKTDI
SSSVITSIGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNKGVDTVSVGNTLYYVNKLEGKALYIKGEPIINYYNPLV
FPSDEFDASIAQVNAKINQSLAFIRRSDELLHSVDVGKSTTNVVITTIIIVIVVVILMLITVGLLFYCKTRSTPIMLGKD
QLSSINNLSFSK
>P22167 ~~~F~~~Fusion glycoprotein F0~~~
MAATAMRMIISIIFISTYMTHITLCQNITEEFYQSTCSAVSRGYLSALRTGWYTSVVTIELSKIQKNVCKSTDSKVKLIK
QELERYNNAVIELQSLMQNEPASFSRAKRGIPELIHYTRNSTKRFYGLMGKKRKRRFLGFLLGIGSAIASGVAVSKVLHL
EGEVNKIKNALLSTNKAVVSLSNGVSVLTSKVLDLKNYIDKELLPKVNNHDCRISNIETVIEFQQKNNRLLEIAREFSVN
AGITTPLSTYMLTNSELLSLINDMPITNDQKKLMSSNVQIVRQQSYSIMSVVKEEVIAYVVQLPIYGVIDTPCWKLHTSP
LCTTDNKEGSNICLTRTDRGWYCDNAGSVSFFPQAETCKVQSNRVFCDTMNSLTLPTDVNLCNTDIFNTKYDCKIMTSKT
DISSSVITSIGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNKGVDTVSVGNTLYYVNKLEGKALYIKGEPIINYYDP
LVFPSDEFDASIAQVNAKINQSLAFIRRSDELLHSVDVGKSTTNVVITTIIIVIVVVILMLIAVGLLFYCKTRSTPIMLG
KDQLSGINNLSFSK
>P23728 ~~~F~~~Fusion glycoprotein F0~~~
MATTAMRMIISIIFISTYVTHITLCQNITEEFYQSTCSAVSRGYLSALRTGWYTSVVTIELSKIQKNVCNSTDSNVKLIK
QELERYNNAVVELQSLMQNEPASSSRAKRGIPELIHYKRNSTKKFYGLMGKKRKRRFLGFLLGIGSAIASGVAVSKVLHL
EGEVNKIKNALLSTNKAVVSLSNGVSVLTSKVLDLKNYIDKELLPKVNNHDCKISNIATVIEFQQKNNRLLEIAREFSVN
AGITTPLSTYMLTNSELLSLINDMPITNDQKKLMSSNVQIVRQQSYSIMSVVKEEVMAYVVQLPIYGVIDTPCWKLHTSP
LCTTDNKEGSNICLTRTDRGWYCDNAGSVSFFPQAETCKVQSNRVFCDTMNSLTLPTDVNLCNTDIFNAKYDCKIMTSKT
DISSSVITSIGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNRGVDTVSVGNTLYYVNKLEGKALYIKGEPIINYYDP
LVFPSDEFDASIAQVNAKINQSLAFIRRSDELLHSVDVGKSTTNVVITTIIIVIVVVILMLIAVGLLFYSKTRSTPIMLG
KDQLSGINNLSFSK
>P12569 ~~~F~~~Fusion glycoprotein F0~~~
MHRGIPKSSKTQTHTQQDRPPQPSTELEETRTSRARHSTTSAQRSTHYDPRTSDRPVSYTMNRTRSRKQTSHRLKNIPVH
GNHEATIQHIPESVSKGARSQIERRQPNAINSGSHCTWLVLWCLGMASLFLCSKAQIHWDNLSTIGIIGTDNVHYKIMTR
PSHQYLVIKLIPNASLIENCTKAELGEYEKLLNSVLEPINQALTLMTKNVKPLQSLGSGRRQRRFAGVVLAGVALGVATA
AQITAGIALHQSNLNAQAIQSLRTSLEQSNKAIEEIREATQETVIAVQGVQDYVNNELVPAMQHMSCELVGQRLGLRLLR
YYTELLSIFGPSLRDPISAEISIQALIYALGGEIHKILEKLGYSGSDMIAILESRGIKTKITHVDLPGKFIILSISYPTL
SEVKGVIVHRLEAVSYNIGSQEWYTTVPRYIATNGYLISNFDESSCVFVSESAICSQNSLYPMSPLLQQCIRGDTSSCAR
TLVSGTMGNKFILSKGNIVANCASILCKCYSTSTIINQSPDKLLTFIASDTCPLVEIDGATIQVGGRQYPDMVYEGKVAL
GPAISLDRLDVGTNLGNALKKLDDAKVLIDSSNQILETVRRSSFNFGSLLSVPILSCTALALLLLIYCCKRRYQQTLKQH
TKVDPAFKPDLTGTSKSYVRSL
>O89342 ~~~F~~~Fusion glycoprotein F0~~~
MATQEVRLKCLLCGIIVLVLSLEGLGILHYEKLSKIGLVKGITRKYKIKSNPLTKDIVIKMIPNVSNVSKCTGTVMENYK
SRLTGILSPIKGAIELYNNNTHDLVGDVKLAGVVMAGIAIGIATAAQITAGVALYEAMKNADNINKLKSSIESTNEAVVK
LQETAEKTVYVLTALQDYINTNLVPTIDQISCKQTELALDLALSKYLSDLLFVFGPNLQDPVSNSMTIQAISQAFGGNYE
TLLRTLGYATEDFDDLLESDSIAGQIVYVDLSSYYIIVRVYFPILTEIQQAYVQELLPVSFNNDNSEWISIVPNFVLIRN
TLISNIEVKYCLITKKSVICNQDYATPMTASVRECLTGSTDKCPRELVVSSHVPRFALSGGVLFANCISVTCQCQTTGRA
ISQSGEQTLLMIDNTTCTTVVLGNIIISLGKYLGSINYNSESIAVGPPVYTDKVDISSQISSMNQSLQQSKDYIKEAQKI
LDTVNPSLISMLSMIILYVLSIAALCIGLITFISFVIVEKKRGNYSRLDDRQVRPVSNGDLYYIGT
>Q6WB98 ~~~F~~~Fusion glycoprotein F0~~~
MSWKVVIIFSLLITPQHGLKESYLEESCSTITEGYLSVLRTGWYTNVFTLEVGDVENLTCSDGPSLIKTELDLTKSALRE
LKTVSADQLAREEQIENPRQSRFVLGAIALGVATAAAVTAGVAIAKTIRLESEVTAIKNALKTTNEAVSTLGNGVRVLAT
AVRELKDFVSKNLTRAINKNKCDIDDLKMAVSFSQFNRRFLNVVRQFSDNAGITPAISLDLMTDAELARAVSNMPTSAGQ
IKLMLENRAMVRRKGFGILIGVYGSSVIYMVQLPIFGVIDTPCWIVKAAPSCSGKKGNYACLLREDQGWYCQNAGSTVYY
PNEKDCETRGDHVFCDTAAGINVAEQSKECNINISTTNYPCKVSTGRHPISMVALSPLGALVACYKGVSCSIGSNRVGII
KQLNKGCSYITNQDADTVTIDNTVYQLSKVEGEQHVIKGRPVSSSFDPIKFPEDQFNVALDQVFENIENSQALVDQSNRI
LSSAEKGNTGFIIVIILIAVLGSSMILVSIFIIIKKTKKPTGAPPELSGVTNNGFIPHS
>P13843 ~~~F~~~Fusion glycoprotein F0~~~
MELLIHRSSAIFLTLAVNALYLTSSQNITEEFYQSTCSAVSRGYFSALRTGWYTSVITIELSNIKETKCNGTDTKVKLIK
QELDKYKNAVTELQLLMQNTPAANNRARREAPQYMNYTINTTKNLNVSISKKRKRRFLGFLLGVGSAIASGIAVSKVLHL
EGEVNKIKNALLSTNKAVVSLSNGVSVLTSKVLDLKNYINNRLLPIVNQQSCRISNIETVIEFQQMNSRLLEITREFSVN
AGVTTPLSTYMLTNSELLSLINDMPITNDQKKLMSSNVQIVRQQSYSIMSIIKEEVLAYVVQLPIYGVIDTPCWKLHTSP
LCTTNIKEGSNICLTRTDRGWYCDNAGSVSFFPQADTCKVQSNRVFCDTMNSLTLPSEVSLCNTDIFNSKYDCKIMTSKT
DISSSVITSLGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNKGVDTVSVGNTLYYVNKLEGKNLYVKGEPIINYYDP
LVFPSDEFDASISQVNEKINQSLAFIRRSDELLHNVNTGKSTTNIMITTIIIVIIVVLLSLIAIGLLLYCKAKNTPVTLS
KDQLSGINNIAFSK
>P03420 ~~~F~~~Fusion glycoprotein F0~~~
MELLILKANAITTILTAVTFCFASGQNITEEFYQSTCSAVSKGYLSALRTGWYTSVITIELSNIKENKCNGTDAKVKLIK
QELDKYKNAVTELQLLMQSTPPTNNRARRELPRFMNYTLNNAKKTNVTLSKKRKRRFLGFLLGVGSAIASGVAVSKVLHL
EGEVNKIKSALLSTNKAVVSLSNGVSVLTSKVLDLKNYIDKQLLPIVNKQSCSISNIETVIEFQQKNNRLLEITREFSVN
AGVTTPVSTYMLTNSELLSLINDMPITNDQKKLMSNNVQIVRQQSYSIMSIIKEEVLAYVVQLPLYGVIDTPCWKLHTSP
LCTTNTKEGSNICLTRTDRGWYCDNAGSVSFFPQAETCKVQSNRVFCDTMNSLTLPSEINLCNVDIFNPKYDCKIMTSKT
DVSSSVITSLGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNKGMDTVSVGNTLYYVNKQEGKSLYVKGEPIINFYDP
LVFPSDEFDASISQVNEKINQSLAFIRKSDELLHNVNAGKSTTNIMITTIIIVIIVILLSLIAVGLLLYCKARSTPVTLS
KDQLSGINNIAFSN
>O36634 ~~~F~~~Fusion glycoprotein F0~~~
MELLIHRLSAIFLTLAINALYLTSSQNITEEFYQSTCSAVSRGYFSALRTGWYTSVITIELSNIKETKCNGTDTKVKLIK
QELDKYKNAVTELQLLMQNTPAANNRARREAPQYMNYTINTTKNLNVSISKKRKRRFLGFLLGVGSAIASGIAVSKVLHL
EGEVNKIKNALLSTNKAVVSLSNGVSVLTSKVLDLKNYINNQLLPIVNQQSCRISNIETVIEFQQKNSRLLEINREFSVN
AGVTTPLSTYMLTNSELLSLINDMPITNDQKKLMSSNVQIVRQQSYSIMSIIKEEVLAYVVQLPIYGVIDTPCWKLHTSP
LCTTNIKEGSNICLTRTDRGWYCDNAGSVSFFPQADTCKVQSNRVFCDTMNSLTLPSEVSLCNTDIFNSKYDCKIMTSKT
DISSSVITSLGAIVSCYGKTKCTASNKNRGIIKTFSNGCDYVSNKGVDTVSVGNTLYYVNKLEGKNLYVKGEPIINYYDP
LVFPSDEFDASISQVNEKINQSLAFIRRSDELLHNVNTGKSTTNIMITTIIIVIIVVLLSLIAIGLLLYCKAKNTPVTLS
KDQLSGINNIAFSK
>P11209 ~~~F~~~Fusion glycoprotein F0~~~
MELPILKTNAITAILAAVTLCFASSQNITEEFYQSTCSAVSKGYLSALRTGWYTSVITIELSNIKENKCNGTDAKVKLIK
QELDKYKSAVTELQLLMQSTPATNNRARRELPRFMNYTLNNTKNTNVTLSKKRKRRFLGFLLGVGSAIASGIAVSKVLHL
EGEVNKIKSALLSTNKAVVSLSNGVSVLTSKVLDLKNYIDKQLLPIVNKQSCSISNIETVIEFQQKNNRLLEITREFSVN
AGVTTPVSTYMLTNSELLSLINDMPITNDQKKLMSNNVQIVRQQSYSIMSIIKEEVLAYVVQLPLYGVIDTPCWKLHTSP
LCTTNTKEGSNICLTRTDRGWYCDNAGSVSFFPLAETCKVQSNRVFCDTMNSLTLPSEVNLCNIDIFNPKYDCKIMTSKT
DVSSSVITSLGAIVSCYGKTKCTASNKDRGIIKTFSNGCDYVSNKGVDTVSVGNTLYYVNKQEGKSLYVKGEPIINFYDP
LVFPSDEFDASISQVNEKINQSLAFIRKSDELLHNVNAGKSTTNIMITTIIIVIIVILLSLIAVGLLLYCKARSTPVTLS
KDQLSGINNIAFSN
>Q8V3T9 ~~~Segment-5~~~Fusion glycoprotein F0~~~
MAFLTILVLFLFKEVLCEPCICENPTCLGITIPQAGFVRSAPGGVLLTETITERPQLTEWTTSRPKLEETLWLDGETKNG
KVSQTLFEAIQGTQMENCAVKAVLDTTFVNLTKQDIVLGKIKVSEFGGDSDISKCGRKGLKVFICGGTVGYVTRGCPPEE
CKGKKGRMMALEPTTDCGVEKGLTTDRIKTGMLDITSCCTQHGCTKGIRVEVPSPVLVSSKCQEVTFRVVPFHSVPDKLG
FARTSSFTLKANFVNKHGWSKYNFNLRGFPGEEFIKCCGFTLGVGGAWFQAYLNGMVQGDGAASADDVKEKLNGIIDQIN
KANTLLEGEIEAVRRIAYMNQASSLQNQVEIGLIGEYLNISSWLETTTLTKTEEGLMKNGWCQSNTHCWCPPKPTIVPTI
GYVDSIKEVTGTSWWMVMIHYIIVGLIVIVVVVFGLKLWGCLRR
>Q786F3 ~~~F~~~Fusion glycoprotein F0~~~
MGLKVNVSAIFMAVLLTLQTPTGQIHWGNLSKIGVVGIGSASYKVMTRSSHQSLVIKLMPNITLLNNCTRVEIAEYRRLL
RTVLEPIRDALNAMTQNIRPVQSVASSRRHKRFAGVVLAGAALGVATAAQITAGIALHQSMLNSQAIDNLRASLETTNQA
IEAIRQAGQEMILAVQGVQDYINNELIPSMNQLSCDLIGQKLGLKLLRYYTEILSLFGPSLRDPISAEISIQALSYALGG
DINKVLEKLGYSGGDLLGILESRGIKARITHVDTESYFIVLSIAYPTLSEIKGVIVHRLEGVSYNIGSQEWYTTVPKYVA
TQGYLISNFDESSCTFMPEGTVCSQNALYPMSPLLQECLRGSTKSCARTLVSGSFGNRFILSQGNLIANCASILCKCYTT
GTIINQDPDKILTYIAADHCPVVEVNGVTIQVGSRRYPDAVYLHRIDLGPPISLERLDVGTNLGNAIAKLEDAKELLESS
DQILRSMKGLSSTSIVYILIAVCLGGLIGIPALICCCRGRCNKKGEQVGMSRPGLKPDLTGTSKSYVRSL
>P69355 ~~~F~~~Fusion glycoprotein F0~~~
MGLKVNVSAIFMAVLLTLQTPTGQIHWGNLSKIGVVGIGSASYKVMTRSSHQSLVIKLMPNITLLNNCTRVEIAEYRRLL
RTVLEPIRDALNAMTQNIRPVQSVASSRRHKRFAGVVLAGAALGVATAAQITAGIALHQSMLNSQAIDNLRASLETTNQA
IEAIRQAGQEMILAVQGVQDYINNELIPSMNQLSCDLIGQKLGLKLLRYYTEILSLFGPSLRDPISAEISIQALSYALGG
DINKVLEKLGYSGGDLLGILESRGIKARITHVDTESYFIVLSIAYPTLSEIKGVIVHRLEGVSYNIGSQEWYTTVPKYVA
TQGYLISNFDESSCTFMPEGTVCSQNALYPMSPLLQECLRGSTKSCARTLVSGSFGNRFILSQGNLIANCASILCKCYTT
GTIINQDPDKILTYIAADHCPVVEVNGVTIQVGSRRYPDAVYLHRIDLGPPISLERLDVGTNLGNAIAKLEDAKELLESS
DQILRSMKGLSSTSIVYILIAVCLGGLIGIPALICCCRGRCNKKGEQVGMSRPGLKPDLTGTSKSYVRSL
>P69357 ~~~F~~~Fusion glycoprotein F0~~~
MGLKVNVSAIFMAVLLTLQTPTGQIHWGNLSKIGVVGIGSASYKVMTRSSHQSLVIKLMPNITLLNNCTRVEIAEYRRLL
RTVLEPIRDALNAMTQNIRPVQSVASSRRHKRFAGVVLAGAALGVATAAQITAGIALHQSMLNSQAIDNLRASLETTNQA
IEAIRQAGQEMILAVQGVQDYINNELIPSMNQLSCDLIGQKLGLKLLRYYTEILSLFGPSLRDPISAEISIQALSYALGG
DINKVLEKLGYSGGDLLGILESRGIKARITHVDTESYFIVLSIAYPTLSEIKGVIVHRLEGVSYNIGSQEWYTTVPKYVA
TQGYLISNFDESSCTFMPEGTVCSQNALYPMSPLLQECLRGSTKSCARTLVSGSFGNRFILSQGNLIANCASILCKCYTT
GTIINQDPDKILTYIAADHCPVVEVNGVTIQVGSRRYPDAVYLHRIDLGPPISLERLDVGTNLGNAIAKLEDAKELLESS
DQILRSMKGLSSTSIVYILIAVCLGGLIGIPALICCCRGRCNKKGEQVGMSRPGLKPDLTGTSKSYVRSL
>P26032 ~~~F~~~Fusion glycoprotein F0~~~
MGLRVNVSAIFMAVLLTLQTPTGQIHWGNLSKIGVVGIGSASYKVMTRSSHQSLVIKLMPNTTLLNNCTRVEIAEYRRLL
RTVLEPIRDALNAMTQNIRPVQIVASSRRHKRFAGVVLAGAALGVATAAQITAGIALHQSMLNSQAIDNLRASLETTNQA
IEAIRQTGQEMILAVQGVQDYINNELIPSMNQLSCDLIGQKLGLKLLRYYTEILSLFGPSLRDPISAEISIQALSYVLGG
DINKVLEKLGYSGGDLLGILESRGIKARITHVDTESYFIVLSIAYPTLSEIKGVIVHRLEGVSYNIGSQEWYTTVPKYVA
TQGYLISNFDESSCTFMPEGTVCSQNALYPMSPLLQECLRGSTKSCARTLVSGSFGNRFILSQGNLIANCASILCKCYTT
GTIINQDPDKILTHIAADHCPVVEVNGVTIQVGSRRYPDAVYLHRIDLGPPISLERLDVGTSLGSAIAKLEDAKELLESS
DQILRSMKGLSSTSIVYILIAVCLGGLIGIPALICCCRGRCNKRENKLVCQDQA
>P11236 ~~~F~~~Fusion glycoprotein F0~~~
MKVFLVTCLGFAVFSSSVCVNINILQQIGYIKQQVRQLSYYSQSSSSYIVVKLLPNIQPTDNSCEFKSVTQYNKTLSNLL
LPIAENINNIASPSSGSRRHKRFAGIAIGIAALGVATAAQVTAAVSLVQAQTNARAIAAMKNSIQATNRAVFEVKEGTQR
LAIAVQAIQDHINTIMNTQLNNMSCQILDNQLATSLGLYLTELTTVFQPQLINPALSPISIQALRSLLGSMTPAVVQATL
STSISAAEILSAGLMEGQIVSVLLDEMQMIVKINIPTIVTQSNALVIDFYSISSFINNQESIIQLPDRILEIGNEQWSYP
AKNCKLTRHHIFCQYNEAERLSLESKLCLAGNISACVFSPIAGSYMRRFVALDGTIVANCRSLTCLCKSPSYPIYQPDHH
AVTTIDLTACQTLSLDGLDFSIVSLSNITYAENLTISLSQTINTQPIDISTELSKVNASLQNAVKYIKESNHQLQSVNVN
SKIGAIIVAALVLSILSIIISLLFCCWAYVATKEIRRINFKTNHINTISSSVDDLIRY
>P09458 ~~~F~~~Fusion glycoprotein F0~~~
MKAFSVTCLGFAVFSSSICVNINILQQIGYIKQQVRQLSYYSQSSSSYIVVKLLPNIQPTDNSCEFKSVTQYNKTLSNLL
LPIAENINNIASPSPGSRRHKRFAGIAIGIAALGVATAAQVTAAVSLVQAQTNARAIAAMKNSIQATNRAIFEVKEGTQQ
LAIAVQAIQDHINTIMNTQLNNMSCQILDNQLATYLGLYLTELTTVFQPQLINPALSPISIQALRSLLGSMTPAVVQATL
STSISAAEILSAGLMEGQIVSVLLDEMQMIVKINIPTIVTQSNALVIDFYSISSFINNQESIIQLPDRILEIGNEQWSYP
AKNCKLTRHHIFCQYNEAERLSLESKLCLAGNISACVFSPIAGSYMRRFVALDGTIVANCRSLTCLCKSPSYPIYQPDHH
AVTTIDLTTCQTLSLDGLDFSIVSLSNITYAENLTISLSQTINTQPIDISTELSKVNASLQNAVKYIKESNHQLQSVSVN
SKIGAIIVAALVLSILSIIISLLFCCWAYIATKEIRRINFKTNHINTISSSVDDLIRY
>P12572 ~~~F~~~Fusion glycoprotein F0~~~
MGPRSSTRIPIPLMLTIRIALALSCVHLASSLDGRPLAAAGIVVTGDKAVNIYTSSQTGSIIVKLHPNMPKDKEACAKAP
LEAYNRTLTTLLTPLGDSIRRIQESVTTSGGRRQKRFIGAIIGSVALGVATAAQITAASALIQANQNAANILRLKESITA
TIEAVHEVTDGLSQLAVAVGKMQQFVNDQFNNTAQELDCIKITQQVGVELNLYLTELTTVFGPQITSPALTQLTIQALYN
LAGGNMDYLLTKLGVGNNQLSSLIGSGLITGNPILYDSQTQLLGIQVTLPSVGNLNNMRATYLETLSVSTTKGFASALVP
KVVTQVGSVIEELDTSYCIETDLDLYCTRIVTFPMSPGIYSCLNGNTSACMYSKTEGALTTPYMTLKGSVIANCKMTTCR
CADPPGIISQNYGEAVSLIDRHSCNVLSLDGITLRLSGEFDATYQKNISILDSQVIVTGNLDISTELGNVNNSISNALDK
LEESNSKLDKVNVKLTSTSALITYIALTAISLVCGILSLVLACYLMYKQKAQQKTLLWLGNNTLGQMRATTKM
>P35936 ~~~F~~~Fusion glycoprotein F0~~~
MGSRSSTRIPVPLMLTVRIMLALSCVCPTSSLDGRPLAAAGIVVTGDKAVNIYTSSQTGSIIIKLLPNMPKDKEACAKAP
LEAYNRTLTTLLTPLGDSIRRIQESVTTSGGGKQGRLIGAIIGGVALGVATAAQITAASALIQANQNAANILRLKESIAA
TNEAVHEVTDGLSQLAVAVGKMQQFVNDQFNKTAQELDCIKITQQVGVELNLYLTELTTVFGPQITSPALTQLTIQALYN
LAGGNMDYLLTKLGVGNNQLSSLIGSGLITGNPILYDSQTQLLGIQVTLPSVGNLNNMRATYLETLSVSTTKGFASALVP
KVVTQVGSVIEELDTSYCIETDLDLYCTRIVTFPMSPGIYSCLSGNTSACMYSKTEGALTTPYMTLKGSVIANCKMTTCR
CADPPGIISQNYGEAVSLIDRQSCNILSLDGITLRLSGEFDATYQKNISIQDSQVIVTGNLDISTELGNVNNSISNALDK
LEESNSKLDKVNVKLTSTSALITYIFLTVISLVCGILSLVLACYLMYKQKAQQKTLLWLGNNTLDQMRATTKM
>P14623 ~~~F~~~Fusion glycoprotein F0~~~
MRSRSSTRIPVPLMLIIRIALTLSCIRLTSSLDGRPLAAAGIVVTGDKAVDIYTSSQTGSIIVKLLPNMPKDKEACAKAP
LEAYNRTLTTLLTPLGDSIRRIQESVTTSGGRRQRRFIGAIIGSVALGVATPAQITAASALIQANQNAANILRLKESIAA
TNEAVHEVTDGLSQLAVAVGKMQQFVNDQFNNTAQQLDCIKITQQVGVELNLYLTELTTVFGPQITSPALTPLTIQALYN
LAGGNMDYLLTKLGVGNNQLSSLIGSGLITGNPILYDSQTQILGIQITSPSVGNLNNMRATYLETLSVSTTKGFASALVP
KVVTQVGSVIEELDTSYCMETDLDLYCTRIVTFPMSPGIYSCLSGNTSACMYSKTEGALTTPYMALKGSVIANCKMTTCR
CADPPGIISQNYGEAVSLIDRHSCNVLSLDGITLRLSGEFDATYQKNISILDSQVIVTGNLDISTELGNVNNSISNALNK
LEESNSKLDKLNVKLTSTSALITYIVLTVISLVFGVLSLVLACYLMYKQKAQQKTLLWLGNNTLDQMRATTKI
>Q9IH63 ~~~F~~~Fusion glycoprotein F0~~~
MVVILDKRCYCNLLILILMISECSVGILHYEKLSKIGLVKGVTRKYKIKSNPLTKDIVIKMIPNVSNMSQCTGSVMENYK
TRLNGILTPIKGALEIYKNNTHDLVGDVRLAGVIMAGVAIGIATAAQITAGVALYEAMKNADNINKLKSSIESTNEAVVK
LQETAEKTVYVLTALQDYINTNLVPTIDKISCKQTELSLDLALSKYLSDLLFVFGPNLQDPVSNSMTIQAISQAFGGNYE
TLLRTLGYATEDFDDLLESDSITGQIIYVDLSSYYIIVRVYFPILTEIQQAYIQELLPVSFNNDNSEWISIVPNFILVRN
TLISNIEIGFCLITKRSVICNQDYATPMTNNMRECLTGSTEKCPRELVVSSHVPRFALSNGVLFANCISVTCQCQTTGRA
ISQSGEQTLLMIDNTTCPTAVLGNVIISLGKYLGSVNYNSEGIAIGPPVFTDKVDISSQISSMNQSLQQSKDYIKEAQRL
LDTVNPSLISMLSMIILYVLSIASLCIGLITFISFIIVEKKRNTYSRLEDRRVRPTSSGDLYYIGT
>P17501 ~~~~~~Major envelope glycoprotein~~~
MVSAIVLYVLLAAAAHSAFAAEHCNAQMKTGPYKIKNLDITPPKETLQKDVEITIVETDYNENVIIGYKGYYQAYAYNGG
SLDPNTRVEETMKTLNVGKEDLLMWSIRQQCEVGEELIDRWGSDSDDCFRDNEGRGQWVKGKELVKRQNNNHFAHHTCNK
SWRCGISTSKMYSRLECQDDTDECQVYILDAEGNPINVTVDTVLHRDGVSMILKQKSTFTTRQIKAACLLIKDDKNNPES
VTREHCLIDNDIYDLSKNTWNCKFNRCIKRKVEHRVKKRPPTWRHNVRAKYTEGDTATKGDLMHIQEELMYENDLLKMNI
ELMHAHINKLNNMLHDLIVSVAKVDERLIGNLMNNSVSSTFLSDDTFLLMPCTNPPAHTSNCYNNSIYKEGRWVANTDSS
QCIDFSNYKELAIDDDVEFWIPTIGNTTYHDSWKDASGWSFIAQQKSNLITTMENTKFGGVGTSLSDITSMAEGELAAKL
TSFMFGHVVNFVIILIVILFLYCMIRNRNRQY
>P12605 ~~~F~~~Fusion glycoprotein F0~~~
MQKSEILFLIYSSLLLSSSLCQIPVDKLSNVGVIINEGKLLKIAGSYESRYIVLSLVPSIDLEDGCGTTQIIQYKNLLNR
LLIPLKDALDLQESLITITNDTTVTNDNPQSRFFGAVIGTIALGVATAAQITAGIALAEAREARKDIALIKDSIIKTHNS
VELIQRGIGEQIIALKTLQDFVNNEIRPAIGELRCETTALKLGIKLTQHYSELATAFSSNLGTIGEKSLTLQALSSLYSA
NITEILSTIKKDKSDIYDIIYTEQVKGTVIDVDLEKYMVTLLVKIPILSEIPGVLIYRASSISYNIEGEEWHVAIPNYII
NKASSLGGADVTNCIESRLAYICPRDPTQLIPDNQQKCILGDVSKCPVTKVINNLVPKFAFINGGVVANCIASTCTCGTN
RIPVNQDRSRGVTFLTYTNCGLIGINGIELYANKRGRDTTWGNQIIKVGPAVSIRPVDISLNLASATNFLEESKIELMKA
KAIISAVGGWHNTESTQIIIIIIVCILIIIICGILYYLYRVRRLLVMINSTHNSPVNTYTLESRMRNPYIGNNSN
>P25467 ~~~F~~~Fusion glycoprotein F0~~~
MHHLHPMIVCIFVMYTGIVGSDAIAGDQLLNIGVIQSKIRSLMYYTDGGASFIVVKLLPNLPPSNGTCNITSLDAYNVTL
FKLLTPLIENLSKISTVTDTKTRQKRFAGVVVGLAALGVATAAQITAAVAIVKANANAAAINNLASSIQSTNKAVSDVID
ASRTIATAVQAIQDHINGAIVNGITSASCRAHDALIGSILNLYLTELTTIFHNQITNPALTPLSIQALRILLGSTLPIVI
ESKLNTNLNTAELLSSGLLTGQIISISPMYMQMLIQINVPTFIMQPGAKVIDLIAISANHKLQEVVVQVPNRILEYANEL
QNYPANDCVVTPNSVFCRYNEGSPIPESQYQCLRGNLNSCTFTPIIGNFLKRFAFANGVLYANCKSLLCRCADPPHVVSQ
DDTQGISIIDIKRCSEMMLDTFSFRITSTFNATYVTDFSMINANIVHLSPLDLSNQINSINKSLKSAEDWIADSNFFANQ
ARTAKTLYSLSAIALILSVITLVVVGLLIAYIIKLVSQIHQFRSLAATTMFHRENPAFFSKNNHGNIYGIS
>P06828 ~~~F~~~Fusion glycoprotein F0~~~
MPTSILLIITTMIMASFCQIDITKLQHVGVLVNSPKGMKISQNFETRYLILSLIPKIEDSNSCGDQQIKQYKRLLDRLII
PLYDGLRLQKDVIVSNQESNENTDPRTKRFFGGVIGTIALGVATSAQITAAVALVEAKQARSDIEKLKEAIRDTNKAVQS
VQSSIGNLIVAIKSVQDYVNKEIVPSIARLGCEAAGLQLGIALTQHYSELTNIFGDNIGSLQEKGIKLQGIASLYRTNIT
EIFTTSTVDKYDIYDLLFTESIKVRVIDVDLNDYSITLQVRLPLLTRLLNTQIYRVDSISYNIQNREWYIPLPSHIMTKG
AFLGGADVKECIEAFSSYICPSDPGFVLNHEMESCLSGNISQCPRTVVKSDIVPRYAFVNGGVVANCITTTCTCNGIGNR
INQPPDQGVKIITHKECNTIGINGMLFNTNKEGTLAFYTPNDITLNNSVALDPIDISIELNKAKSDLEESKEWIRRSNQK
LDSIGNWHQSSTTIIIVLIMIIILFIINVTIIIIAVKYYRIQKRNRVDQNDKPYVLTNK
>P04849 ~~~F~~~Fusion glycoprotein F0~~~
MGTIIQFLVVSCLLAGAGSLDPAALMQIGVIPTNVRQLMYYTEASSAFIVVKLMPTIDSPISGCNITSISSYNATVTKLL
QPIGENLETIRNQLIPTRRRRRFAGVVIGLAALGVATAAQVTAAVALVKANENAAAILNLKNAIQKTNAAVADVVQATQS
LGTAVQAVQDHINSVVSPAITAANCKAQDAIIGSILNLYLTELTTIFHNQITNPALSPITIQALRILLGSTLPTVVEKSF
NTQISAAELLSSGLLTGQIVGLDLTYMQMVIKIELPTLTVQPATQIIDLATISAFINNQEVMAQLPTRVMVTGSLIQAYP
ASQCTITPNTVYCRYNDAQVLSDDTMACLQGNLTRCTFSPVVGSFLTRFVLFDGIVYANCRSMLCKCMQPAAVILQPSSS
PVTVIDMYKCVSLQLDNLRFTITQLANVTYNSTIKLESSQILSIDPLDISQNLAAVNKSLSDALQHLAQSDTYLSAITSA
TTTSVLSIIAICLGSLGLILIILLSVVVWKLLTIVVANRNRMENFVYHK
>P12575 ~~~F~~~Fusion glycoprotein F0~~~
MTAYIQRSQCISTSLLVVLTTLVSCQIPRDRLSNIGVIVDEGKSLKIAGSHESRYIVLSLVPGVDLENGCGTAQVIQYKS
LLNRLLIPLRDALDLQEALITVTNDTTQNAGVPQSRFFGAVIGTIALGVATSAQITAGIALAEAREAKRDIALIKESMTK
THKSIELLQNAVGEQILALKTLQDFVNDEIKPAISELGCETAALRLGIKLTQHYSGLLTAFGSNFGTIGEKSLTLQALSS
LYSANITEIMTTIRTGQSNIYDVIYTEQIKGTVIDVDLERYMVTLSVKIPILSEVPGVLIHKASSISYNIDGEEWYVTVP
SHILSRASFLGGADITDCVESRLTYICPRDPAQLIPDSQQKCILGDTTRCPVTKVVDSLIPKFAFVNGGVVANCIASTCT
CGTGRRPISQDRSKGVVFLTHDNCGLIGVNGVELYANRRGHDATWGVQNLTVGPAIAIRPIDISLNLADATNFLQDSKAE
LEKARKILSEVGRWYNSRETVITIIVVMVVILVVIIVIVIVLYRLKRSMLMGNPDDRIPRDTYTLEPKIRHMYTNGGFDA
MAEKR
>P04855 ~~~F~~~Fusion glycoprotein F0~~~
MTAYIQRSQCISTSLLVVLTTLVSCQIPRDRLSNIGVIVDEGKSLKIAGSHESRYIVLSLVPGVDFENGCGTAQVIQYKS
LLNRLLIPLRDALDLQEALITVTNDTTQNAGAPQSRFFGAVIGTIALGVATSAQITAGIALAEAREAKRDIALIKESMTK
THKSIELLQNAVGEQILALKTLQDFVNDEIKPAISELGCETAALRLGIKLTQHYSELLTAFGSNFGTIGEKSLTLQALSS
LYSANITEIMTTIKTGQSNIYDVIYTEQIKGTVIDVDLERYMVTLSVKIPILSEVPGVLIHKASSISYNIDGEEWYVTVP
SHILSRASFLGGADITDCVESRLTYICPRDPAQLIPDSQQKCILGDTTRCPVTKVVDSLIPKFAFVNGGVVANCIASTCT
CGTGRRPISQDRSKGVVFLTHDNCGLIGVNGVELYANRRGHDATWGVQNLTVGPAIAIRPIDISLNLADATNFLQDSKAE
LEKARKILSEVGRWYNSRETVITIIVVMVVILVVIIVIIIVLYRLRRSMLMGNPDDRIPRDTYTLEPKIRHMYTNGGFDA
MAEKR
>P25181 ~~~F~~~Fusion glycoprotein F0~~~
MRLTPYPIALTTLMIALTTLPETGLGIARDALSQVGVIQSKARSLMYYSDGSSSFIVVKLLPTLPTPSGNCNLTSITAYN
TTLFKLLTPLMENLDTIVSANQAGSRRKRFAGVVVGLAALGVATAAQVTAAVAVVKANANAAAINKLAASIQSTNAAISD
VISSTRTLATAIQAVQDHVNGVLASGLTEANCRSQDALIGSILNLYLTELTTIFHNQIVNPALTPLSIQALRIILGSTLP
LIVESRWNTNLNTAELLSSGLLTGQIISISPSYMQMVIQITVPTFVMQPGAKIIDLVTITANRMEEEVLIQVPPRILEYA
NEIQAYTADDCVVTPHAVFCKYNDGSPISDSLYQCLKGNLTSCVFTPVVGNYLKRFAFANGVMYVNCKALLCRCADPPMV
ITQDDLAGITVIDITVCREVMLDTLAFKITSLNNVTYGANFSMLAAAIKDLSPLDLSAQLAQVNKSLASAEEKIAQSSSL
AAQAVSQEATITVGSVAMLIAVLALIAGCTGIMIAVQMSRRLEVLRHLTDQSIISNHHYAELNPPPYNHSYESLHPIPQS
H
>P24614 ~~~F~~~Fusion glycoprotein F0~~~
MDVRICLLLFLISNPSSCIQETYNEESCSTVTRGYKSVLRTGWYTNVFNLEIGNVENITCNDGPSLIDTELVLTKNALRE
LKTVSADQVAKESRLSSPRRRRFVLGAIALGVATAAAVTAGVALAKTIRLEGEVKAIKNALRNTNEAVSTLGNGVRVLAT
AVNDLKEFISKKLTPAINQNKCNIADIKMAISFGQNNRRFLNVVRQFSDSAGITSAVSLDLMTDDELVRAINRMPTSSGQ
ISLMLNNRAMVRRKGFGILIGVYDGTVVYMVQLPIFGVIETPCWRVVAAPLCRKEKGNYACILREDQGWYCTNAGSTAYY
PNKDDCEVRDDYVFCDTAAGINVALEVEQCNYNISTSKYPCKVSTGRHPVSMVALTPLGGLVSCYESVSCSIGSNKVGII
KQLGKGCTHIPNNEADTITIDNTVYQLSKVVGEQRTIKGAPVVNNFNPILFPEDQFNVALDQVFESIDRSQDLIDKSNDL
LGADAKSKAGIAIAIVVLVILGIFFLLAVIYYCSRVRKTKPKHDYPATTGHSSMAYVS
>P0C045 ~~~~~~F protein~~~
MSTNPKPQRKKPNVTPTVAHRTSSSRVAVRSLVEFTCCRAGALDWVCARRGRLPSGRNLEVDVSLSPRHVGPRAGPGLSP
GTLGPSMAMRVAGGRDGSCLPVALGLAGAPQTPGVGRAIWVRSSIPLRAASPTSWGTYRSSAPLLEALPGPWRMASGFWK
TA
>P03657 ~~~I~~~Gene 1 protein~~~
MAVYFVTGKLGSGKTLVSVGKIQDKIVAGCKIATNLDLRLQNLPQVGRFAKTPRVLRIPDKPSISDLLAIGRGNDSYDEN
KNGLLVLDECGTWFNTRSWNDKERQPIIDWFLHARKLGWDIIFLVQDLSIVDKQARSALAENVVYCRRLDRITLPFVGTL
YSLITGSKMPLPKLHVGVVKYGDSQLSPTVERWLYTGKNLYNAYDTKQAFSSNYDSGVYSYLTPYLSHGRYFKPLNLGQK
MKLTKIYLKKFSRVLCLAIGFASAFTYSYITQPKPEVKKVVSQTYDFDKFTIDSSQRLNLSYRYVFKDSKGKLINSDDLQ
KQGYSLTYIDLCTVSIKKGNSNEIVKCN
>P69169 ~~~III~~~Attachment protein G3P~~~
MKKLLFAIPLVVPFYSHSAETVESCLAKPHTENSFTNVWKDDKTLDRYANYEGCLWNATGVVVCTGDETQCYGTWVPIGL
AIPENEGGGSEGGGSEGGGSEGGGTKPPEYGDTPIPGYTYINPLDGTYPPGTEQNPANPNPSLEESQPLNTFMFQNNRFR
NRQGALTVYTGTVTQGTDPVKTYYQYTPVSSKAMYDAYWNGKFRDCAFHSGFNEDPFVCEYQGQSSDLPQPPVNAGGGSG
GGSGGGSEGGGSEGGGSEGGGSEGGGSGGGSGSGDFDYEKMANANKGAMTENADENALQSDAKGKLDSVATDYGAAIDGF
IGDVSGLANGNGATGDFAGSNSQMAQVGDGDNSPLMNNFRQYLPSLPQSVECRPFVFGAGKPYEFSIDCDKINLFRGVFA
FLLYVATFMYVFSTFANILRNKES
>P03661 ~~~III~~~Attachment protein G3P~~~
MKKLLFAIPLVVPFYSHSAETVESCLAKPHTENSFTNVWKDDKTLDRYANYEGCLWNATGVVVCTGDETQCYGTWVPIGL
AIPENEGGGSEGGGSEGGGSEGGGTKPPEYGDTPIPGYTYINPLDGTYPPGTEQNPANPNPSLEESQPLNTFMFQNNRFR
NRQGALTVYTGTVTQGTDPVKTYYQYTPVSSKAMYDAYWNGKFRDCAFHSGFNEDPFVCEYQGQSSDLPQPPVNAGGGSG
GGSGGGSEGGGSEGGGSEGGGSEGGGSGGGSGSGDFDYEKMANANKGAMTENADENALQSDAKGKLDSVATDYGAAIDGF
IGDVSGLANGNGATGDFAGSNSQMAQVGDGDNSPLMNNFRQYLPSLPQSVECRPYVFGAGKPYEFSIDCDKINLFRGVFA
FLLYVATFMYVFSTFANILRNKES
>O80297 ~~~III~~~Attachment protein G3P~~~
MKKIIIALFFAPFFTHATTDAECLSKPAFDGTLSNVWKEGDSRYANFENCIYELSGIGIGYDNDTSCNGHWTPVRAADGS
GNGGDDNSSGGGSNGDSGNNSTPDTVTPGQTVNLPSDLSTLSIPANVVKSDSIGSQFSLYTNASCTMCSGYYLSNNADSI
AIANITETVKADYNQPDMWFEQTDSDGNHVKILQNSYKAVSYNVESKQSDVNNPTYINYSYSVNVKQVSYDTSNVCIMNW
ETFQNKCDASRAVLITDTVTPSYSRNITIQSNINYQGSNGSGGSGGSGGSGNDGGGTGNNGNGTGDFDYVKMANANKDAL
TESFDLSALQADTGASLDGSVQGTLDSLSGFSDSIGGLVGNGSAISGEFAGSSAAMNAIGEGDKSPLLDSLSFLKDGLFP
ALPEFKQCTPFVFAPGKEYEFIIECKYIDMFKGIFAFILYFWTFVTVYDSFSGILRKGRG
>P03663 ~~~III~~~Attachment protein G3P~~~
MKRKIIAISLFLYIPLSNADNWESITKSYYTGFAISKTVESKDKDGKPVRKEVITQADLTTACNDAKASAQNVFNQIKLT
LSGTWPNSQFRLVTGDTCVYNGSPGEKTESWSIRAQVEGDIQRSVPDEEPSEQTPEEICEAKPPIDGVFNNVFKGDEGGF
YINYNGCEYEATGVTVCQNDGTVCSSSAWKPTGYVPESGEPSSSPLKDGDTGGTGEGGSDTGGDTGGGDTGGGSTGGDTG
GSSGGGSSGGGSSGGSTGKSLTKEDVTAAIHVASPSIGDAVKDSLTEDNDQYDNQKKADEQSAKASASVSDAISDGMRGV
GNFVDDFGGESSQYGTGNSEMDLSVSLAKGQLGIDREGHGSAWESFLNDGALRPSIPTGHGCTNFVMYQGSVYQIEIGCD
KLNDIKSVLSWVMYCLTFWYVFQSVTSLLRKGEQ
>P69168 ~~~III~~~Attachment protein G3P~~~
MKKLLFAIPLVVPFYSHSAETVESCLAKPHTENSFTNVWKDDKTLDRYANYEGCLWNATGVVVCTGDETQCYGTWVPIGL
AIPENEGGGSEGGGSEGGGSEGGGTKPPEYGDTPIPGYTYINPLDGTYPPGTEQNPANPNPSLEESQPLNTFMFQNNRFR
NRQGALTVYTGTVTQGTDPVKTYYQYTPVSSKAMYDAYWNGKFRDCAFHSGFNEDPFVCEYQGQSSDLPQPPVNAGGGSG
GGSGGGSEGGGSEGGGSEGGGSEGGGSGGGSGSGDFDYEKMANANKGAMTENADENALQSDAKGKLDSVATDYGAAIDGF
IGDVSGLANGNGATGDFAGSNSQMAQVGDGDNSPLMNNFRQYLPSLPQSVECRPFVFSAGKPYEFSIDCDKINLFRGVFA
FLLYVATFMYVFSTFANILRNKES
>P03666 ~~~IV~~~Virion export protein~~~
MKLLNVINFVFLMFVSSSSFAQVIEMNNSSLRDFVTWYSKQTGESVIVSPDVKGTVTVYSSDVKPENLRDFFISVLRANN
FDMVGSIPSIIQKYNPNNQDYIDELPSSDNQEYDDNSAPSGGFFVPQNDNVTQTFKINNVRAKDLIRVVELFVKSNTSKS
SNVLSVDGSNLLVVSAPKDILDNLPQFLSTVDLPTDQILIEGLIFEVQQGDALDFSFAAGSQRGTVAGGVNTDRLTSVLS
SAGGSFGIFNGDVLGLSVRALKTNSHSKILSVPRILTLSGQKGSISVGQNVPFITGRVTGESANVNNPFQTVERQNVGIS
MSVFPVAMAGGNIVLDITSKADSLSSSTQASDVITNQRSIATTVNLRDGQTLLLGGLTDYKNTSQDSGVPFLSKIPLIGL
LFSSRSDSNEESTLYVLVKATIVRAL
>P69543 ~~~V~~~DNA-Binding protein G5P~~~
MIKVEIKPSQAQFTTRSGVSRQGKPYSLNEQLCYVDLGNEYPVLVKITLDEGQPAYAPGLYTVHLSSFKVGQFGSLMIDR
LRLVPAK
>P69542 ~~~V~~~DNA-Binding protein G5P~~~
MIKVEIKPSQAQFTTRSGVSRQGKPYSLNEQLCYVDLGNEYPVLVKITLDEGQPAYAPGLYTVHLSSFKVGQFGSLMIDR
LRLVPAK
>O80294 ~~~V~~~DNA-Binding protein G5P~~~
MSELGNLETTVTGKIKRFNNGGGYYYTTVVSPAADAYSFPPVIRIKSKKSLGRVGDEIADIHCRITGYERSFPYTDKQTG
EQSRGFNVDMLLELLE
>P03670 ~~~V~~~DNA-Binding protein G5P~~~
MLTVEIHDSQVSVKERSGVSQKSGKPYTIREQEAYIDLGGVYPALFNFNLEDGQQPYPAGKYRLHPASFKINNFGQVAVG
RVLLESVK
>P69544 ~~~V~~~DNA-Binding protein G5P~~~
MIKVEIKPSQAQFTTRSGVSRQGKPYSLNEQLCYVDLGNEYPVLVKITLDEGQPAYAPGLYTVHLSSFKVGQFGSLMIDR
LRLVPAK
>P03671 ~~~V~~~DNA-Binding protein G5P~~~
MNMFATQGGVVELWVTKTDTYTSTKTGEIYASVQSIAPIPEGARGNAKGFEISEYNIEPTLLDAIVFEGQPVLCKFASVV
RPTQDRFGRITNTQVLVDLLAVGGKPMAPTAQAPARPQAQAQAPRPAQQPQGQDKQDKSPDAKA
>P03672 ~~~V~~~DNA-Binding protein G5P~~~
MNIQITFTDSVRQGTSAKGNPYTFQEGFLHLEDKPFPLQCQFFVESVIPAGSYQVPYRINVNNGRPELAFDFKAMKRA
>P68676 ~~~V~~~DNA-Binding protein G5P~~~
MKVQIMSSAVAVRSFPAREGKPATHFREQTAAVLREGDFPLPFTIGLDEDQPPYGEGFYIIDPKSLQNNKFGGLEFGRRI
RLIPDLTAKLQQQPAKVG
>P69531 ~~~VI~~~Head virion protein G6P~~~
MPVLLGIPLLLRFLGFLLVTLFGYLLTFLKKGFGKIAIAISLFLALIIGLNSILVGYLSDISAQLPSDFVQGVQLILPSN
ALPCFYVILSVKAAIFIFDVKQKIVSYLDWDK
>P69532 ~~~VI~~~Head virion protein G6P~~~
MPVLLGIPLLLRFLGFLLVTLFGYLLTFLKKGFGKIAIAISLFLALIIGLNSILVGYLSDISAQLPSDFVQGVQLILPSN
ALPCFYVILSVKAAIFIFDVKQKIVSYLDWDK
>P69534 ~~~VII~~~Tail virion protein G7P~~~
MEQVADFDTIYQAMIQISVVLCFALGIIAGGQR
>P69535 ~~~VII~~~Tail virion protein G7P~~~
MEQVADFDTIYQAMIQISVVLCFALGIIAGGQR
>P68715 ~~~~~~Assembly protein G7~~~
MAAEQRRSTIFDIVSKCIVQSVLRDISINSEYIESKAKQLCYCPASKKESVINGIYNCCESNIEIMDKEQLLKILDNLRC
HSAHVCNATDFWRLYNSLKRFTHTTAFFNTCKPTILATLNTLITLILSNKLLYAAEMVEYLENQLDSSNKSMSQELAELL
EMKYALINLVQYRILPMIIGEPIIVAGFSGKEPISDYSAEVERLMELPVKTDIVNTTYDFLARKGIDTSNNIAEYIAGLK
IEEIEKVEKYLPEVISTIANSNIIKNKKSIFPANINDKQIMECSRMLDTSEKYSKGYKTDGAVTSPLTGNNTITTFIPIS
ASDMQKFTILEYLYIMRVMANNVKKKNEGKNNGGVVMHINSPFKVINLPKC
>P21028 ~~~G7L~~~Assembly protein G7~~~
MAAEQRRSTIFDIVSKCIVQSVLRDISINSEYIESKAKQLCYCPASKKESVINGIYNCCESNIEIMDKEQLLKILDNLRC
HSAHVCNATDFWRLYNSLKRFTHTTAFFNTCKPTILATLNTLITLILSNKLLYAAEMVEYLENQLDSSNKSMSQELAELL
EMKYALINLVQYRILPMIIGEPIIVAGFSGKEPISDYSAEVERLMELPVKTDIVNTTYDFLARKGIDTSNNIAEYIAGLK
IEEIEKVEKYLPEVISTIANSNIIKNKKSIFPANINDKQIMECSRMLDTSEKYSKGYKTDGAVTSPLTGNNTITTFIPIS
ASDMQKFTILEYLYIMRVMANNVKKKNVGKNNGGVVMHINSPFKVINLPKC
>P68716 ~~~~~~Assembly protein G7~~~
MAAEQRRSTIFDIVSKCIVQSVLRDISINSEYIESKAKQLCYCPASKKESVINGIYNCCESNIEIMDKEQLLKILDNLRC
HSAHVCNATDFWRLYNSLKRFTHTTAFFNTCKPTILATLNTLITLILSNKLLYAAEMVEYLENQLDSSNKSMSQELAELL
EMKYALINLVQYRILPMIIGEPIIVAGFSGKEPISDYSAEVERLMELPVKTDIVNTTYDFLARKGIDTSNNIAEYIAGLK
IEEIEKVEKYLPEVISTIANSNIIKNKKSIFPANINDKQIMECSRMLDTSEKYSKGYKTDGAVTSPLTGNNTITTFIPIS
ASDMQKFTILEYLYIMRVMANNVKKKNEGKNNGGVVMHINSPFKVINLPKC
>P0DSU1 ~~~G7L~~~Assembly protein G7~~~
MAAEQRRSTIFDIVSKCIVQSVLRDISINSEYIESKAKQLCYCPASKKESVINGIYNCCESNIEIMDKEQLLKILDNLRC
HSAHVCNATDFWRLYNSLKRFTHTTAFFNTCKPTILATLNTLITLILSNKLLYAAEMVEYLENQLDSSNKSMSQELAELL
EMKYALINLVQYRILPMIIGEPIIVAGFSGKEPISNYSAEVERLMELPVKTDIVNTTYDFLARKGIDTSNNIAEYIAGLK
IEEIEKVEKYLPEVISTIANSNIIKNKKSIFPANINDKQIMECSKMLDTSEKYSKGYKTDGAVTSPLTGNNTITTFIPIS
ASDMQKFTILEYLYIMRVMANNVKKKNEGKNNGGVVMHINSPFKVINLPKC
>P0DSU2 ~~~G7L~~~Assembly protein G7~~~
MAAEQRRSTIFDIVSKCIVQSVLRDISINSEYIESKAKQLCYCPASKKESVINGIYNCCESNIEIMDKEQLLKILDNLRC
HSAHVCNATDFWRLYNSLKRFTHTTAFFNTCKPTILATLNTLITLILSNKLLYAAEMVEYLENQLDSSNKSMSQELAELL
EMKYALINLVQYRILPMIIGEPIIVAGFSGKEPISNYSAEVERLMELPVKTDIVNTTYDFLARKGIDTSNNIAEYIAGLK
IEEIEKVEKYLPEVISTIANSNIIKNKKSIFPANINDKQIMECSKMLDTSEKYSKGYKTDGAVTSPLTGNNTITTFIPIS
ASDMQKFTILEYLYIMRVMANNVKKKNEGKNNGGVVMHINSPFKVINLPKC
>P69537 ~~~IX~~~Tail virion protein G9P~~~
MSVLVYSFASFVLGWCLRSGITYFTRLMETSS
>P69538 ~~~IX~~~Tail virion protein G9P~~~
MSVLVYSFASFVLGWCLRSGITYFTRLMETSS
>Q49P94 ~~~L6~~~Golgi anti-apoptotic protein~~~
MAMPSLSACSSIEDDFNYGSSVASASVHIRMAFLRKVYGILCLQFLLTTATTAVFLYFDCMRTFIQGSPVLILASMFGSI
GLIFALTLHRHKHPLNLYLLCGFTLSESLTLASVVTFYDVHVVMQAFMLTTAAFLALTTYTLQSKRDFSKLGAGLFAALW
ILILSGLLGIFVQNETVKLVLSAFGALVFCGFIIYDTHSLIHKLSPEEYVLASINLYLDIINLFLHLLQLLEVSNKK
>P0C776 ~~~gag~~~Gag polyprotein~~~
MEAVIKVISSACKTYCGKISPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMVLGKSGELKTWGLVLGA
LKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTAQRDAKMAPEKMATPKTVGTSCYQCGTA
TGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPWGAEQPRAEPGHAGLAPGPALTDWARIREELASTGPPVVAMP
VVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVI
AAATRDPRHPANGQGRGERTNLDRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADITQGPS
ESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRAAPSTLTTPGEIIKYVLDRQKIAPLTDQGIAAAMS
SAIQPLVMAVVNRERDGQTGSGGRARGLCYTCGSPGHYQAQCPKKRKSGNSRERCQLCDGMGHNAKQCRRRDGNQGQRPG
KGLSSGSWPVSEQPAVSLAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVMEAANPQ
IHGIGGGIPMRKSRDMIEVGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNL
>P26315 3.4.23.-~~~gag~~~Proteinase p15~~~
LAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVMEAANPQIHGIGGGIPMRKSRDMI
EVGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNL
>P19558 ~~~gag~~~Gag polyprotein~~~
MKRRELEKKLRKVRVTPQQDKYYTIGNLQWAIRMINLMGIKCVCDEECSAAEVALIITQFSALDLENSPIRGKEEVAIKN
TLKVFWSLLAGYKPESTETALGYWEAFTYREREARADKEGEIKSIYPSLTQNTQNKKQTSNQTNTQSLPAITTQDGTPRF
DPDLMKQLKIWSDATERNGVDLHAVNILGVITANLVQEEIKLLLNSTPKWRLDVQLIESKVREKENAHRTWKQHHPEAPK
TDEIIGKGLSSAEQATLISVECRETFRQWVLQAAMEVAQAKHATPGPINIHQGPKEPYTDFINRLVAALEGMAAPETTKE
YLLQHLSIDHANEDCQSILRPLGPNTPMEKKLEACRVVGSQKSKMQFLVAAMKEMGIQSPIPAVLPHTPEAYASQTSGPE
DGRRCYGCGKTGHLKRNCKQQKCYHCGKPGHQARNCRSKNGKCSSAPYGQRSQPQNNFHQSNMSSVTPSAPPLILD
>P25058 ~~~gag~~~Gag polyprotein~~~
MGNSPSYNPPAGISPSDWLNLLQSAQRLNPRPSPSDFTDLKNYIHWFHKTQKKPWTFTSGGPASCPPGKFGRVPLVLATL
NEVLSNDEGAPGASAPEEQPPPYDPPAVLPIISEGNRNRHRAWALRELQDIKKEIENKAPGSQVWIQTLRLAILQADPTP
ADLEQLCQYIASPVDQTAHMTSLTAAIAAEAANTLQGFNPKMGTLTQQSAQPNAGDLRSQYQNLWLQAWKNLPTRPSVQP
WSTIVQGPAESYVEFVNRLQISLADNLPDGVPKEPIIDSLSYANANKECQQILQGRGLVAAPVGQKLQACAHWAPKTKQP
AILVHTPGPKMPGPRQPAPKRPPPGPCYRCLKEGHWARDCPTKTTGPPPGPCPICKDPSHWKRDCPTLKSKN
>P03344 ~~~gag~~~Gag polyprotein~~~
MGNSPSYNPPAGISPSDWLNLLQSAQRLNPRPSPSDFTDLKNYIHWFHKTQKKPWTFTSGGPTSCPPGRFGRVPLVLATL
NEVLSNEGGAPGASAPEEQPPPYDPPAILPIISEGNRNRHRAWALRELQDIKKEIENKAPGSQVWIQTLRLAILQADPTP
ADLEQLCQYIASPVDQTAHMTSLTAAIAAAEAATPSRVLTPKTGTLTQQSAQPNAGDLRSQYQNLWLQAGKISLLVLQLQ
PWSTIVQGPAESSVEFVNRLQISLADNLPDGVLRNPLLTPLVMQMLTESVSKFCRGEASGRGGAKTAGLRTIGPPRMKQP
ALLVHTPGPKMPGPRQPAPKRPPPGPCYRCLKEGHWARDCPTKATGPPPGPCPICKDPSHWKRDCPTLKSKN
>P69730 ~~~gag~~~Gag polyprotein~~~
MGDPLTWSKALKKLEKVTVQGSQKLTTGNCNWALSLVDLFHDTNFVKEKDWQLRDVIPLLEDVTQTLSGQEREAFERTWW
AISAVKMGLQINNVVDGKASFQLLRAKYEKKTANKKQSEPSEEYPIMIDGAGNRNFRPLTPRGYTTWVNTIQTNGLLNEA
SQNLFGILSVDCTSEEMNAFLDVVPGQAGQKQILLDAIDKIADDWDNRHPLPNAPLVAPPQGPIPMTARFIRGLGVPRER
QMEPAFDQFRQTYRQWIIEAMSEGIKVMIGKPKAQNIRQGAKEPYPEFVDRLLSQIKSEGHPQEISKFLTDTLTIQNANE
ECRNAMRHLRPEDTLEEKMYACRDIGTTKQKMMLLAKALQTGLAGPFKGGALKGGPLKAAQTCYNCGKPGHLSSQCRAPK
VCFKCKQPGHFSKQCRSVPKNGKQGAQGRPQKQTFPIQQKSQHNKSVVQETPQTQNLYPDLSEIKKEYNVKEKDQVEDLN
LDSLWE
>P69732 ~~~gag~~~Gag polyprotein~~~
MGDPLTWSKALKKLEKVTVQGSQKLTTGNCNWALSLVDLFHDTNFVKEKDWQLRDVIPLLEDVTQTLSGQEREAFERTWW
AISAVKMGLQINNVVDGKASFQLLRAKYEKKTANKKQSEPSEEYPIMIDGAGNRNFRPLTPRGYTTWVNTIQTNGLLNEA
SQNLFGILSVDCTSEEMNAFLDVVPGQAGQKQILLDAIDKIADDWDNRHPLPNAPLVAPPQGPIPMTARFIRGLGVPRER
QMEPAFDQFRQTYRQWIIEAMSEGIKVMIGKPKAQNIRQGAKEPYPEFVDRLLSQIKSEGHPQEISKFLTDTLTIQNANE
ECRNAMRHLRPEDTLEEKMYACRDIGTTKQKMMLLAKALQTGLAGPFKGGALKGGPLKAAQTCYNCGKPGHLSSQCRAPK
VCFKCKQPGHFSKQCRSVPKNGKQGAQGRPQKQTFPIQQKSQHNKSVVQETPQTQNLYPDLSEIKKEYNVKEKDQVEDLN
LDSLWE
>O56860 ~~~gag~~~Gag polyprotein~~~
MARELNPLQLQQLYINNGLQPNPGHGDIIAVRFTGGPWGPGDRWARVTIRLQDNTGQPLQVPGYDLEPGIINLREDILIA
GPYNLIRTAFLDLEPARGPERHGPFGDGRLQPGDGLSEGFQPITDEEIQAEVGTIGAARNEIRLLREALQRLQAGGVGRP
IPGAVLQPQPVIGPVIPINHLRSVIGNTPPNPRDVALWLGRSTAAIEGVFPIVDQVTRMRVVNALVASHPGLTLTENEAG
SWNAAISALWRKAHGAAAQHELAGVLSDINKKEGIQTAFNLGMQFTDGNWSLVWGIIRTLLPGQALVTNAQSQFDLMGDD
IQRAENFPRVINNLYTMLGLNIHGQSIRPRVQTQPLQTRPRNPGRSQQGQLNQPRPQNRANQSYRPPRQQQQHSDVPEQR
DQRGPSQPPRGSGGGYNFRRNPQQPQRYGQGPPGPNPYRRFGDGGNPQQQGPPPNRGPDQGPRPGGNPRGGGRGQGPRNG
GGSAAAVHTVKASENETKNGSAEAVDGGKKGGKD
>P16087 ~~~gag~~~Gag polyprotein~~~
MGNGQGRDWKMAIKRCSNVAVGVGGKSKKFGEGNFRWAIRMANVSTGREPGDIPETLDQLRLVICDLQERREKFGSSKEI
DMAIVTLKVFAVAGLLNMTVSTAAAAENMYSQMGLDTRPSMKEAGGKEEGPPQAYPIQTVNGVPQYVALDPKMVSIFMEK
AREGLGGEEVQLWFTAFSANLTPTDMATLIMAAPGCAADKEILDESLKQLTAEYDRTHPPDAPRPLPYFTAAEIMGIGLT
QEQQAEARFAPARMQCRAWYLEALGKLAAIKAKSPRAVQLRQGAKEDYSSFIDRLFAQIDQEQNTAEVKLYLKQSLSIAN
ANADCKKAMSHLKPESTLEEKLRACQEIGSPGYKMQLLAEALTKVQVVQSKGSGPVCFNCKKPGHLARQCREVKKCNKCG
KPGHLAAKCWQGNRKNSGNWKAGRAAAPVNQMQQAVMPSAPPMEEKLLDL
>P14349 ~~~gag~~~Gag polyprotein~~~
MASGSNVEEYELDVEALVVILRDRNIPRNPLHGEVIGLRLTEGWWGQIERFQMVRLILQDDDNEPLQRPRYEVIQRAVNP
HTMFMISGPLAELQLAFQDLDLPEGPLRFGPLANGHYVQGDPYSSSYRPVTMAETAQMTRDELEDVLNTQSEIEIQMINL
LELYEVETRALRRQLAERSSTGQGGISPGAPRSRPPVSSFSGLPSLPSIPGIHPRAPSPPRATSTPGNIPWSLGDDSPPS
SSFPGPSQPRVSFHPGNPFVEEEGHRPRSQSRERRREILPAPVPSAPPMIQYIPVPPPPPIGTVIPIQHIRSVTGEPPRN
PREIPIWLGRNAPAIDGVFPVTTPDLRCRIINAILGGNIGLSLTPGDCLTWDSAVATLFIRTHGTFPMHQLGNVIKGIVD
QEGVATAYTLGMMLSGQNYQLVSGIIRGYLPGQAVVTALQQRLDQEIDNQTRAETFIQHLNAVYEILGLNARGQSIRASV
TPQPRPSRGRGRGQNTSRPSQGPANSGRGRQRPASGQSNRGSSTQNQNQDNLNQGGYNLRPRTYQPQRYGGGRGRRWNDN
TNNQESRPSDQGSQTPRPNQAGSGVRGNQSQTPRPAAGRGGRGNHNRNQRSSGAGDSRAVNTVTQSATSSTDESSSAVTA
ASGGDQRD
>P03345 ~~~gag~~~Gag polyprotein~~~
MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPARICPINYSLLASLLPKGYPGRV
NEILHILIQTQAQIPSRPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLR
REYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQTWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTI
PEPEPEEDALLLDLPADIPHPKNSIGGEV
>P14077 ~~~gag~~~Gag polyprotein~~~
MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPVWICPINYSLLASLLPKGYPGRV
NEILHILIQTQAQIPSRPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITSYNPLAGPLRVQANNPQQQGLR
REYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQTWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTI
PEPEPEEDALLLDLPADIPHPKNSIGGEV
>P03346 ~~~gag~~~Gag polyprotein~~~
MGQIHGLSPTPIPKAPRGLSTHHWLNFLQAAYRLQPRPSDFDFQQLRRFLKLALKTPIWLNPIDYSLLASLIPKGYPGRV
VEIINILVKNQVSPSAPAAPVPTPICPTTTPPPPPPPSPEAHVPPPYVEPTTTQCFPILHPPGAPSAHRPWQMKDLQAIK
QEVSSSALGSPQFMQTLRLAVQQFDPTAKDLQDLLQYLCSSLVVSLHHQQLNTLITEAETRGMTGYNPMAGPLRMQANNP
AQQGLRREYQNLWLAAFSTLPGNTRDPSWAAILQGLEEPYCAFVERLNVALDNGLPEGTPKEPILRSLAYSNANKECQKI
LQARGHTNSPLGEMLRTCQAWTPKDKTKVLVVQPRRPPPTQPCFRCGKVGHWSRDCTQPRPPPGPCPLCQDPSHWKRDCP
QLKPPQEEGEPLLLDLPSTSGTTEEKNSLRGEI
>P03349 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDKWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIDVKDTKEALEKIEEEQNKSKKKAQQAAAAAGTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKV
VEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAG
TTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMT
ETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPANIMMQRGNFRNQRKTVKCFNCGKE
GHIAKNCRAPRKKGCWRCGREGHQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPEESFRFGEEKTTPSQKQEP
IDKELYPLTSLRSLFGNDPSSQ
>P03347 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNTATIMMQRGNFRNQRKMVKCFNCGKEGH
TARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPFLQSRPEPTAPPEESFRSGVE
TTTPPQKQEPIDKELYPLTSLRSLFGNDPSSQ
>P03348 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
IARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPFLQSRPEPTAPPEESFRSGVE
TTTPSQKQEPIDKELYPLTSLRSLFGNDPSSQ
>P04591 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
TARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPEESFRSGVETTTPPQKQEPID
KELYPLTSLRSLFGNDPSSQ
>Q70622 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGKLDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEECRSLYN
TVATLYCVHQRIEIKDTKEALDKIKEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
IARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPEESFRSGVETTTPPQKQEPID
KELYPLTSLRSLFGNDPSSQ
>P05888 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHVVWASRELERFAINPGLLETSEGCRQILGQLQPSLQTGSEERKSLYN
TVATLYCVHQKIKIKDTKEALEKIEEEQNKSKKKAQQAAADTGNRGNSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVK
VVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPAHAGPIAPGQMREPRGSDIA
GTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPSSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWM
TETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIIKCFNCGK
EGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQANFLGKIWPSCKGRPGNFPQSRTEPTAPPEESFRFGEETTTPYQKQE
KKQETIDKDLYPLASLKSLFGNDPLSQ
>Q79665 ~~~gag~~~Gag polyprotein~~~
MGARASVLTGSKLDAWERIRLRPGSKKAYRLKHLVWASRELERYACNPGLLETAEGTEQLLQQLEPALKTGSEDLKSLWN
AIAVLWCVHNRFDIRDTQQAIQKLKEVMASRKSAEAAKEETSPRQTSQNYPIVTNAQGQMVHQAISPRTLNAWVKAVEEK
AFNPEIIPMFMALSEGAVPYDINTMLNAIGGHQGALQVLKEVINEEAAEWDRTHPPAMGPLPPGQIREPTGSDIAGTTST
QQEQIIWTTRGANSIPVGDIYRKWIVLGLNKMVKMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQATQEVKNWMTETL
LVQNSNPDCKQILKALGPEATLEEMMVACQGVGGPTHKAKILAEAMASAQQDLKGGYTAVFMQRGQNPNRKGPIKCFNCG
KEGHIAKNCRAPRKRGCWKCGQEGHQMKDCKNGRQANFLGKYWPPGGTRPGNYVQKQVSPSAPPMEEAVKEQENQSQKGD
QEELYPFASLKSLFGTDQ
>P12493 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGELDKWEKIRLRPGGKKQYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TIAVLYCVHQRIDVKDTKEALDKIEEEQNKSKKKAQQAAADTGNNSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTHNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPATIMIQKGNFRNQRKTVKCFNCGKEGH
IAKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPPEESFRFGEETTTPSQKQEPID
KELYPLASLRSLFGSDPSSQ
>Q9WC62 ~~~gag~~~Gag polyprotein~~~
MGARASILSGGKLDDWEKIRLRPGGKKQYRIKHLVWASRELDRFALNPGLLESAKGCQQILVQLQPALQTGTEEIKSLYN
TVATLYCVHQRIEIKDTKEALDKIEEIQNKNKQQTQKAETDKKDNSQVSQNYPIVQNLQGQPVHQALSPRTLNAWVKVIE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTIGGHQAAMQMLKDTINEEAAEWDRVHPVHAGPVAPGQVREPRGSDIAGTT
SNLQEQIGWMTGNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKALRAEQATQDVKNWMTDT
LLVQNANPDCKTILKALGSGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNTNIMMQRGNFRDHKRIVKCFNCGKQGHI
AKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSSKGRPGNFLQSRPEPTAPPAESLGFGEEIPSPKQEPKDKEL
YPLTSLRSLFGSDPLSQ
>P24736 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGKKLDSWEKIRLRPGGNKKYRLKHLVWASRELEKFTLNPGLLETAEGCQQILGQLQPALQTGTEELRSLYN
TVAVLYCVHQRIDVKDTKEALNKIEEMQNKNKQRTQQAAANTGSSQNYPIVQNAQGQPVHQALSPRTLNAWVKVVEDKAF
SPEVIPMFSALSEGATPQDLNMMLNVVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPIPPGQMREPRGSDIAGTTSTVQ
EQIGWMTGNPPIPVGDIYRRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTETLLVQ
NANPDCKSILRALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVQQTSIMMQRGNFRGPRRIKCFNCGKEGHLAKNCR
APRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFPQSRPEPTAPPAEIFGMGEKMTSPAKQELKDREQTPLV
SLKSLFGNDPLSQ
>P35962 ~~~gag~~~Gag polyprotein~~~
MGARASVLSAGELDKWEKIRLRPGGKKQYRLKHIVWASRELERFAVDPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQKIEVKDTKEALEKIEEEQNKSKKKAQQAAADTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKTVKCFNCGKEGH
IAKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPSEESVRFGEETTTPSQKQEPID
KELYPLASLRSLFGSDPSSQ
>P12495 ~~~gag~~~Gag polyprotein~~~
MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPGLLETSDGCKQIIGQLQPAIRTGSEELRSLFN
TVATLYCVHERIEVKDTKEALEKMEEEQNKSKNKKAQQAAADAGNNSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVI
EEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGT
TSTLQEQIAWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKGWMTE
TLLVQNANPDCKTILKALGPQATLEEMMTACQGVGGPSHKARVLAEAMSQATNSAAAVMMQRGNFKGPRKTIKCFNCGKE
GHIAKNCRAPRRKGCWKCGKEGHQLKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPPAESFGFGEEITPSQKQEQK
DKELYPSTALKSLFGNDPLLQ
>P18095 ~~~gag~~~Gag polyprotein~~~
MGARNSVLRGKKADELEKVRLRPGGKKKYRLKHIVWAANELDKFGLAESLLESKEGCQKILRVLDPLVPTGSENLKSLFN
TVCVIWCLHAEEKVKDTEEAKKLAQRHLVAETGTAEKMPNTSRPTAPPSGKRGNYPVQQAGGNYVHVPLSPRTLNAWVKL
VEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIREIINEEAADWDSQHPIPGPLPAGQLRDPRGSDIAGT
TSTVDEQIQWMYRPQNPVPVGNIYRRWIQIGLQKCVRKYNPTNILDIKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWMT
QTLLIQNANPDCKLVLKGLGMNPTLEEMLTACQGVGGPGQKARLMAEALKEAMGPSPIPFAAAQQRKAIRYWNCGKEGHS
ARQCRAPRRQGCWKCGKPGHIMANCPERQAGFLGLGPRGKKPRNFPVTQAPQGLIPTAPPADPAAELLERYMQQGRKQRE
QRERPYKEVTEDLLHLEQRETPHREETEDLLHLNSLFGKDQ
>P18041 ~~~gag~~~Gag polyprotein~~~
MGARNSVLRGKKADELEKIRLRPSGKKKYRLKHIVWAANELDKFGLAESLLESKEGCQKILTVLDPLVPTGSENLKSLFN
TVCVIWCLHAEEKVKDTEEAKKLVQRHLGAETGTAEKMPSTSRPTAPPSGRGRNFPVQQTGGGNYIHVPLSPRTLNAWVK
LVEDKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIREIINDEAADWDAQHPIPGPLPAGQLRDPRGSDIAG
TTSTVEEQIQWMYRPQNPVPVGNIYRRWIQIGLQKCVRMYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWM
TQTLLIQNANPDCKLVLKGLGMNPTLEEMLTACQGVGGPGQKARLMAEALKEALTPPPIPFAAAQQRKVIRCWNCGKEGH
SARQCRAPRRQGCWKCGKTGHVMAKCPERQAGFLGMGPWGKKPRNFPVAQAPPGLIPTAPPADPAVDLLERYMQQGREQR
EQRERPYKEVTEDLLHLEQGKAPHREATEDLLHLNSLFGKDQ
>P04590 ~~~gag~~~Gag polyprotein~~~
MGARNSVLRGKKADELERIRLRPGGKKKYRLKHIVWAANKLDRFGLAESLLESKEGCQKILTVLDPMVPTGSENLKSLFN
TVCVIWCIHAEEKVKDTEGAKQIVRRHLVAETGTAEKMPSTSRPTAPSSEKGGNYPVQHVGGNYTHIPLSPRTLNAWVKL
VEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIREIINEEAAEWDVQHPIPGPLPAGQLREPRGSDIAGT
TSTVEEQIQWMFRPQNPVPVGNIYRRWIQIGLQKCVRMYNPTNILDIKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWMT
QTLLVQNANPDCKLVLKGLGMNPTLEEMLTACQGVGGPGQKARLMAEALKEVIGPAPIPFAAAQQRKAFKCWNCGKEGHS
ARQCRAPRRQGCWKCGKPGHIMTNCPDRQAGFLGLGPWGKKPRNFPVAQVPQGLTPTAPPVDPAVDLLEKYMQQGKRQRE
QRERPYKEVTEDLLHLEQGETPYREPPTEDLLHLNSLFGKDQ
>P31790 ~~~gag~~~Intracisternal A-particle Gag-related polyprotein~~~
MGISHSIVVALRSVLKQCGLKIATKTLEGFVREIDRVAPWYACSGSLTVASWDKLKGDLVREQQKGKLKAGIIPLWKLVK
SCLTDEDCQQMVEAGQKVLDEIQESLSEVERGEKVKVERKQSALKNLGLSTGLEPEEKRYKGKNALGEIRKRDEKGEKKG
DRAGEAHKERSLYPPLVEFKQLTLSNSEPDEGVSTSEETDSEEEAVRYKGERYQQDKMATQPRKRQKAADESQLAAWPPD
CRLQGPSAPPLYVQR
>P31622 ~~~gag~~~Gag polyprotein~~~
MGHTHSRQLFVHMLSVMLKHRGITVSKTKLINFLSFIEEVCPWFPREGTVNLETWKKVGEQIRTHYTLHGPEKVPVETLS
FWTLIRDCLDFDNDELKRLGNLLKQEEDPLHTPDSVPSYDPPPPPPPSLKMHPSDNDDSLSSTDEAELDEEAAKYHQEDW
GFLAQEKGALTSKDELVECFKNLTIALQNAGIQLPSNNNTFPSAPPFPPAYTPTVMAGLDPPPGFPPPSKHMSPLQKALR
QAQRLGEVVSDFSLAFPVFENNNQRYYESLPFKQLKELKIACSQYGPTAPFTIAMIESLGTQALPPNDWKQTARACLSGG
DYLLWKSEFFEQCARIADVNRQQGIQTSYEMLIGEGPYQATDTQLNFLPGAYAQISNAARQAWKKLPSSSTKTEDLSKVR
QGPDEPYQDFVARLLDTIGKIMSDEKAGMVLAKQLAFENANSACQAALRPYRKKGDLSDFIRICADIGPSYMQGIAMAAA
LQGKSIKEVLFQQQARNKKGLQKSGNSGCFVCGQPGHRAAVCPQKHQTSVNTPNLCPRCKKGKHWARDCRSKTDVQGNPL
PPVSGNWVRGQPLAPKQCYGATLQVPKEPLQTSVEPQEAARDWTSVPPPIQY
>P03336 ~~~gag~~~Gag polyprotein~~~
MGQTVTTPLSLTLEHWEDVQRIASNQSVDVKKRRWVTFCSAEWPTFGVGWPQDGTFNLDIILQVKSKVFSPGPHGHPDQV
PYIVTWEAIAYEPPPWVKPFVSPKLSPSPTAPILPSGPSTQPPPRSALYPALTPSIKPRPSKPQVLSDNGGPLIDLLSED
PPPYGGQGLSSSDGDGDREEATSTSEIPAPSPIVSRLRGKRDPPAADSTTSRAFPLRLGGNGQLQYWPFSSSDLYNWKNN
NPSFSEDPGKLTALIESVLTTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLPNEVDAAFPLERPDWDY
TTQRGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQS
APDIGRKLERLEDLKSKTLGDLVREAERIFNKRETPEEREERVRRETEEKEERRRAEEEQKEKERDRRRHREMSKLLATV
VSGQRQDRQGGERRRPQLDKDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLDD
>P26806 ~~~gag~~~Gag polyprotein~~~
MGQAVTTPLSLTLDHWKDVERTAHNLSVEVRKRRWVTFCSAEWPTFNVGWPRDGTFNPDIITQVKIKVFSPGPHGHPDQV
PYIVTWEAIAVDPPPWVRPFVHPKPPLSLPPSAPSLPPEPPLSTPPQSSLYPALTSPLNTKPRPQVLPDSGGPLIDLLTE
DPPPYRDPGPPSPDGNGDSGEVAPTEGAPDPSPMVSRLRGRKEPPVADSTTSQAFPLRLGGNGQYQYWPFSSSDLYNWKN
NNPSFSEDPAKLTALIESVLLTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGEDGRPTQLPNDINDAFPLERPDWD
YNTQRGRNHLVHYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVAMSFIWQ
SAPDIGRKLERLEDLKSKTLGDLVREAEKIFNKRETPEEREERIRRETEEKEERRRAEDVQREKERDRRRHREMSKLLAT
VVSGQRQDRQGGERRRPQLDHDQCAYCKEKGHWARDCPKKPRGPRGPRPQASLLTLDD
>P03332 ~~~gag~~~Gag polyprotein~~~
MGQTVTTPLSLTLGHWKDVERIAHNQSVDVKKRRWVTFCSAEWPTFNVGWPRDGTFNRDLITQVKIKVFSPGPHGHPDQV
PYIVTWEALAFDPPPWVKPFVHPKPPPPLPPSAPSLPLEPPRSTPPRSSLYPALTPSLGAKPKPQVLSDSGGPLIDLLTE
DPPPYRDPRPPPSDRDGNGGEATPAGEAPDPSPMASRLRGRREPPVADSTTSQAFPLRAGGNGQLQYWPFSSSDLYNWKN
NNPSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGDDGRPTQLPNEVDAAFPLERPDWD
YTTQAGRNHLVHYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQ
SAPDIGRKLERLEDLKNKTLGDLVREAEKIFNKRETPEEREERIRRETEEKEERRRTEDEQKEKERDRRRHREMSKLLAT
VVSGQKQDRQGGERRRSQLDRDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLDD
>P11269 ~~~gag~~~Gag polyprotein~~~
MGQTVTTPLSLTLEHWGDVQRIASNQSVEVKKRRRVTFCPAEWPTFDVGWPQDGTFNLDIILQVKSKVFSPGPHGHPDQV
PYIVTWEAIAYEPPSWVKPFVSPKLSLSPTAPILPSGPSTQPPPRSALYPALTPSIKPRPSKPQVLSDNGGPLIDLLTED
PPPYGEQGPSSPDGDGDREEATYTSEIPAPSPMVSRLRGKRDPPAADSTTSRAFPLRLGGNGQLQYWPFSSSDLYNWKNN
NPSFSEDPGKLTALIESVLTTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLPNEVNSAFPLERPDWDY
TTPEGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDHGQETSVSMSFIWQS
APDIGRKLERLEDLKSKTLRDLVREAEKIFNKRETPEEREERFRRETEENEERRRAEDEQREKERDRRRQREMSKLLATV
VTGQRQDRQGGERKRPQLDKDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLDD
>P10258 ~~~gag~~~Gag polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKRYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSTSSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKIKKVVQRKENSEGKRKEKDSKAFLATDWNDDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVVKKKPQ
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDEDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKEMVQKAAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRMMPRGEGSDILIKQLAWENANSLCQDLIRPIRKTGTIQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGAEGPVCFSCGKTGHIRKDCKDEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNL
>P11284 ~~~gag~~~Gag polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKKYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSASSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKVKKIVQRKENSEHKRKEKDQKAFLATDWNNDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVDKKKPL
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDDDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKETVQKTAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRVMPRGEGSDILIKQLAWENANSLCQDLIRPMRKTGTMQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGSKGPVCFSCGKTGHIKRDCKEEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNL
>P07567 ~~~gag~~~Gag polyprotein~~~
MGQELSQHERYVEQLKQALKTRGVKVKYADLLKFFDFVKDTCPWFPQEGTIDIKRWRRVGDCFQDYYNTFGPEKVPVTAF
SYWNLIKELIDKKEVNPQVMAAVAQTEEILKSNSQTDLTKTSQNPDLDLISLDSDDEGAKSSSLQDKGLSSTKKPKRFPV
LLTAQTSKDPEDPNPSEVDWDGLEDEAAKYHNPDWPPFLTRPPPYNKATPSAPTVMAVVNPKEELKEKIAQLEEQIKLEE
LHQALISKLQKLKTGNETVTHPDTAGGLSRTPHWPGQHIPKGKCCASREKEEQIPKDIFPVTETVDGQGQAWRHHNGFDF
AVIKELKTAASQYGATAPYTLAIVESVADNWLTPTDWNTLVRAVLSGGDHLLWKSEFFENCRDTAKRNQQAGNGWDFDML
TGSGNYSSTDAQMQYDPGLFAQIQAAATKAWRKLPVKGDPGASLTGVKQGPDEPFADFVHRLITTAGRIFGSAEAGVDYV
KQLAYENANPACQAAIRPYRKKTDLTGYIRLCSDIGPSYQQGLAMAAAFSGQTVKDFLNNKNKEKGGCCFKCGKKGHFAK
NCHEHAHNNAEPKVPGLCPRCKRGKHWANECKSKTDNQGNPIPPHQGNGWRGQPQAPKQAYGAVSFVPANKNNPFQSLPE
PPQEVQDWTSVPPPTQY
>P03334 ~~~gag~~~Gag polyprotein~~~
MGQTVTTPLSLTLDHWKDVERLAHNQSVDVKKRRWVTFCSAEWPTFNVGWPRDGTFNRDLITQVKIKVFSPGPHGHPDQV
PYIVTWEALAFDPPPWVKPFVHPKPPPPLLPSAPSLPLEPPLSTPPQSSLYPALTPSLGAKPKPQVLSDSGGPLIDLLTE
DPPPYRDPRPPPSDRDGDSGEATPAGEAPDPSPMASRLRGRREPPVADSTTSQAFPLRTGGNGQLQYWPFSSSDLYNWKN
NNPSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGDDGRPTQLPNEVDAAFPLERPDWE
YTTQAGRNHLVHYRQLLIAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQ
SAPDIGRKLERLEDLRNKTLGDLVREAERIFNKRETPEEREERIRREREEKEERRRTEDEQKEKERDRRRHREMSRLLAT
VVSGQRQDRQEGERRRSQLDCDQCTYCEEQGHWAKDCPRRPRGPRGPRPQTSLLTLDD
>P03322 ~~~gag~~~Gag polyprotein~~~
MEAVIKVISSACKTYCGKTSPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMILGKSGELKTWGLVLGA
LKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTVQRDAKMAPEETATPKTVGTSCYHCGTA
IGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPPGAEQSRAEPGHAGQAPGPALTDWARVREELASTGPPVVAMP
VVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVI
AAATRDPRHPANGQGRGERTNLNRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADIMQGPS
ESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRTAPSTLTTPGEIIKYVLDRQKTAPLTDQGIAAAMS
SAIQPLIMAVVNRERDGQTGSGGRARGLCYTCGSPGHYQAQCPKKRKSGNSRERCQLCNGMGHNAKQCRKRDGNQGQRPG
KGLSSGPWPGPEPPAVSLAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVMEAANPQ
IHGIGGGIPMRKSRDMIELGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNL
>O92954 ~~~gag~~~Gag polyprotein~~~
MEAVIKVISSACKTYCGKTSPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMVLGKSGELKTWGLVLGA
LKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTVQRDAKMAPEETATPKTVGTSCYHCGTA
IGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPRGAEQPRAEPGHAGLAPGPALTDWARIREELASTGPPVVAMP
VVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVI
AAATRDPRHPANGQGRGERTNLDRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADITQGPS
ESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRAAPSTLTTPGEIIKYVLDRQKIAPLTDQGIAAAMS
SAIQPLVMAVVNRERDGQTGSGGRARRLCYTCGSPGHYQAQCPKKRKSGNSRERCQLCDGMGHNAKQCRRRDSNQGQRPG
RGLSSGPWPVSEQPAVSLAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVVDTANPQ
IHGIGGGIPMRKSRDMIELGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNL
>P32503 ~~~gag~~~Major capsid protein~~~
MLRFVTKNSQDKSSDLFSICSDRGTFVAHNRVRTDFKFDNLVFNRVYGVSQKFTLVGNPTVCFNEGSSYLEGIAKKYLTL
DGGLAIDNVLNELRSTCGIPGNAVASHAYNITSWRWYDNHVALLMNMLRAYHLQVLTEQGQYSAGDIPMYHDGHVKIKLP
VTIDDTAGPTQFAWPSDRSTDSYPDWAQFSESFPSIDVPYLDVRPLTVTEVNFVLMMMSKWHRRTNLAIDYEAPQLADKF
AYRHALTVQDADEWIEGDRTDDQFRPPSSKVMLSALRKYVNHNRLYNQFYTAAQLLAQIMMKPVPNCAEGYAWLMHDALV
NIPKFGSIRGRYPFLLSGDAALIQATALEDWSAIMAKPELVFTYAMQVSVALNTGLYLRRVKKTGFGTTIDDSYEDGAFL
QPETFVQAALACCTGQDAPLNGMSDVYVTYPDLLEFDAVTQVPITVIEPAGYNIVDDHLVVVGVPVACSPYMIFPVAAFD
TANPYCGNFVIKAANKYLRKGAVYDKLEAWKLAWALRVAGYDTHFKVYGDTHGLTKFYADNGDTWTHIPEFVTDGDVMEV
FVTAIERRARHFVELPRLNSPAFFRSVEVSTTIYDTHVQAGAHAVYHASRINLDYVKPVSTGIQVINAGELKNYWGSVRR
TQQGLGVVGLTMPAVMPTGEPTAGAAHEELIEQADNVLVE
>Q87026 ~~~gag~~~Major capsid protein~~~
MSSLLNSLLPEYFKPKTNLNINSSRVQYGFNARIDMQYEDDSGTRKGSRPNAFMSNTVAFIGNYEGIIVDDIPILDGLRA
DIFDTHGDLDMGLVEDALSKSTMIRRNVPTYTAYASELLYKRNLTSLFYNMLRLYYIKKWGSIKYEKDAIFYDNGHACLL
NRQLFPKSRDASLESSLSLPEAEIAMLDPGLEFPEEDVPAILWHGRVSSRATCILGQACSEFAPLAPFSIAHYSPQLTRK
LFVNAPAGIEPSSGRYTHEDVKDAITILVSANQAYTDFEAAYLMLAQTLVSPVPRTAEASAWFINAGMVNMPTLSCANGY
YPALTNVNPYHRLDTWKDTLNHWVAYPDMLFYHSVAMIESCYVELGNVARVSDSDAINKYTFTELSVQGRPVMNRGIIVD
LTLVAMRTGREISLPYPVSCGLTRTDALLQGTEIHVPVVVKDIDMPQYYNAIDKDVIEGQETVIKVKQLPPAMYPIYTYG
INTTEFYSDHFEDQVQVEMAPIDNGKAVFNDARKFSKFMSIMRMMGNDVTATDLVTGRKVSNWADNSSGRFLYTDVKYEG
QTAFLVDMDTVKARDHCWVSIVDPNGTMNLSYKMTNFRAAMFSRNKPLYMTGGSVRTIATGNYRDAAERLRAMDETLRLK
PFKITEKLDFRVAAYAIPSLSGSNMPSLHHQEQLQISEVDAEPINPIGEDELPPDIE
>Q02843 ~~~gag~~~Gag polyprotein~~~
MGGGHSALSGRSLDTFEKIRLRPNGKKKYQIKHLIWAGKEMERFGLHEKLLETKEGCQKIIEVLTPLEPTGSEGLKALFN
LCCVIWCIHAEQKVKDTEEAVVTVKQHYHLVDKNEKAAKKKNETTAPPGGESRNYPVVNQNNAWVHQPLSPRTLNAWVKC
VEEKRWGAEVVPMFQALSEGCLSYDVNQMLNVIGDHQGALQILKEVINEEAAEWDRTHRPPAGPLPAGQLRDPTGSDIAG
TTSSIQEQIEWTFNANPRIDVGAQYRKWVILGLQKVVQMYNPQKVLDIRQGPKEPFQDYVDRFYKALRAEQAPQDVKNWM
TQTLLIQNANPDCKLILKGLGMNPTLEEMLIACQGVGGPQHKAKLMVEMMSNGQNMVQVGPQKKGPRGPLKCFNCGKFGH
MQRECKAPRQIKCFKCGKIGHMAKDCKNGQANFLGYGHWGGAKPRNFVQYRGDTVGLEPTAPPMETAYDPAKKLLQQYAE
KGQRLREEREQTRKQKEKEVEDVSLSSLFGGDQ
>P05893 ~~~gag~~~Gag polyprotein~~~
MGARNAVLSGKKADELEKIRLRPGGKKKYMLKHVVWAANELDRFGLAESLLENKEGCQKILSVLAPLVPTGSENLKSLYN
TVCVIWCIHAEEKVKHTEEAKQIVQRHLVVETGTAETMPKTSRPTAPSSGRGGNYPVQQIGGNYVHLPLSPRTLNAWVKL
IEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIRDIINEEAADWDLQHPQPAPQQGQLREPSGSDIAGTT
SSVDEQIQWMYRQQNPIPVGNIYRRWIQLRLQKCVRMYNPINILDVKQRPKEPFQSYVDRFYKSLRAEQTDAAVKNWMTQ
TLLIQNANPDCKLVLKGLGVNPTLEEMLTACQGVGGPGQKARLMAEALKEALRPVPTPFAAAQQRGPRKPIKCWNCGKEG
HSARQCRAPRRQRCWKCGKMDHVMAKCPDRQAGFLGLGPWGKKPRNFPMAQVHQGLTPTAPPEDPAVDLLKNYMQLGKQQ
RESREKPYKEVTEDLLHLNSLFGGDQ
>P21411 ~~~gag~~~Gag polyprotein~~~
MGQASSHSENDLFISHLKESLKVRRIRVRKKDLVSFFSFIFKTCPWFPQEGSIDSRVWGRVGDCLNDYYRVFGPETIPIT
TFNYYNLIRDVLTNQSDSPDIQRLCKEGHKILISHSRPPSRQAPVTITTSEKASSRPPSRAPSTCPSVAIDIGSHDTGQS
SLYPNLATLTDPPIQSPHSRAHTPPQHLPLLANSKTLHNSGSQDDQLNPADQADLEEAAAQYNNPDWPQLTNTPALPPFR
PPSYVSTAVPPVAVAAPVLHAPTSGVPGSPTAPNLPGVALAKPSGPIDETVSLLDGVKTLVTKLSDLALLPPAGVMAFPV
TRSQGQVSSNTTGRASPHPDTHTIPEEEEADSGESDSEDDEEESSEPTEPTYTHSYKRLNLKTIEKIKTAVANYGPTAPF
TVALVESLSERWLTPSDWFFLSRAALSGGDNILWKSEYEDISKQFAERTRVRPPPKDGPLKIPGASPYQNNDKQAQFPPG
LLTQIQSAGLKAWKRLPQKGAATTSLAKIRQGPDESYSDFVSRLQETADRLFGSGESESSFVKHLAYENANPACQSAIRP
FRQKELSTMSPLLWYCSAHAVGLAIGAALQNLAPAQLLEPRPAFAIIVTNPAIFQETAPKKIQPPTQLPTQPNAPQASLI
KNLGPTTKCPRCKKGFHWASECRSRLDINGQPIIKQGNLNRGQPQGPTTGMNSGASQFTPQYRQPTPALPVINHAATSQT
SGEQQRAVQDWTSVPPPTQY
>P04022 ~~~gag~~~Gag polyprotein~~~
MGQELSQHERYVEQLKQALKTRGVKVKYADLLKFFDFVKDTCPWFPQEGTIDIKRWRRVGDCFQDYYNTFGPEKVPVTAF
SYWNLIKELIDKKEVNPQVMAAVAQTEEILKTSSHTELTTKPSQNPDLDLISLDSDDEGAKGSSLKDKNLSCTKKPKRFP
VLLTAQTSADPEDPNPSEVDWDGLEDEAAKYHNPDWPPFLTRPPPYNKATPSAPTVMAVVNPKEELKEKIAQLEEQIKLE
ELHQALISKLQKLKTGNETVTSPETAGGFSRTPHWPGQHIPKGKCCASREKEEQTPKDIFPVTETVDGQGQAWRHHNGFD
FTVIKELKTAASQYGATAPYTLAIVESVADNWLTPTDWNTLVRAVLSGGDHLLWKSEFFENCRETAKRNQQAGNGWDFDM
LTGSGNYSSTDAQMQYDPGLFAQIQAAATKAWRKLPVKGDPGASLTGVKQGPDEPFADFVHRLITTAGRIFGSAEAGVDY
VKQLAYENANPACQAAIRPYRKKTDLTGYIRLCSDIGPSYQQGLAMAAAFSGQTVKDFLNNKNKEKGGCCFKCGRKGHFA
KNCHEHIHNNSETKAPGLCPRCKRGKHWANECKSKTDSQGNPLPPHQGNGLRGQPQAPKQAYGAVSFVPANKNNPFQSLP
EPPQEVQDWTSVPPPTQY
>P35955 ~~~gag~~~Gag polyprotein~~~
MAKQGSKEKKGYPELKEVIKATCKIRVGPGKETLTEGNCLWALKTIDFIFEDLKTEPWTITKMYTVWDRLKGLTPEETSK
REFASLQATLACIMCSQMGMKPETVQAAKGIISMKEGLHENKEAKGEKVEQLYPNLEKHREVYPIVNLQAGGRSWKAVES
VVFQQLQTVAMQHGLVSEDFERQLAYYATTWTSKDILEVLAMMPGNRAQKELIQGKLNEEAERWVRQNPPGPNVLTVDQI
MGVGQTNQQASQANMDQARQICLQWVITALRSVRHMSHRPGNPMLVKQKNTESYEDFIARLLEAIDAEPVTDPIKTYLKV
TLSYTNASTDCQKQMDRTLGTRVQQATVEEKMQACRDVGSEGFKMQLLAQALRPQGKAGQKGVNQKCYNCGKPGHLARQC
RQGIICHHCGKRGHMQKDCRQKKQQGNNRRGPRVVPSAPPML
>Q88937 ~~~gag~~~Gag polyprotein~~~
MGNSSSTPPPSALKNSDLFKTMLRTQYSGSVKTRRINQDIKKQYPLWPDQGTCATKHWEQAVLIPLDSVSEETAKVLNFL
RVKIQARKGETARQMTAHTIKKLIVGTIDKNKQQTEILQKTDESDEEMDTTNTMLFIARNKRERIAQQQQADLAAQQQVL
LLQREQQREQREKDIKKRDEKKKKLLPDTTQKVEQTDIGEASSSDASAQKPISTDNNPDLKVDGVLTRSQHTTVPSNITI
KKDGTSVQYQHPIRNYPTGEGNLTAQVRNPFRPLELQQLRKDCPALPEGIPQLAEWLTQTMAIYNCDEADVEQLARVIFP
TPVRQIAGVINGHAAANTAAKIQNYVTACRQHYPAVCDWGTIQAFTYKPPQTAHEYVKHAEIIFKNNSGLEWQHATVPFI
NMVVQGLPPKVTRSLMSGNPDWSTKTIPQIIPLMQHYLNLQSRQDAKIKQTPLVLQLAMPAQTMNGNKGYVGSYPTNEPY
YSFQQQQRPAPRAPPGNVPSNTCFFCKQPGHWKADCPNKTRNLRNMGNMGRGGRMGGPPYRSQPYPAFIQPPQNHQNQYN
GRMDRSQLQASAQEWLPGTYPA
>Q64770 ~~~8~~~Protein GAM-1~~~
MARNPFRMFPGDLPYYMGTISFTSVVPVDPSQRNPTTSLREMVTTGLIFNPNLTGEQLREYSFSPLVSMGRKAIFADYEG
PQRIIHVTIRGRSAEPKTPSEALIMMEKAVRGAFAVPDWVAREYSDPLPHGITHVGDLGFPIGSVHALKMALDTLKIHVP
RGVGVPGYEGLCGTTTIKAPRQYRLLTTGVFTKKDLKRTLPEPFFSRFFNQTPEVCAIKTGKNPFSTEIWCMTLGGDSPA
PERNEPRNPHSLQDWARLGVMETCLRMSRRGLGSRHHPYHSL
>P06023 ~~~gam~~~Putative DNA ends protecting protein gam~~~
MAKPAKRIKSAAAAYVPQNRDAVITDIKRIGDLQREASRLETEMNDAIAEITEKFAARIAPIKTDIETLSKGVQGWCEAN
RDELTNGGKVKTANLVTGDVSWRVRPPSVSIRGMDAVMETLERLGLQRFIRTKQEINKEAILLEPKAVAGVAGITVKSGI
EDFSIIPFEQEAGI
>P03702 ~~~gam~~~Host-nuclease inhibitor protein gam~~~
MDINTETEIKQKHSLTPFPVFLISPAFRGRYFHSYFRSSAMNAYYIQDRLEAQSWARHYQQLAREEKEAELADDMEKGLP
QHLFESLCIDHLQRHGASKKSITRAFDDDVEFQERMAEHIRYMVETIAHHQVDIDSEV
>P03188 ~~~gB~~~Envelope glycoprotein B~~~
MTRRRVLSVVVLLAALACRLGAQTPEQPAPPATTVQPTATRQQTSFPFRVCELSSHGDLFRFSSDIQCPSFGTRENHTEG
LLMVFKDNIIPYSFKVRSYTKIVTNILIYNGWYADSVTNRHEEKFSVDSYETDQMDTIYQCYNAVKMTKDGLTRVYVDRD
GVNITVNLKPTGGLANGVRRYASQTELYDAPGWLIWTYRTRTTVNCLITDMMAKSNSPFDFFVTTTGQTVEMSPFYDGKN
KETFHERADSFHVRTNYKIVDYDNRGTNPQGERRAFLDKGTYTLSWKLENRTAYCPLQHWQTFDSTIATETGKSIHFVTD
EGTSSFVTNTTVGIELPDAFKCIEEQVNKTMHEKYEAVQDRYTKGQEAITYFITSGGLLLAWLPLTPRSLATVKNLTELT
TPTSSPPSSPSPPAPSAARGSTPAAVLRRRRRDAGNATTPVPPTAPGKSLGTLNNPATVQIQFAYDSLRRQINRMLGDLA
RAWCLEQKRQNMVLRELTKINPTTVMSSIYGKAVAAKRLGDVISVSQCVPVNQATVTLRKSMRVPGSETMCYSRPLVSFS
FINDTKTYEGQLGTDNEIFLTKKMTEVCQATSQYYFQSGNEIHVYNDYHHFKTIELDGIATLQTFISLNTSLIENIDFAS
LELYSRDEQRASNVFDLEGIFREYNFQAQNIAGLRKDLDNAVSNGRNQFVDGLGELMDSLGSVGQSITNLVSTVGGLFSS
LVSGFISFFKNPFGGMLILVLVAGVVILVISLTRRTRQMSQQPVQMLYPGIDELAQQHASGEGPGINPISKTELQAIMLA
LHEQNQEQKRAAQRAAGPSVASRALQAARDRFPGLRRRRYHDPETAAALLGEAETEF
>P0C762 ~~~gB~~~Envelope glycoprotein B~~~
MTRRRVLSVVVLLAALACRLGAQTPEQPAPPATTVQPTATRQQTSFPFRVCELSSHGDLFRFSSDIQCPSFGTRENHTEG
LLMVFKDNIIPYSFKVRSYTKIVTNILIYNGWYADSVTNRHEEKFSVDSYETDQMDTIYQCYNAVKMTKDGLTRVYVDRD
GVNITVNLKPTGGLANGVRRYASQTELYDAPGWLIWTYRTRTTVNCLITDMMAKSNSPFDFFVTTTGQTVEMSPFYDGKN
KETFHERADSFHVRTNYKIVDYDNRGTNPQGERRAFLDKGTYTLSWKLENRTAYCPLQHWQTFDSTIATETGKSIHFVTD
EGTSSFVTNTTVGIELPDAFKCIEEQVNKTMHEKYEAVQDRYTKGQEAITYFITSGGLLLAWLPLTPRSLATVKNLTELT
TPTSSPPSSPSPPAPPAARGSTSAAVLRRRRRDAGNATTPVPPAAPGKSLGTLNNPATVQIQFAYDSLRRQINRMLGDLA
RAWCLEQKRQNMVLRELTKINPTTVMSSIYGKAVAAKRLGDVISVSQCVPVNQATVTLRKSMRVPGSETMCYSRPLVSFS
FINDTKTYEGQLGTDNEIFLTKKMTEVCQATSQYYFQSGNEIHVYNDYHHFKTIELDGIATLQTFISLNTSLIENIDFAS
LELYSRDEQRASNVFDLEGIFREYNFQAQNIAGLRKDLDNAVSNGRNQFVDGLGELMDSLGSVGQSITNLVSTVGGLFSS
LVSGFISFFKNPFGGMLILVLVAGVVILVISLTRRTRQMSQQPVQMLYPGIDELAQQHASGEGPGINPISKTELQAIMLA
LHEQNQEQKRAAQRAAGPSVASRALQAARDRFPGLRRRRYHDPETAAALLGEAETEF
>P06473 ~~~gB~~~Envelope glycoprotein B~~~
MESRIWCLVVCVNLCIVCLGAAVSSSSTSHATSSTHNGSHTSRTTSAQTRSVYSQHVTSSEAVSHRANETIYNTTLKYGD
VVGVNTTKYPYRVCSMAQGTDLIRFERNIICTSMKPINEDLDEGIMVVYKRNIVAHTFKVRVYQKVLTFRRSYAYIYTTY
LLGSNTEYVAPPMWEIHHINKFAQCYSSYSRVIGGTVFVAYHRDSYENKTMQLIPDDYSNTHSTRYVTVKDQWHSRGSTW
LYRETCNLNCMLTITTARSKYPYHFFATSTGDVVYISPFYNGTNRNASYFGENADKFFIFPNYTIVSDFGRPNAAPETHR
LVAFLERADSVISWDIQDEKNVTCQLTFWEASERTIRSEAEDSYHFSSAKMTATFLSKKQEVNMSDSALDCVRDEAINKL
QQIFNTSYNQTYEKYGNVSVFETSGGLVVFWQGIKQKSLVELERLANRSSLNITHRTRRSTSDNNTTHLSSMESVHNLVY
AQLQFTYDTLRGYINRALAQIAEAWCVDQRRTLEVFKELSKINPSAILSAIYNKPIAARFMGDVLGLASCVTINQTSVKV
LRDMNVKESPGRCYSRPVVIFNFANSSYVQYGQLGEDNEILLGNHRTEECQLPSLKIFIAGNSAYEYVDYLFKRMIDLSS
ISTVDSMIALDIDPLENTDFRVLELYSQKELRSSNVFDLEEIMREFNSYKQRVKYVEDKVVDPLPPYLKGLDDLMSGLGA
AGKAVGVAIGAVGGAVASVVEGVATFLKNPFGAFTIILVAIAVVIITYLIYTRQRRLCTQPLQNLFPYLVSADGTTVTSG
STKDTSLQAPPSYEESVYNSGRKGPGPPSSDASTAAPPYTNEQAYQMLLALARLDAEQRAQQNGTDSLDGQTGTQDKGQK
PNLLDRLRHRKNGYRHLKDSDEEENV
>P13201 ~~~gB~~~Envelope glycoprotein B~~~
MESRIWCLVVCVNLCIVCLGAAVSSSSTRGTSATHSHHSSHTTSAAHSRSGSVSQRVTSSQTVSHGVNETIYNTTLKYGD
VVGVNTTKYPYRVCSMAQGTDLIRFERNIVCTSMKPINEDLDEGIMVVYKRNIVAHTFKVRVYQKVLTFRRSYAYIHTTY
LLGSNTEYVAPPMWEIHHINSHSQCYSSYSRVIAGTVFVAYHRDSYENKTMQLMPDDYSNTHSTRYVTVKDQWHSRGSTW
LYRETCNLNCMVTITTARSKYPYHFFATSTGDVVDISPFYNGTNRNASYFGENADKFFIFPNYTIVSDFGRPNSALETHR
LVAFLERADSVISWDIQDEKNVTCQLTFWEASERTIRSEAEDSYHFSSAKMTATFLSKKQEVNMSDSALDCVRDEAINKL
QQIFNTSYNQTYEKYGNVSVFETTGGLVVFWQGIKQKSLVELERLANRSSLNLTHNRTKRSTDGNNATHLSNMESVHNLV
YAQLQFTYDTLRGYINRALAQIAEAWCVDQRRTLEVFKELSKINPSAILSAIYNKPIAARFMGDVLGLASCVTINQTSVK
VLRDMNVKESPGRCYSRPVVIFNFANSSYVQYGQLGEDNEILLGNHRTEECQLPSLKIFIAGNSAYEYVDYLFKRMIDLS
SISTVDSMIALDIDPLENTDFRVLELYSQKELRSSNVFDLEEIMREFNSYKQRVKYVEDKVVDPLPPYLKGLDDLMSGLG
AAGKAVGVAIGAVGGAVASVVEGVATFLKNPFGAFTIILVAIAVVIIIYLIYTRQRRLCMQPLQNLFPYLVSADGTTVTS
GNTKDTSLQAPPSYEESVYNSGRKGPGPPSSDASTAAPPYTNEQAYQMLLALVRLDAEQRAQQNGTDSLDGQTGTQDKGQ
KPNLLDRLRHRKNGYRHLKDSDEEENV
>P10211 ~~~gB~~~Envelope glycoprotein B~~~
MRQGAPARGRRWFVVWALLGLTLGVLVASAAPSSPGTPGVAAATQAANGGPATPAPPAPGAPPTGDPKPKKNRKPKPPKP
PRPAGDNATVAAGHATLREHLRDIKAENTDANFYVCPPPTGATVVQFEQPRRCPTRPEGQNYTEGIAVVFKENIAPYKFK
ATMYYKDVTVSQVWFGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKYVRNNLETTAFHRDDHETDMELKPANAA
TRTSRGWHTTDLKYNPSRVEAFHRYGTTVNCIVEEVDARSVYPYDEFVLATGDFVYMSPFYGYREGSHTEHTSYAADRFK
QVDGFYARDLTTKARATAPTTRNLLTTPKFTVAWDWVPKRPSVCTMTKWQEVDEMLRSEYGGSFRFSSDAISTTFTTNLT
EYPLSRVDLGDCIGKDARDAMDRIFARRYNATHIKVGQPQYYLANGGFLIAYQPLLSNTLAELYVREHLREQSRKPPNPT
PPPPGASANASVERIKTTSSIEFARLQFTYNHIQRHVNDMLGRVAIAWCELQNHELTLWNEARKLNPNAIASATVGRRVS
ARMLGDVMAVSTCVPVAADNVIVQNSMRISSRPGACYSRPLVSFRYEDQGPLVEGQLGENNELRLTRDAIEPCTVGHRRY
FTFGGGYVYFEEYAYSHQLSRADITTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLDYTEVQRRNQLHDLRFADID
TVIHADANAAMFAGLGAFFEGMGDLGRAVGKVVMGIVGGVVSAVSGVSSFMSNPFGALAVGLLVLAGLAAAFFAFRYVMR
LQSNPMKALYPLTTKELKNPTNPDASGEGEEGGDFDEAKLAEAREMIRYMALVSAMERTEHKAKKKGTSALLSAKVTDMV
MRKRRNTNYTQVPNKDGDADEDDL
>P06436 ~~~gB~~~Envelope glycoprotein B~~~
MRQGAARGCRWFVVWALLGLTLGVLVASAAPSSPGTPGVAAATQAANGGPATPAPPAPGPAPTGDTKPKKNKKPKNPPPP
RPAGDNATVAAGHATLREHLRDIKAENTDANFYVCPPPTGATVVQFEQPRRCPTRPEGQNYTEGIAVVFKENIAPYKFKA
TMYYKDVTVSQVWFGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKYVRNNLETTAFHRDDHETDMELKPANAAT
RTSRGWHTTDLKYNPSRVEAFHRYGTTVNCIVEEVDARSVYPYDEFVLATGDFVYMSPFYGYREGSHTEHTSYAADRFKQ
VDGFYARDLTTKARATAPTTRNLLTTPKFTVAWDWVPKRPSVCTMTKWQEVDEMLRSEYGGSFRFSSDAISTTFTTNLTE
YPLSRVDLGDCIGKDARDAMDRIFARRYNATHIKVGQPQYYLANGGFLIAYQPLLSNTLAELYVREHLREQSRKPPNPTP
PPPGASANASVERIKTTSSIEFARLQFTYNHIQRHVNDMLGRVAIAWCELQNHELTLWNEARKLNPNAIASATVGRRVSA
RMLGDVMAVSTCVPVAADNVIVQNSMRISSRPGACYSRPLVSFRYEDQGPLVEGQLGENNELRLTRDAIEPCTVGHRRYF
TFGGGYVYFEEYAYSHQLSRADITTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLDYTEVQRRNQLHDLRFADIDT
VIHADANAAMFAGLGAFFEGMGDLGRAVGKVVMGIVGGVVSAVSGVSSFMSNPFGALAVGLLVLAGLAAAFFAFRYVMRL
QSNPMKALYPLTTKELKNPTNPDASGEGEEGGDFDEAKLAEAREMIRYMALVSAMERTEHKAKKKGTSALLSAKVTDMVM
RKRRNTNYTQVPNKDGDADEDDL
>P06437 ~~~gB~~~Envelope glycoprotein B~~~
MHQGAPSWGRRWFVVWALLGLTLGVLVASAAPTSPGTPGVAAATQAANGGPATPAPPPLGAAPTGDPKPKKNKKPKNPTP
PRPAGDNATVAAGHATLREHLRDIKAENTDANFYVCPPPTGATVVQFEQPRRCPTRPEGQNYTEGIAVVFKENIAPYKFK
ATMYYKDVTVSQVWFGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKYVRNNLETTAFHRDDHETDMELKPANAA
TRTSRGWHTTDLKYNPSRVEAFHRYGTTVNCIVEEVDARSVYPYDEFVLATGDFVYMSPFYGYREGSHTEHTTYAADRFK
QVDGFYARDLTTKARATAPTTRNLLTTPKFTVAWDWVPKRPSVCTMTKWQEVDEMLRSEYGGSFRFSSDAISTTFTTNLT
EYPLSRVDLGDCIGKDARDAMDRIFARRYNATHIKVGQPQYYQANGGFLIAYQPLLSNTLAELYVREHLREQSRKPPNPT
PPPPGASANASVERIKTTSSIEFARLQFTYNHIQRHVNDMLGRVAIAWCELQNHELTLWNEARKLNPNAIASVTVGRRVS
ARMLGDVMAVSTCVPVAADNVIVQNSMRISSRPGACYSRPLVSFRYEDQGPLVEGQLGENNELRLTRDAIEPCTVGHRRY
FTFGGGYVYFEEYAYSHQLSRADITTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLDYTEVQRRNQLHDLRFADID
TVIHADANAAMFAGLGAFFEGMGDLGRAVGKVVMGIVGGVVSAVSGVSSFMSNPFGALAVGLLVLAGLAAAFFAFRYVMR
LQSNPMKALYPLTTKELKNPTNPDASGEGEEGGDFDEAKLAEAREMIRYMALVSAMERTEHKAKKKGTSALLSAKVTDMV
MRKRRNTNYTQVPNKDGDADEDDL
>P06763 ~~~gB~~~Envelope glycoprotein B~~~
MRGGGLICALVVGALVAAVASAAPAAPAAPRASGGVAATVAANGGPASRPPPVPSPATTKARKRKTKKPPKRPEATPPPD
ANATVAAGHATLRAHLREIKVENADAQFYVCPPPTGATVVQFEQPRRCPTRPEGQNYTEGIAVVFKENIAPYKFKATMYY
KDVTVSQVWFGHRYSQFMGIFEDRAPVPFEEVIDKINAKGVCRSTAKYVRNNMETTAFHRDDHETDMELKPAKVATRTSR
GWHTTDLKYNPSRVEAFHRYGTTVNCIVEEVDARSVYPYDEFVLATGDFVYMSPFYGYREGSHTEHTSYAADRFKQVDGF
YARDLTTKARATSPTTRNLLTTPKFTVAWDWVPKRPAVCTMTKWQEVDEMLRAEYGGSFRFSSDAISTTFTTNLTQYSLS
RVDLGDCIGRDAREAIDRMFARKYNATHIKVGQPQYYLATGGFLIAYQPLLSNTLAELYVREYMREQDRKPRNATPAPLR
EAPSANASVERIKTTSSIEFARLQFTYNHIQRHVNDMLGRIAVAWCELQNHELTLWNEARKLNPNAIASATVGRRVSARM
LGDVMAVSTCVPVAPDNVIVQNSMRVSSRPGTCYSRPLVSFRYEDQGPLIEGQLGENNELRLTRDALEPCTVGHRRYFIF
GGGYVYFEEYAYSHQLSRADVTTVSTFIDLNITMLEDHEFVPLEVYTRHEIKDSGLLDYTEVQRRNQLHDLRFADIDTVI
RADANAAMFAGLCAFFEGMGDLGRAVGKVVMGVVGGVVSAVSGVSSFMSNPFGALAVGLLVLAGLVAAFFAFRYVLQLQR
NPMKALYPLTTKELKTSDPGGVGGEGEEGAEGGGFDEAKLAEAREMIRYMALVSAMERTEHKARKKGTSALLSSKVTNMV
LRKRNKARYSPLHNEDEAGDEDEL
>F5HB81 ~~~gB~~~Envelope glycoprotein B~~~
MTPRSRLATLGTVILLVCFCAGAAHSRGDTFQTSSSPTPPGSSSKAPTKPGEEASGPKSVDFYQFRVCSASITGELFRFN
LEQTCPDTKDKYHQEGILLVYKKNIVPHIFKVRRYRKIATSVTVYRGLTESAITNKYELPRPVPLYEISHMDSTYQCFSS
MKVNVNGVENTFTDRDDVNTTVFLQPVEGLTDNIQRYFSQPVIYAEPGWFPGIYRVRTTVNCEIVDMIARSAEPYNYFVT
SLGDTVEVSPFCYNESSCSTTPSNKNGLSVQVVLNHTVVTYSDRGTSPTPQNRIFVETGAYTLSWASESKTTAVCPLALW
KTFPRSIQTTHEDSFHFVANEITATFTAPLTPVANFTDTYSCLTSDINTTLNASKAKLASTHVPNGTVQYFHTTGGLYLV
WQPMSAINLTHAQGDSGNPTSSPPPSASPMTTSASRRKRRSASTAAAGGGGSTDNLSYTQLQFAYDKLRDGINQVLEELS
RAWCREQVRDNLMWYELSKINPTSVMTAIYGRPVSAKFVGDAISVTECINVDQSSVNIHKSLRTNSKDVCYARPLVTFKF
LNSSNLFTGQLGARNEIILTNNQVETCKDTCEHYFITRNETLVYKDYAYLRTINTTDISTLNTFIALNLSFIQNIDFKAI
ELYSSAEKRLASSVFDLETMFREYNYYTHRLAGLREDLDNTIDMNKERFVRDLSEIVADLGGIGKTVVNVASSVVTLCGS
LVTGFINFIKHPLGGMLMIIIVIAIILIIFMLSRRTNTIAQAPVKMIYPDVDRRAPPSGGAPTREEIKNILLGMHQLQQE
ERQKADDLKKSTPSVFQRTANGLRQRLRGYKPLTQSLDISPETGE
>P09257 ~~~gB~~~Envelope glycoprotein B~~~
MSPCGYYSKWRNRDRPEYRRNLRFRRFFSSIHPNAAAGSGFNGPGVFITSVTGVWLCFLCIFSMFVTAVVSVSPSSFYES
LQVEPTQSEDITRSAHLGDGDEIREAIHKSQDAETKPTFYVCPPPTGSTIVRLEPTRTCPDYHLGKNFTEGIAVVYKENI
AAYKFKATVYYKDVIVSTAWAGSSYTQITNRYADRVPIPVSEITDTIDKFGKCSSKATYVRNNHKVEAFNEDKNPQDMPL
IASKYNSVGSKAWHTTNDTYMVAGTPGTYRTGTSVNCIIEEVEARSIFPYDSFGLSTGDIIYMSPFFGLRDGAYREHSNY
AMDRFHQFEGYRQRDLDTRALLEPAARNFLVTPHLTVGWNWKPKRTEVCSLVKWREVEDVVRDEYAHNFRFTMKTLSTTF
ISETNEFNLNQIHLSQCVKEEARAIINRIYTTRYNSSHVRTGDIQTYLARGGFVVVFQPLLSNSLARLYLQELVRENTNH
SPQKHPTRNTRSRRSVPVELRANRTITTTSSVEFAMLQFTYDHIQEHVNEMLARISSSWCQLQNRERALWSGLFPINPSA
LASTILDQRVKARILGDVISVSNCPELGSDTRIILQNSMRVSGSTTRCYSRPLISIVSLNGSGTVEGQLGTDNELIMSRD
LLEPCVANHKRYFLFGHHYVYYEDYRYVREIAVHDVGMISTYVDLNLTLLKDREFMPLQVYTRDELRDTGLLDYSEIQRR
NQMHSLRFYDIDKVVQYDSGTAIMQGMAQFFQGLGTAGQAVGHVVLGATGALLSTVHGFTTFLSNPFGALAVGLLVLAGL
VAAFFAYRYVLKLKTSPMKALYPLTTKGLKQLPEGMDPFAEKPNATDTPIEEIGDSQNTEPSVNSGFDPDKFREAQEMIK
YMTLVSAAERQESKARKKNKTSALLTSRLTGLALRNRRGYSRVRTENVTGV
>Q4JR05 ~~~gB~~~Envelope glycoprotein B~~~
MSPCGYYSKWRNRDRPEYRRNLRFRRFFSSIHPNAAAGSGFNGPGVFITSVTGVWLCFLCIFSMFVTAVVSVSPSSFYES
LQVEPTQSEDITRSAHLGDGDEIREAIHKSQDAETKPTFYVCPPPTGSTIVRLEPPRTCPDYHLGKNFTEGIAVVYKENI
AAYKFKATVYYKDVIVSTAWAGSSYTQITNRYADRVPIPVSEITDTIDKFGKCSSKATYVRNNHKVEAFNEDKNPQDMPL
IASKYNSVGSKAWHTTNDTYMVAGTPGTYRTGTSVNCIIEEVEARSIFPYDSFGLSTGDIIYMSPFFGLRDGAYREHSNY
AMDRFHQFEGYRQRDLDTRALLEPAARNFLVTPHLTVGWNWKPKRTEVCSLVKWREVEDVVRDEYAHNFRFTMKTLSTTF
ISETNEFNLNQIHLSQCVKEEARAIINRIYTTRYNSSHVRTGDIQTYLARGGFVVVFQPLLSNSLARLYLQELVRENTNH
SPQKHPTRNTRSRRSVPVELRANRTITTTSSVEFAMLQFTYDHIQEHVNEMLARISSSWCQLQNRERALWSGLFPINPSA
LASTILDQRVKARILGDVISVSNCPELGSDTRIILQNSMRVSGSTTRCYSRPLISIVSLNGSGTVEGQLGTDNELIMSRD
LLEPCVANHKRYFLFGHHYVYYEDYRYVREIAVHDVGMISTYVDLNLTLLKDREFMPLQVYTRDELRDTGLLDYSEIQRR
NQMHSLRFYDIDKVVQYDSGTAIMQGMAQFFQGLGTAGQAVGHVVLGATGALLSTVHGFTTFLSNPFGALAVGLLVLAGL
VAAFFAYRYVLKLKTSPMKALYPLTTKGLKQLPEGMDPFAEKPNATDTPIEEIGDSQNTEPSVNSGFDPDKFREAQEMIK
YMTLVSAAERQESKARKKNKTSALLTSRLTGLALRNRRGYSRVRTENVTGV
>Q80RC7 2.4.1.102~~~~~~Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase~~~
MVGWKKKKLCRGHHLWVLGCYMLLAVVSLRLSLRFKCDVDSLDLESRDFQSQHCRDMLYNNLKLPAKRSINCSGITRGDQ
EAVVQALLDNLEVKKKRPPLTDTYYLNITRDCERFKAQRKFIQFPLSKEELDFPIAYSMVVHEKIENFERLLRAVYAPQN
IYCVHVDVKSPETFKEAVKAIISCFPNVFMASKLVPVVYASWSRVQADLNCMEDLLQSSVSWKYLLNTCGTDFPIKTNAE
MVLALKMLKGKNSMESEVPSESKKNRWKYHYEVTDTLYPTSKMKDPPPDNLPMFTGNAYFVASRAFVQHVLDNPKSQRLV
EWVKDTYSPDEHLWATLQRAPWMPGSVPSHPKYHISDMTAVARLVKWQYHEGDVSMGAPYAPCSGIHRRAICIYGAGDLY
WILQNHHLLANKFDPRVDDNVLQCLEEYLRHKAIYGTEL
>Q99CW3 2.4.1.102~~~~~~Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase~~~
MKMAGWKKKLCRGHHLWALGCYMLLAVVSLRLSLRFKCDVDSLDLESRDFQSQHCRDMLYNSLKLPAKRSINCSGITRGD
QEAVVQALLDNLEVKKKRSPLTGTYYLNITRDCERFKAQRKFIQFPLSKEELDFPIAYSMVVHEKIENFERLLRAVYAPQ
NIYCVHVDVKSPETFKEAVKAIISCFPNVFMASKLVPVVYASWSRVQADLNCMEDLLQSSVPWKYLLNTCGTDFPIKTNA
EMVLALKMLKGKNSMESEVPSESKKNRWKYRYEVTDTLYPTSKMKDPPPDNLPMFTGNAYFVASRAFVQHVLDNPKSQRL
VEWVKDTYSPDEHLWATLQRAPWMPGSVPSHPKYHISDMTAIARLVKWQYHEGDVSMGAPYAPCSGIHRRAICIYGAGDL
YWILQNHHLLANKFDPRVDDNVLQCLEEYLRHKAIYGTEL
>Q805R1 2.4.1.102~~~~~~Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase~~~
MKMAGWKKKLCPGHHLWALGCYMLLAVVSLRLSLRFKCDVDSLDLESRDFQSQHCRDMLYNSLKLPAKRSINCSGITRGD
QEAVVQALLDNLEVKKKRPPLTDTYYLNITRDCERFKAQRKFIQFPLSKEELDFPIAYSMVVHEKIENFERLLRAVYAPQ
NIYCVHVDVKSPETFKEAVKAIISCFPNVFMASKLVPVVYASWSRVQADLNCMEDLLQSSVSWKYLLNTCGTDFPIKTNA
EMVLALKMLKGKNSMESEVPSESKKNRWKYRYEVTDTLYPTSKMKDPPPDNLPMFTGNAYFVASRAFVQHVLDNPKSQIL
VEWVKDTYSPDEHLWATLQRAPWMPGSVPSHPKYHISDMTAIARLVKWQYHEGDVSMGAPYAPCSGIHRRAICIYGAGDL
YWILQNHHLLANKFDPRVDDNVLQCLEEYLRHKAIYGTEL
>Q9IZK2 2.4.1.102~~~~~~Beta-1,3-galactosyl-O-glycosyl-glycoprotein beta-1,6-N-acetylglucosaminyltransferase~~~
MKMAGWKKKLCPGHHLWALGCYMLLAVVSLRLSLRFKCDVDSLDLESRDFQSQHCRDMLYNSLKLPAKRSINCSGITRGD
QEAVVQALLDNLEVKKKRPPLTDTYYLNITRDCERFKAQRKFIQFPLSKEELDFPIAYSMVVHEKIENFERLLRAVYAPQ
NIYCVHVDVKSPETFKEAVKAIISCFPNVFMASKLVPVVYASWSRVQADLNCMEDLLQSSVSWKYLLNTCGTDFPIKTNA
EMVLALKMLKGKNSMESEVPSESKKNRWKYRYEVTDTLYPTSKIKDPPPDNLPMFTGNAYFVASRAFVQHVLDNPKSQIL
VEWVKDTYSPDEHLWATLQRAPWMPGSVPSHPKYHISDMTAIARLVKWQYHEGDVSMGAPYAPCSGIHRRAICIYGAGDL
YWILQNHHLLANKFDPRVDDNVLQCLEEYLRHKAIYGTEL
>P14378 ~~~gC~~~Envelope glycoprotein C homolog~~~
MGPLGRAWLIAAIFAWALLSARRGLAEEAEASPSPPPSPCPTETESSAGTTGATPPTPNSPDATPEDSTPGATTPVGTPE
PPSVSEHDPPVTNSTPPPAPPEDGRPGGAGNASRDGRPSGGGRPRPPRPSKAPPKERKWMLCEREAVAASYAEPLYVHCG
VADNATGGARLELWFQRVGRFRSTRGDDEAVRNPFPRAPPVLLFVAQNGSIAYRSAELGDNYIFPSPADPRNLPLTVRSL
TAATEGVYTWRRDMGTKSQRKVVTVTTHRAPAVSVEPQPALEGAGYAAVCRAAEYYPPRSTRLHWFRNGYPVEARHARDV
FTVDDSGLFSRTSVLTLEDATPTAHPPNLRCDVSWFQSANMERRFYAAGTPAVYRPPELRVYFEGGEAVCEARCVPEGRV
SLRWTVRDGIAPSRTEQTGVCAERPGLVNLRGVRLLSTTDGPVDYTCTATGYPAPLPEFSATATYDASPGLIGSPVLVSV
VAVACGLGAVGLLLVAASCLRRKARVIQPGLTRARALGSAP
>P68325 ~~~gC~~~Envelope glycoprotein C~~~
MWLPNLVRFVAVAYLICAGAILTYASGASASSSQSTPATPTHTTPNLTTAHGAGSDNTTNANGTESTHSHETTITCTKSL
ISVPYYKSVDMNCTTSVGVNYSEYRLEIYLNQRTPFSGTPPGDEENYINHNATKDQTLLLFSTAERKKSRRGGQLGVIPD
RLPKRQLFNLPLHTEGGTKFPLTIKSVDWRTAGIYVWSLYAKNGTLVNSTSVTVSTYNAPLLDLSVHPSLKGENYRATCV
VASYFPHSSVKLRWYKNAREVDFTKYVTNASSVWVDGLITRISTVSIPVDPEEEYTPSLRCSIDWYRDEVSFARIAKAGT
PSVFVAPTVSVSVEDGDAVCTAKCVPSTGVFVSWSVNDHLPGVPSQDMTTGVCPSHSGLVNMQSRRPLSEENGEREYSCI
IEGYPDGLPMFSDTVVYDASPIVEDRPVLTSIIAVTCGAAALALVVLITAVCFYCSKPSQAPYKKSDF
>P22596 ~~~gC~~~Envelope glycoprotein C~~~
MGLVNIMRFITFAYIICGGFILTRTSGTSASASPATPTTNTGEGTSSPVTPTYTTSTDSNNSTATNNSTDVNGTEATPTP
SHPHSHENTITCTNSLISVPYYTSVTINCSTTVSVNHSEYRLEIHLNQRTPFSDTPPGDQENYVNHNATKDQTLLLFSTA
HSSAKSRRVGQLGVIPDRLPKRQLFNLPAHTNGGTNFPLNIKSIDWRTAGVYVWYLFAKNGSLINSTSVTVLTYNAPLMD
LSVHPSLKGENHRAVCVVASYFPHNSVKLRWYKNAKEVDFTKYVTNASSVWVDGLITRISTVSIPADPDEEYPPSLRCSI
EWYRDEVSFSRMAKAGTPSVFVAPTVSVNVEDGAAVCTAECVPSNGVFVSWVVNDHLPGVPSQDVTTGVCSSHPGLVNMR
SSRPLSEENGEREYNCIIEGYPDGLPMFSDSVVYDASPIVEDMPVLTGIIAVTCGAAALALVVLITAVCFYCSKPSQVPY
KKADF
>P10228 ~~~gC~~~Envelope glycoprotein C~~~
MAPGRVGLAVVLWSLLWLGAGVSGGSETASTGPTITAGAVTNASEAPTSGSPGSAASPEVTPTSTPNPNNVTQNKTTPTE
PASPPTTPKPTSTPKSPPTSTPDPKPKNNTTPAKSGRPTKPPGPVWCDRRDPLARYGSRVQIRCRFRNSTRMEFRLQIWR
YSMGPSPPIAPAPDLEEVLTNITAPPGGLLVYDSAPNLTDPHVLWAEGAGPGADPPLYSVTGPLPTQRLIIGEVTPATQG
MYYLAWGRMDSPHEYGTWVRVRMFRPPSLTLQPHAVMEGQPFKATCTAAAYYPRNPVEFVWFEDDHQVFNPGQIDTQTHE
HPDGFTTVSTVTSEAVGGQVPPRTFTCQMTWHRDSVTFSRRNATGLALVLPRPTITMEFGVRIVVCTAGCVPEGVTFAWF
LGDDPSPAAKSAVTAQESCDHPGLATVRSTLPISYDYSEYICRLTGYPAGIPVLEHHGSHQPPPRDPTERQVIEAIEWVG
IGIGVLAAGVLVVTAIVYVVRTSQSRQRHRR
>P28986 ~~~gC~~~Envelope glycoprotein C~~~
MAPGRVGLAVVLWGLLWLGAGVAGGSETASTGPTITAGAVTNASEAPTSGSPGSAASPEVTPTSTPNPNNVTQNKTTPTE
PASPPTTPKPTSTPKSPPTSTPDPKPKNNTTPAKSGRPTKPPGPVWCDRRDPLARYGSRVQIRCRFRNSTRMEFRLQIWR
YSMGPSPPIAPAPDLEEVLTNITAPPGGLLVYDSAPNLTDPHVLWAEGAGPGADPPLYSVTGPLPTQRLIIGEVTPATQG
MYYLAWGRMDSPHEYGTWVRVRMFRPPSLTLQPHAVMEGQPFKATCTAAAYYPRNPVEFDWFEDDRQVFNPGQIDTQTHE
HPDGFTTVSTVTSEAVGGQVPPRTFTCQMTWHRDSVTFSRRNATGLALVLPRPTITMEFGVRHVVCTAGCVPEGVTFAWF
LGDDPSPAAKSAVTAQESCDHPGLATVRSTLPISYDYSEYICRLTGYPAGIPVLEHHGSHQPPPRDPTERQVIEAIEWVG
IGIGVLAAGVLVVTAIVYVVRTSQSRQRHRR
>Q89730 ~~~gC~~~Envelope glycoprotein C~~~
MALGRVGLAVGLWGLLWVGVVVVLANASPGRTITVGPRGNASNAAPSASPRNASAPRTTPTPPQPRKATKSKASTAKPAP
PPKTGPPKTSSEPVRCNRHDPLARYGSRVQIRCRFPNSTRTEFRLQIWRYATATDAEIGTAPSLEEVMVNVSAPPGGQLV
YDSAPNRTDPHVIWAEGAGPGASPRLYSVVGPLGRQRLIIEELTLETQGMYYWVWGRTDRPSAYGTWVRVRVFRPPSLTI
HPHAVLEGQPFKATCTAATYYPGNRAEFVWFEDGRRVFDPAQIHTQTQENPDGFSTVSTVTSAAVGGQGPPRTFTCQLTW
HRDSVSFSRRNASGTASVLPRPTITMEFTGDHAVCTAGCVPEGVTFAWFLGDDSSPAEKVAVASQTSCGRPGTATIRSTL
PVSYEQTEYICRLAGYPDGIPVLEHHGSHQPPPRDPTERQVIRAVEGAGIGVAVLVAVVLAGTAVVYLTHASSVRYRRLR
>P13374 ~~~gC~~~Envelope glycoprotein C homolog~~~
MVSNMRSTRTALTGWVGIFLVLSLQQTSCAGLPHNVDTHHILTFNPSPISADGVPLSEVPNSPTTELSTTVATKTAVPTT
ESTSSSEAHRNSSHKIPDIICDREEVFVFLNNTGRILCDLIVDPPSDDEWSNFALDVTFNPIEYHANEKNVEVARVAGLY
GVPGSDYAYPRKSELISSIRRDPQGSFWTSPTPRGNKYFIWINKTMHTMGVEVRNVDYKDNGYFQVILRDRFNRPLVEKH
IYMRVCQRPASVDVLAPPVLSGENYKASCIVRHFYPPGSVYVSWRRNGNIATPRKDRDGSFWWFESGRGATLVSTITLGN
SGLESPPKVSCLVAWRQGDMISTSNATAVPTVYYHPRISLAFKDGSLQDHRSL
>P06024 ~~~gC~~~Envelope glycoprotein C homolog~~~
MASLARAMLALLALYAAAIAAAPSTTTALDTTPNGGGGGNSSEGELSPSPPPTPAPASPEAGAVSTPPVPPPSVSRRKPP
RNNNRTRVHGDKATAHGRKRIVCRERLFSARVGDAVSFGCAVFPRAGETFEVRFYRRGRFRSPDADPEYFDEPPRPELPR
ERLLFSSANASLAHADALAPVVVEGERATVANVSGEVSVRVAAADAETEGVYTWRVLSANGTEVRSANVSLLLYSQPEFG
LSAPPVLFGEPFRAVCVVRDYYPRRSVRLRWFADEHPVDAAFVTNSTVADELGRRTRVSVVNVTRADVPGLAAADAADAL
APSLRCEAVWYRDSVASQRFSEALRPHVYHPAAVSVRFVEGFAVCDGLCVPPEARLAWSDHAADTVYHLGACAEHPGLLN
VRSARPLSDLDGPVDYTCRLEGLPSQLPVFEDTQRYDASPASVSWPVVSSMIVVIAGIGILAIVLVIMATCVYYRQAGP
>P0CK29 ~~~gD~~~Envelope glycoprotein D~~~
MQGPTLAVLGALLAVAVSLPTPAPRVTVYVDPPAYPMPRYNYTERWHTTGPIPSPFADGREQPVEVRYATSAAACDMLAL
IADPQVGRTLWEAVRRHARAYNATVIWYKIESGCARPLYYMEYTECEPRKHFGYCRYRTPPFWDSFLAGFAYPTDDELGL
IMAAPARLVEGQYRRALYIDGTVAYTDFMVSLPAGDCWFSKLGAARGYTFGACFPARDYEQKKVLRLTYLTQYYPQEAHK
AIVDYWFMRHGGVVPPYFEESKGYEPPPAADGGSPAPPGDDEAREDEGETEDGAAGREGNGGPPGPEGDGESQTPEANGG
AEGEPKPGPSPDADRPEGWPSLEAITHPPPAPATPAAPDAVPVSVGIGIAAAAIACVAAAAAGAYFVYTRRRGAGPLPRK
PKKLPAFGNVNYSALPG
>Q69091 ~~~gD~~~Envelope glycoprotein D~~~
MGGAAARLGAVILFVVIVGLHGVRSKYALVDASLKMADPNRFRGKDLPVLDQLTDPPGVRRVYHIQAGLPDPFQPPSLPI
TVYYAVLERACRSVLLNAPSEAPQIVRGASEDVRKQPYNLTIAWFRMGGNCAIPITVMEYTECSYNKSLGACPIRTQPRW
NYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRAKGSCKYALPLRIPPSACLSPQAYQQGVTVD
SIGMLPRFIPENQRTVAVYSLKIAGWHGPKAPYTSTLLPPELSETPNATQPELAPEDPEDSALLEDPVGTVAPQIPPNWH
IPSIQDAATPYHPPATPNNMGLIAGAVGGSLLAALVICGIVYWMRRHTQKAPKRIRLPHIREDDQPSSHQPLFY
>Q05059 ~~~gD~~~Envelope glycoprotein D~~~
MGGAAARLGAVILFVVIVGLHGVRGKYALADASLKMADPNRFRGKDLPVLDQLTDPPGVRRVYHIQAGLPDPFQPPSLPI
TVYYAVLERACRSVLLNAPSEAPQIVRGASEDVRKQPYNLTIAWFRMGGNCAIPITVMEYTECSYNKSLGACPIRTQPRW
NYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRAKGSCKYALPLRIPPSACLSPQAYQQGVTVD
SIGMLPRFIPENQRTVAVYSLKIAGWHGPKAPYTSTLLPPELSETPNATQPELAPEDPEDSALLEDPVGTVAPQIPPNWH
IPSIQDAATPYHPPATPNNMGLIAGAVGGSLLAALVICGIVYWMRRRTQKAPKRIRLPHIREDDQPSSHQPLFY
>A1Z0Q5 ~~~gD~~~Envelope glycoprotein D~~~
MGGAAARLGAVILFVVIVGLHGVRGKYALADASLKMADPNRFRGKDLPVLDQLTDPPGVRRVYHIQAGLPDPFQPPSLPI
TVYYAVLERACRSVLLNAPSEAPQIVRGASEDVRKQPYNLTIAWFRMGGNCAIPITVMEYTECSYNKSLGACPIRTQPRW
NYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRAKGSCKYALPLRIPPSACLSPQAYQQGVTVD
SIGMLPRFIPENQRTVAVYSLKIAGWHGPKAPYTSTLLPPELSETPNATQPELAPEDPEDSALLEDPVGTVAPQIPPNWH
IPSIQDAATPYHPPATPNNMGLIAGAVGGSLLAALVICGIVYWMHRRTRKAPKRIRLPHIREDDQPSSHQPLFY
>P57083 ~~~gD~~~Envelope glycoprotein D~~~
MGGTAARLGAVILFVVIVGLHGVRGKYALADASLKMADPNRFRGKDLPVLDQLTDPPGVRRVYHIQAGLPDPFQPPSLPI
TVYYAVLERACRSVLLNAPSEAPQIVRGASEDVRKQPYNLTIAWFRMGGNCAIPITVMEYTECSYNKSLGACPIRTQPRW
NYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRAKGSCKYALPLRIPPSACLSPQAYQQGVTVD
SIGMLPRFIPENQRTVAVYSLKIAGWHGPKAPYTSTLLPPELSETPNATQPELAPEDPEDSALLEDPVGTVAPQIPPNWH
IPSIQDAATPYHPPATPNNMGLIAGAVGGSLLAALVICGIVYWMHRRTRKAPKRIRLPHIREDDQPSSHQPLFY
>P03172 ~~~gD~~~Envelope glycoprotein D~~~
MGRLTSGVGTAALLVVAVGLRVVCAKYALADPSLKMADPNRFRGKNLPVLDRLTDPPGVKRVYHIQPSLEDPFQPPSIPI
TVYYAVLERACRSVLLHAPSEAPQIVRGASDEARKHTYNLTIAWYRMGDNCAIPITVMEYTECPYNKSLGVCPIRTQPRW
SYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRARASCKYALPLRIPPAACLTSKAYQQGVTVD
SIGMLPRFIPENQRTVALYSLKIAGWHGPKPPYTSTLLPPELSDTTNATQPELVPEDPEDSALLEDPAGTVSSQIPPNWH
IPSIQDVAPHHAPAAPSNPGLIIGALAGSTLAVLVIGGIAFWVRRRAQMAPKRLRLPHIRDDDAPPSHQPLFY
>Q69467 ~~~gD~~~Envelope glycoprotein D~~~
MGRLTSGVGTAALLVVAVGLRVVCAKYALADPSLKMADPNRFRGKNLPVLDQLTDPPGVKRVYHIQPSLEDPFQPPSIPI
TVYYAVLERACRSVLLHAPSEAPQIVRGASDEARKHTYNLTIAWYRMGDNCAIPITVMEYTECPYNKSLGVCPIRTQPRW
SYYDSFSAVSEDNLGFLMHAPAFETAGTYLRLVKINDWTEITQFILEHRARASCKYALPLRIPPAACLTSKAYQQGVTVD
SIGMLPRFIPENQRTVALYSLKIAGWHGPKPPYTSTLLPPELSDTTNATQPELVPEDPEDSALLEDPAGTVSSQIPPNWH
IPSIQDVAPHHAPAAPSNPGLIIGALAGSTLAVLVIGGIAFWVRRRAQMAPKRLRLPHIRDDDAPPSHQPLFY
>Q38494 ~~~gemA~~~GemA protein~~~
MSRTSLIKLIHVARRELQLDDDTYRAFLMQKTGKISCRELTVTQLEQVLGAMKERGFKKQNKYPRRRFKGHVTPREKVYK
IWQQMAEDGFITDGGDVALDKYVQRLTAKRNGGQGVSTLAWCHGDTLLTVLETLKQWHIRCIREAFSRHGLPLPVSPSGR
ELRGYDAMTAAYAHARKTRRMAQ
>P04488 ~~~gE~~~Envelope glycoprotein E~~~
MDRGAVVGFLLGVCVVSCLAGTPKTSWRRVSVGEDVSLLPAPGPTGRGPTQKLLWAVEPLDGCGPLHPSWVSLMPPKQVP
ETVVDAACMRAPVPLAMAYAPPAPSATGGLRTDFVWQERAAVVNRSLVIHGVRETDSGLYTLSVGDIKDPARQVASVVLV
VQPAPVPTPPPTPADYDEDDNDEGEDESLAGTPASGTPRLPPPPAPPRSWPSAPEVSHVRGVTVRMETPEAILFSPGETF
STNVSIHAIAHDDQTYSMDVVWLRFDVPTSCAEMRIYESCLYHPQLPECLSPADAPCAASTWTSRLAVRSYAGCSRTNPP
PRCSAEAHMEPVPGLAWQAASVNLEFRDASPQHSGLYLCVVYVNDHIHAWGHITISTAAQYRNAVVEQPLPQRGADLAEP
THPHVGAPPHAPPTHGALRLGAVMGAALLLSALGLSVWACMTCWRRRAWRAVKSRASGKGPTYIRVADSELYADWSSDSE
GERDQVPWLAPPERPDSPSTNGSGFEILSPTAPSVYPRSDGHQSRRQLTTFGSGRPDRRYSQASDSSVFW
>Q703F0 ~~~gE~~~Envelope glycoprotein E~~~
MDRGAVVGFLLGVCVVSCLAGTPKTSWRRVSVGEDVSLLPAPGPTGRGPTQKLLWAVEPLDGCGPLHPSWXSLMPPKQVP
ETVVDAACMRAPVPLAMAYAPPAPSATGGLRTDFVWQERAAVVNRSLVIYGVRETDSGLYTLSVGDIKDPARQVASVVLV
VQPAPVPTPPPTPADYDEDDNDEGEGEDESLAGTPASGTPRLPPPPAPPRSWPSAPEVSHVRGVTVRMETPEAILFSPGE
AFSTNVSIHAIAHDDQTYTMDVVWLRFDVPTSCAEMRIYESCLYHPQLPECLSPADAPCAASTWTSRLAVRSYAGCSRTN
PPPRCSAEAHMEPVPGLAWQAASVNLEFRDASPQHSGLYLCVVYVNDHIHAWGHITISTAAXYRNAVVEQPLPQRGADLA
EPTHPHVGAPPHAPPTHGALRLGAVMGAALLLSVLGLSVWACMTCWRRRAWRAVKSRASGKGPTYIRVADSELYADWSSD
SEGERDQVPWLAPPERPDSPSTNGSGFEILSPTAPSVYPRSDGHQSRRQLTTFGSGRPDRRYSQASDSSVFW
>P09259 ~~~gE~~~Envelope glycoprotein E~~~
MGTVNKPVVGVLMGFGIITGTLRITNPVRASVLRYDDFHTDEDKLDTNSVYEPYYHSDHAESSWVNRGESSRKAYDHNSP
YIWPRNDYDGFLENAHEHHGVYNQGRGIDSGERLMQPTQMSAQEDLGDDTGIHVIPTLNGDDRHKIVNVDQRQYGDVFKG
DLNPKPQGQRLIEVSVEENHPFTLRAPIQRIYGVRYTETWSFLPSLTCTGDAAPAIQHICLKHTTCFQDVVVDVDCAENT
KEDQLAEISYRFQGKKEADQPWIVVNTSTLFDELELDPPEIEPGVLKVLRTEKQYLGVYIWNMRGSDGTSTYATFLVTWK
GDEKTRNPTPAVTPQPRGAEFHMWNYHSHVFSVGDTFSLAMHLQYKIHEAPFDLLLEWLYVPIDPTCQPMRLYSTCLYHP
NAPQCLSHMNSGCTFTSPHLAQRVASTVYQNCEHADNYTAYCLGISHMEPSFGLILHDGGTTLKFVDTPESLSGLYVFVV
YFNGHVEAVAYTVVSTVDHFVNAIEERGFPPTAGQPPATTKPKEITPVNPGTSPLLRYAAWTGGLAAVVLLCLVIFLICT
AKRMRVKAYRVDKSPYNQSMYYAGLPVDDFEDSESTDTEEEFGNAIGGSHGGSSYTVYIDKTR
>Q9J3M8 ~~~gE~~~Envelope glycoprotein E~~~
MGTVNKPVVGVLMGFGIITGTLRITNPVRASVLRYDDFHIDEDKLDTNSVYEPYYHSDHAESSWVNRGESSRKAYDHNSP
YIWPRNDYDGFLENAHEHHGVYNQGRGIDSGERLMQPTQMSAQEDLGDDTGIHVIPTLNGDDRHKIVNVDQRQYGDVFKG
DLNPKPQGQRLIEVSVEENHPFTLRAPIQRIYGVRYTETWSFLPSLTCTGDAAPAIQHICLKHTTCFQDVVVDVDCAENT
KEDQLAEISYRFQGKKEADQPWIVVNTSTLFDELELDPPEIEPGVLKVLRTEKQYLGVYIWNMRGSDGTSTYATFLVTWK
GDEKTRNPTPAVTPQPRGAEFHMWNYHSHVFSVGDTFSLAMHLQYKIHEAPFDLLLEWLYVPIDPTCQPMRLYSTCLYHP
NAPQCLSHMNSGCTFTSPHLAQRVASTVYQNCEHADNYTAYCLGISHMEPSFGLILHDGGTTLKFVDTPESLSGLYVFVV
YFNGHVEAVAYTVVSTVDHFVNAIEERGFPPTAGQPPATTKPKEITPVNPGTSPLLRYAAWTGGLAAVVLLCLVIFLICT
AKRMRVKAYRVDKSPYNQSMYYAGLPVDDFEDSESTDTEEEFGNAIGGSHGGSSYTVYIDKTR
>P0DOG8 ~~~~~~Glyco-Gag protein~~~
LGDVSEASGARWVAQSVSPSPDRFGLFGAPPLSEGYVVLLGDERSKPSPPPSEFLLSVFRRNRAARLVCLSIVLSFVCSL
LFWTASKNMGQTVTTPLSLTLEHWEDVQRIASNQSVDVKKRRWVTFCSAEWPTFGVGWPQDGTFNLDIILQVKSKVFSPG
PHGHPDQVPYIVTWEAIAYEPPPWVKPFVSPKLSPSPTAPILPSGPSTQPPPRSALYPALTPSIKPRPSKPQVLSDNGGP
LIDLLSEDPPPYGGQGLSSSDGDGDREEATSTSEIPAPSPIVSRLRGKRDPPAADSTTSRAFPLRLGGNGQLQYWPFSSS
DLYNWKNNNPSFSEDPGKLTALIESVLTTHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLPNEVDAAFP
LERPDWDYTTQRGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNV
SMSFIWQSAPDIGRKLERLEDLKSKTLGDLVREAERIFNKRETPEEREERVRRETEEKEERRRAEEEQKEKERDRRRHRE
MSKLLATVVSGQRQDRQGGERRRPQLDKDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLDD
>P28967 ~~~gG~~~Envelope glycoprotein G~~~
MLTVLAALSLLSLLTSATGRLAPDELCYAEPRRTGSPPNTQPERPPVIFEPPTIAIKAESKGCELILLDPPIDVSYRRED
KVNASIAWFFDFGACRMPIAYREYYGCIGNAVPSPETCDAYSFTLIRTEGIVEFTIVNMSLLFQPGIYDSGNFIYSVLLD
YHIFTGRVTLEVEKDTNYPCGMIHGLTAYGNINVDETMDNASPHPRAVGCFPEPIDNEAWANVTFTELGIPDPNSFLDDE
GDYPNISDCHSWESYTYPNTLRQATGPQTLLVGAVGLRILAQAWKFVGDETYDTIRAEAKNLETHVPSSAAESSLENQST
QEESNSPEVAHLRSVNSDDSTHTGGASNGIQDCDSQLKTVYACLALIGLGTCAMIGLIVYICVLRSKLSSRNFSRAQNVK
HRNYQRLEYVA
>P81780 ~~~gG~~~Envelope glycoprotein G~~~
MHAIAPRLLLLFVLSGLPGTRGGSGVPGPINPPNNDVVFPGGSPVAQYCYAYPRLDDPGPLGSADAGRQDLPRRVVRHEP
LGRSFLTGGLVLLAPPVRGFGAPNATYAAHVTYYRLTRACRQPILLRQYGGCRGGEPPSPKTCGSYTYTYQGGGPPTRYA
LVNASLLVPIWDRAAETFEYQIELGGELHVGLLWVEVGGEGPGPTAPPQAARAEGGPCVPPVPAGRPWRSVPPVWYSAPN
PGFRGLRFRERCLPPQTPAAPSDLPRVAFAPQSLLVGITGRTFIRMARPTEDGVLPPHWAPGALDDGPYAPFPPRPRFRR
ALRTDPEGVDPDVRAPRTGRRLMALTEDASSDSPTSAPEKTPLPVSATAMAPSVDPSAEPTAPATTTPPDEMATQAATVA
VTPEETAVASPPATASVESSPPPAAAATPGAGHTNTSSASAAKTPPTTPAPTTPPPTSTHATPRPTTPGPQTTPPGPATP
GPVGASAAPTADSPLTALPPATAPGPSAANVSVAATTATPGTRGTARTPPTDPKTHPHGPADAPPGSPAPPPPEHRGGPE
EFEGAGDGEPPEDDDSATGLAFRTPNPNKPPPARPGPIRPTLPPGILGPLAPNTPRPPAQAPAKDMPSGPTPQHIPLFWF
LTASPALDILFIISTTIHTAAFVCLVALAAQLWRGRAGRRRYAHPSVRYVCLPPERD
>P03231 ~~~gH~~~Envelope glycoprotein H~~~
MQLLCVFCLVLLWEVGAASLSEVKLHLDIEGHASHYTIPWTELMAKVPGLSPEALWREANVTEDLASMLNRYKLIYKTSG
TLGIALAEPVDIPAVSEGSMQVDASKVHPGVISGLNSPACMLSAPLEKQLFYYIGTMLPNTRPHSYVFYQLRCHLSYVAL
SINGDKFQYTGAMTSKFLMGTYKRVTEKGDEHVLSLVFGKTKDLPDLRGPFSYPSLTSAQSGDYSLVIVTTFVHYANFHN
YFVPNLKDMFSRAVTMTAASYARYVLQKLVLLEMKGGCREPELDTETLTTMFEVSVAFFKVGHAVGETGNGCVDLRWLAK
SFFELTVLKDIIGICYGATVKGMQSYGLERLAAMLMATVKMEELGHLTTEKQEYALRLATVGYPKAGVYSGLIGGATSVL
LSAYNRHPLFQPLHTVMRETLFIGSHVVLRELRLNVTTQGPNLALYQLLSTALCSALEIGEVLRGLALGTESGLFSPCYL
SLRFDLTRDKLLSMAPQEATLDQAAVSNAVDGFLGRLSLEREDRDAWHLPAYKCVDRLDKVLMIIPLINVTFIISSDREV
RGSALYEASTTYLSSSLFLSPVIMNKCSQGAVAGEPRQIPKIQNFTRTQKSCIFCGFALLSYDEKEGLETTTYITSQEVQ
NSILSSNYFDFDNLHVHYLLLTTNGTVMEIAGLYEERAHVVLAIILYFIAFALGIFLVHKIVMFFL
>Q3KSQ3 ~~~gH~~~Envelope glycoprotein H~~~
MQLLCVFCLVLLWEVGAASLSEVKLHLDIEGHASHYTIPWTELMAKVPGLSPEALWREANVTEDLASMLNRYKLIYKTSG
TLGIALAEPVDIPAVSEGSMQVDASKVHPGVISGLNSPACMLSAPLEKQLFYYIGTMLPNTRPHSYVFYQLRCHLSYVAL
SINGDKFQYTGAMTSKFLMGTYKRVTEKGDEHVLSLIFGKTKDLPDLRGPFSYPSLTSAQSGDYSLVIVTTFVHYANFHN
YFVPNLKDMFSRAVTMTAASYARYVLQKLVLLEMKGGCREPELDTETLTTMFEVSVAFFKVGHAVGETGNGCVDLRWLAK
SFFELTVLKDIIGICYGATVKGMQSYGLERLAAMLMATVKMEELGHLTTEKQEYALRLATVGYPKAGVYSGLIGGATSVL
LSAYNRHPLFQPLHTVMRETLFIGSHVVLRELRLNVTTQGPNLALYQLLSTALCSALEIGEVLRGLALGTESGLFSPCYL
SLRFDLTRDKLLSMAPQEAMLDQAAVSNAVDGFLGRLSLEREDRDAWHLPAYKCVDRLDKVLMIIPLINVTFIISSDREV
RGSALYEASTTYLSSSLFLSPVIMNKCSQGAVAGEPRQIPKIQNFTRTQKSCIFCGFALLSYDEKEGLETTTYITSQEVQ
NSILSSNYFDFDNLHVHYLLLTTNGTVMEIAGLYEERAHVVLAIILYFIAFALGIFLVHKIVMFFL
>P12824 ~~~gH~~~Envelope glycoprotein H~~~
MRPGLPPYLTVFTVYLLSHLPSQRYGADAASEALDPHAFHLLLNTYGRPIRFLRENTTQCTYNSSLRNSTVVRENAISFN
FFQSYNQYYVFHMPRCLFAGPLAEQFLNQVDLTETLERYQQRLNTYALVSKDLASYRSFSQQLKAQDSLGQQPTTVPPPI
DLSIPHVWMPPQTTPHDWKGSHTTSGLHRPHFNQTCILFDGHDLLFSTVTPCLHQGFYLMDELRYVKITLTEDFFVVTVS
IDDDTPMLLIFGHLPRVLFKAPYQRDNFILRQTEKHELLVLVKKAQLNRHSYLKDSDFLDAALDFNYLDLSALLRNSFHR
YAVDVLKSGRCQMLDRRTVEMAFAYALALFAAARQEEAGTEISIPRALDRQAALLQIQEFMITCLSQTPPRTTLLLYPTA
VDLAKRALWTPDQITDITSLVRLVYILSKQNQQHLIPQWALRQIADFALQLHKTHLASFLSAFARQELYLMGSLVHSMLV
HTTERREIFIVETGLCSLAELSHFTQLLAHPHHEYLSDLYTPCSSSGRRDHSLERLTRLFPDATVPATVPAALSILSTMQ
PSTLETFPDLFCLPLGESFSALTVSEHVSYVVTNQYLIKGISYPVSTTVVGQSLIITQTDSQTKCELTRNMHTTHSITAA
LNISLENCAFCQSALLEYDDTQGVINIMYMHDSDDVLFALDPYNEVVVSSPRTHYLMLLKNGTVLEVTDVVVDATDSRLL
MMSVYALSAIIGIYLLYRMLKTC
>Q6SW67 ~~~gH~~~Envelope glycoprotein H~~~
MRPGLPSYLIILAVCLFSHLLSSRYGAEAVSEPLDKAFHLLLNTYGRPIRFLRENTTQCTYNSSLRNSTVVRENAISFNF
FQSYNQYYVFHMPRCLFAGPLAEQFLNQVDLTETLERYQQRLNTYALVSKDLASYRSFSQQLKAQDSLGEQPTTVPPPID
LSIPHVWMPPQTTPHGWTESHTTSGLHRPHFNQTCILFDGHDLLFSTVTPCLHQGFYLIDELRYVKITLTEDFFVVTVSI
DDDTPMLLIFGHLPRVLFKAPYQRDNFILRQTEKHELLVLVKKDQLNRHSYLKDPDFLDAALDFNYLDLSALLRNSFHRY
AVDVLKSGRCQMLDRRTVEMAFAYALALFAAARQEEAGAQVSVPRALDRQAALLQIQEFMITCLSQTPPRTTLLLYPTAV
DLAKRALWTPNQITDITSLVRLVYILSKQNQQHLIPQWALRQIADFALKLHKTHLASFLSAFARQELYLMGSLVHSMLVH
TTERREIFIVETGLCSLAELSHFTQLLAHPHHEYLSDLYTPCSSSGRRDHSLERLTRLFPDATVPATVPAALSILSTMQP
STLETFPDLFCLPLGESFSALTVSEHVSYIVTNQYLIKGISYPVSTTVVGQSLIITQTDSQTKCELTRNMHTTHSITVAL
NISLENCAFCQSALLEYDDTQGVINIMYMHDSDDVLFALDPYNEVVVSSPRTHYLMLLKNGTVLEVTDVVVDATDSRLLM
MSVYALSAIIGIYLLYRMLKTC
>P06477 ~~~gH~~~Envelope glycoprotein H~~~
MGNGLWFVGVIILGVAWGQVHDWTEQTDPWFLDGLGMDRMYWRDTNTGRLWLPNTPDPQKPPRGFLAPPDELNLTTASLP
LLRWYEERFCFVLVTTAEFPRDPGQLLYIPKTYLLGRPPNASLPAPTTVEPTAQPPPSVAPLKGLLHNPAASVLLRSRAW
VTFSAVPDPEALTFPRGDNVATASHPSGPRDTPPPRPPVGARRHPTTELDITHLHNASTTWLATRGLLRSPGRYVYFSPS
ASTWPVGIWTTGELVLGCDAALVRARYGREFMGLVISMHDSPPVEVMVVPAGQTLDRVGDPADENPPGALPGPPGGPRYR
VFVLGSLTRADNGSALDALRRVGGYPEEGTNYAQFLSRAYAEFFSGDAGAEQGPRPPLFWRLTGLLATSGFAFVNAAHAN
GAVCLSDLLGFLAHSRALAGLAARGAAGCAADSVFFNVSVLDPTARLQLEARLQHLVAEILEREQSLALHALGYQLAFVL
DSPSAYDAVAPSAAHLIDALYAEFLGGRVLTTPVVHRALFYASAVLRQPFLAGVPSAVQRERARRSLLIASALCTSDVAA
ATNADLRTALARADHQKTLFWLPDHFSPCAASLRFDLDESVFILDALAQATRSETPVEVLAQQTHGLASTLTRWAHYNAL
IRAFVPEASHRCGGQSANVEPRILVPITHNASYVVTHSPLPRGIGYKLTGVDVRRPLFLTYLTATCEGSTRDIESKRLVR
TQNQRDLGLVGAVFMRYTPAGEVMSVLLVDTDNTQQQIAAGPTEGAPSVFSSDVPSTALLLFPNGTVIHLLAFDTQPVAA
IAPGFLAASALGVVMITAALAGILKVLRTSVPFFWRRE
>P89445 ~~~gH~~~Envelope glycoprotein H~~~
MGPGLWVVMGVLVGVAGGHDTYWTEQIDPWFLHGLGLARTYWRDTNTGRLWLPNTPDASDPQRGRLAPPGELNLTTASVP
MLRWYAERFCFVLVTTAEFPRDPGQLLYIPKTYLLGRPRNASLPELPEAGPTSRPPAEVTQLKGLSHNPGASALLRSRAW
VTFAAAPDREGLTFPRGDDGATERHPDGRRNAPPPGPPAGTPRHPTTNLSIAHLHNASVTWLAARGLLRTPGRYVYLSPS
ASTWPVGVWTTGGLAFGCDAALVRARYGKGFMGLVISMRDSPPAEIIVVPADKTLARVGNPTDENAPAVLPGPPAGPRYR
VFVLGAPTPADNGSALDALRRVAGYPEESTNYAQYMSRAYAEFLGEDPGSGTDARPSLFWRLAGLLASSGFAFVNAAHAH
DAIRLSDLLGFLAHSRVLAGLAARGAAGCAADSVFLNVSVLDPAARLRLEARLGHLVAAILEREQSLVAHALGYQLAFVL
DSPAAYGAVAPSAARLIDALYAEFLGGRALTAPMVRRALFYATAVLRAPFLAGAPSAEQRERARRGLLITTALCTSDVAA
ATHADLRAALARTDHQKNLFWLPDHFSPCAASLRFDLAEGGFILDALAMATRSDIPADVMAQQTRGVASVLTRWAHYNAL
IRAFVPEATHQCSGPSHNAEPRILVPITHNASYVVTHTPLPRGIGYKLTGVDVRRPLFITYLTATCEGHAREIEPKRLVR
TENRRDLGLVGAVFLRYTPAGEVMSVLLVDTDATQQQLAQGPVAGTPNVFSSDVPSVALLLFPNGTVIHLLAFDTLPIAT
IAPGFLAASALGVVMITAALAGILRVVRTCVPFLWRRE
>P68324 ~~~gH~~~Envelope glycoprotein H~~~
MLLRLWVFVLLTPCYGWRPLNISNSSHCRNGNFENPIVRPGFITFNFYTKNDTRIYQVPKCLLGSDITYHLFDAINTTES
LTNYEKRVTRFYEPPMNDILRLSPVPSVKQFNLDRSIQPQVVYSLNMYPSQGIYYVRVVEVRQMQYDNVSCKLPNSLKEL
IFPVQVRCAKITRYVGEDIYTHFFTPDFMILYIQNPAGDLTMMYGNTTSINFKAPYKKSSFIFKQTLTDDLLLIVEKDVI
DVQYRFISDATFVDETLNDVDEVEALLLKFNNLGIQTLLRGDCKKPNYAGIPQMMFLYGIVHFSYSTKNTGPMPVLRVLK
THENLLSIDSFVNRCVNVSEGTLQYPKMKEFLKYEPSDYSYITKNKSISVSTLLTYLATAYESNVTISKYKWTDIANTLQ
NIYEKHMFFTNLTFSDRETLFMLAEIANIIPTDERMQRHMQLLIGNLCNPVEIVSWARMLTADRAPNLENIYSPCASPVR
RDVTNSFLKTVLTYASLDRYRSDMMEMLSVYRPPNMERVAAIQCLSPSEPAASLTLPNVTFVISPSYVIKGVSLTITTTI
VATSIIITAIPLNSTCVSTNYKYAGQDLLVLRNISSQTCEFCQSVVMEYDDIDGPLQYIYIKNIDELKTLTDPNNNLLVP
NTRTHYLLLAKNGSVFEMSEVGIDIDQVSIILVIIYILIAIIALFGLYRLIRLC
>P52353 ~~~gH~~~Envelope glycoprotein H~~~
MYFYINSLLLIVSINGWKHWNILNSSICVNEKTNQTIIQPGLITFNFHDYNETRVYQIPKCLFGYTFVSNLFDSVNFDES
FDQYKHRITRFFNPSTEKAVKIYAQKFQTNIKPVSHTKTITVSFLPLFYEKDVYFANVSEIRKLYYNQYICTLSNGLTDY
LFPITERCVMRHYNYLNTVFMLALTPSFFIISVETGMDDVVFIFGNVSRIFFKAPFRKSSFIYRQTVSDDLLLITKKTTI
ERFYPFLKIDFLDDIWKQNYDISFLIAKFNKLATVYIMEGFCGKPVNKDTFHLMFLFGLTHFLYSTRGDGLLPLLEILNT
HQSIITMGRFLEKCFKMTKSHLLYPEMEKLQNFQLVDYSYITSDLTIPISAKLAFLSLADGRIVTVPQNKWKEIENNIET
LYEKHKLFTNLTQPERANLFLLSEIGNSLVFQEKIKRKIHVLLASLCNPLEMYFWTHMLDNVMDIETMFSPCATATRKDL
TQRVVNNILSYKNLDAYTNKVMNTLSVYRKKRLDMFKSISCVSNEQAAFLTLPNITYTISSKYILAGTSFSVTSTVISTT
IIITVVPLNSTCTPTNYKYSVKNIKPIYNISSHDCVFCESLVVEYDDIDGIIQFVYIMDDKQLLKLIDPDTNFIDVNPRT
HYLLFLRNGSVFEITALDLKSSQVSIMLVLLYLIIIIIVLFGIYHVFRLF
>F5HAK9 ~~~gH~~~Envelope glycoprotein H~~~
MQGLAFLAALACWRCISLTCGATGALPTTATTITRSATQLINGRTNLSIELEFNGTSFFLNWQNLLNVITEPALTELWTS
AEVAEDLRVTLKKRQSLFFPNKTVVISGDGHRYTCEVPTSSQTYNITKGFNYSALPGHLGGFGINARLVLGDIFASKWSL
FARDTPEYRVFYPMNVMAVKFSISIGNNESGVALYGVVSEDFVVVTLHNRSKEANETASHLLFGLPDSLPSLKGHATYDE
LTFARNAKYALVAILPKDSYQTLLTENYTRIFLNMTESTPLEFTRTIQTRIVSIEARRACAAQEAAPDIFLVLFQMLVAH
FLVARGIAEHRFVEVDCVCRQYAELYFLRRISRLCMPTFTTVGYNHTTLGAVAATQIARVSATKLASLPRSSQETVLAMV
QLGARDGAVPSSILEGIAMVVEHMYTAYTYVYTLGDTERKLMLDIHTVLTDSCPPKDSGVSEKLLRTYLMFTSMCTNIEL
GEMIARFSKPDSLNIYRAFSPCFLGLRYDLHPAKLRAEAPQSSALTRTAVARGTSGFAELLHALHLDSLNLIPAINCSKI
TADKIIATVPLPHVTYIISSEALSNAVVYEVSEIFLKSAMFISAIKPDCSGFNFSQIDRHIPIVYNISTPRRGCPLCDSV
IMSYDESDGLQSLMYVTNERVQTNLFLDKSPFFDNNNLHIHYLWLRDNGTVVEIRGMYRRRAASALFLILSFIGFSGVIY
FLYRLFSILY
>P27416 ~~~gH~~~Envelope glycoprotein H~~~
MPASSVRLPLRLLTLAGLLALAGAAALARGAPQGGPPSPQGGPAPTAAPARGPTLFVLVGDGSAWFVFQLGGLGALNDTR
IRGHLLGRYLVSYQVVPPPVSAWYFVQRPRERPRLSGPPSGAELVAFDAPGVRRTYTTAAVWPAEVAVLADAEARCPAAV
FNVTLGEAFLGLRVALRSFLPLEVIISAERMRMIAPPALGSDLEPPGPPAGRFHVYTLGFLSDGAMHQTMRDVAAYVHES
DDYLAQLSAAHAAALAAVVQPGPYYFYRAAVRLGVAAFVFSEAARRDRRASAPALLRVESDARLLSRLLMRAAGCPAGFA
GLFDGRAERVPVAPADQLRAAWTFGEDPAPRLDLARATVAEAYRRSVRGKPFDQQALFFAVALLLRAGGPGDARETLLRT
TAMCTAERAAAAAELTRAALSPTAAWNEPFSLLDVLSPCAVSLRRDLGGDATLANLGAAARLALAPAGAPGAAAATDEGA
EEEEEDPVARAAPEIPAEALLALPLRGGASFVFTRRRPDCGPAYTLGGVDIANPLVLAIVSNDSAACDYTDRMPESQHLP
ATDNPSVCVYCDCVFVRYSSAGTILETVLIESKDMEEQLMAGANSTIPSFNPTLHGGDVKALMLFPNGTVVDLLSFTSTR
LAPVSPAYVVASVVGAAITVGILYALFKMLCSFSSEGYSRLINARS
>Q775J3 ~~~gH~~~Envelope glycoprotein H~~~
MFALVLAVVILPLWTTANKSYVTPTPATRSIGHMSALLREYSDRNMSLKLEAFYPTGFDEELIKSLHWGNDRKHVFLVIV
KVNPTTHEGDVGLVIFPKYLLSPYHFKAEHRAPFPAGRFGFLSHPVTPDVSFFDSSFAPYLTTQHLVAFTTFPPNPLVWH
LERAETAATAERPFGVSLLPARPTVPKNTILEHKAHFATWDALARHTFFSAEAIITNSTLRIHVPLFGSVWPIRYWATGS
VLLTSDSGRVEVNIGVGFMSSLISLSSGLPIELIVVPHTVKLNAVTSDTTWFQLNPPGPDPGPSYRVYLLGRGLDMNFSK
HATVDICAYPEESLDYRYHLSMAHTEALRMTTKADQHDINEESYYHIAARIATSIFALSEMGRTTEYFLLDEIVDVQYQL
KFLNYILMRIGAGAHPNTISGTSDLIFADPSQLHDELSLLFGQVKPANVDYFISYDEARDQLKTAYALSRGQDHVNALSL
ARRVIMSIYKGLLVKQNLNATERQALFFASMILLNFREGLENSSRVLDGRTTLLLMTSMCTAAHATQAALNIQEGLAYLN
PSKHMFTIPNVYSPCMGSLRTDLTEEIHVMNLLSAIPTRPGLNEVLHTQLDESEIFDAAFKTMMIFTTWTAKDLHILHTH
VPEVFTCQDAAARNGEYVLILPAVQGHSYVITRNKPQRGLVYSLADVDVYNPISVVYLSKDTCVSEHGVIETVALPHPDN
LKECLYCGSVFLRYLTTGAIMDIIIIDSKDTERQLAAMGNSTIPPFNPDMHGDDSKAVLLFPNGTVVTLLGFERRQAIRM
SGQYLGASLGGAFLAVVGFGIIGWMLCGNSRLREYNKIPLT
>Q38199 3.1.22.-~~~gin~~~Serine recombinase gin~~~
MLIGYVRVSTNDQNTDLQRNALVCAGCEQIFEDKLSGTRTDRPGLKRALKRLQKGDTLVVWKLDRLGRSMKHLISLVGEL
RERGINFRSLTDSIDTSSPMGRFFFHVMGALAEMERELIIERTMAGLAAARNKGRIGGRPPKLTKAEWEQAGRLLAQGIP
RKQVALIYDVALSTLYKKHPAKRTHIENDDRINQIDR
>P03015 3.1.22.-~~~gin~~~Serine recombinase gin~~~
MLIGYVRVSTNDQNTDLQRNALVCAGCEQIFEDKLSGTRTDRPGLKRALKRLQKGDTLVVWKLDRLGRSMKHLISLVGEL
RERGINFRSLTDSIDTSSPMGRFFFHVMGALAEMERELIIERTMAGLAAARNKGRIGGRPPKLTKAEWEQAGRLLAQGIP
RKQVALIYDVALSTLYKKHPAKRAHIENDDRIN
>P06487 ~~~gI~~~Envelope glycoprotein I~~~
MPCRPLQGLVLVGLWVCATSLVVRGPTVSLVSNSFVDAGALGPDGVVEEDLLILGELRFVGDQVPHTTYYDGGVELWHYP
MGHKCPRVVHVVTVTACPRRPAVAFALCRATDSTHSPAYPTLELNLAQQPLLRVQRATRDYAGVYVLRVWVGDAPNASLF
VLGMAIAAEGTLAYNGSAYGSCDPKLLPSSAPRLAPASVYQPAPNQASTPSTTTSTPSTTIPAPSTTIPAPQASTTPFPT
GDPKPQPPGVNHEPPSNATRATRDSRYALTVTQIIQIAIPASIIALVFLGSCICFIHRCQRRYRRSRRPIYSPQMPTGIS
CAVNEAAMARLGAELKSHPSTPPKSRRRSSRTPMPSLTAIAEESEPAGAAGLPTPPVDPTTPTPTPPLLV
>P09258 ~~~gI~~~Envelope glycoprotein I~~~
MFLIQCLISAVIFYIQVTNALIFKGDHVSLQVNSSLTSILIPMQNDNYTEIKGQLVFIGEQLPTGTNYSGTLELLYADTV
AFCFRSVQVIRYDGCPRIRTSAFISCRYKHSWHYGNSTDRISTEPDAGVMLKITKPGINDAGVYVLLVRLDHSRSTDGFI
LGVNVYTAGSHHNIHGVIYTSPSLQNGYSTRALFQQARLCDLPATPKGSGTSLFQHMLDLRAGKSLEDNPWLHEDVVTTE
TKSVVKEGIENHVYPTDMSTLPEKSLNDPPENLLIIIPIVASVMILTAMVIVIVISVKRRRIKKHPIYRPNTKTRRGIQN
ATPESDVMLEAAIAQLATIREESPPHSVVNPFVK
>Q77NN4 ~~~gI~~~Envelope glycoprotein I~~~
MFLIQCLISAVIFYIQVTNALIFKGDHVSLQVNSSLTSILIPMQNDNYTEIKGQLVFIGEQLPTGTNYSGTLELLYADTV
AFCFRSVQVIRYDGCPRIRTSAFISCRYKHSWHYGNSTDRISTEPDAGVMLKITKPGINDAGVYVLLVRLDHSRSTDGFI
LGVNVYTAGSHHNIHGVIYTSPSLQNGYSTRALFQQARLCDLPATPKGSGTSLFQHMLDLRAGKSLEDNPWLHEDVVTTE
TKSVVKEGIENHVYPTDMSTLPEKSLNDPPENLLIIIPIVASVMILTAMVIVIVISVKRRRIKKHPIYRPNTKTRRGIQN
ATPESDVMLEAAIAQLATIREESPPHSVVNPFVK
>P68331 ~~~gK~~~Envelope glycoprotein K~~~
MLAVRSLQHLSTVVLITAYGLVLVWYTVFGASPLHRCIYAVRPTGTNNDTALVWMKMNQTLLFLGAPTHPPNGGWRNHAH
ICYANLIAGRVVPFQVPPDAMNRRIMNVHEAVNCLETLWYTRVRLVVVGWFLYLAFVALHQRRCMFGVVSPAHKMVAPAT
YLLNYAGRIVSSVFLQYPYTKITRLLCELSVQRQNLVQLFETDPVTFLYHRPAIGVIVGCELMLRFVAVGLIVGTAFISR
GACAITYPLFLTITTWCFVSTIGLTELYCILRRGPAPKNADKAAAPGRSKGLSGVCGRCCSIILSGIAVRLCYIAVVAGV
VLVALHYEQEIQRRLFDV
>Q8JLF5 ~~~~~~Glutaredoxin-1~~~
MAEEFVQQRLANNKVTIFVKYTCPFCRNALDILNKFSFKRGAYEIVDIKEFKPENELRDYFEQITGGKTVPRIFFGKTSI
GGYSDLLEIDNMDALGDILSSIGVLRTC
>P68692 ~~~~~~Glutaredoxin-1~~~
MAEEFVQQRLANNKVTIFVKYTCPFCRNALDILNKFSFKRGAYEIVDIKEFKPENELRDYFEQITGGRTVPRIFFGKTSI
GGYSDLLEIDNMDALGDILSSIGVLRTC
>P68460 ~~~~~~Glutaredoxin-2~~~
MKNVLIIFGKPYCSICENVSDAVEELKSEYDILHVDILSFFLKDGDSSMLGDVKRGTLIGNFAAHLSNYIVSIFKYNPQT
KQMAFVDINKSLDFTKTDKSLVNLEILKSEIEKATYGVWPPVTE
>P00276 ~~~NRDC~~~Glutaredoxin~~~
MFKVYGYDSNIHKCVYCDNAKRLLTVKKQPFEFINIMPEKGVFDDEKIAELLTKLGRDTQIGLTMPQVFAPDGSHIGGFD
QLREYFK
>Q5UQ14 ~~~~~~Probable glutaredoxin~~~
MSYYMSPIVQKITGADPGTFVLFYVPECPYCQRALSTLRERNLPFKGYNINNISGNMPRLLQVLTTYSNLTGFNPYHTTK
PIIFINGKFIGGMDDLAKYLDVQFTQ
>Q9QSP1 ~~~G~~~Glycoprotein~~~
MLLQVILLVSLTAILPCTGQFPLYAIPDKLGPWSPIDIHHLSCPNNLIVEDEGCTSLSGFSYMELKVGFITTIKVSGFTC
TGVVTESETYTNFFGYVTTTFKRKHFRPTPESCRKAYNWKIAGDPRYEESLHNPYPDYHWLRTVTTTKESLLIISPSVVD
MDPYDKSLHSRMFPKGSCSGASIPSVFCSTNHDYTLWMPEDSNSGMSCDIFTMSKGKKASKGGKVCGFVDERGLYKSLKG
ACKLKLCGISGLRLLDGSWVSIQNHEEVKWCSPNQLVNIHDFNADEIEHLIVEELIKEREECLDALESIITTKSVSFRRL
SHLRKLVPGFGKAYTIINKTLMEADAHYKSVRTWDEIIPSKGCLKVREKCHPPYNGVFFNGIILGPDGQVLIPEMQSSLL
HQHTELLESSVIPLIHPLADPSTIFRGDDEAEGFIEVHLPDIQKQVSGIDLGLSEWERYLIIGISAIILFILAIIFTICC
RRCKRRKKIRTDHIELDRKVSVTSQSGKSIPSWESYKSRQGHSRS
>Q8JTH0 ~~~G~~~Glycoprotein~~~
MLLQIVLLMSLMVFSPCPGKFPLYTIPDKLGPWSPIDIHHLSCPNNLIVEDEGCTSLSGFSYMELKVGFITTIKVSGFTC
TGVVTESETYTNFFGYVTTTFKRKHFRPTPEFCRNAYNWKVAGDPRYEESLHNPYPDYHWLRTVTTTKESLLIISPSVVD
MDPYDKSLHSKMFPKGTCSGASVPSIFCSTNHDYTLWMPENPKPGMSCDIFTTSKGKKASKGGKVCGFVDERGLYKSLKG
ACKLKLCGISGLRLMDGSWVSIQNHEEAKWCSPDQLVNIHDFHSDEIEHLIVEELVRKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIVNKTLMEADAHYKSVRTWNEIIPSKGCLKVRERCHPPYNGVFFNGIILSPDGHVLIPEMQSSLL
QQHVELLESSVIPLIHPLADPSTVFKRDDEAEDFIEVHLPDVQKQVSGIDLGLSEWERYLIIGASAVVLFALAIIFAVCC
RRCKRRKKARTDRIELDRKVSVTSQSGKVIPSWESYKIEAGGHFRS
>Q6X1D5 ~~~G~~~Glycoprotein~~~
MPLQAIPLFSLILPVLVAGKFPIYTIPDKIGPWSPIDINHLSCPNNLVVEDEGCTTLTAFSYMELKVGYITTIKVSGFTC
TGVVTEAETYTNFVGYVTTTFRRKHFRPTASACREAYNWKATGDPRYEESLHNPYPDSHWLRTVKTTKESLLIISPSVAD
MDAYDKALYSKIFPNGKCLGVSLSSPFCSTNHDYTLWMPENPKPGVSCDIFTTSKGKKATKDGKLCGFVDERGLYKSLKG
ACKLKLCGVMGLRLMDGSWVSLQKTEESEWCSPNQLINIHDFHSDEIEHMVVEELVKKREECLDALESIMTTKSISFRRL
SHLRKLVPGFGKAYTLINKTLMEADAHYKSVREWTEVIPSKGCLKAGGGCYPHYNRVFFNGIILSPDGHVLIPEMQSALL
QQHIELLESSVIPLRHPLADPSTVFKGDDEAEEFVEVHLPDTQKQISGIDLGLPEWKRYFLMGMSAIGFLALTIILAVCC
RRIKRRKQSKPNPVELIRKVSVTSQSGRAIPSWESYKVKTGDQPQV
>Q89669 ~~~G~~~Glycoprotein~~~
MATFKVLVLMILWITSIFNVRCEKFVTIPVNCSGEVDIDKMDVMCPNRYNLLSTNHLMEGEEVETFCRPSLRENDLLDGY
LCRKQKWEVTCTETWYFVTDVKYQIIEVIPTENECMEERERKLKGEYIPPYYPPTNCVWNAIDTQERTFITLIEHPVIED
PVTMTLMDSKFTKPCNPKHNEVTICDTYNPLIKWISKETSGLNLHCQIKSWECIPVKLHHSHRNMMEALYLESPDFGIVD
ASKICNLTFCGYNGILLDNGEWWSIYRSGFTHGFLDNHILKNRRIEECKEKKPGYKLAKLDTTYIDLEFEIELEHEKCLG
TLEKLQNGEYVTPLDLSYLSPSNPGKHYAYRLEYINTTEHKCVQLGFTYEGGDCRKMLDERDDHGAYYNWTTIKLQRVIR
AVCYYHTFSMNLDESKHKYYDQDNRSIQIDEKFISEVLKSTPLIDRHEKYEGNLSWNGIIIESKNGHEKNVIVPSASQYN
HVMINKILKRLDTVMYDSYKFDSESGSISYNKIVPIVREDNLQNAHRVDVIQYIKDKGSYIINGFTGWFSSLGKLMRWTI
WGVGLFFSIFTLYKIIMILRKHSNDNVRKEFKETAGKVMIGQPIDTKSMSRTSIKANNKGKFDKVKDLFTPRSKTISHLT
TDTLKEHTDGTYEELHFFNV
>P32595 ~~~G~~~Glycoprotein~~~
MFKVLIITLLVNKIHLEKIYNVPVNCGELHPVKAHEIKCPQRLNELSLQAHHNLAKDEHYNKICRPQLKDDAHLEGFICR
KQRWITKCSETWYFSTSIEYQILEVIPEYSGCTDAVKKLDQGALIPPYYPPAGCFWNTEMNQEIEFYVLIQHKPFLNPYD
NLIYDSRFLTPCTINDSKTKGCPLKDITGTWIPDVRVEEISEHCNNKHWECITVKSFRSELNDKERLWEAPDIGLVHVNK
GCLSTFCGKNGIIFEDGEWWSIENQTESDFQNFKIEKCKGKKPGFRMHTDRTEFEELDIKAELEHERCLNTISKILNKEN
INTLDMSYLAPTRPGRDYAYLFEQTSWQEKLCLSLPDSGRVSKDCNIDWRTSTRGGMVKKNHYGIGSYKRAWCEYRPFVD
KNEDGYIDIQELNGHNMSGNHAILETAPAGGSSGNRLNVTLNGMIFVEPTKLYLHTKSLYEGIEDYQKLIKFEVMEYDNV
EENLIRYEEDEKFKPVNLNPHEKSQINRTDIVREIQKGGKKVLSAVVGWFTSTAKAVRWTIWAVGAIVTTYAIYKLYKMV
KSNSSHSKHREADLEGLQSTTKENMRVEKNDKNYQDLELGLYEEIRSIKGGSKQTGDDRFFDH
>P13180 ~~~G~~~Glycoprotein~~~
MTSSVTISVVLLISFITPLYSYLSIAFPENTKLDWKPVTKNTRYCPMGGEWFLEPGLQEESFLSSTPIGATPSKSDGFLC
HAAKWVTTCDFRWYGPKYITHSIHNIKPTRSDCDTALASYKSGTLVSLGFPPESCGYASVTDSEFLVIMITPHHVGVDDY
RGHWVDPLFVGGECDQSYCDTIHNSSVWIPADQTKKNICGQSFTPLTVTVAYDKTKEIAAGGIVFKSKYHSHMEGARTCR
LSYCGRNGIKFPNGEWVSLDVKTRIQEKHLLPLFKECPAGTEVRSTLQSDGAQVLTSEIQRILDYSLCQNTWDKVERKEP
LSPLDLSYLASKSPGKGLAYTVINGTLSFAHTRYVRMWIDGPVLKEPKGKRESPSGISSDIWTQWFKYGDMEIGPNGLLK
TAGGYKFPWHLIGMGIVDNELHELSEANPLDHPQLPHAQSIADDSEEIFFGDTGVSKNPVELVTGWFTSWKESLAAGVVL
ILVVVLIYGVLRCFPVLCTTCRKPKWKKGVERSDSFEMRIFKPNNMRARV
>Q5VKP3 ~~~G~~~Glycoprotein~~~
MSLLTAVIAFLFISTFCSGKFPIYTIPDKIGPWSPIDINHLSCPNNLEVEDEGCTTLTAFNYMELKVGYITSIKVDGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPNVSACRAAFSWKTAGDPRYEESLHNPYPDSHWLRTVTTTKESLLIISPSVVD
MDAYDKTLYSKMFPNGKCFPPISDSPFCSTNHDYTLWLPEKEKLSMSCNIFVSSKGKKATKDGRLCGFVDERGLYKSLKG
ACKLKLCGMAGMRLMDGSWVSLQRADAPEWCPPGALVNVHDFHSDEIAHFVVEELIKKREECLDTLETILTTKSISFRRL
SHFRKLVPGLGKAYTLINNTLMEAEAHYKSIREWKEIIPSKGCLKAGGRCHPHYDGIFFNGIILGPNGDVLIPEMQSSLL
QQHIELLESSMIPLRHPLADSSAIFRSDNEAEDFVDVHLPDTQKQVSDIDLGFPEWKRYFLIGVSAIALFSLAIIIAVCC
RKFKRRKRPKPGPIELVRKVSVTSQSGKVVPSWESYKEGATSQP
>Q6X1D1 ~~~G~~~Glycoprotein~~~
MPSQAVFLVLTTVFSQCVGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDDGCTTLSGFTYMELKVGYITTIKVDGFTC
TGIVTEAETYTNFVGYVTTTFKRKHFRPGPSACRDAYNWKAAGDPRYEESLHNPYPDSHWLRTVTTTKESLLIISPSVVD
MDAYDKSLLSKIFPNGKCPGVSIASPFCSTNHDYTIWMPENTKTGMSCDIFTTSKGKRATKDGKLCGFVDERGLYKSLKG
SCKLKLCGVSGLRLMDGSWVSIQNHEEAKWCPPDQLVNVHDFHSDEIEHLIVEELVKKREECLDALESIMTTKSISFRRL
SHLRKLVPGFGKAYTIINKTLMEADAHYKSIREWSEIIPSKGCLVAGGRCYHHHNGVFFNGIILSPDGHVLIPEMQSALL
QQHIELLESSVIPLMHPLADPSTVFKGDDGAEDFVEVHLPDVQKQISGIDLGLPEWKRYFLIGVAALTLFALTIFVVVCC
RRVRRRERAKPNPVELIRKVSVTSQSGKVIPSWESYKVEAEGQSQA
>Q8BDV6 ~~~G~~~Glycoprotein~~~
MSQLNLIPFFCVIIVLSVEDFPLYTIPEKIGPWTPIDLIHLSCPNNLQSEDEGCGTSSVFSYVELKTGYLTHQKVSGFTC
TGVVNEAVTYTNFVGYVTTTFKRKHFKPTALACRDAYHWKISGDPRYEESLHTPYPDNSWLRTVTTTKESLVIISPSIVE
MDVYSRTLHSPMFPTGTCSRFYPSSPSCATNHDYTLWLPDDPNLSLACDIFVTSTGKKSMNGSRMCGFTDERGYYRTIKG
ACKLTLCGKPGLRLFDGTWISFPRPEVTTRCLPNQLVNIHNNRIDEVEHLIVEDLIRKREECLDTLETVLMSKSISFRRL
SHFRKLVPGYGKAYTILNGSLMETNVHYLKVDNWSEILPSKGCLKINNQCVAHYKGVFFNGIIKGPDGHILIPEMQSSLL
KQHMDLLKAAVFPLKHPLIEPGSLFNKDGDADEFVDVHMPDVHKLVSDVDLGLPDWSLYALIGATIIAFFILICLIRICC
KKGGRRNSPTNRPDLPIGLSTTPQPKSKVISSWESYKGTSNV
>P0C572 ~~~G~~~Glycoprotein~~~
MNIPCFVVILSLATTHSLGEFPLYTIPEKIEKWTPIDMIHLSCPNNLLSEEEGCNAESSFTYFELKSGYLAHQKVPGFTC
TGVVNEAETYTNFVGYVTTTFKRKHFRPTVAACRDAYNWKVSGDPRYEESLHTPYPDSSWLRTVTTTKESLLIISPSIVE
MDIYGRTLHSPMFPSGVCSNVYPSVPSCETNHDYTLWLPEDPSLSLVCDIFTSSNGKKAMNGSRICGFKDERGFYRSLKG
ACKLTLCGRPGIRLFDGTWVSFTKPDVHVWCTPNQLINIHNDRLDEIEHLIVEDIIKKREECLDTLETILMSQSVSFRRL
SHFRKLVPGYGKAYTILNGSLMETNVYYKRVDKWADILPSKGCLKVGQQCMEPVKGVLFNGIIKGPDGQILIPEMQSEQL
KQHMDLLKAAVFPLRHPLISREAVFKKDGDADDFVDLHMPDVHKSVSDVDLGLPHWGFWMLIGATIVAFVVLVCLLRVCC
KRVRRRRSGRATQEIPLSFPSAPVPRAKVVSSWESYKGLPGT
>P15199 ~~~G~~~Glycoprotein~~~
MVPQVLLFVPLLGFSLCFGKFPIYTIPDELGPWSPIDIHHLSCPNNLVVEDEGCTNLSEFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVRTTKESLIIISPSVTD
LDPYDKSLHSRGFPGGKCSGITVSSTYCSTNHDYTIWMPENPGPRTPCDIFTNSRGKRASKGNKTCGFVDERGLYKSLKG
ACRLKLCGVLGLRLMDGTWVAMQTSDETKWCPPDQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEIIPSKGCLKVGGRCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHMELLKSSVIPLMHPLADPSTVFKEGDEAEDFVEVHLPDVYKQISGVDLGLPNWGKYVLMTAGAMIGLVLIFSLMTWC
RRANRPESKQRSFGGTGRNVSVTSQSGKVIPSWESYRSGGEIRL
>Q66T62 ~~~G~~~Glycoprotein~~~
MIPQALLFVPLLIPSLCLGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTSLSGFSYMELKVGYISAMKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPMPDACRAAHDWKIAGDPRYEDSLQNPYPDYHWLRTVKTTKESLVIISPSVAD
LDPYDKSLHSRVFPSGKCLGITVSSTYCPTNHDYTIWMPVEARLGTSCDIFTNSRGKKASKGGRTCGFVDERGLYKSLKG
ACKLKLCGVPGLRLMNGTWVSIQTSDDIKWCPPDQLVNLHDFHSDEIEHLVVEELIKKREGCLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNNTLMEADAHYKSVRTWNEVIPSKGCLKVGGRCHPPVNGVFFNGIILGPDGNVLIPEMQSSLL
QQHMELLESSVIPLMHPLADPSTVFKDGDEAEDFVEVHLPDVHKQVSDVDLGLPSWGKYLLMSAGALATLILAIFLITCC
RRANRTKSTQRGHRESGGKVSVAPQNGKIISSWELYKSESETGM
>O92284 ~~~G~~~Glycoprotein~~~
MVPQVLLFVPLLGFSLCFGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSEFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVRTTKESLIIISPSVTD
LDPYDKSLHSRVFPGGKCSGITVSSTYCSTNHDYTIWMPENPRPRTPCDIFTNSRGKRASKGNKTCGFVDERGLYKSLKG
ACRLKLCGVLGLRLMDGTWVAMQTSDETKWCPPDQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEIIPSKGCLKVGGRCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHMELLKSSVIPLMHPLADPSTVFKEGDEAEDFVEVHLPDVYKQISGVDLGLPNWGKYVLMTAGAMIGLVLIFSLMTWC
RRANRPESKQRSFGGTGRNVSVTSQSGKVIPSWESYKSGGEIRL
>Q0GBX6 ~~~G~~~Glycoprotein~~~
MVPQALLLVPLLGFSLCFGKFPIYTIPTKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGRISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPMPGCMYSRVQLEDGRSPQIEESLHNPYPDYHWLRTVRTTKESLIIISPSVTD
LDPYDKSLHSRVFPGRKCSGITVSSTYCSTNHDYTVWMPEILRLGTSCDIFTNSRGKRASKGSKTCGFVDERGLYKSLKG
ACKLKLCGVPGLRLMDGTWVAMQTSNETKWCPPGQLVNLHDLHSDEIEHLVVEELVKKREECLDALESITTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEAEAHYKSVRTWNEIIPSKGCLRVGGGCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHIELLESSVIPLMHPLADPFTVFKDGDEIEDFVEVHLPDVHEQVSGVDLGLPNWGEYVLLSAGTLIALMLIIFLITCC
KRVDRPESTQRSLRGTGRNVSVTSQSGKFIPSRESYKSGGETGL
>P03524 ~~~G~~~Glycoprotein~~~
MVPQALLFVPLLVFPLCFGKFPIYTILDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYILAIKMNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYRWLRTVKTTKESLVIISPSVAD
LDPYDRSLHSRVFPSGKCSGVAVSSTYCSTNHDYTIWMPENPRLGMSCDIFTNSRGKRASKGSETCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSNETKWCPPDQLVNLHDFRSDEIEHLVVEELVRKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEILPSKGCLRVGGRCHPHVNGVFFNGIILGPDGNVLIPEMQSSLL
QQHMELLESSVIPLVHPLADPSTVFKDGDEAEDFVEVHLPDVHNQVSGVDLGLPNWGKYVLLSAGALTALMLIIFLMTCC
RRVNRSEPTQHNLRGTGREVSVTPQSGKIISSWESHKSGGETRL
>P19462 ~~~G~~~Glycoprotein~~~
MVPQVLLFAPLLVFPLCFGKFPIYTIPDKLGPWSPIDLHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVKTTKESLVIISPSVTD
LDPYDKSLHSRVFPGGNCSGITVSSTYCSTNHDYTIWMPENLRLGTSCDIFTHSRGKRASKGDKTCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSDETKWCPPGQLVNLHDFRSDEIEHLVEEELVKKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVQTWNEIIPSKGCLRVGERCHPHVNGVFFNGIILGSDGHVLIPEMQSSLL
QQHMELLESSVIPLMHPLADPSTVFKDGDEVEDFVEVHLPDVHKQVSGVDLGLPKWGKYVLMIAGALIALMLIIFLMTCC
RRVNRPESTQSNLGGTGRNVSVPSQSGKVISSWESYKSGGETRL
>A3RM22 ~~~G~~~Glycoprotein~~~
MVPQVLLFVPLLVFSMCFGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVKTTKESLVIISPSVAD
LDPYDKSLHSRVFPSGKCSGITISSTYCSTNHDYTIWMPENPRLGTSCDIFTNSRGKRASKGGKTCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSDETKWCPPDQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMATKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEIIPSKGCLRVGGRCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHMELLESSVIPLMHPLADPSTVFKDGDEAEDFVEVHLPDVHKQISGVDLGLPSWGKYVLVSAGVLVVLMLTIFIMTCC
GRVHRPKSTQHGLGGTGRKVSVTSQSGKVISSWESYKSGGETRL
>Q9IPJ6 ~~~G~~~Glycoprotein~~~
MVPQALLLVPILGFSSCFGKFPIYTIPDTLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHSPYPDYHWLRTVKTTKESLVIISPSVAD
LDPYDNSLHSRVFPSGKCSGITVSSVYCSTNHDYTVWMPESLRLGTSCDIFTNSRGKRASKGSKTCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSNETKWCPPDQLVNLHDLRSDEIEHLVIEELVKKREECLDALESIITTKSVSFRRL
SYLRKLVPGFGKAYTIFNKTLMEAEAHYKSVRTWNEIIPSKGCLRVGGRCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHIELLESSVIPLMHPLADPFTVFKDGDETEDFIEVHLPDVHEQVSGVDLGLPNWGEYVLLSAGTLIALMLIIFLMTCC
RKVDRPESTQRSLRGTGRNVSVTSQSGKFIPSWESYKSGGETGL
>P08667 ~~~G~~~Glycoprotein~~~
MVPQALLFVPLLVFPLCFGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYISAIKMNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVKTTKESLVIISPSVAD
LDPYDRSLHSRVFPGGNCSGVAVSSTYCSTNHDYTIWMPENPRLGMSCDIFTNSRGKRASKGSETCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSNETKWCPPGQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEIIPSKGCLRVGGRCHPHVNGVFFNGIILGPDGNVLIPEMQSSLL
QQHMELLVSSVIPLMHPLADPSTVFKNGDEAEDFVEVHLPDVHERISGVDLGLPNWGKYVLLSAGALTALMLIIFLMTCW
RRVNRSEPTQHNLRGTGREVSVTPQSGKIISSWESYKSGGETGL
>Q0GBY1 ~~~G~~~Glycoprotein~~~
MVPQVLLFVLLLGFSLCFGKFPIYTIPDELGPWSPIDIHHLSCPNNLVVEDEGCTNLSEFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVRTTKESLIIISPSVTD
LDPYDKSLHSRVFPGRKCSGITVSSTYCSTNHDYTIWMPENPRPRTPCDIFTNSRGKRASNGNKTCGFVDERGLYKSLKG
ACRLKLCGVLGLRLMDGTWVAMQTSGETKWCPPDQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEIIPSKGCLKVGGRCHPHVNGVFFNGLILGPDDHVLIPEMQSSLL
QQHMELLESSVIPLMHPLADPSTVFKEGDEAEDFVEVHLPDVYKQISGVDLGLPNWGKYVLMTAGAMIGLVLIFSLMTWC
RRANRPESKQRSFGGTGGNVSVTSQSGKVIPSWESYKSGGEIRL
>P16288 ~~~G~~~Glycoprotein~~~
MVPQALLFVPLLVFPLCFGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYILAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYRWLRTVKTTKESLVIISPSVAD
LDPYDRSLHSRVFPSGKCSGVAVSSTYCSTNHDYTIWMPENPRLGMSCDIFTNSRGKRASKGSETCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVSMQTSNETKWCPPDKLVNLHDFRSDEIEHLVVEELVRKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEILPSKGCLRVGGRCHPHVNGVFFNGIILGPDGNVLIPEMQSSLL
QQHMELLESSVIPLVHPLADPSTVFKDGDEAEDFVEVHLPDVHNQVSGVDLGLPNWGKYVLLSAGALTALMLIIFLMTCC
RRVNRSEPTQHNLRGTGREVSVTPQSGKIISSWESHKSGGETRL
>P32550 ~~~G~~~Glycoprotein~~~
MVPQALLFVPLLVFPLCFGKFPIYTIPDKLGPWSPIDIHHLRCPNNLVVEDEGCTNLSGFSYMELKVGYISAIKVNGFTC
TGVVTEAETYTNFVGYVTTTFKRKHFRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYHWLRTVKTTKESLVIISPSVAD
LDPYDKSLHSRVFPSGNCSGITVSSTYCSTNHDYTIWMPENPRLETSCDIFTNSRGKRASKGSKTCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSDETKWCPPDQLVNLHDFRSDEIEHLVVEELVKKREECLDALESIMTTKSVSLRRL
SHLRKLVPGFGKAYTIFNKTLMEAEAHYKSVQTWNEIIPSKGCLRVGGRCHPHVNGVFFNGIILGPDGHVLIPEMQSSLL
QQHMELLESSVIPLMHPLADPSTVFKDGDEAEDFVEVHLPDVHKQVSGVDLGLPNWGKYVLLSAGTLIALMLIIFLMTCC
RRVNRPKSTERSLGETGRKVSVTSQSGKVISSWESYKSGGETRR
>Q08089 ~~~G~~~Glycoprotein~~~
MVPQALLFVPLLVFPLCFGKFPIYTIPDKLGPWSPIDIHHLSCPNNLVVEDEGCTNLSGFSYMELKVGYILAIKMNGFTC
TGVVTEAENYTNFVGYVTTTFKRKHLRPTPDACRAAYNWKMAGDPRYEESLHNPYPDYSWLRTVKTTKESLVIISPSVAD
LDPYDRSLHSRVFPSGKCSGVAVSSTYCSTNHDYTIWMPENPRLGKSCDIFTNSRGKRASKGSETCGFVDERGLYKSLKG
ACKLKLCGVLGLRLMDGTWVAMQTSNETKWCPPDQLVNLHDFRSDEIEHLVVEELVRKREECLDALESIMTTKSVSFRRL
SHLRKLVPGFGKAYTIFNKTLMEADAHYKSVRTWNEILPSKGCLRVGGRCHPHVNGVFFNGIILGPDGNVLIPEMQSSLL
QQHMELLESSVIPLVHPLADPSTVFKDGDEAEDFVEVHLPDVHNQVSGVDLGLPNWGKYVLLSAGALTALMLIIFLMTCC
RRVNRSEPTQHNLRGTGREVSVTPQTWKIISSWESHKSGGETRL
>P27277 ~~~G~~~Spike glycoprotein~~~
MSHIMNLLVISFVLAGSSWSLLGYQDDFSSKRSGALASNPTYNLPQDKGYGRDMYQPYYICEPDNDGSALTLPSWHYSCK
ESCMGNHLKRVVNITGARWNYVGISIPVFKIVTNEVCYTSHENVWGYCSQYQISRPVATQKSDVSCITSSMWDNDKSPIG
SLYNIVNSNEAECDYFSDITDCNRDYQIFKREGKLIKRSDDSPLELSIVTDGIRTDPASEYLSLDDVSWFWKLPNNDMSP
PCGWEKTQKLSCSYTDTTDVIKCNSIGYTYNIQGISKKSTCAGNIYDTDGPFPFFYDAEEALMSTDDACGKAKQGKPDAD
IAFIEGVNRAFEDLELTYCSATCDLFARQGTPNEDHVLDTPIGTWRYVMRDNLDPALVPCLPTSNWTISDPTTICHGKDH
ILVVDTATGHSGSWDTKKDYIITGEVCNTNNDEMGDDYDGMRDKILRGETIEIKFWTGDIIRMAPPYDNPEWIKGSVLFR
QNPGWFSSVELNKDMIHTRDNITDLLTVMVQNATAEVMYKRLDPKTMKHILFAEIVDGVGNVSGKISGFLTGLFGGFTKA
VIIVASLAICYIVLSVLWKVRLVASIFNSAKKKRVRISDILDEEPHRIQQSRPTLSRKKKTRESIQMLLNDI
>P03522 ~~~G~~~Glycoprotein~~~
MKCLLYLAFLFIGVNCKFTIVFPHNQKGNWKNVPSNYHYCPSSSDLNWHNDLIGTAIQVKMPKSHKAIQADGWMCHASKW
VTTCDFRWYGPKYITQSIRSFTPSVEQCKESIEQTKQGTWLNPGFPPQSCGYATVTDAEAVIVQVTPHHVLVDEYTGEWV
DSQFINGKCSNYICPTVHNSTTWHSDYKVKGLCDSNLISMDITFFSEDGELSSLGKEGTGFRSNYFAYETGGKACKMQYC
KHWGVRLPSGVWFEMADKDLFAAARFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIRAGLPISPVDLSY
LAPKNPGTGPAFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTTTERELWDDWAPYEDVEIGPNGVLRTSSGYKFPL
YMIGHGMLDSDLHLSSKAQVFEHPHIQDAASQLPDDESLFFGDTGLSKNPIELVEGWFSSWKSSIASFFFIIGLIIGLFL
VLRVGIHLCIKLKHTKKRQIYTDIEMNRLGK
>Q8B0H1 ~~~G~~~Glycoprotein~~~
MKCLLYLALLFIGVYCKFTTVFPHNKKGDWKNVPSNYHYCPSSSDLNWHNDLIGTALQVKMPKSHKAIQADGWMCHASKW
VTTCDFRWYGPKYITHSIRSFTPSVEQCKESIEQTKQGTWLNPGFPPQSCGYATVTDAEAVIVQVTPHHVLVDEYTGEWV
DSQFINGKCSDDICPTVHNSTTWHSDYKVKGLCDSNLISMDITFFSEDGELSSLGKEGTGFRSNYFAYETGDKACKMQYC
KHWGVRLPSGVWFEMADKNLFAAAKFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIRAGLPISPVDLSY
LAPKNPGTGPAFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTTTERELWEDWAPYEDVEIGPNGVLRTSSGYKFPL
YMIGHGMLDSDLHLSSKAQVFEHPHIPDATSQLPDDETLFFGDTGLSKDPIELVEGWFSGWKSSIASFFFIIGLIIGLFF
VLRIGVYLCIKLKHTNKRQIYTDIEMNRLGK
>P04883 ~~~G~~~Glycoprotein~~~
MKCFLYLAFLFIGVNCKFTIVFPHNQKGNWKNVPSNYHYCPSSSDLNWHNDLIGTGLQVKMPKSHKAIQADGWMCHASKW
VTTCDFRWYGPKYITHSIRSFTPSVEQCKESIEQTKQGTWLNPGFPPQSCGYATVTDAEAVIVQVTPHHVLVDEYTGEWV
DSQFINGKCSNDICPTVHNSTTWHSDYKVKGLCDSNLISTDITFFSEDRELSSLGKEGTGFRSNYFAYETGDKACKMQYC
KHWGVRLPSGVWFEMADKDLFAAARFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIRAGLPISPVDLSY
LAPKNPGTGPAFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTTTERELWDDWAPYEDVEIGPNGVLRTSSGYKFPL
YMIGHGMLDSGLHLSSKAQVFEHPHIQDAASQLPDDEILFFGDTGLSKNPIDFVEGWFSSWKSSIASFFFIIGLIIGLFL
VLRVGIYLYIKLKHTKKRQIYTDIEMNRLGR
>Q8B0I1 ~~~G~~~Glycoprotein~~~
MKCLLYLAFLSIGVNCKFTIVFPHNQKGTWKNVPSNYHYCPSSSDLNWHNDLIGTALQVKMPKSHKAIQADGWMCHASKW
VTTCDFRWYGPKYITHSIRSFTPSVEQCRESIEQTKQGTWLNPGFPPQSCGYATVTDAEAVIVQVTPHHVLVDEYTGEWV
DSQFINGKCSNDICPTVHNSTTWHSDYKVKGLCDSNLISMDITFFSEDGELSSLGKEGTGFRSNHFAYETGDKACKMQYC
KHWGVRLPSGVWFEMADQDLFAAARFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIGAGLPISPVDLSY
LAPKNPGTGPAFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTTTERELWDDWAPYEDVEIGPNGVLRTSSGYKFPL
YMIGHGMLDSDLHLSSKAQVFEHPHIQDAASQLPDDETLFFGDTGLSKNPIELVEGWFSGWKSSIASFFFIIGLIIGLFL
VLRVGIYLCIKLKHTKKRQIYTDIEMNRLGK
>P04884 ~~~G~~~Glycoprotein~~~
MKCLLYLAFLFIGVNCKFTIVFPHNQKGNWKNVPSNYHYCPSSSDLNWHNDLIGTALQVKMPKSHKAIQADGWMCHASKW
VTTCDFRWYGPKYITHSIRSFTPSVEQCKESIEQTKQGTWLNPGFPPQSCGYATVTDAEAAIVQVTPHHVLVDEYTGEWV
DSQFINGKCSNDICPTVHNSTTWHSDYKVKGLCDSNLISTDITFFSEDGELSSLGKEGTGFRSNYFAYETGDKACKMQYC
KHWGVRLPSGVWFEMADKDLFAAARFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIRAGLPISPVDLSY
LAPKNPGTGPVFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTTTERELWDDWAPYEDVEIGPNGVLRTSLGYKFPL
YMIGHGMLDSDLHLSSKAQVFEHPHIQDAASQLPDDETLFFGDTGLSKNPIEFVEGWFSSWKSSIASFFFIIGLIIGLFL
VLRVGIYLCIKLKHTKKRQIYTDIEMNRLGK
>Q8B0H6 ~~~G~~~Glycoprotein~~~
MKCLLCLAFLFIGVNCKFTIVFPHNQKGNWKNVPSNYHYCPSSSDLNWHNDLIGTALQVKMPKSHKAIQADGWMCHASKW
ITTCDFRWYGPKYITHSIQSFTPSVEQCKESIEQTKQGTWLNPGFPPQSCGYATVTDAEAVIVQVTPHHVLVDEYTGEWV
DSQFINGKCSNDICLTVHNSTTWHSDYKVKGLCDSNLISMDITFFSEDGELSSLGKAGTGFRSNYFAYETGDKACKMQYC
KHWGVRLPSGVWFEMADKDLFAAAKFPECPEGSSISAPSQTSVDVSLIQDVERILDYSLCQETWSKIRAGLPISPVDLSY
LAPKNPGTGPAFTIINGTLKYFETRYIRVDIAAPILSRMVGMISGTNTERELWEDWAPYEDVEIGPNGVLRTSSGYKFPL
YMIGHGMLDSDLHLSSKVQVFEHPHIQDAASQLPDDETLFFGDTGLSKNPIELVEGWFSGWKSSIASFFFIIGLIIGLFL
VLRVGIYLCIKLKHTRKRKIYADIEMNRLGK
>P04882 ~~~G~~~Glycoprotein~~~
MLSYLIFALAVSPILGKIEIVFPQHTTGDWKRVPHEYNYCPTSADKNSHGTQTGIPVELTMPKGLTTHQVEGFMCHSALW
MTTCDFRWYGPKYITHSIHNEEPTDYQCLEAIKSYKDGVSFNPGFPPQSCGYGTVTDAEAHIVTVTPHSVKVDEYTGEWI
DPHFIGGRCKGQICETVHNSTKWFTSSDGESVCSQLFTLVGGIFFSDSEEITSMGLPETGIRSNYFPYISTEGICKMPFC
RKQGYKLKNDLWFQIMDPDLDKTVRDLPHIKDCDLSSSIITPGEHATDISLISDVERILDYALCQNTWSKIESGEPITPV
DLSYLGPKNPGVGPVFTIINGSLHYFTSKYLRVELESPVIPRMEGKVAGTRIVRQLWDQWFPFGEVEIGPNGVLKTKQGY
KFPLHIIGTGEVDSDIKMERVVKHWEHPHIEAAQTFLKKDDTGEVLYYGDTGVSKNPVELVEGWFSGWRSSLMGVLAVII
GFVILMFLIKLIGVLSSLFRPKRRPIYKSDVEMAHFR
>Q5VKN9 ~~~G~~~Glycoprotein~~~
MASYFALVLNGISMVFSQGLFPLYTIPDHLGPWTPIDLSHLHCPNNLYTDASYCTTEQSITYTELKVGSSVSQKIPGFTC
TGVRTESVTYTNFVGYVTTTFKKKHFPPKSRDCREAYERKKAGDPRYEESLAHPYPDNSWLRTVTTTKDSWVIIEPSVVE
LDIYTSALYSPLFKDGTCSKSRTYSPYCPTNHDFTIWMPESENIRSACNLFSTSRGKLVRNRTSTCGIIDERGLFRSVKG
ACKISICGRQGIRLVDGTWMSFRYSEYLPVCSPSQLINTHDIKVDELENAIVLDLIRRREECLDTLETILMSGSVSHRRL
SHFRKLVPGSGKAYSYINGTLMESDAHYIKVENWSEVIPHKGCLMVGGKCYEPVNDVYFNGIIRDSNNQILIPEMQSSLL
REHVDLLKANIVPFRHPMLLRSFTSDTEEDIVEFVNPHLQDTQKLVSDMDLGLSDWKRYLLIGSLAVGGVVAILFIGTCC
LRCRAGRNRRTIRSNHRSLSHDVVFHKDKDKVITSWESYKGQTAQ
>O89343 ~~~G~~~Glycoprotein G~~~
MMADSKLVSLNNNLSGKIKDQGKVIKNYYGTMDIKKINDGLLDSKILGAFNTVIALLGSIIIIVMNIMIIQNYTRTTDNQ
ALIKESLQSVQQQIKALTDKIGTEIGPKVSLIDTSSTITIPANIGLLGSKISQSTSSINENVNDKCKFTLPPLKIHECNI
SCPNPLPFREYRPISQGVSDLVGLPNQICLQKTTSTILKPRLISYTLPINTREGVCITDPLLAVDNGFFAYSHLEKIGSC
TRGIAKQRIIGVGEVLDRGDKVPSMFMTNVWTPPNPSTIHHCSSTYHEDFYYTLCAVSHVGDPILNSTSWTESLSLIRLA
VRPKSDSGDYNQKYIAITKVERGKYDKVMPYGPSGIKQGDTLYFPAVGFLPRTEFQYNDSNCPIIHCKYSKAENCRLSMG
VNSKSHYILRSGLLKYNLSLGGDIILQFIEIADNRLTIGSPSKIYNSLGQPVFYQASYSWDTMIKLGDVDTVDPLRVQWR
NNSVISRPGQSQCPRFNVCPEVCWEGTYNDAFLIDRLNWVSAGVYLNSNQTAENPVFAVFKDNEILYQVPLAEDDTNAQK
TITDCFLLENVIWCISLVEIYDTGDSVIRPKLFAVKIPAQCSES
>Q9IH62 ~~~G~~~Glycoprotein G~~~
MPAENKKVRFENTTSDKGKIPSKVIKSYYGTMDIKKINEGLLDSKILSAFNTVIALLGSIVIIVMNIMIIQNYTRSTDNQ
AVIKDALQGIQQQIKGLADKIGTEIGPKVSLIDTSSTITIPANIGLLGSKISQSTASINENVNEKCKFTLPPLKIHECNI
SCPNPLPFREYRPQTEGVSNLVGLPNNICLQKTSNQILKPKLISYTLPVVGQSGTCITDPLLAMDEGYFAYSHLERIGSC
SRGVSKQRIIGVGEVLDRGDEVPSLFMTNVWTPPNPNTVYHCSAVYNNEFYYVLCAVSTVGDPILNSTYWSGSLMMTRLA
VKPKSNGGGYNQHQLALRSIEKGRYDKVMPYGPSGIKQGDTLYFPAVGFLVRTEFKYNDSNCPITKCQYSKPENCRLSMG
IRPNSHYILRSGLLKYNLSDGENPKVVFIEISDQRLSIGSPSKIYDSLGQPVFYQASFSWDTMIKFGDVLTVNPLVVNWR
NNTVISRPGQSQCPRFNTCPEICWEGVYNDAFLIDRINWISAGVFLDSNQTAENPVFTVFKDNEILYRAQLASEDTNAQK
TITNCFLLKNKIWCISLVEIYDTGDNVIRPKLFAVKIPEQCT
>O10683 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHPKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITAIIYISVGNAKAKPTSKPTTQQT
QQLQNHTPPPLTEHNYKSTHTSIQSTTLSQPPNIDTTSGTTYGHPTNRTQNRKIKSQSTPLATRKPPINPLGSNPPENHQ
DHNNSQTLPHVPCSTCEGNPACSPLCQIELERAPSSAPTITLKKAPKPKTTKKPTKTTIYHRTSPEAKLQTKKIMATPQQ
GILSSPEHQTNQSTTQISQHTSI
>O10685 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHPKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTSLAMITLTSLVITAIIYISVGNAKAKPTSKPTTQQT
QQPQNHTPLLPTEHNHKSTHTSTQSTTLSQPPNIDTTSGTTYGHPINRTQNRKIKSQSTPLATRKLPINPLESNPPENHQ
DHNNSQTLPHVPCSTCEGNPACSPLCQIGLERAPSRAPTITLKKAPKPKTTKKPTKTTIYHRTSPEAKLQTKKNTATPQQ
GILSSPEHQTNQSTTQISQHTSI
>O10684 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHPKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITAIIYISVGNAKAKPTSKPTTQQT
QQLQNHTPPPLTEHNYKSTHTSIQSTTLSQPPNIDTTSGTTYGHPTNRTQNRKIKSQSTPLATRKPPINPLGSNPPENHQ
DHNNSQTLPHVPCSTCEGNPACSPLCQIELERAPSSAPTITLKKAPKPKTTKKPTKTTIYHRTSPEAKLQTKKIMVTPQQ
GILSSPEHQTNQSTTQISQHTSI
>Q65706 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKFKTGKRAWKPSKYFIVGLSCLYKFNLKSLVQTALSTLAMITLTSLVITAIIYISVGKSKAKPTSKPTIQQT
QQPQNHTSPFFTENNYKSTHTSIQSTTLSQLINIDTTRGTTYGHSTDETQSRKIKSQSALPTTRKPPINPSESNPPENHQ
DHNNSQTLPYEPCSTCEGNLACLSLCQVGPGRAPSRAPTITLKKTPKPKTTKRPIKTTIHHRTSPEAKLQPKNNTAAPQQ
GILSSPENHTNQSTTQI
>P69351 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITALIYISVGNAKAKPTSKPTIQQT
QRPQNHTSPLFTEHNYKSTHTSIQSTTLSQLLNIDTTRGTTYSHPTDETQNRKIKSQSTLPATRQPPINPSGSNPPENHQ
DHNNSQTLPYVPCSTCEGNLACSSLCQIGLERAPSRAPTITLKKAPKPKTTKKPTKTTIHHRTSPEAKLQPKNNTAAPQQ
GILSSPEHHTNQSTTQI
>P62648 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHPKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTSLAMITLTSLVITAIIYISVGNAKAKPTSKPTTQQT
QQPQNHTPLLPTEHNHKSTHTSTQSTTLSQPPNIDTTSGTTYGHPINRTQNRKIKSQSTPLATRKLPINPLESNPPENHQ
DHNNSQTLPHVPCSTCEGNPACSPLCQIGLERAPSRAPTITLKKAPKPKTTKKPTKTTIYHRTSPEAKLQTKKNTATPQQ
GILSSPEHQTNQSTTQISQHTSI
>P22261 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALSTLAMITLTSLVITAIIYISVGNAKAKPTSKPTIQQT
QQPQNHTSPFFTEHNYKSTHTSIQSTTLSQLLNIDTTRGITYGHSTNETQNRKIKGQSTLPATRKPPINPSGSIPPENHQ
DHNNFQTLPYVPCSTCEGNLACLSLCHIETERAPSRAPTITLKKTPKPKTTKKPTKTTIHHRTSPETKLQPKNNTATPQQ
GILSSTEHHTNQSTTQI
>O09495 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKLKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITAIIYISVGNAKAKPTSKPTIQQT
QRPQNHTSPLFTEHNYKSTHTSIQSTTLSQLLNIDTTRGTTYSHPTDETQNRKIKSQSTLPATRQPPINPSGSNPPENHQ
DHNNSQTLPYVPCSTCEGNLACSSLCQIGLERAPSRAPTITLKKAPKPKTTKKPTKTTIHHRTSPEAKLQPKNNTAAPQQ
GILSSPEHHTNQSTTQI
>O10686 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITAIIYISVGNAKAKPTSKPTIQQT
QQPQNHTSPFFTEHNYKSTHTSIQSTTLSQLPNTDTTRGTTYGHSIDETQNRKIKSQSTLPATRKPPINPSGSNPPENHQ
DHNNSQTLPYVPCSTCEGNLACLSLCQIGPERAPSRAPTITPKKTPKPKITKKPTTTTIHHRTSLKAKLQPKNNTAAPQQ
GILSSPEHHTNQSTTQI
>O10687 ~~~G~~~Major surface glycoprotein G~~~
MSNHTHHLKFKTLKRAWKASKYFIVGLSCLYKFNLKSLVQTALTTLAMITLTSLVITAIIYISVGNAKAKPTSKPTIQQT
QQPQNHTSPFFTEHNYKSTHTSIQGTTLPQLPNTDTTRETTYSHSINETQDRKTKSQSTLPATRKPPINPSGSNPPENHQ
DHNNSQTLPHVPCSTCEGNPACSSLCQIGPERASSRAPTITLKKTPKPKTTKKPTKTTIHRKTSPEAKPQPKNNTAAPQQ
GILSSPEHHTNQPTTQIQQHTSI
>Q6Q304 ~~~~~~Envelope glycoprotein~~~
MLSVAQSSALFLLQAICILYITKLTIPTPVSEINLVRQSDCVCVPIISRSGTDYITCFNNCQIEPINTKLYNSTCTKMVN
ITLVRCNNEVYVMTLPNLVSNRSHSWEVLINYLLRFISAIIVYLLLSISKQGIFLFFSIVHYSFKFIKNKKSCNICGNDF
YFIHIDCPKPDFTKRSDFHMMFYIILFLSLFFVVTHADDNVYNYYEHGDLTEIQLLDKEHYSQDFVSDGFLYNFYVENSH
LIYDISNISTITRPVKHNEVTSTWSCDGSSGCYKDHVGKYNKKPDYVLKKVHDGFSCFFTTATICGTCKSEHIAIGDHVR
VINVKPYIHIVVKTANKTDKIVIDEFNKFIHEPYYIKPITQIHIDQHDFLVTGSKVYQGTFCERPSKSCFGPNYITSDKT
VTLHEPKIRDTFTHDREYIIDYCDYPSNSDLESLELTDMVHHSDKIYSPYDFGLISIGIPKLGYLAGGFCESLVSVKKIE
VYGCYDCQNGVKISVTYESSDSCHTLICKHDSTTHRYFVQQHTTTLNFHSFMSKKDTIIECNQMRKALNLDESSETSVYF
ESNGVKGSAKEPVNFDFIKNLLYIDYKKIIFVFLVAIISIGIFLRSPYMLLSSILKFRKRRKVVATNRSEQLVMDDDVDV
FIGPPS
>Q8AYW1 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQLISFFQDIPIFFEEALNVALAVVTLLAIIKGIVNVWKSGILQLFVFLVLAGRSCSFKVGHHTNFESFTVKLGGVFHE
LPSLCRVNNSYSLIRLSHNSNQALSVEYVDVHPVLCSSSPTILDNYTQCIKGSPEFDWILGWTIKGLGHDFLRDPRICCE
PKKTTNAEFTFQLNLTDSPETHHYRSKIEVGIRHLFGNYITNDSYSKMSVVMRNTTWEGQCSNSHVNTLRFLVKNAGYLV
GRKPLAFFSWSLSDPKGNDMPGGYCLERWMLVAGDLKCFGNTAVAKCNLNHDSEFCDMLRLFDFNKNAIEKLNNQTKTAV
NMLTHSINSLISDNLLMRNKLKEILKVPYCNYTRFWYINHTKSGEHSLPRCWLVSNGSYLNESDFRNEWILESDHLIAEM
LSKEYQDRQGKTPLTLVDLCFWSAIFFTTSLFLHLVGFPTHRHIQGDPCPLPHRLDRNGACRCGRFQKLGKQVTWKRKH
>P20895 ~~~G~~~Major surface glycoprotein G~~~
MSKNKDQRTAKTLEKTWDTLNHLLFISSGLYKLNLKSIAQITLSILAMIISTSLIITAIIFIASANHKVTLTTAIIQDAT
SQIKNTTPTYLTQDPQLGISFSNLSEITSQTTTILASTTPGVKSNLQPTTVKTKNTTTTQTQPSKPTTKQRQNKPPNKPN
NDFHFEVFNFVPCSICSNNPTCWAICKRIPNKKPGKKTTTKPTKKPTFKTTKKDLKPQTTKPKEVPTTKPTEEPTINTTK
TNITTTLLTNNTTGNPKLTSQMETFHSTSSEGNLSPSQVSTTSEHPSQPSSPPNTTRQ
>P20896 ~~~G~~~Major surface glycoprotein G~~~
MSKHKNQRTARTLEKTWDTLNHLIVISSCLYRLNLKSIAQIALSVLAMIISTSLIIAAIIFIISANHKVTLTTVTVQTIK
NHTEKNISTYLTQVPPERVNSSKQPTTTSPIHTNSATISPNTKSETHHTTAQTKGRITTSTQTNKPSTKSRSKNPPKKPK
DDYHFEVFNFVPCSICGNNQLCKSICKTIPSNKPKKKPTIKPTNKPTTKTTNKRDPKTPAKMPKKEIITNPAKKPTLKTT
ERDTSISQSTVLDTITPKYTIQQQSLHSTTSENTPSSTQIPTASEPSTLNPN
>P27025 ~~~G~~~Major surface glycoprotein G~~~
MSKTKDQRTAKTLERTWDTLNHLLFISSCLYKLNLKSIAQITLSILAMIISTSLIIAAIIFIASANHKVTLTTAIIQDAT
SQIKNTTPTYLTQNPQLGISFSNLSETTSQPTTTPAPTTPSAESTPQSTTVKTKNTTTTQIQPSKPTTKQRQNKPPNKPN
NDFHFEVFNFVPCSICSNNPTCWAICKRIPNKKPGKKTTTKPTKKPTIKTTKKDLKPQTTKPKEVLTTKPTEKPTINTTR
TNIRTTLLTTNTTGNPEYTSQKETLHSTSPEGNPSPSQVYTTSEYPSQPPSPSNTTN
>P03423 ~~~G~~~Major surface glycoprotein G~~~
MSKNKDQRTAKTLERTWDTLNHLLFISSCLYKLNLKSVAQITLSILAMIISTSLIIAAIIFIASANHKVTPTTAIIQDAT
SQIKNTTPTYLTQNPQLGISPSNPSEITSQITTILASTTPGVKSTLQSTTVKTKNTTTTQTQPSKPTTKQRQNKPPSKPN
NDFHFEVFNFVPCSICSNNPTCWAICKRIPNKKPGKKTTTKPTKKPTLKTTKKDPKPQTTKSKEVPTTKPTEEPTINTTK
TNIITTLLTSNTTGNPELTSQMETFHSTSSEGNPSPSQVSTTSEYPSQPSSPPNTPRQ
>Q27YE4 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQIITFFQEVPHIIEEVMNIVLITLSLLAILKGVYNVMTCGLIGLISFLLLCGKSCSLIYKDTYNFSSIELDLSHLNMT
LPMSCSRNNSHHYVFFNGSGLEMTFTNDSLLNHKFCNLSDAHKKNLYDHALMGIVTTFHLSIPNFNQYEAMACDFNGGNI
SIQYNLSHNDRTDAMNHCGTVANGVLDAFYRFHWGRNITYIAQLPNGDGTGRWTFCYATSYKYLVIQNISWADHCQMSRP
TPIGFASILSQRIRSIYISRRLMSTFTWSLSDSSGTENPGGYCLTRWMLFAADLKCFGNTAIAKCNLNHDEEFCDMLRLI
DFNKQALKTFKSEVNHGLQLITKAINALINDQLIMKNHLRDLMGIPYCNYSKFWYLNDTRTGRVSLPKCWMISNGTYLNE
THFSDEIEQEADNMITEMLRKEYQERQGKTPLGLVDLFIFSTSFYSITVFLHLIKIPTHRHIVGQGCPKPHRLNSRAICS
CGAYKQPGLPTKWKR
>P26313 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQFISFMQEIPTFLQEALNIALVAVSLIAIIKGVVNLYKSGLFQFFVFLALAGRSCTEEAFKIGLHTEFQTVSFSMVGL
FSNNPHDLPLLCTLNKSHLYIKGGNASFKISFDDIAVLLPEYDVIIQHPADMSWCSKSDDQIWLSQWFMNAVGHDWYLDP
PFLCRNRTKTEGFIFQVNTSKTGINENYAKKFKTGMHHLYREYPDSCLDGKLCLMKAQPTSWPLQCPLDHVNTLHFLTRG
KNIQLPRRSLKAFFSWSLTDSSGKDTPGGYCLEEWMLVAAKMKCFGNTAVAKCNLNHDSEFCDMLRLFDYNKNAIKTLND
ETKKQVNLMGQTINALISDNLLMKNKIRELMSVPYCNYTKFWYVNHTLSGQHSLPRCWLIKNNSYLNISDFRNDWILESD
FLISEMLSKEYSDRQGKTPLTLVDICFWSTVFFTASLFLHLVGIPTHRHIRGEACPLPHRLNSLGGCRCGKYPNLKKPTV
WRRGH
>P08669 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQIVTFFQEVPHVIEEVMNIVLIALSVLAVLKGLYNFATCGLVGLVTFLLLCGRSCTTSLYKGVYELQTLELNMETLNM
TMPLSCTKNNSHHYIMVGNETGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQYEAMSCDFNGG
KISVQYNLSHSYAGDAANHCGTVANGVLQTFMRMAWGGSYIALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIG
YLGLLSQRTRDIYISRRLLGTFTWTLSDSEGKDTPGGYCLTRWMLIEAELKCFGNTAVAKCNEKHDEEFCDMLRLFDFNK
QAIQRLKAEAQMSIQLINKAVNALINDQLIMKNHLRDIMGIPYCNYSKYWYLNHTTTGRTSLPKCWLVSNGSYLNETHFS
DDIEQQADNMITEMLQKEYMERQGKTPLGLVDLFVFSTSFYLISIFLHLVKIPTHRHIVGKSCPKPHRLNHMGICSCGLY
KQPGVPVKWKR
>Q8B121 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQVIGFFQSLPEIINEALNIALICVALLATIKGMVNIWKSGLIQLLFFLTLAGRSCSHSFTIGRFHEFQSVTVNFTQFM
SYAPSSCSVNNTHHYFKGPQNTTWGLELTLTNESMINITNSMRVFTNIHHNVTNCVQNISEHEGVLKWLLETMHLSISKP
GKHIAPVMCERQKGLLIEYNLTMTKDHHPNYWNQVLYGLAKLLGSSKRLWFGACNKADCQMQSDHQHIKCNYSNCKGYTS
FKYLIIQNTTWENHCEYNHLNTIHLLMSPIGQSFITRRLQAFLTWTLSDALGNDLPGGYCLEQWAVVWFGIKCFDNTAMA
KCNQNHDSEFCDMLRLFDYNRNAIQSLNDQSQARLNLLTNTINSLVSDNLLMKNKLRELMNVPYCNYTRFWFINDTKNGR
HTLPQCWLVSDGSYLNETRFRTQWLSESNSLYTEMLTEEYEKRQGRTPLSLVDLCFWSTLFYISTLFAHLVGFPTHRHLI
GEGCPKPHRLTGSGICSCGHYGIPGKPVRWTKMSR
>P09991 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQIVTMFEALPHIIDEVINIVIIVLIVITGIKAVYNFATCGIFALISFLLLAGRSCGMYGLKGPDIYKGVYQFKSVEFD
MSHLNLTMPNACSANNSHHYISMGTSGLELTFTNDSIISHNFCNLTSAFNKKTFDHTLMSIVSSLHLSIRGNSNYKAVSC
DFNNGITIQYNLTFSDAQSAQSQCRTFRGRVLDMFRTAFGGKYMRSGWGWTGSDGKTTWCSQTSYQYLIIQNRTWENHCT
YAGPFGMSRILLSQEKTKFFTRRLAGTFTWTLSDSSGVENPGGYCLTKWMILAAELKCFGNTAVAKCNVNHDAEFCDMLR
LIDYNKAALSKFKEDVESALHLFKTTVNSLISDQLLMRNHLRDLMGVPYCNYSKFWYLEHAKTGETSVPKCWLVTNGSYL
NETHFSDQIEQEADNMITEMLRKDYIKRQGSTPLALMDLLMFSTSAYLVSIFLHLVKIPTHRHIKGGSCPKPHRLTNKGI
CSCGAFKVPGVKTVWKRR
>P07399 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQIVTMFEALPHIIDEVINIVIIVLIIITSIKAVYNFATCGILALVSFLFLAGRSCGMYGLNGPDIYKGVYQFKSVEFD
MSHLNLTMPNACSVNNSHHYISMGSSGLEPTFTNDSILNHNFCNLTSALNKKSFDHTLMSIVSSLHLSIRGNSNYKAVSC
DFNNGITIQYNLSSSDPQSAMSQCRTFRGRVLDMFRTAFGGKYMRSGWGWTGSDGKTTWCSQTSYQYLIIQNRTWENHCR
YAGPFGMSRILFAQEKTKFLTRRLSGTFTWTLSDSSGVENPGGYCLTKWMILAAELKCFGNTAVAKCNVNHDEEFCDMLR
LIDYNKAALSKFKQDVESALHVFKTTLNSLISDQLLMRNHLRDLMGVPYCNYSKFWYLEHAKTGETSVPKCWLVTNGSYL
NETHFSDQIEQEADNMITEMLRKDYIKRQGSTPLALMDLLMFSTSAYLISIFLHFVRIPTHRHIKGGSCPKPHRLTNKGI
CSCGAFKVPGVKTIWKRR
>Q6IUF7 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQLISFFQEIPVFLQEALNIALVAVSLIAVIKGIINLYKSGLFQFIFFLLLAGRSCSDGTFKIGLHTEFQSVTLTMQRL
LANHSNELPSLCMLNNSFYYMKGGVNTFLIRVSDISVLMKEHDVSIYEPEDLGNCLNKSDSSWAIHWFSNALGHDWLMDP
PMLCRNKTKKEGSNIQFNISKADDVRVYGKKIRNGMRHLFRGFHDPCEEGKKCYLTINQCGDPSSFDYCGMDHLSKCQFD
HVNTLHFLVRSKTHLNFERSLKAFFSWSLTDSSGKDMPGGYCLEEWMLIAAKMKCFGNTAVAKCNQNHDSEFCDMLRLFD
YNKNAIKTLNDESKKEINLLSQTVNALISDNLLMKNKIKELMSIPYCNYTKFWYVNHTLTGQHTLPRCWLIRNGSYLNTS
EFRNDWILESDHLISEMLSKEYAERQGKTPITLVDICFWSTVFFTASLFLHLVGIPTHRHLKGEACPLPHKLDSFGGCRC
GKYPRLRKPTIWHKRH
>Q2A069 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQIVTFFQEVPHIIEEVMNIVLITLSLLAILKGIYNVMTCGLIGLLTFLFLCGKSCSTIYKDNYRLMQLNLDMSGLNAT
MPLSCSKNNSHHYIQVFNTTGLELTLTNDSLIGHKWCNLSDAHKKDTYDHTLMSIISTFHLSIPNFNHYEAMACDFNGGK
ISIQYNLSHSSETDAMNHCGTVANGVLEVFRRMTWCTHCDTPLGASIAGFNCVRTSYKYLIIQNTTWEDHCTMSRPSPMG
YLSLLSQRAREIYISRRLMGTFTWTLSDSEGNDLPGGYCLQRWMLIEAEMKCFGNTAVAKCNQQHDEEFCDMLRLFDFNK
EAIHRLRVEAEKSISLINKAVNSLINDQLIMRNHLRDIMGIPYCNYSRFWYLNDTRSGRTSLPKCWMVSNGSYLNETHFS
SDIEQEANNMITEMLRKEYERRQGTTPLGLVDLFVFSTSFYLISVFLHLIKIPTHRHLVGKPCPKPHRLNHMGVCSCGLY
KQPGLPTKWKR
>Q84168 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQVIGFFQSLPNIINEALNIALICVALIAILKGIVNIWKSGLIQLFIFLILAGRSCSHTFQIGRNHEFQSITLNFTQFL
GYAPSSCSVNNTHHYFRGPGNVSWGIELTLTNNSVINASNSLKVFTNIHHNITNCVQNIDEQDHLMKWLIETMHLQIMKP
GKRLPPILCEKDKGLLIEYNLTNIASREEKHSEYWSQLLYGLSKLLGSSKSLWFDYCQRADCMMQEHSSHLKCNYSECSG
HTTFKYLILQNTTWENHCEFNHLNTIHLLMSSTGQSFITRRLQAFLTWTLSDATGNDLPGGYCLEQWAIVWAGIKCFGNT
AVAKCNQNHDSEFCDMLRLFDYNRNAIKSLNDQSQSRLNLLTNTINSLISDNLLMKNKLAEIMNIPYCNYTKFWYINDTR
TGRHTLPQCWLISNGSYLNETKFRTQWLSESNALYTEMLTEDYDKRQGSTPLSLVDLCFWSTLFYVTTLFAHLVGFPTHR
HILDGPCPKPHRLTKKGICSCGHFGIPGKPVRWVKRSR
>Q85429 ~~~pc2~~~Putative envelope glycoprotein~~~
MHFKSYFIYTTIFNMAWGAPIPFPDTHSWMRNREREPSEIVKVPCSARAPPCKLTYELNGYFIENGLICYNRASVNYFET
CYTGNYDYKLPLHPSFSKFGGHVYLSCDDAILQNVSLVGIQQTEYTSSPLLITNSNSEKISYSNLKTGFLGIVYAVETRA
CIQPDQAKKPEEIINHGVAIKPSCTDGVLYYINSACEVNVSDQTFSIPSCESVKLPTYDDTIEVCDKGGCQNVTCHPGEI
CDKYERMDMIMRIKNYQCSHIYRYSLYSIILFFVIVIVFTLITIMNILFFLKPAFWLLKKVLYSMVGLCHRRPVVDEVSV
DMSTVRVVDEAEEGLLVVEDSIAPNTNVSDKVKRKGRKVENGLIFIPYVLMILLLVCSAESCQDLVSSISNIERCTNNSC
DFISKMKLTLLNTPQDFCFKTSTDVYKIRFNSVRVMCLSVPLYYTNSFKRVISREEWKCFEGEGCRTDGTHSIWGESTSL
SFDYCVTDFHIFSYCPAYHYNWKRIEYEPTSSRACTIMKCMDTKFEIVGYIQKNGHVLKELGGITSKYDSPLVSISLSNY
NSARMPREYAECDGKAYLRTANDLGSFDKELLGNIQCPTKEDAVVLSSKCKTKILSNEDLPVIRYIERDGVDMLEHVKSE
PLKDVLVSSSGISLSTLDLFPVELNLQFKEAITSIITSKISLNGTSCKITGIERKFKKTTVSIESSNKVYLSDILACEGL
AVCPMILNNIKKGTCITTTYYSVTVGSMIKCKFIYSGDTLMCKYDVSPLEITVISPSLDVSSFEAVKTSTTNWMELLAGI
VKDNPKLSLVASIIPIGLILKTIRSFLDDIRQVD
>Q90037 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQLFSFFEEVPNIIHEAINIALIAVSLIAALKGMINLWKSGLFQLIFFLTLAGRSCSFRIGRSTELQNITFDMLKVFED
HPTSCMVNHSTYYVHENKNATWCLEVSVTDVTLLMAEHDRQVLNNLSNCVHPAVEHRSRMVGLLEWIFRALKYDFNHDPT
PLCQKQTSTVNETRVQINITEGFGSHGFEDTILQRLGVLFGSRIAFSNIQDLGKKRFLLIRNSTWKNQCEMNHVNSMHLM
LANAGRSSGSRRPLGIFSWTITDAVGNDMPGGYCLERWMLVTSDLKCFGNTALAKCNLDHDSEFCDMLKLFEFNKKAIET
LNDNTKNKVNLLTHSINALISDNLLMKNRLKELLNTPYCNYTKFWYVNHTASGEHSLPRCWLVRNNSYLNESEFRNDWII
ESDHLLSEMLNKEYIDRQGKTPLTLVDICFWSTLFFTTTLFLHLVGFPTHRHIRGEPCPLPHRLNSRGGCRCGKYPELKK
PITWHKNH
>Q911P0 ~~~GPC~~~Pre-glycoprotein polyprotein GP complex~~~
MGQLISFFGEIPSIIHEALNIALIAVSIISILKGVINIWGSGLLQFIVFLLLAGRSCSYKIGHHVELQHIILNASYITPY
VPMPCMINDTHFLLRGPFEASWAIKLEITDVTTLVVDTDNVANPTNISKCFANNQDERLLGFTMEWFLSGLEHDHHFTPQ
IICGNVSKGEVNAQVNITMEDHCSQVFLKMRRIFGVFKNPCTSHGKQNVLISVSNWTNQCSGNHLSSMHLIVQNAYKQMI
KSRTLKSFFAWSLSDATGTDMPGGYCLEKWMLISSELKCFGNTAIAKCNLDHSSEFCDMLKLFEFNRNAIKTLQNDSKHQ
LDMIITAVNSLISDNTLMKNRLKELINIPYCNYTKFWYVNHTGFNVHSLPRCWLTKNGSYLNVSDFRNQWLLESDHLISE
ILSREYEARQGKTPLGLVDVCFWSTLFYVSSIFLHLLRIPTHRHIIGEGCPKPHRLSSNSVCACGLFKQKGRPLRWAGKV
>A0A0S0MVI5 ~~~~~~Glycinyltransferase~~~
MSRNYEKLSIEEFGAHLLGTVDLDPIYLALRRMELPEAQLNRWLLAYWCLYNGGEASYLSEFEGREFFEMLNHAAENVRE
APIGGRWPRGAERRHWRGAQATSSVEYLIDRYDDRPEDMAAYCAGQGGTFLEVTKRVQEHRLFGPWIGFKVADMVDRVLG
KPVSFDNAAVFMFKDPYKAACIQYEVNPNIPDHVLADGSVAPRNRELVTPETVHHVAQHLIEHFKGFQAPPLGDRPVNIQ
EVETILCKWKSHQNGHYPLFKDIVEIREAALPWAKVSKTAQAFFEAMPEVTQ
>P0DTK6 ~~~~~~Glycinyltransferase~~~
MSRNYPRLDIETFGRHLITTGDLDPIYTALVRAEEAGDFSVPQLCRWLLGYWCYYHAGVASFLSEKEGEEFWHWMMVAAR
NEEETPAGGRWPRGHERRHYRAKIAVDSVTSLQARYGDRPENMALYVGARATEDERLPFRTVSARAQEHNGFGPWIGFKI
ADMMDRVMEVPVDFDNAAVFMFKDPEKAAMMLWEQREAHKYPENAKPKREAILSGVADYLIGRFADLAAPPLSDRPVNIQ
EVETVLCKWKSHMNGHYPLWNDIREINGGLEPWAGRCSAARAFLNHMPKEQ
>Q1HVF6 ~~~gL~~~Envelope glycoprotein L~~~
MRTVGVFLATCLVTIFVLPTWGNWAYPCCHVTQLRAQHLLALENISDIYLVSNQTCDGFSLASLNSPKNGSNQLVISRCA
NGLNVVSFFISILKRSSSALTGHLRELLTTLETLYGSFSVEDLFGANLNRYAWHRGG
>P03212 ~~~gL~~~Envelope glycoprotein L~~~
MRAVGVFLAICLVTIFVLPTWGNWAYPCCHVTQLRAQHLLALENISDIYLVSNQTCDGFSLASLNSPKNGSNQLVISRCA
NGLNVVSFFISILKRSSSALTGHLRELLTTLETLYGSFSVEDLFGANLNRYAWHRGG
>Q3KSS3 ~~~gL~~~Envelope glycoprotein L~~~
MRAVGVFLATCLVTIFVLPTWGNWAYPCCHVTQLRAQHLLALENISDIYLVSNQTCDGFSLASLNSPKNGSNQLVISRCA
NGLNVVSFFISILKRSSSALTSHLRELLTTLESLYGSFSVEDLFGANLNRYAWHRGG
>Q68674 ~~~gL~~~Envelope glycoprotein L~~~
MCRRPDCGFSFSPGPVILLWCCLLLPIVSSAAVSVAPTAAEKVPAECPELTRRCLLGEVFEGDKYESWLRPLVNVTGRDG
PLSQLIRYRPVTPEAANSVLLDEAFLDTLALLYNNPDQLRALLTLLSSDTAPRWMTVMRGYSECGDGSPAVYTCVDDLCR
GYDLTRLSYGRSIFTEHVLGFELVPPSLFNVVVAIRNEATRTNRAVRLPVSTAAAPEGITLFYGLYNAVKEFCLRHQLDP
PLLRHLDKYYAGLPPELKQTRVNLPAHSRYGPQAVDAR
>P16832 ~~~gL~~~Envelope glycoprotein L~~~
MCRRPDCGFSFSPGPVVLLWCCLLLPIVSSVAVSVAPTAAEKVPAECPELTRRCLLGEVFQGDKYESWLRPLVNVTRRDG
PLSQLIRYRPVTPEAANSVLLDDAFLDTLALLYNNPDQLRALLTLLSSDTAPRWMTVMRGYSECGDGSPAVYTCVDDLCR
GYDLTRLSYGRSIFTEHVLGFELVPPSLFNVVVAIRNEATRTNRAVRLPVSTAAAPEGITLFYGLYNAVKEFCLRHQLDP
PLLRHLDKYYAGLPPELKQTRVNLPAHSRYGPQAVDAR
>F5HCH8 ~~~gL~~~Envelope glycoprotein L~~~
MCRRPDCGFSFSPGPVILLWCCLLLPIVSSAAVSVAPTAAEKVPAECPELTRRCLLGEVFEGDKYESWLRPLVNVTGRDG
PLSQLIRYRPVTPEAANSVLLDEAFLDTLALLYNNPDQLRALLTLLSSDTAPRWMTVMRGYSECGDGSPAVYTCVDDLCR
GYDLTRLSYGRSIFTEHVLGFELVPPSLFNVVVAIRNEATRTNRAVRLPVSTAAAPEGITLFYGLYNAVKEFCLRHQLDP
PLLRHLDKYYAGLPPELKQTRVNLPAHSRYGPQAVDAR
>P10185 ~~~gL~~~Envelope glycoprotein L~~~
MGILGWVGLIAVGVLCVRGGLPSTEYVIRSRVAREVGDILKVPCVPLPSDDLDWRYETPSAINYALIDGIFLRYHCPGLD
TVLWDRHAQKAYWVNPFLFVAGFLEDLSYPAFPANTQETETRLALYKEIRQALDSRKQAASHTPVKAGCVNFDYSRTRRC
VGRQDLGPTNGTSGRTPVLPPDDEAGLQPKPLTTPPPIIATSDPTPRRDAATKSRRRRPHSRRL
>Q96912 ~~~gL~~~Envelope glycoprotein L~~~
MGILGWVGLIAVGVLCVRGGLSSTEYVIRSRVAREVGDILKVPCVPLPSDDLDWRYETPSAINYALIDGIFLRYHCPGLD
TVLWDRHAQKAYWVNPFLFVAGFLEDLSHPAFPANTQETETRLALYKEIRQALDSRKQAASHTPVKAGCVNFDYSRTRRC
VGRQDLGPTNGTSGRTPVLPPDDEAGLQPKPLTTPPPIIATSDPTPRRDAATKSRRRRPHSRRL
>P28278 ~~~gL~~~Envelope glycoprotein L~~~
MGFVCLFGLVVMGAWGAWGGSQATEYVLRSVIAKEVGDILRVPCMRTPADDVSWRYEAPSVIDYARIDGIFLRYHCPGLD
TFLWDRHAQRAYLVNPFLFAAGFLEDLSHSVFPADTQETTTRRALYKEIRDALGSRKQAVSHAPVRAGCVNFDYSRTRRC
VGRRDLRPANTTSTWEPPVSSDDEASSQSKPLATQPPVLALSNAPPRRVSPTRGRRRHTRLRRN
>P52508 ~~~gL~~~Envelope glycoprotein L~~~
MELLLFVMSLILLTFSKAIPLFNHNSFYFEKLDDCIAAVINCTKSEVPLLLEPIYQPPAYNEDVMSILLQPPTKKKPFSR
IMVTDEFLSDFLLLQDNPEQLRTLFALIRDPESRDNWLNFFNGFQTCSPSVGITTCIRDNCRKYSPEKITYVNNFFVDNI
AGLEFNISENTDSFYSNIGFLLYLENPAKGVTKIIRFPFNSLTLFDTILNCLKYFHLKTGVELDLLKHMETYNSKLPFRS
SRPTILIRNT
>F5HDB7 ~~~gL~~~Envelope glycoprotein L~~~
MGIFALFAVLWTTLLVTSHAYVALPCCAIQASAASTLPLFFAVHSIHFADPNHCNGVCIAKLRSKTGDITVETCVNGFNL
RSFLVAVVRRLGSWASQENLRLLWYLQRSLTAYTVGFNATTADSSIHNVNIIIISVGKAMNRTGSVSGSQTRAKSSSRRA
HAGQKGK
>P52511 ~~~gL~~~Envelope glycoprotein L~~~
MSPLVAVLVFFSAALGVPGPGVAGNPRGLDAIFEAPVTPAPPTRHPRREELEWDDEDHPLLDLEPPVGSRCHPYIAYSLP
PDMNAVTSVVVKPYCSPPEVILWASGTAYLVNPFVAIQALAVGEPLNEAALKELGEVAVHKDSLPPLRYNGGPPAE
>P52512 ~~~gL~~~Envelope glycoprotein L~~~
MSPLVAVLVFFSAALGVPGTGVAGNPHGLDAIFEPPVTPAPPTRAPRREELEWDDEDHPLLGLEPPVGSRCHPYIAYSLP
PDMTAVTSVVVKPYCSPPEVILWASGTAYLVNPFVAIQALAIGEPLNEAALKELGEVAVHKDSLPPLRYNGGPPAE
>Q9J3N1 ~~~gL~~~Envelope glycoprotein L~~~
MASHKWLLQMIVFLKTITIAYCLHLQDDTPLFFGAKPLSDVSLIITEPCVSSVYEAWDYAAPPVSNLSEALSGIVVKTKC
PVPEVILWFKDKQMAYWTNPYVTLKGLTQSVGEEHKSGDIRDALLDALSGVWVDSTPSSTNIPENGCVWGADRLFQRVCQ
>P52370 ~~~gM~~~Envelope glycoprotein M~~~
MAGSAQPAAVHWRLWLAQVGVFAGLALLLLITLIGAASPGAGLPCFYAAIVNYNARNLSADGGAWAQRELGARHPALFLE
TPTTAAFSAYTAVVLLAVAAFDVAAAIIIRRENSGGFAAAYHMNALATLATPPGALLLGALAAWTLQAAVLLLSHKIMVL
AAATYLAHLAPPAAFVGLFCTAGLPGAEYAQAVHALRERSPRAHRLLGPGRAVMINLAGGLLALIIGTAPLMLGQLLGAG
LGLSLAQTVVAGVTVFCLAAVLFLVLTELVLSRYTQVLPGPAFGTLVAASCIAVASHDYFHQLRGVVRTQAPRAAARVKL
ALAGVALLAVAMLVLRLVRACLHHRRKGSAFYGHVSAARQQAARYIARARSSRGMAPLEGDAAALLDRGVASDDEEAVYE
AHAPPRPPTIPLRRPEVPHSRASHPRPPPRSPPPAHVK
>P03215 ~~~gM~~~Envelope glycoprotein M~~~
MKSSKNDTFVYRTWVKTLVVYFVMFVMSAVVPITAMFPNLGYPCYFNALVDYGALNLTNYNLAHHLTPTLYLEPPEMFVY
ITLVFIADCVAFIYYACGEVALIKARKKVSGLTDLSAWVSAVGSPTVLFLAILKLWSIQVFIQVLSYKHVFLSAFVYFLH
FLASVLHACACVTRFSPVWVVKAQDNSIPQDTFLWWVVFYLKPVVTNLYLGCLALETLVFSLSVFLALGNSFYFMVGDMV
LGAVNLFLILPIFWYILTEVWLASFLRHNFGFYCGMFIASIILILPLVRYEAVFVSAKLHTTVAINVAIIPILCSVAMLI
RICRIFKSMRQGTDYVPVSETVELELESEPRPRPSRTPSPGRNRRRSSTSSSSSRSTRRQRPVSTQALVSSVLPMTTDSE
EEIFP
>P28948 ~~~gM~~~Envelope glycoprotein M~~~
MARRGAAVAEEPLLPSSGIVGIGPIEGINWRTWLVQVFCFALTTSVLFITLVTASLPQTGYPCFYGSLVDYTQKNHSVVD
GVWMRQIAGGVAPTLFLETTSLVAFLYYTTLVLVAISFYLIISAVLVRRYARGKECTAVAGCTRPTTTLIASHVTLVLGT
LATWLLQVVILLLSHKQAVLGAAVYVVHFVSLVFFCMSFSGLGTASAQYSSNLRILKTNLPALHKMAGPGRAVMTNLGMG
MLGISLPILSLMLGIILANSFHITLWQTVTVAVGVFVALGLMFLIIVELIVSHYVHVLVGPALAVLVASSTLAVATHSYF
VHFHAMVSVQAPNLATASKAIVGIMAVISIIMLVVRLVRAIMFHKKRNTEFYGRVKTVSSKARRYANKVRGPRRNPQPLN
VAESRGMLLAEDSETDAEEPIYDVVSEEFETEYYDDPQRVPERSHRREYR
>P16733 ~~~gM~~~Envelope glycoprotein M~~~
MAPSHVDKVNTRTWSASIVFMVLTFVNVSVHLVLSNFPHLGYPCVYYHVVDFERLNMSAYNVMHLHTPMLFLDSVQLVCY
AVFMQLVFLAVTIYYLVCWIKISMRKDKGMSLNQSTRDISYMGDSLTAFLFILSMDTFQLFTLTMSFRLPSMIAFMAAVH
FFCLTIFNVSMVTQYRSYKRSLFFFSRLHPKLKGTVQFRTLIVNLVEVALGFNTTVVAMALCYGFGNNFFVRTGHMVLAV
FVVYAIISIIYFLLIEAVFFQYVKVQFGYHLGAFFGLCGLIYPIVQYDTFLSNEYRTGISWSFGMLFFIWAMFTTCRAVR
YFRGRGSGSVKYQALATASGEEVAVLSHHDSLESRRLREEEDDDDDEDFEDA
>P04288 ~~~gM~~~Envelope glycoprotein M~~~
MGRPAPRGSPDSAPPTKGMTGARTAWWVWCVQVATFVVSAVCVTGLLVLASVFRARFPCFYATASSYAGVNSTAEVRGGV
AVPLRLDTQSLVGTYVITAVLLLAVAVYAVVGAVTSRYDRALDAGRRLAAARMAMPHATLIAGNVCSWLLQITVLLLAHR
ISQLAHLVYVLHFACLVYFAAHFCTRGVLSGTYLRQVHGLMELAPTHHRVVGPARAVLTNALLLGVFLCTADAAVSLNTI
AAFNFNFSAPGMLICLTVLFAILVVSLLLVVEGVLCHYVRVLVGPHLGAVAATGIVGLACEHYYTNGYYVVETQWPGAQT
GVRVALALVAAFALGMAVLRCTRAYLYHRRHHTKFFMRMRDTRHRAHSALKRVRSSMRGSRDGRHRPAPGSPPGIPEYAE
DPYAISYGGQLDRYGDSDGEPIYDEVADDQTDVLYAKIQHPRHLPDDDPIYDTVGGYDPEPAEDPVYSTVRRW
>F5HDD0 ~~~gM~~~Envelope glycoprotein M~~~
MRASKSDRFLMSSWVKLLFVAVIMYICSAVVPMAATYEGLGFPCYFNNLVNYSALNLTVRNSAKHLTPTLFLEKPEMLVY
IFWTFIVDGIAIVYYCLAAVAVYRAKHVHATTMMSMQSWIALLGSHSVLYVAILRMWSMQLFIHVLSYKHVLMAAFVYCI
HFCISFAHIQSLITCNSAQWEIPLLEQHVPDNTMMESLLTRWKPVCVNLYLSTTALEMLLFSLSTMMAVGNSFYVLVSDA
IFGAVNMFLALTVVWYINTEFFLVKFMRRQVGFYVGVFVGYLILLLPVIRYENAFVQANLHYIVAINISCIPILCILAIV
IRVIRSDWGLCTPSAAYMPLATSAPTVDRTPTVHQKPPPLPAKTRARAKVKDISTPAPRTQYQSDHESDSEIDETQMIFI
>Q85041 ~~~gM~~~Envelope glycoprotein M~~~
MCGPRNAEAVSWRSWLIEVCGFALAALTLVLTLIFASLPEMGFPCFYATVADYDTLNDTSGGVWTRQPLVAPALFLETPT
VTSFFGFTATVLLAHALYAVAGAVVLRREAGRLAFQPSVVLYAASTVAAPGTLMLGALCAWTLQAVVLLMAHKQAGLAAA
AYITHFVFLALFGACHACKGTGDVRAALAASPPLRRVAVHARAVVTNVVLGAVGLGAAVVGLMLGVLLANSFHISLWKTA
EAALAVFTLLALALMVFVEVVVSGYVQVLPTPAFCVLVASAAFGVSAHRYFAKFSEALGETHGVVIGTRAVLAVLSLIAL
AMIVVRLVRACIAHRARGSRFYANVDKARTTARRYLQKRLHGRGNDEYLLAPGSGDDEFDDGDEVVYENLGFE
>Q77NP2 ~~~gM~~~Envelope glycoprotein M~~~
MGTQKKGPRSEKVSPYDTTTPEVEALDHQMDTLNWRIWIIQVMMFTLGAVMLLATLIAASSEYTGIPCFYAAVVDYELFN
ATLDGGVWSGNRGGYSAPVLFLEPHSVVAFTYYTALTAMAMAVYTLITAAIIHRETKNQRVRQSSGVAWLVVDPTTLFWG
LLSLWLLNAVVLLLAYKQIGVAATLYLGHFATSVIFTTYFCGRGKLDETNIKAVANLRQQSVFLYRLAGPTRAVFVNLMA
ALMAICILFVSLMLELVVANHLHTGLWSSVSVAMSTFSTLSVVYLIVSELILAHYIHVLIGPSLGTLVACATLGTAAHSY
MDRLYDPISVQSPRLIPTTRGTLACLAVFSVVMLLLRLMRAYVYHRQKRSRFYGAVRRVPERVRGYIRKVKPAHRNSRRT
NYPSQGYGYVYENDSTYETDREDELLYERSNSGWE
>Q77CE4 ~~~gN~~~Envelope glycoprotein N~~~
MPRSPLIVAVVAAALFAIVRGRDPLLDAMRREGAMDFWSAGCYARGVPLSEPPQALVVFYVALTAVMVAVALYAYGLCFR
LMGASGPNKKESRGRG
>P03196 ~~~gN~~~Envelope glycoprotein N~~~
MGKVLRKPFAKAVPLLFLAATWLLTGVLPAGASSPTNAAAASLTEAQDQFYSYTCNADTFSPSLTSFASIWALLTLVLVI
IASAIYLMYVCFNKFVNTLLTD
>P28980 ~~~gN~~~Envelope glycoprotein N~~~
MLSTRFVTLAILACLLVVLGLARGAGGDPGVKQRIDVAREEERRDFWHAACSGHGFPITTPSTAAILFYVSLLAVGVAVA
CQAYRAVLRIVTLEMLQHLH
>P16795 ~~~gN~~~Envelope glycoprotein N~~~
MEWNTLVLGLLVLSVVAESSGNNSSTSTSATTSKSSASVSTTKLTTVATTSATTTTTTTLSTTSTKLSSTTHDPNVMRRH
ANDDFYKAHCTSHMYELSLSSFAAWWTMLNALILMGAFCIVLRHCCFQNFTATTTKGY
>O09800 ~~~gN~~~Envelope glycoprotein N~~~
MGPPRRVCRAGLLFVLLVALAAGDAGPRGEPPGEEGGRDGIGGARCETQNTGQMSAPGALVPFYVGMASMGVCIIAHVCQ
ICQRLLAAGHA
>F5HFQ0 ~~~gN~~~Envelope glycoprotein N~~~
MTASTVALALFVASILGHCWVTANSTGVASSTERSSPSTAGLSARPSPGPTSVTTPGFYDVACSADSFSPSLSSFSSVWA
LINALLVVVATFFYLVYLCFFKFVDEVVHA
>Q87088 ~~~gN~~~Envelope glycoprotein N~~~
MVSSAGLSLTLVAALCALVAPALSSIVSTEGPLPLLREESRINFWNAACAARGVPVDQPTAAAVTFYICLLAVLVVALGY
ATRTCTRMLHASPAGRRV
>P0C764 ~~~gN~~~Envelope glycoprotein N~~~
MGSITASFILITMQILFFCEDSSGEPNFAERNFWHASCSARGVYIDGSMITTLFFYASLLGVCVALISLAYHACFRLFTR
SVLRSTW
>P16750 ~~~GO~~~Glycoprotein O~~~
MGRKEMMVRDVPKMVFLISISFLLVSFINCKVMSKALYNRPWRGLVLSKIGKYKLDQLKLEILRQLETTISTKYNVSKQP
VKNLTMNMTEFPQYYILAGPIQNYSITYLWFDFYSTQLRKPAKYVYSQYNHTAKTITFRPPPCGTVPSMTCLSEMLNVSK
RNDTGEQGCGNFTTFNPMFFNVPRWNTKLYVGPTKVNVDSQTIYFLGLTALLLRYAQRNCTHSFYLVNAMSRNLFRVPKY
INGTKLKNTMRKLKRKQAPVKEQFEKKAKKTQSTTTPYFSYTTSAALNVTTNVTYSITTAARRVSTSTIAYRPDSSFMKS
IMATQLRDLATWVYTTLRYRQNPFCEPSRNRTAVSEFMKNTHVLIRNETPYTIYGTLDMSSLYYNETMFVENKTASDSNK
TTPTSPSMGFQRTFIDPLWDYLDSLLFLDEIRNFSLRSPTYVNLTPPEHRRAVNLSTLNSLWWWLQ
>F5HGP1 ~~~~~~Envelope glycoprotein O~~~
MGKKEMIMVKGIPKIMLLISITFLLLSLINCNVLVNSRGTRRSWPYTVLSYRGKEILKKQKEDILKRLMSTSSDGYRFLM
YPSQQKFHAIVISMDKFPQDYILAGPIRNDSITHMWFDFYSTQLRKPAKYVYSEYNHTAHKITLRPPPCGTVPSMNCLSE
MLNVSKRNDTGEKGCGNFTTFNPMFFNVPRWNTKLYIGSNKVNVDSQTIYFLGLTALLLRYAQRNCTRSFYLVNAMSRNL
FRVPKYINGTKLKNTMRKLKRKQALVKEQPQKKNKKSQSTTTPYLSYTTSTAFNVTTNVTYSATAAVTRVATSTTGYRPD
SNFMKSIMATQLRDLATWVYTTLRYRNEPFCKPDRNRTAVSEFMKNTHVLIRNETPYTIYGTLDMSSLYYNETMSVENET
ASDNNETTPTSPSTRFQRTFIDPLWDYLDSLLFLDKIRNFSLQLPAYGNLTPPEHRRAANLSTLNSLWWWSQ
>P30005 ~~~~~~120 kDa Glycoprotein O~~~
MHLEVIVQSYKKSKYYFSHTFYLYKFIVVNSPDMLHISQLGLFLGLFAIVMHSANLIKYTSDPLEAFKTVNRHNWSDEQR
EHFYDLRNLYTSFCQTNLSLDCFTQILTNVFSWDIRDSQCKSAVNLSPLQNLPRTETKIVLSSTAANKSIIASSFSLFYL
LFATLSTYTADPPCVELLPFKILGAQLFDIKLTEESLRMAMSKFSNSNLTRSLTSFTSEIFFNYTSFVYFLLYNTTSCVP
SNDQYFKQSPKPINVTTSFGRTIVNFDSILTTTPSSTSASLTSPHIPSTNIPTPEPPPVTKNSTKLHTDTIKVTPNTPTI
TTQTTESIKKIVKRSDFPRPMYTPTDIPTLTIRLNATIKTEQNTENPKSPPKPTNFENTTIRIPKTFESATVTTNATQKI
ESTTFTTIGIKEINGNTYSSPKNSIYLKSKSQQSTTKFTDAEHTTPILKFTTWQNTARTYMSHNTEVQNMTDRFQRTTLK
SSNELPTIQTLSVTPKQKLPSNVTAKTEVHITNNALPSSNSSYSITEVTKEVKHTRMSASTHEEINHTEIAQITPILNAH
TSEKSTTPQRSFTAETFLTTSSKPAILTWSNLLSTTPKEPLTNTSLRWTDHITTQLTTSNRTQSAKLTKANISSQTTNIY
PQTITGRSTEV
>P52549 ~~~~~~130 kDa Glycoprotein O~~~
MLHISRLGLFLALFAIVMHSVNLIKYTSDPLEAFKTVNRHNWSDEQREHFYDLRNLYTTFCQRNLSLDCFTQILTNVFSW
NIRDLQCKSAVNLSPLQNLPRAETKIVLSSTAANKSIVASSFSLFYLLFATLSTYTADPPCVELLPFKILGTQLFDIKLT
DESLQMAISKFSNSNLTRSLTPFTPEIFFNYTSFVYFLLYNTTSCIRSNDQYFEHSPKPINVTTSFGRAIVNFHSILTTT
PSSTPSSTSASITSPHIPSTNTPTPEPSPVTKNFTELQTDTIKVTPNTPTITAQTTESIKKVVKRSDFPRPMYTPTDIPT
LTIRRNATIKTEQNTENPTENPKSPPKPTNFENTTIRIPETFESTTVATNTTQKLESTTFATTIGIEEISDNIYSSPKNS
IYLKSKSQQSTTKFTDTEHTTPILKFTTWQDAARTYMSHNTEVQNMTENFIKISLGETMGITPKEPTNPTQLLNVKNQTE
YANETHSTEVQTVKTFKEDRFQRTTLKSSSEPPTVQTLSVTPKKKLPSNVTAKTEVQVTNNALPSSNSSHSITKVTEEPK
QNRMSASTHGEINHTEIPRMTPILNAHTWEKSTTPQWPFTAETSLTTSSKSAILTWSNLLTTPKEPLTNTSLRSTNHITT
QLTTSNRTQSAKLTKAHVSSQTTNIYPQTITERSTDVKKKSSTESREANKTLPGNDYRVTDKNSHNHPDNLTTKAYSTQN
ATHYTYNERHDLNNTDST
>P52525 ~~~~~~Glycoprotein U47 homolog~~~
MKNKMYSMLVFTLISFLFYSYLIIWTPVLSVCSEKISFVNLTNFSMWPKFAKFHYESFEKTYIQSCDIPISKNCFKQILF
WSFRLSQKKNICLSKINMLYLDNFPRWTLHFEFPTKTNRKRHVYLDNISFLYLVFATLAFKTMDDSCVNKIPFEVLASHL
FKINFSQNKIETIFQDLQNHTEAQLFSRENSEQLFSFTNFIYYFVYNRTDCKNSISKFFYNSVNTRNVSTPFGVTNFSLV
RGIMSPIQSFKGNLMFLENKTKVTKPTAIPNAVTSEPTKFFPSTRGTSSMQASQQTSSFPTTFFTTNHSNTST
>P03776 ~~~~~~Gene 0.4 protein~~~
MSTTNVQYGLTAQTVLFYSDMVRCGFNWSLAMAQLKELYENNKAIALESAE
>Q69489 ~~~~~~Glycoprotein 105~~~
MATARLGVMRPPRSCALIFLCAFSMATAPTNATAHRRAGTVKSTPPPEDKGNYTAKYYDKNIYFNIYEGRNSTPRRRTLW
EIISKFSTSEMLSLKRVKAFVPVDDNPTTTLEDIADILNYAVCDDNSCGCTIETQARIMFGDIIICVPLSADNKGVRNFK
DRIMPKGLSQILSSSLGLHLSLLYGAFGSNYNSLAYMRRLKPLTAMTAIAFCPMTTKLELRQNYKVKETLCELIVSIEIL
KIRNNGGQTMKTLTSFAIVRKDNDGQDWETCTRFAPVNIEDILRYKRAANDTCCRHRDVQHGRRTLESSNSWTQTQYFEP
WQDIVDVYVPINDTHCPNDSYVVFETLQGFEWCSRLNKNETKNYLSSVLGFRNALFETEELMETIAMRLASQILSMVGQQ
GTTIRDIDPAIVSALWHSLPENLTTTNIKYDIASPTHMAPALCTIFVQTGTSKERFRNAGLLMVNNIFTVQGRYTTQNMF
ERKEYVYKHLGQALCQDGEILFQNEGQKFCRPLTDNRTIVYTMQDQVQKPLSVTWMDFNLVISDYGRDVINNLTKSAMLA
RKNGPRYLQMENGPRYLQMETFISDLFRHECYQDNYYVLDKKLQMFYPTTHSNELLFIPRKLRCPHLGRNLRFPHLGRNL
RFPHVGIGSY
>Q9T1X7 ~~~~~~Uncharacterized protein gp11~~~
MVDAKILNGVSTLLRAYGRLTCGVLAEKMNMLPSSMVYFLRDAVDAGVLTECNGFYDVPRPRPTPPVRRNATEQPAVDDA
VWCNWRRSLPWVEGNTIPALAKEFATGVLTCESVHIVAEVDNRMCEQGMPRFVMAYIDIRLGRFICSSSVWNITDHVLRY
LILDCSPAPAAVQEVA
>D3WAD0 ~~~~~~Chaperone protein gp12~~~
MAKQLSTARKFKMITGKDLFQQQKAMDTELKKEDGEITDLMEFVQYGLYLALFQDNIVKAKSDFSDFRSSFEFDTDGKGL
KELVELWQKEI
>Q38488 ~~~~~~Uncharacterized protein gp12~~~
MFFKTSNPAALLAWDQFMADCLKLREEARHLDKVLGCGCRSVFSTSIGGRYFHGVNFPGNERPFSRELWTVQRPASGNSC
RPRTSRIPAHLREQARELAKIWQENIPVTYARTDALLPALGLDFSATIFGPLQWFRVGDVIYVMTGMTPAQGRMTEILSD
EFIRAQKQAEVNNGKQ
>P03780 ~~~~~~Inhibitor of dGTPase~~~
MGRLYSGNLAAFKAATNKLFQLDLAVIYDDWYDAYTRKDCIRLRIEDRSGNLIDTSTFYHHDEDVLFNMCTDWLNHMYDQ
LKDWK
>Q38490 ~~~~~~Uncharacterized protein gp13~~~
MENNKTSYSWLGKFTTVKQECPTCGNESPEYLKECPHCGGLKCNHCDMGDDTACMNCEGE
>P15132 ~~~~~~Morphogenesis protein 1~~~
MVYVSNKYLTMSEMKVNAQYILNYLSSNGWTKQAICGMLGNMQSESTINPGLWQNLDEGNTSLGFGLVQWTPASNYINWA
NSQGLPYKDMDSELKRIIWEVNNNAQWINLRDMTFKEYIKSTKTPRELAMIFLASYERPANPNQPERGDQAEYWYKNLSG
GGGGGLQLAQFPMDIINISQGENGSFSHKGTLCIDFVGKTEKYPYYAPCDCTCVWRGDASAYLAWTSDKEVMCADGSVRY
ITWVNVHESPLPFDVGKKLKKGDLMGHTGIGGNVTGDHWHFNVIDGKEYQGWTKKPDSCLAGTELHIYDVFAVNNVEIIN
GNGYDWKTSDWQDGDGGDGDDDNDNNKTKDLITLLLSDALHGWKA
>Q38492 ~~~~~~Uncharacterized protein gp14~~~
MNNETKFTPLNIDNVMAEKGMLERVRAIVEYGIKHNLTAREVRDIINREMNRLETVVALQNETAREEYIRRRLGLSDQDI
VTDAHVFEAFEIRQHLGLTN
>P03724 ~~~~~~Internal virion protein gp14~~~
MCWAAAIPIAISGAQAISGQNAQAKMIAAQTAAGRRQAMEIMRQTNIQNADLSLQARSKLEEASAELTSQNMQKVQAIGS
IRAAIGESMLEGSSMDRIKRVTEGQFIREANMVTENYRRDYQAIFAQQLGGTQSAASQIDEIYKSEQKQKSKLQMVLDPL
AIMGSSAASAYASGAFDSKSTTKAPIVAAKGTKTGR
>Q9T1X4 ~~~~~~Uncharacterized protein gp15~~~
MENNVQPYDVAGYAIASALVRLLVKKAIITAEEGKAIFSSSAEILKDAPAMRTSRREKLQLSKIMEDIISSLDPDADGSQ
KPQTHERQ
>P03725 ~~~~~~Internal virion protein gp15~~~
MSKIESALQAAQPGLSRLRGGAGGMGYRAATTQAEQPRSSLLDTIGRFAKAGADMYTAKEQRARDLADERSNEIIRKLTP
EQRREALNNGTLLYQDDPYAMEALRVKTGRNAAYLVDDDVMQKIKEGVFRTREEMEEYRHSRLQEGAKVYAEQFGIDPED
VDYQRGFNGDITERNISLYGAHDNFLSQQAQKGAIMNSRVELNGVLQDPDMLRRPDSADFFEKYIDNGLVTGAIPSDAQA
TQLISQAFSDASSRAGGADFLMRVGDKKVTLNGATTTYRELIGEEQWNALMVTAQRSQFETDAKLNEQYRLKINSALNQE
DPRTAWEMLQGIKAELDKVQPDEQMTPQREWLISAQEQVQNQMNAWTKAQAKALDDSMKSMNKLDVIDKQFQKRINGEWV
STDFKDMPVNENTGEFKHSDMVNYANKKLAEIDSMDIPDGAKDAMKLKYLQADSKDGAFRTAIGTMVTDAGQEWSAAVIN
GKLPERTPAMDALRRIRNADPQLIAALYPDQAELFLTMDMMDKQGIDPQVILDADRLTVKRSKEQRFEDDKAFESALNAS
KAPEIARMPASLRESARKIYDSVKYRSGNESMAMEQMTKFLKESTYTFTGDDVDGDTVGVIPKNMMQVNSDPKSWEQGRD
ILEEARKGIIASNPWITNKQLTMYSQGDSIYLMDTTGQVRVRYDKELLSKVWSENQKKLEEKAREKALADVNKRAPIVAA
TKAREAAAKRVREKRKQTPKFIYGRKE
>P16517 ~~~~~~DNA replication protein 16.7~~~
MEAILMIGVLALCVIFLLSGRNNKKKQEARELEDYLEDLNKRVVQRTQILSELNEVISNRSIDKTVNLSACEVAVLDLYE
QSNIRIPSDIIEDLVNQRLQSEQEVLNYIETQRTYWKLENQKKLYRGSLK
>Q4Z971 ~~~~~~Gene product 168~~~
MLFFKEKFYNELSYYRGGHKDLESMFELALEYIEKLEEEDEQQVTDYENAMEEELRDAVDVIESQLEIIKDIVR
>D3KFX4 ~~~~~~Baseplate protein gp16~~~
MLEANVYDNFNPNYYNISDFSMPNGKKEKRGLPIPKARCQVINYELWETGYLYTSSATLTVSVEVGDIVQILFPEVVPIE
EALGKKKKLNLDMVYLVTDVDESNKATLKNYFWAMIESLDVPNAITKTTNFAIIDYLIDPNKNNLMSYGYFFNSSIFAGK
ATINRKAETSSAHDVAKRIFSKVQFQPTTTIQHAPSETDPRNLLFINFASRNWNRKRITTRVDIKQSVTMDTETIVERSA
YNFAVVFVKNKATDDYTDPPKMYIAKNNGDVIDYSTYHGDGTDLPDVRTAKTLFYDRDDHGNPPELSTIKVEISPSTIVT
RLIFNQNELLPLYVNDLVDIWYEGKLYSGYIADRVKTEFNDRLIFVESGDKPNVI
>Q01146 ~~~~~~Internal virion protein gp16~~~
MKVTANGKTFNFPDGTSTEDIGAAVDEYFAGQASAAETQPAEQQEEPQQPEQSLMQRAGDLLTGGQSAGQIAEQAGRGLV
NIPFDVLQGGASLINAISQGLGGPKVLDDVYRPVDRPTDPYAQAGESIGGYLIPGAGVAGNMAIGSVAEAANQQGDFAGN
VAKNAAVNLGAQGLLSGAAKLVGRGITAARGEIAPEARQLIDTAESMGVKPMTSDMIKPGNAFTRSLMQGGEGALLGTGG
KRAEQYAIRSKLLGDYFDRVGGYNPDDIVKSMTSTVGGRKNAAGAVRDEIVNRMGSAPVGTTNSINAIDTNIARLEKLGT
SADQRLLTALKNLKGELNSGNVDFDLLQQHRTAFRTNVQGDAMVFPNQAKAATNMVENAMTRDLRNAVGKSLGPQDAAKY
LKSNSDFANIYNKVLNKRISNTLNKARSEYTPELINTVVFSRKPSDIKRIWSSLDNKGKDAMRAAYISKIAEKTGDSPAK
FITEVNKLKAQSGGEIYNTIFSGRHMKELDALHDVLRQTARSDSANVVTQTGQALANPVRLGAAIPTLGKSLAAEAGYGL
AMRVYESKPIRNMLLRLANTKPGTPAYERALNQAATAVRPLLANEATRQ
>P24729 ~~~~~~Glycoprotein GP16~~~
MNFWATFSICLVGYLVYAGHLNNELQEIKSILVVMYESMEKHFSNVVDEIDSLKTDTFMMLSNLQNNTIRTWDAVVKNGK
KISNLDEKINVLLTKNGVVNNVLNVQ
>P03686 ~~~~~~DNA replication protein 17~~~
MNNYQLTINEVIDIINTNTEINKLVAKKENLFPTDLYDLDKQELIAIILNSDFALSSIKRVLLEVTVEELGTQDNDEDDE
LEDLDGEIDRVDYIDKDGIRFDVPRETSPHVDKSIVTFNDELLDEANKIAKSIQEHDFNDKAIEEAELKIFKNHLPSIYS
MKKENK
>Q38625 ~~~~~~Uncharacterized protein gp18~~~
MGKGWNASFHLGRRERLRQEVLHRVAGGPRPAPRDYTGHDGTHGSYYMKGWQSVDMPEILHHCLLYREKHYV
>Q38278 ~~~~~~Gene product 19~~~
MINLQNKKLDIKEFLQELGFTVSLDYEREPMGVMFAEIHPIVSQVSNNSAIYQSFRTLEIELMVICTEETENSLYRAVQL
LSDEHYIYANTITDNTNIIKLRGNYYD
>Q38646 ~~~~~~Uncharacterized protein gp19~~~
MSERSARQWPDFLSVVLLALLLWISLFCGWRALMFCCASVFSVALCVAADCLDALIMSCRVPEHFARFVWPLTWLGSLSG
LGLAVMATSQLKTGPEHVIWALAGLLTFWLSFRFRARLFG
>P03679 ~~~1~~~DNA replication protein 1~~~
MGKIFDQEKRLEGTWKNSKWGNQGIIAPVDGDLKMIDLELEKKMTKLEHENKLMKNALYELSRMENNDYATWVIKVLFGG
APHGAK
>Q01076 ~~~~~~Internal virion protein gp20~~~
MATWQQGINSGGFLAGIGAQNENAPKARDINATLGLIRENNDLARSGANNVALTGLRGLAGVADIYNQEQQQKALNAFNQ
VHANAWATGDPSGLFKFAQENPAFVAQAQQAFSGLNEQQRNDMGDLAMKANVALSQGPEAYSKFITDNKDRLNRVGANPD
WMIQTGVQNPEQLSHMLTTMSLGALGPEKAFAVQDKMVGREIDRGRLAETIRSNKAGEGLQARGQNITMRGQDMSAATAR
RGQDLATQRANARTISGSEGNRVVQLADGRTVSVGGKLHGAGANAFYEGIDDNGNMVRVPASAIAAPPTSAASAQNYAMK
KDIDAIANADASALDFMTGMTGGAGNPAIGADVRSRLTGKEQRQLYNSAQRIQGRMQNQGVAAARDMGASGINTIAEAKM
YFQGMPQVDYSSPEAMQQSIREIQEYTNNYNQQYNVNVGNGGLKSPRQQPDTQQSAGGSYTSKSGIKFTVE
>Q9T1X0 ~~~~~~Uncharacterized protein gp24~~~
MNEDQNRAVALLLAELWQGDTRDIPRPAAYDPPVLCAGCGRELRPDVLRQQPMANYCHWCRGAE
>Q9T1W8 ~~~~~~Uncharacterized protein gp26~~~
MINDILTEDRRLVILRSLMDCNNEANESILQDCLDAYGHNVSRDVVRGQIDWLAEQQLVTVENLRGFYVVTLTSRGQDVA
EGRARVAGVKRPRPRA
>P13335 ~~~~~~Baseplate hub assembly protein gp26~~~
MYEYKFDVRVGSKIINCRAFTLKEYLELITAKNNGSVEVIVKKLIKDCTNAKDLNRQESELLLIHLWAHSLGEVNHENSW
KCTCGTEIPTHINLLHTQIDAPEDLWYTLGDIKIKFRYPKIFDDKNIAHMIVSCIETIHANGESIPVEDLNEKELEDLYS
IITESDIVAIKDMLLKPTVYLAVPIKCPECGKTHAHVIRGLKEFFELL
>P13336 ~~~~~~Baseplate hub assembly protein gp28~~~
MNLNLILPLKKVVLPISNKEVSIPKMGLKHYNILKDVKGPDENLKLLIDSICPNLSPAEVDFVSIHLLEFNGKIKSRKEI
DGYTYDINDVYVCQRLEFQYQGNTFYFRPPGKFEQFLTVSDMLSKCLLRVNDEVKEINFLEMPAFVLKWANDIFTTLAIP
GPNGPITGIGNIIGLFE
>Q04566 ~~~GP2a~~~Glycoprotein 2a~~~
MQWGHCGVKSASCSWTPSLSSLLVWLILPFSLPYCLGSPSQDGYWSFFSEWFAPRFSVRALPFTLPNYRRSYEGLLPNCR
PDVPQFAVKHPLGMFWHMRVSHLIDEMVSRRIYQTMEHSGQAAWKQVVGEATLTKLSGLDIVTHFQHLAAVEADSCRFLS
SRLVMLKNLAVGNVSLQYNTTLDRVELIFPTPGTRPKLTDFRQWLISVHASIFSSVASSVTLFIVLWLRIPALRYVFGFH
WPTATHHSS
>A0MD30 ~~~GP2a~~~Glycoprotein 2a~~~
MQWGHCGVKSASCSWMPSLSFLSVWLILSFSLPYCLGSPSQDGYWSFFSEWFAPRFSVRALPFTLPNYRRSYESLLPNCR
PDVPQFAFKHPLGILWHMRVSHLIDEMVSRRIYQTMEHSGQAAWKYVVGEATLTKLSKLDIVTHFQHLAAVEADSCRFLS
SRLVMLKNLAVGNVSLQYNTTLDRVELIFPTPGTRPKLTDFRQWLISVHASIFSSVASSVTLFIVLWLRIPALRYVFGFH
WPTATHHSS
>P28992 ~~~GP2b~~~Glycoprotein 2b~~~
MQRFSFSCYLHWLLLLCFFSGSLLPSAAAWWRGVHEVRVTDLFKDLQCDNLRAKDAFPSLGYALSIGQSRLSYMLQDWLL
AAHRKEVMPSNIMPMPGLTPDCFDHLESSSYAPFINAYRQAILSQYPQELQLEAINCKLLAVVAPALYHNYHLANLTGPA
TWVVPTVGQLHYYASSSIFASSVEVLAAIILLFACIPLVTRVYISFTRLMSPSRRTSSGTLPRRKIL
>P68343 ~~~BLLF1~~~Envelope glycoprotein GP350~~~
MEAALLVCQYTIQSLIQLTRDDPGFFNVEILEFPFYPACNVCTADVNATINFDVGGKKHKLNLDFGLLTPHTKAVYQPRG
AFGGSENATNLFLLELLGAGELALTMRSKKLPINITTGEEQQVSLESVDVYFQDVFGTMWCHHAEMQNPVYLIPETVPYI
KWDNCNSTNITAVVRAQGLDVTLPLSLPTSAQDSNFSVKTEMLGNEIDIECIMEDGEISQVLPGDNKFNITCSGYESHVP
SGGILTSTSPVATPIPGTGYAYSLRLTPRPVSRFLGNNSILYVFYSGNGPKASGGDYCIQSNIVFSDEIPASQDMPTNTT
DITYVGDNATYSVPMVTSEDANSPNVTVTAFWAWPNNTETDFKCKWTLTSGTPSGCENISGAFASNRTFDITVSGLGTAP
KTLIITRTATNATTTTHKVIFSKAPESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAG
TTSGASPVTPSPSPRDNGTESKAPDMTSPTSAVTTPTPNATSPTPAVTTPTPNATSPTLGKTSPTSAVTTPTPNATSPTP
AVTTPTPNATIPTLGKTSPTSAVTTPTPNATSPTVGETSPQANTTNHTLGGTSSTPVVTSPPKNATSAVTTGQHNITSSS
TSSMSLRPSSISETLSPSTSDNSTSHMPLLTSAHPTGGENITQVTPASTSTHHVSTSSPAPRPGTTSQASGPGNSSTSTK
PGEVNVTKGTPPKNATSPQAPSGQKTAVPTVTSTGGKANSTTGGKHTTGHGARTSTEPTTDYGGDSTTPRTRYNATTYLP
PSTSSKLRPRWTFTSPPVTTAQATVPVPPTSQPRFSNLSMLVLQWASLAVLTLLLLLVMADCAFRRNLSTSHTYTTPPYD
DAETYV
>P03200 ~~~BLLF1~~~Envelope glycoprotein GP350~~~
MEAALLVCQYTIQSLIHLTGEDPGFFNVEIPEFPFYPTCNVCTADVNVTINFDVGGKKHQLDLDFGQLTPHTKAVYQPRG
AFGGSENATNLFLLELLGAGELALTMRSKKLPINVTTGEEQQVSLESVDVYFQDVFGTMWCHHAEMQNPVYLIPETVPYI
KWDNCNSTNITAVVRAQGLDVTLPLSLPTSAQDSNFSVKTEMLGNEIDIECIMEDGEISQVLPGDNKFNITCSGYESHVP
SGGILTSTSPVATPIPGTGYAYSLRLTPRPVSRFLGNNSILYVFYSGNGPKASGGDYCIQSNIVFSDEIPASQDMPTNTT
DITYVGDNATYSVPMVTSEDANSPNVTVTAFWAWPNNTETDFKCKWTLTSGTPSGCENISGAFASNRTFDITVSGLGTAP
KTLIITRTATNATTTTHKVIFSKAPESTTTSPTLNTTGFADPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAG
TTSGASPVTPSPSPWDNGTESKAPDMTSSTSPVTTPTPNATSPTPAVTTPTPNATSPTPAVTTPTPNATSPTLGKTSPTS
AVTTPTPNATSPTLGKTSPTSAVTTPTPNATSPTLGKTSPTSAVTTPTPNATGPTVGETSPQANATNHTLGGTSPTPVVT
SQPKNATSAVTTGQHNITSSSTSSMSLRPSSNPETLSPSTSDNSTSHMPLLTSAHPTGGENITQVTPASISTHHVSTSSP
APRPGTTSQASGPGNSSTSTKPGEVNVTKGTPPQNATSPQAPSGQKTAVPTVTSTGGKANSTTGGKHTTGHGARTSTEPT
TDYGGDSTTPRPRYNATTYLPPSTSSKLRPRWTFTSPPVTTAQATVPVPPTSQPRFSNLSMLVLQWASLAVLTLLLLLVM
ADCAFRRNLSTSHTYTTPPYDDAETYV
>Q3KST4 ~~~~~~Envelope glycoprotein GP350~~~
MEAALLVCQYTIQSLIHLTGEDPGFFNVEIPEFPFYPTCNVCTADVNVTINFDVGGKKHQLDLDFGQLTPHTKAVYQPRG
AFGGSENATNLFLLELLGAGELALTMRSKKLPINVTTGEEQQVSLESVDVYFQDVFGTMWCHHAEMQNPVYLIPETVPYI
KWDNCNSTNITAVVRAQGLDVTLPLSLPTSAQDSNFSVKTQMLGNEIDIECIMEDGEISQVLPGDNKFNITCSGYESHVP
SGGILTSTSPVVTPIPGTGYAYSLRLTPRPVSRFLGNNSILYVFYSGNGPKASGGDYCIQSNIVFSDEIPASQDMPTNTT
DITYVGDNATYSVPMVTSEDANSPNVTVTAFWAWPNNTETDFKCKWTLTSGTPSGCENISGAFASNRTFDITVSGLGTAP
KTLIITRTATNATTTTHKVIFSKAPESTTTSPTSNTTGFAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAG
TTSGASPVTPSPSPRDNGTESKAPDMTSPTPAVTTPTPNATSPTSAVTTPTPNATSPTLGKTSPTSAVTTPTPNATSPTS
AVTTPTPNATGPTVGETSPQANTTNHTLGGTSSTPVVTSQPKNATSAVTTGQHNITSSSTSSMSLRPSSISETTSHMPLL
TSAHPTGGENITQVTPASTSTHHVSTSSPAPRPGTTSQASGPGNSSTSTKPGEVNVTKGTPPKNATSPQAPSGQKTAVPT
VTSTGGKANSTTGGKHTTGHGARTSTEPTTDYGGDSTTPRTRYNATTYLPPSTSSELRPRWTFTSPPVTTAQATVPVPPT
SQPRFSNLSMLVLQWASLAVLTLLLLLVMADCAFRRNLSTSHTYTTPPYDDAETYV
>Q9T1W0 ~~~~~~Uncharacterized protein gp35~~~
MSGTSLNSQRLDTSRITCTAIIKCLRPVYRRAGIAFTRGENTVEVTEEQLAIIRADSVLSVVSASSAETLAEAGGLDVLG
VGDLNTRIRATVAGLDKANPEHFTAGGEPKVKAVSAALGEPVSSAQIKAALAEADA
>Q09YD1 ~~~~~~Gene product 38~~~
MYTAEEREQIIDIVDKMSLLRQDFDGAFTWIKENVAMPFDFDGEQQFISDLKQLVKINALKFGKIYEGVLN
>P79677 ~~~~~~Putative sheath terminator protein~~~
MLKIKPAAGKAIRDPLTMKLLASEGEEKPRNSFWIRRLAAGDVVEVGSTENTADDTDAAPKKRSKSK
>P28993 ~~~GP3~~~Glycoprotein 3~~~
MGRAYSGPVALLCFFLYFCFICGSVGSNNTTICMHTTSDTSVHLFYAANVTFPSHFQRHFAAAQDFVVHTGYEYAGVTML
VHLFANLVLTFPSLVNCSRPVNVFANASCVQVVCSHTNSTTGLGQLSFSFVDEDLRLHIRPTLICWFALLLVHFLPMPRC
RGS
>Q04567 ~~~GP3~~~Glycoprotein 3~~~
MAHQCARFHFFLCGFICYLVHSALASNSSSTLCFWFPLAHGNTSFELTINYTICMPCSTSQAARQRLEPGRNMWCKIGHD
RCEERDHDELLMSIPSGYDNLKLEGYYAWLAFLSFSYAAQFHPELFGIGNVSRVFVDKRHQFICAEHDGHNSTVSTGHNI
SALYAAYYHHQIDGGNWFHLEWLRPLFSSWLVLNISWFLRRSPVSPVSRRIYQILRPTRPRLPVSWSFRTSIVSDLTGSQ
QRKRKFPSESRPNVVKPSVLPSTSR
>P32651 ~~~~~~Structural glycoprotein gp41~~~
MTDERGNFYYNTPPPLRYPSNPATAIFTSAQTYNAPGYVPPATVPTTVATRDNRMDYTSRSNSTNSVAIAPYNKSKEPTL
DAGESIWYNKCVDFVQKIIRYYRCNDMSELSPLMILFINTIRDMCIDTNPISVNVVKRFESEETMIRHLIRLQKELGQSN
AAESLSSDSNIFQPSFVLNSLPAYAQKFYNGGADMLGKDALAEAAKQLSLAVQYMVAEAVTCNIPIPLPFNQQLANNYMT
LLLKHATLPPNIQSAVESRRFPHINMINDLINAVIDDLFAGGGDYYHYVLNEKNRARVMSLKENVAFLAPLSASANIFNY
MAELATRAGKQPSMFQNATFLTSAANAVNSPAAHLTKSACQESLTELAFQNETLRRFIFQQINYNKDANAIIAAAAPNAT
RPNTKGRTA
>P03205 ~~~BZLF2~~~Glycoprotein 42~~~
MVSFKQVRVPLFTAIALVIVLLLAYFLPPRVRGGGRVAAAAITWVPKPNVEVWPVDPPPPVNFNKTAEQEYGDKEVKLPH
WTPTLHTFQVPQNYTKANCTYCNTREYTFSYKGCCFYFTKKKHTWNGCFQACAELYPCTYFYGPTPDILPVVTRNLNAIE
SLWVGVYRVGEGNWTSLDGGTFKVYQIFGSHCTYVSKFSTVPVSHHECSFLKPCLCVSQRSNS
>P0C6Z5 ~~~BZLF2~~~Glycoprotein 42~~~
MVSFKQVRVPLFTAIALVIVLLLAYFLPPRVRGGGRVAAAAITWVPKPNVEVWPVDPPPPVNFNKTAEQEYGDKEVKLPH
WTPTLHTFQVPQNYTKANCTYCNTREYTFSYKGCCFYFTKKKHTWNGCFQACAELYPCTYFYGPTPDILPVVTRNLNAIE
SLWVGVYRVGEGNWTSLDGGTFKVYQIFGSHCTYVSKFSTVPVSHHECSFLKPCLCVSQRSNS
>Q08402 ~~~~~~E3 protein~~~
MAKSNNVYVVNGEEKVSTLAEVAKVLGVSRVSKKDVEEGKYDVVVEEAAVSLADTEEVVEEVVTEEEDILEGVEVVEDEE
EEEAAEDVEEPTSEEDSEDEWEEGYPVATEVEEDEDEEIEYPEVGDFEDEKAIKKYIKGLTDEQLQAWCELEGAEWVENE
HRNINRMRMAMAIKAVHFPELAKKPSSKKKSKYAEYTTEELVEMAIDNNVEVRDDKGNERILRMYTIIALREAGLIS
>O48400 ~~~~~~Inhibitor of histone-like protein HU~~~
MMTEDQKFKYLTKIEELEAGCFSDWTKEDITGDLKYLKKGIIEESIELIRAVNGLTYSEELHDFTQEIIEELDISPL
>P28994 ~~~GP4~~~Glycoprotein 4~~~
MKIYGCISGLLLFVGLPCCWCTFYPCHAAEARNFTYISHGLGHVHGHEGCRNFINVTHSAFLYLNPTTPTAPAITHCLLL
VLAAKMEHPNATIWLQLQPFGYHVAGDVIVNLEEDKRHPYFKLLRAPALPLGFVAIVYVLLRLVRWAQRCYL
>Q04568 ~~~GP4~~~Glycoprotein 4~~~
MAAATLFFLAGAQHIMVSEAFACKPCFSTHLSDIETNTTAAAGFMVLQDINCFRPHGVSAAQEKISFGKSSQCREAVGTP
QYITITANVTDESYLYNADLLMLSACLFYASEMSEKGFKVIFGNVSGVVSACVNFTDYVAHVTQHTQQHHLVIDHIRLLH
FLTPSAMRWATTIACLFAILLAI
>A0MD33 ~~~GP4~~~Glycoprotein 4~~~
MAAAAFFLLVGAQHIMVSEAFACKPCFSTHLSDIKTNTTAAAGFMVLQNINCSRPHEASATQGQVPSRKSSQCREAVGVP
QYITITANVTDESYLYNADLLMLSACLFYASEMSEKGFKVIFGNVSGVVSACVNFTDYVAHVTQHTQQHHLVINHIRLLH
FLTPSAMRWATTIACLFAILLAI
>P04532 ~~~~~~Tail fiber assembly helper protein~~~
MSEQTVEQKLSAEIVTLKSRILDTQDQAARLMEESKILQGTLAEIARAVGITGDTIKVEEIVEAVKNLTAESADEAKDEE
>P20406 ~~~~~~Probable RecBCD inhibitor gp5.9~~~
MSRDLVTIPRDVWNDIQGYIDSLERENDSLKNQLMEADEYVAELEEKLNGTS
>P28995 ~~~GP5~~~Glycoprotein 5~~~
MLSMIVLLFLLWGAPSHAYFSYYTAQRFTDFTLCMLTDRGVIANLLRYDEHTALYNCSASKTCWYCTFLDEQIITFGTDC
DDTYAVPVAEVLEQAHGPYSALFDDMPPFIYYGREFGIVVLDVFMFYPVLVLFFLSVLPYATLILEMCVSILFIIYGIYS
GAYLAMGIFAATLAIHSIVVLRQLLWLCLAWRYRCTLHASFISAEGKVYPVDPGLPVAAVGNRLLVPGRPTIDYAVAYGS
KVNLVRLGAAEVWEP
>Q04569 ~~~GP5~~~Glycoprotein 5~~~
MRCSHKLGRFLTPHSCFWWLFLLCTGLSWSFADGNGDSSTYQYIYNLTICELNGTDWLSSHFGWAVETFVLYPVATHILS
LGFLTTSHFFDALGLGAVSTAGFVGGRYVLCSVYGACAFAAFVCFVIRAAKNCMACRYARTRFTNFIVDDRGRVHRWKSP
IVVEKLGKAEVDGNLVTIKHVVLEGVKAQPLTRTSAEQWEA
>A0MD34 ~~~GP5~~~Glycoprotein 5~~~
MKCSHKLGHSLTPHSCFWWLFLLCTGLSWSFADGNGNNSTYQYIYNLTICELNGTNWLSGHFEWAVETFVLYPVVTHILS
LGFLTTSHFFDALGLGAVSTAGFVGGRYVLSSVYGACAFAAFVCFVIRAAKNCMACRYARTRFTNFIVDDRGGVHRWKSP
IVVEKLGKAEIGGNLVTIKHVVLEGVKAQPLTRTSAEQWEA
>O48414 ~~~~~~Putative gene 60 protein~~~
MLNQVEVLREEYVEGYVVQMWRRNPSNAPVIEVFTEDNLEEGIIPEYVTANDDTFDRIVDAVEFGYLEELELV
>P04538 ~~~~~~Prohead assembly protein gp68~~~
MLLIPETHELVLENVEALIPEAQGRFDELSSALNKDDINTIVENMLDDETDLAVALASINENMPLNEFIVKHVSARGEIT
RTKDRKTRERNAFQTTGLSKAKRRQIARKATKTKIANPAGQSRAQRKRKKALKRRKALGLS
>Q38477 ~~~~~~Uncharacterized protein gp6~~~
MCIKAEKYIEWVKHCQCHGVPLTTYKCPGCGEQIMTQCSPEKEIRDSLTCCPWCSAVFFKQVKGAKVKASAVIQNQ
>P03751 ~~~~~~Protein 7.3~~~
MGKKVKKAVKKVTKSVKKVVKEGARPVKQVAGGLAGLAGGTGEAQMVEVPQAAAQIVDVPEKEVSTEDEAQTESGRKKAR
AGGKKSLSVARSSGGGINI
>Q38479 ~~~~~~Uncharacterized protein gp7~~~
MAKVIIEIKNTVSGIKGRNLRTSIAVDGSAELDGDEGTLAGMVALLVLNKSQKIINESAHEAIEILKNDGVITSGRVTEM
AVEKTCH
>Q01074 ~~~7~~~Internal virion protein gp7~~~
MLYAFTLGRKLRGEEPSYPEKGGKGGADKSAKYAAEAQKYAADLQNQQFNTIMNNLKPFTPLADKYIGSLEGLSSLEGQG
QALNNYYNSQQYQDLAGQARYQNLAAAEATGGLGSTATSNQLSAIAPTLGQQWLSGQMNNYQNLANIGLGALQGQANAGQ
TYANNMSQISQQSAALAAANANRPSAMQSAIGGGASGAIAGAGLAKLIGSSTPWGAAIGGGIGLLGSLF
>Q38442 ~~~7~~~Minor head protein GP7~~~
MPEPQNQEELDKYLDNIITQAEKRLDKVFASRLKEIKAMINKLFEKYSKNGELTYADVVKYNRLEKEMDVIKQNISADYK
TVLKMLNELLETQYVDNYLRSAYIYEMYTGRNLGFSVPSADVVQRAVENPIPLLTLPKVLERQRVELINNIAMAIAQGLM
AGEGYSQVAQRVHKRMQLSLAKARLTARTEGHRVQVAGRMASAEQAAKKVNMQKMWSAALDTRTRAGHRKLDGKIINMDE
NFKSIYGGVGKAPGHMHMAKDDCNCRCALIYVIDGEIPSVRRARLSDGSTRVIKYIPYTEWEKQKKAS
>Q38480 ~~~~~~Uncharacterized protein gp8~~~
MNVKIRNEIQALIRIQERNNNGGELREFICAREVDGYGEKTYLITFDHYSICARYCGESISRAIASGDAFNVDLWEYVMD
REYICASDPEAREMWQRIWRDYRLMAKGWARCCYSSLALKAVQLSLRHIPASLREPLLY
>Q38482 ~~~~~~Uncharacterized protein gp9~~~
MVTDMKCNRKRWSREDREFIEANVGKMTVEEMAEKLKVATTALRAHARRHGISLCVYRISEHDKYLCRELYKEGLDIHVI
ARKMELSNRAVSSIVYSGY
>Q01259 ~~~F~~~Putative capsid assembly protein F~~~
MPQQTIDLAYAARLPPKEAVAYFRAKGYNITWNWYEQLADAHARAFTVAKATRMDVLTTIREEVERAVSEGITREEFTRT
LAPRLQKLGWWGKQIIVDAEGNAKEIELGSPRRLATIYNVNTRTAYGAGRYAQMMNTADLYPYWQYVAVMDGRTRPEHAR
LHNMVFQYDDIFWQTHYPPNGWNCRCRVRALSAARMKELGLQVSYGASFMNTREVDAGTDESTGEIFRTSSTTFDNGRVK
MTPDVGWSYNPGSAAFGTDQALIRKLVEVRDAQLREQVVQTLNNSRERQLAFSLWLKRLAGSRQTGHEIRALGFMTGSVA
EAVYQRTGNMPARLLVMNGKSLATTADAALKPEDLQRLPSLMAKPQAVLWDRENHQLLYVVATRDGTARIVVRTSQTVGR
QNDRADVLVSISRVSAQSLEAAIADGMIDVLEGHVEVNK
>Q01261 ~~~G~~~Putative capsid assembly protein G~~~
MSLDMNVAVDVRRIQLALDELGTVTRDRAIPRVMAAALLSSTEQAFERQADPDTGKGWEAWSDSWLAWRQDHGFVPGSIL
TLHGDLARSITTDYGQDYALIGSPKIYAAIHQWGGTPDMAPRPAGVPARPYMGLDKTGEQEIFDAIRKRVSAALRQ
>Q9T1V9 ~~~J~~~Gene product J~~~
MNYATVNDLCARYTRTRLDILTRPKTADGQPDDAVAEQALADASAFIDGYLAARFVLPLTVVPSLLKRQCCVVAWFYLNE
SQPTEQITATYRDTVRWLEQVRDGKTDPGVESRTAASPEGEDLVQVQSDPPVFSRKQKGFI
>Q9E006 ~~~GP~~~Envelopment polyprotein~~~
MEGWYLVVLGVCYTLTLAMPKTIYELKMECPHTVGLGQGYIIGSTELGLISIEAASDIKLESSCNFDLHTTSMAQKSFTQ
VEWRKKSDTTDTTNAASTTFEAQTKTVNLRGTCILAPELYDTLKKVKKTVLCYDLTCNQTHCQPTVYLIAPVLTCMSIRS
CMASVFTSRIQVIYEKTHCVTGQLIEGQCFNPAHTLTLSQPAHTYDTVTLPISCFFTPKKSEQLKVIKTFEGILTKTGCT
ENALQGYYVCFLGSHSEPLIVPSLEDIRSAEVVSRMLVHPRGEDHDAIQNSQSHLRIVGPITAKVPSTSSTDTLKGTAFA
GVPMYSSLSTLVRNADPEFVFSPGIVPESNHSTCDKKTVPITWTGYLPISGEMEKVTGCTVFCTLAGPGASCEAYSENGI
FNISSPTCLVNKVQRFRGSEQKINFICQRVDQDVVVYCNGQKKVILTKTLVIGQCIYTFTSLFSLMPDVAHSLAVELCVP
GLHGWATVMLLSTFCFGWVLIPAVTLIILKCLRVLTFSCSHYTNESKFKFILEKVKIEYQKTMGSMVCDVCHHECETAKE
LESHRQSCINGQCPYCMTITEATESALQAHYSICKLTGRFQEALKKSLKKPEVKKGCYRTLGVFRYKSRCYVGLVWCLLL
TCEIVIWAASAETPLMESGWSDTAHGVGEIPMKTDLELDFSLPSSSSYSYRRKLTNPANKEESIPFHFQMEKQVIHAEIQ
PLGHWMDATFNIKTAFHCYGACQKYSYPWQTSKCFFEKDYQYETGWGCNPGDCPGVGTGCTACGVYLDKLKSVGKAYKII
SLKYTRKVCIQLGTEQTCKHIDANDCLVTPSVKVCIVGTVSKLQPSDTLLFLGPLEQGGIILKQWCTTSCAFGDPGDIMS
TPSGMRCPEHTGSFRKICGFATTPVCEYQGNTISGYKRMMATKDSFQSFNLTEPHITTNKLEWIDPDGNTRDHVNLVLNR
DVSFQDLSDNPCKVDLHTQAIEGAWGSGVGFTLTCTVGLTECPSFMTSIKACDLAMCYGSTVTNLARGSNTVKVVGKGGH
SGSSFKCCHDTDCSSEGLLASAPHLERVTGFNQIDSDKVYDDGAPPCTFKCWFTKSGEWLLGILNGNWIVVVVLVVILIL
SIIMFSVLCPRRGHKKTV
>L7V0S7 ~~~GP~~~Envelopment polyprotein~~~
MMFSRVMQLALICAVTCEDNPCLWERFTNSRDIEFMIPVVNLSTSRRLSMSQRICMVSMGKHWSRIFSEGEEDRGMKDLD
PLLMSSLNWRGTAKTRSSNSFNFDILDGIFLGFLDLVKWGEEADRHTPIHPECIKSKVCGFMTASGPRIKTCTGKFRGAD
RHGHCTNRATPHEATNVISVGVQHAQEANQVDEHEARYISEARKSINPEICSIDGVEINQCDLASPGRWLMLHYASFRLQ
EGSLVYLSPGLNIKWSQINVPASDFYCINVSDHLNTHYRPCEVNCTDNCQGDELYCSVHQCARSAECKCSFIGSRGMAEV
QIGDRWFKPAVVGSQQFFVKEDVPVLQQPSADCTTCSMTCTAEGIAISSIKDELKDVTVCVEGFCSTRVSKGSKVWKIEF
HNQYPSSGSVALARGTTVSGETFELTAECGRRTGCEQINCLFCREMLSNPQCYPYGKWFLLFLILATLYIIVALLKTIMR
IFMACLSVLYGPFIIIIKISRCLGRLGKRKGERTYVRLMEALDDERKPEVVRAPVSLGRTKQPRIVLFIVLALLVHMALC
CDESRLVEETSVTCNPGPDNIFSCSTKEMITVKELRAGKTICVSLKGPGGSLSSPIKIKMLDIVGRSDLLDIYFTFNGHA
NCKSVRRCRWAGSCGNSGCLGVGKEDYDRELGDQESSLHPNWRDCYDGCGGAACGCFNAAPSCIFLKRYVTNADSRVFKV
FKPSAWFLSTKIVVETTSHKEDVTLKSGEAKVIDKVSFHYRTDKNLFAGMSIPPIVTEVKREGKPLSFFLENQGQHPKCK
DENSARTSSASNCIVDQNTISANVRVDDVSCRSNLVSISGMSTLKPLPQRVGDFLIQLHNDEPVLLATGDSGVVEGELQI
DLSHKKISIKVDTTVCRGTVKELKGCVGCTKGAFASLEIHSTSAGSASLQCSLSSCYMEVQKGVNNVNCSLRFSKAVVEE
TCVLACSGSKEQLSIKGNLIIGGDFKKLTEDSATSFSHTDSKDTRIHLQTGLMNWLDTLFGASLLGKILGIGLAILSPFI
LILILRWILRVVLRRSRIRREPKYEMAKYS
>Q8JPR1 ~~~GP~~~Envelopment polyprotein~~~
MICILVLITVAAASPVYQRCFQDGAIVKQNPSKEAVTEVCLKDDVSMIKTEARYVRNATGVFSNNVAIRKWLVSDWHDCR
PKKIVGGHINVIEVGDDLSLHTESYVCSADCTIGVDKETAQVRLQTDTTNHFEIAGTTVKSGWFKSTTYITLDQTCEHLK
VSCGPKSVQFHACFNQHMSCVRFLHRTILPGSIANSICQNIEIIILVTLTLLIFILLSILSKTYICYLLMPIFIPIAYIY
GIIYNKSCKKCKLCGLVYHPFTECGTHCVCGARYDTSDRMKLHRASGLCPGYKSLRAARVMCKSKGPASILSIITAVLVL
TFVTPINSMVLGESKETFELEDLPDDMLEMASRINSYYLTCILNYAVSWGLVIIGLLIGLLFKKYQHRFLNVYAMYCEEC
DMYHDKSGLKRHGDFTNKCRQCTCGQYEDAAGLMAHRKTYNCLVQYKAKWMMNFLIIYIFLILIKDSAIVVQAAGTDFTT
CLETESINWNCTGPFLNLGNCQKQQKKEPYTNIATQLKGLKAISVLDVPIITGIPDDIAGALRYIEEKEDFHVQLTIEYA
MLSKYCDYYTQFSDNSGYSQTTWRVYLRSHDFEACILYPNQHFCRCVKNGEKCSSSNWDFANEMKDYYSGKQTKFDKDLN
LALTALHHAFRGTSSAYIATMLSKKSNDDLIAYTNKIKTKFPGNALLKAIIDYIAYMKSLPGMANFKYDEFWDELLYKPN
PAKASNLARGKESSYNFKLAISSKSIKTCKNVKDVACLSPRSGAIYASIIACGEPNGPSVYRKPSGGVFQSSTDRSIYCL
LDSHCLEEFEAIGQEELDAVKKSKCWEIEYPDVKLIQEGDGTKSCRMKDSGNCNVATNRWPVIQCENDKFYYSELQKDYD
KAQDIGHYCLSPGCTTVRYPINPKHISNCNWQVSRSSIAKIDVHNIEDIEQYKKAITQKLQTSLSLFKYAKTKNLPHIKP
IYKYITIEGTETAEGIESAYIESEVPALAGTSIGFKINSKEGKHLLDVIAYVKSASYSSVYTKLYSTGPTSGINTKHDEL
CTGPCPANINHQVGWLTFARERTSSWGCEEFGCLAVSDGCVFGSCQDIIKEELSVYRKETEEVTDVELCLTFSDKTYCTN
LNPVTPIITDLFEVQFKTVETYSLPRIVAVQNHEIKIGQINDLGVYSKGCGNVQKVNGTIYGNGVPRFDYLCHLASRKEV
IVRKCFDNDYQACKFLQSPASYRLEEDSGTVTIIDYKKILGTIKMKAILGDVKYKTFADSVDITAEGSCTGCINCFENIH
CELTLHTTIEASCPIKSSCTVFHDRILVTPNEHKYALKMVCTEKPGNTLTIKVCNTKVEASMALVDAKPIIELAPVDQTA
YIREKDERCKTWMCRVRDEGLQVILEPFKNLFGSYIGIFYTFIISIVVLLVIIYVLLPICFKLRDTLRKHEDAYKREMKI
R
>P04875 ~~~GP~~~Envelopment polyprotein~~~
MICILILFAVTAASPVYQRCFQDGAIVKQNPSKEAVTEVCLKDDVSMIKTEARYIKNATGVFSNNVAIRKWLVSDWHDCR
PKKITGGHINVIEVGDDLSLHTESYVCSADCTIGVDKETAQVRLQTDTTNHFEIAGTIVKSGWFKSTTYITLDQTCEHLK
VSCGPKSIQFHACFNQHMSCVRFLHRTILPGSIANSICQNIEIIILVTLTLLIFILLSVLSKTYICYLLMPVFIPIAYAY
GIIYNKSCKKCKLCGLVYHPFTECGTHCVCGARYDTSDRMKLHRASGLCPGYKSLRAARVMCKSKGPASILSVITAILIL
TFVTPINSMVVGESKEVFELEQLPDDMLDMALRINFYYFVCIMNYAVTWGLIIIGLLIGLLFKKYQHRFSNLYAMYCEEC
DMYHDRSGLKRNGDFTNKCRQCTCGQYEDATGLMTHRKTYNCLVRYKAKWVMNFLIAYMLLTLIKDSAIVVQAAGTDFTT
CLETENINWNCTGPFLNLGNCQKQQKKEPYANIATQLKGLQAISVLDMPMIASIPEDIAGALRYIEEKETFHVQLTAEYA
MLSRYCDYYAQFSDNSGYSQTTWRVYLRSHDFDACILYPNQHFCRCVKRGDKCSSSNGDFANEMKNYYSGKQNKFDKDLN
LALMALHHAFRGTSSAYIATMLSKKSNDDLIAYTNKIKEKFPGNALLKAIVDYIAYMKSLSEMSSFKYDEFWDDLLYKSA
PTKAPSLSRGSEPSYNFKLVVSSRSIKSCKNVKSVVCLSPRSGVSYDSIIACGDPNGPSVYRKPSDGVFQSNADQSTYCL
ADSHCLEDFEVVSQEELDAIKKSKCWEAEYPDVKLSKLTDGVKSCRMKDSGNCNVAANRWPIIQCENDKFYYSELQKDYD
KTQDIGHFCLSPGCSTVRFPINPKHISNCNWQVSRSSIAKIDVHNIEDIDQYRKAITQKLQTSLSLFKYAKTKNLPHIKP
IYKYITIEGTETAEGIESAYIESEIPALAGTSIGFKITSKEGKHLLDVIGYVKSASCSSIYTKLYTTGPTSGINTKHDEL
CTGPCPAKINHQTGWLTFAKERTSSWGCEEFGCLAISDGCVFGSCQDIIRDELTVYRKETDEVTDVELCLTFSDKTYCTN
LNPITPIITDLFEVQFKTVETYSLPRIVAIQNHEIKIGQVNDLGVYSKGCGNVQKVNGTVYGNGVPKFDYLCHLASRKEV
IVRKCFDNDYQACKFLQSPASYRLEEDSGTVTVIDYKKILGTIKMKAILGDVKYKTFADNVDMTAEGSCTGCINCFENIH
CELTLHTTIEASCPIVSTCTVFHDRILVTPNEHKYALKVVCTEKPGNTLTIRICNTKVEASLALVDAKPILELAPVDQTA
YIREKDERCKTWMCRVRDEGLQVILEPFKNLFGSYIGIFYTFIISIIALLIIIYIVLPICFKLRDTLRKHEDAYKREMKI
R
>P04505 ~~~GP~~~Envelopment polyprotein~~~
MRILILLLAVTQLAVSSPVITRCFHGGQLIAERKSQTSISEFCIKDDVSMLKSEIVYTKNDTGIFGHSKVFRHWTITDWK
ACNPVVTAGGSINVIEVDKNLNLVTRNYVCTGDCTITVDRKNAQIIFQTDKLNHFEVTGTTISTGWFKSKASVTLDRTCE
HIKVSCGKKTLQFHACFKQHMSCVRFLHRSILPGSMAISICQNIELIIITILALCIFIIMIILTKTYICYVLIPVFMPIA
FAYGWAYNRSCKKCTCCGLAYHPFTNCGSYCVCGSKFETSDRMRMHRESGLCQGFKSLRVARRLCKSKGSSLIISILLSV
LILSFVTPIEGTLTNYPTDQKYTLDEIADVLQAKTHEDSTKYYIILYTSLFGAGLTIIFAGVALGLTIILEVLTKINVIF
CNECNMYHSKKSIKYVGDFTNKCGFCTCGLLEDPEGVVVHKAKKSCTYSYQINWVRGIMIFVAFLFVIQNTIIMVAAEED
CWKNEELKEDCVGPLIAPKDCTDKDHKTYLSEASLLATAKKITQVDAENVEILGKTMESAIRVIERQKTYHRMHLLEAVF
LNKHCDYYKMFEHNSGYSQVKWRMMIKTQHFDICALQANSPFCAQCIADNSCAQGSWEFDTHMNSTYSSKVDNFKHDFSL
FLRIFEAAFPGTAYVHLLTNIKEKKPYQAVSMIEKIKKKFPNNKLLIGYLDFGKYLLGLSHASTYELQQRQLDKLYQPTE
LTRSGGQQTSLANSVVGQATKECKKYKDVSCLSPRFGIPLEDLISCCDQPNYNIYKKPKKVYKAHDKEETWCINDQHCLV
DFVPAEADTVEKLKPMKCWLVDPGKNDDVYSIAIKTCRVVDKGVCTVNSQKWNIIKCDSGPLYYSDHIPGEDTGNDIGHY
CVSAGCKTDRYPINPDVVTDCVWEFTSRKSQYIGKISMQSLEDYEKALTDRLTHTLETYSFAPLENLPHIKPVYKYITAQ
GVENSDGIEGAFITASIPAAGGTSIGYNVRSKDGFPLLDLIVFVKSAVIKSTYNHIYDTGPTISINTKHDEHCTGQCPSN
IEHEANWLTFSQERTSRWGCEEFGCLAVNTGCVFGSCQDVIRPETKVYRKAVDEVVILTVCITYPGHTFCTEINAIEPKI
TEEIELQFKTVDTKTLPYIVAVNNHKLYSGQINDLGTFGQMCGNVQKTNSSILGTGTPKFDYTCHGASRKDIIVRRCYNN
NFDSCKLLKEETQLIFNDDHDTITVYNTNHLIGELAIKLILGDIQYKLFTETLDLQIDAKCVGCPDCFESYSCNFQIVSN
IDTICSLEGPCDTFHNRISIKAMQQNYAVKLSCQKDPRPSGTFKICNREYTVVFHTVAKDDKIEINVGDQTSFIKEKDDR
CKTWLCRVRDEGISVIFEPIKAFFGSYFSIFFYIIVVVVVGFLIIYIFMPMFMKLKEVLKANEKLYLQEIKQK
>Q8JSZ3 ~~~GP~~~Envelopment polyprotein~~~
MHISLMYAILCLQLCGLGETHGSHNETRHNKTDTMTTPGDNPSSEPPVSTALSITLDPSTVTPTTPASGLEGSGEVYTSP
PITTGSLPLSETTPELPVTTGTDTLSAGDVDPSTQTAGGTSAPTVRTSLPNSPSTPSTPQDTHHPVRNLLSVTSPGPDET
STPSGTGKESSATSSPHPVSNRPPTPPATAQGPTENDSHNATEHPESLTQSATPGLMTSPTQIVHPQSATPITVQDTHPS
PTNRSKRNLKMEIILTLSQGLKKYYGKILRLLQLTLEEDTEGLLEWCKRNLGLDCDDTFFQKRIEEFFITGEGHFNEVLQ
FRTPGTLSTTESTPAGLPTAEPFKSYFAKGFLSIDSGYYSAKCYSGTSNSGLQLINITRHSTRIVDTPGPKITNLKTINC
INLKASIFKEHREVEINVLLPQVAVNLSNCHVVIKSHVCDYSLDIDGAVRLPHIYHEGVFIPGTYKIVIDKKNKLNDRCT
LFTDCVIKGREVRKGQSVLRQYKTEIRIGKASTGSRRLLSEEPSDDCISRTQLLRTETAEIHGDNYGGPGDKITICNGST
IVDQRLGSELGCYTINRVRSFKLCENSATGKNCEIDSVPVKCRQGYCLRITQEGRGHVKLSRGSEVVLDACDTSCEIMIP
KGTGDILVDCSGGQQHFLKDNLIDLGCPKIPLLGKMAIYICRMSNHPKTTMAFLFWFSFGYVITCILCKAIFYLLIIVGT
LGKRLKQYRELKPQTCTICETTPVNAIDAEMHDLNCSYNICPYCASRLTSDGLARHVIQCPKRKEKVEETELYLNLERIP
WVVRKLLQVSESTGVALKRSSWLIVLLVLFTVSLSPVQSAPIGQGKTIEAYRAREGYTSICLFVLGSILFIVSCLMKGLV
DSVGNSFFPGLSICKTCSISSINGFEIESHKCYCSLFCCPYCRHCSTDKEIHKLHLSICKKRKKGSNVMLAVCKLMCFRA
TMEVSNRALFIRSIINTTFVLCILILAVCVVSTSAVEMENLPAGTWEREEDLTNFCHQECQVTETECLCPYEALVLRKPL
FLDSTAKGMKNLLNSTSLETSLSIEAPWGAINVQSTYKPTVSTANIALSWSSVEHRGNKILVSGRSESIMKLEERTGISW
DLGVEDASESKLLTVSVMDLSQMYSPVFEYLSGDRQVGEWPKATCTGDCPERCGCTSSTCLHKEWPHSRNWRCNPTWCWG
VGTGCTCCGLDVKDLFTDYMFVKWKVEYIKTEAIVCVELTSQERQCSLIEAGTRFNLGPVTITLSEPRNIQQKLPPEIIT
LHPRIEEGFFDLMHVQKVLSASTVCKLQSCTHGVPGDLQVYHIGNLLKGDKVNGHLIHKIEPHFNTSWMSWDGCDLDYYC
NMGDWPSCTYTGVTQHNHASFVNLLNIETDYTKNFHFHSKRVTAHGDTPQLDLKARPTYGAGEITVLVEVADMELHTKKI
EISGLKFASLACTGCYACSSGISCKVRIHVDEPDELTVHVKSDDPDVVAASSSLMARKLEFGTDSTFKAFSAMPKTSLCF
YIVEREHCKSCSEEDTKKCVNTKLEQPQSILIEHKGTIIGKQNSTCTAKASCWLESVKSFFYGLKNMLSGIFGNVFMGIF
LFLAPFILLILFFMFGWRILFCFKCCRRTRGLFKYRHLKDDEETGYRRIIEKLNNKKGKNKLLDGERLADRRIAELFSTK
THIG
>Q02004 ~~~GP~~~Envelopment polyprotein~~~
MSKRVLIIAVVVYLVFTTQNQITGNHTTINSSSPSTTEASSTPTVSRTPQTTTTSTAVSTTITATTTPTASWTTQSQYFN
KTTQHHWREETMISRNPTVLDRQSRASSVRELLNTKFLMLLGFIPKGEVNHLENACNREGKNCTELILKERIARFFSETE
KESCYNTYLEKHLRSVSPEVSLTPYRVLGLREDILLKEIDRRIIRFETDSQRVTCLSASLLKPDVFIREQRIDAKPSNGP
KIVPVDSVACMNLEANVDVRSNKLVIQSLMTTVKISLKNCKVVVNSRQCIHQQTGSGVIKVPKFEKQQGGTWSSYIAGVY
TATIDLLDENNQNCKLFTECIVKGRELVKGQSELKSFNIEVLLPRVMKTRRKLLAVTDGSTECNSGTQLIEGKSIEVHKQ
DIGGPGKKLTICNGTSVLDVPLDEGHGCYTINVITSKRACRPKNSKLQCSIDKELKPCDSGKCLSISQKGAGHIKVSRGK
TILITECKEHCQIPVPTGKGDIMVDCSGGRQHYLEVNIVDIHCPNTKFLGGIMLYFCRMSSRPTVALLLGIWIGCGYILT
CIFSFLLYHLILFFANCIKQCRKKGERLGEICVKCEQQTVNLMDQELHDLNCNFNLCPYCCNRMSDEGMSRHVGKCPKRL
ERLNEIELYLTTSECLCLSVCYQLLISVGIFLKRTTWLVVLLVLLGLAISPVQGAPTEVSNVKQDGDYSICYFIFGCLVT
AALLLKVKRTNSNGIVVVVDSFGRCPYCNEFTDSLFEEVLHDTLCSLCVCPFCEKQALDLVTLEEHVKECYKVATRKDIF
KILGRKFTNALVRREKLFTTGLQLFINKTNVVVFALIMCFLLLLTGHNASAFDSGDLPDGVWEESSQLVKSCTQFCYIEE
DVCYCPAEDGVGRKLLFFNGLQNSVKRLSDSHKLLTSVSIDAPWGRINVESTWKPTLAASNIAMSWSSTDIKGEKVILSG
RSTSIIKLKEKTGVMWKLVGSGLASEKKKPFRFPIMDFAQVYNSVFQYITGDRLLSEWPKAVCTGDCPHRCGCQTSTCMA
KECHTQECVSTHMVLGIGTGCTCCGMDVERPFNKYLGVKWSTEYLRTEVLVCVEVTEEERHCEIVEAGTRFNIGPITITI
SDPQNIGSKLPESLMTVQEIDDSNFVDIMHVGNVISADNSCRLQSCTHGSAVTTRFTALTALIKDDHSSGLNLAVLDPKV
NSSWLSWEGCDMDYYCNVGDWPTCTYTGVVTQKLREFLKLDQHRKRLHTTLSFSLKKNLSKRSHTSVRLEGKTVTRMEVK
VTALIEVDGMELHSKTIRLSGIRLTGLKCSGCFSCTSGISCSVNAKLTSPDEFTLHLRSTSPNVVVAETSIIARKGPSAT
TSRFKVFSVRDTKKICFEVVEREYCKDCTPDELTTCTGVELEPTKDILLEHRGTIVQHQNDTCKSKIDCWSNSISSFASG
IGDFFKHYIGSIAVGVLGTVLPFALLILFFIYGDKMLWPFKVFCRPCRRCCRKNEGYNKLAEEEELRDIIRKFSKSGELI
NKDAKDKRTLARLFMSDNPKLKKEKKLSEIA
>P28728 ~~~GP~~~Envelopment polyprotein~~~
MWSLLLLAALVGQGFALKNVFDMRIQCPHSVNFGETSVSGYTELPPLSLQEAEQLVPESSCNMDNHQSLSTINKLTKVIW
RKKANQESANQNSFEVVESEVSFKGLCMLKHRMVEESYRNRRSVIYYDLAGNSTFCKPTVYMIVPIHACNMMKSCLIGLG
PYRIQVVYERTYCTTGILTEGKCFVPDKAVVSALKRGMYAIASIETICFFIHQKWNKYKIVTAITSAMGSKCNNTDTKVQ
GYYICIIGGNSAPVYAPAGEDFRAMEVFSGIITSPHGEDHDLPGEEIATYHISGQIEAKIPHTVSSKNLRLAAFAGIPSY
SSTSILAASEDGRFIFSPGLFPNLNQSVCDNNALPLIWRGLIDLTGYYEAVHPCNVFCVLSGPGASCEAFSEGGIFNITS
PMCLVSKQNRFRAAEQQISFVCQRVDMDIIVYCNGQKKTILTKTLVIGQCIYTITSLFSLLPGVAHSIAIELCVPGFHGW
ATAALLITFCFGWVLIPACTLAILLVLKFFANILHTSNQENRFKAILRKIKEEFEKRKGSMVCEICKYECETLKELKAHN
LSCVQGECPYCFTHCEPTETAIQAHYKVCQATHRFREDLKKTVTPQNIGPGCYRTLNLFRYKSRCYILTMWTLLLIIESI
LWAASAAEIPLVPLWTDNAHGVGSVPMHTDLELDFSLPSSSKYTYKRHLTNPVNDQQSVSLHIEIESQGIGADVHHLGHW
YDARLNLKTSFHCYGACTKYQYPWHTAKCHFEKDYEYENSWACNPPDCPGVGTGCTACGLYLDQLKPVGTAFKIISVRYS
RKVCVQFGEEHLCKTIDMNDCFVTRHAKICIIGTVSKFSQGDTLLFLGPMEGGGIIFKHWCTSTCHFGDPRDVMGPKDKP
FICPEFPGQFRKKCNFATTPVCEYDGNIISGYKKVLATIDSFQSFNTSNIHFTDERIEWRDPDGMLRDHINIVISKDIDF
ENLAENPCKVGLQAANIEGAWGSGVGFTLTCQVSLTECPTFLTSIKACDMAICYGAESVTLSRGQNTVKITGKGGHSGSS
FKCCHGKECSSTGLQASAPHLDKVNGISELENEKVYDDGAPECGVTCWFKKSGEWVMGIINGNWVVLIVLCVLLLFSLIL
LSILCPVRKHKKS
>P08668 ~~~GP~~~Envelopment polyprotein~~~
MGIWKWLVMASLVWPVLTLRNVYDMKIECPHTVSFGENSVIGYVELPPVPLADTAQMVPESSCNMDNHQSLNTITKYTQV
SWRGKADQSQSSQNSFETVSTEVDLKGTCVLKHKMVEESYRSRKSVTCYDLSCNSTYCKPTLYMIVPIHACNMMKSCLIA
LGPYRVQVVYERSYCMTGVLIEGKCFVPDQSVVSIIKHGIFDIASVHIVCFFVAVKGNTYKIFEQVKKSFESTCNDTENK
VQGYYICIVGGNSAPIYVPTLDDFRSMEAFTGIFRSPHGEDHDLAGEEIASYSIVGPANAKVPHSASSDTLSLIAYSGIP
SYSSLSILTSSTEAKHVFSPGLFPKLNHTNCDKSAIPLIWTGMIDLPGYYEAVHPCTVFCVLSGPGASCEAFSEGGIFNI
TSPMCLVSKQNRFRLTEQQVNFVCQRVDMDIVVYCNGQRKVILTKTLVIGQCIYTITSLFSLLPGVAHSIAVELCVPGFH
GWATAALLVTFCFGWVLIPAITFIILTVLKFIANIFHTSNQENRLKSVLRKIKEEFEKTKGSMVCDVCKYECETYKELKA
HGVSCPQSQCPYCFTHCEPTEAAFQAHYKVCQVTHRFRDDLKKTVTPQNFTPGCYRTLNLFRYKSRCYIFTMWIFLLVLE
SILWAASASETPLTPVWNDNAHGVGSVPMHTDLELDFSLTSSSKYTYRRKLTNPLEEAQSIDLHIEIEEQTIGVDVHALG
HWFDGRLNLKTSFHCYGACTKYEYPWHTAKCHYERDYQYETSWGCNPSDCPGVGTGCTACGLYLDQLKPVGSAYKIITIR
YSRRVCVQFGEENLCKIIDMNDCFVSRHVKVCIIGTVSKFSQGDTLLFFGPLEGGGLIFKHWCTSTCQFGDPGDIMSPRD
KGFLCPEFPGSFRKKCNFATTPICEYDGNMVSGYKKVMATIDSFQSFNTSTMHFTDERIEWKDPDGMLRDHINILVTKDI
DFDNLGENPCKIGLQTSSIEGAWGSGVGFTLTCLVSLTECPTFLTSIKACDKAICYGAESVTLTRGQNTVKVSGKGGHSG
STFRCCHGEDCSQIGLHAAAPHLDKVNGISEIENSKVYDDGAPQCGIKCWFVKSGEWISGIFSGNWIVLIVLCVFLLFSL
VLLSILCPVRKHKKS
>J3WAX0 ~~~GP~~~Envelopment polyprotein~~~
MIVPIVLFLTLCPSELSAWGSPGDPIVCGVRTETNKSIQIEWKEGRSEKLCQIDRLGHVTSWLRNHSSFQGLIGQVKGRP
SVSYFPEGASYPRWSGLLSPCDAEWLGLIAVSKAGDTDMIVPGPTYKGKIFVERPTYNGYKGWGCADGKSLSHSGTYCET
DSSVSSGLIQGDRVLWVGEVVCQRGTPVPEDVFSELVSLSQSEFPDVCKIDGVALNQCEQESIPQPLDVAWIDVGRSHKV
LMREHKTKWVQESSAKDFVCFKVGQGPCSKQEEDDCMSKGNCHGDEVFCRMAGCSARMQDNQEGCRCELLQKPGEIIVNY
GGVSVRPTCYGFSRMMATLEVHKPDRELTGCTGCHLECIEGGVKIVTLTSELRSATVCASHFCASAKGGSKTTDILFHTG
ALVGPNSIRITGQLLDGSKFSFDGHCIFPDGCMALDCTFCKEFLRNPQCYPVKKWLFLVVVIMCCYCALMLLTNILRAIG
VWGTWVFAPIKLALALGLRLAKLSKKGLVAVVTRGQMIVNDELHQVRVERGEQNEGRQGYGPRGPIRHWLYSPALILILT
TSICSGCDELVHAESKSITCKSASGNEKECSVTGRALLPAVNPGQEACLHFSVPGSPDSKCLKIKVKSINLRCKQASSYY
VPEAKARCTSVRRCRWAGDCQSGCPTYFSSNSFSDDWANRMDRAGLGMSGCSDGCGGAACGCFNAAPSCIFWRKWVENPS
NRVWKVSPCASWVLAATIELTLPSGEVKTLEPVTGQATQMFKGVAITYLGSSIEIVGMTRLCEMKEMGTGIMALAPCNDP
GHAIMGNVGEIQCSSIESAKHIRSDGCIWNADLVGIELRVDDAVCFSKLTSVEAVANFSKIPATISGVRFDQGNHGESRI
YGSPLDITRVSGEFSVSFRGMRLKLSEISASCTGEITNVSGCYSCMTGASVSIKLHSSKNTTGHLKCDSDETAFSVMEGT
HTYRPHMSFDKAVIDEECVLNCGGHSSKLLLKGSLVFMDVPRFVDGSYVQTYHSKVPAGGRVPNPVDWLNALFGDGITRW
ILGIIGVLLACVMLFVVVVAITRRLIKGLTQRAKVA
>Q83887 ~~~GP~~~Envelopment polyprotein~~~
MVGWVCISLVVLATTTAGLTRNLYELKIECPHTVGLGQGYVTGSVETTPILLTQVTDLKIESSCNFDLHVPSTSIQKYNQ
VEWAKKSSTTESTSAGATTFEAKTKEVSLKGTCNIPVTTFEAAYKSRKTVICYDLACNQTHCLPTVHLIAPVQTCMSVRS
CMIGLLSSRIQVIYEKTYCVTGQLVEGLCFIPTHTIALTQPGHTYDTMTLPITCFLVAKKLGTQLKIAVELEKLITASGC
TENSFQGYYICFLGKHSEPLFVPMMDDYRSAELFTRMVLNPRGEDHDPDQNGQGLMRIAGPITAKVPSTETTETMQGIAF
AGAPMYSSFSTLVRKADPDYVFSPGIIAESNHSVCDKKTIPLTWTGFLAVSGEIEKITGCTVFCTLVGPGASCEAYSETG
IFNISSPTCLVNKVQKFRGSEQRINFMCQRVDQDVIVYCNGQKKVILTKTLVIGQCIYTFTSLFSLIPGVAHSLAVELCV
PGLHGWATTALLITFCFGWLLIPTITMIILKILRLLTFSCSHYSTESKFKAILERVKVEYQKTMGSMVCDVCHHECETAK
ELETHKKSCPEGQCPYCMTMTESTESALQAHFSICKLTNRFQENLKKSLKRPEVKQGCYRTLGVFRYKSRCYVGLVWGVL
LTTELIVWAASADTPLMESGWSDTAHGVGIVPMKTDLELDFALASSSSYSYRRKLVNPANKEETLPFHFQLDKQVVHAEI
QNLGHWMDGTFNIKTAFHCYGECKKYAYPWQTAKCFFEKDYQYETSWGCNPPDCPGVGTGCTACGVYLDKLRSVGKAYKI
VSLKFTRKVCIQLGTEQTCKHIDVNDCLVTPSVKVCLIGTISKLQPGDTLLFLGPLEQGGIILKQWCTTSCVFGDPGDIM
STTTGMKCPEHTGSFRKICGFATTPTCEYQGNTISGFQRMMATRDSFQSFNVTEPHITSNRLEWIDPDSSIKDHINMVLN
RDVSFQDLSDNPCKVDLHTQSIDGAWGSGVGFTLVCTVGLTECANFITSIKACDSAMCYGATVTNLLRGSNTVKVVGKGG
HSGSLFKCCHDTDCTEEGLAASPPHLDRVTGYNQIDSDKVYDDGAPPCTIKCWFTKSGEWLLGILNGNWVVVAVLIVILI
LSILLFSFFCPIRGRKNKSN
>P27315 ~~~GP~~~Envelopment polyprotein~~~
MSKFCLCLSLLGVLLLQVCDTRSLLELKIECPHTVGLGQGLVIGTVDLNPVPVESVSTLKLESSCNFDVHTSSATQQAVT
KWTWEKKADTAETAKAASTTFQSKSTELNLRGLCVIPTLVLETANKLRKTVTCYDLSCNQTACIPTVYLIAPIHTCVTTK
SCLLGLGTQRIQVTYEKTYCVSGQLVEGTCFNPIHTMALSQPSHTYDIVTIPVRCFFIAKKTNDDTLKIEKQFETILEKS
GCTAANIKGYYVCFLGATSEPIFVPTMDDFRASQILSDMAISPHGEDHDSALSSVSTFRIAGKLSGKAPSTESSDTVQGV
AFSGHPLYTSLSVLASKEDPVYIWSPGIIPERNHTVCDKKTLPLTWTGYLPLPGGIEKTTQCTIFCTLAGPGADCEAYSD
TGIFNISSPTCLINRVQRFRGAEQQIKFVCQRVDLDIVVYCNGMKKVILTKTLVIGQCIYTFTSVFSLMPGIAHSLAVEL
CVPGIHGWSTIALLATFCFGWLLIPIISLVSIKIMLLFAYMCSKYSNDSKFRLLIEKVKQEYQKTMGSMVCEVCQQECEM
AKELESHKKSCPNGMCPYCMNPTESTESALQAHFKVCKLTTRFQENLRKSLNPYEPKRGCYRTLSVFRYRSRCFVGLVWC
ILLVLELVIWAASADTVEIKTGWTDTAHGAGVIPLKSDLELDFSLPSSATYIYRRDLQNPANEQERIPFHFQLQRQVIHA
EIQNLGHWMDGTFNLKTSFHCYGACEKYAYPWQTAKCFLEKDYEFETGWGCNPGDCPGVGTGCTACGVYLDKLRSVGKVF
KVISLKFTRRVCIQLGSEQSCKTIDSNDCLMTTSVKVCMIGTVSKFQPGDTLLFLGPLEEGGIIFKQWCTTTCHFGDPGD
IMSTPQGMQCPEHTGAFRKKCAFATMPTCEYDGNTLSGYQRMLATRDSFQSFNITEPHITSNSLEWVDPDSSLKDHINLV
VNRDVSFQDLSENPCQVGVAVSSIDGAWGSGVGFNLVCSVSLTECASFLTSIKACDAAMCYGATTANLVRGQNTVHILGK
GGHSGSKFMCCHSTECSSTGLTAAAPHLDRVTGYNVIDNDKVFDDGSPECGVHCWFKKSGEWLMGILSGNWMVVAVLVVL
LILSIFLFSLCCPRRVVHKKSS
>P41266 ~~~GP~~~Envelopment polyprotein~~~
MGELSPVCLCLLLQGLLLCNTGAARNLNELKMECPHTIRLGQGLVVGSVELPSLPIQQVETLKLESSCNFDLHTSTAGQQ
SFTKWTWEIKGDLAENTQASSTSFQTKSSEVNLRGLCLIPTLVVETAARMRKTIACYDLSCNQTVCQPTVYLMGPIQTCI
TTKSCLLSLGDQRIQVNYEKTYCVSGQLVEGICFNPIHTMALSQPSHTYDIMTMMVRCFLVIKKVTSGDSMKIEKNFETL
VQKNGCTANNFQGYYICLIGSSSEPLYVPALDDYRSAEVLSRMAFAPHGEDHDIEKNAVSAMRIAGKVTGKAPSTESSDT
VQGIAFSGSPLYTSTGVLTSKDDPVYIWAPGIIMEGNHSICEKKTLPLTWTGFISLPGEIEKTTQCTVFCTLAGPGADCE
AYSETGIFNISSPTCLINRVQRFRGSEQQIKFVCQRVDMDITVYCNGMKKVILTKTLVIGQCIYTFTSIFSLIPGVAHSL
AVELCVPGLHGWATMLLLLTFCFGWVLIPTITMILLKILIAFAYLCSKYNTDSKFRILIEKVKREYQKTMGSMVCEVCQY
ECETAKELESHRKSCSIGSCPYCLNPSEATTSALQAHFKVCKLTSRFQENLRKSLTVYEPMQGCYRTLSLFRYRSRFFVG
LVWCVLLVLELIVWAASAETQNLNAGWTDTAHGSGIIPMKTDLELDFSLPSSASYTYRRQLQNPANEQEKIPFHLQLSKQ
VIHAEIQHLGHWMDATFNLKTAFHCYGSCEKYAYPWQTAGCFIEKDYEYETGWGCNPPDCPGVGTGCTACGVYLDKLKSV
GKVFKIVSLRYTRKVCIQLGTEQTCKTVDSNDCLITTSVKVCLIGTISKFQPSDTLLFLGPLQQGGLIFKQWCTTTCQFG
DPGDIMSTPTGMKCPELNGSFRKKCAFATTPVCQFDGNTISGYKRMIATKDSFQSFNVTEPHISTSALEWIDPDSSLRDH
INVIVSRDLSFQDLSETPCQIDLATASIDGAWGSGVGFNLVCTVSLTECSAFLTSIKACDAAMCYGSTTANLVRGQNTIH
IVGKGGHSGSKFMCCHDTKCSSTGLVAAAPHLDRVTGYNQADSDKIFDDGAPECGMSCWFKKSGEWILGVLNGNWMVVAV
LVVLLILSILLFTLCCPRRPSYRKEHKP
>P27312 ~~~GP~~~Envelopment polyprotein~~~
MGKSSPVCLYLILQGLLLFDTVNAKNLNELKMECPHTIGLGQGLVVGSVELPPVPIQQIESLKLESSCNFDLHTSTAGQQ
SFTKWTWETKGDLAENTQASSTSFQTKSSEVNLRGLCLIPTLVVETAARMRKTIACYDLSCNQTVCQPTVYLMGPIQTCL
TTKSCLLGLGDQRIQVNYERTYCVSGQLVEGVCFNPIHTMALSQPSHTYDIVTIMVRCFLVIKKVTSGDSMKIEKNFETL
VQKTGCTANGFQGYYICLIGSSSEPLYVPTLDDYRSAEVLSRMAFAPHGEDHDIEKNAVSALRIAGKVTGKAPSTESSDT
VQGIAFSGSPLYTSTGVLTAKDDPVYVWAPGIIMEGNHSVCEKKTLPLTWTGFIPLPGEIEKTTQCTVFCTLAGPGADCE
AYSETGIFNISSPTCLINRVQRFRGAEQQIKFVCQRVDMDITVYCNGVKKVILTKTLVIGQCIYTFTSIFSMIPGIAHSL
AVELCVPGLHGWATVLLLLTFCFGWVLIPTITMILLKILIAFAYLCSKYNTDSKFRILVEKVKKEYQKTMGSMVCEVCQY
ECETAKELESHRKSCSIGSCPYCLNPSEATPSALQAHFKVCKLTSRFQENLKKSLTMYEPMQGCYRTLSLFRYRSRFFVG
LVWCMLLVLELIVWAASAETQNLNDGWTDTAHGSGIIPMKADLELDFSLPSSASYTYRRQLQNPANEQEKIPFHLQISKQ
VIHAEIQHLGHWMDATFNLKTAFHCYGSCEKYAYPWQTAGCFVEKDYEYETGWGCNPPDCPGVGTGCTACGVYLDKLKSV
GKVFKIVSLRYTRKVCIQLGTGQTCKTVDSNDCLITTSVKVCLIGTISKFQPSDTLLFLGPLQQGGLIFKQWCTTTCQFG
DPGDIMSTPTGMKCPELNGSFRKKCAFATTPVCQFDGNTISGYKRMVATKDSFQSFNVTEPHISTSALEWIDLDSSLRDH
INVIVSRDLSFQDLSETPCQVDLTTSATDGAWGSGVGFNLVCTVSLTECSAFLTSIKACHAAMCYGSTTTNLVRGQNTIH
VVGKGGHSGSKFMCCHDTKCSSTGLVAAAPHLDRVTGFNQADSDKIFDDGAPECGMSCWFKKLGEWVLGVLNGNWMVVAV
LIALLILSIFLFALCCPRRPSYKKDHKP
>P03518 ~~~GP~~~Envelopment polyprotein~~~
MYVLLTILISVLVCEAVIRVSLSSTREETCFGDSTNPEMIEGAWDSLREEEMPEELSCSISGIREVKTSSQELYRALKAI
IAADGLNNITCHGKDPEDKISLIKGPPHKKRVGIVRCERRRDAKQIGRETMAGIAMTVLPALAVFALAPVVFAEDPHLRN
RPGKGHNYIDGMTQEDATCKPVTYAGACSSFDVLLEKGKFPLFQSYAHHRTLLEAVHDTIIAKADPPSCDLQSAHGNPCM
KEKLVMKTHCPNDYQSAHYLNNDGKMASVKCPPKYGLTEDCNFCRQMTGASLKKGSYPLQDLFCQSSEDDGSKLKTKMKG
VCEVGVQAHKKCDGQLSTAHEVVPFAVFKNSKKVYLDKLDLKTEENLLPDSFVCFEHKGQYKGTMDSGQTKRELKSFDIS
QCPKIGGHGSKKCTGDAAFCSAYECTAQYANAYCSHANGSGIVQIQVSGVWKKPLCVGYERVVVKRELSAKPIQRVEPCT
TCITKCEPHGLVVRSTGFKISSAVACASGVCVTGSQSPSTEITLKYPGISQSSGGDIGVHMAHDDQSVSSKIVAHCPPQD
PCLVHGCIVCAHGLINYQCHTALSAFVVVFVFSSIAIICLAVLYRVLKCLKIAPRKVLNPLMWITAFIRWIYKKMVARVA
HNINQVNREIGWMEGGQLVLGNPAPIPRHAPIPRYSTYLMLLLIVSYASACSELIQASSRITTCSTEGVNTKCRLSGTAL
IRAGSVGAEACLMLKGVKEDQTKFLKIKTVSSELSCREGQSYWTGSISPKCLSSRRCHLVGECHVNRCLSWRDNETSAEF
SFVGESTTMRENKCFEQCGGWGCGCFNVNPSCLFVHTYLQSVRKEALRVFNCIDWVHKLTLEITDFDGSVSTIDLGASSS
RFTNWGSVSLSLDAEGISGSNSFSFIESPSKGYAIVDEPFSEIPRQGFLGEIRCNSESSVLSAHESCLRAPNLISYKPMI
DQLECTTNLIDPFVVFERGSLPQTRNDKTFAASKGNRGVQAFSKGSVQADLTLMFDNFEVDFVGAAVSCDAAFLNLTGCY
SCNAGARVCLSITSTGTGSLSAHNKDGSLHIVLPSENGTKDQCQILHFTVPEVEEEFMYSCDGDERPLLVKGTLIAIDPF
DDRREAGGESTVVNPKSGSWNFFDWFSGLMSWFGGPLKLYSSFACMLHYQLGSFSSLYILEEQASLKCGLLPLRRPHRSV
RVKVIC
>P21401 ~~~GP~~~Envelopment polyprotein~~~
MYVLLTILTSVLVCEAIIRVSLSSTREETCFGDSTNPEMIEGAWDSLREEEMPEELSCSISGIREVKTSSQELYRALKAI
IAADGLNNITCHGKDPEDKISLIKGPPHKKRVGIVRCERRRDAKQIGRKTMAGIAMTVLPALAVFALAPVVFAEDPHLRN
RPGKGHNYIDGMTQEDATCKPVTYAGACSSFDVLLEKGKFPLFQSYAHHRTLLEAVHDTIIAKADPPSCDLLSAHGNPCM
KEKLVMKTHCPNDYQSAHHLNNDGKMASVKCPPKYELTEDCNFCRQMTGASLKKGSYPLQDLFCQSSEDDGSKLKTKMKG
VCEVGVQALKKCDGQLSTAHEVVPFAVFKNSKKVYLDKLDLKTEENLLPDSFVCFEHKGQYKGTMDSGQTKRELKSFDIS
QCPKIGGHGSKKCTGDAAFCSAYECTAQYANAYCSHANGSGIVQIQVSGVWKKPLCVGYERVVVKRELSAKPIQRVEPCT
TCITKCEPHGLVVRSTGFKISSAVACASGVCVTGSQSPSTEITLKYPGISQSSGGDIGVHMAHDDQSVSSKIVAHCPPQD
PCLVHDCIVCAHGLINYQCHTALSAFVVVFVFSSIAIICLAILYRVLKCLKIAPRKVLNPLMWITAFIRWIYKKMVARVA
DNINQVNREIGWMEGGQLVLGNPAPIPRHAPIPRYSTYLMLLLIVSYASACSELIQASSRITTCSTEGVNTKCRLSGTAL
IRAGSVGAEACLMLKGVKEDQTKFLKLKTVSSELSCREGQSYWTGSFSPKCLSSRRCHLVGECHVNRCLSWRDNETSAEF
SFVGESTTMRENKCFEQCGGWGCGCFNVNPSCLFVHTYLQSVRKEALRVFNCIDWVHKLTLEITDFDGSVSTIDLGASSS
RFTNWGSVSLSLDAEGISGSNSFSFIESPGKGYAIVDEPFSEIPRQGFLGEIRCNSESSVLSAHESCLRAPNLISYKPMI
DQLECTTNLIDPFVVFERGSLPQTRNDKTFAASKGNRGVQAFSKGSVQADLTLMFDNFEVDFVGAAVSCDAAFLNLTGCY
SCNAGARVCLSITSTGTGSLSAHNKDGSLHIVLPSENGTKDQCQILHFTVPEVEEEFMYSCDGDERPLLVKGTLIAIDPF
DDRREAGGESTVVNPKSGSWNFFDWFSGLMSWFGGPLKTILLICLYVALSIGLFFLLIYLGGTGLSKMWLAATKKAS
>H2AM12 ~~~GP~~~Envelopment polyprotein~~~
MLLNIVLISNLACLAFALPLKEGTRGSRCFLNGELVKTVNTSKVVSECCVKDDISIIKSNAEHYKSGDRLAAVIKYYRLY
QVKDWHSCNPIYDDHGSFMILDIDNTGTLIPKMHTCRVECEIALNKDTGEVILNSYRINHYRISGTMHVSGWFKNKIEIP
LENTCESIEVTCGLKTLNFHACFHTHKSCTRYFKGSILPELMIESFCTNLELILLVTFILVGSVMMMILTKTYIVYVFIP
IFYPFVKLYAYMYNKYFKLCKNCLLAVHPFTNCPSTCICGMIYTTTESLKLHRMCNNCSGYKALPKTRKLCKSKISNIVL
CVITSLIFFSFITPISSQCIDIEKLPDEYITCKRELANIKSLTIDDTYSFIYSCTCIIVLILLKKAAKYILYCNCSFCGM
VHERRGLKIMDNFTNKCLSCVCAENKGLTIHRASEKCLFKFESSYNRTGLIIFMLLLVPTIVMTQETSINCKNIQSTQLT
IEHLSKCMAFYQNKTSSPVVINEIISDASVDEQELIKSLNLNCNVIDRFISESSVIETQVYYEYIKSQLCPLQVHDIFTI
NSASNIQWKALARSFTLGVCNTNPHKHICRCLESMQMCTSTKTDHAREMSIYYDGHPDRFEHDMKIILNIMRYIVPGLGR
VLLDQIKQTKDYQALRHIQGKLSPKSQSNLQLKGFLEFVDFILGANVTIEKTPQTLTTLSLIKGAHRNLDQKDPGPTPIL
VCKSPQKVVCYSPRGVTHPGDYISCKSKMYKWPSLGVYKHNRDQQQACSSDTHCLEMFEPAERTITTKICKVSDMTYSES
PYSTGIPSCNVKRFGSCNVRGHQWQIAECSNGLFYYVSAKAHSKTNDITLYCLSANCLDLRYAFRSSSCSDIVWDTSYRN
KLTPKSINHPDIENYIAALQSDIANDLTMHYFKPLKNLPAIIPQYKTMTLNGDKVSNGIRNSYIESHIPAINGLSAGINI
AMPNGESLFSIIIYVRRVINKASYRFLYETGPTIGINAKHEEVCTGKCPSPIPHQDGWVTFSKERSSNWGCEEWGCLAIN
DGCLYGSCQDIIRPEYKIYKKSSIEQKDVEVCITMAHESFCSTVDVLQPLISDRIQLDIQTIQMDSMPNIIAVKNGKVYV
GDINDLGSTAKKCGSVQLYSEGIIGSGTPKFDYVCHAFNRKDVILRRCFDNSYQSCLLLEQDNTLTIASTSHMEVHKKVS
SVGTINYKIMLGDFDYNAYSTQATVTIDEIRCGGCYGCPEGMACALKLSTNTIGSCSIKSNCDTYIKIIAVDPMQSEYSI
KLNCPLATETVSVSVCSASAYTKPSISKNQPKIVLNSLDETSYIEQHDKKCSTWLCRVYKEGISVIFQPLFGNLSFYWRL
TIYIIISLIMLILFLYILIPLCKRLKGLLEYNERIYQMENKFK
>P17880 ~~~GP~~~Envelopment polyprotein~~~
MWSLLLLAALVGQGFALKNVFDMRIQLPHSVNFGETSVSGYTEFPPLSLQEAEQLVPESSCNMDNHQSLSTINKLTKVIW
RKKANQESANQNSFEVVESEVSFKGLCMLKHRMVEESYRNRRSVICYDLACNSTFCKPTVYMIVPIHACNMMKSCLIGLG
PYRIQVVYERTYCTTGILTEGKCFVPDKAVVSALKRGMYAIASIETICFFIHQKGNTYKIVTAITSAMGSKCNNTDTKVQ
GYYICIIGGNSAPVYAPAGEDFRAMEVFSGIITSPHGEDHDLPGEEIATYQISGQIEAKIPHTVSSKNLKLTAFAGIPSY
SSTSILAASEDGRFIFSPGLFPNLNQSVCDNNALPLIWRGLIDLTGYYEAVHPCNVFCVLSGPGASCEAFSEGGIFNITS
PMCLVSKQNRFRAAEQQISFVCQRVDMDIIVYCNGQKKTILTKTLVIGQCIYTITSLFSLLPGVAHSIAIELCVPGFHGW
ATAALLITFCFGWVLIPACTLAILLVLKFFANILHTSNQENRFKAILRKIKEEFEKTKGSMVCEICKYECETLKELKAHN
LSCVQGECPYCFTHCEPTETAIQAHYKVCQATHRFREDLKKTVTPQNIGPGCYRTLNLFRYKSRCYILTMWTLLLIIESI
LWAASAAEIPLVPLWTDNAHGVGSVPMHTDLELDFSLPSSSKYTYKRHLTNPVNDQQSVSLHIEIESQGIGAAVHHLGHW
YDARLNLKTSFHCYGACTKYQYPWHTAKCHFEKDYEYENSWACNPPDCPGVGTGCTACGLYLDQLKPVGTAFKIISVRYS
RKVCVQFGEEHLCKTIDMNDCFVTRHAKICIIGTVSKFSQGDTLLFLGPMEGGGIIFKHWCTSTCHFGDPGDVMGPKDKP
FICPEFPGQFRKKCNFATTPVCEYDGNIISGYKKVLATIDSFQSFNTSNIHFTDERIEWRDPDGMLRDHINIVISKDIDF
ENLAENPCKVGLQAANIEGAWGSGVGFTLTCKVSLTECPTFLTSIKACDMAICYGAESVTLSRGQNTVKITGKGGHSGSS
FKCCHGKECSSTGLQASAPHLDKVNGISELENEKVYDDGAPECGITCWFKKSGEWVMGIINGNWVVLIVLCVLLLFSLIL
LSILCPVRKHKKS
>R4V2Q5 ~~~~~~Envelopment polyprotein~~~
MMKVIWFSSLICFVIQCSGDSGPIICAGPIHSNKSADIPHLLGYSEKICQIDRLIHVSSWLRNHSQFQGYVGQRGGRSQV
SYYPAENSYSRWSGLLSPCDADWLGMLVVKKAKGSDMIVPGPSYKGKVFFERPTFDGYVGWGCGSGKSRTESGELCSSDS
GTSSGLLPSDRVLWIGDVACQPMTPIPEETFLELKSFSQSEFPDICKIDGIVFNQCEGESLPQPFDVAWMDVGHSHKIIM
REHKTKWVQESSSKDFVCYKEGTGPCSESEEKTCKTSGSCRGDMQFCKVAGCEHGEEASEAKCRCSLVHKPGEVVVSYGG
MRVRPKCYGFSRMMATLEVNQPEQRLGQCTGCHLECINGGVRLITLTSELKSATVCASHFCSSATSGKKSTEIQFHSGSL
VGKTAIHVKGALVDGTEFTFEGSCMFPDGCDAVDCTFCREFLKNPQCYPAKKWLFIIIVILLGYAGLMLLTNVLKAIGIW
GSWVIAPVKLMFAIIKKLMRTVSCLMRKLMDRGRQVIHEEIGENREGNQDDVRIEMARPRRVRHWMYSPVILTILAIGLA
ESCDEMVHADSKLVSCRQGSGNMKECVTTGRALLPAVNPGQEACLHFTAPGSPDSKCLKIKVKRINLKCKKSSSYFVPDA
RSRCTSVRRCRWAGDCQSGCPPHFTSNSFSDDWAGKMDRAGLGFSGCSDGCGGAACGCFNAAPSCIFWRKWVENPHGIIW
KVSPCAAWVPSAVIELTMPSGEVRTFHPMSGIPTQVFKGVSVTYLGSDMEVSGLTDLCEIEELKSKKLALAPCNQAGMGV
VGKVGEIQCSSEESARTIKKDGCIWNADLVGIELRVDDAVCYSKITSVEAVANYSAIPTTIGGLRFERSHDSQGKISGSP
LDITAIRGSFSVNYRGLRLSLSEITATCTGEVTNVSGCYSCMTGAKVSIKLHSSKNSTAHVRCKGDETAFSVLEGVHSYT
VSLSFDHAVVDEQCQLNCGGHESQVTLKGNLIFLDVPKFVDGSYMQTYHSTVPTGANIPSPTDWLNALFGNGLSRWILGV
IGVLLGGLALFFMIMSLFKLGTKQVFRSRTKLA
>A0A0B5A886 ~~~GP~~~Envelopment polyprotein~~~
MMKVIWFSSLICLVIQCSGDSGPIICAGPIHSNKSAGIPHLLGYSEKICQIDRLIHVSSWLRNHSQFQGYVGQRGGRSQV
SYYPAENSYSRWSGLLSPCDADWLGMLVVKKAKESDMIVPGPSYKGKVFFERPTFDGYVGWGCGSGKSRTESGELCSSDS
GTSSGLLPSDRVLWIGDVACQPMTPIPEETFLELKSFSQSEFPDICKIDGIVFNQCEGESLPQPFDVAWMDVGHSHKIIM
REHKTKWVQESSSKDFVCYKEGTGPCSESEEKACKTSGSCRGDMQFCKVAGCEHGEEASEAKCRCSLVHKPGEVVVSYGG
MRVRPKCYGFSRMMATLEVNPPEQRIGQCTGCHLECINGGVRLITLTSELRSATVCASHFCSSASSGKKSTEIHFHSGSL
VGKTAIHVKGALVDGTEFTFEGSCMFPDGCDAVDCTFCREFLKNPQCYPAKKWLFIIIVILLGYAGLMLLTNVLKAIGVW
GSWVIAPVKLMFAIIKKLMRTVSCLVGKLMDRGRQVIHEEIGENGEGNQDDVRIEMARPRRVRHWMYSPVILTILAIGLA
EGCDEMVHADSKLVSCRQGSGNMKECITTGRALLPAVNPGQEACLHFTAPGSPDSKCLKIKVKRINLKCKKSSSYFVPDA
RSRCTSVRRCRWAGDCQSGCPPHFTSNSFSDDWAGKMDRAGLGFSGCSDGCGGAACGCFNAAPSCIFWRKWVENPHGIIW
KVSPCAAWVPSAVIELTMPSGEVRTFHPMSGIPTQVFKGVSVTYLGSDMEVSGLTDLCEIEELKSKKLALAPCNQAGMGV
VGKVGEIQCSSEESARTIKKDGCIWNADLVGIELRVDDAVCYSKITSVEAVANYSAIPTTIGGLRFERSHDSQGKISGSP
LDITAIRGSFSVNYRGLRLSLSEITATCTGEVTNVSGCYSCMTGAKVSIKLHSSKNSTAHVRCKGDETAFSVLEGVHSYI
VSLSFDHAVVDEQCQLNCGGHESQVTLKGNLIFLDVPKFVDGSYMQTYHSTVPTGANIPSPTDWLNALFGNGLSRWILGV
IGVLLGGLALFFLIMFLLKLGTKQVFRSRTKLA
>P36291 ~~~GP~~~Envelopment polyprotein~~~
MRILKLLELVVKVSLFTIALSSVLLAFLIFRATDAKVEIIRGDHPEIYDDSAENEVPTAASIQREAILETLTNLMLESRT
PGTRQIREEKSTIPISAEPTTQKTISVLDLPNNCLNASSLKCEIKGISTYNVYYQVENNGVIYSCVSDSAEGLEKCDNSL
NLPKRFSKVPVIPITKLDKKRHFSVGGKFFISESLTQDNYPITYNSYPTNGTVSLQTVKLSGDCKITKSNFANPYTVSIT
SPEKIMGYLIKKPGENVEHKVISFSGSASITFTEEMLDGEHNLLCGDKSAKIPKTNKRVRDCIIKYSKSIYKQTACINFS
WIRLILIALLIYFPIRWLVNKTTKPLFLWYDLMGLITYPVLLLINCLWKYFPLKCSNCGNLCIVTHECTKVCICNKSKAS
KEHSSECPILSKEADHDYNKHKWTSMEWFHLIVNTKLSLSLLKFVTEILIGLVILSQMPMSMAQTTQCLSGCFYVPGCPF
LVTSKFEKCSEKDQCYCNVKEDKIIESIFGTNIVIEGPNDCIENQNCIARPSIDNLIKCRLGCEYLDLFRNKPLYNGFSD
YTGSSLGLTSVGLYEAKRLRNGIIDSYNRQGKISGMVAGDSLNKNETSIPENILPRQSLIFDSVVDGKYRYMIEQSLLGG
GGTIFMLNDKTSETAKKFVIYIKSVGIHYEVSEKYTTAPIQSTHTDFYSTCTGNCDTCRKNQALTGFQDFCVTPTSYWGC
EEAWCFAINEGATCGFCRNIYDMDKSYRIYSVLKSTIVADVCISGILGGQCSRITEEVPYENTLFQADIQADLHNDGITI
GELIAHGPDSHIYSGNIANLNDPVKMFGHPQLTHDGVPIFTKKTLEGDDMSWDCAAIGKKSVTIKTCGYDTYRFRSGLEQ
ISDIPVSFKDFSSFFLAKSFSLGKLKMVVDLPSDLFKVAPKKPSITSTSLNCNGCLLCGQGLSCLLEFFSDLTFSTAISI
DACSLSTYQLAVKKGSNKYNITMFCSANPDKKKMTLYPEGNPDISVEVLVNNVIVEEPENIIDQNDEYAHEEQQYNSDSS
AWGFWDYIKSPFNFIASYFGSFFDTIRVVLLIAFIFLVTYFCSILTSICKGYVKNESYKSRSKIEDDDEPEIKAPMLMKD
TMTRRRPPMDFSHLV
>P0DTJ1 ~~~GP~~~Envelopment polyprotein~~~
MFCLCLSLLGLLLCWPAATRNLLELKVECPHTIGLGQGIVIGSAELPPVPLAKVESLKLESSCNFDLHTSTAAQQAFTKW
SWEKKADTAENAKAASTTFQSSSKEVQLRGLCVIPTLVLETASRTRKTVTCFDLSCNQTVCQPTVYLMAPIQTCVTTKSC
LLGLGDQRIQVVYEKTYCVSGQLIEGNCFNPLHTIAISQPTHTYDIMTLAVHCFFISKKGGTDDTLKIEKQFETLVEKTG
CTENALKGYYACILGTSSEVVYVPAMDDYRSSEILSRMTTAPHGEDHDIDPNAISSLRIVGQLTGKAPSTESSDTVQGIA
FAGTPLYTSTSILVRKEDPIYLWSPGIIPEGNHSQCDKKTLPLTWTGFITLPGEIEKTTQCTVFCTLSGPGADCEAYSDT
GIFNISSPTCLVNRVQRFRGAEQQVKFVCQRVDLDITVYCNGVKKVILTKTLVIGQCIYTFTSIFSLMPGVAHSLAVELC
VPGLHGWATISLLITFCFGWLAIPLLSMIIIRFLLIFTYLCSKYSTDSKFKLIIEKVKQEYQKTMGSMVCEVCQQGCETA
KELESHKKSCPHGQCPYCLNPTEATESALQAHFKVCKLTTRFQENLKKSLSTYEPKRGLYRTLSMFRYKSKCYVGLVWCI
LLTMELIVWAASAETINLEPGWTDTAHGSGIIPLKTDLELDFSLPSSATYTYRRELQNPANEQEKIPFHFQMERQVIHAE
IQHLGHWMDGTFNLKTAFHCYGSCIKYAYPWQTAKCFLEKDFEFETGWGCNPPDCPGVGTGCTACGVYLDKLRSVGKVYK
ILSLKYTRKVCIQLGTEQTCKTIDSNDCLVTTSVKVCMIGTISKFQPGDTLLFLGPLEEGGMIFKQWCTTTCQFGDPGDI
MSTPLGMKCPEHAGSFRKKCSFATLPSCQYDGNTVSGYQRMIATKDSFQSFNITEPHITTNSLEWVDPDSSLKDHVNLIV
NRDLSFQDLAENPCQVDLSVSSIDGAWGSGVGFNLVCSVSLTECASFLTSIKACDSAMCYGSSTANLVRGQNTVHVVGKG
GHSGSKFMCCHDKKCSATGLVAAAPHLDRVTGYNQIDTNKVFDDGAPQCGVHCWFKKSGEWLLGILSGNWMVVAVLIALF
IFSLLLFSLCCPRRQNYKKNK
>P09613 ~~~GP~~~Envelopment polyprotein~~~
MVRTYLLLLLLCGPATPFFNHLMDVTRRLLDSSNATWQRDQPDTHRLSRLDAHVMSMLGVGSHIDEVSVNHSQHLHNFRS
YNCEEGRRTLTMMDPKSGKFKRLKCNENQTLSKDCASCIEKKSSIMKSEHLVYDDAICQSDYSSPEAMPDHETHLCRIGP
LHIQHCTHEAKRVQHVSWFWIDGKLRVYDDFSVSWTEGKFLSLFDCLNETSKDHNCNKAVCLEGRCSGDLQFCTEFTCSY
AKADCNCKRNQVSGVAVVHTKHGSFMPECMGQSLWSVRKPLSKRSVTVQQPCMDCESDCKVDHILVIVRHFYPDHYQACL
GSTCLTGRAKDKEFKIPFKMADRLSDSHFEIRIWDKERSNEYFLESRCESVDACAAITCWFCRANWANIHCFSKEQVLIL
VAVSSLCILLLASVLRALKVIATFTWKIIKPFWWILSLLCRTCSKRLNKRAERLKESIHSLEEGLNNVDEGPREQNNPAR
AVARPNVRQKMFNLTRLSPVVVGMLCLACPVESCSDSISVTASSQRCSTSSDGVNSCFVSTSSLLQVSPKGQESCLILKG
PTGTAVDSIRIKTTDIKLECVRRDLYWVPRVTHRCIGTRRCHLMGACKGEACSEFKINDYSPEWGHEEELMAQLGWSYCV
EQCGGALCQCFNMRPSCFYLRKTFSHLSQDAFNIYECSEWSYRINVLVSTNSTHSNLTLKLGVPDSIPHGLISLSSVSQP
PAIAYSECFGEDLHGTKFHTVCNRRTDYTLGRIGEIQCPTKADALAVSKRCISSDSIIFSKVHKDSVDCQSSIIDPMTIR
NRNKLPSTVGSVTFWPTETSVEAAIPDLASATMLIRLDGYTIQFRSDSNKCSPRFLSLSGCYNCEAGAKLELEHVTDFGT
ALGILECPSLGYTTYYEVKNTLEKSIRTMHLNGSHVEAKCYFRCPNSESQLTIRGELIYLFNDDIRHHNQTLSPGLSPKS
GSGWDPFGWFKASWLRAIWAILGGTVSLIIGVVIIYMVFTLCLKVKKS
>Q69572 ~~~~~~Glycoprotein Q1~~~
MATARLSAMKPPRSCALIFLCAFSMATAPTNATAHRRAGTVKSTPPPEDKHSYTAKYYDKDIYFNIYEGRNSTPRRRTLS
EIISKFSTSEMLSLKRVKAFVPVDENPTTTLEDIADILNYAVCDDNSCGCTIETQARIMFGDIIICVPLSADNKGVRNFK
DRIMPKGLSQILSSSLGLHLSLLYGAFGSNYNSLAYMRRLKPLTAMTAIRFCPMTTKLELRQNYKVKETLCELIVSIEIL
KIRNNGGQTMKTLTSFAIVRKDNDGQDWETCTRFAPVNIEDILRYKRVANDTCCRHRDVQHGRRTLESSNSWTQTQYFEP
WQDIVDVYVPINDTHCPNDSYVVFETLQGFEWCSRLNKNETKNYLSSVLGFRNALFETEELMETIAMRLASQILSMVGQQ
GTTIRDIDPAIVSALWHSLPENLTTTNIKYDIASPTHMRPALCTIFVQTGTSKQRFRNAGLLMVNNIFTVQGRYTTQNMF
ERKEYVYKHLGQALCQGGHVFYNPKEVKSENIKMINIKPTVVRT
>Q9QJ11 ~~~~~~Glycoprotein Q1~~~
MRPPRRSAPILVCAISMATALSNATVHRDAGTVESTPPPDDEDNYTAKYYDDSIYFNIYDGTNPTPRRRTLPEIISKFST
SEMSRLGGLKAFVPVDYTPTTTLEDIEDLLNYAICDDNSCGCLIETEARXMFGDIIICVPLSAESRGVRNLKSRIMPMGL
SQILSSGLGLHFSLLYGAFGSNYNSLAYMERLKPLTAMTAIAFCPMTSKLELRQNYRLEKARXNLIVNIELLKIQNHGGQ
TIKTLTSFAIVRKDSDGQDWETCTRFASVSIEDILRSKPAANGTCCPPRDVHHDRPTLQSSNSWTRTEYFEPWQDVVDAY
VPINDNHCPNDSYVVFQTLQGHEWCSRLNKNDTKNYLSSVLAFKNALYETEELMETIGMRLASQILSLVGQRGTSIRNID
PAIVSALWHSLPEKLTTTNIKYDIASPTHMSPALXTIFIQTGTSKQRFRNAGLLMVNNIFTVQARYSKQNMFEKKIYGYE
HLGQALCEGGHVFYNPRDVYFQNIKMAATEPTVVRT
>P0DOE0 ~~~~~~Glycoprotein Q2~~~
MHFLVVYILIHFHAYRGMAALPLFSTLPKITSCCDSYVVINSSTSVSSLISTCLDGEILFQNEGQKFCRPLTDNRTIVYT
MQDQVQKPLSVTWMDFNLVISDYGRDVINNLTKSAMLARKNGPRYLQMENGPRYLQMETRISDLFRHECYQDNYYVLDKK
LQMFYPTTHSNELLFYPSEATLPSPWQEPPFSSPWPEPTFPSRWYWLLLNYTNY
>P0DOE1 ~~~~~~Glycoprotein Q2~~~
MHFVAVYILTHFHAYPGVAALPFFSTLPKITSCCDHYVVLNSLSSVSSSTPTCLDGEXLFQNAGQKFCRPFTDNRTIVYT
MQDQVQRPWSVTWMDFNLVISDYGRAVIENLTESAMSAHKNGPRYLQMETFISDLFRYECHRDNRYVLEKKLQMFYPTTH
MNELLFYPSDPTLPSPYGNGHY
>P31035 ~~~~~~Granulin~~~
GYNKSLRYSRHAGTSCLIDNQHYKQIASNGKDVRRKDRRISEAKYAPLKDLANQYMVTEDPFRGPGKNVRITLFKEIRRV
EPDTHKLICNWSGKEFLRETWTRFISEEFPITTDQQIMNLLFELQLRPMQPNRCYRFTMQYALGAHPDYVAHDVIRQGDP
YYVGPNQIERINLSKKGYAYPLTCLQSVYNENFDFFFDEHLWPYFHRPLVYVGMNSAEIEEIMIEVSVIFKIKEFAPDVP
LFTGPAY
>P87577 ~~~~~~Granulin~~~
MGYNKSLRYSRHDGTSCVIDNHHLKSLGAVLNDVRRKKDRIREAEYEPIIDIADQYMVTEDPFRGPGKNVRITLFKEIRR
VHPDTMKLVCNWSGKEFLRETWTRFISEEFPITTDQEIMDLWFELQLRPMHPNRCYKFTMQYALGAHPDYVAHDVIRQQD
PYYVGPNNIERINLSKKGFAFPLTCLQSVYNDNFERFFDDVLWPYFYRPLVYVGTTSAEIEEIMIEVSLLFKIKEFAPDV
PLFTGPAY
>P08072 ~~~MGF~~~Growth factor~~~
MVPRDLVATLLCAMCIVQATMPSLDNYLYIIKRIKLCNDDYKNYCLNNGTCFTVALNNVSLNPFCACHINYVGSRCQFIN
LITIK
>P04519 2.4.1.26~~~agt~~~DNA alpha-glucosyltransferase~~~
MRICIFMARGLEGCGVTKFSLEQRDWFIKNGHEVTLVYAKDKSFTRTSSHDHKSFSIPVILAKEYDKALKLVNDCDILII
NSVPATSVQEATINNYKKLLDNIKPSIRVVVYQHDHSVLSLRRNLGLEETVRRADVIFSHSDNGDFNKVLMKEWYPETVS
LFDDIEEAPTVYNFQPPMDIVKVRSTYWKDVSEINMNINRWIGRTTTWKGFYQMFDFHEKFLKPAGKSTVMEGLERSPAF
IAIKEKGIPYEYYGNREIDKMNLAPNQPAQILDCYINSEMLERMSKSGFGYQLSKLNQKYLQRSLEYTHLELGACGTIPV
FWKSTGENLKFRVDNTPLTSHDSGIIWFDENDMESTFERIKELSSDRALYDREREKAYEFLYQHQDSSFCFKEQFDIITK
>P04547 2.4.1.27~~~bgt~~~DNA beta-glucosyltransferase~~~
MKIAIINMGNNVINFKTVPSSETIYLFKVISEMGLNVDIISLKNGVYTKSFDEVDVNDYDRLIVVNSSINFFGGKPNLAI
LSAQKFMAKYKSKIYYLFTDIRLPFSQSWPNVKNRPWAYLYTEEELLIKSPIKVISQGINLDIAKAAHKKVDNVIEFEYF
PIEQYKIHMNDFQLSKPTKKTLDVIYGGSFRSGQRESKMVEFLFDTGLNIEFFGNAREKQFKNPKYPWTKAPVFTGKIPM
NMVSEKNSQAIAALIIGDKNYNDNFITLRVWETMASDAVMLIDEEFDTKHRIINDARFYVNNRAELIDRVNELKHSDVLR
KEMLSIQHDILNKTRAKKAEWQDAFKKAIDL
>P03735 ~~~T~~~Tail assembly protein GT~~~
MFLKTESFEHNGVTVTLSELSALQRIEHLALMKRQAEQAESDSNRKFTVEDAIRTGAFLVAMSLWHNHPQKTQMPSMNEA
VKQIEQEVLTTWPTEAISHAENVVYRLSGMYEFVVNNAPEQTEDAGPAEPVSAGKVFDGELSFALKLAREMGRPDWRAML
AGMSSTEYADWHRFYSTHYFHDVLLDMHFSGLTYTVLSLFFSDPDMHPLDFSLLNRREADEEPEDDVLMQKAAGLAGGVR
FGPDGNEVIPASPDVADMTEDDVMLMTVSEGIAGGVRYG
>P31281 ~~~G~~~Major spike protein G~~~
MYQNFVTKHDTAIQTSRFSVTGNVIPAAPTGNIPVINGGSITAERAVVNLYANMNVSTSSDGSFIVAMKVDTSPTDPNCV
ISAGVNLSFAGTSYPIVGIVRFESASEQPTSIAGSEVEHYPIEMSVGSGGVCSARDCATVDIHPRTSGNNVFVGVICSSA
KWTSGRVIGTIATTQVIHEYQVLQPLK
>P03644 ~~~G~~~Major spike protein G~~~
MFQKFISKHNAPINSTQLAATKTPAVAAPVLSVPNLSRSTILINATTTAVTTHSGLCHVVRIDETNPTNHHALSIAGSLS
NVPADMIAFAIRFEVADGVVPTAVPALYDVYPIETFNNGKAISFKDAVTIDSHPRTVGNDVYAGIMLWSNAWTASTISGV
LSVNQVNREATVLQPLK
>P03643 ~~~G~~~Major spike protein G~~~
MFQTFISRHNSNFFSDKLVLTSVTPASSAPVLQTPKATSSTLYFDSLTVNAGNGGFLHCIQMDTSVNAANQVVSVGADIA
FDADPKFFACLVRFESSSVPTTLPTAYDVYPLNGRHDGGYYTVKDCVTIDVLPRTPGNNVYVGFMVWSNFTATKCRGLVS
LNQVIKEIICLQPLK
>P03734 ~~~G~~~Tail assembly protein G~~~
MFLKTESFEHNGVTVTLSELSALQRIEHLALMKRQAEQAESDSNRKFTVEDAIRTGAFLVAMSLWHNHPQKTQMPSMNEA
VKQIEQEVLTTWPTEAISHAENVVYRLSGMYEFVVNNAPEQTEDAGPAEPVSAGKCSTVS
>A0A097I1R9 ~~~~~~Histone doublet miniH2B-H2A~~~
MDRVGKYGLFIKRISPKDADITKESLETVNNMLVFLAEKLTKQANIIIDQKTLRHDAFLWLLTDIQGELGKHSQDFANSV
LYGEKELVFPTKRTENLMRKNTCLRISQSAVKTLTAILEYFCGQIMEASFSQAKKSKRKRIRPIDIEAAISQDKELHSMF
GKGVISGR
>A0A097I2B5 ~~~~~~Histone doublet H2B-H2A~~~
MATQKETTRKRDKSVNFRLGLRNMLAQIHPDISVQTEALSELSNIAVFLGKKISHGAVTLLPEGTKTIKSSAVLLAAGDL
YGKDLGRHAVGEMTKAVTRYGSAKESKEGSRSSKAKLQISVARSERLLREHGGCSRVSEGAAVALAAAIEYFMGEVLELA
GNAARDSKKVRISVKHITLAIQNDAALFAVVGKGVFSGAGVSLISVPIPRKKARKTTEKEASSPKKKAAPKKKKAASKQK
KSLSDKELAKLTKKELAKYEKEQGMSPGY
>A0A097I2D0 ~~~~~~Histone doublet H4-H3~~~
MSKAGKKVKAQQHGHLADHVSVGETQIPKASTQHLLRKAGSLSAAGDTEVPIRGFVHMKLHKLVQKSLLAMQLAKRKTIM
KSDVKKAAELMHLPVFAIPTKDSGAKGSVFLSCRQKGAGSAGTGSETNSQEVRSQMKSTCLIIPKERFRTMAKEISKKEG
HDVHIAEAALDMLQVIVESCTVRLLEKALVITYSGKRTRVTSKDIETAFMLEHGPL
>Q9J586 ~~~~~~Late protein H7 homolog~~~
MDHKSRMLLDTIFKDMLNTKDVYALIKYIFKKDPVETIFSKKDDDIFIDFVYNDNVLASDYLGMKTTKVEDCCSCRKVVA
VEYMNTSIIDNDLEGYIKQSDKLKRFIKLYNKNNAIKKARNIKSRQKMLKDAGIDDIGYEFIKDAIGLISRK
>Q6RZJ7 ~~~~~~Late protein H7 homolog~~~
MEMDKRMKSLAMTAFFGELSTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDNYKASQYLDLTLTPIFGDECKTHA
SSIAEQLACADIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISRDTEKLKIALAKGIDYEYIKDAC
>Q9JFA9 ~~~TH8R~~~Late protein H7~~~
MEMDKRMKSLAMTAFFGELNTLDIMALIMSIFKRHPNNTIFSVDKDGQFIIDFEYDNYKASQYLDLTLTPISGDECKTHA
SSIAEQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISRDTEKLKIALAKGIDYEYIKDAC
>P0DON8 ~~~H7R~~~Late protein H7~~~
MEMDKRMKSLAMTAFFGELTTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDTYKASQYLDLPLTPISGDECKTHA
SSIAKQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISQDTEKLKIALAKGIDYEYIKDAC
>Q89443 3.6.4.13~~~~~~Putative RNA helicase B962L~~~
MGKPTLLEPGHLYNVPAEHKNDVPIHYIITWIKQRLPEFGGAIPTSLADRVLIIKSRTGSGKSTALPVHVFRILRNENTH
SFQKYLGRSVICTQPRVLTAVTLAKDIGASTHYPDMILGQTVGYHTKPLTEKPNRGLIYATAGVLLAQLHTMTDDEIASR
YAFMIIDEAHERALGIDLMLMYIKSMLQRMLQRGSIGALRIPFVILTSATIDTHKYSTYFGIGKENIILVEGRQYGVETH
WPLYNTNNYIKTACETALTIHKENIHDRPTEADILIFMPGMAEIRFLSMLLNNANMDLAKEKLPLMLILPIDSEAIAQEN
EAYLGLKAEIKNLWVKNPLTAKVEKPLRRVIVSTVVAETGLTIETLKYVIDPGWNRSIETYYPEWAGGLITRPAAQSRIE
QRKGRVGRVFPGHFYPLYTKHVFEQIPAQQYPEIITEGPGAIFLSIVVETIKKNKEGVFKAEEIDMLDPPPTDALASAIE
RAIVAGLLTRGEKGLQLTQLGDIASRFSFLSIEEARMCFSGYFWQAAISDIATILAVVSVADKKLTNLLDSKQRNGAMLA
EAVLAGIPPFLQNIDNAYTNIHLLLADDLLEGLFIFEGFQHAIVYFINNKVNNVAKHLREWCEKKMLKYSSMVQILARRE
DILNELAIVGLNPFHQWQNRLASANAETFLKRVCTLKQCMYEAYRLNCFCYDKHRLLYTGRNGIHFSYHDAVIKNPSCIV
TPRIMLSPVSKQYMEWRLEPSFVSVLDGFVNVDINFLSPRQEIPNILGGVEDEEEEPPLPIQVFLHKYVKTHFHFSGKSF
KELKMKPSQMIKFPETTLINMIPDIPKNVVQTYLEINVCHQYSFKRLIYCETFYTDMDDVQHENSVELIGLPMAAHHLTI
NDFNKLYHLLKPDGFLIVYDLHKGQEAFWLHSLQDALGHHTIRRDMDFHTIPEWETIFKECGFTPIFSKQPSEHELFIVF
KK
>Q65162 3.6.4.-~~~~~~Putative primase C962R~~~
MREESWEDHDTIQLTAQRKYLAEVQALETLLTRELSAFLTEPGSKKTNIINRITGKTYALPSTELLRLYEHLEQCRKQGA
LMYFLERQGTYSGLMLDYDLKLNTNAVPPLEPPALSRLCHRIFVHIKNSSVLPEGSHKIHFFFTLKPEVVQGKYGFHVLI
PGLKLAASTKKSIIGSLQHDATVQKILHEQGVANPESCLDPHSASVPSLLYGSSKLNHKPYQLKTGFELVFDSSDPDYIP
IHQIKNIESYNLVSELSLTNEQGSLVRPVYCAADIAAEKEEEIPTDDHSLSILMLHDPEARYLHKILNLLPPEYYVEYPL
WSNVVFALANTSANYRPLAEWFSQKCPEKWNTGGKEKLEKLWNDASRHTEKKITKRSIMYWAHKHAPQQYKEIVEQGYFS
ILAEYVYSYNGMLEHYMIAKVIYAMMGNKFVVDVDSNGKYVWFEFVLPGQPMNQGEIWKWRKEVNPDELHIYISENFSRV
MDRITEHIKYHLSQPHETNILNYYKKLLKAFERSKSKIFNDSFKKGVIRQAEFLFRQRSFIQTLDTNPHLLGVGNGVLSI
ETIPAKLINHFHEHPIHQYTHICYVPFNPENPWTKLLLNALQDIIPELDARLWIMFYLSTAIFRGLKEALMLLWLGGGCN
GKTFLMRLVAMVLGDHYASKLNISLLTSCRETAEKPNSAFMRLKGRGYGYFEETNKSEVLNTSRLKEMVNPGDVTARELN
QKQESFQMTATMVAASNYNFIIDTTDHGTWRRLRHYRSKVKFCHNPDPGNPYEKKEDPRFIHEYIMDPDCQNAFFSILVY
FWEKLQKEYNGQIKKVFCPTIESETEAYRKSQDTLHRFITERVVESPSAETVYNLSEVVTAYAEWYNANINVKRHIALEL
SQELENSVLEKYLQWSPNKTRILKGCRILHKFETLQPGESYIGVSTAGTLLNTPICEPKNKWWEWSPNLSAPPEKEASAP
TP
>P46083 ~~~~~~Hemagglutinin component HA-17 type C~~~
MSSERTFLPNGNYKIKSLFSDSLYLTYSSGALSFSNTSSLDNQKWKLEYISSSNGFRFSNVAEPNKYLAYNDYGFIYLSS
SSNNSLWNPIKIAINSYIICTLSIVNVTDYAWTIYDNNNNITDQPILNLPNFDINNSNQILKLEKL
>Q9LBR4 ~~~~~~Hemagglutinin component HA-17 type D~~~
MSSERTFLPNGNYKIKSLFSDSLYLTYSSGSLSFLNTSSLDNQKWKLEYISSSNGFRFSNVAEPNKYLAYNDYGFIYLSS
SSNNSLWNPIKIAINSYIICTLSIVNVTDYAWTIYDNNNNITDQPILNLPNFDINNSNQILKLEKL
>P0DPR0 ~~~~~~Main hemagglutinin component type C~~~
MSQTNANDLRNNEVFFISPSNNTNKVLDKISQSEVKLWNKLSGANQKWRLIYDTNKQAYKIKVMDNTSLILTWNAPLSSV
SVKTDTNGDNQYWYLLQNYISRNVIIRNYMNPNLVLQYNIDDTLMVSTQTSSSNQFFKFSNCIYEALNNRNCKLQTQLNS
DRFLSKNLNSQIIVLWQWFDSSRQKWIIEYNETKSAYTLKCQENNRYLTWIQNSNNYVETYQSTDSLIQYWNINYLDNDA
SKYILYNLQDTNRVLDVYNSQIANGTHVIVDSYHGNTNQQWIINLI
>P0DPR1 ~~~~~~Main hemagglutinin component type D~~~
MSQTNANDLRNNEVFFISPSNNTNKVLDKISQSEVKLWNKLSGANQKWRLIYDTNKQAYKIKVMDNTSLILTWNAPLSSV
SVKTDTNGDNQYWYLLQNYISRNVIIRNYMNPNLVLQYNIDDTLMVSTQTSSSNQFFKFSNCIYEALNNRNCKLQTQLNS
DRFLSKNLNSQIIVLWQWFDSSRQKWIIEYNETKSAYTLKCQENNRYLTWIQNSNNYVETYQSTDSLIQYWNINYLDNDA
SKYILYNLQDTNRVLDVYNSQIANGTHVIVDSYHGNTNQQWIINLI
>P46085 ~~~~~~Hemagglutinin components HA-70 type C~~~
MSLSIKELYYTKDKSINNVNLADGNYVVNRGDGWILSRQNQNLGGNISNNGCTAIVGDLRIRETATPYYYPTASFNEEYI
KNNVQNVFANFTEASEIPIGFEFSKTAPSNKSLYMYLQYTYIRYEIIKVLQNTVTERAVLYVPSLGYVKSIEFNSEEQID
KNFYFTSQDKCILNEKFIYKKIDDTITVKESKNSNNNINFNTSQTILPYPNGLYVINKGDGYMRTNDKDLIGTLLIESST
SGSIIQPRLRNTTRPLFNTSNPTIFSQEYTEARLNDAFNIQLFNTSTTLFKFVEEAPTNKNISMKVYNTYEKYELINYQN
GNIDDKAEYYLPSLGKCEVSDAPSPQAPVVETPVDQDGFIQTGPNENIIVGVINPSENIEEISTPIPDDYTYNIPTSIQN
NACYVLFKVNTTGVYKITTKNNLPPLIIYEAIGSSNRNMNSNNLSNDNIKAIKYITGLNRSDAKSYLIVSLFKDKNYYIR
IPQISSSTTSQLIFKRELGNISDLADSTVNILDNLNTSGTHYYTRQSPDVGNYISYQLTIPGDFNNIASSIFSFRTRNNQ
GIGTLYRLTESINGYNLITINNYSDLLNNVEPISLLNGATYIFRVKVTELNNYNIIFDAYRNS
>Q9LBR5 ~~~~~~Hemagglutinin component HA-70 type D~~~
MSLSIKELYYTKDKSINNVNLADGNYVVNRGDGWILSRQNQNLGGNISNNGCTAIVGDLRIRETATPYYYPTASFNEEYI
RNNVQNVFANFTEASEIPIGFEFSKTAPSNKGLYMYLQYTYIRYEIIKVLRNTVIERAVLYVPSLGYAKSIEFNSGEQID
KNFYFTSEDKCILNEKFIYKKIAETTTAKESNDSNNTTNLNTSQTILPYPNGLYVINKGDGYMRTNDKDLIGTLLIETNT
SGSIIQPRLRNTTRPLFNTSNPTLFSQEYTEARLNDAFNIQLFNTSTTLFKFVEEAPDNKNISMKAYNTYEKYELINYQN
GNIADKAEYYLPSLGKCEVSDAPSPQAPVVETPVEQDGFIQTGPNENIIVGVINPSENIEEISTPIPDDYTYNIPTSIQN
NACYVLFTVNTTGVYKINAQNNLPPLIIYESIGSDNMNIQSNTLSNNNIKAINYITGTDSSNAESYLIVSLIKNKNYYIR
IPQISSSTTNQLIFKRELGNISDLANSTVNILDNLNTSGTHYYTRQSPDVGNYISYQLTIPGDFNNIASSIFSFRTRNNQ
GIGTLYRLTESINGYNLITIKNYSDLLNNVEPISLLNGATYIFRVKVTELNNYNIIFDAYRNS
>P0C625 ~~~C~~~External core antigen~~~
MQLFHLCLIISCTCPTVQASKLCLGWLWGMDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPH
HTALRQAILCWGELMTLATWVGNNLEDPASRDLVVNYVNTNVGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPA
YRPPNAPILSTLPETTVVRRRDRGRSPRRRTPSPRRRRSPSPRRRRSQSRESQC
>P0C6H2 ~~~C~~~External core antigen~~~
MQLFHLCLIISCSCPTVQASKLCLGWLWGMDIDPYKEFGASVELLSFLPSDFFPSIRDLLDTASALYREALESPEHCSPH
HTALRQAILCWGELMNLATWVGSNLEDPASRELVVSYVNVNMGLKIRQLLWFHVSCLTFGRETVLEYLVSFGVWIRTPPA
YRPPNAPILSTLPETTVVRRRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC
>P0C573 ~~~C~~~External core antigen~~~
MQLFHLCLIISCSCPTVQASKLCLGWLWGMDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSPH
HTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQLLWFHISCLTFGRETVIEYLVSFGVWIRTPPA
YRPPNAPILSTLPETTVVRRRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC
>P0C6J2 ~~~C~~~External core antigen~~~
MYLFHLCLVFACVPCPTVQASKLCLGWLWGMDIDPYKEFGSSYQLLNFLPLDFFPDLNALVDTATALYEEELTGREHCSP
HHTAIRQALVCWDELTKLIAWMSSNITSEQVRTIIVNHVNDTWGLKVRQSLWFHLSCLTFGQHTVQEFLVSFGVWIRTPA
PYRPPNAPILSTLPEHTVIRRRGGARASRSPRRRTPSPRRRRSQSPRRRRSQSPSANC
>P0C6J4 ~~~C~~~External core antigen~~~
MYLFHLCLVFACVPCPTFQASKLCLGWLWGMDIDPYKEFGSSYQLLNFLPLDFFPDLNALVDTATALYEEELTGREHCSP
HHTAIRQALVCWDELTKLIAWMSSNITSEQVRTIIVNHVNDTWGLKVRQSLWFHLSCLTFGQHTVQEFLVSFGVWIRTPA
PYRPPNAPILSTLPEHTVIRRRGGARASRSPRRRTPSPRRRRSQSPRRRRSQSPSANC
>P03145 ~~~S~~~Large envelope protein~~~
MGQHPAKSMDVRRIEGGEILLNQLAGRMIPKGTLTWSGKFPTLDHVLDHVQTMEEINTLQNQGAWPAGAGRRVGLSNPTP
QEIPQPQWTPEEDQKAREAFRRYQEERPPETTTIPPSSPPQWKLQPGDDPLLGNQSLLETHPLYQSEPAVPVIKTPPLKK
KMSGTFGGILAGLIGLLVSFFLLIKILEILRRLDWWWISLSSPKGKMQCAFQDTGAQISPHYVGSCPWGCPGFLWTYLRL
FIIFLLILLVAAGLLYLTDNGSTILGKLQWASVSALFSSISSLLPSDPKSLVALTFGLSLIWMTSSSATQTLVTLTQLAT
LSALFYKS
>P0C684 ~~~S~~~Large envelope protein~~~
MGQHPAKSMDVRRIEGGELLLNQLAGRMIPKGTLTWSGKFPTIDHVLDHVQTMEEINTLQQQGAWPAGAGRRVGLSNPAP
QEIPQPQWTPEEDQKAREAFRRYQEERPPETTTIPPTSPTQWKLQPGDDPLLGNQSLLETHPLYQTEPAVPVIKTPPLKK
KMSGTFGGILAGLIGLLVSFFLLIKILEILRRLDWWWISLSSPKGKMQCAFQDTGAQISPHYAGSCPWGCPGFLWTYLRL
FIIFLLILLVAAGLLYLTDNGSTILGKLQWASVSALFSSISSLLPSDPKSLVALTFGLSLIWMTSSSATQTLVTLTQLAT
LSALFYKS
>P31873 ~~~S~~~Large envelope protein~~~
MGGWSAKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDHWPEANQVGVGAFGPGFTPPHGGLLGWSSQ
AQGTLHTVPAVPPPASTNRQTGRQPTPISPPLRDSHPQAMQWNSTAFQQALQDPRVRGLFFPAGGSSSGTVNPAPNIASH
ISSISSRTGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFIPLLPIFFCLWVYI
>P03142 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFLPDHQLDPAFGANSTNPDWDFNPIKDHWPAANQVGVGAFGPGLTPPHGGILGWSPQAQGILTTVSTI
PPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTALHQALQDPRVRGLYLPAGGSSSGTVNPAPNIASHISSISARTGDP
VTIMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQSPTSNHSPTSCPPICPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSKFPSCCCTKPTDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYSIVSPFIPLLPIFFCLWVYI
>P03141 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPVKDDWPAANQVGVGAFGPRLTPPHGGILGWSPQ
AQGILTTVSTIPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTAFHQTLQDPRVRGLYLPAGGSSSGTVNPAPNIASH
ISSISARTGDPVTNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYSIVSPFIPLLPIFFCLWVYI
>P17101 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPVFGANSNNPDWDFNPIKDHWPAANQVGVGAFGPGFTPPHGGVLGWSPQ
AQGMLTPVSTIPPPASANRQSGRQPTPISPPLRDSHPQAMQWNSTAFHQALQDPRVRGLYFPAGGSSSGTVNPAPNIASH
ISSISARTGDPVTNMENITSGFLGPLPVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSRSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLILGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYSIVSSFIPLLPIFFCLWVYI
>Q02317 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPIKDHWPQANQVGVGAFGPGFTPPHGGVLGWSPQ
AQGILATVPAMPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTAFHQALQDPRVRGLYFPAGGSSSGTLNPVPTIASH
ISSISSRIGDPAPNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYNILSPFIPLLPIFFCLWVYI
>Q91C35 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFFPDHQLDPAFGANSTNPDWDFNPIKDHWPQANQVGVGAFGPGHSPPHGGVLGWSPQAQGILTTVPTV
PPTASTNRQSGRQPTPISPPLRDSHPQAMQWNSTALHQALQDPRVRGLYFPAGGSSSGTLNPVPNTASHISSISSRTGDP
ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQYPTSNHSPTSCPPICPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNIVSPFIPLLPIFFCLWVYI
>O91534 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPIKDHWPAANQVGVGAFGPGFTPPHGGILGWSPQ
AQGILTTVSTIPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTAFHQTLQDPRVRGLYLPAGGSSSGTVNPAPNIASH
ISSISARTGDPVTNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKFLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYSIVRPFIPLLPIFFCLWVYI
>Q4R1S6 ~~~S~~~Large envelope protein~~~
MGGWLPKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPIKDHWPQANQVGVGAFGPGFTPPHGGVLGWSPQ
AQGTLTTVPAVPPPASANRQSGRQPTPISPPLRDSHPQAIKWNSPAFHQALQDPRVKGLYFPAGGSSSGTVSPVPNIASH
ISSISSRTGDPAPTMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGSPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCRTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPRLYNILSPFIPLLPIFFCLWVYI
>Q4R1R8 ~~~S~~~Large envelope protein~~~
MGGRLPKPRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPIKDHWPQANQVGVGAFGPGFTPPHGGVLGWSPQ
AQGTLTTVPAVPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTKFHQTLQDPRVRGLYFPAGGSSSGTVNPAPNIASH
ISSISSRIGDPAPTMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGEAPVCLGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDCQGMLPVCPLIPGSTTTSTGPCRTCTTPAQGNSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFIPLLPIFFCLWVYI
>P17398 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPHKDNWPDAHKVGVGAFGPGFTPPHGGLLGWSPQAQGILTSVPAA
PPPASTNRQSGRQPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPRVRALYFPAGGSSSGTVSPAQNTVSAISSILSKTGDP
VPNMENIASGLLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGTPVCLGQNSQSQISSHSPTCCPPICPGYRWMCL
RRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPMDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFMPLLPIFFCLWVYI
>P17397 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFFPDHQLDPAFKANSDNPDWDLNPHKDNWPDSNKVGVGAFGPGFTPPHGGLLGWSPQAQGILTTVPTA
PPPASTNRQLGRKPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPRVRALYFPAGGSSSGTVNPVQNTASSISSILSTTGDP
VPNMENIASGLLGPLLVLQAGFFSLTKILTIPLSLDSWWTSLNFLGETPVCLGQNSQSQISSHSPTCCPPICPGYRWMCL
RRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWFWGPSLYNILSPFMPLLPIFFCLWVYI
>Q9QAB7 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLAVPNPLGFFPDHQLDPAFKANSDNPDWDLNPHKDNWPDANKVGVGAFGPGFTPPHGGLLGWSPQ
AQGLLTTVPAAPPPASTSRQSGRQPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPRVRALYFPAGGSSSGTVSPAQNTVSA
ISSTLSKTGDPVPNMENISSGLLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGQTPVCLGQNSQSQISSHSLTCCP
PICPGYRWMCLRRFIIFLCILLLCLIFLLVLLDCQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKFLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWFWGPSLCNILSPFMPLLPIFFCLWVYI
>P17399 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPNKDNWPDANKVGVGAFGPGFTPPHGGLLGWSPQAQGLLTTVPAA
PPPASTNRQSGRQPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPGVRALYFPAGGSSSGTVSPAQNTVSAISSILSKTGDP
VPNMENIASGLLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGTPVCLGQNSQSQISSHSPTCCPPICPGYRWMCL
RRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMIWFWGPSLYNILSPFMPLLPIFFCLWVYI
>Q9PWW3 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPHKDNWPDANKVGVGAFGPGFTPPHGGLLGWSPQ
AQGLLTTVPAAPPPASTNRQSGRQPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPRVRALYFPAGGSSSGTVSPAQNTVST
ISSILSKTGDPVPNMENIASGLLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGTPVCLGQNSQSQISSHSPTCCP
PICPGYRWMCLRRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWFWGPSLYNILSPFMPLLPIFFCLWVYI
>Q67926 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPHKDNWPDANKVGVGAFGPGFTPPHGGLLGWSPQ
AQGLLTTVPAAPPPASTNRQSGRQPTPFSPPLRDTHPQAMQWNSTTFLQTLQDSRVRALYLPAGGSSSGTVSPAQNTVSA
ISSISSKTGDPVPNMENIASGLLGHLLVLQAGFFSLTKILTIPQSLDSWWTSLNFLGGTPACPGQNSQSQISSHSPTCCP
PICPGYRWMCLRRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLTPGSTTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWGWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWFWGPSLYNILRPFMPLLPTFFCLWVYI
>Q9QBF0 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPHKDNWPDAHKVGVGAFGPGFTPPHGGLLGWSPQ
AQGILTSVPAAPPPASTNRQSGRQPTPLSPPLRDTHPQAVQWNSTTFHQTLQDPRVRALYLPAGGSSSGTVSPAQNTVSA
ISSILSTTGDPVPNMENIASGLLGPLLVLQAGFFSLTKILTIPQSLDSWWTSLNFLGGTPVCLGQNSQSQISSHSPTCCP
PICPGYRWMCLRRFIIFLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWFWGPSLYNILSPFMPLLPIFLCLWVYM
>Q8JXB9 ~~~S~~~Large envelope protein~~~
MGGWSSKPRKGMGTNLSVPNPLGFFPDHQLDPAFKANSENPDWDLNPHKDNWPDAHKVGVGAFGPGFTPPHGGLLGWSPQ
AQGILTSVPAAPPPASTNRQSGRQPTPLSPPLRDTHPQAMQWNSTTFHQTLQDPRVRALYLPAGGSSSGTVSPAQNTVSA
ISSILSTTGDPVPNMENIASGLLGPLLVLQAGFFSLTKILTIPQSLDSWWTSLSFLGGTPVCLGQNSQSPISSHSPTCCP
PICPGYRWMYLRRFIIXLCILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGTSMFPSCCCTKPTDGN
CTCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFMPLLPIFFCLWVYI
>Q9E6S4 ~~~S~~~Large envelope protein~~~
MGGYSSKPRKGMGTNLSVPNPLGFLPDHQLDPAFGANSNNPDWDFNPNKDPWPEAWQVGAGAFGPGFTPPHGSLLGWSPQ
AQGILTTVPATPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTANPVPTTASP
ISSIFSRTGDPVPKMENTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPACPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGTSTTSTGPCKTCTTPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFAKFLWEWASVRFSWLSLLAPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P31868 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDHWPEANQVGVGTFGPGFTPPHGGLLGWSPQ
AQGILTTVPAAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTVNPVPTTASP
ISSIFSRTGDPAPNMENTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCKTCTTPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEWASVRFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P31869 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDHWPEANQVGAGAFGPGFTPPHGGLLGWSPQ
AQGILTTVPAAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVKGLYFPAGGSSSGTVNPVPTTASP
ISSIFSRTGDPAPNMESTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDFQGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEWASVRFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P12934 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSHNPDWDFNPNKDHWPEANQVGAGAFGPGFTPPHGGLLGWSPQ
AQGVLTTVPVAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTVNPVPTTASP
ISSISSRTGDPAPNMENTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEGASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>Q67867 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDRWPEANQVGAGAFGPGYPPPHGGLLGWSPQ
AQGILTTVPAAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQVLLDPRVRGLYFPPGGSSSGTVNPVPTTASP
ISSISSRTGDPAPNMESTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEGASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P03140 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDQWPEANQVGAGAFGPGFTPPHGGLLGWSPQ
AQGILTTVPAAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTVNPVPTTASP
ISSIFSRTGDPAPNMENTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P30019 ~~~S~~~Small envelope protein~~~
MENTASGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCPPICPGYRWMCLRRF
IIFLFILLLCLIFLLVLLDYHGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGNCTCIPIPSSWAFAR
FLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>Q913A6 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHHLDPAFGANSNNPDWDFNPNKDHWPKANQVRAGAFGPGFTPPHCSLLGWSPQ
AQGILTTVPAAPPPASSNRQSGKQPTPISPPLRDSHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASP
ISSIFSRIGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPISNHSPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGN
CTCIPIPSSWAFGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI
>Q81162 ~~~S~~~Large envelope protein~~~
MGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDQWPEANQVGAGAFGPGFTPPHGGLLGWSPQAQGILTTVPAA
PPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTVNPVPTIVSPISSIFSRTGDP
APNMESTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGEAPTCPGQNSQSPTSNHSPTSCPPICPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCKTCTIPAQGTSMFPSCCCTKPSDGNCTCIPIPSSWA
FARFLWEWASVRFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>Q998L9 ~~~S~~~Large envelope protein~~~
MGGWSSKHRKGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDHWPEANQVGAGAFGPGFTPPHGGFLGWSPQ
AQGILTTVPAAPPPASTNRQSGRQPTPISPPLRDTHPQAMQWNSTAFHQALQDPRVRGLYFPAGGSSSGTVNPVPNTVSH
ISSIFTKTGDPASNMESTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPGCIGQNSQSQTSNHSPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCRTCTITAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWGFAKFLWEWASVRFSWLSLLVPFVQWFAGLSPTVWLSVIWMIWYWGPSLYNILSPFLPLLPIFLCLWVYI
>Q76R62 ~~~S~~~Large envelope protein~~~
MGGWSSKPRQGMGTNLSVPNPLGFFPDHQLDPAFGANSNNPDWDFNPNKDHWPEANQVGAGAFGPGFTPPHGGLLGWSPQ
AQGILTTLPAAPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTTFHQALLDPRVRGLYFPAGGSSSGTVNPVPTTASP
ISSIFSRTGDPAPNMESTTSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGAPTCPGQNSQSPTSNHSPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGTSTTSTGPCRTCTIPAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFARFLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYNILSPFLPLLPIFFCLWVYI
>P03139 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTNNPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGIMQTLPAN
PPPASTNRQSGRQPTPLSPPLRTTHPQAMHWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTTSPISSIFSRIGDP
ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPISNHSPTSCPPTCPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGSCRTCTTPAQGISMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFVGLSPIVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWAYI
>P24025 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDSWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPAN
PPPASTNRQSGRQPTPLSPPLRNTHPQAMQWNSTTFHQTLQDPRVRGLYLPAGGSSSGTVNPVPTTVSPISSIFSRIGDP
ALNMENITSGFLGPLLVLQAGFFLLTKILTIPKSLDSWWTSLNFLGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLLVIWMMWYWGPKLFTILSPFLPLLPIFFCLWVYI
>P03138 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPAN
PPPASTNRQSGRQPTPLSPPLRNTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVLTTASPLSSIFSRIGDP
ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI
>Q9QMI0 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTRNPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPAN
PPPAATNRQSGRQPTPLSPPLRDAHPQAMQWTSTTFHQALQDPRVRGLYFPAGGSSSGTVNPVPTTASPILSIFSKIGDL
APNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFAGLSPIVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWAYI
>Q998M2 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTNNPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGIMQTLPAN
PPPASTNRQSGRQPTPLSPPLRTTHPQAMQWNSTTFHQTLQDPRVRGLYLPAGGSSSGTVNPVPTTASPTLSTSSRIGDP
ALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLSFLGGTTVCLGQNSQSPTSNHSPTSCPPTCVGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCTTPAQGTSMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYNTLSPFLPLLPIFFYLWVYI
>Q67875 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPASRANTANPDWDFNPNKDTWPDANKDGAGAFGLGLTPPHGGLLGWSPQAQGILHTVPAN
PPPASTNRQTGRQPTPLSPPLRDTHPQAVQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASPLSSIFSRIGDP
VTNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFRGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCL
RGFIIFLFILLLCLIFLLVLLEYQGMLHVCPLIPGTTTTSTGPCKTCTTPAQGNSMFPSCCCTKTSDGNCTCIPIPSSWA
FAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI
>O92921 ~~~S~~~Large envelope protein~~~
MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGIIQTLPAN
PPPASTNRQTGRQPTPLSPPLRNTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVPTTASPISSIFSRIGDP
ALNMENITSGLLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCL
RRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSVGPCRTCTTTVQGTSMYPSCCCTKPSDGNCTCIPIPSSWA
FGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYRILSPFLPLLPIFFCLWVYI
>Q69603 ~~~S~~~Large envelope protein~~~
MGLSWTVPLEWGKNISTTNPLGFFPDHQLDPAFRANTRNPDWDHNPNKDHWTEANKVGVGAFGPGFTPPHGGLLGWSPQA
QGMLKTLPADPPPASTNRQSGRQPTPITPPLRDTHPQAMQWNSTTFHQALQDPRVRGLYFPAGGSSSGTVNPVPTTASLI
SSIFSRIGDPAPNMESITSGFLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGAPVCLGQNSQSPTSNHSPTSCPP
ICPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCMTLAQGTSMFPSCCCSKPSDGNC
TCIPIPSSWAFGKFLWEWASARFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYDILSPFIPLLPIFFCLWVYI
>Q80IU6 ~~~S~~~Large envelope protein~~~
MGLSWTVPLEWGKNHSTTNPLGFFPDHQLDPAFRANTRNPDWDHNPNKDHWTEANKVGVGAFGPGFTPPHGGLLGWSPQA
QGMLKTLPADPPPASTNRQSGRQPTPITPPLRDTHPQAMQWNSTTFHQALQDPRVRGLYFPAGGSSSGTVNPVPTTASLI
SSIFSRIGDPAPNMEGITSGFLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGAPVCLGQNSQSPTSNHSPTSCPP
ICPGYRWMCLRRFIIFLFILLLCLIFLLVLLGYQGMLPVCPLIPGSSTTSTGPCRTCTTLAQGTSMFPSCCCSKPSDGNC
TCIPIPSSWAFGKFLWEWASARFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYNILSPFIPLLPIFFCLWVYI
>Q80IU3 ~~~S~~~Large envelope protein~~~
MGLSWTVPLEWGKNHSTTNPLGFFPDHQLDPAFRANTRNPDWDHNPNKDHWTEANKVGVGAFGPGFTPPHGGLLGWSPQA
QGMLKTLPADPPPASTNRQSGRQPTPITPPLRDTHPQAMQWNSTTFHQALQDPRVRGLYFPAGGSSSGTVNPVPTTVSLI
SSIFSRTGDPAPNMEGITSGFLGPLLVLQAGFFLLTKILTIPQSLDSWWTSLNFLGGAPVCLGQNSQSPTSNHSPTSCPP
ICPGYRWMCLRRFIIFLFILLLCLIFLLVLLGYQGMLPVCPLIPGSSTTITGPCRTCTTLAQGTSMFPSCCCSKPSDGNC
TCIPIPSSWAFGKFLWEWASARFSWLSLLVPFVQWFAGLSPTVWLSVIWMMWYWGPSLYNILSPFIPLLPIFFCLWVYI
>Q05496 ~~~S~~~Large envelope protein~~~
MGAPLSTTRRGMGQNLSVPNPLGFFPDHQLDPLFRANSSSPDWDFNTNKDSWPMANKVGVGGYGPGFTPPHGGLLGWSPQ
AQGVLTTLPADPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRALYFPAGGSSSGTQNPAPTIASL
TSSIFSKTGGPAMNMDNITSGLLGPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGGLPGCPGQNSQSPTSNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCKTCTTLAQGTSMFPSCCCSKPSDGN
CTCIPIPSSWALGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWVSI
>Q99HS3 ~~~S~~~Large envelope protein~~~
MGAPLSTTRRGMGQNLSVPNPLGFFPDHQLDPLFRANSSSPDWDFNKNKDNWPMANKVGVGGYGPGFTPPHGGLLGWSPQ
AQGVLTTLPADPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRALYFPAGGSSSETQNPAPTIASL
TSSIFLKTGGPATNMDNITSGLLGPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGGTPGCPGQNSQSPTSNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLVDYQGMLPVCPPLPGSTTTSTGPCKTCTTLAQGTSMFPSCCCSKPSDGN
CTCIPIPSSWALGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWVSI
>Q99HR4 ~~~S~~~Large envelope protein~~~
MGAPLSTTRRGMGQNLSVPNPLGFFPEHQLDPLFRANSSSPDWDFNKNKDTWPMANKVGVGGYGPGFTPPHGGLLGWSPQ
AQGVLTTLPADPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRALYFPAGGSSSETQNPAPTIASL
TSSIFSKTGGPAMNMDSITSGLLGPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGGLPGCPGQNSQSPTSNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSTTTSTGPCKTCTTLAQGTSMFPSCCCSKPSDGN
CTCIPIPSSWALGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWVSI
>Q69606 ~~~S~~~Large envelope protein~~~
MGAPLSTTRRGMGQNLSVPNPLGFLPDHQLDPLFRANSSSPDWDFNTNKDSWPMANKVGVGGYGPGFTPPHGGLLGWSPQ
AQGVLTTLPADPPPASTNRLSGRKPTQVSPPLRDTHPQAMQWNSTHFHQALLDPRVRALYFPAGGSSSGTQNPAPTIASL
TSSISSKTGGPAMNMENITSGLLGPLRVLQAVCFLLTKILTIPQSLDSWWTSLNFLGGLPRCPGQNSQSPTSNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCKTCTALAQGTSMFPSCCCSKPSDGN
CTCIPIPSSWALGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWVSI
>Q9IBI3 ~~~S~~~Large envelope protein~~~
MGLSWTVPLEWGKNLSASNPLGFLPDHQLDPAFRANTNNPDWDFNPKKDPWPEANKVGVGAYGPGFTPPHGGLLGWSPQS
QGTLTTLPADPPPASTNRQSGRQPTPISPPLRDSHPQAMQWNSTAFHQALQNPKVRGLYFPAGGSSSGIVNPVPTIASHI
SSIFSRIGDPAPNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGVPVCPGLNSQSPTSNHSPISCPP
TCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCKTCTTPAQGNSMYPSCCCTKPSDGNC
TCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWFVGLSPTVWLSAIWMMWYWGPNLYNILSPFIPLLPIFFCLWVYI
>Q8JMY6 ~~~S~~~Large envelope protein~~~
MGAPLSTARRGMGQNLSVPNPLGFFPDHQLDPLFRANSSSPDWDFNTNKDNWPMANKVGVGGFGPGFTPPHGGLLGWSPQ
AQGILTTSPPDPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRGLYFPAGGSSSETQNPAPTIASL
TSSIFSKTGDPAMNMENITSGLLRPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGVPPGCPGQNSQSPISNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCKTCTTLAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWASI
>Q8JN07 ~~~S~~~Large envelope protein~~~
MGAPLSTARRGMGQNLSVPNPLGFFPDHQLDPLFRANSSSPDWDFNTNKDNWPMANKVGVGGFGPGFTPPHGGLLGWSPQ
AQGILTTSPPDPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRGLYFPAGGSSSETQNPVPTIASL
TSSIFSKTGDPAMNMENITSGLLGPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGVPPGCPGQNSQSPISNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCKTCTTLAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWASI
>Q8JMZ6 ~~~S~~~Large envelope protein~~~
MGAPLSTARRGMGQNLSVPNPLGFFPDHQLDPLFRANSSSPDWDFNTNKDNWPMANKVGVGGFGPGFTPPHGGLLGWSPQ
AQGILTTSPPDPPPASTNRRSGRKPTPVSPPLRDTHPQAMQWNSTQFHQALLDPRVRGLYLPAGGSSSETQNPVPTIASL
TSSIFSKTGDPAMNMENITSGLLGPLLVLQAVCFLLTKILTIPQSLDSWWTSLNFLGVPPGCPGQNSQSPISNHLPTSCP
PTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLLPGSTTTSTGPCKTCTTLAQGTSMFPSCCCTKPSDGN
CTCIPIPSSWAFGKYLWEWASARFSWLSLLVQFVQWCVGLSPTVWLLVIWMIWYWGPNLCSILSPFIPLLPIFCYLWASI
>P17400 ~~~S~~~Large envelope protein~~~
MGNNIKVTFNPDKIAAWWPAVGTYYTTTYPQNQSVFQPGIYQTTSLINPKNQQELDSVLINRYKQIDWNTWQGFPVDQKF
SLVSRDPPPKPYINQSAQTFEIKPGPIIVPGIRDIPRGLVPPQTPTNRDQGRKPTPPTPPLRDTHPHLTMKNQTFHLQGF
VDGLRDLTTTERQHNAYGDPFTTLSPAVPTVSTILSPPSTTGDPALSPEMSPSSLLGLLAGLQVVYFLWTKILTIAQNLD
WWWTSLSFPGGIPECTGQNSQFQTCKHLPTSCPPTCNGFRWMYLRRFIIYLLVLLLCLIFLLVLLDWKGFIPVCPLQPTT
ETTVNCRQCTISAQDMYTPPYCCCLKPTAGNCTCWPIPSSWALGNYLWEWALARFSWLNLLVPLLQWLGGISLIAWFLLI
WMIWFWGPALLSILPPFIPIFVLFFLIWVYI
>P0C746 ~~~HBZ~~~HTLV-1 basic zipper factor~~~
MVNFVSAGLFRCLPVSCPEDLLVEELVDGLLSLEEELKDKEEEEAVLDGLLSLEEESRGRLRRGPPGEKAPPRGETHRDR
QRRAEEKRKRKKEREKEEEKQTAEYLKRKEEEKARRRRRAEKKAADVARRKQEEQERRERKWRQGAEKAKQHSARKEKMQ
ELGIDGYTRQLEGEVESLEAERRKLLQEKEDLMGEVNYWQGRLEAMWLQ
>P0C745 ~~~HBZ~~~HTLV-1 basic zipper factor~~~
MVNFVSVGLFRCLPVPCPEDLLVEELVDGLLSLEEELKDKEEEETVLDGLLSLEEESRGRLRRGPPGGKAPPRGETHRDR
QRRAEEKRKRKKEREKEEEKQIAEYLKRKEEEKARRRKRAEEKAADFARRKQEEQERRERKWRQGAEKAKQHSARKEKMQ
ELGVDGYTRQLEGEVESLEAERRRLLQEKEDLMGEVNYWQGRLEAMWLQ
>Q38584 ~~~~~~Head completion protein gp15~~~
MDIQRVKRLLSITNDKHDEYLTEMVPLLVEFAKDECHNPFIDKDGNESIPSGVLIFVAKAAQFYMTNAGLTGRSMDTVSY
NFATEIPSTILKKLNPYRKMAR
>O48446 ~~~~~~Head completion protein gp16~~~
MYEEFPDVITFQSYVEQSNGEGGKTYKWVDEFTAAAHVQPISQEEYYKAQQLQTPIGYNIYTPYDDRIDKKMRVIYRGKI
VTFIGDPVDLSGLQEITRIKGKEDGAYVG
>D3WAC7 ~~~~~~Probable head completion protein 1~~~
MIFSQVTLQVEKTVKKKNGAEDNVIKPITLPAVKQRISQSRLDEFSMIGLGKNVRYELNGIGEMEDLIFNYFLDEKGETF
KRTTWERNPKNNKMILEGLVSNGI
>D3WAC8 ~~~~~~Probable head completion protein 2~~~
MEFDSYIDWYNNLLTMPLNDVILGVKDTIEDKTVYLSLSDSKVIKMDNTSFVMGYYYQVVLSVKDVDDELVGLVGNVLQN
GWNMTNWSENSHLYNYTGTVYLPCGAGGQAWQ
>P68660 ~~~W~~~Head completion protein~~~
MTRQEELAAARAALHDLMTGKRVATVQKDGRRVEFTATSVSDLKKYIAELEVQTGMTQRRRGPAGFYV
>P15075 3.1.-.-~~~~~~Head completion nuclease~~~
MAYSGKWVPKNISKYRGDPKKITYRSNWEKFFFEWLDKNPEIIAWGSETAVIPYFCNAEGKKRRYFMDIWMKDSSGQEFF
IEIKPKKETQPPVKPAHLTTAAKKRFMNEIYTWSVNTDKWKAAQSLAEKRGIKFRILTEDGLRALGFKGA
>Q08539 ~~~~~~Protein Early 65 kDa~~~
MVYICIDTGSHAKGYAVESSDTDYHIYTKCDRETFEKFIDNKELLKNRHAKDESGNDVKYVDLYTGLIGILTGKSPELSM
FSKREDFKDKYGIENLQLYEFVTKLMTVSMVKIIYTLMRYKILNNAKGLLQLMFNYVYVEYYLDYKRAPKSTKILNMLFN
VGDEIKITMDKNNQLVVNDLNVLDLNKNNGGYKIDETLTLFVKNVKLLKLYVKLMQRGEYQQEWTEYFQQWKQQLQDRLH
HVPEPPERTDIRHNIVMYALNERGPVMPEDENKIVYQIYPSVSHLDQGKKGTLADKEIIVQEKLDGCNFRIICNQNKITY
GSRNTYRPDGNFMNYYRIRKDLETCMRSLQARFNDGFIVYGELMGWKDDAKTTPINVINYVDQKESLKYYAYEIQLYGGE
FVPFVEAQELLTNVGFNTIPCHKYLYNDFVERLNFKSLMFPQSPLEGFIIRCGNLIYKLKSDYKDLNKLKIEKGPFEWLT
CDYIKSNCDAIDKSDMMKILIFCYNMCKVKNYNEKLLFNKVFNLFRQQFNLNHNDYKNLYKQYVNMCKCTEYK
>P11107 3.6.4.-~~~~~~Probable helicase D10~~~
MKVVISNKAYFKPDDELWDYCSKQTTYHIETMTSKYPIMYKNSGVVAKEIKWIPITRLDLLDAKGIKYELVDKRTLAPVD
IPKPKFKLREEDQLPIYEECDDTCIINGKPGFGKTILALALAYKFGQKTLVICTNTSIREMWAAEVRKWFGFEPGIIGSG
KYNIDPPIVVSNIQTVNKHANNLSKVFGTVIVDEVHHCVATTFTNFLEISCARYKIGLSGTLKRKDGLQVMFKDFFGYKI
FSPPVNNTVAPTIHRYSVPVELSGNQNVPWALRANDVYNHPEYRETIINLAHLYVNMGHKVLIVSDRTELIQTILEALTQ
RGVTTYEIIGATHLDDRLKIQEDIAKGGPCVLAAAQSIFSEGISLNELSCLIMGSLINNESLIEQLAGRVQRIVEGKLDP
IVVDLIMKGGTGLRQASGRMAVYRNNGWKTITMTPEKAVQLAKIAFGNSS
>P04530 3.6.4.-~~~~~~DnaB-like replicative helicase~~~
MVEIILSHLIFDQAYFSKVWPYMDSEYFESGPAKNTFKLIKSHVNEYHSVPSINALNVALENSSFTETEYSGVKTLISKL
ADSPEDHSWLVKETEKYVQQRAMFNATSKIIEIQTNAELPPEKRNKKMPDVGAIPDIMRQALSISFDSYVGHDWMDDYEA
RWLSYMNKARKVPFKLRILNKITKGGAETGTLNVLMAGVNVGKSLGLCSLAADYLQLGHNVLYISMEMAEEVCAKRIDAN
MLDVSLDDIDDGHISYAEYKGKMEKWREKSTLGRLIVKQYPTGGADANTFRSLLNELKLKKNFVPTIIIVDYLGICKSCR
IRVYSENSYTTVKAIAEELRALAVETETVLWTAAQVGKQAWDSSDVNMSDIAESAGLPATADFMLAVIETEELAAAEQQL
IKQIKSRYGDKNKWNKFLMGVQKGNQKWVEIEQDSTPTEVNEVAGSQQIQAEQNRYQRNESTRAQLDALANELKF
>P03692 2.7.7.-~~~4~~~DNA helicase/primase~~~
MDNSHDSDSVFLYHIPCDNCGSSDGNSLFSDGHTFCYVCEKWTAGNEDTKERASKRKPSGGKPMTYNVWNFGESNGRYSA
LTARGISKETCQKAGYWIAKVDGVMYQVADYRDQNGNIVSQKVRDKDKNFKTTGSHKSDALFGKHLWNGGKKIVVTEGEI
DMLTVMELQDCKYPVVSLGHGASAAKKTCAANYEYFDQFEQIILMFDMDEAGRKAVEEAAQVLPAGKVRVAVLPCKDANE
CHLNGHDREIMEQVWNAGPWIPDGVVSALSLRERIREHLSSEESVGLLFSGCTGINDKTLGARGGEVIMVTSGSGMGKST
FVRQQALQWGTAMGKKVGLAMLEESVEETAEDLIGLHNRVRLRQSDSLKREIIENGKFDQWFDELFGNDTFHLYDSFAEA
ETDRLLAKLAYMRSGLGCDVIILDHISIVVSASGESDERKMIDNLMTKLKGFAKSTGVVLVVICHLKNPDKGKAHEEGRP
VSITDLRGSGALRQLSDTIIALERNQQGDMPNLVLVRILKCRFTGDTGIAGYMEYNKETGWLEPSSYSGEEESHSESTDW
SNDTDF
>P24307 3.6.4.12~~~HELI~~~DNA replication helicase~~~
MIDNILQFFLKNVPQDKTYEINNLQDANHLIIRNTRTGTRRLFEYVNNFQQFLNTIRNNFNGPCAKHDMGASCEDTEEPA
EKHAAQTLDGHDWVLESNDFCIFVKPFILKKHYDIIQKYINFEDFFKSTDPGYINKCVQAGDYYYWPNWPKKQAFSFNGW
QLFLNIKFGIVIEPTIPIIHNKKLGPVDLFVFDPKCFLNVELSLRTNHDPPQTLFVNGKTKFDDSHEDLFILKMADGTVA
TCKVNGELVNSDKNFFNYIRDDINLEECITVPKYKHIVNVNLKSLRVFENNNFDKNDVDLSDTRSRKPRIVPIISASSEN
ADYIQTQINLGLIAIHENMVKVLATHERANDPNLLQQYFEKSKFKNFDFLIYVLWKILTKNENFSYRETDIKLFLELLCE
SLFACDKEALNEALKRCEPYKKQEKIVFNRACNHWFDFDDTKLCVSLGYYFGIHYMIYLTQSAKNEILDHDELWAYTYEN
VMALNLPPDIVCKGFFRKLENVVTGVNLVFNGKHYQIVKKEDDLFKLTKSNCYKLSNIKFNNWKYLYLTTHGVYNVFTNS
FHSSCPFLLGTTLPQTFKKPTDEKYLPEDAFNYMLSTSADELSIYRTYHIAKMCRDVKMLKTNTAIVNYMGNCNTCQADM
RVALNNLFRDLWNLDDENLITLALYVNKNRVSDMLHNLKCKPCRSTVSGSRPKCKCYKKIKINRKALKVCLMADMFGNDA
ELSELIWMLIFTNKTYVSTTLIRTNSEFVNQHGEFFSKEHNKIIQYLYRTIHKIEYVDMLMDKFNDKRLFLTELRDDVAR
EPDVQFEESDNISKFYTHHADALMILKKYNVWWDKIILARSTDDLPTWLTRFYMRIIMSKVDLKEYSYNYLKKIVEGYLY
FKRFTNFNHANAIMLMHFAASLAIPVDYGKKAIYMPGEPGSGKSSFFELLDYLVLMHKFDDDNHSGESNKETSDKEVSKL
NSQLYTINELKQCSESYFKKHADSSKSDSKSRKYQGLLKYEANYKMLIVNNKPLYVDDYDDGVQDRFLIVYTNHKFVDSV
KFAGSVYEHIKSKQFPIESMYYESLVTPVRLFLSHVLMYRRDPKTGFVVYKTLLSNDPMHKHNLMCLSTNNSPLYALIYI
LNIKTVRSATITIGEDKMEEMIGIAVQHFKNFLHPSFVQYNYKKNINASSSKSFVFNEQVLLQQIKNKFKNNYNKTTNVF
YNMTMALNRNDLNTSVPNFVC
>P31964 ~~~HE~~~Truncated non-functional hemagglutinin-esterase homolog~~~
MNFTVPVQAIQSIWSVGKESDDAIAEACKPPFCIYFSKKTPYTVTNGSNADHGDDEVRQMMRGLLYNSSCISAQGHTPLA
LYSTAMLYPPMYGSCPQYVKLFDGSGSESVDVISSSYFVATWVLLVVVIILVFIIISFCISN
>P0C0V9 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MLSLILFFPSFAFAATPVTPYYGPGHITFDWCGFGDSRSDCTNPQSPMSLDIPQQLCPKFSSKSSSSMFLSLHWNNHSSF
VSYDYFNCGVEKVFYEGVNFSPRKQYSCWDEGVDGWIELKTRFYTKLYQMATTSRCIKLIQLQAPSSLPTLQAGVCRTNK
QLPDNPRLALLSDTVPTSVQFVLPGSSGTTICTKHLVPFCYLNHGCFTTGGSCLPFGVSYVSDSFYYGYYDATPQIGSTE
SHDYVCDYLFMEPGTYNASTVGKFLVYPTKSYCMDTMNITVPVQAVQSIWSEQYASDDAIGQACKAPYCIFYNKTTPYTV
TNGSDANHGDDEVRMMMQGLLRNSSCISPQGSTPLALYSTEMIYEPNYGSCPQFYKLFDTSGNENIDVISSSYFVATWVL
LVVVVILIFVIISFFC
>P0C0W0 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MLSLILFFPSFAFAATPVTPYYGPGHITFDWCGFGDSRSDCTNPQSPMSLDIPQQLCPKFSSKSSSSMFLSLHWNNHSSF
VSYDYFNCGVEKVFYEGVNFSPRKQYSCWDEGVDGWIELKTRFYTKLYQMATTSRCIKLIQLQAPSSLPTLQAGVCRTNK
QLPDNPRLALLSDTVPTSVQFVLPGSSGTTICTKHLVPFCYLNHGCFTTGGSCLPFGVSYVSDSFYYGYYDATPQIGSTE
SHDYVCDYLFMEPGTYNASTVGKFLVYPTKSYCMDTMNITVPVQAVQSIWSEQYASDDAIGQACKAPYCIFYNKTTPYTV
TNGSDANHGDDEVRMMMQGLLRNSSCISPQGSTPLALYSTEMIYEPNYGSCPQFYKLFDTSGNENIDVISSSYFVATWVL
LVVVVILIFVIISFFC
>P24306 ~~~H~~~Hemagglutinin glycoprotein~~~
MLPYQDKVGAFYKDNARANSTKLSLVTEGHGGRRPPYLLFVLLILLVGILALLAITGVRFHQVSTSNMEFSRLLKEDMEK
SEAVHHQVIDVLTPLFKIIGDEIGLRLPQKLNEIKQFILQKTNFFNPNREFDFRDLHWCINPPSTVKVNFTNYCESIGIR
KAIASAANPILLSALSGGRGDIFPPHRCSGATTSVGKVFPLSVSLSMSLISRTSEVINMLTAISDGVYGKTYLLVPDDIE
REFDTREIRVFEIGFIKRWLNDMPLLQTTNYMVLPKNSKAKVCTIAVGELTLASLCVEESTVLLYHDSSGSQDGILVVTL
GIFWATPMDHIEEVIPVAHPSMKKIHITNHRGFIKDSIATWMVPALASEKQEEQKGCLESACQRKTYPMCNQASWEPFGG
RQLPSYGRLTLPLDASVDLQLNISFTYGPVILNGDGMDYYESPLLNSGWLTIPPKDGTISGLINKAGRGDQFTVLPHVLT
FAPRESSGNCYLPIQTSQIRDRDVLIESNIVVLPTQSIRYVIATYDISRSDHAIVYYVYDPIRTISYTHPFRLTTKGRPD
FLRIECFVWDDNLWCHQFYRFEADIANSTTSVENLVRIRFSCNR
>Q66165 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MFLLPRFVLVSCIIGSLGFENPPTNVVSHLNGDWFLFGDSRSDCNHVVNTNPRNYSYMDLNPALCDSGKISSKAGNSIFR
SFHFTDFYNYTGEGQQIIFYEGVNFTPYHAFKCTTSGSNDIWMQNKGLFYTQVYKNMAVYRSLTFVNVPYVYNGSAQSTA
LCKSGSLVLNNPAYIAREANFGDYYYKVEADFYLSGCDEYIVPLCIFNGKFLSNTKYYDDSQYYFNKDTGVIYGLNSTET
ITTGFDFNCHYLVLPSGNYLAISNELLLTVPTKAICLNKRKDFTPVQVVDSRWNNARQSDNMTAVACQPPYCYFRNSTTN
YVGVYDINHGDAGFTSILSGLLYDSPCFSQQGVFRYNNVSSVWPLYPYGRCPTAADINTPDVPICVYDPLPLILLGILLG
VAVIIIVVLLLYFMVDNGTRLHDA
>P59710 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MFLLLRFVLVSCIIGSLGFDNPPTNVVSHLNGDWFLFGDSRSDCNHVVNTNPRNYSYMDLNPALCDSGKISSKAGNSIFR
SFHFTDFYNYTGEGQQIIFYEGLNFTPYHAFKCTTSGSNDIWMQNKGLFYTQVYKNMAVYRSLTFVNVPYVYNGSAQSTA
LCKSGSLVLNNPAYIAREANFGDYYYKVEADFYLSGCDEYIVPLCIFNGKFLSNTKYYDDSQYYFNKDTGVIYGLNSTET
ITTGFDFNCHYLVLPSGNYLAISNELLLTVPTKAICLNKRKDFTPVQVVDSRWNNARQSDNMTAVACQPPYCYFRNSTTN
YVGVYDINHGDAGFTSILSGLLYDSPCFSQQGVFRYDNVSSVWPLYSYGRCPTAADINTPDVPICVYDPLPLILLGILLG
VAVIIIVVLLLYFMVDNGTRLHDA
>P15776 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MFLLLRFVLVSCIIGSLGFDNPPTNVVSHLNGDWFLFGDSRSDCNHVVNTNPRNYSYMDLNPALCDSGKISSKAGNSIFR
SFHFTDFYNYTGEGQQIIFYEGVNFTPYHAFKCTTSGSNDIWMQNKGLFYTQVYKNMAVYRSLTFVNVPYVYNGSAQSTA
LCKSGSLVLNNPAYIAREANFGDYYYKVEADFYLSGCDEYIVPLCIFNGKFLSNTKYYDDSQYYFNKDTGVIYGLNSTET
ITTGFDFNCHYLVLPSGNYLAISNELLLTVPTKAICLNKRKDFTPVQVVDSRWNNARQSDNMTAVACQPPYCYFRNSTTN
YVGVYDINHGDAGFTSILSGLLYDSPCFSQQGVFRYDNVSSVWPLYSYGRCPTAADINTPDVPICVYDPLPLILLGILLG
VAVIIIVVLLLYFMVDNGTRLHDA
>Q5MQD1 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MLIIFLFFYFCYGFNEPLNVVSHLNHDWFLFGDSRSDCNHINNLKIKNFDYLDIHPSLCNNGKISSSAGDSIFKSFHFTR
FYNYTGEGDQIIFYEGVNFNPYHRFKCFPNGSNDVWLLNKVRFYRALYSNMAFFRYLTFVDIPYNVSLSKFNSCKSDILS
LNNPIFINYSKEVYFTLLGCSLYLVPLCLFKSNFSQYYYNIDTGSVYGFSNVVYPDLDCIYISLKPGSYKVSTTAPFLSL
PTKALCFDKSKQFVPVQVVDSRWNNERASDISLSVACQLPYCYFRNSSANYVGKYDINHGDSGFISILSGLLYNVSCISY
YGVFLYDNFTSIWPYYSFGRCPTSSIIKHPICVYDFLPIILQGILLCLALLFVVFLLFLLYNDKSH
>O92367 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MARTDAMAPRTLLLVLSLGYAFGFNEPLNVVSHLNDDWFLFGDSRSDCNHINNLSQQNYNYMDINPELCKSGKISAKAGN
SLFKSFHFTDFYNYTGEGSQIIFYEGVNFTPYVGFKCLNNGDNNRWMGNKARFYTQLYQKMAHYRSLSVINITYTYNGSA
GPVSMCKHIANGVTLTLNNPTFIGKEVSKPDYYYESEANFTLQGCDEFIVPLCVFNGQYLSSKLYYDDSQYYYNVDTGVL
YGFNSTLNITSGLDLTCIYLALTPGNYISISNELLLTVPSKAICLRKPKAFTPVQVVDSRWHSNRQSDNMTAIACQLPYC
YFRNTTSDYNGVYDSHHGDAGFTSILAGLMYNVSCLAQQGAFVYNNVSSSWPQYPYGHCPTAANIVFMAPVCMYDPLPVI
LLGVLLGIAVLIIVFLMFYFMTDSGVRLHEA
>Q83356 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MGSTCIAMAPRTLLLLIGCQLVFGFNEPLNIVSHLNDDWFLFGDSRSDCTYVENNGHPKLDWLDLDPKLCNSGKISAKSG
NSLFRSFHFTDFYNYTGEGDQIVFYEGVNFSPNHGFKCLAYGDNKRWMGNKARFYARVYEKMAQYRSLSFVNVPYAYGGK
AKPTSICKHKTLTLNNPTFISKESNYVDYYYESEANFTLAGCDEFIVPLCVFNGHSKGSSSDPANKYYMDSQSYYNMDTG
VLYGFNSTLDVGNTAKDPGLDLTCRYLALTPGNYKAVSLEYLLSLPSKAICLRKPKRFMPVQVVDSRWNSTRQSDNMTAV
ACQLPYCFFRNTSADYSGGTHDVHHGDFHFRQLLSGLLLNVSCIAQQGAFLYNNVSSSWPAYGYGQCPTAANIGYMAPVC
IYDPLPVVLLGVLLGIAVLIIVFLILYFMTDSGVRLHEA
>P31614 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MGCMCIAMAPRTLLLLIGCQLVFGFNEPLNIVSHLNDDWFLFGDSRSDCTYVENNGHPKLDWLDLDPKLCNSGRISAKSG
NSLFRSFHFIDFYNYSGEGDQVIFYEGVNFSPSHGFKCLAYGDNKRWMGNKARFYARVYEKMAQYRSLSFVNVSYAYGGN
AKPTSICKDKTLTLNNPTFISKESNYVDYYYESEANFTLQGCDEFIVPLCVFNGHSKGSSSDPANKYYTDSQSYYNMDTG
VLYGFNSTLDVGNTVQNPGLDLTCRYLALTPGNYKAVSLEYLLSLPSKAICLRKPKSFMPVQVVDSRWNSTRQSDNMTAV
ACQLPYCFFRNTSADYSGGTHDVHHGDFHFRQLLSGLLYNVSCIAQQGAFVYNNVSSSWPAYGYGHCPTAANIGYMAPVC
IYDPLPVILLGVLLGIAVLIIVFLMFYFMTDSGVRLHEA
>O91262 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MGSMCIAMAPRTLLLLIGCQLALGFNEPLNVVSHLSDDWFLFGDSRSDCSYVENNGHPAFDWLDLPQELCHSGKISAKSG
NSLFKSFHFTDWYNYTGEGDQVIFYEGVNFSPSHGFKCLAEGDNKRWMGNKARFYALVYKKMAYYRSLSFVNVSYSYGGK
AKPTAICKDNTLTLNNPTFISKESNYVDYYYESDANFTLEGCDEFIVPLCVFNGHSRGSSSDPANKYYMDSQMYYNMDTG
VFYGFNSTLDVGNTAQNPGLDLTCIYYALTPGNYKAVSLEYLLTIPSKAICLRKPKRFMPVQVVDSRWNNAKHSDNMTAV
ACQTPYCLFRNTSSGYNGSTHDVHHGGFHFRKLLSGLLYNVSCIAQQGAFFYNNVSSQWPVLGYGQCPTAANIEFIAPVC
LYDPLPVILLGVLLGIAVLIIVFLLFYFMTDSGVRLHEA
>O72737 ~~~~~~Protein OPG185~~~
MARLPILLLLISLVYSTPSPQTSKKIGDDATLSCNRNNTTDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSTTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSEKPDYIDNS
NCSSVFEIATPEPITDNEEDHTDVTYTSENINTVSTTSRESTTDETPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKST
TDDADLYDTYNDNDTVPPTTVGGSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYICNKRSRKYKTENKV
>Q9Q9G3 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MLSLILFFPSFAFAVTPVTPYFGPLYITFNCCLFGDSRSDCTKVQSPMSLDNPQNFCPNFSLKSSSSMFFSIHYNNHSSL
VLFDNFNCRIEKVYYNGVNLSPRNQYSCYDEGVDSYMELKTSFNIKLNQMATILRCIKLIQLKARSSFTTLQDVVCRTNK
YLPNNPTFALLSDTVPTWVQFVLPDLSGKTICIKYLVPFCHLNHGCFTAGSSCPPFGVSYVSDSFNYGFNDATPYIGLAE
SHDNVCDYLFVEAGTHNASIVGNFLFYPTKSYCFNTMNFTVPVQAIQSIWSEGNESDDAIAEACKPPFCIYYSKTTPYTV
TNGSNADHRDDEVRMMVRGLLYNSSCISAQGSTPLALYSTAMLYAPIYGSCPQYVKLFDTSGSESVDVISSSYFVATWVL
LVVVVILIFVIISFFC
>P03438 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLALGQDLPGNDNSTATLCLGHHAVPNGTLVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGIDC
TLIDALLGDPHCDVFQKETWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSIACKRGPD
SGFFSRLNWLTKSESTYPVLNVTMPNNDNFDKLYIWGIHHPSTNQEQTSLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GLSSRISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIESIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWTCQRGNI
RCNICI
>Q8QPL1 ~~~HA~~~Hemagglutinin~~~
MEKIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLDGVKPLILRDCSVAGWLLGN
PMCDEFINVPEWSYIVEKASPANDLCYPGDFNDYEELKHLLSRINHFEKIQIIPKSSWSNHEASSGVSSACPYQGKSSFF
RNVVWLIKKNSAYPTIKRSYNNTNQEDLLILWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPKIATRSKVNGQSG
RMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSELEYGNCNTKCQTPMGAINSSMPFHNIHPLTIGECPK
YVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNS
IIDKMNTQFEAVGREFNNLERRIENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELG
NGCFEFYHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMVAGLSLWMCSNG
SLQCRICI
>Q289M7 ~~~HA~~~Hemagglutinin~~~
MKVKLLVLLCTFTATYADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCLLKGIAPLQLGNCSVAGWILG
NPECELLISKESWSYIVETPNPENGTCYPGYFADYEELREQLSSVSSFERFEIFPKESSWPNHTVTGVSASCSHNGKSSF
YRNLLWLTGKNGLYPNLSKSYANNKEKEVLVLWGVHHPPNIGDQRALYHTENAYVSVVSSHYSRRFTPEIAKRPKVRNQE
GRINYYWTLLEPGDTIIFEANGNLIAPRYAFALSRGFGSGIITSNAPMDECDAKCQTPQGAINSSLPFQNVHPVTIGECP
KYVRSAKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMVDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVIE
KMNTQFTAVGKEFNKLERRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKSQLKNNAKEIGNGC
FEFYHKCNNECMESVKNGTYDYPKYSEESKLNREKIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSLQ
CRICI
>Q9WFX3 ~~~HA~~~Hemagglutinin~~~
MEARLLVLLCAFAATNADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCKLKGIAPLQLGKCNIAGWLLG
NPECDLLLTASSWSYIVETSNSENGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKTSSWPNHETTKGVTAACSYAGASS
FYRNLLWLTKKGSSYPKLSKSYVNNKGKEVLVLWGVHHPPTGTDQQSLYQNADAYVSVGSSKYNRRFTPEIAARPKVRDQ
AGRMNYYWTLLEPGDTITFEATGNLIAPWYAFALNRGSGSGIITSDAPVHDCNTKCQTPHGAINSSLPFQNIHPVTIGEC
PKYVRSTKLRMATGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAIDGITNKVNSVI
EKMNTQFTAVGKEFNNLERRIENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVRNLYEKVKSQLKNNAKEIGNG
CFEFYHKCDDACMESVRNGTYDYPKYSEESKLNREEIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>Q9WCD9 ~~~HA~~~Hemagglutinin~~~
MKAILLVLLCAFAATNADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLGGIAPLQLGKCNIAGXXLG
NPECDLLLTVSSWSYIVETSNSDNGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKTSSWPNHETTRGVTAACPYAGASS
FYRNLLWLVKKENSYPKLSKSYVNNKGKEVLVLWGVHHPPTSTDQQSLYQNADAYVSVGSSKYDRRFTPEIAARPKVRGQ
AGRMNYYWTLLEPGDTITFEATGNLVAPRYAFALNRGSESGIITSDAPVHDCDTKCQTPHGAINSSLPFQNIHPVTIGEC
PKYVKSTKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGLIDGWYGYHHQNGQGSGYAADQKSTQNAIDGITNKVNSVI
EKMNTQFTVVGKEFNNLERRIKNLNKKVDDGFLDVWTYNAEMLVLLENERTLDFHDSNVKNLYEKARSQLRNNAKEIGNG
CFEFYHKCDDACMESVRNGTYDYPKYSEESKLNREEIDGVKLESMMVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>P03454 ~~~HA~~~Hemagglutinin~~~
MKAKLLVLLYAFVATDADTICIGYHANNSTDTVDTIFEKNVAVTHSVNLLEDRHNGKLCKLKGIAPLQLGKCNITGWLLG
NPECDSLLPARSWSYIVETPNSENGACYPGDFIDYEELREQLSSVSSLERFEIFPKESSWPNHTFNGVTVSCSHRGKSSF
YRNLLWLTKKGDSYPKLTNSYVNNKGKEVLVLWGVHHPSSSDEQQSLYSNGNAYVSVASSNYNRRFTPEIAARPKVKDQH
GRMNYYWTLLEPGDTIIFEATGNLIAPWYAFALSRGFESGIITSNASMHECNTKCQTPQGSINSNLPFQNIHPVTIGECP
KYVRSTKLRMVTGLRNIPSIQYRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVIE
KMNTQFTAVGKEFNNLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDLNVKNLYEKVKSQLKNNAKEIGNGC
FEFYHKCDNECMESVRNGTYDYPKYSEESKLNREKIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSLQ
CRICI
>P03459 ~~~HA~~~Hemagglutinin~~~
MNTQILVFALVAVIPTNADKICLGHHAVSNGTKVNTLTERGVEVVNATETVERTNIPKICSKGKRTTDLGQCGLLGTITG
PPQCDQFLEFSADLIIERREGNDVCYPGKFVNEEALRQILRGSGGIDKETMGFTYSGIRTNGTTSACRRSGSSFYAEMEW
LLSNTDNASFPQMTKSYKNTRRESALIVWGIHHSGSTTEQTKLYGSGNKLITVGSSKYHQSFVPSPGTRPQINGQSGRID
FHWLILDPNDTVTFSFNGAFIAPNRASFLRGKSMGIQSDVQVDANCEGECYHSGGTITSRLPFQNINSRAVGKCPRYVKQ
ESLLLATGMKNVPEPSKKREKRGLFGAIAGFIENGWEGLVDGWYGFRHQNAQGEGTAADYKSTQSAIDQITGKLNRLIEK
TNQQFELIDNEFTEVEKQIGNLINWTKDFITEVWSYNAELLVAMENQHTIDLADSEMNKLYERVRKQLRENAEEDGTGCF
EIFHKCDDDCMASIRNNTYDHSKYREEAMQNRIQIDPVKLSSGYKDVILWFSFGASCFLLLAIAVGLVFICVKNGNMRCT
ICI
>P03452 ~~~HA~~~Hemagglutinin~~~
MKANLLVLLCALAAADADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLKGIAPLQLGKCNIAGWLLG
NPECDPLLPVRSWSYIVETPNSENGICYPGDFIDYEELREQLSSVSSFERFEIFPKESSWPNHNTNGVTAACSHEGKSSF
YRNLLWLTEKEGSYPKLKNSYVNKKGKEVLVLWGIHHPPNSKEQQNLYQNENAYVSVVTSNYNRRFTPEIAERPKVRDQA
GRMNYYWTLLKPGDTIIFEANGNLIAPMYAFALSRGFGSGIITSNASMHECNTKCQTPLGAINSSLPYQNIHPVTIGECP
KYVRSAKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNTVIE
KMNIQFTAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKSQLKNNAKEIGNGC
FEFYHKCDNECMESVRNGTYDYPKYSEESKLNREKVDGVKLESMGIYQILAIYSTVASSLVLLVSLGAISFWMCSNGSLQ
CRICI
>Q0HD60 ~~~HA~~~Hemagglutinin~~~
MKAKLLILLCALSATDADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLKGIAPLQLGKCNIAGWILG
NPECESLLSKRSWSYIAETPNSENGTCYPGDFADYEELREQLSSVSSFERFEIFPKERSWPNHNINIGVTAACSHAGKSS
FYKNLLWLTEKDGSYPNLNKSYVNKKEKEVLVLWGVHHPSNIENQKTLYRKENAYVSVVSSNYNRRFTPEIAERPKVRGQ
AGRMNYYWTLLEPGDTIIFEANGNLIAPWYAFALSRGLGSGIITSNASMDECDTKCQTPQGAINSSLPFQNIHPFTIGEC
PKYVRSTKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWAGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVI
EKMNTQFTAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKNQLRNNAKEIGNG
CFEFYHKCNNECMESVKNGTYDYPKYSEESKLNREKIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>P12581 ~~~HA~~~Hemagglutinin~~~
MYKVVVIIALLGAVRGLDKICLGHHAVANGTIVKTLTNVQEEVTNATETVESTSLNRLCMKGRSYKDLGNCHPIGMLIGT
PACDLHLTGTWDTLIERKNAIAYCYPGTTINEGALRQKIMESGGISKTSTGFAYGSSINSAGTTKACMRNGGDSFYAEVK
WLVSKDKGQNFPQTTNTYRNTDTAEHLIIWGIHHPSSTQEKNDLYGTQSLSISVGSSTYQNNFVPVVRARPQVNGQSGRI
DFHWTLVQPGDNITFSHNGGRIAPSRVSKLVGRGLGIQSEASIDNGCESKCFWRGGSINTKLPFQNLSPRTVGQCPKYVN
KKSLMLATGMRNVPEIMQGRGLFGAIAGFIENGWEGMVDGWYGFRHQNAQGTGQAADYKSTQAAIDQITGKLNRLIEKTN
TEFESIESEFSEIEHQIGNVINWTKDSITDIWTYQAELLVAMENQHTIDMADSEMLNLYERVRKQLRQNAEEDGKGCFEI
YHTCDDSCMESIRNNTYDHSQYREEALLNRLNINSVKLSSGYKDIILWFSFGASCFVLLAAVMGLVFFCLKNGNMQCTIC
I
>Q0A448 ~~~HA~~~Hemagglutinin~~~
MYKVVVIIALLGAVKGLDRICLGHHAVANGTIVKTLTNEQEEVTNATETVESTNLNKLCMKGRSYKDLGNCHPVGMLIGT
PVCDPHLTGTWDTLIERENAIAHCYPGATINEEALRQKIMESGGISKMSTGFTYGSSINSAGTTKACMRNGGDSFYAELK
WLVSKTKGQNFPQTTNTYRNTDTAEHLIIWGIHHPSSTQEKNDLYGTQSLSISVESSTYQNNFVPVVGARPQVNGQSGRI
DFHWTLVQPGDNITFSHNGGLIAPSRVSKLTGRGLGIQSEALIDNSCESKCFWRGGSINTKLPFQNLSPRTVGQCPKYVN
QRSLLLATGMRNVPEVVQGRGLFGAIAGFIENGWEGMVDGWYGFRHQNAQGTGQAADYKSTQAAIDQITGKLNRLIEKTN
TEFESIESEFSETEHQIGNVINWTKDSITDIWTYQAELLVAMENQHTIDMADSEMLNLYERVRKQLRQNAEEDGKGCFEI
YHTCDDSCMESIRNNTYDHSQYREEALLNRLNINSVKLSSGYKDIILWFSFGASCFVLLAVVMGLVFFCLKNGNMRCTIC
I
>P19696 ~~~HA~~~Hemagglutinin~~~
MLSIVILFLLIAENSSQNYTGNPVICMGHHAVANGTMVKTLADDQVEVVTAQELVESQNLPELCPSPLRLVDGQTCDIIN
GALGSPGCDHLNGAEWDVFIERPNAVDTCYPFDVPEYQSLRSILANNGKFEFIAEEFQWNTVKQNGKSGACKRANVDDFF
NRLNWLVKSDGNAYPLQNLTKINNGDYARLYIWGVHHPSTSTEQTNLYKNNPGRVTVSTKTSQTSVVPDIGSRPLVRGQS
GRVSFYWTIVEPGDLIVFNTIGNLIAPRGHYKLNNQKKSTILNTAIPIGSCVSKCHTDKGSLSTTKPFQNISRIAVGDCP
RYVKQGSLKLATGMRNIPEKASRGLFGAIAGFIENGWQGLIDGWYGFRHQNAEGTGTAADLKSTQAAIDQINGKLNRLIE
KTNDKYHQIEKEFEQVEGRIQDLENYVEDTKIDLWSYNAELLVALENQHTIDVTDSEMNKLFERVRRQLRENAEDKGNGC
FEIFHKCDNNCIESIRNGTYDHDIYRDEAINNRFQIQGVKLTQGYKDIILWISFSISCFLLVALLLAFILWACQNGNIRC
QICI
>P03451 ~~~HA~~~Hemagglutinin~~~
MAIIYLILLFTAVRGDQICIGYHANNSTEKVDTNLERNVTVTHAKDILEKTHNGKLCKLNGIPPLELGDCSIAGWLLGNP
ECDRLLSVPEWSYIMEKENPRDGLCYPGSFNDYEELKHLLSSVKHFEKVKILPKDRWTQHTTTGGSRACAVSGNPSFFRN
MVWLTKEGSDYPVAKGSYNNTSGEQMLIIWGVHHPIDETEQRTLYQNVGTYVSVGTSTLNKRSTPEIATRPKVNGQGGRM
EFSWTLLDMWDTINFESTGNLIAPEYGFKISKRGSSGIMKTEGTLENCETKCQTPLGAINTTLPFHNVHPLTIGECPKYV
KSEKLVLATGLRNVPQIESRGLFGAIAGFIEGGWQGMVDGWYGYHHSNDQGSGYAADKESTQKAFDGITNKVNSVIEKMN
TQFEAVGKEFGNLERRLENLNKRMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRMQLRDNVKELGNGCFEF
YHKCDDECMNSVKNGTYDYPKYEEESKLNRNEIKGVKLSSMGVYQILAIYATVAGSLSLAIMMAGISFWMCSNGSLQCRI
CI
>Q67333 ~~~HA~~~Hemagglutinin~~~
MAIIYLILLFTAVRGDQICIGYHANNSTEKVDTILERNVTVTHAKDILEKTHNGKLCKLNGIPPLELGDCSIAGWLLGNP
ECDRLLSVPEWSYIMEKENPRDGLCYPGSFNDYEELKHLLSSVKHFEKVKILPKDRWTQHTTTGGSRACAVSGNPSFFRN
MVWLTEKGSNYPVAKGSYNNTSGEQMLIIWGVHHPNDEKEQRTLYQNVGTYVSVGTSTLNKRSTPDIATRPKVNGLGSRM
EFSWTLLDMWDTINFESTGNLIAPEYGFKISKRGSSGIMKTEGTLENCETKCQTPLGAINTTLPFHNVHPLTIGECPKYV
KSEKLVLATGLRNVPQIESRGLFGAIAGFIEGGWQGMIDGWYGYHHSNDQGSGYAADKESTQKAFDGITNKVNSVIEKMN
TQFEAVGKEFSNLERRLENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRMQLRDNVKELGNGCFEF
YHKCDDECMNSVKNGTYDYPKYEEESKLNRNEIKGVKLSSMGVYQILAIYATVAGSLSLAIMMAGISFWMCSNGSLQCRI
CI
>P09345 ~~~HA~~~Hemagglutinin~~~
MERIVLLLAIVSLVKSDQICIGYHANKSTKQVDTIMEKNVTVTHAQDILERTHNGKLCSLNGVKPLILRDCSVAGWLLGN
PMCDEFLNVPEWSYIVEKDNPINSLCYPGDFNDYEELKHLLSSTNHFEKIQIIPRSSWSNHDASSGVSSACPYIGRSSFF
RNVVWLIKKDNAYPTIKRSYNNTNQEDLLILWGIHHPNDAAEQTKLYQNPTTYVSVGTSTLNQRSIPEIATRPKVNGQSG
RMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSGLAYGNCDTKCQTPVGAINSSMPFHNIHPHTIGECPK
YVKSDRLVLATGLRNVPQRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGITNKVNSIIDK
MNTQFKAVGKEFNNLERRVENLNKKMEDGFLDVWTYNVELLVLMENERTLDFHDSNVKNLYDKVRLQLKDNARELGNGCF
EFYHKCDNECMESVRNGTYDYPQYSEEARLNREEISGVKLESMGVYQILSIYSTVASSLALAIMIAGLSFWMCSNGSLQC
RICI
>Q82509 ~~~HA~~~Hemagglutinin~~~
MERIVLFLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCSLNGVKPLILRDCSVAGWLLGN
PMCDEFLNVPEWSYIVEKDNPINSLCYPGDFNDYEELKHLLSSTNHFEKIQIIPRSSWSNHDASSGVSSACPYNGRSSFF
RNVVWLIEKNNAYPTIKRSYNNTNQEDLLILWGIHHPNDAAEQTKLYQNPTTYVSVGTSTLNQRSIPEIATRPKVNGQSG
RVEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSDLEYGNCNAKCQTPVGAINSSMPFHNIHPLTIGECPK
YVKSDRLVLATGLRNVPQRETRRQKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGITNKVNS
IIDKMNTQFETVGKEFNNLERRIENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLKDNAKELG
NGCFEFYHKCDNECMESVRNGTYDYPQYSEEARLNREEISGVKLELMGVYQILSIYSTVASSLALAIMIAGLSFWMCSNG
SLQCRICI
>Q9WCD8 ~~~HA~~~Hemagglutinin~~~
MKAILLVLLCAFAATNADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDRHNGKLCKLGGIAPLHLGKCNIAGWLLG
NPECELLLTVSSWSYIVETSNSDNGTCYPGDFINYEELREQLSSVSSFERFEIFPKTSSWPNHETNRGVTAACPYAGANS
FYRNLIWLVKKESSYPKLSKSYVNNKGKEVLVLWGIHHPPTSTDQQSLYQNADAYVFVGSSKYNRKFKPEIAARPKVRGQ
AGRMNYYWTLIEPGDTITFEATGNLVVPRYAFAMNRGSGSGIIISDAPVHDCNTKCQTPKGAINTSLPFQNIHPVTIGEC
PKYVKSTKLRMATGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNGQGSGYAADQKSTQNAIDGITNKVNSVI
EKMNMQFTAVGKEFNNLEKRIENLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVKNLYEKVRSQLRNNAKEIGNG
CFEFYHKCDDTCMESVKNGTYDYPKYSEESKLNREEIDGVKLESTRVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>P03442 ~~~HA~~~Hemagglutinin~~~
MKTVIALSYILCLTFGQDLPGNDNSTATLCLGHHAVPNGTIVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGRAC
TLIDALLGDPHCDVFQNETWDLFVERSNAFSNCYPYDIPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSSACKRGPA
NGFFSRLNWLTKSESAYPVLNVTMPNNDNFDKLYIWGVHHPSTNQEQTNLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GQPGRISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINRKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLADSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIESIRNGTYDHDIYRDEALNNRFQIKGVELKSGYKDWILWISFAISCLLLCVVLLGFIMWACQRGNI
RCNICI
>P03437 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLALGQDLPGNDNSTATLCLGHHAVPNGTLVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGIDC
TLIDALLGDPHCDVFQNETWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSNACKRGPG
SGFFSRLNWLTKSGSTYPVLNVTMPNNDNFDKLYIWGIHHPSTNQEQTSLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GLSSRISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEEMGN
GCFKIYHKCDNACIESIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQRGNI
RCNICI
>Q91MA7 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLALGQDLPGNDNSTATLCLGHHAVPNGTLVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGIDC
TLIDALLGDPHCDVFQNETWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSNACKRGPG
SGFFSRLNWLTKSGSTYPVLNVTMPNNDNFDKLYIWGVHHPSTNQEQTSLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GLSSRISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIESIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQRGNI
RCNICI
>P03436 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLALGQDLPGNDNNTATLCLGHHAVPNGTLVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGIDC
TLIDALLGDPHCDVFQNETWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSNACKRGPD
SGFFSRLNWLTKSGSTYPVLNVTMPNNDNFDKLYIWGVHHPSTNQEQTSLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GQSSRISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIESIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQRGNI
RCNICI
>Q1PUD9 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLVLGQDFPGNDNSTATLCLGHHAVPNGTLVKTITNDQIEVTNATELVQSSSTGKICNNPHRILDGINC
TLIDALLGDPHCDGFQNETWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFINEGFTWTGVTQNGGSNACKRGPD
SGFFSRLNWLYKSGSAYPVLNVTMPNNDNFDKLYIWGVHHPSTDQEQTNLYVQASGRVTVSTKRSQQTIIPNIGSRPWVR
GLSSRISIYWTIVKPGDILVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIGTCISECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQKGNI
RCNICI
>P03435 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLVFAQDLPGNDNNSTATLCLGHHAVPNGTLVKTITNDQIEVTNATELVQSSSTGKICNNPHRILDGIN
CTLIDALLGDPHCDGFQNEKWDLFVERSKAFSNCYPYDVPDYASLRSLVASSGTLEFINEGFNWTGVTQNGGSSACKRGP
DSGFFSRLNWLYKSGSTYPVQNVTMPNNDNSDKLYIWGVHHPSTDKEQTNLYVQASGKVTVSTKRSQQTIIPNVGSRPWV
RGLSSRISIYWTIVKPGDILVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIGTCSSECITPNGSIPNDKPFQNVNKITYG
ACPKYVKQNTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNR
VIEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMG
NGCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQKGN
IRCNICI
>P03446 ~~~HA~~~Hemagglutinin~~~
MEKFIILSTVLAASFAYDKICIGYQTNNSTETVNTLSEQNVPVTQVEELVHRGIDPILCGTELGSPLVLDDCSLEGLILG
NPKCDLYLNGREWSYIVERPKEMEGVCYPGSIENQEELRSLFSSIKKYERVKMFDFTKWNVTYTGTSKACNNTSNQGSFY
RSMRWLTLKSGQFPVQTDEYKNTRDSDIVFTWAIHHPPTSDEQVKLYKNPDTLSSVTTVEINRSFKPNIGPRPLVRGQQG
RMDYYWAVLKPGQTVKIQTNGNLIAPEYGHLITGKSHGRILKNNLPMGQCVTECQLNEGVMNTSKPFQNTSKHYIGKCPK
YIPSGSLKLAIGLRNVPQVQDRGLFGAIAGFIEGGWPGLVAGWYGFQHQNAEGTGIAADRDSTQRAIDNMQNKLNNVIDK
MNKQFEVVNHEFSEVESRINMINSKIDDQITDIWAYNAELLVLLENQKTLDEHDANVRNLHDRVRRVLRENAIDTGDGCF
EILHKCDNNCMDTIRNGTYNHKEYEEESKIERQKVNGVKLEENSTYKILSIYSSVASSLVLLLMIIGGFIFGCQNGNVRC
TFCI
>P26562 ~~~HA~~~Hemagglutinin~~~
MEAKLFVLFCTFTVLKADTICVGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCSLNGIAPLQLGKCNVAGWLLG
NPECDLLLTANSWSYIIETSNSENGTCYPGEFIDYEELREQLSSISSFEKFEIFPKASSWPNHETTKGVTAACSYSGASS
FYRNLLWITKKGTSYPKLSKSYTNNKGKEVLVLWGVHHPPSVSEQQSLYQNADAYVSVGSSKYNRRFAPEIAARPEVRGQ
AGRMNYYWTLLDQGDTITFEATGNLIAPWYAFALNKGSDSGIITSDAPVHNCDTRCQTPHGALNSSLPFQNVHPITIGEC
PKYVKSTKLRMATGLRNVPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAIDGITSKVNSVI
EKMNTQFTAVGKEFNNLERRIENLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVRNLYEKVKSQLRNNAKEIGNG
CFEFYHKCDDECMESVKNGTYDYPKYSEESKLNREEIDGVKLESMGVYQILAIYSTVASSLVLLVSWGAISFWMCSNGSL
QCRICI
>P13103 ~~~HA~~~Hemagglutinin~~~
MALNVIATLTLISVCVHADRICVGYLSTNSSERVDTLLENGVPVTSSIDLIETNHTGTYCSLNGVSPVHLGDCSFEGWIV
GNPACTSNFGIREWSYLIEDPAAPHGLCYPGELNNNGELRHLFSGIRSFSRTELIPPTSWGEVLDGTTSACRDNTGTNSF
YRNLVWFIKKNNRYPVISKTYNNTTGRDVLVLWGIHHPVSVDETKTLYVNSDPYTLVSTKSWSEKYKLETGVRPGYNGQR
SWMKIYWSLIHPGEMITFESNGGFLAPRYGYIIEEYGKGRIFQSRIRMSRCNTKCQTSVGGINTNRTFQNIDKNALGDCP
KYIKSGQLKLATGLRNVPAISNRGLFGAIAGFIEGGWPGLINGWYGFQHQNEQGTGIAADKESTQKAIDQITTKINNIID
KMNGNYDSIRGEFNQVEKRINMLADRIDDAVTDIWSYNAKLLVLLENDKTLDMHDANVKNLHEQVRRELKDNAIDEGNGC
FELLHKCNDSCMETIRNGTYDHTEYAEESKLKRQEIDGIKLKSEDNVYKALSIYSCIASSVVLVGLILSFIMWACSSGNC
RFNVCI
>P18875 ~~~HA~~~Hemagglutinin~~~
MKAKLLVLLCALSATDADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCRLKGIAPLQLGKCSIAGWILG
NPECESLFSKKSWSYIAETPNSENGTCYPGYFADYEELREQLSSVSSFERFEIFPKERSWPKHNVTRGVTASCSHKGKSS
FYRNLLWLTEKNGSYPNLSKSYVNNKEKEVLVLWGVHHPSNIEDQKTIYRKENAYVSVVSSNYNRRFTPEIAKRPKVRGQ
EGRINYYWTLLEPGDTIIFEANGNLIAPWYAFALSRGFGSGIITSNASMDECDTKCQTPQGAINSSLPFQNVHPVTIGEC
PKYVRSTKLRMVTGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAINGITNKVNSVI
EKMNTQFTAVGKEFNKLEKRMENLNKKVDDGFLDIWTYNAELLVLLENERTLDFHDSNVKNLYEKVKSQLKNNAKEIGNG
CFEFYHKCNNECMESVKNGTYDYPKYSEESKLNREKIDGVKLESMGVYQILAIYSTVASSLCLLVSLGAISFWMCSNGSL
QCRICI
>Q9WCE3 ~~~HA~~~Hemagglutinin~~~
MEAKLLVLFCTFAALKADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLLENSHNGKLCSLNGIAPLQLGKCNVAGWLLG
NLECDLLLTANSWSYIIETSNSENGTCYPGEFIDYEELREQLSSVSSFEKFEIFPKASSWPNHETTKGVTAACSYLGASS
FYRNLLWMTKKGTSYPKLSKSYTNNKGKEVLVLWGVHHPPTTSEQQTLYQNVDAYVSVGSSKYNRRFTPEIAARPKVRGQ
AGKMNYYWTLLDQGDTITFEATGNLIAPWYAFALNKGSDSGIITSDAPVHNCDTKCQTPYGALNSSLPFQNVHPITIGEC
PKYVKSTKLRMATGLRNVPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQSAIDGITNKVNSVI
EKMNTQFTAVGKEFNNLERRIENLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVRNLYEKVKSQLRNNAKEIGNG
CFEFYHKCDDECMESVKNGTYDYPKYSEESKLNREEIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAVSFWMCSNGSL
QCRICI
>P12583 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYIFCLAFSQDLPGNDNSTATLCLGHHAVPNGTIVKTITDDQIEVTNATELVQSSSTGKICNNPHRILDGRDC
TLIDALLGDPHCDVFQDETWDLFVERSNAFSNCYPYDVPDYASLRSLVASSGTLEFITEGFTWTGVTQNGGSNACKRGPA
SGFFSRLNWLTKSGSTYPVLNVTMPNNDNFDKLYIWGVHHPSTNQEQTNLYVQASGRVTVSTRRSQQTIIPNIGSRPWVR
GQSGGISIYWTIVKPGDVLVINSNGNLIAPRGYFKMRTGKSSIMRSDAPIDTCVSECITPNGSIPNDKPFQNVNKITYGA
CPKYVKQNTLKLATGMRNVPEKQARGLFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRV
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENAEDMGN
GCFKIYHKCDNACIESIRNGTYDHDIYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQRGNI
RCNICI
>Q82559 ~~~HA~~~Hemagglutinin~~~
MKTTIILILLTHWVYSQNPTSGNNTATLCLGHHAVANGTLVKTITDDQIEVTNATELVQSTSIGKICNNPYRVLDGRNCT
LIDAMLGDPHCDVFQYENWDLFIERSSAFSNCYPYDIPDYASLRSIVASSGTLEFTAEGFTWTGVTQNGGSGACRRGSAD
SFFSRLNWLTKSGNSYPTLNVTMPNNNNFDKLYIWGIHHPSTNNEQTKLYIQESGRVTVSTKRSQQTIIPNIGSRPWVRG
QSGRISIYWTIVKPGDILMINSNGNLVAPRGYFKMRTGKSSVMRSDAPIDTCVSECITPNGSIPNDKPFQNVNKVTYGKC
PKYIKQNTLKLATGMRNVPEKQIRGIFGAIAGFIENGWEGMVDGWYGFRYQNSEGTGQAADLKSTQAAIDQINGKLNRVI
ERTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDAEMNKLFEKTRRQLRENAEDMGGG
CFKIYHKCDNACIGSIRNGTYDHYIYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLICVVLLGFIMWACQKGNIR
CNICI
>Q9WCE1 ~~~HA~~~Hemagglutinin~~~
MEAKLFVLFCTFTVLKADTICVGYHANNSTDTVDTVLEKNVTVTHSVNLLEDSHNGKLCSLNGIAPLQLGKCNVAGWLLG
NPECDLLLTANSWSYIIETSNSENGTCYPGEFIDYEELREQLSSVSSFEKFEIFPKASSWPNHETTKGVTAACSYSGASS
FYRNLLWITKKGTSYPKLSKSYTNNKGKEVLVLWGVHHPPTTSEQQSLYQNADAYVSVGSSKYNRRFTPEIAARPKVRGQ
AGRMNYYWTLLDQGDTITFEATGNLIAPWYAFALNKGSDSGIITSDAPVHNCDTRCQTPHGALNSSLPFQNVHPITIGEC
PKYVKSTKLRMATGLRNVPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAIDGITNKVNSVI
EKMNTQFTAVGKEFNNLERRIEKLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDSNVRNLYEKVKSQLRNNAKELGNG
CFEFYHKCDDECMESVKNGTYDYPKYSEESKLNREEIDGVKLESMGVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>Q9WCE8 ~~~HA~~~Hemagglutinin~~~
MEAKLFVLFCAFTTLEADTICVGYHANNSTDTVDTILEKNVTVTHSVNLLENSHNGKLCSLNGVAPLQLGKCNVAGWILG
NPECDLLLTANSWSYIIETSDSENGTCYPGEFIDYEELREQLSSVSSFERFEIFPKANSWPNHETTKGITAACSYSGTLS
FYRNLLWIVKRGNSYPKLSKSYTNNKGKEVLIIWGVHHPPTTSDQQSLYQNADAYVSVGSSKYNRRFTPEIAARPKVKGQ
AGRMNYYWTLLDQGDTITFEATGNLIAPWYAFALNKGSGSGIITSDTPVHNCDTKCQTPHGALNSSLPFQNVHPITIGEC
PKYVKSTKLRMATGLRNVPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQIAIDGISNKVNSVI
EKMNTQFTAVGKEFNDLEKRIENLNKKVDDGFLDVWTYNAELLVLLENERTLDFHDFNVRNLYEKVKSQLRNNAKEIGNG
CFEFYHKCDDECMESVKNGTYNYPKYSEESKLNREEIDGVKLESMEVYQILAIYSTVASSLVLLVSLGAISFWMCSNGSL
QCRICI
>P16999 ~~~HA~~~Hemagglutinin~~~
MKTTIILILLTHWVYSQNPTSGNNTATLCLGHHAVANGTLVKTITDDQIEVTNATELVQSISIGKICNNPYRVLDGRNCT
LIDAMLGYPHCDVFQYENWDLFIERSSTFSNCYPYDIPDYASLRSIVASSGTLEFTAEGFTWTGVTQNGRSGACKRGSAD
SFFSRLNWLTKSGNSYPTLNVTMPNNNNFDKLYIWGIHHPSSNNEQTKLYIQESGRVTVSTKRSQQTIIPNIGSRPGIRG
QSGRISIYWTIVKPGDILMVNSNGNLVAPRGYFKMRTGKSSVMRSDAPIDTCVSECITPNGSIPNDKPFQNVNKVTYGKC
PKYIKQNTLKLATGMRNVPEKQIRGIFGAIAGFIENGWEGMVDGWYGFRYQNSEGTGQAGDLKSTQAAIDQINGKLNRVI
ERTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDAEMNKLFEKTRRQLRENAEDMGGG
CFKIYHKCDNACIGSIRNGTYDHYIYRDEALNNRFQIRGVELKSGYKDWILWISFAISCFLICVVLLGFIMWACQKGNIR
CNICI
>P12589 ~~~HA~~~Hemagglutinin~~~
QKLPGNDNSTATPCLGHHAVPNGTLVETITNDQIEVTNATELVQSSSTGRICDSPHRILDGKNCTLIDALLGDPHCDGFQ
NEKWDLFIERSKAFSNCYPYDVPDYASLRSLVASSGTLEFINEGFNWTGVTQSGGSYACKRGSVNSFFSRLNWLYESEYK
YPALNVTMPNNGKFDKLYIWGVHHPSTEKEQTNLYVRASGRVTVSTKRSQQTVIPNIGSRPWVRGLSSRISIYWTIVKPG
DILLINSTGNLIAPRGYFKIRTGKSSIMRSDAPIGTCSSECITPNGSIPNDKPFQNVNKITYGACPRYVKQNTLKLATGM
RNVPEKQTRGIFGAIAGFIENGWEGMVDGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRLIEKTNEKFHQIEKEFS
EVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRKQLRENAEDMGNGCFKIYHKCDNACIGS
IRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQKGNIRCNICI
>P87506 ~~~HA~~~Hemagglutinin~~~
MERIVIALAIINIVKGDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKEHNGKLCSLKGVRPLILKDCSVAGWLLGN
PMCDEFLNVPEWSYIVEKDNPVNGLCYPGDFNDYEELKHLMSSTNHFEKIQIIPRNSWSTHDASSGVSSACPYNGRSSFF
RNVVWLIKKNNAYPTIKRTYNNTNVEDLLILWGIHHPNDAAEQTKLYQNSNTYVSVGTSTLNQRSIPEIATRPKVNGQSG
RMEFFWTILRPNDAISFESNGNFIAPEYAYKIVKKGDSAIMKSELEYGNCDTKCQTPVGAINSSMPFHNVHPLTIGECPK
YVKSDKLVLATGLRNVPQRETRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGITNKVNSIIDK
MNTQFEAVGKEFNNLERRIENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELGNGCF
EFYHKCDNECMESVRNGTYDYPQYSEESRLNREEIDGVKLESMGTYQILSIYSTVASSLALAIMVAGLSFWMCSNGSLQC
RICI
>P26140 ~~~HA~~~Hemagglutinin~~~
MKAILLVLLYTFTAANADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDRHNGKLCKLRGVAPLHLGKCNIAGWLLG
NPECELLFTASSWSYIVETSNSDNGTCYPGDFINYEELREQLSSVSSFERFEIFPKASSWPNHETNRGVTAACPYAGANS
FYRNLIWLVKKGNSYPKLSKSYVNNKEKEVLVLWGIHHPPTSTDQQSLYQNADAYVFVGSSKYNKKFKPEIATRPKVRGQ
AGRMNYYWTLVEPGDTITFEATGNLVVPRYAFAMKRGSGSGIIISDTPVHDCNTTCQTPKGAINTSLPFQNIHPVTIGEC
PKYVKSTKLRMATGLRNIPSIQSRGLFGAIAGFIEGGWTGMIDGWYGYHHQNEQGSGYAADRKSTQNAIDGITNKVNSVI
EKMNTQFTAVGKEFNHLEKRIENLNKKVDDGFLDVWTYNAELLVLLENERTLDYHDSNVKNLYEKVRSQLKNNAKEIGNG
CFEFYHKCDDTCMESVKNGTYDYPNYSEESKLNREEIDGVKLESTRIYQILAIYSTVASSLVLSVSLGAISFWMCSNGSL
QCRICI
>O11283 ~~~HA~~~Hemagglutinin~~~
MKTIIALSYILCLVFAQKLPGNDNSTATLCLGHHAVPNGTLVKTITNDQIEVTNATELVQSSSTGRICDSPHRILDGKNC
TLIDALLGDPHCDGFQNKEWDLFVERSKAYSNCYPYDVPDYASLRSLVASSGTLEFINEDFNWTGVAQSGESYACKRGSV
KSFFSRLNWLHESEYKYPALNVTMPNNGKFDKLYIWGVHHPSTDREQTNLYVRASGRVTVSTKRSQQTVIPNIGSRPWVR
GLSSRISIYWTIVKPGDILLINSTGNLIAPRGYFKIRTGKSSIMRSDAPIGTCSSECITPNGSIPNDKPFQNVNRITYGA
CPRYVKQNTLKLATGMRNVPEKQTRGIFGAIAGFIENGWEGMVNGWYGFRHQNSEGTGQAADLKSTQAAIDQINGKLNRL
IEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRKQLRENAEDMGN
GCFKIYHKCDNACIGSIRNGTYDHDVYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWACQKGNI
RCNICI
>Q03909 ~~~HA~~~Hemagglutinin~~~
MLSLIMRTVIALSYIFCLAFGQGLPWNDNNTATLCLGHHAVPNGTIVKTITDDQIEVTNATELVQSSSTGKICNNPHRIL
DGGNCTLIDALLGDPHCNVFQYETWDLFVERTNAFSNCYPYDVPDYASLRSIVASSGTLEFFAESFTWTGVTQNGGSSAC
KRGTASSFFSRLNWLTKSGNAYPLLNVTMPNNDNFDKLYIWGVHHPSTNQEQTELYVQASGRVTVSTRKSQQTVIPNIGS
RPWVRGQSGRVSIYWTIVKPGDVLVINSNGNLIAPRGYFKVRTGKSSIMRSDAPIDTCISECITPNGSIPNDKPFQNVNK
ITYGACPKYVKQNTLKLATGMRNVPEKQIRGIFGAIAGFIENGWEGMIDGWYGFRHQNSEGTGQAADLKSTQAALDQING
KLNRVIEKTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDSEMNKLFEKTRRQLRENA
EDMGNGCFKIYHNCDNACIESIRNGTYDHNIYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLLCVVLLGFIMWAC
QKGNIRCNICF
>Q08011 ~~~HA~~~Hemagglutinin~~~
MKTTIILILLTHWVYSQNPTSGNNTATLCLGHHAVANGTLVKTITDDQIEVTNATELVQSISIGKICNNSYRVLDGRNCT
LIDAMLGDPHCDVFQYENWDLFIERSSAFSNCYPYDIPDYASLRSIVASSGTLEFTAEGFTWTGVTQNGRSGACKRGSAD
SFFSRLNWLTKSGNSYPILNVTMPNNKNFDKLYIWGIHHPSSNKEQTKLYIQESGRVTVSTERSQQTVIPNIGSRPWVRG
QSGRISIYWTIVKPGDILTINSNGNLVAPRGYFKLRTGKSSVMRSDAPIDTCVSECITPNGSIPNDKPFQNVNKVTYGKC
PKYIRQNTLKLATGMRNVPEKQIRGIFGAIAGFIENGWEGMVDGWYGFRYQNSEGTGQAADLKSTQAAIDQINGKLNRVI
ERTNEKFHQIEKEFSEVEGRIQDLEKYVEDTKIDLWSYNAELLVALENQHTIDLTDAEMNKLFEKTRRQLRENAEDMGGG
CFKIYHKCDNACIGSIRNGTYDHYIYRDEALNNRFQIKGVELKSGYKDWILWISFAISCFLICVVLLGFIMWACQKGNIR
CNICI
>Q9Q0U6 ~~~HA~~~Hemagglutinin~~~
MEKIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILEKTHNGKLCDLNGVKPLILRDCSVAGWLLGN
PMCDEFINVPEWSYIVEKASPANDLCYPGDFNDYEELKHLLSRTNHFEKIQIIPKSSWSNHDASSGVSSACPYHGRSSFF
RNVVWLIKKNSAYPTIKRSYNNTNQEDLLVLWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSG
RMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSAIMKSELEYGNCNTKCQTPMGAINSSMPFHNIHPLTIGECPK
YVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNS
IIDKMNTQFEAVGREFNNLERRIENLNKQMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELG
NGCFEFYHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMVAGLSLWMCSNG
SLQCRICI
>O56140 ~~~HA~~~Hemagglutinin~~~
MEKTVLLLATVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILERTHNGKLCDLNGVKPLILRDCSVAGWLLGN
PMCDEFINVPEWSYIVEKASPANDLCYPGNFNDYEELKHLLSRINHFEKIQIIPKSSWSNHDASSGVSSACPYLGRSSFF
RNVVWLIKKNSAYPTIKRSYNNTNQEDLLVLWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSG
RMEFFWTILKPNDAINFESNGNFIAPEYAYKIVKKGDSTIMKSELEYGNCNTKCQTPMGAINSSMPFHNIHPLTIGECPK
YVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNS
IINKMNTQFEAVGREFNNLERRIENLNKKMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELG
NGCFEFYHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMVAGLSLWMCSNG
SLQCRICI
>P03460 ~~~HA~~~Hemagglutinin~~~
MKAIIVLLMVVTSNADRICTGITSSNSPHVVKTATQGEVNVTGVIPLTTTPTKSHFANLKGTQTRGKLCPNCFNCTDLDV
ALGRPKCMGNTPSAKVSILHEVKPATSGCFPIMHDRTKIRQLPNLLRGYENIRLSTSNVINTETAPGGPYKVGTSGSCPN
VANGNGFFNTMAWVIPKDNNKTAINPVTVEVPYICSEGEDQITVWGFHSDDKTQMERLYGDSNPQKFTSSANGVTTHYVS
QIGGFPNQTEDEGLKQSGRIVVDYMVQKPGKTGTIVYQRGILLPQKVWCASGRSKVIKGSLPLIGEADCLHEKYGGLNKS
KPYYTGEHAKAIGNCPIWVKTPLKLANGTKYRPPAKLLKERGFFGAIAGFLEGGWEGMIAGWHGYTSHGAHGVAVAADLK
STQEAINKITKNLNYLSELEVKNLQRLSGAMNELHDEILELDEKVDDLRADTISSQIELAVLLSNEGIINSEDEHLLALE
RKLKKMLGPSAVEIGNGCFETKHKCNQTCLDRIAAGTFNAGDFSLPTFDSLNITAASLNDDGLDNHTILLYYSTAASSLA
VTLMIAIFIVYMVSRDNVSCSICL
>P07975 3.1.1.53~~~HE~~~Hemagglutinin-esterase-fusion glycoprotein~~~
MFFSLLLVLGLTEAEKIKICLQKQVNSSFSLHNGFGGNLYATEEKRMFELVKPKAGASVLNQSTWIGFGDSRTDKSNSAF
PRSADVSAKTADKFRFLSGGSLMLSMFGPPGKVDYLYQGCGKHKVFYEGVNWSPHAAINCYRKNWTDIKLNFQKNIYELA
SQSHCMSLVNALDKTIPLQVTAGTAGNCNNSFLKNPALYTQEVKPSENKCGKENLAFFTLPTQFGTYECKLHLVASCYFI
YDSKEVYNKRGCDNYFQVIYDSFGKVVGGLDNRVSPYTGNSGDTPTMQCDMLQLKPGRYSVRSSPRFLLMPERSYCFDMK
EKGPVTAVQSIWGKGRESDYAVDQACLSTPGCMLIQKQKPYIGEADDHHGDQEMRELLSGLDYEARCISQSGWVNETSPF
TEKYLLPPKFGRCPLAAKEESIPKIPDGLLIPTSGTDTTVTKPKSRIFGIDDLIIGVLFVAIVETGIGGYLLGSRKESGG
GVTKESAEKGFEKIGNDIQILKSSINIAIEKLNDRISHDEQAIRDLTLEIENARSEALLGELGIIRALLVGNISIGLQES
LWELASEITNRAGDLAVEVSPGCWIIDNNICDQSCQNFIFKFNETAPVPTIPPLDTKIDLQSDPFYWGSSLGLAITATIS
LAALVISGIAICRTK
>P35971 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPQRDRINAFYKDNPHPKGSRIVINREHLMIDRPYVLLAVLFVMFLSLIGLLAIAGIRLHRAAIYTAEIHKSLSTNLDV
TNSIEHQVKDVLTPLFKIIGDEVGLRTPQRFTDLVKFISDKIKFLNPDREYDFRDLTWCINPPERIKLDYDQYCADVAAE
ELMNALVNSTLLETRTTNQFLAVSKGNCSGPTTIRGQFSNMSLSLLDLYLGRGYNVSSIVTMTSQGMYGGTYLVEKPNLS
SKRSELSQLSMYRVFEVGVIRNPGLGAPVFHMTNYLEQPVSNDLSNCMVALGELKLAALCHREDSITIPYQGSGKGVSFQ
LVKLGVWKSPTDMQSWVTLSTDDPVIDRLYLSSHRGVIADNQAKWAVPTTRTDDKLRMETCFQQACKGKIQALCENPEWA
PLKDNRIPSYGVLSVDLSLTVELKIKIASGFGPLITHGSGMDLYKSNHNNVYWLTIPPMKNLALGVINTLEWIPRFKVSP
YLFNVPIKEAGEDCHAPTYLPAEVDGDVKLSSNLVILPGQDLQYVLATYDTSRVEHAVVYYVYSPSRSFSYFYPFRLPIK
GVPIELQVECFTWDQKLWCRHFCVLADSESGGHITHSGMVGMGVSCTVTREDGTNRR
>Q786F2 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPQRDRINAFYKDNPHPKGSRIVINREHLMIDRPYVLLAVLFVMFLSLIGLLAIAGIRLHRAAIYTAEIHKSLSTNLDV
TNSIEHQVKDVLTPLFKIIGDEVGLRTPQRFTDLVKFISDKIKFLNPDREYDFRDLTWCINPPERIKLDYDQYCADVAAE
ELMNALVNSTLLEARATNQFLAVSKGNCSGPTTIRGQFSNMSLSLLDLYLSRGYNVSSIVTMTSQGMYGGTYLVGKPNLS
SKGSELSQLSMHRVFEVGVIRNPGLGAPVFHMTNYFEQPVSNDFSNCMVALGELKFAALCHREDSITIPYQGSGKGVSFQ
LVKLGVWKSPTDMRSWVPLSTDDPVIDRLYLSSHRGVIADNQAKWAVPTTRTDDKLRMETCFQQACKGKNQALCENPEWA
PLKDNRIPSYGVLSVNLSLTVELKIKIASGFGPLITHGSGMDLYKTNHNNVYWLTIPPMKNLALGVINTLEWIPRFKVSP
NLFTVPIKEAGEDCHAPTYLPAEVDGDVKLSSNLVILPGQDLQYVLATYDTSRVEHAVVYYVYSPSRSFSYFYPFRLPIK
GVPIELQVECFTWDKKLWCRHFCVLADSESGGHITHSGMVGMGVSCTVTREDGTNRR
>P08362 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPQRDRINAFYKDNPHPKGSRIVINREHLMIDRPYVLLAVLFVMFLSLIGLLAIAGIRLHRAAIYTAEIHKSLSTNLDV
TNSIEHQVKDVLTPLFKIIGDEVGLRTPQRFTDLVKFISDKIKFLNPDREYDFRDLTWCINPPERIKLDYDQYCADVAAE
ELMNALVNSTLLETRTTNQFLAVSKGNCSGPTTIRGQFSNMSLSLLDLYLGRGYNVSSIVTMTSQGMYGGTYLVEKPNLS
SKRSELSQLSMYRVFEVGVIRNPGLGAPVFHMTNYLEQPVSNDLSNCMVALGELKLAALCHGEDSITIPYQGSGKGVSFQ
LVKLGVWKSPTDMQSWVPLSTDDPVIDRLYLSSHRGVIADNQAKWAVPTTRTDDKLRMETCFQQACKGKIQALCENPEWA
PLKDNRIPSYGVLSVDLSLTVELKIKIASGFGPLITHGSGMDLYKSNHNNVYWLTIPPMKNLALGVINTLEWIPRFKVSP
YLFNVPIKEAGEDCHAPTYLPAEVDGDVKLSSNLVILPGQDLQYVLATYDTSRVEHAVVYYVYSPSRSFSYFYPFRLPIK
GVPIELQVECFTWDQKLWCRHFCVLADSESGGHITHSGMEGMGVSCTVTREDGTNRR
>P06830 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPQRDRINAFYKDNPHPKGSRIVINREHLMIDRPYVLLAVLFVMFLSLIGLLAIAGIRLHRAAIYTAEIHKSLSTNLDV
TNSIEHQVKDVLTPLFKIIGDEVGLRTPQRFTDLVKFISDKIKFLNPDREYDFRDLTWCINPPERIKLDYDQYCADVAAE
ELMNALVNSTLLETRTTNQFLAVSKGNCSGPTTIRGQFSNMSLSLLDLYLGRGYNVSSIVTMTSQGMYGGTYPVEKPNLS
SKRSELSQLSMYRVFEVSVIRNPGLGAPVFHMTNYLEQPVSNDLSNCMVALGELKLAALCHGEDSITIPYQGSGKGVSFQ
LVKLGVWKSPTGMQSWVPLSTDDPVIDRLYLSSHRGVIADNQAKWAVPTTRTDDKLRMETCFQQACKGKIQALCENPECV
PLKDNRIPSYGVLSVDLSLTVELKIKIASGFGPLITHGSGMDLYKSNHNNVYWLTIPPMKNLALGVINTLEWIPRFKVSP
YLFTVPIKEAGEDCHAPTYLPAEVDGDVKLSSNLVILPGQDLQYVLATYDTSRVEHAVVYYVYSPGRSFSYFYPFRLPIK
GVPIELQVECFTWDQKLWCRHFCVLADSESGGHITHSGMVGMGVSCTVTREDGTNRR
>P28081 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPQRDRTNAFYKDNPHPKGSRIVINREHLMIDRPYVLLAILFVMFLSLIGLLAIAGIRLHQAAIHTAEIHKSLSTNLDV
TNSIEHQVKDVLTPLFKIIGDEVGLRTPQRFTDLVKFISDKIKFLNPDREYDFRDLNWCINPPERIKLDYDQYCADVAAE
ELMNALVNSTLLETRTTNQFLAVSKGNCSGPTTIRGQFSNMSTSLLDLYLSRGYNVSSIVTMTSQGMYGGTYLVEKPNLS
SKRSELSQLSMYRVFEVGVIRNPGLGAPVFHMTNYFEQPVSNDLSNCMVALGEFKLAALCHREDSITIPYQGSGKGVSFQ
LVNLGVWKSPTDMQSWIPLSTDDPVIDRLYLSSHRGVIADNQAKWAVPTTRTDDKLRMETCFQQACKGKIQALCENPEWA
PLKDNRIPSYGVLSVDLSPTVELKIKIASGFGPLITHGSGMDLYKSNHNNVYWLTIPPMKNLALGVINTLEWIPRFKVSP
NLFTVPIKEAGKDCHAPTYLPAEVDGDVKLSSNLVILPGQDLQYVLATYDTSRVEHAVVYYVYSPGRSFSYFYPFRLPIR
GVPIELQVECFTWDQKLWCRHFCVLANSESGGHITHSGMVGMGVSCTVTREDGTNRRQSC
>Q8BEJ6 ~~~~~~Protein OPG185~~~
MTQLPILLLLISLVYATPSPQTSKKIGDDATISCSRNNTNYYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTAGDAGTYICAFFMTSTTNDTDKVDYEEYSIELIVNTDSESTIDIILSGSTPETISEKPEDIDNSNC
SSVFEITTPEPITDNVDDHTDTVTYTSDSINTVNASSGESTTDEIPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKSTT
DDADLYDTYNDNDTVPPTTVGGSTTSISNYKTKDFVEIFGITTLIILSAVAIFCITYYICNKHPRKYKTENKV
>Q70KP1 3.1.1.53~~~HE~~~Hemagglutinin-esterase~~~
MLRMRVRPPSAIPVFLIFVLLPFVLTSKPITPHYGPGHITSDWCGFGDSRSDCGNQHTPKSLDIPQELCPKFSSRTGSSM
FISMHWNNDSDFNAFGYSNCGVEKVFYEGVNFSPYRNYTCYQEGSFGWVSNKVGFYSKLYSMASTSRCIKLINLDPPTNF
TNYRNGTCTGNGGTAKMPDNPQLVIFNSVVKVSTQFVLPSSSDGFSCTKHLVPFCYIDGGCFEMSGVCYPFGYYYQSPSF
YHAFYTNGTAGLHRYICDYLEMKPGVYNATTFGKFLIYPTKSYCMDTMNYTVPVQAVQSIWSENRQSDDAIGQACKSPYC
IFYNKTKPYLAPNGADENHGDEEVRQMMQGLLVNSSCVSPQGSTPLALYSSEMIYTPNYGSCPQYYKLFETSSDENVDVT
SSAYFVATWVLLVLVIILIFILISFCLSSY
>P41355 ~~~H~~~Hemagglutinin glycoprotein~~~
MSPPRDRVDAYYKDNFQFKNTRVVLNKEQLLIERPCMLLTVLFVMFLSLVGLLAIAGIRLHRAAVNTAKINNDLTTSIDI
TKSIEYQVKDVLTPLFKIIGDEVGLRTPQRFTDLTKFISDKIKFLNPDKEYDFRDINWCINPPERIKIDYDQYCAHTAAE
DLITMLVNSSLTGTTVPRTSLVNLGRNCTGPTTTKGQFSNISLTLSGIYSGRGYNISSMITITGKGMYGSTYLVGKYNQR
ARRPSKVWHQDYRVFEVGIIRELGVGTPGFHMTNYLELPRQPELETCMLALGESKLAALCLADSPVALHYGRVGDDNKIR
FVKLGVWASPADRDTLATLSAIDPTLDGLYITTHRGIIAAGTAIWAVPVTRTDDQVKMGKCRLEACRDRPPPFCNSTDWE
PLEAGRIPAYGVLTIKLGLADEPKVDIISEFGPLITHDSGMDLYTSFDGTKYWLTTPPLQNSALGTVNTLVLEPSLKISP
NILTLPIRSGGGDCYIPTYLSDRADDDVKLSSNLVILPSRDLQYVSATYDISRVEHAIVYHIYSTGRLSSYYYPFKLPIK
GDPVSLQIECFPWDRKLWCHHFCSVVDSGTGEQVTHIGVVGIKITCNGK
>Q89182 ~~~~~~Protein OPG185~~~
MTRLPILLLLISLVYATPFPQTSKKIGDDATLSCNRNNTNDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSPTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSEKPDYIDNS
NCSSVFEIATPEPITDNVEDHTDTVTYTSDSINTVSASSGESTTDETPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKS
TTDDADLYDTYNDNDTVPSTTVGGSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYIYNKRSRKYKTENKV
>P20978 ~~~~~~Protein OPG185~~~
MTRLPILLLLISLVYATPFPQTSKKIGDDATLSCNRNNTNDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSPTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSEKPDYIDNS
NCSSVFEIATPEPITDNVEDHTDTVTYTSDSINTVSATSGESTTDETPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKS
TTDDADLYDTYNDNDTVPSTTVGSSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYICNKRSRKYKTENKV
>P08714 ~~~~~~Protein OPG185~~~
MTRLPILLLLISLVYATPFPQTSKKIGDDATLSCNRNNTNDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSTTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSEKPEDIDNF
NCSSVFEIATPEPITDNVEDHTDTVTYTSDSINTVSASSGESTTDETPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKS
TTDDADLYDTYNDNDTVPPTTVGGSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYIYNKRSRKYKTENKV
>P16561 ~~~~~~Protein OPG185~~~
MARLPILLLLISLVYSTPSPQTSKKIGDDATLSCNRNNTNDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSPTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSEKPEDIDNL
NCSSVFEIATPEPITDNVEDHTDTVTYTSDSINTVSASSGESTTDETPEPITDKEEDHTVTDTVSYTTVSTSSGIVTTKS
TTDDADLYDTYNDNDTVPSTTVGCSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYIYNKRSRKYKTENKV
>Q01218 ~~~~~~Protein OPG185~~~
MTRLPILLLLISLVYATPFPQTSKKIGDDATLSCNRNNTNDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYDSP
YDDLVTTITIKSLTARDAGTYVCAFFMTSTTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSTHSPETSSKKPDYIDNS
NCSSVFEIATPEPITDNVEDHTDTVTYTSDSINTVSASSGESTTDETPEPITDKEDHTVTDTVSYTTVSTSSGIVTTKST
TDDADLYDTYNDNDTVPPTTVGGSTTSISNYKTKDFVEIFGITALIILSAVAIFCITYYIYNKRSRKYKTENKV
>P33807 ~~~~~~Protein OPG185~~~
MTRLSILLLLISLVYSTPYPQTQISKKIGDDATLSCSRNNINDYVVMSAWYKEPNSIILLAAKSDVLYFDNYTKDKISYD
SPYDDLVTTITIKSLTAKDAGTYVCAFFMTSTTNDTDKVDYEEYSTELIVNTDSESTIDIILSGSSHSPETSSEKPDYIN
NFNCSLVFEIATPGPITDNVENHTDTVTYTSDIINTVSTSSRESTTVKTSGPITNKEDHTVTDTVSYTTVSTSSEIVTTK
STANDAHNDNEPSTVSPTTVKNITKSIGKYSTKDYVKVFGIAALIILSAVAIFCITYYICNKRSRKYKTENKV
>P16827 ~~~~~~DNA helicase/primase complex-associated protein~~~
MTAQPPLHHRHHPYTLFGTSCHLSWYGLLEASVPIVQCLFLDLGGGRAEPRLHTFVVRGDRLPPAEVRAVHRASYAALAS
AVTTDADERRRGLEQRSAVLARVLLEGSALIRVLARTFTPVQIQTDASGVEILEAAPALGVETAALSNALSLFHVAKLVV
IGSYPEVHEPRVVTHTAERVSEEYGTHAHKKLRRGYYAYDLAMSFRVGTHKYVLERDDEAVLARLFEVREVCFLRTCLRL
VTPVGFVAVAVTDEQCCLLLQSAWTHLYDVLFRGFAGQPPLRDYLGPDLFETGAARSFFFPGFPPVPVYAVHGLHTLMRE
TALDAAAEVLSWCGLPDIVGSAGKLEVEPCALSLGVPEDEWQVFGTEAGGGAVRLNATAFRERPAGGDRRWLLPPLPRDD
GDGENNVVEVSSSTGGAHPPSDDATFTVHVRDATLHRVLIVDLVERVLAKCVRARDFNPYVRYSHRLHTYAVCEKFIENL
RFRSRRAFWQIQSLLGYISEHVTSACASAGLLWVLSRGHREFYVYDGYSGHGPVSAEVCVRTVVDCYWRKLFGGDDPGPT
CRVQESAPGVLLVWGDERLVGPFNFFYGNGGAGGSPLHGVVGGFAAGHCGGACCAGCVVTHRHSSGGGGSGVGDADHASG
GGLDAAAGSGHNGGSDRVSPSTPPAALGGCCCAAGGDWLSAVGHVLGRLPALLRERVSVSELEAVYREILFRFVARRNDV
DFWLLRFQPGENEVRPHAGVIDCAPFHGVWAEQGQIIVQSRDTALAADIGYGVYVDKAFAMLTACVEVWARELLSSSTAS
TTACSSSSVLSSALPSVTSSSSGTATVSPPSCSSSSATWLEERDEWVRSLAVDAQHAAKRVASEGLRFFRLNA
>P10192 ~~~UL8~~~DNA helicase/primase complex-associated protein~~~
MDTADIVWVEESVSAITLYAVWLPPRAREYFHALVYFVCRNAAGEGRARFAEVSVTATELRDFYGSADVSVQAVVAAARA
ATTPAASPLEPLENPTLWRALYACVLAALERQTGPVALFAPLRIGSDPRTGLVVKVERASWGPPAAPRAALLVAEANIDI
DPMALAARVAEHPDARLAWARLAAIRDTPQCASAASLTVNITTGTALFAREYQTLAFPPIKKEGAFGDLVEVCEVGLRPR
GHPQRVTARVLLPRDYDYFVSAGEKFSAPALVALFRQWHTTVHAAPGALAPVFAFLGPEFEVRGGPVPYFAVLGFPGWPT
FTVPATAESARDLVRGAAAAYAALLGAWPAVGARVVLPPRAWPGVASAAAGCLLPAVREAVARWHPATKIIQLLDPPAAV
GPVWTARFCFPGLRAQLLAALADLGGSGLADPHGRTGLARLDALVVAAPSEPWAGAVLERLVPDTCNACPALRQLLGGVM
AAVCLQIEETASSVKFAVCGGDGGAFWGVFNVDPQDADAASGVIEDARRAIETAVGAVLRANAVRLRHPLCLALEGVYTH
AVAWSQAGVWFWNSRDNTDHLGGFPLRGPAYTTAAGVVRDTLRRVLGLTTACVPEEDALTARGLMEDACDRLILDAFNKR
LDAEYWSVRVSPFEASDPLPPTAFRGGALLDAEHYWRRVVRVCPGGGESVGVPVDLYPRPLVLPPVDCAHHLREILREIE
LVFTGVLAGVWGEGGKFVYPFDDKMSFLFA
>A0A385DVL5 ~~~~~~Head fiber dimeric protein~~~
MKRVLNLGNLSRIVEGDPNEITDDEILVIKDKIIEGKIIDIQKRVDGKLVSLITEKYTYTINPTPADAIVVINGSTTKSI
RAAKGHTVTWSVSKTGFVTQSGSDVISGDVSKDVTLVANPAS
>A0A385DTC5 ~~~~~~Head fiber trimeric protein~~~
MKTLRTLKISPNAPDINSVWLYKGTMKYFNNGEWETIGGESEPYVLPAATTSTIGGVKKATNVGNLATGAELATVVTKVN
AILSALKVADIMVEDAN
>Q98VP9 3.1.21.10~~~hjc~~~Holliday junction resolvase~~~
MNIRQSGKYYEYKTLEILEKNGFKALRIPVSGTGKQALPDLIATKNNTIYPIEVKSTSKDVVTVRNFQIEKLFKFCEIFN
FCECHPLVTVYYKKYKIVIVYELSQDVRTKEKIKFKYGINS
>P13342 ~~~~~~DNA helicase assembly protein~~~
MIKLRMPAGGERYIDGKSVYKLYLMIKQHMNGKYDVIKYNWCMRVSDAAYQKRRDKYFFQKLSEKYKLKELALIFISNLV
ANQDAWIGDISDADALVFYREYIGRLKQIKFKFEEDIRNIYYFSKKVEVSAFKEIFEYNPKVQSSYIFKLLQSNIISFET
FILLDSFLNIIDKHDEQTDNLVWNNYSIKLKAYRKILNIDSQKAKNVFIETVKSCKY
>E1XT70 ~~~~~~5-hmdU DNA kinase 1~~~
MSYSLKGLLKRPVHLFVKPPAVEGEYPARGELYYVKGSNGSGKSTVPSYLAENDPQAYVVTYNGKIMLTVCPSYNIICIG
KYDKSKSKGVDSLKDTEQMLFALSIADQPEYLKYDVLFEGIIPSTLLSSWIPRLTRPPRELVVLFMDTPLETCVSRVKSR
NGGADFNESLVVEKWERVHDHSQRHKGLFPTVPAGMMKSNGLTIEQAVFAFLNRDFGSID
>E1XTK3 ~~~~~~5-hmdU DNA kinase 2~~~
MAKIIVIKGTSGTGKGTRVVQFIEWLRTKLKPTELSYTVGDKTRPFGLKFEELKLIFVGQYTVSNKSGLASWTSMDAIHA
ATGSGDIARDLVKGWLAQGYTLVCEGEPLMLSDKWRPEWMFKNYPIDSLALLYFAYPDRYQYDARIRGRSGKEAGDSGWS
RNESYSKEFEKSKAEMLALGWNVAVDDYSGQDVLYHQTATNTQEFKTGNDSELAMLPFDAPLWVVGNAIYHQLGTVCRAN
NLMSKDFYGYCETNPMTREVGGQDPLAHRVPEKPQKASKTKNKAVAKEEPKTSSVSLLGLMRKA
>A0A0S0N9S9 ~~~~~~5-hmdU DNA kinase~~~
MAKQRVINIRGTSGSGKSTLIRRLVELYPEKEPVHVPDRKQPLFYKLRGDGLLPLSLLGHYETACGGCDTIPSMDRIYEL
VRERLAEGDSVLYEGLLISAEVNRAVALHTDGFDLTVVRLNTPLELCVDSVNQRRWAKNPDKPGVNPKNTEAKFKQTLAT
CKKLDAAGIPVVEADRDGAFHAIKQLLEA
>P0DTK5 ~~~~~~5-hmdU DNA kinase~~~
MKYINVRGCNGSGKTTLLRCLARDPLCRVINVIVPDHKPIPVTYAPDGIAIIGDYTPAAAGATTAGLDRIKTQAAAKAVA
ELVGRDPDVKAVLFEGVVVSTIYGPWQEWSKANGGMIWAFLDTPLEVCLKRIQERNGGKPIKEDQVADKHRTIARVRDKA
LADGETVRDIHWETALKDIKAVIENLG
>F8WQ30 ~~~~~~5-hmdU DNA kinase~~~
MKTKEEFHRIEISKFTEDLREAFGYKIQYRQDLVLVNIRGCNGAGKSTVPMQMLQTDPGAFMLTLDGKDKATVFPSYGFV
AMGRYFSKTGGLDGFKNNEETLKVLKLLWELPFSIIMEGVISSTIFSTYCDLFKELEQRNNPKRAVGVLNLLPPFEVCLE
RIKKRTPEKFDSIKKDQIEGKWRTVNRNAQKFRDAGVTSWDEDNSVIDINDTVSWFFSSIKNNLQPEFTGLRLGVRFPTE
EPVKALKKAEKGVKRGKKGRKTSPVAKTLDDNGDGAFLRSLKREIRPHWDPKYLNTPDDNVRLRRDPETGQTLWDMYFIN
LVERQNIWYRRVIQGKSKPWTDDPVMSTYHFTNVDRRLDRVTLHYIDKVLCNLEDSYESKKFLLLNTFIYRLFVRPETTD
AMGYIFPETFEEDWERAKAALRARRESGEPVFTDAYFVNDLKSANPDRANSSNKTENAIHLIQFIIDHLDELAEFTFNPK
NSMEEVIEKFTMIPAVGNFNAYEVALDLGIVKEMTGIDFVDWTPDHYANVGPGCKKGIEYVFEDLGNMSHLDIVFFITSV
YKGELERLGLEYKYQEGCKELDLRALEGWCCEMSKYFNYYATERGYDWAKGKRPKKKMNLRTDDVSYLNPRISNLVK
>C9DG08 ~~~~~~5-hmdU DNA kinase~~~
MSNTLHEGFKFGDPGLINIRGTNGSGKSTIVKRYIPKGAVQKRFDDIGTTYYDCGTHFVVGRYETDCGGLDAVRGTYDPK
TGQGIRPFEAGQIAIGRLAPLKTTFAEGVIYGTTFKGSKEVHDELAKTNTPYFWFSIDMPFQEVFDSVLLRRVKSGNADP
LSTENIAKKFRPVLASLDKAVDAGLWTIYGHRDVIAQNVDDLVNNRPLTNADLIGRKPDLTKFNKEAQVWFDKGTVSPTQ
EMIDEHFPKPTTHGLGGFFKG
>P34081 3.1.-.-~~~~~~DNA endonuclease I-HmuI~~~
MEWKDIKGYEGHYQVSNTGEVYSIKSGKTLKHQIPKDGYHRIGLFKGGKGKTFQVHRLVAIHFCEGYEEGLVVDHKDGNK
DNNLSTNLRWVTQKINVENQMSRGTLNVSKAQQIAKIKNQKPIIVISPDGIEKEYPSTKCACEELGLTRGKVTDVLKGHR
IHHKGYTFRYKLNG
>P11235 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MEPSKLFTMSDNATFAPGPVINAADKKTFRTCFRILVLSVQAVTLILVIVTLGELVRMINDQGLSNQLSSIADKIRESAT
MIASAVGVMNQVIHGVTVSLPLQIEGNQNQLLSTLATICTGKKQVSNCSTNIPLVNDLRFINGINKFIIEDYATHDFSIG
HPLNMPSFIPTATSPNGCTRIPSFSLGKTHWCYTHNVINANCKDHTSSNQYISMGILVQTASGYPMFKTLKIQYLSDGLN
RKSCSIATVPDGCAMYCYVSTQLETDDYAGSSPPTQKLTLLFYNDTVTERTISPTGLEGNWATLVPGVGSGIYFENKLIF
PAYGGVLPNSSLGVKSAREFFRPVNPYNPCSGPQQDLDQRALRSYFPSYFSNRRVQSAFLVCAWNQILVTNCELVVPSNN
QTLMGAEGRVLLINNRLLYYQRSTSWWPYELLYEISFTFTNSGQSSVNMSWIPIYSFTRPGSGNCSGENVCPTACVSGVY
LDPWPLTPYSHQSGINRNFYFTGALLNSSTTRVNPTLYVSALNNLKVLAPYGNQGLFASYTTTTCFQDTGDASVYCVYIM
ELASNIVGEFQILPVLTRLTIT
>P10866 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MEPSKLFTISDNATFAPGPVNNAADKKTFRTCFRILVLSVQAVTLILVIVTLGELVRMINDQGLSNQLSSITDKIRESAT
MIASAVGVMNQVIHGVTVSLPLQIEGNQNQLLSTLATICTSKKQISNCSTNIPLVNDLRFINGINKFIIEDYANHDFSIG
HPLNMPSFIPTATSPNGCTRIPSFSLGKTHWCYTHNVINANCKDHTSSNQYVSMGILVQTASGYPMFKTLKIQYLSDGLN
RKSCSIATVPDGCAMYCYVSTQLETDDYAGSSPPTQKLTLLFYNDTVTERTISPSGLEGNWATLVPGVGSGIYFENKLIF
PAYGGVLPNSTLGVKLAREFFRPVNPYNPCSGPQQDLDQRALRSYFPSYLSNRRVQSAFLVCAWNQILVTNCELVVPSNN
QTLMGAEGRVLLINNRLLYYQRSTSWWPYELLYEISFTFTNSGQSSVNMSWIPIYSFTRPGSGKCSGENVCPIACVSGVY
LDPWPLTPYSHQSGINRNFYFTGALLNSSTTRVNPTLYVSALNNLKVLAPYGTQGLSASYTTTTCFQDTGDASVYCVYIM
ELASNIVGEFQILPVLTRLTIT
>P12554 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MNRAVCQVALENDEREAKNTWRLVFRIAILLLTVMTLAISAAALAYSMEASTPGDLVSIPTAISRAEGKITSALGSNQDV
VDRIYKQVALESPLALLNTESIIMNAITSLSYQINGAANNSGCGAPVHDPDYIGGIGKELIVDDTSDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHCYTHNVIFSGCRDHSHSHQYLALGVLRTSATGRVFFSTLRSINLDDTQNRKSCSVS
ATPLGCDMLCSKVTETEEEDYNSVIPTSMVHGRLGFDGQYHEKDLDVTTLFGDWVANYPGVGGGSFIDNRVWFPVYGGLK
PSSPSDTGQEGRYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRVQQAILSIKVSTSLGEDPVLTIPPNTVTLMGAE
GRVLTVGTSHFLYQRGSSYFSPALLYPMTVNNNTATLHSPYTFNAFTRPGSVPCQASARCPNSCVTGVYTDPYPLVFHRN
HTLRGVFGTMLDDEQARLNLVSAVFDNISRSRITRVSSSRTKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVPL
LVEILKDDGV
>P32884 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDRAVSQVALENDEREAKNTWRLIFRIAILLLTVVTLATSVASLVYSMGASTPSDLVGIPTRISRAEEKITSALGSNQDV
VDRIYKQVALESPLALLNTETTIMNAITSLSYQINGAANNSGWGAPIHDPDFIGGIGKELIVDDASDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHYCYTHNVILSGCRDHSHSHQYLALGVLRTTATGRIFFSTLRSISLDDTQNRKSCSV
SATPLGCDMLCSKVTETEEEDYNSAVPTLMAHGRLGFDGQYHEKDLDVTTLFEDWVANYPGVGGGSFIDGRVWFSVYGGL
KPNSPSDTVQEGKYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRIQQAILSIKVSTSLGEDPVLTVPPNTVTLMGA
EGRILTVGTSHFLYQRGSSYFSPALLYPMTVSNKTATLHSPYTFNAFTRPGSIPCQASARCPNSCVTGVYTDPYPLIFYR
NHTLRGVFGTMLDSEQARLNPTSAVFDSTSRSRITRVSSSSTKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVP
LLVEILKNDGVREARSG
>Q91UL0 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDRAVSQVALENDEREAKNTWRLIFRIAILFLTVVTLAISVASLLYSMGASTPSDLVGIPTRISRAEEKITSTLGSNQDV
VDRIYKQVALESPLALLNTETTIMNAITSLSYQINGAANNSGWGAPIHDPDYIGGIGKELIVDDASDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHYCYTHNVILSGCRDHSHSYQYLALGVLRTSATGRVFFSTLRSINLDDTQNRKSCSV
SATPLGCDMLCSKATETEEEDYNSAVPTRMVHGRLGFDGQYHEKDLDVTTLFGDWVANYPGVGGGSFIDSRVWFSVYGGL
KPNSPSDTVQEGKYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRIQQAILSIKVSTSLGEDPVLTVPPNTVTLMGA
EGRILTVGTSHFLYQRGSSYFSPALLYPMTVSNKTATLHSPYTFNAFTRPGSIPCQASARCPNSCVTGVYTDPYPLIFYR
NHTLRGVFGTMLDGEQARLNPASAVFDSTSRSRITRVSSSSIKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVP
LLVEILKDDGVREARSG
>P12559 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDRAVSQVALENDEREAKNTWRLIFRIAILFLTVVTLAISVASLLYSMGASTPSDLVGIPTRISRAEEKITSTLGSNQDV
VDRIYKQVALESPLALLNTETTIMNAITSLSYQINGAANNSGWGAPIHDPDYIGGIGKELIVDDASDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHYCYTHNVILSGCRDHLHSHQYLALGVLRTSATGRVFFSTLRSINLDDTQNRKSCSV
SATPLGCDMLCSKATETEEEDYNSAVPTRMVHGRLGFDGQYHEKDLDVTTLFGDWVANYPGVGGGSFIDSRVWFSVYGGL
KPNTPSDTVQEGKYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRIQQAILSIKVSTSLGEDPVLTVPPNTVTLMGA
EGRILTVGTSHFLYQRGSSYFSPALLYPMTVSNKTATLHSPYTFNAFTRPGSIPCQASARCPNSCVTGVYTDPYPLIFYR
NHTLRGVFGTMLDGEQARLNPASAVFDSTSRSRITRVSSSSIKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVP
LLVEILKDDGVREARSG
>Q9Q2W5 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDRAVSQVALENDEREAKNTWRLIFRIAILLLTVVTLATSVASLVYSMGASTPSDLVGIPTRISRAEEKITSALGSNQDV
VDRIYKQVALESPLALLNTETTIMNAITSLSYQINGAANNSGWGAPIHDPDFIGGIGKELIVDNASDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHYCYTHNVILSGCRDHSHSHQYLALGVLRTTATGRIFFSTLRSISLDDTQNRKSCSV
SATPLGCDMLCSKVTETEEEDYNSAVPTLMAHGRLGFDGQYHEKDLDVTTLFEDWVANYPGVGGGSFIDGRVWFSVYGGL
KPNSPSDTVQEGKYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRIQQAILSIKVSTSLGEDPVLTVPPNTVTLMGA
EGRILTVGTSHFLYQRGSSYFSPALLYPMTVSNKTATLHSPYTFNAFTRPGSIPCQASARCPNSCVTGVYTDPYPLIFYR
NHTLRGVFGTMLDSEQARLNPASAVFDSTSRSRITRVSSSSTKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVP
LLVEILKNDGVREARSG
>P12558 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDRAVSQVALENDEREAKNTWRLVFRIAILLLTVVTLAISAAALAYSMEASTPSDLIGIPTAISRAEEKITSALGSNQDV
VDRIYKQVALESPLALLNTESTIMNAITSLSYQINGAANSSGCGAPIHDPDYIGGIGKELIVDDASDVTSFYPSAFQEHL
NFIPAPTTGSGCTRIPSFDMSATHYCYTHNVILSGCRDHSHSHQYLALGVLRTSATGRVFFSTLHSINLDDTQNRKSCSV
SATPLGCDMLCSKVTETEEEDYNSAVPTSMVHGRLGFDGQYHEKDLDVTTLFEDWVANYPGVGGGSFIDNRVWFPVYGGL
KPNSPSDTAQEGKYVIYKRYNDTCPDEQDYQIRMAKSSYKPGRFGGKRVQQAILSIKVSTSLGEDPVLTVPPNTVTLMGA
EGRVLTVGTSHFLYQRGSSYFSPALLYPMTVSNKTATLHSPYTFDAFTRPGSVPCQASARCPNSCVTGVYTDPYPLVFYR
NHTLRGVFGTMLDDKQARLNPVSAVFDSISRSRITRVSSSSTKAAYTTSTCFKVVKTNKTYCLSIAEISNTLFGEFRIVP
LLVEILKDDGVREARAGRLSQLREGWKDDIVSPIFCDAKNQTEYRRELESYAASWP
>P25465 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MEDYSNLSLKSIPKRTCRIIFRTATILGICTLIVLCSSILHEIIHLDVSSGLMDSDDSQQGIIQPIIESLKSLIALANQI
LYNVAIIIPLKIDSIETVIYSALKDMHTGSMSNTNCTPGNLLLHDAAYINGLNKFLVLKSYNGTPKYGPLLNIPSFIPSA
TSPNGCTRIPSFSLIKTHWCYTHNVILGDCLDFTTSNQYLAMGIIQQSAAAFPIFRTMKTIYLSDGINRKSCSVTAIPGG
CVLYCYVATRSEKEDYATTDLAELRLAFYYYNDTFIERVISLPNTTGQWATINPAVGSGIYHLGFILFPVYGGLIKGTPS
YNKQSSRYFIPKHPNITCAGKSSEQAAAARSSYVIRYHSNRLLQSAVLICPLSDMHTARCNLVMFNNSQVMMGAEGRLYV
IDNNLYYYQRSSSWWSASLFYRINTDFSKGIPPIIEAQWVPSYQVPRPGVMPCNATSFCPANCITGVYADVWPLNDPEPT
SQNALNPNYRFAGAFLRNESNRTNPTFYTASASALLNTTGFNNTNHKAAYTSSTCFKNTGTQKIYCLIIIEMGSSLLGEF
QIIPFLRELIP
>P25466 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MEDYSNLSLKSIPKRTCRIIFRTATILGICTLIVLCSSILHEIIHLDVSSGLMDSDDSQQGIIQPIIESLKSLIALANQI
LYNVAIIIPLKIDSIETVIFSALKDMHTGSMSNTNCTPGNLLLHDAAYINGINKFLVLKSYNGTPKYGPLLNIPSFIPSA
TSPNGCTRIPSFSLIKTHWCYTHNVMLGDCLDFTTSNQYLAMGIIQQSAAAFPIFRTMKTIYLSDGINRKSCSVTAIPGG
CVLYCYVATRSEKEDYATTDLAELRLAFYYYNDTFIERVISLPNTTGQWATINPAVGSGIYHLGFILFPVYGGLISGTPS
YNKQSSRYFIPKHPNITCAGNSSEQAAAARSSYVIRYHSNRLIQSAVLICPLSDMHTARCNLVMFNNSQVMMGAEGRLYV
IDNNLYYYQRSSSWWSASLFYRINTDFSKGIPPIIEAQWVPSYQVPRPGVMPCNATSFCPANCITGVYADVWPLNDPEPT
SQNALNPNYRFAGAFLRNESNRTNPTFYTASASALLNTTGFNNTNHKAAYTSSTCFKNTGTQKIYCLIIIEMGSSLLGEF
QIIPFLRELIP
>P08492 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MEYWKHTNHGKDAGNELETSMATHGNKITNKITYILWTIILVLLSIVFIIVLINSIKSEKAHESLLQDVNNEFMEVTEKI
QMASDNINDLIQSGVNTRLLTIQSHVQNYIPISLTQQMSDLRKFISEITIRNDNQEVPPQRITHDVGIKPLNPDDFWRCT
SGLPSLMKTPKIRLMPGPGLLAMPTTVDGCVRTPSLVINDLIYAYTSNLITRGCQDIGKSYQVLQIGIITVNSDLVPDLN
PRISHTFNINDNRKSCSLALLNTDVYQLCSTPKVDERSDYASSGIEDIVLDIVNHDGSISTTRFKNNNISFDQPYAALYP
SVGPGIYYKGKIIFLGYGGLEHPINENAICNTTGCPGKTQRDCNQASHSPWFSDRRMVNSIIVVDKGLNSIPKLKVWTIS
MRQNYWGSEGRLLLLGNKIYIYTRSTSWHSKLQLGIIDITDYSDIRIKWTWHNVLSRPGNNECPWGHSCPDGCITGVYTD
AYPLNPTGSIVSSVILDSQKSRVNPVITYSTSTERVNELAIRNKTLSAGYTTTSCITHYNKGYCFHIVEINHKSLDTFQP
MLFKTEIPKSCS
>P04850 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MVAEDAPVRATCRVLFRTTTLIFLCTLLALSISILYESLITQKQIMSQAGSTGSNSGLGSITDLLNNILSVANQIIYNSA
VALPLQLDTLESTLLTAIKSLQTSDKLEQNCSWSAALINDNRYINGINQFYFSIAEGRNLTLGPLLNMPSFIPTATTPEG
CTRIPSFSLTKTHWCYTHNVILNGCQDHVSSNQFVSMGIIEPTSAGFPFFRTLKTLYLSDGVNRKSCSISTVPGGCMMYC
FVSTQPERDDYFSAAPPEQRIIIMYYNDTIVERIINPPGVLDVWATLNPGTGSGVYYLGWVLFPIYGGVIKGTSLWNNQA
NKYFIPQMVAALCSQNQATQVQNAKSSYYSSWFGNRMIQSGILACPLRQDLTNECLVLPFSNDQVLMGAEGRLYMYGDSV
YYYQRSNSWWPMTMLYKVTITFTNGQPSAISAQNVPTQQVPRPGTGDCSATNRCPGFCLTGVYADAWLLTNPSSTSTFGS
EATFTGSYLNTATQRINPTMYIANNTQIISSQQFGSSGQEAAYGHTTCFRDTGSVMVYCIYIIELSSSLLGQFQIVPFIR
QVTLS
>P19758 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDGDRGKRDSYWSTSPSGSTTKLASGWERSSKVDTWLLILSFTQWALSIATVIICIIISARQGYSMKEYSMTVEALNMSS
REVKESLTSLIRQEVIARAVNIQSSVQTGIPVLLNKNSRDVIQMIDKSCSRQELTQLCESTIAVHHAEGIAPLEPHSFWR
CPVGEPYLSSDPKISLLPGPSLLSGSTTISGCVRLPSLSIGEAIYAYSSNLITQGCADIGKSYQVLQLGYISLNSDMFPD
LNPVVSHTYDINDNRKSCSVVATGTRGYQLCSMPTVDERTDYSSDGIEDLVLDVLDLKGSTKSHRYRNSEVDLDHPFSAL
YPSVGNGIATEGSLIFLGYGGLTTPLQGDTKCRTQGCQQVSQDTCNEALKITWLGGKQVVSVIIQVNDYLSERPKIRVTT
IPITQNYLGAEGRLLKLGDRVYIYTRSSGWHSQLQIGVLDVSHPLTINWTPHEALSRPGNEECNWYNTCPKECISGVYTD
AYPLSPDAANVATVTLYANTSRVNPTIMYSNTTNIINMLRIKDVQLEAAYTTTSCITHFGKGYCFHIIEINQKSLNTLQP
MLFKTSIPKLCKAES
>P04853 3.2.1.18~~~HN~~~Hemagglutinin-neuraminidase~~~
MDGDRGKRDSYWSTSPSGSTTKPASGWERSSKADTWLLILSFTQWALSIATVIICIIISARQGYSMKEYSMTVEALNMSS
REVKESLTSLIRQEVIARAVNIQSSVQTGIPVLLNKNSRDVIQMIDKSCSRQELTQHCESTIAVHHADGIAPLEPHSFWR
CPVGEPYLSSDPEISLLPGPSLLSGSTTISGCVRLPSLSIGEAIYAYSSNLITQGCADIGKSYQVLQLGYISLNSDMFPD
LNPVVSHTYDINDNRKSCSVVATGTRGYQLCSMPTVDERTDYSSDGIEDLVLDVLDLKGRTKSHRYRNSEVDLDHPFSAL
YPSVGNGIATEGSLIFLGYGGLTTPLQGDTKCRTQGCQQVSQDTCNEALKITWLGGKQVVSVIIQVNDYLSERPKIRVTT
IPITQNYLGAEGRLLKLGDRVYIYTRSSGWHSQLQIGVLDVSHPLTINWTPHEALSRPGNKECNWYNKCPKECISGVYTD
AYPLSPDAANVATVTLYANTSRVNPTIMYSNTTNIINMLRIKDVQLEAAYTTTSCITHFGKGYCFHIIEINQKSLNTLQP
MLFKTSIPKLCKAES
>P18056 ~~~hoc~~~Highly immunogenic outer capsid protein~~~
MTFTVDITPKTPTGVIDETKQFTATPSGQTGGGTITYAWSVDNVPQDGAEATFSYVLKGPAGQKTIKVVATNTLSEGGPE
TAEATTTITVKNKTQTTTLAVTPASPAAGVIGTPVQFTAALASQPDGASATYQWYVDDSQVGGETNSTFSYTPTTSGVKR
IKCVAQVTATDYDALSVTSNEVSLTVNKKTMNPQVTLTPPSINVQQDASATFTANVTGAPEEAQITYSWKKDSSPVEGST
NVYTVDTSSVGSQTIEVTATVTAADYNPVTVTKTGNVTVTAKVAPEPEGELPYVHPLPHRSSAYIWCGWWVMDEIQKMTE
EGKDWKTDDPDSKYYLHRYTLQKMMKDYPEVDVQESRNGYIIHKTALETGIIYTYP
>Q7Y2C1 ~~~~~~Holin~~~
MMLDTATEAGKGTLAVTGVGIAVYSPYEIASLCAAVLTALYVGAQLITLLPKMLDSIAELRRRFKK
>O64204 ~~~~~~Holin~~~
MSPKIRETLYYVGTLVPGILGIALIWGGIDAGAAANIGDIVAGALNLVGAAAPATAAVKVNQQRKDGTLTTSPVDQVTRG
VEQVLAAKQNAEAEVERVKQALESAVNGAVPQLGPLASQILNGIQPAYSQPFDPHTQPWNR
>P51773 ~~~Y~~~Holin~~~
MTAEEKSVLSLFMIGVLIVVGKVLAGGEPITPRLFIGRMLLGGFVSMVAGVVLVQFPDLSLPAVCGIGSMLGIAGYQVIE
IAIQRRFKGRGKQ
>P27360 ~~~S~~~Antiholin~~~
MKSMDKISTGIAYGTSAGSAGYWFLQWLDQVSPSQWAAIGVLGSLVLGFLTYLTNLYFKIREDRRKAARGE
>P09962 ~~~~~~Antiholin~~~
MKKMPEKHDLLTAMMAAKEQGIGAILAFAMAYLRGRYNGGAFKKTLIDATMCAIIAWFIRDLLVFAGLSSNLAYIASVFI
GYIGTDSIGSLIKRFAAKKAGVDDANQQ
>P06808 ~~~t~~~Holin~~~
MAAPRISFSPSDILFGVLDRLFKDNATGKVLASRVAVVILLFIMAIVWYRGDSFFEYYKQSKYETYSEIIEKERTARFES
VALEQLQIVHISSEADFSAVYSFRPKNLNYFVDIIAYEGKLPSTISEKSLGGYPVDKTMDEYTVHLNGRHYYSNSKFAFL
PTKKPTPEINYMYSCPYFNLDNIYAGTITMYWYRNDHISNDRLESICAQAARILGRAK
>Q6R6U4 ~~~C1~~~Holin~~~
MVLVRGGYKLEKFLQLLTVLLQEAKDPASLLKRLLTILVAVIIFLFVSNTSEVMSFLKTFSTSAVLQDVQTQRIDNFPNV
AREKSMVLFSQTGADAVFVVKYKPDAINDYSNIIAWESNAQLDRADLADKAVNKTSELYRRHLEGFNYASDLTVKVNKYM
GKNIPSFKNVIFNYIYTCPYFNLNNIYAGYIGIAWRDKPVDIADSEQFKEYLTKLCSPQQRSLGRSI
>P03705 ~~~S~~~Antiholin~~~
MKMPEKHDLLAAILAAKEQGIGAILAFAMAYLRGRYNGGAFTKTVIDATMCAIIAWFIRDLLDFAGLSSNLAYITSVFIG
YIGTDSIGSLIKRFAAKKAGVEDGRNQ
>P15316 3.2.1.35~~~HYLP1~~~Hyaluronoglucosaminidase~~~
MTENIPLRVQFKRMSADEWARSDVILLEGEIGFETDTGFAKFGDGQNTFSKLKYLTGPKGPKGDTGLQGKTGGTGPRGPA
GKPGTTDYDQLQNKPDLGAFAQKEETNSKITKLESSKADKSAVYSKAESKIELDKKLSLTGGIVTGQLQFKPNKSGIKPS
SSVGGAINIDMSKSEGAAMVMYTNKDTTDGPLMILRSDKDTFDQSAQFVDYSGKTNAVNIVMRQPSAPNFSSALNITSAN
EGGSAMQIRGVEKALGTLKITHENPNVEAKYDENAAALSIDIVKKQKGGKGTAAQGIYINSTSGTAGKMLRIRNKNEDKF
YVGPDGGFHSGANSTVAGNLTVKDPTSGKHAATKDYVDEKIAELKKLILKK
>P19193 ~~~ORF2~~~Minor spike protein H~~~
MSFAENVGRFIGNSVNSVGSVIGDGLKGFNSTQSISSAKQANLLNNLPLPSLDNVLNIGMFGGLASGLLSYRAAKKQNKV
MQDIANRQMAFQERMSSTAVRRHVEDLKKAGLNPLLALGGSASTPQGAFYSPVNPMESGLNSAISVADKVFDYQRLAHAD
FQGRLNSAMSVVQLASAVQDYKRNYGKFGEVAYWFDRYAGKLLPAMLFYLFRKHPVGRAVSAANSGYAVAKGAKGVNFKF
SNMSSTAVQRHNSRYNVSKGWRR
>P03646 ~~~H~~~Minor spike protein H~~~
MFGAIAGGIASALAGGAMSKLFGGGQKAASGGIQGDVLATDNNTVGMGDAGIKSAIQGSNVPNPDEAAPSFVSGAMAKAG
KGLLEGTLQAGTSAVSDKLLDLVGLGGKSAADKGKDTRDYLAAAFPELNAWERAGADASSAGMVDAGFENQKELTKMQLD
NQKEIAEMQNETQKEIAGIQSATSRQNTKDQVYAQNEMLAYQQKESTARVASIMENTNLSKQQQVSEIMRQMLTQAQTAG
QYFTNDQIKEMTRKVSAEVDLVHQQTQNQRYGSSHIGATAKDISNVVTDAASGVVDIFHGIDKAVADTWNNFWKDGKADG
IGSNLSRK
>Q9J5C9 ~~~~~~Telomere-binding protein I1 homolog~~~
MEQFDQLVLNSISAKALKSYLTTKIAEAIDELAAKKNSPKKKAQTKKPENRIPLDLINKNFVSKFGLKGYKDGVLNSLIC
SLVENNYFENGKLKRGKHDELVLLDIEKEILARIDENSSLNIDVLDVKVLANRLRTNADRFEFKGHTYYLEQNKTEDIIN
QLIKNSAISMDMKNTIKDTFYMISDELLDVFKNRLFKCPQVKDNIISRARLYEYFIKATKPDDSKIYVILKDDNIAKILN
IETIVIDHFIYTKHSLLVSSISNQIDKYSKKFNDQFYSSISEYIKDNEKINLSKVIEYLTISTVKIENTVE
>O72899 ~~~~~~Protein I2 homolog~~~
MEKLFTGTYGVFLESNDSDFEDFINTIMTVLTGKKESKQLSWLTIFIIFVVCIVVFTFLYLKLMC
>P27945 ~~~~~~Transmembrane protein I329L~~~
MLRVFIFFVFLGSGLTGRIKPQVTCKYFISENNTWYKYNVTILNSSIILPAYNTIPSNAAGISCTCHDIDYLQKNNISIH
YNTSILKTFQDIRIIRCGMKNISEIAGGFGKELKFLDLRYNDLQVIDYNILRKLIRSNTPTYLYYNNLMCGKRNCPLYYF
LLKQEQTYLKRLPQFFLRRISFSNNNTYLYHFLSCGNKPGHEFLEYQTKYCRTKFPEINITVNQLIAKKNTERYKSCYPL
VFISILCSCISFLFLFICLLRSICKKYSCTKQDKSSHNYIPLIPSYTFSLKKHRHPETAVVEDHTTSANSPIVYIPTTEE
KKVSCSRRK
>A9JM73 ~~~~~~Transmembrane protein I329L~~~
MLRVFIFFVFLGSGLTGRIKPQVTCKYFISENNTWYKYNVTILNSSIILPAYNTIPSNAAGISCTCHDIDYLQKNNISIH
YNTSILKTFQDIRIIRCGMKNISEIAGGFGKELKFLDLRYNDLQVIDYNILRKLIRSNTPTYLYYNNLMCGKRNCPLYYF
LLKQEQTYLKRLPQFFLRRISFSNNNTYLYHFLSCGNKPGHEFLEYQTKYCRTKFPEINITVNQLIAKKNTERYKSCYPL
VFISILCSCISFLFLFICLLRSICKKYSCTKQDKSSHNYIPLIPSYTFSLKKHRHPETAVVEDHTTSANSPIVYIPTTEE
KKVSCSRRK
>P18521 ~~~~~~Protein I5 homolog~~~
MEIARETLITIGLTILVVLLIITGFSLVLRLIPGVYSSVSRSSFTAGRILRFMEIFSTIMFIPGIIILYAAYIRKIKMKN
N
>P27947 ~~~~~~Early protein I78R~~~
MPAALLGIVLYAGRIILLLRIVTLYLYHVLFSDIKYLQVTCGLILPVKPLPKKTKNMKTLSILYILLKIYKIFCLNFI
>P29817 3.4.22.-~~~~~~Core protease I7 homolog~~~
MNNKIRRFPNKNLKMPESGINFMSMLFFSKIDNMVYFINPIKYNTNANIAILEKIDDDDETRGKVTFIPIKYLEILYNEL
VLDPNHINNINFENNIKRKFFLFWTIKKYLQDKNININTFITSKKYKGIPLVYMRKSFLKSELSKTRDFSTFATIYDDLD
AQIGIPPLGFNPKPKAYPRKHDKSTWLSSGDIYNCIYPLTMINTDYDYFHLILFEKTDKNIATVASSMRCYKLEDRVKFF
LMNDKKRFFMFPIIYNDHFTCCVIDKHFDKDKKAAYFFNSSGYIPELIKQNKKYMFIESDMTIKSHKHYNSTPNTNYAYL
YIDVLSEYLNDIFKNVNYYFFNTFELQYDSPDCGMFNIIFLYYIVYFNIKSKFEFKKLYYSMSFIGDLLASSYRGALFIS
RYDINSIDEFKNTLEIFNIKNKKFMELIDMYKKNSNRIMNVCSKIKNDYDSYIDNEKNSLESNI
>O72903 3.4.22.-~~~~~~Core protease I7 homolog~~~
MDKYTELVINKIPELGFVNLLSHIYQTVGLCSSIDISKFKTNCNGYVVERFDKSETAGKVSCVPISILMELVERGMLSKP
DNSKSQLEVKTDLVNELISKNNGFEDIMTIPTSIPMKYFFKPVLKEKVSKAIDFSVMDIKGDDVSRMGIRYGENDKVVKI
KIAPERDAWMTNTSIHQFLIPMCYGTEVIYIGQFNFNFMNRHAIYEKSSVFNKNTEVFKLKDRIRDNRSSRFIMFGFCYL
HHWKCAIYDKNRDFICFYDSGGNNPNEFNHYRNFFFYSNSDGLNRNSYLSSLANENADIDILFNFFIDNYGVTAGCINVE
VNQLLESECGMFTCLFMAVCCLNPPKGFKGIRKIYTYFKFLADKKVTMLKSILFNVGKMEFTIKEVDGEGMQQYKKMEKW
CANTINILANKITSRVEDIIN
>P69180 ~~~~~~Inhibitor of apoptosis protein~~~
MFPKINTIDPYISLRLFEVKPKYVGYSSIDARNQSFAIHGIKNYEKFSNAGFFYTSPTEITCYCCGMKFCNWLYEKHPLQ
VHGFWSRNCGFMRATLGIIGLKKMIDSYNDYYNNEVFVKHKNRVYTHKRLEDMGFSKPFMRFILANAFIPPYRKYIHKII
LNERYFTFKFAAHLLSFHKVNLDNQTTYCMTCGIEPIKKDENFCNACKTLNYKHYKTLNFSVKL
>O11453 ~~~~~~Inhibitor of apoptosis protein~~~
MYPKINTIDTYISLRLFEVKPKYVGYSSVDARNKSFAIHDIKNYEKFSNAGLFYTSPTEITCYCCGMKFCNWIYEKHPLQ
VHGFWSRNCGFMRATLGIIGLKKMIDSYNDYFTNEVSVKHKNRVYTHKRLEDMGFSKSFMRFILANAFMPPYRKYIHKII
LNERYFTFKFVAYLLSFHKVNLDNQITYCMTCGIEQINKDENFCNACKPLNYKHYKILNFSVKL
>P16666 ~~~ORF VI~~~Transactivator/viroplasmin protein~~~
MEDIEKLLLQEKILMLELDLVRAKISLARAKGSMQQGGNSLHRETPVKEEAVHSALATFAPIQAKAIPEQTAPGKESTNP
LMVSILPKDMKSVQTEKKRLVTPMDFLRPNQGIQIPQKSEPNSSVAPNRAESGIQHPHSNYYVVYNGPHAGIYDDWGSAK
AATNGVPGVAHKKFATITEARAAADVYTTAQQAERLNFIPKGEAQLKPKSFVKALTSPPKQKAQWLTLGVKKPSSDPAPK
EVSFDQETTMDDFLYLYDLGRRFDGEGDDTVFTTDNESISLFNFRKNANPEMIREAYNAGLIRTIYPSNNLQEIKYLPKK
VKDAVKKFRTNCIKNTEKDIFLKIKSTIPVWQDQGLLHKPKHVIEIGVSKKIVPKESKAMESKDHSEDLIELATKTGEQF
IQSLLRLNDKKKIFVNLVEHDTLVYSKNTKETVSEDQRAIETFQQRVITPNLLGFHCPSICHFIKRTVEKEGGAYKCHHC
DKGKAIVQDASADSKVADKEGPPLTTNVEKEDVSTTSSKASG
>P29128 2.3.2.27~~~BICP0~~~E3 ubiquitin-protein ligase ICP0~~~
MAPPAAAPELGSCCICLDAITGAARALPCLHAFCLACIRRWLEGRPTCPLCKAPVQSLIHSVASDECFEEIPVGGGPGAD
GALEPDAAVIWGEDYDAGPIDLTAADGEASGAGGEAGAADGSEAGGGAGGAEEAGEARGAGAGRAAGAAGGRAGRGADAA
QEFIDRVARGPRLPLLPNTPEHGPGAPYLRRVVEWVEGALVGSFAVTARELAAMTDYVMAMLAECGFDDDGLADAMEPLI
GEDDAPAFVRSLLFVAARCVTVGPSHLIPQQSAPPGGRGVVFLDTSDSDSEGSEDDSWSESEESSSGLSTSDLTAIDDTE
TEPETDAEVESRRTRGASGAARARRPAERQYVSTRGRQTPAVQPAPRSLARRPCGRAAAVSAPPSSRSRGGRRDPRLPAA
PRAAPAAQARACSPEPREEGRGAGLGVAAGETAGWGAGSEEGRGERRARLLGEAGPPRVQARRRRRTELDRAPTPAPAPA
PAPAPISTVIDLTANAPARPADPAPAAAPGPASAGAQIGTPAAAAAVTAAAAAPSVARSSAPSPAVTAAATSTAAAISTR
APTPSPAGRAPAADPRRAGAPALAGAARAEVGRNGNPGRERRPASAMARGDLDPGPESSAQKRRRTEMEVAAWVRESLLG
TPRRSSAALAPQPGGRQGPSLAGLLGRCSGGSAWRQ
>P28990 2.3.2.27~~~~~~E3 ubiquitin-protein ligase ICP0~~~
MATVAERCPICLEDPSNYSMALPCLHAFCYVCITRWIRQNPTCPLCKVPVESVVHTIESDSEFKETKVSVDFDYDSEEDE
DSFEGQFLAVDSGDAPANISAWNGPMAFVPLNANGTAGAPRLQPLVDWLVERLDQLFETPELALVMRNIVMDTLCEHGCN
EEELTRQFWPMFHEDTVPFVTDLIVQAELCVASRPILPIARGRGVEYIDSSSSSSSSEEETDSDIEVDPNNLTDPEDTSD
ETSTDNSSAQAPRQEDSRPARARPGPPTRGRRRGRRPAAPGPASRRSARLRRRQPRTNSRTNGGDNGEIIDLTLDSDGDT
EPADVSGSLNTTDQPVLIPDEEEAAPASPHTSSNSAIICLVSELTPESEEPPRDQPVAPSGSSAGERPMRPRCSLREFAR
RFMALAPRDSSTSEAAGPSRLGAGPRATEPFSVAVVLVDRSSEGAGLFGGRFAQHVRRRTEDESARRRGNVLLRPRRQSV
PPVPYPDIASTSPLIRQGGQRVRDLQRAFQTQPAEPEEMRCPHNCQRYRRNQ
>P08393 2.3.2.27~~~ICP0~~~E3 ubiquitin-protein ligase ICP0~~~
MEPRPGASTRRPEGRPQREPAPDVWVFPCDRDLPDSSDSEAETEVGGRGDADHHDDDSASEADSTDTELFETGLLGPQGV
DGGAVSGGSPPREEDPGSCGGAPPREDGGSDEGDVCAVCTDEIAPHLRCDTFPCMHRFCIPCMKTWMQLRNTCPLCNAKL
VYLIVGVTPSGSFSTIPIVNDPQTRMEAEEAVRAGTAVDFIWTGNQRFAPRYLTLGGHTVRALSPTHPEPTTDEDDDDLD
DADYVPPAPRRTPRAPPRRGAAAPPVTGGASHAAPQPAAARTAPPSAPIGPHGSSNTNTTTNSSGGGGSRQSRAAAPRGA
SGPSGGVGVGVGVVEAEAGRPRGRTGPLVNRPAPLANNRDPIVISDSPPASPHRPPAAPMPGSAPRPGPPASAAASGPAR
PRAAVAPCVRAPPPGPGPRAPAPGAEPAARPADARRVPQSHSSLAQAANQEQSLCRARATVARGSGGPGVEGGHGPSRGA
APSGAAPLPSAASVEQEAAVRPRKRRGSGQENPSPQSTRPPLAPAGAKRAATHPPSDSGPGGRGQGGPGTPLTSSAASAS
SSSASSSSAPTPAGAASSAAGAASSSASASSGGAVGALGGRQEETSLGPRAASGPRGPRKCARKTRHAETSGAVPAGGLT
RYLPISGVSSVVALSPYVNKTITGDCLPILDMETGNIGAYVVLVDQTGNMATRLRAAVPGWSRRTLLPETAGNHVMPPEY
PTAPASEWNSLWMTPVGNMLFDQGTLVGALDFRSLRSRHPWSGEQGASTRDEGKQ
>P28284 2.3.2.27~~~RL2~~~E3 ubiquitin-protein ligase ICP0~~~
MEPRPGTSSRADPGPERPPRQTPGTQPAAPHAWGMLNDMQWLASSDSEEETEVGISDDDLHRDSTSEAGSTDTEMFEAGL
MDAATPPARPPAERQGSPTPADAQGSCGGGPVGEEEAEAGGGGDVCAVCTDEIAPPLRCQSFPCLHPFCIPCMKTWIPLR
NTCPLCNTPVAYLIVGVTASGSFSTIPIVNDPRTRVEAEAAVRAGTAVDFIWTGNPRTAPRSLSLGGHTVRALSPTPPWP
GTDDEDDDLADVDYVPPAPRRAPRRGGGGAGATRGTSQPAATRPAPPGAPRSSSSGGAPLRAGVGSGSGGGPAVAAVVPR
VASLPPAAGGGRAQARRVGEDAAAAEGRTPPARQPRAAQEPPIVISDSPPPSPRRPAGPGPLSFVSSSSAQVSSGPGGGG
LPQSSGRAARPRAAVAPRVRSPPRAAAAPVVSASADAAGPAPPAVPVDAHRAPRSRMTQAQTDTQAQSLGRAGATDARGS
GGPGAEGGPGVPRGTNTPGAAPHAAEGAAARPRKRRGSDSGPAASSSASSSAAPRSPLAPQGVGAKRAAPRRAPDSDSGD
RGHGPLAPASAGAAPPSASPSSQAAVAAASSSSASSSSASSSSASSSSASSSSASSSSASSSSASSSAGGAGGSVASASG
AGERRETSLGPRAAAPRGPRKCARKTRHAEGGPEPGARDPAPGLTRYLPIAGVSSVVALAPYVNKTVTGDCLPVLDMETG
HIGAYVVLVDQTGNVADLLRAAAPAWSRRTLLPEHARNCVRPPDYPTPPASEWNSLWMTPVGNMLFDQGTLVGALDFHGL
RSRHPWSREQGAPAPAGDAPAGHGE
>P29129 2.3.2.27~~~EP0~~~E3 ubiquitin-protein ligase ICP0~~~
MGCTVSRRRTTTAEASSAWGIFGFYRPRSPSPPPQRLSLPLTVMDCPICLDVAATEAQTLPCMHKFCLDCIQRWTLTSTA
CPLCNARVTSILHHVDSDASFVETPVEGATDVDGEEDEPVGGGFAVIWGEDYTEEVRHEEAEGQGSGSGSRARPRVPVFN
WLYGQVSTVIESDPIREAVVDNIVEIIQEHGMNRQRVTEAMLPMFGANTHALVDTLFDISAQWMRRMQRRAPMSHQGVNY
IDTSESEAHSDSEVSSPDEEDSGASSSGVHTEDLTEASESADDQRPAPRRSPRRARRAAVLRREQRRTRCLRRGRTGGQA
QGETPEAPSSGEGSSAQHGASGAGAGPGSANTAASARSSPSSSPSSSMRRPSPSASAPETAAPRGGPPASSSSGSPRSAT
IFIDLTQDDD
>P68337 ~~~IR4~~~Transcriptional regulator ICP22 homolog~~~
MPHGQPCGACDGSCRMAQRGTPSTSPLIPSLTPSPPAGDPSPRSSQRIDAVRVPARLPGGSDHPEYGMPLSPRALRPYLA
RGPGAFCAPPWRPDVNRLAGDVNRLFRGISTSSIHVTEDSRTLRRALLDFYAMGYTHTRPTLECWQSLLQLLPEQSFPLR
ATLRALNSEDRYEQRFLEPPSDPPNTLFGEECDVSGDESPSEEEEEDEASGESSVSEFSPEEETASSEYDSFSDVGEDDS
SCTGKWSSSESESDSESDAPTNNHHPTTRASAAKKRRKRQPPKGERPTKSARR
>P04485 ~~~~~~Transcriptional regulator ICP22~~~
MADISPGAFAPCVKARRPALRSPPLGTRKRKRPSRPLSSESEVESDTALESEVESETASDSTESGDQDEAPRIGGRRAPR
RLGGRFFLDMSAESTTGTETDASVSDDPDDTSDWSYDDIPPRPKRARVNLRLTSSPDRRDGVIFPKMGRVRSTRETQPRA
PTPSAPSPNAMLRRSVRQAQRRSSARWTPDLGYMRQCINQLFRVLRVARDPHGSANRLRHLIRDCYLMGYCRARLAPRTW
CRLLQVSGGTWGMHLRNTIREVEARFDATAEPVCKLPCLETRRYGPECDLSNLEIHLSATSDDEISDATDLEAAGSDHTL
ASQSDTEDAPSPVTLETPEPRGSLAVRLEDEFGEFDWTPQEGSQPWLSAVVADTSSVERPGPSDSGAGRAAEDRKCLDGC
RKMRFSTACPYPCSDTFLRP
>P09255 ~~~~~~Transcriptional regulator ICP22 homolog~~~
MFCTSPATRGDSSESKPGASVDVNGKMEYGSAPGPLNGRDTSRGPGAFCTPGWEIHPARLVEDINRVFLCIAQSSGRVTR
DSRRLRRICLDFYLMGRTRQRPTLACWEELLQLQPTQTQCLRATLMEVSHRPPRGEDGFIEAPNVPLHRSALECDVSDDG
GEDDSDDDGSTPSDVIEFRDSDAESSDGEDFIVEEESEESTDSCEPDGVPGDCYRDGDGCNTPSPKRPQRAIERYAGAET
AEYTAAKALTALGEGGVDWKRRRHEAPRRHDIPPPHGV
>Q04360 ~~~BMLF1~~~mRNA export factor ICP27 homolog~~~
MVPSQRLSRTSSISSNEDPAESHILELEAVSDTNTDCDLDPMEGSEEHSTDGEISSSEEEDEDPTPAHAIPARPSSVVIT
PTSASFVIPRKKWDLQDKTVTLHRSPLCRDEDEKEETGNSSYTRGHKRRRGEVHGCTDESYGKRRHLPPGARAPRAPRAP
RVPRAPRSPRAPRSNRATRGPRSESRGAGRSTRKQARQERSQRPLPNKPWFDMSLVKPVSKITFVTLPSPLASLTLEPIQ
DPFLQSMLAVAAHPEIGAWQKVQPRHELRRSYKTLREFFTKSTNKDTWLDARMQAIQNAGLCTLVAMLEETIFWLQEITY
HGDLPLAPAEDILLACAMSLSKVILTKLKELAPCFLPNTRDYNFVKQLFYITCATARQNKVVETLSSSYVKQPLCLLAAY
AAVAPAYINANCRRRHDEVEFLGHYIKNYNPGTLSSLLTEAVETHTRDCRSASCSRLVRAILSPGTGSLGLFFVPGLNQ
>Q3KSU1 ~~~BMLF1~~~mRNA export factor ICP27 homolog~~~
MVPSQRLSRTSSISSNEDPAESHILELEAVSDTNTDCDMDPMEGSEEHSTDGEISSSEEEDEDPTPAHAVPAQPSSVVIT
PTSASFVIPRKKWDLQDKTVTLHRSPLCRDEDEKEETGNSSYTRGHKRRRGEVHGCTDESYGKRRHLPPGARAPRAPRAP
RVPRAPRSPRAPRSNRATRGPRSESRGAGRSTRKQARQERSQRPLPNKPWFDMSLVKPVSKITFVTLPSPLASLTLEPIQ
DPFLQSMLAVAAHPEIGAWQKVQPRHELRRSYKTLREFFTKSTNKDTWLDARMQAIQNAGLCTLVAMLEETIFWLQEITY
HGDLPLAPAEDILLACAMSLSKVILTKLKELAPCFLPNTRDYNFVKQLFYITCATARQNKVVETLSSSYVKQPLCLLAAY
AAVAPAYINANCRRRHDEVEFLGHYIKNYNPGTLSSLLTEAVETHTRDCRSASCSRLVRAILSPATGSLGLFFVPGLNQ
>Q05906 ~~~UL3~~~mRNA export factor ICP27 homolog~~~
MALSSVSSCEPMEDEMSIMGSDTEDNFTGGDTCAEATRGLVNKSAFVPTQTVGTVSALRNVVGNPPKSVVVSFSASPQRA
QPSNPKSERPAFGHGRRNRRRPFRRNNWKQQQRGWEKPEPENVPARQSAGSWPKRSSLPVHMRLGQRGGDSSSADSGHGG
AGPSDRWRFKTRTQSVARVHRNRRRGNANHGSNTPGRSAGDRLNAAAARSIADVCRRVTSSRIGEMFHGARETLTTPVKN
GGFRAENSSPWAPVLGFGSDQFNPEARRITWDTLVEHGVNLYKLFEVRSHAAEAARSLRDAVMRGENLLEALASADETLS
WCKMIVTKNLPMRTRDPIISSSVALLDNLRLKLEPFMRCYLSSSGSPTLAELCDHQRLSDVACVPTFMFVMLARIARAVG
SGAETVSRDALGPDGRVLADYVPGACLAGTLEAIDAHKRRCKADTCSLVSAYTLVPVYLHGKYFYCNQIF
>P16749 ~~~~~~mRNA export factor ICP27 homolog~~~
MELHSRGRHDAPSLSSLSERERRARRARRFCLDYEPVPRKFRRERSPTSPSTRNGAAASEHHLAEDTVGAASHHHRPCVP
ARRPRYSKDDDTEGDPDHYPPPLPPSSRHALGGTGGHIIMGTAGFRGGHRASSSFKRRVAASASVPLNPHYGKSYDNDDG
EPHHHGGDSTHLRRRVPSCPTTFGSSHPSSANNHHGSSAGPQQQQMLALIDDELDAMDEDELQQLSRLIEKKKRARLQRG
AASSGTSPSSTSPVYDLQRYTAESLRLAPYPADLKVPTAFPQDHQPRGRILLSHDELMHTDYLLHIRQQFDWLEEPLLRK
LVVEKIFAVYNAPNLHTLLAIIDETLSYMKYHHLHGLPVNPHDPYLETVGGMRQLLFNKLNNLDLGCILDHQDGWGDHCS
TLKRLVKKPGQMSAWLRDDVCDLQKRPPETFSQPMHRAMAYVCSFSRVAVSLRRRALQVTGTPQFFDQFDTNNAMGTYRC
GAVSDLILGALQCHECQNEMCELRIQRALAPYRFMIAYCPFDEQSLLDLTVFAGTTTTTASNHATAGGQQRGGDQIHPTD
EQYANMESRTDPATLTAYDKKDREGSHRHPSPMIAAAPPAQPPSQPQQHYSEGELEEDEDSDDASSQDLVRATDRHGDTV
VYKTTAVPPSPPAPLAGVRSHRGELNLMTPSPSHGGSPPQVPHKQPIIPVQSANGNHSTTATQQQQPPPPPPPPPVPQED
DSVVMRCQTPDYEDMLCYSDDMDD
>P10238 ~~~~~~mRNA export factor~~~
MATDIDMLIDLGLDLSDSDLDEDPPEPAESRRDDLESDSSGECSSSDEDMEDPHGEDGPEPILDAARPAVRPSRPEDPGV
PSTQTPRPTERQGPNDPQPAPHSVWSRLGARRPSCSPEQHGGKVARLQPPPTKAQPARGGRRGRRRGRGRGGPGAADGLS
DPRRRAPRTNRNPGGPRPGAGWTDGPGAPHGEAWRGSEQPDPPGGQRTRGVRQAPPPLMTLAIAPPPADPRAPAPERKAP
AADTIDATTRLVLRSISERAAVDRISESFGRSAQVMHDPFGGQPFPAANSPWAPVLAGQGGPFDAETRRVSWETLVAHGP
SLYRTFAGNPRAASTAKAMRDCVLRQENFIEALASADETLAWCKMCIHHNLPLRPQDPIIGTTAAVLDNLATRLRPFLQC
YLKARGLCGLDELCSRRRLADIKDIASFVFVILARLANRVERGVAEIDYATLGVGVGEKMHFYLPGACMAGLIEILDTHR
QECSSRVCELTASHIVAPPYVHGKYFYCNSLF
>P36295 ~~~~~~mRNA export factor~~~
MATDIDMLIDLGLDLSDSDLDEDPPEPAESRRDDLESDSSGECSSSDEDMEDPHGEDGPEPILDAARPAVRPSRPEDPGV
PSTQTPRPTERQGPNDPQPAPHSVWSRLGARRPSCSPEQHGGKVARLQPPPTKAQPARGGRRGRRRGRGRGGPGAADGLS
DPRRRAPRTNRNPGDRPGAGWTDGPGAPHGEAWRGSEQPDPPGGPRTRGVRQAPPPLMTLAIAPPPADPRAPAPERKAPA
ADTIDATTRLVLRSISERAAVDRISESFGRSAQVMPDPFGGQPFPAANSPWAPVLAGQGGPFDAETRRVSWETLVAHGPS
LYRTFAGNPRAASTAKAMRDCVLRQENFIEALASADETLAWCKMCIHHNLPLRPQDPIIGTTAAVLDNLATRLRPFLQCY
LKARGLCGLDELCSRRRLADIKDIASFVFVILARLANRVERGVAEIDYATLGVGVGEKMHFYLPGACMAGLIEILDTHRQ
ECSSRVCELTASHIVAPPYVHGKYFYCNSLF
>P28276 ~~~~~~mRNA export factor~~~
MATDIDMLIDLGLDLSDSELEEDALERDEEGRRDDPESDSSGECSSSDEDMEDPCGDGGAEAIDAAIPKGPPARPEDAGT
PEASTPRPAARRGADDPPPATTGVWSRLGTRRSASPREPHGGKVARIQPPSTKAPHPRGGRRGRRRGRGRYGPGGADSTP
KPRRRVSRNAHNQGGRHPASARTDGPGATHGEARRGGEQLDVSGGPRPRGTRQAPPPLMALSLTPPHADGRAPVPERKAP
SADTIDPAVRAVLRSISERAAVERISESFGRSALVMQDPFGGMPFPAANSPWAPVLATQAGGFDAETRRVSWETLVAHGP
SLYRTFAANPRAASTAKAMRDCVLRQENLIEALASADETLAWCKMCIHHNLPLRPQDPIIGTAAAVLENLATRLRPFLQC
YLKARGLCGLDDLCSRRRLSDIKDIASFVLVILARLANRVERGVSEIDYTTVGVGAGETMHFYIPGACMAGLIEILDTHR
QECSSRVCELTASHTIAPLYVHGKYFYCNSLF
>Q2HR75 ~~~~~~mRNA export factor ICP27 homolog~~~
MVQAMIDMDIMKGILEDSVSSSEFDESRDDETDAPTLEDEQLSEPAEPPADERIRGTQSAQGIPPPLGRIPKKSQGRSQL
RSEIQFCSPLSRPRSPSPVNRYGKKIKFGTAGQNTRPPPEKRPRRRPRDRLQYGRTTRGGQCRAAPKRATRRPQVNCQRQ
DDDVRQGVSDAVKKLRLPASMIIDGESPRFDDSIIPRHHGACFNVFIPAPPSHVPEVFTDRDITALIRAGGKDDELINKK
ISAKKIDHLHRQMLSFVTSRHNQAYWVSCRRETAAAGGLQTLGAFVEEQMTWAQTVVRHGGWFDEKDIDIILDTAIFVCN
AFVTRFRLLHLSCVFDKQSELALIKQVAYLVAMGNRLVEACNLLGEVKLNFRGGLLLAFVLTIPGMQSRRSISARGQELF
RTLLEYYRPGDVMGLLNVIVMEHHSLCRNSECAAATRAAMGSAKFNKGLFFYPLS
>P13199 ~~~EJRF1~~~mRNA export factor ICP27 homolog~~~
MEDIIEGGISSDDDFDSSDSSSDEEESDTSPQIMKSDVTMASPPSTPEPSPDVSASTSNLKRERQRSPITWEHQSPLSRV
YRSPSPMRFGKRPRISSNSTSRSCKTSWADRVREAAAQRRPSRPFRKPYSHPRNGPLRNGPPRAPPLLKLFDISILPKSG
EPKLFLPVPSLPCQEAEKTNDKYVLAMAQRAMHDVPISSKQLTANLLPVKFKPLLSIVRYTPNYYYWVSMRKETIASANL
CTVAAFLDESLCWGQQYLKNDFIFSENGKDIILDTSSALLSQLVHKIKMLPFCHCLMQTTPQDHIVKQVCYLIASNNRIL
DAVRYLQTSVIKSPIVLLLAYAVCLPAAIICTKNETQLYSHCMRILKEYRPGDVMNILHESLTQHLNKCPSSTCAYTTRA
IVGTKANTTGLFFLPTQ
>P09269 ~~~ORF4~~~mRNA export factor ICP27 homolog~~~
MASASIPTDPDVSTICEDFMNLLPDEPSDDFALEVTDWANDEAIGSTPGEDSTTSRTVYVERTADTAYNPRYSKRRHGRR
ESYHHNRPKTLVVVLPDSNHHGGRDVETGYARIERGHRRSSRSYNTQSSRKHRDRSLSNRRRRPTTPPAMTTGERNDQTH
DESYRLRFSKRDARRERIRKEYDIPVDRITGRAIEVVSTAGASVTIDSVRHLDETIEKLVVRYATIQEGDSWASGGCFPG
IKQNTSWPELMLYGHELYRTFESYKMDSRIARALRERVIRGESLIEALESADELLTWIKMLAAKNLPIYTNNPIVATSKS
LLENLKLKLGPFVRCLLLNRDNDLGSRTLPELLRQQRFSDITCITTYMFVMIARIANIVVRGSKFVEYDDISCNVQVLQE
YTPGSCLAGVLEALITHQRECGRVECTLSTWAGHLSDARPYGKYFKCSTFNC
>P36313 ~~~~~~Neurovirulence factor ICP34.5~~~
MARRRRHRGPRRPRPPGPTGAVPTAQSQVTSTPNSEPAVRSAPAAAPPPPPAGGPPPSCSLLLRQWLHVPESASDDDDDD
DWPDSPPPEPAPEARPTAAAPRPRPPPPGVGPGGGADPSHPPSRPFRLPPRLALRLRVTAEHLARLRLRRAGGEGAPEPP
ATPATPATPATPATPARVRFSPHVRVRHLVVWASAARLARRGSWARERADRARFRRRVAEAEAVIGPCLGPEARARALAR
GAGPANSV
>P08353 ~~~RL1~~~Neurovirulence factor ICP34.5~~~
MARRRRHRGPRRPRPPGPTGAVPTAQSQVTSTPNSEPAVRSAPAAAPPPPPASGPPPSCSLLLRQWLHVPESASDDDDDD
DWPDSPPPEPAPEARPTAAAPRPRSPPPGAGPGGGANPSHPPSRPFRLPPRLALRLRVTAEHLARLRLRRAGGEGAPEPP
ATPATPATPATPATPATPATPATPATPATPARVRFSPHVRVRHLVVWASAARLARRGSWARERADRARFRRRVAEAEAVI
GPCLGPEARARALARGAGPANSV
>Q8VBE2 ~~~~~~ICP35~~~
MTPLGLVWSTTLRCAPLDACTRATAAGPKNPATTEAPTTIVTFPKLFINTVTALPSFFPNTESVAFDSSVSLISLPALSG
FSVK
>P03170 ~~~~~~ICP47 protein~~~
MSWALEMADTFLDTMRVGPRTYADVRDEINKRGREDREAARTAVHDPERPLLRSPGLLPEIAPNASLGVAHRRTGGTVTD
SPRNPVTR
>P14345 ~~~~~~ICP47 protein~~~
MSWALKTTDMFLDSSRCTHRTYGDVCAEIHKREREDREAARTAVTDPELPLLCPPDVRSDPASRNPTQQTRGCARSNERQ
DRVLAP
>P17473 ~~~IE~~~Major viral transcription factor~~~
MASQRSDFAPDLYDFIESNDFGEDPLIRAASAAEEGFTQPAAPDLLYGSQNMFGVDDAPLSTPVVVIPPPSPAPEPRGGK
AKRSPSAAGSGGPPTPAAAAQPASPAPSPAPGLAAMLKMVHSSVAPGNGRRATGSSSPGGGDAADPVALDSDTETCPGSP
QPEFPSSASPGGGSPAPRVRSISISSSSSSSSSMDEDDQADGAGASSSSSSSSDDSDSDEGGEEETPRPRHSQNAAKTPS
AAGSPGPSSGGDRPAAGAATPKSCRSGAASPGAPAPAPASAPAPSRPGGGLLPPGARILEYLEGVREANLAKTLERPEPP
AGMASPPGRSPHRLPKDQRPKSALAGASKRKRANPRPRPQTQTQAPAEEAPQTAVWDLLDMNSSQATGAAAAAASAPAAA
SCAPGVYQREPLLTPSGDPWPGSDPPPMGRVRYGGTGDSRDGLWDDPEIVLAASRYAEAQAPVPVFVPEMGDSTKQYNAL
VRMVFESREAMSWLQNSKLSGQDQNLAQFCQKFIHAPRGHGSFITGSVANPLPHIGDAMAAGNALWALPHAAASVAMSRR
YDRTQKSFILQSLRRAYADMAYPRDEAGRPDSLAAVAGCPAQAAAAAASQQQPEAPAPSVRVREAYTRVCAALGPRRKAA
AAAAAPGTRAPRPSAFRLRELGDACVLACQAVFEALLRLRGGASAVPGLDPSEIPSPACPPEALCSNPAGLETAALSLYE
LRDLVERARLLGDSDPTHRLGSDELRLAVRAVLVVARTVAPLVRYNAEGARARASAWTVTQAVFSIPSLVGGMLGEAVSL
LAPPTRSQQPSSSSPGGEPFSGSAAAEGSLQTLPPLWPTVPGKQSATVPSSHSQSPQHSQSGGGAGATTATCCRATQTNA
RSRGQQHQPQKARSPQAAASPAHLSQEAMPGSSSDDRAIHGRPRGKSGKRRSEPLEPAAQAGASASFSSSARGYDPSGPV
DSPPAPKRRVATPGHQAPRALGPMPAEGPDRRGGFRRVPRGDCHTPRPSDAACAAYCPPELVAELIDNQLFPEAWRPALT
FDPQALATIAARCSGPPARDGARLGELAASGPLRRRAAWMHQIPDPEDVKVVVLYSPLQDEDLLGGLPASRPGGSRREPL
WSDLKGGLSALLAALGNRILTKRSHAWAGNWTGAPDVSALNAQGVLLLSTGDLAFTGCVEYLCLRLGSARRKLLVLDAVS
TEDWPQDGPAISQYHIYMRAALTPRVACAVRWPGERHLSRAVLTSSTLFGPGLFARAEAAFARLYPDSAPLRLCRSSNVA
YTVDTRAGERTRVPLAPREYRQRVLPDYDGCKDMRAQAEGLGFHDPDFEEGAAQSHRAANRWGLGAWLRPVYLACGRRGA
GAVEPSELLIPELLSEFCRVALLEPDAEAEPLVLPITEAPRRRAPRVDWEPGFGSRSTSVLHMGATELCLPEPDDELEID
GAGDVELVVEHPGPSPGVAQALRRAPIKIEVVSDDEDGGDWCNPYLS
>P08392 ~~~ICP4~~~Major viral transcription factor ICP4~~~
MASENKQRPGSPGPTDGPPPTPSPDRDERGALGWGAETEEGGDDPDHDPDHPHDLDDARRDGRAPAAGTDAGEDAGDAVS
PRQLALLASMVEEAVRTIPTPDPAASPPRTPAFRADDDDGDEYDDAADAAGDRAPARGREREAPLRGAYPDPTDRLSPRP
PAQPPRRRRHGRWRPSASSTSSDSGSSSSSSASSSSSSSDEDEDDDGNDAADHAREARAVGRGPSSAAPAAPGRTPPPPG
PPPLSEAAPKPRAAARTPAASAGRIERRRARAAVAGRDATGRFTAGQPRRVELDADATSGAFYARYRDGYVSGEPWPGAG
PPPPGRVLYGGLGDSRPGLWGAPEAEEARRRFEASGAPAAVWAPELGDAAQQYALITRLLYTPDAEAMGWLQNPRVVPGD
VALDQACFRISGAARNSSSFITGSVARAVPHLGYAMAAGRFGWGLAHAAAAVAMSRRYDRAQKGFLLTSLRRAYAPLLAR
ENAALTGAAGSPGAGADDEGVAAVAAAAPGERAVPAGYGAAGILAALGRLSAAPASPAGGDDPDAARHADADDDAGRRAQ
AGRVAVECLAACRGILEALAEGFDGDLAAVPGLAGARPASPPRPEGPAGPASPPPPHADAPRLRAWLRELRFVRDALVLM
RLRGDLRVAGGSEAAVAAVRAVSLVAGALGPALPRDPRLPSSAAAAAADLLFDNQSLRPLLAAAASAPDAADALAAAAAS
AAPREGRKRKSPGPARPPGGGGPRPPKTKKSGADAPGSDARAPLPAPAPPSTPPGPEPAPAQPAAPRAAAAQARPRPVAV
SRRPAEGPDPLGGWRRQPPGPSHTAAPAAAALEAYCSPRAVAELTDHPLFPVPWRPALMFDPRALASIAARCAGPAPAAQ
AACGGGDDDDNPHPHGAAGGRLFGPLRASGPLRRMAAWMRQIPDPEDVRVVVLYSPLPGEDLAGGGASGGPPEWSAERGG
LSCLLAALANRLCGPDTAAWAGNWTGAPDVSALGAQGVLLLSTRDLAFAGAVEFLGLLASAGDRRLIVVNTVRACDWPAD
GPAVSRQHAYLACELLPAVQCAVRWPAARDLRRTVLASGRVFGPGVFARVEAAHARLYPDAPPLRLCRGGNVRYRVRTRF
GPDTPVPMSPREYRRAVLPALDGRAAASGTTDAMAPGAPDFCEEEAHSHAACARWGLGAPLRPVYVALGREAVRAGPARW
RGPRRDFCARALLEPDDDAPPLVLRGDDDGPGALPPAPPGIRWASATGRSGTVLAAAGAVEVLGAEAGLATPPRREVVDW
EGAWDEDDGGAFEGDGVL
>P90493 ~~~ICP4~~~Major viral transcription factor ICP4 homolog~~~
MSAEQRKKKKTTTTTQGRGAEVAMADEDGGRLRAAAETTGGPGSPDPADGPPPTPNPDRRPAARPGFGWHGGPEENEDEA
DDAAADADADEAAPASGEAVDEPAADGVVSPRQLALLASMVDEAVRTIPSPPPERDGAQEEAARSPSPPRTPSMRADYGE
ENDDDDDDDDDDDRDAGRWVRGPETTSAVRGAYPDPMASLSPRPPAPRRHHHHHHHRRRRAPRRRSAASDSSKSGSSSSA
SSASSSASSSSSASASSSDDDDDDDAARAPASAADHAAGGTLGADDEEAGVPARAPGAAPRPSPPRAEPAPARTPAATAG
RLERRRARAAVAGRDATGRFTAGRPRRVELDADAASGAFYARYRDGYVSGEPWPGAGPPPPGRVLYGGLGDSRPGLWGAP
EAEEARARFEASGAPAPVWAPELGDAAQQYALITRLLYTPDAEAMGWLQNPRVAPGDVALDQACFRISGAARNSSSFISG
SVARAVPHLGYAMAAGRFGWGLAHVAAAVAMSRRYDRAQKGFLLTSLRRAYAPLLARENAALTGARTPDDGGDANRHDGD
DARGKPAAAAAPLPSAAASPADERAVPAGYGAAGVLAALGRLSAAPASAPAGADDDDDDDGAGGGGGGRRAEAGRVAVEC
LAACRGILEALAEGFDGDLAAVPGLAGARPAAPPRPGPAGAAAPPHADAPRLRAWLRELRFVRDALVLMRLRGDLRVAGG
SEAAVAAVRAVSLVAGALGPALPRSPRLLSSAAAAAADLLFQNQSLRPLLADTVAAADSLAAPASAPREARKRKSPAPAR
APPGGAPRPPKKSRADAPRPAAAPPAGAAPPAPPTPPPRPPRPAALTRRPAEGPDPQGGWRRQPPGPSHTPAPSAAALEA
YCAPRAVAELTDHPLFPAPWRPALMFDPRALASLAARCAAPPPGGAPAAFGPLRASGPLRRAAAWMRQVPDPEDVRVVIL
YSPLPGEDLAAGRAGGGPPPEWSAERGGLSCLLAALGNRLCGPATAAWAGNWTGAPDVSALGAQGVLLLSTRDLAFAGAV
EFLGLLAGACDRRLIVVNAVRAADWPADGPVVSRQHAYLACEVLPAVQCAVRWPAARDLRRTVLASGRVFGPGVFARVEA
AHARLYPDAPPLRLCRGANVRYRVRTRFGPDTLVPMSPREYRRAVLPALDGRAAASGAGDAMAPGAPDFCEDEAHSHRAC
ARWGLGAPLRPVYVALGRDAVRGGPAELRGPRREFCARALLEPDGDAPPLVLRDDADAGPPPQIRWASAAGRAGTVLAAA
GGGVEVVGTAAGLATPPRREPVDMDAELEDDDDGLFGE
>P09310 ~~~~~~Major viral transcription factor ICP4 homolog~~~
MDTPPMQRSTPQRAGSPDTLELMDLLDAAAAAAEHRARVVTSSQPDDLLFGENGVMVGREHEIVSIPSVSGLQPEPRTED
VGEELTQDDYVCEDGQDLMGSPVIPLAEVFHTRFSEAGAREPTGADRSLETVSLGTKLARSPKPPMNDGETGRGTTPPFP
QAFSPVSPASPVGDAAGNDQREDQRSIPRQTTRGNSPGLPSVVHRDRQTQSISGKKPGDEQAGHAHASGDGVVLQKTQRP
AQGKSPKKKTLKVKVPLPARKPGGPVPGPVEQLYHVLSDSVPAKGAKADLPFETDDTRPRKHDARGITPRVPGRSSGGKP
RAFLALPGRSHAPDPIEDDSPVEKKPKSREFVSSSSSSSSWGSSSEDEDDEPRRVSVGSETTGSRSGREHAPSPSNSDDS
DSNDGGSTKQNIQPGYRSISGPDPRIRKTKRLAGEPGRQRQKSFSLPRSRTPIIPPVSGPLMMPDGSPWPGSAPLPSNRV
RFGPSGETREGHWEDEAVRAARARYEASTEPVPLYVPELGDPARQYRALINLIYCPDRDPIAWLQNPKLTGVNSALNQFY
QKLLPPGRAGTAVTGSVASPVPHVGEAMATGEALWALPHAAAAVAMSRRYDRAQKHFILQSLRRAFASMAYPEATGSSPA
ARISRGHPSPTTPATQAPDPQPSAAARSLSVCPPDDRLRTPRKRKSQPVESRSLLDKIRETPVADARVADDHVVSKAKRR
VSEPVTITSGPVVDPPAVITMPLDGPAPNGGFRRIPRGALHTPVPSDQARKAYCTPETIARLVDDPLFPTAWRPALSFDP
GALAEIAARRPGGGDRRFGPPSGVEALRRRCAWMRQIPDPEDVRLLIIYDPLPGEDINGPLESTLATDPGPSWSPSRGGL
SVVLAALSNRLCLPSTHAWAGNWTGPPDVSALNARGVLLLSTRDLAFAGAVEYLGSRLASARRRLLVLDAVALERWPRDG
PALSQYHVYVRAPARPDAQAVVRWPDSAVTEGLARAVFASSRTFGPASFARIETAFANLYPGEQPLCLCRGGNVAYTVCT
RAGPKTRVPLSPREYRQYVLPGFDGCKDLARQSRGLGLGAADFVDEAAHSHRAANRWGLGAALRPVFLPEGRRPGAAGPE
AGDVPTWARVFCRHALLEPDPAAEPLVLPPVAGRSVALYASADEARNALPPIPRVMWPPGFGAAETVLEGSDGTRFVFGH
HGGSERPSETQAGRQRRTADDREHALELDDWEVGCEDAWDSEEGGGDDGDAPGSSFGVSIVSVAPGVLRDRRVGLRPAVK
VELLSSSSSSEDEDDVWGGRGGRSPPQSRG
>P41710 ~~~IE0~~~Immediate-early protein IE-0~~~
MIRTSSHVLNVQENIMTSNCASSPYSCEATSACAEAQQVMIDNFVFFHMYNADIQIDAKLQCGVRSAAFAMIDDKHLEMY
KHRIENKFFYYYDQCADIAKPDRLPDDDGACCHHFIFDAQRIIQCIKEIESAYGVRDRGNVIVFYPYLKQLRDALKLIKN
SFACCFKIINSMQMYVNELISNCLLFIEKLETINKTVKVMNLFVDNLVLYECNVCKEISTDERFLKPKECCEYAICNACC
VNMWKTATTHAKCPACRTSYK
>P11138 ~~~IE1~~~Trans-activating transcriptional regulatory protein~~~
MTQINFNASYTSASTPSRASFDNSYSEFCDKQPNDYLSYYNHPTPDGADTVISDSETAAASNFLASVNSLTDNDLVECLL
KTTDNLEEAVSSAYYSESLEQPVVEQPSPSSAYHAESFEHSAGVNQPSATGTKRKLDEYLDNSQGVVGQFNKIKLRPKYK
KSTIQSCATLEQTINHNTNICTVASTQEITHYFTNDFAPYLMRFDDNDYNSNRFSDHMSETGYYMFVVKKSEVKPFEIIF
AKYVSNVVYEYTNNYYMVDNRVFVVTFDKIRFMISYNLVKETGIEIPHSQDVCNDETAAQNCKKCHFVDVHHTFKAALTS
YFNLDMYYAQTTFVTLLQSLGERKCGFLLSKLYEMYQDKNLFTLPIMLSRKESNEIETASNNFFVSPYVSQILKYSESVQ
FPDNPPNKYVVDNLNLIVNKKSTLTYKYSSVANLLFNNYKYHDNIASNNNAENLKKVKKEDGSMHIVEQYLTQNVDNVKG
HNFIVLSFKNEERLTIAKKNKEFYWISGEIKDVDVSQVIQKYNRFKHHMFVIGKVNRRESTTLHNNLLKLLALILQGLVP
LSDAITFAEQKLNCKYKKFEFN
>Q8BB47 ~~~~~~Immediate-early protein 2~~~
MEPAKPSGNNMGSNDERMQDYRPDPMMEESIQQILEDSLMCDTSFDDLILPGLESFGLIIPESSNNIESNNVEEGSNEDL
KTLAEHKCKQGNDNDVIQSAMKLSGLYCDADITHTQPLSDNTHQDPIYSQETRIFSKTIQDPRIAAQTHRQCTSSASNLP
SNESGSTQVRFASELPNQLLQPMYTSHNQNANLQNNFTSLPYQPYHDPYRDIESSYRESRNTNRGYDYNFRHHSYRPRGG
NGKYNYYNPNSKYQQPYKRCFTRTYNRRGRGHRSYDCSDRSADLPYEHYTYPNYEQQNPDPRMNNYKDFTQLTNKFNFGA
NDYSMAFSTDSTHVQSDNYNHPTKAQTIPETTKTKKHKATKDNETSRGNQVLTSNDAISLSYRPSPIKLDIIKKIYDTDV
IPLPKEALTANGSNRDVDIQKYKKAHIRCRSVQKKKERSSQTNKHDENHASSRSDLKERKSNEHEDKAVTKARDFSKLDP
LLSPLPLTPEPAIDFADHTDKFYSTPEFNQIKQNLHRSKTSLQDTVPISKHTPRAPTKDNSYKKHHDSKDNYPKMKHSPG
RTTSKKNTTNSNGHQNFKDVSVKNVSGKATSTSPKSKTHHYSSSSDEEGQYKSPVKTITPSPSPYCKLKNPSIMDKNSAK
NHTASADKNLTDNSPIRSNLNPTAFNKSNNNKSITNSTSNSDECTDKKPNCNSTKNESKDPNRTCGKNSDKHLSKSCTMA
SKRAPSRASSRTSSRASSRASSRASSRASSRASSRASSRDSSRASSRAPSRASSRDSSRASSRDSSRDSNRASSKASSRA
SSRDSSRASSRDSSRASSKASRKASSRASSRASSRASSKASGKASSKASSRASSRAFSRNSSRASSRASSRASSRASSRA
SSRASSRDSSRALSRAFGRDSSRASSRASSRDSSRASSKASRKASSRASSRASSRASSRASSRELRQIYCDSNKRQTPPH
DTSINTKFEISEIKFRCGEDLNFYKNTAARLQCFNHNDQFYNPRFRPHIRTNRKKSESTNDTDSESSMSRCKSHCRNSPD
SLTIVRRKKHKSGSSSISSSIEENCRSNSHIVTGKEKFTPFYYQSSRTRSSSSSSSSSSASLSCSKSTLKTCRKTQNRDN
KQIKSKSDSKHKTTNMSSDYESNRHADVFRNSPEAGEKFPLHNSSPFNTHEQSNHSENAIDEEQKKAPNITTSHLHQKQN
VKLHNTKKCKKKRPRDDDSDSSIKNFCKKRISGAQKTESEVSEIDDLCYRDYVRLKERKVSEKFKIHRGRVATKDFQKLF
RNTMRAFEYKQIPKKPCNDKNLKEAVYNICCNGLSNNAAIIMYFTRSKKVAQNIKIMQKELMIRPNITVSEAFKMNHAPP
KYYDKDEIKRFIQLQKQGPQELWDKFENNTTHDLFTRHSDVKTMIIYAATPIDFVGAVKTCNKYAKDNPKEIVLRVCSII
DGDNPISIYNPISKDFKSKFSTLSKC
>Q77Z83 ~~~~~~Immediate-early protein 2~~~
MEPAKPSGNNMGSNDERMQDYRPDPMMEESIKEILEESLMCDTSFDDLIIPGLESFGLIIPESSNNIESNNVEEGSDGEL
KTLAEQKCKQGNDNDVIQSAMKLSGLYCDADITHTQPLSDNTHQDPIYSQESRIFTKTIQDPRIVAQTHRQCTSSASNLQ
SNESGSTQVRFASELPNQLLQPMYTSHNQNANLQNNFTSLPYQPYHDPYRDIESSYRESRNTNRGYDYNFRHHPYRPRGG
NGKYNYYNPNSKYQQPYKRCFTRTYNRRGRGHRSYDCSDRSADLPYEHYTYPNYEQQNPDPRMNNYKDFTQLTNKFNFES
YDYSMAFSTDSTHVQSDNYNHPTKAQTIPETTKTKKHEATKDNETSTENQVLTPDVISLSYRPSSYKMDIIKKIYDTDVI
PLPKEALTANGSNRDVDIQKYKKAHIRCRSVQKKKERSSQTNKHDENHASSRSDLKERKSNENEDKAVTKARDFSKLNPL
LSPLPLTPEPAIDFADHTDKFYSTPEFNQIQQNLHRSKTSLQDTVPISKHTPRAPTKDNSYKKHHDSKDNYPKMKHSPGR
TTSKKNTTNSNGHQNFKEVSVKNVSGKATSTSPKSKTHHYSSSSDEEGQYKSPVKTIIQSPSPYCKLKNPSIMDKNSAKN
HTASADKNLTDNSPIRSNLNPTAFNKSNSNKSITDSTSNSDECTDRKPNCNSTKNESKDPNRTCGKNSDKHLSKSCTMAS
KRAPSRASSRASSRDSSRASSRASSRASSRDSSRASSRASSRDSSRASSRASSRASSKASSRASSRASSRASSRDSSRAS
SKASSRASSRDSSRASSRASSRASSKASSRASSRASSRASSRDSSRASSKASSRASSRDSSRASSRDSSRDSSRASSRAS
SRDSSRASSKASRKASSRASSRASSRASSKASGKASSEASSRASSRNSSRASSRASSRASSRDSSRASSRASSRDSSRAS
SKASRKASSRASSRASSRELRQIYCDSNKRQTPPHDTSINTKFEISEIKFRCGEDLNFYKNTAARLQCFNHNDQFYNPRF
RPHIRTNRKKSESTNDTDSESSMSRCKSHCRNSPDSLTVVRRKKHKSGSSSISSSIEENCRSNSHIVTGKEKFTPFYYQS
SRTRSSSSSSSSSASLSCSKSTLKTCRKTQYKDNKQIKSKSDSKHKTTNMSSDYESNRHADVFKNSPEAGEKFPLHNSSP
FNTHEQSNHSENAIDEEQKKAPNITTSHLQGKQNVRLHNTKKCKKKRPRDDDSDSSIKNFCKKRISGAQKTESEVSEPDD
LCYRDYVRLKERKVSEKFKIHRGRVATKDFQKLFRNTMRAFEYKQIPKKPCNEKNLKEAVYDICCNGLSNNAAIIMYFTR
SKKVAQIIKIMQKELMIRPNITVSEAFKMNHAPPKYYDKDEIKRFIQLQKQGPQELWDKFENNTTHDLFTRHSDVKTMII
YAATPIDFVGAVKTCNKYAKDNPKEIVLRVCSIIDGDNPISIYNPISKEFKSKFSTLSKC
>Q9E1W2 2.3.2.27~~~~~~E3 ubiquitin-protein ligase IE61~~~
MNPPAYTSTSGSVASTGNCAICMSAISGLGKTLPCLHDFCFVCIQTWTSTSAQCPLCRTVVSSILHNITSDANYEEYEVI
FDDEGYNEDAPLQIPEEPGVNVSPQPPVHSTANSASNTALMRSHAQPRVLAPDNSSVFQPSTSSHASFSSGFAPYSQTPP
VGASNLEATRVSRSAVITPTTSTGPRLHLSPSSRSVSQRLQTLFGITKLPGVPTEPPAYAQAEAHFGQANGYGQHRGALH
GSYPAVLTAQDTSQIPTRLPFRATDRDVMEVLNSHVICSLCWVGWDEQLATLFPPPIVEPTKTLILNYIAIYGVEDVKLK
VSLRCLLHDLTVPFVENMLFLIDRCTDPTRISMQAWTWHDTPIRLLSGPIKSPDGGSTSQDTSVSNIHRSPPGGSSTQPS
SGRRPGRPKGVKRRLFVDDDTGVSTNESVFPVINAPIHHKNSKLAALPTGSTTDSNERLVVESPGASAEQPSTSGSSPSP
SRRRGRKQGIARIEMLTKKVRRK
>P09309 2.3.2.27~~~~~~E3 ubiquitin-protein ligase IE61~~~
MDTILAGGSGTSDASDNTCTICMSTVSDLGKTMPCLHDFCFVCIRAWTSTSVQCPLCRCPVQSILHKIVSDTSYKEYEVH
PSDDDGFSEPSFEDSIDILPGDVIDLLPPSPGPSRESIQQPTSRSSREPIQSPNPGPLQSSAREPTAESPSDSQQDSIQP
PTRDSSPGVTKTCSTASFLRKVFFKDQPAVRSATPVVYGSIESAQQPRTGGQDYRDRPVSVGINQDPRTMDRLPFRATDR
GTEGNARFPCYMQPLLGWLDDQLAELYQPEIVEPTKMLILNYIGIYGRDEAGLKTSLRCLLHDSTGPFVTNMLFLLDRCT
DPTRLTMQTWTWKDTAIQLITGPIVRPETTSTGETSRGDERDTRLVNTPQKVRLFSVLPGIKPGSARGAKRRLFHTGRDV
KRCLTIDLTSESDSACKGSKTRKVASPQGESNTPSTSGSTSGSLKHLTKKSSAGKAGKGIPNKMKKS
>P85991 ~~~~~~Ig-like virion protein~~~
MPKIKILVTPYVVKNKPETERNHVVTGVADGWEKTSLNADPNEILTESKGLDTLLTDCNLKPDGVTKIDPSKPIGKEVEI
EIYDPDAIPVQTVTVSPDTATGTVGQQLTFTADILPADATYKDVEWSSSDEAKAKSLGNGVFDLKAVGTGIIVSATSIDG
GVIGEAELTITAAVVAVTGVTLSPKTKTIAVGEKFTLAATVAPANATNKTVTYTSADPATATVNATTGEVEGKASGTVAI
TGKTADGNKTDVCNVTVS
>Q76U48 ~~~~~~IkB-like protein~~~
MEHMFPEREIENLFVKWIKKHIRNGNLTLFEEFFKTDPWIVNRCDKNGSSVFMWICIYGRIDFLKFLFEQESYPGEIINP
HRRDKDGNSALHYLAEKKNHLILEEVLGYFGKNGTKICLPNFNGMTPVMKAAIRGRTSNVLSLIKFGADPTQKDYHRGFT
AWDWAVFTGNMELVKSLNHDYQKPLYMHFPLYKLDVFHRWFKKKPKIIITGCKNNVYEKLPEQNPNFLCVKKLNKYGK
>O36972 ~~~~~~IkB-like protein~~~
MDTIGLFSVEAEHLFVEWVKKCIKKGDLTLFETLFNADPWIVNRCNKNKITVFMLICIYGRLDFLRFLFKQESYPGEIVN
HYRRDKDGNSAWHYLAEKNNHLLLEEVLDYFGKNGIRVCFPNFNGVTPIMKAAMRGRTLSVLSLLKYGANPNRKDYLKGF
TTWDWAVFTGHADLVKTLNKGYQKPLFMHFPLYKLDVFHRRFKKKPKIIITGCEDNVYEKLPEQNSNFLCVKKLNKYGK
>P03180 ~~~BCRF1~~~Viral interleukin-10 homolog~~~
MERRLVVTLQCLVLLYLAPECGGTDQCDNFPQMLRDLRDAFSRVKTFFQTKDEVDNLLLKESLLEDFKGYLGCQALSEMI
QFYLEEVMPQAENQDPEAKDHVNSLGENLKTLRLRLRRCHRFLPCENKSKAVEQIKNAFNKLQEKGIYKAMSEFDIFINY
IEAYMTIKAR
>P17150 ~~~~~~Viral interleukin-10 homolog~~~
MLSVMVSSSLVLIVFFLGASEEAKPATTTIKNTKPQCRPEDYATRLQDLRVTFHRVKPTLQREDDYSVWLDGTVVKGCWG
CSVMDWLLRRYLEIVFPAGDHVYPGLKTELHSMRSTLESIYKDMRQCPLLGCGDKSVISRLSQEAERKSDNGTRKGLSEL
DTLFSRLEEYLHSRK
>P25212 ~~~~~~Interleukin-1-binding protein~~~
MSILPVIFLSIFFYSSFVQTFNAPECIDKGQYFASFMELENEPVILPCPQINTLSSGYNILDILWEKRGADNDRIIPIDN
GSNMLILNPTQSDSGIYICITTNETYCDMMSLNLTIVSVSESNIDLISYPQIVNERSTGEMVCPNINAFIASNVNADIIW
SGHRRLRNKRLKQRTPGIITIEDVRKNDAGYYTCVLEYIYGGKTYNVTRIVKLEVRDKIIPSTMQLPDGIVTSIGSNLTI
ACRVSLRPPTTDADVFWISNGMYYEEDDGDGNGRISVANKIYMTDKRRVITSRLNINPVKEEDATTFTCMAFTIPSISKT
VTVSIT
>P0C5A5 ~~~I~~~Protein I~~~
MESSRRPLGLTKPSAGXIIKIEAERISPSRLQLLNPIPGVWFPITLGFRALPNSRREKSFSLHKDKECLLPMESQLHSKR
DIGTDTTDVPLKHLMASRSSYCPDGIFTILEQGPMLAQSMATISKELSGSQANRPRLGPLPILLKGTQVAMRLFLLGLRP
VRYCLKVFMLKAQEGLHLLVDLVRGHNPVGQIIALEAVPTSASLPLL
>P03718 ~~~ipi1~~~Internal protein I~~~
MKTFKEFTSTTTPVSTITEATLTSEVIKANKGREGKPMISLVDGEEIKGTVYLGDGWSAKKDGATIVISPAEETALFKAK
HISAAHLKIIAKNLL
>P03719 ~~~ipi2~~~Internal protein II~~~
MKTYQEFIAEARVGAGKLEAAVNKKAHSFHDLPDKDRKKLVSLYIDRERILALPGANEGKQAKPLNAVEKKIDNFASKFG
MSMDDLQQAAIEAAKAIKDK
>P13302 ~~~ipi3~~~Internal protein III~~~
MKTYQEFIAEASVVKAKGINKDEWTYRSGNGFDPKTAPIERYLATKASDFKAFAWEGLRWRTDLNIEVDGLKFAHIEDVV
ASNLDSEFVKADADLRRWNLKLFSKQKGPKFVPKAGKWVIDNKLAKAVNFAGLEFAKHKSSWKGLDAMAFRKEFADVMTK
GGFKAEIDTSKGKFKDANIQYAYAVANAARGNS
>P09715 ~~~IRS1~~~Protein IRS1~~~
MAQRNGMSPRPPPLGRGRGAGGPSGVGSSPPSSCVPMGAPSTAGTGASAAATTTPGHGVHRVEPRGPPGAPPSSGNNSNF
WHGPERLLLSQIPVERQALTELEYQAMGAVWRAAFLANSTGRAMRKWSQRDAGTLLPLGRPYGFYARVTPRSQMNGVGAT
DLRQLSPRDAWIVLVATVVHEVDPAADPTVGDKAGHPEGLCAQDGLYLALGAGFRVFVYDLANNTLILAARDADEWFRHG
AGEVVRLYRCNRLGVGTPRATLLPQPALRQTLLRAEEATALGRELRRRWAGTTVALQTPGRRLQPMVLLGAWQELAQYEP
FASAPHPASLLTAVRRHLNQRLCCGWLALGAVLPARWLGCAAGPATGTAAGTTSPPAASGTETEAAGGDAPCAIAGAVGS
AVPVPPQPYGAAGGGAICVPNADAHAVVGADAAAAAAPTVMVGSTAMAGPAASGTVPRAMLVVLLDELGAVFGYCPLDGH
VYPLAAELSHFLRAGVLGALALGRESAPAAEAARRLLPELDREQWERPRWDALHLHPRAALWAREPHGQWEFMFREQRGD
PINDPLAFRLSDARTLGLDLTTVMTERQSQLPEKYIGFYQIRKPPWLMEQPPPPSRQTKPDAATMPPPLSAQASVSYALR
YDDESWRPLSTVDDHKAWLDLDESHWVLGDSRPDDIKQRRLLKATQRRGAEIDRPMPVVPEECYDQRFTTEGHQVIPLCA
SEPEDDDEDPTYDELPSRPPQKHKPPDKPPRLCKTGPGPPPLPPKQRHGSTDGKVSAPRQSEHHKRQTRPPRPPPPKFGD
RTAAHLSQNMRDMYLDMCTSSGHRPRPPAPPRPKKCQTHAPHHVHH
>P03785 ~~~~~~Inhibitor of toxin/antitoxin system~~~
MSNVAETIRLSDTADQWNRRVHINVRNGKATMVYRWKDSKSSKNHTQRMTLTDEQALRLVNALTKAAVTAIHEAGRVNEA
MAILDKIDN
>P10221 3.5.1.-~~~~~~Inner tegument protein~~~
MADRGLPSEAPVVTTSPAGPPSDGPMQRLLASLAGLRQPPTPTAETANGADDPAFLATAKLRAAMAAFLLSGTAIAPADA
RDCWRPLLEHLCALHRAHGLPETALLAENLPGLLVHRLVVALPEAPDQAFREMEVIKDTILAVTGSDTSHALDSAGLRTA
AALGPVRVRQCAVEWIDRWQTVTKSCLAMSPRTSIEALGETSLKMAPVPLGQPSANLTTPAYSLLFPAPFVQEGLRFLAL
VSNRVTLFSAHLQRIDDATLTPLTRALFTLALVDEYLTTPERGAVVPPPLLAQFQHTVREIDPAIMIPPLEANKMVRSRE
EVRVSTALSRVSPRSACAPPGTLMARVRTDVAVFDPDVPFLSSSALAVFQPAVSSLLQLGEQPSAGAQQRLLALLQQTWT
LIQNTNSPSVVINTLIDAGFTPSHCTHYLSALEGFLAAGVPARTPTGHGLGEVQQLFGCIALAGSNVFGLAREYGYYANY
VKTFRRVQGASEHTHGRLCEAVGLSGGVLSQTLARIMGPAVPTEHLASLRRALVGEFETAERRFSSGQPSLLRETALIWI
DVYGQTHWDITPTTPATPLSALLPVGQPSHAPSVHLAAATQIRFPALEGIHPNVLADPGFVPYVLALVVGDALRATCSAA
YLPRPVEFALRVLAWARDFGLGYLPTVEGHRTKLGALITLLEPAARGGLGPTMQMADNIEQLLRELYVISRGAVEQLRPL
VQLQPPPPPEVGTSLLLISMYALAARGVLQDLAERADPLIRQLEDAIVLLRLHMRTLSAFFECRFESDGRRLYAVVGDTP
DRLGPWPPEAMGDAVSQYCSMYHDAKRALVASLASLRSVITETTAHLGVCDELAAQVSHEDNVLAVVRREIHGFLSVVSG
IHARASKLLSGDQVPGFCFMGQFLARWRRLSACYQAARAAAGPEPVAEFVQELHDTWKGLQTERAVVVAPLVSSADQRAA
AIREVMAHAPEDAPPQSPAADRVVLTSRRDLGAWGDYSLGPLGQTTAVPDSVDLSRQGLAVTLSMDWLLMNELLRVTDGV
FRASAFRPLAGPESPRDLEVRDAGNSLPAPMPMDAQKPEAYGHGPRQADREGAPHSNTPVEDDEMIPEDTVAPPTDLPLT
SYQ
>P15915 ~~~~~~Protein J1 homolog~~~
MAHPHQHLLTLFLTDDNGFYSYLSEKSDDEALEDINTIKKYMDFILSVLIRSKEKLENIGCSYEPMSESFKALIKVKDDG
TLVKAFTKPLLNPHSEKIVLDRGYTSDFAISVIRLSSKSSYILPANTKYINPNENMYINNLISLLKRN
>P19746 ~~~~~~Protein J1 homolog~~~
MDHSKYLLTIFLENDDSFFKYLSEQDDETAMSDIETIVTYLNFLLSLLIRSKDKLESIGYYYEPLSEECKTLVDFSNMKN
FRILFNKIPINILNKQITVNKGYLSDFVTTLMRLKKELFLESPEPITYIDPRKDPTFLNILSILHEK
>A0A097SRX8 ~~~~~~Uncharacterized protein J64R~~~
MLLYIVIIVAYVSYKLVPKQYWPILMFMAYMVYTHEKLDINERSGFWKYIIAKLFRCHGCEICK
>P05411 ~~~JUN~~~Viral jun-transforming protein~~~
MSAKMEPTFYEDALNASFAPPESGGYGYNNADILTSPDVGLLKLASPELERLIIQSSNGLITTTPTPTQFLCPKNVTDEQ
EGFAEGFVRALAELHNQNTLPSVTSAAQPVSGGMAPVSSMAGGGSFNTSLHSEPPVYANLSNFNPNALNSAPNYNANRMG
YAPQHHINPQMPVQHPRLQALKEEPQTVPEMPGETPPLFPIDMESQERIKAERKRMRNRIAASKSRKRKLERIARLEEKV
KTLKAQNSELASTANMLREQVAQLKQKVMNHVNSGCQLMLTQQLQTF
>P69548 ~~~J~~~DNA-binding protein J~~~
MKKARRSPSRRKGARLWYVGGSQF
>P03652 ~~~J~~~DNA-binding protein J~~~
MKKSIRRSGGKSKGARLWYVGGTQY
>P69592 ~~~J~~~DNA-binding protein J~~~
MSKGKKRSGARPGRPQPLRGTKGKRKGARLWYVGGQQF
>Q9QR69 ~~~~~~Protein K15~~~
MKTLIFFWNLWLWALLVCFWCITLVCVTTNSIDTMASLLVMCILFVSAINKYTQAISSNNPKWPSSWHLGIIACIVLKLW
NLSTTNSVTYACLITTAILSLVTAFLTLIKHCTACKLQLEHGILFTSTFAVLMTNMLVHMSNTWQSSWIFFPISFTLSLP
FLYAFATVKTGNIKLVSSVSFICAGLVMGYPVSCCKTHTCTATAAGLSLSSIYLGFTGIISTLHKSWAPPKRGILTFLLL
QGGVLTTQTLTTELLAITSTTGNIKGHEILLLVCLIFLWCLYVWQSFNKASLVTGMLHLIAAWSHTGGCVQLVMLLPSGL
TRGILTMIICISTLFSTLQGLLVFYLYKEKKVVAVNSYRQRRRRIYTRDQNLHHNDNHLGNNVISPPPLPPFFRQPVRLP
SHVTDRGRGSQLLNEVELQEVNRDPPNVFGYASILVSGAEESREPSPQPDQSGMSILRVDGGSAFRIDTAQAATQPTDDL
YEEVLFPRN
>Q2HRD5 ~~~K1~~~Protein K1~~~
MFLYVVCSLAVCFRGLLSLSLQSSPNLCPGVISTPYTLTCPSNTSLPTSWYCNDTRLLRVTQGTLTVDTLICNFSCVGQS
GHRYSLWITWYAQPVLQTFCGQPSNTVTCGQHVTLYCSTSGNNVTVWHLPNGQNETVSQTKYYNFTLMNQTEGCYACSNG
LSSRLSNRLCFSARCANITPETHTVSVSSTTGFRTFATAPTLFVMKEVKSTYLYIQEHLLVFMTLVALIGTMCGILGTII
FAHCQKQRDSNKTVPQQLQDYYSLHDLCTEDYTQPVDWY
>F5HDA4 ~~~K7~~~Protein K7~~~
MGTLEIKGASLSQFSTGTAQSPWLPLHLWILCSLLAFLPLLVFIGAADCGLIASLLAIYPSWLSARFSVLLFPHWPESCS
TKNTARSGALHKPAEQKLRFAQKPCHGNYTVTPCGLLHWIQSPGQL
>Q8JL69 ~~~KBTB1~~~Kelch repeat and BTB domain-containing protein 1~~~
MNNSSELIAAINGFRNSGRFCDINIVINDERINAHRLILSGASEYFSILFSSDFIDSNEYEVNLSHLDYQSVNDLIDYIY
GIPLSLTNDSVKYILSTADFLQIGSAITECENYILKNLCSRNCIDFYIYADKYNNKKIETASFNTILRNILRLINDENFK
YLTEESMIKILSDDMLNIKNEDFAPLILIKWLESTQQPCTVELLRCLRISLLSPQVIKSLYSHRLVSSIYECITFLNNIA
FLDESFPRYNSIELISIGISNSRDKISINCYNRKKNTWEMISSRGYRCSFAVAVMDNIIYMMGGYDQSPYRSSKVIAYNT
CTNSWIYDIPELKYPRSNCGGVVDDEYIYCIGGIRDQDSSLISDIDRWKPSKPYWQTYAKMREPKCDMGVAMLNGLIYVI
GGVVKGGTCTDTLESLSEDGWMMHRRLPIKMSNMSTIVHAGKIYISGGYTNSSIVNEISNLVLSYNPIYDEWTKLSSLNI
PRINPALWSVHNKLYVGGISDDVQTNTSETYDKEKDCWTLDNGHVLPYNYIMYKCEPIKHKYPLEKIQYTNDFLKCLESF
IGS
>P24768 ~~~KBTB1~~~Kelch repeat and BTB domain-containing protein A55~~~
MNNSSELIAVINGFRNSGRFCDISIVINDERINAHKLILSGASEYFSILFSNNFIDSNEYEVNLSHLDYQSVNDLIDYIY
GIPLSLTNDNVKYILSTADFLQIGSAITECENYILKNLCSKNCIDFYIYADKYNNKKIESASFNTILQNILRLINDENFK
YLTEESMIKILSDDMLNIKNEDFAPLILIKWLESTQQSCTVELLRCLRISLLSPQVIKSLYSHQLVSSIYECITFLNNIA
FLDESFPRYHSIELISIGISNSHDKISINCYNHKKNTWEMISSRRYRCSFAVAVLDNIIYMMGGYDQSPYRSSKVIAYNT
CTNSWIYDIPELKYPRSNCGGLADDEYIYCIGGIRDQDSSLTSSIDKWKPSKPYWQKYAKMREPKCDMGVAMLNGLIYVM
GGIVKGDTCTDALESLSEDGWMKHQRLPIKMSNMSTIVHDGKIYISGGYNNSSVVNVISNLVLSYNPIYDEWTKLSSLNI
PRINPALWSAHNKLYVGGGISDDVRTNTSETYDKEKDCWTLDNGHVLPRNYIMYKCEPIKHKYPLEKTQYTNDFLKYLES
FIGS
>Q9JFS4 ~~~KBTB2~~~Kelch repeat and BTB domain-containing protein 2~~~
MDIDDIKHNRRVVSNISSLLDNDILCDVIITIGDGEEIKAHKTILAAGSKYFRTLFTTPMIIRDLVTRVNLQMFDKDAVK
NIVQYLYNRHISSMNVIDVLKCADYLLIDDLVTDCESYVKDYTNHDTCIYIYHRLYEMSHIPIVKYVKRMVMRNIPTLIT
TDAFKNAVFEILLDIISTNDGEYVYREGYKVTILLKWLDYNYITEEQLLCILSCIDIQNLDKKSRLLLYSNTTINMYSSC
VKFLLDNKQNRNIIPRQLCLVYHDTNYNISNPCILVYNINTMEYNTIYTIHNNIINYSSAVVDNEIIIAGGYNFNNISLN
KVYKINIEHRTCVELPPMIKNRCHFSLAVIDDMIYAIGGQNGTIVERSVECYTMGDDTWKMLPDMPDAISSYGMCVFDQY
IYIIGGRTEHVKYIPVQHMNEIVDINEHSSDKVIRYDTVNNIWEKLPNLCSGTIRPSVVSHKDDIYVVCDIKDDEINGFK
TCIFRYNTKDNYKGWELITTIDSKLTVLHTILHDDAITILHWYESCMIQDKFNIDTYKWTNICYQRSNSYIVHDTLPIY
>Q2HR82 2.3.2.-~~~K8~~~E3 SUMO-protein ligase K-bZIP~~~
MPRMKDIPTKSSPGTDNSEKDEAVIEEDLSLNGQPFFTDNTDGGENEVSWTSSLLSTYVGCQPPAIPVCETVIDLTAPSQ
SGAPGDEHLPCSLNAETKFHIPDPSWTLSHTPPRGPHISQQLPTRRSKRRLHRKFEEERLCTKAKQGAGRPVPASVVKVG
NITPHYGEELTRGDAVPAAPITPPYPRVQRPAQPTHVLFSPVFVSLKAEVCDQSHSPTRKQGRYGRVSSKAYTRQLQQAL
EEKDAQLCFLAARLEAHKEQIIFLRDMLMRMCQQPASPTDAPLPPC
>P00545 2.7.10.1~~~V-FMS~~~Tyrosine-protein kinase transforming protein fms~~~
RMPSGPGHYGASAETPGPRPPLCPASSCCLPTEAMGPRALLVLLMATAWHAQGVPVIQPSGPELVVEPGTTVTLRCVGNG
SVEWDGPISPHWNLDLDPPSSILTTNNATFQNTGTYHCTEPGNPRGGNATIHLYVKDPARPWKVLAQEVTVLEGQDALLP
CLLTDPALEAGVSLVRVRGRPVLRQTNYSFSPWHGFTIHKAKFIENHVYQCSARVDGRTVTSMGIWLKVQKDISGPATLT
LEPAELVRIQGEAAQIVCSASNIDVNFDVSLRHGDTKLTISQQSDFHDNRYQKVLTLNLDHVSFQDAGNYSCTATNAWGN
HSASMVFRVVESAYSNLTSEQSLLQEVTVGEKVDLQVKVEAYPGLESFNWTYLGPFSDYQDKLDFVTIKDTYRYTSTLSL
PRLKRSESGRYSFLARNAGGQNALTFELTLRYPPEVRVTMTLINGSDTLLCEASGYPQPSVTWVQCRSHTDRCDESAGLV
LEDSHSEVLSQVPFYEVIVHSLLAIGTLEHNRTYECRAFNSVGNSSQTFWPISIGAHTPLPDELLFTPVLLTCMSIMALL
LLLLLLLLYKYKQKPKYQVRWKIIESYEGNSYTFIDPTQLPYNEKWEFPRNNLQFGKTLGTGAFGKVVEATAFGLGKEDA
VLKVAVKMLKSTAHADEKEALMSELKIMSHLGQHENIVNLLGACTHGGPVLVITEYCCYGDLLNFLRRQAEAMLGPSLSV
GQDPEAGAGYKNIHLEKKYVRRDSGFSSQGVDTYVEMRPVSTSSSNDSFSEEDLGKEDGRPLELRDLLHFSSQVAQGMAF
LASKNCIHRDVAARNVLLTSGRVAKIGDFGLARDIMNDSNYIVKGNARLPVKWMAPESIFDCVYTVQSDVWSYGILLWEI
FSLGLNPYPGILVNSKFYKLVKDGYQMAQPAFAPKNIYSIMQACWALEPTRRPTFQQICSLLQKQAQEDRRVPNYTNLPS
SSSSRLLRPWQRTPPVAR
>P03046 ~~~kil~~~Protein kil~~~
MARNIKMATDAQNWLQARGSHVNESYLGVARPILEITYPPVELVKNAVRIMEHKSGVARSVWTARLNGCQIIWR
>P06855 2.7.1.78~~~pseT~~~Polynucleotide kinase~~~
MKKIILTIGCPGSGKSTWAREFIAKNPGFYNINRDDYRQSIMAHEERDEYKYTKKKEGIVTGMQFDTAKSILYGGDSVKG
VIISDTNLNPERRLAWETFAKEYGWKVEHKVFDVPWTELVKRNSKRGTKAVPIDVLRSMYKSMREYLGLPVYNGTPGKPK
AVIFDVDGTLAKMNGRGPYDLEKCDTDVINPMVVELSKMYALMGYQIVVVSGRESGTKEDPTKYYRMTRKWVEDIAGVPL
VMQCQREQGDTRKDDVVKEEIFWKHIAPHFDVKLAIDDRTQVVEMWRRIGVECWQVASGDF
>Q3KSQ2 2.7.1.21~~~TK~~~Thymidine kinase~~~
MAGFPGKEAGPPGGWRKCQEDESPENERHENFYAEIDDFAPSVLTPTGSDSGAGEEDDDGLYQVPTHWPPLMAPTGLSGE
RVPCRTQAAVTSNTGNSPGSRHTSCPFTLPRGAQPPAPAHQKPTAPTPKPRSRECGPSKTPDPFSWFRKTSCTEGGADST
SRSFMYQKGFEEGLAGLGLDDKSDCESEDESNFRRPSSHSALKQKNGGKGKPSGLFEHLAAHGREFSKLSKHAAQLKRLS
GSVMNVLNLDDAQDTRQAKAQRKESTRVPIVIHLTNHVPVIKPACSLFLEGAPGVGKTTMLNHLKAVFGDLTIVVPEPMR
YWTHVYENAIKAMHKNVTRARHGREDTSAEVLACQMKFTTPFRVLASRKRSLLVTESGARSVAPLDCWILHDRHLLSASV
VFPLMLLRSQLLSYSDFIQVLATFTADPGDTIVWMKLNVEENMRRLKKRGRKHESGLDAGYLKSVNDAYHAVYCAWLLTQ
YFAPEDIVKVCAGLTTITTVCHQSHTPIIRSGVAEKLYKNSIYSVLKEVIQPYRADAVLLEVCLAFTRTLAYLQFVLVDL
SEFQDDLPGCWTEIYMQALKNPAIRSQFFDWAGLSKVISDFERGNRD
>P24425 2.7.1.21~~~TK~~~Thymidine kinase~~~
MAACVPPGEAPRSASGTPTRRQVTIVRIYLDGVYGIGKSTTGRVMASAASGGSPTLYFPEPMAYWRTLFETDVISGIYDT
QNRKQQGNLAVDDAALITAHYQSRFTTPYLILHDHTCTLFGGNSLQRGTQPDLTLVFDRHPVASTVCFPAARYLLGDMSM
CALMAMVATLPREPQGGNIVVTTLNVEEHIRRLRTRARIGEQIDITLIATLRNVYFMLVNTCHFLRSGRVWRDGWGELPT
SCGAYKHRATQMDAFQERVSPELGDTLFALFKTQELLDDRGVILEVHAWALDALMLKLRNLNVFSADLSGTPRQCAAVVE
SLLPLMSSTLSDFDSASALERAARTFNAEMGV
>Q9QNF7 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPCHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRLEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWQVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWREDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMVQTHVTTPGSIPTICDLARTFAREMGEAN
>P0DTH5 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPCHQHASAFDQAARSRGHNNRRTALRPRRQQKATEVRLEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWRVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWREDWGQLSGAAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMVQTHVTTPGSIPTICDLARTFAREMGEAN
>P06478 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPCHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRLEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWQVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHVGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWWEDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMVQTHVTTPGSIPTICDLARTFAREMGEAN
>P08333 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPGHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRPEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWRVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITIGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWREDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMIQTHVTTPGSIPTICDLARTFAREMGEAN
>P17402 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPCHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRLEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWQVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWREDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMVQTHVTTPGSIPTICDLARMFAREMGEAN
>P06479 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASYPGHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRPEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVY
VPEPMTYWRVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLI
FDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYG
LLANTVRYLQGGGSWREDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLR
PMHVFILDYDQSPAGCRDALLQLTSGMIQTHVTTPGSIPTICDLARTFAREMGEAN
>P04407 2.7.1.21~~~TK~~~Thymidine kinase~~~
MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASEARGDPELPTLLRVYIDGPHGVGKTTTSAQLMEALGPRDNIV
YVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAVVMTSAQITMSTPYAATDAVLAPHIGGEAVGPQAPPPALTL
VFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGVLPEAEHADRLARRQRPGERLDLAMLSAIRRVY
DLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEDGAGSLPRIEDTLFALFRVPELLAPNGDLYHIFAWVLDVLADRL
LPMHLFVLDYDQSPVGCRDALLRLTAGMIPTRVTTAGSIAEIRDLARTFAREVGGV
>O57203 2.7.1.21~~~~~~Thymidine kinase~~~
MNGGHIQLIIGPMFSGKSTELIRRVRRYQIAQYKCVTIKYSNDNRYGTGLWTHDKNNFEALEATKLCDVLESITDFSVIG
IDEGQFFPDIVEFCERMANEGKIVIVAALDGTFQRKPFNNILNLIPLSEMVVKLTAVCMKCFKEASFSKRLGEETEIEII
GGNDMYQSVCRKCYVGS
>P68563 2.7.1.21~~~~~~Thymidine kinase~~~
MNGGHIQLIIGPMFSGKSTELIRRVRRYQIAQYKCVTIKYSNDNRYGTGLWTHDKNNFEALEATKLCDVLESITDFSVIG
IDEGQFFPDIVEFCERMANEGKIVIVAALDGTFQRKPFNNILNLIPLSEMVVKLTAVCMKCFKEASFSKRLGEETEIEII
GGNDMYQSVCRKCYIDS
>P0C0E6 2.7.1.21~~~TK~~~Thymidine kinase~~~
MSTDKTDVKMGVLRIYLDGAYGIGKTTAAEEFLHHFAITPNRILLIGEPLSYWRNLAGEDAICGIYGTQTRRLNGDVSPE
DAQRLTAHFQSLFCSPHAIMHAKISALMDTSTSDLVQVNKEPYKIMLSDRHPIASTICFPLSRYLVGDMSPAALPGLLFT
LPAEPPGTNLVVCTVSLPSHLSRVSKRARPGETVNLPFVMVLRNVYIMLINTIIFLKTNNWHAGWNTLSFCNDVFKQKLQ
KSECIKLREVPGIEDTLFAVLKLPELCGEFGNILPLWAWGMETLSNCLRSMSPFVLSLEQTPQHAAQELKTLLPQMTPAN
MSSGAWNILKELVNAVQDNTS
>Q90121 ~~~M2A~~~KP4 killer toxin~~~
MQIINVVYSFLFAAAMLPVVHSLGINCRGSSQCGLSGGNLMVRIRDQACGNQGQTWCPGERRAKVCGTGNSISAYVQSTN
NCISGTEACRHLTNLVNHGCRVCGSDPLYAGNDVSRGQLTVNYVNSC
>P16948 ~~~~~~KP6 killer toxin~~~
MLIFSVLMYLGLLLAGASALPNGLSPRNNAFCAGFGLSCKWECWCTAHGTGNELRYATAAGCGDHLSKSYYDARAGHCLF
SDDLRNQFYSHCSSLNNNMSCRSLSKRTIQDSATDTVDLGAELHRDDPPPTASDIGKRGKRPRPVMCQCVDTTNGGVRLD
AVTRAACSIDSFIDGYYTEKDGFCRAKYSWDLFTSGQFYQACLRYSHAGTNCQPDPQYE
>P13288 2.7.11.1~~~BGLF4~~~Serine/threonine-protein kinase BGLF4~~~
MDVNMAAELSPTNSSSSGELSVSPEPPRETQAFLGKVTVIDYFTFQHKHLKVTNIDDMTETLYVKLPENMTRCDHLPITC
EYLLGRGSYGAVYAHADNATVKLYDSVTELYHELMVCDMIQIGKATAEDGQDKALVDYLSACTSCHALFMPQFRCSLQDY
GHWHDGSIEPLVRGFQGLKDAVYFLNRHCGLFHSDISPSNILVDFTDTMWGMGRLVLTDYGTASLHDRNKMLDVRLKSSK
GRQLYRLYCQREPFSIAKDTYKPLCLLSKCYILRGAGHIPDPSACGPVGAQTALRLDLQSLGYSLLYGIMHLADSTHKIP
YPNPDMGFDRSDPLYFLQFAAPKVVLLEVLSQMWNLNLDMGLTSCGESPCVDVTAEHMSQFLQWCRSLKKRFKESYFFNC
RPRFEHPHLPGLVAELLADDFFGPDGRRG
>P0C731 2.7.11.1~~~BGLF4~~~Serine/threonine-protein kinase BGLF4~~~
MDVNMAAELSPTNSSSSGELSVSPEPPRETQAFLGKVTVIDYFTFQHKHLKVTNIDDMTETLYVKLPENMTRCDHLPITC
EYLLGRGSYGAVYAHADNATVKLYDSVTELYHELMVCDMIQIGKATAEDGQDKALVDYLSACTSCHALFMPQFRCSLQDY
GHWHDGSIEPLVRGFQGLKDAVYFLNRHCGLFHSDISPSNILVDFTDTMWGMGRLVLTDYGTASLHDRNKMLDVRLKSSK
GRQLYRLYCQREPFSIAKDTYKPLCLLSKCYILRGAGHIPDPSACGPVGAQTALRLDLQSLGYSLLYGIMHLADSTHKIP
YPNPDMGFDRSDPLYFLQFAAPKVVLLEVLSQMWNLNLDMGLTSCGESPCVDVTAEHMSQFLQWCRSLKKRFKESYFFNC
RPRFEHPHLPGLVAELLADDFFGPDGRRG
>P24362 ~~~~~~Pseudokinase OPG198~~~
MESFKYCFDNDGKKWIIGNTLYSGNSILYKVRKNFTSSFYNYVMKIDHKSHKPLLSEIRFYISVLDPLTIDNWTRERGIK
YLAIPDLYGIGETDDYMFFVIKNLGRVFAPKDTESVFEACVTMINTLEFIHSQGFTHGKIEPRNILIRNKRLSLIDYSRT
NKLYKSGNSHIDYNEDMITSGNINYMCVDNHLGATVSRRGDLEMLGYCMIEWFGGKLPWKNESSIKVIKQKKEYKKFIAT
FFEDCFPEGNEPLELVRYIELVYTLDYSQTPNYDRLRKLFIQD
>A0A7H0DNE5 2.7.4.9~~~~~~Thymidylate kinase~~~
MSRGALIVFEGLDKSGKTTQCMNIMESIPANTIKYLNFPQRSTVTGKMIDDYLTRKKTYNDHIVNLLFCANRWEFASFIQ
EQLEQGITLIVDRYAFSGVAYATAKGASMTLSKSYESGLPKPDLVIFLESGSKEINRNIGEEIYEDVEFQQKVLQEYKKM
IEEGDIHWQIISSEFEEDVKKELIKNIVIEAIHTVTGPVGQLWM
>P68693 2.7.4.9~~~~~~Thymidylate kinase~~~
MSRGALIVFEGLDKSGKTTQCMNIMESIPANTIKYLNFPQRSTVTGKMIDDYLTRKKTYNDHIVNLLFCANRWEFASFIQ
EQLEQGITLIVDRYAFSGVAYAAAKGASMTLSKSYESGLPKPDLVIFLESGSKEINRNVGEEIYEDVTFQQKVLQEYKKM
IEEGDIHWQIISSEFEEDVKKELIKNIVIEAIHTVTGPVGQLWM
>Q80HT9 2.7.4.9~~~~~~Thymidylate kinase~~~
MSRGALIVFEGLDKSGKTTQCMNIMESIPANTIKYLNFPQRSTVTGKMIDDYLTRKKTYNDHIVNLLFCANRWEFASFIQ
EQLEQGITLIVDRYAFSGVAYAAAKGASMTLSKSYESGLPKPDLVIFLESGSKEINRNVGEEIYEDVTFQQKVLQEYKKM
IEEGDIHWQIISSEFEEDVKKELIKNIVIEAIHTVTGPVGQLWM
>P0DSV5 2.7.4.9~~~~~~Thymidylate kinase~~~
MSRGALIVFEGLDKSGKTTQCMNIMESIPTNTIKYLNFPQRSTVTGKMIDDYLTRKKTYNDHIVNLLFCANRWEFASFIQ
EQLEQGITLIVDRYAFSGVAYATAKGASMTLSKSYESGLPKPDLVIFLESGSKEINRNVGEEIYEDVAFQQKVLQEYKKM
IEEGEDIHWQIISSEFEEDVKKELIKNIVIEAIHTVTGPVGQLWM
>P0DSV6 2.7.4.9~~~~~~Thymidylate kinase~~~
MSRGALIVFEGLDKSGKTTQCMNIMESIPTNTIKYLNFPQRSTVTGKMIDDYLTRKKTYNDHIVNLLFCANRWEFASFIQ
EQLEQGITLIVDRYAFSGVAYATAKGASMTLSKSYESGLPKPDLVIFLESGSKEINRNVGEEIYEDVAFQQKVLQEYKKM
IEEGEDIHWQIISSEFEEDVKKELIKNIVIEAIHTVTGPVGQLWM
>Q856K7 ~~~~~~Protein Ku~~~
MRSVGNVDLTIGLVTVPVKMVGVSESHDRKASMYHPHEDGNFGKIKMPKLCEDCGEVVPTADIAKGFEEGGDIVILTADE
LASIAAATGAALEVPQFVKAEQINPMLFANENIYRLVPDPKRGRQAATTYLMVRHILVSQELVGVVQYTRWGRNRLGVLD
VEPSDDGGVLVIRNMMWADELRSTEGIVPTNVTEDDIDPRLLPVMASVVESMTGDWDPTAYTDRYTEQLSEAITAKAQGD
EIATVASESGKAIDDVSDLLAKLEASIQKKAPAKKATARRKKTA
>Q853W0 ~~~~~~Protein Ku~~~
MRAVWTGAVNFGLVNVPVKMYAATEEHDLKGHLAHVQDGGRIRYHKVCETCGEQVHTADLGKVFEVDGQTALLTDEDLAE
LPSENNKVIDVVEFVPAGEVDPILLDKPYYLNAEGSVRPYALLARTLSDADKVAIVRVTLRSKEHLAVLRVTGKNEVLTL
QTLRWPDEVREPDFPKLDNKPELSEAELKVAAMLVDELSAPFNPDKHQDTYKVELRALVESKLEPVEVPEDVSGLLAKLE
ASVKPKQAKPDIRTWAKAQGFKISARGRIPKDIVDKYEGAMA
>P35986 ~~~PX~~~Late L2 mu core protein~~~
MALTCRMRIPIPGYRGRPRRRKGLTGNGRFRRRSMRRRMKGGVLPFLIPLIAAAIGAVPGIASVALQASRKN
>P15911 ~~~~~~Protein FPV129~~~
MHTFLTARLQAIEDVSNRNLSMLELILTRAIVTHWIILDLVLNLIFDSLITSFVIIYSLYSFVARNNKVLLFLLMSYAIF
RFIVMYLLYIVSESID
>A0A097SRV6 ~~~~~~Uncharacterized protein L57L~~~
MDDTLPKQMTPTDTSPLKEEQAHCNNKTLENQPKNINDNKCTDSQNTDLQNTEPSKV
>P15914 ~~~~~~L5 homolog~~~
MDRNINFSPVFIEPRFKHEFLLSPQRYFYILVFEVIVALIILNFFFKEEILYTFFPLAKPSKNSINSLLDRTMLKCEEDG
SLMISRPSGIYSALSLDGSPVRISDCSLLLSSINGASSSTSPYSIFNRR
>Q65132 ~~~~~~Protein L83L~~~
MIMDTSLKNNDGALEADNKNYQDYKAEPDKTSDVLDVTKYNSVVDCCHKNYSTFTSEWYINERKYNDVPEGPKKAVVHRC
TIL
>Q9QR71 ~~~LANA1~~~Protein LANA1~~~
MAPPGMRLRSGRSTGAPLTRGSCRKRNRSPERCDLGDDLHLQPRRKHVADSVDGRECGPHTLPIPGSPTVFTSGLPAFVS
SPTLPVAPIPSPAPATPLPPPALLPPVTTSSSPIPPSHPVSPGTTDTHSPSPALPPTQSPESSQRPPLSSPTGRPDSSTP
MRPPPSQQTTPPHSPTTPPPEPPSKSSPDSLAPSTLRSLRKRRLSSPQGPSTLNPICQSPPVSPPRCDFANRSVYPPWAT
ESPIYVGSSSDGDTPPRQPPTSPISIGSSSPSEGSWGDDTAMLVLLAEIAEEASKNEKECSENNQAGEDNGDNEISKESQ
VDKDDNDNKDDEEEQETDEEDEEDDEEDDEEDDEEDDEEDDEEDDEEDDEEEDEEEDEEEDEEEDEEEEEDEEDDDDEDN
EDEEDDEEEDKKEDEEDGGDGNKTLSIQSSQQQQEPQQQEPQQQEPQQQEPQQQEPQQQEPQQQEPQQQEPQQREPQQRE
PQQREPQQREPQQREPQQREPQQREPQQREPQQREPQQREPQQREPQQREPQQQEPQQQEPQQQEPQQQEPQQQEPQQQE
PQQQEPQQQEPQQQEPQQQEPQQQEPQQQEPQQQDEQQQDEQQQDEQQQDEQQQDEQQQDEQQQDEQQQDEQEQQDEQQQ
DEQQQQDEQEQQEEQEQQEEQQQDEQQQDEQQQDEQQQDEQEQQDEQQQDEQQQQDEQEQQEEQEQQEEQEQQEEQEQQE
EQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQEQELEEQE
QELEEQEQELEEQEQELEEQEQEQELEEVEEQEQEQEEQELEEVEEQEQEQEEQEEQELEEVEEQEEQELEEVEEQEEQE
LEEVEEQEQQGVEQQEQETVEEPIILHGSSSEDEMEVDYPVVSTHEQIASSPPGDNTPDDDPQPGPSREYRYVLRTSPPH
RPGVRMRRVPVTHPKKPHPRYQQPPVPYRQIDDCPAKARPQHIFYRRFLGKDGRRDPKCQWKFAVIFWGNDPYGLKKLSQ
AFQFGGVKAGPVSCLPHPGPDQSPITYCVYVYCQNKDTSKKVQMARLAWEASHPLAGNLQSSIVKFKKPLPLTQPGENQG
PGDSPQEMT
>P03263 ~~~~~~I-leader protein~~~
MRADREELDLPPPVGGVAVDVVKVEVPATGRTLVLAFVKTCAVLAAVHGLYILHEVDLTTAHKEAEWEFEPLAWRVWLVV
FYFGCLSLTVWLLEGSYGGSDHHAARAQSPDVRARRSELDDNIAQMGAVHGLELPRRQVLRHRGT
>Q2KS19 ~~~~~~I-leader protein~~~
MRADREELDLPPPIGGVAIDVVKVEVPATGRTLVLAFVKTCAVLAAVHGLYILHEVDLTTAHKEAEWEFEPLAWRVWLVV
FYFGCLSLTVWLLEGSYGGSDHHAARAQSPDVRARRSELDDNIAQMGAVHGLELPRRQVLRRRGT
>P21288 ~~~~~~Late expression factor 11~~~
MPPKNCTHLGGCDSDCLTRSEIQALFREAINTLKHTMNTENVCAHMLDIVSFERIKEYIRANLGHFTVITDKCSKRKVCL
HHKRIARLLGIKKIYHQEYKRVVSKVYKKQTW
>P41418 ~~~LEF-2~~~Primase-associated factor LEF-2~~~
MANASYNVWSPLIRASCLDKKATYLIDPDDFIDKLTLTPYTVFYNGGVLVKISGLRLYMLLTAPPTINEIKNSNFKKRSK
RNICMKECVEGKKNVVDMLNNKINMPPCIKKILNDLKENNVPRGGMYRKRFILNCYIANVVSCAKCENRCLIKALTHFYN
HDSKCVGEVMHLLIKSQDVYKPPNCQKMKTVDKLCPFAGNCKGLNPICNY
>P41453 ~~~LEF-3~~~Late expression factor 3~~~
MATKRSLSGESSGEPLIKRMAMASSPKKIRENYKRISGKLMSKMTLSIDNEYHYTFRIMSDNKIQEYYGDSQSFKDMEEG
KCYDISLNYVKTKFSQMIQINEYKECEMEIETATPMSDYLTNKHFENEDGVNIIVKYKFIYKKINSGLYKVVFEVVYKNL
NDDPDVVQVECSVNAKTLINLFKNNIKGSDDINEVFKYLKDNENQIFTIYSIKCQQIFNGSNVYMNWNVVNSTRIELCEA
KESEAYSNLQNCTNAKINISRSNKHVASYNVNVLKSELEENDMGDNKFIVQFKSDELNIADSDDCSTSSDLGKWNKSVFY
VNTNKKTEADSLQKLCADFNQISMLLEDNLIKVTIYVTVENGENHNMNVLGLLKYDEDENEYKFL
>P41477 2.7.7.50~~~LEF-4~~~DNA-dependent RNA polymerase subunit LEF-4~~~
MDYGDFVIEKEISYSINFSQDLLYKILNSYIVPNYSLAQQYFDLYDENGFRTRIPIQSACNNIISSVKKTNSKHKKFVYW
PKDTNALVPLVWRESKEIKLPYKTLSHNLSKIIKVYVYQHDKIEIKFEHVYFSKSDIDLFDSTMANKISKLLTLLENGDA
SETLQNSQVGSDEILARIRLEYEFDDDAPDDAQLNVMCNIIADMEALTDAQNISPFVPLTTLIDKMAPRKFEREQKIVYG
DDAFDNASVKKWALKLDGMRGRGLFMRNFCIIQTDDMQFYKTKMANLFALNNIVAFQCEVMDKQKIYITDLLQVFKYKYN
NRTQYECGVNASYAIDPVTAIECINYMNNNVQSVTLTDTCPAIELRFQQFFDPPLQQSNYMTVSVDGYVVLDTELRYVKY
KWMPTTELEYDAVNKSFNTLNGPLNGLMILTDLPELLHENIYECVITDTTINVLKHRRDRIVPN
>P41658 ~~~LEF-5~~~Late expression factor 5~~~
MSFDDGVVKAQTDPFALKRGGHNVQKWTSYALFKLFKEFRINKNYSKLIDFLTENFPNNVKNKTFNFSSTGHLFHSLHAY
VPSVSDLVKERKQIRLQTEYLAKLFNNTINDFKLYTELYEFIERTEGVDCCCPCQLLHKSLLNTKNYVENLNCKLFDIKP
PKFKKEPFDNILYKYSLNYKSLLLKKKEKHTSTGCTRKKKIKHRQILNDKVIYLQNSNKNKLFELSGLSLKSCRHDFVTV
ESQTRAGDEIASFIRYCRLCGMSGC
>P41677 ~~~LEF-7~~~Late expression factor 7~~~
MSSVTKRPRAKRIRLPLEIIDTILQYLDPILHAKVVGLTTRVKCRLLRDNNVEDYLKLTPASYHPTTDQFICNYLGITNQ
PMAPYLVPLLSFGKASCVFFNKCIPEDVRIVTLNWPLPLLENFLSKQFLWYKLARKLIEHERRMDRCVTPSTVQINLYDD
NEDYLNCFNCFNCCKDNLVSFNCCIVDCNINDMNRCPDLQIDVYLDDNIISLYLYFLFRIYRIVKE
>P41452 2.7.7.6~~~LEF-8~~~Probable DNA-directed RNA polymerase catalytic subunit~~~
MTDVVQDFNELYDRIENKYKLKYTFDCATNSNERMLFCAIQERKSYLCCALNEQLSCVMHKCVPVIFGTRLDKQFRETDD
VDANNNINGTFMLDGRFLSFPNIMMNNNVLVHNFYDKLYAKHCKRMFLYGNVDQEKHINRAIQLVYDKQDDVLFARDVYA
SDYVVTEDLNSVLETYLANSGKWKPLDFLFEYNTLHKQQLVEHIKIIMNHDINYSIDSLANKIVYKHAYLIELLLTSTIL
QNYQRVLDKTADDDAYTVANKRRKIQSVLYNKESKKIVDCIVNGRLIYCVSKTFSKQRKVFPNQQDNSSNNNIEISLPVL
KYRVGNEVARITNDSMRQKMLKQKKDFVKFIGSFFHGEMTVAGKKFFLCRNACLPNVDYEMVAQKFQYLLKHNLVAFVDD
LNDVQDDSLLIAFNDRPTNLKCLKSNTSFIVYTMKRNMAPIELKITDRILYVNHHEGMICIKKKLRVNNEADINVLLTPY
EYHYKHSIYYNPIIQCTIVENDDVKSLMSKLEQYYYRNFIHLFHTTPVPKLIVSLTNLKNAMPVFEYKEENCVSGLPNGY
SVAVNKSILLNNKMFKLWTLVRDNKLMTAEDPYIPHIALPICLYNNKVNKLKGKLVVGPKQSCLVKFTNSSDKNYVALDD
GLVLYMAGVLVSNAKINWVYDGRRYKIETCTNGNFNVYKVYVYFRQIKNQKIEKLDASMVVNGDNVMLKIVIVTSTNDLE
GIKICGIHGQKGVFNRSEDLTEWMAEDGTHAQICLSPVSFLSRQSNFDKIERKYVVRGGNHDDPHAKRYPIFNIPYMLFN
NTPDNIFKEFIKTSHTGHEKVEGTRFDQWTKNQSFVGNRMSESLHWMRGGSNLPQNCGEFNVVSSLLMCNNTIMKN
>P41465 ~~~LEF-9~~~DNA-directed RNA polymerase subunit LEF-9~~~
MVRKIYFLTLLYFALCVNTRSCTDVIMFSFLDKTPTEFDLILDPHKLQNVAFFTNEEFKVILKNFITDLKKNQKLNYFNS
LIDQLINVYTDASVKNTQPDVLAKIIKSTCVIVTDLPSNVFLKKLKTNKFTDTINYLILPHFILWDHNFVIFLNKAFNSK
HENDLVDISGALQKIKLTHGVIKDQLQSKNGYAVQYLYATFLNTASFYANVQCLNGVNEIMPPRSSVKRYYGRDVDNVRA
WTTRHPNISQLSTQVSDVHINESSTDWNVKVGLGIFPGANTDCDGDKKIITFLPKPNSLIDSECLLYGDPRFNFICFDKN
RLSFVSQQIYYLYKNIDAMEALFKSTPLVYALWQKHKHEQFAQRLEMLLRDFCLIASSNASYLLFKQLTQLIANEEMVCG
DEEIFNLGGQFVDMIKSGAKGSQNLIKSTQQYRQTLNTDIETVSSRATTSLNSYISSHNKVKVCGADIYHNTVVLQSVFI
KNNYVCYKNDERTIMNICALPSEFLFPEHLLDMFIE
>P0C725 ~~~LF2~~~Protein LF2~~~
MAEAYPGGAHAALASRRSSFRNSLRRLRPTEKPDTSFMRGVWKYEIFPSYVRVTNKQVLQLDAQCQELPPCPSVGQILSF
KLPSFSFNTTTYGSRYFTVAFLFFGAEDNEVFLKPFFVMHSDQDIVLSVLNPRSLFIEKGKFTWYIVPIRLVKNPYLYLQ
ILPGQSDIQLTRSCTQSGDKLNTSEPQIFLSGSPVTSQDECLPYLLAQHTPPFLKSYARIHTFPGKVCPVNAIRRGKGYV
RVSVDTPDLKREGPLNVKVGMTLLDDVIIAFRYNPYPKSHWRWDGESTDIRYFGSPVIIPPNFITELEYNNTYEAPLSSK
ITAVVVSHSSNPVFYVYPQEWKPGQTLKLTVRNISNNPITIVTGQSMAQAFFIYAGDPSISTIMRRYIQRQGCALTLPGN
IVVESSSLPTFERINKTFNGNIVASEGTL
>Q3KSP5 ~~~LF2~~~Protein LF2~~~
MAEAYPGGAHAALASRRSSFRNSLRRLRPTEKPDTSFMRGVWKYEIFPSYVRVTNKQVLQLDAQCQELPPCPSVGQILSF
KLPSFSFNTTTYGSRYFTVAFLFFGAEDNEVFLKPFFVMHSAPGHRAKRPESAQPVHREREVYLVHCAHQIGQEPLPLPA
DSARAERHSADTLLHPERGQAKHQRAPDLPQWLSRHQPGRVPALPAGAAHAPISKVIRPYSHIPGEGMSGQRHTPR
>Q83924 ~~~~~~Hexon-interlacing protein LH3~~~
MTSAEFSVEFPYVSKPIVPWPASASSNNSSSQNIDFPVLKPDQDPIAFFQTNNTAYLQPGATYYWKCIELSKPIHIYGQG
ATVQLVGPGPVFVFNSESVIPEDFYVVFENINFIEDEFPIRSGQLSLGLTTHSAVWFINVWKTSIVNCNFKNFRGAALWY
SDNRNFWNARKWNQQHLVSNCRFNGCRIGISNTGSSEYSIASQNQFYDCQICFNVTGGNWSRNNNVIVNCRCAYLHVGDN
MWYEGHSENNNPAKGTFCNNIINHADNGGNVWPTQFKLTDGSTIQLASFYFDDNQEIPPCYSGNFHWFGDVNIVNFSTTK
IDKWCITGCNFYGNTHAANDAGQVQVAEAVKDKVFIIGCSGNNVTMKNIVEGNMTPKIGTIK
>A9CB85 ~~~~~~Hexon-interlacing protein LH3~~~
MTSVEDLYVVKPINQWPAPSAFLYPLPRGTLLPGQSPDEAFARNSVVFLVPGAEYNWKNVVIRKPVWIYGERCHGEDFRP
RAIIHIMGDLDNPMDVRIQDLTFIGGDSPDRLVPFSAVLTNQMALWCIDPRITIRGCSFYNFGGAAIYLERSERDTGFRF
GRGQVMITDCRFRGCRIGIANGGSVEYGLASQNNFSDCQICFNVVGGNWTRSGNVASNCRCMYLHTQGMWYEGAAGNFNP
AHGSFTSNTLNHCDYGGNLWPTEFQLPDRVINLAGFYFDNAAARLPNFSGNSQWYGDMKLINFLPDSTFVINGGALYGGP
GDTGVIAVATALAAKVFVIGCQGNAGQQIVNVPAANIIPEVGTRKDDATQPAA
>P0C6L5 ~~~~~~Large delta antigen~~~
MSRSERRKDRGGREDILEQWVSGRKKLEELERDLRKLKKKIKKLEEDNPWLGNIKGIIGKKDKDGEGAPPAKKLRMDQME
IDAGPRKRPLRGGFTDKERQDHRRRKALENKRKQLSSGGKSLSREEEEELKRLTEEDEKRERRIAGPSVGGVNPLEGGSR
GAPGGGFVPSMQGVPESPFARTGEGLDIRGSQGFPWDILFPADPPFSPQSCRPQ
>P29996 ~~~~~~Large delta antigen~~~
MSRPEGRKNRGGREEVLEQWVSGRKKLEELERDLRKVKKKIKKLEDEHPWLGNIKGILGKKDKDGEGAPPAKRARTDQME
VDSGPRKRPSRGGFTDKERQDHRRRKALENKRKQLSAGGKNLSKEEEEELRRLTEEDERRERRIAGPQVGGVNPLEGGTR
GAPGGGFVPSMQGVPESPFTRTGEGLDIRGSQGFPWDILFPADPPSSPQSCRPQ
>P0C6L6 ~~~~~~Large delta antigen~~~
MSRSESRKNRGGREEILEQWVAGRKKLEELERDLRKTKKKLKKIEDENPWLGNIKGILGKKDKDGEGAPPAKRARTDQME
VDSGPRKRPLRGGFTDKERQDHRRRKALENKKKQLSAGGKNLSKEEEEELRRLTEEDERRERRVAGPPVGGVIPLEGGSR
GAPGGGFVPSLQGVPESPFSRTGEGLDIRGNRGFPWDILFPADPPFSPQSCRPQ
>P0C6M3 ~~~~~~Large delta antigen~~~
MSQTVARLTSKEREEILEQWVEERKNRRKLEKDLRRANKKIKKLEDENPWLGNVVGLLRRKKDEDGAPPAKRPRQETMEV
DSGPGRKPKARGFTDQERRDHRRRKALENKKKQLAGGGKHLSQEEEEELRRLARDDDERERRTAGPRPGGVNPMDGPPRG
APGGGFVPSLQGVPESPFSRTGEGIDIRGTQQFPWYGFTPPPPGYYWVPGCTQQ
>P17309 ~~~rIII~~~Lysis inhibition accessory protein~~~
MIKQLQHALELQRNAWNNGHENYGASIDVEAEALEILRYFKHLNPAQTALAAELQEKDELKYAKPLASAARKAVRHFVVT
LK
>Q38162 ~~~llp~~~Lytic conversion lipoprotein~~~
MKKLFLAMAVVLLSACSTFGPKDIKCEAYYMQDHVKYKANVFDRKGDMFLVSPIMAYGSFWAPVSYFTEGNTCEGVF
>P15024 3.6.4.13~~~L3~~~Inner capsid protein lambda-1~~~
MKRIPRKTKGKSSGKGNDSTERADDGSSQLRDKQNNKAGPATTEPGTSNREQYKARPGIASVQRATESAEMPMKNNDEGT
PDKKGNTKGDLVNEHSEAKDEADEATKKQAKDTDKSKAQVTYSDTGINNANELSRSGNVDNEGGSNQKPMSTRIAEATSA
IVSKHPARVGLPPTASSGHGYQCHVCSAVLFSPLDLDAHVASHGLHGNMTLTSSDIQRHITEFISSWQNHPIVQVSADVE
NKKTAQLLHADTPRLVTWDAGLCTSFKIVPIVPAQVPQDVLAYTFFTSSYAIQSPFPEAAVSRIVVHTRWASNVDFDRDS
SVIMAPPTENNIHLFKQLLNTETLSVRGANPLMFRANVLHMLLEFVLDNLYLNRHTGFSQDHTPFTEGANLRSLPGPDAE
KWYSIMYPTRMGTPNVSKICNFVASCVRNRVGRFDRAQMMNGAMSEWVDVFETSDALTVSIRGRWMARLARMNINPTEIE
WALTECAQGYVTVTSPYAPIVNRLMPYRISNAERQISQIIRIMNIGNNATVIQPVLQDISVLLQRISPLQIDPTIISNTM
STVSESTTQTLSPASSILGKLRPSNSDFSSFRVALAGWLYNGVVTTVIDDSSYPKDGGSVTSLENLWDFFILALALPLTT
DPCAPVKAFMTLANMMVGFETIPMDNQIYTQSRRASAFSTPHTWPRCFMNIQLISPIDAPILRQWAEIIHRYWPNPSQIR
YGAPNVFGSANLFTPPEVLLLPIDHQPANVTTPTLDFTNELTNWRARVCELMKNLVDNQRYQPGWTQSLVSSMRGTLDKL
KLIKSMTPMYLQQLAPVELAVIAPMLPFPPFQVPYVRLDRDRVPTMVGVTRQSRDTITQPALSLSTTNTTVGVPLALDAR
AITVALLSGKYPPDLVTNVWYADAIYPMYADTEVFSNLQRDMITCEAVQTLVTLVAQISETQYPVDRYLDWIPSLRASAA
TAATFAEWVNTSMKTAFDLSDMLLEPLLSGDPRMTQLAIQYQQYNGRTFNIIPEMPGSVIADCVQLTAEVFNHEYNLFGI
ARGDIIIGRVQSTHLWSPLAPPPDLVFDRDTPGVHIFGRDCRISFGMNGAAPMIRDETGLMVPFEGNWIFPLALWQMNTR
YFNQQFDAWIKTGELRIRIEMGAYPYMLHYYDPRQYANAWNLTSAWLEEITPTSIPSVPFMVPISSDHDISSAPAVQYII
STEYNDRSLFCTNSSSPQTIAGPDKHIPVERYNILTNPDAPPTQIQLPEVVDLYNVVTRYAYETPPITAVVMGVP
>Q9WAB1 3.6.4.13~~~~~~Inner capsid protein lambda-1~~~
MKRIPRKTRGKSSGKGNDSTERADDGSAQLRDKQSSKVTQNVKEPGTTLKEQYKTRPSLQTVQKATENAELPMQTNDEGA
VDKKGNTKGDKTNEHVEAEVNAADATKRQAKDTDKQKAQVTYNDTGINNANELSRSGNVDNEGGDNQKPMTTRIAEATSA
IISKHPARVGLPPTASSGHGYQCHVCSAVLFSPLDLDAHVASHGLHGNMTLTSSEIQRHITEFISSWQNHPIVQVSADVE
NKKTAQLLHADTPRLVTWDAGLCTSFKIVPIVPAQVPQDVLAYTFFTSSYAIQSPFPEAAVSRIVVHTRWASNVDFDRDS
SVIMAPPTENNIHLFKQLLNNETLSVRGANPLMFRANVLHMLLEFVLDNLYINKHTGFSQDHTPFTEGANLRSLPGPDAE
KWYAIMYPTRMGTPNVSKICNFVASCVRNRVGRFDRAQMMNGAMSEWVDVFETSDALTVSIRGRWMARLARMNINPTEIE
WALTECAHGYVTVTSPYAPSVNRLMPYRVSNAERQISQIIRIMNIGNNATVIQPVLQDISVLLQRISPLQIDPTIISNTM
STVSESTTQTLSPASSILGKLRPSNSDFSSFRVALAGWLYNGVVTTVIDDSSYPKDGGSVTSLENLWDFFILALALPLTT
DPCAPVKAFMTLANMMVGFETIPMDNQIYTQSRRASAFSTPHTWPRCFMNIQLISPIDAPILRQWAEIIHRYWPNPSQIR
FGAPNVFGSANLFTPPEVLLLPIDHQPANVTTPTLDFTNELTNWRARVCELMKNLVDNQRYQPGWTQSLVSSMRGTLDKL
KLIKSMTPMYLQQLAPVELAVIAPMLPFPPFQVPYVRLDRDRVPTMVGVTRQSRDTITQPALSLSTTNTTVGVPLALDAR
AITVALLSGKYPSDLVTNVWYADAIYPMYADTEVFSNLQRDMITCEAVQTLITLVAQISETQYPVDRYLDWIPSLRASAA
TAATFAEWVNTSMKTAFDLSDMLLEPLLSGDPRMSQLAIQYQQYNGRTFNVIPEMPGSVVTDCVQLTAEVFNHEYNLFGI
ARGDIIIGRVQSTHLWSPLAPPPDLVFDRDTPGVHVFGRDCRISFGMNGAAPMIRDETGMMVPFEGNWIFPLALWQMNTR
YFNQQFDAWIKTGELRIRIEMGAYPYMLHYYDPRQYANAWNLTSAWLEEISPTSIPSVPFMVPISSDHDISSAPAVQYII
STEYNDRSLFCTNSSSPQTIAGPDKHIPVERYNILTNPDAPPTQIQLPEVVDLYNVVTRYAYETPPITAVVMGVP
>Q9WAB2 3.6.4.13~~~L3~~~Inner capsid protein lambda-1~~~
MKRIPRKTKGKSSGKGNDSTERSDDGSSQLRDKQNNKAGPATTEPGTSNREQYRARPGIASVQRATESAELPMKNNDEGT
PDKKGNTRGDLVNEHSEAKDEADEATQKQAKDTDKSKAQVTYSDTGINNANELSRSGNVDNEGGSNQKPMSTRIAEATSA
IVSKHPARVGLPPTASSGHGYQCHVCSAVLFSPLDLDAHVASHGLHGNMTLTSSEIQRHITEFISSWQNHPIVQVSADVE
NKKTAQLLHADTPRLVTWDAGLCTSFKIVPIVPAQVPQDVLAYTFFTSSYAIQSPFPEAAVSRIVVHTRWASNVDFDRDS
SVIMAPPTENNIHLFKQLLNTETLSVRGANPLMFRANVLHMLLEFVLDNLYLNRHTGFSQDHTPFTEGANLRSLPGPDAE
KWYSIMYPTRMGTPNVSKICNFVASCVRNRVGRFDRAQMMNGAMSEWVDVFETSDALTVSIRGRWMARLARMNINPTEIE
WALTECAQGYVTVTSPYAPSVNRLMPYRISNAERQISQIIRIMNIGNNATVIQPVLQDISVLLQRISPLQIDPTIISNTM
STVSESTTQTLSPASSILGKLRPSNSDFSSFRVALAGWLYNGVVTTVIDDSSYPKDGGSVTSLENLWDFFILALALPLTT
DPCAPVKAFMTLANMMVGFETIPMDNQIYTQSRRASAFSTPHTWPRCFMNIQLISPIDAPILRQWAEIIHRYWPNPSQIR
YGAPNVFGSANLFTPPEVLLLPIDHQPANVTTPTLDFTNELTNWRARVCELMKNLVDNQRYQPGWTQSLVSSMRGTLDKL
KLIKSMTPMYLQQLAPVELAVIAPMLPFPPFQVPYVRLDRDRVPTMVGVTRQSRDTITQPALSLSTTNTTVGVPLALDAR
AITVALLSGKYPPDLVTNVWYADAIYPMYADTEVFSNLQRDMITCEAVQTLVTLVAQISETQYPVDRYLDWIPSLRASAA
TAATFAEWVNTSMKTAFDLSDMLLEPLLSGDPRMTQLAIQYQQYNGRTFNVIPEMPGSVIADCVQLTAEVFNHEYNLFGI
ARGDIIIGRVQSTHLWSPLAPPPDLVFDRDTPGVHIFGRDCRISFGMNGAAPMIRDETGMMVPFEGNWIFPLALWQMNTR
YFNQQFDAWIKTGELRIRIEMGAYPYMLHYYDPRQYANAWNLTSAWLEEITPTSIPSVPFMVPISSDHDISSAPAVQYII
STEYNDRSLFCTNSSSPQTIAGPDKHIPVERYNILTNPDAPPTQIQLPEVVDLYNVVTRYAYETPPITAVVMGVP
>P11079 ~~~L2~~~Outer capsid protein lambda-2~~~
MANVWGVRLADSLSSPTIETRTRQYTLHDLCSDLDANPGREPWKPLRNQRTNNIVAVQLFRPLQGLVLDTQLYGFPGAFD
DWERFMREKLRVLKYEVLRIYPISNYSNEHVNVFVANALVGAFLSNQAFYDLLPLLIINDTMIGDLLGTGASLSQFFQSH
GDVLEVAAGRKYLQMENYSNDDDDPPLFAKDLSDYAKAFYSDTYEVLDRFFWTHDSSAGVLVHYDKPTNGHHYLLGTLTQ
MVSAPPYIINATDAMLLESCLEQFSANVRARPAQPVTRLDQCYHLRWGAQYVGEDSLTYRLGVLSLLATNGYQLARPIPR
QLTNRWLSSFVSQIMSDGVNETPLWPQERYVQIAYDSPSVVDGATQYGYVRKNQLRLGMRISALQSLSDTPSPVQWLPQY
TIDQAAMDEGDLMVSRLTQLPLRPDYGNIWVGDALSYYVDYNRSHRVVLSSELPQLPDTYFDGDEQYGRSLFSLARKIGD
RSLVKDTAVLKHAYQAIDPNTGKEYLRSRQSVAYFGASAGHSGADQPLVIEPWIQGKISGVPPPSSVRQFGYDVARGAIV
DLARPFPSGDYQFVYSDVDQVVDGHDDLSISSGLVESLLSSCMHATAPGGSFVVKINFPTRPVWHYIEQKILPNITSYML
IKPFVTNNVELFFVAFGVHQHSSLTWTSGVYFFLVDHFYRYETLSTISRQLPSFGYVDDGSSVTGIETISIENPGFSNMT
QAARIGISGLCANVGNARKSIAIYESHGARVLTITSRRSPASARRKSRLRYLPLIDPRSLEVQARTILPADPVLFENVSG
ASPHVCLTMMYNFEVSSAVYDGDVVLDLGTGPEAKILELIPATSPVTCVDIRPTAQPSGCWNVRTTFLELDYLSDGWITG
VRGDIVTCMLSLGAAAAGKSMTFDAAFQQLIKVLSKSTANVVLVQVNCPTDVVRSIKGYLEIDSTNKRYRFPKFGRDEPY
SDMDALEKICRTAWPNCSITWVPLSYDLRWTRLALLESTTLSSASIRIAELMYKYMPIMRIDIHGLPMEKRGNFIVGQNC
SLVIPGFNAQDVFNCYFNSALAFSTEDVNAAMIPQVSAQFDATKGEWTLDMVFSDAGIYTMQALVGSNANPVSLGSFVVD
SPDVDITDAWPAQLDFTIAGTDVDITVNPYYRLMTFVRIDGQWQIANPDKFQFFSSASGTLVMNVKLDIADKYLLYYIRD
VQSRDVGFYIQHPLQLLNTITLPTNEDLFLSAPDMREWAVKESGNTICILNSQGFVLPQDWDVLTDTISWSPSIPTYIVP
PGDYTLTPL
>Q91RA4 ~~~L2~~~Outer capsid protein lambda-2~~~
MANVWGVRLADSLSSPTLESRNRSYTLHDFCSDLDASAGKEPWKALRNQRTSEIVAVRLFRPLQGLILDTHMYGFPGEFD
AWEVFVKEKLRVLKYEVLRVYPISGYSNSHVNVFVANALVGAFLSNQAFYDLLPLLIINDTMINDLLGAGVSLAQFFQAH
GDVLEVAAGRKYIQMNGYSNDDDDPPLFAKDLSDYAKAFYCESFEVLDRFFWTHDASAGVLVHYDKPTNGNHYLLGTLTQ
MVSAPPFIINATDAMMLESCVEQFAANAAARPAQPATRLDQCYHLRWGAQYVGEDSLTYRLGVLSLLATNGYQLARPIPK
QLTNRWLSSFVSQIMSEGANETPLWPQERYVQIAYDSPSVVDGAVQYGYVRKNQLRLGMRISPIQSLSDVPAPVAWLPQY
TIDQTALEDGDMVGHMSQLPLRPEYGSMWVGEALSYYVDYNQSHRVVAAKELPQLPDTYFDGDEQYGRSLFSLARRIGDR
SLIKDTAVLKHAYQAIDPSTGREYLRAGQSVAYFGASAGHSGADQPLVIEPWLQGKISGVPSPASIRQFGYDVAKGAIVD
LARPFPSGDYQFVYSDVDQVVDGHDDLSISSNLVESILSSCMQATSPGGSFVAKINFPTRSIWYYIEQKILPNITSYMII
KPFVTNNVEVFFVAFGVHRQSSLTWTSGVYFFLVDHFYRYETLSAISRQLPSYGYVDDGSSVTGLEVISIENPGFSTMTQ
ASRVAISALCANTGNSRKTISIYESHGARVLMLVSRRSPASAKRKARLRYLPLIDPRSLEVQSRTIMPSTPVLFENSNGA
SPHVCLTMMYNYEVSSAVYDGDVVLDLGTGPEAKILELIPPTSPATCVDIRPTAQPTGCWNVRTTFLQLDYLSDGWITGV
RGDIVTCMLSLGAAAAGKSMTFDAAFQQFVRVIAQSAANVVLVQVNCPTDVIRSVRGYLEIDQTSKRYRFPKFGRDEPYS
DMESLERICRATWPNCSITWVPLSYDLRWTRLALLEAATLNSASIRIAELMYKYMPVMRVDIHGLPMNKSGNFVVGQNCS
LTIPGFNAQDTFNCYYNSALAFSTEDVNAAMIPSVTATFDNAKNEWTLDMVFSDAGIYTMQAVVGVNASPIALGSFVVDS
PDVDITDAWPAQLDFTIAGTDVDITVNPYYRLMAFVKIDGQWQIANPDKFQFFASATGTLTMNVKLDIADKYLLYYIRDV
QSREVGFYIQHPLQLLNTITLPTNEDLFLSAPDMREWAVKESGNTICILNSQGFIPPQDWDVLTDTISWSPSLPTYVVPP
GDYTLTPL
>Q91RA6 ~~~L2~~~Outer capsid protein lambda-2~~~
MANVWGVRLADSLSSPTIETRTRHYTLRDFCSDLDAVAGKEPWRPLRNQRTNDIVAVQLFRPLQGLVLDTQFYGFPGIFS
EWEQFIKEKLRVLKYEVLRIYPISNYNHERVNVFVANALVGAFLSNQAFYDLLPLLVINDTMINDLLGTGAALSQFFQSH
GEVLEVAAGRKYLQMKNYSNDDDDPPLFAKDLSDYAKAFYSDTFETLDRFFWTHDSSAGVLVHYDKPTNGNHYILGTLTQ
MVSAPPHIINATDALLLESCLEQFAANVRARPAQPVARLDQCYHLRWGAQYVGEDSLTYRLGVLSLLATNGYQLARPIPK
QLTNRWLSSFVSQVMSDGVNETPLWPQERYVQIAYDSPSVVDGATHYGYVRRNQLRLGMRVSALQSLSDTPAPIQWLPQY
TIEQAAVDEGDLMVSRLTQLPLRPDYGSIWVGDALSYYVDYNRSHRVVLSSELPQLPDTYFDGDEQYGRSLFSLARKIGD
RSLIKDTAVLKHAYQAIDPNTGKEYLRAGQSVAYFGASAGHSGADQPLVIEPWTQGKISGVPPPSSVRQFGYDVAKGAIV
DLARPFPSGDYQFVYSDVDQVVDGHDDLSISSGLVESLLDSCMHATSPGGSFVMKINFPTRTVWHYIEQKILPNITSYML
IKPFVTNNVELFFVAFGVHQQSALTWTSGVYFFLVDHFYRYETLSTISRQLPSFGYVDDGSSVTGIEMISLENPGFSNMT
QAARVGISGLCANVGNARKLISIHESHGARVLTITSRRSPASARRKARLRYLPLVDPRSLEVQARTILPSNPVLFDNVNG
ASPHVCLTMMYNFEVSSAVYDGDVVLDLGTGPEAKILELIPPTSPVTCVDIRPTAQPSGCWNVRTTFLELDYLSDGWITG
IRGDIVTCMLSLGAAAAGKSMTFDAAFQQLVKVLTKSTANVLLIQVNCPTDVIRTIKGYLEIDQTNKRYRFPKFGRDEPY
SDMDSLERICRAAWPNCSITWVPLSYDLRWTKLALLESTTLSSASVRIAELMYKYMPVMRIDIHGLPMEKQGNFIVGQNC
SLTIPGFNAQDVFNCYFNSALAFSTEDVNSAMIPQVTAQFDANKGEWSLDMVFSDAGIYTMQALVGSNANPVSLGSFVVD
SPDVDITDAWPAQLDFTIAGTDVDITVNPYYRLMAFVRIDGQWQIANPDKFQFFSSSTGTLVMNVKLDIADRYLLYYIRD
VQSRDVGFYIQHPLQLLNTITLPTNEDLFLSAPDMREWAVKESGNTICILNSQGFVPPQDWDVLTDTISWSPSLPTYVVP
PGDYTLTPL
>P03230 ~~~LMP1~~~Latent membrane protein 1~~~
MEHDLERGPPGPRRPPRGPPLSSSLGLALLLLLLALLFWLYIVMSDWTGGALLVLYSFALMLIIIILIIFIFRRDLLCPL
GALCILLLMITLLLIALWNLHGQALFLGIVLFIFGCLLVLGIWIYLLEMLWRLGATIWQLLAFFLAFFLDLILLIIALYL
QQNWWTLLVDLLWLLLFLAILIWMYYHGQRHSDEHHHDDSLPHPQQATDDSGHESDSNSNEGRHHLLVSGAGDGPPLCSQ
NLGAPGGGPDNGPQDPDNTDDNGPQDPDNTDDNGPHDPLPQDPDNTDDNGPQDPDNTDDNGPHDPLPHSPSDSAGNDGGP
PQLTEEVENKGGDQGPPLMTDGGGGHSHDSGHGGGDPHLPTLLLGSSGSGGDDDDPHGPVQLSYYD
>P13198 ~~~LMP1~~~Latent membrane protein 1~~~
MDLDLERGPPGPRRPPRGPPLSSSIGLALLLLLLALLFWLYIIMSNWTGGALLVLYAFALMLVIIILIIFIFRRDLLCPL
GALCLLLLMITLLLIALWNLHGQALYLGIVLFIFGCLLVLGLWIYLLEILWRLGATIWQLLAFFLAFFLDIILLIIALYL
QQNWWTLLVDLLWLLLFLAILIWMYYHGQRHSDEHHHDDSLPHPQQATDDSSNQSDSNSNEGRHLLLVSGAGDGPPLCSQ
NLGAPGGGPNNGPQDPDNTDDNGPQDPDNTDDNGPHDPLPQDPDNTDDNGPQDPDNTDDNGPHDPLPHNPSDSAGNDGGP
PQLTEEVENKGGDQGPPLMTDGGGGHSHDSGHDGIDPHLPTLLLGTSGSGGDDDDPHGPVQLSYYD
>P13285 ~~~LMP2~~~Latent membrane protein 2~~~
MGSLEMVPMGAGPPSPGGDPDGYDGGNNSQYPSASGSSGNTPTPPNDEERESNEEPPPPYEDPYWGNGDRHSDYQPLGTQ
DQSLYLGLQHDGNDGLPPPPYSPRDDSSQHIYEEAGRGSMNPVCLPVIVAPYLFWLAAIAASCFTASVSTVVTATGLALS
LLLLAAVASSYAAAQRKLLTPVTVLTAVVTFFAICLTWRIEDPPFNSLLFALLAAAGGLQGIYVLVMLVLLILAYRRRWR
RLTVCGGIMFLACVLVLIVDAVLQLSPLLGAVTVVSMTLLLLAFVLWLSSPGGLGTLGAALLTLAAALALLASLILGTLN
LTTMFLLMLLWTLVVLLICSSCSSCPLSKILLARLFLYALALLLLASALIAGGSILQTNFKSLSSTEFIPNLFCMLLLIV
AGILFILAILTEWGSGNRTYGPVFMCLGGLLTMVAGAVWLTVMSNTLLSAWILTAGFLIFLIGFALFGVIRCCRYCCYYC
LTLESEERPPTPYRNTV
>P04526 3.6.4.-~~~~~~Sliding-clamp-loader large subunit~~~
MITVNEKEHILEQKYRPSTIDECILPAFDKETFKSITSKGKIPHIILHSPSPGTGKTTVAKALCHDVNADMMFVNGSDCK
IDFVRGPLTNFASAASFDGRQKVIVIDEFDRSGLAESQRHLRSFMEAYSSNCSIIITANNIDGIIKPLQSRCRVITFGQP
TDEDKIEMMKQMIRRLTEICKHEGIAIADMKVVAALVKKNFPDFRKTIGELDSYSSKGVLDAGILSLVTNDRGAIDDVLE
SLKNKDVKQLRALAPKYAADYSWFVGKLAEEIYSRVTPQSIIRMYEIVGENNQYHGIAANTELHLAYLFIQLACEMQWK
>P04527 ~~~~~~Sliding-clamp-loader small subunit~~~
MSLFKDDIQLNEHQVAWYSKDWTAVQSAADSFKEKAENEFFEIIGAINNKTKCSIAQKDYSKFMVENALSQFPECMPAVY
AMNLIGSGLSDEAHFNYLMAAVPRGKRYGKWAKLVEDSTEVLIIKLLAKRYQVNTNDAINYKSILTKNGKLPLVLKELKG
LVTDDFLKEVTKNVKEQKQLKKLALEW
>P13338 ~~~~~~Late transcription coactivator~~~
MTQFSLNDIRPVDETGLSEKELSIKKEKDEIAKLLDRQENGFIIEKMVEEFGMSYLEATTAFLEENSIPETQFAKFIPSG
IIEKIQSEAIDENLLRPSVVRCEKTNTLDFLL
>P03186 3.4.19.12~~~BPLF1~~~Large tegument protein deneddylase~~~
MSNGDWGQSQRTRGTGPVRGIRTMDVNAPGGGSGGSALRILGTASCNQAHCKFGRFAGIQCVSNCVLYLVKSFLAGRPLT
SRPELDEVLDEGARLDALMRQSGILKGHEMAQLTDVPSSVVLRGGGRVHIYRSAEIFGLVLFPAQIANSAVVQSLAEVLH
GSYNGVAQFILYICDIYAGAIIIETDGSFYLFDPHCQKDAAPGTPAHVRVSTYAHDILQYVGAPGAQYTCVHLYFLPEAF
ETEDPRIFMLEHYGVYDFYEANGSGFDLVGPELVSSDGEAAGTPGADSSPPVMLPFERRIIPYNLRPLPSRSFTSDSFPA
ARYSPAKTNSPPSSPASAAPASAAPASAAPASAAPASAAPASAAPASAAPASAAPASSPPLFIPIPGLGHTPGVPAPSTP
PRASSGAAPQTPKRKKGLGKDSPHKKPTSGRRLPLSSTTDTEDDQLPRTHVPPHRPPSAARLPPPVIPIPHQSPPASPTP
HPAPVSTIAPSVTPSPRLPLQIPIPLPQAAPSNPKIPLTTPSPSPTAAAAPTTTTLSPPPTQQQPPQSAAPAPSPLLPQQ
QPTPSAAPAPSPLLPQQQPPPSAARAPSPLPPQQQPLPSATPAPPPAQQLPPSATTLEPEKNHPPAADRAGTEISPSPPF
GQQPSFGDDASGGSGLVRYLSDLEEPFLSMSDSEEAESDLASDIPTTEDEDMFEDEVFSNSLESGSSAPTSPITLDTARS
QYYQTTFDIETPEMDFVPLESNIARIAGHTYQEQAIVYDPASNREVPEADALSMIDYLLVTVVLEQGLIRSRDRSSVLNL
LEFLKDWSGHLQVPTLDLEQLLTSELNIQNLANMLSENKGRAGEFHKHLAAKLEACLPSLATKDAVRVDAGAKMLAEIPQ
LAESDDGKFDLEAARRRLTDLLSGGDQEAGEGGGEPEDNSIYRGPHVDVPLVLDDESWKRLLSLAEAARTAVARQQAGVD
EEDVRFLALLTAIEYGAPPAASVPPFVHNVAVRSKNAALHVRRCTADIRDKVASAASDYLSYLEDPSLPTVMDFDDLLTH
LRHTCQIIASLPLLNIRYTSIEWDYRELLYLGTALSDMSGIPWPLERVEEDDPSIAPLPEFETVAKKQKELETTRENEKR
LRTILDDIEAMLGLAGVASAPGAPISPASPSATPANHDNPEATPPLADTAALTIPVIEKYIANAGSIVGAAKNPTYIRLR
DTIQQIVRSKKYLMNILKSITFYTIDNYIASFEESIDHLYRDLPVLDPEVQDGIDRILDPMVSEALHTFEMGNRLTLEPA
RLVALQNFATHSTLKETAAAVNLLPGLLAVYDATITGQAPEDALRLLSGLQNQLSQTLIPGKLKKRFLSYLQKLKNNNND
QLRQKEVQAWRLEAEGFKPATEEQLEAFLDTAPNKELKRQYEKKLRQLMETGRKEKEKLREQEDKERQERRAREANEAWA
RIRKALGARPEPAPTSPDDWNTLLASLLPDNTDSAAAAAAAVARNTDILDSLTQILAAMLLGITRVRRERLRSLLVDDGG
AAERMEAAEPGWFTDIETGPLARLDAWPATPAATAKEGGGGRGAEEAAGALFRARTAADAIRSALAQTRQALQSPDMKSA
VVNTDLEAPYAEYERGLAGLLEKRRAAEAALTAIVSEYVDRTLPEATNDPGQANLPPPPTIPQATAPPRLASDSALWPKK
PQLLTRRERDDLLQATGDFFSELLTEAEAAEVRALEEQVRESQTLMAKAHEMAASTRRGFHTALEAVLSRSRDEAPDDEL
RSLLPSPPKAPVQAPLEAALARAAAGNGSWPYRKSLAAAKWIRGICEAVRGLSEGALALAGGAGAWLNLAAAADGEIHEL
TRLLEVEGMAQNSMDGMEELRLALATLDPKRVAGGKETVADWKRRLSRLEAIIQEAQEESQLQGTLQDLVTQARGHTDPR
QLKIVVEAARGLALGASAGSQYALLKDKLLRYASAKQSFLAFYETAQPTVFVKHPLTNNLPLLITISAPPTGWGNGAPTR
RAQFLAAAGPAKYAGTLWLETESPCDPLNPAYVSADTQEPLNYIPVYHNFLEYVMPTVLENPEAFSLTPAGRPQAIGPPQ
DDQERRRRTLASVASARLSAAAADSYWDTWPDVESNAGELLREYVSAPKALMEDLADNPIVAMTLLAHASLIASRNHPPY
PAPATDREVILLEQREMMALLVGTHPAYAAAFLGAPSFYAGLGLVSALARDGGLGDLLSDSVLTYRLVRSPASGRGGMPS
TTRGSNDGEDARRLTRHRIAGPPTGFIFFQDAWEEMDTRAALWPHPEFLGLVHNQSTARARACMLLLARRCFAPEALQQL
WHSLRPLEGPVAFQDYLRDFVKQAYTRGEELPRAEGLEVPRETPSSYGTVTGRALRNLMPYGTPITGPKRGSGDTIPVSV
FEAAVAAAFLGRPLTLFVSSQYLFNLKTLGQVRVVAPLLYCDGHSEPFRSLVETISLNFLQDLDGYSESFEPEMSIFARQ
AVWLRELLTEARAAKPKEARPPTVAILANRKNIIWKCFTYRHNLPDVQFYFNAAGASRWPTDVLNPSFYEHEDPPLPVGY
QLPPNPRNVQELFSGFPPRVGHGLVSGDGFQSADNTPASSDRLQQLGGGETDQGEKGSTTAESEASGPPSPQSPLLEKVA
PGRPRDWLSPTSSPRDVTVTPGLAAPITLPGPRLMARPYFGAETRASESPDRSPGSSPRPWPKDSLELLPQPAPQQPPSS
PWASEQGPIVYTLSPHSTPSTASGSQKKHTIQIPGLVPSQKPSYPPSAPYKPGQSTGGIAPTPSAASLTTFGLQPQDTQA
SSQDPPYGHSIMQREKKQQGGREEAAEIRPSATRLPTAVGLRPRAPVVAAGAAASATPAFDPGEAPSGFPIPQAPALGSG
LAAPAHTPVGALAPRPQKTQAQRPQDAAALPTPTIKAVGARPVPKATGALAAGARPRGQPTAAPPSAASPPRVSLPVRSR
QQQSPAIPLPPMHSGSEPGARPEVRLSQYRHAGPQTYTVRKEAPPSAASQLPKMPKCKDSMYYPPSGSARYPAPFQALSF
SQSVASPAPSSDQTTLLWNTPSVVTQFLSIEDIIREVVTGGSTSGDLVVPSGSPSSLSTAAPEQDLRYSLTLSQASRVLS
RFVSQLRRKLERSTHRLIADLERLKFLYL
>P16785 3.4.19.12~~~~~~Large tegument protein deneddylase~~~
MKVTQASCHQGDIARFGARAGNQCVCNGIMFLHALHLGGTSAVLQTEALDAIMEEGARLDARLERELQKKLPAGGRLPVY
RLGDEVPRRLESRFGRTVHALSRPFNGTTETCDLDGYMCPGIFDFLRYAHAKPRPTYVLVTVNSLARAVVFTEDHMLVFD
PHSSAECHNAAVYHCEGLHQVLMVLTGFGVQLSPAFYYEALFLYMLDVATVPEAEIAARLVSTYRDRDIDLTGVVRESAD
TAATTTTAAPSLPPLPDPIVDPGCPPGVAPSIPVYDPSSSPKKTPEKRRKDLSGSKHGGKKKPPSTTSKTLATASSSPSA
IAAASSSSAVPPSYSCGEGALPALGRYQQLVDEVEQELKALTLPPLPANTSAWTLHAAGTESGANAATATAPSFDEAFLT
DRLQQLIIHAVNQRSCLRRPCGPQSAAQQAVRAYLGLSKKLDAFLLNWLHHGLDLQRMHDYLSHKTTKGTYSTLDRALLE
KMQVVFDPYGRQHGPALIAWVEEMLRYVESKPTNELSQRLQRFVTKRPMPVSDSFVCLRPVDFQRLTQVIEQRRRVLQRQ
REEYHGVYEHLAGLITSIDIHDLDASDLNRREILKALQPLDDNAKQELFRLGNAKMLELQMDLDRLSTQLLTRVHNHILN
GFLPVEDLKQMERVVEQVLRLFYDLRDLKLCDGSYEEGFVVIREQLSYLMTGTVRDNVPLLQEILQLRHAYQQATQQNEG
RLTQIHDLLHVIETLVRDPGSRGSALTLALVQEQLAQLEALGGLQLPEVQQRLQNAQLALSRLYEEEEETQRFLDGLSYD
DPPNEQTIKRHPQLREMLRRDEQTRLRLINAVLSMFHTLVMRLARDESPRPTFFDAVSLLLQQLPPDSHEREDLRAANAT
YAQMVKKLEQIEKAGTGASEKRFQALRELVYFFRNHEYFFQHMVGRLGVGPQVTELYERYQHEMEEQHLERLEREWQEEA
GKLTVTSVEDVQRVLARAPSHRVMHQMQQTLTTKMQDFLDKEKRKQEEQQRQLLDGYQKKVQQDLQRVVDAVKGEMLSTI
PHQPLEATLELLLGLDQRAQPLLDKFNQDLLSALQQLSKKLDGRINECLHGVLTGDVERRCHPHREAAMQTQASLNHLDQ
ILGPQLLIHETQQALQHAVHQAQFIEKCQQGDPTTAITGSEFEGDFARYRSSQQKMEEQLQETRQQMTETSERLDRSLRQ
DPGSSSVTRVPEKPFKGQELAGRITPPPADFQQPVFKTLLDQQADAARKALSDEADLLNQKVQTQLRQRDEQLSTAQNLW
TDLVTRHKMSGGLDVTTPDAKALMEKPLETLRELLGKATQQLPYLSAERTVRWMLAFLEEALAQITADPTHPHHGSRTHY
RNLQQQAVESAVTLAHQIEQNAACENFIAQHQEATANGASTPRVDMVQAVEAVWQRLEPGRVAGGAARHQKVQELLQRLG
QTLGDLELQETLATEYFALLHGIQTFSYGLDFRSQLEKIRDLRTRFAELAKRRGTRLSNEGVLPNPRKPQATTSLGAFTR
GLNALERHVQLGHQYLLNKLNGSSLVYRLEDIPSVLPATHETDPALIMRDRLRRLCFARHHDTFLEVVDVFGMRQIVTQA
GEPIHLVTDYGNVAFKYLALRDDGRPLAWRRRCSGGGLKNVVTTRYKAITVAVAVCQTLRTFWPQISQYDLRPYLTQHQS
HTHPAETHTLHNLKLFCYLVSTAWHQRIDTQQELTAADRVGSGEGGDVGEQRPGRGTVLRLSLQEFCVLIAALYPEYIYT
VLKYPVQMSLPSLTAHLHQDVIHAVVNNTHKMPPDHLPEQVKAFCITPTQWPAMQLNKLFWENKLVQQLCQVGPQKSTPP
LGKLWLYAMATLVFPQDMLQCLWLELKPQYAETYASVSELVQTLFQIFTQQCEMVTEGYTQPQLPTGEPVLQMIRVPRQD
TTTTDTNTTTEPGLLDVFIQTETALDYALGSWLFGIPVCLGVHVADLLKGQRILVARHLEYTSRDRDFLRIQRSRDLNLS
QLLQDTWTETPLEHCWLQAQIRRLRDYLRFPTRLEFIPLVIYNAQDHTVVRVLRPPSTFEQDHSRLVLDEAFPTFPLYDQ
DDNSSADNIAASGAAPTPPVPFNRVPVNIQFLRENPPPIARVQQPPRRHRHRAAAAADDDGQIDHVQDDTSRTADSALVS
TAFGGSVFQENRLGETPLCRDELVAVAPGAASTSFASPPITVLTQNVLSALEILRLVRLDLRQLAQSVQDTIQHMRFLYL
L
>D5LX59 3.4.19.12~~~~~~Large tegument protein deneddylase~~~
MKVTQASCHQGDIARFGARAGNQCVCNGIMFLHALHLGGTSAVLQTEALDAIMEEGARLDARLERELQKKLPAGGRLPVY
RLGDEVPRRLESRFGRTVHALSRPFNGTTETCDLDGYMCPGIFDFLRYAHAKPRPTYVLVTVNSLARAVVFTEDHMLVFD
PHSSAECHNAAVYHCEGLHQVLMVLTGFGVQLSPAFYYEALFLYMLDVATVPEAEIAARLVSTYRDRDIDLTGVVRESAD
TAATTTTAAPSLPPLPDPIVDPGCPPGVAPSIPVYDPSSSPKKTPEKRRKDLSGSKHGGKKKPPSTTSKTLATASSSSPS
AIAAASSSSAVPPSYSCGEGALPALGRYQQLVDEVEQELKALTLPPLPANTSAWTLHAAGTESGANAATATAPSFDEAFL
TDRLQQLIIHAVNQRSCLRRPCGPQSAAQQAVRAYLGLSKKLDAFLLNWLHHGLDLRRMHDYLSHKTTKGTYSTLDRALL
EKMQVVFDPYGRQHGPALIAWVEEMLRYVESKPTNELSQRLQRFVTKRPMPVSDSFVCLRPVDFQRLTQVIEQRRRVLQR
QREEYHGVYEHLAGLITSIDIHDLDASDLNRREILKALQPLDDNAKQELFRLGNAKMLELQMDLDRLSTQLLTRVHNHIL
NGFLPVEDLKQMERVVEQVLRLFYDLRDLKLCDGSYEEGFVVIREQLSYLMTGTVRDNVPLLQEILQLRHAYQQATQQNE
GRLTQIHDLLHVIETLVRDPGSRGSALTLALVQEQLAQLEALGGLQLPEVQQRLQNAQLALSRLYEEEEETQRFLDGLSY
DDPPTEQTIKRHPQLREMLRRDEQTRLRLINAVLSMFHTLVMRLARDESPRPTFFDAVSLLLQQLPPDSHEREDLRAANA
TYAQMVKKLEQIEKAGTGASEKRFQALRELVYFFRNHEYFFQHMVGRLGVGPQVTELYERYQHEMEEQHLERLEREWQEE
AGKLTVTSVEDVQRVLARAPSHRVMHQMQQTLTTKMQDFLDKEKRKQEEQQRQLLDGYQKKVQQDLQRVVDAVKGEMLST
IPHQPLEATLELLLGLDQRAQPLLDKFNQDLLSALQQLSKKLDGRINECLHGVLTGDVERRCHPHREAAMQTQASLNHLD
QVLGPQLLIHETQQALQHAVHQAQFIEKCQQGDPTTAITGSEFESDFARYRSSQQKMEGQLQETRQQMTETSERLDRSLR
QDPGSSSVTRVPEKPFKGQELAGRITPPPVDFQRPVFKTLLDQQADAARKALSDEADLLNQKVQTQLRQRDEQLSTAQNL
WTDLVTRHKMSGGLDVTTPDAKALMEKPLETLRELLGKATQQLPYLSAERTVRWMLAFLEEALAQITADPTHPHHGSRTH
YRNLQQQAVESAVTLAHQIEQNAACENFIAQHQEATANGASTPRVDMVQAVEAVWQRLEPGRVAGGAARHQKVQELLQRL
GQTLGDLELQETLATEYFALLHGIQTFSYGLDFRSQLEKIRDLRTRFAELAKRRGTRLSNEGALPNPRKPQATTSLGAFT
RGLNALERHVQLGHQYLLNKLNGSSLVYRLEDIPSVLPPTHETDPALIMRDRLRRLCFARHHDTFLEVVDVFGMRQIVTQ
AGEPIHLVTDYGNVAFKYLALRDDGRPLAWRRRCSGGGLKNVVTTRYKAITVAVAVCQTLRTFWPQISQYDLRPYLTQHQ
SHTHPAETHTLHNLKLFCYLVSTAWHQRIDTQQELTAADRVGSGEGGDVGEQRPGRGTVLRLSLQEFCVLIAALYPEYIY
TVLKYPVQMSLPSLTAHLHQDVIHAVVNNTHKMPPDHLPEQVKAFCITPTQWPAMQLNKLFWENKLVQQLCQVGPQKSTP
PLGKLWLYAMATLVFPQDMLQCLWLELKPQYAETYASVSELVQTLFQIFTQQCEMVTEGYTQPQLPTGEPVLQMIRVRRQ
DTTTTDTNTTTEPGLLDVFIQTETALDYALGSWLFGIPVCLGVHVADLLKGQRVLVARHLEYTSRDRDFLRIQRSRDLNL
SQLLQDTWTETPLEHCWLQAQIRRLRDYLRFPTRLEFIPLVIYNAQDHTVVRVLRPPSTFEQDHSRLVLDEAFPIFPLYD
QDDNSSADNVAASGAAPTPPVPFNRVPVNIQFLRENPPPIARVQQPPRRHRHRAAAAADDDGQIDHVQDDTSRTADSALV
STAFGGSVFQENRLGETPLCRDELVAVAPGAASTSFASPPITVLTQNVLSALEILRLVRLNLRQLAQSVQDTIQHMRFLY
LL
>P10220 3.4.19.12~~~~~~Large tegument protein deneddylase~~~
MGGGNNTNPGGPVHKQAGSLASRAHMIAGTPPHSTMERGGDRDIVVTGARNQFAPDLEPGGSVSCMRSSLSFLSLIFDVG
PRDVLSAEAIEGCLVEGGEWTRATAGPGPPRMCSIVELPNFLEYPGARGGLRCVFSRVYGEVGFFGEPAAGLLETQCPAH
TFFAGPWALRPLSYTLLTIGPLGMGLFRDGDTAYLFDPHGLPEGTPAFIAKVRAGDMYPYLTYYTRDRPDVRWAGAMVFF
VPSGPEPAAPADLTAAALHLYGASETYLQDEAFSERRVAITHPLRGEIAGLGEPCVGVGPREGVGGPGPHPPTAAQSPPP
TRARRDDRASETSRGTAGPSAKPEAKRPNRAPDDVWAVALKGTPPTDPPSADPPSADPPSAIPPPPPSAPKTPAAEAAEE
DDDDMRVLEMGVVPVGRHRARYSAGLPKRRRPTWTPPSSVEDLTSGEKTKRSAPPAKTKKKSTPKGKTPVGAAVPASVPE
PVLASAPPDPAGPPVAEAGEDDGPTVPASSQALEALKTRRSPEPPGADLAQLFEAHPNVAATAVKFTACSAALAREVAAC
SRLTISALRSPYPASPGLLELCVIFFFERVLAFLIENGARTHTQAGVAGPAAALLEFTLNMLPWKTAVGDFLASTRLSLA
DVAAHLPLVQHVLDENSLIGRLALAKLILVARDVIRETDAFYGELADLELQLRAAPPANLYTRLGEWLLERSQAHPDTLF
APATPTHPEPLLYRVQALAKFARGEEIRVEAEDRQMREALDALARGVDAVSQHAGPLGVMPAPAGAAPQGAPRPPPLGPE
AVQVRLEEVRTQARRAIEGAVKEYFYRGAVYSAKALQASDNNDRRFHVASAAVVPVVQLLESLPVFDQHTRDIAQRAAIP
APPPIATSPTAILLRDLIQRGQTLDAPEDLAAWLSVLTDAANQGLIERKPLDELARSIRDINDQQARRSSGLAELRRFDA
LDAALGQQLDSDAAFVPAPGASPYPDDGGLSPEATRMAEEALRQARAMDAAKLTAELAPDARARLRERARSLEAMLEGAR
ERAKVARDAREKFLHKLQGVLRPLPDFVGLKACPAVLATLRASLPAGWSDLPEAVRGAPPEVTAALRADMWGLLGQYRDA
LEHPTPDTATALSGLHPSFVVVLKNLFADAPETPFLLQFFADHAPIIAHAVSNAINAGSAAVATADPASTVDAAVRAHRV
LVDAVTALGAAASDPASPLAFLAAMADSAAGYVKATRLALDARVAIAQLTTLGSAAADLVVQVRRAANQPEGEHASLIQA
ATRATTGARESLAGHEGRFGGLLHAEGTAGDHSPSGRALQELGKVIGATRRRADELEAATADLREKMAAQRARSSHERWA
ADVEAVLDRVESGAEFDVVELRRLQALAGTHGYNPRDFRKRAEQALGTNAKAVTLALETALAFNPYTPENQRHPMLPPLA
AIHRIDWSAAFGAAADTYADMFRVDTEPLARLLRLAGGLLERAQANDGFIDYHEAVLHLSEDLGGVPALRQYVPFFQKGY
AEYVDIRDRLDALRADARRAIGSVALDLAAAAEEISAVRNDPAAAAELVRAGVTLPCPSEDALVACVAALERVDQSPVKD
TAYAHYVAFVTRQDLADTKDAVVRAKQQRAEATERVTAGLREVLAARERRAQLEAEGLANLKTLLKVVAVPATVAKTLDQ
ARSAEEIADQVEILVDQTEKARELDVQAVAWLEHAQRTFETHPLSAASGDGPGLLTRQGARLQALFDTRRRVEALRRSLE
EAEAEWDEVWGRFGRVRGGAWKSPEGFRAACEQLRALQDTTNTVSGLRAQRDYERLPAKYQGVLGAKSAERAGAVEELGG
RVAQHADLSARLRDEVVPRVAWEMNFDTLGGLLAEFDAVAGDLAPWAVEEFRGARELIQRRMGLYSAYAKATGQTGAGAA
AAPAPLLVDLRALDARARASAPPGQEADPQMLRRRGEAYLRVSGGPGPLVLREATSTLDRPFAPSFLVPDGTPLQYALCF
PAVTDKLGALLMCPEAACIRPPLPTDTLESASTVTAMYVLTVINRLQLALSDAQAANFQLFGRFVRHRQARWGASMDAAA
ELYVALVATTLTREFGCRWAQLEWGGDAAAPGPPLGPQSSTRHRVSFNENDVLVALVASSPEHIYTFWRLDLVRQHEYMH
LTLPRAFQNAADSMLFVQRLTPHPDARIRVLPAFSAGGPPTRGLMFGTRLADWRRGKLSETDPLAPWRSVPELGTERGAA
LGKLSPAQALAAVSVLGRMCLPSTALVALWTCMFPDDYTEYDSFDALLTARLESGQTLSPSGGREASPPAPPNALYRPTG
QHVAVPAAATHRTPAARVTAMDLVLAAVLLGAPVVVALRNTTAFSRESELELCLTLFDSRARGPDAALRDAVSSDIETWA
VRLLHADLNPIENACLAAQLPRLSALIAERPLARGPPCLVLVDISMTPVAVLWENPDPPGPPDVRFVGSEATEELPFVAG
GEDVLAASATDEDPFLARAILGRPFDASLLSGELFPGHPVYQRAPDDQSPSVPNPTPGPVDLVGAEGSLGPGSLAPTLFT
DATPGEPVPPRMWAWIHGLEELASDDSGGPAPLLAPDPLSPTADQSVPTSQCAPRPPGPAVTAREARPGVPAESTRPAPV
GPRDDFRRLPSPQSSPAPPDATAPRPPASSRASAASSSGSRARRHRRARSLARATQASATTQGWRPPALPDTVAPVTDFA
RPPAPPKPPEPAPHALVSGVPLPLGPQAAGQASPALPIDPVPPPVATGTVLPGGENRRPPLTSGPAPTPPRVPVGGPQRR
LTRPAVASLSESRESLPSPWDPADPTAPVLGRNPAEPTSSSPAGPSPPPPAVQPVAPPPTSGPPPTYLTLEGGVAPGGPV
SRRPTTRQPVATPTTSARPRGHLTVSRLSAPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQPQ
PQPQPQPQPQPQPQPQPQPQNGHVAPGEYPAVRFRAPQNRPSVPASASSTNPRTGSSLSGVSSWASSLALHIDATPPPVS
LLQTLYVSDDEDSDATSLFLSDSEAEALDPLPGEPHSPITNEPFSALSADDSQEVTRLQFGPPPVSANAVLSRRYVQRTG
RSALAVLIRACYRLQQQLQRTRRALLHHSDAVLTSLHHVRMLLG
>P89459 3.4.19.12~~~~~~Large tegument protein deneddylase~~~
MIPAALPHPTMKRQGDRDIVVTGVRNQFATDLEPGGSVSCMRSSLSFLSLLFDVGPRDVLSAEAIEGCLVEGGEWTRAAA
GSGPPRMCSIIELPNFLEYPAARGGLRCVFSRVYGEVGFFGEPTAGLLETQCPAHTFFAGPWAMRPLSYTLLTIGPLGMG
LYRDGDTAYLFDPHGLPAGTPAFIAKVRAGDVYPYLTYYAHDRPKVRWAGAMVFFVPSGPGAVAPADLTAAALHLYGASE
TYLQDEPFVERRVAITHPLRGEIGGLGALFVGVVPRGDGEGSGPVVPALPAPTHVQTPGADRPPEAPRGASGPPDTPQAG
HPNRPPDDVWAAALEGTPPAKPSAPDAAASGPPHAAPPPQTPAGDAAEEAEDLRVLEVGAVPVGRHRARYSTGLPKRRRP
TWTPPSSVEDLTSGERPAPKAPPAKAKKKSAPKKKAPVAAEVPASSPTPIAATVPPAPDTPPQSGQGGGDDGPASPSSPS
VLETLGARRPPEPPGADLAQLFEVHPNVAATAVRLAARDAALAREVAACSQLTINALRSPYPAHPGLLELCVIFFFERVL
AFLIENGARTHTQAGVAGPAAALLDFTLRMLPRKTAVGDFLASTRMSLADVAAHRPLIQHVLDENSQIGRLALAKLVLVA
RDVIRETDAFYGDLADLDLQLRAAPPANLYARLGEWLLERSRAHPNTLFAPATPTHPEPLLHRIQALAQFARGEEMRVEA
EAREMREALDALARGVDSVSQRAGPLTVMPVPAAPGAGGRAPCPPALGPEAIQARLEDVRIQARRAIESAVKEYFHRGAV
YSAKALQASDSHDCRFHVASAAVVPMVQLLESLPAFDQHTRDVAQRAALPPPPPLATSPQAILLRDLLQRGQPLDAPEDL
AAWLSVLTDAATQGLIERKPLEELARSIHGINDQQARRSSGLAELQRFDALDAALAQQLDSDAAFVPATGPAPYVDGGGL
SPEATRMAEDALRQARAMEAAKMTAELAPEARSRLRERAHALEAMLNDARERAKVAHDAREKFLHKLQGVLRPLPDFVGL
KACPAVLATLRASLPAGWTDLADAVRGPPPEVTAALRADLWGLLGQYREALEHPTPDTATALAGLHPAFVVVLKTLFADA
PETPVLVQFFSDHAPTIAKAVSNAINAGSAAVATASPAATVDAAVRAHGALADAVSALGAAARDPASPLSFLAVLADSAA
GYVKATRLALEARGAIDELTTLGSAAADLVVQARRACAQPEGDHAALIDAAARATTAARESLAGHEAGFGGLLHAEGTAG
DHSPSGRALQELGKVIGATRRRADELEAAVADLTAKMAAQRARGSSERWAAGVEAALDRVENRAEFDVVELRRLQALAGT
HGYNPRDFRKRAEQALAANAEAVTLALDTAFAFNPYTPENQRHPMLPPLAAIHRLGWSAAFHAAAETYADMFRVDAEPLA
RLLRIAEGLLEMAQAGDGFIDYHEAVGRLADDMTSVPGLRRYVPFFQHGYADYVELRDRLDAIRADVHRALGGVPLDLAA
AAEQISAARNDPEATAELVRTGVTLPCPSEDALVACAAALERVDQSPVKNTAYAEYVAFVTRQDTAETKDAVVRAKQQRA
EATERVMAGLREALAARERRAQIEAEGLANLKTMLKVVAVPATVAKTLDQARSVAEIADQVEVLLDQTEKTRELDVPAVI
WLEHAQRTFETHPLSAARGDGPGPLARHAGRLGALFDTRRRVDALRRSLEEAEAEWDEVWGRFGRVRGGAWKSPEGFRAM
HEQLRALQDTTNTVSGLRAQPAYERLSARYQGVLGAKGAERAEAVEELGARVTKHTALCARLRDEVVRRVPWEMNFDALG
GLLAEFDAAAADLAPWAVEEFRGARELIQYRMGLYSAYARAGGQTGAGAESAPAPLLVDLRALDARARASSSPEGHEVDP
QLLRRRGEAYLRAGGDPGPLVLREAVSALDLPFATSFLAPDGTPLQYALCFPAVTDKLGALLMRPEAACVRPPLPTDVLE
SAPTVTAMYVLTVVNRLQLALSDAQAANFQLFGRFVRHRQATWGASMDAAAELYVALVATTLTREFGCRWAQLGWASGAA
APRPPPGPRGSQRHCVAFNENDVLVALVAGVPEHIYNFWRLDLVRQHEYMHLTLERAFEDAAESMLFVQRLTPHPDARIR
VLPTFLDGGPPTRGLLFGTRLADWRRGKLSETDPLAPWRSALELGTQRRDVPALGKLSPAQALAAVSVLGRMCLPSAALA
ALWTCMFPDDYTEYDSFDALLAARLESGQTLGPAGGREASLPEAPHALYRPTGQHVAVLAAATHRTPAARVTAMDLVLAA
VLLGAPVVVALRNTTAFSRESELELCLTLFDSRPGGPDAALRDVVSSDIETWAVGLLHTDLNPIENACLAAQLPRLSALI
AERPLADGPPCLVLVDISMTPVAVLWEAPEPPGPPDVRFVGSEATEELPFVATAGDVLAASAADADPFFARAILGRPFDA
SLLTGELFPGHPVYQRPLADEAGPSAPTAARDPRDLAGGDGGSGPEDPAAPPARQADPGVLAPTLLTDATTGEPVPPRMW
AWIHGLEELASDDAGGPTPNPAPALLPPPATDQSVPTSQYAPRPIGPAATARETRPSVPPQQNTGRVPVAPRDDPRPSPP
TPSPPADAALPPPAFSGSAAAFSAAVPRVRRSRRTRAKSRAPRASAPPEGWRPPALPAPVAPVAASARPPDQPPTPESAP
PAWVSALPLPPGPASARGAFPAPTLAPIPPPPAEGAVVPGGDRRRGRRQTTAGPSPTPPRGPAAGPPRRLTRPAVASLSA
SLNSLPSPRDPADHAAAVSAAAAAVPPSPGLAPPTSAVQTSPPPLAPGPVAPSEPLCGWVVPGGPVARRPPPQSPATKPA
ARTRIRARSVPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPQPPLPPVTRTLTPQSRDSV
PTPESPTHTNTHLPVSAVTSWASSLALHVDSAPPPASLLQTLHISSDDEHSDADSLRFSDSDDTEALDPLPPEPHLPPAD
EPPGPLAADHLQSPHSQFGPLPVQANAVLSRRYVRSTGRSALAVLIRACRRIQQQLQRTRRALFQRSNAVLTSLHHVRML
LG
>Q2HR64 3.4.19.12~~~~~~Large tegument protein deneddylase~~~
MAAQPLYMEGMASTHQANCIFGEHAGSQCLSNCVMYLASSYYNSETPLVDRASLDDVLEQGMRLDLLLRKSGMLGFRQYA
QLHHIPGFLRTDDWATKIFQSPEFYGLIGQDAAIREPFIESLRSVLSRNYAGTVQYLIIICQSKAGAIVVKDKTYYMFDP
HCIPNIPNSPAHVIKTNDVGVLLPYIATHDTEYTGCFLYFIPHDYISPEHYIANHYRTIVFEELHGPRMDISRGVESCSI
TEITSPSVSPAPSEAPLRRDSTQSQDETRPRRPRVVIPPYDPTDRPRPPHQDRPPEQAAGYGGNKGRGGNKGRGGKTGRG
GNEGRGGHQPPDEHQPPHITAEHMDQSDGQGADGDMDSTPANGETSVTETPGPEPNPPARPDREPPPTPPATPGATALLS
DLTATRGQKRKFSSLKESYPIDSPPSDDDDVSQPSQQTAPDTEDIWIDDPLTPLYPLTDTPSFDITADVTPDNTHPEKAA
DGDFTNKTTSTDADRYASASQESLGTLVSPYDFTNLDTLLAELGRLGTAQPIPVIVDRLTSRPFREASALQAMDRILTHV
VLEYGLVSGYSTAAPSKCTHVLQFFILWGEKLGIPTEDAKTLLESALEIPAMCEIVQQGRLKEPTFSRHIISKLNPCLES
LHATSRQDFKSLIQAFNAEGIRIASRERETSMAELIETITARLKPNFNIVCARQDAQTIQDGVGLLRAEVNKRNAQIAQE
AAYFENIITALSTFQPPPQSQQTFEVLPDLKLRTLVEHLTLVEAQVTTQTVESLQAYLQSAATAEHHLTNVPNVHSILSN
ISNTLKVIDYVIPKFIINTDTLAPYKQQFSYLGGELASMFSLDWPHAPAEAVEPLPVLTSLRGKIAEALTRQENKNAVDQ
ILTDAEGLLKNITDPNGAHFHAQAVSIPVLENYVHNAGVLLKGEKSERFSRLKTAIQNLVSSESFITVTLHSTNLGNLVT
NVPKLGEAFTGGPHLLTSPSVRQSLSTLCTTLLRDALDALEKKDPALLGEGTTLALETLLGYGSVQDYKETVQIISSLVG
IQKLVRDQGADKWATAVTRLTDLKSTLATTAIETATKRKLYRLIQRDLKEAQKHETNRAMEEWKQKVLALDNASPERVAT
LLQQAPTAKAREFAEKHFKILLPVPADAPVQASPTPMEYSASPLPDPKDIDRATSIHGEQAWKKIQQAFKDFNFAVLRPA
DWDALAAEYQRRGSPLPAAVGPALSGFLETILGTLNDIYMDKLRSFLPDAQPFQAPPFDWLTPYQDQVSFFLRTIGLPLV
RALADKISVQALRLSHALQSGDLQQATVGTPLELPATEYARIASNMKSVFNDHGLQVRSEVADYVEAQRADAHTPHVPRP
KIQAPKTLIPHPDAIVADGLPAFLKTSLLQQEAKLLALQRADFESLESDMRAAEAQRKASREETQRKMAHAITQLLQQAP
SAISGRPLSLQDPVGFLEGIIYDKVLERESYETGLEGLSWLEQTIKSITVYAPVEEKQRMHVLLDEVKKQRANTETALEL
EAAATHGDDARLLQRAVDELSPLRVKGGKAAVESWRQKIQTLKSLVQEAEQAGLLLATIDTVAGQAQETISPSTLQGLYQ
QGQEAMAAIKRFRDSPQLAGLQEKLAELQQYVKYKKQYLEHFEATQSVVFTAFPLTQEVTIPALHYAGPFDNLERLSRYL
HIGQTQPAPGQWLLTLPTFDPTRPACVPAGGHEPPLHRQVVFSSFLEAQIRLALSVAGPVPGRGLPGTPQIRRGVEAAAC
FLHQWDEISRLLPEVLDTFFHNAPLPAESSSNAFLAMCVLTHLVYLAGRAVLGPREPEHAAPDAYPREVALAPRDLTYLL
LAMWPSWISAILKQPSHAEAAHACLVTLPTMLKAVPYLTLEASAGPLPADMRHFATPEARLFFPARWHHVNVQEKLWLRN
DFMSLCHRSPGRARIAVLVWAVTCLDPEVIRQLWSTLRPLTADESDTASGLLRVLVEMEFGPPPKTPRREAVAPGATLPP
YPYGLATGERLVGQAQERSGGAGKMPVSGFEIVLGALLFRAPLRIFSTASTHRISDFEGGFQILTPLLDCCPDREPFASL
AAAPRRTVPLGDPCANIHTPEEIQIFARQAAWLQYTFANYQIPSTDNPIPIVVLNANNNLENSYIPRDRKADPLRPFYVV
PLKPQGRWPEIMTTATTPCRLPTSPEEAGSQFARLLQSQVSATWSDIFSRVPERLAPNAPQKSSQTMSEIHEVAATPPLT
ITPNKPTGTPHVSPEADPITERKRGQQPKIVADNMPSRILPSLPTPKPREPRITLPHALPVISPPAHRPSPIPHLPAPQV
TEPKGVLQSKRGTLVLRPAAVIDPRKPVSAPITRYERTALQPPRTEGEGRRPPDTQPVTLTFRLPPTAPTPATAALETKT
TPPSTPPHAIDISPPQTPPMSTSPHARDTSPPAEKRAAPVIRVMAPTQPSGEARVKRVEIEQGLSTRNEAPPLERSNHAV
PAVTPRRTVAREIRIPPEIKAGWDTAPDIPLPHSSPESSPPTSPQPIRVDDKSPLPNLVERYARGFLDTPSVEVMSLENQ
DIAVDPGLLTRRIPSVVPMPHPIMWSPIVPISLQNTDIDTAKITLISFIRRIKQKVAALSASLAETVDRIKKWYL
>P0DOJ4 3.6.4.-~~~~~~Large T antigen~~~
MDRVLSRADKERLLELLKLPRQLWGDFGRMQQAYKQQSLLLHPDKGGSHALMQELNSLWGTFKTEVYNLRMNLGGTGFQG
SPPRTAERGTEESGHSPLHDDYWSFSYGSKYFTREWNDFFRKWDPSYQSPPKTAESSEQPDLFCYEEPLLSPNPSSPTDT
PAHTAGRRRNPCVAEPDDSISPDPPRTPVSRKRPRPAGATGGGGGGVHANGGSVFGHPTGGTSTPAHPPPYHSQGGSESM
GGSDSSGFAEGSFRSDPRCESENESYSQSCSQSSFNATPPKKAREDPAPSDFPSSLTGYLSHAIYSNKTFPAFLVYSTKE
KCKQLYDTIGKFRPEFKCLVHYEEGGMLFFLTMTKHRVSAVKNYCSKLCSVSFLMCKAVTKPMECYQVVTAAPFQLITEN
KPGLHQFEFTDEPEEQKAVDWIMVADFALENNLDDPLLIMGYYLDFAKEVPSCIKCSKEETRLQIHWKNHRKHAENADLF
LNCKAQKTICQQAADGVLASRRLKLVECTRSQLLKERLQQSLLRLKELGSSDALLYLAGVAWYQCLLEDFPQTLFKMLKL
LTENVPKRRNILFRGPVNSGKTGLAAALISLLGGKSLNINCPADKLAFELGVAQDQFVVCFEDVKGQIALNKQLQPGMGV
ANLDNLRDYLDGSVKVNLEKKHSNKRSQLFPPCVCTMNEYLLPQTVWARFHMVLDFTCKPHLAQSLEKCEFLQRERIIQS
GDTLALLLIWNFTSDVFDPDIQGLVKEVRDQFASECSYSLFCDILCNVQEGDDPLKDICEYS
>P24851 3.6.4.-~~~~~~Large T antigen~~~
MELTSEEYEELRGLLGTPDIGNADTLKKAFLKACKVHHPDKGGNEEAMKRLLYLYNKAKIAASATTSQVPEYGTSQWEQW
WEEFNQGFDEQDLHCDEELEPSDNEEENPAGSQAPGSQATPPKKPRTSPDFPEVLKEYVSNALFTNRTYNCFIIFTTAEK
GKELYPCIQAAYKCTFIALYMYNGDSVLYIITVGKHRVNAMENLCSKKCTVSFLQAKGVLKPQEAYNVCCTFELISQNIQ
GGLPSSFFNPVQEEEKSVNWKLISEFACSIKCTDPLLLMALYLEFTTAPEACKVCDNPRRLEHRRHHTKDHTLNALLFQD
SKTQKTICNQACDTVLAKRRLDMKTLTRNELLVQRWQGLFQEMEDLFGARGEEHLAHRMAAVMWLNALHPNMPDVIFNYI
KMVVENKPKQRYLLLKGPVNCGKTTVAAGLIGLCGGAYLNINCPPERLAFELGMAIDQFTVVFEDVKGKKSSKSSLQTGI
GFENLDNLRDHLDGAVPVNLERKHQNKVTQIFPPGIVTCNEYDIPLTVKIRMYQKVELLHNYNLYKSLKNTEEVGKKRYL
QSGITWLLLLIYFRSVDDFTEKLQECVVKWKERIETEVGDMWLLTMKENIEQGKNILEK
>P03072 3.6.4.-~~~~~~Large T antigen~~~
MDKVLNREESMELMDLLGLDRSAWGNIPVMRKAYLKKCKELHPDKGGDEDKMKRMNFLYKKMEQGVKVAHQPDFGTWNSS
EVPTYGTDEWESWWNTFNEKWDEDLFCHEEMFASDDENTGSQHSTPPKKKKKVEDPKDFPVDLHAFLSQAVFSNRTVASF
AVYTTKEKAQILYKKLMEKYSVTFISRHGFGGHNILFFLTPHRHRVSAINNYCQKLCTFSFLICKGVNKEYLFYSALCRQ
PYAVVEESIQGGLKEHDFNPEEPEETKQVSWKLVTQYALETKCEDVFLLMGMYLDFQENPQQCKKCEKKDQPNHFNHHEK
HYYNAQIFADSKNQKSICQQAVDTVAAKQRVDSIHMTREEMLVERFNFLLDKMDLIFGAHGNAVLEQYMAGVAWIHCLLP
QMDTVIYDFLKCIVLNIPKKRYWLFKGPIDSGKTTLAAALLDLCGGKSLNVNMPLERLNFELGVGIDQFMVVFEDVKGTG
AESRDLPSGHGISNLDCLRDYLDGSVKVNLERKHQNKRTQVFPPGIVTMNEYSVPRTLQARFVRQIDFRPKAYLRKSLSC
SEYLLEKRILQSGMTLLLLLIWFRPVADFAAAIHERIVQWKERLDLEISMYTFSTMKANVGMGRPILDFPREEDSEAEDS
GHGSSTESQSQCFSQVSEASGADTQENCTFHICKGFQCFKKPKTPPPK
>P0DOJ5 3.6.4.-~~~~~~Large T antigen~~~
MDRVLSRADKERLLELLKLPRQLWGDFGRMQQAYKQQSLLLHPDKGGSHALMQELNSLWGTFKTEVYNLRMNLGGTGFQG
SPPRTAERGTEESGHSPLHDDYWSFSYGSKYFTREWNDFFRKWDPSYQSPPKTAESSEQPDLFCYEEPLLSPNPSSPTDT
PAHTAGRRRNPCVAEPDDSISPDPPRTPVSRKRPRPAGATGGGGGGVHANGGSVFGHPTGGTSTPAHPPPYHSQGGSESM
GGSDSSGFAEGSFRSDPRCESENESYSQSCSQSSFNATPPKKAREDPAPSDFPSSLTGYLSHAIYSNKTFPAFLVYSTKE
KCKQLYDTIGKFRPEFKCLVHYEEGGMLFFLTMTKHRVSAVKNYCSKLCSVSFLMCKAVTKPMECYQVVTAAPFQLITEN
KPGLHQFEFTDEPEEQKAVDWIMVADFALENNLDDPLLIMGYYLDFAKEVPSCIKCSKEETRLQIHWKNHRKHAENADLF
LNCKAQKTICQQAADGVLASRRLKLVECTRSQLLKERLQQSLLRLKELGSSDALLYLAGVAWYQCLLEDFPQTLFKMLKL
LTENVPKRRNILFRGPVNSGKTGLAAALISLLGGKSLNINCPADKLAFELGVAQDQFVVCFEDVKGQIALNKQLQPGMGV
ANLDNLRDYLDGSVKVNLEKKHSNKRSQLFPPCVCTMNEYLLPQTVWARFHMVLDFTCKPHLAQSLEKCEFLQRERIIQS
GDTLALLLIWNFTSDVFDPDIQGLVKEVRDQFASECSYSLFCDILCNVQEGDDPLKDICEYS
>P03070 3.6.4.-~~~~~~Large T antigen~~~
MDKVLNREESLQLMDLLGLERSAWGNIPLMRKAYLKKCKEFHPDKGGDEEKMKKMNTLYKKMEDGVKYAHQPDFGGFWDA
TEIPTYGTDEWEQWWNAFNEENLFCSEEMPSSDDEATADSQHSTPPKKKRKVEDPKDFPSELLSFLSHAVFSNRTLACFA
IYTTKEKAALLYKKIMEKYSVTFISRHNSYNHNILFFLTPHRHRVSAINNYAQKLCTFSFLICKGVNKEYLMYSALTRDP
FSVIEESLPGGLKEHDFNPEEAEETKQVSWKLVTEYAMETKCDDVLLLLGMYLEFQYSFEMCLKCIKKEQPSHYKYHEKH
YANAAIFADSKNQKTICQQAVDTVLAKKRVDSLQLTREQMLTNRFNDLLDRMDIMFGSTGSADIEEWMAGVAWLHCLLPK
MDSVVYDFLKCMVYNIPKKRYWLFKGPIDSGKTTLAAALLELCGGKALNVNLPLDRLNFELGVAIDQFLVVFEDVKGTGG
ESRDLPSGQGINNLDNLRDYLDGSVKVNLEKKHLNKRTQIFPPGIVTMNEYSVPKTLQARFVKQIDFRPKDYLKHCLERS
EFLLEKRIIQSGIALLLMLIWYRPVAEFAQSIQSRIVEWKERLDKEFSLSVYQKMKFNVAMGIGVLDWLRNSDDDDEDSQ
ENADKNEDGGEKNMEDSGHETGIDSQSQGSFQAPQSSQSVHDHNQPYHICRGFTCFKKPPTPPPEPET
>O64205 3.1.-.-~~~~~~Endolysin B~~~
MSKPWLFTVHGTGQPDPLGPGLPADTARDVLDIYRWQPIGNYPAAAFPMWPSVEKGVAELILQIELKLDADPYADFAMAG
YSQGAIVVGQVLKHHILPPTGRLHRFLHRLKKVIFWGNPMRQKGFAHSDEWIHPVAAPDTLGILEDRLENLEQYGFEVRD
YAHDGDMYASIKEDDLHEYEVAIGRIVMKASGFIGGRDSVVAQLIELGQRPITEGIALAGAIIDALTFFARSRMGDKWPH
LYNRYPAVEFLRQI
>Q8HA43 3.4.22.-~~~~~~D-alanyl-L-alanine endopeptidase~~~
MATYQEYKSRSNGNAYDIDGSFGAQCWDGYADYCKYLGLPYANCTNTGYARDIWEQRHENGILNYFDEVEVMQAGDVAIF
MVVDGVTPYSHVAIFDSDAGGGYGWFLGQNQGGANGAYNLVKIPYSATYPTAFRPKSFKNAVTVTDNTGLNKGDYFIDVS
AYQQADLTTTCQQAGTTKTIIKVSESIAWLSDRHQQQANTSDPIGYYHFGRFGGDSALAQREADLFLSNLPSKKVSYLVI
DYEDSASADKQANTNAVIAFMDKIASAGYKPIYYSYKPFTLNNIDYQKIIAKYPNSIWIAGYPDYEVRTEPLWEFFPSMD
GVRWWQFTSVGVAGGLDKNIVLLADDSSKMDIPKVDKPQELTFYQKLATNTKLDNSNVPYYEATLSTDYYVESKPNASSA
DKEFIKAGTRVRVYEKVNGWSRINHPESAQWVEDNYLVNATDM
>P15057 3.2.1.17~~~CPL1~~~Lysozyme~~~
MVKKNDLFVDVSSHNGYDITGILEQMGTTNTIIKISESTTYLNPCLSAQVEQSNPIGFYHFARFGGDVAEAEREAQFFLD
NVPMQVKYLVLDYEDDPSGDAQANTNACLRFMQMIADAGYKPIYYSYKPFTHDNVDYQQILAQFPNSLWIAGYGLNDGTA
NFEYFPSMDGIRWWQYSSNPFDKNIVLLDDEEDDKPKTAGTWKQDSKGWWFRRNNGSFPYNKWEKIGGVWYYFDSKGYCL
TSEWLKDNEKWYYLKDNGAMATGWVLVGSEWYYMDDSGAMVTGWVKYKNNWYYMTNERGNMVSNEFIKSGKGWYFMNTNG
ELADNPSFTKEPDGLITVA
>P19385 3.2.1.17~~~CPL7~~~Lysozyme~~~
MVKKNDLFVDVASHQGYDISGILEEAGTTNTIIKVSESTSYLNPCLSAQVSQSNPIGFYHFAWFGGNEEEAEAEARYFLD
NVPTQVKYLVLDYEDHASASVQRNTTACLRFMQIIAEAGYTPIYYSYKPFTLDNVDYQQILAQFPNSLWIAGYGLNDGTA
NFEYFPSMDGIRWWQYSSNPFDKNIVLLDDEKEDNINNENTLKSLTTVANEVIQGLWGNGQERYDSLANAGYDPQAVQDK
VNEILNAREIADLTTVANEVIQGLWGNGQERYDSLANAGYDPQAVQDKVNEILNAREIADLTTVANEVIQGLWGNGQERY
DSLANAGYDPQAVQDKVNELLS
>P03609 ~~~~~~Lysis protein~~~
METRFPQQSQQTPASTNRRRPFKHEDYPCRRQQRSSTLYVLIFLAIFLSKFTNQLLLSLLEAVIRTVTTLQQLLT
>P33486 3.2.1.17~~~lysA~~~Lysozyme~~~
MTKTYGVDVAVYQPIDLAAYHKAGASFAIVKLTEGVDYVNRRGPSRWTAPGLTTSTLMPTISRSFGSSVSRAKKEAAYFL
KEAKKQDISKKRMLWLDWEAGSGNVVTGSKSSNTAAILDFMDAIKAAGWRPGLYSGASLMRTAIDTKQVVKKYGTCLWVA
SYPTMAAVSTADFGYFRQWTGSPSGSLPVTAWPGRRRERCSG
>P03639 ~~~E~~~Lysis protein E~~~
MVRWTLWDTLAFLLLLSLLLPSLLIMFIPSTFKRPVSSWKALNLRKTLLMASSVRLKPLNCSRLPCVYAQETLTFLLTQK
KTCVKNYVQKE
>Q9E005 2.7.7.48~~~~~~RNA-directed RNA polymerase L~~~
MEKYREIHQRVRDLAPGTVSALECIDLLDRLYAVRHDLVDQMIKHDWSDNKDVERPIGQVLLMAGIPNDIIQGMEKKIIP
NSPSGQVLKSFFRMTPDNYKITGNLIEFIEVTVTADVSRGIREKKIKYEGGLQFVEHLLETESRKGNIPQPYKITFSVVA
VKTDGSNISTQWPSRRNDGVVQHMRLVQADINYVREHLIKLDERASLEAMFNLKFHVSGPKLRYFNIPDYRPQQLCEPRI
DNLIQYCKNWLTKEHKFVFKEVSGANVIQAFESHEQLHLQKYNESRKPRNFLLLQLTVQGAYLPSTISSDQCNTRIGCLE
ISKNQPETPVQMLALDISYKYLSLTRDELINYYSPRVHFQSSPNVKEPGTLKLGLSQLNPLSKSILDNVGKHKKDKGLFG
EIIDSINVASQIQINACAKIIEQILSNLEINIGEINASMPSPNKTTGVDDLLNKFYDNELGKYMLSILRKTAAWHIGHLV
RDITESLIAHAGLRRSKYWSVHAYDHGNVILFILPSKSLEVVGSYIRYFTVFKDGIGLIDADNIDSKAEIDGVTWCYSKV
MSIDLNRLLALNIAFEKSLLATATWFQYYTEDQGHFPLQHALRSIFSFHFLLCVSQKMKLCAIFDNLRYLIPSVTSLYSG
YELLIEKFFERPFKSSLDVYLYSIIKSLLISLAQNNKVRFYSRVRLLGLTVDHSTVGASGVYPSLMSRVVYKHYRSLISE
ATTCFFLFEKGLHGNLPEEAKIHLETIEWARKFQEKEKQYGDILLKEGYTIESVINGEVDVEQQLFCQEVSELSAQELNK
YLQAKSQVLCANIMNKHWDKPYFSQTRNISLKGMSGALQEDGHLAASVTLIEAIRFLNRSQTNPNVIDMYEQTKQSKAQA
RIVRKYQRTEADRGFFITTLPTRVRLEIIEDYFDAIAKVVPEEYISYGGDKKVLNIQNALEKALRWASGVSEITTSTGKS
IKFKRKLMYVSADATKWSPGDNSAKFRRFTQAIYDGLSDNKLKCCVVDALRNIYETEFFMSRKLHRYIDSMENHSDAVED
FLAFFSNGVSANVKGNWLQGNLNKCSSLFGAAVSLLFREVWKQLFPELECFFEFAHHSDDALFIYGYLEPEDDGTDWFLY
VSQQIQAGNFHWHAINQEMWKSMFNLHEHLLLMGSIKVSPKKTTVSPTNAEFLSTFFEGCAVSIPFVKILLGSLSDLPGL
GFFDDLAAAQSRCVKSLDLGACPQLAQLAIVLCTSKVERLYGTADGMVNSPTAFLKVNKAHVPVPLGGDGSMSIMELATA
GFGMADKNILKNAFISYKHTRRDGDRYVLGLFKFLMSLSEDVFQHDRLGEFSFVGKVQWKVFTPKAEFEFHDQFSHNYLL
EWTRQHPVYDYIIPRNRDNLLVYLVRKLNDPSIITAMTMQSPLQLRFRMQAKQHMKVCRYEGEWVTFREVLAAADSFATS
YQPTERDMDLFNTLVSCTFSKEYAWKDFLNEVRCEVLTTRHVHRPKIARTFTVREKDQAIQNPINSVIGYKYALTVDEVS
DVLDSAFFPESLSADLQVMKDGVYRELGLDISSPEVLKRIAPLLYKAGRSRVVIVEGNVEGTAESICSYWLKTMSLIKTI
RVRPKKEVLKAMSLYSVKENIGLQDDIAATRLCIEIWRWCKANEQDVKEWLTSLYFEKQTLMDWVERFRRKGVVPIDPEI
QCIGLLLYDVLGYKSVLQMQANRRAYSGKQYDAYCVQTYNEETKLYEGDLRVTFNFGLDCARLEVFWDKKEYILETSITQ
RHVLRLLMEEVSQELIRCGMRFKTEQVNQTRSLVLFKTEAGFEWGKPNVPCIVYKHCVLRTGLRTKQPINKEFMINVQSD
GFRAIAQMDIESPRFLLAHAYHTLRDIRYQAVQAVGNVWFKTEQHKLFINPIISSGLLENFMKGLPAAIPPAAYSLIMNK
AKISVDLFMFNELLALINRNNILNLDGIEETSEGYSTVTSMSSKQWSEEMSLMSDDDIDDMEDFTIALDDIDFEQINLEE
DIQHFLQDESAYVGDLLIQTEDIEVKKIRGVTRVLEPVKLLKSWVSKGLAIDKVYNPIGIILMARYMSKTYNFSSTPLAL
LNPYDLTELESVVKGWGETVNDRFKDLDIEAQTVVKEKGVQPEDVLPDSLFSFRHVDVLLRRLFPRDPVSTFY
>P52639 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MSFHASLLREEETPRPVAGINRTDQSLKNPLLGTEVSFCLKSSSLPHHVRALGQIKARNLASCDYYLLFRQVVLPPEVYP
IGVLIRAAEAILTVIVSAWKLDHMTKTLYSSVRYALTNPRVRAQLELHIAYQRIVGQVSYSREADIGPKRLGNMSLQFIQ
SLVIATIDTTSCLMTYNHFLAAADTAKSRCHLLIASVVQGALWEQGSFLDHIINMIDIIDSINLPHDDYFTIIKSIFPYS
QGLVMGRHNVSVSSDFASVFAIPELCPQLDSLLKKLLQLDPVLLLMVSSVQKSWYFPEIRMVDGSREQLHKMRVELETPQ
ALLSYGHTLLSIFRAEFIKGYVSKNAKWPPVHLLPGCDKSIKNARELGRWSPAFDRRWQLFEKVVILRIADLDMDPDFND
IVSDKAIISSRRDWVFEYNAAAFWKKYGERLERPPARSGPSRLVNALIDGRLDNIPALLEPFYRGAVEFEDRLTVLVPKE
KELKVKGRFFSKQTLAIRIYQVVAEAALKNEVMPYLKTHSMTMSSTALTHLLNRLSHTITKGDSFVINLDYSSWCNGFRP
ELQAPICRQLDQMFNCGYFFRTGCTLPCFTTFIIQDRFNPPYSLSGEPVEDGVTCAVGTKTMGEGMRQKLWTILTSCWEI
IALREINVTFNILGQGDNQTIIIHKSASQNNQLLAERALGALYKHARLAGHNLKVEECWVSDCLYEYGKKLFFRGVPVPG
CLKQLSRVTDSTGELFPNLYSKLACLTSSCLSAAMADTSPWVALATGVCLYLIELYVELPPAIIQDESLLTTLCLVGPSI
GGLPTPATLPSVFFRGMSDPLPFQLALLQTLIKTTGVTCSLVNRVVKLRIAPYPDWLSLVTDPTSLNIAQVYRPERQIRR
WIEEAIATSSHSSRIATFFQQPLTEMAQLLARDLSTMMPLRPRDMSALFALSNVAYGLSIIDLFQKSSTVVSASQAVHIE
DVALESVRYKESIIQGLLDTTEGYNMQPYLEGCTYLAAKQLRRLTWGRDLVGVTMPFVAEQFHPHSSVGAKAELYLDAII
YCPQETLRSHHLTTRGDQPLYLGSNTAVKVQRGEITGLAKSRAANLVKDTLVLHQWYKVRKVTDPHLNTLMARFLLEKGY
TSDARPSIQGGTLTHRLPSRGDSRQGLTGYVNILSTWLRFSSDYLHSFSKSSDDYTIHFQHVFTYGCLYADSVIRSGGVI
STPYLLSASCKTCFEKIDSEEFVLACEPQYRGAEWLISKPVTVPEQITDAEVEFDPCVSASYCLGILIGKSFLVDIRASG
HDIMEQRTWANLERFSVSDMQKLPWSIVIRSLWRFLIGARLLQFEKAGLIRMLYAATGPTFSFLMKVFQDSALLMDCAPL
DRLSPRINFHSRGDLVAKLVLLPFINPGIVEIEVSGINSKYHAVSEANMDLYIAAAKSVGVKPTQFVEETNDFTARGHHH
GCYSLSWSKSRNQSQVLKMVVRKLKLCVLYIYPTVDPAVALDLCHLPALTIILVLGGDPAYYERLLEMDLCGAVSSRVDI
PHSLAARTHRGFAVGPDAGPGVIRLDRLESVCYAHPCLEELEFNAYLDSELVDISDMCCLPLATPCKALFRPIYRSLQSF
RLALMDNYSFVMDLIMIRGLDIRPHLEEFDELLVVGQHILGQPVLVEVVYYVGVVRKRPVLARHPWSADLKRITVGGRAP
CPSAARLRDEDCQGSLLVGLPAGLTQLLIID
>A5HC98 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDYQEYQQFLARINTARDACVAKDIDVDLLMARHDYFGRELCKSLNIEYRNDVPFIDIILDIRPEVDPLTIDAPHITPDN
YLYINNVLYIIDYKVSVSNESSVITYDKYYELTRDISDRLSIPIEIVIIRIDPVSRDLHINSDRFKELYPTIVVDINFNQ
FFDLKQLLYEKFGDDEEFLLKVAHGDFTLTAPWCKTGCPEFWKHPIYKEFKMSMPVPERRLFEESVKFNAYESERWNTNL
VKIREYTKKDYSEHISKSAKNIFLASGFYKQPNKNEISEGWTLMVERVQDQREISKSLHDQKPSIHFIWGAHNPGNSNNA
TFKLILLSKSLQSIKGISTYTEAFKSLGKMMDIGDKAIEYEEFCMSLKSKARSSWKQIMNKKLEPKQINNALVLWEQQFM
INNDLIDKSEKLKLFKNFCGIGKHKQFKNKMLEDLEVSKPKILDFDDANMYLASLTMMEQSKKILSKSNGLKPDNFILNE
FGSRIKDANKETYDNMHKIFETGYWQCISDFSTLMKNILSVSQYNRHNTFRIAMCANNNVFAIVFPSADIKTKKATVVYS
IIVLHKEEENIFNPGCLHGTFKCMNGYISISRAIRLDKERCQRIVSSPGLFLTTCLLFKHDNPTLVMSDIMNFSIYTSLS
ITKSVLSLTEPARYMIMNSLAISSNVKDYIAEKFSPYTKTLFSVYMTRLIKNACFDAYDQRQRVQLRDIYLSDYDITQKG
IKDNRELTSIWFPGSVTLKEYLTQIYLPFYFNAKGLHEKHHVMVDLAKTILEIECEQRENIKEIWSTNCTKQTVNLKILI
HSLCKNLLADTSRHNHLRNRIENRNNFRRSITTISTFTSSKSCLKIGDFRKEKELQSVKQKKILEVQSRKMRLANPMFVT
DEQVCLEVGHCNYEMLRNAMPNYTDYISTKVFDRLYELLDKKVLTDKPVIEQIMDMMIDHKKFYFTFFNKGQKTSKDREI
FVGEYEAKMCMYAVERIAKERCKLNPDEMISEPGDGKLKVLEQKSEQEIRFLVETTRQKNREIDEAIEALATEGYESNLG
KIEKLSLGKAKGLKMEINADMSKWSAQDVFYKYFWLIALDPILYPQEKERILYFMCNYMDKELILPDELLFNLLDQKVAY
QNDIIATMTNQLNSNTVLIKRNWLQGNFNYTSSYVHSCAMSVYKEILKEAITLLDGSILVNSLVHSDDNQTSITIVQDKM
ENDKIIDFAMKEFERACLTFGCQANMKKTYVTNCIKEFVSLFNLYGEPFSIYGRFLLTSVGDCAYIGPYEDLASRISSAQ
TAIKHGCPPSLAWVSIAISHWMTSLTYNMLPGQSNDPIDYFPAENRKDIPIELNGVLDAPLSMISTVGLESGNLYFLIKL
LSKYTPVMQKRESVVNQIAEVKNWKVEDLTDNEIFRLKILRYLVLDAEMDPSDIMGETSDMRGRSILTPRKFTTAGSLRK
LYSFSKYQDRLSSPGGMVELFTYLLEKPELLVTKGEDMKDYMESVIFRYNSKRFKESLSIQNPAQLFIEQILFSHKPVID
FSGIRDKYINLHDSRALEKEPDILGKVTFTEAYRLLMRDLSSLELTNDDIQVIYSYIILNDPMMITIANTHILSIYGSPQ
RRMGMSCSTMPEFRNLKLIHHSPALVLRAYSKNNPDIQGADPTEMARDLVHLKEFVENTNLEEKMKVRIAMNEAEKGQRD
IVFELKEMTRFYQVCYEYVKSTEHKIKVFILPAKSYTTTDFCSLMQGNLIKDKEWYTVHYLKQILSGGHKAIMQHNATSE
QNIAFECFKLITHFADSFIDSLSRSAFLQLIIDEFSYKDVKVSKLYDIIKNGYNRTDFIPLLFRTGDLRQADLDKYDAMK
SHERVTWNDWQTSRHLDMGSINLTITGYNRSITIIGEDNKLTYAELCLTRKTPENITISGRKLLGSRHGLKFENMSKIQT
YPGNYYITYRKKDRHQFVYQIHSHESITRRNEEHMAIRTRIYNEITPVCVVNVAEVDGDQRILIRSLDYLNNDIFSLSRI
KVGLDEFATIKKAHFSKMVSFEGPPIKTGLLDLTELMKSQDLLNLNYDNIRNSNLISFSKLICCEGSDNINDGLEFLSDD
PMNFTEGEAIHSTPIFNIYYSKRGERHMTYRNAIKLLIERETKIFEEAFTFSENGFISPENLGCLEAVVSLIKLLKTNEW
STVIDKCIHICLIKNGMDHMYHSFDVPKCFMGNPITRDINWVMFREFINSLPGTDIPPWNVMTENFKKKCIALINSKFET
QRDFSEFTKLMKKEGGRSNIEFD
>P20470 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MEDQAYDQYLHRIQAARTATVAKDISADILEARHDYFGRELCNSLGIEYKNNVLLDEIILDVVPGVNLLNYNIPNVTPDN
YIWDGHFLIILDYKVSVGNDSSEITYKKYTSLILPVMSELGIDTEIAIIRANPVTYQISIIGEEFKQRFPNIPIQLDFGR
FFELRKMLLDKFADDEEFLMMIAHGDFTLTAPWCTSDTPELEEHEIFQEFINSMPPRFVSLFKEAVNFSAYSSERWNTFL
YRARAETEVDYNQFLSDKAHKIFMLEGDYMRPTQAEIDKGWELMSQRVYTEREIITDVTKQKPSIHFIWVKNADRKLIGS
TAKLIYLSNSLQSITEQSTWTDALKAIGKSMDIDGKVGQYETLCAERKMIARSTGKKVDNKRLEAVKIGNALVLWEQQFI
LANDLFKNQERQKFMKNFFGIGKHKSFKDKTSSDIETDKPKILDFNNTIVLMAARTMVNKNKALLAKDNTLQDLHPIIMQ
YASEIKEASKDTFDALLKISKTCFWQCIVDVSTIMRNILAVSQYNRHNTFRVAMCANDSVYALVFPSSDIKTKRATVVFS
IVCMHKEKNDLMDAGALFTTLECKNKEYISISKAIRLDKERCQRIVSSPGLFILSSMLLYNNNPEVNLVDVLNFTFYTSL
SITKSMLSLTEPSRYMIMNSLAISSHVRDYIAEKFSPYTKTLFSVYMVNLIKRGCASANEQSSKIQLRNIYLSDYDITQK
GVNDGRNLDSIWFPGKVNLKEYINQIYLPFYFNAKGLHEKHHVMIDLAKTVLEIEMNQRSDNLGIWSKAEKKQHVNLPIL
IHSIAKSLILDTSRHNHLRNRVESRNNFRRSITTISTFTSSKSCIKIGDFREIKDKETEKSKKSTEKFDKKFRLSNPLFL
EDEEANLEVQHCNYRALIQKIPNYKDYISVKVFDRLYELLKNGVLTDKPFIELAMEMMKNHKEFSFTFFNKGQKTAKDRE
IFVGEFEAKMCMYVVERISKERCKLNTDEMISEPGDSKLKILEKKAEEEIRYIVERTKDSIIKGDPSKALKLEINADMSK
WSAQDVFYKYFWLIAMDPILYPAEKTRILYFMCNYMQKLLILPDDLIANILDQKRPYNDDLILEMTNGLNYNYVQIKRNW
LQGNFNYISSYVHSCAMLVYKDILKECMKLLDGDCLINSMVHSDDNQTSLAIIQNKVSDQIVIQYAANTFESVCLTFGCQ
ANMKKTYITHTCKEFVSLFNLHGEPLSVFGRFLLPSVGDCAYIGPYEDLASRLSAAQQSLKHGCPPSLVWLAISCSHWIT
FFTYNMLDDQINAPQQHLPFNNRKEIPVELNGYLNAPLYLIALVGLEAGNLWFLINILKKLVPLDKQKETIQSQCLHLCN
SIDKLTESEKFKLKILRYLTLDTEMSVDNNMGETSDMRSRSLLTPRKFTTLGSLNKLVSYNDFRSSLDDQRFTDNLNFML
NNPELLVTKGENKEQFMQSVLFRYNSKRFKESLSIQNPAQLFIEQILFSHKPIIDYSSIFDKLTSLAEADIIEELPEIIG
RVTFPQAYQMINRDIGQLPLDIDDIKLIFRYCILNDPLMITAANTSLLCVKGTPQDRTGLSASQMPEFRNMKLIHHSPAL
VLKAFSKGTSDIPGADPIELEKDLHHLNEFVETTAIKEKILHNIDNPPKHLIGNEILIYRIREMTKLYQVCYDYVKSTEH
KVKIFILPMKSYTAIDFCTLIQGNTISDNKWYTMHYLKQIASGSIKGNIVTTSTSEQIIANECFRVLCHFADSFVEEASR
LSFINEVLDNFTYKNISVNSLFNTLLASTTRLDFIPLLFRLKVLTQTDLNRFDALKTNERVSWNNWQTNRSLNSGLIDLT
ISGYLRSIRVVGEDNKLKIAELTIPNFYPNTVFHAGNKLLNSRHGLKFEYMEEIVLDEKYNYYITYQKKRAHIYTYQVST
IEHILRRNNEGLQSRGPRYNKMVPVCPVVLSVRDELFRMSLENVFSLNMTNFSMSRLFVSPDEVATVKKAHMSKMMFFSG
PTIKAGIINLTSLMRTQELLTLNYDNLCKSSIVPFCRILECNGDEQGELIFLSDEVMDFTISEEIESMPLFTIRYQKRGT
EIMTYKNAIMKLVSAGVDEIKEVFDFSKQGFYSKKNLGIINTICSIINILETNEWSTILYNSFHIAMLLESMDREFHMFT
LPEAFFINVAGGVVNWTKLLKFIKSLPVIEQEPWSMMMSRFVEKTVYLIEREMNKDVDFTDFLDELEFSSGKSLFTFF
>Q6TQR6 ~~~L~~~RNA-directed RNA polymerase L~~~
MDFLRSLDWTQVIAGQYVSNPRFNISDYFEIVRQPGDGNCFYHSIAELTMPNKTDHSYHYIKRLTESAARKYYQEEPEAR
LVGLSLEDYLKRMLSDNEWGSTLEASMLAKEMGITIIIWTVAASDEVEAGIKFGDGDVFTAVNLLHSGQTHFDALRILPQ
FETDTREALSLMDRVIAVDQLTSSSSDELQDYEDLALALTSAEESNRRSSLDEVTLSKKQAEILRQKASQLSKLVNKSQN
IPTRVGRVLDCMFNCKLCVEISADTLILRPESKEKIGEIMSLRQLGHKLLTRDKQIKQEFSRMKLYVTKDLLDHLDVGGL
LRAAFPGTGIERHMQLLHSEMILDICTVSLGVMLSTFLYGSNNKNKKKFITNCLLSTALSGKKVYKVLGNLGNELLYKAP
RKALATVCSALFGKQINKLQNCFRTISPVSLLALRNLDFDCLSVQDYNGMIENMSKLDNTDVEFNHREIADLNQLTSRLI
TLRKEKDTDLLKQWFPESDLTRRSIRNAANAEEFVISEFFKKKDIMKFISTSGRAMSAGKIGNVLSYAHNLYLSKSSLNM
TSEDISQLLIEIKRLYALQEDSEVEPIAIICDGIESNMKQLFAILPPDCARECEVLFDDIRNSPTHSTAWKHALRLKGTA
YEGLFANCYGWQYIPEDIKPSLTMLIQTLFPDKFEDFLDRTQLHPEFRDLTPDFSLTQKVHFKRNQIPSVENVQISIDAT
LPESVEAVPVTERKMFPLPETPLSEVHSIERIMENFTRLMHGGRLSTKKRDGDPAEQGNQQSITEHESSSISAFKDYGER
GIVEENHMKFSGEDQLETRQLLLVEVGFQTDIDGKIRTDHKKWKDILKLLELLGIKCSFIACADCSSTPPDRWWITEDRV
RVLKNSVSFLFNKLSRNSPTEVTDIVVGAISTQKVRSYLKAGTATKTPVSTKDVLETWEKMKEHILNRPTGLTLPTSLEQ
AMRKGLVEGVVISKEGSESCINMLKENLDRITDEFERTKFKHELTQNITTSEKMLLSWLSEDIKSSRCGECLSNIKKAVD
ETANLSEKIELLAYNLQLTNHCSNCHPNGVNISNTSNVCKRCPKIEVVSHCENKGFEDSNECLTDLDRLVRLTLPGKTEK
ERRVKRNVEYLIKLMMSMSGIDCIKYPTGQLITHGRVSAKHNDGNLKDRSDDDQRLAEKIDTVRKELSESKLKDYSTYAR
GVISNSLKNLSRQGKSKCSVPRSWLEKVLFDLKVPTKDEEVLINIRNSLKARSEFVRNNDKLLIRSKEELKKCFDVQSFK
LKKNKQPVPFQVDCILFKEVAAECMKRYIGTPYEGIVDTLVSLINVLTRFTWFQEVVLYGKICETFLRCCTEFNRSGVKL
VKIRHCNINLSVKLPSNKKENMLCCLYSGNMELLQGPFYLNRRQAVLGSSYLYIVITLYIQVLQQYRCLEVINSVSEKTL
QDIENHSMTLLEDSFREITFALEGRFEESYKIRTSRCRASGNFLNRSSRDHFISVVSGLNLVYGFLIKDNLLANSQQQNK
QLQMLRFGMLAGLSRLVCPNELGKKFSTSCRRIEDNIARLYLQTSIYCSVRDVEDNVKHWKQRDLCPEVTIPCFTVYGTF
VNSDRQLIFDIYNVHIYNKEMDNFDEGCISVLEETAERHMLWELDLMNSLCSDEKKDTRTARLLLGCPNVRKAANREGKK
LLKLNSDTSTDTQSIASEVSDRRSYSSSKSRIRSIFGRYNSQKKPFELRSGLEVFNDPFNDYQQAITDICQFSEYTPNKE
SILKDCLQIIRKNPSHTMGSFELIQAISEFGMSKFPPENIDKARRDPKNWVSISEVTETTSIVASPRTHMMLKDCFKIIL
GTENKKIVKMLRGKLKKLGAISTNIEIGKRDCLDLLSTVDGLTDQQKENIVNGIFEPSKLSFYHWKELVKKNIDEVLLTE
DGNLIFCWLKTISSSVKGSLKKRLKFMNIHSPELMPENCLFSSEEFNELIKLKKLLLNEQQDEQELKQDLLISSWIKCIT
ACKDFASINDKIQKFIYHLSEELYDIRLQHLELSKLKQEHPSVSFTKEEVLIKRLEKNFLKQHNLEIMETVNLVFFAALS
APWCLHYKALESYLVRHPEILDCGSKEDCKLTLLDLSVSKLLVCLYQKDDEELINSSSLKLGFLVKYVVTLFTSNGEPFS
LSLNDGGLDLDLHKTTDEKLLHQTKIVFAKIGLSGNSYDFIWTTQMIANSNFNVCKRLTGRSTGERLPRSVRSKVIYEMV
KLVGETGMAILQQLAFAQALNYEHRFYAVLAPKAQLGGARDLLVQETGTKVMHATTEMFSRNLLKTTSDDGLTNPHLKET
ILNVGLDCLANMRNLDGKPISEGSNLVNFYKVICISGDNTKWGPIHCCSFFSGMMQQVLKNVPDWCSFYKLTFIKNLCRQ
VEIPAGSIKKILNVLRYRLCSKGGVEQHSEEDLRRLLTDNLDSWDGNDTVKFLVTTYISKGLMALNSYNHMGQGIHHATS
SVLTSLAAVLFEELAIFYLKRSLPQTTVHVEHAGSSDDYAKCIVVTGILSKELYSQYDETFWKHACRLKNFTAAVQRCCQ
MKDSAKTLVSDCFLEFYSEFMMGYRVTPAVIKFMFTGLINSSVTSPQSLMQACQVSSQQAMYNSVPLVTNTAFTLLRQQI
FFNHVEDFIRRYGILTLGTLSPFGRLFVPTYSGLVSSAVALEDAEVIARAAQTLQMNSVSIQSSSLTTLDSLGRSRTSST
AEDSSSVSDTTAASHDSGSSSSSFSFELNRPLSETELQFIKALSSLKSTQACEVIQNRITGLYCNSNEGPLDRHNVIYSS
RMADSCDWLKDGKRRGNLELANRIQSVLCILIAGYYRSFGGEGTEKQVKASLNRDDNKIIEDPMIQLIPEKLRRELERLG
VSRMEVDELMPSISPDDTLAQLVAKKLISLNVSTEEYSAEVSRLKQTLTARNVLHGLAGGIKELSLPIYTIFMKSYFFKD
NVFLSLTDRWSTKHSTNYRDSCGKQLKGRIITKYTHWLDTFLGCSVSINRHTTVKEPSLFNPNIRCVNLITFEDGLRELS
VIQSHLKVFENEFTNLNLQFSDPNRQKLRIVESRPAESELEANRAVIVKTKLFSATEQVRLSNNPAVVMGYLLDESAISE
VKPTKVDFSNLLKDRFKIMQFFPSVFTLIKMLTDESSDSEKSGLSPDLQQVARYSNHLTLLSRMIQQAKPTVTVFYMLKG
NLMNTEPTVAELVSYGIKEGRFFRLSDTGVDASTYSVKYWKILHCISAIGCLPLSQADKSSLLMSFLNWRVNMDIRTSDC
PLSSHEASILSEFDGQVIANILASELSSVKRDSEREGLTDLLDYLNSPTELLKKKPYLGTTCKFNTWGDSNRSGKFTYSS
RSGESIGIFIAGKLHIHLSSESVALLCETERQVLSWMSKRRTEVITKEQHQLFLSLLPQSHECLQKHKDGSALSVIPDSS
NPRLLKFVPLKKGLAVVKIKKQILTVKKQVVFDAESEPRLQWGHGCLSIVYDETDTQTTYHENLLKVKHLVDCSTDRKKL
LPQSVFSDSKVVLSRIKFKTELLLNSLTLLHCFLKHAPSDAIMEVESKSSLLHKYLKSGGVRQRNTEVLFREKLNKVVIK
DNLEQGVEEEIEFCNNLTKTVSENPLPLSCWSEVQNYIEDIGFNNVLVNIDRNTVKSELLWKFTLDTNVSTTSTIKDVRT
LVSYVSTETIPKFLLAFLLYEEVLMNLINQCKAVKELINSTGLSDLELESLLTLCAFYFQSECSKRDGPRCSFAALLSLI
HEDWQRIGKNILVRANNELGDVSLKVNIVLVPLKDMSKPKSERVVMARRSLNHALSLMFLDEMSLPELKSLSVNCKMGNF
EGQECFEFTILKDNSARLDYNKLIDHCVDMEKKREAVRAVEDLILMLTGRAVKPSAVTQFVHGDEQCQEQISLDDLMAND
TVTDFPDREAEALKTGNLGFNWDSD
>Q66431 ~~~L~~~RNA-directed RNA polymerase L~~~
MDFLDSLIWERVVDEQYITNPTFCVSDYFEVIRQPGDGNCFYHSIAELFFDVKTPSSFRKVKEHLQLAAEVYYDTEPEAV
GTGISKDEYIKVAMKDNEWGGSLEASMLSKHLQTTIILWVVNSTEQVTAAIKFGPGRVSTALNLMHVGRTHFDALRIIEQ
LENNQPQDRNRLDIADRIAAAEVYVRQSIEDNLQEDEFFDYAREDEISEDVSAPGGSREATELKKKAILLNKTVKRGENI
PIRVGRVLDCLFSCKIAVSLDEGLLYLRPETRESEATSISLRQLGHKLLTRDRHIKMEYARSKLYVTRDLIDHLDIGGLL
RSSFPGLGLERYIQLLHSELVLDLVTVVLAVLLSTFLYGSNNKNKKQFITNCLLNTKLSGKRVFKALSKLTGQMLYRTPK
RAVSIVSQELYGKLMLKVKNNLEGMGPISMLALRNLNFDNMQLQDYLEMLSEMSKIDNSDVEYTHREISDLHTLVERLSK
LQKSQDVNELKLWFKEEVLTKRSQRSVGNAFEFLINDYFKKKDIMKFVSTSGKASSTGNIGNVLSYAHNLYLSKESLRMT
SEDVTQLLIEIRKLHKLQGDLSIEPVAIICDKLEDQFRKLFRELPEECSSECQTLFNDIRNSPSHSVAWKHALRLKGTAY
EGMFAKQYGWSYISEDIKPSLTMIVQTLFPESFEAFLDRTQLHPEFRDLTPDYALTQKIFFPRNTIPRTENRQLAIDVSL
EGSVEAVPIVEKRMFPLPEVPIGEANSISRVMNIFKEKREESMQKKLEHDRQAEANRLKSAGLSASKAEQEVCNSAQDRK
EEKERTTEPAGKQQRTEDLVVIEGNQDEGDSDPQKKVDEKTVPGESKQHSKSSGSSSTNQMSQKVVDVPSVEDSSDQAPG
DFPDYGYYFKRIVMDESGTVLTEEAQLEKRQLLFIEVGYQTDVDGKITTDYKKWKDILRLLELLNIKCSFIACADCSSTP
SNNWWISEDKVRLLKNSISHLFSKLTQNSPADVTDIVVGSISTQKVRSYLKSGTATKTPISLKDVQETWSKMKDYIVNRP
TGISLNKELVGALYQGLVEGAIISKEGTVNLIQMLKDKQERITDEFERTKLKHEVNEDVKTSEKLLLGWLMEDLKGCRCM
GCLTKIKELSESMSVNQDRLEYLSTNCQTKSHCTECHPRSLEYRNISNVDNRVPSMQRVSHSRNEGFEDTNETLTELDRL
VRLTLPGKTEKERRVKRNVEGLIRFMMQQSSLDCIKLPSGQIIAHRCSKKFKNSSEAEEKCNERFERLMKELSEQKLKPY
SDHVRKTITSSLKKTDKQAGSKCAVPRLWLETLIRDLRVPTKDEEILLNIRTSMQSKTNFIRNNDKLIIRSNKEIADYLE
TKRKNLLSEKASDKIFSSDCILFKEVIAEALRRYYSTPYEGVPETIVKLINFLCTFGWFQEVVLYSKICETFLRCCTEFS
RSGIKLVKVRHCDTNLSIKLPSNKKENMLCCLYDKNMSLLKGPFFLNRRQAILGSAYPYILITLYIQVLQQHRCLEVLNS
VNDRVVGNINTCTSNLLNTVKAELTLVNSGLFEKAYECRTEQCRLGGNFLNRSSRDHFISTVSGLNVVYGALIKDNLLAN
SQPQNKQLQMLRFGMLCGLSRLSSALELGKKFSTSCRRIEDNIMRLYLQSTIYSANRDVSQNVQNWKMKDLCPDITIPCF
SVYGLFVNSDRQRIFDIYNVHIYNKEMDNFDEGCITVLEETAERHMLWELDLLRSLEGDTRDVRAARLLLGCPNIRKATD
KDGNRLMKKGITDDWREEAGSDSSSISGRRSYASSGTRVKSMFGKYNSSQKPFELKPGLEVVNDPLHDYKQAVQDSFCYS
EYTPNTESVLKDCIHIIRTNPSHTMGSYELIQAVTENARRKYPPENIERARKDPKNWVSISEVTETTSIVSQPRTHFMLK
DCYKVLLGTENKKIVKMLRGKLKKLGAMRTDIEIGKKDCLDLLTTVDGLSEEQCKNIVNGIFEPSKLSFYHWKDLLKKEL
SEVLLTDDGNYIYCWLKTLSSMVKHSLKKDLRFMTGKNSLDIKPEMFTDEEYSALNIMKLELLGEHTDGIQGKTDFLLSS
WKKCALKPKEGQSILNVGLNSLAALHDELYDIRLQHLELTRIKKENPTVSFTKEEILVKRLEKGFLNKYKKEVMEAVNLI
FYCCLTAPWCLHYKSLEAYLVRHPEILETECIKENDIPLLDLTVTSLIRSLIDDIEGESSFNDSSDIKVRFAVKYLITLF
TANGEPFSLSLNDGGLNDDLQLTTDEKLLYQTKKVFAKLGLSGNNYDFIWTLQMIANSNFNVCKRLTGRTTGERLPRSVR
SKVIYEMVKLVGETGMAILQQLAFSQALNYDHRFYAVLAPKAQLGGSRDLLVQETGTKVIHATTEMFSRNLLKTTSDDGL
TNPHLKETRLNIGLDMLSTARALDGKQVSDDSNLLNFFKTVCISGDNTKWGPIHCCSFFSGMMQQLLKDVQDWSSFYKLT
FIKNLCRQIEIPAPSIRKILNVLRFKLSDKGGVEKLSEEAIRSELINNLAEWEGNDTVKFLITTYISKGIMAMNSYNHMG
QGIHHATSSLLTSMMAETFEELAVDYMKKHFPGLTVNVDHAGSSDDYAKCIIVSGLVSKDMYKRYDGVFWRHMCRLKNFL
AAVRRCCQMKDSAKTLVGDCFLEFYSEFMMGNRVTPAVIKFIFTGLINSSVTSPQSLVQACHVSSQQGMYNSVPLVTNAA
FTILRQQIFYNHVEDFIRRYGLITLGAVSPFGRLFLPRFSGLVSSSVALEDSETISKAAAEINSNDIFFNTSSLSNLDKL
EQSPDSSGLDDDSVVSTTTVESSDSKGSSSSFTFDLNRPLSETEVKFLKLLRELTSTTACEMLQEKINTLYNDSREGPLD
RHNILQNCRLSESCDWLLDGKKRGLLELSRRMSCLLNVLIAGYYRSFGSEGTEKQVKASWSRDDNRVIEDPMIQLIPEKL
RRELERLGLSRMEVDELMPAVGPDESLSQLVAKKLISLNVSTEEYSAEVSRLKQTLTARNVLHGLAGGIKELSLPIYTIF
MKPYFFKDNVFLDLEDRWSSRHSTNYRDSTGKMLTGKVVTKFTHWLDTFLSCVVSANRSQEIKECSLFNPNLRCVNIMVK
GNGIKELSYIRSHLSVLSVEFENLNLQFSDVNRQKLKIVESRPPECELEANKAVIIKSKLFSAVEHVRLSNNPAVVMGYL
LEESSISEVKPTKVDFSNLLKHRFKLMQFFPSVFTLLRSLQSESKELEKLGEPVDMHQVSKYANHLTLLCRMIQQSKPSL
TVFYMLKGSQMNTEPTVSELVSYGIKEGRFLKLPEIGLDASTYSVRYWKILHCISAIGELPLSSEDKTSLLISFLNWKVT
SDCVDDCCPLEKYDKAIVSEFSGQVLINTLASELSSVRKDQEREGLTDLIDYINSPSELLKKKPYLGTTCRFQCWGEGAK
SGKFTYSSRTGEAIGIFVAGKLHIHLTSDSPGLLCEVERQVLSWLGKRRTDVLTKEQHQFFLDFLPNLSEVVQKNRDGAI
LGVTIDSTNVRMLKYVPPKRNTPVIKIKKQILTVKKQTTLDVESEPRIVWGHGQLSIVYDECETETTYHENLIKVKKLVD
LASGTTDKLPTAIFSDTRITLARVKFKTELLLNSLCLLHCFLKHTSQDAIQEVESKCNVLERYLRSGGVQFRPMSESLDK
KVTKLPLQCQSDKDVDKEINFCEDLTRVFSNENVPLSSWSEVQSYIEEVGFGNVLVHIEKNPTRSDLIWRFSIDSISGNF
GPIKDIRTLVTYMSTETVPKFLLPFLLFEEQLKHLIAGCVELRDALNSSGINDREIAIVALFTCFYYQSDSVKRQGPVCS
ISSFCSLIGDDLLPLDNRLQARVLPEQDNVKLHFKLNLTTDSALGKKDKAIQAKKIISRYLRLIFTEDDMDLKRLKSDAT
KVKLSSEKECEFLEFCLHSDLSYALNYRVLLEHLIDLEDRAKKTACVLIEEFILMLTGRLMISSTIDSDSKRTLEDDALC
LEDLLDSDNEASSSKSDNEEQIALQTGKFNFNWDSD
>Q5XX01 ~~~L~~~RNA-directed RNA polymerase L~~~
MMATQHTQYPDARLSSPIVLDQCDLVTRACGLYSEYSLNPKLKTCRLPKHIYRLKYDTIVLRFISDVPVATIPIDYIAPM
LINVLADSKNVPLEPPCLSFLDEIVNYTVQDAAFLNYYMNQIKTQEGVITDQLKQNIRRVIHKNRYLSALFFWHDLAILT
RRGRMNRGNVRSTWFVTNEVVDILGYGDYIFWKIPIALLPMNTANVPHASTDWYQPNIFKEAIQGHTHIISVSTAEVLIM
CKDLVTSRFNTLLIAELARLEDPVSADYPLVDNIQSLYNAGDYLLSILGSEGYKIIKYLEPLCLAKIQLCSQYTERKGRF
LTQMHLAVIQTLRELLLNRGLKKSQLSKIREFHQLLLRLRSTPQQLCELFSIQKHWGHPVLHSEKAIQKVKNHATVLKAL
RPIIIFETYCVFKYSVAKHFFDSQGTWYSVISDRCLTPGLNSYIRRNQFPPLPMIKDLLWEFYHLDHPPLFSTKIISDLS
IFIKDRATAVEQTCWDAVFEPNVLGYSPPYRFNTKRVPEQFLEQEDFSIESVLQYAQELRYLLPQNRNFSFSLKEKELNV
GRTFGKLPYLTRNVQTLCEALLADGLAKAFPSNMMVVTEREQKESLLHQASWHHTSDDFGEHATVRGSSFVTDLEKYNLA
FRYEFTAPFIKYCNQCYGVRNVFDWMHFLIPQCYMHVSDYYNPPHNVTLENREYPPEGPSAYRGHLGGIEGLQQKLWTSI
SCAQISLVEIKTGFKLRSAVMGDNQCITVLSVFPLESSPNEQERCAEDNAARVAASLAKVTSACGIFLKPDETFVHSGFI
YFGKKQYLNGIQLPQSLKTAARMAPLSDAIFDDLQGTLASIGTAFERSISETRHILPCRVAAAFHTYFSVRILQHHHLGF
HKGSDLGQLAINKPLDFGTIALSLAVPQVLGGLSFLNPEKCLYRNLGDPVTSGLFQLKHYLSMVGMSDIFHALIAKSPGN
CSAIDFVLNPGGLNVPGSQDLTSFLRQIVRRSITLSARNKLINTLFHASADLEDELVCKWLLSSTPVMSRFAADIFSRTP
SGKRLQILGYLEGTRTLLASKMISNNAETPILERLRKITLQRWNLWFSYLDHCDPALMEAIQPIKCTVDIAQILREYSWA
HILDGRQLIGATLPCIPEQFQTTWLKPYEQCVECSSTNNSSPYVSVALKRNVVSAWPDASRLGWTIGDGIPYIGSRTEDK
IGQPAIKPRCPSAALREAIELTSRLTWVTQGSANSDQLIRPFLEARVNLSVQEILQMTPSHYSGNIVHRYNDQYSPHSFM
ANRMSNTATRLMVSTNTLGEFSGGGQAARDSNIIFQNVINFAVALYDIRFRNTCTSSIQYHRAHIHLTNCCTREVPAQYL
TYTTTLNLDLSKYRNNELIYDSDPLRGGLNCNLSIDSPLMKGPRLNIIEDDLIRLPHLSGWELAKTVLQSIISDSSNSST
DPISSGETRSFTTHFLTYPKIGLLYSFGALISFYLGNTILCTKKIGLTEFLYYLQNQIHNLSHRSLRIFKPTFRHSSVMS
RLMDIDPNFSIYIGGTAGDRGLSDAARLFLRIAISTFLSFVEEWVIFRKANIPLWVIYPLEGQRSDPPGEFLNRVKSLIV
GTEDDKNKGSILSRSGEKCSSNLVYNCKSTASNFFHASLAYWRGRHRPKKTIGATNATTAPHIILPLGNSDRPPGLDLNR
NNDTFIPTRIKQIVQGDSRNDRTTTTRFPPKSRSTPTSATEPPTKMYEGSTTHQGKLTDTHLDEDHNAKEFPSNPHRLVV
PFFKLTKDGEYSIEPSPEESRSNIKGLLQHLRTMVDTTIYCRFTGIVSSMHYKLDEVLWEYNKFESAVTLAEGEGSGALL
LIQKYGVKKLFLNTLATEHSIESEVISGYTTPRMLLPIMPKTHRGELEVILNNSASQITDITHRDWFSNQKNRIPNDADI
ITMDAETTENLDRSRLYEAVYTIICNHINPKTLKVVILKVFLSDLDGMCWINNYLAPMFGSGYLIKPITSSAKSSEWYLC
LSNLLSTLRTTQHQTQANCLHVVQCALQQQVQRGSYWLSHLTKYTTSRLHNSYIAFGFPSLEKVLYHRYNLVDSRNGPLV
SITRHLALLQTEIRELVTDYNQLRQSRTQTYHFIKTSKGRITKLVNDYLRFELVIRALKNNSTWHHELYLLPELIGVCHR
FNHTRNCTCSERFLVQTLYLHRMSDAEIKLMDRLTSLVNMFPEGFRSSSV
>Q05318 ~~~L~~~RNA-directed RNA polymerase L~~~
MATQHTQYPDARLSSPIVLDQCDLVTRACGLYSSYSLNPQLRNCKLPKHIYRLKYDVTVTKFLSDVPVATLPIDFIVPVL
LKALSGNGFCPVEPRCQQFLDEIIKYTMQDALFLKYYLKNVGAQEDCVDEHFQEKILSSIQGNEFLHQMFFWYDLAILTR
RGRLNRGNSRSTWFVHDDLIDILGYGDYVFWKIPISMLPLNTQGIPHAAMDWYQASVFKEAVQGHTHIVSVSTADVLIMC
KDLITCRFNTTLISKIAEIEDPVCSDYPNFKIVSMLYQSGDYLLSILGSDGYKIIKFLEPLCLAKIQLCSKYTERKGRFL
TQMHLAVNHTLEEITEMRALKPSQAQKIREFHRTLIRLEMTPQQLCELFSIQKHWGHPVLHSETAIQKVKKHATVLKALR
PIVIFETYCVFKYSIAKHYFDSQGSWYSVTSDRNLTPGLNSYIKRNQFPPLPMIKELLWEFYHLDHPPLFSTKIISDLSI
FIKDRATAVERTCWDAVFEPNVLGYNPPHKFSTKRVPEQFLEQENFSIENVLSYAQKLEYLLPQYRNFSFSLKEKELNVG
RTFGKLPYPTRNVQTLCEALLADGLAKAFPSNMMVVTEREQKESLLHQASWHHTSDDFGEHATVRGSSFVTDLEKYNLAF
RYEFTAPFIEYCNRCYGVKNVFNWMHYTIPQCYMHVSDYYNPPHNLTLENRDNPPEGPSSYRGHMGGIEGLQQKLWTSIS
CAQISLVEIKTGFKLRSAVMGDNQCITVLSVFPLETDADEQEQSAEDNAARVAASLAKVTSACGIFLKPDETFVHSGFIY
FGKKQYLNGVQLPQSLKTATRMAPLSDAIFDDLQGTLASIGTAFERSISETRHIFPCRITAAFHTFFSVRILQYHHLGFN
KGFDLGQLTLGKPLDFGTISLALAVPQVLGGLSFLNPEKCFYRNLGDPVTSGLFQLKTYLRMIEMDDLFLPLIAKNPGNC
TAIDFVLNPSGLNVPGSQDLTSFLRQIVRRTITLSAKNKLINTLFHASADFEDEMVCKWLLSSTPVMSRFAADIFSRTPS
GKRLQILGYLEGTRTLLASKIINNNTETPVLDRLRKITLQRWSLWFSYLDHCDNILAEALTQITCTVDLAQILREYSWAH
ILEGRPLIGATLPCMIEQFKVFWLKPYEQCPQCSNAKQPGGKPFVSVAVKKHIVSAWPNASRISWTIGDGIPYIGSRTED
KIGQPAIKPKCPSAALREAIELASRLTWVTQGSSNSDLLIKPFLEARVNLSVQEILQMTPSHYSGNIVHRYNDQYSPHSF
MANRMSNSATRLIVSTNTLGEFSGGGQSARDSNIIFQNVINYAVALFDIKFRNTEATDIQYNRAHLHLTKCCTREVPAQY
LTYTSTLDLDLTRYRENELIYDSNPLKGGLNCNISFDNPFFQGKRLNIIEDDLIRLPHLSGWELAKTIMQSIISDSNNSS
TDPISSGETRSFTTHFLTYPKIGLLYSFGAFVSYYLGNTILRTKKLTLDNFLYYLTTQIHNLPHRSLRILKPTFKHASVM
SRLMSIDPHFSIYIGGAAGDRGLSDAARLFLRTSISSFLTFVKEWIINRGTIVPLWIVYPLEGQNPTPVNNFLYQIVELL
VHDSSRQQAFKTTISDHVHPHDNLVYTCKSTASNFFHASLAYWRSRHRNSNRKYLARDSSTGSSTNNSDGHIERSQEQTT
RDPHDGTERNLVLQMSHEIKRTTIPQENTHQGPSFQSFLSDSACGTANPKLNFDRSRHNVKFQDHNSASKREGHQIISHR
LVLPFFTLSQGTRQLTSSNESQTQDEISKYLRQLRSVIDTTVYCRFTGIVSSMHYKLDEVLWEIESFKSAVTLAEGEGAG
ALLLIQKYQVKTLFFNTLATESSIESEIVSGMTTPRMLLPVMSKFHNDQIEIILNNSASQITDITNPTWFKDQRARLPKQ
VEVITMDAETTENINRSKLYEAVYKLILHHIDPSVLKAVVLKVFLSDTEGMLWLNDNLAPFFATGYLIKPITSSARSSEW
YLCLTNFLSTTRKMPHQNHLSCKQVILTALQLQIQRSPYWLSHLTQYADCELHLSYIRLGFPSLEKVLYHRYNLVDSKRG
PLVSITQHLAHLRAEIRELTNDYNQQRQSRTQTYHFIRTAKGRITKLVNDYLKFFLIVQALKHNGTWQAEFKKLPELISV
CNRFYHIRDCNCEERFLVQTLYLHRMQDSEVKLIERLTGLLSLFPDGLYRFD
>Q6Q305 2.7.7.48~~~RdRp~~~RNA-directed RNA polymerase L~~~
MENIRKLEAQRRKEYLRAETEIRNNNVFEKDILQKFLHIVGKKQRHYTISSKKKEVQDIFNKCSTNPGFTKDVIDLAERI
LYAPPNLNQIDFAMTIVSLIEMQRHDDLIHKLNELLVRSGMKVISFEFKLCDHFPFTNSILTPDILFEDPYGSRYILEVK
VRNKHTDLEHYYMRYKKVVGVHAKVGVFNLSQSGYMQHGDYKLSEKINLESDDFDDILLCVELAGKIREKYIQYPQYFLY
TLQSEVTNPDSFLDGFKTRLSDLSMFEEIKSGFGDYWNDITYHMDNYSLIDNHDEVTEDLLNSTDDLTQYCNDLYDEFLD
HSQSYSRQGKYGKTILRNSGLDNIIDKKNRSKYTITSKLKPSVYIPITKTIKLDTYGGSRLKFYKDAFINIKCTGDSYSR
SAYNLVDNVFNTSSIDLLMTKDDKIDPSLYYEVLDPGFIAHLHDDQVKYKKIAKVTNITSDMTILANNSFSIHCHTDQRL
KDNICGYDKKHYDSTQKAKECLDFSNSSLLLPSLGVHLSAIFKSEHHAGVYWNDLVTLGTDNLNHMTSDIPEEAQTKFLE
HLYNSHIIFKAIISLNTINSHKFRLLQSPDPGTIIILLPNSDGLKGAPLRYFVVSILQKQDNDSIEANKLLGIYHSHTES
KKYKIMLSKVISLDITRLKLLSNSFVKYSLLISYYSQFKKSLKFDTHMLSWMLSQVTTIASLSITDVYKNFIMAIYSDYS
NIDDLINDKLECRPRTLGHVFVMKHLFQGITSAVEQLGKINKNKLIADVNDEGELISTGFDPNLRLKLPISQLSTNNPKE
IIHESFILFYLGNKGLHGSPQELLNLYYTPMQFESEYTKMMTDCGLYCQELGNNGNLSFSFQAMFLTSKVAYAKLLNNTD
EIRRSLVKEMKMDEPIMSIKQFSSTKSMVSNSVPDMSIKDCNLNKNIDVIQLERYIDTSLISDPMKYITLMNNSIESINQ
ERLLIYKDKLNKGIVLLPKLILTTFKGKSFIGLENHYYTKLVSGDYIKQTNTKVFDEFYRLTDEIQEEKLRGFYKGYITE
GDLLVRIFNKDQRTTDDREIYTGNAQVRLCLYPLEMTFKSICKKIPEEAITISGDQKQRKLLEQRLALIKTKRQFNKSGY
KTEIYSVSSDASKWSARDLLPKFIISIATNPYLTSDEKYFLVYLLVRYYDKKIVLTDSAFSNALRFSREDINGKYEEMTN
NFTQNWFNVRSNWLQGNLNMTSSFVHHCSTIMTDTLLSISAKHNGFEAVMTSMVHSDDSTYDFLIAKNSKTSSYINNEAN
MGRFIISLITYSNKKHCITLNEKKTYISTFYKEFLSTTIVSNELFFFYMADLMPISSDTSYKSPLEDLASYTGYINNSFS
HACPIQILKCAITLLNHLTLSTYNMQYTSEKNPRCNIPNSTDLPIQIYPRYKLPLSLAGCIPYYSSDAYNILDDIIKTLE
KNKVIKNSLLEDVIDDETLDEYITLVNKQKPEYAKYIQACLLTMDYTQYERDDEDPYNIVDYDLSQKSIINVASINKGSR
IKKTYTYKKYLENETDIRLTSCVNPMWCISKPKDEVLIKNPILANYMNPNFKDSLIFSKSALDYGRRIIGSNKSMDTLSS
HAFEKEKKQGIKTIYKKLDDKISTVEISKQSLQRFLECIYSVIKKSLVALQVYYSKVQVLVKTRPEFTKVIMPRSVYAEE
YGKNSNTSMVENLLVEQYCEIEQVDSKVEKFISFCKHVLQRCGDIKIYRDPEDIDDDFRKYIEFKYTLKDATMGLIQPHQ
HLAEYAFDVYNNKLIFQGLMVRYYIDICETISNPSYNIPSYTSPNSIIMTLDSLMKRDEISSKIYISHIRTNRFDEYWLS
RFGMYVYENYFVKYKLGYRIKIAANEKLMPTMKKVRNLREPFKFICSLVANDPSLFIQMTESPDFQISGWKYSDIIAEMK
STTDFSYNLFLYMMNEINFQTLMRVMNLNRRVWNHWLMKTDSEPSDPNASIALYMYQSTVVKVQTKTIGGGVTFSMLLLR
HGMQHRQAFDEISKKIASDYAPQLRIANITPQTSFGRLQFCVNEYGRTVKPGSYRSSCICNVNIAALTDLKPDIAYKENT
INQIVTIISPTFEGEFVFKLNTYCDSEYYTCVMLENLDLNRVMILDHLCRGKYLIENPEYFTEISDQITPGACLALFSNN
VNNKLWSNTIDTSKFAKLVHIGNYLKTEHEASIVTKLCDSLVAICALNGIDHTLSLKPDNFIKSLRQYKLSYGFHEEFYN
NYKKNEREPYTELIMAIASTAGDPFQKVILAIITIFKAYTDLFISYKTDEVEF
>P23456 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDKYREIHNKLKEFSPGTLTAVECIDYLDRLYAVRHDIVDQMIKHDWSDNKDSEEAIGKVLLFAGVPSNIITALEKKIIP
NHPTGKSLKAFFKMTPDNYKISGTTIEFVEVTVTADVDKGIREKKLKYEAGLTYIEQELHKFFLKGEIPQPYKITFNVVA
VRTDGSNITTQWPSRRNDGVVQYMRLVQAEISYVREHLIKTEERAALEAMFNLKFNISTHKSQPYYIPDYKGMEPIGANI
EDLVDYSKDWLSRARNFSFFEVKGTAVFECFNSNEANHCQRYPMSRKPRNFLLIQCSLITSYKPATTLSDQIDSRRACSY
ILNLIPDTPASYLIHDMAYRYINLTREDMINYYAPRIQFKQTQNVREPGTFKLTSSMLRAESKAMLDLLNNHKSGEKHGA
QIESLNIASHIVQSESVSLITKILSDLELNITEPSTQEYSTTKHTYVDTVLDKFFQNETQKYLIDVLKKTTAWHIGHLIR
DITESLIAHSGLKRSKYWSLHSYNNGNVILFILPSKSLEVAGSFIRFITVFRIGPGLVDKDNLDTILIDGDSQWGVSKVM
SIDLNRLLALNIAFEKALIATATWFQYYTEDQGQFPLQYAIRSVFANHFLLAICQKMKLCAIFDNLRYLIPAVTSLYSGF
PSLIEKLFERPFKSSLEVYIYYNIKSLLVALAQNNKARFYSKVKLLGLTVDQSTVGASGVYPSFMSRIVYKHYRSLISEV
TTCFFLFEKGLHGNMNEEAKIHLETVEWALKFREKEEKYGESLVENGYMMWELRANAELAEQQLYCQDAIELAAIELNKV
LATKSSVVANSILSKNWEEPYFSQTRNISLKGMSGQVQEDGHLSSSVTIIEAIRYLSNSRHNPSLLKLYEETREQKAMAR
IVRKYQRTEADRGFFITTLPTRCRLEIIEDYYDAIAKNISEEYISYGGEKKILAIQGALEKALRWASGESFIELSNHKFI
RMKRKLMYVSADATKWSPGDNSAKFRRFTSMLHNGLPNNKLKNCVIDALKQVYKTDFFMSRKLRNYIDSMESLDPHIKQF
LDFFPDGHHGEVKGNWLQGNLNKCSSLFGVAMSLLFKQVWTNLFPELDCFFEFAHHSDDALFIYGYLEPVDDGTDWFLFV
SQQIQAGHLHWFSVNTEMWKSMFNLHEHILLLGSIKISPKKTTVSPTNAEFLSTFFEGCAVSIPFVKILLGSLSDLPGLG
YFDDLAAAQSRCVKALDLGASPQVAQLAVALCTSKVERLYGTAPGMVNHPAAYLQVKHTDTPIPLGGNGAMSIMELATAG
IGMSDKNLLKRALLGYSHKRQKSMLYILGLFKFLMKLSDETFQHERLGQFSFIGKVQWKIFTPKSEFEFADMYTSKFLEL
WSSQHVTYDYIIPKGRDNLLIYLVRKLNDPSIVTAMTMQSPLQLRFRMQAKQHMKVCRLDGEWVTFREVLAAANSFAENY
SATSQDMDLFQTLTSCTFSKEYAWKDFLNGIHCDVIPTKQVQRAKVARTFTVREKDQIIQNSIPAVIGYKFAVTVEEMSD
VLDTAKFPDSLSVDLKTMKDGVYRELGLDISLPDVMKRIAPMLYKSSKSRVVIVQGNVEGTAEAICRYWLKSMSLVKTIR
VKPHKEVLQAVSIFNRKEDIGQQKDLAALKLCIEVWRWCKANSAPYRDWFQALWFEDKTFSEWLDRFCRVGVPPIDPEIQ
CAALMIADIKGDYSVLQLQANRRAYSGKQYDAYCVQTYNEVTKLYEGDLRVTFNFGLDCARLEIFWDKKAYILETSITQK
HVLKIMMDEVSKELIKCGMRFNTEQVQGVRHMVLFKTESGFEWGKPNIPCIVYKNCVLRTSLRTTQAINHKFMITIKDDG
LRAIAQHDEDSPRFLLAHAFHTIRDIRYQAVDAVSNVWFIHKGVKLYLNPIISSGLLENFMKNLPAAIPPAAYSLIMNRA
KISVDLFMFNDLLKLINPRNTLDLSGLETTGDEFSTVSSMSSRLWSEEMSLVDDDEELDDEFTIDLQDVDFENIDIEADI
EHFLQDESSYTGDLLISTEETESKKMRGIVKILEPVRLIKSWVSRGLSIEKVYSPVNIILMSRYISKTFNLSTKQVSLLD
PYDLTELESIVRGWGECVIDQFESLDREAQNMVVNKGICPEDVIPDSLFSFRHTMVLLRRLFPQDSISSFY
>Q6WB93 ~~~L~~~RNA-directed RNA polymerase L~~~
MDPLNESTVNVYLPDSYLKGVISFSETNAIGSCLLKRPYLKNDNTAKVAIENPVIEHVRLKNAVNSKMKISDYKVVEPVN
MQHEIMKNVHSCELTLLKQFLTRSKNISTLKLNMICDWLQLKSTSDDTSILSFIDVEFIPSWVSNWFSNWYNLNKLILEF
RREEVIRTGSILCRSLGKLVFIVSSYGCIVKSNKSKRVSFFTYNQLLTWKDVMLSRFNANFCIWVSNSLNENQEGLGLRS
NLQGMLTNKLYETVDYMLSLCCNEGFSLVKEFEGFIMSEILRITEHAQFSTRFRNTLLNGLTDQLTKLKNKNRLRVHSTV
LENNDYPMYEVVLKLLGDTLRCIKLLINKNLENAAELYYIFRIFGHPMVDERDAMDAVKLNNEITKILRLESLTELRGAF
ILRIIKGFVDNNKRWPKIKNLKVLSKRWTMYFKAKNYPSQLELSEQDFLELAAIQFEQEFSVPEKTNLEMVLNDKAISPP
KRLIWSVYPKNYLPETIKNRYLEETFNASDSLKTRRVLEYYLKDNKFDQKELKSYVVRQEYLNDKEHIVSLTGKERELSV
GRMFAMQPGKQRQIQILAEKLLADNIVPFFPETLTKYGDLDLQRIMEIKSELSSIKTRRNDSYNNYIARASIVTDLSKFN
QAFRYETTAICADVADELHGTQSLFCWLHLIVPMTTMICAYRHAPPETKGEYDIDKIEEQSGLYRYHMGGIEGWCQKLWT
MEAISLLDVVSVKTRCQMTSLLNGDNQSIDVSKPVKLSEGLDEVKADYRLAVKMLKEIRDAYRNIGHKLKEGETYISRDL
QFISKVIQSEGVMHPTPIKKVLRVGPWINTILDDIKTSAESIGSLCQELEFRGESIIVSLILRNFWLYNLYMHESKQHPL
AGKQLFKQLNKTLTSVQRFFEIKRENEVVDLWMNIPMQFGGGDPVVFYRSFYRRTPDFLTEAISHVDILLKISANIKNET
KVSFFKALLSIEKNERATLTTLMRDPQAVGSERQAKVTSDINRTAVTSILSLSPNQLFSDSAIHYSRNEEEVGIIAENIT
PVYPHGLRVLYESLPFHKAEKVVNMISGTKSITNLLQRTSAINGEDIDRAVSMMLENLGLLSRILSVVVDSIEIPIKSNG
RLICCQISRTLRETSWNNMEIVGVTSPSITTCMDVIYATSSHLKGIIIEKFSTDRTTRGQRGPKSPWVGSSTQEKKLVPV
YNRQILSKQQREQLEAIGKMRWVYKGTPGLRRLLNKICLGSLGISYKCVKPLLPRFMSVNFLHRLSVSSRPMEFPASVPA
YRTTNYHFDTSPINQALSERFGNEDINLVFQNAISCGISIMSVVEQLTGRSPKQLVLIPQLEEIDIMPPPVFQGKFNYKL
VDKITSDQHIFSPDKIDMLTLGKMLMPTIKGQKTDQFLNKRENYFHGNNLIESLSAALACHWCGILTEQCIENNIFKKDW
GDGFISDHAFMDFKIFLCVFKTKLLCSWGSQGKNIKDEDIVDESIDKLLRIDNTFWRMFSKVMFEPKVKKRIMLYDVKFL
SLVGYIGFKNWFIEQLRSAELHEIPWIVNAEGDLVEIKSIKIYLQLIEQSLFLRITVLNYTDMAHALTRLIRKKLMCDNA
LLTPISSPMVNLTQVIDPTTQLDYFPKITFERLKNYDTSSNYAKGKLTRNYMILLPWQHVNRYNFVFSSTGCKVSLKTCI
GKLMKDLNPKVLYFIGEGAGNWMARTACEYPDIKFVYRSLKDDLDHHYPLEYQRVIGELSRIIDSGEGLSMETTDATQKT
HWDLIHRVSKDALLITLCDAEFKDRDDFFKMVILWRKHVLSCRICTTYGTDLYLFAKYHAKDCNVKLPFFVRSVATFIMQ
GSKLSGSECYILLTLGHHNSLPCHGEIQNSKMKIAVCNDFYAAKKLDNKSIEANCKSLLSGLRIPINKKELDRQRRLLTL
QSNHSSVATVGGSKIIESKWLTNKASTIIDWLEHILNSPKGELNYDFFEALENTYPNMIKLIDNLGNAEIKKLIKVTGYM
LVSKK
>P28887 ~~~L~~~RNA-directed RNA polymerase L~~~
MDPIINGNSANVYLTDSYLKGVISFSECNALGSYIFNGPYLKNDYTNLISRQNPLIEHMNLKKLNITQSLISKYHKGEIK
LEEPTYFQSLLMTYKSMTSSEQIATTNLLKKIIRRAIEISDVKVYAILNKLGLKEKDKIKSNNGQDEDNSVITTIIKDDI
LSAVKDNQSHLKADKNHSTKQKDTIKTTLLKKLMCSMQHPPSWLIHWFNLYTKLNNILTQYRSNEVKNHGFTLIDNQTLS
GFQFILNQYGCIVYHKELKRITVTTYNQFLTWKDISLSRLNVCLITWISNCLNTLNKSLGLRCGFNNVILTQLFLYGDCI
LKLFHNEGFYIIKEVEGFIMSLILNITEEDQFRKRFYNSMLNNITDAANKAQKNLLSRVCHTLLDKTVSDNIINGRWIIL
LSKFLKLIKLAGDNNLNNLSELYFLFRIFGHPMVDERQAMDAVKINCNETKFYLLSSLSMLRGAFIYRIIKGFVNNYNRW
PTLRNAIVLPLRWLTYYKLNTYPSLLELTERDLIVLSGLRFYREFRLPKKVDLEMIINDKAISPPKNLIWTSFPRNYMPS
HIQNYIEHEKLKFSESDKSRRVLEYYLRDNKFNECDLYNCVVNQSYLNNPNHVVSLTGKERELSVGRMFAMQPGMFRQVQ
ILAEKMIAENILQFFPESLTRYGDLELQKILELKAGISNKSNRYNDNYNNYISKCSIITDLSKFNQAFRYETSCICSDVL
DELHGVQSLFSWLHLTIPHVTIICTYRHAPPYIGDHIVDLNNVDEQSGLYRYHMGGIEGWCQKLWTIEAISLLDLISLKG
KFSITALINGDNQSIDISKPIRLMEGQTHAQADYLLALNSLKLLYKEYAGIGHKLKGTETYISRDMQFMSKTIQHNGVYY
PASIKKVLRVGPWINTILDDFKVSLESIGSLTQELEYRGESLLCSLIFRNVWLYNQIALQLKNHALCNNKLYLDILKVLK
HLKTFFNLDNIDTALTLYMNLPMLFGGGDPNLLYRSFYRRTPDFLTEAIVHSVFILSYYTNHDLKDKLQDLSDDRLNKFL
TCIITFDKNPNAEFVTLMRDPQALGSERQAKITSEINRLAVTEVLSTAPNKIFSKSAQHYTTTEIDLNDIMQNIEPTYPH
GLRVVYESLPFYKAEKIVNLISGTKSITNILEKTSAIDLTDIDRATEMMRKNITLLIRILPLDCNRDKREILSMENLSIT
ELSKYVRERSWSLSNIVGVTSPSIMYTMDIKYTTSTISSGIIIEKYNVNSLTRGERGPTKPWVGSSTQEKKTMPVYNRQV
LTKKQRDQIDLLAKLDWVYASIDNKDEFMEELSIGTLGLTYEKAKKLFPQYLSVNYLHRLTVSSRPCEFPASIPAYRTTN
YHFDTSPINRILTEKYGDEDIDIVFQNCISFGLSLMSVVEQFTNVCPNRIILIPKLNEIHLMKPPIFTGDVDIHKLKQVI
QKQHMFLPDKISLTQYVELFLSNKTLKSGSHVNSNLILAHKISDYFHNTYILSTNLAGHWILIIQLMKDSKGIFEKDWGE
GYITDHMFINLKVFFNAYKTYLLCFHKGYGKAKLECDMNTSDLLCVLELIDSSYWKSMSKVFLEQKVIKYILSQDASLHR
VKGCHSFKLWFLKRLNVAEFTVCPWVVNIDYHPTHMKAILTYIDLVRMGLINIDRIHIKNKHKFNDEFYTSNLFYINYNF
SDNTHLLTKHIRIANSELENNYNKLYHPTPETLENILANPIKSNDKKTLNDYCIGKNVDSIMLPLLSNKKLIKSSAMIRT
NYSKQDLYNLFPMVVIDRIIDHSGNTAKSNQLYTTTSHQISLVHNSTSLYCMLPWHHINRFNFVFSSTGCKISIEYILKD
LKIKDPNCIAFIGEGAGNLLLRTVVELHPDIRYIYRSLKDCNDHSLPIEFLRLYNGHINIDYGENLTIPATDATNNIHWS
YLHIKFAEPISLFVCDAELSVTVNWSKIIIEWSKHVRKCKYCSSVNKCMLIVKYHAQDDIDFKLDNITILKTYVCLGSKL
KGSEVYLVLTIGPANIFPVFNVVQNAKLILSRTKNFIMPKKADKESIDANIKSLIPFLCYPITKKGINTALSKLKSVVSG
DILSYSIAGRNEVFSNKLINHKHMNILKWFNHVLNFRSTELNYNHLYMVESTYPYLSELLNSLTTNELKKLIKITGSLLY
NFHNE
>J3TRD1 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MNLEALCSRVLSERGLSTGEPGVYDQIFERPGLPNLEVTVDSTGVVVDVGAIPDSASQLGSSINAGVLTIPLSEAYKINH
DFTFSGLTKTTDRKLSEVFPLVHDGSDSMTPDVIHTRLDGTVVVIEFTTTRSTNMGGLEAAYRSKLEKYRDPLNRRTDIM
PDASIYFGIIVVSASGVLTNMPLTQDEAEELMFRFCVANEIYSQARAMDAEVELQKSEEEYEAISRARAFFTLFDYDDGK
LSEAFPNSDIEMLRRFLSQPVDTSFVTTTLKEKEQEAYKRMCEEHYLKSGMSTKERLEANRSDAIDKTRALMERLHNMSS
KELHSNKSTVKLPPWVVKPSDRTLDVKTDTGSGELLNHGPYGELWSRCFLEIVLGNVEGVISSPEKELEIAISDDPEADT
PKAAKIKYHRFRPELSLESKHEFSLQGIEGKRWKHSARNVLKDEMSHKTMSPFVDVSNIEEFLIMNNLLNDTSFNREGLQ
ETINLLLEKATEMHQNGLSTALNDSFKRNFNTNVVQWSMWVSCLAQELASALKQHCKPGEFIIKKLMHWPIFVIIKPTKS
SSHIFYSLAIKKANIKRRLIGDVFTDTIDAGEWEFSEFKSLKTCKLTNLINLPCTMLNSIAFWREKMGVAPWISRKACSE
LREQVAITFLMSLEDKSTTEELVTLTRYSQMEGFVSPPLLPKPQKMVEKLEVPLRTKLQVFLFRRHLDAIVRVAASPFPI
VARDGRVEWTGTFNAITGRSTGLENMVNNWYIGYYKNKEESTELNALGEMYKKIVEIEAEKPTSSEYLGWGDTSSPKRHE
FSRSFLKSACISLEKEIEMRHGKSWKQSLEERVLKELGSKNLLDLATMKATSNFSKEWEAFSEVRTKEYHRSKLLEKMAE
LIEHGLMWYVDAAGHAWKAVLDDKCMRICLFKKNQHGGLREIYVTNANARLVQFGVETMARCVCELSPHETIANPRLKSS
IIENHGLKSARQLGQGTINVNSSNDAKKWSQGHYTTKLAMVLCWFMPAKFHRFIWAGISMFRCKKMMMDLRFLEKLSTKA
NQKTDDDFRKDLAGAFHGNVEVPWMTQGATYLQTETGMMQGILHFTSSLLHSCVQSFYKAYFLSRLKEGIAGRTIKAAID
VLEGSDDSAIMISLKPASDNEEAMARFLTANLLYSVRVINPLFGIYSSEKSTVNTLFCVEYNSEFHFHKHLVRPTIRWVA
ASHQISESEALASRQEDYANLLTQCLEGGSSFSLTYLIQCAQLVHHYMLLGLCLHPLFGTFVGMLIEDPDPALGFFIMDN
PAFAGGAGFRFNLWRSCKFTNLGKKYAFFFNEIQGKTKGDADYRALDATTGGTLSHSVMTYWGDRRKYQHLLDRMGLPKD
WVERIDENPSILYRRPENKQELILRLAEKVHSPGVTSSFSKGHVVPRVVAAGVYLLSRHCFRYTASIHGRGASQKASLIK
LLVMSSTSAERNQGRLNPNQERMLFPQVQEYERVLTLLDEVTALTGKFVVRERNIVKSRVELFQEPVDLRCKAENLIAEM
WFGLKRTKLGPRLLKEEWDKLRASFSWLSTDHKETLDVGPFLSHVQFRNFIAHVDAKSRSVRLLGAPVKKSGGVTTVSQV
VKSNFFPGFILDSSESLDDQERVEGVSILKHILFMTLNGPYTDEQKKAMVLETFQYFALPHAAEVVKRSRSLTLCLMKNF
IEQRGGSILDQIEKAQSGTVGGFSKPQKPYRKQSGGIGYKGKGVWSGIMENTNVQILIDGDGSSNWIEEIRLSSESRLFD
VIESVRRLCDDINVNNRVTSSFRGHCMVRLSNFKVKPASRVEGCPVRLMPSSFRIKELQNPDEVFLRVRGDILNLSILLQ
EDRVMNLLSYRARDTDISESAASYLWMNRTDFSFGKKEPSCSWMCLKTLDSWAWNQAARVLERNIKTPGIDNTAMGNIFK
DCLESSLRKQGLLRSRIAEMVERHVIPLTSQELVDILEEDVDFSEMMQSDIMEGDLDIDILMEGSPMLWAAEVEEMGEAM
VILSQSGKYYHLKLMDQAATTLSTILGKDGCRLLLGRPTGRSNLREQVKPYLTLLQIREGDVNWVSEYKDDTRGLDEDSA
EMWG
>Q6GWS6 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MEEDIACVKDLVSKYLADNERLSRQKLAFLVQTEPRMLLMEGLKLLSLCIEIDSCNANGCEHNSEDKSVERILHDHGILT
PSLCFVVPDGYKLTGNVLILLECFVRSSPANFEQKYIEDFKKLEQLKEDLKTVNISLIPLIDGRTSFYNEQIPDWVNDKL
RDTLFSLLRYAQESNSLFEESEYSRLCESLSMTSGRLSGVESLNVLLDNRSNHYEEVIASCHQGINNKLTAHEVKLQIEE
EYQVFRNRLRKGEIESQFLKVEKSRLLNEFNDLYMDKVSTTEDDVEYLIHQFKRASPILRFLYANVGKEANEKSDQTIKE
CQMQYWRSFLNKVKSLRILNTRRKLLLIFDVLILLASKYDQMKYKSLRGWLGSCFISVNDRLVSLESTKRDLKKWVERRQ
QAEMSKTMQSSQCSNKNQILNSMLQKTILKATTALKDVGISVDQYKVDMEIMCPNCYDSVMDFDVSGITPTISYQRSEEE
KFPYIMGSVELLETVDLERLSSLSLALVNSMKTSSTVKLRQNEFGPARYQVVRCREAYCQEFSLGDTEFQLVYQKTGECS
KCYAINDNRVGEICSFYADPKRYFPAIFSAEVLQATVGTMISWIEDCSELEGQLHNIRSLTKMILVLILTHPSKRSQKLL
QNLRYFIMAYVSDFYHKDLIDKIREELITDTEFLLYRLVRALMGLILSENVKSMMTNRFKFILNISYMCHFITKETPDRL
TDQIKCFEKFLEPKVKFGHVSINPADIATEEELDDMIYNAKKFLNKDGCTSAKGPDYKRPGVSKKFLSLLTSSFNNGSLF
KEKEVKKDIKDPLITSGCATALDLASNKSVVVNKYTDGSRVLNYDFNKLTALAVSQLTEVFSRKGKHLLNKQDYEYKVQQ
AMSNLVLGSRQHKTDADEADLDEILLDGGASVYFDQLRETVEKIVDQYREPVKPGSGPDDDGQPSVNDLDEVISNKFHIR
LIKGELSNHMVEDFDHDVLPDKFYKEFCDAVYENDKLKERYFYCGHMSQCPIGELTKAVTTRTYFDHEYFQCFKSILLKM
NANTLMGRYTHYKSRNLNFKFDMGKLSDDVRISERESNSEALSKALSLTNCTTAMLKNLCFYSQESPQSYNSVGPDTGRL
KFSLSYKEQVGGNRELYIGDLRTKMFTRLIEDYFEALSLQLSGSCLNNEKEFENAILSMKLNVSLAHVSYSMDHSKWGPM
MCPFLFLAVLQNLIFLSKDLQADIKGRDYLSTLLMWHMHKMVEIPFNVVSAMMKSFIKAQLGLRKKTTQSITEDFFYSSF
QVGVVPSHVSSILDMGQGILHNTSDFYALISERFINYAISCICGGVVDAYTSSDDQISLFDQTLTELLQRDPDEFKTLMD
FHYYMSDQLNKFVSPKSVIGRFVAEFKSRFFVWGDEVPLLTKFVAAALHNIKCKEPHQLAETIDTIIDQSVANGVPVHLC
NLIQMRTLSLLQYARYPIDPFLLNCETDVRDWVDGNRSYRIMRQIEGLIPDACSKIRSMLRKLYNRLKTGQLHEEFTTNY
LSSEHLSSLRNLCELLGVEPPSESDLEFSWLNLAAHHPLRMVLRQKIIYSGAVNLDDEKVPTIVKTIQNKLSSTFTRGAQ
KLLSEAINKSAFQSSIASGFVGLCRTLGSKCVRGPNKENLYIKSIQSLISGTQGIELLTNSYGVQYWRVPLNLRSENESV
VSYFRPLLWDYMCISLSTAIELGAWVLGEPKMTKALEFFKHNPCDYFPLKPTASKLLEDRIGLNHIIHSLRRLYPSVFEK
HILPFMSDLASTKMKWSPRIKFLDLCVALDVSCEALSLVSHIVKWKREEHYIVLSSELRLSHTRTHEPMVEERVVSTSDA
VDNFMRQIYFESYVRPFVATTRTLGSFTWFPHKTSIPEGEGLQRLGPFSSFVEKVIHKGVERPMFKHDLMMGYAWIDFDI
EPARLNQNQLIASGLVSTKFDSLEDFFDAVASLPSGSTRLSQTVRFRIKSQDASFKETFAIHLDYTGSMNQQTKYLVHDV
TVMYSGAVNPCVLLDCWRLVMSGSTFKGKSAWYVDTEVINEFLIDTNQLGHVTPVEIVVDAEKLQFTEYDFVLVGPCTEP
APLVVHKGGLWECGKKLASFTPVIQDQDLEMFVKEVGDTSSDLLTEALSAMMLDRLGLKMQWSGVDIVSTLKAAVPQSME
ILGAVLEAVDNWVEFKGYALCYSKSRRRIMVQSSGGKLRLKGRTCEELIERDEHIEDIE
>P14240 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDEIISELRELCLNYIEQDERLSRQKLNFLGQREPRMVLIEGLKLLSRCIEIDSADKSGCTHNHDDKSVETILVESGIVC
PGLPLIIPDGYKLIDNSLILLECFVRSSPASFEKKFIEDTNKLACIREDLAVAGVTLVPIVDGRCDYDNSFMPEWANFKF
RDLLFKLLEYSNQNEKVFEESEYFRLCESLKTTIDKRSGMDSMKILKDARSTHNDEIMRMCHEGINPNMSCDDVVFGINS
LFSRFRRDLESGKLKRNFQKVNPEGLIKEFSELYENLADSDDILTLSREAVESCPLMRFITAETHGHERGSETSTEYERL
LSMLNKVKSLKLLNTRRRQLLNLDVLCLSSLIKQSKFKGLKNDKHWVGCCYSSVNDRLVSFHSTKEEFIRLLRNRKKSKV
FRKVSFEELFRASISEFIAKIQKCLLVVGLSFEHYGLSEHLEQECHIPFTEFENFMKIGAHPIMYYTKFEDYNFQPSTEQ
LKNIQSLRRLSSVCLALTNSMKTSSVARLRQNQIGSVRYQVVECKEVFCQVIKLDSEEYHLLYQKTGESSRCYSIQGPDG
HLISFYADPKRFFLPIFSDEVLYNMIDIMISWIRSCPDLKDCLTDIEVALRTLLLLMLTNPTKRNQKQVQSVRYLVMAIV
SDFSSTSLMDKLREDLITPAEKVVYKLLRFLIKTIFGTGEKVLLSAKFKFMLNVSYLCHLITKETPDRLTDQIKCFEKFF
EPKSQFGFFVNPKEAITPEEECVFYEQMKRFTSKEIDCQHTTPGVNLEAFSLMVSSFNNGTLIFKGEKKLNSLDPMTNSG
CATALDLASNKSVVVNKHLNGERLLEYDFNKLLVSAVSQITESFVRKQKYKLSHSDYEYKVSKLVSRLVIGSKGEETGRS
EDNLAEICFDGEEETSFFKSLEEKVNTTIARYRRGRRANDKGDGEKLTNTKGLHHLQLILTGKMAHLRKVILSEISFHLV
EDFDPSCLTNDDMKFICEAVEGSTELSPLYFTSVIKDQCGLDEMAKNLCRKFFSENDWFSCMKMILLQMNANAYSGKYRH
MQRQGLNFKFDWDKLEEDVRISERESNSESLSKALSLTKCMSAALKNLCFYSEESPTSYTSVGPDSGRLKFALSYKEQVG
GNRELYIGDLRTKMFTRLIEDYFESFSSFFSGSCLNNDKEFENAILSMTINVREGFLNYSMDHSKWGPMMCPFLFLMFLQ
NLKLGDDQYVRSGKDHVSTLLTWHMHKLVEVPFPVVNAMMKSYVKSKLKLLRGSETTVTERIFRQYFEMGIVPSHISSLI
DMGQGILHNASDFYGLLSERFINYCIGVIFGERPEAYTSSDDQITLFDRRLSDLVVSDPEEVLVLLEFQSHLSGLLNKFI
SPKSVAGRFAAEFKSRFYVWGEEVPLLTKFVSAALHNVKCKEPHQLCETIDTIADQAIANGVPVSLVNSIQRRTLDLLKY
ANFPLDPFLLNTNTDVKDWLDGSRGYRIQRLIEELCPNETKVVRKLVRKLHHKLKNGEFNEEFFLDLFNRDKTEAILQLG
DLLGLEEDLNQLADVNWLNLNEMFPLRMVLRQKVVYPSVMTFQEERIPSLIKTLQNKLCSKFTRGAQKLLSEAINKSAFQ
SCISSGFIGLCKTLGSRCVRNKNRENLYIKKLLEDLTTDDHVTRVCNRDGITLYICDKQSHPEAHRDHICLLRPLLWDYI
CISLSNSFELGVWVLAEPTKGKNNSENLTLKHLNPCDYVARKPESSRLLEDKVNLNQVIQSVRRLYPKIFEDQLLPFMSD
MSSKNMRWSPRIKFLDLCVLIDINSESLSLISHVVKWKRDEHYTVLFSDLANSHQRSDSSLVDEFVVSTRDVCKNFLKQV
YFESFVREFVATTRTLGNFSWFPHKEMMPSEDGAEALGPFQSFVSKVVNKNVERPMFRNDLQFGFGWFSYRMGDVVCNAA
MLIRQGLTNPKAFKSLKDLWDYMLNYTKGVLEFSISVDFTHNQNNTDCLRKFSLIFLVRCQLQNPGVAELLSCSHLFKGE
IDRRMLDECLHLLRTDSVFKVNDGVFDIRSEEFEDYMEDPLILGDSLELELLGSKRILDGIRSIDFERVGPEWEPVPLTV
KMGALFEGRNLVQNIIVKLETKDMKVFLAGLEGYEKISDVLGNLFLHRFRTGEHLLGSEISVILQELCIDRSILLIPLSL
LPDWFAFKDCRLCFSKSRSTLMYEIVGGRFRLKGRSCDDWLGGSVAEDID
>Q6IUF8 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDEYVQELKGLIRKHIPDRCEFAHQKVTFLSQVHPSPLLTEGFKLLSSLVELESCEAHACQANTDQRFVDVILSDNGILC
PTLPKVIPDGFKLTGKTLILLETFVRVNPDEFEKKWKADMSKLLNLKHDLQKSGVTLVPIVDGRSNYNNRFVADWVIERM
RWLLIEILKASKSMLEIDIEDQEYQRLIHSLSNVKNQSLGLENLEHLKRNSLDYDERLNESLFIGLKGDIRESTVREELI
KLKMWFKDEVFSKGLGKFKLTDRRELLESLSSLGAHLDSDVSSCPFCNNKLMEIVYNVTFSSVERTDGAATVDQQFSTTH
TNIEKHYLSVLSLCNKIKGLKVFNTRRNTLLFLDLIMVNLMVDISESCQDAIESLRKSGLIVGQMVMLVNDRVLDILEAI
KLIRKKIGTNPNWVKNCSKILERSHPEIWLQLNTLIRQPDFNSLISIAQYLVSDRPIMRYSVERGSDKICRHKLFQEMSS
FEQMRLFKTLSSISLSLINSMKTSFSSRLLVNEREFSKYFGNVRLRECYAQRFYLAESLVGFLFYQKTGERSRCYSVYLS
DNGVMSEQGSFYCDPKRFFLPVFSDEVLAGMCEEMTSWLDFDTGLMNDTGPILRLLVLAILCSPSKRNQTFLQGLRYFLM
AFANQIHHIDLISKLVVECKSSSEVVVQRLAVGLFIRLLGGESDASSFFSRRFKYLLNVSYLCHLITKETPDRLTDQIKC
FEKFIEPKVKFGCAVVNPSLNGKLTVDQEDIMINGLKKFFSKSLRDTEDVQTPGVCKELLNYCVSLFNRGKLKVSGELKN
NPFRPNITSTALDLSSNKSVVIPKLDELGNILSTYDKEKLVSACVSSMAERFKTKGRYNLDPESTDYLILKNLTGLVSAG
PKAKSSQEELSLMYETLTEEQVESFNEIKYDVQVALAKMADNSVNTRIKNLGRADNSVKNGNNPLDNLWSPFGVMKEIRA
EVSLHEVKDFDPDVLPSDVYKELCDAVYKSSEKCNFFLEEVLDVCPLGLLLKNLTTSSYMEEEYFMCFKYLLIQGHFDQK
LGSYEHKSRSRLGFTDETLRLKDEVRLSIRESNSEAIADKLDKSYFTNAALRNLCFYSEDSPTEFTSISSNSGNLKFGLS
YKEQVGSNRELYVGDLNTKLMTRLVEDFSEAVGNSMKYTCLNSEKEFERAICDMKMAVNNGDLSCSYDHSKWGPTMSPAL
FLALLQMLELRTPVDRSKIDLDSVKSILKWHLHKVVEVPINVAEAYCIGKLKRSLGLMGCGSTSLSEEFFHQTMQLSGQI
PSHIMSVLDMGQGILHNTSDLYGLITEQFLCYALDLLYDVIPVSYTSSDDQITLVKTPSLDIEGGSDAAEWLEMICFHEF
LSSKLNKFVSPKSVIGTFVAEFKSRFFVMGEETPLLTKFVSAALHNVKCKTPTQLSETIDTICDQCIANGVSTKIVARIS
KRVNQLIRYSGYGDTPFGAIEDQDVKDWVDGSRGYRLQRKIEAIFYDDKETSFIRNCARKVFNDIKRGRIFEENLINLIG
RGGDEALTGFLQYAGCSEQEVNRVLNYRWVNLSSFGDLRLVLRTKLMTSRRVLEREEVPTLIKTLQSKLSRNFTKGVKKI
LAESINKSAFQSSVASGFIGFCKSMGSKCVRDGKGGFLYIKEVYSGINVCICEICALKPKIIYCNDSLNKVSQFSKPILW
DYFSLVLTNACELGEWVFSTVKEPQKPLVLNNQNFFWAVKPKVVRQIEDQLGMNHVLQSIRRNYPVLFDEHLAPFMNDLQ
VSRTMDSGRLKFLDVCIALDMMNENLGIISHLLKTRDNSVYIVKQSDCALAHIRQSSYTDWELGLSPQQICTNFKTQLVL
SSMVNPLVLSTSCLKSFFWFNEVLELEDDSQIELAELTDFALMVKNQNVSRAMFVEDIAMGYVVSNFEGVRISLSNVMVD
GVQLPPKEKAPDVGVLFGLKAENVIVGLVVQIDHVRMSTKFKLRRKMVYSFSLECTMDVGDIQNKEVILKVVAVDQSVSG
SGGNHMLLDGVPVIASLPLFTGQASFDLAAMLIESNLAGSNDNFLMSNVTLDLGGFSPELSDKYSYRLSGPENQEDPLVL
KDGAFYVGGERLSTYKVELTGDLVVKALGALEDDEGVVSMLHQLWPYLKATSQVILFQQEDFTIVHDLYKIQLTKSIESF
GEWIEFTNFKVAYSKSLKELVISDTQGSFRLKGVMCRPLANTLQVEDIE
>P12576 ~~~L~~~RNA-directed RNA polymerase L~~~
MDSLSVNQILYPEVHLDSPIVTNKIVAILEYARVPHAYSLEDPTLCQNIKHRLKNGFSNQMIINNVEVGNVIKSKLRSYP
AHSHIPYPNCNQDLFNIEDKESTRKIRELLKKGNSLYSKVSDKVFQCLRDTNSRLGLGSELREDIKEKVINLGVYMHSSQ
WFEPFLFWFTVKTEMRSVIKSQTHTCHRRRHTPVFFTGSSVELLISRDLVAIISKESQHVYYLTFELVLMYCDVIEGRLM
TETAMTIDARYTELLGRVRYMWKLIDGFFPALGNPTYQIVAMLEPLSLAYLQLRDITVELRGAFLNHCFTEIHDVLDQNG
FSDEGTYHELIEALDYIFITDDIHLTGEIFSFFRSFGHPRLEAVTAAENVRKYMNQPKVIVYETLMKGHAIFCGIIINGY
RDRHGGSWPPLTLPLHAADTIRNAQASGEGLTHEQCVDNWKSFAGVKFGCFMPLSLDSDLTMYLKDKALAALQREWDSVY
PKEFLRYDPPKGTGSRRLVDVFLNDSSFDPYDVIMYVVSGAYLHDPEFNLSYSLQEKEIKETGRLFAKMTYKMRACQVIA
ENLISNGIGKYFKDNGMAKDEQDLTKALHTLAVSGVPKDLKESHRGGPVLKTYSRSPVHTSTRNVRAAKGFIGFPQVIRQ
DQDTDHPENMEAYETVSAFITTDLKKYCLNWRYETISLFAQRLNEIYGLPSFFQWLHKRLETSVLYVSDPHCPPDLDAHI
PLYKVPNDQIFIKYPMGGIEGYCQKLWTISTIPYLYLAAYESGVRIASLVQGDNQTIAVTKRVPSTWPYNLKKREAARVT
RDYFVILRQRLHDIGHHLKANETIVSSHFFVYSKGIYYDGLLVSQSLKSIARCVFWSETIVDETRAACSNIATTMAKSIE
RGYDRYLAYSLNFLKVIQQILISLGFTINSTMTRDVVIPLLTNNDLLIRMALLPAPIGGMNYLNMSRLFVRNIGDPVTSS
IADLKRMILASLMPEETLHQVMTQQPGDSSFLDWASDPYSANLVCVQSITRLLKNITARFVLIHSPNPMLKGLFHDDSKE
EDEGLAAFLMDRHIIVPRAAHEILDHSVTGARESIAGMLDTTKGLIRASMRKGGLTSRVITRLSNYDYEQFRAGMVLLTG
RKRNVLIDKESCSVQLARALRSHMWARLARGRPIYGLEVPDVLESMRGHLIRRHETCVICECGSVNYGWFFVPSGCQLDD
IDKETSSLRVPYIGSTTDERTDMKLAFVRAPSRSLRSAVRIATVYSWAYGDDDSSWNEAWLLARQRANVSLEELRVITPI
STSTNLAHRLRDRSTQVKYSGTSLVRVARYTTISNDNLSFVISDKKVDTNFIYQQGMLLGLGVLETLFRLEKDTGSSNTV
LHLHVETDCCVIPMIDHPRIPSSRKLELRAELCTNPLIYDNAPLIDRDATRLYTQSHRRHLVEFVTWSTPQLYHILAKST
ALSMIDLVTKFEKDHMNEISALIGDDDINSFITEFLLIEPRLFTIYLGQCAAINWAFDVHYHRPSGKYQMGELLSSFLSR
MSKGVFKVLVNALSHPKIYKKFWHCGIIEPIHGPSLDAQNLHTTVCNMVYTCYMTYLDLLLNEELEEFTFLLCESDEDVV
PDRFDNIQAKHLCVLADLYCQPGTCPPIQGLRPVEKCAVLTDHIKAEAMLSPAGSSWNINPIIVDHYSCSLTYLRRGSIK
QIRLRVDPGFIFDALAEVNVSQPKIGSNNISNMSIKAFRPPHDDVAKLLKDINTSKHNLPISGGNLANYEIHAFRRIGLN
SSACYKAVEISTLIRRCLEPGEDGLFLGEGSGSMLITYKEILKLSKCFYNSGVSANSRSGQRELAPYPSEVGLVEHRMGV
GNIVKVLFNGRPEVTWVGSVDCFNFIVSNIPTSSVGFIHSDIETLPDKDTIEKLEELAAILSMALLLGKIGSILVIKLMP
FSGDFVQGFISYVGSHYREVNLVYPRYSNFISTESYLVMTDLKANRLMNPEKIKQQIIESSVRTSPGLIGHILSIKQLSC
IQAIVGDAVSRGDINPTLKKLTPIEQVLINCGLAINGPKLCKELIHHDVASGQDGLLNSILILYRELARFKDNQRSQQGM
FHAYPVLVSSRQRELISRITRKFWGHILLYSGNRKLINKFIQNLKSGYLILDLHQNIFVKNLSKSEKQIIMTGGLKREWV
FKVTVKETKEWYKLVGYSALIKD
>P12577 ~~~L~~~RNA-directed RNA polymerase L~~~
MDTESNNGTVSDILYPECHLNSPIVKGKIAQLHTIMSLPQPYDMDDDSILVITRQKIKLNKLDKRQRSIRRLKLILTEKV
NDLGKYTFIRYPEMSKEMFKLHIPGINSKVTELLLKADRTYSQMTDGLRDLWINVLSKLASKNDGSNYDLNEEINNISKV
HTTYKSDKWYNPFKTWFTIKYDMRRLQKARNEVTFNMGKDYNLLEDQKNFLLIHPELVLILDKQNYNGYLITPELVLPYC
DVVEGRWNISACAKLDPKLQSMYQKGNNLWEVIDKLFPIMGEKTFDVISLLEPLALSLIQTHDPVKQLRGAFLNHVLSEM
ELIFESRESIKEFLSVDYIDKILDIFNKSTIDEIAEIFSFFRTFGHPPLEASIAAEKVRKYMYIGKQLKFDTINKCHAIF
CTIIINGYRERHGGQWPPVTLPDHAHEFIINAYGSNSAISYENAVDYYQSFIGIKFNKFIEPQLDEDLTIYMKDKALSPK
KSNWDTVSPASNLLYRTNASNESRRLVEKFIADSKFDPNQILDYVESGDWLDDPEFNISYSLKEKEIKQEGRLFAKMTYK
MRATQVLSETLLANNIGKFFQENGMVKGEIELLKRLTTISISGVPRYNEVYNNSKSHTDDLKTYNKISNLNLSSNQKSKK
FEFKSTDIYNDGYETVSCFLTTDLKKYCLNWRYESTALFGETCNQIFGLNKLFNWLHPRLEGSTIYVGDPYCPPSDKEHI
SLEDHPDSGFYVHNPRGGIEGFCQKLWTLISISAIHLAAVRIGVRVTAMVQGDNQAIAVTTRVPNNYDYRVKKEIVYKDV
VRFFDSLREVMDDLGHELKLNETIISSKMFIYSKRIYYDGRILPQALKALSRCVFWSETVIDETRSASSNLATSFAKAIE
NGYSPVLGYACSIFKNIQQLYIALGMNINPTITQNIKDLYFRNPNWMQYASLIPASVGGFNYMAMSRCFVRNIGDPSVAA
LADIKRFIKANLLDRSVLYRIMNQEPGESSFLDWASDPYSCNLPQSQNITTMIKNITARNVLQDSPNPLLSGLFTNTMIE
EDEELAEFLMDRKVILPRVAHDILDNSLTGIRNAIAGMLDTTKSLIRVGINRGGLTYSLLRKISNYDLVQYETLSRTLRL
IVSDKIRYEDMCSVDLAIALRQKMWIHLSGGRMISGLETPDPLELLSGVIITGSEHCKICYSSDGTNPYTWMYLPGNIKI
GSAETGISSLRVPYFGSVTDERSEAQLGYIKNLSKPAKAAIRIAMIYTWAFGNDEISWMEASQIAQTRANFTLDSLKILT
PVATSTNLSHRLKDTATQMKFSSTSLIRVSRFITMSNDNMSIKEANETKDTNLIYQQIMLTGLSVFEYLFRLEETTGHNP
IVMHLHIEDECCIKESFNDEHINPESTLELIRYPESNEFIYDKDPLKDVDLSKLMVIKDHSYTIDMNYWDDTDIIHAISI
CTAITIADTMSQLDRDNLKEIIVIANDDDINSLITEFLTLDILVFLKTFGGLLVNQFAYTLYSLKTEGRDLIWDYIMRTL
RDTSHSILKVLSNALSHPKVFKRFWDCGVLNPIYGPNTASQDQIKLALSICEYSLDLFMREWLNGVSLEIYICDSDMEVA
NDRKQAFISRHLSFVCCLAEIASFGPNLLNLTYLERLDLLKQYLELNIKDDPTLKYVQISGLLIKSFPSTVTYVRKTAIK
YLRIRGISPPEVIDDWDPIEDENMLDNIVKTINDNCNKDNKGNKINNFWGLALKNYQVLKIRSITSDSDNNDRSDASTGG
LTLPQGGNYLSHQLRLFGINSTSCLKALELSQILMKEVNKDQDRLFLGEGAGAMLACYDATLGPAVNYYNSGLNITDVIG
QRELKIFPSEVSLVGKKLGNVTQILNRVKVLFNGNPNSTWIGNMECETLIWSELNDKSIGLVHCDMEGAIGKSEETVLHE
HYSVIRITYLIGDDDVVLISKIIPTITPNWSRILYLYKLYWKDVSIISLKTSNPASTELYLISKDAYCTIMEPSEVVLSK
LKRLSLLEENNLLKWIILSKKKNNEWLHHEIKEGERDYGVMRPYHMALQIFGFQINLNHLAKEFLSTPDLTNINNIIQSF
QRTIKDVLFEWINITHDGKRHKLGGRYNIFPLKNKGKLRLLSRRLVLSWISLSLSTRLLTGRFPDEKFEHRAQTGYVSLP
DTDLESLKLLSKNTIKNYRECIGSISYWFLTKEVKILMKLIGGAKLLGIPRQYKEPEEQLLEDYNQHDEFDID
>Q88434 ~~~L~~~RNA-directed RNA polymerase L~~~
MAGSREILLPEVHLNSPIVKHKLYYYILLGNLPNEIDLDDLGPLHNQNWNQIAHEESNLAQRLVNVRNFLITHIPDLRKG
HWQEYVNVILWPRILPLIPDFKINDQLPLLKNWDKLVKESCSVINAGTSQCIQNLSYGLTGRGNLFTRSRELSGDRRDID
LKTVVAAWHDSDWKRISDFWIMIKFQMRQLIVRQTDHNDSDLITYIENREGIIIITPELVALFNTENHTLTYMTFEIVLM
VSDMYEGRHNILSLCTVSTYLNPLKKRITYLLSLVDNLAFQIGDAVYNIIALLESFVYAQLQMSDPIPELRGQFHAFVCS
EILDALRGTNSFTQDELRTVTTNLISPFQDLTPDLTAELLCIMRLWGHPMLTASQAAGKVRESMCAGKVLDFPTIMKTLA
FFHTILINGYRRKHHGVWPPLNLPGNASKGLTELMNDNTEISYEFTLKHWKEVSLIKFKKCFDADAGEELSIFMKDKAIS
APKQDWMSVFRRSLIKQRHQHHQVPLPNPFNRRLLLNFLGDDKFDPNVELQYVTSGEYLHDDTFCASYSLKEKEIKPDGR
IFAKLTKRMRSCQVIAESLLANHAGKLMKENGVVMNQLSLTKSLLTMSQIGIISEKARKSTRDNINQPGFQNIQRNKSHH
SKQVNQRDPSDDFELAASFLTTDLKKYCLQWRYQTIIPFAQSLNRMYGYPHLFEWIHLRLMRSTLYVGDPFNPPADTSQF
DLDKVINGDIFIVSPRGGIEGLCQKAWTMISIAVIILSATESGTRVMSMVQGDNQAIAVTTRVPRSLPTLEKKTIAFRSC
NLFFERLKCNNFGLGHHLKEQETIISSHFFVYSKRIFYQGRILTQALKNASKLCLTADVLGECTQSSCSNLATTVMRLTE
NGVEKDICFYLNIYMTIKQLSYDIIFPQVSIPGDQITLEYINNPHLVSRLALLPSQLGGLNYLSCSRLFNRNIGDPVVSA
VADLKRLIKSGCMDYWILYNLLGRKPGNGSWATLAADPYSINIEYQYPPTTALKRHTQQALMELSTNPMLRGIFSDNAQA
EENNLARFLLDREVIFPRVAHIIIEQTSVGRRKQIQGYLDSTRSIMRKSLEIKPLSNRKLNEILDYNINYLAYNLALLKN
AIEPPTYLKAMTLETCSIDIARNLRKLSWAPLLGGRNLEGLETPDPIEITAGALIVGSGYCEQCAAGDNRFTWFFLPSGI
EIGGDPRDNPPIRVPYIGSRTDERRVASMAYIRGASSSLKAVLRLAGVYIWAFGDTLENWIDALDLSHTRVNITLEQLQS
LTPLPTSANLTHRLDDGTTTLKFTPASSYTFSSFTHISNDEQYLTINDKTADSNIIYQQLMITGLGILETWNNPPINRTF
EESTLHLHTGASCCVRPVDSCILSEALTVKPHITVPYSNKFVFDEDPLSEYETAKLESLSFQAQLGNIDAVDMTGKLTLL
SQFTARQIINAITGLDESVSLTNDAIVASDYVSNWISECMYTKLDELFMYCGWELLLELSYQMYYLRVVGWSNIVDYSYM
ILRRIPGAALNNLASTLSHPKLFRRAINLDIVAPLNAPHFASLDYIKMSVDAILWGCKRVINVLSNGGDLELVVTSEDSL
ILSDRSMNLIARKLTLLSLIHHNGLELPKIKGFSPDEKCFALTEFLRKVVNSGLSSIENLSNFMYNVENPRLAAFASNNY
YLTRKLLNSIRDTESGQVAVTSYYESLEYIDSLKLTPHVPGTSCIEDDSLCTNDYIIWIIESNANLEKYPIPNSPEDDSN
FHNFKLNAPSHHTLRPLGLSSTAWYKGISCCRYLERLKLPQGDHLYIAEGSGASMTIIEYLFPGRKIYYNSLFSSGDNPP
QRNYAPMPTQFIESVPYKLWQAHTDQYPEIFEDFIPLWNGNAAMTDIGMTACVEFIINRVGPRTCSLVHVDLESSASLNQ
QCLSKPIINAIITATTVLCPHGVLILKYSWLPFTRFSTLITFLWCYFERITVLRSTYSDPANHEVYLICILANNFAFQTV
SQATGMAMTLTDQGFTLISPERINQYWDGHLKQERIVAEAIDKVVLGENALFNSSDNELILKCGGTPNARNLIDIEPVAT
FIEFEQLICTMLTTHLKEIIDITRSGTQDYESLLLTPYNLGLLGKISTIVRLLTERILNHTIRNWLILPPSLRMIVKQDL
EFGIFRITSILNSDRFLKLSPNRKYLIAQLTAGYIRKLIEGDCNIDLTRPIQKQIWKALGCVVYCHDPMDQRESTEFIDI
NINEEIDRGIDGEEI
>Q8B6J5 ~~~L~~~Large structural protein~~~
MLDPGEVYDDPIDPIESEAEPRGTPTVPNILRNSDYNLNSPLIEDSAKLMLEWLKTGNRPYRMTLTDNCSRSYKDLKDYF
KKVDLGSLKVGGTAAQSMISLWLYGAHSESNRSRRCITDLAHFYSKSSPIEKLLNCTLGNRGLRIPPEGVLSCLERVDYD
KAFGRYLANTYSSYLFFHVITLYMNALDWEEEKTILALWKDLTSVDTGKDLVKFKDQIWGLLIVTKDFVYSQSSNCLFDR
NYTLMLKDLFLSRFNSLMILLSPPEPRYSDDLISQLCQLYIAGDQVLSLCGNSGYEVIKILEPYVVNSLVQRAEKFRPLI
HSLGDFPMFIKDKVNQLEGTFGPSAKRFFRVLDQFDNIHDLVFVYGCYRHWGHPYIDYRKGLLKLYDQVHIKKVIDKSYQ
ECLASDLARRILRWGFDKYSKWYLDSRFLARDHPLAPYIKTQTWPPKHIVDLVGDTWHKLPITQIFEIPESMDPSEILDD
KSHSFTRTRLASWLSENRGGPVPSEKVIITALSQPPVNPREFLKSIDLGGLPDDDLIIGLRPKERELKIEGRFFALMSWN
LRLYFVITEKLLANYILPLFDALTMTDNLNKVFKKLIDRVTGQGLLDYSRVTYAFHLDYEKWNNHQRLESTEDVFSVLDQ
VFGLKRVFSRTHEFFQKSWIYYSDRSDLIGLREDQIYCLDMSNGPTCWNGQDGGLEGLRQKGWSLVSLLMIDRESQTRNT
RTKILAQGDNQVLCPTYMLSPGLSQEGLLYELESISRNALSIYRAIEEGASKLGLIIKKEETMCSYDFLIYGKTPLFRGN
ILVPESKRWARVSCISNDQIVNLANIMSTVSTNALTVAQHSQSLIKPMRDFLLMSVQAVFHYLLFSPILKGRVYKILSAE
GESFLLAMSRIIYLDPSLGGVSGMSLGRFHIRQFSDPVSEGLSFWREIWLSSHESWIHALCQEAGNPDLGERTLESFTRL
LEDPTTLNIKGGASPTILLKDAIRKALYDEVDKVENSEFREAILLSKTHRDNFILFLKSVEPLFPRFLSELFSSSFLGIP
ESIIGLIQNSRTIRRQFRKSLSRTLEESFYNSEIHGINRMTQTPQRVGRVWPCSSERADLLREISWGRKVVGTTVPHPSE
MLGLLPKSSISCTCGATGGGNPRVSVSVLPSFDQSFFSRGPLKGYLGSSTSMSTQLFHAWEKVTNVHVVKRAISLKESIN
WFINRNSNLAQTLIRNIMSLTGPDFSLEEAPVFKRTGSALHRFKSARYSEGGYSSVCPNLLSHISVSTDTMSDLTQDGKN
YDFMFQPLMLYAQTWTSELVQRDTRLRDSTFHWHLRCNRCVRPIEDITLETSQIFEFPDVSKRISRMVSGAVPHFQKLPD
IRLRPGDFESLSGREKSRHIGSAQGLLYSILVAIHDSGYNDGTIFPVNIYGKVSPRDYLRGLARGILIGSSICFLTRMTN
INIKRPLELISGVISYILLRLDNHPSLYIMLREPSLRGEIFSIPQKIPAAYPTTMKEGNRSILCYLQHVLRYEREVITAS
PENDWLWIFSDFRSAKMTYLTLITYQSHLLLQRVERNLSKSMRATLQQMGSLMRQVLGGHGEDTLESDDDIQRLLKDSLR
RTRWVDQEVRHAARTMSGDYSPNKRVSRKAGCSEWVCSAQQVAVSTSANPAPVSELDIRALSKRFQNPLISGLRVVQWAT
GAHYKLKPILDDLNVFPSLCLVVGDGSGGISRAVLNMFPDSKLVFNSLLEVNDLMASGTRPLPPSAIMSGGDDIISRVID
FDSIWEKPSDLRNLATWRYFQSVQKQVNMSYDLIICDAEVTDIASINRITLLMSDFALSIDGPLYLVFKTYGTMLVNPDY
KAIQHLSRAFPSVTGFITQVTSSFSSELYLRFSKRGKFFRDAEYLTSSTLREMSLVLFNCSSPKSEMQRARSLNYQDLVR
GFPEEIISNPYNEMIITLIDSDVESFLVHKMVDDLELQRGTLSKVAIIISIMIVFSNRVFNISKPLTDPLFYPPSDPKIL
RHFNICCSTMMYLSTALGDVPSFARLHDLYNRPITYYFRKQVIRGNIYLSWSWSDDTPVFKRVACNSSLSLSSHWIRLIY
KIVKTTRLVGRIEDLSGEVERHLHGYNRWITLEDIRSRSSLLDYSCL
>P16289 ~~~L~~~Large structural protein~~~
MLDPGEVYDDPIDPIELEAEPRGTPIVPNILRNSDYNLNSPLIEDPARLMLEWLKTGNRPYRMTLTDNCSRSFRVLKDYF
KKVDLGSLKVGGMAAQSMISLWLYGAHSESNRSRRCITDLAHFYSKSSPIEKLLNLTLGNRGLRIPPEGVLSCLERVDYD
NAFGRYLANTYSSYLFFHVITLYMNALDWDEEKTILALWKDLTSVDIGKDLVKFKDQIWGLLIVTKDFVYSQSSNCLFDR
NYTLMLKDLFLSRFNSLMVLLSPPEPRYSDDLISQLCQLYIAGDQVLSMCGNSGYEVIKILEPYVVNSLVQRAEKFRPLI
HSLGDFPVFIKDKVSQLEETFGPCARRFFRALDQFDNIHDLVFVFGCYRHWGHPYIDYRKGLSKLYDQVHLKKMIDKSYQ
ECLASDLARRILRWGFDKYSKWYLDSRFLARDHPLTPYIKTQTWPPKHIVDLVGDTWHKLPITQIFEIPESMDPSEILDD
KSHSFTRTRLASWLSENRGGPVPSEKVIITALSKPPVNPREFLRSIDLGGLPDEDLIIGLKPKERELKIEGRFFALMSWN
LRLYFVITEKLLANYILPLFDALTMTDNLNKVFKKLIDRVTGQGLLDYSRVTYAFHLDYEKWNNHQRLESTEDVFSVLDQ
VFGLKRVFSRTHEFFQKAWIYYSDRSDLIGLREDQIYCLDASNGPTCWNGQDGGLEGLRQKGWSLVSLLMIDRESQIRNT
RTKILAQGDNQVLCPTYMLSPGLSQEGLLYELERISRNALSIYRAVEEGASKLGLIIKKEETMCSYDFLIYGKTPLFRGN
ILVPESKRWARVSCVSNDQIVNLANIMSTVSTNALTVAQHSQSLIKPMRDFLLMSVQAVFHYLLFSPILKGRVYKILSAE
GESFLLAMSRIIYLDPSLGGISGMSLGRFHIRQFSDPVSEGLSFWREIWLSSQESWIHALCQEAGNPDLGERTLESFTRL
LEDPTTLNIRGGASPTILLKDAIRKALYDEVDKVENSEFREAILLSKTHRDNFILFLISVEPLFPRFLSELFSSSFLGIP
ESIIGLIQNSRTIRRQFRKSLSKTLEESFYNSEIHGISRMTQTPQRVGGVWPCSSERADLLREISWGRKVVGTTVPHPSE
MLGLLPKSSISCTCGATGGGNPRVSVSVLPSFDQSFFSRGPLKGYLGSSTSMSTQLFHAWEKVTNVHVVKRALSLKESIN
WFITRDSNLAQALIRNIMSLTGPDFPLEEAPVFKRTGSALHRFKSARYSEGGYSSVCPNLLSHISVSTDTMSDLTQDGKN
YDFMFQPLMLYAQTWTSELVQRDTRLRDSTFHWHLRCNRCVRPIDDVTLETSQIFEFPDVSKRISRMVSGAVPHFQRLPD
IRLRPGDFESLSGREKSHHIGSAQGLLYSILVAIHDSGYNDGTIFPVNIYGKVSPRDYLRGLARGVLIGSSICFLTRMTN
ININRPLELVSGVISYILLRLDNHPSLYIMLREPSLRGEIFSIPQKIPAAYPTTMKEGNRSILCYLQHVLRYEREIITAS
PENDWLWIFSDFRSAKMTYLSLITYQSHLLLQRVERNLSKSMRDNLRQLSSLMRQVLGGHGEDTLESDDNIQRLLKDSLR
RTRWVDQEVRHAARTMTGDYSPNKKVSRKVGCSEWVCSAQQVAVSTSANPAPVSELDIRALSKRFQNPLISGLRVVQWAT
GAHYKLKPILDDLNVFPSLCLVVGDGSGGISRAVLNMFPDAKLVFNSLLEVNDLMASGTHPLPPSAIMRGGNDIVSRVID
LDSIWEKPSDLRNLATWKYFQSVQKQVNMSYDLIICDAEVTDIASINRITLLMSDFALSIDGPLYLVFKTYGTMLVNPNY
KAIQHLSRAFPSVTGFITQVTSSFSSELYLRFSKRGKFFRDAEYLTSSTLREMSLVLFNCSSPKSEMQRARSLNYQDLVR
GFPEEIISNPYNEMIITLIDSDVESFLVHKMVDDLELQRGTLSKVAIIIAIMIVFSNRVFNVSKPLTDPSFYPPSDPKIL
RHFNICCSTMMYLSTALGDVPSFARLHDLYNRPITYYFRKQVIRGNVYLSWSWSNDTSVFKRVACNSSLSLSSHWIRLIY
KIVKTTRLVGSIKDLSREVERHLHRYNRWITLEDIRSRSSLLDYSCL
>Q85431 2.7.7.48~~~pc1~~~RNA-directed RNA polymerase L~~~
MTTPPLVIPLHVHGRSYELLAGYHEVDWQEIEELEETDVRGDGFCLYHSILYSMGLSKENSRTTEFMIKLRSNPAICQLD
QEMQLSLMKQLDPNDSSAWGEDIAIGFIAIILRIKIIAYQTVDGKLFKTIYGAEFESTIRIRNYGNYHFKSLETDFDHKV
KLRSKIEEFLRMPVEDCESISLWHASVYKPIVSDSLSGHKSFSNVDELIGSIISSMYKIMDNGDQCFLWSAMRMVARPSE
KLYALAVFLGFNLKFYHVRKRAEKLTAKLESDHTNLGVKLIEVYEVSEPTRSTWVLKPGGSRITETRNFVIEEIIDNRRS
LESLFVSSSEYPAELCSQKLSAIKDRIALMFGFINRTPENSGRELYINTYYLKRILQVERNVIRDSLRSQPAVGMIQIIR
LPTAFGTYNPEVGTLLLAQTGLIYRLGTTTRVQMEVRRSPSVISRSHKITSFPETQKHNNNLYDYAPRTQETFYHPNAEI
YEAVDVKTPSVITEIVDNHIVIKLNTDDKGWSVSDSIKQDFVYRKRLMDAKNIVHDFVFDILSTETDKSFKGADLSIGGI
SDNWSPDVIISRESDPQYEDIVVYEFTTRSTESIESLLRSVEVKSLRYKEAIQERAITLKKRISYYTICVSLDAVATNLL
SLPADVCRELIIRLRVANQVKIQLADNDINLDSATLLAPDIYRIKEMFRESFPNNKFIHPITKEMYEHFVNPMISGEKDY
VANLKSIIDKETRDEQRKNLESLKVVDGKKYTERKAETALNEMSQAEEHYRSYFENDNFRSTLKAPVQLPLIIPDVSSQD
NQFSNKELSDRIRKKPIDHPIYNIWDQAVNKRNCSIALGHLDELEISMLEGQVAKKVEESYKKDRSQYNRTTLLTNMKED
IYLAERGINAKKRLEEPDVKFYRDQSKRPFHPFVSETRDIEQFTQKECLELNEESGHCSLINVEDLVLSALELHEVGDLE
HLWNNIKAHSKTKFALYAKFISDLATELAISLSQNCKEDTYVVKKLRDFSCYVLIKPVNLKSNVFFSLYIPSNIYKSHNT
TFKTLIGSPESGYMTDFVSANVSKLVNWVRCEAMMLAQRGFWREFYAVAPSIEEQDGMAEPDSVCQMMSWTLLILLNDKH
QLEEMITVSRFVHMEGFVTFPAWPKPYKMFDKLSVTPRSRLECLVIKRLIMLMKHYSENPIKFMIEDEKKKWFGFKNMFL
LDCNGKLADLSDQDQMLNLFYLGYLKNKDEEVEDNGMGQLLTKILGFESAMPKTRDFLGMKDPEYGTIKKHEFSISYVKD
LCDKFLDRLKKTHGIKDPITYLGDKIAKFLSTQFIETMASLKASSNFSEDYYLYTPSRRLKNQEQSRSKHVIDAGGNISA
SVKGKLYHRSKVIEKLTTLIKDETPGKELKIVVDLLPKAMEVLNKNECMHICIFKKNQHGGLREIYVLNIFERIMQKTVE
DFSRAILECCPSETMTSPKNKFRIPELHNMEARKTLKNEYMTISTSDDASKWNQGHYVSKFMCMLLRLTPTYYHGFLVQA
LQLWHHKKIFLGDQLLQLFNQNAMLNTMDTTLMKVFQAYKGEIQVPWMKAGRSYIETETGMMQGILHYTSSLFHAIFLDQ
LAEECRRDINRAIKTINNKENEKVSCIVNNMESSDDSSFIISIPNFKENEAAQLYLLCVVNSWFRKKEKLGTYLGIYKSP
KSTTQTLFVMEFNSEFFFSGDVHRPTFRWVNAAVLIGEQETLSGIQEELSNTLKDVIEGGGTYALTFIVQVAQAMIHYRM
LGSSASSVWPAYETLLKNSYDPALGFFLMDNPKCAGLLGFNYNVWIACTTTPLGEKYHEMIQEEMKAESQSLKSVTEDTI
NTGLVSRTTMVGFGNKKRWMKLMTTLNLSADVYEKIEEEPRVYFFHAATAEQIIQKIAIKMKSPGVIQSLSKGNMLARKI
ASSVFFISRHIVFTMSAYYDADPETRKTSLLKELINSSKIPQRHDYLQEPHTLKPTKVEVDEDSWEFKSAKEECVRVLKQ
RIKIHTGREERSISLLFENMAKSMIGRCTDQYDVRENVSILACALKMNYSIFKKDAAPNRYLLDEKNLVYPLIGKEVSVY
VKSDKVHIEISEKKERLSTKLFNIDKMKDIEETLSLLFPSYGDYLSLKETIDQVTFQSAIHKVNERRRVRADVHLTGTEG
FSKLPMYTAAVWAWFDVKTIPAHDSIYRTIWKVYKEQYSWLSDTLKETVEKGPFKTVQGVVNFISRAGVRSRVVHLVGSF
GKNVRGSINLVTAIKDNFSNGLVFKGNIFDIKAKKTRESLDNYLSICTTLSQAPITKHDKNQILRSLFVSGPRIQYVSSQ
FGSRRNRMSILQEVVADDPTLHWPDQDTSQKQLEDKFRELAHKELPFLTEKVFHDYLEKIEQLMKENTHLGGRDVDASKT
PYVLARANDIEIHCYELWREYDEDEDEAYQAYCSEVEAAMDQEKLNALIERYHVDPKANWIQMLMNGEIETVEELNKLDK
GFESHRLALVERIRVGKLGILGSYTKCQQRIEELDGEGNKTHRYTGEGIWRGSFDDSDVCIVVQDLKKTRESYLKCVVFS
KVSDYKVLMGHLKTWCREHHISNDEFPTCTQKELLSYGVTKSSVLLYKMNGMKMLRNMEKGIPLYWNPSLSTRSQTYINW
LAVDITDHSLRLRNRTVENGRVVNQTIMVVPLYKTDVQIFKTSPVDLEQDVQNDRLKLLSVTKAGELRWLQDWIMWRSSA
VDDLNILNQVRRNKAARDHFNAKPEFKKWIKELWDYALDTTLINKKVFITTQGSESQSTVSSGDSDSAVAPLTDEAVDEI
HDLLDKELEKGTLKQIIHDATIDAQLDIPAIESFLAEEMEVFKSSLAKSHPLLLNYVRYMIQEIGVTNFRSLIDSFNQKD
PLKSVSLSILDLKEVFKFVYQDINDAYFVKQEEDHKFDF
>A2SZS3 2.7.7.48~~~~~~RNA-directed RNA polymerase L~~~
MDSILSKQLVDKTGFVRVPIKHFDCTMLTLALPTFDVSKMVDRITIDFNLDDIQGASEIGSTLLPSMSIDVEDMANFVHD
FTFGHLADKTDRLLMREFPMMNDGFDHLSPDMIIKTTSGMYNIVEFTTFRGDERGAFQAAMTKLAKYEVPCENRSQGRTV
VLYVVSAYRHGVWSNLELEDSEAEEMVYRYRLALSVMDELRTLFPELSSTDEELGKTERELLAMVSSIQINWSVTESVFP
PFSREMFDRFRSSPPDSEYITRIVSRCLINSQEKLINSSFFAEGNDKALRFSKNAEECSLAVERALNQYRAEDNLRDLND
HKSTIQLPPWLSYHDVDGKDLCPLQGLDVRGDHPMCNLWREVVTSANLEEIERMHDDAAAELEFALSGVKDRPDERNRYH
RVHLNMGSDDSVYIAALGVNGKKHKADTLVQQMRDRSKQPFSPDHDVDHISEFLSACSSDLWATDEDLYNPLSCDKELRL
AAQRIHQPSLSERGFNEIITEHYKFMGSRIGSWCQMVSLIGAELSASVKQHVKPNYFVIKRLLGSGIFLLIKPTSSKSHI
FVSFAIKRSCWAFDLSTSRVFKPYIDAGDLLVTDFVSYKLSKLTNLCKCVSLMESSFSFWAEAFGIPSWNFVGDLFRSSD
SAAMDASYMGKLSLLTLLEDKAATEELQTIARYIIMEGFVSPPEIPKPHKMTSKFPKVLRSELQVYLLNCLCRTIQRIAG
EPFILKKKDGSISWGGMFNPFSGRPLLDMQPLISCCYNGYFKNKEEETEPSSLSGMYKKIIELEHLRPQSDAFLGYKDPE
LPRMHEFSVSYLKEACNHAKLVLRSLYGQNFMEQIDNQIIRELSGLTLERLATLKATSNFNENWYVYKDVADKNYTRDKL
LVKMSKYASEGKSLAIQKFEDCMRQIESQGCMHICLFKKQQHGGLREIYVMGAEERIVQSVVETIARSIGKFFASDTLCN
PPNKVKIPETHGIRARKQCKGPVWTCATSDDARKWNQGHFVTKFALMLCEFTSPKWWPLIIRGCSMFTRKRMMMNLNYLK
ILDGHRELDIRDDFVMDLFKAYHGEAEVPWAFKGKTYLETTTGMMQGILHYTSSLLHTIHQEYIRSLSFKIFNLKVAPEM
SKGLVCDMMQGSDDSSMLISFPADDEKVLTRCKVAAAICFRMKKELGVYLAIYPSEKSTANTDFVMEYNSEFYFHTQHVR
PTIRWIAACCSLPEVETLVARQEEASNLMTSVTEGGGSFSLAAMIQQAQCTLHYMLMGMGVSELFLEYKKAVLKWNDPGL
GFFLLDNPYACGLGGFRFNLFKAITRTDLQKLYAFFMKKVKGSAARDWADEDVTIPETCSVSPGGALILSSSLKWGSRKK
FQKLRDRLNIPENWIELINENPEVLYRAPRTGPEILLRIAEKVHSPGVVSSLSSGNAVCKVMASAVYFLSATIFEDTGRP
EFNFLEDSKYSLLQKMAAYSGFHGFNDMEPEDILFLFPNIEELESLDSIVYNKGEIDIIPRVNIRDATQTRVTIFNEQKT
LRTSPEKLVSDKWFGTQKSRIGKTTFLAEWEKLKKIVKWLEDTPEATLAHTPLNNHIQVRNFFARMESKPRTVRITGAPV
KKRSGVSKIAMVIRDNFSRMGHLRGVEDLAGFTRSVSAEILKHFLFCILQGPYSESYKLQLIYRVLSSVSNVEIKESDGK
TKTNLIGILQRFLDGDHVVPIIEEMGAGTVGGFIKRQQSKVVQNKVVYYGVGIWRGFMDGYQVHLEIENDIGQPPRLRNV
TTNCQSSPWDLSIPIRQWAEDMGVTNNQDYSSKSSRGARYWMHSFRMQGPSKPFGCPVYIIKGDMSDVIRLRKEEVEMKV
RGSTLNLYTKHHSHQDLHILSYTASDNDLSPGIFKSISDEGVAQALQLFEREPSNCWVRCESVAPKFISAILEICEGKRQ
IRGINRTRLSEIVRICSESSLRSKVGSMFSFVANVEEAHDVDYDALMDLMIEDAKNNAFSHVVDCIELDVSGPYEMESFD
TSDVNLFGPAHYKDISSLSMIAHPLMDKFVDYAISKMGRASVRKVLETGRCSSKDYDLSKVLFRTLQRPEESIRIDDLEL
YEETDVADDMLG
>P27316 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDSILSKQLVDKTGFVRVPIKHFDCTMLTLALPTFDVSKMVDRITIDFNLDDIQGASEIGSTLLPSMSIDVEDMANFVHD
FTFGHLADKTDRLLMREFPMMNDGFDHLSPDMIIKTTSGVYNIVEFTNFRGDERGAFQAAMIKLAKYEVPCENRSQGRTV
VLYVVSAYRAWCMVYLELERTLKQREMVYRYRLALSVMDELRTLFPELSSTDEELGKTERELPAMVSSIQINWSVTESVF
PPFSREMFDRFRSSPPDSEYITRIVSRCLINSQEKLINSSFFAEGNDKALRFSKNAEECSLAVERALNQYRAEDNLRDLN
DHKSTIQLPPWLSYHDVDGKDLCPLQGLDVRGDHPMCNLWREVVTSANLEEIERMHDDAAAELEFAFGSKGQARERNRYH
RVHLNMGSDVLVYIAALGVNGKKHKADTLVQQMRDRSKQPFSPDHDVITYLNFSLHALVTCGQQMRTCTALSLVIRDSVG
SPEDSSAILVRKGFHEIITEHYKFMGSRIGHGCQMVSLIGAELSASVKQHVKPNYFVIKRLLGSGIFLLIKPTSSKSHIF
VSFALSALAGPLISPLPGFSSPTKMLGILLVTDFVSYKLSKLTNLCKCVSLMESSFSFWAEAFGIQAGTLVGDFVPRSSD
SAAMDASYMGKLSLLTLLEDKAATEELQTIARYIIMEGFVSPPEIPKPHKMTSKFPKVLRSELQVYLLNCLCRTIQRIAG
EPFILKKKDGSISWGGMFNPFSGRPLLDMQPLISCCYNGYFKNKEEETEPSSLSGMYKKIIELEHLRPQSDAFLGYKDPE
LPRMHEFSVSYLKEACNHAKLVLRSLYGQNFMEQIDNQIIRELSGLTLERLATLKATSNFNENWYVYKDVADKNYTRDKL
LVKMSKYASEGKSLAIQKFEDCMRQIESQGCMHICLFKKQQHGGLREIYVMGAEERIVQSVVETIARSIGKFFASDTLCN
PPNKVKIPETHGIRARKQCKGPVWTCATSDDARKWNQGHFVTKFALMLCEFTSPKWWPLIIRGCSMFTKKRMMMNLNYLK
ILDGHRELDIRDDFVMDLFKAYHGEAEVPWAFKGKTYLETTTGMMQGILHYTSSLLHTIHQEYIRSLSFKIFNLKVAPEM
SKSLVCDMMQGSDDSSMLISFPADDEKVLTRCKVAAAICFRMKKELGVYLAIYPSEKSTANTDFVMEYNSEFYFHTQHVR
PTIRWIAACCSLPEVETLVARQEEASNLMTSVTEGGGSFSLAAIIQQAQCTLHYMLMGMGVSELFLEYKKAVLKWNDPGL
GFFLLDNPYACGLGGFRFNLFKAITRTDLQKLYAFFMKKVKGSAARDWADEDVTIPETCSVSPGGALILSSSLKWGSRKK
FQKLRDRLNIPENWIELINENPEVLYRAPRTGPEILLRIAEKVHSPGVVSSLSSGNAVCKVMASAVYFLSATIFEDTGRP
EFNFLEDSKYSLLQKMAAYSGFHGFNDMEPEDILFLFPNIEELESLDSIVYNKGEIDIIPRVNIRDATQTRVTIFNEQKN
LRTSPEKLVSDKWFGTQKSRIGKTTFLAEWEKLKKIVKWLEDAPEATLAHTPLNNHIQVRNFFARMESKPRTVRITGAPV
KKRSGVSKIAMVIRDHFSRMGHLRGVEDLAGFTRSVSAEILKHFLFCILQGPYSESYKLQLIYRVLSSVSNVEIKESDGK
TKTNLIGILQRFLDGDHVVPIIEEMGAGTVGGFIKRQQSKVVQNKVVYYGVGIWRGFMDGYQVHLEIENDIGQPPRLRNV
TTNCQSSPWDLSIPIRQWAEDMGVTNNQDYSSKSSRGARYWMHSFRMQGPSKPFGCPVYIIKGDMSDVIRLRKEEVEMKV
RGSTLNLYTKHHSHQDLHILSYTASDNDLSPGIFKSISDEGVAQALQLFEREPSNCWVRCESVAPKFISAILEICEGKRQ
IKGINRTRLSEIVRICSESSLRSKVGSMFSFVANVEEAHDVDYDALMDLMIEDAKNNAFSHVVDCIELDVSGPYEMESFH
GRSTLTCTPSTILIRTYTFYLTLHQTMISVQAFQVILDEGVLLIALVNNYLRGSKANCWVRCESVAPKFISAILEICEGK
RQIKGINRTRLSEIVEFVLNLPKIKSRIYVLICRQCHGANFPPISVRRLMLEDIASVARRLIIVASFGS
>P06447 ~~~L~~~RNA-directed RNA polymerase L~~~
MDGQESSQNPSDILYPECHLNSPIVRGKIAQLHVLLDVNQPYRLKDDSIINITKHKIRNGGLSPRQIKIRSLGKALQRTI
KDLDRYTFEPYPTYSQELLRLDIPEICDKIRSVFAVSDRLTRELSSGFQDLWLNIFKQLGNIEGREGYDPLQDIGTIPEI
TDKYSRNRWYRPFLTWFSIKYDMRWMQKTRPGGPLDTSNSHNLLECKSYTLVTYGDLVMILNKLTLTGYILTPELVLMYC
DVVEGRWNMSAAGHLDKKSIGITSKGEELWELVDSLFSSLGEEIYNVIALLEPLSLALIQLNDPVIPLRGAFMRHVLTEL
QTVLTSRDVYTDAEADTIVESLLAIFHGTSIDEKAEIFSFFRTFGHPSLEAVTAADKVRAHMYAQKAIKLKTLYECHAVF
CTIIINGYRERHGGQWPPCDFPDHVCLELRNAQGSNTAISYECAVDNYTSFIGFKFRKFIEPQLDEDLTIYMKDKALSPR
KEAWDSVYPDSNLYYKAPESEETRRLIEVFINDENFNPEEIINYVESGDWLKDEEFNISYSLKEKEIKQEGRLFAKMTYK
MRAVQVLAETLLAKGIGELFRENGMVKGEIDLLKRLTTLSVSGVPRTDSVYNNSKSSEKRNEGMENKNSGGYWDEKKRSR
HEFKATDSSTDGYETLSCFLTTDLKKYCLNWRFESTALFGQRCNEIFGFKTFFNWMHPVLERCTIYVGDPYCPVADRMHR
QLQDHADSGIFIHNPRGGIEGYCQKLWTLISISAIHLAAVRVGVRVSAMVQGDNQAIAVTSRVPVAQTYKQKKNHVYEEI
TKYFGALRHVMFDVGHELKLNETIISSKMFVYSKRIYYDGKILPQCLKALTKCVFWSETLVDENRSACSNISTSIAKAIE
NGYSPILGYCIALYKTCQQVCISLGMTINPTISPTVRDQYFKGKNWLRCAVLIPANVGGFNYMSTSRCFVRNIGDPAVAA
LADLKRFIRADLLDKQVLYRVMNQEPGDSSFLDWASDPYSCNLPHSQSITTIIKNITARSVLQESPNPLLSGLFTETSGE
EDLNLASFLMDRKVILPRVAHEILGNSLTGVREAIAGMLDTTKSLVRASVRKGGLSYGILRRLVNYDLLQYETLTRTLRK
PVKDNIEYEYMCSVELAVGLRQKMWIHLTYGRPIHGLETPDPLELLRGIFIEGSEVCKLCRSEGADPIYTWFYLPDNIDL
DTLTNGCPAIRIPYFGSATDERSEAQLGYVRNLSKPAKAAIRIAMVYTWAYGTDEISWMEAALIAQTRANLSLENLKLLT
PVSTSTNLSHRLKDTATQMKFSSATLVRASRFITISNDNMALKEAGESKDTNLVYQQIMLTGLSLFEFNMRYKKGSLGKP
LILHLHLNNGCCIMESPQEANIPPRSTLDLEITQENNKLIYDPDPLKDVDLELFSKVRDVVHTVDMTYWSDDEVIRATSI
CTAMTIADTMSQLDRDNLKEMIALVNDDDVNSLITEFMVIDVPLFCSTFGGILVNQFAYSLYGLNIRGREEIWGHVVRIL
KDTSHAVLKVLSNALSHPKIFKRFWNAGVVEPVYGPNLSNQDKILLALSVCEYSVDLFMHDWQGGVPLEIFICDNDPDVA
DMRRSSFLARHLAYLCSLAEISRDGPRLESMNSLERLESLKSYLELTFLDDPVLRYSQLTGLVIKVFPSTLTYIRKSSIK
VLRTRGIGVPEVLEDWDPEADNALLDGIAAEIQQNIPLGHQTRAPFWGLRVSKSQVLRLRGYKEITRGEIGRSGVGLTLP
FDGRYLSHQLRLFGINSTSCLKALELTYLLSPLVDKDKDRLYLGEGAGAMLSCYDATLGPCINYYNSGVYSCDVNGQREL
NIYPAEVALVGKKLNNVTSLGQRVKVLFNGNPGSTWIGNDECEALIWNELQNSSIGLVHCDMEGGDHKDDQVVLHEHYSV
IRIAYLVGDRDVVLISKIAPRLGTDWTRQLSLYLRYWDEVNLIVLKTSNPASTEMYLLSRHPKSDIIEDSKTVLASLLPL
SKEDSIKIEKWILIEKAKAHEWVTRELREGSSSSGMLRPYHQALQTFGFEPNLYKLSRDFLSTMNIADTHNCMIAFNRVL
KDTIFEWARITESDKRLKLTGKYDLYPVRDSGKLKTISRRLVLSWISLSMSTRLVTGSFPDQKFEARLQLGIVSLSSREI
RNLRVITKTLLDRFEDIIHSITYRFLTKEIKILMKILGAVKMFGARQNEYTTVIDDGSLGDIEPYDSS
>I0DF35 2.7.7.48~~~~~~RNA-directed RNA polymerase L~~~
MNLEVLCGRINVENGLSLGEPGLYDQIYDRPGLPDLDVTVDATGVTVDIGAVPDSASQLGSSINAGLITIQLSEAYKINH
DFTFSGLSKTTDRRLSEVFPITHDGSDGMTPDVIHTRLDGTIVVVEFTTTRSHNIGGLEAAYRTKIEKYRDPISRRVDIM
ENPRVFFGVIVVSSGGVLSNMPLTQDEAEELMYRFCIANEIYTKARSMDADIELQKSEEELEAISRALSFFSLFEPNIER
VEGTFPNSEIEMLEQFLSTPADVDFITKTLKAKEVEAYADLCDSHYLKPEKTIQERLEINRCEAIDKTQDLLAGLHARSN
KQTSLNRGTVKLPPWLPKPSSESIDIKTDSGFGSLMDHGAYGELWAKCLLDVSLGNVEGVVSDPAKELDIAISDDPEKDT
PKEAKITYRRFKPALSSSARQEFSLQGVEGKKWKRMAANQKKEKESHDALSPFLDVEDIGDFLTFNNLLADSRYGDESVQ
RAVSILLEKASAMQDTELTHALNDSFKRNLSSNVVQWSLWVSCLAQELASALKQHCRAGEFIIKKLKFWPIYVIIKPTKS
SSHIFYSLGIRKADVTRRLTGRVFSETIDAGEWELTEFKSLKTCKLTNLVNLPCTMLNSIAFWREKLGVAPWLVRKPCSE
LREQVGLTFLISLEDKSKTEEIITLTRYTQMEGFVSPPMLPKPQKMLGKLDGPLRTKLQVYLLRKHLDCMVRIASQPFSL
IPREGRVEWGGTFHAISGRSTNLENMVNSWYIGYYKNKEESTELNALGEMYKKIVEMEEDKPSSPEFLGWGDTDSPKKHE
FSRSFLRAACSSLEREIAQRHGRQWKQNLEERVLREIGTKNILDLASMKATSNFSKDWELYSEVQTKEYHRSKLLEKMAT
LIEKGVMWYIDAVGQAWKAVLDDGCMRICLFKKNQHGGLREIYVMDANARLVQFGVETMARCVCELSPHETVANPRLKNS
IIENHGLKSARSLGPGSININSSNDAKKWNQGHYTTKLALVLCWFMPAKFHRFIWAAISMFRRKKMMVDLRFLAHLSSKS
ESRSSDPFREAMTDAFHGNREVSWMDKGRTYIKTETGMMQGILHFTSSLLHSCVQSFYKSYFVSKLKEGYMGESISGVVD
VIEGSDDSAIMISIRPKSDMDEVRSRFFVANLLHSVKFLNPLFGIYSSEKSTVNTVYCVEYNSEFHFHRHLVRPTLRWIA
ASHQISETEALASRQEDYSNLLTQCLEGGASFSLTYLIQCAQLLHHYMLLGLCLHPLFGTFMGMLISDPDPALGFFLMDN
PAFAGGAGFRFNLWRACKTTDLGRKYAYYFNEIQGKTKGDEDYRALDATSGGTLSHSVMVYWGDRKKYQALLNRMGLPED
WVEQIDENPGVLYRRAANKKELLLKLAEKVHSPGVTSSLSKGHVVPRVVAAGVYLLSRHCFRFSSSIHGRGSTQKASLIK
LLMMSSISAMKHGGSLNPNQERMLFPQAQEYDRVCTLLEEVEHLTGKFVVRERNIVRSRIDLFQEPVDLRCKAEDLVSEV
WFGLKRTKLGPRLLKEEWDKLRASFAWLSTDPSETLRDGPFLSHVQFRNFIAHVDAKSRSVRLLGAPVKKSGGVTTISQV
VRMNFFPGFSLEAEKSLDNQERLESISILKHVLFMVLNGPYTEEYKLEMIIEAFSTLVIPQPSEVIRKSRTMTLCLLSNY
LSSRGGSILDQIERAQSGTLGGFSKPQKTFIRPGGGVGYKGKGVWTGVMEDTHVQILIDGDGTSNWLEEIRLSSDARLYD
VIESIRRLCDDLGINNRVASAYRGHCMVRLSGFKIKPASRTDGCPVRIMERGFRIRELQNPDEVKMRVRGDILNLSVTIQ
EGRVMNILSYRPRDTDISESAAAYLWSNRDLFSFGKKEPSCSWICLKTLDNWAWSHASVLLANDRKTQGIDNRAMGNIFR
DCLEGSLRKQGLMRSKLTEMVEKNVVPLTTQELVDILEEDIDFSDVIAVELSEGSLDIESIFDGAPILWSAEVEEFGEGV
VAVSYSSKYYHLTLMDQAAITMCAIMGKEGCRGLLTEKRCMAAIREQVRPFLIFLQIPEDSISWVSDQFCDSRGLDEEST
IMWG
>Q89709 2.7.7.48~~~~~~RNA-directed RNA polymerase L~~~
MEKYREIHQRVKEIPPGGASALECLDLLDRLYAVRHDVVDQMIKHDWSDNKDMERPIGQVLLMAGVPNDVIQGMEKKVIP
TSPSGQILKSFFRMTPDNYKITGALIEFIEVTVTADVAKGIREKKLKYESGLQFVESLLSQEHKKGNINQAYKITFDVVA
VKTDGSNISTQWPSRRNDGVVQHMRLVQADINYVREHLIKPDERASLEAMFNLKFHVGGPKLRYFNIPDYKPQSLCQPEI
TNLIQYCKHWLTEDHDFVFKEVTGNNVMNSFENNESVYMSRYRESRKPRNFLLIQGSIQGPYLPSTISSDQCDTRIGCLE
VLKVHPETPVQAIAVDMAYKYMELNRDEIINYYNPRVHFQATQSVKEPGTFKLGLSQLNPMSKSILDQVGKHKSEKGLFG
EPLESINISSQIQQNECSRIIESILSNLEINVGEVTMSLANPRKTTGVDELLGKFYENELSKYLISILRKTAAWHIGHLI
RDITESLIAHAGLKRSKYWSIHAYDHGGVILFILPSKSLEVVGSYIRYFTVFKDGIGLIDEENLDSKVDIDGVQWCFSKV
MSIDLNRLLALNIAFEKALLATATWFQYYTEDQGHFPLQHALRSVFSFHFLLCVSQKMKICAIFDNLRYLIPAVTSLYSG
YELLIEKFFERPFKSALEVYLYNIIKALLISLAQNNKVRFYSKVRLLGLTVDHSTVGASGVYPSLMSRVVYKHYRSLISE
ATTCFFLFEKGLHGNLNEEAKIHLETVEWARKFEAKERKYGDILMREGYTIDAIRVGDVQVEQQLFCQEVVELSAEELNK
YLQAKSQVLSSNIMNKHWDKPYFSQTRNISLKGMSGALQEDGHLAASVTLIEAIRFLNRSQTNPNVIDMYEQTKQHKAQA
RIVRKYQRTEADRGFFITTLPTRVRLEIIEDYYDAIARVVPEEYISYGGDKKILNIQTALEKALRWASGSSEVITSTGNV
IKFKRRLMYVSADATKWSPGDNSAKFKRFTQALYDGLSDEKLKCCVVDALRHVYETEFFMSRKLHRYIDSMDEHSEAVQD
FLDFFKGGVSATVKGNWLQGNLNKCSSLFGAAVSLLFRRIWAELFPELECFFEFAHHSDDALFIYGYLEPEDDGTDWFLY
VSQQIQAGNYHWHAVNQEMWKSMFNLHEHLLLMGSIKVSPKKTTVSPTNAEFLSTFFEGCAVSIPFIKILLGSLSDLPGL
GFFDDLAAAQSRCVKAMDLGASPQLAQLAVVICTSKVERLYGTADGMVNSPVAFLKVTKAHVPIPLGGDGSMSIMELATA
GIGMADKNILKQAFYSYKHTRRDGDRYVLGLFKFLMSLSEDVFQHDRLGEFSFVGKVQWKVFTPKNEFEFFDQFSQSYLK
SWTNQHPVYDYIIPRGRDNLLVYLVRKLNDPSIVTAMTMQSPLQLRFRMQAKQHMKVCKLEGEWVTFREVLAAADSFATK
YNPTEKDLDLFNTLVSCTFSKEYAWKDFLNEVRCEVVPTKHVHRSKIARTFTVREKDQAIQNPITAVIGYKYASTVDEIS
DVLDSSFFPDSLSADLQVMKEGVYRELGLDIGLPEVLKRIAPLLYKAGRSRVVIVEGNVEGTAESICSYWLRSMSLVKTI
KVRPKKEVLRAVSLYSTKENIGLQDDVAATRLCIEVWRWCKANDQNVNDWLNALYFEKQTLMDWVERFRRKGVVPIDPEI
QCIALLLYDVLGYKSVLQMQANRRAYSGKQYDAYCVQTYNEETRLYEGDLRVTFNFGLDCARLEIFWDKKEYILETSITQ
RHVLKLMMEEVTQELLRCGMRFKTEQVSHTRSLVLFKTESGFEWGKPNVPCIVFKHCALRTGLRTKQAINKEFMINVQAD
GFRAIAQMDMESPRFLLAHAYHTLRDVRYQAVQAVGNVWFQTAQHKLFINPIISSGLLENFMKGLPAAIPPAAYSLIMNK
AKISVDLFMFNELLALVNPRNVLNLDGIEETSEGYSTVTSISSRQWSEEVSLMADDDIDDEEEFTIALDDIDFEQINLDE
DIQHFLQDESAYTGDLTIQTEEVEVKRIRGVTRVLEPVKLIKSWVSKGLAIDKVYNPIGIVLMARYMSKNYDFSKIPLAL
LNPYDLTEFESVVKGWGETVNDRFLEVDNDAQRLVREKNILPEDILPDSLFSFRHVDVLLKRLFPHDPVSSFY
>Q91DR9 ~~~L~~~RNA-directed RNA polymerase L~~~
MFEWESQDTPSGLPDEESYFPTSKLSVEERMHYLNNVDYNLNSPLISDDIEYLTLKHFGRAIPSLWKVKNWEIPLEMLKG
VGIIKTWDQIHPWMGKWFDSEHNCPQGESFLRTVQAESELTSEIPVTFIKGWIGKEIKFPVKRGHHAVHLLMQKVLDLHK
LTLLINSVDSGETEKLCESFGLNSKQSKFETYSLGTVRYCPGWIFIDKAEILLDRNFLLMMKDTLIGRLQTLLSMLGNCE
MEVEQIYTHTETMLSLYSYGDQIIEKAGNNGYSKIKLLEPICNLRLSELAHKYRPLVPDFPHFQEHVETSVREEDTTDGL
LSAILSLVNNTEDIQLILTIYGSFRHWGHPFISYFEGLQKLHDQVTLPKQIDKEYAAALASDLAYTVLQRKFSEEKKWYV
DSIALSSKHPLKEHVDNGTWPTAAQIQDFGDRWHLLPLTKCFEVPDLLDPSVIYSDKSHSMNRKEVIDHVISTPNKPIPS
KKVLETMINNPATDWPTFLKAVDEEGLPRDNLIIGLKGKERELKIAGRFFSLMSWQLREYFVITEYLIKTHYVPLFKGLT
MADDLTSVVKKMLDNTNGQGLDDYSSICIANHIDYEKWNNHQRKESNGPVFRVMGQFLGYPRLFERTHEFFESSLIYYNG
RPDLMDVRGDSLVNTTDKMVCWEGQAGGLEGLRQKGWSVLNLLVINRESSIRNTVVKVLAQGDNQVICTQYKTKNYKNED
ELKMLLTAMVENNQTIMNGIITGTGKLGLIINNDETMQSADYLNYGKVPVFRGILRGLETKRWSRVTCITNDQIPTLAGV
MSSVSTNALTVAHFAASPINAILQYHYFANFCLMMIAMHNPAIRSSMYTKMFRKCHIMSKEFKAATLYLDPSLGGVCGIS
LARFLIRSFPDPVTEGLAFWKMIHHNCQSDWLKALSKRCGNPKLARFRPEHIPKIIEDPAALNISMGMSASNLLKTEVKG
HLIRTADTIQNQIIREAAEYLGQEEASLNEFLWDIEPFFPRFLSEFRSSTFVGVTDSLIGLFQNSKTIRGLFKSYYKREL
DRLVVKSELSSLEHLGSYRKETPDSIWECSSTQADLLREKSWGRSVIGMTVPHPLEMFGKGHQKELECIPCQTSGLTYIS
SYCPRGINNWYSTVGSLAAYLGSKTSETTSILQPWEKDSKIPIIKRATKLRDSISWFVPPDSKLAKSIQQNLKALTGEDW
EEDIQGFKRTGSALHRFTTSRVSNGGFSAQSPAKLTRIMTTTDTMRDLGDQNYDFMFQAGILYSQMTTGELRENSTNSTA
THYHITCKSCLREIQEPMLESRIVYNPPSSSRVIKSWIPNATEIMEESKPIKLREVDWDPLTRYEKSYHIGRCQGFLYGD
LTYQKTGRSEESSIFPLSIQYKVEGSGFMRGFCDGIIRASAVQALHRRVSSIVSTADVIYGGALYLTNQVGDSPPFQNLC
RSGPLREELERIPHKMTSSYPTSNSDMGYLIRNYLKRSLKQLSRGRYETKEGPIWVFSDVRTKKFLGPFSLSTDALNCLY
KNKLSKRDKNAVRNLSQLSSRMRSGDLSDEEIGKVEARFSFTPAEMRHACKFTIGKTQVPIVMSEWGQEAYGNITMYPVF
YSTTKTEKPDWTFSRLQNPTISGLRISQQATGAHYKLRSLLKGMKIHYQDAIGCGDGSGGLSSCLLRENKHCRVIFNSLL
ELTGNTLRGSTPDPPSAINGIPQIRDRCVNLNNVWEHPSDLSHPDTWKYFGELKAQFNMDVDLIVMDMEVQDIDISRRIE
QNLRDHVHSLLSRHGTVIYKTYMTIMSENEKSVLDIVGVLFEDVQLCQTQYSSSQTSEVYCVMRRLRQKVDSQHVDWQSL
VRQGINSKVYCNLPLDKEFERALNLYQIDTLVGVPRGLIPNLAVELETLLEIGGLSGGILGKLVLNIEEGKLGFTMALIV
SCILISESAICTTRLSNKREVPSSGACQRMAVCLIGAAILLSVHHRSIENHKGAIRMLRHSVPIRISSKLRKDGKLQSRW
SSISREGLAKDVRLNSNMAGVGAWIRVWSRMKDRERRWEAREADSWLKTHNKGLSMEHVRRHTGVLDILHGTGDRLDRSV
PTVSSAPRESGTWVE
>P31332 ~~~L~~~RNA-directed RNA polymerase L~~~
MEGMDHWENAKYFQGIEDIEEDTRQPTVDSMSSGTYHCKSALRSHKDNMKLFLYRRDFLIFSHRFNGLPYDEQYLGVLPK
LWSCFYDKTHDLSGFLDQYASREHCTPSDSFSRWADPTVLHLYDDPIIRNLLASENKVLNFLEGGISDILDKYQICIKRN
IRLIYLHLFLNLALIVLNHTDADSMPDRRVELNGVTFKLEEGVILCEYNEYLKIYVLKGAVIWDMPAYRQVLQKDLFLTI
CDKISERINIVIGATIITALSHKTNLDDPDSHLYDACINMIKIGDNILVNHGNRGFDLLGKFEAYCVACILTYDDQRIWN
PLEFLNNLIEDDRINQPDLYNDANNLVAFLRKQPIVILAELHGLWRIWGHPIIDLEGGMKKMEATCTKQSPVSVEETRVC
ERTMKLTFFTNYYDKHHHYPLSTLTHPDHFNLYSQYLSERDKIEYLANKDIAFEHSYIMRCIRRNKKIFQRSSLYNHKDW
DQVVILQSFQIPKSVNLATMIKDKAISMTRSELIESVNTKNSVFDSTKRRGILKWLNEQSDKIYNFLMRIDDKGLDEDDC
IIGLYPKEREMKTKARFFSLMSYKLRMYVTSTEELLGKYVLKYFPMITMSDNLLSMVIRLFDMTTLIGDKGVAVTYSMNI
DFSKWNQNMRERTNAGIFDNLDRILGFRSLISRTHSIFKACYLYLCSGEYVPVISNNQLTAQSPWSRTGDESGKEGLRQK
GWTITTVCDILSLAFKYNARIQLIGGGDNQVLTVTMLPSESMQSQGRDSQLLKVRERMTSFRNALAKKMVKRGLPLKLEE
TWISHNLLMYNKIMYYSGVPLRGRLKVISRLFSNSNVGVTSLGGITSTLGTGFQSISTKDYTPTLAWLISRVFTDIYIST
YHLLNPISGTQRLDKQVLMSRGNIRQGRNELGGETSVPIINKIRNHAALATDHTLDLDSLLICVLYYHKILGGPGIGPPT
AYVMKGFPDPLSEGLTFNYLVITNVLNERTKRKIISVTKVMKNRNQHWEHLLEDPVSVNHDAPPHGIAALRAQAEAVMRS
AKITNIGFKNLIDIGDNQYLRDLSEKLCSPNDLEPRLLHDIVGSTIPGFVNSVLSRVDQSTTINKIAGNSDVVTSIYLSE
MSYYLYLSKKVNTQDGHAIGSCPTRDSKMLRNWTWGKNIIGVTTPHPLGYLKRERHSESSSCDNNYIRVLTKRIGNSWEL
RRGQFRPYFGSYTEEKFKMTTLASAYGDESILKRAIKIQKLLGWRYHQGSSLYNLIQKILTCVTDADPNKFLPLPDEITG
DVEHRYHDMATKHGGIPSNLIHLYTHASCNTSTFINHSKGAANESLHFQAAIIWTCMQSICRTSASSSVSDISHYHEACN
QCIVKLEDPIESDYSTSDISLMSCPANDLMYVKEDDIPVHFHTTMEFYRASSSSTVLKKIKKIEDAEVISSRMTWVVTLC
SHLLNQDTIKHSTWKLISEDLSKEEVMFIVMSITMYIMSEQDIPVHSASLSDFRTLYEKNKDIIDRVLGIEALNDAVSGV
SFYNNRCSDDQCLRWKETSDQILSHYKTTGTCAVKYQAPHFRICTRLVYLMTNPSCQSCPCCLGVIKDDSNDGPIMCQLH
GELAGPCGYHLCSLDKLNKTKKGLNMSKVFYTDGTISAQHDAKNRPTKKRPHGENSLTEMFKRAKTSRENRILKQNKESY
LFIQPRLMVDLFIDMGSMWEKTQISDQGSHIIPAHTNPIVLSKKSDLYVPAAISKFVSNGFLIMDAVERALGKPSKPITE
HSQLSVNISYGIEYHPEIKRETVQLLRFVNELAYTGYGGGVTICITLFPIFISDIEAVDPRLISDIIYRYRADSSDYACI
RLTDMGDMCFDINNILSDCDACLSYDPGCWSQDANLVYIISDTSDIMIKAKEFETLHSFNKFYSVECLAPRFFMSSSTVS
ALVISKSSSINSIDYDRLAEIHLDRRGTQMWDLSRLFANMTMRDGLTEMIKCIYNNVSGEATILTSQSLVDAAISKIKVD
ILRGLIDMVRGERMNWRTQYMIQILLGVMILTSDNPEDYRREISKYNNAVVLLQKPRRIKLIRDHLGGADRKFYHATTNN
GLLGSSLNDVTHILEGILYITHRTSRKLCTSISLEV
>P20430 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MDETVSELKDLVRKHIPNRHEFAHQKDAFLSHCHSGSLLQEGFKLLSNLVELESCESHACHLNTCQKYVDVILSDHGIPC
PTLPKVIPDGFKLTGKTLILLETFVRVNPEEFERKWKSDMTKLLNLKQDLLRSGITLVPVVDGRTNYSNRFTPEWVVERI
RWLLIEILRKSRSSAEIDIEDQEYQRLIHSLSNVRNQSLGFENIECLKRNLLEYDDRLAKSLFVGVKGDVRESVIREELM
KLRLWYKKEVFDKNLGKFRITNRSELLNNLIRLGKHEDNTTSDCPFCVNKFMDIIYSLTFTALKRQDREKSNSELDQYVV
CPHEKAYLGVLSICNKIKGLKVFNTRRNTLLFLDLIMVNFLDDLFTAKPEALDSLRRSGLILGQMVTLVNDRALDFLEAV
KLIKKKIETNVKWVENCSKILRRSQQDIWSQISVWARYPDLSKLISIAQTISSDRPIMRYSAGGNFNTECKHKTFHMMSD
AEQVEAFKILSSVSLSLINSMKTSFSSRLLINEKEYSRYFGNVRLRECYQQRFFLTDGLIVILFYQKTGERSGCYSIYTC
EDGVLVEKGSFYCDPKRFFLPIFSQEVLVEMCDEMTTWLDFNSDLMVISKEKLRLLLLSILCAPSKRNQVFLQGLRYFLM
AYSNQFHHVDLLSKLKVECMSGSEVIVQRLAVDLFQCLLGEGVDSDPYFARRFKYLLNVSYLCHLITKETPDRLTDQIKC
FEKFIEPKIDFNCVIVNPSLNGQLTEAQEGMMLDGLDKFYSKTLKDCSDTKLPGVSNELLSYCISLFNKGKLKVTGELKN
DPFKPNITSTALDLSSNKSVVVPKLDELGNVLSVYDREKMISSCVSSMAERFKTKGRYNIDPSTLDYLILKNLTGLVSIG
SKTQRDCEELSMMFEGLTEEQAEAFNDIKNSVQLAMVKMKDSKSGDVNLSPNQKEGRVKSSTGTLEELWGPFGIMREIRT
EVSLHEVKDFDPDVLASDLYKELCDVVYYSSSKPEYFLERPLEVCPLGLLLKNLTTSAYFDEEYFECFKYLLIQGHYDQK
LGSYEHRSRSRLGFTNEALRVKDEVRLSMRESNSEAIADKLDRSYFTNAALRNLCFYSDDSPTEFTSISSNNGNLKFGLS
YKEQVGSNRELYVGDLNTKLITRLVEDFAEAVGSSMRYTCLSSEKEFDRAICDMKLAVNNGDLSCSLDHSKWGPTMSPAL
FLTFLQFLELRTPKERNIINLEPVLNVLRWHLHKVIEVPVNVAEAYCTGNLKRSLGLMGCGSSSVGEEFFHQFMPVQGEI
PSHIMSVLDMGQGILHNMSDLYGLITEQFLNYVLDLLYDVIPTSYTSSDDQVTLIKLPCASDDNQVNDEWLEMLCFHEYL
SSKLNKFVSPKSVAGTFVAEFKSRFFVMGEETPLLTQFVAAALHNVKCKTPTQLSETIDTICDQCVANGVSVQIVSKISQ
RVNQLIKYSGFKETPFGAVEKQDVKDWVDGTRGYRLQRKIESIFSDDEMTGFIRSCAKRVFNDIKRGKVFEENLISLIGR
DGDDALVGFLRYSSCSEQDIMRALGFRWVNLSSFGDLRLVLRTKLMTSRRVLEREEVPTLIKTLQSRLSRNFTKGVKKIL
AESINKSAFQSSVASGFIGFCKSIGSKCVRDGEGGFLYIKDIYTKVKPCLCEVCNMKRGVIYCRPSLEKIEKFSKPILWD
YFSLVLTNACEIGEWVFSSVKEPQIPVVLSNRNLFWAVKPRIVRQLEDQLGMNHVLYSIRKNYPKLFDEHLSPFMSDLQV
NRTLDGRKLKFLDVCIALDLMNENLGIVSHLLKARDNSVYIVKQSDCAMAHVRQSDYVDKEVGLSPQQVCYNFMVQIILS
SMVNPLVMSTSCLKSFFWFNEVLELEDDGQIELGELTDFTFLVRDQKISRAMFIEDIAMGYVISNLEDVRLYIDKITIGE
QPLAPGRHINDLLDLLGNFDDHEDCDLRFLIQVEHSRTSTKYRFKRKMTYSFSVTCVSKVIDLKEASVELQVVDVTQSVS
GSGGSHLLLDGVSMIAGLPIFTGQGTFNMASLMMDADLVETNDNLILTDVRFSFGGFLSELSDKYAYTLNGPVDQGEPLV
LRDGHFFMGTEKVSTYRVELTGDIIVKAIGALDDPEDVNALLNQLWPYLKSTAQVMLFQQEDFVLVYDLHRSGLIRSLEL
IGDWVEFVNFKVAYSKSLKDLVVSDNQGSLRLRGIMCRPLARRNTVEDIE
>P37800 2.7.7.48~~~L~~~RNA-directed RNA polymerase L~~~
MERILKKQPAPVRALTIHPLRRYESSIYDTPIPAYVIKHSSDGVTIDIATSELADGQSGSTIQPFESVPAQNLTLFKHDF
TFGHLADTTDKKFVEVFGVLENRADDSDFQSPDMIIETETGHVYVVEFTTTMGDANSADLAARNKIAKYEIACLNRSAIK
PISLYIIAVHFNGVISNLDLSDEEVNEIVFRFRLARDIFEELREINPALFDSDETISRLEREVNSVMSAIQIDWDTTEKK
FPSFRRELFENFRSKEVDDEYISKIIKRCTDEALRGIERDSLYTEDITNKERFELNSKRAASDIKNKMAEMMSYEFLRDT
EDHKSTVQFPPWVTRTGPAGKDLEPLKSVSVEGSHPMCKIWNKVCTNASIEKIERMHDDPVLELEYAMSGSTERSVERNK
YHRTVLTLSPEEREYAAVLGVCGKRNANLGAVKEARVRSKKGFSIGHNTERVEEFLSDSCVEDLIPTEGLYNPLSEDKSL
RLLAMGLHQPTLIHMDDETPETLDCHLKFLSSPIGSWLQMVSIVGAELSASVKQHVKPNQFIVKRLLDSAIFLLIKPTTS
KGHIFVSLAVNKKFLHGELSKSSVFKQSIDAGDLLVTDFVSFKLSKITNLCKALCVLEAASCFWAETYGFEPWKFVDQAS
AVKFLDAWFMIKLSLLTMLEDKATTEELQTMQRYVIMEGFVSLPEIPKPHKMLSKIPKVLRSELQVFLTHRLFSTMQRIS
ATPFQLHKVGGNIRWKGLFNPYSGNSIDELQTLISCCYNGYFKNKEEDTEPSALSAMYKKIIELEHLRPPTDTYLGYEDP
IDPKMHEFSRSYLKLLCNHAKTKLRKQYGRGVMNQIENSIVREVQSITLERLATLKATSNFDDSWYTFKDVKDKNYTRDK
LLVKMTQFAHRGKTLAIEVFEECMSRIEEKGCMEICLFKKQQHGGLREIYVMGADERIVQSVIEAIARAIGRFFDSDTLC
NPSNKIRIPETHGQRAKRRCGRSVWTCATSDDARKWNQGHYVTKFALMLCEFTPQEWWPLIIRGCSMFTNKFMMMNLDFL
RIIDSHKELQIEDEFVSKLFKAYHGESVEPWISQGCTYLKTSTGMMQGILHFTSSLLHSLHQEFVKTTAIQLFTLKLGSD
ASSKVVCDMMQGSDDSSMIISFPSYNEKIKMRYKLVAAMCFRIKKSLGIYIGIYPSEKSTPNTDFVMEYNSEFFFHSQHV
RPTIRWIAASCSLPEVETLVASKEEAANLLTAITEGGGSFSLAAMIQHCQSSIHYMLMGLGVSALFSEFSKAISKWLDPG
LGFFLFDNPYSAGLSGFKYNLYRAIMNSSLKSIYSFFMKRVKGGSQRTDGIISESCSVSPGGAIVMSSTLRWGSVEKFKR
LRNRLNIPETWKEMINESPEVLYRAPQTGTEIMLRIAEKVHSPGVVSSLSTGNAVCKVMASSVYFLSACIFEDAGSQEYK
VVNNDKYSLMQKIIAFDQIGCNDEISQEDLLFLFPNLAEFEAFDSIIYDKGRFNVIPRASQREATQTRIVVFEHHSSARV
APEKLVSDKWFGTRKSKIGSPGFRQEWDRLKAIVRWLRDTPEETLDSSPFSNHIQIRNFFARMEGRPRVIKVTGAPVKRD
LGMSKIAMAIRDNFCKTGFLQGLEDEVGHSRAMQVEKIKHYLFSVLMGPYSEEAKLEYVVKILKEEPQVILNYNDKRSRA
NIISLLQRFIKSEIGIATLIEDMKAGVFGAFVKAQQFSQSSVNNKYYGRGIWKGVMDGYQVQIDIDGKEGMPSHLSGITI
SNCSKTWILTQSLKAWCEDMQVYNNTDVSKANPKANYWMYGFKMYGSSYPYGCPIYLVRHDITNLGLLHDDDIDIKVRRN
TINLFVRSKDKRPRDLHILSYTPSDSDISSVSSKHIMEDEYFVYKGAFSVEPTRSWMLCQPLPWSFVRPVLQVATGSRRS
PRQLDLERLREIIRLCTESSIRNKVGTVYGQNRPEKFIEAEPIDMSEMFDMMLDEGMDDAFEELADYLTVEEDPDYMDEV
SFDDDSLNLFGPAHYKELQSLTVLAHPLMDDFVTRLVGKMGRPQIRRLLEKNVTTRDLRELSELLFMALDRDPSQIREEL
ILGDSPTEVPDDLLG
>P03523 ~~~L~~~RNA-directed RNA polymerase L~~~
MEVHDFETDEFNDFNEDDYATREFLNPDERMTYLNHADYNLNSPLISDDIDNLIRKFNSLPIPSMWDSKNWDGVLEMLTS
CQANPISTSQMHKWMGSWLMSDNHDASQGYSFLHEVDKEAEITFDVVETFIRGWGNKPIEYIKKERWTDSFKILAYLCQK
FLDLHKLTLILNAVSEVELLNLARTFKGKVRRSSHGTNICRIRVPSLGPTFISEGWAYFKKLDILMDPNFLLMVKDVIIG
RMQTVLSMVCRIDNLFSEQDIFSLLNIYRIGDKIVERQGNFSYDLIKMVEPICNLKLMKLARESRPLVPQFPHFENHIKT
SVDEGAKIDRGIRFLHDQIMSVKTVDLTLVIYGSFRHWGHPFIDYYTGLEKLHSQVTMKKDIDVSYAKALASDLARIVLF
QQFNDHKKWFVNGDLLPHDHPFKSHVKENTWPTAAQVQDFGDKWHELPLIKCFEIPDLLDPSIIYSHKSHSMNRSEVLKH
VRMNPNTPIPSKKVLQTMLDTKATNWKEFLKEIDEKGLDDDDLIIGLKGKERELKLAGRFFSLMSWKFPEYFVITEYLIK
THFVPMFKGLTMADDLTAVIKKMLDSSSGQGLKSYEAICIANHIDYEKWNNHQRKLSNGPVFRVMGQFLGYPSLIERTHE
FFEKSLIYYNGRPDLMRVHNNTLINSTSQPVCWQGQEGGLEGLRQKGWTILNLLVIQREAKIRNTAVKVLAQGDNQVICT
QYKTKKSRNVVELQGALNQMVSNNEKIMTAIKIGTGKLGLLINDDETMQSADYLNYGKIPIFRGVIRGLETKRWSRVTCV
TNDQIPTCANIMSSVSTNALTVAHFAENPINAMIQYNYFGTFARLLLMMHDPALRQSLYEVQDKIPGLHSSTFKYAMLYL
DPSIGGVSGMSLSRFLIRAFPDPVTESLSSWRFIHVHARSEHLKEMSAVFGNPEIAKFRITHIDKLVEDPTSLNIAMGMS
PANLLKTEVKKCLIESRQTIRNQVIKDATIYLYHEEDRLRSFLWSINPLFPRFLSEFKSGTFLGVPDGLISLFQNSRTIR
NSFKKKYHRELDDLIVRSEVSSLTHLGKLHLRRGSCKMWTCSATHADTLRYKSWGRTVIGTTVPHPLEMLGPQHRKETPC
APCNTSGFNYVSVHCPDGIHDVFSSRGPLPAYLGSKTSESTSILQPWERESKVPLIKRATRLRDAISWFVEPDSKLAMTI
LSNIHSLTGEEWTKRQHGFKRTGSALHRFSTSRMSHGGFASQSTAALTRLMATTDTMRDLGDQNFDFLFQATLLYAQITT
TVARDGWITSCTDHYHIACKSCLRPIEEITLDSSMDYTPPDVSHVLKTWRNGEGSWGQEIKQIYPLEANWKNLAPAEQSY
QVGRCIGFLYGDLAYRKSTHAEDSSLFPLSIQGRIRGRGFLKGLLDGLMRASCCQVIHRRSLAHLKRPANAVYGGLIYLI
DKLSVSPPFLSLTRSGPIRDELETIPHKIPTSYPTSNRDMGVIVRNYFKYQCRLIEKGKYRSHYSQLWLFSDVLSIDFIG
PFSISTTLLQILYKPFLSGKDKNELRELANLSSLLRSGEGWEDIHVKFFTKDILLCPEEIRHACKFGIPKDNNKDMSYPP
WGRESRGTITTIPVYYTTTPYPKMLEMPPRIQNPLLSGIRLGQLPTGAHYKIRSILHGMGIHYRDFLSCGDGSGGMTAAL
LRENVHSRGIFNSLLELSGSVMRGASPEPPSALETLGGDKSRCVNGETCWEYPSDLCDPRTWDYFLRLKAGLGLQIDLIV
MDMEVRDSSTSLKIETNVRNYVHRILDEQGVLIYKTYGTYICESEKNAVTILGPMFKTVDLVQTEFSSSQTSEVYMVCKG
LKKLIDEPNPDWSSINESWKNLYAFQSSEQEFARAKKVSTYFTLTGIPSQFIPDPFVNIETMLQIFGVPTGVSHAAALKS
SDRPADLLTISLFYMAIISYYNINHIRVGPIPPNPPSDGIAQNVGIAITGISFWLSLMEKDIPLYQQCLAVIQQSFPIRW
EAVSVKGGYKQKWSTRGDGLPKDTRISDSLAPIGNWIRSLELVRNQVRLNPFNEILFNQLCRTVDNHLKWSNLRRNTGMI
EWINRRISKEDRSILMLKSDLHEENSWRD
>Q98776 ~~~L~~~RNA-directed RNA polymerase L~~~
MEVHDFETDEFNDFNEDDYATREFLNPDERMTYLNHADYNLNSPLISDDIDNLIRKFNSLPIPSMWDSKNWDGVLEMLTS
CQANPISTSQMHKWMGSWLMSDNHDASQGYSFLHEVDKEAEITFDVVETFIRGWGNKPIEYIKKERWTDSFKILAYLCQK
FLDLHKLTLILNAVSEVELLNLARTFKGKVRRSSHGTNICRIRVPSLGPTFISEGWAYFKKLDILMDRNFLLMVKDVIIG
RMQTVLSMVCRIDNLFSEQDIFSLLNIYRIGDKIVERQGNFSYDLIKMVEPICNLKLMKLARESRPLVPQFPHFENHIKT
SVDEGAKIDRGIRFLHDQIMSVKTVDLTLVIYGSFRHWGHPFIDYYTGLEKLHSQVTMKKDIDVSYAKALASDLARIVLF
QQFNDHKKWFVNGDLLPHDHPFKSHVKENTWPTAAQVQDFGDKWHELPLIKCFEIPDLLDPSIIYSDKSHSMNRSEVLKH
VRMNPNTPIPSKKVLQTMLDTKATNWKEFLKEIDEKGLDDDDLIIGLKGKERELKLAGRFFSLMSWKLREYFVITEYLIK
THFVPMFKGLTMADDLTAVIKKMLDSSSGQGLKSYEAICIANHIDYEKWNNHQRKLSNGPVFRVMGQFLGYPSLIERTHE
FFEKSLIYYNGRPDLMRVHNNTLINSTSQRVCWQGQEGGLEGLRQKGWTILNLLVIQREAKIRNTAVKVLAQGDNQVICT
QYKTKKSRNVVELQGALNQMVSNNEKIMTAIKIGTGKLGLLINDDETMQSADYLNYGKIPIFRGVIRGLETKRWSRVTCV
TNDQIPTCANIMSSVSTNALTVAHFAENPINAMIQYNYFGTFARLLLMMHDPALRQSLYEVQDKIPGLHSSTFKYAMLYL
DPSIGGVSGMSLSRFLIRAFPDPVTESLSFWRFIHVHARSEHLKEMSAVFGNPEIAKFRITHIDKLVEDPTSLNIAMGMS
PANLLKTEVKKCLIESRQTIRNQVIKDATIYLYHEEDRLRSFLWSINPLFPRFLSEFKSGTFLGVADGLISLFQNSRTIR
NSFKKKYHRELDDLIVRSEVSSLTHLGKLHLRRGSCKMWTCSATHADTLRYKSWGRTVIGTTVPHPLEMLGPQHRKETPC
APCNTSGFNYVSVHCPDGIHDVFSSRGPLPAYLGSKTSESTSILQPWERESKVPLIKRATRLRDAISWFVEPDSKLAMTI
LSNIHSLTGEEWTKRQHGFKRTGSALHRFSTSRMSHGGFASQSTAALTRLMATTDTMRDLGDQNFDFLFQATLLYAQITT
TVARDGWITSCTDHYHIACKSCLRPIEEITLDSSMDYTPPDVSHVLKTWRNGEGSWGQEIKQIYPLEGNWKNLAPAEQSY
QVGRCIGFLYGDLAYRKSTHAEDSSLFPLSIQGRIRGRGFLKGLLDGLMRASCCQVIHRRSLAHLKRPANAVYGGLIYLI
DKLSVSPPFLSLTRSGPIRDELETIPHKIPTSYPTSNRDMGVIVRNYFKYQCRLIEKGKYRSHYSQLWLFSDVLSIDFIG
PFSISTTLLQILYKPFLSGKDKNELRELANLSSLLRSGEGWEDIHVKFFTKDILLCPEEIRHACKFGIAKDNNKDMSYPP
WGRESRGTITTIPVYYTTTPYPKMLEMPPRIQNPLLSGIRLGQLPTGAHYKIRSILHGMGIHYRDFLSCGDGSGGMTAAL
LRENVHSRGIFNSLLELSGSVMRGASPEPPSALETLGGDKSRCVNGETCWEYPSDLCDPRTWDYFLRLKAGLGLQIDLIV
MDMEVRDSSTSLKIETNVRNYVHRILDEQGVLIYKTYGTYICESEKNAVTILGPMFKTVDLVQTEFSSSQTSEVYMVCKG
LKKLIDEPNPDWSSINESWKNLYAFQSSEQEFARAKKVSTYFTLTGIPSQFIPDPFVNIETMLQIFGVPTGVSHAAALKS
SDRPADLLTISLFYMAIISYYNINHIRVGPIPPNPPSDGIAQNVGIAITGISFWLSLMEKDIPLYQQCLAVIQQSFPIRW
EAVSVKGGYKQKWSTRGDGLPKDTRISDSLAPIGNWIRSLELVRNQVRLNPFNEILFNQLCRTVDNHLKWSNLRRNTGMI
EWINRRISKEDRSILMLKSDLHEENSWRD
>P16379 ~~~L~~~RNA-directed RNA polymerase L~~~
MDFDLIEDSANWEDDESDFFLRDILSQEDQMSYLNTADYNLNSPLISDDMVYIIKRMNHEEVPPIWRSKEWDSPLDMLRG
CQAQPMSHQEMHNWFGTWIQNIQHDSAQGFTFLKEVDKESEMTYDLVSTFLKGWVGKDYPFKSKNKEIDSMALVGPLCQK
FLDLHKITLILNAVSLGETKELLATFKGKYRMSCENIPIARLRLPSLGPVFMCKGWTYIHKERVLMDRNFLLMCKDVIIG
RMQTFLSMIGRSDNKFSPDQIYTLANVYRIGDKILEQCGNKAYDLIKMIEPICNLKMMELARLHRPKIPKFPHFEEHVKG
SVRELTQRSNRIQTLYDLIMSMKDVDLVLVVYGSFRHWGHPFIDYFEGLEKLHTQVNMEKHIDKEYPQQLASDLARLVLN
KQFSESKKWFVDPSKMSPKHPFYEHVINKTWPTAAKIQDFGDNWHKLPLIQCFEIPDLIDPSVIYSDKSHSMNKKEVIQH
VRSKPNIPIPSKKVLQTMLTNRATNWKAFLKDIDENGLDDDDLIIGLKGKERELKIAGRFFSLMSWRLREYFVITEYLIK
TYYVPLFKGLTMADDLTSVIKKMMDSSSGQGLDDYSSVCLANHIDYEKWNNHQRKESNGPIFRVMGQFLGYPSLIERTHE
FFEKSLIYYNGRPDLMTIRNGTLCNSTKHRVCWNGQKGGLEGLRQKGWSIVNLLVIQREAKIRNTAVKVLAQGDNQVICT
QYKTKKTRSELELRAVLHQMAGNNNKIMEEIKRGTEKLGLIINDDETMQSADYLNYGKIPIFRGVIRGLETKRWSRVTCV
TNDQIPTCANLMSSVSTNALTVAHFAENPINAMIQYNYFGTFARLLLFMHDPAIRQSLYKVQDKIPGLHTRTFKYAMLYL
DPSIGGVCGMALSRFLIRAFPDPVTESLSFWKFIYEHASEPHLRKMAVMFGDPPIAKFRIEHINKLLEDPTSLNISMGMS
PANLLKSEVKKCLIESRSSIKNEIIKDATIYMHQEEEKLRGFLWSIKPLFPRFLSEFKAGTFLGVSEGLINLFQNSRTIR
NSFKKRYHKDLDELIIKSEISSLSHLGSMHYRLGDNQIWSCSASRADILRYKSWTRKVVGTTVPHPLEMHGPPSKKERPC
QLCNSSGLTYISVHCPKGIIDVFNRRGPLPAYLGSNTSESTSILQPWEKESKIPIIKRATRLRDAISWFIPPESPLSTCI
LNNIQALTGEDWSSKQHGFKRTGSALHRFSTSRMSNGGFASQSPATLTRMIATTDTMRDFGTKNYDFMFQASLLYGQMTT
SISRYGTPGSCTDHYHIRCKGCIREIEEVELNTSLEYKTPDVSHILEKWRNNTGSWGHQIKQLKPAEGNWESLSPVEQSY
QVARCIGFLYGELTHKKSRQADDSSLFPLSIQLKVRGRGFLRGLLDGLMRSSCCQVIHRRSVSTLKRPANAVYGGLIYLI
DKLSASSPFLSLVRTGPIRQELEQVPHKMSTSYPTNIRDLGSIVRNYFKYQCRPVERGNYKTCYNQIWLFSDVLSTEFIG
PMAISSSLLRLLYRPSLTKKDREELRELAALSSNLRSGEDWDDSHIKFFSNDLLFCSQEIRHACKFGIKKDNEDITFYPN
WGTEYIGNVIDIPVFYRAQNVKKDIKVPPRIQNPLMSGLRLGQLPTGAHYKMRAIVFRLKIPYHDFLACGDGSGGMTAAL
LRYNRTSRGIFNSLLDLSDTMLRGSSPEPPSALETLGGERVRCVNGDSCWEHPSDLSDENTWKYFLHLKKGCGMSINLIT
MDMEVQDSVISYKIESLVRQYVPVLLESDGCLIYKTYGTYIATQEDNSLTLIGSLFHSVQLVQTDLSSSNTSELYLVCRG
LKDYVDTPFVDWIELYDNWEKQYAFRSFKDEFQRAQSLTPETTLIGIPPQFVPDPGVNLETLFQIAGVPTGVAHGITHHI
LQSKDKLISNAIGSMCVISHFTINTIRTTDSMPGPPSDGDVNKMCSALIGTCFWLSWMESDLNLYKTCLRSIMKSMPVRW
FRTLKNEKWSQKWDCKGDAIPKDSRLGDSLANIGNWIRAWELIRNGNKSEPFDSMVAEALPKSVDKSLSWRKISKSTGIP
RLLNSDIDLVDQSILNVQIDIVENQAWQN
>Q77PA8 ~~~~~~Apoptosis regulator M11L~~~
MMSRLKTAVYDYLNDVDITECTEMDLLCQLSNCCDFINETYAKNYDTLYDIMERDILSYNIVNIKNTLTFALRDASPSVK
LATLTLLASVIKKLNKIQHTDAAMFSEVIDGIVAEEQQVIGFIQKKCKYNTTYYNVRSGGCKISVYLTAAVVGFVAYGIL
KWYRGT
>P01546 ~~~~~~M1-1 protoxin~~~
MTKPTQVLVRSVSILFFITLLHLVVALNDVAGPAETAPVSLLPREAPWYDKIWEVKDWLLQRATDGNWGKSITWGSFVAS
DAGVVIFGINVCKNCVGERKDDISTDCGKQTLALLVSIFVAVTSGHHLIWGGNRPVSQSDPNGATVARRDISTVADGDIP
LDFSALNDILNEHGISILPANASQYVKRSDTAEHTTSFVVTNNYTSLHTDLIHHGNGTYTTFTTPHIPAVAKRYVYPMCE
HGIKASYCMALNDAMVSANGNLYGLAEKLFSEDEGQWETNYYKLYWSTGQWIMSMKFIEESIDNANNDFEGCDTGH
>Q65152 ~~~~~~Minor capsid protein M1249L~~~
MEEVITIAQIVHRGTDILSLNNEEIEALVDEIYSTLKGSNDIKNIRLIDFLFTLKDFVNHVRAEQSKLPDLSMPMEAYIR
QLLVDPDVVPIVSEKKKELRVRPSTRKEIFLINGTHLAVPAEAPIEIYGLKLRLKSFSPQCFMRMAEIGSFSPETLGYVA
SGANLTNFIRVFMKCVDQETWKKNGEGVVVTTKENIIQFTHQYIELYKFLRSGGHSWLINRLAEEMVHRKLDREDQGSHI
SNIVETEEIEPEENIKRVIFFLKELSTMYSVSPVFTSGYMPLLYDLYRAGYLEVLWNPVEQKFLQHAEQREKEQMILQQV
DMKLTEVITQARQYFKIMEEKIGRVQSDAIREILTMEGKVDDPNSILQEVIKACGKQEAELITTEYLNIKKQWELQEKNA
CAHLKLVKQLRSGLQYAELLKVLESIRVLYKEKNNTTNWNLCKACGFKLLCPHVDMLIQLQAAEASYDTMRTKLMKFSGI
NKEKENNQGLIYSYFCKICGEELAHFIQEDRTADVGVIGDLNSKLRIFIWQETMKACTFIHFGKLVDVKQFANIAVNVCL
PLVYSIENIKKEEDYDPLTQLYAVIYIYAYILNLIYSSQKNKEFLTITIHGMKADSSLNAYVTFLLEKMMQQYSGIINQL
SEITDQWIANNFREAFKKIIHQNGLQGLSVQDDTKVLLTEILLGPMYDYAATVARIDGSIPMHKPRTPKEAEYEFKTVIG
RTPAELLSQKEFYDKIYTSKYRPDFTQLARLNDIYFQEESLRVWWGGRDEEKTSTLIYLRAYELFLKYLQNAPNFNSELA
EFKTYENAYGEQKALLAQQGFYNIFDPNTGRADQRTRLFEYKRLPISTLYDERGLPHKWTIYVYKAVDSSQKPAEIEVTR
KDVIKKIDNHYALADLRCSVCHVLQHEVGQLNIKKVQTALKASLEFNTFYAFYESRCPKGGLHDFQDKKCVKCGLFTYII
YDHLSQPELVHDYYNNYKDQYDKEKMSIRSIQIKKDMTTPSSETQPKPPQEPWTFDYGKIIKTAKILDISPAVIEAIGAM
EGRSYADIREGQGAPPPPTSMDDPRLMAVDSAVRIFLYNYNCLRHVSTFNKPPMHVERLVKHLSYEEKEDLEKVLPNVVN
EYHTTFKHLRVTDPASALLYSIEFLCVSFLTLYEIKEPSWVVNIVREFALTELNTIIQSEKLLSKPGAFNFMIFGEDFVC
SGEDSSMDDISAYSSPGLFGEDIIDRLDDPFSIEDVDISLDVLDNLAPQ
>P05775 ~~~M~~~Matrix protein 1~~~
MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGVLGFVFTLTVPSERGLQRRRFV
QNALNGNGDPNNMDKAVKLYRKLKREITFYGAKEVALSYSTGALASCMGLIYNRMGTVTTEVAFGLVCATCEQIADSQHR
SHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTVGTHPSSSAGLKDDLLENLQAY
QKRMGVQLQRFK
>P05777 ~~~M~~~Matrix protein 1~~~
MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFV
QNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHR
SHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMDIASQARQMVQAMRTIGTHPSSSAGLKDDLLENLQAY
QKRMGVQMQRFK
>P03485 ~~~M~~~Matrix protein 1~~~
MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFV
QNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEISLSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHR
SHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAY
QKRMGVQMQRFK
>P0DOF7 ~~~M~~~Matrix protein 1~~~
MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFV
QNALNGNGDPNNMDRAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHR
SHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRAIGTHPSSSAGLKDDLLENLQAY
QKRMGVQMQRFK
>Q8QV58 ~~~M~~~Matrix protein 1~~~
MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALTEWLKTRPILSPLTKGILGFVFTLTVPSERGLQRRRFV
QNALNGNGDPNNMDRAVKLYRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHR
SHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRAIGTHPSSSAGLKDDLLENLQTY
QKRMGVQMQRFK
>P13879 ~~~M~~~Matrix protein 1~~~
MSLFGDTIAYLLSLTEDGEGKAELAEKLHCWFGGKEFDLDSALEWIKNKRCLTDIQKALIGASICFLKPKDQERKRRFIT
EPLSGMGTTATKKKGLILAERKMRRCVSFHEAFEIAEGHESSALLYCLMVMYLNPGNYSMQVKLGTLCALCEKQASHSQR
AHSRAARSSVPGVRREMQMVSAVNTAKTMNGMGKGEDVQKLAEELQSNIGVLRSLGASQKNGEGIAKDVMEVLKQSSMGN
SALVKKYL
>Q76ZA3 ~~~M~~~Matrix protein 1~~~
MSLFGDTIAYLLSLTEDGEGKAELAEKLHCWFGGKEFDLDSALEWIKNKRCLTDIQKALIGASICFLKPKDQERKRRFIT
EPLSGMGTTATKKKGLILAERKMRRCVSFHEAFEIAEGHESSALLYCLMVMYLNPGNYSMQVKLGTLCALCEKQASHSHR
AHSRAARSSVPGVRREMQMVSAMNTAKTMNGMGKGEDVQKLAEELQSNIGVLRSLGASQKNGEGIAKDVMEVLKQSSMGN
SALVKKYL
>Q71UK7 ~~~M~~~Matrix protein 1~~~
MSLFGDTIAYLLSLTEDGEGKAELAEKLHCWFGGKEFDLDSALEWIKNKRCLTDIQKALIGASICFLKPKDQERKRRFIT
EPLSGMGTTATKKKGLILAERKMRRCVSFHEAFEIAEGHESSALLYCLMVMYLNPGNYSMQVKLGTLCALCEKQASHSHR
AHSRAARSSVPGVRREMQMVSAMNTAKTMNGMGKGEDVQKLAEELQSNIGVLRSLGASQKNGEGIAKDVMEVLKQSSMGN
SALVKKYL
>Q8V3U4 ~~~Segment-8~~~Matrix protein 1~~~
MHERSKPKTTGADQTCLEEEKETRGGLRNGSSSDTGGRERTDRGVSCSRRKNCEGQNLEPIGERDDQSSDDDPLLCDERS
TIGRHGNADERPHQELAEGGIRMPGRGWWRGKMGNGVWYDFTRHGRGEDDAEGAENNATQQDADVCSGCKFESPREFRKG
HRRCSSSTSGILLDREDGASGVPEVSFKERMEAEKKKLKELDDKIYKLRRRLRKMEYKKMGINREIDKLEDSVQ
>P29792 ~~~M2-1~~~Protein M2-1~~~
MSRRNPCKYEIRGHCLNGKKCHFSHNYFEWPPHALLVRQNFMLNKILKSMDRNNDTLSEISGAAELDRTEEYALGVIGVL
ESYLSSINNITKQSACVAMSKLLAEINNDDIKRLRNKEVPTSPKIRIYNTVISYIDSNKRNTKQTIHLLKRLPADVLKKT
IKNTIDIHNEINGNNQGDINVDEQNE
>Q6WB97 ~~~M2-1~~~Protein M2-1~~~
MSRKAPCKYEVRGKCNRGSECKFNHNYWSWPDRYLLIRSNYLLNQLLRNTDRADGLSIISGAGREDRTQDFVLGSTNVVQ
GYIDDNQSITKAAACYSLHNIIKQLQEVEVRQARDSKLSDSKHVALHNLILSYMEMSKTPASLINNLKRLPREKLKKLAK
LIIDLSAGADNDSSYALQDSESINQVQ
>Q4KRW3 ~~~M2-1~~~Protein M2-1~~~
MSRRNPCKFEIRGHCLNGKRCHFSHNYFEWPPHALLVRQNFMLNRILKSMDKSIDTLSEISGAAELDRTEEYALGVVGVL
ESYIGSINNITKQSACVAMSKLLTELNSDDIKKLRDNEELNSPKIRVYNTVISYIESNRKNNKQTIHLLKRLPADVLKKT
IKNTLDIHKSITINNPKELTVSDTNDHAKNNDTT
>P04545 ~~~M2-1~~~Protein M2-1~~~
MSRRNPCKFEIRGHCLNGKRCHFSHNYFEWPPHALLVRQNFMLNRILKSMDKSIDTLSEISGAAELDRTEEYALGVVGVL
ESYIGSINNITKQSACVAMSKLLTELNSDDIKKLRDNEELNSPKIRVYNTVISYIESNRKNNKQTIHLLKRLPADVLKKT
IKNTLDIHKSITINNPKESTVSDTNDHAKNNDTT
>Q84132 ~~~M2-1~~~Protein M2-1~~~
MSRRNPCKYEIRGHCLNGKKCHFSHNYFEWPPHALLVRQNFMLNKILKSMDRSNDTLSEISGAAELDRTEEYALGVIGVL
ESYLGSVNNITKQSACVAMSKLLGEINSDDIKGLRNKELPTSPKIRIYNTVISYIDSNKRNPKQTIHLLKRLPADVLKKT
IKNTIDIHNEINVNNPSDIGVNEQNE
>P33494 ~~~~~~Protein M2-1~~~
MSRRNPCRYEIRGKCNRGSSCTFNHNYWSWPDHVLLVRANYMLNQLLRNTDRTDGLSLISGAGREDRTQDFVLGSANVVQ
NYIEGNTTITKSAACYSLYNIIKQLQENDVKTSRDSMLEDPKHVALHNLILSYVDMSKNPASLINSLKRLPREKLKKLAK
IILQLSAGPESDNANGNTLQKGDSNN
>Q6WB96 ~~~M2~~~Protein M2-2~~~
MTLHMPCKTVKALIKCSEHGPVFITIEVDEMIWTQKELKEALSDGIVKSHTNIYNCYLENIEIIYVKAYLS
>P05780 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNDSSDPLVIAANIIGILHLILWILDRLFFKCIYRRFKYGLKRGPSTEGVPESMREEYRKEQ
QNAVDVDDGHFVNIELE
>P03492 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPTRNGWECRCNDSSDPLIIAASIIGILHLILWILNRLFFKCIYRRLKYGLKRGPSTEGVPESMREEYRQEQ
QSAVDVDDGHFVNIELE
>P06821 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDRLFFKCIYRRFKYGLKGGPSTEGVPKSMREEYRKEQ
QSAVDADDGHFVSIELE
>Q0HD59 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWECRCNDSSDPLVVAASIIGILHLILWILDRLFFKCIYRLFKHGLKRGPSTEGVPKSMREEYRKEQ
QSAVDADDSHFVNIELE
>A4K144 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNDSSDPLIIAASVVGILHLILWILDRLFFKCIYRLFKHGLKRGPSTEGVPESMREEYRKEQ
QSAVDADDSHFVNIELE
>Q3YPZ4 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIKNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKCIYRFFEHGLKRGPSTEGVPESMREEYRKEQ
QSAVDADDSHFVSIELE
>P0DOF5 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKCIYRFFEHGLKRGPSTEGVPESMREEYRKEQ
QSAVDADDSHFVSIELE
>P0DOF8 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKCIYRFFEHGLKRGPSTEGVPESMREEYRKEQ
QSAVDADDSHFVSIELE
>P35938 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKCIYRLFKHGLKRGPSTEGVPESMREEYRKEQ
QNAVDADDSHFVNIELE
>Q9Q0L9 ~~~M~~~Matrix protein 2~~~
MSLLTEVETPTKNEWECKCSDSSDPLVVAASIIGILHLILWILDRLFFKCIYRRLKYGLKRGPSTEGVPESMREEYRQEQ
QSAVDVDDGHFVNIELE
>O70632 ~~~M~~~Matrix protein 2~~~
MSLLTEVETLTRNGWGCRCSDSSDPLVVAASIIGILHLILWILDRLFFKCIYRRFKYGLKRGPSTEGVPESMREEYRQEQ
QNAVDVDDGHFVNIELE
>Q8LTE2 ~~~~~~Maturation protein A2~~~
MPKLPRGLRFGADNEILNDFQELWFPDLFIESSDTHPWYTLKGRVLNAHLDDRLPNVGGRQVRRTPHRVTVPIASSGLRP
VTTVQYDPAALSFLLNARVDWDFGNGDSANLVINDFLFRTFAPKEFDFSNSLVPRYTQAFSAFNAKYGTMIGEGLETIKY
LGLLLRRLREGYRAVKRGDLRALRRVIQSYHNGKWKPATAGNLWLEFRYGLMPLFYDIRDVMLDWQNRHDKIQRLLRFSV
GHGEDYVVEFDNLYPAVAYFKLKGEITLERRHRHGISYANREGYAVFDNGSLRPVSDWKELATAFINPHEVAWELTPYSF
VVDWFLNVGDILAQQGQLYHNIDIVDGFDRRDIRLKSFTIKGERNGRPVNVSASLSAVDLFYSRLHTSNLPFATLDLDTT
FSSFKHVLDSIFLLTQRVKR
>P09676 ~~~A~~~Maturation protein A2~~~
MPTLPRGLRFGSNGEVLNDFEALWFPERHTVDLSNGTCKLTGYITNLPGYSDIFPNKGVTAARTPYRSTVPVNHLGYRPV
TTVEYIPDGTYVRLDGHVKFEGDLVNGSVDLTNFVISLAAQGGFDYQSVIGPRFSARFSAFSTKYGVLLGEGRETLKYLL
LVVRRMREGYRAVRRGDLKRLRNVISTFEPSTIKGKRARAEFSQTYRDKLTGNKVEVRPSEGKWNSSSASDLWLEFRYGL
MPLFYDIQSVMEDFMRVHKKIAKIQRFSAGHGKLETVSSRFYPDVHFSLEVTAVLQRRHRWGVIYQDTGSFATFNNGRLV
PVKDWKTAAFALLNPAEVAWEVTPYSFVVDWFVNVGDMLEQMGQLYRHVDVVDGFDRKDIKLKSVSVRVLTNDVAHVASF
QLRQAKLLHSYYSRVHTVAFPQISPQLDTEIRSVKHVIDSIALLTQRVKR
>P15966 ~~~A~~~Maturation protein A~~~
MRKFIPTERMSKSHVVSVREYADGELEDNSLPLIYRSNWSPGQYTSTGPRTKEWHYPSSYSRGAIGIKALDQGKYARLGT
SWGREFEERAGYGMSIDARSCYSLFPVSQNLTWIDVPTNVANRATTEVLGKVTQGNFNLGVALAEARSTASQLSTQTIAL
IKAYTAARRGNWRQALRYLALNENRKFNSKSVASRWLELQFGWMPLLSDIQGAYEMLTKVHLKAFMPMRAVRQVGQNVSL
SGRLTSPAASYKSTCNISRRIVIWFYINDARLAWLSSLGILNPLGIVWEKVPFSFLVDWLLPVGNMLEGLTAPIGCSYQS
GTVTDVISGESTITADDIYGWDTVRPATAKVQISAVHRGVQSVWPTTGVYVKSPFSMVHTLDALALFRQRLWK
>P07394 ~~~A~~~Maturation protein A~~~
MFPKSNIDRNYKVKLISYDKKGKLVSDDSFEQVENYLFQNRSTTYKPGYIRRDFRRPTNFWNGYRCFNQPVGTFTRKLSD
GGRQVADYGIVNPNKFTANSQHLGDNMVIYPGPFSINIDQRASVEVLNKLSQSNLNIGVAIAEAKMTASLLAKQSIALIR
AYTAAKRGNWREVLSQLLISEHRFRAPAKDLGGRWLELQYGWLPLMSDLKAAYDLLTQTKLPAFMPLRVTRTVGGTHNYK
VRNVESAGDTWSYRHRLSVNYRIWYFISDPRLAWASSLGLLNPLEIYWEKTPWSFVVDWFLPVGNLIEAMSNPLGLDIIS
GTKTWQLESKLNATLPASGWSGTAKLTAYAKAYDRSTFYSFPTPLPYVKSPLSGLHLANALALINQRLKR
>P03610 ~~~A~~~Maturation protein A~~~
MRAFSTLDRENETFVPSVRVYADGETEDNSFSLKYRSNWTPGRFNSTGAKTKQWHYPSPYSRGALSVTSIDQGAYKRSGS
SWGRPYEEKAGFGFSLDARSCYSLFPVSQNLTYIEVPQNVANRASTEVLQKVTQGNFNLGVALAEARSTASQLATQTIAL
VKAYTAARRGNWRQALRYLALNEDRKFRSKHVAGRWLELQFGWLPLMSDIQGAYEMLTKVHLQEFLPMRAVRQVGTNIKL
DGRLSYPAANFQTTCNISRRIVIWFYINDARLAWLSSLGILNPLGIVWEKVPFSFVVDWLLPVGNMLEGLTAPVGCSYMS
GTVTDVITGESIISVDAPYGWTVERQGTAKAQISAMHRGVQSVWPTTGAYVKSPFSMVHTLDALALIRQRLSR
>P0C794 ~~~M~~~Matrix protein~~~
MNSKHSYVELKDKVIVPGWPTLMLEIDFVGGTSRNQFLNIPFLSVKEPLQLPREKKLTDYFTIDVEPAGHSLVNIYFQID
DFLLLTLNSLSVYKDPIRKYMFLRLNKEQSKHAINAAFNVFSYRLRNIGVGPLGPDIRSSGP
>P0C795 ~~~M~~~Matrix protein~~~
MNSKHSYVELKDKVIVPGWPTLMLEIDFVGGTSRNQFLNIPFLSVKEPLQLPREKKLTDYFTIDVEPAGHSLVNIYFQID
DFLLLTLNSLSVYKDPIRKYMFLRLNKDQSKHAINAAFNVFSYRLRNIGVGPLGPDIRSSGP
>P24615 ~~~M~~~Matrix protein~~~
METYVNKLHEGSIYTAAVQYNVIEKDDDPASLTIWVPMFQSSISADMLIKELINVNILVRQISTPKGPSLKIMINSRSAV
LAQMPSKFTISANVSLDERSKLAYDITTPCEIKACSLTCLKVKNMLTTVKDLTMKTFNPTHEIIALCEFENIMTSKRVVI
PTFLRSINVKAKDLDSLENIATTEFKNAITNAKIIPYAGLVLVITVTDNKGAFKYIKPQSQFIVDLGAYLEKESIYYVTT
NWKHTATKFSIKPIED
>Q9WH76 ~~~M~~~Matrix protein~~~
MQRLKKFIAKREKGDKGKMKWNSSMDYDSPPSYQDVRRGIFPTAPLFGMEDDMMEFTPSLGIQTLKLQYKCVVNINAINP
FRDFREAISAMQFWEADYSGYIGKKPFYRAIILHTARQLKTSNPGILDRGVVEYHATTQGRALVFHSLGPSPSMMFVPET
FTREWNILTNKGTINVKIWLGETDTLSELEPILNPVNFRDDREMIEGAAIMGLEIKKQKDNTWLISKSH
>C1JJY2 ~~~ORF3~~~Matrix protein~~~
MAYSKGVSGTLSDYTHLRTLPALLSVVFVIAGLYQFGGISDVTITWLSNYTLTSTHAMGASIGAYALAFASSETKQFDNY
QDFEKVLIAAGPAVILGYEYIQPVTDLINTTSNAGPIAAFVVTVVAWGVAVR
>O89341 ~~~M~~~Matrix protein~~~
MDFSVSDNLDDPIEGVSDFSPTSWENGGYLDKVEPEIDKHGSMIPKYKIYTPGANERKFNNYMYMICYGFVEDVERSPES
GKRKKIRTIAAYPLGVGKSTSHPQDLLEELCSLKVTVRRTAGATEKIVFGSSGPLHHLLPWKKILTGGSIFNAVKVCRNV
DQIQLENQQSLRIFFLSITKLNDSGIYMIPRTMLEFRRNNAIAFNLLVYLKIDADLAKAGIQGSFDKDGTKVASFMLHLG
NFVRRAGKYYSVEYCKRKIDRMKLQFSLGSIGGLSLHIKINGVISKRLFAQMGFQKNLCFSLMDINPWLNRLTWNNSCEI
SRVAAVLQPSVPREFMIYDDVFIDNTGKILKG
>P0DOE7 ~~~M~~~Matrix protein~~~
METYVNKLHEGSTYTAAVQYNVLEKDDDPASLTIWVPMFQSSMPADLLIKELANVNILVKQISTPKGPSLRVMINSRSAV
LAQMPSKFTICANVSLDERSKLAYDVTTPCEIKACSLTCLKSKNMLTTVKDLTMKTLNPTHDIIALCEFENIVTSKKVII
PTYLRSISVRNKDLNTLENITTTEFKNAITNAKIIPYSGLLLVITVTDNKGAFKYIKPQSQFIVDLGAYLEKESIYYVTT
NWKHTATRFAIKPMED
>Q5VKP4 ~~~M~~~Matrix protein~~~
MNIIRKIVKSCKDEEEHKPNPVSAPPDDDDLWLPPPEYVPLAEITGKKNMRNFCINGEVKVCSPNGYSFRILRHILKSFE
GVYSGNRRMIGLVKVVIGLAQSGAPVPEGMNWVYKIRRTLVFQWAESSGPLDGEELEYSQEITWDDDSEFVGLQIRVSAR
QCHIQGRLWCINMNSRACQLWADMSLKTQQSNEDKNTSLLLE
>Q6JAM6 ~~~M~~~Matrix protein~~~
MNFLRKIVKNCKDEEIPKPGTPSAPPDDDDLWMPPPEYVPLTQIKGKENVRNFCINGEIKICSPNGYSFRILRHILKSFD
NVYSGNRRLIGVVKVVIGLVLSASPVPEGMNWVYKLRRTLIFQWAESHGPLEGEELEYSQEITWDDEAEFVSLQIRVSAK
QCHIQGRLWCINMNSKACQLWADMGLKTQQSQEDENTSLLLE
>Q9E7N9 ~~~M~~~Matrix protein~~~
MSAKLNWYRITFNDTVWRFDTARGPKDGETCPLIASELFSSGLSEVFKSVTSFSEILRNMESRGYITNITLRADSDILGP
GALRCEFLFPSEVFIPTSHTLKMGRSSLILEPHLVVLKECKYISSGKLDIGISSIEATSVAVLRRVKGPAFIGCMDDNPF
GVLTKKPSDEKNVLASK
>Q9W850 ~~~M~~~Matrix protein~~~
MTEIYDFDKSAWDIKGSIAPIQPTTYSDGRLVPQVRVIDPGLGDRKDECFMYMFLLGVVEDSDPLGPPIGRAFGSLPLGV
GRSTAKPEELLKEATELDIVVRRTAGLNEKLVFYNNTPLTLLTPWRKVLTTGSVFNANQVCNAVNLIPLDTPQRFRVVYM
SITRLSDNGYYTVPRRMLEFRSVNAVAFNLLVTLRIDKAIGPGKIIDNAEQLPEATFMVHIGNFRRKKSEVYSADYCKMK
IEKMGLVFALGGIGGTSLHIRSTGKMSKTLHAQLGFKKTLCYPLMDINEDLNRLLWRSRCKIVRIQAVLQPSVPQEFRIY
DDVIINDDQGLFKVL
>P11206 ~~~M~~~Matrix protein~~~
MDSSRTIGLYFDSALPSSNLLAFPIVLQDIGDGKKQIAPQYRIQRLDSWTDSKEDSVFITTYGFIFQVGNEEVTVGMISD
NPKHELLSAAMLCLGSVPNVGDLVELARACLTMVVTCKKSATDTERMVFSVVQAPQVLQSCRVVANKYSSVNAVKHVKAP
EKIPGSGTLEYKVNFVSLTVVPRKDVYKIPTAALKVSGSSLYNLALNVTIDVEVDPKSPLVKSLSKSDSGYYANLFLHIG
LMSTVDKKGKKVTFDKLERKIRRLDLSVGLSDVLGPSVLVKARGARTRLLAPFFSSSGTACYPISNASPQVAKILWSQTA
RLRSVKVIIQAGTQRAVAVTADHEVTSTKIEKRHTIAKYNPFKK
>Q9IK90 ~~~M~~~Matrix protein~~~
MEPDIKSISSESMEGVSDFSPSSWEHGGYLDKVEPEIDENGSMIPKYKIYTPGANERKYNNYMYLICYGFVEDVERTPET
GKRKKIRTIAAYPLGVGKSASHPQDLLEELCSLKVTVRRTAGSTEKIVFGSSGPLNHLVPWKKVLTSGSIFNAVKVCRNV
DQIQLDKHQALRIFFLSITKLNDSGIYMIPRTMLEFRRNNAIAFNLLVYLKIDADLSKMGIQGSLDKDGFKVASFMLHLG
NFVRRAGKYYSVDYCRRKIDRMKLQFSLGSIGGLSLHIKINGVISKRLFAQMGFQKNLCFSLMDINPWLNRLTWNNSCEI
SRVAAVLQPSIPREFMIYDDVFIDNTGRILKG
>Q84131 ~~~M~~~Matrix protein~~~
METYVNKLHEGSTYTAAVQYNVLEKDDDPASLTIWVPMFQSSISADLLIKELINVNILVRQISTLKGPSLKIMINSRSAV
LAQMPNKFTISANVSLDERSKLAYDITTPCEIKACSLTCLKVKNMLTTVKDLTMKTFNPTHEIIALCEFENIMTSKKVVI
PTFLRSINVKAKDLDSLENIATTEFKNAITNAKIIPYAGLVLVITVTDNKGAFKYIKPQSQFIVDLGAYLEKESIYYVTT
NWKHTATRFSIKPIED
>P24266 ~~~M~~~Matrix protein~~~
MPIISLPADPTSPSQSLTPFPIQLDTKDGKAGKLLKQIRIRYLNEPNSRHTPITFINTYGFVYARDTSGGIHSEISSDLA
AGSITACMMKLGPGPNIQNANLVLRSLNEFYVKVKKTSSQREEAVFELVNIPTLLREHALCKRKMLVCSAEKFLKNPSKL
QAGFEYVYIPTFVSITYSPRNLNYQVARPILKFRSRFVYSIHLELILRLLCKSDSPLMKSYNADRTGRGCLASVWIHVCN
ILKNKSIKQQGRESYFIAKCMSMQLQVSIADLWGPTIIIKSLGHIPKTALPFFSKDGIACHPLQDVSPNLAKSLWSVGCE
IESAKLILQESDLNELMGHQDLITDKIAIRSGQRTFERSKFSPFKKYASIPNLEAIN
>P07873 ~~~M~~~Matrix protein~~~
MSITNSAIYTFPESSFSENGHIEPLPLKVNEQRKAVPHIRVAKIGNPPKHGSRYLDVFLLGFFEMERIKDKYGSVNDLDS
DPGYKVCGSGSLPIGLAKYTGNDQELLQAATKLDIEVRRTVKAKEMIVYTVQNIKPELYPWSSRLRKGMLFDANKVALAP
QCLPLDRSIKFRVIFVNCTAIGSITLFKIPKSMASLSLPSTISINLQVHIKTGVQTDSKGIVQILDEKGEKSLNFMVHLG
LIKRKVGRMYSVEYCKQKIEKMRLIFSLGLVGGISLHVNATGSISKTLASQLVFKREICYPLMDLNPHLNLVIWASSVEI
TRVDAIFQPSLPGEFRYYPNIIAKGVGKIKQWN
>P27020 ~~~M~~~Matrix protein~~~
MAPTQSKVKIHNLAEAHEKVLRAFPIEVEQNSEGNKLLVKQIRIRTLGHADHSNDSICFLNTYGFIKEAVSQTEFMRAGQ
RPESKNTLTACMLPFGPGPNIGSPQKMLEYAEDIKIHVRKTAGCKEQIVFSLDRTPQVFRGFQFPRDRYVCVPSDKYIKS
PGKLVAGPNYCYTITFLSLTFCPSSQKFKVPRPILNFRSTRMRGIHLEIIMKITCSENSPIRKTLITDDPENGPKASVWI
HLCNLYKGRNPIKVYDEAYFAEKCKQMLLSVGISDLWGPTIAVHANGKIPKSASLYFNSRGWALHPIADASPTMAKQLWS
IGCEIIEVNAILQGSDYSALVDHPDVIYRKIRIDPAKKQYAHSKWNPFKKAISMPDLTGISI
>P0DOF2 ~~~M~~~Matrix protein~~~
MNFLRKIVKNCRDEDTQKPSPVSAPLDDDDLWLPPPEYVPLKELTSKKNMRNFCINGGVKVCSPNGYSFRILRHILKSFD
EIYSGNHRMIGLVKVVIGLALSGSPVPEGMNWVYKLRRTFIFQWADSRGPLEGEELEYSQEITWDDDTEFVGLQIRVIAK
QCHIQGRIWCINMNPRACQLWSDMSLQTQRSEEDKDSSLLLE
>Q8B6J7 ~~~M~~~Matrix protein~~~
MNFLCKIVKNCRDEDTQKPSPASAPPDGDDLWLPPPEYVPLKELTSKKNMRNFCINGEVKVCSPNGYSFRILRHILRSFD
EIYSGNHRMIGLVKVVVGLALSGAPAPEGMNWVYKLRRTLIFQWADSRGPLEGEELEHSQEITWDDDTEFVGLQMRVSAR
QCHIQGRIWCIDMNSRACQLWSDMSLQTQRSEEDKDSSLLLE
>P16287 ~~~M~~~Matrix protein~~~
MNLLRKIVKNRRDEDTQKSSPASAPLDDDDLWLPPPEYVPLKELTGKKNMRNFCINGRVKVCSPNGYSFRILRHILKSFD
EIYSGNHRMIGLVKVVIGLALSGSPVPEGLNWVYKLRRTFIFQWADSRGPLEGEELEYSQEITWDDDTEFVGLQIRVIAK
QCHIQGRVWCINMNPRACQLWSDMSLQTQRSEEDKDSSLLLE
>P17748 ~~~M~~~Matrix protein~~~
MADIYRFPKFSYEDNGTVEPLPLRTGPDKKAIPYIRIIKVGDPPKHGVRYLDLLLLGFFETPKQTTNLGSVSDLTEPTSY
SICGSGSLPIGVAKYYGTDQELLKACTDLRITVRRTVRAGEMIVYMVDSIGAPLLPWSGRLRQGMIFNANKVALAPQCLP
VDKDIRFRVVFVNGTSLGAITIAKIPKTLADLALPNSISVNLLVTLKTGISTEQKGVLPVLDDQGEKKLNFMVHLGLIRR
KVGKIYSVEYCKSKIERMRLIFSLGLIGGISFHVQVTGTLSKTFMSQLAWKRAVCFPLMDVNPHMNLVIWAASVEITGVD
AVFQPAIPRDFRYYPNVVAKNIGRIRKL
>P03426 ~~~M~~~Matrix protein~~~
MADIYRFPKFSYEDNGTVEPLPLRTGSDKKAIPYIRIIKVGDPPKHGVRYLDLLLLGFFETPKQTTNLGSVSDLTEPTSY
SICGSGSLPIGVAKYYGTDQELLKACTDLRITVRRTVRAGEMIVYMVDSIGAPLLPWSGRLRQGMIFNANKVALAPQCLP
VDKDIRFRVVFVNGTSLGAITIAKIPKTLADLALPNSISVNLLVTLKTGISTEQKGVLPVLDDQGEKKLNFMVHLGLIRR
KVGKIYSVEYCKSKIERMRLIFSLGLIGGISFHVQVTGTLSKTFMSQLAWKRAVCFPLMDVNPHMNLVIWAASVEITGVD
AVFQPAIPRDFRYYPNVVAKNIGRIRKL
>P06446 ~~~M~~~Matrix protein~~~
MADIYRFPKFSYEDNGTVEPLPLRTGPDKKAIPHIRIVKVGDPPKHGVRYLDLLLLGFFETPKQTTNLGSVSDLTEPTSY
SICGSGSLPIGVAKYYGTDQELLKACTDLRITVRRTVRAGEMIVYMVDSIGAPLLPWSGRLRQGMIFNANKVALAPQCLP
VDKDIRLRVVFVNGTSLGAITIAKIPKTLADLALPNSISVNLLVTLKTGISTEQKGVLPVLDDQGEKKLNFMVHLGLIRR
KVGKIYSVEYCKSKIERMRLIFSLGLIGGISFHVQVNGTLSKTFMSQLAWKRAVCFPLMDVNPHMNMVIWAASVEITGVD
AVFQPAIPRDFRYYPNVVAKNIGRIRKL
>P25182 ~~~M~~~Matrix protein~~~
MPTISIPADPASPDQGLKPFPIQLDSKDGKSGKLVKQIRIKYLTEPNSRSPPLTFINTYGFIYARDLSGGIMNEQSSGIQ
SGSVTACMMTLGPGPDIKNANRVLAALNGFYVKVRKTSSLKEEAVFELVNVPKLLANHALCKQGRLVCSAEKFVKNPSKM
MAGQEYLYFPTFVSLTYCPSNLNYQVAKPILKIRSRFVYSIHMEIIFRLLCKPDSPLLKTYATDPEGRGCLASVWIHVCN
ILKNKKIKQRGVDSYFSSKAISMQLTVSIADTWGPTVIIKANGHIPKTAAPFFSKDGVACHPLQDVSPALTKSLWSVGCE
ITKARLILQESNISDLLKTQDLITDQIKIKKGHSHFGRSSFNPFKKAISLPNLTQLGQDDED
>P19692 ~~~M2~~~Matrix protein~~~
MEIDPNYVNPKYSSLKSTVMNSEVLTSKYKSAIHHAGDGELEDDILAVMEELHSMLQEKGLACHTENLEVFSSTILHLKT
TGQENRAGDLIAAILSFGCSISAQAIVPSTLLKTMSEMLDSFATRNHELKLITKDLQEVVPRQVLKAKKKSKAKSAEGPS
ASTEDIKDSDTKGNQDIGDNGDLNSSINQRNREICYKHYTTDEFEALSLEKRQEIMKYYIQYILGAWGYNATDPTKTAML
YDLIDKHTVITVMRQSKEGTLTSDDILMAIDEVIDSVNSMSSCYGGYKATIGNDNGTPYLVLIPKEGVILLSYPPPIPVT
HHYYNKALSFIFPCAITIFSSPIYI
>P31620 ~~~M~~~Matrix protein~~~
MESYIIDTYQGVPYTAAVQVDLIEKDSNPATLTVWFPLFQSSTPAPVLLDQLKTLSITTQYTASPEGPVLQVNAAAQGAA
MSALPKKFAVSAAVALDEYSRLEFGTLTVCDVRSIYLTTLKPYGMVSKIMTDVRSVGRKTHDLIALCDFIDIEKGVPITI
PAYIKAVSIKDSESATVEAAISGEADQAITQARIAPYAGLILIMTMNNPKGIFKKLGAGMQVIVELGPYVQAESLGKICK
TWNHQRTRYVLRSR
>P03519 ~~~M~~~Matrix protein~~~
MSSLKKILGLKGKGKKSKKLGIAPPPYEEDTSMEYAPSAPIDKSYFGVDEMDTYDPNQLRYEKFFFTVKMTVRSNRPFRT
YSDVAAAVSHWDHMYIGMAGKRPFYKILAFLGSSNLKATPAVLADQGQPEYHTHCEGRAYLPHRMGKTPPMLNVPEHFRR
PFNIGLYKGTIELTMTIYDDESLEAAPMIWDHFNSSKFSDFREKALMFGLIVEKKASGAWVLDSISHFK
>Q8B0H2 ~~~M~~~Matrix protein~~~
MSSLKKILGLKGKGKKSKKLGIAPPPYEEDTSMEYAPSAPIDKSYFGVDEMDTHDPNQLRYEKFFFTVKMTVRSNRPFRT
YSDVAAAVSHWDHMYIGMAGKRPFYKILAFLGSSNLKATPAVLADRGQPEYHAHCEGRAYLPHRMGKTPPMLNVPEHFRR
PFNIGLYKGTVELTMTIYDDESLEAAPMIWDHFNSSKFSDFREKALMFGLIVEKKASGAWVLDSVSHFK
>P04876 ~~~M~~~Matrix protein~~~
MSSLKKILGLKGKGKKSKKLGIAPPPYEEDTSMEYAPSAPIDKSYFGVDEMDTHDPNQLRYEKSFFTVKMTVRSNRPFRT
YSDVAAAVSHWDHMYIGMAGKRPFYKILAFLGSSNLKATPAVLADQGQPEYHAHCEGRAYLPHRMGKTPPMLNVPEHFRR
PFNIGLYKGTIELTMTIYDDESLEAAPMIWDHFNSSKFSDFREKALMFGLIVEEEASGAWVLDSVRHSKWASLASSF
>Q8B0I2 ~~~M~~~Matrix protein~~~
MSSLKKILGLKGKGKKSKKLGIAPPPYEEDTSMEYAPSAPIDKSYFGVDEMDTHDPNQLRYEKFFFTVKLTVRSNRPFRT
YSDVAAAVSHWDHMYIGMAGKRPFYKILAFLGSSNLKATPAVLADQGQPEYHAHCEGRAYLPHRMGKTPPMLNVPEHFRR
PFNIGLYKGTIELTMTIYDDESLEAAPMIWDHFNSSKFSDFREKALMFGLIVEKKASGAWILDSVSHFK
>Q8B0H7 ~~~M~~~Matrix protein~~~
MSSLKKILGLKGKGKKSKKLGIAPPPYEEDTSMEYAPSAPIDKSYFGVDEMDTHDPNQLRYEKFFFTVKMTVRSNRPFRT
YSDVAAAVSHWDHMYIGMAGKRPFYKILAFLGSSNLKATPAVLADQGQPEYHAHCEGRAYLPHRMGKTPPMLNVPEHFRR
PFNIGLYKGTIELTMTIYDDESLEAAPMIWDHFNSSKFSDFREKALMFGLIVEKKASGAWVLDSVSHFK
>P08325 ~~~M~~~Matrix protein~~~
MSSFKKILGLSSKSHKKSKKMGLPPPYDESCPMETQPSAPLSNDFFGMEDMDLYDKDSLRYEKFRFMLKMTVRSNKPFRS
YDDVTAAVSQWDNSYIGMVGKRPFYKIIAVIGSSHLQATPAVLADLNQPEYYATLTGRCFLPHRLGLIPPMFNVQETFRK
PFNIGLYKGTLDFTFTVSDDESNEKVPHVWDYMNPKYQSQIQQEGLKFGLILSKKATGTWVLDQLSPFK
>Q5VKP0 ~~~M~~~Matrix protein~~~
MNFLRKMMKTCRDDESSKPLDPSAPPDDDDLWLPPPEYVPLHEISSKGNTRNFCISGEVKICSPNGYSFKIIRHILRSFE
SVYSGNRRMIGLVKVVIGLTLSGSPVPEGMNWVYKLRKTLVFQWSNSSGPLEGEELEYSQEITWDDDSEYVGLQIRVNAK
QCHIAGRSWCVNMNSRACQLWSDMTLKTQQSEEDEHTSVLIE
>Q6I7B9 ~~~M~~~Polyprotein p42~~~
MAHEILIAETEAFLKNVAPETRTAIISAITGGKSACKSAAKLIKNEHLPLMSGEATTMHIVMRCLYPEIKPWKKASDMLN
KATSSLKKSEGRDIRKQMKAAGDFLGVESMMKMRAFRDDQIMEMVEEVYDHPDDYTPDIRIGTITAWLRCKNKKSERYRS
NVSESGRTALKIHEVRKASTAMNEIAGITGLGEEALSLQRQTESLAILCNHTFGSNIMRPHLEKAIKGVEGRVGEMGRMA
MKWLVVIICFSITSQPASACNLKTCLKLFNNTDAVTVHCFNENQGYMLTLASLGLGIITMLYLLVKIIIELVNGFVLGRW
ERWCGDIKTTIMPEIDSMEKDIALSRERLDLGEDAPDETDNSPIPFSNDGIFEI
>Q98176 ~~~~~~Protein MC005~~~
MCLVAPMQCGCASCVRILDALLSAMEALVQMRLLSEEEKTSCASQFLELAIFAVENCRGGRQALLQARGEPASLGEVAGK
GPAAD
>Q98178 ~~~~~~Protein MC007~~~
MHAELAEVAAAVSAAACIVGLRHARPGCARAFVAGALVSSVGYTCFPAIRHCVHMAARYLSYLCFVRGGATACVHSVADE
AVQELVDSLHRVLAERDQCVLGSVELVQGSEPLVLGLPQAPVLCVPEPSGAVDPECVEVDLYCHENLRYESDVSEDEDKD
DACDEGISLARQTLLDLLACSEDASGFSPPEDSFSSLLETGELYDEVLDASLARAEA
>Q98298 ~~~~~~Protein MC132~~~
MMNFSDPCLLGPSACNEDFFEELAREILPTPEAKALAARLLRRLGWHPSEGQCSWCPEAYEYLYDLQFRYTGPVPLVRKH
PGSALIVRWMLDGLGHFLFCRPRVQGNPLTYHLSCMGSAIRTLELFREQAHSHLWEQGNLVDCYWSCSSDYGYEGKWGQQ
AVVRGALSGPWPRPDPPSLQLQCLCAIWRTCLLRGQSQAQKYTNWICARHLQLRPDTSTRDSDLLLQRC
>Q98314 ~~~~~~Chemokine-like protein MC148~~~
MRGGDVFASVVLMLLLALPRPGVSLARRKCCLNPTNRPIPNPLLQDLSRVDYQAIGHDCGREAFRVTLQDGRQGCVSVGN
KSLLDWLRGHKDLCPQIWSGCESL
>Q98326 ~~~~~~Protein MC160~~~
MAHEPIPFSFLRNLLAELDASEHEVLRFLCRDVAPASKTAEDALRALQRRRLLTLSSMAELLCALRRFDVLKVRFGMTRE
CAGRLLGHGFLSQYRLQVAAINNMVGSEDLRVMCLCAGKLLPPSCTPRCLVDLVSALEDAGAISPQDVSVLVTLLHAVCR
YDLSVALSAVAHGHMTVGVGTPVQDEPMDVLEVDDAEPMEATPACDEIGVVKLAGAASAGAPLADGAFAACTSAGKGEDL
ATSDLTDSEPEDSVFAVADPVYADVDLSMFVRANATADSSMFVNADAGADSSLVNADAGADSSLVNADAGADSSLVNAVA
DANSSLMRTTSACTDSEPEDSAGPSCAGMALSMFGRAKSVSSLLLRTKASY
>P09516 ~~~ORF3/ORF5~~~Readthrough protein P3-RTD~~~
MNSVGRRGPRRANQNGTRRRRRRTVRPVVVVQPNRAGPRRRNGRRKGRGGANFVFRPTGGTEVFVFSVDNLKANSSGAIK
FGPSLSQCPALSDGILKSYHRYKITSIRVEFKSHASANTAGAIFIELDTACKQSALGSYINSFTISKTASKTFRSEAING
KEFQESTIDQFWMLYKANGTTTDTAGQFIITMSVSLMTAKXVDSSTPEPKPAPEPTPTPQPTPAPQPTPEPTPAPVPKRF
FEYIGTPTGTISTRENTDSISVSKLGGQSMQYIENEKCETKVIDSFWSTNNNVSAQAAFVYPVPEGSYSVNISCEGFQSV
DHIGGNEDGYWIGLIAYSNSSGDNWGVGNYKGCSFKNFLATNTWRPGHKDLKLTDCQFTDGQIVERDAVMSFHVEATGKD
ASFYLMAPKTMKTDKYNYVVSYGGYTNKRMEFGTISVTCDESDVEAERITRHAETPIRSKHILVSERYAEPLPTIVNQGL
CDVKTPEQEQTLVDEDDRQTVSTESDIALLEYEAATAEIPDAEEDVLPSKEQLSSKPMDTSGNIIPKPKEPEVLGTYQGQ
NIYPEDVPPMARQKLREAANAPSTLLYERRTPKKSGNFLSRLVEANRSPTTPTAPSVSTTSNMTREQLREYTRIRNSSGI
TAAKAYKAQFQ
>P17525 ~~~ORF3/ORF5~~~Readthrough protein P3-RTD~~~
MSTVVVKGNVNGGVQQPRMRRRQSLRRRANRVQPVVMVTAPGQPRRRRRRRGGNRRSRRTGVPRGRGSSETFVFTKDNLV
GNTQGSFTFGPSLSDCPAFKDGILKAYHEYKITSILLQFVSEASSTSSGSIAYELDPHCKVSSLQSYVNKFQITKGGAKT
YQARMINGVEWHDSSEDQCRILWKGNGKSSDSAGSFRVTIKVALQNPKYDSGSEPSPSPQPTPTPTPQKHERFIAYVGIP
MLTIQARENDDQIILGSLGSQRMKYIEDENQNYTKFSSEYYSQSSMQAVPMYYFNVPKGQWSVDISCEGYQPTSSTSDPN
RGRSDGMIAYSNADSDYWNVGEADGVKISKLRNDNTYRQGHPELEINSCHFREGQLLERDATISFHVEAPTDGRFFLVGP
AIQKTAKYNYTISYGDWTDRDMELGLITVVLDEHLEGTGSANRVRRPPREGHTYMASPHEPEGKPVGNKPRDETPIQTQE
RQPDQTPSDDVSDAGSVNSGGPTESLRLEFGVNSDSTYDATVDGTDWPRIPPPRHPPEPRVSGNSRTVTDFSSKADLLEN
WDAEHFDPGYSKEDVAAATIIAHGSIQDGRSMLEKREENVKNKTSSWKPPSLKAVSPAIAKLRSIRKSQPLEGGTLNKDA
TDGVSSIGSGSLTGGTLKRKATIEERLLQTLTTEQRLWYENFKKTNPPAATQWLFEYQPPPQVDRNIAEKPFQGRK
>P09514 ~~~ORF3/ORF5~~~Readthrough protein P3-RTD~~~
MNTVVGRRIINGRRRPRRQTRRAQRPQPVVVVQTSRATQRRPRRRRRGNNRTGRTVPTRGAGSSETFVFSKDNLAGSSSG
AITFGPSLSDCPAFSNGMLKAYHEYKISMVILEFVSEASSQNSGSIAYELDPHCKLNSLSSTINKFGITKPGKRAFTASY
INGTEWHDVAEDQFRILYKGNGSSSIAGSFRITIKCQFHNPKYVDEEPGPSPGPSPSPQPTPQKKYRFIVYTGVPVTRIM
AQSTDDAISLYDMPSQRFRYIEDENMNWTNLDSRWYSQNSLKAIPMIIVPVPQGEWTVEISMEGYQPTSSTTDPNKDKQD
GLIAYNDDLSEGWNVGIYNNVEITNNKADNTLKYGHPDMELNGCHFNQGQCLERDGDLTCHIKTTGDNASFFVVGPAVQK
QSKYNYAVSYGAWTDRMMEIGMIAIALDEQGSSGSVKTERPKRVGHSMAVSTWETIKLPEKGNSEGYETSQRQDSKTPPT
ASGGSDTLDVEEGGLPLPVEEEIPDFVGDNPWSDLSTKNSQEEEAMSSESGLRPQLKPPGLPKPQPIRTIRNFDPTPDLV
EAWRPDVNPGYSKADVAAATIIAGGSIKDGRSMIDKRNKAVLDGRKSWGSSLASSLTGGTLKASAKSEKLAKLTTSERAR
YERIKRQQGSTRASEFLESLLAGEDPDSRF
>P04298 ~~~~~~mRNA-capping enzyme catalytic subunit~~~
MDANVVSSSTIATYIDALAKNASELEQRSTAYEINNELELVFIKPPLITLTNVVNISTIQESFIRFTVTNKEGVKIRTKI
PLSKVHGLDVKNVQLVDAIDNIVWEKKSLVTENRLHKECLLRLSTEERHIFLDYKKYGSSIRLELVNLIQAKTKNFTIDF
KLKYFLGSGAQSKSSLLHAINHPKSRPNTSLEIEFTPRDNETVPYDELIKELTTLSRHIFMASPENVILSPPINAPIKTF
MLPKQDIVGLDLENLYAVTKTDGIPITIRVTSNGLYCYFTHLGYIIRYPVKRIIDSEVVVFGEAVKDKNWTVYLIKLIEP
VNAINDRLEESKYVESKLVDICDRIVFKSKKYEGPFTTTSEVVDMLSTYLPKQPEGVILFYSKGPKSNIDFKIKKENTID
QTANVVFRYMSSEPIIFGESSIFVEYKKFSNDKGFPKEYGSGKIVLYNGVNYLNNIYCLEYINTHNEVGIKSVVVPIKFI
AEFLVNGEILKPRIDKTMKYINSEDYYGNQHNIIVEHLRDQSIKIGDIFNEDKLSDVGHQYANNDKFRLNPEVSYFTNKR
TRGPLGILSNYVKTLLISMYCSKTFLDDSNKRKVLAIDFGNGADLEKYFYGEIALLVATDPDADAIARGNERYNKLNSGI
KTKYYKFDYIQETIRSDTFVSSVREVFYFGKFNIIDWQFAIHYSFHPRHYATVMNNLSELTASGGKVLITTMDGDKLSKL
TDKKTFIIHKNLPSSENYMSVEKIADDRIVVYNPSTMSTPMTEYIIKKNDIVRVFNEYGFVLVDNVDFATIIERSKKFIN
GASTMEDRPSTRNFFELNRGAIKCEGLDVEDLLSYYVVYVFSKR
>P04318 ~~~~~~mRNA-capping enzyme regulatory subunit OPG124~~~
MDEIVKNIREGTHVLLPFYETLPELNLSLGKSPLPSLEYGANYFLQISRVNDLNRMPTDMLKLFTHDIMLPESDLDKVYE
ILKINSVKYYGRSTKADAVVADLSARNKLFKRERDAIKSNNHLTENNLYISDYKMLTFDVFRPLFDFVNEKYCIIKLPTL
FGRGVIDTMRIYCSLFKNVRLLKCVSDSWLKDSAIMVASDVCKKNLDLFMSHVKSVTKSSSWKDVNSVQFSILNNPVDTE
FINKFLEFSNRVYEALYYVHSLLYSSMTSDSKSIENKHQRRLVKLLL
>P32094 ~~~~~~mRNA-capping enzyme~~~
MASLDNLVARYQRCFNDQSLKNSTIELEIRFQQINFLLFKTVYEALVAQEIPSTISHSIRCIKKVHHENHCREKILPSEN
LYFKKQPLMFFKFSEPASLGCKVSLAIEQPIRKFILDSSVLVRLKNRTTFRVSELWKIELTIVKQLMGSEVSAKLAAFKT
LLFDTPEQQTTKNMMTLINPDDEYLYEIEIEYTGKPESLTAADVIKIKNTVLTLISPNHLMLTAYHQAIEFIASHILSSE
ILLARIKSGKWGLKRLLPQVKSMTKADYMKFYPPVGYYVTDKADGIRGIAVIQDTQIYVVADQLYSLGTTGIEPLKPTIL
DGEFMPEKKEFYGFDVIMYEGNLLTQQGFETRIESLSKGIKVLQAFNIKAEMKPFISLTSADPNVLLKNFESIFKKKTRP
YSIDGIILVEPGNSYLNTNTFKWKPTWDNTLDFLVRKCPESLNVPEYAPKKGFSLHLLFVGISGELFKKLALNWCPGYTK
LFPVTQRNQNYFPVQFQPSDFPLAFLYYHPDTSSFSNIDGKVLEMRCLKREINYVRWEIVKIREDRQQDLKTGGYFGNDF
KTAELTWLNYMDPFSFEELAKGPSGMYFAGAKTGIYRAQTALISFIKQEIIQKISHQSWVIDLGIGKGQDLGRYLDAGVR
HLVGIDKDQTALAELVYRKFSHATTRQHKHATNIYVLHQDLAEPAKEISEKVHQIYGFPKEGASSIVSNLFIHYLMKNTQ
QVENLAVLCHKLLQPGGMVWFTTMLGEQVLELLHENRIELNEVWEARENEVVKFAIKRLFKEDILQETGQEIGVLLPFSN
GDFYNEYLVNTAFLIKIFKHHGFSLVQKQSFKDWIPEFQNFSKSLYKILTEADKTWTSLFGFICLRKN
>Q5UQX1 ~~~~~~Probable mRNA-capping enzyme~~~
MGTKLKKSNNDITIFSENEYNEIVEMLRDYSNGDNLEFEVSFKNINYPNFMRITEHYINITPENKIESNNYLDISLIFPD
KNVYRVSLFNQEQIGEFITKFSKASSNDISRYIVSLDPSDDIEIVYKNRGSGKLIGIDNWAITIKSTEEIPLVAGKSKIS
KPKITGSERIMYRYKTRYSFTINKNSRIDITDVKSSPIIWKLMTVPSNYELELELINKIDINTLESELLNVFMIIQDTKI
PISKAESDTVVEEYRNLLNVRQTNNLDSRNVISVNSNHIINFIPNRYAVTDKADGERYFLFSLNSGIYLLSINLTVKKLN
IPVLEKRYQNMLIDGEYIKTTGHDLFMVFDVIFAEGTDYRYDNTYSLPKRIIIINNIIDKCFGNLIPFNDYTDKHNNLEL
DSIKTYYKSELSNYWKNFKNRLNKSTDLFITRKLYLVPYGIDSSEIFMYADMIWKLYVYNELTPYQLDGIIYTPINSPYL
IRGGIDAYDTIPMEYKWKPPSQNSIDFYIRFKKDVSGADAVYYDNSVERAEGKPYKICLLYVGLNKQGQEIPIQFKVNGV
EQTANIYTKDGEATDINGNAINDNTVVEFVFDTLKIDMDDSYKWIPIRTRYDKTESVQKYHKRYGNNLQIANRIWKTITN
PITEDIISSLGDPTTFNKEITLLSDFRDTKYNKQALTYYQKNTSNAAGMRAFNNWIKSNMITTYCRDGSKVLDIGCGRGG
DLIKFINAGVEFYVGIDIDNNGLYVINDSANNRYKNLKKTIQNIPPMYFINADARGLFTLEAQEKILPGMPDFNKSLINK
YLVGNKYDTINCQFTIHYYLSDELSWNNFCKNINNQLKDNGYLLITSFDGNLIHNKLKGKQKLSSSYTDNRGNKNIFFEI
NKIYSDTDKVGLGMAIDLYNSLISNPGTYIREYLVFPEFLEKSLKEKCGLELVESDLFYNIFNTYKNYFKKTYNEYGMTD
VSSKKHSEIREFYLSLEGNANNDIEIDIARASFKLAMLNRYYVFRKTSTINITEPSRIVNELNNRIDLGKFIMPYFRTNN
MFIDLDNVDTDINRVYRNIRNKYRTTRPHVYLIKHNINENRLEDIYLSNNKLDFSKIKNGSDPKVLLIYKSPDKQFYPLY
YQNYQSMPFDLDQIYLPDKKKYLLDSDRIINDLNILINLTEKIKNIPQLS
>Q84424 2.7.7.50~~~~~~mRNA-capping enzyme~~~
MVPPTINTGKNITTERAVLTLNGLQIKLHKVVGESRDDIVAKMKDLAMDDHKFPRLPGPNPVSIERKDFEKLKQNKYVVS
EKTDGIRFMMFFTRVFGFKVCTIIDRAMTVYLLPFKNIPRVLFQGSIFDGELCVDIVEKKFAFVLFDAVVVSGVTVSQMD
LASRFFAMKRSLKEFKNVPEDPAILRYKEWIPLEHPTIIKDHLKKANAIYHTDGLIIMSVDEPVIYGRNFNLFKLKPGTH
HTIDFIIMSEDGTIGIFDPNLRKNVPVGKLDGYYNKGSIVECGFADGTWKYIQGRSDKNQANDRLTYEKTLLNIEENITI
DELLDLFKWE
>P14583 2.7.7.50~~~~~~Putative mRNA-capping enzyme P5~~~
MSNPDYCIPNFSQTVNERTIIDIFTICRYRSPLVVFCLSHNELAKKYAQDVSMSSGTHVHIIDGSVEITASLYRTFRTIA
TQLLGRMQIVVFVTVDKSVVSTQVMKSIAWAFRGSFVELRNQSVDSSTLVSKLENLVSFAPLYNVPKCGPDYYGPTVYSE
LLSLATNARTHWYATIDYSMFTRSVLTGFIAKYFNEEAVPIDKRIVSIVGYNPPYVWTCLRHGIRPTYIEKSLPNPGGKG
PFGLILPVINELVLKSKVKYVMHNPQIKLLCLDTFMLSTSMNILYIGAYPATHLLSLQLNGWTILAFDPKITSDWTDAMA
KATGAKVIGVNKEFDFKSFSVQANQLNMFQNSKLSVIDDTWVETDYEKFQAEKQAYFEWLIDRTSIDVRLISMKWNRSKD
TSVSHLLALLPQPYGASIREMRAFFHKKGASDIKILAAETEKYMDDFTAMSVSDQINTQKFMHCMITTVGDALKMDLDGG
RAVIASYSLSNSSNPKERVLKFLSDANKAKAMVVFGAPNTHRLAYAKKVGLVLDSAIKMSKDLITFSNPTGRRWRDYGYS
QSELYDAGYVEITIDQMVAYSSDVYNGVGYFANSTYNDLFSWYIPKWYVHKRMLMQDIRLSPAALVKCFTTLIRNICYVP
HETYYRFRGILVDKYLRSKNVDPSQYSIVGSGSKTFTVLNHFEVPHECGPLVFEASTDVNISGHLLSLAIAAHFVASPMI
LWAEQMKYMAVDRMLPPNLDKSLFFDNKVTPSGALQRWHSREEVLLAAEICESYAAMMLNNKHSPDIIGTLKSAINLVFK
I
>Q85437 2.7.7.50~~~~~~Putative mRNA-capping enzyme P5~~~
MSNPDYCIPNFSQTVNERTIIDIFTICRYRSPLVVFCLSHTELAKKYAQDVSMSSGTHVHIIDGSVEITTSLYRTFRTIA
TQLLGRMQIVVFVTVDKSVVSTQVMKSTAWASRGSFVELRNQSVDSSTLVNKLENLVSFAPLYNVPKCGPDYYGPTVYSE
LLSLATNARTHWYATIDYSMFTRSVLTGFIAKYFNEEAVPIDKRIVSIVGYNPPYVWTCLRHGIRPTYIQKSLPNPGGKG
PFGLILPVINELVLKSKVKYVMHNPQIKLLCLDTFMLSTSMNILYIGAYPATHLLSLQLNGWTILAFDPKITSDWTDAMA
KATGAKVIGVNKEFDFKSFSVQANQLNMFQNSKLSVIDDTWVETDYEKFQAEKQAYFEWLIDRTSIDVRLISMKWNRSKD
TSVSHLLALLPQPYGASIREMRAFFHKKGASDIKILAAETEKYMDDFTAMSVSDQINTQKFMHCMITTVGDALKMDLDGG
RAVIASYSLSNSSNPKERVLKFLSDANNGKAMVVFGAPNTYRLAYAKKVGLVLDSAIKMSKDLITFSNPTGRRWRDYGYS
QSELYDAGYVEITIDQMVDYSSDVYNGVGYFANSTYNDLFSWYIPKWYVHKRMLMQDIRLSPAALVKCFTTLIRNICYVP
HETYYRFRGILVDKYLRSKNVDSISLFHYWSGSKTFTVLSHFEVPHECGPLVFEASTDVNVSGHLLSLAIAAHFVASPMI
LWAEQMKYMAVDRMLPPNLDKSLFFDNKVTPSGALQRWHSREEVLLAAEICESYAAMMLNNKHSPDIIGTLKSAINLVFK
I
>P07617 2.1.1.57~~~~~~Cap-specific mRNA (nucleoside-2'-O-)-methyltransferase~~~
MDVVSLDKPFMYFEEIDNELDYEPESANEVAKKLPYQGQLKLLLGELFFLSKLQRHGILDGATVVYIGSAPGTHIRYLRD
HFYNLGVIIKWMLIDGRHHDPILNGLRDVTLVTRFVDEEYLRSIKKQLHPSKIILISDVRSKRGGNEPSTADLLSNYALQ
NVMISILNPVASSLKWRCPFPDQWIKDFYIPHGNKMLQPFAPSYSAEMRLLSIYTGENMRLTRVTKSDAVNYEKKMYYLN
KIVRNKVVVNFDYPNQEYDYFHMYFMLRTVYCNKTFPTTKAKVLFLQQSIFRFLNIPTTSTEKVSHEPIQRKISSKNSMS
KNRNSKRSVRSNK
>P0C703 ~~~MCP~~~Major capsid protein~~~
MASNEGVENRPFPYLTVDADLLSNLRQSAAEGLFHSFDLLVGKDAREAGIKFEVLLGVYTNAIQYVRFLETALAVSCVNT
EFKDLSRMTDGKIQFRISVPTIAHGDGRRPSKQRTFIVVKNCHKHHISTEMELSMLDLEILHSIPETPVEYAEYVGAVKT
VASALQFGVDALERGLINTVLSVKLRHAPPMFILQTLADPTFTERGFSKTVKSDLIAMFKRHLLEHSFFLDRAENMGSGF
SQYVRSRLSEMVAAVSGESVLKGVSTYTTAKGGEPVGGVFIVTDNVLRQLLTFLGEEADNQIMGPSSYASFVVRGENLVT
AVSYGRVMRTFEHFMARIVDSPEKAGSTKSDLPAVAAGVEDQPRVPISAAVIKLGNHAVAVESLQKMYNDTQSPYPLNRR
MQYSYYFPVGLFMPNPKYTTSAAIKMLDNPTQQLPVEAWIVNKNNLLLAFNLQNALKVLCHPRLHTPAHTLNSLNAAPAP
RDRRETYSLQHRRPNHMNVLVIVDEFYDNKYAAPVTDIALKCGLPTEDFLHPSNYDLLRLELHPLYDIYIGRDAGERARH
RAVHRLMVGNLPTPLAPAAFQEARGQQFETATSLAHVVDQAVIETVQDTAYDTAYPAFFYVVEAMIHGFEEKFVMNVPLV
SLCINTYWERAGRLAFVNSFSMIKFICRHLGNNAISKEAYSMYRKIYGELIALEQALMRLAGSDVVGDESVGQYVCALLD
PNLLPPVAYTDIFTHLLTVSDRAPQIIIGNEVYADTLAAPQFIERVGNMDEMAAQFVALYGYRVNGDHDHDFRLHLGPYV
DEGHADVLEKIFYYVFLPTCTNAHMCGLGVDFQHVAQTLAYNGPAFSHHFTRDEDILDNLENGTLRDLLEISDLRPTVGM
IRDLSASFMTCPTFTRTVRVSVDNDVTQQLAPNPADKRTEQTVLVNGLVAFAFSERTRAVTQCLFHAIPFHMFYGDPRVA
ATMHQDVATFVMRNPQQRAVEAFNRPEQLFAEYREWHRSPMGKYAAECLPSLVSISGMTAMHIKMSPMAYIAQAKLKIHP
GVAMTVVRTDEILSENILFSSRASTSMFIGTPNVSRREARVDAVTFEVHHEMASIDTGLSYSSTMTPARVAAITTDMGIH
TQDFFSVFPAEAFGNQQVNDYIKAKVGAQRNGTLLRDPRTYLAGMTNVNGAPGLCHGQQATCEIIVTPVTADVAYFQKSN
SPRGRAACVVSCENYNQEVAEGLIYDHSRPDAAYEYRSTVNPWASQLGSLGDIMYNSSYRQTAVPGLYSPCRAFFNKEEL
LRNNRGLYNMVNEYSQRLGGHPATSNTEVQFVVIAGTDVFLEQPCSFLQEAFPALSASSRALIDEFMSVKQTHAPIHYGH
YIIEEVAPVRRILKFGNKVVF
>P03226 ~~~MCP~~~Major capsid protein~~~
MASNEGVENRPFPYLTVDADLLSNLRQSAAEGLFHSFDLLVGKDAREAGIKFEVLLGVYTNAIQYVRFLETALAVSCVNT
EFKDLSRMTDGKIQFRISVPTIAHGDGRRPSKQRTFIVVKNCHKHHISTEMELSMLDLEILHSIPETPVEYAEYVGAVKT
VASALQFGVDALERGLINTVLSVKLRHAPPMFILQTLADPTFTERGFSKTVKSDLIAMFKRHLLEHSFFLDRAENMGSGF
SQYVRSRLSEMVAAVSGESVLKGVSTYTTAKGGEPVGGVFIVTDNVLRQLLTFLGEEADNQIMGPSSYASFVVRGENLVT
AVSYGRVMRTFEHFMARIVDSPEKAGSTKSDLPAVAAGVEDQPRVPISAAVIKLGNHAVAVESLQKMYNDTQSPYPLNRR
MQYSYYFPVGLFMPNPKYTTSAAIKMLDNPTQQLPVEAWIVNKNNLLLAFNLQNALKVLCHPRLHTPAHTLNSLNAAPAP
RDRRETYSLQHRRPNHMNVLVIVDEFYDNKYAAPVTDIALKCGLPTEDFLHPSNYDLLRLELHPLYDIYIGRDAGERARH
RAVHRLMVGNLPTPLAPAAFQEARGQQFETATSLAHVVDQAVIETVQDTAYDTAYPAFFYVVEAMIHGFEEKFVMNVPLV
SLCINTYWERSGRLAFVNSFSMIKFICRHLGNNAISKEAYSMYRKIYGELIALEQALMRLAGSDVVGDESVGQYVCALLD
PNLLPPVAYTDIFTHLLTVSDRAPQIIIGNEVYADTLAAPQFIERVGNMDEMAAQFVALYGYRVNGDHDHDFRLHLGPYV
DEGHADVLEKIFYYVFLPTCTNAHMCGLGVDFQHVAQTLAYNGPAFSHHFTRDEDILDNLENGTLRDLLEISDLRPTVGM
IRDLSASFMTCPTFTRAVRVSVDNDVTQQLAPNPADKRTEQTVLVNGLVAFAFSERTRAVTQCLFHAIPFHMFYGDPRVA
ATMHQDVATFVMRNPQQRAVEAFNRPEQLFAEYREWHRSPMGKYAAECLPSLVSISGMTAMHIKMSPMAYIAQAKLKIHP
GVAMTVVRTDEILSENILFSSRASTSMFIGTPNVSRREARVDAVTFEVHHEMASIDTGLSYSSTMTPARVAAITTDMGIH
TQDFFSVFPAEAFGNQQVNDYIKAKVGAQRNGTLLRDPRTYLAGMTNVNGAPGLCHGQQATCEIIVTPVTADVAYFQKSN
SPRGRAACVVSCENYNQEVAEGLIYDHSRPDAAYEYRSTVNPWASQLGSLGDIMYNSSYRQTAVPGLYSPCRAFFNKEEL
LRNNRGLYNMVNEYSQRLGGHPATSNTEVQFVVIAGTDVFLEQPCSFLQEAFPALSASSRALIDEFMSVKQTHAPIHYGH
YIIEEVAPVRRILKFGNKVVF
>P16729 ~~~MCP~~~Major capsid protein~~~
MENWSALELLPKVGIPTDFLTHVKTSAGEEMFEALRIYYGDDPERYNIHFEAIFGTFCNRLEWVYFLTSGLAAAAHAIKF
HDLNKLTTGKMLFHVQVPRVASGAGLPTSRQTTIMVTKYSEKSPITIPFELSAACLTYLRETFEGTILDKILNVEAMHTV
LRALKNTADAMERGLIHSFLQTLLRKAPPYFVVQTLVENATLARQALNRIQRSNILQSFKAKMLATLFLLNRTRDRDYVL
KFLTRLAEAATDSILDNPTTYTTSSGAKISGVMVSTANVMQIIMSLLSSHITKETVSAPATYGNFVLSPENAVTAISYHS
ILADFNSYKAHLTSGQPHLPNDSLSQAGAHSLTPLSMDVIRLGEKTVIMENLRRVYKNTDTKDPLERNVDLTFFFPVGLY
LPEDRGYTTVESKVKLNDTVRNALPTTAYLLNRDRAVQKIDFVDALKTLCHPVLHEPAPCLQTFTERGPPSEPAMQRLLE
CRFQQEPMGGAARRIPHFYRVRREVPRTVNEMKQDFVVTDFYKVGNITLYTELHPFFDFTHCQENSETVALCTPRIVIGN
LPDGLAPGPFHELRTWEIMEHMRLRPPPDYEETLRLFKTTVTSPNYPELCYLVDVLVHGNVDAFLLIRTFVARCIVNMFH
TRQLLVFAHSYALVTLIAEHLADGALPPQLLFHYRNLVAVLRLVTRISALPGLNNGQLAEEPLSAYVNALHDHRLWPPFV
THLPRNMEGVQVVADRQPLNPANIEARHHGVSDVPRLGAMDADEPLFVDDYRATDDEWTLQKVFYLCLMPAMTNNRACGL
GLNLKTLLVDLFYRPAFLLMPAATAVSTSGTTSKESTSGVTPEDSIAAQRQAVGEMLTELVEDVATDAHTPLLQACRELF
LAVQFVGEHVKVLEVRAPLDHAQRQGLPDFISRQHVLYNGCCVVTAPKTLIEYSLPVPFHRFYSNPTICAALSDDIKRYV
TEFPHYHRHDGGFPLPTAFAHEYHNWLRSPFSRYSATCPNVLHSVMTLAAMLYKISPVSLVLQTKAHIHPGFALTAVRTD
TFEVDMLLYSGKSCTSVIINNPIVTKEERDISTTYHVTQNINTVDMGLGYTSNTCVAYVNRVRTDMGVRVQDLFRVFPMN
VYRHDEVDRWIRHAAGVERPQLLDTETISMLTFGSMSERNAAATVHGQKAACELILTPVTMDVNYFKIPNNPRGRASCML
AVDPYDTEAATKAIYDHREADAQTFAATHNPWASQAGCLSDVLYNTRHRERLGYNSKFYSPCAQYFNTEEIIAANKTLFK
TIDEYLLRAKDCIRGDTDTQYVCVEGTEQLIENPCRLTQEALPILSTTTLALMETKLKGGAGAFATSETHFGNYVVGEII
PLQQSMLFNS
>P06491 ~~~MCP~~~Major capsid protein~~~
MAAPNRDPPGYRYAAAMVPTGSLLSTIEVASHRRLFDFFSRVRSDANSLYDVEFDALLGSYCNTLSLVRFLELGLSVACV
CTKFPELAYMNEGRVQFEVHQPLIARDGPHPIEQPTHNYMTKIIDRRALNAAFSLATEAIALLTGEALDGTGIGAHRQLR
AIQQLARNVQAVLGAFERGTADQMLHVLLEKAPPLALLLPMQRYLDNGRLATRVARATLVAELKRSFCETSFFLGKAGHR
REAVEAWLVDLTTATQPSVAVPRLTHADTRGRPVDGVLVTTAPIKQRLLQSFLKVEDTEADVPVTYGEMVLNGANLVTAL
VMGKAVRSLDDVGRHLLEMQEEQLDLNRQTLDELESAPQTTRVRADLVSIGEKLVFLEALEKRIYAATNVPYPLVGAMDL
TFVLPLGLFNPVMERFAAHAGDLVPAPGHPDPRAFPPRQLFFWGKDRQVLRLSLEHAIGTVCHPSLMNVDAAVGGLNRDP
VEAANPYGAYVAAPAGPAADMQQLFLNAWGQRLAHGRVRWVAEGQMTPEQFMQPDNANLALELHPAFDFFVGVADVELPG
GDVPPAGPGEIQATWRVVNGNLPLALCPAAFRDARGLELGVGRHAMAPATIAAVRGAFDDRNYPAVFYLLQAAIHGSEHV
FCALARLVVQCITSYWNNTRCAAFVNDYSLVSYVVTYLGGDLPEECMAVYRDLVAHVEALAQLVDDFTLTGPELGGQAQA
ELNHLMRDPALLPPLVWDCDALMRRAALDRHRDCRVSAGGHDPVYAAACNVATADFNRNDGQLLHNTQARAADAADDRPH
RGADWTVHHKIYYYVMVPAFSRGRCCTAGVRFDRVYATLQNMVVPEIAPGEECPSDPVTDPAHPLHPANLVANTVNAMFH
NGRVVVDGPAMLTLQVLAHNMAERTTALLCSAAPDAGANTASTTNMRIFDGALHAGILLMAPQHLDHTIQNGDYFYPLPV
HALFAGADHVANAPNFPPALRDLSRQVPLVPPALGANYFSSIRQPVVQHVRESAAGENALTYALMAGYFKISPVALHHQL
KTGLHPGFGFTVVRQDRFVTENVLFSERASEAYFLGQLQVARHETGGGVNFTLTQPRANVDLGVGYTAVVATATVRNPVT
DMGNLPQNFYLGRGAPPLLDNAAAVYLRNAVVAGNRLGPAQPVPVFGCAQVPRRAGMDHGQDAVCEFIATPVSTDVNYFR
RPCNPRGRAAGGVYAGDKEGDVTALMYDHGQSDPSRAFAATANPWASQRFSYGDLLYNGAYHLNGASPVLSPCFKFFTSA
DIAAKHRCLERLIVETGSAVSTATAASDVQFKRPPGCRELVEDPCGLFQEAYPLTCASDPALLRSARNGEAHARETHFAQ
YLVYDASPLKGLAL
>P89442 ~~~MCP~~~Major capsid protein~~~
MAAPARDPPGYRYAAAILPTGSILSTIEVASHRRLFDFFAAVRSDENSLYDVEFDALLGSYCNTLSLVRFLELGLSVACV
CTKFPELAYMNEGRVQFEVHQPLIARDGPHPVEQPVHNYMTKVIDRRALNAAFSLATEAIALLTGEALDGTGISLHRQLR
AIQQLARNVQAVLGAFERGTADQMLHVLLEKAPPLALLLPMQRYLDNGRLATRVARATLVAELKRSFCDTSFFLGKAGHR
REAIEAWLVDLTTATQPSVAVPRLTHADTRGRPVDGVLVTTAAIKQRLLQSFLKVEDTEADVPVTYGEMVLNGANLVTAL
VMGKAVRSLDDVGRHLLDMQEEQLEANRETLDELESAPQTTRVRADLVAIGDRLVFLEALERRIYAATNVPYPLVGAMDL
TFVLPLGLFNPAMERFAAHAGDLVPAPGHPEPRAFPPRQLFFWGKDHQVLRLSMENAVGTVCHPSLMNIDAAVGGVNHDP
VEAANPYGAYVAAPAGPGADMQQRFLNAWRQRLAHGRVRWVAECQMTAEQFMQPDNANLALELHPAFDFFAGVADVELPG
GEVPPAGPGAIQATWRVVNGNLPLALCPVAFRDARGLELGVGRHAMAPATIAAVRGAFEDRSYPAVFYLLQAAIHGNEHV
FCALARLVTQCITSYWNNTRCAAFVNDYSLVSYIVTYLGGDLPEECMAVYRDLVAHVEALAQLVDDFTLPGPELGGQAQA
ELNHLMRDPALLPPLVWDCDGLMRHAALDRHRDCRIDAGGHEPVYAAACNVATADFNRNDGRLLHNTQARAADAADDRPH
RPADWTVHHKIYYYVLVPAFSRGRCCTAGVRFDRVYATLQNMVVPEIAPGEECPSDPVTDPAHPLHPANLVANTVKRMFH
NGRVVVDGPAMLTLQVLAHNMAERTTALLCSAAPDAGANTASTANMRIFDGALHAGVLLMAPQHLDHTIQNGEYFYVLPV
HALFAGADHVANAPNFPPALRDLARDVPLVPPALGANYFSSIRQPVVQHARESAAGENALTYALMAGYFKMSPVALYHQL
KTGLHPGFGFTVVRQDRFVTENVLFSERASEAYFLGQLQVARHETGGGVNFTLTQPRGNVDLGVGYTAVAATGTVRNPVT
DMGNLPQNFYLGRGAPPLLDNAAAVYLRNAVVAGNRLGPAQPLPVFGCAQVPRRAGMDHGQDAVCEFIATPVATDINYFR
RPCNPRGRAAGGVYAGDKEGDVIALMYDHGQSDPARPFAATANPWASQRFSYGDLLYNGAYHLNGASPVLSPCFKFFTAA
DITAKHRCLERLIVETGSAVSTATAASDVQFKRPPGCRELVEDPCGLFQEAYPITCASDPALLRSARDGEAHARETHFTQ
YLIYDASPLKGLSL
>Q9QJ26 ~~~MCP~~~Major capsid protein~~~
MENWQATEILPKIEAPLNIFNDIKTYTAEQLFDNLRIYFGDDPSRYNISFEALLGIYCNKIEWINFFTTPIAVAANVIRF
NDVSRMTLGKVLFFIQLPRVATGNDVTAPKETTIMVAKHSEKHPINISFDLSAACLEHLENTFKNTVIDQILNINALHTV
LRSLKNSADSLERGLIHAFMQTLLRKSPPQFIVLTMNENKVHNKQALSRVQRSNMFQSLKNRLLTSLFFLNRNNNSSYIY
RILNDMMESVTESILNDTNNYTSKENIPLDGVLLGPIGSIQKLTNILSQYISTQVVSAPISYGHFIMGKENAVTAIAYRA
IMADFTQFTVNAGTEQQDTNNKSEIFDKSRAYADLKLNTLKLGDKLVAFDHLHKVYKNTDVNDPLEQSLQLTFFFPLGIY
IPTETGFSTMETRVKLNDTMENNLPTSVFFHNKDQVVQRIDFADILPSVCHPIVHDSTIVERLMKNEPLPTGHRFSQLCQ
LKITRENPTRILQTLYNLYESRQEVPKNTNVLKNELNVEDFYKPDNPTLPTERHPFFDLTYIQKNRATEVLCTPRIMIGN
MPLPLAPISFHEARTNQMLEHAKTNSHNYDFTLKIVTESLTSGSYPELAYVIEILVHGNKHAFMILKQVISQCISYWFNM
KHILLFCNSFEMIMLISNHMGDELIPGAAFAHYRNLVSLIRLVKRTISISNINEQLCGEPLVNFANALFDGRLFCPFVHT
MPRNDTNAKITADDTPLTQNTVRVRNYEISDVQRMNLIDSSVVFTDNDRPSNENTILSKIFYFCVLPALSNNKACGAGVN
VKELVLDLFYTEPFICPDDCFQENPISSDVLMSLIREAMGPGYTVANTSSIAKQLFKSLIYINENTKILEVEVSLDPAQR
HGNSVHFQSLQHILYNGLCLISPITTLRRYYQPIPFHRFFSDPGICGTMNADIQVFLNTFPHYQRNDGGFPLPPPLALEF
YNWQRTPFSVYSAFCPNSLLSIMTLAAMHSKLSPVAIAIQSKSKIHPGFAATLVRTDNFDVECLLYSSRAATSIILDDPT
VTAEAKDIVTTYNFTQHLSFVDMGLGFSSTTATANLKRIKSDMGSKIQNLFSAFPIHAFTNTDINTWIRHHVGIEKPNPS
EGEALNIITFGGINKNPPSILLHGQQAICEVILTPVTTNINFFKLPHNPRGRESCMMGTDPHNEEAARKALYDHTQTDSD
TFAATTNPWASLPGSLGDILYNTAHREQLCYNPKTYSPNAQFFTESDILKTNKMMYKVINEYCMKSNSCLNSDSEIQYSC
SEGTDSFVSRPCQFLQNALPLHCSSNQALLESRSKTGNTQISETHYCNYAIGETIPLQLIIESSI
>Q2HRA7 ~~~MCP~~~Major capsid protein~~~
MEATLEQRPFPYLATEANLLTQIKESAADGLFKSFQLLLGKDAREGSVRFEALLGVYTNVVEFVKFLETALAAACVNTEF
KDLRRMIDGKIQFKISMPTIAHGDGRRPNKQRQYIVMKACNKHHIGAEIELAAADIELLFAEKETPLDFTEYAGAIKTIT
SALQFGMDALERGLVDTVLAVKLRHAPPVFILKTLGDPVYSERGLKKAVKSDMVSMFKAHLIEHSFFLDKAELMTRGKQY
VLTMLSDMLAAVCEDTVFKGVSTYTTASGQQVAGVLETTDSVMRRLMNLLGQVESAMSGPAAYASYVVRGANLVTAVSYG
RAMRNFEQFMARIVDHPNALPSVEGDKAALADGHDEIQRTRIAASLVKIGDKFVAIESLQRMYNETQFPCPLNRRIQYTY
FFPVGLHLPVPRYSTSVSVRGVESPAIQSTETWVVNKNNVPLCFGYQNALKSICHPRMHNPTQSAQALNQAFPDPDGGHG
YGLRYEQTPNMNLFRTFHQYYMGKNVAFVPDVAQKALVTTEDLLHPTSHRLLRLEVHPFFDFFVHPCPGARGSYRATHRT
MVGNIPQPLAPREFQESRGAQFDAVTNMTHVIDQLTIDVIQETAFDPAYPLFCYVIEAMIHGQEEKFVMNMPLIALVIQT
YWVNSGKLAFVNSYHMVRFICTHMGNGSIPKEAHGHYRKILGELIALEQALLKLAGHETVGRTPITHLVSALLDPHLLPP
FAYHDVFTDLMQKSSRQPIIKIGDQNYDNPQNRATFINLRGRMEDLVNNLVNIYQTRVNEDHDERHVLDVAPLDENDYNP
VLEKLFYYVLMPVCSNGHMCGMGVDYQNVALTLTYNGPVFADVVNAQDDILLHLENGTLKDILQAGDIRPTVDMIRVLCT
SFLTCPFVTQAARVITKRDPAQSFATHEYGKDVAQTVLVNGFGAFAVADRSREAAETMFYPVPFNKLYADPLVAATLHPL
LANYVTRLPNQRNAVVFNVPSNLMAEYEEWHKSPVAAYAASCQATPGAISAMVSMHQKLSAPSFICQAKHRMHPGFAMTV
VRTDEVLAEHILYCSRASTSMFVGLPSVVRREVRSDAVTFEITHEIASLHTALGYSSVIAPAHVAAITTDMGVHCQDLFM
IFPGDAYQDRQLHDYIKMKAGVQTGSPGNRMDHVGYTAGVPRCENLPGLSHGQLATCEIIPTPVTSDVAYFQTPSNPRGR
AACVVSCDAYSNESAERLLYDHSIPDPAYECRSTNNPWASQRGSLGDVLYNITFRQTALPGMYSPCRQFFHKEDIMRYNR
GLYTLVNEYSARLAGAPATSTTDLQYVVVNGTDVFLDQPCHMLQEAYPTLAASHRVMLDEYMSNKQTHAPVHMGQYLIEE
VAPMKRLLKLGNKVVY
>Q05815 ~~~MCP~~~Major capsid protein~~~
MSISSSNVTSGFIDIATKDEIEKYMYGGKTSTAYFVRETRKATWFTQVPVSLTRANGSANFGSEWSASISRAGDYLLYTW
LRVRIPSVTLLSTNQFGANGRIRWCRNFMHNLIRECSITFNDLVAARFDHYHLDFWAAFTTPASKAVGYDNMIGNVSALI
QPQPVPVAPATVSLPEADLNLPLPFFFSRDSGVALPTAALPYNEMRINFQFHDWQRLLILDNIAAVASQTVVPVVGATSD
IATAPVLHHGTVWGNYAIVSNEERRRMGCSVRDILVEQVQTAPRHVWNPTTNDAPNYDIRFSHAIKALFFAVRNTTFSNQ
PSNYTTASPVITSTTVILEPSTGAFDPIHHTTLIYENTNRLNHMGSDYFSLVNPWYHAPTIPGLTGFHEYSYSLAFNEID
PMGSTNYGKLTNISIVPTASPAAKVGAAGTGPAGSGQNFPQTFEFIVTALNNNIIRISGGALGFPVL
>P18162 ~~~MCP~~~Major capsid protein~~~
MSMSSSNITSGFIDIATFDEIEKYMYGGPTATAYFVREIRKSTWFTQVPVPLSRNTGNAAFGQEWSVSISRAGDYLLQTW
LRVNIPPVTLSGLLGNTYSLRWTKNLMHNLIREATITFNDLVAARFDNYHLDFWSAFTVPASKRNGYDNMIGNVSSLINP
VAPGGTLGSVGGINLNLPLPFFFSRDTGVALPTAALPYNEMQINFNFRDWHELLILTNSALVPPASPYVPIVVGTHISAA
PVLGPVQVWANYAIVSNEERRRMGCAIRDILIEQVQTAPRQNYVPLTNASPTFDIRFSHAIKALFFAVRNKTSAAEWSNY
ATSSPVVTGATVNYEPTGSFDPIANTTLIYENTNRLGAMGSDYFSLINPFYHAPTIPSFIGYHLYSYSLHFYDLDPMGST
NYGKLTNVSVVPQASPAAIAAAGGTGGQAGSDYPQNYEFVILAVNNNIVRISGGETPQNYIAVC
>P22166 ~~~MCP~~~Major capsid protein~~~
MSMSSSNITSGFIDIATFDEIEKYMYGGPTATAYFVREIRKSTWFTQVPVPLSRNTGNAAFGQEWSVSISRAGDYLLQTW
LRVNIPPVTLSGLLGNTYSLRWTKNLMHNLIREATITFNDLVAARFDNYHLDFWSAFTVPASKRNGYDNMIGNVSSLINP
VAPGGTLGSVGGINLNLPLPFFFSRDTGVALPTAALPYNEMQINFNFRDWHELLILTNSALVPPASSYVSIVVGTHISAA
PVLGPVQVWANYAIVSNEERRRMGCAIRDILIEQVQTAPRQNYVPLTNASPTFDIRFSHAIKALFFAVRNKTSAAEWSNY
ATSSPVVTGATVNYEPTGSFDPIANTTLIYENTNRLGAMGSDYFSLINPFYHAPTIPSFIGYHLYSYSLHFYDLDPMGST
NYGKLTNVFVVPAASSAAISAAGGTGGQAGSDYAQSYEFVIVAVNNNIVRIENSLVRNRRRWSREGPMVMVC
>P17499 ~~~~~~Major capsid protein~~~
MALVPVGMAPRQMRVNRCIFASIVSFDACITYKSPCSPDAYHDDGWFICNNHLIKRFKMSKMVLPIFDEDDNQFKMTIAR
HLVGNKERGIKRILIPSATNYQDVFNLNSMMQAEQLIFHLIYNNENAVNTICDNLKYTEGFTSNTQRVIHSVYATTKSIL
DTTNPNTFCSRVSRDELRFFDVTNARALRGGAGDQLFNNYSGFLQNLIRRAVAPEYLQIDTEELRFRNCATCIIDETGLV
ASVPDGPELYNPIRSSDIMRSQPNRLQIRNVLKFEGDTRELDRTLSGYEEYPTYVPLFLGYQIINSENNFLRNDFIPRAN
PNATLGGGAVAGPAPGVAGEAGGGIAV
>P30328 ~~~~~~Major capsid protein~~~
MAGGLSQLVAYGAQDVYLTGNPQITFFKTVYRRYTNFAIESIQQTINGSVGFGNKVSTQISRNGDLITDIVVEFVLTKGG
NGGTTYYPAEELLQDVELEIGGQRIDKHYNDWFRTYDALFRMNDDRYNYRRMTDWVNNELVGAQKRFYVPLIFFFNQTPG
LALPLIALQYHEVKLYFTLASQVQGVNYNGSSAIAGAAQPTMSVWVDYIFLDTQERTRFAQLPHEYLIEQLQFTGSETAT
PSATTQASQNIRLNFNHPTKYLAWNFNNPTNYGQYTALANIPGACSGAGTAAATVTTPDYGNTGTYNEQLAVLDSAKIQL
NGQDRFATRKGSYFNKVQPYQSIGGVTPAGVYLYSFALKPAGRQPSGTCNFSRIDNATLSLTYKTCSIDATSPAAVLGNT
ETVTANTATLLTALNIYAKNYNVLRIMSGMGGLAYAN
>Q9DGW5 ~~~~~~Oncoprotein MEQ~~~
MSQEPEPGAMPYSPADDPSPLDLSLGSTSRRKKRKSHDIPNSPSKHPFPDGLSEEEKQKLERRRKRNRDAARRRRRKQTD
YVDKLHEACEELQRANEHLRKEIRDLRTECTSLRVQLACHEPVCPMAVPLTVTLGLLTTPHDPVPEPPICTPPPPSPDEP
NAPHCSGSQPPICTPPPPDTEELCAQLCSTPPPPISTPHIIYAPGPSPLQPPICTPAPPDAEELCAQLCSTPPPPICTPH
SLFCPPQPPSPEGIFPALCPVTEPCTPPSPGTVYAQLCPVGQVPLFTPSPPHPAPEPERLYARLTEDPEQDSLYSGQIYT
QFPSDTQSTVWWFPGDGRP
>P90495 2.3.2.36~~~K3~~~E3 ubiquitin-protein ligase MIR1~~~
MEDEDVPVCWICNEELGNERFRACGCTGELENVHRSCLSTWLTISRNTACQICGVVYNTRVVWRPLREMTLLPRLTYQEG
LELIVFIFIMTLGAAGLAAATWVWLYIVGGHDPEIDHVAAAAYYVFFVFYQLFVVFGLGAFFHMMRHVGRAYAAVNTRVE
VFPYRPRPTSPECAVEEIELQEILPRGDNQDEEGPAGAAPGDQNGPAGAAPGDQDGPADGAPVHRDSEESVDEAAGYKEA
GEPTHNDGRDDNVEPTAVGCDCNNLGAERYRATYCGGYVGAQSGDGAYSVSCHNKAGPSSLVDILPQGLPGGGYGSMGVI
RKRSAVSSALMFH
>O41933 2.3.2.36~~~K3~~~E3 ubiquitin-protein ligase MIR1~~~
MDSTGEFCWICHQPEGPLKRFCGCKGSCAVSHQDCLRGWLETSRRQTCALCGTPYSMKWKTKPLREWTWGEEEVLAAMEA
CLPLVLIPLAVLMIVMGTWLLVNHNGFLSPRMQVVLVVIVLLAMIVFSASASYVMVEGPGCLDTCTAKNSTVTVNSIDEA
IATQQPTKTDLGLARETLSTRFRRGKCRSCCRLGCVRLCCV
>P90489 2.3.2.27~~~K5~~~E3 ubiquitin-protein ligase MIR2~~~
MASKDVEEGVEGPICWICREEVGNEGIHPCACTGELDVVHPQCLSTWLTVSRNTACQMCRVIYRTRTQWRSRLNLWPEME
RQEIFELFLLMSVVVAGLVGVALCTWTLLVILTAPAGTFSPGAVLGFLCFFGFYQIFIVFAFGGICRVSGTVRALYAANN
TRVTVLPYRRPRRPTANEDNIELTVLVGPAGGTDEEPTDESSEGDVASGDKERDGSSGDEPDGGPNDRAGLRGTARTDLC
APTKKPVRKNHPKNNG
>Q80A33 ~~~Segment 6~~~Protein ML~~~
MASNLPVRSFSEVCCAEARAAIIQMENNPDETVCNRIWKIHRDLQSSDLTTTVQVMMVYRFISKRVPEGCFAILSGVNTG
MYNPRELKRSYVQSLSSGTSCEFLRSLDKLAKNLLAVHVCSDVKMSLNKRQVIDFISGEEDPTLHTAEHLTSLALDDSPS
AVVYSGWQQEAIKLHNTIRKIATMRPADCKAGKFYSDILSACDQTKELLDAFDQGKLAYDRDVVLIGWMDEIIKIFSKPD
YLEAKGVSYQVLKNVSNKVALLRESIWWVTELDGREYLFFDESWYLHGMSAFSDGVPGYEDFIY
>P39421 2.4.2.31~~~modA~~~NAD--protein ADP-ribosyltransferase modA~~~
MKYSVMQLKDFKIKSMDASVRASIREELLSEGFNLSEIELLIHCITNKPDDHSWLNEIIKSRLVPNDKPLWRGVPAETKQ
VLNQGIDIITFDKVVSASYDKNIALHFASGLEYNTQVIFEFKAPMVFNFQEYAIKALRCKEYNPNFKFPDSHRYRNMELV
SDEQEVMIPAGSVFRIADRYEYKKCSTYTIYTLDFEGFNL
>P39423 2.4.2.31~~~modB~~~NAD--protein ADP-ribosyltransferase modB~~~
MIINLADVEQLSIKAESVDFQYDMYKKVCEKFTDFEQSVLWQCMEAKKNEALHKHLNEIIKKHLTKSPYQLYRGISKSTK
ELIKDLQVGEVFSTNRVDSFTTSLHTACSFSYAEYFTETILRLKTDKAFNYSDHISDIILSSPNTEFKYTYEDTDGLDSE
RTDNLMMIVREQEWMIPIGKYKITSISKEKLHDSFGTFKVYDIEVVE
>P08794 ~~~mom~~~Methylcarbamoylase mom~~~
MPASIPRRNIVGKEKKSRILTKPCVIEYEGQIVGYGSKELRVETISCWLARTIIQTKHYSRRFVNNSYLHLGVFSGRDLV
GVLQWGYALNPNSGRRVVLETDNRGYMELNRMWLHDDMPRNSESRAISYALKTIRLLYPSVEWVQSFADERCGRAGVVYQ
ASNFDFIGSHESTFYELDGEWYHEITMNAIKRGGQRGMYLRANKERAVVHKFNQYRYIRFLNKRARKRLNTKLFRIQPYP
KSSSD
>P06018 ~~~mom~~~Methylcarbamoylase mom~~~
MPASIPRRNIVGKEKKSRILTKPCVIEYEGQIVGYGSKELRVETISCWLARTIIQTKHYSRRFVNNSYLHLGVFSGRDLV
GVLQWGYALNPNSGRRVVLETDNRGYMELNRMWLHDDMPRNSESRAISYALKVIRLLYPSVEWVQSFADERCGRAGVVYQ
ASNFDFIGSHESTFYELDGEWYHEITMNAIKRGGQRGVYLRANKERAVVHKFNQYRYIRFLNKRARKRLNTKLFKVQPYP
K
>P23848 ~~~mor~~~Middle operon regulator~~~
MTEDLFGDLQDDTILAHLDNPAEDTSRFPALLAELNDLLRGELSRLGVDPAHSLEIVVAICKHLGGGQVYIPRGQALDSL
IRDLRIWNDFNGRNVSELTTRYGVTFNTVYKAIRRMRRLKYRQYQPSLL
>P07331 2.7.11.1~~~V-MOS~~~Serine/threonine-protein kinase-transforming protein mos~~~
MARSTPCSQTSLAVPTHFSLVSHVTVPSEGVMPSPLSLCRYLPRELSPSVDSRSCSIPLVAPRKAGKLFLGTTPPRAPGL
PRRLAWFSIDWEQVCLMHRLGSGGFGSVYKATYHGVPVAIKQVNKCTKDLRASQRSFWAELNIARLRHDNIVRVVAASTR
TPEDSNSLGTIIMEFGGNVTLHQVIYGATRSPEPLSCREQLSLGKCLKYSLDVVNGLLFLHSQSILHLDLKPANILISEQ
DVCKISDFGCSQKLQDLRCRQASPHHIGGTYTHQAPEILKGEIATPKADIYSFGITLWQMTTREVPYSGEPQYVQYAVVA
YNLRPSLAGAVFTASLTGKTLQNIIQSCWEARALQRPGAELLQRDLKAFRGALG
>P22915 ~~~motA~~~Middle transcription regulatory protein motA~~~
MSKVTYIIKASNDVLNEKTATILITIAKKDFITAAEVREVHPDLGNAVVNSNIGVLIKKGLVEKSGDGLIITGEAQDIIS
NAATLYAQENAPELLKKRATRKAREITSDMEEDKDLMLKLLDKNGFVLKKVEIYRSNYLAILEKRTNGIRNFEINNNGNM
RIFGYKMMEHHIQKFTDIGMSCKIAKNGNVYLDIKRSAENIEAVITVASEL
>Q01437 ~~~motB~~~Transcription regulatory protein motB~~~
MIINIGELARVSDKSRSKAAGKLVEVVSIQLKHGVKDEDSEVKVRIIPKDGKSKPQFGYVRAKFLESAFLKAVPAKGIET
IDTSHVGVDFKWKLGQAIKFIAPCEFNFIKDDGRVVYTRAMCGYITDQWVEDGVKLYNVVFLGTYKVIPESWIKHYSNAL
YA
>P0C777 ~~~ORF2~~~Double gene block protein 1~~~
MDIEPEVPVVEKQMLAGNRGKQKTRRSVAKDAIRKPASDSTNGGNWVNVADKIEVHIHFNF
>Q89682 ~~~ORF2~~~Double gene block protein 1~~~
MDSQRTVELTNPRGRSKERGDSGGKQKNSMGRKIANDAISESKQGVMGASTYIADKIKVTINFNF
>P17461 ~~~ORF2~~~Double gene block protein 1~~~
MDPERIPYNSLSDSDATGKRKKGGEKSAKKRLVASHAASSVLNKKRNEGSASHGGTWVIVADKVEVSINFNF
>Q89846 ~~~ORF3~~~Double gene block protein 2~~~
MACCRCDSSPGDYSGALLILFISFVFFYITSLSPQGNTYVHHFDSSSVKTQYVGISTNGDG
>Q7TD19 ~~~ORF3~~~Double gene block protein 2~~~
MKVLLVTGVLGLLLLIKWKSQSTSTSNQTCQCPTSPWVIYAFYNSLSLVLLLCHLIPEIKPIHTSYNTHDSSKQQHISIN
TGNGK
>P89035 ~~~ORF2bis~~~Putative movement protein p6.6~~~
MATGKCYCPEDPRVGPLLVLCLLLLLILFSRSWNVAPVVVPSYHTVYHHEKYQNIEIQK
>P0C649 ~~~V2~~~Movement protein~~~
MDPQNALYYQPRVPTAAPTSGGVPWSRVGEVAILSFVALICFYLLYLWVLRDLILVLKARQGRSTEELIFGGQAVDRSNP
IPNIPAPPSQGNPGPFVPGTG
>P10838 ~~~ORF3~~~Movement protein~~~
MAVHVENLSDLAKTNDGVAVSLNRYTDWKCRSGVSEAPLIPASMMSKITDYAKTTAKGNSVALNYTHVVLSLAPTIGVAI
PGHVTVELINPNVEGPFQVMSGQTLSWSPGAGKPCLMIFSVHHQLNSDHEPFRVRITNTGIPTKKSYARCHAYWGFDVGT
RHRYYKSEPARLIELEVGYQRTLLSSIKAVEAYVQFTFDTSRMEKNPQLCTKSNVNIIPPKAETGSIRGIAPPLSVVPNQ
GRESKVLKQKGGTGSKTTKLPSLEPSSGSSSGLSMSRRSHRNVLNSSIPIKRNQDGNWLGDHLSDKGRVTDPNPERL
>Q9WIJ5 2.7.7.-~~~DNA-R~~~Master replication protein~~~
MARQVICWCFTLNNPLSPLSLHDSMKYLVYQTEQGEAGNIHFQGYIEMKKRTSLAGMKKLIPGAHFEKRRGTQGEARAYS
MKEDPRLEGPWEYEEFVPTIEDKLREVMNDMKITGKRPIEYIEECCNTYDKSASTLREFRGELKKKKAISSWELQRKPWM
DEVDALLQERDGRRIIWVYGPQGGEGKTSYAKHLVKTRDAFYSTGGKTADIAFAWDHQELVLFDFPRSFEEYVNYGVIEQ
LKNGIIQSGKYQSVIKYSDYVEVIVFANFTPRSGMFSEDRIVYVYA
>O39828 2.7.7.-~~~DNA-R~~~Master replication protein~~~
MARQVICWCFTLNNPLSPLSLHDSMKYLVYQTEQGEAGNIHFQGYIEMKKRTSLAGMKKLIPGAHFEKRRGTQGEARAYS
MKEDTRLEGPWEYGEFVPTIEDKLREVMNDMKITGKRPIEYIEECCNTYDKSASTLREFRGELKKKKAISSWELQRKPWM
GEVDALLQERDGRRIIWVYGPQGGEGKTSYAKHLVKTRDAFYSTGGKTADIAFAWDHQELVLFDFPRSFEEYVNYGVIEQ
LKNGIIQSGKYQSVIKYSDYVEVIVFANFTPRSGMFSEDRIVYVYA
>Q83730 ~~~~~~Ankyrin repeat domain-containing protein M-T5~~~
MDLYGYVSCTPRIRHDVLDGLLNVYDPDELCSRDTPFRLYLTRYDCTPEGLRLFLTRGADVNGVRGSRTSPLCTVLSNKD
LGNEAEALAKQLIDAGADVNAMAPDGRYPLLCLLENDRINTARFVRYMIDRGTSVYVRGTDGYGPVQTYIHSKNVVLDTL
RELVRAGATVHDPDKTYGFNVLQCYMIAHVRSSNVQILRFLLRHGVDSSRGLHATVMFNTLERKISHGVFNRKVLDFIFT
QISINEQNSLDFTPINYCVIHNDRRTFDYLLERGADPNVVNFLGNSCLDLAVLNGNKYMVHRLLRKTITPDAYTRALNVV
NSNIYSIKSYGMSEFVKRHGTLYKALIRSFVKDSDREIFTYVHIYDYFREFVDECIRERDAMKADVLDAVSVFDTAFGLV
ARPRWKHVRILSKYVRGVYGDRVKKILRSLHKRRFKTDRLVRRIADLCGPDGLWTRLPVEVRYSVVDYLTDDEIHDLFVK
IHA
>P31118 2.1.1.72~~~CVIAIIM~~~Type II methyltransferase M.CviAII~~~
MNRIGYIGSKLKLKDWIFEEISKRTDDTYTKFADLFAGSCIMTHEALEKKYECISNDLETYSYVIMNGLKCPFSDKLQNI
IETLDDLDTKDMVIPGFVTLTYSPRGNRMYFTEDIAMRIDIIRENIERMKERVSTDEYNFLLASLLTSADSVKNTSVVYG
AYLKKFKKTALKRMVFAPLHTRSTTVTLETFNEDATELEIKTDIAYVDPPYNSRQYGANYFVLNQILTPKEIGNGVTGLP
EYKKSSFCRKQEVAMSFHKMLKNVSARLFVISYSSESLLSKGDMVALLSQYGKCEVVVRNHKRFKAQISAVGNDVEEYLF
FVYIEQ
>Q1HVJ0 ~~~BNRF1~~~Major tegument protein~~~
MEDRGRETQMPVARYGGPFIMVRLFGQDGEANIQEQRLYELLSDPRSALGLDPGPLIAENLLLVALRGTNNDPRPQRQER
ARELALVGILLGNGEQGEHLGTESALEASGNNYVYAYGPDWMARPSTWSAEIQQFLRLLGATYVLRVEMGRQFGFEVHRS
RPSFRQFQAINHLVLFDNALRKYDSGQVAAGFQRALLVAGPETADTRPDLRKLNEWVFGGRAAGGRQLADELKIVSALRD
TYSGHLVLQPTETLDTWKVLSRDTRTAHSLEHGFIHAAGTIQANCPQLFMRRQHPGLFPFVSAIASSLGWYYQTATGPGA
DARAAARRQQAFQTRAAAECHAKSGVPVVAGFYRTINATLKGGEGLQPTMFNGELGAIKHQALDTVRYDYGHYLIMLGPF
QPWSGLTAPPCPYAESSWAQAAVQTALELFSALYPAPCISGYARPPGPSAVIEHLGSLVPKGGLLLFLSHLPDDVKDGLG
EMGPARATGPGMQQFVSSYFLNPACSNVFITVRQRGEKINGRTVLQALGRACDMAGCQHYVLGSTVPLGGLNFVNDLASP
VSTAEMMDDFSPFFTVEFPPIQEEGARSPVPLDVDESMDISPSYELPWLSLESCLTSILSHPTVGSKEHLVRHTDRVSGG
RVAQQPGVGPLDLPLADYAFVAHSQVWTRPGGAPPLPYRTWDRMTEKLLVSAKPGGENVKVSGTVITLGEQGYKVSLDLR
EGTRLAMAEALLNAAFAPILDPEDVLLTLHLHLDPRRADNSVVMEAMTAASDYARGLGVKLTFGSASCPETGSSASSFMT
VVASVSAPGEFSGPLITPVLQKTGSLLIAVRCGDGKIQGGSLFEQLFSDVATTPRAPEALSLKNLFRAVQQLVKSGIVLS
GHDISDGGLVTCLVEMALAGQRGVTITMPVASDYLPEMFAEHPGLVFEVEERSVGEVLQTLRSMNMYPAVLGRVGEQGPD
QMFEVQHGPETVLRQSLRLLLGTWSSFASEQYECLRPDRINRSMHVSDYGYNEALAVSPLTGKNLSPRRLVTEPDPRCQV
AVLCAPGTRGHESLLAAFTNAGCLCRRVFFREVRDNTFLDKYVGLAIGGVHGARDSALAGRATVALINRSPALRDAILKF
LNRPDTFSVALGELGVQVLAGLGAVGSTDNPPAPGVEVNVQRSPLILAPNASGMFESRWLNISIPATTSSVMLRGLRGCV
LPCWVQGSCLGLQFTNLGMPYVLQNAHQIACHFHSNGTDAWRFAMNYPRNPTEQGNIAGLCSRDGRHLALLCDPSLCTDF
WQWEHIPPAFGHPTGCSPWTLMFQAAHLWSLRHGRPSE
>P03179 ~~~BNRF1~~~Major tegument protein~~~
MEERGRETQMPVARYGGPFIMVRLFGQDGEANIQEERLYELLSDPRSALGLDPGPLIAENLLLVALRGTNNDPRPQRQER
ARELALVGILLGNGEQGEHLGTESALEASGNNYVYAYGPDWMARPSTWSAEIQQFLRLLGATYVLRVEMGRQFGFEVHRS
RPSFRQFQAINHLVLFDNALRKYDSGQVAAGFQRALLVAGPETADTRPDLRKLNEWVFGGRAAGGRQLADELKIVSALRD
TYSGHLVLQPTETLDTWKVLSRDTRTAHSLEHGFIHAAGTIQANCPQLFMRRQHPGLFPFVNAIASSLGWYYQTATGPGA
DARAAARRQQAFQTRAAAECHAKSGVPVVAGFYRTINATLKGGEGLQPTMFNGELGAIKHQALDTVRYDYGHYLIMLGPF
QPWSGLTAPPCPYAESSWAQAAVQTALELFSALYPAPCISGYARPPGPSAVIEHLGSLVPKGGLLLFLSHLPDDVKDGLG
EMGPARATGPGMQQFVSSYFLNPACSNVFITVRQRGEKINGRTVLQALGRACDMAGCQHYVLGSTVPLGGLNFVNDLASP
VSTAEMMDDFSPFFTVEFPPIQEEGASSPVPLDVDESMDISPSYELPWLSLESCLTSILSHPTVGSKEHLVRHTDRVSGG
RVAQQPGVGPLDLPLADYAFVAHSQVWTRPGGAPPLPYRTWDRMTEKLLVSAKPGGENVKVSGTVITLGEQGYKVSLDLR
EGTRLAMAEALLNAACAPILDPEDVLLTLHLHLDPRRADNSAVMEAMTAASDYARGLGVKLTFGSASCPETGSSASNFMT
VVASVSAPGEFSGPLITPVLQKTGSLLIAVRCGDGKIQGGSLFEQLFSDVATTPRAPEALSLKNLFRAVQQLVKSGIVLS
GHDISDGGLVTCLVEMALAGQRGVTITMPVASDYLPEMFAEHPGLVFEVEERSVGEVLQTLRSMNMYPAVLGRVGEQGPD
QMFEVQHGPETVLRQSLRLLLGTWSSFASEQYECLRPDRINRSMHVSDYGYNEALAVSPLTGKNLSPRRLVTEPDPRCQV
AVLCAPGTRGHESLLAAFTNAGCLCRRVFFREVRDNTFLDKYVGLAIGGVHGARDSALAGRATVALINRFPALRDAILKF
LNRPDTFSVALGELGVQVLAGLGAVGSTDNPPAPGVEVNVQRSPLILAPNASGMFESRWLNISIPATTSSVMLRGLRGCV
LPCWVQGSCLGLQFTNLGMPYVLQNAHQIACHFHSNGTDAWRFAMNYPRNPTEQGNIAGLCSRDGRHLALLCDPSLCTDF
WQWEHIPPAFGHPTGCSPWTLMFQAAHLWSLRHGRPSE
>P0DOJ7 ~~~~~~Middle T antigen~~~
MDRVLSRADKERLLELLKLPRQLWGDFGRMQQAYKQQSLLLHPDKGGSHALMQELNSLWGTFKTEVYNLRMNLGGTGFQV
RRLHADGWNLSTKDTFGDRYYQRFCRMPLTCLVNVKYSSCSCILCLLRKQHRELKDKCDARCLVLGECFCLECYMQWFGT
PTRDVLNLYADFIASMPIDWLDLDVHSVYNPKRRSEELRRAATVHYTMTTGHSAMEASTSQGNGMISSESGTPATSRRLR
LPSLLSNPTYSVMRSHSYPPTRVLQQIHPHILLEEDEILVLLSPMTAYPRTPPELLYPESDQDQLEPLEEEEEEYMPMED
LYLDILPEEQVPQLIPPPIIPRAGLSPWEGLILRDLQRAHFDPILDASQRMRATHRAALRAHSMQRHLRRLGRTLLLVTF
LAALLGICLMLFILIKRSRHF
>P03079 ~~~~~~Middle T antigen~~~
MDRILTKEEKQALISLLDLEPQYWGDYGRMQKCYKKKCLQLHPDKGGNEELMQQLNTLWTKLKDGLYRVRLLLGPSQVRR
LGKDQWNLSLQQTFSGTYFRRLCRLPITCLRNKGISTCNCILCLLRKQHFLLKKSWRVPCLVLGECYCIDCFALWFGLPV
TNMLVPLYAQFLAPIPVDWLDLNVHEVYNPASGMTLMLPPPPADPESSTILTQEDTGPTLMGQQDTLTSRRNTGKSFSLS
GMLMRTSPAKKSYHHQKMNSPPGIPIPPPPLFLFPVTAPVPPVTRNTQETQAERENEYMPMAPQIHLYSQIREPTHQEEE
EPQYEEIPIYLELLPENPNQHLALTSTARRSLRRKYHKHNSHIITQRQRNRLRRLVLMIFLLSLGGFFLTLFFLIKRKMH
L
>P0DOJ9 ~~~~~~Middle T antigen~~~
MDRVLSRADKERLLELLKLPRQLWGDFGRMQQAYKQQSLLLHPDKGGSHALMQELNSLWGTFKTEVYNLRMNLGGTGFQV
RRLHADGWNLSTKDTFGDRYYQRFCRMPLTCLVNVKYSSCSCILCLLRKQHRELKDKCDARCLVLGECFCLECYMQWFGT
PTRDVLNLYADFIASMPIDWLDLDVHSVYNPKRRSEELRRAATVHYTMTTGHSAMEASTSQGNGMISSESGTPATSRRLR
LPSLLSNPTYSVMRSHSYPPTRVLQQIHPHILLEEDEILVLLSPMTAYPRTPPELLYPESDQDQLEPLEEEEEEYMPMED
LYLDILPEEQVPQLIPPPIIPRAGLSPWEGLILRDLQRAHFDPILDASQRMRATHRAALRAHSMQRHLRRLGRTLLLVTF
LAALLGICLMLFILIKRSRHF
>P03077 ~~~~~~Middle T antigen~~~
MDRVLSRADKERLLELLKLPRQLWGDFGRMQQAYKQQSLLLHPDKGGSHALMQELNSLWGTFKTEVYNLRMNLGGTGFQV
RRLHADGWNLSTKDTFGDRYYQRFCRMPLTCLVNVKYSSCSCILCLLRKQHRELKDKCDARCLVLGECFCLECYMQWFGT
PTRDVLNLYADFIASMPIDWLDLDVHSVYNPKRRSEELRRAATVHYTMTTGHSAMEASTSQGNGMISSESGTPATSRRLR
LPSLLSNPTYSVMRSHSYPPTRVLQQIHPHILLEEDEILVLLSPMTAYPRTPPELLYPESDQDQLEPLEEEEEEYMPMED
LYLDILPGEQVPQLIPPPIIPRAGLSPWEGLILRDLQRAHFDPILDASQRMRATHRAALRAHSMQRHLRRLGRTLLLVTF
LAALLGICLMLFILIKRSRHF
>P11078 ~~~M2~~~Outer capsid protein mu-1~~~
MGNASSIVQTINVTGDGNVFKPSAETSSTAVPSLSLSPGMLNPGGVPWIAVGDETSVTSPGALRRMTSKDIPETAIINTD
NSSGAVPSESALVPYIDEPLVVVTEHAITNFTKAEMALEFNREFLDKMRVLSVSPKYSDLLTYVDCYVGVSARQALNNFQ
KQVPVITPTRQTMYVDSIQAALKALEKWEIDLRVAQTLLPTNVPIGEVSCPMQSVVKLLDDQLPDDSLIRRYPKEAAVAL
AKRNGGIQWMDVSEGTVMNEAVNAVAASALAPSASAPPLEEKSKLTEQAMDLVTAAEPEIIASLAPVPAPVFAIPPKPAD
YNVRTLRIDEATWLRMIPKSMNTPFQIQVTDNTGTNWHLNLRGGTRVVNLDQIAPMRFVLDLGGKSYKETSWDPNGKKVG
FIVFQSKIPFELWTAASQIGQATVVNYVQLYAEDSSFTAQSIIATTSLAYNYEPEQLNKTDPEMNYYLLATFIDSAAITP
TNMTQPDVWDALLTMSPLSAGEVTVKGAVVSEVVPADLIGSYTPESLNASLPNDAARCMIDRASKIAEAIKIDDDAGPDE
YSPNSVPIQGQLAISQLETGYGVRIFNPKGILSKIASRAMQAFIGDPSTIITQAAPVLSDKNNWIALAQGVKTSLRTKSL
SAGVKTAVSKLSSSESIQNWTQGFLDKVSAHFPAPKPDCPTSGDSGESSNRRVKRDSYAGVVKRGYTR
>P12397 ~~~M2~~~Outer capsid protein mu-1~~~
MGNASSIVQTINVTGDGNVFKPSAETSSTAVPSLSLSPGMLNPGGVPWIAIGDETSVTSPGALRRMTSKDIPETAIINTD
NSSGAVPSESALVPYNDEPLVVVTEHAIANFTKAEMALEFNREFLDKLRVLSVSPKYSDLLTYVDCYVGVSARQALNNFQ
KQVPVITPTRQTMYVDSIQAALKALEKWEIDLRVAQTLLPTNVPIGEVSCPMQSVVKLLDDQLPDDSLIRRYPKEAAVAL
AKRNGGIQWMDVSEGTVMNEAVNAVAASALAPSASAPPLEEKSKLTEQAMDLVTAAEPEIIASLVPVPAPVFAIPPKPAD
YNVRTLKIDEATWLRMIPKTMNTPFQIQVTDNTGTSWHMNLRGGTRVVNLDQIAPMRFVLDLGGKSYKETSWDPNGKKVG
FIVFQSKIPFELWTAASQIGQATVVNYVQLYAEDSSFTAQSIIATTSLAYNYEPEQLNKTDPEMNYYLLAAFIDSAAIST
SNMTQPDVWDALLTMSPLSAGEVTVKGAVVSEVIPADLVGSYTPESLNASLPNDAARCMIDRASKIAEAIKIDDDAGPDE
YSPNSVPIQGQLAISQLETGYGVRIFNPKGILSKIASRAMQAFIGDPSTIITQAAPVLSDKNNWIALAQGVKTSLRTKSL
SAGVKTAVSKLSSSESIQSWTQGFLDKVSTHFPAPKPDCPQSGDSGDGSARRLKRDSYAGVVKRGYTR
>P11077 ~~~M2~~~Outer capsid protein mu-1~~~
MGNASSIVQTINVTGDGNVFKPSAETSSTAVPSLSLSPGMLNPGGVPWIAIGDETSVTSPGALRRMTSKDIPETAIINTD
NSSGAVPSESALVPYNDEPLVVVTEHAIANFTKAEMALEFNREFLDKLRVLSVSPKYSDLLTYVDCYVGVSARQALNNFQ
KQVPVITPTRQTMYVDSIQAALKALEKWEIDLRVAQTLLPTNVPIGEVSCPMQSVVKLLDDQLPDDSLIRRYPKEAAVAL
AKRNGGIQWMDVSEGTVMNEAVNAVAASALAPSASAPPLEEKSKLTEQAMDLVTAAEPEIIASLVPVPAPVFAIPPKPAD
YNVRTLKIDEATWLRMIPKTMGTPFQIQVTDNTGTNWHLNLRGGTRVVNLDQIAPMRFVLDLGGKSYKETSWDPNGKKVG
FIVFQSKIPFELWTAASQIGQATVVNYVQLYAEDSSFTAQSIIATTSLAYNYEPEQLNKTDPEMNYYLLATFIDSAAITP
TNMTQPDVWDALLTMSPLSAGEVTVKGAVVSEVVPAELIGSYTPESLNASLPNDAARCMIDRASKIAEAIKIDDDAGPDE
YSPNSVPIQGQLAISQLETGYGVRIFNPKGILSKIASRAMQAFIGDPSTIITQAAPVLSDKNNWIALAQGVKTSLRTKSL
SAGVKTAVSKLSSSESIQNWTQGFLDKVSTHFPAPKPDCPTNGDGSEPSARRVKRDSYAGVVKRGYTR
>P12418 ~~~M1~~~Microtubule-associated protein mu-2~~~
MAYIAVPAVVDSRSSEAIGLLESFGVDAGADANDVSYQDHDYVLDQLQYMLDGYEAGDVIDALVHKNWLHHSVYCLLPPK
SQLLEYWKSNPSAIPDNVDRRLRKRLMLKKDLRKDDEYNQLARAFKISDVYAPLISSTTSPMTMIQNLNRGEIVYTTTDR
VIGARILLYAPRKYYASTLSFTMTKCIIPFGKEVGRVPHSRFNVGTFPSIATPKCFVMSGVDIESIPNEFIKLFYQRVKS
VHANILNDISPQIVSDMINRKRLRVHTPSDRRAAQLMHLPYHVKRGASHVDVYKVDVVDMLFEVVDVADGLRNVSRKLTM
HTVPVCILEMLGIEIADYCIRQEDGMLTDWFLLLTMLSDGLTDRRTHCQYLMNPSSVPPDVILNISITGFINRHTIDVMP
DIYDFVKPIGAVLPKGSFKSTIMRVLDSISILGIQIMPRAHVVDSDEVGEQMEPTFEQAVMEIYKGIAGVDSLDDLIKWV
LNSDLIPHDDRLGQLFQAFLPLAKDLLAPMARKFYDNSMSEGRLLTFAHADSELLNANYFGHLLRLKIPYITEVNLMIRK
NREGGELFQLVLSYLYKMYATSAQPKWFGSLLRLLICPWLHMEKLIGEADPASTSAEIGWHIPREQLMQDGWCGCEDGFI
PYVSIRAPRLVIEELMEKNWGQYHAQVIVTDQLVVGEPRRVSAKAVIKGNHLPVKLVSRFACFTLTAKYEMRLSCGHSTG
RGAAYSARLAFRSDLA
>Q00335 ~~~M1~~~Microtubule-associated protein mu-2~~~
MAYIAVPAVVDSRSSEAIGLLESFGVDAGADANDVSYQDHDYVLDQLQYMLDGYEAGDVIDALVHKNWLHHSVYCLLPPK
SQLLEYWKSNPSVIPDNVDRRLRKRLMLKKDLRKDDEYNQLARAFKISDVYAPLISSTTSPMTMIQNLNQGEIVYTTTDR
VIGARILLYAPRKYYASTLSFTMTKCIIPFGKEVGRVPHSRFNVGTFPSIATPKCFVMSGVDIESIPNEFIKLFYQRVKS
VHANILNDISPQIVSDMINRKRLRVHTPSDRRAAQLMHLPYHVKRGASHVDVYKVDVVDVLLEVVDVADGLRNVSRKLTM
HTVPVCILEMLGIEIADYCIRQEDGMFTDWFLLLTMLSDGLTDRRTHCQYLINPSSVPPDVILNISITGFINRHTIDVMP
DIYDFVKPIGAVLPKGSFKSTIMRVLDSISILGVQIMPRAHVVDSDEVGEQMEPTFEHAVMEIYKGIAGVDSLDDLIKWV
LNSDLIPHDDRLGQLFQAFLPLAKDLLAPMARKFYDNSMSEGRLLTFAHADSELLNANYFGHLLRLKIPYITEVNLMIRK
NREGGELFQLVLSYLYKMYATSAQPKWFGSLLRLLICPWLHMEKLIGEADPASTSAEIGWHIPREQLMQDGWCGCEDGFI
PYVSIRAPRLVMEELMEKNWGQYHAQVIVTDQLVVGEPRRVSAKAVIKGNHLPVKLVSRFACFTLTAKYEMRLSCGHSTG
RGAAYNARLAFRSDLA
>Q9PY82 ~~~M3~~~Protein mu-NS~~~
MASFKGFSANTVPVSKTRKDTSSLTATPGLRAPSMSSPVDMAQSREFLTKAIEHGSMSIPYQHVNVPKVDRKVVSLVVRP
FSAGAFSISGVISPAHAYLLECLPQLEQAMAFVASPEAFQASDVAKRFTIKPGMSLQDAITAFINFVSAMLKMTVTRQNF
DVIIAEIERLASSGVVNRTEEAKVADEELMLFGLDHRAPQQIDVSEPVGISRAVEIQTTNNVHLAPGLGNIDPEIYNEGR
FMFMQHKPLAADQSYFTTETADYFKIYPTYDEHDGRMVDQKQSGLILCTKDEVLAEQTIFKLDVPDDKTVHLLDRDDDHV
VARFTRVFIEDVAPSHHAAQRSNQRSLLDDLYANTQVVSVTPSALRWVIKHGVSDGIVNRKNVKICVGFDPLYTLATSNG
LSLCSILMDEKLSVLNSACKMTLRSLLKTHRDLDLHRAFQRVISQSYASLMCYYHPSRKLAYGELLFMSSQSDTVDGIKL
QLDASRQCHECPLLQQKIVELEKHLIVQKSASSDPTPVALQPLLSQLRELSSEVTRLQMDLSRTQAINTRLEADVKSAQS
CSLDMYLKHHTCINSHVKEDELMDAVRIAPDVRQELMLKRKATRQEWWERIARETSTTFQSKIDELTLMNGKQAHEISEL
RDSVTNYEKQVAELVSTITQNQTTYQQELQALVAKNIELDALNQRQAKSVRITSSLLSATPIDAVDGASDLIDFSVPADE
L
>Q9PY83 ~~~M3~~~Protein mu-NS~~~
MASFKGFSVNTVPVSKAKRDISSLAATPGIRSQPFTPSVDMSQSREFLTKAIEQGSMSIPYQHVNVPKVDRKVVSLVVRP
FSSGAFSISGVISPAHAYLLDCLPQLEQAMAFVASPESFQASDVAKRFAIKPGMSLQDAITAFINFVSAMLKMTVTRQNF
DVIVAEIERLASTSVSVRTEEAKVADEELMLFGLDHRGPQQLDISNAKGIMKAADIQATHDVHLAPGVGNIDPEIYNEGR
FMFMQHKPLAADQSYFTLETADYFKIYPTYDEHDSRMADQKQSGLILCTKDEVLAEQTIFKLDAPDDKTVHLLDRDDDHV
VARFTKVFIEDVAPGHHAAQRSGQRSVLDDLYANTQVVSITSAALKWVVKHGVSDGIVNRKNVKVCVGFDPLYTLSTHNG
VSLCALLMDEKLSVLNSACRMTLRSLMKTGRDADAHRAFQRVLSQGYASLMCYYHPSRKLAYGEVLFLERSSDMVDGIKL
QLDASRQCHECPVLQQKVVELEKQIIMQKSIQSDPTPMALQPLLSQLRELSSEVTRLQMELSRTQSLNAQLEADAKSAQA
CSLDMYLRHHTCINGHTKEDELLDAVRVAPDVRKEIMEKRGEVRRGWCERISKEAAAKCQTVIDDLTQMNGKQAREITEL
RESAENYEKQIAELVGTITQNQMTYQQELQALVAKNVELDTMNQRQAKSLRITPSLLSATPIDSVDGAADLIDFSVPTDE
L
>A0A385DVD6 ~~~~~~Muzzle protein~~~
MALKKEQHFFKGMQRDLSVSKFNPEYAFDAQNIRITAREHDTLLSVSNEKGNKEIPLQSPSGDPVVIDGVLLGQNVLNNY
VTLFTKGTNDNIYRLENKGTYFETLILFSGNLNFSTDYPIESISVYENNNIQKVYWVDGLNQARVINITKDDYNNADDFD
FVGTIHTSSKIEVSKVNGSGAFGQGVIQYAFTYYNKYGKETNIFRTSPLLYIAYSDRGASPEETVSCSFQINFTELDSSY
DFIRVYSIHRTSIDATPTVRKVADLATDTKLYVDTGTTGEIVDPTLLLYVGGEEIAPYTMTQKDNTLFLGNYTLKRSLIS
TELKNQIKSDSIVTTILGGLDDAIESEWNVNTQYNSNYDLNYDSRIKGFQKGEIYRLGIQFQDNKGKWSEVVFIGDYECT
ERFKYTQYDTYGITLIPRFKVVISNSTTIQAIKNLGYINARGVVVFPTLEDRNILCQGILCPTVANYKDRLDNSPFVQSS
WFSRPKQATETWKTEYSGTNHLSEFGEVPYFQHNEPIGSASLSEITRWEIQTSLGLVPYYNPSTTNAKDFVDGSPSEFLV
DENIVTMHSPDVEFDDRLQNITNGKFKLRIIGTTHLTNTLSDISVITSTPTYGNYATGFYKGKVANMNISTSYYGGRQLS
AGLFWSDNVKFQDPSPQDKLERLWMVYPWHRNGSLMNMGVPTEGTRAAALQRKIISNLKFASQNNYLPNQSVWEAEISGD
ANHTGITPVNSWTEGLVRIPAQANSNLGSLNYYANIDKVLTFNRSEQISEIYKNGYLIYTTKDWITDGKIADLFNNAISQ
TISVDQVQDWLTRIADTDKYGTEPVSMKYKSNPHLVFAFNYTESGKQLILPMKNNNNGYLAPSANSKPFWNPTAPEGAVY
QDSINFTNENRAFFWLAELYRDSVVNRFGGDTEEAILNNTWLPSGDSVIIGDSINIEYTEGDTYYQRYDCLRTFAYTNED
QNSIVDIVSFMCESKVNIDGRYDKNRGQVNNLAVSPTNFNLFNPVYSQKNNFFTFRTIDYERFSINYFPNSITVTKEKSL
GEDIDTWTNITLATTLDLDGDKGEIVSLNTYNNEIFCFQRRGLSNILFNSRVQIPTSDGMPIEITNGLKVSGKRYISNTI
GCANKWSIAESPSGLYFIDNETNSLYLFNGEIVSLSDKLGFRQWISTHNVHVNWEPVGYNNYRSFYDKNNNDVYFTYKDH
CLCYSELINQFTSFMSYEGVPAMFNVSSEFYAFKDGKMWEQFAGDYNMFFGEYKPFSITFVANAEEPNDKIFNTVEFRAD
SWDSDNLISNKTFDTLDVWNEYQHGTTPLTNLLGHPSPLKKKFRIWRANIPRAIANNRDRIRNTWAYIKLGMNTPNTYRT
EFHDAIIHYFA
>P37092 ~~~ORF3~~~Movement protein Hsp70h~~~
MVVFGLDFGTTFSSVCAYVGEELYLFKQRDSAYIPTYVFLHSDTQEVAFGYDAEVLSNDLSVRGGFYRDLKRWIGCDEEN
YRDYLEKLKPHYKTELLKVAQSSKSTVKLDCYSGTVPQNATLPGLIATFVKALISTASEAFKCQCTGVICSVPANYNCLQ
RSFTESCVNLSGYPCVYMVNEPSAAALSACSRIKGATSPVLVYDFGGGTFDVSVISALNNTFVVRASGGDMNLGGRDIDK
AFVEHLYNKAQLPVNYKIDISFLKESLSKKVSFLNFPVVSEQGVRVDVLVNVSELAEVAAPFVERTIKIVKEVYEKYCSS
MRLEPNVKAKLLMVGGSSYLPGLLSRLSSIPFVDECLVLPDARAAVAGGCALYSACLRNDSPMLLVDCAAHNLSISSKYC
ESIVCVPAGSPIPFTGVRTVNMTGSNASAVYSAALFEGDFVKCRLNKRIFFGDVVLGNVGVTGSATRTVPLTLEINVSSV
GTISFSLVGPTGVKKLIGGNAAYDFSSYQLGERVVADLHKHNSDKVKLIHALTYQPFQRKKLTDGDKALFLKRLTADYRR
EARKFSSYDDAVLNSSELLLGRIIPKILRGSRVEKLDV
>Q08542 ~~~ORF2~~~Movement protein p6~~~
MDCVLRSYLLLAFGFLICLFLFCLVVFIWFVYKQILFRTTAQSNEARHNHSTVV
>P21946 ~~~BC1~~~Movement protein BC1~~~
MDSQLVNPPNAFNYIESHRDEYQLSHDLTEIILQFPSTAAQLTARLSRSCMKIDHCVIEYRQQVPINATGSVIVEIHDKR
MTDNESLQASWTFPIRCNIDLHYFSASFFSLKDPIPWKLYYKVCDTNVHQRTHFAKFKGKLKLSTAKHSVDIPFRAPTVR
ILSKQFSEKDVDFSHVDYGKWERKPIRCASMSRIGLRGPIEIRPGESWASRSTIGTAQPDTDSEMENELHPYRHLNRLGT
SLLDPGESASIVGDQRAEPNITMSMGQLNELVRTAVQECVNSNCQASQAKSLK
>P0CJ94 ~~~~~~P3N-PIPO polyprotein~~~
MATIMFGSIAAEIPVIKEAIMIAMPKSKHTLHVVQVEAKHMATEIRSERGKLYVAKRFADNAIKAYDSQLKAFDELLKKN
SDLQKRLFIGQNSPIKQKKGGACFVRSLSFKQAEERHAKYLKLQEEEHQFLSGAYGDKAYVGSVQGTLDRKVAEKVSFKS
PYYKRTCKAVRQVKVLKKAVGSGKVLDQVLEIVAETGVPVTFVGKGANKTLRAQYVRRYGLVIPKIFLCHESGRKVHREM
SYWHHKETLQYLCKHGKYGALNENALCKGDSGLLFDQRTAFVKRVTYLPHFIVRGRQEGQLVCATEYLDNVYTIEHYTHK
PEEQFFKGWKQVFDKMAPHTFEHDCTIDYNNEQCGELAATICQTLFPVRKLSCNKCRHRIKDLSWEEFKQFILAHLGCCA
KLWEEQKNLPGLEKIHSFVVQATSENMIFETSMEIVRLTQNYTSTHMLQIQDINKALMKGSSATQEDLKKASEQLLAMTR
WWKNHMTLTNEDALKTFRNKRSSKALINPSLLCDNQLDRNGNFVWGERGRHSKRFFENFFEEVVPSEGYKKYVIRNNPNG
FRKLAIDSLIVPMDLARARIALQGESIKREDLTLACVSKQDGNFVYPCCCVTQDDGRPFYSELKSPTKRHLVVGTSGDPK
YIDLPATDSDRMYIAKEGYCYLNIFLAMLVNVNEDEAKDFTKMVRDVVVPKLGTWPSMMDVATAVYIMTVFHPETRSAEL
PRILVDHASQTMHVIDSFGSLSVGYHVLKAGTVNQLIQFASNDLEGEMKHYRVGGDAEQRMRCERALISSIFKPKKMMQI
LENDPYTLVLGLVSPTVLIHMFRMKHFEKGVELWINKDQSVVKIFLLLEHLTRKIAMNDVLLEQLEMISQQAGRLHEIIC
DCPKNIHSYRAVKDFLEVKMEAALTNKELANNGFFDINESLGHVSEKNLCKSLREGMARAKLVGKIFCNMAIEKVLKGYG
RAFDKESCRRQKRIFKKICECVLHECPNTPRKCTYYNFK
>P03603 ~~~ORF3a~~~Movement protein~~~
MSNIVSPFSGSSRTTSDVGKQAGGTSDEKLIESLFSEKAVKEIAAECKLGCYNYLKSNEPRNYIDLVPKSHVSAWLSWAT
SKYDKGELPSRGFMNVPRIVCFLVRTTDSAESGSITVSLCDSGKAARAGVLEAIDNQEATIQLSALPALIALTPSYDCPM
EVIGGDSGRNRCFGIATQLSGVVGTTGSVAVTHAYWQANFKAKPNNYKLHGPATIMVMPFDRLRQLDKKSLKNYIRGISN
QSVDHGYLLGRPLQSVDQVAQEDLLVEESESPSALGRGVKDSKSVSASSVAGLPVSSPTLRIK
>P03546 ~~~ORF I~~~Movement protein~~~
MDLYPEENTQSEQSQNSENNMQIFKSENSDGFSSDLMISNDQLKNISKTQLTLEKEKIFKMPNVLSQVMKKAFSRKNEIL
YCVSTKELSVDIHDATGKVYLPLITREEINKRLSSLKPEVRKIMSMVHLGAVKILLKAQFRNGIDTPIKIALIDDRINSR
RDCLLGAAKGNLAYGKFMFTVYPKFGISLNTQRLNQTLSLIHDFENKNLMNKGDKVMTITYIVGYALTNSHHSIDYQSNA
TIELEDVFQEIGNVQQSDFCTIQNDECNWAIDIAQNKALLGAKTQSQIGNSLQIGNSASSSNTENELARVSQNIDLLKNK
LKEICGE
>P03545 ~~~ORF I~~~Movement protein~~~
MDLYPEENTQSEQSQNSENNMQIFKSENSDGFSSDLMISNDQLKNISKTQLTLEKEKIFKMPNVLSQVMKKAFSRKNEIL
YCVSTKELSVDIHDATGKVYLPLITKEEINKRLSSLKPEVRKTMSMVHLGAVKILLKAQFRNGIDTPIKIALIDDRINSR
RDCLLGAAKGNLAYGKFMFTVYPKFGISLNTQRLNQTLSLIHDFENKNLMNKGDKVMTITYVVGYALTNSHHSIDYQSNA
TIELEDVFQEIGNVQQSEFCTIQNDECNWAIDIAQNKALLGAKTKTQIGNNLQIGNSASSSNTENELARVSQNIDLLKNK
LKEICGE
>Q83252 ~~~ORF3a~~~Movement protein~~~
MAFQGTSRTLTQQSSAASSDDLQKILFSPDAIKKMATECDLGRHHWMRADNAISVRPLVPQVTSNNLLSFFKSGYDAGEL
RSKGYMSVPQVLCAVTRTVSTDAEGSLKIYLADLGDKELSPIDGQCVTLHNHELPALISFQPTYDCPMELVGNRHRCFAV
VVERHGYIGYGGTTASVCSNWQAQFSSKNNNYTHAAAGKTLVLPYNRLAEHSKPSAVARLLKSQLNNVSSSRYLLPNVAL
NQNASGHESEILNESPPFAIGSPSASRNNSFRSQVVNGL
>Q67685 ~~~ORF4~~~Movement protein~~~
MSSQVAKAATQGELLEALYGEVTVQELQETNLGVLTPHRGDQRVVFTPLLPPRTQTRISGVLRRLRPTRNTGGLLYLEKV
VVVFTPHVPDDAPGEVEVWIHDSLLPNLNSVGPRLRFPLNGGPRLMAFYPPYSIPLMDKSKEMPRCFAIVSELLSASYVG
GGSPFSLHIMWQPQVESLAHNYLMRPPRMQKICRGMVKDALGSLSSRKSYIAGAVSHRFALTAANPLPISGDTAEEAGEA
SSGEPHWVPEATAPRVRKAT
>P0CJ97 ~~~~~~P3N-PIPO polyprotein~~~
MATLDNCTQVHHMFAYNREHGTNYTRNHFRRYLAAQRIGFYYDWDDDVYECPTCEAIYHSLDDIKNWHECDPPAFDLNDF
ITDARLKSAPVPDLGPVIIEIPKAEEKQELNFFAATPAPEVSQWKCRGLQFGSFTELETSEPVASAPEPKCEEPARTIAK
PEESVEQETRGDGKRLLQAQMEVDKAEQDLAFACLNASLKPRLEGRTTATIARRRDGCLVYKTKPSWSQRRRAKKTLKVD
TLACENPYIPAIVDKISIAGGSSASVMHEQQKPKTLHTTPSRKVATHYKRTVMNQQTLMAFINQVGTILLNAEKEFEVVG
CRKQKVTGKGTRHNGVRLVKLKTAHEEGHRRRVDIRIPNGLRPIVMRISARGGWHRTWTDSELSPGSSGYVLNSSKIIGK
FGLRRHSIFVVRGRVDGEVIDSQSKVTHSITHRMVQYSDVARNFWNGYSTCFMHNTPKDILHTCTSDFDVKECGTVAALL
TQTLFQFGKITCEKCAIEYKNLTRDELATRVNKEIDGTIISIQTQHPRFVHVLNFLRLIKQVLNAKNGNFGAFQETERII
GDRMDAPFSHVNKLNAIVIKGNQATSDEMAQASNHVLEIARYLKNRTENIQKGSLKSFRNKISGKAHLNPSLMCDNQLDK
NGGFEWGQRSYHAKRFFDGYFETIDPSDGYSKYTIRRNPNGHRKLAIGNLIVSTNFESHRRSMIGESIEDPGLTNQCVSK
EGDTFIYPCCCVTDEYGKPTLSEIKMPTKHHLVLGNAGDPKYVDLPKEAEGKMFVTKDGYCYINIFLAMLVDVPEDQAKD
FTKMAREIAVKQLGEWPSMMDVATACNILATFHPDTRRSELPRILVDHATKTFHVIDSYGSITTGFHILKANTVTQLVKF
AHESLESEMQHYRVGGEPDKAPRKPAGSVPTLGISDLRDLGVELENEEHSIRPNLQRLIKAIYRPRMMRSLLTEEPYLLI
LSIVSPGVLMALYNSGSLERTMHEFLQTDQRLSATAQILKHLAKKVSLAKTLTIQNAILEGGAGSLNEILDAPAGRSLSY
RLAKQTVEVMMARSDMDKELVDVGFSVLRDQKNELIEKKLSHGFGGFVARTTIVWKIISNASLAAMAGYFYSRSNPNRRR
RFERQIQYLGWICFQKRDLAPKGNLLRRSKES
>Q9E7N8 ~~~4b~~~Probable movement protein 4b~~~
MLDVNSVRKHVLKTGSLTSAVGTGTIYQGTYNRYAKKKELNIIVTSSGSNNVIMRQVPLFDKEDLDAMKSDTTSNKYLHI
GCITVSIEPLLHQRYMKNFGKTIAGNCAIIDSTFRKVDQSIISLHKYDLSRGRADYVSYPNHCLSLTDPMIQKRLSVLLG
IKGIDVEPGVELFSICIGYIVSSVNTLHPVSQLGIQGVAINGTESADIDELGAEDIDQLSLSYNDSKIISLPSDEDIYYR
SKGSLFSKGRTIKRRTMRTRVPDPEEPIKLTKSQSSRIEHGKVMRLLKNKQIREKIERGMIA
>B3VML2 ~~~~~~Putative movement protein~~~
MGDNALDLATASSTPIPMPNTGQLVISPQDIGYSDPPKLRGRLKLEFVHDISLDANVEDPIALIPHGIWSIFKSKLAQMR
CPKGYITYDKVILSWKPHVATGLARGQIAVVDTRVNHTSIEDLMHKALWKTAPVDLGCTYTIQGTVPYCLPFHPKEGGDV
KSDLESQNPIRGIVYITDSRYQEAARHGALTMTLKLSIGTMPTDALTGPRATLSQPHLRDNLRSRSQRISRPPIGITQRP
RRSLAEPPLEKEEEQESTLSSEASGSEQGLIIPVQGPSTSSRSRRVRG
>P10471 ~~~ORF4~~~Movement protein P17~~~
MSMVVYNNQGGEEGNPFAGALTEFSQWLWSRPLGNPGAEDVEEEAIAAQEELEFPEDEAQARHSCLQRTTSWATPKEVSP
SGRVYQTVRLSRMEYSRPTMSIRSQASYFSSSARPLPPPPAPSLMSWTPIAKYHPSSPTSTSSKLRRAAPKLIKRG
>P17524 ~~~ORF4~~~Movement protein P17~~~
MSMVVYNNQECEEGNPFAGALTEFSQWLWSRPLGNPGAEDAEEEAIAAQEELEFPEDEAQARHSCLQRTTSWATPKEVSP
SGRVYQTVRHSRMEYSRPTMSIRSQASYFSSSARPLPPPPVPSLMSWTPIAKYHPSSPTSTSSKLRRAAPKLIKRG
>P89658 ~~~MP~~~Movement protein~~~
MALVVKDDVKISEFINLSAAEKFLPAVMTSVKTVRISKVDKVIAMENDSLSDVDLLKGVKLVKDGYVCLAGLVVSGEWNL
PDNCRGGVSVCLVDKRMQRDDEATLGSYRTSAAKKRFAFKLIPNYSITTADAERKVWQVLVNIRGVAMEKGFCPLSLEFV
SVCIVHKSNIKLGLREKITSVSEGGPVELTEAVVDEFIESVPMADRLRKFRNQSKKGSNKYVGKRNDNKGLNKEGKLFDK
VRIGQNSESSDAESSSF
>P0CJ99 ~~~~~~P3N-PIPO polyprotein~~~
MSTLVCQAVAAPVWSNGARTRRIRDADGEYRCTQCDMGFDSMTMARPVNHCCDGIMIDEYNLYDDDPIMHLVDSKTPIKR
GSQETEGDGMAAEAIKVTGAEPVNCFMVGTIKCKINENSIVAKGVMAAIPRQLTQDEVFMRKARLQAAVAKSTIEREEKE
RQFAFSKLEEKLRARREKLKDGIVIKTRKGLEWREATPNQQRGKLQSTSFDASGGKTLTPHTIYCKTKSSKFSNGGVKCA
TSKKMRTVRKPQSLKMKTESIDVLIEQVMTIAGKHAKQVTLIDKQKTNRVWIRRVNGVRLLQVETKHHKGIISQKDASLN
NLTKRVARHFARKTAYIHPSDSITHGHSGVVFLRANISGSKSYSIDDLFVVRGKRNGKLMESRNKVAWRKMFQIDHFSIV
GIKIWNAFDAEYVKLRDESVSDHDCVGGITPEECGILAAQILRVFYPCWRITCTKCISNWLSKPTSEQIEHIYERGNLAI
QDLNKRIPSAHHVTQMVELLRQRIKNTTFDMGNNTKVHELIGHRQDGVFRHLNRLNNSILAANGSSTIEWESMNESLLEL
ARWHNKRTESIASGGISSFRNKISAKAQINFALMCDNQLDTNGNFVWGERGYHAKRFFSEFFTKIDPKDGYSHHTVRATP
TGVRHLAIGNLIIPGDLQKLREKLEGVSITAVGISEKCVSRRNGDFVYPCSCVTSENGKPVLSDVILPTRNHLVIGNTGD
PKLVDLPKTETGRMWIAKEGYCYINIFFAMLVNVSEKDAKDFTKFVRDEIMPQLGKWPTMMDVATACYKLAIIYPDVRDA
QLPRILVDHSEQIFHVIDSYGSMTTGYHILKAGTVSQLISFAHGALLGEMKMYRVGGTQKMEINMCCCQRKNLLIKQLIR
AIYRPKLLTEIIETEPFVLMLAIVSPSILKAMFRSGTFNQAIKFYMHRSKPTAQTLAFLEALSERVSRSRVLSEQFNIID
GALKELKSLANMSMRTQHTYPIVQNQLDIMIERVSADAELLRDGFVVSKGRVQALIEKKLSRRPEKFLHRLAICTTIATN
YVIFKSEAWFWRIVRKQRLELFQGSMDGAFELIFGRRQTDHPLGAHKVAADVSKWW
>Q85438 ~~~~~~Movement protein~~~
MDTETLCLITADSGKVYGILKAIFGDESEIVKKLIDFDVSIRVVPLNLGLLNIFRDNAADLDNADLMKRRFGNTMGSRIV
EAYRRSQDSKYKRNVCKTTGLLVCLFGGGLGLSREADKHKKFVEGKSHNILSVEMLKRALSIGGQNVDANKISSFWFATY
TIFTTVYSPRLRYQAGSSKRIIALSESRNQYRSNLFWDLRDDSSHEVMSMVHVLSALFASALTAYISTRVRHELTQGNDE
RESLNNVLVWLKTLTFEPSTIALIAYIWLVSPTDAQATITIGSVMESESSDDFPDIVKILSYTSNTMLPVQLLEDGRTAY
CSVADGYTRHTTALTLITDYNSSHMSDKFGVLINIVKFEHAYALHYVHHKPRDGKEMTITSPSSEMMFTSVVVTPLSSYP
LIHARNAVIDWLRTFVHMFPDSGSLVIPADSYTWIHNLAQDMFPWVQLSTTLDIRDDHYFQVLCDCLSLGRDSRNHAKVE
KLIKYMKASVYNFTSEARGNMLLAITVYK
>Q9QDI8 ~~~MP~~~Movement protein~~~
MSYEPKVSDFLALTKKEEILPKALTRLKTVSISTKDVISVKESESLCDIDLLVNVPLDKYRYVGVLGVAFTGEWLVPDFV
KGGVTVSVIDKRLENSRESMIGTYRAAAKDRRFQFKLVPNYFVSTADAKRKPWQVHVRIQNLKIEAGWQPLALEVVSVAM
VTNNVVVKGLREKVIAVNDPNVEGFEGVVDDFVDSVAAFKAIDSFRKKKKKIGGRDVNNNKYRYRPERYAGPDSLQYKEE
NGLQHHELESVPVFRSDVGRAHSDA
>Q00847 ~~~pc4~~~Movement protein~~~
MALSRLLSTLKSKVLYDDLSEESQKRVDNKNRKSLALSKRPLNQGRVTIDQAATMLGLEPFSFSDVKVNKYDMFIAKQDY
SVKAHRKATFNILVDPYWFHQPLTHYPFFRVETFAMVWIGIKGRASGITTLRIIDKSYVNPSDQVEVEVRYPISKNFAVL
GSLANFLALEDKHNLQVSVSVDDSSVQNCVISRTLWFWGIERTDLPVSMKTNDTVMFEFEPLEDKAINHLSSFSNFTTNV
VQKAVGGAFTSKSFPELDTEKEFGVVKQPKKIPITKKSKSEVSVIM
>Q98663 ~~~3~~~Movement protein~~~
MGEGKNHQSFSFKNADDEIDLSLSKFSLFKLKMAKSKIIKVFGQNDPVDPNNCYINMRSIKITTSSVLPESDPKYLIWEM
SYKTDEEDHTLGQLAWKASYNGTFIVTTTYAMMVTGGELYTPYTAVIRSSDGNEIKGVKVKVTLSWDPANDRPSKARMGG
FIQDMYCKTITNGKTQISPMVGWYIGQDERRYCKVLNKSALEFSSEGIYPLMELVSGADSVINPLINKLISGMLNDEEKR
RVSLYTSTVGAGTSLTQSEKLLLKKLVESKTGSGLVQFLMRACKELGTDVYLEA
>P21936 ~~~BC1~~~Movement protein BC1~~~
MGSQLVPPPSAFNYIESQRDEFQLSHDLTEIVLQFPSTASQITARLSRSCMKIDHCVIEYRQQVPINASGTVIVEIHDKR
MTDNESLQASWTFPIRCNIDLHYFSSSFFSLKDPIPWKLYYRVSDSNVHQMTHFAKFKGKLKLSSAKHSVDIPFRAPTVK
ILAKQFSEKDIDFWHVGYGKWERRLVKSASSSRFGLRGPIEINPGESWATKSAIGPTNRNADLDIEEELLPYRELNRLGT
NILDPGESASIVGIQRSQSNITMSMSQLNELVRSTVHECIKTSCIPSTPKSLS
>Q89914 ~~~sc4~~~Movement protein~~~
MEGLSSKAQTMGREDDNRSSKMKVFHSELVYGDNHNISIKKADLTGQHKMMLLLSSALRIGSVHMDVSRILVKWCPYITP
NMNTTIGITIKNNHHDDMSNINDMSTYISVKGKMSEALQITWHPASTLVYKKGMSCIFPWVVDVDTGSTEQESGSPALGE
IKIWCYFKMQYHKPSTRHIARAEIAPSIEWGNTNFPYYVPFAMIRRARGIRPLDVFSTNQYSMFLEDVIKHVGTDSIKES
DIVPIMSTMSQEDMMMINEKNKTCLLKRGGSYCSCKDVIENVVKEINMNRDRKYDNHGLLLSGYIAGSTSGRFQTVPMLS
DISY
>P11691 ~~~~~~Movement protein~~~
MDTEYEQVNKPWNELYKETTLGNKLTVNVGMEDQEVPLLPSNFLTKVRVGLSGGYITMRRIRIKIIPLVSRKAGVSGKLY
LRDISDTTGRKLHCTESLDLGREIRLTMQHLDFSVSTRSDVPIVFGFEELVSPFLEGRELFSISVRWQFGLSKNCYSLPQ
SKWKVMYQEDALKVLRPSKKKASKTDSSV
>P0CK09 ~~~~~~P3N-PIPO polyprotein~~~
MALIFGTVNANILKEVFGGARMACVTSAHMAGANGSILKKAEETSRAIMHKPVIFGEDYITEADLPYTPLHLEVDAEMER
MYYLGRRALTHGKRRKVSVNNKRNRRRKVAKTYVGRDSIVEKIVVPHTERKVDTTAAVEDICNEATTQLVHNSMPKRKKQ
KNFLPATSLSNVYAQTWSIVRKRHMQVEIISKKSVRARVKRFEGSVQLFASVRHMYGERKRVDLRIDNWQQETLLDLAKR
FKNERVDQSKLTFGSSGLVLRQGSYGPAHWYRHGMFIVRGRSDGMLVDARAKVTFAVCHSMTHYSDKSISEAFFIPYSKK
FLELRPDGISHECTRGVSVERCGEVAAILTQALSPCGKITCKRCMVETPDIVEGESGESVTNQGKLLAMLKEQYPDFPMA
EKLLTRFLQQKSLVNTNLTACVSVKQLIGDRKQAPFTHVLAVSEILFKGNKLTGADLEEASTHMLEIARFLNNRTENMRI
GHLGSFRNKISSKAHVNNALMCDNQLDQNGNFIWGLRGAHAKRFLKGFFTEIDPNEGYDKYVIRKHIRGSRKLAIGNLIM
STDFQTLRQQIQGETIERKEIGNHCISMRNGNYVYPCCCVTLEDGKAQYSDLKHPTKRHLVIGNSGDSKYLDLPVLNEEK
MYIANEGYCYMNIFFALLVNVKEEDAKDFTKFIRDTIVPKLGAWPTMQDVATACYLLSILYPDVLRAELPRILVDHDNKT
MHVLDSYGSRTTGYHMLKMNTTSQLIEFVHSGLESEMKTYNVGGMNRDVVTQGAIEMLIKSIYKPHLMKQLLEEEPYIIV
LAIVSPSILIAMYNSGTFEQALQMWLPNTMRLANLAAILSALAQKLTLADLFVQQRNLINEYAQVILDNLIDGVRVNHSL
SLAMEIVTIKLATQEMDMALREGGYAVTSEKVHEMLEKKLCKGFEGCMGRINLVGKILRNQAFKKALEIWAKAFNHEKHR
RLRRTYRLVCEIAFQVPLGTPEGNHLKSRKWWRKKGKSSEECHDKRGFSQNLQHAS
>P03583 ~~~MP~~~Movement protein~~~
MALVVKGKVNINEFIDLTKMEKILPSMFTPVKSVMCSKVDKIMVHENESLSEVNLLKGVKLIDSGYVCLAGLVVTGEWNL
PDNCRGGVSVCLVDKRMERADEATLGSYYTAAAKKRFQFKVVPNYAITTQDAMKNVWQVLVNIRNVKMSAGFCPLSLEFV
SVCIVYRNNIKLGLREKITNVRDGGPMELTEEVVDEFMEDVPMSIRLAKFRSRTGKKSDVRKGKNSSNDRSVPNKNYRNV
KDFGGMSFKKNNLIDDDSEATVAESDSF
>P69513 ~~~MP~~~Movement protein~~~
MALVVKGKVNINEFIDLSKSEKLLPSMFTPVKSVMVSKVDKIMVHENESLSEVNLLKGVKLIEGGYVCLVGLVVSGEWNL
PDNCRGGVSVCMVDKRMERADEATLGSYYTAAAKKRFQFKVVPNYGITTKDAEKNIWQVLVNIKNVKMSAGYCPLSLEFV
SVCIVYKNNIKLGLREKVTSVNDGGPMELSEEVVDEFMENVPMSVRLAKFRTKSSKRGPKNNNNLGKGRSGGRPKPKSFD
EVEKEFDNLIEDEAETSVADSDSY
>P36292 ~~~NSM~~~Movement protein~~~
MLTLFGNKRPSKSAGKDEGPLVSLAKHNGSVEVSKPWSSSDEKLALTKAMDASKGKILLNIEGTSSFGTYESDSIIESEG
YDLSARMIVDTNHHISNWKNDLFVGNGKQNANKVIKICPTWDSRKQYMMISRIVIWVCPTIPNPTGKLVVALIDPNMPSG
KQVILKGQGTITDPICFVFYLNWSIPKMNNTPENCCQLHLMCSQEYKKGVSFGSVMYSWTKEFGDSPRADKDKCMVIPLN
RAIRARSQAFIEACKLIIPKCNSEKQIKKQLKELSSNLERSVEEEEEGISDSVAQLSFDEI
>P0CK11 ~~~~~~P3N-PIPO polyprotein~~~
MAAVTFASAITNAITSKPALTGMVQFGSFPPMPLRSTTVTTVATSVAQPKLYTVQFGSLDPVVVKSGAGSLAKATRQQPN
VEIDVSLSEAAALEVAKPRSNAVLRMHEEANKERALFLDWEASLKRSSYGIAEDEKVVMTTHGVSKIVPRSSRAMKLKRA
RERRRAQQPIILKWEPKLSGISIGGGLSASVIEAEEVRTKWPLHKTPSMKKRTVHRICKMNDQGVDMLTRSLVKIFKTKS
ANIEYIGKKSIKVDFIRKERTKFARIQVAHLLGKRAQRDLLTGMEENHFIDILSKYSGNKTTINPGVVCAGWSGIVVGNG
ILTQKRSRSPSEAFVIRGEHEGKLYDARIKVTRTMSHKIVHFSAAGANFWKGFDRCFLAYRSDNREHTCYSGLDVTECGE
VAALMCLAMFPCGKITCPDCVTDSELSQGQASGPSMKHRLTQLRDVIKSSYPRFKHAVQILDRYEQSLSSANENYQDFAE
IQSISDGVEKAAFPHVNKLNAILIKGATVTGEEFSQATKHLLEIARYLKNRTENIEKGSLKSFRNKISQKAHINPTLMCD
NQLDRNGNFIWGERGYHAKRFFSNYFEIIDPKKGYTQYETRAVPNGSRKLAIGKLIVPTNFEVLREQMKGEPVEPYPVTV
ECVSKLQGDFVHACCCVTTESGDPVLSEIKMPTKHHLVIGNSGDPKYIDLPEIEENKMYIAKEGYCYINIFLAMLVNVKE
SQAKEFTKVVRDKLVGELGKWPTLLDVATACYFLKVFYPDVANAELPRMLVDHKTKIIHVVDSYGSLSTGYHVLKTNTVE
QLIKFTRCNLESSLKHYRVGGTEWEDTHGSSNIDNPQWCIKRLIKGVYKPKQLKEDMLANPFLPLYALLSPGVILAFYNS
GSLEYLMNHYIRVDSNVAVLLVVLKSLAKKVSTSQSVLAQLQIIERSLPELIEAKANVNGPDDAATRACNRFMGMLLHMA
EPNWELADGGYTILRDHSISILEKKLSTNLGRSMERVKLVGALCYKILLVKASNLYTERFANEKRSRFRRQIQRVSHVIL
RTE
>Q66221 ~~~MP~~~Movement protein~~~
MSYEPKVSDFLALTKKEEILPKALTRLKTVSISTKDVISVKESESLCDIDLLVNVPLDKYRYVGVLGVVFTGEWLVPDFV
KGGVTVSVIDKRLENSKECIIGTYRAAAKDRRFQFKLVPNYFVSVADAKRKPWQVHVRIQNLKIEAGWQPLALEVVSVAM
VTNNVVVKGLREKVIAVNDPNVEGFEGVVDDFVDSVAAFKAIDSFRKKKKRIGGRDVNSNKYRYRPERYAGPDSLQYKEE
NGLQHHELESVPVFRSDVGRAHSDA
>P01104 ~~~V-MYB~~~Transforming protein Myb~~~
NRTDVQCQHRWQKVLNPELNKGPWTKEEDQRVIEHVQKYGPKRWSDIAKHLKGRIGKQCRERWHNHLNPEVKKTSWTEEE
DRIIYQAHKRLGNRWAEIAKLLPGRTDNAVKNHWNSTMRRKVEQEGYPQESSKAGPPSATTGFQKSSHLMAFAHNPPAGP
LPGAGQAPLGSDYPYYHIAEPQNVPGQIPYPVALHINIINVPQPAAAAIQRHYTDEDPEKEKRIKELELLLMSTENELKG
QQALPTQNHTANYPGWHSTTVADNTRTSGDNAPVSCLGEHHHCTPSPPVDHGCLPEESASPARCMIVHQSNILDNVKNLL
EFAETLQLIDSFLNTSSNHENLNLDNPALTSTPVCGHKMSVTTPFHKDQTFTEYRKMHGGAV
>P28991 ~~~M~~~Membrane protein~~~
MGAIDSFCGDGILGEYLDYFILSVPLLLLLTRYVASGLVYVLTALFYSFVLAAYIWFVIVGRAFSTAYAFVLLAAFLLLV
MRMIVGMMPRLRSIFNHRQLVVADFVDTPSGPVPIPRSTTQVVVRGNGYTAVGNKLVDGVKTITSAGRLFSKRTAATAYK
LQ
>O36307 3.1.-.-~~~N~~~Nucleoprotein~~~
MSTLQELQENITAHEQQLVTARQKLKDAEKAVEVDPDDVNKSTLQSRRAAVSTLETKLGELKRQLADLVAAQKLATKPVD
PTGLEPDDHLKEKSSLRYGNVLDVNSIDLEEPSGQTADWKAIGAYILGFAIPIILKALYMLSTRGRQTVKDNKGTRIRFK
DDSSFEEVNGIRKPKHLYVSMPTAQSTMKAEEITPGRFRTIACGLFPAQVKARNIISPVMGVIGFGFFVKDWMDRIEEFL
AAECPFLPKPKVASEAFMSTNKMYFLNRQRQVNESKVQDIIDLIDHAETESATLFTEIATPHSVWVFACAPDRCPPTALY
VAGVPELGAFFSILQDMRNTIMASKSVGTAEEKLKKKSAFYQSYLRRTQSMGIQLDQKIIILYMLSWGKEAVNHFHLGDD
MDPELRQLAQSLIDTKVKEISNQEPLKL
>Q0Q462 ~~~N~~~Nucleoprotein~~~
MASVKFQPRGRSKGRVPLSLFAPLRVTDEKPLYKVLPNNAVPQGMGGKDQQIGYWVEQQRWRMRRGDRVDLPSNWHFYFL
GTGPHSDLPFRKRTDGVFWVAIDGAKTQPTGLGVRKSSEKPLVPKFKNKLPNNVEIVEPTTPNNSRANSRSRSRGGQSNS
RGNSQNRGDKSRNQSRNRSQSNDRGSDSRDDLVAAVKKALEDLGVGAAKPKGKTQSGKNTPKNKSRSGSVQRAEAKDKPE
WRRTPSGDESVEVCFGPRGGTRNFGSSEFVAKGVNAPGYAQAASLVPGAAALLFGGNVATKEMADGVEITYTYKMLVPKD
DKNLEIFLAQVDAYKLGDPKPQRKVKRSRTPTPKPATEPVYDDVAADPTYANLEWDTTVEDGVEMINEVFDTQN
>P0C796 ~~~N~~~Nucleoprotein~~~
MPPKRRLVDDADAMEDQDLYEPPASLPKLPGKFLQYTVGGSDPHPGIGHEKDIRQNAVALLDQSRRDMFHTVTPSLVFLC
LLIPGLHAAFVHGGVPRESYLSTPVTRGEQTVVKTAKFYGEKTTQRDLTELEISSIFSHCCSLLIGVVIGSSSKIKAGAE
QIKKRFKTMMAALNRPSHGETATLLQMFNPHEAIDWINGQPWVGSFVLSLLTTDFESPGKEFMDQIKLVASYAQMTTYTT
IKEYLAECMDATLTIPVVAYEIRDFLEVSAKLKEEHADLFPFLGAIRHPDAIKLAPRSFPNLASAAFYWSKKENPTMAGY
RASTIQPGASVKETQLARYRRREISRGEDGAELSGEISAIMRMIGVTGLN
>P0C797 ~~~N~~~Nucleoprotein~~~
MPPKRRLVDDADAMEDQDLYEPPASLPKLPGKFLQYTVGGSDPHPGIGHEKDIRQNAVALLDQSRRDMFHTVTPSLVFLC
LLIPGLHAAFVHGGVPRESYLSTPVTRGEQTVVKTAKFYGEKTTQRDLTELEISSIFSHCCSLLIGVVIGSSSKIKAGAE
QIKKRFKTMMAALNRPSHGETATLLQMFNPHEAIDWINGQPWVGSFVLSLLTTDFESPGKEFMDQIKLVASYAQMTTYTT
IKEYLAECMDATLTIPVVAYEIRDFLEVSAKLKEDHADLFPFLGAIRHPDAIKLAPRSFPNLASAAFYWSKKENPTMAGY
RASTIQPGASVKETQLARYRRREISRGEDGAELSGEISAIMKMIGVTGLN
>P23051 ~~~N~~~Nucleoprotein~~~
MNSMLNPNAVPFQPSPQVVALPMQYPSGFSSGYRRQRDPAFRPMFRRQNNGNQNRSRQNRQRLQNNNRGNNRNRNQFNRR
QNQPSQSMSFEQQLLLMANETAYAATYPSDMQNIAPTKLVKIAKRAAMQIVSGHATVEISNGTEDSNQRVATFTIKVVMN
>P35943 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTFNKDQLLSTSKYTIQRSTGDNIDIPNYDVQKHLNKLCGMLLITEDANHKFTGLIGMLYAMSRLGREDTL
KILKDAGYQVRANGVDVITHRQDVNGKEMKFEVLTLVSLTSEVQGNIEIESRKSYKKMLKEMGEVAPEYRHDFPDCGMIV
LCVAALVITKLAAGDRSGLTAVIRRANNVLRNEMKRYKGLIPKDIANSFYEVFEKYPHYIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPNF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTTEELEAIKNQLNPKDNDVEL
>P22677 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTFNKDQLLSTSKYTIQRSTGDNIDIPNYDVQKHLNKLCGMLLITEDANHKFTGLIGILYAMSRLGREDTL
KILKDAGYQVRANGVDVITHRQDVNGKEMKFEVLTLVSLTSEVQGNIEIESRKSYKKMLKEMGEVAPEYRHDSPDCGMIV
LCVAALVITKLAAGDRSGLTAVIRRANNVLRNEMKRYKGLIPKDIANSFYEVIEKYPHYIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPNF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTTEELEAIKNQLNPKDNDVEL
>Q65708 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTFNKDQLLSTSKYTIQRSTGDNIDIPNYDVQKHLNKLCGMLLITEDANHKFTGLIGMLYAMSRLGREDTL
KILKDAGYQVRANGVDVITHRQDVNGKEMKFEVLTLVSLTSEVQGNIEIESRKSYKKMLKEMGEVAPEYRHDSPDCGMIV
LCVAALVITKLAAGDRSGLTAVIRRANNVLRNEMKRYKGLIPKDIANSFYEVFEKYPHYIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPNF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTTEELEAIKNQLNPKDNDVEL
>P04873 ~~~N~~~Nucleoprotein~~~
MSDLVFYDVASTGANGFDPDAGYMDFCVKNAESLNLAAVRIFFLNAAKAKAALSRKPERKANPKFGEWQVEVINNHFPGN
RNNPIGNNDLTIHRLSGYLARWVLDQYNENDDESQHELIRTTIINPIAESNGVGWDSGPEIYLSFFPGTEMFLETFKFYP
LTIGIHRVKQGMMDPQYLKKALRQRYGTLTADKWMSQKVAAIAKSLKDVEQLKWGKGGLSDTAKTFLQKFGIRLP
>P16495 ~~~N~~~Nucleoprotein~~~
MIELEFHDVAANTSSTFDPEVAYANFKRVHTTGLSYDHIRIFYIKGREIKTSLAKRSEWEVTLNLGGWKITVYNTNFPGN
RNNPVPDDGLTLHRLSGFLARYLLEKMLKVSEPEKLIIKSKIINPLAEKNGITWNDGEEVYLSFFPGSEMFLGTFRFYPL
AIGIYKVQRKEMEPKYLEKTMRQRYMGLEAATWTVSKLTEVQSALTVVSSLGWKKTNVSAAARDFLAKFGINM
>P89522 3.1.-.-~~~N~~~Nucleoprotein~~~
MENKIEVNNKDEMNRWFEEFKKGNGLVDTFTNSYSFCESVPNLDRFVFQMASATDDAQKDSIYASALVEATKFCAPIYEC
AWVSSTGIVKKGLEWFEKNAGTIKSWDESYTELKVDVPKIEQLTGYQQAALKWRKDIGFRVNANTAALSNKVLAEYKVPG
EIVMSVKEMLSDMIRRRNLILNRGGDENPRGPVSHEHVDWCREFVKGKYIMAFNPPWGDINKSGRSGIALVATGLAKLAE
TEGKGIFDEAKKTVEALNGYLDKHKDEVDRASADSMITNLLKHIAKAQELYKNSSALRAQSAQIDTAFSSYYWLYKAGVT
PETFPTVSQFLFELGKQPRGTKKMKKALLSTPMKWGKKLYELFADDSFQQNRIYMHPAVLTAGRISEMGVCFGTIPVANP
DDAAQGSGHTKSILNLRTNTETNNPCAKTIVKLFEVQKTGFNIQDMDIVASEHLLHQSLVGKQSPFQNAYNVKGNATSAN
II
>P04865 ~~~N~~~Nucleoprotein~~~
MASLLKSLTLFKRTRDQPPLASGSGGAIRGIKHVIIVLIPGDSSIVTRSRLLDRLVRLVGDPKINGPKLTGILISILSLF
VESPGQLIQRIIDDPDVSIKLVEVIPSINSACGLTFASRGASLDSEADEFFKIVDEGSKAQGQLGWLENKDIVDIEVDNA
EQFNILLASILAQIWILLAKAVTAPDTAADSEMRRWIKYTQQRRVVGEFRMNKIWLDIVRNRIAEDLSLRRFMVALILDI
KRSPGNKPRIAEMICDIDNYIVEAGLASFILTIKFGIETMYPALGLHEFSGELTTIESLMMLYQQMGETAPYMVILENSV
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSALAAELGITKEEAQLVSEIASK
TTEDRTIRATGPKQSQITFLHSERSEVANQQPPTINKRSENQGGDKYPIHFSDERLPGYTPDVNSSEWSESRYDTQIIQD
DGNDDDRKSMEAIAKMRMLTKMLSQPGTSEDNSPVYSDKELLN
>P10527 ~~~N~~~Nucleoprotein~~~
MSFTPGKQSSSRASFGNRSGNGILKWADQSDQSRNVQTRGRRAQPKQTATSQLPSGGNVVPYYSWFSGITQFQKGKEFEF
AEGQGVPIAPGVPATEAKGYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVASNQADVNTPAD
ILDRDPSSDEAIPTRFPPGTVLPQGYYIEGSGRSAPNSRSTSRASSRASSAGSRSRANSGNRTPTSGVTPDMADQIASLV
LAKLGKDATKPQQVTKQTAKEIRQKILNKPRQKRSPNKQCTVQQCFGKRGPNQNFGGGEMLKLGTSDPQFPILAELAPTA
GAFFFGSRLELAKVQNLSGNLDEPQKDVYELRYNGAIRFDSTLSGFETIMKVLNENLNAYQQQDGMMNMSPKPQRQRGQK
NGQGENDNISVAAPKSRVQQNKSRELTAEDISLLKKMDEPYTEDTSEI
>P15130 ~~~N~~~Nucleoprotein~~~
MATVKWADASEPQRGRQGRIPYSLYSPLLVDSEQPWKVIPRNLVPINKKDKNKLIGYWNVQKRFRTRKGKRVDLSPKLHF
YYLGTGPHKDAKFRERVEGVVWVAVDGAKTEPTGYGVRRKNSEPEIPHFNQKLPNGVTVVEEPDSRAPSRSQSRSQSRGR
GESKPQSRNPSSDRNHNSQDDIMKAVAAALKSLGFDKPQEKDKKSAKTGTPKPSRNQSPASSQTSAKSLARSQSSETKEQ
KHEMQKPRWKRQPNDDVTSNVTQCFGPRDLDHNFGSAGVVANGVKAKGYPQFAELVPSTAAMLFDSHIVSKESGNTVVLT
FTTRVTVPKDHPHLGKFLEELNAFTREMQQHPLLNPSALEFNPSQTSPATAEPVRDEVSIETDIIDEVN
>Q5MQC6 ~~~N~~~Nucleoprotein~~~
MSYTPGHYAGSRSSSGNRSGILKKTSWADQSERNYQTFNRGRKTQPKFTVSTQPQGNTIPHYSWFSGITQFQKGRDFKFS
DGQGVPIAFGVPPSEAKGYWYRHSRRSFKTADGQQKQLLPRWYFYYLGTGPYANASYGESLEGVFWVANHQADTSTPSDV
SSRDPTTQEAIPTRFPPGTILPQGYYVEGSGRSASNSRPGSRSQSRGPNNRSLSRSNSNFRHSDSIVKPDMADEIANLVL
AKLGKDSKPQQVTKQNAKEIRHKILTKPRQKRTPNKHCNVQQCFGKRGPSQNFGNAEMLKLGTNDPQFPILAELAPTPGA
FFFGSKLDLVKRDSEADSPVKDVFELHYSGSIRFDSTLPGFETIMKVLEENLNAYVNSNQNTDSDSLSSKPQRKRGVKQL
PEQFDSLNLSAGTQHISNDFTPEDHSLLATLDDPYVEDSVA
>Q6Q1R8 ~~~N~~~Nucleoprotein~~~
MASVNWADDRAARKKFPPPSFYMPLLVSSDKAPYRVIPRNLVPIGKGNKDEQIGYWNVQERWRMRRGQRVDLPPKVHFYY
LGTGPHKDLKFRQRSDGVVWVAKEGAKTVNTSLGNRKRNQKPLEPKFSIALPPELSVVEFEDRSNNSSRASSRSSTRNNS
RDSSRSTSRQQSRTRSDSNQSSSDLVAAVTLALKNLGFDNQSKSPSSSGTSTPKKPNKPLSQPRADKPSQLKKPRWKRVP
TREENVIQCFGPRDFNHNMGDSDLVQNGVDAKGFPQLAELIPNQAALFFDSEVSTDEVGDNVQITYTYKMLVAKDNKNLP
KFIEQISAFTKPSSIKEMQSQSSHVAQNTVLNASIPESKPLADDDSAIIEIVNEVLH
>P33469 ~~~N~~~Nucleoprotein~~~
MSFTPGKQSSSRASSGNRSGNGILKWADQSDQVRNVQTRGRRAQPKQTATSQQPSGGNVVPYYSWFSGITQFQKGKEFEF
VEGQGPPIAPGVPATEAKGYWYRHNRGSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVYWVASNQADVNTPAD
IVDRDPSSDEAIPTRFPPGTVLPQGYYIEGSGRSAPNSRSTSRTSSRASSAGSRSRANSGNRTPTSGVTPDMADQIASLV
LAKLGKDATKPQQVTKHTAKEVRQKILNKPRQKRSPNKQCTVQQCFGKRGPNQNFGGGEMLKLGTSDPQFPILAELAPTA
GAFFFGSRLELAKVQNLSGNPDEPQKDVYELRYNGAIRFDSTLSGFETIMKVLNENLNAYQQQDGMMNMSPKPQRQRGHK
NGQGENDNISVAVPKSRVQQNKSRELTAEDISLLKKMDEPYTEDTSEI
>Q9PY96 ~~~N~~~Nucleoprotein~~~
MSFVPGQENAGSRSSSGNRAGNGILKKTTWADQTERGNRGRRNHPKQTATTQPNAGSVVPHYSWFSGITQFQKGKEFQFA
QGQGVPIASGIPASEQKGYWYRHNRRSFKTPDGQHKQLLPRWYFYYLGTGPHAGAEYGDDIEGVVWVASQQADTKTTADV
VERDPSSHEAIPTRFAPGTVLPQGFYVEGSGRSAPASRSGSRSQSRGPNNRARSSSNQRQPASAVKPDMAEEIAALVLAK
LGKDAGQPKQVTKQSAKEVRQKILTKPRQKRTPNKQCPVQQCFGKRGPNQNFGGSEMLKLGTSDPQFPILAELAPTPSAF
FFGSKLELVKKNSGGADEPTKDVYELQYSGAIRFDSTLPGFETIMKVLTENLNAYQDQAGSVDLVSPKPPRRGRRQAQEK
KDEVDNVSVAKPKSLVQRNVSRELTPEDRSLLAQILDDGVVPDGLEDDSNV
>P18447 ~~~N~~~Nucleoprotein~~~
MSFVPGQENAGGRSSSGNRAGNGILKKTTWADQTERGPNNQNRGRRNQPKQTATTQPNSGSVVPHYSWFSGITQFQKGKE
FQFAEGQGVPIANGIPASEQKGYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASYGDSIEGVFWVANSQADTNT
RSDIVERDPSSHEAIPTRFAPGTVLPQGFYVEGSGRSAPASRSGSRSQSRGPNNRARSSSNQRQPASTVKPDMAEEIAAL
VLAKLGKDAGQPKQVTKQSAKEVRQKILNKPRQKRTPNKQCPVQQCFGKRGPNQNFGGSEMLKLGTSDPQFPILAELAPT
VGAFFFGSKLELVKKNSGGADEPTKDVYELQYSGAVRFDSTLPGFETIMKVLNENLNAYQKDGGADVVSPKPQRKGRRQA
QEKKDEVDNVSVAKPKSSVQRNVSRELTPEDRSLLAQILDDGVVPDGLEDDSNV
>P03416 ~~~N~~~Nucleoprotein~~~
MSFVPGQENAGGRSSSVNRAGNGILKKTTWADQTERGPNNQNRGRRNQPKQTATTQPNSGSVVPHYSWFSGITQFQKGKE
FQFAEGQGVPIANGIPASEQKGYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASYGDSIEGVFWVANSQADTNT
RSDIVERDPSSHEAIPTRFAPGTVLPQGFYVEGSGRSAPASRSGSRSQSRGPNNRARSSSNQRQPASTVKPDMAEEIAAL
VLAKLGKDAGQPKQVTKQSAKEVRQKILNKPRQKRTPNKQCPVQQCFGKRGPNQNFGGSEMLKLGTSDPQFPILAELAPT
VGAFFFGSKLELVKKNSGGADEPTKDVYELQYSGAVRFDSTLPGFETIMKVLNENLNAYQKDGGADVVSPKPQRKGRRQA
QEKKDEVDNVSVAKPKSSVQRNVSRELTPEDRSLLAQILDDGVVPDGLEDDSNV
>P03417 ~~~N~~~Nucleoprotein~~~
MSFVPGQENAGSRSSSGNRAGNGILKKTTWADQTERGLNNQNRGRKNQPKQTATTQPNSGSVVPHYSWFSGITQFQKGKE
FQFAQGQGVPIANGIPASQQKGYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPYAGAEYGDDIEGVVWVASQQAETRT
SADIVERDPSSHEAIPTRFAPGTVLPQGFYVEGSGRSAPASRSGSRPQSRGPNNRARSSSNQRQPASTVKPDMAEEIAAL
VLAKLGKDAGQPKQVTKQSAKEVRQKILNKPRQKRTPNKQCPVQQCFGKRGPNQNFGGPEMLKLGTSDPQFPILAELAPT
AGAFFFGSKLELVKKNSGGADGPTKDVYELQYSGAVRFDSTLPGFETIMKVLNENLNAYQNQDGGADVVSPKPQRKRGTK
QKAQKDEVDNVSVAKPKSSVQRNVSRELTPEDRSLLAQILDDGVVPDGLEDDSNV
>P05991 ~~~N~~~Nucleoprotein~~~
MANQGQRVSWGDESTKIRGRSNSRGRKSNNIPLSFFNPITLQQGSKFWNLCPRDFVPKGIGNRDQQIGYWNRQTRYRMVK
GQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVAKDGAMNKPTTLGSRGANNESKALKFDGKVPGEFQLEVNQSRDNS
RSRSQSRSRSRNRSQSRGRQQSNNKKDDSVEQAVLAALKKLGVDTEKQQQRSCSKSKERSNSKTRDTTPKNENKHTWKRT
AGKGDVTRFYGARSSSANFGDSDLVANGSSAKHYPQLAECVPSVSSILFGSYWTSKEDGDQIEVTFTHKYHLPKDDPKTE
QFLQQINAYARPSEVAKEQRKRKSRSKSAERSEQEVVPDALIENYTDVFDDTQVEMIDEVTN
>P04134 ~~~N~~~Nucleoprotein~~~
MANQGQRVSWGDESTKTRGRSNSRGRKNNNIPLSFFNPITLQQGSKFWNLCPRDFVPKGIGNRDQQIGYWNRQTRYRMVK
GQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVAKDGAMNKPTTLGSRGANNESKALKFDGKVPGEFQLEVNQSRDNS
RSRSQSRSRSRNRSQSRGRQQFNNKKDDSVEQAVLAALKKLGVDTEKQQQRSRSKSKERSNSKTRDTTPKNENKHTWKRT
AGKGDVTRFYGARSSSANFGDTDLVANGSSAKHYPQLAECVPSVSSILFGSYWTSKEDGDQIEVTFTHKYHLPKDDPKTG
QFLQQINAYARPSEVAKEQRKRKSRSKSAERSEQDVVPDALIENYTDVFDDTQVEIIDEVTN
>P59713 ~~~N~~~Nucleoprotein~~~
MSFVPGQENAGSRSSSGNRAGNGILKKTTWADQTERGSNNQNRGRRNQPKQTATTQPNSGSVVPHYSWFSGITQFQKGKE
FKFAEGQGVPIANGIPATEQKGYWFRHNRRSFKSPDGQQKQLLPRWYFYYLGTGPYAGAEYGDDVEGVCWVANKQADTRT
SADIAERDPSSHEAIPTRFAPGTFLPQGYYVEGSGRSAPASRSGSRSQSRGPNNRARSSSNQRQPASIVKPDMAEEIAAL
VLAKLGKDAGQPKQVTKQSAKEVRQKILNKPRQKRTPNKQCPVQQCFGKRGPNQNFGGPEMLKLGTSDPQFPILAELAPT
AGAFFFGSKLELVKKNSVGVDEPTKDVYELQYSGAVRFDSTLPGFETIMKVLRENLNAYQNQDGGADVVSPKPQRKRGQR
QVAQKKNDEVDNVSVAKPKSAVQRNVNRELTPEDRSLLAQILDDGVVPDGLEDDSNV
>Q805Q9 3.1.-.-~~~N~~~Nucleoprotein~~~
MATLEELQKEINNHEGQLVIARQKVKDAEKQYEKDPDDLNKRALSDRESIAQSIQGKIDELRRQLADRVAAGKNIGKERD
PTGLDPGDHLKEKSMLSYGNVIDLNHLDIDEPTGQTADWLSIVVYLTSFVVPILLKALYMLTTRGRQTTKDNKGMRIRFK
DDSSFEDVNGIRKPKHLFLSMPNAQSSMKADEITPGRFRTAICGLYPAQVKARNLISPVMSVIGFLALAKNWTERVEEWL
DLPCKLLSEPSPTSLTKGPSTNRDYLNQRQGALAKMETKEAQAVRKHAIDAGCNLIDHIDSPSSIWVFAGAPDRCPPTCL
FIAGMAELGAFFAVLQDMRNTIMASKTIGTSEEKLKKKSSFYQSYLRRTQSMGIQLDQRIIVLFMVDWGKEAVDSFHLGD
DMDPELRGLAQALIDQKVKEISNQEPLKL
>P19810 ~~~N~~~Nucleoprotein~~~
MASRRSRPQAASFRNGRRRQPTSYNDLLRMFGQMRVRKPPAQPTQAIIAEPGDLRHDLNQQERATLSSNVQRFFMIGHGS
LTADAGGLTYTVSWVPTKQIQRKVAPPAGP
>Q8JPY1 ~~~NP~~~Nucleoprotein~~~
MDRGTRRIWVSQNQGDTDLDYHKILTAGLTVQQGIVRQKIISVYLVDNLEAMCQLVIQAFEAGIDFQENADSFLLMLCLH
HAYQGDYKLFLESNAVQYLEGHGFKFELRKKDGVNRLEELLPAATSGKNIRRTLAALPEEETTEANAGQFLSFASLFLPK
LVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKYLLIHQGMHMVAGHDANDAVIANSVAQAR
FSGLLIVKTVLDHILQKTDQGVRLHPLARTAKVRNEVNAFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLYPQLSAIA
LGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAESRELDSLGLDDQERRILMNFHQKKNEISFQQTNAMVTLRKE
RLAKLTEAITLASRPNLGSRQDDGNEIPFPGPISNNPDQDHLEDDPRDSRDTIIPNGAIDPEDGDFENYNGYHDDEVGTA
GDLVLFDLDDHEDDNKAFEPQDSSPQSQREIERERLIHPPPGNNKDDNRASDNNQQSADSEEQGGQYNWHRGPERTTANR
RLSPVHEEDTLMDQGDDDPSSLPPLESDDDDASSSQQDPDYTAVAPPAPVYRSAEAHEPPHKSSNEPAETSQLNEDPDIG
QSKSMQKLEETYHHLLRTQGPFEAINYYHMMKDEPVIFSTDDGKEYTYPDSLEEAYPPWLTEKERLDKENRYIYINNQQF
FWPVMSPRDKFLAILQHHQ
>Q9QP77 ~~~NP~~~Nucleoprotein~~~
MDKRVRGSWALGGQSEVDLDYHKILTAGLSVQQGIVRQRVIPVYVVNDLEGICQHIIQAFEAGVDFQDNADSFLLLLCLH
HAYQGDHRLFLKSDAVQYLEGHGFRFEVREKENVHRLDELLPNVTGGKNLRRTLAAMPEEETTEANAGQFLSFASLFLPK
LVVGEKACLEKVQRQIQVHAEQGLIQYPTSWQSVGHMMVIFRLMRTNFLIKFLLIHQGMHMVAGHDANDTVISNSVAQAR
FSGLLIVKTVLDHILQKTDLGVRLHPLARTAKVKNEVSSFKAALGSLAKHGEYAPFARLLNLSGVNNLEHGLYPQLSAIA
LGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAETRELDNLGLDEQEKKILMSFHQKKNEISFQQTNAMVTLRKE
RLAKLTEAITTASKIKVGDRYPDDNDIPFPGPIYDDTHPNPSDDNPDDSRDTTIPGGVVDPYDDESNNYPDYEDSAEGTT
GDLDLFNLDDDDDDSRPGPPDRGQNKERAARTYGLQDPTLDGAKKVPELTPGSHQPGNLHITKSGSNTNQPQGNMSSTLH
SMTPIQEESEPDDQKDNDDESLTSLDSEGDEDGESISEENTPTVAPPAPVYKDTGVDTNQQNGPSSTVDSQGSESEALPI
NSKKSSALEETYYHLLKTQGPFEAINYYHLMSDEPIAFSTESGKEYIFPDSLEEAYPPWLSEKEALEKENRYLVIDGQQF
LWPVMSLRDKFLAVLQHD
>Q5XX08 ~~~NP~~~Nucleoprotein~~~
MDKRVRGSWALGGQSEVDLDYHKILTAGLSVQQGIVRQRVIPVYVVSDLEGICQHIIQAFEAGVDFQDNADSFLLLLCLH
HAYQGDHRLFLKSDAVQYLEGHGFRFEVREKENVHRLDELLPNVTGGKNLRRTLAAMPEEETTEANAGQFLSFASLFLPK
LVVGEKACLEKVQRQIQVHAEQGLIQYPTSWQSVGHMMVIFRLMRTNFLIKFLLIHQGMHMVAGHDANDTVISNSVAQAR
FSGLLIVKTVLDHILQKTDLGVRLHPLARTAKVKNEVSSFKAALGSLAKHGEYAPFARLLNLSGVNNLEHGLYPQLSAIA
LGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAETRELDNLGLDEQEKKILMSFHQKKNEISFQQTNAMVTLRKE
RLAKLTEAITTASKIKVGDRYPDDNDIPFPGPIYDETHPNPSDDNPDDSRDTTIPGGVVDPYDDESNNYPDYEDSAEGTT
GDLDLFNLDDDDDDSQPGPPDRGQSKERAARTHGLQDPTLDGAKKVPELTPGSHQPGNLHITKPGSNTNQPQGNMSSTLQ
SMTPIQEESEPDDQKDDDDESLTSLDSEGDEDVESVSGENNPTVAPPAPVYKDTGVDTNQQNGPSNAVDGQGSESEALPI
NPEKGSALEETYYHLLKTQGPFEAINYYHLMSDEPIAFSTESGKEYIFPDSLEEAYPPWLSEKEALEKENRYLVIDGQQF
LWPVMSLQDKFLAVLQHD
>O72142 ~~~NP~~~Nucleoprotein~~~
MDSRPQKVWMTPSLTESDMDYHKILTAGLSVQQGIVRQRVIPVYQVNNLEEICQLIIQAFEAGVDFQESADSFLLMLCLH
HAYQGDYKLFLESGAVKYLEGHGFRFEVKKRDGVKRLEELLPAVSSGKNIKRTLAAMPEEETTEANAGQFLSFASLFLPK
LVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKFLLIHQGMHMVAGHDANDAVISNSVAQAR
FSGLLIVKTVLDHILQKTERGVRLHPLARTAKVKNEVNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLFPQLSAIA
LGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKKNEISFQQTNAMVTLRKE
RLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQDTTIPDVVVDPDDGSYGEYQSYSENGMNAP
DDLVLFDLDEDDEDTKPVPNRSTKGGQQKNSQKGQHTEGRQTQSRPTQNVPGPHRTIHHASAPLTDNDRRNEPSGSTSPR
MLTPINEEADPLDDADDETSSLPPLESDDEEQDRDGTSNRTPTVAPPAPVYRDHSEKRELPQDEQQDQDHTQEARNQDSD
NTQPEHSFEEMYRHILRSQGPFDAVLYYHMMKDEPVVFSTSDGKEYTYPDSLEEEYPPWLTEKEAMNEENRFVTLDGQQF
YWPVMNHKNKFMAILQHHQ
>P18272 ~~~NP~~~Nucleoprotein~~~
MDSRPQKIWMAPSLTESDMDYHKILTAGLSVQQGIVRQRVIPVYQVNNLEEICQLIIQAFEAGVDFQESADSFLLMLCLH
HAYQGDYKLFLESGAVKYLEGHGFRFEVKKRDGVKRLEELLPAVSSGKNIKRTLAAMPEEETTEANAGQFLSFASLFLPK
LVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKFLLIHQGMHMVAGHDANDAVISNSVAQAR
FSGLLIVKTVLDHILQKTERGVRLHPLARTAKVKNEVNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLFPQLSAIA
LGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKKNEISFQQTNAMVTLRKE
RLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQDTTIPDVVVDPDDGSYGEYQSYSENGMNAP
DDLVLFDLDEDDEDTKPVPNRSTKGGQQKNSQKGQHIEGRQTQSRPIQNVPGPHRTIHHASAPLTDNDRRNEPSGSTSPR
MLTPINEEADPLDDADDETSSLPPLESDDEEQDRDGTSNRTPTVAPPAPVYRDHSEKKELPQDEQQDQDHTQEARNQDSD
NTQSEHSFEEMYRHILRSQGPFDAVLYYHMMKDEPVVFSTSDGKEYTYPDSLEEEYPPWLTEKEAMNEENRFVTLDGQQF
YWPVMNHKNKFMAILQHHQ
>Q0PI70 ~~~N~~~Nucleoprotein~~~
MPIIPKPKSQTKGSVESSKKESRVKMETSDAKYMVGNEVKTIKFLDMRGNIATSARNSLNISPGVFAVNPFLGETLAEDT
FNILDYAGLGNVDACASHLSRSQELREQVTEKTLREVPISDSYVLKVVSNLQATTVQNVVSFNKACAVMSFNILRHTTDE
MYDWTKNEYVSLGLKEKAAKVNPNIINRLAGQINLSPQSPYYYLVTPGYEFLYDAYPAETIAMTLVKMAYRKTMNLPDSM
KDSDICSSLNAKINKRHNLAVNNIDDIIKQIGKKHIEDMYNTLTQNIAMSGKESRNVETAQSFLALIESFKTTT
>P05133 3.1.-.-~~~N~~~Nucleoprotein~~~
MATMEELQREINAHEGQLVIARQKVRDAEKQYEKDPDELNKRTLTDREGVAVSIQAKIDELKRQLADRIATGKNLGKEQD
PTGVEPGDHLKERSMLSYGNVLDLNHLDIDEPTGQTADWLSIIVYLTSFVVPILLKALYMLTTRGRQTTKDNKGTRIRFK
DDSSFEDVNGIRKPKHLYVSLPNAQSSMKAEEITPGRYRTAVCGLYPAQIKARQMISPVMSVIGFLALAKDWSDRIEQWL
IEPCKLLPDTAAVSLLGGPATNRDYLRQRQVALGNMETKESKAIRQHAEAAGCSMIEDIESPSSIWVFAGAPDRCPPTCL
FIAGIAELGAFFSILQDMRNTIMASKTVGTSEEKLRKKSSFYQSYLRRTQSMGIQLGQRIIVLFMVAWGKEAVDNFHLGD
DMDPELRTLAQSLIDVKVKEISNQEPLKL
>P27318 3.1.-.-~~~N~~~Nucleoprotein~~~
MENKIVASTKEEFNTWYKQFAEKHKLNNKYTESASFCAEIPQLDTYKYKMELASTDNERDAIYSSALIEATRFCAPIMEC
AWASCTGTVKRGLEWFDKNKDSDTVKVWDANYQKLRTETPPAEALLAYQKAALNWRKDVGFSIGEYTSILKKAVAAEYKV
PGTVINNIKEMLSDMIRRRNRIINGGSDDAPKRGPVGREHLDWCREFASGKFLNAFNPPWGEINKAGKSGYPLLATGLAK
LVELEGKDVMDKAKASIAQLEGWVKENKDQVDQDKAEDLLKGVRESYKTALALAKLSNAFRAQGAQIDTVFSSYYWPWKA
GVTPVTFPSVSQFLFELGKNPKGQKKMQKALINTPLKWGKRLIELFADNDFTENRIYMHPCVLTSGRMSELGISFGAVPV
TSPDDAAQGSGHTKAVLNYKTKTEVGNPCACIISSLFEIQKAGYDIESMDIVASEHLLHQSLVGKRSPFQNAYLIKGNAT
NINII
>Q4KRW9 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTLNKDQLLSSSKYTIQRSTGDSIDTPNYDVQKHINKLCGMLLITEDANHKFTGLIGMLYAMSRLGREDTI
KILRDAGYHVKANGVDVTTHRQDINGKEMKFEVLTLASLTTEIQINIEIESRKSYKKMLKEMGEVAPEYRHDSPDCGMII
LCIAALVITKLAAGDRSGLTAVIRRANNVLKNEMKRYKGLLPKDIANSFYEVFEKHPHFIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPHF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTAEELEAIKHQLNPKDNDVEL
>P03418 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTLNKDQLLSSSKYTIQRSTGDSIDTPNYDVQKHINKLCGMLLITEDANHKFTGLIGMLYAMSRLGREDTI
KILRDAGYHVKANGVDVTTHRQDINGKEMKFEVLTLASLTTEIQINIEIESRKSYKKMLKEMGEVAPEYRHDSPDCGMII
LCIAALVITKLAAGDRSGLTAVIRRANNVLKNEMKRYKGLLPKDIANSFYEVFEKHPHFIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPHF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTAEELEAIKHQLNPKDNDVEL
>Q5UEW0 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETDGERQNATEIRASVGRMIGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRIDGKWMRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVVPRGKLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATSPIVPSFD
MSNEGSYFFGDNAEEYDN
>P15682 ~~~NP~~~Nucleoprotein~~~
MATKGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRVDGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMVD
QVRESRNPGNAEFEDLIFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTKVVPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQR
ASSGQISIQPTFSVQRNLPFDRPTIMAAFTGNTEGRTSDMRTEIIRLMESARPEDVSFQGRGVFELSDEKAASPIVPSFD
MSNEGSYFFGDNAEEYDN
>P03466 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRVNGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMD
QVRESRNPGNAEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVLSFIKGTKVLPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISIQPTFSVQRNLPFDRTTIMAAFNGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKAASPIVPSFD
MSNEGSYFFGDNAEEYDN
>P26079 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNTTEIRASVGRMIGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKRTGGPIYRRIDGKWIRELILYDKEEISRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQKAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGHDFEREGYSLVGIDPFRLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGKRVVPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERATVMAAFTGNTEGRTSDMRTEIIRIMESARPEDVSFQGRGVFELSDEKATSPIVPSFD
MSNEGSYFFGDNAEEYDN
>Q67356 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQDATEIRASVGRMIGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSVGKDPKKTGGPIYRRIDGKWMRELILYDKEEIRRVWRQANNGEDATAGLTHIMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTIVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGHDFEREGYSLVGIDPFKLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSGFIRGKKVVPRGKLSTRGVQIASNENVEAMDSSTLELRSRYWAIRTRSGGNTNQQK
ASAGQISVQPTFSVQRNLPFERATVMAAFVGNNEGRTSDMRTEIIRMMESAKPEDLSFQGRGVFELSDEKATNPIVPSFD
MNNEGSYFFGDNAEEYDN
>P03467 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETDGERQNATEIRASVGKMIDGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYKRVDGKWMRELVLYDKGEIRRIWRQANNGDDATAGLTHMMIWHSNLNDTTYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRKTRSAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEKEGYSLVGIDPFKLLQNSQVYSLIRPNE
NPAHKSQLVWMACNSAAFEDLRVLSFIRGTKVSPRGKLSTRGVQIASNENMDAMESSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPAFSVQRNLPFDKPTIMAAFTGNTEGRTSDMRAEIIRMMEGAKPEEMSFQGRGVFELSDEKAANPIVPSFD
MSNEGSYFFGDNAEEYDN
>P26074 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQDATEIRASVGRMIGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYKRIDGKWMRELILYDKEEIRRVWRQANNGEDATAGLTHIMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTIAMELIGMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGHDFEREGYSLVGIDPFKLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSGFIRGKKVVPRGRLSTRGVQIASNENVEAMDSSTLELRSRYWAIRTRSGGNTNQQK
ASADQISVQPTFSVQRNLPFERATVMAAFIGDNEGRTSDMRTEIIRMMESAKPEDLSFQGRGVFELSDEKATNPIVPSFD
MNNEGSYFFGDNAEEYDN
>O90385 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRRDGKWMRELILYDKEEIRRIWRQANNGEDATAGLTHLMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTKIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVVPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>P26091 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRRDGKWMRELILYDKEEIRRIWRQANNGEDATAGLTHLMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMVKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVLSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVIPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>Q6TXC0 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYVQMCTELKLNDHEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRKDGKWMRELILHDKEEIMRIWRQANNGEDATAGLTHMMIWHSNLNDTTYQRTRALVRAGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLTRSALILRGSVAHKSCLPACVYGLAVASGYDFEKEGYSLVGIDPFKLLQNSQIFSLIRPKE
NPAHKSQLVWMACHSAAFEDLRVLNFIRGTKVIPRGQLATRGVQIASNENMETIDSSTLELRSRYWAIRTRSGGNTSQQR
ASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMENARSEDVSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEFDS
>P68043 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQDATEIRASVGRMIGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRVDGKWMRELILYDKEEIRRVWRQANNGEDATAGLTHIMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTIAMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGHDFEREGYSLVGIDPFKLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGKKVVPRGKLSTRGVQIASNENVEAMDSSTLELRSRYWAIRTRSGGNTNQQK
ASAGQISVQPTFSVQRNLPFERATVMAAFSGNNEGRTSDMRTEVIRMMESAKPEDLSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>Q9Q0U8 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDYEGRLIQNSITIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGLAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPNE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVAPRGQLSTRGVQIASNENMETMDSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERATIMAAFTGNTEGRTSDMRTEIIRMMESSRPEDVSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>Q07FI1 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETDGERQNATEIRASVGRMIGGIGRFYIQMCTELKLNDYEGRLIQNSLTIERMVLSAFDERRNKYLE
EHPSAGKDPKKTGGPIYKRVDGKWVRELVLYDKEEIRRIWRQANNGDDATAGLTHIMIWHSNLNDTTYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAVKGVGTMVLELIRMIKRGINDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQRAMMD
QVRESRNPGNAEIEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEKEGYSLVGVDPFKLLQTSQVYSLIRPNE
NPAHKSQLVWMACNSAAFEDLRVSSFIRGTRVLPRGKLSTRGVQIASNENMDAIVSSTLELRSRYWAIRTRSGGNTNQQR
ASAGQISTQPTFSVQRNLPFDKTTIMAAFTGNAEGRTSDMRAEIIKMMESARPEEVSFQGRGVFELSDERATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>O92784 ~~~NP~~~Nucleoprotein~~~
MASQGTKRSYEQMETGGERQNATEIRASVGRMVGGIGRFYIQMCTELKLSDQEGRLIQNSITIERMVLSAFDERRNRYLE
EHPSAGKDPKKTGGPIYRRRDGKWVRELILYDKEEIRRIWRQANNGEDATAGLTHMMIWHSNLNDATYQRTRALVRTGMD
PRMCSLMQGSTLPRRSGAAGAAIKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQKAMMD
QVRESRNPGNAEIEDLIFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVFSLIRPKE
NPAHKSQLVWMACHSAAFEDLRVSSFIRGTRVIPRGQLSTRGVQIASNENVEAMDSTTLELRSRYWAIRTRSGGNTNQQR
ASAGQISVQPTFSVQRNLPFERVTIMAAFKGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKATNPIVPSFD
MSNEGSYFFGDNAEEYDN
>P69596 ~~~N~~~Nucleoprotein~~~
MASGKAAGKTDAPAPVIKLGGPKPPKVGSSGNASWFQAIKAKKLNTPPPKFEGSGVPDNENIKPSQQHGYWRRQARFKPG
KGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVAAKGADTKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAAASRAPSREGSRGRRSDSGDDLIARAAKIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPNYRVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPKLQLDGLHLRFEFTTVVPCDDPQFDNYVKIC
DQCVDGVGTRPKDDEPKPKSRSSSRPATRGNSPAPRQQRPKKEKKLKKQDDEADKALTSDEERNNAQLEFYDEPKVINWG
DAALGENEL
>P69597 ~~~N~~~Nucleoprotein~~~
MASGKAAGKTDAPAPVIKLGGPKPPKVGSSGNASWFQAIKAKKLNTPPPKFEGSGVPDNENIKPSQQHGYWRRQARFKPG
KGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVAAKGADTKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAAASRAPSREGSRGRRSDSGDDLIARAAKIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPNYRVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPKLQLDGLHLRFEFTTVVPCDDPQFDNYVKIC
DQCVDGVGTRPKDDEPKPKSRSSSRPATRGNSPAPRQQRPKKEKKLKKQDDEADKALTSDEERNNAQLEFYDEPKVINWG
DAALGENEL
>P69598 ~~~N~~~Nucleoprotein~~~
MASGKAAGKTDAPAPVIKLGGPKPPKVGSSGNASWFQAIKAKKLNTPPPKFEGSGVPDNENIKPSQQHGYWRRQARFKPG
KGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVAAKGADTKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAAASRAPSREGSRGRRSDSGDDLIARAAKIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPNYRVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPKLQLDGLHLRFEFTTVVPCDDPQFDNYVKIC
DQCVDGVGTRPKDDEPKPKSRSSSRPATRGNSPAPRQQRPKKEKKLKKQDDEADKALTSDEERNNAQLEFYDEPKVINWG
DAALGENEL
>P32923 ~~~N~~~Nucleoprotein~~~
MASGKATGKTDAPAPVIKLGGPRPPKVGSSGNASWFQAIKAKKLNSPQPKFEGSGVPDNENFKTSQQHGYWRRQARFKPG
KGRRKPVPDAWYFYYTGTGPAADLNWGDSQDGIVWVAAKGADVKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAASSRPPSREGSRGRRSGSEDDLIARAAKIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPGYKVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPKLQPDGLHLKFEFTTVVPRDDPQFDNYVKIC
DQCVDGVGTRPKDDEPKPKSRSSSRPATRTSSPAPRQQRLKKEKRPKKQDDEVDKALTSDEERNNAQLEFDDEPKVINWG
DSALGENEL
>Q98Y32 ~~~N~~~Nucleoprotein~~~
MASGKAAGKTDAPTPVIKLGGPKPPKVGSSGNVSWFQAIKAKKLNSPPPKFEGSGVPDNENLKPSQQHGYWRRQARFKPG
KGGRKPVPDAWYFYYTGTGPAANLNWGDSQDGIVWVAGKGADTKFRSNQGTRDSDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAASSRAPSREVSRGRRSGSEDDLIARAARIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPNYKVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPRLQPDGLHLKFEFTTVVPRDDPQFDNYVKIC
DQCVDGVGTRPKDDEPRPKSRSSSRPATRGNSPAPRQQRPKKEKKPKKQDDEVDKALTSDEERNNAQLEFDDEPKVINWG
DSALGENEL
>Q8JMI6 ~~~N~~~Nucleoprotein~~~
MASGKATGKTDAPAPVIKLGGPRPPKVGSSGNASWFQAIKAKKLNSPQPKFEGSGVPDNENLKTSQQHGYWRRQARFKPG
KGGRKPVPDAWYFYYTGTGPAADLNWGDSQDGIVWVAAKGADVKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAASSRPPSREGSRGRRSGSEDDLIARAAKIIQDQQKKGSRITKAKADEMAHRRYCKRTIPPGYKVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHACLFGSRVTPKLQPDGLHLKFEFTTVVPRDDPQFDNYVKIC
DQCVDGVGTRPKDDEPKPKSRSSSRPATRTSSPAPRQQRLKKEKRPKKQDDEVDKALTSDEERNNAQLEFDDEPKVINWG
DSALGENEL
>Q96605 ~~~N~~~Nucleoprotein~~~
MSAGKLKFDSPAPILKLSKNTGSTPPKVGGTGQASWFQSLKEKKRTGTPPTFEGSGVPDNSNVKPQFQHGYWKRQHRYKP
GKGGRKPVADAWYFYYTGTGPFGDLKWGDSNDDVVWVKAKGADTSKIGNYGVRDPDKFDQAPLRFTEGGPDNNYRWDFIA
LNRGRSRNSSAVTSRENSRPGSRDSSRGRQRSRVDDDLIDRAAKIIMQQQKNGSRISKQKANEMAERKYHKRAIAPGKRI
DEVFGQRRKGQAPNFGDDKMIEEGVKDGRLTAMLNLVPTPHACLLGSMVTAKLQPDGLHVRFSFETVVKREDPQFANYSK
ICDECVDGVGTRPKDDPTPRSRAASKDRNSAPATPKQQRAKKVHKKKEEESSLTEEEEEVNKQLEYDDDVTDIPNKIDWG
EGAFDDINI
>Q96598 ~~~N~~~Nucleoprotein~~~
MASGKAAGKSDAPTPIIKLGGPKPPKIGSSGNASWFQAIKAKKLNVPQPKFEGSGVPDNNNIKPSQQHGYWRRQARYKPG
KSGRKPVPDAWYFYYTGTGPAADLNWGENQDGIVWVAAKGADTKSRSNQGTRDPDKFDQYPLRFSDGGPDGNFRWDFIPL
NRGRSGRSTAASSAASSRAPSREGSRGRRSGAEDDLIARAAKIIQDQQKRGSRITKAKADEMAHRRYCKRTIPPGYRVDQ
VFGPRTKGKEGNFGDDKMNEEGIKDGRVTATLNLIPSSHACLFGSRVTPKLQPDGLHLKFEFTTVVPRDDPQFDNYVKIC
DQCVDGVGTRPKDDEPRPKSRSSSRPATRGNSPAPRQQRPKKEKKPKKQDDEVDKALTSDEERNNAQLEFDDEPKVINWG
DSALGENEL
>P13884 ~~~NP~~~Nucleoprotein~~~
MSNMDIDGINTGTIDKTPEEITSGTSGATRPIIKPATLAPPSNKRTRNPSPERAATSSEADVGRRTQKKQTPTEIKKSVY
NMVVKLGEFYNQMMVKAGLNDDMERNLIQNAHAAERILLAATDDKKTEFQKKKNARDVKEGKEEIDHNKTGGTFYKMVRD
DKTIYFSPIRITFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQRSKALKRVGLDPSLISTFAGSTLPRRSGAT
GVAIKGGGTLVAEAIRFIGRAMADRGLLRDIRAKTAYEKILLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARS
MVVVRPSVASKVVLPISINAKIPQLGFNVEEYSMVGYEAMALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDQRVL
SALTGTEFKHRSALKCKGFHVPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVGGDGGSGQISCSPVFAVERPIALSKQA
VRRMLSMNIEGRDADVKGNLLKMMNDSMTKKTNGNAFIGKKMFQISDKNKTNPIEIPIKQTIPNFFFGRDTAEDYDDLDY
>P04665 ~~~NP~~~Nucleoprotein~~~
MSNMDIDSINTGTIDKTPEELTPGTSGATRPIIKPATLAPPSNKRTRNPSPERTTTSSETDIGRKIQKKQTPTEIKKSVY
KMVVKLGEFYNQMMVKAGLNDDMERNLIQNAQAVERILLAATDDKKTEYQKKRNARDVKEGKEEIDHNKTGGTFYKMVRD
DKTIYFSPIKITFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQRSKGLKRVGLDPSLISTFAGSTLPRRSGTT
GVAIKGGGTLVDEAIRFIGRAMADRGLLRDIKAKTAYEKILLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARS
MVVVRPSVASKVVLPISIYAKIPQLGFNTEEYSMVGYEAMALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDLRVL
SALTGTEFKPRSALKCKGFHVPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVSGEGGSGQISCSPVFAVERPIALSKQA
VRRMLSMNVEGRDADVKGNLLKMMNDSMAKKTSGNAFIGKKMFQISDKNKVNPIEIPIKQTIPNFFFGRDTAEDYDDLDY
>O36433 ~~~NP~~~Nucleoprotein~~~
MSNMDIDGINTGTIDKTPEEITSGTSGTTRPIIRPATLAPPSNKRTRNPSPERATTSSEADVGRKTQKKQTPTEIKKSVY
NMVVKLGEFYNQMMVKAGLNDDMERNLIQNAHAVERILLAATDDKKTEFQRKKNARDVKEGKEEIDHNKTGGTFYKMVRD
DKTIYFSPIRITFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQRSKALKRVGLDPSLISTFAGSTLPRRSGAT
GVAIKGGGTLVAEAIRFIGRAMADRGLLRDIKAKTAYEKILLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARS
MVVVRPSVASKVVLPISIYAKIPQLGFNVEEYSMVGYEAMALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDLRVL
SALTGIEFKPRSALKCKGFHVPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVGGDGGSGQISCSPVFAVERPIALSKQA
VRRMLSMNIEGRDADVKGNLLKMMNDSMAKKTNGNAFIGKKMFQISDKNKTNPVEIPIKQTIPNFFFGRDTAEDYDDLDY
>P04666 ~~~NP~~~Nucleoprotein~~~
MSNMDIDGINTGTIDKTPEEIISGTSGATRPIIRPATLAPPSNKRTRNPSPERATTSSEADVGRKTQKKQTPTEIKKSVY
NMVVKLGEFYNQMMVKAGLNDDMERNLIQNAHAVERILLAATDDKKTEFQKKKNARDVKEGKEEIDHNKTGGTFYKMVRD
DKTIYFSPIRITFLKEEVKTMYKTTMGSDGFSGLNHIMIGHSQMNDVCFQRSKALKRVGLDPSLISTFAGSTLPRRSGAT
GVAIKGGGTLVAEAIRFIGRAMADRGLLRDIKAKTAYEKILLNLKNKCSAPQQKALVDQVIGSRNPGIADIEDLTLLARS
MVVVRPSVASKVVLPISIYAKIPQLGFNVEEYSMVGYEAMALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDLRVL
SALTGTEFKPRSALKCKGFHVPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVGGDGGSGQISCSPVFAVERPIALSKQA
VRRMLSMNIEGRDADVKGNLLKMMNDSMAKKTNGNAFIGKKMFQISDKNKTNPVEIPIKQTIPNFFFGRDTAEDYDDLDY
>Q5VKP6 ~~~N~~~Nucleoprotein~~~
MDSDRIVFKVHNQLVSLKPEVISDQYEYKYPAIDDKKKPSITLGKAPDLKTAYKSILSGMNAAKLDPDDVCSYLAAAMVF
FEGICPEDWTSYGINIAKKGDKITPAVLVDIQRTNTEGNWAQAGGQDLTRDPTTPEHASLVGLLLCLYRLSKIVGQNTGN
YKTNVAERMEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGVYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTTFIRQINLTARDAVLYFFHKNFEEEIKRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHTFNLIHFVGCYMG
QVRSLNATVIQSCAPHEMSVLGGYLGEEFFGRGTFERRFFRDEKELQDYEAAEATKIDLALEDDGTVNSDDEDFFSGETR
SPEAVYSRIMMSGGRLKKSHIKRYISVSSNHQARPNSFAEFLNKTYASDTR
>Q8V3T7 ~~~Segment-3~~~Nucleoprotein~~~
MADKGMTYSFDVRDNTLVVRRSTATKSGIKISYREDRGTSLLQKAFAGTEDEFWVELDQDVYVDKKIREFLVEEKMKDMS
TRVSGAVAAAIERSVEFDNFSKEAAANIEMAGVDDEEAGGSGLVDNRRKNKGVSNMAYNLSLFIGMVFPALTTFFSAILS
EGEMSIWQNGQAIMRILALADEDGKRQTRTGGQRVDMADVTKLNVVTANGKVKQVEVNLNDLKAAFRQSRPKRSDYRKGQ
GSKATESSISNQCMALIMKSVLSADQLFAPGVKMMRTNGFNASYTTLAEGANIPSKYLRHMRNCGGVALDLMGMKRIKNS
PEGAKSKIFSIIQKKVRGRCRTEEQRLLTSALKISDGENKFQRIMDTLCTSFLIDPPRTTKCFIPPISSLMMYIQEGNSV
LAMDFMKNGEDACKICREAKLKVGVNSTFTMSVARTCVAVSMVATAFCSADIIENAVPGSERYRSNIKANTTKPKKDSTY
TIQGLRLSNVRYEARPETSQSNTDRSWQVNVTDSFGGLAVFNQGAIREMLGDGTSETTSVNVRALVKRILKSASERSARA
VKTFMVGEQGKSAIVISGVGLFSIDFEGVEEAERITDMTPEIEFDEDDEEEEDIDI
>P14239 3.1.13.-~~~N~~~Nucleoprotein~~~
MAHSKEVPSFRWTQSLRRGLSQFTQTVKSDVLKDAKLIADSIDFNQVAQVQRALRKTKRGEEDLNKLRDLNKEVDRLMSM
RSVQRNTVFKAGDLGRVERMELASGLGNLKTKFRRAETGSQGVYMGNLSQSQLAKRSEILRTLGFQQQGTGGNGVVRVWD
VKDPSKLNNQFGSVPALTIACMTVQGGETMNSVIQALTSLGLLYTVKYPNLSDLDRLTQEHDCLQIVTKDESSINISGYN
FSLSAAVKAGASILDDGNMLETIRVTPDNFSSLIKSTIQVKRREGMFIDEKPGNRNPYENLLYKLCLSGDGWPYIGSRSQ
IIGRSWDNTSIDLTRKPVAGPRQPEKNGQNLRLANLTEIQEAVIREAVGKLDPTNTLWLDIEGPATDPVEMALFQPAGSK
YIHCFRKPHDEKGFKNGSRHSHGILMKDIEDAMPGVLSYVIGLLPPDMVVTTQGSDDIRKLFDLHGRRDLKLVDVRLTSE
QARQFDQQVWEKFGHLCKHHNGVVVSKKKRDKDAPFKLASSEPHCALLDCIMFQSVLDGKLYEEELTPLLPPSLLFLPKA
AYAL
>P13699 3.1.13.-~~~N~~~Nucleoprotein~~~
MSASKEIKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNVQRLMRKERRDDNDLKRLRDLNQAVNNLVEL
KSTQQKSILRVGTLTSDDLLILAADLEKLKSKVIRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGNQGARAGRDGV
VRVWDVKNAELLNNQFGTMPSLTLACLTKQGQVDLNDAVQALTDLGLIYTAKYPNTSDLDRLTQSHPILNMIDTKKSSLN
ISGYNFSLGAAVKAGACMLDGGNMLETIKVSPQTMDGILKSILKVKKALGMFISDTPGERNPYENILYKICLSGDGWPYI
ASRTSITGRAWENTVVDLESDGKPQKADSNNSSKSLQSAGFTAGLTYSQLMTLKDAMLQLDPNAKTWMDIEGRPEDPVEI
ALYQPSSGCYIHFFREPTDLKQFKQDAKYSHGIDVTDLFATQPGLTSAVIDALPRNMVITCQGSDDIRKLLESQGRKDIK
LIDIALSKTDSRKYENAVWDQYKDLCHMHTGVVVEKKKRGGKEEITPHCALMDCIMFDAAVSGGLNTSVLRAVLPRDMVF
RTSTPRVVL
>Q8BDE6 3.1.13.-~~~N~~~Nucleoprotein~~~
MSGASEVPSFRWTQSLRRGLSHFTTSAKGDVLRDAKSLVDGLDFNQVSQVQRVMRKDKRSDDDLSKLRDLNRSVDSLMVM
KNKQNNVSLKIGSLSKDELMDLATDLEKLKRKINLGDRQGPGVYQGNLTSAQLEKRSEILKSLGFQPRANQNGVVKVWDI
KNPKLLINQFGSIPALTIACMSVQGAEQMNDVVQGLTSLGLLYTVKYPNLDDLDKLSKDHPCLEFITKEESANNISGYNL
SLSAAVKAGACLVDGGNMLETILVKPDNFQDIVKSLLVVKRQEKMFVNEKPGLRNPYENILYKLCLSGEGWPYIGSRSQI
VGRAWENTTVDLSKEVVYGPSAPVKNGGNMRLSPLSDTQEAVIKEAIGKLDMDETIWIDIEGPPNDPVELAIYQPSTGNY
IHCFRVPHDEKGFKNGSKYSHGILLRDIENARSGLLSRILMRLPQKVVFTCQGSDDIQKLLQMNGRPDIATIDMSFSSEQ
ARFFEGVVWEKFGHLCTRHNGVVLSRKKKGGNSGEPHCALLDCIIFQAAFEGQVTGQIPKPLLPNSLIFKDEPRVAM
>P0C779 ~~~N~~~Nucleoprotein~~~
MSQNKKKGGQNKGANQQLNQLISALLRNAGQNKGKGQKKKKQPKLHFPMAGPSDLRHVMTPNEVQMCRSSLVTLFNQGGG
QCTLVDSGGINFTVSFMLPTHATVRLINASANSSA
>P09992 3.1.13.-~~~N~~~Nucleoprotein~~~
MSLSKEVKSFQWTQALRRELQSFTSDVKAAVIKDATNLLNGLDFSEVSNVQRIMRKEKRDDKDLQRLRSLNQTVHSLVDL
KSTSKKNVLKVGRLSAEELMSLAADLEKLKAKIMRSERPQASGVYMGNLTTQQLDQRSQILQIVGMRKPQQGASGVVRVW
DVKDSSLLNNQFGTMPSLTMACMAKQSQTPLNDVVQALTDLGLLYTVKYPNLNDLERLKDKHPVLGVITEQQSSINISGY
NFSLGAAVKAGAALLDGGNMLESILIKPSNSEDLLKAVLGAKRKLNMFVSDQVGDRNPYENILYKVCLSGEGWPYIACRT
SIVGRAWENTTIDLTSEKPAVNSPRPAPGAAGPPQVGLSYSQTMLLKDLMGGIDPNAPTWIDIEGRFNDPVEIAIFQPQN
GQFIHFYREPVDQKQFKQDSKYSHGMDLADLFNAQPGLTSSVIGALPQGMVLSCQGSDDIRKLLDSQNRKDIKLIDVEMT
REASREYEDKVWDKYGWLCKMHTGIVRDKKKKEITPHCALMDCIIFESASKARLPDLKTVHNILPHDLIFRGPNVVTL
>Q1PD53 ~~~NP~~~Nucleoprotein~~~
MDLHSLLELGTKPTAPHVRNKKVILFDTNHQVSICNQIIDAINSGIDLGDLLEGGLLTLCVEHYYNSDKDKFNTSPIAKY
LRDAGYEFDVIKNADATRFLDVIPNEPHYSPLILALKTLESTESQRGRIGLFLSFCSLFLPKLVVGDRASIEKALRQVTV
HQEQGIVTYPNHWLTTGHMKVIFGILRSSFILKFVLIHQGVNLVTGHDAYDSIISNSVGQTRFSGLLIVKTVLEFILQKT
DSGVTLHPLVRTSKVKNEVASFKQALSNLARHGEYAPFARVLNLSGINNLEHGLYPQLSAIALGVATAHGSTLAGVNVGE
QYQQLREAAHDAEVKLQRRHEHQEIQAIAEDDEERKILEQFHLQKTEITHSQTLAVLSQKREKLARLAAEIENNIVEDQG
FKQSQNRVSQSFLNDPTPVEVTVQARPINRPTALPPPVDSKIEHESTEDSSSSSSFVDLNDPFALLNEDEDTLDDSVMIP
STTSREFQGIPEPPRQSQDIDNSQGKQEDESTNLIKKPFLRYQELPPVQEDDESEYTTDSQESIDQPGSDNEQGVDLPPP
PLYAQEKRQDPIQHPAVSSQDPFGSIGDVNGDILEPIRSPSSPSAPQEDTRAREAYELSPDFTNYEDNQQNWPQRVVTKK
GRTFLYPNDLLQTNPPESLITALVEEYQNPVSAKELQADWPDMSFDERRHVAMNL
>P27588 ~~~NP~~~Nucleoprotein~~~
MDLHSLLELGTKPTAPHVRNKKVILFDTNHQVSICNQIIDAINSGIDLGDLLEGGLLTLCVEHYYNSDKDKFNTSPVAKY
LRDAGYEFDVIKNADATRFLDVSPNEPHYSPLILALKTLESTESQRGRIGLFLSFCSLFLPKLVVGDRASIEKALRQVTV
HQEQGIVTYPNHWLTTGHMKVIFGILRSSFILKFVLIHQGVNLVTGHDAYDSIISNSVGQTRFSGLLIVKTVLEFILQKT
DSGVTLHPLVRTSKVKNEVASFKQALSNLARHGEYAPFARVLNLSGINNLEHGLYPQLSAIALGVATAHGSTLAGVNVGE
QYQQLREAAHDAEVKLQRRHEHQEIQAIAEDDEERKILEQFHLQKTEITHSQTLAVLSQKREKLARLAAEIENNIVEDQG
FKQSQNRVSQSFLNDPTPVEVTVQARPMNRPTALPPPVDDKIEHESTEDSSSSSSFVDLNDPFALLNEDEDTLDDSVMIP
GTTSREFQGIPEPPRQSQDLNNSQGKQEDESTNRIKKQFLRYQELPPVQEDDESEYTTDSQESIDQPGSDNEQGVDLPPP
PLYAQEKRQDPIQHPAANPQDPFGSIGDVNGDILEPIRSPSSPSAPQEDTRMREAYELSPDFTNDEDNQQNWPQRVVTKK
GRTFLYPNDLLQTNPPESLITALVEEYQNPVSAKELQADWPDMSFDERRHVAMNL
>Q6UY69 ~~~NP~~~Nucleoprotein~~~
MDLHSLLELGTKPTAPHVRNKKVILFDTNHQVSICNQIIDAINSGIDLGDLLEGGLLTLCVEHYYNSDKDKFNTSPIAKY
LRDAGYEFDVIKNADATRFLDVIPNEPHYSPLILALKTLESTESQRGRIGLFLSFCSLFLPKLVVGDRASIEKALRQVTV
HQEQGIVTYPNHWLTTGHMKVIFGILRSSFILKFVLIHQGVNLVTGHDAYDSIISNSVGQTRFSGLLIVKTVLEFILQKT
DSGVTLHPLVRTSKVKNEVASFKQALSNLARHGEYAPFARVLNLSGINNLEHGLYPQLSAIALGVATAHGSTLAGVNVGE
QYQQLREAAHDAEVKLQRRHEHQEIQAIAEDDEERKILEQFHLQKTEITHSQTLAVLSQKREKLARLAAEIENNIVEDQG
FKQSQNRVSQSFLNDPTPVEVTVQARPVNRPTALPPPVDDKIEHESTEDSSSSSSFVDLNDPFALLNEDEDTLDDSVMIP
STTSREFQGIPESPGQSQDLDNSQGKQEDESTNPIKKQFLRYQELPPVQEDDESEYTTDSQESIDQPGSDNEQGVDLPPP
PLYTQEKRQDPIQHPAASSQDPFGSIGDVNGDILEPIRSPSSPSAPQEDTRAREAYELSPDFTNYEDNQQNWPQRVVTKK
GRTFLYPNDLLQTSPPESLVTALVEEYQNPVSAKELQADWPDMSFDERRHVAMNL
>Q9WMB5 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPSSSDQSRSGWFENKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDRISRAVGPRQAQVSFLHGDQSENELPGLGGKEDRRVKQGRGEARESYRETGSSRASDARAAHPPTSMPLDIDTASE
SGQDPQDSRRSADALLRLQAMAGILEEQGSDTDTPRVYNDRDLLD
>P04851 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPISSDQSRFGWFENKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNIIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGKPAPYMVNLENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDKISRAVGPRQAQVSFLQGDQSENELPRLGGKEDRRVKQSRGEARESYRETGPSRASDARAAHLPTGTPLDIDTASE
SSQDPQDSRRSAEPLLSCKPWQESRKNKAQTRTPLQCTMTEIF
>Q89933 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPISSDQSRFGWFENKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDKISRAVGPRQAQVSFLHGDQSENELPRLGGKEDRRVKQSRGEARESYRETGPSRASDARAAHLPTGTPLDIDTASE
SSQDPQDSRRSADALLRLQAMAGISEEQGSDTDTPIVYNDRNLLD
>P10050 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPISSDQSRFGWFENKEISDIEVQDP
EGFNMILGTILAQIWVLVAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICNIDTYIVEAGLASFILTIKFGIETMYPALGLHEFDGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDKISRAVGPRQAQVSFLHGDQSENELPRLGGKEDRRVKQSRGEARESYRETGPSRASDARAAHLPTGTPLDIDTASE
SSQDPQDSRRSADALLRLQAMAGISEEQGSDTDTPIVYNDRNLLD
>Q77M43 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPISSDQSRFGWFGNKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDKISRAVGPRQAQVSFLHGDQSENELPRLGGKEDRRVKQSRGEARESYRETGPSRASDARAAHLPTGTPLDIDTATE
SSQDPQDSRRSADALLRLQAMAGISEEQGSDTDTPIVYNDRNLLD
>B8PZP3 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPISSDQSRFGWFGNKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDKISRAVGPRQAQVSFLHGDQSENELPRLGGKEDRRVKQSRGEARESYRETGPSRASDARAAHLPTGTPLDIDTATE
SSQDPQDSRRSADALLRLQAMAGISEEQGSDTDTPIVYNDRNLLD
>P26030 ~~~N~~~Nucleoprotein~~~
MATLLRSLALFKRNKDKPPITSGSGGAIRGIKHIIIVPIPGDSSITTRSRLLDRLVRLIGNPDVSGPKLTGALIGILSLF
VESPGQLIQRITDDPDVSIRLLEVVQSDQSQSGLTFASRGTNMEDEADQYFSHDDPSSSDQPRFGWFENKEISDIEVQDP
EGFNMILGTILAQIWVLLAKAVTAPDTAADSELRRWIKYTQQRRVVGEFRLERKWLDVVRNRIAEDLSLRRFMVALIQDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTLESLMNLYQQMGETAPYMVILENSI
QNKFSAGSYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSTLASELGITAEDARLVSEIAMH
TTEDRISRAVGPRQAQVSFLHGDQSENELPRWEGKEDMRVKQSRGEARESYRETGPSRASDARAAHLPTDTPLDIDTASE
PSQDPQDSRRSAEALLRLQAMAGISEEQGSDTDTPRVYNDRDLLE
>K9N4V7 ~~~N~~~Nucleoprotein~~~
MASPAAPRAVSFADNNDITNTNLSRGRGRNPKPRAAPNNTVSWYTGLTQHGKVPLTFPPGQGVPLNANSTPAQNAGYWRR
QDRKINTGNGIKQLAPRWYFYYTGTGPEAALPFRAVKDGIVWVHEDGATDAPSTFGTRNPNNDSAIVTQFAPGTKLPKNF
HIEGTGGNSQSSSRASSVSRNSSRSSSQGSRSGNSTRGTSPGPSGIGAVGGDLLYLDLLNRLQALESGKVKQSQPKVITK
KDAAAAKNKMRHKRTSTKSFNMVQAFGLRGPGDLQGNFGDLQLNKLGTEDPRWPQIAELAPTASAFMGMSQFKLTHQNND
DHGNPVYFLRYSGAIKLDPKNPNYNKWLELLEQNIDAYKTFPKKEKKQKAPKEESTDQMSEPPKEHRVQGTQRTRTRPSV
QPGPMIDVNTD
>P19239 3.1.13.-~~~N~~~Nucleoprotein~~~
MSNSKEVKSFLWTQSLRRELSGYCSNIKIQVIKDAQALLHGLDFSEVANVQRLMRKEKRDDSDLKRLRDLNQAVNNLVEL
KSVQQKNVLRVGTLTSDDLLVLAADLDRLKAKVIRGERPLAAGVYMGNLTAQQLEQRRVLLQMVGMGGGFRAGNTLGDGI
VRVWDVRNPELLNNQFGTMPSLTIACMCKQGQADLNDVIQSLSDLGLVYTAKYPNMSDLDKLSQTHPILGIIEPKKSAIN
ISGYNFSLSAAVKAGACLIDGGNMLETIKVTKSNLEGILKAALKVKRSLGMFVSDTPGERNPYENLLYKLCLSGEGWPYI
ASRTSIVGRAWDNTTVDLSGDVQQNAKPDKGNSNRLAQAQGMPAGLTYSQTMELKDSMLQLDPNAKTWIDIEGRPEDPVE
IAIYQPNNGQYIHFYREPTDIKQFKQDSKHSHGIDIQDLFSVQPGLTSAVIESLPKNMVLSCQGADDIRKLLDSQNRRDI
KLIDVSMQKDDARKFEDKIWDEYKHLCRMHTGIVTQKKKRGGKEEVTPHCALLDCLMFEAAVIGSPQIPTPRPVLSRDLV
FRTGPPRVVL
>P26589 ~~~N~~~Nucleoprotein~~~
MSLDRLKLNDVSNKDSLLSNCKYSVTRSTGDVTSVSGHAMQKALARTLGMFLLTAFNRCEEVAEIGLQYAMSLLGRDDSI
KILREAGYNVKCVDTQLKDFTIKLQGKEYKIQVLDIVGIDAANLADLEIQARGVVAKELKTGARLPDNRRHDAPDCGVIV
LCIAALVVSKLAAGDRGGLDAVERRALNVLKAEKARYPNMEVKQIAESFYDLFERKPYYIDVFITFGLAQSSVRGGSKVE
GLFSGLFMNAYGAGQVMLRWGLLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKQGGEAGFYHIRNNPKASLLSLTNCPNF
TSVVLGNAAGLGIIGSYKGAPRNRELFDAAKDYAERLKDNNVINYSALNLTAEERELISQQLNIVDDTPDDDI
>P21277 ~~~N~~~Nucleoprotein~~~
MSSVLKAFERFTIEQELQDRGEEGSIPPETLKSAVKVFVINTPNPTTRYQMLNFCLRIICSQNRRASHRVGALIALFSLP
SAGMQNHIRLADRSPEAQIERCEIDGFEPGTYRLIPNARANLTANEIAAYALLADDLPPTINNGTPYVHADVELQPCDEI
EQFLDRCYSVLIQAWVMVCKCMTAYDQPAGSADRRFAKYQQQGRLEARYMLQPEAQRLIQTAIRKSLVVRQYLTFELQLA
RRQGLLSNRYYAMVGDIGKYIENSGLTAFFLTLKYALGTKWSPLSLAAFTGELTKLRSLMMLYRDIGEQARYLALLEAPQ
IMDFAPGGYPLIFSYAMGVGSVLDVQMRNYTYARPFLNGYYFQIGVETARRQQGTVDNRVADDLGLTPEQRNEVTQLVDR
LARGRGAGIPGGPVNPFVPPVQQQQPAAVYADIPALEESDDDGDEDGGAGFQNGVQVPAVRQGGQTDFRAQPLQDPIQAQ
LFMPLYPQVSNIPNNRIIRSIASGGWKTKIYYDTTRMVILNKMQGANTETLSQTIPIKTHSCKWATGMSKSLT
>Q77IS8 ~~~N~~~Nucleoprotein~~~
MSSVLKAFERFTIEQELQDRGEEGSIPPETLKSAVKVFVINTPNPTTRYQMLNFCLRIICSQNARASHRVGALITLFSLP
SAGMQNHIRLADRSPEAQIERCEIDGFEPGTYRLIPNARANLTANEIAAYALLADDLPPTINNGTPYVHADVEGQPCDEI
EQFLDRCYSVLIQAWVMVCKCMTAYDQPAGSADRRFAKYQQQGRLEARYMLQPEAQRLIQTAIRKSLVVRQYLTFELQLA
RRQGLLSNRYYAMVGDIGKYIENSGLTAFFLTLKYALGTKWSPLSLAAFTGELTKLRSLMMLYRGLGEQARYLALLEAPQ
IMDFAPGGYPLIFSYAMGVGTVLDVQMRNYTYARPFLNGYYFQIGVETARRQQGTVDNRVADDLGLTPEQRTEVTQLVDR
LARGRGAGIPGGPVNPFVPPVQQQQPAAVYEDIPALEESDDDGDEDGGAGFQNGVQLPAVRQGGQTDFRAQPLQDPIQAQ
LFMPLYPQVSNMPNNQNHQINRIGGLEHQDLLRYNENGDSQQDARGEHVNTFPNNPNQNAQLQVGDWDE
>Q99FY3 ~~~N~~~Nucleoprotein~~~
MSSVFDEYEQLLAAQTRPNGAHGGGERGSTLRVEVPVFTLNSDDPEDRWNFAVFCLRIAVSEDANKPLRQGALISLLCSH
SQVMRNHVALAGKQNEATLTVLEIDGFTSSVPQFNNRSGVSEERAQRFMVIAGSLPRACSNGTPFVTAGVEDDAPEDITD
TLERILSIQAQVWVTVAKAMTAYETADESETRRINKYMQQGRVQKKYILHPVCRSAIQLTIRHSLAVRIFLVSELKRGRN
TAGGSSTYYNLVGDVDSYIRNTGLTAFFLTLKYGINTKTSALALSSLTGDIQKMKQLMRLYRMKGENAPYMTLLGDSDQM
SFAPAEYAQLYSFAMGMASVLDKGTGKYQFARDFMSTSFWRLGVEYAQAQGSSINEDMAAELKLTPAARRGLAAAAQRVS
EETGSVDIPTQQAGVLTGLSDGGPRASQGGSNKSQGQPDAGDGETQFLDLMRAVANSMREAPNSAQSTTHPEPPPTPGPS
QDNDTDWGY
>P09459 ~~~N~~~Nucleoprotein~~~
MSSVFDEYEQLLAAQTRPNGAHGGGEKGSTLKVDVPVFTLNSDDPEDRWNFAVFCLRIAVSEDANKPLRQGALISLLCSH
SQVMRNHVALAGKQNEATLAVLEIDGFANGMPQFNNRSGVSEERAQRFAMIAGSLPRACSNGTPFVTAGAEDDAPEDITD
TLERILSIQAQVWVTVAKAMTAYETADESETRRINKYMQQGRVQKKYILYPVCRSTIQLTIRQSLAVRIFLVSELKRGRN
TAGGTSTYYNLVGDVDSYIRNTGLTAFFLTLKYGINTKTSALALSSLSGDIQKMKQLMRLYRMKGDNAPYMTLLGDSDQM
SFAPAEYAQLYSFAMGMASVLDKGTGKYQFARDFMSTSFWRLGVEYAQAQGSSINEDMAAELKLTPAARRGLAAAAQRVS
EETSSIDMPTQQVGVLTGLSEGGSQALQGGSNRSQGQPEAGDGETQFLDLMRAVANSMREAPNSAQGTPQSGPPPTPGPS
QDNDTDWGY
>Q9IK92 ~~~N~~~Nucleoprotein~~~
MSDIFEEAASFRSYQSKLGRDGRASAATATLTTKIRIFVPATNSPELRWELTLFALDVIRSPSAAESMKVGAAFTLISMY
SERPGALIRSLLNDPDIEAVIIDVGSMVNGIPVMERRGDKAQEEMEGLMRILKTARDSSKGKTPFVDSRAYGLRITDMST
LVSAVITIEAQIWILIAKAVTAPDTAEESETRRWAKYVQQKRVNPFFALTQQWLTEMRNLLSQSLSVRKFMVEILIEVKK
GGSAKGRAVEIISDIGNYVEETGMAGFFATIRFGLETRYPALALNEFQSDLNTIKSLMLLYREIGPRAPYMVLLEESIQT
KFAPGGYPLLWSFAMGVATTIDRSMGALNINRGYLEPMYFRLGQKSARHHAGGIDQNMANRLGLSSDQVAELAAAVQETS
AGRQESNVQAREAKFAAGGVLIGGSDQDIDEGEEPIEQSGRQSVTFKREMSISSLANSVPSSSVSTSGGTRLTNSLLNLR
SRLAAKAAKEAASSNATDDPAISNRTQGESEKKNNQDLKPAQNDLDFVRADV
>Q83957 ~~~N~~~Nucleoprotein~~~
MALSKVKLNDTFNKDQLLSTSKYTIQRSTGDNIDIPNYDVQKHLNKLCGMLLITEDANHKFTGLIGMLYAMSRLGREDTL
KILKDAGYQVKANGVDVITHRQDVNGKEMKFEVLTLVSLTSEVQVNIEVESRKSYKKMLKEMGEVAPEYRHDSPDCGMIV
LCIAALVIAKLAAGDRSGLTAVIRRANNVLKNEIERYKGLIPKDVANSFYEVFEKYPHYIDVFVHFGIAQSSTRGGSRVE
GIFAGLFMNAYGAGQVMLRWGVLAKSVKNIMLGHASVQAEMEQVVEVYEYAQKLGGEAGFYHILNNPKASLLSLTQFPNF
SSVVLGNAAGLGIMGEYRGTPRNQDLYDAAKAYAEQLKENGVINYSVLDLTTEELEAIKNQLNPKDNDVEL
>Q07499 ~~~N~~~Nucleoprotein~~~
MASVSFQDRGRKRVPLSLYAPLRVTNDKPLSKVLANNAVPTNKGNKDQQIGYWNEQIRWRMRRGERIEQPSNWHFYYLGT
GPHGDLRYRTRTEGVFWVAKEGAKTEPTNLGVRKASEKPIIPKFSQQLPSVVEIVEPNTPPASRANSRSRSRGNGNNRSR
SPSNNRGNNQSRGNSQNRGNNQGRGASQNRGGNNNNNNKSRNQSNNRNQSNDRGGVTSRDDLVAAVKDALKSLGIGENPD
RHKQQQKPKQEKSDNSGKNTPKKNKSRATSKERDLKDIPEWRRIPKGENSVAACFGPRGGFKNFGDAEFVEKGVDASGYA
QIASLAPNVAALLFGGNVAVRELADSYEITYNYKMTVPKSDPNVELLVSQVDAFKTGNAKLQRKKEKKNKRETTLQQHEE
AIYDDVGAPSDVTHANLEWDTAVDGGDTAVEIINEIFDTGN
>P24304 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGGGAIIPGQRSTVSVFILGPSVTDDADKLLIATTFLAHSLDTDKQHSQRGGFLVSLLA
MAYSSPELYLTTNGVNADVKYVIYNIERDPKRTKTDGFIVKTRDMEYERTTEWLFGPMINKNPLFQGQRENADLEALLQT
YGYPACLGAIIVQVWIVLVKAITSSSGLRKGFFNRLEAFRQDGTVKSALVFTGDTVEGIGAVMRSQQSLVSLMVETLVTM
NTSRSDLTTLEKNIQIVGNYIRESGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLVDIYLSKGARAPFTCILRDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGRTYLDMEMFLLGQAVAKDADSKISSALEEELGVTDTAKERLRHHLTN
LSGGDGAYHKPTGGGAIEVAIDHTDITFGAEDTADRDNKNWTNNSNERWMNHSINNHTITISGAEELEEETNDEDITDIE
NKIARRLADRKQRLSQANNRQDASSDADHENDDDATAAAGIGGI
>P26590 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGGGAIIPGQRSTVSVFILGPSVTDDADKLLIATTFLAHSLDTDKQHSQRGGFLVSLLA
MAIRSPELYLTTNGVNADVKYVIYNIERDPKRTKTDGFIVKTRDMEYERTTEWLFGPMINKNPLFQGQRENADLEALLQT
YGYPACLGAIIVQVWIVLVKAITSSAGLRKGFFNRLEAFRQDGTVKSALVFTGDTVEGIGAVMRSQQSLVSLMVETLVTM
NTSRSDLTTLEKNIQIVGNYIRDAGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLVDIYLSKGARAPFICILRDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGRTYLDMEMFLLGQAVAKDADSKISSALEEELNVTDTAKERLRHHLTN
LSGGDGAYHKPTGGGAIEVAIDHTDITFGAEDTADRDNKNWTNNSNERWSNHSINNHTITISGAEQLEEETNDEDITDIE
NKIARRLADKKQRLSQANNKQDANSDADYENDDDATAAAGIGGI
>P21737 ~~~N~~~Nucleoprotein~~~
MSSVLKTFERFTIQQELQEQSDDTPVPLETIKPTIRVFVINNNDPVVRSRLLFFNLRIIMSNTAREGHRAGALLSLLSLP
SAAMSNHIKLAMHSPEASIDRVEITGFENNSFRVIPDARSTMSRGEVLAFEALAEDIPDTLNHQTPFVNNDVEDDIFDET
EKFLDVCYSVLMQAWIVTCKCMTAPDQPPVSVAKMAKYQQQGRINARYVLQPEAQRLIQNAIRKSMVVRHFMTYELQLSQ
SRSLLANRYYAMVGDIGKYIEHSGMGGFFLTLKYGLGTRWPTLALAAFSGELQKLKALMLHYQSLGPMAKYMALLESPKL
MDFVPSEYPLDYSYAMGIGTVLDTNMRNYAYGRSYLNQQYFQLGVETARKQQGAVDNRTAEDLGMTAADKADLTATISKL
SLSQLPRGRQPISDPFAGANDREMGGQANDTPVYNFNPIDTRRYDNYDSDGEDRIDNDQDQAIRENRGEPGQPNNQTSDN
QQRFNPPIPQRTSGMSSEEFQHSMNQYIRAMHEQYRGSQDDDANDATDGNDISLELVGDFDS
>P06159 ~~~N~~~Nucleoprotein~~~
MLSLFDTFNARRQENITKSAGGAIIPGQKNTVSIFALGPTITDDNEKMTLALLFLSHSLDNEKQHAQRAGFLVSLLSMAY
ANPELYLTTNGSNADVKYVIYMIEKDLKRQKYGGFVVKTREMIYEKTTEWIFGSDLDYDQETMLQNGRNNSTIEDLVHTF
GYPSCLGALIIQIWIVLVKAITSISGLRKGFFTRLEAFRQDGTVQAGLVLSGDTVDQIGSIMRSQQSLVTLMVETLITMN
TSRNDLTTIEKNIQIVGNYIRDAGLASFFNTIRYGIETRMAALTLSTLRPDINRLKALMELYLSKGPRAPFICILRDPIH
GEFAPGNYPAIWSYAMGVAVVQNRAMQQYVTGRSYLDIDMFQLGQAVARDAEAQMSSTLEDELGVTHEAKESLKRHIRNI
NSSETSFHKPTGGSAIEMAIDEEPEQFEHRADQEQDGEPQSSIIQYAWAEGNRSDDRTEQATESDNIKTEQQNIRDRLNK
RLNDKKKQGSQPSTNPTNRTNQDEIDDLFNAFGSN
>P17240 ~~~N~~~Nucleoprotein~~~
MSSVLAAYEQFLQTTEDRGFGDQQFVQSDTLKAEIPVFVLNTNDPQQRFTLMNFCLRQAVSSSAKSAIKQGALLSLLSLQ
ATSMQNHLMIAARAPDAALRIIEVDAIDPPDYTLTINPRSGWDDIKIRAYRALSRDLPISLADRTVFVSRDAEHAVCDDM
DTYLNRIFSVLIQVWIMVCKCMTAYDQPTGSEERRLAKYKQQGRMLERYQLQTDARKIIQLVIRESMVIRQFLVQEMLTA
DKVGAYTNRYYAMVGDIAKYIANVGMSAFFLTLKFGLGNRWKPLALAAFSGELVKLKSLMSLYRRLGDRSRYLALLESPE
LMEFAPANYPLLFSYAMGVGSVQDPLIRNYQFGRNFLNTSYFQYGVETAMKHQGTVDPKFASELGITDEDRVDIMQSVEK
HISGKAGDDISQPRSAFTMSLNRSAFITNNNPQDLSGARLSNYEQGWSGIDQDETRDTLPESTMHRFQNIDSTNSDHNEL
QMPEFENDINPFNHPRFTARAPLIPEISHQTPTIRMNRNVNIRDSTRDDRQDANEDRSSNIPDDILGDLDN
>P17241 ~~~N~~~Nucleoprotein~~~
MSSVLAAYEQFLQTTEDRSFGDQQFVQSDTLKAEIPVFVLNTNDPQQRFTLMNFCLRLAVSSSAKSAIKQGALLSLLSLQ
ATSMQNHLMIAARAPDAALRIIEVDAIDPQDYTLTINPRSGWDDIKIRAYRALSRDLPISLADRTVFVSRDAEHAVCDDM
DTYLNRIFSVLIQIWIMVCKCMTAYDQPTGSEERRLAKYKQQGRMLEKYQLQTDARKIIQLVIRESMVIRQFLVQEMLTA
DKVGAYTNRYYAMVGDIAKYIANVGMSAFFLTLKFGLGNRWKPLALAAFSGELVKLKSFMSLYRRLGDRSRYLALLESPE
LMEFAPANYPLLFSYAMGVGSVQDPLIRNYQFGRNFLNTSYFQYGVETAMKHQGTVDPKLALELGITDEDRVDIMQSVEK
HISGKAGDDISQPAGAFTMSLSRSAFINNNTSQDFSGARLSNYEQGWSGTNQDETRDVYPESTMHRLQNIEPTDSDHNEL
LMPELESDSNPFNRPRFTVRAPLIPEISHQNPTTRMNRNINTRDNTRADHQDTNEDRGSNVPDDILGDLDN
>P03541 3.1.13.-~~~N~~~Nucleoprotein~~~
MSDNIPSFRWVQSLRRGLSNWTHPVKADVLSDTRALLSALDFHKVAQVQRMVRKDKRTDSDLTKLRDMNKEVDALMNMRS
VQRDNVLKVGGLAKEELMELASDLDKLRKKVTRTEGLSQPGVYEGNLTNTQLEQRAEILRSMGFANARPAGNRDGVVKVW
DIKDNTLLINQFGSMPALTIACMTEQGGEQLNDVVQALSALGLLYTVKFPNMTDLEKLTQQHSALKIISHEPSALNISGY
NLSLSAAVKAAACMIDGGNMLETIQVKPSMFSTLIKSLLQIKNREGMFVSTTPGQRNPYENLLYKICLSGDGWPYIGSRS
QVQGRAWDNTTVDLDSKPSAIQPPVRNGGSPDLKQIPKEKEDTVVSSIQMLDPRATTWIDIEGTPNDPVEMAIYQPDTGN
YIHCYRFPHDEKSFKEQSKYSHGLLLKDLADAQPGLISSIIRHLPQNMVFTAQGSDDIIRLFEMHGRRDLKVLDVKLSAE
QARTFEDEIWERYNQLCTKHKGLVIKKKKKGAVQTTANPHCALLDTIMFDATVTGWVRDQKPMRCLPIDTLYRNNTDLIN
L
>Q88435 ~~~NP~~~Nucleoprotein~~~
MSSVLKAYERFTLTQELQDQSEEGTIPPTTLKPVIRVFILTSNNPELRSRLLLFCLRIVLSNGARDSHRFGALLTMFSLP
SATMLNHVKLADQSPEADIERVEIDGFEEGSFRLIPNARSGMSRGEINAYAALAEDLPDTLNHATPFVDSEVEGTAWDEI
ETFLDMCYSVLMQAWIVTCKCMTAPDQPAASIEKRLQKYRQQGRINPRYLLQPEARRIIQNVIRKGMVVRHFLTFELQLA
RAQSLVSNRYYAMVGDVGKYIENCGMGGFFLTLKYALGTRWPTLALAAFSGELTKLKSLMALYQTLGEQARYLALLESPH
LMDFAAANYPLLYSYAMGIGYVLDVNMRNYAFSRSYMNKTYFQLGMETARKQQGAVDMRMAEDLGLTQAERTEMANTLAK
LTTANRGADTRGGVNPFSSVTGTTQVPAAATGDTLESYMAADRLRQRYADAGTHDDEMPPLEEEEEDDTSAGPRTGPTLE
QVALDIQNAAVGAPIHTDDLNAALGDLDI
>Q08823 ~~~N~~~Nucleoprotein~~~
MATLLKSLALFKRNKDKAPTASGSGGAIRGIKNVIIVPIPGDSSIITRSRLLDRLVRLAGDPDINGSKLTGVMISMLSLF
VESPGQLIQRITDDPDVSIRLVEVVQSTRSQSGLTFASRGADLDNEADMYFSTEGPSSGSKKRINWFENREIIDIEVQDA
EEFNMLLASILAQVWILLAKAVTAPDTAADSELRRWVKYTQQRRVIGEFRLDKGWLDAVRNRIAEDLSLRRFMVSLILDI
KRTPGNKPRIAEMICDIDNYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTIESLMNLYQQLGEVAPYMVILENSI
QNKFSAGAYPLLWSYAMGVGVELENSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSVIAAELGITAEEAKLVSEIASQ
TGDERTVRGTGPRQAQVSFLQHKTDEGESPTPATREEVKAAIPNGSEGRDTKRTRSGKPRGETPGQLLPEIMQEDELSRE
SSQNPREAQRSAEALFRLQAMAKILEDQEEGEDNSQIYNDKDLLS
>Q04558 ~~~N~~~Nucleoprotein~~~
MAGKNQSQKKKKSTAPMGNGQPVNQLCQLLGAMIKSQRQQPRGGQAKKKKPEKPHFPLAAEDDIRHHLTQTERSLCLQSI
QTAFNQGAGTASLSSSGKVSFQVEFMLPVAHTVRLIRVTSTSASQGAS
>P27313 3.1.-.-~~~N~~~Nucleoprotein~~~
MSDLTDIQEDITRHEQQLIVARQKLKDAERAVEVDPDDVNKNTLQARQQTVSALEDKLADYKRRMADAVSRKKMDTKPTD
PTGIEPDDHLKERSSLRYGNVLDVNAIDIEEPSGQTADWYTIGVYVIGFTLPIILKALYMLSTRGRQTVKENKGTRIRFK
DDTSFEDINGIRRPKHLYVSMPTAQSTMKAEELTPGRFRTIVCGLFPTQIQVRNIMSPVMGVIGFSFFVKDWSERIREFM
EKECPFIKPEVKPGTPAQEIEMLKRNKIYFMQRQDVLDKNHVADIDKLIDYAASGDPTSPDNIDSPNAPWVFACAPDRCP
PTCIYVAGMAELGAFFSILQDMRNTIMASKTVGTAEEKLKKKSSFYQSYLRRTQSMGIQLDQRIILLFMLEWGKEMVDHF
HLGDDMDPELRGLAQALIDQKVKEISNQEPLKI
>Q8JXF6 ~~~N~~~Nucleoprotein~~~
MDADKIVFKVNNQVVSLKPEIIVDQYEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGMNAAKLDPDDVCSYLAAAMQF
FEGTCPEDWTSYGILIARKGDRITPNSLVEIKRTDVDGNWALTGGMELTRDPTVSEHASLVGLLLSLYRLSKISGQNTGN
YKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETR
SPEAVYTRIMMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSNDS
>P0DOF3 ~~~N~~~Nucleoprotein~~~
MDADKIVFKVNNQVVSLKPEIIVDQHEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGMSAAKLDPDDVCSYLAAAMQF
FEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQNTGN
YKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETR
SPEAVYTRIMMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS
>Q8B6J9 ~~~N~~~Nucleoprotein~~~
MDADKIVFKVNNRVVSLKPEIIVDQYEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGMNAAKLDPDDVCSYLAAAMQF
FEGTCPEDWTSYGILIARKGDKITPDSLVEIKRTDVEGNWALTGGMELTRDPTVSEHASLVGLLLSLYRLSKISGQNTGN
YKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETR
SPEAVYTRIMMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS
>O55611 ~~~N~~~Nucleoprotein~~~
MDADRIVFRSNNQVVSLRPEIIADQYEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGMNAAKLDPDDVCSYLAAAMQF
FEGTCPEDWTSYGILIARKGDKITPNSLVEIKRNDVEGNWALTGGMEMTRDPTVSEHASLVGLLLSLYRLSKISGQNTGN
YKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QIRSLNATVIAACAPHEMSVLGGYLGEEFFGRGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVDSDDEDYFSGEAR
GPEAVYARIMMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS
>P16285 ~~~N~~~Nucleoprotein~~~
MDADKIVFKVNNQVVSLKPEIIVDQYEYKYPAIKDLKKPCITLGKAPDLNKAYKSVLSGMSAAKLNPDDVCSYLAAAMQF
FEGTCPEDWTSYGIVIARKGDKITPGSLVEIKRTDVEGNWALTGGMELTRDPTVPEHASLVGLLLSLYRLSKISGQNTGN
YKTNIADRIEQIFETAPFVKIVEHHTLMTTHKMCANWSTIPNFRFLAGTYDMFFSRIEHLYSAIRVGTVVTAYEDCSGLV
SFTGFIKQINLTAREAILYFFHKNFEEEIRRMFEPGQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QVRSLNATVIAACAPHEMSVLGGYLGEEFFGKGTFERRFFRDEKELQEYEAAELTKTDVALADDGTVNSDDEDYFSGETR
SPEAVYTRIMMNGGRLKRSHIRRYVSVSSNHQARPNSFAEFLNKTYSSDS
>P37708 ~~~N~~~Nucleoprotein~~~
MASLLKSLALFKKNKDKPPLAAGSGGAIRGIKHVIIVPIPGDSSITTRSRLLDCLVKMVGDPDISGPKLTGALISILSLF
VESPGQLIQRITDDPDISIKLVEVIQSDKTQSGLTFASRGASMDDEADRYFTYDEPNGGEERQSYWFENREIQDIEVQDP
EGFNMILATILAQIWILLAKAVTTPDTAADSELRRWVKYTQQRRVIGEFRLDKGWLDTVRNRIAEDLSLRRFMVALILDI
KRTPGNKPRIAEMICDIDTYIVEAGLASFILTIKFGIETMYPALGLHEFAGELSTIESLMNLYQQMGELAPYMVILENSI
QNKFSAGAYPLLWSYAMGVGVELESSMGGLNFGRSYFDPAYFRLGQEMVRRSAGKVSSNLASELGITEEEAKLVSEIAAY
TGDDRNSRTSGPKQTQVSFLRTDQGGEIQHNASKKDEARVLQVRKETWASSRSDRYKEDTDNEAVSPSVKTLIDVDTTPE
ADTDPLGNKKSAEALLKLQAMASILEDPTLGNDSPRTYNDKDLLS
>P68560 ~~~pc3~~~Nucleoprotein~~~
MGTNKPATLADLQKAINDISKDALSYLTAHKADVVTFAGQIEYAGYDAATLIGILKDKGGDTLAKDMTMCITMRYVRGTG
FVRDVTKKVKVAAGSTEASTLVSRYGIVSSVGTNANAITLGRLAQLFPNVSHEVVRQISGVKMAVDSSDLGLTGCDNLLW
DYVPQYIKLESETAPYCSTHSLSHILFVVHIIHSFQITKKTMPEGKKKERGLTKDIDMMKYTTGLLVITCKSKNLSDKKK
EEGRKKVLDEFITNGKVKTTIFDALAGMSVNTISTYGNQTRLYLAQQSKLMKILAENTSKTATEVSGLVKEFFEDEAEGA
DD
>P68559 ~~~pc3~~~Nucleoprotein~~~
MGTNKPATLADLQKAINDISKDALSYLTAHKADVVTFAGQIEYAGYDAATLIGILKDKGGDTLAKDMTMCITMRYVRGTG
FVRDVTKKVKVAAGSTEASTLVSRYGIVSSVGTNANAITLGRLAQLFPNVSHEVVRQISGVKMAVDSSDLGLTGCDNLLW
DYVPQYIKLESETAPYCSTHSLSHILFVVHIIHSFQITKKTMPEGKKKERGLTKDIDMMKYTTGLLVITCKSKNLSDKKK
EEGRKKVLDEFITNGKVKTTIFDALAGMSVNTISTYGNQTRLYLAQQSKLMKILAENTSKTATEVSGLVKEFFEDEAEGA
DD
>D3K5I7 ~~~N~~~Nucleoprotein~~~
MDNYQELAIQFAAQAVDRNEIEQWVREFAYQGFDARRVIELLKQYGGADWEKDAKKMIVLALTRGNKPRRMMMKMSKEGK
ATVEALINKYKLKEGNPSRDELTLSRVAAALAGRTCQALVVLSEWLPVTGTTMDGLSPAYPRHMMHPSFAGMVDPSLPGD
YLRAILDAHSLYLLQFSRVINPNLRGRTKEEVAATFTQPMNAAVNSNFISHEKRREFLKAFGLVDSNGKPSAAVMAAAQA
YKTAA
>P21700 ~~~N~~~Nucleoprotein~~~
MDNYQELRVQFAAQAVDRNEIEQWVREFAYQGFDARRVIELLKQYGGADWEKDAKKMIVLALTRGNKPRRMMMKMSKEGK
ATVEALINKYKLKEGNPSRDELTLSRVAAALAGWTCQALVVLSEWLPVTGTTMDGLSPAYPRHMMHPSFAGMVDPSLPGD
YLRAILDAHSLYLLQFSRVINPNLRGRTKEEVAATFTQPMNAAVNSNFISHEKRREFLKAFGLVDSNGKPSAAVMAAAQA
YKTAA
>Q86523 ~~~N~~~Nucleoprotein~~~
MANDNVSDYANAAPFARFANLQNRETLNPIGNEAKEIPYNRDQYLTWLAEGKLFQIGALTDAEIVAAWTTIKTAMGNNTF
SETHMRSIVKIACNLRGITPGSTPLLVTYNPPQSATWAPAPSTDAIYSGTPVAGVIIPQNTGAGGEDTETEASKARAIAF
ICCYLLRFIVKTEEHLTNSLGNLKLQYSRLYSAQSATLSNWNPSNTWASRVKLGFDTYLTLRATVAYNIASADALLVPEN
VNYGLCRMLVFQHLELSGLQLYKMAMTLIAHFKLIEPNKFLSWIYDPLSEASIDQIYKIAVNYDNVNSKTHKHWKYAKLA
RGQYWLNTTVKRNQFLAYILADLELKYGLAGKSDYSSPKRMKALSGMPVERMTEAETISKAVEQMYTAIESAKRVDAGAA
YRLAKKLGPPRANAHSRRKEPNNSRQHRDKQPNSKQQGRDKRNKHQVLGRHSKPQAQGPPNKQQDLGPHNNKPKEASRPP
QQDRQQLAQPWRLTRRQRGARGHRTQTLSGMFCKGTCQCIQ
>P59595 ~~~N~~~Nucleoprotein~~~
MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQGLPNNTASWFTALTQHGKEELRFPRGQGVPINTNSG
PDDQIGYYRRATRRVRGGDGKMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTRNPNNNAATVL
QLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTPGSSRGNSPARMASGGGETALALLLLDRLNQLESKVSGKGQ
QQQGQTVTKKSAAEASKKPRQKRTATKQYNVTQAFGRRGPEQTQGNFGDQDLIRQGTDYKHWPQIAQFAPSASAFFGMSR
IGMEVTPSGTWLTYHGAIKLDDKDPQFKDNVILLNKHIDAYKTFPPTEPKKDKKKKTDEAQPLPQRQKKQPTVTLLPAAD
MDDFSRQLQNSMSGASADSTQA
>P0DTC9 ~~~N~~~Nucleoprotein~~~
MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGKEDLKFPRGQGVPINTNSSP
DDQIGYYRRATRRIRGGDGKMKDLSPRWYFYYLGTGPEAGLPYGANKDGIIWVATEGALNTPKDHIGTRNPANNAAIVLQ
LPQGTTLPKGFYAEGSRGGSQASSRSSSRSRNSSRNSTPGSSRGTSPARMAGNGGDAALALLLLDRLNQLESKMSGKGQQ
QQGQTVTKKSAAEASKKPRQKRTATKAYNVTQAFGRRGPEQTQGNFGDQELIRQGTDYKHWPQIAQFAPSASAFFGMSRI
GMEVTPSGTWLTYTGAIKLDDKDPNFKDQVILLNKHIDAYKTFPPTEPKKDKKKKADETQALPQRQKKQQTVTLLPAADL
DDFSKQLQQSMSSADSTQA
>H2AM13 ~~~N~~~Nucleoprotein~~~
MSSQFIFEDVPQRNAATFNPEVGYVAFIGKYGQQLNFGVARVFFLNQKKAKMVLHKTAQPSVDLTFGGVKFTVVNNHFPQ
YVSNPVPDNAITLHRMSGYLARWIADTCKASVLKLAEASAQIVMPLAEVKGCTWADGYTMYLGFAPGAEMFLDAFDFYPL
VIEMHRVLKDNMDVNFMKKVLRQRYGTMTAEEWMTQKITEIKAAFNSVGQLAWAKSGFSPAARTFLQQFGINI
>P04857 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGGGAVIPGQRSTVSVFVLGPSVTDDADKLFIATTFLAHSLDTDKQHSQRGGFLVSLLA
MAYSSPELYLTTNGVNADVKYVIYNIEKDPKRTKTDGFIVKTRDMEYERTTEWLFGPMVNKSPLFQGQRDAADPDTLLQT
YGYPACLGAIIVQVWIVLVKAITSSAGLRKGFFNRLEAFRQDGTVKGALVFTGETVEGIGSVMRSQQSLVSLMVETLVTM
NTARSDLTTLEKNIQIVGNYIRDAGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLIDTYLSKGPRAPFICILKDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGRTYLDMEMFLLGQAVAKDAESKISSALEDELGVTDTAKERLRHHLAN
LSGGDGAYHKPTGGGAIEVALDNADIDLETEAHADQDARGWGGESGERWARQVSGGHFVTLHGAERLEEETNDEDVSDIE
RRIAMRLAERRQGILQPMEMKAAITVWITTKMTMPQQ
>Q07097 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGGGAVIPGQRSTVSVFVLGPSVTDDADKLFIATTFLAHSLDTDKQHSQRGGFLVSLLA
MAYSSPELYLTTNGVNADVKYVIYNIEKDPKRTKTDGFIVKTRDMEYERTTEWLFGPMVNKSPLFQGQRVAADPDTLLQT
YGYPACLGAIIVQVWIVLVKAITSSAGLRKGFFNRLEAFRQDGTVKGALVFTGETVEGIGSVMRSQQSLVSLMVETLVTM
NTARSDLTTLEKNIQIVGNYIRDAGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLIDTYLSKGPRAPFICILKDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGGTYLDMEMFLLGQAVAKDAESKISSALEDELGVTDTAKERLRHHLAN
LSGGDGAYHEPTGGGAIEVALDNADIDLETEAHADQDARGWGGESGERWARQVSGGHFVTLHGAERLEEETNDEDVSDIE
RRIAMRLAERRQEDSATHGDEGRNNGVDHDEDDDAAAVAGIGGI
>O57286 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGGGAVIPGQRSTVSVFVLGPSVTDDADKLSIATTFLAHSLDTDKQHSQRGGFLVSLLA
MAYSSPELYLTTNGVNADVKYVIYNIEKDPKRTKTDGFIVKTRDMEYERTTEWLFGPMVNKSPLFQGQRDAADPDTLLQI
YGYPACLGAIIVQVWIVLVKAITSSAGLRKGFFNRLEAFRQDGTVKGALVFTGETVEGIGSVMRSQQSLVSLMVETLVTM
NTARSDLTTLEKNIQIVGNYIRDAGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLIDTYLSKGPRAPFICILKDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGRTYLDMEMFLLGQAVAKDAESKISSALEDELGVTDTAKERLRHHLAN
LSGGDGAYHKPTGGGAIEVALDNADIDLEPEAHTDQDARGWGGDSGDRWARSMGSGHFITLHGAERLEEETNDEDVSDIE
RRIARRLAERRQEDATTHEDEGRNNGVDHDEEDDAAAAAGMGGI
>P04858 ~~~N~~~Nucleoprotein~~~
MAGLLSTFDTFSSRRSESINKSGRGAVIPGQRSTVSVFVLGLSVTDDADKLFIATTFLAHSLDTDKRHSQRGGFLVSLLA
MAYSSPELYLTTNGVNADVKYVIYNIEKDPKRTKTDGFIVKTRDMEYERTTEWLFGPMVNKSPLFQGQRDAADPDTLLQI
YGYPACLGAIIVQVWIVLVKAITSSAGLRKGFFNRLEAFRQDGTVKGALVFTGETVEGIGSVMRSQQSLVSLMVETLVTM
NTARSDLTTLEKNIQIVGNYIRDAGLASFMNTIKYGVETKMAALTLSNLRPDINKLRSLIDTYLSKGPRAPFICILKDPV
HGEFAPGNYPALWSYAMGVAVVQNKAMQQYVTGRTYLDMEMFLLGQAVAKDAESKISSALEDELGVTEAAKGRLRHHLAS
LSGGNGAYRKPTGGGAIEVALDNADIDLETKAHADQDARGWGGDSGERWARQVSGGHFVTLHGAERLEEETNDEDVSDIE
RRIAMRLAERRQEILQPMEMKAAITVSIMTKMTIPQQ
>P17881 3.1.-.-~~~N~~~Nucleoprotein~~~
MATMEEIQREISAHEGQLVIARQKVKDAEKQYEKDPDDLNKRALHDRESVAASIQSKIDELKRQLADRLQQGRTSGQDRD
PTGVEPGDHLKERSALSYGNTLDLNSLDIDEPTGQTADWLTIIVYLTSFVVPIILKALYMLTTRGRQTSKDNKGMRIRFK
DDSSYEDVNGIRKPKHLYVSMPNAQSSMKAEEITPGRFRTAVCGLYPAQIKARNMVSPVMSVVGFLALAKDWTSRIEEWL
GAPCKFMAESLIAGSLSGNPVNRDYIRQRQGALAGMEPKEFQALRQHSKDAGCTLVEHIESPSSIWVFAGAPDRCPPTCL
FVGGMAELGAFFSILQDMRNTIMASKTVGTADEKLRKKSSFYQSYLRRTQSMGIQLDQRIIVMFMVAWGKEAVDNFHLGD
DMDPELRSLAQILIDQKVKEISNQEPMKL
>I6WJ72 ~~~NP~~~Nucleoprotein~~~
MSEWSRIAVEFGEQQLNLTELEDFARELAYEGLDPALIIKKLKETGGDDWVKDTKFIIVFALTRGNKIVKASGKMSNSGS
KRLMALQEKYGLVERAETRLSITPVRVAQSLPTWTCAAAAALKEYLPVGPAVMNLKVENYPPEMMCMAFGSLIPTAGVSE
ATTKTLMEAYSLWQDAFTKTINVKMRGASKTEVYNSFRDPLHAAVNSVFFPNDVRVKWLKAKGILGPDGVPSRAAEVAAA
AYRNL
>P0DW82 ~~~NP~~~Nucleoprotein~~~
MSEWSRIAVEFGEQQLNLTELEDFARELAYEGLDPALIIKKLKETGGDDWVKDTKFIIVFALTRGNKIVKASGKMSNSGS
KRLMALQEKYGLVERAETRLSITPVRVAQSLPTWTCAAAAALKEYLPVGPAVMNLKVENYPPEMMCMAFGSLIPTAGVSE
ATTKTLMEAYSLWQDAFTKTINVKMRGASKTEVYNSFRDPLHAAVNSVFFPNDVRVKWLKAKGILGPDGVPSRAAEVAAA
AYRNL
>Q89462 3.1.-.-~~~N~~~Nucleoprotein~~~
MSTLKEVQDNITLHEQQLVTARQKLKDAERAVELDPDDVNKSTLQSRRAAVSALETKLGELKRELADLIAAQKLASKPVD
PTGIEPDDHLKEKSSLRYGNVLDVNSIDLEEPSGQTADWKSIGLYILSFALPIILKALYMLSTRGRQTIKENKGTRIRFK
DDSSYEEVNGIRKPRHLYVSMPTAQSTMKADEITPGRFRTIACGLFPAQVKARNIISPVMGVIGFSFFVKDWMERIDDFL
AARCPFLPEQKDPRDAALATNRAYFITRQLQVDESKVSDIEDLIADARAESATIFADIATPHSVWVFACAPDRCPPTALY
VAGMPELGAFFAILQDMRNTIMASKSVGTSEEKLKKKSAFYQSYLRRTQSMGIQLDQKIIILYMSHWGREAVNHFHLGDD
MDPELRELAQTLVDIKVREISNQEPLKL
>A0A1W6I186 ~~~~~~Nucleocapsid VP1~~~
MATLTDVRNALQNAWTDDMKKAVKNAYRNAEKRASQGGNYAKCLEQASMQGTPYSLAAKKCAIENNLASTLSGLWGTS
>P27018 ~~~N~~~Nucleoprotein~~~
MSSVLKTFERFTIQQELQDHEEDTPVPLETIRPLIRVFVVNSNDPALRAQLLLFNLRIIMSNTARESHKTGALLSMFSLP
AAAMGNHLKLATRSPEASIDRVEITGFEGRSFRVVPDARSTMSRAEVLAYEAIAEDIPDTLNHKTPFVNADVEQGDYDET
EGFLELCYSVLMQAWIVTCKCMTAPDQPPISIEKRMAKYQQQGRINPRFILQPEARRIIQNAIRKAMVVRHFLTYELQMA
QSKTLLANRYYAMVGDVGKYIEHSGMGGFFCTLKYGLGTRWPTLALAAFSGELQKLKALMLHYQSLGPMAKYMALLESPK
LMDFAPAEYPLMYSYAMGIGTVIDTNMRNYAYGRSYLNPQYFQLGVETARKQQGAVDHRTAEDLGMSQADKVELAATLAK
LTIGQGGRGRQPLDDPFAGAAGDYQGAAAGGAQGFDYASRRVRKYNDYESDEEAGMDDDYEQEAREGRGYDDDDARQGIG
GQSGFDFSVPQRAPGMSDEEFQAQMTKYIQHVQQHYQEAQEGAEDGGYNQTTDDQGAGGDFDT
>P10550 ~~~N~~~Nucleoprotein~~~
MSTTPTITLADLERIREPYKVLSKTARPENPSGQCTYREYLFSDAVKYPIYKRATMTNEEIVTFFGKITSDKHTHMTESD
MWTFVQCALSLKDPVDRSSIFDKGFWDANHLCADYATAQPANTGKVAMSHHNPGVVVTQLVPKYDTGGSSSQETESMASK
AEAISFYFAWLTRFSVKQAPNTINVLYDRVRATYLKFYSTSSSIFDTFRPSNTWLQGLKDAFDTFPRVKNTLILHVAHAE
TYFRPTPKIFNVLRFLFFQNLEFMGLHAYVSIVTIMSKVALPPSQVLSWLRVSGSEMAIDEAFMIMNTLDNGMIDNGHNA
ERLWKYARCLDQGYFNRLQSSYSAELIAMLAYIEINMGISTEVGYNSPLNIYAIANNKAVKEVGRMKADVFIQCKNSVVS
LTQDASVIDKVYAAAQQKHIRSEEAARPSEQNKEDEVVAMDTDAPSRKRRSDALTTEKPKKALPAIIKLPNIPDF
>P18140 3.1.13.-~~~N~~~Nucleoprotein~~~
MAQSKEVPSFRWTQSLRKGLSQFTQTVKSDILKDAKLIADSIDFNQVAQVQRVLRKTKRTDDDLNKLRDLNIEVDRLMSM
KSVQKNTIFKVGDLARDELMELASDLEKLKDKIKRTESNGTNAYMGNLPQSQLNRRSEILRTLGFAQQGGRPNGIVRVWD
VKDSSKLNNQFGSMPALTIACMTVQGGETMNNVVQALTSLGLLYTVKYPNLSDLDKLIPNHECLQIITKEESSINISGYN
LSLLAAVKAGASILDGGNMLETIRVSPDNFSSLIKNTLQVKRREGMFIDDRPGSRNPYENLLYKLCLSGDGWPYIGSRSQ
IMGRSWDNTSVDLTKKPDAVPEPGAAPRPAERKGQNLRLASLTEGQELIVRAAISELDPSNTIWLDIEDLQLDPVELALY
QPAKKQYIHCFRKPHDEKGFKNGSRHSHGILMKDIEDAVPGVLSYVIGLLPPNMVITTQGSDDIRKLLDIHGRKDLKLID
VKFTSDQARLFEHQVWDKFGHLCKQHNGVIISKKNKSKDSPPSPSPDEPHCALLDCIMFHSAVSGELPKEEPIPLLPKEF
LFFPKTAFAL
>P21701 ~~~N~~~Nucleoprotein~~~
MSDENYRDIALAFLDESADSGTINAWVNEFAYQGFDPKRIVQLVKERGTAKGRDWKKDVKMMIVLNLVRGNKPEAMMKKM
SEKGASIVANLISVYQLKEGNPGRDTITLSRVSAAFVPWTVQALRVLSESLPVSGTTMDAIAGVTYPRAMMHPSFAGIID
LDLPNGAGATIADAHGLFMIEFSKTINPSLRTKQANEVAATFEKPNMAAMSGRFFTREDKKKLLIAVGIIDEDLVLASAV
VRSAEKYRAKVGK
>P26000 ~~~N~~~Nucleoprotein~~~
MSKVKLTKESIVALLTQGKDLEFEEDQNLVAFNFKTFCLENLDQIKKMSIISCLTFLKNRQSIMKVIKQSDFTFGKITIK
KTSDRIGATDMTFRRLDSLIRVRLVEETGNSENLNTIKSKIASHPLIQAYGLPLDDAKSVRLAIMLGGSLPLIASVDSFE
MISVVLAIYQDAKYKDLGIDPKKYDTREALGKVCTVLKSKAFEMNEDQVKKGKEYAAILSSSNPNAKGSIAMEHYSETLN
KFYEMFGVKKQAKLTELA
>Q88918 3.1.-.-~~~N~~~Nucleoprotein~~~
MSQLKEIQEEITRHEQQIVIARQKLKDVEKTVEADPDDVNKSTLQSRRAAVSALEDKLADFKRQLADLVSSQKMGEKPVD
PTGLEPDDHLKERSSLRYGNVLDVNAIDIDEPSGQTADWFSIGQYITGFALAIILKALYMLSTRGRQTIKENKGTRIRFK
DDSSYEEINGIRRPKHLYVSMPTAQSTMKADELTPGRFRTIVCGLFPAQIMYRNIISPVMGVIGFSFFVKDWPEKIEEFL
IKPCPFLKKSGPSKEEDFLVSNDAYLLGREKALRESHLAEIDDLIDLAASGDPTPPDSIKSPQAPWVFACRPDRCPPTCI
YIAGMAELGAFFSILQDMRNTIMASKTVGTAEEKLKKKSSFYQSYLRRTQSMGIQLDQRIILLFMTEWGSDIVNHFHLGD
DMDPELRTLAQSLIDQKVKEISNQEPLKI
>P24378 ~~~N~~~Nucleoprotein~~~
MEGGIRAAFSGLNDVRIDPTGGEGRVLVPGEVELIVYVGEFGEEDRKVIVDALSALGGPQTVQALSVLLSYVLQGNTQED
LETKCKVLTDMGFKVTQAVRATSIEAGIMMPMRELALTVNDDNLMEIVKGTLMTCSLLTKYSVDKMIKYITKKLGELADT
QGVGELQHFTADKAAIRKLAGCVRPGQKITKALYAFILTEIADPTTQSRVPSMGALRLNGTGMTMIGLFTQAANNLGIAP
AKLLEDLCMESLVESARRIIQLMRQVSEAKSIQERYAIMMSRMLGESYYKSYGLNDNSKISYILSQISGKYAVDSLEGLE
GIKVTEKFREFAELVAEVLVDKYERIGEDSTEVSDVIREAARQHARRTSAKPEPKARNFRSSTGRGREQETGESDDDDYP
EDSD
>P27371 ~~~N~~~Nucleoprotein~~~
MEGGIRAAFSGLNDVRIDPTGGEGRVLVPGEVELIVYAGPFGTDDGKVIVDALAALGGPQTVQALSVLLSYVLQGSAQGD
LEAKCKILTDMGFKVTQSPRATGIEAGILMPMRELAQTVNNDNLMDIVKGALMTCSLLDKYSVDKMIKYITKKLGELGST
QGVGELQHLSADKAAIRKLAGCVRPGQKITKALYAFILTEIADPTTQSRVQSMGALRLNGTGMTMIGLFTQAANNLGIPP
AKLLEDLCMESLVESARRIIQLMRQVSEARSIQERYAIMMSRMLGESYYKSYRLNDNSKISYILSQISGKYAVDSLEGLE
GIKVTEKFREFTELVAEVLVDKYERIGEDSTEVSDVIREAARQHARKASDKPEPKARNFRSSTGRGKEQEKEESDDDDYP
GDSD
>P03521 ~~~N~~~Nucleoprotein~~~
MSVTVKRIIDNTVIVPKLPANEDPVEYPADYFRKSKEIPLYINTTKSLSDLRGYVYQGLKSGNVSIIHVNSYLYGALKDI
RGKLDKDWSSFGINIGKAGDTIGIFDLVSLKALDGVLPDGVSDASRTSADDKWLPLYLLGLYRVGRTQMPEYRKKLMDGL
TNQCKMINEQFEPLVPEGRDIFDVWGNDSNYTKIVAAVDMFFHMFKKHECASFRYGTIVSRFKDCAALATFGHLCKITGM
STEDVTTWILNREVADEMVQMMLPGQEIDKADSYMPYLIDFGLSSKSPYSSVKNPAFHFWGQLTALLLRSTRARNARQPD
DIEYTSLTTAGLLYAYAVGSSADLAQQFCVGDNKYTPDDSTGGLTTNAPPQGRDVVEWLGWFEDQNRKPTPDMMQYAKRA
VMSLQGLREKTIGKYAKSEFDK
>P11212 ~~~N~~~Nucleoprotein~~~
MSVTVKRIIDNTVIVPKLPANEDPVEYPADYFRKSKEIPLYINTTKSLSDLRGYVYQGLKSGNVSIIHVNSYLYGALKDI
RGKLDKDWSSFGINIGKAGDTIGIFDLVSLKGLDGVLPDGVSDASRTRADDKWLPLYLLGLYRVGRTQMPEYRKKLMDGL
TNQCKMINEQFEPLVPEGRDIFDVWGNDSNYTKIVAAVDMFFHMFKKHECASFRYGTIVSRFKDCAALATFGHLCKITGM
STEDVTTWILNREVADEMVQMMLPGQEIDKADSYMPYLIDFGLSSKSPYSSVKNPAFHFWGQLTALLLRSTRARNARQPD
DIEYTSLTTAGLLYAYAVGSSADLAQQFCVGDSKYTPDDSTGGLTTNAPPQGRDVVEWLGWFEDQNRKPTPDMMQYAKRA
VMSLQGLREKTIGKYAKSEFDK
>Q77E03 ~~~N~~~Nucleoprotein~~~
MSVTVKRIIDNTVIVPKLPANEDPVEYPADYFRKSKEIPLYINTTKSLSDLRGYVYQGLKSGNVSIIHVNSYLYGALKDI
RGKLDKDWSSFGINIGKAGDTIGIFDLVSLKALDGVLPDGVSDASRTSADDKWLPLYLLGLYRVGRTQMPEYRKKLMDGL
TNQCKMINEQFEPLVPEGRDIFDVWGNDSNYTKIVAAVDMFFHMFKKHECASFRYGTIVSRFKDCAALATFGHLCKITGM
STEDVTTWILNREVADEMVQMMLPGQEIDKADSYMPYLIDFGLSSKSPYSSVKNPAFHFWGQLTALLLRSTRARNARQPD
DIEYTSLTTAGLLYAYAVGSSADLAQQFCVGDNKYTPDDSTGGLTTNAPPQGRDVVEWLGWFEDQNRKPTPDMMQYAKRA
VMSLQGLREKTIGKYAKSEFDK
>P04881 ~~~N~~~Nucleoprotein~~~
MAPTVKRIINDSIIQPKLPANEDPVEYPADYFKNNTNIVLYVSTKVALNDLRAYVYQGIKSGNPSILHINAYLYAALKGV
EGTLDRDWVSFGRTIGKREENVKIFDLVKVEELKTALPDGKSDPDRSAEDDKWLPIYILGLYRVGRSKVTDYRKKLLDGL
ENQCRVASTRFESLVEDGLDFFDIWENDPNFTKIVAAVDMFFHMFKKHERAPIRYGTIVSRFKDCAALATFGHLSKVSGL
SIEDLTTWVLNREVADELCQMMYPGQEIDKADSYMPYMIDFGLSQKSPYSSVKNPAFHFWGQLAALLLRSTRAKNARQPD
DIEYTSLTCASLLLSFAVGSSADIEQQFYIGEDKYTTEKDDSLKKSDVPPKGRNVVDWLGWYDDNGGKPTPDMLNFARRA
VSSLQSLREKTIGKYAKVEFDK
>Q5VKP2 ~~~N~~~Nucleoprotein~~~
MDSEHIVFRVRNEIVTLKPEVISDQYEYKYPAITDKKKPSITLGRAPDLSIAYRSILSGFNAAKLDPDDVCSYLAAAMPL
FEGVCPEDWISYGIIIARKGDKINPSHLVDIMRTEVEGNWSQSGGADVTRNPTVAEHASLVGLLLCLYRLSKIVGQNTAN
YKTNVADRMEQIFETAPFVKIIEHHTLMTTHKMCANWSTIPNFRFLVGTYDMFFSRIDHLYSALRVGTVVTAYEDCTGLV
SFTAFLKQINLSARDAILYFFHKNFEEEIRRMFRPNQETAVPHSYFIHFRSLGLSGKSPYSSNAVGHVFNLIHFVGCYMG
QVRSLNATVIQTCAPHEMSVLGGYLGEEFFGKGTFERRFFRDERELQDHLEAEEAKIDIALADDATVDSGDEDFYGGESR
SPEAVYNRIIMNKGRLKKLHIKRYRSVSSNHQARPNTFAEFLNKVYSDDN
>Q91PB2 3.1.13.-~~~N~~~Nucleoprotein~~~
MSDQSVPSFRWTQSLRRGLSAWTTSVKADVLNDTRALLSGLDFAKVASVQRMMRRVKRDDSDLVGLRDLNKEVDSLMIMK
SNQKNMFLKVGSLSKDELMELSSDLEKLKQKVQRTERVGNGTGQYQGNLSNTQLTRRSEILQLVGIQRAGLAPTGGVVKI
WDIKDPSLLVNQFGSVPAVTISCMTEQGGESLNDVVQGLTDLGLLYTAKYPNLNDLKALTTKHPSLNIITQEESQINISG
YNLSLSAAVKAGACLIDGGNMLETIKIEESTFTTVIKTLLEVKNKEKMFVSPTPGQRNPYENVLYKLCLSGDGWPYIASR
SQIKGRAWDNTVVEFDTATVKEPIPIRNGGAPLLTTLKPEIENQVKRSVESLLINDTTWIDIEGPPNDPVEFAIYQPESQ
RYIHCYRRPNDIKSFKDQSKYCHGILLKDVENARPGLISSIIRSLPKSMVFTAQGADDIRKLFDMHGRQDLKIVDVKLSA
EESRIFEDLVWKRFEHLCDKHKGIVIKSKKKGSTPATTNAHCALLDGVMFSAVISGSVSNEKPKRMLPIDLLFREPETTV
VL
>P84254 ~~~pc3~~~Nucleoprotein~~~
MTSMVEIQNEIERVTALAVKYISENKDSLVVFVGQIDYNGYDAGKLLSILKEKAKGRDFGRDLCYLLVMRYTRGTGFVRD
VRKKIKVAAGADTAYEIVTHYGVVQSVGDNADAITLGRLAALFPYVSMNIVKSVSTGAKLALDTSDLGTSGLDILLWDFV
PQFINLDSVDAPYCNKKNTSNILFSLHLLQGALTTRKTMPDQKKKKDNLTTDFDLLKYTAELLVITCSAKNLTDNKKSTY
RKKLVEPFRENEDYKADFWTALGKLSTGCLKKMKKDAQNYLKDRTTVLKLMVDNCSGTDDEAAKAIKDYLTVDD
>P22175 ~~~p4~~~Major non-capsid protein~~~
MQRSADVSIGPITGLNYTDLYDSLPSSVSDNITLLDLKEPERVTEATKKLILKGCVETAYHHPLETDPLFASVHKHLPDF
CHSFLEHLLGGEQDENSLIDIGEFFKLLQPSLGDWITKYYLKHPNKMSGIQIKTLLNQIINMAKAESSDTETYEKVWKKM
PSYFSIVLTPLLHKVV
>Q01209 ~~~p4~~~Major non-capsid protein~~~
MQDVQRTVEVSVGPIVGLDYTLLYDTLPETVSDNITLPDLKDPERVTEDTKKLILKGCVYIAYHHPLETDTLFIKVHKHI
TEFCHSFLSHLLGGEDDDNALIDIGLFFNMLQPSLGGWITKNFLRHPNRMSKDQIKLLLDQIIKMAKAESSDTEEYEKVW
KKMPTYFESIIQPLLHKT
>A0MZE7 ~~~~~~Non-contractile tail sheath~~~
MSIEDYLKGKNCLASPNYDPDDQHSSWREDLPQFKKDREHLTLVNTRRNRTYNTKLNRFDPEYWVVDYNALMVATIIPYG
SKSFKVPCQWRTNKDFLGVRWMTEDTFDHHLYRYETDPNYLGLILAFRHNPDEPDKFTVTIQTPEKAYTYRLAPYGFNNK
TRRWECLDTKYGTKRTYQADIFVATDEDIPESEMTEVYGTKDYIFILDFADLRTGVAFNGVTINPRNITMISFDCTEAHH
GLGKDAYIAAMYNNDDGATFQMEIGGIHTNAALAAGDKLQCIWRYLDVNGNAQAAENEFEVVSYEGFGTSNFSVKCKGML
PGKFIGCDAFYGKYLQTDGPIKQVDSVKWFTNLTVSGSGRKQLGQRKYPQVVMGMGMTSGFDDGYNLTPERQVKMAYGLG
YRDWWTTYIGMSHYWKGLTAFQDKETGELITEQTVLDYPILFAGESQVAIHFMSGAYPDRGYDVFQKYMTETWGINYAGV
HPINGTTGSTAVDRACAVNPNSEVFDPTQSSGAGGLWWWDLEADKPGPALLHCVGQVGKLKPKAIIWGQGDQDATALAYP
GDRNPAPSLTRTKQATKKVFEYLRSLYGQIPIFIQELSYAWGITNTDAPNVPIRTGLPSFLAARRNTWGDIEFRWKSYGL
DPALAQYRIEIYNPSNLNQILHSFVVSGTQEANGYVYADFTVEDWIPVMMEAVGSPNPWEFMKWRVVCLYQEREIPSAPW
SDNIPLDNAGLVKKTILVGINQFGGGHFTDMSDPTATTANGAIGRKDKVSASTLRLTFAEKAGLRPIQVMPVNVAADSAG
MTVGTHKWWNTSSNSPGDALLAINDMVKGLGVKPDYFIEANPWETMYMKDVNSSTWPALMTAFESSNKAMLAWMRTNWGN
PNLEIWFQGATTVWFGVAPPNDLNSEATVTVRDKQIQMATANIGFKLGSFVPGSNLYTAYRNVESSWIYYTVEAFHATAI
ELGEALALNINRATNPPDWSYLRPPANLQGRKLATRDIKMTWDNRAGITHWKYANRHVTTGAEISSGILTSPEYVFTLND
QQNAYNGDTLNMSFSVSEYAADSGAVGASSSFVGVVQNGSYMQTPTQLKAAKQLNGDIIFTWVGRPSWQHFWVVNTSVND
SKTVIFSKEWSSESLTWTVAEQNEFYGLEEGGATHVIFMVSEYDPSNGLVSIGAQVTGQAEQPSNPMNPVAGLYAVFTGD
PGNSNIKIMWDKPSVGGRDVRIRNMHVTSSATISDQFVSDNNLVFTREEQVAAYGFTASSVSVRAQEHDIESGALGLTTE
YVAVPETAGTVGQGFAKKDSVGNCTMSWEVGDAVQWQVEILNAENSTVVKTEIVVAPTITWMAEEITAEYGYLTDHMVWR
VRPYRADGASNVAKQFDMTATL
>P15556 ~~~ndd~~~Nucleoid disruption protein~~~
MKYMTVTDLNDAGATVIGTIKGGEWFLGTPHKDILSKPGFYFLVSKLGGPFSNPCVSARFYVGNQRSKQGFSAVLSHIRQ
RRSQLARTIANNNVPYTVFYLPASKMKPLTTGFGKGQLALAFTRNHHSEYQTLEEMNRMLADNFKFVLQAY
>Q5UQL3 2.7.4.6~~~NDK~~~Nucleoside diphosphate kinase~~~
MQRTLVLIKPDAFERSLVAEIMGRIEKKNFKIVSMKFWSKAPRNLIEQHYKEHSEQSYFNDNCDFMVSGPIISIVYEGTD
AISKIRRLQGNILTPGTIRGDLANDIRENLIHASDSEDSAVDEISIWFPETKMETDN
>P0CK47 ~~~NEC1~~~Nuclear egress protein 1~~~
MAPVTPDAVNARQQRPADPALRRLMHPHHRNYTASKASAHSVKSVSRCGKSRSELGRMERVGSVARSICSRHTRHGVDRS
HFSLRDFFRGISANFELGKDFLREMNTPIHVSEAVFLPLSLCTLSPGRCLRLSPFGHSLTLGSHCEICINRSQVHVPQEF
SSTQLSFFNNVHKIIPNKTFYVSLLSSSPSAVKAGLSQPSLLYAYLVTGHFCGTICPIFSTNGKGRLIMHLLLQGTSLHI
PETCLKLLCENIGPTYELAVDLVGDAFCIKVSPRDTVYEKAVNVDEDAIYEAIKDLECGDELRLQIINYTQLILENKQ
>P16794 ~~~NEC1~~~Nuclear egress protein 1~~~
MSSVSGVRTPRERRSALRSLLRKRRQRELASKVASTVNGATSANNHGEPPSPADARPRLTLHDLHDIFREHPELELKYLN
MMKMAITGKESICLPFNFHSHRQHTCLDISPYGNEQVSRIACTSCEDNRILPTASDAMVAFINQTSNIMKNRNFYYGFCK
SSELLKLSTNQPPIFQIYYLLHAANHDIVPFMHAEDGRLHMHVIFENPDVHIPCDCITQMLTAAREDYSVTLNIVRDHVV
ISVLCHAVSASSVKIDVTILQRKIDEMDIPNDVSESFERYKELIQELCQSSGNNLYEEATSSYAIRSPLTASPLHVVSTN
GCGPSSSSQSTPPHLHPPSQATQPHHYSHHQSQSQQHHHRPQSPPPPLFLNSIRAP
>F5HFZ4 ~~~NEC1~~~Nuclear egress protein 1~~~
MSSVSGVRTPRERRSALRSLLRKRRQRELASKVASTVNGATSANNHGEPPSPADARPRLTLHDLHDIFREHPELELKYLN
MMKMAITGKESICLPFNFHSHRQHTCLDISPYGNEQVSRIACTSCEDNRILPTASDAMVAFINQTSNIMKNRNFYYGFCK
SSELLKLSTNQPPIFQIYYLLHAANHDIVPFMHAENGRLHMHVIFENSDVHIPCDCITQMLTAAREDYSVTLNIVRDHVV
ISVLCHAVSASSVKIDVTILQRKIDEMDIPNDVSESFERYKELIQELCQSSGNNLYEEATSSYAIRSPLTASPLHVVSTN
GCGPSSSSQSTPPHLHPPSQATQPHHYSHHQSQSQQHHHRPQSPPPPLFLNSIRAP
>P10215 ~~~NEC1~~~Nuclear egress protein 1~~~
MYDTDPHRRGSRPGPYHGKERRRSRSSAAGGTLGVVRRASRKSLPPHARKQELCLHERQRYRGLFAALAQTPSEEIAIVR
SLSVPLVKTTPVSLPFCLDQTVADNCLTLSGMGYYLGIGGCCPACNAGDGRFAATSREALILAFVQQINTIFEHRAFLAS
LVVLADRHNAPLQDLLAGILGQPELFFVHTILRGGGACDPRLLFYPDPTYGGHMLYVIFPGTSAHLHYRLIDRMLTACPG
YRFVAHVWQSTFVLVVRRNAEKPTDAEIPTVSAADIYCKMRDISFDGGLMLEYQRLYATFDEFPPP
>F5H982 ~~~NEC1~~~Nuclear egress protein 1~~~
MPKSVSSHISLATSTGRSGPRDIRRCLSSRLRSVPPGARSASVSSKHRNGLRKFISDKVFFSILSHRHELGVDFLREMET
PICTSKTVMLPLDLSTVAPGRCVSLSPFGHSSNMGFQCALCPSTENPTVAQGSRPQTMVGDALKKNNELCSVALAFYHHA
DKVIQHKTFYLSLLSHSMDVVRQSFLQPGLLYANLVLKTFGHDPLPIFTTNNGMLTMCILFKTRALHLGETALRLLMDNL
PNYKISADCCRQSYVVKFVPTHPDTASIAVQVHTICEAVAALDCTDEMRDDIQKGTALVNAL
>P03185 ~~~NEC2~~~Nuclear egress protein 2~~~
MASPEERLLDELNNVIVSFLCDSGSLEVERCSGAHVFSRGSSQPLCTVKLRHGQIYHLEFVYKFLAFKLKNCNYPSSPVF
VISNNGLATTLRCFLHEPSGLRSGQSGPCLGLSTDVDLPKNSIIMLGQDDFIKFKSPLVFPAELDLLKSMVVCRAYITEH
RTTMQFLVFQAANAQKASRVMDMISDMSQQLSRSGQVEDTGARVTGGGGPRPGVTHSGCLGDSHVRGRGGWDLDNFSEAE
TEDEASYAPWRDKDSWSESEAAPWKKELVRHPIRRHRTRETRRMRGSHSRVEHVPPETRETVVGGAWRYSWRATPYLARV
LAVTAVALLLMFLRWT
>P16791 ~~~NEC2~~~Nuclear egress protein 2~~~
MEMNKVLHQDLVQATRRILKLGPSELRVTDAGLICKNPNYSVCDAMLKTDTVYCVEYLLSYWESRTDHVPCFIFKNTGCA
VSLCCFVRAPVKLVSPARHVGEFNVLKVNESLIVTLKDIEEIKPSAYGVLTKCVVRKSNSASVFNIELIAFGPENEGEYE
NLLRELYAKKAASTSLAVRNHVTVSSHSGSGPSLWRARMSAALTRTAGKRSSRTASPPPPPRHPSCSPTMVAAGGAAAGP
RPPPPPMAAGSWRLCRCEACMGRCGCASEGDADEEEEELLALAGEGKAAAAAAGQDVGGSARRPLEEHVSRRRGVSTHHR
HPPSPPCAPSLERTGYRWAPSSWWRARSGPSRPQSGPWLPARFATLGPLVLALLLVLALLWRGHGQSSSPTRSAHRD
>Q6SW81 ~~~NEC2~~~Nuclear egress protein 2~~~
MEMNKVLHQDLVQATRRILKLGPSELRVTDAGLICKNPNYSVCDAMLKTDTVYCVEYLLSYWESRTDHVPCFIFKNTGCA
VSLCCFVRAPVKLVSPARHVGEFNVLKVNESLIVTLKDIEEIKPSAYGVLTKCVVRKSNSASVFNIELIAFGPENEGEYE
NLLRELYAKKAASTSLAVRNHVTVSSHSGSGPSLWRARMSAALTRTAGKRSPRTASPPPPPPRHPSCSPTMVAAGGAAAG
PRPPPPPMAAGSWRLCRCEACMGRCGCASEGDADEEEEELLALAGEGKAAAAAAGQDIGGSARRPLEEHVSRRRGVSTHH
RHPPSPPCTPSLERTGYRWAPSSWWRARSGPSRPQSGPWLPARFATLGPLVLALLLVLALLWRGHGQSSSPTRSAHRD
>P10218 ~~~NEC2~~~Nuclear egress protein 2~~~
MAGLGKPYTGHPGDAFEGLVQRIRLIVPSTLRGGDGEAGPYSPSSLPSRCAFQFHGHDGSDESFPIEYVLRLMNDWAEVP
CNPYLRIQNTGVSVLFQGFFHRPHNAPGGAITPERTNVILGSTETTGLSLGDLDTIKGRLGLDARPMMASMWISCFVRMP
RVQLAFRFMGPEDAGRTRRILCRAAEQAITRRRRTRRSREAYGAEAGLGVAGTGFRARGDGFGPLPLLTQGPSRPWHQAL
RGLKHLRIGPPALVLAAGLVLGAAIWWVVGAGARL
>P89457 ~~~NEC2~~~Nuclear egress protein 2~~~
MAGMGKPYGGRPGDAFEGLVQRIRLIVPATLRGGGGESGPYSPSNPPSRCAFQFHGQDGSDEAFPIEYVLRLMNDWADVP
CNPYLRVQNTGVSVLFQGFFNRPHGAPGGAITAEQTNVILHSTETTGLSLGDLDDVKGRLGLDARPMMASMWISCFVRMP
RVQLAFRFMGPEDAVRTRRILCRAAEQALARRRRSRRSQDDYGAVVVAAAHHSSGAPGPGVAASGPPAPPGRGPARPWHQ
AVQLFRAPRPGPPALLLLAAGLFLGAAIWWAVGARL
>F5HA27 ~~~NEC2~~~Nuclear egress protein 2~~~
MSVVGKRVVDELCRVVSSYLGQSGQSLDLERCIDGAPVYAKGGATAICTVRMQHGCVYHLEFVYKFWAHLLEEMHYPFSP
CFVISNNGLSTTLKCFLCRPSDAVSQFGHVLPVESDVYLAKNTSVVLGQDDFTKFKASLVFSKNLGVYNSMVICRTYFTD
YRQVLQFLVVTPKSHKRLKSLLETVYCLAAPVADSAAQGGAGFPTNGRDARACTSDVTAVYWAGQGGRTVRILGAFQWSL
GRAVALVRRSWPWISAGIAFLCLGLVWMRPS
>P11110 ~~~~~~Neck protein gp13~~~
MSGYNPQNPKELKDVILRRLGAPIINVELTPDQIYDCIQRALELYGEYHFDGLNKGFHVFYVGDDEERYKTGVFDLRGSN
VFAVTRILRTNIGSITSMDGNATYPWFTDFLLGMAGINGGMGTSCNRFYGPNAFGADLGYFTQLTSYMGMMQDMLSPIPD
FWFNSANEQLKVMGNFQKYDLIIVESWTKSYIDTNKMVGNTVGYGTVGPQDSWSLSERYNNPDHNLVGRVVGQDPNVKQG
AYNNRWVKDYATALAKELNGQILARHQGMMLPGGVTIDGQRLIEEARLEKEALREELYLLDPPFGILVG
>P11111 ~~~~~~Neck protein gp14~~~
MATYDKNLFAKLENRTGYSQTNETEILNPYVNFNHYKNSQILADVLVAESIQMRGVECYYVPREYVSPDLIFGEDLKNKF
TKAWKFAAYLNSFEGYEGAKSFFSNFGMQVQDEVTLSINPNLFKHQVNGKEPKEGDLIYFPMDNSLFEINWVEPYDPFYQ
LGQNAIRKITAGKFIYSGEEINPVLQKNEGINIPEFSELELNAVRNLNGIHDINIDQYAEVDQINSEAKEYVEPYVVVNN
RGKSFESSPFDNDFMD
>P39234 ~~~~~~Baseplate puncturing device gp5.4~~~
MSGLSYDKCVTAGHEAWPPTVVNATQSKVFTGGIAVLVAGDPITEHTEIKKPYETHGGVTQPRTSKVYVTGKKAVQMADP
ISCGDTVAQASSKVFIK
>Q6WHG9 ~~~5~~~Protein Gp5~~~
MFMGLDGFEWWTGVVEDRTTDPLKLGRIKVRMIGLHPDKKSSEQGIRTEELLWVHPMQSLDNAAMNGIGNAPIGVVEGTW
VFGFFRDKLRQDAVAMGVLPGIPEDLPNGSVGFNDPNEKYPLADKLNEPDTNRLARNDVDPDVYDESQSQTAFDNGEAPY
VYRPHPIIASKRAAEEKEIPLAGYNAEGPKYDEKGTPYAAQYPYNHVRESESGHIHEIDDTEGAERLHTYHRTGTFEEIH
PDGSRVTKIIGDDFEIVHKNQNVYIKGNLNITVVGDATFYCQQNVTQQIDGDLKQHVKGNVDQHVEMNVTQTVDKDVTQV
VHQNVTQTVDMNVTQTVHQNVTQTVDGDVNQTVGGNVQSNVTGDYTQNISGNYTITVGGSMSESVSSSYTRSAASISDDG
GGATLNLAGSAALDGTTVSLG
>P35837 ~~~~~~Tail needle protein gp26~~~
MADPSLNNPVVIQATRLDASILPRNVFSKSYLLYVIAQGTDVGAIAGKANEAGQGAYDAQVKNDEQDVELADHEARIKQL
RIDVDDHESRITANTKAITALNVRVTTAEGEIASLQTNVSALDGRVTTAENNISALQADYVSKTATTSQSLASPLNVTTS
YSVGGKKVLGARQTGWTAATGTANKGVFDADLTFAVSDTYTQSEIQAIANALITERRRTKALEDALRAHGLID
>P16009 3.2.1.17~~~5~~~Pre-baseplate central spike protein Gp5~~~
MEMISNNLNWFVGVVEDRMDPLKLGRVRVRVVGLHPPQRAQGDVMGIPTEKLPWMSVIQPITSAAMSGIGGSVTGPVEGT
RVYGHFLDKWKTNGIVLGTYGGIVREKPNRLEGFSDPTGQYPRRLGNDTNVLNQGGEVGYDSSSNVIQDSNLDTAINPDD
RPLSEIPTDDNPNMSMAEMLRRDEGLRLKVYWDTEGYPTIGIGHLIMKQPVRDMAQINKVLSKQVGREITGNPGSITMEE
ATTLFERDLADMQRDIKSHSKVGPVWQAVNRSRQMALENMAFQMGVGGVAKFNTMLTAMLAGDWEKAYKAGRDSLWYQQT
KGRASRVTMIILTGNLESYGVEVKTPARSLSAMAATVAKSSDPADPPIPNDSRILFKEPVSSYKGEYPYVHTMETESGHI
QEFDDTPGQERYRLVHPTGTYEEVSPSGRRTRKTVDNLYDITNADGNFLVAGDKKTNVGGSEIYYNMDNRLHQIDGSNTI
FVRGDETKTVEGNGTILVKGNVTIIVEGNADITVKGDATTLVEGNQTNTVNGNLSWKVAGTVDWDVGGDWTEKMASMSSI
SSGQYTIDGSRIDIG
>P04324 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVVGWPAVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAANNAACAWLEAQEEEKVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWCYKLVPVEPEKLEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>P03407 ~~~nef~~~Protein Nef~~~
MGGKWSKRSMGGWSAIRERMRRAEPRAEPAADGVGAVSRDLEKHGAITSSNTAATNADCAWLEAQEEEEVGFPVRPQVPL
RPMTYKAALDISHFLKEKGGLEGLIWSQRRQEILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWCFKLVPVEPEKVEEA
NEGENNSLLHPMSLHGMEDAEKEVLVWRFDSKLAFHHMARELHPEYYKDC
>Q77378 ~~~nef~~~Protein Nef~~~
MGNALRKGKFEGWAAVRERMRRTRTFPESEPCAPGVGQISRELAARGGIPSSHTPQNNAALAFLESHQEEEVGFPVAPQV
PLRPMTYKGAFDLSFFLKEKGGLEGLIYSHKRAEILDLWVYNTQGFFPDWQNYTPGPGTRFPLTFGWLFKLVPVSEEEAE
RLGNTCERANLLHPACAHGFEDTHKEILMWKFDRSLGNTHVAMITHPELFQKD
>P03404 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVIGWPAVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAANNAACAWLEAQEEEKVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWCYKLVPVEPDKVEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>P03406 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>P04601 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVIGWPTVRERMRRAEPAADRVGAASRDLEKHGAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKIEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>P05854 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVVGWPAVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAANNAACAWLEAQEEEKVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWRYKLVPVEPEKLEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>Q70627 ~~~nef~~~Protein Nef~~~
MGGKWSKSSVIGWPTVRERMRRAEPAADGVGAASQDLEKHGAITSSNTAATNADCAWLEAQEEEEVGFPVTPQVPLRPMT
YKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGIRYPLTFGWCYKLVPVEPEKLEEANKGE
NTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC
>P04603 ~~~nef~~~Protein Nef~~~
MGGKWSKSSIVGWPKIRERIRRTPPTETGVGAVSQDAVSQDLDKCGAAASSSPAANNASCEPPEEEEEVGFPVRPQVPLR
PMTYKGAFDLSHFLKEKGGLDGLVWSPKRQEILDLWVYHTQGYFPDWQNYTPGPGIRFPLTFGWCFKLVPMSPEEVEEAN
EGENNCLLHPISQHGMEDAEREVLKWKFDSSLALRHRAREQHPEYYKDC
>P04602 ~~~nef~~~Protein Nef~~~
MGGRWSKSSIVGWPAIRERIRRTDPRRTDPAADGVGAASRDLEKHGAITSSNTRDTNADCAWLEAQEESEEVGFPVRPQV
PLRPMTYKLAVDLSHFLKEKGGLEGLIWSKKRQEILDLWVYNTQGIFPDWQNYTPGPGIRYPLTFGWCFELVPVDPREVE
EATEGETNCLLHPVCQHGMEDTEREVLKWRFNSRLAFEHKAREMHPEFYKDC
>P20868 ~~~nef~~~Protein Nef~~~
MGASGSKKRSEPSRGLRERLLQTPGEASGGHWDKLGGEYLQSQEGSGRGQKSPSCEGRRYQQGDFMNTPWRAPAEGEKGS
YKQQNMDDVDSDDDDLVGVPVTPRVPLREMTYRLARDMSHLIKEKGGLEGLYYSDRRRRVLDIYLEKEEGIIGDWQNYTH
GPGVRYPKFFGWLWKLVPVDVPQEGDDSETHCLVHPAQTSRFDDPHGETLVWRFDPTLAFSYEAFIRYPEEFGYKSGLPE
DEWKARLKARGIPFS
>P31818 ~~~nef~~~Protein Nef~~~
MGGTISMRRSRSTGDLRQRLLRARGETYERLLGEVEDGSSQSLGELDKGLSSLSCEGQKYNQEQYMNTPWRNPAEEREKL
AYRKQNMDDIDEEDDDLVGDTVRPKVPLRTMSYKLAIDMSHFIKEKGGLEGIYYSARRHRILDIYLEKEEGIIPDWQDYT
SGPGIRYPKTFGWLWKLVPVNVSDEAQEDEEHYLMHPAQTSQWDDPWGEVPAWKFDPTLAYTYEAYVRYPEEFGSKSGLS
EEEVRRRLTARGLLNMADKKETR
>P12482 ~~~nef~~~Protein Nef~~~
MGGAISKKQYKRGGNLRERLLQARGETYGRLWEGLEEGYSQSLGASGKGLSSLSCEPQKYSEGQYMNTPWRNPATERAKL
GYRQQNMDDVDDEDDDLIGVSVHPRVPLRAMTYKLAIDMSHFIKEKGGLEGIYYNERRHRILDMYLEKEEGIIPDWQNYT
SGPGIRYPMHYGWLWKLVPVDVSDEAQEDETHCLVHPAQTYQWDDPWGEVLAWKFDPELAYSYKAFIKYPEEFGSKSGLS
EEEVKRRLTARGLLKMADKKETS
>Q89733 ~~~NS~~~Nuclear export protein~~~
MDPNTVSSFQDILMRMSKMQLGSSSEDLNGIITQFESLKLYRDSLGEAVMRMGDLHSLQNRNGKWREQLGQKFEEIRWLI
EEVRHRLKITENSFEQITFMQALQLLLEVEQEIRTFSFQLI
>P03508 ~~~NS~~~Nuclear export protein~~~
MDPNTVSSFQDILLRMSKMQLESSSGDLNGMITQFESLKLYRDSLGEAVMRMGDLHSLQNRNEKWREQLGQKFEEIRWLI
EEVRHKLKITENSFEQITFMQALHLLLEVEQEIRTFSFQLI
>P03511 ~~~NS~~~Nuclear export protein~~~
MADNMTTTQIEWRMKKMAIGSSTHSSSVLMKDIQSQFEQLKLRWESYPNLVKSTDYHQKRETIRLATEELYLLSKRIDDS
ILFHKTVIANSSIIADMIVSLSLLETLYEMKDVVEVYSRQCL
>P08014 ~~~NS~~~Nuclear export protein~~~
MADNMTTTQIEWRMKKMAIGSSTHSSSVLMKDIQSQFEQLKLRWESYPNLVKSTDYHQRRETIRLVTEELYLLSKRIDDN
ILFHKTVIANSSIIADMIVSLSLLETLYEMKDVVEVYSRQCL
>Q01640 ~~~NS~~~Nuclear export protein~~~
MSDKTVKSTNLMAFVATKMLERQEDLDTCTEMQVEKMKTSTKARLKTESSFAPRTWEDAIKDEILRRSVDTSSLDKWPEL
KQELENVSDALKADSLWLPMKSLSLYSKVSNQEPSSIPIGEMKHQILTRLKLICSRLEKLDLNLSKAVLGIQNSEDLILI
IYNRDVCKNTILMIKSLCNSLI
>P33493 ~~~NS~~~Nuclear export protein~~~
MSDKTVKSTNLMAFVATKMLERQEDLDTCTEMQVEKMKTSTKARLRTESSFAPRTWEDAIKDEILRRSVDTSSLNKWPEL
KQELENVSDALKADSLWLPMKSLSLYSRVSNQEPSSIPIGEMKHQILTRLKLICSRLEKLDLNLSKAVLGIQNSEDLILI
IYNRDICKNTILMIKSLCNSLI
>Q8V3U1 ~~~Segment-7~~~Nuclear export protein~~~
MNLLLLLQVASFLSDSKVPGEDGTSSTSGMLDLLRDQVDSLSINDSTTEPKTRLDPGLYPWLKWTETAYRSSTRSLASTI
VMGALVQQRGSGNGITMRELELSLGLDFTSECDWLKTCYVNKNFVFLSEKEIAVNMEVEKFICNEN
>P06903 ~~~ner~~~Negative regulator of transcription~~~
MHMNKRTNRQDWHRADIVAELRKRNMSLAELGRSNHLSSSTLKNALDKRYPKAEKIIADALGMTPQDIWPSRY
>P06020 ~~~ner~~~Negative regulator of transcription~~~
MCSNEKARDWHRADVIAGLKKRKLSLSALSRQFGYAPTTLANALERHWPKGEQIIANALETKPEVIWPSRYQAGE
>Q65192 ~~~~~~NifS-like protein~~~
MASILALDGLYAEVPKFLPEALREGCAGKNPLSFYIQQILNLMGCDGNEYHVLFTGSSEEANTHMIMAAVRRHLLRTQQR
PHVIIGAAEPPSVTECVKALAQEKRCVYTIIPLKNFEIDPVAVYDAIQSNTCLACISGTNAVVKTFNKLQDISKVLKGIP
LHSEVSELVYQGCIKQNPPADSFSINSLYGFLGVGVLGMKKKVMQGLGPLIFGGGLRGGSPNIPGIHAMYRTLTQQRPSM
KKINTIHKLFMKTLKKHQHVYLPIGGVSAEDTSAENISTKDIPVEGPKELPGYILFSVGRRAEELQKKIFTKFNIKVGRI
VDLQEILFRIKIPQKYWETLLFIQLRDNLTKEDIKRVMVVLMHLDTITPRGSLPPPSYSSSFS
>P03765 ~~~ninB~~~Protein ninB~~~
MKKLTFEIRSPAHQQNAIHAVQQILPDPTKPIVVTIQERNRSLDQNRKLWACLGDVSRQVEWHGRWLDAESWKCVFTAAL
KQQDVVPNLAGNGFVVIGQSTSRMRVGEFAELLELIQAFGTERGVKWSDEARLALEWKARWGDRAA
>Q1XBW5 ~~~UL3~~~Nuclear phosphoprotein UL3~~~
MVKPLVSYGSVMSGVGGEGVPSALAILASWGWTFDTPNHESGISPDTTPADSIRGAAVASPDQPLHGGPEREATAPSFSP
TRADDGPPCTDGPYVTFDTLFMVSSIDELGRRQLTDTIRKDLRLSLAKFSIACTKTSSFSGNAPRHHRRGAFQRGTRAPR
SNKSLQMFVLCKRTHAARVREQLRVVIQSRKPRKYYTRSSDGRLCPAVPVFVHEFVSSEPMRLHRDNVMLASGAE
>P0C012 ~~~UL3~~~Nuclear phosphoprotein UL3~~~
MVKSRVSYRSVMSGVGEERVPSAFTILASWGWTFAPQNHDPGASPNTTPIESIAGTAPDAHVGPLDGEPDRDAISPLTSS
VAGDPPGADGPYVTFDTLFMVSSIDELGRRQLTDTIRKDLRLSLAKFSIACTKTSSFSGTAARQRKRGAPPQRTCVPRSN
KSLQMFVLCKRANAAQVREQLRAVIRSRKPRKYYTRSSDGRLCPAVPVFVHEFVSSEPMRLHRDNVMLSTEPD
>Q3YPH5 ~~~NP1~~~Non-structural protein NP-1~~~
MSSGNMKDKHRSYKRKGSPERGERKRHWQTTHHRSRSRSPIRHSGERGSGSYHQEHPISHLSSCTASKTSDQVMKTREST
SGKKDNRTNPYTVFSQHRASNPEAPGWCGFYWHSTRIARDGTNSIFNEMKQQFQQLQIDNKIGWDNTRELLFNQKKTLDQ
KYRNMFWHFRNNSDCERCNYWDDVYRRHLANVSSQTEADEITDEEMLSAAESMEADASN
>Q9Q927 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MENHLPNIFYFPNCVTTFPYRYTQKELDDMKPDERERFKYATFPIIKHRWSHAYVVKDNHVFKLNVETSKRLRRATPPTL
SVPVSMNTMLREYITQDGTKISFECYSYLTCKKATSLHDLDDNAIRGLVEGGNRLQLFTNSVGSVKDTVGIFGNPNPFMK
VPLKSLHPSMQCKIFESWIHHVPVVLTGDTGVGKTSQVPKLLLWFNYLFGGFVNLSTITFDVQEKPIVLSLPRVALVKLH
SETLLTSLGFNEIHKSPVSLKFGNMQEQFVNTRFRRYGIVFSTHKITLNTLFNYSTVILDEVHEHDQTGDIIIAVCRKYI
RKLDSLFLMTATLEDDRQRIEEFFTESVFVHIPGGTLFSISEAYVKNSNDSLNKFMYIEEEKRNLVNAIKTYTPPKQSSG
IVFVSTVSQCDVYKQYLSERLPYKFYIIHGKIQNINDLLSDIYDNEGVSIIISTPYLESSVTVQNATHVYDTGRVYIPSP
YGGREVFISKSMRDQRKGRVGRVKPGMYIYFYDVSELRPIKRIDFEFLHNYVLYSKVFDLQLPEDLFVKPTNMTRLRDVI
EYIRSFNISDGVWTRLLSSYYIHILEYAKVYARGGQSAAALDSFERTGNLTDDALDAIKSLNMRAKIISHRKASTHTYAL
MCRLLFGVYAGKTFIAYHKRPLTGYITMITEHSFIPEY
>O57193 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSKTERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEARGKKVSPPSLGK
PAHINLTAKQYIYSEHTISFECYSFLKCITNTEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITDFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLDIDEILEKVYSSPNVSIIISTPYLESSVTIRNVTHIYDMGRVFVPAPFGGS
QEFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRIDSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVQEAILSLNLRIKILNFKHKDDDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYMNMISDTIFVPVDNN
>P20502 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSKTERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEARGKKVSPPSLGK
PAHINLTTKQYIYSEHTISFECYSFLKCITNTEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITDFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLDIDEILEKVYSSPNVSIIISTPYLESSVTIRNVTHIYDMGRVFVPAPFGGS
QEFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRIDSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVREAILSLNLRIKILNFKHKDDDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYMNMISDTIFVPVDNN
>Q9JFC3 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSKTERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEARGKKVSPPSLGK
PAHINLTAKQYIYSEHTISFECYSFLKCITNTEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITDFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLDIDEILEKVYSSPNVSIIISTPYLESSVTIHNVTHIYDMGRVFVPAPFGGS
QQFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRIDSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVREAILSLNLRIKILNFKHKDDDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYINIISDTIFVPVDNN
>P12927 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSKTERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEARGKKVSPPSLGK
PAHINLTAKQYIYSEHTISFECYSFLKCITNTEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITNFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLDIDEILEKVYSSPNVSIIISTPYLESSVTIRNVTHIYDMGKVFVPAPFGGS
QEFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRIDSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVREAILSLNLRIKILNFKHKDDDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYMNMISDTIFVPVDNN
>P0DSU5 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSNMERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEAHGKKVSPPSLGK
PSHINLTAKQYIYSEHTISFECYSFLKCITNAEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITDFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLEIDKILEKVYSSPNVSIIISAPYLESSVTIHNVTHIYDMGRVFVPAPFGGS
QQFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRINSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVQEAILSLNLRIKILNFKHKDNDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYMNMISDTIFVPVDNN
>P0DSU6 3.6.4.13~~~~~~RNA helicase NPH-II~~~
MEKNLPDIFFFPNCVNVFSYKYSQDEFSNMSNMERDSFSLAVFPVIKHRWHNAHVVKHKGIYKVSTEAHGKKVSPPSLGK
PSHINLTAKQYIYSEHTISFECYSFLKCITNAEINSFDEYILRGLLEAGNSLQIFSNSVGKRTDTIGVLGNKYPFSKIPL
ASLTPKAQREIFSAWISHRPVVLTGGTGVGKTSQVPKLLLWFNYLFGGFSTLDKITDFHERPVILSLPRIALVRLHSNTI
LKSLGFKVLDGSPISLRYGSIPEELINKQPKKYGIVFSTHKLSLTKLFSYGTLIIDEVHEHDQIGDIIIAVARKHHTKID
SMFLMTATLEDDRERLKVFLPNPAFIHIPGDTLFKISEVFIHNKINPSSRMAYIEEEKRNLVTAIQMYTPPDGSSGIVFV
ASVAQCHEYKSYLEKRLPYDMYIIHGKVLEIDKILEKVYSSPNVSIIISTPYLESSVTIHNVTHIYDMGRVFVPAPFGGS
QQFISKSMRDQRKGRVGRVNPGTYVYFYDLSYMKSIQRINSEFLHNYILYANKFNLTLPEDLFIIPTNLDILWRTKEYID
SFDISTETWNKLLSNYYMKMIEYAKLYVLSPILAEELDNFERTGELTSIVQEAILSLNLRIKILNFKHKDNDTYIHFCKI
LFGVYNGTNATIYYHRPLTGYMNMISDTIFVPVDNN
>P68950 ~~~L2~~~Pre-histone-like nucleoprotein~~~
MSILISPSNNTGWGLRFPSKMFGGAKKRSDQHPVRVRGHYRAPWGAHKRGRTGRTTVDDAIDAVVEEARNYTPTPPPVST
VDAAIQTVVRGARRYAKMKRRRRRVARRHRRRPGTAAQRAAAALLNRARRTGRRAAMRAARRLAAGIVTVPPRSRRRAAA
AAAAAISAMTQGRRGNVYWVRDSVSGLRVPVRTRPPRN
>P68951 ~~~L2~~~Pre-histone-like nucleoprotein~~~
MSILISPSNNTGWGLRFPSKMFGGAKKRSDQHPVRVRGHYRAPWGAHKRGRTGRTTVDDAIDAVVEEARNYTPTPPPVST
VDAAIQTVVRGARRYAKMKRRRRRVARRHRRRPGTAAQRAAAALLNRARRTGRRAAMRAARRLAAGIVTVPPRSRRRAAA
AAAAAISAMTQGRRGNVYWVRDSVSGLRVPVRTRPPRN
>P68965 ~~~L2~~~Pre-histone-like nucleoprotein~~~
MAILISPSNNTGWGLGTHKLFGGAKQKSDQHPVYVQAHYRAPWGSKGRRRPGRARGVPLDPKTEAEVVATIDEVARNGPP
AARLVLEAARRVGAYNLRRARKLTPAGRAMAAMRARQMVNQAKRRKRRVRSK
>Q89707 ~~~~~~Pre-histone-like nucleoprotein~~~
MSILISPSDNRGWGANMRYRRRASMRGVGRRRLTLRQLLGLGSRRRRRSRPTTVSNRLVVVSTRRRSSRRRR
>P03685 ~~~6~~~Histone-like protein p6~~~
MAKMMQREITKTTVNVAKMVMVDGEVQVEQLPSETFVGNLTMEQAQWRMKRKYKGEPVQVVSVEPNTEVYELPVEKFLEV
ATVRVEKDEDQEEQTEAPEEQVAE
>Q9IGQ6 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMVVGIISLILQIGNIISIWVSHSIQTGNQNHPETCNQSIITYENNTWVNQTYVNISNTNVVAGQDA
TSVILTGNSSLCPISGWAIYSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPYRTLMS
CPVGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFT
IMTDGPSNGQASYKILKIEKGKVTKSIELNAPNYHYEECSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYQIGYICSG
VFGDNPRPNDGTGSCGPVSSNGANGIKGFSFRYDNGVWIGRTKSTSSRSGFEMIWDPNGWTETDSSFSVRQDIVAITDWS
GYSGSFVQHPELTGLDCMRPCFWVELIRGQPKENTIWTSGSSISFCGVNSDTVGWSWPDGAELPFSIDK
>P03470 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMVVGIISLILQIGNIISIWISHSIQTGNQNHTGICNQGIITYNVVAGQDSTSVILTGNSSLCPIRG
WAIHSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPYRALMSCPVGEAPSPYNSRFES
VAWSASACHDGMGWLTIGISGPDNGAVAVLKYNGIITETIKSWRKKILRTQESECTCVNGSCFTIMTDGPSNGLASYKIF
KIEKGKVTKSIELNAPNSHYEECSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYQIGYICSGVFGDNPRPKDGPGSCG
PVSADGANGVKGFSYRYGNGVWIGRTKSDSSRHGFEMIWDPNGWTETDSRFSVRQDVVAMTDRSGYSGSFVQHPELTGLD
CMRPCFWVELIRGRPEEETIWTSGSIISFCGVNSDTVDWSWPDGAELPFTIDK
>P03468 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICLVVGLISLILQIGNIISIWISHSIQTGSQNHTGICNQNIITYKNSTWVKDTTSVILTGNSSLCPIR
GWAIYSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPYRALMSCPVGEAPSPYNSRFE
SVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNGIITETIKSWRKKILRTQESECACVNGSCFTIMTDGPSDGLASYKI
FKIEKGKVTKSIELNAPNSHYEECSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYQIGYICSGVFGDNPRPEDGTGSC
GPVYVDGANGVKGFSYRYGNGVWIGRTKSHSSRHGFEMIWDPNGWTETDSKFSVRQDVVAMTDWSGYSGSFVQHPELTGL
DCMRPCFWVELIRGRPKEKTIWTSASSISFCGVNSDTVDWSWPDGAELPFSIDK
>Q8JSD9 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMVVGIISLILQIGNIVSIWISHSIQTGNQNHTGTCDQSIITYKNSTWVNQTYVNISNTNVVAGKDT
TSVILAGNSSLCPIRGWAIYSKDNGVRIGSKGDVFVIREPFISCSHLECKTFFLTQGALLNDKHSNGTVKDRSPYRALMS
CPVGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDDGAVAVLKYNGIITETIKSWRKEILRTQESECVCVNGSCFT
IMTDGPSGGPASYKIFKIEKGKVTKSIELDAPNSHYEECSCYPDTSKVMCVCRDNWHGSNRPWVSFDQNLDYQMGYICSG
VFGDNPRPKDGKGSCGPVNVDGADGVKGFSYRYGNGGWIGRTKSNSSRKGFEMIWDPNGWTDPDSNFLVKQDIVAMTDWS
GYSGSFVQHPELTGLDCMRPCFWVELIRGRPKENTIWTSGSSISFCGVNSDTVDWSWPDDAELPLNIDK
>Q6XV27 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIICISATGMTLSVVSLLVGIANLGLNIGLHYKVGDTPNVNIPNVNGTNSTTTIINNNTQNNFTNITNIIQSKGG
ERTFLNLTKPLCEVNSWHILSKDNAIRIGEDAHILVTREPYLSCDPQGCRMFALSQGTTLRGRHANGTIHDRSPFRALIS
WEMGQAPSPYNTRVECIGWSSTSCHDGMSRMSICMSGPNNNASAVVWYGGRPITEIPSWAGNILRTQESECVCHKGVCPV
VMTDGPANNRAATKIIYFKEGKIQKIEELAGNAQHIEECSCYGAGGVIKCICRDNWKGANRPVITIDPEMMTHTSKYLCS
KVLTDTSRPNDPTNGNCDAPITGGSPDPGVKGFAFLDGENSWLGRTISKDSRSGYEMLKVPNAETDIQSGPISNQVIVNN
QNWSGYSGAFIDYWANKECFNPCFYVELIRGRPKESSVLWTSNSIVALCGSKKRLGSWSWHDGAEIIYFE
>P88838 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKLFASSGIAIALGIINLLIGISNMSLNISLYSKGENHKSDNLTCTNINQNNTTMVNTYINNTTIIDKNTKMENPG
YLLLNKSLCNVEGWVVIAKDNAIRFGESEQIIVTREPYVSCDPLSCKMYALHQGTTIRNKHSNGTTHDRTAFRGLISTPL
GNPPTVSNSEFICVGWSSTSCHDGVSRMTICVQGNNENATATVYYNKRLTTTIKTWAKNILRTQESECVCHNSTCVVVMT
DGPANNQAFTKVIYFHKGTIIKEEPLKGSAKHIEECSCYGHNQRVTCVCRDNWQGANRPVIEIDMNNLEHTSRYICTGVL
TDTSRPKDKAIGECFNPITGSPGAPGIKGFGFLNENNTWLGRTISPKLRSGFEMLKIPNAGTDPDSKIKERQEIVGNDNW
SGYSGSFIDYWNDNSECYNPCFYVELIRGRPEEAKYVEWTSNSLIALCGSPIPVGSGSFPDGAQIKYFS
>Q1K9Q1 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSVSLTIATVCFLMQIAILATTVTLHFKQHKCDSPASNQVMPCEPIIIERNITEIVYLNNTTIEKEICPE
VVEYRNWSKPQCQITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPGKCYQFALGQGTTLDNKHSNGTIHDRIPHRTLLM
NELGVPFHLGTKQVCVAWSSSSCHDGKAWLHVCVTGDDRNATASFIYDGRLVDSIGSWSQNILRTQESECVCINGTCTVV
MTDGSASGRADTRILFIKEGKIVHIGPLSGSAQHIEECSCYPRYPDVRCICRDNWKGSNRPVIDINMEDYSIDSSYVCSG
LVGDTPRNDDSSSNSNCRDPNNERGNPGVKGWAFDNGDDVWMGRTISKDLRSGYETFKVIGGWSTPNSKSQVNRQVIVDN
NNWSGYSGIFSVEGKSCINRCFYVELIRGRPQETRVWWTSNSIVVFCGTSGTYGTGSWPDGANINFMPI
>Q710U6 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMIVGIISLILQIGNIISIWVSHSIQTGNQNQPEICNQSIITYENNTWVNQTYVNISNTNFVTEQAL
APVALAGNSSLCPISGWAIYSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPYRTLMS
CPIGESPSPYNSRFESVAWSASACHDGIGWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACMNGSCFT
IMTDGPSNGQASYKIFKIEKGKVVKSVELNAPNYHYEECSCYPDAGEIMCVCRDNWHGSNRPWVSFNQNLEYQIGYICSG
VFGDNPRPNDGAGSCGPVSSNGAYGVKGFSFKYGKGVWIGRTKSTSSRSGFEMIWDPNGWTETDSSFSVKQDIVAITDWS
GYSGSFVQHPELTGLDCMRPCFWVELIRGRPKENTIWTSGSSISFCGVNSDTVGWSWPDGAELPFTIDK
>Q07599 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSISLGLVVFNVLLHVVSIIVTVLVLGKGGNNGICNETVVREYNETVRIEKVTQWHNTNVVEYVPYWNGG
TYMNNTEAICDAKGFAPFSKDNGIRIGSRGHIFVIREPFVSCSPIECRTFFLTQGSLLNDKHSNGTVKDRSPFRTLMSVE
VGQSPNVYQARFEAVAWSATACHDGKKWMTVGVTGPDSKAVAVIHYGGVPTDVVNSWAGDILRTQESSCTCIQGDCYWVM
TDGPANRQAQYRIYKANQGRIIGQTDISFNGGHIEECSCYPNDGKVECVCRDGWTGTNRPVLVISPDLSYRVGYLCAGIP
SDTPRGEDTQFTGSCTSPMGNQGYGVKGFGFRQGTDVWMGRTISRTSRSGFEILRIKNGWTQTSKEQIRKQVVVDNLNWS
GYSGSFTLPVELSGKDCLVPCFWVEMIRGKPEEKTIWTSSSSIVMCGVDYEVADWSWHDGAILPFDIDKM
>P06820 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSVSLTIATVCFLMQIAILVTTVTLHFKQHECDSPASNQVMPCEPIIIERNITEIVYLNNTTIEKEICPK
VVEYRNWSKPQCQITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPVKCYQFALGQGTTLDNKHSNDTVHDRIPHRTLLM
NELGVPFHLGTRQVCIAWSSSSCHDGKAWLHVCITGDDKNATASFIYDGRLVDSIGSWSQNILRTQESECVCINGTCTVV
MTDGSASGRADTRILFIEEGKIVHISPLAGSAQHVEECSCYPRYPGVRCICRDNWKGSNRPVVDINMEDYSIDSSYVCSG
LVGDTPRNDDRSSNSNCRNPNNERGTQGVKGWAFDNGNDLWMGRTISKDLRSGYETFKVIGGWSTPNSKSQINRQVIVDS
DNRSGYSGIFSVEGKSCINRCFYVELIRGRKQETRVWWTSNSIVVFCGTSGTYGTGSWPDGANINFMPI
>P03472 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKILCTSATALVIGTIAVLIGITNLGLNIGLHLKPSCNCSHSQPEATNASQTIINNYYNDTNITQISNTNIQVEER
AIRDFNNLTKGLCTINSWHIYGKDNAVRIGEDSDVLVTREPYVSCDPDECRFYALSQGTTIRGKHSNGTIHDRSQYRALI
SWPLSSPPTVYNSRVECIGWSSTSCHDGKTRMSICISGPNNNASAVIWYNRRPVTEINTWARNILRTQESECVCHNGVCP
VVFTDGSATGPAETRIYYFKEGKILKWEPLAGTAKHIEECSCYGERAEITCTCRDNWQGSNRPVIRIDPVAMTHTSQYIC
SPVLTDNPRPNDPTVGKCNDPYPGNNNNGVKGFSYLDGVNTWLGRTISIASRSGYEMLKVPNALTDDKSKPTQGQTIVLN
TDWSGYSGSFMDYWAEGECYRACFYVELIRGRPKEDKVWWTSNSIVSMCSSTEFLGQWDWPDGAKIEYFL
>Q9IGQ0 3.2.1.18~~~NA~~~Neuraminidase~~~
MNTNQRIITIGTICLIVGIISLLLQIGNIILLWMSHSIQTGEKSHPKVCNQSVITYENNTWVNQTYVNISNTNIAAGQGV
TPIILAGNSSLCPISGWAIYSKDNSIRIGSKGDIFVMREPFISCSHLECRTFFLTQGALLNDRHSNGTVKDRSPYRTLMS
CPIGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNKILRTQESECVCINGSCFT
IMTDGPSNGQASYKLFKMEKGKIIRSIELDAPNYHYEECSCYPDTGKVVCVCRDNWHASNRPWVSFDQNLDYQIGYICSG
VFGDNPRSNDGKGNCGPVLSNGANGVKGFSFRYGNGVWIGRTKSISSRSGFEMIWDPNGWTETDSSFSMKQDIIALTDWS
GYSGSFVQHPELTGMNCIRPCFWVELIRGQPKESTIWTSGSSISFCGVNSGTASWSWPDGADLPFTIDK
>P05803 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKILCTSATALVIGTIAVLIGIVNLGLNIGLHLKPSCNCSRSQPEATNASQTIINNYYNETNITQISNTNIQVEER
ASREFNNLTKGLCTINSWHIYGKDNAVRIGEDSDVLVTREPYVSCDPDECRFYALSQGTTIRGKHSNGTIHDRSQYRDLI
SWPLSSPPTVYNSRVECIGWSSTSCHDGRARMSICISGPNNNASAVIWYNRRPVTEINTWARNILRTQESECVCQNGVCP
VVFTDGSATGPAETRIYYFKEGKILKWEPLTGTAKHIEECSCYGEQAGVTCTCRDNWQGSNRPVIQIDPVAMTHTSQYIC
SPVLTDNPRPNDPTVGKCNDPYPGNNNNGVKGFSYLDGGNTWLGRTISIASRSGYEMLKVPNALTDDRSKPTQGQTIVLN
TDWSGYSGSFMDYWAEGECYRACFYVELIRGRPKEDKVWWTSNSIVSMCSSTEFLGQWNWPDGAKIEYFL
>Q6TXB9 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSASLGILILNVILHVVSIIVTVLVLNNNGTGLNCNGTIIREYNETVRVERVIQWYNTNTIEYIERPSNE
YYMNNTEPLCEAQGFAPFSKDNGIRIGSRGHVFVIREPFVSCSPSECRTFFLTQGSLLNDKHSNGTVKDRSPYRTLMSVK
IGQSPNVYQARFESVAWSATACHDGKKWMTVGVTGPDNQAVAVVNYGGVPVDIINSWAGDILRTQESSCTCIKGDCYWVM
TDGPANRQAKYRIFKAKDGRIIGQTDISFNGGHIEECSCYPNEGKVECVCRDNWTGTNRPILVISPDLSYTVGYLCAGIP
TDTPRGEDSQFTGSCTSPLGNKGYGVKGFGFRQGTDVWAGRTISRTSRSGFEIIKIRNGWTQNSKDQIRRQVIIDNPNWS
GYSGSFTLPVELTKKGCLVPCFWVEMIRGKPEETTIWTSSSSIVMCGVDHKIASWSWHDGAILPFDIDKM
>Q9Q0U7 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMVVGIISLMLQIGNIISIWVSHSIQTGNQHQAEPCNQSIITYENNTWVNQTYVNISNTNFLTEKAV
ASVTLAGNSSLCPISGWAVHSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMS
CPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFT
VMTDGPSNGQASYKIFKMEKGKVVKSVELNAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSG
VFGDNPRPNDGTGSCGPVSPNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSSFSVKQDIVAITDWS
GYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVGWSWPDDAELPFTIDK
>Q9W7Y7 3.2.1.18~~~NA~~~Neuraminidase~~~
MNPNQKIITIGSICMVVGIISLMLQIGNIISVWVSHIIQTWHPNQPEPCNQSINFYTEQAAASVTLAGNSSLCPISGWAI
YSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPYRTLMSCPVGEAPSPYNSRFESVAW
SASACHDGISWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNEQASYKIFKIE
KGRVVKSVELNAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDSPRPNDGTGSCGPVS
LNGAYGVKGFSFKYGNGVWIGRTKSTSSRSGFEMIWDPNGWTETDSSFSLKQDIIAITDWSGYSGSFIQHPELTGLNCMR
PCFWVELIRGRPKEKTIWTSGSSISFCGVNSDTVGWSWPDGADLPFTIDK
>P27907 3.2.1.18~~~NA~~~Neuraminidase~~~
MLPSTIQTLTLFLTSGGVLLSLYVSASLSYLLYSDILLKFSSKITAPTMTLDCANASNVQAVNRSATKEMTFLLPEPEWT
YPRLSCQGSTFQKALLISPHRFGEARGNSAPLIIREPFIACGPKECKHFALTHYAAQPGGYYNGTREDRNKLRHLISVKL
GKIPTVENSIFHMAAWSGSACHDGREWTYIGVDGPDSNALIKIKYGEAYTDTYHSYANNILRTQESACNCIGGDCYLMIT
DGSASGISKCRFLKIREGRIIKEIFPTGRVEHTEECTCGFASNKTIECACRDNSYTAKRPFVKLNVETDTAEIRLMCTET
YLDTPRPDDGSITGPCESNGDKGRGGIKGGFVHQRMASKIGRWYSRTMSKTERMGMELYVRYDGDPWTDSDALAHSGVMV
SMKEPGWYSFGFEIKDKKCDVPCIGIEMVHDGGKKTWHSAATAIYCLMGSGQLLWDTVTGVDMAL
>P03474 3.2.1.18~~~NA~~~Neuraminidase~~~
MLPSTVQTLTLLLTSGGVLLSLYVSASLSYLLYSDVLLKFSSTKTTAPTMSLECTNASNAQTVNHSATKEMTFPPPEPEW
TYPRLSCQGSTFQKALLISPHRFGEIKGNSAPLIIREPFVACGPKECRHFALTHYAAQPGGYYNGTRKDRNKLRHLVSVK
LGKIPTVENSIFHMAAWSGSACHDGREWTYIGVDGPDNDALVKIKYGEAYTDTYHSYAHNILRTQESACNCIGGDCYLMI
TDGSASGISKCRFLKIREGRIIKEILPTGRVEHTEECTCGFASNKTIECACRDNSYTAKRPFVKLNVETDTAEIRLMCTK
TYLDTPRPDDGSIAGPCESNGDKWLGGIKGGFVHQRMASKIGRWYSRTMSKTNRMGMELYVKYDGDPWTDSDALTLSGVM
VSIEEPGWYSFGFEIKDKKCDVPCIGIEMVHDGGKDTWHSAATAIYCLMGSGQLLWDTVTGVDMAL
>P07071 1.1.98.6~~~nrdD~~~Anaerobic ribonucleoside-triphosphate reductase~~~
MTIEKEIEGLIHKTNKDLLNENANKDSRVFPTQRDLMAGIVSKHIAKNMVPSFIMKAHESGIIHVHDIDYSPALPFTNCC
LVDLKGMLENGFKLGNAQIETPKSIGVATAIMAQITAQVASHQYGGTTFANVDKVLSPYVKRTYAKHIEDAEKWQIADAL
NYAQSKTEKDVYDAFQAYEYEVNTLFSSNGQTPFVTITFGTGTDWTERMIQKAILKNRIKGLGRDGITPIFPKLVMFVEE
GVNLYKDDPNYDIKQLALECASKRMYPDIISAKNNKAITGSSVPVSPMGCRSFLSVWKDSTGNEILDGRNNLGVVTLNLP
RIALDSYIGTQFNEQKFVELFNERMDLCFEALMCRISSLKGVKATVAPILYQEGAFGVRLKPDDDIIELFKNGRSSVSLG
YIGIHELNILVGRDIGREILTKMNAHLKQWTERTGFAFSLYSTPAENLCYRFCKLDTEKYGSVKDVTDKGWYTNSFHVSV
EENITPFEKISREAPYHFIATGGHISYVELPDMKNNLKGLEAVWDYAAQHLDYFGVNMPVDKCFTCGSTHEMTPTENGFV
CSICGETDPKKMNTIRRTCGYLGNPNERGFNLGKNKEIMHRVKHQ
>Q67726 ~~~ORF1~~~Non-structural polyprotein 1AB~~~
MAYGEPYYSSKPDKDFNFGSTMARRQMTPTMVTKLPKFVRNSPQAYDWIVRGLIFPTIGKTYFQRVVVITGGLEDGTYGS
FAFDGKEWVGIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKEMIPKPEPRKIQMKWIIMG
AVLTFLSLIPGGYAHSQTNNTIFTDMIAACKYSTETLTENLDLRIKLALANITISDKLDAVRQILNFAFVPRAHWLRTVF
YYIHYYEMWNIFMFVLAIGTVMRSARPGTDLVTLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITLAI
LAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVVA
TVPTRIKNVAFDFFQKLKQSGVRVGVNEFVVIKPGALCVIDTPEGKGTGFFSGNDIVTAAHVVGNNTFVNVCYEGLMYEA
KVRYMPEKDIAFLTCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTAAAMVHGNTLSYAVRTQDGMSGAPVCDKYG
RVLAVHQTNTGYTGGAVIIDPADFHPVKAPSQVELLKEEIERLKAQLNSATENATTVVTQQPSAALEQKSVSDSDVVDLV
RTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSGY
PDYDDEDYYDEDDDGWGMVGDDVEFDYTEVINFDQAKPIPAPRTTKQKICPEPEVESQPLDLSQKKEKQSEYEQQVVKST
KPQQLEHEQQVVKPIKPQKSEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAI
KTQNYSALPEAVFELDKAAYEAGLEGFLQRVKSKNKAPEKQGPKKLQRAPEDQGAQNYHSLDAWKLLLEPPRERRCVPAN
FPLLGHLPINRPIFDDKKPRDDLLGLLPEPTWHAFEEYGPTTWGPQAFIKSFDKFFYAEPIDFFSEYPQLCAFADWATYR
EFRYLEDTRVIHITATEKNTDSTPAYPKMNYFDTEENYLEAHGWAPYIREFTRVYKGDKPEVLWYLFLKKEIIKEEKIRN
SDIRQIVCADPIYTRIGACLEAHQNALMKQHTDTSVGQCGWSPMEGGFKKTMQRLVNKGNKHFIEFDWTRYDGTIPPALF
KHIKEIRWNFINKDQREKYRHVHEWYVNNLLNRHVLLPSGEVTLQTRGNPSGQFSTTMDNNMVNFWLQAFEFAYFNGPDR
DLWKTYDTVVYGDDRLSTTPSVPDDYEERVITMYRDIFGMWVKPGKVICRDSIVGLSFCGFTVNENLEPVPTSPEKLMAS
LLKPYKILPDLESLHGKLLCYQLLAAFMAEDHPFKVYVEHCLSRTAKQLRDSGLPARLTEEQLHRIWRGGPKKCDG
>Q3ZN06 ~~~ORF1~~~Non-structural polyprotein 1AB~~~
MAHGEPYYSSKPDKDFNFGSTMARRQMTPTMVAKLPNFVRNSPQAYDWIVRGLIFPTTGKTYFQRVVVITGGLEDGTYGS
FVFDGREWVEIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKELIPKPEPRKIQMKWIIVG
AVLTFLSLIPGGYAQSQINNTIFTDMIAACKYSTETLTENLDLRIKLALANITISDKLDAVRQILNFAFVPRAHWLRTVF
YYIHYYEMWNIFMFVLAIGTVMRSARPGTDLITLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITLAI
LAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVVA
TVPTRIKNVAFDFFQKLKQSGVRVGVNEFVVIKPGALCVIDTPEGKGTGFFSGNDIVTAAHAVGNNTFVNVCYEGLMYEA
KVRYMPEKDIAFITCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTAAAMVHGNTLSYAVRTQDGMSGAPVCDKYG
RVLAVHQTNTGYTGGAVIIDPTDFHPVKAPSRVELLKEEIERLKAQLNSAAENPATAVTQQPVVTLEQKSVSDSDVVDLV
RTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSGY
PDYDDEDYYDEDDDGWGVVGDDVEFDYTEVINFDQAKPTPAPRTVKPKTCPEPEAETQPLDLSQKKEKQLEHEQQVVKST
KPQKNEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAIKTQNYSALPEAVFEL
DKAAYEAGLEGFLQRVKSKKGSKKLQRAPEDQGAQNYHSLDAWKSLLEPPRERRCVPANFPLLGHLPINRPIFDDKKPRD
DLLGLLPEPTWHAFEEYGPTTWGPQAFIKSFDKFFYAEPIDFFSEYPQLCAFADWATYREFRYLEDTRVIHITATEKNTD
STPAYPKMNYFDTEENYLEAHGWAPYIREFTRVFKGDKPEVLWYLFLKKEIIKEEKIRNSDIRQIVCADPIYTRIGACLE
AHQNALMKQHTDTSVGQCGWSPMEGGFKKTMQRLVNKGNKHFIEFDWTRYDGTIPPALFKHIKEIRWNFINKDQREKYKH
VHEWYVDNLLNRHVLLPSGEVTLQTRGNPSGQFSTTMDNNMVNFWSQAFEFAYFNGPDKDLWKTYDTVVYGDDRLSTTPS
VPDDYEERVINMYRDIFGMWVKPGKVICRDSIVGLSFCGFTVNENLEPVPTSPEKLMASLLKPYKILPDLESLHGKLLCY
QLLAAFMAEDHPFKVYVEHCLSRTAKQLRDSGLPARLTEEQLHRIWRGGPKKCDG
>Q9IFX2 ~~~ORF1~~~Non-structural polyprotein 1AB~~~
MMALGEPYYSSKPDKDFNFGSTMARRQMTPTMVTKLPKFVRNSPQAYDWIVRGLIFPTTGKTYFQRVVVITGGLEDGTYG
SYAFNGSEWVEIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKELIPKPEPRKIQMKWIIV
GAVLTFLSLIPGGYAQSQTNNTIFTDMIAACKYSTETLTENLDLRIKLALANITINDKLDAVRQILNFAFVPRAHWLRTV
FYYIHYYEMWNIFMFVLAIGTVMRSARPGTDLITLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITMA
ILAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVV
ATVPTRIKNVAFDFFQKLKQSGVRVGVNDFVVIKPGALCIIDTPEGKGTGFFSGNDIVTAAHVVGNNTFVSVCYEGLVYE
AKVRYMPEKDIAFITCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTATAMVHGNTLSYAVRTQDGMSGAPVCDKY
GRVLAVHQTNTGYTGGAVIIDPADFHPVKAPSQVELLKEEIERLKAQLNSAAENPVTVVTQQPIVTLEQKSVSDSDVVDL
VRTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSG
YPDYDDEDYYDEDDDGWGMVGDDVEFDYTEVINFDQAKPTPAPRTTKPKPCPEPKIEAQPLDLSQKKEKQPEHEQQVAKP
TKPQKIEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAIKTQNYSALPEAVFE
LDKAAYEAGLEGFLQRVKSKKQGPKKLQRAPEDQGAQNYHSLDAWKSLLEPPRERRCVPANFPLLGHLPINRPIFDDKKP
RDDLLGLLPEPTWHAFEEYGPTTWGPQAFVKSFDKFFYAEPIDFFSEYPQLCAFADWATYREFRYLEDTRVIHITATEKN
TDSTPAYPKMNYFDTEEDYLEAHGWAPYIREFTRVFKGDKPEVLWYLFLKKEIIKEEKIRNSDIRQIVCADPIYTRIGAC
LEAHQNALMKQHTDTSVGQCGWSPMEGGFKKTMQRLVNKGNKHFIEFDWTRYDGTIPPALFKHIKEIRWNFINKDQREKY
RHVHEWYVDNLLNRHVLLPSGEVTLQTRGNPSGQFSTPMDNNMVNFWLQAFEFAYFNGPDKDLWKTYDTVVYGDDRLSTT
PSVPDNYEERVITMYRDIFGMWVKPGKVICRDSIVGLSFCGFTVNENLEPVPTSPEKLMASLLKPYKILPDLESLHGKLL
CYQLLAAFMAEDHPFKVYVEHCLSRTAKQLRDSGLPARLTEEQLHRIWRGGPKKCDG
>P0C6K4 ~~~ORF1~~~Non-structural polyprotein 1A~~~
MAYGEPYYSSKPDKDFNFGSTMARRQMTPTMVTKLPKFVRNSPQAYDWIVRGLIFPTIGKTYFQRVVVITGGLEDGTYGS
FAFDGKEWVGIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKEMIPKPEPRKIQMKWIIMG
AVLTFLSLIPGGYAHSQTNNTIFTDMIAACKYSTETLTENLDLRIKLALANITISDKLDAVRQILNFAFVPRAHWLRTVF
YYIHYYEMWNIFMFVLAIGTVMRSARPGTDLVTLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITLAI
LAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVVA
TVPTRIKNVAFDFFQKLKQSGVRVGVNEFVVIKPGALCVIDTPEGKGTGFFSGNDIVTAAHVVGNNTFVNVCYEGLMYEA
KVRYMPEKDIAFLTCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTAAAMVHGNTLSYAVRTQDGMSGAPVCDKYG
RVLAVHQTNTGYTGGAVIIDPADFHPVKAPSQVELLKEEIERLKAQLNSATENATTVVTQQPSAALEQKSVSDSDVVDLV
RTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSGY
PDYDDEDYYDEDDDGWGMVGDDVEFDYTEVINFDQAKPIPAPRTTKQKICPEPEVESQPLDLSQKKEKQSEYEQQVVKST
KPQQLEHEQQVVKPIKPQKSEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAI
KTQNYSALPEAVFELDKAAYEAGLEGFLQRVKSKNKAPKNYKGPQKTKGPKTTTH
>Q3ZN07 ~~~ORF1~~~Non-structural polyprotein 1A~~~
MAHGEPYYSSKPDKDFNFGSTMARRQMTPTMVAKLPNFVRNSPQAYDWIVRGLIFPTTGKTYFQRVVVITGGLEDGTYGS
FVFDGREWVEIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKELIPKPEPRKIQMKWIIVG
AVLTFLSLIPGGYAQSQINNTIFTDMIAACKYSTETLTENLDLRIKLALANITISDKLDAVRQILNFAFVPRAHWLRTVF
YYIHYYEMWNIFMFVLAIGTVMRSARPGTDLITLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITLAI
LAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVVA
TVPTRIKNVAFDFFQKLKQSGVRVGVNEFVVIKPGALCVIDTPEGKGTGFFSGNDIVTAAHAVGNNTFVNVCYEGLMYEA
KVRYMPEKDIAFITCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTAAAMVHGNTLSYAVRTQDGMSGAPVCDKYG
RVLAVHQTNTGYTGGAVIIDPTDFHPVKAPSRVELLKEEIERLKAQLNSAAENPATAVTQQPVVTLEQKSVSDSDVVDLV
RTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSGY
PDYDDEDYYDEDDDGWGVVGDDVEFDYTEVINFDQAKPTPAPRTVKPKTCPEPEAETQPLDLSQKKEKQLEHEQQVVKST
KPQKNEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAIKTQNYSALPEAVFEL
DKAAYEAGLEGFLQRVKSKNKAPKNYKGPQKTKGPKTITH
>Q9IFX3 ~~~ORF1~~~Non-structural polyprotein 1A~~~
MMALGEPYYSSKPDKDFNFGSTMARRQMTPTMVTKLPKFVRNSPQAYDWIVRGLIFPTTGKTYFQRVVVITGGLEDGTYG
SYAFNGSEWVEIYPIEHLNLMSSLKLIHKANALQERLRLSQEEKATLALDVQFLQHENVRLKELIPKPEPRKIQMKWIIV
GAVLTFLSLIPGGYAQSQTNNTIFTDMIAACKYSTETLTENLDLRIKLALANITINDKLDAVRQILNFAFVPRAHWLRTV
FYYIHYYEMWNIFMFVLAIGTVMRSARPGTDLITLATSHLSGFRMAVLPTIPFHTTMTLWVMNTLMVCYYFDNLLAITMA
ILAPILGIIFLCFMEDSNYVSQIRGLIATAVLIAGGHACLTLTGTTTSLFVVILTCRFIRMATVFIGTRFEIRDANGKVV
ATVPTRIKNVAFDFFQKLKQSGVRVGVNDFVVIKPGALCIIDTPEGKGTGFFSGNDIVTAAHVVGNNTFVSVCYEGLVYE
AKVRYMPEKDIAFITCPGDLHPTARLKLSKNPDYSCVTVMAYVNEDLVVSTATAMVHGNTLSYAVRTQDGMSGAPVCDKY
GRVLAVHQTNTGYTGGAVIIDPADFHPVKAPSQVELLKEEIERLKAQLNSAAENPVTVVTQQPIVTLEQKSVSDSDVVDL
VRTAMEREMKVLRDEINGILAPFLQKKKGKTKHGRGRVRRNLRKGVKLLTEEEYRELLEKGLDRETFLDLIDRIIGERSG
YPDYDDEDYYDEDDDGWGMVGDDVEFDYTEVINFDQAKPTPAPRTTKPKPCPEPKIEAQPLDLSQKKEKQPEHEQQVAKP
TKPQKIEPQPYSQTYGKAPIWESYDFDWDEDDAKFILPAPHRLTKADEIVLGSKIVKLRTIIETAIKTQNYSALPEAVFE
LDKAAYEAGLEGFLQRVKSKNKAPKNYKGPQKTKGPKTTTH
>P24030 3.1.21.-~~~NS1~~~Initiator protein NS1~~~
MAQAQIDEQRRLQDLYVQLKKEINDGEGVAWLFQQKTYTDKDNKPTKATPPLRTTSSDLRLAFDSIEENLTASNEHLTNN
EINFCKLTLGKTLLLIDKHVKSHRWDSNKVNLIWQIEKGKTQQFHIHCCLGYFDKNEDPKDVQKSLGWFMKRLNKDLAVI
YSNHHCDIQDIKDPEDRAKNLKVWIEDGPTKPYKYFNKQTKQDYNKPVHLRDYTFIYLFNKDKINTDSMDGYFAAGNGGI
VDNLTNKERKTLRKMYLDEQSSDIMDANIDWEDGQDAPKVTDQTDSATTKTGTSLIWKSCATKVTSKKEVANPVQQPSKK
LYSAQSTLDALFNVGCFTPEDMIIKQSDKYLELSLEPNGPQKINTLLHMNQVKTSTMITAFDCIIKFNEEEDDKPLLATI
KDMGLNEQYLKKVLCTILTKQGGKRGCIWFYGPGGTGKTLLASLICKATVNYGMVTTSNPNFPWTDCGNRNIIWAEECGN
FGNWVEDFKAITGGGDVKVDTKNKQPQSIKGCVIVTSNTNITKVTVGCVETNAHAEPLKQRMIKIRCMKTINPKTKITPG
MLKRWLNTWDRQPIQLSHEMPELYLGKCRW
>Q65694 ~~~1C~~~Non-structural protein 1~~~
MGSETLSVIQVRLRNIYDNDKVALLKITCHTNRLILLTHTLAKSVIHTIKLSGIVFIHIITSSDYCPTSDIINSANFTSM
PILQNGGYIWELMELTHCFQTNGLIDDNCEITFSKRLSDSELAKYSNQLSTLLGLN
>D0EZM8 3.1.21.-~~~NS1~~~Initiator protein NS1~~~
MAFNPPVIRAFSQPAFTYVFKFPYPQWKEKEWLLHALLAHGTEQSMIQLRNCAPHPDEDIIRDDLLISLEDRHFGAVLCK
AVYMATTTLMSHKQRNMFPRCDIIVQSELGEKNLHCHIIVGGEGLSKRNAKSSCAQFYGLILAEIIQRCKSLLATRPFEP
EEADIFHTLKKAEREAWGGVTGGNMQILQYRDRRGDLHAQTVDPLRFFKNYLLPKNRCISSYSKPDVCTSPDNWFILAEK
TYSHTLINGLPLPEHYRKNYHATLDNEVIPGPQTMAYGGRGPWEHLPEVGDQRLAASSVSTTYKPNKKEKLMLNLLDKCK
ELNLLVYEDLVANCPELLLMLEGQPGGARLIEQVLGMHHINVCSNFTALTYLFHLHPVTSLDSDNKALQLLLIQGYNPLA
VGHALCCVLNKQFGKQNTVCFYGPASTGKTNMAKAIVQGIRLYGCVNHLNKGFVFNDCRQRLVVWWEECLMHQDWVEPAK
CILGGTECRIDVKHRDSVLLTQTPVIISTNHDIYAVVGGNSVSHVHAAPLKERVIQLNFMKQLPQTFGEITATEIAALLQ
WCFNEYDCTLTGFKQKWNLDKIPNSFPLGVLCPTHSQDFTLHENGYCTDCGGYLPHSADNSMYTDRASETSTGDITPSDL
GDSDGEDTEPETSQVDYCPPKKRRLTAPASPPNSPASSVSTITFFNTWHAQPRDEDELREYERQASLLQKKRESRKRGEE
ETLADNSSQEQEPQPDPTQWGERLGFISSGTPNQPPIVLHCFEDLRPSDEDEGEYIGEKRQ
>Q86306 ~~~1C~~~Non-structural protein 1~~~
MGSNSLSMIKVRLQNLFDNDEVALLKITCYTDKLIHLTNALAKAVIHTIKLNGIVFVHVITSSDICPNNNIVVKSNFTTM
PALQNGGYIWEMMELTHCSQPNGLIDDNCEIKFSKKLSDSTMTNYMNQLSELLGFDLNP
>P0DOE9 ~~~1C~~~Non-structural protein 1~~~
MGSNSLSMIKVRLQNLFDNDEVALLKITCYTDKLIHLTNALAKAVIHTIKLNGIVFVHVITSSDICPNNNIVVKSNFTTM
PVLQNGGYIWEMMELTHCSQPNGLLDDNCEIKFSKKLSDSTMTNYMNQLSELLGFDLNP
>Q99AU3 ~~~NS~~~Non-structural protein 1~~~
MDSNTVSSFQVDCFLWHVRKRFADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILKEESDEALKMT
IASVPASRYLTDMTLEEMSRDWFMLMPKQKVAGSLCIRMDQAIMDKNIILKANFSVIFDRLETLILLRAFTEEGAIVGEI
SPLPSLPGHTDEDVKNAVGVLIGGLEWNDNTVRVSETLQRFAWRSSNENGRPPLPPKQKRKMARTIKSEV
>Q82506 ~~~NS~~~Non-structural protein 1~~~
MDPNTVSSFQVDCFLWHVRKRVADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILKEESDEALKMT
MASVPASRYLTDMTLEEMSRHWFMLMPKQKVAGPLCIRMDQAIMDKNIILKANFSVILDRLETLILLRAFTEEGTIVGEI
SPLPSLPGHTDEDVKNAVGVLIGGLEWNNNTVRVSETLQRFAWRSSNENGRPPLTPKQKRKMAGTIRSEV
>P03496 ~~~NS~~~Non-structural protein 1~~~
MDPNTVSSFQVDCFLWHVRKRVADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILKEESDEALKMT
MASVPASRYLTDMTLEEMSRDWSMLIPKQKVAGPLCIRMDQAIMDKNIILKANFSVIFDRLETLILLRAFTEEGAIVGEI
SPLPSLPGHTAEDVKNAVGVLIGGLEWNDNTVRVSETLQRFAWRSSNENGRPPLTPKQKREMAGTIRSEV
>P03495 ~~~NS~~~Non-structural protein 1~~~
MDSNTVSSFQVDCFLWHVRKQVVDQELGDAPFLDRLRRDQKSLRGRGSTLGLNIEAATHVGKQIVEKILKEESDEALKMT
MASTPASRYITDMTIEELSRDWFMLMPKQKVEGPLCIRIDQAIMDKNIMLKANFSVIFDRLETLILLRAFTEEGAIVGEI
SPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKTLQRFAWGSSNENGRPPLTPKQKRKMARTARSKVRRDKMAD
>P69270 ~~~NS~~~Non-structural protein 1~~~
MDSNTITSFQVDCYLWHIRKLLSMRDMCDAPFDDRLRRDQKALKGRGSTLGLDLRVATMEGKKIVEDILKSETDENLKIA
IASSPAPRYITDMSIEEISREWYMLMPRQKITGGLMVKMDQAIMDKRITLKANFSVLFDQLETLVSLRAFTDDGAIVAEI
SPIPSMPGHSTEDVKNAIGILIGGLEWNDNSIRASENIQRFAWGIRDENGGPPLPPKQKRYMARRVESEV
>Q38SQ2 ~~~NS~~~Non-structural protein 1~~~
MDSNTVSSFQVDCFLWHVRKQVVDQELSDAPFLDRLRRDQRSLRGRGSTLGLDIKAATHVGKQIVEKILKEESDEALKMT
MASTPASRYITDMTIEELSRNWFMLMPKQKVEGPLCIRMDQAIMEKNIMLKANFSVIFDRLETLVLLRAFTEEGAIVGEI
SPLPSFPGHTIEDVKNAIGVLIGGLEWNDNTVRVSKTLQRFAWGSSNENGGPPLTPKQKRKMARTARSKVRRDKMAD
>P03502 ~~~NS~~~Non-structural protein 1~~~
MADNMTTTQIEVGPGATNATINFEAGILECYERFSWQRALDYPGQDRLHRLKRKLESRIKTHNKSEPENKRMSLEERKAI
GVKMMKVLLFMDPSAGIEGFEPYCVKNPSTSKCPNYDWTDYPPTPGKYLDDIEEEPENVDHPIEVVLRDMNNKDARQKIK
DEVNTQKEGKFRLTIKRDIRNVLSLRVLVNGTFLKHPNGDKSLSTLHRLNAYDQNGGLVAKLVATDDRTVEDEKDGHRIL
NSLFERFDEGHSKPIRAAETAVGVLSQFGQEHRLSPEEGDN
>P12601 ~~~NS~~~Non-structural protein 1~~~
MADNMTTTQIEVGPGATNATINFEAGILECYERLSWQRALDYPGQDRLNRLKRKLESRIKTHNKSEPESKRMSLEERKAI
GVKMMKVLLFMNPSAGIEGFEPYCMKNSSNSNCPNCNWTDYPPTPGKCLDDIEEEPENVDDPTEIVLRDMNNKDARQKIK
EEVNTQKEGKFRLTIKRDIRNVLSLRVLVNGTFLKHPNGYKSLSTLHRLNAYDQSGRLVAKLVATDDLTVEDEEDGHRIL
NSLFERFNEGHSKPIRAAETAVGVLSQFGQEHRLSPEEGDN
>P08013 ~~~NS~~~Non-structural protein 1~~~
MADNMTTTQIEVGPGATNATINFEAGILECYERLSWQRALDYPGQDRLNRLKRKLESRIKTHNKSEPESKRMSLEERKAI
GVKMMKVLLFMNPSAGIEGFEPYCMKNSSNSNCPNCNWTDYPPTSGKCLDDIEEEPENVDDPTEIVLRDMNNKDARQKIK
EEVNTQKEGKFRLTIKRDIRNVLSLRVLVNGTFLKHPNGYKSLSTLHRLNAYDQSGRLVAKLVATDDLTVEDEEDGHRIL
NSLFERFNEGHSKPIRAAETAVGVLSQFGQEHRLSPEEGDN
>Q01639 ~~~NS~~~Non-structural protein 1~~~
MSDKTVKSTNLMAFVATKMLERQEDLDTCTEMQVEKMKTSTKARLRTESSFAPRTWEDAIKDGELLFNGTILQTESPTMT
PASVEMKGKKFPIDFAPSNIAPIGQNPIYLSPCIPNFDGNVWEATMYHHRGATLTKTMNCNCFQRTIWCHPNPSRMRLSY
AFVLYCRNTKKICGYLIAKQVAGIETGIRKCFRCIKSGFVMATDEISLTILQSIKSGAQLDPYWGNEKPDIDKTEAYMLS
LREAGP
>P03134 3.1.21.-~~~NS1~~~Initiator protein NS1~~~
MAGNAYSDEVLGATNWLKEKSNQEVFSFVFKNENVQLNGKDIGWNSYKKELQEDELKSLQRGAETTWDQSEDMEWETTVD
EMTKKQVFIFDSLVKKCLFEVLNTKNIFPGDVNWFVQHEWGKDQGWHCHVLIGGKDFSQAQGKWWRRQLNVYWSRWLVTA
CNVQLTPAERIKLREIAEDNEWVTLLTYKHKQTKKDYTKCVLFGNMIAYYFLTKKKISTSPPRDGGYFLSSDSGWKTNFL
KEGERHLVSKLYTDDMRPETVETTVTTAQETKRGRIQTKKEVSIKTTLKELVHKRVTSPEDWMMMQPDSYIEMMAQPGGE
NLLKNTLEICTLTLARTKTAFDLILEKAETSKLTNFSLPDTRTCRIFAFHGWNYVKVCHAICCVLNRQGGKRNTVLFHGP
ASTGKSIIAQAIAQAVGNVGCYNAANVNFPFNDCTNKNLIWVEEAGNFGQQVNQFKAICSGQTIRIDQKGKGSKQIEPTP
VIMTTNENITVVRIGCEERPEHTQPIRDRMLNIHLTHTLPGDFGLVDKNEWPMICAWLVKNGYQSTMASYCAKWGKVPDW
SENWAEPKVPTPINLLGSARSPFTTPKSTPLSQNYALTPLASDLEDLALEPWSTPNTPVAGTAETQNTGEAGSKACQDGQ
LSPTWSEIEEDLRACFGAEPLKKDFSEPLNLD
>Q9PZT1 3.1.21.-~~~NS1~~~Initiator protein NS1~~~
MELFRGVLQVSSNVLDCANDNWWCSLLDLDTSDWEPLTHTNRLMAIYLSSVASKLDFTGGPLAGCLYFFQVECNKFEEGY
HIHVVIGGPGLNPRNLTVCVEGLFNNVLYHLVTENVKLKFLPGMTTKGKYFRDGEQFIENYLMKKIPLNVVWCVTNIDGY
IDTCISATFRRGACHAKKPRITTAINDTSSDAGESSGTGAEVVPINGKGTKASIKFQTMVNWLCENRVFTEDKWKLVDFN
QYTLLSSSHSGSFQIQSALKLAIYKATNLVPTSTFLLHTDFEQVMCIKDNKIVKLLLCQNYDPLLVGQHVLKWIDKKCGK
KNTLWFYGPPSTGKTNLAMAIAKSVPVYGMVNWNNENFPFNDVAGKSLVVWDEGIIKSTIVEAAKAILGGQPTRVDQKMR
GSVAVPGVPVVITSNGDITFVVSGNTTTTVHAKALKERMVKLNFTVRCSPDMGLLTEADVQQWLTWCNAQSWDHYENWAI
NYTFDFPGINADALHPDLQTTPIVTDTSISSSGGESSEELSESSFFNLITPGAWNTETPRSSTPIPGTSSGESFVGSSVS
SEVVAASWEEAFYTPLADQFRELLVGVDYVWDGVRGLPVCCVQHINNSGGGLGLCPHCINVGAWYNGWKFREFTPDLVRC
SCHVGASNPFSVLTCKKCAYLSGLQSFVDYE
>Q80872 ~~~2a~~~Non-structural protein 2a~~~
MAVAYADKPNHFINFPLTHFQGFVLNYKGLQFQILDEGVDCKIQTAPHISLTMLDIQPEDYKSVDVAIQEVIDDMHWGDG
FQIKFENPHILGRCIVLDVKGVEELHDDLVNYIRDKGCVADQSRKWIGHCTIAQLTDAALSIKENVDFINSMQFNYKITI
NPSSPARLEIVKLGAEKKDGFYETIVSHWMGIRFEYTSPTDKLAMIMGYCCLDVVRKELEEGDLPENDDDAWFKLSYHYE
NNSWFFRHVYRKSFHFRKACQNLDCNCLGFYESPVEED
>P19738 ~~~2a~~~Non-structural protein 2a~~~
MAFADKPNHFINFPLAQFSGFMGKYLKLQSQLVEMGLDCKLQKAPHVSITLLDIKADQYKQVEFAIQEIIDDLAAYEGDI
VFDNPHMLGRCLVLDVRGFEELHEDIVEILRRRGCTADQSRHWIPHCTVAQFDEERETKGMQFYHKEPFYLKHNNLLTDA
GLELVKIGSSKIDGFYCSELSVWCGERLCYKPPTPKFSDIFGYCCIDKIRGDLEIGDLPQDDEEAWAELSYHYQRNTYFF
RHVHDNSIYFRTVCRMKGCMC
>P26625 ~~~2a~~~Non-structural protein 2a~~~
MAARMAFADKPNHFINFPLAQFSGFMGKYLKLQSQLVEMGLDCKLQKVPHVSITLLDIKADQYKQVEFAIQEIIDDLAAY
EGDIVFDNPHMLGRCLVLDVKGFEELHEDIVEILRRRGCTADQSRQWIPHCTVAQFDEEKEIKEMQFYFKLPFYLKHNNL
LTDARLELVKIGSSKVGGFYCSELSIWCGERLCYKPPTPKFSDIFGYCCIDKIRGDLEIGDLPPDDEEAWAELSYHYQRN
TYFFRHVHDNSIYFRTVCRMKGCMC
>Q65695 ~~~1B~~~Non-structural protein 2~~~
MSTPNPETTAQRLIVNDIRPLSIETEIISLTKDIITHTFIYLINHECIVRKLDERQATFTFLVNYEMKLLHKVGSTKYNK
YTEYNRKYGTLPMPIFINHDGFLECIGIKPTRHTPIIYKYDLNP
>Q86305 ~~~1B~~~Non-structural protein 2~~~
MDTTHNDTTPQRLMITDMRPLSLETTITSLTRDIITHRFIYLINHECIVRKLDERQATFTFLVNYEMKLLHKVGSTKYKK
YTEYNTKYGTFPMPIFINHDGFLECIGIKPTKHTPIIYKYDLNP
>P04543 ~~~1B~~~Non-structural protein 2~~~
MDTTHNDNTPQRLMITDMRPLSLETIITSLTRDIITHKFIYLINHECIVRKLDERQATFTFLVNYEMKLLHKVGSTKYKK
YTEYNTKYGTFPMPIFINHDGFLECIGIKPTKHTPIIYKYDLNP
>P0DJZ2 ~~~NS2~~~Non-structural protein NS2~~~
MAGNAYSDEVLGATNWLKEKSNQEVFSFVFKNENVQLNGKDIGWNSYKKELQEDELKSLQRGAETTWDQSEDMEWETTVD
EMTKKFGTLTIHDTEKYASQPELCTNSTCIGSRGPGFRALEHTKYSCCGHCRNPEHWGSWFQSLPRWSTEPNLVRDRGGF
ESVLRCGTVEERLQRAAELGLRYDGASS
>P59633 ~~~3b~~~ORF3b protein~~~
MMPTTLFAGTHITMTTVYHITVSQIQLSLLKVTAFQHQNSKKTTKLVVILRIGTQVLKTMSLYMAISPKFTTSLSLHKLL
QTLVLKMLHSSSLTSLLKTHRMCKYTQSTALQELLIQQWIQFMMSRRRLLACLCKHKKVSTNLCTHSFRKKQVR
>P19740 ~~~4b~~~Non-structural protein 4b~~~
MQGKCWFLENKALKPFVCFYGGDQFLYIGDRIVSYFSTNDLYVALRGRIDKDLSLSRKVELYNGECVYLFCEHPAVGIVN
TDFKLEIH
>P59634 ~~~6~~~ORF6 protein~~~
MFHLVDFQVTIAEILIIIMRTFRIAIWNLDVIISSIVRQLFKPLTKKNYSELDDEEPMELDYP
>P0DTC6 ~~~6~~~ORF6 protein~~~
MFHLVDFQVTIAEILLIIMRTFKVSIWNLDYIINLIIKNLSKSLTENKYSQLDEEQPMEID
>P59635 ~~~7a~~~ORF7a protein~~~
MKIILFLTLIVFTSCELYHYQECVRGTTVLLKEPCPSGTYEGNSPFHPLADNKFALTCTSTHFAFACADGTRHTYQLRAR
SVSPKLFIRQEEVQQELYSPLFLIVAALVFLILCFTIKRKTE
>P0DTC7 ~~~7a~~~ORF7a protein~~~
MKIILFLALITLATCELYHYQECVRGTTVLLKEPCSSGTYEGNSPFHPLADNKFALTCFSTQFAFACPDGVKHVYQLRAR
SVSPKLFIRQEEVQELYSPIFLIVAAIVFITLCFTLKRKTE
>P19743 ~~~7b~~~Non-structural protein 7b~~~
MIVVILVCIFLANGIKATAVQNDLHEHPVLTWDLLQHFIGHTLYITTHQVLALPLGSRVECEGIEGFNCTWPGFQDPAHD
HIDFYFDLSNPFYSFVDNFYIVSEGNQRINLRLVGAVPKQKRLNVGCHTSFAVDLPFGIQIYHDRDFQHPVDGRHLDCTH
RVYFVKYCPHNLHGYCFNERLKVYDLKQFRSKKVFDKINQHHKTEL
>Q7TFA1 ~~~7b~~~Protein non-structural 7b~~~
MNELTLIDFYLCFLAFLLFLVLIMLIIFWFSLEIQDLEEPCTKV
>P0DTD8 ~~~7b~~~ORF7b protein~~~
MIELSLIDFYLCFLAFLLFLVLIMLIIFWFSLELQDHNETCHA
>P04136 ~~~7~~~Non-structural protein 7~~~
MLVFLHAVFITVLILLLIGRLQLLERLLLNHSFNLKTVNDFNILYRSLAETRLLKVVLRVIFLVLLGFCCYRLLVTLM
>Q80H93 ~~~8b~~~ORF8b protein~~~
MCLKILVRYNTRGNTYSTAWLCALGKVLPFHRWHTMVQTCTPNVTINCQDPAGGALIARCWYLHEGHQTAAFRDVLVVLN
KRTN
>P0DTC8 ~~~8~~~ORF8 protein~~~
MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIELCVDEAGSKSPIQYIDIGNYT
VSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDFI
>P28890 ~~~~~~RNA-binding protein~~~
MTASESFVGMQVLAQDKEVKATFIALDRKLPANLKVPYMKNAKYRTCICPSSNHLVDDCVCEDVIIAYTAHRNNAVAALL
YSDGNVIHRSGTLKPKSQNRFDLRGFLTSVNPGESSRNEAGASKSTQKTYDRKDKSPSKSRNSKKGAKKSSSARKKKEYS
SNSETDLSSDSDANTRKSKRK
>Q85442 ~~~~~~RNA-binding protein~~~
MSGTLPLAMTASESFVGMQVLAQDKEVKATFIALDRKLPANLKVPYMKNAKYRTCICPSSNHLVDDCVCEDVIIAYTAHR
NNAVAALLYSDGNVIHRSGTLKPKSQNRFDLRGFLTSVNPGESSKAEAGTSKSTQKAYDRKDKSPSKGKNSKKGGKKSSS
ERKRKEYSSNSETDLSSDSDANTRKSKRK
>P31612 ~~~~~~RNA-binding protein~~~
MSNKESNVALQTLRVTKDMKDFLSHRIVGEPPANIKIEYQKIHRYRTCVCPSTGHISELCPSGDLILSLGAHRNVIAAAT
VYDVVKNKTKSTTSKAGTSSTLSSLGLSGFQKPKIGSKNKKTMFSKQNNSTNESDESEGEEGSSLNDLPKSDLINAIMEL
ASQGRNNSKGKGRRGGKR
>Q85443 ~~~~~~Non-structural protein 12A~~~
MFKSGSGSLKRSGSISSVKSFSGDSEKGLPPISRGSVSITSQNYEPLIVPANSSSFAAASDFVPEKTKSEGNLKNKSSVI
TGNFESSGPTNAHYNQNADGDRLVENLLLKEIAKGRGPSTSDARHTATDSRLSQEVKQPFSEENAGGNDLNTGRGSHGTG
DGIEQYHKSDCEERMSAYHKRVVDTFFKYFEYSAEDGHSTLYSDVAFLFGCGDLDLLVMSRYQEVMTLRARSAIYGIFCY
LQALTAYLTYLGAKVGQVIMLDEELEKYEIRLDVAQDDDPIVFQITTGVFTSGVAHDLRKLTQILEAFSLER
>Q8JZ13 ~~~~~~Non-structural protein 1~~~
MATFKDACYHYKKLNKLNSLVLKLGANDEWRPAPVTKYKGWCLDCCQYTNLTYCRGCALYHVCQWCSQYNRCFLDEEPHL
LRMRTFKDVVTKEDIEGLLTMYETLFPINEKLVNKFINSVKQRKCRNEYLLEWYNHLLMPITLQALTINLEDNVYYMFGY
YDCMEHENQTPFQFVNLLEKYDKLLLDDRNFHRMSHLPVILQQEYALRYFSKSRFLSKGKKRLSRSDFSDNLMEDRHSPT
SLMQVVRNCISIHIDDCEWNKACTLIVDARNYISIMNSSYTEHYSVSQRCKLFTKYKFGIVSKLVKPNYIFSSHESCALN
VHNCKWCQINNHYKVWEDFRLRKIYNNVMDFIRALVKSNVNVGHCSSQESVYKYVPDLFLICKTEKWSEAVEMLFNYLEP
VNVNGTEYVLLDYEVNWEVRGLVMQNMDGKVPRILNMNDTKKILSAMIFDWFDTRYMRETPMTTSTTNQLRTLNKRNELI
DEYDLELSDVE
>Q65696 ~~~~~~Non-structural protein 1~~~
MATFKDACYHYKKLNKLNSLVLKLGANDEWRPAPVTKYKGWCLDCCQYTNLTYCRGCALYHVCQWCSQYNRCFLDEEPHL
LRMRTFKDVVTKEDIEGLLTMYETLFPINEKLVNKFINFVKQRKCRNEYLLEWYNHLLMPITLQALTINLEDSAYYIFGY
YDCMECENQTPFQFVNLLEKYDKLLLDDRNFHRMSHLPAILQQEYALRYFSKSRFLSKGKKRLSRHDFSDNLMEDRHSPT
SLMQVVRNCISTHMNDCEWNKRCHVIVDAKNYISIMNSSYTEHYSVSQRCKLFTKYKFGIISKLVKPNYIFSNHESYALN
VHNCKWCQINNHYKVWEDFRLRKIYNNIMDFIRALVKSNGNVGHCSSQESVYKYIPDIFLICKKEKWNEAVKMLFNYLEP
VDINGTEYALLDYEVNWEVRGLVMQNMDGKVPRILNMNDTKKILSAIIFDWFDTRYMRETPMTTSTTNQLRTLNKKNELI
DEYDLELSDVE
>O56831 ~~~~~~Non-structural protein 1~~~
MATFKDACFHYRKITKLNRELLRIGANSVWIPVSSNKIKGWCIECCQLTELTFCHGCSLAHVCQWCIQNKRCFLDNEPHL
LKLRSFESPITKEKLQCIIDLYNLLFPINPGIINRFKKIVNQRKCRNEFEQSWYNQLLFPITLNAAVFKFHSREVYVFGL
YEGSSSCIDLPYRIVNCIDLYDRLLLDQINFERMSSLPASLQSVYANKYFKLSRLPSMKLKQIYYSDFSKQNLINKCKIK
SRIVLRNLTEFTWDSQVSLHNDVINNKEKILTALSTSSLKRFETHDLNLGRVKADIFELGHHCKPNYISSNHWQPASKVS
QCRWCNVKYVFRNMDWKMESMYNELLSFIQACYKSNVNVGNCSSIESAYPLVKDMLWHSITKYIDQTIEKLFNVMNPVKV
DGQQVISFHWQIDVALYIHIKMILKTETLPFAFTLNQFRSIIKGIVNQWYDVTELDYLPLCTEQTDKLVKLEEEGKISEE
HELLISDSEDDD
>Q86194 ~~~~~~Non-structural protein 1~~~
MATFKDACYHYRRLNKLNNLVLKLGANDEWRPAPVTKYKGWCLDCCQHTNLTYCRGCALYHVCQWCSQYNRCFLDEEPHL
LRMRTFKNVMTKEDIEGLLTMYETLFPINEKLVNKFTDFVKQRKCRNEYLLEWYNHLLMPITLQALTVKLEDNIYYICGY
YDCMEHENQTPLQFINLLEKYDKLLLDDRNFNRMSYLPTILQQEYALRYFSKSRFFSKKEKRLSRNDFSDNLMEDRHSPI
SLIQVIRNCISTHMNDSEWNKACTLVVDPKNYIDIINSSYTEHYSVSQRCKLFTKYKLGIVSKLVRPNYIFSSHESCALN
VHNCKWCQTNNHYKVWADFRLKKIYNNMMDFVRALTKSNGNVGHCSSQESESKCIPDIFLICEMEKWNGPVRVLFRYLEP
VDINGEEYVLLDYEVNWEVRGLIIQNMDGRVPRILNMDDVKKILSAIIFDWFDVRYMRETPLTTLTTNQLRALNRKNELI
DEYDLELSDVE
>Q9QNA9 ~~~~~~Non-structural protein 1~~~
MATFKDACYHYKRINKLNNTVLKLGVNDTWRPSPPTKYKGWCLDCCQHTDLTYCRGCTMYHVCQWCSQYGRCFLDNEPHL
LRMRTFKNEVTKDELKNLIDMYDTLFPMNQKIVCRFISNTRQHKCRNECMTQWYNHLLMPITLQSLAIELDGDIYYVFGY
YDNMNSINQTPFSFTNLVDIYDKLLLDNVNFVRMSFLPASLQQEYALRYFSKSRFISEQRKCVNDSHFSINVLENLHNPN
FKIQITRNCSEMSFDWNEACKLVKNAGAYFDILKTSHIEFYSVSTRCRIFTQCKLKIASKLIIPNYITSNHKTLATEVHN
CEWCSVNNSYTVWNDFRIKKIYDKVFNFLRAFSKFNINIGHCSSQEKMYEYVEDVLNVCNDERWKTSIIEIFNCLEPVEL
DDVKYVLFNHEINWDVINVLVHSIGKVPQILTLENVITIIQSIIYEWFDIRYMRNTPMVTFTIDKLRRLHTGLKTVEDDS
GISDVE
>Q82044 ~~~~~~Non-structural protein 1~~~
MATFKDTCYYYKRINKLNHAVLKLGVIDTWRPSPLTKYKGWCLDCCQHTDLTYCRGCTMYHDCQWCSQYGRCFLDSEPHL
LRMRTFKNEVTKNDLMNLIDMYDTLFPINQRIVDKFMNSTRQHKCRNECITQWYNHLLMPITLQSLSVELDGDVYYVFGY
YDSMSEINQTPFSFTNLIDMYDKLLLDNINFNRMSFLPVALQQEYALRYFSKSRFISEKRKCVSDLHFSANVIENLHNPS
FKIQITRNCSDLSSDWNGVCKLVKDVSAYFNVLKTSHIEFYSISTRCRVFTQHKLKIASKHIKPNYVTSNHKTSATEVHN
CKWCSINNSYTVWNDFRVKKIYDNIFNFLRALVKSNANVGHCSSQEKIYEYIKDVLDVCDDEKWKIAVTEIFNCLEPVEL
NNVKYALFNHEVNWDVINLLVQSVDKAPQILTLNDIVIIMKSIIYEWFDIRYMRNTPMTTFTVDKLRRLCTGVKTVDYDS
GISDVE
>Q82045 ~~~~~~Non-structural protein 1~~~
MATFKDACYYYKRINKLNHVVLKLGVNDTWRPSPPTKYKGWCLDCCQHTDLTYCQGCTMYHDCQWCSQYGRCFLDSEPHL
LRMRTFKNEVTKNDLMNLIDMYNTLFPINQKIVDKFINSTRQHKCRNECMTQWYNHLLMPITLQSLSIELDGDVYYVFGY
YDSMSDINQTPFSFANLIDIYDKLLLDNINFNRMSFLPVALQQEYALRYFSKSRFISEKRKCVSDLHFSANVIENLHNPS
FKIQITRNCIELSSDWNGACKLVEDVSAYFDMLKTSHIEFYSISTRCRVFTQHKLKMASKHIKPNYVTSNHRTSATEVHN
CKWCSINNSYAVWNDFRVKKIYDNIFNFLRALVKSNANVGHCSSQEKIYEHIEDVLDVCDDEKWKTAVTEIFNCLEPVEL
DAVKYVLFNHEVNWDVINLLVQSVGKVPQILTLNDIVIIMKSIIYEWFDIRYMRNTPMTTFTVDKLRQLCTGVKTVDYDS
GISDVE
>P87724 ~~~~~~Non-structural protein 1~~~
MATFKDACYHYKRINKLNQTVLKLGVNDTWRPSPPTKYKGWCLDCCQHTDLTYCRGCTIYHVCQWCSQYGRCFLDDEPHL
LRMRTFKNEVTKDNLKNLIDMYNTLFPITQKIIHRFINNTRQHKCRNECMTQWYNHLLMPITLQSLSIELDGDVYYIFGY
YDSMNNINQTPFSFTNLVDIYDKLLLDDVNFVRMSFLPTSLQREYALRYFSKSRFISEQRKCVNDSHFSINVLENLYNPN
FKVQITRNCSELSVDWNEACKLVKNVSAYFDILKTSHVEFYSVSTRCRIFTRCKLEMASKLIKPNYVTSNHKTLATEVRN
CKWCSINNSYTVWNDFRIKKIYNNIFSFLRALVKSNVNIGHCSSQEKIYEYVENVLNVCDDKRWKTSIMEIFNCLEPVEL
NDVKYVLFNYEINWDVINVLIHSIGKVPQILTLENVITIIQSIVYEWFDITYMRNTPMVTFTIDKLRRLHIGLKTVDSDS
GISDVE
>Q84940 ~~~~~~Non-structural protein 1~~~
MATFKDACYYYKRINKLNHAVLKLGVNDTWRPSPPTKYKGWCLDCCQHTDLTYCRGCTMYHVCQWCSQYGRCFLDNEPHL
LRMRTFKNEVTKDDLMNLVDMYDTLFPMNQKIVDKFINNTRQHKCRNECVNQWYNHLLMPITLQSLSIELDGDVYYIFGY
YDDMNNVNQTPFSFVNLVDIYDKLLLDDVNFTRMSFLPVTLQQEYALRYFSKSRFISEQRKCVSDSHFSINVLENLHNPS
FKMQITRNCSELSSDWNGACKLVKDTSAYFNILKTSHVEFYSISTRCRVFTQRKLKIASKLIKPNYITSNHRTSATEVHN
CKWCSINSSYTVWNDFRVKKIYDNIFNFLRALVKSNVNVGHCSSQEKIYECVENILDVCDNEKWKTSVTKIFNYLEPVEL
NAVNYVLFNHEVNWDVINVLVQSIGKVPQILTLNDVTTIMQSIIYEWFDTKYMRNTPMTTFTVDKLRRLCTGSKTVDYDS
GISDVE
>Q00033 ~~~~~~Non-structural protein 1~~~
MANSYREMLYWFGKTIDRNLPYVNTNGWRKQKGRKDGICLNCLDECKLYSCDHCGIKHKCGNCVLSECFLDVKNEFNKYR
WLVFDEEPDQAVLLQHWIMYKDYFLQKFNYRLATQAKILNMNKNQKFQLNEGRKRALSVPITSQFLKFRLFGKIYIQFGT
IMTNKIQPWLELSTLKIGYLQLLNVERCSELMATRGQFTTNVAKTACITEIKCRRPIYDNDCIIEAYLDKNDRGWKFAAI
LGRRKIPVTQKLAMEYFMKSLRAELFYYAHSRCHTLSNCPRWNEGLRLLNSSTLNIVFRRQFMNEIVEWFEIFSQYTGSH
YEFITECVHDKSAITAFKQEIEDYIKEGKQITLKSVVPEEHAAYRHILRLRESLMLAIDAALSRIRSQSMGVL
>Q6YLT2 ~~~~~~Non-structural protein 1~~~
MATFKDACFHYRRVTKLNRELLRIGANSVWTPVSSNKIKIKGWCIECCQLTGLTFCHGCSLAHVCQWCIQNKRCFLDNEP
HLLKLRTFESPITKEKLQCIINLYELLFPINHGVINKFKKTIKQRKCRNEFDKSWYNQLLLPITLNAAVFKFHSRDVYVF
GFYEGSSPCIDLPYRLVNCIDLYDKLLLDQVNFERMSSLPDNLQSIYANKYFKLSRLPSMKLKRIYYSDFSKQNLINKYK
TKSRIVLRNLTEFTWDSQTDLHHDLINDKDKILAALSTSSLKQFETHDLNLGRIKADIFELGHHCKPNYISSNHWQPASK
ISKCKWCNVKYAFRDMDWKMESMYNELLSFIQSCYKSNVNVGHCSSIEKAYPLVKDILWHSITEYIDQTVEKLFNTMNPV
QVNEQQVIKFCWQIDIALYMHIKMILETEALPFTFTLNQFNSIIKGIVNQWCDVAELDHLPLCTEQTDALVKLEEEGKLS
EEYELLISDSEDDD
>Q99FX5 ~~~~~~Non-structural protein 1~~~
MATFKDACFHYRRLTALNRRLCNIGANSICMPVPDAKIKGWCLECCQIADLTHCYGCSLPHVCKWCVQNRRCFLDNEPHL
LKLRTVKHPITKDKLQCIIDLYNIIFPINDKVIRKFERMIKQRKCRNQYKIEWYNHLLLPITLNAAAFKFDENNLYYVFG
LYEKSVSDIYAPYRIVNFINEFDKLLLDDINFTRMSNLPIELRNHYAKKYFQLSRLPSSKLKQIYFSDFTKETVIFNTYT
KTPGRSIYRNVTEFNWRDELELYSDLKNDKNKLIAAMMTSKYTRFYAHDNNFGRLKMTIFELGHHCQPNYVASNHPGNAS
DIQYCKWCNIKYFLSKIDWRIRDMYNLLMEFIKDCYKSNVNVGHCSSVENIYPLIKRLIWSLFTNHMDQTIEEVFNHMSP
VSVEGTNVIMLILGLNISLYNEIKRTLNVDSIPMVLNLNEFSSIVKSISSKWYNVDELDKLPMSIKSTEELIEMKNSGTL
TEEFELLISNSEDDNE
>A2T3M4 ~~~~~~Non-structural protein 1~~~
MATFKDACFHYRRLTALNRRLCNIGANSICMPVPDAKIKGWCLECCQIADLTHCYGCSLPHVCKWCVQNRRCFLDNEPHL
LKLRTVKHPITKDKLQCIIDLYNIIFPINDKVIRKFERMIKQRKCRNQYKIEWYNHLLLPITLNAAAFKFDENNLYYVFG
LYEKSVSDIYAPYRIVNFINEFDKLLLDDINFTRMSNLPIELRNHYAKKYFQLSRLPSSKLKQIYFSDFTKETVIFNTYT
KTPGRSIYRNVTEFNWRDELELYSDLKNDKNKLIAAMMTSKYTRFYAHDNNFGRLKMTIFELGHHCQPNYVASNHPGNAS
DIQYCKWCNIKYFLSKIDWRIRDMYNLLMEFIKDCYKSNVNVGHCSSVENIYPLIKRLIWSLFTNHMDQTIEEVFNHMSP
VSVEGTNVIMLILGLNISLYNEIKRTLNVDSIPMVLNLNEFSSIVKSISSKWYNVDELDKLPMSIKSTEELIEMKNSGTL
TEEFELLISNSEDDNE
>Q86197 3.6.4.-~~~~~~Non-structural protein 2~~~
MTQSVSLSDFIVKTEDGYMPSDRECIALDRYLSKEQKELRETFKDGKNDRAALRIKMFLCPSPSRRFTQHGVVPMREIKT
NTDMPSTLWTLVTDWLLNLLQDEENQEMFEDFISSKFPDVLASADKLARFAQRLEDRKDVLRKNFGKAMNAFGACFWAIK
PTFATEGKCNVVRASDDSIILEFQPVPEYFRCGKSKATFYKLYPLSDEQPVNGMLALKAVAGNQFFMYHGHGHIRTVPYH
ELLTLSNHSLVKIKKRSKTFLNHHSQLNVVVNFSICSME
>Q9PY93 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFVSFSLTEDKVVWYPINKKAVQTMLCAKVEKDQRSNYYDTILYGVAPPPEFRNRFKTNERYGLDYESDQYTELV
NLLADTLNMVSMPTEKFQFDIVKTVVQVRHLENLLCRIKDVNDILNANVKLRVKAVMIACNLVNETETTPLTESNDIVYQ
DSYFTITKLDYSNHKLLPLMADEYKITINTKTDIPDRNQTAFAAYIRYNFNKFAAISHGKRHWRLVLHSQLMSHAERLDR
KIKSDKKHGRQFSYDDGDMAFVHPGWKTCIGQLCGGTTFEVAKTSLYSIKPSKTVRTATNKIESDLISMVGN
>Q9QNA8 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLENDSYKFIPFNNLAIKCMLTAKVDKKDQDKFYNSIIYGIAPPPQFKKRYNTNDNSRGMNYETPMLIKV
AILICEALNSIKVTQSDVANVLSRVVSVRHLENLVLRKENHQDVLFHSKELLLKSVLIAIGQSKEIETTATAEGGEIVFQ
NVAFTMWKLTYLDHKLMPILDQNFIEYKITMNEDKPISDVHVKELIAELRWQYNRFAVITHGKGHYRVVKYSSVANHADR
VFATYKNNAKSGNVIDFNLLDQRIIWQNWYAFTSSMKQGFTLDVCKKLLFQKMKQERNPFKGLSTDRKMDEVSRIGI
>P09366 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLEKDSYKFIPFNSLAIKCMLTAKVDKKDQDKFYNSIVYGIAPPPQFKKRYNTNDNSRGMNFETSMFNKV
AILICEALNSIKVTQSDVANVLSRVVSVRHLENLVLRKENHQDVLFHSKELLLKAVLIAIGQSKEIETTATAEGGEIVFQ
NAAFTMWKLTYLDHKLMPILDQNFIEYKITLNEDKPISDACVKELVAELRWQYNRFAVITHGKGHYRVVKYSSVANHADR
VFATYKNNAKSGNVTDFNLLDQRIIWQNWYAFTSSMKQGNTLDVCKKLLFQKMKQEKNPFKGLSTDRKMDEVSHVGI
>Q86484 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFVSFSLTEDKVKWFPINKKAVKTMLCAKVEKDQRSNYYDTILYGVAPPPEFRNRFKTTERYGLDYESDQYSEVA
NLLADVLNMVSMPTEKFQFDIVKTVVQVRHLENLLLRIKDTDDILSENVKLRVKAVMIACNLVNETETTPLTESNEIVYQ
DSYFTITKLDYSSHKLLPLMADEYKITINTKTDIPDRDQTAFAAYIRYNFNKFAAISHGKRHWRLVLHSQLMSHAERLDR
KIKSDKKHGRQFAYDDGDMAFVHPGWKACIGQLCGGTTFEVAKTSLYSVKTSKTVRTATNKIESDLISMVGN
>Q86505 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLESDTYRFIPFNSLAIKCMLTAKVDKKDQDKFYNSIIYGIAPPPQFKKRYNTNDNSRGMNYETPMFNKV
AVLICEALNSIKVTQSDVASVLSKVISVRHLENLVLRRENHQDVLFHSKELLLRSVLIAIGHSKEIETTATAEGGEVVFQ
NAAFTMWKLTYLEHRLMPILDQNFIEYKITVNEDKPISESHVRELIAELRWQYNKFAVITHGKGHYRVVKYSSVANHADR
VYATFKSNNKNGNVIEFNLLDQRIIWQNWYAFTSSMKQGNTLEICKKLLFQKMKRESNPFKGLSTDRKMDEVSQIGI
>P03537 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLENDSYRFIPFNSLAIKCMLTAKVDKKDQDKFYNSIIYGIAPPPQFKKRYNTSDNSRGMNYETSMFNKV
AALICEALNSIKVTQSDVASVLSKIVSVRHLENLVLRRENHQDVLFHSKELLLKSVLIAIGHSKEIETTATAEGGEIVFQ
NAAFTMWKLTYLEHKLMPILDQNFIEYKITLNEDKPISESHVKELIAELRWQYNKFAVITHGKGHYRVVKYSSVANHADR
VYATFKSNNKNGNMIEFNLLDQRIIWQNWYAFTSSMKQGNTLEICKKLLFQKMKRESNPFKGLSTDRKMDEVSQIGI
>A2T3P0 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLENDSYKFIPFNNLAIKAMLTAKVDKKDMDKFYDSIIYGIAPPPQFKKRYNTNDNSRGMNFETIMFTKV
AMLICEALNSLKVTQANVSNVLSRVVSIRHLENLVIRKENPQDILFHSKDLLLKSTLIAIGQSKEIETTITAEGGEIVFQ
NAAFTMWKLTYLEHQLMPILDQNFIEYKVTLNEDKPISDVHVKELVAELRWQYNKFAVITHGKGHYRIVKYSSVANHADR
VYATFKSNVKTGVNNDFNLLDQRIIWQNWYAFTSSMKQGNTLDVCKRLLFQKMKPEKNPFKGLSTDRKMDEVSQVGV
>Q03242 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLENDSYKFIPFNNLAIKAMLTAKVDKKDMDKFYDSIIYGIAPPPQFKKRYNTNDNSRGMNFETIMFTKV
AMLICEALNSLKVTQANVSNVLSRVVSIRHLENLVIRKENPQDILFHSKDLLLKSTLIAIGQSKEIETTITAEGGEIVFQ
NAAFTMWKLTYLEHQLMPILDQNFIEYKVTLNEDKPISDVHVKELVAELRWQYNKFAVITHGKGHYRIVKYSSVANHADR
VYATFKSNVKTGVNNDFNLLDQRIIWQNWYAFTSTMKQGNTLDVCKRLLFQKMKPEKNPFKGLSTDRKMDEVSQVGV
>Q03243 3.6.4.-~~~~~~Non-structural protein 2~~~
MAELACFCYPHLENDSYKFIPFNNLAIKAMLTAKVDKKDMDKFYDSIIYGIAPPPQFKKRYNTNDNSRGMNFETIMFTKV
AMLICEALNSLKVTQANVSNVLSRVVSIRHLENLVIRKENPQDILFHSKDLLLKSTLIAIGQSKEIETTITAEGGEIVFQ
NAAFTMWKLTYLEHQLMPILDQNFIEYKVTLNEDKPISDVHVKELVAELRWQYNKFAVITHGKGHYRIVKYSSVANHADR
VYATFKSNVKTGVNNDFNLLDQRIIWQNWYAFTSSMKQGNTLDVCKRLLFQKMKPEKNPFKGLSTDRKMDEVSQVGV
>Q65701 ~~~~~~Non-structural protein 3~~~
MSKMESTQQMASSIINTSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDYVMDDSGVKNNLLGKAATYDQALNGKFGS
AARNRNWMADTRTTARLDEDVNKLRMMLSSKGIDQKMRVLNACFNVKRVPGKSSSIIKCTRLMRDKIERGEVEVDDSFVE
EKMEVDTIDWKSRYEQLEKRFESLKQRVNEKYTSWVQKAKKVNENMYSLQNVISQQQSQIADLQNYCNKLEVDLQNKISS
LVSSVEWYLKSMELPDEIKTDIEQQLNSIDVINPINAIDDFESLIRNIILDYDRIFLMFKGLMRQCNYEYTYE
>Q5K037 ~~~~~~Non-structural protein 3~~~
MALDALASILETVLRNCGINEISRVTTKFEEALDDCGMKVDDWREAYYKERFPKRMTATTMASQIMNFEIENLQLRNKAW
AEGADRKFRLLSSFEIGNKDGHTILVPKTRNAEILLANSTSDLKLSSFPSEAVAKLAEENEKMRKQIEHLREQQTSKSTA
TLCEALENMTERMKLIEREKETVRRMFLECDKTNQRLRKQIQICEEEATDRLVLVNSHHREEILIMKREIYRLQMENVTL
KEQIDSIEQELDHSNRIVRGLANRAGLVVDEVDSGNETSDLSDDSDHDDHENSESDLEDMMDPGEDERIPRGGENPRRQA
RMLQMREEMERLHEDMEILNLNLDLDI
>Q82051 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVSSIINSSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDLVMDDSGVKNNLMGKAATIDQALNGKFSS
SIRNRNWMTDSKTDARLDEDVNKLRLLLSSKGIDQKMRVLNACFNVKRIPGKSSSVIRCTRLMKEKIERGEVEVDDAFVE
EKMEVDTIDWKSRYEQLEKRFESLKQRVNEKYNNWVIKARKDNENMYSLQNVISQQQAHINELQIYNNKLERDLQSKIGS
VISSIEWYLRSMELSDDIKSDIEQQLNSIDQINPVNAYDDFESILRNLISDYDRMFIMFKGLLQQSNYIYTYE
>Q82050 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVSSIINTSFEAAVVAATSTLELMGIQYDYNEVFTRVKSKFDYVMDDSGVKNNLLGKAITIDQALNGKFGS
AIRNRNWMIDSKTVAKLDEDVNKLRMTLSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTRLMKDKLERGEVEVDDSYVD
EKMEIDTIDWKSRYDQLEKRFESLKQRVNEKYNAWVQKAKKVNENMYSLQNVISQQQNQIADLQQYCNKLEVDLQGKFSS
LVSSVEWYLRSMELPDDVKTDVEQQLNSIDLINPINAIDDIESLIRNLIQDYDRTFLMLKGLLKQCNYEYAYE
>Q82052 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMASSIINSSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDFVMDDSGVENNLMGKAATIDQALNGKFSS
SIRNRNWMTDSKTVARLDEDVNKLRLLLSSKGIDQKMRVLNACFSVKRVPGKSSSVIKCTRLMKEKIERGEVEVDDTFIE
ERMEIDTIDWKSRYDQLERRFESLKQRVNEKYNNWVIKARKVNENMNSLQNVISQQQAHINELQIYNNKLERDLQSKIGS
VISSIEWYLRSMELSDDIKSDIEQQLNSIDLINPVNAFDDFESILRNLISDYDRIFIMFKGLLQQSNYTYTYE
>Q82053 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVSSIINTSFEAAVVAATSTLELMGIQYDYNEVFTRVKSKFDYVMDDSGVKNNLLGKAITIAQALNGKFGS
AIRNRNWMSDSKTVAKLDEDVNKLRMTLSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTRLMKDKIERGEVEVDDSYVD
EKMEIDTIDWKSRYDQLEKRFESLKQRVNEKYNTWVQKAKKVNENMYSLQNVISQQQNQIADLQQYCNKLEADLQGKFSS
LVSSVEWYLRSMELPDDVKTDIEQQLNSIDLINPINAIDDIESLIRNLIQDYDRTFLMLKGLLKQCNYEYAYE
>Q82054 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVSSIINTSFEAAVVAATSTLELMGIQYDYNEVFTRVKSKFDYVMDDSGVKNNLLGKAITIAQALNGKFGS
AIRNRNWMTDSKTVAKLDEDVSKLRMTLSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTRLMKDKIEGGEVEVDDSYVD
EKMEIDTIDWKSRYDQLEKRFEALKQRVNEKYNTWVQKAKKVNENMYSLQNVISQQQNQISDLQQYCNKLEADLQGKFSS
LVSSVEWYLRSMELPDDVKNDIEQQLNSIDAINPINAIDDIESLIRNLIQDYDRTFLMLKGLLKQCNYEYAYE
>Q85014 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVSSIINTSFEAAVVAATSTLELMGIQYDYNEIFTRVKSKFDYVMDDSGVKNNLLGKAATIDQALNGKFGS
TIRNRNWMTDSKTVAKLDEDVNKLRMILSSKGIDQKMRVLNACFSVKRIPGKSSSVIKCTRLMKDKIERGEVEVDDSFVD
EKMEIDTIDWKARYDQLEKRFESLKQRVNEKYNSWVQKSKERNENMYSLQNVISQQQNQIADLQQYCNKLEADLQSKISS
LVSSVEWYLRSMELSDDVKTDIEQQLNSIDAINPINAIDDLESLIRNLIQDYDRTFLMLKGLVRQCNYECTYE
>P27586 ~~~~~~Non-structural protein 3~~~
MATQASVEWIFNIAGSAASASIAKAIKDAGGSEDFAKYVIARFYDNYKDSVDDTGVYNACIGRARTVDKALDDSRKAERN
EDWHTNLETISRLDLELAELKLILSNLGIKREDRVLNSMFSVVREEGKSSNTVMLKQNAVRMIEEGKLKIRVERNENYTA
SLKNKIEELECMIDAFEKGKEIIISLDAMNGEVKRDGNSCSYNSTAAFVSTIVGNPIKMYDESGKPLFDVGDYLNPKHII
DKMIENEIPIFKSDYRNNESPDFDVWNERSNLKIVSINDCHAICVFKFENAWWCFDDGVLNKYSGNGNPLIVANAKFQID
KILISGDVELNPGPDPLIRLNDCKTKYGIDIICRFYIVLDNDGSIIHMCYMRTGSAEAVAKGRSKKEAKRIAAKDILDQI
GL
>Q86504 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMASSIINTSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDYVMDDSGVKNNLLGKAATIDQALNGKFGS
AARNRNWMADTRTTARLDEDVNKLRMMLSSKGIDQKMRVLNACFNVKRVPGKSSSIIKCTRLMRDKIERGEVEVDDSFVE
EKMEVDTIDWKSRYEQLEKRFESLKQRVNEKYTSWVQKAKKVNENMYSLQNVISQQQSQIADLQNYCNKLEVDLQNKISS
LVSSVEWYLKSMELPDEIKTDIEQQLNSIDVINPINAIDDFESLIRNIILDYDRIFLMFKGLMRQCNYEYTYE
>Q8UZL8 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMASSIINSSFEAAVVAATSTLELMGIQYDYNEVYTRVKSKFDLVMDDSGVKNNLIGKAITIDQALNGKFSS
AIRNRNWMTDSRTVAKLDEDVNKLRIMLSSKGIDQKMRVLNACFSVKRIPGKSSSIVKCTRLMKDKLERGEVEVDDSFVE
EKMEVDTIDWKSRYEQLEKRFESLKHRVNEKYNHWVLKARKVNENMNSLQNVISQQQAHINELQMYNNKLERDLQSKIGS
VVSSIEWYLRSMELSDDVKSDIEQQLNSIDQLNPVNAIDDFESILRNLISDYDRLFIMFKGLLQQCNYTYTYE
>P03536 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMAVSIINSSFEAAVVAATSALENMGIEYDYQDIYSRVKNKFDFVMDDSGVKNNPIGKAITIDQALNNKFGS
AIRNRNWLADTSRPAKLDEDVNKLRMMLSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTKLMRDKLERGEVEVDDSFVD
EKMEVDTIDWKSRYEQLEQRFESLKSRVNEKYNNWVLKARKMNENMHSLQNVIPQQQAHIAELQVYNNKLERDLQNKIGS
LTSSIEWYLRSMELDPEIKADIEQQINSIDAINPLHAFDDLESVIRNLISDYDKLFLMFKGLIQRCNYQYSFGCE
>Q00721 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMAVSIINSSFEAAVVAATSALENMGIEYDYQDIYSRVKNKFDFVMDDSGVKNNLIGKAITIDQALNNKFGS
AIRNRNWLADTSRPAKLDEDVNKLRMMLSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTKLMRDKLERGEVEVDDSFVD
EKMEVDTIDWKSRYEQLEQRFESLKSRVNEKYNNWVLKARKMNENMHSLQNVISQQQAHIAELQVYNNKLERDLQNKIGS
LTSSIEWYLRSMELDPEIKADIEQQINSIDAINPLHAFDDLESVIRNLISDYDKLFLMFKGLIQRCNYQYSFGCE
>Q82049 ~~~~~~Non-structural protein 3~~~
MLKMESTQQMVISVINTSFEAAVVAATSTLELMGIQYDYNEVFTRVKSKFDYVMDDSGVKNNLLGKAITIDQALNGKLGS
AIRNRNWMTDSKTVAKLDEDVNKLRMILSSKGIDQKMRVLNACFSVKRIPGKSSSIIKCTRLMKDKIERGEVEVDDSYVD
EKMEIDTIDWKFRYDQLEKRFESLKQRVNEKYNTWVQKAKKVNENMYSLQNVISQQQNQIADLQQYRNKLETDLQGKFSS
LVSSVEWYLRSMELPDDVKTDIEQQLNSIDLINSINAIDDIESLIRNLIQDYDRTFLMLKGLLKQCNYEYTYE
>Q85436 ~~~~~~Non-structural protein 4~~~
MNQSRSFVTGRGRDLSRTPSALSSNSETPGSMSSPSEGKTNAWVNSAYVSNFPALGQSQGLPSHKCSALALRSSQTTYII
NFPRQHWNIMTFPNQSEAILATVASYAKDLDGKNSFAVFDTLKMPWSCRLGEKSCSGIDTLGHLADVHMHVLDPAEAEGK
NLSDSETVYVYVTPPNLTDVKPTTIVLTECAANAKSANDLRQYIVTQLRKMPSLPFGCTTYAPGFLSDGVCKEHPNLFTS
EELGAKIKVLTKLLIRCATSMSQDGSNAFCPKHPKVKIVHESNATSYILFNRPNGMVATNLILSDLPDDDCPTCWILKLA
ISEARYYALDGHHRCRSRIITPSVFRYLASIVIRVSMDSVLAPSDASSTDHAALVNMMCGIIQNTPAMRHVGISTGSEKV
NNRSMRVIIMQENADRATQMSALYHLFLDYFGALNGWGFYFCSLTSLYGEFHGFSVGFSGEITHVNVASVIAKNWDTQSG
IDNILEFKTITIPVHNEDIVCMVERTLAESFEVVINEHFNGASTIKVRRNGGDSRFNFTISNPRDAFLLLQKAVVDGGIL
QKILCRAMLKAIASLALRADREVQDVSFSFVLKMSLNPVNKSDPKSSELAHAAQMNSLPEFLASTPFTMQLGTLRDALLK
KTGNVTVINMARTTEEVSNDALQEILKSIGGNSMTLDDPAEPLSDIESIPDPPPRSWASEDEAVNSPQTYSSRRKARKAR
AASKLSK
>O92374 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITLMNDTLHSIIQDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVIKCCIVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIIKEMRRQLEMIDKLTTREIEQVELLKRIHDNLIIKPVDVIDMSKEFNQKNIKTLDEWE
SGKNPYEPLEVTASM
>A9Q1L1 ~~~~~~Non-structural glycoprotein 4~~~
MEGTSESPILDEFEANSNDYDNEFISRFSQNPLHAFSLFTNGNIQEYFMNNSLEKIIIHIVLIIISLCGIKAQTSKIIYI
VRLLFWKMYNVINNLVNKMINREKIADRQIVDNRFREFEERFRILLLQHDENIAKQDNIVQYNKLDNFAESIKSEFNLKV
AEMERRFQELKWRCDMIANKTMNTIVLTNTVDSTNKDEKIIFDEGSVVQYNRE
>P08434 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTSSVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIHDKLMIRAVDEIDMTKEINQKNVRTLEEWE
NGKNPYEPKEVTAAM
>P04513 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTLSVITLMNSTLHTILEDPGMAYFPYIVSVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIHDKLMIRTVDEIDMTKEINQKNVRTLEEWE
NGRNPYEPKEVTAAM
>P89063 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTLSVITLMNDTLHTIMEDPGMAYFPYIASVLTVLFTLHKASLPTMKIALKTSRCSYKVIKYCIVSIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIYDMLIATSVDKIDTTQEFNQKHFKTLNEWA
EGENPYKPREVTASL
>Q82030 ~~~~~~Non-structural glycoprotein 4~~~
MDKFTDLNYTLNVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCTVTIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDRLTTREIEQVELLKRIHDKLMVQSTGEIDMRKEINQKNVKTLEEWE
SGRNPYEPKEVTAAM
>O56850 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTLSVITLMNDTLHTIMEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALRTSRCSYKVIKYCIVSIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIHDMLIIKPVDKIDMSQEFNQRQFKTLNEWA
EGENPYEPKEVTASL
>Q82033 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITLMNDTLHSIIEDPGMAYFPYIASVLTVLFALHIASIPTMKIALKASKCSYKVIKYCIVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIVKEMRRQLEMIDKLTTREIEQVELLKRIHDNLITRPVDVIDMSKEFNQKNIKTLDEWE
SGKNPYEPSEVTASM
>Q82028 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTLSVITLMNNTLHTILEDPGMAYFPYIASVLIVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTIFNTLL
TLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIYDKLMVRSTGEIDMRKEINQKNVRTLEEWE
NGKNPYEPKEVTAAM
>Q9QNA6 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITLMNDTLHSIIQDPGMAYFTYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVIKYCIVTIINTLL
RLAGYKEQVTTKDEIEQQMDRIVKEMRRQLEMIDKLTTREIEQVELLKSIHDNLITRSVDVIDMSKEFNQKNIKTLDEWE
SGRNPYEPSEVTASM
>Q82034 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITSMNDTLHSIIEDPGMAYFTYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVLKYCIVTIINTFL
KLAGYKEQVTTKDEIEQQMDRIVKEMRRQLEMIDKLTTREIEQVELLKRIHDNLITRTVDVIDMSKEFNQKNIKTLDEWE
SGKNPYEPSEVTASM
>Q82029 ~~~~~~Non-structural glycoprotein 4~~~
MEKFTDLNYTLSVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTILNTLL
KLAGYKEQITAKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIYDKLIVRSTGEIDMTKEINQKNVRTLEEWE
SGKNPYEPKEVTAAM
>Q82035 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITSMNDTLHSIIEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKASKCSYKVIKYCVVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIVKEMRRQLEMIDKLTTREIEQIELLKRIHDNLITRPVNVIDMSMEFNQKNIKTLDEWE
SRKNPYEPSEVTASM
>P03535 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITSMNDTLHSIIQDPGMAYFLYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVIKYCIVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIVKEMRRQLEMIDKLTTREIEQVELLKRIHDNLITRPVDVIDMSKEFNQKNIKTLDEWE
SGKNPYEPSEVTASM
>P89059 ~~~~~~Non-structural glycoprotein 4~~~
MEKFTDLNYTLSVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTILNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIYDKLIVRSTGEIDMTKEINQKNVRTLEEWE
SGKNPYEPKEVTAAM
>O11982 ~~~~~~Non-structural glycoprotein 4~~~
MEKLADLNYTLGVITLLNDTLHNILEEPGMVYFPYVASALTVLFTMHKASLPAMKLAMRTSQCSYRIIKRVVVTLINTLL
RLGGYNDYLTDKDETEKQINRVVKELRQQLTMIEKLTTREIEQVELLKRIYDMMVVRHDREIDMSKETNQKAFNTLHDWG
NDRNYDDNTDVIAPL
>P89061 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITLMNDTLHSIIQDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVIKYCMVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIIKEMRRQLEMIDKLTTREIEQVELLKRIHDKLAARSVDAIDMSKEFNQKNIRTLDEWE
SGKNPYEPSEVTASM
>Q9YS17 ~~~~~~Non-structural glycoprotein 4~~~
MEFINQTFFSDYSEGKIDTIPYALGIVLALTNGSRILKFINLLISLLRKFIITSKTVIGKFKIENNTSHQNDDIHKEYEE
VMKQMREMRVHVTALFDSIHKDNMEWRMSESIRREKKREMKASTAENEVKIHTNDVNICDTSGLETEVCL
>Q06381 ~~~~~~Non-structural glycoprotein 4~~~
MDKLADLNYTLSVITLMNDTLHSIIQDPGMAYFPYIASVLTVLFALHKASIPTMKIALKTSKCSYKVIKYCMVTIINTLL
KLAGYKEQVTTKDEIEQQMDRIVKEMRRQPRMIDKLTTREIEQVELLKRIHDKLVTRPVDVIDMSKEFNQKNIKTLDEWE
SGKNPYEPSEVTASM
>Q6YLV5 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTSSVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIHDKLMIRAVDEIDMTKEINQKNVRTLEEWE
NGKNPYEPKEVTAAM
>P04512 ~~~~~~Non-structural glycoprotein 4~~~
MEKLTDLNYTLSVITLMNNTLHTILEDPGMAYFPYIASVLTGLFALNKASIPTMKIALKTSKCSYKVVKYCIVTIFNTLL
KLAGYKEQITTKDEIEKQMDRVVKEMRRQLEMIDKLTTREIEQVELLKRIYDKLTVQTTGEIDMTKEINQKNVRTLEEWE
SGKNPYEPREVTAAM
>Q8JNB2 ~~~~~~Non-structural glycoprotein 4~~~
MDKLTDLNYTLSVITLMNSTLHTILEDPGMAYFPYIASVLTVLFTLHKASIPTMKIALKTSKCSYKVVKYCIVTIFNALL
KLAGYKEQITTKDEIEKQTDRVVKEMRRQLEMIDKLTTREIEQVELLKRIHDKLMIRAVDEIDMTKEINQKNVKTLEEWE
NGKNPYEPKEVTAAM
>A9Q1L2 ~~~~~~Non-structural protein 5~~~
MSEVPRFELRNKRKVGKKQKVDIFGDKDDESMLQIDCETDSLISESVSSTHSYEDYSKAYKELTLETPADTNDSTSTIID
SAYEESWYDKTIKDEQTKENKKTDKKLKRIENIKENNQNDSTSMQIAQLSLRIQRIESETKLKTLDSAYNTIITQADNLT
TPQKKSLISAILATMR
>Q8V9C3 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIYKNESSSTTSTLSGKSIGRSEQYISPDAEAFSKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSLTQSRPSSNVGCDQVDFSLNKGIKVSANLDSSVSISTNVKKEKSKNDHRSRKHYPKIEAESDSDDYV
LDDSDSDDGKCKNCKYKRKYFALRMRMKQVAMQLIEDL
>Q8V9C4 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIYKNESSSTTSTLSGKSIGRSEQYISPDAEAFSKYMLSKSPEDIGPSDSASNDPLTSFSNRSNA
VKTNADAGVSMDSSTQSRPSSNVGCDQVDFSFNKGIKVSANLDSSVSISTNVKKEKSKNDHRSRKHYPKIEAESDSDDYV
LDDSDSDDGKCKNCKYKRKYFALRMRMKRVAMQLIEDL
>Q9QNA5 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIFKNESSSTTSTLSGKSIGRNEQYVSPDIDAFNKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSSTQSRPSSNVGCDQMDFSLNKGINVSASLDSCVSISTNQKKEKSKKDKSRKHYPRIEADSDSEDYVL
DDSDSDDGKCKNCKYKKKYFALRMRMKQVAMQLIEDL
>P19715 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIFKNESSSTTSTLSGKSIGRSEQYISPDAEAFNKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSSTQSRPSSNVGCDQVDFSLTKGINVSANLDSCVSISTDNKKEKSKKDKSRKHYPRIEADSDSEDYVL
DDSDSDDGKCKNCKYKKRCFALRVRMKQVAMQLIEDL
>P36358 ~~~~~~Non-structural protein 5~~~
MSDFGINLDAICDNVKYKSSNSRTGSQVSNRSSRRMDFVDEEELSTYFNSKASVTQSDSCSNDLAVKTSIITEAVICDES
EHVSADAIQEKEESIMQVDDNVMKWMMDSHDGISMNGGINFSRSKSKTGRSDFTESKSETSVSAHVSAGISSQLGMFNPI
QNTVKKEAISEMFEDEDGDGCTCRNCPYREKYLKLRNKMKSVLVDMINEM
>Q03054 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIFKNESSSTTSTLSGKSIGRSEQYISPDAEAFNKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSSTQSRPSSNVGCDQVDFSLTKGINVSANLDSCISISTDHKKEKSKKDKSRKHYPRIEADSDSEDYVL
DDSDSDDGKCKNCKYKKKYFALRMRMKQVAMQLIEDL
>Q9E8F2 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSISSSIYKNESSSTTSTLSGKSIGRSEQYISPDAEAFSKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSSTQSRPSSNVGCDQVDFSFNKGISMNANLDSSISISTSSKKEKSKSDHKSRKHYPKIEAESDSDDYI
LDDSDSDDGKCKNCKYKRKYFALRMRMKQVAMQLIEDL
>A2T3Q9 ~~~~~~Non-structural protein 5~~~
MSLSIDVTSLPSIPSTIYKNESSSTTSTLSGKSIGRSEQYISPDAEAFNKYMLSKSPEDIGPSDSASNDPLTSFSIRSNA
VKTNADAGVSMDSSAQSRPSSNVGCDQVDFSLNKGLKVKANLDSSISISTDTKKEKSKQNHKSRKHYPRIEAESDSDDYV
LDDSDSDDGKCKNCKYKKKYFALRMRMKQVAMQLIEDL
>P0C6Z2 ~~~~~~Non-structural protein 6~~~
MNHLQQRQLFLENLLVGVNNMFHQMQKRPVNTCCRSLQKILDHLILLQTIHSPAFRLDQMQLRQMQTLACLWIHQYNHDH
QVTLGAIKWISPLIKELK
>Q03056 ~~~~~~Non-structural protein 6~~~
MNRLLQRQLFLENLLVGVNSTFHQMQKHSINTCCRSLQRILDHLILLQTIHSPVFRLDRMQLRQMQTLACLWIHRRNHDL
QVTLGAIKWISP
>P11203 ~~~~~~Non-structural protein 6~~~
MNRLQQRQLFLENLLVGVNSTFHQMQKHSINTCCRSLQRILDHLILLQTIHSPVFRLDRMQLRQMQTLACLWIHQHNHDL
QVMSDAIKWISP
>P21945 ~~~BR1~~~Nuclear shuttle protein~~~
MYPSRNKRGSYFNQRRQYSRNHVWKRPTAAKRHDWKRRPSNTSKPNDEPKMSAQRIHENQYGPEFVMAQNSAISSFISYP
DLGRSEPNRSRSYIRLKQLRFKGTVKIEQVPLAMNMDGSTPKVEGVFSLVIVVDRKPHLGPSGCLHTFDELFGARIHSHG
NLSVTPALKDRYYIRHVCKRVLSVEKDTLMVDVEGSIPLSNRRINCWATFKDVDRESCKGVYDNISKNALLVYYCWMSDT
PAKASTFVSFDLDYIG
>Q96706 ~~~BR1~~~Nuclear shuttle protein~~~
MYPTKFRRGVSYSQRRFVSRNQSSKRGTFVRRTDGKRRKGPSSKAHDEPKMKLQRIHENQYGPEFVMTHNSALSTFINFP
VLGKIEPNRSRSYIKLNRLSFKGTVKIERVHADVNMDGVISKIEGVFSLVIVVDRKPHLSSTGGLHTFDEIFGARIHSHG
NLAITPGLKDRYYVLHVLKRVLSVEKDTLMVDLEGSTTISNRRYNCWASFNDLEHDLCNGVYANISKNAILVYYCWMSDA
MSKASTFVSYDLDYLG
>P21935 ~~~BR1~~~Nuclear shuttle protein~~~
MYSTSNRRGRSQTQRGSHVRRTGVKRSYGAARGDDRRRPNVVSKTQVEPRMTIQRVQENQFGPEFVLSQNSALSTFVTYP
SYVKTVPNRTRTYIKLKRVRFKGTLKIERGQGDTIMDGPSSNIEGVFSMVIVVDRKPHVSQSGRLHTFDELFGARIHCHG
NLSVVPALKDRYYIRHVTKRVVSLEKDTLLIDLHGTTQLSNKRYNCWASFSDLERDSCNGVYGNITKNALLVYYCWLSDA
QSKASTYVSFELDYLG
>Q80DP8 ~~~N~~~Non-structural protein NS-S~~~
MPRRRWRWTRMTLTRAHYKIDGQLCLHWRPNSGNSRDNLQIWWQLKNWLQNQLIQQGLSLMTI
>P0DTL3 ~~~~~~Non-structural protein~~~
MLSALALSTSSLCLSRYPLRASTVFLASSNIPFPSVSASLARPVATSAIPERPDLLMSPHGGLKAMMYLPLTNSLHQSTC
SWLTGPRGFSSPPLFRIRFLLLIMSDSISLTDITISPGTLYSARTLLLRAAVLALTRKPMSFLHFKAACW
>J3TRC5 ~~~NSS~~~Non-structural protein NS-S~~~
MSLSKASQPSVKSACVRLPIVVLEPNLAELSTSYVGLVSCKCSVLTCSMMRKMKAFTNTVWLFGNPNNPLHALEPAVEQL
LDEYSGDLGSYSQQEKSALRWPSGKPSVHFLQAAHLFFSLKNTWAVETGQENWRGFFHRITSGKKYKFEGDMVIDSCYKI
DERRRRMGLPDTFITGLNPIMDVALLQIESLLRVRGLTLNYHLFTSSFLDKPLLDSLYFAIWRDKKKDDGSYSQDEGARQ
DDPLNPLDELLYLSDLPKPLAHYLNKCPLHNIIMHDEEVREAYLNPIWGKDWPALSSSP
>P21698 ~~~NSS~~~Non-structural protein S~~~
MDYFPVISVDLQSGRRVVSVEYFRGDGPPRIPYSMVGPCCVFLMHHRPSHEVRLRFSDFYNVGEFPYRVGLGDFASNVAP
PPAKPFQRLIDLIGHMTLSDFTRFPNLKEAISWPLGEPSLAFFDLSSTRVHRNDDIRRDQIATLAMRSCKITNDLEDSFV
GLHRMIATEAILRGIDLCLLPGFDLMYEVAHVQCVRLLQAAKEDISNAVVPNSALIVLMEESLMLRSSLPSMMGRNNWIP
VIPPIPDVEMESEEESDDDGFVEVD
>P12792 ~~~NSS~~~Non-structural protein NS-S~~~
MINNNMMNSQYMFDYPAINIDVRCHRLLSSVSYVAYNKFHTHDVSTYEHCEIPLEKLRLGFGRRNSLADFYSLGELPASW
GPACYFSSVKPMMYTFQGMASDLSRFDLTSFSRKGLPNVLKALSWPLGIPDCEIFSICSDRFVRGLQTRDQLMSYILRMG
DSHSLDECIVQAHKKILQEARRLGLSDEHYNGYDLFREIGSLVCLRLINAEPFDTASSGEALDVRTVIRSYRASDPSTGL
TEYGNSLWTPIHSHVDENDESSSDSDF
>A0A0B5AC19 ~~~NSS~~~Non-structural protein NS-S~~~
MSLSKCSNVDLKSVAMNANTVRLEPSLGEYPTLRRDLVECSCSVLTLSMVKRMGKMTNTVWLFGNPKNPLHQLEPGLEQL
LDMYYKDMRCYSQRELSALRWPSGKPSVWFLQAAHMFFSIKNSWAMETGRENWRGLFHRITKGQKYLFEGDMILDSLEAI
EKRRLRLGLPEILITGLSPILDVALLQIESLARLRGMSLNHHLFTSPSLRKPLLDCWDFFIPVRKKKTDGSYSVLDEDDE
PGVLHGYPHLMAHYLNRCPFHNLIRFDEELRTAALNTIWGRDWPAIGDLPKEV
>P21699 ~~~NSS~~~Non-structural protein NS-S~~~
MQSRAVILKYRSGSGHKRSLPRFYIDCDLDTFDFEKDCSLIENEFPIYINNYKVVYKSKPTLSHFLIEKEFPAVLGPGMI
SAVRTRLYEPTMRELYQESIHQLKRSNKKYLLSALRWPTGIPTLEFIDYYFEELLFLSEFDPGSIQRYLKLLVKASGLYN
STNEEQIVEIHRRVLIEGKKHGLTAFDLPGNDILGDICVVQAARVTRLVAKTFSKMTRDTHLMIYFSISPVELVLSKLDK
KGDKRAKAKGLMSMSAARSYDYFMRTDLGFRETALSTFWAKDWPTPQETILSDKRCLKEDMRVTKWLPSPPHYPPL
>P0DTK1 ~~~N~~~Non-structural protein NS-S~~~
MNSKLSLPGKNLKMQKRRWKPTRMMLTRAHYRVDGQLCQHWRTNWQTSRGSLQIWCQVKKWVKSLLTRLGLSRMITSRRD
QAFDMEMSLM
>P22026 ~~~NSS~~~Non-structural protein NS-S~~~
MSYFTIQNEDLPQGFTFRPHDKIYDSLWEMMDDGYFPSTIPLKTTINGVDMPSVGWLEVDEGLYDILIDGLDVLRPTDEE
MIVSATGWPLEKNRALILNFFRNLRMDIIGTYTLQRSFITIMSIVLFGDQNPRLRRKKRSRVSLGKMLFDLALRMRSKIR
RMKLTEVQVTGQNLVKDLCLLHILDLQKRLVTRGTIAEKRFFTAIEQAPCNYEPKRYGMKKKHMNFMFESDRKNLTVHPT
LVNLEEHWITFESARERLLDTTFTKDWPVVGSL
>P05807 3.6.1.15~~~~~~Nucleoside triphosphatase I~~~
MSKSHAAYIDYALRRTTNMPVEMMGSDVVRLKDYQHFVARVFLGLDSMHSLLLFHETGVGKTMTTVYILKHLKDIYTNWA
IILLVKKALIEDPWMNTILRYAPEITKDCIFINYDDQNFRNKFFTNIKTINSKSRICVIIDECHNFISKSLIKEDGKIRP
TRSVYNFLSKTIALKNHKMICLSATPIVNSVQEFTMLVNLLRPGSLQHQSLFENKRLVDEKELVSKLGGLCSYIVNNEFS
IFDDVEGSASFAKKTVLMRYVNMSKKQEEIYQKAKLAEIKTGISSFRILRRMATTFTFDSFPERQNRDPGEYAQEIATLY
NDFKNSLRDREFSKSALDTFKRGELLGGDASAADISLFTELKEKSVKFIDVCLGILASHGKCLVFEPFVNQSGIEILLLY
FKVFGISNIEFSSRTKDTRIKAVAEFNQESNTNGECIKTCVFSSSGGEGISFFSINDIFILDMTWNEASLRQIVGRAIRL
NSHVLTPPERRYVNVHFIMARLSNGMPTVDEDLFEIIQSKSKEFVQLFRVFKHTSLEWIHANEKDFSPIDNESGWKTLVS
RAIDLSSNKNITNKLIEGTNIWYSNSNRLMSINRGFKGVDGRVYDVDGNYLHDMPDNPVIKIHDGKLIYIF
>P03781 ~~~~~~Nucleotide kinase gp1.7~~~
MGLLDGEAWEKENPPVQATGCIACLEKDDRYPHTCNKGANDMTEREQEMIIKLIDNNEGRPDDLNGCGILCSNVPCHLCP
ANNDQKITLGEIRAMDPRKPHLNKPEVTPTDDQPSAETIEGVTKPSHYMLFDDIEAIEVIARSMTVEQFKGYCFGNILKY
RLRAGKKSELAYLEKDLAKADFYKELFEKHKDKCYA
>Q6QGH9 ~~~obp~~~Putative replication origin binding protein~~~
MFSILQGHAGFSRDLATGIWREIKAEDYTFAKRFSKEHPEGKPASMPFKFDVIEEHDPQSLAEMLPLMRRLTSDPHIVAV
RGRCLAPKNNVRRKKGNFNVSNPSNIIAMDVDGILDTGGYDKFNLVGMARHIIKMLNSISEDMFPLDAGFIAHASSSAGL
KPGIRMHLMLESNVKVTQGQLKFLFTSINDSSKQKFGFDIADLAYYSSVQLHYFADPLFSDGIVDPFKAESKPRLVYVKG
SKVNLPNNLVDYETTRGEFKEEFYSLLDQIKGKKIASDKVEETISELEEADDGVYLRIIPKLYHRALEDGVDFAWLEREI
KPALSEYIATKDNSRNIQDYFNNGRKQALKAFVNNSKREIPLNLKGVPLKKLEVDSPPEVPYLKINIVPPKGHITFVKAS
LGTGKTTAVTKWLDAGVLPGNFLAVTNTRALVSSNAKKFSAGQYDKSVDMLNFKRGAIDRMSTTIHSLHKFKSFIGQIDT
IFIDECDAVMNDLLFAPVVKQRRECIQVLRDILMTAKTVILSDGDISAETIEAYGSLIDFDKPVAFYNHHRKMLSKAHAY
EFPDESSIWVALQTSLEMGEKSILVSDCGPDELNEKGMALRRNTGALVKEIHSNSTSDVDIRRILDYTTNELIDQQIDCL
LCSPSVTSGVDFNYFDNVFVITRTSNQAPNMRFQAIRRDRGAQNIYYFIDKSTSGFSAGSEQYNIDEGWLELAQQLYARR
RELESRNYTSTLRYYLLDQGATIDIFSESWGTIEGAGKEYTEERIKAILHSTPDYCAPRHADAYEAKLLLVRYYHLESIK
DVTVEHVEQYIKDKPNDRAAFFHKMHEMFWEDIKKCSNVTIKPFIEALKGKKKDFFLKTGQSANPKYARMYLGMMGIGKD
MNTENIVDWYRTYCKIECMPIPFKFMTEEERAMAEEVMSELGATNEDA
>P10193 ~~~UL9~~~Replication origin-binding protein~~~
MPFVGGAESGDPLGAGRPIGDDECEQYTSSVSLARMLYGGDLAEWVPRVHPKTTIERQQHGPVTFPNASAPTARCVTVVR
APMGSGKTTALIRWLREAIHSPDTSVLVVSCRRSFTQTLATRFAESGLVDFVTYFSSTNYIMNDRPFHRLIVQVESLHRV
GPNLLNNYDVLVLDEVMSTLGQLYSPTMQQLGRVDALMLRLLRICPRIIAMDATANAQLVDFLCGLRGEKNVHVVVGEYA
MPGFSARRCLFLPRLGTELLQAALRPPGPPSGPSPDASPEARGATFFGELEARLGGGDNICIFSSTVSFAEIVARFCRQF
TDRVLLLHSLTPLGDVTTWGQYRVVIYTTVVTVGLSFDPLHFDGMFAYVKPMNYGPDMVSVYQSLGRVRTLRKGELLIYM
DGSGARSEPVFTPMLLNHVVSSCGQWPAQFSQVTNLLCRRFKGRCDASACDTSLGRGSRIYNKFRYKHYFERCTLACLSD
SLNILHMLLTLNCIRVRFWGHDDTLTPKDFCLFLRGVHFDALRAQRDLRELRCRDPEASLPAQAAETEEVGLFVEKYLRS
DVAPAEIVALMRNLNSLMGRTRFIYLALLEACLRVPMATRSSAIFRRIYDHYATGVIPTINVTGELELVALPPTLNVTPV
WELLCLCSTMAARLHWDSAAGGSGRTFGPDDVLDLLTPHYDRYMQLVFELGHCNVTDGLLLSEEAVKRVADALSGCPPRG
SVSETDHAVALFKIIWGELFGVQMAKSTQTFPGAGRVKNLTKQTIVGLLDAHHIDHSACRTHRQLYALLMAHKREFAGAR
FKLRVPAWGRCLRTHSSSANPNADIILEAALSELPTEAWPMMQGAVNFSTL
>P03775 ~~~~~~Protein Ocr~~~
MAMSNMTYNNVFDHAYEMLKENIRYDDIRDTDDLHDAIHMAADNAVPHYYADIFSVMASEGIDLEFEDSGLMPDTKDVIR
ILQARIYEQLTIDLWEDAEDLLNEYLEEVEEYEEDEE
>Q00704 4.2.2.-~~~~~~Non-sulfated chondroitin lyase E66~~~
MSIVLIIVIVVIFLICFLYLSNSNNKNDANKNNAFIDLNPLPLNATTATTTTAVATTTTNNNNSIVAFRQNNIQELQNFE
RWFKNNLSYSFSQKAEKVVNPNRNWNDNTVFDNLSPWTSVPDFGTVCHTLIGYCVRYNNTSDTLYQNPELAYNLINGLRI
ICSKLPDPPPHQQAPWGPVADWYHFTITMPEVFMNITIVLNETQHYDEAASLTRYWLGLYLPTAVNSMGWHRTAGNSMRM
GVPYTYSQILRGYSLAQIRQEQGIQEILNTIAFPYVTQGNGLHVDSIYIDHIDVRAYGYLINSYFTFAYYTYYFGDEVIN
TVGLTRAIENVGSPEGVVVPGVMSRNGTLYSNVIGNFITYPLAVHSADYSKVLTKLSKTYYGSVVGVTNRLAYYESDPTN
NIQAPLWTMARRIWNRRGRIINYNANTVSFESGIILQSLNGIMRIPSGTTSTQSFRPTIGQTAIAKTDTAGAILVYAKFA
EMNNLQFKSCTLFYDHGMFQLYYNIGVEPNSLNNTNGRVIVLSRDTSVNTNDLSFEAQRINNNNSSEGTTFNGVVCHRVP
ITNINVPSLTVRSPNSSVELVEQIISFQTMYTATASACYKLNVEGHSDSLRAFRVNSDENIYVNVGNGVKALFNYPWVMV
KENNKVSFMSANEDTTIPFSVIMNSFTSIGEPALQYSPSNCFVYGNGFKLNNSTFDLQFIFEIV
>Q9E2H5 ~~~ORF0~~~Membrane protein 0~~~
MATVHYSRRPGTPPVTLTSSPGMDDVATPIPYLPTYAEAVADAPPPYRSRESLVFSPPLFPHVENGTTQQSYDCLDCAYD
GIHRLQLAFLRIRKCCVPAFLILFGILTLTAVVVAIVAVFPEEPPNSTTRNYCPEGEGIYSRLQLVARVCTTKAIYVTKA
NVAIWSTTPSTLHNLSICIFSCADAFLRDRGLGTSTSGIRTAGGLARTTSGDALFCISSVC
>D6RRG7 ~~~~~~Structural protein ORF10~~~
MAGRTSTYAPNQVTIVINHAASGISHTLTGFSEDSIVSVERLVDTFTEYVGADDTHTRVFNANSGARATVSLAQTSESND
VLTFLHEFDREAMSADGMFEMLIKDNSGRSLYFSDEAYIAVIPQGGFSNQMNTRDWVISMTNTTFQHGGNQKVSPATADT
LTALGVNLDARWL
>I7H893 ~~~~~~Coiled-coil domain-containing protein ORF13~~~
MGIKEKEIELETLKREIAQAEASLEQDFIKHMVDKTNEKVEDLFFSDKPEFYKFVFTEQNNYLREKLTDKVSKAMDLSDE
IQRDKDAEEIEKDKQAFLNKHPEVDFNELLEFYEEELPKRIKTQIDKLEGAAFFEAILDYFNAINAREEEPKKESKEEYS
SLPKEALGNGVSGVGYANNENIMTRY
>I7H0H9 ~~~~~~Major structural protein ORF14~~~
MLEKLNNINFNNISNNLNLGIEVGREIQNASWIKSPFFSITGTGADRGVRLFSVASQQPFRPRIKAQLSGSGVSGNTDFE
ANYDNLEILSQTIYPDAFGNSLRSKIKAYSELERIDFIKESVDSLTTWMNEERDKRIVASLTNDFTNYLYTQTMNVATIR
KAIFHARNGLKGDNSKAFPIKPIRATMQSVGNVMVQNTSYIILLDSYQANQLKADSEFKELRKLYAFAGEDKGMLYSGLL
GVIDNCPVIDAGVWNKFNVGMPNSSISDSDFMRYLNKANVSSIVTPRQFKEKLNQEKDEKKRSINKEISIGCLIGASAVL
LAGSKETRFYIDETVDAGRKSLVGVDCLLGVSKARYQSTDGVVTPYDNQDYAVIGLVSDME
>Q4JQX4 ~~~ORF1~~~Structural protein 1~~~
MSRVSEYGVPEGVRESDSDTDSVFMYQHTELMQNNASPLVVRARPPAVLIPLVDVPRPRSRRKASAQLKMQMDRLCNVLG
VVLQMATLALVTYIAFVVHTRATSCKRE
>F5HIM6 ~~~~~~Protein ORF23~~~
MLRVPDVKASLVEGAARLSTGERVFHVLTSPAVAAMVGVSNPEVPMPLLFEKFGTPDSSTLPLYAARHPELSLLRIMLSP
HPYALRSHLCVGEETASLGVYLHSKPVVRGHEFEDTQILPECRLAITSDQSYTNFKIIDLPAGCRRVPIHAANKRVVIDE
AANRIKVFDPESPLPRHPITPRAGQTRSILKHNIAQVCERDIVSLNTDNEAASMFYMIGLRRPRLGESPVCDFNTVTIME
RANNSITFLPKLKLNRLQHLFLKHVLLRSMGLENIVSCFSSLYGAELAPAKTHEREFFGALLERLKRRVEDAVFCLNTIE
DFPFREPIRQPPDCSKVLIEAMEKYFMMCSPKDRQSAAWLGAGVVELICDGNPLSEVLGFLAKYMPIQKECTGNLLKIYA
LLTV
>Q9DH21 ~~~ORF2/3~~~Probable protein VP2~~~
MWTPPRNDQHYLNWQWYSSILSSHAAMCGCPDAVAHFNHLASVLRAPQNPPPPGPQRNLPLRRLPALPAAPEAPGDRAPW
PMAGGAEGEDGGAGGDADHGGAAGGPEDADLLDAVAAAETLLEIPAKKPTPRAIESLEAYKSLTRNTTHRNSHSIPGTSD
VVSLARKLFRECNNNQQLLTFFQQATRDPGGTPRCTTPAKKGSKKKAYFSPQSSSSDESPRGKTRSRRKAGRKAQRKRHR
PSPSSSSSSCSNSESWESNSDSSSTKSKKSNKIKISTLPCYQGGGI
>P0C674 ~~~ORF2/3~~~Probable protein VP2~~~
MWTPPRNDQQYLNWQWYSSILSSHAAMCGCPDAVAHFNHLASVLRAPQNPPPPGPQRNLPLRRLPALPAAPEAPGDRAPW
PMAGGAEGEDGGAGGDADHGGAAGGPEDADLLDAVAAAETLLEIPAKKPTPRAIESLEAYKSLTRNTTHRNSHSIPGTSD
VASLARKLFRECNNNQQLLTFFQQAARDPGGTPRCTTPAKKGSKKKAYFSPQSSSSDESPRGKTRSRRKAGRKAQRKRRR
PSPSSSSSSCSNSESWESNSDSCSTKSKKSTKIKISTLPCYQGGGI
>F5HFD2 ~~~~~~Protein ORF24~~~
MAALEGPLLLPPSASLTTSPQTTCYQATWESQLEIFCCLATNSHLQAELTLEGLDKMMQPEPTFFACRAIRRLLLGERLH
PFIHQEGTLLGKVGRRYSGEGLIIDGGGVFTRGQIDTDNYLPAVGSWELTDDCDKPCEFRELRSLYLPALLTCTICYKAM
FRIVCRYLEFWEFEQCFHAFLAVLPHSLQPTIYQNYFALLESLKHLSFSIMPPASPDAQLHFLKFNISSFMATWGWHGEL
VSLRRAIAHNVERLPTVLKNLSKQSKHQDVKVNGRDLVGFQLALNQLVSRLHVKIQRKDPGPKPYRVVVSTPDCTYYLVY
PGTPAIYRLVMCMAVADCIGHSCSGLHPCANFLGTHETPRLLAATLSRIRYAPKDRRAAMKGNLQACFQRYAATDARTLG
SSTVSDMLEPTKHVSLENFKITIFNTNMVINTKISCHVPNTLQKTILNIPRLTNNFVIRKYSVKEPSFTISVFFSDNMCQ
GTAININISGDMLHFLFAMGTLKCFLPIRHIFPVSIANWNSTLDLHGLENQYMVRMGRKNVFWTTNFPSVVSSKDGLNVS
WFKAATATISKVYGQPLVEQIRHELAPILTDQHARIDGNKNRIFSLLEHRNRSQIQTLHKRFLECLVECCSFLRLDVACI
RRAAARGLFDFSKKIISHTKSKHECAVLGYKKCNLIPKIYARNKKTRLDELGRNANFISFVATTGHRFAALKPQIVRHAI
RKLGLHWRHRTAASNEQTPPADPRVRCVRPLV
>Q9DH22 ~~~ORF2/4~~~Uncharacterized ORF2/4 protein~~~
MWTPPRNDQHYLNWQWYSSILSSHAAMCGCPDAVAHFNHLASVLRAPQNPPPPGPQRNLPLRRLPALPAAPEAPGDRAPW
PMAGGAEGEDGGAGGDADHGGAAGGPEDADLLDAVAAAETRPQETQEGHRGVPLQPRRGAKRKLTFPPSQAPQTSPPVGR
LAAGGKRVAKLRGRDTDRLPAAQAAAAATANPGSQTQTPLQPSPKNPTKSRYQPYLVTKGGGSSILISNSTINMFGDPKP
YNPSSNDWKEEYEACRIWDRPPRGNLRDTPYYPWAPKENQYRVNFKLGFQ
>I7H0I4 ~~~~~~Coiled-coil domain-containing protein ORF29~~~
MNEKTESEIFEEQNSLYKPIKQEKKTPSTPESEDKNDQSLANANQSLEAEPPYLSTGIGYLDDKIKNRSITAFDYYMAKK
FLGLDLSVNLNGNLNIKSENKTRLASINKATQDIFDDLKALDLGDDLIKKAQEHSGITNQVKLWLNYKTGGLKGVDYDLA
KTDNARLSYANRVAKTMAQGGQVTQKLRDEAKALTSWGFRSKEENTARATQTQEILLNSLRKNLQMLESLGGSVSPLMLE
KLKEHQGKINYINDTGGKIDLKKYQSLAGGN
>P0DTB4 ~~~~~~ORF2p protein~~~
MVTIKELLPYSYWIGHPVSNRAIVYLFVGFTPLTLETLHTLNYIILLNTKRWAPRSPHSDPARMRIPTQPRKAPL
>Q4JQX3 ~~~ORF2~~~Membrane protein 2~~~
MHVISETLAYGHVPAFIMGSTLVRPSLNATAEENPASETRCLLRVLAGRTVDLPGGGTLHITCTKTYVIIGKYSKPGERL
SLARLIGRAMTPGGARTFIILAMKEKRSTTLGYECGTGLHLLAPSMGTFLRTHGLSNRDLCLWRGNIYDMHMQRLMFWEN
IAQNTTETPCITSTLTCNLTEDSGEAALTTSDRPTLPTLTAQGRPTVSNIRGILKGSPRQQPVCHRVRFAEPTEGVLM
>Q4JQU3 ~~~~~~Phosphoprotein 32~~~
MESSNINALQQPSSIAHHPSKQCASSLNETVKDSPPAIYEDRLEHTPVQLPRDGTPRDVCSVGQLTCRACATKPFRLNRD
SQYDYLNTCPGGRHISLALEIITGRWVCIPRVFPDTPEEKWMAPYIIPDREQPSSGDEDSDTD
>Q67684 ~~~ORF3~~~Long-distance movement protein~~~
MDTTPASRGQWPVGIMSSVINVFASGKSCNSGGAPRSSVRRGHRAGAARDKSRGFDAPPRRPKGGVHPATTSKDPNKDIR
GPPAPTPHKKYRGPAIPGEGRSGVHTTRPRRRAGRSGGMDPRQLVAQPQQRWAKTEIPTERRAEIDGLLPSLLNTLDGQI
QGDAALLRYCVGAIKRELRRRWESVQPAHHVAASSGKPSPQLFNEAAQNAEDLSGDGKGCAGQSVQQEVLHSGSGVPPVC
ADCGKPAANKW
>P69616 ~~~ORF3~~~Protein ORF3~~~
MGSRPCALGLFCCCSSCFCLCCPRHRPVSRLAAVVGGAAAVPAVVSGVTGLILSPSQSPIFIQPTPSPPMSPLRPGLDLV
FANPPDHSAPLGVTRPSAPPLPHVVDLPQLGPRR
>Q81870 ~~~ORF3~~~Protein ORF3~~~
MGSRPCALGLFCCCSSCFCLCCPRHRPVSRLAAAVGGAAAVPAVVSGVTGLILSPSQSPIFIQPTPSPPMSPLRPGLDLV
FANPPDHSAPLGVTRPSAPPLPHVVDLPQLGPRR
>O90299 ~~~ORF3~~~Protein ORF3~~~
MGSRPWALGLFCCCSSCFCLCCSRHRPVSRLAAVVGGAAAVPAVVSGVTGLILSPSQSPIFIQPTPSPRMSPLRPGLDLV
FANPSDHSAPLGATRPSAPPLPHVVDLPQLGPRR
>Q03499 ~~~ORF3~~~Protein ORF3~~~
MGSPPCALGLFCCCSSCFCLCCPRHRPVSRLAAVVGGAAAVPAVVSGVTGLILSPSQSPIFIQPTPLPQTLPLRPGLDLA
FANQPGHLAPLGEIRPSAPPLPPVADLPQPGLRR
>Q9YLR0 ~~~ORF3~~~Protein ORF3~~~
MGSPCALGLFCCCSSCFCLCCPRHRPASRLAAVVGGAAAVPAVVSGVTGLILSPSPSPIFIQPTPSPPMSFHNPGLELAL
DSRPAPLXPLGVTSPSAPPLPPVVDLPQLGLRR
>O56124 ~~~ORF3~~~Protein ORF3~~~
MVTIPPLVSRWFPVCGFRVCKISSPFAFTTPRWPHNDVYIGLPITLLHFPAHFQKFSQPAEISDKRYRVLLCNGHQTPAL
QQGTHSSRQVTPLSLRSRSSTFNK
>F5HDE4 ~~~~~~Protein ORF45~~~
MAMFVRTSSSTHDEERMLPIEGAPRRRPPVKFIFPPPPLSSLPGFGRPRGYAGPTVIDMSAPDDVFAEDTPSPPATPLDL
QISPDQSSGESEYDEDEEDEDEEENDDVQEEDEPEGYPADFFQPLSHLRPRPLARRAHTPKPVAVVAGRVRSSTDTAESE
ASMGWVSQDDGFSPAGLSPSDDEGVAILEPMAAYTGTGAYGLSPASRNSVPGTQSSPYSDPDEGPSWRPLRAAPTAIVDL
TSDSDSDDSSNSPDVNNEAAFTDARHFSHQPPSSEEDGEDQGEVLSQRIGLMDVGQKRKRQSTASSGSEDVVRCQRQPNL
SRKAVASVIIISSGSDTDEEPSSAVSVIVSPSSTKGHLPTQSPSTSAHSISSGSTTTAGSRCSDPTRILASTPPLCGNGA
YNWPWLD
>K9N4V0 ~~~ORF4a~~~Non-structural protein ORF4a~~~
MDYVSLLNQIWQKYLNSPYTTCLYIPKPTAKYTPLVGTSLHPVLWNCQLSFAGYTESAVNSTKALAKQDAAQRIAWLLHK
DGGIPDGCSLYLRHSSLFAQSEEEESFSN
>A3EXD3 ~~~ORF4b~~~Non-structural protein ORF4b~~~
MDDSMDLDLDCVIAQPSSTIVMMPLSPISTRKRRRHPMNKRRYAKRRFTPVEPNDIIMCDKPTHCIRLVFDQSLRWVHFD
GIKNILTDYDVIFNPDLHVTVALVCAGNGVTFSDLTPLTFILADMLLEFNGIFTLGQTLVIGAREYHWLPQELKTNVGKA
IPQAKEWLVDHGYNVYHTGLPTHMSLAKLHSLDFVQQSYVGSKFFIKHSHTTEYAMPVCLQVIAIDGEKVDGRSKPLFQY
PIHNHYRHYRACFPGR
>K9N643 ~~~ORF4b~~~Non-structural protein ORF4b~~~
MEESLMDVPSTSGTQVYSRKARKRSHSPTKKLRYVKRRFSLLRPEDLSVIVQPTHYVRVTFSDPNMWYLRSGHHLHSVHN
WLKPYGGQPVSEYHITLALLNLTDEDLARDFSPIALFLRNVRFELHEFALLRKTLVLNASEIYCANIHRFKPVYRVNTAI
PTIKDWLLVQGFSLYHSGLPLHMSISKLHALDDVTRNYIITMPCFRTYPQQMFVTPLAVDVVSIRSSNQGNKQIVHSYPI
LHHPGF
>P0DTL8 ~~~~~~Protein ORF4~~~
MLRGQQIWLSNLTQPQTSAGPVPAVESPPALCSTSLPQVCLDPASPALLPKPTWTLSWSRPGSCVMPGAAAASLLSPRTL
RLESPRGAGLSLMRPRPFPLICCCSTCSGPPPSTFLATRIRSQPSILSTPGSFPPSGPIWPPPPGGMLPIAALRMYVS
>O56125 ~~~ORF4~~~Anti-apoptotic ORF4 protein~~~
MTCTLVFQSRFCIFPLTFKSSASPRKFLTNVTGCCSATVTRLPLSNKVLTAVDRSLRCP
>Q2HR80 ~~~~~~Tegument protein ORF52~~~
MAAPRGRPKKDLTMEDLTAKISQLTVENRELRKALGSTADPRDRPLTATEKEAQLTATVGALSAAAAKKIEARVRTIFSK
VVTQKQVDDALKGLSLRIDVCMSDGGTAKPPPGANNRRRRGASTTRAGVDD
>P0DJZ4 ~~~~~~Structural protein ORF5a~~~
MFSQIGAFLDSALLLLVAFFAVYRLVLVLCRWQRRQLDIPIHI
>P0DJZ5 ~~~~~~Structural protein ORF5a~~~
MFKYVGEMLDRGLLLAIAFFVVYRAVLFCCARQRQQRQQLPSTADLQLDAM
>P59636 ~~~9b~~~ORF9b protein~~~
MDPNQTNVVPPALHLVDPQIQLTITRMEDAMGQGQNSADPKVYPIILRLGSQLSLSMARRNLDSLEARAFQSTPIVVQMT
KLATTEELPDEFVVVTAK
>P0DTD2 ~~~9b~~~ORF9b protein~~~
MDPKISEMHPALRLVDPQIQLAVTRMENAVGRDQNNVGPKVYPIILRLGSPLSLNMARKTLNSLEDKAFQLTPIAVQMTK
LATTEELPDEFVVVTVK
>Q88940 ~~~~~~ORF-B protein~~~
MFSDSDSSDEELSRIITDIDESPQDIQQSLHLRAGVSPEARVPPGGLTQEEWTIDYFSTYHTPLLNPGIPHELTQRCTDY
TASILRRASAKMIQWDYVFYLLPRVWIMFPFIAREGLSHLTHLLTLTTSVLSATSLVFGWDLTVIELCNEMNIQGVYLPE
VIEWLAQFSFLFTHVTLIVVSDGMMDLLLMFPMDIEEQPLAINIALHALQTSYTIMTPILFASPLLRIISCVLYACGHCP
SARMLYAYTIMNRYTGESIAEMHTGFRCFRDQMIAYDMEFTNFLRDLTEEETPVLEITEPEPSPTE
>P0C788 ~~~~~~OX-2 membrane glycoprotein homolog~~~
MSSLFISLPWVAFIWLALLGAVGGARVQGPMRGSAALTCAITPRADIVSVTWQKRQLPGPVNVATYSHSYGVVVQTQYRH
KANITCPGLWNSTLVIHNLAVDDEGCYLCIFNSFGGRQVSCTACLEVTSPPTGHVQVNSTEDADTVTCLATGRPPPNVTW
AAPWNNASSTQEQFTDSDGLTVAWRTVRLPRGDNTTPSEGICLITWGNESISIPASIQGPLAHDLPAAQGTLAGVAITLV
GLFGIFALHHCRRKQGGASPTSDDMDPLSTQ
>P17518 ~~~ORF0~~~Suppressor of silencing P0~~~
MIVLTQSGTLLFDQRFKLSKFLFVVIATGFPLLLQQASLIYGYNHEQIYRICRSFLHVLPLLNCKRGRISTSGLQLPRHL
HYECLEWGLLCGTHPAIQIVGLTIVIKLDDPTTAAAYRSELLRVSSSSYIQNAAGLSNGWGHDMEAFVRNAICLLELRER
SIPQSGLRDLMGNYQHLVRSLLDACKVDHFVPLDFQHRSLMLNFARLYNQLDLQGRAKSFRALTGFPVYVPSEDYLEGSF
LQKELQE
>P09504 ~~~ORF0~~~Suppressor of silencing P0~~~
MQFLAHDNFHTLQVKKVRFLHPQQEVFLLAGLLLNIKQFVRAIKERNNVFKIDVFLRSLLYQLPFHLGSCFHDAPRELIP
ATEPELCAWFSLQTGYAPASTSGRVNLHVPGTKTSRRRIIQRSFASDFSEKLKRFPECLFVSLELFQRLLSTWTKDVERR
IFFSCREIPLGSDTLMELANLGEFLRVMVVGEQFHNSRLLSRLAVHCYKIYGEDGFISFWRIANLDHFDCFLTPEEILFS
SSVYTEMFV
>Q69535 ~~~~~~Large structural phosphoprotein~~~
MDLKAQSIPFAWLDRDKVQRLTNFLSNLENLENVDLREHPYVTNSCVVREGEDVDELKTLYNTFILWLMYHYVLSKRKPD
YNAIWQDITKLQNVVNEYLKSKGLNKGNFENMFTNKEKFESQFSDIHRALLRLGNSIRWGSNVPIDTPYVNLTAEDSSEI
ENNLQDAEKNMLWYTVYNINDPWDENGYLVTSINKLVYLGKLFVTLNQSWSKLEKVAMSQIVTTQNHLSGHLRKNENFNA
VYSQRVLQTPLTGQRVESFLKIITSDYEIIKSSLESYSASKAFSVPENGPHSLMDFASLDGRMPSDLSLPSISIDTKRPS
ADLARLKISQPKSLDAPLKTQRRHKFPESDSVDNAGGKILIKKETLGGRDVRATTPVSSVSLMSGVEPLSSLTSTNLDLR
DKSHGNYRIGPSGILDFGVKLPAEAQSNTGDVDLLQDKTSIRSPSSGITDVVNGLANLNLRQNKSDVSRPWSKNTAANAD
VFDPVHRLVSEQTGTPFVLNNSDVAGSEAKLTTHSTETGVSPHNVSLIKDLRDKDGFRKQKKLDLLGSWTKEKNDKAIVH
SREVTGDSGDATETVTARDSPVLRKTKHANDIFAGLNKKYARDVSRGGKGNSRDLYSGGNAEKKETSGKFNVDKEMTQNE
QEPLPNLMEAARNAGEEQYVQAGLGQRVNKILAEFTNLISLGEKGIQDILHNQSGTELKLPTENKLGRESEEANVERILE
VSDPQNLFKNFKLQNDLDSVQSPFRLPNADLSRDLDSVSFKDALDVKLPGNGEREIDLALQKVKAGERETSDFKVGQDET
LIPTQLMKVETPEEKDDVIEKMVLRIRQDGETDEETVPGPGVAESLGIAAKDKSVIAS
>Q89769 ~~~~~~Structural DNA-binding protein p10~~~
MPTKAGTKSTANKKTTKGSSKSGSSRGHTGKTHASSSMHSGMLYKDMVNIARSRGIPIYQNGSRLTKSELEKKIKRSK
>P0C215 ~~~~~~Accessory protein p12I~~~
MLFRLLSPLSPLALTALLLFLLPPSDVSGLLLRPPPAPCLLLFLPFQILSGLLFLLFLPLFFSLPLLLSPSLPITMRFPA
RWRFLPWKAPSQPAAAFLF
>P32510 ~~~~~~Inner membrane protein p12~~~
MALDGSSGGGSNVETLLIVAIIVVIMAIMLYYFWWMPRQQKKCSKAEECTCNNGSCSLKTS
>P11130 ~~~~~~Protein P13~~~
MVPLKISTLESQLQPLVKLVATETPGALVAYARGLSSADRSRLYRLLRSLEQAIPKLSSAVVSATTLAARGL
>Q65201 ~~~~~~Protein p14.5~~~
MADFNSPIQYLKEDSRDRTSIGSLEYDENSDTIIPSFAAGLEDFEPIPSPTTSTSLYSQLTHNMEKIAEEEDINFLHDTR
EFTSLVPDKADNKPEDDEESGAKPKKKKHLFPKLSSHKSK
>Q89424 ~~~~~~Minor capsid protein p17~~~
MDTETSPLLSHNLSTREGIKQSTQGLLAHTIARYPGTTAILLGILILLVIILIIVAIVYYNRSVDCKSSMPKPPPSYYVQ
QPEPHHHFPVFFRKRKNSTSLQSHIPSDEQLAELAHS
>Q66104 ~~~ORF4~~~RNA silencing suppressor p19~~~
MERAIQGNDTREQANGERWDGGSGGITSPFKLPDESPSWTEWRLYNDETNSNQDNPLGFKESWGFGKVVFKRYLRYDRTE
ASLHRVLGSWTGDSVNYAASRFLGANQVGCTYSIRFRGVSVTISGGSRTLQHLCEMAIRSKQELLQLTPVEVESNVSRGC
PEGIETFKKESE
>P11690 ~~~ORF4~~~RNA silencing suppressor p19~~~
MERAIQGNDAREQANSERWDGGSGGTTSPFKLPDESPSWTEWRLHNDETNSNQDNPLGFKESWGFGKVVFKRYLRYDRTE
ASLHRVLGSWTGDSVNYAASRFFGFDQIGCTYSIRFRGVSITVSGGSRTLQHLCEMAIRSKQELLQLAPIEVESNVSRGC
PEGTETFEKESE
>P69517 ~~~ORF4~~~RNA silencing suppressor p19~~~
MERAIQGNDAREQANSERWDGGSGGTTSPFKLPDESPSWTEWRLHNDETNSNQDNPLGFKESWGFGKVVFKRYLRYDRTE
ASLHRVLGSWTGDSVNYAASRFFGFDQIGCTYSIRFRGVSITVSGGSRTLQHLCEMAIRSKQELLQLAPIEVESNVSRGC
PEGTQTFEKEGE
>P11126 ~~~P1~~~Major inner protein P1~~~
MFNLKVKDLNGSARGLTQAFAIGELKNQLSVGALQLPLQFTRTFSASMTSELLWEVGKGNIDPVMYARLFFQYAQAGGAL
SVDELVNQFTEYHQSTACNPEIWRKLTAYITGSSNRAIKADAVGKVPPTAILEQLRTLAPSEHELFHHITTDFVCHVLSP
LGFILPDAAYVYRVGRTATYPNFYALVDCVRASDLRRMLTALSSVDSKMLQATFKAKGALAPALISQHLANAATTAFERS
RGNFDANAVVSSVLTILGRLWSPSTPKELDPSARLRNTNGIDQLRSNLALFIAYQDMVKQRGRAEVIFSDEELSSTIIPW
FIEAMSEVSPFKLRPINETTSYIGQTSAIDHMGQPSHVVVYEDWQFAKEITAFTPVKLANNSNQRFLDVEPGISDRMSAT
LAPIGNTFAVSAFVKNRTAVYEAVSQRGTVNSNGAEMTLGFPSVVERDYALDRDPMVAIAALRTGIVDESLEARASNDLK
RSMFNYYAAVMHYAVAHNPEVVVSEHQGVAAEQGSLYLVWNVRTELRIPVGYNAIEGGSIRTPEPLEAIAYNKPIQPSEV
LQAKVLDLANHTTSIHIWPWHEASTEFAYEDAYSVTIRNKRYTAEVKEFELLGLGQRRERVRILKPTVAHAIIQMWYSWF
VEDDRTLAAARRTSRDDAEKLAIDGRRMQNAVTLLRKIEMIGTTGIGASAVHLAQSRIVDQMAGRGLIDDSSDLHVGINR
HRIRIWAGLAVLQMMGLLSRSEAEALTKVLGDSNALGMVVATTDIDPSL
>Q84710 ~~~ORF1~~~Protein P1~~~
MASFLKPVNSQGLWLSLLLAITYLFLLPSAGQSLDPSGIGLAAGCSQSQGGISSFAALPRPCNDSVCTLPDLGWSCQRTA
QDTANQQQSPFNHTGHFLTTSGWTWPNWTCSPSQCQLLIHLPTWQIVKQDFLLLLKEWDLLTMCQRCSDLLTKTPGFILR
FAGETLILVANLIEFVLVSWSLWLCSVLVYVAQAVPGKFLLYMAAFCTTFWAWPRETASSLIRIVTTPLTLIGFLNKTGI
GLISHCLALTWNMFMTWSLLPWVTLMKMMKILITSSRVLTRSGRPKRTSSKSLKHKLKISRAIQKKQGKKTPVEERTIPG
VQIKKLREDPPKGVILRCTDQFGDHVGYASAVKLEKGQTGIVLPIHVWTDTVYINGPNGKLKMADFTALYEVTNHDSLIM
TSAMAGWGSILGVRPRPLTTIDAVKLKNYSLFTERDGKWYVQAAKCIAPAEGMFRVVSDTRPGDSGLPLFDMKMNVVAVH
RGTWPSERFPENRAFAILPVPDLTSSSSPKFTGCETYSEAETAYEMADNFSDGEEILIRTKGQSYRTFIGSNKVALLSIR
KLEEELSRGPIGLWADDTEDDESAPRRSGNGLFRSTPEKQSQAKTPSPKVEESAAPPPAPRAEKVRHVRRSEMTPEQKRA
DNLRRRKAKAAKKTPSTPPKKSKDKAPTLSQVAELVEKAVRAALTVQPRRSRASSKISIGGRNPGRKPQVSIQLDPVPSQ
STSVPPKDSQAGESAWLGPRRSYRPVQKSTVGQKQEPRRN
>P17519 ~~~ORF1~~~Protein P1~~~
MNRFTAYAALFFMFSLCSTAKEAGFLHPAFNFRGTSTMSASSGDYSAAPTPLYKSWALPSSLNLTTQPPPPLTDRSYYEL
VQALTSKMRLDCQTVGDMTWRHLSEMLFASWNSVKEVSLKAASVTLWAIINIWFGLYWTLARLITLFLWTFSIEALCLIL
LGCITSLIYKGALSLSEHLPVFLFMSPLKIIWRAAFSKRNYKNERAVEGYKGFSVPQKPPKSAVIELQHENGSHLGYANC
IRLYSGENALVTAEHCLEGAFATSLKTGNRIPMSTFFPIFKSARNDISILVGPPNWEGLLSVKGAHFITADKIGKGPASF
YTLEKGEWMCHSATIDGAHHQFVSVLCNTGPGYSGTGFWSSKNLLGVLKGFPLEEECNYNVMSVIPSIPGITSPNYVFES
TAVKGRVFSDEAVKELEREASEAVKKLARFKSLTDKNWADDYDSDEDYGLEREAATNAPAEKTAQTNSAEKTAPSTSAEK
TALTNKPLNGQAAPSAKTNGNSDIPDAATSAPPMDKMVEQIITAMVGRINLSEIEEKIVSRVSQKALQKPKQKKRGRRGG
KNKQNSLPPTSTQSTSGAPKKEAAPQASGSAGTSRATTTPAPEAKPSGGKNSAKFTPSWRIKQQDSAGQKPDLKLNSKA
>Q08544 ~~~ORF7~~~20 kDa protein~~~
MTSSVELAQTKPLFRVLLLKGFVFYVVAFETEEESSEAELPLVYLHDFELNINKRGKIEASYVDFMSCMTRLKPSSVSYT
RVSSEKSSEDFSLPGSGKTFGSKVLNRKVTFTFENGVQLVFGMYGLEQRCVSSDYLWFENVFVGAHCGTLTYCLNCELDK
SGGELEILTFSKNEVLLKRW
>Q08545 ~~~ORF8~~~RNA silencing suppressor~~~
MKFFLKDGETSRALSRSESLLRRVKELGTNSQQSEISECVDEFNELASFNHLLVTVEHREWMEQHPNQSSKLRVPSRIGE
MLKEIRAFLKVRVVTPMHKETASDTLNAFLEEYCRITGLAREDALREKMRKVKSVVLFHHSELLKFEVTENMFSYTELLK
LNLSLRVISSQILGMAI
>P23169 ~~~~~~Inner membrane protein p22~~~
MFNIKMTISTLLIALIILVIIILVVFLYYKKQQPPKKVCKVDKDCGSGEHCVRGTCSTLSCLDAVKMDKRNIKIDSKISS
CEFTPNFYRFTDTAADEQQEFGKTRHPIKITPSPSESHSPQEVCEKYCSWGTDDCTGWEYVGDEKEGTCYVYNNPHHPVL
KYGKDHIIALPRNHKHA
>Q65669 ~~~~~~Protein P25~~~
MGDILGAVYDLGHRPYLARRTVYEDRLILSTHGNVCRAINLLTHDNRTTLVYHNNTKRIRFRGLLCAYRVPYCGFRALCR
VMLCSLPRLCDIPINGSRDFVADPTRLDSSVNELLVSNGLVTHYDRVHNVPIHTDGFEVVDFTTVFRGPGNFLLPNATNF
PRSTTTDQVYMVCLVNTVNCVLRFESELVVWVHSGLYAGDVLDVDNNVIQAPDGVDDND
>Q83051 ~~~~~~Protein p26~~~
MNNFPEIFDDESTCDYDKEIDHQELSDTFWCLMDFISSKHGKSVADINSGMNTLINIRKSLNGSGKVVSITDSYNKTYFH
SQRGLTNVDSRINIDILKIDFISIIDDLQIIFRGLIYKDKGFLDSADLLDLDKKTTTRKFQEYFNILKIKIIEKIGMTKT
FHFNIDFRNTISPLDKQRKCSISSSHKKTNRLNDLNNYITYLNDNIVLTFRWKGVGFGGLSLNDIKI
>Q89504 ~~~ORF2A~~~Polyprotein P2A~~~
MGCSVVGNCKSVMLMSRMSWSKLALLISVAMAAAMTDSPPTLICMGILVSVVLNWIVCAVCEEASELILGVSLETTRPSP
ARVIGEPVFDPRYGYVAPAIYDGKSFDVILPISALSSASTRKETVEMAVENSRLQPLESSQTPKSLVALYSQDLLSGWGS
RIKGPDGQEYLLTALHVWETNISHLCKDGKKVPISGCPIVASSADSDLDFVLVSVPKNAWSVLGVGVARLELLKRRTVVT
VYGGLDSKTTYCATGVAELENPFRIVTKVTTTGGWSGSPLYHKDAIVGLHLGARPSAGVNRACNVAMAFRVVRKFVTVEN
SELYPDQSSGPARELDAETYTERLEQGIAFTEYNISGITVKTSDREWTTAEALRVARYKPLGGGKAWGDSDDEDTQETAI
RPLNYQRAGSLRGSPPLANLSSTRATSGVTKESSIPTACLSDPLESRVAGLEKLCAERFTEMFELLRQSSQNSKSSLGQA
ADRKQKSDRSSSKPEGLKESKRPPICNWQSLTSKPSTRGPDPAPVSAESPGVVKTSSQKSKRSRTRGKSTSRQVPASPSP
KSGSATSK
>P27378 ~~~II~~~Adsorption protein P2~~~
MANFNVPKLGVFPVAAVFDIDNVPEDSSATGSRWLPSIYQGGNYWGGGPQALHAQVSNFDSSNRLPYNPRTENNPAGNCA
FAFNPFGQYISNISSAQSVHRRIYGIDLNDEPLFSPNAASITNGGNPTMSQDTGYHNIGPINTAYKAEIFRPVNPLPMSD
TAPDPETLEPGQTEPLIKSDGVYSNSGIASFIFDRPVTEPNPNWPPLPPPVIPIIYPTPALGIGAAAAYGFGYQVTVYRW
EEIPVEFIADPETCPAQPTTDKVIIRTTDLNPEGSPCAYEAGIILVRQTSNPMNAVAGRLVPYVEDIAVDIFLTGKFFTL
NPPLRITNNYFADDEVKENTVTIGNYTTTLSSAYYAVYKTDGYGGATCFIASGGAGISALVQLQDNSVLDVLYYSLPLSL
GGSKAAIDEWVANNCGLFPMSGGLDKTTLLEIPRRQLEAINPQDGPGQYDLFILDDSGAYASFSSFIGYPEAAYYVAGAA
TFMDVENPDEIIFILRNGAGWYACEIGDALKIADDEFDSVDYFAYRGGVMFIGSARYTEGGDPLPIKYRAIIPGLPRGRL
PRVVLEYQAVGMSFIPCQTHCLGKGGIISKV
>Q98632 ~~~~~~Minor outer capsid protein P2~~~
MAYPNDVRNVWDVYNVFRDVPNREHLIRDIRNGLVTVRNLTNMLTNMERDDQLIIAQLSNMMKSLSIGIEKAQNELSKLK
TTDADRAAVLADYQTSVLNIERNTMLLTGYFKQLVLDLTGYVGASVYPILPFMITGDQSMMVDSINVNMKNVFDDKHEQE
IVLPIHPACFVSTITEDTSSVVYADGDELYSVHVRHENMTMYVNVLGETVETRQLSMIGESIVPDDFAPSLLILRFSQDS
VGEVFYLSHDNIKKFLGHSLEYTDKYVIFDVARRASTTRNTITDGFCSVDGVPYLDGRFIYQPSGISADSNICAIYNSYV
LDVLRYITECEVDTLRSVYDRTSSTVFSKTDVLRPRLLTMQSNISALSAATPQLANDVITSDSTDLLSLGTVLTVSNEFT
ADDTTLSTSLAGHCQVDYSEGSPQDKSMSIPVSCDSSQLASSTVHSYSADILGHGLKGDRNLNLMINVPGLMNPQKVTVD
YVYSDGYKLNFASVVAPDAPFWINATLQLSLSPSAHNMLSKLTPLDNDACPGLKAQANTPVLVSMTINLDDATPALGGEV
IQNCVFKIHHGDDVYSFVTDFDVISYTSTSGTNCLKLISSVDITSQLPSDMWIYVMNGSPDAAFISGDSVNMSSVDWHQS
TSQTVGNYVYATMKAYWNVTSYDVEARPYATYVPGKINFTAIDHADVFVDDYNTGVNSYVIVNSRIYYKGNSSIYEVPSG
SFIKVSYFTSPLKNPTVDAYNAEISRNSAYLIKANASLDSVAAMLSNISNRIDAMERLMEPTRAHRIAGVVSSIGGVISL
GMPLLGAIVVTIGSIISIADPDKQGIDYHSVANAFMSWCQYAAVCRYEYGLLKRGDEKLDVLSFMPKRVVSDFKNKPDVI
SLPELGESVLRGSSTDYLDTEINIIYNDMQLLGQGKLSDWLNKTVSKVENNAANFFERNLVKSLANKEVLPVHARVEITQ
TEKIGDVYRTTILYTGINEGSYLGGDVFASRLGDKNILRMNGFESGPGRFKAIVESTTEVGNFRVVDWTVSGMSRYEIYA
AAGEVYPSKDPSHADVQLLYESIVRDLTTRDGSFVLKHHDVLLLPGQLDAFEELIIKNASNYQYAFIGSNCSKLCA
>O56834 ~~~~~~Minor outer capsid protein P2~~~
MSYPGNVHSLRDLYNVFKDAPNRERLILDMNSQLARIDNIAQILTMTEEQEKEVIAKMSASLANLGLNMDNAIRELKNME
TVHMDRVSVLTVYQSAVLNITNNTHTISDSMKSIMYDLSGYVNTTIYPTIQSPATEDRSSLIPLNYKNMKNIFDDKHTIV
FATVIGPQCVDFTTEDVTNISWIVDSHEIIKITYYPSSLIMETSIVGNRYLQYQLSLSNQTIDEYDNNGPYSCLLVFCVV
GANGLLDIYSMSSQQVRKYVNDYEVGESSLIGYSTKHGSLHLSSNDIRNISHELSGSPYIDVRVTYSITAAFPDSQLLSM
LLDNIQQSLGTLTACESKSFDDAYRAAMDNQFTKADVLTSSIITMRSSLNSLISSSPLLSDDTIRIVDDCTFQLASALTL
TNEFASDSANLLGTFKVKQTTIDRYSNTKISEGKAVYLSPTLATMNVHSGSVVVSTDSFHNNNALKVMLLAPGLVKPDNV
NFDLLSVKSNNVIDLVSDYKLDDAIIQFTDQPDYSSTTSIANAKSVVYMKTTGCIGLTYSNPNELILNMSLQLNSAAPSL
GGVVNNNTTFSLTIGSTIINYVTDFEITDFTDSSGKQSLKLTSSVNLLTKLSVDMIFYAVSINAGGVIATDTHKYTDTGF
DSSNYTYGWEMYDSAALNDYQDMTVVTPPKFTAKMLRPQDASKLNGDCVGGTYKKGRNILVIAGNRIFYNGVSVYFKMQK
NSMISLKYFINPSGLSTVDTCDVRITRNAAFLTQIDTKLMSVQSVLNDVQRRIDIINQLMQPSRIQTLATIIQGIGGVVS
LAMPLLGAIVVTIGAIVSIADPNHHGVDYQAVLNAFHSWCQYAVVARMNYGLLKADDPKLDILKRISDGSVNTFRNKPKK
ITLPGIDDEVIRGTSTDYIDTGINVRYNSMGLFGEGKLEEWMANTALKVQDGTANIFQKNLFSLLQKRKIVPMHARVEII
QTEKIGDVYRNTILYAGINEGSYIENSVYLTRSGNTRIKRLNMTSGPGMFKAVTESTTEVGNFKAVDWTLSGMTKEEIYN
AAGLMYPNKNPAHSEVQDVYESVIRDMAEIDDTWVLQHHKTVMLPGQIEAFEHLIRVSANKFQYAFIGSNCQNFADDVVG
ILSQFKRPKRWVDENDFKQYIQSIYDEL
>P0C214 ~~~~~~Accessory protein p30II~~~
MALCCFAFSAPCLHLRSRRSCSSCFLLATSAAFFSARLLRRAFSSSFLFKYSAVCFSSSFSRSFFRFLFSSARRCRSRCV
SPRGGAFSPGGPRRSRPRLSSSKDSKPSSTASSSSLSFNSSSKDNSPSTNSSTSRSSGHDTGKHRNSPADTKLTMLIISP
LPRVWTESSFRIPSLRVWRLCTRRLVPHLWGTMFGPPTSSRPTGHLSRASDHLGPHRWTRYRLSSTVPYPSTPLLPHPEN
L
>P34204 ~~~~~~Phosphoprotein p30~~~
MDFILNISMKMEVIFKTDLRSSSQVVFHAGSLYNWFSVEIINSGRIVTTAIKTLLSTVKYDIVKSAHIYAGQGYTEHQAQ
EEWNMILHVLFEEETESSASSESIHEKNDNETNECTSSFETLFEQEPSSEEPKDSKLYMLAQKTVQHIEQYGKAPDFNKV
IRAHNFIQTIHGTPLKEEEKEVVRLMVIKLLKKNKLLSHLHLMF
>Q83046 ~~~~~~RNA-binding P34 protein~~~
MIMMSPLYALTKQCVIDTAYRLAVPTQHCAIYTVACRILFLSVGFMTIVKLCGFKMDTSSFIASIEKDNLMDCLISLVEM
RDRLRLCNDFPILNYGVNILELLIGKRLNKINNLKNCYVIRELITINISKEWVGKQALKVGLHCFLNLSQADSRHVKYLL
SDKESLNKMNFSRYYVPKVVTDLYLDLIGVLYVNTGYNIDLVEKFIFDKLEFLVYDGEEGFKSPQVEYNDICTVNNLKPI
IKYNRWHTDGSIVIECGDVIGKGINKTKKKFAINDAKAEFVKNFKAKNKNNE
>P08160 ~~~~~~Early 35 kDa protein~~~
MCVIFPVEIDVSQTIIRDCQVDKQTRELVYINKIMNTQLTKPVLMMFNISGPIRSVTRKNNNLRDRIKSKVDEQFDQLER
DYSDQMDGFHDSIKYFKDEHYSVSCQNGSVLKSKFAKILKSHDYTDKKSIEAYEKYCLPKLVDERNDYYVAVCVLKPGFE
NGSNQVLSFEYNPIGNKVIVPFAHEINDTGLYEYDVVAYVDSVQFDGEQFEEFVQSLILPSSFKNSEKVLYYNEASKNKS
MIYKALEFTTESSWGKSEKYNWKIFCNGFIYDKKSKVLYVKLHNVTSALNKNVILNTIK
>P0DTK2 ~~~ORF3a~~~Movement protein P3a~~~
MDYKFLAGFALGFSSAIPFSVAGLYFVYLKISSHVRSIVNEYGRG
>P11129 ~~~P3~~~Spike protein P3~~~
MRYQGINEWLGGAKKLTTANGEIGAIYLSAAPPTDAARADAKAVDFTAGWPSAIVDCADATRAKQNYLWVGDNVVHIGAK
HVPLLDLWGGTGDAWQQFVGYACPMLDLCRAWGLGYASASVTTGSLQGYQPSAFLDVEQQQFAKDNLNLYGDNCLDLATS
SSAQRAFLEQCMGCALPEDCIFGWYVKMDWEGSAVADAYAAIRVQGFATVMAPWQSVGGAGYVYARVPQKGAWMGVNLLA
YVHGTSGQPAYGIPMTLSGFTGNMGQVASKWLMLPLLMIVDPHVVQILAALGVKRGTKSDPRTTDVYADPKVPASRISGP
MINGTVAPPATIPATIPVPLAPLGGAGGPGAQGFQVYPVFTWGLPEFMTDVTIEGTVTADSNGLHVVDDVRNYVWNGTAL
AAIEQVNAADGRVTLTDSERAQLASLTVRTASLRQQLSVGADPLSKTSIWRRAQKADYDLLSQQIIEADTVKNLPAVTFA
QANKAAGGQSETLWHQMYRVNDIAGDQVTAIQITGTMATGIRWSATAGGLVVDADEQDAVIAISSGKPVKNSSDLPTADA
VNYLFGITADDMPGIVSSQKEMNSEFEEGFLQKARLWNPRKLVENVQNAYFLMVYARDRKQFHSLVASSLAMAKLGVSTR
ACKESYGC
>Q9XJR6 ~~~III~~~Protein P3~~~
MNTSVPTSVPTNQSVWGNVSTGLDALISGWARVEQIKAAKASTGQGRVEQAMTPELDNGAAVVVEAPKKAAQPSETLVFG
VPQKTLLLGFGGLLVLGLVMRGNK
>P22472 ~~~~~~Outer capsid protein P3~~~
MDSTGRAYDGASEFKSVLVTEGTSHYTPVEVYNILDELKTIKITSTIAEQSVVSRTPIPLSKIGLQDVKKLFDINVIKCG
SSLRIVDEPQVTFIVSYAKDIYDKFMCIEHDSAYEPSLTMHRVRVIYSMLNDYCAKMISEVPYESSFVGELPVKSVTLNK
LGDRNMDALAEHLLFEHDVVNAQRENRIFYQRKSAPAVPVIFGDDLEPAVRERANLYHRYSVPYHQIELALHALANDLLS
IQYCHPTVVYNYLSSRAPNFLRLDDQVSLKLTSAGIGTLMPRPVVQLLDYDLVYMSPLALNNLASRLLRKISLHLVMQMV
TAVQQDLGEVVSVSSNVTNPASACLVRMNVQGVQTLAVFIAQSMLNPNISYGMISGLTLDCFSNFIYGACLMLFQALIPP
SALTARQRLDINNRFAYFLIKCHATQATTARLVANQVIYPVDAIDQWQSNGRDVLVAIYNNLLPGELVLTNLIQTYFRGN
TAQQAAEILIPADQTSYGANETRALSAPYLFGAPINMLAPDARLSTYKRDLALPDRSPILITTVEGQNSISIENLRHKTG
LIRAMYLNGFVTQPPAWIRNANSNTALLSRFLDATPNLLGIYEAILANTYANAVNVYCDSVYRADIPIEWKLHQSVDPQD
LLFGVFGIVPQYQILNEAVPDFFAGGEDILILQLIRAVYDTLSNKLGRNPADIFHLEEVFKVIEEIVSVLVQQKIDVRKY
FTESMRSGSFSKPRWDNFLRRPVAQRLPNLYSVIMTQADHVYNYMTQLTHIIPITDCFYIVKNSGFVDRGSTGPVIASSS
VYENVLKVVHTIADFDAANALRLQRRRVDNTSYTDSLSDMFNGLRSISSSEFVRSVNGRSVFTEGRIDAIKVNMRAKFDL
QFITEEGGYSKPPNVKKLMFSDFLSFLDSHKSDYRPPLLTVPITIGLNNLGETNSNTLRMRSEAIDEYFSSYVGAQILVP
INVVDTRVYTEFSELRNFFTGDVVIRDDPFDVWDGVKATYIPIGVHGVRLDPNGDQPPL
>Q88898 ~~~~~~40 kDa protein~~~
MHELLRKWLDDTNVLLLDNGLVVKVRSRVPHIRTYEVIGKLSVFDNSLGDDTLFEGKVENVFVFMFRRFLCVNKDGHCYS
RKHDELYYYGRVDLDSVSKVTSGYEKLFIHRELYILTDLIERVSKFFNLAQDVVEASFEYAKVEERLGHVRNVLQLAGGK
STNADLTIKISDDVEQLLGKRGGFLKVVNGILSKNGSDVVTNDNELIHAINQNLVPDKVMSVSNVMKETGFLQFPKFLSK
LEGQVPKGTKFLDKHVPDFTWIQALEERVNIRRGESGLQTLLADIVPRNAIAAQKLTMLGYIEYHDYVVIVCQSGVFSDD
WATCRMLWAALSSAQLYTYVDASRIGPIVYGWLL
>Q65165 ~~~~~~Minor capsid protein p49~~~
MYHDYASKLLADYRSDPPLWESDLPRHNRYSDNILNSRYCGNKNGAAPVYNEYTNSPEKAEKGLQLSDLRNFSFMLNPQH
KNIGYGDAQDLEPYSSIPKNKLFNHFKNHRPAFSTHTENLIRRNVVRTEKKTFPQVASLKGTQKNCLTQPSSLPSLKNPK
NSSVPSTRFSEHTKFFSYEDIPKLKTKGTIKHEQHLGDQMPGQHYNGYIPHKDVYNILCLAHNLPASVEKGIAGRGIPLG
NPHVKPNIEQELIKSTSTYTGVPMLGPLPPKDSQHGREYQEFSANRHMLQVANILHSVFANHSIKPQILEDIPVLNAQLT
SIKPVSPFLNKAYQTHYMENIVTLVPRFKSIANYSSPIPNYSKRNSGQAEYFDTSKQTISRHNNYIPKYTGGIGDSKLDS
TFPKDFNASSVPLTSAEKDHSLRGDNSACCISSISPSL
>Q5UP57 1.14.11.2~~~~~~Putative prolyl 4-hydroxylase~~~
MKTVTIITIIVVIIVVILIIMVLSKSCVSHFRNVGSLNSRDVNLKDDFSYANIDDPYNKPFVLNNLINPTKCQEIMQFAN
GKLFDSQVLSGTDKNIRNSQQMWISKNNPMVKPIFENICRQFNVPFDNAEDLQVVRYLPNQYYNEHHDSCCDSSKQCSEF
IERGGQRILTVLIYLNNEFSDGHTYFPNLNQKFKPKTGDALVFYPLANNSNKCHPYSLHAGMPVTSGEKWIANLWFRERK
FS
>P11125 3.6.1.15~~~P4~~~Packaging enzyme P4~~~
MPIVVTQAHIDRVGIAADLLDASPVSLQVLGRPTAINTVVIKTYIAAVMELASKQGGSLAGVDIRPSVLLKDTAIFTKPK
AKSADVESDVDVLDTGIYSVPGLARKPVTHRWPSEGIYSGVTALMGATGSGKSITLNEKLRPDVLIRWGEVAEAYDELDT
AVHISTLDEMLIVCIGLGALGFNVAVDSVRPLLFRLKGAASAGGIVAVFYSLLTDISNLFTQYDCSVVMVVNPMVDAEKI
EYVFGQVMASTVGAILCADGNVSRTMFRTNKGRIFNGAAPLAADTHMPSMDRPTSMKALDHTSIASVAPLERGSVDTDDR
NSAPRRGANFSL
>Q37958 ~~~IV~~~Protein P4~~~
MQKPSGKGLKYFAYGVAISAAGAILAEYVRDWMRKPKAKS
>Q65194 ~~~~~~Inner membrane protein p54~~~
MDSEFFQPVYPRHYGECLSPVTPPSFFSTHMYTILIAIVVLVIIIIVLIYLFSSRKKKAAAAIEEEDIQFINPYQDQQWA
EVTPQPGTSKPAGATTASAGKPVTGRPATNRPATNKPVTDNPVTDRLVMATGGPAAAPAAASAHPTEPYTTVTTQNTASQ
TMSAIENLRQRNTYTHKDLENSL
>Q38503 ~~~~~~Protein p56~~~
MVQNDFVDSYDVTMLLQDDDGKQYYEYHKGLSLSDFEVLYGNTADEIIKLRLDKVL
>P06948 ~~~1B~~~Protein p56~~~
MVQNDFLDSYDVTMLLQDDNGKQYYEYHKGLSLSDFEVLYGNTVDEIIKLRVDKIS
>Q9XJR2 ~~~V~~~Protein P5~~~
MKKAHMFLATAAALGVAMFPTQINEAARGLRNNNPLNIKEGSDGGAQWEGEHELDLDPTFEEFKTPVHGIRAGARILRTY
AVKYGLESIEGIIARWAPEEENDTENYINFVANKTGIPRNQKLNDETYPAVISAMIDMENGSNPYTYDEIKKGFEWGFYG
>P22536 ~~~V~~~Spike protein P5~~~
MANQQIGGSTVTYNGAIPMGGPVAINSVIEIAGTEVLVDLKLDYATGKISGVQTLYIDLRDFLGDVTVTMPDTGQRITAR
AGTQGYYPVLSTNLMKFIVSATIDGKFPMNFINFPIALGVWPSGIKGDKGDPGAPGPAGGTVVVEDSGASFGESLLDTTS
EPGKILVKRISGGSGITVTDYGDQVEIEASGGGGGGGGVTDALSLMYSTSTGGPASIAANALTDFDLSGALTVNSVGTGL
TKSAAGIQLAAGKSGLYQITMTVKNNTVTTGNYLLRVKYGSSDFVVACPASSLTAGGTISLLIYCNVLGVPSLDVLKFSL
CNDGAALSNYIINITAAKIN
>P06545 ~~~~~~DNA-binding protein~~~
MVYRRRRRSSTGTTYGSTRRRRSSGYRRRPGRPRTYRRSRSRSSTGRRSYRTRYY
>Q9XJR1 ~~~VI~~~Protein P6~~~
MANFLTKNFVWILAAGVGVWFYQKADNAAKTATKPIADFLAELQFLVNGSNYVKFPNAGFVLTRDALQDDFIAYDDRIKA
WLGTHDRHKDFLAEILDHERRVKPVYRKLIGNIIDASTIRAASGVEL
>Q03209 ~~~~~~Protein P78/83~~~
MTNRRYESVQSYLFNNRNNKIDAHQFFERVDTAEAQIIKNNIYDNTVVLNRDVLLNILKLANDVFDNKAYMYVDDSEVSR
HYNAVVKMKRLVIGVRDPSLRQSLYNTIAYIERLLNIGTVNDSEITMLIADFYDLYSNYNIELPPPQALPRSRRPSVVQP
AAPAPVPTIVREQTKPEQIIPAAPPPPPSPVPNIPAPPPPPPPSMSELPPAPPMPTEPQPAAPLDDRQQLLEAIRNEKNR
TRLRPVKPKTAPETSTIVEVPTVLPKETFEPKPPSASPPPPPPPPPPPAPPAPPPMVDLSSAPPPPPLVDLPSEMLPPPA
PSLSNVLSELKSGTVRLKPAQKRPQSEIIPKSSTTNLIADVLADTINRRRVAMAKSSSEATSNDEGWDDDDNRPNKANTP
DVKYVQALFNVFTSSQLYTNDSDERNTKAHNILNDVEPLLQNKTQTNIDKARLLLQDLASFVALSENPLDSPAIGSEKQP
LFETNRNLFYKSIEDLIFKFRYKDAENHLIFALTYHPKDYKFNELLKYVQQLSVNQQRTESSA
>P11123 ~~~P7~~~Assembly protein P7~~~
MTLYLVPPLDSADKELPALASKAGVTLLEIEFLHELWPHLSGGQIVIAALNANNLAILNRHMSTLLVELPVAVMAVPGAS
YRSDWNMIAHALPSEDWITLSNKMLKSGLLANDTVQGEKRSGAEPLSPNVYTDALSRLGIATAHAIPVEPEQPFDVDEVS
A
>Q85448 ~~~~~~Protein P7~~~
MSAIVGLCLLSEKVVLSRSLTDEVSKLYKLNRGNVKEPRKYATERMSTQSKPVALQVPVSTIVLDYKDEDFIKQNPTYSA
MDIIGSPSNTAPQTAFQSIMPSLSALFNTPFIQGAFRHRIISSMGPEISYLVMVIGPPSGFMDTPNVSSAQSSVHTVSNA
DVDLNDIIAINSTMAKSTKLVSASTLQAMLVNDVYDRCMDLDGILLSQALPFFRNYVNVQSKGSLPPAVAACLNTPIKEL
FSMGSGKREPLTLEFRKDNEGQCLGIVLPKGHEGDTLSSRYPAVFINESEPFSDKERSELSELKRTDSDAYEKLYSETIS
KHVSDGSYGNRVIISHKMSRLSNGGVKIIGRFKISDFNTVKKNLSSRSGEVDSAKEQWEALSGNGLVTDSNISMLHDKIL
DTITSNKPGVVLRDGNKKSENIVVCFKNGFPNKKHSLLQLTKNGISVVSLDELTDAGILVESTGPDRVRRSPKALANKLS
SFKGRKVTLDVDNMSTEALIQKLSAL
>Q85435 ~~~~~~Protein P7~~~
MSAIVGLCLLSEKVVLSRSLTDEVSKLYKLNRGNVKEPRKYATERMSTQSKPVALQVPVSTIILDYKDEDFIKQNPTYSA
MDIIGSPSNTAPQTAFQSIMPSLSALFNTPFIQGAFRHRVISSMGPEISYLVMVIGPPSGFMDTPMYRTVNHLSILMSNA
DVDLIDIIAINSTMAKSTKLVFASTFQAMLVNDVYDRCMVLVGIFLSQALPFFRNYVNVQSKGSLPPAVAACLNTPIKEL
FSMGSGKREPLALEFRKDNEGQCLGIVLPKGHEGDTLSSRYPAVFINESEPFSDEERSELSKLKRTDPDAYEKLYSETIS
KHVSDGSYGNRVIISHKMSRLSNGGVKIIGRFKISDFNTVKKNLSSRPGEVDSAKEQWEALSGNGLVTDNNTSMLHDKIL
DTITSNKPGVVLRDGNKKSDNIVVCFKNGFPNKKHSLLQLTKNGISVVSLDELTDAGILVESTGPDRVRRSPKALANKLS
SFKGRKVTLDVDNMSTEALIQKLSTL
>P22473 ~~~~~~Protein P7~~~
MSAIVGLCLLSEKVVLSRSLTDEVSKLYKLNRGNVKEPRKYATERMSTQSKPVALQVPVSTIILDYKNEDFIKQNPTYSA
MDIIGSPSNTAPQTAFQSIMPSLSALFNTPFIQGAFRHRIISSMGPEISYLVMVIGPPSGFMDTPNVSSAQSSVHTVSNA
DVDLNDIIAINSTMAKSTKLVSASTLQAMLVNDVYDRCMDLDGILLSQALPFFRNYVNVQSKGSLPPAVAACLNTPIKEL
FSMGSGKREPLALEFRKDNEGQCIGIVLPKGHEGDTLSSRYPAVFINESEPFSDKERSELSELKRTDSDAYEKLYSETIS
KHVSDGSYGNRVIISHKMSRLSNGGVKIIGRFKISDFNTVKKNLSSRSGEIDSAKEQWEALSGNGLVTDSNISMLHDKIL
DTITSNKPGVVLRDGNKKSENIVVCFKNGFPNKKHSLLQLTKNGISVVSLDELTDAGILVESTGPDRVRRSPKVLANKLS
SFKGRKVTLDVDNMSTEALIQKLSTL
>P31610 ~~~~~~Protein P7~~~
MSVAIVCVGLLTDSTVLTRMLNDNTKEFYNALTGKSIIEGKDITGKLGSRKIELRRVTPTDTIILDFKDEKFIRDNRLMS
LNDICGSPSNMAPKTAFESIMPALGQLFNVGFIAGAFAHNVMSTYGKATQLLILVVGPPSGFSNKQIVSSSGSLVDVETN
AKIDLSNVVAVNTEMTSRTPLVNACAIRAMSLGDVMVKCDSLDRNLVQVAIKYFRHHVNLAQTASVSDATRIMLNSTFEE
IFDLSSDESARVKPSAWVSDSIRARGLVLPVGHGKTTLEERHPELFIEIDGIFNKEEHSLLDKMRITAKESNNWEEYNNM
FNQLVRKYLRQGHYGNKVILGHHPDNLNPNGISIIGVYALDSESNLEKHIDENPSLKNRLDLVRMNWKEIRDKTTVVAPT
IQELHHIILKDIMNDSSKKSIDTSSAKPKEKIVIKFLNGFPSDKYNLVNLEKEGISVTNDLTSDVNFVIDNTPTYVSSGG
KGKKKNAKQDSRGKIDAARISVDTDKVSEAEFIQLLRTK
>Q9XJR5 ~~~VIII~~~Protein P8~~~
MLGALMGVAGGAPMGGASPMGGMPSIASSSSAETGQQTQSGNFTGGGINFGSNNNNQLLIVGAVVIGLFLVIKRK
>Q85449 ~~~~~~Outer capsid protein P8~~~
MSRQMWLDTSALLEAISEYVVRCNGDTFSGLTTGDFNALSNMFTQLSVSSAGYVSDPRVPLQTMSNMFVSFITSTDRCGY
MLRKTWFNSDTKPTVSDDFITTYIRPRLQVPMSDTVRQLNNLSLQPSAKPKLYERQNAIMKGLDIPYSEPIEPCKLFRSV
AGQTGNIPMMGILATPPAAQQQPFFVAERRRILFGIRSNAAIPAGAYQFVVPAWASVLSVTGAYVYFTNSFFGTTIAGVT
ATATAADAATTFTVPTDANNLPVQTDSRLSFSLGGGNINLELGVAKTGFCVAIEGEFTILANRSQAYYTLNSITQTPTSI
DDFDVSDFLTTFLSQLRACGQYEIFSDAMDQLTNSLITNYMDPPALPAGLAFTSPWFRFSERARTILALQNVDLNIRKLI
VRHLWVITSLIAVFGRYYRPN
>Q85439 ~~~~~~Outer capsid protein P8~~~
MSRQMWLDTSALLEAISEYVVRCNGDTFSGLTTGDFNALSNMFTQLSVSSAGYVSDPRVPLQTMSNMFVSFITSTDRCGY
MLGKTWFNSDTKPTVSDDFITAYIKPRLQVPMSDTVRQLNNLSLQPSAKPKLYERQNAIMKGLDIPYSEPIEPCKLFRSV
AGQTGNIPLMGILLTPPVAQQQPFFVAERRRILFGIRSNAAIPAGAYQFVVPAWASVLSVTGAYVYFTNSFFGTTIAGVT
ATATAADAATTFTVPTDANNLPVQTDSRLSFSLGGGNINLELGVAKTGFCVAIEGEFTILANRSQAYYTLNSITQTPTSI
DDFDVSDFLTTFLSQLRACGQYEIFSDAMDQLTNNLITNYMDPPAIPAGLAFTSPWFRFSERARTILALQNVDLNIRKLM
VRHLWVITSLIAVFGRYYRPN
>P17379 ~~~~~~Outer capsid protein P8~~~
MSRQMWLDTSALLEAISEYVVRCNGDTFSGLTTGDFNALSNMFTQLSVSSAGYVSDPRVPLQTMSNMFVSFITSTDRCGY
MLRKTWFNSDTKPTVSDDFITTYIRPRLQVPMSDTVRQLNNLSLQPSAKPKLYERQNAIMKGLDIPYSEPIEPCKLFRSV
AGQTGNIPMMGILATPPAQQQPFFVAERRRILFGIRSNAAIPAGAYQFVVPAWASVLSVTGAYVYFTNSFFGTIIAGVTA
TATAADAATTFTVPTDANNLPVQTDSRLSFSLGGGNINLELGVAKTGFCVAIEGEFTILANRSQAYYTLNSITQTPTSID
DFDVSDFLTTFLSQLRACGQYEIFSDAMDQLTNSLITNYMDPPAIPAGLAFTSPWFRFSERARTILALQNVDLNIRKLIV
RHLWVITSLIAVFGRYYRPN
>P29077 ~~~~~~Outer capsid protein P8~~~
MSRQAWIETSALIECISEYGTKCSFDTFQGLTINDISTLSNLMNQISVASVGFLNDPRTPLQAMSCEFVNFISTADRHAY
MLQKNWFDSDVAPNVTTDNFIATYIKPRFSRTVSDVLRQVNNFALQPMENPKLISRQLGVLKAYDIPYSTPINPMDVARS
SANVVGNVSQRRALSTPLIQGAQNVTFIVSESDKIIFGTRSLNPIAPGNFQINVPPWYSDLNVVDARIYFTNSFLGCTIQ
NVQVNAVNGNDPVATITVPTDNNPFIVDSDSVVSLSLSGGAINVTTAVNLTGYAIAIEGKFNMQMNASPSYYTLSSLTIQ
TSVIDDFGLSAFLEPFRIRLRASGQTEIFSQSMNTLTENLIRQYMPANQAVNIAFVSPWYRFSERARTILTFNQPLLPFA
SRKLIIRHLWVIMSFIAVFGRYYTVN
>Q9XJR9 ~~~IX~~~Protein P9~~~
MRTTTKKQIERTDPTLPNVHHLVVGATGSGKSAFIRDQVDFKGARVLAWDVDEDYRLPRVRSIKQFEKLVKKSGFGAIRC
ALTVEPTEENFERFCQLVFAISHAGAPMVVIVEELADVARIGKASPHWGQLSRKGRKYGVQLYVATQSPQEIDKTIVRQC
NFKFCGALNSASAWRSMADNLDLSTREIKQLENIPKKQVQYWLKDGTRPTEKKTLTFK
>P0C790 ~~~~~~Protein P9~~~
MDTKTLIDKYNIENFTNYINFIIRNHQAGKGNLRFLVNLLKTTGGSNLKELDINPVEIENFNIDIYLDFLEFCLDSKFIF
>Q85450 ~~~~~~Minor outer capsid protein P9~~~
MGKLQDGIAIKRINDAITTFKNYKLGELEQGGSMAINTLSNVRAHVGLAWPAILRNCLIHTSSHLGFMKFMIDIATTWKV
GAFTLLGSVGDEDPFTDVDLIYTKTCLHLGLKDNDFLQFPEEFAYEANSFLEAQSMNARVDMLTGVHNIEDKYVFRMQSI
SKFLKAYYTASEDVAYLTGFIKPDDSKDSILSAELLKAQVTSEVLRVRNLITTKIQKYINLYEDSQLPHFRQAALSYTQD
WDVDGGVPAALPQPDTTDDENPVANPGPSAPTVSKGADQPEDEEMIRKKVETSKDGPPKAVPSGNVSARGFPAFLEDDMS
EMDAPDGFHDYLTREHENNFDLTQLGLAPSV
>Q85440 ~~~~~~Minor outer capsid protein P9~~~
MGKLQDGIAIKRINDAITTFKNYKLGELKQGGSMAINTLSNVRAHVGLAWPAILRNCLIHTSSHLGFMKFMIDIATTWKV
GAFTLLGSVGDEDPFTDVDLIYTKTCLHLGLKDNDFLQFPEEFAYEANSFLEAQSMNAKVDMLTGVHNIEDKYVFRMQSI
SKFLKAYYTASEDVAYLTGFIKPDDSKDSILNAELLEAQVTSEVLRVRNLITTKIQKYINLYEDSQLPHFRQAALSYIQD
WDVDGGVPAALPQPDTTDDERPVTKPGPSTPTVSKGVDEPEDEEMIRKKVETSKDAPSKADPPGNVSPRGVPALLEDDMS
EMDMPDGFHDYLTREHENNFDLSQLGLAPSV
>P31611 ~~~~~~Minor outer capsid protein P9~~~
MSGKIQDGVAIRRMSDAILFFTNYTSRNLIDQRDITLSTLHTIRRNLGTCWSIALLNCWNETSSHAGVMRFILDIAFSLR
FGDFTMLGACGNVDPFDDAGQIFLKSCKATGRNDSCFLTPSDNFGYYLVSFLNKEQLKCVVDMNVGIHNIEDIYVTRMES
IMEFIYYYYTESGRDVVNWLEKLESADAGLAAHAKSKRLMRAEIDLIRREILERTRLFINSNRNSFHDHHRELVRRYRTI
WADVISDGDVAEETSTEATTSAQHSTALSAELDEVDEYDHPNDGLLTFRREEDAASNLDSLLGSLSGEDAFQG
>Q65159 2.7.7.19~~~~~~Putative poly(A) polymerase catalytic subunit~~~
MSSLLKTDFNVSKYRLIAQKREANAVEIEAALEVVREFIIKKKLILYGGIAIDYALHLKGSSIYPEGERPDFDMFSPNHV
EDAYELADILYEKGFKQVGTVRAIHVQTMRVRTDFVWVADLSYMPPNIFNTIPTLTYKNLKIIHPDYQRAGLHLAFCFPF
DNPPREDVFSRFKKDLQRYNLIEKYYPIPVVPVKSIYESKTFSIPFKQVAIHGFAAYALLYQTLNELRITCKVPEWKTEF
PQPSYSYHKNDKNITLTVDMPKAYPALVLATYNPEEVIKEMGLHLTEICEPYMDYSPPIFKTNDIHFFSTMFKELAISII
QDNLIVVSPQYLLLYFLYGAFATPADKSLFLFYYNATLWILEKADSLLNIIQKQTSPEEFTRFANTSPFVLTTRVLSCSQ
ERCTFSPAYRISLANDVQQSQLPLPKTHFLSNSLPDVSTLPYNYYPGKGKDRPTNFSYEKNLLFNIGGKCTPSAM
>Q5UQS6 2.7.7.19~~~~~~Putative poly(A) polymerase catalytic subunit~~~
MLKNKTRAEKYQTYYTTNEYQIVKEKLPDIIRDAEIKASEVLEPTIYEKRAIMEVIKDFIRDHQRKVYGGTALNEALKQV
NPKDAIYDNYSFSDIEFYSPTPVQDLVDLCNILYRKGYKFVQGKDAQHEETYSIFVNFQLYCDITYSPTRVFYGIKTIEI
DGINYTDPHFMLIDYLRMVNQPLTAAGQRWEKAFERMYRLLKDYPIEDFDKRLDIPEPPEEIQSYISRIKTEFLSDNKLN
ESFLISGIEAYNFYIRHAASSKDEEQMARTNRNVVNLNNFIANVPFSELISVNYREDVKNTYNFLRMIVEDKEKISVDEY
FPLFQFTGYSTVIKYDDHPIIRIYEGDGYCIPNVKTVKTVENDNGTKTKYEYKYVSFQYVLMILYINKFRAHLDKNKPMY
FNYGIAISNLVKARNIYLDQTGKSVLDNTVFKEFRTNCTGNTISFTRMNRLRLLEKRKQGKQTSFVYTPEDFFKKDLETQ
AKLDPSKARFKNTSGNKIMVPKYLLFKIDNNGNIEDNIHSEEAEISEKEETSGGSSISTDKSFEESPNSSPNSSPNNSLN
NSIDISTNNYDDRSENSLDSLTSD
>P23371 2.7.7.19~~~~~~Poly(A) polymerase catalytic subunit~~~
MNRNPDQNTLPNITLKIIETYLGRVPSVNEYHMLKLQARNIQKITVFNKDIFVSLVKKNKKRFFSDVNTSASEIKDRILS
YFSKQTQTYNIGKLFTIIELQSVLVTTYTDILGVLTIKAPNVISSKISYNVTSMEELARDMLNSMNVAVIDKAKVMGRHN
VSSLVKNVNKLMEEYLRRHNKSCICYGSYSLYLINPNIRYGDIDILQTNSRTFLIDLAFLIKFITGNNIILSKIPYLRNY
MVIKDENDNHIIDSFNIRQDTMNVVPKIFIDNIYIVDPTFQLLNMIKMFSQIDRLEDLSKDPEKFNARMATMLEYVRYTH
GIVFDGKRNNMPMKCIIDENNRIVTVTTKDYFSFKKCLVYLDENVLSSDILDLNADTSCDFESVTNSVYLIHDNIMYTYF
SNTILLSDKGKVHEISARGLCAHILLYQMLTSGEYKQCLSDLLNSMMNRDKIPIYSHTERDKKPGRHGFINIEKDIIVF
>P10226 ~~~~~~DNA polymerase processivity factor~~~
MTDSPGGVAPASPVEDASDASLGQPEEGAPCQVVLQGAELNGILQAFAPLRTSLLDSLLVMGDRGILIHNTIFGEQVFLP
LEHSQFSRYRWRGPTAAFLSLVDQKRSLLSVFRANQYPDLRRVELAITGQAPFRTLVQRIWTTTSDGEAVELASETLMKR
ELTSFVVLVPQGTPDVQLRLTRPQLTKVLNATGADSATPTTFELGVNGKFSVFTTSTCVTFAAREEGVSSSTSTQVQILS
NALTKAGQAAANAKTVYGENTHRTFSVVVDDCSMRAVLRRLQVGGGTLKFFLTTPVPSLCVTATGPNAVSAVFLLKPQKI
CLDWLGHSQGSPSAGSSASRASGSEPTDSQDSASDAVSHGDPEDLDGAARAGEAGALHACPMPSSTTRVTPTTKRGRSGG
EDARADTALKKPKTGSPTAPPPADPVPLDTEDDSDAADGTAARPAAPDARSGSRYACYFRDLPTGEASPGAFSAFRGGPQ
TPYGFGFP
>P0CK68 ~~~PA~~~Protein PA-X~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINERGESIIVESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFVSPREAKRQLKKDLKSQEQCAGLPTKVSHRTSPALKTLEPMWMDSNRTA
TLRASFLKCPKK
>P0DJS4 ~~~PA~~~Protein PA-X~~~
MEEFVRQCFNPMIVELAEKAMKEYGEDRKIETNKFAAICTHLEVCFMYSDFHFINEQGESIIVELDDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFVSPKEAKKQLKKDLKSQGQCAGSPTKVSRRTSPALRILEPMWMDSNPTA
TLRASFLKCPKK
>P0DJS8 ~~~PA~~~Protein PA-X~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFVSPREAKRQLKKDLKSQEPCAGLPTKVSHRTSPALKTLEPMWMDSNRTA
ALRASFLKCQKK
>Q809J3 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESTIVESSDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHTYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLYTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFEPNG
CIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEP
NIVKPHEKGINPNYLLAWKQVLAELQDIENEEKIPKTKNMRKTSQLKWALGEKMAPEKVDFEDCKDVSDLRQYDSDEPQP
RSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCILEIGDMLLRTAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTREFFENKSETWPIGESPKGMEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALK
>Q3HM39 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINERGESIIVESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFEPNG
YIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
NVVKPHEKGINPNYLLAWKQVLAELQDIENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFDDCKDVSDLKQYDSDEPEL
RSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>P13175 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDPRIETNKFAAICTHMEVSFMYSDFHFINERGESIIVESGDPNALLKHRFEIIE
GRDRAMAWTVVNSICNTTGVGKPKFLPDLYDYKEDRFIEIGVTRREVHIYYLEKANKIKSEETHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFEPNG
YIEGKLSQMSKEVNARIEPFLKTTPRPLRLPGGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEP
IIVKPHEKGINSNYLLAWKQVLAEIQDIESEKKVPRTKNIKKTSQLKWALGENMAPEKVDFDDCKDVSDLKQYDSDEPEF
RSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRTSIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYESIEECLINDPWVLLNASWFNSFLTHALR
>P15659 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFIDEQGESIVVELGDPNALLKHRFEIIE
GRDRTIAWTVINSICNTTGAEKPKFLPDLYDYKKNRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRKLADQSLPPNFSSLENFRAYVDGFEPNG
YIEGKLSQMSKEVNARIEPFLKSTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
NVVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDDCKDVGDLKQYDSDEPEL
RSLASWIQNEFNKACELTDSSWIELDEIGEDAAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEVGDMLLRSAIGHVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPVGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>P03433 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKTMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIIVELGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRKLADQSLPPNFSSLENFRAYVDGFEPNG
YIEGKLSQMSKEVNARIEPFLKTTPRPLRLPNGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
NVVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFDDCKDVGDLKQYDSDEPEL
RSLASWIQNEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTSEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLIRSAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEESSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALS
>P67921 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEEFVRQCFNPMIVELAEKAMKEYGEDRKIETNKFAAICTHLEVCFMYSDFHFINEQGESIIVELDDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSCLENFRAYVDGFEPNG
YIEGKLSQMSKEVNAKIEPFLKTTPRPIKLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
YVVKPHDKGINPNYLLSWKQLLAELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDDCRDISDLKQYDSDEPEL
RSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEHIASMRRNYFTAEVSQCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>P03434 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSCLENFRAYVDGFEPNG
YIEGKLSQMSKEVNAKIEPFLKTTPRPIRLPDGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
YIVKPHEKGINPNYLLSWKQVLAELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDNCRDVSDLKQYDSDEPEL
RSLSSWIQNEFNKACELTDSTWIELDEIGEDVAPIEYIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQMSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEDGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLEGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>P31343 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDLKIETNKFAAICTHLEVCFMYSDFHFINEQGESIVVELDDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGAEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSENTHIHIFSFTGEGMATKAD
YTLDEESRARIKTRLFTIRQEMANRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSCLENFRAYVDGFEPNG
CIEGKLSQMSKEVNAKIEPFLKTTPRPIKLPDGPPCFQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMRTFFGWKEP
YIVKPHERGINSNYLLSWKQVLAELQDIENEEKIPRTKNMKKTSQLKWALGENMAPEKVDFDNCRDISDLKQYDSDEPEL
RSLSSWIQNEFNKACELTDSIWIELDEIGEDVAPIEYIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRSAIGQMSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLVVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>P31342 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVAQSFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFEPNG
CIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEP
NIIKPHEKGINPNYLLAWKQVLAELQDVENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFEDCKDVSDLKQYDSDEPEP
RSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVTFVSMEFSLTDPRLEPHKWERSCVLEIGDMLLRTAIGQAPRP
TFLYVRTNGTSKIKMKWGMETRRCLPHLLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFNLGGLYEAIEECLINDPWVLLNASWFNSFLTHALK
>P13166 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETVEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFEPNG
CIEGKLSQMSKEVNAKIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEP
NIIKPHERGINPNYLLAWKQVLAELQDIENEEKIPKTKNMKKTSQLKWALGENMAPEKVDFEDCKDVSDLKQYDSDEPET
RSLASWIQSEFNKACELTDSSWIELDEIGEDIAPIEHIASIRRNYFTAEVSHCRATEYIMKGVDINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFIIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGDMLLRTAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALK
>Q9Q0U9 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESTIIESGDPNALLKHRFEIIE
GRDRTMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHTYYLEKANKIKSEKTHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETVEERFEITGTMCRLADQSLPPNFSSLEKFRAYVDGFEPNG
CIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWKEP
NIVKPHEKGINPNYLLAWKQVLAELQDIENEEKIPKTKNMRKTSQLKWALGENMAPEKVDFEDCKDVSDLRQYDSDEPKP
RSLASWIQSEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRKTNLYGFLIKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHRWEKYCVLRIGDMLLRTEIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCPFQSLQQIESMIEAESSVKEKDMTKEFFENKSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYASPQLEGFSAESRKLLLIVQALRDNLEPGTFDLGGLYEAIEECLINDPWVLLNASWFNSFLTHALR
>O89752 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MEDFVRQCFNPMIVELAEKTMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIE
GRDRAMAWTVVNSICNTTGVDKPKFLPDLYDYKENRFTEIGVTRREVHIYYLEKANKIKSEETHIHIFSFTGEEMATKAD
YTLDEESRARIKTRLFTIRQEMASRGLWDSFRQSERGEETIEERFEITGTMRRLADQSLPPNFSSLENFRAYVDGFKPNG
CIEGKLSQMSKEVNARIEPFLKTTPRPLRLPDGPPCSQRSKFLLMDALKLSIEDPSHEGEGIPLYDAIKCMKTFFGWREP
NIIKPHEKGINPNYLLAWKQVLAELQDIENEDKIPKTKNMKKTSQLMWALGENMAPEKVDFEDCKDIDDLKQYHSDEPEL
RSLASWIQNEFNKACELTDSSWIELDEIGEDVAPIEHIASMRRNYFTAEVSHCRATEYIMKGVYINTALLNASCAAMDDF
QLIPMISKCRTKEGRRRTNLYGFIVKGRSHLRNDTDVVNFVSMEFSLTDPRLEPHKWEKYCVLEIGEMLLRTAIGQVSRP
MFLYVRTNGTSKIKMKWGMEMRRCLLQSLQQIESMIEAESSIKEKDMTKEFFENRSETWPIGESPKGVEEGSIGKVCRTL
LAKSVFNSLYSSPQLEGFSAESRKLLLIVQALRDNLEPGTFDLEGLYGAIEECLINDPWVLLNASWFNSFLTHALR
>O36432 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MDTFITRNFQTTIIQKAKNTMAEFSEDPELQPAMLFNICVHLEVCYVISDMNFLDEEGKSYTALEGQGKEQNLRPQYEVI
EGMPRTIAWMVQRSLAQEHGIETPKYLADLFDYKTKRFIEVGITKGLADDYFWKKKEKLGNSMELMIFSYNQDYSLSNES
SLDEEGKGRVLSRLTELQAELSLKNLWQVLIGEEDVEKGIDFKLGQTISRLRDISVPAGFSNFEGMRSYIDNIDPKGAIE
RNLARMSPLVSATPKKLKWEDLRPIGPHIYNHELPEVPYNAFLLMSDELGLANMTEGKSKKPKTLAKECLEKYSTLRDQT
DPILIMKSEKANENFLWKLWRDCVNTISNEEMSNELQKTNYAKWATGDGLTYQKIMKEVAIDDETMCQEEPKIPNKCRVA
AWVQTEMNLLSTLTSKRALDLPEIGPDVAPVEHVGSERRKYFVNEINYCKASTVMMKYVLFHTSLLNESNASMGKYKVIP
ITNRVVNEKGESFDMLYGLAVKGQSHLRGDTDVVTVVTFEFSSTDPRVDSGKWPKYTVFRIGSLFVSGREKSVYLYCRVN
GTNKIQMKWGMEARRCLLQSMQQMEAIVEQESSIQGYDMTKACFKGDRVNSPKTFSIGTQEGKLVKGSFGKALRVIFTKC
LMHYVFGNAQLEGFSAESRRLLLLIQALKDRKGPWVFDLEGMYSGIEECISNNPWVIQSAYWFNEWLGFEKEGSKVLESV
DEIMDE
>P11136 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MDTFITRNFQTTIIQKAKNTMAEFMEDPELQPAMLFNICVHLEVCYVISDMNFLDEEGKAYTALEGQGKEQNLRPQYEVI
EGMPRNIAWMVQRSLAQEHGIETPKYLADLFDYKTKRFIEVGITKGLADDYFWKKKEKLGNSMELMIFSYNQDYSLSNES
SLDEEGKGRVLSRLTELQAELSLKNLWQVLIGEEDIEKGIDFKLGQTISRLRDISVPAGFSNFEGMRSYIDNIDPKGAIE
RNLARMSPLVSVTPKKLKWEDLRPIGPHIYNHELPEVPYNAFLLMSDELGLANMTEGKSKKPKTLAKECLEKYSTLRDQT
DPILIMKSEKANENFLWKLWRDCVNTISNEETSNELQKTNYAKWATGDGLTYQKIMKEVAIDDETMCQEEPKIPNKCRVA
AWVQTEMNLLSTLTSKRALDLPEIGPDVAPVEHVGSERRKYFVNEINYCKASTVMMKYVLFHTSLLNESNASMGKYKVIP
ITNRVNEKGESFDMLYGLAVKGQSHLRGDTDVVTVVTFEFSSTDPRVDSGKWPKYTVFRIGSLFVSGREKSVYLYCRVNG
TNKIQMKWGMEARRCLLQSMQQMEAIVEQESSIQGYDMTKACFKEDRVNSPKTFSIGTQEGKLVKGSFGKALRVIFTKCL
MHYVFGNAQLEGFSAESRRLLLLIQALKDRKGPWVFDLEGMYSGIEECISNNPWVIQSAYWFNEWLGFEKEGSKVLESVD
EIMDE
>Q9IMP5 3.1.-.-~~~PA~~~Polymerase acidic protein~~~
MSKTFAEIAEAFLEPEAVRIAKEAVEEYGDHERKIIQIGIHFQVCCMFCDEYLSTNGSDRFVLIEGRKRGTAVSLQNELC
KSYDLEPLPFLCDIFDREEKQFVEIGITRKADDSYFQSKFGKLGNSCKIFVFSYDGRLDKNCEGPMEEQKLRIFSFLATA
ADFLRKENMFNEIFLPDNEETIIEMKKGKTFLELRDESVPLPFQTYEQMKDYCEKFKGNPRELASKVSQMQSNIKLPIKH
YEQNKFRQIRLPKGPMAPYTHKFLMEEAWMFTKISDPERSRAGEILIDFFKKGNLSAIRPKDKPLQGKYPIHYKNLWNQI
KAAIADRTMVINENDHSEFLGGIGRASKKIPEISLTQDVITTEGLKQSENKLPEPRSFPRWFNAEWMWAIKDSDLTGWVP
MAEYPPADNELEDYAEHLNKTMEGVLQGTNCAREMGKCILTVGALMTECRLFPGKIKVVPIYARSKERKSMQEGLPVPSE
MDCLFGICVKSKSHLNKDDGMYTIITFEFSIREPNLEKHQKYTVFEAGHTTVRMKKGESVIGREVPLYLYCRTTALSKIK
NDWLSKARRCFITTMDTVETICLRESAKAEENLVEKTLNEKQMWIGKKNGELIAQPLREALRVQLVQQFYFCIYNDSQLE
GFCNEQKKILMALEGDKKNKSSFGFNPEGLLEKIEECLINNPMCLFMAQRLNELVIEASKRGAKFFKTD
>P27194 ~~~Segment 3~~~Polymerase acidic protein~~~
MTDRPDHIDSRVWELSETQEDWITQVHGHVRRVVECWKYTICCLISNMHTHRGAPQYDVFKWQDRSTIEWICSKKKVQYP
ERDTPDLYDNERAVAYKVLLVSDLSDHSPTSGIYHDLAFNLEGEAEESCALVLRGSQLQDIKGFLCRALEWVVSNNLTQE
VVETISGEAKLQFSVGTTFRTLLKRDTDWDVIPTPRVEPNVPRIEGRRWTQMKKLPLLKEKEGPPSPWRALLLGADSEYI
VCPPGTDQEAISWIHSQSEIECIRESKSTPASVITCLTSSLQSFAEGNPVRSRIHEDIIAFGINKKQEKKQSASSSASGE
WKRAEYQVEEMSLPPWVEEEMVLLRSDQEDNWIELEKNAIYTEVDGVAEGLVDKYIEIVGRTKVASVIEKWQIAATRTFS
QLHTDRSRITACPIITRDPSGNCQFWGMVLLGPHHVKRDTDNAPLLIAEIMGEDTEEKYPKHSVFSLKVEGKQFLLSLKI
TSFSRNKLYTFSNIRRVLIQPASIYSQVVLSRAAENNSLNLEVNPEIQLYLEGAQRGMTLYQWVRMILCLEFLMAIYNNP
QMEGFLANMRRLHMSRHAMMERRQVFLPFGSRPEDKVNECIINNPIVAYLAKGWNSMPNVYY
>P0C5U5 ~~~PB1~~~Protein PB1-F2~~~
MEQEQDTPWTQSTEHINIQNRGNGQRTQRLEHPNSIRLMDHCLRIMSRVGMHRQIVYWKQWLSLKSPTQGSLKTRVLKRW
KLFSKQEWIN
>P0C574 ~~~PB1~~~Protein PB1-F2~~~
MGQEQDTPWILSTGHISTQKREDGQQTPRLEHHNSTRLMDHCQKTMNQVVMPKQIVYWKQWLSLRSPTPVSLKTRVLKRW
RLFSKHEWTS
>P0C0U1 ~~~PB1~~~Protein PB1-F2~~~
MGQEQDTPWILSTGHISTQKRQDGQQTPKLEHRNSTRLMGHCQKTMNQVVMPKQIVYWKQWLSLRNPILVFLKTRVLKRW
RLFSKHE
>P0C0U0 ~~~PB1~~~Protein PB1-F2~~~
MEQEQDTPWTQSTEHINIQKKGGGQQTQRPEHPNSTLLMDHYLKITSRAGMHKQIVYWKQWLSLKNPTQDSLKTRVLKRW
KLSSKREWIS
>P0DOG7 ~~~PB2~~~PB2-S1~~~
MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSK
TNDAGSDRVMVSPLAVTWWNRNGPTTSAVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSMRGVRV
SKMGVDEYSSTERVPLHQSKVECSSPL
>P0DOG3 ~~~PB2~~~PB2-S1~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRITEMIPERNEQGQTLWSK
MNDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTTTKEKKEELQGCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEARNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVNILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKALFQNWGIESIDNVMGMIGILPDMTPSTEMSMRGVRI
SKMGVDEYSSAEKIVPLHQSKVECSSPH
>Q8QPG7 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIIEMIPERNEQGQTLWSK
TNDAGSDRVMVSPLAVTWWNRNGPTTSAVYYPKVYKTYFEKVERLKHGTFGPVHFRNQIKIRRRVDINPGHADLNAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVIQSMIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGTSVKKEEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSLRGVRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLAITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQDPTMLYNKMEFEPFQSLVPKAARGQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGKEDKRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q6DNL8 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIIEMIPERNEQGQTLWSK
TNDAGSDRVMVSPLAVTWWNRNGPTTSAVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKKEEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSLRGVRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQDPTMLYNKMEFEPFQSLVPKAARGQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGKEDKRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q3HM41 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSK
TNDAGSDRVMVSPLAVTWWNRNGPTTSAVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSTEMSMRGVRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEVNGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAARGQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>P03427 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRITEMIPERNEQGQTLWSK
MNDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKIYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTTTKEKKEELQGCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEARNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVNILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKALFQNWGIESIDNVMGMIGILPDMTPSTEMSMRGVRI
SKMGVDEYSSAEKIVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAVRGQYSGFVRTLFQQMRDVLGTFDTAQIIKLLPFAAAPPKQSGMQFSSLTINV
RGSGMRILVRGNSPIFNYNKTTKRLTVLGKDAGPLTEDPDEGTAGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRNSSILTDSQTATKRIRMAIN
>P03428 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRITEMIPERNEQGQTLWSK
MNDAGSDRVMVSPLAVTWWNRNGPITNTVHYPKIYKTYFERVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSVKREEEVLTGNLQTLKIRVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGVEPIDNVMGMIGILPDMTPSIEMSMRGVRI
SKMGVDEYSSTERVVVSIDRFLRIRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTAQIIKLLPFAAAPPKQSRMQFSSFTVNV
RGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGTLTEDPDEGTAGVESAVLRGFLILGKEDKRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q91MB1 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSK
MSDAGSDRVMVSPLAVTWWNRNGPMTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSIKREEELLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSVAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMIGVLPDMTPSTEMSMRGIRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLIPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDEGTSGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>P03429 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSK
MSDAGSDRVMVSPLAVTWWNRNGPMTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSIKREEELLTGNLQTLKIRVHDGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSVAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMIGVLPDMTPSTEMSMRGIRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPAFNYNKTTKRLTILGKDAGTLIEDPDEGTSGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q67296 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSK
MSDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRAAVSADPLASLLEMCHSTLIGGTRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSIKREEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMVGVLPDMTPSTEMSMRGIRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETQGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q30NP1 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSK
MXDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDIDQSLIIAARNIVRRAAVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSIKREEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMVGVLPDMTPSTEMSMRGIRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETHGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>P31345 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRNLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPSLRMKWMMAMKYPITADKRITEMVPERNEQGQTLWSK
MSDAGSDRVMVSPLAVTWWNRNGPVTSTVHYPKVYKTYFDKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKISPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDIDQSLIIAARNIVRRASVSADPLASLLEMCHSTQIGGTRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTSGSSIKREEEVLTGNLQTLKIRVHEGYEEFTMVGKRATAILRKATRRLVQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEHIDNVMGMVGVLPDMTPSTEMSMRGIRV
SKMGVDEYSSTERVVVSIDRFLRVRDQRGNVLLSPEEVSETHGTERLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQNPTMLYNKMEFEPFQSLVPKAIRGQYSGFVRTLFQQMRDVLGTFDTTQIIKLLPFAAAPPKQSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKTTKRLTILGKDAGTLIEDPDESTSGVESAVLRGFLILGKEDRRYGPALSINELSNLAKGE
KANVLIGQGDVVLVMKRKRDSSILTDSQTATKRIRMAIN
>Q9Q0V1 ~~~PB2~~~Polymerase basic protein 2~~~
MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITADKRIMEMIPERNEQGQTLWSK
TNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFEKVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQ
DVIMEVVFPNEVGARILTSESQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW
EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQNPTEEQAVDICKAAMGLRIS
SSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGYEEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIV
AMVFSQEDCMIKAVRGDLNFVNRANQRLNPMHQLLRHFQKDAKVLFQNWGIEPIDNVMGMIGILPDMTPSAEMSLRGVRV
SKMGVDEYSSTERVVVSIDRFLRVRDQQGNVLLSPEEVSETQGTEKLTITYSSSMMWEINGPESVLVNTYQWIIRNWETV
KIQWSQDPTMLYNKMEFESFQSLVPKAARSQYSGFVRTLFQQMRDVLGTFDTVQIIKLLPFAAAPPEPSRMQFSSLTVNV
RGSGMRILVRGNSPVFNYNKATKRLTVLGKDAGALTEDPDEGTAGVESAVLRGFLILGREDKRYGPALSINELSNLAKGE
KANVLIMQGDVVLVMKRKRDFSILTDSQTATKRIRMAIN
>Q9QLL6 ~~~PB2~~~Polymerase basic protein 2~~~
MTLAKIELLKQLLRDNEAKTVLRQTTVDQYNIIRKFNTSRIEKNPSLRMKWAMCSNFPLALTKGDMANRIPLEYKGIQLK
TNAEDIGTKGQMCSIAAVTWWNTYGPIGDTEGFEKVYESFFLRKMRLDNATWGRITFGPVERVRKRVLLNPLTKEMPPDE
ASNVIMEILFPKEAGIPRESTWIHRELIKEKREKLKGTMITPIVLAYMLERELVARRRFLPVAGATSAEFIEMLHCLQGE
NWRQIYHPGGNKLTESRSQSMIVACRKIIRRSIVASNPLELAVEIANKTVIDTEPLKSCLAALDGGDVACDIIRAALGLK
IRQRQRFGRLELKRISGRGFKNDEEILIGNGTIQKIGIWDGEEEFHVRCGECRGILKKSQMRMEKLLINSAKKEDMKDLI
ILCMVFSQDTRMFQGVRGEINFLNRAGQLLSPMYQLQRYFLNRSNDLFDQWGYEESPKASELHGINELMNASDYTLKGVV
VTKNVIDDFSSTETEKVSITKNLSLIKRTGEVIMGANDVSELESQAQLMITYDTPKMWEMGTTKELVQNTYQWVLKNLVT
LKAQFLLGKEDMFQWDAFEAFESIIPQKMAGQYSGFARAVLKQMRDQEVMKTDQFIKLLPFCFSPPKLRSNGEPYQFLRL
MLKGGGENFIEVRKGSPLFSYNPQTEILTICGRMMSLKGKIEDEERNRSMGNAVLAGFLVSGKYDPDLGDFKTIEELERL
KPGEKANILLYQGKPVKVVKRKRYSALSNDISQGIKRQRMTVESMGWALS
>O36431 ~~~PB2~~~Polymerase basic protein 2~~~
MTLAKIELLKQLLRDNEAKTVLKQTTVDQYNIIRKFNTSRIEKNPSLRMKWAMCSNFPLALTKGDMANRIPLEYKGIQLK
TNAEDIGTKGQMCSIAAVTWWNTYGPIGDTEGFEKVYESFFLRKMRLDNATWGRITFGPVERVRKRVLLNPLTKEMPPDE
ASNVIMEILFPKEAGIPRESTWIHRELIKEKREKLKGTMITPIVLAYMLERELVARRRFLPVAGATSAEFIEMLHCLQGE
NWRQIYHPGGNKLTESRSQSMIVACRKIIRRSIVASNPLELAVEIANKTVIDTEPLKSCLTAIDGGDVACDIIRAALGLK
IRQRQRFGRLELKRISGRGFKNDEEILIGNGTIQKIGIWDGEEEFHVRCGECRGILKKSKMRMEKLLINSAKKEDMKDLI
ILCMVFSQDTRMFQGVRGEINFLNRAGQLLSPMYQLQRYFLNRSNDLFDQWGYEESPKASELHGINELMNASDYTLKGVV
VTKNVIDDFSSTETEKVSITKNLSLIKRTGEVIMGANDVSELESQAQLMITYDTPKMWEMGTTKELVQNTYQWVLKNLVT
LKAQFLLGKEDMFQWDAFEAFESIIPQKMAGQYSGFARAVLKQMRDQEVMKTDQFIKLLPFCFSPPKLRSNGEPYQFLRL
VLKGGGENFIEVRKGSPLFSYNPQTEVLTICGRMMSLKGKIEDEERNRSMGNAVLAGFLVSGKYDPDLGDFKTIEELEKL
KPGEKANILLYQGKPVKVVKRKRYSALSNDISQGIKRQRMTVESMGWALS
>P21770 ~~~PB2~~~Polymerase basic protein 2~~~
MSLLLTIAKEYKRLCQDAKAAQMMTVGTVSNYTTFKKWTTSRKEKNPSLRMRWAMSSKFPIIANKRMLEEAQIPKEHNNV
ALWEDTEDVSKRDHVLASASCINYWNFCGPCVNNSEVIKEVYKSRFGRLERRKEIMWKELRFTLVDRQRRRVDTQPVEQR
LRTGEIKDLQMWTLFEDEAPLASKFILDNYGLVKEMRSKFANKPLNKEVVAHMLEKQFNPESRFLPVFGAIRPERMELIH
ALGGETWIQEANTAGISNVDQRKNDMRAVCRKVCLAANASIMNAKSKLVEYIKSTSMRIGETERKLEELIPETDDVSPEV
TLCKSALGGPLGKTLSFGPMLLKKISGSGVKVKDTVYIQGVRAVQFEYWSEQEEFYGEYKSATALFSRKERSLEWITIGG
GINEDRKRLLAMCMIFCRDGDYFKDAPATITMADLTTKLGREIPYQYVMMNWIQKSEDNLEALLYSRGIVETNPGKMGSS
MGIDGSKRAIKSLRAVTIQSGKIDMPESKEKIHLELSDNLEAFDSSGRIVATILDLPSDKKVTFQDVSFQHPDLAVLRDE
KTAITKGYEALIKRLGTGDNDIPSLIAKKDYLSLYNLPEVKLMAPLIRPNRKGVYSRVARKLVSTQVTTGHYSLHELIKV
LPFTYFAPKQGMFEGRFFFSNDSFVEPGVNNNVLSWSKADSSKIYCHGIAIRVPLVVGDEHMDTSLALLEGFSVCENDPR
APMVTGQDLIDVGFGQKVRLFVGQGSVRTFKRTASQRAASSDVNKNVKKIKMSN
>Q9IMP3 ~~~PB2~~~Polymerase basic protein 2~~~
MSLLLTIAKEYKRLCQDAKAAQMMTVGTVSNYTTFKKWTTSRKEKNPSLRMRWAMSSKFPIIANKRMLEEAQIPKEHNNV
ALWEDTEDVSKRDHVLASASCINYWNFCGPCVNNSEVIKEVYKSRFGRLERRKEIMWKELRFTLVDRQRRRVDTQPVEQR
LRTGEIKDLQMWTLFEDEAPLASKFILDNYGLVKEMRSKFANKPLNKEVVAHMLEKQFNPESRFLPVFGAIRPERMELIH
ALGGETWIQEANTAGISNVDQRKNDIRAVCRKVCLAANASIMNAKSKLVEYIKSTSMRIGETERKLEELILETDDVSPEV
TLCKSALGGQLGKTLSFGPMLLKKISGSGVKVKDTVYIQGVRAVQFEYWSEQEEFYGEYKSATALFSRKERSLEWITIGG
GINEDRKRLLAMCMIFCRDGDYFKDAPATITMADLSTKLGREIPYQYVMMNWIQKSEDNLEALLYSRGIVETNPGKMGSS
MGIDGSKRAIKSLRAVTIQSGKIDMPESKEKIHLELSDNLEAFDSSGRIVATILDLPSDKKVTFQDVSFQHPDLAVLRDE
KTAITKGYEALIKRLGTGDNDIPSLIAKKDYLSLYNLPEVKLMAPLIRPNRKGVYSRVARKLVSTQVTTGHYSLHELIKV
LPFTYFAPKQGMFEGRLFFSNDSFVEPGVNNNVFSWSKADSSKIYCHGIAIRVPLVVGDEHMDTSLALLEGFSVCENDPR
APMVTRQDLIDVGFGQKVRLFVGQGSVRTFKRTASQRAASSDVNKNVKKIKMSN
>Q6UBL8 ~~~Segment-1~~~Polymerase basic protein 2~~~
MDFISENTISDKTTLEELKNATLFQVTKVDDRDCLRARRICNAPKGHWAGLMEKAKAMGDPTEEEKDELKKIVESYNTVS
VLGVSKSEGATGPRLVSSLKGLKNLLPGVNPKTLQETLLVGAPCPSTEPTTEEYWNVCRAAVGASMGSAKINMSQKVVMG
ASVIGWGQLNQSGPGVYFLNTKEIVTAEGKVDETRGPLERTSAPLMRDISRLIQETIEEVETGGDPSFSVRSEGGSKIEG
RIAFSLHSEVSTLKMRIALEQKLAKYEYMGENLLTLVKNTSIDRMQPDSAMMGKMVLESLRTHTVSSEQLNGRMITVQSQ
GLETIAISSPFDVEYDDGYVFTRMKGNFVAVGRDYKGAILCFREGQGTFFSGRGNWSGLMEKCLVEMRLCPCFYSCTWQD
YPDKKSLYEKATFEAKQIVFAMGENTGVDIRVNTDGEIGDKGISLLTREREDKYMSKVSYECRVVSGKLVMGLDKMSRVA
KGNLEVVREKGDDTSQSDSFYEGVLQVGSMIGTTMESLKQQLQGPVGIWRASGVSAMERCMKRGQSKTVVASARYTFQKM
MEKMATGREVSKYSLIIVMRCCIGFTSEANKRALTNISGTGYYISVAQPTVVKLAGEWLITPVGRSKTGEVQYVSAKLKK
GMTTGKLELIKKADRSDLDNFPEPSADELLREGTIVLMQIGKDKWLCRVRTGDRRVRTDTDIQRAEAKSQVEKEDLMDEY
GV
>Q9YNA4 ~~~Segment 1~~~Polymerase basic protein 2~~~
MDREEPAESECTLRALVEEYNGACKEAPKEMSKQFTDYNTFKRYTTSKKDHAPQMRLVYSVRKPWPISMTPSKEIPLVFN
GTKLKDTILDLGESKRTRANIVVPDYWSKYGSQTSLEVVNAILYAEDLKVQRFFSTEWGEIRYGRMLPFRKPVQACPTIE
EVNPASIPHTLLQVFCPQYTTLDSKRKAHMGAVEKLKRVMEPICKVQTQESAVHIARSLIDSNKKWLPTVVDHTPRTAEM
AHFLCSKYHYVHTNTQDLSDTRSIDNLCGELVKRSLKCRCPKETLVANLDKITIQGRPMREVLADHDGELPYLGICRVAM
GLSTHHTMKIRSTKFSILNSDHPRIEVKKVFSLSPDVQVTIPYRRFKGKAKVYFQNDQIQGYFSCTDRQIDEIKISAPKN
APLLEPLLDICYYGSFIEPGFEQTFGFYPAGKREFVDSFFMHHSKDHKAFLIHMGLDKDLSLPLSPELNWKEPALSKVCR
VTELDSTVQPYTSATREFVLGETLNVYTQHENGLELLICPTEIRSTRGPLPPGTNLSGSEFIDIYQDPFSRAKSLLKSTI
LHAERCKEFVGNMLEEYQDPAETTVQSLVPINTWGKSAKRKLQEEITSDPDWHQCPRKRAKMSYLAIIAGSIQDRDKKQT
NVPRAFMLRGSQIEYDMKATRGLVVDTTNRIIVGGETVLREGKGGPEGYVQTGVFEEQPRCYLVDTPDHGLSMGLSRFCV
HSQGRYFQYEKKISIWEETDNIKATIDSQRDLKRRRDIEEMVSKRARIV
>P11038 ~~~PCNA~~~Probable DNA polymerase sliding clamp~~~
MFEAEFKTGAVLKRLVETFKDLLPHATFDCDNRGVSMQVMDTSHVALVSLQLHAEGFKKYRCDRNVPLNVSINSLSKIVK
CVNERSSVLMKAEDQGDVMAFVFNNDNRICTYTLKLMCIDVEHLGIPDSDYDCVVHMSSVEFAQVCKDMTQFDHDIIVSC
SKKGLQFRANGDIGSADVQMSADNENFSVLKAKQTVTHTFAGDYLCHFAKAAPLAPTVTIYMSEELPFKLEYCIKDVGVL
ACFLAPKIVNNDEEIF
>P06807 3.4.-.-~~~~~~Prohead core protein protease~~~
MNEPQLLIETWGQPGEIIDGVPMLESHDGKDLGLKPGLYIEGIFMQAEVVNRNKRLYPKRILEKAVKDYINEQVLTKQAL
GELNHPPRANVDPMQAAIIIEDMWWKGNDVYGRARVIEGDHGPGDKLAANIRAGWIPGVSSRGLGSLTDTNEGYRIVNEG
FKLTVGVDAVWGPSAPDAWVTPKEITESQTAEADTSADDAYMALAEAMKKAL
>Q5UR88 ~~~~~~Phosphatidylethanolamine-binding protein homolog R644~~~
MSNDFKVIINGQNIDNGQKIIFEKSQDVPKPIFDIGDNEYYTIAMVDPDAPSRENPIYKYFLHMLIVNNYQTLVSFQPPS
PPKGSGYHRYFFFLLKQPKYIDQNIWKQQINNNSIRREKFNLSEFISDNKLTVIASTYFKTKR
>Q58MU6 1.3.7.6~~~pebS~~~Phycoerythrobilin synthase~~~
MTKNPRNNKPKKILDSSYKSKTIWQNYIDALFETFPQLEISEVWAKWDGGNVTKDGGDAKLTANIRTGEHFLKAREAHIV
DPNSDIYNTILYPKTGADLPCFGMDLMKFSDKKVIIVFDFQHPREKYLFSVDGLPEDDGKYRFFEMGNHFSKNIFVRYCK
PDEVDQYLDTFKLYLTKYKEMIDNNKPVGEDTTVYSDFDTYMTELDPVRGYMKNKFGEGRSEAFVNDFLFSYK
>P19063 ~~~~~~Chemokine-binding protein~~~
MKQYIVLACMCLAAAAMPASLQQSSSSSSSCTEEENKHHMGIDVIIKVTKQDQTPTNDKICQSVTEITESESDPDPEVES
EDDSTSVEDVDPPTTYYSIIGGGLRMNFGFTKCPQIKSISESADGNTVNARLSSVSPGQGKDSPAITREEALAMIKDCEV
SIDIRCSEEEKDSDIKTHPVLGSNISHKKVSYEDIIGSTIVDTKCVKNLEFSVRIGDMCKESSELEVKDGFKYVDGSASE
GATDDTSLIDSTKLKACV
>Q805H7 ~~~~~~Inactive chemokine-binding protein~~~
MHVPASLQQSSSSSSSCTEEENKHHMGIDVIIKVTKQDQTPTNDKICQSVTEITESESDPDPEVESEDDSTSVEDVDPPT
TYYSIIGGGLRMNFGFTKCPQIKSISESADGNTVNARLSSVSPGQGKDSPAITHEEALAMIKDCEVSIDIRCSEEEKDSD
IKTHPVLGSNISHKKVSYEDIIGSTIVDTKCVKNLEFSVRIGDMCKESSELEVKDGFKYVDGSASEGATDDTSLIDSTKL
KACV
>Q8QN36 ~~~~~~Ankyrin repeat domain-containing protein OPG023~~~
MFDYLENEEVALDELKQMLRDRDPNDTRNQFKNNALHAYLFNEHCNNVEVVKLLLDSGTNPLHKNWRQLTPLGEYTNSRH
GKVNKDIAMVLLEATGYSNINDFNIFTYMKSKNVDIDLIKVLVEHGFDFSVKCEKHHSVIENYVMTDDPVPEIIDLFIEN
GCSVIYEDEDDEYGYAYEEYHSQNDDYQPRNCGTVLHLYIISHLYSESDSRSCVNPEVVKCLINHGINPSSIDKNYCTAL
QYYIKSSHIDIDIVKLLMKGIDNTAYSYIDDLTCCTRGIMADYLNSDYRYNKDVDLDLVKLFLENGKPHGIMCSIVPLWR
NDKETISLILKTMNSDVLQHILIEYITFSDIDISLVEYMLEYGAVVNKEAIHGYFKNINIDSYTMKYLLKKEGGDAVNHL
DDGEIPIGHLCKSNYGRYNFYTDTYRQGFRDMSYACPILSTINICLPYLKDINMIDKRGETLLHKAVRYNKQSLVSLLLE
SGSDVNIRSNNGYTCIAIAINESRNIELLNMLLCHKPTLDCVIDSLREISNIVDNAYAIKQCIRYAMIIDDCISSKIPES
ISKHYNDYIDICNQELNEMKKIIVGGNTMFSLIFTDHGAKIIHRYANNPELRAYYESKQNKIYVEVYDIISNAIVKHNKI
HKNIESVDDNTYISNLPYTIKYKIFEQQ
>P17356 ~~~~~~OPG024 protein~~~
MSSKGGSSGGMWSVFIHGHDGSNKGSKTYTSGGGGMWGGGSSSGVKSGVNGGVKSGTGKI
>P17372 ~~~~~~Ankyrin repeat protein OPG025~~~
MVNDKILYDSCKTFNIDASSAQSLIESGANPLYEYDGETPLKAYVTKKNNNIKNDVVILLLSSVDYKNINDFDIFEYLCS
DNIDIDLLKLLISKGIEINSIKNGINIVEKYATTSNPNVDVFKLLLDKGIPTCSNIQYGYKIKIEQIRRAGEYYNWDDEL
DDYDYDYTTDYDDRMGKTVLYYYIITRSQDGYATSLDVINYLISHKKEMRYYTYREHTTLYYYLDKCDIKREIFDALFDS
NYSGHELMNILSNYLRKQFRKKNHKIDNYIVDQLLFDRDTFYILELCNSLRNNILISTILKRYTDSIQDLLLEYVSYHTV
YINVIKCMIDEGATLYRFKHINKYFQKFGNRDPKVVEYILKNGNLVVDNDNDDNLINIMPLFPTFSMRELDVLSILKLCK
PYIDDINKIDKHGCSILYHCIKSHSVSLVEWLIDNGADINIITKYGFTCITICVILADKYIPEIAELYIKILEIILSKLP
TIECIKKTVDYLDDHRYLFIGGNNKSLLKICIKYFILVDYKYTCSMYPSYIEFITDCEKEIADMRQIKINGTDMLTVMYM
LNKPTKKRYVNNPIFTDWANKQYKFYNQIIYNANKLIEQSKKIDDMIEEVSIDDNRLSTLPLEIRHLIFSYAFL
>P68598 ~~~~~~Interferon antagonist OPG027~~~
MGIQHEFDIIINGDIALRNLQLHKGDNYGCKLKIISNDYKKLKFRFIIRPDWSEIDEVKGLTVFANNYAVKVNKVDDTFY
YVIYEAVIHLYNKKTEILIYSDDENELFKHYYPYISLNMISKKYKVKEENYSSPYIEHPLIPYRDYESMD
>P68600 ~~~~~~Interferon antagonist OPG027~~~
MGIQHEFDIIINGDIALRNLQLHKGDNYGCKLKIISNDYKKLKFRFIIRPDWSEIDEVKGLTVFANNYAVKVNKVDDTFY
YVIYEAVIHLYNKKTEILIYSDDENELFKHYYPYISLNMISKKYKVKEENYSSPYIEHPLIPYRDYESMD
>P17362 ~~~~~~IFN signaling evasion protein OPG029~~~
MNAYNKADSFSLESDSIKDVIHDYICWLSMTDEMRPSIGNVFKAMETFKIDAVRYYDGNIYELAKDINAMSFDGFIRSLQ
TIASKKDKLTVYGTMGLLSIVVDINKGCDISNIKFAAGIIILMEYIFDDTDMSHLKVALYRRIQRRDDVDR
>P17367 ~~~~~~Protein OPG030~~~
MAYMNRSDLDKLKHENIFSGNIIEDAKEFVFGSRKIYTDSVDDLIELYSLAKYLNNENLKDVVIERMDYVCKYIGKDNWS
TIYSFYKENGLRNSFLRQYINNNIEEICNTDQFLKLDVDSVCDILDNDEIVVTREYTILNMVLRWLENKRVNIDDFTKVM
FVIRFKFITYSELTNAIKKIAPEYRQCLQDLYHMKITRPRHFDN
>P17370 ~~~~~~Protein C4~~~
MDTIKIFNHGEFDTIRNELVNLLKVVKWNTINSNVTVSSTDTIDISDCIREILYKQFKNVRNIEVSSDISFIKYNRFNDT
TLTDDNVGYYLVIYLNRTKSVKTLIYPTPETVITSSEDIMFSKSLNFRFENVKRDYKLVMCSISLTYKPSICRIQYDNNK
YLDISDSQECNNLCYCVITMDPHHLIDLETICVLVDKSGKCLLVNEFYIRFRKNHIYNSFADLCMDHIFELPNTKELFTL
RNDDGRNIAWDNDKLESGNNTWIPKTDDEYKFLSKLMNIAKFNNTKFDYYVLVGDTDPCTVFTFKVTKYYINLNYE
>P17368 ~~~~~~Protein OPG034~~~
MVKNNKIQKNKISNSCRMIMSTDPNNILMRHLKNLTDDEFKCIIHRSSDFLYLSDSDYTSITKETLVSEIVEEYPDDCNK
ILAIIFLVLDKDIDVDIKTKLKPKPAVRFAILDKMTEDIKLTDLVRHYFRYIEQDIPLGPLFKKIDSYRTRAINKYSKEL
GLATEYFNKYGHLMFYTLPIPYNRFFCRNSIGFLAVLSPTIGHVKAFYKFIEYVSIDDRRKFKKELMSK
>P21054 ~~~~~~Protein OPG035~~~
MRTLLIRYILWRNDNDQTYYNDDFKKLMLLDELVDDGDVCTLIKNMRMTLSDGPLLDRLNQPVNNIEDAKRMIAISAKVA
RDIGERSEIRWEESFTILFRMIETYFDDLMIDLYGEK
>P17361 ~~~~~~Protein OPG035~~~
MRTLLIRYILWRNDNDQTYYNDNFKKLMLLDELVDDGDVCTLIKNMRMTLSDGPLLDRLNQPVNNIEDAKRMIAISAKVA
RDIGERSEIRWEESFTILFRMIETYFDDLMIDLYGEK
>P14357 ~~~~~~Protein OPG036~~~
MTSSAMDNNEPKVLEMVYDATILPEGSSMDPNIMDCINRHINMCIQRTYSSSIIAILDRFLMMNKDELNNTQCHIIKEFM
TYEQMAIDHYGGYVNAILYQIRKRPNQHHTIDLFKRIKRTRYDTFKVDPVEFVKKVIGFVSILNKYKPVYSYVLYENVLY
DEFKCFINYVETKYF
>P14356 ~~~~~~Inhibitor of Apoptosis OPG037~~~
MIFVIESKLLQIYRNRNRNINFYTTMDNIMSAEYYLSLYAKYNSKNLDVFRNMLQAIEPSGNNYHILHAYCGIKGLDERF
VEELLHRGYSPNETDDDGNYPLHIASKINNNRIVAMLLTHGADPNACDKHNKTPLYYLSGTDDEVIERINLLVQYGAKIN
NSVDEEGCGPLLACTDPSERVFKKIMSIGFEARIVDKFGKNHIHRHLMSDNPKASTISWMMKLGISPSKPDHDGNTPLHI
VCSKTVKNVDIIDLLLPSTDVNKQNKFGDSPLTLLIKTLSPAHLINKLLSTSNVITDQTVNICIFYDRDDVLEIINDKGK
QYDSTDFKMAVEVGSIRCVKYLLDNDIICEDAMYYAVLSEYETMVDYLLFNHFSVDFVVNGHTCMSECVRLNNPVILSKL
MLHNPTSETMYLTMKAIEKDRLDKSIIIPFIAYFVLMHPDFCKNRRYFTSYKRFVTDYVHEGVSYEVFDDYF
>Q80HY2 ~~~~~~Early protein OPG038~~~
MVYKLVLLFCIASLGYSVEYKNTICPPRQDYRYWYFAAELTIGVNYDINSTIIGECHMSESYIDRNANIVLTGYGLEINM
TIMDTDQRFVAAAEGVGKDNKLSVLLFTTQRLDKVHHNISVTITCMEMNCGTTKYDSDLPESIHKSSSCDITINGSCVTC
VNLETDPTKINPHYLHPKDKYLYHNSEYSMRGSYGVTFIDELNQCLLDIKELSYDICYRE
>P04297 ~~~~~~Interferon antagonist OPG040~~~
MDLSRINTWKSKQLKSFLSSKDAFKADVHGHSALYYAIADNNVRLVCTLLNAGALKNLLENEFPLHQAATLEDTKIVKIL
LFSGLDDSQFDDKGNTALYYAVDSGNMQTVKLFVKKNWRLMFYGKTGWKTSFYHAVMLNDVSIVSYFLSEIPSTFDLAIL
LSCIHITIKNGHVDMMILLLDYMTSTNTNNSLLFIPDIKLAIDNKDIEMLQALFKYDINIYSANLENVLLDDAEIAKMII
EKHVEYKSDSYTKDLDIVKNNKLDEIISKNKELRLMYVNCVKKN
>P20532 ~~~~~~Superinfection exclusion protein~~~
MIALLILSLTCSVSTYRLQGFTNAGIVAYKNIQDDNIVFSPFGYSFSMFMSLLPASGNTRIELLKTMDLRKRDLGPAFTE
LISGLAKLKTSKYTYTDLTYQSFVDNTVCIKPLYYQQYHRFGLYRLNFRRDAVNKINSIVERRSGMSNVVDSNMLDNNTL
WAIINTIYFKGTWQYPFDITKTRNASFTNKYGTKTVPMMNVVTKLQGNTITIDDEEYDMVRLPYKDANISMYLAIGDNMT
HFTDSITAAKLDYWSFQLGNKVYNLKLPKFSIENKRDIKSIAEMMAPSMFNPDNASFKHMTRDPLYIYKMFQNAKIDVDE
QGTVAEASTIMVATARSSPEKLEFNTPFVFIIRHDITGFILFMGKVESP
>P18384 ~~~~~~Superinfection exclusion protein~~~
MIALLILSLTCSVSTYRLQGFTNAGIVAYKNIQDDNIVFSPFGYSFSMFMSLLPASGNTRIELLKTMDLRKRDLGPAFTE
LISGLAKLKTSKYTYTDLTYQSFVDNTVCIKPSYYQQYHRFGLYRLNFRRDAVNKINSIVERRSGMSNVVDSNMLDNNTL
WAIINTIYFKGIWQYPFDITKTRNASFTNKYGTKTVPMMNVVTKLQGNTITIDDEEYDMVRLPYKDANISMYLAIGDNMT
HFTDSITAAKLDYWSFQLGNKVYNLKLPKFSIENKRDIKSIAEMMAPSMFNPDNASFKHMTRDPLYIYKMFQNAKIDVDE
QGTVAEASTIMVATARSSPEKLEFNTPFVFIIRHDITGFILFMGKVESP
>P20639 ~~~~~~Protein K3~~~
MLAFCYSLPNAGDVIKGRVYEKDYALYIYLFDYPHSEAILAESVKMHMDRYVEYRDKLVGKTVKVKVIRVDYTKGYIDVN
YKRMCRHQ
>P18378 ~~~~~~Protein K3~~~
MLAFCYSLPNAGDVIKGRVYEKDYALYIYLFDYPHFEAILAESVKMHMDRYVEYRDKLVGKTVKVKVIRVDYTKGYIDVN
YKRMCRHQ
>P18377 ~~~~~~Virion nicking-joining enzyme~~~
MNPDNTIAVITETIPIGMQFDKVYLSTFNMWREILSNTTKTLDISSFYWSLSDEVGTNFGTIILNKIVQLPKRGVRVRVA
VNKSNKPLKDVERLQMAGVEVRYIDITNILGGVLHTKFWISDNTHIYLGSANMDWRSLTQVKELGIAIFNNRNLAADLTQ
IFEVYWYLGVNNLPYNWKNFYPSYYNTDHPLSINVSGVPHSVFIASAPQQLCTMERTNDLTALLSCIRNASKFVYVSVMN
FIPIIYSKAGNILFWPYIEDELRRAAIDRQVSVKLLISCWQRSSFIMRNFLRSIAMLKSKNINIEVKLFIVPDADPPIPY
SRVNHAKYMVTDKTAYIGTSNWTGNYFTDTCGASINITPDDGLGLRQQLEDIFMRDWNSKYSYELYDTSPTKRCRLLKNM
KQCTNDIYCDEIQPEKEIPEYSLE
>P68467 ~~~~~~Protein K7~~~
MATKLDYEDAVFYFVDDDKICSRDSIIDLIDEYITWRNHVIVFNKDITSCGRLYKELMKFDDVAIRYYGIDKINEIVEAM
SEGDHYINFTKVHDQESLFATIGICAKITEHWGYKKISESRFQSLGNITDLMTDDNINILILFLEKKLN
>P68466 ~~~~~~Protein K7~~~
MATKLDYEDAVFYFVDDDKICSRDSIIDLIDEYITWRNHVIVFNKDITSCGRLYKELMKFDDVAIRYYGIDKINEIVEAM
SEGDHYINFTKVHDQESLFATIGICAKITEHWGYKKISESRFQSLGNITDLMTDDNINILILFLEKKLN
>O57173 ~~~~~~Apoptosis regulator OPG045~~~
MLSMFMCNNIVDYVDGIVQDIEDEASNNVDHDYVYPLPENMVYRFDKSTNILDYLSTERDHVMMAVRYYMSKQRLDDLYR
QLPTKTRSYIDIINIYCDKVSNDYNRDMNIMYDMASTKSFTVYDINNEVNTILMDNKGLGVRLATISFITELGRRCMNPV
KTIKMFTLLSHTICDDCFVDYITDISPPDNTIPNTSTREYLKLIGITAIMFATYKTLKYMIG
>P24356 ~~~~~~Apoptosis regulator OPG045~~~
MLSMFMCNNIVDYVDDIDNGIVQDIEDEASNNVDHDYVYPLPENMVYRFDKSTNILDYLSTERDHVMMAVRYYMSKQRLD
DLYRQLPTKTRSYIDIINIYCDKVSNDYNRDMNIMYDMASTKSFTVYDINNEVNTILMDNKGLGVRLATISFITELGRRC
MNPVETIKMFTLLSHTICDDYFVDYITDISPPDNTIPNTSTREYLKLIGITAIMFATYKTLKYMIG
>Q85365 ~~~~~~Apoptosis regulator OPG045~~~
MYNSMLPMFMCNNIVDDIDDIDDIDDIDDIDDIDDIDDKASNNDDHNYVYPLPENMVYRFNKSTNILDYLSTERDHVMMA
VQYYMSKQRLDDLYRQLPTKTRSYIDIINMYCDKVNNDYNRDMNIMYDMASTESFTVYDINNEVNTILMDNKGLGVRLAT
ISFITELGKRCMNPVETIKMFTLLSHTICDDCFIDYITDISPPDNTIPNISTREYLKLIGITAIMFATYKTLKYMIG
>P24357 ~~~~~~Immune evasion protein OPG047~~~
MPIFVNTVYCKNILALSMTKKFKTIIDAIGGNIIVNSTILKKLSPYFRTHLRQKYTKNKDPVTWVCLDLDIHSLTSIVIY
SYTGKVYIDSHNVVNLLRASILTSVEFIIYTCINFILRDFRKEYCVECYMMGIEYGLSNLLCHTKNFIAKHFLELEDDII
DNFDYLSMKLILESDELNVPDEDYVVDFVIKWYIKRRNKLGNLLLLIKNVIRSNYLSPRGINNVKWILDCTKIFHCDKQP
RKSYKYPFIEYPMNMDQIIDIFHMCTSTHVGEVVYLIGGWMNNEIHNNAIAVNYISNNWIPIPPMNSPRLYASGIPANNK
LYVVGGLPNPTSVERWFHGDAAWVNMPSLLKPRCNPAVASINNVIYVMGGHSETDTTTEYLLPNHDQWQFGPSTYYPHYK
SCALVFGRRLFLVGRNAEFYCESSNTWTLIDDPIYPRDNPELIIVDNKLLLIGGFYRESYIDTIEVYNHHTYSWNIWDGK
>Q00320 ~~~~~~Protein OPG049~~~
MGTNGVRVFVILYLLAVCGCIEYDVDDNVHICTHTNVSHINHTSWYYNDKVIALATEDKTSGYISSFIKRVNISLTCLNI
SSLRYEDSGTYKGVSHLKDGVIVTTTMNISVKANIVDLTGRVRYLTRNYCEVKIRCEITSFALNGSTTPPHMILGTVDKW
KYLPFPTDDYRYVGELKRYISGNPYPTESLALEISSTFNRFTIVKNLNDDEFSCYLFSQNYSFHKMLNVRNICESKWKAL
NNNDNASSMPASHNNLANDLSSMMSQLQNDNDDNNDYSAPMNVDNLIMIVLITMLSIILVIIVVIAAISIYKRSKYRHID
N
>P24358 ~~~~~~Protein OPG049~~~
MGTNGVRVFVILYLLAVCGCIEYDVDDNVHICTHTNVSHINHTSWYYNDKVIALATEDKTSGYISSFIKRVNISLTCLNI
SSLRYEDSGTYKGVSHLKDGVIVTTTMNISVKANIIDLTGRVRYLTRNYCEVKIRCEITSFALNGSTTPPHMILGTVDKW
KYLPFPTDDYRYVGELKRYISGNPYPTESLALEISSTFNRFTIVKNLNDDEFSCYLFSQNYSFHKMLNVRNICESEWEAL
NNNNDNSSSMPASHNNLANDLSSMMSQLQNDNDDNNDYSAPMNVDNLIMIVLITMLSIILVIIVVIAAISMYKRSKYRHI
DN
>P68603 ~~~~~~Protein OPG050~~~
MSKILTFVKNKIIDLINNDQIKYSRVIMIEESDSLLPVDEVHANHGFDCVEMIDENISNENIEQYKTESFFTIN
>P24359 ~~~~~~Protein OPG051~~~
MTLVMGSCCGRFCDAKNKNKKEDVEEGREGCYNYKNLNDLDESEARVEFGPLYMINEEKSDINTLDIKRRYRHTIESVYF
>P24360 ~~~~~~Protein OPG052~~~
MEGSKRKHDSRRLQQEQEQLRPRTPPSYEEIAKYGHSFNVKRFTNEEMCLKNDYPRIISYNPPPK
>P24361 ~~~~~~Entry-fusion complex associated protein OPG083~~~
MAETKEFKTLYNLFIDSYLQKLAQHSIPTNVTCAIHIGEVIGQFKNCALRITNKCMSNSRLSFTLMVESFIEVISLLPEK
DRRRIAEEIGIDLDDVPSAVSKLEKNCNAYAEVNNIIDIQKLDIGECSAPPGQHMLLQIVNTGSAERNCGLQTIVKSLNK
IYVPPIIENRLPYYDPWFLVGVAIILVIFTVAICSIRRNLALKYRYGTFLYV
>Q80HX7 ~~~~~~Protein OPG055~~~
MGFCIPSRSKMLKRGSRKSSSILARRPTPKKMNIVTDLENRLKKNSYIENTNQGNILMDSIFVSTMPVETLFGSYITDDS
DDYELKDLLNVTYNIKPVIVPDIKLDAVLDRDGNFRPADCFLVKLKHRDGFTKGALYLGHSAGFTATICLKNEGVSGLYI
PGTSVIRSNICQGDTIVSRSSRGVQFLPQIGGEAIFLIVSLCPTKKLVETGFVIPEISSNDNAKIAARILSEKRKDTIAH
INTLIQYRQQLELAYYNSCMLTEFLHYCNSYAGTIKESLLKETIQKDINITHTNITTLLNETAKVIKLVKSLVDKEDTDI
VNNFITKEIKNRDKIVNCLSLSNLDFRL
>Q80HX6 ~~~~~~Protein OPG056~~~
MLNRVQILMKTANNYETIEILRNYLRLYIILARNEEGRGILIYDDNIDSIMSMMNITRLEVIGLTTHCTKLRSSPPIPMS
RLFMDEIDHESYYSPKTSDYPLIDIIRKRSHEQGDIALALEQYGIENTDSISEINEWLSSKGLACYRFVKFNDYRKQMYR
KFSRCTIVDSMIIGHIGHHYIWIKNLETYTRPEIDVLPFDIKYISRDELWARISSSLDQTHIKTIAVSVYGAITDNGPMP
YMISTYPGNTFVNFNSVKNLILNFLDWIKDIMTSTRTIILVGYMSNLFDIPLLTVYWPNNCGWKIYNNTLISSDGARVIW
MDAYKFSCGLSLQDYCYHWGSKPESRPFDLIKKSDAKRNSKSLVKESMASLKSLYEAFETQSGALEVLMSPCRMFSFSRI
EDMFLTSVINRVSENTGMGMYYPTNDIPSLFIESSICLDYIIVNNQESNKYRIKSVLDIISSKQYPAGRPNYVKNGTKGK
LYIALCKVTVPTNDHIPVVYHDDDNTTTFITVLTSVDIETAIRAGYSIVELGALQWDDNIPELKHGLLDSIKMIYDLNAV
TTNNLLEQLIENINFNNSSIISLFYTFAISYCRAFIYSIMETIDPVYISQFSYKELYVSSSYKDINESMSQMVKL
>P26653 3.1.1.-~~~~~~Envelope phospholipase OPG057~~~
MWPFAPVPAGAKCRLVETLPENMDFRSDHLTTFECFNEIITLAKKYIYIASFCCNPLSTTRGALIFDKLKEASEKGIKII
VLLDERGKRNLGELQSHCPDINFITVNIDKKNNVGLLLGCFWVSDDERCYVGNASFTGGSIHTIKTLGVYSDYPPLATDL
RRRFDTFKAFNSAKNSWLNLCSAACCLPVSTAYHIKNPIGGVFFTDSPEHLLGYSRDLDTDVVIDKLKSAKTSIDIEHLA
IVPTTRVDGNSYYWPDIYNSIIEAAINRGVKIRLLVGNWDKNDVYSMATARSLDALCVQNDLSVKVFTIQNNTKLLIVDD
EYVHITSANFDGTHYQNHGFVSFNSIDKQLVSEAKKIFERDWVSSHSKSLKI
>P04021 3.1.1.-~~~~~~Envelope phospholipase OPG057~~~
MWPFASVPAGAKCRLVETLPENMDFRSDHLTTFECFNEIITLAKKYIYIASFCCNPLSTTRGALIFDKLKEASEKGIKII
VLLDERGKRNLGELQSHCPDINFITVNIDKKNNVGLLLGCFWVSDDERCYVGNASFTGGSIHTIKTLGVYSDYPPLATDL
RRRFDTFKAFNSAKNSWLNLCSAACCLPVSTAYHIKNPIGGVFFTDSPEHLLGYSRDLDTDVVIDKLKSAKTSIDIEHLA
IVPTTRVDGNSYYWPDIYNSIIEAAINRGVKIRLLVGNWDKNDVYSMATARSLDALCVQNDLSVKVFTIQNNTKLLIVDD
EYVHITSANFDGTHYQNHGFVSFNSIDKQLVSEAKKIFERDWVSSHSKSLKI
>P68707 ~~~~~~Protein OPG058~~~
MKHRLYSEGLSISNDLNSIIGQQSTMDTDIEIDEDDIMELLNILTELGCDVDFDENFSDIADDILESLIEQDV
>P0DTM8 ~~~~~~Protein OPG059~~~
MVIGLVIFVSVAAAIVGVLSNVLDMLMYVEENNEEDARIKEEQELLLLY
>Q80HX4 ~~~~~~Protein OPG060~~~
MEIFNVEELINMKPFKNMNKITINQKDNCILANRCFVKIDTPRYIPSTSISSSNIIRIRNHDFTLSELLYSPFHFQQPQF
QYLLPGFVLTCIDKVSKQQKKCKYCISNRGDDDSLSINLFIPTINKSIYIIIGLRMKNFWKPKFEIE
>Q80HX3 ~~~~~~Protein OPG061~~~
MKVVIVTSVASLLDASIQFQKTACRHHCNYLSMQVVKEIEEFGTINEKNLEFDTWKDVIQNDEIDALVFYRVKQISISTG
VLYKSMMRNRTKPISMYFVRDCLAFDGDPPSFRMTSCNINAYNRSKIKDLIILMNMKTCNKKIIGEFIIDNFGSVDALLS
IINSNVTWITSVINNSNGRGINIRVSNNKMLTITSFRRFVNKLKMYKTTKCASQLDNLCTEMNKMDIIDKK
>P07396 ~~~~~~Phosphoprotein OPG062~~~
MNSHFASAHTPFYINTKEGRYLVLKAVKVCDVRTVECEGSKASCVLKVDKPSSPACERRPSSPSRCERMNNPRKQVPFMR
TDMLQNMFAANRDNVASRLLN
>P21604 ~~~~~~Protein OPG064~~~
MISVTDIRRAFLDNECHTITKAFGYLHEDKAIALIKIGFHPTYLPKVLYNNVVEFVPEKLYLFKPRTVAPLDLISTITKL
KNVDKFASHINYHKNSILITGDKSLIVKCMPYMIISDDDIRFIREQFVGTNSIEYILSFINKESIYRMSYQFSENEIVTI
INRDHFMYEPIYEHQVLDSDFLKTMLDRYGIVPINSGIIDELCPEAIIEILMAVVRPRDAIRFLDIVNKNQLTEDSVKNY
IINDIRRGKIDYYIPYVEDFLEDRTEDLGIYANIFFEDAIDITKLDITKTELEHISKYMNYYTTYIDHIVNIILQNNYID
ILASIIDYVQDVLTEELCIRIVCESTNPVPVTSLPIHSTLVMVMCIQMKYVDIVEFLDEIDIDTLIEKGADPITEYTFTT
RWYNKHNDLITLYIKKYGFCPMMMKRLMFEYPLTKEASDHLLKTMDENRGAIMFFPRTICTLPYLLCCNYKLIQKPIPFK
EENRNIVYKKNNRVLCFDSLENSAFKSLIKIDSIPGLKTYNMKDITYEKSNNIICVRFIPQESIHNEERRIKLQLFDIAR
LASYGLYYIPSRYLSSWTPVVNMIEGREYTNPQKIECLVILDLFSEEFIEYQNLGNAVSNKYELEYTISNYQAAINCLMS
TLLIYLVLGSIRSISRTENFVLSILNIFYKGLKINELLSEPVSGVCIELNKIKDRASSGDSSFIFLKKNELSKTLSLCEK
VCVETILDNNQSFKSSK
>P21081 ~~~~~~RNA-binding protein OPG065~~~
MSKIYIDERSDAEIVCAAIKNIGIEGATAAQLTRQLNMEKREVNKALYDLQRSAMVYSSDDIPPRWFMTTEADKPDADAM
ADVIIDDVSREKSMREDHKSFDDVIPAKKIIDWKDANPVTIINEYCQITKRDWSFRIESVGPSNSPTFYACVDIDGRVFD
KADGKSKRDAKNNAAKLAVDKLLGYVIIRF
>P21605 ~~~~~~RNA-binding protein OPG065~~~
MSKIYIDERSNAEIVCEAIKTIGIEGATAAQLTRQLNMEKREVNKALYDLQRSAMVYSSDDIPPRWFMTTEADKPDADAM
ADVIIDDVSREKSMREDHKSFDDVIPAKKIIDWKGANPVTVINEYCQITRRDWSFRIESVGPSNSPTFYACVDIDGRVFD
KADGKSKRDAKNNAAKLAVDKLLGYVIIRF
>P33863 ~~~~~~RNA-binding protein OPG065~~~
MSKIYIDERSDAEIVCEAIKNIGLEGVTAVQLTRQLNMEKREVNKALYDLQRSAMVYSSDDIPPRWFMTTEADKPDAMTM
ADVIIDDVSREKSMREDHKSFDDVIPAKKIIDWKNANPVTIINEYCQITKRDWSFRIESVGPSNSPTFYACVDIDGRVFD
KADGKSKRDAKNNAAKLAVDKLLGYVIIRF
>P21046 ~~~~~~Protein OPG067~~~
MLILTKVNIYMLIIVLWLYGYNFIMSGSQCPMINDDSFTLKRKYQIDSAESTMKMDKKRTKFQNRAKMVKEINQTIRAAQ
THYETLKLGYIKFKRMIRTTTLEDIAPSIPNNQKTYKLFSDISAIGKASQNPSKMVYALLLYMFPNLFGDDHRFIRYRMH
PMSKIKHKIFSPFKLNLIRILVEERFYNNECRSNKWRIIGTQVDKMLIAESDKYTIDARYNLKPMYRIKGKSEEDTLFIK
QMVEQCVTSQELVEKVLKILFRDLFKSGEYKAYRYDDDVENGFIGLDTLKLNIVHDIVEPCMPVRRPVAKILCKEMVNKY
FENPLHIIGKNLQECIDFVSE
>Q01478 ~~~~~~Protein OPG067~~~
MLILTKVNIYMLIIVLWLYGHNFIMSESQCPMINDDSFTLKRKYQIDSAESTIKMDKKRIKFQNRAKMVKEINQTIRVAQ
THYETLKLGYIKFKRMIRTTTLEDIAPSIPNNQKTYKLFSDISAIGKASQNPSKMVYALLLYMFPNLFGDDHRFIRYRMH
PMSKIKHKIFSPFKLNLIRILVEERFYNNECRSNKWRIIGTQVDKMLIAESDKYTIDARYRLRPIYRIKGKSEEDTLFIK
QMVEQCVTSQELVEKVLKILFRDLFKSGEYKAYRYDDDVENGFIGLDTLKLNIVHDIVEPCMPVRRPVAKILCKEMVNKY
FENPLHIIGKNLQECIDFVSE
>P21606 ~~~~~~Protein OPG067~~~
MLILTKVNIYMLIIVLWLYGYNFIISESQCPMINDDSFTLKRKYQIDSAESTIKMDKKRTKFQNRAKMVKEINQTIRAAQ
THYETLKLGYIKFKRMIRTTTLEDIAPSIPNNQKTYKLFSDISAIGKASRNPSKMVYALLLYMFPNLFGDDHRFIRYRMH
PMSKIKHKIFSPFKLNLIRILVEERFYNNECRSNKWRIIGTQVDKMLIAESDKYTIDARYNLKPMYRIKGKSEEDTLFIK
QMVEQCVTSQELVEKVLKILFRDLFKSGEYKAYRYDDDVENGFIGLDTLKLNIVHDIVEPCMPVRRPVAKILCKEMVNKY
FENPLHIIGKNLQECIDFVSE
>P0DSS3 ~~~~~~Protein OPG067~~~
MLILTKVNIYMLIIVLWLYGYNFIMSGSQCPMINDDRFTLKRKYQIDSVESTMKMDKKRTKFQNRAKMVKEINQTIRAAQ
THYETLKLGYIKFKKMIRTTTLEDITTSIPNIQKIYKLFSDISAIGKVSQNPSKMAYALLLYMFPNLFGDDHRFILYRMF
PMSKIKHKIFSPFKLNLIRILVEERFYNNECRSNKWRIIGTQVDKMLIAESDKYTIDARYRLRPIYRIKGKSEEDTLFIK
QMVDQCVTSQELVEKVLKILFRDLFKSGEYKAYRYDDDVENGFIGLDKLKLNIVHDIVEPCMPVRRPVAKILCKEMVNKY
FENPLHIIGKNLQECIDFVSE
>P0DSS4 ~~~~~~Protein OPG067~~~
MLILTKVNIYMLIIVLWLYGYNFIMSGSQCPMINDDRFTLKRKYQIDSVESTMKMDKKRTKFQNRAKMVKEINQTIRAAQ
THYETLKLGYIKFKKMIRTTTLEDITTSIPNIQKIYKLFSDISAIGKVSQNPSKMAYALLLYMFPNLFGDDHRFILYRMF
PMSKIKHKIFSPFKLNLIRILVEERFYNNECRSNKWRIIGTQVDKMLIAESDKYTIDARYRLRPIYRIKGKSEEDTLFIK
QMVDQCVTSQELVEKVLKILFRDLFKSGEYKAYRYDDDVENGFIGLDKLKLNIVHDIVEPCMPVRRPVAKILCKEMVNKY
FENPLHIIGKNLQECIDFVSE
>P21607 ~~~~~~Protein OPG068~~~
MDFIRRKYLIYTVENNIDFLKDDTLSKVNNFTLNHVLALKYLVSNFPQHVITKDVLANTNFFVFIHMVRCCKVYEAVLRH
AFDAPTLYVKALTKNYLSFSNTIQSYKETVHKLTQDEKFLEVAKYMDELGELIGVNYDLVLNPLFHGGEPIKDMEIIFLK
LFKKTDFKVVKKLSVIRLLIWAYLSKKDTGIEFADNDRQDIYTLFQQTGRIVHSNLTETFRDYIFPGDKTSYWVWLNESI
ANDADIVLNRHAITMYDKILSYIYSEIKQGRVNKNMLKLVYIFEPEKDIRELLLEIIYDIPGDILSIIDAKNDDWKKYFI
SFYKANFINGNTFISDRTFNEDLFRVVVQIDPEYFDNERIMSLFSTSAADIKRFDELDINNSYISNIIYEVNDITLDTMD
DMKKCQIFNEDTSYYVKEYNTYLFLHESDPMVIENGILKKLSSIKSKSRRLNLFSKNILKYYLDGQLARLGLVLDDYKGD
LLVKMINHLKSVEDVSAFVRFSTDKNPSILPSLIKTILASYNISIIVLFQRFLRDNLYHVEEFLDKSIHLTKTDKKYILQ
LIRHGRS
>P68447 ~~~~~~Protein OPG069~~~
MGTAATIQTPTKLMNKENAEMILEKIVDHIVMYISDESSDSENNPEYIDFRNRYEDYRSLIIKSDHEFVKLCKNHAEKSS
PETQQMIIKHIYEQYLIPVSEVLLKPIMSMGDIITYNGCKDNEWMLEQLSTLNFNNLRTWNSCSIGNVTRLFYTFFSYLM
KDKLNI
>P68446 ~~~~~~Protein OPG069~~~
MGTAATIQTPTKLMNKENAEMILEKIVDHIVMYISDESSDSENNPEYIDFRNRYEDYRSLIIKSDHEFVKLCKNHAEKSS
PETQQMIIKHIYEQYLIPVSEVLLKPIMSMGDIITYNGCKDNEWMLEQLSTLNFNNLRTWNSCSIGNVTRLFYTFFSYLM
KDKLNI
>P23372 ~~~~~~Protein OPG070~~~
MAATVPRFDDVYKNAQRRILDQETFFSRGLSRPLMKNTYLFDNYAYGWIPETAIWSSRYANLDASDYYPISLGLLKKFEF
LMSLYKGPIPVYEEKVNTEFIANGSFSGRYVSYLRKFSALPTNEFISFLLLTSIPIYNILFWFKNTQFDITKHTLFRYVY
TDNAKHLALARYMHQTGDYKPLFSRLKENYIFTGPVPIGIKDINHPNLSRARSPSDYETLANISTILYFTKYDPVLMFLL
FYVPGYSITTKITPAVEYLMDKLNLTKSDVQLL
>P23373 1.8.3.2~~~~~~Probable FAD-linked sulfhydryl oxidase OPG072~~~
MNPKHWGRAVWTIIFIVLSQAGLDGNIEACKRKLYTIVSTLPCPACRRHATIAIEDNNVMSSDDLNYIYYFFIRLFNNLA
SDPKYAIDVTKVNPL
>P68448 ~~~~~~Core protein OPG073~~~
MELVNIFLETDAGRVKFAIKNTDDVCASELINKFVELLSEYIHIDQSEFYLVVKDKDIFYFKCDRGSISIVNNEFYVFDE
PLLFVKDFTNVTGVEFIVTETMPCRIIPKNNHAVISVVTNHKFYNGLSL
>Q80HX1 ~~~~~~Protein OPG074~~~
MFMYPEFARKALSKLISKKLNIEKVSSKHQLVLLDYGLHGLLPKSLYLEAINSDILNVRFFPPEIINVTDIVKALQNSCR
VDEYLKSVSLYHKNSLMVSGPNVVKLMIEYNLLTHSDLEWLINENVVKATYLLKINAYMINFKIDLTVDEIIDLVKDIPV
GATLHLYNILNNIDLDIVLRISDEYNIPPVHDILSKLTDEEMCIKLVTKYPMDNVINFINQDVRYSPTFIKTIKDFVNEH
LPTMYDGLNDYLHSVIIDEDLIEEYKIKSVAMFNLEYKTDVNTLTLDEQIFVEVNISYYDFRYRQFADEFRDYIMIKERR
QITMQSGDRIRRFRRPMSLRSTIIKKDTDSLEDILAHIDNARKNSKVSIEDVERIISSFRLNPCVVRRTMLSDIDIKTKI
MVLKIVKDWKSCALTLSAIKGIMVTDTINTVLSKILHHHRNVFKYLTSVENKEIAVCNCSRCLSLFYRELKSVRCDLHTD
DGLLDRLYDLTRYALHGKINQNLIGQRCWGPLTEMLFNENKKKKLNNLMEYIKISDMLVYGHSIEKTLIPITDSLSFKLS
VDTMSVLNDQYAKVVIFFNTIIEYIIATIYYRLTVLNNYTNVKHFVSKVLHTVMEACGVLFSYIKVNDKIEHELEEMVDK
GTVPSYLYHLSINVISIILDDINGTR
>P0CK21 ~~~~~~Entry-fusion complex protein OPG076~~~
MLVVIMFFIAFAFCSWLSYSYLRPYISTKELNKSR
>Q8V518 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKNGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIAEFVKENERVNVSRVVECLTVPNITISSNTE
>Q6RZN2 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIAEFVKENERVNVSRVVECLTVPNITISSNAE
>O93117 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPELRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIVEFVKENERVNVSRVVECLTVPNITISSNAE
>P20498 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIAEFVKENERVNVSRVVECLTVPNITISSNAE
>Q9QBG0 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNAIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDITEFVKENERVNVSRVVEYLTVPNITISSNAE
>Q77TL8 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIAEFVKENERVNVSRVVECLTVPNITISSNAE
>P16714 ~~~~~~Telomere-binding protein OPG077~~~
MAEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKDGKLDDTGKKELVLTDVEKRILNTIDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRLFRSPQVKDNIISRTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVKMGAFMYTKHSMLTNAISSRVDRYSKKFQESFYEDIAEFVKENERVNVSRVVECLTVPNITISSNAE
>P32999 ~~~~~~Telomere-binding protein OPG077~~~
MVEFEDQLVFNSISARALKAYFTAKINEMVDELVTRKCPQKKKSQAKKPEVRIPVDLVKSSFVKKFGLCNYGGILISLIN
SLVENNFFTKNGKLDDTGKKELVLTDVEKRILNTVDKSSPLYIDISDVKVLAARLKRSATQFNFNGHTYHLENDKIEDLI
NQLVKDESIQLDEKSSIKDSMYVIPDELIDVLKTRSFRSPQVKDNIISSTRLYDYFTRVTKRDESSIYVILKDPRIASIL
SLETVEMGAFMYTKHSMLTNAISSEVDRYSEKFQESFYEDIAEFVEENERVDVSRVVECLTVPNITISSNAE
>P68604 ~~~~~~Protein OPG078~~~
MDKLYAAIFGVFMGSPEDDLTDFIEIVKSVLSDEKTVTSTNNTGCWGWYWLIIIFFIVLILLLLIYLYLKVVW
>P68605 ~~~~~~Protein OPG078~~~
MDKLYAAIFGVFMGSPEDDLTDFIEIVKSVLSDEKTVTSTNNTGCWGWYWLIIIFFIVLILLLLIYLYLKVVW
>P68606 ~~~~~~Protein OPG078~~~
MDKLYAAIFGVFMGSPEDDLTDFIEIVKSVLSDEKTVTSTNNTGCWGWYWLIIIFFIVLILLLLIYLYLKVVW
>P0DOR3 ~~~~~~Protein OPG078~~~
MDKLYAAIFGVFMGSPEDDLTDFIEIVKSVLSDEKTVTSTNNTGCWGWYWLIIIFFIVLILLLLIYLYLKVVW
>P0DOR4 ~~~~~~Protein OPG078~~~
MDKLYAAIFGVFMGSPEDDLTDFIEIVKSVLSDEKTVTSTNNTGCWGWYWLIIIFFIVLILLLLIYLYLKVVW
>P12923 ~~~~~~Protein OPG079~~~
MSKVIKKRVETSPRPTASSDSLQTCAGVIEYAKSISKSNAKCIEYVTLNASQYANCSSISIKLTDSLSSQMTSTFIMLEG
ETKLYKNKSKQDRSDGYFLKIKVTAASPMLYQLLEAVYGNIKHKERIPNSLHSLSVETITEKTFKDESIFINKLNGAMVE
YVSTGESSILRSIEGELESLSKRERQLAKAIITPVVFYRSGTETKITFALKKLIIDREVVANVIGLSGDSERVSMTENVE
EDLARNLGLVDIDDEYDEDSDKEKPIFNV
>P12924 ~~~~~~Protein OPG081~~~
MVDAITVLTAIGITVLMLLMVISGAALIVKELNPNDIFTMQSLKFNRAVTIFKYIGLFIYIPGTIILYATYVKSLLMKS
>P68462 ~~~~~~Telomere-binding protein OPG082~~~
MNNFVKQVASKSLKPTKKLSPSDEVISLNECIISFNLDNFYYCNDGLFTKPINTPEDVLKSLLIMESFAYEKMIIKGLIK
ILISRAYINDIYFTPFGWLTGVDDDPETHVVIKIIFNSSLISIKSQVIEYLKPYNVNNLSVLTTEKELSINTFNVPDSIP
MSIISFFPFDTDFILVILFFGVYNDSYCGISYISPKERLPYIIEILKPLVSEINMLSDEIGRTSSIRIFNSTSVKKFPTN
TLTSICEIVYSFDESSFPTPKTFTPLNASPYIPKKIVSLLDLPSNVEIKAISRGGVDFITHINNKRLNTILVIAKDNFLK
NSTFSGTFIKENIIWKGIYTYRIIKSSFPVPTIKSVTNKKKICKKHCFVNSQYTTRTLSHIL
>P20501 3.4.22.-~~~~~~Core protease OPG082~~~
MERYTDLVISKIPELGFTNLLCHIYSLAGLCSNIDVSKFLTNCNGYVVEKYDKSTTAGKVSCIPIGMMLELVESGHLSRP
NSSDELDQKKELTDELKTRYHSIYDVFELPTSIPLAYFFKPRLREKVSKAIDFSQMDLKIDDLSRKGIHTGENPKVVKMK
IEPERGAWMSNRSIKNLVSQFAYGSEVDYIGQFDMRFLNSLAIHEKFDAFMNKHILSYILKDKIKSSTSRFVMFGFCYLS
HWKCVIYDKKQCLVSFYDSGGNIPTEFHHYNNFYFYSFSDGFNTNHRHSVLDNTNCDIDVLFRFFECTFGAKIGCINVEV
NQLLESECGMFISLFMILCTRTPPKSFKSLKKVYTFFKFLADKKMTLFKSILFNLHDLSLDITETDNAGLKEYKRMEKWT
KKSINVICDKLTTKLNRIVDDDE
>P12926 3.4.22.-~~~~~~Core protease OPG082~~~
MERYTDLVISKIPELGFTNLLCHIYSLAGLCSNIDVSKFLTNCNGYVVEKYDKSTTAGKVSCIPIGMMLELVESGHLSRP
NSSDELDQKKELTDELKTRYHSIYDVFELPTSIPLAYFFKPRLREKVSKAIDFSQMDLKIDDLSRKGIHTGENPKVVKMK
IEPERGAWMSNRSIKNLVSQFAYGSEVDYIGQFDMRFLNSLAIHEKFDAFMNKHILSYILKDKIKSSTSRFVMFGFCYLS
HWKCVIYDKKQCLVSFYDSGGNIPTEFHHYNNFYFYSFSDGFNTNHKHSVLDNTNCDIDVLFRFFECTFGAKIGCINVEV
NQLLESECGMFISLFMILCTRTPPKSFKSLKKVYTFFKFLADKKMTLFKSILFNLHDLSLDITETDNAGLKEYKRMEKWT
KKSINVICDKLTTKLNRIVNDDE
>P0DOL3 3.4.22.-~~~~~~Core protease OPG082~~~
MERYTDLVISKIPELGFTNLLCHIYSLAGLCSNIDVSKFLTNCNGYVVEKYDKSTTAGKVSCIPIGMMLELVESRHLSRP
NSSDELDQKKELTDELKTRYHSIYDVFELPTSIPLAYFFKPRLREKVSKAIDFSQMDLKIDDLSRKGIHTGENPKVVKMK
IEPERGAWMSNRSIKNLVSQFAYGSEVDYIGQFDMRFLNSLAIHEKFDAFMNKHILSYILKDKIKSSTSRFVMFGFCYLS
HWKCVIYDKKQCLVSFYDSGGNIPTEFHHYNNFYFYSFSDGFNTNHRHSVLDNTNCDIDVLFRFFECIFGAKIGCINVEV
NQLLESECGMFISLFMILCTRTPPKSFKSLKKVYTFFKFLADKKMTLFKSILFNLQDLSLDITETDNAGLKEYKRMEKWT
KKSINVICDKLTTKLNRIVDDDE
>P0DOL4 3.4.22.-~~~~~~Core protease OPG082~~~
MERYTDLVISKIPELGFTNLLCHIYSLAGLCSNIDVSKFLTNCNGYVVEKYDKSTTAGKVSCIPIGMMLELVESRHLSRP
NSSDELDQKKELTDELKTRYHSIYDVFELPTSIPLAYFFKPRLREKVSKAIDFSQMDLKIDDLSRKGIHTGENPKVVKMK
IEPERGAWMSNRSIKNLVSQFAYGSEVDYIGQFDMRFLNSLAIHEKFDAFMNKHILSYILKDKIKSSTSRFVMFGFCYLS
HWKCVIYDKKQCLVSFYDSGGNIPTEFHHYNNFYFYSFSDGFNTNHRHSVLDNTNCDIDVLFRFFECIFGAKIGCINVEV
NQLLESECGMFISLFMILCTRTPPKSFKSLKKVYTFFKFLADKKMTLFKSILFNLQDLSLDITETDNAGLKEYKRMEKWT
KKSINVICDKLTTKLNRIVDDDE
>P16713 3.4.24.-~~~~~~Metalloendopeptidase OPG085~~~
MIVLPNKVRIFINDRMKKDIYLGISNFGFENDIDEILGIAHLLEHLLISFDSTNFLANASTSRSYMSFWCKSINSATESD
AIRTLVSWFFSNGKLKDNFSLSSIRFHIKELENEYYFRNEVFHCMDILTFLSGGDLYNGGRIDMIDNLNIVRDMLVNRMQ
RISGSNIVIFVKRLGPGTLDFFKQTFGSLPACPEIIPSSIPVSTNGKIVMTPSPFYTVMVKINPTLDNILGILYLYETYH
LIDYETIGNQLYLTVSFIDETEYESFLRGEAILQISQCQRINMNYSDDYMMNIYLNFPWLSHDLYDYITRINDDSKSILI
SLTNEIYASIINRDIIVIYPNFSKAMCNTRDTQQHPIVVLDATNDGLIKKPYRSIPLMKRLTSNEIFIRYGDASLMDMIT
LSLSKQDISLKRNAEGIRVKHSFSADDIQAIMESDSFLKYSRSKPAAMYQYIFLSFFASGNSIDDILANRDSTLEFSKRT
KSKILFGRNTRYDVTAKSSFVCGIVRGKSLDKTSLVEMMWDLKKKGLIYSMEFTNLLSKNTFYLFTFTIYTDEVYDYLNT
NKLFFAKCLVVSTKGDVENFSSLKKDVVIRV
>P68458 ~~~~~~Entry-fusion complex protein OPG086~~~
MASLLYLILFLLFVCISYYFTYYPTNKLQAAVMETDRENAIIRQRNDEIPTRTLDTAIFTDASTVASAQIHLYYNSNIGK
IIMSLNGKKHTFNLYDDNDIRTLLPILLLSK
>P68456 ~~~~~~Late transcription elongation factor OPG087~~~
MPFRDLILFNLSKFLLTEDEESLEIVSSLCRGFEISYDDLITYFPDRKYHKYISKVFEHVDLSEELSMEFHDTTLRDLVY
LRLYKYSKCIRPCYKLGDNLKGIVVIKDRNIYIREANDDLIEYLLKEYTPQIYTYSNERVPITGSKLILCGFSQVTFMAY
TTSHITTNKKVDVLVSKKCIDELVDPINYQILQNLFDKGSGTINKILRKIFYSVTGGQTP
>P21026 3.1.-.-~~~~~~Putative nuclease OPG089~~~
MGIKNLKSLLLENKSLTILDDNLYKVYNGIFVDTMSIYIAVANCVRNLEELTTVFIKYVNGWVKKGGHVTLFIDRGSIKI
KQDVRDKRRKYSKLTKDRKMLELEKCTSEIQNVTGFMEEEIKAEMQLKIDKLTFQIYLSDSDNIKISLNEILTHFNNNEN
VTLFYCDERDAEFVMCLEAKTHFSTTGEWPLIISTDQDTMLFASTDNHPKMIKNLTQLFKFVPSAEDNYLAKLTALVNGC
DFFPGLYGASITPNNLNKIQLFSDFTIDNIVTSLAIKNYYRKTNSTVDVRNIVTFINDYANLDDVYSYVPPCQCTVQEFI
FSALDEKWNNFKSSYLETVPLPCQLMYALEPRKEIDVSEVKTLSSYIDFENTKSDIDVIKSISSIFGYSNENCNTIVFGI
YKDNLLLSINNSFYFNDSLLITNTKSDNIINIGY
>Q80HX0 3.1.-.-~~~~~~Putative nuclease OPG089~~~
MGIKNLKSLLLENKSLTILDDNLYKVYNGIFVDTMSIYIAVANCVRNLEELTTVFIKYVNGWVKKGGHVTLFIDRGSIKI
KQDVRDKRRKYSKLTKDRKMLELEKCTSEIQNVTGFMEEEIKAEMQLKIDKLTFQIYLSDSDNIKISLNEILTHFNNNEN
VTLFYCDERDAEFVMCLEAKTHFSTTGEWPLIISTDQDTMLFASADNHPKMIKNLTQLFKYVPSAEDNYLAKLTALVNGC
DFFPGLYGASITPNNLNKIQLFSDFTIDNIVTSLAIKNYYRKTNSTVDVRNIVTFINDYANLDDVYSYIPPCQCTVQEFI
FSALDEKWNEFKSSYLESVPLPCQLMYALEPRKEIDVSEVKTLSSYIDFENTKSDIDVIKSISSIFGYSNENCNTIVFGI
YKDNLLLSINSSFYFNDSLLITNTKSDNIINIGY
>P0DSS5 3.1.-.-~~~~~~Putative nuclease OPG089~~~
MGIKNLKSLLLENKSLTILDDNLYKVYNGIFVDTMSIYIAVANCVRNLEELTTVFIKYVNGWVKKGGHVTLFIDRGSIKI
KQNVRDKRRKYSKSTKDRKMLELEKCTSKIQNVTGFMEEEIKAEIQLKIDKLTFQIYLSDSDNIKISLNEILTHFNNNEN
VTLFYCDERDAEFVMCLEAKTYFFTTGEWPLIISTDQDTMLFASVDNHPKMIKNLTQLFKFVPSAEDNYLAKLTALVNGC
DFFPGLYGASITPTNLNKIQLFSDFTINNIVTSLAIKNYYRKTNSTVDVRNIVTFINDYANLDDVYSYIPPCQCTVQEFI
FSALDEKWNDFKSSYLETVPLPCQLMYALEPRKEIDVSEVKTLSSYIDFENTKSDIDVIKSISSIFGYSNENCNTIVFGI
YKDNLLLSINSSFYFNNSLLITNTKSDNIINIGY
>P0DSS6 3.1.-.-~~~~~~Putative nuclease OPG089~~~
MGIKNLKSLLLENKSLTILDDNLYKVYNGIFVDTMSIYIAVANCVRNLEELTTVFIKYVNGWVKKGGHVTLFIDRGSIKI
KQNVRDKRRKYSKSTKDRKMLELEKCTSKIQNVTGFMEEEIKAEIQLKIDKLTFQIYLSDSDNIKISLNEILTHFNNNEN
VTLFYCDERDAEFVMCLEAKTYFFTTGEWPLIISTDQDTMLFASVDNHPKMIKNLTQLFKFVPSAEDNYLAKLTALVNGC
DFFPGLYGASITPTNLNKIQLFSDFTINNIVTSLAIKNYYRKTNSTVDVRNIVTFINDYANLDDVYSYIPPCQCTVQEFI
FSALDEKWNDFKSSYLETVPLPCQLMYALEPRKEIDVSEVKTLSSYIDFENTKSDIDVIKSISSIFGYSNENCNTIVFGI
YKDNLLLSINSSFYFNNSLLITNTKSDNIINIGY
>Q80HW9 ~~~~~~Protein OPG091~~~
MDPVNFIKTYAPRGSIIFINYTMSLTSHLNPSIEKHVGIYYGTLLSEHLVVESTYRKGVRIVPLDSFFEGYLSAKVYMLE
NIQVMKIAADTSLTLLGIPYGFGHDRMYCFKLVADCYKNAGIDTSSKRILGKDIFLSQNFTDDNRWIKIYDSNNLTFWQI
DYLKG
>P21030 ~~~~~~Entry-fusion complex protein OPG094~~~
MGGGVSVELPKRDPPPGVPTDEMLLNVDKMHDVIAPAKLLEYVHIGPLAKDKEDKVKKRYPEFRLVNTGPGGLSALLRQS
YNGTAPNCCRTFNRTHYWKKDGKISDKYEEGAVLESCWPDVHDTGKCDVDLFDWCQGDTFDRNICHQWIGSAFNRSNRTV
EGQQSLINLYNKMQTLCSKDASVPICESFLHHLRAHNTEDSKEMIDYILRQQSADFKQKYMRCSYPTRDKLEESLKYAEP
RECWDPECSNANVNFLLTRNYNNLGLCNIVRCNTSVNNLQMDKTSSLRLSCGLSNSDRFSTVPVNRAKVVQHNIKHSFDL
KLHLISLLSLLVIWILIVAI
>P07611 ~~~~~~Entry-fusion complex protein OPG094~~~
MGGGVSVELPKRDPPPGVPTDEMLLNVDKMHDVIAPAKLLEYVHIGPLAKDKEDKVKKRYPEFRLVNTGPGGLSALLRQS
YNGTAPNCCRTFNRTHYWKKDGKISDKYEEGAVLESCWPDVHDTGKCDVDLFDWCQGDTFDRNICHQWIGSAFNRSNRTV
EGQQSLINLYNKMQTLCSKDASVPICESFLHHLRAHNTEDSKEMIDYILRQQSADFKQKYMRCSYPTRDKLEESLKYAEP
RECWDPECSNANVNFLLTRNYNNLGLCNIVRCNTSVNNLQMDKTSSLRLSCGLSNSDRFSTVPVNRAKVVQHNIKHSFDL
KLHLISLLSLLVIWILIVAI
>P20540 ~~~~~~Entry-fusion complex associated protein OPG095~~~
MGAAASIQTTVNTLSERISSKLEQEANASAQTKCDIEIGNFYIRQNHGCNLTVKNMCSADADAQLDAVLSAATETYSGLT
PEQKAYVPAMFTAALNIQTSVNTVVRDFENYVKQTCNSSAVVDNKLKIQNVIIDECYGAPGSPTNLEFINTGSSKGNCAI
KALMQLTTKATTQIAPRQVAGTGVQFYMIVIGVIILAALFMYYAKRMLFTSTNDKIKLILANKENVHWTTYMDTFFRTSP
MVIATTDMQN
>P07612 ~~~~~~Entry-fusion complex associated protein OPG095~~~
MGAAASIQTTVNTLSERISSKLEQEANASAQTKCDIEIGNFYIRQNHGCNLTVKNMCSADADAQLDAVLSAATETYSGLT
PEQKAYVPAMFTAALNIQTSVNTVVRDFENYVKQTCNSSAVVDNKLKIQNVIIDECYGAPGSPTNLEFINTGSSKGNCAI
KALMQLTTKATTQIAPKQVAGTGVQFYMIVIGVIILAALFMYYAKRMLFTSTNDKIKLILANKENVHWTTYMDTFFRTSP
MVIATTDMQN
>Q76RD1 ~~~~~~Protein OPG096~~~
MEVIADRLDDIVKQNIADEKFVDFVIHGLEHQCPAILRPLIRLFIDILLFVIVIYIFTVRLVSRNYQMLLALVALVITLT
IFYYFIL
>P20843 ~~~~~~Protein OPG096~~~
MEVIADRLDDIVKQNIADEKFVDFVIHGLEHQCPAILRPLIRLFIDILLFVIVIYIFTVRLVSRNYQMLLALVALVITLT
IFYYFIL
>P07613 ~~~~~~Protein OPG096~~~
MEVITDRLDDIVKQNIADEKFVDFVIHGLEHQCPAILRPLIRLFIDILLFVIVIYIFTVRLVSRNYQMLLALVALVITLT
IFYYFIL
>P0DOM9 ~~~~~~Protein OPG096~~~
MEVIADRLDDIVKQNIADEKFVDFVIHGLEHQCPAILRPLIRLFIDILLFVIVIYIFTVRLVSRNYQMLLALLALVISLT
IFYYFIL
>P0DON0 ~~~~~~Protein OPG096~~~
MEVIADRLDDIVKQNIADEKFVDFVIHGLEHQCPAILRPLIRLFIDILLFVIVIYIFTVRLVSRNYQMLLALLALVISLT
IFYYFIL
>P07614 ~~~~~~Protein OPG097~~~
MNTRTDVTNDNIDKNPTKRGDKNIPGRNERFNDQNRFNNDIPKPKPRLQPNQPPKQDNKCREENGDFINIRLCAYEKEYC
NDGYLSPAYYMLKQVDDEEMSCWSELSSLVRSRKAVGFPLLKAAKRISHGSMLYFEQFKNSKVVRLTPQVKCLNDTVIFQ
TVVILYSMYKRGIYSNEFCFDLVSIPRTNIVFSVNQLMFNICTDILVVLSICGNRLYRTNLPQSCYLNFIHGHETIARRG
YEHSNYFFEWLIKNHISLLTKQTMDILKVKKKYAIGAPVNRLLEPGTLVYVPKEDYYFIGISLTDVSISDNVRVLFSTDG
IVLEIEDFNIKHLFMAGEMFVRSQSSTIIV
>P68623 ~~~~~~Entry-fusion complex protein OPG094~~~
MENVPNVYFNPVFIEPTFKHSLLSVYKHRLIVLFEVFVVFILIYVFFRSELNMFFMPKRKIPDPIDRLRRANLACEDDKL
MIYGLPWMTTQTSALSINSKPIVYKDCAKLLRSINGSQPVSLNDVLRR
>A0A7H0DN72 ~~~~~~Virion assembly protein OPG100~~~
MDHNQYLLTMFFADDDSFFKYFASQDDESSLSDILQITQYLDFLLLLLIQSKNKLEAVGHCYESLSEEYRQLTKFTDSQD
FKKLFNKVPIVTDGRVKLNKGYLFDFVISLMRFKKESALATTAIDPVRYIDPRRDIAFSNVMDILKSNKVEK
>P21032 ~~~~~~Virion assembly protein OPG100~~~
MDHNQYLLTMFFADDDSFFKYLASQDDESSLSDILQITQYLDFLLLLLIQSKNKLEAVGHCYESLSEEYRQLTKFTDSQD
FKKLFNKVPIVTDGRVKLNKGYLFDFVISLMRFKKESSLATTAIDPIRYIDPRRDIAFSNVMDILKSNKVNNN
>P07616 ~~~~~~Virion assembly protein OPG100~~~
MDHNQYLLTMFFADDDSFFKYLASQDDESSLSDILQITQYLDFLLLLLIQSKNKLEAVGHCYESLSEEYRQLTKFTDFQD
FKKLFNKVPIVTDGRVKLNKGYLFDFVISLMRFKKESSLATTAIDPVRYIDPRRNIAFSNVMDILKSNKVNNN
>P33004 ~~~~~~Virion assembly protein OPG100~~~
MDHNQYLLTMFFADDDSFFKYLASQDDESSLSDILQITQYLDFLLLLLIQSKNKLEAVGHCYESLSEEYRQLTKFTDSQD
FKKLFNKVPIVTDGRVKLNKGYLFDFVISLMRFKKESALATTAIDPVRYIDPRRDIAFSNVMDILKSNKAKNNYSLLSS
>P07618 ~~~~~~Protein OPG104~~~
MTDEQIYAFCDANKDDIRCKCIYPDKSIVRIGIDTRLPYYCWYEPCKRSDALLPASLKKNITKCNVSDCTISLGNVSITD
SKLDVNNVCDSKRVATENIAVRYLNQEIRYPIIDIKWLPIGLLALAILILAFF
>P08583 ~~~~~~Protein OPG107~~~
MDKTTLSVNACNLEYVREKAIVGVQAAKTSTLIFFVIILAISALLLWFQTSDNPVFNELTRYMRIKNTVNDWKSLTDSKT
KLESDRGRLLAAGKDDIFEFKCVDFGAYFIAMRLDKKTYLPQAIRRGTGDAWMVKKAAKVDPSAQQFCQYLIKHKSNNVI
TCGNEMLNELGYSGYFMSPHWCSDFSNME
>A0A7H0DN80 ~~~~~~Envelope protein OPG108~~~
MAAVKTPVIVVPVIDRPPSETFPNVHEHINDQKFDDVKDNEVMQEKRDVVIVNDDPDHYKDYVFIQWTGGNIRDDDKYTH
FFSGFCNTMCTEETKRNIARHLALWDSKFFTELENKNVEYVVIIENDNVIEDITFLRPVLKAIHDKKIDILQMREIITGN
KVKTELVIDKDHAIFTYTGGYDVSLSAYIIRVTTALNIVDEIIKSGGLSSGFYFEIARIENEMKINRQIMDNSAKYVEHD
PRLVAEHRFETMKPNFWSRIGTVAAKRYPGVMYTFTTPLISFFGLFDINVIGLIVILFIMFMLIFNVKSKLLWFLTGTFV
TAFI
>P20497 ~~~~~~Envelope protein H3~~~
MAAVKTPVIVVPVIDRPPSETFPNVHEHINDQKFDDVKDNEVMPEKRNVVVVKDDPDHYKDYAFIQWTGGNIRNDDKYTH
FFSGFCNTMCTEETKRNIARHLALWDSNFFTELENKKVEYVVIVENDNVIEDITFLRPVLKAMHDKKIDILQMREIITGN
KVKTELVMDKNHTIFTYTGGYDVSLSAYIIRVTTALNIVDEIIKSGGLSSGFYFEIARIENEMKINRQILDNAAKYVEHD
PRLVAEHRFENMKPNFWSRIGTAAAKRYPGVMYAFTTPLISFFGLFDINVIGLIVILFIMFMLIFNVKSKLLWFLTGTFV
TAFI
>P07240 ~~~~~~Envelope protein OPG108~~~
MAAAKTPVIVVPVIDRLPSETFPNVHEHINDQKFDDVKDNEVMPEKRNVVVVKDDPDHYKDYAFIQWTGGNIRNDDKYTH
FFSGFCNTMCTEETKRNIARHLALWDSNFFTELENKKVEYVVIVENDNVIEDITFLRPVLKAMHDKKIDILQMREIITGN
KVKTELVMDKNHAIFTYTGGYDVSLSAYIIRVTTELNIVDEIIKSGGLSSGFYFEIARIENEMKINRQILDNAAKYVEHD
PRLVAEHRFENMKPNFWSRIGTAATKRYPGVMYAFTTPLISFFGLFDINVIGLIVILFIMFMLIFNVKSKLLWFLTGTFV
TAFI
>P33059 ~~~~~~Envelope protein H3~~~
MATVNKTPVIVVPVIDRPPSETFPNLHEHINDQKFDDVKDNEVMPEKRNVVIVKDDPDHYKDYAFIHWTGGNIRNDDKYT
HFFSGFCNTMCTEETKRNIARHLALWDSKFFTELENKKVEYVVIVENDNVIEDITFLRPVLKAMHDKKIDILQMREIITG
NKVKTELVMDKNHVIFTYTGGYDVSLSAYIIRVTTALNIVDEIIKSGGLSSGFYFEIARIENEIKINRQIMDNSAKYVEH
DPRLVAEHRFENMKPNFWSRIGTAAVKRYPGVMYAFTTPLISFFGLFDINVIGLIVILFIMFMLIFNVKSKLLWFLTGTF
VTAFI
>P20538 ~~~~~~Late transcription elongation factor OPG110~~~
MAWSITNKADTSSFTKMAEIRAHLKNSAENKDKNEDIFPEDVIIPSTKPKTKRATTPRKPAATKRSTKKEEVEEEVVIEE
YHQTTEKNSPSPGVGDIVESVAAVELDDSDGDDEPMVQVEAGKVNHSARSDLSDLKVATDNIVKDLKKIITRISAVSTVL
EDVQAAGISRQFTSMTKAITTLSDLVTEGKSKVVRKKVKTCKK
>P07242 ~~~~~~Late transcription elongation factor OPG110~~~
MAWSITNKADTSSFTKMAEIRAHLKNSAENKDKNEDIFPEDVIIPSTKPKTKRATTPRKPAATKRSTKKEEVEEEVVIEE
YHQTTEKNSPSPGVSDIVESVAAVELDDSDGDDEPMVQVEAGKVNHSARSDLSDLKVATDNIVKDLKKIITRISAVSTVL
EDVQAAGISRQFTSMTKAITTLSDLVTEGKSKVVRKKVKTCKK
>P33062 ~~~~~~Late transcription elongation factor OPG110~~~
MAWSITNKADTSSFTKMAEIRAHLRNSAENKDKNDDIFPEDVIIPSTKPKTKRATTPRKPAATKRSTKKDKEKEEVEEEE
VVIEEYHQTTEENSPPPSSSPGVGNIVESVTAVELDDSNGDDDNDNDNDDNEPMVQVEAGKVNHSARSDLSDLKVATDNI
VKDLKKIITRISAVSTVLEDVQAAGISRQFTSMTKSITTLSDLVTEGKSKVVRKKVKTCKK
>A0A7H0DN84 ~~~~~~Late protein OPG112~~~
MEMDKRMKSLAMTAFFGELNTLDIMALIMSIFKHHPNNTIFSVDKDGQFMIDFEYDNYKASQYLDLTLTPISGNECKTHA
SSIAEQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSYTRISRDTEKLKIALAKGIDYEYIKDAC
>O57208 ~~~~~~Late protein OPG112~~~
MEMDKRMKSLAMTAFFGELNTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDNYKASQYLDLTLTPISGDECKTHA
SSIAEQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISRDTEKLKIALAKGIDYEYIKDAC
>P20539 ~~~~~~Late protein OPG112~~~
MEMDKRIKSLAMTAFFGELNTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDNYKASQYLDLTLTPISGDECKTHA
SSIAEQLACADIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISRDTEKLKIALAKGIDYEYIKDAC
>P08586 ~~~~~~Late protein OPG112~~~
MEMDKRMKSLAMTAFFGELSTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDNYKASQYLDLTLTPISGDECKTHA
SSIAEQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISRDTEKLKIALAKGIDYEYIKDAC
>P0DON7 ~~~~~~Late protein OPG112~~~
MEMDKRMKSLAMTAFFGELTTLDIMALIMSIFKRHPNNTIFSVDKDGQFMIDFEYDTYKASQYLDLPLTPISGDECKTHA
SSIAKQLACVDIIKEDISEYIKTTPRLKRFIKKYRNRSDTRISQDTEKLKIALAKGIDYEYIKDAC
>M1KJ15 ~~~~~~Core protein OPG114~~~
MSINIDIKKITDLLNSSILFPDDLQELLREKYIVLERKSNGTPTVAHIYKTMARFDNKSIYRIAKFLFMNRPDVIKLLFL
EDVEPLLPYKSINISINNTEYPQLEGPIGTKIALLELFNAFRTGISEPIPYYYLPLRKDINNIVTK
>P21008 ~~~~~~Core protein D2~~~
MSINIDIKKITDLLNSSILFPDDVQELLREKYIVLERKSNGTPTVAHIYKTMARFDNKSIYRIAKFLFMNRPDVIKLLFL
EDVEPLLPDKSINISINNTEYPQLEGPIGTKIALLELFNAFRTGISEPIPYYYLPLRKDINNIVTK
>P04300 ~~~~~~Core protein OPG114~~~
MSINIDIKKITDLLNSSILFPDDVQELLREKYIVLERKSNGTPTVAHIYKTMARFDNKSIYRIAKFLFMNRPDVIKLLFL
KDVEPLLPDKSINISINNTEYPQLEGPIGTKIALLELFNAFRTGRSEPIPYYYLPLRKDINNIVTK
>P0DOS2 ~~~~~~Core protein D2~~~
MSINIDIKKITDLLNSSILFPDDVQELLREKYIVLERKSNGTPTVAHIYKTMARFDNKSIYRIAKFLFMNRPDVIKLLFL
EDVEPLLPDKSINISINNTEYPQLEGPIGTKIALLELFNAFKTGISEPIPYYYLPLRKDINNIVTK
>P0DOS1 ~~~~~~Core protein D2~~~
MSINIDIKKITDLLNSSILFPDDVQELLREKYIVLERKSNGTPTVAHIYKTMARFDNKSIYRIAKFLFMNRPDVIKLLFL
EDVEPLLPDKSINISINNTEYPQLEGPIGTKIALLELFNAFKTGISEPIPYYYLPLRKDINNIVTK
>O57210 ~~~~~~Core protein OPG115~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRSKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DAGRQVRWCSTSNHISEDMHTDKFVIYDIYTFDSFKNKRLVFVQVPPSLGDDSYLTNPLLSPYYRNSVARQMVNDMIFNQ
DSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFVFAWFNGVSENEKVLDTYKKVSNLI
>P21009 ~~~~~~Core protein OPG115~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRLKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DAGRQVRWCSTSNHISEDIPEDIHTDKFVIYDIYTFDAFKNKRLVFVQVPPSLGDDSHLTNPLLSPYYRNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFVFAWFNGVSENEKVLDTYKKVSNLI
>Q9JFA5 ~~~~~~Core protein OPG115~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRLKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DAGRQVRWCSTSNHISKDIPEDMHTDKFVIYDIYTFEAFKNKRLVFVQVPPSLGDDSHLTNPLLSPYYRNSVARQMVNNM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFVFAWFNGVSENEKVLDTYKKVSNLI
>P04302 ~~~~~~Core protein OPG115~~~
MDIFIVKDNKYPKVDNDDNEVFILLGNHNDFIRLKLTKLKEHVFFSEYIVTPDTYGSLCVELNGSSFQHGGRYIEVEEFI
DAGRQVRWCSTSNHISKDIPEDMHTDKFVIYDIYTFDAFKNKRLVFVQVPPSLGDDSHLTNPLLSPYYRNSVARQMVNDM
IFNQDSFLKYLLEHLIRSHYRVSKHITIVRYKDTEELNLTRICYNRDKFKAFVFAWFNGVSENEKVLDTYKKVSNLI
>P21010 3.6.4.-~~~~~~Uncoating factor OPG117~~~
MDAAIRGNDVIFVLKTIGVPSACRQNEDPRFVEAFKCDELKRYIDNNPECTLFESLRDEEAYSIVRIFMDVDLDACLDEI
DYLTAIQDFIIEVSNCVARFAFTECGAIHENVIKSMRSNFSLTKSTNRDKTSFHIIFLDTYTTMDTLIAMKRTLLELSRS
SENPLTRSIDTAVYRRKTTLRVVGTRKNPNCDTIHVMQPPHDNIEDYLFTYVDMNNNSYYFSLQRRLEDLVPDKLWEPGF
ISFEDAIKRVSKIFINSIINFNDLDENNFTTVPLVIDYVTPCALCKKRSHKHPHQLSLENGAIRIYKTGNPHSCKVKIVP
LDGNKLFNIAQRILDTNSVLLTERGDYIVWINNSWKFNSEEPLITKLILSIRHQLPKEYSSELLCPRKRKTVEANIRDML
VDSVETDTYPDKLPFKNGVLDLVDGMFYSGDDAKKYTCTVSTGFKFDDTKFVEDSPEMEELMNIINDIQPLTDENKKNRE
LYEKTLSSCLCGATKGCLTFFFGETATGKSTTKRLLKSAIGDLFVETGQTILTDVLDKGPNPFIANMHLKRSVFCSELPD
FACSGSKKIRSDNIKKLTEPCVIGRPCFSNKINNRNHATIIIDTNYKPVFDRIDNALMRRIAVVRFRTHFSQPSGREAAE
NNDAYDKVKLLDEGLDGKIQNNRYRFAFLYLLVKWYKKYHVPIMKLYPTPEEIPDFAFYLKIGTLLVSSSVKHIPLMTDL
SKKGYILYDNVVTLPLTTFQQKISKYFNSRLFGHDIESFINRHKKFANVSDEYLQYIFIEDISSP
>P04305 3.6.4.-~~~~~~Uncoating factor OPG117~~~
MDAAIRGNDVIFVLKTIGVPSACRQNEDPRFVEAFKCDELERYIENNPECTLFESLRDEEAYSIVRIFMDVDLDACLDEI
DYLTAIQDFIIEVSNCVARFAFTECGAIHENVIKSMRSNFSLTKSTNRDKTSFHIIFLDTYTTMDTLIAMKRTLLELSRS
SENPLTRSIDTAVYRRKTTLRVVGTRKNPNCDTIHVMQPPHDNIEDYLFTYVDMNNNSYYFSLQQRLEDLVPDKLWEPGF
ISFEDAIKRVSKIFINSIINFNDLDENNFTTVPLVIDYVTPCALCKKRSHKHPHQLSLENGAIRIYKTGNPHSCKVKIVP
LDGNKLFNIAQRILDTNSVLLTERGDHIVWINNSWKFNSEEPLITKLILSIRHQLPKEYSSELLCPRKRKTVEANIRDML
VDSVETDTYPDKLPFKNGVLDLVDGMFYSGDDAKKYTCTVSTGFKFDDTKFVEDSPEMEELMNIINDIQPLTDENKKNRE
LYEKTLSSCLCGATKGCLTFFFGETATGKSTTKRLLKSAIGDLFVETGQTILTDVLDKGPNPFIANMHLKRSVFCSELPD
FACSGSKKIRSDNIKKLTEPCVIGRPCFSNKINNRNHATIIIDTNYKPVFDRIDNALMRRIAVVRFRTHFSQPSGREAAE
NNDAYDKVKLLDEGLDGKIQNNRYRFAFLYLLVKWYRKYHVPIMKLYPTPEEIPDFAFYLKIGTLLVSSSVKHIPLMTDL
SKKGYILYDNVVTLPLTTFQQKISKYFNSRLFGHDIESFINRHKKFANVSDEYLQYIFIEDISSP
>P21011 3.1.3.-~~~~~~mRNA-decapping protein OPG121~~~
MGITMDEEVIFETPRELISIKRIKDIPRSKDTHVFAACITSDGYPLIGARRTSFAFQAILSQQNSDSIFRVSTKLLRFMY
YNELREIFRRLRKGSINNIDPHFEELILLGGKLDKKESIKDCLRRELKEESDERITVKEFGNVILKLTTRDKLFNKVYIS
YCMACFINQSLEDLSHTSIYNVEIRKIKSLNDCINDDKYEYLSYIYNMLVNSK
>P04311 3.1.3.-~~~~~~mRNA-decapping protein OPG121~~~
MGITMDEEVIFETPRELISIKRIKDIPRSKDTHVFAACITSDGYPLIGARRTSFAFQAILSQQNSDSIFRVSTKLLRFMY
YNELREIFRRLRKGSINNIDPHFEELILLGGKLDKKESIKDCLRRELKEESDERITVKEFGNVILKLTTRDKLFNKVYIG
YCMACFINQSLEDLSHTSIYNVEIRKIKSLNDCINDDKYEYLSYIYNMLVNSK
>P0DOU5 3.1.3.-~~~~~~mRNA-decapping protein OPG121~~~
MGITMDEEVIFETPRELISIKRIKDIPRSKDTHVFAACITSDGYPLIGARRTSFAFQAILSQQNSDSIFRVSTKLLRFMY
YNELREIFRRLRKGSINNIDPHFEELILLGGKLDKKESIKDCLRRELKEESDERITVKEFGNVILKLTTQDKLFNKVYIG
YCMSCFINQSLEDLSHTSIYNVEIRKIKSLNDCINDDKYEYLSYIYNMLVNSK
>P21012 3.1.3.-~~~~~~mRNA-decapping protein OPG122~~~
MNFYRSSIISQIIKYNRRLAKSIICEDDSQIITLTAFVNQCLWCHKRVSVSAILLTTDNKILVCNRRDSFLYSEIIRTRN
MFRKKRLFLNYSNYLSKQERSILSSFFSLYPATADNDRIDAIYPGGIPKRGENVPECLSREIKEEVNIDNSFVFIDTRFF
IHGIIEDTIINKFFEVIFFVGRISLTSDQIIDTFKSNHEIKDLIFLDPNSGNGLQYEIAKYALDTAKLKCYGHRGCYYES
LKKLTEDD
>P04312 3.1.3.-~~~~~~mRNA-decapping protein OPG122~~~
MNFYRSSIISQIIKYNRRLAKSIICEDDSQIITLTAFVNQCLWCHKRVSVSAILLTTDNKILVCNRRDSFLYSEIIRTRN
MFRKKRLFLNYSNYLNKQERSILSSFFSLDPATADNDRIDAIYPGGIPKRGENVPECLSREIKEEVNIDNSFVFIDTRFF
IHGIIEDTIINKFFEVIFFVGRISLTSDQIIDTFKSNHEIKDLIFLDPNSGNGLQYEIAKYALDTAKLKCYGHRGCYYES
LKKLTEDD
>P33071 3.1.3.-~~~~~~mRNA-decapping protein OPG122~~~
MNFYRSSIISQIIKYNRRLAKSIICEDDSQIITLTAFVNQCLWCHKRVSVSAILLTTDNKILVCNRRDSFLYSEIIRTRN
MSRKKRLFLNYSNYLNKQERSILSSFFSLDPATIDNDRIDAIYPGGILKRGENVPECLSREIKEEVNIDNSFVFIDTRFF
IHGIIEDTIINKFFEVIFFVGRISLTSDQIIDTFKSNHEIKDLIFLDPNSGNGLQYEIAKYALDTAKLKCYGHRGCYYES
LKKLTEDD
>A0A7H0DN97 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDFIKRSNVFAVDSQIPTLYMPQYISLSGVMTNDGPDNQAIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSVSSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGV
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTVFAHISHTINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDINHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>Q76ZR4 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDSIKRSNVFAVDSQIPTLYMPQYISLSGVMTNDGPDNQAIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSISSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGI
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTIFAHISHSINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDVNHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>P68441 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDSIKRSNVFAVDSQIPTLYMPQYISLSGVMTNDGPDNQAIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSISSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGI
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTIFAHISHSINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDVNHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>Q77TJ4 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDSIKRSNVFAVDSQIPTLYMPQYISLSGVMTNDGPDNQAIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSISSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGI
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTIFAHISHSINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDVNHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>P68440 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDSIKRSNVFAVDSQIPTLYMPQYISLSGVMTNDGPDNQAIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSISSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGI
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTIFAHISHSINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDVNHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>P0DSP6 ~~~~~~Scaffold protein OPG125~~~
MNNTIINSLIGGDDSIKRSNVFAVDSQIPTLYMPQYISLSGVMTNNGPDNQTIASFEIRDQYITALNHLVLSLELPEVKG
MGRFGYVPYVGYKCINHVSVSSCNGVIWEIEGEELYNNCINNTIALKHSGYSSELNDISIGLTPNDTIKEPSTVYVYIKT
PFDVEDTFSSLKLSDSKITVTVTFNPVSDIVIRDSSFDFETFNKEFVYVPELSFIGYMVKNVQIKPSFIEKPRRVIGQIN
QPTATVTEVHAATSLSVYTKPYYGNTDNKFISYPGYSQDEKDYIDAYVSRLLDDLVIVSDGPPTGYPESAEIVEVPEDGI
VSIQDADVYVKIDNVPDNMSVYLHTNLLMFGTRKNSFIYNISKKFSAITGTYSDATKRTVFAHISHSINIIDTSIPVSLW
TSQRNVYNGDNRSAESKAKDLFINDPFIKGIDFKNKTDIISRLEVRFGNDVLYSENGPISRIYNELLTKSNNGTRTLTFN
FTPKIFFRPTTITANVSRGKDKLSVRVVYSTMDVNHPIYYVQKQLVVVCNDLYKVSYDQGVSITKIMGDNN
>M1L535 ~~~~~~Protein OPG128~~~
MSWYEKYNIVLNPPKRCSSTCSDNLTTILSEDGTNIIRAILYSQPKKLKILQDFLTTSRNKMFLYKILDDEIRRVLT
>P0CK20 ~~~~~~Protein OPG128~~~
MSWYEKYNIVLNPPKRCSSACADNLTTILAEDGNHIRAILYSQPKKLKILQDFLATSRNKMFLYKILDDEIRRVLT
>P07608 ~~~~~~Protein OPG128~~~
MSWYEKYNIVLNPPKRCSFACADNLTTILAEDGNNIRAILYSQPKKLKILQDFLATSRNKMFLYKILDDEIRRVLT
>Q07032 ~~~~~~Protein OPG128~~~
MSWYEKYNIVLNPPKRCFSSCADNLTTILAEDGNNIRAILYSQPQKLKVLQDFLATSRNKMFLYKILDDEIRRVLT
>P06440 ~~~~~~Major core protein OPG129~~~
MEAVVNSDVFLTSNAGLKSSYTNQTLSLVDEDHIHTSDKSLSCSVCNSLSQIVDDDFISAGARNQRTKPKRAGNNQSQQP
IKKDCMVSIDEVASTHDWSTRLRNDGNAIAKYLTTNKYDTSQFTIQDMLNIMNKLNIVRTNRNELFQLLTHVKSTLNNAS
VSVKCTHPLVLIHSRASPRIGDQLKELDKIYSPSNHHILLSTTRFQSMHFTDMSSSQDLSFIYRKPETNYYIHPILMALF
GIKLPALENAYVHGDTYSLIQQLYEFRKVKSYNYMLLVNRLTEDNPIVITGVSDLISTEIQRANMHTMIRKAIMNIIMGI
FYCNDDDAVDPHLMKIIHTGCSQVMTDEEQILASILSIVGFRPTLVSVARPINGISYDMKLQAAPYIVVNPMKMITTSDS
PISINSKDIYSMAFDGNSGRVVFAPPNIGYGRCSGVTHIDPLGTNVMGSAVHSPVIVNGAMMFYVERRQNKNMFGGECYT
GFRSLIDDTPIDVSPEIMLNGIMYRLKSAVCYKLGDQFFDCGSSDIFLKGHYTILFTENGPWMYDLSVFNPGARNARLMR
ALKNQYKKLSMDSDDGFYEWLNGDGSVFAASKQQMLMNHVANFDDDLLTMEEAMSMISRHCCILIYAQDYDQYISARHIT
ELF
>P20983 ~~~~~~39kDa core protein OPG130~~~
MDFFNKFSQGLAESSTPKSSIYYSEEKDPDTKKDEAIEIGLKSQESYYQRQLREQLARDNMMAASRQPIQPLQPTIHITP
QPVPTATPAPILLPSSTAPTPKPRQQTNTSSDMSNLFDWLSEDTDAPASSLLPALTPSNAVQDIISKFNKDQKTTTPPST
QPSQTLPTTTCTQQSDGNISCTTPTVTPPQPPIVATVCTPTPTGGTVCTTAQQNPNPGAASQQNLDDMALKDLMSSVEKD
MHQLQAETNDLVTNVYDAREYTRRAIDQILQLVKGFERFQK
>P29191 ~~~~~~39kDa core protein OPG130~~~
MDFFNKFSQGLAESSTPKSSIYYSEEKDPDTKKDEAIEIGLKSQESYYQRQLREQLARDNMTVASRQPIQPLQPTIHITP
QPVPTATPAPILLPSSTVPTPKPRQQTNTSSDMSNLFDWLSEDTDAPASSLLPALTPSNAVQDIISKFNKDQKTTTPPST
QPSQTLPTTTCTQQSDGNISCTTPTVTPPQPPIVATVCTPTPTGGTVCTTAQQNPNPGAASQQNLDDMALKDLMSNVERD
MHQLQAETNDLVTNVYDAREYTRRAIDQILQLVKGFERFQK
>P0DOP1 ~~~~~~39kDa core protein OPG130~~~
MDFFNKFSQGLAESSTPKSSIYYSEEKDLDIKKDEAIEIGLKSQESYYQRQLREQLARDNMMAASRQPIQPLQPTIHITP
LQVPTPAPTPKPRQQQTNTSSDMSNLFDWLSADDNTQPSSLLPALTPINAVQDIISKFNKDQKTTTTPSTQPSQTLPTTT
CTQQSDGSISCTTPTVTPPQPPIVATVCTPTPTGGTVCTTAQQNPNPGATSQQNLDNMALKDLMSSVEKDMRQLQAETND
LVTNVYDAREYTRRAIDQILQLVKGFERFQK
>P0DOP2 ~~~~~~39kDa core protein OPG130~~~
MDFFNKFSQGLAESSTPKSSIYYSEEKDLDIKKDEAIEIGLKSQESYYQRQLREQLARDNMMAASRQPIQPLQPTIHITP
LQVPTPAPTPKPRQQQTNTSSDMSNLFDWLSADDNTQPSSLLPALTPINAVQDIISKFNKDQKTTTTPSTQPSQTLPTTT
CTQQSDGSISCTTPTVTPPQPPIVATVCTPTPTGGTVCTTAQQNPNPGATSQQNLDNMALKDLMSSVEKDMRQLQAETND
LVTNVYDAREYTRRAIDQILQLVKGFERFQK
>P20985 ~~~~~~Virion morphogenesis protein OPG132~~~
MDKLRVLYDEFVTISKDNLERETGLSASDVDMDFDLNIFMTLVPVLEKKVCAITPTIEDDKIVTMMKYCSYQSFSFWFLK
SGAVVKSVYNKLDDVEKEKFVATFRDMLLNVQTLISLNSMYTRLRQDTEDIVSDSKKIMEIVSHLRASTTENAAYQVLQQ
NNSFIISTLNKILSDENYLLKIIAVFDSKLISEKETLNEYKQLYTISSESLVYGIRCVSNLDISSVQLSNNKYVLFVKKM
LPKIILFQNNDINAQQFANVISKIYTLIYRQLTSNVDVGCLLTDTIESAKTKISVEKIKQTGINNVQSLIKFISDNKKEY
KTIISEEYLSKEDRIITILQDIVNEHDIKYDNKLLNMRDLIVTFRERYSYKF
>P29192 ~~~~~~Virion morphogenesis protein OPG132~~~
MDKLRVLYDEFVTISKDNLERETGLSASDVDMDFDLNIFMTLVPVLEKKVCAITPTIEDDKIVTMMKYCSYQSFLFWFLK
SGAVVKSDNKLDDVEKEKFVATFRDMLLNVQTLISLNSMYTRLRQDTEDIVSDSKKIMEIVSHLRASTTENAAYQVLQQN
NSFIISTLNKILSDENYLLKIIAVFDSKLISEKETLNEYKQLYTISSESLVYGIRCVSNLDISSVQLSNNKYVLFVKKML
PKIILFQNNDINAQQFANVISKIYTLIYRQLTSNVDVGCLLTDTIESAKTKISVEKIKQTGINNVQSLIKFISDNKKEYK
TIISEEYLSKEDRIITILQDIVNEHDIKYDNKLLNMRDLIVTFRERYSYKF
>A0A7H0DNA7 ~~~~~~Virion membrane protein OPG135~~~
MSCYTAILKSVGGLALFQVANGAIDLCRHFFMYFCEQKLRPNSFWFVVVRAIASMIMYLVLGIALLYISEQDDKKNTNND
SNSNNDKRNVSSINSNSSHK
>O57222 ~~~~~~Virion membrane protein OPG135~~~
MSCYTAILKSVGGLALFQVANGAIDLCRHFFMYFCEQKLRPNSFWFVVVRAIASMIMYLVLGIALLYISEQDNKKNTNND
KRNESSINSNSSPK
>P20987 ~~~~~~Virion membrane protein OPG135~~~
MSCYTAILKSVGGLALFQVANGAIDLCRHFFMYFCEQKLRPNSFWFVVVRAIASMIMYLVLGIALLYISEQDDKKNTNND
GSNNDKRNESSINSNSSPK
>Q85320 ~~~~~~Virion membrane protein OPG135~~~
MSCYTAILKSVGGLALFQVANGAIDLCRHFFMYFCEQKLRPNSFWFVVVRAIASMIMYLVLGIALLYISEQDDKKNTNNA
NTNNDSNSNNSNNDKRNESSINSNSSPK
>P33835 ~~~~~~Virion membrane protein OPG135~~~
MSCYTAILKSVGGLALFQDANGAIDLCRHFFMYFCEQKLRPNSFWFVVVRAIASMIMYLVLGIALLYISEQDDKKNTNNA
SNSNKLNESSINSNS
>P16715 ~~~~~~Major core protein OPG136 precursor~~~
MMPIKSIVTLDQLEDSEYLFRIVSTVLPHLCLDYKVCDQLKTTFVHPFDILLNNSLGSVTKQDELQAAISKLGINYLIDT
TSRELKLFNVTLNAGNIDIINTPINISSETNPIINTHSFYDLPPFTQHLLNIRLTDTEYRARFIGGYIKPDGSDSMDVLA
EKKYPDLNFDNTYLFNILYKDVINAPIKEFKAKIVNGVLSRQDFDNLIGVRQYITIQDRPRFDDAYNIADAARHYGVNLN
TLPLPNVDLTTMPTYKHLIMFEQYFIYTYDRVDIYYNGNKMLFDDEIINFTISMRYQSLIPRLVDFFPDIPVNNNIVLHT
RDPQNAAVNVTVALPNVQFVDINRNNKFFINFFNLLAKEQRSTAIKVTKSMFWDGMDYEEYKSKNLQDMMFINSTCYVFG
LYNHNNTTYCSILSDIISAEKTPIRVCLLPRVVGGKTVTNLISETLKSISSMTIREFPRKDKSIMHIGLSETGFMRFFQL
LRLMADKPHETAIKEVVMAYVGIKLGDKGSPYYIRKESYQDFIYLLFASMGFKVTTRRSIMGSNNISIISIRPRVTKQYI
VATLMKTSCSKNEAEKLITSAFDLLNFMVSVSDFRDYQSYRQYRNYCPRYFYAGSPEGEETIICDSEPISILDRIDTRGI
FSAYTINEMMDTDIFSPENKAFKNNLSRFIESGDITGEDIFCAMPYNILDRIITNAGTCTVSIGDMLDNITTQSDCNMTN
EITDMINASLKNTISKDNNMLVSQALNSVANRSKQKIGDLRQSSCKMALLFKNLATSIYTIERIFNAKVGDDVKASMLEK
YKVFTDISMSLYKDLIAMENLKAMLYIIRRSGCRIDDAQITTDDLVKSYSLIRPKILSMINYYNEMSRGYFEHMKKNLNM
TDGDSVSFDDE
>M1L9Q3 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNSEAIMESRSMITSMPTQTPSLGVVYDKDKRIQMLEDEVV
NLRNQRSNTKSSDNLDNFTRILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKRK
IIVTNVIIIVINIIEQALLKLGFEEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>Q76RB8 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNAEAIMESRSMISSMPTQTPSLGVVYDKDKRIQMLEDEVV
NLRNQRSNTKSSDNLDNFTRILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKRK
IIVTNVIIIVINIIEQALLKLGFEEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>P20988 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNAEAIMESRSMISSMPTQTPSLGVVYDKDKRIQMLEDEVV
NLRNQRSNTKSSDNLDNFTRILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKRK
IIVTNVIIIVINIIEQALLKLGFEEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>Q80HV8 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNAEAIMESRSMITSMPTQTPSLGVVYDKDKRIQMLEDEVV
NLRNQRSNTKSSDNLDNFTKILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKQK
IIVTNVIIIVINIIEQALLKLGFEEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>P0DOQ7 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNAEAIMESRSMITSMPTQTPSLGVVYDKDKRIQMLEDEV
VNLRNQQSNTKSSNNLDNFTRILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKR
KIIVTNVIIIVINIIEQALLKLGFDEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>P0DOQ8 ~~~~~~Protein OPG137~~~
MTTVPVTDIQNDLITEFSEDNYPSNKNYEITLRQMSILTHVNNVVDREHNAAVVSSPEEISSQLNEDLFPDDDSPATIIE
RVQQPHTTIIDDTPPPTFRRELLISEQRQQREKRFNITVSKNAEAIMESRSMITSMPTQTPSLGVVYDKDKRIQMLEDEV
VNLRNQQSNTKSSNNLDNFTRILFGKTPYKSTEVNKRIAIVNYANLNGSPLSVEDLDVCSEDEIDRIYKTIKQYHESRKR
KIIVTNVIIIVINIIEQALLKLGFDEIKGLSTDITSEIIDVEIGDDCDAVASKLGIGNSPVLNIVLFILKIFVKRIKII
>M1L543 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKSKSSKCITCKPKSKSSSST
STSKSSKNTSGAPRRRTTATTSFNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIESVNAGKKKPSKKM
PTNKKINMSSGMRRQEQINPDDCCLDMGMY
>O57224 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKCITCKPKSKSSSSSTSTSK
GSKNTSGAPRRRTTVTTTSYNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIDSVNAGKKKPSKKMPTN
KKINMSSGMRRQEQINPDDCCLDMGMY
>P20989 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKSKSSKCITCKPKSKSSSSS
TSTSKGSKNTSGAPRRRTTVTTTSYNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIDSVNAGKKKPSK
KMPTNKKINMSSGMRRQEQINPDDCCLDMGMY
>Q80HV7 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKSKSSKCITCKPKSKSSSSS
TSASKGSKNTSGAPRRRTTVTTTSYNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIDSVNAGKKKPSK
KMPTNKKINMSSGMRRQEQINPDDCCLDMGMY
>P0DOK9 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKSKSSKCITCKSKSSSSSTS
TSKSSKNTSGAPRRRTTATTSFNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIESVNAGKKKPSKKMP
TNKKINMSSGMRRQEQINPNDCCLDMGMY
>P0DOL0 ~~~~~~25 kDa core protein OPG138~~~
MADKKNLAVRSSYDDYIETVNKITPQLKNLLAQIGGDAAVKGGNNNLNSQTDVTAGACDTKSKSSKCITCKSKSSSSSTS
TSKSSKNTSGAPRRRTTATTSFNAMDGQIVQAVTNAGKIVYGTVRDGQLEVRGMVGEINHDLLGIESVNAGKKKPSKKMP
TNKKINMSSGMRRQEQINPNDCCLDMGMY
>P20990 ~~~~~~Virion membrane protein OPG139~~~
MIGILLLIGICVAVTVAILYSMYNKIKNSQNPNPSPNLNSPPPEPKNTKFVNNLEKDHISSLYNLVKSSV
>Q76ZQ4 ~~~~~~Virion membrane protein OPG139~~~
MIGILLLIGICVAVTVAILYSMYNKIKNSQNPNPSPNLNSPPPEPKNTKFVNNLEKDHISSLYNLVKSSV
>P0DSW1 ~~~~~~Virion membrane protein OPG139~~~
MIGILLLIGICVAVTVAILYAMYNKIKNSQNPSPNVNLPPPETRNTRFVNNLEKDHISSLYNLVKSSV
>P0DSW2 ~~~~~~Virion membrane protein OPG139~~~
MIGILLLIGICVAVTVAILYAMYNKIKNSQNPSPNVNLPPPETRNTRFVNNLEKDHISSLYNLVKSSV
>Q76ZQ3 ~~~~~~Virion membrane protein OPG140~~~
MDMMLMIGNYFSGVLIAGIILLILSCIFAFIDFSKSTSPTRTWKVLSIMAFILGIIITVGMLIYSMWGKHCAPHRVSGVI
HTNHSDISMN
>A0A7H0DNB3 ~~~~~~Virion membrane protein OPG141~~~
MISNYEPLLLLVITCCVLLFNFTISSKTKIDIIFAVQTIVFIWFIFHFVYSAI
>P0CK28 ~~~~~~Virion membrane protein OPG141~~~
MISNYEPLLLLVITCCVLLFNFTISSKTKIDIIFAVQTIVFIWFIFHFVHSAI
>Q80HV6 ~~~~~~Virion membrane protein OPG141~~~
MISNYEPLLLLVITCCVLLFNFTISSKTKIDIIFAVQTIVFIWFIFHFVHSAI
>P0CK27 ~~~~~~Virion membrane protein OPG141~~~
MISNYEPLLLLVITCCVLLFNFTISSKTKIDIIFAVQTIVFIWFIFHFVHSAI
>P68718 ~~~~~~Core protein OPG142~~~
MFVDDNSLIIYSTWPSTLSDSSGRVIVMPDNRSFTFKEGFKLDESIKSILLVNPSSIDLLKIRVYKHRIKWMGDIFVLFE
QENIPPPFRLVNDK
>P20993 ~~~~~~Virion membrane protein OPG143~~~
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGSLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRAKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
SAVGDVHHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRPKIKTNDINVRRR
>P16710 ~~~~~~Virion membrane protein OPG143~~~
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGSLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRAKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
SAVGDVHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRPKIKTNDINVRRR
>P33841 ~~~~~~Virion membrane protein OPG143~~~
MGAAVTLNRINIASGIADIRDKYMELGFNYPKYNRTVKFAEESYMYYYETSPGEIKPKFCLIDGMSIDHCSSFIVPEFAK
QYVLIHGEPCSSFKFRPGTLIYYQNEVTPEYIKDLKHATDYIASGQRCHFIKKDYLLGDSDSVAKCCSKTNTKHCPKIFN
NNYKTEHCDDFMTGFCRNDPGNPNCLEWLRVKRKPAMSTYSDICSKHMDARYCSEFIRIIRPDYFTFGDTALYVFCNDHK
GNRNCWCANYPKSNSGDKYLGPRVCWLHECTDESRDRKWLYYNQDVQRTRCKYVGCTINVNSLALKNSQAELTSNCTRTT
STVGDIHPGEPVVKDKIKLPTWLGAAITLVVISVIFYFISIYSRPKIKTNDINVRRR
>P68593 ~~~~~~Virion membrane protein OPG144 precursor~~~
MSYLRYYNMLDDFSAGAGVLDKDLFTEEQQQSFMPKDGGMMQNDYGGMNDYLGIFKNNDVRTLLGLILFVLALYSPPLIS
ILMIFISSFLLPLTSLVITYCLVTQMYRGGNGNTVGMSIVCIVAAVIIMAINVFTNSQIFNIISYIILFILFFAYVMNIE
RQDYRRSINVTIPEQYTCNKPYTAGNKVDVDIPTFNSLNTDDY
>P16712 3.6.4.-~~~~~~Transcript termination protein OPG145~~~
MSLLKMEYNLYAELKKMTCGQPLSLFNEDGDFVEVEPGSSFKFLIPKGFYASPSVKTSLVFETLTTTDNKITSINPTNAP
KLYPLQRKVVSEVVSNMRKMIESKRPLYITLHLACGFGKTITTCYLMATHGRKTVICVPNKMLIHQWKTQVEAVGLEHKI
SIDGVSSLLKELKTQSPDVLIVVSRHLTNDAFCKYINKHYDLFILDESHTYNLMNNTAVTRFLAYYPPMMCYFLTATPRP
ANRIYCNSIINIAKLSDLKKTIYAVDSFFEPYSTDNIRHMVKRLDGPSNKYHIYTEKLLSVDEPRNQLILNTLVEEFKSG
TINRILVITKLREHMVLFYKRLLDLFGPEVVFIGDAQNRRTPDMVKSIKELNRFIFVSTLFYSGTGLDIPSLDSLFICSA
VINNMQIEQLLGRVCRETELLDRTVYVFPNTSIKEIKYMIGNFMQRIISLSVDKLGFKQESYRKHQESDPTSVCTTSSRE
ERVLNRIFNSQNR
>P68714 ~~~~~~Protein OPG146~~~
MDSTNVRSGMKSRKKKPKTTVIDDDDDCMTCSACQSKLVKISDITKVSLDYINTMRGNTLACAACGSSLKLLNDFAS
>P68712 ~~~~~~Virion membrane protein OPG147~~~
MITLFLILCYFILIFNIIVPAISEKMRRERAAYVNYKRLNKNFICVDDRLFSYNFTTSGIKAKVAVDNKNVPIPCSKINE
VNNNKDVDTLYCDKDRDDIPGFARSCYRAYSDLFFTT
>P20995 ~~~~~~DNA polymerase processivity factor component OPG148~~~
MTSSADLTNLKELLSLYKSLRFSDSAAIEKYNSLVEWGTSTYWKIGVQKVANVETSISDYYDEVKNKPFNIDPGYYIFLP
VYFGSVFIYSKGKNMVELGSGNSFQIPDDMRSACNKVLDSDNGIDFLRFVLLNNRWIMEDAISKYQSPVNIFKLASEYGL
NIPKYLEIEIEEDTLFDDELYSIIERSFDDKFPKISISYIKLGELRRQVVDFFKFSFMYIESIKVDRIGDNIFIPSVITK
SGKKILVKDVDHLIRSKVREHTFVKVKKKNTFSILYDYDGNGTETRGEVIKRIIDTIGRDYYVNGKYFSKVGSAGLKQLT
NKLDINECATVDELVDEINKSGTVKRKIKNQSAFDLSRECLGYPEADFITLVNNMRFKIENCKVVNFNIENTNCLNNPSI
ETIYRNFNQFVSIFNVVTDVKKRLFE
>P68710 ~~~~~~DNA polymerase processivity factor component OPG148~~~
MTSSADLTNLKELLSLYKSLKFSDSAAIEKYNSLVEWGTSTYWKIGVQKVANVETSISDYYDEVKNKPFNIDPGYYIFLP
VYFGSVFIYSKGKNMVELGSGNSFQIPDDMRSACNKVLDSDNGIDFLRFVLLNNRWIMEDAISKYQSPVNIFKLASEYGL
NIPKYLEIEIEEDTLFDDELYSIIERSFDDKFPKISISYIKLGELRRQVVDFFKFSFMYIESIKVDRIGDNIFIPSVITK
SGKKILVKDVDHLIRSKVREHTFVKVKKKNTFSILYDYDGNGTETRGEVIKRIIDTIGRDYYVNGKYFSKVGSAGLKQLT
NKLDINECATVDELVDEINKSGTVKRKIKNQSAFDLSRECLGYPEADFITLVNNMRFKIENCKVVNFNIENTNCLNNPSI
ETIYGNFNQFVSIFNIVTDVKKRLFE
>P24758 ~~~~~~Envelop protein OPG153~~~
MANIINLWNGIVPTVQDVNVASITAFKSMIDETWDKKIEANTCISRKHRNIIHEVIRDFMKAYPKMDENKKSPLGAPMQW
LTQYYILKNEYHKTMLAYDNGSLNTKFKTLNIYMITNVGQYILYIVFCIISGKNHDGTPYIYDSEITSNDKNFINERIKY
ACKQILHGQLTIALRIRNKFMFIGSPMYLWFNVNGSQVYHDIYDRNAGFHNKEIGRLLYAFMYYLSISGRFLNDFALLKF
TYLGESWTFSLSVPEYILYGLGYSVFDTIEKFSNDAILVYIRTNNRNGYDYVEFNKKGIAKVTEDKPDNDKRIHAIRLIN
DSTDVQHIHFGFRNMVIIDNECANIQSSAENATDTGHHQDSKINIEVEDDVIDDDDYNPKPTPIPEPHPRPPFPRHEYHK
RPKLLPVEEPDPVKKDADRIRLDNHILNTLDHNLNFIGHYCCDTAAVDRLEHHIETLGQYAVILARKINMQTLLFPWPLP
TVHPHAIDGSIPPHGRSTIL
>P11258 ~~~~~~Protein OPG154~~~
MDGTLFPGDDDLAIPATEFFSTKAAKKPEAKREAIVKADEDDNEETLKQRLTNLEKKITNVTTKFEQIEKCCKRNDEVLF
RLENHAETLRAAMISLAKKIDVQTGRRPYE
>P68633 ~~~~~~Envelope protein OPG155~~~
MNSLSIFFIVVATAAVCLLFIQGYSIYENYGNIKEFNATHAAFEYSKSIGGTPALDRRVQDVNDTISDVKQKWRCVVYPG
NGFVSASIFGFQAEVGPNNTRSIRKFNTMQQCIDFTFSDVININIYNPCVVPNINNAECQFLKSVL
>P68596 ~~~~~~Protein OPG157~~~
MEDLNEANFSHLLINLSNNKDIDAQYASTLSVVHELLSAINFKIFNINKKSKKNSKSIEQHPVVHHAASAGREFNRR
>P68615 ~~~~~~DNA packaging protein OPG160~~~
MNCFQEKQFSRENLLKMPFRMVLTGGSGSGKTIYLLSLFSTLVKKYKHIFLFTPVYNPDYDGYIWPNHINFVSSQESLEY
NLIRTKSNIEKCIAVAQNHKKSAHFLLIFDDVGDKLSKCNTLIEFLNFGRHLNTSIILLCQTYRHVPILGRANITHFCSF
NISISDAENMLRSMPVKGKRKDILNMLNMIQTARSNNRLAIIIEDSVFCEGELRICTDTADKDVIEQKLNIDILVNQYSH
MKKNLNAILESKKTKLCNSDQSSSSKNVSS
>P68616 ~~~~~~Protein OPG161~~~
MMTPENDEEQTSVFSATVYGDKIQGKNKRKRVIGLCIRISMVISLLSMITMSAFLIVRLNQCMSANEAAITDAAVAVAAA
SSTHRKVASSTTQYDHKESCNGLYYQGSCYILHSDYQLFSDAKANCTAESSTLPNKSDVLITWLIDYVEDTWGSDGNPIT
KTTSDYQDSDVSQEVRKYFCVKTMN
>P68617 ~~~~~~Protein OPG161~~~
MMTPENDEEQTSVFSATVYGDKIQGKNKRKRVIGLCIRISMVISLLSMITMSAFLIVRLNQCMSANEAAITDAAVAVAAA
SSTHRKVASSTTQYDHKESCNGLYYQGSCYILHSDYQLFSDAKANCTAESSTLPNKSDVLITWLIDYVEDTWGSDGNPIT
KTTSDYQDSDVSQEVRKYFCVKTMN
>P0DON1 ~~~~~~Protein OPG161~~~
MMTPENDEEQTSVFSATVYGDKIQGKNKRKRVIGICIRISMVISLLSMITMSAFLIVRLNQCMSANEAAITDATAVAAAL
STHRKVASSTTQYKHQESCNGLYYQGSCYIFHSDYQLFSDAKANCATESSTLPNKSDVLTTWLIDYVEDTWGSDGNPITK
TTTDYQDSDVSQEVRKYFCVKTMN
>P0DON2 ~~~~~~Protein OPG161~~~
MMTPENDEEQTSVFSATVYGDKIQGKNKRKRVIGICIRISMVISLLSMITMSAFLIVRLNQCMSANEAAITDATAVAAAL
STHRKVASSTTQYKHQESCNGLYYQGSCYIFHSDYQLFSDAKANCATESSTLPNKSDVLTTWLIDYVEDTWGSDGNPITK
TTTDYQDSDVSQEVRKYFCVKTMN
>P24761 ~~~~~~Protein OPG162~~~
MKSLNRQTVSRFKKLSVPAAIMMILSTIISGIGTFLHYKEELMPSACANGWIQYDKHCYLDTNIKMSTDNAVYQCRKLRA
RLPRPDTRHLRVLFSIFYKDYWVSLKKTNDKWLDINNDKDIDISKLTNFKQLNSTTDAEACYIYKSGKLVKTVCKSTQSV
LCVKKFYK
>A0A7H0DND4 ~~~~~~Protein OPG163~~~
MDAAFVITPMGVLTITDTLYDDLDISIMDFIGPYIIGNIKIVQIDVRDIKYSDMQKCYFSYKGKIVPQDSNDLARFNIYS
ICTAYRSKNTIIIACDYDIMLDIEGKHQPFYLFPSIDVFNATIIEAYNLYTAGDYHLIINPSDNLKMKLSFNSSFCISDG
NGWIIIDGKCNSNFLS
>P21058 ~~~~~~Protein OPG163~~~
MDAAFVITPMGVLTITDTLYDDLDISIMDFIGPYIIGNIKTVQIDVRDIKYSDMQKCYFSYKGKIVPQDSNDLARFNIYS
ICAAYRSKNTIIIACDYDIMLDIEDKHQPFYLFPSIDVFNATIIEAYNLYTAGDYHLIINPSDNLKMKLSFNSSFCISDG
NGWIIIDGKCNSNFLS
>Q01232 ~~~~~~Protein OPG163~~~
MDAAFVITPMGVLTITDTLYDDLDISIMDFIGPYIIGNIKTVQIDVRDIKYSDMQKCYFSYKGKIVPQDSNDLARFNIYS
ICAAYRSKNTIIIACDYDIMLDIEDKHQPFYLFPSIDVFNATIIEAYNLYTAGDYHLIINPSDNLKMKLLFNSSFCISDG
NGWIIIDGKCNSNFLS
>P68619 ~~~~~~Protein OPG164~~~
MMLVPLITVTVVAGTILVCYILYICRKKIRTVYNDNKIIMTKLKKIKSSNSSKSSKSTDSESDWEDHCSAMEQNNDVDNI
SRNEILDDDSFAGSLIWDNESNVMAPSTEHIYDSVAGSTLLINNDRNEQTIYQNTTVVINETETVEVLNEDTKQNPNYSS
NPFVNYNKTSICSKSNPFITELNNKFSENNPFRRAHSDDYLNKQEQDHEHDDIESSVVSLV
>A0A7H0DND6 ~~~~~~Protein OPG165~~~
MHYTNMEIFPVFGISKISNFIANNDCRYYIDVEHQKIISDEINRQMDETVLLTNILSVEVVNDNEMYHLIPHRLSTIILC
ISSVGGCVISIDNDVNDKNILTFPIDHAVIISPLSKCVVVSKGPTTILVVKADIPSKRLVTSFTNDILYVNNLSLINYLP
SSVFIIRRVTDYLDRHICDQIFANNKWYSIITIDDKQYPIPSNCIGMSSAKYINSSIEQDILIHVCNLEHPFDSVYKKMQ
SYNSLPIKEQILYGRIDNINMSISISVD
>P21060 ~~~~~~Protein OPG165~~~
MEIFPVFGISKISNFIANNDCRYYIDTEHQKIISDEINRQMDETVLLTNILSVEVVNDNEMYHLIPHRLSTIILCISSVG
GCVISIDNDVNGKNILTFPIDHAVIISPLSKCVVVSKGPTTILVVKADIPSKRLVTSFTNDILYVNNLSLINYLPLSVFI
IRRVTDYLDRHICDQIFANNKWYSIITIDNKQFPIPSNCIGMSSAKYINSSIEQDTLIHVCNLEHPFDLVYKKMQSYNSV
PIKEQILYGRIDNINMSISISVY
>P24762 ~~~~~~Protein OPG165~~~
MEIFPVFGISKISNFIANNDCRYYIDTEHQKIISDEINRQMDETVLLTNILSVEVVNDNEMYHLIPHRLSTIILCISSVG
GCVISIDNDINDKNILTFPIDHAVIISPLSKCVVVSKGPTTILVVKADIPSKRLVTSFTNDILYVNNLSLINYLPLSVFI
IRRVTDYLDRHICDQIFANNKWYSIITIDDKQYPIPSNCIGMSSAKYINSSIEQDTLIHVCNLEHPFDLVYKKMQSYNSV
PIKEQILYGRIDNINMSISISVD
>A0A7H0DND8 ~~~~~~Protein OPG170~~~
MYLLFIILMYLLPFSFQTSEPAYDKSVCDSGNKEYMGIEVYVEATLDEPLRQTTCESEIHKYGASVSNGGLNISVDLLNC
FLNFHTVGVYTNRDTVYAKFASLDPWTTEPMNSMTHDDLVKLTEECIVDIYLKCEVDKTKDFMKTNDNRLKPRDFKTVPP
SNVGSMIELQSDYCVNDVTAYVKIYDECGNIKQHSIPTLRDYFTTKNGQPRKILKKKIDNC
>P21064 ~~~~~~Protein OPG170~~~
MYSLLFIILMCIPFSFQTVYDDKSVCDSDNKEYMGIEVYVEATLDEPLRQTTCESEIHKYGASVSNGGLNISVDLLNCFL
NFHTVGVYTNRDTVYAKFASLDPWTTEPINSMTHDDLVKLTEECIVDIYLKCEVDKTKDFMKTNGNRLKPRDFKTVPPSD
VGSMIELQSDYCVNDVTAYVKIYDECGNIKQHSIPTLRDYFTTKNGQPRKILKKKFDNC
>P24766 ~~~~~~Protein OPG170~~~
MYSLVFVILMCIPFSFQTVYDDKSVCDSDNKEYMGIEVYVEATLDEPLRQTTCESKIHKYGASVSNGGLNISVDLLNCFL
NFHTVGVYTNRDTVYAKFASLDPWTTEPINSMTHDDLVKLTEECIVDIYLKCEVDKTKDFMKTNGNRLKPRDFKTVPPSN
VGSMIELQSDYCVNDVTTYVKIYDECGNIKQHSIPTLRDYFTTKNGQPRKILKKKFDNC
>P33854 ~~~~~~Protein OPG170~~~
MYSLVFVILMCIPFSFQTVYDDKSVCDSDNKEYMGIEVYVEATLDEPLRQTTCESEIHKYGASVSNGGLNISVDLLNCFL
NFHTVGVYTNRDTVYAKFTSLDPWTMEPINSMTYDDLVKLTEECIVDIYLKCEVDKTKDFIKTNGNRLKPRDFKTVPPNV
GSIIELQSDYCVNDVTAYVKIYDECGNIKQHSIPTLRDYFTTTNGQPRKILKKKFDNC
>A0A7H0DNE0 ~~~~~~Protein OPG172~~~
MMMKWIISILTMSIMPVLTYSSSIFRFHSEDIELCYGNLYFDRIYNNVVNIKYIPEHIPYRYNFINRTFSVDELDDNVFF
THGYFLKHKYGCSLNPSLIVSLSGNLKYNDIQCSVNVSCLIKNLATSTSTILTSKHKTYSLYRSMCIAIIGYDSIIWYKY
INDRYNDIYDFTAICMLIASTLIVIIYVFKKIKMNS
>P26671 ~~~~~~Protein A43~~~
MMMMKWIISILTMSIMPVLAYSSSIFRFHSEDVELCYGHLYFDRIYNVVNIKYNPHIPYRYNFINRTLTVDELDDNVFFT
HGYFLKHKYGSLNPSLIVSLSGNLKYNDIQCSVNVSCLIKNLATSTSTILTSKHKTYSLHRSTCITIIGYDSIIWYKDIN
DKYNGIYDFTAICMLIASTLIVTIYVFKKIKMNS
>A0A7H0DNE1 ~~~~~~Protein OPG173~~~
MDKIKITIDSKIGNVVTISYNLEKITIDVTPKKKKEKDVLLAQSVAVEEAKDVKVEEKNIIDIEDDDDMDIENT
>Q80HU0 ~~~~~~Protein OPG173~~~
MLLEMDKIKITVDSKIGNVVTISYNLEKITIDVTPKKKKEKDVLLAQSVAVEEAKDVKVEEKNIIDIEDDDDMDVESA
>A0A7H0DNE4 ~~~~~~Protein OPG176~~~
MAFDISVNASKTINALVYFSTQQDKLVIRNEVNDIHYTVEFDRDKVVDTFISYNRHNDSIEIRGVLPEETNIGRVVNTPV
SMTYLYNKYSFKPILAEYIRHRNTISGNIYSALMTLDDLVIKQYGDIDLLFNEKLKVDSDSGLFDFVNFVKDMICCDSRI
VVALSSLVSKHWELTNKKYRCMALAEHIADSIPISELSRLRYNLCKYLRGHTDSIEDEFDHFEDDDLSTCSAVTDRETDV
>P21066 ~~~~~~Protein OPG176~~~
MAFDISVNASKTINALVYFSTQQNKLVIRNEVNDTHYTVEFDRDKVVDTFISYNRHNDTIEIRGVLPEETNIGCAVNTPV
SMTYLYNKYSFKLILAEYIRHRNTISGNIYSALMTLDDLAIKQYGDIDLLFNEKLKVDSDSGLFDFVNFVKDMICCDSRI
VVALSSLVSKHWELTNKKYRCMALANIYLIVFQYLSYLDYDTIYVSIYAGTLRA
>P26672 ~~~~~~Protein OPG176~~~
MAFDISVNASKTINALVYFSTQQNKLVIRNEVNDTHYTVEFDRDKVVDTFISYNRHNDTIEIRGVLPEETNIGCAVNTPV
SMTYLYNKYSFKLILAEYIRHRNTISGNIYSALMTLDDLAIKQYGDIDLLFNEKLKVDSDSGLFDFVNFVKDMICCDSRI
VVALSSLVSKHWELTNKKYRCMALAEHISDSIPISELSRLRYNLCKYLRGHTESIEDKFDYFEDDDSSTCSAVTDRETDV
>P33876 ~~~~~~Protein OPG176~~~
MAFDISVNASKTINALVYFSTQQNKLVIRNEVNDTHYTVEFDRDKVVDTFISYNRHNDSIEIRGVLPEETNIGCTVNTPV
SMTYLYNKYSFKLILAEYIRHRNTVSGNIYSALMTLDDLVIKQYGDIDLLFNEKLKVDSDSGLFDFVNFVKDIICCDSRI
VVALSSLVSKHWELTNKKYRCMALAEHIADSIPISELSRPRYNLCKYLRGHTESIEDEFDYFEDDDSSTCSVVTDRETDV
>A0A7H0DNE7 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCERVCEKVKNVHIGGQSKNNTVIADLPYLDNAVSDVCKSIYKKNVSRIS
RFANLIKIDDDDKTPTGVYNYFKPKDAIPVIISIGKDKDVCELLISSDKACACIKLNLYKVAILPMDVSFFTKGNASLII
LLFDFSIDAAPLLRSVTDNNVIISRHQRLHDELPSSNWFKFYISIKSDYCSILYMVVDGSMMYAIADNRTHAIISKNILD
NTTINDECRCCYSEPQIRILDRDEMLNGSSCYMNRHCIMMNLPDVGEFGSSMLGKYEPDMIKIALSVAGNLIRNRDYIPG
RRGYSYYVYGIASR
>Q89181 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCEKVKNVHIGGQSKNNTVIADLPYMDNAVSDVCNSLYKKNVSRISRFAN
LIKIDDDDKTPTGVYNYFKPKDAIPVIISIGKDRDVCELLISSDKACACIELNSYKVAILPMDVSFFTKGNASLIILLFD
FSIDAAPLLRSVTDNNVIISRHQRLHDELPSSNWFKFYISIKSDYCSILYMVVDGSVMHAIADNRTYANISKNILDNTTI
NDECRCCYFEPQIRILDRDEMLNGSSCDMNRHCIMMNLPDVGEFGSSMLGKYEPDMIKIALSVAGIWKVL
>P21069 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCERVCEKVKNVHIGGQSKNNTVIADLPYMDNAVSDVCNSLYKKNVSRIS
RFANLIKIDDDDKTPTGVYNYFKPKDVIPVIISIGKDKDVCELLISSDISCACVELNSYKVAILPMDVSFFTKGNASLII
LLFDFSIDAAPLLRSVTDNNVIISRHQRLHDELPSSNWFKFYISIKSDYCSILYMVVDGSVMHAIADNRTHAIISKNILD
NTTINDECRCCYFEPQIRILDRDEMLNGSSCDMNRHCIMMNLPDVGEFGSSMLGKYEPDMIKIALSVAGNLIRNRDYIPG
RRGYSYYVYGIASR
>Q01219 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCERVCEKVKNVHIGGQSKNNTVIADLPYMDNAVSDVCNSLYKKNVSRIS
RFANLIKIDDDDKTPTGVYNYFKPKDVIPVIISIGKDKDVCELLISSDISCACVELNSYHVAILPMDVSFFTKGNASLII
LLFDFSIDAAPLLRSVTDNNVIISRHQRLHDELPSSNWFKFYISIKSDYCSILYMVVDGSVMHAIADNRTHAIISKNILD
NTTINDECRCCYFEPQIRILDRDEMLNGSSCDMNRHCIMMNLPDVGKFGSSMLGKYEPDMIKIALSVAGNLIRNRDYIPG
RRGYSYYVYGIASR
>P0DSU3 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCERVCEKVKNVHIDGQSKNNTVIADLPYLDNAVLDVCKSVYKKNVSRIS
RFANLIKIDDDDKTPTGVYNYFKPKDAISVIISIGKDKDVCELLIASDKACACIELNSYKVAILPMNVSFFTKGNASLII
LLFDFSINAAPLLRSVTDNNVVISRHKRLHGEIPSSNWFKFYISIKSNYCSILYMVVDGSVMYAIADNKTHTIISKNILD
NTTINDECRCCYFEPQIKILDRDEMLNGSSCDMNRHCIMMNLPDIGEFGSSILGKYEPDMIKIALSVAGNLIRNQDYIPG
RRGYSYYVYGIASR
>P0DSU4 ~~~~~~Protein OPG181~~~
MDGVIVYCLNALVKHGEEINHIKNDFMIKPCCERVCEKVKNVHIDGQSKNNTVIADLPYLDNAVLDVCKSVYKKNVSRIS
RFANLIKIDDDDKTPTGVYNYFKPKDAISVIISIGKDKDVCELLIASDKACACIELNSYKVAILPMNVSFFTKGNASLII
LLFDFSINAAPLLRSVTDNNVVISRHKRLHGEIPSSNWFKFYISIKSNYCSILYMVVDGSVMYAIADNKTHTIISKNILD
NTTINDECRCCYFEPQIKILDRDEMLNGSSCDMNRHCIMMNLPDIGEFGSSILGKYEPDMIKIALSVAGNLIRNQDYIPG
RRGYSYYVYGIASR
>A0A7H0DNE9 2.7.11.1~~~~~~B1 kinase~~~
MKFQGLVLIDNCKNQWVVGPLIGKGGFGSIYTTNDNNYVVKIEPKANGSLFTEQAFYTRVLKPSVIEEWKKSHNIKHVGL
ITCKAFGLYKSINVEYRFLVINRLGADLDAVIRANNNRLPERSVMLIGIEILNTIQFMHEQGYSHGDIKASNIVLDQIDK
NKLYLVDYGLVSKFMSNGEHVPFIRNPNKMDNGTLEFTPIDSHKGYVVSRRGDLETLGYCMIRWLGGILPWTKISETKNS
ALVSAAKQKYVNNTATLLMTSLQYAPRELLQYITMVNSLTYFEEPNYDEFRRVLMNGVMKNFC
>O57252 2.7.11.1~~~~~~B1 kinase~~~
MNFQGLVLTDNCKNQWVVGPLIGKGGFGSIYTTNDNNYVVKIEPKANGSLFTEQAFYTRVLKPSVIEEWKKSHNIKHVGL
ITCKAFGLYKSINVEYRFLVINRLGADLDAVIRANNNRLPKRSVMLIGIEILNTIQFMHEQGYSHGDIKASNIVLDQIDK
NKLYLVDYGLVSKFMSNGEHVPFIRNPNKMDNGTLEFTPIDSHKGYVVSRRGDLETLGYCMIRWLGGILPWTKISETKNC
ALVSATKQKYVNNTATLLMTSLQYAPRELLQYITMVNSLTYFEEPNYDKFRHILMQGVYY
>P20505 2.7.11.1~~~~~~B1 kinase~~~
MNFQGLVLTDNCKNQWVVGPLIGKGGFGSIYTTNDNNYVVKIEPKANGSLFTEQAFYTRVLKPSVIEEWKKSHNIKHVGL
ITCKAFGLYKSINVEYRFLVINRLGVDLDAVIRANNNRLPKRSVMLIGIEILNTIQFMHEQGYSHGDIKASNIVLDQIDK
NKLYLVDYGLVSKFMSNGEHVPFIRNPNKMDNGTLEFTPIDSHKGYVVSRRGDLETLGYCMIRWLGGILPWTKISETKNC
ALVSATKQKYVNNTATLLMTSLQYAPRELLQYITMVNSLTYFEEPNYDEFRHILMQGVYY
>P16913 2.7.11.1~~~~~~B1 kinase~~~
MNFQGLVLTDNCKNQWVVGPLIGKGGFGSIYTTNDNNYVVKIEPKANGSLFTEQAFYTRVLKPSVIEEWKKSHNIKHVGL
ITCKAFGLYKSINVEYRFLVINRLGADLDAVIRANNNRLPKRSVMLIGIEILNTIQFMHEQGYSHGDIKASNIVLDQIDK
NKLYLVDYGLVSKFMSNGEHVPFIRNPNKMDNGTLEFTPIDSHKGYVVSRRGDLETLGYCMIRWLGGILPWTKISETKNC
ALVSATKQKYVNNTATLLMTSLQYAPRELLQYITMVNSLTYFEEPNYDEFRHILMQGVYY
>P33800 2.7.11.1~~~~~~B1 kinase~~~
MNFQGLVLTDNCKNQWVVGPLIGKGGFGSIYTTNDNNYVVKIEPKANGSLFTEQAFYTRVLKPSVIEEWKKSHHISHVGV
ITCKAFGLYKSINTEYRFLVINRLGVDLDAVIRANNNRLPKRSVMLVGIEILNTIQFMHEQGYSHGNIKASNIVLDQMDK
NKLYLVDYGLVSKFMSNGEHVPFIRNPNKMDNGTLEFTPIDSHKGYVVSRRGDLETLGYCMIRWLGGILPWTKIAETKNC
ALVSATKQKYVNNTTTLLMTSLQYAPRELLQYITMVNSLTYFEEPNYDKFRHILMQGAYY
>P0DTN2 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNDKQKVTFTCDSGYHSLDPNAVCETDKWKYENPCKKMCTVSD
YVSELYDKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYMTINCDVG
YEVIGVSYISCTANSWNVIPSCQQKCDIPSLSNGLISGSTFSIGGVIHLSCKSGFTLTGSPSSTCIDGKWNPILPTCVRS
NEEFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIMALTIMGVIFLISIIVLVCSCDKNNDQYKFHKLLP
>P24084 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNDKQKVTFTCDQGYHSLDPNAVCETDKWKYENPCKKMCTVSD
YVSELYDKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYMTINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDMPSLSNGLISGSTFSIGGVIHLSCKSGFTLTGSPSSTCIDGKWNPILPTCVRS
NEKFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>O57254 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNNNQKVTFTCDQGYHSSDPNAVCETDKWKYENPCKKMCTVSD
YISELYNKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYITINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDIPSLSNGLISGSTFSIGGVIHLSCKSGFILTGSPSSTCIDGKWNPILPTCVRS
NEKFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>P21115 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNNNQKVTFTCDQGYHSSDPNAVCETDKWKYENPCKKMCTVSD
YISELYNKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYMTINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDIPSLSNGLISGSTFSIGGVIHLSCKSGFILTGSPSSTCIDGKWNPVLPICVRT
NEEFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>P24083 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNDKQKVTFTCDQGYHSSDPNAVCETDKWKYENPCKKMCTVSD
YVSELYDKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYITINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDMPSLSNGLISGSTFSIGGVIHLSCKSGFTLTGSPSSTCIDGKWNPILPTCVRS
NEKFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>Q9JF44 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNDKQKVTFTCDQGYHSSDPNAVCETDKWKYENPCKKMCTVSD
YISELYNKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYMTINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDMPSLSNGLISGSTFSIGGVIHLSCKSGFILTGSPSSTCIDGKWNPVLPICVRT
NEEFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIMVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>Q01227 ~~~~~~Protein OPG190~~~
MKTISVVTLLCVLPAVVYSTCTVPTMNNAKLTSTETSFNDKQKVTFTCDQGYHSSDPNAVCETDKWKYENPCKKMCTVSD
YISELYNKPLYEVNSTMTLSCNGETKYFRCEEKNGNTSWNDTVTCPNAECQPLQLEHGSCQPVKEKYSFGEYMTINCDVG
YEVIGASYISCTANSWNVIPSCQQKCDMPSLSNGLISGSTFSIGGVIHLSCKSGFTLTGSPSSTCIDGKWNPVLPICVRT
NEEFDPVDDGPDDETDLSKLSKDVVQYEQEIESLEATYHIIIVALTIMGVIFLISVIVLVCSCDKNNDQYKFHKLLP
>A0A7H0DNF3 ~~~~~~Protein OPG191~~~
MSSSVDVDIYDAVRAFLLRHYYDKRFIVYGRSNTILHNIYRLFTRCTVIPFDDIVRTMPNESRVKQWVMDTLNGIMMNEF
DTVCVGTGLRFMEMFFDYNKNNPKNSINNQIMYDIINSVAIILANERYRSAFNDDRIYIRRTMMDKLYEYASLTTIGTIT
GGVCYFIC
>P68443 ~~~~~~Protein OPG191~~~
MSSSVDVDIYDAVRAFLLRHYYNKRFIVYGRSNAILHNIYRLFTRCAVIPFDDIVRTMPNESRVKQWVMDTLNGIMMNER
DVSVSVGTGILFMEMFFDYNKNSINNQLMYDIINSVSIILANERYRSAFNDDGIYIRRNMINKLYGYASLTTIGTIAGGV
CYYLLMHLVSLYK
>P68442 ~~~~~~Protein OPG191~~~
MSSSVDVDIYDAVRAFLLRHYYNKRFIVYGRSNAILHNIYRLFTRCAVIPFDDIVRTMPNESRVKQWVMDTLNGIMMNER
DVSVSVGTGILFMEMFFDYNKNSINNQLMYDIINSVSIILANERYRSAFNDDGIYIRRNMINKLYGYASLTTIGTIAGGV
CYYLLMHLVSLYK
>A0A7H0DNF5 ~~~~~~Soluble interferon gamma receptor OPG193~~~
MRYIIILAVLFINSIHAKITSYKFESVNFDSKIEWTGDGLYNISLKNYGIKTWQTMYTNVPEGTYDISGFPKNDFVSFWV
KFEQGDYKVEEYCTGLCVEVKIGPPTVILTEYDDHINLFIEHPYATRGSKKIPIYKRGDMCDIYLLYTANFTFGDSEEPV
TYDIDDYDCTSTGCSIDFATTEKVCVTAQGATEGFLEKITPWSSEVCLTPKKNVYTCAIRSKEDVPNFKDKIARVITRKF
NKQSQSYLTKFLGSTSNDVTTFLSILD
>P21004 ~~~~~~Soluble interferon gamma receptor OPG193~~~
MRYIIILAVLFINSIHAKITSYKFESVNFDSKIEWTGDGLYNISLKNYGIKTWQTMYTNVPEGTYDISAFPKNDFVSFWV
KFEQGDYKVEEYCTGLCVEVKIGPPTVTLTEYDDHINLYIEHPYATRGSKKIPIYKRGDMCDIYLLYTANFTFGDSEEPV
TYDIDDYDCTSTGCSIDFATTEKVCVTAQGATEGFLEKITPWSSEVCLTPKKNVYTCAIRSKEDVPNFKDKMARVIKRKF
NKQSQSYLTKFLGSTSNDVTTFLSMLNLTKYS
>P24770 ~~~~~~Soluble interferon gamma receptor OPG193~~~
MRYIIILAVLFINSIHAKITSYKFESVNFDSKIEWTGDGLYNISLKNYGIKTWQTMYTNVPEGTYDISAFPKNDFVSFWV
KFEQGDYKVEEYCTGLCVEVKIGPPTVTLTEYDDHINLYIEHPYATRGSKKIPIYKRGDMCDIYLLYTANFTFGDSKEPV
PYDIDDYDCTSTGCSIDFVTTEKVCVTAQGATEGFLEKITPWSSKVCLTPKKSVYTCAIRSKEDVPNFKDKMARVIKRKF
NKQSQSYLTKFLGSTSNDVTTFLSMLNLTKYS
>A0A7H0DNF6 ~~~~~~Protein OPG195~~~
MRSLIIVLLFPSIIYSMSIRRCEKTEEETWGLKIGLCIIAKDFYPERTDCSVHRPTASGGLITEGNGFRVVIYDQCTEPH
DFIITDTQQTRLGSSHTYIKFSNMNTGVPSSIPKCSRTLCISVYCDQEAGDIKFEEYTQESSDISIRVKYDSSCIDYLGI
NQSFMNECIRRITTWDRESCVRIDTQTINKYLKSCTNTKFDRNVYKRYILKSKALHAKTEL
>P21005 ~~~~~~Protein OPG195~~~
MRSLIIVLLFPSIIYSMSIRRCEKTEEETWGLKIGLCIIAKDFYPERTDCSVHLPTASEGLITEGNGFRDIRNTDKL
>P24771 ~~~~~~Protein OPG195~~~
MRSLIIVLLFPSIIYSMSIRQCEKTEEETWGLKIGLCIIAKDFYPERTDCSVHLPTASEGLITEGNGFRDIRNTDKL
>A0A7H0DNF7 ~~~~~~Protein OPG197~~~
MYSGIYPIVLLLTTNMDSDTDTDTDTDTDTDTDTDVEDIMNEIDREKEEILKNVEIENNKNINKNHPSEYIREALVINTS
SNSDSIDKEVIEYISHDVGI
>P21007 ~~~~~~Protein OPG197~~~
MDTDTDTDTDTDTDTDTDTDVTNVEDIINEIDREKEEILKNVEIENNKNINKNHPSGYIREALVINTSSNSDSIDKEVIE
CISHDVGI
>Q01229 ~~~~~~Protein OPG197~~~
MDTDVTNVEDIINEIDREKEEILKNVEIENNKNINKNHPSGYIREALVINTSSNSDSIDKEVIECICHDVGI
>P24772 ~~~~~~Protein OPG200~~~
MTANFSTHVFSPQHCGCDRLTSIDDVRQCLTEYIYWSSYAYRNRQCAGQLYSTLLSFRDDAELVFIDIRELVKNMPWDDV
KDCAEIIRCYIPDEQKTIREISAIIGLCAYAATYWGGEDHPTSNSLNALFVMLEMLNYVDYNIIFRRMN
>A0A7H0DNG2 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MKMKMMVRIYFVSLSLLLFHSYAIDIENEITEFFNKMRDTLPAKDSKWLNPVCMFGGTMNDMAALGEPFSAKCPPIEDSL
LSHRYKDYVVKWERLEKNRRRQVSNKRVKHGDLWIANYTSKFSNRRYLCTVTTKNGDCVQGVVRSHVWKPSSCIPKTYEL
GTYDKYGIDLYCGILYANHYNNITWYKDNKEINIDDFKYSQAGKELIIHNPELEDSGRYDCYVHYDDVRIKNDIVVSRCK
ILTVIPSQDHRFKLILDPKINVTIGEPANITCSAVSTSLFVDDVLIEWENPSGWIIGLDFGVYSILTSRGGITEATLYFE
NVTEEYIGNTYTCRGHNYYFDKTLTTTVVLE
>P21077 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MTMKMMVHIYFVSLSLLLLLFHSYAIDIENEITEFFNKMRDTLPAKDSKWLNPACMFGGTMNDMATLGEPFSAKCPPIED
SLLSHRYKDYVVKWERLEKNRRRQVSNKRVKHGDLWIANYTSKFSNRRYLCTVTTKNGDCVQGIVRSHIKKPPSCIPKTY
ELGTHDKYGIDLYCGILYAKHYNNITWYKDNKEINIDDIKYSQTGKELIIHNPELEDSGRYDCYVHYDDVRIKNDIVVSR
CKILTVIPSQDHRFKLILDPKINVTIGEPANITCTAVSTSLLIDDVLIEWENPSGWLIGFDFDVYSVLTSRGGITEATLY
FENVTEEYIGNTYKCRGHNYYFEKTLTTTVVLE
>P23998 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MTMKMMVHIYFVSLLLLLFHSYAIDIENEITEFFNKMRDTLPAKDSKWLNPACMFGGTMNDIAALGEPFSAKCPPIEDSL
LSHRYKDYVVKWERLEKNRRRQVSNKRVKHGDLWIANYTSKFSNRRYLCTVTTKNGDCVQGIVRSHIKKPPSCIPKTYEL
GTHDKYGIDLYCGILYAKHYNNITWYKDNKEINIDDIKYSQTGKKLIIHNPELEDSGRYNCYVHYDDVRIKNDIVVSRCK
ILTVIPSQDHRFKLILDPKINVTIGEPANITCTAVSTSLLIDDVLIEWENPSGWLIGFDFDVYSVLTSRGGITEATLYFE
NVTEEYIGNTYKCRGHNYYFEKTLTTTVVLE
>P25213 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MTMKMMVHIYFVSLLLLLFHSYAIDIENEITEFFNKMRDTLPAKDSKWLNPACMFGGTMNDIAALGEPFSAKCPPIEDSL
LSHRYKDYVVKWERLEKNRRRQVSNKRVKHGDLWIANYTSKFSNRRYLCTVTTKNGDCVQGIVRSHIRKPPSCIPKTYEL
GTHDKYGIDLYCGILYAKHYNNITWYKDNKEINIDDIKYSQTGKELIIHNPELEDSGRYDCYVHYDDVRIKNDIVVSRCK
ILTVIPSQDHRFKLILDPKINVTIGEPANITCTAVSTSLLIDDVLIEWENPSGWLIGFDFDVYSVLTSRGGITEATLYFE
NVTEEYIGNTYKCRGHNYYFEKTLTTTVVLE
>P0DST1 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MMKMTMKMMVHIYFVSLLLLLFHSYAIDIENEITDFFNKMKDTLPAKDSKWLNPTCIFGGTMNNMAAIGEPFSAKCPPIE
DSLLSRRYINKDNVVNWEKIGKTRRPLNRRVKNGDLWIANYTSNDSHRMYLCTVITKNGDCIQGIVRSHVRKPSSCIPEI
YELGTHDKYGIDLYCGIIYAKHYNNITWYKDNKEINIDDIKYSQTGKELIIHNPALEDSGRYDCYVHYDDVRIKNDIVVS
RCKILTVIPSQDHRFKLILDSKINVIIGEPANITCTAVSTSLLFDDVLIEWENPSGWLIGFDFDVYSVLTSRGGITEATL
YFKNVTEEYIGNTYKCRGHNYYFEKTLTTTVVLE
>P0DST2 ~~~~~~Soluble interferon alpha/beta receptor OPG204~~~
MMKMTMKMMVHIYFVSLLLLLFHSYAIDIENEITDFFNKMKDTLPAKDSKWLNPTCIFGGTMNNMAAIGEPFSAKCPPIE
DSLLSRRYINKDNVVNWEKIGKTRRPLNRRVKNGDLWIANYTSNDSHRMYLCTVITKNGDCIQGIVRSHVRKPSSCIPEI
YELGTHDKYGIDLYCGIIYAKHYNNITWYKDNKEINIDDIKYSQTGKELIIHNPALEDSGRYDCYVHYDDVRIKNDIVVS
RCKILTVIPSQDHRFKLILDSKINVIIGEPANITCTAVSTSLLFDDVLIEWENPSGWLIGFDFDVYSVLTSRGGITEATL
YFKNVTEEYIGNTYKCRGHNYYFEKTLTTTVVLE
>Q06253 ~~~phd~~~Antitoxin phd~~~
MQSINFRTARGNLSEVLNNVEAGEEVEITRRGREPAVIVSKATFEAYKKAALDAEFASLFDTLDSTNKELVNR
>P0C798 ~~~P/X~~~Phosphoprotein~~~
MATRPSSLVDSLEDEEDPQTLRRERSGSPRPRKIPRNALTQPVDQLLKDLRKNPSMISDPDQRTGREQLSNDELIKKLVT
ELAENSMIEAEEVRGTLGDISARIEAGFESLSALQVETIQTAQRCDHSDSIRILGENIKILDRSMKTMMETMKLMMEKVD
LLYASTAVGTSAPMLPSHPAPPRIYPQLPSAPTADEWDIIP
>P0C799 ~~~P/X~~~Phosphoprotein~~~
MATRPSSLVDSLEDEEDPQTLRRERPGSPRPRKVPRNALTQPVDQLLKDLRKNPSMISDPDQRTGREQLSNDELIKKLVT
ELAENSMIEAEEVRGTLGDISARIEAGFESLSALQVETIQTAQRCDHSDSIRILGENIKILDRSMKTMMETMKLMMEKVD
LLYASTAVGTSAPMLPSHPAPPRIYPQLPSAPTTDEWDIIP
>P33454 ~~~P~~~Phosphoprotein~~~
MEKFAPEFHGEDANTKATKFLESLKGKFTSSKDSRKKDSIISVNSIDIELPKESPITSTNHNINQPSEINDTIAANQVHI
RKPLVSFKEELPSSENPFTKLYKETIETFDNNEEESSYSYDEINDQTNDNITARLDRIDEKLSEIIGMLHTLVVASAGPT
AARDGIRDAMVGLREEMIEKIRSEALMTNDRLEAMARLRDEESEKMTKDTSDEVKLTPTSEKLNMVLEDESSDNDLSLED
F
>O56774 ~~~P~~~Phosphoprotein~~~
MSKIFINPSDIRSGLADLEMAEETVELVNRNMEDSQAHLQGVPIDVETLPEDIQRLHITDPQASLRQDMVDEQKHQEDED
FYLTGRENPLSPFQTHLDAIGLRIVRKMKTGEGFFKIWSQAVEDIVSYVALNFSIPVNKLFEDKSTQTVTEKSQQASASS
APNRHEKSSQNARVNSKDASGPAALDWTASNEADDESVEAEIAHQIAESFSKKYKFPSRSSGIFLWNFEQLKMNLDEIVR
EVKEIPGVIKMAKDGMKLPLRCMLGGVASTHSRRFQILVNPEKLGKVMQEDLDKYLTY
>A4UHP9 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSALRSGLADLEMAEETVDLVNKNMEDSQAHLQGIPIDVETLPEDIKRLRIADYKQGQQEEDASRQEEGEDED
FYMTESENSYVPLQSYLDAVGMQIVRKMKTGDGFFKIWAQAVEDIVSYVATNFPAPVNKLQADKSTRTTLEKVKQAASSS
APSKREGPSSNMNLDSQESSGPPGLDWAASNDEDDGSIEAEIAHQIAESFSKKYKFPSRSSGIFLWNFEQLKMNLDDIVR
EVKGIPGVTRMARDGMKLPLRCMLGSVASNHSKRFQILVNSAKLGKLMQDDLNRYLAY
>A4UHQ4 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLVNKNIEDNQAHLQGEPIEVDALPEDMSKLQISERRPAQFTDNTGGKEEGSDED
FYMAESEDPYIPLQSYLEGVGIQLVRQMKTGERFFKIWSQAVEEIISYVTVHFPMPLGKSTEDKSTQTPEEKFKPSPQQA
VTKKESQSSKIKTISQESSGPPALEWSTTNDEENASVEAEIAHQIAESFSKKYKFPSRSSGIFLFNFEQLKMNLDDIVKE
AKKIPGVVRLAQDGFRLPLRCILGGVGSVNSKKFQLLVNSDKLGKIMQDDLNRYLAY
>O55778 ~~~P/V/C~~~Phosphoprotein~~~
MDKLDLVNDGLDIIDFIQKNQKEIQKTYGRSSIQQPSTKDRTRAWEDFLQSTSGEHEQAEGGMPKNDGGTEGRNVEDLSS
VTSSDGTIGQRVSNTRAWAEDPDDIQLDPMVTDVVYHDHGGECTGHGPSSSPERGWSYHMSGTHDGNVRAVPDTKVLPNA
PKTTVPEEVREIDLIGLEDKFASAGLNPAAVPFVPKNQSTPTEEPPVIPEYYYGSGRRGDLSKSPPRGNVNLDSIKIYTS
DDEDENQLEYEDEFAKSSSEVVIDTTPEDNDSINQEEVVGDPSDQGLEHPFPLGKFPEKEETPDVRRKDSLMQDSCKRGG
VPKRLPMLSEEFECSGSDDPIIQELEREGSHPGGSLRLREPPQSSGNSRNQPDRQLKTGDAASPGGVQRPGTPMPKSRIM
PIKKGTDAKSQYVGTEDVPGSKSGATRYVRGLPPNQESKSVTAENVQLSAPSAVTRNEGHDQEVTSNEDSLDDKYIMPSD
DFANTFLPHDTDRLNYHADHLNDYDLETLCEESVLMGIVNAIKLINIDMRLNHIEEQMKEIPKIINKIDSIDRVLAKTNT
ALSTIEGHLVSMMIMIPGKGKGERKGKTNPELKPVIGRNILEQQELFSFDNLKNFRDGSLTDEPYGGVARIRDDLILPEL
NFSETNASQFVPLADDASKDVVRTMIRTHIKDRELRSELMDYLNRAETDEEVQEVANTVNDIIDGNI
>Q8B9Q8 ~~~P~~~Phosphoprotein~~~
MSFPEGKDILFMGNEAAKLAEAFQKSLRKPSHKRSQSIIGEKVNTVSETLELPTISRPTKPTILSEPKLAWTDKGGAIKT
EAKQTIKVMDPIEEEEFTEKRVLPSSDGKTPAEKKLKPSTNTKKKVSFTPNEPGKYTKLEKDALDLLSDNEEEDAESSIL
TFEERDTSSLSIEARLESIEEKLSMILGLLRTLNIATAGPTAARDGIRDAMIGIREELIADIIKEAKGKAAEMMEEEMNQ
RTKIGNGSVKLTEKAKELNKIVEDESTSGESEEEEELKDTQENNQEDDIYQLIM
>P12579 ~~~P~~~Phosphoprotein~~~
MEKFAPEFHGEDANNRATKFLESIKGKFTSPKDPKKKDSIISVNSIDIEVTKESPITSNSTIINPTNETDDNAGNKPNYQ
RKPLVSFKEDPIPSDNPFSKLYKETIETFDNNEEESSYSYEEINDQTNDNITARLDRIDEKLSEILGMLHTLVVASAGPT
SARDGIRDAMVGLREEMIEKIRTEALMTNDRLEAMARLRNEESEKMAKDTSDEVSLNPTSEKLNNLLEGNDSDNDLSLED
F
>P03421 ~~~P~~~Phosphoprotein~~~
MEKFAPEFHGEDANNRATKFLESIKGKFTSPKDPKKKDSIISVNSIDIEVTKESPITSNSTIINPTNETDDTAGNKPNYQ
RKPLVSFKEDPTPSDNPFSKLYKETIETFDNNEEESSYSYEEINDQTNDNITARLDRIDEKLSEILGMLHTLVVASAGPT
SARDGIRDAMIGLREEMIEKIRTEALMTNDRLEAMARLRNEESEKMAKDTSDEVSLNPTSEKLNNLLEGNDSDNDLSLED
F
>Q5VKP5 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETIDLINRTIEDNQAHLQGVPIEVEALPEDMKKLQISDHQQGQPSGGATGQDGSEEED
FYMTESENPYIPFQSYLDAVGIQLVRKMKTGEGFLKIWSQAAEEIVSYVAINFPLPADKESAEKSTQTVGEPLKSNSASN
TPNKRSKPSTSTDLKAQEASGPHGIDWAASNDEDDASVEAEIAHQIAESFSKKYKFPSRSSGIFLWNFEQLKMNLDDIVG
GAKEIPGVIRMAKEGNKLPLRCILGGVALTHSKRFQVLVNSEKLGRIMQEDLNKYLAN
>O56773 ~~~P~~~Phosphoprotein~~~
MSKGLIHPSAIRSGLVDLEMAEETVDLVHKNLADSQAHLQGEPLNVDSLPEDMRKMRLTNAPSEREIIEEDEEEYSSEDE
YYLSQGQDPMVPFQNFLDELGTQIVRRMKSGDGFFKIWSAASEDIKGYVLSTFMKPETQATVSKPTQTDSLSVPRPSQGY
TSVPRDKPSNSESQGGGVKPKKVQKSEWTRDTDEISDIEGEVAHQVAESFSKKYKFPSRSSGIFLWNFEQLKMNLDDIVK
TSMNVPGVDKIAEKGGKLPLRCILGFVSLDSSKRFRLLADTDKVARLMQDDIHNYMTRIEEIDHN
>P35974 ~~~P/V~~~Phosphoprotein~~~
MAEEQARHVKNGLECIRALKAEPIGSLAIEEAMAAWSEISDNPGQERATCREEKAGSSGLSKPCLSAIGSTEGGAPRIRG
QGPGESDDDAETLGIPPRNLQASSTGLQCYYVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSGGDNESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLRLQSRGNNFPKLGKTLNVPPPPDPGRASTSGTPIKKGTERRLASFG
TEIASLLTGGATQCARKSPSEPSGPGAPAGNVPEYVSNAALIQEWTPESGTTISPRSQNNEEGGDYYDDELFSDVQDIKT
ALAKIHEDNQKIISKLESLLLLKGEVESIKKQINRQNISISTLEGHLSSIMIAIPGLGKDPNDPTADVEINPDLKPIIGR
DSGRALAEVLKKPVASRQLQGMTNGRTSSRGQLLKEFQPKPIGKKMSSAVGFVPDTGPASRSVIRSIIKSSRLEEDRKRY
LMTLLDDIKGANDLAKFHQMLMKIIMK
>P03422 ~~~P/V~~~Phosphoprotein~~~
MAEEQARHVKNGLECIRALKAEPIGSLAIEEAMAAWSEISDNPGQERATCREEKAGSSGLSKPCLSAIGSTEGGAPRIRG
QGPGESDDDAETLGIPPRNLQASSTGLQCYYVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSGGDNESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLRLQSRGNNFPKLGKTLNVPPPPDPGRASTSGTPIKKGTERRLASFG
TEIASLLTGGATQCARKSPSEPSGPGAPAGNVPECVSNAALIQEWTPESGTTISPRSQNNEEGGDYYDDELFSDVQDIKT
ALAKIHEDNQKIISKLESLLLLKGEVESIKKQINRQNISISTLEGHLSSIMIAIPGLGKDPNDPTADVEINPDLKPIIGR
DSGRALAEVLKKPVASRQLQGMTNGRTSSRGQLLKEFQLKPIGKKMSSAVGFVPDTGPASRSVIRSIIKSSRLEEDRKRY
LMTLLDDIKGANDLAKFHQMLMKIIMK
>Q00793 ~~~P/V~~~Phosphoprotein~~~
MAEEQARHVKNGLECIRALKAEPIGSLAIGEAMAAWSEISDNPGQERATYKEEKAGGSGLSKPCLSAIGSTEGGAPRIRG
QGSGESDDDTETLGIPSRNLQASSTGLQCHYVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSEGDNESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLRLQSRGNNFPKLGKTLNVPPPPDPGRASTSETPIKKGTDARLASFG
TEIASLLTGGATQCARKSPSEPSGPGAPAGNVPECVSNAALIQEWTPESGTTISPRSQNNKEGGDHYDDELFSDIQDIKT
ALAKIHEDNQKIISKLESLLLLKGEVESIKKQINKQNISISTLEGHLSSIMIAIPGLGKDPNDPTADVEINPDLKPIIGR
DSGRALAEVLKKPVASRQLQGMTNGRTSSRGQLLKEFQLKPIGKKMSSAVGFVPDTGPVSRSVIRSIIKSSRIEEDRKRY
LMTLLDDIKGANDLSKFHQMLMKIIMK
>P0C569 ~~~P~~~Phosphoprotein~~~
MSKDLVHPSLIRAGIVELEMAEETTDLINRTIESNQAHLQGEPLYVDSLPEDMSRLRIEDKSRRTKTEEEERDEGSSEED
NYLSEGQDPLIPFQNFLDEIGARAVKRLKTGEGFFRVWSALSDDIKGYVSTNIMTSGERDTKSIQIQTEPTASVSSGNES
RHDSESMHDPNDKKDHTPDHDVVPDIESSTDKGEIRDIEGEVAHQVAESFSKKYKFPSRSSGIFLWNFEQLKMNLDDIVK
AAMNVPGVERIAEKGGKLPLRCILGFVALDSSKRFRLLADNDKVARLIQEDINSYMARLEEAE
>P24698 ~~~P~~~Phosphoprotein~~~
MATFTDAEIDELFETSGTVIDNIITAQGKSAETVGRSAIPHGKTKALSAAWEKHGSIQPPASQDTPDRQDRSDKQPSTPE
QATPHDSPPATSADQPPTQATDEAVDTQLRTGASNSLLLMLDKLSNKSSNAKKGLWSSPQEGNHQRPTQQQGSQPSRGNS
QERPQNQVKAAPGNQGTDANTAYHGQWEESQLSAGATPHALRSRQSQDNTLVSADHVQPPVDFVQAMMSMMEAISQRVSK
VDYQLDLVLKQTSSIPMMRSEIQQLKTSVAVMEANLGMMKILDPGCANVSSLSDLRAVARSHPVLVSGPGDPSPYVTQGG
EMALNKLSQPVPHPSELIKSATACGPDIGVEKDTVRALIMSRPMHPSSSAKLLSKLDAAGSIEEIRKIKRLALNG
>Q9IK91 ~~~P/V/C~~~Phosphoprotein~~~
MDKLELVNDGLNIIDFIQKNQKEIQKTYGRSSIQQPSIKDQTKAWEDFLQCTSGESEQVEGGMSKDDGDVERRNLEDLSS
TSPTDGTIGKRVSNTRDWAEGSDDIQLDPVVTDVVYHDHGGECTGYGFTSSPERGWSDYTSGANNGNVCLVSDAKMLSYA
PEIAVSKEDRETDLVHLENKLSTTGLNPTAVPFTLRNLSDPAKDSPVIAEHYYGLGVKEQNVGPQTSRNVNLDSIKLYTS
DDEEADQLEFEDEFAGSSSEVIVGISPEDEEPSSVGGKPNESIGRTIEGQSIRDNLQAKDNKSTDVPGAGPKDSAVKEEP
PQKRLPMLAEEFECSGSEDPIIRELLKENSLINCQQGKDAQPPYHWSIERSISPDKTEIVNGAVQTADRQRPGTPMPKSR
GIPIKKGTDAKYPSAGTENVPGSKSGATRHVRGSPPYQEGKSVNAENVQLNASTAVKETDKSEVNPVDDNDSLDDKYIMP
SDDFSNTFFPHDTDRLNYHADHLGDYDLETLCEESVLMGVINSIKLINLDMRLNHIEEQVKEIPKIINKLESIDRVLAKT
NTALSTIEGHLVSMMIMIPGKGKGERKGKNNPELKPVIGRDILEQQSLFSFDNVKNFRDGSLTNEPYGAAVQLREDLILP
ELNFEETNASQFVPMADDSSRDVIKTLIRTHIKDRELRSELIGYLNKAENDEEIQEIANTVNDIIDGNI
>Q83956 ~~~P~~~Phosphoprotein~~~
MEKFAPEFHGEDANTKATKFLESLKGKFTSSKDSKKKDSIISVNSIDIELPKESPITSANHNISQSGENSDTPATNQVHT
RKPLVSFREELPTSENPFTKLYKETIETFDNNEEESSYSYDEINDQTNDNITARLDRIDEKLSEIIGMLHTLVVASAGPT
AARDGIRDAMVGLREEMIEKIRSEALMTNDRLEAMARLRNEESEKMAKDTSDDVNLNSTSEKLNTILEEDNSDNDLSLED
F
>P28054 ~~~P/C~~~Phosphoprotein~~~
MDQDAFFFERDPEAEGEAPRKQESLSGVIGLLDVVLSYKPTEIGEDRSWLHGIIDNPEENKPSCKADDNNKDRAISTPTQ
DHRSGEESGISRRTSESKTETHARLLDQQSIHRASRRGTSPNPLPENMGNERNTRIDEDSPNERRHQRSVLTDEDRKMAE
DSNKREEDQVEGFPEEIRRSTPLSDDGESRTNNNGRSMETSSTHSTRITDVIINPSPELEDAVLQRNKRRPTIIRRNQTR
SERTQNSELHKSTSENSSNLEDHNTKTSPKGLPPKNEESAATPKNNHNHRKTKYTMNNANNNTKSPPTPEHDTTANEEET
SNTSVDEMAKLLVSLGVMKSQHEFELSRSASHVFAKRMLKSANYKEMTFNLCGMLISVEKSLENKVEENRTLLKQIQEEI
NSSRDLHKRFSEYQKEQNSLMMANLSTLHIITDRGGKTGDPSDTTRSPSVFTKGKDNKVKKTRFDPSMEALGGQEFKPDL
IREDELRDDIKNPVLEENNNDLQASNASRLIPSTEKHTLHSLKLVIENSPLSRVEKKAYIKSLYKCRTNQEVKNVMELFE
EDIDSLTN
>P06162 ~~~P/V/D~~~Phosphoprotein~~~
MESDAKNYQIMDSWEEEPRDKSTNISSALNIIEFILSTDPQEDLSENDTINTRTQQLSATICQPEIKPTETSEKVSGSTD
KNRQSGSSHECTTEAKDRNIDQETVQGGSGRRSSSDSRAETVVSGGISGSITDSKNGTQNTENIDLNEIRKMDKDSIERK
MRQSADVPSEISGSDVIFTTEQSRNSDHGRSLEPISTPDTRSMSVVTAATPDDEEEILMKNSRMKKSSSTHQEDDKRIKK
GGGGKGKDWFKKSRDTDNQTSTSDHKPTSKGQKKISKTTTTNTDTKGQTETQTESSETQSPSWNPIIDNNTDRTEQTSTT
PPTTTPRSTRTKESIRTNSESKPKTQKTIGKERKDTEESNRFTERAITLLQNLGVIQSTSKLDLYQDKRVVCVANVLNNV
DTASKIDFLAGLVIGVSMDNDTKLIQIQNEMLNLKADLKRMDESHRRLIENQREQLSLITSLISNLKIMTERGGKKDQNE
SNERVSMIKTKLKEEKIKKTRFDPLMEAQGIDKNIPDLYRHAGNTLENDVQVKSEILSSYNESNATRLIPRKVSSTMRSL
VAVINNSNLPQSTKQSYINELKHCKSDEEVSELMDMFNEDVNNC
>P21738 ~~~P/V~~~Phosphoprotein~~~
MSFEISVEEIDELIETGNLNIDYALKELGATSQPPPNRPLSQISKTEENNDETRTSKNSASAEAPAHASSPLRSHNEESE
PGKQSSDGFSMISNRPQTGMLLMGSDTQSPSPSKTYQGLILDAKKRALNEPRRNQKTTNEHGNTNDTWIFKRGGNIATKK
EAWVTQNQRSKIQSSFQDIEESTRFHGSMEEPQYQSGAIHVAHQSNQLPPSKNVHVEDVPKFANYALEILDAIKALEVRL
DRIEGKVDKIMLTQNTIQQTKNDTQQIKGSLATIEGLITTMKIMDPGVPSKVSLRSLNKGPEQVPIIVTGTGDVSKFVDQ
DNTITLDPLARPILSGTKQITDERRAGVRIDALKITVSEMIRDLFGDCDKSRKLLESINMATTEQDINSIKTNALRSIT
>P11208 ~~~P/V~~~Phosphoprotein~~~
MDPTDLSFSPDEINKLIETGLNTVEYFTSQQVTGTSSLGKNTIPPGVTGLLTNAAEAKIQESTNHQKGSVGGGAKPKKPR
PKIAIVPADDKTVPGKPIPNPLLGLDSTPSTQTVLDLSGKTLPSGSYKGVKLAKFGKENLMTRFIEEPRENPIATSSPID
FKRGAGIPAGSIEGSTQSDGWEMKSRSLSGAIHPVLQSPLQQGDLNALVTSVQSLALNVNEILNTVRNLDSRMNQLETKV
DRILSSQSLIQTIKNDIVGLKAGMATLEGMITTVKIMDPGVPSNVTVEDVRKTLSNHAVVVPESFNDSFLTQSEDVISLD
ELARPTATSVKKIVRKVPPQKDLTGLKITLEQLAKDCISKPKMREEYLLKINQASSEAQLIDLKKAIIRSAI
>P15198 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNDAHLQGEPIEVDNLPEDMKRLHLDDEKSSNLGEMVRVGEGKYRED
FQMDEGEDPNLLFQSYLDNVGVQIVRQMRSGERFLKIWSQTVEEIVSYVTVNFPNPPRRSSEDKSTQTTGRELKKETTSA
FSQRESQPSKARMVAQVAPGPPALEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGIFLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKIPLRCVLGWVALANSKKFQLLVEADKLSKIMQDDLNRYTSC
>P22363 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMKRLHLDDEKSSNLGEMVRVGEGKYRED
FQMDEGEDPNLLFQSYLDNVGVQIVRQMRSGERFLKIWSQTVEEIVSYVTVNFPNPPRRSSEDKSTQTTGRELKKETTSA
FSQRESQPSKARMVAQVAPGPPALEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGIFLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKIPLRCVLGWVALANSKKFQLLVEADKLSKIMQDDLNRYTSC
>P69479 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGEMAKVGEGKYRED
FQMDEGEDPSFLFQSYLENVGVQIVRQMRSGERFLKIWSQTVEEIISYVAVNFPNPPGKSSEDKSTQTTGRELKKETTPT
PSQRESQSSKARMAAQTASGPPALEWSATNEKDDLSVEAEIAHQIAESFSKKYKFPSRSSGILLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKLPLRCVLGWVALANSKKFQLLVESDKLSKIMQDDLNRYTSC
>Q9IPJ8 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDSLPEDMSRLHLDDGKLPDLGRMSKAGEGRHQED
FQMDEGEDPSLLFQSYLDNVGVQIVRQMRSGERFLKIWSQTVEEIISYVTVNFPNPSGRSSEDKSTQTTSQEPKKETTST
PSQRKSQSLKSRTMAQTASGPPSLEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGIFLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKLPLRCVLGWVALANSKKFQLLVEANKLNKIMQDDLNRYASC
>P06747 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNPGEMAKVGEGKYRED
FQMDEGEDPSLLFQSYLDNVGVQIVRQIRSGERFLKIWSQTVEEIISYVAVNFPNPPGKSSEDKSTQTTGRELKKETTPT
PSQRESQSSKARMAAQTASGPPALEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGILLYNFEQLKMNLDDIVKE
AKNVPGVTRLARDGSKLPLRCVLGWVALANSKKFQLLVESNKLSKIMQDDLNRYTSC
>Q0GBY3 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDDLPEDMKRLHLDDEKSSNLGEMVRVGEGKYRED
FQMDEGEDPNLLFQSYLDNVGVQIVRQMRSGERFLKIWSQTVEEIVSYVTVNFPNPPRRSSENKSTQTTGRELKKETTSA
FSQRESQPSKARMVAQVAPGPPALEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGIFLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKIPLRCVLGYVALANSKKFQLLVEADKLSKIMQDDLNRYTSC
>P16286 ~~~P~~~Phosphoprotein~~~
MSKIFVNPSAIRAGLADLEMAEETVDLINRNIEDNQAHLQGEPIEVDNLPEDMGRLHLDDGKSPNHGEIAKVGEGKYRED
FQMDEGEDPSFLFQSYLENVGVQIVRQMRSGERFLKIWSQTVEEIISYVAVNFPNPPGKSSEDKSTQTTGRELKKETTPT
PSQRESQSSKARMAAQIASGPPALEWSATNEEDDLSVEAEIAHQIAESFSKKYKFPSRSSGILLYNFEQLKMNLDDIVKE
AKNVPGVTRLAHDGSKLPLRCVLGWVALANSKKFQLLVESDKLSKIMQDDLNRYTSC
>P35945 ~~~P/V~~~Phosphoprotein~~~
MAEEQAYHVNKGLECIKALRARPLDPLVVEEALAAWVETSEGQTLDRMSSDEAEADHQDISKPCFPAAGPGKSSMSRCHD
QGLGGSNSCDEELGAFIGDSSMHSTEVQHYHVYDHSGEKVEGVEDADSILVQSGADDGVEVWGGDEESENSDVDSGEPDP
EGSAPADWGSSPISPATRASDVETVEGDEIQKLLEDQSRIRKMTKAGKTLVVPPIPSQERPTASEKPIKKGTDVKSTSSG
TMAESSSTGGATRPALKSQWGPSGPNASAENALASASNVSPTQGSKTESGTTTSRISQSNIEPEDDYDDELFSDIQDIKT
ALAKLHDDQQIIITRLESLVSLKGEIDSIKKQISKQNISISTIEGHLSSVMIAIPGFGKDPNDPTADVDINPDLRPIIGR
DSGRALAEVLKKPASERQSKDTGKLGIESKGLLKKEFQLKPIEKKSSSAIRFVPDGSVASRSVIRSIIKSSHLGEDRKDY
LMSLLNDIQGSKDLAQFHQMLVKILKN
>P04859 ~~~P/V/C~~~Phosphoprotein~~~
MDQDAFILKEDSEVEREAPGGRESLSDVIGFLDAVLSSEPTDIGGDRSWLHNTINTPQGPGSAHRAKSEGEGEVSTPSTQ
DNRSGEESRVSGRTSKPEAEAHAGNLDKQNIHRAFGGRTGTNSVSQDLGDGGDSGILENPPNERGYPRSGIEDENREMAA
HPDKRGEDQAEGLPEEVRGGTSLPDEGEGGASNNGRSMEPGSSHSARVTGVLVIPSPELEEAVLRRNKRRPTNSGSKPLT
PATVPGTRSPPLNRYNSTGSPPGKPPSTQDEHINSGDTPAVRVKDRKPPIGTRSVSDCPANGRPIHPGLESDSTKKGIGE
NTSSMKEMATLLTSLGVIQSAQEFESSRDASYVFARRALKSANYAEMTFNVCGLILSAEKSSARKVDENKQLLKQIQESV
ESFRDIYKRFSEYQKEQNSLLMSNLSTLHIITDRGGKTDNTDSLTRSPSVFAKSKENKTKATRFDPSMETLEDMKYKPDL
IREDEFRDEIRNPVYQERDTEPRASNASRLLPSKEKPTMHSLRLVIESSPLSRAEKAAYVKSLSKCKTDQEVKAVMELVE
EDIESLTN
>Q86606 ~~~P/V~~~Phosphoprotein~~~
MAEEPTYTAEQVNDVVHAGLGTVDFFLSRPVDGQSSLGKGSVPPGITAVLTNAAELKAKTAAAAPVKPKRKKIQHMTPAY
TIADNGDPNRLPANTPIANPLIPIERPPGRMTDLDLATGTVTQGTYKGVELAKAGKNALLTRFSSGPSLTDQASSKDPNF
KRGGEKLTDATKADIGGSGASPGSETKLRFMSGAIQHVPQLLPLTASSPVLVEPAPIGAENVKEIIEILRGLDLRMQSLE
GKVDKILATSATITALKNEVTSLKANVATVEGMMTTMKIMDPSTPTNVPVEKIRKNLKDTPVIISGPLSESHITEGSDMI
VLDELARPSLSSTKKIVRRPEPKKDLTGMKLMLIQLANDCMGKPDQKAEIVAKIHAATREAQLLDIKRSIIKSAI
>Q91DS2 ~~~P~~~Phosphoprotein~~~
MSLHSKLSESLKAYADLDKTVKEIEEQVSSMEEPVPKTVKYVTFEENLSEEEWESDSGDDDEDSIDDSLIPDYLRESSSI
TVDEDEEDQKEDMEEHLPTVSWEEEPTGIDIGFGPGIVMPSVSNHEGGTYVRYNGLGGVDPNCKDLISKMMRSLIGQIGN
KYGYDIDLFDYQGDFLEVFLPHKPSKEDVRPDIRIGKKNEEGTSKQVSKPRGKEKIVLKTGDECGRFPMNKEAKKREPEG
LWEVMKVLSVQFDPWKEDEPPLSLTIRDLFISESEFRLHCNHSQTEREMALVGIKLRRLYNKLYQKYRL
>P21299 ~~~P~~~Phosphoprotein~~~
MAGIYAVSIKGHASAIFNRQEKEISTGRVWEVMKKIMSLKPTRVIMSYSLLRSALDKSRQLTQEEYNIMQLILDGCVKTL
EPVAASGICIDVNLGKCTKHTIPFGITNNDVGHVSVVMTLPFLEEGCYNIGACFDGRLSKSRSDASHYAVDVSLEIYLKS
LSRDEAEEQISKGTSVYPFKINHPTYFEDETDTSDGESLSGRASSDDGPEDGGHGHGDKNNEKNSGKVVRKRKSRKEIDV
GRFKMVKDNIINTRSGLLKSMRGTGHRKHRTQEITEGYNYGDKDAE
>P03520 ~~~P~~~Phosphoprotein~~~
MDNLTKVREYLKSYSRLDQAVGEIDEIEAQRAEKSNYELFQEDGVEEHTKPSYFQAADDSDTESEPEIEDNQGLYAQDPE
AEQVEGFIQGPLDDYADEEVDVVFTSDWKPPELESDEHGKTLRLTSPEGLSGEQKSQWLSTIKAVVQSAKYWNLAECTFE
ASGEGVIMKERQITPDVYKVTPVMNTHPSQSEAVSDVWSLSKTSMTFQPKKASLQPLTISLDELFSSRGEFISVGGDGRM
SHKEAILLGLRYKKLYNQARVKYSL
>P04880 ~~~P~~~Phosphoprotein~~~
MDNLTKVREYLKSYSRLDQAVGEIDEIEAQRAEKSNYELFQEDGVEEHTRPSYFQAADDSDTESEPEIEDNQGLYVPDPE
AEQVEGFIQGPLDDYADEDVDVVFTSDWKQPELESDEHGKTLRLTLPEGLSGEQKSQWLLTIKAVVQSAKHWNLAECTFE
ASGEGVIIKKRQITPDVYKVTPVMNTHPYQSEAVSDVWSLSKTSMTFQPKKASLQPLTISLDELFSSRGEFISVGGNGRM
SHKEAILLGLRYKKLYNQARVKYSL
>P04878 ~~~P~~~Phosphoprotein~~~
MDSIDRLKTYLATYDNLDSALQDANESEERREDKYLQDLFIEDQGDKPTPSYYQEEESSDSDTDYNAEHLTMLSPDERID
KWEEDLPELEKIDDDIPVTFSDWTQPVMKENGGEKSLSLFPPVGLTKVQTDQWRKTIEAVCESSKYWNLSECQIMNSEDR
LILKGRIMTPDCSSSIKSQNSIQSSESLSSSHSPGPAPKSRNQLGLWDSKSTEVQLISKRAGVKDMMVKLTDFFGSEEEY
YSVCPEGAPDLMGAIIMGLKHKKLFNQARMKYRI
>P04877 ~~~P~~~Phosphoprotein~~~
MDSVDRLKTYLATYDNLDSALQDANESEERREDKYLQDLFIEDQGDKPTPSYYQEEESSDSDTDYNAEHLTMLSPDERID
KWEEDLPELEKIDDDIPVTFSDWTQPVMKENGGEKSLSLFPPVGLTKIQTEQWKKTIEAVCESSKYWNLSECQILNLEDS
LTLKGRLMTPDCSSSVKSQNSVRRSEPLYSSHSPGPPLKVSESINLWDLKSTEVQLISKRAGVKDMTVKLTDFFGSEEEY
YSVCPEGAPDLMGAIIMGLKYKKLFNQARMKYRL
>Q5VKP1 ~~~P~~~Phosphoprotein~~~
MSKSLIHPSDLRAGLADIEMADETVDLVYKNLSEGQAHLQGEPFDIKDLPEGVSKLQISDNVRSDTSPNEYSDEDDEEGE
DEYEEVYDPVSAFQDFLDETGSYLISKLKKGEKIKKTWSEVSRVIYSYVMSNFPPRPPKPTTKDIAVQADLKKPNEIQKI
SEHKSKSEPSPREPVVEMHKHATLENPEDDEGALESEIAHQVAESYSKKYKFPSKSSGIFLWNFEQLKMNLDDIVQVARG
VPGISQIVERGGKLPLRCMLGYVGLETSKRFRSLVNQDKLCKLMQEDLNAYSVSSNN
>B3FK34 3.6.5.-~~~~~~Phage tubulin-like protein~~~
MPVKVCLIFAGGTGMNVATKLVDLGEAVHCFDTCDKNVVDVHRSVNVTLTKGTRGAGGNRKVILPLVRPQIPALMDTIPE
ADFYIVCYSLGGGSGSVLGPLITGQLADRKASFVSFVVGAMESTDNLGNDIDTMKTLEAIAVNKHLPIVVNYVPNTQGRS
YESINDEIAEKIRKVVLLVNQNHGRLDVHDVANWVRFTDKHNYLIPQVCELHIETTRKDAENVPEAISQLSLYLDPSKEV
AFGTPIYRKVGIMKVDDLDVTDDQIHFVINSVGVVEIMKTITDSKLEMTRQQSKFTQRNPIIDADDNVDEDGMVV
>Q8SDC3 3.6.5.-~~~~~~Phage tubulin-like protein~~~
MMSKVKTRIYFCGGAGFRIGELFHGYHEDVCYIDTSVQNKHKHNTDDNTIIIEADTKLADQTARKRAIGMGKDRKAAAEL
ISAHIPAIAHHFPAGDTNIVVYSMGGASGSTIGPSLVSHLQQQGEVVVSVVIGSYDSDISLRNSSGSLKTFEGVSSVSKV
PMIINYHENVEGIPQSMVNQNILEVLNALVILFNQEHQSLDLMDITNWAHFHKHHDVPVQTVQLHVCFDRQEAQAILDPI
SIASLYTDPDRDVSISTVLTRTTGYADPEKYDFDQMHFVINGLSIEDIRKRLEERREMMNRAKANMRKRQSTLDVDDQAT
SSGLVFD
>F8SJR0 3.6.5.-~~~~~~Phage tubulin-like protein~~~
MSKVKTRVYCCGGTGMDIGVNLQWHPDLVFIDTCDKNVTADHDLERVFLTEGTRGAGKNRRYMLPIIRPQVPGFLERYPA
GDFNIVVFGLGGGSGSTIGPVIVSELAKAGESVAVVCMSGIEATEVLQNDIDTLKTLEGIAAATNTPVVINHIENVNGVP
YTELDKEAIFNIHALINLTSQKHVRLDKLDIDNWINFTKKHNQIQPQLCQLHISNNRQEATSVPEPIAIASLFADASREV
AFGTPFVRTVGISDVSDPDLLADQLHFVINSIGVASLFGSLTKQKQELEAAQVRYQQRNAIIDIDDNRTDDGFVV
>P15963 ~~~~~~Per os infectivity factor 0~~~
MAVLTAVDLTNASRYAIHMHRLEFISRWRTRFPHILIDYTLRPASSDDDYYVPPKLADKALAVKLAFSKRGCVSMSCYPF
HETGVVSNTTPFMYMQTSETSVGYAQPACYHLDRAAAMREGAETQVQSAEFRYTLDNKCILVDSLSKMYFNSPYLRTEEH
TIMGVDDVPAFNVRPDPDPLFPERFKGEFNEAYCRRFGRELFNGGCSFRWWESLIGFVLGDTIFVTFKMLANNIFSELRD
FDYKAPSSILPPRPNVDSNAILAQWRSVRDNATDLEFEKLFNKNPTLNDLGMIVNGSPVQITYTAETGFTKTPIAYNYRG
NERARVEHFEALDRSISDQDLESIITSFLEDYALVFGIATDIGFDMLMSGFKSMLKKINTSLIPAMKHMLLSTTRRVTVR
MLGETYKAALVHSLNVIAIKTLTVTAKALTRIAIQASSIVGIVLILLTLADLVLALWDPFGYNNMFPREFPDDMSRTFLT
AYFESFDNTTSREIIEFMPEFFSEMVETDDDATFESLFHLLDYVASLEVNSDGQMLNLEEGDEIEDFDESTLVGQALATS
SLYTRMEFMQYTFRQNTLLSMNKENNNFNQIILGLFATNTIVAFTAFVIHTELIFFIFFVIFLMITFYYIIKESYEYYKT
IDLLF
>P41672 ~~~~~~Per os infectivity factor 1~~~
MHFAIILLFLLVIIAIVYTYVDLIDVHHEEVRYPITVFDNTRAPLIEPPSEIVIEGNAHECHKTLTPCFTHGDCDLCREG
LANCQLFDEDTIVKMRGDDGQEHETLIRAGEAYCLALDRERARSCNPNTGVWLLAETETGFALLCNCLRPGLVTQLNMYE
DCNVPVGCAPHGRIDNINSASIRCVCDDGYVSDYNADTETPYCRPRTVRDVMYDESFFPRAPCADGQVRLDHPALNDFYR
RHFRLEDICVIDPCSVDPISGQRTSGRLFHQPTVNGVGINGCNCPADDGLLPVFNRHTADTGMVRQSDRTVANACLQPFN
VHMLSLRHVDYKFFWGRSDHTEFADADMVFQANVNQLSHERYRAILYSLLESHPDVTEIVTVNMGVMKISVSYDTTLKNI
LLPSSVFRLFRFKESGTAQPVCFFPGVGRCITVNSDSCIRRHAGGQVWTAETFTNSWCVLSREGTHIKVWSRASRYPRGD
APAALRLRGFFLNNDRERNTIRAVTTGDMTQGQQIDALTQILETYPNYSV
>P41427 ~~~~~~Per os infectivity factor 2~~~
MYRVLIVFFLFVFLYIVYQPFYQAYLHIGHAQQDYNDTLDDRMDYIESVMRRRHYVPIEALPAIRFDTNLGTLAGDTIKC
MSVPLFVSDIDLPMFDCSQICDNPSAAYFFVNETDVFVVNGHRLTVGGYCSTNSLPRNCNRETSVILMSLNQWTCIAEDP
RYYAGTDNMTQLAGRQHFDRIMPGQSDRNVLFDRLLGREVNVTTNTFRRSWDELLEDGTRRFEMRCNARDNNNNLMFVNP
LNPLECLPNVCTNVSNVHTSVRPVFETGECDCGDEAVTRVTHIVPGDRTSMCASIIDGLDKSTASYRYRVECVNLYTSIL
NYSNNKLLCPSDTFDSNTDAAFAFEVPGSYPLSRNGINEPTYRFYLDTRSRVNYNDVRGQLS
>P41668 ~~~~~~Per os infectivity factor 3~~~
MLNFWQILILLVIILIVYMYTFKFVQKFILQDAYMHINELEAPPLNFTFQRNRGVDCSLNRLPCVTDQQCRDNCVISSAA
NELTCQDGFCNASDALVNAQAPDLIECDPALGLVHVFSAGGDFVVSQTCVSTYRDLVDDTGTPRPYLCNDGRLNMNLNTV
QFSPDACDCSSGYEKMLFRQTALARTIPVCIPNRMANLYRRVYN
>P41705 ~~~~~~Per os infectivity factor 5~~~
MSFFSNLRAVNKLYPNQASFITDNTRLLTSTPAGFTNVLNAPSVRNIGNNRFQPGYQLSNNQFVSTSDINRITRNNDVPN
IRGVFQGISDPQINSLSQLRRVDNVPDFNYHTKQTRSNAVKQNFPETNVRTPEGVQNALQQNPRLHSYMQSLKVGGTGIL
LATGGYFLFSAATLVQDIINAINNTGGSYYVQGKDAGEIAEACLLLQRTCRQDPNLNQSDVTICPFDPLLPNNPPELTNM
CQGFNYEVEKTVCRGSDPSADPDSPQYVDISDLPAGQTLMCIEPYSFGDLVGDLGLDWLLGDEGLVGKSSNVSDSVSGKL
MPIILLIGAVLFLGLIFYFIYRYMMKGGGGGGVGAATSPTPIVISMQNPTPTTAPR
>P41429 ~~~~~~Protein kinase-interacting protein PIKP1~~~
MSCILTALCKKGQANLNSLIKLQNKKVKNYYVKNNETAIDKMLCIAADIKGQVEQLELVNQYLGAPESEKLDFVYDCSDL
DINEKDLKSLCLTKNIAYFTQKYNAPTVLKAQAAVYDSFIKHSELFINAICQMDEKQQVNNFCLDELVKLKLIAIKHLCA
LEYVIENSI
>P42493 2.7.11.1~~~~~~Serine/threonine-protein kinase 1~~~
MSRPEQQLKKMLKTPQAQYAFYPTAKVERISTTQHMYFIATRPMFEGGRNNVFLGHQVGQPIIFKYVSKKEIPGNEVIVL
KALQDTPGVIKLIEYTENAMYHILIIEYIPNSVDLLHYHYFKKLEETEAKKIIFQLILIIQNIYEKGFIHGDIKDENLII
DINQKIIKVIDFGSAVRLDETRPQYNMFGTWEYVCPEFYYYGYYYQLPLTVWTIGMVAVNLFRFRAENFYLNDILKRENY
IPENISETGKQFITECLTINENKRLSFKSLVSHPWFKGLKKEIQPISELGVDYKNVIT
>P41415 2.7.11.1~~~PK1~~~Serine/threonine-protein kinase 1~~~
MATTNATLQTLVQFYENCKNVKTRYKIINGRFGKISILSHKPTSKLYLQKTISAHNFNADEIKVHQLMSDHPNFIKIYFN
HGSINNQVIVMDYIDCPDLFETLQIKGELSYQLVSNIIRQLCEALNDLHKHNFIHNDIKLENVLYFEALDRVYVCDYGLC
KHENSLSVHDGTLEYFSPEKIRHTTMHVSFDWYAVGVLTYKLLTGGRHPFEKSEDEMLDLNSMKRRQQYNDIGVLKHVRN
VNARDFVYCLTRYNIDCRLTNYKQIIKHEFLS
>P41720 2.7.11.1~~~PK1~~~Serine/threonine-protein kinase 1~~~
MDALIGDFADFHKECSARTALHLVNGKFGKVSVWKHGPTQKSFFYKRIEHKHFNAIEPFVHHLMKFNKYFLRLFYSLHSL
REHLLVMDYIPDGDLFDLMQTEPRLREPEISLIAYQLIDALQALHKHNVVHNDVKLENVLYRRFEQIYVCDYGLCKIAGS
PSTFEGTVDYFSPEKINKHAAAVHFDWWAVGVLLYEISTGKHPFKLDQDESLDVETLHKRQIQLDVTFPADFDNPFLEEF
ICFLLGYCYDYRAHSYEVIQKNTYWKSIVHWKQR
>P41676 2.7.11.1~~~PK2~~~Probable serine/threonine-protein kinase 2~~~
MKPEQLVYLNPRQHRIYIASPLNEYMLSDYLKQRNLQTFAKTNIKVPADFGFYISKFVDLVSAVKAIHSVNIVHHNINPE
DIFMTGPDFDLYVGGMFGSLYKTFIKNNPQNITLYAAPEQIKKVYTPKNDMYSLGIVLFELIMPFKTALERETTLTNFRN
NVQQMPASLSQGHPKLTEIVCKLIQHDYSQRPDAEWLLKEMEQLLLEYTTCSKKL
>P11014 3.6.4.-~~~~~~DNA packaging protein~~~
MDKSLFYNPQKMLSYDRILNFVIGARGIGKSYAMKVYPINRFIKYGEQFIYVRRYKPELAKVSNYFNDVAQEFPDHELVV
KGRRFYIDGKLAGWAIPLSVWQSEKSNAYPNVSTIVFDEFIREKDNSNYIPNEVSALLNLMDTVFRNRERVRCICLSNAV
SVVNPYFLFFNLVPDVNKRFNVYDDALIEIPDSLDFSSERRKTRFGRLIDGTEYGEMSLDNQFIGDSQVFIEKRSKDSKF
VFSIVYNGFTLGVWVDVNQGLMYIDTAHDPSTKNVYTLTTDDLNENMMLITNYKNNYHLRKLASAFMNGYLRFDNQVIRN
IAYELFRKMRIQ
>P03272 ~~~IVa2~~~Packaging protein 1~~~
METRGRRPAALQHQQDQPQAHPGQRAARSAPLHRDPDYADEDPAPVERHDPGPSGRAPTTAVQRKPPQPAKRGDMLDRDA
VEHVTELWDRLELLGQTLKSMPTADGLKPLKNFASLQELLSLGGERLLAHLVRENMQVRDMLNEVAPLLRDDGSCSSLNY
QLQPVIGVIYGPTGCGKSQLLRNLLSSQLISPTPETVFFIAPQVDMIPPSELKAWEMQICEGNYAPGPDGTIIPQSGTLR
PRFVKMAYDDLILEHNYDVSDPRNIFAQAAARGPIAIIMDECMENLGGHKGVSKFFHAFPSKLHDKFPKCTGYTVLVVLH
NMNPRRDMAGNIANLKIQSKMHLISPRMHPSQLNRFVNTYTKGLPLAISLLLKDIFRHHAQRSCYDWIIYNTTPQHEALQ
WCYLHPRDGLMPMYLNIQSHLYHVLEKIHRTLNDRDRWSRAYRARKTPK
>P03271 ~~~IVa2~~~Packaging protein 1~~~
METRGRRPAALQHQQDQPQAHPGQRAARSAPLHRDPDYADEDPAPVERHDPGPSGRAPTTAVQRKPPQPAKRGDMLDRDA
VEQVTELWDRLELLGQTLKSMPTADGLKPLKNFASLQELLSLGGERLLADLVRENMRVRDMLNEVAPLLRDDGSCSSLNY
QLHPVIGVIYGPTGCGKSQLLRNLLSSQLISPTPETVFFIAPQVDMIPPSELKAWEMQICEGNYAPGPDGTIIPQSGTLR
PRFVKMAYDDLILEHNYDVSDPRNIFAQAAARGPIAIIMDECMENLGGHKGVSKFFHAFPSKLHDKFPKCTGYTVLVVLH
NMNPRRDMAGNIANLKIQSKMHLISPRMHPSQLNRFVNTYTKGLPLAISLLLKDIFRHHAQRSCYDWIIYNTTPQHEALQ
WCYLHPRDGLMPMYLNIQSHLYHVLEKIHRTLNDRDRWSRAYRARKTPK
>P27387 ~~~XX~~~Packaging protein P20~~~
MVNWELLKNPINWLIVILMLTIAGMAATLVCNHFGKNAVTSE
>P27388 ~~~XXII~~~Packaging protein P22~~~
MQLITDMAEWSSKPFRPDMSLTGWLAFVGLIIVAIILWQQIIRFIIE
>P0DJX1 ~~~L4~~~Packaging protein 2~~~
MAPKKKLQLPPPPPTDEEEYWDSQAEEVLDEEEEMMEDWDSLDEASEAEEVSDETPSPSVAFPSPAPQKLATVPSIATTS
APQAPPALPVRRPNRRWDTTGTRAGKSKQPPPLAQEQQQRQGYRSWRGHKNAIVACLQDCGGNISFARRFLLYHHGVAFP
RNILHYYRHLYSPYCTGGSGSGSNSSGHTEAKATG
>Q2KS03 ~~~L4~~~Packaging protein 2~~~
MAPKKKLQLPPPPTDEEEYWDSQAEEVLDEEEEDMMEDWESLDEEASEVEEVSDETPSPSVAFPSPAPQKSATGSSMATT
SAPQAPPALPVRRPNRRWDTTGTRAGKSKQPPPLAQEQQQRQGYRSWRGHKNAIVACLQDCGGNISFARRFLLYHHGVAF
PRNILHYYRHLYSPYCTGGSGSGSNSSGHTEAKATG
>P03262 ~~~L1~~~Packaging protein 3~~~
MHPVLRQMRPPPQQRQEQEQRQTCRAPSPSPTASGGATSAADAAADGDYEPPRRRARHYLDLEEGEGLARLGAPSPERHP
RVQLKRDTREAYVPRQNLFRDREGEEPEEMRDRKFHAGRELRHGLNRERLLREEDFEPDARTGISPARAHVAAADLVTAY
EQTVNQEINFQKSFNNHVRTLVAREEVAIGLMHLWDFVSALEQNPNSKPLMAQLFLIVQHSRDNEAFRDALLNIVEPEGR
WLLDLINILQSIVVQERSLSLADKVAAINYSMLSLGKFYARKIYHTPYVPIDKEVKIEGFYMRMALKVLTLSDDLGVYRN
ERIHKAVSVSRRRELSDRELMHSLQRALAGTGSGDREAESYFDAGADLRWAPSRRALEAAGAGPGLAVAPARAGNVGGVE
EYDEDDEYEPEDGEY
>P04496 ~~~L1~~~Packaging protein 3~~~
MHPVLRQMRPPPQQRQEQEQRQTCRAPSPPPTASGGATSAVDAAADGDYEPPRRRARHYLDLEEGEGLARLGAPSPERYP
RVQLKRDTREAYVPRQNLFRDREGEEPEEMRDRKFHAGRELRHGLNRERLLREEDFEPDARTGISPARAHVAAADLVTAY
EQTVNQEINFQKSFNNHVRTLVAREEVAIGLMHLWDFVSALEQNPNSKPLMAQLFLIVQHSRDNEAFRDALLNIVEPEGR
WLLDLINILQSIVVQERSLSLADKVAAINYSMLSLGKFYARKIYHTPYVPIDKEVKIEGFYMRMALKVLTLSDDLGVYRN
ERIHKAVSVSRRRELSDRELMHSLQRALAGTGSGDREAESYFDAGADLRWAPSRRALEAAGAGPGLAVAPARAGNVGGVE
EYDEDDEYEPEDGEY
>P27379 ~~~VI~~~Packaging efficiency factor P6~~~
MDTEEIKEEMQEAAEAAIENAVETAELETAAIKAEGAAAAAEQSAEQAAVMAATLAASVEANAAQQIAEHSEQVQTQEEK
ISWLENQVMAMASNLQMMQEAVTALTVSQSLTPEPSPVPAVEVEAMPEAVTVEILPESAGDQQEAEPVPSVGDQQETAPR
KRFRAI
>P27381 ~~~IX~~~DNA packaging ATPase P9~~~
MTIRMPNDRQRILVLGKTGTGKTCAAVWHLSQKDFKRKAWIVLNHKGDDLIDSIEGANHVDLNFRPKKPGLYIYHPIPDV
DDAEVTQLLWDIHAMGDIGVYVDEGYMIPNRDPAFQALLTQGRSKKIPMIILSQRPVWLTRFAISESDFFQIFQLGDQRD
RQTVQGFVPVDLEKLMQAPVNTVPALKKFHSIYYDVGANNCVIMTPVPTADAVLARFDLGKKRKQTL
>P00513 2.7.11.1~~~~~~Protein kinase 0.7~~~
MNITDIMNAIDAIKALPICELDKRQGMLIDLLVEMVNSETCDGELTELNQALEHQDWWTTLKCLTADAGFKMLGNGHFSA
AYSHPLLPNRVIKVGFKKEDSGAAYTAFCRMYQGRPGIPNVYDVQRHAGCYTVVLDALKDCERFNNDAHYKYAEIASDII
DCNSDEHDELTGWDGEFVETCKLIRKFFEGIASFDMHSGNIMFSNGDVPYITDPVSFSQKKDGGAFSIDPEELIKEVEEV
ARQKEIDRAKARKERHEGRLEARRFKRRNRKARKAHKAKRERMLAAWRWAERQERRNHEVAVDVLGRTNNAMLWVNMFSG
DFKALEERIALHWRNADRMAIANGLTLNIDKQLDAMLMG
>Q5UQC3 1.14.11.4~~~~~~Procollagen lysyl hydroxylase and glycosyltransferase~~~
MISRTYVINLARRPDKKDRILAEFLKLKEKGVELNCVIFEAVDGNNPEHLSRFNFKIPNWTDLNSGKPMTNGEVGCALSH
WSVWKDVVDCVENGTLDKDCRILVLEDDVVFLDNFMERYQTYTSEITYNCDLLYLHRKPLNPYTETKISTHIVKPNKSYW
ACAYVITYQCAKKFMNANYLENLIPSDEFIPIMHGCNVYGFEKLFSNCEKIDCYAVQPSLVKLTSNAFNDSETFHSGSYV
PSNKFNFDTDKQFRIVYIGPTKGNSFHRFTEYCKLYLLPYKVIDEKETNDFVSLRSELQSLSEQDLNTTLMLVVSVNHND
FCNTIPCAPTNEFIDKYKQLTTDTNSIVSAVQNGTNKTMFIGWANKISEFINHYHQKLTESNAETDINLANLLLISSISS
DFNCVVEDVEGNLFQLINEESDIVFSTTTSRVNNKLGKTPSVLYANSDSSVIVLNKVENYTGYGWNEYYGYHVYPVKFDV
LPKIYLSIRIVKNANVTKIAETLDYPKELITVSISRSEHDSFYQADIQKFLLSGADYYFYISGDCIITRPTILKELLELN
KDFVGPLMRKGTESWTNYWGDIDPSNGYYKRSFDYFDIIGRDRVGCWNVPYLASVYLIKKSVIEQVPNLFTENSHMWNGS
NIDMRLCHNLRKNNVFMYLSNLRPYGHIDDSINLEVLSGVPTEVTLYDLPTRKEEWEKKYLHPEFLSHLQNFKDFDYTEI
CNDVYSFPLFTPAFCKEVIEVMDKANLWSKGGDSYFDPRIGGVESYPTQDTQLYEVGLDKQWHYVVFNYVAPFVRHLYNN
YKTKDINLAFVVKYDMERQSELAPHHDSSTYTLNIALNEYGKEYTAGGCEFIRHKFIWQGQKVGYATIHAGKLLAYHRAL
PITSGKRYILVSFVN
>P0DTK8 ~~~~~~Glycyl-dTMP PLP-dependent decarboxylase~~~
MIVNNTPVETYELNGVPILVKREDLCAPLPGPSFSKIRGVVAHIKNRPETTIGCLDTYHSKAGWAVAYVCQQLGKQAVDY
WPRFKRDGAADAPRVQQQHARQLGADLVDIPAGRSAILYHTAKKHLRENYHDSYLMPNALKLPESITENAAEAVRTAPHL
PDSGTLVISISSGTVAAGVLKGFEEAGLLRNYNVILHMGYSRSQDATREYIEKAAGLTLGDRIKFIDEGYGYADAARDAS
APFPCNPFYDLKAWKWLSNPTNLETILDGPIVFWNIGE
>E1XTJ1 ~~~~~~O-seryl-dTMP PLP-dependent decarboxylase~~~
MNEPIDPKSFPSQPLSPYIPMHHIGKGPYKTIFNVLSLNRDHIHWEDYLYKHTPCELVANPETNQQVWFKREDYFAPLSC
YMNGKQGINGSKLRQAIWLMVEHLKAGGSPDLIHGTVVGSPQSPMATAVSRHFGGKTTTVLGATKPTTCMNHDMVKMSAW
FGSEFNFVGSGYNSTIQPRCKKLIEQQNPKAYYLEYGITLDHTLHSPERIAGFHMLGGEQVANIPDHITDLIIPAGSCNS
CTSILTGLAMHPKPNLKNVYLIGIGPNRLDFIESRLRIIGKQANLPHITDFTRCYHDNPDYVYGKKDLQHASKSVSLAGL
LMGIREKGESEITLPRFAVHHWDLHTTNWVRYNDLMDYQWGDIELHPRYEGKVMTWIQQHKPELLNENTLFWIVGSKPYI
EPMKAACPELSMPEQVPVNEFTPD
>P0DV14 ~~~pNG1~~~Uncharacterized protein pNG1~~~
MMFDSLLSTITGGGVSHSFSIAMVY
>P0DTH8 ~~~pNG2~~~Uncharacterized protein pNG2~~~
MSGFMNSLRKCIGNINSHLEGFMRTYLLRIIRKIKPAAQLSIEDDHYKCL
>P0DTH9 ~~~pNG3~~~Uncharacterized membrane protein pNG3~~~
MFELSSILIRGGGGVLIVLILLLWIVDENCTDAKAMAYNINCTV
>P0DTI0 ~~~pNG4~~~Uncharacterized membrane protein pNG4~~~
MFDISSILIRGGGVLIVVILLLWIVEHNEDFIDAKSMNYNNQTV
>P0DTI2 ~~~pNG6~~~Uncharacterized membrane protein pNG6~~~
MDCWTIREKNYSYIILSILVILLIWYLILNYCRSKKNAVTNNMPPPAYTVSSSCSQ
>P0DTI3 ~~~pNG7~~~Uncharacterized protein pNG7~~~
MMCMIIRRNRFTGVHDQNLIFNVKSTDVNCLV
>P0DJY2 ~~~~~~Putative protein pog~~~
PIKKQILPTYIIRNSRRPLKKTQLKRNKSPPFMMWKLQIGSIPPWLKTLHRLGAWMIRTVLFSFYNAPYSLTTLRSLLDQ
QQMITNPSIDMC
>P03600 ~~~~~~RNA1 polyprotein~~~
MGLPEYEADSEALLSQLTIEFTPGMTVSSLLAQVTTNDFHSAIEFFAAEKAVDIEGVHYNAYMQQIRKNPSLLRISVVAY
AFHVSDMVAETMSYDVYEFLYKHYALFISNLVTRTLRFKELLLFCKQQFLEKMQASIVWAPELEQYLQVEGDAVAQGVSQ
LLYKMVTWVPTFVRGAVDWSVDAILVSFRKHFEKMVQEYVPMAHRVCSWLSQLWDKIVQWISQASETMGWFLDGCRDLMT
WGIATLATCSALSLVEKLLVAMGFLVEPFGLSGIFLRTGVVAAACYNYGTNSKGFAEMMALLSLAANCVSTVIVGGFFPG
EKDNAQSSPVILLEGLAGQMQNFCETTLVSVGKTCTAVNAISTCCGNLKALAGRILGMLRDFIWKTLGFETRFLADASLL
FGEDVDGWLKAISDLRDQFIAKSYCSQDEMMQILVLLEKGRQMRKSGLSKGGISPAIINLILKGINDLEQLNRSCSVQGV
RGVRKMPFTIFFQGKSRTGKSLLMSQVTKDFQDHYGLGGETVYSRNPCDQYWSGYRRQPFVLMDDFAAVVTEPSAEAQMI
NLISSAPYPLNMAGLEEKGICFDSQFVFVSTNFLEVSPEAKVRDDEAFKNRRHVIVQVSNDPAKAYDAANFASNQIYTIL
AWKDGRYNTVCVIEDYDELVAYLLTRSQQHAEEQEKNLANMMKSATFESHFKSLVEVLELGSMISAGFDIIRPEKLPSEA
KEKRVLYSIPYNGEYCNALIDDNYNVTCWFGECVGNPEQLSKYSEKMLLGAYEFLLCSESLNVVIQAHLKEMVCPHHYDK
ELNFIGKIGETYYHNQMVSNIGSMQKWHRAILFGIGVLLGKEKEKTWYQVQVANVKQALYDMYTKEIRDWPMPIKVTCGI
VLAAIGGSAFWKVFQQLVGSGNGPVLMGVAAGAFSAEPQSRKPNRFDMQQYRYNNVPLKRRVWADAQMSLDQSSVAIMSK
CRANLVFGGTNLQIVMVPGRRFLACKHFFTHIKTKLRVEIVMDGRRYYHQFDPANIYDIPDSELVLYSHPSLEDVSHSCW
DLFCWDPDKELPSVFGADFLSCKYNKFGGFYEAQYADIKVRTKKECLTIQSGNYVNKVSRYLEYEAPTIPEDCGSLVIAH
IGGKHKIVGVHVAGIQGKIGCASLLPPLEPIAQAQGAEEYFDFLPAEENVSSGVAMVAGLKQGVYIPLPTKTALVETPSE
WHLDTPCDKVPSILVPTDPRIPAQHEGYDPAKSGVSKYSQPMSALDPELLGEVANDVLELWHDCAVDWDDFGEVSLEEAL
NGCEGVEYMERIPLATSEGFPHILSRNGKEKGKRRFVQGDDCVVSLIPGTTVAKAYEELEASAHRFVPALVGIECPKDEK
LPMRKVFDKPKTRCFTILPMEYNLVVRRKFLNFVRFIMANRHRLSCQVGINPYSMEWSRLAARMKEKGNDVLCCDYSSFD
GLLSKQVMDVIASMINELCGGEDQLKNARRNLLMACCSRLAICKNTVWRVECGIPSGFPMTVIVNSIFNEILIRYHYKKL
MREQQAPELMVQSFDKLIGLVTYGDDNLISVNAVVTPYFDGKKLKQSLAQGGVTITDGKDKTSLELPFRRLEECDFLKRT
FVQRSSTIWDAPEDKASLWSQLHYVNCNNCEKEVAYLTNVVNVLRELYMHSPREATEFRRKVLKKVSWITSGDLPTLAQL
QEFYEYQRQQGGADNNDTCDLLTSVDLLGPPLSFEKEAMHGCKVSEEIVTKNLAYYDFKRKGEDEVVFLFNTLYPQSSLP
DGCHSVTWSQGSGRGGLPTQSWMSYNISRKDSNINKIIRTAVSSKKRVIFCARDNMVPVNIVALLCAVRNKLMPTAVSNA
TLVKVMENAKAFKFLPEEFNFAFSDV
>P29149 ~~~~~~RNA1 polyprotein~~~
MWQVPEGSQCCCTGKSFSNAEAKELRYVCSCWMSTRLVKAEAPPQQSRKSGIAPTPLKSKGTIQVSLPKATGVKPSIHKS
KGASVAPAPLLKQRCEVVVQYGPPADIELVYPPLVREEEKSSNIVVLPPTQKVEVRVPVCCAPKWMVAIPKPPVKLAPKA
SKLRFPKGAVAYNGVNFIDTKGKVVLSEGAKRILRGIRVAAKQRLRAARRSAACKKVRAKRALAEFEAIVQSERLDQLKT
GFQVVLPAPKMSCSLKEAAPSTTSVVVVKKRKLPRLPKILPEQDFSCLEGFDWGEKSHPVEVDIEDDWILVEKPVLKRQA
VQTAQGRATEALTRFAATSGFSLGAHQKVEDFASSGEAEYLMAGEFADLCLLSLVYNDAPTLSATIEELRDSKDFLEAIE
LLKLELAEIPTDSTTCAPFKQWASAAKQMAKGVGTMVGDFTRAAGAAVVISFDMAVEFLQDKALKFCKRIFDVTMAPYLQ
HLASAHSILKKIWEKLSEWMESLKSKASLALEVMRQHAIFALGAMVIGGVVVLVEKVLIAAKIIPNCGIILGAFLTLFFA
SLGLTALECTAEEIFRMHACCKSAIYSMYSVAEPTMADEGESHTMGATQGLDNAIQALTRVGQSMISFKLGSFSYYAKIA
QGFDQLARGKRAIGELTSWLIDLVGSIYSQVSGQESTFFDELSTIVCLDVRAWLLKSKRVRLQVETMAIGDRITLDTIAK
LLEEGHKILVTAAGVPRKTSADFTMCIKEEVSKLEEVHARTACAGINEGMRAFPFWVYIFGASQSGKTTIANSIIIPALL
EEMNLPKSSVYSRPKTGGFWSGYARQACVKVDDFYAIEQTPSLASSMIDVVNSEPYPLDMAYIHEKGMSMDSPLVVTTAN
TAVPPTNSQVVDLPSFYNRRAAVLEVRRKDGSFFTPRAYDSCIEVRFMHNKCPYVDSAGVPQGPAVNTPMDEGWITPSEA
VAVLKNLLGEHILAEEAKLLEYRERIGNDHPIYNAAKEFIGNMHYPGQWLTAEQKSTYGIKDDGFSFLAVDGKIYKYNVL
GKLNPCESEPPHPNVIPWLEKKTLEIVHWDVHKHIATGPRNALVACFLQGLVQGQSKVESVERMGKDSSPEQQNFFKRLS
LSERIYLRLCQIRIDNIQKEELAGSGRGPMAILRECLMKSKQVVVENYSLLLTLVAILLLISAAYTLLSTVVALAGCSSF
AGGMVAVTAVNNASIPCSEPRLEERYSPRNRFVSRISKIRGEGPSKGQGEHEELVTELYYYCDGVKKLISTCWFKGRSLL
MTRHQALAVPIGNEIEVIYADGTTKKLVWPGRQEDGNCKGFVEFPENELVVFEHPHLLTLPIKYEKYFVDDADRQISPNV
AVKCCVARLEDGIPQFHFWSKYATARSEVHTLKDEGGGNVYQNKIRRYIVYAHEAKKYDCGALAVAVIQGIPKVIAMLVS
GNRGVTYSSVIPNYSSSFIRGEVPYVPEDGLVSRGYRKVGYLHASDAPHVPSKTSFMKVPDELCFPYPDPKQPAILSAED
ERLKGTIHEGYTPLRDGMKKFAEPMYLLEEKLLDEVAGDMVQTWYDPGEFLEDISLDQAINGDMDEEYFDPLVMDTSEGY
PDVLDRKPGEKGKARFFVGEPGNRAFVAGCNPEKAYYQLEEDSKTKIPSLVSIETPKDERLKRSKIDTPGTRLFSVLPLA
YNLLLRVKFLSFSRLLMKKRGHLPCQVGINPYSREWTDLYHRLGELSDVGYNCDYKAFDGLITEQILSTIADMINAGYRD
PVGNRQRKNLLLAICGRLSICGNQVYATEAGIPSGCALTVVLNSIFKELLMRYCFKKIVPPVYKECFDRCVVLITYGDDN
VFTVAQSVMQYFTGDALKMQMAKLGVTITDGKDKSLSTIPARPLLELEFLKRGFVRSSGGMINAPLEKLSIMSSLVYIRS
DGSDMLQKLLDNVNTALVELYLHGDRTYFESVRAFYFEKLPPGAYKELTTWFQAESFHECQKSGESGYKPQGLIEISHGA
AFASFTQQAGTELEKHDICPGLSIAGTKYIATENEIVLSLSSVLPGDRNVFKLDLPCGDGIGRLPSKCSILNLRKPGLVM
RLCKRAQDEKKTLVIRDERPYIGAWAVACICGESFGFGQQSVLALYANLLGPNQRNGLASYFSDFESPIHIKKVHAKTNS
YEGGEALKEIFTFCETIFYEATEMDTRKVMLQNQPDVYPSISLVGGVCFPNEGGEPGAMYSETDVTMAREVQGVYVSEAC
VKCCRRCVGVATRVVTDTQLFGNNLLKTHLKALRKIQNHTCLRK
>P29150 ~~~~~~RNA1 polyprotein~~~
MSSICFAGGNHARLPSKAAYYRAISDRELDREGRFPCGCLAQYTVQAPPPAKTQEKAVGRSADLQKGNVAPLKKQRCDVV
VAVSGPPPLELVYPARVGQHRLDQPSKGPLAVPSAKQTSTAMEVVLSVGEAALTAPWLLCSYKSGVSSPPPPMTQRQQFA
AIKRRLVQKGQQIIRELIRARKAAKYAAFAARKKAAAVAAQKARAEAPRLAAQKAAIAKILRDRQLVSLPPPPPPSAARL
AAEAELASKSASLQRLKAFHRANRVRPVLNNSFPSPPLACKPDPALLERLRLATPSRCTVATKRQRDFVVAPLATQIRVA
KCASHQEAYDSCRSILIEEWPESRYLFGPLSFVGDWEHVPGMLMQYRLCVLFSMVRDVMPALSLVADTLHALRSGTAPNI
VFKNAMSTANQILECSHSSHAAQGFGNFLSRGKSAAINLASGLSSFVGEKVVSGANHVVNKASEVIVDKLFVPFVKLLRE
HFDDTIGKWIPKLLGATQKIEELWRWSLEWAQNMSKKLDVSLRVLRGSALVGVGLLLVSGILYFAEQLLRSFGLLIVAGS
FISMFVGGCLLAYAGSMAGIFDEQMMRVRGILCEIPMLLYLKAQPDPFFPKKSGGRAPTQGLTDVFGVPLSIMNAIGDGL
VHHSLDTLTLMGKFGAAMDNVRKGITCMRSFVSWLMEHLALALDKITGKRTSFFRELATLINFDVEKWVRDSQQYLLAAE
IYVDGDTVVMDTCRHLLDKGLKLQRMMVSAKSGCSFNYGRLVGDLVKRLSDLHKRYCASGRRVHYRLAPFWVYLYGGPRC
GKSLFAQSFMNAAVDFMGTTVDNCYFKNARDDFWSGYRQEAICCVDDLSSCETQPSIESEFIQLITTMRYGLNMAGVEEK
GASFDSKMVITTSNFFTAPTTAKIASKAAYNDRRHACILVQRKEGVAYNPSDPAAAAEAMFVDSTTQHPLSEWMSMQELS
AELLLRYQQHREAQHAEYSYWKSTSRTSHDVFDILQKCVNGDTQWLSLPVDVIPPSIRQKHKGNRVFAIDGRIFMFDYMT
LEYDEIKEKENLDARHLEARILEKYGDTRLLLEKWGANGVVAQFIEQLLEGPSNVASLEVLSKDSLESHKEFFSTLGLIE
RATLRAVQKKIDAAREDLMHLSGLKPGRSLTELFVEAYDWVYANGGKLLLVLAAVILILFFGSACIKLMQAIFCGAAGGT
VSMAAVGKMTVQSTIPSGSYADVYNARNMTRVFRPQSVQGSSLAEAQFNESHAVNMLVRIDLPDGNIISACRFRGKSLAL
TKHQALTIPPGAKIHIVYTDNNGNTKAPLTHFFQPTGPNGEHFLRFFNGTEVCIYSHPQLSALPGAPQNYFLKDVEKISG
DIAIKGCGIKLGRTSVGECVGVKDNEPVLNHWRAVAKVRTTKITIDNYSEGGDYSNDLPTSIISEYVNSPEDCGALLVAH
LEGGYKIIGMHVAGSSYPVEVDGVQMPRYISHASFFPDYSSFAPCQSSVIKSLIQEAGVEERGVSKVGHIKDPAETPHVG
GKTKLELVDEAFLVPSPVEVKIPSILSKDDPRIPEAYKGYDPLGDAMEKFYEPMLDLDEDVLESVMADMYDEFYDCQTTL
RIMSDDEVINGSDFGFNIEAVVKGTSEGYPFVLSRRPGEKGKARFLEELEPQPGDTKPKYKLVVGTEVHSAMVAMEQQAR
TEVPLLIGMDVPKDERLKPSKVLEKPKTRTFVVLPMHYNLLLRKYVGILCSSMQVNRHRLACAVGTNPYSRDWTDIYQRL
AEKNSVALNCDYSRFDGLLNYQAYVHIVNFINKLYNDEHSIVRGNLLMAMYGRWSVCGQRVFEVRAGMPSGCALTVIINS
LFNEMLIRYVYRITVPRPLVNNFKQEVCLIVYGDDNLISIKPDTMKYFNGEQIKTILAKYKVTITDGSDKNSPVLRAKPL
KQLDFLKRGFRVESDGRVLAPLDLQAIYSSLYYINPQGNILKSLFLNAQVALRELYLHGDVEQFTAVRNFYVNQIGGNFL
SLPQWRHCASFHDEQYSQWKPWSPVKFLEVDVPDAKFLQHKAPATALSIVADRLAVAGPGWRNKDPDRYLLVSLTSLKAN
EGGLYFPVDYGEGTGQQATEASIRAYRRLKDHRVRHMRDSWNEGKTIVFRCEGPFVSGWAAAISFGTSVGMNAQDLLINY
GIQGGAHKEYLGRYFVGARFKELERYDRPFQSRIIAS
>P38485 ~~~~~~RNA2 polyprotein~~~
MSELIVSDVVALSVWGLLTVVKVYQGNFSLSVLFSEAQTLGFLFFISCHWLQSILLPWVVKIKCATRFDIDLVEMEMRAE
KYLNSIPANVLEDRAKAYNVSKMTSIKNQIPSGKALYQNGKSLASMIKSQFQPISKVLKGGEVKSYNYIPVGSFTAGEHC
ELAVPIMPEEELAAIVPDSDFALVTKEDNKAKSVHVGAVEIVMECMTSPDCDIYGGAMFVDTFHEDPKNAVRALFVTQLK
GGVSPRCLFFPDTQVEIKKGMNERFRLILSSGNSDFKPGENLAYLKANVAMCGISMNKGYVPTAFHESYARKERASVIEY
LGRYSAVIHHRNEFKPEMLKRDGLSFRFGGKTKLIEKGPLQYEWSETASKVVSVKGTGPPTKEDETISKEVSETLGATEH
VVFPTRNVVAQAQMEVAQFGRLLDDTKSLKLQSLLNSRIAAGRFSIPMTAVKGTVVFDGLLASLIGTTLRGAPMFRHTYR
QSTKLRFIFTINVPISTGIGLMVGYNSVTSDKHLTNEYTISSEESVVWNPACQGVLEFSVSPNPCGMYWSYDYFRQTGSR
LSICVISPWSATPTTDCAVAWQIHVDDEQMTMSIFNPTQAPAVLPVKRWMGNLIFKQGAQEQVKKMPLAIGAAVGDDKTA
VMTMPNSLAAMWNYQIGTFNFEFTKLSSPFIKGTLLAFIAMDQDVSYSLEELQNFPNKIVQFDEKDGRAYVSFGEEHFAQ
AWSTQVSGAVTSAKRGCPYLYVVSKDCIASTICGDFQVGVKLLSIENYSPCGYNPGLVVASTIVQNTAGSNSTSLLAWPQ
FCSPCINVWSEFCALDIPVVDTTKVNFAQYSLDLVNPTVSANASGRNWRFVLIPSPMVYLLQTSDWKRGKLHFKLKILGK
SNVKRSEWSSTSRIDVRRAPGTEYLNAITVFTAEPHADEINFEIEICGPNNGFEMWNADFGNQLSWMANVVIGNPDQAGI
HQWYVRPGENFEVAGNRMVQPLALSGEDGTGMLPILK
>Q9Q2Q3 ~~~~~~RNA2 polyprotein~~~
MRPELVAVLDRYFSEIISCFFLGWLLNFLLVWFCSTKSTFLLWSVFLYICYYILRIEFAYIVAPFLKTIYTNSSQYHTVD
WAYAYTALPKGLWEQITDYNYCYNFPPPRVEGFVSDFSPRFTLKELEIMNEANITPVHTIPKDTLLKRASDYKLAVESKK
SILPRVQDLYEMDKWHNLRSKLSKNAPSYVTTSEIAVGAMSGAGNTKLAIPVVEKYTEEVADDRLPDRVRAKADQIMVAA
IELVADGFASVNSDVTMAGALYDKRHKTIASSFKGAFASRASGVPSHVIYFPMHRVPACDDPNTTLELSMVSRDSDFDEG
YTLANISARTLYVRAKGPEKVTETRHLLKAKTEDVVKARQFASEAQVVFATPRLFPEVNLDNYNLPGPSNAQQTEAITTD
RGILFPKPKFKGNEVVLNYTGSGKIRNVGSQRFETKKNATGEQFVRSVDDLGCLSDEDGKDYRYGQGLMEEDVLNVQTNN
FAIESATETMRLLFSGYASIPLNVIPGTKITVAYLNELSKHSAVHTGLLNMLSKIPGSLKVKINCQVAPTCGIGLAVSYV
EGNESANLGSSLGRLLGIQHYKWNPAIEPYVEFVFKPFSCADWWNMHYLGSFKFAPVVVIQTLSKWLNAPKVDARISFAI
YYEPTVVLPKQIATLEHAPAFMFRKEVGTLAFKQGERVAYSFEVNLGKPQTDGKEVTSTFASSYCGLSQYMQSDVILDFT
LMSSPMIGGTFSIAYVAGAYIEKVGNMQILDSLPHIDFTFSSGSKSTRSVRFPKEVFGVYQALDRWDLDSARGDDVSGNF
VLYQRDAVSSALEGELTFRIAARLSGDISFTGVSAGYPTTITRIGKGKTIGRSLDPEIRKPLRYMLGQAHATPKDFSSVR
FVMGHWKYRAGLYPGSKSDEDIHPFSLKMRLDGSKSSENFEIIHSPFVRLLQNCAWMRGTLRFYVVARASSDYMSYRRTS
QLTVSAHENSLSSNQFYSGVLTSPSGELSFSREVVGPVDGFASMGWNVRGSKKFYKIHVEMGNVHEYDTVVLYGQFDSNV
EFAGQRKGGHYLLEKETPIFKTIKY
>P23009 ~~~~~~RNA2 polyprotein~~~
MFASFIFSGDNKLTEKTIFNCGDLDILVVYYTIATQFRKFLPHYIRWHLYTLLIYILPSFLTTEIKYKRNLSNIHISGLF
YDNRFKFWTKHDKNLALTEEEKMEVIRNRGIPADVLAKRAHEFEKHVAHESLKDQIPAVDKLYSTKVNKFAKIMNLRQSV
VGDLKLLTDGKLYEGKHIPVSNISAGENHVVQIPLMAQEEILSSSASDFKTAMVSKSSKPQATAMHVGAIEIIIDSFASP
DCNIVGAMLLVDTYHTNPENAVRSIFVAPFRGGRPIRVVTFPNTIVQIEPDMNSRFQLLSTTTNGDFVQGKDLAMVKVNV
ACAAVGLTSSYTPTPLLESGLQKDRGLIVEYFGRMSYVAHNINQPQEKDLLEGNFSFDIKSRSRLEKVSSTKAQFVSGRT
FKYDIIGAGSQSSEELSEEKIQGKAKQVDARLRQRIDPQYNEVQAQMETNLFKLSLDDVETPKGSMLDLKISQSKIALPK
NTVGGTILRSDLLANFLTEGNFRASVDLQRTHRIKGMIKMVATVGIPENTGIALACAMNSSIRGRASSDIYTICSQDCEL
WNPACTKAMTMSFNPNPCSDAWSLEFLKRTGFHCDIICVTGWTATPMQDVQVTIDWFISSQECVPRTYCVLNPQNPFVLN
RWMGKLTFPQGTSRSVKRMPLSIGGGAGAKSAILMNMPNAVLSMWRYFVGDLVFEVSKMTSPYIKCTVSFFIAFGNLADD
TINFEAFPHKLVQFGEIQEKVVLKFSQEEFLTAWSTQVRPATTLLADGCPYLYAMVHDSSVSTIPGDFVIGVKLTIIENM
CAYGLNPGISGSRLLGTIPQSISQQTVWNQMATVRTPLNFDSSKQSFCQFSVDLLGGGISVDKTGDWITLVQNSPISNLL
RVAAWKKGCLMVKVVMSGNAAVKRSDWASLVQVFLTNSNSTEHFDACRWTKSEPHSWELIFPIEVCGPNNGFEMWSSEWA
NQTSWHLSFLVDNPKQSTTFDVLLGISQNFEIAGNTLMPAFSVPQANARSSENAESSA
>Q9YK98 ~~~~~~RNA2 polyprotein~~~
MSESGNTTSMPGCGRMCALRSTWSKRAFLVACKDGALTSDGRCPQYGCGALVSITKGVQQPKKTASAKVVKCLCWVQPAR
WCEKHSKGPASPNGSVTTKRSNSARAAPAPLPYKKQTCDVVVTVGPLELVYPALVSEELPTPVAATPTKVEEVPIPELPL
WLAPAWMVEQPYAATPEVLCLTQREEFALLKKRLTRKGKLLQRRATHARFEARAALARVRAATQRKVEEVTALVIKGRRI
LAAHQLLRELEEVAPLSQAQEQLVASSCAAAAARQEECASFLRRAKAWRKSISATPPVAFATAVASKVVSATMPWAHLGL
SLGGLLAVPTLDGTLGAKQWNAKTIATWVLKPVVSCVQSVHAKVRDWLHSQPEVGVTNTKVPLVLPEVCLGVLSPPSLSE
EIVDNPQETSQSGIWHPEMGVRNIYVFHDDSWETSPEEDENYTYTFSRQCGIPYLLVEGRGAEERKNTILGWDFSLHNDG
EFEFLPSPEEGYTKELVTPVALEEEDKYSTASSCGFFSLDDVSSAITIQCPGLLSADADVHFFDGPGYRCSSRPRDFRPP
VVRGCDYESRVKASIQRKIENPLQERFITVLREKRKKNKKKEFHSFSACFAFKRKQIQWPPTPNEMVNEWEEYCIAQAWL
PFEVVVTDEIEDVTPLYPGGRDYNCNSQLLFPLAPLSTVYCDDSCFHPNDGWTTDGNGKHFRLSPQFVLPDVPIPIVHRV
TRQLPQFLYDLGIGDLTCNSGYQAENLQEEIQERMEDRSEEKPVPSLDTLISKLSKRSTKVKGAGENRYADRHSLTEKAI
FHQPGALSRMRSGKEKTIVAANHNSDQISVRMAECGKPVFTPLPRMSDEMLRKFLEKGLGSTSTVALDIGIQSHIPQGMP
TVAFVNVMDTRIEDPLYSSLCGSYIDLGRDRAKTLCLPLVNFPMSKLAEDVDDVLNGLMLCTHFQDSTKFGVGKPAFQYG
TLEFQEFKPSAYSDFSRVRDNWDAIAKQQNTPNDRILAGFSVLGAVSQAYNQALPVFKSVELVAPPKRKPVVATFQNPTT
LGRSNTTRSFRMPTMDLPRSTGRDAPIPIVHRRNNNDVHGFDEATPARFSTCDSGLVADTTLAFAKMYQCKKDAKAGHVL
ATIDIQECVFEDNRRVALDWLAHGLASFKYDLQLTVDSNPFVGVTLGITVDAFDRLLPQISDEVIAVPLAFQLPTYLFPI
SKKGTFTQTIDFAAIAGYNFFPHVAAFGRPKIIVYIVSDNDLPASDTWMCLVELHMTRLESSTLACSPTLVLPQAFGGDL
PLDLWRGPYTFPLGGGTKRLSTSLDIGTSTTTVSGWRTVSFPAAYALFLQGHGGSLVGEVVHTGSAAVSCALHLCISFGG
APPTLEEALVFPGFRLPSGEGKFHIKVQTPYGRLSTLTPDCALYVYLAGGPIAVAPMSVPYQFCIHLERLVDDGAPPRTI
GLIREFNWATINNFKSDDITFAIPARLSDLVLTCGDVTMSTNPLALLIGSCGFFRGNLTVVLEWATFLKAGDKEGTVQLT
TCRGMINNVKGVRNAIQKKVVNLSLVGSVSRYLNVGDFTGFAQSGGQVGYDEIFLEFSTNKAKQIRYLNINVELDENFEL
YGRTIIPLKNTAPAFASTSASAPNES
>Q8QVU9 ~~~~~~RNA2 polyprotein~~~
MFAPIGAPGMGERASQNFRSASGFRDLLVHAVKVLVLSAYQRRPCTSMLRPREQKRIKMRLKWMCLMKFCFMQNNYWDTQ
RRFTGGMHNVPELATFDAWLQHWRVVHWRLNIISEILIDFQALVRMMKNLAGQCHFDFYPVCTHCVQLVNNWYIAEFKPH
LEEETEWWKDLVKPRMGNPVHEIEPNRKSQYVESRRIPPPLNGDEFANFLHVCQCAKLAFDVERAATEENFEDALDTLEE
DFYDIDSTIPKRDLLAADARVVKRAFTLRRKRRPNRTSVYSMKGQPPVTFSASDLVSCLGQVLSTLPTVKMDREAIEDQQ
DHLEDKQGGEILTTPQFIEVLRKKKREVREKEFDDSTQGKLLPAEDFTLSKHDVFLANSVLDGLRKSKLIQRFAGKCATS
TKITVDLTNKEEVVRYGPKELASEGFRQTFNVLNRPEYNALNKLAEAGWKEAKSVVLNLHIRSYLPQQMNAYAFCVIMWG
HSSDAQEAALSGSYVYLGDGEATMLQLPLLCEYVGHNLQDFEAYKRSLVLSTVFPEFSGIADGKAMFGITSIEFTEYLPT
SHAGITHERDSWDAMLRNHTEEKRRFLAGFNVVDTIEKGNRKGFSFPDFDLKAVPRHQAVVRTFEDQDVAPILSKAKSMR
VKTFGSFRAGNIPVNFLGTPSNGQVASKHSVSENAGYSVGDMKSAENFVFTQLITVPAASTKGNVLAGVDILANARTTMS
GFYMRWLQKGYIDTNLKLICHLPRAPFAGMSFFVLIDGTGYLAKDAPTSLNEEEILSYPLHLVTTSDVSSYEFVLDWHRY
IGQVPFAEENAFLRPTLFLVACVSSTLALSAKVEFYLEAQSVGEELPRTLAPSPVLSYPFQNSFLEDLDLFLPPKRLTLG
ERETTIIPLSFAKSKKSGDAVLYSHAAARLAHFQGIGGVLHGVVYLVGSQLVASQSRISMWSKEQHIQHQAVNVHVDTDT
GVAFDLPIKDAFYASSVYGDSGAVIQVTCLCSPMSPNAIKAPFDMIFKIRGFTPDAPMCRTINFTQRFGWFAVEPTTSTG
AIKLKIWPVSNHLESEDMKVTGYTNAFLQMCQTSTMHFGSVIIHFSWTLFGGTTNAATAGGVVTIAEGFGPEEENFRGHC
RNLSIYEGRATVPLELGTFAGPTPLKKLDFKYRNWIRFTTPKGRNISSIFCAIEVLPGFSFYGRTGSPRLSVVGTTVPPT
ADASTSNSQGGDEDIGDQYSAALGRGRGRGSRPGPSPIRG
>P03599 ~~~~~~RNA2 polyprotein~~~
MFSFTEAKSKISLWTRSAAPLNNVYLSYSCRCGLGKRKLAGGCCSAPYITCYDSADFRRVQYLYFCLTRYCCLYFFLLLL
ADWFYKKSSIFFETEFSRGFRTWRKIVKLLYILPKFEMESIMSRGIPSGILEEKAIQFKRAKEGNKPLKDEIPKPEDMYV
SHTSKWNVLRKMSQKTVDLSKAAAGMGFINKHMLTGNILAQPTTVLDIPVTKDKTLAMASDFIRKENLKTSAIHIGAIEI
IIQSFASPESDLMGGFLLVDSLHTDTANAIRSIFVAPMRGGRPVRVVTFPNTLAPVSCDLNNRFKLICSLPNCDIVQGSQ
VAEVSVNVAGCATSIEKSHTPSQLYTEEFEKEGAVVVEYLGRQTYCAQPSNLPTEEKLRSLKFDFHVEQPSVLKLSNSCN
AHFVKGESLKYSISGKEAENHAVHATVVSREGASAAPKQYDPILGRVLDPRNGNVAFPQMEQNLFALSLDDTSSVRGSLL
DTKFAQTRVLLSKAMAGGDVLLDEYLYDVVNGQDFRATVAFLRTHVITGKIKVTATTNISDNSGCCLMLAINSGVRGKYS
TDVYTICSQDSMTWNPGCKKNFSFTFNPNPCGDSWSAEMISRSRVRMTVICVSGWTLSPTTDVIAKLDWSIVNEKCEPTI
YHLADCQNWLPLNRWMGKLTFPQGVTSEVRRMPLSIGGGAGATQAFLANMPNSWISMWRYFRGELHFEVTKMSSPYIKAT
VTFLIAFGNLSDAFGFYESFPHRIVQFAEVEEKCTLVFSQQEFVTAWSTQVNPRTTLEADGCPYLYAIIHDSTTGTISGD
FNLGVKLVGIKDFCGIGSNPGIDGSRLLGAIAQGPVCAEASDVYSPCMIASTPPAPFSDVTAVTFDLINGKITPVGDDNW
NTHIYNPPIMNVLRTAAWKSGTIHVQLNVRGAGVKRADWDGQVFVYLRQSMNPESYDARTFVISQPGSAMLNFSFDIIGP
NSGFEFAESPWANQTTWYLECVATNPRQIQQFEVNMRFDPNFRVAGNILMPPFPLSTETPPLLKFRFRDIERSKRSVMVG
HTATAA
>P31630 ~~~~~~RNA2 polyprotein~~~
MSTFRYKCKQLDQEIQWWFSGTGNRAFWKFEKKLAELHEWYWSLALDPFPYSGFFYKCFYELFQLWVKLGLIVQVSYLIL
LLDFFVYTIPKKMASQIETTVEKVKQSGIPADILRKRAVDYWKKNNSHNSQMQDVLPNVDEIYEGMRANIAKYLGRSSTV
TSIAKLGKCKVYRKKNIPLANLPSLQTSCVPITLTEESVGNSDYTTEETNSEVKSLHVGAIEIVMNSFASSDCNILGGFL
LIDTCHTDINNAIRSIFVAPMRGGRPIRMISFPDTLVQIEPNMNKRFQLLCTTSNGDFMQGRDLAMMHVNVLAHAVTHTS
TYTPTPYYEKILSREKGFIVEYLNRMTYAVHNQNHPTEKDLLESDFQFDFEGQPVLKRISSTKAIFSKGSSFRYMISGKK
EHKIDKPRLEEDGSKSYIDGLQDTFDTTHATLQSGADLFKRNLDDVSTISDTMLGAMIGQTKVVIPKTLVAGTVLKSGPL
SDVMQQGSFRSTIALQRTHIITGKIHVVAMLETAVNTGLGLAICFNSGIRGKASADIYATCSQDAMIWNPSCTKVMQYAF
NPNPCSDGWSLAFLERTGYHCVVTCVTGWTGTPLQDTFMTINWHISREACVPKIYTIFDPEPDMMLNRWMGRAIFPQQST
QVVRRMPLSIGGGAGAKNSILMNLPNAILSMWRYFKADLEFELIKMSSPYINATIAFFVAFGDLSDDTVNFEAFPHKLIV
FSDKQDRTTISFSKDEFLMAWSTQVRPDTKLSEDGCPYLYAITHNGVSSSVEGDFILGIKMVGLKAVENIGVNPGIIGSR
LLGAVAQSGQTQQVWNKIWRIGTPPQATDGLFSFSIDLLGVELVTDGQEGAVSVLSSSPVANLLRTAAWKCGTLHVKVVM
TGRVTTTRANWASHTQMSLVNSDNAQHYEAQKWSVSTPHAWEKEFSIDICGPNRGFEMWRSSWSNQTTWILEFTVAGASQ
SAIFEIFYRLDNSWKSAGNVLMPPLLVGNPRLDIKGRAAAAA
>P13026 ~~~~~~RNA2 polyprotein~~~
MGFSEFFASALGTVARAKATLQGGFARFLSETVVTLQAASPEMRKFAYSKLWEEVDSVKELKPLTAQELVATLRKELWCA
QVRAQKCTLASTSRFCTCGGIPGEATPTVIKETVHVDECPNGRNLCRHGTRCLRHGGPGSFQQEREVQVDAPKCPHCAGT
GIVPASASWREIRRCWREQRKVHSLPSLPLHPDVLFEGTNAWQTRLRWLKTWRHVLGDVKPCTPEKWMQAAQIMRTCAVP
SFENPIPGQFGYERLYNGEGKEEYWLQIPATDKYTDLIINWWHAKNTPGWEEPSSSLMDFKRNRMGPCLHIVEKRVRNSY
VAPPWKPWGEDIDILSVMDSLSSQLEDFLDVFYDCAAQFDGELEFSLSNDRLSSVTGELGGVPISIGAPSKISNTPPKVN
FAELYGNLVRHNHRKISALRPILMAHPDQDEIEDQLDHLENKQGGEIVSTPSFIKMLKEKRKEVRGKEFEEGSEGRLVRS
KDLELSKKDIFLAHTLMDKFHGMSIVKKFGKSDPKLTKVCVDLTNQEEVIKYPVKELQTTSEGVLSAQTFTVLNRPQFKE
LNRLAEVGWKEAKSVCLNLHIRSYLPVHLPVYAFCVIMWGHSSNAEQASLSGAYVYLGDQEASVLQLPLLCGYIGNALED
MEAYKRSLVLSTCFFGTSGLSPGQNMFGITAVEFTEYLPTSYGGITHERDSWNQMLRNHQGVDKQRFISGFNVVDFVEAG
KEKQLHFPDFDLQPVPKHQPIVRTFGKEKQPLLNKSRSMRVKTFTSFRAGNIPIGRQIDNTAEAINFELGRASTSNAINP
RLDTSETNLRAGGEFAFIHTIDLPTAVTEGQVLAKIDIFKKIQDAKSMVCVQWMQAGYVNKNLTFISHLAPSQFCGVAIW
YIFDAYGKIPSDVTTSLELEIARSLCPHVHVLRDSKTSVWTIDFHKICGQSLNFSGRGFSKPTLWVIAASTAQLPWSAQV
TYRLEALAQGDEIAHGLATRSIVTYPISLEHLKDIEIMLPPRQMAIGNAGSINFPLSFAVQQKSSSGRIAYSYAAGLLSH
FLGIGGTIHFKIQCTSSAFVTARLRVALWGDTITLEQLSQMPHVDCDVDVVSSLKIQSPFYATANFGDSGARFWVTPMSS
PMAPETMESKLEYYIQILGIDADPPMCRQINYDQRFAWFTLLRPPDPKLSKILKLTLPSRVCNIAYKEATVTNYVNAFAI
MCATTGMHAGKCILHFSWTLNKGTSFKDLQGHISFYSGMGDSTIGEHHGEFHLGGPLSSSLAVPFEFGSFAGPVTSGGTP
FTSENWLRVETAHWDWLTSLTVDIQVLPGFRFYGRSAGPLTIPS
>P18474 ~~~~~~RNA2 polyprotein~~~
MGKFYYSNRRLACWAAGKNPHLGGSVEQWLAAINTDPSFRQTVKEDVQENREQPTAVRMFSWKVGSGPIDNPEKCDWHFV
LTGERPAPSRPVKADEVVVVPQPKKVVIPTPPPPPAPYFRAVGAFAPTRSEFVRAIVERLTRLREESRAAALFAELPLEY
PQGAPLKLSLAAKFAMLKHTTWRKWYDTSDERLLEAHPGGPCLPPPPPIQNPPSFQERVREFCRMKSCTKAFALETSLGL
NKAWVGLVDIPSTSVCCADGKTTGGQTIAQEADPLQHRISTSVAPGRAQWISERRQALRRREQANSFEGLAAQTDMTFEQ
ARNAYLGAADMIEQGLPLLPPLRSAYAPRGLWRGPSTRANYTLDFRLNGIPTGTNTLEILYNPVSEEEMEEYRDRGMSAV
VIDALEIAINPFGMPGNPTDLTVVATYGHERDMTRAFIGSASTFLGNGLARAIFFPGLQYSQEEPRRESIIRLYVASTNA
TVDTDSVLAAISVGTLRQHVGSMHYRTVASTVHQAQVQGTTLRATMMGNTVVVSPEGSLVTGTPEARVEIGGGSSIRMVG
PLQWESVEEPGQTFSIRSRSRSVRIDRNVDLPQLEAEPRLSSTVRGLAGRGVIYIPKDCQANRYLGTLNIRDMISDFKGV
QYEKWITAGLVMPTFKIVIRLPANAFTGLTWVMSFDAYNRITSRITASADPVYTLSVPHWLIHHKLGTFSCEIDYGELCG
HAMWFKSTTFESPRLHFTCLTGNNKELAADWQAVVELYAELEEATSFLGKPTLVFDPGVFNGKFQFLTCPPIFFDLTAVT
ALRSAGLTLGQVPMVGTTKVYNLNSTLVSCVLGMGGTVRGRVHICAPIFYSIVLWVVSEWNGTTMDWNELFKYPGVYVEE
DGSFEVKIRSPYHRTPARLLAGQSQRDMSSLNFYAIAGPIAPSGETAQLPIVVQIDEIVRPDLSLPSFEDDYFVWVDFSE
FTLDKEEIEIGSRFFDFTSNTCRVSMGENPFAAMIACHGLHSGVLDLKLQWSLNTEFGKSSGSVTITKLVGDKAMGLDGP
SHVFAIQKLEGTTELLVGNFAGANPNTRFSLYSRWMAIKLDQAKSIKVLRVLCKPRPGFSFYGRTSFPV
>P13561 ~~~~~~RNA2 polyprotein~~~
MLGISNLCRYYQGQRRVCANKFYQKYVNHSDLYFFDLWEISANNLWIKLAFVLLLCLFEIISGLEYLGKMAQEILKQGIP
ANVLQEKANLFKKASANNKIKDEMPNALSLYQNHSFFQKLKHLADKKNLDITSLPGGREVEYKHLDAGHLLADTNVVIDV
PLVPQLAARTPTDYNFGTSRDKSATALHVGAIEVVIQSYASSECDLMAGMMLVDTFHSRPENAIRSVYIVPIRGGMFMRA
LCFPNTLVPMDSDINNRFKVVFSLPNNDFPQGSKLGHVSINMAGCTTSLSKTYVPSPLLTEELGREAATVIQYLGRDTYA
MQTSNVPTSDEISRMVFNFHMEGKLSMHKTGSLSSILSKSKSLRYTIGGSKPKNKLADKAHNEEAETSDSKGIIDPKDGN
VFANPQTDTDLFKLSLDDTSSPKGSLLDTRFAQKKVLIPKAMAGGADLLSSNLYDVLSGSSFRASLALARTHVVEGKIRC
ICTINLPENTGCCLAITVNSSNRGQFSTDIYTTGSQDRILWNPACSKNCDFSFNPNPCGTAWSLEFLRRTKFHLSVTCVS
GWSAQPQTDIAMTMDWYVSNKPCVPCIYNVGTPGQNVWVNRWMGKLSFPQGSQNQLKQMPLAIGGGAGAKNSILMNMTNA
FLSLWRYFHGDLVFEVQKMSSPFIKSTVTFFIGFGGLPFSENLEDFPNKLIQFGEVQERVEITFTRKEFLTAWSTQVDPA
GPVAGDGCPYLCAMVHDSTASTITGDFNLGVTLLRIENFVGIGRNPGIQGARLLGSMQAEAQGGVVRTTDGVYSTCFRVR
TPLALKDSGSFTCDLIGGGITTDSNTGWNLTALNTPVANLLRTAAWKRGTIHVQVAMFGSTVKRSDWTSTVQLFLRQSMN
TSSYDARVWVISKPGAAILEFSFDVEGPNNGFEMWEANWASQTSWFLEFLISNVTQNTLFEVSMKLDSNFCVAGTTLMPP
FSVTASPDSRPLLGVKTSTPAKKYVGGSLQAGPSPD
>P36324 ~~~~~~RNA2 polyprotein~~~
MSQFWGEFPEKVIQTFQHLQVALIGDIKKCALSSPLFPELSKLDAHSQHHLLASFELPRFGGVTPGVMEQLRDAESELAE
AKQRLLRERLHAVANRQNIPYLGDCMYYDAPGISQEELLQAAFLEAPTPAWEHERIRPLWPKDEWFRDARQGPYLEDYGN
IPLGDLDTLCLAFDALVEEHWMPIYLLISTFSMFQQYGTQPLLLECAQSAGSLIPACMMTDHHLEPTGDRQADKEARQDY
ADSQDSIQSMGDFWKEFYTKDSGKKIPDSHKSRLANDPNKVGFTKSALFHKQPLSHSLAQTWANFRGTQDKADLVKVTMD
MNIEKYTVRLPDAVRTTAGPLYIEWINLPRMSENSARKLAEVGWNNADICGVDLAVKSHIAVGTPVRVIISLVDGACSDM
PTATMCAFEVNLASQNNRSLNLPLLSLPFSRLLADLHDFQNRVKIACQFRDPEGFNVGTPMLSFSSLEFSELKQTAFERN
SLLRDSWSEIEKRACHGGGRCVASQGIVQTWEKEVNPPLKEYAPLVLPPVPQPKRNFIDQQTGEVVQSFMQKSRSMRFKS
PSDLWSRPSVDGGSTSTQPPSKGSLRCENVPGCAYEVDPLHLLYYESVDVPKDTLAGTLLARIDVRAKAAIFDSAVWRQW
VRDGCLKPKIKMRITAATSCFSGIVLGACFDAYRRIPAATKTGITASLVTGLPNTVWATRDTSEVEWDIDLAAVCGHTFF
ALEDTFGYMDFLIYVLRGNEITAVADWSIYVSFHVDWTQESMLATLIPTFVWPPKPTDISLLKEVWGPYRFTLDGTEAKE
SFAIMPGTAILHGQQIVRTFPRVVAAHFRSWTGKVRMSIQEVSSIFLTGTYMVGVSWNATADLADIVTRKHWIVKSNEIF
EVDLYCPYGENPTFTGQANGKPFIIVHKLGGIVGPKDSVGTFGFMIHIHGLTGVYKNPTLHSGDRSVGSAWFRINNIADD
NLVFNIPGRIEDIIAAAGKYDVTNYVNPTSLLFSVTGLHGGIIRLHITWCPNTTLGESKGTLKYMQYLYHTATENFFGDQ
ATRGIIDQDGFTIDIACGDFFGATRVGLPGEVERLGIYSSNAKSIAEIRVSFEVLSMNFYGSTIKVT
>P36341 ~~~~~~RNA2 polyprotein~~~
MWHFCEQVYECFEGYHRDYSVQTVPVEYLASHYIVNKFRPDPLAVLWLFCLGIWWEIIQILHHLFQYKEPALFVGSCQNL
AAFLEKKYSMEVIQKEGLAASALKDKERLTEKAVVNQPLSNLIPHSNKMYERSKSLLSGLKRGLIKQKEIAFDKLMGGST
IDFQHIPTGTLTPGENKVLDIPIVPQHLLTSTNITDYHQANKKNANGATALHVGAIEVIMDCFTSPDSNICGGMLLVDTA
HLNPDNAIRSVFVAPFIGGRPIRVLLFPDTLVEIAPNMNSRFKLLCTTSNGDVAPDFNLAMVKVNVAGCAVSLTRTYTPT
AYLEQELIKEKGAIVQYLNRHTFSMHRNNQMTKEEMQKQRLSFRLESALTLQEKHPLHATFCKSTNFVYKIGGDAKEGSN
GNLTVNESQLSSHSPSAHVLHKHNNSGDNEVEFSEIGVVVPGAGRTKAYGQNELDLAQLSLDDTSSLRGTALQTKLATSR
IILSKTMVGNTVLREDLLATFLQDSNERAAIDLIRTHVIRGKIRCVASINVPENTGCALAICFNSGITGAADTDIYTTSS
QDAIVWNPACEKAVELTFNPNPCGDAWNFVFLQQTKAHFAVQCVTGWTTTPLTDLALVLTWHIDRSLCVPKTLTISSAHA
SFPINRWMGKLSFPQGPARVLKRMPLAIGGGAGTKDAILMNMPNAVISLHRYFRGDFVFEITKMSSPYIKATIAFFIAFG
DITEEMTNLESFPHKLVQFAEIQGRTTITFTQSEFLTAWSTQVLSTVNPQKDGCPHLYALLHDSATSTIEGNFVIGVKLL
DIRNYRAYGHNPGFEGARLLGISGQSTMVQQLGTYNPIWMVRTPLESTAQQNFASFTADLMESTISGDSTGNWNITVYPS
PIANLLKVAAWKKGTIRFQLICRGAAVKQSDWAASARIDLINNLSNKALPARSWYITKPRGGDIEFDLEIAGPNNGFEMA
NSSWAFQTTWYLEIAIDNPKQFTLFELNACLMEDFEVAGNTLNPPILLS
>P25247 ~~~~~~RNA2 polyprotein~~~
MSSICFAGGNHARLPSKAAYYRAISDRELDREGRFPCGCLAQYTVQAPPPAKTQEKAVGRSADLQKGNVAPLKKQRCDVV
VAVSGPPPLELVYPARVGQHRLDQPSKGPLAVPSAKQTSTAMEVVLSAEEAAITAPWLLRPCKGEAPPPPPLTQRQQFAA
LKKRLAVKGQQIIREHIRARKAAKYAAIAKAKKAAALAAVKAAQEAPRLAAQKAAISKILRDRDVAALPPPPPPSAARLA
AEAELASKAESLRRLKAFKTFSRVRPALNTSFPPPPPPPPARSSELLAAFEAAMNRSQPVQGGFSLPTRKGVYVAPTVQG
VVRAGLRAQKGFLNAVSTGIVAGARILKSKSQNWFRRSMGIAHDYVEGCMASTVLGCAGPVVQRQEACSVVAAPPIVEPV
LWVPPLSEYANDFPKLTCSTFTEWQRPRKQSIAISNLFRKLIDRALLVSGVSLIASVLLFEIAENFAVRQAVCPVEMPSC
ATSVSEKSLVSLDEGNFYLRKYLSPPPYPFGRESFYFQARPRFIGPMPSMVRAVPQIVQQPTMTEELEFEVPSSWSSPLP
LFANFKVNRGACFLQVLPQRVVLPDECMDLLSLFEDQLPEGPLPSFSWSSPLPLFANFKVNRGACFLQVLPQRVVLPDEC
MDLLSLFEDQLPEGPLPSFSWSSPLPLFASFKVNRGACFLQVLPARKVVSDEFMDVLPFLFSPLVSHQEEEPEMVPAVLE
AADSVGDITEAFFDDLECESFYDSYSDEEEAEWAEVPRCKTMSELCASLTLAGDAEGLRKSHGVFLKRLVTYLQSFEEPL
YSSRAFYSVKVKPVYRPKKFEGHIDCTCLDGNMGEWEWRESVDAMWRCPGRLLNTKRTFTRDDWERVQYLRIGFNEGRYR
RNWRVLNLEEMDLSLHEYPEISSAPVQSSLFSRVVDRGATLASSIPFVTRSNCQSSLGTPGLNVHTIHQEAPTTLRAPPF
TGARNVMGSSDAGANAAPYRSEARKRWLSRKQEDSQEDNIKRYADKHGISFEEARAVYKAPKEGVPTQRSILPDVRDAYS
ARSAGARVRSLFGGSPTTRAQRTEDFVLTSPSAGDASSFSFYFNPVSEQEMAEQERGGNTMLSLDAVEVVIDPVGMPGDD
TDLTVMVLWCQNSDDQRALIGAMSTFVGNGLARAVFYPGLKLLYANCRVRDGRVLKVIVSSTNSTLTHGLPQAQVSIGTL
RQHLGPGHDRTISGALYASQQQGFNIRATEQGGAVTFAPQGGHVEGIPSANVQMGAGEHLIQAGPMQWRLQRSQSSRFVV
SGHSRTRGSSLFTGSVDRTQQGTGAFEDPGFLPPRNSSVQGGSWQEGTEAAYLGKVTCAKDAKGGTLLHTLDIIKECKSQ
NLLRYKEWQRQGFLHGKLRLRCFIPTNIFCGHSMMCSLDAFGRYDSNVLGASFPVKLASLLPTEVISLADGPVVTWTFDI
GRLCGHGLYYSEGAYARPKIYFLVLSDNDVPAEADWQFTYQLLFEDHTFSNSFGAVPFITLPHIFNRLDIGYWRGPTEID
LTSTPAPNAYRLLFGLSTVISGNMSTLNANQALLRFFQGSNGTLHGRIKKIGTALTTCSLLLSLRHKDASLTLETAYQRP
HYILADGQGAFSLPISTPHAATSFLEDMLRLEIFAIAGPFSPKDNKAKYQFMCYFDHIELVEGVPRTIAGEQQFNWCSFR
NFKIDDWKFEWPARLPDILDDKSEVLLRQHPLSLLISSTGFFTGRAIFVFQWGLNTTAGNMKGSFSARLAFGKGVEEIEQ
TSTVQPLVGACEARIPVEFKTYTGYTTSGPPGSMEPYIYVRLTQAKLVDRLSVNVILQEGFSFYGPSVKHFKKEVGTPSA
TLGTNNPVGRPPENVDTGGPGGQYAAALQAAQQAGKNPFGRG
>P10941 ~~~~~~Polyprotein p69~~~
MAQLRKPSQSLVLSESVDPTTVDPFVSVRTEEVVPAGCITLWEYRDSCGDVPGPLSHGDLRRLRTPDGVCKCQVHFELPT
VLKSGSTGTVPEHPAVLAAFIGRPRRCSLEQRTKELDSRFLQLVHGGLPARPSYMIARPPRPVRGLCSSRNGSLAQFGQG
YCYLSAIVDSARWRVARTTGWCVRVADYLRLLQWVGRRSFGSFQIEKSAVDHVYHVVVDAEYQSEQDGALFYQAILGLAE
KDPLARIGGRLNPLAAEFAPGSALRVEPVTPQVTRRKGSTRMTGRDPTIVSVGKVGMAITSIQDALVATELRNVNFGRRD
TEAECRRLWARYEVNDYFRRHKAELLKFDARLRSRMAKKPASSRARPSDAKIQCIGWRDRHLLPQRLAGLSKQGRSLVWS
RFATSNIRRKTPPCVVNPSADPVVHNWKDSAALAVKKIAEARRRQEIRAAAYAERAKARGQTNVVASISEAIETTLRRNK
TRFALDGLHLAASAIVTTRLRSWNQEEIRAGREFRKSTTSWIWRHVPSSIQDALNLTSVRDKLDPGRAFGYVQAAVAQGM
SDFRRAKRALAIVAKPVIRNIRDPYEHGFVKRDGKLRHSRDAFNKKLRTKAVAATKVHKIKF
>Q04350 ~~~~~~ORFB polyprotein~~~
MYKEAERPIEVWRTQVMDGPTWTALSESCRDRLFFASGEGGEHMTLDIIQPDSYTKIRLFRSGRFEVSVDGKSFGQGGNR
YRFVFRYDSLLSTPFGYPAEDKEIALQDYNHKQLLGEMFLKLPDSYVDGRPIAEAFFRYVDDLKWDVGVFRDRRSLTELH
LPASSGLTTAQVSVAKLEWPPLPIIQAQPTILAGIIDNFKICFPVNGKWIYGQGLSWTRYDGDASVPTSLLSNRQHARFW
NEKDIPTGLKLSKEGFIKLWAQKSRKWQDHMARSIGLSHEAAVELVRATRVNEAKPHLVPMEEAKEAPRQQLVPRRSTFV
DNHEEEVEIDTLRVPVEEGRCFELLFNNQVTPAIFDKKPLLKDVLGVFEENVCTMDSLEISHSDQCVHIVAGETFRNYDE
IKAVLEVILENEPDILVGAEEGSVADYVKAGKHFLFENHQWVRNGLKLAKGLAEPGQRAKDNTNPSTPRPIEDADYIHPF
DNGQPLPGRSDQWVSGFEVTRLRHHDEMPHIRSVRNTGIHGLPGDFLSNYPRLPTPVFHRLRDLWDDVIGILMKLEFGDN
CSPVLNVTANADWVRSETTINFISDQPGKAQSRPREDGGFDILVPCRGIATRSIRLLPLFIRLPNRFRAVALLNGRQSDY
DNYGWPVFNPVIPLPQMDSFYVEAVAAGRSMYPPGFLLGRYDALEYLVHTATVYGAEEAFLLPFTHHVRVYPPPRPGREI
PFGSWCKNYKFEAERFWYDADWKLRVHETNHDFDRLIEITKTCRRNPPEENLQAKLEDTARKVCSVWQYNIMIASSVAFL
VPLYFTLYVPYLQFYLHVDPGDYILLPPVLWLVWTNLCYGYACDAWCRLFFFVEEAGKKELVHSSEEFSSDPSSTLLIPT
MGTRGDHVPPRFFANMAVLAGVKTHLLKLQTATYGDLENLKKGKLGSLLPGYLQNHYSVLRGYKAAFTPHVELDMPNATS
YNLAPPRSYINKIRYLTDENRSGASMIDRAVTWFAEELADTFWPDWQIGCLRGCNLPRSADGVSLITKQPNLKTGKIGWL
HGSADPAVVPKDIRDKYPLVPNGDHNEIFRHYDKIYMPGGAGAVQTAIACGCEVVVTDVNLDRDYHTMPTQKDFHQPSIL
PYFAWLWRQGFDVKLPRVLLVIGWLKFHYSIRYKHLEFAADFVIRAGLFWWYGCLHLLPFMAAAIMAPRFVKKYLVGMAW
LTEPGLLMLKALWRFPIFMVTPRWMLPFIVTVSVYNWWWPLSQDGLNYASKRFELIFEPVTRGKHTFSYPFGHWCLRDTN
SMIVYEGKFVNPSETSIGSPFKLSKSVRPVRPGAVFHLVPFHVQKLLDSMDEAPLPYSANHNCTTVILKGIMYRSALGFV
FAYMVSWAVYLVLRPPQAAATVYHWVYPERSWDTSRLYHLLLGFAAGGTVPMEVIDEEHVEEKPSVAGQSEPAAEIDNDK
ISDYDQEWWGSQDSIDTVVNDLCYLLSFLKDTAIPEEVKLDVVELAYTQLVQDEKERIPEPKGTKILDMPNWKPGNWAKL
IDETHRVLSQFTQYTPRVLNELVVWLKGLGENLYRVAEPILMLLVRAMRAAKSVSDRATRSVYHCLCHWLDVMYGGSAPT
RVKTVWGLTGLVASGMTSQKAILAQNIAMMEYQGRGNFLDDYDNFVSNIKEPGKGLPGINTIGGPQRRPIRYKNPVMSHQ
AAEICGLKPGEYEVDDRYQERINDYLAEGIPQAVDGVLFGDRNPDRIARSISRYEPEYSGCSPEDKALVEDTARAMFEQW
PEVFADRDIMLPKGVELYIKEKYSAGTPFISSFYKSRKALKQAGVMDVIRKNALECISTGKYPTQFYHAFAKSQAVPGQP
LLAPRMKDLRTVVSEDLSAYMVDQIFQIEANKRITWETYGAGSGMPLSQSMARIWDELHDLRKREGGQFIIADATAYDSN
CKPALFHGAGKLVELGFQNHPSGKGRQFAQVVQCKFEAMQNAWVMGITEPSYTALTFHVPDVAVRHELESKYPAHFATFS
ELLAHNNVNVTEWKRLSWEERKACARDMQAVPGKVFLTNDPALRLQGSSWQGSFTTEPKRDEFRKYQTYFYDSKAAMRED
IKRIVFANREVISNVHHKNRGGGTGQSATSWDNTATFKLGVISAWARATGKPPKDFFCSNRLYNTSDDTVWWSKDLLSSA
EVDRFKQAAADFGILLEIGSTKKITEVEYLSKLPRRPTAEDSADYRAWRQGRIENMRSSGRFSEEQLLSIEREQLPQFLM
VQNPTAILMRRTAFRYYQSSPSKFLYTSCERGAGHALVTAFQPALYKRFAIEYAEDLNRLCKEHHINQRYELVSQQDRMK
MQVINVNPNWKRNFKLSPRQEAFLRWIRQAKFPSYRQVLDIHLRIRDPDPSAHDRFIAKLDRAWRNPDEGIRDIVDGVYR
YTDMIPEEFKRFMPSTDMLYAENPWHTHNQYVEKFIYLKLLETTTVDELTFAQFDAVAKESPYGICMNTIKFWEDLRDPD
YLKDLLASEAMIDKVRIYQGMTVIISAMYFAMHWVELFIQSLFLIGPLYNLFMWSFWGLSKVYGLANTFYWHGKARSSRE
ISSILPRDPYMWSKRFVSTMADFIPERFALGIVPVTLVLDGLAEIIEVLFGRMWRLFANLKSVGTDFSDARSGKSLNVPS
NPWAAYAHTYATKAIEHGHVTVAAKTASGKSTFFPAAVWAERRNIGIKKLWIVMPRKILRDNWEIPFDIRSQIVKRGKTL
DPSADIYVTTYGHFRTRIGGLVPRDNLVFFDEFHEMDGFMLQDVEDWKGPTIFMSATPVALHGMAGIPFLEPTLPKRFNL
TVYKVDSDDVLEMWNRARNQFADQPELLARPMIIVPTYNELKKTIAGLENLDRSITWHEVSSNSPLVPKTGGLVCTPYVQ
TGIDIKPAPSILIDSGRDVIVHKGRLVTPHPYTDEKTNEQRVNRVGRTMDGVVIQPQLAGTGNPPVKYPSGIFFSSELVA
GQYKVPRLTKVNGCVHPELPYMSIKYTSELSDPAKAREEEQSVTKSLLFIHLMALAGVRQSEWALRYNRYFELHLPFGED
EDHLERILTSGKLRYANHIPVDMAMQLLGNGHVTWGIGGVPTITRPRYPCDGMWVEDPSSRKSYAHKVLLHQREHAEIGM
WQAQVNELRAQNLALQSQLRSACTRRSTAGRILRHTRPPDIPVCG
>C1JCT2 ~~~~~~Polyprotein-FSD~~~
MSEKTQTFVQNETHVLDMTSDFKSDLSLEKVTSSVEQTDDLVSKIINNNDLDIKDLSFLRNLLLSTLQYLGIAKFVAINI
TLSILSILMLLINSCAKFTRIVNLSSHILNIITTLGLYFQVSSMEIEEITQTFENEFGTYDDDKILSHYIKICNLPNRKD
VYEYISLNDLKYKIKLPDISFYELKNDILSKNKNLHLWIFQKFTDEFLAMWFGVQPYRISNLREMLVISRQGFIPKDLFN
EIRKLCNMGVSVIISFIQSKLFDEPFKKRDCTQALKDASVISSPFDTLWNLISKQVCDNSAEERFTQTILDFTSEFDNFL
GIPNYKFAKNQKLVNTISKSLDACAKFIRDCPKDKQTEIFPLQGLHTATVKRRNEILTNVMPKFARQEPFVVLFQGPGGI
GKTHLVQQLATKCVNSFYQDHEDDYIEISPDDKYWPPLSGQRVAFFDEAGNLNDLTEDLLFRNIKSICSPAYFNCAAADI
EHKISPCPFELVFATVNTDLDTLQSKISSTFGQASVFPIWRRCIVVECSWNEKELGPFNYKNPSGHRSDYSHITMNYMSY
DDKTQKLALEKEINFDTLFDMIRLRFRKKQQEHDTKISILNNEIQRQSNSKQHFSVCLYGEPGQGKTYNLNKLITTFANA
TNLKIGSEEKPSIHIFDDYIKDENDENCSKFMDIYNNKLPNNSVIFSATNVYPKTHFFPTFFLTNLIYAFIQPFKQVGLY
RRLGFDGYTDIPNSSVNAPIFVQNFKFYERKQHICYFLSLEFLKNIICYIFFFLYFPLKFIKKIDLIEIKDVNKYVYDRY
INFLSLSKQIEIVEYPPNLENVEFDFRFNMNKFHRVSFNNPFELDKYIHFNKNSYENLLHFDWKMYLSPRVKHRLALSYE
KFFITISEVNKEIIIEELKRYVLLFKQFNIDPNMEINLGEYGSFYYINGKIHLMTINIESNVSEIPVFTDGDYVYISEHK
IPVIDLFDNININSKYNLSFDQSIALNSFKTGDSFYSNAKVRKSLSKFVLLNYQTKFKLYLKEAKDKVKNFIETPIGHLL
SILLTIFVICYASFKIYSKFSNFFSKDQAIEDQRKGEKKIKKITNYDSDGVQPQRKGEKKIKKVTNYDSDGVQPQSNVKV
EEEIKLVFDPTGQKLLFGNDFTSELETLVELEKDDEEFTKSKIDNKSMAGLRREVRRRRYARSKKAQIEKQEVLTLPDVN
GFEGGKPYFQIAEEKARKNLCQIYMIANNENCIASKFSDHIVCYGLFVFKKRLASVGHIVEALKCAPGYNLYAGCDQFNG
KLYKMNLVRNYRKRELSVWDVDCPNDFVDLTSFFIPKEELYDAENCNTVLGRFGMNKREVYLYGNCEFIQEFFKVDNKGA
QEFGYIDWATVDITLTTGGDCGLPYYICERKKFHNKIMGLHFAGNNVNHKTIGMSALIYKEDLVVWKGAERQSKCKFCDV
KDIIIAQPDIPKEKYKGYNHEIVWNSLHESSPTTLNEELEHYLNIFPKFTGTIIKHSGDKFYGSVKHSHTQFISKFKTEL
TVTNGWKLSTAGDCQFESNHISPNTEVMYRVVDVQFNSIFKAFKSQPYIKNFRLIANVYEKDGKQRVTILTIIPVSDFNV
KQQTVRQALVPLHLNEDEEVYVTEDVSDIFKTAIKRKQRGILPDVPYETVENETVEILGITHRNMTPEPAQMYKPTPFYK
LALKFNLDHKLPVNFNMKDCPQEQKDMMVLDRLGQPNPRITQSLKWAHKDYSPDYELRKYVKEQYMCNIMEYYAGCNLLT
EEQILKGYGPNHRLYGALGGMEIDSSIGWTMKELYRVTKKSDVINLDSNGNYSFLNNEAAQYTQELLKISMEQAHNGQRY
YTAFNELMKMEKLKPSKNFIPRTFTAQDLNGVLMERWILGEFTARALAWDENCAVGCNPYATFHKFATKFFKFKNFFSCD
YKNFDRTIPKCVFEDFRDMLIQANPHMKNEIYACFQTIIDRIQVSGNSILLVHGGMPSGCVPTAPLNSKVNDIMIYTAYV
NILRRADRGDITSYRYYRDLVCRLFYGDDVIIAVDDSIADIFNCQTLSEEMKILFGMNMTDGSKSDIIPKFETIETLSFI
SRFFRPLKHQENFIVGALKKISIQTHFYYATDDTPEHFGQVFKTIQEEAALWEEEYFNKIQSYIQEIIRKFPEISKFFNF
ESYKSIQKRYIMNGWNEFVKLEKLDLNLNKKKSSKVTGIHSKQYSKFLKFLSRIENEKAALEGNFNKESVNTWYFKMSKA
MHLNEIFQKGLISKPLAEFYFNEGQKMWDCNITFRRSKDDLPFTFSGSGTTKACAREQAAEEALVLFSQEDEIVRQINDI
QSDCKFCKKMIRYKKLLSGVSIQRQMNVSKITENHVPSAGMMATDPSVAPDSGIATNTQTPSISRVLNPIARALDNPAGT
GAPFDKHTYVYNVFTRWPEMSTVVNKSLAAGAEVFKISLDPNKLPKRILQYIQFHKTIIPQIEVQILIGGAAGTVGWLKV
GWVPDASTAKKYSLDDLQLVASETINLNSTITMSMIINDSRRNGMFRLTKSDPEPWPGIVCLVEHPITNVQRNDDVNYPV
IVSVRLGPDCQLMQPYNDLNLSGGTDPDPDPEPDPDPEPGPDPEPGVDELDLSKYIPNQLIDLLICNSYVPNNVSVDFLS
TYPNLNFSIHNITDVVVSSKPYTLALFETESQINSASVWRGDLTQLSVFIQYKFYTRVEAYNKVTTVHTDKWTPNFDGTV
YKPVDVKIEHAYGTYELTTMWLTSYGLVMEWSLDESRVFYGTYKTDSNGRRWLIDGNTPIARSDHCFIVSSPDLLSDDKA
YYNNPIGAKQGGKLVDGAQIYRIFKTESGGYRSDPFVPETYWPSETPYNADWSGVKMPYQIRKVIQTGNNLAGKHLDGDL
KMCAMIRQGSSSTQSTDNYFYPIYVHNFSALLKQMNLILKERKTKYIKFDLQVGGKPFAQMGFGDGAFIGRTTMFRQIRA
AITNVILLKNIVGVDDLSGLQALPTSGFADWVVKAQSTNSKFLNDFYNDKISIERQASLGIAAAIGAGQGLFGGLSAQWQ
WQQQADWSRQMQRERLDMMEKLANINNQARLNQLTQSGAQQRITQQAAYQQQMNALGAGSVSAQNGMYTPSNYTPLPSYK
SNTTNYYNNSVYHTDNNITNNPSNTSLTNNINNFNPELFQQQRERMPTPSEAYDNSKGFVPQPGTSKSIATENINPNYKD
EEHIYEPIEQQNHEYADIDYNAMNISRENKNSSNFGNVGILDHQYADIDYDAMKIARDQQNSSKFGNVGVLNHQYAELDF
SKNNTRKNSQILDNSLYSKTQPSSKMIDNSLYGINPNKMVENQNYEPASMERKNSIYSSNLNSSNNLKFNNIPNFKGPTN
LNISGAKPAGFGSGIIQPAINKYTDFSKPN
>Q9YLS4 ~~~~~~Genome polyprotein~~~
MSKLFSTVGKTVDEVLSVLNDENTESYAGPDRTAVVGGGFLTTVDQSSVSTATMGSLQDVQYRTAVDIPGSRVTQGERFF
LIDQREWNSTQSEWQLLGKIDIVKELLDQSYAVDGLLKYHSYARFGLDVIVQINPTSFQAGGLIAALVPYDQVDIESIAA
MTTYCHGKLNCNINNVVRMKVPYIYSRGCYNLRNSAYSIWMLVIRVWSQLQLGSGTSTQITVTTLARFVDLELHGLSPLV
AQMMRNEFRLSSSSNIVNLANYEDARAKVSLALGQEEFSRDSSSTGGELLHHFSQWTSIPCLAFTFTFPGTVGPGTQIWS
TTVDPFSCNLRASSTVHPTNLSSIAGMFCFWRGDIVFEFQVFCTKYHSGRLMFVYVPGDENTKISTLTAKQASSGLTAVF
DINGVNSTLVFRCPFISDTPYRVNPTTHKSLWPYATGKLVCYVYNRLNAPASVSPSVSINVYKSAVDLELYAPVYGVSPT
NTSVFAQGKEDEGGFSSVPEVEQHVVEDKEPQGPLHVTPFGAVKAMEDPQLARKTPGTFPELAPGKPRHTVDHMDLYKFM
GRAHYLWGHKFTKTDMQYTFQIPLSPIKEGFVTGTLRWFLSLFQLYRGSLDITMTFAGKTNVDGIVYFVPEGVAIETERE
EQTPLLTLNYKTSVGAIRFNTGQTTNVQFRIPFYTPLEHIATHSKNAMDSVLGAITTQITNYSAQDEYLQVTYYISFNED
SQFSVPRAVPVVSSFTDTSSKTVMNTYWLDDDELVEESSHSSFDEIEEAQCSKCKMDLGDIVSCSGEKAKHFGVYVGDGV
VHVDPEGNATNWFMKRKATVKKSKNLDKWCFALSPRIDRTLICETANLMVGREVEYDIFVKNCETYARGIASGDYGTKEG
EKWKTLLSAVGVAAMTTTMMAMRHELLDTSLTKLPQKVGEVTNEVRKILEDTSAGVREFKEKVSSILRKTWPGKTSIKIM
KWTCRIVKMCVGVGLCYAHGWDSKTVTAVVTMFSMDFLDLVIDGIEIGRMIIDELTTPKAQGLSEINQVLSIAKNAKDVI
KMLIEIFCKVIERITGEHGKKIQWAQDKKEEIMNVLERAEKWITTSDDHSEGIECLKLVRSIQSVIRGEESLKELAGELR
AVGTHVLNKLGRLDKPNAPILVRAEPTVLYLYGNRGGGKSLASMAIAVKLCKELGISHVEGIYTKPIMSDFWDGYAGQPV
VIMDDLGQSTSDEDWTNFCQLVSSCPLRLNMANLEKKGTQFNSPFIIASSNLSHPCPKTVYCTDAIARRLHIKVKVSPKE
EFSTHAMLDVAKAKKAGAYCNLDCLDFQKISDLASTPVSVQDIVLEMLHTNVDKQTLMGDIIQYWAQSNPREVFDTMAEG
KNSGKYLWLFEKIKTSKWYILGCVGAVLSVSVLGVFAYHMIKNHFRDQQHDQSAYSAAIKPLRVVRLEQSDAQSVVDISN
VVHGNLVRVGVGPNEARIHWLYNGLGVYDTYILMPYHGIKDADVDDDLYIERAGTIYSTNMKMVQVLFLESREGDLVLIN
VPRLPKFRDIRNHFSTEENIRRAEGMPGTLCTLDHERFTLVTESDLKMVEAATYVCEDDKGVRTDISVGRSWKAKACTVA
GMCGGALVTSNNKMQNAIVGIHVAGGAHAISRVITKEMIEEMLKTRAQCSRIWKTEFVEEKISVGSKTKYHKSPLYDFCP
QEVIKCPTKLFYQGEIDVMQVMLAKYSSPIVSEPLGYATVVEAYTNRMVSFFSEPRQLTYDECINGIEGLDAIDLKTSAG
FPYNTLGLRKSDLIINGKMAQRLQQDVEKMEEDLHMNRSIQVVFTTCAKDELRPLSKVMLGKTRAIEACPVSFTILFRRY
LGYALAQIQSHPGFHTGIAVGVDPDQDWHCMWYSIVTQCDLVVGLDFSNYDASLSPFMIYHAGRVLGQICGLDPRLVDRI
MEPIVNSVHQLGSMRYYVHGSMPSGTPATSVLNSIINVVNICHVLCALEKISVFEVFKLFKILTYGDDVLLCIKKEYLDQ
KSFPLSSFVQGLEELGLSPTGADKMEVKVTPVHKMSFLKRTFYVDEWSICHPRISEETVYSMLAWKSDNASMKDLIETSI
WFMFHHGPRKYVRFCTWLRGVLCRVGIGLYIPTYKELEVRYDRLVKYRFIDDSF
>O91464 ~~~~~~Genome polyprotein~~~
MAATRVSRSVLAAVAHSAAHRTYHTVLDCYDRLYLNTNPHLSYPLPKNSSFPCPFCQYDEQNEVLSPESLRGEGAEPCWK
CSQDKPRRKYNTTPPEDWLYDSDVQSWFYPETYYSDLQQKFFDKLALLSLPGAYQAKTPEERALAGALTQLLNFPSTPPL
TLPTTNLQRQGNSVTNIYGNGNNVTTDVGANGWAPTVSTGLGDGPVSASADSLPGRSGGASSEKTHTVSGSSNKVGSRFS
KWWEPAAARASESATDSAIEGIDAAGKAASKAITRKLDRPAAPSSTANPQPSLIALNPSATQSGNASILTGSTAPSLLAY
PTATPVPLPNPDEPSQPGPSGDRTWLLDTVTWSQEFTRGWNIAGSNGMQWTGLESLIFPVSTDTNWTSTSSPTAYPLPFS
FVRAYPDSSWAAMYNTHSMWNCGWRVQVTVNGSQFHAGALILYMVPEATTHAIQTARDNAGFVFPYVILNLYESNTATIE
VPYISPTPNTSSGLHAPWTFYLQVLSPLNPPPSLPTSLSCSIYVTPVDSSFHGLRYLAPQHWKTRAVPGAGTFGSAVAGQ
ELPLCGVRAYYPPNAYIPAQVRDWLEFAHRPGLMATVPWTMADEPAERLGIFPVSPSAIAGTGAPISYVISLFSQWRGEL
AAHLLFTGSAQHYGRLVVCYTPAAPQPPSTMQEAMRGTYTVWDVNAASTLEFTIPFISNSYWKTVDVNNPDALLSTTGYV
SIWVQNPLVGPHTAPASALVQAFISAGESFNVRLMQNPALTSQTLTEDLDAPQDTGNIENGAADNSPQPRTTFDYTGNPL
PPDTKLENFFSFYRLLPMGGSGAPSLSFPADEGTIIPLNPINWLKGADVSGIAAMLSCFTYIAADLRITLRFSNPNDNPA
TMLVAFAPPGATIPLKPTRQMLSNFYMAEVPVSAATSTMVSFSIPYTSPLSAIPTSYFGWEDWSGTNFGQLSSGSWGNLM
LIPSLSVDSAIPFDFQLSCWVAFGNFKAWVPRPPPPLPPLPTPAANAERTVAVIKQGAASATPDVDPDDRVYIVRAQRPT
YVHWAIRKVAPDGSAKQISLSRSGIQALVALEPPEGEPYLEILPSHWTLAELQLGNKWEYSATNNCTHFVSSITGESLPN
TGFSLALGIGALTAIAASAAVAVKALPGIRRQGLLTLSADTETNQTLNKITESVNQAAQVVSQFDLSGPANSVSLAASDI
REAAHKVASSLNGFTDVIADIKDSLFTRVSDAVESGVATFLTWLVKLFGYLLVLFGSPTPMSISGLLVIICADLAPHARE
FFTASGNVLSSLYYWIASKLGLSVTPQECERATLEPQGLKDFNDGALAMRNVEWIGETAWKWAHRLLDWIRGKAKTDPQA
KLADVHDEIMLHYSDSILALGSEKLPIDHITKSISRCRELVSIAQEAKSGPHSSFLNQAIKNYTLAISQHRKCQTGPRPE
PVVVYLYGPPGTGKSLLASLLAQTLSQRLAGTPDDVYSPSSASCEYFDGYTGQTVHFIDDIGQDPEGRDWANFPNLVSSA
PFIVPMASLEEKGTHYTSKVIVVTSNFHEPNERAARSMGALRRRVHLRINVTSNGVPFDPTNALNPIPGTQSKYFTAQTP
LTLFQSNTVRLDRDSIWTPTFTNMDELVDAIVTRLDRSTGVSNSLASLIRRQGNRVIDAEPREIPLEYADDLLEAMAHHR
PVPCSLGLSQAIANNTPIQQISETFWKYRKPIFTCTTFLAVLGFLCSVIPLARSLWKSKQDTPQEPQAAYSAISHQKPKP
KSQKPVPTRHIQRQGISPAVPGISNNVVHVESGNGLNKNVMSGFYIFSRFLLVPTHLREPHHTTLTVGADTYDWATLQTQ
EFGEITIVHTPTSRQYKDMRRFIGAHPHPTGLLVSQFKAAPLYVRISDNRILDLDFPGVVVCKQAYGYRAATFEGLCGSP
LVTDDPSGVKILGLHVAGVAGTSGFSAPIHPILGQITQFATTQQSLIVPTAEVRPGVNVNRMSRLHPSPAYGAFPVKKQP
APLKRNDKRLQEGVDLDTQLFLKHGKGDVTEPWPGLEAAADLYFSTFPTSLPVLTQEQAIHGTPNMEGLDMGQAAGYPWN
TLGRSRRSLFDEVEPGVFVPKPELQAEINQTLEDPDYVYSTFLKDELRPTAKVEQGLTRIVEAAPIHAIVAGRMLLGGLI
DYMQGRPGEHGSAVGCNPDVHWTSFFYAFSEFSQVYDLDYKCFDATLPSAVFTLVADHLTRITGDPRVGRYIHSIRHSHH
IYGNRMYDMIGGNPSGCVATSILNTIINNICVLSALIQHPDFSPSRFHILAYGDDVIYATEPPIHPSFLREFYQKHTPLV
VTPANKGQDFPPTSTIYEVTFLKRWFVPDDVRPIYIHPVMDPDTYEQSVMWLRDGDFQDVVTSLCHLAFHSGPKTYAAWC
MKVREQCLKSGFAPNFLPYSYLQLRWLNLLAA
>Q91B85 ~~~~~~Genome polyprotein~~~
MAKGAVLKGKGGGPPRRVPKETAKKTRQGPGRLPNGLVLMRMMGVLWHMIAGTARSPILKRFWATVPVRQAIAALRKIRK
TVGLLLDSLNRRRGKRRSTTGLLTSILLACLATLVISATIRRERTGDMVIRAEGKDAATQVEVVNGTCIILATDMGSWCD
DSIMYECVTIDSGEEPVDVDCFCRGVERVSLEYGRCGKPVGGRSRRSVSIPVHAHSDLTGRGHKWLRGDSVKTHLTRVEG
WVWKNKLLTMAFCAVVWMVTDSLPTRFIVITVALCLAPTYATRCTHLQNRDFVSGIQGTTRVSLVLELGGCVTLTAEGKP
SVDVWLDDIHQENPAKTREYCLHAKLASSKVVARCPAMGPATLPEEHQASTVCRRDQSDRGWGNHCGLFGKGSIVACAKF
ACEAKKKATGYVYDVNKITYVVKVEPHTGDYLAANESHSNRKTASFTTQSEKTILTLGDYGDISLTCRVTSGVDPAQTVV
LELDKTAEHLPKAWQVHRDWFEDLSLPWRHEGAHEWNHADRLVEFGEPHAVKMDIFNLGDQTGILLKSLAGVPVANIEGS
KYHLQSGHVTCDVGLEKLKMKGMTYTVCEGSKFAWKRPPTDSGHDTVVMEVTYTGSKPCRIPVRAVAHGEPNVNVASLIT
PNPSMETTGGGFVELQLPPGDNIIYVGELSHQWFQKGSTIGRVLEKTRRGIERLTVVGEHAWDFGSVGGVLSSVGKALHT
AFGAAFNTIFGGVGFLPRILLGVALAWLGLNSRNPTLSVGFLITGGLVLTMTLGVGADMGCAIDANRMELRCGEGLVVWR
EVTDWYDGYAFHPESPPVLAASLKEAYEEGVCGIVPQNRLEMAMWRRVEAVLNLALAESDANLTVVVDRRDPSDYRGGKV
GILKRSGKEMKTSWKGWSQSFVWSVPESPRRFMVGIEGTGECPLDKRRTGVFTVAEFGMGMRTKIFLDLRETSSSDCDTG
VMGAAVKSGHAVHTDQSLWMKSHRNATGVFISELIVTDLRNCTWPASHTLDNAGVVDSKLFLPVSLAGPRSHYNHIPGYA
EQVRGPWNQTPLRVVREPCPGTTVKIDQNCDKRGSSLRSTTESGKAIPEWCCRTCELPPVTFRSGTDCWYAMEIRPVHQQ
GGLVRSMVLADNGAMLSEGGVPGIVAVFVVLELVIRRRPTTGTSVVWCGVVVLGLVVTGLVTIEGLCRYVVAVGILMSME
LGPEIVALVLLQAVFDMRTGLLVAFAVKRAYTTREAVVTYFLLLVLELGFPEASLSNIWKWADSLAMGTLILQACSQEGR
ARVGYLLAAMMTQKDMAIIHTGLTIFLSAATAMAVWSMIKGQRDQKGLSWATPLVGLFGGEGVGLRLLAFRRLAERRNRR
SFSEPLTVVGVMLTVASGMVRHTSQEALCALVAGAFLLLMMVLGTRKMQLIAEWCGEVEWNPDLVNEGGEVNLKVRQDAM
GNLHLTEVEKEERAMALWLLAGLVASAFHWAGILIVLAIWTFFEMLSSGRRSELVFSGQGTRTERNRPFEIKDGAYRIYS
PGLLWGHRQIGVGYGAKGVLHTMWHVTRGAALVVEEAISGPYWADVREDVVCYGGAWSLESRWRGETVQVHAFPPGRPQE
THQCQPGELILENGRKLGAVPIDLSKGTSGSPIINAQGEVVGLYGNGLKTNEAYVSSIAQGEAEKSRPELPLSVQGTGWM
SKGQITVLDMHPGSGKTHRVLPELVRQCANRGMRTLVLAPTRVVLKEMEKALAGKKVRFHSPAVEGQSTAGAVVDVMCHA
TYVHRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYSLAKENRCALVLMTATPPGRGDPFPESNGAIMSEERAIPDGE
WREGFDWITEYEGRTAWFVPSISKGGAIARTLRQRGKSVICLNSKTFEKDYLRVREEKPDFVVTTDISEMGANLDVSRVI
DGRTNIKPEEVDGKVEMTGTRKITTASAAQRRGRVGRTSGRTDEYIYSGQCDDDDTSLVQWKEAQILLDNITTLRGPVAT
FYGPEQMKMPEVAGHYRLNEEKRKHFRHLMTQCDFTPWLAWHVATNTSNVLDRSWTWQGPEGNAIDGADGDLVRFKTPGG
SERVLQPVWKDCRMFREGRDVKDFILYASGRRSVGDVLGGLAGVPGLLRHRCASALDVVYTLLNENPGSRAMRMAERDAP
EAFLTIVEVAVLGVATLGILWCFVARTSVSRMFLGTVVLFAALLLLWIGGVDYGYMAGIALIFYIFLTVLQPEPGKQRSS
DDNRLAYFLLGLLSLAGLVTANEMGMLDKTKADLAGLMWHGEQRHPAWEEWTNVDIQPARSWGTYVLIVSLFTPYMLHQL
QTKIQQLVNSSVASGAQAMRDLGGGTPFFGVAGHVIALGVTSLVGATPLSLGLGVALAAFHLAIVASGLEAELTQRAHRV
FFSAMVKNPMVDGDVINPFPDGEPKPVLYERRMSLILAIALCMVSVVLNRTAASMTEAGAVGLAALGQLVHPETETLWTM
PMACGMAGLVRGSFWGLLPMGHRLWLKTTGTRRGGADGETLGDIWKRRLNGCSREEFFQYRRSGVMETERDRARELLKRG
ETNMGLAVSRGTAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPAVMGVKAYTIGGKGHEVPRLITSLGWNLIKFR
TGMDVYSLEAHRADTILCDIGESNPDPLVEGERSRRVILLMEKWKLRNPDASCVFKVLAPYRPEVLEALHRFQLQWGGGL
VRVPFSRNSTHEMYFSTAVSGNIVNSVNIQSRKLLARFGDQRGPAKVPEVDLGTGTRCVVLAEDKVREADVAERITALKT
QYGDSWHVDKEHPYRTWQYWGSYKTEATGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKEKVDTKAQEP
QVGTKIIMRAVNDWILERLAGKKTPRLCTREEFIAKVRSNAALGAWSDEQNRWSNAREAVEDPEFWRLVDEERERHLRGR
CAQCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRYLEFEALGFLNEDHWASRDLSGAGVEGISLNYLGWHLKRLSELE
GGLFYADDTAGWDTRITNADLEDEEQILRYLRGEHRTLAKTILEKAYHAKVVKVARPSSSGGCVMDIITRRDQRGSGQVV
TYALNTLTNIKVQLIRMMEGEGVIGPSDSQDPRLLRVEAWLKEYGEERLTRMLVSGDDCVVRPIDDRFGKALYFLNDMAK
VRKDIGEWEPSEGYSSWEEVPFCSHHFHELTMKDGRVIIVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLSY
FHRRDLRTLGLAICSAVPIDWVPQGRTTWSIHASGAWMTTEDMLEVWNRVWILDNPFMSDKGKVKEWRDIPYLPKSQDGL
CSSLVGRRERAEWAKNIWGSVEKVRRMIGPERYADYLSCMDRHELHWDLKLESNII
>Q65399 ~~~~~~Genome polyprotein~~~
MATIMFGSIAAEIPVIKEAIMIAMPKSKHTLHVVQVEAKHMATEIRSERGKLYVAKRFADNAIKAYDSQLKAFDELLKKN
SDLQKRLFIGQNSPIKQKKGGACFVRSLSFKQAEERHAKYLKLQEEEHQFLSGAYGDKAYVGSVQGTLDRKVAEKVSFKS
PYYKRTCKAVRQVKVLKKAVGSGKVLDQVLEIVAETGVPVTFVGKGANKTLRAQYVRRYGLVIPKIFLCHESGRKVHREM
SYWHHKETLQYLCKHGKYGALNENALCKGDSGLLFDQRTAFVKRVTYLPHFIVRGRQEGQLVCATEYLDNVYTIEHYTHK
PEEQFFKGWKQVFDKMAPHTFEHDCTIDYNNEQCGELAATICQTLFPVRKLSCNKCRHRIKDLSWEEFKQFILAHLGCCA
KLWEEQKNLPGLEKIHSFVVQATSENMIFETSMEIVRLTQNYTSTHMLQIQDINKALMKGSSATQEDLKKASEQLLAMTR
WWKNHMTLTNEDALKTFRNKRSSKALINPSLLCDNQLDRNGNFVWGERGRHSKRFFENFFEEVVPSEGYKKYVIRNNPNG
FRKLAIDSLIVPMDLARARIALQGESIKREDLTLACVSKQDGNFVYPCCCVTQDDGRPFYSELKSPTKRHLVVGTSGDPK
YIDLPATDSDRMYIAKEGYCYLNIFLAMLVNVNEDEAKDFTKMVRDVVVPKLGTWPSMMDVATAVYIMTVFHPETRSAEL
PRILVDHASQTMHVIDSFGSLSVGYHVLKAGTVNQLIQFASNDLEGEMKHYRVGGDAEQRMRCERALISSIFKPKKMMQI
LENDPYTLVLGLVSPTVLIHMFRMKHFEKGVELWINKDQSVVKIFLLLEHLTRKIAMNDVLLEQLEMISQQAGRLHEIIC
DCPKNIHSYRAVKDFLEVKMEAALTNKELANNGFFDINESLGHVSEKIYAKALEKEWRALSWLEKSSVTWQLKKFSKVTE
EHLTKKAAEGRKESSRKFVSACFMNAQTHLGNARITISNKVNEVTNLGVRRIVEMCLRLIHRCYSDMIFLVNISIIFSLF
VQMCATLRNTLSIIHRDRTTLARVQAESNERSIMQMYDLMTKAGNGPPKMEDFFKHIEMVRPDLLPTAKYMVQDSEAVDT
QAKTQTQLQLEKIVAFMALLTMCIDSERSDAVFKILQKLKSVFGTMGEDVRPQSLDDILDLDEAKQLTVDFDLSTSKEST
STSFDVTFEDWWNRQLQQNRVIPHYRTSGEFLEFTRETAAKVANTITLSTSTEFLIRGAVGSGKSTGLPHHLSKKGKVLL
LEPTRPLAENVSKQLGRDPFFHAVTLRMRGLNRFGSSNITVMTSGFAFHYYVNNPHQLSDFDFIIIDECHVLDSATIAFN
CALKEFEFPGKLLKVSATPPGRECEFTTQHPVKLKVEEHLSFQQFAQAQGTGSNADMVQYGHNLLVYVASYNEVDQMSRH
LLDRQFHVTKVDGRTMQMGNIEIETHGTEGKPHFIVATNIIENGVTLDVDCVIDFGLKVVAQLDSDNRCVRYEKKAVSFG
ERIQRLGRVGRHKAGFALRIGHTEKSLEEIPEFIATEAAFLSFAYGLPVTTQGVSTNILSRCTVKQARNALNFELTPFFT
TNFIRYDGSIHPEVHKLLCKFKLRESEMLLSKLAIPHQYTSQWITVKDYNRIGIQVNCDEKVKIPFYVHGIPDKLFEMLW
NTVCKYKCDAGFGRISSVNATKISYTLSTDPSALPRTIAILDHLISEEIMKKNHFDTISSSLTGHSFSLAGIADGIRKRY
LKDYTQQNIAILQQARAQLLEFNSNTVDLNNLQNYEDLGVLNTVRLQGKAEVCEFLGLKGKWDGKKFFNDVVVAIFTLIG
GGWMLWDYFRHYMQEPVSTQGRKRMMQKLKFRDAFDRKVGREVYADDYTMEHTFGEAYTKKGKQKGSTHTKGMGKKSRGF
IHMYGVEPENYSTLRFVDPLTGHTMDESPRVDIRIVQDEFGEIRRQKINEGELDKQAVVARPGLQAYFLGKGTEEALKVD
LTPHRPTLLCMNSNAIAGFPEREDELRQTVPMSAVPKPNEVVELESKSTYKGLRDYSSVSTLICRLVNSSDGHNETIYGI
GYGSYIITNGHLFRRNNGTLTVKTWHGDFIIPNTTQLKIHFIEGKDAILIRMPRDFPPFAQRSCFRSPKKEERVCMVGTN
FQEKSLRSTVSESSIIVPEGKGSFWVHWITTQDGDCGLPMVSVNDGYIVGIHGLTSNETSRNFFVPFIDEFKNKYLDKLE
DLTWNKHWLWQPDRIAWGSLNLVDDQPKSEFKISKLVTDLFGSEVSVQSKKDRWVLEAVEGNLVACGQAESALVTKHVVK
GKCCHFAQYLSLHPDAQAFFKPLMSAYQPSKLNKEAFKKDFFKYNKPVMLNEVNFEAFEKAVEGVKIMMIEFGFNECVYV
TDPDDIYDSLNMKAAVGAQYKGKKQDYFQDMDSFDKERLLFLSCERLFYGQKGIWNGSLKAELRPLEKVQANKTRTFTAA
PIDTLLGAKVCVDDFNNQFYSFNLICPWTVGMTKFYGGWDKLMRALPDGWVYCHADGSQFDSSLTPLLLNSVLSIRSFFM
EDWWVGKEMLENLYAEIVYTPILTPDGTIFKKFRGNNSGQPSTVVDNTLMVVISMYYSCIKEGWTYDDIQERLVFFANGD
DIILAVQKEDVWLYNTLSNSFKELGLNYDFSEQTTKREELWFMSHQAMLIDDIYIPKLEQERIVSILEWDRSKELMHRTE
AICAAMIEAWGHTELLTEIRKFYLWLMGKEEFKELALNGKAPYIAETALRKLYTDKDAKMEEMQEYLKQLEFDSDDEVYE
SVSTQSSKKEEEKDAGADEREKDKGKGPADKDVGAGSKGKVVPRLQKITKKMNLPMVGGRMILNLDHLIEYKPQQTDLYN
TRATKAQFERWYEAVKTEYELNDQQMGVVMNGFMVWCIDNGTSPDVNGVWVMMDGDEQIEYPLKPMVENAKPTLRQVMHH
FSDAAEAYIEMRNSEGFYMPRYGLLRNLRDKSLARYAFDFYEVNSKTSDRAREAVAQMKAARLANVNTRLFGLDGNVATT
SENTERHTARDVNQNMHHLLGMTSGQ
>P12915 ~~~~~~Genome polyprotein~~~
MGAQLSRNTAGSHTTGTYATGGSTINYNNINYYSHAASAAQNKQDFTQDPSKFTQPIADVIKETAVPLKSPSAEACGYSD
RVAQLTLGNSTITTQEAANICVAYGCWPAKLSDTDATSVDKPTEPGVSADAFYTLRSKPWQADSKGWYWKLPDALNNTGM
FGQNAQFHYIYRGGWAVHVQCNATKFHQGTLLVLAIPEHQIATQEQPAFDRTMPGSEGGTFQEPFWLEDGTSLGNSLIYP
HQWINLRTNNSATLILPYVNAIPMDSAIRHSNWTLAIIPVAPLKYAAETTPLVPITVTIAPMETEYNGLRRAIASNQGLP
TKPGPGSYQFMTTDEDCSPCILPDFQPTLEIFIPGKVNNLLEIAQVESILEANNREGVEGVERYVIPVSVQDALDAQIYA
LRLELGGSGPLSSSLLGTLAKHYTQWSGSVEITCMFTGTFMTTGKVLLAYTPPGGDMPRNREEAMLGTHVVWDFGLQSSI
TLVIPWISASHFRGVSNDDVLNYQYYAAGHVTIWYQTNMVIPPGFPNTAGIIMMIAAQPNFSFRIQKDREDMTQTAILQN
DPGKMLKDAIDKQVAGALVAGTTTSTHSVATDSTPALQAAETGATSTARDESMIETRTIVPTHGIHETSVESFFGRSSLV
GMPLLATGTSITNWRIDFREFVQLRAKMSWFTYMRFDVEFTIIATSSTGQNVTTEQHTTYQVMYVPPGAPVPSNQDSFQW
QSGCNPSVFADTDGPPAQFSVPFMSSANAYSTVYDGYARFMDTDPDRYGILPSNFLGFMYFRTLEDAAHQVRFRICAKIK
HTSCWIPRAPRQAPYKKRYNLVFSGDSDRICSNRASLTSYGPFGQQQGAAYVGSYKILNRHLATYADWENEVWQSYQRDL
LVTRVDAHGCDTIARCNCRSGIYYCKSTAKHYPIVVTPPSIYKIEANDYYPERMQTHILLGIGFAEPGDCGGLLRCEHGV
MGILTVGGGDHVGFADVRDLLWIEDDAMEQGITDYVQQLGNAFGAGFTAEIANYTNQLRDMLMGSDSVVEKIIRSLVRLV
SALVIVVRNHQDLITVGATLALLGCEGSPWKWLKRKVCQILGINMAERQSDNWMKKFTEMCNAFRGLDWIAAKISKFIDW
LKQKILPELKERAEFVKKLKQLPLLEAQVNTLEHSSASQERQEQLFGNVQYLAHHCRKNAPLYAAEAKRVYHLEKRVLGA
MQFKTKNRIEPVCALIHGSPGTGQSLATMIVGRKLAEYEGSDVYSLPPDPDHFDGYQQQAVVVMDDLLQNPDGKDMTLFC
QMVSTAPFTVPMAALEDKGKLFTSKFVLASTNAGQVTPPTVADYKALQRRFFFDCDIEVQKEYKRDGVTLDVAKATETCE
DCSPANFKKCMPLICGKALQLKSRKGDGMRYSLDTLISELRRESNRRYNIGNVLEALFQGPVCYKPLRIEVHEEEPAPSA
ISDLLQAVDSEEVREYCRSKGWIVEERVTELKLERNVNRALAVIQSVSLIAAVAGTIYIVYRLFSGMQGPYSGIGTNYAT
KKPVVRQVQTQGPLFDFGVSLLKKNIRTVKTGAGEFTALGVYDTVVVLPRHAMPGKTIEMNGKDIEVLDAYDLNDKTDTS
LELTIVKLKMNEKFRDIRAMVPDQITDYNEAVVVVNTSYYPQLFTCVGRVKDYGFLNLAGRPTHRVLMYEFPTKAGQCGG
VVISMGKIVGVHVGGNGAQGFAASLLRRYFTAEQGQIEYIEKSKDAGYPVINAPTQTKLEPSVFFDVFPGVKEPAVLHKK
DKRLETNFEEALFSKYIGNVQRDMPEELLIAIDHYSEQLKMLNIDPRPISMEDAIYGTEGLEALDLGTSASYPYVAMGIK
KRDILNKETRDVTKMQECIDKYGLNLPMVTYVKDELRAPDKIRKGKSRLIEASSLNDSVAMRCYFGNLYKVFHTNPGTIS
GCAVGCDPETFWSKIPVMMDGELFGFDYTAYDASLSPMWFHALAEVLRRIGFVECKHFIDQLCCSHHLYMDKHYYVVGGM
PSGCSGTSIFNSMINNLIIRTLVLTVYKNIDLDDLKIIAYGDDVLASYPYEIDASLLAEAGKSFGLIMTPPDKSAEFVKL
TWDNVTFLKRKFVRDARYPFLVHPVMDMSNIHESIRWTKDPRHTEDHVRSLCLLAWHCGEEEYNEFVTKIRSVPVGRALH
LPSFKALERKWYDSF
>Q96662 ~~~~~~Genome polyprotein~~~
MELITNELLYKTYKQKPAGVEEPVYDQAGNPLFGERGVIHPQSTLKLPHKRGEREVPTNLASLPKRGDCRSGNSKGPVSG
IYLKPGPLFYQDYKGPVYHRAPLEFFEEASMCETTKRIGRVTGSDSRLYHIYVCIDGCIIVKSATKDRQKVLKWVHNKLN
CPLWVSSCSDTKDEGVVRKKQQKPDRLEKGRMKITPKESEKDSKTKPPDATIVVDGVKYQVKKKGKVKSKNTQDGLYHNK
NKPQESRKKLEKALLAWAIIALVFFQVTMGENITQWNLQDNGTEGIQRAMFQRGVNRSLHGIWPEKICTGVPSHLATDTE
LKAIHGMMDASEKTNYTCCRLQRHEWNKHGWCNWYNIEPWILLMNKTQANLTEGQPLRECAVTCRYDRDSDLNVVTQARD
SPTPLTGCKKGKNFSFAGILVQGPCNFEIAVSDVLFKEHDCTSVIQDTAHYLVDGMTNSLESARQGTAKLTTWLGRQLGI
LGKKLENKSKTWFGAYAASPYCEVERKLGYIWYTKNCTPACLPRNTKIIGPGRFDTNAEDGKILHEMGGHLSEVLLLSVV
VLSDFAPETASVIYLILHFSIPQGHTDIQDCDKNQLNLTVELTTAEVIPGSVWNLGKYVCVRPDWWPYETATVLVIEEVG
QVIKVVLRALKDLTRIWTAATTTAFLVCLVKVVRGQVLQGILWLMLITGAQGYPDCKPGFSYAIAKNDEIGPLGATGLTT
QWYEYSDGMRLQDSVVEVWCKNGEIKYLIRCGREARYLAVLHTRALPTSVVFEKIFDGKEQEDIVEMDDNFEFGLCPCDA
RPLIRGKFNTTLLNGPAFQMVCPIGWTGTVSCTLANKDTLATIVVRTYKRVRPFPYRQDCVTQKTIGEDLYDCALGGNWT
CVPGDALRYVAGPVESCEWCGYKFLKSEGLPHFPIGKCRLKNESGYRQVDETSCNRNGVAIVPSGTVKCKIGDTVVQVIA
MDDKLGPMPCKPHEIISSEGPVEKTACTFNYTRTLKNKYFEPRDNYFQQYMLKGEYQYWFDLEITDHHRDYFAESLLVIV
VALLGGRYVLWLLVTYMILSEQMASGVQYGAGEIVMMGNLLTHDSVEVVTYFLLLYLLLREENTKKWVILIYHIIVMHPL
KSVTVILLMVGGMAKAEPGAQGYLEQVDLSFTMITIIVIGLVIARRDPTVVPLVTIVAALKITGLGFGPGVDAAMAVLTL
TLLMTSYVTDYFRYKRWIQCILSLVAGVFLIRTLKHLGELKTPELTIPNWRPLTFILLYLTSATVVTRWKIDIAGIFLQG
APILLMIATLWADFLTLVLILPTYELAKLYYLKNVKTDVEKSWGVPYPDPQTLGGLDYRTIDSVYDVDESGEGVYLFPSR
QKKNKNISILLPLIRATLISCISSKWQMVYMAYLTLDFMYYMHRKVIEEISGSTNVMSRVIAALIELNWSMEEEESKGLK
KFFILSGRLRNLIIKHKVRNQTVASWYGEEEVYGMPKVVTIIRACTLNKNKHCIICTVCEARKWKGGNCPKCGRHGKPII
CGMTLADFEERHYKRIFIREGNFEGPFRQEYNGFVQYTARGQLFLRNLPILATKVKMIMVGNLGEEIGDLEHLGWILRGP
AVCKKITEHEKCHVNILDKLTAFFGVMPRGTTPRAPVRFPTALLKVRRGLETGWAYTHQGGISSVDHVTAGKDLLVCDSM
GRTRVVCQSNNKLTDETEYGVKTDSGCPDGARCYVLNPEAVNISGSKGAVVHLQKTGGEFTCVTASGTPAFFDLKNLKGW
SGLPIFEASSGRVVGRVKVGKNEESKPTKLMSGIQTVSKNTADLTEMVKKITSMNRGDFRQITLATGAGKTTELPKAVIE
EIGRHKRVLVLIPLRAAAESVYQYMRLKHPSISFNLRIGDMKEGDMATGITYASYGYFCQMPQPKLRAAMIEYSYIFLDE
YHCATPEQLAVIGKIHRFSESIRVVAMTATPAGSVTTTGQKHPIEEFIAPEVMKGEDLGSQFLDIAGLKIPVEEMKGNML
VFVPTRNMAVEVAKKLKAKGYNSGYYYSGEDPANLRVVTSQSPYVVVATNAIESGVTLPDLDTVVDTGLKCEKRVRVSSK
IPFIVTGLKRMAVTVGEQAQRRGRVGRVKPGRYYRSQETATGSKDYHYDLLQAQRYGIEDGINVTKSFREMNYDWSLYEE
DSLLITQLEILNNLLISEDLPAAVKNIMARTDHPEPIQLAYNSYEVQVPVLFPKIRNGEVTDTYENYSFLNARKLGEDVP
VYVYATEDEDLAVDLLGLDWPDPGNQQVVETGKALKQVVGLSSAENALLIALFGYVGYQALSKRHVPMITDIYTIEDQRL
EDTTHLQYAPNAIRTEGKETELKELAVGDLDKIMGSISDYASEGLNFVRSQAEKMRSAPAFKENVEAAKGYVQKFIDSLI
ENKETIIRYGLWGTHTALYKSIAARLGHETAFATLVIKWLAFGGESVSDHMRQAAVDLVVYYVINKPSFPGDSETQQEGR
RFVASLFISALATYTYKTWNYNNLSKVVEPALAYLPYATNALKMFTPTRLESVVILSTTIYKTYLSIRKGKSDGLLGTGI
SAAMEILSQNPVSVGISVMLGVGAIAAHNAIESSEQKRTLLMKVFVKNFLDQAATDELVKENPEKIIMALFEAVQTIGNP
LRLIYHLYGVYYKGWEAKELSERTAGRNLFTLIMFEAFELLGMDSEGKIRNLSGNYVLDLIYSLHKQINRGLKKIVLGWA
PAPFSCDWTPSDERIRLPTNNYLRVETKCPCGYEMKALRNVGGSLTKVEEKGPFLCRNRLGRGPVNYRVTKYYDDNLKEI
KPVAKLEGFVDHYYKGVTARIDYGRGKMLLATDKWEVEHGVVTRLAKRYTGVGFKGAYLGDEPNHRDLVERDCATITKNT
VQFLKMKKGCAFTYDLTLSNLTRLIELVHKNNLEEKDIPAATVTTWLAYTFVNEDIGTIKPVLGERVVTDPVVDVNLQPE
VQVDTSEVGITLVGRAALMTTGTTPVVEKTEPNADGGPSSIKIGLDEGRYPGPGLQDRTLTDEIHSRDERPFVLVLGSKN
SMSNRAKTARNINLYKGNNPREIRDLMAQGRMLVVALKDFNPELSELVDFKGTFLDREALEALSLGRPKSKQVTTATVRE
LLEQEVQVEIPSWFGAGDPVFLEVTLKGDRYHLVGDVDRVKDQAKELGATDQTRIVKEVGARTYTMKLSSWFLQATNKQM
SLTPLFEELLLRCPPKIKSNKGHMASAYQLAQGNWEPLDCGVHLGTIPARRVKIHPYEAYLKLKDLLEEEEKKPKCRDTV
IREHNKWILKKVRHQGNLNTKKILNPGKLSEQLDREGHKRNIYNNQIGTIMTEAGSRLEKLPVVRAQTDTKSFHEAIRDK
IDKNENQQSPGLHDKLLEIFHTIAQPSLRHTYSDVTWEQLEAGVNRKGAAGFLEKKNVGEVLDSEKHLVEQLIRDLKTGR
KIRYYETAIPKNEKRDVSDDWQSGDLVDEKKPRVIQYPEAKTRLAITKVMYNWVKQQPVVIPGYEGKTPLFNIFNKVRKE
WDLFNEPVAVSFDTKAWDTQVTSRDLRLIGEIQKYYYRKEWHKFIDTITDHMVEVPVITADGEVYIRNGQRGSGQPDTSA
GNSMLNVLTMMYAFCESTGVPYKSFNRVARIHVCGDDGFLITEKGLGLKFANNGMQILHEAGKPQKITEGERMKVAYRFE
DIEFCSHTPVPVRWSDNTSSYMAGRDTAVILSKMATRLDSSGERGTIAYEKAVAFSFLLMYSWNPLVRRICLLVLSQQPE
TTPSTQTTYYYKGDPIGAYKDVIGKNLCELKRTGFEKLANLNLSLSTLGIWSKHTSKRIIQDCVTIGKEEGNWLVNADRL
ISSKTGHLYIPDKGYTLQGKHYEQLQLQARTSPVTGVGTERYKLGPIVNLLLRRLRVLLMAAVGASS
>P19711 ~~~~~~Genome polyprotein~~~
MELITNELLYKTYKQKPVGVEEPVYDQAGDPLFGERGAVHPQSTLKLPHKRGERDVPTNLASLPKRGDCRSGNSRGPVSG
IYLKPGPLFYQDYKGPVYHRAPLELFEEGSMCETTKRIGRVTGSDGKLYHIYVCIDGCIIIKSATRSYQRVFRWVHNRLD
CPLWVTTCSDTKEEGATKKKTQKPDRLERGKMKIVPKESEKDSKTKPPDATIVVEGVKYQVRKKGKTKSKNTQDGLYHNK
NKPQESRKKLEKALLAWAIIAIVLFQVTMGENITQWNLQDNGTEGIQRAMFQRGVNRSLHGIWPEKICTGVPSHLATDIE
LKTIHGMMDASEKTNYTCCRLQRHEWNKHGWCNWYNIEPWILVMNRTQANLTEGQPPRECAVTCRYDRASDLNVVTQARD
SPTPLTGCKKGKNFSFAGILMRGPCNFEIAASDVLFKEHERISMFQDTTLYLVDGLTNSLEGARQGTAKLTTWLGKQLGI
LGKKLENKSKTWFGAYAASPYCDVDRKIGYIWYTKNCTPACLPKNTKIVGPGKFGTNAEDGKILHEMGGHLSEVLLLSLV
VLSDFAPETASVMYLILHFSIPQSHVDVMDCDKTQLNLTVELTTAEVIPGSVWNLGKYVCIRPNWWPYETTVVLAFEEVS
QVVKLVLRALRDLTRIWNAATTTAFLVCLVKIVRGQMVQGILWLLLITGVQGHLDCKPEFSYAIAKDERIGQLGAEGLTT
TWKEYSPGMKLEDTMVIAWCEDGKLMYLQRCTRETRYLAILHTRALPTSVVFKKLFDGRKQEDVVEMNDNFEFGLCPCDA
KPIVRGKFNTTLLNGPAFQMVCPIGWTGTVSCTSFNMDTLATTVVRTYRRSKPFPHRQGCITQKNLGEDLHNCILGGNWT
CVPGDQLLYKGGSIESCKWCGYQFKESEGLPHYPIGKCKLENETGYRLVDSTSCNREGVAIVPQGTLKCKIGKTTVQVIA
MDTKLGPMPCRPYEIISSEGPVEKTACTFNYTKTLKNKYFEPRDSYFQQYMLKGEYQYWFDLEVTDHHRDYFAESILVVV
VALLGGRYVLWLLVTYMVLSEQKALGIQYGSGEVVMMGNLLTHNNIEVVTYFLLLYLLLREESVKKWVLLLYHILVVHPI
KSVIVILLMIGDVVKADSGGQEYLGKIDLCFTTVVLIVIGLIIARRDPTIVPLVTIMAALRVTELTHQPGVDIAVAVMTI
TLLMVSYVTDYFRYKKWLQCILSLVSAVFLIRSLIYLGRIEMPEVTIPNWRPLTLILLYLISTTIVTRWKVDVAGLLLQC
VPILLLVTTLWADFLTLILILPTYELVKLYYLKTVRTDTERSWLGGIDYTRVDSIYDVDESGEGVYLFPSRQKAQGNFSI
LLPLIKATLISCVSSKWQLIYMSYLTLDFMYYMHRKVIEEISGGTNIISRLVAALIELNWSMEEEESKGLKKFYLLSGRL
RNLIIKHKVRNETVASWYGEEEVYGMPKIMTIIKASTLSKSRHCIICTVCEGREWKGGTCPKCGRHGKPITCGMSLADFE
ERHYKRIFIREGNFEGMCSRCQGKHRRFEMDREPKSARYCAECNRLHPAEEGDFWAESSMLGLKITYFALMDGKVYDITE
WAGCQRVGISPDTHRVPCHISFGSRMPFRQEYNGFVQYTARGQLFLRNLPVLATKVKMLMVGNLGEEIGNLEHLGWILRG
PAVCKKITEHEKCHINILDKLTAFFGIMPRGTTPRAPVRFPTSLLKVRRGLETAWAYTHQGGISSVDHVTAGKDLLVCDS
MGRTRVVCQSNNRLTDETEYGVKTDSGCPDGARCYVLNPEAVNISGSKGAVVHLQKTGGEFTCVTASGTPAFFDLKNLKG
WSGLPIFEASSGRVVGRVKVGKNEESKPTKIMSGIQTVSKNRADLTEMVKKITSMNRGDFKQITLATGAGKTTELPKAVI
EEIGRHKRVLVLIPLRAAAESVYQYMRLKHPSISFNLRIGDMKEGDMATGITYASYGYFCQMPQPKLRAAMVEYSYIFLD
EYHCATPEQLAIIGKIHRFSESIRVVAMTATPAGSVTTTGQKHPIEEFIAPEVMKGEDLGSQFLDIAGLKIPVDEMKGNM
LVFVPTRNMAVEVAKKLKAKGYNSGYYYSGEDPANLRVVTSQSPYVIVATNAIESGVTLPDLDTVIDTGLKCEKRVRVSS
KIPFIVTGLKRMAVTVGEQAQRRGRVGRVKPGRYYRSQETATGSKDYHYDLLQAQRYGIEDGINVTKSFREMNYDWSLYE
EDSLLITQLEILNNLLISEDLPAAVKNIMARTDHPEPIQLAYNSYEVQVPVLFPKIRNGEVTDTYENYSFLNARKLGEDV
PVYIYATEDEDLAVDLLGLDWPDPGNQQVVETGKALKQVTGLSSAENALLVALFGYVGYQALSKRHVPMITDIYTIEDQR
LEDTTHLQYAPNAIKTDGTETELKELASGDVEKIMGAISDYAAGGLEFVKSQAEKIKTAPLFKENAEAAKGYVQKFIDSL
IENKEEIIRYGLWGTHTALYKSIAARLGHETAFATLVLKWLAFGGESVSDHVKQAAVDLVVYYVMNKPSFPGDSETQQEG
RRFVASLFISALATYTYKTWNYHNLSKVVEPALAYLPYATSALKMFTPTRLESVVILSTTIYKTYLSIRKGKSDGLLGTG
ISAAMEILSQNPVSVGISVMLGVGAIAAHNAIESSEQKRTLLMKVFVKNFLDQAATDELVKENPEKIIMALFEAVQTIGN
PLRLIYHLYGVYYKGWEAKELSERTAGRNLFTLIMFEAFELLGMDSQGKIRNLSGNYILDLIYGLHKQINRGLKKMVLGW
APAPFSCDWTPSDERIRLPTDNYLRVETRCPCGYEMKAFKNVGGKLTKVEESGPFLCRNRPGRGPVNYRVTKYYDDNLRE
IKPVAKLEGQVEHYYKGVTAKIDYSKGKMLLATDKWEVEHGVITRLAKRYTGVGFNGAYLGDEPNHRALVERDCATITKN
TVQFLKMKKGCAFTYDLTISNLTRLIELVHRNNLEEKEIPTATVTTWLAYTFVNEDVGTIKPVLGERVIPDPVVDINLQP
EVQVDTSEVGITIIGRETLMTTGVTPVLEKVEPDASDNQNSVKIGLDEGNYPGPGIQTHTLTEEIHNRDARPFIMILGSR
NSISNRAKTARNINLYTGNDPREIRDLMAAGRMLVVALRDVDPELSEMVDFKGTFLDREALEALSLGQPKPKQVTKEAVR
NLIEQKKDVEIPNWFASDDPVFLEVALKNDKYYLVGDVGELKDQAKALGATDQTRIIKEVGSRTYAMKLSSWFLKASNKQ
MSLTPLFEELLLRCPPATKSNKGHMASAYQLAQGNWEPLGCGVHLGTIPARRVKIHPYEAYLKLKDFIEEEEKKPRVKDT
VIREHNKWILKKIRFQGNLNTKKMLNPGKLSEQLDREGRKRNIYNHQIGTIMSSAGIRLEKLPIVRAQTDTKTFHEAIRD
KIDKSENRQNPELHNKLLEIFHTIAQPTLKHTYGEVTWEQLEAGVNRKGAAGFLEKKNIGEVLDSEKHLVEQLVRDLKAG
RKIKYYETAIPKNEKRDVSDDWQAGDLVVEKRPRVIQYPEAKTRLAITKVMYNWVKQQPVVIPGYEGKTPLFNIFDKVRK
EWDSFNEPVAVSFDTKAWDTQVTSKDLQLIGEIQKYYYKKEWHKFIDTITDHMTEVPVITADGEVYIRNGQRGSGQPDTS
AGNSMLNVLTMMYGFCESTGVPYKSFNRVARIHVCGDDGFLITEKGLGLKFANKGMQILHEAGKPQKITEGEKMKVAYRF
EDIEFCSHTPVPVRWSDNTSSHMAGRDTAVILSKMATRLDSSGERGTTAYEKAVAFSFLLMYSWNPLVRRICLLVLSQQP
ETDPSKHATYYYKGDPIGAYKDVIGRNLSELKRTGFEKLANLNLSLSTLGVWTKHTSKRIIQDCVAIGKEEGNWLVKPDR
LISSKTGHLYIPDKGFTLQGKHYEQLQLRTETNPVMGVGTERYKLGPIVNLLLRRLKILLMTAVGVSS
>Q01499 ~~~~~~Genome polyprotein~~~
MELITNELLYKTYKQKPVGVEEPVYDQAGNPLFGERGAIHPQSTLKLPHKRGERNVPTSLASLPKRGDCRSGNSKGPVSG
IYLKPGPLFYQDYKGPVYHRAPLELFEEGSMCETTKRIGRVTGSDGKLYHIYICIDGCITVKSATRSHQRVLRWVHNRLD
CPLWVTSCSDTKEEGATKKKQQKPDRLEKGRMKIVPKESEKDSKTKPPDATIVVDGVKYQVKKKGKVKSKNTQDGLYHNK
NKPPESRKKLEKALLAWAILAVVLIEVTMGENITQWNLQDNGTEGIQRAMFQRGVNRSLHGIWPEKICTGVPSHLATDVE
LKTIHGMMDASEKTNYTCCRLQRHEWNKHGWCNWYNIEPWILIMNRTQANLTEGQPPRECAVTCRYDRDSDLNVVTQARD
SPTPLTGCKKGKNFSFAGVLTRGPCNFEIAASDVLFKEHECTGVFQDTAHYLVDGVTNSLESARQGTAKLTTWLGKQLGI
LGKKLENKSKTWFGAYAASPYCDVDRKIGYIWFTKNCTPACLPKNTKIIGPGKFDTNAEDGKILHEMGGHLSEVLLLSLV
VLSDFAPETASAMYLILHFSIPQSHVDITDCDKTQLNLTIELTTADVIPGSVWNLGKYVCIRPDWWPYETAAVLAFEEVG
QVVKIVLRALRDLTRIWNAATTTAFLVCLIKMVRGQVVQGILWLLLITGVQGHLDCKPEYSYAIAKNDRVGPLGAEGLTT
VWKDYSHEMKLEDTMVIAWCKGGKFTYLSRCTRETRYLAILHSRALPTSVVFKKLFEGQKQEDTVEMDDDFEFGLCPCDA
KPIVRGKFNTTLLNGPAFQMVCPIGWTGTVSCMLANRDTLDTAVVRTYRRSVPFPYRQGCITQKTLGEDLYDCALGGNWT
CVTGDQSRYTGGLIESCKWCGYKFQKSEGLPHYPIGKCRLNNETGYRLVDDTSCDREGVAIVPHGLVKCKIGDTTVQVIA
TDTKLGPMPCKPHEIISSEGPIEKTACTFNYTRTLKNKYFEPRDSYFQQYMLKGDYQYWFDLEVTDHHRDYFAESILVVV
VALLGGRYVLWLLVTYMVLSEQKASGAQYGAGEVVMMGNLLTHDNVEVVTYFFLLYLLLREESVKKWVLLLYHILVAHPL
KSVIVILLMIGDVVKADPGGQGYLGQIDVCFTMVVIIIIGLIIARRDPTIVPLITIVASLRVTGLTYSPGVDAAMAVITI
TLLMVSYVTDYFRYKRWLQCILSLVSGVFLIRCLIHLGRIETPEVTIPNWRPLTLILFYLISTTVVTMWKIDLAGLLLQG
VPILLLITTLWADFLTLILILPTYELVKLYYLKTIKTDIEKSWLGGLDYKRVDSIYDVDESGEGVYLFPSRQKAQKNFSM
LLPLVRATLISCVSSKWQLIYMAYLSVDFMYYMHRKVIEEISGGTNMISRIVAALIELNWSMEEEESKGLKKFYLLSGRL
RNLIIKHKVRNETVAGWYGEEEVYGMPKIMTIIKASTLNKNKHCIICTVCEGRKWKGGTCPKCGRHGKPITCGMSLADFE
ERHYKRIFIREGNFEGPFRQEYNGFIQYTARGQLFLRNLPILATKVKMLMVGNLGEEVGDLEHLGWILRGPAVCKKITEH
ERCHINILDKLTAFFGIMPRGTTPRAPVRFPTSLLKVRRGLETGWAYTHQGGISSVDHVTAGKDLLVCDSMGRTRVVCQS
NNKLTDETEYGVKTDSGCPDGARCYVLNPEAVNISGSKGAVVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEAS
SGRVVGRVKVGKNEESKPTKIMSGIQTVSKNTADLTEMVKKITSMNRGDFKQITLATGAGKTTELPKAVIEEIGRHKRVL
VLIPLRAAAESVYQYMRLKHPSISFNLRIGDMKEGDMATGITYASYGYFCQMPQPKLRAAMVEYSYIFLDEYHCATPEQL
AIIGKIHRFSESIRVVAMTATPAGSVTTTGQKHPIEEFIAPEVMEGEDLGSQFLDIAGLKIPVDEMKGNMLVFVPTRNMA
VEVAKKLKAKGYNSGYYYSGEDPANLRVVTSQSPYVIVATNAIESGVTLPDLDTVVDTGLKCEKRVRVSSKIPFIVTGLK
RMAVTVGEQAQRRGRVGRVKPGRYYRSQETATGSKDYHYDLLQAQRYGIEDGINVTKSFREMNYDWSLYEEDSLLITQLE
ILNNLLISEDLPAAVKNIMARTDHPEPIQLAYNSYEVQVPVLFPKIRNGEVTDTYENYSFLNARKLGEDVPVYIYATEDE
DLAVDLLGLDWPDPGNQQVVETGKALKQVAGLSSAENALLVALFGYVGYQALSKRHVPMITDIYTIEDQRLEDTTHLQYA
PNAIKTEGTETELKELASGDVEKIMGAISDYAAGGLDFVKSQAEKIKTAPLFKENVEAARGYVQKLIDSLIEDKDVIIRY
GLWGTHTALYKSIAARLGHETAFATLVLKWLAFGGETVSDHIRQAAVDLVVYYVMNKPSFPGDTETQQEGRRFVASLFIS
ALATYTYKTWNYNNLSKVVEPALAYLPYATSALKMFTPTRLESVVILSTTIYKTYLSIRKGKSDGLLGTGISAAMEILSQ
NPVSVGISVMLGVGAIAAHNAIESSEQKRTLLMKVFVKNFLDQAATDELVKENPEKIIMALFEAVQTIGNPLRLIYHLYG
VYYKGWEAKELSERTAGRNLFTLIMFEAFELLGMDSEGKIRNLSGNYILDLIHGLHKQINRGLKKIVLGWAPAPFSCDWT
PSDERIRLPTDSYLRVETKCPCGYEMKALKNVSGKLTKVEESGPFLCRNRPGRGPVNYRVTKYYDDNLREIRPVAKLEGQ
VEHYYKGVTARIDYSKGKTLLATDKWEVEHGTLTRLTKRYTGVGFRGAYLGDEPNHRDLVERDCATITKNTVQFLKMKKG
CAFTYDLTISNLTRLIELVHRNNLEEKEIPTATVTTWLAYTFVNEDVGTIKPVLGERVIPDPVVDINLQPEVQVDTSEVG
ITIIGKEAVMTTGVTPVMEKVEPDTDNNQSSVKIGLDEGNYPGPGVQTHTLVEEIHNKDARPFIMVLGSKSSMSNRAKTA
RNINLYTGNDPREIRDLMAEGRILVVALRDIDPDLSELVDFKGTFLDREALEALSLGQPKPKQVTKAAIRDLLKEERQVE
IPDWFTSDDPVFLDIAMKKDKYHLIGDVVEVKDQAKALGATDQTRIVKEVGSRTYTMKLSSWFLQASSKQMSLTPLFEEL
LLRCPPATKSNKGHMASAYQLAQGNWEPLGCGVHLGTVPARRVKMHPYEAYLKLKDLVEEEEKKPRIRDTVIREHNKWIL
KKIKFQGNLNTKKMLNPGKLSEQLDREGHKRNIYNNQISTVMSSAGIRLEKLPIVRAQTDTKSFHEAIRDKIDKNENRQN
PELHNKLLEIFHTIADPSLKHTYGEVTWEQLEAGINRKGAAGFLEKKNIGEVLDSEKHLVEQLVRDLKAGRKIRYYETAI
PKNEKRDVSDDWQAGDLVDEKKPRVIQYPEAKTRLAITKVMYNWVKQQPVVIPGYEGKTPLFNIFNKVRKEWDLFNEPVA
VSFDTKAWDTQVTSRDLHLIGEIQKYYYRKEWHKFIDTITDHMVEVPVITADGEVYIRNGQRGSGQPDTSAGNSMLNVLT
MIYAFCESTGVPYKSFNRVAKIHVCGDDGFLITEKGLGLKFSNKGMQILHEAGKPQKLTEGEKMKVAYKFEDIEFCSHTP
VPVRWSDNTSSYMAGRDTAVILSKMATRLDSSGERGTTAYEKAVAFSFLLMYSWNPLVRRICLLVLSQRPETAPSTQTTY
YYKGDPIGAYKDVIGRNLSELKRTGFEKLANLNLSLSTLGIWTKHTSKRIIQDCVAIGKEEGNWLVNADRLISSKTGHLY
IPDKGFTLQGKHYEQLQLGAETNPVMGVGTERYKLGPIVNLLLRRLKVLLMAAVGASS
>P19712 ~~~~~~Genome polyprotein~~~
MELNHFELLYKTSKQKPVGVEEPVYDTAGRPLFGNPSEVHPQSTLKLPHDRGRGDIRTTLRDLPRKGDCRSGNHLGPVSG
IYIKPGPVYYQDYTGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYHIYVCVDGCILLKLAKRGTPRTLKWIRNFTN
CPLWVTSCSDDGASGSKDKKPDRMNKGKLKIAPREHEKDSKTKPPDATIVVEGVKYQIKKKGKVKGKNTQDGLYHNKNKP
PESRKKLEKALLAWAVITILLYQPVAAENITQWNLSDNGTNGIQRAMYLRGVNRSLHGIWPEKICKGVPTHLATDTELKE
IRGMMDASERTNYTCCRLQRHEWNKHGWCNWYNIDPWIQLMNRTQTNLTEGPPDKECAVTCRYDKNTDVNVVTQARNRPT
TLTGCKKGKNFSFAGTVIEGPCNFNVSVEDILYGDHECGSLLQDTALYLLDGMTNTIENARQGAARVTSWLGRQLSTAGK
KLERRSKTWFGAYALSPYCNVTRKIGYIWYTNNCTPACLPKNTKIIGPGKFDTNAEDGKILHEMGGHLSEFLLLSLVILS
DFAPETASTLYLILHYAIPQSHEEPEGCDTNQLNLTVKLRTEDVVPSSVWNIGKYVCVRPDWWPYETKVALLFEEAGQVI
KLVLRALRDLTRVWNSASTTAFLICLIKVLRGQVVQGIIWLLLVTGAQGRLACKEDYRYAISSTNEIGLLGAEGLTTTWK
EYSHGLQLDDGTVKAVCTAGSFKVTALNVVSRRYLASLHKRALPTSVTFELLFDGTNPAIEEMDDDFGFGLCPFDTSPVI
KGKYNTTLLNGSAFYLVCPIGWTGVVECTAVSPTTLRTEVVKTFRRDKPFPHRVDCVTTIVEKEDLFHCKLGGNWTCVKG
DPVTYKGGQVKQCRWCGFEFKEPYGLPHYPIGKCILTNETGYRVVDSTDCNRDGVVISTEGEHECLIGNTTVKVHALDER
LGPMPCRPKEIVSSEGPVRKTSCTFNYTKTLRNKYYEPRDSYFQQYMLKGEYQYWFNLDVTDHHTDYFAEFVVLVVVALL
GGRYVLWLIVTYIILTEQLAAGLQLGQGEVVLIGNLITHTDNEVVVYFLLLYLVIRDEPIKKWILLLFHAMTNNPVKTIT
VALLMISGVAKGGKIDGGWQRQPVTSFDIQLALAVVVVVVMLLAKRDPTTFPLVITVATLRTAKITNGFSTDLVIATVSA
ALLTWTYISDYYKYKTWLQYLVSTVTGIFLIRVLKGIGELDLHAPTLPSHRPLFYILVYLISTAVVTRWNLDVAGLLLQC
VPTLLMVFTMWADILTLILILPTYELTKLYYLKEVKIGAERGWLWKTNYKRVNDIYEVDQTSEGVYLFPSKQRTSAITST
MLPLIKAILISCISNKWQLIYLLYLIFEVSYYLHKKVIDEIAGGTNFVSRLVAALIEVNWAFDNEEVKGLKKFFLLSSRV
KELIIKHKVRNEVVVRWFGDEEIYGMPKLIGLVKAATLSRNKHCMLCTVCEDRDWRGETCPKCGRFGPPVVCGMTLADFE
EKHYKRIFIREDQSGGPLREEHAGYLQYKARGQLFLRNLPVLATKVKMLLVGNLGTEIGDLEHLGWVLRGPAVCKKVTEH
ERCTTSIMDKLTAFFGVMPRGTTPRAPVRFPTSLLKIRRGLETGWAYTHQGGISSVDHVTCGKDLLVCDTMGRTRVVCQS
NNKMTDESEYGVKTDSGCPEGARCYVFNPEAVNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEAS
SGRVVGRVKVGKNEDSKPTKLMSGIQTVSKSATDLTEMVKKITTMNRGEFRQITLATGAGKTTELPRSVIEEIGRHKRVL
VLIPLRAAAESVYQYMRQKHPSIAFNLRIGEMKEGDMATGITYASYGYFCQMSQPKLRAAMVEYSFIFLDEYHCATPEQL
AIMGKIHRFSENLRVVAMTATPAGTVTTTGQKHPIEEFIAPEVMKGEDLGSEYLDIAGLKIPVEEMKNNMLVFVPTRNMA
VEAAKKLKAKGYNSGYYYSGEDPSNLRVVTSQSPYVVVATNAIESGVTLPDLDVVVDTGLKCEKRIRLSPKMPFIVTGLK
RMAVTIGEQAQRRGRVGRVKPGRYYRSQETPVGSKDYHYDLLQAQRYGIEDGINITKSFREMNYDWSLYEEDSLMITQLE
ILNNLLISEELPMAVKNIMARTDHPEPIQLAYNSYETQVPVLFPKIRNGEVTDTYDNYTFLNARKLGDDVPPYVYATEDE
DLAVELLGLDWPDPGNQGTVEAGRALKQVVGLSTAENALLVALFGYVGYQALSKRHIPVVTDIYSVEDHRLEDTTHLQYA
PNAIKTEGKETELKELAQGDVQRCVEAVTNYAREGIQFMKSQALKVRETPTYKETMNTVADYVKKFIEALTDSKEDIIKY
GLWGAHTALYKSIGARLGHETAFATLVVKWLAFGGESISDHIKQAATDLVVYYIINRPQFPGDTETQQEGRKFVASLLVS
ALATYTYKSWNYNNLSKIVEPALATLPYAAKALKLFAPTRLESVVILSTAIYKTYLSIRRGKSDGLLGTGVSAAMEIMSQ
NPVSVGIAVMLGVGAVAAHNAIEASEQKRTLLMKVFVKNFLDQAATDELVKESPEKIIMALFEAVQTVGNPLRLVYHLYG
VFYKGWEAKELAQRTAGRNLFTLIMFEAVELLGVDSEGKIRQLSSNYILELLYKFRDNIKSSVREIAISWAPAPFSCDWT
PTDDRIGLPHENYLRVETKCPCGYRMKAVKNCAGELRLLEEGGSFLCRNKFGRGSQNYRVTKYYDDNLSEIKPVIRMEGH
VELYYKGATIKLDFNNSKTVLATDKWEVDHSTLVRALKRYTGAGYRGAYLGEKPNHKHLIQRDCATITKDKVCFIKMKRG
CAFTYDLSLHNLTRLIELVHKNNLEDREIPAVTVTTWLAYTFVNEDIGTIKPTFGEKVTPEKQEEVVLQPAVVVDTTDVA
VTVVGETSTMTTGETPTTFTSLGSDSKVRQVLKLGVDDGQYPGPNQQRASLLEAIQGVDERPSVLILGSDKATSNRVKTA
KNVKIYRSRDPLELREMMKRGKILVVALSRVDTALLKFVDYKGTFLTRETLEALSLGKPKKRDITKAEAQWLLRLEDQIE
ELPDWFAAKEPIFLEANIKRDKYHLVGDIATIKEKAKQLGATDSTKISKEVGAKVYSMKLSNWVIQEENKQGSLAPLFEE
LLQQCPPGGQNKTTHMVSAYQLAQGNWVPVSCHVFMGTIPARRTKTHPYEAYVKLRELVDEHKMKALCGGSGLSKHNEWV
IGKVKYQGNLRTKHMLNPGKVAEQLHREGYRHNVYNKTIGSVMTATGIRLEKLPVVRAQTDTTNFHQAIRDKIDKEENLQ
TPGLHKKLMEVFNALKRPELEASYDAVDWEELERGINRKGAAGFFERKNIGEVLDSEKNKVEEVIDSLKKGRNIRYYETA
IPKNEKRDVNDDWTAGDFVDEKKPRVIQYPEAKTRLAITKVMYKWVKQKPVVIPGYEGKTPLFQIFDKVKKEWDQFQNPV
AVSFDTKAWDTQVTTRDLELIRDIQKFYFKKKWHKFIDTLTKHMSEVPVISADGEVYIRKGQRGSGQPDTSAGNSMLNVL
TMVYAFCEATGVPYKSFDRVAKIHVCGDDGFLITERALGEKFASKGVQILYEAGKPQKITEGDKMKVAYQFDDIEFCSHT
PVQVRWSDNTSSYMPGRNTTTILAKMATRLDSSGERGTIAYEKAVAFSFLLMYSWNPLIRRICLLVLSTELQVRPGKSTT
YYYEGDPISAYKEVIGHNLFDLKRTSFEKLAKLNLSMSTLGVWTRHTSKRLLQDCVNVGTKEGNWLVNADRLVSSKTGNR
YIPGEGHTLQGKHYEELILARKPIGNFEGTDRYNLGPIVNVVLRRLKIMMMALIGRGV
>P21530 ~~~~~~Genome polyprotein~~~
MELNHFELLYKTNKQKPMGVEEPVYDVTGRPLFGDPSEVHPQSTLKLPHDRGRGNIKTTLKNLPRRGDCRSGNHLGPVSG
IYVKPGPVFYQDYMGPVYHRAPLEFFDEAQFCEVTKRIGRVTGSDGKLYHIYVCIDGCILLKLAKRGEPRTLKWIRNLTD
CPLWVTSCSDDGASASKEKKPDRINKGKLKIAPKEHEKDSRTKPPDATIVVEGVKYQVKKKGKVKGKNTQDGLYHNKNKP
PESRKKLEKALLAWAVIAIMLYQPVAAENITQWNLRDNGTNGIQHAMYLRGVSRSLHGIWPEKICKGVPTYLATDTELRE
IQGMMVASEGTNYTCCKLQRHEWNKHGWCNWYNIDPWIQLMNRTQANLAEGPPSKECAVTCRYDKNADINVVTQARNRPT
TLTGCKKGTNFSFAGTVIEGPCNFNVSVEDILYGDHECGSLLQDTALYLVDGMTNTIERARQGAARVTSWLGRQLRIAGK
RLEGRSKTWFGAYALSPYCNVTTKIGYIWYTNNCTPACLPKNTKIIGPGKFDTNAEDGKILHEMGGHLSEFLLLSLVVLS
DFAPETASALYLILHYVIPQSHEEPEGCDTNQLNLTVELRTEDVIPSSVWNVGKYVCVRPDWWPYETKVALLFEEAGQVV
KLALRALRDLTRVWNSASTTAFLICLIKVLRGQVVQGVIWLLLVTGAQGRLACKEDHRYAISTTNEIGLHGAEGLTTTWK
EYNHNLQLDDGTVKAICMAGSFKVTALNVVSRRYLASLHKDALPTSVTFELLFDGTSPLTEEMGDDFGFGLCPYDTSPVV
KGKYNTTLLNGSAFYLVCPIGWTGVIECTAVSPTTLRTEVVKTFRREKPFPYRRDCVTTTVENEDLFYCKWGGNWTCVKG
EPVTYTGGPVKQCRWCGFDFNEPDGLPHYPIGKCILANETGYRIVDSTDCNRDGVVISTEGSHECLIGNTTVKVHALDER
LGPMPCRPKEIVSSAGPVRKTSCTFNYAKTLRNRYYEPRDSYFQQYMLKGEYQYWFDLDVTDRHSDYFAEFIVLVVVALL
GGRYVLWLIVTYIVLTEQLAAGLQLGQGEVVLIGNLITHTDIEVVVYFLLLYLVMRDEPIKKWILLLFHAMTNNPVKTIT
VALLMVSGVAKGGKIDGGWQRLPETNFDIQLALTVIVVAVMLLAKKDPTTVPLVITVATLRTAKITNGLSTDLAIATVST
ALLTWTYISDYYKYKTLLQYLISTVTGIFLIRVLKGVGELDLHTPTLPSYRPLFFILVYLISTAVVTRWNLDIAGLLLQC
VPTLLMVFTMWADILTLILILPTYELTKLYYLKEVKIGAERGWLWKTNFKRVNDIYEVDQAGEGVYLFPSKQKTGTITGT
MLPLIKAILISCISNKWQFIYLLYLIFEVSYYLHKKIIDEIAGGTNFISRLVAALIEANWAFDNEEVRGLKKFFLLSSRV
KELIIKHKVRNEVMVHWFGDEEVYGMPKLVGLVKAATLSKNKHCILCTVCENREWRGETCPKCGRFGPPVTCGMTLADFE
EKHYKRIFFREDQSEGPVREEYAGYLQYRARGQLFLRNLPVLATKVKMLLVGNLGTEVGDLEHLGWVLRGPAVCKKVTEH
EKCTTSIMDKLTAFFGVMPRGTTPRAPVRFPTSLLKIRRGLETGWAYTHQGGISSVDHVTCGKDLLVCDTMGRTRVVCQS
NNKMTDESEYGVKTDSGCPEGARCYVFNREAVNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEAS
SGRVVGRVKVGKNEDSKPTKLMSGIQTVSKSTTDLTEMVKKITTMNRGEFRQITLATGAGKTTELPRSVIEEIGRHKRVL
VLIPLRAAAESVYQYMRQKHPSIAFNLRIGEMKEGDMATGITYASYGYFCQMPQPKLRAAMVEYSFIFLDEYHCSTPEQL
AIMGKIHRFSENLRVVAMTATPAGTVTTTGQKHPIEEYIAPEVMKGEDLGPEYLDIAGLKIPVEEMKSNMLVFVPTRNMA
VETAKKLKAKGYNSGYYYSGEDPSNLRVVTSQSPYVVVATNAIESGVTLPDLDVVVDTGLKCEKRIRLSPKMPFIVTGLK
RMAVTIGEQAQRRGRVGRVKPGRYYRSQETPVGSKDYHYDLLQAQRYGIEDGINITKSFREMNYDWSLYEEDSLMITQLE
ILNNLLISEELPMAVKNIMARTDHPEPIQLAYNSYETQVPVLFPKIKNGEVTDSYDNYTFLNARKLGDDVPPYVYATEDE
DLAVELLGLDWPDPGNQGTVEAGRALKQVVGLSTAENALLVALFGYVGYQALSKRHIPVVTDIYSIEDHRLEDTTHLQYA
PNAIKTEGKETELKELAQGDVQRCMEAMTNYARDGIQFMKSQALKVKETPTYKETMDTVADYVKKFMEALADSKEDIIKY
GLWGTHTALYKSIGARLGNETAFATLVVKWLAFGGESIADHVKQAATDLVVYYIINRPQFPGDTETQQEGRKFVASLLVS
ALATYTYKSWNYNNLSKIVEPALATLPYAATALKLFAPTRLESVVILSTAIYKTYLSIRRGKSDGLLGTGVSAAMEIMSQ
NPVSVGIAVMLGVGAVAAHNAIEASEQKRTLLMKVFVKNFLDQAATDELVKESPEKIIMALFEAVQTVGNPLRLVYHVYG
VFYKGWEAKELAQRTAGRNLFTLIMFEAVELLGVDSEGKIRQLSSNYILELLYKFRDSIKSSVRQMAISWAPAPFSCDWT
PTDDRIGLPQDNFLRVETKCPCGYKMKAVKNCAGELRLLEEEGSFLCRNKFGRGSRNYRVTKYYDDNLSEIKPVIRMEGH
VELYYKGATIKLDFNNSKTILATDKWEVDHSTLVRVLKRHTGAGYCGAYLGEKPNHKHLIERDCATITKDKVCFLKMKRG
CAFTYDLSLHNLTRLIELVHKNNLEDKEIPAVTVTTWLAYTFVNEDIGTIKPAFGEKITPEMQEEITLQPAVLVDATDVT
VTVVGETPTMTTGETPTTFTSSGPDPKGQQVLKLGVGEGQYPGTNPQRASLHEAIQSADERPSVLILGSDKATSNRVKTV
KNVKVYRGRDPLEVRDMMRRGKILVIALSRVDNALLKFVDYKGTFLTRETLEALSLGRPKKKNITKAEAQWLLRLEDQME
ELPDWFAAGEPIFLEANIKHDRYHLVGDIATIKEKAKQLGATDSTKISKEVGAKVYSMKLSNWVMQEENKQSNLTPLFEE
LLQQCPPGGQNKTAHMVSAYQLAQGNWMPTSCHVFMGTISARRTKTHPYEAYVKLRELVEEHKMKTLCPGSSLRNDNEWV
IGKIKYQGNLRTKHMLNPGKVAEQLHREGHRHNVYNKTIGSVMTATGIRLEKLPVVRAQTDTTNFHQAIRDKIDKEENLQ
TPGLHKKLMEVFNALKRPELESSYDAVEWEELERGINRKGAAGFFERKNIGEILDSEKIKVEEIIDNLKKGRNIKYYETA
IPKNEKRDVNDDWTAGDFVDEKKPRVIQYPEAKTRLAITKVMYKWVKQKPVVIPGYEGKTPLFQIFDKVKKEWDQFQNPV
AVSFDTKAWDTQVTTNDLELIKDIQKYYFKKKWHKFIDTLTMHMSEVPVITADGEVYIRKGQRGSGQPDTSAGNSMLNVL
TMVYAFCEATGVPYKSFDRVAKIHVCGDDGFLITERALGEKFASKGVQILYEAGKPQKITEGDKMKVAYQFADIEFCSHT
PIQVRWSDNTSSYMPGRNTTTILAKMATRLDSSGERGTIAYEKAVAFSFLLMYSWNPLIRRICLLVLSTELQVKPGKSTT
YYYEGDPISAYKEVIGHNLFDLKRTSFEKLAKLNLSMSVLGAWTRHTSKRLLQDCVNMGVKEGNWLVNADRLVSSKTGNR
YVPGEGHTLQGRHYEELALARKQINSFQGTDRYNLGPIVNMVLRRLRVMMMTLIGRGV
>Q65900 ~~~~~~Genome polyprotein~~~
MGSQVSTQRSGSHENSNSASEGSTINYTTINYYKDAYAASAGRQDMSQDPKKFTDPVMDVIHEMAPPLKSPSAEACGYSD
RVAQLTIGNSTITTQEAANIIIAYGEWPEYCKDADATAVDKPTRPDVSVNRFFTLDTKSWAKDSKGWYWKFPDVLTEVGV
FGQNAQFHYLYRSGFCVHVQCNASKFHQGALLVAILPEYVLGTIAGGDGNENSHPPYVTTQPGQVGAVLTNPYVLDAGVP
LSQLTVCPHQWINLRTNNCATIIVPYMNTVPFDSALNHCNFGLIVVPVVPLDFNAGATSEIPITVTIAPMCAEFAGLRQA
IKQGIPTELKPGTNQFLTTDDGVSAPILPGFHPTPAIHIPGEVRNLLEICRVETILEVNNLQSNETTPMQRLCFPVSVQS
KTGELCAVFRADPGRNGPWQSTILGQLCRYYTQWSGSLEVTFMFAGSFMATGKMLIAYTPPGGGVPADRLTAMLGTHVIW
DFGLQSSVTLVIPWISNTHYRAHAKDGYFDYYTTGTITIWYQTNYVVPIGAPTTAYIVALAAAQDNFTMKLCKDTEDIEQ
SANIQGDGIADMIDQAVTSRVGRALTSLQVEPTAANTNASEHRLGTGLVPALQAAETGASSNAQDENLIETRCVLNHHST
QETTIGNFFSRAGLVSIITMPTTGTQNTDGYVNWDIDLMGYAQMRRKCELFTYMRFDAEFTFVAAKPNGELVPQLLQYMY
VPPGAPKPTSRDSFAWQTATNPSIFVKLTDPPAQVSVPFMSPASAYQWFYDGYPTFGAHPQSNDADYGQCPNNMMGTFSI
RTVGTEKSPHSITLRVYMRIKHVRAWIPRPLRNQPYLFKTNPNYKGNDIKCTSTSRDKITTLGKFGQQSGAIYVGNYRVV
NRHLATHNDWANLVWEDSSRDLLVSSTTAQGCDTIARCDCQTGVYYCSSRRKHYPVSFSKPSLIFVEASEYYPARYQSHL
MLAVGHSEPGDCGGILRCQHGVVGIVSTGGNGLVGFADVRDLLWLDEEAMEQGVSDYIKGLGDAFGTGFTDAVSREVEAL
KNHLIGSEGAVEKILKNLIKLISALVIVIRSDYDMVTLTATLALIGCHGSPWAWIKAKTASILGIPIAQKQSASWLKKFN
DMANAAKGLEWISNKISKFIDWLKEKIIPAAKEKVEFLNNLKQLPLLENQISNLEQSAASQEDLEAMFGNVSYLAHFCRK
FQPLYATEAKRVYALEKRMNNYMQFKSKHRIEPVCLIIRGSPGTGKSLATGIIARAIADKYHSSVYSLPPDPDHFDGYKQ
QVVTVMDDLCQNPDGKDMSLFCQMVSTVDFIPPMASLEEKGVSFTSKFVIASTNASNIIVPTVSDSDAIRRRFYMDCDIE
VTDSYKTDLGRLDAGRAARLCSENNTANFKRCSPLVCGKAIQLRDRKSKVRYSVDTVVSELIREYNNRYAIGNTIEALFQ
GPPKFRPIRISLEEKPAPDAISDLLASVDSEEVRQYCRDQGWIIPETPTNVERHLNRAVLIMQSIATVVAVVSLVYVIYK
LFAGFQGAYSGAPKQTLKKPILRTATVQGPSLDFALSLLRRNIRQVQTDQGHFTMLGVRDRLAVLPRHSQPGKTIWVEHK
LINILDAVELVDEQGVNLELTLVTLDTNEKFRDITKFIPENISAASDATLVINTEHMPSMFVPVGDVVQYGFLNLSGKPT
HRTMMYNFPTKAGQCGGVVTSVGKVIGIHIGGNGRQGFCAGLKRSYFASEQGEIQWVKPNKETGRLNINGPTRTKLEPSV
FHDVFEGNKEPAVLHSRDPRLEVDFEQALFSKYVGNTLHEPDEYIKEAALHYANQLKQLDINTSQMSMEEACYGTENLEA
IDLHTSAGYPYSALGIKKRDILDPTTRDVSKMKFYMDKYGLDLPYSTYVKDELRSIDKIKKGKSRLIEASSLNDSVYLRM
AFGHLYETFHANPGTITGSAVGCNPDTFWSKLPILLPGSLFAFDYSGYDASLSPVWFRALELVLREVGYSEEAVSLIEGI
NHTHHVYRNKTYCVLGGMPSGCSGTSIFNSMINNIIIRTLLIKTFKGIDLDELNMVAYGDDVLASYPFPIDCLELARTGK
EYGLTMTPADKSPCFNEVNWGNATFLKRGFLPDEQFPFLIHPTMPMKEIHESIRWTKDARNTQDHVRSLCLLAWHNGKQE
YEKFVSTIRSVPVGKALAIPNYENLRRNWLELF
>Q9QF31 ~~~~~~Genome polyprotein~~~
MGSQVSTQRSGSHENSNSASEGSTINYTTINYYKDAYAASAGRQDMSQDPKKFTDPVMDVIHEMAPPLKSPSAEACGYSD
RVAQLTIGNSTITTQEAANIVIAYGEWPEYCPDTDATAVDKPTRPDVSVNRFFTLDTKSWAKDSKGWYWKFPDVLTEVGV
FGQNAQFHYLYRSGFCVHVQCNASKFHQGALLVAVLPEYVLGTIAGGTGNENSHPPYATTQPGQVGAVLTHPYVLDAGIP
LSQLTVCPHQWINLRTNNCATIIVPYMNTVPFDSALNHCNFGLLVVPVVPLDFNAGATSEIPITVTIAPMCAEFAGLRQA
VKQGIPTELKPGTNQFLTTDDGVSAPILPGFHPTPPIHIPGEVHNLLEICRVETILEVNNLKTNETTPMQRLCFPVSVQS
KTGELCAAFRADPGRDGPWQSTILGQLCRYYTQWSGSLEVTFMFAGSFMATGKMLIAYTPPGGNVPADRITAMLGTHVIW
DFGLQSSVTLVVPWISNTHYRAHARAGYFDYYTTGIITIWYQTNYVVPIGAPTTAYIVALAAAQDNFTMKLCKDTEDIEQ
TANIQGDPIADMIDQTVNNQVNRSLTALQVLPTAADTEASSHRLGTGVVPALQAAETGASSNASDKNLIETRCVLNHHST
QETAIGNFFSRAGLVSIITMPTTGTQNTDGYVNWDIDLMGYAQLRRKCELFTYMRFDAEFTFVVAKPNGELVPQLLQYMY
VPPGAPKPTSRDSFAWQTATNPSVFVKMTDPPAQVSVPFMSPASAYQWFYDGYPTFGEHLQANDLDYGQCPNNMMGTFSI
RTVGTEKSPHSITLRVYMRIKHVRAWIPRPLRNQPYLFKTNPNYKGNDIKCTSTSRDKITTLGKFGQQSGAIYVGNYRVV
NRHLATHNDWANLVWEDSSRDLLVSSTTAQGCDTIARCNCQTGVYYCSSKRKHYPVSFTKPSLIFVEASEYYPARYQSHL
MLAVGHSEPGDCGGILRCQHGVVGIVSTGGNGLVGFADVRDLLWLDEEAMEQGVSDYIKGLGDAFGVGFTDAVSREVEAL
KNHLIGSEGAVEKILKNLVKLISALVIVVRSDYDMVTLTATLALIGCHGSPWAWIKAKTASILGIPIVQKQSASWLKKFN
DMANAAKGLEWISSKISKFIDWLKEKIIPAAKEKVEFLNNLKQLPLLENQISNLEQSAASQEDLEAMFGNVSYLAHFCRK
FQPLYATEAKRVYALEKRMNNYMQFKSKHRIEPVCLIIRGSPGTGKSLATGIIARAIADKYHSSVYSLPPDPDHFDGYKQ
QVVTVMDDLCQNPDGKDMSLFCQMVSTVDFIPPMASLEEKGVSFTSKFVIASTNASNIVVPTVSDSDAIRRRFYMDCDIE
VTDSYKTDLGRLDAGRAAKLCTENNTANFKRCSPLVCGKAIQLRDRKSKVRYSIDTVVSELIREYNNRSAIGNTIEALFQ
GPLKFKPIRISLEEKPAPDAISDLLASVDSEEVRQYCREQGWIIPETPTNVERHLNRAVLVMQSIATVVAVVSLVYVIYK
LFAGFQGAYSGAPKQALKKPVLRTATVQGPSLDFALSLLRRNIRQVQTDQGHFTMLGVRDRLAILPRHSQPGKTIWVEHK
LINVLDAVELVDEQGVNLELTLVTLDTNEKFRDVTKFIPETITGASDATLVINTEHMPSMFVPVGDVVQYGFLNLSGKPT
HRTMMYNFPTKAGQCGGVVTSVGKIIGIHIGGNGRQGFCAGLKRGYFASEQGEIQWMKPNKETGRLNINGPTRTKLEPSV
FHDVFEGNKEPAVLTSKDPRLEVDFEQALFSKYVGNTLHEPDEYVTQAALHYANQLKQLDININKMSMEEACYGTEYLEA
IDLHTSAGYPYSALGVKKRDILDPITRDTTKMKFYMDKYGLDLPYSTYVKDELRSLDKIKKGKSRLIEASSLNDSVYLRM
TFGHLYETFHANPGTVTGSAVGCNPDVFWSKLPILLPGSLFAFDYSGYDASLSPVWFRALEVVLREIGYSEEAVSLIEGI
NHTHHVYRNKTYCVLGGMPSGCSGTSIFNSMINNIIIRTLLIKTFKGIDLDELNMVAYGDDVLASYPFPIDCSELAKTGK
EYGLTMTPADKSPCFNEVTWENATFLKRGFLPDHQFPFLIHPTMPMREIHESIRWTKDARNTQDHVRSLCLLAWHNGKEE
YEKFVSTIRSVPIGKALAIPNFENLRRNWLELF
>P22055 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHENQNVAANGSTINYTTINYYKDSASNSATRQDLSQDPSKFTEPVKDLMLKTAPALNSPNVEACGYSD
RVRQITLGNSTITTQEAANAIVAYGEWPTYINDSEANPVDAPTEPDVSSNRFYTLESVSWKTTSRGWWWKLPDCLKDMGM
FGQNMYYHYLGRSGYTIHVQCNASKFHQGALGVFLIPEFVMACNTESKTSYVSYINANPGERGGEFTNTYNPSNTDVSEG
RQFAALDYLLGSGVLAGNAFVYPHQIINLRTNNSATIVVPYVNSLVIDCMAKHNNWGIVILPLAPLAFAATSSPQVPITV
TIAPMCTEFNGLRNITIPVHQGLPTMNTPGSNQFLTSDDFQSPCALPNFDVTPPIHIPGEVKNMMELAEIDTLIPMNAVD
GKVNTMEMYQIPLNDNLSKAPIFCLSLSPASDKRLSHTMLGEILNYYTHWTGSIRFTFLFCGSMMATGKLLLSYSPPGAK
PPTNRKDAMLGTHIIWDLGLQSSCSMVAPWISNTVYRRCARDDFTEGGFITCFYQTRIVVPASTPTSMFMLGFVSACPDF
SVRLLRDTSHISQSKLIARTQGIEDLIDTAIKNALRVSQPLRPSQLKQPNGVNSQEVPALTAVETGASGQAIPSDVVETR
HVINYKTRSESCLESFFGRAACVTILSLTNSSKSGEEKKHFNIWNITYTDTVQLRRKLEFFTYSRFDLEMTFVFTENYPS
TASGEVRNQCDQIMYIPPGAPRPSSWDDYTWQSSSNPSIFYMYGNAPPRMSIPYVGIANAYSHFYDGFARVPLEGENTDA
GDTFYGLVSINDFGVLAVRAVNRSNPHTIHTSVRVYMKPKHIRCWCPRPPRAVLYRGEGVDMISSAIQPLTKVDSITTFG
FGHQNKAVYVAGYKICNYHLATPSDHLNAISVLWDRDLMVVESRAQGTDTIARCSCRCGVYYCESRRKYYLVTFTGPTFR
FMEANDYYPARYQSHMLIGCGFAEPGDCGGILRCTHGVIGIITAGGEGIVAFADIRDLWVYEEEAMEQGITSYIESLGTA
FGAGFTHTISEKVTELTTMVTSTITEKLLKNLVKIVSALVIVVRNYEDTTTILATLALLGCDISPWQWLKKKACDLLEIP
HVMRQGDGWMKKFTEACNAAKGLRWVSNKISKFVDWLKCKIIPEAKDKVEFLTKLKQLDMLENQIATIHQSCPSQEQQEI
LFNNVRWLAVQSRRFAPLYAVEARRISKMESTINNYIQFKSKHRIEPVCMLVHGSPGTGKGIASSLIGRAIAERETTSVY
SVPLAPSHFDGYKQQGYDMDDLNQNPDGMDMKLFCQMVSTVEFIPPMASLEEKGILFTSDYVLASTNSHSIAPPTVAHSD
ALTRRFAFDVEVYTMSEHSVKGKLNMATATQLCKDCPTPANFKKCCPLVCGKALQLMDRYTRQRFTVDEITTLIMNEKNR
RANIGNCMEALFQGPLRYKDLKIDVKTVPPPECISDLLQAVDSQEVRDYCEKKGWIVNVTSQIQLERNINRAMTILQAVT
TFAAVAGVVYVMYKLFAGQQGAYTGLPNKKPNVPTIRIAKVQGPGFDYAVAMAKRNIVTATTTKGEFTMLGVHDNVAILP
THAAPGETIIVDGKEVEILDARALEDQAGTNLEITIITLKRNEKFRDIRPHIPTQITETNDGVLIVNTSKYPNMYVPVGA
VTEQGYLNLSGRQTARTLMYNFPTRAGQCGGIITCTGKVIGMHVGGNGSHGFAAALKRSYFTQNQGEIQWMRSSKEVGYP
IINAPSKTKLEPSAFHYVFEGVKEPAVLTKNDPRLKTDFEEAIFSKYVGNKITEVDEYMKEAVDHYAGQLMSLDINTEQM
CLEDAMYGTDGLEALDLSTSAGYPYVAMGKKKRDILDKQTRDTKEMQRLLDTYGINLPLVTYVKDELRSKTKVEQGKSRL
IEASSLNDSVAMRMAFGNLYAAFHKNPGVVTGSAVGCDPDLFWSKIPVLMEEKLFAFDYTGYDASLSPAWFEALKMVLEK
IGFGNRVDYIDYLNHSHHLYKNKTYCVKGGMPSGCSGTSIFNSMINNLIIRTLLLRTYKGIDLDHLKMIAYGDDVIASYP
HEVDASLLAQSGKDYGLTMTPADKSATFETVTWENVTFLKRFFRADEKYPFLVHPVMPMKEIHESIRWTKDPRNTQDHVR
SLCLLAWHNGEEEYNKFLAKIRSVPIGRALLLPEYSTLYRRWLDSF
>P21404 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETSLSAAGNSIIHYTNINYYKDAASNSANRQDFTQDPSKFTEPVKDVMIKSLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVGYGRWPTYLRDDEATAEDQPTQPDVATCRFYTLDSIKWEKGSVGWWWKFPEALSDMGL
FGQNMQYHYLGRAGYTIHLQCNASKFHQGCLLVVCVPEAEMGGAVVGQAFSATAMANGDKAYEFTSATQSDQTKVQTAIH
NAGMGVGVGNLTIYPHQWINLRTNNSATIVMPYINSVPMDNMFRHYNFTLMVIPFVKLDYADTASTYVPITVTVAPMCAE
YNGLRLAQAQGLPTMNTPGSTQFLTSDDFQSPCALPQFDVTPSMNIPGEVKNLMEIAEVDSVVPVNNVQDTTDQMEMFRI
PVTINAPLQQQVFGLRLQPGLDSVFKHTLLGEILNYYAHWSGSMKLTFVFCGSAMATGKFLIAYSPPGANPPKTRKDAML
GTHIIWDIGLQSSCVLCVPWISQTHYRLVQQDEYTSAGYVTCWYQTGMIVPPGTPNSSSIMCFASACNDFSVRMLRDTPF
ISQDNKLQGDVEEAIERARCTVADTMRTGPSNSASVPALTAVETGHTSQVTPSDTMQTRHVKNYHSRSESTVENFLGRSA
CVYMEEYKTTDKHVNKKFVAWPINTKQMVQMRRKLEMFTYLRFDMEVTFVITSRQDPGTTLAQDMPVLTRQIMYVPPGGP
IPAKVDDYAWQTSTNPSIFWTEGNAPARMSIPFISIGNAYSNFYDGWSNFDQRGSYGYNTLNNLGHIYVRHVSGSSPHPI
TSTIRVYFKPKHTRAWVPRPPRLCQYKKAFSVDFTPTPITDTRKDINTVTTVAQSRRRGDMSTLNTHGAFGQQSGAVYVG
NYRVINRHLATHTDWQNCVWEDYNRDLLVSTTTAHGCDVIARCQCTTGVYFCASKNKHYPVSFEGPGLVEVQESEYYPKR
YQSHVLLAAGFSEPGDCGGILRCEHGVIGIVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICE
QVNLLKESLVGQDSILEKSLKALVKIISALVIVVRNHDDLITVTAILALIGCTSSPWRWLKQKVSQYYGIPMAERQNDSW
LKKFTEMTNACKRMEWIAIKIQKFIEWLKVKILPEVREKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFA
HYCRKYAPLYAAEAKRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHF
DGYKQQAVVIMDDLCQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHF
DMNIEVISMYSQNGKINMPMSVKTCDEECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATL
EALFQGPPIYREIKISVAPETPPPPVIADLLKSVDSEDVREYCKEKGWLIPEVNSTLQIEKYVSRAFICLQAITTFVSVA
GIIYIIYKLFAGFQGAYTGIPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPG
PTILMNDQEVGVMDAKELVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEMEVNEAVLAINTSKFPNMYIPVGQVTDYGF
LNLGGTPTKRMLMYNFPTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGFPIINTPS
KTKLEPSVFHQVFEGVKEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAVDHYAGQLATLDISTEPMKLEDAV
YGTEGLEALDLTTSAGYPYVALGIKKRDILSKKTRDLTKLKECMDKYGLNLPMITYVKDQLRSAEKVAKGKSRLIEASSL
NDSVAMRQTFGNLYKTFHLNPGIVTGSAVGCDPDLFWSKIPVMLNGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYSHK
ETNYIDYLCNSHHLYRDKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDA
SLLAEAGKDYGLIMTPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLL
AWHNGEHEYEEFIRKIRSVPVGRCLTLPAFSTLRRKWLDSF
>Q9YLG5 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLSASGGSIIHYTNINYYKDAASNSANRQDFTQDPGKFTEPVKDIMIKSMPALNSPSAEECGYSD
RVRSITLGNSTITTQECANVVVGYGTWPRYLSDKEATAEDQPTQPDVATCRFYTLSSVQWQRESAGWWWKFPDALSDMGL
FAQNMMYHYLGRTGYTIHVQCNASKFHQGCLLVVCVPEAEMGCTNKENTPLFEKLCGQDNAKEFTREGPTISKGATDVQT
AVCNAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYINSVPMDSMIRHNNFTLMIIPFVPLDYVNGSSPYIPITVTVAPM
SAEYNGLRLASTQGLPTMLTPGSNQFLTSDDFQSPSAMPQFDVTPEMNIPGRVHNLMEIAEVDSVVPLNNIQDNLRKMDI
YRVQVSSQTSQGAQVFGFSLQPGASSVLQRTLLGEILNYYTHWSGSLKLTFVFCGSAMATGKFLLAYSPPGAGVPPDRKK
AMLGTHVIWDVGLQSSCVLCVPWISQTHYRYTVKDEYTDSGYITCWYQTNVIAPADALSTCYIMCMVSACNDFSVRMLRD
TRFIKQTAFYQSPVEESIERSIGRVADTIGSGPSNSEAIPVLTAVETGHTSQVTPSDTMQTRHVHNYHSRSESSVENFLA
RSACVFYTTYTNSKNAAKEKKFATWKVSVRQAAQLRRKLELFTYLRCDIELTFVITSAQDPSTATNLDVPVLTHQIMYVP
PGGPVPETVDDYNWQTSTNPSLFWTEGNAPPRMSIPFMSIGNAYSMFYDGWSEFRHDGVYGLNTLNNMGTIYARHVNADN
PGSITSTVRIYFKPKHVKAWIPRPPRLAQYLKANNVNFKITDVTEKRDSLTTTGAFGQQSGAVYVGNYRVVNRHLATHID
WQNCVWEDYNRDLLVSTTTAHGCDTIARCQCTSGVYYCASKNKHYPVVFEGPGMVEVQESEYYPKRYQSHVLLAAGFSEP
GDCGGILRCEHGVIGVVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDS
ILEKSLKALVRIISALVIVVRNHDDIITVTATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNGWLKKFTEMTNACKGM
EWIAVKIQKFIEWLKVKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEA
KRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDL
CQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNG
KINMPMSVKTCDDECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPIYREIK
ISVAPETPPPPAIADLLKSVDSEVVREYCKEKGWLVPEVNSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQ
GAYTGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVSVLD
AKELVDKDGTNLELVLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKRMLMY
NFPTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGFPVINAPSRTKLEPSVFHQVFE
GNKEPAVLRNGDPRLKANFEEAIFSKYIGNVNTRVDEYMLEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTS
AGYPYVALGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLY
KAFHLNPGIVTGSAVGCDPDMFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYIDYLCNSHHL
YRDKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIM
TPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIR
KIRSVPVGRCLSLPAFSTLRRKWLDSF
>P03313 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETRLNASGNSIIHYTNINYYKDAASNSANRQDFTQDPGKFTEPVKDIMIKSLPALNSPTVEECGYSD
RARSITLGNSTITTQECANVVVGYGVWPDYLKDSEATAEDQPTQPDVATCRFYTLDSVQWQKTSPGWWWKLPDALSNLGL
FGQNMQYHYLGRTGYTVHVQCNASKFHQGCLLVVCVPEAEMGCATLDNTPSSAELLGGDTAKEFADKPVASGSNKLVQRV
VYNAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYTNSVPMDNMFRHNNVTLMVIPFVPLDYCPGSTTYVPITVTIAPMC
AEYNGLRLAGHQGLPTMNTPGSCQFLTSDDFQSPSAMPQYDVTPEMRIPGEVKNLMEIAEVDSVVPVQNVGEKVNSMEAY
QIPVRSNEGSGTQVFGFPLQPGYSSVFSRTLLGEILNYYTHWSGSIKLTFMFCGSAMATGKFLLAYSPPGAGAPTKRVDA
MLGTHVIWDVGLQSSCVLCIPWISQTHYRFVASDEYTAGGFITCWYQTNIVVPADAQSSCYIMCFVSACNDFSVRLLKDT
PFISQQNFFQGPVEDAITAAIGRVADTVGTGPTNSEAIPALTAAETGHTSQVVPGDTMQTRHVKNYHSRSESTIENFLCR
SACVYFTEYKNSGAKRYAEWVLTPRQAAQLRRKLEFFTYVRFDLELTFVITSTQQPSTTQNQDAQILTHQIMYVPPGGPV
PDKVDSYVWQTSTNPSVFWTEGNAPPRMSIPFLSIGNAYSNFYDGWSEFSRNGVYGINTLNNMGTLYARHVNAGSTGPIK
STIRIYFKPKHVKAWIPRPPRLCQYEKAKNVNFQPSGVTTTRQSITTMTNTGAFGQQSGAVYVGNYRVVNRHLATSADWQ
NCVWESYNRDLLVSTTTAHGCDIIARCQCTTGVYFCASKNKHYPISFEGPGLVEVQESEYYPRRYQSHVLLAAGFSEPGD
CGGILRCEHGVIGIVTMGGEGVVGFADIRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSIL
EKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNSWLKKFTEMTNACKGMEW
IAVKIQKFIEWLKVKILPEVREKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEAKR
VFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQ
NPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKI
NMPMSVKTCDDECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGTTLEALFQGPPVYREIKIS
VAPETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEINSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQGA
YTGVPNQKPRVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVGVLDAK
ELVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTEYGFLNLGGTPTKRMLMYNF
PTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGFPVINTPSKTKLEPSVFHQVFEGN
KEPAVLRSGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTSAG
YPYVALGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSIEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKT
FHLNPGVVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKMLLEKLGYTHKETNYIDYLCNSHHLYR
DKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKGYGLIMTP
ADKGECFNEVTWTNATFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIRKI
RSVPVGRCLTLPAFSTLRRKWLDSF
>Q66282 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLNASGNSIIHYTNINYYKDAASNSANRQDFTQDPSKFTEPVKDIMIKSLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVGYGVWPDYLKDSEATAEDQPTQPDVATCRFYTLDSVQWQKTSPGWWWKLPDALSNLGL
FGQNMQYHYLGRTGYTIHVQCNASKFHQGCLLVVCVPEAEMGCATLNNTPSSAELLGGDSAKEFADKPVASGSNKLVQRV
VYNAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYTNSVPMDNMFRHNNVTLMVIPFVPLDYCPGSTTYVPITITIAPMC
AEYNGLRLAGHQGLPTMNTPGSCQFLTSDDFQSPSAMPQYDVTPEMRIPGEVKNLMEIAEVDSVVPVQNVGEKVNSMEAY
QIPVRSNEGSGTQVFGFPLQPGYSSVFSRTLLGEILNYYTHWSGSIKLTFMFCGSAMATGKFLLAYSPPGAGAPTKRVDA
MLGTHVVWDVGLQSSCVLCIPWISQTHYRYVASDEYTAGGFITCWYQTNIVVPADAQSSCYIMCFVSACNDFSVRLLKDT
PFISQQNFFQGPVEDAITAAIGRVADTVGTGPTNSEAIPALTAAETGHTSQVVPSDTMQTRHVKNYHSRSESTIENFLCR
SACVYFTEYENSGAKRYAEWVITPRQAAQLRRKLEFFTYVRFDLELTFVITSTQQPSTTQNQDAQILTHQIMYVPPGGPV
PDKVDSYVWQTSTNPSVFWTEGNAPPRMSVPFLSIGNAYSNFYDGWSEFSRNGVYGINTLNNMGTLYARHVNAGSTGPIK
STIRIYFKPKHVKAWIPRPPRLCQYEKAKNVNFQPSGVTTTRQSITTMTNTGAFGQQSGAVYVGNYRVVNRHLATSADWQ
NCVWENYNRDLLVSTTTAHGCDIIARCRCTTGVYFCASKNKHYPISFEGPGIVEVQESEYYPRRYQSHVLLAAGFSEPGD
CGGILRCEHGVIGIVTMGGEGVVGFADIRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSIL
EKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNGWLKKFTEMTNACKGMEW
IAIKIQKFIEWLKVKILPEVREKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYASEAKR
VFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQ
KPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKI
NMPMSVKTCDEECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVYREIKIS
VAPETPPPPRIADLLKSVDSEAVREYCKEKGWLVPEVNSTLQIEKHVSRAFICLQAITTFVSVAGIIYIIYKLFAGFQGA
YTGIPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVGVLDAK
ELVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKRMLMYNF
PTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKEAGFPIINTPSKTKLEPSVFHQVFEGD
KEPAVLRNGDPRLKVNFEEAIFSKYIGNVNTHVDEYMMEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTSAG
YPYVALGIKKRDILSKKTRDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKT
FHLNPGVVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYSHKETNYIDYLCNSHHLYR
DKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIMTP
ADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIRKI
RSVPVGRCLTLPAFSTIRRKWLDSF
>P08292 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETSLSASGNSIIHYTNINYYKDAASNSANRQDFTQDPSKFTEPVKDVMIKSLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVGYGVWPDYLSDEEATAEDQPTQPDVATCRFYTLNSVKWEMQSAGWWWKFPDALSEMGL
FGQNMQYHYLGRSGYTIHVQCNASKFHQGCLLVVCVPEAEMGCTNAENAPAYGDLCGGETAKSFEQNAATGKTAVQTAVC
NAGMGVGVGNLTIYPHQWINLRTNNSATIVMPYINSVPMDNMFRHNNFTLMIIPFAPLDYVTGASSYIPITVTVAPMSAE
YNGLRLAGHQGLPTMLTPGSTQFLTSDDFQSPSAMPQFDVTPEMNIPGQVRNLMEIAEVDSVVPINNLKANLMTMEAYRV
QVRSTDEMGGQIFGFPLQPGASSVLQRTLLGEILNYYTHWSGSLKLTFVFCGSAMATGKFLLAYSPPGAGAPDSRKNAML
GTHVIWDVGLQSSCVLCVPWISQTHYRYVVDDKYTASGFISCWYQTNVIVPAEAQKSCYIMCFVSACNDFSVRMLRDTQF
IKQTNFYQGPTEESVERAMGRVADTIARGPSNSEQIPALTAVETGHTSQVDPSDTMQTRHVHNYHSRSESSIENFLCRSA
CVIYIKYSSAESNNLKRYAEWVINTRQVAQLRRKMEMFTYIRCDMELTFVITSHQEMSTATNSDVPVQTHQIMYVPPGGP
VPTSVNDYVWQTSTNPSIFWTEGNAPPRMSIPFMSIGNAYTMFYDGWSNFSRDGIYGYNSLNNMGTIYARHVNDSSPGGL
TSTIRIYFKPKHVKAYVPRPPRLCQYKKAKSVNFDVEAVTAERASLITTGPYGHQSGAVYVGNYKVVNRHLATHVDWQNC
VWEDYNRDLLVSTTTAHGCDTIARCQCTTGVYFCASKSKHYPVSFEGPGLVEVQESEYYPKRYQSHVLLATGFSEPGDCG
GILRCEHGVIGLVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSILEK
SLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKHKVSQYYGIPMAERQNNGWLKKFTEMTNACKGMEWIA
VKIQKFIEWLKVKILPEVKEKHEFLSRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEAKRVF
SLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQNP
DGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKINM
PMSVKTCDEECCPVNFKRCCPLVCGKAIQFIDRKTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVYREIKISVT
PETPPPPVIADLLKSVDRQAIREYCKEKGWLVPEIDSILQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQGAYT
GMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVGVLDAKEL
IDRDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGRVTDYGFLNLGGTPTKRMLMYNFPT
RAGQCGGVLMSTGKVLGIHVGGNGHQGFSAGLLKHYFNDEQGEIEFIESSKDAGFPVINTPSRTKLEPSVFHHVFEGNKE
PAVLRNGDPRLKVNFEEAIFFKYIGNVNTHVDEYMLEAVDHYAGQLATLDINTEPMKLEDAVYGTEGLEALDLTTSAGYP
YVALGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKAFH
LNPGIVTGSAVGCDPDVFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYIDYLCNSHHLYRDK
HYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIMTPAD
KGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIQKIRS
VPVGRCLTLPAFSTLRRKWLDSF
>Q03053 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLRASGNSIIHYTNINYYKDAASNSANRQEFAQDPGKFTEPVKDIMIKSMPALNSPSAEECGYSD
RVRSITLGNSTITTQECANVVVGYGTWPTYLKDEEATAEDQPTQPDVATCRFYTLESVMWQQSSPGWWWKFPDALSNMGL
FGQNMQYHYLGRAGYTVHVQCNASKFHQGCLLVVCVPEAEMGCATLANKPDQKSLSNGETANTFDSQNTTGQTAVQANVI
NAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYINSVPMDNMFRHNNFTLMIIPFAPLSYSTGATTYVPITVTVAPMCAE
YNGLRLAGKQGLPTMLTPGSNQFLTSDDFQSPSAMPQFDVTPEMAIPGQVNNLMEIAEVDSVVPVNNTEGKVSSIEAYQI
PVQSNSTNGSQVFGFPLIPGASSVLNRTLLGEILNYYTHWSGSIKLTFMFCGSAMATGKFLLAYSPPGAGAPTTRKEAML
GTHVIWDVGLQSSCVLCIPWISQTHYRYVVVDEYTAGGYITCWYQTNIVVPADTQSDCKILCFVSACNDFSVRMLKDTPF
IKQDSFYQGPPGEAVERAIARVADTISSGPVNSESIPALTAAETGHTSQVVPADTMQTRHVKNYHSRSESTVENFLCRSA
CVYYTTYKNHGTDGDNFAYWVINTRQVAQLRRKLEMFTYARFDLELTFVITSTQEQSTIQGQDSPVLTHQIMYVPPGGPV
PTKINSYSWQTSTNPSVFWTEGSAPPRISIPFISIGNAYSMFYDGWAKFDKQGTYGINTLNNMGTLYMRHVNDGSPGPIV
STVRIYFKPKHVKTWVPRPPRLCQYQKAGNVNFEPTGVTESRTEITAMQTTGVLGQQTGAICIGNYRVVNRHLATSEDWQ
RCVWEDYNRDLLVSTTTAHGCDTIARCRCSTGVYFCASRNKHYPVSFEGPGLVEVQESEYYPKRYQSHVLLAAGFSEPGD
CGGILRCEHGVIGLVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSIL
EKSLKALVKIISALVIVVRNHDDLVTITATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNNWLKKFTEMTNACKGMEW
IAVKIQKFIDWLKVKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEAKR
VFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQ
NPDGGDISLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKI
NMPMSVRTCDEECCPVNFKRCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPIYREIKIS
VAPDTPPPPAIADLLKSVDSEAVREYCREKGWLVPEINSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQGA
YTGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVGVVDAK
ELVDKDGTNLELTLLKLSRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKRMLMYNF
PTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKEAGLPVINTPSKTKLEPSVFHQVFEGN
KEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTSAG
YPYVALGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKT
FHLNPGIVTGSAVGCDPDLFWSKIPVLLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYIDYLCNSHHLYR
DKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIMTP
ADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIKKI
RSVPVGRCLTLPAFSTLRRKWLDSF
>Q9QL88 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETALNAQGNSVIHYTNINYYKDAASNSANRQDFTQDPSKFTEPVKDVMIKSLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVAYGVWPDYLHDDEATAEDQPTQPDVATCRFYTLDSVSWQSSSAGWWWKFPDALSNMGL
FGQNMQYHYLGRSGYTIHVQCNASKFHQGCLLVVCVPEAEMGCSNLNNAPLAADLSAGEVARQFTVEPANGQNQVQTAVH
NAAMGVAVGNLTIFPHQWINLRTNNSATIVMPYINSVPMDNMFRHNNFTLMIIPFAKLAYSDGASTFVPITVTIAPMNAE
YNGLRLAGHQGLPVMTTPGSTQFLTSDDFQSPCAMPQFDVTPEMNIPGQVNNLMEIAEVDSVVPVNNTETNVNGMDAYRI
PVQSNMDTGGQVFGFPLQPGASSVFQRTLLGEILNYYTHWSGSIKLTFMFCGSAMATGKFLLAYSPPGAGAPKSRKDAML
GTHVIWDVGLQSSCVLCIPWISQTHYRFVVADEYTAGGFITCWYQTNVIVPLGAQSNCSILCFVSACNDFSVRMLRDTKF
ISQTAFYQSPVEGAIERAIARVADTMPSGPTNSEAVPALTAVETGHTSQVVPSDNMQTRHVKNYHSRSETSVENFLCRSA
CVYFTTYKNQTGATNRFASWVITTRQVAQLRRKLEMFTYLRFDIELTFVITSAQDQSTISQDAPVQTHQIMYVPPGGPVP
TKVDDYAWQTSTNPSVFWTEGNAPPRMSVPFMSIGNAYSTFYDGWSDFSNKGIYGLNTLNNMGTLYIRHVNGPNPVPITS
TVRIYFKPKHVKAWVPRPPRLCQYKTSRQVNFTVTGVTESRANITTMNTTGAFGQQSGAAYVGNYRVVNRHLATHADWQN
CVWEDYNRDLLVSTTTAHGCDVIARCQCNTGVYFCASRNKHYPVTFEGPGLVEVQESEYYPKRYQSHVLLAAGFSEPGDC
GGILRCEHGVIGLVTMGGEGVVGFADVRDLLWLEDDAMEQGVRDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSILE
KSLKALVKIISALVIVVRNHDDLITVTATLALIGCTTSPWRWLKQKVSQYYGIPMAERQSNGWLKKFTEMTNACKGMEWI
AIKIQKFIEWLKARILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEAKRV
FSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATSLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQN
PDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKIN
MPMSVKTCDEECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVYREIKISV
APETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEINSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQGAY
TGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGVYDRWAVLPRHAKPGPTILMNDQEVGVLDAKE
LVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKRMLMYNFP
TRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGFPIINTPSKTKLEPSVFHQVFEGNK
EPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAVDHYAGQLATLDINTEPMKLEDAVYGTEGLEALDLTTSAGY
PYVALGIKKRDILSKKSKDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKTF
HLNPGIVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYIDYLCNSHHLYRD
KHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKGYGLIMTPA
DKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIRKVR
SVPVGRCLTLPAFSTLRRKWLDSF
>P27909 ~~~~~~Genome polyprotein~~~
MNNQRKKTGRPSFNMLKRARNRVSTGSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWSSFKKNGAIKV
LRGFKKEISSMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCE
DTMTYKCPRITEAEPDDVDCWCNATDTWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETW
ALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKNKPT
LDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDANFVCRRTFVDRGWGNGCGLFGKGSLLTCAKFK
CVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTIATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLT
MKEKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQTSGTTTIFA
GHLKCRLKMDKLTLKGTSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSTQDEKGVTQNGRLITANPIVTD
KEKPVNIETEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLVHQVF
GTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEV
HTWTEQYKFQADSPKRLSAAIGRAWEEGVCGIRSATRLENIMWKQISNELNHILLENDIKFTVVVGNANGILAQGKKMIR
PQPMEHKYSWKSWGKAKIIGADIQNTTFIIDGPDTPECPDEQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQMCDHRLMS
AAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKMYGGPISQHNYRPGYFTQTA
GPWHLGKLELDFDLCEGTTVVVDEHCGSRGPSLRTTTVTGKIIHEWCCRSCTLPPLRFRGEDGCWYGMEIRPVKEKEENL
VRSMVSAGSGEVDSFSLGILCVSIMIEEVMRSRWSRKMLMTGTLAVFLLLIMGQLTWNDLIRLCIMVGANASDKMGMGTT
YLALMATFKMRPMFAVGLLFRRLTSREVLLLTIGLSLVASVELPNSLEELGDGLAMGIMMLKLLTEFQPHQLWTTLLSLT
FIKTTLSLDYAWKTTAMVLSIVSLFPLCLSTTSQKTTWLPVLLGSFGCKPLTMFLITENEIWGRKSWPLNEGIMAIGIVS
ILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGTSHNILVEVQDDGTMKIKDEERDDTL
TILLKATLLAVSGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQDGV
FHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNTGEEVQVIAVEPGKNPKNVQTTPGTFKTPEGEVGA
IALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFKKRNLTIMDLHPGSGKTRRYL
PAIVREAIKRKLRTLILAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGREIVDLMCHATFTMRLLSPVRVPNYNMIIMD
EAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPS
IKSGNDIANCLRKNGKRVIQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILA
GPMPVTVASAAQRRGRIGRNQNKEGDQYVYMGQPLNNDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYR
LRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDP
LALREFKEFAAGRRSVSGDLILEIGKLPQHLTLRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTG
GVTLFFLSGKGLGKTSIGLLCVTASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMIL
TVAANEMGLLETTKKDLGIGHVAAENHQHATILDVDLHPASAWTLYAVATTVITPMMRHTIENTTANISLTAIANQAAIL
MGLDKGWPISKMDLGVPLLALGCYSQVNPLTLTAAVLMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDL
DPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGL
AFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIMEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVE
RNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLHSGKDVFFMPPEKCDTLLCDI
GESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNI
VSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNEHKSTWHYDEDNPYKTWAYHGS
YEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPRAKRGTAQIMEVTAKWLWGFLSRN
KKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKA
KGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQ
NEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEVQLIRQMESEGI
FFPSELESPNLAERVLDWLEKHGAERLKRMAISGDDCVVKPIDDRFATALIALNDMGKVRKDIPQWEPSKGWNDWQQVPF
CSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWV
PTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEEVPYLGKREDQWCGSLIGLTARATWATNIQVAIN
QVRRLIGNENYLDYMTSMKRFKNESDPEGALW
>P33478 ~~~~~~Genome polyprotein~~~
MNNQRKKTARPSFNMLKRARNRVSTGSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKV
LRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQEREKSLLFKTSVGVNMCTLIAMDLGELCE
DTMTYKCPRITEAEPDDVDCWCNATDTWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQRVETW
ALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGSRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPT
LDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDANFVCRRTFVDRGWGNGCGLFGKGSLLTCAKFK
CVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTIATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLT
MKEKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQTSGTTTIFA
GHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSTQDEKGVTQNRLITANPIVTDK
EKPVNIETEPPFGESYIVVGAGEKALKQCWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLVHQVFG
TAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEVH
TWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVVGILAQGKKMIRP
QPMEHKYSWKSWGKAKIIGADIQNTTFIIDGPDTPECPDDQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQMCDHRLMSA
AIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCVWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTAG
PWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKIIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENLV
KSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLIMGQLTWNDLIRLCIMVGANASDRMGMGTTY
LALMATFKMRPMFAVGLLFRRLTSREVLLLTIGLSLVASVELPNSLEELGDGLAMGIMILKLLTDFQSHQLWATLLSLTF
VKTTFSLHYAWKTMAMVLSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLIAENKIWGRKSWPLNEGIMAVGIVSI
LLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTLT
ILLKATLLAVSGVYPLSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRIMQRGLLGRSQVGVGVFQDGVF
HTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNTGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGAI
ALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYLP
AIVREAIRRNVRTLILAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMDE
AHFTDPASIARRGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYEWITDFPGKTVWFVPSI
KSGNDIANCLRKNGKRVIQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILAG
PMPVTVASAAQRRGRIGRNQNKEGDQYVYMGQPLNNDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYRL
RGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEMWTKEGERKKLRPRWLDARTYSDPL
ALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGRAYRHAMEELPDTIETLMLLALIAVLTGG
VTLFFLSGKGLGKTSIGLLCVMASSVLLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMILT
VAANEMGLLETTKKDLGIGHVAAENHHHATMLDVDLRPASAWTLYAVATTVITPMMRHTIENTTANISLTAIANQAAILM
GLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVLMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDLD
PVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLA
FSLMKSLGGGRRGTGAKGKHWERNGKDRLNQLSKSEFNTYKRSGIMEVDRSEAKEGLKRGETTKHAVSRGTAKLRWFVER
NLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDIG
ESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNIV
SAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKHEHKSTWHYDEDNPYKTWAYHGSY
EVKPSGSASSMVNGVVKLLTKPWDAIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRNK
KPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQN
EAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGIF
SPSELETPNLAERVLDWLEKYGVERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPFC
SHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWVP
TSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDQWCGSLIGLTARATWATNIQVAINQ
VRRLIGNENYLDYMTSMKRFKNESDPEGALW
>P17763 ~~~~~~Genome polyprotein~~~
MNNQRKKTGRPSFNMLKRARNRVSTVSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIPPTAGILARWGSFKKNGAIKV
LRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTRGGEPHMIVSKQERGKSLLFKTSAGVNMCTLIAMDLGELCE
DTMTYKCPRITETEPDDVDCWCNATETWVTYGTCSQTGEHRRDKRSVALAPHVGLGLETRTETWMSSEGAWKQIQKVETW
ALRHPGFTVIALFLAHAIGTSITQKGIIFILLMLVTPSMAMRCVGIGNRDFVEGLSGATWVDVVLEHGSCVTTMAKDKPT
LDIELLKTEVTNPAVLRKLCIEAKISNTTTDSRCPTQGEATLVEEQDTNFVCRRTFVDRGWGNGCGLFGKGSLITCAKFK
CVTKLEGKIVQYENLKYSVIVTVHTGDQHQVGNETTEHGTTATITPQAPTSEIQLTDYGALTLDCSPRTGLDFNEMVLLT
MEKKSWLVHKQWFLDLPLPWTSGASTSQETWNRQDLLVTFKTAHAKKQEVVVLGSQEGAMHTALTGATEIQTSGTTTIFA
GHLKCRLKMDKLTLKGMSYVMCTGSFKLEKEVAETQHGTVLVQVKYEGTDAPCKIPFSSQDEKGVTQNGRLITANPIVTD
KEKPVNIEAEPPFGESYIVVGAGEKALKLSWFKKGSSIGKMFEATARGARRMAILGDTAWDFGSIGGVFTSVGKLIHQIF
GTAYGVLFSGVSWTMKIGIGILLTWLGLNSRSTSLSMTCIAVGMVTLYLGVMVQADSGCVINWKGRELKCGSGIFVTNEV
HTWTEQYKFQADSPKRLSAAIGKAWEEGVCGIRSATRLENIMWKQISNELNHILLENDMKFTVVVGDVSGILAQGKKMIR
PQPMEHKYSWKSWGKAKIIGADVQNTTFIIDGPNTPECPDNQRAWNIWEVEDYGFGIFTTNIWLKLRDSYTQVCDHRLMS
AAIKDSKAVHADMGYWIESEKNETWKLARASFIEVKTCIWPKSHTLWSNGVLESEMIIPKIYGGPISQHNYRPGYFTQTA
GPWHLGKLELDFDLCEGTTVVVDEHCGNRGPSLRTTTVTGKTIHEWCCRSCTLPPLRFKGEDGCWYGMEIRPVKEKEENL
VKSMVSAGSGEVDSFSLGLLCISIMIEEVMRSRWSRKMLMTGTLAVFLLLTMGQLTWNDLIRLCIMVGANASDKMGMGTT
YLALMATFRMRPMFAVGLLFRRLTSREVLLLTVGLSLVASVELPNSLEELGDGLAMGIMMLKLLTDFQSHQLWATLLSLT
FVKTTFSLHYAWKTMAMILSIVSLFPLCLSTTSQKTTWLPVLLGSLGCKPLTMFLITENKIWGRKSWPLNEGIMAVGIVS
ILLSSLLKNDVPLAGPLIAGGMLIACYVISGSSADLSLEKAAEVSWEEEAEHSGASHNILVEVQDDGTMKIKDEERDDTL
TILLKATLLAISGVYPMSIPATLFVWYFWQKKKQRSGVLWDTPSPPEVERAVLDDGIYRILQRGLLGRSQVGVGVFQEGV
FHTMWHVTRGAVLMYQGKRLEPSWASVKKDLISYGGGWRFQGSWNAGEEVQVIAVEPGKNPKNVQTAPGTFKTPEGEVGA
IALDFKPGTSGSPIVNREGKIVGLYGNGVVTTSGTYVSAIAQAKASQEGPLPEIEDEVFRKRNLTIMDLHPGSGKTRRYL
PAIVREAIRRNVRTLVLAPTRVVASEMAEALKGMPIRYQTTAVKSEHTGKEIVDLMCHATFTMRLLSPVRVPNYNMIIMD
EAHFTDPASIAARGYISTRVGMGEAAAIFMTATPPGSVEAFPQSNAVIQDEERDIPERSWNSGYDWITDFPGKTVWFVPS
IKSGNDIANCLRKNGKRVVQLSRKTFDTEYQKTKNNDWDYVVTTDISEMGANFRADRVIDPRRCLKPVILKDGPERVILA
GPMPVTVASAAQRRGRIGRNQNKEGDQYIYMGQPLNNDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYR
LRGEARKTFVELMRRGDLPVWLSYKVASEGFQYSDRRWCFDGERNNQVLEENMDVEIWTKEGERKKLRPRWLDARTYSDP
LALREFKEFAAGRRSVSGDLILEIGKLPQHLTQRAQNALDNLVMLHNSEQGGKAYRHAMEELPDTIETLMLLALIAVLTG
GVTLFFLSGRGLGKTSIGLLCVIASSALLWMASVEPHWIAASIILEFFLMVLLIPEPDRQRTPQDNQLAYVVIGLLFMIL
TAAANEMGLLETTKKDLGIGHAAAENHHHAAMLDVDLHPASAWTLYAVATTIITPMMRHTIENTTANISLTAIANQAAIL
MGLDKGWPISKMDIGVPLLALGCYSQVNPLTLTAAVFMLVAHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIVAIDL
DPVVYDAKFEKQLGQIMLLILCTSQILLMRTTWALCESITLATGPLTTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGL
AFSLMKSLGGGRRGTGAQGETLGEKWKRQLNQLSKSEFNTYKRSGIIEVDRSEAKEGLKRGEPTKHAVSRGTAKLRWFVE
RNLVKPEGKVIDLGCGRGGWSYYCAGLKKVTEVKGYTKGGPGHEEPIPMATYGWNLVKLYSGKDVFFTPPEKCDTLLCDI
GESSPNPTIEEGRTLRVLKMVEPWLRGNQFCIKILNPYMPSVVETLEQMQRKHGGMLVRNPLSRNSTHEMYWVSCGTGNI
VSAVNMTSRMLLNRFTMAHRKPTYERDVDLGAGTRHVAVEPEVANLDIIGQRIENIKNGHKSTWHYDEDNPYKTWAYHGS
YEVKPSGSASSMVNGVVRLLTKPWDVIPMVTQIAMTDTTPFGQQRVFKEKVDTRTPKAKRGTAQIMEVTARWLWGFLSRN
KKPRICTREEFTRKVRSNAAIGAVFVDENQWNSAKEAVEDERFWDLVHRERELHKQGKCATCVYNMMGKREKKLGEFGKA
KGSRAIWYMWLGARFLEFEALGFMNEDHWFSRENSLSGVEGEGLHKLGYILRDISKIPGGNMYADDTAGWDTRITEDDLQ
NEAKITDIMEPEHALLATSIFKLTYQNKVVRVQRPAKNGTVMDVISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMESEGI
FSPSELETPNLAERVLDWLKKHGTERLKRMAISGDDCVVKPIDDRFATALTALNDMGKVRKDIPQWEPSKGWNDWQQVPF
CSHHFHQLIMKDGREIVVPCRNQDELVGRARVSQGAGWSLRETACLGKSYAQMWQLMYFHRRDLRLAANAICSAVPVDWV
PTSRTTWSIHAHHQWMTTEDMLSVWNRVWIEENPWMEDKTHVSSWEDVPYLGKREDRWCGSLIGLTARATWATNIQVAIN
QVRRLIGNENYLDFMTSMKRFKNESDPEGALW
>P29990 ~~~~~~Genome polyprotein~~~
MNDQRKKAKNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLYMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRSAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTEDGVNMCTLMAMDLGELCE
DTITYKCPLLRQNEPEDIDCWCNSTSTWVTYGTCTTMGEHRRQKRSVALVPHVGMGLETRTETWMSSEGAWKHVQRIETW
ILRHPGFTMMAAILAYTIGTTHFQRALIFILLTAVTPSMTMRCIGMSNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTESRCPTQGEPSLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFR
CKKNMEGKVVQPENLEYTIVITPHSGEEHAVGNDTGKHGKEIKITPQSSTTEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MENKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLNWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVTLVLVGIVTLYLGVMVQADSGCVVSWKNKELKCGSGIFITDNV
HTWTEQYKFQPESPSKLASAIQKAHEEGICGIRSVTRLENLMWKQITPELNHILSENEVKLTIMTGDIKGIMQAGKRSLR
PQPTELKYSWKTWGKAKMLSTESHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLKLKEKQDVFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKNCHWPKSHTLWSNGVLESEMIIPKNLAGPVSQHNYRPGYHTQIT
GPWHLGKLEMDFDFCDGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VNSLVTAGHGQVDNFSLGVLGMALFLEEMLRTRVGTKHAILLVAVSFVTLIIGNMSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKALMMTTIGIVLSSQSTTPETILELTDALALGMMVLKMVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAVVSVSPLFLTSSQQKTDWIPLALTIKGLNPTAIFLTTLSRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGPLTVCYVLTGRSADLELERAADVKWEDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPMGKAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFKTNAGTIGA
VSLDFSPGTSGSPIIDKKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKRRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIIDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KAGNDIAACLSKNGKKVIQLSRKTFDSEYAKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPKNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARTTFVDLMRRGDLPVWLAYRVAAEGINYADRRWCFDGVKNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKARDALDNLAVLHTAEAGGRAYNHALSELPETLETLLLLTLLATVTGG
ILLFLMSGRGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVVIAILTVVAA
TMANEMGFLEKTKKDLGLGSIATQQPESNILDIDLRPASAWTLYAVATTFVTPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKMDIGVPLLAIGCYSQVNPTTLTAALFLLVAHYAIIGPALQAKASREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEVLTLATGPISTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTNARRGTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
MVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFIPPEKCDTLLCDIGE
SSPNPTVEAGRTLRVLNLVENWLNNNTQFCIKVLNPYMPSVIEKMEALQRKYGGALVRNPLSRNSTHEMYWVSNASGNIV
SSVNMISRMLINRFTMRYKKATYEPDVDLGSGTRNIGIESEIPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMVNGVFRLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTRKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDKERNLHLEGKCETCVYNIMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILRDVSKKEGGAMYADDTAGWDTRITLEDLKN
EAMVTNHMEGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGVF
KSIQHLTITEEIAVQNWLARVGRERLSRMAISGDDCVVKPLDDRLPSALTALNDTGKIRKDIQQWEPSRGWNDWTQVPFC
SHHFHELIMKDGRVLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYDQMWSLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHAKHEWMTTEDMLTVWNRVWIQENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQAAINQ
VRSLIGNEEYTDYMPSMKRFRREEEEAGVLW
>P29991 ~~~~~~Genome polyprotein~~~
MNDQRKEAKNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLYMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRSAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTEVGVNMCTLMAMDLGELCE
DTITYKCPLLRQNEPEDIDCWCNSTSTWVTYGTCTTMGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHVQRIETW
ILRHPGFTMMAAILAYTIGTTHFQRALILILLTAVTPSMTMRCIGMSNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTESRCPTQGEPSLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFR
CKKNMEGKVVQPENLEYTIVITPHSGEEHAVGNDTGKHGKEIKITPQSSITEAELTGYGTITMECSPRTGLDFNEIVLLQ
MENKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLNWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVTLVLVGIVTLYLGVMVQADSGCVVSWKNKELKCGSGIFITDNV
HTWTEQYKFQPESPSKLASAIQKAHEEDICGIRSVTRLENLMWKQITPELNHILSENEVKLTIMTGDIKGIMQAGKRSLR
PQPTELKYSWKTWGKAKMLSTESHNQTFFIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLKLKEKQDVFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKNCHWPKSHTLWSNGVLESEMIIPKNLAGPVSKHNYRPGYHTQIT
GPWHLGKLEMDFDFCDGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VNSLVTAGHGQVDNFSLGVLGMALFLEEMLRTRVGTKHAILLVAVSFVTLIIGNRSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMTTIGIVLSSQSTIPETILELTDALALGMMVLKMVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAVVSVSPLFLTSSQQKTDWIPLALTIKGLNPTAIFLTTLSRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGLLTVCYVLTGRSADLELERAADVKWEDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRRGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPMGKAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALDPGKNPRAVQTKPGLFKTNAGTIGA
VSLDFSPGTSGSPIIDKKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKRRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIIDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KAGNDIAACLRKNGKKVIQLSRKTFDSEYVKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPKNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARTTFVDLMRRGDLPVWLAYRVAAEGINYADRRWCFDGVKNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKARNALDNLAVLHTAEAGGRAYNHALSELPETLETLLLLTLLATVTGG
IFLFLMSARGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVVIAILTVVAA
TMANEMGFLEKTKKDLGLGSIATQQPESNILDIDLRPASAWTLYAVATTFVTPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKMDIGVPLLAIGCYSQVNPTTLTAALFLLVAHYAIIGPALQAKASREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEALTLATGPISTLSEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTNTRRVTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
MVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFIPPEKCDTLLCDIGE
SSPNPTVEAGRTLRVLNLVENWLNNNTQFCIKVLNPYMPSVIEKMEALQRKYGGALVRNPLSRNSTHEMYWVSNASGNIV
SSVNMISRMLINRFTMRYKKATYEPDVDLGSGTRNIGIESEIPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMVNGVFRLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTRKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDKERNLHLEGKCETCVYNIMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILRDVSKKEGGAMYADDTAGWDTRITLEDLKN
EEMVTNHMEGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGVF
KSIQHLTITEEIAVQNWLARVGRERLSRMAISGDDCVVKPLDDRLPSALTALNDMGKIRKDIQQWEPSRGWNDWTQVPFC
SHHFHELIMKDGRVLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYAQMWSLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHAKHEWMTTEDMLTVWNRVWIQENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQAAINQ
VRSLIGNEEYTDYMPSMKRFRREEEEAGVLW
>P14337 ~~~~~~Genome polyprotein~~~
MNNQRKKAKNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRSAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTEDGVNMCTLMAMDLGELCE
DTITYKCPLLRQNEPEDIDCWCNSTSTWVTYGTCTTTGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHAQRIETW
ILRHPGFTIMAAILAYTIGTTHFQRALIFILLTAVAPSMTMRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTESRCPTQGEPSLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFT
CKKNMEGKIVQPENLEYTIVVTPHSGEEHAVGNDTGKHGKEIKVTPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MENKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLNWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVSLVLVGIVTLYLGVMVQADSGCVVSWKNKELKCGSGIFITDNV
HTWTEQYKFQPESPSKLASAIQKAQEEGICGIRSVTRLENLMWKQITPELNHILAENEVKLTIMTGDIKGIMQAGKRSLR
PQPTELKYSWKTWGKAKMLSTESHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLKLKEKQDAFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKNCHWPKSHTLWSNGVLESEMIIPKNLAGPVSQHNYRPGYHTQIA
GPWHLGKLEMDFDFCDGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VNSLVTAGHGQVDNFSLGVLGMALFLEEMLRTRVGTKHAILLVAVSFVTLITGNMSFKDLGRVVVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMTTIGIVLLSQSTIPETILELTDALALGMMVLKMVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAVVSVSPLLLTSSQQKTDWIPLALTIKGLNPTAIFLTTLSRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGLLTVCYVLTGRSADLELERAADVKWEDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPMGKAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFKTNAGTIGA
VSLDFSPGTSGSPIIDKKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKRRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIIDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KAGNDIAACLRKNGKKVIQLSRKTFDSEYVKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPKNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARKTFVDLMRRGDLPVWLAYKVAAEGINYADRRWCFDGIKNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKTRDALDNLAVLHTAEAGGRAYNHALSELPETLETLLLLTLLATVTGG
IFLFLMSGRGIGKMTLGMCCIITASVLLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVVIAILTVVAA
TMANEMGFLEKTKKDLGLGSIATQQPESNILDIDLRPASAWTLYAVATTFVTPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKMDIGVPLLAIGCYSQVNPITLTAALLLLVAHYAIIGPGLQAKATREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEALTLATGPISTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTNTRRGTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
MVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFIPPEKCDTLLCDIGE
SSPSPTVEAGRTLRVLNLVENWLNNNTQFCIKVLNPYMPSVIEKMETLQRKYGGALVRNPLSRNSTHEMYWVSNASGNIV
SSVNMISRMLINRFTMRHKKATYEPDVDLGSGTRNIGIESEIPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMVNGVVRLLTKPWDVLPTVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTRKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDKERNLHLEGKCETCVYNMMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILRDVSKKEGGAMYADDTAGWDTRITLEDLKN
EEMVTNHMEGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGVF
KNIQHLTVTEEIAVQNWLARVGRERLSRMAISGDDCVVKPLDDRFASALTALNDMGKIRKDIQQWEPSRGWNDWTQVPFC
SHHFHELIMKDGRVLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYAQMWSLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHAKHEWMTTEDMLTVWNRVWIQENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQAAINQ
VRSLIGNEEYTDYMPSMKRFRREEEEAGVLW
>P07564 ~~~~~~Genome polyprotein~~~
MNNQRKKARSTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRTAGVIIMLIPTAMAFHLTTRNGEPHMIVGRQEKGKSLLFKTEDGVNMCTLMAIDLGELCE
DTITYKCPLLRQNEPEDIDCWCNSTSTWVTYGTCATTGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHVQRIETW
ILRHPGFTIMAAILAYTIGTTHFQRALIFILLTAVAPSMTMRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTESRCPTQGEPSLNEEQDKRFLCKHSMVDRGWGNGCGLFGKGGIVTCAMFT
CKKNMEGKVVLPENLEYTIVITPHSGEEHAVGNDTGKHGKEIKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MEDKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKIVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLNWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVSLVLVGVVTLYLGAMVQADSGCVVSWKNKELKCGSGIFITDNV
HTWTEQYKFQPESPSKLASAIQKAHEEGICGIRSVTRLENLMWKQITPELNHILSENEVKLTIMTGDIKGIMQAGKRSLR
PQPTELKYSWKTWGKAKMLSTESHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLKLREKQDVFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKMEKASFIEVKSCHWPKSHTLWSNGVLESEMIIPKNFAGPVSQHNYRPGYHTQTA
GPWHLGKLEMDFDFCEGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VNSLVTAGHGQIDNFSLGVLGMALFLEEMLRTRVGTKHAILLVAVSFVTLITGNMSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMATIGIALLSQSTIPETILELTDALALGMMVLKIVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAAVSVSPLLLTSSQQKADWIPLALTIKGLNPTAIFLTTLSRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGLLTVCYVLTGRSADLELERAADVKWEDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGVFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPVGKAELEDGAYRIKQRGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFKTNTGTIGA
VSLDFSPGTSGSPIVDRKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKKRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIMDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KAGNDIAACLRKNGKKVIQLSRKTFDSEYVKTRANDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPKNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARKTFVDLMRRGDLPVWLAYRVAAEGINYADRRWCFDGIKNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKARDALDNLAVLHTAEAGGRAYNHALSELPETLETLLLLTLLATVTGG
IFLFLMSGKGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVVIAILTVVAA
TMANEMGFLEKTKKDLGLGSITTQESESNILDIDLRPASAWTLYAVATTFVTPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKIHIGVPLLAIGCYSQVNPITLTAALLLLVAHYAIIGPGLQAKATREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLILCVTQVLMMRTTWALCEALTLATGPISTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTNTRRGTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
MVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFTPPEKCDTLLCDIGE
SSPNPTIEAGRTLRVLNLVENWLNNNTQFCIKVLNPYMPSVIEKMETLQRKYGGALVRNPLSRNSTHEMYWVSNASGNIV
SSVNMISRMLINRFTMKHKKATYETDVDLGSGTRNIGIESEIPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMVNGVVRLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTRKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDRERNLHLEGKCETCVYNMMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRGNSLSGVEGEGLHKLGYILRDVSKKEGGAMYADDTAGWDTRITLEDLKN
EEMVTNHMEGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGIF
KSIQHLTVTEEIAVQNWLARVGRERLSRMAISGDDCVVKPLDDRFASALTALNDMGKVRKDIQQWEPSRGWNDWTQVPFC
SHHFHELVMKDGRVLVVPCRNQDELIGRARISQGAGWSLKETACLGKSYAQMWTLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHAKHEWMTTEDMLAVWNRVWIQENPWMEDKTPVESWEEVPYLGKREDQWCGSLIGLTSRATWAKNIQTAINQ
VRSLIGNEEYTDYMPSMKRFRREEEEAGVLW
>P14340 ~~~~~~Genome polyprotein~~~
MNNQRKKARNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRTAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTEDGVNMCTLMAMDLGELCE
DTITYKCPFLKQNEPEDIDCWCNSTSTWVTYGTCTTTGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHAQRIETW
ILRHPGFTIMAAILAYTIGTTHFQRALIFILLTAVAPSMTMRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIETEAKQPATLRKYCIEAKLTNTTTDSRCPTQGEPSLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFT
CKKNMKGKVVQPENLEYTIVITPHSGEEHAVGNDTGKHGKEIKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MENKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLNWFKKGSSIGQMIETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWIMKILIGVIITWIGMNSRSTSLSVSLVLVGVVTLYLGVMVQADSGCVVSWKNKELKCGSGIFITDNV
HTWTEQYKFQPESPSKLASAIQKAHEEGICGIRSVTRLENLMWKQITPELNHILSENEVKLTIMTGDIKGIMQAGKRSLQ
PQPTELKYSWKTWGKAKMLSTESHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLKLREKQDVFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKSCHWPKSHTLWSNGVLESEMIIPKNFAGPVSQHNYRPGYHTQTA
GPWHLGKLEMDFDFCEGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VNSLVTAGHGQIDNFSLGVLGMALFLEEMLRTRVGTKHAILLVAVSFVTLITGNMSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMTTIGIVLLSQSTIPETILELTDALALGMMVLKMVRKMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAVVSVSPLFLTSSQQKADWIPLALTIKGLNPTAIFLTTLSRTNKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGLLTVCYVLTGRSADLELERAADVKWEDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPVGKAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFKTNAGTIGA
VSLDFSPGTSGSPIIDKKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKRKLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIMDEEREIPERSWSSGHEWVTDFKGKTVWFVPSI
KAGNDIAACLRKNGKKVIQLSRKTFDSEYVKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPKNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARKTFVDLMRRGDLPVWLAYRVAAEGINYADRRWCFDGIKNNQILEENVEVEIWTKEGERKKLKPRWLDAKIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKARDALDNLAVLHTAEAGGRAYNHALSELPETLETLLLLTLLATVTGG
IFLFLMSGRGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVVIAILTVVAA
TMANEMGFLEKTKKDLGLGSITTQQPESNILDIDLRPASAWTLYAVATTFVTPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKMDIGVPLLAIGCYSQVNPITLTAALFLLVAHYAIIGPGLQAKATREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEALTLATGPISTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTNTRRGTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
MVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFTPPEKCDTLLCDIGE
SSPNPTVEAGRTLRVLNLVENWLNNNTQFCIKVLNPYMPSVIEKMEALQRKYGGALVRNPLSRNSTHEMYWLSNASGNIV
SSVNMISRMLINRFTMRHKKATYEPDVDLGSGTRNIGIESEIPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMGNGVVRLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTRKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDKERNLHLEGKCETCVYNMMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILRDVSKKEGGAMYADDTAGWDTRITLEDLKN
EEMVTNHMEGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGVF
KSIQHLTVTEEIAVQNWLARVGRERLSRMAISGDDCVVKPLDDRFASALTALNDMGKVRKDIQQWEPSRGWNDWTQVPFC
SHHFHELIMKDGRVLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYAQMWSLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHAKHEWMTTEDMLTVWNRVWIQENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQTAINQ
VRSLIGNEEYTDYMPSMKRFRKEEEEAGVLW
>P12823 ~~~~~~Genome polyprotein~~~
MNDQRKKARNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRTAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTKDGTNMCTLMAMDLGELCE
DTITYKCPFLKQNEPEDIDCWCNSTSTWVTYGTCTTTGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHAQRIETW
ILRHPGFTIMAAILAYTIGTTHFQRVLIFILLTAIAPSMTMRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTDSRCPTQGEPTLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFT
CKKNMEGKIVQPENLEYTVVITPHSGEEHAVGNDTGKHGKEVKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MKDKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKVVKEIAETQHGTIVIRVQYEGDGSPCKTPFEIMDLEKRHVLGRLTTVNPIVTE
KDSPVNIEAEPPFGDSYIIIGVEPGQLKLDWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVSLVLVGIVTLYLGVMVQADSGCVVSWKNKELKCGSGIFVTDNV
HTWTEQYKFQPESPSKLASAIQKAHEEGICGIRSVTRLENLMWKQITSELNHILSENEVKLTIMTGDIKGIMQVGKRSLR
PQPTELRYSWKTWGKAKMLSTELHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLRLREKQDAFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKSCHWPKSHTLWSNGVLESEMVIPKNFAGPVSQHNNRPGYHTQTA
GPWHLGKLEMDFDFCEGTTVVVTEDCGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VSSLVTAGHGQIDNFSLGILGMALFLEEMLRTRVGTKHAILLVAVSFVTLITGNMSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMTTIGIVLLSQSSIPETILELTDALALGMMVLKMVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTILAVVSVSPLFLTSSQQKADWIPLALTIKGLNPTAIFLTTLSRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDTPMTGPLVAGGLLTVCYVLTGRSADLELERATDVKWDDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPVGKAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFRTNTGTIGA
VSLDFSPGTSGSPIVDKKGKVVGLYGNGVVTRSGAYVSAIAQTEKSIEDNPEIEDDIFRKRRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPIRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIMDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KTGNDIAACLRKNGKRVIQLSRKTFDSEYVKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPRNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARKTFVDLMRRGDLPVWLAYKVAAEGINYADRRWCFDGTRNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFAAGRKSLTLNLITEMGRLPTFMTQKARDALDNLAVLHTAEAGGKAYNHALSELPETLETLLLLTLLATVTGGIFL
FLMSGRGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVIIAILTVVAATMA
NEMGFLEKTKKDLGLGNIATQQPESNILDIDLRPASAWTLYAVATTFITPMLRHSIENSSVNVSLTAIANQATVLMGLGK
GWPLSKMDIGVPLLAIGCYSQVNPITLTAALLLLVAHYAIIGPGLQAKATREAQKRAAAGIMKNPTVDGITVIDLDPIPY
DPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEALTLATGPVSTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLFSIM
KNTTSTRRGTGNIGETLGEKWKSRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERNLVT
PEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFVPPEKCDTLLCDIGESSP
NPTVEAGRTLRVLNLVENWLNNNTQFCVKVLNPYMPSVIERMETLQRKYGGALVRNPLSRNSTHEMYWVSNASGNIVSSV
NMISRMLINRFTMRHKKATYEPDVDLGSGTRNIGIESETPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSYETK
QTGSASSMVNGVVRLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKKTPR
MCTREEFTKKVRSNAALGAIFTDENKWKSAREAVEDSRFWELVDKERNLHLEGKCETCVYNMMGKREKKLGEFGKAKGSR
AIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILREVSKKEGGAMYADDTAGWDTRITIEDLKNEEM
ITNHMAGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGIFKSI
QHLTASEEIAVQDWLARVGRERLSRMAISGDDCVVKPLDDRFARALTALNDMGKVRKDIQQWEPSRGWNDWTQVPFCSHH
FHELIMKDGRTLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYAQMWSLMYFHRRDLRLAANAICSAVPSHWVPTSR
TTWSIHASHEWMTTEDMLTVWNKVWILENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQTAINQVRS
LIGNEEYTDYMPSMKRFRREEEEAGVLW
>Q9WDA6 ~~~~~~Genome polyprotein~~~
MNNQRKKARNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRTAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTKDGTNMCTLMAMDLGELCE
DTITYKCPFLKQNEPEDIDCWCNSTSTWVTYGTCTTTGEHRREKRSVALVPHVGMGLETRTETWMSSEGAWKHAQRIETW
ILRHPGFTIMAAILAYTIGTTHFQRVLIFILLTAIAPSMTMRCIGISNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPT
LDFELIKTEAKQPATLRKYCIEAKLTNTTTDSRCPTQGEPTLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFT
CKKNMEGKIVQPENLEYTVVITPHSGEEHAVGNDTGKHGKEVKITPQSSITEAELTGYGTVTMECSPRTGLDFNEMVLLQ
MEDKAWLVHRQWFLDLPLPWLPGADTQGSNWIQKETLVTFKNPHAKKQDVVVLGSQEGAMHTALTGATEIQMSSGNLLFT
GHLKCRLRMDKLQLKGMSYSMCTGKFKIVKEIAETQHGTIVIRVQYEGDGSPCKIPFEIMDLEKRHVLGRLITVNPIVTE
KDSPVNIEAEPPFGDSYIIIGAEPGQLKLDWFKKGSSIGQMFETTMRGAKRMAILGDTAWDFGSLGGVFTSIGKALHQVF
GAIYGAAFSGVSWTMKILIGVIITWIGMNSRSTSLSVSLVLVGIVTLYLGVMVQADSGCVVSWKNKELKCGSGIFVTDNV
HTWTEQYKFQPESPSKLASAIQKAHEEGICGIRSVTRLENLMWKQITSELNHILSENEVKLTIMTGDIKGIMQVGKRSLR
PQPTELRYSWKTWGKAKMLSTELHNQTFLIDGPETAECPNTNRAWNSLEVEDYGFGVFTTNIWLRLREKQDAFCDSKLMS
AAIKDNRAVHADMGYWIESALNDTWKIEKASFIEVKSCHWPKSHTLWSNGVLESEMVIPKNIAGPVSQHNNRPGYHTQTA
GPWHLGKLEMDFDFCEGTTVVVTEECGNRGPSLRTTTASGKLITEWCCRSCTLPPLRYRGEDGCWYGMEIRPLKEKEENL
VSSLVTAGHGQIDNFSLGILGMALFLEEMLRTRVGTKHAILLVAVSFLTLITGNMSFRDLGRVMVMVGATMTDDIGMGVT
YLALLAAFKVRPTFAAGLLLRKLTSKELMMTTIGIVLLSQSSIPETILELTDALALGMMVLKMVRNMEKYQLAVTIMAIL
CVPNAVILQNAWKVSCTTLAVVSVSPLLLTSSQQKADWIPLALTIKGLNPTAIFLTTLTRTSKKRSWPLNEAIMAVGMVS
ILASSLLKNDIPMTGPLVAGGLLTVCYVLTGRSADLELERATDVKWDDQAEISGSSPILSITISEDGSMSIKNEEEEQTL
TILIRTGLLVISGLFPVSIPITAAAWYLWEVKKQRAGVLWDVPSPPPVGRAELEDGAYRIKQKGILGYSQIGAGVYKEGT
FHTMWHVTRGAVLMHKGKRIEPSWADVKKDLISYGGGWKLEGEWKEGEEVQVLALEPGKNPRAVQTKPGLFRTNTGTIGA
VSLDFSPGTSGSPIVDKKGKVVGLYGNGVVTRGGAYVSAIAQTEKGIEDNPEIEDDIFRKRRLTIMDLHPGAGKTKRYLP
AIVREAIKRGLRTLILAPTRVVAAEMEEALRGLPIRYQTPAIRAEHTGREIVDLMCHATFTMRLLSPIRVPNYNLIIMDE
AHFTDPASIAARGYISTRVEMGEAAGIFMTATPPGSRDPFPQSNAPIMDEEREIPERSWNSGHEWVTDFKGKTVWFVPSI
KTGNDIAACLRKNGKRVIQLSRKTFDSEYVKTRTNDWDFVVTTDISEMGANFKAERVIDPRRCMKPVILTDGEERVILAG
PMPVTHSSAAQRRGRIGRNPRNENDQYIYMGEPLENDEDCAHWKEAKMLLDNINTPEGIIPSMFEPEREKVDAIDGEYRL
RGEARKTFVDLMRRGDLPVWLAYKVAAEGINYADRRWCFDGTRNNQILEENVEVEIWTKEGERKKLKPRWLDARIYSDPL
ALKEFKEFAAGRKSLTLNLITEMGRLPTFMTQKARDALDNLAVLHTAEAGGKAYNHALSELPETLETLLLLTLLATVTGG
IFLFLMSGRGIGKMTLGMCCIITASILLWYAQIQPHWIAASIILEFFLIVLLIPEPEKQRTPQDNQLTYVIIAILTVVAA
TMANEMGFLEKTKKDLGLGHIATQQPESNILDIDLRPASAWTLYAVATTFITPMLRHSIENSSVNVSLTAIANQATVLMG
LGKGWPLSKMDIGVPLLAIGCYSQVNPITLTAALLMLVAHYAIIGPGLQAKATREAQKRAAAGIMKNPTVDGITVIDLDP
IPYDPKFEKQLGQVMLLVLCVTQVLMMRTTWALCEALTLATGPVSTLWEGNPGRFWNTTIAVSMANIFRGSYLAGAGLLF
SIMKNTTSTRRGTGNMGETLGEKWKNRLNALGKSEFQIYKKSGIQEVDRTLAKEGIKRGETDHHAVSRGSAKLRWFVERN
LVTPEGKVVDLGCGRGGWSYYCGGLKNVREVKGLTKGGPGHEEPIPMSTYGWNLVRLQSGVDVFFVPPEKCDTLLCDIGE
SSPNPTVEAGRTLRVLNLVENWLNNNTQFCVKVLNPYMPSVIERMETLQRKYGGALVRNPLSRNSTHEMYWVSNASGNIV
SSVNMISRMLINRFTMRHKKATYEPDVDLGSGTRNIGIESETPNLDIIGKRIEKIKQEHETSWHYDQDHPYKTWAYHGSY
ETKQTGSASSMVNGVVRLLTKPWDVIPMVTQMAMTDTTPFGQQRVFKEKVDTRTQEPKEGTKKLMKITAEWLWKELGKKK
TPRMCTREEFTKKVRSNAALGAIFTDENKWKSAREAVEDNRFWELVDKERNLHLEGKCETCVYNMMGKREKKLGEFGKAK
GSRAIWYMWLGARFLEFEALGFLNEDHWFSRENSLSGVEGEGLHKLGYILREVSKKEGGAMYADDTAGWDTRITIEDLKN
EEMITNHMAGEHKKLAEAIFKLTYQNKVVRVQRPTPRGTVMDIISRRDQRGSGQVVTYGLNTFTNMEAQLIRQMEGEGVF
KSIQHLTASEEIAVQDWLVRVGRERLSRMAISGDDCVVKPLDDRFAKALTALNDMGKVRKDIQQWEPSRGWNDWTQVPFC
SHHFHELIMKDGRTLVVPCRNQDELIGRARISQGAGWSLRETACLGKSYAQMWSLMYFHRRDLRLAANAICSAVPSHWVP
TSRTTWSIHASHEWMTTEDMLTVWNRVWILENPWMEDKTPVESWEEIPYLGKREDQWCGSLIGLTSRATWAKNIQTAINQ
VRSLIGNEEYTDYMPSMKRFRREEEEVGVLW
>Q5UB51 ~~~~~~Genome polyprotein~~~
MNNQRKKTGKPSINMLKRVRNRVSTGSQLAKRFSRGLLNGQGPMKLVMAFIAFLRFLAIPPTAGVLARWGTFKKSGAIKV
LKGFKKEISNMLSIINKRKKTSLCLMMILPATLAFHLTSRDGEPRMIVGKNERGKSLLFKTASGINMCTLIAMDLGEMCD
DTVTYKCPLIAEVEPEDIDCWCNLTSTWVTYGTCNQAGEHRRDKRSVALAPHVGMGLDTRTQTWMSAEGAWRQVEKVETW
ALRHPGFTILALFLAHYIGTSLTQKVVIFILLMLVTPSMTMRCVGVGNRDFVEGLSGATWVDVVLEHGGCVTTMAKNKPT
LDIELQKTEATQLATLRKLCIEGKITNITTDSRCPTQGEAILPEEQDQNYVCKHTYVDRGWGNGCGLFGKGSLVTCAKFQ
CLEPIEGKVVQHENLKYTVIITVHTGDQHQVGNDTQGVTVEITPQASTVEAILPEYGTLGLECSPRTGLDFNEMILLTMK
NKAWMVHRQWFFDLPLPWTSGATTEAPTWNRKELLVTFKNAHAKKQEVVVLGSQEGAMHTALTGATEIQNSGGTSIFAGH
LKCRLKMDKLELKGMSYAMCLNTFVLKKEVSETQHGTILIKVEYKGEDAPCKIPFSTEDGQGKAHNGRLITANPVVTKKE
EPVNIEAEPPFGESNIVIGIGDKALKINWYKKGSSIGKMFEATARGARRMAILGDTAWDFGSVGGVLNSLGKMVHQIFGS
AYTALFSGVSWIMKIGIGVLLTWIGLNSKNTSMSFSCIAIGIITLYLGAVVQADMGCVINWKGKELKCGSGIFVTNEVHT
WTEQYKFQADSPKRLATAIAGAWENGVCGIRSTTRMENLLWKQIANELNYILWENNIKLTVVVGDIIGVLEQGKRTLTPQ
PMELKYSWKTWGKAKIVTAETQNSSFIIDGPNTPECPSASRAWNVWEVEDYGFGVFTTNIWLKLREVYTQSCDHRLMSAA
IKDERAVHADMGYWIESQKNGSWKLEKASFIEVKTCTWPKSHTLWSNGVLESDMIIPKSLAGPISQHNHRPGYHTQTAGP
WHLGKLELDFNYCEGTTVVITENCGTRGPSLRATTVSGKLIHEWCCRSCTLPPLRYMGEDGCWYGMEIRPVNEKEENMVK
SLVSAGSGKVDNFTMGVLCLAILFEEVMRGKFGKKHMIAGVLFTFVLLLSGQITWRDMAHTLIMIGSNASDRMGMGVTCL
ALIATFKIQPFLALGFFLRKLTSRENLLLGVGLAMATTLQLPEDIEQMANGIALGLMTLKLITQFETYQLWTALVSLTCS
NTIFTLTVAWRTATLILAGVSLLPVCQSSSMRKTDWLPMTVAAMGVPPLPLFIFSLKDALKRRSWPLNEGVMAVGLVSIL
ASSLLRNDVPMAGPLVAGGLLIACYVITGTSADLTVEKAADVTWEEEAEQTGVSHNLMITVDDDGTMRIKDDETENILTV
LLKTALLIVSGIFPYSIPATLLVWHTWQKQTQRSGVLWDVPSPPETQKAELEEGVYRIKQQGIFGKTQVGVGVQKEGVFH
TMWHVTRGAVLTHNGKRLEPNWASVKKDLISYGGGWRLSAQWQKGEEVQVIAVEPGKNPKNFQTMPGIFQTTTGEIGAIA
LDFKPGTSGSPIINREGKVVGLYGNGVVTKNGGYVSGIAQTNAEPDGPTPELEEEMFKKRNLTIMDLHPGSGKTRKYLPA
IVREAIKRRLRTLILAPTRVVAAEMEEALKGLPIRYQTTATKSEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDEA
HFTDPASIAARGYISTRVGMGEAAAIFMTATPPGTAEAFPQSNAPIQDEERDIPERSWNSGNEWITDFVGKTVWFVPSIK
AGNDIANCLRKNGKKVIQLSRKTFDTEYQKTKLNDWDFVVTTDISEMGANFKADRVIDPRRCLKPVILTDGPERVILAGP
MPVTAASAAQRRGRVGRNPQKENDQYIFTGQPLNNDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYRLK
GESRKTFVELMRRGDLPVWLAHKVASEGIKYTDRKWCFDGERNNQILEENMDVEIWTKEGERKKLRPRWLDARTYSDPLA
LKEFKDFAAGRKSIALDLVTEIGRVPSHLAHRTRNALDNLVMLHTSEHGGRAYRHAVEELPETMETLLLLGLMILLTGGA
MLFLISGKGIGKTSIGLICVIASSGMLWMADVPLQWIASAIVLEFFMMVLLIPEPEKQRTPQDNQLAYVVIGILTLAAIV
AANEMGLLETTKRNLGMSKEPGVVSPTSYLDVDLHPASAWTLYAVATTVITPMLRHTIENSTANVSLAAIANQAVVLMGL
DKGWPISKMDLGVPLLALGCYSQVNPLTLTAAVLLLVTHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIMTIDLDPV
IYDSKFEKQLGQVMLLVLCAVQLLLMKTSWALCEVLTLATGPITTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFS
IMKSVGTGKRGTGSQGETLGEKWKKKLNQLSRKEFDLYKKSGITEVDRTEAKEGLKRGEITHHAVSRGSAKLQWFVERNM
VIPEGRVIDLGCGRGGWSYYCAGLKKVTEVRGYTKGGPGHEEPVPMSTYGWNIVKLMSGKDVFYLPPEKCDTLLCDIGES
SPSPTVEESRTIRVLKMVEPWLKNNQFCIKVLNPYMPTVIEHLERLQRKHGGMLVRNPLSRNSTHEMYWISNGTGNIVSS
VNMVSRLLLNRFTMTHRRPTIEKDVDLGAGTRHVNAEPETPNMDVIGERIKRIKEEHSSTWHYDDENPYKTWAYHGSYEV
KATGSASSMINGVVKLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTPRPMPGTRKVMEITAEWLWRTLGRNKRP
RLCTREEFTKKVRTNAAMGAVFTEENQWDSARAAVEDEEFWKLVDRERELHKLGKCGSCVYNMMGKREKKLGEFGKAKGS
RAIWYMWLGARYLEFEALGFLNEDHWFSRENSYSGVEGEGLHKLGYILRDISKIPGGAMYADDTAGWDTRITEDDLHNEE
KITQQMDPEHRQLANAIFKLTYQNKVVKVQRPTPKGTVMDIISRKDQRGSGQVGTYGLNTFTNMEAQLVRQMEGEGVLSK
ADLENPHPLEKKITQWLETKGVERLKRMAISGDDCVVKPIDDRFANALLALNDMGKVRKDIPQWQPSKGWHDWQQVPFCS
HHFHELIMKDGRKLVVPCRPQDELIGRARISQGAGWSLKETACLGKAYAQMWSLMYFHRRDLRLASNAICSAVPVHWVPT
SRTTWSIHAHHQWMTTEDMLTVWNRVWIEDNPWMEDKTPVTTWEDVPYLGKREDQWCGSLIGLTSRATWAQNILIAIQQV
RSLIGDEEFLDYMPSMKRFRKEEESEGAIW
>P27915 ~~~~~~Genome polyprotein~~~
MNNQRKKTGKPSINMLKRVRNRVSTGSQLAKRFSRGLLNGQGPMKLVMAFIAFLRFLAIPPTAGVLARWGTFKKSGAIKV
LKGFKKEISNMLSIINKRKKTSLCLMMMLPATLAFHLTSRDGEPRMIVGKNERGKSLLFKTASGINMCTLIAMDLGEMCD
DTVTYKCPHITEVEPEDIDCWCNLTSTWVTYGTCNQAGEHRRDKRSVALAPHVGMGLDTRTQTWMSAEGAWRQVEKVETW
ALRHPGFTILALFLAHYIGTSLTQKVVIFILLMLVTPSMTMRCVGVGNRDFVEGLSGATWVDVVLEHGGCVTTMAKNKPT
LDIELQKTEATQLATLRKLCIEGKITNITTDSRCPTQGEAILPEEQDQNYVCKHTYVDRGWGNGCGLFGKGSLVTCAKFQ
CLESIEGKVVQHENLKYTVIITVHTGDQHQVGNETQGVTAEITSQASTAEAILPEYGTLGLECSPRTGLDFNEMILLTMK
NKAWMVHRQWFFDLPLPWTSGATTKTPTWNRKELLVTFKNAHAKKQEVVVLGSQEGAMHTALTGATEIQTSGGTSIFAGH
LKCRLKMDKLKLKGMSYAMCLNTFVLKKEVSETQHGTILIKVEYKGEDAPCKIPFSTEDGQGKAHNGRLITANPVVTKKE
EPVNIEAEPPFGESNIVIGIGDKALKINWYRKGSSIGKMFEATARGARRMAILGDTAWDFGSVGGVLNSLGKMVHQIFGS
AYTALFSGVSWIMKIGIGVLLTWIGLNSKNTSMSFSCIAIGIITLYLGVVVQADMGCVINWKGKELKCGSGIFVTNEVHT
WTEQYKFQADSPKRVATAIAGAWENGVCGIRSTTRMENLLWKQIANELNYILWENDIKLTVVVGDITGVLEQGKRTLTPQ
PMELKYSWKTWGLAKIVTAETQNSSFIIDGPSTPECPSASRAWNVWEVEDYGFGVFTTNIWLKLREVYTQLCDHRLMSAA
VKDERAVHADMGYWIESQKNGSWKLEKASLIEVKTCTWPKSHTLWSNGVLESDMIIPKSLAGPISQHNHRPGYHTQTAGP
WHLGKLELDFNYCEGTTVVISENCGTRGPSLRTTTVSGKLIHEWCCRSCTLPPLRYMGEDGCWYGMEIRPINEKEENMVK
SLASAGSGKVDNFTMGVLCLAILFEEVMRGKFGKKHMIAGVLFTFVLLLSGQITWRGMAHTLIMIGSNASDRMGMGVTYL
ALIATFKIQPFLALGFFLRKLTSRENLLLGVGLAMAATLRLPEDIEQMANGIALGLMALKLITQFETYQLWTALVSLTCS
NTIFTLTVAWRTATLILAGISLLPVCQSSSMRKTDWLPMTVAAMGVPPLPLFIFSLKDTLKRRSWPLNEGVMAVGLVSIL
ASSLLRNDVPMAGPLVAGGLLIACYVITGTSADLTVEKAADVTWEEEAEQTGVSHNLMITVDDDGTMRIKDDETENILTV
LLKTALLIVSGIFPYSIPATMLVWHTWQKQTQRSGVLWDVPSPPETQKAELEEGVYRIKQQGIFGKTQVGVGVQKEGVFH
TMWHVTRGAVLTHNGKRLEPNWASVKKDLISYGGGWRLSAQWQKGEEVQVIAVEPGKNPKNFQTMPGIFQTTTGEIGAIA
LDFKPGTSGSPIINREGKVVGLYGNGVVTKNGGYVSGIAQTNAEPDGPTPELEEEMFKKRNLTIMDLHPGSGKTRKYLPA
IVREAIKRRLRTLILAPTRVVAAEMEEAMKGLPIRYQTTATKSEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDEA
HFTDPASIAARGYISTRVGMGEAAAIFMTATPPGTADAFPQSNAPIQDEERDIPERSWNSGNEWITDFVGKTVWFVPSIK
AGNVIANCLRKNGKKVIQLSRKTFDTEYQKTKLNDWDFVVTTDISEMGANFIADRVIDPRRCLKPVILTDGPERVILAGP
MPVTVASAAQRRGRVGRNPQKENDQYIFMGQPLNKDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYRLK
GESRKTFVELMRRGDLPVWLAHKVASEGIKYTDRKWCFDGERNNQILEENMDVEIWTKEGEKKKLRPRWLDARTYSDPLA
LKEFKDFAAGRKSIALDLVTEIGRVPSHLAHRTRNALDNLVMLHTSEHGGRAYRHAVEELPETMETLLLLGLMILLTGGA
MLFLISGKGIGKTSIGLICVIASSGMLWMADVPLQWIASAIVLEFFMMVLLIPEPEKQRTPQDNQLAYVVIGILTLAAIV
AANEMGLLETTKRDLGMSKEPGVVSPTSYLDVDLHPASAWTLYAVATTVITPMLRHTIENSTANVSLAAIANQAVVLMGL
DKGWPISKMDLGVPLLALGCYSQVNPLTLIAAVLLLVTHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIMTIDLDPV
IYDSKFEKQLGQVMLLVLCAVQLLLMRTSWALCEVLTLATGPITTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLALS
IMKSVGTGKRGTGSQGETLGEKWKKKLNQLSRKEFDLYKKSGITEVDRTEAKEGLKRGEITHHAVSRGSAKLQWFVERNM
VIPEGRVIDLGCGRGGWSYYCAGLKKVTEVRGYTKGGPGHEEPVPMSTYGWNIVKLMSGKDVFYLPPEKCDTLLCDIGES
SPSPTVEESRTIRVLKMVEPWLKNNQFCIKVLNPYMPTVIEHLERLQRKHGGMLVRNPLSRNSTHEMYWISNGTGNIVSS
VNMVSRLLLNRFTMTHRRPTIEKDVDLGAGTRHVNAEPETPNMDVIGERIKRIKEEHSSTWHYDDENPYKTWAYHGSYEV
KATGSASSMINGVVKLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTPRPMPGTRKVMEITAEWLWRTLGRNKRP
RLCTREEFTKKVRTNAAMGAVFTEENQWDSARAAVEDEEFWKLVDRERELHKLGKCGSCVYNMMGKREKKLGEFGKAKGS
RAIWYMWLGARYLEFEALGFLNEDHWFSRENSYSGVEGEGLHKLGYILRDISKIPGGAMYADDTAGWDTRITEDDLHNEE
KITQQMDPEHRQLANAIFKLTYQNKVVKVQRPTPKGTVMDIISRKDQRGSGQVGTYGLNTFTNMEAQLIRQMEGEGVLSK
ADLENPHPLEKKITQWLETKGVERLKRMAISGDDCVVKPIDDRFANALLALNDMGKVRKDIPQWQPSKGWHDWQQVPFCS
HHFHELIMKDGRKLVVPCRPQDELIGRARISQGAGWSLRETACLGKAYAQMWTLMYFHRRDLRLASNAICSAVPVHWVPT
SRTTWSIHAHHQWMTTEDMLTVWNRVWIEDNPWMEDKTPVTTWEDVPYLGKREDQWCGSLIGLTSRATWAQNILTAIQQV
RSLIGNEEFLDYMPSMKRFRKEEESEGAIW
>Q6YMS4 ~~~pol~~~Genome polyprotein~~~
MNNQRKKTGKPSINMLKRVRNRVSTGSQLAKRFSKGLLNGQGPMKLVMAFIAFLRFLAIPPTAGVLARWGTFKKSGAIKV
LKGFKKEISNMLSIINQRKKTSLCLMMILPAALAFHLTSRDGEPRMIVGKNERGKSLLFKTASGINMCTLIAMDLGEMCD
DTVTYKCPHITEVEPEDIDCWCNLTSTWVTYGTCNQAGEHRRDKRSVALAPHVGMGLDTRTQTWMSAEGAWRQVEKVETW
ALRHPGFTILALFLAHYIGTSLTQKVVIFILLMLVTPSMTMRCVGVGNRDFVEGLSGATWVDVVLEHGGCVTTMAKNKPT
LDIELQKTEATQLATLRKLCIEGKITNITTDSRCPTQGEAVLPEEQDQNYVCKHTYVDRGWGNGCGLFGKGSLVTCAKFQ
CLEPIEGKVVQYENLKYTVIITVHTGDQHQVGNETQGVTAEITPQASTTEAILPEYGTLGLECSPRTGLDFNEMILLTMK
NKAWMVHRQWFFDLPLPWASGATTETPTWNRKELLVTFKNAHAKKQEVVVLGSQEGAMHTALTGATEIQNSGGTSIFAGH
LKCRLKMDKLELKGMSYAMCTNTFVLKKEVSETQHGTILIKVEYKGEDAPCKIPFSTEDGQGKAHNGRLITANPVVTKKE
EPVNIEAEPPFGESNIVIGIGDNALKINWYKKGSSIGKMFEATERGARRMAILGDTAWDFGSVGGVLNSLGKMVHQIFGS
AYTALFSGVSWVMKIGIGVLLTWIGLNSKNTSMSFSCIAIGIITLYLGAVVQADMGCVINWKGKELKCGSGIFVTNEVHT
WTEQYKFQADSPKRLATAIAGAWENGVCGIRSTTRMENLLWKQIANELNYILWENNIKLTVVVGDTLGVLEQGKRTLTPQ
PMELKYSWKTWGKAKIVTAETQNSSFIIDGPNTPECPSASRAWNVWEVEDYGFGVFTTNIWLKLREVYTQLCDHRLMSAA
VKDERAVHADMGYWIESQKNGSWKLEKASLIEVKTCTWPKSHTLWTNGVLESDMIIPKSLAGPISQHNYRPGYHTQTAGP
WHLGKLELDFNYCEGTTVVITESCGTRGPSLRTTTVSGKLIHEWCCRSCTLPPLRYMGEDGCWYGMEIRPISEKEENMVK
SLVSAGSGKVDNFTMGVLCLAILFEEVLRGKFGKKHMIAGVFFTFVLLLSGQITWRDMAHTLIMIGSNASDRMGMGVTYL
ALIATFKIQPFLALGFFLRKLTSRENLLLGVGLAMATTLQLPEDIEQMANGVALGLMALKLITQFETYQLWTALVSLTCS
NTIFTLTVAWRTATLILAGVSLLPVCQSSSMRKTDWLPMTVAAMGVPPLPLFIFSLKDTLKRRSWPLNEGVMAVGLVSIL
ASSLLRNDVPMAGPLVAGGLLIACYVITGTSADLTVEKAPDVTWEEEAEQTGVSHNLMITVDDDGTMRIKDDETENILTV
LLKTALLIVSGIFPYSIPATLLVWHTWQKQTQRSGVLWDVPSPPETQKAELEEGVYRIKQQGIFGKTQVGVGVQKEGVFH
TMWHVTRGAVLTHNGKRLEPNWASVKKDLISYGGGWRLSAQWQKGEEVQVIAVEPGKNPKNFQTTPGTFQTTTGEIGAIA
LDFKPGTSGSPIINREGKVVGLYGNGVVTKNGGYVSGIAQTNAEPDGPTPELEEEMFKKRNLTIMDLHPGSGKTRKYLPA
IVREAIKRRLRTLILAPTRVVAAEMEEALKGLPIRYQTTATKSEHTGREIVDLMCHATFTMRLLSPVRVPNYNLIIMDEA
HFTDPASIAARGYISTRVGMGEAAAIFMTATPPGTADAFPQSNAPIQDEERDIPERSWNSGNEWITDFAGKTVWFVPSIK
AGNDIANCLRKNGKKVIQLSRKTFDTEYQKTKLNDWDFVVTTDISEMGANFKADRVIDPRRCLKPVILTDGPERVILAGP
MPVTAASAAQRRGRVGRNPQKENDQYIFTGQPLNNDEDHAHWTEAKMLLDNINTPEGIIPALFEPEREKSAAIDGEYRLK
GESRKTFVELMRRGDLPVWLAHKVASEGIKYTDRKWCFDGQRNNQILEENMDVEIWTKEGEKKKLRPRWLDARTYSDPLA
LKEFKDFAAGRKSIALDLVTEIGRVPSHLAHRTRNALDNLVMLHTSEDGGRAYRHAVEELPETMETLLLLGLMILLTGGA
MLFLISGKGIGKTSIGLICVIASSGMLWMAEVPLQWIASAIVLEFFMMVLLIPEPEKQRTPQDNQLAYVVIGILTLAATI
AANEMGLLETTKRDLGMSKEPGVVSPTSYLDVDLHPASAWTLYAVATTVITPMLRHTIENSTANVSLAAIANQAVVLMGL
DKGWPISKMDLGVPLLALGCYSQVNPLTLTAAVLLLITHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGIMTIDLDSV
IFDSKFEKQLGQVMLLVLCAVQLLLMRTSWALCEALTLATGPITTLWEGSPGKFWNTTIAVSMANIFRGSYLAGAGLAFS
IMKSVGTGKRGTGSQGETLGEKWKKKLNQLSRKEFDLYKKSGITEVDRTEAKEGLKRGETTHHAVSRGSAKLQWFVERNM
VVPEGRVIDLGCGRGGWSYYCAGLKKVTEVRGYTKGGPGHEEPVPMSTYGWNIVKLMSGKDVFYLPPEKCDTLLCDIGES
SPSPTVEESRTIRVLKMVEPWLKNNQFCIKVLNPYMPTVIEHLERLQRKHGGMLVRNPLSRNSTHEMYWISNGTGNIVSS
VNMVSRLLLNRFTMTHRRPTIEKDVDLGAGTRHVNAEPETPNMDVIGERIKRIKEEHNSTWHYDDENPYKTWAYHGSYEV
KATGSASSMINGVVKLLTKPWDVVPMVTQMAMTDTTPFGQQRVFKEKVDTRTPRPMPGTRKAMEITAEWLWRTLGRNKRP
RLCTREEFTKKVRTNAAMGAVFTEENQWDSAKAAVEDEEFWKLVDRERELHKLGKCGSCVYNMMGKREKKLGEFGKAKGS
RAIWYMWLGARYLEFEALGFLNEDHWFSRENSYSGVEGEGLHKLGYILRDISKIPGGAMYADDTAGWDTRITEDDLHNEE
KIIQQMDPEHRQLANAIFKLTYQNKVVKVQRPTPTGTVMDIISRKDQRGSGQLGTYGLNTFTNMEAQLVRQMEGEGVLTK
ADLENPHLLEKKITQWLETKGVERLKRMAISGDDCVVKPIDDRFANALLALNDMGKVRKDIPQWQPSKGWHDWQQVPFCS
HHFHELIMKDGRKLVVPCRPQDELIGRARISQGAGWSLRETACLGKAYAQMWSLMYFHRRDLRLASNAICSAVPVHWVPT
SRTTWSIHAHHQWMTTEDMLTVWNRVWIEENPWMEDKTPVTTWENVPYLGKREDQWCGSLIGLTSRATWAQNIPTAIQQV
RSLIGNEEFLDYMPSMKRFRKEEESEGAIW
>P09866 ~~~~~~Genome polyprotein~~~
MNQRKKVVRPPFNMLKRERNRVSTPQGLVKRFSTGLFSGKGPLRMVLAFITFLRVLSIPPTAGILKRWGQLKKNKAIKIL
IGFRKEIGRMLNILNGRKRSTITLLCLIPTVMAFSLSTRDGEPLMIVAKHERGRPLLFKTTEGINKCTLIAMDLGEMCED
TVTYKCPLLVNTEPEDIDCWCNLTSTWVMYGTCTQSGERRREKRSVALTPHSGMGLETRAETWMSSEGAWKHAQRVESWI
LRNPGFALLAGFMAYMIGQTGIQRTVFFVLMMLVAPSYGMRCVGVGNRDFVEGVSGGAWVDLVLEHGGCVTTMAQGKPTL
DFELTKTTAKEVALLRTYCIEASISNITTATRCPTQGEPYLKEEQDQQYICRRDVVDRGWGNGCGLFGKGGVVTCAKFSC
SGKITGNLVQIENLEYTVVVTVHNGDTHAVGNDTSNHGVTAMITPRSPSVEVKLPDYGELTLDCEPRSGIDFNEMILMKM
KKKTWLVHKQWFLDLPLPWTAGADTSEVHWNYKERMVTFKVPHAKRQDVTVLGSQEGAMHSALAGATEVDSGDGNHMFAG
HLKCKVRMEKLRIKGMSYTMCSGKFSIDKEMAETQHGTTVVKVKYEGAGAPCKVPIEIRDVNKEKVVGRIISSTPLAENT
NSVTNIELEPPFGDSYIVIGVGNSALTLHWFRKGSSIGKMFESTYRGAKRMAILGETAWDFGSVGGLFTSLGKAVHQVFG
SVYTTMFGGVSWMIRILIGFLVLWIGTNSRNTSMAMTCIAVGGITLFLGFTVQADMGCVASWSGKELKCGSGIFVVDNVH
TWTEQYKFQPESPARLASAILNAHKDGVCGIRSTTRLENVMWKQITNELNYVLWEGGHDLTVVAGDVKGVLTKGKRALTP
PVSDLKYSWKTWGKAKIFTPEARNSTFLIDGPDTSECPNERRAWNSLEVEDYGFGMFTTNIWMKFREGSSEVCDHRLMSA
AIKDQKAVHADMGYWIESSKNQTWQIEKASLIEVKTCLWPKTHTLWSNGVLESQMLIPKSYAGPFSQHNYRQGYATQTVG
PWHLGKLEIDFGECPGTTVTIQEDCDHRGPSLRTTTASGKLVTQWCCRSCTMPPLRFLGEDGCWYGMEIRPLSEKEENMV
KSQVTAGQGTSETFSMGLLCLTLFVEECLRRRVTRKHMILVVVITLCAIILGGLTWMDLLRALIMLGDTMSGRIGGQIHL
AIMAVFKMSPGYVLGVFLRKLTSRETALMVIGMAMTTVLSIPHDLMELIDGISLGLILLKIVTQFDNTQVGTLALSLTFI
RSTMPLVMAWRTIMAVLFVVTLIPLCRTSCLQKQSHWVEITALILGAQALPVYLMTLMKGASRRSWPLNEGIMAVGLVSL
LGSALLKNDVPLAGPMVAGGLLLAAYVMSGSSADLSLEKAANVQWDEMADITGSSPIIEVKQDEDGSFSIRDVEETNMIT
LLVKLALITVSGLYPLAIPVTMTLWYMWQVKTQRSGALWDVPSPAATKKAALSEGVYRIMQRGLFGKTQVGVGIHMEGVF
HTMWHVTRGSVICHETGRLEPSWADVRNDMISYGGGWRLGDKWDKEEDVQVLAIEPGKNPKHVQTKPGLFKTLTGEIGAV
TLDFKPGTSGSPIINRKGKVIGLYGNGVVTKSGDYVSAITQAERIGEPDYEVDEDIFRKKRLTIMDLHPGAGKTKRILPS
IVREALKRRLRTLILAPTRVVAAEMEEALRGLPIRYQTPAVKSEHTGREIVDLMCHATFTTRLLSSTRVPNYNLIVMDEA
HFTDPSSVAARGYISTRVEMGEAAAIFMTATPPGATDPFPQSNSPIEDIEREIPERSWNTGFDWITDYQGKTVWFVPSIK
AGNDIANCLRKSGKKVIQLSRKTFDTEYPKTKLTDWDFVVTTDISEMGANFRAGRVIDPRRCLKPVILPDGPERVILAGP
IPVTPASAAQRRGRIGRNPAQEDDQYVFSGDPLKNDEDHAHWTEAKMLLDNIYTPEGIIPTLFGPEREKTQAIDGEFRLR
GEQRKTFVELMRRGDLPVWLSYKVASAGISYKDREWCFTGERNNQILEENMEVEIWTREGEKKKLRPRWLDARVYADPMA
LKDFKEFASGRKSITLDILTEIASLPTYLSSRAKLALDNIVMLHTTERGGRAYQHALNELPESLETLMLVALLGAMTAGI
FLFFMQGKGIGKLSMGLITIAVASGLLWVAEIQPQWIAASIILEFFLMVLLIPEPEKQRTPQDNQLIYVILTILTIIGLI
AANEMGLIEKTKTDFGFYQVKTETTILDVDLRPASAWTLYAVATTILTPMLRHTIENTSANLSLAAIANQAAVLMGLGKG
WPLHRMDLGVPLLAMGCYSQVNPTTLTASLVMLLVHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGITVIDLEPISYD
PKFEKQLGQVMLLVLCAGQLLLMRTTWAFCEVLTLATGPILTLWEGNPGRFWNTTIAVSTANIFRGSYLAGAGLAFSLIK
NAQTPRRGTGTTGETLGEKWKRQLNSLDRKEFEEYKRSGILEVDRTEAKSALKDGSKIKHAVSRGSSKIRWIVERGMVKP
KGKVVDLGCGRGGWSYYMATLKNVTEVKGYTKGGPGHEEPIPMATYGWNLVKLHSGVDVFYKPTEQVDTLLCDIGESSSN
PTIEEGRTLRVLKMVEPWLSSKPEFCIKVLNPYMPTVIEELEKLQRKHGGNLVRCPLSRNSTHEMYWVSGASGNIVSSVN
TTSKMLLNRFTTRHRKPTYEKDVDLGAGTRSVSTETEKPDMTIIGRRLQRLQEEHKETWHYDQENPYRTWAYHGSYEAPS
TGSASSMVNGVVKLLTKPWDVIPMVTQLAMTDTTPFGQQRVFKEKVDTRTPQPKPGTRMVMTTTANWLWALLGKKKNPRL
CTREEFISKVRSNAAIGAVFQEEQGWTSASEAVNDSRFWELVDKERALHQEGKCESCVYNMMGKREKKLGEFGRAKGSRA
IWYMWLGARFLEFEALGFLNEDHWFGRENSWSGVEGEGLHRLGYILEEIDKKDGDLMYADDTAGWDTRITEDDLQNEELI
TEQMAPHHKILAKAIFKLTYQNKVVKVLRPTPRGAVMDIISRKDQRGSGQVGTYGLNTFTNMEVQLIRQMEAEGVITQDD
MQNPKGLKERVEKWLKECGVDRLKRMAISGDDCVVKPLDERFGTSLLFLNDMGKVRKDIPQWEPSKGWKNWQEVPFCSHH
FHKIFMKDGRSLVVPCRNQDELIGRARISQGAGWSLRETACLGKAYAQMWSLMYFHRRDLRLASMAICSAVPTEWFPTSR
TTWSIHAHHQWMTTEDMLKVWNRVWIEDNPNMTDKTPVHSWEDIPYLGKREDLWCGSLIGLSSRATWAKNIHTAITQVRN
LIGKEEYVDYMPVMKRYSAPSESEGVL
>Q58HT7 ~~~~~~Genome polyprotein~~~
MNQRKKVVRPPFNMLKRERNRVSTPQGLVKRFSTGLFSGKGPLRMVLAFITFLRVLSIPPTAGILKRWGQLKKNKAIKIL
TGFRKEIGRMLNILNGRKRSTMTLLCLIPTAMAFHLSTRDGEPLMIVARHERGRPLLFKTTEGINKCTLIAMDLGEMCED
TVTYECPLLVNTEPEDIDCWCNLTSAWVMYGTCTQSGERRREKRSVALTPHSGMGLETRAETWMSSEGAWKHAQRVESWI
LRNPGFALLAGFMAYMIGQTGIQRTVFFVLMMLVAPSYGMRCVGVGNRDFVEGVSGGAWVDLVLEHGGCVTTMAQGKPTL
DFELIKTTAKEVALLRTYCIEASISNITTATRCPTQGEPYLKEEQDQQYICRRDVVDRGWGNGCGLFGKGGVVTCAKFSC
SGKITGNLVQIENLEYTVVVTVHNGDTHAVGNDIPNHGVTATITPRSPSVEVKLPDYGELTLDCEPRSGIDFNEMILMKM
KKKTWLVHKQWFLDLPLPWAAGADTSEVHWNYKERMVTFKVPHAKRQDVTVLGSQEGAMHSALTGATEVDSGDGNHMFAG
HLKCKVRMEKLRIKGMSYTMCSGKFSIDKEMAETQHGTTVVKVKYEGAGAPCKVPIEIRDVNKEKVVGRIISSTPFAEYT
NSVTNIELEPPFGDSYIVIGVGDSALTLHWFRKGSSIGKMLESTYRGAKRMAILGETAWDFGSVGGLLTSLGKAVHQVFG
SVYTTMFGGVSWMVRILIGFLVLWIGTNSRNTSMAMTCIAVGGITLFLGFTVHADTGCAVSWSGKELKCGSGIFVIDNVH
TWTEQYKFQPESPARLASAILNAHEDGVCGIRSTTRLENIMWKQITNELNYVLWEGGHDLTVVAGDVKGVLSKGKRALAP
PVNDLKYSWKTWGKAKIFTPEAKNSTFLIDGPDTSECPNERRAWNFLEVEDYGFGMFTTNIWMKFREGSSEVCDHRLMSA
AIKDQKAVHADMGYWIESSKNQTWQIEKASLIEVKTCLWPKTHTLWSNGVLESQMLIPKAYAGPFSQHNYRQGYATQTVG
PWHLGKLEIDFGECPGTTVTIQEDCDHRGPSLRTTTASGKLVTQWCCRSCTMPPLRFLGEDGCWYGMEIRPLSEKEENMV
KSQVSAGQGTSETFSMGLLCLTLFVEECLRRRVTRKHMILVVVTTLCAIILGGLTWMDLLRALIMLGDTMSGRMGGQIHL
AIMAVFKMSPGYVLGIFLRKLTSRETALMVIGMAMTTVLSIPHDLMEFIDGISLGLILLKMVTHFDNTQVGTLALSLTFI
RSTMPLVMAWRTIMAVLFVVTLIPLCRTSCLQKQSHWVEITALILGAQALPVYLMTLMKGASKRSWPLNEGIMAVGLVSL
LGSALLKNDVPLAGPMVAGGLLLAAYVMSGSSADLSLEKAANVQWDEMADITGSSPIIEVKQDEDGSFSIRDIEETNMIT
LLVKLALITVSGLYPLAIPVTMTLWYMWQVKTQRSGALWDVPSPAAAQKATLTEGVYRIMQRGLFGKTQVGVGIHMEGVF
HTMWHVTRGSVICHETGRLEPSWADVRNDMISYGGGWRLGDKWDKEEDVQVLAIEPGKNPKHVQTKPGLFKTLTGEIGAV
TLDFKPGTSGSPIINRKGKVIGLYGNGVVTKSGDYVSAITQAERTGEPDYEVDEDIFRKKRLTIMDLHPGAGKTKRILPS
IVREALKRRLRTLILAPTRVVAAEMEEALRGLPIRYQTPAVKSEHTGREIVDLMCHATFTTRLLSSTRVPNYNLIVMDEA
HFTDPSSVAARGYISTRVEMGEAAAIFMTATPPGATDPFPQSNSPIEDIEREIPERSWNTGFDWITDYQGKTVWFVPSIK
AGNDIANCLRKSGKKVIQLSRKTFDTEYPKTKLTDWDFVVTTDISEMGANFRAGRVIDPRRCLKPVISTDGPERVILAGP
IPVTPASAAQRRGRIGRNPAQEDDQYVFSGDPLKNDEDHAHWTEAKMLLDNIYTPEGIIPTLFGPEREKNQAIDGEFRLR
GEQRKTFVELMRRGDLPVWLSYKVASAGISYKDREWCFTGERNNQILEENMEVEIWTREGEKKKLRPKWLDARVYADPMA
LKDFKEFASGRKSITLDILTEIASLPTYLSSRAKLALDNIVMLHTTERGGKAYQHALNELPESLETLMLVALLGAMTAGI
FLFFMQGKGIGKLSMGLIAIAVASGLLWVAEIQPQWIAASIILEFFLMVLLIPEPEKQRTPQDNQLIYVILTILTIIGLI
AANEMGLIEKTKTDFGFYQVKTETTILDVDLRPASAWTLYAVATTILTPMLRHTIENTSANLSLAAIANQAAVLMGLGKG
WPLHRMDLGVPLLAMGCYSQVNPTTLIASLVMLLVHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGITVIDLEPISYD
PKFEKQLGQVMLLVLCAGQLLLMRTTWAFCEVLTLATGPVLTLWEGNPGRFWNTTIAVSTANIFRGSYLAGAGLAFSLIK
NAQTPRRGTGTTGETLGEKWKRQLNSLDRKEFEEYKRSGILEVDRTEAKSALKDGSKIKHAVSRGSSKIRWIVERGMVKP
KGKVVDLGCGRGGWSYYMATLKNVTEVKGYTKGGPGHEEPIPMATYGWNLVKLHSGVDVFYKPTEQVDTLLCDIGESSSN
PTIEEGRTLRVLKMVEPWLSSKPEFCIKVLNPYMPTVIEELEKLQRKHGGSLVRCPLSRNSTHEMYWVSGVSGNIVSSVN
TTSKMLLNRFTTRHRKPTYEKDVDLGAGTRSVSTETEKPDMTIIGRRLQRLQEEHKETWHYDQENPYRTWAYHGSYEAPS
TGSASSMVNGVVKLLTKPWDVIPMVTQLAMTDTTPFGQQRVFKEKVDTRTPQPKPGTRMVMTTTANWLWALLGKKKNPRL
CTREEFISKVRSNAAIGAVFQEEQGWTSASEAVNDSRFWELVDKERALHQEGKCESCVYNMMGKREKKLGEFGRAKGSRA
IWYMWLGARFLEFEALGFLNEDHWFGRENSWSGVEGEGLHRLGYILEDIDKKDGDLIYADDTAGWDTRITEDDLLNEELI
TEQMAPHHKILAKAIFKLTYQNKVVKVLRPTPKGAVMDIISRKDQRGSGQVGTYGLNTFTNMEVQLIRQMEAEGVITQDD
MHNPKGLKERVEKWLKECGVDRLKRMAISGDDCVVKPLDERFSTSLLFLNDMGKVRKDIPQWEPSKGWKNWQEVPFCSHH
FHKIFMKDGRSLVVPCRNQDELIGRARISQGAGWSLRETACLGKAYAQMWSLMYFHRRDLRLASMAICSAVPTEWFPTSR
TTWSIHAHHQWMTTEDMLKVWNRVWIEDNPNMTDKTPVHSWEDIPYLGKREDLWCGSLIGLSSRATWAKNIHTAITQVRN
LIGKEEYVDYMPVMKRYSAPFESEGVL
>Q2YHF0 ~~~~~~Genome polyprotein~~~
MNQRKKVARPPFNMLKRERNRVSTPQGLVKRFSTGLFSGKGPLRMVLAFITFLRVLSIPPTAGILKRWGQLKKNKAIKIL
TGFRKEIGRMLNILNGRKRSTITLLCLIPTVMAFHLSTRDGEPLMIVAKHERGRPLLFKTTEGINKCTLIAMDLGEMCED
TVTYKCPLLVNTEPEDIDCWCNLTSAWVMYGTCTQSGERRREKRSVALTPHSGMGLETRAETWMSSEGAWKHAQRVESWI
LRNPGFALLAGFMAYMIGQTGIQRTVFFILMMLVAPSYGMRCVGVGNRDFVEGVSGGAWVDLVLEHGGCVTTMAQGKPTL
DFELIKTTAKEVALLRTYCIEASISNITTATRCPTQGEPYLKEEQDQQYICRRDMVDRGWGNGCGLFGKGGVVTCAKFSC
SGKITGNLVQIENLEYTVVVTVHNGDTHAVGNDTSNHGVTATITPRSPSVEVKLPDYGELTLDCEPRSGIDFNEMILMKM
KTKTWLVHKQWFLDLPLPWTAGADTLEVHWNHKERMVTFKVPHAKRQDVTVLGSQEGAMHSALAGATEVDSGDGNHMFAG
HLKCKVRMEKLRIKGMSYTMCSGKFSIDKEMAETQHGTTVVKVKYEGTGAPCKVPIEIRDVNKEKVVGRIISSTPFAENT
NSVTNIELEPPFGDSYIVIGVGDSALTLHWFRKGSSIGKMFESTYRGAKRMAILGETAWDFGSVGGLLTSLGKAVHQVFG
SVYTTMFGGVSWMVRILIGLLVLWIGTNSRNTSMAMSCIAVGGITLFLGFTVHADMGCAVSWSGKELKCGSGIFVIDNVH
TWTEQYKFQPESPARLASAILNAHKDGVCGIRSTTRLENVMWKQITNELNYVLWEGGHDLTVVAGDVKGVLSKGKRALAP
PVNDLKYSWKTWGKAKIFTPETRNSTFLVDGPDTSECPNERRAWNFLEVEDYGFGMFTTNIWMKFREGSSEVCDHRLMSA
AIKDQKAVHADMGYWIESSKNQTWQIEKASLIEVKTCLWPKTHTLWSNGVLESQMLIPKAYAGPISQHNYRQGYATQTVG
PWHLGKLEIDFGECPGTTVTIQEDCDHRGPSLRTTTASGKLVTQWCCRSCTMPPLRFLGEDGCWYGMEIRPLNEKEENMV
KSQVSAGQGTSETFSMGLLCLTLFVEECLRRRVTRKHMILVVVTTLCAIILGGLTWMDLLRALIMLGDTMSGRMGGQIHL
AIMAVFKMSPGYVLGIFLRKLTSRETALMVIGMAMTTVLSIPHDLMEFIDGISLGLILLKMVTHFDNTQVGTLALSLTFI
KSTMPLVMAWRTIMAVLFVVTLIPLCRTSCLQKQSHWVEITALILGAQALPVYLMTLMKGASKRSWPLNEGIMAVGLVSL
LGSALLKNDVPLAGPMVAGGLLLAAYVMSGSSADLSLEKAANVQWDEMADITGSSPIIEVKQDEDGSFSIRDVEETNMIT
LLVKLALITVSGLYPLAIPVTMTLWYMWQVKTQRSGALWDVPSPAAAQKATLTEGVYRIMQRGLFGKTQVGVGIHMEGVF
HTMWHVTRGSVICHESGRLEPSWADVRNDMISYGGGWRLGDKWDKEEDVQVLAIEPGKNPKHVQTKPGLFKTLTGEIGAV
TLDFKPGTSGSPIINRKGKVIGLYGNGVVTKSGDYVSAITQAERIGEPDYEVDEDIFRKKRLTIMDLHPGAGKTKRILPS
IVREALKRRLRTLILAPTRVVAAEMEEALRGLPIRYQTPAVKSEHTGREIVDLMCHATFTTRLLSSTRVPNYNLIVMDEA
HFTDPSSVAARGYISTRVEMGEAAAIFMTATPPGTTDPFPQSNSPIEDIEREIPERSWNTGFDWITDYQGKTVWFVPSIK
AGNDIANCLRKSGKKVIQLSRKTFDTEYPKTKLTDWDFVVTTDISEMGANFRAGRVIDPRRCLKPVILTDGPERVILAGP
IPVTPASAAQRRGRIGRNPAQEDDQYVFSGDPLRNDEDHAHWTEAKMLLDNIYTPEGIIPTLFGPEREKTQAIDGEFRLR
GEQRKTFVELMRRGDLPVWLSYKVASAGISYKDREWCFTGERNNQILEENMEVEIWTREGEKKKLRPKWLDARVYADPMA
LKDFKEFASGRKSITLDILTEIASLPTYLSSRAKLALDNIVMLHTTERGGKAYQHALNELPESLETLMLVALLGAMTAGI
FLFFMQGKGIGKLSMGLIAIAVASGLLWVAEIQPQWIAASIILEFFLMVLLVPEPEKQRTPQDNQLIYVILTILTIIALV
AANEMGLIEKTKTDFGFYQAKTETTILDVDLRPASAWTLYAVATTILTPMLRHTIENTSANLSLAAIANQAAVLMGLGKG
WPLHRMDLGVPLLAMGCYSQVNPTTLTASLVMLLVHYAIIGPGLQAKATREAQKRTAAGIMKNPTVDGITVIDLEPISYD
PKFEKQLGQVMLLVLCAGQLLLMRTTWAFCEVLTLATGPILTLWEGNPGRFWNTTIAVSTANIFRGSYLAGAGLAFSLIK
NAQTPRRGTGTTGETLGEKWKRQLNSLDRKEFEEYKRSGILEVDRTEAKSALKDGSKIKYAVSRGTSKIRWIVERGMVKP
KGKVVDLGCGRGGWSYYMATLKNVTEVKGYTKGGPGHEEPIPMATYGWNLVKLHSGVDVFYKPTEQVDTLLCDIGESSSN
PTIEEGRTLRVLKMVEPWLSSKPEFCIKVLNPYMPTVIEELEKLQRKHGGSLVRCPLSRNSTHEMYWVSGVSGNIVSSVN
TTSKMLLNRFTTRHRKPTYEKDADLGAGTRSVSTETEKPDMTIIGRRLQRLQEEHKETWHYDHENPYRTWAYHGSYEAPS
TGSASSMVNGVVKLLTKPWDVVPMVTQLAMTDTTPFGQQRVFKEKVDTRTPQPKPGTRVVMTTTANWLWALLGRKKNPRL
CTREEFISKVRSNAAIGAVFQEEQGWTSASEAVNDSRFWELVDKERALHQEGKCESCVYNMMGKREKKLGEFGRAKGSRA
IWYMWLGARFLEFEALGFLNEDHWFGRENSWSGVEGEGLHRLGYILEDIDKKDGDLIYADDTAGWDTRITEDDLLNEELI
TEQMAPHHKILAKAIFKLTYQNKVVKVLRPTPKGAVMDIISRKDQRGSGQVGTYGLNTFTNMEVQLIRQMEAEGVITRDD
MHNPKGLKERVEKWLKECGVDRLKRMAISGDDCVVKPLDERFSTSLLFLNDMGKVRKDIPQWEPSKGWKNWQEVPFCSHH
FHKIFMKDGRSLVVPCRNQDELIGRARISQGAGWSLRETACLGKAYAQMWSLMYFHRRDLRLASMAICSAVPTEWFPTSR
TTWSIHAHHQWMTTEDMLKVWNRVWIEDNPNMIDKTPVHSWEDIPYLGKREDLWCGSLIGLSSRATWAKNIQTAITQVRN
LIGKEEYVDYMPVMKRYSAHFESEGVL
>O91734 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETSLSATGNSIIHYTNINYYKDAASNSANRQDFTQDPGKFTEPMKDVMIKTLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVGYGEWPEYLSDNEATAEDQPTQPDVATCRFYTLDSVQWENGSPGWWWKFPDALRDMGL
FGQNMYYHYLGRAGYTIHVQCNASKFHQGCILVVCVPEAEMGSAQTSGVVNYEHISKGEIASRFTTTTTAEDHGVQAAVW
NAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYVNSVPMDNMYRHHNFTLMIIPFVPLDFSAGASTYVPITVTVAPMCAE
YNGLRLAGHQGLPTMNTPGSNQFLTSDDFQSPSAMPQFDVTPEMHIPGEVRNLMEIAEVDSVMPINNDSAAKVSSMEAYR
VELSTNTNAGTQVFGFQLNPGAESVMNRTLMGEILNYYAHWSGSIKITFVFCGSAMTTGKFLLSYAPPGAGAPKTRKDAM
LGTHVVWDVGLQSSCVLCIPWISQTHYRFVEKDPYTNAGFVTCWYQTSVVSPASNQPKCYMMCMVSACNDFSVRMLRDTK
FIEQTSFYQGDVQNAVEGAMVRVADTVQTSATNSERVPNLTAVETGHTSQAVPGDTMQTRHVINNHVRSESTIENFLARS
ACVFYLEYKTGTKEDSNSFNNWVITTRRVAQLRRKLEMFTYLRFDMEITVVITSSQDQSTSQNQNAPVLTHQIMYVPPGG
PIPVSVDDYSWQTSTNPSIFWTEGNAPARMSIPFISIGNAYSNFYDGWSHFSQAGVYGFTTLNNMGQLFFRHVNKPNPAA
ITSVARIYFKPKHVRAWVPRPPRLCPYINSTNVNFEPKPVTEVRTNIITTGAFGQQSGAVYVGNYRVVNRHLATHIDWQN
CVWEDYNRDLLVSTTTAHGCDTIARCQCTTGVYFCLSRNKHYPVSFEGPGLVEVQESEYYPKRYQSHVLLAAGFSEPGDC
GGILRCEHGVIGIVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLVGQDSILE
KSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNGWLKKFTEMTNACKGMEWI
AIKIQKFIEWLKVKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQGDQEQLFSNVQYFAHYCRKYAPLYAAEAKRV
FSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQN
PDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMYSQNGKIN
MPMSVKTCDEDCCPVNFKKCCPLVCGKAIQFIDRKTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVYREIKISV
APETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEISSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLFAGFQGAY
TGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNASTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEVGVLDAKE
LVDKDGTNLELTLLKLNRNEKFRDIRGFLAREEAEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKRMLMYNFP
TRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLRHYFNEEQGEIEFIESSKDAGFPVINTPSKTKLEPSVFHQVFEGNK
EPAVLRNGDPRLKVNFEEAIFSKYIGNVNTHVDEYMQEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTSAGY
PYVALGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTFGNLYKTF
HLNPGIVTGSAVGCDPDVFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTNKETNYIDYLCNSHHLYRD
KHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIMTPA
DKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIRKIR
SVPVGRCLTLPAFSTLRRKWLDSF
>Q66474 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLSASGNSIIHYTNINYYKDAASNSANRQDFTQDPGKFTEPVKDIMAKTLPALNSPSAEECGYSD
RVRSITLGNSTITTQESANVVVGYGVWPDYLKDDEATAEDQPTNPDVATCRFYTLDSVSWMKESQGWWWKFPDALRDMGL
FGQNMQYHYLGRSGYTIHVQCNASKFHQGCLLVVCVPEAEMGAATVNEKINREHLSNGEVANTFTGTKSSNTNGVQQAVF
NAGMGVRVGNLTVFPHQWINLRTNNCATIVMPYINSVPMDNMFRHYNFTLMIIPFAKLDYAAGSSTYIPITVTVAPMCAE
YNGLRLAGHQGLPVMSTPGSNQFLTSDDYQSPTAMPQFDVTPEMHIPGEVKNLMEIAEVDSVVPVNNVNENVNSLEAYRI
PVHSVTETGAQVFGFTLQPGADSVMERTLHGEILNYYANWSGSIKLTFMYCGSAMATGKFLLAYSPPGAGVPKNRKEAML
GTHMIWDIGLQSRCVLCVPWISQTHYRFVSKDSYTDAGFITCWYQTSIVVPAEVQNQSVILCFVSACNDFSVRLLRDSPF
VTQTAFYQNDVQNAVERSIVRVADTLPSGPSNSESIPALTAAETGHTSQVVPSDTIQTRHVRNFHVRSESSVENFLSRSA
CVYIVEYKTQDTTPDKMYDSWVINTRQVAQLRRKLEFFTYVRFDVEVTFVITSVQDDSTRQNTDTPVLTHQIMYVPPGGP
IPHAVDDYNWQTSTNPSVFWTEGNAPPRMSIPFMSVGNAYSNFYDGWSHFSQTGVYGFNTLNNMGKLYFRHVNDRTISPI
TSKVRIYFKPKHVKAWVPRPPRLCEYTHKDNVDYEPKGVTTSRTSITITNSKHMETHGAFGQQSGAAYVGNYRVVNRHLA
THTDWQNCVWEDYNRDLLVSTTTAHGCDTIARCHCTTGVYFCASRNKHHPVVFEGPGLVEVQESGYYPKRYQSHVLLAAG
LSEPGDCGGILRCEHGVIGIVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKESLI
GQDSILEKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKQKVSQYYGISMAERQNNGWLKKFTEMTNA
CKGMEWIAIKIQKFIEWLKVKILPEVREKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLY
AAEAKRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGVGKSVATNLIGRSLAEKLNSSIYSLPPDPDHFDGYKQQAVVI
MDDLCQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVISMY
NQNGKINMPMSVKTCDEECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVY
REIKISVAPEIPPPPAIADLLKSVDSEAVRDYCKEKGWLVPEVNSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYKLF
AGFQGAYTGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQEV
GVVDAKELVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPTKR
MLMYNFPTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGYPVINTPSRTKLEPSVFH
QVFEGSKEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAIDHYAGQLATLDISTEPMKLEDAVYGTEGLEALD
LTTSAGYPYVALGIKKRDILSKKTRDLTKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMRQTF
GNLYKAFHQNPGIVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYIDYLCN
SHHLYRDKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDY
GLIMTPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEQEYE
EFIQKIRSVPVGRCLTLPAFSTLRRKWLDSF
>P29813 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLNASGSSIIHYTNINYYKDAASNSANRQEFSQDPGKFTEPVKDIMVKSLPALNSPSAEECGYSD
RVRSITLGNSTITTQESANVVVGYGRWPEYLKDNEATAEDQPTQPDVATCRFYTLESVTWERDSPGWWWKFPDALKDMGL
FGQNMYYHYLGRAGYTLHVQCNASKFHQGCLLVVCVPEAEMGCSQVDGTVNEHGLSEGETAKKFSSTSTNGTNTVQTIVT
NAGMGVGVGNLTIYPHQWINLRTNNCATIVMPYINNVPMDNMFRHHNFTLMIIPFVPLDYSSDSSTYVPITVTVAPMCAE
YNGLRLSTSLQGLPVMNTPGSNQFLTSDDFQSPSAMPQFDVTPELNIPGEVQNLMEIAEVDSVVPVNNVEGKLDTMEVYR
IPVQSGNHQSDQVFGFQVQPGLDSVFKHTLLGEILNYFAHWSGSIKLTFVFCGSAMATGKFLLAYAPPGANAPKNRKDAM
LGTHIIWDVGLQSSCVLCVPWISQTHYRLVQQDEYTSAGNVTCWYQTGIVVPAGTPTSCSIMCFVSACNDFSVRLLKDTP
FIEQTALLQGDVVEAVENAVARVADTIGSGPSNSQAVPALTAVETGHTSQVTPSDTMQTRHVKNYHSRSESSIENFLSRS
ACVYMGGYHTTNTDQTKLFASWTISARRMVQMRRKLEIFTYVRFDVEVTFVITSKQDQGSRLGQDMPPLTHQIMYIPPGG
PIPKSVTDYAWQTSTNPSIFWTEGNAPPRMSIPFISIGNAYSNFYDGWSHFSQNGVYGYNTLNHMGQIYVRHVNGSSPLP
MTSTVRMYFKPKHVKAWVPRPPRLCQYKNASTVNFTPTNVTDKRTSINYIPETVKPDLSNYGAFGYQSGAVYVVNYRVVN
RHLATHTDWQNCVWEDYNRDLLISTTTAHGCDVIARCRCSTGVYYCQSKGKHYPVNFEGPGLVEVQESEYYPKRYQSHVL
LAAGFSEPGDCGGILRCEHGVIGIVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLK
ESLVGQDSILEKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKQKVSQYYGIPMAERQNNGWLKKFTE
MTNSCKGMEWISIKIQKFIEWLKVKILPEVREKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKY
APLYASEAKRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYTLPPDPDHFDGYKQQ
AVVIVDDLCQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSLFVLASTNAGSINAPTVSDSRALARRFHFDMNIEV
ISMYSQNGKINMPMSEKTCDEECCPVNFKRCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQG
PPIYREIKISVAPETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEVNSTLQIEKHVSRAFICLQALTTFVSVAGIIYII
YKLFAGFQGAYTGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNSSTVKTEYREFTMLGIYDRWAVLPRHAKPGPTILMN
NQEVGVLDAKELVDKDGTNLELTLLKLNRNEKFRDIRGFLAKEEVEANQAVLAINTSKFPNMYIPVGQVTDYGFLNLGGT
PTKRMLMSNFPTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLKHYFNDEQGEIEFIESSKDAGFPIINTPSKTKLEP
SVFHQVFEGDKEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMLEAVDHYAGQLATLDISTEPMRLEDAVYGTEGL
EALDLTTSAGYPYVALGIKKRDILSRRTRDLTKLKECMDKYGLNLPMVTYVKDELRSADKVAKGKSRLIEASSLNDSVAM
RQTFGNLYRTFHLNPGIVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHKETNYID
YLCNSHHLYRDKHYFERGGMPSGYSGTSMFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAET
GKGYGLIMTPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGE
HEYEEFIRKIRSVPVGRCLTLPAFSTLRRKWLDSF
>Q66575 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLSASGNSIIHYTNINYYKDAASNSANRQDFTQDPGKFTEPVKDIMIKSMPALNSPTAEECGYSD
RVRSITLGNSTITTQECANVVVGYGTWPDYLHDDEATAEDQPTQPDVATCRFYTLESIQWQKTSDGWWWKFPEALKDMGL
FGQNMHYHYLGRSGYTIHVQCNASKFHQGCLLVVCVPEAEMGCATVANEVNAAALSSGETAKHFAKTGATGTHTVQSIVT
NAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYINSVPMDNMFRHYNFTLMIIPFVPLDFTAEASTYVPITVTVAPMCAE
YNGLRLASHQGLPTMNTPGSNQFLTSDDFQSPSAMPQFDVTPELRIPGEVKNLMEIAEVDSVVPVNNTQDSVYNMDVYKI
PVSGGNQLSTQVFGFQMQPGLNSVFKRTLLGEILNYYAHWSGSVKLTFVFCGSAMALAKFLLAYSPPGADPPKSRKEAML
GTHVIWDIGLQSSCVLCVPWISQTHYRLVQQDEYTSAGYVTCWYQTSLVVPPGAPATCGVLCLASACNDFSVRMLRDTPF
IEQKQLLQGDVEEAVNRAVARVADTLPTGPRNSESIPALTAAETGHTSQVVPGDTMQTRHVKNYHSRTESSVENFLCRAA
CVYITKYKTKDSDPVQRYANWRINTRQMVQLRRKFELFTYLRFDMEVTFVITSSQDDGTQLAQDMPVLTHQVMYIPPGGP
VPNSVTDFAWQSSTNPSIFWTEGNAPARMSIPFISIGNAYSNFYDGWSHFTQDGVYGFNSLNNMGSIYIRHVNEQSPYAI
TSTVRVYFKPKHVRAWVPRPPRLCAYEKSSNVNFKPTDVTTSRTSITEVPSLRPSVVNTGAFGQQSGAAYVGNYRVVNRH
LATHVDWQNCVWEDYNRDLLVSTTTAHGCDTIARCQCTTGVYFCASRNKHYPVSFEGPGLVEVQESEYYPRRYQSHVLLA
AGFSEPGDCGGILRCEHGVIGLVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKES
LVGHDSILEKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTSSPWRWLKHKVSQYYGIPMAERQSNGWLKKFTEMT
NACKGMEWIAIKIQKFIEWLKLKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAP
LYAAEAKRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATSLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAV
VIMDDLCQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVIS
MYSQNGKINMPMSVKTCDEECCPVNFKRCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPP
VIREIKISVAPETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEVNSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIYK
LFAGFQGAYTGMPNQKPKVPTLRQAKVQGPAFEFAVAMMKRNASTVKTEYGEFTMLGIYDRWAVLPHHAKPGPTILMNDQ
EIGVLDAKELVDKDGTNLELTLLKLNRNEKFRDIRGFLAREEAEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTPT
KRMLMYNFPTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLRHYFNEEQGEIEFIESSKDAGFPVINTPSKTKLEPSV
FHQVFEGNKEPAVLRNGDPRLKVNFEEAIFSKYIGNINTHVDEYMLEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEA
LDLTTSAGYPYVAIGIKKRDILSKKTKDLTKLKECMDKYGLNLPMVTYVKDELRSSEKVAKGKSRLIEASSLNDSVAMRQ
TFGNLYKTFHLNPGIVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYTHRETNYIDYL
CNSHHLYRDKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGK
GYGLIMTPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEQE
YEEFIRKIRSVPVGRCLTLPAFSTLRRKWLDSF
>Q9WN78 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETGLSASGNSVIHYTNINYYKDSASNSLNRQDFTQDPSRFTEPVQDVLIKTLPALNSPTVEECGYSD
RVRSITLGNSTITTQECANVVVGYGVWPTYLSDHEATAVDQPTQPDVATCRFYTLESVKWESSSAGWWWKFPEALSDMGL
FGQNMQYHYLGRAGYTIHVQCNASKFHQGCLLVVCVPEAEMGAATTDHAMNHTKLSNIGQAMEFSAGKSTDQTGPQTAVH
NAGMGVAVGNLTIYPHQWINLRTNNSATIVMPYINSVPMDNMYRHYNFTLMVIPFAKLEHSPQASTYVPITVTVAPMCAE
YNGLRLAGHQGLPTMNTPGSTQFLTSDDFQSPSAMPQFDVTPEIQIPGQVRNLMEIAEVDSVVPVDNTEEHVNSIEAYRI
PVRPQTNSGEQVFGFQLQPGYDSVLKHTLLGEILNYYANWSGSMKLTFMYCGAAMATGKFLIAYSPPGAGVPGSRKDAML
GTHVIWDVGLQSSCVLCVPWISQTNYRYVTRDAYTDAGYITCWYQTSIVTPPDIPTTSTILCFVSACNDFSVRLLRDTPF
ITQQALYQNDPEGALNKAVGRVADTIASGPVNTEQIPALTAVETGHTSQVVPSDTMQTRHVVNFHTRSESSLENFMGRAA
CAYIAHYTTEKANDDLDRYTNWEITTRQVAQLRRKLEMFTYMRFDLEITFVITSSQRTSNRYASDSPPLTHQIMYVPPGG
PIPKGYEDFAWQTSTNPSVFWTEGNAPPRMSIPFMSVGNAYCNFYDGWSHFSQSGVYGYTTLNNMGHLYFRHVNKSTAYP
VNSVARVYFKPKHVKAWVPRAPRLCPYLYAKNVNFDVQGVTESRGKITLDRSTHNPVLTTGAFEQQSGAAYVGNYRLVNR
HLATHTDWQNCVWKDYNRDLLVSTTTAHGCDTIARCQCTTGVYFCASRNKHYPVTFEGPGLVEVQESEYYPKRYQSHVLL
AAGFSEPGDCGGILRCEHGVIGLVTMGGEGVVGFADVRDLLWLEDDAMEQGVKDYVEQLGNAFGSGFTNQICEQVNLLKE
SLIGQDSILEKSLKALVKIISALVIVVRNHDDLITVTATLALIGCTTSPWRWLKHKVSQYYGIPMAERQNNNWLKKFTEM
TNACKGMEWIAIKIQKFIEWLKVKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYA
PLYAAEAKRVFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQA
VVIMDDLCQNPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSINAPTVSDSRALARRFHFDMNIEVI
SMYSQNGKINMPMSVKTCDEECCPVNFKRCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGP
PVYREIKISVAPETPPPPAIADLLKSVDSEAVREYCKEKGWLVPEINSTLQIEKHVSRAFICLQALTTFVSVAGIIYIIY
KLFAGFQGAYSGMPNQKSKVPTLRQAKVQGPAFEFAVAMMKRNASTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTIIMND
QEVGVVDAKELVDKDGTNLELTLLKLNRNEKFRDIRGFLAREEAEVNEAVLAINTSKFPNMYIPVGQVTDYGFLNLGGTP
TKRMLMYNFPTRAGQCGGVLMSTGKVLGVHVGGNGHQGFSAALLRHYFNDEQGEIEFIESSKEAGFPVINTPSKTKLEPS
VFHHVFEGNKEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMMEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLE
ALDLTTSAGYPYVALGIKKRDILSKKTKDLAKLKECMDKYGLNLPMVTYVKDELRSAEKVAKGKSRLIEASSLNDSVAMR
QTFGNLYKTFHMNPGIVTGSAVGCDPDLFWSKIPVMLDGHLIAFDYSGYDASLSPVWFACLKLLLEKLGYSHKETNYIDY
LCNSHHLYRDKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPHPIDASLLAEAG
KGYGLIMTPADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEQ
EYEEFVSKIRSVPVGRCLTLPAFSTLRRKWLDSF
>P03304 ~~~~~~Genome polyprotein~~~
MATTMEQETCAHSLTFEECPKCSALQYRNGFYLLKYDEEWYPEELLTDGEDDVFDPELDMEVVFELQGNSTSSDKNNSSS
EGNEGVIINNFYSNQYQNSIDLSANAAGSDPPRLRSIFESLSGAVNAFSNMLPLLADQNTEEMENLSDRGLKTLPAIRSQ
TPSQQWAVLSVMVPFMMESIRHHVLTLLQKRFWRWKGTTPSRLMIGHQHKSPLSTSAFPFLTSCPVKMVVSLVALRRHYL
VKTGWRVQVQCNASQFHAGGLLVFMAPEYPTLDAFAMDNRWSKDNLPNGTRTQTNKKGPFAMDHQNFWQWTLYPHQFLNL
RTNTTVDLEVPYVNIAPTSSWTQHASWTLVIAVVAPLTYSTGASTSLDITASIQPVRPVFNGLRHETLSRQSPIPVTIRE
HAGTWYSTLPDSTVPIYGKTPVAPSNYMVGEYKDFLEIAQIPTFIGNKIPNAVPYIEASNTAVKTQPLATYQVTLSCSCL
ANTFLAALSRNFAQYRGSLVYTFVFTGTAMMKGKFLIAYTPPGAGKPTSRDQAMQATYAIWDLGLNSSYSFTVPFISPTH
FRMVGTDQVNITNADGWVTVWQLTPLTYPPGCPTSAKILTMVSAGKDFSLKMPISPAPWSPQGVENAEKGVTENTNATAD
FVAQPVYLPENQTKVAFFYNRSSPIGAFTVKSGSLESGFAPFSNGTCPNSVILTPGPQFDPAYDQLRPQRLTEIWGNGNE
ETSKVFPLKSKQDYSFCLFSPFVYYKCDLEVTLSPHTSGNHGLLVRWCPTGTPTKPTTQVLHEVSSLSEGRTPQVYSAGP
GISNQISFVVPYNSPLSVLSAVWYNGHKRFDNTGSLGIAPNSDFGTLFFAGTKPDIKFTVYLRYKNKRVFCPRPTVFFPW
PTSGDKIDMTPRAGVLMLESPNALDISRTYPTLHVLIQFNHRGLEVRLFRHGHFWAETRADVILRSKTKQVSFLSNGNYP
SMDSRAPWNPWKNTYQAVLRAEPCRVTMDIYYKRVRPFRLPLVQKEWPVREENVFGLYRIFNAHYAGYFADLLIHDIETN
PGPFMFRPRKQVFQTQGAAVSSMAQTLLPNDLASKAMGSAFTALLDANEDAQKAMKIIKTLSSLSDAWENVKETLNNPEF
WKQLLSRCVQLIAGMTIAVMHPDPLTLLCLGTLTAAEITSQTSLCEEIAAKFKTIFITPPPRFPTISLFQQQSPLKQVND
IFSLAKNLDWAVKTVEKVVDWFGTWIVQEEKEQTLDQLLQRFPEHAKRISDLRNGMAAYVECKESFDFFEKLYNQAVKEK
RTGIAAVCEKFRQKHDHATARCEPVVIVLRGDAGQGKSLSSQVIAQAVSKTIFGRQSVYSLPPDSDFFDGYENQFAAIMD
DLGQNPDGSDFTTFCQMVSTTNFLPNMASLERKGTPFTSQLVVATTNLPEFRPVTIAHYPAVERRITFDYSVSAGPVCSK
TEAGYKVLDVERAFRPTGEAPLPCFQNNCLFLEKAGLQFRDNRTKEIISLVDVIERAVARIERKKKVLTTVQTLVAQGPV
DEVSFHSVVQQLKARQQATDEQLEELQEAFAKVQERNSVFSDWLKISAMLCAATLALSQVVKMAKAVKQMVKPDLVRVQL
DEQEQGPYNETARVKPKTLQLLDIQGPNPVMDFEKYVAKHVTAPIGFVYPTGVSTQTCLLVRGRTLVVNRHMAESDWTSI
VVRGVTHARSTVKILAIAKAGKETDVSFIRLSSGPLFRDNTSKFVKAGDVLPTGAAPVTGIMNTDIPMMYTGTFLKAGVS
VPVETGQTFNHCIHYKANTRKGWCGSALLADLGGSKKILGIHSAGSMGIAAASIVSQEMIRAVVNAFEPQGALERLPDGP
RIHVPRKTALRPTVARQVFQPAYAPAVLSKFDPRTEADVDEVAFSKHTSNQESLPPVFRMVAKEYANRVFTLLGKDNGRL
TVKQALEGLEGMDPMDRNTSPGLPYTALGMRRTDVVDWESATLIPFAAERLRKMNEGDFSEVVYQTFLKDELRPIEKVQA
AKTRIVDVPPFEHCILGRQLLGKFASKFQTQPGLELGSAIGCDPDVHWTAFGVAMQGFERVYDVDYSNFDSTHSVAMFRL
LAEEFFTPENGFDPLTREYLESLAISTHAFEEKRFLITGGLPSGCAATSMLNTIMNNIIIRAGLYLTYKNFEFDDVKVLS
YGDDLLVATNYQLDFDKVRASLAKTGYKITPANTTSTFPLNSTLEDVVFLKRKFKKEGPLYRPVMNREALEAMLSYYRPG
TLSEKLTSITMLAVHSGKQEYDRLFAPFREVGVVVPSFESVEYRWRSLFW
>Q66765 ~~~~~~Genome polyprotein~~~
MATTMEQETCAHSLTFEECPKCSALQYRNGFYLLKYDEEWYPEELLTDGEDDVFDPELDMEVVFELQGNSTSSDKNNSSS
EGNEGVIINNFYSNQYQNSIDLSANAAGSDPPRTYGQFSNLFSGAVNAFSNMLPLLADQNTEEMENLSDRVSQDTAGNTV
TNTQSTVGRLVGYGTVHDGEHPASCADTASEKILAVERYYTFKVNDWTSTQKPFEYIRIPLPHVLSGEDGGVFGAALRRH
YLVKTGWRVQVQCNASQFHAGGLLVFMAPEYPTLDAFAMDNRWSKDNLPNGTRTQTNKKGPFAMDHQNFWQWTLYPHQFL
NLRTNTTVDLEVPYVNIAPTSSWTQHASWTLVIAVVAPLTYSTGASTSLDITASIQPVRPVFNGLRHETLSRQSPIPVTI
REHAGTWYSTLPDSTVPIYGKTPVAPSNYMVGEYKDFLEIAQIPTFIGNKIPNAVPYIEASNTAVKTQPLATYQVTLSCS
CLANTFLAALSRNFAQYRGSLVYTFVFTGTAMMKGKFLIAYTPPGAGKPTSRDQAMQATYAIWDLGLNSSYSFTVPFISP
THFRMVGTDQVNITNADGWVTVWQLTPLTYPPGCPTSAKILTMVSAGKDFSLKMPISPAPWSPQGVENAEKGVTENTNAT
ADFVAQPVYLPENQTKVAFFYNRSSPIGAFTVKSGSLESGFAPFSNGTCPNSVILTPGPQFDPAYDQLRPQRLTEIWGNG
NEETSKVFPLKSKQDYSFCLFSPFVYYKCDLEVTLSPHTSGNHGLLVRWCPTGTPTKPTTQVLHEVSSLSEGRTPQVYSA
GPGISNQISFVVPYNSPLSVLSAVWYNGHKRFDNTGSLGIAPNSDFGTLFFAGTKPDIKFTVYLRYKNKRVFCPRPTVFF
PWPTSGDKIDMTPRAGVLMLESPNALDISRTYPTLHVLIQFNHRGLEVRLFRHGHFWAETRADVILRSKTKQVSFLSNGN
YPSMDSRAPWNPWKNTYQAVLRAEPCRVTMDIYYKRVRPFRLPLVQKEWPVREENVFGLYRIFNAHYAGYFADLLIHDIE
TNPGPFMFRPRKQVFQTQGAAVSSMAQTLLPNDLASKAMGSAFTALLDANEDAQKAMKIIKTLSSLSDAWENVKETLNNP
EFWKQLLSRCVQLIAGMTIAVMHPDPLTLLCLGTLTAAEITSQTSLCEEIAAKFKTIFITPPPRFPTISLFQQQSPLKQV
NDIFSLAKNLDWAVKTVEKVVDWFGTWIVQEEKEQTLDQLLQRFPEHAKRISDLRNGMAAYVECKESFDFFEKLYNQAVK
EKRTGIAAVCEKFRQKHDHATARCEPVVIVLRGDAGQGKSLSSQVIAQAVSKTIFGRQSVYSLPPDSDFFDGYENQFAAI
MDDLGQNPDGSDFTTFCQMVSTTNFLPNMASLERKGTPFTSQLVVATTNLPEFRPVTIAHYPAVERRITFDYSVSAGPVC
SKTEAGYKVLDVERAFRPTGEAPLPCFQNNCLFLEKAGLQFRDNRTKEIISLVDVIERAVARIERKKKVLTTVQTLVAQG
PVDEVSFHSVVQQLKARQQATDEQLEELQEAFAKVQERNSVFSDWLKISAMLCAATLALSQVVKMAKAVKQMVKPDLVRV
QLDEQEQGPYNETARVKPKTLQLLDIQGPNPVMDFEKYVAKHVTAPIGFVYPTGVSTQTCLLVRGRTLVVNRHMAESDWT
SIVVRGVTHARSTVKILAIAKAGKETDVSFIRLSSGPLFRDNTSKFVKAGDVLPTGAAPVTGIMNTDIPMMYTGTFLKAG
VSVPVETGQTFNHCIHYKANTRKGWCGSALLADLGGSKKILGIHSAGSMGIAAASIVSQEMIRAVVNAFEPQGALERLPD
GPRIHVPRKTALRPTVARQVFQPAYAPAVLSKFDPRTEADVDEVAFSKHTSNQESLPPVFRMVAKEYANRVFTLLGKDNG
RLTVKQALEGLEGMDPMDRNTSPGLPYTALGMRRTDVVDWESATLIPFAAERLRKMNEGDFSEVVYQTFLKDELRPIEKV
QAAKTRIVDVPPFEHCILGRQLLGKFASKFQTQPGLELGSAIGCDPDVHWTAFGVAMQGFERVYDVDYSNFDSTHSVAMF
RLLAEEFFTPENGFDPLTREYLESLAISTHAFEEKRFLITGGLPSGCAATSMLNTIMNNIIIRAGLYLTYKNFEFDDVKV
LSYGDDLLVATNYQLDFDKVRASLAKTGYKITPANTTSTFPLNSTLEDVVFLKRKFKKEGPLYRPVMNREALEAMLSYYR
PGTLSEKLTSITMLAVHSGKQEYDRLFAPFREVGVVVPSFESVEYRWRSLFW
>P12296 ~~~~~~Genome polyprotein~~~
MATTMEQEICAHSMTFEECPKCSALQYRNGFYLLKYDEEWYPEELLTDGEDDVFDPDLDMEVVFETQGNSTSSDKNNSSS
EGNEGVIINNFYSNQYQNSIDLSANATGSDPPKTYGQFSNLLSGAVNAFSNMLPLLADQNTEEMENLSDRVSQDTAGNTV
TNTQSTVGRLVGYGTVHDGEHPASCADTASEKILAVERYYTFKVNDWTSTQKPFEYIRIPLPHVLSGEDGGVFGATLRRH
YLVKTGWRVQVQCNASQFHAGSLLVFMAPEYPTLDVFAMDNRWSKDNLPNGTRTQTNRKGPFAMDHQNFWQWTLYPHQFL
NLRTNTTVDLEVPYVNIAPTSSWTQHASWTLVIAVVAPLTYSTGASTSLDITASIQPVRPVFNGLRHEVLSRQSPIPVTI
REHAGTWYSTLPDSTVPIYGKTPVAPANYMVGEYKDFLEIAQIPTFIGNKVPNAVPYIEASNTAVKTQPLAVYQVTLSCS
CLANTFLAALSRNFAQYRGSLVYTFVFTGTAMMKGKFLIAYTPPGAGKPTSRDQAMQATYAIWDLGLNSSYSFTVPFISP
THFRMVGTDQANITNVDGWVTVWQLTPLTYPPGCPTSAKILTMVSAGKDFSLKMPISPAPWSPQGVENAEKGVTENTDAT
ADFVAQPVYLPENQTKVAFFYDRSSPIGAFAVKSGSLESGFAPFSNKACPNSVILTPGPQFDPAYDQLRPQRLTEIWGNG
NEETSEVFPLKTKQDYSFCLFSPFVYYKCDLEVTLSPHTSGAHGLLVRWCPTGTPTKPTTQVLHEVSSLSEGRTPQVYSA
GPGTSNQISFVVPYNSPLSVLPAVWYNGHKRFDNTGDLGIAPNSDFGTLFFAGTKPDIKFTVYLRYKNMRVFCPRPTVFF
PWPTSGDKIDMTPRAGVLMLESPNPLDVSKTYPTLHILLQFNHRGLEARIFRHGQLWAETHAEVVLRSKTKQISFLSNGS
YPSMDATTPLNPWKSTYQAVLRAEPHRVTMDVYHKRIRPFRLPLVQKEWRTCEENVFGLYHVFETHYAGYFSDLLIHDVE
TNPGPFTFKPRQRPVFQTQGAAVSSMAQTLLPNDLASKAMGSAFTALLDANEDAQKAMKIIKTLSSLSDAWENVKGTLNN
PEFWKQLLSRCVQLIAGMTIAVMHPDPLTLLCLGVLTAAEITSQTSLCEEIAAKFKTIFTTPPPRFPVISLFQQQSPLKQ
VNDVFSLAKNLDWAVKTVEKVVDWFGTWVAQEEREQTLDQLLQRFPEHAKRISDLRNGMAAYVECKESFDFFEKLYNQAV
KEKRTGIAAVCEKFRQKHDHATARCEPVVIVLRGDAGQGKSLSSQIIAQAVSKTIFGRQSVYSLPPDSDFFDGYENQFAA
IMDDLGQNPDGSDFTTFCQMVSTTNLLPNMASLERKGTPFTSQLVVATTNLPEFRPVTIAHYPAVERRITFDYSVSAGPV
CSKTEAGCKVLDVERAFRPTGDAPLPCFQNNCLFLEKAGLQFRDNRSKEILSLVDVIERAVTRIERKKKVLTAVQTLVAQ
GPVDEVSFYSVVQQLKARQEATDEQLEELQEAFARVQERSSVFSDWMKISAMLCAATLALTQVVKMAKAVKQMVRPDLVR
VQLDEQEQGPYNETTRIKPKTLQLLDVQGPNPTMDFEKFVAKFVTAPIGFVYPTGVSTQTCLLVKGRTLAVNRHMAESDW
TSIVVRGVSHTRSSVKIIAIAKAGKETDVSFIRLSSGPLFRDNTSKFVKASDVLPHSSSPLIGIMNVDIPMMYTGTFLKA
GVSVPVETGQTFNHCIHYKANTRKGWCGSAILADLGGSKKILGFHSAGSMGVAAASIISQEMIDAVVQAFEPQGALERLP
DGPRIHVPRKTALRPTVARQVFQPAFAPAVLSKFDPRTDADVDEVAFSKHTSNQETLPPVFRMVAREYANRVFALLGRDN
GRLSVKQALDGLEGMDPMDKNTSPGLPYTTLGMRRTDVVDWETATLIPFAAERLEKMNNKDFSDIVYQTFLKDELRPIEK
VQAAKTRIVDVPPFEHCILGRQLLGKFASKFQTQPGLELGSAIGCDPDVHWTAFGVAMQGFERVYDVDYSNFDSTHSVAV
FRLLAEEFFSEENGFDPLVKDYLESLAISKHAYEEKRYLITGGLPSGCAATSMLNTIMNNIIIRAGLYLTYKNFEFDDVK
VLSYGDDLLVATNYQLNFDRVRTSLAKTGYKITPANKTSTFPLESTLEDVVFLKRKFKKEGPLYRPVMNREALEAMLSYY
RPGTLSEKLTSITMLAVHSGKQEYDRLFAPFREVGVIVPTFESVEYRWRSLFW
>P27408 ~~~ORF1~~~Genome polyprotein~~~
MSQTLSFVLKTHSVRKDFVHSVKVTLARRRDLQYLYNKLARTMRAEACPSCSSYDVCPNCTSSDIPDNGSSTTSIPSWED
VTKTSTYSLLLSEDTSDELCPDDLVNVAAHIRKALSTQAHPANTEMCKEQLTSLLVMAEAMLPQRSRASIPLHQQHQAAR
LEWREKFFSKPLDFLLERIGVSKDILQITAIWKIILEKACYCKSYGEQWFCAAKQKLREMRTFESDTLKPLVGAFIDGLR
FMTVDNPNPMGFLPKLIGLVKPLNLAMIIDNHENTLSGWVVTLTAIMELYNITECTIDVITSLVTGFYDKISKATKFFSQ
VKALFTGFRSEDVANSFWYMAAAILCYLITGLIPNNGRFSKIKACLSGATTLVSGIIATQKLAAMFATWNSESIVNELSA
RTVAISELNNPTTTSDTDSVERLLELAKILHEEIKVHTLNPIMQSYNPILRNLMSTLDGVITSCNKRKAIARKRQVPVCY
ILTGPPGCGKTTAAQALAKKLSDQEPSVINLDVDHHDTYTGNEVCIIDEFDSSDKVDYANFVIGMVNSAPMVLNCDMLEN
KGKLFTSKYIIMTSNSETPVKPSSKRAGAFYRRVTIIDVTNPLVESHKRARPGTSVPRSCYKKNFSHLSLAKRGAECWCK
EYVLDPKGLQHQTIKAPPPTFLNIDSLAQTMKQDFVLKNMAFEAEDGCSEHRYGFICQQSEVETVRRLLNAIRARLNATF
TVCVGPEASHSIGCTAHVLTPDEPFNGRRFIVSRCNEASLAALEGNCVQSALGVCMSNKDLTHLCHFIRGKIVNDSVRLD
ELPANQHVVTVNSVFDLAWALRRHLTLTGQFQAIRAAYDVLTVPDKVPAMLRHWMDETSFSDEHVVTQFVTPGGVVILES
CGGARIWALGHNVIRAGGVTATPTGGCVRLVGLSAQTLPWSEIFRELFTLLGRIWSSIKVSTLVLTALGMYASRFRPKSE
AKGKTKSKIGPYRGRGVALTDDEYDEWREHNANRKLDLSVEDFLMLRHRAALGADDADAVKFRSWWNSRTRPGDGFEDVT
VIGKGGVKHEKIRTSTLRAVDRGYDVSFAEESGPGTKFHKNAIGSVTDVCGEHKGYCVHMGHGVYASVAHVVKGDSYFLG
ERIFDVKTNGEFCCFRSTKILPSAAPFFSGKPTRDPWGSPVATEWKPKAYTTTSGKIVGCFATTSTETHPGDCGLPYIDD
NGRVTGLHTGSGGPKTPSAKLVVPYIHIDMKNKSVTPQKYDETKPNISYKGLVCKQLDEIRIIPKGTRLHVSPAHVDDFE
ECSHQPASLGSGDPRCPKSLTAIVVDSLKPYCDRVEGPPHDVLHRVQKMLIDHLSGFVPMNISSETSMLSAFHKLNHDTS
CGPYLGGRKKDHMVNGEPDKQLLDLLSSKWKLATQGIALPHEYTIGLKDELRPIEKVQEGKRRMIWGCDVGVATVCAAAF
KGVSDAITANHQYGPIQVGINMDSPSVEVLYQRIKSAAKVFAVDYSKWDSTQSPRVSAASIDILRYFSDRSPIVDSAANT
LKSPPIAIFNGVAVKVASGLPSGMPLTSVINSLNHCMYVGCAILQSLEARQIPVTWNLFSSFDMMTYGDDGVYMFPTMFA
SVSDQIFGNLSAYGLKPTRVDKSVGAIESIDPESVVFLKRTITRTPNGIRGLLDRSSIIRQFFYIKGENSDDWKTPPKTI
DPTSRGQQLWNACLYASQHGVEFYNKVLKLAMRAVEYEGLHLKPPSYSSALEHYNSQFNGVEARSDQINMSDVTALHCDV
FEV
>P27409 ~~~ORF1~~~Genome polyprotein~~~
MSQTLSFVLKTHSVRKDFVHSVKLTLARRRDLQYIYNKLSRTIRAEACPSCASYDVCPNCTSGDVPDDGSSTMSIPSWED
VTKSSTYSLLLSEDTSDELCPEDLVNVAAHIRKALSTQSHPANAEMCKEQLTFLLVMAEAMLPQRSRASIPLHQQHTAAR
LEWREKFFSKPLDFLLERVGVSKDILQTTAIWKIILEKACYCKSYGEQWFTAAKQKLREMKNFESDTLKPLIGGFIDGLR
FLTVDNPNPMGFLPKLIGLVKPLNLAMIIDNHENTISGWIITLTAIMELYNITECTIDIITSVITAFYDKIGKATKFYSC
VKALFTGFRSEDVANSFWYMAAAILCYLITGLIPNNGRFSKIKACLAGATTLVSGIVATQKLAAMFATWNSESIVNELSA
RTVALSELNNPTTTSDTDSVERLLELAKILHEEIKIHTLNPIMQSYNPILRNLMSTLDGVITSCNKRKAIARKRQVPVCY
ILTGPPGCGKTTAAQALAKKLSDQEPSVINLDVDHHDTYTGNEVCIIDEFDSSDKVDYANFVIGMVNSAPMVLNCDMLEN
KGKLFTSKYIIMTSNSETPVKPSSKRAGAFYRRVTIIDVTNPFVESHKRARPGTSVPRSCYKKNFSHLSLAKRGAECWCK
EYVLDPKGLQHQSMKAPPPTFLNIDSLAQTMKQDFLLKNMAFEAEDGCAEHRYGFVCQQEEVETVRRLLNAVRARMNATF
TVCVGPETSHSIGCTAHVLTPNETFNGKKFVVSRCNEASLSALEGNCVKSALGVCMSDKDLTHLCHFIKGKIVNDSVRLD
ELPANQHVVTVNSVFDLAWAVRRHLTLAGQFQAIRAAYDVLTVPDKIPAMLRHWMDETSFSDDHVVTQFVTPGGIVILES
CGGARIWALGRNVIRAGGVTATPTGGCVRLMGLSAPTMPWSEIFRELFSLLGRIWSSVKVSALVLTALGMYASRFRPKSE
AKGKTKLKIGTYRGRGVALTDDEYDEWREHNASRKLDLSVEDFLMLRHRAALGADDNDAVKFRSWWNSRTKMANDYEDVT
VIGKGGVKHEKIRTNTLKAVDRGYDVSFAEESGPGTKFHKNAIGSVTDVCGEHKGYCIHMGHGVYASVAHVVKGDSFFLG
ERIFDLKTNGEFCCFRSTKILPSAAPFFSGKPTRDPWGSPVATEWKPKMYTTTSGKILGCFATTSTETHPGDCGLPYIDD
NGRVTGLHTGSGGPKTPSAKLVVPYVHIDMKTKSVTAQKYDVTKPDISYKGLICKQLDEIRIIPKGTRLHVSPAHTEDYQ
ECSHQPASLGSGDPRCPKSLTAIVVDSLKPYCENVEGPPHDVLHRVQKMLIDHLSGFVPMNISSETSMLSAFHKLNHDTS
CGPYLGGRKKDHMANGEPDKQLLDLLSAKWKLATQGIALPHEYTIGLKDELRPVEKVSEGKRRMIWGCDVGVATVCAAAF
KGVSDAITANHQYGPIQVGINMDSPSVEALFQRIKSAAKVFAVDYSKWDSTQSPRVSAASIDILRYFSDRSPIVDSASNT
LKSPPVAIFNGVAVKVSSGLPSGMPLTSVINSLNHCLYVGCAILQSLEAKAIPVTWNLFSTFDIMTYGDDGVYMFPIMYA
SISDQIFGNLSSYGLKPTRVDKSVGAIEPIDPDSVVFLKRTITRTPQGIRGLLDRSSIIRQFYYIKGENSDDWKSPPKHI
DPTSRGQQLWNACLYASQHGLEFFNKVYRLAERAVEYEELHFEPPTYASALDHYNSQFNGVEARSDQIDSSGMTALHCDV
FEV
>Q66914 ~~~ORF1~~~Genome polyprotein~~~
MSQTLSFVLKTHSVRKDFVHSVKRTLQRRRDLQYLYNKLSRPIRAEACPSCASYDVCPNCTSGSIPDDGSSKGQIPSWED
VTKTSTYSLLLSEDTSDELHPDDLVNVAAHIRKALSTQSHPANVDMCKEQLTSLLVMAEAMLPQRSRSTLPLHQKYVAAR
LEWREKFFSKPLDFLLEKIGTSRDILQITAVWKIIIEKACYCKSYGEHWFEAAKQKLREIKSYEHNTLKPLIGAFIDGLR
LMTIDNPNPMGFLPKLIGLIKPLNLAMIIDNHENTLSGWVITLTAIMELYNITECTIDVITSIITGFYDKIGKATKFYSQ
IKALFTGFRSEDVANSFWYMAAAILCYLITGLIPNNGRLSKIKACLAGATTLVSGIVATQKLAAMFATWNSESIVNELSA
RTVAISELNNPTTTSDTDSVERLLELAKILHEEIKIHTLNPIMQSYNPILRNLMSTLDGVITSCNKRKAIAKKRPVPVCY
ILTGPPGCGKTTAALALAKKLSDQEPSVINLDVDHHDTYTGNEVCIVDEFDSSDKVDYANFVIGMVNSAPMVLNCDMLEN
KGKLFTSKYIIMTSNSETPVKPSSRRAGAFYRRVTIIDVANPLAESHKRARPGTSVPRSCYKKNFSHLSLAKRGAECWCK
EYVLDPKGLQHQSIKAPPPTFLNIDSLAQTMKQDFTLKNMAFEAENGHSEHRYGFVCQQGEVETVRRLLNAVRTRLNATF
TVCVGSEASSSIGCTAHVLTPDEPFNGKKYVVSRCNEASLSALEGNCVQSALGVCMSTKDLTHLCHFIRGKIVNDSVRLD
ELPANQHVVTVNSVFDLAWALRRHLTLAGQFQAIRAAYDVLTAPDKVPAMLRHWMDETSFSDEHVVTQFVTPGGIVILES
CGGARIWALGHNVIRAGGVTATPTGGCIRFMGLSAQTMPWSEIFRELFSLLGRIWSSIKVSTLVLTALGMYASRFRPKSE
AKGKTKSKVGPYRGRGVALTDDEYDEWREHNATRKLDLSVEDFLMLRHRAALGADDADAVKFRSWWNSRSRLADDYEDVT
VIGKGGVKHEKIRTNTLRAVDRGYDVSFAEESGPGTKFHKNAIGSVTDVCGEHKGYCVHMGHGVYATVAHVAKGDSFFLG
ERIFDLKTNGEFCCFRSTKILPSAAPFFPGKPTRDPWGSPVATEWKPKPYTTTSGKIVGCFATTSTETHPGDCGLPYIDD
NGRVTGLHTGSGGPKTPSAKLVVPYVHIDMKTKSVTAQKYDVTKPDISYKGLICKQLDEIRIIPKGTRLHVSPAHTEDFE
ECSHQPASLGSGDPRCPKSLTAIVVDSLKPYCDKVEGPPHDILHRVQKMLIDHLSGFVPVNISSETSMLSAFHKLNHDTS
CGPYLGGRKKDHMTNGEPDKPLLDLLSAKWKLATQGIALPHEYTIGLKDELRPVEKVAEGKRRMIWGCDVGVATVCAAAF
KGVSDAITANHQYGPVQVGINMDSPSVEALHQRIKSAAKVYAVDYSKWDSTQSPRVSAASIDILRYFSDRSPIVDSAANT
LKSPPIAIFNGVAVKVSSGLPSGMPLTSVINSLNHCLYVGCAILQSLEARGVPVTWNLFSTFDMMTYGDDGVYMFPMMFA
SVSDQIFANLSAYGLKPTRVDKSVGSIEPIDPESVVFLKRTITRTPQGIRGLLDRSSIIRQFYYIKGENSDDWKTPPKSI
DPTSRGQQLWNACLYASQHGVEFYNKIYKLAQKAVEYEELHLEPPTYHSALEHYNNQFNGVEARSDQIDSSGMTALHCDV
FEV
>P03306 ~~~~~~Genome polyprotein~~~
MNTTNCFIALVYLIREIKTLFRSRTKGKMEFTLHNGEKKTFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYNSPENLTL
DAIKQLENFTGLELHEGGPPALVIWNIKHLLQTGIGTASRPSEVCMVDGTDMCLADFHAGIFMKGQEHAVFACVTSDGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNGDWKTQVQKKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMSTQLG
DNTISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFTGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTY
GYSTEEDHVAGPNTSGLETRVVQAERFFKKFLFDWTTDKPFGYLTKLELPTDHHGVFGHLVDSYAYMRNGWDVEVSAVGN
QFNGGCLLVAMVPEWKAFDTREKYQLTLFPHQFISPRTNMTAHITVPYLGVNRYDQYKKHKPWTLVVMVLSPLTVSNTAA
PQIKVYANIAPTYVHVAGELPSKEGIFPVACADGYGGLVTTDPKTADPVYGKVYNPPKTNYPGRFTNLLDVAEACPTFLR
FDDGKPYVVTRADDTRLLAKFDVSLAAKHMSNTYLSGIAQYYTQYSGTINLHFMFTGSTDSKARYMVAYIPPGVETPPDT
PEEAAHCIHAEWDTGLNSKFTFSIPYVSAADYAYTASDTAETTNVQGWVCVYQITHGKAENDTLLVSASAGKDFELRLPI
DPRTQTTTTGESADPVTTTVENYGGDTQVQRRHHTDVGFIMDRFVKINSLSPTHVIDLMQTHKHGIVGALLRAATYYFSD
LEIVVRHDGNLTWVPNGAPEAALSNTSNPTAYNKAPFTRLALPYTAPHRVLATVYDGTNKYSASDSRSGDLGSIAARVAT
QLPASFNYGAIQAQAIHELLVRMKRAELYCPRPLLAIKVTSQDRYKQKIIAPAKQLLNFDLLKLAGDVESNLGPFFFADV
RSNFSKLVDTINQMQEDMSTKHGPDFNRLVSAFEELATGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLVA
IMLADTGLEILDSTFVVKKSSDSLSSLFHVPAPAFSFGAPVLLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAIL
KNGEWLVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQRDLNDPGKYKEAKEWLDNARQACLKSGNVHIANLCKVVA
PAPSKSRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRIDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFKY
FAQMVSTTGFIPPMASLEDKGKPFNSKVIIATTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKINNKLDIIKALED
THTNPVAMFQYDCALLNGMAVEMKRLQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHPIFKQISIPSQKSVLYFLIE
KGQHEAAIEFFEGMVHDSVKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQKMVDDAVNEYIE
RANITTDDKTLDEAEKNPLETSGASTVGFRERSLTGQKVRDDVSSEPAQPAEDQPQAEGPYSGPLERQKPLKVRAKLPQQ
EGPYAGPMERQKPLKVKVKAPVVKEGPYEGPVKKPVALKVKARNLIVTESGAPPTDLQKMVMGNTKPVELNLDGKTVAIC
CATGVFGTAYLVPRHLFAEKYDKIMLDGRAMTDSDYRVFEFEIKVKGQDMLSDAALIVLHRGNCVRDITKHFRDTARMKK
GTPVVGVVNNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATRAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYC
SCVSRSMLQKMKAHVDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAYGVFNPEFGPAALSNKDPRLNEGVVLDDVIFS
KHKGDAKMTEEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGT
VGPEVEAALKLMEKREYKFACQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTKMMIGRFCAQMHSNNGPQIGSAVGCN
PDVDWQRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTDFGFHPNAEWILKTLVNTEHAYENKRITVEGGMPS
GCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLGQ
SITDVTFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRSL
YLRWVNAVCGDA
>P03307 ~~~~~~Genome polyprotein~~~
MHTTDCFIALVHAIREIRALFLPRTTGKMELTLHNGEKKTFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYNSPENLTL
EAINQLEELTGLELHEGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTDMCLADFHAGIFLKGQEHAVFACVTSNGWY
AIDDEEFYPWTPDPSDVLVFVPYDQEPLNGDWKAMVQRKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFTGLFGALLADKKTEETTLLEDRILTTRNGHTISTTQSSVGVTY
GYSTGEDHVAGPNTSGLETRVVQAERFFKKFLFDWTTDKPFGHLEKLELPADHHGVFGHLVESYAYMRNGWDVEVSAVGN
QFNGGCLLVAMVPEWKEFEQREKYQLTLFPHQFISPRTNMTAHITVPYLGVNRYDQYKKHKPWTLVVMVVSPLTVSDTAA
AQIKVYANIAPTYVHVAGELPSKEGIFPVACSDGYGGLVTTDPKTADPAYGKVYNPPRTNYPGRFTNLLDVAEACPTFLC
FDDGKPYVVTRTDDTRLLAKFDVSLAAKHMSNTYLSGIAQYYAQYSGTINLHFMFTGSTDSKARYMVAYIPPGVEVPPDT
PERAAHCIHAEWDTGLNSKFTFSIPYVSAADYAYTASDTAETTNVQGWVCIYQITHGKAENDTLVVSASAGKDFELRLPI
DPRQQTTAVGESADPVTTTVENYGGETQTQRRHHTDVGFIMDRFVKINSLSPTHVIDLMQTHQHGLVGALLRAATYYFSD
LEIVVRHDGNLTWVPNGAPEAALSNTSNPTAYNKAPFTRLALPYTAPHRVLATVYNGTNKYSTGGPRRGDTGSPAARAAK
QLPASFNYGAIRAVTIHELLVRMKRAELYCPRPLLAIEVSSQDRHKQKIIAPARQLLNFDLLKLAGDVESNPGPFFFSDV
RSNFSKLVETINQMQEDMSTKHGPDFNRLVSAFEELAAGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLVA
IMLADTGLEILDSTFVVKKISDSLSSLFHVPAPVFSFGAPILLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAIL
KNGEWLVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQHDLNDPSKYKEAKEWLDNARQACLKSGNVHIANLCKVVA
PAPSKPRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRTDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFKY
FAQMVSTTGFIPPMASLEDKGKPFNSKVIIATTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKINNKLDITKALED
THTNPVAMFQYDCALLNGMAVEMKRMQQDMFKPQPPLQNVYQLVQEVIDRVELHEKVSSHPIFKQISIPSQKSVLYFLIK
KGQHEAAIEFFEGMVHDSVKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQKMVDDAVNEYIE
KANITTDDKTLDEAEKNPLETSGASTVGFRERTLPGQKARDDVNSEPAQPAEEQPQAEGPYAGPLERQRPLKVRAKLPQQ
EGPYAGPMERQKPLKVKAKAPVVKEGPYEGPVKKPVALKVRAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAIC
CATGVFGTAYLVPRHLFAEKYDKIMLDGRAMTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDTARMKK
GTPVVGVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYRAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYC
SCVSRSMLLKMKAHIDPEPHHEGLIVDTRDAEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFS
KHKGDTKMSEEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGT
VGPEVEAALKLMEKREYKFVCQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGCN
PDVDWQRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTDFGFHPNAEWILKTLVNTEHAYENKRHTVEGGMPS
GCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLGH
SITDVTFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRSL
YLRWVNAVCGDA
>P03308 ~~~~~~Genome polyprotein~~~
MNTTNCFIALVHAIREIRAFFLSRATGKMEFTLYNGERKTFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYNSPENLTL
AAIKQLEELTGLELHEGGPPALVIWNIKHLLQTGIGTASRPSEVCMVDGTDMCLADFHAGIFLKGQEHAVFACVTSNGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNGGWKANVQRKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFTGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTY
GYSTEEDHVAGPNTSGLETRVVQAERFFKKFLFDWTTDKPFGHLEKLELPTDHHGVFGHLVDSYAYMRNGWDVEVSAVGN
QFNGGCLLVAMVPEWKEFDKREKYQLTLFPHQFISPRTNMTAHITVPYLGVNRYDQYKKHKPWTLVIMVVSPLTVSNTAA
TQIKVYANIAPTYVHVAGELPSKEGIFPVACSDGYGGLVTTDPKTADPVYGKVYNPPRTNYPGRFTNLLDVAEACPTFLC
FDDGKPYVVTRTDDTRLLAKFDVSLAAKHMSNTYLSGIAQYYTQYSGTINLHFMFTGSTDSKARYMVAYIPPGVETPPET
PEGAAHCIHAEWDTGLNSKFTFSIPYVSAADYAYTASDTAETTNVQGWVCIYQITHGKAEGDTLVVSASAGKDFELRLPI
DPRSQTTATGESADPVTTTVENYGGETQVQRRHHTDVSFIMDRFVKIKSLNPTHVIDLMQTHQHGLVGALLRAATYYFSD
LEIVVRHDGNLTWVPNGAPEAALSNTGNPTAYNKAPFTRLALPYTAPHRVLATVYNGTNKYSASGSGVRGDSGSLAPRVA
RQLPASFNYGAIKAETIHELLVRMKRAELYCPRPLLAIEVSSQDRHKQKIIAPGKQLLNFDLLKLAGDVESNPGPFFFAD
VRSNFSKLVDTINQMQEDMSTKHGPDFNRLVSAFEELATGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLV
AIMLADTGLEILDSTFVVKKISDSLSSLFHVPAPVFSFGAPVLLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAI
LKNGEWLVKLILAIRDWIKAWIASEEKFVTTTDLVPGILEKQRDLNDPSKYKEAKEWLDNARQACLKSGNVHIANLCKVV
APAPSKSRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRTDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFK
YFAQMVSTTGFIPPMASLEDKGKPFNSKVIIATTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKINNKLDIIKALE
DTHTNPVAMFQYDCALLNGMAVEMKRMQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHPIFKQISIPSQKSVLYFLI
EKGQHEAAIEFFEGMVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQKMVDDAVNEYI
EKANITTDDTTLDEAEKNPLETSGASTVGFRERTLTGQRACNDVNSEPARPAEEQPQAEGPYTGPLERQRPLKVRAKLPQ
QEGPYAGPLERQKPLKVKAKAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAI
CCATGVFGTAYLVPRHLFAEKYDKIMLDGRAMTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDTARMK
KGTPVVGVVNNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGY
CSCVSRSMLLRMKAHVDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIF
SKHKGDTKMSAEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMESDTAPGLPWAFQGKRRGALIDFENG
TVGPEVEAALKLMEKREYKFACQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGC
NPDVDWQRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTDFGFHPNAEWILKTLVNTEHAYENKRITVEGGMP
SGCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLG
HSITDVTFLKRHFHIDYGTGFYKPVMASKTLEAILSFARRGTIQEKLTSVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRS
LYLRWVNAVCGDA
>P03309 ~~~~~~Genome polyprotein~~~
MNTTDCFIALVHAIREIRAFFLPRATGRMEFTLHNGERKVFYSRPNNHDNCWLNTILQLFRYVGEPFFDWVYDSPENLTL
EAIEQLEELTGLELHEGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTNMCLADFHAGIFLKGQEHAVFACVTSNGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNGEWKTKVQQKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFTGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTH
GYSTEEDHVAGPNTSGLETRVVQAERFYKKYLFDWTTDKAFGHLEKLELPSDHHGVFGHLVDSYAYMRNGWDVEVSAVGN
QFNGGCLLVAMVPEWKEFDTREKYQLTLFPHQFISPRTNMTAHITVPYLGVNRYDQYKKHKPWTLVVMVVSPLTVNNTSA
AQIKVYANIAPTYVHVAGELPSKEGIFPVACADGYGGLVTTDPKTADPAYGKVYNPPRTNYPGRFTNLLDVAEACPTFLC
FDDGKPYVTTRTDDTRLLAKFDLSLAAKHMSNTYLSGIAQYYTQYSGTINLHFMFTGSTDSKARYMVAYIPPGVETPPDT
PERAAHCIHAEWDTGLNSKFTFSIPYVSAADYAYTASDTAETINVQGWVCIYQITHGKAENDTLVVSVSAGKDFELRLPI
DPRQQTTATGESADPVTTTVENYGGETQIQRRHHTDIGFIMDRFVKIQSLSPTHVIDLMQTHQHGLVGALLRAATYYFSD
LEIVVRHEGNLTWVPNGAPESALLNTSNPTAYNKAPFTRLALPYTAPHRVLATVYNGTSKYAVGGSGRRGDMGSLAARVV
KQLPASFNYGAIKADAIHELLVRMKRAELYCPRPLLAIEVSSQDRHKQKIIAPAKQLLNFDLLKLAGDVESNPGPFFFSD
VRSNFSKLVDTINQMQEDMSTKHGPDFNRLVSAFEELATGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLV
AIMLADTGLEILDSTFVVKKISDSLSSLFHVPAPVFSFGAPILLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAI
LKNGEWLVKLILAIRDWIKAWIASEEKFVTTTDLVPGILEKQRDLNDPSKYKEAKEWLDNARQACLKSGNVHIANLCKVV
APAPSRSRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRTDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFK
YFAQMVSTTGFIPPMASLEDKGKPFNSKVIIATTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKINNKLDIIKALE
DTHTNPVAMFQYDCALLNGMAVEMKRMQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHPIFKQISIPSQKSVLYFLI
EKGQHEAAIEFFEGMVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQKMVDDAVSEYI
ERANITTDDKTLDEAEKNPLETSGASTVGFRERPLPGQKARNDENSEPAQPAEEQPQAEGPYAGPLERQKPLKVRAKLPQ
QEGPYAGPMERQKPLKVKAKAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKLVMGNTKPVELILDGKTVAI
CCATGVFGTAYLVPRHLFAEKYDKIMLDGRAMTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDTARMK
KGTPVVGVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGY
CSCVSRSMLLKMKAHVDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNDGVVLDEVIF
SKHKGDTKMSEEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENG
TVGPEVEAALKLMEKREYKFACQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGC
NPDVDWQRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTEFGFHPNAEWILKTLVNTEHAYENKRITVEGGMP
SGCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLG
HSITDVTFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRS
LYLRWVNAVCGDA
>P03305 ~~~~~~Genome polyprotein~~~
MNTTDCFIALVQAIREIKALFLSRTTGKMELTLYNGEKKTFYSRPNNHDNCWLNAILQLFRYVEEPFFDWVYSSPENLTL
EAIKQLEDLTGLELHEGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTDMCLADFHAGIFLKGQEHAVFACVTSNGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNGEWKAKVQRKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFSGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTY
GYATAEDFVSGPNTSGLETRVVQAERFFKTHLFDWVTSDSFGRCHLLELPTDHKGVYGSLTDSYAYMRNGWDVEVTAVGN
QFNGGCLLVAMVPELYSIQKRELYQLTLFPHQFINPRTNMTAHITVPFVGVNRYDQYKVHKPWTLVVMVVAPLTVNTEGA
PQIKVYANIAPTNVHVAGEFPSKEGIFPVACSDGYGGLVTTDPKTADPVYGKVFNPPRNQLPGRFTNLLDVAEACPTFLR
FEGGVPYVTTKTDSDRVLAQFDMSLAAKQMSNTFLAGLAQYYTQYSGTINLHFMFTGPTDAKARYMVAYAPPGMEPPKTP
EAAAHCIHAEWDTGLNSKFTFSIPYLSAADYAYTASGVAETTNVQGWVCLFQITHGKADGDALVVLASAGKDFELRLPVD
ARAETTSAGESADPVTTTVENYGGETQIQRRQHTDVSFIMDRFVKVTPQNQINILDLMQIPSHTLVGALLRASTYYFSDL
EIAVKHEGDLTWVPNGAPEKALDNTTNPTAYHKAPLTRLALPYTAPHRVLATVYNGECRYNRNAVPNLRGDLQVLAQKVA
RTLPTSFNYGAIKATRVTELLYRMKRAETYCPRPLLAIHPTEARHKQKIVAPVKQTLNFDLLKLAGDVESNPGPFFFSDV
RSNFSKLVETINQMQEDMSTKHGPDFNRLVSAFEELAIGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLVA
IMLADTGLEILDSTFVVKKISDSLSSLFHVPAPVFSFGAPVLLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAIL
KNGEWLVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQRDLNDPSKYKEAKEWLDNARQACLKSGNVHIANLCKVVA
PAPSKSRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRIDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFKY
FAQMVSTTGFIPPMASLEDKGKPFNSKVIIATTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKINSKLDIIKALED
THANPVAMFQYDCALLNGMAVEMKRMQQDMFKPQPPLQNVYQLVQEVIDRVELHEKVSSHPIFKQISIPSQKSVLYFLIE
KGQHEAAIEFFEGMVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQKMVDDAVNEYIE
KANITTDDKTLDEAEKSPLETSGASTVGFRERTLPGQKACDDVNSEPAQPVEEQPQAEGPYAGPLERQKPLKVRAKLPQQ
EGPYAGPMERQKPLKVKAKAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAIC
CATGVFGTAYLVPRHLFAEKYDKIMVDGRAMTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDTARMKK
GTPVVGVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYRAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYC
SCVSRSMLLKMKAHIDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFS
KHKGDTKMSEEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGT
VGPEVEAALKLMEKREYKFVCQTFLKDEIRPLEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGCN
PDVDWQRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTEFGFHPNAEWILKTLVNTEHAYENKRITVGGGMPS
GCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLGH
SITDVTFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRSL
YLRWVNAVCGDA
>P03311 ~~~~~~Genome polyprotein~~~
MNTTDCFIAVVNAIKEVRALFLPRTAGKMEFTLHDGEKKVFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYNSPENLTL
EAIKQLEELTGLELREGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTDMCLADFHAGIFMKGREHAVFACVTSNGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNEGWKASVQRKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFSGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTF
GYATAEDSTSGPNTSGLETRVHQAERFFKMALFDWVPSQNFGHMHKVVLPHEPKGVYGGLVKSYAYMRNGWDVEVTAVGN
QFNGGCLLVALVPEMGDISDREKYQLTLYPHQFINPRTNMTAHITVPYVGVNRYDQYKQHRPWTLVVMVVAPLTTNTAGA
QQIKVYANIAPTNVHVAGELPSKEGIFPVACSDGYGNMVTTDPKTADPAYGKVYNPPRTALPGRFTNYLDVAEACPTFLM
FENVPYVSTRTDGQRLLAKFDVSLAAKHMSNTYLAGLAQYYTQYTGTINLHFMFTGPTDAKARYMVAYVPPGMDAPDNPE
EAAHCIHAEWDTGLNSKFTFSIPYISAADYAYTASHEAETTCVQGWVCVYQITHGKADADALVVSASAGKDFELRLPVDA
RQQTTTTGESADPVTTTVENYGGETQVQRRHHTDVAFVLDRFVKVTVSDNQHTLDVMQAHKDNIVGALLRAATYYFSDLE
IAVTHTGKLTWVPNGAPVSALNNTTNPTAYHKGPVTRLALPYTAPHRVLATAYTGTTTYTASARGDLAHLTTTHARHLPT
SFNFGAVKAETITELLVRMKRAELYCPRPILPIQPTGDRHKQPLVAPAKQLLNFDLLKLAGDVESNPGPFFFSDVRSNFS
KLVETINQMQEDMSTKHGPDFNRLVSAFEELASGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLVAIMLAD
TGLEILDSTFVVKKISDSLSSLFHVPAPAFSFGAPILLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAILKNGEW
LVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQRDLNDPSKYKDAKEWLDNTRQVCLKSGNVHIANLCKVVAPAPSK
SRPEPVVVCLRGKSGQGKSFLANVLAQAISTHLTGRTDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFKYFAQMV
STTGFIPPMASLEDKGKPFSSKVIIATTNLYSGFTPKTMVCPDALNRRFHFDIDVSAKDGYKINNKLDIIKALEDTHTNP
VAMFQYDCALLNGMAVEMKRLQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHPIFKQISIPSQKSVLYFLIEKGQHE
AAIEFFEGMVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETHKRQKMVDDAVNEYIEKANIT
TDDQTLDEAEKNPLETSGASTVGFRERTLPGQKARDDVNSEPAQPTEEQPQAEGPYAGPLERQRPLKVRAKLPRQEGPYA
GPMERQKPLKVKARAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAICCATGV
FGTAYLVPRHLFAEKYDKIMLDGRALTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDVARMKKGTPVV
GVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYCSCVSR
SMLLKMKAHIDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFSKHKGD
TKMSAEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEV
EAALKLMEKREYKFACQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGCNPDVDW
QRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTEFGFHPNAEWILKTLVNTEHAYENKRITVEGGMPSGCSAT
SIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLGHSITDV
TFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRSLYLRWV
NAVCGDA
>P15072 ~~~~~~Genome polyprotein~~~
MNTTDCFIAVVNAIREIRALFLPRTTGKMEFTLHDGEKKVFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYNSPENLTL
EAIKQLEELTGLELREGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTDMCLADFHAGIFMKGQEHAVFACVTSNGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNEGWKANVQRKLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFSGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTF
GYATAEDSTSGPNTSGLETRVHQAERFFKMALFDWVPSQNFGHMHKVVLPHEPKGVYGGLVKSYAYMRNGWDVEVTAVGN
QFNGGCLLVALVPEMGDISDREKYQLTLYPHQFINPRTNMTAHITVPYVGVNRYDQYKQHRPWTLVVMVVAPLTTNTAGA
QQIKVYANIAPTNVHVAGELPSKEGIFPVACSDGYGNMVTTDPKTADPAYGKVYNPPRTALPGRFTNYLDVAEACPTFLM
FENVPYVSTRTDGQRLLAKFDVSLAAKHMSNTYLAGLAQYYTQYTGTINLHFMFTGPTDAKARYMVAYVPPGMDAPDNPE
EAAHCIHAEWDTGLNSKFTFSIPYISAADYAYTASHEAETTCVQGWVCVYQITHGKADADALVVSASAGKDFELRLPVDA
RQQTTATGESADPVTTTVENYGGETQVQRRHHTDVAFVLDRFVKVTVSGNQHTLDVMQAHKDNIVGALLRAATYYFSDLE
IAVTHTGKLTWVPNGAPVSALDNTTNPTAYHKGPLTRLALPYTAPHRVLATAYTGTTTYTASTRGDSAHLTATRARHLPT
SFNFGAVKAETITELLVRMKRAELYCPRPILPIQPTGDRHKQPLVAPAKQLLNFDLLKLAGDVESNPGPFFFSDVRSNFS
KLVETINQMQEDMSTKHGPDFNRLVSAFEELASGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLVAIMLAD
TGLEILDSTFVVKKISDSLSSLFHVPAPAFSFGAPILLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAILKNGEW
LVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQRDLNDPSKYKDAKEWLDNTRQACLKSGNVHIANLCKVVAPAPSK
SRPEPVVVCLRGKSGQGKSFLANVLAQAISTHLTGRTDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFKYFAQMV
STTGFIPPMASLEDKGKPFSSKVIIATTNLYSGFTPKTMVCPDALNRRFHFDIDVSAKDGYKINNKLDIIKALEDTHTNP
VAMFQYDCALLNGMAVEMKRLQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHPIFKQISIPSQKSVLYFLIEKGQHE
AAIEFFEGMVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETHKRQKMVDDAVNEYIEKANIT
TDDKTLDEAEKNPLETSGASTVGFRERTLPGQKARDDVNSEPAQPTEEQPQAEGPYAGPLERQRPLKVRAKLPQQEGPYA
GPMERQKPLKVKARAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAICCATGV
FGTAYLVPRHLFAEKYDKIMLDGRALTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDVARMKKGTPVV
GVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGYCSCVSR
SMLLKMKAHIDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIFSKHKGD
TKMSEEDKALFRRCAADYASRLHSVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGALIDFENGTVGPEV
EAALKLMEKREYKFACQTFLKDEIRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGCNPDVDW
QRFGTHFAQYRNVWDVDYSAFDANHCSDAMNIMFEEVFRTEFGFHPNAEWILKTLVNTEHAYENKRITVEGGMPSGCSAT
SIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLGHSITDV
TFLKRHFHMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRSLYLRWV
NAVCGDA
>P49303 ~~~~~~Genome polyprotein~~~
MNTTDCFIALLYALREIKAFLLSRTQGKMELTLYNGEKKTFYSRPNNHDNCWLNTILQLFRYVDEPFFDWVYDSPENLTC
EAIRQLEEITGLELHEGGPPALVIWNIKHLLHTGIGTASRPSEVCMVDGTDMCLADFHAGIFLKGQEHAVFACVTSDGWY
AIDDEDFYPWTPDPSDVLVFVPYDQEPLNGEWKAKVQKRLKGAGQSSPATGSQNQSGNTGSIINNYYMQQYQNSMDTQLG
DNAISGGSNEGSTDTTSTHTTNTQNNDWFSKLASSAFSGLFGALLADKKTEETTLLEDRILTTRNGHTTSTTQSSVGVTY
GYSTQEDHVSGPNTSGLETRVVQAERFFKKYLFDWTPDKAFGHLEKLELPTDHKGVYGHLVDSFAYMRNGWDVEVSAVGN
QFNGGCLLVAMVPEWKELTPREKYQLTLFPHQFISPRTNMTAHIVVPYLGVNRYDQYKKHKPWTLVVMVVSPLTTNTVSA
GQIKVYANIAPTHVHVAGELPSKEGIVPVACSDGYGGLVTTDPKTADPVYGMVYNPPRTNYPGRFTNLLDVAEACPTFLC
FDDGKPYVVTRTDEQRLLAKFDLSLAAKHMSNTYLSGIAQYYAQYSGTINLHFMFTGSTDSKARYMVAYVPPGVETPPDT
PEKAAHCIHAEWDTGLNSKFTFSIPYVSAADYAYTASDVAETTNVQGWVCIYQITHGKAEQDTLVVSVSAGKDFELRLPI
DPRSQTTSTGESADPVTTTVENYGGETQVQRRQHTDVTFIMDRFVKIQNLNPIHVIDLMQTHQHGLVGALLRAATYYFSD
LEIVVRHDGNLTWVPNGAPEAALSNMGNPTAYPKAPFTRLALPYTAPHRVLATVYNGTGKYSAGGMGRRGDLEPLAARVA
AQLPTSFNFGAIQATTIHELLVRMKRAELYCPRPLLAVEVSSQDRHKQKIIAPAKQLLNFDLLKLAGDVESNPGPFFFSD
VRSNFSKLVETINQMQEDMSTKHGPDFNRLVSAFEELATGVKAIRTGLDEAKPWYKLIKLLSRLSCMAAVAARSKDPVLV
AIMLADTGLEILDSTFVVKKISDSLSSLFHVPAPVFSFGAPILLAGLVKVASSFFRSTPEDLERAEKQLKARDINDIFAI
LKNGEWLVKLILAIRDWIKAWIASEEKFVTMTDLVPGILEKQRDLNDPSKYKEAKEWLDSARQACLKNGNVHIANLCKVV
TPAPSKSRPEPVVVCLRGKSGQGKSFLANVLAQAISTHFTGRIDSVWYCPPDPDHFDGYNQQTVVVMDDLGQNPDGKDFK
YFAQMVSTTGFIPPMASLEDKGKPFNSKVIITTTNLYSGFTPRTMVCPDALNRRFHFDIDVSAKDGYKVNNKLDITKALE
DTHTNPVAMFKYDCALLNGMAVEMKRMQQDMFKPQPPLQNVYQLVQEVIERVELHEKVSSHQIFKQISIPSQKSVLYFLI
EKGQHEAAIEFFEGLVHDSIKEELRPLIQQTSFVKRAFKRLKENFEIVALCLTLLANIVIMIRETRKRQQMVDDAVNEYI
EKANITTDDKTLDEAEKNPLETSGVSIVGFRERTLPGHRASDDVNSEPARPVEEQPQAEGPYTGPLERQKPLKVKAKLPQ
QEGPYAGPMERQKPLKVKVKAPVVKEGPYEGPVKKPVALKVKAKNLIVTESGAPPTDLQKMVMGNTKPVELILDGKTVAI
CCATGVFGTAYLVPRHLFAEKYDKIMLDGRAMTDSDYRVFEFEIKVKGQDMLSDAALMVLHRGNRVRDITKHFRDTARMK
KGTPVVGVINNADVGRLIFSGEALTYKDIVVCMDGDTMPGLFAYKAATKAGYCGGAVLAKDGADTFIVGTHSAGGNGVGY
CSCVSRSMLLKMKAHIDPEPHHEGLIVDTRDVEERVHVMRKTKLAPTVAHGVFNPEFGPAALSNKDPRLNEGVVLDEVIF
SKHKGDTKMTEEDKALFRRCAADYASRLHNVLGTANAPLSIYEAIKGVDGLDAMEPDTAPGLPWALQGKRRGTLIDFENG
TVGPEVASALELMEKRQYKFTCQTFLKDEVRPMEKVRAGKTRIVDVLPVEHILYTRMMIGRFCAQMHSNNGPQIGSAVGC
NPDVDWQRFGTHFAQYKNVWDVDYSAFDANHCSDAMNIMFEEVFRTEFGFHPNAEWILKTLVNTEHAYENKRITVEGGMP
SGCSATSIINTILNNIYVLYALRRHYEGVELDTYTMISYGDDIVVASDYDLDFEALKPHFKSLGQTITPADKSDKGFVLG
QSITDVTFLKRHFRMDYGTGFYKPVMASKTLEAILSFARRGTIQEKLISVAGLAVHSGPDEYRRLFEPFQGLFEIPSYRS
LYLRWVNAVCGDAQSL
>Q69422 ~~~~~~Genome polyprotein~~~
MPVISTQTSPVPAPRTRKNKQTQASYPVSIKTSVERGQRAKRKVQRDARPRNYKIAGIHDGLQTLAQAALPAHGWGRQDP
RHKSRNLGILLDYPLGWIGDVTTHTPLVGPLVAGAVVRPVCQIVRLLEDGVNWATGWFGVHLFVVCLLSLACPCSGARVT
DPDTNTTILTNCCQRNQVIYCSPSTCLHEPGCVICADECWVPANPYISHPSNWTGTDSFLADHIDFVMGALVTCDALDIG
ELCGACVLVGDWLVRHWLIHIDLNETGTCYLEVPTGIDPGFLGFIGWMAGKVEAVIFLTKLASQVPYAIATMFSSVHYLA
VGALIYYASRGKWYQLLLALMLYIEATSGNPIRVPTGCSIAEFCSPLMIPCPCHSYLSENVSEVICYSPKWTRPVTLEYN
NSISWYPYTIPGARGCMVKFKNNTWGCCRIRNVPSYCTMGTDAVWNDTRNTYEACGVTPWLTTAWHNGSALKLAILQYPG
SKEMFKPHNWMSGHLYFEGSDTPIVYFYDPVNSTLLPPERWARLPGTPPVVRGSWLQVPQGFYSDVKDLATGLITKDKAW
KNYQVLYSATGALSLTGVTTKAVVLILLGLCGSKYLILAYLCYLSLCFGRASGYPLRPVLPSQSYLQAGWDVLSKAQVAP
FALIFFICCYLRCRLRYAALLGFVPMAAGLPLTFFVAAAAAQPDYDWWVRLLVAGLVLWAGRDRGPRIALLVGPWPLVAL
LTLLHLATPASAFDTEIIGGLTIPPVVALVVMSRFGFFAHLLPRCALVNSYLWQRWENWFWNVTLRPERFLLVLVCFPGA
TYDTLVTFCVCHVALLCLTSSAASFFGTDSRVRAHRMLVRLGKCHAWYSHYVLKFFLLVFGENGVFFYKHLHGDVLPNDF
ASKLPLQEPFFPFEGKARVYRNEGRRLACGDTVDGLPVVARLGDLVFAGLAMPPDGWAITAPFTLQCLSERGTLSAMAVV
MTGIDPRTWTGTIFRLGSLATSYMGFVCDNVLYTAHHGSKGRRLAHPTGSIHPITVDAANDQDIYQPPCGAGSLTRCSCG
ETKGYLVTRLGSLVEVNKSDDPYWCVCGALPMAVAKGSSGAPILCSSGHVIGMFTAARNSGGSVSQIRVRPLVCAGYHPQ
YTAHATLDTKPTVPNEYSVQILIAPTGSGKSTKLPLSYMQEKYEVLVLNPSVATTASMPKYMHATYGVNPNCYFNGKCTN
TGASLTYSTYGMYLTGACSRNYDVIICDECHATDATTVLGIGKVLTEAPSKNVRLVVLATATPPGVIPTPHANITEIQLT
DEGTIPFHGKKIKEENLKKGRHLIFEATKKHCDELANELARKGITAVSYYRGCDISKIPEGDCVVVATDALCTGYTGDFD
SVYDCSLMVEGTCHVDLDPTFTMGVRVCGVSAIVKGQRRGRTGRGRAGIYYYVDGSCTPSGMVPECNIVEAFDAAKAWYG
LSSTEAQTILDTYRTQPGLPAIGANLDEWADLFSMVNPEPSFVNTAKRTADNYVLLTAAQLQLCHQYGYAAPNDAPRWQG
ARLGKKPCGVLWRLDGADACPGPEPSEVTRYQMCFTEVNTSGTAALAVGVGVAMAYLAIDTFGATCVRRCWSITSVPTGA
TVAPVVDEEEIVEECASFIPLEAMVAAIDKLKSTITTTSPFTLETALEKLNTFLGPHAATILAIIEYCCGLVTLPDNPFA
SCVFAFIAGITTPLPHKIKMFLSLFGGAIASKLTDARGALAFMMAGAAGTALGTWTSVGFVFDMLGGYAAASSTACLTFK
CLMGEWPTMDQLAGLVYSAFNPAAGVVGVLSACAMFALTTAGPDHWPNRLLTMLARSNTVCNEYFIATRDIRRKILGILE
ASTPWSVISACIRWLHTPTEDDCGLIAWGLEIWQYVCNFFVICFNVLKAGVQSMVNIPGCPFYSCQKGYKGPWIGSGMLQ
ARCPCGAELIFSVENGFAKLYKGPRTCSNYWRGAVPVNARLCGSARPDPTDWTSLVVNYGVRDYCKYEKLGDHIFVTAVS
SPNVCFTQVPPTLRAAVAVDGVQVQCYLGEPKTPWTTSACCYGPDGKGKTVKLPFRVDGHTPGVRMQLNLRDALETNDCN
SINNTPSDEAAVSALVFKQELRRTNQLLEAISAGVDTTKLPAPSIEEVVVRKRQFRARTGSLTLPPPPRSVPGVSCPESL
QRSDPLEGPSNLPSSPPVLQLAMPMPLLGAGECNPFTAIGCAMTETGGGPDDLPSYPPKKEVSEWSDGSWSTTTTASSYV
TGPPYPKIRGKDSTQSAPAKRPTKKKLGKSEFSCSMSYTWTDVISFKTASKVLSATRAITSGFLKQRSLVYVTEPRDAEL
RKQKVTINRQPLFPPSYHKQVRLAKEKASKVVGVMWDYDEVAAHTPSKSAKSHITGLRGTDVRSGAARKAVLDLQKCVEA
GEIPSHYRQTVIVPKEEVFVKTPQKPTKKPPRLISYPHLEMRCVEKMYYGQVAPDVVKAVMGDAYGFVDPRTRVKRLLSM
WSPDAVGATCDTVCFDSTITPEDIMVETDIYSAAKLSDQHRAGIHTIARQLYAGGPMIAYDGREIGYRRCRSSGVYTTSS
SNSLTCWLKVNAAAEQAGMKNPRFLICGDDCTVIWKSAGADADKQAMRVFASWMKVMGAPQDCVPQPKYSLEELTSCSSN
VTSGITKSGKPYYFLTRDPRIPLGRCSAEGLGYNPSAAWIGYLIHHYPCLWVSRVLAVHFMEQMLFEDKLPETVTFDWYG
KNYTVPVEDLPSIIAGVHGIEAFSVVRYTNAEILRVSQSLTDMTMPPLRAWRKKARAVLASAKRRGGAHAKLARFLLWHA
TSRPLPDLDKTSVARYTTFNYCDVYSPEGDVFVTPQRRLQKFLVKYLAVIVFALGLIAVGLAIS
>P08617 ~~~~~~Genome polyprotein~~~
MNMSRQGIFQTVGSGLDHILSLADIEEEQMIQSVDRTAVTGASYFTSVDQSSVHTAEVGSHQVEPLRTSVDKPGSKKTQG
EKFFLIHSADWLTTHALFHEVAKLDVVKLLYNEQFAVQGLLRYHTYARFGIEIQVQINPTPFQQGGLICAMVPGDQSYGS
IASLTVYPHGLLNCNINNVVRIKVPFIYTRGAYHFKDPQYPVWELTIRVWSELNIGTGTSAYTSLNVLARFTDLELHGLT
PLSTQMMRNEFRVSTTENVVNLSNYEDARAKMSFALDQEDWKSDPSQGGGIKITHFTTWTSIPTLAAQFPFNASDSVGQQ
IKVIPVDPYFFQMTNTNPDQKCITALASICQMFCFWRGDLVFDFQVFPTKYHSGRLLFCFVPGNELIDVSGITLKQATTA
PCAVMDITGVQSTLRFRVPWISDTPYRVNRYTKSAHQKGEYTAIGKLIVYCYNRLTSPSNVASHVRVNVYLSAINLECFA
PLYHAMDVTTQVGDDSGGFSTTVSTEQNVPDPQVGITTMKDLKGKANRGKMDVSGVQAPVGAITTIEDPVLAKKVPETFP
ELKPGESRHTSDHMSIYKFMGRSHFLCTFTFNSNNKEYTFPITLSSTSNPPHGLPSTLRWFFNLFQLYRGPLDLTIIITG
ATDVDGMAWFTPVGLAVDTPWVEKESALSIDYKTALGAVRFNTRRTGNIQIRLPWYSYLYAVSGALDGLGDKTDSTFGLV
SIQIANYNHSDEYLSFSCYLSVTEQSEFYFPRAPLNSNAMLSTESMMSRIAAGDLESSVDDPRSEEDKRFESHIECRKPY
KELRLEVGKQRLKYAQEELSNEVLPPPRKMKGLFSQAKISLFYTEEHEIMKFSWRGVTADTRALRRFGFSLAAGRSVWTL
EMDAGVLTGRLIRLNDEKWTEMKDDKIVSLIEKFTSNKYWSKVNFPHGMLDLEEIAANSKDFPNMSETDLCFLLHWLNPK
KINLADRMLGLSGVQEIKEQGVGLIAECRTFLDSIAGTLKSMMFGFHHSVTVEIINTVLCFVKSGILLYVIQQLNQDEHS
HIIGLLRVMNYADIGCSVISCGKVFSKMLETVFNWQMDSRMMELRTQSFSNWLRDICSGITIFKNFKDAIYWLYTKLKDF
YEVNYGKKKDILNILKDNQQKIEKAIEEADEFCILQIQDVEKFEQYQKGVDLIQKLRTVHSMAQVDPNLMVHLSPLRDCI
ARVHQKLKNLGSINQAMVTRCEPVVCYLYGKRGGGKSLTSIALATKICKHYGVEPEKNIYTKPVASDYWDGYSGQLVCII
DDIGQNTTDEDWSDFCQLVSGCPMRLNMASLEEKGRHFSSPFIIATSNWSNPSPKTVYVKEAIDRRLHFKVEVKPASFFK
NPHNDMLNVNLAKTNDAIKDMSCVDLIMDGHNVSLMDLLSSLVMTVEIRKQNMTEFMELWSQGISDDDNDSAVAEFFQSF
PSGEPSNSKLSGFFQSVTNHKWVAVGAAVGILGVLVGGWFVYKHFSRKEEEPIPAEGVYHGVTKPKQVIKLDADPVESQS
TLEIAGLVRKNLVQFGVGEKNGCVRWVMNALGVKDDWLLVPSHAYKFEKDYEMMEFYFNRGGTYYSISAGNVVIQSLDVG
FQDVVLMKVPTIPKFRDITQHFIKKGDVPRALNRLATLVTTVNGTPMLISEGPLKMEEKATYVHKKNDGTTVDLTVDQAW
RGKGEGLPGMCGGALVSSNQSIQNAILGIHVAGGNSILVAKLVTQEMFQNIDKKIESQRIMKVEFTQCSMNVVSKTLFRK
SPIYHHIDKTMINFPAAMPFSKAEIDPMAVMLSKYSLPIVEEPEDYKEASIFYQNKIVGKTQLVDDFLDLDMAITGAPGI
DAINMDSSPGFPYVQEKLTKRDLIWLDENGLLLGVHPRLAQRILFNTVMMENCSDLDVVFTTCPKDELRPLEKVLESKTR
AIDACPLDYSILCRMYWGPAISYFHLNPGFHTGVAIGIDPDRQWDELFKTMIRFGDVGLDLDFSAFDASLSPFMIREAGR
IMSELSGTPSHFGTALINTIIYSKHLLYNCCYHVCGSMPSGSPCTALLNSIINNVNLYYVFSKIFGKSPVFFCQALKILC
YGDDVLIVFSRDVQIDNLDLIGQKIVDEFKKLGMTATSADKNVPQLKPVSELTFLKRSFNLVEDRIRPAISEKTIWSLIA
WQRSNAEFEQNLENAQWFAFMHGYEFYQKFYYFVQSCLEKEMIEYRLKSYDWWRMRFYDQCFICDLS
>P13901 ~~~~~~Genome polyprotein~~~
MNMSRQGIFQTVGSGLDHILSLADIEEEQMIQSVDRTAVTGASYFTSVDQSSVHTAEVGSHQVEPLRTSVDKPGSKKTQG
EKFFLIHSADWLTTHALFHEVAKLDVVKLLYNEQFAVQGLLRYHTYARFGIEIQVQINPTPFQQGGLICAMVPGDQSYGS
IASLTVYPHGLLNCNINNVVRIKVPFIYTRGAYHFKDPQYPVWELTIRVWSELNIGTGTSAYTSLNVLARFTDLELHGLT
PLSTQMMRNEFRVSTTENVVNLSNYEDARAKMSFALDQEDWKSDPSQGGGIKITHFTTWTSIPTLAAQFPFNASDSVGQQ
IKVIPVDPYFFQMTNTNPDQKCITALASICQMFCFWRGDLVFDFQVFPTKYHSGRLLFCFVPGNELIDVSGITLKQATTA
PCAVMDITGVQSTLRFRVPWISDTPYRVNRYTKSAHQKGEYTAIGKLIVYCYNRLTSPSNVASHVRVNVYLSAINLECFA
PLYHAMDVTTQVGDDSGGFSTTVSTEQNVPDPQVGITTMKDLKGKANRGKMDVSGVQAPVGAITTIEDPVLAKKVPETFP
ELKPGESRHTSDHMSIYKFMGRSHFLCTFTFNSNNKEYTFPITLSSTSNPPHGLPSTLRWFFNLFQLYRGPLDLTIIIIG
ATDVDGMAWFTPVGLAVDTPWVEKESALSIDYKTALGAVRFNTRRTGNIQIRLPWYSYLYAVSGALDGLGDKTDSTFGLV
SIQIANYNHSDEYLSFSCYLSVTEQSEFYFPRAPLNSNAMLSTESMMSRIAAGDLESSVDDPRSEEDKRFESHIECRKPY
KELRLEVGKQRLKYAQEELSNEVLPPPRKKKGLFSQAKISLFYTEEHEIMKFSWRGVTADTRALRRFGFSLAAGRSVWTL
EMDAGVLTGRLIRLNDEKWTEMKDDKIVSLIEKFTSNKYWSKVNFPHGMLDLEEIAANSKDFPNMSETDLCFLLHWLNPK
KINLADRMLGLSGVQEIKEQGVGLIAECRTFLDSIAGTLKSMMFGFHHSVTVEIINTVLCFVKSGILLYVMQQLNQDEHS
HIIGLLRVMNYVDIGCSVISCGKVFSKMLETVFNWQMDSRMMELRTQSFSNWLRDICSGITIFKNFKDAIYWLYTKLNDF
YEVNYGKKKDILNILKDNQQKIEKAIEEADKFSILQIQDVEKFEQYQKGVDLIQKLRTVHSMAQVDPNLMVHLSPLRDCI
ARVHQKLKNLGSINQAMVTRCEPVVCYLYGKRGGGKSLTSIALATKICKHYGVEPEKNIYTKPVASDYWDGYSGQLVCII
DDIGQNTTDEDWSDFCQLVSGCPLRLNMASLEEKGRHFSSPFIIATSNWSNPSPKTVYVKEAIDRRLHFKVEVNPASFSK
NPHNDMLNVNLAKTNDAIKDMSCVDLIMDGHNVSLMDLLSSLVMTVEIRKQNMTAFMELWSQGISDDDNDSAMAEFFQSF
PSGEPSNSKLSGFFQSVTNHKWVAVGAAVGILGVLVGGWFVYKHFSRKEEEPIPAEGVYHGVTKPKQVIKLDADPVESQS
TLEIAGLVRKNLVQFGVGEKNGCVRWVMNALGVKDDWLLVPSHAYKFEKDYEMMEFYFNRGGTYYSISAGNVVIQSLDVG
FQDVVLMKVPTIPKFRDITQHFIKKGDVPRALNRLATLVTTVNGTPMLISEGPLKMEEKATYVHKKNDGTTVDLTVDQAW
RGKGEGLPGMCGGALVSSNQSIQNAILGIHVAGGNSILVAKLVTQEMFQNIDKKIESQRIMKVEFTQCSMNVVSKTLFRK
SPIHHHIDKTMINFPAAMPFSKAEIDPMAMMLSKYSLPIVEEPEDYKEASIFYQNKIVGKTQLVDDFLDLDMAITGAPGI
DAINMDSSPGFPYVQERLTKRDLIWLDENGLLLGVHPRLAQRILFNTVMMENCSDLDVVFTTCPKDELRPLEKVLESKTR
AIDACPLDYTILCRMYWGPAISYFHLNPGFHTGVAIGIDPDCQWDELFKTMIRFGDVGLDLDFSAFDASLSPFMIREAGR
IMSELSGTPSHFGTALMNTIIYSKHLLYNCCYHVCGSMPSGSPCTALLNSIINNVNLYYVFSKIFGKSPVFFCQALKILC
YGDDVLIVFSRDVQIDNLDLIGQKIVDEFKKLGMTATSADKNVPQLKPVSELTFLKRSFNLVEDRIRPAISEKTIWSLIA
WQRSNAEFEQNLENAQWFAFMHGYEFYQKFYYFVQSCLEKEMIEYRLKSYDWWRMRFYDQCFICDLS
>P26664 ~~~~~~Genome polyprotein~~~
MSTNPKPQKKNKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRPEGRTWAQPG
YPWPLYGNEGCGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTVPASAYQVRNSTGLYHVTNDCPNSSIVYEAADAILHTPGCVPCVREGNASRCWV
AMTPTVATRDGKLPATQLRRHIDLLVGSATLCSALYVGDLCGSVFLVGQLFTFSPRRHWTTQGCNCSIYPGHITGHRMAW
DMMMNWSPTTALVMAQLLRIPQAILDMIAGAHWGVLAGIAYFSMVGNWAKVLVVLLLFAGVDAETHVTGGSAGHTVSGFV
SLLAPGAKQNVQLINTNGSWHLNSTALNCNDSLNTGWLAGLFYHHKFNSSGCPERLASCRPLTDFDQGWGPISYANGSGP
DQRPYCWHYPPKPCGIVPAKSVCGPVYCFTPSPVVVGTTDRSGAPTYSWGENDTDVFVLNNTRPPLGNWFGCTWMNSTGF
TKVCGAPPCVIGGAGNNTLHCPTDCFRKHPDATYSRCGSGPWITPRCLVDYPYRLWHYPCTINYTIFKIRMYVGGVEHRL
EAACNWTRGERCDLEDRDRSELSPLLLTTTQWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGVGSSIASWAIKWEYVV
LLFLLLADARVCSCLWMMLLISQAEAALENLVILNAASLAGTHGLVSFLVFFCFAWYLKGKWVPGAVYTFYGMWPLLLLL
LALPQRAYALDTEVAASCGGVVLVGLMALTLSPYYKRYISWCLWWLQYFLTRVEAQLHVWIPPLNVRGGRDAVILLMCAV
HPTLVFDITKLLLAVFGPLWILQASLLKVPYFVRVQGLLRFCALARKMIGGHYVQMVIIKLGALTGTYVYNHLTPLRDWA
HNGLRDLAVAVEPVVFSQMETKLITWGADTAACGDIINGLPVSARRGREILLGPADGMVSKGWRLLAPITAYAQQTRGLL
GCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGSRSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVCTRGVAKAVDFIPVEN
LETTMRSPVFTDNSSPPVVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEVALSTTGEIPFYGKAIPLEVIKGGRHLIFCHSKKKCDELAAKLVALGINAVAYYRGLDVSVIPTSGDVVVVATDAL
MTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETITLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETTVRLRAYMNTPGLPVCQDHLEFWEGVFTGLTHIDAHFLSQTKQSGENLPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEITLTHPVTKYIMTCMSADLEVVTSTWVLVGGVLAALAAYCLSTGCV
VIVGRVVLSGKPAIIPDREVLYREFDEMEECSQHLPYIEQGMMLAEQFKQKALGLLQTASRQAEVIAPAVQTNWQKLETF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTAAVTSPLTTSQTLLFNILGGWVAAQLAAPGAATAFVGAGLAGAAIG
SVGLGKVLIDILAGYGAGVAGALVAFKIMSGEVPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTAILSSLTVTQLLRRLHQWISSECTTPCSGSWLRDIWDWICEVLSDFKTWLKAKLM
PQLPGIPFVSCQRGYKGVWRVDGIMHTRCHCGAEITGHVKNGTMRIVGPRTCRNMWSGTFPINAYTTGPCTPLPAPNYTF
ALWRVSAEEYVEIRQVGDFHYVTGMTTDNLKCPCQVPSPEFFTELDGVRLHRFAPPCKPLLREEVSFRVGLHEYPVGSQL
PCEPEPDVAVLTSMLTDPSHITAEAAGRRLARGSPPSVASSSASQLSAPSLKATCTANHDSPDAELIEANLLWRQEMGGN
ITRVESENKVVILDSFDPLVAEEDEREISVPAEILRKSRRFAQALPVWARPDYNPPLVETWKKPDYEPPVVHGCPLPPPK
SPPVPPPRKKRTVVLTESTLSTALAELATRSFGSSSTSGITGDNTTTSSEPAPSGCPPDSDAESYSSMPPLEGEPGDPDL
SDGSWSTVSSEANAEDVVCCSMSYSWTGALVTPCAAEEQKLPINALSNSLLRHHNLVYSTTSRSACQRQKKVTFDRLQVL
DSHYQDVLKEVKAAASKVKANLLSVEEACSLTPPHSAKSKFGYGAKDVRCHARKAVTHINSVWKDLLEDNVTPIDTTIMA
KNEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVTKLPLAVMGSSYGFQYSPGQRVEFLVQAWKSKKTPMGFSYD
TRCFDSTVTESDIRTEEAIYQCCDLDPQARVAIKSLTERLYVGGPLTNSRGENCGYRRCRASGVLTTSCGNTLTCYIKAR
AACRAAGLQDCTMLVCGDDLVVICESAGVQEDAASLRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDGAGKR
VYYLTRDPTTPLARAAWETARHTPVNSWLGNIIMFAPTLWARMILMTHFFSVLIARDQLEQALDCEIYGACYSIEPLDLP
PIIQRLHGLSAFSLHSYSPGEINRVAACLRKLGVPPLRAWRHRARSVRARLLARGGRAAICGKYLFNWAVRTKLKLTPIA
AAGQLDLSGWFTAGYSGGDIYHSVSHARPRWIWFCLLLLAAGVGIYLLPNR
>P27958 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRPEGRTWAQPG
YPWPLYGNEGCGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTVPASAYQVRNSSGLYHVTNDCPNSSVVYEAADAILHTPGCVPCVREGNASRCWV
AVTPTVATRDGKLPTTQLRRHIDLLVGSATLCSALYVGDLCGSVFLVGQLFTFSPRHHWTTQDCNCSIYPGHITGHRMAW
NMMMNWSPTAALVVAQLLRIPQAIMDMIAGAHWGVLAGIKYFSMVGNWAKVLVVLLLFAGVDAETHVTGGNAGRTTAGLV
GLLTPGAKQNIQLINTNGSWHINSTALNCNESLNTGWLAGLFYQHKFNSSGCPERLASCRRLTDFAQGWGPISYANGSGL
DERPYCWHYPPRPCGIVPAKSVCGPVYCFTPSPVVVGTTDRSGAPTYSWGANDTDVFVLNNTRPPLGNWFGCTWMNSTGF
TKVCGAPPCVIGGVGNNTLLCPTDCFRKYPEATYSRCGSGPRITPRCMVDYPYRLWHYPCTINYTIFKVRMYVGGVEHRL
EAACNWTRGERCDLEDRDRSELSPLLLSTTQWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGVGSSIASWAIKWEYVV
LLFLLLADARVCSCLWMMLLISQAEAALENLVILNAASLAGTHGLVSFLVFFCFAWYLKGRWVPGAVYALYGMWPLLLLL
LALPQRAYALDTEVAASCGGVVLVGLMALTLSPYYKRYISWCMWWLQYFLTRVEAQLHVWVPPLNVRGGRDAVILLTCVV
HPALVFDITKLLLAIFGPLWILQASLLKVPYFVRVQGLLRICALARKIAGGHYVQMAIIKLGALTGTCVYNHLAPLRDWA
HNGLRDLAVAVEPVVFSRMETKLITWGADTAACGDIINGLPVSARRGQEILLGPADGMVSKGWRLLAPITAYAQQTRGLL
GCIITSLTGRDKNQVEGEVQIVSTATQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQTYTNVDQDLVGWPAPQGSRSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPTGHAVGLFRAAVCTRGVAKAVDFIPVEN
LETTMRSPVFTDNSSPPAVPQSFQVAHLHAPTGSGKSTKVPAAYAAKGYKVLVLNPSVAATLGFGAYMSKAHGVDPNIRT
GVRTITTGSPITYSTYGKFLADAGCSGGAYDIIICDECHSTDATSISGIGTVLDQAETAGARLVVLATATPPGSVTVSHP
NIEEVALSTTGEIPFYGKAIPLEVIKGGRHLIFCHSKKKCDELAAKLVALGINAVAYYRGLDVSVIPTSGDVVVVSTDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETTVRLRAYMNTPGLPVCQDHLGFWEGVFTGLTHIDAHFLSQTKQSGENFPYLVAYQATVCARAQAP
PPSWDQMRKCLIRLKPTLHGPTPLLYRLGAVQNEVTLTHPITKYIMTCMSADLEVVTSTWVLVGGVLAALAAYCLSTGCV
VIVGRIVLSGKPAIIPDREVLYQEFDEMEECSQHLPYIEQGMMLAEQFKQKALGLLQTASRHAEVITPAVQTNWQKLEVF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTAAVTSPLTTGQTLLFNILGGWVAAQLAAPGAATAFVGAGLAGAALD
SVGLGKVLVDILAGYGAGVAGALVAFKIMSGEVPSTEDLVNLLPAILSPGALAVGVVFASILRRRVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTAILSSLTVTQLLRRLHQWISSECTTPCSGSWLRDIWDWICEVLSDFKTWLKAKLM
PQLPGIPFVSCQRGYRGVWRGDGIMHTRCHCGAEITGHVKNGTMRIVGPRTCKNMWSGTFFINAYTTGPCTPLPAPNYKF
ALWRVSAEEYVEIRRVGDFHYVSGMTTDNLKCPCQIPSPEFFTELDGVRLHRFAPPCKPLLREEVSFRVGLHEYPVGSQL
PCEPEPDVAVLTSMLTDPSHITAEAAGRRLARGSPPSMASSSASQLSAPSLKATCTANHDSPDAELIEANLLWRQEMGGN
ITRVESENKVVILDSFDPLVAEEDEREVSVPAEILRKSRRFAPALPVWARPDYNPLLVETWKKPDYEPPVVHGCPLPPPR
SPPVPPPRKKRTVVLTESTLPTALAELATKSFGSSSTSGITGDNTTTSSEPAPSGCPPDSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSSGADTEDVVCCSMSYSWTGALVTPCAAEEQKLPINALSNSLLRHHNLVYSTTSRSACQRKKKVTFDRLQVL
DSHYQDVLKEVKAAASKVKANLLSVEEACSLAPPHSAKSKFGYGAKDVRCHARKAVAHINSVWKDLLEDSVTPIDTTIMA
KNEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSKLPLAVMGSSYGFQYSPGQRVEFLVQAWKSKKTPMGLSYD
TRCFDSTVTESDIRTEEAIYQCCDLDPQARVAIKSLTERLYVGGPLTNSRGENCGYRRCRASRVLTTSCGNTLTRYIKAR
AACRAAGLQDCTMLVCGDDLVVICESAGVQEDAASLRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDGAGKR
VYYLTRDPTTPLARAAWETARHTPVNSWLGNIIMFAPTLWARMILMTHFFSVLIARDQLEQALNCEIYGACYSIEPLDLP
PIIQRLHGLSAFSLHSYSPGEINRVAACLRKLGVPPLRAWRHRAWSVRARLLARGGKAAICGKYLFNWAVRTKLKLTPIT
AAGRLDLSGWFTAGYSGGDIYHSVSHARPRWFWFCLLLLAAGVGIYLLPNR
>P26663 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRAPRKTSERSQPRGRRQPIPKARRPEGRTWAQPG
YPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTTPASAYEVHNVSGIYHVTNDCSNASIVYEAADLIMHTPGCVPCVREGNSSRCWV
ALTPTLAARNVTIPTTTIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFSPRRHVTLQDCNCSIYPGHVSGHRMAW
DMMMNWSPTTALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMAGNWAKVLIVMLLFAGVDGDTHVTGGAQAKTTNRLV
SMFASGPSQKIQLINTNGSWHINRTALNCNDSLQTGFLAALFYTHSFNSSGCPERMAQCRTIDKFDQGWGPITYAESSRS
DQRPYCWHYPPPQCTIVPASEVCGPVYCFTPSPVVVGTTDRFGVPTYRWGENETDVLLLNNTRPPQGNWFGCTWMNSTGF
TKTCGGPPCNIGGVGNNTLTCPTDCFRKHPEATYTKCGSGPWLTPRCMVDYPYRLWHYPCTVNFTIFKVRMYVGGVEHRL
NAACNWTRGERCDLEDRDRPELSPLLLSTTEWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGIGSAVVSFAIKWEYVL
LLFLLLADARVCACLWMMLLIAQAEAALENLVVLNSASVAGAHGILSFLVFFCAAWYIKGRLVPGATYALYGVWPLLLLL
LALPPRAYAMDREMAASCGGAVFVGLVLLTLSPYYKVFLARLIWWLQYFTTRAEADLHVWIPPLNARGGRDAIILLMCAV
HPELIFDITKLLIAILGPLMVLQAGITRVPYFVRAQGLIHACMLVRKVAGGHYVQMAFMKLGALTGTYIYNHLTPLRDWP
RAGLRDLAVAVEPVVFSDMETKIITWGADTAACGDIILGLPVSARRGKEILLGPADSLEGRGLRLLAPITAYSQQTRGLL
GCIITSLTGRDKNQVEGEVQVVSTATQSFLATCVNGVCWTVYHGAGSKTLAAPKGPITQMYTNVDQDLVGWPKPPGARSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPVSYLKGSSGGPLLCPFGHAVGIFRAAVCTRGVAKAVDFVPVES
METTMRSPVFTDNSSPPAVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRTITTGAPVTYSTYGKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEVALSNTGEIPFYGKAIPIEAIRGGRHLIFCHSKKKCDELAAKLSGLGINAVAYYRGLDVSVIPTIGDVVVVATDAL
MTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRRGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVFTGLTHIDAHFLSQTKQAGDNFPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEVTLTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGRPAIVPDRELLYQEFDEMEECASHLPYIEQGMQLAEQFKQKALGLLQTATKQAEAAAPVVESKWRALETF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLTTQSTLLFNILGGWVAAQLAPPSAASAFVGAGIAGAAVG
SIGLGKVLVDILAGYGAGVAGALVAFKVMSGEMPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTQILSSLTITQLLKRLHQWINEDCSTPCSGSWLRDVWDWICTVLTDFKTWLQSKLL
PQLPGVPFFSCQRGYKGVWRGDGIMQTTCPCGAQITGHVKNGSMRIVGPKTCSNTWHGTFPINAYTTGPCTPSPAPNYSR
ALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFSEVDGVRLHRYAPACRPLLREEVTFQVGLNQYLVGSQL
PCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTHHVSPDADLIEANLLWRQEMGGN
ITRVESENKVVVLDSFDPLRAEEDEREVSVPAEILRKSKKFPAAMPIWARPDYNPPLLESWKDPDYVPPVVHGCPLPPIK
APPIPPPRRKRTVVLTESSVSSALAELATKTFGSSESSAVDSGTATALPDQASDDGDKGSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSEEASEDVVCCSMSYTWTGALITPCAAEESKLPINALSNSLLRHHNMVYATTSRSAGLRQKKVTFDRLQVLD
DHYRDVLKEMKAKASTVKAKLLSVEEACKLTPPHSAKSKFGYGAKDVRNLSSKAVNHIHSVWKDLLEDTVTPIDTTIMAK
NEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQVVMGSSYGFQYSPGQRVEFLVNTWKSKKNPMGFSYDT
RCFDSTVTENDIRVEESIYQCCDLAPEARQAIKSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKASA
ACRAAKLQDCTMLVNGDDLVVICESAGTQEDAASLRVFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDASGKRV
YYLTRDPTTPLARAAWETARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIERLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRVWRHRARSVRARLLSQGGRAATCGKYLFNWAVKTKLKLTPIPA
ASRLDLSGWFVAGYSGGDIYHSLSRARPRWFMLCLLLLSVGVGIYLLPNR
>Q9WMX2 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARQPEGRAWAQPG
YPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGVYHVTNDCSNASIVYEAADMIMHTPGCVPCVRENNSSRCWV
ALTPTLAARNASVPTTTIRRHVDLLVGAAALCSAMYVGDLCGSVFLVAQLFTFSPRRHETVQDCNCSIYPGHVTGHRMAW
DMMMNWSPTAALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGGTYVTGGTMAKNTLGIT
SLFSPGSSQKIQLVNTNGSWHINRTALNCNDSLNTGFLAALFYVHKFNSSGCPERMASCSPIDAFAQGWGPITYNESHSS
DQRPYCWHYAPRPCGIVPAAQVCGPVYCFTPSPVVVGTTDRFGVPTYSWGENETDVLLLNNTRPPQGNWFGCTWMNSTGF
TKTCGGPPCNIGGIGNKTLTCPTDCFRKHPEATYTKCGSGPWLTPRCLVHYPYRLWHYPCTVNFTIFKVRMYVGGVEHRL
EAACNWTRGERCNLEDRDRSELSPLLLSTTEWQVLPCSFTTLPALSTGLIHLHQNVVDVQYLYGIGSAVVSFAIKWEYVL
LLFLLLADARVCACLWMMLLIAQAEAALENLVVLNAASVAGAHGILSFLVFFCAAWYIKGRLVPGAAYALYGVWPLLLLL
LALPPRAYAMDREMAASCGGAVFVGLILLTLSPHYKLFLARLIWWLQYFITRAEAHLQVWIPPLNVRGGRDAVILLTCAI
HPELIFTITKILLAILGPLMVLQAGITKVPYFVRAHGLIRACMLVRKVAGGHYVQMALMKLAALTGTYVYDHLTPLRDWA
HAGLRDLAVAVEPVVFSDMETKVITWGADTAACGDIILGLPVSARRGREIHLGPADSLEGQGWRLLAPITAYSQQTRGLL
GCIITSLTGRDRNQVEGEVQVVSTATQSFLATCVNGVCWTVYHGAGSKTLAGPKGPITQMYTNVDQDLVGWQAPPGARSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPVSYLKGSSGGPLLCPSGHAVGIFRAAVCTRGVAKAVDFVPVES
METTMRSPVFTDNSSPPAVPQTFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRTITTGAPITYSTYGKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEVALSSTGEIPFYGKAIPIETIKGGRHLIFCHSKKKCDELAAKLSGLGLNAVAYYRGLDVSVIPTSGDVIVVATDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRMGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVFTGLTHIDAHFLSQTKQAGDNFPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEVTTTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGKPAIIPDREVLYREFDEMEECASHLPYIEQGMQLAEQFKQKAIGLLQTATKQAEAAAPVVESKWRTLEAF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLTTQHTLLFNILGGWVAAQLAPPSAASAFVGAGIAGAAVG
SIGLGKVLVDILAGYGAGVAGALVAFKVMSGEMPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTQILSSLTITQLLKRLHQWINEDCSTPCSGSWLRDVWDWICTVLTDFKTWLQSKLL
PRLPGVPFFSCQRGYKGVWRGDGIMQTTCPCGAQITGHVKNGSMRIVGPRTCSNTWHGTFPINAYTTGPCTPSPAPNYSR
ALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPACKPLLREEVTFLVGLNQYLVGSQL
PCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTRHDSPDADLIEANLLWRQEMGGN
ITRVESENKVVILDSFEPLQAEEDEREVSVPAEILRRSRKFPRAMPIWARPDYNPPLLESWKDPDYVPPVVHGCPLPPAK
APPIPPPRRKRTVVLSESTVSSALAELATKTFGSSESSAVDSGTATASPDQPSDDGDAGSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSEEASEDVVCCSMSYTWTGALITPCAAEETKLPINALSNSLLRHHNLVYATTSRSASLRQKKVTFDRLQVLD
DHYRDVLKEMKAKASTVKAKLLSVEEACKLTPPHSARSKFGYGAKDVRNLSSKAVNHIRSVWKDLLEDTETPIDTTIMAK
NEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPGQRVEFLVNAWKAKKCPMGFAYDT
RCFDSTVTENDIRVEESIYQCCDLAPEARQAIRSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKAAA
ACRAAKLQDCTMLVCGDDLVVICESAGTQEDEASLRAFTEAMTRYSAPPGDPPKPEYDLELITSCSSNVSVAHDASGKRV
YYLTRDPTTPLARAAWETARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIQRLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRVWRHRARSVRARLLSQGGRAATCGKYLFNWAVRTKLKLTPIPA
ASQLDLSSWFVAGYSGGDIYHSLSRARPRWFMWCLLLLSVGVGIYLLPNR
>O39929 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPMDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRPEGRSWAQPG
YPWPLYGNEGCGWAGWLLSPRGSRPSWGPNDPRGRSRNLGKVIDTLTCGFADLMGYIPLVGAPVGSVARALAHGVRALED
GINYATGNLPGCSFSIFLLALLSCLTVPASAVNYRNVSGIYHVTNDCPNSSIVYEADHHIMHLPGCVPCVREGNQSRCWV
ALTPTVAAPYIGAPLESLRSHVDLMVGAATVCSGLYIGDLCGGLFLVGQMFSFRPRRHWTTQDCNCSIYTGHITGHRMAW
DMMMNWSPTTTLVLAQVMRIPTTLVDLLSGGHWGVLVGVAYFSMQANWAKVILVLFLFAGVDAETHVSGAAVGRSTAGLA
NLFSSGSKQNLQLINSNGSWHINRTALNCNDSLNTGFLASLFYTHKFNSSGCSERLACCKSLDSYGQGWGPLGVANISGS
SDDRPYCWHYAPRPCGIVPASSVCGPVYCFTPSPVVVGTTDHVGVPTYTWGENETDVFLLNSTRPPHGAWFGCVWMNSTG
FTKTCGAPPCEVNTNNGTWHCPTDCFRKHPETTYAKCGSGPWITPRCLIDYPYRLWHFPCTANFSVFNIRTFVGGIEHRM
QAACNWTRGEVCGLEHRDRVELSPLLLTTTAWQILPCSFTTLPALSTGLIHLHQNIVDVQYLYGVGSAVVSWALKWEYVV
LAFLLLADARVSAYLWMMFMVSQVEAALSNLININAASAAGAQGFWYAILFICIVWHVKGRFPAAAAYAACGLWPCFLLL
LMLPERAYAYDQEVAGSLGGAIVVMLTILTLSPHYKLWLARGLWWIQYFIARTEAVLHVYIPSFNVRGPRDSVIVLAVLV
CPDLVFDITKYLLAILGPLHILQASLLRIPYFVRAQALVKICSLLRGVVYGKYFQMVVLKSRGLTGTYIYDHLTPMSDWP
PYGLRDLAVALEPVVFTPMEKKVIVWGADTAACGDIIRGLPVSARLGNEILLGPADTETSKGWRLLAPITAYAQQTRGLF
STIVTSLTGRDTNENCGEVQVLSTATQSFLGTAVNGVMWTVYHGAGAKTISGPKGPVNQMYTNVDQDLVGWPAPPGVRSL
APCTCGSADLYLVTRHADVIPVRRRGDTRGALLSPRPISILKGSSGGPLLCPMGHRAGIFRAAVCTRGVAKAVDFVPVES
LETTMRSPVFTDNSTPPAVPQTYQVAHLHAPTGSGKSTKVPAAHAAQGYKVLVLNPSVAATLGFGVYMSKAYGIDPNIRS
GVRTITTGAPITYSTYGKFLADGGCSGGAYDIIICDECYSTDSTTILGIGTVLDQAETAGVRLTVLATATPPGSVTTPHS
NIEEVALPTTGEIPFYGKAIPLELIKGGRHLIFCHSKKKCDELARQLTSLGLNAVAYYRGLDVSVIPTSGDVVVCATDAL
MTGFTGDFDSVIDCNTSVIQTVDFSLDPTFSIEITTVPQDAVSRSQRRGRTGRGRLGTYRYVTPGERPSGMFDTAELCEC
YDAGCAWYELTPAETTTRLKAYFDTPGLPVCQDHLEFWESVFTGLTHIDGHFLSQTKQSGENFPYLVAYQATVSAKVWLA
PPSWDTMWKCLIRLKPTLHGPTPLLYRLGSVQNEVVLTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLSVGSV
VIVGRVVLSGQPAVIPDREVLYQQFDEMEECSKHLPLVEHGLQLAEQFKQKALGLLNFAGKQAQEATPVIQSNFAKLEQF
WANDMWNFISGIQYLAGLSTLPGNPAIASLMSFTAAVTSPLTTQQTLLFNILGGWVASQIRDSDASTAFVVSGLAGAAVG
SVGLGKILVDILPGYGAGVRGAVVTFKIMSGEMPSTEDLVNLLPAILSPGALVVEVVCPAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAARRVTTILSSLTVTSLLRRLHKWINEDCSTPCAESWLWEVWDWVLHVLSDFKTCLKAKFV
PLMPGIPLLSWPRGYKGEWRGDGVMHTTCPCGADLAGHIKNGSMRITGPKTCSNTWHGTFPINAYTTGPGVPIPAPNYKF
ALWRVSAEDYVEVRRVGDFHYVTGVTQDNIKFPCQVPAPELFTEVDGIRIHRHAPKCKPLLRDEVSFSVGLNSFVVGSQL
PCEPEPDVAVLTSMLTDPSHITAESARRRLARGSRPSLASSSASQLSPRLLQATCTAPHDSPGTDLLEANLLWGSTATRV
ETDEKVIILDSFESCVAEQNDDREVSVAAEILRPTKKFPPALPIWARPDYNPPLTETWKQQDYQAPTVHGCALPPAKQPP
VPSPRRKRTVQLTESVVSTALAELAAKTFGQSEPSSDRDTDLTTPTETTDSGPIVVDDASDDGSYSSMPPLEGEPGDPDL
TSDSWSTVSGSEDVVCCSMSYSWTGALVTPCAAEESKLPISPLSNSLLRHHNMVYATTTRSAVTRQKKVTFDRLQVVDST
YNEVLKEIKARASRVKPRLLTTEEACDLTPPHSARSKFGYGKKDVRSHSRKAINHISSVWKDLLDDNNTPIPTTIMAKNE
VFAVNPAKGGRKPARLIVYPDLGSRVCEKRALHDVIKKTALAVMGAAYGFQYSPAQRVEFLLTAWKSKNDPMGFSYDTRC
FDSTVTEKDIRVEEEVYQCCDLEPEARKVITALTDRLYVGGPMHNSKGDLCGYRRCRATGVYTTSFGNTLTCYLKATAAI
RAAALRDCTMLVCGDDLVVIAESDGVEEDNRALRAFTEAMTRYSAPPGDAPQPAYDLELITSCSSNVSVAHDVTGKKVYY
LTRDPETPLARAVWETVRHTPVNSWLGNIIVYAPTIWVRMILMTHFFSILQSQEALEKALDFDMYGVTYSITPLDLPAII
QRLHGLSAFTLHGYSPHELNRVAGALRKLGVPPLRAWRHRARAVRAKLIAQGGRAKICGIYLFNWAVKTKLKLTPLPAAA
KLDLSGWFTVGAGGGDIYHSMSHARPRYLLLCLLILTVGVGIFLLPAR
>O39928 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPKLGVRATRKNSERSQPRGRRQPIPKARRPTGRSWGQPG
YPWPLYANEGLGWAGWLLSPRSSRPNWGPNDPRRKSPNLGRVIHTLTCGFPHLMGYIPLVGGPVGGVSRALAHGVKVLED
GINYATGNLPGCPFSIFVLALLWCLTVPASAVPYRNASGVYHVTNDCPNSSIVYEADNLILHAPGCVPCVLEDNVSRCWV
QITPTLSAPSFGAVTALLRRAVDYLAGGAAFCSALYVGDACGALSLVGQMFTYKPRQHTTVQDCNCSIYSGHITGHRMAW
DMMMKWSPTTALLMAQLLRIPQVVIDIIAGGHWGVLLAAAYFASTANWAKVILVLFLFAGVDGRTHTVGGTVGQGLKSLT
SFFNPGPQRQLQFVNTNGSWHINSTALNCNDSLQTGFIAGLMYAHKFNSSGCPERMSSCRPLAAFDQGWGTISYATISGP
SDDKPYCWHYPPRPCGVVPARDVCGPVYCFTPSPVVVGTTDRRGCPTYNWGSNETDILLLNNIRPPAGNWFGCTWMNSTG
FVKNCGAPPCNLGPTGNNSLKCPTDCFRKHPDATYTRCGSGPWLTPRCLVHYPYRLWHYPCTVNYTIFKVRMFIGGLEHR
LEAACNWTYGERCDLEDRDRAELSPLLHTTTQWAILPCSFTPTPALSTGLIHLHQNIVDTQYLYGLSSSIVSWAVKWEYI
MLVFLLLADARICTCLLILLLICQAEATCKNVIVLNAAAAAGNHGFFWGLLVVCLAWHVKGRLVPGATYLCLGVWPLLLV
RLLRPHRALALDSSDGGTVGCLVLIVLTIFTLTPGYKKKVVLVMWWLQYFIARVEAIIHVWVPPLQVKGGRDAVIMLTCL
FHPALGFEITKILFGILGPLYLLQHSLTKVPYFLRARALLRLCLLAKHLVYGKYVQAALLHLGRLTGTYIYDHLAPMKDW
AASGLRELTVATEPIVFSAMETKVITWGADTAACGNILAVLPVSARRGREIFLGPADDIKTSGWRLLAPITAYAQQTRGV
LGAIVLSLTGRDKNEAEGEVQFLSTATQTFLGICINGVMWTLFHGAGSKTLAGPKGPVVQMYTNVDKDLVGWPSPPGKGS
LTRCTCGSADLYLVTRHADVIPARRRGDTRASLLSPRPISYLKGSSGGPIMCPSGHVVGVFRAAVCTRGVAKALEFVPVE
NLETTMRSPVFTDNSTPPAVPHEFQVGHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATFGFGAYMSRAYGVDPNIR
TGVRTVTTGAGITYSTYGKFFADGGCSGGAYDVIICDECHSQDATTILGIGTVLDQAETAGARLVVLATAIPPGSVTTPH
PNIEEVALPSEGEIPFYGRAIPLVLIKGGRHLIFCHSKKKCDELAKQLTSLGVNAVAYYRGLDVAVIPATGDVVVCSTDA
LMTGFTGDFDSVIDCNSAVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRHGIYRYVSSGERPSGIFDSVVLCE
CYDAGCAWYDLTPAETTVRLRAYLNTPGLPVCQEHLEFWEGVFTGLTNIDAHMLSQAKQGGENFPYLVAYQATVCVRAKA
PPPSWDTMWKCMICLKPTLTGPTPLLYRLGAVQNEITLTHPITKYIMACMSADLEVITSTWVLVGGVVAALAAYCLTVGS
VAIVGRIILSGRPAITPDREVLYQQFDEMEECSASLPYVDEARAIAGQFKEKVLGLIGTAGQKAETLKPAATSMWSKAEQ
FWAKHMWNFVSGIQYLAGLSTLPGNPAVATLMSFTAAVTSPLTTHQTLLFNILGGWVASQIAPPTAATAFVVSGMAGAAV
GNIGLGRVLIDILAGYGTGVAGALVAFKIMCGERPTAEELVNLLPSILCPGALVVGVICAAVLRRHIGPGEGAVQWMNRL
IAFASRGNHGSPTHYVPETDASAKVTQLLSSLTVTSLLKRLHTWIGEDYSTPCDGTWLRAIWDWVCTALTDFKAWLQAKL
LPQLPGVPFFSCQKGYKGVWRGDGVNSTKCPCGATISGHVKNGTMRIVGPKLCSNTWQGTFPINATTTGPSVPAPAPNYK
FALWRVGAADYAEVRRVGDYHYITGVTQDNLKCPCQVPSPEFFTELDGVRIHRFAPPCNPLLREEVTFSVGLHSYVVGSQ
LPCEPEPDVTVLTSMLSDPAHITAETAKRRLNRGSPPSLANSSASQLSAPSLKATCTIQGHHPDADLIKANLLWRQCMGG
NITRVEAENKVEILDCFKPLKEEEDDREISVSADCFKKGPAFPPALPVWARPGYDPPLLETWKRPDYDPPQVWGCPIPPA
GPPPVPLPRRKRKPMELSDSTVSQVMADLADARFKVDTPSIEGQDSALGTSSQHDSGPEEKRDDNSDAASYSSMPPLEGE
PGDPDLSSGSWSTVSGEDNVVCCSMSYTWTGALITPCSAEEEKLPINPLSNTLLRHHNLVYSTSSRSAGLRQKKVTFDRL
QVLDDHYREVVDEMKRLASKVKARLLPLEEACGLTPPHSARSKYGYGAKEVRSLDKKALKHIEGVWQDLLDDSDTPLPTT
IMAKNEVFAVEPSKGGKKPARLIVYPDLGVRVCEKRALYDVAQKLPTALMGPSYGFQYSPAQRVDFLLKAWKSKKIPMAF
SYDTRCFDSTITEHDIMTEESIYQSCDLQPEARVAIRSLTQRLYCGGPMYNSKGQQCGYRRCRASGVFTTSMGNTMTCYI
KALASCRAAKLRDCTLLVCGDDLVAICESQGTHEDEASLRAFTEAMTRYSAPPGDPPVPAYDLELVTSCSSNVSVARDAS
GNRIYYLTRDPQVPLAKAAWETAKHSPVNSWLGNIIMYAPTLWARIVLMTHFFSVLQSQEQLEKTLAFEMYGSVYSVTPL
DLPAIIQRLHGLSAFSLHSYSPSEINRVASCLRKLGVPPLRAWRHRARAVRAKLIAQGGRAAICGIYLFNWAVKTKRKLT
PLADADRLDLSSWFTVGAGGGDIYHSMSRARPRNLLLCLLLLSVGVGIFLLPAR
>Q03463 ~~~~~~Genome polyprotein~~~
MSTIPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKVRRPEGRTWAQPG
YPWPLYGNEGCGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTVPASAYQVRNSTGLYHVTNDCPNSSIVYEAHDAILHTPGCVPCVREGNVSRCWV
AMTPTVATRDGKLPATQLRRHIDLLVGSATLCSALYVGDLCGSVFLIGQLFTFSPRRHWTTQGCNCSIYPGHITGHRMAW
DMMMNWSPTAALVMAQLLRIPQAILDMIAGAHWGVLAGIAYFSMVGNWAKVLVVLLLFAGVDAETIVSGGQAARAMSGLV
SLFTPGAKQNIQLINTNGSWHINSTALNCNESLNTGWLAGLIYQHKFNSSGCPERLASCRRLTDFDQGWGPISHANGSGP
DQRPYCWHYPPKPCGIVPAKSVCGPVYCFTPSPVVVGTTDRSGAPTYNWGANDTDVFVLNNTRPPLGNWFGCTWMNSTGF
TKVCGAPPCVIGGGGNNTLHCPTDCFRKHPEATYSRCGSGPWITPRCLVDYPYRLWHYPCTINYTIFKVRMYVGGVEHRL
DAACNWTRGERCDLEDRDRSELSPLLLSTTQWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGVGSSIASWAIKWEYVV
LLFLLLADARVCSCLWMMLLISQAEAALENLVILNAASLAGTRGLVSFLVFFCFAWYLKGRWVPGAAYALYGMWPLLLLL
LALPQRAYALDTEVAASCGGVVLVGLMALTLSPYYKRCISWCLWWLQYFLTRVEAQLHVWVPPLNVRGGRDAVILLMCVV
HPTLVFDITKLLLAVLGPLWILQASLLKVPYFVRVQGLLRICALARKMVGGHYVQMAIIKLGALTGTYVYNHLTPLRDWA
HNGLRDLAVAVEPVVFSQMETKLITWGADTAACGDIINGLPVSARKGREILLGPADGMVSKGWRLLAPITAYAQQTRGLL
GCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGARSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHVVGIFRAAVCTRGVAKAVDFIPVES
LETTMRSPVFTDNSSPPAVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSVLGIGTVLDQAETAGARLVVLATATPPGSITVPHA
NIEEVALSTTGEIPFYGKAIPLEAIKGGRHLIFCHSKKKCDELAAKLVALGVNAVAYYRGLDVSVIPTSGDVVVVATDAL
MTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSILCEC
YDTGCAWYELTPAETTVRLRAYMNTPGLPVCQDHLEFWEGVFTGLTHIDAHFLSQTKQGGENFPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQGEVTLTHPVTKYIMTCMSADLEVVTSTWVLVGGVLAALAAYCLSTGCV
VIVGRIVLSGRPAIIPDREVLYREFDEMEECSQHLPYIEQGMMLAEQFKQKALGLLQTASRQAEVIAPTVQTNWQKLEAF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTAAVTSPLTTSQTLLFNILGGWVAAQLAAPGAATAFVGSGLAGAAVG
SVGLGRVLVDILAGYGAGVAGALVAFKIMSGELPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTAILSSLTVTQLLRRLHQWLSSESTTPCSGSWLRDIWDWICEVLSDFKTWLKTKLM
PHLPGIPFVSCQHGYKGVWRGDGIMHTRCHCGAEITGHVKNGTMRIVGPKTCRNMWSGTFPINAYTTGPCTPLPAPNYTF
ALWRVSAEEYVEIRRVGDFHYVTGMTTDNLKCPCQVPSPEFFTELDGVRLHRFAPPCKPLLREEVSFRVGLHDYPVGSQL
PCEPEPDVAVLTSMLTDPSHITAAAAGRRLARGSPPSEASSSASQLSAPSLKATCTINHDSPDAELIEANLLWRQEMGGN
ITRVESENKVVILDSFDPLVAEEDEREISVPAEILRKSRRFTQALPIWARPDYNPPLIETWKKPNYEPPVVHGCPLPPPQ
SPPVPPPRKKRTVVLTESTLSTALAELAAKSFGSSSTSGITGDNTTTSSEPAPSGCSPDSDAESYSSMPPLEGEPGDPDL
SDGSWSTVSSEAGTEDVVCCSMSYTWTGALITPCAAEEQKLPINALSNSLLRHHNLVYSTTSRSACQRQKKVTFDRLQVL
DSHYQDVLKEVKAAASKVKANLLSVEEACSLTPPHSAKSKFGYGAKDVRCHARKAVNHINSVWKDLLEDSVTPIQTTIMA
KNEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSKLPPAVMGSSYGFQYSPGQRVEFLVQAWKSKRTPMGFSYD
TRCFDSTVTESDIRTEEAIYQCCDLDPQARVAIRSLTERLYVGGPLTNSRGENCGYRRCRASGVLTTSCGNTLTCYIKAR
AACRAAGLQDCTMLVCGDDLVVICESAGVQEDAASLRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDGTGKR
VYYLTRDPTTPLARAAWETARHTPVNSWLGNIIMFAPTLWARMILMTHFFSVLIARDQLEQALDCEIYGACYSIEPLDLP
PIIQRLHGLSAFSLHSYSPGEINRVAACLRKLGVPPLRAWRHRARSVRARLLSRGGRAAICGKYLFNWAVRTKLKLTPIA
AAGRLDLSGWFTAGYSGGDIYHSVSHARPRWFWFCLLLLAAGVGIYLLPNR
>O92972 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKASERSQPRGRRQPIPKARRPEGRAWAQPG
YPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGIYHVTNDCSNSSIVYEAADVIMHTPGCVPCVREGNSSRCWV
ALTPTLAARNASVPTTTIRRHVDLLVGTAAFCSAMYVGDLCGSIFLVSQLFTFSPRRHETVQDCNCSIYPGHVSGHRMAW
DMMMNWSPTTALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVALLFAGVDGETHTTGRVAGHTTSGFT
SLFSSGASQKIQLVNTNGSWHINRTALNCNDSLQTGFFAALFYAHKFNSSGCPERMASCRPIDWFAQGWGPITYTKPNSS
DQRPYCWHYAPRPCGVVPASQVCGPVYCFTPSPVVVGTTDRSGVPTYSWGENETDVMLLNNTRPPQGNWFGCTWMNSTGF
TKTCGGPPCNIGGVGNRTLICPTDCFRKHPEATYTKCGSGPWLTPRCLVDYPYRLWHYPCTLNFSIFKVRMYVGGVEHRL
NAACNWTRGERCNLEDRDRSELSPLLLSTTEWQILPCAFTTLPALSTGLIHLHQNIVDVQYLYGVGSAFVSFAIKWEYIL
LLFLLLADARVCACLWMMLLIAQAEAALENLVVLNAASVAGAHGILSFLVFFCAAWYIKGRLAPGAAYAFYGVWPLLLLL
LALPPRAYALDREMAASCGGAVLVGLVFLTLSPYYKVFLTRLIWWLQYFITRAEAHMQVWVPPLNVRGGRDAIILLTCAV
HPELIFDITKLLLAILGPLMVLQAGITRVPYFVRAQGLIRACMLVRKVAGGHYVQMAFMKLGALTGTYVYNHLTPLRDWA
HAGLRDLAVAVEPVVFSAMETKVITWGADTAACGDIILGLPVSARRGKEIFLGPADSLEGQGWRLLAPITAYSQQTRGVL
GCIITSLTGRDKNQVEGEVQVVSTATQSFLATCINGVCWTVYHGAGSKTLAGPKGPITQMYTNVDLDLVGWQAPPGARSM
TPCSCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPVSYLKGSSGGPLLCPSGHVVGVFRAAVCTRGVAKAVDFIPVES
METTMRSPVFTDNSSPPAVPQTFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRTITTGGSITYSTYGKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEIGLSNNGEIPFYGKAIPIEAIKGGRHLIFCHSKKKCDELAAKLTGLGLNAVAYYRGLDVSVIPPIGDVVVVATDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRSGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVFTGLTHIDAHFLSQTKQAGDNFPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEVILTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGKPAVVPDREVLYQEFDEMEECASQLPYIEQGMQLAEQFKQKALGLLQTATKQAEAAAPVVESKWRALETF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLTTQNTLLFNILGGWVAAQLAPPSAASAFVGAGIAGAAVG
SIGLGKVLVDILAGYGAGVAGALVAFKVMSGEVPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTQILSSLTITQLLKRLHQWINEDCSTPCSGSWLRDVWDWICTVLTDFKTWLQSKLL
PRLPGVPFLSCQRGYKGVWRGDGIMQTTCPCGAQIAGHVKNGSMRIVGPRTCSNTWHGTFPINAYTTGPCTPSPAPNYSR
ALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPACKPLLREDVTFQVGLNQYLVGSQL
PCEPEPDVTVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTHHDSPDADLIEANLLWRQEMGGN
ITRVESENKVVILDSFEPLHAEGDEREISVAAEILRKSRKFPSALPIWARPDYNPPLLESWKDPDYVPPVVHGCPLPPTK
APPIPPPRRKRTVVLTESNVSSALAELATKTFGSSGSSAVDSGTATALPDLASDDGDKGSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSEEASEDVVCCSMSYTWTGALITPCAAEESKLPINPLSNSLLRHHNMVYATTSRSASLRQKKVTFDRLQVLD
DHYRDVLKEMKAKASTVKAKLLSIEEACKLTPPHSAKSKFGYGAKDVRNLSSRAVNHIRSVWEDLLEDTETPIDTTIMAK
SEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPKQRVEFLVNTWKSKKCPMGFSYDT
RCFDSTVTESDIRVEESIYQCCDLAPEARQAIRSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKATA
ACRAAKLQDCTMLVNGDDLVVICESAGTQEDAAALRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDASGKRV
YYLTRDPTTPLARAAWETARHTPINSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIERLHGLSAFTLHSYSPGEINRVASCLRKLGVPPLRTWRHRARSVRAKLLSQGGRAATCGRYLFNWAVRTKLKLTPIPA
ASQLDLSGWFVAGYSGGDIYHSLSRARPRWFPLCLLLLSVGVGIYLLPNR
>P26660 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKDRRSTGKSWGKPG
YPWPLYGNEGLGWAGWLLSPRGSRPSWGPNDPRHRSRNVGKVIDTLTCGFADLMGYIPVVGAPLGGVARALAHGVRVLED
GVNFATGNLPGCSFSIFLLALLSCITTPVSAAEVKNISTGYMVTNDCTNDSITWQLQAAVLHVPGCVPCEKVGNTSRCWI
PVSPNVAVQQPGALTQGLRTHIDMVVMSATLCSALYVGDLCGGVMLAAQMFIVSPQHHWFVQDCNCSIYPGTITGHRMAW
DMMMNWSPTATMILAYAMRVPEVIIDIIGGAHWGVMFGLAYFSMQGAWAKVVVILLLAAGVDAQTHTVGGSTAHNARTLT
GMFSLGARQKIQLINTNGSWHINRTALNCNDSLHTGFLASLFYTHSFNSSGCPERMSACRSIEAFRVGWGALQYEDNVTN
PEDMRPYCWHYPPRQCGVVSASSVCGPVYCFTPSPVVVGTTDRLGAPTYTWGENETDVFLLNSTRPPQGSWFGCTWMNST
GYTKTCGAPPCRIRADFNASMDLLCPTDCFRKHPDTTYIKCGSGPWLTPRCLIDYPYRLWHYPCTVNYTIFKIRMYVGGV
EHRLTAACNFTRGDRCNLEDRDRSQLSPLLHSTTEWAILPCTYSDLPALSTGLLHLHQNIVDVQFMYGLSPALTKYIVRW
EWVVLLFLLLADARVCACLWMLILLGQAEAALEKLVVLHAASAASCNGFLYFVIFFVAAWYIKGRVVPLATYSLTGLWSF
GLLLLALPQQAYAYDASVHGQIGAALLVLITLFTLTPGYKTLLSRFLWWLCYLLTLAEAMVQEWAPPMQVRGGRDGIIWA
VAIFCPGVVFDITKWLLAVLGPAYLLKGALTRVPYFVRAHALLRMCTMVRHLAGGRYVQMVLLALGRWTGTYIYDHLTPM
SDWAANGLRDLAVAVEPIIFSPMEKKVIVWGAETAACGDILHGLPVSARLGREVLLGPADGYTSKGWSLLAPITAYAQQT
RGLLGTIVVSMTGRDKTEQAGEIQVLSTVTQSFLGTTISGVLWTVYHGAGNKTLAGSRGPVTQMYSSAEGDLVGWPSPPG
TKSLEPCTCGAVDLYLVTRNADVIPARRRGDKRGALLSPRPLSTLKGSSGGPVLCPRGHAVGVFRAAVCSRGVAKSIDFI
PVETLDIVTRSPTFSDNSTPPAVPQTYQVGYLHAPTGSGKSTKVPVAYAAQGYKVLVLNPSVAATLGFGAYLSKAHGINP
NIRTGVRTVTTGAPITYSTYGKFLADGGCAGGAYDIIICDECHAVDSTTILGIGTVLDQAETAGVRLTVLATATPPGSVT
TPHPNIEEVALGQEGEIPFYGRAIPLSYIKGGRHLIFCHSKKKCDELAAALRGMGLNAVAYYRGLDVSVIPTQGDVVVVA
TDALMTGFTGDFDSVIDCNVAVTQVVDFSLDPTFTITTQTVPQDAVSRSQRRGRTGRGRLGIYRYVSTGERASGMFDSVV
LCECYDAGAAWYELTPAETTVRLRAYFNTPGLPVCQDHLEFWEAVFTGLTHIDAHFLSQTKQSGENFAYLTAYQATVCAR
AKAPPPSWDVMWKCLTRLKPTLVGPTPLLYRLGSVTNEVTLTHPVTKYIATCMQADLEVMTSTWVLAGGVLAAVAAYCLA
TGCVCIIGRLHVNQRAVVAPDKEVLYEAFDEMEECASRAALIEEGQRIAEMLKSKIQGLLQQASKQAQDIQPAVQASWPK
VEQFWAKHMWNFISGIQYLAGLSTLPGNPAVASMMAFSAALTSPLSTSTTILLNILGGWLASQIAPPAGATGFVVSGLVG
AAVGSIGLGKVLVDILAGYGAGISGALVAFKIMSGEKPSMEDVVNLLPGILSPGALVVGVICAAILRRHVGPGEGAVQWM
NRLIAFASRGNHVAPTHYVTESDASQRVTQLLGSLTITSLLRRLHNWITEDCPIPCSGSWLRDVWDWVCTILTDFKNWLT
SKLFPKMPGLPFISCQKGYKGVWAGTGIMTTRCPCGANISGNVRLGSMRITGPKTCMNIWQGTFPINCYTEGQCVPKPAP
NFKIAIWRVAASEYAEVTQHGSYHYITGLTTDNLKVPCQLPSPEFFSWVDGVQIHRFAPIPKPFFRDEVSFCVGLNSFVV
GSQLPCDPEPDTDVLTSMLTDPSHITAETAARRLARGSPPSEASSSASQLSAPSLRATCTTHGKAYDVDMVDANLFMGGD
VTRIESESKVVVLDSLDPMVEERSDLEPSIPSEYMLPKKRFPPALPAWARPDYNPPLVESWKRPDYQPATVAGCALPPPK
KTPTPPPRRRRTVGLSESSIADALQQLAIKSFGQPPPSGDSGLSTGADAADSGSRTPPDELALSETGSISSMPPLEGEPG
DPDLEPEQVELQPPPQGGVVTPGSGSGSWSTCSEEDDSVVCCSMSYSWTGALITPCSPEEEKLPINPLSNSLLRYHNKVY
CTTSKSASLRAKKVTFDRMQALDAHYDSVLKDIKLAASKVTARLLTLEEACQLTPPHSARSKYGFGAKEVRSLSGRAVNH
IKSVWKDLLEDTQTPIPTTIMAKNEVFCVDPTKGGKKAARLIVYPDLGVRVCEKMALYDITQKLPQAVMGASYGFQYSPA
QRVEFLLKAWAEKKDPMGFSYDTRCFDSTVTERDIRTEESIYRACSLPEEAHTAIHSLTERLYVGGPMFNSKGQTCGYRR
CRASGVLTTSMGNTITCYVKALAACKAAGIIAPTMLVCGDDLVVISESQGTEEDERNLRAFTEAMTRYSAPPGDPPRPEY
DLELITSCSSNVSVALGPQGRRRYYLTRDPTTPIARAAWETVRHSPVNSWLGNIIQYAPTIWARMVLMTHFFSILMAQDT
LDQNLNFEMYGAVYSVSPLDLPAIIERLHGLDAFSLHTYTPHELTRVASALRKLGAPPLRAWKSRARAVRASLISRGGRA
AVCGRYLFNWAVKTKLKLTPLPEARLLDLSSWFTVGAGGGDIYHSVSRARPRLLLLGLLLLFVGVGLFLLPAR
>P26661 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKDRRSTGKSWGKPG
YPWPLYGNEGCGWAGWLLSPRGSRPTWGPTDPRHRSRNLGRVIDTITCGFADLMGYIPVVGAPVGGVARALAHGVRVLED
GINYATGNLPGCSFSIFLLALLSCVTVPVSAVEVRNISSSYYATNDCSNNSITWQLTDAVLHLPGCVPCENDNGTLHCWI
QVTPNVAVKHRGALTRSLRTHVDMIVMAATACSALYVGDVCGAVMILSQAFMVSPQRHNFTQECNCSIYQGHITGHRMAW
DMMLSWSPTLTMILAYAARVPELVLEIIFGGHWGVVFGLAYFSMQGAWAKVIAILLLVAGVDATTYSSGQEAGRTVAGFA
GLFTTGAKQNLYLINTNGSWHINRTALNCNDSLQTGFLASLFYTHKFNSSGCPERLSSCRGLDDFRIGWGTLEYETNVTN
DGDMRPYCWHYPPRPCGIVPARTVCGPVYCFTPSPVVVGTTDKQGVPTYTWGENETDVFLLNSTRPPRGAWFGCTWMNGT
GFTKTCGAPPCRIRKDYNSTIDLLCPTDCFRKHPDATYLKCGAGPWLTPRCLVDYPYRLWHYPCTVNFTIFKARMYVGGV
EHRFSAACNFTRGDRCRLEDRDRGQQSPLLHSTTEWAVLPCSFSDLPALSTGLLHLHQNIVDVQYLYGLSPALTRYIVKW
EWVILLFLLLADARICACLWMLIILGQAEAALEKLIILHSASAASANGPLWFFIFFTAAWYLKGRVVPVATYSVLGLWSF
LLLVLALPQQAYALDAAEQGELGLAILVIISIFTLTPAYKILLSRSVWWLSYMLVLAEAQIQQWVPPLEVRGGRDGIIWV
AVILHPRLVFEVTKWLLAILGPAYLLKASLLRIPYFVRAHALLRVCTLVKHLAGARYIQMLLITIGRWTGTYIYDHLSPL
STWAAQGLRDLAIAVEPVVFSPMEKKVIVWGAETVACGDILHGLPVSARLGREVLLGPADGYTSKGWKLLAPITAYTQQT
RGLLGAIVVSLTGRDKNEQAGQVQVLSSVTQTFLGTSISGVLWTVYHGAGNKTLAGPKGPVTQMYTSAEGDLVGWPSPPG
TKSLDPCTCGAVDLYLVTRNADVIPVRRKDDRRGALLSPRPLSTLKGSSGGPVLCSRGHAVGLFRAAVCARGVAKSIDFI
PVESLDVATRTPSFSDNSTPPAVPQSYQVGYLHAPTGSGKSTKVPAAYASQGYKVLVLNPSVAATLGFGAYMSKAHGINP
NIRTGVRTVTTGDSITYSTYGKFIADGGCAAGAYDIIICDECHSVDATTILGIGTVLDQAETAGVRLVVLATATPPGTVT
TPHSNIEEVALGHEGEIPFYGKAIPLAFIKGGRHLIFCHSKKKCDELAAALRGMGVNAVAYYRGLDVSVIPTQGDVVVVA
TDALMTGYTGDFDSVIDCNVAVSQIVDFSLDPTFTITTQTVPQDAVSRSQRRGRTGRGRLGVYRYVSSGERPSGMFDSVV
LCECYDAGAAWYELTPAETTVRLRAYFNTPGLPVCQDHLEFWEAVFTGLTHIDAHFLSQTKQGGENFAYLTAYQATVCAR
AKAPPPSWDVMWKCLTRLKPTLTGPTPLLYRLGAVTNEVTLTHPVTKYIATCMQADLEIMTSSWVLAGGVLAAVAAYCLA
TGCISIIGRLHLNDRVVVAPDKEILYEAFDEMEECASKAALIEEGQRMAEMLKSKIQGLLQQATRQAQDIQPAIQSSWPK
LEQFWAKHMWNFISGIQYLAGLSTLPGNPAVASMMAFSAALTSPLPTSTTILLNIMGGWLASQIAPPAGATGFVVSGLVG
AAVGSIGLGKILVDVLAGYGAGISGALVAFKIMSGEKPTVEDVVNLLPAILSPGALVVGVICAAILRRHVGQGEGAVQWM
NRLIAFASRGNHVAPTHYVVESDASQRVTQVLSSLTITSLLRRLHAWITEDCPVPCSGSWLQDIWDWVCSILTDFKNWLS
SKLLPKMPGIPFISCQKGYKGVWAGTGVMTTRCPCGANISGHVRMGTMKITGPKTCLNLWQGTFPINCYTEGPCVPKPPP
NYKTAIWRVAASEYVEVTQHGSFSYVTGLTSDNLKVPCQVPAPEFFSWVDGVQIHRFAPVPGPFFRDEVTFTVGLNSFVV
GSQLPCDPEPDTEVLASMLTDPSHITAEAAARRLARGSPPSQASSSASQLSAPSLKATCTTHKTAYDCDMVDANLFMGGD
VTRIESDSKVIVLDSLDSMTEVEDDREPSVPSEYLIKRRKFPPALPPWARPDYNPVLIETWKRPGYEPPTVLGCALPPTP
QTPVPPPRRRRAKVLTQDNVEGVLREMADKVLSPLQDNNDSGHSTGADTGGDIVQQPSDETAASEAGSLSSMPPLEGEPG
DPDLEFEPVGSAPPSEGECEVIDSDSKSWSTVSDQEDSVICCSMSYSWTGALITPCGPEEEKLPINPLSNSLMRFHNKVY
STTSRSASLRAKKVTFDRVQVLDAHYDSVLQDVKRAASKVSARLLTVEEACALTPPHSAKSRYGFGAKEVRSLSRRAVNH
IRSVWEDLLEDQHTPIDTTIMAKNEVFCIDPTKGGKKPARLIVYPDLGVRVCEKMALYDIAQKLPKAIMGPSYGFQYSPA
ERVDFLLKAWGSKKDPMGFSYDTRCFDSTVTERDIRTEESIYQACSLPQEARTVIHSLTERLYVGGPMTNSKGQSCGYRR
CRASGVFTTSMGNTMTCYIKALAACKAAGIVDPVMLVCGDDLVVISESQGNEEDERNLRAFTEAMTRYSAPPGDLPRPEY
DLELITSCSSNVSVALDSRGRRRYFLTRDPTTPITRAAWETVRHSPVNSWLGNIIQYAPTIWVRMVIMTHFFSILLAQDT
LNQNLNFEMYGAVYSVNPLDLPAIIERLHGLEAFSLHTYSPHELSRVAATLRKLGAPPLRAWKSRARAVRASLIAQGARA
AICGRYLFNWAVKTKLKLTPLPEASRLDLSGWFTVGAGGGDIYHSVSHARPRLLLLCLLLLSVGVGIFLLPAR
>P26662 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRPEGRTWAQPG
YPWPLYGNEGMGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCVPCVRESNFSRCWV
ALTPTLAARNSSIPTTTIRRHVDLLVGAAALCSAMYVGDLCGSVFLVSQLFTFSPRRYETVQDCNCSIYPGHVSGHRMAW
DMMMNWSPTTALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGHTHVTGGRVASSTQSLV
SWLSQGPSQKIQLVNTNGSWHINRTALNCNDSLQTGFIAALFYAHRFNASGCPERMASCRPIDEFAQGWGPITHDMPESS
DQRPYCWHYAPRPCGIVPASQVCGPVYCFTPSPVVVGTTDRFGAPTYSWGENETDVLLLSNTRPPQGNWFGCTWMNSTGF
TKTCGGPPCNIGGVGNNTLVCPTDCFRKHPEATYTKCGSGPWLTPRCMVDYPYRLWHYPCTVNFTVFKVRMYVGGVEHRL
NAACNWTRGERCDLEDRDRSELSPLLLSTTEWQILPCSFTTLPALSTGLIHLHRNIVDVQYLYGIGSAVVSFAIKWEYIL
LLFLLLADARVCACLWMMLLIAQAEATLENLVVLNAASVAGAHGLLSFLVFFCAAWYIKGRLVPGAAYALYGVWPLLLLL
LALPPRAYAMDREMAASCGGAVFVGLVLLTLSPYYKVFLARLIWWLQYFITRAEAHLQVWVPPLNVRGGRDAIILLTCAV
HPELIFDITKLLLAILGPLMVLQAGITRVPYFVRAQGLIRACMLVRKVAGGHYVQMAFMKLAALTGTYVYDHLTPLRDWA
HAGLRDLAVAVEPVVFSDMETKLITWGADTAACGDIISGLPVSARRGKEILLGPADSFGEQGWRLLAPITAYSQQTRGLL
GCIITSLTGRDKNQVDGEVQVLSTATQSFLATCVNGVCWTVYHGAGSKTLAGPKGPITQMYTNVDQDLVGWPAPPGARSM
TPCTCGSSDLYLVTRHADVVPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPSGHVVGIFRAAVCTRGVAKAVDFIPVES
METTMRSPVFTDNSSPPAVPQTFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIEPNIRT
GVRTITTGGPITYSTYCKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSITVPHP
NIEEVALSNTGEIPFYGKAIPIEAIKGGRHLIFCHSKKKCDELAAKLTGLGLNAVAYYRGLDVSVIPTSGDVVVVATDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTLPQDAVSRAQRRGRTGRGRSGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVFTGLTHIDAHFLSQTKQAGDNLPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEVTLTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGRPAVIPDREVLYQEFDEMEECASHLPYIEQGMQLAEQFKQKALGLLQTATKQAEAAAPVVESKWRALEVF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLTTQNTLLFNILGGWVAAQLAPPSAASAFVGAGIAGAAVG
SIGLGKVLVDILAGYGAGVAGALVAFKVMSGEMPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTQILSSLTITQLLKRLHQWINEDCSTPCSGSWLKDVWDWICTVLSDFKTWLQSKLL
PRLPGLPFLSCQRGYKGVWRGDGIMQTTCPCGAQITGHVKNGSMRIVGPKTCSNTWHGTFPINAYTTGPCTPSPAPNYSR
ALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPVCKPLLREEVVFQVGLNQYLVGSQL
PCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTHHDSPDADLIEANLLWRQEMGGN
ITRVESENKVVILDSFDPIRAVEDEREISVPAEILRKPRKFPPALPIWARPDYNPPLLESWKDPDYVPPVVHGCPLPSTK
APPIPPPRRKRTVVLTESTVSSALAELATKTFGSSGSSAVDSGTATGPPDQASDDGDKGSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSGEAGEDVVCCSMSYTWTGALITPCAAEESKLPINPLSNSLLRHHSMVYSTTSRSASLRQKKVTFDRLQVLD
DHYRDVLKEMKAKASTVKARLLSIEEACKLTPPHSAKSKFGYGAKDVRSLSSRAVNHIRSVWEDLLEDTETPIDTTIMAK
NEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGPSYGFQYSPGQRVEFLVNTWKSKKCPMGFSYDT
RCFDSTVTENDIRTEESIYQCCDLAPEARQAIRSLTERLYVGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKATA
ACRAAKLQDCTMLVNGDDLVVICESAGTQEDAAALRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDASGKRV
YYLTRDPTTPLARAAWETVRHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIERLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRVWRHRARSVRAKLLSQGGRAATCGKYLFNWAVKTKLKLTPIPA
ASQLDLSGWFVAGYNGGDIYHSLSRARPRWFMLCLLLLSVGVGIYLLPNR
>Q99IB8 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPEDVKFPGGGQIVGGVYLLPRRGPRLGVRTTRKTSERSQPRGRRQPIPKDRRSTGKAWGKPG
RPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRHRSRNVGKVIDTLTCGFADLMGYIPVVGAPLSGAARAVAHGVRVLED
GVNYATGNLPGFPFSIFLLALLSCITVPVSAAQVKNTSSSYMVTNDCSNDSITWQLEAAVLHVPGCVPCERVGNTSRCWV
PVSPNMAVRQPGALTQGLRTHIDMVVMSATFCSALYVGDLCGGVMLAAQVFIVSPQYHWFVQECNCSIYPGTITGHRMAW
DMMMNWSPTATMILAYVMRVPEVIIDIVSGAHWGVMFGLAYFSMQGAWAKVIVILLLAAGVDAGTTTVGGAVARSTNVIA
GVFSHGPQQNIQLINTNGSWHINRTALNCNDSLNTGFLAALFYTNRFNSSGCPGRLSACRNIEAFRIGWGTLQYEDNVTN
PEDMRPYCWHYPPKPCGVVPARSVCGPVYCFTPSPVVVGTTDRRGVPTYTWGENETDVFLLNSTRPPQGSWFGCTWMNST
GFTKTCGAPPCRTRADFNASTDLLCPTDCFRKHPDATYIKCGSGPWLTPKCLVHYPYRLWHYPCTVNFTIFKIRMYVGGV
EHRLTAACNFTRGDRCDLEDRDRSQLSPLLHSTTEWAILPCTYSDLPALSTGLLHLHQNIVDVQYMYGLSPAITKYVVRW
EWVVLLFLLLADARVCACLWMLILLGQAEAALEKLVVLHAASAANCHGLLYFAIFFVAAWHIRGRVVPLTTYCLTGLWPF
CLLLMALPRQAYAYDAPVHGQIGVGLLILITLFTLTPGYKTLLGQCLWWLCYLLTLGEAMIQEWVPPMQVRGGRDGIAWA
VTIFCPGVVFDITKWLLALLGPAYLLRAALTHVPYFVRAHALIRVCALVKQLAGGRYVQVALLALGRWTGTYIYDHLTPM
SDWAASGLRDLAVAVEPIIFSPMEKKVIVWGAETAACGDILHGLPVSARLGQEILLGPADGYTSKGWKLLAPITAYAQQT
RGLLGAIVVSMTGRDRTEQAGEVQILSTVSQSFLGTTISGVLWTVYHGAGNKTLAGLRGPVTQMYSSAEGDLVGWPSPPG
TKSLEPCKCGAVDLYLVTRNADVIPARRRGDKRGALLSPRPISTLKGSSGGPVLCPRGHVVGLFRAAVCSRGVAKSIDFI
PVETLDVVTRSPTFSDNSTPPAVPQTYQVGYLHAPTGSGKSTKVPVAYAAQGYKVLVLNPSVAATLGFGAYLSKAHGINP
NIRTGVRTVMTGEAITYSTYGKFLADGGCASGAYDIIICDECHAVDATSILGIGTVLDQAETAGVRLTVLATATPPGSVT
TPHPDIEEVGLGREGEIPFYGRAIPLSCIKGGRHLIFCHSKKKCDELAAALRGMGLNAVAYYRGLDVSIIPAQGDVVVVA
TDALMTGYTGDFDSVIDCNVAVTQAVDFSLDPTFTITTQTVPQDAVSRSQRRGRTGRGRQGTYRYVSTGERASGMFDSVV
LCECYDAGAAWYDLTPAETTVRLRAYFNTPGLPVCQDHLEFWEAVFTGLTHIDAHFLSQTKQAGENFAYLVAYQATVCAR
AKAPPPSWDAMWKCLARLKPTLAGPTPLLYRLGPITNEVTLTHPGTKYIATCMQADLEVMTSTWVLAGGVLAAVAAYCLA
TGCVSIIGRLHVNQRVVVAPDKEVLYEAFDEMEECASRAALIEEGQRIAEMLKSKIQGLLQQASKQAQDIQPAMQASWPK
VEQFWARHMWNFISGIQYLAGLSTLPGNPAVASMMAFSAALTSPLSTSTTILLNIMGGWLASQIAPPAGATGFVVSGLVG
AAVGSIGLGKVLVDILAGYGAGISGALVAFKIMSGEKPSMEDVINLLPGILSPGALVVGVICAAILRRHVGPGEGAVQWM
NRLIAFASRGNHVAPTHYVTESDASQRVTQLLGSLTITSLLRRLHNWITEDCPIPCSGSWLRDVWDWVCTILTDFKNWLT
SKLFPKLPGLPFISCQKGYKGVWAGTGIMTTRCPCGANISGNVRLGSMRITGPKTCMNTWQGTFPINCYTEGQCAPKPPT
NYKTAIWRVAASEYAEVTQHGSYSYVTGLTTDNLKIPCQLPSPEFFSWVDGVQIHRFAPTPKPFFRDEVSFCVGLNSYAV
GSQLPCEPEPDADVLRSMLTDPPHITAETAARRLARGSPPSEASSSVSQLSAPSLRATCTTHSNTYDVDMVDANLLMEGG
VAQTEPESRVPVLDFLEPMAEEESDLEPSIPSECMLPRSGFPRALPAWARPDYNPPLVESWRRPDYQPPTVAGCALPPPK
KAPTPPPRRRRTVGLSESTISEALQQLAIKTFGQPPSSGDAGSSTGAGAAESGGPTSPGEPAPSETGSASSMPPLEGEPG
DPDLESDQVELQPPPQGGGVAPGSGSGSWSTCSEEDDTTVCCSMSYSWTGALITPCSPEEEKLPINPLSNSLLRYHNKVY
CTTSKSASQRAKKVTFDRTQVLDAHYDSVLKDIKLAASKVSARLLTLEEACQLTPPHSARSKYGFGAKEVRSLSGRAVNH
IKSVWKDLLEDPQTPIPTTIMAKNEVFCVDPAKGGKKPARLIVYPDLGVRVCEKMALYDITQKLPQAVMGASYGFQYSPA
QRVEYLLKAWAEKKDPMGFSYDTRCFDSTVTERDIRTEESIYQACSLPEEARTAIHSLTERLYVGGPMFNSKGQTCGYRR
CRASGVLTTSMGNTITCYVKALAACKAAGIVAPTMLVCGDDLVVISESQGTEEDERNLRAFTEAMTRYSAPPGDPPRPEY
DLELITSCSSNVSVALGPRGRRRYYLTRDPTTPLARAAWETVRHSPINSWLGNIIQYAPTIWVRMVLMTHFFSILMVQDT
LDQNLNFEMYGSVYSVNPLDLPAIIERLHGLDAFSMHTYSHHELTRVASALRKLGAPPLRVWKSRARAVRASLISRGGKA
AVCGRYLFNWAVKTKLKLTPLPEARLLDLSSWFTVGAGGGDIFHSVSRARPRSLLFGLLLLFVGVGLFLLPAR
>Q81258 ~~~~~~Genome polyprotein~~~
MSTLPKPQRKTKRNTIRRPQDVKFPGGGQIVGGVYVLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRSEGRSWAQPG
YPWPLYGNEGCGWAGWLLSPRGSRPSWGPNDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPVGGVARALAHGVRALED
GINFATGNLPGCSFSIFLLALFSCLIHPAASLEWRNTSGLYVLTNDCSNSSIVYEADDVILHTPGCVPCVQDGNTSTCWT
PVTPTVAVRYVGATTASIRSHVDLLVGAATMCSALYVGDMCGAVFLVGQAFTFRPRRHQTVQTCNCSLYPGHLSGHRMAW
DMMMNWSPAVGMVVAHVLRLPQTLFDIMAGAHWGILAGLAYYSMQGNWAKVAIIMVMFSGVDAHTYTTGGTASRHTQAFA
GLFDIGPQQKLQLVNTNGSWHINSTALNCNESINTGFIAGLFYYHKFNSTGCPQRLSSCKPITFFRQGWGPLTDANITGP
SDDRPYCWHYAPRPCDIVPASSVCGPVYCFTPSPVVVGTTDARGVPTYTWGENEKDVFLLKSQRPPSGRWFGCSWMNSTG
FLKTCGAPPCNIYGGEGNPHNESDLFCPTDCFRKHPETTYSRCGAGPWLTPRCMVDYPYRLWHYPCTVDFRLFKVRMFVG
GFEHRFTAACNWTRGERCDIEDRDRSEQHPLLHSTTELAILPCSFTPMPALSTGLIHLHQNIVDVQYLYGVGSGMVGWAL
KWEFVILVFLLLADARVCVALWLMLMISQTEAALENLVTLNAVAAAGTHGIGWYLVAFCAAWYVRGKLVPLVTYSLTGLW
SLALLVLLLPQRAYAWSGEDSATLGAGVLVLFGFFTLSPWYKHWIGRLMWWNQYTICRCESALHVWVPPLLARGSRDGVI
LLTSLLYPSLIFDITKLLMAVLGPLYLIQATITTTPYFVRAHVLVRLCMLVRSVIGGKYFQMIILSIGRWFNTYLYDHLA
PMQHWAAAGLKDLAVATEPVIFSPMEIKVITWGADTAACGDILCGLPVSARLGREVLLGPADDYREMGWRLLAPITAYAQ
QTRGLLGTIVTSLTGRDKNVVTGEVQVLSTATQTFLGTTVGGVIWTVYHGAGSRTLAGAKHPALQMYTNVDQDLVGWPAP
PGAKSLEPCACGSSDLYLVTRDADVIPARRRGDSTASLLSPRPLACLKGSSGGPVMCPSGHVAGIFRAAVCTRGVAKSLQ
FIPVETLSTQARSPSFSDNSTPPAVPQSYQVGYLHAPTGSGKSTKVPAAYVAQGYNVLVLNPSVAATLGFGSFMSRAYGI
DPNIRTGNRTVTTGAKLTYSTYGKFLADGGCSGGAYDVIICDECHAQDATSILGIGTVLDQAETAGVRLTVLATATPPGS
ITVPHSNIEEVALGSEGEIPFYGKAIPIALLKGGRHLIFCHSKKKCDEIASKLRGMGLNAVAYYRGLDVSVIPTTGDVVV
CATDALMTGFTGDFDSVIDCNVAVEQYVDFSLDPTFSIETRTAPQDAVSRSQRRGRTGRGRLGTYRYVASGERPSGMFDS
VVLCECYDAGCSWYDLQPAETTVRLRAYLSTPGLPVCQDHLDFWESVFTGLTHIDAHFLSQTKQQGLNFSYLTAYQATVC
ARAQAPPPSWDEMWKCLVRLKPTLHGPTPLLYRLGPVQNETCLTHPITKYLMACMSADLEVTTSTWVLLGGVLAALAAYC
LSVGCVVIVGHIELEGKPALVPDKEVLYQQYDEMEECSQAAPYIEQAQVIAHQFKEKILGLLQRATQQQAVIEPIVTTNW
QKLEAFWHKHMWNFVSGIQYLAGLSTLPGNPAVASLMAFTASVTSPLTTNQTMFFNILGGWVATHLAGPQSSSAFVVSGL
AGAAIGGIGLGRVLLDILAGYGAGVSGALVAFKIMGGECPTAEDMVNLLPAILSPGALVVGVICAAILRRHVGPGEGAVQ
WMNRLIAFASRGNHVSPTHYVPESDAAARVTALLSSLTVTSLLRRLHQWINEDYPSPCSDDWLRTIWDWVCSVLADFKAW
LSAKIMPALPGLPFISCQKGYKGVWRGDGVMSTRCPCGAAITGHVKNGSMRLAGPRTCANMWHGTFPINEYTTGPSTPCP
SPNYTRALWRVAANSYVEVRRVGDFHYITGATEDELKCPCQVPAAEFFTEVDGVRLHRYAPPCKPLLRDDITFMVGLHSY
TIGSQLPCEPEPDVSVLTSMLRDPSHITAETAARRLARGSPPSEASSSASQLSAPSLKATCQTHRPHPDAELVDANLLWR
QEMGSNITRVESETKVVVLDSFEPLRAETDDVEPSVAAECFKKPPKYPPALPIWARPDYNPPLLDRWKAPDYVPPTVHGC
ALPPRGAPPVPPPRRKRTIQLDGSNVSAALAALAEKSFPSSKPQEENSSSSGVDTQSSTTSKVPPSPGGESDSESCSSMP
PLEGEPGDPDLSCDSWSTVSDSEEQSVVCCSMSYSWTGALITPCSAEEEKLPISPLSNSLLRHHNLVYSTSSRSASQRQK
KVTFDRLQVLDDHYKTALKEVKERASRVKARMLTIEEACALVPPHSARSKFGYSAKDVRSLSSRAINQIRSVWEDLLEDT
TTPIPTTIMAKNEVFCVDPAKGGRKPARLIVYPDLGVRVCEKRALYDVIQKLSIETMGPAYGFQYSPQQRVERLLKMWTS
KKTPLGFSYDTRCFDSTVTEQDIRVEEEIYQCCNLEPEARKVISSLTERLYCGGPMFNSKGAQCGYRRCRASGVLPTSFG
NTITCYIKATAAAKAANLRNPDFLVCGDDLVVVAESDGVDEDRAALRAFTEAMTRYSAPPGDAPQATYDLELITSCSSNV
SVARDDKGRRYYYLTRDATTPLARAAWETARHTPVNSWLGNIIMYAPTIWVRMVMMTHFFSILQSQEILDRPLDFEMYGA
TYSVTPLDLPAIIERLHGLSAFTLHSYSPVELNRVAGTLRKLGCPPLRAWRHRARAVRAKLIAQGGKAKICGLYLFNWAV
RTKTNLTPLPAAGQLDLSSWFTVGVGGNDIYHSVSRARTRHLLLCLLLLTVGVGIFLLPAR
>Q913V3 ~~~~~~Genome polyprotein~~~
MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARQPEGRAWAQPG
YPWPLYGNEGMGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGVARALAHGVRVVED
GVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCVPCVREGNSSRCWV
ALTPTLAARNASVPTTAIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFSPRRHETIQDCNCSIYPGHVSGHRMAW
DMMMNWSPTTALVVSQLLRIPQAIVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGETRVTGGQIARNAYSLT
TLFSSGSAQNIQLINTNGSWHINRTALNCNDSLNTGFLAALFYTHKFNASGCPERLASCRPIDKFDQGWGPITYAEQGGQ
DQRPYCWHYAPKPCGIVSASKVCGPVYCFTPSPVVVGTTDRFGVPTYSWGENETDVLLLNNTRPPQGNWFGCTWMNGTGF
TKTCGGPPCNIGGGGNNTLTCPTDCFRKHPAATYTKCGSGPWLTPRCLVDYPYRLWHYPCTANFTIFKVRMYVGGVEHRL
DAACNWTRGERCNLEDRDRLELSPLLLSTTEWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGIGSAVVSFAIKWDYIV
ILFLLLADARVCACLWMMLLIAQAEAALENLVVLNAASVAGAHGILSFLVFFCAAWYIKGKLVPGAAYAFYGVWPLLLLL
LALPPRAYAMEREMAASCGGAVFVGLVLLTLSPYYKEFLARLIWWLQYFITRAEAHLQVWIPPLNIRGGRDAIILLACVV
HPELIFDITKLLLAILGPLMVLQASITQVPYFVRAQGLIRACMLVRKVAGGHYVQMAFVKLTALTGTYVYDHLTPLRDWA
HAGLRDLAVAVEPVVFSDMETKVITWGADTAACGDIILGLPVSARRGREILLGPADSLEGQGWRLLAPITAYSQQTRGLL
GCIITSLTGRDKNQVEGEVQVVSTATQSFLATCVNGACWTVFHGAGSKTLAGPKGPITQMYTNVDLDLVGWQAPPGSRSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPVSYLKGSSGGPLLCPSRHAVGIFRAAVCTRGVAKAVDFIPVES
METTMRSPVFTDNSSPPAVPQTFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRT
GVRAITTGAPITYSTYGKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEVALSNAGEIPFYGKAIPIEVIKGGRHLIFCHSKKKYDELAAKLSALGLNAVAYYRGLDVSVIPTNGDVVVVATDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVARSQRRGRTGRGRRGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVSTGLTHIDAHFLSQTKQAGDNFPYLVAYQATVCARAQAP
PPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEITLTHPMTKFIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGRPAVIPDREVLYREFDEMEECASHLPYIEQGMQLAEQFKQKALGLLQTATKQAEAAAPVVESKWRALETF
WAKHMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLSTQNTLLFNIWGGWVAAQLAPPSAASAFVGAGIAGAAVG
SIGLGKVLVDILAGYGAGVAGALVAFKIMSGEVPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVGPGEGAVQWMNRLI
AFASRGNHVSPAHYVPESDAAARVTQILSGLTITQLLKRLHHWINEDCSTPCSGSWLRDVWDWICTVLTDFKTWLQSKLL
PRLPGVPFFSCQRGYKGVWRGDGIMQTTCPCGAQITGHVKNGSMRIVGPKTCSSTWHGTFPINAYTTGPCAPSPAPNYSR
ALWRVAAEEYVEVTRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPACKPLLREEVTFQVGLNQYLVGSQL
PCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSAPSLKATCTTHHDSPDVDLIEANLLWRQEMGGN
ITRVESENKVVILDSFDPLRAEEDEREPSVAAEILRKTKRFPPAMPIWARPDYNPPLLESWKDPDYVPPVVHGCPLPPTK
APPIPPPRRKRTVVLTESTVSSALAELATKTFGSSGSSAVDSGTATAPPDQASDDGDQGSDVESYSSMPPLEGEPGDPDL
SDGSWSTVSEEAGEDVICCSMSYTWTGALITPCAAEESKLPINPLSNSLLRHHNMVYATTSRSAGLRQKKVTFDRLQVLD
DHYRDVLKEMKAKASTVKAKLLSIEEACKLTPPHSARSKFGYGAKDVRNLSSKAVNHIRSVWKDLLEDTETPIDTTVMAK
SEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPGQRVEFLVNAWKSKKCPMGFSYDT
RCFDSTVTESDIRVEESIYQCCDLAPEARQAIKSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKASA
ACRAAKLRDCTMLVNGDDLVVICESAGTQEDEANLRVFTEAMTRYSAPPGDPPRPEYDLELITSCSSNVSVAHDASGKRV
YYLTRDPSTPLARAAWETARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIERLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRVWRHRARSVRAKLLSQGGRAATCGKYLFNWAVRTKLKLTPIPA
ASQLDLSSWFVAGYSGGDIYHSLSRARPRWFMLCLLLLSVGVGIYLLPNR
>P29846 ~~~~~~Genome polyprotein~~~
MSTNGKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTWERSQPRGRRQPIPKARQPEGRAWAQPG
YPWPLYGNEGLGWAGWLVSPRGSRPNWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGVARALAHGVRVLED
GVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVHNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCVPCVRENNSSRCWV
ALTPTLAARNNSVPTATIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFSPRRYETVQDCNCSIYPGHVTGHRMAW
DMMMNWSPTTALVVSQLLRIPQAVVDMVGGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGSTIVSGGTVARTTHSLA
SLFTQGASQKIQLINTNGSWHINRTALNCNDSLQTGFLASLFYAHRFNASGCPERMASCRSIDKFDQGWGPITYTEADIQ
DQRPYCWHYAPRPCGIVPASQVCGPVYCFTPSPVVVGTTDRFGAPTYSWGENETDVLILNNTRPPQGNWFGCTWMNSTGF
TKTCGGPPCNIGGGGNNTLVCPTDCFRKHPEATYTKCGSGPWLTPRCMVDYPYRLWHYPCTVNFTIFKVRMYVGGVEHRL
NAACNWTRGERCDLEDRDRSELSPLLLSTTEWQILPCSFTGLPALSTGLIHLHQNVVDVQYLYGIGSAVVSFAIKWEYIL
LLFLLLADARVCACLWMMLLIAQAEAALENLVVFNAASVAGMHGTLSFLVFFCAAWYIKGRLVPGAAYALYGVWPLLLLL
LALPPRAYAMDREMAASCGGAVFVGLVLLTLSPHYKMFLARLIWWLQYFITRAEAHLQVWIPPLNVRGGRDAIILLTCAA
YPELIFDITKILLAILGPLMVLQAGLTRIPYFVRAQGLIRACMLVRKAAGGHYVQMALMKLAALTGTYVYDHLTPLQDWA
HTGLRDLAVAVEPVVFSDMETKIITWGADTAACGDIILGLPVSARRGREILLGPADSLEGRGWRLLAPITAYAQQTRGLF
GCIITSLTGRDKNQVEGEVQVVSTATQSFLATCINGVCWTVYHGAGSKTLAGPKGPITQMYTNVDQDLVGWHAPQGARSL
TPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPSGHVVGIFRAAVCTRGVAKAVDFVPVES
METTMRSPVFTDNSSPPAVPQAFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGVDPNIRT
GVRTITTGAPITYSTYGKFLADGGCSGGAYDIIMCDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHP
NIEEIALSNTGEIPFYGKAIPIETIKGGRHLIFCHSKKKCDELAAKLSALGIHAVAYYRGLDVSVIPASGNVVVVATDAL
MTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTMPQDAVSRSQRRGRTSRGRRGIYRFVTPGERPSGMFDSSVLCEC
YDAGCAWYELTPAETSVRLRAYLNTPGLPVCQDHLEFWESVFTGLTHIDAHFLSQTKQAGDNFPYLVAYQATVCARAQAP
PPSWDQMWKCLTRLKPTLHGPTPLLYRLGAVQNEVTLTHPITKYIMACMSADLEVVTSTWVLVGGVLAALAAYCLTTGSV
VIVGRIILSGKPAVVPDREVLYQEFDEMEECASHLPYIEQGMQLAEQFKQKALGLLQTATKQAEAAAPVVESKWRTLEAF
WANDMWNFISGIQYLAGLSTLPGNPAIASLMAFTASITSPLTTQSTLLFNILGGWVAAQLAPPGAASAFVGAGIAGAAVG
SIGLGKVLVDMVAGYGAGVAGALVAFKVMSGEMPSTEDLVNLLPAILSPGALVVGVVCAAILRRHVDPGEGAVQWMNRLI
AFASRGNHVSPTHYVPESDAAARVTQILSGLTITQLLRRLHQWINEDCSTPCSGSWLRDVWDWICTVLADFKTWLQSKLL
PRLPGVPFFSCQRGYKGVWRGDGIMQTTCPCGAQLTGHVKNGSMRIWGPKTCSNTWHGTFPINAYTTGPCTPSPAPNYSR
ALWRVAAEEYVEVRRVGDFHYVTGMTTDNVKCPCQVPAPEFFTEVDGVRLHRYAPACKPLLREEVSFQVGLNQYVVGSQL
PCEPEPDVAVLTSMLTDPSHITAETAKRRLARGSPPSLASSSASQLSALSLKAACTTRHTPPDADLIEANLLWRQEMGGN
ITRVESENKVVILDSFDPLRAEEDEREVSVPAEILRKSRKFPPALPVWARPDYNPPLLEPWKDPDYVPPVVHGCPLPPVK
APPIPPPRRKRTVVLTESTVSSALAELATKTFGSSESSAAGSGTATAPPDQPSDDGDAGSDVESCSSMPPLEGEPGDPDL
SDGSWSTVSEEDGEGVICCSMSYTWTGALITPCAAEESKLPINALSNSLLRHHNMVYATTSRSASQRQKKVTIDRLQVLD
DHYRDVLKEMKAKASTVKAKLLSVEEACKLTPPHSARSKFGYGAKDVRNLSGKAINHIRSVWKDLLEDTETPIDTTIMAK
NEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPGQRVEFLVNAWKSKKCPMGFSYDT
RCFDSTVTESDIRVEESIYQCCDLAPEARQAIRSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKASA
ACRAAKLQDCTMLVCGDDLVVICESAGTQEDAASLRVFTEAMTRYSAPPGDLPQPEYDQELITSCSSNVSVAHDASGKRV
YYLTRDPTTPLARAAWATARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQ
IIERLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRAWRHRARSVRAKLLSQGGRAATCGRYLFNWAVKTKLKLTPIPA
ASQLDLSKWFVAGYGGGDIYHSLSRARPRWFMLCLLLLSVGVGIYLLPNR
>P32537 ~~~~~~Genome polyprotein~~~
MGAQVSRQQTGTHENANVATGGSSITYNQINFYKDSYAASASKQDFSQDPSKFTEPVAEALKAGAPVLKSPSAEACGYSD
RVLQLKLGNSSIVTQEAANICCAYGEWPTYLPDNEAVAIDKPTQPETSTDRFYTLKSKKWESNSTGWWWKLPDALNQIGM
FGQNVQYHYLYRSGFLCHVQCNATKFHQGTLLIVAIPEHQIGKKGTGTSASFAEVMKGAEGGVFEQPYLLDDGTSLACAL
VYPHQWINLRTNNSATIVLPWMNSAPMDFALRHNNWTLAVIPVCPLAGGTGNTNTYVPITISIAPMCAEYNGLRNAITQG
VPTCLLPGSNQFLTTDDHSSAPAFPDFSPTPEMHIPGQVHSMLEIVQIESMMEINNVNDASGVERLRVQISAQSDMDQLL
FNIPLDIQLEGPLRNTLLGNISRYYTHWSGSLEMTFMFCGSFMTTGKLIICYTPPGGSSPTDRMQAMLATHVVWDFGLQS
SITIIIPWISGSHYRMFNTDAKAINANVGYVTCFMQTNLVAPVGAADQCYIVGMVAAKKDFNLRLMRDSPDIGQSAILPE
QAATTQIGEIVKTVANTVESEIKAELGVIPSLNAVETGATSNTEPEEAIQTRTVINMHGTAECLVENFLGRSALVCMRSF
EYKNHSTSTSSIQKNFFIWTLNTRELVQIRRKMELFTYLRFDTEITIVPTLRLFSSSNVSFSGLPNLTLQAMYVPTGARK
PSSQDSFEWQSACNPSVFFKINDPPARLTIPFMSINSAYANFYDGFAGFEKKATVLYGINPANTMGNLCLRVVNSYQPVQ
YTLTVRVYMKPKHIKAWAPRAPRTMPYTNILNNNYAGRSAAPNAPTAIVSHRSTIKTMPNDINLTTAGPGYGGAFVGSYK
IINYHLATDEEKERSVYVDWQSDVLVTTVAAHGKHQIARCRCNTGVYYCKHKNRSYPVCFEGPGIQWINESDYYPARYQT
NTLLAMGPCQPGDCGGLLVCSHGVIGLVTAGGEGIVAFTDIRNLLWLEDDAMEQGITDYIQNLGSAFGTGFTETISEKAK
EIQNMLVGEDSLLEKLLKALIKIVSAMVIVIRNSEDLVTVTATLALLGCNDSPWAFLKQKVCSYLGIPYTIRQSDSWLKK
FTEACNALRGLDWLAQKIDKFINWLKTKILPEAREKHEFVQKLKQLPVIESQINTIEHSCPNSEXQQALFNNVQYYSHYC
KKYAPLYALEAKRVSALERKINNYIQFKSKSRIEPVCLIIHGSPGTGKSVASNLIARAITEKLGGDSYSLPPDPKYFDGY
KQQTVVLMDDLMQNPDGNDIAMFCQMVSTVDFIPPMASLEEKGTLYTSPFLIATTNAGSIHAPTVSDSKALARRFKFDME
IESMESYKDGVRLDMFKAVELCNPEKCRPTNYKKCCPLICGKAIQFRDKRTNVRYSVDMLVTEMIKEYRIRNSTQDKLEA
LFQGPPTFKEIKISVTPETPAPDAINDLLRSIDSQEVRDYCQKKGWIVMHPPTELVVDKHISRAFIALQAITTFVSIAGV
VYVIYKLFAGIQGPYTGLPNQKPKVPTLRTAKVQGPSLDFAQAIMRKNTVIARTSKGEFTMLGIYDRIAVVPTHASVEEE
IYINDVPVKVKDAYALRDINDVNLEITVVELDRNEKFRDIRGFLPKYEDDYNDAILSVNTSKFPNMYIPVGQTLNYGFLN
LGGTPTHRILMYNFPTRAGQCGGVVTTTGKVIGIHVGGNGAQGFAAMLLQNYFTEKQGEIVSIEKTGVFINAPAKTKLEP
SVFHEVFEGVKEPAVLHSKDKRLKVDFEEAIFSKYVGNKTMLMDEYMEEAVDHYVGCLEPLDISTEPIKLEEAMYGMDGL
EALDLTTSAGYPYLLQGKKKRDIFNRQTRDTTEMTKMLDKYGVDLPFVTFVKDELRSREKVEKGKSRLIEASSLNDSVAM
RVAFGNLYATFHKNPGVATGSAVGCDPDLFWSKIPVXLDGKIFAFDYTGYDASLSPVWFACLKKTLVKLGYTHQTAFVDY
LCHSVHLYKDRKYIVNGGMPSGSSGTSIFNTMINNIIIRTLLLKVYKGIDLDQFKMIAYGDDVIASYPHEIDPGLLAKAG
KEYGLIMTPADKSSGFTETTWENVTFLKRYFRADEQYPFLIHPVMPMKEIHESIRWTKDPRNTQDHVRSLCLLAWHNGEE
TYNEFCRKIRTVPVGRALALPVYSSLRRKWLDSF
>B9VUU3 ~~~~~~Genome polyprotein~~~
MGSQVSTQRSGSHENSNSATEGSTINYTTINYYKDSYAATAGKQSLKQDPDKFANPVKDIFTEMAAPLKSPSAEACGYSD
RVAQLTIGNSTITTQEAANIIVGYGEWPSYCSDSDATAVDKPTRPDVSVNRFYTLDTKLWEKSSKGWYWKFPDVLTETGV
FGQNAQFHYLYRSGFCIHVQCNASKFHQGALLVAVLPEYVIGTVAGGTGTEDSHPPYKQTQPGADGFELQHPYVLDAGIP
ISQLTVCPHQWINLRTNNCATIIVPYINALPFDSALNHCNFGLLVVPISPLDYDQGATPVIPITITLAPMCSEFAGLRQA
VTQGFPTELKPGTNQFLTTDDGVSAPILPNFHPTPCIHIPGEVRNLLELCQVETILEVNNVPTNATSLMERLRFPVSAQA
GKGELCAVFRADPGRNGPWQSTLLGQLCGYYTQWSGSLEVTFMFTGSFMATGKMLIAYTPPGGPLPKDRATAMLGTHVIW
DFGLQSSVTLVIPWISNTHYRAHARDGVFDYYTTGLVSIWYQTNYVVPIGAPNTAYIIALAAAQKNFTMKLCKDASDILQ
TGTIQGDRVADVIESSIGDSVSRALTQALPAPTGQNTQVSSHRLDTGKVPALQAAEIGASSNASDESMIETRCVLNSHST
AETTLDSFFSRAGLVGEIDLPLEGTTNPNGYANWDIDITGYAQMRRKVELFTYMRFDAEFTFVACTPTGEVVPQLLQYMF
VPPGAPKPDSRESLAWQTATNPSVFVKLSDPPAQVSVPFMSPASAYQWFYDGYPTFGEHKQEKDLEYGACPNNMMGTFSV
RTVGTSKSKYPLVVRIYMRMKHVRAWIPRPMRNQNYLFKANPNYAGNSIKPTGTSRTAITTLGKFGQQSGAIYVGNFRVV
NRHLATHNDWANLVWEDSSRDLLVSSTTAQGCDTIARCNCQTGVYYCNSRRKHYPVSFSKPSLIYVEASEYYPARYQSHL
MLAQGHSEPGDCGGILRCQHGVVGIVSTGGNGLVGFADVRDLLWLDEEAMEQGVSDYIKGLGDAFGTGFTDAVSREVEAL
KSYLIGSEGAVEKILKNLIKLISALVIVIRSDYDMVTLTATLALIGCHGSPWAWIKAKTASILGIPIAQKQSASWLKKFN
DMANAAKGLEWVSNKISKFIDWLKEKIVPAAKEKVEFLNNLKQLPLLENQISNLEQSAASQEDLEVMFGNVSYLAHFCRK
FQPLYATEAKRVYALEKRMNNYMQFKSKHRIEPVCLIIRGSPGTGKSLATGIIARAIADKYHSSVYSLPPDPDHFDGYKQ
QVVTVMDDLCQNPDGKDMSLFCQMVSTVDFIPPMASLEEKGVSFTSKFVIASTNATNIIVPTVSDSDAIRRRFYMDCDIE
VTDSYKTDLGRLDAGRAAKLCSENNTANFKRCSPLVCGKAIQLRDRKSKVRYSVDTVVSELIREYSNRSAIGNTIEALFQ
GPPKFRPIRIGLEEKPAPDAISDLLASVDSEEVRQYCRDQGWIIPETPTNVERHLNRAVLVMQSIATVVAVVSLVYVIYK
LFAGFQGAYSGAPKQVLKKPALRTATVQGPSLDFALSLLRRNVRQVQTDQGHFTMLGVRDRLAVLPRHSQPGKTIWIEHK
LVNVLDAVELVDEQGVNLELTLITLDTNEKFRDITKFIPENISTASDATLVINTEHMPSMFVPVGDVVQYGFLNLSGKPT
HRTMMYNFPTKAGQCGGVVTSVGKVIGIHIGGNGRQGFCAGLKRSYFASEQGEIQWVKPNKETGRLNINGPTRTKLEPSV
FHDVFEGSKEPAVLHSKDPRLEVDFEQALFSKYVGNTLHVPDEYIREAALHYANQLKQLDIDTTQMSMEEACYGTDNLEA
IDLHTSAGYPYSALGIKKRDILDPTTRDVSKMKFYMDKYGLDLPYSTYVKDELRSIDKIKKGKSRLIEASSLNDSVYLRM
AFGHLYETFHANPGTVTGSAVGCNPDVFWSKLPILLPGSLFAFDYSGYDASLSPVWFRALELVLREIGYSEEAVSLIEGI
NHTHHVYRNKTYCVLGGMPSGCSGTSIFNTMINNIIIRALLIKTFKGIDLDELNMVAYGDDVLASYPFPIDCLELAKTGK
EYGLTMTPADKSPCFNEVNWENATFLKRGFLPDEQFPFLIHPTMPMKEIHESIRWTKDARNTQDHVRSLCLLAWHNGKQE
YEKFVSSIRSVPIGKALAIPNYENLRRNWLELF
>Q66478 ~~~~~~Genome polyprotein~~~
MGSQVSTQRSGSHENSNSATEGSTINYTTINYYKDSYAATAGKQSLKQDPDKFANPVKDIFTEMAAPLKSPSAEACGYSD
RVAQLTIGNSTITTQEAANIIVGYGEWPSYCSDNDATAVDKPTRPDVSVNRFYTLDTKLWEKSSKGWYWKFPDVLTETGV
FGPNAQFHYLYRSGFCIHVQCNASKFHQGALLVAVLPEYVIGTVAGGTGTENSHPPYKQTQPGADGFELQHPYVLDAGIP
ISQLTVCPHQWINLRTNNCATIIVPYMNTLPFDSALNHCNFGLLVVPISPLDFDQGATPVIPITITLAPMCSEFAGLRQA
VTQGFPTELKPGTNQFLTTDDGVSAPILPNFHPTPCIHIPGEVRNLLELCQVETILEVNNVPTNATSLMERLRFPVSAQA
GKGELCAVFRADPGRDGPWQSTMLGQLCGYYTQWSGSLEVTFMFTGSFMATGKMLIAYTPPGGPLPKDRATAMLGTHVIW
DFGLQSSVTLVIPWISNTHYRAHARDGVFDYYTTGLVSIWYQTNYVVPIGAPNTAYIIALAAAQKNFTMKLCKDTSDILE
TATIQGDRVADVIESSIGDSVSKALTPALPAPTGPDTQVSSHRLDTGKVPALQAAEIGASSNASDESMIETRCVLNSHST
AETTLDSFFSRAGLVGEIDLPLKGTTNPNGYANWDIDITGYAQMRRKVELFTYMRFDAEFTFVACTPTGRVVPQLLQYMF
VPPGAPKPDSRDSLAWPTATNPSVFVKSSDPPAQVSVPFMSPASAYQWFYDGYPTFGEHKQEKDLEYGACPNNMMGTFSV
RTVGSSKSEYSLVIRIYMRMKHVRAWIPRPMRNQNYLFKSNPNYAGDSIKPTGTSRTAITTLGKFGQQSGAIYVGNFRVV
NRHLATHTDWANLVWEDSSRDLLVSSTTAQGCDTIARCNCQTGVYYCNSRRKHYPVSFSKPSLVFVEASEYYPARYQSHL
MLAEGHSEPGDCGGILRCQHGVVGIVSTGGSGLVGFADVRDLLWLDEEAMEQGVSDYIKGLGRAFGTGFTDAVSREVEAL
KNHLIGSEGAVEKILKNLVKLISALVIVIRSDYDMVTLTATLALIGCHGSPWAWIKSKTASILGIPMAQKQSASWLKKFN
DMANAAKGLEWIFNKISKFIDWLKEKIIPAAKEKVEFLNNLKQLPLLENQVSNLEQSAASQEDLEAMFGNVIYLAHFCRK
FQPLYATEAKRVYALEKRMNNYMQFKSKHRIEPVCLIIRGSPGTGKSLATGIIARAIADKYRSSVYSLPPDPDHFDGYKQ
QVVAVMDDLCQNPDGKDMSLFCQMVSTVDFVPPMASLEEKGVSFTSKFVIASTNASNIIVPTVSDSDAIRRRFYMDCDIE
VTDSYKTDLGRLDAGRAAKLCTENNTANFKRCSPLVCGKAIQLRDRKSKVRYSVDTVVSELIREYNNRSAIGNTIEALFQ
GPPKFRPIRISLEEKPAPDAISDLLASVDSEEVRQYCREQGWIIPETPTNVERHLNRAVLVMQSIATVVAVVSLVYVIYK
LFAGFQGAYSGAPKPILKKPVLRTATVQGPSLDFALSLLRRNIRQAQTDQGHFTMLGVRDRLAILPRHSQPGKTIWVEHK
LINVLDAVELVDEQGVNLELTLVTLDTNEKFRDITKCIPEVITGASDATLVINTEHIPSMFVPVGDVVQYGFLNLSGKPT
HRTMMYNFPTKPGQCGGVVTSVGKIIGIHIGGNGRQAFCAGLKRSYFASEQGEIQWMKPNRETGRLNINGPTRTKLEPSV
FHDVFEGNKEPAVLTSKDPRLEVDFEQALFSKYVGNTLHEPDEYVTQAALHYANQLKQLDINTSKMSMEEACYGTEYLEA
IDLHTSAGYPYSALGIKKRDILDPVTRDTSRMKLYMDKYGLDLPNSTYVKDELSSLDKIRKGESRLIEASSLNDPVYPRL
TFGHLYEVFHANPGTVTGSAVGCNPDVFWSKLPILLPGSLFAFDYSGYDASLSPVWFRALELVLREIGYSEEAVSLIEGI
NHTHHVYRNKTYCVLGGMPSGCSGTSIFNSMINNIIIRTLLIKTFKGIDLDELKMVAYGDDVLASYPFPIDCLEWGKTGK
EYGLTMTPADKSPCFNEVTWENATFLKRGFLPDHQFPFLIHPTMPMREIHESIRWTKDARNTQDHVRSLCLLAWHNGKEE
YEKFVSTIRSVPIGRALAIPNLENLRRNWLELF
>Q66479 ~~~~~~Genome polyprotein~~~
MGSQVSTQRSGSHENSNSATEGSTINYTTINYYKDSYAATAGKQSLKQDPDKFANPVKDIFTEMAAPLKSPSAEACGYSD
RVAQLTIGNSTITTQEAANIIVGYGEWPSYCSDDDATAVDKPTRPDVSVNRFYTLDTKLWEKSSKGWYWKFPDVLTETGV
FGQNAQFHYLYRSGFCIHVQCNASKFHQGALLVAILPEYVIGTVAGGTGTEDSHPPYKQTQPGADGFELQHPYVLDAGIP
ISQLTVCPHQWINLRTNNCATIIVPYMNTLPFDSALNHCNFGLLVVPISPLDFDQGATPVIPITITLAPMCSEFGGLRQA
VTQGFPTELKPGTNQFLTTDDGVSAPILPNFHPTPCIHIPGEVRNLLELCQVETILEVNNVPTNATSLMERLRFPVSAQA
GKGELCAVFRADPGRDGPWQSTMLGQLCGYYTQWSGSLEVTFMFTGSFMATGKMLIAYTPPGGPLPKDRATAMLGTHVIW
DFGLQSSVTLVIPWISNTHYRAHARDGVFDYYTTGLVSIWYQTNYVVPIGAPNTAYILALAAAQKNFTMKLCKDTSHILQ
TASIQGDRVADVIESSIGDSVSRALTQALPAPTGQNTQVSSHRLDTGEVPALQAAEIGASSNTSDESMIETRCVLNSHST
AETTLDSFFSRAGLVGEIDLPLEGTTNPNGYANWDIDITGYAQMRRKVELFTYMRFDAEFTFVACTPTGEVVPQLLQYMF
VPPGAPKPESRESLAWQTATNPSVFVKLTDPPAQVSVPFMSPASAYQWFYDGYPTFGEHKQEKDLEYGACPNNMMGTFSV
RTVGSSKSKYPLVVRIYMRMKHVRAWIPRPMRNQNYLFKANPNYAGNSIKPTGTSRNAITTLGKFGQQSGAIYVGNFRVV
NRHLATHNDWANLVWEDSSRDLLVSSTTAQGCDTIARCNCQTGVYYCNSKRKHYPVSFSKPSLIYVEASEYYPARYQSHL
MLAAGHSESGDCGGILRCQHGVVGIASTGGNGLVGFADVRDLLWLDEEAMEQGVSDYIKGLGDAFGTGFTDAVSREVEAL
RNHLIGSDGAVEKILKNLIKLISALVIVIRSDYDMVTLTATLALIGCHGSPWAWIKAKTASILGIPIAQKQSASWLKKFN
DMASAAKGLEWISNKISKFIDWLREKIVPAAKEKAEFLTNLKQFPLLENQITHLEQSAASQEDLEAMFGNVSYLAHFCRK
FQPLYATEAKRVYVLEKRMNNYMQFKSTHRIEPVCLIIRGSPGTGKSLATGIIARAIADKYHSSVYSLPPDPDHFDGYKQ
QVVTVMDDLCQNPDGKDMSLFYQMVSTVDIIPPMASLEEKGVSFTSKFVIASTNASNIIVPTVSDSDAIRRRFYMDCDIE
VTDSSKTDLGRLDAGRAAKLCSENNTANFKRCSPLVCGKAIQLRDRKSKVRYSVDTVVSELIREYNSRSAIGNTIEALFQ
GPPKFRPIRISLEEKPAPDAISDLLASVDSEEVRQYCREQGWIIPETPTNVERHLNRAVLVMQSIATVVAVVSLVYVIYK
LFAGFQGAYSGAPNQVLKKPVLRTATVQGPSLDFALSLLRRNIRQVQTDQGHFTMLGVRDRLAVLPRHSQPGKTIWVEHK
LVNILDAAELVDEQGVNLELTLVTLDTNEKFRDITKFIPETISGASDATLVINTEHMPSMFVPVGDVVQYGFLNLSGKPT
HRTMMYNFPTKAGQCGGVVTSVGKIIGIHIGGNGRQGFCAGLKRSYFASEQGEIQWVKSNKETGRLNINGPTRTKLEPSV
FHDVFEANKEPAVLTSKDPRLEVDFEQALFSKYVGNVLHEPDEYVHQAALHYANQLKQLDINTKKMSMEEACYGTDNLEA
IDLHTSAGYPYSALGIKKRDILDPATRDVSKMKSYMDKYGLDLPYSTYVKDELRSLDKIKKGKSRLIEASSLNDSVYLRM
TFGHLYEVFHANPGTVTGSAVGCNPDVFWSKLPILLPGSLFAFDYSGYDASLSPVWFRALEVVLREIGYSEEAVSLIEGI
NHTHHIYRNKTYCVLGGMPSGCSGTSIFNSMINNIIIRTLLIKTFKGIDLDELNMVAYGDDVLASYPFPIDCLELAKTGK
EYGLTMTPAGKSPCFNEVTWENATFLKRGFLPDHQFPFLIHPTMPMKEIHESIRWTKDARNTQDHVRSLCLLAWHNGKDE
YEKFVSTIRSVPVGKALAIPNFENLRRNWLELF
>Q68T42 ~~~~~~Genome polyprotein~~~
MGAQVTRQQTGTHENANIATNGSHITYNQINFYKDSYAASASKQDFSQDPSKFTEPVVEGLKAGAPVLKSPSAEACGYSD
RVLQLKLGNSAIVTQEAANYCCAYGEWPNYLPDHEAVAIDKPTQPETSTDRFYTLRSVKWESNSTGWWWKLPDALNNIGM
FGQNVQYHYLYRSGFLIHVQCNATKFHQGALLVVAIPEHQRGAHDTTTSPGFNDIMKGERGGTFNHPYVLDDGTSIACAT
IFPHQWINLRTNNSATIVLPWMNVAPMDFPLRHNQWTLAVIPVVPLGTRTMSSVVPITVSIAPMCCEFNGLRHAITQGVP
TYLLPGSGQFLTTDDHSSAPVLPCFNPTPEMHIPGQIRNMLEMIQVESMMEINNTDGANGMERLRVDISVQADLDQLLFN
IPLDIQLDGPLRNTLVGNISRYYTHWSGSLEMTFMFCGSFMATGKLILCYTPPGGSCPTTRETAMLGTHIVWDFGLQSSI
TLIIPWISGSHYRMFNSDAKSTNANVGYVTCFMQTNLIVPSESSDTCSLIGFIAAKDDFSLRLMRDSPDIGQSNHLHGAE
AAYQVESIIKTATDTVKSEINAELGVVPSLNAVETGATSNTEPEEAIQTRTVINQHGVSETLVENFLGRAALVSKKSFEY
KNHASSSAGTHKNFFKWTINTKSFVQLRRKLELFTYLRFDAEITILTTVAVNGNNDSTYMGLPDLTLQAMFVPTGALTPK
EQDSFHWQSGSNASVFFKISDPPARMTIPFMCINSAYSVFYDGFAGFEKNGLYGINPADTIGNLCVRIVNEHQPVGFTVT
VRVYMKPKHIKAWAPRPPRTMPYMSIANANYKGRDTAPNTLNAIIGNRASVTTMPHNIVTTGPGFGGVFVGSFKIINYHL
ATIEERQSAIYVDWQSDVLVTPIAAHGRHQIARCKCNTGVYYCRHRDRSYPICFEGPGIQWIEQNEYYPARYQTNVLLAA
GPAEAGDCGGLLVCPHGVIGLLTAGGGGIVAFTDIRNLLWLDTDVMEQGITDYIQNLGNAFGAGFTETISNKAKEVQDML
IGESSLLEKLLKALIKIISALVIVIRNSEDLITVTATLALLGCHDSPWSYLKQKVCSYLGIPYVPRQSESWLKKFTEACN
ALRGLDWLSQKIDKFINWLKTKILPEAREKYEFVQRLKQLPVIEKQVSTIEHSCPTTERQQALFNNVQYYSHYCRKYAPL
YAVESKRVAALEKKINNYIQFKSKSRIEPVCLIIHGSPGTGKSVASNLIARAITEKLGGDIYSLPPDPKYFDGYKQQTVV
LMDDLMQNPDGNDISMFCQMVSTVDFIPPMASLEEKGTLYTSPFLIATTNAGSIHAPTVSDSKALSRRFKFDVDIEVTDS
YKDSNKLDMSRAVEMCKPDNCTPTNYKRCCPLICGKAIQFRDRRTNARSTVDMLVTDIIKEYRTRNSTQDKLEALFQGPP
QFKEIKISVAPDTPAPDAINDLLRSVDSQEVRDYCQKKGWIVIHPSNELVVEKHISRAFITLQAIATFVSIAGVVYVIYK
LFAGIQGPYTGIPNPKPKVPSLRTAKVQGPGFDFAQAIMKKNTVIARTEKGEFTMLGVYDRVAVIPTHASVGEIIYINDV
ETRVLDACALRDLTDTNLEITIVKLDRNQKFRDIRHFLPRCEDDYNDAVLSVHTSKFPNMYIPVGQVTNYGFLNLGGTPT
HRILMYNFPTRAGQCGGVVTTTGKVIGIHVGGNGAQGFAAMLLHSYFTDTQGEIVSNEKSGMCINAPAKTKLQPSVFHQV
FEGSKEPAVLNSKDPRLKTDFEEAIFSKYTGNKIMLMDEYMEEAVDHYVGCLEPLDISVDPIPLENAMYGMEGLEALDLT
TSAGFPYLLQGKKKRDIFNRQTRDTSEMTKMLEKYGVDLPFVTFVKDELRSREKVEKGKSRLIEASSLNDSVAMRVAFGN
LYATFHNNPGTATGSAVGCDPDIFWSKIPILLDGEIFAFDYTGYDASLSPVWFACLKKVLIKLGYTHQTSFIDYLCHSVH
LYKDRKYVINGGMPSGSSGTSIFNTMINNIIIRTLLIKVYKGIDLDQFKMIAYGDDVIASYPHKIDPGLLAEAGKHYGLV
MTPADKGTSFIDTNWENVTFLKRYFRADDQYPFLIHPVMPMKEIHESIRWTKDPRNTQDHVRSLCYLAWHNGEEAYNEFC
RKIRSVPVGRALTLPAYSSLRRKWLDSF
>C6KEF6 ~~~~~~Genome polyprotein~~~
MMEGSNGFSSSLAGLSSSRSSLRLLTHLLSLPPPNRDARRHSGWYRSPPTLPVNVYLNEQFDNLCLAALRYPGCKLYPSV
YTLFPDVSPFKIPQSIPAFAHLVQRQGLRRQGNPTTNIYGNGNEVTTDVGANGMSLPIAVGDMPTASSSEAPLGSNKGGS
STSPKSTSNGNVVRGSRYSKWWEPAAARALDRALDHAVDATDAVAGAASKGIKAGATKLSNKLAGSQTTALLALPGNIAG
GAPSATVNANNTSISSQALLPSVNPYPSTPAVSLPNPDAPTQVGPAADRQWLVDTIPWSETTPPLTVFSGPKALTPGTYP
PTIEPNTGVYPLPAALCVSHPESVFTTAYNAHAYFNCGFDVTVVVNASQFHGGSLIVLAMAEGLGDITPADSSTWFNFPH
AIINLANSNSATLKLPYIGVTPNTSTEGLHNYWTILFAPLTPLAVPTGSPTSVKVSLFVSPIDSAFYGLRFPIPFPTPQH
WKTRAVPGAGSYGSVVAGQEIPLVGYAPAAPPRDYLPGRVRNWLEYAARHSWERNLPWTAADEVGDQLVSYPIQPETLAN
TQTNTAFVLSLFSQWRGSLQISLIFTGPAQCYGRLLLAYTPPSANPPTTIEEANNGTYDVWDVNGDSTYTFTIPFCSQAY
WKTVDIGTSSGLVSNNGYFTIFVMNPLVTPGPSPPSATVAAFLHVADDFDVRLPQCPALGFQSGADGAEVQPAPTSDLSD
GNPTTDPAPRDNFDYPHHPVDPSTDLAFYFSQYRWFGLNEDLTPLNVTGGLFYHVSLNPVNFQQNSLLSVLGAFTYVYAN
LSLNINVSAPLQACTFYIFYAPPGASVPSTQTLAELSFFTHTATPLNLAAPTNITVSIPYASPQSVLCTSFGGFGLQNGG
DPGNLHSNTWGTLILYVDLPQSDSVSVSAYISFRDFEAYVPRQTPGVGPIPTSTSIVRVARPTPKPRTVRRQGGTLADLI
LTPESRCFIVAHTTAPYYSILLVNPDEEYAISMFTHGDESILRYSSRGGTRLAPTAPAFFLCAAASVDTILPYPISQSHL
WLSDLTGIPLRAVPPLTLFLSAGAALCAGAQTLIAVAQGGSAPDTPPTPNRALFRRQGLGDLPDAAKGLSAALENVAKVA
GDADIATSSQAIASSINSLSNSIDGATTFMQNFFSGLAPKNPTSPLQHLFAKLIKWVTKIIGSLIIICNNPTPSALIGVS
LMLCGDLAEDITEFFSNLGNPLAAVFYRCARALGLSPTPQSAAQAAGGRQGVRDYNDIMSALRNTDWFFEKIMSHIKNLL
EWLGVLVKDDPRTKLNSQHEKILELYTDSVTASSTPPSELSADAIRSNLDLAKQLLTLSHAANSVTHIQLCTRAITNYST
ALSAISLVGTPGTRPEPLVVYLYGPPGTGKSLLASLLASTLAQALSGDPNNYYSPSSPDCKFYDGYSGQPVHYIDDIGQD
PDGADWADFVNIVSSAPFIVPMADVNDKGRFYTSRVVIVTSNFPGPNPRSARCVAALERRLHIRLNVTARDGAAFSAAAA
LKPSEPLAATRYCKFSNPLTQFSMFNLAVDYKSIVLPNTPLSCFDELIDFILGSLRDRASVNSLLSGMVRTDVARQGGNA
DAPAPSAAPLPSVLPSVPSQDPFVRAVNENRPVSFLSKIWSWRAPIFAASSFLSLIAATLTIVRCLRDLRSTQGAYSGTP
VPKPRKKDLPKQPVYSGPVRRQGFDPAVMKIMGNVDSFVTLSGSKPIWTMSCLWIGGRNLIAPSHAFVSDDYEITHIRVG
SRTLDVSRVTRVDDGELSLISVPDGPEHKSLIRYIRSASPKSGILASKFSDTPVFVSFWNGKPHSTPLPGVVDEKDSFTY
RCSSFQGLCGSPMIATDPGGLGILGIHVAGVAGYNGFSARLTPERVQAFLSNLATPQSVLHFHPPMGPPAHVSRRSRLHP
SPAFGAFPITKEPAALSRKDPRLPEGTDLDAITLAKHDKGDIATPWPCMEEAADWYFSQLPDSLPVLSQEDAIRGLDHMD
AIDLSQSPGYPWTTQGRSRRSLFDEDGNPVPELQKAIDSVWDGGSYIYQSFLKDELRPTAKARAGKTRIVEAAPIQAIVV
GRRLLGSLINHLQGNPLQYGSAVGCNPDIHWTQIFHSLTPFSNVWSIDYSCFDATIPSVLLSAIASRIASRSDQPGRVLD
YLSYTTTSYHVYDSLWYTMVGGNPSGCVGTSILNTIANNIAIISAMMYCNKFDPRDPPVLYCYGDDLIWGSNQDFHPREL
QAFYQKFTNFVVTPADKASDFPDSSSIYDITFLKRYFVPDDIHPHLIHPVMDEATLTNSIMWLRGGEFEEVLRSLETLAF
HSGPNNYSTWCEKIKAKIRENGCDATFTPYSVLQRGWVSTCMTGPYPLTG
>Q66578 ~~~~~~Genome polyprotein~~~
METIKSIADMATGVVSSVDSTINAVNEKVESVGNEIGGNLLTKVADDASNILGPNCFATTAEPENKNVVQATTTVNTTNL
TQHPSAPTMPFSPDFSNVDNFHSMAYDITTGDKNPSKLVRLETHEWTPSWARGYQITHVELPKVFWDHQDKPAYGQSRYF
AAVRCGFHFQVQVNVNQGTAGSALVVYEPKPVVTYDSKLEFGAFTNLPHVLMNLAETTQADLCIPYVADTNYVKTDSSDL
GQLKVYVWTPLSIPTGSANQVDVTILGSLLQLDFQNPRVFAQDVNIYDNAPNGKKKNWKKIMTMSTKYKWTRTKIDIAEG
PGSMNMANVLCTTGAQSVALVGERAFYDPRTAGSKSRFDDLVKIAQLFSVMADSTTPSENHGVDAKGYFKWSATTAPQSI
VHRNIVYLRLFPNLNVFVNSYSYFRGSLVLRLSVYASTFNRGRLRMGFFPNATTDSTSTLDNAIYTICDIGSDNSFEITI
PYSFSTWMRKTNGHPIGLFQIEVLNRLTYNSSSPSEVYCIVQGKMGQDARFFCPTGSVVTFQNSWGSQMDLTDPLCIEDD
TENCKQTMSPNELGLTSAQDDGPLGQEKPNYFLNFRSMNVDIFTVSHTKVDNLFGRAWFFMEHTFTNEGQWRVPLEFPKQ
GHGSLSLLFAYFTGELNIHVLFLSERGFLRVAHTYDTSNDRVNFLSSNGVITVPAGEQMTLSAPYYSNKPLRTVRDNNSL
GYLMCKPFLTGTSTGKIEVYLSLRCPNFFFPLPAPKVTSSRALRGDMANLTNQSPYGQQPQNRMMKLAYLDRGFYKHYGI
IVGDHVYQLDSDDIFKTALTGKAKFTKTKLTSDWVIEEECELDYFRIKYLESAVDSEHIFSVDKNCETIAKDIFGTHTLS
QHQAIGLVGTILLTAGLMSTIKTPVNAVTIKEFFNHAIDGDEQGLSLLVQKCTTFFSSAATEILDNDLVKFIVKILVRIL
CYMVLYCHKPNILTTACLSTLLIMDVTSSSVLSPSCKALMQCLMDGDVKKLAEIVAESMSNTDDDEVKEQICDTVKYTKT
ILSNQGPFKGFNEVSTAFRHIDWWIHTLLKIKDMVLSVFKPSIESKAIQWLERNKEHVCSILDYASDIIVESKDQSKMKT
QDFYQRYSDCLAKFKPIMAICFRSCHNSISNTVYRLFQELARIPNRISTNNDLIRIEPIGIWIQGEPGQGKSFLTHTLSR
QLQKSCKLNGVFTNPTASEFMDGYDNQDIHLIDDLGQTRKEKDIEMLCNCISSVPFIVPMAHLEEKGKFYTSKLVVATTN
KSDFSSTVLQDSGALKRRFPYIMHIRAAKAYSKAGKLNVSQAMATMSTGECWEVSKNGRDWETLKLKDLVDKITIDYNER
VKNYNAWKQQLENQTLDDLDDAVSYIKHNFPDAIPYVDEYLNIEMSTLIEQMEAFIEPKPSVFKCFANKIGSKISKASRE
VVDWFSDKIKSMLSFVERNKAWLTVVSAVTSAISILLLVTKIFKKEESKDERAYNPTLPVAKPKGTFPVSQREFKNEAPY
DGQLEHIISQMAYITGSTTGHMTHCAGYQHDEIILHGHSIKYLEQEDELTLHYKNKVFPIEQPSVTQVTLGGKPMDLAIL
KCKLPFRFKKNSKYYTNKIGTESMLIWMTEQGIITKEVQRVHHSGGIKTREGTESTKTISYTVKSCKGMCGGLLISKVEG
NFKILGMHIAGNGEMGVAIPFNFLKNDMSDQGIVTEITPIQPMYINTKTQIHKSPVYGAVEVKMGPAVLSKSDTRLEEPV
ECLIKKSASKYRVNKFQVNNELWQGVKACVKSKFREIFGMNGIVDMKTAILGTSHVNSMDLSTSAGYSFVKSGYKKKDLI
CLEPFSVAPLLERLVQDKFHNLLKGNQITTTFNTCLKDELRKLDKIASGKTRCIEACEVDYCIVYRMIMMEIYDKIYQTP
CYYSGLAVGINPYKDWHFMINALNDYNYEMDYSQYDGSLSSMLLWEAVEVLAYCHDSPDLVMQLHKPVIDSDHVVFNERW
LIHGGMPSGSPCTTVLNSLCNLMMCIYTTNLISPGIDCLPIVYGDDVILSLDKEIEPEKLQSIMADSFGAEVTGSRKDEP
PSLKPRMEVEFLKRKPGYFPESTFIVGKLDTENMIQHLMWMKNFSTFKQQLQSYLMELCLHGKDTYQHYIKILEPYLQEW
NITVDDYDVVITKLMPMVFD
>P03303 ~~~~~~Genome polyprotein~~~
MGAQVSTQKSGSHENQNILTNGSNQTFTVINYYKDAASTSSAGQSLSMDPSKFTEPVKDLMLKGAPALNSPNVEACGYSD
RVQQITLGNSTITTQEAANAVVCYAEWPEYLPDVDASDVNKTSKPDTSVCRFYTLDSKTWTTGSKGWCWKLPDALKDMGV
FGQNMFFHSLGRSGYTVHVQCNATKFHSGCLLVVVIPEHQLASHEGGNVSVKYTFTHPGERGIDLSSANEVGGPVKDVIY
NMNGTLLGNLLIFPHQFINLRTNNTATIVIPYINSVPIDSMTRHNNVSLMVIPIAPLTVPTGATPSLPITVTIAPMCTEF
SGIRSKSIVPQGLPTTTLPGSGQFLTTDDRQSPSALPNYEPTPRIHIPGKVHNLLEIIQVDTLIPMNNTHTKDEVNSYLI
PLNANRQNEQVFGTNLFIGDGVFKTTLLGEIVQYYTHWSGSLRFSLMYTGPALSSAKLILAYTPPGARGPQDRREAMLGT
HVVWDIGLQSTIVMTIPWTSGVQFRYTDPDTYTSAGFLSCWYQTSLILPPETTGQVYLLSFISACPDFKLRLMKDTQTIS
QTVALTEGLGDELEEVIVEKTKQTVASISSGPKHTQKVPILTANETGATMPVLPSDSIETRTTYMHFNGSETDVECFLGR
AACVHVTEIQNKDATGIDNHREAKLFNDWKINLSSLVQLRKKLELFTYVRFDSEYTILATASQPDSANYSSNLVVQAMYV
PPGAPNPKEWDDYTWQSASNPSVFFKVGDTSRFSVPYVGLASAYNCFYDGYSHDDAETQYGITVLNHMGSMAFRIVNEHD
EHKTLVKIRVYHRAKHVEAWIPRAPRALPYTSIGRTNYPKNTEPVIKKRKGDIKSYGLGPRYGGIYTSNVKIMNYHLMTP
EDHHNLIAPYPNRDLAIVSTGGHGAETIPHCNCTSGVYYSTYYRKYYPIICEKPTNIWIEGNPYYPSRFQAGVMKGVGPA
EPGDCGGILRCIHGPIGLLTAGGSGYVCFADIRQLECIAEEQGLSDYITGLGRAFGVGFTDQISTKVTELQEVAKDFLTT
KVLSKVVKMVSALVIICRNHDDLVTVTATLALLGCDGSPWRFLKMYISKHFQVPYIERQANDGWFRKFNDACNAAKGLEW
IANKISKLIEWIKNKVLPQAKEKLEFCSKLKQLDILERQITTMHISNPTQEKREQLFNNVLWLEQMSQKFAPLYAVESKR
IRELKNKMVNYMQFKSKQRIEPVCVLIHGTPGSGKSLTTSIVGRAIAEHFNSAVYSLPPDPKHFDGYQQQEVVIMDDLNQ
NPDGQDISMFCQMVSSVDFLPPMASLDNKGMLFTSNFVLASTNSNTLSPPTILNPEALVRRFGFDLDICLHTTYTKNGKL
NAGMSTKTCKDCHQPSNFKKCCPLVCGKAISLVDRTTNIRYSVDQLVTAIISDFKSKMQITDSLETLFQGPVYKDLEIDV
CNTPPPECINDLLKSVDSEEIREYCKKKKWIIPEIPTNIERAMNQASMIINTILMFVSTLGIVYVIYKLFAQTQGPYSGN
PPHNKLKAPTLRPVVVQGPNTEFALSLLRKNIMTITTSKGEFTGLGIHDRVCVIPTHAQPGDDVLVNGQKIRVKDKYKLV
DPENINLELTVLTLDRNEKFRDIRGFISEDLEGVDATLVVHSNNFTNTILEVGPVTMAGLINLSSTPTNRMIRYDYATKT
GQCGGVLCATGKIFGIHVGGNGRQGFSAQLKKQYFVEKQGQVIARHKVREFNINPVNTPTKSKLHPSVFYDVFPGDKEPA
VLSDNDPRLEVKLTESLFSKYKGNVNTEPTENMLVAVDHYAGQLLSLDIPTSELTLKEALYGVDGLEPIDITTSAGFPYV
SLGIKKRDILNKETQDTEKMKFYLDKYGIDLPLVTYIKDELRSVDKVRLGKSRLIEASSLNDSVNMRMKLGNLYKAFHQN
PGVLTGSAVGCDPDVFWSVIPCLMDGHLMAFDYSNFDASLSPVWFVCLEKVLTKLGFAGSSLIQSICNTHHIFRDEIYVV
EGGMPSGCSGTSIFNSMINNIIIRTLILDAYKGIDLDKLKILAYGDDLIVSYPYELDPQVLATLGKNYGLTITPPDKSET
FTKMTWENLTFLKRYFKPDQQFPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCMLAWHSGEKEYNEFIQKIRTTDIG
KCLILPEYSVLRRRWLDLF
>Q82122 ~~~~~~Genome polyprotein~~~
MGAQVSRQNVGTHSTQNMVSNGSSLNYFNINYFKDAASSGASRLDFSQDPSKFTDPVKDVLEKGIPTLQSPSVEACGYSD
RIIQITRGDSTITSQDVANAVVGYGVWPHYLTPQDATAIDKPTQPDTSSNRFYTLDSKMWNSTSKGWWWKLPDALKDMGI
FGENMFYHFLGRSGYTVHVQCNASKFHQGTLLVVMIPEHQLATVNKGNVNAGYKYTHPGEAGREVGTQVENEKQPSDDNW
LNFDGTLLGNLLIFPHQFINLRSNNSATLIVPYVNAVPMDSMVRHNNWSLVIIPVCQLQSNNISNIVPITVSISPMCAEF
SGARAKTVVQGLPVYVTPGSGQFMTTDDMQSPCALPWYHPTKEIFIPGEVKNLIEMCQVDTLIPINSTQSNIGNVSMYTV
TLSPQTKLAEEIFAIKVDIASHPLATTLIGEIASYFTHWTGSLRFSFMFCGTANTTLKVLLAYTPPGIGKPRSRKEAMLG
THVVWDVGLQSTVSLVVPWISASQYRFTTPDTYSSAGYITCWYQTNFVVPPNTPNTAEMLCFVSGCKDFCLRMARDTDLH
KQTGPITQNPVERYVDEVLNEVLVVPNINQSHPTTSNAAPVLDAAETGHTNKIQPEDTIETRYVQSSQTLDEMSVESFLG
RSGCIHESVLDIVDNYNDQSFTKWNINLQEMAQIRRKFEMFTYARFDSEITMVPSVAAKDGHIGHIVMQYMYVPPGAPIP
TTRDDYAWQSGTNASVFWQHGQPFPRFSLPFLSIASAYYMFYDGYDGDTYKSRYGTVVTNDMGTLCSRIVTSEQLHKVKV
VTRIYHKAKHTKAWCPRPPRAVQYSHTHTTNYKLSSEVHNDVAIRPRTNLTTVGPSDMYVHVGNLIYRNLHLFNSDIHDS
ILVSYSSDLIIYRTSTQGDGYIPTCNCTEATYYCKHKNRYYPINVTPHDWYEIQESEYYPKHIQYNLLIGEGPCEPGDCG
GKLLCKHGVIGIITAGGEGHVAFIDLRHFHCAEEQGITDYIHMLGEAFGSGFVDSVKDQINSINPINNISSKMVKWMLRI
ISAMVIIIRNSSDPQTIIATLTLIGCNGSPWRFLKEKFCKWTQLTYIHKESDSWLKKFTEMCNAARGLEWIGNKISKFID
WMKSMLPQAQLKVKYLSELKKLNFLEKQVENLRAADTNTQEKIKCEIDTLHDLSCKFLPLYASEAKRIKVLYHKCTNIIK
QKKRSEPVAVMIHGPPGTGKSITTSFLARMITNESDIYSLPPDPKYFDGYDNQSVVIMDDIMQNPGGEDMTLFCQMVSSV
TFIPPMADLPDKGKPFDSRFVLCSTNHSLLAPPTISSLPAMNRRFYLDLDILVHDNYKDNQGKLDVSRAFRLCDVDSKIG
NAKCCPFVCGKAVTFKDRNTCRTYSLSQIYNQILEEDKRRRQVVDVMSAIFQGPISMDKPPPPAITDLLRSVRTPEVIKY
CQDNKWIVPADCQIERDLNIANSIITIIANIISIAGIIYIIYKLFCSLQGPYSGEPKPKTKVPERRVVAQGPEEEFGMSI
IKNNTCVVTTTNGKFTGLGIYDRILILPTHADPGSEIQVNGIHTKVLDSYDLFNKEGVKLEITVLKLDRNEKFRDIRKYI
PESEDDYPECNLALVANQTEPTIIKVGDVVSYGNILLSGTQTARMLKYNYPTKSGYCGGVLYKIGQILGIHVGGNGRDGF
SSMLLRSYFTEQQGQIQISKHVKDVGLPSIHTPTKTKLQPSVFYDIFPGSKEPAVLTEKDPRLKVDFDSALFSKYKGNTE
CSLNEHIQVAVAHYSAQLATLDIDPQPIAMEDSVFGMDGLEALDLNTSAGYPYVTLGIKKKDLINNKTKDISKLKLALDK
YDVDLPMITFLKDELRKKDKIAAGKTRVIEASSINDTILFRTVYGNLFSKFHLNPGVVTGCAVGCDPETFWSKIPLMLDG
DCIMAFDYTNYDGSIHPIWFKALGMVLDNLSFNPTLINRLCNSKHIFKSTYYEVEGGVPSGCSGTSIFNSMINNIIIRTL
VLDAYKHIDLDKLKIIAYGDDVIFSYKYKLDMEAIAKEGQKYGLTITPADKSSEFKELDYGNVTFLKRGFRQDDKYKFLI
HPTFPVEEIYESIRWTKKPSQMQEHVLSLCHLMWHNGPEIYKDFETKIRSVSAGRALYIPPYELLRHEWYEKF
>P23008 ~~~~~~Genome polyprotein~~~
MGAQVSRQNVGTHSTQNSVSNGSSLNYFNINYFKDAASSGASRLDFSQDPSKFTDPVKDVLEKGIPTLQSPSVEACGYSD
RIMQITRGDSTITSQDVANAVVGYGVWPHYLTPQDATAIDKPTQPDTSSNRFYTLESKHWNGSSKGWWWKLPDALKDMGI
FGENMYYHFLGRSGYTVHVQCNASKFHQGTLLVAMIPEHQLASAKHGSVTAGYKLTHPGEAGRDVSQERDASLRQPSDDS
WLNFDGTLLGNLLIFPHQFINLRSNNSATLIVPYVNAVPMDSMLRHNNWSLVIIPISPLRSETTSSNIVPITVSISPMCA
EFSGARAKNIKQGLPVYITPGSGQFMTTDDMQSPCALPWYHPTKEISIPGEVKNLIEMCQVDTLIPVNNVGNNVGNVSMY
TVQLGNQTGMAQKVFSIKVDITSQPLATTLIGEIASYYTHWTGSLRFSFMFCGTANTTLKLLLAYTPPGIDEPTTRKDAM
LGTHVVWDVGLQSTISLVVPWVSASHFRLTADNKYSMAGYITCWYQTNLVVPPSTPQTADMLCFVSACKDFCLRMARDTD
LHIQSGPIEQNPVENYIDEVLNEVLVVPNIKESHHTTSNSAPLLDAAETGHTSNVQPEDAIETRYVITSQTRDEMSIESF
LGRSGCVHISRIKVDYTDYNGQDINFTKWKITLQEMAQIRRKFELFTYVRFDSEITLVPCIAGRGDDIGHIVMQYMYVPP
GAPIPSKRNDFSWQSGTNMSIFWQHGQPFPRFSLPFLSIASAYYMFYDGYDGDNTSSKYGSVVTNDMGTICSRIVTEKQK
HSVVITTHIYHKAKHTKAWCPRPPRAVPYTHSHVTNYMPETGDVTTAIVRRNTITTAGPSDLYVHVGNLIYRNLHLFNSE
MHDSILISYSSDLIIYRTNTIGDDYIPNCNCTEATYYCRHKNRYYPIKVTPHDWYEIQESEYYPKHIQYNLLIGEGPCEP
GDCGGKLLCRHGVIGIITAGGEGHVAFIDLRQFHCAEEQGITDYIHMLGEAFGNGFVDSVKEQINAINPINNISKKVIKW
LLRIISAMVIIIRNSSDPQTIIATLTLIGCNGSPWRFLKEKFCKWTQLTYIHKESDSWLKKFTEMCNAARGLEWIGNKIS
KFIDWMKSMLPQAQLKVKYLNEIKKLSLLEKQIENLRAADSATQEKIKCEIDTLHDLSCKFLPLYAHEAKRIKVLYNKCS
NIIKQRKRSEPVAVMIHGPPGTGKSITTNFLARMITNESDVYSLPPDPKYFDGYDNQSVVIMDDIMQNPDGEDMTLFCQM
VSSVTFIPPMADLPDKGKPFDSRFILCSTNHSLLAPPTISSLPAMNRRFFFDLDIVVHDNYKDTQGKLDVSKAFRPCNVN
TKIGNAKCCPFVCGKAVXFKDRSTCSTYTLAQVYNHILEEDKRRRQVVDVMSAIFQGPISLDXPPPPAIXDLLQSVRTPE
VIKYCQDNKWVIPAECQVERDLNIANSIIAIIANIISIAGIIFVIYKLFCSLQGPYSGEPKPKTKVPERRVVAQGPEEEF
GRSILKNNTCVITTGNGKFTGLGIHDRILIIPTHADPGREVQVNGVHTKVLDSYDLYNRDGVKLEITVIQLDRNEKFRDI
RKYIPETEDDYPECNLALSANQDEPTIIKVGDVVSYGNILLSGNQTARMLKYNYPTKSGYCGGVLYKIGQILGIHVGGNG
RDGFSAMLLRSYFTDTQGQIKVNKHATECGLPTIHTPSKTKLQPSVFYDVFPGSKEPAVLTDNDPRLEVNFKEALFSKYK
GNVECNLNEHMEIAIAHYSAQLMTLDIDSRPIALEDSVFGIEGLEALDLNTSAGFPYVTMGIKKRDLINNKTKDISRLKE
ALDKYGVDLPMITFLKDELRKKEKISTGKTRVIEASSINDTILFRTTFGNLFSKFHLNPGVVTGSAVGCDPETFWSKIPV
MLDGDCIMAFDYTNYDGSIHPVWFQALKKVLENLSFQSNLIDRLCYSKHLFKSTYYEVAGGVPSGCSGTSIFNTMINNII
IRTLVLDAYKNIDLDKLKIIAYGDDVIFSYKYTLDMEAIANEGKKYGLTITPADKSNEFKKLDYSNVTFLKRGFKQDERH
TFLIHPTFPVEEIHESIRWTKKPSQMQEHVLSLCHLMWHNGRKVYEDFSSKIRSVSAGRALYIPPYDLLKHEWYEKF
>P12916 ~~~~~~Genome polyprotein~~~
MGAQVSRQNVGTHSTQNSVSNGSSLNYFNINYFKDAASSGASRLDFSQDPSKFTDPVKDVLEKGIPTLQSPSVEACGYSD
RIIQITRGDSTITSQDVANAVVGYGVWPHYLTPQDATAIDKPTQPDTSSNRFYTLESKHWNGDSKGWWWKLPDALKEMGI
FGENMYYHFLGRSGYTVHVQCNASKFHQGTLLVAMIPEHQLASAKNGSVTAGYNLTHPGEAGRVVGQQRDANLRQPSDDS
WLNFDGTLLGNLLIFPHQFINLRSNNSATLIVPYVNAVPMDSMLRHNNWSLVIIPISPLRSETTSSNIRPITVSISPMCA
EFSGARAKNVRQGLPVYITPGSGQFMTTDDMQSPCALPWYHPTKEISIPGEVKNLIEMCQVDTLIPVNNVGTNVGNISMY
TVQLGNQMDMAQEVFAIKVDITSQPLATTLIGEIASYYTHWTGSLRFSFMFCGTANTTLKLLLAYTPPGIDKPATRKDAM
LGTHVVWDVGLQSTISLVVPWVSASHFRLTANDKYSMAGYITCWYQTNLVVPPNTPQTADMLCFVSACKDFCLRMARDTD
LHIQSGPIEQNPVENYIDEVLNEVLVVPNIKESHHTTSNSAPLLDAAETGHTSNVQPEDAIETRYVMTSQTRDEMSIESF
LGRSGCVHISRIKVDYNDYNGVNKNFTTWKITLQEMAQIRRKFELFTYVRFDSEVTLVPCIAGRGDDIGHVVMQYMYVPP
GAPIPKTRNDFSWQSGTNMSIFWQHGQPFPRFSLPFLSIASAYYMFYDGYDGDNSSSKYGSIVTNDMGTICSRIVTEKQE
HPVVITTHIYHKAKHTKAWCPRPPRAVPYTHSRVTNYVPKTGDVTTAIVPRASMKTVGPSDLYVHVGNLIYRNLHLFNSE
MHDSILVSYSSDLIIYRTNTTGDDYIPSCNCTEATYYCKHKNRYYPIKVTPHDWYEIQESEYYPKHIQYNLLIGEGPCEP
GDCGGKLLCRHGVIGIITAGGEGHVAFTDLRQFQCAEEQGITDYIHMLGEAFGNGFVDSVKEQINAINPINSISKKVIKW
LLRIISAMVIIIRNSSDPQTIIATLTLIGCNGSPWRFLKEKFCKWTQLTYIHKESDSWLKKFTEMCNAARGLEWIGNKIS
KFIDWMKSMLPQAQLKVKYLNEIKKLSLLEKQIENLRAADNATQEKIKCEIDTLHDLSCKFLPLYAHEAKRIKVLYNKCS
NIIKQRKRSEPVAVMIHGPPGTGKSITTNFLARMITNESDVYSLPPDPKYFDGYDNQSVVIMDDIMQNPDGEDMTLFCQM
VSSVTFIPPMADLPDKGKPFDSRFVLCSTNHSLLAPPTISSLPAMNRRFFFDLDIVVHDNYKDAQGKLNVSKAFQPCNVN
TKIGNAKCCPFVCGKAVSFKDRSTCSTYTLAQVYNHILEEDKRRRQVVDVMSAIFQGPISLDAPPPPAIADLLQSVRTPE
VIKYCQDNKWIVPAECQIERDLSIANSIITIIANIISIAGIIFVIYKLFCTLQGPYSGEPKPKTKMPERRVVAQGPEEEF
GRSILKNNTCVITTDNGKFTGLGIYDRTLIIPTHADPGREVQVNGIHTKVLDSYDLYNRDGVKLEITVIQLDRNEKFRDI
RKYIPETEDDYPECNLALSANQVEPTIIKVGDVVSYGNILLSGNQTARMLKYNYPTKSGYCGGVLYKIGQILGIHVGGNG
RDGFSAMLLRSYFTDTQGQIKISKHANECGLPTIHTPSKTKLQPSVFYDVFPGSKEPAVSRDNDPRLKVNFKEALFSKYK
GNTECSLNQHMEIAIAHYSAQLITLDIDSKPIALEDSVFGIEGLEALDLNTSAGFPYVTMGIKKRDLINNKTKDISRLKE
ALDKYGVDLPMITFLKDELRKKEKISAGKTRVIEASSINDTILFRTTFGNLFSKFHLNPGVVTGSAVGCDPETFWSKIPV
MLDGDCIMAFDYTNYDGSIHPVWFQALKKVLENLSFQSNLIDRLCYSKHLFKSTYYEVAGGVPSGCSGTSIFNTMINNII
IRTLVLDAYKNIDLDKLKIIAYGDDVIFSYKYTLDMEAIANEGKKYGLTITPADKSTEFKKLDYNNVTFLKRGFKQDEKH
TFLIHPTFPVEEIYESIRWTKKPSQMQEHVLSLCHLMWHNGRKVYEDFSSKIRSVSAGRALYIPPYDLLKHEWYEKF
>P04936 ~~~~~~Genome polyprotein~~~
MGAQVSRQNVGTHSTQNSVSNGSSLNYFNINYFKDAASNGASKLEFTQDPSKFTDPVKDVLEKGIPTLQSPTVEACGYSD
RIIQITRGDSTITSQDVANAIVAYGVWPHYLSSKDASAIDKPSQPDTSSNRFYTLRSVTWSSSSKGWWWKLPDALKDMGI
FGENMFYHYLGRSGYTIHVQCNASKFHQGTLIVALIPEHQIASALHGNVNVGYNYTHPGETGREVKAETRLNPDLQPTEE
YWLNFDGTLLGNITIFPHQFINLRSNNSATIIAPYVNAVPMDSMRSHNNWSLVIIPICPLETSSAINTIPITISISPMCA
EFSGARAKRQGLPVFITPGSGQFLTTDDFQSPCALPWYHPTKEISIPGEVKNLVEICQVDSLVPINNTDTYINSENMYSV
VLQSSINAPDKIFSIRTDVASQPLATTLIGEISSYFTHWTGSLRFSFMFCGTANTTVKLLLAYTPPGIAEPTTRKDAMLG
THVIWDVGLQSTISMVVPWISASHYRNTSPGRSTSGYITCWYQTRLVIPPQTPPTARLLCFVSGCKDFCLRMARDTNLHL
QSGAIAQNPVENYIDEVLNEVLVVPNINSSNPTTSNSAPALDAAETGHTSSVQPEDVIETRYVQTSQTRDEMSLESFLGR
SGCIHESKLEVTLANYNKENFTVWAINLQEMAQIRRKFELFTYTRFDSEITLVPCISALSQDIGHITMQYMYVPPGAPVP
NSRDDYAWQSGTNASVFWQHGQAYPRFSLPFLSVASAYYMFYDGYDEQDQNYGTANTNNMGSLCSRIVTEKHIHKVHIMT
RIYHKAKHVKAWCPRPPRALEYTRAHRTNFKIEDRSIQTAIVTRPIITTAGPSDMYVHVGNLIYRNLHLFNSEMHESILV
SYSSDLIIYRTNTVGDDYIPSCDCTQATYYCKHKNRYFPITVTSHDWYEIQESEYYPKHIQYNLLIGEGPCEPGDCGGKL
LCKHGVIGIVTAGGDNHVAFIDLRHFHCAEEQGVTDYIHMLGEAFGNGFVDSVKEHIHAINPVGNISKKIIKWMLRIISA
MVIIIRNSSDPQTILATLTLIGCSGSPWRFLKEKFCKWTQLNYIHKESDSWLKKFTEACNAARGLEWIGNKISKFIEWMK
SMLPQAQLKVKYLNELKKLNLYEKQVESLRVADMKTQEKIKMEIDTLHDLSRKFLPLYASEAKRIKTLYIKCDNIIKQKK
RCEPVAIVIHGPPGAGKSITTNFLAKMITNDSDIYSLPPDPKYFDGYDQQSVVIMDDIMQNPAGDDMTLFCQMVSSVTFI
PPMADLPDKGKAFDSRFVLCSTNHSLLTPPTITSLPAMNRRFFLDLDIIVHDNFKDPQGKLNVAAAFRPCDVDNRIGNAR
CCPFVCGKAVSFKDRNSCNKYSLAQVYNIMIEEDRRRRQVVDVMTAIFQGPIDMKNPPPPAITDLLQSVRTPEVIKYCEG
NRWIIPAECKIEKELNLANTIITIIANVIGMARIIYVIYKLFCTLQGPYSGEPKPKTKIPERRVVTQGPEEEFGMSLIKH
NSCVITTENGKFTGLGVYDRFVVVPTHADPGKEIQVDGITTKVIDSYDLYNKNGIKLEITVLKLDRNEKFRDIRRYIPNN
EDDYPNCNLALLANQPEPTIINVGDVVSYGNILLSGNQTARMLKYSYPTKSGYCGGVLYKIGQVLGIHVGGNGRDGFSAM
LLRSYFTDVQGQITLSKKTSECNLPSIHTPCKTKLQPSVFYDVFPGSKEPAVLSEKDARLQVDFNEALFSKYKGNTDCSI
NDHIRIASSHYAAQLITLDIDPKPITLEDSVFGTDGLEALDLNTSAGFPYIAMGVKKRDLINNKTKDISKLKEAIDKYGV
DLPMVTFLKDELRKHEKVIKGKTRVIEASSVNDTLLFRTTFGNLFSKFHLNPGIVTGSAVGCDPEVFWSKIPAMLDDKCI
MAFDYTNYDGSIHPIWFEALKQVLVDLSFNPTLIDRLCKSKHIFKNTYYEVEGGVPSGCSGTSIFNTMINNIIIRTLVLD
AYKNIDLDKLKIIAYGDDVIFSYIHELDMEAIAIEGVKYGLTITPADKSNTFVKLDYSNVTFLKRGFKQDEKYNFLIHPT
FPEDEIFESIRWTKKPSQMHEHVLSLCHLMWHNGRDAYKKFVEKIRSVSAGRALYIPPYDLLLHEWYEKF
>Q82081 ~~~~~~Genome polyprotein~~~
MGAQVSTQKSGSHENQNILTNGSNQTFTVINYYKDAASSSSAGQSFSMDPSKFTEPVKDLMLKGAPALNSPNVEACGYSD
RVQQITLGNSTITTQEAANAIVCYAEWPEYLSDNDASDVNKTSKPDISVCRFYTLDSKTWKATSKGWCWKLPDALKDMGV
FGQNMFYHSLGRTGYTIHVQCNATKFHSGCLLVVVIPEHQLASHEGGTVSVKYKYTHPGDRGIDLDTVEVAGGPTSDAIY
NMDGTLLGNLLIFPHQFINLRTNNTATIVVPYINSVPIDSMTRHNNVSLMVVPIAPLNAPTGSSPTLPVTVTIAPMCTEF
TGIRSRSIVPQGLPTTTLPGSGQFLTTDDRQSPSALPSYEPTPRIHIPGKVRNLLEIIQVGTLIPMNNTGTNDNVTNYLI
PLHADRQNEQIFGTKLYIGDGVFKTTLLGEIAQYYTHWSGSLRISLMYTGPALSSAKIILAYTPPGTRGPEDRKEAMLGT
HVVWDIGLQSTIVMTIPWTSGVQFRYTDPDTYTSAGYLSCWYQTSLILPPQTSGQVYLLSFISACPDFKLRLMKDTQTIS
QTDALTEGLSDELEEVIVEKTKQTLASVSSGPKHTQSVPALTANETGATLPTRPSDNVETRTTYMHFNGSETDVESFLGR
AACVHVTEIKNKNAAGLDNHRKEGLFNDWKINLSSLVQLRKKLELFTYVRFDSEYTILATASQPEASSYSSNLTVQAMYV
PPGAPNPKEWDDYTWQSASNPSVFFKVGETSRFSVPFVGIASAYNCFYDGYSHDDPDTPYGITVLNHMGSMAFRVVNEHD
VHTTIVKIRVYHRAKHVEAWIPRAPRALPYVSIGRTNYPRDSKTIIKKRTNIKTYGLGPRFGGVFTSNVKIINYHLMTPD
DHLNLVAPYPNRDLAVVATGAHGAETIPHCNCTSGVYYSRYYRKFYPIICERPTNIWIEGSSYYPSRYQAGVMKGVGPAE
PGDCGGILRCIHGPIGLLTAGGGGYVCFADIRQLDFIADEQGLGDYITSLGRAFGTGFTDQISAKVCELQDVAKDFLTTK
VLSKVVKMISALVIICRNHDDLVTVTATLALLGCDGSPWRFLKMYISKHFQVPYIERQANDGWFRKFNDACNAAKGLEWI
ANKISKLIEWIKNKVLPQAREKLEFCSKLKQLDILERQIASIHDSNPTQEKREQLFNNVLWLEQMSQKFSPLYASEAKRI
RDLKNKITNYMQFKSKQRTEPVCVLIHGTPGSGKSLTTSIVGRALAEHFNSSVYSLPPDPKHFDGYQQQEVVIMDDLNQN
PDGQDISMFCQMVSSVDFLPPMASLDNKGMLFTSNFVLASTNSNTLSPPTILNPEALIRRFGFDLDICMHSTYTKNGKLN
AAMATSLCKDCHQPSNFKKCCPLVCGKAISLVDRVSNVRFSIDQLVTAIINDYKNKVKITDSLEVLFQGPVYKDLEIDIC
NTPPPECISDLLKSVDSEEVREYCKKKKWIIPQISTNIERAVNQASMIINTILMFVSTLGIVYVIYKLFAQTQGPYSGNP
VHNKLKPPTLKPVVVQGPNTEFALSLLRKNILTITTEKGEFTSLGIHDRICVLPTHAQPGDNVLVNGQKIQIKDKYKLVD
PDNTNLELTIIELDRNEKFRDIRGFISEDLEGLDATLVVHSNGFTNTILDVGPITMAGLINLSNTPTTRMIRYDYPTKTG
QCGGVLCTTGKIFGIHVGGNGRRGFSAQLKKQYFVEKQGLIVSKQKVRDIGLNPINTPTKTKLHPSVFYNVFPGSKQPAV
LNDNDPRLEVKLAESLFSKYKGNVQMEPTENMLIAVDHYAGQLMSLDISTKELTLKEALYGVDGLEPIDVTTSAGYPYVS
LGIKKRDILNKETQDVEKMKFYLDKYGIDLPLVTYIKDELRSVDKVRLGKSRLIEASSLNDSVNMRMKLGNLYKAFHQNP
GIITESAVGCDPDVFWSVIPCLMDGHLMAFDYSNFDASLSPVWFECLEKVLNKLGFKQPSLIQSICNTHHIFRDEIYRVE
GGMPSGCSGTSIFNSMINNIIIRTLILDAYKGIDLDSLRILAYGDDLIVSYPFELDSNILAAIGKNYGLTITPPDKSDAF
TKITWENITFLKRYFRPDPQFPFLIHPVMPMQDIYESIRWTRDPRNTQDHVRSLCMLAWHSGEKDYNDFITKIRTTDIGK
CLNLPEYSVLRRRWLDLF
>P27395 ~~~~~~Genome polyprotein~~~
MTKKPGGPGKNRAINMLKRGLPRVFPLVGVKRVVMSLLDGRGPVRFVLALITFFKFTALAPTKALLGRWKAVEKSVAMKH
LTSFKRELGTLIDAVNKRGRKQNKRGGNEGSIMWLASLAVVIACAGAMKLSNFQGKLLMTINNTDIADVIVIPTSKGENR
CWVRAIDVGYMCEDTITYECPKLTMGNDPEDVDCWCDNQEVYVQYGRCTRTRHSKRSRRSVSVQTHGESSLVNKKEAWLD
STKATRYLMKTENWIIRNPGYAFLAAVLGWMLGSNNGQRVVFTILLLLVAPAYSFNCLGMGNRDFIEGASGATWVDLVLE
GDSCLTIMANDKPTLDVRMINIEASQLAEVRSYCYHASVTDISTVARCPTTGEAHNEKRADSSYVCKQGFTDRGWGNGCG
LFGKGSIDTCAKFSCTSKAIGRTIQPENIKYEVGIFVHGTTTSENHGNYSAQVGASQAAKFTVTPNAPSITLKLGDYGEV
TLDCEPRSGLNTEAFYVMTVGSKSFLVHREWFHDLALPWTSPSSTAWRNRELLMEFEGAHATKQSVVALGSQEGGLHQAL
AGAIVVEYSSSVKLTSGHLKCRLKMDKLALKGTTYGMCTEKFSFAKNPVDTGHGTVVIELSYSGSDGPCKIPIVSVASLN
DMTPVGRLVTVNPFVATSSANSKVLVEMEPPFGDSYIVVGRGDKQINHHWHKAGSTLGKAFSTTLKGAQRLAALGDTAWD
FGSIGGVFNSIGRAVHQVFGGAFRTLFGGMSWITQGLMGALLLWMGVNARDRSIALAFLATGGVLVFLATNVHADTGCAI
DITRKEMRCGSGIFVHNDVEAWVDRYKYLPETPRSLAKIVHKAHKEGVCGVRSVTRLEHQMWEAVRDELNVLLKENAVDL
SVVVNKPVGRYRSAPKRLSMTQEKFEMGWKAWGKSILFAPELANSTFVVDGPETKECPDEHRAWNSMQIEDFGFGITSTR
VWLKIREESTDECDGAIIGTAVKGHVAVHSDLSYWIESRYNDTWKLERAVFGEVKSCTWPETHTLWGDDVEESELIIPHT
IAGPKSKHNRREGYKTQNQGPWDENGIVLDFDYCPGTKVTITEDCSKRGPSVRTTTDSGKLITDWCCRSCSLPPLRFRTE
NGCWYGMEIRPVMHDETTLVRSQVDAFKGEMVDPFQLGLLVMFLATQEVLRKRWTARLTIPAVLGVLLVLMLGGITYTDL
ARYVVLVAAAFAEANSGGDVLHLALIAVFKIQPAFLVMNMLSTRWTNQENVILVLGAAFFQLASVDLQIGVHGILNAAAI
AWMIVRAITFPTTSSVTMPVLALLTPGMRALYLDTYRIILLVIGICSLLHERKKTMAKKKGAVLLGLALTSTGWFSPTTI
AAGLMVCNPNKKRGWPATEFLSAVGLMFAIVGGLAELDIESMSIPFMLAGLMAVSYVVSGKATDMWLERAADISWEMDAA
ITGSSRRLDVKLDDDGDFHLIDDPGVPWKVWVLRMSCIGLAALTPWAIVPAAFGYWLTLKTTKRGGVFWDTPSPKPCSKG
DTTTGVYRIMARGILGTYQAGVGVMYENVFHTLWHTTRGAAIMSGEGKLTPYWGSVREDRIAYGGPWRFDRKWNGTDDVQ
VIVVEPGKAAVNIQTKPGVFRTPFGEVGAVSLDYPRGTSGSPILDSNGDIIGLYGNGVELGDGSYVSAIVQGDRQEEPVP
EAYTPNMLRKRQMTVLDLHPGSGKTRKILPQIIKDAIQQRLRTAVLAPTRVVAAEMAEALRGLPVRYQTSAVQREHQGNE
IVDVMCHATLTHRLMSPNRVPNYNLFVMDEAHFTDPASIAARGYIATKVELGEAAAIFMTATPPGTTDPFPDSNAPIHDL
QDEIPDRAWSSGYEWITEYAGKTVWFVASVKMGNEIAMCLQRAGKKVIQLNRKSYDTEYPKCKNGDWDFVITTDISEMGA
NFGASRVIDCRKSVKPTILEEGEGRVILGNPSPITSASAAQRRGRVGRNPNQVGDEYHYGGATSEDDSNLAHWTEAKIML
DNIHMPNGLVAQLYGPEREKAFTMDGEYRLRGEEKKNFLELLRTADLPVWLAYKVASNGIQYTDRKWCFDGPRTNAILED
NTEVEIVTRMGERKILKPRWLDARVYADHQALKWFKDFAAGKRSAVSFIEVLGRMPEHFMGKTREALDTMYLVATAEKGG
KAHRMALEELPDALETITLIVAITVMTGGFFLLMMQRKGIGKMGLGALVLTLATFFLWAAEVPGTKIAGTLLIALLLMVV
LIPEPEKQRSQTDNQLAVFLICVLTVVGVVAANEYGMLEKTKADLKSMFGGKTQASGLTGLPSMALDLRPATAWALYGGS
TVVLTPLLKHLITSEYVTTSLASINSQAGSLFVLPRGVPFTDLDLTVGLVFLGCWGQITLTTFLTAMVLATLHYGYMLPG
WQAEALRAAQRRTAAGIMKNAVVDGMVATDVPELERTTPLMQKKVGQVLLIGVSVAAFLVNPNVTTVREAGVLVTAATLT
LWDNGASAVWNSTTATGLCHVMRGSYLAGGSIAWTLIKNADKPSLKRGRPGGRTLGEQWKEKLNAMSREEFFKYRREAII
EVDRTEARRARRENNIVGGHPVSRGSAKLRWLVEKGFVSPIGKVIDLGCGRGGWSYYAATLKKVQEVRGYTKGGAGHEEP
MLMQSYGWNLVSLKSGVDVFYKPSEPSDTLFCDIGESSPSPEVEEQRTLRVLEMTSDWLHRGPREFCIKVLCPYMPKVIE
KMEVLQRRFGGGLVRLPLSRNSNHEMYWVSGAAGNVVHAVNMTSQVLLGRMDRTVWRGPKYEEDVNLGSGTRAVGKGEVH
SNQEKIKKRIQKLKEEFATTWHKDPEHPYRTWTYHGSYEVKATGSASSLVNGVVELMSKPWDAIANVTTMAMTDTTPFGQ
QRVFKEKVDTKAPEPPAGAKEVLNETTNWLWAHLSREKRPRLCTKEEFIKKVNSNAALGAVFAEQNQWSTAREAVDDPRF
WEMVDEERENHLRGECHTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARYLEFEALGFLNEDHWLSRENSGGGVEGSG
VQKLGYILRDIAGKQGGKMYADDTAGWDTRITRTDLENEAKVLELLDGEHRMLARAIIELTYRHKVVKVMRPAAEGKTVM
DVISREDQRGSGQVVTYALNTFTNIAVQLVRLMEAEGVIGPQHLEQLPRKTKIAVRTWLFENGEERVTRMAISGDDCVVK
PLDDRFATALHFLNAMSKVRKDIQEWKPSHGWHDWQQVPFCSNHFQEIVMKDGRSIVVPCRGQDELIGRARISPGAGWNV
KDTACLAKAYAQMWLLLYFHRRDLRLMANAICSAVPVDWVPTGRTSWSIHSKGEWMTTEDMLQVWNRVWIEENEWMMDKT
PITSWTDVPYVGKREDIWCGSLIGTRSRATWAENIYAAINQVRAVIGKENYVDYMTSLRRYEDVLIQEDRVI
>P32886 ~~~~~~Genome polyprotein~~~
MTKKPGGPGKNRAINMLKRGLPRVFPLVGVKRVVMSLLDGRGPVRFVLALITFFKFTALAPTKALLGRWKAVEKSVAMKH
LTSFKRELGTLIDAVNKRGRKQNKRGGNEGSIMWLASLAVVIAYAGAMKLSNFQGKLLMTINNTDIADVIVIPTSKGENR
CWVRAIDVGYMCEDTITYECPKLTMGNDPEDVDCWCDNQEVYVQYGRCTRTRHSKRSRRSVSVQTHGESSLVNKKEAWLD
STKATRYLMKTENWIIRNPGYAFLAATLGWMLGSNNGQRVVFTILLLLVAPAYSFNCLGMGNRDFIEGASGATWVDLVLE
GDSCLTIMANDKPTLDVRMINIEASQLAEVRSYCYHASVTDISTVARCPTTGEAHNEKRADSSYVCKQGFTDRGWGNGCG
LFGKGSIDTCAKFSCTSKAIGRTIQPENIKYEVGIFVHGTTTSENHGNYSAQVGASQAAKFTITPNAPSITLKLGDYGEV
TLDCEPRSGLNTEAFYVMTVGSKSFLVHREWFHDLALPWTSPSSTAWRNRELLMEFEEAHATKQSVVALGSQEGGLHQAL
AGAIVVEYSSSVKLTSGHLKCRLKMDKLALKGTTYGMCTEKFSFAKNPADTGHGTVVIELSYSGSDGPCKIPIVSVASLN
DMTPVGRLVTVNPFVATSSANSKVLVEMEPPFGDSYIVVGRGDKQINHHWHKAGSTLGKAFSTTLKGAQRLAALGDTAWD
FGSIGGVFNSIGKAVHQVFGGAFRTLFGGMSWITQGLMGALLLWMGVNARDRSIALAFLATGGVLVFLATNVHADTGCAI
DITRKEMRCGSGIFVHNDVEAWVDRYKYLPETPRSLAKIVHKAHKEGVCGVRSVTRLEHQMWEAVRDELNVLLKENAVDL
SVVVNKPVGRYRSAPKRLSMTQEKFEMGWKAWGKSILFAPELANSTFVVDGPETKECPDEHRAWNSMQIEDFGFGITSTR
VWLKIREESTDECDGAIIGTAVKGHVAVHSDLSYWIESRYNDTWKLERAVFGEVKSCTWPETHTLWGDGVEESELIIPHT
IAGPKSKHNRREGYKTQNQGPWDENGIVLDFDYCPGTKVTITEDCGKRGPSVRTTTDSGKLITDWCCRSCSLPPLRFRTE
NGCWYGMEIRPVRHDETTLVRSQVDAFNGEMVDPFQLGLLVMFLATQEVLRKRWTARLTIPAVLGALLVLMLGGITYTDL
ARYVVLVAAAFAEANSGGDVLHLALIAVFKIQPAFLVMNMLSTRWTNQENVVLVLGAALFQLASVDLQIGVHGILNAAAI
AWMIVRAITFPTTSSVTMPVLALLTPGMRALYLDTYRIILLVIGICSLLQERKKTMAKKKGAVLLGLALTSTGWFSPTTI
AAGLMVCNPNKKRGWPATEFLSAVGLMFAIVGGLAELDIESMSIPFMLAGLMAVSYVVSGKATDMWLERAADISWEMDAA
ITGSSRRLDVKLDDDGDFHLIDDPGVPWKVWVLRMSCIGLAALTPWAIVPAAFGYWLTLKTTKRGGVFWDTPSPKPCSKG
DTTTGVYRIMARGILGTYQAGVGVMYENVFHTLWHTTRGAAIMSGEGKLTPYWGSVKEDRIAYGGPWRFDRKWNGTDDVQ
VIVVEPGKAAVNIQTKPGVFRTPFGEVGAVSLDYPRGTSGSPILDSNGDIIGLYGNGVELGDGSYVSAIVQGDRQEEPVP
EAYTPNMLRKRQMTVLDLHPGSGKTRKILPQIIKDAIQQRLRTAVLAPTRVVAAEMAEALRGLPVRYQTSAVQREHQGNE
IVDVMCHATLTHRLMSPNRVPNYNLFVMDEAHFTDPASIAARGYIATKVELGEAAAIFMTATPPGTTDPFPDSNAPIHDL
QDEIPDRAWSSGYEWITEYAGKTVWFVASVKMGNEIAMCLQRAGKKVIQLNRKSYDTEYPKCKNGDWDFVITTDISEMGA
NFGASRVIDCRKSVKPTILEEGEGRVILGNPSPITSASAAQRRGRVGRNPNQVGDEYHYGGATSEDDSNLAHWTEAKIML
DNIHMPNGLVAQLYGPEREKAFTMDGEYRLRGEEKKNFLELLRTADLPVWLAYKVASNGIQYTDRRWCFDGPRTNAILED
NTEVEIVTRMGERKILKPRWLDARVYADHQALKWFKDFAAGKRSAISFIEVLGRMPEHFMGKTREALDTMYLVATAEKGG
KAHRMALEELPDALETITLIVAITVMTGGFFLLMMQRKGIGKMGLGALVLTLATFFLWAAEVPGTKIAGTLLIALLLMVV
LIPEPEKQRSQTDNQLAVFLICVLTVVGVVAANEYGMLEKTKADLKSMFVGKTQASGLTGLPSMALDLRPATAWALYGGS
TVVLTPLLKHLITSEYVTTSLASINSQAGSLFVLPRGVPFTDLDLTVGLVFLGCWGQITLTTFLTAMVLATLHYGYMLPG
WQAEALRAAQRRTAAGIMKNAVVDGMVATDVPELERTTPLMQKKVGQVLLIGVSVAAFLVNPNVTTVREAGVLVTAATLT
LWDNGASAVWNSTTATGLCHVMRGSYLAGGSIAWTLIKNADKPSLKRGRPGGRTLGEQWKEKLNAMSREEFFKYRREAII
EVDRTEARRARRENNIVGGHPVSRGSAKLRWLVEKGFVSPIGKVIDLGCGRGGWSYYAATLKKVQEVRGYTKGGAGHEEP
MLMQSYGRNLVSLKSGVDVFYKPSEPSDTLFCDIGESSPSPEVEEQRTLRVLEMTSDWLHRGPREFCIKVLCPYMPKVIE
KMEVLQRRFGGGLVRLPLSRNSNHEMYWVSGAAGNVVHAVNMTSQVLLGRMDRTVWRGPKYEEDVNLGSGTRAVGKGEVH
SNQEKIKKRIQKLKEEFATTWHKDPEHPYRTWTYHGSYEVKATGSASSLVNGVVKLMSKPWDAIANVTTMAMTDTTPFGQ
QRVFKEKVDTKAPEPPAGAKEVLNETTNWLWAHLSREKRPRLCTKEEFIKKVNSNAALGAVFAEQNQWSTAREAVDDPRF
WEMVDEERENHLRGECHTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARYLEFEALGFLNEDHWLSRENSGGGVEGSG
VQKLGYILRDIAGKQGGKMYADDTAGWDTRITRTDLENEAKVLELLDGEHRMLARAIIELTYRHKVVKVMRPAAEGKTVM
DVISREDQRGSGQVVTYALNTFTNIAVQLVRLMEAEGVIGPQHLEQLPRKTKIAVRTWLFENGEERVTRMAISGDDCVVK
PLDDRFATALHFLNAMSKVRKDIQEWKPSHGWHDWQQVPFCSNHFQEIVMKDGRSIVVPCRGQDELIGRARISPGAGWNV
KDTACLAKAYAQMWLLLYFHRRDLRLMANAICSAVPVDWVPTGRTSWSIHSKGEWMTTEDMLQVWNRVWIEENEWMMDKT
PITSWTDVPYVGKREDIWCGSLIGTRSRATWAENIYAAINQVRAVIGKENYVDYMTSLRRYEDVLIQEDRVI
>D7RF80 ~~~~~~Genome polyprotein~~~
MAKGAVLKGKGGGPPRRVPKETAKKTRQGPGRLPNGLVLMRMMGVLWHMVAGTARNPILKRFWATVPVRQAIAALRKIRK
TVGLLLDSLNKRRGKRRSTTGLLTPILLACLATLVFSATVRRERTGNMVIRAEGKDAATQVEVMNGTCTILATDMGSWCD
DSIMYECVTIDSGEEPVDVDCFCKGVERVSLEYGRCGKPAGGRNRRSVSIPVHAHSDLTGRGHKWLKGDSVKTHLTRVEG
WVWKNKFLTAAFCAVVWMVTDSLPTRFIVITVALCLAPTYATRCTHLQNRDFVSGTQGTTRVSLVLELGGCVTLTAEGKP
SVDVWLDDIHQENPAKTREYCLHAKLANSKVAARCPAMGPATLPEEHQASTVCRRDQSDRGWGNHCGLFGKGSIVACAKF
SCEAKKKATGYVYDVNKITYVVKVEPHTGDYLAANESHSNRKTASFTTQSEKTILTLGDYGDISLTCRVTSGVDPAQTVV
LELDKTAEHLPKAWQVHRDWFEDLSLPWRHGGAQEWNHADRLVEFGEPHAVKMDIFNLGDQTGILLKSLAGVPVANIEGS
KYHLQSGHVTCDVGLEKLKMKGMTYTVCEGSKFAWKRPPTDSGHDTVVMEVTYTGSKPCRIPVRAVAHGEPNVNVASLIT
PNPSMETTGGGFVELQLPPGDNIIYVGELSHQWFQKGSTIGRVLEKTRRGIERLTVVGEHAWDFGSVGGMLSSVGKALHT
AFGAAFNTIFGGVGFLPRILLGVALAWLGLNSRNPTLSVGFLITGGLVLTMTLGVGADMGCAIDANRMELRCGEGLVVWR
EVTDWYDGYAFHPESPSVLAASLKEAYEEGICGIVPQNRLEMAMWRRVEAVLNLALAESDANLTVVVDKRDPSDYRGGKV
GTLRRSGKEMKTSWKGWSQSFVWSVPEAPRRFMVGVEGAGECPLDKRRTGVFTVAEFGMGMRTKVFLDLRETASSDCDTG
VMGAAVKSGHAVHTDQSLWMRSHRNATGVFISELIVTDLRNCTWPASHTLDNAGVVDSKLFLPAGLAGPRSHYNHIPGYA
EQVKGPWSQTPLRVVREPCPGTAVKIDQSCDKRGASLRSTTESGKAIPEWCCRTCELPPVTFRSGTDCWYAMEIRPVHQQ
GGLVRSMVLADNGAMLSEGGVPGIVAVFVVLELVIRRRPTTGSSVVWCGMVVLGLVVTGLVTIEGLCRYVVAVGILMSME
LGPEIVALVLLQAVFDMRTGLLVAFAVKRAYTTREAVATYFLLLVLELGFPEASLSNIWKWADSLAMGALILQACGQEGR
TRVGYLLAAMMTQKDMVIIHTGLTIFLSAATAMAVWSMIKGQRDQKGLSWATPLAGLLGGEGVGLRLLAFRKLAERRNRR
SFSEPLTVVGVMLTVASGMVRHTSQEALCALVAGAFLLLMMVLGTRKMQLTAEWCGEVEWNPDLVNEGGEVNLKVRQDAM
GNLHLTEVEKEERAMALWLLAGLVASAFHWAGILIVLAVWTLFEMLGSGRRSELVFSGQETRTERNRPFEIKDGAYRIYS
PGLLWGHRQIGVGYGAKGVLHTMWHVTRGAALVVDEAISGPYWADVREDVVCYGGAWSLESRWRGETVQVHAFPPGRPQE
THQCQPGELILENGRKLGAVPIDLSKGTSGSPIINAQGEVVGLYGNGLKTNEAYVSSIAQGEAEKSRPEIPLSVQGTGWM
SKGQITVLDMHPGSGKTHRVLPELVRQCADRGMRTLVLAPTRVVLKEMERALAGKKVRFHSPAVEGQTTAGAIVDVMCHA
TYVHRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYSLAKENRCALVLMTATPPGRGDPFPESNGAIMSEERAIPDGE
WREGFDWITEYEGRTAWFVPSISKGGAVARTLRQRGKSVICLNSKTFEKDYLRVREEKPDFVVTTDISEMGANLDVSRVI
DGRTNIKPEEVDGKVELTGTRKVTTASAAQRRGRVGRTSGRTDEYIYSGQCDDDDTSLVQWKEAQILLDNITTLRGPVAT
FYGPEQVKMPEVAGHYRLNEEKRKHFRHLMTQCDFTPWLAWHVATNTSNVLDRSWTWQGPEENAIDGADGDLVRFKTPGG
SERVLQPVWKDCRMFREGRDVKDFILYASGRRSVGDVLGGLAGVPGLLRHRCASALDVVYTLLNENPGSRAMRMAERDAP
EAFLTIVEVAVLGVATLGILWCFVARASVSRMFLGTVVLFAALFLLWIGGVDYGHMAGIALIFYTLLTVLQPEPGKQRSS
DDNRLAYFLLGLFSLAGLVTANEMGMLDKTKADLAGLVWRGEQRHPAWEEWTNVDIQPARSWGTYVLIVSLFTPYMLHQL
QTKIQQLVNSSVASGAQAMRDLGGGTPFFGVAGHVIALGVTSLVGATPMSLGLGVALAAFHLAIVASGLEAELTQRAHRV
FFSAMVKNPMVDGDVINPFPDGETKPALYERRMSLILAIALCMGSVVLNRTAASMTEAGAVGLAALGQLVHPETETLWTM
PMACGMAGLVRGSFWGLLPMGHRLWLRTTGTRRGGAEGETLGDIWKRRLNGCSREEFFQYRRSGVMETERDKARELLKRG
ETNMGLAVSRGTAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPAVMGVKAYTIGGKGHEVPRLITSLGWNLIKFR
TGMDVYSLEAHRADTILCDIGESSPDPLAEGERSRRVILLMEKWKLRNPDASCVFKVLAPYRPEVLEALHRFQLQWGGGL
VRVPFSRNSTHEMYFSTAISGNIINSVNTQSRKLLARFGDQRGPTKVPEVDLGTGTRCVVLAEDKVREADVAERIAALKT
QYGDSWHVDKEHPYRTWQYWGSYKTEATGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKEKVDTKAQEP
QVGTKIIMRAVNDWIFERLAGKKTPRLCTREEFIAKVRSNAALGAWSDEQNRWPNAREAVEDPEFWRLVDEERERHLGGR
CAQCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRYLEFEALGFLNEDHWASRDLSGAGVEGTSLNYLGWHLKKLSELE
GGLFYADDTAGWDTRITNADLEDEEQILRYLEGEHRTLAKTILEKAYHAKVVKVARPSSSGGCVMDIITRRDQRGSGQVV
TYALNTLTNIKVQLIRMMEGEGVIGPSDSQDPRLLRVEAWLKEHGEERLTRMLVSGDDCVVRPIDDRFGKALYFLNDMAK
VRKDIGEWEPSEGYSSWEEVPFCSHHFHELTMKDGRVIIVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLSY
FHRRDLRTLGLAICSAVPIDWVPQGRTTWSIHASGAWMTTEDMLEVWNRVWILDNPFMGDKGKVREWRDIPYLPKSQDGL
CSSLVGRRERAEWAKNIWGSVEKVRRMIGPERYADYLSCMDRHELHWDLKLESNII
>Q32ZD5 ~~~~~~Genome polyprotein~~~
MTKKPGRPGRNRAVNMLKRGASRALGPMIKLKRMLFGLLDGRGPLRMVLAILAFFRFTALKPTAGLLKRWGMMDKVHALS
LLKGFKKDLASMTDFVHLPKKKSGVSIIGRMLVFSFTAAVRVTLENGMSLMKIQKADVGKVITIRTDRGENRCIVQAMDV
GEDCEDTMKYLCPAIENPSEPDDIDCWCDKADAMVTYGRCSKTRHSRRSRRSTNIAGHADSRLDSRGSVWMDTKKATSYL
TKAESWALRNPGYALVAAVLGWSLGTSNAQKVIFTVMILLIAPAYSIRCVGVENRDFIEGVSGGTWVDVVLEHGGCVTIM
APDKPTIDLELTSTIAKSMAVTRTYCVQAQVSELSVETRCPTMGEAHNSKSSDAAYVCKKGFSDRGWGNGCGLFGKGSME
TCAKFSCQTKAEGRIIQRENLEYTIHMNVHASQETGHFMNDTIASENKHGAKISITATGPSRTADLGDYGMVTLDCEPRA
GLDFDNLYLLTLGRNSWLVNRDWFHDVNLPWIGGAEGHWKNRESLVEFGKTHATKREVLALGSQEGTLQVALAGAMIAKF
GSNVATINSGHLKCRLKLDKLKIKGTTYHMCKGSFAFTKTPSDTGHGTVLLELTYSGSDGPCRVPISMSVSLSNIEPVGR
MVTVNPIVLSSSPQKTIMIEVEPPFGDSFIIAGTGEPRAHYHWRKSGSSIGAAFATTIKGARRLAVIGDDAWDFGSVGGI
LNSVGKALHQIFGGMFRTLFGGMSWFTQIMIGALCCWLGINARDRTIAVTFLAVGGVLVFLATSVNADSGCALDLKRKEF
KCGNGIFVFNDAEAWSHSYRYHPSTPKKLAGSIVRAIEEGQCGVRSVGRLEHEMWRANAREINAILLENEKNLSVVVLES
EYYRKAKNLMPIGDEMPFGWKSWGKKFFEEPQLQNQTFVVDGRVGKECPEEKRSWNNFRIEDFGFGVFTTSVWMEQRTEY
TEDCDQKVIGAAVKGELAAHSDLGYWIESRSKNGSWELERAYLLESKSCSWPATHTLWNGGVEESELIIPKSRAGPVSHH
NTRKGYHNQIKGPWHLTPLEIRFESCPGTTVVTTEECGNRGPSLRTTTTSGKVISEWCCRSCTMPPLSFRTADGCWYGME
IRPLKEREETMVKSHVSAGRGDGVDNLSLGLLVLTIALQEVMRKRILGRHITWMVIAVFMAMILGGLSYRDLGRYLVLVG
AAFAERNSGGDLLHLVLVATFKVKPMALLGFVLGGRWCRRQSLLLSIGAVLVNFALEFQGGYFELVDSLALALLFVKAVV
QTDTTSVSLPLLAALAPAGCYTVLGTHRFIMLTLVLVTFLGCKKTASVKKAGTAAVGVVLGMVGMKTIPMLGMLMVTSRA
RRSWPLHEAMAAVGILCALFGALAETEVDLAGPLAAAGLIVMAYVISGRSNDLSIKKVEDVKWSDEAEVTGESVSYHVSL
DVRGDPTLTEDSGPGLEKVLLKVGLMAISGIYPVAIPFALGAWFFLEKRCKRAGALWDIPSPREAKPAKVEDGVYRIFSR
KLFGESQIGAGVMVKGTFHTMWHVTRGAVLKAGEGLLEPAWADVRKDLICYGGNWKLEEHWDGNEEVQLIALEPGKKVRH
IQTKPGIFKTSEGEIGALDLDCMAGTSGSPIVNKNGEVVGLYGNGVLIKGDRYVSAISQKENVGQEDGAEIEDNWFRKRE
LTVLDLHPGAGKTRRVLPQLVREAVKKRLRTVILAPTRVVASEMYEALRGEPIRYMTPAVQSERTGNEIVDFMCHSTFTM
KLFQGVRVPNYNLYIMDEAHFLDPASVAARGYIETRVSMGDAGAIFMTATPPGTTEAFPPSNSPIIDEETRIPDKAWNSG
YEWIIEFDGRTVWFVHSIKQGAEIGTCLQKAGKKVLYLNRKTFESEYPKCKSEKWDFVITTDISEMGANFKADRVIDPRK
TIKPILLDGRVSMQGPIAITPASAAQRRGRIGRNPEKLGDIYAYSGNVSSDNEGHVSWTEARMLLDNVHVQGGVVAQLYT
PEREKTEAYEGEFKLKTNQRKVFSELIRTGDLPVWLAFQVASANVEYHDRKWCFDGPNEHLLLENNQEIEVWTRQGQRRV
LKPRWLDGRITSDHLNLKSFKEFASGKRSALSILDLIAVLPSHLNLRLQEALDTAAILSRSEPGSRSYKAALENSPEMIE
TFLLCALVCLMTIGLVVVLVRGKGPGKLAFGMVSIGVMTWLLWSAGVDPGKIAAAVILVFLLLVVLIPEPEKQRSVQDNQ
LAMLMLLIATILGGVAANEMGWLEKTKADLSWVVRGRSSTTTPVVELDMKPATAWTLYALATTLLTPLFQHLIVTKYANI
SLMAIASQAGTLFSMDSGIPFSSIELSVPLLALGCWTQITPCSLILACVLLSTHYAILLPGMQAQAARDAQRRTAAGIMK
NAVVDGIVATDIPPLDGAGPLTEKKLGQLLLFAAAVTGVVITRSPRSWSELGVLGSAVGSTLIEGSAGKFWNATTVTAMC
NLFRGSYLAGVPLTYTIIRNSNPSNKRGGGIGETLGEKWKARLNQMNTLEFHRYRRSHIMEVDREPARAALKSGDFTRGA
AVSRGSAKLRWMHERGYIRLHDKVVDLGCGRGGWCYYSATVKEVKEVKGYTKGGRGHEEPVLTQSYGWNIVQMKSGVDVF
YKEAEPCDVVLCDIGECSSSPAVEADRSTKVLELAERWLERNDGADFCIKVLCPYMPEVVEKLSKLQLRYGGCLVRNPLS
RNSTHEMYWVSGYKGNLIGVINSTSALLLRRMEIKFAEPRYEEDVNLSCGTRAVSIAPPKFDYKKIGQRVERLKAEHMST
WHYDCEHPYRTWAYHGSYVVKPSGSASSQVNGVVKLLSKPWDVSSEVTGMSMTDTTPFGQQRVFKEKVDTKAPEPPAGAE
MASVIVSEWLWKRLNREKKPRLCTKEEFVRKVRGNAALGPVFEEENQWKDAAEAVQDPGFWNLVDMERKNHLEGKCETCV
YNMMGKREKKRGEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWMSRGNSGGGVEGLGIQKLGYVMREIGEKGGILYA
DDTAGWDTRITECDLRNEAHIMEYMENEHRKLARAIFELTYKHKVVKVMRPGKGVPLMDIISREDQRGSGQVVTYALNTF
TNLVVQLIRMAEAECVLTPEDLHEMSQSAKLRLLKWLKEEGWERLTRMAVSGDDCVVAAPDARFGAALTFLNAMSKIRKD
IKEWTPSKGWKNWEEVPFCSHHFHRLQMKDGRELVVPCRSQDELIGRARVTQGPGDLMSSACLAKAYAQMWQLLYFHRRD
LRLMGNAICSAVPVDWVPTGRTTWSIHGKGEWMTSENMLEVWNRVWIEENEHMEDKTPVREWTDIPYLGKREDPWCGSYI
GYRPRSTWAENIKVPVNVIRVKIGGNKYQDYLGTQKRYESEKRVEFRGVL
>P14335 ~~~~~~Genome polyprotein~~~
MSKKPGGPGKSRAVNMLKRGMPRVLSLTGLKRAMLSLIDGRGPTRFVLALLAFFRFTAIAPTRAVLDRWRSVNKQTAMKH
LLSFKKELGTLTSAINRRSSKQKKRGGKTGIAFMIGLIAGVGAVTLSNFQGKVMMTVNATDVTDIITIPPAAGKNLCIVR
AMDVGHMCDDTITYECPVLSAGNDPEDIDCWCTKLAVYVRYGRCTKTRHSRRSRRSLTVQTHGESTLSNKKGAWMDSTKA
TRYLVKTESWILRNPGYALVAAVIGWMLGSNTMQRVVFAVLLLLVAPAYSFNCLGMSNRDFLEGVSGATWVDLVLEGDSC
VTIMSKDKPTIDVKMMNMEAANLAEVRSYCYLATVSELSTKAACPTMGEAHNDKRADPSFVCKQGVVDRGWGNGCGLFGK
GSIDTCAKFACSTKATGRTILKENIKYEVAIFVHGPTTVESHGNYFTQTGAAQAGRFSITPAAPSYTLKLGEYGEVTVDC
EPRSGIDTSAYYVMTVGTKTFLVHREWFMDLNLPWSSAESNVWRNRETLMEFEEPHATKQSVIALGSQEGALHQALAGAI
PVEFSSNTVKLTSGHLKCRVKMEKLQLKGTTYGVCSKAFRFLGTPADTGHGTVVLELQYTGTDGPCKIPISSVASLNDLT
PVGRLVTVNPFVSVSTANAKVLIELEPPFGDSYIVVGRGEQQINHHWHKSGSSIGKAFTATLKGAQRLAALGDTAWDFGS
VGGVFTSVGKAVHQVFGGAFRSLFGGMSWITQGLLGALLLWMGINARDRSIALTFLAVGGVLLFLSVNVHADTGCAIDIS
RQELRCGSGVFIHNDVEAWIDRYKYYPETPQGLAKIIQKAHKEGVCGLRSVSRLEHQMWEAVKDELNTLLKENGVDLSIV
VEKQEGMYKSAPRRLTATTEKLEIGWKAWGKSILFAPELANNTFVIDGPETKECPTQNRAWNNLEVEDFGFGLTSTRMFL
RVRESNTTECDSKIIGTAVKNNLAIHSDLSYWIESRFNDTWKLERAVLGEVKSCTWPETHTLWGDGVLESDLIIPITLAG
PRSNHNRRPGYKTQSQGPWDEGRVEIDFDYCPGTTVTLSESCGHRGPATRTTTESGKLITDWCCRSCTLPPLRYQTDNGC
WYGMEIRPQRHDEKTLVQSQVNAYNADMIDPFQLGLLVVFLATQEVLRKRWTAKISMPAILIALLVLVFGGITYTDVLRY
VILVGAAFAESNSGGDVVHLALMATFKIQPVFMVASFLKARWTNQENILLMLAAAFFQMAYYDARQILLWEMPDVLNSLA
VAWMILRAITFTTTSNVVVPLLALLTPGLRCLNLDVYRILLLMVGIGSLIREKRSAAAKKKGASLLCLALASTGFFNPMI
LAAGLVACDPNRKRGWPATEVMTAVGLMFAIVGGLAELDIDSMAIPMTIAGLMFAAFVISGKSTDMWIERTADISWEGDA
EITGSSERVDVRLDDDGNFQLMNDPGAPWKIWMLRMACLAISAYTPWAILPSVVGFWITLQYTKRGGVLWDTPSPKEYKR
GDTTTGVYRIMTRGLLGSYQAGAGVMVEGVFHTLWHTTKGAALMSGEGRLDPYWGSVKEDRLCYGGPWKLQHKWNGQDEV
QMIVVEPGKNVKNVQTKPGVFKTPEGEIGAVTLDFPTGTSGSPIVDKNGDVIGLYGNGVIMPNGSYISAIVQGERMDEPV
PAGFEPEMLRKKQITVLDLHPGAGKTRRILPQIIKEAINRRLRTAVLAPTRVVAAEMAEALRGLPIRYQTSAVAREHNGN
EIVDVMCHATLTHRLMSPHRVPNYNLFVMDEAHFTDPASIAARGYISTRVELGEAAAIFMTATPPGTSDPFPESNAPISD
LQTEIPDRAWNSGYEWITEYIGKTVWFVPSVKMGNEIALCLQRAGKKVIQLNRKSYETEYPKCKNDDWDFVVTTDISEMG
ANFKASRVIDSRKSVKPTIITEGEGRVILGEPSAVTAASAAQRRGRTGRNPSQAGDEYCYGGHTNEDDSNCAHWTEARIM
LDNINMPNGLIAQFYQPEREKVYTMDGEYRLRGEERKNFLELLRTADLPVWLAYKVAAAGVSYHDRRWCFDGPRTNTILE
DNNEVEVITKLGERKILRPRWIDARVYSDHQALKSFKDFASGKRSQIGFIEVLGKMPEHFMGKTWEALDTMYVVATAEKG
GRAHRMALEELPDALQTIALIALLSVMTMGVFFLLMQRKGIGKIGLGGVVLGAATFFCWMAEVPGTKIAGMLLLSLLLMI
VLIPEPEKQRSQTDNQLAVFLICVLTLVGAVAANEMGWLDKTKSDISGLFGQRIETKENFSIGEFLLDLRPATAWSLYAV
TTAVLTPLLKHLITSDYITTSLTSINVQASALFTLARGFPFVDVGVSALLLAAGCWGQVTLTVTVTSATLLFCHYAYMVP
GWQAEAMRSAQRRTAAGIMKNAVVDGIVATDVPELERTTPIMQKKVGQVMLILVSLAALVVNPSVKTVREAGILITAAAV
TLWENGASSVWNATTAIGLCHIMRGGWLSCLSITWTLVKNMEKPGLKRGGAKGRTLGEVWKERLNQMTKEEFIRYRKEAI
TEVDRSAAKHARKERNITGGHPVSRGTAKLRWLVERRFLEPVGKVIDLGCGRGGWCYYMATQKRVQEVRGYTKGGPGHEE
PQLVQSYGWNIVTMKSGVDVFYRPSECCDTLLCDIGESSSSAEVEEHRTLRVLEMVEDWLHRGPKEFCVKVLCPYMPKVI
EKMELLQRRYGGGLVRNPLSRNSTHEMYWVSRASGNVVHSVNMTSQVLLGRMEKKTWKGPQYEEDVNLGSGTRAVGKPLL
NSDTSKIKNRIERLRREYSSTWHHDENHPYRTWNYHGSYEVKPTGSASSLVNGVVRLLSKPWDTITNVTTMAMTDTTPFG
QQRVFKEKVDTKAPEPPEGVKYVLNETTNWLWAFLAREKRPRMCSREEFIRKVNSNAALGAMFEEQNQWRSAREAVEDPK
FWEMVDEEREAHLRGECHTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARFLEFEALGFLNEDHWLGRKNSGGGVEGL
GLQKLGYILREVGTRPGGRIYADDTAGWDTRITRADLENEAKVLELLDGEHRRLARAIIELTYRHKVVKVMRPAADGRTV
MDVISREDQRGSGQVVTYALNTFTNLAVQLVRMMEGEGVIGPDDVEKLTKGKGPKVRTWLSENGEERLSRMAVSGDDCVV
KPLDDRFATSLHFLNAMSKVRKDIQEWKPSTGWYDWQQVPFCSNHFTELIMKDGRTLVTPCRGQDELVGRARISPGAGWN
VRDTACLAKSYAQMWLLLYFHRRDLRLMANAICSAVPVNWVPTGRTTWSIHAGGEWMTTEDMLEVWNRVWIEENEWMEDK
TPVEKWSDVPYSGKREDIWCGSLIGTRARATWAENIQVAINQVRSIIGDEKYVDYMSSLKRYEDTTLVEDTVL
>P29837 ~~~~~~Genome polyprotein~~~
MAGKAVLKGKGGGPPRRASKVAPKKTRQLRVQMPNGLVLMRMLGVLWHALTGTARSPVLKAFWKVVPLKQATLALRKIKR
TVSTLMVGLHRRGSRRTTIDWMTPLLITVMLGMCLTATVRRERDGSMVIRAEGRDAATQVRVENGTCVILATDMGSWCDD
SLAYECVTIDQGEEPVDVDCFCRGVEKVTLEYGRCGRREGSRSRRSVLIPSHAQRDLTGRGHQWLEGEAVKAHLTRVEGW
VWKNKLFTLSLVMVAWLMVDGLLPRILIVVVALALVPAYASRCTHLENRDFVTGVQGTTRLTLVLELGGCVTVTADGKPS
LDVWLDSIYQESPAQTREYCLHAKLTGTKVAARCPTMGPATLPEEHQSGTVCKRDQSDRGWGNHCGLFGKGSIVTCVKFT
CEDKKKATGHVYDVNKITYTIKVEPHTGEFVAANETHSGRKSASFTVSSEKTILTLGDYGDVSLLCRVASGVDLAQTVVL
ALDKTHEHLPTAWQVHRDWFNDLALPWKHDGAEAWNEAGRLVEFGTPHAVKMDVFNLGDQTGVLLKSLAGVPVASIEGTK
YHLKSGHVTCEVGLEKLKMKGLTYTVCDKTKFTWKRAPTDSGHDTVVMEVGFSGTRPCRIPVRAVAHGVPEVNVAMLITP
NPTMENNGGGFIEMQLPPGDNIIYVGDLNHQWFQKGSSIGRVLQKTRKGIERLTVLGEHAWDFGSVGGVMTSIGRAMHTV
LGGAFNTLLGGVGFLPKILLGVAMAWLGLNMRNPTLSMGFLLSGGLVLAMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYVFHPETPAVLASAVQRAYEEEICGIVPQNRLEMAMWRSSLVELNLALAEGEANLTVVVDKADPSDYRGGVPG
LLNKGKDIKVSWRSWGRSMLWSVPEAPRRFMIGVEGGRECPFARRKTGVMTVAEFGIGLRTKVFMDLRQELTTECDTGVM
GAAVKNGMAVHTDQSLWMKSIKNDTTVTIVELIVTDLRNCTWPASHTIDNAGVVNSKLFLPASLAGPRSTYNVIPGYAEQ
VRGPWAHTPVRIKREECPGTRVTIDKACDKRGASVRSTTESGKVIPEWCCRTCELPPVTYRTGTDCWYAMEIRPVHTQGG
LVRSMVVADNGALLSEGGVPGVVALFVVLELVIRRRPATGGTVIWGGIAILALLVTGLVSVESLFRYLVAVGLVFQLELG
PEAVAMVLLQAVFEMRTCLLSGFVLRRSITTREIVTVYFLLLVLEMGIPVKGLEHLWRWTDALAMGAIIFRACTAEGKTG
IGLLLAAFMTQSDMNIIHDGLTAFLCVATTMAIWRYIRGQGERKGLTWIVPLAGILGGEGSGVRLLAFWELAASRGRRSF
NEPMTVIGVMLTLASGMMRHTSQEAVCAMALAAFLLLMLTLGTRKMQLLAEWSGNIEWNPELTSEGGEVSLRVRQDALGN
LHLTELEKEERMMAFWLVVGLIASAFHWSGILIVMGLWTISEMLGSPRRTDLVFSGCSEGRSDSRPLDVKNGVYRIYTPG
LLWGQRQIGVGYGAKGVLHTMWHVTRGAALLVDGVAVGPYWADVREDVVCYGGAWSLESRWRGETVQVHAFPPGRAHETH
QCQPGELILENGRKMGAIPIDLAKGTSGSPIMNSQGEVVGLYGNGLKTNDTYVSSIAQGEVEKSRPNLPQSVVGTGWTAK
GQITVLDMHPGSGKTHRVLPELIRQCVERRLRTLVLAPTRVVLREMERALSGKNVRFHSPAVTEQHANGAIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYSLAKENRCAFVLMTATPPGKSEPFPESNGAIASEERQIPDGEWR
DGFDWITEYEGRTAWFVPSIARGGAIARALRQRGKSVICLNSKTFDKEYSRVKDEKPDFVVTTDISEMGANLDVTRVIDG
RTNIKPEEVDGRIELTGTRRVTTASAAQRRGRVGRQGGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTARGPVATFY
GPEQERMTETAGHYRLPEEKRKHFRHLLAQCDFTPWLAWHVAANVASVTDRSWTWEGPEENAVDENNGELVTFRSPNGAE
RTLRPVWRDARMFREGRDIREFVSYASGRRSVGDVLMGMSGVPALLRQRCTSAMDVFYTLMHEEPGSRAMRIGERDAPEA
FLTAVEMLVLGLATLGVVWCFVVRTSVSRMVLGTLVLATSLIFLWAGGVGYGNMAGVALVFYTLLTVLQPETGKQRSSDD
NKLAYFLLTLCGLAGMVAANEMGLLEKTKADLAALFARDQGETVRWGEWTNLDIQPARSWGTYVLVVSLFTPYMLHQLQT
RIQQLVNSAVASGAQAMRDLGGGTPFFGVAGHVLALGVASLVGATPTSLILGVGLAAFHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGDGEAKPALYERKLSLILALVLCLASVVMNRTFVAVTEAGAVGVAAAMQLLRPEMDVLWTMPV
ACGMSGVVRGSLWGLLPLGHRLWLRTTGTRRGGSEGDTLGDMWKARLNSCTKEEFFAYRRAGVMETDREKARELLKRGET
NMGLAVSRGTSKLAWMEERGYVTLKGEVVDLGCGRGGWSYYAASRPAVMSVRAYTIGGKGHESPRMVTSLGWNLIKFRAG
MDVFSMEPHRADAILCDIGESNPDAVVEGERSRRVILLMEQWKNRNPTATCVFKVLAPYRPEVIEALHRFQLQWGGGLVR
TPFSRNSTHEMYFSTAITGNIVNSVNIQSRKLLARFGDQRGPTRVPEIDLGVGTRSVVLAEDKVKEKDVMERIQALKDQY
CDTWHEDHEHPYRTWQYWGSYKTAATGSSASLLNGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKDKVDTKAQEPQP
GTKIIMRAVNDWLLERLVKKSRPRMCSREEFIAKVRSNAALAAWSDEQNKWKSAREAVEDPEFWSLVEAERERHLQGRCA
HCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNEDHWASRASSGAGVEGISLNYLGWHLKKLASLSGG
LFYADDTAGWDTRITNADLDDEEQILRYMDGDHKKLAATVLRKAYHAKVVRVARPSREGGCVMDIITRRDQRGSGQVVTY
ALNTITNIKVQLVRMMEGEGVIEVADSHNPRLLRVEKCVEEHGEERLSRMLVSGDDCVVRPVDDRFSKALYFLNDMAKTR
KDTGEWEPSTGFASWEEVPFCSHHFHELVMKDGRALVVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLSYFH
RRDLRTLGFAICSAVPVDWVPTGRTTWSIHASGAWMTTEDMLEVWNRVWIYDNPFMEDKTRVDEWRDTPYLPKSQDILCS
SLVGRGERAEWAKNIWGAVEKVRRMIGPEHYRDYLSSMDRHDLHWELKLESSIF
>P22338 ~~~~~~Genome polyprotein~~~
MGRKTILKGKGGGPPRRVSKETATKTRQSRVQMPNGLVLMRMMGILWHAVAGTARNPVLKAFWNSVPLRQATAALRKIKR
TVSALMVGLQRRGKRRSVTNWMNWLLVIALLGMTLAATVRKEGDGTTVIRAEGRDAATQVRVENGTCVILATDMGSWCDD
SLSYECVTIEQGEEPVDVDCFCRNVDGVYLEYGRCGKQEGSRTRRSVLIPTHAQGELTGRGRKWLEGDSLRTHLTRVEGW
VWKNKLLALAMVAVVWLALESVVTRVAVLVVLLCLAPVYASRCTHLENRDFVTGTQGTTRVTLVLELGGCVTITAEGKPS
MDVWLDAIYQESPAKTREYCLHAKLSETKVAARCPTMGPAVLTEERQIGTVCKRDQSDRGWGNHCGLFGKGSIVACVKAA
CEAKKKATGYVYDANKIVYTVKVEPHTGDYVAANETHKGRKTATFTVSSEKTILTLGEYGDVSLLCRVASGVDLAQTIIL
ELDKTAEHLPTAWQVHRDWFNDLALPWKHDGNPHWNNAERLVEFGAPHAVKMDVYNLGDQTGVLLRALAGVPVAHIEGNK
YHLKSGHVTCEVGLEKLKMKGLTYTMCDKSKFAWKRTPTDSGHDTVVMEVTFSGSKPCRIPVRAVAHGSPDVNVAMLITP
NPTIENDGGGFIEMQLPPGDNIIYVGELSHQWFQTGSSIGRVFQTTRKGIERLTVIGEHAWDFGSAGGFFGSIGKAVHTV
LGGAFNSIFGGVGFLPKLLMGVALAWLGLNTRNPTMSMSFLLAGGLVLAMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYAYYPETPGALASAVKEAFEEGSCGVVPQNRLEMAMWRSSVTELNLALVEGDANLTVVVDKNDPTDYRGGVPG
TLKKGKDMKVSWRSWGHSMIWSIPEAPRRFMVGTEGQSECPLERRKTGVFTVAEFGVGLRTKVFLDFRQEPTHECDTGVM
GAAVKNDMAVHTDQSLWMKSMRNDTGTYIVELLVTDLRNCSWPASHTIDNADVVNSELFLPASLRGPRSWYNRIPGYSEQ
VKGPWKHTPLRVIREECPGTTVTINAKCEKRGASVRSTTESGKVIPEWCCRACTMPPVTFRTGTDCWYAMEIRPVHAQGG
LVRSMVVADNGELLSEGGVPGIVALFVVLECIIRRRPSTGVTVVWGGVVVLALLVTGMVRIESLVRYVVAVGIAFHLELG
PETVALMLLQAVFELRVGLLSAFALRRGLTVREMVTTYFLLLVLELGLSSAGLGDLWKWSDALAMGALIFRACTAEGKTG
TGLLLIALMTQRDVVTVHHGLVCFLAAAAACSVWRLLRGHREQKGLTWIIPLARLLGGEGSVIRLLAFWELAAHRGRRSF
SEPLTVVVVMLTLASGMMRHTSQEALCALAVASFFLLMLVSGTRKMQLVAEWSGCVEWHPETVNEGGEISLRVRQDSMGN
FHLTELEKEERMMAFWLLAGLVASALHWSGILGVMGLWTLTEIMRSSRRSDLVYSGQGGQERGDRPFEVKDGVYRIFSPG
LFWGQRQVGVGYGHKGVLHTMWHVTRGAALSIDDAVAGPYWADVKEDVVCYGGAWSLEEKWKGETVQVHAFPPGRAHEVH
QCQPGELILDTGKRLGAIPIDLAKGTSGSPILNAQGVVVGLYGNGPKTNESYVSSIAQGEAEKSRPNLPQAVVGTGWTSK
GQITVLDMHPGSGKTHRVLPELIRQCIDRRLSTLVLAPTRVVLKEMERALSGKRVRFHSPAVSDQQAGGAIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYTLAKENKCALVLMTATPPGKSEPFPESNGAITSEERQIPDGEWR
DGFDWITEYEGRTAWFVPSIAKGGVIARTLRQKGKSVICLNSKTFEKDYSRVREEKPDFVVTTDISEMGANLDVSRVIDG
RTNIKPEEVDGKVELTGTRRVTTASAAQRRGRVGRQDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFY
GPEQDKMPEVAGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTDRSWTWEGPEANAVDEASGDLVTFRSPNGAE
RTLRPVWRDARMFREGRDIKEFVAYASGRRSFGDVLTGMSGVPELLRHRCVNALDVFYTLMHEEPGSRAMKMAERDAPEA
FLTVVEMMVLGLATLGVVWCFVVRTSISRMMLGTLVLLASLLLLWAGGVGYGNMAGVALIFYTLLTVLQPETGKQRSSDD
NKLAYFLLTLCSLAGLVAANEMGFLEKTKADLSAMLWSGHEEHRQWSEWTNVDIQPARSWGTYVLVVSLFTPYIIHQLQT
KIQQLVNSAVASGAQAMRDLGGGAPFFGVAGHVMTLGVVSLVGATPTSLIVGIGLAAFHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGEGEAKPALYERKMSLVLAIVLCLVSVVMNRTVASMTEAAAVGLAATGQLLRPEADTLWTMPV
ACGMSGVVRGSLWGFLPLGHRLWLRASGGRRGGSDGDTLGDLWKRRLNNCTKEEFFVYRRTGILETERDKARELLRRGET
NMGLAVSRGTAKLAWLEERGYRTLKGEVVDLGCGRGGWSYYAASRPAVMSVRAYTIGGRGHEVPKMVTSLGWNLIRFRSG
MDVFSMQPHRADTIMCDIGESNPDAAVEGERTRKVISLMEQWKIRNPAAACVFKVLAPYRPEVIEALHRFQLQWGGGLVR
TPFSRNSTHEMYYSTAVTGNIVNSVNIQSRKLLARFGDQRGPTKVPEADLGVGTRCVVLAEDKVKEQDVQERIRALRKQY
SETWHMDEEHPYRTWQYWGTSRTAPTGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKEKVDTKAQEPQP
GTRVITRAVNDWILERLAQKSKPRMCSREEFIAKVRSNAALGAWSDEQNRWASAREAVVVPAFWALVDEVRERHLVGWCA
HCVYIMMGMREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNKDHWASRESSGGGVEGISLNYLGWHLKKLTTLNGG
LFYADDTAGWDTKGTNSDPEDEEQILRYMEGEHKQLATTIMQKAYHAKVVKVARPSRDGGCIMDVITRRDQRGSGQVVTY
ALNTLTNIKVQSTRMMEGEGVIEAEDAHNPRLLRVERWLKEHGEERLGRMLVSGDDCVVRPIDDRFGKALYFLNDMAKTR
KDMGEWEPSAGFSSWEEVPFCSHHFHELVMKDGRTLVVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLSYFH
RRDLRTLGFAISPAVPVDWVPTGRTTWSIHASGAWMTTEDMLDVWNRVWILDNPFMQNKERIMEWRDVPYLPKTQDMICS
SLVGRKERAEWAKNIWGAVEKVRKMIGPERFKDYLSCMDRHDLHWELKLESSII
>Q8JV21 ~~~~~~Genome polyprotein~~~
MAASKMNPVGNLLSTVSSTVGSLLQNPSVEEKEMDSDRVAASTTTNAGNLVQASVAPTMPVKPDFKNTDDFLSMSYRSTT
APTNPTKMVHLAHGTWTTNQHRQALVASITLPQAFWPNQDFPAWGQSRYFAAVRCGFHIQVQLNVNIGSAGCLIAAYMPK
TAHDHMNTYTFGSYTNLPHVLMNAATTSQADLYIPYVFNHNYARTDSDDLGGIYIWVWSALTVPSGSPTTVDVTIFGSLL
DLDFQCPRPPGADTVIYTQGKRTVRKTKTSKFKWVRNKIDIAEGPGAMNIANVLSTTGGQTIALVGERAFYDPRTAGAAV
RCKDLMEIARMPSVFLGESTEPDGRRGYFTWSHTISPVNWVFDDHIYLENMPNLRLFSSCYNYWRGSFVIKLTVYASTFN
KGRLRMAFFPNREGAYTQDEAQNAIFVVCDIGLNNTFEMTIPYTWGNWMRPTRGNSLGHLRIDVLNRLTYNSSSPNAVNC
ILQIKMGDDAMFMVPTTSNLVWQGLHSWGSEMDLVDSLDNPDEIQDNEEIQTQNVEAAQGEEAATEVGLRATENDGSLSE
QLNMSQPMFLNFKKHKVNIYAASHTKVDHIFGRAWAVGVFNTETAAIQKFDLHFPTSTHGALSRFFCFWTGELNIHILNV
STTNAFLKVAHTWFGTDSGIARTATLESNGTMIIPPNEQMTLCVPYYSEVPLRCVKGSDRNSAGLGSLFTQAVGRTISNR
VQIFVSFRCPNFFFPLPAPREATSRSILERVDEANAEELEAVLEARTPDAPLRLKFNPEDPLKQLREAAKAYFNIMHSDE
MDFAGGKFLNQCGDVETNPGPDIELVYKNRGFYKHYGVRFGGHIYHLNSQDILSTAITGKSDFIKEEDDGKWVHAMTAPL
DYFTEKYINSMVGSKHIFSATSNCETIARDLFPGRKEITQSKALGIIGVILLSASLLSLLAVPWDYSSLQTVYNQSIEGD
ASGLTLLSQRCMTFFSNTMCETFNNDLVKFIIKILVRLLCYIVLYCHAPNMLTTMCLGTLLVLDITTCEILSANTKALFQ
ALVDGDVKSLVWKIAENMQFAQSKDEQAEDMAATFNFASDMVNFVPMEQMRQEGWREFNDVSMSFRHVEWWLTMFKKVYN
VLKSIFAPSIEQKAVDWIDRNQEYIADVLDHASNIIIKMKDPKEQRRASTISEYFEVLKQLKPIVSLCMKVAPSTKFSSQ
VFRIYSEMMRVNVRVPANTDLTRLEPIGIWVSSEPGQGKSFFTHMLSTCLLKSCNLEGIYTNPTGSEFMDGYIGQDIHII
DDAGQNREEKDLALLCQCISSVPFTVPMADLTEKGTFYTSKIVIATTNKFDFTSMVLTDPAALERRFPFHLRIRAVASYS
RNNKLDVARSMAAMADGSCWEYSTDGGRAWKTLSMDELVKQITAVYTQRSDALMVWKRKLNTIRNEMSPGSSTGRIFEPL
EETLCALERRFGQLADSLKDNYHKTADELIEAIEDMMAPSQSPFACFAESYRPTIKYTASDKVKSWVKNHMNRWKEFVMR
NKGWFTLFSVLSSFLSILTLVYLHYKKEKKEEERQERAYNPQTAISKKGGKPKLSLVKTTNFVNEAPYMQDLEHCFAQTA
YISSPETQDIIHCAALSEDTILVYGHSQFYFNRYEDLRLHFKGAIFPIEGGKISQVTVNGQPMDLILVKIDKLPITFKNY
TKYYTTEVGKETLLIWNSEKGRLAMPVQCVAPAGPVETMEGTITHKTYSYKVASKKGMCGGLLVTRVHGTFKVLGMHIAG
NGQVARAAAVHFISNGAAGFMDQGVVVAKEKLQKPIYLPSKTALNPSPLNGVVPVKMEPAVLSPHDTRLEVIMPSVVKTA
AAKYRVNIFNPDFEIWERVVDELKSKFRTKLGIHKHVSFQKAVQGFSSLSSLDLSTSPGQKYVEKGMKKRDLLSTEPFWM
HPQLEGDVKDILGAVYSGKKPHTFFAAHLKDELRKKEKIAQGKTRCIEACSIDYVIAYRVVMSSLYEAIYQTPAQELGLA
VGMNPWTDWDPMINVLQPYNYGLDYSSYDGSLSEQLMRYGVEILAYCHEQPEAVMILHEPVINSQHLVMDEIWHVNGGMP
SGAPCTTVLNSICNLLVCTYLAYEQSLDIEVLPIVYGDDVIFSVSSPLDAEYLVQSAAQNFGMEVTSSDKSGPPKLLKMD
EIEFLKRTTKFFPGSTYKVGALSLDTMEQHIMWMKNLETFPEQLVSFENELVLHGKEIYDDYKNRFNPILNQWRVCMQDY
EVALHRMLRYVFD
>P31999 ~~~~~~Genome polyprotein~~~
MATLDNCTQVHHMFAYNREHGTNYTRNHFRRYLAAQRIGFYYDWDDDVYECPTCEAIYHSLDDIKNWHECDPPAFDLNDF
ITDARLKSAPVPDLGPVIIEIPKAEEKQELNFFAATPAPEVSQWKCRGLQFGSFTELETSEPVASAPEPKCEEPARTIAK
PEESVEQETRGDGKRLLQAQMEVDKAEQDLAFACLNASLKPRLEGRTTATIARRRDGCLVYKTKPSWSQRRRAKKTLKVD
TLACENPYIPAIVDKISIAGGSSASVMHEQQKPKTLHTTPSRKVATHYKRTVMNQQTLMAFINQVGTILLNAEKEFEVVG
CRKQKVTGKGTRHNGVRLVKLKTAHEEGHRRRVDIRIPNGLRPIVMRISARGGWHRTWTDSELSPGSSGYVLNSSKIIGK
FGLRRHSIFVVRGRVDGEVIDSQSKVTHSITHRMVQYSDVARNFWNGYSTCFMHNTPKDILHTCTSDFDVKECGTVAALL
TQTLFQFGKITCEKCAIEYKNLTRDELATRVNKEIDGTIISIQTQHPRFVHVLNFLRLIKQVLNAKNGNFGAFQETERII
GDRMDAPFSHVNKLNAIVIKGNQATSDEMAQASNHVLEIARYLKNRTENIQKGSLKSFRNKISGKAHLNPSLMCDNQLDK
NGGFEWGQRSYHAKRFFDGYFETIDPSDGYSKYTIRRNPNGHRKLAIGNLIVSTNFESHRRSMIGESIEDPGLTNQCVSK
EGDTFIYPCCCVTDEYGKPTLSEIKMPTKHHLVLGNAGDPKYVDLPKEAEGKMFVTKDGYCYINIFLAMLVDVPEDQAKD
FTKMAREIAVKQLGEWPSMMDVATACNILATFHPDTRRSELPRILVDHATKTFHVIDSYGSITTGFHILKANTVTQLVKF
AHESLESEMQHYRVGGEPDKAPRKPAGSVPTLGISDLRDLGVELENEEHSIRPNLQRLIKAIYRPRMMRSLLTEEPYLLI
LSIVSPGVLMALYNSGSLERTMHEFLQTDQRLSATAQILKHLAKKVSLAKTLTIQNAILEGGAGSLNEILDAPAGRSLSY
RLAKQTVEVMMARSDMDKELVDVGFSVLRDQKNELIEKSYLMDLEDSWHALPLCGKLSAMRASRRWRDTSTPEVIPTGAA
DLKGRYSISVGSVSKSAILHLKGICSGAVKRVRDKWVGVQVQGVKWLAKSVHYMIPELTNILNVGTLLLTLISLGVAFRN
LTGQFKEMKHKETLAKEEELRKRIRTYNSTYYEIHGKHADAKQITKFITHHDPKLLEVVEFYEGPEEEEVEHQAKREDQA
NLERIIAFTALVMMMFDSERSDCVYRSLSKLKSLVSTCEDDVRHQSVDEIIDLFDEKKETIDFEIEGKELYSSRVVDSTF
SKWWDNQLVRGNTMAHYRTEGHFMTFTRETAASVAAEIAHNEYRDILLQGGVGSGKSTGLPFHLHRKGGVLLIEPTRPLA
QNVYKQLGSSPFHLSPNLRMRGSCKFGSSQVTVATSGYALHFIANNAQSLKAYDFIIFDECHVLDASAMAFRCLLQEFEY
QGKIIKVSATPPGRKLDFKPMHMVDIATENELSIQQFVQGQGTGVNCDATKKGDNILVYVSSYNEVDMLSKMLNDKGYKV
TKVDGRTMKLGSVEVETVGTPQRKHFVVATNIIENGVTLDVDVVVDFGQKVVPILDSEHRMIRYTKKSITYGERIQRVGR
VGRNKAGSAIRIGSTEMGTEEIPASIATEAAFLCFTYGLPVMTSNVSTSVLGNCTVRQARTMQKFELSPFFMVDLVHHDG
TIHPAINSLLKQFKLKESDIKLSTLAIPNAVTTFWKSAREYNSLGARTTIDDAAKIPFMIKDVPEHLQEKLWETIQQYKG
DAGFGRCTSANACKIAYTLSVSPFMIPATINKIDALMAEERQKMEYFQTVTANTCTISNFSISSIGDMIRSRYSTNHSRE
NLQKLQAVRDTIINFECQAGTGDGGSFDMETAQKLAEEYGCIDVIYHQSKEALSKRLGLKGRWNQSLICKDLLVFCGVAI
GGTWMMFQSFKDGMADAVRHQGKGKRQRQKLRYRQARDNKVGIEVYGDDATMEHYFGAAYTEKGKKSGKTKGMGTKNRRF
VNMYGYNPEDFSFIRFLDPLTGKTMDEQVFSDISLVQDAFSKERLKLLSEGEIESEHMRNGIRAYLVKNLTTAALEIDMT
PHNSCQLGAKTNNIAGYVDREYELRQTGEARVVAPALIPKDNPITDEDIPVKHESKTLFRGLRDYNPIAAAICLLTNESD
GMKETMYGIGFGNTIITNQHLFRRNNGVLRVQSRHGEYVLPNTTQLKVLPCEGRDIMVIILTPDFPPFPQKLKFRPPIKG
EKICLVGSLFQDKSITSTVSETSVTTPVDNSFLWKHWITTKDGHCGLPLVSSNDGYIVGIHSATSSRQTQNYHAAMPEDF
HQTHLIDPASKSWVKHWKYNPDNMVWGGINLINSTPREPFKINKLVTDLFGDAVQFQSKQDEWFASQLKGNLKAVGKSTS
QLVTKHTVKGKCMMFELYLQTHEEEKEFFKPLMGAYQKSRLNREAFTKDIMKYSTPITVGIVDCDTFLKAEEGVIKRLER
LGFSGCEYVTDEEAIFQALNMKAAVGALYSGKKRDYFEGYGPEEKENILRESCKRLYTGKFGVWNGSLKSELRPMEKVMA
NKTRVFTAAPLDTLLAGKVCVDDFNNYFYSKNIEAPWTVGMTKFYGGWNELLTKLPDGWVYCDADGSQFDSSLSPFLINS
VLRIRLKFMEDWDLGEQMLKNLYTEIVYTAILTPDSTIVKKFKGNNSGQPSTVVDNTLMVVLAMTYTLHKLGFEDEEQDS
MCKYFVNGDDLIIAIKPEHESLLDQFQHCFKSLGLNYDFNSRTRKKEELWFMSHCGIKKDGIFIPKLEPERIVSILEWDR
SDQPVHRLEAICAAMIESWGYDKLTHEIRKFYKWCLEQAPYADLAKAGKAPYIAECALKRLYTSKEASEAELEKYMEAIR
SLVNDEDDDDMDEVYHQVDAKLDAGQGSKTDDKQKNSADPKDNIITEKGSGSGQMKKDDDINAGLHGKHTIPRTKAITQK
MKLPMIRGKVALNLDHLLEYEPNQRDISNTRATQKQYESWYDGVKNDYDVDDSGMQLILNGLMVWCIENGTSPNINGTWV
MMDGEEQVEYALKPIIEHAKPTFRQIMAHFSDAAEAYIEMRNKKKPYMPRYGRLRGLNDMGLARYAFDFYETTSATPNRA
REAHNQMKAAALVGTQNRLFGMDGGGSTQEENTERHTAADVNQNMHTLLGVRGLH
>P33515 ~~~~~~Genome polyprotein~~~
MKRKDLEARGKAPGRDSSTPFWGREGRRKDKDKGGESPSNRQVTLKTPIQSGRRAGKRQRVGLLGRLGVGWGSFLQEDIV
QALIHMALVLHALFASIDRRIRSLSRRVTALESRRTTGNPMTLAFILGFLTVLCGCVVIDMQVSTTRGTEIFEGETNRTD
YLHLLKLPADGCWSGILVTKKCPKVTDLAKDLESTDCGSTWTEFTLRYRRCVVKKREKRSREPPKADLLAEMEIIAFKTI
RENKTIFIVALLCVAIAKRWPTWVVILLAIGTWTTVKGEFVEPLYTLKAEQMTMLQTIVRPEEGYVVATPNGLLEFKTGP
AEIYGGQWLRELLADCHVNASYSTDVCPGGSQLNMADIMAKERVCSTQPYNRGWGTGCFKWGIGFVGTCVELHCDRGFNV
SSIARSAIVMNVTASFHSVSDTQQMVGDIPLTFRFAKLGNAAMTCRLESEQLLLDYYHVTGSSHEGLFLRSQVDSWPGVH
STASGRHGMEKVVVWGDARSNEILVKNVIEPSLSWEDAIATHGGFRDISFVCQIMLDKLVSGAFRDCPGPKISTFSQDGF
GYSGVVITTLTASSNETCSLSLTCHGCLLQSTKMIFLAGKTTSRAFVKCGNHTSTLLVGSTSVSIECALNPISQGWRLAR
HVVDRYRRFGVSGVAGVWQDLVGKFSVGAFFSNTALLVILVLAALIDKRIAFLLVLGGYFYYVRADLGCGIDTTRKTISC
GSGVFVWKHLGVGISNDHAVELEDYSFTDLYIKDMFSWTTKPCLICEDALQCVALRRAAFSAVGSMGSERVYVNDTLART
FKFSETAKRTISVTINLIQYKFSSYVAHGRAEGDLGLLPTMYGSYPEKEADKVIRIVASRPDIRRLCGKAVSFQFKFTGF
RRGLYGSNVQVEVSKNSSTECPTYLAGVAVKNGRTVITDGMFWMESIVLDGVAQITSLEMRQSHRCVWPREYTPDTLSDP
SDQALFIPPAWGGPISRVNHIIGYKTQTDFPWNVSDITLIEGPAPGTKVKVDSRCHGRMHAQVIGPNDTESWCCQSCTRI
VHFRVGDLLYYPMEIQLGTMSEASEPNSKIFEEPIGEEPEPTVDDILKRYGKANAQSDFRRVSQRAGVWFDRSLLNLLCL
AISLQLIGAKTRTSTLTRLFLTILAMALFGLPNLFSSVGLSAWVLLVASSSAQPQDLSMNLWIVLQTGSSAVLLLGYMIR
RKLAMVLGVHHLVTLMCVQFLFSAVDRYQKYLYGLLELMASVVLLSAYKSVLQALPPEVLCFSLVMGWKTALSLATVVFL
IFSLNAMYKYACQYHNPRNGYRDSGANLWFWTVSLASAGGIWAAEKAHQPTVAAVLAFTMVVLFLYMEQTNVSMELEFIS
AGETPEGVSTENDDGINIPDLKGRYGEDGIVVGAASSSGYLPELVFVFLLGFAVTSTSYFLGALYLLIATSTNLPVVIIR
MLRMKLTASNRSDDLLGLGGPVETDLQTSFQDIPNGVYRIVVRSLFGDRQRGAGFSKNGVFHTLMHVTRGEPVKWRGRVV
VPHSGSALRDVVSYGGPWQLDTPTTTEDLVLMACKPDKTIEYHRYRPGVMSIDGEPVMFISDDFGKGSSGSPFFINGEPV
GFYGFGFYVNGIYRSTVAGGKPTDVTESLNCDSTRRFVTWHPGKGKTRKVIVEETKKNYDSNQRTVILTPTRVVMAEVVE
ALNNSGMRSDKNLSYCTRNLITVACHATFTKFVLSHGAKKVRVAMIIMDECHFMDPMSIAARGILEHLHGQGTKLIYLSA
TPPGHAPDTGSNYAISDQSISFPTWLSPAWIGNVQKSVGAKKTILFVPSHNQANTLASAIPGSVPLHRANFSSNYAQAGD
AATALVISTDISEMGANLGVDLVIDTRRALRPLVDSATRVKLVETNITTSSMIQRRGRTGRREPGTYVYPIDSQTEENPV
SWVCWPEAQMILDQLGMTFMLEEAAYSQPPGRFTLVGEDRMRFLKLMDRDDIPIWLAWHWAEAGDRRHSALFQGAGTGKI
IENRFGKQEYRPQYVDDRFESIEWETRKVSIDFYMNCRGGPTLYEFFTVVDWTDIWRRTASALWDLSDVMNGEVRDRYTT
ERSLTVVMAFVLGVSIMLSCFIAVWALCFLFSLFRPKKATYEQMPSSDPLSGGVLVSTPSVLYCMGVPLGFCVVITLAMF
LVYPVLYKSIGNRSYMDSDLVKWVILGSCLICGVLAWEMRMFPNIRSDLMELVKAVKEPEEVVNSGPSFPSWEIAQGKGA
TMLDSLQVFFFITVLSTKFLYWFQENWTARMYAMKHPEMVSSIGGFRFDEIPFRAVLPSGFAIVAIASLPSVVVGLLAAG
VFMAIMYCQNKWNATPKILTALDARDQRHDRPTEITSRVPLENTRSIMYAFCLIFSLFWAFCTRSPGDFLRGSLVVGASM
WQILHPRSKIHDVMDFGSMVSAIGLLEMNYLFYRFMHIAARALGAVAPFNQFRALEKSTTIGLGMKWKMTLNALDGDAFT
RYKSRGVNETERGDYVSRGGLKLNEIISKYEWRPSGRVVDLGCGRGGWSQRAVMEETVSSALGFTIGGAEKENPQRFVTK
GYNLATLKTGVDVHRLTPFRCDTIMCDIGESDPSPIKEKTRTLKVLQLLENWLLVNPGAHFVCKILSPYSLEVLRKIESL
QHLYNGRLVRLSHSRNSSVEMYYISGARSNVVRTTYMTLAALMARFSRHLDSVVLPSPVLPKGTRADPAASVASMNTSDM
MDRVERLMNENRGTWFEDQQHPYKSFKYFGSFVTDDVKVGGQAVNPLVRKIMWPWETLTSVVGFSMTDVSTYSQQKVLRE
KVDTVIPPHPQHIRRVNRTITKHFIRLFKNRNLRPRILSKEEFVANVRNDAAVGSWSRDVPWRDVQEAIQDQCFWDLVGK
ERALHLQGKCEMCIYNTMGKKEKKPSLAGEAKGSRTIWYMWLGSRFLEFEALGFLNADHWVSREHFPGGVGGVGVNYFGY
YLKDIASRGKYLIADDIAGWDTKISEEDLEDEEALLTALTEDPYHRALMAATMRLAYQNIVAMFPRTHSKYGSGTVMDVV
GRRDQRGSGQVVTYALNTITNGKVQVARVLESEGLLQADESVLDAWLEKHLEEALGNMVIAGDDVVVSTDNRDFSSALEY
LELTGKTRKNVPQGAPSRMESNWEKVEFCSHHYHEMSLKDGRIIIAPCRHENEVLGRSRLQKGGVVSISESACMAKAYAQ
MWALYYFHRRDLRLGFIAISSAVPTNWFPLGRTSWSVHQYHEWMTTDDMLRVWNDVWVHNNPWMLNKESIESWDDIPYLH
KKQDITCGSLIGVKERATWAREIENSVISVRRIIDAETGVLNTYKDELSVMSRYRRGNDVI
>Q91TW9 ~~~ORF1~~~Genome polyprotein~~~
MSSFLRGGHLLSGVESLTPTTHRDTITAPIVESLATPLRRSLERYPWSIPKEFHSFLHTCGVDISGFGHAAHPHPVHKTI
ETHLLLDVWPNYARGPSDVMFIKPEKFAKLQSRQPNFAHLINYRLVPKDTTRYPSTSTNLPDCETVFMHDALMYYTPGQI
ADLFFLCPQLQKIYASVVVPAESSFTHLSLHPEIYRFRFQGSDLVYEPEGNPAANYTQPRSALDWLQTTGFTVGHEFFSV
TLLDSFGPVHSLLIQRGRPPVFQAEDIASFRVPDAVALPAPASLHQDLRHRLVPRKVYDALFNYVRAVRLRVTDPAGFVR
TQVGKPEYSWVTSSAWDNLQHFALQTAAVRPNTSHPLFQSPFARLSHWLRTHTWALWCLASPSASVSAWATASALGRLLP
LHTDRLRLFGFDIIGRRFWPRLPFHGPEPRFLWETHPACRPPVLFADSAFECQILAGLANRCSPSPFWSRLFPTASPPSW
VAYSALALAAVPLAALALRWFYGPDSPQALHDQYHATFHPDPWTLDLPRRLRRFERESFMRTGSAPLPQSLPPPEGSLLP
VEPPPVPSDPEPALEPSPPAASVPAPAPALASEPPPSPESVAPSRRRRRARRAAARAPSPSPALLGADLRFGDLPPVSAW
DSDPEISKLGESTQGTVFAVTPGPRAPEPDTARLDADPSASGPVMEFRELQKGAYIEPTGAFLTRARNSVSSSIPYPTRA
ACLLVAVSQATGLPTRTLWAALCANLPDSVLDDGSLATLGLTTDHFAVLARIFSLRCRFVSEHGDVELGLHDATSRFTIR
HTPGHFELVADNFSLPALVGASSVPGADLAEACKRFVAPDRTVLPFRDVHIHRTDVRRAKNLISNMKNGFDGVMAQANPL
DPKSARERFLMLDSCLDIAAPRRVRLIHIAGFAGCGKSWPISHLLRTPAFRVFKLAVPTTELRDEWKALMDPRDQDKWRF
GTWESSLLKTARVLVIDEVYKMPRGYLDLAIHADAAIQFVILLGDPIQGEYHSTHPSSSNARLSPEHRYLRPYVDFYCFW
SRRIPQNVARVLDVPTTSTEMGFARYSQQFPFSGKILISARDSAKSLADCGYHAVTIASSQGSTIAGPAYVHLDNHSRRL
SHQHSLVAITRSKSGIVFTGDKAAADGTSSANLLFSAVLLDRRLSVRSLFSALLPCCPFVTEPPTSRAVLLRGAGYGIAR
PLRARDAPPLGPDYVGDVILDSSAPILGDGSANAPQVSTHFLPETRRPLHFDIPSARHQVADHPLAPDHSACAIEPVYPG
ESFESLASLFLPPTDAESKETYFRGEMSNQFPHLDKPFELGAQTSSLLAPLHNSKHDPTLLPASIGKRLRFRHSEAPYVI
APRDEILGSLLYEAWCRAYHRSPRDVEPFDPDLYAECINLNEFAQLSSKTQATIMANANRSDPDWRWSAVRIFAKTQHKV
NEGSLFGSWKACQTLALMHDAVVLLLGPVKKYQRFFDQRDRPSTLYVHAGHTPFEMADWCRAHLTPAVKLANDYTAFDQS
QHGEAVVFERYKMNRLSIPAELVDLHVYLKTNVSTQFGPLTCMRLTGEPGTYDDNTDYNIAVLHLEYAVGSTPLMVSGDD
SLLDSEPPVRDQWSAIAPMLALTFKKERGRYATFCGYYVGFTGAVRSPPALFAKLMIAVDDGSISDKLIAYLTEFTVGHS
SGDAFWTILPVEAVPYQSACFDFFCRRAPAQAKVMLRLGEAPESLLSLAFEGLKWASHSVYALMNSSHRRQLLHSSRRPR
SLPEDPEVSQLQGELLHQFQSLHLPLRGGHMPNPLAAPFRLLQQSSSLGPTYAVAPIARAPQVPPPSMADNATQVGPVPP
RDDRVDRQPPLPDPPRVLETAPSHFLDLPFQWKVTDFTGYAAYHGTDDLVASAVLTTLCAPYRHAELLYVEISVAPCPPS
FSKPIMFTVVWTPATLSPRDGKETDYYGGRQITVGGPVMLSSTTAVPADLARMNPFIKSSVSYNDTPRWTMSVPAVTGGD
TKIPLATAFVRGIVRVRAPSGAATPSA
>P05769 ~~~~~~Genome polyprotein~~~
MSKKPGGPGKPRVVNMLKRGIPRVFPLVGVKRVVMNLLDGRGPIRFVLALLAFFRFTALAPTKALMRRWKSVNKTTAMKH
LTSFKKELGTLIDVVNKRGKKQKKRGGSETSVLMLIFMLIGFAAALKLSTFQGKIMMTVNATDIADVIAIPTPKGPNQCW
IRAIDIGFMCDDTITYECPKLESGNDPEDIDCWCDKQAVYVNYGRCTRARHSKRSRRSITVQTHGESTLVNKKDAWLDST
KATRYLTKTENWIIRNPGYALVAVVLGWMLGSNTGQKVIFTVLLLLVAPAYSFNCLGMSSRDFIEGASGATWVDLVLEGD
SCITIMAADKPTLDIRMMNIEATNLALVRNYCYAATVSDVSTVSNCPTTGESHNTKRADHNYLCKRGVTDRGWGNGCGLF
GKGSIDTCAKFTCSNSAAGRLILPEDIKYEVGVFVHGSTDSTSHGNYSTQIGANQAVRFTISPNAPAITAKMGDYGEVTV
ECEPRSGLNTEAYYVMTIGTKHFLVHREWFNDLLLPWTSPASTEWRNREILVEFEEPHATKQSVVALGSQEGALHQALAG
AIPVEFSSSTLKLTSGHLKCRVKMEKLKLKGTTYGMCTEKFTFSKNPADTGHGTVVLELQYTGSDGPCKIPISSVASLND
MTPVGRMVTANPYVASSTANAKVLVEIEPPFGDSYIVVGRGDKQINHHWHKEGSSIGKAFSTTLKGAQRLAALGDTAWDF
GSVGGVFNSIGKAVHQVFGGAFRTLFGGMSWISPGLLGALLLWMGVNARDKSIALAFLATGGVLLFLATNVHADTGCAID
ITRRELKCGSGIFIHNDVEAWIDRYKYLPETPKQLAKVVENAHKSGICGIRSVNRFEHQMWESVRDELNALLKENAIDLS
VVVEKQKGMYRAAPNRLRLTVEELDIGWKAWGKSLLFAAELANSTFVVDGPETAECPNSKRAWNSFEIEDFGFGITSTRG
WLKLREENTSECDSTIIGTAVKGNHAVHSDLSYWIESGLNGTWKLERAIFGEVKSCTWPETHTLWGDAVEETELIIPVTL
AGPRSKHNRREGYKVQVQGPWDEEDIKLDFDYCPGTTVTVSEHCGKRGPSVRTTTDSGKLVTDWCCRSCTLPPLRFTTAS
GCWYGMEIRPMKHDESTLVKSRVQAFNGDMIDPFQLGLLVMFLATQEVLRKRWTARLTLPAAVGALLVLLLGGITYTDLV
RYLILVGSAFAESNNGGDVIHLALIAVFKVQPAFLVASLTRSRWTNQENLVLVLGAAFFQMAASDLELTIPGLLNSAATA
WMVLRAMAFPSTSAIAMPMLAMLAPGMRMLHLDTYRIVLLLIGICSLLNERRRSVEKKKGAVLIGLALTSTGYFSPTIMA
AGLMICNPNKKRGWPATEVLTAVGLMFAIVGGLAELDIDSMSVPFTIAGLMLVSYVISGKATDMWLERAADVSWEAGAAI
TGTSERLDVQLDDDGDFHLLNDPGVPWKIWVLRMTCLSVAAITPRAILPSAFGYWLTLKYTKRGGVFWDTPSPKVYPKGD
TTPGVYRIMARGILGRYQAGVGVMHEGVFHTLWHTTRGAAIMSGEGRLTPYWGNVKEDRVTYGGPWKLDQKWNGVDDVQM
IVVEPGKPAINVQTKPGIFKTAHGEIGAVSLDYPIGTSGSPIVNSNGEIIGLYGNGVILGNGAYVSAIVQGERVEEPVPE
AYNPEMLKKRQLTVLDLHPGAGKTRRILPQIIKDAIQKRLRTAVLAPTRVVAAEMAEALRGLPVRYLTPAVQREHSGNEI
VDVMCHATLTHRLMSPLRVPNYNLFVMDEAHFTDPASIAARGYIATRVEAGEAAAIFMTATPPGTSDPFPDTNSPVHDVS
SEIPDRAWSSGFEWITDYAGKTVWFVASVKMSNEIAQCLQRAGKRVIQLNRKSYDTEYPKCKNGDWDFVITTDISEMGAN
FGASRVIDCRKSVKPTILDEGEGRVILSVPSAITSASAAQRRGRVGRNPSQIGDEYHYGGGTSEDDTMLAHWTEAKILLD
NIHLPNGLVAQLYGPERDKTYTMDGEYRLRGEERKTFLELIKTADLPVWLAYKVASNGIQYNDRKWCFDGPRSNIILEDN
NEVEIITRIGERKVLKPRWLDARVYSDHQSLKWFKDFAAGKRSAIGFFEVLGRMPEHFAGKTREALDTMYLVATSEKGGK
AHRMALEELPDALETITLIAALGVMTAGFFLLMMQRKGIGKLGLGALVLVVATFFLWMSDVSGTKIAGVLLLALLMMVVL
IPEPEKQRSQTDNQLAVFLICVLLVVGLVAANEYGMLERTKTDIRNLFGKSLIEENEVHIPPFDFFTLDLKPATAWALYG
GSTVVLTPLIKHLVTSQYVTTSLASINAQAGSLFTLPKGIPFTDFDLSVALVFLGCWGQVTLTTLIMATILVTLHYGYLL
PGWQAEALRAAQKRTAAGIMKNAVVDGIVATDVPELERTTPQMQKRLGQILLVLASVAAVCVNPRITTIREAGILCTAAA
LTLWDNNASAAWNSTTATGLCHVMRGSWIAGASIAWTLIKNAEKPAFKRGRAGGRTLGEQWKEKLNAMGKEEFFSYRKEA
ILEVDRTEARRARREGNKVGGHPVSRGTAKLRWLVERRFVQPIGKVVDLGCGRGGWSYYAATMKNVQEVRGYTKGGPGHE
EPMLMQSYGWNIVTMKSGVDVFYKPSEISDTLLCDIGESSPSAEIEEQRTLRILEMVSDWLSRGPKEFCIKILCPYMPKV
IEKLESLQRRFGGGLVRVPLSRNSNHEMYWVSGASGNIVHAVNMTSQVLIGRMDKKIWKGPKYEEDVNLGSGTRAVGKGV
QHTDYKRIKSRIEKLKEEYAATWHTDDNHPYRTWTYHGSYEVKPSGSASTLVNGVVRLLSKPWDAITGVTTMAMTDTTPF
GQQRVFKEKVDTKAPEPPQGVKTVMDETTNWLWAYLARNKKARLCTREEFVKKVNSHAALGAMFEEQNQWKNAREAVEDP
KFWEMVDEERECHLRGECRTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARFLEFEALGFLNEDHWMSRENSGGGVEG
AGIQKLGYILRDVAQKPGGKIYADDTAGWDTRITQADLENEAKVLELMEGEQRTLARAIIELTYRHKVVKVMRPAAGGKT
VMDVISREDQRGSGQVVTYALNTFTNIAVQLVRLMEAEAVIGPDDIESIERKKKFAVRTWLFENAEERVQRMAVSGDDCV
VKPLDDRFSTALHFLNAMSKVRKDIQEWKPSQGWYDWQQVPFCSNHFQEVIMKDGRTLVVPCRGQDELIGRARISPGSGW
NVRDTACLAKAYAQMWLVLYFHRRDLRLMANAICSSVPVDWVPTGRTTWSIHGKGEWMTTEDMLSVWNRVWILENEWMED
KTTVSDWTEVPYVGKREDIWCGSLIGTRTRATWAENIYAAINQVRSVIGKEKYVDYVQSLRRYEETHVSEDRVL
>Q83883 ~~~ORF1~~~Genome polyprotein~~~
MMMASKDVVPTAASSENANNNSSIKSRLLARLKGSGGATSPPNSIKITNQDMALGLIGQVPAPKATSVDVPKQQRDRPPR
TVAEVQQNLRWTERPQDQNVKTWDELDHTTKQQILDEHAEWFDAGGLGPSTLPTSHERYTHENDEGHQVKWSAREGVDLG
ISGLTTVSGPEWNMCPLPPVDQRSTTPATEPTIGDMIEFYEGHIYHYAIYIGQGKTVGVHSPQAAFSITRITIQPISAWW
RVCYVPQPKQRLTYDQLKELENEPWPYAAVTNNCFEFCCQVMCLEDTWLQRKLISSGRFYHPTQDWSRDTPEFQQDSKLE
MVRDAVLAAINGLVSRPFKDLLGKLKPLNVLNLLSNCDWTFMGVVEMVVLLLELFGIFWNPPDVSNFIASLLPDFHLQGP
EDLARDLVPIVLGGIGLAIGFTRDKVSKMMKNAVDGLRAATQLGQYGLEIFSLLKKYFFGGDQTEKTLKDIESAVIDMEV
LSSTSVTQLVRDKQSARAYMAILDNEEEKARKLSVRNADPHVVSSTNALISRISMARAALAKAQAEMTSRMRPVVIMMCG
PPGIGKTKAAEHLAKRLANEIRPGGKVGLVPREAVDHWDGYHGEEVMLWDDYGMTKIQEDCNKLQAIADSAPLTLNCDRI
ENKGMQFVSDAIVITTNAPGPAPVDFVNLGPVCRRVDFLVYCTAPEVEHTRKVSPGDTTALKDCFKPDFSHLKMELAPQG
GFDNQGNTPFGKGVMKPTTINRLLIQAVALTMERQDEFQLQGPTYDFDTDRVAAFTRMARANGLGLISMASLGKKLRSVT
TIEGLKNALSGYKISKCSIQWQSRVYIIESDGASVQIKEDKQALTPLQQTINTASLAITRLKAARAVAYASCFQSAITTI
LQMAGSALVINRAVKRMFGTRTAAMALEGPGKEHNCRVHKAKEAGKGPIGHDDMVERFGLCETEEEESEDQIQMVPSDAV
PEGKNKGKTKKGRGRKNNYNAFSRRGLSDEEYEEYKKIREEKNGNYSIQEYLEDRQRYEEELAEVQAGGDGGIGETEMEI
RHRVFYKSKSKKHQQEQRRQLGLVTGSDIRKRKPIDWTPPKNEWADDDREVDYNEKINFEAPPTLWSRVTKFGSGWGFWV
SPTVFITTTHVVPTGVKEFFGEPLSSIAIHQAGEFTQFRFSKKMRPDLTGMVLEEGCPEGTVCSVLIKRDSGELLPLAVR
MGAIASMRIQGRLVHGQSGMLLTGANAKGMDLGTIPGDCGAPYVHKRGNDWVVCGVHAAATKSGNTVVCAVQAGEGETAL
EGGDKGHYAGHEIVRYGSGPALSTKTKFWRSSPEPLPPGVYEPAYLGGKDPRVQNGPSLQQVLRDQLKPFADPRGRMPEP
GLLEAAVETVTSMLEQTMDTPSPWSYADACQSLDKTTSSGYPHHKRKNDDWNGTTFVGELGEQAAHANNMYENAKHMKPI
YTAALKDELVKPEKIYQKVKKRLLWGADLGTVVRAARAFGPFCDAIKSHVIKLPIKVGMNTIEDGPLIYAEHAKYKNHFD
ADYTAWDSTQNRQIMTESFSIMSRLTASPELAEVVAQDLLAPSEMDVGDYVIRVKEGLPSGFPCTSQVNSINHWIITLCA
LSEATGLSPDVVQSMSYFSFYGDDEIVSTDIDFDPARLTQILKEYGLKPTRPDKTEGPIQVRKNVDGLVFLRRTISRDAA
GFQGRLDRASIERQIFWTRGPNHSDPSETLVPHTQRKIQLISLLGEASLHGEKFYRKISSKVIHEIKTGGLEMYVPGWQA
MFRWMRFHDLGLWTGDRDLLPEFVNDDGV
>Q7T6D2 ~~~~~~Genome polyprotein~~~
MAGKAILKGKGGGPPRRVSKETAKKTRQRVVQMPNGLVLKRIMEILWHAMVGTARSPLLKSFWKVVPLKQAMAALRKIKK
AVSTLMIGLQKRGKRRSTTDWTGWLLVAMLLSIALAATVRKEGDGTTVIRAEGKDAATQVRVENGTCVILATDMGAWCED
SLSYECVTIDQGEEPVDVDCFCRNVDRVYLEYGRCGKQEGTRSRRSVLIPSHAQKDLTGRGQRWLEGDTIRSHLTRVEGW
VWKNKSLTLAVVVIVWMTVESAVTRIVIVSALLCLAPAYASRCTHLENRDFVTGTQGTTRVTLVLELGGCVTITAEGKPS
MDVWLDSIYQENPAKTREYCLHAKLSNTKVAARCPAMGPATLDEEHQSGTVCKRDQSDRGWGNHCGLFGKGSIVTCVKAS
CEAKKKATGHVYDANKIVYTVKVEPHTGNYVAANETHSGRKTALFTVSSEKTILTMGEYGDVSLMCRVASGVDLAQTVVL
ELDKTAEHLPTAWQVHRDWFNDLALPWKHEGMVGWNNAERLVEFGVPHAVKMDVYNLGDQTGVLLKSLAGAPLAHIEGTK
YHLKSGHVTCEVGLEKLKMKGLTYTMCDKAKFTWKRAPTDSGHDTVVMEVAFSGTKPCRIPVRAVAHGSPDVDVAMLITP
NPTIENNGGGFIEMQLPPGDNIIYVGELKHQWFQKGSSIGRVFQKTRKGIERLTVLGEHAWDFGSTGGFLSSIGKALHTV
LGGAFNSVFGGVGFLPRILLGISLAWLGLNMRNPTMSMSFLLAGGLVLTMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYAFYPETPAALASALKEMVEEGDCGIVPQNRLEMAMWRSSVSELNLALAEGDANLTVVVDKHDPTDYRGGVPG
LLKKGKDMKISWKSWGQSMIWSVPEAPRRFLVGTEGSSECPLAKRRTGVFTVAEFGMGLRTKVFLDFRQEITRECDTGVM
GAAVKNGIAVHTDQSLWMKSIRNETGTYIVELLVTDLRNCSWPASHTIDNADVVDSELFLPASLAGPRSWYNRIPGYSEQ
VRGPWKYTPIKITREECPGTKVAIDASCDKRGASVRSTSESGKIIPEWCCRKCTLPPVTFRTGTDCWYAMEIRPVHDQGG
LVRSMVVADNGELLSEGGIPGIVAVFVVLEYIIRKRPSAGLTVVWGGVVVLALLVTGMVTLQSMLRYVIAVGVTFHLELG
PEIVALMLLQAVFELRVGLLGAFVLRRSLTTREVVTIYFLLLVLELGLPSANLEALWGWADALAMGAMIFRACTAEGKTG
LGLLLVALMTQQNAVIVHQGLVIFLSVASACSVWKLLRGQREQKGLSWIVPLAGRLGGKGSGIRLLAFWELASRRDRRSF
SEPLTVVGVMLTLASGMMRHTSQEALCALAAASFLLLMLVLGTRKMQLVAEWSGCVEWHPDLADEGGEISLRVRQDALGN
FHLTELEKEERMMAFWLLAGLTASALHWTGILVVMGLWTMSEMLRSARRSDLVFSGQSGSERGSQPFEVRDGVYRILSPG
LLWGHRQVGVGFGSKGVLHTMWHVTRGAAIFIDNAVAGPYWADVKEDVVCYGGAWSLEEKWKGEKVQVHAFPPGRAHEVH
QCQPGELVLDTGRRIGAIPIDLAKGTSGSPILNAQGAVVGLYGNGLRTNETYVSSIAQGEVEKSRPNLPQAVVGTGWTSK
GTITVLDMHPGSGKTHRVLPELIRQCIDKRLRTLVLAPTRVVLKEMERALSGKRVRFHSPAVGDQQTGNAIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYSMAKENKCALVLMTATPPGKSEPFPESNGAITSEERQIPEGEWR
DGFDWITEYEGRTAWFVPSIAKGGVIARTLRQKGKSVICLNSKTFEKDYSRVRDEKPDFVVTTDISEMGANLDVSRVIDG
RTNIKPEEVDGKVELTGTRRVTTASAAQRRGRVGRHDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFY
GPEQDKMPEVAGHFRLTEERRKHFRHLLTHCDFTPWLAWHVAANVSNVTSRSWTWEGPEENAVDEANGDLVTFKSPNGAE
RTLRPVWRDARMFKEGRDIREFVAYASGRRSLGDMLTGMSGVPELLRHRCMSAMDVFYTLLYEEPGSRAMKMAERDAPEA
FLTMVEMVVLGLATLGAVWCLVLRTSISRMMLGTMVLLVSLALLWAGGVGYGSMAGVALVFYTLLTVLQPEAGKQRSSDD
NKLAYFLLTLCSLAGLVAANEMGFLEKTKADLSAVLWSEREEPRVWSEWTNIDIQPAKSWGTYVLVVSLFTPYIIHQLQT
RIQQLVNSAVASGAQAMRDLGGGTPFFGVAGHVLTLGVVSLVGATPTSLVVGVGLAAFHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGDGEVKPALYERKMSLILAMILCFMSVVLNRTVPAVTEASAVGLAAAGQLIRPEADTLWTMPV
ACGLSGVVRGSLWGFLPLGHRLWLRTSGTRRGGSEGDTLGDLWKRRLNNCTKEEFFAYRRTGILETERDKARELLKKGET
NMGLAVSRGTAKLAWLEERGYVNLKGEVVDLGCGRGGWSYYAASRPAVMGVKAYTIGGKGHEVPRMVTSLGWNLIKFRAG
MNVFTMQPHRADTVMCDIGESSPDAAIEGERTRKVILLMEQWKNRNPTAACVFKVLAPYRPEVIEALHRFQLQWGGGLVR
TPFSRNSTHEMYYSTAISGNIVNSVNVQSRKLLARFGDQRGPIRVPEMDLGVGTRCVVLAEDKVKEHDVQERIKALQEQY
SDTWHVDREHPYRTWQYWGSYRTAPTGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKDKVDTKAQEPQP
GTRVIMRAVNDWMFERLARRSRPRMCSREEFIAKVKANAALGAWSDEQNKWASAKEAVEDPAFWHLVDEERERHLKGRCA
HCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNEDHWASRESSGAGVEGISLNYLGWHLKKLSLLEGG
LFYADDTAGWDTRVTNADLEDEEQILRYMEGEHKQLAATVMQKAYHAKVVKVARPSRDGGCIMDVITRRDQRGSGQVVTY
ALNTLTNIKVQLIRMMEGEGVIEATDSHNPRLLRVERWLRDHGEERLGRMLISGDDCVVRPIDDRFSKALYFLNDMAKTR
KDIGEWEHSAGFSSWEEVPFCSHHFHELVMKDGRTLVVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLSYFH
RRDLRTLGFAICSAVPKDWVPTGRTTWSVHASGAWMTTENMLDVWNRVWILDNPFMENKEKVGEWRDIPYLPKSQDMMCS
SLVGRRERAEWAKNIWGAVEKVRKMLGPERYSDYLSCMDRHELHWELKVESSII
>Q01500 ~~~~~~Genome polyprotein~~~
MATSVIQFGSFVCNLPKSQPLCTTVHCPKQSMSTNIVRPSDPFAELEKHLEPYLQKRMDATIRQTKGGTLVYKHMSEAKR
ARKLRKKQREEEEVRLFMNAAPYIVSNITIGGGEVPSKMEEVSIKRPLNKTPSRKIKKSLTPVTFRDGHMNKFLRELRDC
ATRNSMTVHLIGKRKTELAFKRRASLNAVYATLHHMRGVDRKRDIVLEEWMNDYVLNLSKVSTWGSLFHAESLKRGDSGL
ILNARALRGKFGRCSRGFFIVRGKSDGVVLDARSKLSMATVTHMEQYSTPEAFWSGLEKKWSVVRKPTAHTCKPTYSVSN
CGEVAAIIAQALFPCHKLTCGECSKEICDLTSNECVQELYKNTSLALERMNNLHPEFQHIVKVLSVVRQLTEASNHGTET
FDEIFKMIGSKTQSPFTHLNKLNEFMLKGNENTSGEWLTARQHLRELVRFQKNRTDNIKKGDLASFRNKLSARAQYNLYL
SCDNQLDKNASFLWGQREYHARRFFLNFFQQIDPSKGYLAYEDRTIPNGSRKLAIGNLIVPLDLAEFRKRMNGIDTQQPP
IGKYCTSQLDGNFVYPCCCTTLDDGQPIRSAVYAPTKKHLVVGNTGDTKYINLPKGDTEMLYIALDGYCYINIYLAMLVN
ISEEEAKDFTKKVRDIFMPKLGKWPTLMDLATTCAQLRIFHPDVHDAELPRILVDHNTQTCHVVDSYGSISTGYHILKAA
TVSQLVLFADDNLESEIKHYRVGGIVENHKVQIDNQPSRCGVSEFHAIRMLIKGIYRPSVMYELLSEEPYLLVFSILSPS
ILIAMYNDRAFELAVQIWLEKEQSIPLIATILTNLAAKVSVATTLVQQLQLIELSADQLLNVTCDGFRVSFAYQSALTLL
TRMRDQAKANSELISGGFNEYDQDLAWTLEKNYQGLLHDQWKELSSLEKFRYYWSSRKRKTRLRSNIKSRSSPVASAISS
LSLKPFMGKVFSHMKAGAVCTKQGTKNFIDARCLGISTYFVGSLMRKFPSAKVLLSSLFVLGALLNITHAANRIIIDNRI
SREHAAALELYRKEDTCHELYTALERKLGEKPTWDEYCSYVAKINPAMLEFIKDSYDEKQVVHQRSTEDLKKVEHIIAFV
TLAIMLFDSERSDCVFKTLNKFKGVVCSLGSGVRHQSLDDFVSTMDEKNFVVDFELNDSVQRKNLTTEITFESWWDEQVA
RGFTIPHYRTEGRFMEFTRATAAKVASDISISSERDFLIRGAVGSGKSTGLPHHLSTYGRVLLIEPTRPLAENVFKQLSG
GPFFLKPTMRMRGNSVFGSSPISVMTSGFALHFFANNITQLQEIQFIIIDECHVMDASSMAFRSLIHTYHTNCKVLKVSA
TPPGREVEFTTQFPVKLVVEDSLSFKTFVESQGTGSNCDMIQYGNNLLVYVASYNEVDQLSKLLVAREFNVTKVDGRTMK
HGELEIVTRGTKSKPHFVVATNIIENGVTLDIDVVIDFGMKVSPFLDVDNRSVAYNKVSISYGERIQRLGRVGRIQKGTA
LRIGHTEKGLIEIPQMISTEAALYCFAYNLPVMSSGVSTSMIKNCTIPQVRTMHTFELSPFFMYNFVSHDGTMHPVVHET
LKRYKLRDSVIPLSESSIPYRASSDWITAGDYRRIGVKLDIPDETRIAFHIKTFHRKFTNNLWESVLKYKASAAFPTLRS
SSITKIAYTLSTDLYAIPRTLAVVESLLEDERTKQYQFKSLIDNGCSSMFSVVGISNALRAKYSKDHTVENINKLETVKA
QLKEFHNLNGSGDELNLIKRFESLQFVHHQSKSSLAKALGLRGVWNKSLIVRDAIIAAGVACGGAWLLYTWFTAKMSEVS
HQGRSKTKRIQALKFRKARDKRAGFEIDNNEDTIEEYFGSAYTKKGKGKGTTVGMGRTNRRFINMYGFEPGQFSYIKFVD
PLTGAQMEENVYADIVDVQEKFGDIRRQMILDDELDRRQTDVHNTIHAYLIKDWSNKALKVDLTPHNPLRVSDKASAIMK
FPEREGELRQTGQAVEVDVCDIPKEVVKHEAKTLMRGLRDYNPIAQTVCKLTVKSELGETSTYGLGFGGLIIANHHLFKS
FNGSLEVKSHHGVFRVPNLMAISVLPLKGRDMIIIKMPKDFPVFPQRLKFREPASTDRVCLIGSNFQERYISTTVSEISA
THPVPRSTFWKHWISTDDGHCGLPIVSTTDGFILGLHSLANNRNSENYYTAFDSDFEMKILRSGENTEWVKNWKYNPDTV
LWGPLQLTKGTPSGMFKTTKMIEDLLAFKSESVREQAHTSSWMLEVLKENLKAIAYMKSQLVTKHVVKGECMMFKQYLQE
NPRANEFFQPKMWAYGKSMLNKEAYIKDIMKYSKVIDVGVVDCDRHLRKLSLELLYTQIHGFRKCSYITDEEEIFKALNI
TTAVGAMYGGKKKEYFEKFTTEDKAEILRQSCLRLYTGKLGVWEWALKAELRSKEKIEANKTRTFTAAPIDTLLGGKVCV
DDLNNQFYSKNIECCWTVGMTKFYGGWDKLLTALPAGWIYCDADGSQFDSSLTPYLINAVLTIRYAFMEDWDIGYKMLQN
LYTEIIYTPISTPDGTIVKKFRGNNSGQPSTVVDNSLMVVLAMHYAFVREGIAFEEIDSICKFFVNGDDLLIAVNPERES
LLDTLSNHFSDLGLNYDFSSRTRNKSELWFMSHCGISVEGTYIPKLEEERIVSILQWDRAELPEYRLEAICAAMIESWGY
PQLTHEIRRFYSWLIEKNPYADLASEGKAPYISELALKKLYLNQDVQMMSFRSYLKYFADADEEFECGTYEVRHQSSSRS
DTLDAGEEKKKNKEVATVSDGMGKKEVESTRDSDVNAGTVGTFTIPRIKSITEKMRMPKQKRKGVLNLAHLLEYKPSQVD
ISNTRSTQAQFDNWYCEVMKAYDLQEEAMGTVMNGLMVWCIENGTSPNISGTWTMMDGDEQVEFPLKPVIENAKPTFRQI
MAHFSDVAEAYIEMRNKQEPYMPRYGLVRNLRDMGLARYAFDFYEVTSRTSTRAREAHIQMKAAALKSAQTRLFGLDGGI
GTQGENTERHTTEDVSPDMHTLLGVREM
>Q9QEJ5 ~~~ORF1~~~Genome polyprotein~~~
MANCRPLPIGQLPNRIFDTPRLTPGWVWACTSEATFKLEWLQDPVVIRPPDVFVAQGVVDDFFRPKRVLQGDPQLIAQVL
LGDANGPLVGPVSMQQLTSLLHEVSQALSDHKHPLANRYTRASLQRYADTLSNYIPLVDILTGPKDLTPRDVLEQLAAGR
EWECVPDSALKKVFRDMWQYICEGCDSVYIKLQDVKRKMPHIDTTVLKQFIITLTDTISMATALDTKTWLAHILGWLKPT
CLVMIMQQHVNSPQGWAATLTALAELYYGIMPLTETLGSVASWVTDKFADMATSTWGKFKSWWDSLYTPQAGNDLIILGG
VVGLVYFMVFGDAPTQMFTKKLMRVCGFITSTVAAIKAAMWIVDYFKQREHEHQVRITLARWAALQEVIKQNRCAGLSEV
TKLKECCEVLLNEVTELMYKLGASPLAGLIRSTSDVIQTTINELAQLMAYDTQRKPPAMVVFGGPPGIGKTRLVEALAKQ
LGEVSHFTMTVDHYDTYTGNTVAIWDEFDVDSKQAFIEATIGIVNCAPYPLNCDRPEAKGRVFTSKYVLATTNCPTPVMP
DHPRAMAFWRRITFIDVTAPTIEQWLVDNPGRKAPTSLFKDDFSHLQCSVRGYTAYDEKGNTLSGKVARARYVSVNNLLD
LIKEKYNSEAADVKHLWFTVPQAIHKQARDIILGWLRFHSYPNTVADNIPLSEVRDPTCFGYVVISDVDPPRHVAEHVAH
IEVESILRTDIVGLLREGGGGLFRALKVKSAPRNCIINKVMMQAHHTTLQVLTSQEPHPPNLPRPRRLVFVESPIDIISA
LRHHVGFCTIPGIVKLITSGVGLGVENLGNFLQSIAGNVRFPLQSECSLLRTPSGDVLFYTSGQAAVWATPARFPIVTPG
EASVGKEVCSESSWWDILKALFSTLVVAFGPIATLVLTAHNLAYLNTRENTLSEAKGKNKRGRGARRAIALRDDEYDEWQ
DIIRDWRKEMTVQQFLDLKERALSGASDPDSQRYNAWLELRAKRLSAGAYQHAVVDIIGKSGHRREVIRTQVMRAPREPK
GDTYDSEGRGYVVPMTAQEKHTGWAVHIGNGRLVTCTHVANMCDRVAEVEFKVAETDRDTCIITAPLGHLPSVALGNGPP
AFYTTNFHPIRVLDEGSWDTTTTRVTGWRVVINNGTTTAPGDCGQPYLNARRQLVGVHAATSTCGVKKLVSRVQTKKTAK
ATFPWKGLPVTTMPDAGGLPTGTRYHRSIAWPKLLPEETHAPAPYGVNDPRHPFSQHQMIANNLQPYINTPVALDQTLLQ
RAVKHTKGYLDQIIGTHRSPNLTYAAAVESMAHDTACGPNLPGRKKDYMTDQGEPIGPLKQMLEEAWDMAHRGVPRRHEY
KLALKDELRPIEKNDQGKKRLLWGCDAGVSMIANAVFKPVTERLVDTVPMHPVAVGICMDSPQIEQMNQALTGRVLYCLD
YSKWDSTQNPAVTCASVDILASYAEDTPLSSAAIATLCSPAVGRLDDIGLTVTSGLPSGMPFTSVINSVNHMIYFAMAVL
EAYEEFKVPYMGNIFDNETIYTYGDDCVYGLTPATASIMPVVVKNLTSYGLVPTAADKSQSIEPTDTPVFLKRTFSQTPF
GLRALLDETSLARQCYWVKANRTTDLFEPAAVDVDIRKNQLEVMLAYASQHPRSVFDKLAGMAEVTASAEGYQLVNVNWA
NAVATYNAWYGGTDGGRAPTNEDEEPEVFVMEAPAPTRSVASNPEGTQNSNESRPVQPAGPMPVAAAQALEMAVATGQIN
DTIPSVVRETFSTYTNVTWTTRQPTGTLLARMSLGPGLNPYTLHLSAMWAGWGGSFEIKVIISGSGMYAGKLLCALIPPG
VDPSAVDQPGAFPHALVDARITDGVTFTLGDVRAVDYHETGVGGAIASLALYVYQPLINPFETAVSAAMVTIETRPGPDF
GFTLLKPPNQAMEVGFDPRSLLPRTARTLRGNRFGRPITAVVIVGVAQQINRHFSAEGTTLGWSTAPIGPCVARVNGKHT
DNTGRAVFQLGPLSNGPLYPNIINHYPDVAASTIFNTGTAVNDNTTGGGGPMVIFNDVGDVVEDVAYQMRFIASHATSQS
PTLIDQINATSMAVCSFGNSRADLNQNQLNVGIELTYTCGNTAINGIVTSFMDRQYTFGPQGPNNIMLWVESVLGTHTGN
NTVYSSQPDTVSAALQGQPFNIPDGYMAVWNVNADSADFQIGLRRDGYFVTNGAIGTRMVISEDTTFSFNGMYTLTTPLI
GPSGTSGRSIHSSR
>O41174 ~~~~~~Genome polyprotein~~~
MGMQMSKNTAGSHTTVTQASGGSHINYTNINYYSHSASASQNKQDITQDPSKFTQPMVDIMKESAVPLKSPSAEACGYSD
RVAQLTLGNSTITTQEAANITVAYGEWPSYLSDLDATAVDKTTKPGVSCDRFYTLPGKKWEATTKGWEWKLPDALTELGV
FGQNCQFHFLYRCGWSIHVQCNATKFHQGTLLVVAVPDHQLGTTYQPEFDNVMPGKAGREVKYPYNFEDGTSLANSLIYP
HQWINLRTNNSATLVLPYANAIPMDSPIRHSSWSLLVIPVVPLACATGTTPFVGITITLAPMFSEFSGLRRAIAQGIPTT
NTPGSYQFLTTDEDSSACILPDFTPTQEIHIPGEVKNLQALCQVESLLEINNVDGKTGIERLRLEVSTQSELDRQLFALK
VSFTEGEIMSKTLCGVMCSYYTQWSGSLEITFMFTGSFMTTGKLLLAYTPPGGSAPASREDAMLGTHVIWDFGLQSSITL
VVPWICGGYYRDVNRANNYYAAGYVTGWFQTNMVIPPDFPSTAYILCFLAAQPNFSLRILKDRPDITQTAALQAPVETAL
NSAISSVIAGITAQDTQPSSHNISTSETPALQAAETGASSNASDEGMMETRHVVNTNTVSETSIESFYGRCGLVSIKEIA
DNKQVEKWLVNFNEFVQLRAKIELFTYMRFDIEFTLVATFTKGNSASQHPVQVQVMYLPPEQLLQLQQDSYAWQSAANPS
AIFSANTVPARFSVPFVGTANAYTIMYDGYNVFGSNRPSADYGMINSSHMGSMAFRAISQLQATEKVKFMDLCQVKDVRA
WCPRAPRMAPYKYIRNPVFETQDRIVPNRNNITTTGAFGQQSGAIYVGNYKIMNRHLATHEDWENVEWEDYNRDILVART
TAHGADKLARCHCNTGVYYCKSRNKHYPVTSRVQASIGSRLVSTTQLDTRPICSLPSGISEPGDCGGILRCQHGVIGIVT
AGGQGVVGFADVRDLFWVEHEAMEQGLTDYIQQLGNSFGQGFTAEITNYASQLSEMLIGADGMVERCLQTFVKVISAVVI
ATRSQGDVPTILATLALIGCDGSPWRWLKRQFCGIFKIPYVEKQGDDWLKKFTSYVNAFKGLDWVAEKIMKFIDWMKNKL
IPQARERQEFTTNLKTLPLLEAQVATLEHSCPTTEQQETIFGNIQYLAHHCRRYAPLYAAEARRVYALEKRILGYIQFKS
KQRIEPVCLLIHGTAGTGKSLATSIIGRKLAEYEHSEVYAIPPDSDHFDGYQQQAVVVMDDLNQNPDGKDMVAFCQMVST
VPYHVPMAAIEEKGMLFTSSYVLASTNSGSIHPPTVSNSKALSRRFAFDVDIEVSEHYKTHNGTLDVVNATQRCEDCCPA
NFKTCMPLICGEAYQLVDRRNGMRYSIDTMISAMRAEWKRRNQVGLCYVRLFQGPPQFKPLKISVDPEIPAPPAIADLLA
SVDSEEVREYCKKKGWIVEVPVTATTLERNVSIATTILSSLVLLTSVITLVYLVYRLFAGYQGPYTGLPNAKPKPPVLRE
VRAQGPLMDFGVGMMKKNIVTVRTGAGEFTGLGVHDHVLVLPKHSHPAEIVVVDGKETPVEDAYNLTDEQGVSLELTLVT
LKRNEKFRDIRAMIPENPCGTNEAVVCVNTSNFPNAFLPVGKVEYYGYLNLAGSPTHRTMMYNFPTKAGQCGGVVLSTGK
VLGIHIGGNGAQGFCAALKRSYFTKPQGKIDWVEPSKKHGFPVINAPSKTKLEPSVFFDVFEGVKEPAALHPKDPRLEVN
LEEALFSKYTGNVDIEMPEEMKEAVDHYANQLLALDIPTEPLSMEEAIYGTEGLEALDLTTSAGYPYVTMGIKKRDILNK
ETRDVKKMQECIDKYGLNLPMVTYIKDELRSKEKVKKGKSRLIEASSLNDSVAMRCYFGNLYKAFHQNPGTLTGCAVGCD
PDTFWSKIPVMMDGELFGFDYTAYDASLSPLMFQALQMVLEKIGFGEGKHFIDNLCYSHHLFRDKYYFVKGGMPSGCSGT
SIFNSMINNIIIRTVVLQTYKGIELDQLKIIAYGDDVIASYPYRIDPAELAKAGAKLGLHMTPPDKSETYVDLDWTNVTF
LKRNFVPDEKYPFLVHPVMPMKEIYESIRWTRDARNTQDHVRSLCLLAWHNGRKEYEEFCRKIRSVPVGRALHLPSYSSL
LREWYEKF
>P03300 ~~~~~~Genome polyprotein~~~
MGAQVSSQKVGAHENSNRAYGGSTINYTTINYYRDSASNAASKQDFSQDPSKFTEPIKDVLIKTAPMLNSPNIEACGYSD
RVLQLTLGNSTITTQEAANSVVAYGRWPEYLRDSEANPVDQPTEPDVAACRFYTLDTVSWTKESRGWWWKLPDALRDMGL
FGQNMYYHYLGRSGYTVHVQCNASKFHQGALGVFAVPEMCLAGDSNTTTMHTSYQNANPGEKGGTFTGTFTPDNNQTSPA
RRFCPVDYLLGNGTLLGNAFVFPHQIINLRTNNCATLVLPYVNSLSIDSMVKHNNWGIAILPLAPLNFASESSPEIPITL
TIAPMCCEFNGLRNITLPRLQGLPVMNTPGSNQYLTADNFQSPCALPEFDVTPPIDIPGEVKNMMELAEIDTMIPFDLSA
TKKNTMEMYRVRLSDKPHTDDPILCLSLSPASDPRLSHTMLGEILNYYTHWAGSLKFTFLFCGFMMATGKLLVSYAPPGA
DPPKKRKEAMLGTHVIWDIGLQSSCTMVVPWISNTTYRQTIDDSFTEGGYISVFYQTRIVVPLSTPREMDILGFVSACND
FSVRLLRDTTHIEQKALAQGLGQMLESMIDNTVRETVGAATSRDALPNTEASGPTHSKEIPALTAVETGATNPLVPSDTV
QTRHVVQHRSRSESSIESFFARGACVTIMTVDNPASTTNKDKLFAVWKITYKDTVQLRRKLEFFTYSRFDMELTFVVTAN
FTETNNGHALNQVYQIMYVPPGAPVPEKWDDYTWQTSSNPSIFYTYGTAPARISVPYVGISNAYSHFYDGFSKVPLKDQS
AALGDSLYGAASLNDFGILAVRVVNDHNPTKVTSKIRVYLKPKHIRVWCPRPPRAVAYYGPGVDYKDGTLTPLSTKDLTT
YGFGHQNKAVYTAGYKICNYHLATQDDLQNAVNVMWSRDLLVTESRAQGTDSIARCNCNAGVYYCESRRKYYPVSFVGPT
FQYMEANNYYPARYQSHMLIGHGFASPGDCGGILRCHHGVIGIITAGGEGLVAFSDIRDLYAYEEEAMEQGITNYIESLG
AAFGSGFTQQISDKITELTNMVTSTITEKLLKNLIKIISSLVIITRNYEDTTTVLATLALLGCDASPWQWLRKKACDVLE
IPYVIKQGDSWLKKFTEACNAAKGLEWVSNKISKFIDWLKEKIIPQARDKLEFVTKLRQLEMLENQISTIHQSCPSQEHQ
EILFNNVRWLSIQSKRFAPLYAVEAKRIQKLEHTINNYIQFKSKHRIEPVCLLVHGSPGTGKSVATNLIARAIAERENTS
TYSLPPDPSHFDGYKQQGVVIMDDLNQNPDGADMKLFCQMVSTVEFIPPMASLEEKGILFTSNYVLASTNSSRISPPTVA
HSDALARRFAFDMDIQVMNEYSRDGKLNMAMATEMCKNCHQPANFKRCCPLVCGKAIQLMDKSSRVRYSIDQITTMIINE
RNRRSNIGNCMEALFQGPLQYKDLKIDIKTSPPPECINDLLQAVDSQEVRDYCEKKGWIVNITSQVQTERNINRAMTILQ
AVTTFAAVAGVVYVMYKLFAGHQGAYTGLPNKKPNVPTIRTAKVQGPGFDYAVAMAKRNIVTATTSKGEFTMLGVHDNVA
ILPTHASPGESIVIDGKEVEILDAKALEDQAGTNLEITIITLKRNEKFRDIRPHIPTQITETNDGVLIVNTSKYPNMYVP
VGAVTEQGYLNLGGRQTARTLMYNFPTRAGQCGGVITCTGKVIGMHVGGNGSHGFAAALKRSYFTQSQGEIQWMRPSKEV
GYPIINAPSKTKLEPSAFHYVFEGVKEPAVLTKNDPRLKTDFEEAIFSKYVGNKITEVDEYMKEAVDHYAGQLMSLDINT
EQMCLEDAMYGTDGLEALDLSTSAGYPYVAMGKKKRDILNKQTRDTKEMQKLLDTYGINLPLVTYVKDELRSKTKVEQGK
SRLIEASSLNDSVAMRMAFGNLYAAFHKNPGVITGSAVGCDPDLFWSKIPVLMEEKLFAFDYTGYDASLSPAWFEALKMV
LEKIGFGDRVDYIDYLNHSHHLYKNKTYCVKGGMPSGCSGTSIFNSMINNLIIRTLLLKTYKGIDLDHLKMIAYGDDVIA
SYPHEVDASLLAQSGKDYGLTMTPADKSATFETVTWENVTFLKRFFRADEKYPFLIHPVMPMKEIHESIRWTKDPRNTQD
HVRSLCLLAWHNGEEEYNKFLAKIRSVPIGRALLLPEYSTLYRRWLDSF
>P03301 ~~~~~~Genome polyprotein~~~
MGAQVSSQKVGAHENSNRAYGGSTINYTTINYYRDSASNAASKQDFSQDPSKFTEPIKDVLIKTSPMLNSPNIEACGYSD
RVLQLTLGNSTITTQEAANSVVAYGRWPEYLRDSEANPVDQPTEPDVAACRFYTLDTVSWTKESRGWWWKLPDALRDMGL
FGQNMYYHYLGRSGYTVHVQCNASKFHQGALGVFAVPEMCLAGDSNTTTMHTSYQNANPGEKGGTFTGTFTPDDNQTSPA
RRFCPVDYLFGNGTLLGNAFVFPHQIINLRTNNCATLVLPYVNSLSIDSMVKHNNWGIAILPLAPLNFASESSPEIPITL
TIAPMCCEFNGLRNITLPRLQGLPVMNTPGSNQYLTADNFQSPCALPEFDVTPPIDIPGEVKNMMELAEIDTMIPFDLSA
KKKNTMEMYRVRLSDKPHTDDPILCLSLSPASDPRLSHTMLGEILNYYTHWAGSLKFTFLFCGSMMATGKLLVSYAPPGA
DPPKKRKEAMLGTHVIWDIGLQSSCTMVVPWISNTTYRQTIDDSFTEGGYISVFYQTRIVVPLSTPREMDILGFVSACND
FSVRLMRDTTHIEQKALAQGLGQMLESMIDNTVRETVGAATSRDALPNTEASGPAHSKEIPALTAVETGATNPLVPSDTV
QTRHVVQHRSRSESSIESFFARGACVAIITVDNSASTKNKDKLFTVWKITYKDTVQLRRKLEFFTYSRFDMEFTFVVTAN
FTETNNGHALNQVYQIMYVPPGAPVPEKWDDYTWQTSSNPSIFYTYGTAPARISVPYVGISNAYSHFYDGFSKVPLKDQS
AALGDSLYGAASLNDFGILAVRVVNDHNPTKVTSKIRVYLKPKHIRVWCPRPPRAVAYYGPGVDYKDGTLTPLSTKDLTT
YGFGHQNKAVYTAGYKICNYHLATQEDLQNAVNVMWNRDLLVTESRAQGTDSIARCNCNAGVYYCESRRKYYPVSFVGPT
FQYMEANNYYPARYQSHMLIGHGFASPGDCGGILRCHHGVIGIITAGGEGLVAFTDIRDLYAYEEEAMEQGITNYIESLG
AAFGSGFTQQIGDKITELTNMVTSTITEKLLKNLIKIISSLVIITRNYEDTTTVLATLALLGCDASPWQWLRKKACDVLE
IPYVTKQGDSWLKKFTEACNAAKGLEWVSNKISKFIDWLKEKIIPQARDKLEFVTKLRQLEMLENQISTIHQSCPSQEHQ
EILFNNVRWLSIQSKRFAPLYAVEAKRIQKLEHTINNYIQFKSKHRIEPVCLLVHGSPGTGKSVATNLIARAIAERENTS
TYSLPPDPSHFDGYKQQGVVIMDDLNQNPDGADMKLFCQMVSTVEFIPPMASLEEKGILFTSNYVLASTNSSRISPPTVA
HSDALARRFAFDMDIQVMNEYSRDGKLNMAMATEMCKNCHQPANFKRCCPLVCGKAIQLMDKSSRVRYSIDQITTMIINE
RNRRSNIGNCMEALFQGPLQYKDLKIDIKTSPPPECINDLLQAVDSQEVRDYCEKKGWIVNITSQVQTERNINRAMTILQ
AVTTFAAVAGVVYVMYKLFAGHQGAYTGLPNKKPNVPTIRTAKVQGPGFDYAVAMAKRNIVTATTSKGEFTMLGVHDNVA
ILPTHASPGESIVIDGKEVEILDAKALEDQAGTNLEITIITLKRNEKFRDIRPHIPTQITETNDGVLIVNTSKYPNMYVP
VGAVTEQGYLNLGGRQTARTLMYNFPTRAGQCGGVITCTGKVIGMHVGGNGSHGFAAALKRSYFTQSQGEIQWMRPSKEV
GYPIINAPSKTKLEPSAFHYVFEGVKEPAVLTKNDPRLKTNFEEAIFSKYVGNKITEVDEHMKEAVDHYAGQLMSLDINT
EQMCLEDAMYGTDGLEALDLSTSAGYPYVAMGKKKRDILNKQTRDTKEMQKLLDTYGINLPLVTYVKDELRSKTKVEQGK
SRLIEASSLNDSVAMRMAFGNLYAAFHKNPGVITGSAVGCDPDLFWSKIPVLMEEKLFAFDYTGYDASLSPAWFEALEMV
LEKIGFGDRVDYIDYLNHSHHLYKNKTYCVKGGMPSGCSGTSIFNSMINNLIIRTLLLKTYKGIDLDHLKMIAYGDDVIA
SYPHEVDASLLAQSGKDYGLTMTPADKSAIFETVTWENVTFLKRFFRADEKYPFLIHPVMPMKEIHESIRWTKDPRNTQD
HVRSLCLLAWHNGEEEYNKFLAKIRSVPIGRALLLPEYSTLYRRWLDSF
>P06210 ~~~~~~Genome polyprotein~~~
MGAQVSSQKVGAHENSNRAYGGSTINYTTINYYRDSASNAASKQDFAQDPSKFTEPIKDVLIKTAPTLNSPNIEACGYSD
RVMQLTLGNSTITTQEAANSVVAYGRWPEYIKDSEANPVDQPTEPDVAACRFYTLDTVTWRKESRGWWWKLPDALKDMGL
FGQNMFYHYLGRAGYTVHVQCNASKFHQGALGVFAVPEMCLAGDSTTHMFTKYENANPGEKGGEFKGSFTLDTNATNPAR
NFCPVDYLFGSGVLAGNAFVYPHQIINLRTNNCATLVLPYVNSLSIDSMTKHNNWGIAILPLAPLDFATESSTEIPITLT
IAPMCCEFNGLRNITVPRTQGLPVLNTPGSNQYLTADNYQSPCAIPEFDVTPPIDIPGEVRNMMELAEIDTMIPLNLTNQ
RKNTMDMYRVELNDAAHSDTPILCLSLSPASDPRLAHTMLGEILNYYTHWAGSLKFTFLFCGSMMATGKLLVSYAPPGAE
APKSRKEAMLGTHVIWDIGLQSSCTMVVPWISNTTYRQTINDSFTEGGYISMFYQTRVVVPLSTPRKMDILGFVSACNDF
SVRLLRDTTHISQEAMPQGLGDLIEGVVEGVTRNALTPLTPANNLPDTQSSGPAHSKETPALTAVETGATNPLVPSDTVQ
TRHVIQKRTRSESTVESFFARGACVAIIEVDNDAPTKRASKLFSVWKITYKDTVQLRRKLEFFTYSRFDMEFTFVVTSNY
TDANNGHALNQVYQIMYIPPGAPIPGKWNDYTWQTSSNPSVFYTYGAPPARISVPYVGIANAYSHFYDGFAKVPLAGQAS
TEGDSLYGAASLNDFGSLAVRVVNDHNPTKLTSKIRVYMKPKHVRVWCPRPPRAVPYYGPGVDYKDGLAPLPGKGLTTYG
FGHQNKAVYTAGYKICNYHLATQEDLQNAVNIMWIRDLLVVESKAQGIDSIARCNCHTGVYYCESRRKYYPVSFTGPTFQ
YMEANEYYPARYQSHMLIGHGFASPGDCGGILRCQHGVIGIITAGGEGLVAFSDIRDLYAYEEEAMEQGVSNYIESLGAA
FGSGFTQQIGNKISELTSMVTSTITEKLLKNLIKIISSLVIITRNYEDTTTVLATLALLGCDASPWQWLKKKACDILEIP
YIMRQGDSWLKKFTEACNAAKGLEWVSNKISKFIDWLKEKIIPQARDKLEFVTKLKQLEMLENQIATIHQSCPSQEHQEI
LFNNVRWLSIQSKRFAPLYAVEAKRIQKLEHTINNYVQFKSKHRIEPVCLLVHGSPGTGKSVATNLIARAIAEKENTSTY
SLPPDPSHFDGYKQQGVVIMDDLNQNPDGADMKLFCQMVSTVEFIPPMASLEEKGILFTSNYVLASTNSSRITPPTVAHS
DALARRFAFDMDIQIMSEYSRDGKLNMAMATEMCKNCHHPANFKRCCPLVCGKAIQLMDKSSRVRYSIDQITTMIINERN
RRSSIGNCMEALFQGPLQYKDLKIDIKTTPPPECINDLLQAVDSQEVRDYCEKKGWIVDITSQVQTERNINRAMTILQAV
TTFAAVAGVVYVMYKLFAGHQGAYTGLPNKRPNVPTIRTAKVQGPGFDYAVAMAKRNILTATTIKGEFTMLGVHDNVAIL
PTHASPGETIVIDGKEVEVLDAKALEDQAGTNLEITIVTLKRNEKFRDIRPHIPTQITETNDGVLIVNTSKYPNMYVPVG
AVTEQGYLNLSGRQTARTLMYNFPTRAGQCGGVITCTGKVIGMHVGGNGSHGFAAALKRSYFTQSQGEIQWMRPSKEVGY
PVINAPSKTKLEPSAFHYVFEGVKEPAVLTKSDPRLKTDFEEAIFSKYVGNKITEVDEYMKEAVDHYAGQLMSLDINTEQ
MCLEDAMYGTDGLEALDLSTSAGYPYVAMGKKKRDILNKQTRDTKEMQRLLDTYGINLPLVTYVKDELRSKTKVEQGKSR
LIEASSLNDSVAMRMAFGNLYAAFHKNPGVVTGSAVGCDPDLFWSKIPVLMEEKLFAFDYTGYDASLSPAWFEALKMVLE
KIGFGDRVDYIDYLNHSHHLYKNKTYCVKGGMPSGCSGTSIFNSMINNLIIRTLLLKTYKGIDLDHLKMIAYGDDVIASY
PHEVDASLLAQSGKDYGLTMTPADKSATFETVTWENVTFLKRFFRADEKYPFLVHPVMPMKEIHESIRWTKDPRNTQDHV
RSLCLLAWHNGEEEYNKFLAKIRSVPIGRALLLPEYSTLYRRWLDSF
>P03302 ~~~~~~Genome polyprotein~~~
MGAQVSSQKVGAHENSNRAYGGSTINYTTINYYKDSASNAASKQDYSQDPSKFTEPLKDVLIKTAPALNSPNVEACGYSD
RVLQLTLGNSTITTQEAANSVVAYGRWPEFIRDDEANPVDQPTEPDVATCRFYTLDTVMWGKESKGWWWKLPDALRDMGL
FGQNMYYHYLGRSGYTVHVQCNASKFHQGALGVFAIPEYCLAGDSDKQRYTSYANANPGERGGKFYSQFNKDNAVTSPKR
EFCPVDYLLGCGVLLGNAFVYPHQIINLRTNNSATIVLPYVNALAIDSMVKHNNWGIAILPLSPLDFAQDSSVEIPITVT
IAPMCSEFNGLRNVTAPKFQGLPVLNTPGSNQYLTSDNHQSPCAIPEFDVTPPIDIPGEVKNMMELAEIDTMIPLNLEST
KRNTMDMYRVTLSDSADLSQPILCLSLSPASDPRLSHTMLGEVLNYYTHWAGSLKFTFLFCGSMMATGKILVAYAPPGAQ
PPTSRKEAMLGTHVIWDLGLQSSCTMVVPWISNVTYRQTTQDSFTEGGYISMFYQTRIVVPLSTPKSMSMLGFVSACNDF
SVRLLRDTTHISQSALPQGIEDLISEVAQGALTLSLPKQQDSLPDTKASGPAHSKEVPALTAVETGATNPLAPSDTVQTR
HVVQRRSRSESTIESFFARGACVAIIEVDNEQPTTRAQKLFAMWRITYKDTVQLRRKLEFFTYSRFDMEFTFVVTANFTN
ANNGHALNQVYQIMYIPPGAPTPKSWDDYTWQTSSNPSIFYTYGAAPARISVPYVGLANAYSHFYDGFAKVPLKTDANDQ
IGDSLYSAMTVDDFGVLAVRVVNDHNPTKVTSKVRIYMKPKHVRVWCPRPPRAVPYYGPGVDYKNNLDPLSEKGLTTYGF
GHQNKAVYTAGYKICNYHLATKEDLQNTVSIMWNRDLLVVESKAQGTDSIARCNCNAGVYYCESRRKYYPVSFVGPTFQY
MEANDYYPARYQSHMLIGHGFASPGDCGGILRCQHGVIGIVTAGGEGLVAFSDIRDLYAYEEEAMEQGISNYIESLGAAF
GSGFTQQIGDKISELTSMVTSTITEKLLKNLIKIISSLVIITRNYEDTTTVLATLALLGCDVSPWQWLKKKACDTLEIPY
VIRQGDSWLKKFTEACNAAKGLEWVSNKISKFIDWLRERIIPQARDKLEFVTKLKQLEMLENQISTIHQSCPSQEHQEIL
FNNVRWLSIQSKRFAPLYALEAKRIQKLEHTINNYIQFKSKHRIEPVCLLVHGSPGTGKSVATNLIARAIAEKENTSTYS
LPPDPSHFDGYKQQGVVIMDDLNQNPDGADMKLFCQMVSTVEFIPPMASLEEKGILFTSNYVLASTNSSRITPPTVAHSD
ALARRFAFDMDIQVMGEYSRDGKLNMAMATETCKDCHQPANFKRCCPLVCGKAIQLMDKSSRVRYSVDQITTMIINERNR
RSNIGNCMEALFQGPLQYKDLKIDIKTRPPPECINDLLQAVDSQEVRDYCEKKGWIVNITSQVQTERNINRAMTILQAVT
TFAAVAGVVYVMYKLFAGHQGAYTGLPNKRPNVPTIRAAKVQGPGFDYAVAMAKRNIVTATTSKGEFTMLGVHDNVAILP
THASPGESIVIDGKEVEILDAKALEDQAGTNLEITIITLKRNEKFRDIRQHIPTQITETNDGVLIVNTSKYPNMYVPVGA
VTEQGYLNLGGRQTARILMYNFPTRAGQCGGVITCTGKVIGMHVGGNGSHGFAAALKRSYFTQSQGEIQWMRPSKEAGYP
IINAPTKTKLEPSAFHYVFEGVKEPAVLTKNDPRLKTDFEEAIFSKYVGNKITEVDEYMKEAVDHYAGQLMSLDISTEQM
CLEDAMYGTDGLEALDLSTSAGYPYVAMGKKKRDILNKQTRDTKEMQRLLDAYGINLPLVTYVKDELRSKTKVEQGKSRL
IEASSLNDSVAMRMAFGNLYAAFHRNPGVVTGSAVGCDPDLFWSKIPVLMEEKLFAFDYTGYDASLSPAWFEALKMVLEK
IGFGDRVDYIDYLNHSHHLYKNKIYCVKGGMPSGCSGTSIFNSMINNLIIRTLLLKTYKGIDLDHLKMIAYGDDVIASYP
HEVDASLLAQSGKDYGLTMTPADKSATFETVTWENVTFLKRFFRADEKYPFLIHPVMPMKEIHESIRWTKDPRNTQDHVR
SLCLLAWHNGEEEYNKFLAKIRSVPIGRALLLPEYSTLYRRWLDSF
>Q04538 ~~~~~~Genome polyprotein~~~
MMTTSKGKGGGPPRRKLKVTANKSRPATSPMPKGFVLSRMLGILWHAVTGTARPPVLKMFWKTVPLRQAEAVLKKIKRVI
GNLMQSLHMRGRRRSGVDWTWIFLTMALMTMAMATTIHRDREGYMVMRASGRDAASQVRVQNGTCVILATDMGEWCEDSI
TYSCVTIDQEEEPVDVDCFCRGVDRVKLEYGRCGRQAGSRGKRSVVIPTHAQKDMVGRGHAWLKGDNIRDHVTRVEGWMW
KNKLLTAAIVALAWLMVDSWMARVTVILLALSLGPVYATRCTHLENRDFVTGTQGTTRVSLVLELGGCVTITAEGKPSID
VWLEDIFQESPAETREYCLHAKLTNTKVEARCPTTGPATLPEEHQANMVCKRDQSDRGWGNHCGFFGKGSIVACAKFECE
EAKKAVGHVYDSTKITYVVKVEPHTGDYLAANETNSNRKSAQFTVASEKVILRLGDYGDVSLTCKVASGIDVAQTVVMSL
DSSKDHLPSAWQVHRDWFEDLALPWKHKDNQDWNSVEKLVEFGPPHAVKMDVFNLGDQTAVLLKSLAGVPLASVEGQKYH
LKSGHVTCDVGLEKLKLKGTTYSMCDKAKFKWKRVPVDSGHDTVVMEVSYTGSDKPCRIPVRAVAHGVPAVNVAMLITPN
PTIETNGGGFIEMQLPPGDNIIYVGDLSQQWFQKGSTIGRMFEKTRRGLERLSVVGEHAWDFGSVGGVLSSVGKAIHTVL
GGAFNTLFGGVGFIPKMLLGVALVWLGLNARNPTMSMTFLAVGALTLMMTMGVGADYGCAIDPERMEIRCGEGLVVWKEV
SEWYDGYAYHPESPDTLAQALREAFERGVCGVVPQNRLEMAMWRSTAPELNLVLSEGEANLTIVVDKTDPADYRGGTPMV
LKKTGKESKVSWKSWGKSILWSVPDSPRRMMMGVDGVGECPLYRRATGVFTVAEFGVGLRTKVFLDLRGEASKECDTGVM
GAAVKNGKAIHTDQSMWMSSFRNDTGTYIHELILTDLRNCTWPASHTIDNDGVLDSHLFLPVTLAGPRSKYNRIPGYSEQ
VRGPWDQTPLRVVRDHCPGTSVRIDSHCDKRGASVRSTTESGKIIPEWCCRACELPPVTFRSGTDCWYAMEIRPVHSQGG
LVRSMVVADNGALLSEGGVPGLVAVFVLMEFLLRRRPGSVTSILWGGILMLGLLVTGLVRVEEIVRYVIAVGVTFHLELG
PETMVLVMLQAVFNMRTCYLMGFLVKRVITTREVVTVYFLLLVLEMGIPEMNFGHLWEWADALAMGLLIIKASAMEDRRG
LGFLLAGLMTQRHLVAVHHGLMVFLTVALAVVGRNIYNGQKERKGLCFTVPLASLLGGSGSGLRMLALWECLGGRGRRSL
SEPLTVVGVMLAMASGLLRHSSQEALLALSAGSFLILMLILGTRRLQLTAEWAGVVEWNPELVNEGGEVSLKVRQDAMGN
LHLTEVEREERRLALWLVFGLLASAYHWSGILVTMGAWTVYELFSSTRRTDLVFSGQLPDQGEKRSFDIKEGVYRIYAPG
LFWGYRQIGVGYGTKGVLHTMWHVTRGAALSVEGATSGPYWADVREDVVCYGGAWGLDKKWGGEVVQVHAFPPDSGHKIH
QCQPGKLNLEGGRVLGAIPIDLPRGTSGSPIINAQGDVLGLYGNGLKSNDVYISSIAQGNVEKSRPEMPLAVQGGKWTSK
GSITVLDMHPGSGKTHRVLPELIRECIDKRLRTVVLAPTRVVLKEMERALQGKRVKFHSAAVDNASSSSGAIVDVMCHAT
YVNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYSLAKENRCALVLMTATPPGKSEAFPESKGAIVSEEKPIPEGEW
RDGFDWITEFEGRTAWFVPSIAKGGAIARTLRQKGKSVICLNSKTFDKDYGRVHEEKPDFVVTTDISEMGANLDVNRVID
GRTNIKPEEIDGKVELIGTRRVTTASAAQRRGRVGRHEGRTDLYVYSGQCDDDDSSLVQWKEAQILLDNITTVRGPVATF
YGPEQGKMLEVAGHFRLTEEKRKHFRHLLTNCDFTPWLAWHVAANTACVTDRKWTWEGPDENAIDGPGGELVTFRSPNGA
ERKLKPIWKDSRMFREGRDVADFIQYASGRRSAVDILTGLGGVPDLLRLRCTAAWDVVYTLLNETPGSRAMKMAERDAPE
AMLTLLEVAVLGIATLGVVWCFIVRTSVSRMVLGTLVLAVALILLWLGGMDYGTMAGVALIFYLLLTVLQPEPGKQRSGE
DNRLAFLLIGLGSVVGLVAANELGYLEQTKTDISGLFRREDQGGMVWDAWTNIDIQPARSWGTYVLIVSLFTPYMLHQLQ
TKIQRLVNSSVAAGTQAMRDLGGGTPFFGVAGHVVALGVTSLVGATPTSLALGVALAALHLAVVTSGLEAELTQRAHRAF
FSAMVKNPMVDGEIINPIPDGDPKPALYERKMSLFLAIGLCIAAVALNRTAAAMTEAGAVAVAALGQLLRPEEESWWTMP
MACGMAGLVRGSLWGLLPVLHRIWLRTQGARRGGAEGSTLGDIWKQRLNSCTKEEFFAYRRTGVMETNRDQARELLRRGE
TNMGLAVSRGCAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPSVMAVRAYTIGGKGHEAPRLVTSLGWNLIKFRS
GMDVFSMATTRADTILCDIGESSPDPEKEGARSRRVILLMEQWKARNPDAAAVFKVLAPYRPEVLEALHRFQLQWGGGLV
RVPFSRNSTHEMYYSTAVTGNLVNSVNVLSRKLLARFGETRGPIQVPEIDLGTGTRCVTLAEDKVKPRDVAERIGALREQ
YSESWHEDKEHPYRTWQYWGSYRTPATGSAASLINGVVKLLSWPWNAREDVTRMAMTDTTAFGQQRVFKEKVDTKAQEPQ
PGTRVIMRAVSDWLLEHLSRRAKVRMCTKDEFIAKVRSNAALGAWSDEQNKWSSAKEAVEDPEFWKLVDEERSRHLKGQC
RHCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEVLGFLNEEHWASREVSGAGVEGTSLNYLGWLLRELGMKDG
GKLYADDTAGWDTRITNADLEDEEQILRYMEGEHHVLAKTILEKAYHAKVVKVARPSPQGGCVMDVITRRDQRGSGQVVT
YALNTITNMKVQLIRMMEGEGVIGPADSQDPRLKRVETWLKEHGVERLGRMLVSGDDCVVKPIDDRFGKALYFLNDMAKV
RKDVGEWEPSMGFTEWEEVPFCSHHFHELVMKDGRSLIVPCRDQDELVGRARVSPGCGWSVRETACLSKAYGQMWLLNYF
HRRDLRTLGFAICSAVPVSWVPMGRTTWSIHASGEWMTTEDMLRIWNKVWILDNPHMEDKQTVDEWRDIPYLPKTQDLVC
SSLVGRKERAEWAKNIWGSVEKVRKLIGPEDYRDYLSSMDRHDLHWELKLESSII
>P13529 ~~~~~~Genome polyprotein~~~
MSTIVFGSFTCHLDAAIHQDNADRLAKAWTRPENRQVSNVHLLCRRAAKSLINTYESATASAWKGLEEKLQPMFAKREFS
KTVTKRKGLRCFKESSEKFIEKKLRKQYQEERERFQFVNGPDAIVNQISVDKCEASVWVPFPHIIEKPSFATPSMKKKVV
FTKVRMSEASLQLFMRRVAANAKANGQKVEIIGRKRVVGNYTTKSRLTYFRTHVRHLDGSKPRYDLVLDEATKKILQLFA
NTSRFHHVHKKGEVTPGMSGFVVNPINLSDPMQVYDTDLFIVRGKHNSILVDSRCKVSKKQSNEIIHYSDPGKQFSDGFT
NSFMQCKLRETDHQSTSDLDVKECGDVAALVCQAIIPCGKITCLQCAQKYSYMSQQEIRDRFSTVIEQHEKTAMDNYPQF
SHVLAFLKRYRELMRVENQNYEAFKDITHMIGEDRKEAPFSHLQQINELIIKGGMMSAQDYIEASDHLRELARYQKNRTE
NIRSGSIKAFRNKISSKAHVNMQLMCDNQLDTNGNFVWGQREYHAKRFFRNYFDVIDVSEGYRRHIVRENPRGIRKLAIG
NLVISTNLAALRKQLLGEECIHFEVSKECTSRRGENFVYQCCCVTHEDGTPLESEIISPTKNHLVVGNTGDSKYVDLPTA
KGGAMFIAKAGYCYINIFLAMLININEDEAKSFTKTVRDTLVPKLGTWPSMMDLATACHFLAVLYPETRNAELPRILVDH
EAKIFHVVDSFGSLSTGMHVLKANTINQLISFASDTLDSNMKTYLVGGSEVDKCDEFKNVKLLIRSIYKPQIMEQVLKEE
PYLLLMSVLSPGVLMALFNSGSLEKATQYWITRSHTLAAITSMLSALAAKVSLASTLNAQMSVIDEHAAVLCDSVFDGTK
PYASYMMAVKTLERMKARTESDHTLNDLGFSVLRQATPHLVEKSYLQELEQAWKELSWSEKFSAILESQRWRKHIPKPFI
PKDGADLGGRYDISVRSLLGNQYKRLRDVVRRKRDDVVCYTHQSMGKLFCKAIGISTSFLPSTLKMLDMLIVFGLLLSIG
ATCNSMINEHKHLKQLAADREDKKRFKRLQVLHTRLSEKVGCTPTADEFLEYVGGENPDLLKHAEDLIGDGQVVVHQSKR
DSQANLERVVAFVALVMMLFDSERSDGVYKILNKLKGIMGSVDQAVHHQSLDDIEDILDEKKLTVDFVLQSNEVAPTVPF
DSTFEKWWTNQLETGNVIPHYRTEGHFLEFTRENAAHIANEVMHGSHQDILIRGAVGSGKSTGLPFHLSKKGHVLLIEPT
RPLAENVCKQLRGQPFNVNPTLRMRGMSTFGSTPITVMTSGYALHFLANNPTYLDNYKCIIFDECHVHDASAMAFRCLLS
EYSYPGKILKVSATPPGHEVDFKTQKEVKVIVEESLSFQQFVSNLGTGCNSDILKHGVNVLVYVASYNEVDTLSKLLTDR
SFKVSKVDGRTMKIGNVEIPTSGTQAKPHFVVATNIIENGVTLDIDVVVDFGLKVVPVLDIDNRLVRYTKKSISYGERIQ
RLGRVGRNKPGAALRIGFTEKGLTQIPPIIATEAAFLCFTYGLPVMTNGVSTSLLAMCTVKQARTMQQFELSPFYTVALV
RFDGTMHQEIFRLLKSYRLRDSEVILNKLAIPNSNVCGWMSVRDYKRQGCNLDLDENIRVPFYVKDIPETLHERIWQVVE
THKSDAGFGRICSSSACKIAYTLQTDIHSIPRTIKIIDALLEQERTKQAHFRAMTSQSCSSSNFSLSSITSAIRSKYAKD
HTEENIGVLQMAKSQLLEFKNLNIDPSYPELVRNFGALECVHHQTKEGVSKALQLKGHWNKRLITRDATLMLGVLGGGAW
MIFSYLRDSFKEGVVHQGFNRRQRQKLKFRQARDNRMAREVYGDDSTMEDYFGSAYSKKGKSKGKTRGMGTKTRKFVNMY
GYDPTDYNFVRFVDPLTGHTLDEDPLMDINLVQEHFSQIRNDYIGDDKITMQHIMSNPGIVAYYIKDATQKALKVDLTPH
NPLRVCDKTATIAGFPEREFELRQTGHPIFVEPNAIPKINEEGDEEVDHESKSLFRGLRDYNPIASSICQLNNSSGARQS
EMFGLGFGGLIVTNQHLFKRNDGELTIRSHHGEFVVKDTKTLKLLPCKGRDIVIIRLPKDFPPFPKRLQFRTPTTEDRVC
LIGSNFQTKSISSTMSETSATYPVDNSHFWKHWISTKDGHCGLPIVSTRDGSILGLHSLANSTNTQNFYAAFPDNFETTY
LSNQDNDNWIKQWRYNPDEVCWGSLQLKRDIPQSPFTICKLLTDLDGEFVYTQSKTTHWLRDRLEGNLKAVGACPGQLVT
KHVVKGKCTLFETYLLTHPEEHEFFRPLMGAYQKSALNKDAYVKDLMKYSKPIVVGAVDCDQFERAVDVVISMLISKGFE
ECNYVTDPDDIFSALNMKAAVGALYSGKKRDYFKNVSDQDKESFVRASCKRLFMGKKGVWNGSLKAELRPKEKVEANKTR
SFTAAPIDTLLGGKVCVDDFNNQFYSLNLHCPWSVGMTKFRGGWDKLLRALPEGWIYCDADGSQFDSSLSPYLINAVLNI
RLAFMEEWDIGEQMLSNLYTEIVYTPIATPDGTIVKKFKGNNSGQPSTVVDNTLMVILAMTYSLLKLGYHPDTHDCICRY
FVNGDDLVLAVHPAYESIYDELQEHFSQLGLNYTFATKTENKEELWFMSHKGVLYDDMYIPKLEPERIVSILEWDRSNEP
IHRLEAICASMVEAWGYKELLREIRKFYSWVLEQAPYNALSKDGKAPYIAETALKKLYTDTEASETEIERYLEAFYDDIN
DDGESNVVVHQADEREDEEEVDAGKPIVVTAPAATSPILQPPPVIQPAPRTTAPMLNPIFTPATTQPATKPVSQVPGPQL
QTFGTYGNEDASPSNSNALVNTNRDRDVDAGSIGTFTVPRLKAMTSKLSLPKVKGKAIMNLNHLAHYSPAQVDLSNTRAP
QSCFQTWYEGVKRDYDVTDDEMSIILNGLMVWCIENGTSPNINGMWVMMDGETQVEYPIKPLLDHAKPTFRQIMAHFSNV
AEAYIEKRNYEKAYMPRYGIQRNLTDYSLARYAFDFYEMTSTTPVRAREAHIQMKAAALRNVQNRLFGLDGNVGTQEEDT
ERHTAGDVNRNMHNLLGVRGV
>P17767 ~~~~~~Genome polyprotein~~~
MSTIVFGSFTCHLDAAIHQDNADRLAKAWTRPENRQVSNVHLLCRRAAKSLINTYESATASAWKGLEEKLQPMFAKREFS
KTVTKRKGLRCFKESSEKFIEKKLRKQYQEERERFQFLNGPDAIVNQISVDKCEASVRVPFPHIIEKPSFATPSMKKKVV
FTKVRMSEASLQLFMRRVAANAKANGQKVEIIGRKRVVGNYTTKSRLTYFRTHVRHLDGSKPRYDLVLDEATKKILQLFA
NTSGFHHVHKKGEVTPGMSGFVVNPMNLSDPMQVYDTDLFIVRGKHNSILVDSRCKVSKEQSNEIIHYSDPGKQFWDGFT
NSFMQCKLRETDHQCTSDLDVKECGYVAALVCQAIIPCGKITCLQCAQKYSYMSQQEIRDRFSTVIEQHEKTVMDNYPQF
SHVLAFLKRYRELMRVENQNYEAFKDITHMIGERKEAPFSHLNKINELIIKGGMMSAQDYIEASDHLRELARYQKNRTEN
IRSGSIKAFRNKISSKAHVNMQLMCDNQLDTNGNFVWGQREYHAKRFFRNYFDVIDVSEGYRRHIVRENPRGIRKLAIGN
LVMSTNLAALRKQLLGEECIHFEVSKECTSKRGENFVYQCCCVTHEDGTPLESEIISPTKNHLVVGNSGDSKYVDLPTAK
GGAMFIAKAGYCYINIFLAMLININEDEAKSFTKTVRDTLVPKLGTWPSMMDLATACHFLAILYPETRNAELPRILVDHE
AKIFHVVDSFGSLSTGMHVLKANTINQLISFASDTLDSNMKTYLVGGLEVDKCDEFKNVKLLIRSIYKPQIMEQVLKEEP
YLLLMSVLSPGVLMALFNSGSLEKATQYWITRSHSLAAITSMLSSLAAKVSLASTLNAQMSVIDEHAAVLCDSVFVGTKP
YASYMMAVKTLERMKARTESDHTLNDLGFSVIRQATPHLVEKSYLQELEQAWKELSWSEKFSAILESQRWRKHIPKPFIP
KDGADLGGRYDISVRSLLGNQYKRLRDVVRRKRDDVVCYTHQSMGKLFCKAIGISTSFLPSTLKMFDMLIVFSLLLSIGA
TCNSMINEHKHLKQLAADREDKKRFKRLQVLHTRLSEKVGCTPTADEFLEYVGGENPDLLKHAEDLIGDGQVVVHQSKRD
SQANLERVVAFVALVMMLFDSERSDGVYKILNKLKGIMGSVDQAVQHQSLDDIEDILDEKKLTVDFVLQSNEVAPTVPFD
STFEKWWTNQLETGNVIPHYRTEGHFLEFTRENAAHIANEVMHGSHQDILIRGAVGSGKSTGLPFHLSKKGHVLLIEPTR
PLAENVCKQLRGQPFNVNPTLRMRGMSTFGSTPITVMTSGYALHFLANNPTYLDNYKCIIFDECHVHDASAMAFRCLLSE
YSYPGKILKVSATPPGHEVDFKTQKEVKVIVEESLSFQQFVSNLGTGCNSDILKHGVNVLVYVASYNEVDTLSKLLTDRS
FKVSKVDGRTMKIGNVEIPTSGTQAKPHFVVATNIIENGVTLDIDVVVDFGLKVVPVLDIDNRLVRYTKKSISYGERIQR
LGRVGRNKPGAALRIGFTEKGLTQIPPIIATEAAFLCFTYGLPVMTNGVSTSLLAMCTVKQARTMQQFELSPFYTVALVR
FDGTMHQEIFRLLKSYRLRDSEVILNKLAIPNSNVCGWMSVRDYKRQGCNLDLDENIRVPFYVKDIPETLHERIWQVVET
HKSDAGFGRICSSSACKIAYTLQTDIHSIPRTIKIIDALLEQERTKQAHFRAMTSQSCSSSNFSLSSITSAIRSKYAKDH
TEENIGVLQTAKSQLLEFKNLNIDPSYPELVRNFGALECVHHQTKEGVSKALQLKGHWNKRLITRDATLMLGVLGGGAWM
IFSYLRDSFKEEVVHQGFNRRQRQKLKFRQARDNRMAREVYGDDSTMADYFGSAYSKKGKSKGKTRGMGTKTRKFVNMYG
YDPTDYNFVRFVDPLTGHTLDENPLMDINLVQEHFSQIRNDYIGDDKITMQHIMSNPGIVAYYIKDATQKALKVDLTPHN
PLRVCDKTATIAGFPEREFELRQTGHPVFVEPNAIPKINEEGDEEVDHESKSLFRGLRDYNPIASSICQLNNSSGARQSV
MFGLGFGGLIVTNQHLFKRNDGELTIRSHHGEFVVKDTKTLKLLPCKGRDIVIIRLPKDFPPFPKRLQFRTPTTEDRVCL
IGSNFQTKSISSTMSETSATYPVDNSHFWKHWISTKDGHCGLPIVSTRDGSILGLHSLANSTNTQNFYAAFPDNFETTYL
SNQDNDNWIKQWRYNPDEVCWGSLQLKRDIPQSPFTICKLLTDLDGEFVYTQSKTTHWLRDRLEGNLKAVGACPGQLVTK
HVVKGKCTLFETYLLTHPEEHEFFRPLMGAYQKSALNKDAYVKDLMKYSKPIVVGAVDCDQFERAVDVVISMLISKGFEE
CNYVTDPDDIFSALNMKAAVGALYSGKKRDYFKNVSDQDKESFVRASCKRLFMGKKGVWNGSLKAELRPKEKVEANKTRS
FTAAPIDTLLGGKVCVDDFNNQFYSLNLHCPWSVGMTKFRGGWDKLLRALPEGWIYCDADGSQFDSSLSPYLINAVLNIR
LAFMEEWDIGEQMLSNLYTEIVYTPIATPDGTIVKKFKGNNSGQPSTVVDNTLMVILAMTYSLLKLGYHPDTHDCICRYF
VNGDDLVLAVHPAYESIYDELQEHFSQLGLNYTFATKTENKEELWFMSHKGVLYDDMYIPKLEPERIVSILEWDRSNEPI
HRLEAICASMVEAWGYKELLREIRKFYSWVLEQAPYNALSKDGKAPYIAETALKKLYTDTEASETEIERYLEAFYDDFND
DGESNVVVHQADEREDEEEVDAGKPSVVTAPAATSPILQPPPVIQPAPRTTASMLNPIFTPATTQPATKPVSQVSQPQLQ
TFGTYGNEDASPSNSNALVNTNRDRDVDAGSVGTFTVPRLKAMTSKLSLPKVKGKAIMNLNHLAHYSPAQVDLSNTRAPQ
SCFQTWYEGVKRDYDVTDDEMSIILNGLMVWCIENGTSPNINGMWVMMDGETQVEYPIKPLLDHAKPTFRQIMAHFSNVA
EAYIEKRNYEKAYMPRYGIQRNLTDYSLARYAFDFYEMTSTTPVRAREAHIQMKAAALRNVQNRLFGLDGNVGTQEEDTE
RHTAGDVNRNMHNLLGVRGV
>P29152 ~~~~~~Genome polyprotein~~~
MSTLVCQAVAAPVWSNGARTRRIRDADGEYRCTQCDMGFDSMTMARPVNHCCDGIMIDEYNLYDDDPIMHLVDSKTPIKR
GSQETEGDGMAAEAIKVTGAEPVNCFMVGTIKCKINENSIVAKGVMAAIPRQLTQDEVFMRKARLQAAVAKSTIEREEKE
RQFAFSKLEEKLRARREKLKDGIVIKTRKGLEWREATPNQQRGKLQSTSFDASGGKTLTPHTIYCKTKSSKFSNGGVKCA
TSKKMRTVRKPQSLKMKTESIDVLIEQVMTIAGKHAKQVTLIDKQKTNRVWIRRVNGVRLLQVETKHHKGIISQKDASLN
NLTKRVARHFARKTAYIHPSDSITHGHSGVVFLRANISGSKSYSIDDLFVVRGKRNGKLMESRNKVAWRKMFQIDHFSIV
GIKIWNAFDAEYVKLRDESVSDHDCVGGITPEECGILAAQILRVFYPCWRITCTKCISNWLSKPTSEQIEHIYERGNLAI
QDLNKRIPSAHHVTQMVELLRQRIKNTTFDMGNNTKVHELIGHRQDGVFRHLNRLNNSILAANGSSTIEWESMNESLLEL
ARWHNKRTESIASGGISSFRNKISAKAQINFALMCDNQLDTNGNFVWGERGYHAKRFFSEFFTKIDPKDGYSHHTVRATP
TGVRHLAIGNLIIPGDLQKLREKLEGVSITAVGISEKCVSRRNGDFVYPCSCVTSENGKPVLSDVILPTRNHLVIGNTGD
PKLVDLPKTETGRMWIAKEGYCYINIFFAMLVNVSEKDAKDFTKFVRDEIMPQLGKWPTMMDVATACYKLAIIYPDVRDA
QLPRILVDHSEQIFHVIDSYGSMTTGYHILKAGTVSQLISFAHGALLGEMKMYRVGGTQKMEINMCCCQRKNLLIKQLIR
AIYRPKLLTEIIETEPFVLMLAIVSPSILKAMFRSGTFNQAIKFYMHRSKPTAQTLAFLEALSERVSRSRVLSEQFNIID
GALKELKSLANMSMRTQHTYPIVQNQLDIMIERVSADAELLRDGFVVSKGRVQALIEKNYQDDLRNSFTDLPYVQQLQQT
MSFSRVKHGFGELCESKDLSFSKEAWMGHLSSFSAGGKQIIRLARTKSQQMLASGGRRVTLAARNITMRMVTATFSEIMK
FVNMLLVLSMIFKLWKQANTLLEEREKDKWEKFDRSQNELRRQLRYTLWRFEAQEGRQVTREEFFDYLKYNEGIENRHEL
INELIANQPLFSIQAKKHGEIRFEQTVALMALLAMMFGSDRSDAVFSTLSKVRTIFTTMAQEVRCQSIDDIHDVFDEKKA
TIDFELATDQPAQVQMDKTFCEWWQNQMEQNRTVPHYRTGGKFIEFTRSNAASVANEIAHTPDFSEYLIRGAVGSGKSTG
LPCYLSAKGRVLLLEPTRPLTENVCAQLRGSPFHKSPSMSMRNGHTFGSTPIHVMTTGYALHFFCNNVERIREYDFVIFD
ECHVIDSSAMSFYCALKEYSYQGKILKVSATPPGREVEFKTQFPVTIATEDSLSFDQFVQAQGSGANCDILKKGHNILVY
VSSYNEVDRLSKLLVDRGFKVTKVDGRTMKLGGVEINTSGTAEKPHFIVATNIIENGVTLDIDVVVDFGVKVVAELDADA
RTMRYNKQAISYGERIQRLGRVGRLKDGHALRIGHTEKGITEIPVAIAVESAFQCFAYGLPVMTSNVSTSIIGNCTVKQA
RTMMNFELSPFFTVELVKYNGTMHPEIHKILVPYKLRDSSMQLCKEAIPNSGVSRWHTAHEYISHGIVLETLKSDVRIPF
YLKGVPEKVYEKIWNAVCVFKSDSGFGRMSTASACNVAYTLKTDPLSITRTIAHIDALLIEEQEKKSQFDLMSSHVTNSS
SISLAGLVNRLRSKWMVDHSGENIVKLQNARSQLLEFRGMDINLDDVESFRKFGCAETVRCQSKSEVSKTLQLKGKWNKP
LITSDFFVVCMVSIGCVVLMYQIFMAKWNEPVKLEGKSKAKTLRFRQARDNNAKYEVFADEDTKRHYFGEAYTKKGKKSG
KARGMGVKTKKFVNVYGFDPCEYSLVRFVDPLTGLTYDRHPMEHMMDVQETIGDDRREAMWNDELDKQLFVTRPTIEAYY
IKDKTTPALKIDLNPHNPMRVCDKAETIAGFPEREFELRQSGSATLVPYSEVPVQNEKQEFDEEHVRTEAASLHFGLRDY
NPIAQAVCRITNTGVDYDRSIFGIGFGQFLITNAHCFKLNEGETRIVSRHGQFTIEKTHSLPIHQVKDKDMVIVRLPKDF
PPFPQRLQFRAPQEREKICLVGSNFQEKSIQSVITESCMTFKHNGGKYWKHWITTKEGHCGLPAVALKDGHIVGIHNLGG
ENTNINYFTPFDADILDKYLLNAEALQWTKGWKYNKNKVCWGGLELLDDNEPEESGLFRMVKLLKSLEEDGVRTQSRDDA
WLEKEIKGSLKVVARCPGQLVTKHVVKGPCAMFQLYLELHEDAKSFFTPRMGSYGKSRLSKGAFIKDIMKYSSNTVVGNV
DCDVFENAIDNVEKILWKAGMMQCEYVTDAEAIFQSLNMNAAVGAMYQGKKKDYFEDFTAADRELIVKQSCERLFLGKKG
VWNGSLKAELRPIEKVHENKTRTFTAAPLDTLLGGKVCVDDFNNFFYSCHLRGPWTVGITKFYAGWNEFLSKLPDGWLYC
DADGSRFDSSLTPYLINAVLELRLRFMEEWDAGEQMLKNLYTEIIYTPIATPDGSVIKKTKGNNSGQPSTVVDNTLMVIL
AMQYSLQLLGVDFETQDEVVRYFANGDDLLIAVRPDCEFVLKGLEIHFSNLGLNYNFSARHHDKKDVWFMSTRGILRDGI
LIPKLEEERIVAILEWDRSREFSHRLDAICAAMIEAWGYDELLQHIRKFYYWLLEQEPYRSIAQEGKAPYIAETALRHLY
TNAMATQSELEKYTEAINQHYNDEGGDGSIKVRLQAGDETKDDERRRKEEEDRKKREESIDASQFGSNRDNKKNKNKESD
TPNKLIVKSDRDVDAGSSGTITVPRLEKISAKIRMPKHKGGVAISLQHLVDYNPAQVDISNTRATQSQFDNWWRAVSQEY
GVGDNEMQVLASGLMVWCIENGTSPNINGMWTMMDGEEQVEYPLKPVMDNARPTFRQIMAHFSDVAEAYIEKRNSTEVYI
PRYALQRNLRDPSLARYGFDFYEITAKTPVRAREAHFQMKAAAIRGKSNSLFGLDGNVGTQEENTERHTAEDVNQNMHNL
LGMRAM
>Q85197 ~~~~~~Genome polyprotein~~~
MATQVIMVGEFKILEVNCKPHAPVAAIHVPTQTPKTNDIKWADLEFTLAKSLQRQAHGVVKVDKHGTARIKRASKHHMSC
LEQQMADEVAEKEAFMAAPTQLVTSIIFAGTTPPSMMETETIVKKIHTVGKRAKVMRKRSYITPPTDKSLRNHGVTPYSV
QQLCRTLGNLSKRTGISLEVVGKTSKATKLRFTKTSFGHMARVQLKHHDGRMHRRDLVVDTSTTTIMQTLFLKTARTNAN
LDVLTHGSSGLVFWNYLVTGQRMRTRDNFIIVRGRCNGILVDARAKLSESTMLSTHHYSTGDVFWRGFNRTFLENKPINL
DHVCSSDFSVEECGSIAALICQSLLPCGKITCRACAAKNLNMDEDTFKEFQTQRAREISAVIISEHPNFACVSQFIDRYF
SHQRVLNPNVNAYREILKIVGGFTQSPYTHIQELNEILVLGGRATPEQLGSASAHLLEITRFVRNRTDNIKKGSLALFRN
KISAKAHVNTALMCDNQLDRNGNLIWGERGYHAKRFFSNYFDIITPGGGYKQYIERRVPNGIRKLAIGNLIVTTNLEALR
EQLEGESIEKKAVTKACVSMSDNNYKYPCCCVTLDDGTPLYSTFIMPTKNHLVIGNSGDPKFLDLPADISTQMYIAKSGY
CYINIFLAMLVNVDESDAKDFTKKVRDIIVPDLGEWPTLIDVATSCSLLSAFYPATSAAELPRILVDHDLKTMHVIDSYG
SLNTGYHVLKANTIRQLIQFASNSLDSEMKHYRVGGTSNSQINGYATIKMLAKAVYRPKLMKEIIHEQPFMLVMSLMSPG
ILIALANSGALEMGIHHWIREGDSLVKMAHMLRTVAQNVSVARATWVQQEIISDSAQQMLETILNGTIPNVSYFQAIQYL
TMLAASKEVDAEVRVTGYYTFKLQTSELLEKTYLSLLEDSWQELSYFGRFQAIRHSRRYCTAGTIVVKPERHVDLGGIYA
TSYQFALAKQMEYSKKAVCQAVNGLQARFNNITSQIYCKILNWPKRLFPDLVKFINTMLAITVALQLYIAFATILRHHQQ
CKQDSLELEYCKKERQLITLYDFFIAKQPYATEEEFMAHVDEQNPDLSNFAREYCAEVVLFQAKASEQVNFERIIAFISL
VLMMFDRERSDCVYRSLTKLKSLMSTVENTVQFQSLDDIGPTLEEKNMTIDFDLDTDTIVGKSIIGHTFKEWWDVQLNTN
RIVPHYRTEGHFMEFTRANAPTIAHQIAHDLHTDIMLRGAVGSGKSTGLPYHLSKKGTVLLLEPTRPLAENVTKQLKSDP
FHVSPTLRMRGMAVFGSTPIHVMTTGFALHYLANNLKMLSTYDFIIIDEFHVHDSNAIALRNLLHEHNYQGKLIKVSATP
PGREVEFSTQYPVEIRVEDQVSFQDFVKAQGNGSNLDLTSKCDNLLVYVASYNEVDQLSKLLLERHFLVTKVDGRSMKLG
QVEIITKGSANKKHFIVATNIIENGVTLDIDAVIDFGMKVVPFLDSDNRMISYNKVSISYGERIQRLGRVGRNKAGVALR
IGHTEKGISDVPVVIATQAAFLGFVYGLPISTQSVTTQVLSNVTLKQARTMVQFELPIFYMAHLVRYDGTMHPAIHNELK
KYKLRDSEIQLRKLAIPSKCVPIWMTGKAYRLLTHNSQIPDDVRVPFLTKEIPDKLHENVWAIVEKFKCDAGIGRMTSAQ
ASKVAYTLETDIHSVQRTILIIDQLLEREMQKQSHFEMVTNQSCSSGMLSLQTMMNAIQSRYAKNHTAGNIEILQRAKAQ
LLEFSNLSGDISTESALREFGYLEAVQFQSGTQVSNFLGLEGHWKKSLITKDLLIVGGVCVGAAWMIGEYFFKKSKGVVA
FQGIISDRGKKLKFARARDEKMGHYVEAPDSTLEHYFGSAYTKKGKTKGKTHGMGKKNHRFVNMYGFDPSDYTFIRYVDP
LTGYTLDESPYTDIRLIQSQFSDIREQQLLNDELERNMVHYKPGVQGYLVKDKTSQILKIDLTPHIPLKVCDATNNIAGH
PDREGELRQTGKGQLLDYAELPQKKESVEFESTSMFRGVRDYNPISSVICQLENESEGRTTQLFGLGFGPFIITNQHLFV
RNNGSLTVRSQMGVFKVNSTVALQMRPVEGRDVLIIKMPKDFPPFPQRLKFRQPTHSEKVCLILTNFQQKSSSSMVSETS
ILYQRKNTYFWKHWISTKEGHCGSPIVSTTDGAILGIHSLSNMTNTSNYFACFPKGFTETYLATESVHEWVKGWKFNANN
VCWGSFHLQDSKPTKEFKTVKLVTDLLGEAVYTQGCDSKWLFNAAHTNIQAVAQLESNLVTKHTVKGKCKLFETYLNVDK
AAHDFFSKYMGFYKPSKLNREAYTQDLMKYSKVIQVGEVDCGVFESALTGLLHNLGRWGFTTACYTTDEDSIYTALNMKA
AVGALYRGKKRDYFDAMSPSEREHLLFLSCKRLYFGQLGVWNGSLKAELRPKEKVDLNKTRTFTAAPIETLLGGKVCVDD
FNNMFYSLHLKAPWSVGMTKFYGTWNQLMCKLPDDWVYCDADGSQFDSSISPYMINAVLRIRLHFMEDWDIGSQMLQNLY
TEIGTHQSQHQMAQLLKKFKGNNSGQPSTVVDNTLLVVLALHYALLKSGIPLEEQDSVCAYGVNGDDLLIAIRPDMEHKL
DGFQALFSELGLNYEFNSRSKDKKDLWFMSHKAIQCGEILIPKLEEERIVSILEWDRSHEPIHRLEAICASMVESWGYPE
LTHEIRRFYAWVLEQSPYNALATTGLAPYIAESALKTLYTNVHPTSTELEKYSIQFDEQMDEEDDMVYFQAETLDASEAL
AQKSEGRKKERESNSSKAVAVKDKDVDLGTAGTHSVPRLKSMTSKLTLPMLKGKSVVNLDHLLSYKPKQVDLSNARATHE
QFQNWYDGVMASYELEESSMEIILNGFMVWCIENGTSPDINGVWTMMDNEEQVSYPLKPMLDHAKPSLRQIMRHFSALAE
AYIEMRSREKPYMPRYGLQRNLRDQSLARYAFDFYEITATTPIRAKEAHLQMKAAALKNSNTNMFGLDGNVTTSEEDTER
HTATDVNRNMHHLLGVKGV
>P18247 ~~~~~~Genome polyprotein~~~
MATYMSTICFGSFECKLPYSPASCEHIVKEREVPASVDPFADLETQLSARLLKQKYATVRVLKNGTFTYRYKTDAQIMRI
QKKLERKDREEYHFQMAAPSIVSKITIAGGDPPSKSEPQAPRGIIHTTPRMRKVKTRPIIKLTEGQMNHLIKQIKQIMSE
KRGSVHLISKKTTHVQYKKILGAYSAAVRTAHMMGLRRRVDFRCDMWTVGLLQRLARTDKWSNQVRTINIRRGDSGVILN
TKSLKGHFGRSSGGLFIVRGSHEGKLYDARSRVTQSILNSMIQFSNADNFWKGLDGNWARMRYPSDHTCVAGLPVEDCGR
VAALMAHSILPCYKITCPTCAQQYASLPVSDLFKLLHKHARDGLNRLGADKDRFIHVNKFLIALEHLTEPVDLNLELFNE
IFKSIGEKQQAPFKNLNVLNNFFLKGKENTAHEWQVAQLSLLELARFQKNRTDNIKKGDISFFRNKLSAKANWNLYLSCD
NQLDKNANFLWGQREYHAKRFFSNFFEEIDPAKGYSAYEIRKHPSGTRKLSIGNLVVPLDLAEFRQKMKGDYRKQPGVSK
KCTSSKDGNYVYPCCCTTLDDGSAIESTFYPPTKKHLVIGNSGDQKFVDLPKGDSEMLYIAKQGYCYINVFLAMLINISE
EDAKDFTKKVRDMCVPKLGTWPTMMDLATTCAQMRIFYPDVHDAELPRILVDHDTQTCHVVDSFGSQTTGYHILKASSVS
QLILFANDELESDIKHYRVGGVPNASPELGSTISPFREGGVIMSESAALKLLLKGIFRPKVMRQLLLDEPYLLILSILSP
GILMAMYNNGIFELAVRLWINEKQSIAMIASLLSALALRVSAAETLVAQRIIIDAAATDLLDATCDGFNLHLTYPTALMV
LQVVKNRNECDDTLFKAGFPSYNTSVVQIMEKNYLNLLNDAWKDLTWRENYPQHGTHTEQNALSTRYIKPTEKADLKGLY
NISPQAFLGRSAQVVKGTASGLSERFNNYFNTKCVNISSFFIRRIFRRLPTFVTFVNSLLVISMLTSVVAVCQAIILDQR
KYRREIELMQIEKNEIVCMELYASLQRKLERDFTWDEYIEYLKSVNPQIVQFAQAQMEEYDVRHQRSTPVVKNLEQVVAF
MALVIMVFDAERSDCVFKTLNKFKGVLSSLDYEVRHQSLDDVIKNFDERNEIIDFELSEDTIRTSSVLDTKFSDWWDRQI
QMGHTLPHYRTEGHFMEFTRATAVQVANDIAHSEHLDFLVRGAVGSGKSTGLPVHLSVAGSVLLIEPTRPLAENVFKQLS
SEPFFKKPTLRMRGNSIFGSSPISVMTSGFALHYFANNRSQLAQFNFVIFDECHVLDPSAMAFRSLLSVYHQACKVLKVS
ATPVGREVEFTTQQPVKLIVEDTVSFQSFVDAQGSKTNADVVQFGSNVLVYVSSYNEVDTLAKLLTDKNMMVTKVDGRTM
KHGCLEIVTKGTSARPHFVVATNIIENGVTLDIDVVVDFGLKVSPFLDIDNRSIAYNKVSVSYGERIQRLGRVGRFKKGV
ALRIGHTEKGIIEIPSMVATEAALACFAYNLPVMTGGVSTSLIGNCTVRQVKTMQQFELSPFFIQNFVAHDGSMHPVIHD
ILKKYKLRDCMTPLCDQSIPYRASSTWLSVSEYERLGVALEIPKQVKIAFHIKEIPPKLHEMLWETVVKYKDVCLFPSIR
ASSISKIAYTLRTDLFAIPRTLILVERLLEEERVKQSQFRSLIDEGCSSMFSIVNLTNTLRARYAKDYTAENIQKLEKVR
SQLKEFSNLDGSACEENLIKRYESLQFVHHQAATSLAKDLKLKGIWNKSLVAKDLIIAGAVAIGGIGLIYSWFTQSVETV
SHQGKNKSKRIQALKFRHARDKRAGFEIDNNDDTIEEFFGSAYRKKGKGKGTTVGMGKSSRRFINMYGFDPTEYSFIQFV
DPLTGRQIEENVYADIRDIQERFSEVRKKMVENDDIEMQALGSNTTIHAYFRKDWCDKALKIDLMPHNPLKVCDKTNGIA
KFPERELELRQTGPAVEVDVKDIPAQEVEHEAKSLMRGLRDFNPIAQTVCRLKVSVEYGASEMYGFGFGAYIVANHHLFR
SYNGSMEVQSMHGTFRVKNLHSLSVLPIKGRDIILIKMPKDFPVFPQKLHFRAPTQNERICLVGTNFQEKYASSIITETS
TTYNIPGSTFWKHWIETDNGHCGLPVVSTADGCIVGIHSLANNAHTTNYYSAFDEDFESKYLRTNEHNEWVKSWVYNPDT
VLWGPLKLKDSTPKGLFKTTKLVQDLIDHDVVVEQAKHSAWMFEALTGNLQAVATMKSQLVTKHVVKGECRHFTEFLTVD
AEAEAEAFFRPLMDAYGKSLLNRDAYIKDIMKYSKPIDVGVVDRMHLRKPSIGLSSTCNVHGFKKCAYVTDEQEIFKALN
MKAAVGASYGCKKKDYFEHFTDADKEEIVMQSCLRLYKGLLGIWNGSLKAELRCKEKILANKTRTFTAAPLDTLLGGKVC
VDDFNNQFYSKNIECCWTVGMTKFYGGWDKLLRRLPENWVYCDADGSQFDSSLTPYLINAVLTIRSTYMEDWDVGLQMLR
NLYTEIVYTPISTPDGTIVKKFRGNNSGQPSTVVDNSLMVVLAMHYALIKECVEFEEIDSTCVFFVNGDDLLIAVNPEKE
SILDRMSQHFSDLGLNYDFSSRTRRKEELWFMSHRGLLIEGMYVPKLEEERIVSILQWDRADLPEHRLEAICAAMIESWG
YSELTHQIRRFYSWLLQQQPFATIAQEGKAPYIASMALRKLYMDRAVDEEELRAFTEMMVALDDEFELDSYEVHHQANDT
IDAGGSNKKDAKPEQGSIQPNPNKGKDKDVNAGTSGTHTVPRIKAITSKMRMPTSKGATVPNLEHLLEYAPQQIDISNTR
ATQSQFDTWYEAVRMAYDIGETEMPTVMNGLMVWCIENGTSPNVNGVWVMMDGNEQVEYPLKPIVENAKPTLRQIMAHFS
DVAEAYIEMRNKKEPYMPRYGLIRNLRDMGLARYAFDFYEVTSRTPVRAREAHIQMKAAALKSAQPRLFGLDGGISTQEE
NTERHTTEDVSPSMHTLLGVKNM
>Q05057 ~~~~~~Genome polyprotein~~~
MSSSNSQNSVNMVDGVDLNDPTAVAIAAASGTWGELDRASTCMYNFFDVSDVERNPGESSKQLVSRIKKRAGALLGVSRA
FKDTEQVLSAERCAFKSFSDDEEIMATTFEPLKDRAEITPTAASKLDKTLEAHRAKFNYMAIDSIRVAVTSLMHQGDSRE
CIMYLCDRRFKDPLLGAIALIGFTLPGMQTHVYKTGRMMAFSRKEAIAADRLQLYLYVKGAKLERTQNTPITVNVRTSLI
FGSNAENLLKCDSQIDTDMNIVSFMAQNQDLFGWLEAAQGGGYVPQSLRSATTHTSNSVLLHKWNNPVGSVSQRIASRMI
FTKAIDAGTSYEDSDVVGSNAKCPNSIGKNLRIGVAQASIQNTKDDSTPYSLADFVQDGPIAPVVASLESFMSSPSISQT
LPLINDQFTRPIYSRTFEWKATDTVGASIFQLELPGDVVGPQASSLFSDTMQRAFCFSSDFELSILLTGNESYMGALKIV
TDQLRRFHEAKQDDARVFHSMPGRTVFAKDSDGIKIPIEFMSIHKAVSAHDSNSHNALSRVEARVVTPLSHISLSSPVLS
ITIQVFAKNVKADYMMWRSLETTFPTANATLPSAVGDNFGRLRTSQSEILSTSQILGLLTERAFLGTAKVQQDTGARVII
AEFALHPMSSRNVDGTLLLSQLAALSAMYAFWRGSLVLTFEINCSASTRGKLIVSVTPKGGVALAGITASHQGYGAEFDL
GTSSTRSFTMPFVSTDEWESIGDDGIMSAFEGVWDCPVANLLVLHPITSIAESTPSVDIRCYLHPGPDFQLRGRRHIGLR
AASRPLSIAQASPVLSQVDFSLMASISIDATEESVVVAVPCAPWYSKEEVDYTLLQNPLHWASRMFTLWRGDIEYRFVVK
EEALGDGWQSPISVWHNPNTQLSKCKITKISNKKISKETYHGKKFCLMQLKSIDIVAVDDRRFSWRLCKILDTTKDTAGD
STSPSVTQITYTGHPPMSSQTGVVCIKFPKNSIKGKLKVYSKPGENFEFRHLGGVPSLQVSQMVKYKKPFQNSVPDVFIT
PSKESSKKELGFKPKVVESAAVPKLAVGQAQGLVSKIKGFGSLWKFDEEEETLNSELQKMAVEVSGEIDPIQDEGWAKRK
INEIVSSVSTKLIEASTSILSNAATRAISTLFDVMIGKVRGVLSSLVDSISGAFKMCLGDPKCLCLIGISISAVLGYCTL
KLVENSVPDALGIFKALMMVAITSISALYWPKAAISIVTKYEEQFKDIENYCSTIYKHIFLGVSEDKMEGATPAKACATN
FEDLAHGKAQAGGKSFLELAGLIAYIRLCVVLCKAMNTSFLEPFTPSNMEKQCRTVGGISIGVKTLCEFKDYIYRMIVGG
ITPTSSYVKVSGLTGFDIREWFEEVESVTLQETRYTQMGSDEKIKQIRALYDKGVNVMGKLTMIDSPHLSRVCERSFRLC
KELLDETHRCKGASSTRVDPFHVSLYGSPGVGKSFVMGKLLDDVLDFMSEPQADRCYSKTPNEEYWSGYIGQTAVKCDDL
GQDLSKGFSPTYNQIIQMKTNNCFIVPMADLANKGRTFTSKYIFSTTNVPGCGTKHGLADPGAFMRRRNIFVEVETEGDM
IPGSTNHMRFTLLNPLNPDERIMKYPARMKYVDFLCVCVAEARVYFETQNLVMETLNGTTKNQEEPSKDVIAILEELGDG
VVEGILEKRKELLSQFGVMDPPPFDAIELEPGKAQASVCFSTDAFGNPLKNPFVELFGKLRDEFERATKQEMPDDILTKF
GASLTLGEPTVFGYENQCGMHSAKDSNLMSSFFTFIFGKNLIYKQEQEFLRHIDTLSSMGVLRLVDAVTTTTKGDKKILS
FANIYDNQAFEQLGVLERLIFHLVLATRAAKKGRINGIRERLANWFTSARILSNNILEELPSPIKMLLVLATSVGSLYLA
FKGLSGIGSMILGFTGNFTAKEEDFEMISLNALMGQAKSKGRNFITSGDELTTRLSRMMSRASLATGRAQGGRSHMDTCE
ALLARQGQITNMATGLHLVATDLGGGFLLAPLHTFAGAEKDDIFRFQNGADYYFAFEPKDVSQLSEYDACIIRTDAIPMK
SSIVSIFAKESQIELLVDMSAHFVCGPWKVPFGGEFISEQTVAKRIASFKYFMDEKLYMAINGWSSPFKTEDGQCGSCLV
STSDKLDGKVFCSLVAGTYDRVTGKYVSTYVPITCDMIKKSISLLTGAEFSESQSSICDSPISDTVAETIKVDQLFSSKP
GASGKFGVFGVNDTIGIIDVVGRTFPETTPKSITKSTIVPSLIQPYMPRKPLTEPAILDPRDVRLGENRYDPMIDGIKKY
EEQARPIKISWRNQIIESMAAQMQDWETFMVREGYMTMDLPMSVVINGIDGVEYYEPLNMSTSEGYPLILNRPKDAHGKE
YLFETMESGERRIKSAKLEAHYESYGHALQSTEPFPLICIECPKDERRALDKIYEKPKTRLFSILPVEFNMHARRLFLDF
NVFVMANRHKHGIMVGINPHSREWSDLAISLASFSPYGFNGDFANFDGMFHPSSFSMVSELANIFYGNFLSTERDNLTRM
LTNRFSLMKGAILRVPGGGPSGFPMTVIFNSFINLFYLQSAWIMLARFNGRQDISHPCNFPKYVRACVYGDDNIVAIKME
VLPWYNLQTVSEALFDYFGVTMTDGAKNKASEAKPYGKILEFDFLKRHFKADELIPSLFHAPLHKRSIEEQVYWIREGGN
SLELLEANIENALYEAHHHGREYYEELKDQIKKAMNRAGYMSFVAPSYLMCRQRWLQQDLGEVATSSLPSHVGLLKEATK
NHFSALTGQEEIKAIFEEIDNGNGGTTKHGNMQQILPNIFIGPTRIFETKYGNSLFNLVCDNSLSKGQTRYGVKHGIQSL
SKPDFTYISESLPCLTTPNFRMVCLDPIGGELALATALCLLHAAGIINTKTFTMFMRIHIKQWKHVLQAYFRVCETFVSK
EWQNFKRDIKRLSQDDVGCSRTTPVCGRFLTLDGQLPQHIKSLDKIDFKKTRRIKIAQDEDFIIQID
>Q86119 ~~~ORF1~~~Genome polyprotein~~~
MAAMSRLTGMTTAILPEKKPLNFFLDLRDKTPPCCIRATGKLAWPVFLGQNGKEGPLETCNKCGKWLNGFGCFGLEDLGD
VCLCSIAQQKHKFGPVCLCNRAYIHDCGRWRRRSRFLKHYKALNKVIPCAYQFDESFSTPVFEGEVDDLFVELGAPTSMG
FMDKKLLKKGKKLMDKFVDVDEPCLTSRDASLLDSIASDNTIRAKLEEEYGVEMVQAARDRKDFMKNLRLALDNRPANPV
TWYTKLGNITEKGKQWAKKVVYGACKVTDPLKTLASILLVGLHNVIAVDTTVMLSTFKPVNLLAILMDWTNDLTGFVTTL
VRLLELYGVVQATVNLIVEGVKSFWDKVVCATDRCFDLLKRLFDTFEDSVPTGPTAGCLIFMAFVFSTVVGYLPNNSVIT
TFMKGAGKLTTFAGVIGAIRTLWITINQHMVAKDLTSIQQKVMTVVKMANEAATLDQLEIVSCLCSDLENTLTNRCTLPS
YNQHLGILNASQKVISDLHTMVLGKINMTKQRPQPVAVIFKGAPGIGKTYLVHRIARDLGCQHPSTINFGLDHFDSYTGE
EVAIADEFNTCGDGESWVELFIQMVNTNPCPLNCDKAENKNKVFNSKYLLCTTNSNMILNATHPRAGAFYRRVMIVEARN
KAVESWQATRHGSKPGRSCYSKDMSHLTFQVYPHNMPAPGFVFVGDKLVKSQVAPREYKYSELLDLIKSEHPDVASFEGA
NRFNFVYPDAQYDQALLMWKQYFVMYGCVARLAKNFVDDIPYNQVHISRASDPKIEGCVEYQCKFQHLWRMVPQFVLGCV
NMTNQLGTPLTQQQLDRITNGVEGVTVTTVNNILPFHSQTTLINPSFIKLIWAVRKHLKGLSGVTKVAQFIWRVMTNPVD
AYGSLVRTLTGAATFSDDPVSTTIICSNCTIQIHSCGGLLVRYSRDPVPVASDNVDRGDQGVDVFTDPNLISGFSWRQIA
HLFVEVISHLCANHLVNLATMAALGAVATKAFQGVKGKTKRGRGARVNLGNDEYDEWQAARREFVNAHDMTAEEYLAMKN
KAAMGSDDQDSVMFRSWWTRRQLRPDEDQVTVVGRGGVRNEVIRTRVRQTPKGPKTLDDGGFYDNDYEGLPGFMRHNGSG
WMIHIGNGLYISNTHTARSSCSEVVTCSPTTDLCLVKGEAIRSVAQIAEGTPVCDWKKSPISTYGIKKTLSDSTKIDVLA
YDGCTQTTHGDCGLPLYDSSGKIVAIHTGKLLGFSKMCTLIDLTITKGVYETSNFFCGEPIDYRGITAHRLVGAEPRPPV
SGTRYAKVPGVPEEYKTGYRPANLGRSDPDSDKSLMNIAVKNLQVYQQEPKLDKVDEFIERAAADVLGYLRFLTKGERQA
NLNFKAAFNTLDLSTSCGPFVPGKKIDHVKDGVMDQVHAKHLYKCWSVANSGKALHHIYACGLKDELRPLDKVKEGKKRL
LWGCDVGVAVCAAAVFHNICYKLKMVARFGPIAVGVDMTSRDVDVIINNLTSKASDFLCLDYSKWDSTMSPCVVRLAIDI
LADCCEQTELTKSVVLTLKSHPMTILDAMIVQTKRGLPSGMPFTSVINSICHWLLWSAAVYKSCAEIGLHCSNLYEDAPF
YTYGDDGVYAMTPMMVSLLPAIIENLRDYGLSPTAADKTEFIDVCPLNKISFLKRTFELTDIGWVSKLDKSSILRQLEWS
KTTSRHMMIEETYDLAKEERGVQLEELQVAAAAHGQEFFNFVCKELERQQAYTQFSVYSYDAARKILADRKRVVSVVPDD
EFVNVMEGKARTAPQGEAAGTATTASVPGTTTDGMDPGVVATTSVVTAENSSASIATAGIGGPPQQVDQQETWRTNFYYN
DVFTWSVADAPGSILYTVQHSPQNNPFTAVLSQMYAGWAGGMQFRFIVAGSGVFGGRLVAAVIPPGIEIGPGLEVRQFPH
VVIDARSLEPVTITMPDLRPNMYHPTGDPGLVPTLVLSVYNNLINPFGGSTSAIQVTVETRPSEDFEFVMIRAPSSKTVD
SISPAGLLTTPVLTGVGNDNRWNGQIVGLQPVPGGFSTCNRHWNLNGSTYGWSSPRFADIDHRRGSASYPGNNATNVLQF
WYANAGSAIDNPISQVAPDGFPDMSFVPFNGPGIPAAGWVGFGAIWNSNSGAPNVTTVQAYELGFATGAPGNLQPTTNTS
GSQTVAKSIYAVVTGTAQNPAGLFVMASGVISTPSANAITYTPQPDRIVTTPGTPAAAPVGKNTPIMFASVVRRTGDVNA
TAGSANGTQYGTGSQPLPVTIGLSLNNYSSALMPGQFFVWQLTFASGFMEIGLSVDGYFYAGTGASTTLIDLTELIDVRP
VGPRPSKSTLVFNLGGTANGFSYV
>Q89273 ~~~ORF1~~~Genome polyprotein~~~
MAAMSRLTGMTTAILPEKKPLNFFLDLRDKTPPCCIRATGKLAWPVFPGQNGKEGPLKTCNKCGKWLNGFGYFGLEDLGD
VCLCSIAQQKHKFGPVCLCNRAYIHDCGRWRRRSRFLKHYKALNKVIPCAYQFDESFSTPVFEGEVDDLFVELGAPTSMG
FMDKKLLKKGKKLMDKFVDVDEPCLTSRDASLLDSIASDTTIRAKLEEEYGVEMVQAARDRKDFMKNLRLALDNRPANPV
TWYTKLGNITEKGKQWAKKVVYGACKVTDPLKTLASILLVGLHNVIAVDTTVMLSTFKPVNLLAILMDWTNDLAGFVTTL
VRLLELYGVVQATVNLIIEGVKSFWDKVVCATERCFDLLKRLFDTFEDSVPTGPTAGCLIFMAFVFSTVVGYLPNNSVIT
TFMKGAGKLTTFAGVIGAIRTLWITINQHMVAKDLTSIQQKVMTVVKMANEAATLDQLEIVSCLCSDLENTLTNRCTLPS
YNQHLGILNASQKVISDLHTMVLGKINMTKQRPQPVAVIFKGAPGIGKTYLVHRIARDLGCQHPSTINFGLDHFDSYTGE
EVAIADEFNTCGDGESWVELFIQMVNTNPCPLNCDKAENKNKVFNSKYLLCTTNSNMILNATHPRAGAFYRRVMIVEARN
KAVESWQATRHGSKPGKSCYSKDMSHLTFQVYPHNMPAPGFVFVGDKLVKSQVAPREYKYSELLDLIKSEHPDVASFDGA
NRFNFVYPDAQYDQALLMWKQYFVMYGCVARLAKNFVDDIPYNQVHISRASDPKIEGCVEYQCKFQHLWRMVPQFVLGCV
NMTNQLGTPLTQQQLDRITNGVEGVTVTTVNNILAFHSQTTLINPSFLKLIWAVRKHLKGLSGVTKVAQFIWRVMTNPVD
AYGSLVRTLTGAATFSDEPVSTTIICSNCTIQIHSCGGLLVRYSRDPVPVASDNVDRGDQGVDVFTDPNLISGFSWRQIA
HLFVEVISHLCANHLVNLATMAALGPVATKAFQGVKGKTKRGRGARVNLGNDEYDEWQAARREFVNAHDMTAEEYLAMKN
KAAMGSDDQDSVMFRSWWTRRQLRPDEDQVTIVGRGGVRNEVIRTRVRQTPKGPKTLDDGGFYDNDYEGLPGFMRHNGSG
WMIHIGNGLYISNTHTARSSCSEIVTCSPTTDLCLVKGEAIRSVAQIAEGTPVCDWKKSPISTYGIKKTLSDSTKIDVLA
YDGCTQTTHGDCGLPLYDSSGKIVAIHTGKLLGFSKMCTLIDLTITKGVYETSNFFCGEPIDYRGITAHRLVGAEPRPPV
SGTRYAKVPGVPDEYKTGYRPANLGRSDPDSDKSLMNIAVKNLQVYQQEPKLDKVDEFIERAAADVLGYLRFLTKGERQA
NLNFKAAFNTLDLSTSCGPFVPGKKIDHVKDGVMDQVLAKHLYKCWSVANSGKALHHIYACGLKDELRPLDKVREGKKRL
LWGCDVGVAVCRAAVFHNICYKLKMVARFGPIAVGVDMTSRDVDVIINNLTSKASDFLCLDYSKWDSTMSPCVVRLAIDI
LADCCEQTELTKSVVLTLKSHPMTILDAMIVQTKRGLPSGMPFTSVINSICHWLLWSAAVYKSCAEIGLHCSNLYEDAPF
YTYGDDGVYAMTPMMVSLLPAIIENLRDYGLSPTAADKTEFIDVCPLNKISFLKRTFELTDIGWVSKLDKSSILRQLEWS
KTTSRHMMIEETYDLAKEERGVQLEELQVPAAAHGQEFFNFVCKELERQQAYTQFSVYSYDAARKILADRKRVVSVVPDD
EFVNVMEGKARTAPQGEAAGTATTASVPGTTTDGLDPGVVATTSVVTAENSSASIATAGIGGPPQQVDQQETWRTNFYYN
DVFTWSVADAPGSILYTVQHSPQNNPFTAVLSQMYAGWAGGMQFRFIVAGSGVFGGRLVAAVIPPGIEIGPGLEVRQFPH
VVIDARSLEPVTITMPDLRPNMYHPTGDPGLVPTLVLSVYNNLINPFGGSTSAIQVTVETRPSEDFEFVMIRAPSSKTVD
SISPAGLLTTPVLTGVGNDNRWNGQIVGLQPVPGGFSTCNRHWNLNGSTYGWSSPRFADIDHRRGSASYPGSNATNVLQF
WYANAGSAVDNPISQVAPDGFPDMSFVPFNGPGIPAAGWVGFGAIWNSNSGAPNVTTVQAYELGFATGAPGNLQPTTNTS
GAQTVAKSIYAVVTGTAQNPAGLFVMASGVISTPNANAITYTPQPDRIVTTPGTPAAAPVGKNTPIMFASVVRRTGDVNA
TAGSANGTQYGTGSQPLPVTIGLSLNNYSSALMPGQFFVWQLTFASGFMEIGLSVDGYFYAGTGASTTLIDLTELIDVRP
VGPRPSKSTLVFNLGGTANGFSYV
>P27410 ~~~ORF1~~~Genome polyprotein~~~
MAAMSRLTGMTTAILPEKKPLDFFLDLRDKTPPCCIRATGRLAWPVFPGQNGKEGPLETCNKCGKWLNGFGNFGLEDLGD
VCLCSIAQQKHKFGPVCLCNRVYIHDCGRWRRRSRFLKHYKALNKVIPCAYQFDESFPTPIFEGEVDDLFVELGAPTSMG
FMDKKLLKKGKKLMDKFVDVDEPCLTSRDTSLLDSIASDNTIRAKLEEEYGVEMVQAARDRKDFMKNLRLALDNRPANPV
TWYTKLGNITEKGKQWAKKVVYGARKVTDPLKTLASILLVGLHNVIAVDTTVMLSTFKPVNLLAILMDWNNDLTGFITTL
VRLLELYGVVQATVNLIIEGVKSFWDKVVCATDRCFDLLKRLFDTFEDSVPTGPTAGCLIFMAFVFSTVVGYLPNNSVIT
TFMKGAGKLTTFAGVVGAIRTLWITINQHMVAKDLTSVQQKVMTVVKMANEAATLDQLEIVSCLCSDLETTLTNRCTLPS
YNQHLGILNASQKVISDLHTMVLGKINMTKQRPQPVAVIFKGAPGIGKTYLVHRIARDLGCQHPSTINFGLDHFDSYTGE
EVAIADEFNTCGDGESWVELFIQMVNTNPCPLNCDKAENKNKVFNSKYLLCTTNSNMILNATHPRAGAFYRRVMIVEARN
KAVESWQATRHGSKPGKSCYSKDMSHLTFQVYPHNMPAPGFVFVGDKLVKSQVTPREYKYSELLDLIKSEHPDVASFEGA
NKFNFVYPDAQYDQALLMWKQYFVMYGCVARLAKNFVDDIPYNQVHISRASDPKIEGCVEYQCKFQHLWRMVPQFVLGCV
NMTNQLGTPLTQQQLDRITNGVEGVTVTTVNNILPFHSQTTLINPSFIKLIWAVRKHLKGLSGVTKVAQFIWRVMTNPVD
AYGTLVRTLTGAATFSDDPVSTTIICSNCTIQLHSCGGLLVRYSRDPVPVASDNVDRGDQGVDVFTDPNLISGFSWRQIA
HLFVEVISHLCANHLVNLATMAALGAVATKAFQGVKGKTKRGRGARVNLGNDEYDEWQAARREFVNAHDMTAEEYLAMKN
KAAMGSDDQDSIMFRSWWTRRQLRPEEDQVTIVGRSGVRNEVIRTRVRQTPRGPKTLDDGGFYDNDYEGLPGFMRHNGSG
WMIHIGNGLYISNTHTARSSCSEIVTCSPTTDLCLVKGESIRSVAQIAEGTPVCDWKKSPISTYGIKKTLSDSTKIDVLA
YDGCTQTTHGDCGLPLYDSSGKIVAIHTGKLLGFSKMCTLIDLTITKGVYETSNFFCGEPIDYRGITAHRLVGAEPRPPV
SGTRYAKVPGVPDEYKTGYRPANLGRSDPDSDKSLMNIAVKNLQVYQQEPKLDKVDEFIERAAADVLGYLRFLTKGERQA
NLNFKAAFNTLDLSTSCGPFVPGKKIDHVKDGVMDQVLAKHLYKCWSVANSGKALHHIYACGLKDELRPLDKVKEGKKRL
LWGCDVGVAVCAAAVFHNICYKLKMVARFGPIAVGVDMTSRDVDVIINNLTSKASDFLCLDYSKWDSTMSPCVVRLAIDI
LADCCEQTELTKSVVLTLKSHPMTILDAMIVQTKRGLPSGMPFTSVINSICHWLLWSAAVYKSCAEIGLHCSNLYEDAPF
YTYGDDGVYAMTPMMVSLLPAIIENLRDYGLSPTAADKTEFIDVCPLNKISFLKRTFELTDIGWVSKLDKSSILRQLEWS
KTTSRHMVIEETYDLAKEERGVQLEELQVAAAAHGQEFFNFVCRELERQQAYTQFSVYSYDAARKILADRKRVVSVVPDD
EFVNVMEGKARAAPQGEAAGTATTASVPGTTTDGMDPGVVATTSVITAENSSASIATAGIGGPPQQVDQQETWRTNFYYN
DVFTWSVADAPGSILYTVQHSPQNNPFTAVLSQMYAGWAGGMQFRFIVAGSGVFGGRLVRAVIPPGIEIGPGLEVRQFPH
VVIDARSLEPVTITMPDLRPNMYHPTGDPGLVPTLVLSVYNNLINPFGGSTSAIQVTVETRPSEDFEFVMIRAPSSKTVD
SISPAGLLTTPVLTGVGNDNRWNGQIVGLQPVPGGFSTCNRHWNLNGSTYGWSSPRFGDIDHRRGSASYSGSNATNVLQF
WYANAGSAIDNPISQVAPDGFPDMSFVPFNGPGIPAAGWVGFGAIWNSNSGAPNVTTVQAYELGFATGAPGNLQPTTNTS
GAQTVAKSIYAVVTGTAQNPAGLFVMASGIISTPNASAITYTPQPDRIVTTPGTPAAAPVGKNTPIMFASVVRRTGDVNA
TAGSANGTQYGTGSQPLPVTIGLSLNNYSSALMPGQFFVWQLTFASGFMEIGLSVDGYFYAGTGASTTLIDLTELIDVRP
VGPRPSKSTLVFNLGGTANGFSYV
>Q32ZD4 ~~~~~~Genome polyprotein~~~
MSKKPGGPAGRRVVNMLKRPASVSPIKGIKRLIGNLTDGRGPLRVVLAFIAFFRFAAIMPTQGLLRRWRVMNKSEALKHL
TSFKKEISNMLNIINRRKAKRGNGSVLLWIALVTGSMALRLGTYQGKVLMSINKTDVAEIIPIPTTKGDNLCTVRAMDVG
YMCQNDITYECPRLEPGMDPEDIDCWCDREAIYVHYGLCTKNHRERRGRRSVNIPSHGESQLENRGTPWLDTAKTTKYLT
KVENWMIRNPGYAIVAVAAAWMLGSNTSQKVIFTIMLLLIAPAYSINCLGVTNRDFVEGMSGGTWVDIVLEGDGCVTIMA
KDKPTLDIRLLKMEAKDLATVRSYCYHATVTSVSSEARCPTMGEAHNPKALDSNYLCKSTYVDRGWGNGCGLFGKGSLQT
CVKFGCTQKAMGMTIQRENLDYELAIYVHGPTSVAAHGNYTTQLGAKHAAKFSITPSSPSFTANLGEYGEATVDCEPRAA
LDIDNYYVMSMNNKHWLVNRDWFHDLDLPWTGPATDVWKYRESLVEFEEAHVTRQTVVALAAQEGELHIVLAGAIPVTVA
GTTLTLTSGHLKCRMKLDKLKIKGSTYLMCKDKFAFAKNPVDTGHGTIVTEVQYAGSDGPCRIPITMTENLHDLTPIGRL
VTVNPFVPSSETAQKILIELEPPFGTSFILVGTGPNQVKYQWHKSGSVIGSAFKTTIKGAQRMAVLGETAWDFGSVGGVF
NSIGKGIHGLFGGAFRTLFGGMSWVTQALMGALLLWLGVSSRERTVSITLLATGGILLFLAMNVHADTGCAIDITRRELK
CGSGIFIHNDVETWRDNYKYHPSTPKNFAKIIHKAYKEGICGVRSASRLEHEMWKHIAPELNAILEDNEVDLSVVVEEHK
GIYKKAPLRLENTSDEMHFGWKNWGKSFLFKTQMANSTFVVDGPETKECPTERRAWNSLEIEDFGVGIMSTKVFLKVNGD
KTEVCDSMVMGTAIKGNRAVHSDLGYWIESGKNTSWRLERAVLGEVRSCTWPESHTLWNEGVEDSDLIIPPTLGGPRTHH
NKREGYKTQLKGPWNEEGPIIIEFGECPGTKVTQEESCRNRAASARTTTASGKVIRDWCCKNCTMPPLRFTTKNGCWYGM
EIRPKHESEETLIKSKVTAGTGNDICRFQLGLLMAFVFTQEVLRKRWTARLALPTAALLLACFVLGAFTYSDMIRYFVLV
GCAFAESNSGGDVIHLALIAVFNIQPAALVSTFFRNRWTNRENLLLVIAAAMAQMAWSDVGIEIMPIMNAMALAWMILKA
VSIGTVSTIAMPILSGLAPPMEWFGLDVLRCLLLIVGVAALIKERKENLAKKKGALLISAGLALTGAFSPLVLQGALMLS
ECATKRGWPASEVLTAIGMTIALAGSVARLDSGTMAIPLATTSILFVSYVLSGKSTDMWIERCADVTWEEEAEITGTSPR
LDVELDDNGDFKMINDPGVPMWMWASRMGLMCMAAYNPVLIPVSVAGYWMTRKIHKRGGVLWDLPAPKQMGRSDMKPGVY
RVMTSGVLGSYQSGVGVMYDGVFHTMWHVTQGAALRNGEGRLNPTWGSVRDDLITYGGKWKLSATWDGTEEVQLIAAEPG
KPVKNFQTRPGVFKTPAGEVGAITLDFPKGTSGSPIVNKAGAVIGLYGNGLVLSHGAYVSAISQGERQEEEAPEAFTPEM
LRKRQLTILDLHPGAGKTRRVIPQIVREAVKQRLRTVILAPSRVVAAEIAEALRGLPVRFQTSAVKAEHSGTEIVDVMCH
ATLTQRLMTPMRVPNYNVFVMDEAHFTDPASIAARGYISTKVESGEAAAIFMTATPPGTIDPFPDSNSPIIDQEAEIPDR
AWNSGFEWITDYTGKTVWFVPSVRSGNEIAMCLTKAGKKVIQLNRKSYETEYQKCKGNDWDYVVTTDISEMGANFGAHRV
IDSRKCVKPVIINDGEGRVQLNGPLPITASSAAQRRGRVGRDPTQSGDEYYYGGPITNDDTGHAHWIEAKMLLDNIQLQN
GLVAQLYKPERDKVFATDGEYRLRGEQKKHFVELMRTGELPVWLSYKVAEAGINYTDRRWCFDGPHNNTILEDNTEVEIW
TRQGERKVLRPRWSDARVYSDNQALRAFKEFAAGKRSAGSMMDVMARMPDYFWTKTMNAADNLYVLATTEKGGRAHRAAL
EELPDTLETVLLIAMMSLASCGMLALMMQRKGIGKTGMGTAVLTAVTILLWMADVPAPKIAGVLLISFLLMIVLIPEPEK
QRSQTDNHLAVFLICALLLVSAVSANEMGWLDTTKRDLGKLFSGPSAVTTSRWEPLKLALALKPATAWAGYAGMTMLLTP
LFRHLITTQYISFSLTAITSQASALFGLNSGYPFVGVDLSVVFLLVGCYGQYNLPTTMATIGLLVGHYAFMIPGWQAEAM
RAAQRRTAAGVMKNAVVDGIVATDIPEMDTATPIVEKKMGQVMLLIISALAILLNPDTMTVVEGGVLITAALATLLEGNA
NTVWNSTVAVGVCHLMRGGWAAGPSIGWTIIRNLEAPKVKRGGIAAPTLGEIWKSRLNQLTREQFMEYRKDGIIEVDRTA
ARRARREGNRTGGHPVSRGTAKLRWLVERGFAKPLGKVVDLGCGRGGWSYYCATLRHVQEVRGYTKGGPGHEEPMLMQSY
GWNIVSMKSGIDVFYRPTEACDTVLCDIGESSPSPGVEEARTLRVLEMIEPWLRTANQYCVKVLCPYTPKVIERLEKLQR
KYGGGLVRVPLSRNSNHEMYWVSEASSNLINAVNATSQVLLQRLEKDHRKGPRYEEDVDLGSGTRSVARRSPFMDTRKIH
HRIERLKSEFSTTWHYDCEHPYRTWNYHGSYEVKPTGSASSMVNGVVKLMSKPWDSIQSVLTMAMTDTTPFGQQRVFKEK
VDTKAPEPAPGVKAVLDLTTDWLWAVLCRRKKPRMCTKEEFIAKVNSHAALGAIFEEQNQWASAREAVEDPGFWNFVDKE
RQAHLEGRCETCIYNMMGKREKKLGEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWMSRENSYGGVEGKGLQKLGYI
LQEISRKEGGHMFADDTAGWDTRVTLTDLENEAKITRWMEPEHRKLAEAMIELTYKNKVVKVTRPGKEGKTVMDIISRND
QRGSGQVVTYALNTYTNLAVQLIRCMEGEGLLEEEETMRISDAKRRAVQAWLDTNGTERLTPMAVSGDDCVVKPIDNRFA
TALHFLNGMSKVRKDIQEWKPSTGWTNWQEVPFCSHHFNELVMRDGRKIVVPCRAQDELIGRARVSPGSGWSLRETACLG
KAYAQMWLLMYFHRRDLRLMANAICSAVPIDWVPTGRTTWSIHGKGEWMTTEDMLAVWNRVWIFENEHMEDKTPVYSWTD
VPYIGKREDQWCGSLIGHRSRATWAENIYTPIMQVRNLIGAERYVDYMPAQTRFAHEAELQGGVL
>Q83034 ~~~~~~Genome polyprotein~~~
MQSFLLSSKNQAKLLHAGLEFVGGVRCAHQGWVSGKAVVFCNYCNFAHRLYRFYTKNHCVLNKELLKISVEGLLCHCIEQ
AFLFRRFYDRRFAWQRKYAKGFLFDNLSIPFDDCALCPNAGTRLSQTGVSHDHFVCNYVEHLFECASFSRETGGKFFRAC
SEGWHWNATCTTCGASCRFANPRENIVIAIFMNFLRVMYDGNKYYVSLHCDTEWIPVHPLFARLVLMVRGFAPLDNSHVI
EEDEMDICGHSSEVTYEDPSKFAFTHQHVTRGVGMGHLAFCRDANGVDRGEHKFYLHGPFDLKMTHAMFRVFMILLNCHG
YVQSEFRDEFPDIKDRSLCGLLSVAGLRGVNVSCNEEFIHLHSQFHNGSFRSQRPIPMVYAEPEMYPPLGYVHLTESWVP
RGRLLIDDLPSLMSRVYAESSQAQAGEIYEETFDEDDLFELDGEEGTSTRGLLDLGRRLGGLLLGATKCVKGLHSVIEWP
VDVLTKEAEDLGTWLADNKKYVSESTWSCQVCPEVQDALEKSMREQAKLNAQMISGIKKLATTMDSATLKLRDNLKELEQ
RISVLEQGADDTQQVRITNLENFCEDAAKAFEALRNDIEALKKKPAQSVTPLPSPSGNSGTAGEQRPPPRRRRRPPVVEM
SEAQAGETVIVGGDEEQEAHQDSSVAAAGPADEHNAMLQKIYLGSFKWKVSDGGGSILKTFSLPSDIWAANDRMKNFLSY
FQYYTCEGMTFTLTITSIGLHGGTLLVAWDALSSATRRGIVSMIQLSNLPSMTLHASGSSIGTLTVTSPAIQHQICTSGS
EGSLANLGSLVISVANVLCADSASAQELNVNAWVQFNKPKLSYWTAQHSIAQSGGFEESQDLGDLQAIIATGKWSTTSDK
NLMEIIVHPTACYVSEKLIYQTNLSVVAHMFAKWSGSMRYTFVFGASMFDRGKIMVSAVPVQFRNSKLTLSQMAAFPSMV
CDLSMETREFTFEVPYISIGKMSLVCKDYLFDISSYNADLVVSRLHVMILDPLVKTGNASNSIGFYVVAGPGKGFKLHQM
CGVKSQFAHDVLTAQDFGRSLSCSRLLGNGFKEWCSRESLLMRVPLKSGKKRAFKYAVTPRMRTLPPEATSLSWLSQIFV
EWRGSLTYTIHVQSGSAIQHSYMRIWYDPNGKTDEKEVKFLDSAHPPAGIKVYHWDLKIGDSFRFTVPYCARTEKLQIPK
AYASTPYEWLTMYNGAVTFDLRSGADMELFVSIAGGDDFEMFEQTVPPKCGSVSDSYTVLSYADDVKSVTEVPNKTTYLA
DEQPTTSAPRTSIVNTEDDPPTEGEIARTTNGTLVQYRGGAWKPMVERTPTMSKKQVGPELTVSDPQMYKCIKNMNKNVK
ILTDRQCTAKLANIVDSAQELVGSNSTFVEDLAVGAKQIRKFGESLDVFEGSMSAAKTAELIDNTHAAFSGPADGSPISN
VVQLLLPMLSSIKGMSGKMESKMASLTAMFQPCKKAITHLIERSFPYLACKGFKTDKWIWAALASILVGAALLHYYRSDL
KFVKKWSVMCMIIWAPLLAEKAYHLGTWIKEKFLKSLPRTRTIKDSCRKHSLAGAFECLASASCAYIKDNWAKTMSSLLT
ILSVVASLVMWGKIPDDKEITSFADKFHSIGKKGRSITNIIGGFEKITSVCKKWSETLVSWIVSNVSGGIPKEDLAMTAY
LGFKIHDWVRETRDMALMENRFRGFGGDEHLVKVRRLYGHSLKIDNALMEKQIVPDMQLSLIIKECRQKCLELMNESYTY
KGMKQSRIDPLHVCMLGAPGVGKSTIAHVVINHLLDHRGEPEVDRIYTRSCADAYWSNYHQEPVILYDDLGAIKSNLRLS
DYAEIMGIKTNDPFSVPMAAVEDKGKHCTSKYVFSCTNVLNLDDTGDVVTKMAYYRRRNVLVKVERDPDVPKNEANPTEG
LVFTVLGHDQNCQGDPQFVVKENWDEPFLREVDTEGWRFERVEYRTFLRFLCMYTDAYMYSQEQVLQGIKTFKMNPFAPE
PEFAQAQNGEAAECEIVEEMQEVPGEAPQEAKELVKIETAPNMDELVEAFNKLRVTPGHLNDILRDGSGCYIDEWAIAGP
RWLSFHELLPFTCGCHHTRVCDFNIVYNNMCKAVRSQSVHFKYRANQAIKYAYTHKLHSQCRYSIDFEKLRECNPLDVFV
CVLSKYTADDHSFERRCPKKMNVVRMQRPPVFELKMRPPSDSVVVEDEQGQRIFEWPHLYIFLRYRAIEFKDDKGSLTVR
EDAGADVCPWNEFLKLPWLDGDQLKSVLPAHLHRMVQARLEQVEIMEENGNYSGEMRNAIAEIKEYLDQDHQWVAALVLV
ACAVKERRRMTHDKLHRKSFNALDKLDAWYTTTAPKTSKKMKILLAIGASVAVAGVAVGAVILLQKTNLFGSKEDEEIEG
EEGETQASGAHESDGIVTQHLKRDIRPKMRVTYTDHHVAEAHEEKDAEKPRKSGNPTRKSYLGLSPGFAERGMGVTYEEH
TPLKDALLDESNKVFRRKIVASVESAVKQGGKASKDTVLSQIGDWQDKVKATGVIAARQLEASGSLKKIHNLNSRRTSSH
VMPGLVVHDGAFERSDEVDAELHRITIDEVKSCPKMIKEGVSTLSVKKASVGMLALQKAESQLSFPFTSRAGVDRDLSMT
NLIDTHMAGMSCIIISELGNVFRTFGVLRLCGTYVCMPAHYLDEITSEHTLYFVCPSKITQIQLERHRVCLVNGFQETVV
WDLGPSVPPSRNYIDFIANADDWKNYKATSGALVMSKYSVDSMLQCVHFLDSIELTEANVSVPTSYYEANGGIHTIISGL
RYRVHCMPGFCGRAIMRADATCFRKIIGMHVSGLRNKCMGYAETLTQEHLMQAIETLKETGLLKHIPKGAIGAGEEKLPE
HSKKQSLSLEGKGNLGIVGQLTAQLVPTSVTKTTICKSMIHGLIGEIKTEPSVLSAWDRRLPFPPGEWDPMKDAVKKYGS
YILPFPTEEIQEVENFLIKKFRRKENSRRTRNVNSLEVGINGIDGSDFWSPIEMKTSPGYPYILKRPSGAQGKKYLFEEL
EPYPSGRPKYAMKDPELIENYERIKEEVTSGVKPSIMTMECLKDERRKLAKIYEKPATRTFTILSPEVNILFRQYFGDFA
AMVMSTRREHFSQVGINPESMEWSDLINSLLRVNTKGFAGDYSKFDGIGSPAIYHSIVNVVNAWYNDGEVNARARHSLIS
SIVHRDGICGDLILRYSQGMPSGFAMTVIFNSFVNYYFMALAWMSTVGSSLLSPQGSCKDFDTYCKIVAYGDDNVVSVHE
EFLDVYNLQTVAAYLSHFGVTYTDGDKNPIHMSKPYEDITKMSFLKRGFERVESSGFLWKAPLDKTSIEERLNWIRDCPT
PVEALEQNIESALHEAAIHGRDYFDDLVRRLNSALTRVMLPPTDISFEECQARWWASVTGALRAADYTSLVRRASSGHVE
FNKKYRDMFRQQDLPLKEILMKSKPVALLDLEV
>Q91PP5 ~~~~~~Genome polyprotein~~~
MQSFLLSSKNQAKLLHAGLEFVGGVRCAHQGWVSGKAVVFCNYCNFAHRLYRFYTKNHCVLNKETIENLCGRSFVSLYRA
GLLLDDFTIDDLLGKGKYAKGSIDNLSIPFDDCALCPNAGTRLSQTGVSHDHFVCNYVEHLFECASFSRETGGKFIRACS
KGWHWNATCTTCGASCRFANPRENVVIAIFMNFLRVMYDGNKYYVSLHCDTEWIPVHPLFARLVLMVRGFAPLDNSHVIE
EDEMDICGHPSEVTYDDPSNYAFSHQHVTRGVGMGHLAFCRDANGVDRGEHKFYLHGPFDLKMTHAMFRVFMILLNCHGY
VQSEFREEHPAVKDRSLCALLSVAGLRGVNIACNEEFIHLHSQFHNGSFRSQRPIPMVYAEPEMYPPLEYVRLTESWVPR
GRVMIDDLPSLLSRVYAESSQPHAGEIYEEIFDEDDLFELGDDEGTSTRGLLDLGRRLGGLLLGATKCVKGLHAVIEWPV
DVLTKEAEDLGTWLADNKKYVSESTWSCQVCPEVQDALEKSMREQAKLNAQVIGGIKKLATTMDSATSKLKDSLKELERR
ISVLEQGVDETQQARITNLENFCEDAAKAFDALRADIDALKKKPAQSVTPLPSPSGNSGTAGEQRPPPRRRPPVVEMSEA
QAGETVIVGGDEEQEAHQDSSVAAAGPTDEHNAMLQKIYLGSFKWKVSDGGGSILKTFSLPSDIWAANDRMKNFLSYFQY
YTCEGMTFTLTITSIGLHGGTLLVAWDALSSATRRGIVSMIQLSNLPSMTLHASGSSIGTLTVTSPAIQHQICTSGSEGS
IANLGSLVISVANVLCADSASAQELNVNAWVQFDKPKLSYWTAQHTIAQSGGFEESQDLGDLQAIIATGKWSTTSDKNLM
EIIVHPTACYVSEKLIYQTNLSVVAHMFAKWSGSMKYTFVFGASMFDRGKIMVSAVPVQFRNSKLTLSQMAAFPSMVCDL
SVETREFTFEVPYISIGKMSLVCKDYLFDISSYNADLVVSRLHVMILDPLVKTGNASNSIGFYVVAGPGKDFKLHQMCGV
KSQFAHDVLTAQDFGRSLSCSRLLGNGFKEWCSRESLLMRIPLKNGKKRAFKYAVTPRMRTLPPEATSLSWLSQIFVEWR
GSLTYTIHVQSGSAIQHSYMRIWYDPNGKTDEKEIKFLDSAHPPAGIKVYHWDLKIGDSFRFTVPYCARTEKLQIPKAYA
STPYEWLTMYNGAVTIDLRSGADMELFVSIAGGDDFEMFEQTVPPKCGSVSDSYTVLSYADDIKSVTEVPNKTTYLADEQ
PTTSAPRTSTVDTEEDPPTEGEIARTSNGTLVQYRGGAWKPMVERTPTMSKKQVGPELVASDSHMYKCIKNMNKNVKILT
DRQCTAKLADIVDSTQGLVGSNSTFVEDLAVGAKQIRKFGESLEVFEGSMSAAKTAELIDNTHAAFSGPADGSPISNVVQ
LLLPMLSSIKGMSGKMESKMASLTAMFQPCKKAITHLIERSFPYLACKGFKTDKWIWAALASILVGAALLHYYRSDLKFV
KKWSVMCMIIWAPLLAEKAYHLGTWIKEKFLKSLPRTRTIKDSCRKHSLAGAFECLASASCAYIKDNWAKTMSSLLTILS
VVASLVMWGKIPDDKEITSFADKFHSIGKKGRSITNIIGGFEKITSVCKKWSETLVGWIVSNVSGGIPKEDLAMTAYLGF
KIHDWVRETRDMALMENRFQGFGGDEHLVRVRRLYGHSLKIDNALMEKQIVPDMQLSLIIKECRQKCLELMNESYTYKGM
KQSRIDPLHVCMLGAPGVGKSTIAHVVINNLLDHRGEPEVDRIYTRCCADAYWSNYHQEPVILYDDLGAIKSNLRLSDYA
EIMGIKTNDPFSVPMAAVEDKGKHCTSKYVFSCTNVLNLDDTGDVVTKMAYYRRRNVLVKVERDPDVPKNEANPTEGLVF
TVLGHDQNCQGDPQFVVKENWDEPFLREVDTEGWRFERVEYRTFLRFLCMYTDAYMYSQEQVLQGIKTFKMNPFAPEPEF
AQAQSGEAAECEIVEETQEIPGEAPQEVKELAKIETAPNMDELVEAFNKLRVTPGHLNEILRDGSGCYIDEWAIAGPRWL
SFHELLPFTCGCHHTRVCDFNIVYNNMCKAVRSQSVHFKYRANQAIKYAYTHKLHSQCRYSIDFEKLRECNPLDVFVCVL
SKYTADDHSFERRCPKKMNVVRMQRPPVFELKMRPPSDSVVVEDDQGQRAFEWPHLYTFLRYRAIEFKDDKGSLTVREDA
SADVCPWNEFLKLPWLDGDQLKSVLPAHLHRMVQARLEQVEIMEENGNYSGEMRNAIAEIKEYLDQDHQWVAALVLVACA
VKERRKMTHDKLHRKSFNALDRLDKWYTTTAPKTSKKMKILLAIGASVAVAGVAVGAVILLQKTNLFGSKEDEEIEGEEG
ETQASGAHESDGIVTQHLKRDIRPKMRVTYTDHHVAEAHEEKSTEKPRKPGNPTRKNFLGLSPGFAERGMGVTYEEHTPL
KDALLDESNKVFRRKIVASVESAVKQGGKASKDSVLSQISEWQDKVRATGVIAARQLEASGSLKKIHNLNSRRTSSHVMP
GLVVHDGTFERSDEVDAELHRITIDEVKSCPKMIKEGVSTLSVKKASVGVLALQKAESQLSFPFTSRAGVDRDLSMTNLI
DTHMAGMSCIIISELGNVFRTFGVLRLCGTYVCMPAHYLDEITSEHTLYFVCPSKITQIQLERHRVCLVNGFQETVVWDL
GPSVPPSRNYIDFTAKADDWKNYKATSGALVMSKYLVDSMLQCVHFLDSIELTEANVSVPTSYYEANGGIHTIISGLRYR
VHCMPGFCGAAIMRADATCYRKIIGMHVSGLRNKCMGYAETLTQEHLMRAIETLKETGLLKHIPRGAIGAGEEKLPEHSK
KQSLSLEGKGNLGIVGQLPAQLVPTSVTKTTICKSMIHGLIGEIKTEPSVLSAWDRRLPFPPGEWDPMKDAVKKYGSYIL
PFPTEEIQEVENFLIKKFRRKENSRRTRNVNSLEVGINGIDGSDFWSPIEMKTSPGYPYILKRPSGAQGKKYLFEELEPY
PSGRPKYAMKDPELIENYERIKEEVTSGVKPSIMTMECLKDERRKLAKIYEKPATRTFTILSPEVNILFRQYFGDFAAMV
MSTRREHFSQVGINPESMEWSDLINSLLRVNTKGFAGDYSKFDGIGSPAIYHSIVNVVNAWYDDGEVNARARHSLISSIV
HRDGICGDLILRYSQGMPSGFAMTVIFNSFVNYYFMALAWMSTVGSSLLSPQGSCKDFDTYCKIVAYGDDNVVSVHEEFL
DVYNLQTVAAYLSHFGVTYTDGDKNPIHMSKPYEDITKMSFLKRGFERVESSGFLWKAPLDKTSIEERLNWIRDCPTPVE
ALEQNIESALHEAAIHGRDYFDDLVQRLNSALKRVMLPPTDISFEECQARWWASVTGDALRAADYSSLVRRASSGHVEFN
KKYRDMFRQQDLPLKEILMKSKPVALLDLEV
>C0MHL9 ~~~~~~Genome polyprotein~~~
MACKHGYPLLCPLCTALDITPDGSFTLLFDNEWYPTDLLTVNLDDDVFYPLDTNMDWTDLPLIQDIVMEPQGNSNSSDKN
NSQSSGNEGVIINNYYSNQYQNSIDLSANANGVGKENSKPQGQLMNILGSAADAFKNIAPLLMDQNTEEMTNLSDRVSSD
TAGNTATNTQSTVGRLFGFGQRHKGKHPASCADTATDKVLAAERYYTIKLASWTKTQESFDHIRVPLPHALAGENGGVFS
STLRRHYLCKCGWRIQVQCNASQFHAGSLLVFMAPEFDTSNHSTEVEPRADTAFKVDANWQKHAQILTGHAYVNTTTKVN
VPLALNHQNFWQWTTYPHQILNLRTNTTCDLEVPYVNVCPTSSWTQHANWTLVIAVLTPLQYSQGSATTIEITASIQPVK
PVFNGLRHTVVNPQSPFPVTVREHAGTFFSTTPDTTVPVYGNTISTPFDYMCGEFTDLLSLCKIPTFLGNLDSNKKRIPY
FSATNSTPATPLVTYQVTLSCSCMANSMLAAVARNFNQYRGSLNYLFVFTGSAMTKGKFLISYTPPGAGEPKTLDQAMQA
TYAIWDLGLNSSYNFTVPFISPTHYRQTSYNTPTITSVDGWLTVWQLTPLTYPLGVPNDSHILTLVSGGDDFTLRMPVTF
TKYVPQGVDNAEKGKVSDDNASTDFVAEPVKLPENQTRVSFVYDRSTLSSVLQSTSDVSSKFTPSTAKNLQNSILLTPLP
SDIVNNSVLPEQERWISFASPTTQKPPYKTKQDWNFIMFSPFTYYKCDLEVTLSKNDRETISSVVRYVPCGAPSDLSDQT
MPQTPSLADTRDPHMWVVGQGTTNQISFVIPYTSPLSVLPSVWFNGFSNFDNSSRFGVAPNADFGRLLLQGQGTFSVHYR
YKKMRVFCPRPTVFIPWPNPQDTKIKSVRPTPTLELQNPISIYRVDLFINFSDEIIQFTYKVHGRTVCQYEIPGFGLSRS
GRLLVCMGEKPCQLPISTPKCFYHIVFTGSRNSFGVSIYKARYRPWKQPLHDELHDYGFSTFTDFFKAVRDYHASYYKQR
LQHDIETNPGPVQSVFQLQGGVLTKSQAPMSGLQSMLLRAIGIEADCTEFTRAVNLITDLCNTWESAKTTLSSPEFWTKM
VMRIVKMFAASVLYLHNPDLTTTVCLSLMAGIDILTNDSVFNWLSTKLSKFFHTPAPPIVPLLQQQSPIREANDSFNLAK
NIEWAIKTIKRIVEWITSWFKQEETSPQAKLDKMLTDFPEHCNSILAMRNGRKAYTDCASAFKYFEQLYNLAVQCKRIPL
ATLCEKFKNKHDHAVARPEPVVVVLRGNAGQGKSVTSQIIAQAVSKLSFGRQSVYSLPPDSDYLDGYENQYSVIMDDLGQ
NPDGEDFKVFCQMVSSTNFLPNMAHLEKKGTPFTSNFIIATTNLPKFRPVTVAHYPAVDRRITFDLTVEAGDECVTHNGM
LDVEKAFEEIPGKPQLDCFNTDCRLLHKRGVRFVCNRTKNIYNLQQVVKMVKSTIDNKVENLKKMNTLVAQSPGNDMDYV
LTCLRQTNAALQDQIDELQEAFNQAQERQNFLSDWLKVSAIVFASIASLSAVCKLVSRFKNLVCPAPVQIQLSEGEQAAY
SGGKKGEKQTLQVLDVQGGGKIVAQAGNPVMDYEVNIAKNMVNPITFFYADKAQVTQSCLLIKGHLFVVNRHVAETDWCA
FELKGTRHERDSVQMRSVNKSGMEVDLTFVKVVKGPLFKDNSKKFCSKDDDFPARNETVTGIMNTGVPFVFTGKFLIGNQ
PVNTTTGACFNHCIHYRATTHRGWCGSALICHVNGKKAVYAMHSAGGGGMAAATIITQEMIEAAEKALDCLTPQGAIVEI
GIDTVVHVPRKTKLRRTVAHPCFQPKFEPAVLSRYDPRTTKDVDQVAFSKHTTNLEELPSVFSMVAREYATRVFTTIGKE
NKILTPEQAILGLPGMDPMEKDTSPGLPYTQQGLKRAQLVNFEQGTMVQNLKEAHTKLTEGNYEDILYQSFLKDEIRPIE
KIHEAKTRIVDVPPFHHCIWGRQLLGRFASRFQTNPGLDLGSAIGTDPDTDWTAFAFQLLQYKYVYDVDYSNFDASHSTA
MFEVLIENFFTTENGFDERIGDYLRSLAVSRHAFEERRVLVRGGLPSGCAATSMLNTIINNIVIRAALHLTYSNFEFDDI
KVLSYGDDLLIATNYQINFNLVKQRLAPFNYKITPANKTVEFPEISNLYEVTFLKRKFVRYNSCLFKPQMDTENLKAMVS
YCRPGTLKEKLNSIALLAVHSGKSVYDEIFDPFRRIGIIIPEHGTMLYRWLNLFR
>P21231 ~~~~~~Genome polyprotein~~~
MATIMIGSMAISVPNTHVSCASNSVMPVQAVQMAKQVPSARGVLYTLKREGSTQVHKHEEALRKFQEAFDQDVGIQRRLL
VNKHSSIQSTKKNGLTLRRLTLEQARAKEAAIARRKQEEEDFLNGKYEQQFYAGVSATKSMKFEGGSVGFRTKYWRPTPK
KTKERRATSQCRKPTYVLEEVLSIASKSGKLVEFITGKGKRVKVCYVRKHGAILPKFSLPHEEGKYIHQELQYASTYEFL
PYICMFAKYKSINADDITYGDSGLLFDERSSLTTNHTKLPYFVVRGRRNGKLVNALEVVENMEDIQHYSQNPEAQFFRGW
KKVFDKMPPHVENHECTTDFTNEQCGELAAAISQSIFPVKKLSCKQCRQHIKHLSWEEYKQFLLAHMGCHGPEWETFQEI
DGMRYVKRVIETSTAENASLQTSLEIVRLTQNYKSTHMLQIQDINKALMKGPSVTQSELEQASKQLLAMTQWWKNHMTLT
DEDALKVFRNKRSSKALLNPSLLCDNQLDKNGNFVWGERGRHSKRFFANYFEEVVPSEGYSKYVIRKNPNGQRELAIGSL
IVPLDFERARMALQGKSVTREPITMSCISRQDGNFVYPCCCVTHDDGKAFYSELRSPTKRHLVIGTSGDPKYIDLPATDA
DRMYIAKEGFCYLNIFLAMLVNVNEDEAKDFTKMVRDVIVPRLGKWPTMLDVATAAYMLTVFHPETRNAELPRILVDHAC
QTMHVIDSFGSLTVGYHVLKAGTVNQLIQFASNDLQSEMKFYRVGGEVQQRMKCETALITSIFKPKRMIQILENDPYILL
MGLVSPSILIHMYRMKHFEKGVELWISKEHSVAKIFIILEQLTKRVAANDVLLEQLEMISETSERFMSILEDCPQAPHSY
KTAKDLLTMYIERKASNNQLVENGFVDMNDKLYMAYEKIYSDRLKQEWRALSWLEKFSITWQLKRFAPHTEKCLTKKVVE
ESSASSGNFASVCFMNAQSHLRNVRNTLFQKCDQVWTASVRAFVKLIISTLHRCYSDIVYLVNICIIFSLLVQMTSVLQG
IVNTVRRDKALLSGWKRKEDEEAVIHLYEMCEKMEGGHPSIEKFLDHVKGVRPDLLPVAVSMTGQSEDVSAQAKTATQLQ
LEKIVAFMALLTMCIDNERSDAVFKVLSKLKAFFSTMGEDVKVQSLDEIQSIDEDKKLTIDFDLETNKESSSVSFDVKFE
AWWNRQLEQNRVIPHYRSTGEFLEFTRETAAKIANLVATSSHTEFLIRGAVGSGKSTGLPHHLSKKGKVLLLEPTRPLAE
NVSKQLSFEPFYHNVTLRMRGMSKFGSSNIVVMTSGFAFHYYVNNPQQLSDFDFIIIDECHVQDSPTIAFNCALKEFEFS
GKLIKVSATPPGRECEFTTQHPVKLKVEDHLSFQNFVQAQGTGSNADMIQHGNNLLVYVASYNEVDQLSRLLTEKHYKVT
KVDGRTMQMGNVEIATTGTEGKPHFIVATNIIENGVTLDIDCVIDFGLKVVATLDTDNRCVRYNKQSVSYGERIQRLGRV
GRCKPGFALRIGHTGKGVEEVPEFIATEAAFLSFAYGLPVTTQSVSTNILSRCTVKQARVALNFELTPFFTTNFIKYDGS
MHPEIHRLLKSYKLRESEMLLTKIAIPYQFVGQWVTVKEYERQGIHLNCPEKVKIPFYVHGIPDKLYEMLWDTVCKYKND
AGFGSVKSVNATKISYTLSTDPTAIPRTLAILDHLLSEEMTKKSHFDTIGSAVTGYSFSLAGIADGFRKRYLKDYTQHNI
AVLQQAKAQLLEFDCNKVDINNLHNVEGIGILNAVQLQSKHEVSKFLQLKGKWDGKKFMNDAVVAIFTLVGGGWMLWDYF
TRVIREPVSTQGKKRQIQKLKFRDAFDRKIGREVYADDYTMEHTFGEAYTKKGKQKGSTRTKGMGRKSRNFIHLYGVEPE
NYSMIRFVDPLTGHTMDEHPRVDIRMVQQEFEEIRKDMIGEGELDRQRVYHNPGLQAYFIGKNTEEALKVDLTPHRPTLL
CQNSNAIAGFPEREDELRQTGLPQVVSKSDVPRAKERVEMESKSVYKGLRDYSGISTLICQLTNSSDGHKETMFGVGYGS
FIITNGHLFRRNNGMLTVKTWHGEFVIHNTTQLKIHFIQGKDVILIRMPKDFPPFGKRNLFRQPKREERVCMVGTNFQEK
SLRATVSESSMILPEGKGSFWIHWITTQDGFCGLPLVSVNDGHIVGIHGLTSNDSEKNFFVPLTDGFEKEYLENADNLSW
DKHWFWEPSKIAWGSLNLVEEQPKEEFKISKLVSDLFGNTVTVQGRKERWVLDAMEGNLAACGQADSALVTKHVVKGKCP
YFAQYLSVNQEAKSFFEPLMGAYQPSRLNKDAFKRDFFKYNKPVVLNEVDFQSFERAVAGVKLMMMEFDFKECVYVTDPD
EIYDSLNMKAAVGAQYKGKKQDYFSGMDSFDKERLLYLSCERLFYGEKGVWNGSLKAELRPIEKVQANKTRTFTAAPIDT
LLGAKVCVDDFNNQFYSLNLTCPWTVGMTKFYRGWDKLMRSLPDGWVYCHADGSQFDSSLTPLLLNAVLDVRSFFMEDWW
VGREMLENLYAEIVYTPILAPDGTIFKKFRGNNSGQPSTVVDNTLMVVIAMYYSCCKQGWSEEDIQERLVFFANGDDIIL
AVSDKDTWLYDTLSTSFAELGLNYNFEERTKKREELWFMSHKAVLVDGIYIPKLEPERIVSILEWDRSKELMHRTEAICA
SMIEAWGYTELLQEIRKFYLWLLNKDEFKELASSGKAPYIAETALRKLYTDVNAQTSELQRYLEVLDFNHADDCCESVSL
QSGKEKEGDMDADKDPKKSTSSSKGAGTSSKDVNVGSKGKVVPRLQKITRKMNLPMVEGKIILSLDHLLEYKPNQVDLFN
TRATRTQFEAWYNAVKDEYELDDEQMGVVMNGFMVWCIDNGTSPDANGVWVMMDGEEQIEYPLKPIVENAKPTLRQIMHH
FSDAAEAYIEMRNSESPYMPRYGLLRNLRDRELARYAFDFYEVTSKTPNRAREAIAQMKAAALSGVNNKLFGLDGNISTN
SENTERHTARDVNQNMHTLLGMGPPQ
>Q04544 ~~~ORF1~~~Genome polyprotein~~~
MMMASKDVVATNVASNNNANNTSATSRFLSRFKGLGGGASPPSPIKIKSTEMALGLIGRTTPEPTGTAGPPPKQQRDRPP
RTQEEVQYGMGWSDRPIDQNVKSWEELDTTVKEEILDNHKEWFDAGGLGPCTMPPTYERVKDDSPPGEQVKWSARDGVNI
GVERLTTVSGPEWNLCPLPPIDLRNMEPASEPTIGDMIEFYEGHIYHYSIYIGQGKTVGVHSPQAAFSVARVTIQPIAAW
WRVCYIPQPKHRLSYDQLKELENEPWPYAAITNNCFEFCCQVMNLEDTWLQRRLVTSGRFHHPTQSWSQQTPEFQQDSKL
ELVRDAILAAVNGLVSQPFKNFLGKLKPLNVLNILSNCDWTFMGVVEMVILLLELFGVFWNPPDVSNFIASLLPDFHLQG
PEDLARDLVPVILGGIGLAIGFTRDKVTKVMKSAVDGLRAATQLGQYGLEIFSLLKKYFFGGDQTERTLKGIEAAVIDME
VLSSTSVTQLVRDKQAAKAYMNILDNEEEKARKLSAKNADPHVISSTNALISRISMARSALAKAQAEMTSRMRPVVIMMC
GPPGIGKTKAAEHLAKRLANEIRPGGKVGLVPREAVDHWDGYHGEEVMLWDDYGMTKILDDCNKLQAIADSAPLTLNCDR
IENKGMQFVSDAIVITTNAPGPAPVDFVNLGPVCRRVDFLVYCSAPEVEQIRRVSPGDTSALKDCFKLDFSHLKMELAPQ
GGFDNQGNTPFGKGTMKPTTINRLLIQAVALTMERQDEFQLQGKMYDFDDDRVSAFTTMARDNGLGILSMAGLGKKLRGV
TTMEGLKNALKGYKISACTIKWQAKVYSLESDGNSVNIKEERNILTQQQQSVCTASVALTRLRAARAVAYASCIQSAITS
ILQIAGSALVVNRAVKRMFGTRTATLSLEGPPREHKCRVHMAKAAGKGPIGHDDVVEKYGLCETEEDEEVAHTEIPSATM
EGKNKGKNKKGRGRRNNYNAFSRRGLNDEEYEEYKKIREEKGGNYSIQEYLEDRQRYEEELAEVQAGGDGGIGETEMEIR
HRVFYKSKSRKHHQEERRQLGLVTGSDIRKRKPIDWTPPKSAWADDEREVDYNEKISFEAPPTLWSRVTKFGSGWGFWVS
PTVFITTTHVIPTSAKEFFGEPLTSIAIHRAGEFTLFRFSKKIRPDLTGMILEEGCPEGTVCSVLIKRDSGELLPLAVRM
GAIASMRIQGRLVHGQSGMLLTGANAKGMDLGTIPGDCGAPYVYKRANDWVVCGVHAAATKSGNTVVCAVQASEGETTLE
GGDKGHYAGHEIIKHGCGPALSTKTKFWKSSPEPLPPGVYEPAYLGGRDPRVTVGPSLQQVLRDQLKPFAEPRGRMPEPG
LLEAAVETVTSSLEQVMDTPVPWSYSDACQSLDKTTSSGFPYHRRKNDDWNGTTFVRELGEQAAHANNMYEQAKSMKPMY
TGALKDELVKPEKVYQKVKKRLLWGADLGTVVRAARAFGPFCDAIKSHTIKLPIKVGMNSIEDGPLIYAEHSKYKYHFDA
DYTAWDSTQNRQIMTESFSIMCRLTASPELASVVAQDLLAPSEMDVGDYVIRVKEGLPSGFPCTSQVNSINHWLITLCAL
SEVTGLSPDVIQSMSYFSFYGDDEIVSTDIEFDPAKLTQVLREYGLRPTRPDKSEGPIIVRKSVDGLVFLRRTISRDAAG
FQGRLDRASIERQIYWTRGPNHSDPFETLVPHQQRKVQLISLLGEASLHGEKFYRKISSKVIQEIKTGGLEMYVPGWQAM
FRWMRFHDLGLWTGDRNLLPEFVNDDGV
>P89201 ~~~~~~Genome polyprotein~~~
MGKSKLTYKQCIAKWGKAALEAQNNGSRRSVSVGTHQIAANIFAFYDAKDYHLFAMGKRGGLTPAAEQLRIAIARGTIYK
VQYNCHFCPDCDVIVDSEEGWFCEDCGSQFNKRDDNVLDNKNDVARALGGWNEYEDATWALFEAARADMLEVAPTVGQLE
KEIRAIEKSAGKKLTAYEEEMLEELAYKLDVAKMNEEKQEEVLEETNFSISNDEFPALNGPQDEEVNVVIEETTEESAIE
VAKEAEKSVEFEIIHEKTDEPISDAVNARMVATPVVATSVTKSGTVIDGKELVEKPKTTMWVTKPKTTAAIPATSSKSAV
WVAKPKPASAIFIAEPVVKPAVRACNDVMNIGAMVCPIMVSANAQVEDATKEEEPVIKYNITFGSFNYEVSTKGERIQAA
VQLDEIIEGPDIEPILICQTGSSHKSETKKAAKGLFVQDKFSVIGNKVLCKSFPAFNNFMNETRLGGIYRTRKGNYKNAA
LRLLKATKVQVFYDGIKDIFECPYCHVSSNELEGLNGDNCEKCKDLFYKHIDDPRKVEEEYLMVPLVPIDQHVHEEHSII
SKAKWEAHESICEGEVNIVKIFDGKPTASKKKFKTMQAPNVANIPLDDFMQELVEICLERNTPIEIIGGVKSFNVVKLRH
ATRDISKSGEDDMYPTEREWFCHKNKLCLCGGIEREKKVRSFEVRPGWSGVILHKNQVAECDWDKFVFIDDICVVQGRNL
ITNKIENALEKKGATRLKQIQFYASSVVPNFKDEFDRASRLKADHEPYESSNNELIGRLARLVAAVIPKGHLYCKTCCLR
VIKSKRADIVNALSKAKQRGERDEFIYDELIKLFELQAPPPYKIATITSDDDMFAHIRIGWKPYSGRLSLIMQHLQGLHT
SISMLHQSLAGAQNDQQIDRQALHNQVRILHQRNEEHMPFLKKAVDEIQLLNATDQVANARELYLDTRATSTGDFDILRK
YQSIYEFFPNIMSRANKVGMAVIKSETSLSKAFALMDNAKSMNAIHTLIGEDVIDNTSGACLMKNDKTFFSIGCKQGVDG
SKMYGPLCPTKQHVRIHRVESNMQIPLPTFHDATVWEFNEGYCYANQLAIMVGFINEDEMEFYKNQMNQIVLNLGAWPTF
EQYLVELRAISLDYPKVRGCPAAIHLVSHANKLIHVLGQFGTINQGWHALEVATVGELVDLCHKKVEGEMLTYKVGGIYD
WVTKKNAFIDLFEHHPENIFKICTSPSVLWLFARSCEKHDFINDIMARDHSLVGLFIKLEYVGKHLHIFQSVDDVCVEYA
ASMREIIEEHADIHGLRDSVVDRMVHAYHNEVREANKYELVDRILEKNIGLIAKEISSRKLITMYHRDLFSWHEWQRLKL
MPHSSNAQKLFEEANERAYGKQSWNLRVIWGACKEVLYAATHGVYVRVKGTTVRCADAVVYGFYGRTRAMVSSWASEAWG
AIFTSCLRALVVMVVTAYISTWIPKIRKMIKREKKQFEDLGDGELYVEQHGKKEEAFLFKICAIFALIAGIVDYEWGAAA
CATMNKVRSICTVLGSVGIESHANEPNDKVEQDLKESLKFTSFEIEVPTWFYHNDMTFERWFQHQIQYGNVCADPIYSGP
LRMLAITESSAREVAMNIRTSGETDVRVYSGVGGGKSTRLPKELSMFGHVLICVPTRVLAESLLTSFMVLFNMDVNVAYR
GRIHTGNAPITIMTYGYALNFLVHKPMELNRYDYVLLDEINTNPVEFAPLFSFIKTTDPKKKIVKLSATHAGMDCECETR
HKIKVETLSEMPIESWVSMQGSGVVGDATSVGDVILVFVASFKDVDTCANGLRSKGFKVLKVDSRNFRRDADVDKQIQSL
GEGKKFIVATNIIENGVTFNIDVVVDFGEKISPNLSSDERCITLGRQRISRAERIQRFGRAGRIKPGTVLRFGRGNLVDA
LPSVLTATESALLCFAHGIKPVCDRVDVAAIGTLTRQQALVSGQFELNKLLVAHSATPSGQIPRVVYELFKPLLLRTDAV
PICSSYNAIAACNWLPLSTYMRRNEKNEHVLATKIPWYCSDLSEDFNIKLAECVKSCMSTSNARFIVDNVNFITVAHKIS
VGEKTVGQAKLMVGELLENSKSWRDGLLHVQSSSVTRSLVGLCTSWYQRRAKAALDRLDLQVNRLQLLYDQLGQVEITSD
YDKLVEFFTENGECAAYLESQSKTDFLEKHVLELRQPAITKNVVGTAMFAVALTGCLFWWWMKRNEKYEFIEQHGKKIRL
NRDKRNACFVFSGTDDAMVEEYGVEYSQDVIHGRMSKAQKARQMKLKGKKPGSDTRVKPFKVLYGIDPNDYDTVALSAGG
LTTEAVPVGEASLIDLMLELDDETGIFRKQVVNELKLKYTNNANGEQAMVRLTPHDSRRATIGSFMPSGFPDHHGEWRQT
GAAEIIKEKNVAVDSHVGTPTVDAEDKHIASRLAIVRTHKGETHGIFHGDKLITPFHTFKNACGNDTLTVQSLRGLYDYG
ILSRQKMEQVPKQDIMVLVNPIDVTPFKQSQIFRPPIQCEVAYMIVCRRTPNGLRFEKTQETEIFPLGKQYGGVWKHGCD
TRLGDCGGPIIACRDRKIVGFNGGRLMQMKYNTVLAHIFEPVNETFIEMLAKMEYAKGFWKFNPELVEWSRLLPTTTSFP
IQKQIQGVESHGKPGDKCCGGNLISVGFANVTRISKHVIKGKRPSFVEYCNTYPDNIFMRDNLCEHYGPSILSKAAFYKD
FTKYDDPVKVGRLDCYAFDTALAMVHDTLSQLGFHGNSGSQWDIAEIFDDLNKKSSMGALYSGKKGQWMHGLTPEDAISL
AVESYALLNSGHLGVWSGSLKAELRHVDKLKEGKTRVFTGAPIDTLLAGKILVDNFNNYFYKCHLQGPWTVGINKFNRGW
NKLANYFNHDWVFIDCDGSRFDSSIPPIMFNAVCMLRSVFGDLDPDENQTLSNLYTEIVNTPILTIEGNIIRKFRGNNSG
QPSTVVDNTLILMIAMEYAIAKVFVTRPDIKYVCNGDDLLINCPRSTANAISEHFKDVFADLSLNYDFDHVCDKITDVDF
MSHSFMWLDTEQMYIPKLDKERIVAILEWERSDEQFRTRSALNAAYIESFGYEDLMTEIEKFAHFWAKKHGLNDVLMERE
KVRSLYVDENFDASRFEKFYPESFSPFDVYVEPHASTSKTIEELQQEMEDLDSDTTITVVQRETQKAGIRDQIEALRAQQ
IVRPPEAQLQPDVTPAQIVTFEPPRVTGFGALWIPRQQRNYMTPSYIEKIKAYVPHSNLIESGLASEAQLTSWFENTCRD
YQVSMDVFMSTILPAWIVNCIINGTSQERTNEHTWRAVIMANMEDQEVLYYPIKPIIINAQPTLRQVMRHFGEQAVAQYM
NSLQVGKPFTVKGAVTAGYANVQDAWLGIDFLRDTMKLTTKQMEVKHQIIAANVTRRKIRVFALAAPGDGDELDTERHVV
DDVARGRHSLRGAQLD
>P13900 ~~~~~~Genome polyprotein~~~
MGAQVSTQKTGAHETSLSAAGNSVIHYTNINYYKDAASNSANRQDFTQDPGKFTEPVKDIMVKSMPALNSPSAEECGYSD
RVRSITLGNSTITTQECANVVVGYGVWPTYLKDEEATAEDQPTQPDVATCRFYTLESVMWQQSSPGWWWKFPDALSNMGL
FGQNMQYHYLGRAGYTIHVQCNASKFHQGCLLVVCVPEAEMGCATLANKPDPKSLSKGEIANMFESQNSTGETAVQANVI
NAGMGVGVGNLTIFPHQWINLRTNNSATIVMPYINSVPMDNMFRHNNFTLMVIPFAPLSYSTGATTYVPITVTVAPMCAE
YNGLRLAGKQGLPTLSTPGSNQFLTSDDFQSPSAMPQFDVTPEMDIPGQVNNLMEIAEVDSVVPVNNTEGKVMSIEAYQI
PVQSNPTNGSQVFGFPLTPGANSVLNRTLLGEILNYYAHWSGSIKLTFMFCGSAMATGKFLLAYSPPGAGAPTTRKEAML
GTHVIWDVGLQSSCVLCIPWISQTHYRYVVMDEYTAGGYITCWYQTNIVVPADAQSDCKILCFVSACNDFSVRMLKDTPF
IKQDNFFQGPPGEVMGRAIARVADTIGSGPVNSESIPALTAAETGHTSQVVPSDTMQTRHVKNYHSRSESTVENFLCRSA
CVFYTTYKNHDSDGDNFAYWVINTRQVAQLRRKLEMFTYARFDLELTFVITSTQEQPTVRGQDAPVLTHQIMYVPPGGPV
PTKVNSYSWQTSTNPSVFWTEGSAPPRMSIPFIGIGNAYSMFYDGWARFDKQGTYGISTLNNMGTLYMRHVNDGGPGPIV
STVRIYFKPKHVKTWVPRPPRLCQYQKAGNVNFEPTGVTEGRTDITTMKTTGAFGQQSGAVYVGNYRVVNRHLATRADWQ
NCVWEDYNRDLLVSTTTAHGCDTIARCDCTAGVYFCASRNKHYPVTFEGPGLVEVQESEYYPKKYQSHVLLAAGFAEPGD
CGGILRCQHGVIGIVTVGGEGVVGFADVRDLLWLEDDAMEQGVRDYVEQLGNCFGSGFTNQICEQVTLLKESLIGQDSIL
EKSLKALVKIVSALVIVVRNHDDLITVTATLALIGCTTSPWRWLKQKVSQYYGIPMAERQNSGWLKKFTEMTNACKGMEW
IAIKIQKFIEWLKVKILPEVKEKHEFLNRLKQLPLLESQIATIEQSAPSQSDQEQLFSNVQYFAHYCRKYAPLYAAEAKR
VFSLEKKMSNYIQFKSKCRIEPVCLLLHGSPGAGKSVATNLIGRSLAEKLNSSVYSLPPDPDHFDGYKQQAVVIMDDLCQ
NPDGKDVSLFCQMVSSVDFVPPMAALEEKGILFTSPFVLASTNAGSVNAPTVSDSRALVRRFHFDMNIEVVSMYSQNGKI
NMPMAVKTCDEECCPVNFKKCCPLVCGKAIQFIDRRTQVRYSLDMLVTEMFREYNHRHSVGATLEALFQGPPVYREIKIS
VAPETPPPPAVADLLKSVDSEAVREYCKEKGWLIPEVDSTLQIEKHVNRAFICLQALTTFVSVAGIIYIIYKLFAGFQGA
YTGMPNQKPRVPTLRQAKVQGPAFEFAVAMMKRNASTVKTEYGEFTMLGIYDRWAVLPRHAKPGPTILMNDQVVGVLDAK
ELVDKDGTNLELTLLKLNRNEKFRDIRGFLAREEVEVNEAVLAINTSKFPNMYIPVGRVTDYGFLNLGGTPTKRMLMYNF
PTRAGQCGGVLMSTGKVLGIHVGGNGHQGFSAALLRHYFNEEQGEIEFIESSKDAGFPVINTPSKTKLEPSVFHHVFEGN
KEPAVLRNGDPRLKANFEEAIFSKYIGNVNTHVDEYMMEAVDHYAGQLATLDISTEPMKLEDAVYGTEGLEALDLTTSAG
YPYVALGIKKRDILSKKTRDLTKLKECMDKYGLNLPMVTYVKDELRSADKVAKGKSRLIEASSLNDSVAMRQTFGNLYKT
FHLNPGIVTGSAVGCDPDVFWSKIPVMLDGHLIAFDYSGYDASLSPVWFTCLKLLLEKLGYTNKETNYIDYLCNSHHLYR
DKHYFVRGGMPSGCSGTSIFNSMINNIIIRTLMLKVYKGIDLDQFRMIAYGDDVIASYPWPIDASLLAEAGKDYGLIMTP
ADKGECFNEVTWTNVTFLKRYFRADEQYPFLVHPVMPMKDIHESIRWTKDPKNTQDHVRSLCLLAWHNGEHEYEEFIRKI
RSVRVGRCLSLPAFSTLRRKWLDSF
>Q6XDK8 ~~~ORF1~~~Genome polyprotein~~~
MASKPFYPIEFNPSVELQVLRSAHLRVGGREQMFETINDLNDHVRGVVAKLWCKHLHRSLAAAPTFTEEGLLDSFLSKPP
VDINPDTTFRELFGINPHEQFPLSIHDLAKLQGELVDAARNPGHVLRRHYSTDSLTALINKITKFVPVHATLQEMQARRA
FERERAELFRELPHADLDVSRQQKSYFYAMWRQVVKKSKEFFIPLVKCTSWRKKFTEPAEIVRQVLVHFCEGMRSQFSTN
ANYINLSLIAKLRPTVLTMILQQHKNTYRGWLATVTALVEVYSNLFQDMRDTAVSAVSAITLVFETIKDFVVNVIDLVKS
TFQSQGPTSCGWAAIIAGALLILMKLSGCSNTTSYWHRLLKVCGGVTTIAAAARAVVWVRDIIAEADGKARLKKYMARTA
ALLELAASRDVTGTDELKRLLDCFTQLIEEGTELIQEFGTSPLAGLTRSYVSELESTANSIRSTILLDTPRKTPVAIILT
GPPGIGKTRLAQHLAAGFGKVSNFSVTLDHHDSYTGNEVAIWDEFDVDTQGKFVETMIGVVNTAPYPLNCDRVENKGKVF
TSDYIICTSNYPTSVLPDNPRAGAFYRRVTTIDVSSPTIEDWKKKNPGKKPPPDLYKNDFTHLRLSVRPFLGYNPEGDTL
DGVRVKPVLTSVDGLSRLMETKFKEQGNEHRNLWITCPRDLVAPAASGLKAYMAANRALAQVFQEPSSQDIGETCTSRVY
VSCNNPPPTYSGRVVKITAINPWDASLANSMLSMFETTSHIPASIQREIMYRVWDPLVHLQTREPNTQMLPYINRVVPVS
SAFDFIRGLRHHLGLCSVKGMWRAYQGWNSSSSILEFLSKHMADVAFPHNPECTVFRAPDGDVIFYTFGSYACFVSPARV
PFVGEPPKNVHSNITRNMTWAETLRLLAETITESLVHFGPFLLMMHNVSYLATRSGREEEAKGKTKHGRGAKHARRGGVS
LSDDEYDEWRDLVRDWRQDMTVGEFVELRERYALGMDSEDVQRYRAWLELRAMRMGAGAYQHATIIGRGGVQDTIIRTQP
MRAPRAPRNQGYDEEAPTPIVTFTSGGDHIGYGCHMGNGVVVTVTHVASASDQVEGQDFAIRKTEGETTWVNTNLGHLPH
YQIGDGAPVYYSARLHPVTTLAEGTYETPNITVQGYHLRIINGYPTKRGDCGTPYFDSCRRLVGLHAATSTNGETKLAQR
VTKTSKVENAFAWKGLPVVRGPDCGGMPTGTRYHRSPAWPNPVEGETHAPAPFGSGDERYKFSQVEMLVNGLKPYSEPTP
GIPPALLQRAATHTRTYLETIIGTHRSPNLSFSEACSLLEKSTSCGPFVAGQKGDYWDEDKQCYTGVLAEHLAKAWDAAN
RGVAPQNAYKLALKDELRPIEKNAQGKRRLLWGCDAGATLVATAAFKGVATRLQAVAPMTPVSVGINMDSYQVEVLNESL
KGGVLYCLDYSKWDSTQHPAVTAASLGILERLSEATPITTSAVELLSSPARGHLNDIVFITKSGLPSGMPFTSVINSLNH
MTYFAAAVLKAYEQHGAPYTGNVFQVETVHTYGDDCLYSVCPATASIFQTVLANLTSFGLKPTAADKSETIAPTHTPVFL
KRTLTCTPRGVRGLLDITSIKRQFLWIKANRTVDINSPPAYDRDARGIQLENALAYASQHGHAVFEEVAELARHTAKAEG
LVLTNVNYDQALATYESWFIGGTGLVQGSPSEETTKLVFEMEGLGQPQPQGGEKTSPQPVTPQDTIGPTAALLLPTQIET
PNASAQRLELAMATGAVTSNVPNCIRECFASVTTIPWTTRQAANTFLGAIHLGPRINPYTAHLSAMFAGWGGGFQVRVTI
SGSGLFAGRAVTAILPPGVNPASVQNPGVFPHAFIDARTTEPILINLPDIRPVDFHRVDGDDATASVGLWVAQPLINPFQ
TGPVSTCWLSFETRPGPDFDFCLLKAPEQQMDNGISPASLLPRRLGRSRGNRMGGRIVGLVVVAAAEQVNHHFDARSTTL
GWSTLPVEPIAGDISWYGDAGNKSIRGLVSAQGKGIIFPNIVNHWTDVALSSKTSNTTTIPTDTSTLGNLPGASGPLVTF
ADNGDVNESSAQNAILTAANQNFTSFSPTFDAAGIWVWMPWATDRPGASDSNIYISPTWVNGNPSHPIHEKCTNMIGTNF
QFGGTGTNNIMLWQEQHFTSWPGAAEVYCSQLESTAEIFQNNIVNIPMNQMAVFNVETAGNSFQIAILPNGYCVTNAPVG
THQLLDYETSFKFVGLFPQSTSLQGPHGNSGRAVRFLE
>Q155Z9 ~~~~~~Genome polyprotein~~~
MQNSHFSFDTASGTFEDVTGTKVKIVEYPRSVNNGVYDSSTHLEILNLQGEIEILRSFNEYQIRAAKQQLGLDIVYELQG
NVQTTSKNDFDSRGNNGNMTFNYYANTYQNSVDFSTSSSASGAGPGNSRGGLAGLLTNFSGILNPLGYLKDHNTEEMENS
ADRVTTQTAGNTAINTQSSLGVLCAYVEDPTKSDPPSSSTDQPTTTFTAIDRWYTGRLNSWTKAVKTFSFQAVPLPGAFL
SRQGGLNGGAFTATLHRHFLMKCGWQVQVQCNLTQFHQGALLVAMVPETTLDVKPDGKAKSLQELNEEQWVEMSDDYRTG
KNMPFQSLGTYYRPPNWTWGPNFINPYQVTVFPHQILNARTSTSVDINVPYIGETPTQSSETQNSWTLLVMVLVPLDYKE
GATTDPEITFSVRPTSPYFNGLRNRYTAGTDEEQGPIPTAPRENSLMFLSTLPDDTVPAYGNVRTPPVNYLPGEITDLLQ
LARIPTLMAFERVPEPVPASDTYVPYVAVPTQFDDRPLISFPITLSDPVYQNTLVGAISSNFANYRGCIQITLTFCGPMM
ARGKFLLSYSPPNGTQPQTLSEAMQCTYSIWDIGLNSSWTFVVPYISPSDYRETRAITNSVYSADGWFSLHKLTKITLPP
DCPQSPCILFFASAGEDYTLRLPVDCNPSYVFHSTDNAETGVIEAGNTDTDFSGELAAPGSNHTNVKFLFDRSRLLNVIK
VLEKDAVFPRPFPTQEGAQQDDGYFCLLTPRPTVASRPATRFGLYANPSGSGVLANTSLDFNFYSLACFTYFRSDLEVTV
VSLEPDLEFAVGWFPSGSEYQASSFVYDQLHVPFHFTGRTPRAFASKGGKVSFVLPWNSVSSVLPVRWGGASKLSSATRG
LPAHADWGTIYAFVPRPNEKKSTAVKHVAVYIRYKNARAWCPSMLPFRSYKQKMLMQSGDIETNPGPASDNPILEFLEAE
NDLVTLASLWKMVHSVQQTWRKYVKNDDFWPNLLSELVGEGSVALAATLSNQASVKALLGLHFLSRGLNYTDFYSLLIEK
CSSFFTVEPPPPPAENLMTKPSVKSKFRKLFKMQGPMDKVKDWNQIAAGLKNFQFVRDLVKEVVDWLQAWINKEKASPVL
QYQLEMKKLGPVALAHDAFMAGSGPPLSDDQIEYLQNLKSLALTLGKTNLAQSLTTMINAKQSSAQRVEPVVVVLRGKPG
CGKSLASTLIAQAVSKRLYGSQSVYSLPPDPDFFDGYKGQFVTLMDDLGQNPDGQDFSTFCQMVSTAQFLPNMADLAEKG
RPFTSNLIIATTNLPHFSPVTIADPSAVSRRINYDLTLEVSEAYKKHTRLNFDLAFRRTDAPPIYPFAAHVPFVDVAVRF
KNGHQNFNLLELVDSICTDIRAKQQGARNMQTLVLQSPNENDDTPVDEALGRVLSPAAVDEALVDLTPEADPVGRLAILA
KLGLALAAVTPGLIILAVGLYRYFSGSDADQEETESEGSVKAPRSENAYDGPKKNSKPPGALSLMEMQQPNVDMGFEAAV
AKKVVVPITFMVPNRPSGLTQSALLVTGRTFLINEHTWSNPSWTSFTIRGEVHTRDEPFQTVHFTHHGIPTDLMMVRLGP
GNSFPNNLDKFGLDQMPARNSRVVGVSSSYGNFFFSGNFLGFVDSITSEQGTYARLFRYRVTTYKGWCGSALVCEAGGVR
RIIGLHSAGAAGIGAGTYISKLGLIKALKHLGEPLATMQGLMTELEPGITVHVPRKSKLRKTTAHAVYKPEFEPAVLSKF
DPRLNKDVDLDEVIWSKHTANVPYQPPLFYTYMSEYAHRVFSFLGKDNDILTVKEAILGIPGLDPMDPHTAPGLPYAING
LRRTDLVDFVNGTVDAALAVQIQKFLDGDYSDHVFQTFLKDEIRPSEKVRAGKTRIVDVPSLAHCIVGRMLLGRFAAKFQ
SHPGFLLGSAIGSDPDVFWTVIGAQLEGRKNTYDVDYSAFDSSHGTGSFEALISHFFTVDNGFSPALGPYLRSLAVSVHA
YGERRIKITGGLPSGCAATSLLNTVLNNVIIRTALALTYKEFEYDMVDIIAYGDDLLVGTDYDLDFNEVARRAAKLGYKM
TPANKGSVFPPTSSLSDAVFLKRKFVQNNDGLYKPVMDLKNLEAMLSYFKPGTLLEKLQSVSMLAQHSGKEEYDRLMHPF
ADYGAVPSHEYLQARWRALFD
>Q01299 ~~~~~~Genome polyprotein~~~
MVKKAILKGKGGGPPRRVSKETATKTRQPRVQMPNGLVLMRMMGILWHAVAGTARNPVLKAFWNSVPLKQATAALRKIKR
TVSALMVGLQKRGKRRSATDWMSWLLVITLLGMTIAATVRKERDGSTVIRAEGKDAATQVRVENGTCVILATDMGSWCDD
SLSYECVTIDQGEEPVDVDCFCRNVDGVYLEYGRCGKQEGSRTRRSVLIPSHAQGELTGRGHKWLEGDSLRTHLTRVEGW
VWKNRLLALAMVTVVWLTLESVVTRVAVLVVLLCLAPVYASRCTHLENRDFVTGTQGTTRVTLVLELGGCVTITAEGKPS
MDVWLDAIYQENPAQTREYCLHAKLSDTKVAARCPTMGPATLAEEHQGGTVCKRDQSDRGWGNHCGLFGKGSIVACVKAA
CEAKKKATGHVYDANKIVYTVKVEPHTGDYVAANETHSGRKTASFTVSSEKTILTMGEYGDVSLLCRVASGVDLAQTVIL
ELDKTVEHLPTAWQVHRDWFNDLALPWKHEGARNWNNAERLVEFGAPHAVKMDVYNLGDQTGVLLKALAGVPVAHIEGTK
YHLKSGHVTCEVGLEKLKMKGLTYTMCDKTKFTWKRAPTDSGHDTVVMEVTFSGTKPCRIPVRAVAHGSPDVNVAMLITP
NPTIENNGGGFIEMQLPPGDNIIYVGELSYQWFQKGSSIGRVFQKTKKGIERLTVIGEHAWDFGSAGGFLSSIGKALHTV
LGGAFNSIFGGVGFLPKLLLGVALAWLGLNMRNPTMSMSFLLAGVLVLAMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYAYYPETPGALASAIKETFEEGSCGVVPQNRLEMAMWRSSVTELNLALAEGEANLTVMVDKFDPTDYRGGVPG
LLKKGKDIKVSWKSWGHSMIWSIPEAPRRFMVGTEGQSECPLERRKTGVFTVAEFGVGLRTKVFLDFRQEPTHECDTGVM
GAAVKNGMAIHTDQSLWMRSMKNDTGTYIVELLVTDLRNCSWPASHTIDNADVVDSELFLPASLAGPRSWYNRIPGYSEQ
VKGPWKHTPIRVIREECPGTTVTINAKCDKRGASVRSTTESGKVIPEWCCRACTMPPVTFRTGTDCWYAMEIRPVHDQGG
LVRSMVVADNGELLSEGGVPGIVALFVVLEYIIRRRPSTGSTVVWGGIVVLALLVTGMVRMESLVRYVVAVGITFHLELG
PEIVALMLLQAVFELRVGLLSAFALRRSLTVREMVTTYFLLLVLELGLPSANLEDFWKWGDALAMGALIFRACTAEGKTG
AGLLLMALMTQQDVVTVHHGLVCFLSAASACSIWRLLRGHREQKGLTWIVPLARLLGGEGSGIRLLAFWELSAHRGRRSF
SEPLTVVGVMLTLASGMMRHTSQEALCALAVASFLLLMLVLGTRKMQLVAEWSGCVEWHPELVNEGGEVSLRVRQDAMGN
FHLTELEKEERMMAFWLIAGLAASAIHWSGIIGVMGLWTLTKMLRSSRRSDLVFSGQGGRERGDRPFEVKDGVYRIFSPG
LFWGQNQVGVGYGSKGVLHTMWHVTRGAALSIDDAVAGPYWADVREDVVCYGGAWSLEEKWKGETVQVHAFPPGKAHEVH
QCQPGELILDTGRKLGAIPIDLVKGTSGSPILNAQGVVVGLYGNGLKTNETYVSSIAQGEAEKSRPNLPQAVVGTGWTSK
GQITVLDMHPGSGKTHRVLPELIRQCIDRRLRTLVLAPTRVVLKEMERALNGKRVRFHSPAVSDQQAGGAIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYTLAKENKCALVLMTATPPGKSEPFPESNGAITSEERQIPNGEWR
DGFDWITEYEGRTAWFVPSIAKGGAIARTLRQKGKSVICLNSKTFEKDYSRVRDEKPDFVVTTDISEMGANLDVSRVIDG
RTNIKPEEVDGKVELTGTRRVTTASAAQRRGRVGRQDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFY
GPEQDKMPEVAGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTDRSWTWEGPEANAVDEASGGLVTFRSPNGAE
RTLRPVWKDARMFKEGRDIKEFVAYASGRRSFGDVLTGMSGVPELLRHRCVSALDVFYTLMHEKPDSRAMRMAERDAPEA
FLTMVEMMVLGLATLGVIWCFVVRTSISRMMLGTLVLLASLLLLWAGGVGYGNMAGVALIFYTLLTVLQPEAGKQRSSDD
NKLAYFLLTLCSLAGLVAANEMGFLEKTKADLSTVLWSEREEPRPWSEWTNVDIQPARSWGTYVLVVSLFTPYIIHQLQT
KIQQLVNSAVASGAQAMRDLGGGAPFFGVAGHVMTLGVVSLIGATPTSLMVGVGLAALHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGEGEAKPALYERRMSLVLAIVLCLMSVVMNRTVASITEASAVGLAAAGQLLRPEADTLWTMPV
ACGMSGVVRGSLWGFLPLGHRLWLRASGGRRGGSEGDTLGDLWKRRLNNCTREEFFVYRRTGILETERDKARELLRRGET
NMGLAVSRGTAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPAVMSVRAYTIGGRGHEAPKMVTSLGWNLIKFRSG
MDVFSMQPHRADTVMCDIGESSPDAAVEGERTRKVILLMEQWKNRNPTAACVFKVLAPYRPEVIEALHRFQLQWGGGLVR
TPFSRNSTHEMYYSTAVTGNIVNSVNVQSRKLLARFGDQRGPTRVPELDLGVGTRCVVLAEDKVKEQDVQERIKALREQY
SETWHMDEEHPYRTWQYWGSYRTAPTGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKDKVDTKAQEPQP
GTRVIMRAVNDWILERLAQKSKPRMCSREEFIAKVKSNAALGAWSDEQNRWASAREAVEDPAFWHLVDEERERHLMGRCA
HCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNEDHWASRESSGAGVEGISLNYLGWHLKKLSTLNGG
LFYADDTAGWDTKVTNADLEDEEQILRYMEGEHKQLATTIMQKAYHAKVVKVARPSRDGGCIMDVITRRDQRGSGQVVTY
ALNTLTNIKVQLIRMMEGEGVIEAADAHNPRLLRVERWLKEHGEERLGRMLVSGDDCVVRPLDDRFGKALYFLNDMAKTR
KDIGEWEHSAGLSSWEEVPFCSHHFHELVMKDGRTLVVPCRDQDELVGRARISPGCGWSVRETACLSKAYGQMWLLSYFH
RRDLRTLGLAINSAVPVDWVPTGRTTWSIHASGAWMTTEDMLDVWNRVWILDNPFMQNKGKVMEWRDVPYLPKAQDMLCS
SLVGRKERAEWAKNIWGAVEKVRKMIGPEKFKDYLSCMDRHDLHWELRLESSII
>P07720 ~~~~~~Genome polyprotein~~~
MAGKAILKGKGGGPPRRVSKETAKKTRQSRVQMPNGLVLMRMMGILWHAVAGTARSPVLKSFWKSVPLKQATAALRKIKK
AVSTLMVGLQRRGKRRSAVDWTGWLLVVVLLGVTLAATVRKERDGTTVIRAEGKDAATQVRVENGTCVILATDMGSWCDD
SLTYECVTIDQGEEPVDVDCSCRNVDGVYLEYGRCGKQEGSRTRRSVLIPSHAQGDLTGRGHKWLEGDSLRTHLTRVEGW
VWKNKVLTLAVIAVVWLTVESVVTRVAVVVVLLCLAPVYASRCTHLENRDFVTGTQGTTRVTLVLELGGCVTITAEGKPS
MDVWLDSIYQENPAKTREYCLHAKLSDTKVAARCPTMGPATLAEEHQSGTVCKRDQSDRGWGNHCGLFGKGSIVTCVKAS
CEAKKKATGHVYDANKIVYTVKVEPHTGDYVAANETHSGRKTASFTVSSERTILTMGDYGDVSLLCRVASGVDLAQTVIL
ELDKTSEHLPTAWQVHRDWFNDLALPWKHEGAQNWNNAERLVEFGAPHAVKMDVYNLGDQTGVLLKSLAGVPVAHIDGTK
YHLKSGHVTCEVGLEKLKMKGLTYTMCDKTKFTWKRIPTDSGHDTVVMEVAFSGTKPCRIPVRAVAHGSPDVNVAMLMTP
NPTIENNGGGFIEMQLPPGDNIIYVGELSHQWFQKGSSIGRVFQKTRKGIERLTVIGEHAWDFGSTGGFLTSVGKALHTV
LGGAFNSLFGGVGFLPKILVGVVLAWLGLNMRNPTMSMSFLLAGGLVLAMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYAYYPETLGALASAIKETFEEGTCGIVPQNRLEMAMWRSSATELNLALVEGDANLTVVVDKLDPTDYRGGIPS
LLKKGKDIKVSWKSWGHSMIWSVPEAPRLFMVGTEGSSECPLERRKTGVFTVAEFGVGLRTKVFLDFRQESTHECDTGVM
GAAVKNGMAVHTDQSLWMKSVRNDTGTYIVELLVTDLRNCSWPASHTIDNAEVVDSELFLPASLAGPRSWYNRIPGYSEQ
VKGPWKYSPIRVTREECPGTRVTINADCDKRGASVRSTTESGKVIPEWCCRTCTLPPVTFRTGTDCWYAMEIRPVHDQGG
LVRSMVVADNGELLSEGGIPGIVALFVVLEYVIRRRPATGTTAMWGGIVVLALLVTGLVKIESLVRYVVAVGITFHLELG
PEIVALTLLQAVFELRVGLLSAFALRSNLTVREMVTIYFLLLVLELGLPSEGLGALWKWGDALAMGALIFRACTAEEKTG
VGLLLMALMTQQDLATVHYGLMLFLGVASCCSIWKLIRGHREQKGLTWIVPLAGLLGGEGSGVRLVAFWELTVHGRRRSF
SEPLTVVGVMLTLASGMIRHTSQEALCALAVASFLLLMLVLGTRKMQLVAEWSGCVEWHPELMNEGGEVSLRVRQDSMGN
FHLTELEKEERVMAFWLLAGLAASAFHWSGILGVMGLWTLSEMLRTARRSGLVFSGQGGRERGDRPFEVKDGVYRIFSPG
LLWGQRQVGVGYGSKGVLHTMWHVTRGAALSIDDAVAGPYWADVKEDVVCYGGAWSLEEKWKGETVQVHAFPPGRAHEVH
QCQPGELLLDTGRRIGAVPIDLAKGTSGSPILNSQGVVVGLYGNGLKTNETYVSSIAQGEAEKSRPNLPPAVTGTGWTAK
GQITVLDMHPGSGKTHRVLPELIRQCIDRRLRTLVLAPTRVVLKEMERALNGKRVRFHSPAVGDQQVGGSIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYTLAKENKCALVLMTATPPGKSEPFPESNGAISSEEKQIPDGEWR
DGFDWITEYEGRTAWFVPSSAKGGIIARTLIQKGKSVICLNSKTFEKDYSRVRDEKPDFVVTTDISEMGANLDVSRVIDG
RTNIKPEEVDGRVELTGTRRVTTASAAQRRGRVGRQEGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFY
GPEQDKMPEVAGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTSRNWTWEGPEENTVDEANGDLVTFRSPNGAE
RTLRPVWRDARMFREGRDIREFVAYASGRRSFGDVLSGMSGVPELLRHRCVSAMDVFYTLMHEEPGSRAMKMAERDAPEA
FLTVVEMMVLGLATLGVVWCFVVRTSISRMMLGTLVLLASLALLWAGGVSYGNMAGVALIFYTLLTVLQPEAGKQRSSDD
NKLAYFLLTLCSLAGLVAANEMGFLEKTKADLSTVLWSEHEELRSWEEWTNIDIQPARSWGTYVLVVSLFTPYIIHQLQT
KIQQLVNSAVATGAQAMRDLGGGAPFFGVAGHVMALGVVSLVGATPTSLVVGVGLAAFHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGEGEAKPALYERKMSLVLAIVLCLMSVVMNRTVPSTPRLLLWDWRQRDNCSNQRRTPFGRCQA
CGLSGVVRGSLWGFCPLGHRLWLRASGSRRGGSEGDTLGDLWKRKLNGCTKEEFFAYRRTGILETERDKARELLKRGETN
MGLAVSRGTAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPAVMSVKACAIAGKGHETPKMVTSLGWNLIKFRAGM
DVFSMQPHRADTIMCDIGESNPDAVVEGERTRKVILLMEQWKNRNPTATCVFKALAPYRPEVTEALHRFQLQWGGGLVRT
PFSRNSTHEMYYSTAITGNIVNSVNIQSRKLLARFGDQRGPTRVPELDLGVGTRCVVLAEDKVKEKDVQERISALREQYG
ETWHMDREHPYRTWQYWAATACANRVGGALINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKEKVDTKAQEPQPGT
KVIMRAVNDWILERLARKSKPRMCSREEFIAKVKSNAALGAWSDEQNRWSSAKEAVEDPAFWQLVDEERERHLAGRCAHC
VYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNEDHWASRGSSGSGVEGISLNYLGWHLKGLSTLEGGLF
YADDTAGWDTKVTNADLEDEEQLLRYMEGEHKQLAATIMQKAYHAKVVKVARPSRDGGCIMDVITRRDQRGSGQVVTYAL
NTLTNIKVQLIRMMEGEGVIEASDAHNPRLLRVERWLRDHGEERLGRMLVSGDDCVVRPVDDRFSGALYFLNDMAKTRKD
IGEWDHSVGFSNWEEVPFCSHHFHELVMKDGRTLIVPCRDQDELVGRARVSPGCGRSVRETACLSKAYGQMWLLSYFHRR
DLRTLGLAICSAVPVDWVPAGRTTWSIHASGAWMTTEDMLDVWNRVWILDNPFMHSKEKIAEWRDVPYLPKSHDMLCSSL
VGRKERAEWAKNIWGAVEKVRKMIGQEKFKDYLSCMDRHDLHWESKLESSII
>P14336 ~~~~~~Genome polyprotein~~~
MVKKAILKGKGGGPPRRVSKETATKTRQPRVQMPNGLVLMRMMGILWHAVAGTARNPVLKAFWNSVPLKQATAALRKIKR
TVSALMVGLQKRGKRRSATDWMSWLLVITLLGMTLAATVRKERDGSTVIRAEGKDAATQVRVENGTCVILATDMGSWCDD
SLSYECVTIDQGEEPVDVDCFCRNVDGVYLEYGRCGKQEGSRTRRSVLIPSHAQGELTGRGHKWLEGDSLRTHLTRVEGW
VWKNKLLALAMVTVVWLTLESVVTRVAVLVVLLCLAPVYASRCTHLENRDFVTGTQGTTRVTLVLELGGCVTITAEGKPS
MDVWLDAIYQENPAKTREYCLHAKLSDTKVAARCPTMGPATLAEEHQGGTVCKRDQSDRGWGNHCGLFGKGSIVACVKAA
CEAKKKATGHVYDANKIVYTVKVEPHTGDYVAANETHSGRKTASFTISSEKTILTMGEYGDVSLLCRVASGVDLAQTVIL
ELDKTVEHLPTAWQVHRDWFNDLALPWKHEGAQNWNNAERLVEFGAPHAVKMDVYNLGDQTGVLLKALAGVPVAHIEGTK
YHLKSGHVTCEVGLEKLKMKGLTYTMCDKTKFTWKRAPTDSGHDTVVMEVTFSGTKPCRIPVRAVAHGSPDVNVAMLITP
NPTIENNGGGFIEMQLPPGDNIIYVGELSHQWFQKGSSIGRVFQKTKKGIERLTVIGEHAWDFGSAGGFLSSIGKAVHTV
LGGAFNSIFGGVGFLPKLLLGVALAWLGLNMRNPTMSMSFLLAGGLVLAMTLGVGADVGCAVDTERMELRCGEGLVVWRE
VSEWYDNYAYYPETPGALASAIKETFEEGSCGVVPQNRLEMAMWRSSVTELNLALAEGEANLTVVVDKFDPTDYRGGVPG
LLKKGKDIKVSWKSWGHSMIWSIPEAPRRFMVGTEGQSECPLERRKTGVFTVAEFGVGLRTKVFLDFRQEPTHECDTGVM
GAAVKNGMAIHTDQSLWMRSMKNDTGTYIVELLVTDLRNCSWPASHTIDNADVVDSELFLPASLAGPRSWYNRIPGYSEQ
VKGPWKYTPIRVIREECPGTTVTINAKCDKRGASVRSTTESGKVIPEWCCRACTMPPVTFRTGTDCWYAMEIRPVHDQGG
LVRSMVVADNGELLSEGGVPGIVALFVVLEYIIRRRPSTGTTVVWGGIVVLALLVTGMVRIESLVRYVVAVGITFHLELG
PEIVALMLLQAVFELRVGLLSAFALRRSLTVREMVTTYFLLLVLELGLPGASLEEFWKWGDALAMGALIFRACTAEGKTG
AGLLLMALMTQQDVVTVHHGLVCFLSVASACSVWRLLKGHREQKGLTWVVPLAGLLGGEGSGIRLLAFWELSAHRGRRSF
SEPLTVVGVMLTLASGMMRHTSQEALCALAVASFLLLMLVLGTRKMQLVAEWSGCVEWYPELVNEGGEVSLRVRQDAMGN
FHLTELEKEERMMAFWLIAGLAASAIHWSGILGVMGLWTLTEMLRSSRRSDLVFSGQGGRERGDRPFEVKDGVYRIFSPG
LFWGQNQVGVGYGSKGVLHTMWHVTRGAALSIDDAVAGPYWADVREDVVCYGGAWSLEEKWKGETVQVHAFPPGRAHEVH
QCQPGELILDTGRKLGAIPIDLVKGTSGSPILNAQGVVVGLYGNGLKTNETYVSSIAQGEAEKSRPNLPQAVVGTGWTSK
GQITVLDMHPGSGKTHRVLPELIRQCIDRRLRTLVLAPTRVVLKEMERALNGKRVRFHSPAVSDQQAGGAIVDVMCHATY
VNRRLLPQGRQNWEVAIMDEAHWTDPHSIAARGHLYTLAKENKCALVLMTATPPGKSEPFPESNGAITSEERQIPDGEWR
DGFDWITEYEGRTAWFVPSIAKGGAIARTLRQKGKSVICLNSKTFEKDYSRVRDEKPDFVVTTDISEMGANLDVSRVIDG
RTNIKPEEVDGKVELTGTRRVTTASAAQRRGRVGRQDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFY
GPEQDKMPEVAGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTDRSWTWEGPEANAVDEASGDLVTFRSPNGAE
RTLRPVWKDARMFKEGRDIKEFVAYASGRRSFGDVLTGMSGVPELLRHRCVSALDVFYTLMHEEPGSRAMRMAERDAPEA
FLTMVEMMVLGLATLGVIWCFVVRTSISRMMLGTLVLLASLLLLWAGGVGYGNMAGVALIFYTLLTVLQPEAGKQRSSDD
NKLAYFLLTLCSLAGLVAANEMGFLEKTKADLSTALWSEREEPRPWSEWTNVDIQPARSWGTYVLVVSLFTPYIIHQLQT
KIQQLVNSAVASGAQAMRDLGGGAPFFGVAGHVMTLGVVSLIGATPTSLMVGVGLAALHLAIVVSGLEAELTQRAHKVFF
SAMVRNPMVDGDVINPFGEGEAKPALYERKMSLVLATVLCLMSVVMNRTVASITEASAVGLAAAGQLLRPEADTLWTMPV
ACGMSGVVRGSLWGFLPLGHRLWLRASGGRRGGSEGDTLGDLWKRRLNNCTREEFFVYRRTGILETERDKARELLRRGET
NVGLAVSRGTAKLAWLEERGYATLKGEVVDLGCGRGGWSYYAASRPAVMSVRAYTIGGKGHEAPKMVTSLGWNLIKFRSG
MDVFSMQPHRADTVMCDIGESSPDAAVEGERTRKVILLMEQWKNRNPTAACVFKVLAPYRPEVIEALHRFQLQWGGGLVR
TPFSRNSTHEMYYSTAVTGNIVNSVNVQSRKLLARFGDQRGPTKVPELDLGVGTRCVVLAEDKVKEQDVQERIRALREQY
SETWHMDEEHPYRTWQYWGSYRTAPTGSAASLINGVVKLLSWPWNAREDVVRMAMTDTTAFGQQRVFKDKVDTKAQEPQP
GTRVIMRAVNDWILERLAQKSKPRMCSREEFIAKVKSNAALGAWSDEQNRWASAREAVEDPAFWRLVDEERERHLMGRCA
HCVYNMMGKREKKLGEFGVAKGSRAIWYMWLGSRFLEFEALGFLNEDHWASRESSGAGVEGISLNYLGWHLKKLSTLNGG
LFYADDTAGWDTKVTNADLEDEEQILRYMEGEHKQLATTIMQKAYHAKVVKVARPSRDGGCIMDVITRRDQRGSGQVVTY
ALNTLTNIKVQLIRMMEGEGVIEAADAHNPRLLRVERWLKEHGEERLGRMLVSGDDCVVRPLDDRFGKALYFLNDMAKTR
KDIGEWEHSAGFSSWEEVPFCSHHFHELVMKDGRTLVVPCRDQDELVGRARISPGCGWSVRETACLSKAYGQMWLLSYFH
RRDLRTLGLAINSAVPADWVPTGRTTWSIHASGAWMTTEDMLDVWNRVWILDNPFMQNKERVMEWRDVPYLPKAQDMLCS
SLVGRRERAEWAKNIWGAVEKVRKMIGPEKFKDYLSCMDRHDLHWELRLESSII
>P04517 ~~~~~~Genome polyprotein~~~
MALIFGTVNANILKEVFGGARMACVTSAHMAGANGSILKKAEETSRAIMHKPVIFGEDYITEADLPYTPLHLEVDAEMER
MYYLGRRALTHGKRRKVSVNNKRNRRRKVAKTYVGRDSIVEKIVVPHTERKVDTTAAVEDICNEATTQLVHNSMPKRKKQ
KNFLPATSLSNVYAQTWSIVRKRHMQVEIISKKSVRARVKRFEGSVQLFASVRHMYGERKRVDLRIDNWQQETLLDLAKR
FKNERVDQSKLTFGSSGLVLRQGSYGPAHWYRHGMFIVRGRSDGMLVDARAKVTFAVCHSMTHYSDKSISEAFFIPYSKK
FLELRPDGISHECTRGVSVERCGEVAAILTQALSPCGKITCKRCMVETPDIVEGESGESVTNQGKLLAMLKEQYPDFPMA
EKLLTRFLQQKSLVNTNLTACVSVKQLIGDRKQAPFTHVLAVSEILFKGNKLTGADLEEASTHMLEIARFLNNRTENMRI
GHLGSFRNKISSKAHVNNALMCDNQLDQNGNFIWGLRGAHAKRFLKGFFTEIDPNEGYDKYVIRKHIRGSRKLAIGNLIM
STDFQTLRQQIQGETIERKEIGNHCISMRNGNYVYPCCCVTLEDGKAQYSDLKHPTKRHLVIGNSGDSKYLDLPVLNEEK
MYIANEGYCYMNIFFALLVNVKEEDAKDFTKFIRDTIVPKLGAWPTMQDVATACYLLSILYPDVLRAELPRILVDHDNKT
MHVLDSYGSRTTGYHMLKMNTTSQLIEFVHSGLESEMKTYNVGGMNRDVVTQGAIEMLIKSIYKPHLMKQLLEEEPYIIV
LAIVSPSILIAMYNSGTFEQALQMWLPNTMRLANLAAILSALAQKLTLADLFVQQRNLINEYAQVILDNLIDGVRVNHSL
SLAMEIVTIKLATQEMDMALREGGYAVTSEKVHEMLEKNYVKALKDAWDELTWLEKFSAIRHSRKLLKFGRKPLIMKNTV
DCGGHIDLSVKSLFKFHLELLKGTISRAVNGGARKVRVAKNAMTKGVFLKIYSMLPDVYKFITVSSVLSLLLTFLFQIDC
MIRAHREAKVAAQLQKESEWDNIINRTFQYSKLENPIGYRSTAEERLQSEHPEAFEYYKFCIGKEDLVEQAKQPEIAYFE
KIIAFITLVLMAFDAERSDGVFKILNKFKGILSSTEREIIYTQSLDDYVTTFDDNMTINLELNMDELHKTSLPGVTFKQW
WNNQISRGNVKPHYRTEGHFMEFTRDTAASVASEISHSPARDFLVRGAVGSGKSTGLPYHLSKRGRVLMLEPTRPLTDNM
HKQLRSEPFNCFPTLRMRGKSTFGSSPITVMTSGFALHHFARNIAEVKTYDFVIIDECHVNDASAIAFRNLLFEHEFEGK
VLKVSATPPGREVEFTTQFPVKLKIEEALSFQEFVSLQGTGANADVISCGDNILVYVASYNDVDSLGKLLVQKGYKVSKI
DGRTMKSGGTEIITEGTSVKKHFIVATNIIENGVTIDIDVVVDFGTKVVPVLDVDNRAVQYNKTVVSYGERIQKLGRVGR
HKEGVALRIGQTNKTLVEIPEMVATEAAFLCFMYNLPVTTQSVSTTLLENATLLQARTMAQFELSYFYTINFVRFDGSMH
PVIHDKLKRFKLHTCETFLNKLAIPNKGLSSWLTSGEYKRLGYIAEDAGIRIPFVCKEIPDSLHEEIWHIVVAHKGDSGI
GRLTSVQAAKVVYTLQTDVHSIARTLACINRRIADEQMKQSHFEAATGRAFSFTNYSIQSIFDTLKANYATKHTKENIAV
LQQAKDQLLEFSNLAKDQDVTGIIQDFNHLETIYLQSDSEVAKHLKLKSHWNKSQITRDIIIALSVLIGGGWMLATYFKD
KFNEPVYFQGKKNQKHKLKMREARGARGQYEVAAEPEALEHYFGSAYNNKGKRKGTTRGMGAKSRKFINMYGFDPTDFSY
IRFVDPLTGHTIDESTNAPIDLVQHEFGKVRTRMLIDDEIEPQSLSTHTTIHAYLVNSGTKKVLKVDLTPHSSLRASEKS
TAIMGFPERENELRQTGMAVPVAYDQLPPKNEDLTFEGESLFKGPRDYNPISSTICHLTNESDGHTTSLYGIGFGPFIIT
NKHLFRRNNGTLLVQSLHGVFKVKNTTTLQQHLIDGRDMIIIRMPKDFPPFPQKLKFREPQREERICLVTTNFQTKSMSS
MVSDTSCTFPSSDGIFWKHWIQTKDGQCGSPLVSTRDGFIVGIHSASNFTNTNNYFTSVPKNFMELLTNQEAQQWVSGWR
LNADSVLWGGHKVFMSKPEEPFQPVKEATQLMNELVYSQGEKRKWVVEALSGNLRPVAECPSQLVTKHVVKGKCPLFELY
LQLNPEKEAYFKPMMGAYKPSRLNREAFLKDILKYASEIEIGNVDCDLLELAISMLVTKLKALGFPTVNYITDPEEIFSA
LNMKAAMGALYKGKKKEALSELTLDEQEAMLKASCLRLYTGKLGIWNGSLKAELRPIEKVENNKTRTFTAAPIDTLLAGK
VCVDDFNNQFYDLNIKAPWTVGMTKFYQGWNELMEALPSGWVYCDADGSQFDSSLTPFLINAVLKVRLAFMEEWDIGEQM
LRNLYTEIVYTPILTPDGTIIKKHKGNNSGQPSTVVDNTLMVIIAMLYTCEKCGINKEEIVYYVNGDDLLIAIHPDKAER
LSRFKESFGELGLKYEFDCTTRDKTQLWFMSHRALERDGMYIPKLEEERIVSILEWDRSKEPSHRLEAICASMIEAWGYD
KLVEEIRNFYAWVLEQAPYSQLAEEGKAPYLAETALKFLYTSQHGTNSEIEEYLKVLYDYDIPTTENLYFQSGTVDAGAD
AGKKKDQKDDKVAEQASKDRDVNAGTSGTFSVPRINAMATKLQYPRMRGEVVVNLNHLLGYKPQQIDLSNARATHEQFAA
WHQAVMTAYGVNEEQMKILLNGFMVWCIENGTSPNLNGTWVMMDGEDQVSYPLKPMVENAQPTLRQIMTHFSDLAEAYIE
MRNRERPYMPRYGLQRNITDMSLSRYAFDFYELTSKTPVRAREAHMQMKAAAVRNSGTRLFGLDGNVGTAEEDTERHTAH
DVNRNMHTLLGVRQ
>P08544 ~~~~~~Genome polyprotein~~~
MACKHGYPDVCPICTAVDATPGFEYLLMADGEWYPTDLLCVDLDDDVFWPSDTSNQSQTMDWTDVPLIRDIVMEPQGNSS
SSDKSNSQSSGNEGVIINNFYSNQYQNSIDLSASGGNAGDAPQTNGQLSNILGGAANAFATMAPLLLDQNTEEMENLSDR
VASDKAGNSATNTQSTVGRLCGYGKSHHGEHPASCADTATDKVLAAERYYTIDLASWTTSQEAFSHIRIPLPHVLAGEDG
GVFGATLRRHYLCKTGWRVQVQCNASQFHAGSLLVFMAPEFYTGKGTKTGTMEPSDPFTMDTEWRSPQGAPTGYRYDSRT
GFFATNHQNQWQWTVYPHQILNLRTNTTVDLEVPYVNVAPSSSWTQHANWTLVVAVLSPLQYATGSSPDVQITASLQPVN
PVFNGLRHETVIAQSPIPVTVREHKGCFYSTNPDTTVPIYGKTISTPSDYMCGEFSDLLELCKLPTFLGNPNTNNKRYPY
FSATNSVPATSMVDYQVALSCSCMANSMLAAVARNFNQYRGSLNFLFVFTGAAMVKGKFLIAYTPPGAGKPTTRDQAMQS
TYAIWDLGLNSSFNFTAPFISPTHYRQTSYTSPTITSVDGWVTVWKLTPLTYPSGTPTNSDILTLVSAGDDFTLRMPISP
TKWVPQGVDNAEKGKVSNDDASVDFVAEPVKLPENQTRVAFFYDRAVPIGMLRPGQNMETTFNYQENDYRLNCLLLTPLP
SFCPDSSSGPQKTKAPVQWRWVRSGGVNGANFPLMTKQDYAFLCFSPFTFYKCDLEVTVSALGMTRVASVLRWAPTGAPA
DVTDQLIGYTPSLGETRNPHMWLVGAGNSQVSFVVPYNSPLSVLPAAWFNGWSDFGNTKDFGVAPNADFGRLWIQGNTSA
SVRIRYKKMKVFCPRPTLFFPWPTPTTTKINADNPVPILELENPAALYRIDLFITFTDEFITFDYKVHGRPVLTFRIPGF
GLTPAGRMLVCMGEQPAHGPFTSSRSLYHVIFTATCSSFSFSIYKGRYRSWKKPIHDELVDRGYTTFGEFFKAVRGYHAD
YYRQRLIHDVETNPGPVQSVFQPQGAVLTKSLAPQAGIQNLLLRLLGIDGDCSEVSKAITVVTDLVAAWEKAKTTLVSPE
FWSKLILKTTKFIAASVLYLHNPDFTTTVCLSLMTGVDLLTNDSVFDWLKQKLSSFFRTPPPACPNVMQPQGPLREANEG
FTFAKNIEWAMKTIQSVVNWLTSWFKQEEDHPQSKLDKLLMEFPDHCRNIMDMRNGRKAYCECTASFKYFDELYNLAVTC
KRIPLASLCEKFKNRHDHSVTRPEPVVVVLRGAAGQGKSVTSQIIAQSVSKMAFGRQSVYSMPPDSEYFDGYENQFSVIM
DDLGQNPDGEDFTVFCQMVSSTNFLPNMAHLERKGTPFTSSFIVATTNLPKFRPVTVAHYPAVDRRITFDFTVTAGPHCK
TPAGMLDVEKAFDEIPGSKPQLACFSADCPLLHKRGVMFTCNRTQTVYNLQQVVKMVNDTITRKTENVKKMNSLVAQSPP
DWEHFENILTCLRQNNAALQDQLDELQEAFAQARERSDFLSDWLKVSAIIFAGIASLSAVIKLASKFKESIWPTPVRVEL
SEGEQAAYAGRARAQKQALQVLDIQGGGKVLAQAGNPVMDFELFCAKNIVAPITFYYPDKAEVTQSCLLLRAHLFVVNRH
VAETDWTAFKLKDVRHERHTVALRSVNRSGAKTDLTFIKVTKGPLFKDNVNKFCSNKDDFPARNDTVTGIMNTGLAFVYS
GNFLIGNQPVNTTTGACFNHCLHYRAQTRRGWCGSAIICNVNGKKAVYGMHSAGGGGLAAATIITKELIEAAEKSMLALE
PQGAIVDIATGSVVHVPRKTKLRRTVAHDVFQPKFEPAVLSRYDPRTDKDVDVVAFSKHTTNMESLPPIFDVVCGEYANR
VFTILGKENGLLTVEQAVLGLPGMDPMEKDTSPGLPYTQQGLRRTDLLNFITAKMTPQLDYAHSKLVIGVYDDVVYQSFL
KDEIRPIEKIHEAKTRIVDVPPFAHCIWGRQLLGRFASKFQTKPGLELGSAIGTDPDVDWTRYAVELSGFNYVYDVDYSN
FDASHSTAMFECLINNFFTEQNGFDRRIAEYLRSLAVSRHAYEDRRVLIRGGLPSGCAATSMLNTIMNNVIIRAALYLTY
SNFDFDDIKVLSYGDDLLIGTNYQIDFNLVKERLAPFGYKITPANKTTTFPLTSHLQDVTFLKRRFVRFNSYLFRPQMDA
VNLKAMVSYCKPGTLKEKLMSIALLAVHSGPDIYDEIFLPFRNVGIVVPTYSSMLYRWLSLFR
>P13899 ~~~~~~Genome polyprotein~~~
MACKHGYPDVCPICTAVDVTPGFEYLLLADGEWFPTDLLCVDLDDDVFWPSNSSNQSETMEWTDLPLVRDIVMEPQGNAS
SSDKSNSQSSGNEGVIINNFYSNQYQNSIDLSASGGNAGDAPQNNGQLSNILGGAANAFATMAPLLLDQNTEEMENLSDR
VASDKAGNSATNTQSTVGRLCGYGEAHHGEHPASCADTATDKVLAAERYYTIDLASWTTTQEAFSHIRIPLPHVLAGEDG
GVFGATLRRHYLCKTGWRVQVQCNASQFHAGSLLVFMAPEFYTGKGTKTGDMEPTDPFTMDTTWRAPQGAPTGYRYDSRT
GFFAMNHQNQWQWTVYPHQILNLRTNTTVDLEVPYVNIAPTSSWTQHANWTLVVAVFSPLQYASGSSSDVQITASIQPVN
PVFNGLRHETVIAQSPIAVTVREHKGCFYSTNPDTTVPIYGKTISTPNDYMCGEFSDLLELCKLPTFLGNPNSNNKRYPY
FSATNSVPTTSLVDYQVALSCSCMCNSMLAAVARNFNQYRGSLNFLFVFTGAAMVKGKFLIAYTPPGAGKPTTRDQAMQA
TYAIWDLGLNSSFVFTAPFISPTHYRQTSYTSATIASVDGWVTVWQLTPLTYPSGAPVNSDILTLVSAGDDFTLRMPISP
TKWAPQGSDNAEKGKVSNDDASVDFVAEPVKLPENQTRVAFFYDRAVPIGMLRPGQNIESTFVYQENDLRLNCLLLTPLP
SFCPDSTSGPVKTKAPVQWRWVRSGGTTNFPLMTKQDYAFLCFSPFTYYKCDLEVTVSALGTDTVASVLRWAPTGAPADV
TDQLIGYTPSLGETRNPHMWLVGAGNTQISFVVPYNSPLSVLPAAWFNGWSDFGNTKDFGVAPNADFGRLWIQGNTSASV
RIRYKKMKVFCPRPTLFFPWPVSTRSKINADNPVPILELENPAAFYRIDLFITFIDEFITFDYKVHGRPVLTFRIPGFGL
TPAGRMLVCMGEKPAHGPFTSSRSLYHVIFTATCSSFSFSIYKGRYRSWKKPIHDELVDRGYTTFGEFFRAVRAYHADYY
KQRLIHDVEMNPGPVQSVFQPQGAVLTKSLAPQAGIQNLLLRLLGIDGDCSEVSKAITVVTDLFAAWERAKTTLVSPEFW
SKLILKTTKFIAASVLYLHNPDFTTTVCLSLMTGVDLLTNDSVFDWLKNKLSSFFRTPPPVCPNVLQPQGPLREANEGFT
FAKNIEWAMKTIQSIVNWLTSWFKQEEDHPQSKLDKFLMEFPDHCRNIMDMRNGRKAYCECTASFKYFDELYNLAVTCKR
IPLASLCEKFKNRHDHSVTRPEPVVVVLRGAAGQGKSVTSQIIAQSVSKMAFGRQSVYSMPPDSEYFDGYENQFSVIMDD
LGQNPDGEDFTVFCQMVSSTNFLPNMAHLERKGTPFTSSFIVATTNLPKFRPVTVAHYPAVDRRITFDFTVTAGPHCTTS
NGMLDIEKAFDEIPGSKPQLACFSADCPLLHKRGVMFTCNRTKAVYNLQQVVKMVNDTITRKTENVKKMNSLVAQSPPDW
EHFENILTCLRQNNAALQDQLDELQEAFAQARERSDFLSDWLKVSAIIFAGIASLSAVIKLASKFKESIWPSPVRVELSE
GEQAAYAGRARAQKQALQVLDIQGGGKVLAQAGNPVMDFELFCAKNMVAPITFYYPDKAEVTQSCLLLRAHLFVVNRHVA
ETEWTAFKLKDVRHERDTVVTRSVNRSGAETDLTFIKVTKGPLFKDNVNKFCSNKDDFPARNDAVTGIMNTGLAFVYSGN
FLIGNQPVNTTTGACFNHCLHYRAQTRRGWCGSAVICNVNGKKAVYGMHSAGGGGLAAATIITRELIEAAEKSMLALEPQ
GAIVDISTGSVVHVPRKTKLRRTVAHDVFQPKFEPAVLSRYDPRTDKDVDVVAFSKHTTNMESLPPVFDIVCDEYANRVF
TILGKDNGLLTVEQAVLGLPGMDPMEKDTSPGLPYTQQGLRRTDLLNFNTAKMTPQLDYAHSKLVLGVYDDVVYQSFLKD
EIRPLEKIHEAKTRIVDVPPFAHCIWGRQLLGRFASKFQTKPGLELGSAIGTDPDVDWTPYAAELSGFNYVYDVDYSNFD
ASHSTAMFECLIKNFFTEQNGFDRRIAEYLRSLAVSRHAYEDRRVLIRGGLLSGCAATSMLNTIMNNVIIRAALYLTYSN
FEFDDIKVLSYGDDLLIGTNYQIDFNLVKERLAPFGYKITPANKTTTFPLTSHLQDVTFLKRRFVRFNSYLFRPQMDAVN
LKAMVSYCKPGTLKEKLMSIALLAVHSGPDIYDEIFLPFRNVGIVVPTYSSMLYRWLSLFR
>P08545 ~~~~~~Genome polyprotein~~~
MACKHGYPDVCPICTAVDATPDFEYLLMADGEWFPTDLLCVDLDDDVFWPSDTSTQPQTMEWTDVPLVCDTVMEPQGNAS
SSDKSNSQSSGNEGVIINNFYSNQYQNSIDLSASGGNAGDAPQNNGQLSSILGGAANAFATMAPLLMDQNTEEMENLSDR
VASDKAGNSATNTQSTVGRLCGYGKSHHGEHPTSCADAATDKVLAAERYYTIDLASWTTSQEAFSHIRIPLPHVLAGEDG
GVFGATLRRHYLCKTGWRVQVQCNASQFHAGSLLVFMAPEFYTGKGTKSGTMEPSDPFTMDTTWRSPQSAPTGYRYDRQA
GFFAMNHQNQWQWTVYPHQILNLRTNTTVDLEVPYVNVAPSSSWTQHANWTLVVAVLSPLQYATGSSPDVQITASLQPVN
PVFNGLRHETVLAQSPIPVTVREHQGCFYSTNPDTTVPIYGKTISTPSDYMCGEFSDLLELCKLPTFLGNPSTDNKRYPY
FSATNSVPATSLVDYQVALSCSCTANSMLAAVARNFNQYRGSLNFLFVFTGAAMVKGKFRIAYTPPGAGKPTTRDQAMQA
TYAIWDLGLNSSFNFTAPFISPTHYRQTSYTSPTITSVDGWVTVWQLTPLTYPSGTPTHSDILTLVSAGDDFTLRMPISP
TKWVPQGIDNAEKGKVSNDDASVDFVAEPVKLPENQTRVAFFYDRAVPIGMLRPGQNMETTFSYQENDFRLNCLLLTPLP
SYCPDSSSGPVRTKAPVQWRWVRSGGANGANFPLMTKQDYAFLCFSPFTYYKCDLEVTVSAMGAGTVSSVLRWAPTGAPA
DVTDQLIGYTPSLGETRNPHMWIVGSGNSQISFVVPYNSPLSVLPAAWFNGWSDFGNTKDFGVAPTSDFGRIWIQGNSSA
SVRIRYKKMKVFCPRPTLFFPWPTPTTTKINADNPVPILELENPASLYRIDLFITFTDELITFDYKVHGRPVLTFRIPGF
GLTPAGRMLVCMGAKPAHSPFTSSKSLYHVIFTSTCNSFSFTIYKGRYRSWKKPIHDELVDRGYTTFREFFKAVRGYHAD
YYKQRLIHDVEMNPGPVQSVFQPQGAVLTKSLAPQAGIQNILLRLLGIEGDCSEVSKAITVVTDLVAAWEKAKTTLVSPE
FWSELILKTTKFIAASVLYLHNPDFTTTVCLSLMTGVDLLTNDSVFDWLKSKLSSFFRTPPPACPNVMQPQGPLREANEG
FTFAKNIEWATKTIQSIVNWLTSWFKQEEDHPQSKLDKLLMEFPDHCRNIMDMRNGRKAYCECTASFKYFDDLYNLAVTC
KRIPLASLCEKFKNRHDHSVTRPEPVVAVLRGAAGQGKSVTSQIIAQSVSKMAFGRQSVYSMPPDSEYFDGYENQFSVIM
DDLGQNPDGEDFTVFCQMVSSTNFLPNMAHLERKGTPFTSSFIVATTNLPKFRPVTVAHYPAVDRRITFDFTVTAGPHCK
TPAGMLDIEKAFDEIPGSKPQLACFSADCPLLHKRGVMFTCNRTKTVYNLQQVVKMVNDTITRKTENVKKMNSLVAQSPP
DWQHFENILTCLRQNNAALQDQVDELQEAFTQARERSDFLSDWLKVSAIIFAGIVSLSAVIKLASKFKESIWPTPVRVEL
SEGEQAAYAGRARAQKQALQVLDIQGGGKVLAQAGNPVMDFELFCAKNMVSPITFYYPDKAEVTQSCLLLRAHLFVVNRH
VAETEWTAFKLRDVRHERDTVVMRSVNRSGAETDLTFVKVTKGPLFKDNVNKFCSNKDDFPARNDTVTGIMNTGLAFVYS
GNFLIGNQPVNTTTGACFNHCLHYRAQTRRGWCGSAIICNVNGKKAVYGMHSAGGGGLAAATIITRELIEAAEKSMLALE
PQGAIVDISTGSVVHVPRKTKLRRTVAHDVFQPKFEPAVLSRYDPRTDKDVDVVAFSKHTTNMESLPPIFDIVCGEYANR
VFTILGKDNGLLTVEQAVLGLSGMDPMEKDTSPGLPYTQQGLRRTDLLDFNTAKMTPQLDYAHSKLVLGVYDDVVYQSFL
KDEIRPLEKIHEAKTRIVDVPPFAHCIWGRQLLGRFASKFQTKPGFELGSAIGTDPDVDWTRYAAELSGFNYVYDVDYSN
FDASHSTAMFECLINNFFTEQNGFDRRIAEYLRSLAVSRHAYEDRRVLIRGGLPSGCAATSMLNTIMNNVIIRAALYLTY
SNFEFDDIKVLSYGDDLLIGTNYQIDFNLVKERLAPFGYKITPANKTTTFPLTSHLQDVTFLKRRFVRFNSYLFRPQMDA
VNLKAMVSYCKPGTLKEKLMSIALLAVHSGPDIYDEIFLPFRNVGIVVPTYDSMLYRWLSLFR
>A4KZ49 ~~~~~~Genome polyprotein~~~
MSSKKMMWVPKSAHKAPVVSREPVIRKKEWVARQIPKYIPVSNPSDCRDEISQTLLHFDSEEAVYDFVWRFPMGSIFWDT
NGRIKPVVNCLLRATRMNLDYDVAADVYVCRDCLSCASSYMYFSNYHYDCRELRENHEAVVSCKYEQHIVSTFDVFPRYC
TQEIEQNVVNWMTETLERYDNEPLRIEKQLQFYNHKTEQMESRVQEVQVTTAEYAVSDTYVPQQLSRKGSVSAKLTQRRA
NKIIMRTHEVENLIRETIDLCDERQIPITFVDVKHKRCLPRIPLRHMQAKPDISEIVEQGDMYNEVGQFIEQYQNLAEPF
RVIRDYEVTRGWSGVILHRDDLALDPQTQARCLNNLFVVMGRCEHGHLQNALRPDCLEGLTYYSDTFGKVFNESLVKHHP
GKHQFRIGSRTDYEWEELAMWVNAVCPVSFRCADCRPPQSLNEYIENIRMSKAMAELAGRQDALSKTLHKWTTMLISSVL
TTEIRARDNLEPIQERIFTRNMPLGPLYDVAGAMNRAVIDIQTAVQNMQLSIGNSNMNEQQRNQTLLNEINKIKQHSFMQ
TKEMLSRFENIAQTYQNIISSASQPLSIHSMRQLMMDSRMDESFEFDIMRKKGSIASIAPMAFRTFEDIYSQPGVYNQKW
LNLTPSGRFQTDIDYLRLDLPIDVIQKKKHVVNRNEIKEETCYVIVGQVNVSFCEVVARCFVPIPHVLRVGSPQNPTMIK
IQDQEGGKTLVPKSGFCYVLQLVLMLGYVPDQLTAAFVKDVGIVVESLGPWPLFVDYLGAIKNLIIRYPTTIKAPTALHI
VDHVDTVIHVMTTLGCVNKGEHYLTLQSVAQLHDAAMTVNIETFKDYRIGGVVPQLKHMLQSEEHMLEVLEAKPQWLVHL
LLSPTQIWALSQSVVKYQVIHKVMTSNPDLAVALAQLVAISSNFSIFKNTEHVIQKYFEVSKQLQNVSGVILGEHNEYFE
TAFAQYSALRFSTDVVLLMDQFSTRKKTLDDLEDYYRKTIPSILIECGLLGPSDFGWRKRLVRGVVDRGSGLKSTVKSLG
SFSTKEKWISWSGLGSGTITCVKFPFVCLQRSGSWLYSSTKTTAFNAVWMAGIKCVKSNVRSILLDSALYGAITLALLCA
IKLIRKAFRFVEGLIKEDTSDDEDYVLHAKAASDSLYIQCLAWLALVVGCFNSGLANDIYFSTTKYRTLLDMVKTAHSDS
FVFHAGDEEEGEIVELITRDNFVDYVYNHSDPLMEFDSETLLGWYTRISYQGRVLEHPLRVGTNCHLTRENVDEIAKNIA
TGAGNEFIVVGDVGSGKSTKLPIAVSTYGPVLILVPSRELVNNLCSSIWHVGKKQASTYMMNCITRGTSNISIMTYGYAL
ALFSHCPIELQKYRFIQMDECHEFSSHMITFYSWWRESGKFTKLFKTTATPPGTVIKGGCVPTNHKVDVIEIRDVSVEEF
CRRSIDSHAEGLRSLMPNGGRVIMFVPSRRECELARSSLISIPGARTWVVYRAAATQATKLVAELADDKHYFQIIITTTV
LQNGVNLDPDCVVDFGQTFEAAYDRDSRQLGVRRRNINPGELIQRVGRVGRNKPGKFIQVGKRLEHEVVPNSCCVTDAIL
MSFTLELAPFISSHLIDEVNFVTREQVRTAMKFSAPLLFMIHYVRRDGRMLNGYYQQLKGLLLQTSDVALCDTLVGDAET
NSFLTLRQYQLRGIIEAQEVLPDLPIPFYSSEFALPFYLEIGQITKEAIRARSFTLRIKTPDVKKAVMRLSTSATQIDQT
IGILRTRLQLTRERLSKFSELKATAHNLRLTPIFNTCFDMGAAKSESTLRASLTAGEELLSALELARTEKSDKALEKLIL
DNPVLGDCLVFHGGPEEYFDQTLFQTSTGLINKYTVGIACLTVGLGCTIWYYLKKREKYVMHGKVHTRETGLTTNHLFVP
GMKEHIQEWTGGDHEIGNRFGEAYKRRFIGRQPTEEQKLSKEKWDKREGQQTSVYKTLYDLDPTKFKYVVVECPDFDLKK
KLNRQEKKQLDTTIVEACRTRMLDKGQHDFKDVERATVYLFNDNGVGHKVQLTPHNPLAVSRTTTHPVGFPAEAGRLRQT
GQAMEMTPEELEKALDDNYVPHSRCQIDISHLHRHLAIVNTGGMSTQCFITQTMCVAPYHLAMGFKDNTKLTIYCSNGVY
VMPVPKVEKMENMDLVVFRMPQDFPPLKRCATIREPKSSDEVTLITGKRTTHGIQLQFSKVVSIDRKSDTVWKYMIDSVP
GVCGGMVMCVEDGCVVGFHSAAAIRNKVSNGSIFTPVTPQLLDSLQSSEGHLFDWYFNDDLISWKGVPTNMDPRNFPVSE
TISEFIFHNDSKGHGTDKYYGENLTIEGRVLQSFNTRHVVKGLDDAFAEYVNKFGEPPADTFTHLPSDLSSDAFYKDFMK
YSTPVEVGTVNIENFEKAVQAVVELLEQQGFEQGEFSPEMDFYKILNSFNLDTAMGALYQCKKKDVLPMASHEQLATWFW
NSLENLATGKLGLWKASLKAELRPKEKVLEKKTRVFTAAPFDVSFGAKAFVDDFNNKFYATQAGSNWTVGINKFNCGWDE
LARRFNPDWKFIDADGSRYDSSLTPLLFNAVLRIRQHFLRANGFERRMLSNFYTQLVWTPISTITGQIVKKNKGGPSGQP
STVVDNTMMLMIAVEYAKLQYGVTDLKYVCNGDDLILNAPQGVCETIRANFSHSFKELGLTYEFEQEVDSIDQVEYMSHK
WIDCGGVLIPKLKPERIVSVLQWNKSLDLASQANKINAAWIESFGYGDLSKFIREYANWWGERNGQVGFLCSEEKVASLY
LTNDVTIHTEEHDEFVFHSGADQSGVVKDQTGDKAEGSGTKTEDPPNQTTDPVNNPSNGGNKDAPQNLNATVVTKSYTYI
PPIMKSLVTIDTAKKMADYTPPDALISTQACTLEQFGRWANAAANGLGLSMQAFQTDVVPYWIYWCIVNSASDEHKKLSS
WTKVNMTIDDATGQINLNEGEAQTIYEMSPMFDEAKPTLRAVMRHFGALAYRWVKFSIAKRKPIIPHNAIKAGLMDVTYF
PCCIDFVTVDQLSPQEQNVRNQVINARVSDTPRALFKHAQRAGAGEEDTNLRRDDDANYGRTRVGGAMFGTR
>P89509 ~~~~~~Genome polyprotein~~~
MAAVTFASAITNAITSKPALTGMVQFGSFPPMPLRSTTVTTVATSVAQPKLYTVQFGSLDPVVVKSGAGSLAKATRQQPN
VEIDVSLSEAAALEVAKPRSNAVLRMHEEANKERALFLDWEASLKRSSYGIAEDEKVVMTTHGVSKIVPRSSRAMKLKRA
RERRRAQQPIILKWEPKLSGISIGGGLSASVIEAEEVRTKWPLHKTPSMKKRTVHRICKMNDQGVDMLTRSLVKIFKTKS
ANIEYIGKKSIKVDFIRKERTKFARIQVAHLLGKRAQRDLLTGMEENHFIDILSKYSGNKTTINPGVVCAGWSGIVVGNG
ILTQKRSRSPSEAFVIRGEHEGKLYDARIKVTRTMSHKIVHFSAAGANFWKGFDRCFLAYRSDNREHTCYSGLDVTECGE
VAALMCLAMFPCGKITCPDCVTDSELSQGQASGPSMKHRLTQLRDVIKSSYPRFKHAVQILDRYEQSLSSANENYQDFAE
IQSISDGVEKAAFPHVNKLNAILIKGATVTGEEFSQATKHLLEIARYLKNRTENIEKGSLKSFRNKISQKAHINPTLMCD
NQLDRNGNFIWGERGYHAKRFFSNYFEIIDPKKGYTQYETRAVPNGSRKLAIGKLIVPTNFEVLREQMKGEPVEPYPVTV
ECVSKLQGDFVHACCCVTTESGDPVLSEIKMPTKHHLVIGNSGDPKYIDLPEIEENKMYIAKEGYCYINIFLAMLVNVKE
SQAKEFTKVVRDKLVGELGKWPTLLDVATACYFLKVFYPDVANAELPRMLVDHKTKIIHVVDSYGSLSTGYHVLKTNTVE
QLIKFTRCNLESSLKHYRVGGTEWEDTHGSSNIDNPQWCIKRLIKGVYKPKQLKEDMLANPFLPLYALLSPGVILAFYNS
GSLEYLMNHYIRVDSNVAVLLVVLKSLAKKVSTSQSVLAQLQIIERSLPELIEAKANVNGPDDAATRACNRFMGMLLHMA
EPNWELADGGYTILRDHSISILEKSYLQILDEAWNELSWSERCAIRYYSSKQAIFTQKDLPMKSEADLGGRYSVSVMSSY
ERSKQCMKSVHSSIGNRLRSSMSWTSSKVSNSVCRTINYLVPDVFKFMNVLVCISLLIKMTAEANHIVTTQRRLKLDVEE
TERRKIEWELAFHHAILTQSAGQHPTIDEFRAYIADKAPHLSEHIEPEEKAVVHQAKRQSEQELERIIAFVALVLMMFDA
ERSDCVTKILNKLKGLVATVEPTVYHQTLNDIEDDLSERNLFVDFELSSDGDMLQQLPAEKTFASWWSHQLSRGFTIPHY
RTEGKFMTFTRATATEVAGKIAHESDKDILLMGAVGSGKSTGLPYHLSRKGNVLLLEPTRPLAENVHKQLSQAPFHQNTT
LRMRGLTAFGSAPISVMTSGFALNYFANNRMRIEEFDFVIFDECHVHDANAMAMRCLLHECDYSGKIIKVSATPPGREVE
FSTQYPVSISTEDTLSFQDFVNAQGSGSNCDVISKGDNILVYVASYNEVDALSKLLIERDFKVTKVDGRTMKVGNIEITT
SGTPSKKHFIVATNIIENGVTLDIDVVADFGTKVLPYLDTDSRMLSTTKTSINYGERIQRLGRVGRHKPGHALRIGHTEK
GLSEVPSCIATEAALKCFTYGLPVITNNVSTSILGNVTVKQARTMSVFEITPFYTSQVVRYDGSMHPQVHALLKRFKLRD
SEIVLNKLAIPHRGVNAWLTASEYARLGANVEDRRDVRIPFMCRDIPEKLHLDMWDVIVKFKGDAGFGRLSSASASKVAY
TLQTDVNSIQRTVTIIDTLIAEERRKQEYFKTVTSNCVSSSNFSLQSITNAIKSRMMKDHTCENISVLEGAKSQLLEFRN
LNADHSFATKTDGISRHFMSEYGALEAVHHQNTSDMSKFLKLKGKWNKTLITRDVLVLCGVLGGGLWMVIQHLRSKMSEP
VTHEAKGKRQRQKLKFRNARDNKMGREVYGDDDTIEHFFGDAYTKKGKSKGRTRGIGHKNRKFINMYGFDPEDFSAVRFV
DPLTGATLDDNPLTDITLVQEHFGNIRMDLLGEDELDSNEIRVNKTIQAYYMNNKTGKALKVDLTPHIPLKVCDLHATIA
GFPERENELRQTGKAQPINIDEVPRANNELVPVDHESNSMFRGLRDYNPISNNICHLTNVSDGASNSLYGVGFGPLILTN
RHLFERNNGELVIKSRHGEFVIKNTTQLHLLPIPDRDLLLIRLPKDVPPFPQKLGFRQPEKGERICMVGSNFQTKSITSI
VSETSTIMPVENSQFWKHWISTKDGQCGSPMVSTKDGKILGLHSLANFQNSINYFAAFPDDFAEKYLHTIEAHEWVKHWK
YNTSAISWGSLNIQASQPSGLFKVSKLISDLDSTAVYAQTQQNRWMFEQLNGNLKAIAHCPSQLVTKHTVKGKCQMFDLY
LKLHDEAREYFQPMLGQYQKSKLNREAYAKDLLKYATPIEAGNIDCDLFEKTVEIVVSDLRGYGFETCNYVTDENDIFEA
LNMKSAVGALYKGKKKDYFAEFTPEMKEEILKQSCERLFLGKMGVWNGSLKAELRPLEKVEANKTRTFTAAPLDTLLGGK
VCVDDFNNQFYDHNLRAPWSVGMTKFYCGWDRLLESLPDGWVYCDADGSQFDSSLSPYLINAVLNIRLGFMEEWDIGEVM
LRNLYTEIVYTPISTPDGTLVKKFKGNNSGQPSTVVDNTLMVILAVNYSLKKSGIPSELRDSIIRFFVNGDDLLLSVHPE
YEYILDTMADNFRELGLKYTFDSRTREKGDLWFMSHQGHKREGIWIPKLEPERIVSILEWDRSKEPCHRLEAICAAMIES
WGYDKLTHEIRKFYAWMIEQAPFSSLAQEGKAPYIAETALRKLYLDKEPAQEDLTHYLQAIFEDYEDGAEACVYHQAGET
LDAGLTDEQKQAEKEKKEREKAEKERERQKQLALKKGKDVAQEEGKRDKEVNAGTSGTFSVPRLKSLTSKMRVPRYEKRV
ALNLDHLILYTPEQTDLSNTRSTRKQFDTWFEGVMADYELTEDKMQIILNGLMVWCIENGTSPNINGMWVMMDGDDQVEF
PIKPLIDHAKPTFRQIMAHFSDVAEAYIEKRNQDRPYMPRYGLQRNLTDMSLARYAFDFYEMTSRTPIRAREAHIQMKAA
ALRGANNNLFGLDGNVGTTVENTERHTTEDVNRNMHNLLGVQGL
>Q02597 ~~~~~~Genome polyprotein~~~
MAAVTFASAITNAITNKTTSTGMVQFGSFPPMPLRSTTVTTVATPVGQPKLYTVRFGSLDPVIVKGGAGSLAKATRQQPS
VEIDVSLSEAAALEVAKPKSSAVLRMHEEANKERALFLDWEASLKRRSYGIAENEKVVMTTRGVSKIVPRSSRAMKQKRA
RERRRAQQPIILKWEPKLSGFSIGGGFSASAIEAEEVRTKWPLHKTPSMKKRMVHKTCKMSDQGVDMLIRSLVKIFKAKS
ANIEYIGKKPIKVDFIRKERTKFARIQVAHLLGKRAQRDLLAGMEENHFIDILSEYSGNGTTINPGVVCAGWSGIVVRNE
TLTQKRSRSPSKAFVIRGEHEDKLYDARIKITKTMSLKIVHFSARGANFWKGFDRCFLAYRSDNREHTCYSGLDVTECGE
VAALMCLAMFPCGKITCPDCVIDSELSQGQASGPSMKHRLTQLRDVIKSSYPRFKHAVQILDRYEQSLSSANENYQDFAE
IQSISDGVEKAAFPHVNKLNAILIKGATATGEEFSQATKHLLEIARYLKNRTENIEKGSLKSFRNKVSQKAHINPTLMCD
NQLDKNGNFIWGERGYHAKRFFSNYFEIIDPKKGYTQYETRVVPNGSRKLAIGKLIVPTNFEVLREQMRGEPVEPYPVTV
ECVSKSQGDFVHACCCVTTESGDPVLSEIKMPTKHHLVIGNSGDPKYIDLPEIEENKMYIAKEGYCYINIFLAMLVNVKE
SQAKEFTKVVRDKLVSELGKWPTLLDVATACYFLKVFYPDVANAELPRMLVDHKTKIIHVVDSYGSLSTGYHVLKTNTVE
QLIKFTRCNLESSLKHYRVGGTEWENAHGADNIDNPQWCIKRLVKGVYRPKQLKEDMLANPFLPLYALLSPGVILAFYNS
GSLEHLMNHYISADSNVAVLLVVLKSLAKKVSTSQSVLAQLQIIERSLPELIEAKANINGPDDAATRACNRFMGMLLHMA
EPNYELANGGYTFLRDHSISILEKSYLQILDEAWNELSWSERCVIRYYPSKQAIFTQKDLPMQSEADLGGRYSESVISSY
EWSKQQAKGVKDSVVNKLRSSMSWTSSKVSNSVCRTINYLVPDVFKFMNVLVCISLLIKMTAEANHIITTQRRLKLDIEE
TERKKIEWELAFHHNILTHSASQHPTLDEFTAYIAEKAPHLSEHIEPEEKEVVHQAKRQSEQELERVIAFVALVLMMFDA
ERSDCVTKILNKLKGLVATVEPTVYHQTLNEIEDDLNERNLFVDFELSSDSEMLQQLPAEKTFASWWSHQLSRGFTIPHY
RTEGKFMTFTRATATEVAGKIAHESDKDILLMGAVGSGKSTGLPYHLSRKGNVLLLEPTRPLAENVHKQLSQAPFHQNTT
LRMRGLTAFGSAPISVMTSGFALNYFANNRSRIEEFDFVIFDECHVHDANAMAMRCLIHECDYSGKIIKVSATPPGREVE
FSTQYPVSISTEDTLSFQDFVNAQGSGSNCDVISKGDNILVYVASYNEVDTLSKLLIERDFKVTKVDGRTMKVGNIEITT
SGTPSRKHFIVATNIIENGVTLDIDVVADFGTKVLPYLDTDNRMLSTTKTSINYGERIQRLGRVGRHKPGHALRIGHTER
GLSEVPSCIATEAALKCFTYGLPVITNNVSTSILGNVTVKQARTMSVFEITPFYTSQVVRYDGSMHPQVHALLKRFKLRD
SEIVLTKLAIPNRGVNAGSQPVSMHDSVQMLKIGVTLRIPFMCRDIPEKLHLDMWDVVVKFKGDAGFGRLSSSASKVAYT
LQTDVNSIQRTVTIIDTLIAEERRKQEYFKTVTSNCVSSSNFSLQSITNAIKSRMMKDHPCENISVLEGAKSQLLEFRNL
NSDHSFVTKTDGISRSFMRDYGALEAVNHQSTNEMSKFLQLKGKWNKTLITRDVLVICGVLGGGVWMVVQHFRSKVSEPV
THEAKGKKQRQKLKFRNARDNKMGREVYGDDDTIEHFFGDAYTKKGKSKGRTRGIGHKNRKFINMYGFDPEDFSAVRFVD
PLTGATLDDNPFTDITLVQKHFGDIRMDLLGEDELDSNEIRMNKTIQAYYMNNKTGKALKVDLTPHIPLKVCDLHATIAG
FPERENELRQTGKAQPINIDEVPRANNELVPVDHESNSMFRGLRDYNPISNNICHLTNVSDGASNSLYGVGFGPLILTNR
HLFERNNGELIIKSRHGEFVIKNTTQLHLLPIPDRDLLLIRLPKDVPPFPQKLGFRQPEKGERICMVGSNFQTKSITSIV
SETSTIMPVENSQFWKHWISTKDGQCGSPMVSTKDGKILGLHSLANFQNSINYFAAFPDDFTEKYLHTIEAHEWVKHWKY
NTSAISWGSLNIQASQPVSLFKVSKLISDLDSTAVYAQTQQNRWMFEQLTGNLKAIAHCPSQLVTKHTVKGKCQMFDLYL
KLHDEAREYFQPMLGQYQKSKLNREAYAKDLLKYATPIEAGNIDCDLFEKTVEIVISDLRGYGFETCNYVTDENDIFEAL
NMKSAVGALYKGKKKDYFAEFTPEVKEEILKQSCERLFLGKMGVWNGSLKAELRPLEKVEANKTRTFTAAPLDTLLGGKV
CVDDFNNQFYDHNLRAPWDVGMTKFYCGWDRLLESLPDGWVYCDADGSQFDSSLSPYLINAVLNIRLGFMEEWDVGEVML
RNLYTEIVYTPISTPDGTLVKKFKGNNSGQPSTVVDNTLMVILAVNYSLKKGGIPSELRDSIIRFFVNGDDLLLSVHPEY
EYILDTMADNFRELGLKYTFDSRTREKGDLWFMSHQGHRREGIWIPKLEPERIVSILEWDRSKEPCHRLEAICAAMIESW
GYDKLTHEIRKFYAWMIEQAPFSSLAQEGKAPYIAETALRKLYLDKEPAQEDLTQYLQAIFEDYEDGVEACVYHQAGETL
DADLTEEQKQAEKEKKEREKAEKERERQKQLAFKKGKDVAQEEGKRDKEVNAGTSGTFSVPRLKSLTSKMRVPRYEKRVA
LNLDHLILYTPEQTDLSNTRSTRKQFDTWFEGVMADYELTEDKMQIILNGLRVWCIENGTSPNINGMWVMMDGDDQVEFP
IKPLIDHAKPTFRQIMAHFSDVAEAYIEKRNQDRPYMPRYGLQRNLTDMSLARYAFDFYEMTSRTPIRAREAHIQMKAAA
LRGANNNLFGLDGNVGTTVENTERHTTEDVNRNMHNLLGVQGL
>P09814 ~~~~~~Genome polyprotein~~~
MAATMIFGSFTHDLLGKAMSTIHSAVTAEKDIFSSIKERLERKRHGKICRMKNGSIYIKAASSTKVEKINAAAKKLADDK
AAFLKAQPTIVDKIIVNEKIQVVEAEEVHKREDVQTVFFKKTKKRAPKLRATCSSSGLDNLYNAVANIAKASSLRVEVIH
KKRVCGEFKQTRFGRALFIDVAHAKGHRRRIDCRMHRREQRTMHMFMRKTTKTEVRSKHLRKGDSGIVLLTQKIKGHLSG
VRDEFFIVRGTCDDSLLEARARFSQSITLRATHFSTGDIFWKGFNASFQEQKAIGLDHTCTSDLPVEACGHVAALMCQSL
FPCGKITCKRCIANLSNLDFDTFSELQGDRAMRILDVMRARFPSFTHTIRFLHDLFTQRRVTNPNTAAFREILRLIGDRN
EAPFAHVNRLNEILLLGSKANPDSLAKASDSLLELARYLNNRTENIRNGSLKHFRNKISSKAHSNLALSCDNQLDQNGNF
LWGLAGIAAKRFLNNYFETIDPEQGYDKYVIRKNPNGERKLAIGNFIISTNLEKLRDQLEGESIARVGITEECVSRKDGN
YRYPCCCVTLEDGSPMYSELKMPTKNHLVIGNSGDPKYLDLPGEISNLMYIAKEGYCYINIFLAMLVNVDEANAKDFTKR
VRDESVQKLGKWPSLIDVATECALLSTYYPAAASAELPRLLVDHAQKTIHVVDSYGSLNTGYHILKANTVSQLEKFASNT
LESPMAQYKVGGLVYSENNDASAVKALTQAIFRPDVLSELIEKEPYLMVFALVSPGILMAMSNSGALEFGISKWISSDHS
LVRMASILKTLASKVSVADTLALQKHIMRQNANFLCGELINGFQKKKSYTHATRFLLMISEENEMDDPVLNAGYRVLEAS
SHEIMEKTYLALLETSWSDLSLYGKFKSIWFTRKHFGRYKAELFPKEQTDLQGRYSNSLRFHYQSTLKRLRNKGSLCRER
FLESISSARRRTTCAVFSLLHKAFPDVLKFINTLVIVSLSMQIYYMLVAIIHEHRAAKIKSAQLEERVLEDKTMLLYDDF
KAKLPEGSFEEFLEYTRQRDKEVYEYLMMETTEIVEFQAKNTGQASLERIIAFVSLTLMLFDNERSDCVYKILTKFKGIL
GSVENNVRFQSLDTIVPTQEEKNMVIDFELDSDTAHTPQMQEQTFSDWWSNQIANNRVVPHYRTEGYFMQFTRNTASAVS
HQIAHNEHKDIILMGAVGSGKSTGLPTNLCKFGGVLLLEPTRPLAENVTKQMRGSPFFASPTLRMRNLSTFGSSPITVMT
TGFALHFFANNVKEFDRYQFIIFDEFHVLDSNAIAFRNLCHEYSYNGKIIKVSATPPGRECDLTTQYPVELLIEEQLSLR
DFVDAQGTDAHADVVKKGDNILVYVASYNEVDQLSKMLNERGFLVTKVDGRTMKLGGVEIITKGSSIKKHFIVATNIIEN
GVTLDVDVVVDFGLKVVPNLDSDNRLVSYCKIPISLGERIQRFGRVGRNKPGVALRIGETIKGLVEIPSMIATEAAFLCF
VYGLPVTTQNVSTSILSQVSVRQARVMCQFELPIFYTAHLVRYDGAMHPAIHNALKRFKLRDSEINLNTLAIPTSSSKTW
YTGKCYKQLVGRLDIPDEIKIPFYTKEVPEKVPEQIWDVMVKFSSDAGFGRMTSAAACKVAYTLQTDIHSIQRTVQIIDR
LLENEMKKRNHFNLVVNQSCSSHFMSLSSIMASLRAHYAKNHTGQNIEILQKAKAQLLEFSNLAIDPSTTEALRDFGYLE
AVRFQSESEMARGLKLSGHWKWSLISRDLIVVSGVGIGLGCMLWQFFKEKMHEPVKFQGKSRRRLQFRKARDDKMGYIMH
GEGDTIEHFFGAAYTKKGKSKGKTHGAGTKAHKFVNMYGVSPDEYSYVRYLDPVTGATLDESPMTDLNIVQEHFGEIRRE
AILADAMSPQQRNKGIQAYFVRNSTMPILKVDLTPHIPLKVCESNNIAGFPEREGELRRTGPTETLPFDALPPEKQEVAF
ESKALLKGVRDFNPISACVWLLENSSDGHSERLFGIGFGPYIIANQHLFRRNNGELTIKTMHGEFKVKNSTQLQMKPVEG
RDIIVIKMAKDFPPFPQKLKFRQPTIKDRVCMVSTNFQQKSVSSLVSESSHIVHKEDTSFWQHWITTKDGQCGSPLVSII
DGNILGIHSLTHTTNGSNYFVEFPEKFVATYLDAADGWCKNWKFNADKISWGSFTLVEDAPEDDFMAKKTVAAIMDDLVR
TQGEKRKWMLEAAHTNIQPVAHLQSQLVTKHIVKGRCKMFALYLQENADARDFFKSFMGAYGPSHLNKEAYIKDIMKYSK
QIVVGSVDCDTFESSLKVLSRKMKEWGFENLEYVTDEQTIKNALNMDAAVGALYSGKKKQYFEDLSDDAVANLVQKSCLR
LFKNKLGVWNGSLKAELRPFEKLIENKTRTFTAAPIETLLGGKVCVDDFNNHFYSKHIQCPWSVGMTKFYGGWNELLGKL
PDGWVYCDADGSQFDSSLSPYLINAVLRLRLSSMEEWDVGQKMLQNLYTEIVYTPISTPDGTIVKKFKGNNSGQPSTVVD
NTLMVVLAMYYALSKLGVDINSQEDVCKFFANGDDLIIAISPELEHVLDGFQQHFSDLGLNYDFSSRTRDKKELWFMSHR
ALSKDGILIPKLEPERIVSILEWDRSAEPHHRLEAICASMIEAWGYTDLLQNIRRFYKWTIEQEPYRSLAEQGLAPYLSE
VALRRLYTSQIATDNELTDYYKEILANNEFLRETVRFQSDTVDAGKDKARDQKLADKPTLAIDRTKDKDVNTGTSGTFSI
PRLKKAAMNMKLPKVGGSSVVNLDHLLTYKPAQEFVVNTRATHSQFKAWHTNVMAELELNEEQMKIVLNGFMIWCIENGT
SPNISGVWTMMDGDEQVEYPIEPMVKHANPSLRQIMKHFSNLAEAYIRMRNSEQVYIPRYGLQRGLVDRNLAPFAFDFFE
VNGATPVRAREAHAQMKAAALRNSQQRMFCLDGSVSGQEENTERHTVDDVNAQMHHLLGVKGV
>Q5WPU5 ~~~~~~Genome polyprotein~~~
MSKKPGGPGRNRAINMLKRGIPRVFPLVGVKRVVMGLLDGRGPVRFVLALMTFFKFTALAPTKALLGRWKRINKTTAMKH
LTSFKKELGTMINVVNNRGTKKKRGNNGPGLVMIITLMTVVSMVSSLKLSNFQGKVMMTINATDMADVIVVPTQHGKNQC
WIRAMDVGYMCDDTITYECPKLDAGNDPEDIDCWCDKQPMYVHYGRCTRTRHSKRSRRSIAVQTHGESMLANKKDAWLDS
TKASRYLMKTENWIIRNPGYAFVAVLLGWMLGSNNGQRVVFVVLLLLVAPAYSFNCLGMSNRDFLEGVSGATWVDVVLEG
DSCITIMAKDKPTIDIKMMETEATNLAEVRSYCYLATVSDVSTVSNCPTTGEAHNPKRAEDTYVCKSGVTDRGWGNGCGL
FGKGSIDTCANFTCSLKAMGRMIQPENVKYEVGIFIHGSTSSDTHGNYSSQLGASQAGRFTITPNSPAITVKMGDYGEIS
VECEPRNGLNTEAYYIMSVGTKHFLVHREWFNDLALPWTSPASSNWRNREILLEFEEPHATKQSVVALGSQEGALHQALA
GAVPVSFSGSVKLTSGHLKCRVKMEKLTLKGTTYGMCTEKFSFAKNPADTGHGTVVLELQYTGSDGPCKIPISIVASLSD
LTPIGRMVTANPYVASSEANAKVLVEMEPPFGDSYIVVGRGDKQINHHWHKAGSSIGKAFITTIKGAQRLAALGDTAWDF
GSVGGIFNSVGKAVHQVFGGAFRTLFGGMSWITQGLMGALLLWMGVNARDRSIALVMLATGGVLLFLATNVHADSGCAID
VGRRELRCGQGIFIHNDVEAWVDRYKFMPETPKQLAKVIEQAHAKGICGLRSVSRLEHVMWENIRDELNTLLRENAVDLS
VVVEKPKGMYKSAPQRLALTSEEFEIGWKAWGKSLVFAPELANHTFVVDGPETKECPDAKRAWNSLEIEDFGFGIMSTRV
WLKVREHNTTDCDSSIIGTAVKGDIAVHSDLSYWIESHKNTTWRLERAVFGEIKSCTWPETHTLWSDGVVESDLVVPVTL
AGPKSNHNRREGYKVQSQGPWDEEDIVLDFDYCPGTTVTITEACGKRGPSIRTTTSSGRLVTDWCCRSCTLPPLRYRTKN
GCWYGMEIRPMKHDETTLVKSSVSAHRSDMIDPFQLGLLVMFLATQEVLRKRWTARLTVPAIVGALLVLILGGITYTDLL
RYVLLVGAAFAEANSGGDVVHLALIAAFKIQPGFLAMTFLRGKWTNQENILLALGAAFFQMAATDLNFSLPGILNATATA
WMLLRAATQPSTSAIVMPLLCLLAPGMRLLYLDTYRITLIIIGICSLIGERRRAAAKKKGAVLLGLALTSTGQFSASVMA
AGLMACNPNKKRGWPATEVLTAVGLMFAIVGGLAELDVDSMSIPFVLAGLMAVSYTISGKSTDLWLERAADITWETDAAI
TGTSQRLDVKLDDDGDFHLINDPGVPWKIWVIRMTALGFAAWTPWAIIPAGIGYWLTVKYAKRGGVFWDTPAPRTYPKGD
TSPGVYRIMSRYILGTYQAGVGVMYEGVLHTLWHTTRGAAIRSGEGRLTPYWGSVKEDRITYGGPWKFDRKWNGLDDVQL
IIVAPGKAAINIQTKPGIFKTPQGEIGAVSLDYPEGTSGSPILDKNGDIVGLYGNGVILGNGSYVSAIVQGEREEEPVPE
AYNADMLRKKQLTVLDLHPGAGKTRRILPQIIKDAIQRRLRTAVLAPTRVVAAEMAEALKGLPVRYLTPAVNREHSGTEI
VDVMCHATLTHRLMSPLRAPNYNLFVMDEAHFTDPASIAARGYIATKVELGEAAAIFMTATPPGTHDPFPDTNAPVTDIQ
AEVPDRAWSSGFEWITEYTGKTVWFVASVKMGNEIAQCLQRAGKKVIQLNRKSYDTEYPKCKNGDWDFVITTDISEMGAN
FGASRVIDCRKSVKPTILEEGEGRVILSNPSPITSASAAQRRGRVGRNPSQIGDEYHYGGGTSEDDTIAAHWTEAKIMLD
NIHLPNGLVAQMYGPERDKAFTMDGEYRLRGEERKTFLELLRTADLPVWLAYKVASNGIQYTDRKWCFDGPRSNIILEDN
NEVEIVTRTGERKMLKPRWLDARVYADHQSLKWFKDFAAGKRSAVGFLEVLGRMPEHFAGKTREAFDTMYLVATAEKGGK
AHRMALEELPDALETITLIVALAVMTAGVFLLLVQRRGIGKLGLGGMVLGLATFFLWMADVSGTKIAGTLLLALLMMIVL
IPEPEKQRSQTDNQLAVFLICVLLVVGVVAANEYGMLERTKSDLGKIFSSTRQPQSALPLPSMNALALDLRPATAWALYG
GSTVVLTPLIKHLVTSEYITTSLASISAQAGSLFNLPRGLPFTELDFTVVLVFLGCWGQVSLTTLITAAALATLHYGYML
PGWQAEALRAAQRRTAAGIMKNAVVDGLVATDVPELERTTPLMQKKVGQILLIGVSAAALLVNPCVTTVREAGILISAAL
LTLWDNGAIAVWNSTTATGLCHVIRGNWLAGASIAWTLIKNADKPACKRGRPGGRTLGEQWKEKLNGLSKEDFLKYRKEA
ITEVDRSAARKARRDGNKTGGHPVSRGSAKLRWMVERQFVKPIGKVVDLGCGRGGWSYYAATLKGVQEVRGYTKGGPGHE
EPMLMQSYGWNLVTMKSGVDVYYKPSEPCDTLFCDIGESSSSAEVEEQRTLRILEMVSDWLQRGPREFCIKVLCPYMPRV
MERLEVLQRRYGGGLVRVPLSRNSNHEMYWVSGAAGNIVHAVNMTSQVLIGRMEKRTWHGPKYEEDVNLGSGTRAVGKPQ
PHTNQEKIKARIQRLKEEYAATWHHDKDHPYRTWTYHGSYEVKPTGSASSLVNGVVRLMSKPWDAILNVTTMAMTDTTPF
GQQRVFKEKVDTKAPEPPSGVREVMDETTNWLWAFLAREKKPRLCTREEFKRKVNSNAALGAMFEEQNQWSSAREAVEDP
RFWEMVDEERENHLKGECHTCIYNMMGKREKKLGEFGKAKGSRAIWFMWLGARFLEFEALGFLNEDHWLGRKNSGGGVEG
LGVQKLGYILREMSHHSGGKMYADDTAGWDTRITRADLDNEAKVLELMEGEHRQLARAIIELTYKHKVVKVMRPGTDGKT
VMDVISREDQRGSGQVVTYALNTFTNIAVQLIRLMEAEGVIGQEHLESLPRKTKYAVRTWLFENGEERVTRMAVSGDDCV
VKPLDDRFANALHFLNSMSKVRKDVPEWKPSSGWHDWQQVPFCSNHFQELIMKDGRTLVVPCRGQDELIGRARVSPGSGW
NVRDTACLAKAYAQMWLLLYFHRRDLRLMANAICSAVPSNWVPTGRTSWSVHATGEWMTTDDMLEVWNKVWIQDNEWMLD
KTPVQSWTDIPYTGKREDIWCGSLIGTRTRATWAENIYAAINQVRAIIGQEKYRDYMLSLRRYEEVNVQEDRVL
>P06935 ~~~~~~Genome polyprotein~~~
MSKKPGGPGKNRAVNMLKRGMPRGLSLIGLKRAMLSLIDGKGPIRFVLALLAFFRFTAIAPTRAVLDRWRGVNKQTAMKH
LLSFKKELGTLTSAINRRSTKQKKRGGTAGFTILLGLIACAGAVTLSNFQGKVMMTVNATDVTDVITIPTAAGKNLCIVR
AMDVGYLCEDTITYECPVLAAGNDPEDIDCWCTKSSVYVRYGRCTKTRHSRRSRRSLTVQTHGESTLANKKGAWLDSTKA
TRYLVKTESWILRNPGYALVAAVIGWMLGSNTMQRVVFAILLLLVAPAYSFNCLGMSNRDFLEGVSGATWVDLVLEGDSC
VTIMSKDKPTIDVKMMNMEAANLADVRSYCYLASVSDLSTRAACPTMGEAHNEKRADPAFVCKQGVVDRGWGNGCGLFGK
GSIDTCAKFACTTKATGWIIQKENIKYEVAIFVHGPTTVESHGKIGATQAGRFSITPSAPSYTLKLGEYGEVTVDCEPRS
GIDTSAYYVMSVGEKSFLVHREWFMDLNLPWSSAGSTTWRNRETLMEFEEPHATKQSVVALGSQEGALHQALAGAIPVEF
SSNTVKLTSGHLKCRVKMEKLQLKGTTYGVCSKAFKFARTPADTGHGTVVLELQYTGTDGPCKVPISSVASLNDLTPVGR
LVTVNPFVSVATANSKVLIELEPPFGDSYIVVGRGEQQINHHWHKSGSSIGKAFTTTLRGAQRLAALGDTAWDFGSVGGV
FTSVGKAIHQVFGGAFRSLFGGMSWITQGLLGALLLWMGINARDRSIAMTFLAVGGVLLFLSVNVHADTGCAIDIGRQEL
RCGSGVFIHNDVEAWMDRYKFYPETPQGLAKIIQKAHAEGVCGLRSVSRLEHQMWEAIKDELNTLLKENGVDLSVVVEKQ
NGMYKAAPKRLAATTEKLEMGWKAWGKSIIFAPELANNTFVIDGPETEECPTANRAWNSMEVEDFGFGLTSTRMFLRIRE
TNTTECDSKIIGTAVKNNMAVHSDLSYWIESGLNDTWKLERAVLGEVKSCTWPETHTLWGDGVLESDLIIPITLAGPRSN
HNRRPGYKTQNQGPWDEGRVEIDFDYCPGTTVTISDSCEHRGPAARTTTESGKLITDWCCRSCTLPPLRFQTENGCWYGM
EIRPTRHDEKTLVQSRVNAYNADMIDPFQLGLMVVFLATQEVLRKRWTAKISIPAIMLALLVLVFGGITYTDVLRYVILV
GAAFAEANSGGDVVHLALMATFKIQPVFLVASFLKARWTNQESILLMLAAAFFQMAYYDAKNVLSWEVPDVLNSLSVAWM
ILRAISFTNTSNVVVPLLALLTPGLKCLNLDVYRILLLMVGVGSLIKEKRSSAAKKKGACLICLALASTGVFNPMILAAG
LMACDPNRKRGWPATEVMTAVGLMFAIVGGLAELDIDSMAIPMTIAGLMFAAFVISGKSTDMWIERTADITWESDAEITG
SSERVDVRLDDDGNFQLMNDPGAPWKIWMLRMACLAISAYTPWAILPSVIGFWITLQYTKRGGVLWDTPSPKEYKKGDTT
TGVYRIMTRGLLGSYQAGAGVMVEGVFHTLWHTTKGAALMSGEGRLDPYWGSVKEDRLCYGGPWKLQHKWNGHDEVQMIV
VEPGKNVKNVQTKPGVFKTPEGEIGAVTLDYPTGTSGSPIVDKNGDVIGLYGNGVIMPNGSYISAIVQGERMEEPAPAGF
EPEMLRKKQITVLDLHPGAGKTRKILPQIIKEAINKRLRTAVLAPTRVVAAEMSEALRGLPIRYQTSAVHREHSGNEIVD
VMCHATLTHRLMSPHRVPNYNLFIMDEAHFTDPASIAARGYIATKVELGEAAAIFMTATPPGTSDPFPESNAPISDMQTE
IPDRAWNTGYEWITEYVGKTVWFVPSVKMGNEIALCLQRAGKKVIQLNRKSYETEYPKCKNDDWDFVITTDISEMGANFK
ASRVIDSRKSVKPTIIEEGDGRVILGEPSAITAASAAQRRGRIGRNPSQVGDEYCYGGHTNEDDSNFAHWTEARIMLDNI
NMPNGLVAQLYQPEREKVYTMDGEYRLRGEERKNFLEFLRTADLPVWLAYKVAAAGISYHDRKWCFDGPRTNTILEDNNE
VEVITKLGERKILRPRWADARVYSDHQALKSFKDFASGKRSQIGLVEVLGRMPEHFMVKTWEALDTMYVVATAEKGGRAH
RMALEELPDALQTIVLIALLSVMSLGVFFLLMQRKGIGKIGLGGVILGAATFFCWMAEVPGTKIAGMLLLSLLLMIVLIP
EPEKQRSQTDNQLAVFLICVLTLVGAVAANEMGWLDKTKNDIGSLLGHRPEARETTLGVESFLLDLRPATAWSLYAVTTA
VLTPLLKHLITSDYINTSLTSINVQASALFTLARGFPFVDVGVSALLLAVGCWGQVTLTVTVTAAALLFCHYAYMVPGWQ
AEAMRSAQRRTAAGIMKNVVVDGIVATDVPELERTTPVMQKKVGQIILILVSMAAVVVNPSVRTVREAGILTTAAAVTLW
ENGASSVWNATTAIGLCHIMRGGWLSCLSIMWTLIKNMEKPGLKRGGAKGRTLGEVWKERLNHMTKEEFTRYRKEAITEV
DRSAAKHARREGNITGGHPVSRGTAKLRWLVERRFLEPVGKVVDLGCGRGGWCYYMATQKRVQEVKGYTKGGPGHEEPQL
VQSYGWNIVTMKSGVDVFYRPSEASDTLLCDIGESSSSAEVEEHRTVRVLEMVEDWLHRGPKEFCIKVLCPYMPKVIEKM
ETLQRRYGGGLIRNPLSRNSTHEMYWVSHASGNIVHSVNMTSQVLLGRMEKKTWKGPQFEEDVNLGSGTRAVGKPLLNSD
TSKIKNRIERLKKEYSSTWHQDANHPYRTWNYHGSYEVKPTGSASSLVNGVVRLLSKPWDTITNVTTMAMTDTTPFGQQR
VFKEKVDTKAPEPPEGVKYVLNETTNWLWAFLARDKKPRMCSREEFIGKVNSNAALGAMFEEQNQWKNAREAVEDPKFWE
MVDEEREAHLRGECNTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARFLEFEALGFLNEDHWLGRKNSGGGVEGLGLQ
KLGYILKEVGTKPGGKVYADDTAGWDTRITKADLENEAKVLELLDGEHRRLARSIIELTYRHKVVKVMRPAADGKTVMDV
ISREDQRGSGQVVTYALNTFTNLAVQLVRMMEGEGVIGPDDVEKLGKGKGPKVRTWLFENGEERLSRMAVSGDDCVVKPL
DDRFATSLHFLNAMSKVRKDIQEWKPSTGWYDWQQVPFCSNHFTELIMKDGRTLVVPCRGQDELIGRARISPGAGWNVRD
TACLAKSYAQMWLLLYFHRRDLRLMANAICSAVPANWVPTGRTTWSIHAKGEWMTTEDMLAVWNRVWIEENEWMEDKTPV
ERWSDVPYSGKREDIWCGSLIGTRTRATWAENIHVAINQVRSVIGEEKYVDYMSSLRRYEDTIVVEDTVL
>Q9Q6P4 ~~~GP1~~~Genome polyprotein~~~
MSKKPGGPGKSRAVNMLKRGMPRVLSLIGLKRAMLSLIDGKGPIRFVLALLAFFRFTAIAPTRAVLDRWRGVNKQTAMKH
LLSFKKELGTLTSAINRRSSKQKKRGGKTGIAVMIGLIASVGAVTLSNFQGKVMMTVNATDVTDVITIPTAAGKNLCIVR
AMDVGYMCDDTITYECPVLSAGNDPEDIDCWCTKSAVYVRYGRCTKTRHSRRSRRSLTVQTHGESTLANKKGAWMDSTKA
TRYLVKTESWILRNPGYALVAAVIGWMLGSNTMQRVVFVVLLLLVAPAYSFNCLGMSNRDFLEGVSGATWVDLVLEGDSC
VTIMSKDKPTIDVKMMNMEAANLAEVRSYCYLATVSDLSTKAACPTMGEAHNDKRADPAFVCRQGVVDRGWGNGCGLFGK
GSIDTCAKFACSTKAIGRTILKENIKYEVAIFVHGPTTVESHGNYSTQVGATQAGRFSITPAAPSYTLKLGEYGEVTVDC
EPRSGIDTNAYYVMTVGTKTFLVHREWFMDLNLPWSSAGSTVWRNRETLMEFEEPHATKQSVIALGSQEGALHQALAGAI
PVEFSSNTVKLTSGHLKCRVKMEKLQLKGTTYGVCSKAFKFLGTPADTGHGTVVLELQYTGTDGPCKVPISSVASLNDLT
PVGRLVTVNPFVSVATANAKVLIELEPPFGDSYIVVGRGEQQINHHWHKSGSSIGKAFTTTLKGAQRLAALGDTAWDFGS
VGGVFTSVGKAVHQVFGGAFRSLFGGMSWITQGLLGALLLWMGINARDRSIALTFLAVGGVLLFLSVNVHADTGCAIDIS
RQELRCGSGVFIHNDVEAWMDRYKYYPETPQGLAKIIQKAHKEGVCGLRSVSRLEHQMWEAVKDELNTLLKENGVDLSVV
VEKQEGMYKSAPKRLTATTEKLEIGWKAWGKSILFAPELANNTFVVDGPETKECPTQNRAWNSLEVEDFGFGLTSTRMFL
KVRESNTTECDSKIIGTAVKNNLAIHSDLSYWIESRLNDTWKLERAVLGEVKSCTWPETHTLWGDGILESDLIIPVTLAG
PRSNHNRRPGYKTQNQGPWDEGRVEIDFDYCPGTTVTLSESCGHRGPATRTTTESGKLITDWCCRSCTLPPLRYQTDSGC
WYGMEIRPQRHDEKTLVQSQVNAYNADMIDPFQLGLLVVFLATQEVLRKRWTAKISMPAILIALLVLVFGGITYTDVLRY
VILVGAAFAESNSGGDVVHLALMATFKIQPVFMVASFLKARWTNQENILLMLAAVFFQMAYHDARQILLWEIPDVLNSLA
VAWMILRAITFTTTSNVVVPLLALLTPGLRCLNLDVYRILLLMVGIGSLIREKRSAAAKKKGASLLCLALASTGLFNPMI
LAAGLIACDPNRKRGWPATEVMTAVGLMFAIVGGLAELDIDSMAIPMTIAGLMFAAFVISGKSTDMWIERTADISWESDA
EITGSSERVDVRLDDDGNFQLMNDPGAPWKIWMLRMVCLAISAYTPWAILPSVVGFWITLQYTKRGGVLWDTPSPKEYKK
GDTTTGVYRIMTRGLLGSYQAGAGVMVEGVFHTLWHTTKGAALMSGEGRLDPYWGSVKEDRLCYGGPWKLQHKWNGQDEV
QMIVVEPGKNVKNVQTKPGVFKTPEGEIGAVTLDFPTGTSGSPIVDKNGDVIGLYGNGVIMPNGSYISAIVQGERMDEPI
PAGFEPEMLRKKQITVLDLHPGAGKTRRILPQIIKEAINRRLRTAVLAPTRVVAAEMAEALRGLPIRYQTSAVPREHNGN
EIVDVMCHATLTHRLMSPHRVPNYNLFVMDEAHFTDPASIAARGYISTKVELGEAAAIFMTATPPGTSDPFPESNSPISD
LQTEIPDRAWNSGYEWITEYTGKTVWFVPSVKMGNEIALCLQRAGKKVVQLNRKSYETEYPKCKNDDWDFVITTDISEMG
ANFKASRVIDSRKSVKPTIITEGEGRVILGEPSAVTAASAAQRRGRIGRNPSQVGDEYCYGGHTNEDDSNFAHWTEARIM
LDNINMPNGLIAQFYQPEREKVYTMDGEYRLRGEERKNFLELLRTADLPVWLAYKVAAAGVSYHDRRWCFDGPRTNTILE
DNNEVEVITKLGERKILRPRWIDARVYSDHQALKAFKDFASGKRSQIGLIEVLGKMPEHFMGKTWEALDTMYVVATAEKG
GRAHRMALEELPDALQTIALIALLSVMTMGVFFLLMQRKGIGKIGLGGAVLGVATFFCWMAEVPGTKIAGMLLLSLLLMI
VLIPEPEKQRSQTDNQLAVFLICVMTLVSAVAANEMGWLDKTKSDISSLFGQRIEVKENFSMGEFLLDLRPATAWSLYAV
TTAVLTPLLKHLITSDYINTSLTSINVQASALFTLARGFPFVDVGVSALLLAAGCWGQVTLTVTVTAATLLFCHYAYMVP
GWQAEAMRSAQRRTAAGIMKNAVVDGIVATDVPELERTTPIMQKKVGQIMLILVSLAAVVVNPSVKTVREAGILITAAAV
TLWENGASSVWNATTAIGLCHIMRGGWLSCLSITWTLIKNMEKPGLKRGGAKGRTLGEVWKERLNQMTKEEFTRYRKEAI
IEVDRSAAKHARKEGNVTGGHPVSRGTAKLRWLVERRFLEPVGKVIDLGCGRGGWCYYMATQKRVQEVRGYTKGGPGHEE
PQLVQSYGWNIVTMKSGVDVFYRPSECCDTLLCDIGESSSSAEVEEHRTIRVLEMVEDWLHRGPREFCVKVLCPYMPKVI
EKMELLQRRYGGGLVRNPLSRNSTHEMYWVSRASGNVVHSVNMTSQVLLGRMEKRTWKGPQYEEDVNLGSGTRAVGKPLL
NSDTSKIKNRIERLRREYSSTWHHDENHPYRTWNYHGSYDVKPTGSASSLVNGVVRLLSKPWDTITNVTTMAMTDTTPFG
QQRVFKEKVDTKAPEPPEGVKYVLNETTNWLWAFLAREKRPRMCSREEFIRKVNSNAALGAMFEEQNQWRSAREAVEDPK
FWEMVDEEREAHLRGECHTCIYNMMGKREKKPGEFGKAKGSRAIWFMWLGARFLEFEALGFLNEDHWLGRKNSGGGVEGL
GLQKLGYILREVGTRPGGKIYADDTAGWDTRITRADLENEAKVLELLDGEHRRLARAIIELTYRHKVVKVMRPAADGRTV
MDVISREDQRGSGQVVTYALNTFTNLAVQLVRMMEGEGVIGPDDVEKLTKGKGPKVRTWLFENGEERLSRMAVSGDDCVV
KPLDDRFATSLHFLNAMSKVRKDIQEWKPSTGWYDWQQVPFCSNHFTELIMKDGRTLVVPCRGQDELVGRARISPGAGWN
VRDTACLAKSYAQMWLLLYFHRRDLRLMANAICSAVPVNWVPTGRTTWSIHAGGEWMTTEDMLEVWNRVWIEENEWMEDK
TPVEKWSDVPYSGKREDIWCGSLIGTRARATWAENIQVAINQVRAIIGDEKYVDYMSSLKRYEDTTLVEDTVL
>P03314 ~~~~~~Genome polyprotein~~~
MSGRKAQGKTLGVNMVRRGVRSLSNKIKQKTKQIGNRPGPSRGVQGFIFFFLFNILTGKKITAHLKRLWKMLDPRQGLAV
LRKVKRVVASLMRGLSSRKRRSHDVLTVQFLILGMLLMTGGVTLVRKNRWLLLNVTSEDLGKTFSVGTGNCTTNILEAKY
WCPDSMEYNCPNLSPREEPDDIDCWCYGVENVRVAYGKCDSAGRSRRSRRAIDLPTHENHGLKTRQEKWMTGRMGERQLQ
KIERWFVRNPFFAVTALTIAYLVGSNMTQRVVIALLVLAVGPAYSAHCIGITDRDFIEGVHGGTWVSATLEQDKCVTVMA
PDKPSLDISLETVAIDRPAEVRKVCYNAVLTHVKINDKCPSTGEAHLAEENEGDNACKRTYSDRGWGNGCGLFGKGSIVA
CAKFTCAKSMSLFEVDQTKIQYVIRAQLHVGAKQENWNTDIKTLKFDALSGSQEVEFIGYGKATLECQVQTAVDFGNSYI
AEMETESWIVDRQWAQDLTLPWQSGSGGVWREMHHLVEFEPPHAATIRVLALGNQEGSLKTALTGAMRVTKDTNDNNLYK
LHGGHVSCRVKLSALTLKGTSYKICTDKMFFVKNPTDTGHGTVVMQVKVSKGAPCRIPVIVADDLTAAINKGILVTVNPI
ASTNDDEVLIEVNPPFGDSYIIVGRGDSRLTYQWHKEGSSIGKLFTQTMKGVERLAVMGDTAWDFSSAGGFFTSVGKGIH
TVFGSAFQGLFGGLNWITKVIMGAVLIWVGINTRNMTMSMSMILVGVIMMFLSLGVGADQGCAINFGKRELKCGDGIFIF
RDSDDWLNKYSYYPEDPVKLASIVKASFEEGKCGLNSVDSLEHEMWRSRADEINAIFEENEVDISVVVQDPKNVYQRGTH
PFSRIRDGLQYGWKTWGKNLVFSPGRKNGSFIIDGKSRKECPFSNRVWNSFQIEEFGTGVFTTRVYMDAVFEYTIDCDGS
ILGAAVNGKKSAHGSPTFWMGSHEVNGTWMIHTLEALDYKECEWPLTHTIGTSVEESEMFMPRSIGGPVSSHNHIPGYKV
QTNGPWMQVPLEVKREACPGTSVIIDGNCDGRGKSTRSTTDSGKVIPEWCCRSCTMPPVSFHGSDGCWYPMEIRPRKTHE
SHLVRSWVTAGEIHAVPFGLVSMMIAMEVVLRKRQGPKQMLVGGVVLLGAMLVGQVTLLDLLKLTVAVGLHFHEMNNGGD
AMYMALIAAFSIRPGLLIGFGLRTLWSPRERLVLTLGAAMVEIALGGVMGGLWKYLNAVSLCILTINAVASRKASNTILP
LMALLTPVTMAEVRLAAMFFCAVVIIGVLHQNFKDTSMQKTIPLVALTLTSYLGLTQPFLGLCAFLATRIFGRRSIPVNE
ALAAAGLVGVLAGLAFQEMENFLGPIAVGGLLMMLVSVAGRVDGLELKKLGEVSWEEEAEISGSSARYDVALSEQGEFKL
LSEEKVPWDQVVMTSLALVGAALHPFALLLVLAGWLFHVRGARRSGDVLWDIPTPKIIEECEHLEDGIYGIFQSTFLGAS
QRGVGVAQGGVFHTMWHVTRGAFLVRNGKKLIPSWASVKEDLVAYGGSWKLEGRWDGEEEVQLIAAVPGKNVVNVQTKPS
LFKVRNGGEIGAVALDYPSGTSGSPIVNRNGEVIGLYGNGILVGDNSFVSAISQTEVKEEGKEELQEIPTMLKKGMTTVL
DFHPGAGKTRRFLPQILAECARRRLRTLVLAPTRVVLSEMKEAFHGLDVKFHTQAFSAHGSGREVIDAMCHATLTYRMLE
PTRVVNWEVIIMDEAHFLDPASIAARGWAAHRARANESATILMTATPPGTSDEFPHSNGEIEDVQTDIPSEPWNTGHDWI
LADKRPTAWFLPSIRAANVMAASLRKAGKSVVVLNRKTFEREYPTIKQKKPDFILATDIAEMGANLCVERVLDCRTAFKP
VLVDEGRKVAIKGPLRISASSAAQRRGRIGRNPNRDGDSYYYSEPTSENNAHHVCWLEASMLLDNMEVRGGMVAPLYGVE
GTKTPVSPGEMRLRDDQRKVFRELVRNCDLPVWLSWQVAKAGLKTNDRKWCFEGPEEHEILNDSGETVKCRAPGGAKKPL
RPRWCDERVSSDQSALSEFIKFAEGRRGAAEVLVVLSELPDFLAKKGGEAMDTISVFLHSEEGSRAYRNALSMMPEAMTI
VMLFILAGLLTSGMVIFFMSPKGISRMSMAMGTMAGCGYLMFLGGVKPTHISYVMLIFFVLMVVVIPEPGQQRSIQDNQV
AYLIIGILTLVSAVAANELGMLEKTKEDLFGKKNLIPSSASPWSWPDLDLKPGAAWTVYVGIVTMLSPMLHHWIKVEYGN
LSLSGIAQSASVLSFMDKGIPFMKMNISVIMLLVSGWNSITVMPLLCGIGCAMLHWSLILPGIKAQQSKLAQRRVFHGVA
ENPVVDGNPTVDIEEAPEMPALYEKKLALYLLLALSLASVAMCRTPFSLAEGIVLASAALGPLIEGNTSLLWNGPMAVSM
TGVMRGNHYAFVGVMYNLWKMKTGRRGSANGKTLGEVWKRELNLLDKRQFELYKRTDIVEVDRDTARRHLAEGKVDTGVA
VSRGTAKLRWFHERGYVKLEGRVIDLGCGRGGWCYYAAAQKEVSGVKGFTLGRDGHEKPMNVQSLGWNIITFKDKTDIHR
LEPVKCDTLLCDIGESSSSSVTEGERTVRVLDTVEKWLACGVDNFCVKVLAPYMPDVLEKLELLQRRFGGTVIRNPLSRN
STHEMYYVSGARSNVTFTVNQTSRLLMRRMRRPTGKVTLEADVILPIGTRSVETDKGPLDKEAIEERVERIKSEYMTSWF
YDNDNPYRTWHYCGSYVTKTSGSAASMVNGVIKILTYPWDRIEEVTRMAMTDTTPFGQQRVFKEKVDTRAKDPPAGTRKI
MKVVNRWLFRHLAREKNPRLCTKEEFIAKVRSHAAIGAYLEEQEQWKTANEAVQDPKFWELVDEERKLHQQGRCRTCVYN
MMGKREKKLSEFGKAKGSRAIWYMWLGARYLEFEALGFLNEDHWASRENSGGGVEGIGLQYLGYVIRDLAAMDGGGFYAD
DTAGWDTRITEADLDDEQEILNYMSPHHKKLAQAVMEMTYKNKVVKVLRPAPGGKAYMDVISRRDQRGSGQVVTYALNTI
TNLKVQLIRMAEAEMVIHHQHVQDCDESVLTRLEAWLTEHGCDRLKRMAVSGDDCVVRPIDDRFGLALSHLNAMSKVRKD
ISEWQPSKGWNDWENVPFCSHHFHELQLKDGRRIVVPCREQDELIGRGRVSPGNGWMIKETACLSKAYANMWSLMYFHKR
DMRLLSLAVSSAVPTSWVPQGRTTWSIHGKGEWMTTEDMLEVWNRVWITNNPHMQDKTMVKKWRDVPYLTKRQDKLCGSL
IGMTNRATWASHIHLVIHRIRTLIGQEKYTDYLTVMDRYSVDADLQLGELI
>Q6DV88 ~~~~~~Genome polyprotein~~~
MSGRKAQGKTLGVNMVRRGVRSLSNKIKQKTKQIGNRPGPSRGVQGFIFFFLFNILTGKKITAHLKRLWKMLDPRQGLAV
LRKVKRVVASLMRGLSSRKRRSHDVLTVQFLILGMLLMTGGVTLVRKNRWLLLNVTSEDLGKTFSVGTGNCTTNILEAKY
WCPDSMEYNCPNLSPREEPDDIDCWCYGVENVRVAYGKCDSAGRSRRSRRAIDLPTHENHGLKTRQEKWMTGRMGERQLQ
KIERWLVRNPFFAVTALTIAYLVGSNMTQRVVIALLVLAVGPAYSAHCIGITDRDFIEGVHGGTWVSATLEQDKCVTVMA
PDKPSLDISLETVAIDGPAEARKVCYNAVLTHVKINDKCPSTGEAHLAEENEGDNACKRTYSDRGWGNGCGLFGKGSIVA
CAKFTCAKSMSLFEVDQTKIQYVIRAQLHVGAKQENWNTDIKTLKFDALSGSQEAEFTGYGKATLECQVQTAVDFGNSYI
AEMEKESWIVDRQWAQDLTLPWQSGSGGVWREMHHLVEFEPPHAATIRVLALGNQEGSLKTALTGAMRVTKDTNDNNLYK
LHGGHVSCRVKLSALTLKGTSYKMCTDKMSFVKNPTDTGHGTVVMQVKVPKGAPCKIPVIVADDLTAAINKGILVTVNPI
ASTNDDEVLIEVNPPFGDSYIIVGTGDSRLTYQWHKEGSSIGKLFTQTMKGAERLAVMGDAAWDFSSAGGFFTSVGKGIH
TVFGSAFQGLFGGLNWITKVIMGAVLIWVGINTRNMTMSMSMILVGVIMMFLSLGVGADQGCAINFGKRELKCGDGIFIF
RDSDDWLNKYSYYPEDPVKLASIVKASFEEGKCGLNSVDSLEHEMWRSRADEINAILEENEVDISVVVQDPKNVYQRGTH
PFSRIRDGLQYGWKTWGKNLVFSPGRKNGSFIIDGKSRKECPFSNRVWNSFQIEEFGTGVFTTRVYMDAVFEYTIDCDGS
ILGAAVNGKKSAHGSPTFWMGSHEVNGTWMIHTLEALDYKECEWPLTHTIGTSVEESEMFMPRSIGGPVSSHNHIPGYKV
QTNGPWMQVPLEVKREACPGTSVIIDGNCDGRGKSTRSTTDSGKIIPEWCCRSCTMPPVSFHGSDGCWYPMEIRPRKTHE
SHLVRSWVTAGEIHAVPFGLVSMMIAMEVVLRKRQGPKQMLVGGVVLLGAMLVGQVTLLDLLKLTVAVGLHFHEMNNGGD
AMYMALIAAFSIRPGLLIGFGLRTLWSPRERLVLTLGAAMVEIALGGMMGGLWKYLNAVSLCILTINAVASRKASNTILP
LMALLTPVTMAEVRLATMLFCTVVIIGVLHQNSKDTSMQKTIPLVALTLTSYLGLTQPFLGLCAFLATRIFGRRSIPVNE
ALAAAGLVGVLAGLAFQEMENFLGPIAVGGILMMLVSVAGRVDGLELKKLGEVSWEEEAEISGSSARYDVALSEQGEFKL
LSEEKVPWDQVVMTSLALVGAAIHPFALLLVLAGWLFHVRGARRSGDVLWDIPTPKIIEECEHLEDGIYGIFQSTFLGAS
QRGVGVAQGGVFHTMWHVTRGAFLVRNGKKLIPSWASVKEDLVAYGGSWKLEGRWDGEEEVQLIAAVPGKNVVNVQTKPS
LFKVRNGGEIGAVALDYPSGTSGSPIVNRNGEVIGLYGNGILVGDNSFVSAISQTEVKEEGKEELQEIPTMLKKGMTTIL
DFHPGAGKTRRFLPQILAECARRRLRTLVLAPTRVVLSEMKEAFHGLDVKFHTQAFSAHGSGREVIDAMCHATLTYRMLE
PTRVVNWEVIIMDEAHFLDPASIAARGWAAHRARANESATILMTATPPGTSDEFPHSNGEIEDVQTDIPSEPWNTGHDWI
LADKRPTAWFLPSIRAANVMAASLRKAGKSVVVLNRKTFEREYPTIKQKKPDFILATDIAEMGANLCVERVLDCRTAFKP
VLVDEGRKVAIKGPLRISASSAAQRRGRIGRNPNRDGDSYYYSEPTSEDNAHHVCWLEASMLLDNMEVRGGMVAPLYGVE
GTKTPVSPGEMRLRDDQRKVFRELVRNCDLPVWLSWQVAKAGLKTNDRKWCFEGPEEHEILNDSGETVKCRAPGGAKKPL
RPRWCDERVSSDQSALSEFIKFAEGRRGAAEVLVVLSELPDFLAKKGGEAMDTISVFLHSEEGSRAYRNALSMMPEAMTI
VMLFILAGLLTSGMVIFFMSPKGISRMSMAMGTMAGCGYLMFLGGVKPTHISYIMLIFFVLMVVVIPEPGQQRSIQDNQV
AYLIIGILTLVSVVAANELGMLEKTKEDLFGKKNLIPSSASPWSWPDLDLKPGAAWTVYVGIVTMLSPMLHHWIKVEYGN
LSLSGIAQSASVLSFMDKGIPFMKMNISVIILLVSGWNSITVMPLLCGIGCAMLHWSLILPGIKAQQSKLAQRRVFHGVA
KNPVVDGNPTVDIEEAPEMPALYEKKLALYLLLALSLASVAMCRTPFSLAEGIVLASAALGPLIEGNTSLLWNGPMAVSM
TGVMRGNYYAFVGVMYNLWKMKTGRRGSANGKTLGEVWKRELNLLDKQQFELYKRTDIVEVDRDTARRHLAEGKVDTGVA
VSRGTAKLRWFHERGYVKLEGRVIDLGCGRGGWCYYAAAQKEVSGVKGFTLGRDGHEKPMNVQSLGWNIITFKDKTDIHR
LEPVKCDTLLCDIGESSSSSVTEGERTVRVLDTVEKWLACGVDNFCVKVLAPYMPDVLEKLELLQRRFGGTVIRNPLSRN
STHEMYYVSGARSNVTFTVNQTSRLLMRRMRRPTGKVTLEADVILPIGTRSVETDKGPLDKEAIEERVERIKSEYMTSWF
YDNDNPYRTWHYCGSYVTKTSGSAASMVNGVIKILTYPWDRIEEVTRMAMTDTTPFGQQRVFKEKVDTRAKDPPAGTRKI
MKVVNRWLFRHLAREKNPRLCTKEEFIAKVRSHAAIGAYLEEQEQWKTANEAVQDPKFWELVDEERKLHQQGRCRTCVYN
MMGKREKKLSEFGKAKGSRAIWYMWLGARYLEFEALGFLNEDHWASRENSGGGVEGIGLQYLGYVIRDLAAMDGGGFYAD
DTAGWDTRITEADLDDEQEILNYMSPHHKKLAQAVMEMTYKNKVVKVLRPAPGGKAYMDVISRRDQRGSGQVVTYALNTI
TNLKVQLIRMAEAEMVIHHQHVQDCDESVLTRLEAWLTEHGCNRLKRMAVSGDDCVVRPIDDRFGLALSHLNAMSKVRKD
ISEWQPSKGWNDWENVPFCSHHFHELQLKDGRRIVVPCREQDELIGRGRVSPGNGWMIKETACLSKAYANMWSLMYFHKR
DMRLLSLAVSSAVPTSWVPQGRTTWSIHGKGEWMTTEDMLEVWNRVWITNNPHMQDKTMVKEWRDVPYLTKRQDKLCGSL
IGMTNRATWASHIHLVIHRIRTLIGQEKYTDYLTVMDRYSVDADLQPGELI
>Q1X881 ~~~~~~Genome polyprotein~~~
MSGRKAQGKTLGVNMVRRGVRSLSNKIKQKTKQIGNRPGPSRGVQGFIFFFLFNILTGKKLTAHLKKLWRMLDPRQGLAV
LRKVKRVVASLMRGLSSRKRRSNEMALFPLLLLGLLALSGGVTLVRKNRWLLLNVTAEDLGKTFSVGTGNCTTNILEAKY
WCPDSMEYNCPNLSPREEPDDIDCWCYGVENVRVAYGRCDAVGRSKRSRRAIDLPTHENHGLKTRQEKWMTGRMGERQLQ
KIERWLVRNPFFAVTALAIAYLVGNNTTQRVVIALLVLAVGPAYSAHCIGITDRDFIEGVHGGTWVSATLEQDKCVTVMA
PDKPSLDISLQTVAIDGPAEARKVCYSAVLTHVKINDKCPSTGEAHLAEENDGDNACKRTYSDRGWGNGCGLFGKGSIVA
CAKFTCAKSMSLFEVDQTKIQYVIRAQLHVGAKQENWNTDIKTLKFDALSGSQEAEFTGYGKATLECQVQTAVDFGNSYI
AEMEKDSWIVDRQWAQDLTLPWQSGSGGIWREMHHLVEFEPPHAATIRVLALGNQEGSLKTALTGAMRVTKDENDNNLYK
LHGGHVSCRVKLSALTLKGTSYKMCTDKMSFVKNPTDTGHGTVVMQVKVPKGAPCKIPVIVADDLTAAVNKGILVTVNPI
ASTNDDEVLIEVNPPFGDSYIIVGTGDSRLTYQWHKEGSSIGKLFTQTMKGAERLAVMGDAAWDFSSAGGFFTSVGKGIH
TVFGSAFQGLFGGLSWITKVIMGAVLIWVGINTRNMTMSMSMILVGVIMMFLSLGVGADQGCAVNFGKRELKCGDGIFVF
RDSDDWLTKYSYYPEDPVKLASIIKASHEEGKCGLNSVDSLEHEMWRSRADEINAIFEENEVDISVVVQDPKNIYQRGTH
PFSRIRDGLQYGWKTWGKNLVFSPGRKNGSFIIDGKSRKECPFSNRVWNSFQIEEFGMGVFTTRVFMDATFDYSVDCDGA
ILGAAVNGKKSAHGSPTFWMGSHEVNGTWMIHTLETLDYKECEWPLTHTIGTSVEESDMFMPRSIGGPVSSHNRIPGYKV
QTNGPWMQVPLEVKREVCPGTSVVVDSNCDGRGKSTRSTTDSGKIIPEWCCRSCTMPPVSFHGSDGCWYPMEIRPMKTSD
SHLVRSWVTAGEVHAVPFGLVSMMIAMEVVLRRRQGPKQMLVGGVVLLGAMLVGQVTVLDLVKFVVAVGLHFHEINNGGD
AMYMALIASFSIRPGLLMGFGLRTLWSPRERLVMAFGAAMVEIALGGMMGGLWQYLNAVSLCVLTINAISSRKASNMILP
LMALMTPMTMHEVRMATMLFCTVVIIGVLHQNSKDTSMQKTIPIVALTLTSYMGLTQPFLGLCAYMSTQVFGRRSIPVNE
ALAAAGLVGVLAGLAFQDMENFLGPIAVGGILMMLVSVAGRVDGLELKKLGEISWEEEAEISGSSSRYDVALSEQGEFKL
LSEDKVPWDQIVMTSLALVGAAIHPFALLLVLGGWILHIKGARRSGDVLWDIPTPKVIEECEHLEDGIYGIFQSTFLGAS
QRGVGVAQGGVFHTMWHVTRGAFLLRNGKKLVPSWASVKEDLVAYGGSWKLDGRWDGEEEVQLIAAVPGKSVVNVQTKPS
LFRVKNGGEIGAVALDYPSGTSGSPIVNRNGEVVGLYGNGILVGDNSFVSAISQTELKEESKEELQEIPTMLKKGMTTIL
DFHPGAGKTRRFLPQILAECARRRLRTLVLAPTRVVLSEMKEAFQGLDVKFHTQAFSAHGSGKEVIDAMCHATLTYRMLE
PTRVVNWEVIIMDEAHFLDPASIAARGWAAHRARANESATILMTATPPGTSDEFPHSNGEIEDVQTDIPSEPWTAGHEWI
LADKRPTAWFLPSIRAANVMAASLRKAGKNVVVLNRKTFEKEYPTIKQKKPDFILATDIAEMGANLCVERVLDCRTAYKP
VLVDEGKKVAIKGPLRISASSAAQRRGRIGRNPNRDGDSYYYSEPTSEDNAHHVCWLEASMLLDNMEVRGGMVAPLYGIE
GTKTPVSPGEMRLRDDQRRVFRELVRGCDLPVWLAWQVAKAGLKTNDRKWCFEGPEEHEILNDNGETVKCRSPGGAKRAL
RPRWCDERVSSDQSALADFIKFAEGRRGAAEMLVILTELPDFLAKKGGEAMDTISVFLHSEEGSRAYRNALSMMPEAMTI
VMLFLLAGLLTSGAVIFFMSPKGMSRMSMAMGTMAGSGYLMFLGGVKPTHISYVMLIFFVLMVVVIPEPGQQRTIQDNQV
AYLIIGILTLLSVVAANELGMLEKTKEDFFGKRDITTPSGAIPWSWPDLDLKPGAAWTVYVGIVTMLSPMLHHWIKVEYG
NLSLSGIAQSASVLSFMDKGIPFMKMNISVVILLVSGWNSITVIPLLCGIGGAMLHWTLILPGIKAQQSKLAQKRVFHGV
AKNPVVDGNPTADIEEAPEMPALYEKKLALYLLLALSLMSVAMCRTPFSLAEGIVLSSAALGPLIEGNTSLLWNGPMAVS
MTGVMRGNYYAFVGVMYNLWKMKTERRGSASGKTLGEVWKRELNLLDKQQFEMYKRTDIIEVDRDMARRHLAEGKVDTGV
AVSRGTAKLRWFHERGYVKLEGRVTDLGCGRGGWCYYAAAQKEVSGVKGYTLGRDGHEKPMNVQSLGWNIVTFKDKTDVH
RLEPLKCETLLCDIGESSPSSATEGERTLRVLDTVEKWLACGVDNFCIKVLAPYMPDVIEKLELLQRRFGGTVIRNPLSR
NSTHEMYYVSGARSNITFTVNQTSRLLMRRMRRPTGKVTLEADVILPIGTRSVETDKGPLDKDAIEERVERIKNEYATTW
FYDNDNPYRTWHYCGSYVTKTSGSAASMINGVIKILTFPWDRIEEVTRMAMTDTTPFGQQRVFKEKVDTRAKDPPAGTRK
IMKVVNRWLFRHLSREKNPRLCTKEEFIAKVRSHAAVGAFLEEQEQWKTANEAVQDPKFWEMVDAERKLHQQGRCQSCVY
NMMGKREKKLSEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWASRENSGGGVEGTGLQYLGYVIRDLSAKEGGGFYA
DDTAGWDTRITEADLDDEQEIMSYMSPEQRKLAWAIMEMTYKNKVVKVLRPAPGGKAFMDIISRRDQRGSGQVVTYALNT
ITNLKVQLIRMAEAEMVINHQHVQECGENVLERLETWLAENGCDRLSRMAVSGDDCVVRPVDDRFGLALSHLNAMSKVRK
DISEWQPSKGWTDWENVPFCSHHFHELVLKDGRKIVVPCRDQDELIGRGRVSPGNGWMIKETACLSKAYANMWSLMYFHK
RDMRLLSFAVSSAVPTAWVPSGRTTWSVHGRGEWMTTEDMLDVWNRVWVLNNPHMTDKTTIKEWRDVPYLTKRQDKLCGS
LIGMTNRATWASHIHLVIHRIRTLIGQEKFTDYLTVMDRYSVDADLQPGELI
>Q9YRV3 ~~~~~~Genome polyprotein~~~
MSGRKAQGKTLGVNMVRRGVRSLSNKIKQKTKQIGNRPGPSRGVQGFIFFFLFNILTGKKITAQLKRLWKMLDPRQGLAA
LRKVKRVVAGLMRGLSSRKRRSHDVLTVQFLILGMLLMTGGVTLMRKNRWLLLNVTSEDLGKTFSIGTGNCTTNILEAKY
WCPDSMEYNCPNLSPREEPDDIDCWCYGVENVRVTYGKCDSAGRSRRSRRAIDLPTHENHGLKTRQEKWMTGRMGERQLQ
KIERWFVRNPFFAVTALTIAYLVGSNMTQRVVIALLVLAVGPAYSAHCIGVADRDFIEGVHGGTWVSATLEQDKCVTVMA
PDKPSLDISLETVAIDGPVEARKVCYNAVLTHVKIDDKCPSTGEAHLAEENEGDNACKRTYSDRGWGNGCGLFGKGSIVA
CAKFTCAKSMSLFEVDQTKIQYVIKAQLHVGAKQEDWKTDIKTLKFDVLSGSQEAEFTGYGKVTLECQVQTAVDFGNSYI
AEMEKESWIVDRQWAQDLTLPWQSGSGGVWREMHHLVEFEPPHAATIRVLALGDQEGSLKTALTGAMRVTKDTNDNNLYK
LHGGHVSCRVKLSALTLKGTSYKMCTDKMSFVKNPTDTGHGTVVMQVKVPKGAPCKIPVIVADDLTAAINKGILVTVNPI
ASTNDDEVLIEVNPPFGDSYIIIGTGDSRLTYQWHKEGSSIGKLFTQTMKGAERLAVMGDAAWDFSSAGGFFTSVGKGIH
TVFGSAFQGLFGGLNWITKVIMGAVLIWVGFNTRNMTMSMSMILVGVIMMFLSLGVGADQGCAINFGKRELKCGDGIFIF
RDSDDWLNKYSYYPEDPVKLASIVKASFEEGKCGLNSVDSLEHEMWRSRADEINAILEENEVDISVVVQDPKNVYQRGTH
PFSRIRDGLQYGWKTWGKNLVFSPGRKNGSFIIDGKSRKECPFSNRVWNSFQIEEFGTGVFTTRVYMDAVFEYTIDCDGS
ILGAAVNGKKSAHGSPTFWMGSHEVNGTWMIHTLEALDYKECEWPLTHTIGTSVEESEMFMPRSIGGPVSSHNHIPGYKV
QTNGPWMQVPLEVKREACPGTSVIIDGNCDGRGKSTRSTTDSGKIIPEWCCRSCTMPPVSFHGSDGCWYPMEIRPMKTHE
SHLVRSWVTAGEIHAVPFGLVSMMIAMEVVLRKRQGPKQVLVGGVVLLGAMLVGQVTLLDLLELTVAVGLHFHEMNNGGD
AMYMALIAAFSVRPGLLIGFGLRTLWSPRERLVLALGAAMVEIALGGMMGGLWKYLNAVSLCILTINAVASRKASNAILP
LMALLTPVTMAEVRLATMLFCTVVIIGVLHQNSKDTSMQKTIPLVALTLTSYLGLTQPFLGLCAFLATRIFGRRSIPVNE
ALAATGLVGVLAGLAFQEMENFLGPIAVGGILMMLVSVAGRVDGLELKKLGEVSWEEEAEISGSSARYDVALSEQGEFKL
LSEEKVPWDQVVMTSLALVGAAIHPFALLLVLAGWLFHVRRARRSGDVLWDIPTPKIIEECEHLEDGIYGIFQSTFLGAS
QRGVGVAQGGVFHTMWHVTRGAFLVWNGKKLIPSWASVKEDLVAYGGSWKLEGRWDGEEEVQLIAAAPGKNVVNVQTKPS
LFKVRNGGEIGAVALDYPSGTSGSPIVNRNGEVIGLYGNGILVGDNSFVSAISQTEVKEEGKEELQEIPTMLKKGKTTIL
DFHPGAGKTRRFLPQILAECARRRLRTLVLAPTRVVLSEMKEAFHGLDVKFHTQAFSAHGSGREVIDVMCHATLTYRMLE
PTRIVNWEVIIMDEAHFLDPASIAARGWAAHRARANESATILMTATPPGTSDEFPHSNGEIEDVQTDIPSEPWNTGHDWI
LADKRPTAWFLPSIRAANVMAASLRKAGKSVVVLNRKTFEREYPTIKQKKPDFILATDIAEMGANLCVERVLDCRTAFKP
VLVDEGRKVAIKGPLRISASSAAQRRGRIGRNPNRDGDSYYYSEPTSEDNAHHVCWLEASMLLDNMEVRGGMVAPLYGVE
GIKTPVSPGEMRLRDDQRKVFRELVRNCDLPVWLSWQVAKAGLKTNDRKWCFEGPEEHEILNDSGETVKCRAPGGAKKPL
RPRWCDERVSSDQSALSEFIKFAEGRRGAADVLVVLSELPDFLAKKGGEAMDTISVFLHSEEGSRAYRNALSMMPEAMTI
VMLFLLAGLLTSGMVIFFMSPKGISRMSMAKGTMAGCGYLMFLGGVEPTHISYIMLIFFVLMVVVIPEPGQQRSIQDNQV
AFLIIGILTLVSVVAANELGMLEKTKEDLFGKKNLIPSGASPWSWPDLDLKPGAAWTVYVGIVTMLSPMLHHWIKVEYGN
LSLSGIAQSASVLSFMDKGIPFMKMNISVIMLLVSGWNSITVMPLLCGIGCAMLHWSLILPGIKAQQSKLAQRRVFHGVA
KNPVVDGNPTVDIEEAPEMPALYEKKLALYLLLALSLASVAMCRTPFSLDEGIVLASAALGPLIEGNTSLLWNGPMAVSM
TGVMRGNYYAFVGVAYNLWKMKTARRGTANGKTLGEVWKRELNLLDKQQFELYKRTDIVEVDRDTARRHLAEGKVDTGVA
VSRGTAKLRWFHERGYVKLEGRVIDLGCGRGGWCYYAAAQKEVSGVKGFTLGRDGHEKPMNVQSLGWNIITFKDKTDVHP
LEPVKCDTLLCDIGESSSSSVTEGERTVRVLDTVEKWLACGVDNFCVKVLAPYMRDVLEKLELLQRRFGGTVIRNPLSRN
STHEMYYVSGARSNVTFTVNQTSRLLMRRMRRPTGKVTLEADVTLPIGTRSVETDKGPLDKEAIKERVERIKSEYMTSWF
YDNDNPYRTWHYCGSYVTKTSGSAASMVNGVIKLLTYPWDKIEEVTRMAMTDTTPFGQQRVFKEKVDTRAKDPPAGTRKI
MKVVNRWLFRHLAREKNPRLCTKEEFIAKVRSHAAIGAYLEEQDQWKTANEAVQDPKFWELVDEERKLHQQGRCRTCVYN
MMGKREKKLSEFGKAKGSRAIWYMWLGARYLEFEALGFLNEDHWASRENSGGGVEGIGLQYLGYVIRDLAAMDGGGLYAD
DTAGWDTRITEADLDDEQEILNYMSPHHKKLAQAVMEMTYKNKVVKVLRPAPGGKAYMDVISRRDQRGSGQVVTYALNTI
TNLKVQLIRMAEAEMVIHHQHVQDCDESVLTRLEAWLTEHGCDRLRRMAVSGDDCVVRPIDDRFGLALSHLNAMSKVRKD
ISEWQPSKGWNDWENVPFCSHHFHELQLKDGRRIVVPCREQDELIGRGRVSPGNGWMIKETACLSKAYANMWSLMYFHKR
DMRLLSLAVSSAVPTSWVPQGRTTWSIHGKGEWMTTEDRLEVWNRVWITNNPHMQDKTMVKEWRDVPYLTKRQDKLCGSL
IGMTNRATWASNIHLVIHRIRTLVGQEKYTDYLTVMDRYSVDADLQPGELI
>Q32ZE1 ~~~~~~Genome polyprotein~~~
MKNPKEEIRRIRIVNMLKRGVARVNPLGGLKRLPAGLLLGHGPIRMVLAILAFLRFTAIKPSLGLINRWGSVGKKEAMEI
IKKFKKDLAAMLRIINARKERKRRGADTSIGIIGLLLTTAMAAEITRRGSAYYMYLDRSDAGKAISFATTLGVNKCHVQI
MDLGHMCDATMSYECPMLDEGVEPDDVDCWCNTTSTWVVYGTCHHKKGEARRSRRAVTLPSHSTRKLQTRSQTWLESREY
TKHLIKVENWIFRNPGFALVAVAIAWLLGSSTSQKVIYLVMILLIAPAYSIRCIGVSNRDFVEGMSGGTWVDVVLEHGGC
VTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGK
GSLVTCAKFTCSKKMTGKSIQPENLEYRIMLSVHGSQHSGMIGYETDEDRAKVEVTPNSPRAEATLGGFGSLGLDCEPRT
GLDFSDLYYLTMNNKHWLVHKEWFHDIPLPWHAGADTGTPHWNNKEALVEFKDAHAKRQTVVVLGSQEGAVHTALAGALE
AEMDGAKGRLFSGHLKCRLKMDKLRLKGVSYSLCTAAFTFTKVPAETLHGTVTVEVQYAGTDGPCKIPVQMAVDMQTLTP
VGRLITANPVITESTENSKMMLELDPPFGDSYIVIGVGDKKITHHWHRSGSTIGKAFEATVRGAKRMAVLGDTAWDFGSV
GGVFNSLGKGIHQIFGAAFKSLFGGMSWFSQILIGTLLVWLGLNTKNGSISLTCLALGGVMIFLSTAVSADVGCSVDFSK
KETRCGTGVFIYNDVEAWRDRYKYHPDSPRRLAAAVKQAWEEGICGISSVSRMENIMWKSVEGELNAILEENGVQLTVVV
GSVKNPMWRGPQRLPVPVNELPHGWKAWGKSYFVRAAKTNNSFVVDGDTLKECPLEHRAWNSFLVEDHGFGVFHTSVWLK
VREDYSLECDPAVIGTAVKGREAAHSDLGYWIESEKNDTWRLKRAHLIEMKTCEWPKSHTLWTDGVEESDLIIPKSLAGP
LSHHNTREGYRTQVKGPWHSEELEIRFEECPGTKVYVEETCGTRGPSLRSTTASGRVIEEWCCRECTMPPLSFRAKDGCW
YGMEIRPRKEPESNLVRSMVTAGSTDHMDHFSLGVLVILLMVQEGLKKRMTTKIIMSTSMAVLVVMILGGFSMSDLAKLV
ILMGATFAEMNTGGDVAHLALVAAFKVRPALLVSFIFRANWTPRESMLLALASCLLQTAISALEGDLMVLINGFALAWLA
IRAMAVPRTDNIALPILAALTPLARGTLLVAWRAGLATCGGIMLLSLKGKGSVKKNLPFVMALGLTAVRVVDPINVVGLL
LLTRSGKRSWPPSEVLTAVGLICALAGGFAKADIEMAGPMAAVGLLIVSYVVSGKSVDMYIERAGDITWEKDAEVTGNSP
RLDVALDESGDFSLVEEDGPPMREIILKVVLMAICGMNPIAIPFAAGAWYVYVKTGKRSGALWDVPAPKEVKKGETTDGV
YRVMTRRLLGSTQVGVGVMQEGVFHTMWHVTKGAALRSGEGRLDPYWGDVKQDLVSYCGPWKLDAAWDGLSEVQLLAVPP
GERARNIQTLPGIFKTKDGDIGAVALDYPAGTSGSPILDKCGRVIGLYGNGVVIKNGSYVSAITQGKREEETPVECFEPS
MLKKKQLTVLDLHPGAGKTRRVLPEIVREAIKKRLRTVILAPTRVVAAEMEEALRGLPVRYMTTAVNVTHSGTEIVDLMC
HATFTSRLLQPIRVPNYNLNIMDEAHFTDPSSIAARGYISTRVEMGEAAAIFMTATPPGTRDAFPDSNSPIMDTEVEVPE
RAWSSGFDWVTDHSGKTVWFVPSVRNGNEIAACLTKAGKRVIQLSRKTFETEFQKTKNQEWDFVITTDISEMGANFKADR
VIDSRRCLKPVILDGERVILAGPMPVTHASAAQRRGRIGRNPNKPGDEYMYGGGCAETDEGHAHWLEARMLLDNIYLQDG
LIASLYRPEADKVAAIEGEFKLRTEQRKTFVELMKRGDLPVWLAYQVASAGITYTDRRWCFDGTTNNTIMEDSVPAEVWT
KYGEKRVLKPRWMDARVCSDHAALKSFKEFAAGKRGAALGVMEALGTLPGHMTERFQEAIDNLAVLMRAETGSRPYKAAA
AQLPETLETIMLLGLLGTVSLGIFFVLMRNKGIGKMGFGMVTLGASAWLMWLSEIEPARIACVLIVVFLLLVVLIPEPEK
QRSPQDNQMAIIIMVAVGLLGLITANELGWLERTKNDIAHLMGRREEGATMGFSMDIDLRPASAWAIYAALTTLITPAVQ
HAVTTSYNNYSLMAMATQAGVLFGMGKGMPFMHGDLGVPLLMMGCYSQLTPLTLIVAIILLVAHYMYLIPGLQAAAARAA
QKRTAAGIMKNPVVDGIVVTDIDTMTIDPQVEKKMGQVLLIAVAISSAVLLRTAWGWGEAGALITAATSTLWEGSPNKYW
NSSTATSLCNIFRGSYLAGASLIYTVTRNAGLVKRRGGGTGETLGEKWKARLNQMSALEFYSYKKSGITEVCREEARRAL
KDGVATGGHAVSRGSAKIRWLEERGYLQPYGKVVDLGCGRGGWSYYAATIRKVQEVRGYTKGGPGHEEPMLVQSYGWNIV
RLKSGVDVFHMAAEPCDTLLCDIGESSSSPEVEETRTLRVLSMVGDWLEKRPGAFCIKVLCPYTSTMMETMERLQRRHGG
GLVRVPLCRNSTHEMYWVSGAKSNIIKSVSTTSQLLLGRMDGPRRPVKYEEDVNLGSGTRAVASCAEAPNMKIIGRRIER
IRNEHAETWFLDENHPYRTWAYHGSYEAPTQGSASSLVNGVVRLLSKPWDVVTGVTGIAMTDTTPYGQQRVFKEKVDTRV
PDPQEGTRQVMNIVSSWLWKELGKRKRPRVCTKEEFINKVRSNAALGAIFEEEKEWKTAVEAVNDPRFWALVDREREHHL
RGECHSCVYNMMGKREKKQGEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWMGRENSGGGVEGLGLQRLGYILEEMN
RAPGGKMYADDTAGWDTRISKFDLENEALITNQMEEGHRTLALAVIKYTYQNKVVKVLRPAEGGKTVMDIISRQDQRGSG
QVVTYALNTFTNLVVQLIRNMEAEEVLEMQDLWLLRKPEKVTRWLQSNGWDRLKRMAVSGDDCVVKPIDDRFAHALRFLN
DMGKVRKDTQEWKPSTGWSNWEEVPFCSHHFNKLYLKDGRSIVVPCRHQDELIGRARVSPGAGWSIRETACLAKSYAQMW
QLLYFHRRDLRLMANAICSAVPVDWVPTGRTTWSIHGKGEWMTTEDMLMVWNRVWIEENDHMEDKTPVTKWTDIPYLGKR
EDLWCGSLIGHRPRTTWAENIKDTVNMVRRIIGDEEKYMDYLSTQVRYLGEEGSTPGVL
>A0A024B7W1 ~~~~~~Genome polyprotein~~~
MKNPKKKSGGFRIVNMLKRGVARVSPFGGLKRLPAGLLLGHGPIRMVLAILAFLRFTAIKPSLGLINRWGSVGKKEAMEI
IKKFKKDLAAMLRIINARKEKKRRGADTSVGIVGLLLTTAMAAEVTRRGSAYYMYLDRNDAGEAISFPTTLGMNKCYIQI
MDLGHMCDATMSYECPMLDEGVEPDDVDCWCNTTSTWVVYGTCHHKKGEARRSRRAVTLPSHSTRKLQTRSQTWLESREY
TKHLIRVENWIFRNPGFALAAAAIAWLLGSSTSQKVIYLVMILLIAPAYSIRCIGVSNRDFVEGMSGGTWVDVVLEHGGC
VTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGK
GSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHSGMIVNDTGHETDENRAKVEITPNSPRAEATLGGFGSLGLDC
EPRTGLDFSDLYYLTMNNKHWLVHKEWFHDIPLPWHAGADTGTPHWNNKEALVEFKDAHAKRQTVVVLGSQEGAVHTALA
GALEAEMDGAKGRLSSGHLKCRLKMDKLRLKGVSYSLCTAAFTFTKIPAETLHGTVTVEVQYAGTDGPCKVPAQMAVDMQ
TLTPVGRLITANPVITESTENSKMMLELDPPFGDSYIVIGVGEKKITHHWHRSGSTIGKAFEATVRGAKRMAVLGDTAWD
FGSVGGALNSLGKGIHQIFGAAFKSLFGGMSWFSQILIGTLLMWLGLNTKNGSISLMCLALGGVLIFLSTAVSADVGCSV
DFSKKETRCGTGVFVYNDVEAWRDRYKYHPDSPRRLAAAVKQAWEDGICGISSVSRMENIMWRSVEGELNAILEENGVQL
TVVVGSVKNPMWRGPQRLPVPVNELPHGWKAWGKSYFVRAAKTNNSFVVDGDTLKECPLKHRAWNSFLVEDHGFGVFHTS
VWLKVREDYSLECDPAVIGTAVKGKEAVHSDLGYWIESEKNDTWRLKRAHLIEMKTCEWPKSHTLWTDGIEESDLIIPKS
LAGPLSHHNTREGYRTQMKGPWHSEELEIRFEECPGTKVHVEETCGTRGPSLRSTTASGRVIEEWCCRECTMPPLSFRAK
DGCWYGMEIRPRKEPESNLVRSMVTAGSTDHMDHFSLGVLVILLMVQEGLKKRMTTKIIISTSMAVLVAMILGGFSMSDL
AKLAILMGATFAEMNTGGDVAHLALIAAFKVRPALLVSFIFRANWTPRESMLLALASCLLQTAISALEGDLMVLINGFAL
AWLAIRAMVVPRTDNITLAILAALTPLARGTLLVAWRAGLATCGGFMLLSLKGKGSVKKNLPFVMALGLTAVRLVDPINV
VGLLLLTRSGKRSWPPSEVLTAVGLICALAGGFAKADIEMAGPMAAVGLLIVSYVVSGKSVDMYIERAGDITWEKDAEVT
GNSPRLDVALDESGDFSLVEDDGPPMREIILKVVLMTICGMNPIAIPFAAGAWYVYVKTGKRSGALWDVPAPKEVKKGET
TDGVYRVMTRRLLGSTQVGVGVMQEGVFHTMWHVTKGSALRSGEGRLDPYWGDVKQDLVSYCGPWKLDAAWDGHSEVQLL
AVPPGERARNIQTLPGIFKTKDGDIGAVALDYPAGTSGSPILDKCGRVIGLYGNGVVIKNGSYVSAITQGRREEETPVEC
FEPSMLKKKQLTVLDLHPGAGKTRRVLPEIVREAIKTRLRTVILAPTRVVAAEMEEALRGLPVRYMTTAVNVTHSGTEIV
DLMCHATFTSRLLQPIRVPNYNLYIMDEAHFTDPSSIAARGYISTRVEMGEAAAIFMTATPPGTRDAFPDSNSPIMDTEV
EVPERAWSSGFDWVTDHSGKTVWFVPSVRNGNEIAACLTKAGKRVIQLSRKTFETEFQKTKHQEWDFVVTTDISEMGANF
KADRVIDSRRCLKPVILDGERVILAGPMPVTHASAAQRRGRIGRNPNKPGDEYLYGGGCAETDEDHAHWLEARMLLDNIY
LQDGLIASLYRPEADKVAAIEGEFKLRTEQRKTFVELMKRGDLPVWLAYQVASAGITYTDRRWCFDGTTNNTIMEDSVPA
EVWTRHGEKRVLKPRWMDARVCSDHAALKSFKEFAAGKRGAAFGVMEALGTLPGHMTERFQEAIDNLAVLMRAETGSRPY
KAAAAQLPETLETIMLLGLLGTVSLGIFFVLMRNKGIGKMGFGMVTLGASAWLMWLSEIEPARIACVLIVVFLLLVVLIP
EPEKQRSPQDNQMAIIIMVAVGLLGLITANELGWLERTKSDLSHLMGRREEGATIGFSMDIDLRPASAWAIYAALTTFIT
PAVQHAVTTSYNNYSLMAMATQAGVLFGMGKGMPFYAWDFGVPLLMIGCYSQLTPLTLIVAIILLVAHYMYLIPGLQAAA
ARAAQKRTAAGIMKNPVVDGIVVTDIDTMTIDPQVEKKMGQVLLIAVAVSSAILSRTAWGWGEAGALITAATSTLWEGSP
NKYWNSSTATSLCNIFRGSYLAGASLIYTVTRNAGLVKRRGGGTGETLGEKWKARLNQMSALEFYSYKKSGITEVCREEA
RRALKDGVATGGHAVSRGSAKLRWLVERGYLQPYGKVIDLGCGRGGWSYYAATIRKVQEVKGYTKGGPGHEEPMLVQSYG
WNIVRLKSGVDVFHMAAEPCDTLLCDIGESSSSPEVEEARTLRVLSMVGDWLEKRPGAFCIKVLCPYTSTMMETLERLQR
RYGGGLVRVPLSRNSTHEMYWVSGAKSNTIKSVSTTSQLLLGRMDGPRRPVKYEEDVNLGSGTRAVVSCAEAPNMKIIGN
RIERIRSEHAETWFFDENHPYRTWAYHGSYEAPTQGSASSLINGVVRLLSKPWDVVTGVTGIAMTDTTPYGQQRVFKEKV
DTRVPDPQEGTRQVMSMVSSWLWKELGKHKRPRVCTKEEFINKVRSNAALGAIFEEEKEWKTAVEAVNDPRFWALVDKER
EHHLRGECQSCVYNMMGKREKKQGEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWMGRENSGGGVEGLGLQRLGYVL
EEMSRIPGGRMYADDTAGWDTRISRFDLENEALITNQMEKGHRALALAIIKYTYQNKVVKVLRPAEKGKTVMDIISRQDQ
RGSGQVVTYALNTFTNLVVQLIRNMEAEEVLEMQDLWLLRRSEKVTNWLQSNGWDRLKRMAVSGDDCVVKPIDDRFAHAL
RFLNDMGKVRKDTQEWKPSTGWDNWEEVPFCSHHFNKLHLKDGRSIVVPCRHQDELIGRARVSPGAGWSIRETACLAKSY
AQMWQLLYFHRRDLRLMANAICSSVPVDWVPTGRTTWSIHGKGEWMTTEDMLVVWNRVWIEENDHMEDKTPVTKWTDIPY
LGKREDLWCGSLIGHRPRTTWAENIKNTVNMVRRIIGDEEKYMDYLSTQVRYLGEEGSTPGVL
>A0A142I5B9 ~~~~~~Genome polyprotein~~~
MKNPKKKSGGFRIVNMLKRGVARVSPFGGLKRLPAGLLLGHGPIRMVLAILAFLRFTAIKPSLGLINRWGSVGKKEAMEI
IKKFKKDLAAMLRIINARKEKKRRGTDTSVGIVGLLLTTAMAVEVTRRGNAYYMYLDRSDAGEAISFPTTMGMNKCYIQI
MDLGHMCDATMSYECPMLDEGVEPDDVDCWCNTTSTWVVYGTCHHKKGEARRSRRAVTLPSHSTRKLQTRSQTWLESREY
TKHLIRVENWIFRNPGFALAAAAIAWLLGSSTSQKVIYLVMILLIAPAYSIRCIGVSNRDFVEGMSGGTWVDVVLEHGGC
VTVMAQDKPTVDIELVTTTVSNMAEVRSYCYEASISDMASDSRCPTQGEAYLDKQSDTQYVCKRTLVDRGWGNGCGLFGK
GSLVTCAKFACSKKMTGKSIQPENLEYRIMLSVHGSQHSGMIVNDTGHETDENRAKVEITPNSPRAEATLGGFGSLGLDC
EPRTGLDFSDLYYLTMNNKHWLVHKEWFHDIPLPWHAGADTGTPHWNNKEALVEFKDAHAKRQTVVVLGSQEGAVHTALA
GALEAEMDGAKGRLSSGHLKCRLKMDKLRLKGVSYSLCTAAFTFTKIPAETLHGTVTVEVQYAGTDGPCKVPAQMAVDMQ
TLTPVGRLITANPVITESTENSKMMLELDPPFGDSYIVIGVGEKKITHHWHRSGSTIGKAFEATVRGAKRMAVLGDTAWD
FGSVGGALNSLGKGIHQIFGAAFKSLFGGMSWFSQILIGTLLVWLGLNTKNGSISLMCLALGGVLIFLSTAVSADVGCSV
DFSKKETRCGTGVFVYNDVEAWRDRYKYHPDSPRRLAAAVKQAWEDGICGISSVSRMENIMWRSVEGELNAILEENGVQL
TVVVGSVKNPMWRGPQRLPVPVNELPHGWKAWGKSYFVRAAKTNNSFVVDGDTLKECPLKHRAWNSFLVEDHGFGVFHTS
VWLKVREDYSLECDPAVIGTAAKGKEAVHSDLGYWIESEKNDTWRLKRAHLIEMKTCEWPKSHTLWTDGIEESDLIIPKS
LAGPLSHHNTREGYRTQMKGPWHSEELEIRFEECPGTKVHVEETCGTRGPSLRSTTASGRVIEEWCCRECTMPPLSFRAK
DGCWYGMEIRPRKEPESNLVRSMVTAGSTDHMDHFSLGVLVILLMVQEGLKKRMTTKIIISTSMAVLVAMILGGFSMSDL
AKLAILMGATFAEMNTGGDVAHLALIAAFKVRPALLVSFIFRANWTPRESMLLALASCLLQTAISALEGDLMVPINGFAL
AWLAIRAMVVPRTDNITLAILAALTPLARGTLLVAWRAGLATCGGFMLLSLKGKGSVKKNLPFVMALGLTAVRLVDPINV
VGLLLLTRSGKRSWPPSEVLTAVGLICALAGGFAKADIEMAGPMAAVGLLIVSYVVSGKSVDMYIERAGDITWEKDAEVT
GNSPRLDVALDESGDFSLVEDDGPPMREIILKVVLMAICGMNPIAIPFAAGAWYVYVKTGKRSGALWDVPAPKEVKKGET
TDGVYRVMTRRLLGSTQVGVGVMQEGVFHTMWHVTKGSALRSGEGRLDPYWGDVKQDLVSYCGPWKLDAAWDGHSEVQLL
AVPPGERARNIQTLPGIFKTKDGDIGAVALDYPAGTSGSPILDKCGRVIGLYGNGVVIKNGSYVSAITQGRREEETPVEC
FEPSMLKKKQLTVLDLHPGAGKTRRVLPEIVREAIKTRLRTVILAPTRVVAAEMEEALRGLPVRYMTTAVNVTHSGTEIV
DLMCHATFTSRLLQPIRVPNYNLYIMDEAHFTDPSSIAARGYISTRVEMGEAAAIFMTATPPGTRDAFPDSNSPIMDTEV
EVPERAWSSGFDWVTDHSGKTVWFVPSVRNGNEIAACLTKAGKRVIQLSRKTFETEFQKTKHQEWDFVVTTDISEMGANF
KADRVIDSRRCLKPVILDGERVILAGPMPVTHASAAQRRGRIGRNPNKPGDEYLYGGGCAETDEDHAHWLEARMLLDNIY
LQDGLIASLYRPEADKVAAIEGEFKLRTEQRKTFVELMKRGDLPVWLAYQVASAGITYTDRRWCFDGTTNNTIMEDSVPA
EVWTRYGEKRVLKPRWMDARVCSDHAALKSFKEFAAGKRGAAFGVMEALGTLPGHMTERFQEAIDNLAVLMRAETGSRPY
KAAAAQLPETLETIMLLGLLGTVSLGIFFVLMRNKGIGKMGFGMVTLGASAWLMWLSEIEPARIACVLIVVFLLLVVLIP
EPEKQRSPQDNQMAIIIMVAVGLLGLITANELGWLERTKSDLSHLMGRREEGATIGFSMDIDLRPASAWAIYAALTTFIT
PAVQHAVTTSYNNYSLMAMATQAGVLFGMGKGMPFYAWDFGVPLLMIGCYSQLTPLTLIVAIILLVAHYMYLIPGLQAAA
ARAAQKRTAAGIMKNPVVDGIVVTDIDTMTIDPQVEKKMGQVLLIAVAVSSAILSRTAWGWGEAGALITAATSTLWEGSP
NKYWNSSTATSLCNIFRGSYLAGASLIYTVTRNAGLVKRRGGGTGETLGEKWKARLNQMSALEFYSYKKSGITEVCREEA
RRALKDGVATGGHAVSRGSAKLRWLVERGYLQPYGKVIDLGCGRGGWSYYAATIRKVQEVKGYTKGGPGHEEPMLVQSYG
WNIVRLKSGVDVFHMAAEPCDTLLCDIGESSSSPEVEEARTLRVLSMVGDWLEKRPGAFCIKVLCPYTSTMMETLERLQR
RYGGGLVRVPLSRNSTHEMYWVSGAKSNTIKSVSTTSQLLLGRMDGPRRPVKYEEDVNLGSGTRAVVSCAEAPNMKIIGN
RIERIRSEHAETWFFDENHPYRTWAYHGSYEAPTQGSASSLINGVVRLLSKPWDVVTGVTGIAMTDTTPYGQQRVFKEKV
DTRVPDPQEGTRQVMSMVSSWLWKELGKHKRPRVCTKEEFINKVRSNAALGAIFEEEKEWKTAVEAVNDPRFWALVDKER
EHHLRGECQSCVYNMMGKREKKQGEFGKAKGSRAIWYMWLGARFLEFEALGFLNEDHWMGRENSGGGVEGLGLQRLGYVL
EEMSRIPGGRMYADDTAGWDTRISRFDLENEALITNQMEKGHRALALAIIKYTYQNKVVKVLRPAEKGKTVMDIISRQDQ
RGSGQVVTYALNTFTNLVVQLIRNMEAEEVLEMQDLWLLRRSEKVTNWLQSNGWDRLKRMAVSGDDCVVKPIDDRFAHAL
RFLNDMGKVRKDTQEWKPSTGWDNWEEVPFCSHHFNKLHLKDGRSIVVPCRHQDELIGRARVSPGAGWSIRETACLAKSY
AQMWQLLYFHRRDLRLMANAICSSVPVDWVPTGRTTWSIHGKGEWMTTEDMLVVWNRVWIEENDHMEDKTPVTKWTDIPY
LGKREDLWCGSLIGHRPRTTWAENIKNTVNMMRRIIGDEEKYVDYLSTQVRYLGEEGSTPGVL
>P18479 ~~~~~~Genome polyprotein~~~
MASIMIGSISVPIAKTEQCANTQVSNRANIVAPGHMATCPLPLKTHMYYRHESKKLMQSNKSIDILNNFFSTDEMKFRLT
RNEMSKLKKGPSGRIVLRKPSKQRVFARIEQDEAARKEEAVFLEGNYDDSITNLARVLPPAVTHNVDVSLRSPFYKRTYK
KERKKVAQKQIVQAPLNSLCTRVLKIARNKNIPVEMIGNKKTRHTLTFKRFRGCFVGKVSVAHEEGRMRHTEMSYEQFKW
LLKAICQVTHTERIREEDIKPGCSGWVLGTNHTLTKRYSRLPHLVIRGRDDDGIVNALEQVLFYSEVDHSSSQPEVQFFQ
GWRRIFDKFRPSPDHVCKADHNNEECGELAAIFCQALFPVVKLSCQTCRESLVEVSFEEFKDSLNANFIIHKDEWGSFKE
GSQYDNIFKLIKVATQATQNLKLSSEVMKLVQNHTSTHMKQIQDINKALMKGSLVAQDELDLALKQLLEMTQWFKNHMHL
TGEEALKMFRNKRSSKAMINPSLLCGNQLDKNGNFVWGERGYHSKRLFKNFFEEVIPSEGYTKYVVRNFPNGTRKLAIGS
LIVPLNLDRARTALLGESIEKKPLTSACVSQQNGNYIHSCCCVTMDDGTPMYSELKSPTKRHLVIGASSDPKYIDLPASE
AERMYIAKEGYCYLSIFLAMLVNVNENEAKDFTKMIRDVLIPMLGQWPSLMDVATAAYILGVFHPETRCAELPRILVDHA
TQTMHVIDSYGSLTVGYHVLKAGTVNHLIQFASNDLQSEMKHYRVGGTPTQRIKLEEQLIKGIFKPKLMMQLLHDDPYIL
LLGMISPTILVHMYRMRHFERGIEIWIKRDHEIGKIFVILEQLTRKVALAEVLVDQLNLISEASPHLLEIMKGCQDNQRA
YVPALDLLTIQVEREFSNKELKTNGYPDLQQTLFDMREKMYAKQLHNSWQELSLLEKSCVTVRLKQFSIFTERNLIQRAK
EGKRASSLQFVHECFITTRVHAKSIRDAGVRKLNEALVGTCKFFFSCGFKIFARCYSDIIYLVNVCLVFSLVLQMSNTVR
SMIAATREEKERAMANKADENERTLMHMYHIFSKKQDDAPIYNDFLEHVRNVRPDLEETLLYMAGVEVVSTQAKSAVQIQ
FEKIIAVLALLTMCFDAERSDAIFKILTKLKTVFGTVGETVRLQGLEDIESLEDDKRLTIDFDINTNEAHSSTTFDVHFD
DWWNRQLQQNRTVPHYRTTGKFLEFTRNTAAFVANEIASSSEGEFLVRGRVGSGKSTSLPAHLAKKGKVLLLEPTRPLAE
NVSRQLAGDPFFQNVTLRMRGLSCFGSSNITVMTSGFAFHYYVNNPHQLMEFDFVIIDECHVTDSATIAFNCALKEYNFA
GKLIKVSATPPGRECDFDTQFAVKVKTEDHLSFHAFVGAQKTGSNADMVQHGNNILVYVASYNEVDMLSKLLTERQFSVT
KVDGRTMQLGKTTIETHGTSQKPHFIVATNIIENGVTLDVECVVDFGLKVGRRTGQRNRCVRYNKKSVSYGERIQRLGRV
GRSKPGTALRIGHTEKGIETIPEFIATEAAALSFAYGLPVTTHGVSTNILGKCTVKQMKCALNFELTPFFTTHLIRHDGS
MHPLIHEELKQFKLRDSEMVLNKVALPHQFVSQWMDQSEYERIGVHVQCHESNSIPFYTNGIPDKVYERIWKCIQENKND
AVFGKLSSACSTKVSYTLSTDPAALPRTIAIIDHLLAEEMMKRNHFDTISSAVTGYSFSLAGIADSFRKRYMRDYTAHNI
AILQQARAQLLEFNSKNVNINNLSDLEGIGVIKSVVLQSKQEVSSFLGLRGKWDGKKFANDVILAIMTLLGGGWFMWEYF
TKKINEPVRVESKKRRSQKLKFRDAYDRKVGREIFGDDDTIGRTFGEAYTKRGKVKGNNNTKGMGRKTRNFVHLYGVEPE
NYSFIRFVDPLTGHTLDESTHTDISLVQEEFGSIREKFLENDLISRQSIINKPGIQAYFMGKGTEEALKVDLTPHVPLLL
CRNTNAIAGYPERENELRQTGTPVKVSFKDVPEKNEHVELESKSIYKGVRDYNGISTIVCQLTNDSDGLKETMYGIGYGP
IIITNGHLFRKNNGTLLVRSWHGEFIVKNTTTLKVHFIEGKDVVLVRMPKDFPPFKSNASFRAPKREERRCLVGTNFQEK
SLRSTVSESSMTIPEGTGSYWIHWISTNEGDCGLPMVSTTDGKIIGVHGLASTVSSKNYFVPFTDDFIATHLSKLDDLTW
TQHWLWQPSKIAWGTLNLVDEQPGPEFRISNLVKDLFTSGVETQSKRERWVYESCEGNLRAVGTAQSALVTKHVVKGKCP
FFEEYLQTHAEASAYFRPLMGEYQPSKLNKEAFKKDFFKYNKPVTVNQLDHDKFLGAVDGVIRMMCDFEFNECRFITDPE
EIYNSLNMKAAIGAQYRGKKKEYFEGLDDFDRERLLFQSCERLFNGYKGLWNGSLKAELRPLEKVRANKTRTFTAAPIDT
LLGAKVCVDDFNNEFYRKNLKCPWTVGMTKFYGGWDKLMRSLPDGWLYCHADGSQFDSSLTPALLNAVLIIRSFYMEDWW
VGQEMLENLYAEIVYTPILAPDGTIFKKFRGNNSGQPSTVVDNTLMVVISIYYACMKFGWNCEEIENKLVFFANGDDLIL
AVKDEDSGLLDNMSSSFCELGLNYDFSERTHKREDLWFMSHQAMLVDGMYTPKLEKERIVSILEWDRSKEIMHRTEAICA
AMIEAWGHTELLQEIRKFYLWFVEKEEVRELAALGKAPYIAETALRKLYTDKGADTSELARYLQALHQDIFFEQGDTVML
QSGTQPTVADAGATKKDKEDDKGKNKDVTGSGSGEKTVAAVTKDKDVNAGSHGKIVPRLSKITKKMSLPRVKGNVILDID
HLLEYKPDQIELYNTRASHQQFASWFNQVKTEYDLNEQQMGVVMNGFMVWCIENGTSPDINGVWVMMDGNEQVEYPLKPI
VENAKPTLRQIMHHFSDAAEAYIEMRNAEAPYMPRYGLLRNLRDRSLARYAFDFYEVNSKTPERAREAVAQMKAAALSNV
SSRLFGLDGNVATTSEDTERHTARDVNRNMHTLLGVNTMQ
>Q5XXP4 ~~~~~~Polyprotein P1234~~~
MDPVYVDIDADSAFLKALQRAYPMFEVEPRQVTPNDHANARAFSHLAIKLIEQEIDPDSTILDIGSAPARRMMSDRKYHC
VCPMRSAEDPERLANYARKLASAAGKVLDRNISEKIGDLQAVMAVPDAETPTFCLHTDVSCRQRADVAIYQDVYAVHAPT
SLYHQAIKGVRVAYWIGFDTTPFMYNAMAGAYPSYSTNWADEQVLKAKNIGLCSTDLTEGRRGKLSIMRGKKMKPCDRVL
FSVGSTLYPESRKLLKSWHLPSVFHLKGKLSFTCRCDTVVSCEGYVVKRITISPGLYGKTTGYAVTHHADGFLMCKTTDT
VDGERVSFSVCTYVPATICDQMTGILATEVTPEDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPVVAQAFSKWAKECRK
DMEDEKLLGIRERTLTCCCLWAFKKQKTHTVYKRPDTQSIQKVPAEFDSFVVPSLWSSGLSIPLRTRIKWLLSKVPKTDL
IPYSGDAKEARDAEKEAEEEREAELTREALPPLQAAQDDVQVEIDVEQLEDRAGAGIIETPRGAIKVTAQPTDHVVGEYL
VLSPQTVLRSQKLSLIHALAEQVKTCTHSGRAGRYAVEAYDGRILVPSGYAISPEDFQSLSESATMVYNEREFVNRKLHH
IALHGPALNTDEESYELVRAERTEHEYVYDVDQRRCCKKEEAAGLVLVGDLTNPPYHEFAYEGLRIRPACPYKTAVIGVF
GVPGSGKSAIIKNLVTRQDLVTSGKKENCQEISTDVMRQRNLEISARTVDSLLLNGCNRPVDVLYVDEAFACHSGTLLAL
IALVRPRQKVVLCGDPKQCGFFNMMQMKVNYNHNICTQVYHKSISRRCTLPVTAIVSSLHYEGKMRTTNEYNKPIVVDTT
GSTKPDPGDLVLTCFRGWVKQLQIDYRGHEVMTAAASQGLTRKGVYAVRQKVNENPLYASTSEHVNVLLTRTEGKLVWKT
LSGDPWIKTLQNPPKGNFKATIKEWEVEHASIMAGICNHQVTFDTFQNKANVCWAKSLVPILETAGIKLNDRQWSQIIQA
FKEDRAYSPEVALNEICTRMYGVDLDSGLFSKPLVSVHYADNHWDNRPGGKMFGFNPEAASILERKYPFTKGKWNTNKQI
CVTTRRIEDFNPNTNIIPANRRLPHSLVAEHRPVKGERMEWLVNKINGHHVLLVSGYNLVLPTKRVTWVAPLGIRGADYT
YNLELGLPATLGRYDLVIINIHTPFRIHHYQQCVDHAMKLQMLGGDSLRLLKPGGSLLIRAYGYADRTSERVVCVLGRKF
RSSRALKPPCVTSNTEMFFLFSNFDNGRRNFTTHVMNNQLNAAFVGQATRAGCAPSYRVKRMDIAKNDEECVVNAANPRG
LPGDGVCKAVYKKWPESFKNSATPVGTAKTVMCGTYPVIHAVGPNFSNYSESEGDRELAAAYREVAKEVTRLGVNSVAIP
LLSTGVYSGGKDRLTQSLNHLFTALDSTDADVVIYCRDKEWEKKIAEAIQMRTQVELLDEHISVDCDIIRVHPDSSLAGR
KGYSTTEGSLYSYLEGTRFHQTAVDMAEVYTMWPKQTEANEQVCLYALGESIESIRQKCPVDDADASSPPKTVPCLCRYA
MTPERVTRLRMNHVTSIIVCSSFPLPKYKIEGVQKVKCSKVMLFDHNVPSRVSPREYKSPQETAQEVSSTTSLTHSQFDL
SVDGEELPAPSDLEADAPIPEPTPDDRAVLTLPPTIDNFSAVSDWVMNTAPVAPPRRRRGKNLNVTCDEREGNVLPMASV
RFFRADLHSIVQETAEIRDTAASLQAPLSVATEPNQLPISFGAPNETFPITFGDFDEGEIESLSSELLTFGDFSPGEVDD
LTDSDWSTCSDTDDELXLDRAGGYIFSSDTGPGHLQQRSVRQTVLPVNTLEEVQEEKCYPPKLDEVKEQLLLKKLQESAS
MANRSRYQSRKVENMKATIVQRLKGGCKLYLMSETPKVPTYRTTYPAPVYSPPINIRLSNPESAVAACNEFLARNYPTVA
SYQITDEYDAYLDMVDGSESCLDRATFNPSKLRSYPKQHSYHAPTIRSAVPSPFQNTLQNVLAAATKRNCNVTQMRELPT
LDSAVFNVECFKKFACNQEYWKEFAASPIRITTENLTTYVTKLKGPKAAALFAKTHNLLPLQEVPMDRFTVDMKRDVKVT
PGTKHTEERPKVQVIQAAEPLATAYLCGIHRELVRRLNAVLLPNVHTLFDMSAEDFDAIIAAHFKPGDAVLETDIASFDK
SQDDSLALTALMLLEDLGVDHPLLDLIEAAFGEISSCHLPTGTRFKFGAMMKSGMFLTLFVNTLLNITIASRVLEDRLTR
SACAAFIGDDNIIHGVVSDELMAARCATWMNMEVKIIDAVVSQKAPYFCGGFILYDTVAGTACRVADPLKRLFKLGKPLA
AGDEQDDDRRRALADEVVRWQRTGLTDELEKAVHSRYEVQGISVVVMSMATFASSRSNFEKLRGPVVTLYGGPK
>Q8JUX6 ~~~~~~Polyprotein P1234~~~
MDPVYVDIDADSAFLKALQRAYPMFEVEPRQVTPNDHANARAFSHLAIKLIEQEIDPDSTILDIGSAPARRMMSDRKYHC
VCPMRSAEDPERLANYARKLASAAGKVLDRNISGKIGDLQAVMAVPDTETPTFCLHTDVSCRQRADVAIYQDVYAVHAPT
SLYHQAIKGVRLAYWVGFDTTPFMYNAMAGAYPSYSTNWADEQVLKAKNIGLCSTDLTEGRRGKLSIMRGKKLEPCDRVL
FSVGSTLYPESRKLLKSWHLPSVFHLKGKLSFTCRCDTVVSCEGYVVKRITMSPGLYGKTTGYAVTHHADGFLMCKTTDT
VDGERVSFSVCTYVPATICDQMTGILATEVTPEDAQKLLVGLNQRIVVNGRTQRNTNTMKNYMIPVVAQAFSKWAKECRK
DMEDEKLLGVRERTLTCCCLWAFKKQKTHTVYKRPDTQSIQKVQAEFDSFVVPSLWSSGLSIPLRTRIKWLLSKVPKTDL
TPYSGDAQEARDAEKEAEEEREAELTLEALPPLQAAQEDVQVEIDVEQLEDRAGAGIIETPRGAIKVTAQPTDHVVGEYL
VLSPQTVLRSQKLSLIHALAEQVKTCTHSGRAGRYAVEAYDGRVLVPSGYAISPEDFQSLSESATMVYNEREFVNRKLHH
IAMHGPALNTDEESYELVRAERTEHEYVYDVDQRRCCKKEEAAGLVLVGDLTNPPYHEFAYEGLKIRPACPYKIAVIGVF
GVPGSGKSAIIKNLVTRQDLVTSGKKENCQEITTDVMRQRGLEISARTVDSLLLNGCNRPVDVLYVDEAFACHSGTLLAL
IALVRPRQKVVLCGDPKQCGFFNMMQMKVNYNHNICTQVYHKSISRRCTLPVTAIVSSLHYEGKMRTTNEYNKPIVVDTT
GSTKPDPGDLVLTCFRGWVKQLQIDYRGHEVMTAAASQGLTRKGVYAVRQKVNENPLYASTSEHVNVLLTRTEGKLVWKT
LSGDPWIKTLQNPPKGNFKATIKEWEVEHASIMAGICSHQMTFDTFQNKANVCWAKSLVPILETAGIKLNDRQWSQIIQA
FKEDKAYSPEVALNEICTRMYGVDLDSGLFSKPLVSVYYADNHWDNRPGGKMFGFNPEAASILERKYPFTKGKWNINKQI
CVTTRRIEDFNPTTNIIPANRRLPHSLVAEHRPVKGERMEWLVNKINGHHVLLVSGCSLALPTKRVTWVAPLGVRGADYT
YNLELGLPATLGRYDLVVINIHTPFRIHHYQQCVDHAMKLQMLGGDSLRLLKPGGSLLIRAYGYADRTSERVICVLGRKF
RSSRALKPPCVTSNTEMFFLFSNFDNGRRNFTTHVMNNQLNAAFVGQATRAGCAPSYRVKRMDIAKNDEECVVNAANPRG
LPGDGVCKAVYKKWPESFKNSATPVGTAKTVMCGTYPVIHAVGPNFSNYSESEGDRELAAAYREVAKEVTRLGVNSVAIP
LLSTGVYSGGKDRLTQSLNHLFTAMDSTDADVVIYCRDKEWEKKISEAIQMRTQVELLDEHISIDCDVVRVHPDSSLAGR
KGYSTTEGALYSYLEGTRFHQTAVDMAEIYTMWPKQTEANEQVCLYALGESIESIRQKCPVDDADASSPPKTVPCLCRYA
MTPERVTRLRMNHVTSIIVCSSFPLPKYKIEGVQKVKCSKVMLFDHNVPSRVSPREYRPSQESVQEASTTTSLTHSQFDL
SVDGKILPVPSDLDADAPALEPALDDGAIHTLPSATGNLAAVSDWVMSTVPVAPPRRRRGRNLTVTCDEREGNITPMASV
RFFRAELCPVVQETAETRDTAMSLQAPPSTATELSHPPISFGAPSETFPITFGDFNEGEIESLSSELLTFGDFLPGEVDD
LTDSDWSTCSDTDDELRLDRAGGYIFSSDTGPGHLQQKSVRQSVLPVNTLEEVHEEKCYPPKLDEAKEQLLLKKLQESAS
MANRSRYQSRKVENMKATIIQRLKRGCRLYLMSETPKVPTYRTTYPAPVYSPPINVRLSNPESAVAACNEFLARNYPTVS
SYQITDEYDAYLDMVDGSESCLDRATFNPSKLRSYPKQHAYHAPSIRSAVPSPFQNTLQNVLAAATKRNCNVTQMRELPT
LDSAVFNVECFKKFACNQEYWEEFAASPIRITTENLTTYVTKLKGPKAAALFAKTHNLLPLQEVPMDRFTVDMKRDVKVT
PGTKHTEERPKVQVIQAAEPLATAYLCGIHRELVRRLNAVLLPNVHTLFDMSAEDFDAIIAAHFKPGDTVLETDIASFDK
SQDDSLALTALMLLEDLGVDHSLLDLIEAAFGEISSCHLPTGTRFKFGAMMKSGMFLTLFVNTLLNITIASRVLEDRLTK
SACAAFIGDDNIIHGVVSDELMAARCATWMNMEVKIIDAVVSQKAPYFCGGFILHDIVTGTACRVADPLKRLFKLGKPLA
AGDEQDEDRRRALADEVVRWQRTGLIDELEKAVYSRYEVQGISVVVMSMATFASSRSNFEKLRGPVVTLYGGPK
>Q9IJX4 ~~~ORF1~~~Replicase polyprotein~~~
MSFQQTNNNATNNINSLEELAAQELIAAQFEGNLDGFFCTFYVQSKPQLLDLESECYCMDDFDCGCDRIKREEELRKLIF
LTSDVYGYNFEEWKGLVWKFVQNYCPEHRYGSTFGNGLLIVSPRFFMDHLDWFQQWKLVSSNDECRAFLRKRTQLLMSGD
VESNPGPVQSRPVYACDNDPRAIRLEKALQRRDEKISTLIKKLRQEIKNNRIYTQGFFDDLKGAKGEVGQLNGNLTRICD
FLENSLPTLTAQIQTTVLTTTDKYVNLKEDLLKVAILLVLVRLLMVWKKYRAALIVIILFVMHFYGFDKQILDIVLDLKD
KILQTTTQAGTETLEEVVYHPWFDTCGKLIFAVLAFFAIKKIPGKQDWDNYISRLDRIPKAIEGSKKIVDYCSEYFNLSV
DEVKKVVLGKELKGTQGLYDEIHVWAKEIRHYLDLDERNKITLDTETAAKVEDLYKRGLKYSEEKIPDRDIARFITTMLF
PAKSLYEQVLLSPVKGGGPKMRPITVWLTGESGIGKTQMIYPLCIDILREMGIVKPDAYKHQAYARQVETEYWDGYNGQK
IVIYDDAFQLKDDKTKPNPEIFEVIRTCNTFPQHLHMAALQDKNMYSQAEVLLYTTNQFQVQLESITFPDAFYNRMKTHA
YRVQIKQEKSIWVRNARGEEYNALDVTKLNKDEAIDLSVYEFQKMRFDDESATKWIDDGEPISYDEFARTICKAWKEEKE
KTFHQLQWLEAYASRTVAQGGSETSEYYDVWDETYFSNLLSQGFMAGKSLIEMEAEFASDAETFNAYIEYKKNIPKETKW
SKWMTILDEQISALSTKIRELKNKAYKFISEHPYLTALGFIGVMISAFAMYSFFERTLTDDTITSEVGSSGDNKTQKISK
RVVEVGGSGDVKTTKPAKTAVEVGSSGDSKTMKNKITKVEVGSSGDSKTQKQRNTKVEVGKELEKEAETQGCSDPAAHAL
VLDVLQKNTYCLYYERMVKGEMKRYRLATATFLRGWVCMMPYHFIETLYARKVAPSTNIYFSQPNCDDVIVVPVSHFIAP
NAERVELTTACTRIHYKDETPRDCVLVNLHRRMCHPHRDILKHFVKKSDQGNLRGVFQGTLATFHQSANELCRAYQWLQA
IRPLDQEITIYHEDTDMFDYESESYTQRDCYEYNAPTQTGNCGSIVGLYNKRMERKLIGMHIPGNVSECHGYACPLTQEA
IMDGLNRLEKLDPVNNITVQCCFEPPSDIKDTMSGETPEGKFCAIGKSNIKVGQAVKTTLLKSCIYGMLSKPITKPAHLT
RTRLPNGEIVDPLMKGLKKCGVDTAVLDAEIVESAALDVKQVVLTQYNSMLDVNKYRRFLTYEEATQGTGDDDFMKGIAR
QTSPGYRYFQMPRKLPGKQDWMGSGEQYDFTSQRAQELRRDVEELIDNCAKGIIKDVVFVDTLKDERRPIEKVDAGKTRV
FSAGPQHFVVAFRKYFLPFAAYLMNNRIDNEIAVGTNVYSTDWERIAKRLKKHGNKVIAGDFGNFDGSLVAQFFGQSCGK
SFYPWFKTFNDVNTEDGKRNLMICIGLWTHIVHSVHSYGDNVYMWTHSQPSGNPFTVIINCLYNSMIMRIVWILLARKLA
PEMQSMKKFRENVSMISYGDDNCLNISDRVVEWFNQITISEQMKEIKHEYTDEGKTGDMVKFPSLSEIHFLKKRFVFSHQ
LQRTVAPLQKDVIYEMLNWTRNTIDPNEILMMNINTAFREIVYHGKSEYQKLRSGIEDLAMKGILPQQPQILTFKAYLWD
ATMLADEVYDF
>Q306W6 ~~~~~~Polyprotein P1234~~~
MEKVHVDLDADSPYVKSLQKCFPHFEIEATQVTDNDHANARAFSHLATKLIESEVDPDQVILDIGSAPVRHTHSKHKYHC
ICPMISAEDPDRLHRYADKLRKSDVTDRFIASKAADLLTVMSTPDVETPSLCMHTDSTCRYHGTVAVYQDVYAVHAPTSI
YHQALKGVRTIYWIGFDTTPFMYKNMAGAYPTYNTNWADESVLEARNIGLCSSDLHEQRFGKISIMRKKKLQPTNKVVFS
VGSTIYTEERILLRSWHLPNVFHLKGKTSFTGRCNTIVSCEGYVVKKITISPGIYGKVDNLASTMHREGFLSCKVTDTLR
GERVSFPVCTYVPATLCDQMTGILATDVSVDDAQKLLVGLNQRIVVNGRTQRNTNTMPNYLLPIVAQAFSRWAREYHADL
EDEKDLGVRERSLVMGCCWAFKTHKITSIYKKPGTQTTKKVPAVFNSFVVPQLTSYGLDIELRRRIKMLLEEKKKPAPII
TEADVAHLKGMQEEAEVVAEAEAIRAALPPLLPEVERETVEADIDLIMQEAGAGSVETPRRHIKVTTYPGEEMIGSYAVL
SPQAVLNSEKLACIHPLAEQVLVMTHKGRAGRYKVEPYHGRVIVPSGTAIPIPDFQALSESATIVYNEREFVNRYLHHIA
INGGALNTDEEYYKVLRSGEAESEYVFDIDAKKCVKKAEAGPMCLVGDLVDPPFHEFAYESLKTRPAAPHKVPTIGVYGV
PGSGKSGIIKSAVTKRDLVVSAKKENCTEIIKDVKRMRGMDVAARTVDSVLLNGVKHPVDTLYIDEAFACHAGTLLALIA
IVKPKKVVLCGDPKQCGFFNMMCLKVHFNHEICTEVYHKSISRRCTRTVTAIVSTLFYDKRMRTVNPCSDKIIIDTTSTT
KPLKDDIILTCFRGWVKQLQIDYKNHEIMTAAASQGLTRKGVYAVRYKVNENPLYAQTSEHVNVLLTRTEKRIVWKTLAG
DPWIKTLTAHYPGEFSATLEEWQAEHDAIMKRVLETPANSDVYQNKVHVCWAKALEPVLATANITLTRSQWETIPAFKDD
KAFSPEMALNFLCTRFFGVDIDSGLFSAPTVPLTYTNEHWDNSPGPNRYGLCMRTAKELARRYPCILKAVDTGRVADVRT
NTIRDYNPMINVVPLNRRLPHSLVVSHRYTGDGNYSQLLSKLTGKTILVIGTPINVPGKRVETLGPGPQCTYKADLDLGI
PSMIGKYDIIFVNVRTPYKHHHYQQCEDHAIHHSMLTRKAVDHLNKGGTCVALGYGTADRATENIISAVARSFRFSRVCQ
PKCAWENTEVAFVFFGKDNGNHLRDQDQLSVVLNNIYQGSTQYEAGRAPAYRVIRGDISKSTDEVIVNAANNKGQPGAGV
CGALYKKWPGAFDKAPIATGTAHLVKHTPNIIHAVGPNFSRMSEVEGNQKLSEVYMDIAKIINKERYNKVSIPLLSTGVY
AGGKDRVMQSLNHLFTAMDTTDADVTIYCLDKQWETRIKDAIARKESVEELVEDDKPVDIELVRVHPQSSLVGRPGYSTN
EGKVHSYLEGTRFHQTAKDIAEIYAMWPNKQEANEQICLYVLGESMTSIRSKCPVEESEASSPPHTIPCLCNYAMTAERV
YRLRMAKNEQFAVCSSFQLPKYRITGVQKIQCNKPVIFSGVVPPAIHPRKFSTVEETQPTTIAERLIPRRPAPPVPVPAR
IPSPRCSPAVSMQSLGGSSTSEVIISEAEVHDSDSDCSIPPMPFVVEAEVHASQGSQWSIPSASGFEIREPLDDLGSITR
TPAISDHSVDLITFDSVTDIFENFKQAPFQFLSDIRPIPAPRRRREPETDIQRFDKSEEKPVPKPRTRTAKYKKPPGVAR
SISEAELDEFIRRHSNXRYEAGAYIFSSETGQGHLQQKSTRQCKLQHPILERSVHEKFYAPRLDLEREKLLQRKLQLCAS
EGNRSRYQSRKVENMKAITAERLLQGIGAYLSAEPQPVECYKINYPVPVYSTTRSNRFSSADVAVRVCNLVMQENFPTVA
SYTITDEYDAYLDMVDGASCCLDTATFCPAKLRSFPKKHSYLKPEIRSAVPSPIQNTLQNVLAAATKRNCNVTQMRELPV
LDSAAFNVECFKKYACNDEYWETFKNNPIRLTTENVTQYVTKLKGPKAAALFAKTHNLQPLHEIPMDRFVMDLKRDVKVT
PGTKHTEERPKVQVIQAADPLATAYLCGIHRELVRRLNAVLLPNIHTLFDMSAEDFDAIIAEHFQFGDAVLETDIASFDK
SEDDAIAMSALMILEDLGVDQALLDLIEAAFGNITSVHLPTGTRFKFGAMMKSGMFLTLFINTVVNIMIASRVLRERLTN
SPCAAFIGDDNIVKGVKSDALMAERCATWLNMEVKIIDAVVGVKAPYFCGGFIVVDQITGTACRVADPLKRLFKLGKPLP
LDDDQDGDRRRALYDEALRWNRIGITDELVKAVESRYDVLYISLVITALSTLAATVSNFKHIRGNPITLYG
>Q4QXJ8 ~~~~~~Polyprotein P1234~~~
MEKVHVDLDADSPFVKSLQRCFPHFEIEATQVTDNDHANARAFSHLATKLIEGEVDTDQVILDIGSAPVRHTHSKHKYHC
ICPMKSAEDPDRLYRYADKLRKSDVTDKCIASKAADLLTVMSTPDAETPSLCMHTDSTCRYHGSVAVYQDVYAVHAPTSI
YYQALKGVRTIYWIGFDTTPFMYKNMAGAYPTYNTNWADESVLEARNIGLGSSDLHEKSFGKVSIMRKKKLQPTNKVIFS
VGSTIYTEERILLRSWHLPNVFHLKGKTSFTGRCNTIVSCEGYVVKKITLSPGIYGKVDNLASTMHREGFLSCKVTDTLR
GERVSFPVCTYVPATLCDQMTGILATDVSVDDAQKLLVGLNQRIVVNGRTQRNTNTMQNYLLPVVAQAFSRWAREHRADL
EDEKGLGVRERSLVMGCCWAFKTHKITSIYKRPGTQTIKKVPAVFNSFVIPQPTSYGLDIGLRRRIKMLFDAKKAPAPII
TEADVAHLKGLQDEAEAVAEAEAVRAALPPLLPEVDKETVEADIDLIMQEAGAGSVETPRRHIKVTTYPGEEMIGSYAVL
SPQAVLNSEKLACIHPLAEQVLVMTHKGRAGRYKVEPYHGRVIVPSGTAIPILDFQALSESATIVFNEREFVNRYLHHIA
VNGGALNTDEEYYKVVKSTETDSEYVFDIDAKKCVKKGDAGPMCLVGELVDPPFHEFAYESLKTRPAAPHKVPTIGVYGV
PGSGKSGIIKSAVTKRDLVVSAKKENCMEIIKDVKRMRGMDIAARTVDSVLLNGVKHSVDTLYIDEAFACHAGTLLALIA
IVKPKKVVLCGDPKQCGFFNMMCLKVHFNHEICTEVYHKSISRRCTKTVTSIVSTLFYDKRMRTVNPCNDKIIIDTTSTT
KPLKDDIILTCFRGWVKQLQIDYKNHEIMTAAASQGLTRKGVYAVRYKVNENPLYAQTSEHVNVLLTRTEKRIVWKTLAG
DPWIKTLTASYPGNFTATLEEWQAEHDAIMAKILETPASSDVFQNKVNVCWAKALEPVLATANITLTRSQWETIPAFKDD
KAYSPEMALNFFCTRFFGVDIDSGLFSAPTVPLTYTNEHWDNSPGPNMYGLCMRTAKELARRYPCILKAVDTGRVADVRT
DTIKDYNPLINVVPLNRRLPHSLVVTHRYTGNGDYSQLVTKMTGKTVLVVGTPMNIPGKRVETLGPSPQCTYKAELDLGI
PAALGKYDIIFINVRTPYRHHHYQQCEDHAIHHSMLTRKAVDHLNKGGTCIALGYGTADRATENIISAVARSFRFSRVCQ
PKCAWENTEVAFVFFGKDNGNHLQDQDRLSVVLNNIYQGSTQHEAGRAPAYRVVRGDITKSNDEVIVNAANNKGQPGSGV
CGALYRKWPGAFDKQPVATGKAHLVKHSPNVIHAVGPNFSRLSENEGDQKLSEVYMDIARIINNERFTKVSIPLLSTGIY
AGGKDRVMQSLNHLFTAMDTTDADITIYCLDKQWESRIKEAITRKESVEELTEDDRPVDIELVRVHPLSSLAGRPGYSTT
EGKVYSYLEGTRFHQTAKDIAEIYAMWPNKQEANEQICLYVLGESMNSIRSKCPVEESEASSPPHTIPCLCNYAMTAERV
YRLRMAKNEQFAVCSSFQLPKYRITGVQKIQCSKPVIFSGTVPPAIHPRKFASVTVEDTPVVQPERLVPRRPAPPVPVPA
RIPSPPCTSTNGSTTSIQSLGEDQSASASSGAEISVDQVSLWSIPSATGFDVRTSSSLSLEQPTFPTMVVEAEIHASQGS
LWSIPSITGSETRAPSPPSQDSRPSTPSASGSHTSVDLITFDSVAEILEDFSRSPFQFLSEIKPIPAPRTRVNNMSRSAD
TIKPIPKPRKCQVKYTQPPGVARVISAAEFDEFVRRHSNXRYEAGAYIFSSETGQGHLQQKSTRQCKLQYPILERSVHEK
FYAPRLDLEREKLLQKKLQLCASEGNRSRYQSRKVENMKAITVERLLQGIGSYLSAEPQPVECYKVTYPAPMYSSTASNS
FSSAEVAVKVCNLVLQENFPTVASYNITDEYDAYLDMVDGASCCLDTATFCPAKLRSFPKKHSYLRPEIRSAVPSPIQNT
LQNVLAAATKRNCNVTQMRELPVLDSAAFNVECFKKYACNDEYWDFYKTNPIRLTAENVTQYVTKLKGPKAAALFAKTHN
LQPLHEIPMDRFVMDLKRDVKVTPGTKHTEERPKVQVIQAADPLATAYLCGIHRELVRRLNAVLLPNIHTLFDMSAEDFD
AIIAEHFQFGDAVLETDIASFDKSEDDAIAMSALMILEDLGVDQALLNLIEAAFGNITSVHLPTGTRFKFGAMMKSGMFL
TLFINTVVNIMIASRVLRERLTTSPCAAFIGDDNIVKGVTSDALMAERCATWLNMEVKIIDAVVGVKAPYFCGGFIVVDQ
ITGTACRVADPLKRLFKLGKPLPLDDDQDVDRRRALHDEAARWNRIGITEELVKAVESRYEVNYVSLIITALTTLASSVS
NFKHIRGHPITLYG
>P36328 ~~~~~~Polyprotein P1234~~~
MEKVHVDIEEDSPFLRALQRSFPQFEVEAKQVTDNDHANARAFSHLASKLIETEVDPSDTILDIGSAPARRMYSKHKYHC
ICPMRCAEDPDRLYKYATKLKKNCKEITDKELDKKMKELAAVMSDPDLETETMCLHDDESCRYEGQVAVYQDVYAVDGPT
SLYHQANKGVRVAYWIGFDTTPFMFKNLAGAYPSYSTNWADETVLTARNIGLCSSDVMERSRRGMSILRKKYLKPSNNVL
FSVGSTIYHEKRDLLRSWHLPSVFHLRGKQNYTCRCETIVSCDGYVVKRIAISPGLYGKPSGYAATMHREGFLCCKVTDT
LNGERVSFPVCTYVPATLCDQMTGILATDVSADDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPVVAQAFARWAKEYKE
DQEDERPLGLRDRQLVMGCCWAFRRHKITSIYKRPDTQTIIKVNSDFHSFVLPRIGSNTLEIGLRTRIRKMLEEHKEPSP
LITAEDIQEAKCAADEAKEVREAEELRAALPPLAADFEEPTLEADVDLMLQEAGAGSVETPRGLIKVTSYAGEDKIGSYA
VLSPQAVLKSEKLSCIHPLAEQVIVITHSGRKGRYAVEPYHGKVVVPEGHAIPVQDFQALSESATIVYNEREFVNRYLHH
IATHGGALNTDEEYYKTVKPSEHDGEYLYDIDRKQCVKKELVTGLGLTGELVDPPFHEFAYESLRTRPAAPYQVPTIGVY
GVPGSGKSGIIKSAVTKKDLVVSAKKENCAEIIRDVKKMKGLDVNARTVDSVLLNGCKHPVETLYIDEAFACHAGTLRAL
IAIIRPKKAVLCGDPKQCGFFNMMCLKVHFNHEICTQVFHKSISRRCTKSVTSVVSTLFYDKRMRTTNPKETKIVIDTTG
STKPKQDDLILTCFRGWVKQLQIDYKGNEIMTAAASQGLTRKGVYAVRYKVNENPLYAPTSEHVNVLLTRTEDRIVWKTL
AGDPWIKILTAKYPGNFTATIEEWQAEHDAIMRHILERPDPTDVFQNKANVCWAKALVPVLKTAGIDMTTEQWNTVDYFE
TDKAHSAEIVLNQLCVRFFGLDLDSGLFSAPTVPLSIRNNHWDNSPSPNMYGLNKEVVRQLSRRYPQLPRAVATGRVYDM
NTGTLRNYDPRINLVPVNRRLPHALVLHHNEHPQSDFSSFVSKLKGRTVLVVGEKLSVPGKKVDWLSDQPEATFRARLDL
GIPGDVPKYDIVFINVRTPYKYHHYQQCEDHAIKLSMLTKKACLHLNPGGTCVSIGYGYADRASESIIGAIARQFKFSRV
CKPKSSHEETEVLFVFIGYDRKARTHNPYKLSSTLTNIYTGSRLHEAGCAPSYHVVRGDIATATEGVIINAANSKGQPGG
GVCGALYKKFPESFDLQPIEVGKARLVKGAAKHIIHAVGPNFNKVSEVEGDKQLAEAYESIAKIVNDNNYKSVAIPLLST
GIFSGNKDRLTQSLNHLLTALDTTDADVAIYCRDKKWEMTLKEAVARREAVEEICISDDSSVTEPDAELVRVHPKSSLAG
RKGYSTSDGKTFSYLEGTKFHQAAKDIAEINAMWPVATEANEQVCMYILGESMSSIRSKCPVEESEASTPPSTLPCLCIH
AMTPERVQRLKASRPEQITVCSSFPLPKYRITGVQKIQCSQPILFSPKVPAYIHPRKYLVETPPVEETPESPAENQSTEG
TPEQPALVNVDATRTRMPEPIIIEEEEEDSISLLSDGPTHQVLQVEADIHGSPSVSSSSWSIPHASDFDVDSLSILDTLD
GASVTSGAVSAETNSYFARSMEFRARPVPAPRTVFRNPPHPAPRTRTPPLAHSRASSRTSLVSTPPGVNRVITREELEAL
TPSRAPSRSASRTSLVSNPPGVNRVITREEFEAFVAQQQXRFDAGAYIFSSDTGQGHLQQKSVRQTVLSEVVLERTELEI
SYAPRLDQEKEELLRKKLQLNPTPANRSRYQSRRVENMKAITARRILQGLGHYLKAEGKVECYRTLHPVPLYSSSVNRAF
SSPKVAVEACNAMLKENFPTVASYCIIPEYDAYLDMVDGASCCLDTASFCPAKLRSFPKKHSYLEPTIRSAVPSAIQNTL
QNVLAAATKRNCNVTQMRELPVLDSAAFNVECFKKYACNNEYWETFKENPIRLTEENVVNYITKLKGPKAAALFAKTHNL
NMLQDIPMDRFVMDLKRDVKVTPGTKHTEERPKVQVIQAADPLATADLCGIHRELVRRLNAVLLPNIHTLFDMSAEDFDA
IIAEHFQPGDCVLETDIASFDKSEDDAMALTALMILEDLGVDAELLTLIEAAFGEISSIHLPTKTKFKFGAMMKSGMFLT
LFVNTVINIVIASRVLRERLTGSPCAAFIGDDNIVKGVKSDKLMADRCATWLNMEVKIIDAVVGEKAPYFCGGFILCDSV
TGTACRVADPLKRLFKLGKPLAVDDEHDDDRRRALHEESTRWNRVGILPELCKAVESRYETVGTSIIVMAMTTLASSVKS
FSYLRGAPITLYG
>P27282 ~~~~~~Polyprotein P1234~~~
MEKVHVDIEEDSPFLRALQRSFPQFEVEAKQVTDNDHANARAFSHLASKLIETEVDPSDTILDIGSAPARRMYSKHKYHC
ICPMRCAEDPDRLYKYATKLKKNCKEITDKELDKKMKELAAVMSDPDLETETMCLHDDESCRYEGQVAVYQDVYAVDGPT
SLYHQANKGVRVAYWIGFDTTPFMFKNLAGAYPSYSTNWADETVLTARNIGLCSSDVMERSRRGMSILRKKYLKPSNNVL
FSVGSTIYHEKRDLLRSWHLPSVFHLRGKQNYTCRCETIVSCDGYVVKRIAISPGLYGKPSGYAATMHREGFLCCKVTDT
LNGERVSFPVCTYVPATLCDQMTGILATDVSADDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPVVAQAFARWAKEYKE
DQEDERPLGLRDRQLVMGCCWAFRRHKITSIYKRPDTQTIIKVNSDFHSFVLPRIGSNTLEIGLRTRIRKMLEEHKEPSP
LITAEDVQEAKCAADEAKEVREAEELRAALPPLAADVEEPTLEADVDLMLQEAGAGSVETPRGLIKVTSYAGEDKIGSYA
VLSPQAVLKSEKLSCIHPLAEQVIVITHSGRKGRYAVEPYHGKVVVPEGHAIPVQDFQALSESATIVYNEREFVNRYLHH
IATHGGALNTDEEYYKTVKPSEHDGEYLYDIDRKQCVKKELVTGLGLTGELVDPPFHEFAYESLRTRPAAPYQVPTIGVY
GVPGSGKSGIIKSAVTKKDLVVSAKKENCAEIIRDVKKMKGLDVNARTVDSVLLNGCKHPVETLYIDEAFACHAGTLRAL
IAIIRPKKAVLCGDPKQCGFFNMMCLKVHFNHEICTQVFHKSISRRCTKSVTSVVSTLFYDKKMRTTNPKETKIVIDTTG
STKPKQDDLILTCFRGWVKQLQIDYKGNEIMTAAASQGLTRKGVYAVRYKVNENPLYAPTSEHVNVLLTRTEDRIVWKTL
AGDPWIKTLTAKYPGNFTATIEEWQAEHDAIMRHILERPDPTDVFQNKANVCWAKALVPVLKTAGIDMTTEQWNTVDYFE
TDKAHSAEIVLNQLCVRFFGLDLDSGLFSAPTVPLSIRNNHWDNSPSPNMYGLNKEVVRQLSRRYPQLPRAVATGRVYDM
NTGTLRNYDPRINLVPVNRRLPHALVLHHNEHPQSDFSSFVSKLKGRTVLVVGEKLSVPGKMVDWLSDRPEATFRARLDL
GIPGDVPKYDIIFVNVRTPYKYHHYQQCEDHAIKLSMLTKKACLHLNPGGTCVSIGYGYADRASESIIGAIARQFKFSRV
CKPKSSLEETEVLFVFIGYDRKARTHNPYKLSSTLTNIYTGSRLHEAGCAPSYHVVRGDIATATEGVIINAANSKGQPGG
GVCGALYKKFPESFDLQPIEVGKARLVKGAAKHIIHAVGPNFNKVSEVEGDKQLAEAYESIAKIVNDNNYKSVAIPLLST
GIFSGNKDRLTQSLNHLLTALDTTDADVAIYCRDKKWEMTLKEAVARREAVEEICISDDSSVTEPDAELVRVHPKSSLAG
RKGYSTSDGKTFSYLEGTKFHQAAKDIAEINAMWPVATEANEQVCMYILGESMSSIRSKCPVEESEASTPPSTLPCLCIH
AMTPERVQRLKASRPEQITVCSSFPLPKYRITGVQKIQCSQPILFSPKVPAYIHPRKYLVETPPVDETPEPSAENQSTEG
TPEQPPLITEDETRTRTPEPIIIEEEEEDSISLLSDGPTHQVLQVEADIHGPPSVSSSSWSIPHASDFDVDSLSILDTLE
GASVTSGATSAETNSYFAKSMEFLARPVPAPRTVFRNPPHPAPRTRTPSLAPSRACSRTSLVSTPPGVNRVITREELEAL
TPSRTPSRSVSRTSLVSNPPGVNRVITREEFEAFVAQQQXRFDAGAYIFSSDTGQGHLQQKSVRQTVLSEVVLERTELEI
SYAPRLDQEKEELLRKKLQLNPTPANRSRYQSRKVENMKAITARRILQGLGHYLKAEGKVECYRTLHPVPLYSSSVNRAF
SSPKVAVEACNAMLKENFPTVASYCIIPEYDAYLDMVDGASCCLDTASFCPAKLRSFPKKHSYLEPTIRSAVPSAIQNTL
QNVLAAATKRNCNVTQMRELPVLDSAAFNVECFKKYACNNEYWETFKENPIRLTEENVVNYITKLKGPKAAALFAKTHNL
NMLQDIPMDRFVMDLKRDVKVTPGTKHTEERPKVQVIQAADPLATAYLCGIHRELVRRLNAVLLPNIHTLFDMSAEDFDA
IIAEHFQPGDCVLETDIASFDKSEDDAMALTALMILEDLGVDAELLTLIEAAFGEISSIHLPTKTKFKFGAMMKSGMFLT
LFVNTVINIVIASRVLRERLTGSPCAAFIGDDNIVKGVKSDKLMADRCATWLNMEVKIIDAVVGEKAPYFCGGFILCDSV
TGTACRVADPLKRLFKLGKPLAADDEHDDDRRRALHEESTRWNRVGILSELCKAVESRYETVGTSIIVMAMTTLASSVKS
FSYLRGAPITLYG
>Q6UP17 ~~~~~~Non-structural polyprotein~~~
MMTTQTNQLFNRSLNDELGNRTDTITLHPWNVSEYVTSTGLAEAEFEACTGKDLEIQKYLVTVSEECPCGHPTTFSEIDM
SRGMVMASSDNIYSDDIVTLVPTENGYEDSDPCFCENQKEDCDNCMYLQILRMAPGKWNMTYGLRHNHQAQLTYYLTEER
GFTCEGYTLEADTSDSEGFGDTADYTLYCIQYSGKYKQRRQWHVLLDDESCGFCKYRVTTGLIVPEVPRHWTNELGRRQN
LLVPKVLGQYVDFIEAKSPLRLDWLSMSKLVSGKCSLPNFYVNLTTLRTFQVAGGKFLPYLYNGSAANNDLKLPIQVTAQ
GDEDTPAGELSIEQDTHENTTLAESTDASTAYVATEEFSMMPWITDGPHTYPDLTERWTKAFQFQWTTSQAQGEIIQRFD
LPIEAIQNFINSPNALPWRQHAFYKSDIELKVQVNSQPGQSGYLILGAMYEASEGTAIGNRVDHAANIVAMPHMRISAGS
SNSGDMVIPFIRHFPVGCILNNAFDVPQYFVTLFVAPLLQLRTGADGPQVVDVTIMIRFPNCEFYGQRTTEQIVTAQGWA
PDLTQDGDVESNPGPFLSGLLGTVAAVGKTVAGAGSSIGSIATGVSGVANGIGSIGSAAGKVIGGVESLLRPLFPKKDMD
RPQNIIEPTNFYLQQNTSLSLATGTNNVKLLQLQAENSVSHPPGFVPVDDQFNNRFILSVFGLSDYFQWFSGSASGTLLY
SFDVTPLKSFTRGIPLQPEFYLTPMAALSGQYGGYHGDMEMRLTFAVSKFHSGRVFIVYSPDVVPTFDNIGAYYNVLLDV
QDQSVYTFKIPYQVPTPYAPIFESLQGDTGTFLTPAVSAGNVRCMAYGFVSIFVENQLRVMQTAAPTIDVLVELRGADNF
HLVLPAGGKFRSLSTITTTEAPVVTAMGDERREPHTVNPPPRRIMPVWAAQLNESYDCRDVVKRYHDWFDIVSPAVVAGR
GFTDMYMNVTVFHVPVPAFDAAPIQNVQFTDMLVGSSILNKSMVLARERINLATNNTISIRVPWTNYASMISNSINPASG
RSNTMAPYSNGRVVVYIEYLSSYTTTLGAFRVVWDVTAGSASTRPDTYTPDMLTLLHDGFRFAKGDFNYQMDFTPVPSGA
NTAIRMRCFRAYGDGGNLYVFQGFPKMLGTYTPRQAISGVTPTRGGLQRQNIIGGGQRDLTQDGDIESNPGPTQSKPTGA
QPPPDDILTDEDREGLEGGSIRISFFEKLGDYCKRALGWSGERIVEFIRELRSIKKHVGSIPGMTRISELITTIKELSVV
TNGLNRMSAAVEELNEQIKKTREKVETFVGGVGGKLNSLDTGDLVSNGIEYVAYILNIYNSKSVGMTLINVAALLSKMGL
GRYLVNDLVDRLGTFAKGEDTEEIEREYTSLIITGVLGAFGLGTMNVEKEGFVKPFLSNVKDFFRNGFAVKKFLDSHFKC
INDICGWVRSKIFGKVDKGGLTVDLLVWCERVQVLAEVYNYDTILNDPEFAETLLSLQDEAFEFDRLFIASRIRPPNQYS
MYRTKLQRAIDLLGQQGTMQKSKPVPFCLWVYGHSGCGKSHVCDNVLTEIGSALGINTANPIYTRSPDVEFWNGYTGQKL
ISWQDFAKITTGETYRKQVSELSSLIEPTPFNPPFAALEDKRKIADAWAVYVSSNKAFPEVQNMGMEDSALFYRRRHALV
KMRINPEIIQEYARREPPISLEDKYKGQTVYYPSNIPSEDFATRGPYFHVQFAFHVTSLSTAEPTEWLGYDGFIAEVTRR
AIDHRQREVDACIQRRGIYSKLRRSAPTGVEFREEVRTLKEELDRLTQERVAEAHGSDNAVKNLATTMDYEKRATFEETE
ALGSVILQYRLDPNCDCTNQAIMKAATSNEPRKICEVSFCEKCEPVNQRKKNVYKAMSEQLEGVAVSNPFGGIAMIERGD
SARIPRAGRQYTPLDYARQATIGTLKWIQGKGLPAGYAYWSEIMPVCCHKKEILELAYFENVGDNIELVLDLSSMKARHG
VSFPLFHKLEWFDLRKLDAWATTGFKVILPVVCEQSKFNWNNCIEVKPGERGRIPITPKWAPELIGRVTLEEFKRSWSCD
AVQKPSVSYYKIEKDLIRFLVSGTYVQIEGVENMTAGLNHLMSLHTKDTVAFCNSFFYWKSGGKLRAAKFYPFVNGMLKD
MSFIFSNIMIAVGAVMLAYKVYRLIRNSSECVEAQAKDYDQKTEAPKIPKVNFVQSTPVVVAAKGDEELSIIHNSLCQLR
KGSMRLLGVRLCSNFVLAPQHLAWVPGEFGISLHSSGAWSTELIFEATSVKYSCVKGFDYAVYRFVTLPAGRNIVNYFTT
RAQGATLKSEATLMTLSGASLQLRPVRISQYTGALTYDNASFPGTGTSMIGWQYWLGAQSVRCGSLIMSNNLLCGFHVAQ
NLKTGDAYAVSICKEMLVEALRILGATPFDMKMKRTTPITTVGEPKMQPPETAVVIPLGKAIEPVRYSGKSNLEKSIMHG
ELAEPFRIPAAQSATAKGECIGYDIVMKGCEKQFKPPKPIDPEEVEKIGDYLIERLVPQSIPLIPTISEPLSLGEAIAGI
DGIPLMCGMKLNTSIGWPLCNEYPKGTKKSNIIRVDREEGEVHVDVVAFDDYAKANTLRKQAILPPTTFMDFPKDELLKP
GKDTRLINGAPLHHTLDMRRYLMEFFAAITTINNKIAVGIDVHSGDWALIHGGADDVVDEDYSGFGPGFHSQWLTVVCPI
AVAWCKHHKKVDKEYEDVVNCLIMELQNAYHVAGDLVYQVLCGSPSGAFATDRINSLANLCYHCLCHLRKYGTLTGFWSH
YTLVYGDDTRRRETAYTGDEFQECMASVGIVVNRDKSGVTSFLKRQFIPIDHRDVRVMLAPLPRPIVEDILNWVRKPYAS
KLSALEETVGSYLSEIFHHGQDEFNNSRSKIQAILARHGSHPELPTFDDLFRQKYLSNGVWPVLMPLANALGPIPAAESE
PTHKVTSGQVVCVPHEAGECCAITLGG
>Q5Y389 ~~~~~~Polyprotein P1234~~~
MKVTVDVEADSPFLKALQKAFPAFEVESQQVTPNDHANARAFSHLATKLIEQEVPTGVTILDVGSAPARRLMSDHTYHCI
CPMKSAEDPERLANYARKLAKASGTVLDKNVSGKITDLQDVMATPDLESPTFCLHTDETCRTRAEVAVYQDVYAVHAPTS
LYHQAIKGVRTAYWIGFDTTPFMFEALAGAYPAYSTNWADEQVLQARNIGLCATGLSEGRRGKLSIMRKKCLRPSDRVMF
SVGSTLYTESRKLLRSWHLPSVFHLKGKNSFTCRCDTVVSCEGYVVKKITISPGIYGKTVDYAVTHHAEGFLMCKITDTV
RGERVSFPVCTYVPATICDQMTGILATDVTPEDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPVVAQAFSKWAREARAD
MEDEKPLGTRERTLTCCCLWAFKSHKIHTMYKRPETQTIVKVPSTFDSFVIPSLWSSSLSMGIRQRIKLLLSARMAQGLP
YSGDRTEARAAEEEEKEVQEAELTRAALPPLVSGSCADDIAQVDVEELTFRAGAGVVETPRNALKVTPQAHDHLIGSYLI
LSPQTVLKSEKLAPIHPLAEQVTVMTHSGRSGRYPVDKYDGRVLIPTGAAIPVSEFQALSESATMVYNEREFINRKLHHI
ALYGPALNTDEESYEKVRAERAETEYVFDVDKKACIKKEEASGLVLTGDLINPPFHEFAYEGLKIRPAAPYHTTIIGVFG
VPGSGKSAIIKNMVTTRDLVASGKKENCQEIMNDVKRQRGLDVTARTVDSILLNGCKKGVENLYVDEAFACHSGTLLALI
ALVRPSGKVVLCGDPKQCGFFNLMQLKVHYNHNICTRVLHKSISRRCTLPVTAIVSTLHYQGKMRTTNRCNTPIQIDTTG
SSKPASGDIVLTCFRGWVKQLQIDYRGHEVMTAAASQGLTRKGVYAVRQKVNENPLYSPLSEHVNVLLTRTENRLVWKTL
SGDPWIKVLTNVPRGDFSATLEEWHEEHDGIMRVLNERPAEVDPFQNKAKVCWAKCLVQVLETAGIRMTADEWNTILAFR
EDRAYSPEVALNEICTRYYGVDLDSGLFSAQSVSLFYENNHWDNRPGGRMYGFNHEVARKYAARFPFLRGNMNSGLQLNV
PERKLQPFSAECNIVPSNRRLPHALVTSYQQCRGERVEWLLKKIPGHQMLLVSEYNLVIPHKRVFWIAPPRVSGADRTYD
LDLGLPMDAGRYDLVFVNIHTEYRQHHYQQCVDHSMRLQMLGGDSLHLLRPGGSLLMRAYGYADRVSEMVVTALARKFSA
FRVLRPACVTSNTEVFLLFSNFDNGRRAVTLHQANQKLSSMYACNGLHTAGCAPSYRVRRADISGHSEEAVVNAANAKGT
VSDGVCRAVAKKWPSSFKGAATPVGTAKMIRADGMTVIHAVGPNFSTVTEAEGDRELAAAYRAVASIISTNNIKSVAVPL
LSTGTFSGGKDRVMQSLNHLFTALDATDADVVIYCRDKNWEKKIQEAIDRRTAIELVSEDVTLETDLVRVHPDSCLVGRN
GYSATDGKLYSYLEGTRFHQTAVDMAEISTLWPRLQDANEQICLYALGETMDSIRTKCPVEDADSSTPPKTVPCLCRYAM
TAERVARLRMNNTKNIIVCSSFPLPKYRIEGVQKVKCDRVLIFDQTVPSLVSPRKYIQQPPEQLDNVSLTSTTSTGSAWS
FPSETTYETMEVVAEVHTEPPIPPPRRRRAAVAQLRQDLEVTEEIEPYVTQQAEIMVMERVATTDIRAIPVPARRAITMP
VPAPRVRKVATEPPLEPEAPIPAPRKRRTTSTSPPHNPEDFVPRVPVELPWEPEDLDIQFGDLEPRRRNTRDRDVSTGIQ
FGDIDFNQSXLGRAGAYIFSSDTGPGHLQQKSVRQHELPCETLYAHEDERIYPPAFDGEKEKVLQAKMQMAPTEANKSRY
QSRKVENMKALIVERLREGAKLYLHEQTDKVPTYTSKYPRPVYSPSVDDSLSDPEVAVAACNSFLEENYPTVANYQITDE
YDAYLDLVDGSESCLDRATFCPAKLRCYPKHHAYHQPQIRSAVPSPFQNTLQNVLAAATKRNCNVTQMRELPTMDSAVFN
VESFKKYACTGEYWQEFKDNPIRITTENITTYVAKLKGPKAAALFAKTHNLVPLQEVPMDRFVMDMKRDVKVTPGTKHTE
ERPKVQVIQAAEPLATAYLCGIHRELVRRLKAVLTPNIHTLFDMSAEDFDAIIAAHFQPGDAVLETDIASFDKSQDDSLA
LTALMLLEDLGVDQELLDLIEAAFGEITSVHLPTGTRFKFGAMMKSGMFLTLFINTLLNIVIACRVLRDKLSSSACAAFI
GDDNIVHGVRSDPLMAERCASWVNMEVKIIDATMCEKPPYFCGGFILYDSVAGTACRVADPLKRLFKLGKPLPADDNQDE
DRRRALKDETVKWSRIGLREELDVALSSRYQVSGVGNITRAMSTLSKSLKSFRKIRGPIIHLYGGPK
>P29324 ~~~ORF1~~~Non-structural polyprotein pORF1~~~
MEAHQFIKAPGITTAIEQAALAAANSALANAVVVRPFLSHQQIEILINLMQPRQLVFRPEVFWNHPIQRVIHNELELYCR
ARSGRCLEIGAHPRSINDNPNVVHRCFLRPVGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCLDGFSGCNFPAETG
IALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWI
RTTKVTGDHPLVIERVRAIGCHFVLLLTAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIW
DRLMLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGWNASEDALTAVITAAYLTICHQRYLRTQAISKGMRRLER
EHAQKFITRLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFDESAPCHCRTAIRKALSKFCCFMKWLGQE
CTCFLQPAEGAVGDQGHDNEAYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPAEIVARAGRLTATVKVSQVDGR
IDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDASQSTMAAGPFSLTYAASAAGLEVRYVAAGLDHRAVFAPGVSPRS
APGEVTAFCSALYRFNREAQRHSLIGNLWFHPEGLIGLFAPFSPGHVWESANPFCGESTLYTRTWSEVDAVSSPARPDLG
FMSEPSIPSRAATPTLAAPLPPPAPDPSPPPSAPALAEPASGATAGAPAITHQTARHRRLLFTYPDGSKVFAGSLFESTC
TWLVNASNVDHRPGGGLCHAFYQRYPASFDAASFVMRDGAAAYTLTPRPIIHAVAPDYRLEHNPKRLEAAYRETCSRLGT
AAYPLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFEANRPTRPTLTITEDVARTANLAIELDSATDVGRACAG
CRVTPGVVQYQFTAGVPGSGKSRSITQADVDVVVVPTRELRNAWRRRGFAAFTPHTAARVTQGRRVVIDEAPSLPPHLLL
LHMQRAATVHLLGDPNQIPAIDFEHAGLVPAIRPDLGPTSWWHVTHRWPADVCELIRGAYPMIQTTSRVLRSLFWGEPAV
GQKLVFTQAAKPANPGSVTVHEAQGATYTETTIIATADARGLIQSSRAHAIVALTRHTEKCVIIDAPGLLREVGISDAIV
NNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRPVPVAAVLPPCPELEQGLLYLPQELTTC
DSVVTFELTDIVHCRMAAPSQRKAVLSTLVGRYGGRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVEAMVEKGQ
DGSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFD
DTVFSAAVAAAKASMVFENDFSEFDSTQNNFSLGLECAIMEECGMPQWLIRLYHLIRSAWILQAPKESLRGFWKKHSGEP
GTLLWNTVWNMAVITHCYDFRDFQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPIGLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERAEQLRLAVSDFLRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVADGKAHFTESVKP
VLDLTNSILCRVE
>Q81862 ~~~ORF1~~~Non-structural polyprotein pORF1~~~
MEAHQFIKAPGITTAIEQAALAAANSALANAVVVRPFLSHQQIEILINLMQPRQLVFRPEVFWNHPIQRVIHNELELYCR
ARSGRCLEIGAHPRSINDNPNVVHRCFLRPAGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCFDGFSGCNFPAETG
VALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWI
RTTKVTGDHPLVIERVRAIGCHFVLLLTAAPEPSPTPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIW
DRLMLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGWNASEVALTAVITAAYLTICHQRYLRTQAISKGMRRLER
EHAQKFITRLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFDESAPCHCRTAIRKAVSKFCCFMKWLGQE
CTCFLQPAEGAVGDQGHDNEAYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPAEIVARAGRLTATVKVSQVDGR
IDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDASQSTMAAGPFSLTYAASAAGLEVRYVAAGLDHRAVFAPGVSPRS
APGEVTAFCSALYRFNREAQRLSLTGNFWFHPEGLLGPFAPFSPGHVWESANPFCGESTLYTRTWSEVDAVSSPAQPDLG
FISEPSIPSRAATLTPAAPLPPPAPDPSPTPSAPARGEPAPGATARAPAITHQAARHRRLLFTYPDGSKVFAGSLFESTC
TWLVNASNVDHRPGGGLCHAFYQRYPASFDAASFVMRDGAAAYTLTPRPIIHAVAPDYRLEHNPKMLEAAYRETCSRLGT
AAYPLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFEANRPTCPTLTITEDVARTANLAIELDSATDVGRACAG
CRVTPGVVQYQFTAGVPGSGKSRSITQADVDVVVVPTRELRNAWRRRGFAAFTPHTAARVTQGRRVVIDEAPSLPPHLLL
LHMQRAATVHLLGDPNQIPAIDFEHAGLVPAIRPDLAPTSWWHVTHRCPADVCELIRGAYPMIQTTSRVLRSLFWGEPAV
GQKLVFTQAAKAANPGSVTVHEAQGATYTETTIIATADARGLIQSSRAHAIVALTRHTEKCVIIDAPGLLREVGISDAIV
NNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRPAPVAAVLPPCPELEQGLLYLPQELTTC
DSVVTFELTDIVHCRMAAPSQRKAVLSTLVGHYGRRTKLYNASHSDVRDSLARFIPAIGHVQVTTCELYELVEAMVEKGQ
DGSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFD
DTVFSAAVAAARASMVFENDFSEFDSTQNNFSLGLECAIMVECGMPQWLIRLYHLIRSAWILQAPKESLRGFWKKHSGEP
GTLLWNTVWNMAVITHCYDFRDLQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPIGLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERAKQLRLAVSDFLRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVADGKAHFTESVKP
VLDLTNSILCRVE
>Q9WC28 ~~~ORF1~~~Non-structural polyprotein pORF1~~~
MEAHQFLKAPGITTAVEQAALATANSALANAVVVRPFLSHQQIEILINLMQPRQLVFRPEVFWNQPIQRVIHNELELYCR
ARSGRCLEIGAHPRSINDNPNVVHRCFLRPVGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCFDGFSGCSCPAETG
IALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWI
RTTKVTGDHPLVIERVRAIGCHFVLLLTAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIW
DRLMLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGRNASEDALTAVITAAYLTICHQRYLRTQAISKGIRRLER
EHDQKFITRLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFDESAPCHCRTVIRKALSKFCCFMKWLGQE
CTCFLQPAEGVVGDQGHDNESYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPDEIVARACRLTATVKVSQVDGR
IDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDASQSTMAAGPFSLTYAASAAGLEVRYVGAGLDHRAIFAPGVSPRS
NPGEVTAFCSALYRFNREAQRHSLTGNLWFHPEGLIGLFAPFSPGHVWESAKPFCGESTLYTRTWSEVDAVSSPTRPDLG
FMSEPPIPSRAATPTLAAPLPPLAPDPSPPSSAPALDEPASAATSGVPAITHQTARHRRLLFTYPDGSKVFAGSLFESTC
TWLVNASNVDHCPGGGLCHAFYQRYPASFDAACFVMRDGAAAYTLTPRPIIHRVAPDYRLEHNPKRLEAAYRETCSRLGT
AAYPLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFEANRPTRPTLTITEDAARTANLAIELDSATDVGRACAG
CRVTPGVVQYQFTAGVPGSGKSRSITRADVDVVVVPTRELRNAWRRRGFAAFTPHTAARVTDGRRVVIDEAPSLPPHLLL
LHMQRAATVHLLGDPNQIPAIDFEHPGLVPAIRPDLAPTSWWHVTHRCPADVCELIRGAYPMIQTTSRVLRSLFWGEPAV
GQKLVFTQAAKPANPGSVTVHDSQGATYTYTTIIATADARGLIQSSRAHAIVALTRHTEKWVIIDAPGLLREVGISDAIV
NNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRPAPVAAVLPPCPELEQGLLYLPQELTTC
DSVVTFELTDIVHCRMAAPSQRKAVVSTLVGRYGRRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVEAMVEKGQ
DGSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFD
DTVFSAAVAAAKASMVFENDFSEFDSTQNNFSLGLECAIMEECGMPQGLIRLYHLIRSAWILQAPKESLLGFWKKHSGEP
GTLLWNTVWNMAVITHCYDFRDLQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPMRLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERADELRIAVSDFLRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVADGKAHFTESVKP
VLDLTNSILCRVE
>Q04610 ~~~ORF1~~~Non-structural polyprotein pORF1~~~
MEAHQFIKAHGITTAIEQAALAAANSALANAVVVRPFLSHQQIEILINLMQPRQLVFRPEVFWNHPIQRVIHNELELYCR
ARSGRCLEIGAHPRSINDNPNVVHRCFLLPVGRDVQRWYTAPTRGPAANCRRSALRGLPAVDRTYCLDGFSGCNFPAETG
IALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWI
RTTKVTGDHPLVIERVRAIGCHFVLLLTAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIW
DRLMLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGWNASEDALTAVITAAYLTICHQRYLRTQAISKGMRRLER
EHAQKFITPLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFDESAPCRCRTAIRKALSKFCCFMKWLGQE
CTCFLQPAEGVVGDQGHDNEAYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPAEIVARAGRLTATVKVSQVDGR
IDCETLLGNKTFRTSFVDGAVLEANGPERYNLSFDASQSTMAAGPFSLTYAASAAGLEVRYVAAGLDHRAVFAPGVSPRS
APGEVTAFCSALYRFNREAQRHSLTGNLWFHPEGLIGLFAPFSPGHVWESANPFCGESTLYTRTWSEVDAVSSPARPDLG
LMSEPSIPSRAATPTLAVLLPPPAPDPPPPPSAPALDEPASGATAGAPAITHQTARHRRLLFTYPDGSKVFAGSLFESTC
TWLVNASNVDHRPGGGLCHAFYQRYPASFDAASFVMRDGAAAYTLTPRPIIHAVAPDYRLEHNPKRLEAAYRETCSRLGT
AAYSLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFEANRPTRPTLTITEDVARTANLAIELDSATDVGRACAG
CRVTPGVVQYQFTAGVPGSGKSRSITQADVDVVVVPTRELRNAWRRRGFAAFNPHTAARVTQGRRVVIDEAPSLPPHLLL
LHMQRAATVHLLGDPNQIPAIDFEHAGLVPAIRPDLGPTSWWHVTHRCPADVCELIRGAYPMIQTTSRVLRSLFWGEPAV
GQKLVFTQAAKAANPGSVTVHEAQGATYTETTIIATADARGLIQSSRAHAIVALTRHTEKFVIIDAPGLLREVGISDAIV
NNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRPVPVAAVLPPCPELEQGLLYLPQGLTAC
DSVVTFELTDIVHCRMAAPNQRKAVLSTLVGRYGRRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVEAMVEKGQ
DGSAVLELDLCNRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFD
DTVFSAAVAAAKASMVFENDFSEFDSTQNNFSLGLECAIMEECGMPQWLIRLYHLIRSAWILQAPKESLRGFWKKHSGEP
GTLLWNTVWNMAVITHCYDFRDFQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPIGLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERAEQLRLAVSDFLRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVADGKAHFTESVKP
VLDLTNSILCRVE
>P33424 ~~~ORF1~~~Non-structural polyprotein pORF1~~~
MEAHQFIKAPGITTAIEQAALAAANSALANAVVVRPFLSHQQIEILINLMQPRQLVFRPEVFWNHPIQRVIHNELELYCR
ARSGRCLEIGAHPRSINDNPNVVHRCFLRPAGRDVQRWYTAPTRGPAANCRRSALRGLPAADRTYCFDGFSGCNFPAETG
IALYSLHDMSPSDVAEAMFRHGMTRLYAALHLPPEVLLPPGTYRTASYLLIHDGRRVVVTYEGDTSAGYNHDVSNLRSWI
RTTKVTGDHPLVIERVRAIGCHFVLLLTAAPEPSPMPYVPYPRSTEVYVRSIFGPGGTPSLFPTSCSTKSTFHAVPAHIW
DRLMLFGATLDDQAFCCSRLMTYLRGISYKVTVGTLVANEGWNASEDALTAVITAAYLTICHQRYLRTQAISKGMRRLER
EHAQKFITRLYSWLFEKSGRDYIPGRQLEFYAQCRRWLSAGFHLDPRVLVFDESAPCHCRTAIRKAVSKFCCFMKWLGQE
CTCFLQPAEGVVGDQGHDNEAYEGSDVDPAESAISDISGSYVVPGTALQPLYQALDLPAEIVARAGRLTATVKVSQVDGR
IDCETLLGNKTFRTSFVDGAVLETNGPERHNLSFDASQSTMAAGPFSLTYAASAAGLEVRYVAAGLDHRAVFAPGVSPRS
APGEVTAFCSALYRFNREAQRLSLTGNFWFHPEGLLGPFAPFSPGHVWESANPFCGESTLYTRTWSEVDAVPSPAQPDLG
FTSEPSIPSRAATPTPAAPLPPPAPDPSPTLSAPARGEPAPGATARAPAITHQTARHRRLLFTYPDGSKVFAGSLFESTC
TWLVNASNVDHRPGGGLCHAFYQRYPASFDAASFVMRDGAAAYTLTPRPIIHAVAPDYRLEHNPKRLEAAYRETCSRLGT
AAYPLLGTGIYQVPIGPSFDAWERNHRPGDELYLPELAARWFEANRPTCPTLTITEDVARTANLAIELDSATDVGRACAG
CRVTPGVVQYQFTAGVPGSGKSRSITQADVDVVVVPTRELRNAWRRRGFAAFTPHTAARVTQGRRVVIDEAPSLPPHLLL
LHMQRAATVHLLGDPNQIPAIDFEHAGLVPAIRPDLAPTSWWHVTHRCPADVCELIRGAYPMIQTTSRVLRSLFWGEPAV
GQKLVFTQAAKAANPGSVTVHEAQGATYTETTIIATADARGLIQSSRAHAIVALTRHTEKCVIIDAPGLLREVGISDAIV
NNFFLAGGEIGHQRPSVIPRGNPDANVDTLAAFPPSCQISAFHQLAEELGHRPAPVAAVLPPCPELEQGLLYLPQELTTC
DSVVTFELTDIVHCRMAAPSQRKAVLSTLVGRYGRRTKLYNASHSDVRDSLARFIPAIGPVQVTTCELYELVEAMVEKGQ
DGSAVLELDLCSRDVSRITFFQKDCNKFTTGETIAHGKVGQGISAWSKTFCALFGPWFRAIEKAILALLPQGVFYGDAFD
DTVFSAAVAAAKASMVFENDFSEFDSTQNNFSLGLECAIMEECGMPQWLIRLYHLIRSAWILQAPKESLRGFWKKHSGEP
GTLLWNTVWNMAVITHCYDFRDLQVAAFKGDDSIVLCSEYRQSPGAAVLIAGCGLKLKVDFRPIGLYAGVVVAPGLGALP
DVVRFAGRLTEKNWGPGPERAEQLRLAVSDFLRKLTNVAQMCVDVVSRVYGVSPGLVHNLIGMLQAVADGKAHFTESVKP
VLDLTNSILCRVE
>Q8QZ73 ~~~~~~Polyprotein P1234~~~
MSKVFVDIEAESPFLKSLQRAFPAFEVEAQQVTPNDHANARAFSHLATKLIEQETEKDTLILDIGSAPARRMMSEHTYHC
VCPMRSAEDPERLLYYARKLAKASGEVVDRNIAAKIDDLQSVMATPDNESRTFCLHTDQTCRTQAEVAVYQDVYAVHAPT
SLYFQAMKGVRTAYWIGFDTTPFMFDTMAGAYPTYATNWADEQVLKARNIGLCSASLTEGHLGKLSIMRKKKMTPSDQIM
FSVGSTLYIESRRLLKSWHLPSVFHLKGRQSYTCRCDTIVSCEGYVVKKITMSPGVFGKTSGYAVTHHAEGFLVCKTTDT
IAGERVSFPICTYVPSTICDQMTGILATEVTPEDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPVVSQAFSKWAKEYRL
DQEDEKNMGMRERTLTCCCLWAFKTHKNHTMYKKPDTQTIVKVPSEFNSFVIPSLWSAGLSIGIRHRIRLLLQSRRVEPL
VPSMDVGEARAAEREAAEAKEAEDTLAALPPLIPTAPVLDDIPEVDVEELEFRAGAGVVETPRNALKVTPQDRDTMVGSY
LVLSPQTVLKSVKLQALHPLAESVKIITHKGRAGRYQVDAYDGRVLLPTGAAIPVPDFQALSESATMVYNEREFINRKLY
HIAVHGAALNTDEEGYEKVRAESTDAEYVYDVDRKQCVKREEAEGLVMIGDLINPPFHEFAYEGLKRRPAAPYKTTVVGV
FGVPGSGKSGIIKSLVTRGDLVASGKKENCQEIMLDVKRYRDLDMTAKTVDSVLLNGVKQTVDVLYVDEAFACHAGTLLA
LIATVRPRKKVVLCGDPKQCGFFNLMQLQVNFNHNICTEVDHKSISRRCTLPITAIVSTLHYEGRMRTTNPYNKPVIIDT
TGQTKPNREDIVLTCFRGWVKQLQLDYRGHEVMTAAASQGLTRKGVYAVRMKVNENPLYAQSSEHVNVLLTRTEGRLVWK
TLSGDPWIKTLSNIPKGNFTATLEDWQREHDTIMRAITQEAAPLDVFQNKAKVCWAKCLVPVLETAGIKLSATDWSAIIL
AFKEDRAYSPEVALNEICTKIYGVDLDSGLFSAPRVSLHYTTNHWDNSPGGRMYGFSVEAANRLEQQHPFYRGRWASGQV
LVAERKTQPIDVTCNLIPFNRRLPHTLVTEYHPIKGERVEWLVNKIPGYHVLLVSEYNLILPRRKVTWIAPPTVTGADLT
YDLDLGLPPNAGRYDLVFVNMHTPYRLHHYQQCVDHAMKLQMLGGDALYLLKPGGSLLLSTYAYADRTSEAVVTALARRF
SSFRAVTVRCVTSNTEVFLLFTNFDNGRRTVTLHQTNGKLSSIYAGTVLQAAGCAPAYAVKRADIATAIEDAVVNAANHR
GQVGDGVCRAVARKWPQAFRNAATPVGTAKTVKCDETYIIHAVGPNFNNTSEAEGDRDLAAAYRAVAAEINRLSISSVAI
PLLSTGIFSAGKDRVHQSLSHLLAAMDTTEARVTIYCRDKTWEQKIKTVLQNRSATELVSDELQFEVNLTRVHPDSSLVG
RPGYSTTDGTLYSYMEGTKFHQAALDMAEITTLWPRVQDANEHICLYALGETMDNIRARCPVEDSDSSTPPKTVPCLCRY
AMTPERVTRLRMHHTKDFVVCSSFQLPKYRIPGVQRVKCEKVMLFDAAPPASVSPVQYLTNQSETTISLSSFSITSDSSS
LSTFPDLESAEELDHDSQSVRPALNEPDDHQPTPTAELATHPVPPPRPNRARRLAAARVQVQVEVHQPPSNQPTKPIPAP
RTSLRPVPAPRRYVPRPVVELPWPLETIDVEFGAPTEEESDITFGDFSASEWETISNSSXLGRAGAYIFSSDVGPGHLQQ
KSVRQHDLEVPIMDRVIEEKVYPPKLDEAKEKQLLLKLQMHATDANRSRYQSRKVENMKATIIDRLKQGSAYYVSAAADK
AVTYHVRYAKPRYSVPVMQRLSSATIAVATCNEFLARNYPTVASYQITDEYDAYLDMVDGSESCLDRANFCPAKLRCYPK
HHAYHMPQIRSAVPSPFQNTLQNVLAAATKRNCNVTQMRELPTLDSAVYNVECFRKYACNNEYWEEFAKKPIRITTENLT
TYVTKLKGGKAAALFAKTHNLVPLQEVPMDRFIMDMKRDVKVTPGTKHTEERPKVQVIQAAEPLATAYLCGIHRELVRRL
NAVLLPNIHTLFDMSAEDFDAIISEHFKPGDHVLETDIASFDKSQDDSLALTGLMILEDLGVDNQLLDLIEAAFGQITSC
HLPTGTRFKFGAMMKSGMFLTLFINTVLNITIASRVLEARLTNSACAAFIGDDNVVHGVVSDKLMADRCATWVNMEVKII
DAVMCIKPPYFCGGFLVYDHVTRTACRIADPLKRLFKLGKPLPADDCQDEDRRRALYDEVKKWSRSGLGSEIEVALASRY
RLEGSYNLLLAMSTFAHSMKNFSALRGPVIHLYGGPK
>Q86500 ~~~~~~Non-structural polyprotein p200~~~
MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAAQKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHW
IEWGPKEALHVLIDPSPGLLREVARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG
GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDHDGGCPADCRGAGAGPTPGYTRP
CTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELSWEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAE
VGWWRWFSLPRPVFQRMLSYCKTLSPDAYYSERVFKFKNALSHSITLAGNVLQEGWKGTCAEEDALCAYVAFRAWQSNAR
LAGIMKSAKRCAADSLSVAGWLDTIWDAIKRFFGSVPLAERMEEWEQDAAVAAFDRGPLEDGGRHLDTVQPPKSPPRPEI
AATWIVHAASADRHCACAPRCDVPRERPSAPAGPPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPLAPRPA
RYPTVLYRHPAHHGPWLTLDEPGGADAALVLCDPLGQPLRGPERHYAAGAHMCAQARGLQAFVRVVPPPERPWADGGARA
WAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPKLIALALRTLAQQGAALALSVRDLPRGTAFEANAVTAAVRAGPGQLAAT
SPPPGDPPPPRRARRSQRHSDARGTPPPAPVRDPPRPQPSPPAPPRVGDPVPPTTAEPADRARHAELEVVYEPSGPPTST
KADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFANATAALAADCRRLAPCPIGEAVAT
PGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERAYRSIVALAAARRWARVACPLLGAGVYGWSAAESLRAALAATRAEP
AERVSLHICHPDRATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEPRGCQGCELC
RYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHYSVLKPAEVRPPRGMCGSDMWRCRGWQGMPQV
RCTPSNAHAALCRTGVPPRVSTRGGELDPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQWW
SASHADASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRARPEGGNPTGHFVCAVGG
GPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEVRRLGDDAMARAALASIQRPRKGPYNIRVWNMAAGAGKTT
RILAAFTREDLYVCPTNALLHEIQAKLRARDIDIKNAATYERALTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAEVIC
VGDRDQCGPHYANNCRTPVPDRWPTGRSRHTWRFPDCWAARLRAGLDYDIEGERTGTFACNLWDGRQVDLHLAFSRETVR
RLHEAGIRAYTVREAQGMSVGTACIHVGRDGTDVALALTRDLAIVSLTRASDALYLHELEDGLLRAAGLSAFLDAGALAE
LKEVPAGIDRVVAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRYMRISRHLLNKN
HTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPPRVTAGVAQEWRMTYLRERIDLTDVYTQMGVAAR
ELTDRYTRRYPEIFAGMCTAQSLSVPAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKI
IMRALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLGLPCAEDYRALRAGSYCTLR
ELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVRWAGIFQGDDMVIFLPEGARNAALKWTPAEVGLFGFHIPVKH
VSTPTPSFCGHVGTAAGLFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERVLAIVR
ELTAYARGRGLDHPATIGALEEIQTPYARANLHDAD
>O40955 ~~~~~~Non-structural polyprotein p200~~~
MERLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVSAAQKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHW
IEWGPKEALHVLIDPSPGLLREVARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG
GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDHDGGCPADCRGAGAGPTPGYTRP
CTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELSWEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAE
VGRWRWFSLPRPVFQRMLSYCKTLSPDAYYSERVFKFKNALSHSITLAGNVLQEGWKGTCAEEDALCAYVAFRAWQSNAR
LAGIMKSAKRCAADSLSVAGWLDTIWGAIKRFFGSVPLAERMEEWEQDAAVAAFDRGPLEDGGRHLDTVQPPKSPPRPEI
AATWIVHAASADRHCACAPRCDVPRERPSAPAGPPDDEALIPPWLFAEHRALRCREWDFEVLRARADTAAAPAPLAPRPA
RYPTVLYRHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRVVPPPERPWADGGARA
WAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQQGAALALSVRDLPGGAAFDANAVTAAVRAGPGQSAAT
SSPPGDPPPPRCARRSQRHSDARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPTSAGPADRARDAELEVAYEPSGPPTST
KADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFANATAALAADCRRLAPCPTGEAVAT
PGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERAYRSIVALAAARRWARVACPLLGAGVYGWSAAESLRAALAATRTEP
AERVSLHICHPDRATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEPRGCQGCELC
RYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHYSVLKPAEVRPPRGMCGSDMWRCRGWQGVPQV
RCTPSNAHAALCRTGVPPRVSTRGGELDPNTCWLRAAANVAQAARACGAYTSAGCPRCAYGRALSEARTHKDFAALSQRW
SASHADASSDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRARPEGGNPTGHFVCAVGG
GPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEVRRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTT
RILAAFTREDLYVCPTNALLHEIQAKLRARDIEIKNAATYERALTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAEVIC
VGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERTGTFACNLWDGRQVDLHLAFSRETVR
RLHEAGIRAYTVREAQGMSVGTACIHVGRDGTDVALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAE
LKEVPAGIDRVVAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRYMRISRHLLNKN
HTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPPRVTAGVAQEWRMTYLRERIDLTDVYTQMGVAAR
ELTDRYARRYPEIFAGMCTAQSLSVPAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKI
IMRALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLGLPCAEDYRALRAGSYCTLR
ELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVRWAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKH
VSTPTPSFCGHVGTAAGLFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERVLAIVR
ELTAYARGRGLDHPATIGALEEIQTPYARANLHDAD
>P08411 ~~~~~~Polyprotein P1234~~~
MAAKVHVDIEADSPFIKSLQKAFPSFEVESLQVTPNDHANARAFSHLATKLIEQETDKDTLILDIGSAPSRRMMSTHKYH
CVCPMRSAEDPERLVCYAKKLAAASGKVLDREIAGKITDLQTVMATPDAESPTFCLHTDVTCRTAAEVAVYQDVYAVHAP
TSLYHQAMKGVRTAYWIGFDTTPFMFDALAGAYPTYATNWADEQVLQARNIGLCAASLTEGRLGKLSILRKKQLKPCDTV
MFSVGSTLYTESRKLLRSWHLPSVFHLKGKQSFTCRCDTIVSCEGYVVKKITMCPGLYGKTVGYAVTYHAEGFLVCKTTD
TVKGERVSFPVCTYVPSTICDQMTGILATDVTPEDAQKLLVGLNQRIVVNGRTQRNTNTMKNYLLPIVAVAFSKWAREYK
ADLDDEKPLGVRERSLTCCCLWAFKTRKMHTMYKKPDTQTIVKVPSEFNSFVIPSLWSTGLAIPVRSRIKMLLAKKTKRE
LIPVLDASSARDAEQEEKERLEAELTREALPPLVPIAPAETGVVDVDVEELEYHAGAGVVETPRSALKVTAQPNDVLLGN
YVVLSPQTVLKSSKLAPVHPLAEQVKIITHNGRAGRYQVDGYDGRVLLPCGSAIPVPEFQALSESATMVYNEREFVNRKL
YHIAVHGPSLNTDEENYEKVRAERTDAEYVFDVDKKCCVKREEASGLVLVGELTNPPFHEFAYEGLKIRPSAPYKTTVVG
VFGVPGSGKSAIIKSLVTKHDLVTSGKKENCQEIVNDVKKHRGLDIQAKTVDSILLNGCRRAVDILYVDEAFACHSGTLL
ALIALVKPRSKVVLCGDPKQCGFFNMMQLKVNFNHNICTEVCHKSISRRCTRPVTAIVSTLHYGGKMRTTNPCNKPIIID
TTGQTKPKPGDIVLTCFRGWVKQLQLDYRGHEVMTAAASQGLTRKGVYAVRQKVNENPLYAPASEHVNVLLTRTEDRLVW
KTLAGDPWIKVLSNIPQGNFTATLEEWQEEHDKIMKVIEGPAAPVDAFQNKANVCWAKSLVPVLDTAGIRLTAEEWSTII
TAFKEDRAYSPVVALNEICTKYYGVDLDSGLFSAPKVSLYYENNHWDNRPGGRMYGFNAATAARLEARHTFLKGQWHTGK
QAVIAERKIQPLSVLDNVIPINRRLPHALVAEYKTVKGSRVEWLVNKVRGYHVLLVSEYNLALPRRRVTWLSPLNVTGAD
RCYDLSLGLPADAGRFDLVFVNIHTEFRIHHYQQCVDHAMKLQMLGGDALRLLKPGGSLLMRAYGYADKISEAVVSSLSR
KFSSARVLRPDCVTSNTEVFLLFSNFDNGKRPSTLHQMNTKLSAVYAGEAMHTAGCAPSYRVKRADIATCTEAAVVNAAN
ARGTVGDGVCRAVAKKWPSAFKGAATPVGTIKTVMCGSYPVIHAVAPNFSATTEAEGDRELAAVYRAVAAEVNRLSLSSV
AIPLLSTGVFSGGRDRLQQSLNHLFTAMDATDADVTIYCRDKSWEKKIQEAIDMRTAVELLNDDVELTTDLVRVHPDSSL
VGRKGYSTTDGSLYSYFEGTKFNQAAIDMAEILTLWPRLQEANEQICLYALGETMDNIRSKCPVNDSDSSTPPRTVPCLC
RYAMTAERIARLRSHQVKSMVVCSSFPLPKYHVDGVQKVKCEKGLLFDPTVPSVVSPRKYAASTTDHSDRSLRGFDLDWT
TDSSSTASDTMSLPSLQSCDIDSIYEPMAPIVVTADVHPEPAGIADLAADVHPEPADHVDLENPIPPPRPKRAAYLASRA
AERPVPAPRKPTPAPRTAFRNKLPLTFGDFDEHEVDALASGITFGDFDDVLRLGRAGAYIFSSDTGSGHLQQKSVRQHNL
QCAQLDAVEEEKMYPPKLDTEREKLLLLKMQMHPSEANKSRYQSRKVENMKATVVDRLTSGARLYTGADVGRIPTYAVRY
PRPVYSPTVIERFSSPDVAIAACNEYLSRNYPTVASYQITDEYDAYLDMVDGSDSCLDRATFCPAKLRCYPKHHAYHQPT
VRSAVPSPFQNTLQNVLAAATKRNCNVTQMRELPTMDSAVFNVECFKRYACSGEYWEEYAKQPIRITTENITTYVTKLKG
PKAAALFAKTHNLVPLQEVPMDRFTVDMKRDVKVTPGTKHTEERPKVQVIQAAEPLATAYLCGIHRELVRRLNAVLRPNV
HTLFDMSAEDFDAIIASHFHPGDPVLETDIASFDKSQDDSLALTGLMILEDLGVDQYLLDLIEAAFGEISSCHLPTGTRF
KFGAMMKSGMFLTLFINTVLNITIASRVLEQRLTDSACAAFIGDDNIVHGVISDKLMAERCASWVNMEVKIIDAVMGEKP
PYFCGGFIVFDSVTQTACRVSDPLKRLFKLGKPLTAEDKQDEDRRRALSDEVSKWFRTGLGAELEVALTSRYEVEGCKSI
LIAMATLARDIKAFKKLRGPVIHLYGGPRLVR
>P03317 ~~~~~~Polyprotein P1234~~~
MEKPVVNVDVDPQSPFVVQLQKSFPQFEVVAQQVTPNDHANARAFSHLASKLIELEVPTTATILDIGSAPARRMFSEHQY
HCVCPMRSPEDPDRMMKYASKLAEKACKITNKNLHEKIKDLRTVLDTPDAETPSLCFHNDVTCNMRAEYSVMQDVYINAP
GTIYHQAMKGVRTLYWIGFDTTQFMFSAMAGSYPAYNTNWADEKVLEARNIGLCSTKLSEGRTGKLSIMRKKELKPGSRV
YFSVGSTLYPEHRASLQSWHLPSVFHLNGKQSYTCRCDTVVSCEGYVVKKITISPGITGETVGYAVTHNSEGFLLCKVTD
TVKGERVSFPVCTYIPATICDQMTGIMATDISPDDAQKLLVGLNQRIVINGRTNRNTNTMQNYLLPIIAQGFSKWAKERK
DDLDNEKMLGTRERKLTYGCLWAFRTKKVHSFYRPPGTQTCVKVPASFSAFPMSSVWTTSLPMSLRQKLKLALQPKKEEK
LLQVSEELVMEAKAAFEDAQEEARAEKLREALPPLVADKGIEAAAEVVCEVEGLQADIGAALVETPRGHVRIIPQANDRM
IGQYIVVSPNSVLKNAKLAPAHPLADQVKIITHSGRSGRYAVEPYDAKVLMPAGGAVPWPEFLALSESATLVYNEREFVN
RKLYHIAMHGPAKNTEEEQYKVTKAELAETEYVFDVDKKRCVKKEEASGLVLSGELTNPPYHELALEGLKTRPAVPYKVE
TIGVIGTPGSGKSAIIKSTVTARDLVTSGKKENCREIEADVLRLRGMQITSKTVDSVMLNGCHKAVEVLYVDEAFACHAG
ALLALIAIVRPRKKVVLCGDPMQCGFFNMMQLKVHFNHPEKDICTKTFYKYISRRCTQPVTAIVSTLHYDGKMKTTNPCK
KNIEIDITGATKPKPGDIILTCFRGWVKQLQIDYPGHEVMTAAASQGLTRKGVYAVRQKVNENPLYAITSEHVNVLLTRT
EDRLVWKTLQGDPWIKQPTNIPKGNFQATIEDWEAEHKGIIAAINSPTPRANPFSCKTNVCWAKALEPILATAGIVLTGC
QWSELFPQFADDKPHSAIYALDVICIKFFGMDLTSGLFSKQSIPLTYHPADSARPVAHWDNSPGTRKYGYDHAIAAELSR
RFPVFQLAGKGTQLDLQTGRTRVISAQHNLVPVNRNLPHALVPEYKEKQPGPVKKFLNQFKHHSVLVVSEEKIEAPRKRI
EWIAPIGIAGADKNYNLAFGFPPQARYDLVFINIGTKYRNHHFQQCEDHAATLKTLSRSALNCLNPGGTLVVKSYGYADR
NSEDVVTALARKFVRVSAARPDCVSSNTEMYLIFRQLDNSRTRQFTPHHLNCVISSVYEGTRDGVGAAPSYRTKRENIAD
CQEEAVVNAANPLGRPGEGVCRAIYKRWPTSFTDSATETGTARMTVCLGKKVIHAVGPDFRKHPEAEALKLLQNAYHAVA
DLVNEHNIKSVAIPLLSTGIYAAGKDRLEVSLNCLTTALDRTDADVTIYCLDKKWKERIDAALQLKESVTELKDEDMEID
DELVWIHPDSCLKGRKGFSTTKGKLYSYFEGTKFHQAAKDMAEIKVLFPNDQESNEQLCAYILGETMEAIREKCPVDHNP
SSSPPKTLPCLCMYAMTPERVHRLRSNNVKEVTVCSSTPLPKHKIKNVQKVQCTKVVLFNPHTPAFVPARKYIEVPEQPT
APPAQAEEAPEVVATPSPSTADNTSLDVTDISLDMDDSSEGSLFSSFSGSDNSITSMDSWSSGPSSLEIVDRRQVVVADV
HAVQEPAPIPPPRLKKMARLAAARKEPTPPASNSSESLHLSFGGVSMSLGSIFDGETARQAAVQPLATGPTDVPMSFGSF
SDGEIDELSRRVTESEPVLFGSFEPGEVNSIISSRSAVSFPLRKQRRRRRSRRTEYXLTGVGGYIFSTDTGPGHLQKKSV
LQNQLTEPTLERNVLERIHAPVLDTSKEEQLKLRYQMMPTEANKSRYQSRKVENQKAITTERLLSGLRLYNSATDQPECY
KITYPKPLYSSSVPANYSDPQFAVAVCNNYLHENYPTVASYQITDEYDAYLDMVDGTVACLDTATFCPAKLRSYPKKHEY
RAPNIRSAVPSAMQNTLQNVLIAATKRNCNVTQMRELPTLDSATFNVECFRKYACNDEYWEEFARKPIRITTEFVTAYVA
RLKGPKAAALFAKTYNLVPLQEVPMDRFVMDMKRDVKVTPGTKHTEERPKVQVIQAAEPLATAYLCGIHRELVRRLTAVL
LPNIHTLFDMSAEDFDAIIAEHFKQGDPVLETDIASFDKSQDDAMALTGLMILEDLGVDQPLLDLIECAFGEISSTHLPT
GTRFKFGAMMKSGMFLTLFVNTVLNVVIASRVLEERLKTSRCAAFIGDDNIIHGVVSDKEMAERCATWLNMEVKIIDAVI
GERPPYFCGGFILQDSVTSTACRVADPLKRLFKLGKPLPADDEQDEDRRRALLDETKAWFRVGITGTLAVAVTTRYEVDN
ITPVLLALRTFAQSKRAFQAIRGEIKHLYGGPK
>P10358 ~~~~~~RNA replicase polyprotein~~~
MAFQLALDALAPTTHRDPSLHPILESTVDSIRSSIQTYPWSIPKELLPLLNSYGIPTSGLGTSHHPHAAHKTIETFLLCT
HWSFQATTPSSVMFMKPSKFNKLAQVNSNFRELKNYRLHPNDSTRYPFTSPDLPVFPTIFMHDALMYYHPSQIMDLFLRK
PNLERLYASLVVPPEAHLSDQSFYPKLYTYTTTRHTLHYVPEGHEAGSYNQPSDAHSWLRINSIRLGNHHLSVTILESWG
PVHSLLIQRGTPPPDPSLQAPPTLMTSDLFRSYQEPRLDVVSFRIPDAIELPQATFLQQPLRDRLVPRAVYNALFTYTRA
VRTLRTSDPAAFVRMHSSKPDHDWVTSNAWDNLQTFALLNVPLRPNVVYHVLQSPIASLSLYLRQHWRRLTATAVPILSF
LTLLQRFLPLPIPLAEVKSITAFRRELYRKKEPHHPLDVFHLQHRVRNYHSAISAVRPASPPHQKLPHALQKAALLLLRP
ISPLLTATPFFRSEQKSMLPNAELSWTLKRFALPWQASLVLLALSESSILLHKLFSPPTLQAQHDTYHRHLHPGSYSLQW
ERTPLSIPRTTAFLPFTPTTSTAPPDRSEASLPPAFASTFVPRPPPAASSPGAQPPTTTAAPPTPIEPTQRTHQNSDLAL
ESSTSTEPPPPPIRSPDMTPSAPVLFPEINSPRRFPPQLPATPDLEPAHTPPPLSIPHQDPTDSADPLMGSHLLHHSLPA
PPTHPLPSSQLLPAPLTNDPTAIGPVLPFEELHPRRYPENTATFLTRLRSLPSNHLPQPTLNCLLSAVSDQTKVSEEHLW
ESLQTILPDSQLSNEETNTLGLSTEHLTALAHLYNFQATVYSDRGPILFGPSDTIKRIDITHTTGPPSHFSPGKRLLGSQ
PSAKGHPSDPLIRAMKSFKVSGNYLPFSEAHNHPTSISHAKNLISNMKNGFDGVLSLLDVSTGQRTGPTPKERIIQIDHY
LDTNPGKTTPVVHFAGFAGCGKTYPIQQLLKTKLFKDFRVSCPTTELRTEWKTAMELHGSQSWRFNTWESSILKSSRILV
IDEIYKMPRGYLDLSILADPALELVIILGDPLQGEYHSQSKDSSNHRLPSETLRLLPYIDMYCWWSYRIPQCIARLFQIH
SFNAWQGVIGSVSTPHDQSPVLTNSHASSLTFNSLGYRSCTISSSQGLTFCDPAIIVLDNYTKWLSSANGLVALTRSRSG
VQFMGPSSYVGGTNGSSAMFSDAFNNSLIIMDRYFPSLFPQLKLITSPLTTRGPKLNGATPSASPTHRSPNFHLPPHIPL
SYDRDFVTVNPTLPDQGPETRLDTHFLPPSRLPLHFDLPPAITPPPVSTSVDPPQAKASPVYPGEFFDSLAAFFLPAHDP
STREILHKDQSSNQFPWFDRPFSLSCQPSSLISAKHAPNHDPTLLPASINKRLRFRPSDSPHQITADDVVLGLQLFHSLC
RAYSRQPNSTVPFNPELFAECISLNEYAQLSSKTQSTIVANASRSDPDWRHTTVKIFAKAQHKVNDGSIFGSWKACQTLA
LMHDYVILVLGPVKKYQRIFDNADRPPNIYSHCGKTPNQLRDWCQEHLTHSTPKIANDYTAFDQSQHGESVVLEALKMKR
LNIPSHLIQLHVHLKTNVSTQFGPLTCMRLTGEPGTYDDNTDYNLAVIYSQYDVGSCPIMVSGDDSLIDHPLPTRHDWPS
VLKRLHLRFKLELTSHPLFCGYYVGPAGCIRNPLALFCKLMIAVDDDALDDRRLSYLTEFTTGHLLGESLWHLLPETHVQ
YQSACFDFFCRRCPRHEKMLLDDSTPALSLLERITSSPRWLTKNAMYLLPAKLRLAITSLSQTQSFPESIEVSHAESELL
HYVQ
>P0DOK1 ~~~~~~Frameshifted structural polyprotein~~~
MEFIPTQTFYNRRYQPRPWTPRPTIQVIRPRPRPQRQAGQLAQLISAVNKLTMRAVPQQKPRKNRKNKKQKQKQQAPQNN
TNQKKQPPKKKPAQKKKKPGRRERMCMKIENDCIFEVKHEGKVTGYACLVGDKVMKPAHVKGTIDNADLAKLAFKRSSKY
DLECAQIPVHMKSDASKFTHEKPEGYYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGANEGARTAL
SVVTWNKDIVTKITPEGAEEWSLAIPVMCLLANTTFPCSQPPCIPCCYEKEPEETLRMLEDNVMRPGYYQLLQASLTCSP
HRQRRSTKDNFNVYKATRPYLAHCPDCGEGHSCHSPVALERIRNEATDGTLKIQVSLQIGIGTDDSHDWTKLRYMDNHIP
ADAGRAGLFVRTSAPCTITGTMGHFILARCPKGETLTVGFTDSRKISHSCTHPFHHDPPVIGREKFHSRPQHGKELPCST
YVQSNAATAEEIEVHMPPDTPDRTLLSQQSGNVKITVNSQTVRYKCNCGGSNEGLITTDKVINNCKVDQCHAAVTNHKKW
QYNSPLVPRNAELGDRKGKIHIPFPLANVTCMVPKARNPTVTYGKNQVIMLLYPDHPTLLSYRSMGEEPNYQEEWVTHKK
EVVLTVPTEGLEVTWGNNEPYKYWPQLSANGTAHGHPHEIILYYYELYPTMTVVVVSVASFILLSMVGMAVGMCMCARRR
CITPYELTPGATVPFLLSLICCIRTAKAATYQEAAVYLWNEQQPLFWLQALIPLAALIVLCNCLRLLPCCCKTLAFLSRN
EHRCPHCERVRTRNSDPEHGGSTV
>P0DJZ6 ~~~~~~Frameshifted structural polyprotein~~~
MNYIPTQTFYGRRWRPRPAARPWPLQATPVAPVVPDFQAQQMQQLISAVNALTMRQNAIAPARPPKPKKKKTTKPKPKTQ
PKKINGKTQQQKKKDKQADKKKKKPGKRERMCMKIENDCIFEVKHEGKVTGYACLVGDKVMKPAHVKGVIDNADLAKLAF
KKSSKYDLECAQIPVHMRSDASKYTHEKPEGHYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGANE
GSRTALSVVTWNKDMVTRVTPEGSEEWSAPLITAMCVLANATFPCFQPPCVPCCYENNAEATLRMLEDNVDRPGYYDLLQ
AALTCRNGTRHRRSVSQHFNVYKATRPYIAYCADCGAGHSCHSPVAIEAVRSEATDGMLKIQFSAQIGIDKSDNHDYTKI
RYADGHAIENAVRSSLKVATSGDCFVHGTMGHFILAKCPPGEFLQVSIQDTRNAVRACRIQYHHDPQPVGREKFTIRPHY
GKEIPCTTYQQTTAETVEEIDMHMPPDTPDRTLLSQQSGNVKITVGGKKVKYNCTCGTGNVGTTNSDMTINTCLIEQCHV
SVTDHKKWQFNSPFVPRADEPARKGKVHIPFPLDNITCRVPMAREPTVIHGKREVTLHLHPDHPTLFSYRTLGEDPQYHE
EWVTAAVERTIPVPVDGMEYHWGNNDPVRLWSQLTTEGKPHGWPHQIVQYYYGLYPAATVSAVVGMSLLALISIFASCYM
LVAARSKCLTPYALTPGAAVPWTLGILCCAPRAHAASVAETMAYLWDQNQALFWLEFAAPVACILIITYCLRNVLCCCKS
LSFLSATEPRGHRQSLRTFDSNAERGGVPV
>P0DOK0 ~~~~~~Frameshifted structural polyprotein~~~
MNRGFFNMLGRRPFPAPTAMWRPRRRRQAAPMPARNGLASQIQQLTTAVSALVIGQATRPQPPRPRPPPRQKKQAPKQPP
KPKKPKTQEKKKKQPAKPKPGKRQRMALKLEADRLFDVKNEDGDVIGHALAMEGKVMKPLHVKGTIDHPVLSKLKFTKSS
AYDMEFAQLPVNMRSEAFTYTSEHPEGFYNWHHGAVQYSGGRFTIPRGVGGRGDSGRPIMDNSGRVVAIVLGGADEGTRT
ALSVVTWNSKGKTIKTTPEGTEEWSAAPLVTAMCLLGNVSFPCDRPPTCYTREPSRALDILEENVNHEAYDTLLNAILRC
GSSGRSKRSVIDDFTLTSPYLGTCSYCHHTVPCFSPVKIEQVWDEADDNTIRIQTSAQFGYDQSGAASANKYRYMSLKQD
HTVKEGTMDDIKISTSGPCRRLSYKGYFLLAKCPPGDSVTVSIVSSNSATSCTLARKIKPKFVGREKYDLPPVHGKKIPC
TVYDRLKETTAGYITMHRPRPHAYTSYLEESSGKVYAKPPSGKNITYECKCGDYKTGTVSTRTEITGCTAIKQCVAYKSD
QTKWVFNSPDLIRHDDHTAQGKLHLPFKLIPSTCMVPVAHAPNVIHGFKHISLQLDTDHLTLLTTRRLGANPEPTTEWIV
GKTVRNFTVDRDGLEYIWGNHEPVRVYAQESAPGDPHGWPHEIVQHYYHRHPVYTILAVASATVAMMIGVTVAVLCACKA
RRECLTPYALAPNAVIPTSLALLCCVRSANAETFTETMSYLWSNSQPFFWVQLCIPLAAFIVLMRCCSCCLPFLSGCRRL
PGEGRRLRTCDHCSKCATDTV
>Q9DSN8 ~~~ORF2~~~Structural polyprotein~~~
ADQETNTSNVHNTQLASTSEENSVETEQITTFHDVETPNRINTPMAQDTSSARSMDDTHSIIQFLQRPVLIDHIEVIAGS
TADDNKPLNRYVLNRQNPQPFVKSWTLPSVVLSAGGKGQKLANFKYLRCDVKVKIVLNANPFIAGRLYLAYSPYDDRVDP
ARSILNTSRAGVTGYPGIEIDFQLDNSVEMTIPYASFQEAYDLVTGTEDFVKLYLFTITPILSPTSTSASSKVDLSVYMW
LDNISLVIPTYRVNTSIVPNVGTVVQTVQNMTTRDSETIRKAMVALRKNNKSTYDYIVQALSSAVPEVKNVTMQINSKKN
NSNKMATPVKEKTKNIPKPKTENPKIGPISELATGVNKVANGIERIPVIGEMAKPVTSTIKWVADKIGSVAAIFGWSKPR
NLEQVNLYQNVPGWGYSLYKGIDNSVPLAFDPNNELGDLRDVFPSGVDEMAIGYVCGNPAVKHVLSWNTTDKVQAPISNG
DDWGGVIPVGMPCYSKIIRTTENDTTRTNTEIMDPAPCEYVCNMFSYWRATMCYRIAIVKTAFHTGRLGIFFGPGKIPIT
TTKDNISPDLTQLDGIKAPSDNNYKYILDLTNDTEITIRVPFVSNKMFMKSTGIYGGNSENNWDFSESFTGFLCIRPITK
FMCPETVSNNVSIVVWKWAEDVVVVEPKPLLSGPTQVFQPPVTSADSINTIDASMQINLANKADENVVTFFDSDDAEERN
MEALLKGSGEQIMNLRSLLRTFRTISENWNLPPNTKTAITDLTDVADKEGRDYMSYLSYIYRFYRGGRRYKFFNTTALKQ
SQTCYVRSFLIPRYYTADNTNNDGPSHITYPVLNPVHEVEVPYYCQYRKLPVASTTDKGYDASLMYYSNVGTNQIVARAG
NDDFTFGWLIGTPQTQGITRTETK
>Q86925 ~~~~~~Structural polyprotein~~~
MNSVFYNPFGRGAYAQPPIAWRPRRRAAPAPRPSGLTTQIQQLTRAVRALVLDNATRRQRPAPRTRPRKPKTQKPKPKKQ
NQKPPQQQKKGKNQPQQPKKPKPGKRQRTALKFEADRTFVGKNEDGKIMGYAVAMEGKVIKPLHVKGTIDHPALAKLKFT
KSSSYDMEFAKLPTEMKSDAFGYTTEHPEVFYNWHHGAVQFSGGRFTIPTGVGGPGDSGRPILDNSGKVVAIVLGGANEV
PGTALSVVTWNKKGAAIKTTHEDTVEWSRAITAMCILQNVTFPCDRPPTCYNRNPDLTLTMLETNVNHPSYDVLLDAALR
CPTRRHVRSTPTDDFTLTAPYLGLCHRCKTMEPCYSPIKIEKVWDDADDGVLRIQVSAQLGYNRAGTAASARLRFMGGGV
PPEIQEGAIADFKVFTSKPCLHLSHKGYFVIVKCPPGDSITTSLKVHGSDQTCTIPMRVGYKFVGREKYTLPPMHGTQIP
CLTYERTREKSAGYVTMHRPGQQSITMLMEESGGEVYVQPTSGRNVTYECKCGDFKTGTVTARTKIDGCTERKQCIAISA
DHVKWVFNSPDLIRHTDHTAQGKLHIPFPLQQAQCTVPLAHLPGVKHAYRSMSLTLHAEHPTLLTTRHLGENPQPTAEWI
VGSVTRNFSITIQGFEYTWGNQKPVRVYAQESAPGNPHGWPHEIVRHYYHLYPFYTVTVLSGMGLAICAGLVISILCCCK
ARRDCLTPYQLAPNATVPFLVTLCCCFQRTSADEFTDTMGYLWQHSQTMFWIQLVIPLAAVITLVRCCSCCLPFLLVASP
PNKADAYEHTITVPNAPLNSYKALVERPGYAPLNLEVMVMNTQIIPSVKREYITCRYHTVVPSPQIKCCGTVECPKGEKA
DYTCKVFTGVYPFLWGGAQCFCDSENSQLSDKYVELSTDCATDHAEAVRVHTASVKSQLRITYGNSTAQVDVFVNGVTPA
RSKDMKLIAGPLSTTFSPFDNKVIIYHGKVYNYDFPEFGAGTPGAFGDVQASSTTGSDLLANTAIHLQRPEARNIHVPYT
QAPSGFEFWKNNSGQPLSDTAPFGCKVNVNPLRADKCAVGSLPISVDIPDAAFTRVSEPLPSLLKCTVTSCTYSTDYGGV
LVLTYESDRAGQCAVHSHSSTAVLRDPSVYVEQKGETTLKFSTRSLQADFEVSMCGTRTTCHAQCQPPTEHVMNRPQKST
PDFSSAISKTSWNWITALMGGISSIAAIAAIVLVIALVFTAQHR
>P89946 ~~~~~~Structural polyprotein~~~
MDFIPTQTFYGRRWRPAPVQRYIPQPQPPAPPRRRRGPSQLQQLVAALGALALQPKQKQKRAQKKPKKTPPPKPKKTQKP
KKPTQKKKSKPGKRMRNCMKIENDCIFPVMLDGKVNGYACLVGDKVMKPAHVKGTIDNPELAKLTFKKSSKYDLECAQVP
VCMKSDASKFTHEKPEGHYNWHHGAVQFSNGRFTIPTGSGKPGDSGRPIFDNTGKVVAIVLGGANEGARTALSVVTWNKD
MVTRITPEESVEWSAAALNITALCVLQNLSFPCDAPPCAPCCYEKDPAGTLRLLSDHYYHPKYYELLDSTMHCPQGRRPK
RSVAHFEAYKATRPYIGWCADCGLAGSCPSPVSIEHVWSDADDGVLKIQVSMQIGIAKSNTINHAKIRYMGANGVQEAER
STLSVSTTAPCDILATMGHFILARCRPGSQVEVSLSTDPKLLCRTPFSHKPRFIGNEKSPAPTGHKTRIPCKTYSHQTDL
TREEITMHVPPDVPIQGLVSNTGKSYSLDPKTKTIKYKCTCGETVKEGTATNKITLFNCDTAPKCITYAVDNTVWQYNSQ
YVPRSEVTEVKGKIHVPFPLTDSTCAVSVAPEPQVTYRLGEVEFHFHPMYPTLFSIRSLGKDPSHSQEWIDTPMSKTIQV
GAEGVEYVWGNNNPVRLWAQKSSSSSAHGNPISIVSHYYDLYPYWTITVLASLGLLIVISSGFSCFLCSVARTKCLTPYQ
LAPGAQLPTFIALLCCAKSARADTLDDFSYLWTNNQAMFWLQLASPVAAFLCLSYCCRNLACCMKIFLGISGLCVIATQA
YEHSTTMPNQVGIPFKALIERPGYAGLPLSLVVIKSELVPSLVQDYITCNYKTVVPSPYIKCCGGAECSHKNEADYKCSV
FTGVYPFMWGGAYCFCDTENSQMSEVYVTRGESCEADHAIAYQVHTASLKAQVMISIGELNQTVDVFVNGDSPARIQQSK
FILGPISSAWSPFDHKVIVYRDEVYNEDYAPYGSGQAGRFGDIQSRTVNSTDVYANTNLKLKRPASGNVHVPYTQTPSGF
SYWKKEKGVPLNRNAPFGCIIKVNPVRAENCVYGNIPISMDIADAHFTRIDESPSVSLKACEVQSCTYSSDFGGVASISY
TSNKVGKCAIHSHSNSATMKDSVQDVQESGALSLFFATSSVEPNFVVQVCNARITCHGKCEPPKDHIVPYAAKHNDAEFP
SISTTAWQWLAHTTSGPLTILVVAIIVVVVVSIVVCARH
>Q8AZM0 ~~~~~~Structural polyprotein~~~
MDFSKENTQIRYLNSLLVPETGSTSIPDDTLDRHCLKTETTTENLVAALGGSGLIVLFPNSPSGLLGAHYTKTPQGSLIF
DKAITTSQDLKKAYNYARLVSRIVQVRSSTLPAGVYALNGTFNGVTYIGSLSEIKDLDYNSLLSATANINDKVGNVLVGD
GVAVLSLPAGSDLPYVRLGDEVPSSAGVARCSPSDRPRHYNANNKQVQVGTTDTKTNGFNIDATTPTEVTVDMQIAQIAA
GKTLTVTVKLMGLTGAKVASRSETVSGNGGTFHFSTTAVFGETEITQPVVGVQVLAKTNGDPIVVDSYVGVTVHGGNMPG
TLRPVTIIAYESVATGSVLTLSGISNYELIPNPELAKNIQTSYGKLNPAEMTYTKVVLSHRDELGLRSIWSIPQYRDMMS
YFREVSDRSSPLKIAGAFGWGDLLSGIRKWVFPVVDTLLPAARPLTDLASGWIKNKYPEAASGRPLAASGRPMAASGTFS
KRIPLASSDEIDYQSVLALTIPGTHPKLVPPTEREPNSTPDGHKITGAKTKDNTGGDVTVVKPLDWLFKLPCLRPQAADL
PISLLQTLAYKQPLGRNSRIVHFTDGALFPVVAFGDNHSTSELYIAVRGDHRDLMSPDVRDSYALTGDDHKVWGATHHTY
YVEGAPKKPLKFNVKTRTDLTILPVADVFWRADGSADVDVVWNDMPAVAGQSSSIALALASSLPFVPKAAYTGCLSGTNV
QPVQFGNLKARAAHKIGLPLVGMTQDGGEDTRICTLDDAADHAFDSMESTVTRPESVGHQAAFQGWFYCGAADEETIEEL
EDFLDSIELHSKPTVEQPQTEEAMELLMELARKDPQMSKILVILGWVEGAGLIDALYNWAQLDDGGVRMRNMLRNLPHEG
SKSQRRKHGPAPESRESTRMEVLRREAAAKRKKAQRISEDAMDNGFEFATIDWVLENGSRGPNPAQAKYYKATGLDPEPG
LTEFLPEPTHAPENKAAKLAATIYGSPNQAPAPPEFVEEVAAVLMENNGRGPNQAQMRELRLKALTMKSGSGAAATFKPR
NRRPAQEYQPRPPITSRAGRFLNISTTLS
>Q5XXP3 ~~~~~~Structural polyprotein~~~
MEFIPTQTFYNRRYQPRPWAPRPTIQVIRPRPRPQRQAGQLAQLISAVNKLTMRAVPQQKPRRNRKNKKQRQKKQAPQND
PKQKKQPPQKKPAQKKKKPGRRERMCMKIENDCIFEVKHEGKVMGYACLVGDKVMKPAHVKGTIDNADLAKLAFKRSSKY
DLECAQIPVHMKSDASKFTHEKPEGYYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGANEGARTAL
SVVTWNKDIVTKITPEGAEEWSLALPVLCLLANTTFPCSQPPCTPCCYEKEPESTLRMLEDNVMRPGYYQLLKASLTCSP
HRQRRSTKDNFNVYKATRPYLAHCPDCGEGHSCHSPIALERIRNEATDGTLKIQVSLQIGIKTDDSHDWTKLRYMDSHTP
ADAERAGLLVRTSAPCTITGTMGHFILARCPKGETLTVGFTDSRKISHTCTHPFHHEPPVIGRERFHSRPQHGKELPCST
YVQSTAATAEEIEVHMPPDTPDRTLMTQQSGNVKITVNGQTVRYKCNCGGSNEGLTTTDKVINNCKIDQCHAAVTNHKNW
QYNSPLVPRNAELGDRKGKIHIPFPLANVTCRVPKARNPTVTYGKNQVTMLLYPDHPTLLSYRNMGQEPNYHEEWVTHKK
EVTLTVPTEGLEVTWGNNEPYKYWPQMSTNGTAHGHPHEIILYYYELYPTMTVVIVSVASFVLLSMVGTAVGMCVCARRR
CITPYELTPGATVPFLLSLLCCVRTTKAATYYEAAAYLWNEQQPLFWLQALIPLAALIVLCNCLKLLPCCCKTLAFLAVM
SIGAHTVSAYEHVTVIPNTVGVPYKTLVNRPGYSPMVLEMELQSVTLEPTLSLDYITCEYKTVIPSPYVKCCGTAECKDK
SLPDYSCKVFTGVYPFMWGGAYCFCDAENTQLSEAHVEKSESCKTEFASAYRAHTASASAKLRVLYQGNNITVAAYANGD
HAVTVKDAKFVVGPMSSAWTPFDNKIVVYKGDVYNMDYPPFGAGRPGQFGDIQSRTPESKDVYANTQLVLQRPAAGTVHV
PYSQAPSGFKYWLKERGASLQHTAPFGCQIATNPVRAVNCAVGNIPISIDIPDAAFTRVVDAPSVTDMSCEVPACTHSSD
FGGVAIIKYTASKKGKCAVHSMTNAVTIREADVEVEGNSQLQISFSTALASAEFRVQVCSTQVHCAAACHPPKDHIVNYP
ASHTTLGVQDISTTAMSWVQKITGGVGLIVAVAALILIVVLCVSFSRH
>Q8JUX5 ~~~~~~Structural polyprotein~~~
MEFIPTQTFYNRRYQPRPWTPRPTIQVIRPRPRPQRQAGQLAQLISAVNKLTMRAVPQQKPRRNRKNKKQKQKQQAPQNN
TNQKKQPPKKKPAQKKKKPGRRERMCMKIENDCIFEVKHEGKVTGYACLVGDKVMKPAHVKGTIDNADLAKLAFKRSSKY
DLECAQIPVHMKSDASKFTHEKPEGYYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGANEGARTAL
SVVTWNKDIVTKITPEGAEEWSLAIPVMCLLANTTFPCSQPPCIPCCYEKEPEETLRMLEDNVMRPGYYQLLQASLTCSP
HRQRRSTKDNFNVYKATRPYLAHCPDCGEGHSCHSPVALERIRNEATDGTLKIQVSLQIGIGTDDSHDWTKLRYMDNHIP
ADAGRAGLFVRTSAPCTITGTMGHFILARCPKGETLTVGFTDSRKISHSCTHPFHHDPPVIGREKFHSRPQHGKELPCST
YVQSNAATAEEIEVHMPPDTPDRTLLSQQSGNVKITVNGRTVRYKCNCGGSNEGLITTDKVINNCKVDQCHAAVTNHKKW
QYNSPLVPRNAELGDRKGKIHIPFPLANVTCMVPKARNPTVTYGKNQVIMLLYPDHPTLLSYRSMGEEPNYQEEWVTHKK
EVVLTVPTEGLEVTWGNNEPYKYWPQLSANGTAHGHPHEIILYYYELYPTMTVVVVSVASFILLSMVGMAVGMCMCARRR
CITPYELTPGATVPFLLSLICCIRTAKAATYQEAAVYLWNEQQPLFWLQALIPLAALIVLCNCLRLLPCCCKTLAFLAVM
SIGAHTVSAYEHVTVIPNTVGVPYKTLVNRPGYSPMVLEMELLSVTLEPTLSLDYITCEYKTVIPSPYVKCCGTAECKDK
NLPDYSCKVFTGVYPFMWGGAYCFCDAENTQLSEAHVEKSESCKTEFASAYRAHTASASAKLRVLYQGNNITVTAYANGD
HAVTVKDAKFIVGPMSSAWTPFDNKIVVYKGDVYNMDYPPFGAGRPGQFGDIQSRTPESKDVYANTQLVLQRPAAGTVHV
PYSQAPSGFKYWLKERGASLQHTAPFGCQIATNPVRAMNCAVGNMPISIDIPDAAFTRVVDAPSLTDMSCEVPACTHSSD
FGGVAIIKYAVSKKGKCAVHSMTNAVTIREAEIEVEGNSQLQISFSTALASAEFRVQVCSTQVHCAAECHPPKDHIVNYP
ASHTTLGVQDISATAMSWVQKITGGVGLVVAVAALILIVVLCVSFSRH
>P13418 ~~~~~~Structural polyprotein~~~
ATFQDKQENSHIENEDKRLMSEQKEIVHFVSEGITPSTTALPDIVNLSTNYLDMTTREDRIHSIKDFLSGPIIIATNLWS
SSDPVEKQLYTANFPEVLISNAMYQDKLKGFVGLRATLVVKVQVNSQPFQQGRLMLQYIPYAQYMPNRVTLINETLQGRS
GCPTTDLELSVGTEVEMRIPYVSPHLYYNLITGQGSFGSIYVVVYSQLHDQVSGTGSIEYTVWAHLEDVDVQYPTGANIF
TGNSPNYLSIAERIATGDFTETEMRKLWIHKTYLKRPARIYAQAAKELKQLETNNSPSTALGQISEGLTTLSHIPVLGNI
FSTPAWISAKAADLAKLFGFSKPTVQGKIGECKLRGQGRMANFDGMDMSHKMALSSTNEIETKEGLAGTSLDEMDLSRVL
SIPNYWDRFTWKTSDVTNTVLWDNYVSPFKVKPYSATITDRFRCTHMGYVANAFTYWRGSIVYTFKFVKTQYHSGRLRIS
FIPYYYNTTISTGTPDVSRTQKIVVDLRTSTEVSFTVPYIASRPWLYCIRPESSWLSKDNKDGALMYNCVSGIVRVEVLN
QLVAAQNVFSEIDVICEVSGGPDLEFAGPTCPSYVPYAGDLTLADTRKIEAERTQEYSNNEDNRITTQCSRIVAQVMGED
QQIPRNEAQHGVHPISIDTHRISNNWSPQAMCIGEKIVSIRQLIKRFGIFGDANTLQADGSSFVVAPFTVTSPTKTLTST
RNYTQFDYYYYLYAFWRGSMRIKMVAETQDGTGTPRKKTNFTWFVRMFNSLQDSFNSLISTSSSAVTTTVLPSGTINMGP
STQVIDPTVEGLIEVEVPYYNISHITPAVTIDDGTPSMEDYLKGHSPPCLLTFSPRDSISATNHHITASFMRAPGDDFSF
MYLLEVPPLVNVARA
>Q96724 ~~~~~~Structural polyprotein~~~
MNTTNEYLKTLLNPAQFISDIPDDIMIRHVNSAQTITYNLKSGASGTGLIVVYPNTPSSISGFHYIWDSATSNWVFDQYI
YTAQELKDSYDYGRLISGSLSIKSSTLPAGVYALNGTFNAVWFQGTLSEVSDYSYDRILSITSNPLDKVGNVLVGDGIEV
LSLPQGFNNPYVRLGDKSPSTLSSPTHITNTSQNLATGGAYMIPVTTVPGQGFHNKEFSINVDSVGPVDILWSGQMTMQD
EWTVTANYQPLNISGTLIANSQRTLTWSNTGVSNGSHYMNMNNLNVSLFHENPPPEPVAAIKININYGNNTNGDSSFSVD
SSFTINVIGGATIGVNSPTVGVGYQGVAEGTAITISGINNYELVPNPDLQKNLPMTYGTCDPHDLTYIKYILSNREQLGL
RSVMTLADYNRMKMYMHVLTNYHVDEREASSFDFWQLLKQIKNVAVPLAATLAPQFAPIIGAADGLANAILGDSASGRPV
GNSASGMPISMSRRLRNAYSADSPLGEEHWLPNENENFNKFDIIYDVSHSSMALFPVIMMEHDKVIPSDPEELYIAVSLT
ESLRKQIPNLNDMPYYEMGGHRVYNSVSSNVRSGNFLRSDYILLPCYQLLEGRLASSTSPNKVTGTSHQLAIYAADDLLK
SGVLGKAPFAAFTGSVVGSSVGEVFGINLKLQLTDSLGIPLLGNSPGLVQVKTLTSLDKKIKDMGDVKRRTPKQTLPHWT
AGSASMNPFMNTNPFLEELDQPIPSNAAKPISEETRDLFLSDGQTIPSSQEKIATIHEYLLEHKELEEAMFSLISQGRGR
SLINMVVKSALNIETQSREVTGERRQRLERKLRNLENQGIYVDESKIMSRGRISKEDTELAMRIARKNQKDAKLRRIYSN
NASIQESYTVDDFVSYWMEQESLPTGIQIAMWLKGDDWSQPIPPRVQRRHYDSYIMMLGPSPTQEQADAVKDLVDDIYDR
NQGKGPSQEQARELSHAVRRLISHSLVNQPATAPRVPPRRIVSAQTAQTDPPGRRAALDRLRRVRGEDNDIV
>P08768 ~~~~~~Structural polyprotein~~~
MFPYPTLNYPPMAPINPMAYRDPNPPRQVAPFRPPLAAQIEDLRRSIANLTLKQRAPNPPAGPPAKRKKPAPSLSLETKK
KRPPPPAKKQKRKPKPGKRQRMCMKLESDKTFPIMLNGQVNGYACVVGGRVFKPLHVEGRIDNEQLAAIKLKKASIYDLE
YGDVPQCMKSDTLQYTSDKPPGFYNWHHGAVQYENNRFTVPRGVGGKGDSGRPILDNKGRVVAIVQGGVNEGSRTALSVV
TWNQKGVTVKDTPEGSEPWSLATVMCVLANITFPCDQPPCMPCCYEKNPHETLTMLEQNYDSRAYDQLLDAAVKCNARRT
RRDLDTHFTQYKLARPYIADCPNCGHSRCDSPIAIEEVRGDAHAGVIRIQTSAMFGLKRHGVDLAYMSFMNGKTQKSIKI
DNLHVRTSAPCSLVSHHGYYILAQCPPGDTVTVGFHDGPNRHTCRLAHKVEFRPVGREKYRHPPEHGVELPCNRYTHKRA
DQGHYVEMHQPGLVGDHSLLSIHSAKVKITVPSGAQVKYYCKCPDVREGITSSDHTTTCTDVKQCRAYLIDNKKWVYNSG
RLPRGEGDTFKGKLHVPFVPVKAKCIATLAPEPLVEHKHRTLILHLHPDHPTLLTTRSLGSDANPTRQWIERPTTVNFTV
TGEGLEYTWGNHPPKRVWAQESGEGNPHGWPHVVVVYYYNRYPLTTIIGLCTCVAIIMVSCDHPCGSFSGLRNLCITPYK
LAPNAQVPILLALLCCIKPTRADDTLQVLNYLWNNNQNFFWMQTLIPLAALIVCMRMLAALFCCGPAFLLVCGAWAAAYE
HTAVMPNKVGIPYKALVERPGYAPVHLQIQLVNTRIIPSTNLEYITCKYKTKVPSPVVKCCGATQCTSKPHPDYQCQVFT
GVYPFMWGGAYCFCDTENTQMSEAYVERSEECSIDHAKAYKVHTGTVQAMVNITYGSVTWRSADVYVNGETPAKIGDAKL
IIGPLSSAWSPFDNKVVVYGHEVYNYDFPEYGTGKAGSFGDLQSRTSTSNDLYANTNLKLQRPQAGIVHTPFTQAPSGFE
RWKRDKGAPLNDVAPFGCSIALEPLRPENCAVGSIPISIDIPDAAFTRISETPTVSDLECKITECTYASDFGGIATLPTN
PVKQETVQFIVHQVLQLLKRMTSPLLRAGSFTFHFSTANIHPAFKLQVCTSGITCKGDCKPPKDHIVDYPAQHTESFTSA
ISATAWSWLKVLVGGTSAFIVLGLIATAVVALVLFFHRH
>P27284 ~~~~~~Structural polyprotein~~~
MFPYPTLNYPPMAPINPMAYRDPNPPRQVAPFRPPLAAQIEDLRRSIANLTLKQRAPNPPAGPPAKRKKPAPKPKPAQAK
KKRPPPPAKKQKRKPKPGKRQRMCMKLESDKTFPIMLNGQVNGYACVVGGRVFKPLHVEGRIDNEQLAAIKLKKASIYDL
EYGDVPQCMKSDTLQYTSDKPPGFYNWHHGAVQYENNRFTVPRGVGGKGDSGRPILDNKGRVVAIVLGGVNEGSRTALSV
VTWNQKGVTVKDTPEGSEPWSLATVMCVLANITFPCDQPPCMPCCYEKNPHETLTMLEQNYDSRAYDQLLDAAVKCNARR
TRRDLDTHFTQYKLARPYIADCPNCGHSRCDSPIAIEEVRGDAHAGVIRIQTSAMFGLKTDGVDLAYMSFMNGKTQKSIK
IDNLHVRTSAPCSLVSHHGYYILAQCPPGDTVTVGFHDGPNRHTCTVAHKVEFRPVGREKYRHPPEHGVELPCNRYTHKR
ADQGHYVEMHQPGLVADHSLLSIHSAKVKITVPSGAQVKYYCKCPDVREGITSSDHTTTCTDVKQCRAYLIGNKKWVYNS
GRLPRGEGDTFKGKLHVPFVPVKAKCIATLAPEPLVEHKHRTLILHLHPDHPTLLTTRSLGSDANPTRQWIERPTTVNFT
VTGEGLEYTWGNHPPKRVWAQESGEGNPHGWPHEVVVYYYNRYPLTTIIGLCTCVAIIMVSCVHPCGSFAGLRNLCITPY
KLAPNAQVPILLALLCCIKPTRADDTLQVLNYLWNNNQNFFWMQTLIPLAALIVCMRIVRCLFCCGPAFLLVCGAWAAAY
EHTAVMPNKVGIPYKALVERPGYAPVHLQIQLVNTSIIPSTNLEYITCKYKTKVPSPVVKCCGATQCTSKPHPDYQCQVF
TGVYPFMWGGAYCFCDTENTQMSEAYVERSEECSIDHAKAYKVHTGTVQAMVNITYGSVSWRSADVYVNGETPAKIGDAK
LIIGPLSSAWSPFDNKVVVYGHEVYNYDFPEYGTGKAGSFGDLQSRTSTSNDLYANTNLKLQRPQAGIVHTPFTQAPSGF
ERWKRDKGAPLNDVAPFGCSIALEPLRAENCAVGSIPISIDIPDAAFTRISETPTVSDLECKITECTYASDFGGIATLPT
NPVKQETVQFILHQVLQLLKRMTSPLLRAGSFTFHFSTANIHPAFKLQVCTSGVTCKGDCKPPKDHIVDYPAQHTESFTS
AISATAWSWLKVLVGGTSAFIVLGLIATAVVALVLFFHRH
>Q4QXJ7 ~~~~~~Structural polyprotein~~~
MFPYPTLNYPPMAPINPMAYRDPNPPRRRWRPFRPPLAAQIEDLRRSIANLTLKQRAPNPPAGPPAKRKKPAPKPKPAQA
KKKRPPPPAKKQKRKPKPGKRQRMCMKLESDKTFPIMLNGQVNGYACVVGGRVFKPLHVEGRIDNEQLAAIKLKKASIYD
LEYGDVPQCMKSDTLQYTSDKPPGFYNWHHGAVQYENNRFTVPRGVGGKGDSGRPILDNKGRVVAIVLGGVNEGSRTALS
VVTWNQKGVTVKDTPEGSEPWSLATVMCVLANITFPCDQPPCMPCCYEKNPHETLTMLEQNYDSRAYDQLLDAAVKCNAR
RTRRDLDTHFTQYKLARPYIADCPNCGHSRCDSPIAIEEVRGDAHAGVIRIQTSAMFGLKTDGVDLAYMSFMNGKTQKSI
KIDNLHVRTSAPCSLVSHHGYYILAQCPPGDTVTVGFHDGPNRHTCTVAHKVEFRPVGREKYRHPPEHGVELPCNRYTHK
RADQGHYVEMHQPGLVADHSLLSIHSAKVKITVPSGAQVKYYCKCPDVREGITSSDHTTTCTDVKQCRAYLIDNKKWVYN
SGRLPRGEGDTFKGKLHVPFVPVKAKCIATLAPEPLVEHKHRTLILHLHPDHPTLLTTRSLGSDANPTRQWIERPTTVNF
TVTGEGLEYTWGNHPPKRVWAQESGEGNPHGWPHEVVVYYYNRYPLTTIIGLCTCVAIIMVSCVTSVWLLCRTRNLCITP
YKLAPNAQVPILLALLCCIKPTRADDTLQVLNYLWNNNQNFFWMQTLIPLAALIVCMRMLRCLFCCGPAFLLVCGALGAA
AYEHTAVMPNKVGIPYKALVERPGYAPVHLQIQLVNTRIIPSTNLEYITCKYKTKVPSPVVKCCGATQCTSKPHPDYQCQ
VFTGVYPFMWGGAYCFCDTENTQMSEAYVERSEECSIDHAKAYKVHTGTVQAMVNITYGSVSWRSADVYVNGETPAKIGD
AKLIIGPLSSAWSPFDNKVVVYGHEVYNYDFPEYGTGKAGSFGDLQSRTSTSNDLYANTNLKLQRPQAGIVHTPFTQAPS
GFERWKRDKGAPLNDVAPFGCSIALEPLRAENCAVGSIPISIDIPDAAFTRISETPTVSDLECKITECTYASDFGGIATV
AYKSSKAGNCPIHSPSGVAVIKENDVTLAESGSFTFHFSTANIHPAFKLQVCTSAVTCKGDCKPPKDHIVDYPAQHTESF
TSAISATAWSWLKVLVGGTSAFIVLGLIATAVVALVLFFHRH
>P05674 ~~~~~~Structural polyprotein~~~
MFPFQPMYPMQPMPYRNPFAAPRRPWFPRTDPFLAMQVQELTRSMANLTFKQRRDAPPEGPSAKKPKKEASQKQKGGGQG
KKKKNQGKKKAKTGPPNPKAQNGNKKKTNKKPGKRQRMVMKLESDKTFPIMLEGKINGYACVVGGKLFRPMHVEGKIDND
VLAALKTKKASKYDLEYADVPQNMRADTFKYTHEKPQGYYSWHHGAVQYENGRFTVPKGVGAKGDSGRPILDNQGRVVAI
VLGGVNEGSRTALSVVMWNEKGVTVKYTPENCEQWSLVTTMCLLANVTFPCAQPPICYDRKPAETLAMLSVNVDNPGYDE
LLEAAVKCPGRKRRSTEELFNEYKLTRPYMARCIRCAVGSCHSPIAIEAVKSDGHDGYVRLQTSSQYGLDSSGNLKGRTM
RYDMHGTIKEIPLHQVSLYTSRPCHIVDGHGYFLLARCPAGDSITMEFKKDSVRHSCSVPYEVKFNPVGRELYTHPPEHG
VEQACQVYAHDAQNRGAYVEMHLPGSEVDSSLVSLSGSSVTVTPPDGTSALVECECGGTKISETINKTKQFSQCTKKEQC
RAYRLQNDKWVYNSDKLPKAAGATLKGKLHVPFLLADGKCTVPLAPEPMITFGFRSVSLKLHPKNPTYLITRQLADEPHY
THELISEPAVRNFTVTEKGWEFVWGNHPPKRFWAQETAPGNPHGLPHEVITHYYHRYPMSTILGLSICAAIATVSVAAST
WLFCRSRVACLTPYRLTPNARIPFCLAVLCCARTARAETTWESLDHLWNNNQQMFWIQLLIPLAALIVVTRLLRCVCCVV
PFLVMAGAAAPAYEHATTMPSQAGISYNTIVNRAGYAPLPISITPTKIKLIPTVNLEYVTCHYKTGMDSPAIKCCGSQEC
TPTYRPDEQCKVFTGVYPFMWGGAYCFCDTENTQVSKAYVMKSDDCLADHAEAYKAHTASVQAFLNITVGEHSIVTTVYV
NGETPVNFNGVKITAGPLSTAWTPFDRKIVQYAGEIYNYDFPEYGAGQPGAFGDIQSRTVSSSDLYANTNLVLQRPKAGA
IHVPYTQAPSGFEQWKKDKAPSLKFTAPFGCEIYTNPIRAENCAVGSIPLAFDIPDALFTRVSETPTLSAAECTLNECVY
SSDFGGIATVKYSASKSGKCAVHVPSGTATLKEAAVELTEQGSATIHFSTANIHPEFRLQICTSYVTCKGDCHPPKDHIV
THPQYHAQTFTAAVSKTAWTWLTSLLGGSAVIIIIGLVLATIVAMYVLTNQKHN
>P36330 ~~~~~~Structural polyprotein~~~
MFPFQPMYPMQPMPYRNPFAAPRRPWFPRTDPFLAMQVQELTRSMANLTFKQRRGAPPEGPPAKKSKREAPQKQRGGQRK
KKKNEGKKKAKTGPPNLKTQNGNKKKTNKKPGKRQRMVMKLESDKTFPIMLEGKINGYACVVGGKLFRPMHVEGKIDNDV
LAALKTKKASKYDLEYADVPQNMRADTFKYTHEKPQGYYSWHHGAVQYENGRFTVPRGVGARGDSGRPILDNQGRVVAIV
LGGVNEGSRTALSVVMWNEKGVTVKYTPENCEQWSLVTTMCLLANVTFPCAQPPICYDRKPAETLAMLSANVDNPGYDEL
LKAAVTCPGRKRRSTEELFKEYKLTRPYMARCVRCAVGSCHSPIAIEAVKSDGHDGYVRLQTSSQYGLDPSGNLKSRTMR
YNMYGTIEEIPLHQVSLHTSRPCHIVDGHGYFLLARCPAGDSITMEFKKDSVTHSCSVPYEVKFNPVGRELYTHPPEHGA
EQACQVYAHDAQNRGAYVEMHLPGSEVDSSLVSLSSGLVSVTPPAGTSALVECECSGTTISKTINKTKQFSQCTKKEQCR
AYRLQNDKWVYNSDKLPKAAGATLKGKLHVPFLLADGKCTVPLAPEPMITFGFRSVSLKLHPKYPTYLTTRELADEPHYT
HELISEPSVRNFSVTAKGWEFVWGNHPPKRFWAQETAPGNPHGLPHEVIVHYYHRYPMSTITGLSICAAIVAVSIAASTW
LLCRSRASCLTPYRLTPNAKMPLCLAVLCCARSARAETTWESLDHLWNNNQQMFWTQLLIPLAALIVVTRLLKCMCCVVP
FLVVAGAAGAGAYEHATTMPNQAGISYNTIVNRAGYAPLPISITPTKIKLIPTVNLEYVTCHYKTGMDSPTIKCCGSQEC
TPTYRPDEQCKVFAGVYPFMWGGAYCFCDTENTQISKAYVMKSEDCLADHAAAYKAHTASVQALLNITVGEHSTVTTVYV
NGETPVNFNGVKLTAGPLSTAWTPFDRKIVQYAGEIYNYDFPEYGAGQPGAFGDIQLRTVSSSDLYANTNLVLQRPKAGA
IHVPYTQAPSGFEQWKKDKAPSLKFTAPFGCEIYTNPIRAENCAVGSIPLAFDIPDALFTRVSETPTLSAAECTLNECVY
SSDFGGIATVKYSASKSGKCAVHVPSGTATLKEASVELAEQGSVTIHFSTANIHPEFRLQICTSFVTCKGDCHPPKDHIV
THPQYHAQTFTAAVSKTAWTWLTSLLGGSAVIIIIGLVLATLVAMYVLTNQKHN
>P36331 ~~~~~~Structural polyprotein~~~
MFPYQPMYPMQPMPFRNPFAAPRRPWFPRTDPFLAMQVQELARSMANLTFKQRRDVPPEGPPAKKKKKDTSQQGGRNQNG
KKKNKLVKKKKKTGPPPQKTNGGKKKVNKKPGKRQRMVMKLESDKTFPIMLDGRINGYACVVGGKLFRPLHVEGKIDNDV
LSSLKTKKASKYDLEYADVPQSMRADTFKYTHEKPQGYYSWHHGAVQYENGRFTVPKGVGAKGDSGRPILDNQGRVVAIV
LGGVNEGSRTALSVVTWNEKGVTVKYTPENSEQWSLVTTMCLLANVTFPCSQPPICYDRKPAETLSMLSHNIDNPGYDEL
LEAVLKCPGRGKRSTEELFKEYKLTRPYMARCIRCAVGSCHSPIAIEAVRSEGHDGYVRLQTSSQYGLDPSGNLKGRTMR
YDMHGTIEEIPLHQVSLHTSRPCHIIDGHGYFLLARCPAGDSITMEFKKESVTHSCSVPYEVKFNPVGRELYTHPPEHGA
EQPCHVYAHDAQNRGAYVEMHLPGSEVDSTLLSTSGSSVHVTPPAGQSVLVECECGGTKISETINSAKQYSQCSKTAQCR
AYRTQNDKWVYNSDKLPKAAGETLKGKLHVPFVLTEAKCTVPLAPEPIITFGFRSVSLKLHPKNPTFLTTRQLDGEPAYT
HELITNPVVRNFSVTEKGWEFVWGNHPPQRYWSQETAPGNPHGLPHEVITHYYHRYPMSTILGLSICAAIVTTSIAASVW
LFCKSRISCLTPYRLTPNARMPLCLAVLCCARTARAETTWESLDHLWNHNQQMFWSQLLIPLAALIVATRLLKCVCCVVP
FLVVAGAVGAGAYEHATTMPNQVGIPYNTIVNRAGYAPLPISIVPTKVKLIPTVNLEYITCHYKTGMDSPAIKCCGTQEC
SPTYRPDEQCKVFSGVYPFMWGGAYCFCDTENTQISKAYVTKSEDCVTDHAQAYKAHTASVQAFLNITVGGHSTTAVVYV
NGETPVNFNGVKLTAGPLSTAWSPFDKKIVQYAGEIYNYDFPEYGAGHAGAFGDIQARTISSSDVYANTNLVLQRPKAGA
IHVPYTQAPSGYEQWKKDKPPSLKFTAPFGCEIYTNPIRAENCAVGSIPLAFDIPDALFTRVSETPTLSTAECTLNECVY
SSDFGGIATVKYSASKSGKCAVHVPSGTATLKEAAVELAEQGSATIHFSTASIHPEFRLQICTSYVTCKGDCHPPKDHIV
THPQYHAQSFTAAVSKTAWTWLTSLLGGSAIIIIIGLVLATIVAMYVLTNQKHN
>P09592 ~~~~~~Structural polyprotein~~~
MFPFQPMYPMQPMPYRNPFAAPRRPWFPRTDPFLAMQVQELTRSMANLTFKQRRDAPPEGPSAKKPKKEASQKQKGGGQG
KKKKNQGKKKAKTGPPNPKAQNGNKKKTNKKPGKRQRMVMKLESDKTFPIMLEGKINGYACVVGGKLFRPMHVEGKIDND
VLAALKTKKASKYDLEYADVPQNMRADTFKYTHEKPQGYYSWHHGAVQYENGRFTVPKGVGAKGDSGRPILDNQGRVVAI
VLGGVNEGSRTALSVVMWNEKGVTVKYTPENCEQWSLVTTMCLLANVTFPCAQPPICYDRKPAETLAMLSVNVDNPGYDE
LLEAAVKCPGRKRRSTEELFKEYKLTRPYMARCIRCAVGSCHSPIAIEAVKSDGHDGYVRLQTSSQYGLDSSGNLKGRTM
RYDMHGTIKEIPLHQVSLHTSRPCHIVDGHGYFLLARCPAGDSITMEFKKDSVTHSCSVPYEVKFNPVGRELYTHPPEHG
VEQACQVYAHDAQNRGAYVEMHLPGSEVDSSLVSLSGSSVTVTPPVGTSALVECECGGTKISETINKTKQFSQCTKKEQC
RAYRLQNDKWVYNSDKLPKAAGATLKGKLHVPFLLADGKCTVPLAPEPMITFGFRSVSLKLHPKNPTYLTTRQLADEPHY
THELISEPAVRNFTVTEKGWEFVWGNHPPKRFWAQETAPGNPHGLPHEVITHYYHRYPMSTILGLSICAAIATVSVAAST
WLFCRSRVACLTPYRLTPNARIPFCLAVLCCARTARAETTWESLDHLWNNNQQMFWIQLLIPLAALIVVTRLLRCVCCVV
PFLVMAGAAAGAYEHATTMPSQAGISYNTIVNRAGYAPLPISITPTKIKLIPTVNLEYVTCHYKTGMDSPAIKCCGSQEC
TPTYRPDEQCKVFTGVYPFMWGGAYCFCDTENTQVSKAYVMKSDDCLADHAEAYKAHTASVQAFLNITVGEHSIVTTVYV
NGETPVNFNGVKLTAGPLSTAWTPFDRKIVQYAGEIYNYDFPEYGAGQPGAFGDIQSRTVSSSDLYANTNLVLQRPKAGA
IHVPYTQAPSGFEQWKKDKAPSLKFTAPFGCEIYTNPIRAENCAVGSIPLAFDIPDALFTRVSETPTLSAAECTLNECVY
SSDFGGIATVKYSASKSGKCAVHVPSGTATLKEAAVELTEQGSATIHFSTANIHPEFRLQICTSYVTCKGDCHPPKDHIV
THPQYHAQTFTAAVSKTAWTWLTSLLGGSAVIIIIGLVLATIVAMYVLTNQKHN
>Q5Y388 ~~~~~~Structural polyprotein~~~
MNYIPTQTFYGRRWRPRPAYRPWRVPMQPAPPMVIPELQTPIVQAQQMQQLISAVSALTTKQNGKAPKKPKKKPQKAKAK
KNEQQKKNENKKPPPKQKNPAKKKKPGKRERMCMKIENDCIFEVKLDGKVTGYACLVGDKVMKPAHVKGVIDNPDLAKLT
YKKSSKYDLECAQIPVHMKSDASKYTHEKPEGHYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGAN
EGARTALSVVTWTKDMVTRYTPEGTEEWSAALMMCVLANVTFPCSEPACAPCCYEKQPEQTLRMLEDNVDRPGYYDLLEA
TMTCNNSARHRRSVTKHFNVYKATKPYLAYCADCGDGQFCYSPVAIEKIRDEASDGMIKIQVAAQIGINKGGTHEHNKIR
YIAGHDMKEANRDSLQVHTSGVCAIRGTMGHFIVAYCPPGDELKVQFQDAESHTQACKVQYKHAPAPVGREKFTVRPHFG
IEVPCTTYQLTTAPTEEEIDMHTPPDIPDITLLSQQSGNVKITAGGKTIRYNCTCGSGNVGTTSSDKTINSCKIAQCHAA
VTNHDKWQYTSSFVPRADQLSRKGKVHVPFPLTNSTCRVPVARAPGVTYGKRELTVKLHPDHPTLLTYRSLGADPRPYEE
WIDRYVERTIPVTEDGIEYRWGNNPPVRLWAQLTTEGKPHGWPHEIILYYYGLYPAATIAAVSAAGLAVVLSLLASCYMF
ATARRKCLTPYALTPGAVVPVTLGVLCCAPRAHAASFAESMAYLWDENQTLFWLELATPLAAIIILVCCLKNLLCCCKPL
SFLVLVSLGTPVVKSYEHTATIPNVVGFPYKAHIERNGFSPMTLQLEVLGTSLEPTLNLEYITCEYKTVVPSPYIKCCGT
SECRSMERPDYQCQVYTGVYPFMWGGAYCFCDTENTQLSEAYVDRSDVCKHDHAAAYKAHTAAMKATIRISYGNLNQTTT
AFVNGEHTVTVGGSRFTFGPISTAWTPFDNKIVVYKNDVYNQDFPPYGSGQPGRFGDIQSRTVESKDLYANTALKLSRPS
SGTVHVPYTQTPSGFKYWIKERGTSLNDKAPFGCVIKTNPVRAENCAVGNIPVSMDIPDTAFTRVIDAPAVTNLECQVAV
CTHSSDFGGIATLTFKTDKPGKCAVHSHSNVATIQEAAVDIKTDGKITLHFSTASASPAFKVSVCSAKTTCMAACEPPKD
HIVPYGASHNNQVFPDMSGTAMTWVQRVAGGLGGLTLAAVAVLILVTCVTMRR
>P61825 ~~~~~~Structural polyprotein~~~
MTNLQDQTQQIVPFIRSLLMPTTGPASIPDDTLEKHTLRSETSTYNLTVGDTGSGLIVFFPGFPGSIVGAHYTLQSNGNY
KFDQMLLTAQNLPASYNYCRLVSRSLTVRSSTLPGGVYALNGTINAVTFQGSLSELTDVSYNGLMSATANINDKIGNVLV
GEGVTVLSLPTSYDLGYVRLGDPIPAIGLDPKMVATCDSSDRPRVYTITAADDYQFSSQYQPGGVTITLFSANIDAITSL
SVGGELVFQTSVHGLVLGATIYLIGFDGTAVITRAVAANNGLTTGTDNLLPFNLVIPTNEITQPITSIKLEIVTSKSGGQ
AGDQMSWSARGSLAVTIHGGNYPGALRPVTLVAYERVATGSVVTVAGVSNFELIPNPELAKNLVTEYGRFDPGAMNYTKL
ILSERDRLGIKTVWPTREYTDFREYFMEVADLNSPLKIAGAFGFKDIIRAIRRIAVPVVSTLFPPAAPLAHAIGEGVDYL
LGDEAQAASGTARAASGKARAASGRIRQLTLAADKGYEVVANLFQVPQNPVVDGILASPGVLRGAHNLDCVLREGATLFP
VVITTVEDAMTPKALNSKMFAVIEGVREDLQPPSQRGSFIRTLSGHRVYGYAPDGVLPLETGRDYTVVPIDDVWDDSIML
SKDPIPPIVGNSGNLAIAYMDVFRPKVPIHVAMTGALNACGEIEKVSFRSTKLATAHRLGLKLAGPGAFDVNTGPNWATF
IKRFPHNPRDWDRLPYLNLPYLPPNAGRQYHLAMAASEFKETPELESAVRAMEAAANVDPLFQSALSVFMWLEENGIVTD
MANFALSDPNAHRMRNFLANAPQAGSKSQRAKYGTAGYGVEARGPTPEEAQREKDTRISKKMETMGIYFATPEWVALNGH
RGPSPGQLKYWQNTREIPDPNEDYLDYVHAEKSRLASEEQILRAATSIYGAPGQAEPPQAFIDEVAKVYEINHGRGPNQE
QMKDLLLTAMEMKHRNPRRALPKPKPKPNAPTQRPPGRLGRWIRTVSDEDLE
>Q9WI42 ~~~~~~Structural polyprotein~~~
MTNLSDQTQQIVPFIRSLLMPTTGPASIPDDTLEKHTLRSETSTYNLTVGDTGSGLIVFFPGFPGSIVGAHYTLQGNGNY
KFDQMLLTAQNLPASYNYCRLVSRSLTVRSSTLPGGVYALNGTINAVTFQGSLSELTDVSYNGLMSATANINDKIGNVLV
GEGVTVLSLPTSYDLGYVRLGDPIPAIGLDPKMVATCDSSDRPRVYTITAADDYQFSSQYQPGGVTITLFSANIDAITSL
SVGGELVFRTSVHGLVLGATIYLIGFDGTTVITRAVAANNGLTTGTDNLMPFNLVIPTNEITQPITSIKLEIVTSKSGGQ
AGDQMSWSARGSLAVTIHGGNYPGALRPVTLVAYERVATGSVVTVAGVSNFELIPNPELAKNLVTEYGRFDPGAMNYTKL
ILSERDRLGIKTVWPTREYTDFREYFMEVADLNSPLKIAGAFGFKDIIRAIRRIAVPVVSTLFPPAAPLAHAIGEGVDYL
LGDEAQAASGTARAASGKARAASGRIRQLTLAADKGYEVVANLFQVPQNPVVDGILASPGVLRGAHNLDCVLREGATLFP
VVITTVEDAMTPKALNSKMFAVIEGVREDLQPPSQRGSFIRTLSGHRVYGYAPDGVLPLETGRDYTVVPIDDVWDDSIML
SKDPIPPIVGNSGNLAIAYMDVFRPKVPIHVAMTGALNACGEIEKVSFRSTKLATAHRLGLRLAGPGAFDVNTGPNWATF
IKRFPHNPRDWDRLPYLNLPYLPPNAGRQYHLAMAASEFKETPELESAVRAMEAAANVDPLFQSALSVFMWLEENGIVTD
MANFALSDPNAHRMRNFLANAPQAGSKSQRAKYGTAGYGVEARGPTPEEAQREKDTRISKKMETMGIYFATPEWVALNGH
RGPSPGQVKYWQNKREIPDPNEDYLDYVHAEKSRLASEEQILRAATSIYGAPGQAEPPQAFIDEVAKVYEINHGRGPNQE
QMKDLLLTAMEMKHRNPRRALPKPKPKPNAPTQRPPGRLGRWIRTVSDEDLE
>P15480 ~~~~~~Structural polyprotein~~~
MTNLQDQTQQIVPFIRSLLMPTTGPASIPDDTLEKHTLRSETSTYNLTVGDTGSGLIVFFPGFPGSIVGAHYTLQSNGNY
KFDQMLLTAQNLPASYNYCRLVSRSLTVRSSTLPGGVYALNGTINAVTFQGSLSELTDVSYNGLMSATANINDKIGNVLV
GEGVTVLSLPTSYDLGYVRLGDPIPAIGLDPKMVATCDSSDRPRVYTITAADDYQFSSQYQPGGVTITLFSANIDAITSL
SVGGELVFQTSVHGLVLGATIYLIGFDGTTVITRAVAANNGLTTGTDNLMPFNLVISTNEITQPITSIKLEIVTSKSGGQ
AGDQMSWSAKGSLAVTIHGGNYPGALRPVTLVAYERVATGSVVTVAGVSNFELIPNPELAKNLVTEYGRFDPGAMNYTKL
ILSERDRLGIKTVWPTREYTDFREYFMEVADLNSPLKIAGAFGFKDIIRAIRRIAVPVVSTLFPPAAPLAHAIGEGVDYL
LGDEAQAASGTARAASGKARAASGRIRQLTLAADKGYEVVANLFQVPQNPVVDGILASPGVLRGAHNLDCVLREGATLFP
VVITTVEDAMTPKALNSKMFAVIEGVREDLQPPSQRGSFIRTLSGHRVYGYAPDGVLPLETGRDYTVVPIDDVWDDSIML
SKDPIPPIVGNSGNLAIAYMDVFRPKVPIHVAMTGALNACGEIEKVSFRSTKLATAHRLGLKLAGPGAFDVNTGPNWATF
IKRFPHNPRDWDRLPYLNLPYLPPNAGRQYHLAMAASEFKETPELESAVRAMEAAANVDPLFQSALSVFMWLEENGIVTD
MANFALSDPNAHRMRNFLANAPQAGSKSQRAKYGTAGYGVEARGPTPEEAQREKDTRISKKMETMGIYFATPEWVALNGH
RGPSPGQLKYWQNTREIPDPNEDYLDYVHAEKSRLASEEQILRAATSIYGAPGQAEPPQAFIDEVAKVYEINHGRGPNQE
QMKDLLLTAMEMKHRNPRRALPKPKPKPNAPTQRPPGRLGRWIRTVSDEDLE
>P27276 ~~~~~~Structural polyprotein~~~
MTNLMDHTQQIVPFIRSLLMPTTGPASIPDDTLEKHTLRSETSTYNLTVGDTGSGLIVFFPGFPGSVVGAHYTLQSSGSY
QFDQMLLTAQNLPVSYNYCRLVSRSLTVRSSTLPGGVYALNGTINAVTFQGSLSELTDYSYNGLMSATANINDKIGNVLV
GEGVTVLSLPTSYDLSYVRLGDPIPAAGLDPKLMATCDSSDRPRVYTVTAADEYQFSSQLIPSGVKTTLFTANIDALTSL
SVGGELIFSQVTIHSIEVDVTIYFIGFDGTEVTVKAVATDFGLTTGTNNLVPFNLGGPTSEITQPITSMKLEVVTYKRGG
TAGDPISWTVSGTLAVTVHGGNYPGALRPVTLVAYERVAAGSVVTVAGVSNFELIPNPELAKNLVTEYGRFDPGAMNYTK
LILSERDRLGIKTVWPTREYTDFREYFMEVADLNSPLKIAGAFGFKDIIRAIRKIAVPVVSTLFPPAAPLAHANREGVDY
LLGDEAQAASGTARGASGKARAASGRIRQLTLAADKGYEVVANMFQVPQNPIVDGILASPGILRGAHNLDCVSKEGATLF
PVVITTLEDELTPKALNSKMFAVIEGAREDLQPPSQRGSFIRTLSGHRVYGYAPDGVLPLETGRDYTVVPIDDVWDDSIM
LSQDPIPPIVGNSGNLAIAYMDVFRPKVPIHVAMTGALNASEIESVSFRSTKLATAHRLGMKLAGPGDYDINTGPNWATF
IKRFPHNPRGWDRLPYLNLPYLPPTAGRQFHLALAASEFKETPELEDAVRAMDAAANADPLFRSALQVFMWLEENGIVTD
MANFALSDPNAHRMKNFLANAPQAGSKSQRAKYGTAGYGVEARGPTPEEAQRAKDARISKKMETMGIYFATPEWVALNGH
RGPSPGQLKYWQNTREIPEPNEDYPDYVHAEKSRLASEEQILRAATSIYGAPGQAEPPQAFIDEVARVYETNHGRVPNQE
QMKDLLLTAMEMKHRNPRRAPPKPKPKPNAPSQRPPGRLGRWIRTVSDEDLE
>Q82635 ~~~~~~Structural polyprotein~~~
MTNLQDQTQQIVPFIRSLLMPTTGPASIPDDTLEKHTLRSETSTYNLTVGDTGSGLIVFFPGFPGSIVGAHYTLQSNGNY
KFDQMLLTAQNLPASYNYCRLVSRSLTVRSSTLPGGVYALNGTINAVTFQGSLSELTDVSYNGLMSATANINDKIGNVLV
GEGVTVLSLPTSYDLGYVRLGDPIPAIGLDPKMVATCDSSDRPRVYTITAADDYQFSSQYQAGGVTITLFSANIDAITSL
SIGGELVFQTSVQGLILGATIYLIGFDGTAVITRAVAADNGLTAGTDNLMPFNIVIPTSEITQPITSIKLEIVTSKSGGQ
AGDQMSWSASGSLAVTIHGGNYPGALRPVTLVAYERVATGSVVTVAGVSNFELIPNPELAKNLVTEYGRFDPGAMNYTKL
ILSERDRLGIKTVWPTREYTDFREYFMEVADLNSPLKIAGAFGFKDIIRALRRIAVPVVSTLFPPAAPLAHAIGEGVDYL
LGDEAQAASGTARAASGKARAASGRIRQLTLAADKGYEVVANLFQVPQNPVVDGILASPGILRGAHNLDCVLREGATLFP
VVITTVEDAMTPKALNSKMFAVIEGVREDLQPPSQRGSFIRTLSGHRVYGYAPDGVLPLETGRVYTVVPIDGVWDDSIML
SKDPIPPIVGSSGNLAIAYMDVFRPKVPIHVAMTGALNAYGEIENVSFRSTKLATAHRLGLKLAGPGAFDVNTGSNWATF
IKRFPHNPRDWDRLPYLNLPYLPPNAGRQYDLAMAASEFKETPELESAVRAMEAAANVDPLFQSALSVFMWLEENGIVTD
MANFALSDPNAHRMRNFLANAPQAGSKSQRAKYGTAGYGVEARGPTPEGAQREKDTRISKKMETMGIYFATPEWVALNGH
RGPSPGQLKYWQNTREIPDPNEDYLDYVHAEKSRLASEGQILRAATSIYGAPGQAEPPQAFIDEVAKVYEVNHGRGPNQE
QMKDLLLTAMEMKHRNPRRAPPKPKPKPNVPTQRPPGRLGRWIRAVSDEDLE
>P05844 ~~~~~~Structural polyprotein~~~
MSTSKATATYLRSIMLPENGPASIPDDITERHILKQETSSYNLEVSESGSGLLVCFPGAPGSRVGAHYRWNLNQTALEFD
QWLETSQDLKKAFNYGRLISRKYDIQSSTLPAGLYALNGTLNAATFEGSLSEVESLTYNSLMSLTTNPQDKVNNQLVTKG
ITVLNLPTGFDKPYVRLEDETPQGPQSMNGARMRCTAAIAPRRYEIDLPSERLPTVAATGTPTTIYEGNADIVNSTAVTG
DITFQLEAEPVNETRFDFILQFLGLDNDVPVVTVTSSTLVTADNYRGASAKFTQSIPTEMITKPITRVKLAYQLNQQTAI
ANAATLGAKGPASVSFSSGNGNVPGVLRPITLVAYEKMTPQSILTVAGVSNYELIPNPDLLKNMVTKYGKYDPEGLNYAK
MILSHREELDIRTVWRTEEYKERTRAFKEITDFTSDLPTSKAWGWRDLVRGIRKVAAPVLSTLFPMAAPLIGAADQFIGD
LTKTNSAGGRYLSHAAGGRYHDVMDSWASGSEAGSYSKHLKTRLESNNYEEVELPKPTKGVIFPVVHTVESAPGEAFGSL
VVVIPEAYPELLDPNQQVLSYFKNDTGCVWGIGEDIPFEGDDMCYTALPLKEIKRNGNIVVEKIFAGPAMGPSSQLALSL
LVNDIDEGIPRMVFTGEIADDEETVIPICGVDIKAIAAHEHGLPLIGCQPGVDEMVANTSLASHLIQGGALPVQKAQGAC
RRIKYLGQLMRTTASGMDAELQGLLQATMARAKEVKDAEVFKLLKLMSWTRKNDLTDHMYEWSKEDPDAIKFGRLVSTPP
KHQEKPKGPDQHTAQEAKATRISLDAVKAGADFASPEWIAENNYRGPSPGQFKYYMITGRVPNPGEEYEDYVRKPITRPT
DMDKIRRLANSVYGLPHQEPAPDDFYQAVVEVFAENGGRGPDQDQMQDLRDLARQMKRRPRPAETRRQTKTPPRAATSSG
SRFTPSGDDGEV
>Q703G9 ~~~~~~Structural polyprotein~~~
MNTNKATATYLKSIMLPETGPASIPDDITERHILKQETSSYNLEVSESGSGVLVCFPGAPGSRIGAHYRWNANQTGLEFD
QWLETSQDLKKAFNYGRLISRKYDIQSSTLPAGLYALNGTLNAATFEGSLSEVESLTYNSLMSLTTNPQDKVNNQLVTKG
VTVLNLPTGFDKPYVRLEDETPQGLQSMNGAKMRCTAAIAPRRYEIDLPSQRLPPVPATGTLTTLYEGNADIVNSTTVTG
DINFSLAEQPADETKFDFQLDFMGLDNDVPVVTVVSSVLATNDNYRGVSAKMTQSIPTENITKPITRVKLSYKINQQTAI
GNVATLGTMGPASVSFSSGNGNVPGVLRPITLVAYEKMTPLSILTVAGVSNYELIPNPELLKNMVTRYGKYDPEGLNYAK
MILSHREELDIRTVWRTEEYKERTRVFNEITDFSSDLPTSKAWGWRDIVRGIRKVAAPVLSTLFPMAAPLIGMADQFIGD
LTKTNAAGGRYHSMAAGGRYKDVLESWASGGPDGKFSRALKNRLESANYEEVELPPPSKGVIVPVVHTVKSAPGEAFGSL
AIIIPGEYPELLDANQQVLSHFANDTGSVWGIGEDIPFEGDNMCYTALPLKEIKRNGNIVVEKIFAGPIMGPSAQLGLSL
LVNDIEDGVPRMVFTGEIADDEETIIPICGVDIKAIAAHEQGLPLIGNQPGVDEEVRNTSLAAHLIQTGTLPVQRAKGSN
KRIKYLGELMASNASGMDEELQRLLNATMARAKEVQDAEIYKLLKLMAWTRKNDLTDHMYEWSKEDPDALKFGKLISTPP
KHPEKPKGPDQHHAQEARATRISLDAVRAGADFATPEWVALNNYRGPSPGQFKYYLITGREPEPGDEYEDYIKQPIVKPT
DMNKIRRLANSVYGLPHQEPAPEEFYDAVAAVFAQNGGRGPDQDQMQDLRELARQMKRRPRNADAPRRTRAPAEPAPPGR
SRFTPSGDNAEV
>P0DOH8 ~~~~~~Structural polyprotein~~~
MTKKPGGPGKNRAINMLKRGLPRVFPLVGVKRVVMSLLDGRGPVRFVLALITFFKFTALAPTKALLGRWKAVEKSVAMKH
LTSFKRELGTLIDAVNKRGRKQNKRGGNEGSIMWLASLAVVIAYAGAMKLSNFQGKLLMTINNTDIADVIVIPTSKGENR
CWVRAIDVGYMCEDTITYECPKLTMGNDPEDVDCWCDNQEVYVQYGRCTRTRHSKRSRRSVSVQTHGESSLVNKKEAWLD
STKATRYLMKTENWIIRNPGYAFLAATLGWMLGSNNGQRVVFTILLLLVAPAYSFNCLGMGNRDFIEGASGATWVDLVLE
GDSCLTIMANDKPTLDVRMINIEASQLAEVRSYCYHASVTDISTVARCPTTGEAHNEKRADSSYVCKQGFTDRGWGNGCG
LFGKGSIDTCAKFSCTSKAIGRTIQPENIKYEVGIFVHGTTTSENHGNYSAQVGASQAAKFTITPNAPSITLKLGDYGEV
TLDCEPRSGLNTEAFYVMTVGSKSFLVHREWFHDLALPWTSPSSTAWRNRELLMEFEEAHATKQSVVALGSQEGGLHQAL
AGAIVVEYSSSVKLTSGHLKCRLKMDKLALKGTTYGMCTEKFSFAKNPADTGHGTVVIELSYSGSDGPCKIPIVSVASLN
DMTPVGRLVTVNPFVATSSANSKVLVEMEPPFGDSYIVVGRGDKQINHHWHKAGSTLGKAFSTTLKGAQRLAALGDTAWD
FGSIGGVFNSIGKAVHQVFGGAFRTLFGGMSWITQGLMGALLLWMGVNARDRSIALAFLATGGVLVFLATNVHADTGCAI
DITRKEMRCGSGIFVHNDVEAWVDRYKYLPETPRSLAKIVHKAHKEGVCGVRSVTRLEHQMWEAVRDELNVLLKENAVDL
SVVVNKPVGRYRSAPKRLSMTQEKFEMGWKAWGKSILFAPELANSTFVVDGPETKECPDEHRAWNSMQIEDFGFGITSTR
VWLKIREESTDECDGAIIGTAVKGHVAVHSDLSYWIESRYNDTWKLERAVFGEVKSCTWPETHTLWGDGVEESELIIPHT
IAGPKSKHNRREGYKTQNQGPWDENGIVLDFDYCPGTKVTITEDCGKRGPSVRTTTDSGKLITDWCCRSCSLPPLRFRTE
NGCWYGMEIRPVRHDETTLVRSQVDAFNGEMVDPFSAGPSGDVSGHPGGPSQEVDGQIDHSCGFGGPTCADAWGHHLH
>Q8QZ72 ~~~~~~Structural polyprotein~~~
MDFLPTQVFYGRRWRPRMPPRPWRPRMPTMQRPDQQARQMQQLIAAVSTLALRQNAAAPQRGKKKQPRRKKPKPQPEKPK
KQEQKPKQKKAPKRKPGRRERMCMKIEHDCIFEVKHEGKVTGYACLVGDKVMKPAHVPGVIDNADLARLSYKKSSKYDLE
CAQIPVAMKSDASKYTHEKPEGHYNWHYGAVQYTGGRFTVPTGVGKPGDSGRPIFDNKGPVVAIVLGGANEGTRTALSVV
TWNKDMVTKITPEGTVEWAASTVTAMCLLTNISFPCFQPSCAPCCYEKGPEPTLRMLEENVNSEGYYDLLHAAVYCRNSS
RSKRSTANHFNAYKLTRPYVAYCADCGMGHSCHSPAMIENIQADATDGTLKIQFASQIGLTKTDTHDHTKIRYAEGHDIA
EAARSTLKVHSSSECTVTGTMGHFILAKCPPGERISVSFVDSKNEHRTCRIAYHHEQRLIGRERFTVRPHHGIELPCTTY
QLTTAETSEEIDMHMPPDIPDRTILSQQSGNVKITVNGRTVRYSSSCGSQAVGTTTTDKTINSCTVDKCQAYVTSHTKWQ
FNSPFVPRRMQAERKGKVHIPFPLINTTCRVPLAPEALVRSGKREATLSLHPIHPTLLSYRTFGAERVFDEQWITAQTEV
TIPVPVEGVEYQWGNHKPQRFVVALTTEGKAHGWPHEIIEYYYGLHPTTTIVVVIRVSVVVLLSFAASVYMCVVARTKCL
TPYALTPGAVVPVTIGVLCCAPKAHAASFAEGMAYLWDNNQSMFWMELTGPLALLILATCCARSLLSCCKGSFLVAMSIG
SAVASAYEHTAIIPNQVGFPYKAHVAREGYSPLTLQMQVIETSLEPTLNLEYITCDYKTKVPSPYVKCCGTAECRTQDKP
EYKCAVFTGVYPFMWGGAYCFCDSENTQMSEAYVERADVCKHDHAAAYRAHTASLRAKIKVTYGTVNQTVEAYVNGDHAV
TIAGTKFIFGPVSTPWTPFDTKILVYKGELYNQDFPRYGAGQPGRFGDIQSRTLDSRDLYANTGLKLARPAAGNIHVPYT
QTPSGFKTWQKDRDSPLNAKAPFGCIIQTNPVRAMNCAVGNIPVSMDIADSAFTRLTDAPVISELTCTVSTCTHSSDFGG
IAVLSYKVEKSGRCDIHSHSNVAVLQEVSIETEGRSVIHFSTASASPSFVVSVCSSRATCTAKCEPPKDHVVTYPANHNG
VTLPDLSSTAMTWAQHLAGGVGLLIALAVLILVIVTCVTLRR
>P08491 ~~~~~~Structural polyprotein~~~
MNYIPTQTFYGRRWRPRPAFRPWQVSMQPTPTMVTPMLQAPDLQAQQMQQLISAVSALTTKQNVKAPKGQRQKKQQKPKE
KKENQKKKPTQKKKQQQKPKPQAKKKKPGRRERMCMKIENDCIFEVKLDGKVTGYACLVGDKVMKPAHVKGTIDNPDLAK
LTYKKSSKYDLECAQIPVHMKSDASKYTHEKPEGHYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGG
ANEGARTALSVVTWTKDMVTRVTPEGTEEWSAALMMCILANTSFPCSSPPCYPCCYEKQPEQTLRMLEDNVNRPGYYELL
EASMTCRNRSRHRRSVTEHFNVYKATRPYLAYCADCGDGYFCYSPVAIEKIRDEAPDGMLKIQVSAQIGLDKAGTHAHTK
IRYMAGHDVQESKRDSLRVYTSAACSIHGTMGHFIVAHCPPGDYLKVSFEDADSHVKACKVQYKHDPLPVGREKFVVRPH
FGVELPCTSYQLTTAPTDEEIDMHTPPDIPDRTLLSQTAGNVKITAGGRTIRYNCTCGRDNVGTTSTDKTINTCKIDQCH
AAVTSHDKWQFTSPFVPRADQTARRGKVHVPFPLTNVTCRVPLARAPDVTYGKKEVTLRLHPDHPTLFSYRSLGAEPHPY
EEWVDKFSERIIPVTEEGIEYQWGNNPPVRLWAQLTTEGKPHGWPHEIIQYYYGLYPAATIAAVSGASLMALLTLAATCC
MLATARRKCLTPYALTPGAVVPLTLGLLCCAPRANAASFAETMAYLWDENKTLFWMEFAAPAAALALLACCIKSLICCCK
PFSFLVLLSLGASAKAYEHTATIPNVVGFPYKAHIERNGFSPMTLQLEVVETSWEPTLNLEYITCEYKTVVPSPFIKCCG
TSECSSKEQPDYQCKVYTGVYPFMWGGAYCFCDSENTQLSEAYVDRSDVCKHDHASAYKAHTASLKATIRISYGTINQTT
EAFVNGEHAVNVGGSKFIFGPISTAWSPFDNKIVVYKDDVYNQDFPPYGSGQPGRFGDIQSRTVESKDLYANTALKLSRP
SPGVVHVPYTPTPSGFKYWLKEKGSSLNTKAPFGCKIKTNPVRAMDCAVGSIPVSMDIPDSAFTRVVDAPAVTDLSCQVV
VCTHSSDFGGVATLSYKTDKPGKCAVHSHSNVATLQEATVDVKEDGKVTVHFSTASASPAFKVSVCDAKTTCTAACEPPK
DHIVPYGASHNNQVFPDMSGTAMTWVQRLASGLGGLALIAVVVLVLVTCITMRR
>P08563 ~~~~~~Structural polyprotein~~~
MASTTPITMEDLQKALEAQSRALRAGLAAGASQSRRPRPPRQRDSSTSGDDSGRDSGGPRRRRGNRGRGQRKDWSRAPPP
PEERQESRSQTPAPKPSRAPPQQPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS
EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRGVWGKGERTYAEQDFRVGGTRWH
RLLRMPVRGLDGDTAPLPPHTTERIETRSARHPWRIRFGAPQAFLAGLLLAAVAVGTARAGLQPRADMAAPPMPPQPPRA
HGQHYGHHHHQLPFLGHDGHHGGTLRVGQHHRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCVEHDRPPPA
TPTSLTTAANSTTAATPATAPPPCHAGLNDSCGGFLSGCGPMRLRHGADTRCGRLICGLSTTAQYPPTRFGCAMRWGLPP
WELVVLTARPEDGWTCRGVPAHPGTRCPELVSPMGRATCSPASALWLATANALSLDHAFAAFVLLVPWVLIFMVCRRACR
RRGAAAALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQTPVPVRLAGVRFESKIVDGGCFAPWDLEATGACICEIPTDVS
CEGLGAWVPTAPCARIWNGTQRACTFWAVNAYSSGGYAQLASYFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTV
MSVFALASYVQHPHKTVRVKFHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQSRW
GLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEVWVTPVIGSQARKCGLHIRAGPYGH
ATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPVALPRALAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDVG
AFPPGKFVTAALLNTPPPYQVSCGGESDRASARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWWQLTLGAICAL
LLAGLLACCAKCLYYLRGAIAPR
>P07566 ~~~~~~Structural polyprotein~~~
MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRDSSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPP
PEERQETRSQTPAPKPSRAPPQQPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS
EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRGVWGKGERTYAEQDFRVGGTRWH
RLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRIRFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCA
HGQHYGHHHHQLPFLGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCVEHDRPPPA
TPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADTRCGRLICGLSTTAQYPPTRFGCAMRWGLPP
WELVVLTARPEDGWTCRGVPAHPGARCPELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACR
RRGAAAALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWDLEATGACICEIPTDVS
CEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLASYFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTV
MSVFALASYVQHPHKTVRVKFHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQSRW
GLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEVWVTPVIGSQARKCGLHIRAGPYGH
ATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPVALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLG
AVPPGKFVTAALLNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWWQLTLGAICAL
PLAGLLACCAKCLYYLRGAIAPR
>P03315 ~~~~~~Structural polyprotein~~~
MNYIPTQTFYGRRWRPRPAARPWPLQATPVAPVVPDFQAQQMQQLISAVNALTMRQNAIAPARPPKPKKKKTTKPKPKTQ
PKKINGKTQQQKKKDKQADKKKKKPGKRERMCMKIENDCIFEVKHEGKVTGYACLVGDKVMKPAHVKGVIDNADLAKLAF
KKSSKYDLECAQIPVHMRSDASKYTHEKPEGHYNWHHGAVQYSGGRFTIPTGAGKPGDSGRPIFDNKGRVVAIVLGGANE
GSRTALSVVTWNKDMVTRVTPEGSEEWSAPLITAMCVLANATFPCFQPPCVPCCYENNAEATLRMLEDNVDRPGYYDLLQ
AALTCRNGTRHRRSVSQHFNVYKATRPYIAYCADCGAGHSCHSPVAIEAVRSEATDGMLKIQFSAQIGIDKSDNHDYTKI
RYADGHAIENAVRSSLKVATSGDCFVHGTMGHFILAKCPPGEFLQVSIQDTRNAVRACRIQYHHDPQPVGREKFTIRPHY
GKEIPCTTYQQTTAETVEEIDMHMPPDTPDRTLLSQQSGNVKITVGGKKVKYNCTCGTGNVGTTNSDMTINTCLIEQCHV
SVTDHKKWQFNSPFVPRADEPARKGKVHIPFPLDNITCRVPMAREPTVIHGKREVTLHLHPDHPTLFSYRTLGEDPQYHE
EWVTAAVERTIPVPVDGMEYHWGNNDPVRLWSQLTTEGKPHGWPHQIVQYYYGLYPAATVSAVVGMSLLALISIFASCYM
LVAARSKCLTPYALTPGAAVPWTLGILCCAPRAHAASVAETMAYLWDQNQALFWLEFAAPVACILIITYCLRNVLCCCKS
LSFLVLLSLGATARAYEHSTVMPNVVGFPYKAHIERPGYSPLTLQMQVVETSLEPTLNLEYITCEYKTVVPSPYVKCCGA
SECSTKEKPDYQCKVYTGVYPFMWGGAYCFCDSENTQLSEAYVDRSDVCRHDHASAYKAHTASLKAKVRVMYGNVNQTVD
VYVNGDHAVTIGGTQFIFGPLSSAWTPFDNKIVVYKDEVFNQDFPPYGSGQPGRFGDIQSRTVESNDLYANTALKLARPS
PGMVHVPYTQTPSGFKYWLKEKGTALNTKAPFGCQIKTNPVRAMNCAVGNIPVSMNLPDSAFTRIVEAPTIIDLTCTVAT
CTHSSDFGGVLTLTYKTNKNGDCSVHSHSNVATLQEATAKVKTAGKVTLHFSTASASPSFVVSLCSARATCSASCEPPKD
HIVPYAASHSNVVFPDMSGTALSWVQKISGGLGAFAIGAILVLVVVTCIGLRR
>P27285 ~~~~~~Structural polyprotein~~~
MNRGFFNMLGRRPFPAPTAMWRPRRRRQAAPMPARNGLASQIQQLTTAVSALVIGQATRPQNPRPRPPPRQKKQAPKQPP
KPKKPKPQEKKKKQPAKTKPGKRQRMALKLEADRLFDVKNEDGDVIGHALAMEGKVMKPLHVKGTIDHPVLSKLKFTKSS
AYDMEFAQLPVNMRSEAFTYTSEHPEGFYNWHHGAVQYSGGRFTIPRGVGGRGDSGRPIMDNSGRVVAIVLGGADEGTRT
ALSVVTWNSKGKTIKTTPEGTEEWSAAPLVTAMCLLGNVSFPCNRPPTCYTREPSRALDILEENVNHEAYDTLLNAILRC
GSSGRSKRSVTDDFTLTSPYLGTCSYCHHTEPCFSPIKIEQVWDEADDNTIRIQTSAQFGYDKSGAASTNKYRYMSFEQD
HTVKEGTMDDIKISTSGPCRRLSYKGYFLLAKCPPGDSVTVSIASSNSATSCTMARKIKPKFVGREKYDLPPVHGKKIPC
TVYDRLKETTAGYITMHRPGPHAYTSYLEESSGKVYAKPPSGKNITYECKCGDYKTGTVTTRTEITGCTAIKQCVAYKSD
QTKWVFNSPDLIRHADHTAQGKLHLPFKLIPSTCMVPVAHAPNVIHGFKHISLQLDTDHLTLLTTRRLGANPEPTTEWII
GKTVRNFTVDRDGLEYIWGNHEPVRVYAQESAPGDPHGWPHEIVQHYYHRHPVYTILAVASAAVAMMIGVTVAALCACKA
RRECLTPYALAPNAVIPTSLALLCCVRSANAETFTETMSYFWSNSQPFFWVQLCIPLAAVIVLMRCCSCCLPFLVVAGAY
LAKVDAYEHATTVPNVPQIPYKALVERAGYAPLNLEITVMSSEVLPSTNQEYITCKFTTVVPSPKVKCCGSLECQPAAHA
DYTCKVFGGVYPFMWGGAQCFCDSENSQMSEAYVELSADCATDHAQAIKVHTAAMKVGLRIVYGNTTSFLDVYVNGVTPG
TSKDLKVIAGPISASFTPFDHKVVIHRGLVYNYDFPEYGAMKPGVFGDIQATSLTSKDLIASTDIRLLKPSAKNVHVPYT
QAASGFEMWKNNSGRPLQETAPFGCKIAVNPLRAVDCSYGNIPISIDIPNAAFIRTSDAPLVSTVKCDVSECTYSADFGG
MATLQYVSDREGQCPVHSHSSTATLQESTVHVLEKGAVTVHFSTASPQANFIVSLCGKKTTCNAECKPPADHIVSTPHKN
DQEFQAAISKTSWSWLFALFGGASSLLIIGLTIFACSMMLTSTRR
>P03316 ~~~~~~Structural polyprotein~~~
MNRGFFNMLGRRPFPAPTAMWRPRRRRQAAPMPARNGLASQIQQLTTAVSALVIGQATRPQPPRPRPPPRQKKQAPKQPP
KPKKPKTQEKKKKQPAKPKPGKRQRMALKLEADRLFDVKNEDGDVIGHALAMEGKVMKPLHVKGTIDHPVLSKLKFTKSS
AYDMEFAQLPVNMRSEAFTYTSEHPEGFYNWHHGAVQYSGGRFTIPRGVGGRGDSGRPIMDNSGRVVAIVLGGADEGTRT
ALSVVTWNSKGKTIKTTPEGTEEWSAAPLVTAMCLLGNVSFPCDRPPTCYTREPSRALDILEENVNHEAYDTLLNAILRC
GSSGRSKRSVIDDFTLTSPYLGTCSYCHHTVPCFSPVKIEQVWDEADDNTIRIQTSAQFGYDQSGAASANKYRYMSLKQD
HTVKEGTMDDIKISTSGPCRRLSYKGYFLLAKCPPGDSVTVSIVSSNSATSCTLARKIKPKFVGREKYDLPPVHGKKIPC
TVYDRLKETTAGYITMHRPRPHAYTSYLEESSGKVYAKPPSGKNITYECKCGDYKTGTVSTRTEITGCTAIKQCVAYKSD
QTKWVFNSPDLIRHDDHTAQGKLHLPFKLIPSTCMVPVAHAPNVIHGFKHISLQLDTDHLTLLTTRRLGANPEPTTEWIV
GKTVRNFTVDRDGLEYIWGNHEPVRVYAQESAPGDPHGWPHEIVQHYYHRHPVYTILAVASATVAMMIGVTVAVLCACKA
RRECLTPYALAPNAVIPTSLALLCCVRSANAETFTETMSYLWSNSQPFFWVQLCIPLAAFIVLMRCCSCCLPFLVVAGAY
LAKVDAYEHATTVPNVPQIPYKALVERAGYAPLNLEITVMSSEVLPSTNQEYITCKFTTVVPSPKIKCCGSLECQPAAHA
DYTCKVFGGVYPFMWGGAQCFCDSENSQMSEAYVELSADCASDHAQAIKVHTAAMKVGLRIVYGNTTSFLDVYVNGVTPG
TSKDLKVIAGPISASFTPFDHKVVIHRGLVYNYDFPEYGAMKPGAFGDIQATSLTSKDLIASTDIRLLKPSAKNVHVPYT
QASSGFEMWKNNSGRPLQETAPFGCKIAVNPLRAVDCSYGNIPISIDIPNAAFIRTSDAPLVSTVKCEVSECTYSADFGG
MATLQYVSDREGQCPVHSHSSTATLQESTVHVLEKGAVTVHFSTASPQANFIVSLCGKKTTCNAECKPPADHIVSTPHKN
DQEFQAAISKTSWSWLFALFGGASSLLIIGLMIFACSMMLTSTRR
>P13897 ~~~~~~Structural polyprotein~~~
MFPYPQLNFPPVYPTNPMAYRDPNPPRCRWRPFRPPLAAQIEDLRRSIANLTFKQRSPNPPPGPPPKKKKSAPKPKPTQP
KKKKQQAKKTKRKPKPGKRQRMCMKLESDKTFPIMLNGQVNGYACVVGGRLMKPLHVEGKIDNEQLAAVKLKKASMYDLE
YGDVPQNMKSDTLQYTSDKPPGFYNWHHGAVQYENGRFTVPRGVGGKGDSGRPILDNRGRVVAIVLGGANEGTRTALSVV
TWNQKGVTIKDTPEGSEPWSLVTALCVLSNVTFPCDKPPVCYSLAPERTLDVLEENVDNPNYDTLLENVLKCPSRRPKRS
ITDDFTLTSPYLGFCPYCRHSAPCFSPIKIENVWDESDDGSIRIQVSAQFGYNQAGTADVTKFRYMSFDHDHDIKEDSMD
KIAISTSGPCRRLGHKGYFLLAQCPPGDSVTVSITSGASENSCTVEKKIRRKFVGREEYLFPPVHGKLVKCHVYDHLKET
SAGYITMHRPGPHAYKSYLEEASGEVYIKPPSGKNVTYECKCGDYSTGIVSTRTKMNGCTKAKQCIAYKSDQTKWVFNSP
DLIRHTDHSVQGKLHIPFRLTPTVCPVPLAHTPTVTKWFKGITLHLTATRPTLLTTRKLGLRADATAEWITGTTSRNFSV
GREGLEYVWGNHEPVRVWAQESAPGDPHGWPHEIIIHYYHRHPVYTVIVLCGVALAILVGTASSAACIAKARRDCLTPYA
LAPNATVPTALAVLCCIRPTNAETFGETLNHLWFNNQPFLWAQLCIPLAALVILFRCFSCCMPFLLVAGVCLGKVDAFEH
ATTVPNVPGIPYKALVERAGYAPLNLEITVVSSELTPSTNKEYVTCRFHTVIPSPQVKCCGSLECKASSKADYTCRVFGG
VYPFMWGGAQCFCDSENTQLSEAYVEFAPDCTIDHAVALKVHTAALKVGLRIVYGNTTAHLDTFVNGVTPGSSRDLKVIA
GPISAAFSPFDHKVVIRKGLVYNYDFPEYGAMKPGAFGDIQASSLDATDIVARTDIRLLKPSVKNIHVPYTQAVSGYEMW
KNNSGRPLQETAPFGCKIEVEPLRASNCAYGHIPISIDIPDAAFVRSSESPTILEVSCTVADCIYSADFGGSLTLQYKAD
REGHCPVHSHSTTAVLKEATTHVTAVGSITLHFSTSSPQANFIVSLCGKKTTCNAECKPPADHIIGEPHKVDQEFQAAVS
KTSWNWLLALFGGASSLIVVGLIVLVCSSMLINTRR
>P19560 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MKRRELEKKLRKVRVTPQQDKYYTIGNLQWAIRMINLMGIKCVCDEECSAAEVALIITQFSALDLENSPIRGKEEVAIKN
TLKVFWSLLAGYKPESTETALGYWEAFTYREREARADKEGEIKSIYPSLTQNTQNKKQTSNQTNTQSLPAITTQDGTPRF
DPDLMKQLKIWSDATERNGVDLHAVNILGVITANLVQEEIKLLLNSTPKWRLDVQLIESKVREKENAHRTWKQHHPEAPK
TDEIIGKGLSSAEQATLISVECRETFRQWVLQAAMEVAQAKHATPGPINIHQGPKEPYTDFINRLVAALEGMAAPETTKE
YLLQHLSIDHANEDCQSILRPLGPNTPMEKKLEACRVVGSQKSKMQFLVAAMKEMGIQSPIPAVLPHTPEAYASQTSGPE
DGRRCYGCGKTGHLKRNCKQQKCYHCGKPGHQARNCRSKNREVLLCPLWAEEPTTEQFSPEQHEFCDPICTPSYIRLDKQ
PFIKVFIGGRWVKGLVDTGADEVVLKNIHWDRIKGYPGTPIKQIGVNGVNVAKRKTHVEWRFKDKTGIIDVLFSDTPVNL
FGRSLLRSIVTCFTLLVHTEKIEPLPVKVRGPGPKVPQWPLTKEKYQALKEIVKDLLAEGKISEAAWDNPYNTPVFVIKK
KGTGRWRMLMDFRELNKITVKGQEFSTGLPYPPGIKECEHLTAIDIKDAYFTIPLHEDFRPFTAFSVVPVNREGPIERFQ
WNVLPQGWVCSPAIYQTTTQKIIENIKKSHPDVMLYQYMDDLLIGSNRDDHKQIVQEIRDKLGSYGFKTPDEKVQEERVK
WIGFELTPKKWRFQPRQLKIKNPLTVNELQQLVGNCVWVQPEVKIPLYPLTDLLRDKTNLQEKIQLTPEAIKCVEEFNLK
LKDPEWKDRIREGAELVIKIQMVPRGIVFDLLQDGNPIWGGVKGLNYDHSNKIKKILRTMNELNRTVVIMTGREASFLLP
GSSEDWEAALQKEESLTQIFPVKFYRHSCRWTSICGPVRENLTTYYTDGGKKGKTAAAVYWCEGRTKSKVFPGTNQQAEL
KAICMALLDGPPKMNIITDSRYAYEGMREEPETWAREGIWLEIAKILPFKQYVGVGWVPAHKGIGGNTEADEGVKKALEQ
MAPCSPPEAILLKPGEKQNLETGIYMQGLRPQSFLPRADLPVAITGTMVDSELQLQLLNIGTEHIRIQKDEVFMTCFLEN
IPSATEDHERWHTSPDILVRQFHLPKRIAKEIVARCQECKRTTTSPVRGTNPRGRFLWQMDNTHWNKTIIWVAVETNSGL
VEAQVIPEETALQVALCILQLIQRYTVLHLHSDNGPCFTAHRIENLCKYLGITKTTGIPYNPQSQGVVERAHRDLKDRLA
AYQGDCETVEAALSLALVSLNKKRGGIGGHTPYEIYLESEHTKYQDQLEQQFSKQKIEKWCYVRNRRKEWKGPYKVLWDG
DGAAVIEEEGKTALYPHRHMRFIPPPDSDIQDGSS
>P25059 ~~~pol~~~Gag-Pro-Pol polyprotein~~~
MGNSPSYNPPAGISPSDWLNLLQSAQRLNPRPSPSDFTDLKNYIHWFHKTQKKPWTFTSGGPASCPPGKFGRVPLVLATL
NEVLSNDEGAPGASAPEEQPPPYDPPAVLPIISEGNRNRHRAWALRELQDIKKEIENKAPGSQVWIQTLRLAILQADPTP
ADLEQLCQYIASPVDQTAHMTSLTAAIAAEAANTLQGFNPQNGTLTQQSAQPNAGDLRSQYQNLWLQAWKNLPTRPSVQP
WSTIVQGPAESYVEFVNRLQISLADNLPDGVPKEPIIDSLSYANANKECQQILQGRGLVAAPVGQKLQACAHWAPKTKQP
AILVHTPGPKMPGPRQPAPKRPPPGPCYRCLKEGHWARDCPTKTTGPPPGPCPICKDPSHWKRDCPTLKSKKLIEGGPSA
PQIITPITDSLSEAELECLLSIPLARSRPSVAVYLSGPWLQPSQNQALMLVDTGAENTVLPQNWLVRDYPRTPAAVLGAG
GISRNRYNWLQGPLTLALKPEGPFITIPKILVDTFDKWQILGRDVLSRLQASISIPEEVHPPVVGVLDAPPSHIGLEHLP
PPPEVPQFPLNLERLQALQDLVHRSLEAGYISPWDGPGNNPVFPVRKPNGAWRFVHDLRVTNALTKPIPALSPGPPDLTA
IPTHLPHIICLDLKDAFFQIPVEDRFRSYFAFTLPTPGGLQPHRRFAWRVLPQGFINSPALFERALQEPLRQVSAAFSQS
LLVSYMDDILYVSPTEEQRLQCYQTMAAHLRDLGFQVASEKTRQTPSPVPFLGQMVHERMVTYQSLPTLQISSPISLHQL
QTVLGDLQWVSRGTPTTRRPLQLLYSSLKGIDDPRAIIHLSPEQQQGIAELRQALSHNARSRYNEQEPLLAYVHLTRAGS
TLVLFQKGAQFPLAYFQTPLTDNQASPWGLLLLLGCQYLQAQALSSYAKTILKYYHNLPKTSLDNWIQSSEDPRVQELLQ
LWPQISSQGIQPPGPWKTLVTRAEVFLTPQFSPEPIPAALCLFSDGAARRGAYCLWKDHLLDFQAVPAPESAQKGELAGL
LAGLAAAPPEPLNIWVDSKYLYSLLRTLVLGAWLQPDPVPSYALLYKSLLRHPAIFVGHVRSHSSASHPIASLNNYVDQL
LPLETPEQWHKLTHCNSRALSRWPNPRISAWDPRSPATLCETCQRLNPTGGGKMRTIQRGWAPNHIWQADITHYKYKQFT
YALHVFVDTYSGATHASAKRGLTTQTTIEGLLEAIVHLGRPKKLNTDQGANYTSKTFVRFCQQFGISLSHHVPYNPTSSG
LVERTNGLLKLLLSKYHLDEPHLPMTQALSRALWTHNQINLLPILKTRWELHHSPPLAVISEGGETPKGSDKLFLYKLPG
QNNRRWLGPLPALVEASGGALLATNPPVWVPWRLLKAFKCLKNDGPEDAPNRSSDG
>P11204 ~~~pol~~~Pol polyprotein~~~
TAWTFLKAMQKCSKKREARGSREAPETNFPDTTEESAQQICCTRDSSDSKSVPRSERNKKGIQCQGEGSSRGSQPGQFVG
VTYNLEKRPTTIVLINDTPLNVLLDTGADTSVLTTAHYNRLKYRGRKYQGTGIIGVGGNVETFSTPVTIKKKGRHIKTRM
LVADIPVTILGRDILQDLGAKLVLAQLSKEIKFRKIELKEGTMGPKIPQWPLTKEKLEGAKEIVQRLLSEGKISEASDNN
PYNSPIFVIKKRSGKWRLLQDLRELNKTVQVGTEISRGLPHPGGLIKCKHMTVLDIGDAYFTIPLDPEFRPYTAFTIPSI
NHQEPDKRYVWNCLPQGFVLSPYIYQKTLQEILQPFRERYPEVQLYQYMDDLFVGSNGSKKQHKELIIELRAILLEEGFE
TPDDKLQEVPPYSWLGYQLCPENWKVQKMQLDMVKNPTLNDVQKLMGNITWMSSGVPGLTVKHIAATTKGCLELNQKVIW
TEEAQKELEENNEKIKNAQGLQYYNPEEEMLCEVEITKNYEATYVIKQSQGILWAGKKIMKANKGWSTVKNLMLLLQHVA
TESITRVGKCPTFKVPFTKEQVMWEMQKGWYYSWLPEIVYTHQVVHDDWRMKLVEEPTSGITIYTDGGKQNGEGIAAYVT
SNGRTKQKRLGPVTHQVAERMAIQMALEDTRDKQVNIVTDSYYCWKNITEGLGLEGPQSPWWPIIQNIREKEIVYFAWVP
GHKGICGNQLADEAAKIKEEIMLAYQGTQIKEKRDEDAGFDLCVPYDIMIPVSDTKIIPTDVKIQVPPNSFGWVTGKSSM
AKQGLLINGGIIDEGYTGEIQVICTNIGKSNIKLIEGQKFAQLIILQHHSNSRQPWDENKISQRGDKGFGSTGVFWVENI
QEAQDEHENWHTSPKILARNYKIPLTVAKQITQECPHCTKQGSGPAGCVMRSPNHWQADCTHLDNKIILTFVESNSGYIH
ATLLSKENALCTSLAILEWARLFSPKSLHTDNGTNFVAEPVVNLLKFLKIAHTTGIPYHPESQGIVERANRTLKEKIQSH
RDNTQTLEAALQLALITCNKGRESMGGQTPWEVFITNQAQVIHEKLLLQQAQSSKKFCFYKIPGEHDWKGPTRVLWKGDG
AVVVNDEGKGIIAVPLTRTKLLIKPN
>P32542 ~~~pol~~~Pol polyprotein~~~
TAWTFLKAMQKCSKKREARGSREAPETNFPDTTEESAQQICCTRDSSDSKSVPRSERNKKGIQCQGEGSSRGSQPGQFVG
VTYNLEKRPTTIVLINDTPLNVLLDTGADTSVLTTAHYNRLKYRGRKYQGTGIIGVGGNVETFSTPVTIKKKGRHIKTRM
LVADIPVTILGRDILQDLGAKLVLAQLSKEIKFRKIELKEGTMGPKIPQWPLTKEKLEGAKEIVQRLLSEGKISEASDNN
PYNSPIFVIKKRSGKWRLLQDLRELNKTVQVGTEISRGLPHPGGLIKCKHMTVLDIGDAYFTIPLDPEFRPYTAFTIPSI
NHQEPDKRYVWNCLPQGFVLSPYIYQKTLQEILQPFRERYPEVQLYQYMDDLFVGSNGSKKQHKELIIELRAILLEKGFE
TPDDKLQEVPPYSWLGYQLCPENWKVQKMQLDMVKNPTLNDVQKLMGNITWMSSGVPGLTVKHIAATTKGCLELNQKVIW
TEEAQKELEENNEKIKNAQGLQYYNPEEEMLCEVEITKNYEATYVIKQSQGILWAGKKIMKANKGWSTVKNLMLLLQHVA
TESITRVGKCPTFKVPFTKEQVMWEMQKGWYYSWLPEIVYTHQVVHDDWRMKLVEEPTSGITIYTDGGKQNGEGIAAYVT
SNGRTKQKRLGPVTHQVAERMAIQMALEDTRDKQVNIVTDSYYCWKNITEGLGLEGPQSPWWPIIQNIREKEIVYFAWVP
GHKGICGNQLADEAAKIKEEIMLAYQGTQIKEKRDEDAGFDLCVPYDIMIPVSDTKIIPTDVKIQVPPNSFGWVTGKSSM
AKQGLLINGGIIDEGYTGEIQVICTNIGKSNIKLIEGQKFAQLIILQHHSNSRQPWDENKISQRGDKGFGSTGVFWVENI
QEAQDEHENWHTSPKILARNYKIPLTVAKQITQECPHCTKQGSGPAGCVMRSPNHWQADCTHLDNKIILTFVESNSGYIH
ATLLSKENALCTSLAILEWARLFSPKSLHTDNGTNFVAEPVVNLLKFLKIAHTTGIPYHPESQGIVERANRTLKEKIQSH
RDNTQTLEAALQLALITCNKGRESMGGQTPWEVFITNQAQVIHEKLLLQQAQSSKKFCFYKIPGEHDWKGPTRVLWKGDG
AVVVNDEGKGIIAVPLTRTKLLIKPN
>P16088 ~~~pol~~~Pol polyprotein~~~
KEFGKLEGGASCSPSESNAASSNAICTSNGGETIGFVNYNKVGTTTTLEKRPEILIFVNGYPIKFLLDTGADITILNRRD
FQVKNSIENGRQNMIGVGGGKRGTNYINVHLEIRDENYKTQCIFGNVCVLEDNSLIQPLLGRDNMIKFNIRLVMAQISDK
IPVVKVKMKDPNKGPQIKQWPLTNEKIEALTEIVERLEKEGKVKRADSNNPWNTPVFAIKKKSGKWRMLIDFRELNKLTE
KGAEVQLGLPHPAGLQIKKQVTVLDIGDAYFTIPLDPDYAPYTAFTLPRKNNAGPGRRFVWCSLPQGWILSPLIYQSTLD
NIIQPFIRQNPQLDIYQYMDDIYIGSNLSKKEHKEKVEELRKLLLWWGFETPEDKLQEEPPYTWMGYELHPLTWTIQQKQ
LDIPEQPTLNELQKLAGKINWASQAIPDLSIKALTNMMRGNQNLNSTRQWTKEARLEVQKAKKAIEEQVQLGYYDPSKEL
YAKLSLVGPHQISYQVYQKDPEKILWYGKMSRQKKKAENTCDIALRACYKIREESIIRIGKEPRYEIPTSREAWESNLIN
SPYLKAPPPEVEYIHAALNIKRALSMIKDAPIPGAETWYIDGGRKLGKAAKAAYWTDTGKWRVMDLEGSNQKAEIQALLL
ALKAGSEEMNIITDSQYVINIILQQPDMMEGIWQEVLEELEKKTAIFIDWVPGHKGIPGNEEVDKLCQTMMIIEGDGILD
KRSEDAGYDLLAAKEIHLLPGEVKVIPTGVKLMLPKGYWGLIIGKSSIGSKGLDVLGGVIDEGYRGEIGVIMINVSRKSI
TLMERQKIAQLIILPCKHEVLEQGKVVMDSERGDNGYGSTGVFSSWVDRIEEAEINHEKFHSDPQYLRTEFNLPKMVAEE
IRRKCPVCRIIGEQVGGQLKIGPGIWQMDCTHFDGKIILVGIHVESGYIWAQIISQETADCTVKAVLQLLSAHNVTELQT
DNGPNFKNQKMEGVLNYMGVKHKFGIPGNPQSQALVENVNHTLKVWIQKFLPETTSLDNALSLAVHSLNFKRRGRIGGMA
PYELLAQQESLRIQDYFSAIPQKLQAQWIYYKDQKDKKWKGPMRVEYWGQGSVLLKDEEKGYFLIPRRHIRRVPEPCALP
EGDE
>P14350 ~~~pol~~~Pro-Pol polyprotein~~~
MNPLQLLQPLPAEIKGTKLLAHWDSGATITCIPESFLEDEQPIKKTLIKTIHGEKQQNVYYVTFKVKGRKVEAEVIASPY
EYILLSPTDVPWLTQQPLQLTILVPLQEYQEKILSKTALPEDQKQQLKTLFVKYDNLWQHWENQVGHRKIRPHNIATGDY
PPRPQKQYPINPKAKPSIQIVIDDLLKQGVLTPQNSTMNTPVYPVPKPDGRWRMVLDYREVNKTIPLTAAQNQHSAGILA
TIVRQKYKTTLDLANGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFTADVVDLLKEIPNVQVYVDDIYLSH
DDPKEHVQQLEKVFQILLQAGYVVSLKKSEIGQKTVEFLGFNITKEGRGLTDTFKTKLLNITPPKDLKQLQSILGLLNFA
RNFIPNFAELVQPLYNLIASAKGKYIEWSEENTKQLNMVIEALNTASNLEERLPEQRLVIKVNTSPSAGYVRYYNETGKK
PIMYLNYVFSKAELKFSMLEKLLTTMHKALIKAMDLAMGQEILVYSPIVSMTKIQKTPLPERKALPIRWITWMTYLEDPR
IQFHYDKTLPELKHIPDVYTSSQSPVKHPSQYEGVFYTDGSAIKSPDPTKSNNAGMGIVHATYKPEYQVLNQWSIPLGNH
TAQMAEIAAVEFACKKALKIPGPVLVITDSFYVAESANKELPYWKSNGFVNNKKKPLKHISKWKSIAECLSMKPDITIQH
EKGISLQIPVFILKGNALADKLATQGSYVVNCNTKKPNLDAELDQLLQGHYIKGYPKQYTYFLEDGKVKVSRPEGVKIIP
PQSDRQKIVLQAHNLAHTGREATLLKIANLYWWPNMRKDVVKQLGRCQQCLITNASNKASGPILRPDRPQKPFDKFFIDY
IGPLPPSQGYLYVLVVVDGMTGFTWLYPTKAPSTSATVKSLNVLTSIAIPKVIHSDQGAAFTSSTFAEWAKERGIHLEFS
TPYHPQSGSKVERKNSDIKRLLTKLLVGRPTKWYDLLPVVQLALNNTYSPVLKYTPHQLLFGIDSNTPFANQDTLDLTRE
EELSLLQEIRTSLYHPSTPPASSRSWSPVVGQLVQERVARPASLRPRWHKPSTVLKVLNPRTVVILDHLGNNRTVSIDNL
KPTSHQNGTTNDTATMDHLEKNE
>P03362 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPARICPINYSLLASLLPKGYPGRV
NEILHILIQTQAQIPSRPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLR
REYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQTWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTI
PEPEPEEDALLLDLPADIPHPKNLHRGGGLTSPPTLQQVLPNQDPASILPVIPLDPARRPVIKAQVDTQTSHPKTIEALL
DTGADMTVLPIALFSSNTPLKNTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQG
VLYLPEAKRPPVILPIQAPAVLGLEHLPRPPQISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTW
RFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTIDLRDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWKVLP
QGFKNSPTLFEMQLAHILQPIRQAFPQCTILQYMDDILLASPSHEDLLLLSEATMASLISHGLPVSENKTQQTPGTIKFL
GQIISPNHLTYDAVPTVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQRHTDPRDQIYLNPSQVQSLVQLR
QALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKEQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQ
TIHHNISTQTFNQFIQTSDHPSVPILLHHSHRFKNLGAQTGELWNTFLKTAAPLAPVKALMPVFTLSPVIINTAPCLFSD
GSTSRAAYILWDKQILSQRSFPLPPPHKSAQRAELLGLLHGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAP
FQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQLSPAELHSFTHCGQTALTLQGATTTEASNILRSC
HACRGGNPQHQMPRGHIRRGLLPNHIWQGDITHFKYKNTLYRLHVWVDTFSGAISATQKRKETSSEAISSLLQAIAHLGK
PSYINTDNGPAYISQDFLNMCTSLAIRHTTHVPYNPTSSGLVERSNGILKTLLYKYFTDKPDLPMDNALSIALWTINHLN
VLTNCHKTRWQLHHSPRLQPIPETRSLSNKQTHWYYFKLPGLNSRQWKGPQEALQEAAGAALIPVSASSAQWIPWRLLKR
AACPRPVGGPADPKEKDLQHHG
>P14078 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPVWICPINYSLLASLLPKGYPGRV
NEILHILIQTQAQIPSRPAPPPPSSSTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLR
REYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQAWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTI
PEPEPEEDALLLDLPADIPHPKNLHRGGGLTSPPTLQQVLPNQDPTSILPVIPLDPARRPVIKAQIDTQTSHPKTIEALL
DTGADMTVLPIALFSSNTPLKNTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQG
VLYLPEAKRPPVILPIQAPAVLGLEHLPRPPEISQFPLNPERLQALQHLVRKALEAGHIEPYTGPGNNPVFPVKKANGTW
RFIHDLRATNSLTIDLSSSSPGPPDLSSLPTTLAHLQTIDLKDAFFQIPLPKQFQPYFAFTVPQQCNYGPGTRYAWRVLP
QGFKNSPTLFEMQLAHILQPIRQAFPQCTILQYMDDILLASPSHADLQLLSEATMASLISHGLPVSENKTQQTPGTIKFL
GQIISPNHLTYDAVPKVPIRSRWALPELQALLGEIQWVSKGTPTLRQPLHSLYCALQRHTDPRDQIYLNPSQVQSLVQLR
QALSQNCRSRLVQTLPLLGAIMLTLTGTTTVVFQSKQQWPLVWLHAPLPHTSQCPWGQLLASAVLLLDKYTLQSYGLLCQ
TIHHNISTQTFNQFIQTSDHPSVPILLHHSHRFKNLGAQTGELWNTFLKTTAPLAPVKALMPVFTLSPVIINTAPCLFSD
GSTSQAAYILWDKHILSQRSFPLPPPHKSAQRAELLGLLHGLSSARSWRCLNIFLDSKYLYHYLRTLALGTFQGRSSQAP
FQALLPRLLSRKVVYLHHVRSHTNLPDPISRLNALTDALLITPVLQLSPADLHSFTHCGQTALTLQGATTTEASNILRSC
HACRKNNPQHQMPQGHIRRGLLPNHIWQGDITHFKYKNTLYRLHVWVDTFSGAISATQKRKETSSEAISSLLQAIAYLGK
PSYINTDNGPAYISQDFLNMCTSLAIRHTTHVPYNPTSSGLVERSNGILKTLLYKYFTDKPDLPMDNALSIALWTINHLN
VLTNCHKTRWQLHHSPRLQPIPETHSLSNKQTHWYYFKLPGLNSRQWKGPQEALQEAAGAALIPVSASSAQWIPWRLLKR
AACPRPVGGPADPKEKDHQHHG
>P03363 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGQIHGLSPTPIPKAPRGLSTHHWLNFLQAAYRLQPRPSDFDFQQLRRFLKLALKTPIWLNPIDYSLLASLIPKGYPGRV
VEIINILVKNQVSPSAPAAPVPTPICPTTTPPPPPPPSPEAHVPPPYVEPTTTQCFPILHPPGAPSAHRPWQMKDLQAIK
QEVSSSALGSPQFMQTLRLAVQQFDPTAKDLQDLLQYLCSSLVVSLHHQQLNTLITEAETRGMTGYNPMAGPLRMQANNP
AQQGLRREYQNLWLAAFSTLPGNTRDPSWAAILQGLEEPYCAFVERLNVALDNGLPEGTPKEPILRSLAYSNANKECQKI
LQARGHTNSPLGEMLRTCQAWTPKDKTKVLVVQPRRPPPTQPCFRCGKVGHWSRDCTQPRPPPGPCPLCQDPSHWKRDCP
QLKPPQEEGEPLLLDLPSTSGTTEEKNLLKGGDLISPHPDQDISILPLIPLRQQQQPILGVRISVMGQTPQPTQALLDTG
ADLTVIPQTLVPGPVKLHDTLILGASGQTNTQFKLLQTPLHIFLPFRRSPVILSSCLLDTHNKWTIIGRDALQQCQGLLY
LPDDPSPHQLLPIATPNTIGLEHLPPPPQVDQFPLNLPERLQALNDLVSKALEAGHIEPYSGPGNNPVFPVKKPNGKWRF
IHDLRATNAITTTLTSPSPGPPDLTSLPTALPHLQTIDLTDAFFQIPLPKQYQPYFAFTIPQPCNYGPGTRYAWTVLPQG
FKNSPTLFEQQLAAVLNPMRKMFPTSTIVQYMDDILLASPTNEELQQLSQLTLQALTTHGLPISQEKTQQTPGQIRFLGQ
VISPNHITYESTPTIPIKSQWTLTELQVILGEIQWVSKGTPILRKHLQSLYSALHGYRDPRACITLTPQQLHALHAIQQA
LQHNCRGRLNPALPLLGLISLSTSGTTSVIFQPKQNWPLAWLHTPHPPTSLCPWGHLLACTILTLDKYTLQHYGQLCQSF
HHNMSKQALCDFLRNSPHPSVGILIHHMGRFHNLGSQPSGPWKTLLHLPTLLQEPRLLRPIFTLSPVVLDTAPCLFSDGS
PQKAAYVLWDQTILQQDITPLPSHETHSAQKGELLALICGLRAAKPWPSLNIFLDSKYLIKYLHSLAIGAFLGTSAHQTL
QAALPPLLQGKTIYLHHVRSHTNLPDPISTFNEYTDSLILAPLVPLTPQGLHGLTHCNQRALVSFGATPREAKSLVQTCH
TCQTINSQHHMPRGYIRRGLLPNHIWQGDVTHYKYKKYKYCLHVWVDTFSGAVSVSCKKKETSCETISAVLQAISLLGKP
LHINTDNGPAFLSQEFQEFCTSYRIKHSTHIPYNPTSSGLVERTNGVIKNLLNKYLLDCPNLPLDNAIHKALWTLNQLNV
MNPSGKTRWQIHHSPPLPPIPEASTPPKPPPKWFYYKLPGLTNQRWKGPLQSLQEAAGAALLSIDGSPRWIPWRFLKKAA
CPRPDASELAEHAATDHQHHG
>O12158 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASILRGGKLDAWERIKLKPGGKKHYMMKHLVWASRELERFALDPGLLETSEGCKQIMKQLQPALQTGTKELISLHN
TVATLYCVHEKIDVRDTKEALDKIKEEQNKSQQKTQQAEAADKGKVSQNYPIVQNLQGQMVHQPISARTLNAWVKVVEEK
AFSPEVIPMFTALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPVAPGQMREPRGSDIAGTTST
LQEQITWMTNNPPVPVGDIYKRWIILGLNKIVRMYSPVSILDIKQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTDTLL
VQNANPDCKTILRALGPGASLEEMMTACQGVGGPGHKARVLAEAMSKVNNTNIMMQRSNCKGPKRTIKCFNCGKEGHLAR
NCRAPRKKGCWKCGKEGHQVKDCTERQANFFRENLAFPQGEARKSSSEQNRANSPTRRELQVWGRDNNSLSEAGDDRQGT
ALNFPQITLWQRPLVNIKVGGQLKEALLDTGADDTVLEEIKLPGNWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTVL
VGPTPVNIIGRNMLTQLGCTLNFPISPIETVPVKLKPGMDGPKVKQWLLTEEKIKALTAICDEMEREGKITKIGPENPYN
TPVFAIKKKDSTKWRKLVDFRELNKRTWDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEGFRKYTAFTIPSINN
ETPGIRYQYNVLPQGWKGSPSIFQSSTTKILEPFRAQNPEIIIYQYMDDLYVGSDLEIGQHRAKIEELREHLLKWGFTTP
DKKHQKEPPFLWMGYELHPDKWTVQPIQLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGAKALTDIVPLTE
EAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQNQWTYQIYQEPFKNLKTGKYAKMRTAHTNDVRQLTEAVQKIAL
ESIIIWGKTPKFRLPIQKETWEAWWTDYWQATWIPEWEFVNTPPLVKLWYQLEKEPIAGAETFYVDGAANREIKMGKAGY
VTDRGRQKIVSITETTNQKTELQAIQLALQDSGSEVNIVTDSQYALGIIQAQPDKSESELVNQIIEQLIKKERVYLSWVP
AHKGIGGNEQVDKLVSSGIRKVLFLDGINKAQEEHEKYHSNWRAMASEFNLPPIVAKEIVASCDKCQLKGEATHGQVDCS
PGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVIHTDNGSNFISNTVKAACWWAGIQQ
EFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDIIATDIQTKELQKQI
MKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVLQDNSDIKVVPRRKVKIIKDYGKQMAGADCMASRQDED
>P03369 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDKWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIDVKDTKEALEKIEEEQNKSKKKAQQAAAAAGTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKV
VEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAG
TTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMT
ETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPANIMMQRGNFRNQRKTVKCFNCGKE
GHIAKNCRAPRKKGCWRCGREGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTRRELQVWGGENNSLSEAGA
DRQGTVSFNFPQITLWQRPLVTIRIGGQLKEALLDTGADDTVLEEMNLPGKWKPKMIGGIGGFIKVRQYDQIPVEICGHK
AIGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIG
PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTAFT
IPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLR
WGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIMLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGTKALTE
VIPLTEEAELELAENREILKEPVHEVYYDPSKDLVAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEA
VQKVSTESIVIWGKIPKFKLPIQKETWEAWWMEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETK
LGKAGYVTDRGRQKVVSIADTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIKKEKV
YLAWVPAHKGIGGNEQVDKLVSAGIRKVLFLNGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMH
GQVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNFTSTTVKAACW
WAGIKQEFGIPYNPQSQGVVESMNNELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTK
ELQKQITKIQNFRVYYRDNKDPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
>P03366 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNTATIMMQRGNFRNQRKMVKCFNCGKEGH
TARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTISSEQTRANSPTRRELQVWGR
DNNSPSEAGADRQGTVSFNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD
QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEM
EKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLD
EDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTK
IEELRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCK
LLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAH
TNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFY
VDGAANRETKLGKAGYVTNKGRQKVVPLTNTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVNQI
IEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKILFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCD
KCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNF
TSATVKAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIV
DIIATDIQTKELQKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCV
ASRQDED
>P04587 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSTTIMMQRGNFRNQRKIVKCFNCGKEGH
IARNCKAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTISSEQTRANSPTRRELQVWGR
DNNSPSEAGADRQGTVSFNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD
QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEM
EKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNRRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLD
EDFRKYTAFTIPSINNETPGSGYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTK
IEELRQHLLRWGFTTPDKKHQKEPPFLWMGYELHPDKWTIQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCK
LLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAH
TNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFY
VDGAASRETKLGKAGYVTNRGRQKVVTLTHTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVNQI
IEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKILFLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCD
KCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNF
TSATVKAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIV
DIIATDIQTKELQKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCV
ASRQDED
>P03367 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
IARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTISSEQTRANSPTRRELQVWGR
DNNSLSEAGADRQGTVSFNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYD
QILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEM
EKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLD
EDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTK
IEELRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCK
LLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARTRGAH
TNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFY
VDGAASRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVNQI
IEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCD
KCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNF
TSTTVKAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIV
DIIATDIQTKELQKQITKIQNFRVYYRDSRDPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCV
ASRQDED
>Q75002 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASILRGEKLDAWEKIKLRPGGKKHYMLKHLVWANRELEKFALNPDLLDTSAGCKQIIKQLQPALQTGTEELKSLFN
TVATLYCVHQKIEIKDTKEALDKIEEEQNESQQKTQQAGAADRGKDSQNYPIVQNMQGQMVHQPISARTLNAWVKVVEEK
AFSPEVIPMFTALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPVAPGQMRDPRGSDIAGTTST
LQEQIAWMTGNPPVPVGDIYKRWIILGLNKIVRMYSPVSILDIKQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTDTLL
VQNANPDCKTILRALGPGASLEEMMTACQGVGGPAHKARVLAEAMSQVNNTTIMMQKSNFKGPKRAIKCFNCGKEGHLAR
NCRAPRKKGCWKCGKEGHQMKDCTERQANFFRETLAFQQGKAREFPSEQTRANSPTRESQTRANSPTTRELQVRGSNTFS
EAGAERQGSLNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQIIIEICG
KKAIGTVLVGPTPVNIIGRNMLTQLGRTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALTAICEEMEQEGKISR
IGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEGFRKYTA
FTIPSTNNETPGIRYQYNVLPQGWKGSPPIFQSSMPQILEPFRAPNPEIVIYQYMDDLYVGSDLEIGQHRAPIEELREHL
LKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIQLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGAKAL
TDIVTLTEEAELELAENREILKEPVHGVFYDPSKDLIAEIQKQGNDQWTFQFYQEPFKNLKTGKFAKRGTAHTNDVKQLT
AVVQKIALESIVIWGKTPKFRLPIQKETWEAWWTDYWQATWIPEWEFVNTPPLVKLWYQLEKEPIAGVETFYVDGAANRE
TKIGKAGYVTDRGRQKIVSLTETTNQKTELQAIQLALQDSGSEVNIVTDSQYALGIILAQPDKSESEIVNQIIEQLISKE
RVYLSWVPAHKGIGGNEQVDKLVSSGIRKVLFLDGIDKAQEEHEKYHSNWRAMANEFNIPPVVPKEIVACCDKCQLKGEA
IHGQVNCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVRVIHTDNGSNFTSNAVKAA
CWWAGIQQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRRGGIGGYSAGERIIDIIASDIQ
TKELQNQILKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGADCVAGRQDED
>P04585 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
TARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTRRELQVWGRDNNSPSEAGADR
QGTVSFNFPQVTLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAI
GTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPE
NPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIP
SINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWG
LTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVI
PLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQ
KITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLG
KAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDQSESELVNQIIEQLIKKEKVYL
AWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQ
VDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNFTGATVRAACWWA
GIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKEL
QKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
>P20875 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWEKIRLRPGGKKKYRLKHIVWASRELERFAVNPGLLESSEGCRQILGQLQPSLKTGSEELTSLYN
TVATLYCVHQRIEIKDTKEALEKIEEEQTKSMKKAQQAAADTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVIE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQATQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPATIMMQRGNFRNQRKNVKCFNCGKEGH
IARNCRAPRKKGCWKCGKEGHQMKECTERQANFLREDLAFLQGKAREFPSEQTRANSPTRRELQVWGRDSNSLSEAGAEA
GADRQGIVSFNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEDMDLPGRWKPKMIGGIGGFIKVRQYDQIPIDICG
HKAVGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISK
IGPENPYNTPVFAIKKKDSTKWRKLVDFRELNRRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTA
FTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIIIYQYMDDLYVGSDLEIGQHRTKIEELRQHL
LKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGTKAL
TEVIPLTKEAELELAENREILKEPVHGVYYDPSKDLIVEIQKQGQGQWTYQIFQEPFKNLKTGKYARTRGAHTNDVKQLT
EAVQKIANESIVIWGKIPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRE
TKLGKAGYVTSRGRQKVVSLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIKKE
KVYLAWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQEDHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQLKGEA
MHGQVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVTTIHTDNGSNFTSTTVKAA
CWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDIIATDIQ
TKELQKQITKIQNFRVYYRDNRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKVKIIRDYGKQMAGDDCVASRQDED
>P0C6F2 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGKLDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEECRSLYN
TVATLYCVHQRIEIKDTKEALDKIKEEQNKSKKKAQQAAADTGHSSQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGH
IARNCRAPRKKGCWKCGKEGHQMKDCTERQANFFREDLAFLQGKAREFSSEQTRANSPTRRELQVWGRDNNSPSEAGADR
QGTVSFNFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAI
GTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPE
NPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIP
SINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWG
LTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVI
PLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGTHTNDVKQLTEAVQ
KITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLG
KAGYVTNKGRQKVVPLTNTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVNQIIEQLIKKEKVYL
AWVPAHKGIGGNEQVDKLVSAGIRKILFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQ
VDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNFTSATVKAACWWA
GIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKEL
QKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
>P04588 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPGLLETGEGCQQIMEQLQSTLKTGSEEIKSLYN
TVATLYCVHQRIDVKDTKEALDKIEEIQNKSRQKTQQAAAAQQAAAATKNSSSVSQNYPIVQNAQGQMIHQAISPRTLNA
WVKVIEEKAFSPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAADWDRVHPVHAGPIPPGQMREPRGS
DIAGTTSTLQEQIGWMTSNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVK
NWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPSHKARVLAEAMSQATNSTAAIMMQRGNFKGQKRIKCFN
CGKEGHLARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLRENLAFPQGKAREFPSEQTRANSPTSRELRVWGGDKTLSE
TGAERQGIVSFSFPQITLWQRPVVTVRVGGQLKEALLDTGADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEIC
GKKAIGTILVGPTPVNIIGRNMLTQIGCTLNFPISPIETVPVKLKPGMDGPRVKQWPLTEEKIKALTEICKDMEKEGKIL
KIGPENPYNTPVFAIKKKDSTKWRKLVNFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYT
AFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTKNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREH
LLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIQLPDKESWTVNDIQKLVGKLNWASQIYPGIKVKQLCKLLRGAKA
LTDIVPLTAEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEQYKNLKTGKYARIKSAHTNDVKQL
TEAVQKIAQESIVIWGKTPKFRLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLETEPIVGAETFYVDGAANR
ETKKGKAGYVTDRGRQKVVSLTETTNQKTELQAIHLALQDSGSEVNIVTDSQYALGIIQAQPDKSESEIVNQIIEQLIQK
DKVYLSWVPAHKGIGGNEQVDKLVSSGIRKVLFLDGIDKAQEEHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQLKGE
AMHGQVDCSPGIWQLDCTHLEGKIIIVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVVHTDNGSNFTSAAVKA
ACWWANIKQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDMIATDI
QTKELQKQITKIQNFRVYYRDNRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVAGGQDED
>P05961 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDRWENIRLRPGGKKKYKLKHVVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELKSLYN
TVATLYCVHQKIEIKDTKEALEKIEEEQNKSKKKAQQAAADTGNRGNSSQVSQNYPIVQNIEGQMVHQAISPRTLNAWVK
VVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPITPGQMREPRGSDIA
GTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPSSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNRT
TETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIIKCFNCGK
EGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAEFSSEQNRANSPTRRELQVWGRDNNSLSEAGE
EAGDDRQGPVSFSFPQITLWQRPIVTIKIGGQLKEALLDTGADDTVLGEMNLPRRWKPKMIGGIGGFIKVRQYDQITIGI
CGHKAIGTVLVGPTPVNIIGRNLLTQLGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALIEICTEMEKEGKI
SKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKKTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKY
TAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRAKIEELRR
HLLRWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGTK
ALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEVQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQ
LTEAVQKIATESIVIWGKTPKFRLPIQKETWETWWTEYTXATWIPEWEVVNTPPLVKLWYQLEKEPIVGAETFYVDGAAN
RETKKGKAGYVTNRGRQKVVSLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIK
KEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQEDHEKYHSNWRAMASDFNLPPIVAKEIVASCDKCQLKG
EAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGPNFTSTTVK
AACWWTGIKQEFGIPYNPQSQGVIESMNKELKKIIGQVRDQAEHLKRAVQMAVFIHNFKRKGGIGGYSAGERIVGIIATD
IQTKELQKQITKIQNFRVYYRDSRDPLWKGPAKLLWKGEGAVVIQDNNDIKVVPRRKAKVIRDYGKQTAGDDCVASRQDE
D
>P12497 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGELDKWEKIRLRPGGKKQYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TIAVLYCVHQRIDVKDTKEALDKIEEEQNKSKKKAQQAAADTGNNSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTHNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPATIMIQKGNFRNQRKTVKCFNCGKEGH
IAKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFPQGKAREFSSEQTRANSPTRRELQVWGRDNNSLSEAGADR
QGTVSFSFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAI
GTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPE
NPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKQKKSVTVLDVGDAYFSVPLDKDFRKYTAFTIP
SINNETPGIRYQYNVLPQGWKGSPAIFQCSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWG
FTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVRQLCKLLRGTKALTEVV
PLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMKGAHTNDVKQLTEAVQ
KIATESIVIWGKTPKFKLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIIGAETFYVDGAANRETKLG
KAGYVTDRGRQKVVPLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIKKEKVYL
AWVPAHKGIGGNEQVDGLVSAGIRKVLFLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQ
VDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACWWA
GIKQEFGIPYNPQSQGVIESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKEL
QKQITKIQNFRVYYRDSRDPVWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
>P05959 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGKLDKWEKIRLRPRGKKRYKLKHIVWASRELERFAVNPSLLETAEGCRQILGQLQPALQTGSEELKSLYN
AVATLYCVHQNIEVRDTKEALDKIEEEQNKSKKKAQQAAADTGNGSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPISILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMTET
FLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPSHKARILAEAMSQVTNSATIMLQKGNFRDQRKIVKCFNCGKVGH
IAKNCRAPRKKGCWKCGKEGHQMKDCTNEGRQANFLRENLAFPQGKARELSSEQTRANSPTRRELQVWGRDNSLSEAGED
RQGTVSFSFPQITLWQRPIVTVKIGGQLKEALLDTGADDTVLEEMNLPGKWKPKMIGGIGGFIKVRQYDQILIEICGHKA
IGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGP
ENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKEFRKYTAFTI
PSINNETPRIRYQYNVLPQGWKGSPAIFQSSMTKILEPFKKQNPEIVIYQYMDDLYVGSDLEIGQHRIKIEELREHLLKW
GFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGTKALTEV
VQLTKEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAV
QKVATESIVIWGKTPKFKLPIQKETWEAWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIIGAETFYVDGAANRETKL
GKAGYVTDRGRQKVVSLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIKKEKVY
LAWVPAHKGIGGNEQVDRLVSTGIRKVLFLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHG
QVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVIHTDNGSNFTSTTVKAACWW
AGIKQEFGIPYNPQSQGVVESMNKQLKQIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKE
LQKQITKIQNFRVYYRDSRDPLWKGHAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
>P24740 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGKKLDSWEKIRLRPGGNKKYRLKHLVWASRELEKFTLNPGLLETAEGCQQILGQLQPALQTGTEELRSLYN
TVAVLYCVHQRIDVKDTKEALNKIEEMQNKNKQRTQQAAANTGSSQNYPIVQNAQGQPVHQALSPRTLNAWVKVVEDKAF
SPEVIPMFSALSEGATPQDLNMMLNVVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPIPPGQMREPRGSDIAGTTSTVQ
EQIGWMTGNPPIPVGDIYRRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTETLLVQ
NANPDCKSILRALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVQQTSIMMQRGNFRGPRRIKCFNCGKEGHLAKNCR
APRKKGCWKCGKEGHQMKDCTERQANFLRENLAFQQGEAREFSSEQTRANSPTSRNLWDGGKDDLPCETGAERQGTDSFS
FPQITLWQRPLVTVKIGGQLIEALLDTGADDTVLEDINLPGKWKPKIIGGIGGFIKVRQYDQILIEICGKKTIGTVLVGP
TPVNIIGRNMLTQIGCTLNFPISPIETVPVKLKPEMDGPKVKQWPLTEEKIKALTEICNEMEKEGKISKIGPENPYNTPV
FAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHTAGLKKKKSVTVLDVGDAYFSVPLDESFRKYTAFTIPSINNETP
GVRYQYNVLPQGWKGSPSIFQSSMTKILEPFRSQHPDIVIYQYMDDLYVGSDLEIGQHRAKIEELRAHLLSWGFITPDKK
HQKEPPFLWMGYELHPDKWTVQPIQLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGAKALTDIVTLTEEAE
LELAENREILKDPVHGVYYDPSKDLVAEIQKQGQDQWTYQIYQEPFKNLKTGKYARKRSAHTNDVKQLTEVVQKVSTESI
VIWGKIPKFRLPIQKETWEAWWMEYWQATWIPEWEFVNTPPLVKLWYQLEKDPIAGAETFYVDGAANRETKLGKAGYVTD
RGRQKVVSLTETTNQKTELHAIHLALQDSGSEVNIVTDSQYALGIIQAQPDRSESEIVNQIIEKLIEKEKVYLSWVPAHK
GIGGNEQVDKLVSSGIRKVLFLDGIDKAQEDHEKYHCNWRAMASDFNLPPVVAKEIVASCNKCQLKGEAMHGQVDCSPGI
WQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKVIHTDNGSNFTSAAVKAVCWWANIQQEFG
IPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDIIATDIQTKELQKQISKI
QNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCMAGRQDED
>P35963 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSAGELDKWEKIRLRPGGKKQYRLKHIVWASRELERFAVDPGLLETSEGCRQILGQLQPSLQTGSEELRSLYN
TVATLYCVHQKIEVKDTKEALEKIEEEQNKSKKKAQQAAADTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVE
EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTT
STLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTET
LLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKTVKCFNCGKEGH
IAKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFPQGKARKFSSEQTRANSPIRRERQVWRRDNNSLSEAGADR
QGTVSFSFPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYDQIPIEICGHKAI
GTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPE
NPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLHEDFRKYTAFTIP
SINNETPGTRYQYNVLPQGWKGSPAIFQSSMTTILEPFRKQNPDLVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWG
FTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYAGIKVRQLCKLLRGTKALTEVI
PLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARTRGAHTNDVKQLTEAVQ
KIATESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIIGAETFYVDGAANRETKLG
KAGYVTNKGRQKVVSLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDRSESELVSQIIEQLIKKEKVYL
AWVPAHKGIGGNEQVDKLVSAGIRKVLFLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQ
VDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVTTIHTDNGSNFTSATVKAACWWA
GIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKEL
QKQITKIQNFRVYYRDSRDPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVAGRQDED
>Q9IDV9 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLTGGKLDQWEAIYLRPGGKKKYRLKHLVWASRELERFACNPGLMDTANGCAQLINQLEPALKTGSEGLRSLXN
TLAVLYCVHSNIPVHNTQEALDKIKEKQEQHKSEPKKPEAGTAAAADSSISRNYPLVQNAQGQMVHQPLTPRTLNAWVKV
IEEKAFNPEIIPMFMALSEGATPSDLNSMLNTVGGHQAAMQMLKEVINEEAAEWDRTHPAPVGPLPPGQMRDPRGSDIAG
TTSTLAEQVAWMTSNPPIPVGDIYRRWIVLGLNRIVRMYSPVSILEIKQGPKEPFRDYVDRFYKTLRAEQATQDVKNWMT
ETLLVQNANPDCKQILKALGPGATLEEMMTACQGVGGPAHKARVLAEAMAQAQTATSVFVQRGNFKGIRKTIKCFNCGKE
GHLARNCKAPRRRGCWKCGQEGHQMKDCKNEGXQANFRKGLVSLQRETRKLPPDNNKERAHSPATRELWVSGGEEHTGKG
DAGEPGEDRDLSVPTLNFPQITLWQRPVXAVKIGKEIREALLDTGADDTVIEEIQLEGKWKPKMIGGIGGFIKVRQYDNI
TIDIQGRKAVGTVLVGPTPVNIIGRNFLTQIGCTLNFPISPIETVPVKLKPGMDGPRVKQWPLTAEKIEALREICTEMEK
EGKISRIGPENPYNTPIFAIKKKDSTKWRKLVDFRELNKRTQEFWEVQLGIPHPAGLKQKKSVTVXDVGDAYFSCPLDKD
FRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKKHPEIIIYQYMDDLYVGSDLEIAQHRETVE
ELRGHLLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIKLPEKEVWTVNDIQKLVGKLNWASQIYPGIKVKQLCKLI
RGTKALTEVVTFTQEAELELAENREILKEPLHGVYYDPGKELIAEIQKQGQGQWTYQIYQEPYKNLKTGKYAKXRSAHTN
DIKELAAVVQKVATESIVIWGKTPKFKLPVQKEVWETWWTEHWQATWIPEWEFVNTPPLVKLWYQLETEPISGAETYYVD
GAANKETKLGKAGFVTDRGRQKVVSIENTTNQKAELQAILLALQESGQEANIVTDSQYAMGIIHSQPDKSESDLVGQIIE
ELIKKERVYLSWVPAHKGIGGNEQVDXLVSSGIRXVLFLDGIEKAQEEHERYHSNWKAMASDFNLPPIVAKEIVASCDKC
QLKGEAMHGQINCSPGVWQLDCTHLEGKIILVAVHVASGYLEAEVIPAETGQETAYFILKLAGRWPVKVIHTDNGPNFIS
ATVKAACWWAGIKQEFGIPYNPQSQGAVESMNKELKKIIGQIRDQAEHLKTAVQMAVFIHNFKRKGGIGGXTAGERIIDI
IATDIQTTKLQTQILKVQNFXVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNGDIKVVPRRKAKIIRDYGKQMAGDGCVAS
GQDENQDME
>O91080 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLTGGKLDQWESIYLRPGGKKKYRMKHLVWASRELERFACNPGLMDTADGCAKLLNQLEPALKTGSEELRSLYN
ALAVLYCVHSRIQIHNTQEALDKIKEKQEQHKPEPKNPEAGAAAATDSNISRNYPLVQTAQGQMVHQPLTPRTLNAWVKV
IEEKAFSPEVIPMFMALSEGATPSDLNTMLNTVGGHQAAMQMLKEVINEEAADWDRTHPVPVGPLPPGQLRDPRGSDIAG
TTSTLAEQVAWMTANPPVPVGDIYRRWIVLGLNRIVRMYSPVSILEIKQGPKEPFRDYVDRFYKTLRAEQATQEVKNWMT
ETLLVQNANPDCKQLLKALGPGATLEEMMTACQGVGGPAHKARVLAEAMSQVQQPTTSVFAQRGNFKGIRKPIKCFNCGK
EGHLARNCKAPRRGGCWKCGQEGHQMKDCKNEGRQFFREELVSLQRETRKLPPDNNKERAHSPATRELWVSGGEEHTGEG
DAGEPGEDRELSVPTFNFPQITLWQRPVITVKIGKEVREALLDTGADDTVIEELQLEGKWKPKMIGGIGGFIKVRQYDNI
TVDIQGRKAVGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTTEKIEALREICTEMEK
EGKISRIGPENPYNTPIFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKQKKSVTVLDVGDAYFSCPLDKD
FRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSTMTKILEPFREKHPEIIIYQYMDDLYVGSDLELAQHREAVE
DLRDHLLKWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIKLPEKDVWTVNDIQKLVGKLNWASQIYPGIRVKQLCKLI
RGTKALTEVVNFTEEAELELAENREILKEPLHGVYYDPGKELVAEIQKQGQGQWTYQIYQELHKNLKTGKYAKMRSAHTN
DIKQLVEVVRKVATESIVIWGKTPKFRLPVQKEVWEAWWTDHWQATWIPEWEFVNTPPLVKLWYQLETEPISGAETFYVD
GAANRETKLGKAGFVTDRGRQKVVSIADTTNQKAELQAILMALQESGRDVNIVTDSQYAMGIIHSQPDKSESELVSQIIE
ELIKKERVYLSWVPAHKGIGGNEQVDKLVSSGIRKILFLDGIEKAQEDHDRYHSNWKAMASDFNLPPIVAKEIVASCDKC
QLKGEAMHGQVNCSPGVWQLDCTHLEGKIILVAVHVASGYLEAEVIPAETGQETAYFILKLAGRWPVKVIHTDNGSNFTS
ATVKAACWWANIKQEFGIPYNPQSQGAVESMNKELKKIIGQIRDQAEHLKTAVQMAVFIHNFKRKGGIGGYTAGERIIDI
IATDIQTTNLQTQILKVQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNGDIKVVPRRKAKIIRDYGKQMAGDGCVAS
GQDENQEME
>P12499 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPGLLETSDGCKQIIGQLQPAIRTGSEELRSLFN
TVATLYCVHERIEVKDTKEALEKMEEEQNKSKNKKAQQAAADAGNNSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVI
EEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGT
TSTLQEQIAWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKGWMTE
TLLVQNANPDCKTILKALGPQATLEEMMTACQGVGGPSHKARVLAEAMSQATNSAAAVMMQRGNFKGPRKTIKCFNCGKE
GHIAKNCRAPRRKGCWKCGKEGHQLKDCTERQANFLREDLAFPQGKAGELSSEQTRANSPTSRELRVWGRDNPLSETGAE
RQGTVSFNCPQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGKWKPKMIGGIGGFIKVRQYDQILIEICGHKA
IGTVLVGPTPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKISRVGP
ENPYNTPIFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTAFTI
PSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLRW
GFTTPDKKHQKEPPFLWMGYELHPDKWTVQSIKLPEKESWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEV
IPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGHGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLAEVV
QKISTESIVIWGKTPKFRLPIQKETWETWWVEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIIGAETFYVDGAANRETKL
GKAGYVTDRGRQKVVPFTDTTNQKTELQAINLALQDSGLEVNIVTDSQYALGIIQAQPDKSESELVSQIIEQLIKKEKVY
LAWVPAHKGIGGNEQVDKLVSQGIRKVLFLDGIDKAQEEHEKYHNNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHG
QVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFILKLAGRWPVKIVHTDNGSNFTSAAVKAACWW
AGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIIDIIATDIQTKE
LQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKVKIIRDYGKQMAGDDCVASRQDED
>P18042 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARNSVLRGKKADELEKIRLRPSGKKKYRLKHIVWAANELDKFGLAESLLESKEGCQKILTVLDPLVPTGSENLKSLFN
TVCVIWCLHAEEKVKDTEEAKKLVQRHLGAETGTAEKMPSTSRPTAPPSGRGRNFPVQQTGGGNYIHVPLSPRTLNAWVK
LVEDKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIREIINDEAADWDAQHPIPGPLPAGQLRDPRGSDIAG
TTSTVEEQIQWMYRPQNPVPVGNIYRRWIQIGLQKCVRMYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWM
TQTLLIQNANPDCKLVLKGLGMNPTLEEMLTACQGVGGPGQKARLMAEALKEALTPPPIPFAAAQQRKVIRCWNCGKEGH
SARQCRAPRRQGCWKCGKTGHVMAKCPERQAGFLRDGSMGKEAPQLPRGPSSSGADTNSTPSRSSSGSIGKIYAAGERAE
GAEGETIQRGDGRLTAPRAGKSTSQRGDRGLAAPQFSLWKRPVVTAYIEVQPVEVLLDTGADDSIVAGIQLGDNYVPKIV
GGIGGFINTKEIKNIEIKVLNKRVRATIMTGDTPINIFGRNILTALGMSLNLPIAKIEPIKVTLKPGKDGPRLRQWPLTK
EKIEALREICEKMEKEGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNRVTQDFTEIQLGIPHPAGLAKKKRITV
LDVGDAYFSIPLHEDFRQYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQHTMRQVLEPFRKANPDVILIQYMDDIL
IASDRTGLEHDKVVLQLKELLNGLGFSTPDEKFQKDPPLQWMGYELWPTKWKLQKLQLPQKEIWTVNDIQKLVGVLNWAA
QIYPGIKTKHLCRLIKGKMTLTEEVQWTELAEAELEENKIILSQEQEGYYYQEEKELEATIQKNQDNQWTYKIHQEEKIL
KVGKYAKIKNTHTNGVRLLAQVVQKIGKEALVIWGRIPKFHLPVERETWEQWWDNYWQVTWIPEWDFVSTPPLVRLTFNL
VGDPIPGAETFYTDGSCNRQSKEGKARYVTDRGRDKVRVLERTTNQQAELEAFAMTLTDSGPKVNIIVDSQYVMGIVVGQ
PTESESRIVNQIIEDMIKKEAVYVAWVPAHKGIGGNQEVDHLVSQGIRQVLFLERIEPAQEEHEKYHSNMKELTHKFGIP
QLVARQIVNTCAQCQQKGEAIHGQVNAEIGVWQMDCTHLEGKIIIVAVHVASGFIEAEVIPQESGRQTALFLLKLASRWP
ITHLHTDNGSNFTSQEVKMVAWWIGIEQSFGVPYNPQSQGVVEAMNHHLKNQISRIREQANTIETIVLMAVHCMNFKRRG
GIGDMTPAERLINMITTEQEIQFLQRKNSNFKNFQVYYREGRDQLWKGPGELLWKGDGAVIVKVGADIKVIPRRKAKIIR
DYGGRQELDSSHLEGAREEDGEVA
>P04584 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARNSVLRGKKADELERIRLRPGGKKKYRLKHIVWAANKLDRFGLAESLLESKEGCQKILTVLDPMVPTGSENLKSLFN
TVCVIWCIHAEEKVKDTEGAKQIVRRHLVAETGTAEKMPSTSRPTAPSSEKGGNYPVQHVGGNYTHIPLSPRTLNAWVKL
VEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIREIINEEAAEWDVQHPIPGPLPAGQLREPRGSDIAGT
TSTVEEQIQWMFRPQNPVPVGNIYRRWIQIGLQKCVRMYNPTNILDIKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWMT
QTLLVQNANPDCKLVLKGLGMNPTLEEMLTACQGVGGPGQKARLMAEALKEVIGPAPIPFAAAQQRKAFKCWNCGKEGHS
ARQCRAPRRQGCWKCGKPGHIMTNCPDRQAGFLRTGPLGKEAPQLPRGPSSAGADTNSTPSGSSSGSTGEIYAAREKTER
AERETIQGSDRGLTAPRAGGDTIQGATNRGLAAPQFSLWKRPVVTAYIEGQPVEVLLDTGADDSIVAGIELGNNYSPKIV
GGIGGFINTKEYKNVEIEVLNKKVRATIMTGDTPINIFGRNILTALGMSLNLPVAKVEPIKIMLKPGKDGPKLRQWPLTK
EKIEALKEICEKMEKEGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNKVTQDFTEIQLGIPHPAGLAKKRRITV
LDVGDAYFSIPLHEDFRPYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQHTMRQVLEPFRKANKDVIIIQYMDDIL
IASDRTDLEHDRVVLQLKELLNGLGFSTPDEKFQKDPPYHWMGYELWPTKWKLQKIQLPQKEIWTVNDIQKLVGVLNWAA
QLYPGIKTKHLCRLIRGKMTLTEEVQWTELAEAELEENRIILSQEQEGHYYQEEKELEATVQKDQENQWTYKIHQEEKIL
KVGKYAKVKNTHTNGIRLLAQVVQKIGKEALVIWGRIPKFHLPVEREIWEQWWDNYWQVTWIPDWDFVSTPPLVRLAFNL
VGDPIPGAETFYTDGSCNRQSKEGKAGYVTDRGKDKVKKLEQTTNQQAELEAFAMALTDSGPKVNIIVDSQYVMGISASQ
PTESESKIVNQIIEEMIKKEAIYVAWVPAHKGIGGNQEVDHLVSQGIRQVLFLEKIEPAQEEHEKYHSNVKELSHKFGIP
NLVARQIVNSCAQCQQKGEAIHGQVNAELGTWQMDCTHLEGKIIIVAVHVASGFIEAEVIPQESGRQTALFLLKLASRWP
ITHLHTDNGANFTSQEVKMVAWWIGIEQSFGVPYNPQSQGVVEAMNHHLKNQISRIREQANTIETIVLMAIHCMNFKRRG
GIGDMTPSERLINMITTEQEIQFLQAKNSKLKDFRVYFREGRDQLWKGPGELLWKGEGAVLVKVGTDIKIIPRRKAKIIR
DYGGRQEMDSGSHLEGAREDGEMA
>P03355 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGQTVTTPLSLTLGHWKDVERIAHNQSVDVKKRRWVTFCSAEWPTFNVGWPRDGTFNRDLITQVKIKVFSPGPHGHPDQV
PYIVTWEALAFDPPPWVKPFVHPKPPPPLPPSAPSLPLEPPRSTPPRSSLYPALTPSLGAKPKPQVLSDSGGPLIDLLTE
DPPPYRDPRPPPSDRDGNGGEATPAGEAPDPSPMASRLRGRREPPVADSTTSQAFPLRAGGNGQLQYWPFSSSDLYNWKN
NNPSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGDDGRPTQLPNEVDAAFPLERPDWD
YTTQAGRNHLVHYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQ
SAPDIGRKLERLEDLKNKTLGDLVREAEKIFNKRETPEEREERIRRETEEKEERRRTEDEQKEKERDRRRHREMSKLLAT
VVSGQKQDRQGGERRRSQLDRDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLDDQGGQGQEPPPEPRITLKVGGQP
VTFLVDTGAQHSVLTQNPGPLSDKSAWVQGATGGKRYRWTTDRKVHLATGKVTHSFLHVPDCPYPLLGRDLLTKLKAQIH
FEGSGAQVMGPMGQPLQVLTLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSI
KQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPP
SHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADFRIQHPDLILLQ
YVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLR
EFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVL
TQKLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQ
ALLLDTDRVQFGPVVALNPATLLPLPEEGLQHNCLDILAEAHGTRPDLTDQPLPDADHTWYTDGSSLLQEGQRKAGAAVT
TETEVIWAKALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHIHGEIYRRRGLLTSEGKEIKNKDEILALL
KALFLPKRLSIIHCPGHQKGHSAEARGNRMADQAARKAAITETPDTSTLLIENSSPYTSEHFHYTVTDIKDLTKLGAIYD
KTKKYWVYQGKPVMPDQFTFELLDFLHQLTHLSFSKMKALLERSHSPYYMLNRDRTLKNITETCKACAQVNASKSAVKQG
TRVRGHRPGTHWEIDFTEIKPGLYGYKYLLVFIDTFSGWIEAFPTKKETAKVVTKKLLEEIFPRFGMPQVLGTDNGPAFV
SKVSQTVADLLGIDWKLHCAYRPQSSGQVERMNRTIKETLTKLTLATGSRDWVLLLPLALYRARNTPGPHGLTPYEILYG
APPPLVNFPDPDMTRVTNSPSLQAHLQALYLVQHEVWRPLAAAYQEQLDRPVVPHPYRVGDTVWVRRHQTKNLEPRWKGP
YTVLLTTPTALKVDGIAAWIHAAHVKAADPGGGPSSRLTWRVQRSQNPLKIRLTREAP
>P03365 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKRYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSTSSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKIKKVVQRKENSEGKRKEKDSKAFLATDWNDDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVVKKKPQ
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDEDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKEMVQKAAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRMMPRGEGSDILIKQLAWENANSLCQDLIRPIRKTGTIQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGAEGPVCFSCGKTGHIRKDCKDEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNLVKGQSPSPAQKGDGVKGSGLNPEAPPFTIHDLPRGTPGSAGLDLSSQKD
LILSLEDGVSLVPTLVKGTLPEGTTGLIIGRSSNYKKGLEVLPGVIDSDFQGEIKVMVKAAKNAVIIHKGERIAQLLLLP
YLKLPNPVIKEERGSEGFGSTSHVHWVQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGL
GMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQ
PVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNATMHDMGALQPGLPSPVAV
PKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYI
VHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDSVSYQKLQIRTDKLRTLNDFQKLL
GNINWIRPFLKLTTGELKPLFEILNGDSNPISTRKLTPEACKALQLMNERLSTARVKRLDLSQPWSLCILKTEYTPTACL
WQDGVVEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHF
HLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIVAVITAFEEVSQP
FNLYTDSKYVTGLFPEIETATLSPRTKIYTELKHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTALES
AQESHALHHQNAAALRFQFHITREQAREIVKLCPNCPDWGHAPQLGVNPRGLKPRVLWQMDVTHVSEFGKLKYVHVTVDT
YSHFTFATARTGEATKDVLQHLAQSFAYMGIPQKIKTDNAPAYVSRSIQEFLARWKISHVTGIPYNPQGQAIVERTHQNI
KAQLNKLQKAGKYYTPHHLLAHALFVLNHVNMDNQGHTAAERHWGPISADPKPMVMWKDLLTGSWKGPDVLITAGRGYAC
VFPQDAETPIWVPDRFIRPFTERKEATPTPGTAEKTPPRDEKDQQESPKNESSPHQREDGLATSAGVDLRSGGGP
>P11283 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKKYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSASSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKVKKIVQRKENSEHKRKEKDQKAFLATDWNNDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVDKKKPL
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDDDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKETVQKTAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRMMPRGEGSDILIKQLAWENANSLCQDLIRPMRKTGTMQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGSEGPVCFSCGKTGHIKRDCKEEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNLVKGQSPSPTQKGDKGKDSGLNPEAPPFTIHDLPRGTPGSAGLDLSSQKD
LILSLEDGVSLVPTLVKGTLPEGTTGLIIGRSSNYKKGLEVLPGVIDSDFQGEIKVMVKAAKNAVIIHKGERIAQLLLLP
YLKLPNPIIKEERGSEGFGSTSHVHWVQEISDSRPMLHISLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGL
GMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKEIKVRLMTDSPDDSQDLMIGAIESNLFADQISWKSDQ
PVWLNQWPLKQEKLQALQQLVTEQLQLGHLEESNSPWNTPVFVIKKKSGKWRLLQDLRAVNATMHDMGALQPGLPSPVAV
PKGWEIIIIDLQDCFFNIKLHPEDCKRFAFSVPSPNFKRPYQRFQWKVLPQGMKNSPTLCQKFVDKAILTVRDKYQDSYI
VHYMDDILLAHPSRSIVDEILTSMIQALNKHGLVVSTEKIQKYDNLKYLGTHIQGDAVSYQKLQIRTDKLRTLNDFQKLL
GNINWIRPFLKLTTGELKPLFEILNGDSNPISIRKLTPEACKALQLVNERLSIARVKRLDLSRPWSLCILKTEYTPTACL
WQNGVLEWIHLPHISPKVITPYDIFCTQLIIKGRHRSKELFSKDPDYIVVPYTKVQFDLLLQEKEDWPISLLGFLGEVHF
HLPKDPLLTFTLQTAIIFPHMTSTTPLEKGIVIFTDGSANGRSVTYIQGREPIIKENTQNTAQQAEIVAVITAFEEVSQS
FNLYTDSKYVTGLFPEIETATLSPRTKIYTELRHLQRLIHKRQEKFYIGHIRGHTGLPGPLAQGNAYADSLTRILTALES
AQESHALHHQNAAALRFQFHITREQAREIVKLCPNCPDWGHAPQLGVNPRGLKPRVLWQMDVTHVSEFGKLKYVHVTVDT
YSHFTFATARTGEATKDVLQHLAQSFAYMGFPQKIKTDNAPAYVSRSIQEFLARWKISHVTGIPYNPQGQAIVERTHQNI
KAQLNKLQKAGKYYTPHHLLAHALFVLNHVNMDNQGHTAAERHWGPISADPKPMVMWKDLLAGSWKGPDVLITAGRGYAC
VFPQDAETPIWVPDRFIRPFTERKEATPTPGTAEKTPPRDEKDQQKSPEDESSPHQREDGLATSAGVNLRSGGGS
>P07572 ~~~gag-pro-pol~~~Gag-Pro-Pol polyprotein~~~
MGQELSQHERYVEQLKQALKTRGVKVKYADLLKFFDFVKDTCPWFPQEGTIDIKRWRRVGDCFQDYYNTFGPEKVPVTAF
SYWNLIKELIDKKEVNPQVMAAVAQTEEILKSNSQTDLTKTSQNPDLDLISLDSDDEGAKSSSLQDKGLSSTKKPKRFPV
LLTAQTSKDPEDPNPSEVDWDGLEDEAAKYHNPDWPPFLTRPPPYNKATPSAPTVMAVVNPKEELKEKIAQLEEQIKLEE
LHQALISKLQKLKTGNETVTHPDTAGGLSRTPHWPGQHIPKGKCCASREKEEQIPKDIFPVTETVDGQGQAWRHHNGFDF
AVIKELKTAASQYGATAPYTLAIVESVADNWLTPTDWNTLVRAVLSGGDHLLWKSEFFENCRDTAKRNQQAGNGWDFDML
TGSGNYSSTDAQMQYDPGLFAQIQAAATKAWRKLPVKGDPGASLTGVKQGPDEPFADFVHRLITTAGRIFGSAEAGVDYV
KQLAYENANPACQAAIRPYRKKTDLTGYIRLCSDIGPSYQQGLAMAAAFSGQTVKDFLNNKNKEKGGCCFKCGKKGHFAK
NCHEHAHNNAEPKVPGLCPRCKRGKHWANECKSKTDNQGNPIPPHQGNRVEGPAPGPETSLWGSQLCSSQQKQPISKLTR
ATPGSAGLDLCSTSHTVLTPEMGPQALSTGIYGPLPPNTFGLILGRSSITMKGLQVYPGVIDNDYTGEIKIMAKAVNNIV
TVSQGNRIAQLILLPLIETDNKVQQPYRGQGSFGSSDIYWVQPITCQKPSLTLWLDDKMFTGLIDTGADVTIIKLEDWPP
NWPITDTLTNLRGIGQSNNPKQSSKYLTWRDKENNSGLIKPFVIPNLPVNLWGRDLLSQMKIMMCSPNDIVTAQMLAQGY
SPGKGLGKKENGILHPIPNQGQSNKKGFGNFLTAAIDILAPQQCAEPITWKSDEPVWVDQWPLTNDKLAAAQQLVQEQLE
AGHITESSSPWNTPIFVIKKKSGKWRLLQDLRAVNATMVLMGALQPGLPSPVAIPQGYLKIIIDLKDCFFSIPLHPSDQK
RFAFSLPSTNFKEPMQRFQWKVLPQGMANSPTLCQKYVATAIHKVRHAWKQMYIIHYMDDILIAGKDGQQVLQCFDQLKQ
ELTAAGLHIAPEKVQLQDPYTYLGFELNGPKITNQKAVIRKDKLQTLNDFQKLLGDINWLRPYLKLTTGDLKPLFDTLKG
DSDPNSHRSLSKEALASLEKVETAIAEQFVTHINYSLPLIFLIFNTALTPTGLFWQDNPIMWIHLPASPKKVLLPYYDAI
ADLIILGRDHSKKYFGIEPSTIIQPYSKSQIDWLMQNTEMWPIACASFVGILDNHYPPNKLIQFCKLHTFVFPQIISKTP
LNNALLVFTDGSSTGMAAYTLTDTTIKFQTNLNSAQLVELQALIAVLSAFPNQPLNIYTDSAYLAHSIPLLETVAQIKHI
SETAKLFLQCQQLIYNRSIPFYIGHVRAHSGLPGPIAQGNQRADLATKIVASNINTNLESAQNAHTLHHLNAQTLRLMFN
IPREQARQIVKQCPICVTYLPVPHLGVNPRGLFPNMIWQMDVTHYSEFGNLKYIHVSIDTFSGFLLATLQTGETTKHVIT
HLLHCFSIIGLPKQIKTDNGPGYTSKNFQEFCSTLQIKHITGIPYNPQGQGIVERAHLSLKTTIEKIKKGEWYPRKGTPR
NILNHALFILNFLNLDDQNKSAADRFWHNNPKKQFAMVKWKDPLDNTWHGPDPVLIWGRGSVCVYSQTYDAARWLPERLV
RQVSNNNQSRE
>P03354 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MEAVIKVISSACKTYCGKTSPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMILGKSGELKTWGLVLGA
LKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTVQRDAKMAPEETATPKTVGTSCYHCGTA
IGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPPGAEQSRAEPGHAGQAPGPALTDWARVREELASTGPPVVAMP
VVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVI
AAATRDPRHPANGQGRGERTNLNRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADIMQGPS
ESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRTAPSTLTTPGEIIKYVLDRQKTAPLTDQGIAAAMS
SAIQPLIMAVVNRERDGQTGSGGRARGLCYTCGSPGHYQAQCPKKRKSGNSRERCQLCNGMGHNAKQCRKRDGNQGQRPG
KGLSSGPWPGPEPPAVSLAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVMEAANPQ
IHGIGGGIPMRKSRDMIELGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNLIGRATVLTVALHLAIPLKW
KPDHTPVWIDQWPLPEGKLVALTQLVEKELQLGHIEPSLSCWNTPVFVIRKASGSYRLLHDLRAVNAKLVPFGAVQQGAP
VLSALPRGWPLMVLDLKDCFFSIPLAEQDREAFAFTLPSVNNQAPARRFQWKVLPQGMTCSPTICQLVVGQVLEPLRLKH
PSLCMLHYMDDLLLAASSHDGLEAAGEEVISTLERAGFTISPDKVQREPGVQYLGYKLGSTYVAPVGLVAEPRIATLWDV
QKLVGSLQWLRPALGIPPRLMGPFYEQLRGSDPNEAREWNLDMKMAWREIVRLSTTAALERWDPALPLEGAVARCEQGAI
GVLGQGLSTHPRPCLWLFSTQPTKAFTAWLEVLTLLITKLRASAVRTFGKEVDILLLPACFREDLPLPEGILLALKGFAG
KIRSSDTPSIFDIARPLHVSLKVRVTDHPVPGPTVFTDASSSTHKGVVVWREGPRWEIKEIADLGASVQQLEARAVAMAL
LLWPTTPTNVVTDSAFVAKMLLKMGQEGVPSTAAAFILEDALSQRSAMAAVLHVRSHSEVPGFFTEGNDVADSQATFQAY
PLREAKDLHTALHIGPRALSKACNISMQQAREVVQTCPHCNSAPALEAGVNPRGLGPLQIWQTDFTLEPRMAPRSWLAVT
VDTASSAIVVTQHGRVTSVAVQHHWATAIAVLGRPKAIKTDNGSCFTSKSTREWLARWGIAHTTGIPGNSQGQAMVERAN
RLLKDRIRVLAEGDGFMKRIPTSKQGELLAKAMYALNHFERGENTKTPIQKHWRPTVLTEGPPVKIRIETGEWEKGWNVL
VWGRGYAAVKNRDTDKVIWVPSRKVKPDITQKDEVTKKDEASPLFAGISDWIPWEDEQEGLQGETASNKQERPGEDTLAA
NES
>O92956 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MEAVIKVISSACKTYCGKTSPSKKEIGAMLSLLQKEGLLMSPSDLYSPGSWDPITAALSQRAMVLGKSGELKTWGLVLGA
LKAAREEQVTSEQAKFWLGLGGGRVSPPGPECIEKPATERRIDKGEEVGETTVQRDAKMAPEETATPKTVGTSCYHCGTA
IGCNCATASAPPPPYVGSGLYPSLAGVGEQQGQGGDTPRGAEQPRAEPGHAGLAPGPALTDWARIREELASTGPPVVAMP
VVIKTEGPAWTPLEPKLITRLADTVRTKGLRSPITMAEVEALMSSPLLPHDVTNLMRVILGPAPYALWMDAWGVQLQTVI
AAATRDPRHPANGQGRGERTNLDRLKGLADGMVGNPQGQAALLRPGELVAITASALQAFREVARLAEPAGPWADITQGPS
ESFVDFANRLIKAVEGSDLPPSARAPVIIDCFRQKSQPDIQQLIRAAPSTLTTPGEIIKYVLDRQKIAPLTDQGIAAAMS
SAIQPLVMAVVNRERDGQTGSGGRARRLCYTCGSPGHYQAQCPKKRKSGNSRERCQLCDGMGHNAKQCRRRDSNQGQRPG
RGLSSGPWPVSEQPAVSLAMTMEHKDRPLVRVILTNTGSHPVKQRSVYITALLDSGADITIISEEDWPTDWPVVDTANPQ
IHGIGGGIPMRKSRDMIELGVINRDGSLERPLLLFPAVAMVRGSILGRDCLQGLGLRLTNLVGRATVLTVALHLAIPLKW
KPDHTPVWIDQWPLPEGKLVALTQLVEKELQLGHIEPSLSCWNTPVFVIRKASGSYRLLHDLRAVNAKLVPFGAVQQGAP
VLSALPRGWPLMVLDLKDCFFSIPLAEQDREAFAFTLPSVNNQAPARRFQWKVLPQGMTCSPTICQLVVGQVLEPLRLKH
PSLRMLHYMDDLLLAASSHDGLEAAGEEVINTLERAGFTISPDKIQREPGVQYLGYKLGSTYVAPVGLVAEPRIATLWDV
QKLVGSLQWLRPALGIPPRLMGPFYEQLRGSDPNEAREWNLDMKMAWREIVQLSTTAALERWDPALPLEGAVVRCEQGAI
GVLGQGLSTHPRPCLWLFSTQPTKAFTAWLEVLTLLITKLRASAVRTFGKEVDILLLPACFREDLPLPEGILLALRGFAG
KIRSSDTPSIFDIARPLHVSLKVRVTDHPVPGPTVFTDASSSTHKGVVVWREGPRWEIKEIADLGASVQQLEARAVAMAL
LLWPTTPTNVVTDSAFVAKMLLKMGQEGVPSTAAAFILEDALSQRSAMAAVLHVRSHSEVPGFFTEGNDVADSQATFQAY
PLREAKDLHTTLHIGPRALSKACNISMQQAREVVQTCPHCNSAPALEAGVNPRGLGPLQIWQTDFTLEPRMAPRSWLAVT
VDTASSAIVVTQHGRVTSVAAQHHWATAIAVLGRPKAIKTDNGSCFTSKSTREWLARWGIAHTTGIPGNSQGQAMVERAN
RLLKDKIRVLAEGDGFMKRIPASKQGELLAKAMYALNHFERGENTKTPVQKHWRPTVLTEGPPVKIRIETGEWEKGWNVL
VWGRGYAAVKNRDTDKVIWVPSRKVKPDITQKDEVTKKDEASPLFAGSSDWIPWGDEQEGLQEEAASNKQEGPGEDTLAA
NES
>P27502 ~~~~~~Polyprotein P3~~~
MSLRPFTGTSRTITQDSTSESNIKKGKNSTKRELIEEVDVNQEVENFDWKKLSGIKPNKLYEKNWQEKVKLKQQSIVSAY
KEEAISVTHNAYTTTLFPQEVIKNVKNQGKLYYHIGMMAIGVKGLHRRKIGTKVMIMFYDDSFGKAREASIGSIEMDMNA
GCGVFYSCPDFAKYIKDLSHLKIGIQTLGYENYEGKNLSVAIKTIGRLTTNIQSKYKINVKDIVEQISSQGIIMVAPMEI
DSSHLDGNEWDLSKFLNHENTSRVPTKALIYQNLQGGESLRFSNYKQTRMHDPTENNSDEDEDLKILGEQLNIKMARFYT
MQTPEEELREVIQQLEREKQAMIAKLEAKMKESSKMAIVEDNFNPNNEYLEDTYSEYEDLEFEKLGLTGWEDLDQDSIET
EEITEWENPNQVLHREIRAYKSVSEQIEDIFGELLKEHGNYDMALKNLEEKYDLDKIEKAKSIEEIAKSSTSSEIRPTKR
PKEEQTAYEDDMRDDWKRKELTVNPIEASKDRNFERIGSSYKKNFYPSRSEILNLDNVPPQFYYDQLVTWEGIVKNEWEA
RKKDGMDMWSWMDGRITGLVLYLVQDWISKNQAAYNDIKSRGDRPENFVKMVKDRFLIEDPTDERRTALQRLAQRELEAL
NCEDPTKIQPFMAEYLKKASEAKKGFDVVYVERLFDRLPEAVGKVVKADFVKDGNSYEAGIGIAVSYISTWMRAKCIKET
EAKTQKKASLAFCRSIYTIGDYKKRKILKRVTNYNKNRRKNYVRRPSIKKKCRCYICQDENHLANRCPRRYTNQARASLI
DGLDEDIVSIASDDEDIENFLEIIELDEFIAHSSQEHEHTWEIGGKKDKVCEICSYFTDYNKTVSCKTCETQYCKTCSDQ
LALEVTEVKKPTKEETMIDDLKLNVKNLEFRVTILEHKVEMQNLQDKFETMQIRNKSEITEIPTTSLAMRANESNYIKTS
INKTAGCYVETKISFNNENRIITALIDSGSTHNIICPTLIPASWINNTHREIIMFAVDNSKYNLNQELIDDIKLQFQEVD
ETFGIKYKLGQTYVAPKPTKTFIIGHRFLTNENGSVTIHKDYITIQKTTGIYPTARHELKSEFARKHGGRPPLFSNIPET
YNKIPHLHSYQPQPILGYKNEIGNQSLITMVKELEALGFIGDDITKNRTTWVCDFKIINPDINITCATIPYTPADKEVFE
KQIKELLDNKLIKKADPTCRHRTAAFIVRNHSEEVAQKPRIVYNYKRLNDNMHTDPFNIPHKISMINLIQKANIFSKFDL
KAGFHHMKLKDDFKDWTTFTCSEGLYTWNVCPFGIANAPCAFQRFMQESFGDLKFALLYIDDILIASNNEKEHIEHLKIF
FNRVKEVGCVLSKKKSKMFLKEVEYLGVEIKEGKISLQPHIVDKIKKFDKNKLNTLKGLQAYLGLLNYARGYIKDLSKLV
GPLYKKTGKNGQRIFNKEDWNIIFKIEREVSKIKPLERPKETDYIIIETDASEEGWGAVLVCKPDKYSGKDTEKIAGYAS
GNFGEKKTWTSLDYEIEAINEALNKFQIYLDKDFTIRTDCEAIVKGIKTEDYKKRSKTRWIKLRDNLLKDGYKPTFEHIK
GNKNFLPNFLSREGDFILKCLQNPDSTESYSIDSSESIPLYIDSKESHSIESDDSIPLYRDKLLPLVERLKEKSA
>P23074 ~~~pol~~~Pro-Pol polyprotein~~~
MDPLQLLQPLEAEIKGTKLKAHWDSGATITCVPEAFLEDERPIQTMLIKTIHGEKQQDVYYLTFKVQGRKVEAEVLASPY
DYILLNPSDVPWLMKKPLQLTVLVPLHEYQERLLQQTALPKEQKELLQKLFLKYDALWQHWENQVGHRRIKPHNIATGTL
APRPQKQYPINPKAKPSIQIVIDDLLKQGVLIQQNSTMNTPVYPVPKPDGKWRMVLDYREVNKTIPLIAAQNQHSAGILS
SIYRGKYKTTLDLTNGFWAHPITPESYWLTAFTWQGKQYCWTRLPQGFLNSPALFTADVVDLLKEIPNVQAYVDDIYISH
DDPQEHLEQLEKIFSILLNAGYVVSLKKSEIAQREVEFLGFNITKEGRGLTDTFKQKLLNITPPKDLKQLQSILGLLNFA
RNFIPNYSELVKPLYTIVANANGKFISWTEDNSNQLQHIISVLNQADNLEERNPETRLIIKVNSSPSAGYIRYYNEGSKR
PIMYVNYIFSKAEAKFTQTEKLLTTMHKGLIKAMDLAMGQEILVYSPIVSMTKIQRTPLPERKALPVRWITWMTYLEDPR
IQFHYDKSLPELQQIPNVTEDVIAKTKHPSEFAMVFYTDGSAIKHPDVNKSHSAGMGIAQVQFIPEYKIVHQWSIPLGDH
TAQLAEIAAVEFACKKALKISGPVLIVTDSFYVAESANKELPYWKSNGFLNNKKKPLRHVSKWKSIAECLQLKPDIIIMH
EKGHQQPMTTLHTEGNNLADKLATQGSYVVHCNTTPSLDAELDQLLQGHYPPGYPKQYKYTLEENKLIVERPNGIRIVPP
KADREKIISTAHNIAHTGRDATFLKVSSKYWWPNLRKDVVKSIRQCKQCLVTNATNLTSPPILRPVKPLKPFDKFYIDYI
GPLPPSNGYLHVLVVVDSMTGFVWLYPTKAPSTSATVKALNMLTSIAIPKVLHSDQGAAFTSSTFADWAKEKGIQLEFST
PYHPQSSGKVERKNSDIKRLLTKLLIGRPAKWYDLLPVVQLALNNSYSPSSKYTPHQLLFGVDSNTPFANSDTLDLSREE
ELSLLQEIRSSLHQPTSPPASSRSWSPSVGQLVQERVARPASLRPRWHKPTAILEVVNPRTVIILDHLGNRRTVSVDNLK
LTAYQDNGTSNDSGTMALMEEDESSTSST
>C1JCT1 ~~~~~~Polyprotein~~~
MSEKTQTFVQNETHVLDMTSDFKSDLSLEKVTSSVEQTDDLVSKIINNNDLDIKDLSFLRNLLLSTLQYLGIAKFVAINI
TLSILSILMLLINSCAKFTRIVNLSSHILNIITTLGLYFQVSSMEIEEITQTFENEFGTYDDDKILSHYIKICNLPNRKD
VYEYISLNDLKYKIKLPDISFYELKNDILSKNKNLHLWIFQKFTDEFLAMWFGVQPYRISNLREMLVISRQGFIPKDLFN
EIRKLCNMGVSVIISFIQSKLFDEPFKKRDCTQALKDASVISSPFDTLWNLISKQVCDNSAEERFTQTILDFTSEFDNFL
GIPNYKFAKNQKLVNTISKSLDACAKFIRDCPKDKQTEIFPLQGLHTATVKRRNEILTNVMPKFARQEPFVVLFQGPGGI
GKTHLVQQLATKCVNSFYQDHEDDYIEISPDDKYWPPLSGQRVAFFDEAGNLNDLTEDLLFRNIKSICSPAYFNCAAADI
EHKISPCPFELVFATVNTDLDTLQSKISSTFGQASVFPIWRRCIVVECSWNEKELGPFNYKNPSGHRSDYSHITMNYMSY
DDKTQKLALEKEINFDTLFDMIRLRFRKKQQEHDTKISILNNEIQRQSNSKQHFSVCLYGEPGQGKTYNLNKLITTFANA
TNLKIGSEEKPSIHIFDDYIKDENDENCSKFMDIYNNKLPNNSVIFSATNVYPKTHFFPTFFLTNLIYAFIQPFKQVGLY
RRLGFDGYTDIPNSSVNAPIFVQNFKFYERKQHICYFLSLEFLKNIICYIFFFLYFPLKFIKKIDLIEIKDVNKYVYDRY
INFLSLSKQIEIVEYPPNLENVEFDFRFNMNKFHRVSFNNPFELDKYIHFNKNSYENLLHFDWKMYLSPRVKHRLALSYE
KFFITISEVNKEIIIEELKRYVLLFKQFNIDPNMEINLGEYGSFYYINGKIHLMTINIESNVSEIPVFTDGDYVYISEHK
IPVIDLFDNININSKYNLSFDQSIALNSFKTGDSFYSNAKVRKSLSKFVLLNYQTKFKLYLKEAKDKVKNFIETPIGHLL
SILLTIFVICYASFKIYSKFSNFFSKDQAIEDQRKGEKKIKKITNYDSDGVQPQRKGEKKIKKVTNYDSDGVQPQSNVKV
EEEIKLVFDPTGQKLLFGNDFTSELETLVELEKDDEEFTKSKIDNKSMAGLRREVRRRRYARSKKAQIEKQEVLTLPDVN
GFEGGKPYFQIAEEKARKNLCQIYMIANNENCIASKFSDHIVCYGLFVFKKRLASVGHIVEALKCAPGYNLYAGCDQFNG
KLYKMNLVRNYRKRELSVWDVDCPNDFVDLTSFFIPKEELYDAENCNTVLGRFGMNKREVYLYGNCEFIQEFFKVDNKGA
QEFGYIDWATVDITLTTGGDCGLPYYICERKKFHNKIMGLHFAGNNVNHKTIGMSALIYKEDLVVWKGAERQSKCKFCDV
KDIIIAQPDIPKEKYKGYNHEIVWNSLHESSPTTLNEELEHYLNIFPKFTGTIIKHSGDKFYGSVKHSHTQFISKFKTEL
TVTNGWKLSTAGDCQFESNHISPNTEVMYRVVDVQFNSIFKAFKSQPYIKNFRLIANVYEKDGKQRVTILTIIPVSDFNV
KQQTVRQALVPLHLNEDEEVYVTEDVSDIFKTAIKRKQRGILPDVPYETVENETVEILGITHRNMTPEPAQMYKPTPFYK
LALKFNLDHKLPVNFNMKDCPQEQKDMMVLDRLGQPNPRITQSLKWAHKDYSPDYELRKYVKEQYMCNIMEYYAGCNLLT
EEQILKGYGPNHRLYGALGGMEIDSSIGWTMKELYRVTKKSDVINLDSNGNYSFLNNEAAQYTQELLKISMEQAHNGQRY
YTAFNELMKMEKLKPSKNFIPRTFTAQDLNGVLMERWILGEFTARALAWDENCAVGCNPYATFHKFATKFFKFKNFFSCD
YKNFDRTIPKCVFEDFRDMLIQANPHMKNEIYACFQTIIDRIQVSGNSILLVHGGMPSGCVPTAPLNSKVNDIMIYTAYV
NILRRADRGDITSYRYYRDLVCRLFYGDDVIIAVDDSIADIFNCQTLSEEMKILFGMNMTDGSKSDIIPKFETIETLSFI
SRFFRPLKHQENFIVGALKKISIQTHFYYATDDTPEHFGQVFKTIQEEAALWEEEYFNKIQSYIQEIIRKFPEISKFFNF
ESYKSIQKRYIMNGWNEFVKLEKLDLNLNKKKSSKVTGIHSKQYSKFLKFLSRIENEKAALEGNFNKESVNTWYFKMSKA
MHLNEIFQKGLISKPLAEFYFNEGQKMWDCNITFRRSKDDLPFTFSGSGTTKACAREQAAEEALVLFSQEDEIVRQINDI
QSDCKFCKKMIRYKKLLSGVSIQRQMNVSKITENHVPSAGMMATDPSVAPDSGIATNTQTPSISRVLNPIARALDNPAGT
GAPFDKHTYVYNVFTRWPEMSTVVNKSLAAGAEVFKISLDPNKLPKRILQYIQFHKTIIPQIEVQILIGGAAGTVGWLKV
GWVPDASTAKKYSLDDLQLVASETINLNSTITMSMIINDSRRNGMFRLTKSDPEPWPGIVCLVEHPITNVQRNDDVNYPV
IVSVRLGPDCQLMQPYNDLN
>P05896 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARNSVLSGKKADELEKIRLRPGGKKKYMLKHVVWAANELDRFGLAESLLENKEGCQKILSVLAPLVPTGSENLKSLYN
TVCVIWCIHAEEKVKHTEEAKQIVQRHLVMETGTAETMPKTSRPTAPFSGRGGNYPVQQIGGNYTHLPLSPRTLNAWVKL
IEEKKFGAEVVSGFQALSEGCLPYDINQMLNCVGDHQAAMQIIRDIINEEAADWDLQHPQQAPQQGQLREPSGSDIAGTT
STVEEQIQWMYRQQNPIPVGNIYRRWIQLGLQKCVRMYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWMTQ
TLLIQNANPDCKLVLKGLGTNPTLEEMLTACQGVGGPGQKARLMAEALKEALAPAPIPFAAAQQKGPRKPIKCWNCGKEG
HSARQCRAPRRQGCWKCGKMDHVMAKCPNRQAGFFRPWPLGKEAPQFPHGSSASGADANCSPRRTSCGSAKELHALGQAA
ERKQREALQGGDRGFAAPQFSLWRRPVVTAHIEGQPVEVLLDTGADDSIVTGIELGPHYTPKIVGGIGGFINTKEYKNVE
IEVLGKRIKGTIMTGDTPINIFGRNLLTALGMSLNLPIAKVEPVKSPLKPGKDGPKLKQWPLSKEKIVALREICEKMEKD
GQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNRVTQDFTEVQLGIPHPAGLAKRKRITVLDIGDAYFSIPLDEEF
RQYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQYTMRHVLEPFRKANPDVTLVQYMDDILIASDRTDLEHDRVVLQ
LKELLNSIGFSSPEEKFQKDPPFQWMGYELWPTKWKLQKIELPQRETWTVNDIQKLVGVLNWAAQIYPGIKTKHLCRLIR
GKMTLTEEVQWTEMAEAEYEENKIILSQEQEGCYYQESKPLEATVIKSQDNQWSYKIHQEDKILKVGKFAKIKNTHTNGV
RLLAHVIQKIGKEAIVIWGQVPKFHLPVEKDVWEQWWTDYWQVTWIPEWDFISTPPLVRLVFNLVKDPIEGEETYYVDGS
CSKQSKEGKAGYITDRGKDKVKVLEQTTNQQAELEAFLMALTDSGPKANIIVDSQYVMGIITGCPTESESRLVNQIIEEM
IKKTEIYVAWVPAHKGIGGNQEIDHLVSQGIRQVLFLEKIEPAQEEHSKYHSNIKELVFKFGLPRLVAKQIVDTCDKCHQ
KGEAIHGQVNSDLGTWQMDCTHLEGKIVIVAVHVASGFIEAEVIPQETGRQTALFLLKLASRWPITHLHTDNGANFASQE
VKMVAWWAGIEHTFGVPYNPQSQGVVEAMNHHLKNQIDRIREQANSVETIVLMAVHCMNFKRRGGIGDMTPAERLINMIT
TEQEIQFQQSKNSKFKNFRVYYREGRDQLWKGPGELLWKGEGAVILKVGTDIKVVPRRKAKIIKDYGGGKEMDSSSHMED
TGEAREVA
>P05897 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARNSVLSGKKADELEKIRLRPGGKKKYMLKHVVWAANELDRFGLAESLLENKEGCQKILSVLAPLVPTGSENLKSLYN
TVCVIWCIHAEEKVKHTEEAKQIVQRHLVVETGTAETMPKTSRPTAPSSGRGGNYPVQQIGGNYVHLPLSPRTLNAWVKL
IEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGDHQAAMQIIRDIINEEAADWDLQHPQPAPQQGQLREPSGSDIAGTT
SSVDEQIQWMYRQQNPIPVGNIYRRWIQLGLQKCVRMYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDAAVKNWMTQ
TLLIQNANPDCKLVLKGLGVNPTLEEMLTACQGVGGPGQKARLMAEALKEALAPVPIPFAAAQKRGPRKPIKCWNCGKEG
HSARQCRAPRRQGCWKCGKMDHVMAKCPDRQAGFFRPWSMGKEAPQFPHGSSASGADANCSPRGPSCGSAKELHAVGQAA
ERKQREALQGGDRGFAAPQFSLWRRPVVTAHIEGQPVEVLLDTGADDSIVTGIELGPHYTPKIVGGIGGFINTKEYKNVE
IEVLGKRIKRTIMTGDTPINIFGRNLLTALGMSLNLPIAKVEPVKVALKPGKVGPKLKQWPLSKEKIVALREICEKMEKD
GQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIHFRELNRVTQELYRSPIRIPHPAGLAKRKRITVLDIGDAYFSIPLDEEF
RQYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQYTMRHVLEPFRKANPDVTLVQYMDDILIASDRTDLEHDRVVLQ
LKELLNSIGFSTPEEKFQKDPPFQWMGYELWPTKWKLQKIELPQRETWTVNDIQKLVGVLNWAAQIYPGIKTKHLCRLIR
GKMTLTEAVQWTEMAEAEYEENNIILSQEQEGCYYQEGKPLEATVIKSQDNQWTYKIHQEDKILKVRKFAKIKNTHTNGV
RLLAHVIQKIGKEAIVIVGQVPKFHLPVERDVWEQWWTDYWQVTWIPEWDFISTPPLVRLVFNLVKDPIEVEETYYTDGS
CNKQSKEGKAGYITDRGKDIVKVLTTTNQQAELEAIYHGIEDSGPKRNIIVELQVCYGNNNRFPTESESRLVNQIIEEMI
KVRVYVAWVPALEGIGGNQEIGPLVSQGFRQVLFLEKIEPAQEEHDKYHSNVKELVFKFGLPRIVARQIVDTCDKCHQKG
EAIHGQVNSDLGTWQMDCTHLEGKIVIVAVHVASGFIEAEVIPQETGRQHYFLLKLAGRWPYLHIYTHSNGANFASQEVK
MVTWWAGIEAHLWVPYNPQSQGVVEAMNHHLKNQIDRIREQANSVETIVLMAVHCMNFKRRGGIGDMTPAERLINMITTE
QEIQFQQSKNSKFKNFRVYYREGRDQLWKGPGELLWKGEGAVILKVGTDIKVVPRRKAKIIKDYGGGKEVDSSSHMEDTG
EAREVA
>P19505 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGARNSVLSGKKADELEKIRLRPGGKKRYQLKHIVWAANELDRFGLAESLLENKEGCQKILSVLAPLVPTGSENLKSLYN
TVCVLWCIHAEEKVKHTEEAKQIVQRHLVVETGTADKMPATSRPTAPPSGKGGNYPVQQIGGNYTHLPLSPRTLNAWVKL
IEEKKFGAEVVPGFQALSEGCTPYDINQMLNCVGEHQAAMQIIREIINEEAADWDLQHPQPGPIPPGQLREPRGSDIAGT
TSTVDEQIQWMYRQQNPIPVGNIYRRWIQLGLQKCVRMYNPTNILDVKQGPKEPFQSYVDRFYKSLRAEQTDPAVKNWMT
QTLLIQNANPDCKLVLKGLGINPTLEEMLTACQGVGGPGQKARLMAEALKDALTQGPLPFAAVQQKGQRKIIKCWNCGKE
GHSARQCRAPRRQGCWKCGKAGHVMAKCPERQAGFFRAWPMGKEAPQFPHGPDASGADTNCSPRGSSCGSTEELHEDGQK
AEGEQRETLQGGNGGFAAPQFSLWRRPIVTAYIEEQPVEVLLDTGADDSIVAGIELGPNYTPKIVGGIGGFINTKEYKDV
KIKVLGKVIKGTIMTGDTPINIFGRNLLTAMGMSLNLPIAKVEPIKVTLKPGKDGPKLRQWPLSKEKIIALREICEKMEK
DGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNKVTQDFTEVQLGIPHPAGLAKRRRITVLDVGDAYFSIPLDEE
FRQYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQHTMRNVLEPFRKANPDVTLIQYMDDILIASDRTDLEHDRVVL
QLKELLNSIGFSTPEEKFQKDPPFQWMGYELWPTKWKLQKIELPQRETWTVNDIQKLVGVLNWAAQIYPGIKTKHLCRLI
RGKMTLTEEVQWTEMAEAEYEENKIILSQEQEGCYYQEGKPLEATVIKSQDNQWSYKIHQEDKILKVGKFAKIKNTHTNG
VRLLAHVVQKIGKEAIVIWGQVPRFHLPVEREIWEQWWTDYWQVTWIPEWDFVSTPPLVRLVFNLVKEPIQGAETFYVDG
SCNRQSREGKAGYVTDRGRDKAKLLEQTTNQQAELEAFYLALADSGPKANIIVDSQYVMGIVAGQPTESESRLVNQIIEE
MIKKEAIYVAWVPAHKGIGGNQEVDHLVSQGIRQVLFLEKIEPAQEEHEKYHSNVKELVFKFGLPRLVAKQIVDTCDKCH
QKGEAIHGQVNAELGTWQMDCTHLEGKIIIVAVHVASGFIEAEVIPQETGRQTALFLLKLASRWPITHLHTDNGANFTSQ
EVKMVAWWAGIEQTFGVPYNPQSQGVVEAMNHHLKTQIDRIREQANSIETIVLMAVHCMNFKRRGGIGDMTPAERLVNMI
TTEQEIQFQQSKNSKFKNFRVYYREGRDQLWKGPGELLWKGEGAVILKVGTEIKVVPRRKAKIIKDYGGGKELDSGSHLE
DTGEAREVA
>P03364 ~~~pol~~~Gag-Pro-Pol polyprotein~~~
MGQASSHSENDLFISHLKESLKVRRIRVRKKDLVSFFSFIFKTCPWFPQEGSIDSRVWGRVGDCLNDYYRVFGPETIPIT
TFNYYNLIRDVLTNQSDSPDIQRLCKEGHKILISHSRPPSRQAPVTITTSEKASSRPPSRAPSTCPSVAIDIGSHDTGQS
SLYPNLATLTDPPIQSPHSRAHTPPQHLPLLANSKTLHNSGSQDDQLNPADQADLEEAAAQYNNPDWPQLTNTPALPPFR
PPSYVSTAVPPVAVAAPVLHAPTSGVPGSPTAPNLPGVALAKPSGPIDETVSLLDGVKTLVTKLSDLALLPPAGVMAFPV
TRSQGQVSSNTTGRASPHPDTHTIPEEEEADSGESDSEDDEEESSEPTEPTYTHSYKRLNLKTIEKIKTAVANYGPTAPF
TVALVESLSERWLTPSDWFFLSRAALSGGDNILWKSEYEDISKQFAERTRVRPPPKDGPLKIPGASPYQNNDKQAQFPPG
LLTQIQSAGLKAWKRLPQKGAATTSLAKIRQGPDESYSDFVSRLQETADRLFGSGESESSFVKHLAYENANPACQSAIRP
FRQKELSTMSPLLWYCSAHAVGLAIGAALQNLAPAQLLEPRPAFAIIVTNPAIFQETAPKKIQPPTQLPTQPNAPQASLI
KNLGPTTKCPRCKKGFHWASECRSRLDINGQPIIKQGNLEQGPAPGPHYRDELRGFTVHPPIPPANPCPPSNQPRRYVTD
LWRATAGSAGLDLCTTTDTILTTQNSPLTLPVGIYGPLPPQTFGLILAEPALPSKGIQVLPGILDNDFEGEIHIILSTTK
DLVTIPKGTRLAQIVILPLQQINSNFHKPYRGASAPGSSDVYWVQQISQQRPTLKLKLNGKLFSGILDTGADATVISYTH
WPRNWPLTTVATHLRGIGQATNPQQSAQMLKWEDSEGNNGHITPYVLPNLPVNLWGRDILSQMKLVMCSPNDTVMTQMLS
QGYLPGQGLGKNNQGITQPITITPKKDKTGLGFHQNLPRSRAIDIPVPHADKISWKITDPVWVDQWPLTYEKTLAAIALV
QEQLAAGHIEPTNSPWNTPIFIIKKKSGSWRLLQDLRAVNKVMVPMGALQPGLPSPVAIPLNYHKIVIDLKDCFFTIPLH
PEDRPYFAFSVPQINFQSPMPRYQWKVLPQGMANSPTLCQKFVAAAIAPVRSQWPEAYILHYMDDILLACDSAEAAKACY
AHIISCLTSYGLKIAPDKVQVSEPFSYLGFELHHQQVFTPRVCLKTDHLKTLNDFQKLLGDIQWLRPYLKLPTSALVPLN
NILKGDPNPLSVRALTPEAKQSLALINKAIQNQSVQQISYNLPLVLLLLPTPHTPTAVFWQPNGTDPTKNGSPLLWLHLP
ASPSKVLLTYPSLLAMLIIKGRYTGRQLFGRDPHSIIIPYTQDQLTWLLQTSDEWAIALSSFTGDIDNHYPSDPVIQFAK
LHQFIFPKITKCAPIPQATLVFTDGSSNGIAAYVIDNQPISIKSPYLSAQLVELYAILQVFTVLAHQPFNLYTDSAYIAQ
SVPLLETVPFIKSSTNATPLFSKLQQLILNRQHPFFIGHLRAHLNLPGPLAEGNALADAATQIFPIISDPIHEATQAHTL
HHLNAHTLRLLYKITREQARDIVKACKQCVVATPVPHLGVNPRGLVPNAIWQMDVTHFTPFGKQRFVHVTVDTFSGFILA
TPQTGEASKNVISHVIHCLATIGKPHTIKTDNGPGYTGKNFQDFCQKLQIKHVTGIPYNPQGQGVVERAHQTLKNALNRL
ARSPLGFSMQQPRNLLSHALFQLNFLQLDSQGRSAADRLWHPQTSQQHATVMWRDPLTSVWKGPDPVLIWGRGSACIYDQ
KEDGPRWLPERLIRHINNQTAPLCDRPSNPNTAPGPKGSP
>P03370 ~~~pol~~~Gag-Pol polyprotein~~~
MAKQGSKEKKGYPELKEVIKATCKIRVGPGKETLTEGNCLWALKTIDFIFEDLKTEPWTITKMYTVWDRLKGLTPEETSK
REFASLQATLACIMCSQMGMKPETVQAAKGIISMKEGLHENKEAKGEKVEQLYPNLEKHREVYPIVNLQAGGRSWKAVES
VVFQQLQTVAMQHGLVSEDFERQLAYYATTWTSKDILEVLAMMPGNRAQKELIQGKLNEEAERWVRQNPPGPNVLTVDQI
MGVGQTNQQASQANMDQARQICLQWVITALRSVRHMSHRPGNPMLVKQKNTESYEDFIARLLEAIDAEPVTDPIKTYLKV
TLSYTNASTDCQKQMDRTLGTRVQQATVEEKMQACRDVGSEGFKMQLLAQALRPQGKAGHKGVNQKCYNCGKPGHLARQC
RQGIICHHCGKRGHMQKDCRQKKQQGKQQEGATCGAVRAPYVVTEAPPKIEIKVGTRWKKLLVDTGADKTIVTSHDMSGI
PKGRIILQGIGGIIEGEKWEQVHLQYKDKMIKGTIVVLATSPVEVLGRDNMRELGIGLIMANLEEKKIPSTRVRLKEGCK
GPHIAQWPLTQEKLEGLKEIVDRLEKEGKVGRAPPHWTCNTPIFCIKKKSGKWRMLIDFRELNKQTEDLAEAQLGLPHPG
GLQRKKHVTILDIGDAYFTIPLYEPYRQYTCFTMLSPNNLGPCVRYYWKVLPQGWKLSPAVYQFTMQKILRGWIEEHPMI
QFGIYMDDIYIGSDLGLEEHRGIVNELASYIAQYGFMLPEDKRQEGYPAKWLGFELHPEKWKFQKHTLPEITEGPITLNK
LQKLVGDLVWRQSLIGKSIPNILKLMEGDRALQSERYIESIHVREWEACRQKLKEMEGNYYDEEKDIYGQLDWGNKAIEY
IVFQEKGKPLWVNVVHSIKNLSQAQQIIKAAQKLTQEVIIRTGKIPWILLPGREEDWILELQMGNINWMPSFWSCYKGSV
RWKKRNVIAELVPGPTYYTDGGKKNGRGSLGYIASTGEKFRIHEEGTNQQLELRAIEEACKQGPEKMNIVTDSRYAYEFM
LRNWDEEVIRNPIQARIMELVHNKEKIGVHWVPGHKGIPQNEEIDRYISEIFLAKEGRGILQKRAEDAGYDLICPQEISI
PAGQVKRIAIDLKINLKKDQWAMIGTKSSFANKGVFVQGGIIDSGYQGTIQVVIYNSNNKEVVIPQGRKFAQLILMPLIH
EELEPWGETRKTERGEQGFGSTGMYWIENIPLAEEEHNKWHQDAVSLHLEFGIPRTAAEDIVQQCDVCQENKMPSTLRGS
NKRGIDHWQVDYTHYEDKIILVWVETNSGLIYAERVKGETGQEFRVQTMKWYAMFAPKSLQSDNGPAFVAESTQLLMKYL
GIEHTTGIPWNPQSQALVERTHQTLKNTLEKLIPMFNAFESALAGTLITLNIKRKGGLGTSPMDIFIFNKEQQRIQQQSK
SKQEKIRFCYYRTRKRGHPGEWQGPTQVLWGGDGAIVVKDRGTDRYLVIANKDVKFIPPPKEIQKE
>P35956 ~~~pol~~~Gag-Pol polyprotein~~~
MAKQGSKEKKGYPELKEVIKATCKIRVGPGKETLTEGNCLWALKTIDFIFEDLKTEPWTITKMYTVWDRLKGLTPEETSK
REFASLQATLACIMCSQMGMKPETVQAAKGIISMKEGLHENKEAKGEKVEQLYPNLEKHREVYPIVNLQAGGRSWKAVES
VVFQQLQTVAMQHGLVSEDFERQLAYYATTWTSKDILEVLAMMPGNRAQKELIQGKLNEEAERWVRQNPPGPNVLTVDQI
MGVGQTNQQASQANMDQARQICLQWVITALRSVRHMSHRPGNPMLVKQKNTESYEDFIARLLEAIDAEPVTDPIKTYLKV
TLSYTNASTDCQKQMDRTLGTRVQQATVEEKMQACRDVGSEGFKMQLLAQALRPQGKAGQKGVNQKCYNCGKPGHLARQC
RQGIICHHCGKRGHMQKDCRQKKQQGKQQEGATCGAVRAPYVVTEAPPKIEIKVGTRWKKLLVDTGADKTIVTSHDMSGI
PKGRIILQGIGGIIEGEKWEQVHLQYKDKIIRGTIVVLATSPVEVLGRDNMRELGIGLIMANLEEKKIPSTRVRLKEGCK
GPHIAQWPLTQEKLEGLKEIVDRLEKEGKVGRAPPHWTCNTPIFCIKKKSGKWRMLIDFRELNKQTEDLAEAQLGLPHPG
GLQRKKHVTILDIGDAYFTIPLYEPYRQYTCFTMLSPNNLGPCVRYYWKVLPQGWKLSPAVYQFTMQKILRGWIEEHPMI
QFGIYMDDIYIGSDLGLEEHRGIVNELASYIAQYGFMLPEDKRQEGYPAKWLGFELHPEKWKFQKHTLPEITEGPITLNK
LQKLVGDLVWRQSLIGKSIPNILKLMEGDRALQSERYIESIHVREWEACRQKLKEMEGNYYDEEKDIYGQLDWGNKAIEY
IVFQEKGKPLWVNVVHSIKNLSQAQQIIKAAQKLTQEVIIRTGKIPWILLPGREEDWILELQMGNINWMPSFWSCYKGSV
RWKKRNVIAEVVPGPTYYTDGGKKNGRGSLGYITSTGEKFRIHEEGTNQQLELRAIEEACKQGPEKMNIVTDSRYAYEFM
LRNWDEEVIRNPIQARIMELVHNKEKIGVHWVPGHKGIPQNEEIDRYISEIFLAKEGRGILQKRAEDAGYDLICPQEISI
PAGQVKRIAIDLKINLKKDQWAMIGTKSSFANKGVFVQGGIIDSGYQGTIQVVIYNSNNKEVVIPQGRKFAQLILMPLIH
EELEPWGETRKTERGEQGFGSTGMYWIENIPLAEEEHNKWHQDAVSLHLEFGIPRTAAEDIVQQCDVCQENKMPSTLRGS
NKRGIDHWQVDYTHYEDKIILVWVETNSGLIYAERVKGETGQEFRVQTMKWYAMFAPKSLQSDNGPAFVAESTQLLMKYL
GIEHTTGIPWNPQSQALVERTHQTLKNTLEKLIPMFNAFESALAGTLITLNIKRKGGLGTSPMDIFIFNKEQQRIQQQSK
SKQEKIRFCYYRTRKRGHPGEWQGPTQVLWGGDGAIVVKDRGTDRYLVIANKDVKFIPPPKEIQKE
>O92815 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGNSSSTPPPSALKNSDLFKTMLRTQYSGSVKTRRINQDIKKQYPLWPDQGTCATKHWEQAVLIPLDSVSEETAKVLNFL
RVKIQARKGETARQMTAHTIKKLIVGTIDKNKQQTEILQKTDESDEEMDTTNTMLFIARNKRERIAQQQQADLAAQQQVL
LLQREQQREQREKDIKKRDEKKKKLLPDTTQKVEQTDIGEASSSDASAQKPISTDNNPDLKVDGVLTRSQHTTVPSNITI
KKDGTSVQYQHPIRNYPTGEGNLTAQVRNPFRPLELQQLRKDCPALPEGIPQLAEWLTQTMAIYNCDEADVEQLARVIFP
TPVRQIAGVINGHAAANTAAKIQNYVTACRQHYPAVCDWGTIQAFTYKPPQTAHEYVKHAEIIFKNNSGLEWQHATVPFI
NMVVQGLPPKVTRSLMSGNPDWSTKTIPQIIPLMQHYLNLQSRQDAKIKQTPLVLQLAMPAQTMNGNKGYVGSYPTNEPY
YSFQQQQRPAPRAPPGNVPSNTCFFCKQPGHWKADCPNKTRNLRNMGNMGRGGRMGGPPYRSQPYPAFIQPPQNHQNQYN
GRMDRSQLQASAQEWLPGTYPAXDPIDCPYEKSGTKTTQDVITTKNAEIMVTVNHTKIPMLVDTGACLTAIGGAATVVPD
LKLTNTEIIAVGISAEPVPHVLAKPTKIQIENTNIDISPWYNPDQTFHILGRDTLSKMRAIVSFEKNGEMTVLLPPTYHK
QLSCQTKNTLNIDEYLLQFPDQLWASLPTDIGRMLVPPITIKIKDNASLPSIRQYPLPKDKTEGLRPLISSLENQGILIK
CHSPCNTPIFPIKKAGRDEYRMIHDLRAINNIVAPLTAVVASPTTVLSNLAPSLHWFTVIDLSNAFFSVPIHKDSQYLFA
FTFEGHQYTWTVLPQGFIHSPTLFSQALYQSLHKIKFKISSEICIYMDDVLIASKDRDTNLKDTAVMLQHLASEGHKVSK
KKLQLCQQEVVYLGQLLTPEGRKILPDRKVTVSQFQQPTTIRQIRAFLGLVGYCRHWIPEFSIHSKFLEKQLKKDTAEPF
QLDDQQVEAFNKLKHAITTAPVLVVPDPAKPFQLYTSHSEHASIAVLTQKHAGRTRPIAFLSSKFDAIESGLPPCLKACA
SIHRSLTQADSFILGAPLIIYTTHAICTLLQRDRSQLVTASRFSKWEADLLRPELTFVACSAVSPAHLYMQSCENNIPPH
DCVLLTHTISRPRPDLSDLPIPDPDMTLFSDGSYTTGRGGAAVVMHRPVTDDFIIIHQQPGGASAQTAELLALAAACHLA
TDKTVNIYTDSRYAYGVVHDFGHLWMHRGFVTSAGTPIKNHKEIEYLLKQIMKPKQVSVIKIEAHTKGVSMEVRGNAAAD
EAAKNAVFLVQRVLKKGDALASTDLVMEYSETDEKFTAGAELHDGVFMRGDLIVPPLEMLHAILLAIHGVSHTHKGGIMS
YFSKFWTHPKASQTIDLILGHCQICLKHNPKYKSRLQGHRPLPSRPFAHLQIDFVQMCVKKPMYALVIIDVFSKWPEIIP
CNKEDAKTVCDILMKDIIPRWGLPDQIDSDQGTHFTAKISQELTHSIGVAWKLHCPGHPRSSGIVERTNRTLKSKIIKAQ
EQLQLSKWTEVLPYVLLEMRATPKKHGLSPHEIVMGRPMKTTYLSDMSPLWATDTLVTYMNKLTRQLSAYHQQVVDQWPS
TSLPPGPEPGSWCMLRNPKKSSNWEGPFLILLSTPTAVKVEGRPTWIHLDHCKLLRSSLSSSLGGPVNQLLS
>Q2F7J3 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGQTVTTPLSLTLQHWGDVQRIASNQSVDVKKRRWVTFCSAEWPTFNVGWPQDGTFNLGVISQVKSRVFCPGPHGHPDQV
PYIVTWEALAYDPPPWVKPFVSPKPPPLPTAPVLPPGPSAQPPSRSALYPALTLSIKSKPPKPQVLPDSGGPLIDLLTED
PPPYGVQPSSSARENNEEEAATTSEVSPPSPMVSRLRGRRDPPAADSTTSQAFPLRMGGDGQLQYWPFSSSDLYNWKNNN
PSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLEAGKAVRGNDGRPTQLPNEVNAAFPLERPDWDYT
TTEGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQSA
PDIGRKLERLEDLKSKTLGDLVREAEKIFNKRETPEEREERIRREIEEKEERRRAEDEQRERERDRRRHREMSKLLATVV
IGQRQDRQGGERRRPQLDKDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLGDXGGQGQEPPPEPRITLKVGGQPVT
FLVDTGAQHSVLTQNPGPLSDKSAWVQGATGGKRYRWTTDRKVHLATGKVTHSFLHVPDCPYPLLGRDLLTKLKAQIHFE
GSGAQVVGPMGQPLQVLTLNIENKYRLHETSKEPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQ
YPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSH
QWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADFRIQHPDLILLQYV
DDLLLAATSEQDCQRGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLREF
LGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQ
KLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAM
LLDTDRVQFGPVVALNPATLLPLPEKEAPHDCLEILAETHGTRPDLTDQPIPDADYTWYTDGSSFLQEGQRRAGAAVTTE
TEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHVHGEIYRRRGLLTSEGREIKNKNEILALLKA
LFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREAAMKAVLETSTLLIEDSTPYTPPHFHYTETDLKRLRELGATYNQT
KGYWVLQGKPVMPDQSVFELLDSLHRLTHPSPQKMKALLDREESPYYMLNRDRTIQYVTETCTACAQVNASKAKIGAGVR
VRGHRPGTHWEVDFTEVKPGLYGYKYLLVFVDTFSGWVEAFPTKRETAKVVTKKLLEDIFPRFGMPQVLGSDNGPAFASQ
VSQSVADLLGIDWKLHCAYRPQSSGQVERMNRTIKETLTKLTLASGTRDWVLLLPLALYRARNTPGPHGLTPYEILYGAP
PPLVNFHDPEMSKLTNSPSLQAHLQALQAVQQEVWKPLAAAYQDQLDQPVIPHPFRVGDAVWVRRHQTKNLEPRWKGPYT
VLLTTPTALKVDGISAWIHAAHVKAATTPPAGTAWKVQRSQNPLKIRLTRGAP
>A1Z651 ~~~gag-pol~~~Gag-Pol polyprotein~~~
MGQTVTTPLSLTLQHWGDVQRIASNQSVDVKKRRWVTFCSAEWPTFNVGWPQDGTFNLGVISQVKSRVFCPGPHGHPDQV
PYIVTWEALAYDPPPWVKPFVSPKPPPLPTAPVLPPGPSAQPPSRSALYPALTPSIKSKPPKPQVLPDSGGPLIDLLTED
PPPYGAQPSSSARENNEEEAATTSEVSPPSPMVSRLRGRRDPPAADSTTSQAFPLRMGGDGQLQYWPFSSSDLYNWKNNN
PSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLEARKAVRGNDGRPTQLPNEVNAAFPLERPDWDYT
TTEGRNHLVLYRQLLLAGLQNAGRSPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQSA
PDIGRKLERLEDLKSKTLGDLVREAEKIFNKRETPEEREERIRREIEEKEERRRAEDEQRERERDRRRHREMSKLLATVV
IGQRQDRQGGERRRPQLDKDQCAYCKEKGHWAKDCPKKPRGPRGPRPQTSLLTLGDXGGQGQEPPPEPRITLKVGGQPVT
FLVDTGAQHSVLTQNPGPLSDKSAWVQGATGGKRYRWTTDRKVHLATGKVTHSFLHVPDCPYPLLGRDLLTKLKAQIHFE
GSGAQVVGPMGQPLQVLTLNIEDEYRLHETSKEPDVPLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQ
YPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSH
QWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADFRIQHPDLILLQYV
DDLLLAATSEQDCQRGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLREF
LGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPDQQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQ
KLGPWRRPVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNARMTHYQAM
LLDTDRVQFGPVVALNPATLLPLPEKEAPHDCLEILAETHGTRPDLTDQPIPDADYTWYTDGSSFLQEGQRRAGAAVTTE
TEVIWARALPAGTSAQRAELIALTQALKMAEGKKLNVYTDSRYAFATAHVHGEIYRRRGLLTSEGREIKNKNEILALLKA
LFLPKRLSIIHCPGHQKGNSAEARGNRMADQAAREAAMKAVLETSTLLIEDSTPYTPPHFHYTETDLKRLRELGATYNQT
KGYWVLQGKPVMPDQSVFELLDSLHRLTHLSPQKMKALLDREESPYYMLNRDRTIQYVTETCTACAQVNASKAKIGAGVR
VRGHRPGTHWEVDFTEVKPGLYGYKYLLVFVDTFSGWVEAFPTKRETAKVVSKKLLEDIFPRFGMPQVLGSDNGPAFASQ
VSQSVADLLGIDWKLHCAYRPQSSGQVERMNRTIKETLTKLTLASGTRDWVLLLPLALYRARNTPGPHGLTPYEILYGAP
PPLVNFHDPEMSKLTNSPSLQAHLQALQAVQQEVWKPLAAAYQDQLDQPVIPHPFRVGDAVWVRRHQTKNLEPRWKGPYT
VLLTTPTALKVDGISAWIHAAHVKAATTPPAGTAWKVQRSQNPLKIRLTRGAP
>A7XXB9 ~~~~~~Portal protein~~~
MAKRGRKPKELVPGPGSIDPSDVPKLEGASVPVMSTSYDVVVDREFDELLQGKDGLLVYHKMLSDGTVKNALNYIFGRIR
SAKWYVEPASTDPEDIAIAAFIHAQLGIDDASVGKYPFGRLFAIYENAYIYGMAAGEIVLTLGADGKLILDKIVPIHPFN
IDEVLYDEEGGPKALKLSGEVKGGSQFVNGLEIPIWKTVVFLHNDDGSFTGQSALRAAVPHWLAKRALILLINHGLERFM
IGVPTLTIPKSVRQGTKQWEAAKEIVKNFVQKPRHGIILPDDWKFDTVDLKSAMPDAIPYLTYHDAGIARALGIDFNTVQ
LNMGVQAVNIGEFVSLTQQTIISLQREFASAVNLYLIPKLVLPNWPGATRFPRLTFEMEERNDFSAAANLMGMLINAVKD
SEDIPTELKALIDALPSKMRRALGVVDEVREAVRQPADSRYLYTRRRR
>A0A385DT68 ~~~~~~Portal protein~~~
MADFLNFPRQMLPFSKKTKQWRKDCLLWANQKTFFNYSLVRKSVIHKKINYDLLNGRLHMSDLELVLNPDGIKAAYIPDR
LQHYPIMNSKLNVLRGEESKRVFDFKVVVTNPNAISEIEDNKKNELLQRLQEMITDTSISEDEYNIKLEKLNDYYTYEWQ
DIREVRANELLNHYIKEYDIPLIFNNGFMDAMTCGEEIYQCDIVGGEPVIERVNPLKIRIFKSGYSNKVEDADMIILEDY
WSPGRVIDTYYDVLSPKDIKYIETMPDYIGQGAVDQMDNIDERYGFVNQNMIGDEITVRDGTYFFDPANLFTEGIANSLL
PYDLAGNLRVLRLYWKSKRKILKVKSYDPETGEEEWNFYPENYVVNKEAGEEVQSFWVNEAWEGTMIGNEIFVNMRPRLI
QYNRLNNPSRCHFGIVGSIYNLNDSRPFSLVDMMKPYNYLYDAIHDRLNKAIASNWGSILELDLSKVPKGWDVGKWMYYA
RVNHIAVIDSFKEGTIGASTGKLAGALNNAGKGMIETNIGNYIQQQINLLEFIKMEMADVAGISKQREGQISQRETVGGV
ERATLQSSHITEWLFTIHDDVKKRALECFLETAKVALKGRNKKFQYILSDTSTRVMEIDGDEFAEADYGLVVDNSNGTQE
LQQKLDTLAQAALQTQTLSFSTITKLYTSSSLAEKQRLIEKDEKQIRERQAQAQKEQLEAQQQIAAMQQQQKEAELLQKE
EANIRDNQTKIIIAQIQSEGGPDEEDGIMIDDYSPEAKANLAEKIREFDEKLKLDKDKLKLDKKKAETDASIKRQALRKK
SSTTNK
>A0A1L4BKQ4 ~~~~~~Portal protein~~~
MAKRGRKPKELVPGPGSIDPSDVPKLEGASVPVMSTSYDVVVDREFDELLQGKDGLLVYHKMLSDGTVKNALNYIFGRIR
SAKWYVEPASTDPEDIAIAAFIHAQLGIDDASVGKYPFGRLFAIYENAYIYGMAAGEIVLTLGADGKLILDKIVPIHPFN
IDEVLYDEEGGPKALKLSGEVKGGSQFVSGLEIPIWKTVVFLHNDDGSFTGQSALRAAVPHWLAKRALILLINHGLERFM
IGVPTLTIPKSVRQGTKQWEAAKEIVKNFVQKPRHGIILPDDWKFDTVDLKSAMPDAIPYLTYHDAGIARALGIDFNTVQ
LNMGVQAINIGEFVSLTQQTIISLQREFASAVNLYLIPKLVLPNWPSATRFPRLTFEMEERNDFSAAANLMGMLINAVKD
SEDIPTELKALIDALPSKMRRALGVVDEVREAVRQPADSRYLYTRRRR
>I7HHN4 ~~~~~~Portal protein~~~
MDFTTLQNDFTNDYQKALIANNEFLEAKKYYNGNQLPQDVLNIILERGQTPIIENMFKVIVNKILGYKIESISEIRLSPK
QEEDRALSDLLNSLLQVFIQQENYDKSMIERDKNLLIGGLGVIQLWVSQDKDKNVEIEIKAIKPESFVIDYFSTDKNALD
ARRFHKMLEVSEQEALLLFGDSVIVNYSNVNHERIASVIESWYKEYNEETQSYEWNRYLWNRNTGIYKSEKKPFKNGACP
FIVSKLYTDELNNYYGLFRDIKPMQDFINYAENRMGNMMGSFKAMFEEDAVVDVAEFVETMSLDNAIAKVRPNALKDHKI
QFMNNQADLSALSQKAEQKRQLLRLLAGLNDESLGMAVNRQSGVAIAQRKESGLMGLQTFLKATDDMDRLIFRLAVSFIC
EYFTKEQVFKIVDKKLGDRYFKINSNDDNKIRPLKFDLILKSQLKTESRDEKWYNWNELLKILAPIRPDLVPSLVPLMLN
DMDSPITNDVLEAIQNANALQQQNAEANAPYNQQIQALQIQKLQAEIMELQAKAHKYAEQGALSQTTNESEKINQAVAIT
EMQQQNANNANNEESNNKPKKKLKTSDKTTWRKYPSAQNLDY
>D3WAC3 ~~~~~~Probable portal protein~~~
MNLFGKVVSFSRGKLNNDTQRVTAWQNEAVEYTSAFVTNIHNKIANEITKVEFNHVKYKKSDVGSDTLISMAGSDLDEVL
NWSPKGERNSMDFWRKVIKKLLRAPYVDLYAVFDDNTGELLDLLFADDKKEYKPEELVRLTSPFYINEDTSILDNALASI
QTKLEQGKLRGLLKINAFLDIDNTQEYREKALTTIKNMQEGSSYNGLTPVDNKTEIVELKKDYSVLNKDEIDLIKSELLT
GYFMNENILLGTASQEQQIYFYNSTIIPLLIQLEKELTYKLISTNRRRVVKGNLYYERIIVDNQLFKFATLKELIDLYHE
NINGPIFTQNQLLVKMGEQPIEGGDVYIANLNAVAVKNLSDLQGSRKDVTSTDETNNQ
>Q9T1W5 ~~~H~~~Portal protein~~~
MGRILDISGQPFDFDDEMQSRSDELAMVMKRTQEHPSSGVTPNRAAQMLRDAERGDLTAQADLAFDMEEKDTHLFSELSK
RRLAIQALEWRIAPARDASAQEKKDADMLNEYLHDAAWFEDALFDAGDAILKGYSMQEIEWGWLGKMRVPVALHHRDPAL
FCANPDNLNELRLRDASYHGLELQPFGWFMHRAKSRTGYVGTNGLVRTLIWPFIFKNYSVRDFAEFLEIYGLPMRVGKYP
TGSTNREKATLMQAVMDIGRRAGGIIPMGMTLDFQSAADGQSDPFMAMIGWAEKAISKAILGGTLTTEAGDKGARSLGEV
HDEVRREIRNADVGQLARSINRDLIYPLLALNSDSTIDINRLPGIVFDTSEAGDITALSDAIPKLAAGMRIPVSWIQEKL
HIPQPVGDEAVFTIQPVVPDNGSQKEAALSAEDIPQEDDIDRMGVSPEDWQRSVDPLLKPVIFSVLKDGPEAAMNKAASL
YPQMDDAELIDMLTRAIFVADIWGRLDAAADH
>P25480 ~~~Q~~~Probable portal protein~~~
MSKKKGKTPQPAAKTMTASGPKMEAFTFGEPVPVLDRRDILDYVECISNGRWYEPPVSFTGLAKSLRAAVHHSSPIYVKR
NILASTFIPHPWLSQQDFSRFVLDFLVFGNAFLEKRYSTTGKVIRLETSPAKYTRRGVEEDVYWWVPSFNEPTAFAPGSV
FHLLEPDINQELYGLPEYLSALNSAWLNESATLFRRKYYENGAHAGYIMYVTDAVQDRNDIEMLRENMVKSKGRNNFKNL
FLYAPQGKADGIKIIPLSEVATKDDFFNIKKASAADLLDAHRIPFQLMGGKPENVGSLGDIEKVAKVFVRNELIPLQDRI
REINGWLGQEVIRFKNYSLDTDND
>P26744 ~~~1~~~Portal protein~~~
MADNENRLESILSRFDADWTASDEARREAKNDLFFSRVSQWDDWLSQYTTLQYRGQFDVVRPVVRKLVSEMRQNPIDVLY
RPKDGARPDAADVLMGMYRTDMRHNTAKIAVNIAVREQIEAGVGAWRLVTDYEDQSPTSNNQVIRREPIHSACSHVIWDS
NSKLMDKSDARHCTVIHSMSQNGWEDFAEKYDLDADDIPSFQNPNDWVFPWLTQDTIQIAEFYEVVEKKETAFIYQDPVT
GEPVSYFKRDIKDVIDDLADSGFIKIAERQIKRRRVYKSIITCTAVLKDKQLIAGEHIPIVPVFGEWGFVEDKEVYEGVV
RLTKDGQRLRNMIMSFNADIVARTPKKKPFFWPEQIAGFEHMYDGNDDYPYYLLNRTDENSGDLPTQPLAYYENPEVPQA
NAYMLEAATSAVKEVATLGVDTEAVNGGQVAFDTVNQLNMRADLETYVFQDNLATAMRRDGEIYQSIVNDIYDVPRNVTI
TLEDGSEKDVQLMAEVVDLATGEKQVLNDIRGRYECYTDVGPSFQSMKQQNRAEILELLGKTPQGTPEYQLLLLQYFTLL
DGKGVEMMRDYANKQLIQMGVKKPETPEEQQWLVEAQQAKQGQQDPAMVQAQGVLLQGQAELAKAQNQTLSLQIDAAKVE
AQNQLNAARIAEIFNNMDLSKQSEFREFLKTVASFQQDRSEDARANAELLLKGDEQTHKQRMDIANILQSQRQNQPSGSV
AETPQ
>P04332 ~~~~~~Portal protein~~~
MARKRSNTYRSINEIQRQKRNRWFIHYLNYLQSLAYQLFEWENLPPTINPSFLEKSIHQFGYVGFYKDPVISYIACNGAL
SGQRDVYNQATVFRAASPVYQKEFKLYNYRDMKEEDMGVVIYNNDMAFPTTPTLELFAAELAELKEIISVNQNAQKTPVL
IRANDNNQLSLKQVYNQYEGNAPVIFAHEALDSDSIEVFKTDAPYVVDKLNAQKNAVWNEMMTFLGIKNANLEKKERMVT
DEVSSNDEQIESSGTVFLKSREEACEKINELYGLNVKVKFRYDIVEQMRRELQQIENVSRGTSDGETNE
>P54309 ~~~6~~~Portal protein~~~
MADIYPLGKTHTEELNEIIVESAKEIAEPDTTMIQKLIDEHNPEPLLKGVRYYMCENDIEKKRRTYYDAAGQQLVDDTKT
NNRTSHAWHKLFVDQKTQYLVGEPVTFTSDNKTLLEYVNELADDDFDDILNETVKNMSNKGIEYWHPFVDEEGEFDYVIF
PAEEMIVVYKDNTRRDILFALRYYSYKGIMGEETQKAELYTDTHVYYYEKIDGVYQMDYSYGENNPRPHMTKGGQAIGWG
RVPIIPFKNNEEMVSDLKFYKDLIDNYDSITSSTMDSFSDFQQIVYVLKNYDGENPKEFTANLRYHSVIKVSGDGGVDTL
RAEIPVDSAAKELERIQDELYKSAQAVDNSPETIGGGATGPALENLYALLDLKANMAERKIRAGLRLFFWFFAEYLRNTG
KGDFNPDKELTMTFTRTRIQNDSEIVQSLVQGVTGGIMSKETAVARNPFVQDPEEELARIEEEMNQYAEMQGNLLDDEGG
DDDLEEDDPNAGAAESGGAGQVS
>P13334 ~~~~~~Portal protein~~~
MKFNVLSLFAPWAKMDERNFKDQEKEDLVSITAPKLDDGAREFEVSSNEAASPYNAAFQTIFGSYEPGMKTTRELIDTYR
NLMNNYEVDNAVSEIVSDAIVYEDDTEVVALNLDKSKFSPKIKNMMLDEFSDVLNHLSFQRKGSDHFRRWYVDSRIFFHK
IIDPKRPKEGIKELRRLDPRQVQYVREIITETEAGTKIVKGYKEYFIYDTAHESYACDGRMYEAGTKIKIPKAAVVYAHS
GLVDCCGKNIIGYLHRAVKPANQLKLLEDAVVIYRITRAPDRRVWYVDTGNMPARKAAEHMQHVMNTMKNRVVYDASTGK
IKNQQHNMSMTEDYWLQRRDGKAVTEVDTLPGADNTGNMEDIRWFRQALYMALRVPLSRIPQDQQGGVMFDSGTSITRDE
LTFAKFIRELQHKFEEVFLDPLKTNLLLKGIITEDEWNDEINNIKIEFHRDSYFAELKEAEILERRINMLTMAEPFIGKY
ISHRTAMKDILQMTDEEIEQEAKQIEEESKEARFQDPDQEQEDF
>Q6QGD5 ~~~~~~Portal protein~~~
MGFKSWITEKLNPGQRIIRDMEPVSHRTNRKPFTTGQAYSKIEILNRTANMVIDSAAECSYTVGDKYNIVTYANGVKTKT
LDTLLNVRPNPFMDISTFRRLVVTDLLFEGCAYIYWDGTSLYHVPAALMQVEADANKFIKKFIFNNQINYRVDEIIFIKD
NSYVCGTNSQISGQSRVATVIDSLEKRSKMLNFKEKFLDNGTVIGLILETDEILNKKLRERKQEELQLDYNPSTGQSSVL
ILDGGMKAKPYSQISSFKDLDFKEDIEGFNKSICLAFGVPQVLLDGGNNANIRPNIELFYYMTIIPMLNKLTSSLTFFFG
YKITPNTKEVAALTPDKEAEAKHLTSLVNNGIITGNEARSELNLEPLDDEQMNKIRIPANVAGSATGVSGQEGGRPKGST
EGD
>P03728 ~~~8~~~Portal protein~~~
MAEKRTGLAEDGAKSVYERLKNDRAPYETRAQNCAQYTIPSLFPKDSDNASTDYQTPWQAVGARGLNNLASKLMLALFPM
QTWMRLTISEYEAKQLLSDPDGLAKVDEGLSMVERIIMNYIESNSYRVTLFEALKQLVVAGNVLLYLPEPEGSNYNPMKL
YRLSSYVVQRDAFGNVLQMVTRDQIAFGALPEDIRKAVEGQGGEKKADETIDVYTHIYLDEDSGEYLRYEEVEGMEVQGS
DGTYPKEACPYIPIRMVRLDGESYGRSYIEEYLGDLRSLENLQEAIVKMSMISSKVIGLVNPAGITQPRRLTKAQTGDFV
TGRPEDISFLQLEKQADFTVAKAVSDAIEARLSFAFMLNSAVQRTGERVTAEEIRYVASELEDTLGGVYSILSQELQLPL
VRVLLKQLQATQQIPELPKEAVEPTISTGLEAIGRGQDLDKLERCVTAWAALAPMRDDPDINLAMIKLRIANAIGIDTSG
ILLTEEQKQQKMAQQSMQMGMDNGAAALAQGMAAQATASPEAMAAAADSVGLQPGI
>P03213 ~~~BBRF1~~~Portal protein~~~
MFNMNVDESASGALGSSAIPVHPTPASVRLFEILQGKYAYVQGQTIYANLRNPGVFSRQVFTHLFKRAISHCTYDDVLHD
WNKFEACIQKRWPSDDSCASRFRESTFESWSTTMKLTVRDLLTTNIYRVLHSRSVLSYERYVDWICATGMVPAVKKPITQ
ELHSKIKSLRDRCVCRELGHERTIRSIGTELYEATKEIIESLNSTFIPQFTEVTIEYLPRSDEYVAYYCGRRIRLHVLFP
PAIFAGTVTFDSPVQRLYQNIFMCYRTLEHAKICQLLNTAPLKAIVGHGGRDMYKDILAHLEQNSQRKDPKKELLNLLVK
LSENKTISGVTDVVEEFITDASNNLVDRNRLFGQPGETAAQGLKKKVSNTVVKCLTDQINEQFDQINGLEKERELYLKKI
RSMESQLQASLGPGGNNPAASAPAAVAAEAASVDILTGSTASAIEKLFNSPSASLGARVSGHNESILNSFVSQYIPPSRE
MTKDLTELWESELFNTFKLTPVVDNQGQRLYVRYSSDTISILLGPFTYLVAELSPVELVTDVYATLGIVEIIDELYRSSR
LAIYIEDLGRKYCPASATGGDHGIRQAPSARGDTEPDHAKSKPARDPPPGAGS
>Q3KSR9 ~~~BBRF1~~~Portal protein~~~
MFNMNVDESASGALGSSAIPVHPTPASVRLFEILQGKYAYVQGQTIYANLRNPGVFSRQVFTHLFKRAISHCTYDDVLHD
WNKFEACIQKRWPSDDSCASRFRESTFESWSTTMKLTVRDLLTTNIYRVLHSRSVLSYERYVDWICATGMVPAVKKPITQ
ELHSKIKSLRDRCVCRELGHERTIRSIGTELYEATREIIESLNSTFIPQFTEVTIEYLPRSDEYVAYYCGRRIRLHVLFP
PAIFAGTVTFDSPVQRLYQNIFMCYRTLEHAKICQLLNTAPLKAIVGHGGRDMYKDILAHLEQNSQRKDPKKELLNLLVK
LSENKTISGVTDVVEEFITDASNNLVDRNRLFGQPGETAAQGLKKKVSNTVVKCLTDQINEQFDQINGLEKERELYLKKI
RSMESQLQASLGPGGNNPAASAPAAVAAEAASVDILTGSTASAIEKLFNSPSASLGARVSGHNESILNSFVSQYIPPSRE
MTKDLTELWESELFNTFKLTPVVDNQGQRLYVRYSSDTISILLGPFTYLVAELSPVELVTDVYATLGIVEIIDELYRSSR
LAIYIEDLGRKYCPASATGGDHGIRQAPSARGDAEPDHAKSKPARDPPPGAGS
>P16735 ~~~~~~Portal protein~~~
MERNHWNEKSSGAKRSRERDLTLSTIRSILAADERLRIKASSYLGVGRGVDDEAVIDIFPTGQTMSFLRLLHGFLGTCRG
QSMHQVLRDPCVLRKQLLYGVCKTLFDTITVRRVAEEWKLHAALFPYRALDEEDLEQYLLVWSASLRQSVQTGVLGALRD
ILYQYADNDDYGLYVDWCVTVGLVPLLDVKTKPSEAAERAQFVRAAVQRATETHPLAQDLLQANLALLLQVAERLGAVRV
ANAPEVRVFKKVRSERLEAQLRGKHIRLYVAAEPLAYERDKLLFTTPVAHLHEEILRYDGLCRHQKICQLLNTFPVKVVT
ASRHELNCKKLVEMMEQHDRGSDAKKSIMKFLLNVSDSKSRIGIEDSVESFLQDLTPSLVDQNRLLPARGPGGPGVVGPG
GAVVGGPAGHVGLLPPPPGPAAPERDIRDLFKKQVIKCLEEQIQSQVDEIQDLRTLNQTWENRVRELRDLLTRYASRRED
SMSLGARDAELYHLPVLEAVRKARDAAPFRPLAVEDNRLVANSFFSQFVPGTESLERFLTQLWENEYFRTFRLRRLVTHQ
GAEEAIVYSNYTVERVTLPYLCHILALGTLDPVPEAYLQLSFGEIVAAAYDDSKFCRYVELICSREKARRRQMSREAAGG
VPERGTASSGGPGTLERSAPRRLITADEERRGPERVGRFRNGGPDDPRRAGGPYGFH
>P10190 ~~~UL6~~~Portal protein~~~
MTAPRSRAPTTRARGDTEALCSPEDGWVKVHPSPGTMLFREILHGQLGYTEGQGVYNVVRSSEATTRQLQAAIFHALLNA
TTYRDLEADWLGHVAARGLQPQRLVRRYRNAREADIAGVAERVFDTWRNTLRTTLLDFAHGLVACFAPGGPSGPSSFPKY
IDWLTCLGLVPILRKRQEGGVTQGLRAFLKQHPLTRQLATVAEAAERAGPGFFELALAFDSTRVADYDRVYIYYNHRRGD
WLVRDPISGQRGECLVLWPPLWTGDRLVFDSPVQRLFPEIVACHSLREHAHVCRLRNTASVKVLLGRKSDSERGVAGAAR
VVNKVLGEDDETKAGSAASRLVRLIINMKGMRHVGDINDTVRSYLDEAGGHLIDAPAVDGTLPGFGKGGNSRGSAGQDQG
GRAPQLRQAFRTAVVNNINGVLEGYINNLFGTIERLRETNAGLATQLQERDRELRRATAGALERQQRAADLAAESVTGGC
GSRPAGADLLRADYDIIDVSKSMDDDTYVANSFQHPYIPSYAQDLERLSRLWEHELVRCFKILCHRNNQGQETSISYSSG
AIAAFVAPYFESVLRAPRVGAPITGSDVILGEEELWDAVFKKTRLQTYLTDIAALFVADVQHAALPPPPSPVGADFRPGA
SPRGRSRSRSPGRTARGAPDQGGGIGHRDGRRDGRR
>F5HGK9 ~~~~~~Portal protein~~~
MLRMNPGLGSSISVHPSELSISLFEILQGKYSYVRGQTLHCSLRNPGVFFRQLFIHLYKNALANCSYDHVLSDWRTYESS
AKTRWPEKEAQWGSYRRSTFDSWAQTMRMTLDHLLLNAINRVLYAKTQLSYERYVDWVVTVGMVPVVKHTPDHKLVNSIQ
EQLMKDCQRLASGEKTIGRILTSVTQEISNLVSSLSALYIPGYSEVSIDYDCVKNTFVGLYKQKRVHVEVITMPAILAGR
VIFDSPIQRMYTSIMSCHRTAEHAKLCQLLNTAPTKALVGSACNNVYKDIMTHLEQASQRTDPKRELLNLLMKLAENKTV
SGVTDVVEDFVTDVSQNIVDKNKLFGTGQETTTQGLRRQVSNSVFKCLTNQINEQFDTITQLEKERELCMKRLKCIETQL
SHQQPGDAKGPGSVNLLTANTFQSLGRLQDPSLQLTSSHIPSGSAVLNSFFSSYIPPVRESMKDLTNLWESEMFQTYKLA
PVVDNQGQRLSVTYSQDTISILLGPFTYVIADLLQMELISHSFVSSSLQDIAAYLYQTSRLFVYITDVGQKYCLVTPPFE
NVPGKGPGETDWSANEYSCPEDSRVRRGLSRIPPPCGAPPCPGSA
>P03710 ~~~B~~~Portal protein B~~~
MKTPTIPTLLGPDGMTSLREYAGYHGGGSGFGGQLRSWNPPSESVDAALLPNFTRGNARADDLVRNNGYAANAIQLHQDH
IVGSFFRLSHRPSWRYLGIGEEEARAFSREVEAAWKEFAEDDCCCIDVERKRTFTMMIREGVAMHAFNGELFVQATWDTS
SSRLFRTQFRMVSPKRISNPNNTGDSRNCRAGVQINDSGAALGYYVSEDGYPGWMPQKWTWIPRELPGGRASFIHVFEPV
EDGQTRGANVFYSVMEQMKMLDTLQNTQLQSAIVKAMYAATIESELDTQSAMDFILGANSQEQRERLTGWIGEIAAYYAA
APVRLGGAKVPHLMPGDSLNLQTAQDTDNGYSVFEQSLLRYIAAGLGVSYEQLSRNYAQMSYSTARASANESWAYFMGRR
KFVASRQASQMFLCWLEEAIVRRVVTLPSKARFSFQEARSAWGNCDWIGSGRMAIDGLKEVQEAVMLIEAGLSTYEKECA
KRGDDYQEIFAQQVRETMERRAAGLKPPAWAAAAFESGLRQSTEEEKSDSRAA
>Q8QMP8 3.1.-.-~~~~~~Poxin-Schlafen~~~
MAMFYAHAFGGYDENLHAFPGISSTVANDVRKYSVVSVYNNKYDIVKDKYMWCYSYVNKRYIGALLPMFECNEYLQIGDP
IHDLEGNQISIVTYRHKNYYALSGIGYESLDLCLEGVGIHHHTLEAGNAVYGKVQHDYSTIKEKAKEMNSLSPGPIIDYH
VWIGDCVCQVTAVDVHGKEIMRMRFKKGAVLPIPNLVKVKLGEENDTVNLSTSISALLNSGGGTIEVTSKEERVDYVLMK
RLESIRHLWSVVYDHFDVVNGKERCYVHMHSSNQSPMLSTVKTNLYMKTMGACLQMDYMEALEYLSELKESGGRSPRPEL
PEFEYPDGVEDAGSIERLAEEFFSRSELQADEPVNFCNSINVKHTSVSAKQLRTRIRQQLPSILSSFANTDGGYLFIGVD
NNTHKVVGFTVGQDYLKLVESDIEKYIKRLRVVHFCEKKEDIKYACRFIKVYKPGEETTSTYVCAIKVERCCCAVFADWP
ESWYMDTSGSMKKYSPDEWVSHIKF
>A0A7H0DNF0 ~~~~~~Poxin-Schlafen~~~
MFYAHAFGGYDENLHAFPGISSTVANDVRKYSVVSVYNKKYNIVKNKYMWCNSQVNKRYIGALLPMFECNEYLQIGDPIH
DLEGNQISIVTYRHKNYYALSGIGYESLDLCLEGVGIHHHVLETGNAVYGKVQHEYSTIKEKAKEMNALKPGPIIDYHVW
IGDCVCQVTTVDVHGKEIMRMRFKRGAVLPIPNLVKVKVGEENDTINLSTSISALLNSGGGTIEVTSKEERVDYVLMKRL
ESIHHLWSVVYDHLNVVNGEERCYVHMHSSHQSPMLSTVKTNLYMKTMGACLQMDSMEALEYLSELKESGGRSPRPELQK
FEYPDGVKDTESIERLAEEFFNRSELQAGESVKFGNSINVKHTSVSAKQLRTRIRQQLPSILSSFANTKGGYLFIGVDNN
THKVIGFTVGHDYLKLVESDIEKYIQKLPVVHFCKKKEDIKYACRFIKVYKPGDETTSTYVCAIKVERCCCAVFADWPES
WYMDTSGSMKKYSPDEWVSHIKF
>Q8V4S4 3.1.-.-~~~~~~Poxin-Schlafen~~~
MFYAHAFGGYDENLHAFPRISSTVANDVRKYSVVSVYNKKYNIVKNKYMWCNSQVNKRYIGALLPMFECNEYLQIGDPIH
DLEGNQISIVTYRHKNYYALSGIGYESLDLCLEGVGIHHHVLETGNAVYGKVQHEYSTIKEKAKEMNALKPGPIIDYHVW
IGDCVCQVTTVDVHGKEIMRMRFKRGAVLPIPNLVKVKVGEENDTINLSTSISALLNSGGGTIEVTSKEERVDYVLMKRL
ESIHHLWSVVYDHLNVVNGEERCYIHMHSSHQSPMLSTVKTNLYMKTMGACLQMDSMEALEYLSELKESGGRSPRPELQK
FEYPDGVKDTESIERLAEEFFNRSELQAGESVKFGNSINVKHTSVSAKQLRTRIRQQLPSILSSFANTKGGYLFIGVDNN
THKVIGFTVGHDYLKLVERDIEKYIQKLPVVHFCKKKEDIKYACRFIKVYKPGDETTSTYVCAIKVERCCCAVFADWPES
WYMDTSGSMKKYSPDEWVSHIKF
>P08358 3.1.-.-~~~~~~Poxin~~~
MELYNIKYAIDPTNKIVIEQVDNVDAFVHILEPGQEVFDETLSQYHQFPGVVSSIIFPQLVLNTIISVLSEDGSLLTLKL
ENTCFNFHVCNKRFVFGNLPAAVVNNETKQKLRIGAPIFAGKKLVSVVTAFHRVGENEWLLPVTGIREASQLSGHMKVLN
GVRVEKWRPNMSVYGTVQLPYDKIKQHALEQENKTPNALESCVLFYKDSEIRITYNKGDYEIMHLRMPGPLIQPNTIYYS
>O92490 3.1.-.-~~~~~~Poxin~~~
MELYNIKYAIDPTNKIVIEQVDNVDAFVHILEPGQEVFDETLSRYHQFPGVVSSIIFTQLVLNTIISVLSEDGSLLPLKL
ENTCFNFHVCNKRFVFGNLPAAIVNNETKQKLRIGSPIFAGEKLVSVVTTFHRVGENEWLLPVTGIQEASRLSGHIKVPN
GVRVEKLRPNMSVYGTVQLPYDKIKRHALEQENKTPNALESCVLFYRDSEIRITYNRGDYEIMHLRMPGPLIQPNTIYYS
>Q6J362 3.1.-.-~~~~~~Poxin~~~
MAMFYAHALGGYDENLHAFPGISSTVANDVRKYSVVSVYNNKYDIVKDKYMWCYSQVNKRYIGALLPMFECNEYLQIGDP
IHDQEGNQISIITYRHKNYYALSGIGYESLDLCLEGVGIHHHVLETGNAVYGKVQHDYSTIKEKAKEMSTLSPGPIIDYH
VWIGDCICQVTAVDVHGKEIMRMRFKKGAVLPIPNLVKVKLGENDTENLSSTISAAPSR
>P20999 3.1.-.-~~~~~~Poxin~~~
MAMFYAHALGGYDENLHAFPGISSTVANDVRKYSVVSVYNNKYDIVKDKYMWCYSQVNKRYIGALLPMFECNEYLQIGDP
IHDQEGNQISIITYRHKNYYALSGIGYESLDLCLEGVGIHHHVLETGNAVYGKVQHDYSTIKEKAKEMSTLSPGPIIDYH
VWIGDCICQVTAVDVHGKEIMRMRFKKGAVLPIPNLVKVKLGENDTENLSSTISAAPSR
>Q01225 3.1.-.-~~~~~~Poxin~~~
MAMFYAHALGGYDENLHAFPGISSTVANDVRKYSVVSVYNNKYDIVKDKYMWCYSQVNKRYIGALLPMFECNEYLQIGDP
IHDQEGNQISIITYRHKNYYALSGIGYESLDLCLEGVGIHHHVLETGNAVYGKVQHDYSTIKEKAKEMNALSPGPIIDYH
VWIGDCICQVTAVDVHGKEIMRMRFKKGAVLPIPNLVKVKLGENDTENLSSTISAAPSR
>P08318 ~~~~~~Large structural phosphoprotein~~~
MSLQFIGLQRRDVVALVNFLRHLTQKPDVDLEAHPKILKKCGEKRLHRRTVLFNELMLWLGYYRELRFHNPDLSSVLEEF
EVRCVAVARRGYTYPFGDRGKARDHLAVLDRTEFDTDVRHDAEIVERALVSAVILAKMSVRETLVTAIGQTEPIAFVHLK
DTEVQRIEENLEGVRRNMFCVKPLDLNLDRHANTALVNAVNKLVYTGRLIMNVRRSWEELERKCLARIQERCKLLVKELR
MCLSFDSNYCRNILKHAVENGDSADTLLELLIEDFDIYVDSFPQSAHTFLGARSPSLEFDDDANLLSLGGGSAFSSVPKK
HVPTQPLDGWSWIASPWKGHKPFRFEAHGSLAPAAEAHAARSAAVGYYDEEEKRRERQKRVDDEVVQREKQQLKAWEERQ
QNLQQRQQQPPPPARKPSASRRLFGSSADEDDDDDDDEKNIFTPIKKPGTSGKGAASGGGVSSIFSGLLSSGSQKPTSGP
LNIPQQQQRHAAFSLVSPQVTKASPGRVRRDSAWDVRPLTETRGDLFSGDEDSDSSDGYPPNRQDPRFTDTLVDITDTET
SAKPPVTTAYKFEQPTLTFGAGVNVPAGAGAAILTPTPVNPSTAPAPAPTPTFAGTQTPVNGNSPWAPTAPLPGDMNPAN
WPRERAWALKNPHLAYNPFRMPTTSTASQNTVSTTPRRPSTPRAAVTQTASRDAADEVWALRDQTAESPVEDSEEEDDDS
SDTGSVVSLGHTTPSSDYNNDVISPPSQTPEQSTPSRIRKAKLSSPMTTTSTSQKPVLGKRVATPHASARAQTVTSTPVQ
GRLEKQVSGTPSTVPATLLQPQPASSKTTSSRNVTSGAGTSSASSARQPSASASVLSPTEDDVVSPATSPLSMLSSASPS
PAKSAPPSPVKGRGSRVGVPSLKPTLGGKAVVGRPPSVPVSGSAPGRLSGSSRAASTTPTYPAVTTVYPPSSTAKSSVSN
APPVASPSILKPGASAALQSRRSTGTAAVGSPVKSTTGMKTVAFDLSSPQKSGTGPQPGSAGMGGAKTPSDAVQNILQKI
EKIKNTEE
>Q08358 ~~~~~~Polyprotein pp220~~~
MGNRGSSTSSRPPLSSEANLYAKLQDHIQRQTRPFSGGGYFNGGGDKNPVQHIKDYHIDSVSSKAKLRVIEGIIRAIAKI
GFKVDTKQPIEDILKDIKKQLPDPRAGSTFVKNAEKQETVCKMIADAINQEFIDLGQDKLIDTTDGAASICRQIVLYINS
LTHGLRAEYLDVHGSIENTLENIKLLNDAIKQLHERMVTEVTKAAPNEEVINAVTMIEAVYRRLLNEQNLQINILTNFID
NILTPTQKELDKLQTDEVDIIKLLNDTNSVLGTKNFGKVLSYTLCNLGIAASVANKINKALQKVGLKVEQYLQSKNWAEF
DKELDLKRFSGLVSAENIAEFEKAVNLLRQTFNERHKILENSCAKKGGDEEKTPLDRRIEAQRLDRKHILMEFLNKSTQA
YNDFLENVKKIGIKLVKEIALTPNITRLRDALSRINDMGTIALDLSLIGFYTNAAAREERETFLTQFMLVKNVLEEQSKI
DPNFKNLYDSCSRLLQIIDFYTDIVQKKYGGGEDCECTRVGGAALTVEELGLSKAARSQVDLNQAINTFMYYYYVAQIYS
NLTHNKQEFQSYEENYATILGDAIAGRLMQLDTEKNARINSPAVDLARGHVGPNPGGAQEADWKAAVSAIELEYDVKRRF
YRALEGLDLYLKNITKTFVNNIDSIQTVQQMLDGVRIIGRWFTEATGDTLAQVFESFPTSAGNDSNVFTDNAPAGHYYEK
VAAEIQQGRSVGTLRPVRASQAKNIRDLIGRSLSNFQALKNIINAFARIGDMLGGEELRQMVPMSPLQIYKTLLEYIQHS
ALSVGLKNLNQSEIGGQRVALARTPEEAAQRVYLSTVRVNDALSTRWETEDVFFTFMLKSMAAKIFIVLGIYDMFERPEP
VYKLIPTRMILGGADELEPEVIPEAAELYFRLPRLAEFYQKLFSFRDENVQISMLPELEGIFSGLIRIIFMRPIELINIG
DYSETEIRQLIKEINVIYQHFNLEYGEQEATKKALIHFVNEINRRFGVITRTEWEKFQRIVQEARTMNDFGMMNQTNYSI
LPDEDGYTQSSQLLPSDRFISPSTQPTPKWRPALYNIDSVDVQTGMLQPNSQWDLVQKFRKQLSEMFEDPSLQQELGKVS
YQELIRQAINELKKEHTDKIQIVSKLIQGSESLADTDVNKIFLFHETVITGLNLLSAIYVLLNNFRNNIKGLDLDTIQKS
IIEWLRETQAANVNRANLIDWLGRKHGAISEIRNPGLVVKENDVRLSRVYPDPTTNATAPQDQNLVTETLFAWIVPYVGI
PAGGGVRAEQELAARYLVDNQRIMQLLLTNIFEMTSSFNKMVQVRFPETSTAQVHLDFTGLISLIDSLMADTKYFLNLLR
PHIDKNIIQYYENRSNPGSFYWLEEHLIDKLIKPPTDAGGRPLPGGELGLEGVNQIINKTYILLTKPYNVLQLRGGVQRR
DAANIQINNNPQPSERFEQYGRVFSRLVFYDALENNSGLRVEQVVLGDFRLSNLIRTNNAQEENTLSYWDNMAPRTYANV
NDAANNLRRYRLYGSDYGIQNNRSMMMVFNQLVASYIARFYDAPSGKIYLNLINAFANGNFSQAVMELGYTHPDLARDNI
AFGHRGDPTEQSVLLLSLGLMLQRLIKDTNRQGLSQHLISTLTEIPIYLKENYRANLPLFNKMFNILISQGELLKQFIQY
TNVQLARPNLMGLLGANNDSVIYYNNNINVPMTGLSVGQAALRGIGGVFRPNVTLMPLGDAQNNTSDVVRKRLVAVIDGI
IRGSHTLADSAMEVLHELTDHPIYLETEEHFIQNYMSRYNKEPLMPFSLSLYYLRDLRIENNEVYDPLLYPNLESGSPEF
KLLYGTRKLLGNDPVQLSDMPGVQLIMKNYNETVVAREQITPTRFEHFYTHAIQALRFIVNIRSFKTVMMYNENTFGGVN
LISENRDDKPIITAGIGMNAVYSLRKTLQDVISFVESSYQEEQINHIHKIVSPKGQTRTLGSNRERERIFNLFDMNIIPI
NVNALMRSIPLANIYNYDYSFEEIACLMYGISAEKVRSLNTAAPQPDIAEVLNIPNRPPMNTREFMLKLLINPYVSVSIT
QYGNELMSKGSAGYMSRIFRGDNALNMGRPKFLSDQIFNKVLFGSLYPTQFDYDEAGPGLAAGIQRGREQWGQPLSEYIN
QALHELVRTIRIPQKLRVLRNIIVKNQLIADLTTIREQLVSMRREVENMIQTPEIQNNPTPEVIAAAQNWTQQYRARVDT
LINFIGNIGQPNSMLDLIQTITPVTVRAQLGVIFNRHGIPVPHPRQILQTDDEATQWFMTNILNIPAIIMTPFTDLANDL
RTFLETLERYVFNVPRWLGPSTGRVARAPVRMAPRDMRHPISYTENSVLTYITEQNREEGPWSIVKQVGVGIQKPTLVQI
GKDRFDTRLIRNLIFITNIQRLLRLRLNLELSQFRNVLVSPDHIINPSITEYGFSITGPSETFSDKQYDSDIRIL
>P11042 ~~~~~~Protein pp31~~~
MVNVPEQQSPETAAVCKNEKLLNKLESSSYNKSNMDQLAVIVNFLERKNINYILNVVPVMQDERKMSKRKKKVINNNKYI
LFNSWYTKIKQPEWPSSPAMWDLVKNKPELADFVFIFDHTEKLGKKMADRSTSSSSSENAAIPASKKRQTVVLTNANLAE
LKESCEMRDKLYSEFYSLLNETFNHNVAPLLSNIYDEVLTRDFITKSMAKFKTVALKLPVAPSTTEYVPTPISGSRKRKS
SVPAKQRSSIKTRRNTVAPALLMVSDNTQDTNMSD
>A7WNB0 ~~~X~~~Protein PP3~~~
MCRRTSVLPLVLSLFHLYYYGNVLSVSIPLSINITDRILPVQLNTTILMEGAEELNITQQLNVKHFQEGQLTREENFVCS
HMTTTDLATDFFQLSKLAGAYDAAGSQDQRLHSPSLTSTLAPVVPETTPATLTPDGVTLNIRTSVTDINPLTRFTKAVRD
LKEDITPGYDLTNGRKFGREFLRALEENGVRGRHKRDSTTDVLQMTLDVTESREKYVEIRDQCASVLGALIKSMVPSVSE
HCLYHYHIKNIINCFTSHYIRLHNEPDFPLVMALYSEYLSLTVKIHYMDEILVSTLRA
>Q65179 ~~~~~~Polyprotein pp62~~~
MPSNMKQFCKISVWLQQHDPDLLEIINNLCMLGNLSAAKYKHGVTFIYPKQAKIRDEIKKHAYSNDPSQAIKTLESLILP
FYIPTPAEFTGEIGSYTGVKLEVEKTEANKVILKNGEAVLVPAADFKPFPDRRLAVWIMESGSMPLEGPPYKRKKEGGGN
DPPVPKHISPYTPRTRIAIEVEKAFDDCMRQNWCSVNNPYLAKSVSLLSFLSLNHPTEFIKVLPLIDFDPLVTFYLLLEP
YKTHGDDFLIPETILFGPTGWNGTDLYQSAMLEFKKFFTQITRQTFMDIADSATKEVDVPICYSDPETVHSYTNHVRTEI
LHHNAVNKVTTPNLVVQAYNELEQTNTIRHYGPIFPESTINALRFWKKLWQDEQRFVIHGLHRTLMDQPTYETSEFAEIV
RNLRFSRPGNNYINELNITSPAMYGDKHTTGDIAPNDRFAMLVAFINSTDFLYTAIPEEKVGGNETQTSSLTDLVPTRLH
SFLNHNLSKLKILNRAQQTVRNILSNDCLNQLKHYVKHTGKNEILKLLQE
>P0CA06 ~~~~~~Polyprotein pp62~~~
MPSNMKQFCKISVWLQQHDPDLLEIINNLCMLGNLSAAKYKHGVTFIYPKQAKIRDEIKKHAYSNDPSQAIKTLESLILP
FYIPTPAEFTGEIGSYTGVKLEVEKTEANKVILKNGEAVLIPAADFKPFPDRRLAVWIMESGSMPLEGPPYKRKKEGGGN
DPPVPKHISPYTPRTRIAIEVEKAFDDCMRQNWCSVNNPYLAKSVSLLSFLSLNHPTEFIKVLPLIDFDPLVTFYLLLEP
YKTHGDDFLIPETVLFGPTGWNGTDLYQSAMLEFKKFFTQITRQTFMDIADTATKEVDVPICYSDPETVHSYANHVRTEI
LHHNMVNKVTTPNLVVQAYNELEQTNTIRHYGPIFPESTINALRFWKKLWQDEQRFVIHGLHRTLMDQPTYETSEFAEIV
RNLRFSRPGNNYINELNITSPAMYGDKHTTGDIAPNDRFAMLVAFINSTDFLYTAIPEEKVGGNDTQTSSLTDLVPTRLH
SFLNHNLSKLKILNRAQQTVRNILSNDCLNQLKHYVKHTGKNEILKLLQD
>P06725 ~~~~~~65 kDa phosphoprotein~~~
MESRGRRCPEMISVLGPISGHVLKAVFSRGDTPVLPHETRLLQTGIHVRVSQPSLILVSQYTPDSTPCHRGDNQLQVQHT
YFTGSEVENVSVNVHNPTGRSICPSQEPMSIYVYALPLKMLNIPSINVHHYPSAAERKHRHLPVADAVIHASGKQMWQAR
LTVSGLAWTRQQNQWKEPDVYYTSAFVFPTKDVALRHVVCAHELVCSMENTRATKMQVIGDQYVKVYLESFCEDVPSGKL
FMHVTLGSDVEEDLTMTRNPQPFMRPHERNGFTVLCPKNMIIKPGKISHIMLDVAFTSHEHFGLLCPKSIPGLSISGNLL
MNGQQIFLEVQAIRETVELRQYDPVAALFFFDIDLLLQRGPQYSEHPTFTSQYRIQGKLEYRHTWDRHDEGAAQGDDDVW
TSGSDSDEELVTTERKTPRVTGGGAMAGASTSAGRKRKSASSATACTSGVMTRGRLKAESTVAPEEDTDEDSDNEIHNPA
VFTWPPWQAGILARNLVPMVATVQGQNLKYQEFFWDANDIYRIFAELEGVWQPAAQPKRRRHRQDALPGPCIASTPKKHR
G
>Q6SW59 ~~~~~~65 kDa phosphoprotein~~~
MESRGRRCPEMISVLGPISGHVLKAVFSRGDTPVLPHETRLLQTGIHVRVSQPSLILVSQYTPDSTPCHRGDNQLQVQHT
YFTGSEVENVSVNVHNPTGRSICPSQEPMSIYVYALPLKMLNIPSINVHHYPSAAERKHRHLPVADAVIHASGKQMWQAR
LTVSGLAWTRQQNQWKEPDVYYTSAFVFPTKDVALRHVVCAHELVCSMENTRATKMQVIGDQYVKVYLESFCEDVPSGKL
FMHVTLGSDVEEDLTMTRNPQPFMRPHERNGFTVLCPKNMIIKPGKISHIMLDVAFTSHEHFGLLCPKSIPGLSISGNLL
MNGQQIFLEVQAIRETVELRQYDPVAALFFFDIDLLLQRGPQYSEHPTFTSQYRIQGKLEYRHTWDRHDEGAAQGDDDVW
TSGSDSDEELVTTERKTPRVTGGGAMASASTSAGRKRKSASSATACTAGVMTRGRLKAESTVAPEEDTDEDSDNEIHNPA
VFTWPPWQAGILARNLVPMVATVQGQNLKYQEFFWDANDIYRIFAELEGVWQPAAQPKRRRHRQDALPGPCIASTPKKHR
G
>P18139 ~~~~~~65 kDa phosphoprotein~~~
MASVLGPISGHVLKAVFSRGDTPVLPHETRLLQTGIHVRVSQPSLILVSQYTPDSTPCHRGDNQLQVQHTYFTGSEVENV
SVNVHNPTGRSICPSQEPMSIYVYALPLKMLNIPSINVHHYPSAAERKHRHLPVADAVIHASGKQMWQARLTVSGLAWTR
QQNQWKEPDVYYTSAFVFPTKDVALRHVVCAHELVCSMENTRATKMQVIGDQYVKVYLESFCEDVPSGKLFMHVTLGSDV
EEDLTMTRNPQPFMRPHERNGFTVLCPKNMIIKPGKISHIMLDVAFTSHEHFGLLCPKSIPGLSISGNLLMNGQQIFLEV
QAIRETVELRQYDPVAALFFFDIDLLLQRGPQYSEHPTFTSQYRIQGKLEYRHTWDRHDEGAAQGDDDVWTSGSDSDEEL
VTTERKTPRVTGGGAMAGASTSAGRKRKSASSATACTAGVMTRGRLKAESTVAPEEDTDEDSDNEIHNPAVFTWPPWQAG
ILARNLVPMVATVQGQNLKYQEFFWDANDIYRIFAELEGVWQPAAQPKRRRHRQDALPGPCIASTPKKHRG
>P06726 ~~~~~~Protein pp71~~~
MSQASSSPGEGPSSEAAAISEAEAASGSFGRLHCQVLRLITNVEGGSLEAGRLRLLDLRTNIEVSRPSVLCCFQENKSPH
DTVDLTDLNIKGRCVVGEQDRLLVDLNNFGPRRLTPGSENNTVSVLAFALPLDRVPVSGLHLFQSQRRGGEENRPRMEAR
AIIRRTAHHWAVRLTVTPNWRRRTDSSLEAGQIFVSQFAFRAGAIPLTLVDALEQLACSDPNTYIHKTETDERGQWIMLF
LHHDSPHPPTSVFLHFSVYTHRAEVVARHNPYPHLRRLPDNGFQLLIPKSFTLTRIHPEYIVQIQNAFETNQTHDTIFFP
ENIPGVSIEAGPLPDRVRITLRVTLTGDQAVHLEHRQPLGRIHFFRRGFWTLTPGKPDKIKRPQVQLRAGLFPRSNVMRG
AVSEFLPQSPGLPPTEEEEEEEEEDDEDDLSSTPTPTPLSEAMFAGFEEASGDEDSDTQAGLSPALILTGQRRRSGNNGA
LTLVIPSWHVFASLDDLVPLTVSVQHAALRPTSYLRSDMDGDVRTAADISSTLRSVPAPRPSPISTASTSSTPRSRPRI
>Q5UP71 ~~~~~~Structural PPIase-like protein L605~~~
MNYSLEDLPNSGKNPRVYMDIVLNNEIIGRLQIKLFRDAFPAGVENFVQLTNGKTYRVNSNGTGKYKYNRHINRTYEGCK
FHNVLHNNYIVSGDIYNSNGSSAGTVYCDEPIPPVFGDYFYPHESKGLLSLVPYTDESGNRYYDSTFMITLDDIRPSNVL
DELDRDQVVIGQVYGGLDVLDKINSMIKPYAGRKYPTFSIGKCGAYLDSSQAQRKRPVNVNGTKRFLNKPTRVN
>P03772 3.1.3.16~~~~~~Serine/threonine-protein phosphatase~~~
MRYYEKIDGSKYRNIWVVGDLHGCYTNLMNKLDTIGFDNKKDLLISVGDLVDRGAENVECLELITFPWFRAVRGNHEQMM
IDGLSERGNVNHWLLNGGGWFFNLDYDKEILAKALAHKADELPLIIELVSKDKKYVICHADYPFDEYEFGKPVDHQQVIW
NRERISNSQNGIVKEIKGADTFIFGHTPAVKPLKFANQMYIDTGAVFCGNLTLIQVQGEGA
>P03319 ~~~sag~~~Protein PR73~~~
MPRLQQKWLNSRECPTPRGEAAKGLFPTKDDPSAHKRVSPSDKDIFILCCKLGIALLCLGLLGEVAVRARRALTLDSFNS
SSVQDYNLNNSENSTFLLRQGPQPTSSYKPHRFCPSEIEIRMLAKNYIFTNKTNPIGRLLVTMLRNESLSFSTIFTQIQK
LEMGIENRKRRSTSIEEQVQGLLTTGLEVKKGKKSVFVKIGDRWWQPGTYRGPYIYRPTDAPLPYTGRYDLNWDRWVTVN
GYKVLYRSLPFRERLARARPPWCMLSQEEKDDMKQQVHDYIYLGTGMHFWGKIFHTKEGTVAGLIEHYSAKTYGMSYYE
>Q65146 ~~~~~~Putative helicase/primase complex protein~~~
MQETFKFLRCNSQGEAVEDKYSLETLKNHFVVRDEYNNLFRVFSNRDDFWEWEAAQPFEQKCFHEVVFGFLPQRLKFDID
FPVNKSYSDDNDNVNDDDSVYDDDNVYDILDMIINVIMDVFYETYSLPYNINLTREQILLTDSIGLNKKRELKYSFHIIL
YTYSVLNNNEAKAFTSKVLENLPKHVYPFVDPQVNKSIQNFRIIGSHKKGSMRVKMFNEELAEVFETSTTTKKSDTLIAT
PFETTCLPCIFTNVKETTPSSCDTIQQSELEEVLKFAGTLCKNHCFLRVHKNLVLFKRTSPSYCEICKRMHDKDNTLILR
VTGNKVYQHCRHDNKHSLLMGSLSGTTNFVETYVDQVMTKSIEVHESILFEELPDTQKHIYDESSMREYERVPTLVVKAQ
MKIGKTVQLRNYLQKYYGNNSISKQQTIRFVTFRQIFSKNIQSRLPNFTLYSEVTGDLDSYERVIVQVESLFRLTSTAEP
VDLLILDEVESIFNQFNSGLHKYFAPSFAIFMWMLETANYVICLDANLGNRTYNILQRFRGDVPIFFHWNQYKRAQHDTY
YFTSSRETWLNNLLKDLLEDKKIVIPTNSLMEARLLQSFIQKKFPEKKIGFYSSKSTAHERESHFNNVSYYWGLVDILIY
TPTISAGVSYEDKRFDVLYGFFNNMSCDVETCCQMLGRVRELKSKCYKICLQGKQNYYPETIEDIEMFTLQKRDTLFQTI
NNHQLSFTYSKETGRPVYYKTPYYHLWLETMRIQHLSKNHFITRFINQVADTGAKVFILTGEKLETVKQYTSIKMEIKHQ
DYVNIASAETIDANKALQIKQNLKEGITVDQQDLFAYEKYKLLEFYAWHGHKITPKFVEQYNSFMTKQNYTGRVQISRGK
TVYESLTMLQTQELNFHQWAMQHAEHHDLQFNYSFQSHMYAIMLLTKCGFKCVQDPNILTNEQLMTKLVDEFVQYDLSAV
SFEFKLKKPSKTDPQTILKFINKVLGLRYGLKIHHNKGNYYIKNTKAGSLIPFVRQQIKQSPCVVSNLLPITETSSVKEE
TLTETSPIKETFTET
>P10277 2.7.7.-~~~Alpha~~~Putative P4-specific DNA primase~~~
MKMNVTATVSHALGHWPRILPALGIQVLKNRHQPCPVCGGSDRFRFDDREGRGTWYCNQCGAGDGLKLVEKVFGVSPSDA
AAKVAAVTGSLPPADPAVTTAAVDETDAARKNAAALAQTLMAKTRTGTGNAYLTRKGFPGRECRMLTGTHRAGGVSWRAG
DLVVPLYDDSGELVNLQLISADGRKRTLKGGQVRGTCHTLEGQNQAGKRLWIAEGYATALTVHHLTGETVMVALSSVNLL
SLASLARQKHPACQIVLAADRDLSGDGQKKAAAAADACEGVVALPPVFGDWNDAFTQYGGEATRKAIYDAIRPPAESPFD
TMSEAEFSAMSTSEKAMRIYEHYGEALAVDANGQLLSRYENGVWKVLPPQDFARDVAGLFQRLRAPFSSGKVASVVDTLK
LIIPQQEAPSRRLIGFRNGVLDTQNGTFHPHSPSHWMRTLCDVDFTPPVDGETLETHAPAFWRWLDRAAGGRAEKRDVIL
AALFMVLANRYDWQLFLEVTGPGGSGKSIMAEIATLLAGEDNATSATIETLESPRERAALTGFSLIRLPDQEKWSGDGAG
LKAITGGDAVSVDPKYRDAYSTHIPAVILAVNNNPMRFTDRSGGVSRRRVIIHFPEQIAPQERDPQLKDKITRELAVIVR
HLMQKFSDPMLARSLLQSQQNSDEALNIKRDADPTFDFIGYLETLPQTSGMYMGNASIIPRNYRKYLYHAYLAYMEANGY
RNVLSLKMFGLGLPVMLKEYGLNYEKRHTKQGIQTNLTLKEESYGDWLPKCDDPTTA
>P04520 2.7.7.-~~~~~~DNA primase~~~
MSSIPWIDNEFAYRALAHLPKFTQVNNSSTFKLRFRCPVCGDSKTDQNKARGWYYGDNNEGNIHCYNCNYHAPIGIYLKE
FEPDLYREYIFEIRKEKGKSRPIEKPKELPKQPEKKIIKSLPSCVRLDKLAEDHPIIKYVKARCIPKDKWKYLWFTTEWP
KLVNSIAPGTYKKEISEPRLVIPIYNANGKAESFQGRALKKDAPQKYITIEAYPEATKIYGVERVKDGDVYVLEGPIDSL
FIENGIAITGGQLDLEVVPFKDRRVWVLDNEPRHPDTIKRMTKLVDAGERVMFWDKSPWKSKDVNDMIRKEGATPEQIME
YMKNNIAQGLMAKMRLSKYAKI
>P17149 2.7.7.-~~~~~~DNA primase~~~
MTLVLFATEYDSAHIVANVLSQTPTDHCVFPLLVKHQVSRRVYFCLQTQKCSDSRRVAPVFAVNNETLQLSRYLAARQPI
PLSALIASLDEAETQPLYRHLFRTPVLSPEHGGEVREFKHLVYFHHAAVLRHLNQVFLCPTSPSWFISVFGHTEGQVLLT
MAYYLFEGQYSTISTVEEYVRSFCTRDLGTIIPTHASMGEFARLLLGSPFRQRVSAFVAYAVARNRRDYTELEQVDTQIN
AFRERARLPDTVCVHYVYLAYRTALARARLLEYRRVVAYDADAAPEAQCTREPGFLGRRLSTELLDVMQKYFSLDNFLHD
YVETHLLRLDESPHSATSPHGLGLAGYGGRIDGTHLAGFFGTSTQLARQLERINTLSESVFSPLERSLSGLLRLCASLRT
AQTYTTGTLTRYSQRRYLLPEPALAPLLERPLPVYRVHLPNDQHVFCAVASETWHRSLFPRDLLRHVPDSRFSDEALTET
VWLHDDDVASTSPETQFYYTRHEVFNERLPVFNFVADFDLRLRDGVSGLARHTVFELCRGLRRVWMTVWASLFGYTHPDR
HPVYFFKSACPPNSVPVDAAGAPFDDDDYLDYRDERDTEEDEDGKEDKNNVPDNGVFQKTTSSVDTSPPYCRCKGKLGLR
IITPFPACTVAVHPSVLRAVAQVLNHAVCLDAELHTLLDPISHPESSLDTGIYHHGRSVRLPYMYKMDQDDGYFMHRRLL
PLFIVPDAYREHPLGFVRAQLDLRNLLHHHPPHDLPALPLSPPPRVILSVRDKICPSTEANFIETRSLNVTRYRRRGLTE
VLAYHLYGGDGATAAAISDTDLQRLVVTRVWPPLLEHLTQHYEPHVSEQFTAPHVLLFQPHGACCVAVKRRDGARTRDFR
CLNYTHRNPQETVQVFIDLRTEHSYALWASLWSRCFTKKCHSNAKNVHISIKIRPPDAPVPPATAV
>B9VXM8 2.7.7.-~~~~~~DNA primase~~~
MTLVLFATEYDSAHIVANVLSQTPTDHCVFPLLVKHQVSRRVYFCLQTQKCSDSRRVAPVFAVNNETLQLSRYLAARQPI
PLSALIASLDEAETQPLYRHLFRTPVLSPEHGGEVREFKHLVYFHHAAVLRHLNQVFLCPTSPSWFISVFGHTEGQVLLT
MAYYLFEGQYSTISTVEEYVRSFCTRDLGTIIPTHASMGEFARLLLGSPFRQRVSAFVAYAVARNRRDYTELEQVDTQIN
AFRERARLPDTVCVHYVYLAYRTALARARLLEYRRVVAYDADAAPEAQCTREPGFLGRRLSTELLDVMQKYFSLDNFLHD
YVETHLLRLDESPHSATSPHGLGLAGYGGRIDGTHLAGFFGTSTQLARQLERINTLSESVFSPLERSLSGLLRLCASLRT
AQTYTTGTLTRYSQRRYLLPEPALAPLLERPLPVYRVHLPNDQHVFCAVASETWHRSLFPRDLLRHVPDSRFSDEALTET
VWLHDDDVASTSPETQFYYTRHEVFNERLPVFNFVADFDLRLRDGVSGLARHTVFELCRGLRRVWMTVWASLFGYTHPDR
HPVYFFKSACPPNSVPVDAAGAPFDDDDYLDYRDERDTEEDEDGKENKNNVPDNGVFQKTTSSVDTSPPYCRCKGKLGLR
IITPFPACTVAVHPSVLRAVAQVLNHAVCLDAELHTLLDPISHPESSLDTGIYHHGRSVRLPYMYKMDQDDGYFMHRRLL
PLFIVPDAYREHPLGFVRAQLDLRNLLHHHPPHDLPALPLSPPPRVILSVRDKICPSTEANFIETRSLNVTRYRRRGLTE
VLAYHLYGGDGATAAAISDTDLQRLVVTRVWPPLLEHLTQHYEPHVSEQFTAPHVLLFQPHGACCVAVKRRDGARTRDFR
CLNYTHRNPQETVQVFIDLRTEHSYALWASLWSRCFTKKCHSNAKNVHISIKIRPPDAPMPPATAV
>P10236 2.7.7.-~~~~~~DNA primase~~~
MGQEDGNRGERRAAGTPVEVTALYATDGCVITSSIALLTNSLLGAEPVYIFSYDAYTHDGRADGPTEQDRFEESRALYQA
SGGLNGDSFRVTFCLLGTEVGGTHQARGRTRPMFVCRFERADDVAALQDALAHGTPLQPDHIAATLDAEATFALHANMIL
ALTVAINNASPRTGRDAAAAQYDQGASLRSLVGRTSLGQRGLTTLYVHHEVRVLAAYRRAYYGSAQSPFWFLSKFGPDEK
SLVLTTRYYLLQAQRLGGAGATYDLQAIKDICATYAIPHAPRPDTVSAASLTSFAAITRFCCTSQYARGAAAAGFPLYVE
RRIAADVRETSALEKFITHDRSCLRVSDREFITYIYLAHFECFSPPRLATHLRAVTTHDPNPAASTEQPSPLGREAVEQF
FCHVRAQLNIGEYVKHNVTPRETVLDGDTAKAYLRARTYAPGALTPAPAYCGAVDSATKMMGRLADAEKLLVPRGWPAFA
PASPGEDTAGGTPPPQTCGIVKRLLRLAATEQQGPTPPAIAALIRNAAVQTPLPVYRISMVPTGQAFAALAWDDWARITR
DARLAEAVVSAEAAAHPDHGALGRRLTDRIRAQGPVMPPGGLDAGGQMYVNRNEIFNGALAITNIILDLDIALKEPVPFR
RLHEALGHFRRGALAAVQLLFPAARVDPDAYPCYFFKSACRPGPASVGSGSGLGNDDDGDWFPCYDDAGDEEWAEDPGAM
DTSHDPPDDEVAYFDLCHEVGPTAEPRETDSPVCSCTDKIGLRVCMPVPAPYVVHGSLTMRGVARVIQQAVLLDRDFVEA
IGSYVKNFLLIDTGVYAHGHSLRLPYFAKIAPDGPACGRLLPVFVIPPACKDVPAFVAAHADPRRFHFHAPPTYLASPRE
IRVLHSLGGDYVSFFERKASRNALEHFGRRETLTEVLGRYNVQPDAGGTVEGFASELLGRIVACIETHFPEHAGEYQAVS
VRRAVSKDDWVLLQLVPVRGTLQQSLSCLRFKHGRASRATARTFVALSVGANNRLCVSLCQQCFAAKCDSNRLHTLFTID
AGTPCSPSVPCSTSQPSS
>P41417 2.7.7.-~~~LEF-1~~~DNA primase~~~
MLVCNYTQKRVDMMWDAIAYNDSRKYAFMTVNARWIHADRYFDTSAQLYSYIVQNKVSDVHVKPLDDGGGREWVVDADYK
NYVDEHDLMLKIYIGATAFLLFYTEENVSRVMYTGNRGFHLWLKFTDKFKITSAQNVRVHRYKAFEKPAKLDSDYIQPGS
FAHCVREAVRLYVPHMQDSNLDALTLQYWPDVDRDIFCNVNKQIRAPYSYNYKGTKFSRCITKELLDKLKQCYPGYGTGG
CGPVTTTTTPSPPKIGSMQTTTKSTT
>Q8JL78 ~~~~~~Profilin~~~
MAAEWHKIIEDVSKNNKFEDAAIVDYKTKKNVLAAIPNRTFAKIIPGEVIALITNRNILKPRIGQKFFIVYTNSLMDENT
YTMELLTGYAPVSPIVIARTHTALIFLMGKPTTSRREVYRTCRDYATNVRATGN
>Q8V4T7 ~~~~~~Profilin~~~
MAEWHKIIEDISKNNKFEDAAIVDYKTTKNVLAAIPNRTFAKINPGEVIPLITNHNILKPLIGQKFCIVYTNSLMDENTY
AMELLTGYAPVSPIVIARTHTALIFLMGKPTTSRRDVYRTCRDHATRVRATGN
>Q76ZN5 ~~~~~~Profilin~~~
MAEWHKIIEDISKNNKFEDAAIVDYKTTKNVLAAIPNRTFAKINPGEIIPLITNRNILKPLIGQKYCIVYTNSLMDENTY
AMELLTGYAPVSPIVIARTHTALIFLMGKPTTSRRDVYRTCRDHATRVRATGN
>P03252 3.4.22.39~~~L3~~~Protease~~~
MGSSEQELKAIVKDLGCGPYFLGTYDKRFPGFVSPHKLACAIVNTAGRETGGVHWMAFAWNPRSKTCYLFEPFGFSDQRL
KQVYQFEYESLLRRSAIASSPDRCITLEKSTQSVQGPNSAACGLFCCMFLHAFANWPQTPMDHNPTMNLITGVPNSMLNS
PQVQPTLRRNQEQLYSFLERHSPYFRSHSAQIRSATSFCHLKNM
>P03253 3.4.22.39~~~L3~~~Protease~~~
MGSSEQELKAIVKDLGCGPYFLGTYDKRFPGFVSPHKLACAIVNTAGRETGGVHWMAFAWNPHSKTCYLFEPFGFSDQRL
KQVYQFEYESLLRRSAIASSPDRCITLEKSTQSVQGPNSAACGLFCCMFLHAFANWPQTPMDHNPTMNLITGVPNSMLNS
PQVQPTLRRNQEQLYSFLERHSPYFRSHSAQIRSATSFCHLKNM
>P0DOI1 ~~~~~~Gag-Pro polyprotein~~~
MGNSPSYNPPAGISPSDWLNLLQSAQRLNPRPSPSDFTDLKNYIHWFHKTQKKPWTFTSGGPASCPPGKFGRVPLVLATL
NEVLSNDEGAPGASAPEEQPPPYDPPAVLPIISEGNRNRHRAWALRELQDIKKEIENKAPGSQVWIQTLRLAILQADPTP
ADLEQLCQYIASPVDQTAHMTSLTAAIAAEAANTLQGFNPKMGTLTQQSAQPNAGDLRSQYQNLWLQAWKNLPTRPSVQP
WSTIVQGPAESYVEFVNRLQISLADNLPDGVPKEPIIDSLSYANANKECQQILQGRGLVAAPVGQKLQACAHWAPKTKQP
AILVHTPGPKMPGPRQPAPKRPPPGPCYRCLKEGHWARDCPTKTTGPPPGPCPICKDPSHWKRDCPTLKSKKLIEGGPSA
PQIITPITDSLSEAELECLLSIPLARSRPSVAVYLSGPWLQPSQNQALMLVDTGAENTVLPQNWLVRDYPRTPAAVLGAG
GISRNRYNWLQGPLTLALKPEGPFITIPKILVDTFDKWQILGRDVLSRLQASISIPEEVHPPVVGVLDAPPSHIGLEHLP
PPPEVPQFPLN
>P49860 ~~~4~~~Prohead protease~~~
MPEIVKTLSFDETEIKFTGDGKQGIFEGYASVFNNTDSDGDIILPGAFKNALANQTRKVAMFFNHKTWELPVGKWDSLAE
DEKGLYVRGQLTPGHSGAADLKAAMQHGTVEGMSVGFSVAKDDYTIIPTGRIFKNIQALREISVCTFPANEQAGIAAMKS
VDGIETIRDVENWLRDSVGLTKSQAVGLIARFKSAIRSESEGDGNEAQINALLQSIKSFPSNLGK
>Q01267 ~~~I~~~Protease I~~~
MKKHAIGIAALNALSIDDDGWCQLLPAGHFSARDGRPFDVTGGQGWFIDGEIAGRLVEGVRALNQDVLIDYEHNQLRKDK
GLPPEQLVAAGWFNADEMQWREGEGLFIHPRWTAAAQQRIDDGEFGYLSAVFPYDTATGAVLQIRLAALTNDPGATGMKK
LTALAADLPDILQQENKPMNETLRKLLARLGVTVPENADITDEQATAALTALDTLEINAGKVAALSAELEKAQKAAVDLT
KYVPVESYNALRDELAQATAQSATASLSAVLDKAEQEGRIFKSERTYLEQLGGQIGVAALSAQLEKKQPIAALSAMQTTT
AKIPSQEKTAVAVLSADEQAAVKALGITEAEYLKMKQEQEK
>Q6QGD7 3.4.21.-~~~~~~Prohead protease~~~
MTQAAIDYNKLKSAPVHLDAYIKSIDSESKEGVVKIRGFANTISKDRAGDVIPASAWKTSNALTNYMKNPIILFGHDHRR
PIGKCIDLNPTEMGLEIECEINESSDPAIFSLIKNGVLKTFSIGFRCLDAEWDEATDIFIIKDLELYEVSVVSVPCNQDS
TFNLAKSMNGHDYTEWRKSFTAISSKAVPAQERNLSELEKLAIALGYVKE
>P10274 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGQIFSRSASPIPRPPRGLAAHHWLNFLQAAYRLEPGPSSYDFHQLKKFLKIALETPARICPINYSLLASLLPKGYPGRV
NEILHILIQTQAQIPSRPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGAPPNHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLR
REYQQLWLAAFAALPGSAKDPSWASILQGLEEPYHAFVERLNIALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQTWTPKDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCTQPRPPPGPCPLCQDPTHWKRDCPRLKPTI
PEPEPEEDALLLDLPADIPHPKNLHRGGGLTSPPTLQQVLPNQDPASILPVIPLDPARRPVIKAQVDTQTSHPKTIEALL
DTGADMTVLPIALFSSNTPLKNTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQG
VLYLPEAKRPPVILPIQAPAVLGLEHLPRPPEISQFPLNQNASRPCNTWSGRPWRQAISNPTPGQGITQYSQLKRPMEPG
DSSTTCGPLTL
>P0C210 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGQIFPRSANPIPRPPRGLATHHWLNFLQAAYRLEPGPSSYDFHQLKTVLKMALETPVWMCPINYSLLASLLPKGYPGQV
NEILQVLIQTQTQIPSHPAPPPPSSPTHDPPDSDPQIPPPYVEPTAPQVLPVMHPHGVPPTHRPWQMKDLQAIKQEVSQA
APGSPQFMQTIRLAVQQFDPTAKDLQDLLQYLCSSLVASLHHQQLDSLISEAETRGITGYNPLAGPLRVQANNPQQQGLR
REYQQLWLTAFAALPGSAKDPSWASILQGLEEPYHTFVERLNVALDNGLPEGTPKDPILRSLAYSNANKECQKLLQARGH
TNSPLGDMLRACQAWTPRDKTKVLVVQPKKPPPNQPCFRCGKAGHWSRDCAQPRPPPGPCPLCQDPTHWKRDCPRLKPAI
PEPEPEEDALLLDLPADIPHPKNLHRGGGLTSPPTLRQVHPNKDPASILPVIPLDPARRPLIKAQVDTQTSHPRTIEALL
DTGADMTVLPIALFSSDTPLKDTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQG
VLYLPEAKRPPVILPIQAPAVLGLEHLPRPPEISQFPLNQNASRPCNTWSGRPWRQAISNRTPGQEITQYSQLKRPMEPG
DSSTTCGPLIL
>P03353 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGQIHGLSPTPIPKAPRGLSTHHWLNFLQAAYRLQPRPSDFDFQQLRRFLKLALKTPIWLNPIDYSLLASLIPKGYPGRV
VEIINILVKNQVSPSAPAAPVPTPICPTTTPPPPPPPSPEAHVPPPYVEPTTTQCFPILHPPGAPSAHRPWQMKDLQAIK
QEVSSSALGSPQFMQTLRLAVQQFDPTAKDLQDLLQYLCSSLVVSLHHQQLNTLITEAETRGMTGYNPMAGPLRMQANNP
AQQGLRREYQNLWLAAFSTLPGNTRDPSWAAILQGLEEPYCAFVERLNVALDNGLPEGTPKEPILRSLAYSNANKECQKI
LQARGHTNSPLGEMLRTCQAWTPKDKTKVLVVQPRRPPPTQPCFRCGKVGHWSRDCTQPRPPPGPCPLCQDPSHWKRDCP
QLKPPQEEGEPLLLDLPSTSGTTEEKNLLKGGDLISPHPDQDISILPLIPLRQQQQPILGVRISVMGQTPQPTQALLDTG
ADLTVIPQTLVPGPVKLHDTLILGASGQTNTQFKLLQTPLHIFLPFRRSPVILSSCLLDTHNKWTIIGRDALQQCQGLLY
LPDDPSPHQLLPIATPNTIGLEHLPPPPQVDQFPLNLSASRP
>P10271 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKRYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSTSSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKIKKVVQRKENSEGKRKEKDSKAFLATDWNDDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVVKKKPQ
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDEDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKEMVQKAAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRMMPRGEGSDILIKQLAWENANSLCQDLIRPIRKTGTIQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGAEGPVCFSCGKTGHIRKDCKDEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNLVKGQSPSPAQKGDGVKGSGLNPEAPPFTIHDLPRGTPGSAGLDLSSQKD
LILSLEDGVSLVPTLVKGTLPEGTTGLIIGRSSNYKKGLEVLPGVIDSDFQGEIKVMVKAAKNAVIIHKGERIAQLLLLP
YLKLPNPVIKEERGSEGFGSTSHVHWVQEISDSRPMLHIYLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGL
GMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKDIKVRLMTDSPDDSQDL
>Q9IZT2 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGVSGSKGQKLFVSVLQRLLSERGLHVKESSAIEFYQFLIKVSPWFPEEGGLNLQDWKRVGREMKKYAAEHGTDSIPKQA
YPIWLQLREILTEQSDLVLLSAEAKSVTEEELEEGLTGLLSASSQEKTYGTRGTAYAEIDTEVDKLSEHIYDEPYEEKEK
ADKNEEKDHVRKVKKIVQRKENSEHKRKEKDQKAFLATDWNNDDLSPEDWDDLEEQAAHYHDDDELILPVKRKVDKKKPL
ALRRKPLPPVGFAGAMAEAREKGDLTFTFPVVFMGESDDDDTPVWEPLPLKTLKELQSAVRTMGPSAPYTLQVVDMVASQ
WLTPSDWHQTARATLSPGDYVLWRTEYEEKSKETVQKTAGKRKGKVSLDMLLGTGQFLSPSSQIKLSKDVLKDVTTNAVL
AWRAIPPPGVKKTVLAGLKQGNEESYETFISRLEEAVYRMMPRGEGSDILIKQLAWENANSLCQDLIRPMRKTGTMQDYI
RACLDASPAVVQGMAYAAAMRGQKYSTFVKQTYGGGKGGQGSEGPVCFSCGKTGHIKRDCKEEKGSKRAPPGLCPRCKKG
YHWKSECKSKFDKDGNPLPPLETNAENSKNLVKGQSPSPTQKGDKGKDSGLNPEAPPFTIHDLPRGTPGSAGLDLSSQKD
LILSLEDGVSLVPTLVKGTLPEGTTGLIIGRSSNYKKGLEVLPGVIDSDFQGEIKVMVKAAKNAVIIHKGERIAQLLLLP
YLKLPNPIIKEERGSEGFGSTSHVHWVQEISDSRPMLHISLNGRRFLGLLDTGADKTCIAGRDWPANWPIHQTESSLQGL
GMACGVARSSQPLRWQHEDKSGIIHPFVIPTLPFTLWGRDIMKEIKVRLMTDSPDDSQDL
>P07570 ~~~gag-pro~~~Gag-Pro polyprotein~~~
MGQELSQHERYVEQLKQALKTRGVKVKYADLLKFFDFVKDTCPWFPQEGTIDIKRWRRVGDCFQDYYNTFGPEKVPVTAF
SYWNLIKELIDKKEVNPQVMAAVAQTEEILKSNSQTDLTKTSQNPDLDLISLDSDDEGAKSSSLQDKGLSSTKKPKRFPV
LLTAQTSKDPEDPNPSEVDWDGLEDEAAKYHNPDWPPFLTRPPPYNKATPSAPTVMAVVNPKEELKEKIAQLEEQIKLEE
LHQALISKLQKLKTGNETVTHPDTAGGLSRTPHWPGQHIPKGKCCASREKEEQIPKDIFPVTETVDGQGQAWRHHNGFDF
AVIKELKTAASQYGATAPYTLAIVESVADNWLTPTDWNTLVRAVLSGGDHLLWKSEFFENCRDTAKRNQQAGNGWDFDML
TGSGNYSSTDAQMQYDPGLFAQIQAAATKAWRKLPVKGDPGASLTGVKQGPDEPFADFVHRLITTAGRIFGSAEAGVDYV
KQLAYENANPACQAAIRPYRKKTDLTGYIRLCSDIGPSYQQGLAMAAAFSGQTVKDFLNNKNKEKGGCCFKCGKKGHFAK
NCHEHAHNNAEPKVPGLCPRCKRGKHWANECKSKTDNQGNPIPPHQGNRVEGPAPGPETSLWGSQLCSSQQKQPISKLTR
ATPGSAGLDLCSTSHTVLTPEMGPQALSTGIYGPLPPNTFGLILGRSSITMKGLQVYPGVIDNDYTGEIKIMAKAVNNIV
TVSQGNRIAQLILLPLIETDNKVQQPYRGQGSFGSSDIYWVQPITCQKPSLTLWLDDKMFTGLIDTGADVTIIKLEDWPP
NWPITDTLTNLRGIGQSNNPKQSSKYLTWRDKENNSGLIKPFVIPNLPVNLWGRDLLSQMKIMMCSPNDIVTAQMLAQGY
SPGKGLGKKENGILHPIPNQGQSNKKGFGNF
>P21407 ~~~pro~~~Gag-Pro polyprotein~~~
MGQASSHSENDLFISHLKESLKVRRIRVRKKDLVSFFSFIFKTCPWFPQEGSIDSRVWGRVGDCLNDYYRVFGPETIPIT
TFNYYNLIRDVLTNQSDSPDIQRLCKEGHKILISHSRPPSRQAPVTITTSEKASSRPPSRAPSTCPSVAIDIGSHDTGQS
SLYPNLATLTDPPIQSPHSRAHTPPQHLPLLANSKTLHNSGSQDDQLNPADQADLEEAAAQYNNPDWPQLTNTPALPPFR
PPSYVSTAVPPVAVAAPVLHAPTSGVPGSPTAPNLPGVALAKPSGPIDETVSLLDGVKTLVTKLSDLALLPPAGVMAFPV
TRSQGQVSSNTTGRASPHPDTHTIPEEEEADSGESDSEDDEEESSEPTEPTYTHSYKRLNLKTIEKIKTAVANYGPTAPF
TVALVESLSERWLTPSDWFFLSRAALSGGDNILWKSEYEDISKQFAERTRVRPPPKDGPLKIPGASPYQNNDKQAQFPPG
LLTQIQSAGLKAWKRLPQKGAATTSLAKIRQGPDESYSDFVSRLQETADRLFGSGESESSFVKHLAYENANPACQSAIRP
FRQKELSTMSPLLWYCSAHAVGLAIGAALQNLAPAQLLEPRPAFAIIVTNPAIFQETAPKKIQPPTQLPTQPNAPQASLI
KNLGPTTKCPRCKKGFHWASECRSRLDINGQPIIKQGNLEQGPAPGPHYRDELRGFTVHPPIPPANPCPPSNQPRRYVTD
LWRATAGSAGLDLCTTTDTILTTQNSPLTLPVGIYGPLPPQTFGLILAEPALPSKGIQVLPGILDNDFEGEIHIILSTTK
DLVTIPKGTRLAQIVILPLQQINSNFHKPYRGASAPGSSDVYWVQQISQQRPTLKLKLNGKLFSGILDTGADATVISYTH
WPRNWPLTTVATHLRGIGQATNPQQSAQMLKWEDSEGNNGHITPYVLPNLPVNLWGRDILSQMKLVMCSPNDTVMTQMLS
QGYLPGQGLGKNNQGITQPITITPKKDKTGLGFHQNLP
>Q5I147 ~~~H1~~~Tyrosine phosphatase-like protein H1~~~
MGRHSFKTVSIDEFLELTLRPDLLNLIKKEHHKLMKVKLPGTIANFSRPENSSKNRSTLFPCWDESRVILKSPSKGIPYD
NDITSTYIHANFVDGFKDKNKFICSQSPMENTCEDFWRMILQENCHIIVSLTKVDNAVYCYEYWANEKYREKVFGKYVIK
TLEIIEEEVFTRSRLLLTDTNNDISQEIHHFWYTNFPLHYGWPIMSPELLNLIFHVDQKREELMNTTGSGPIVIHCSKIV
SWTGIFCTIYNALSQVREEKTVSLPQTVLNIRKKRHSSIMNWVEYEICYRVLCEAILNLKTFMYCNLEIFSAFQDKVAAI
NFYFGKSHNKHSFNNK
>P24656 3.1.3.48~~~PTP~~~Tyrosine-protein phosphatase~~~
MFPARWHNYLQCGQVIKDSNLICFKTPLRPELFAYVTSEEDVWTAEQIVKQNPSIGAIIDLTNTSKYYDGVHFLRAGLLY
KKIQVPGQTLPPESIVQEFIDTVKEFTEKCPGMLVGVHCTHGINRTGYMVCRYLMHTLGIAPQEAIDRFEKARGHKIERQ
NYVQDLLI
>A0A2L0V130 6.3.4.25~~~purZ~~~N6-succino-2-amino-2'-deoxyadenylate synthase~~~
MNSNKKATIVVDAQFGSTGKGLIAGYLAERDQPDVVMTAWSANAGHTYINAEGRKFVHCMLANGIVSPKLTTVLIGPGSQ
MNAELLRDEILSCADLLQGKTILLHASAALILQKHVEEEAGPMTKIGSTKKGCGAAMIAKIRRNPDDNNTVGANGDYMEE
HIYGPVREAGVFIRTATNAEYMAVVYDAERIQVEGAQGFSLGINNGFYPYVTSRECTPAQVAVDVNLPLAFIDKVVACMR
TLPIRVANRYNDKGEQIGWSGPCYPDQKELDWEKDLGMEAELTTVTKLPRRIFTFSYEQTKAALEVIRPDEVFLNFCNYL
EPDEVAAVIGTIERAAHELKVPGPMPLARYYGYGPTVNDIDLDMRHQ
>A0A7U3TBV6 6.3.4.25~~~purZ~~~N6-succino-2-amino-2'-deoxyadenylate synthase~~~
MLSIPPYYRVKNCNLIVDCQYGSTGKGLLAGYLGALEAPQVLCMAPSPNAGHTLVEEDGTARVHKMLPLGITSPSLERIY
LGPGSVIDMDRLLEEYLALPRQVELWVHQNAAVVLQEHRDEEAAGGLAPGSTRSGAGSAFIAKIRRRPGTLLFGEAVRDH
PLHGVVRVVDTRTAQDMLFRTRSIQAEGCQGYSLSVHHGAYPYCTARDVTTAQLIADCGLPYDVARIARVVGSMRTYPIR
VANRPEAGEWSGPCYPDSVECQFADLGLEQEYTTVTKLPRRIFTFSAIQAHEAIAQNGVDEVFLNFAQYPPSLGALEDIL
DAIEARAEVTYVGFGPKVTDVYHTPTRAELEGLYARYRR
>A0A2H5BHJ6 6.3.4.25~~~purZ~~~N6-succino-2-amino-2'-deoxyadenylate synthase~~~
MKKATVICDMQFGSTGKGLIAGFLAERDQPDVVVTAWSANAGHTYINREGRKWVHCMLANGIVSPKLKAVLIGGGSQMSI
PTLISEIMGSLDILQGKSILIHENACIIQQRHVEEEAGPMTKIGSTKKGCGAAMMEKIRRNPESKIVAKDFIDDGLEIPD
FKLDGTVGFKDISRHFEELGVCIKVVSNEVYLAVLHKAERVQVEGAQGFSLGLHNGFYPYVTSRECTPAQICSDCNVPIS
MVDKVVGTMRTYPIRVANRFDDEGKMVGWSGPCYSDQTELTWEQMGVTPEKTTVTKLTRRIFSFSRMQTRQAMLVCMPDE
IFLNFANYCASEDELASIIEVISNEGGDVSYIGWGDSAAHIETTLEGDWSDDTNPLFNQYNKSSNIA
>G3FFN6 6.3.4.25~~~purZ~~~N6-succino-2-amino-2'-deoxyadenylate synthase~~~
MKNVDLVIDLQFGSTGKGLIAGYLAEKNGYDTVINANMPNAGHTYINAEGRKWMHKVLPNGIVSPNLKRVMLGAGSVFSI
NRLMEEIEMSKDLLHDKVAILIHPMATVLDEEAHKKAEVGIATSIGSTGQGSMAAMVEKLQRDPTNNTIVARDVAQYDGR
IAQYVCTVEEWDMALMASERILAEGAQGFSLSLNQEFYPYCTSRDCTPARFLADMGIPLPMLNKVIGTARCHPIRVGGTS
GGHYPDQEELTWEQLGQVPELTTVTKKVRRVFSFSFIQMQKAMWTCQPDEVFLNFCNYLSPMGWQDIVHQIEVAAQSRYC
DAEVKYLGFGPTFNDVELREDVM
>C9DG80 ~~~~~~Putrescinyltransferase~~~
MKHSYRADEPLWLNYTEAISKTWDIDPTYNAIHGAMKDMGEAKTHRVLAGMALYYHLGFSCQLAEQSTDENFWDVLGNNY
EGTKRGTERRYFRGEQGRASLKYIKDNYKTPTDFLLGLHRDKYSDLLKAFGPVPAFGPYFVWKMGDIYDRILGMPIQADN
CIEHLPSEPFKGMELVRKEMIARGDASFADYTPQQMFEYMESQTNKLGLIAPPAGDRLLDIREIETAMCGLKHFYTGTDY
VGKDLVKHSESLLGYGETADLLRSHMPPVIARDYFMAPASVIKLHGPNRKIDEEGSNPIVKTTGLGSFFR
>A0A385DTA3 ~~~~~~Portal vertex auxiliary protein~~~
MAGQQGIYCAPDNIVPNRDRVDVGCAPDGAMQLWVMEYEVTGIGKGCAMCKAINPQQAEMLLKSNGIYNGSSYLYKVTRI
EQVIVPPCNGLMAEQVVTYKDVVS
>P11041 ~~~~~~Polyhedrin~~~
MADVAGTSNRDFRGREQRLFNSEQYNYNNSLNGEVSVWVYAYYSDGSVLVINKNSQYKVGISETFKALKEYREGQHNDSY
DEYEVNQSIYYPNGGDARKFHSNAKPRAIQIIFSPSVNVRTIKMAKGNAVSVPDEYLQRSHPWEATGIKYRKIKRDGEIV
GYSHYFELPHEYNSISLAVSGVHKNPSSYNVGSAHNVMDVFQSCDLALRFCNRYWAELELVNHYISPNAYPYLDINNHSY
GVALSNRQ
>P36326 ~~~~~~Polyhedrin~~~
MHGLDDAQYLQQKAHNKRISEFRSSSNSGINVTVVLKYTNGVVQVYNWQGTEVIAGSLNRQLMKFPNYMNPDKHGRIEWP
GEGVEHQHGLIRSNGGNGSYDIGAGDPYAMQFIVQGSVDWNATRLRFFGPDGSRWMPDDQGGASVRAGLLNAAEDIINSK
MQPLYFCDRMAGKSYYVRFDDKYAPRFPTIGFEVYRYRVGATNEMGGESARTAVASLISFPTFSTAYVNEKVAVENFFQP
RELVYQTAMGTPFEVRLVPMDRFVTETGI
>P36701 ~~~~~~Polyhedrin~~~
MHGLDDAQYLQQKAHNKRISEFRSSSNSGINVTVVLKYTNGVVQVYNWQGTEVIAGSLNRQLMKFPNYMNPDKHGRIEWP
GEGVEHQHGLIRSNGGNGSYDIGAGDPYAMQFIVQGSVDWNATRLRFFGPDGSRWMPDDQGGASVRAGLLNAAEDIINSK
MQPLYFCDRMAGKSYYVRFDDKYAPRFPTIGFEVYRYRVGATNEMGGESARTAVASLISFPTFSTAYVNEKVAVENFFQP
RELVYQNSYGYTV
>P04871 ~~~PH~~~Polyhedrin~~~
MPDYSYRPTIGRTYVYDNKYYKNLGAVIKNAKRKKHFAEHEIEEATLDPLDNYLVAEDPFLGPGKNQKLTLFKEIRNVKP
DTMKLVVGWKGKEFYRETWTRFMEDSFPIVNDQEVMDVFLVVNMRPTRPNRCYKFLAQHALRCDPDYVPHDVIRIVEPSW
VGSNNEYRISLAKKGGGCPIMNLHSEYTNSFEQFIDRVIWENFYKPIVYIGTDSAEEEEILLEVSLVFKVKEFAPDAPLF
TGPAY
>P31036 ~~~PH~~~Polyhedrin~~~
MRNFYSYNPTIGRTYVYDNKFYKNLGSVIKNAKRKQHLIEHLKEEKQLDPLDTFMVAEDPFLGPGKNQKLTLFKEVRNVK
PDTMKLVVNWSGKEFLRETWTRFMEDSFPIVNDQEVMDIFLEANLKPTRPNRCYRFLAQHALRCDPDYVPHEVIRIVEPD
YVGVGNEYRISLAKRGGGCPIMNLNSEYNNSFESFIERVIWENFYRNNVYIGTDSAEEEEILLELSLLFKVKEFAPDIPL
YSGPAY
>P03237 ~~~PH~~~Polyhedrin~~~
MPNYSYNPTIGRTYVYDNKYYKNLGGLIKNAKRKKHLIEHEKEEKQWDLLDNYMVAEDPFLGPGKNQKLTLFKEVRNVKP
DTMKLIVNWSGKEFLRETWTRFVEDSFPIVNDQEVMDVYLVANLKPTRPNRCYKFLAQHALRWDEDYVPHEVIRIMEPSY
VGMNNEYRISLAKKGGGCPIMNIHSEYTNSFESFVNRVIWENFYKPIVYIGTDSAEEEEILIEVSLVFKIKEFAPDAPLF
TGPAY
>P32373 ~~~PH~~~Polyhedrin~~~
MPDFSYRPTIGRTYVYDNKSNKNLGSVIKNAKRKKHLLEHEAEEKFLDPLDHYMVAEDPFLGPGKNQKLTLFKEIRNVKP
DTMKLIANWSGKEFLRETWTRFVEDSFPIVNDQEVMEVFLVINLRPTRPNRCYKFLAQHALRWDDNYVPHEVIRIVEPSY
VGMNNEYRISLAKRGGGCPIMNIHSEYTNSFEQFVNRVIWENFYKPIVYIGTDSGEEEEILIEVSLVFKVKEFAPDAPLF
TGPAY
>Q25469 ~~~~~~Polyhedrin~~~
MYTRYSYNPTLGRTYVYDNKYYKNLGHVIKNAKRKKNAAEHELEERNLDPLDKYLVAEDPFLGPGKNQKLTLFKEIRNVK
PDTMKLIVNWSGKEFLRETWTRFMEDSFPIVNDQEIMDVFLVVNMRPTKPNRCSVLAQHALRCDSDYVPHEVIRIVKPSY
VGSNNEYRISLGKRYNGCPVMNLHSEYTNSFEDFINRVIWENFYKPLVYIGTDSAEEEEILLEVSLVFKIKEFAPDAPLY
TGPAY
>P56260 ~~~V-QIN~~~Transforming protein Qin~~~
MLDMGDRKEVKMLPKSSFSINNLVPEAVQSDNHSGHSHHNSHHPHHHHHHHHHHPPPPQQPQRAAAAEEEDEEKAPLLLP
PPAAGALEAAKAEALAGKGEAGAAAAELEEKEKAAEEKKGAAEGGKDGESGKEGEKKNGKYEKPPFSYNALIMMAIRQSP
EKRLTLNGIYEFIMKNFPYYRENKQGWQNSIRHNLSLNKCFVKVPRHYDDPGKGNYWMLDPSSDDVFIGGTTDKLRRRST
TSRAKLAFKRGARLTSTGLTFMDRAGSLYWPMSPFLSLHHPRASSTLSYNGTASAYPSHPMPYSSVLTQNSLGNNHSFST
SNGLSVDRLVNGEIPYATHHLTAAALAASVPCGLSVPCSGTYSLNPCSVNLLAGQTSYFSPTSLTPQ
>P0C6W3 ~~~rep~~~Replicase polyprotein 1ab~~~
MLSKASVTTQGARGKYRAELYNEKRSDHVACTVPLCDTDDMACKLTPWFEDGETAFNQVSSILKEKGKILFVPMHMQRAM
KFLPGPRVYLVERLTGGMLSKHFLVNQLAYKDQVGAAMMRTTLNAKPLGMFFPYDSSLETGEYTFLLRKNGLGGQLFRER
PWDRKETPYVEILDDLEADPTGKYSQNLLKKLIGGDCIPIDQYMCGKNGKPIADYAKIVAKEGLTTLADIEVDVKSRMDS
DRFIVLNKKLYRVVWNVTRRNVPYPKQTAFTIVSVVQCDDKDSVPEHTFTIGSQILMVSPLKATNNKNFNLKQRLLYTFY
GKDAVQQPGYIYHSAYVDCNACGRGTWCTGNAIQGFACDCGANYSANDVDLQSSGLVPRNALFLANCPCANNGACSHSAA
QVYNILDGKACVEVGGKSFTLTFGGVVYAYMGCCDGTMYFVPRAKSCVSRIGDAIFTGCTGTWDKVVETANLFLEKAQRS
LNFCQQFALTEVVLAILSGTTSTFEELRDLCHNASYEKVRDHLVNHGFVVTIGDYIRDAINIGANGVCNATINAPFIAFT
GLGESFKKVSAIPWKICSNLKSALDYYSSNIMFRVFPYDIPCDVSNFVELLLDCGKLTVATSYFVLRYLDEKFDTVLGTV
SSACQTALSSFLNACVAASRATAGFINDMFKLFKVLMHKLYVYTSCGYVAVAEHSSKIVQQVLDIMSKAMKLLHTNVSWA
GTKLSAIIYEGREALLFNSGTYFCLSTKAKTLQGQMNLVLPGDYNKKTLGILDPVPNADTIDVNANSTVVDVVHGQLEPT
NEHGPSMIVGNYVLVSDKLFVRTEDEEFYPLCTNGKVVSTLFRLKGGMPSKKVTFGDVNTVEVTAYRSVSITYDIHPVLD
ALLSSSKLATFTVEKDLLVEDFVDVIKDEVLTLLTPLLRGYDIDGFDVEDFIDVPCYVYNQDGDCAWSSNMTFSINPVED
VEEVEEFIEDDYLSDELPIADDEEAWARAVEEVMPLDDILVAEIELEEDPPLETALESVEAEVVETAEAQEPSVESIDST
PSTSTVVGENDLSVKPMSRVAETDDVLELETAVVGGPVSDVTAIVTNDIVSVEQAQQCGVSSLPIQDEASENQVHQVSDL
QGNELLCSETKVEIVQPRQDLKPRRSRKSKVDLSKYKHTVINNSVTLVLGDAIQIASLLPKCILVNAANRHLKHGGGIAG
VINKASGGDVQEESDEYISNNGPLHVGDSVLLKGHGLADAILHVVGPDARNNEDAALLKRCYKAFNKHTIVVTPLISAGI
FSVDPKVSFEYLLANVTTTTYVVVNNEDIYNTLATPSKPDGLVYSFEGWRGTVRTAKNYGFTCFICTEYSANVKFLRTKG
VDTTKKIQTVDGVSYYLYSARDALTDVIAAANGCSGICAMPFGYVTHGLDLAQSGNYVRQVKVPYVCLLASKEQIPIMNS
DVAIQTPETAFINNVTSNGGYHSWHLVSGDLIVKDVCYKKLLHWSGQTICYADNKFYVVKNDVALPFSDLEACRAYLTSR
AAQQVNIEVLVTIDGVNFRTVILNDTTTFRKQLGATFYKGVDISDAFPTVKMGGESLFVADNLSESEKVVLKEYYGTSDV
TFLQRYYSLQPLVQQWKFVVHDGVKSLKLSNYNCYINATIMMIDMLHDIKFVVPALQNAYLRYKGGDPYDFLALIMAYGD
CTFDNPDDEAKLLHTLLAKAELTVSAKMVWREWCTVCGIRDIEYTGMRACVYAGVNSMEELQSVFNETCVCGSVKHRQLV
EHSAPWLLVSGLNEVKVSTSTDPIYRAFNVFQGVETSVGHYVHIRVKDGLFYKYDSGSLTKTSDMKCKMTSVWYPTVRYT
ADCNVVVYDLDGVTKVEVNPDLSNYYMKDGKYYTSKPTIKYSPATILPGSVYSNSCLVGVDGTPGSDTISKFFNDLLGFD
ETKPISKKLTYSLLPNEDGDVLLSEFSNYNPVYKKGVMLKGKPILWVNNGVCDSALNKPNRASLRQLYDVAPIVLDNKYT
VLQDNTSQLVEHNVPVVDDVPITTRKLIEVKCKGLNKPFVKGNFSFVNDPNGVTVVDTLGLTELRALYVDINTRYIVLRD
NNWSSLFKLHTVESGDLQIVAAGGSVTRRARVLLGASSLFASFAKITVTATTAACKTAGRGFCKFVVNYGVLQNMFVFLK
MLFFLPFNYLWPKKQPTVDIGVSGLRTAGIVTTNIVKQCGTAAYYMLLGKFKRVDWKATLRLFLLLCTTILLLSSIYHLV
LFNQVLSSDVMLEDATGILAIYKEVRSYLGIRTLCDGLVVEYRNTSFDVMEFCSNRSVLCQWCLIGQDSLTRYSALQMLQ
THITSYVLNIDWIWFALEFFLAYVLYTSSFNVLLLVVTAQYFFAYTSAFVNWRAYNYIVSGLFFLVTHIPLHGLVRVYNF
LACLWFLRKFYSHVINGCKDTACLLCYKRNRLTRVEASTIVCGTKRTFYIAANGGTSYCCKHNWNCVECDTAGVGNTFIC
TEVANDLTTTLRRLIKPTDQSHYYVDSVVVKDAVVELHYNRDGSSCYERYPLCYFTNLEKLKFKEVCKTPTGIPEHNFLI
YDTNDRGQENLARSACVYYSQVLCKPMLLVDVNLVTTVGDSREIAIKMLDSFINSFISLFSVSRDKLEKLINTARDCVRR
GDDFQNVLKTFTDAARGHAGVESDVETTMVVDALQYAHKNDIQLTTECYNNYVPGYIKPDSINTLDLGCLIDLKAASVNQ
TSMRNANGACVWNSGDYMKLSDSFKRQIRIACRKCNIPFRLTTSKLRAADNILSVKFSATKIVGGAPSWLLRVRDLTVKG
YCILTLFVFTVAVLSWFCLPSYSIATVNFNDDRILTYKVIENGIVRDIAPNDVCFANKYGHFSKWFNENHGGVYRNSMDC
PITIAVIAGVAGARVANVPANLAWVGKQIVLFVSRVFANTNVCFTPINEIPYDTFSDSGCVLSSECTLFRDAEGNLNPFC
YDPTVLPGASSYADMKPHVRYDMYDSDMYIKFPEVIVESTLRITKTLATQYCRFGSCEESAAGVCISTNGSWALYNQNYS
TRPGIYCGDDYFDIVRRLAISLFQPVTYFQLSTSLAMGLVLCVFLTAAFYYINKVKRALADYTQCAVVAVVAALLNSLCL
CFIVANPLLVAPYTAMYYYATFYLTGEPAFIMHISWYVMFGAVVPIWMLASYTVGVMLRHLFWVLAYFSKKHVDVFTDGK
LNCSFQDAASNIFVIGKDTYVALRNAITQDSFVRYLSLFNKYKYYSGAMDTASYREACAAHLCKALQTYSETGSDILYQP
PNCSVTSSVLQSGLVKMSAPSGAVENCIVQVTCGSMTLNGLWLDNTVWCPRHIMCPADQLTDPNYDALLISKTNHSFIVQ
KHIGAQANLRVVAHSMVGVLLKLTVDVANPSTPAYTFSTVKPGASFSVLACYNGKPTGVFTVNLRHNSTIKGSFLCGSCG
SVGYTENGGVINFVYMHQMELSNGTHTGSSFDGVMYGAFEDKQTHQLQLTDKYCTINVVAWLYAAVLNGCKWFVKPTRVG
IVTYNEWALSNQFTEFVGTQSIDMLAHRTGVSVEQMLAAIQSLHAGFQGKTILGQSTLEDEFTPDDVNMQVMGVVMQSGV
KRISYGFIHWLISTFVLAYVSVMQLTKFTMWTYLFETIPTQMTPLLLGFMACVMFTVKHKHTFMSLFLLPVALCLTYANI
VYEPQTLISSTLIAVANWLTPTSVYMRTTHFDFGLYISLSFVLAIIVRRLYRPSMSNLALALCSGVMWFYTYVIGDHSSP
ITYLMFITTLTSDYTITVFATVNLAKFISGLVFFYAPHLGFILPEVKLVLLIYLGLGYMCTMYFGVFSLLNLKLRVPLGV
YDYSVSTQEFRFLTGNGLHAPRNSWEALILNFKLLGIGGTPCIKVATVQSKLTDLKCTSVVLLTVLQQLHLESNSKAWSY
CVKLHNEILAAVDPTEAFERFVCLFATLMSFSANVDLDALANDLFENSSVLQATLTEFSHLATYAELETAQSSYQKALNS
GDASPQVLKALQKAVNVAKNAYEKDKAVARKLERMAEQAMTSMYKQARAEDKKAKIVSAMQTMLFGMIKKLDNDVLNGVI
ANARNGCVPLSIVPLCASNKLRVVIPDISVWNKVVNWPSVSYAGSLWDITVINNVDNEVVKPTDVVETNESLTWPLVIEC
SRSSSSAVKLQNNEIHPKGLKTMVITAGVDQVNCNSSAVAYYEPVQGHRMVMGLLSENAHLKWAKVEGKDGFINIELQPP
CKFLIAGPKGPEIRYLYFVKNLNNLHRGQLLGHIAATVRLQAGANTEFASNSTVLTLVAFAVDPAKAYLDYVGSGGTPLS
NYVKMLAPKTGTGVAISVKPEATADQETYGGASVCLYCRAHIEHPDVSGVCKYKTRFVQIPAHVRDPVGFLLKNVPCNVC
QYWVGYGCNCDALRNNTVPQSKDTNFLNRVRGSSVNARLEPCSSGLTTDVVYRAFDICNFKARVAGIGKYYKTNTCRFVQ
VDDEGHKLDSYFIVKRHTMSNYELEKRCYDLLKDCDAVAIHDFFIFDVDKTKTPHIVRQSLTEYTMMDLVYALRHFDQNN
CEVLKSILVKYGCCEQSYFDNKLWFDFVENPSVIGVYHKLGERIRQAMLNTVKMCDHMVKSGLVGVLTLDNQDLNGKWYD
FGDFVITQPGAGVAIVDSYYSYLMPVLSMTNCLAAETHKDCDFNKPLIEWPLLEYDYTDYKIGLFNKYFKYWDQTYHPNC
VNCSDDRCILHCANFNVLFSMVLPNTSFGPIVRKIFVDGVPFIVSCGYHYKELGLVMNMDFNIHRHRLALKELMMYAADP
AMHIASASALWDLRTPCFSVAALTTGLTFQTVRPGNFNKDFYDFVVSRGFFKEGSSVTLKHFFFAQDGHAAITDYSYYAY
NLPTMVDIKQMLFCMEVVDKYFDIYDGGCLNASEVIVNNLDKSAGHPFNKFGKARVYYESMSYQEQDELFAVTKRNVLPT
ITQMNLKYAISAKNRARTVAGVSILSTMTNRQYHQKMLKSMAATRGATCVIGTTKFYGGWDFMLKTLYKDVESPHLMGWD
YPKCDRAMPNMCRILASLILARKHSTCCTNSDRFYRLANECAQVLSEYVLCGGGYYVKPGGTSSGDATTAYANSVFNILQ
ATTANVSALMSANGNTIIDREIKDMQFDLYINVYRKVVPDPKFVDKYYAFLNKHFSMMILSDDGVVCYNSDYAAKGYVAS
IQNFKETLYYQNNVFMSEAKCWVETNLEKGPHEFCSQHTLYIKDGDDGYFLPYPDPSRILSAGCFVDDIVKTDGTVMMER
YVSLAIDAYPLTKHDDTEYQNVFWVYLQYIEKLYKDLTGHMLDSYSVMLCGDDSAKFWEEGFYRDLYSSPTTLQAVGSCV
VCHSQTSLRCGTCIRRPFLCCKCCYDHVIATPHKMVLSVSPYVCNAPGCDVSDVTKLYLGGMSYYCNDHRPVCSFPLCAN
GLVFGLYKNMCTGSSSIMEFNRLATCDWSDSGDYTLANTTTEPLKLFAAETLRATEEASKQSYAIATIKEIVGERELILV
WEVGKSKPPLNRNYVFTGYHLTKNSKVQLGEYVFERIDYSDAVSYKSSTTYKLAVGDIFVLTSHSVATLSAPTIVNQERY
LKITGIYPTITVPEEFANHVVNFQKAGFSKYVTVQGPPGTGKSHFAIGLAIYYPTARIVYTACSHAAVDALCEKAFKYLN
IAKCSRIIPAKARVECYDRFKVNDTNSQYLFSTVNALPEISVDILVVDEVSMCTNYDLSIINSRVKAKHIVYVGDPAQLP
APRTLLIRGTLEPENFNSVTRLMCNLGPDIFLSVCYRCPKEIVSTVSALVYNNKLSAKKDASGQCFKILFKGSVTHDASS
AINRPQLNFVKTFIAANPNWSKAVFISPYNSQNAVARSMLGLTTQTVDSSQGSEYPYVIFCQTADTAHANNLNRFNVAVT
RAQKGILCVMTSQVLFDSLEFAELSLNNYKLQSQIVTGLFKDCSREDTGLPPAYAPTYLSVDAKYKTTDELCVNLNITPN
VTYSRVISRMGFKLDATIPGYPKLFITRDEAIRQVRSWVGFDVEGAHASRNACGTNVPLQLGFSTGVNFVVQPVGVVDTE
WGSMLTTISARPPPGEQFKHLVPLMNKGATWPIVRRRIVQMLSDTLDKLSDYCTFVCWAHGFELTSASYFCKIGKEQRCS
MCSRRASTFSSPLQSYACWSHSSGYDYVYNPFFVDVQQWGYVGNLATNHDRYCGIHAGAHVASSDAIMTRCLAIYDCFIE
RVDWDVTYPYISHEQKLNSCCRTVERNVVRSAVLSGKFEKIYDIGNPKGIAIISEPVEWHFYDAQPLSNKVKKLFYTDDV
SKQFEDGLCLFWNCNVSKYPSNAVVCRFDTRVHSEFNLPGCNGGSLYVNKHAFHTPAYDINAFRDLKPLPFFYYSTTPCE
VHGSGNMLEDIDYVPLKSAVCITACNLGGAVCRKHAAEYRDYMEAYNIVSAAGFRLWVYKTFDIYNLWSTFVKVQGLENI
AFNVIKQGHFTGVDGELPVAVVNDKIFTKNGTDDVCIFKNETALPTNVAFELYAKRAVRSHPDLNLLRNLEVDVCYNFVL
WDYDRNNIYGTTTIGVCKYTDIDVNPNLNMCFDIRDKGSLERFMSMPNGVLISDRKIKNYPCISGPKHAYFNGAILRNID
AKQPVIFYLYKKVNNEFVSFSDTFYTCGRTVGDFTVLTPMEEDFLVLDSDVFIKKYGLEDYAFEHVVYGDFSHTTLGGLH
LLIGLYKKMREGHILMEEMLKDRATVHNYFITDSNTASYKAVCSVIDLRLDDFVTIIKEMDLDVVSKVVKVPIDLTMIEF
MLWCRDGKVQTFYPRLQATNDWKPGLTMPSLFKVQQMNLEPCLLANYKQSIPMPNGVHMNVAKYMQLCQYLNTCTLAVPA
NMRVIHFGAGCEKGVAPGTSVLRQWLPLDAVLIDNDLNEFVSDADITIFGDCVTVHVGQQVDLLISDMYDPCTKAVGEVN
QTKALFFVYLCNFIKNNLALGGSVAIKITEHSWSADLYKIMGRFAYWTVFCTNANASSSEGFLIGINFLGELKEEIDGNV
MHANYIFWRNSTPMNLSTYSLFDLSRFPLKLKGTPVLQLKESQINELVISLLSQGKLLIRDNDTLNVSTDVLVNFRKRL
>P0C6W5 ~~~rep~~~Replicase polyprotein 1ab~~~
MEGVPDPPKLKSMVVTTLKWCDPFANPNVTGWDIPIEEALEYAKQQLRTPEPQLVFVPYYLSHAPGISGDRVVITDSIWY
ATNFGWQPIRELAMDKDGVRYGRGGTHGVLLPMQDPSFIMGDIDIQIRKYGIGANSPPDVLPLWDGFSDPGPDVGPYLDF
PDNCCPTKPKAKRGGDVYLSDQYGFDNNGILVEPVMKLLGVIKSDFTLEQLLAALGKYRTEDGYDLPDGYVKVAIKVGRK
AVPVLKQSIFTVVGVTEQLVPGYYYPFSTSSVVEHTKPTRGGPVGKTVEAVMLSLYGTNNYNPATPVARLKCSYCDYYGW
TPLKDIGTVNCLCGAEFQLTSSCVDAESAGVIKPGCVMLLDKSPGMRLIPGNRTYVSFGGAIWSPIGKVNGVTVWVPRAY
SIVAGEHSGAVGSGDTVAINKELVEYLIEGIRVDADTLDNPTCATFIANLDCDTKAPVVHTVESLQGLCLANKIMLGDKP
LPTDEFHPFIVGLAYHVQRACWYGALASRTFEAFRDFVRTEEERFAQFFGKVCAPINGCVYLAYTTGRVTLFSAYQVLNT
AIAKSKDAFGGVAAIVVDMLKPILEWVLKKMSIAKGAWLPYAEGLLALFKAQFTVVKGKFQFLRASLNSKCHSLCDLLTT
IMSKLLTSVKWAGCKVDALYTGTYYYFSRKGVLTEVQLCAKRLGLLLTPKQQKMEVEVLDGDFDAPVTLTDLELEECTGV
LEEVFGASDVKLVKGTLVSLASKLFVRTEDGFLYRYVKSGGVLGKAFRLRGGGVSKVTFGDEEVHTIPNTVTVNFSYDVC
EGLDAILDKVMAPFQVEEGTKLEDLACVVQKAVYERLSDLFSDCPAELRPINLEDFLTSECFVYSKDYEKILMPEMYFSL
EDAVPVDDEMVDDIEDTVEQASDSDDQWLGDEGAEDCDNTIQDVDVATSMTTPCGYTKIAEHVYIKCADIVQEARNYSYA
VLVNAANVNLHHGGGVAGALNRATNNAMQKESSEYIKANGSLQPGGHVLLSSHGLASHGILHVVGPDKRLGQDLALLDAV
YAAYTGFDSVLTPLVSAGIFGFTVEESLCSLVKNVACTTYVVVYDRQLYERALATSFDVPGPQSSVQHVPAIDWAEAVEV
QESIVDQVETPSLGAVDTVDSNADSGLNETARSPENVVGSVPDDVVADVESCVRDLVRQVVKKVKRDKRPPPIVPQQTVE
QQPQEISSPGDCNTVLVDVVSMSFSAMVNFGKEKGLLIPVVIDYPAFLKVLKRFSPKEGLFSSNGYEFYGYSRDKPLHEV
SKDLNSLGRPLIMIPFGFIVNGQTLAVSAVSMRGLTVPHTVVVPSESSVPLYRAYFNGVFSGDTTAVQDFVVDILLNGAR
DWDVLQTTCTVDRKVYKTICKRGNTYLCFDDTNLYAITGDVVLKFATVSKARAYLETKLCAPEPLIKVLTTVDGINYSTV
LVSTAQSYRAQIGTVFCDGHDWSNKNPMPTDEGTHLYKQDNFSSAEVTAIREYYGVDDSNIIARAMSIRKTVQTWPYTVV
DGRVLLAQRDSNCYLNVAISLLQDIDVSFSTPWVCRAYDALKGGNPLPMAEVLIALGKATPGVSDDAHMVLSAVLNHGTV
TARRVMQTVCEHCGVSQMVFTGTDACTFYGSVVLDDLYAPVSVVCQCGRPAIRYVSEQKSPWLLMSCTPTQVPLDTSGIW
KTAIVFRGPVTAGHYMYAVNGTLISVYDANTRRRTSDLKLPATDILYGPTSFTSDSKVETYYLDGVKRTTIDPDFSKYVK
RGDYYFTTAPIEVVAAPKLVTSYDGFYLSSCQNPQLAESFNKAINATKTGPMKLLTMYPNVAGDVVAISDDNVVAHPYGS
LHMGKPVLFVTRPNTWKKLVPLLSTVVVNTPNTYDVLAVDPLPVNNETSEEPISVKAPIPLYGLKATMVLNGTTYVPGNK
GHLLCLKEFTLTDLQTFYVEGVQPFVLLKASHLSKVLGLRVSDSSLHVNHLSKGVVYAYAATRLTTRVTTSLLGGLVTRS
VRKTADFVRSTNPGSKCVGLLCLFYQLFMRFWLLVKKPPIVKVSGIIAYNTGCGVTTCVLNYLRSRCGNISWSRLLKLLR
YMLYIWFVWTCLTICGVWLSEPYAPSLVTRFKYFLGIVMPCDYVLVNETGTGWLHHLCMAGMDSLDYPALRMQQHRYGSP
YNYTYILMLLEAFFAYLLYTPALPIVGILAVLHLIVLYLPIPLGNSWLVVFLYYIIRLVPFTSMLRMYIVIAFLWLCYKG
FLHVRYGCNNVACLMCYKKNVAKRIECSTVVNGVKRMFYVNANGGTHFCTKHNWNCVSCDTYTVDSTFICRQVALDLSAQ
FKRPIIHTDEAYYEVTSVEVRNGYVYCYFESDGQRSYERFPMDAFTNVSKLHYSELKGAAPAFNVLVFDATNRIEENAVK
TAAIYYAQLACKPILLVDKRMVGVVGDDATIARAMFEAYAQNYLLKYSIAMDKVKHLYSTALQQISSGMTVESVLKVFVG
STRAEAKDLESDVDTNDLVSCIRLCHQEGWEWTTDSWNNLVPTYIKQDTLSTLEVGQFMTANAKYVNANIAKGAAVNLIW
RYADFIKLSESMRRQLKVAARKTGLNLLVTTSSLKADVPCMVTPFKIIGGHRRIVSWRRVLIHVFMLLVVLNPQWFTPWY
IMRPIEYNVVDFKVIDNAVIRDITSADQCFANKFSAFENWYSNRYGSYVNSRGCPMVVGVVSDIVGSLVPGLPARFLRVG
TTLLPLVNYGLGAVGSVCYTPHYAINYDVFDTSACVLAATCTLFSSASGERMPYCADAALIQNASRYDMLKPHVMYPFYE
HSGYIRFPEVISAGVHIVRTMAMEYCKVGRCDVSEAGLCMSLQPRWVVNNAYFRQQSGVYCGTSAFDLFMNMLLPIFTPV
GAVDITTSILMGALLAVVVSMSLYYLLRFRRAFGDYSGVIFTNILAFVLNVIVLCLEGPYPMLPSIYAMVFLYATCYFGS
DIACMMHVSFLIMFAGVVPLWVTVLYIVVVLSRHILWFASLCTKRTVQVGDLAFHSFQDAALQTFMLDKEVFLRLKREIS
SDAYFKYLAMYNKYKYYSGPMDTAAYREAACSHLVMALEKYSNGGGDTIYQPPRCSVASAALQAGLTRMAHPSGLVEPCL
VKVNYGSMTLNGIWLDNFVICPRHVMCSRDELANPDYPRLSMRAANYDFHVSQNGHNIRVIGHTMEGSLLKLTVDVNNPK
TPAYSFIRVSTGQAMSLLACYDGLPTGVYTCTLRSNGTMRASFLCGSCGSPGFVMNGKEVQFCYLHQLELPNGTHTGTDF
SGVFYGPFEDKQVPQLAAPDCTITVNVLAWLYAAVLSGENWFLTKSSISPAEFNNCAVKYMCQSVTSESLQVLQPLAAKT
GISVERMLSALKVLLSAGFCGRTIMGSCSLEDEHTPYDIGRQMLGVKLQGKFQSMFRWTLQWFAIIFVLTILILLQLAQW
TFVGALPFTLLLPLIGFVAVCVGFVSLLIKHKHTYLTVYLLPVAMVTAYYNFQYTPEGVQGYLLSLYNYVNPGRIDVIGT
DLLTMLIISVACTLLSVRMVRTDAYSRIWYVCTAVGWLYNCWTGSADTVAISYLTFMVSVFTNYTGVACASLYAAQFMVW
VLKFLDPTILLLYGRFRCVLVCYLLVGYLCTCYFGVFNLINRLFRCTLGNYEYVVSSQELRYMNSHGLLPPTNSWQALML
NIKLAGIGGIPIYRVSTIQSNMTDLKCTSVVLLSVLQQLRVESSSKLWALCVKLHNEILASNSPTEAFEAFVSLLSVLLS
LPGAINLDELCSSILENNSVLQAVASEFSNLSSYVDYENAQKAYDTAVATGAPASTVNALKKAMNVAKSVLDKDVATTRK
LERMSELAMTAMYKQARAEDRRSKVTAAMQTMLFNMIRRLDSDALSNILNNARNGVVPLGVIPRTAANKLLLVVPDFSVY
TATITMPTLTYAGSAWDVMQVADADGKTVNATDITRENSVNLAWPLVVTAQRQQATSPVKLQNNELMPQTVKRMNVVAGV
SQTACVTDAVAYYNATKEGRHVMAILADTDGLAFAKVEKSTGDGFVILELEPPCKFMVDTPKGPALKYLYFTKGLKNLCR
GTVLGTLACTVRLHAGSATEVASNSSILSLCSFSVDPEATYKDYLDNGGSPIGNCVKMLTPHTGTGLAITAKPDANIDQE
SFGGASCCLYCRCHIEHPGASGVCKYKGKFVQIPLVGVNDPIGFCIRNVVCAVCNMWQGYGCPCSSLREINLQARDECFL
NRVRGTSGVARLVPLGSGVQPDIVLRAFDICNTKVAGFGLHLKNNCCRYQELDADGTQLDSYFVVKRHTESNYLLEQRCY
EKLKDCGVVARHDFFKFNIEGVMTPHVSRERLTKYTMADLVYSLRHFDNNNCDTLKEILVLRGCCTADYFDRKDWYDPVE
NPDIIRVYHNLGETVRKAVLSAVKMADSMVEQGLIGVLTLDNQDLNGQWYDFGDFIEGPAGAGVAVMDTYYSLAMPVYTM
TNMLAAECHVDGDFSKPKRVWDICKYDYTQFKYSLFSKYFKYWDMQYHPNCVACADDRCILHCANFNILFSMVLPNTSFG
PLVQKIYVDGVPFVVSTGYHYRELGVVMNQDIRQHAQRLSLRELLVYAADPAMHVAASNALADKRTVCMSVAAMTTGVTF
QTVKPGQFNEDFYNFAVKCGFFKEGSTISFKHFFYAQDGNAAISDYDYYRYNLPTMCDIKQLLFSLEVVDKYFDCYDGGC
LQASQVVVANYDKSAGFPFNKFGKARLYYESLSYADQDELFAYTKRNVLPTITQMNLKYAISAKNRARTVAGVSIASTMT
NRQFHQKMLKSIAAARGASVVIGTTKFYGGWNRMLRTLCEGVENPHLMGWDYPKCDRAMPNLLRIFASLILARKHATCCN
ASERFYRLANECAQVLSEMVLCGGGFYVKPGGTSSGDSTTAYANSVFNICQAVSANLNTFLSIDGNKIYTTYVQELQRRL
YLGIYRSNTVDNELVLDYYNYLRKHFSMMILSDDGVVCYNADYAQKGYVADIQGFKELLYFQNNVFMSESKCWVEPDITK
GPHEFCSQHTMLVDMKGEQVYLPYPDPSRILGAGCFVDDLLKTDGTLMMERYVSLAIDAYPLTKHPDPEYQNVFWCYLQY
IKKLHEELTGHLLDTYSVMLASDNASKYWEVEFYENMYMESATLQSVGTCVVCNSQTSLRCGGCIRRPFLCCKCCYDHVV
STTHKLVLSVTPYVCNNPSCDVADVTQLYLGGMSYYCRDHRPPISFPLCANGQVFGLYKNICTGSPDVADFNSLATCDWS
NSKDYVLANTATERLKLFAAETLRATEENAKQAYASAVVKEVLSDRELVLSWETGKTRPPLNRNYVFTGFHITKNSKVQL
GEYIFEKGDYGDVVNYRSSTTYKLQVGDYFVLTSHSVQPLSSPTLLPQERYTKLVGLYPAMNVPESFASNVVHYQRVGMS
RYTTVQGPPGTGKSHLSIGLALYYPSAKIVYTACSHAAVDALCEKAHKNLPINRCSRIVPAKARVECFSKFKVNDVGAQY
VFSTINALPETTADILVVDEVSMCTNYDLSMINARVRAKHIVYVGDPAQLPAPRTLLTKGTLAPEHFNSVCRLMVAVGPD
IFLATCYRCPKEIVDTVSALVYDKKLKANKVTTGECYKCYYKGSVTHDSSSAINKPQLGLVKEFLIKNPKWQSAVFISPY
NSQNSVARRMLGLQTQTVDSSQGSEFDYVIYCQTSDTAHALNVNRFNVAITRAKKGILCVMSDSTLYESLEFTPLDVNDY
VKPKMQSEVTVGLFKDCAKAEPLGPAYAPTFVSVNDKFKLNESLCVHFDTTELQMPYNRLISKMGFKFDLNIPGYSKLFI
TREQAIREVRGWVGFDVEGAHACGPNIGTNLPLQIGFSTGVNFVVTPSGYIDTESGSRLANVVSKAPPGDQFKHLIPLMR
KGEPWSVVRKRIVEMLCDTLDGVSDTVTFVTWAHGFELTTLHYFAKVGPERKCFMCPRRATLFSSVYGAYSCWSHHRHIG
GADFVYNPFLVDVQQWGYVGNLQVNHDNVCDVHKGAHVASCDAIMTRCLAIHDCFCGEVNWDVEYPIIANELAINRACRS
VQRVVLKAAVKALHIETIYDIGNPKAIKVYGVNVNNWNFYDTNPVVEGVKQLHYVYDVHRDQFKDGLAMFWNCNVDCYPH
NALVCRFDTRVLSKLNLAGCNGGSLYVNQHAFHTDAFNKNAFVNLKPLPFFYYSDTACENATGVSTNYVSEVDYVPLKSN
VCITRCNLGGAVCKKHADEYRNFLESYNTMVSAGFTLWVDKTFDVFNLWSTFVKLQSLENVAYNVLKSGHFTAVAGELPV
AILNDRLYIKEDGADKLLFTNNTCLPTNVAFELWAKRSVNVVPEVKLLRNLGVTCTYNLVIWDYESNAPLVPNTVGICTY
TDLTKLDDQVVLVDGRQLDAYSKFCQLKNAIYFSPSKPKCVCTRGPTHASINGVVVEAPDRGTAFWYAMRKDGAFVQPTD
GYFTQSRTVDDFQPRTQLEIDFLDLEQSCFLDKYDLHDLGLEHIVYGQFDGTIGGLHLLIGAVRRKRTAHLVMETVLGTD
TVTSYAVIDQPTASSKQVCSVVDIILDDFIALIKAQDRSVVSKVVQCCLDFKVFRFMLWCKGGKISTFYPQLQAKQDWKP
GYSMPALYKVQNAVLEPCLLHNYGQAARLPSGTLMNVAKYTQLCQYLNTCSLAVPAKMRVMHFGAGSDKGVCPGTAVLKQ
WLPADAYLVDNDLCYCASDADSTYVGSCETFFSVNKWDFIFSDMYDARTKNTSGDNTSKEGFFTYLTGFIRSKLALGGSI
AIKITEHSWSADLYAIMGHFNWWTCFCTSVNSSSSEAFLIGVNYIGVGALLDGWQMHANYVFWRNSTVMQLSSYSLYDLQ
RFPLRLKGTPVMSLKEDQLNELVLNLIRAGRLIVRDAVDIGVRGVACSGV
>P0C6V7 ~~~rep~~~Replicase polyprotein 1ab~~~
MSTSSSILDIPSKMFRILKNNTRETEQHLSSSTLDLISKSQLLAQCFDTQEIMASLSKTVRSILESQNLEHKSTLTPYNS
SQSLQLLVMNTSCTQFKWTTGSTSSVKALLEKELCRGLVPLNDITPKSNYVELSLLTPSILIGNETSTTTTLPEIPLDME
QSIISCVENTLLKEVQALSGQESCQEYFLSANYQSLIPPQVLLNLMKMSSVVDLSPLTLPNTRLWLKLSPFHGGTSVSYA
TQIKGYANCARREEKCLKNRLTKKQKNQEKGSFDARSVITLGGKMYRYKVVVLRCEDQSDNLSELQFEPQVEYTMDMVPH
CWKELVKKRLIRAKGTWDLSCVEDLDLDHVEVRGDSLLHRSSVVHDLTSIVDDTLQEKLFSRTWLRQSLKYSGNILQRLS
SLFATEGLKKITLVNSDITPVQVGDKWLNFVDFGKSTVFFVKTLNNIHLAMTRQRESCNYIHEKFGRVRWLGAKPEQGAI
VKVFAWCLNKKEFKFRDNQLKQYVCRQGVIKHEPCEYLNVEVLDEFVALNNDLNCVQKIKTYLAAYFGLKKVKLTQKNFM
TPLITKKQELVFQPCNCPNHQFYVAQFDKHVTLGLGRKDGILFAEQVPSYAIILAVGFGTVETQLVTHYYSEMRRVYHPL
DFQSNTFVFDHQGVMLEDISPADYNDVGEEDYQLEYSGGFDQPFQNYHSDDEDQAFPDFEDERHPDEENWARPIISSGES
SVVSSRPSSPLVYSSLVPVASPFGYMNGIRVFDICLADDLDFLQIHGQCPCARCKGLYFYQPIRPRGFTIFENVVEFFSF
VEKCEVFEEIGPFFKMIEYSMLYNEYNIFYGLGKKIYQSDLVLPVKHLDQLWKRAQLDIDVVSEFENFKNSLQNINNVVY
IAPYFNDQGEWNDIFDGYEFNLNDNQFWFQAKPVYDLVCYIYQGFFSDSRPLEKLYQKLCLDYHTSAMLHTQTHLKYCYV
ALLHSERAFQMSINLDSLDNEQLHFLATMGMGDASLVGPTYLSEYHSNFNWYSIMSKACHYVKLEQLVGLTYQEKRLMIL
SRVQEFYEQQHRGPIQLILSPLKVVNLPPITCTEGYCYQPVTRLFDTCVMPDIMKKLSRKRTSVSDVFGILADYFKRTLS
YRCFKVHEFCGIERQQEFSDMTTLKLVTDWCQDTYYFYNEYATMTDVEPKVQVSSDYYLKIPSEVVEHIRQFLPHNVNVG
LMNYVSSNCDFDQCKFEFCLSGKGYVLGNMFFNRCAIQYVKTNLFIVLFKSRPLLYITQESIYLSDFNVLQAQCLTGEFC
LDFEPVQGKTLFGVYFTNGQRYGQQWETLPRFSLKPLNSPRKRVPTQPFEELAEVCIFKQKLKLTQLHNDCSVTPRVCSI
PQTITATFQPYYCLENFYGVKAPKVIVSGHLATHYVKLTHKISKCVLVTKLAVARAFYFTPTSMGSHYHLDPMEGISFGK
RATVQFEPVGLIKDVNLLVYQFGSHVSIQFFPEAPCIVADGHYPSKYSGVWLGYLPSVEECKIAQVNHRVYVPTILRTSK
SAPFHIIQNGDMGRGPITVTYHYAKNFDNKSLTPMFKMFQQVFEKSKDDIFKAFNTMSLEQKKVLSHFCGEFDEAYTLQT
MSDEISFESSAYPDVVACSLAYILGYEMCLTVKVNAKNEKLDIGSQCERVFVDYDVKKNEWTLSPEEGEDSDDNLDLPFE
QYYEFKIGQTNVVLVQDDFKSVFEFLKSEQGVDYVVNPANSQLKHGGGIAKVISCMCGPKLQAWSNNYITKNKTVPVTKA
IKSPGFQLGKKVNIIHAVGPRVSDGDVFQKLDQAWRSVFDLCEDQHTILTSMLSTGIFGCTVNDSFNTFLSNVARLDKSL
VVFVVTNMVEQYNQAFAVIKMYQQYHGLPNFGNTCWFNALYQLLKSFSEKEQCVNDLLNCFDDFYDCPTSQCVEWVCEQL
GVQFGQQQDAVEMLMKVFDVFKCDVRVGFDCLSRLQQVNCGFCVEVPAQAVLMFSGKDQCGHWTAARKIVDKWYTFDDNH
VVQKDPVWQNVVLVLRDRGIFRSADFERKPARRRRVSHRVPRDTLSQDAITYIEDLRFSSGTCLSRYFVESVESFVSGDN
VSEVSDEQTCVEVAIEESDGHVEQICQSSVDCVGMPESFQFTFSMPLQTFVQECDQKCEDDFSQEHVECDQQFEPVEQVG
QGGQQDGQVDQQIKESEQVVEPSAPSGQESPQALLQQVVDEVVYQIEQVKCDQKQDQDSVQCDEIEEINSRGEQTVQQQL
QPILGHDLNENEGPTLSVGAGKLVRCRSLAVTESNLSTSNTIFVWSEVLTHQYIGFKTDLMGLTYNIKFKLICYVLFLWF
GVLCCTSHNTPFYMRLCIYLVLLWLSLMIWNASQINVKTGWNELYVLKLLTSIKLPNIVKFRCELVQWFVLKCLFVSFYV
YDYVVKVCVSIFQMPQLRPFTWPFIKLGFVDTFLSHHILAFPEKVANQSTLPTCGDKRYYVYVPSWCRASFTSLVMRARE
LTSTGRSKTLDNWHYQCCSKTAKPLSCFNVREFVFDQDCKHEAYGFLSSLCVYLLFYSGFLTFWLPLFCYYYVLFMCTFK
NLPVDITKPIKWTVLQQVVNDVLSLVTKPLFGRPVCPPLTTYLTSTTADEAVKVSRSLLGRFCTPLGFQQPVMNVENGVT
VSNFGFFNPLMWPLFVVVLLDNRFIWFFNVLSYVMMPVFVIILFYFYLKKICGCINFKGLSKCCTKHFNQFSKPLVAAGV
HGNRTNFTYQPMQEHWCDRHSWYCPKEEHYMTPDMAVYIKNYYNLACAPTADLVWCDYTKSAPTMTWSNFKYSSYKAKET
VLCAPSSHADSMLMAWYALLHNVRFTVNPNVVDLPPAVNTIYVSSDSEDSVQDKSQPDVKLRPKKPKGNFKKQSVAYFSR
EPVDIWYYTTLVIVMGVLFMFMYSCLMVGQYVVMPRDKFFGVNPTGYSYVNAPPYLHAAPPVLQNSDGMILATQLKVPSI
TYSVYRLLSGHLYFTKLIVSDNECTPPFGAARLSNEFSCNGFTYVLPAHLRFFNRYVMLIHPDQLHMLPFEVEYGSHTRV
CYTTGSNSVECLPTFEIISPYVFVFIVVIFTVIFLILIRLYIVMYSYFKVFTYVVFKLLFVNIIMVLFVVCLPPLVPGVV
FVLALWLCDSVMFLLYLAFLSLFILPWFYVLFFLFMVGGFVFWWMMRSADVVHLTTDGLTFNGTFEQISKCVFPLNPLIV
NRMLLDCQMSHSDLVEKSKLKTTEGKLANEMMKVFMTGETSYYQPSNFSFQSVFSKATSPFTLHARPPMPMFKLYVHFTG
SCVGSTSTGTGFAIDDNTIVTAKHLFEYDDLKPTHVSVEIVTRSHSARSASIIWKEPDVKGWTFKGENAYIQVENLKDFY
IEDFKYLPFQQIEKDFYKRMEPVTIYSVKYGSEFATQAWQTVNGHFVCYNTEGGDSGAPLVCNGRIVGVHQGLCDNFKTT
LASDFEGKMMTEVKGHHVDPPVYYKPIIISAAYNKFVAGEDSSVGDGKNYHKFENEDFACMCKELESVTFGDQLRRYCYN
LPQFLEPLQYFHVPSFWQPFKKQSVSNNVSWVVEHLHFIFSIYFLICDFVAYWWLDDPFSVVLPLFFIVQLLSTVFLKNV
LFWTTSYLITLAVTFYIHSEVAESMFLLGFLSDRVVNRMSLIIVVAIMCLFVVVRVVVNVKRAIFVFVVSVVLIFVHICL
GIVQFNSFVNVVLFDVYAVFTALLTPQPVVAIIMLLLFDTKMLMSFAFIVIVLSFRVFKDYKFVKVLHNFCNFDFVLSQV
SLFRYRHRNQGNDPTHYEALWLFLKELYYGIQDAKYEVFSPQAGSYNVKFLTDMTEQDQLEAVEQVQRRLQRFNIVQDKA
SPRLVLYSKTIEFIKDQIQQQRAVGANPFIITTLTSNDIGLDNVEVHNPANFKPEDLQAHMWFFSKSPVFIGQVPIPTNV
QTAAVLDTTYNCQDLTADEKNNVAATLQIQNAAITLSLFEKCTQFLESELGEVPTLMWQAEDVADIKHLESQIENLRKVL
DGMQFGTTEYKATRKQLNICQSQLDQAKAFERKLAKFLEKVDQQQAITNETAKQLSAFKNLVKQVYESYMSSLKVKVLEA
NDASCLLTSTDLPRKLVLMRPITGVDGIKIVEKANGCEITAFGTTFNTGHGSNLAGLAYSTTQPLSAYPFIFNLEGIFKQ
QANIGYKTVECNMSSHNGSVLYKGKVVAVPSDDNPDFVVCGKGYKLDCGINVLMIPSIVRYITLNLTDHLQKQSLKPRRR
LQYRQQGVRLGGVNLGEHQAFSNELISTVGYTTWVSSTVCRDNTHKHPWFVQIPVNEKDPEWFMHNTQLKDNQWVVDLKP
THWLVNADTGEQLFALSLTDEQALKAEAILQKWSPITQDVECWFKDLKGYYTVSGFQPLWPVCPVNICNVRLDPVFKPQS
IVYADDPTHFLSLPVVNKNFLAAFYDLQEGFPGKKQVAPHISLTMLKLSDEDIEKVEDILDEMVLPNSWVTITNPHMMGK
HYVCDVEGLDSLHDEVVSVLREHGIACDQKRLWKPHLTIGELNDVSFDKFKDFAISCKLEDCDFVKLGAPKANARYEFIT
TLPLGDFKLLRGAWSACRHLCFQNGAYQSSRSKHYIDLATEYNAGIVKVNKSNTHSVEYQSKRFMIKRVKDQSEFALAKT
AFLPSIIPHHMEKQNGEWFLIRGPTSQWSLGDLVYAIWLGDQDYLSECGFVFNPSRDEFLDDANQRSFLANLLEPAILNF
SHIYWQVKMCKVPYKLTLDNVDLNGQLYDFGDYPCPNSVDNQSALFVLAEVWSMTRRPFPVAFARLLANEMEIPTDYQMF
FQNILLSGSYLDKALCLNNVRPFLSDPANLTTTPFFSQHNGVWTHFYNPIYGLVECNLDEFAELPEVLQQLVTVQGPITN
NMTPAISVGEGVYAANVPSASATKQKIPFYDVGLCQELTDAGVDCGEAFKYFYYLSNPAGALADVCYYDYQGTGFYSPKL
LAGVYDFMKRVTECYRINERFTYEQAKPRKSSMGINITGYQQDAVYRALGPENIARLFEYAQKAPLPFCTKIITKFALSA
KARARTVSSCSFIASTIFRFAHKPVTSKMVEVAQNSGGFCLIGVSKYGLKFSKFLKDKYGAIEGFDVFGSDYTKCDRTFP
LSFRALTAALLYELGEWDEKSWLYLNEVNSYMLDTMLCDGMLLNKPGGTSSGDATTAHSNTFYNYMVHYVVAFKTILSDL
SEGNKVMRIAAHNAYTTGDYQVFNTLLEDQFQTNYFLNFLSDDSFIFSKPEALKIFTCENFSNKLQTILHTKVDQTKSWS
TKGHIEEFCSAHIIKTDGEYHFLPSRGRLLASLLILDKLSDVDIYYMRFVAILCESAVYSRYQPEFFNGLFQVFLDKVQQ
FRKDYCCDPCPPQLLEREFYENLVFTSNSEVGIVDCYLENFKLQCEFKQQANFDKVCFCCPNPAVSVCEECYVPLPLCAY
CYYVHVVISNHSKVEDKFKCFCGQDNIRELYIVLNNSICMYQCKNCVESDRLRISLLSDVDQIVRLPGFKSNSASIAKNG
VAQLLTSVDNVDVSLDWNYQESVQQNVARIVYHSANMTQMSIEVVYVSFTLVRNDGSSAILDIPNFKCPDTSYCLFYKPG
KSGVLKFTGKGTLTSCYDNKNLTWFKVTCPDFNQPWRLATCFVIQQHDVVYPPIKATQYENVTFVMGPPGTGKTTFVYDT
YLSKASSSNRFVYCAPTHRLVGDMDEKVDGAVVVSAYNDRTYRNPVWNKDDSYGVLLCTHNTLPFIKSAVLIADEVSLIP
PHVMIKILSMGFKKVVLLGDPFQLSPVYKNHKVHFKYDTFYLLQLATQKRYLTACYRCPPQILSAFSKPYCDVGVDLVSF
NNKPGKFDIIVSKQLANIQDFSVLSVLSKEYPGYVILVNYRAAVDYAMQNGLGDVTTIDSSQGTTAANHLLVLFGASNFS
KTVNRVIVGCSRSTTHLVVVCCPELFKHFQPILNWPEPKYRYFGMEKQSDFNIIPEVSSLVFCDIEFWHYKADPNSKTRT
VYPGQIAVVTSQTLQLYLGVFDDTGYKSALRGLPKDVYVPPNWVWMRKHYPSYEQHAYNMQRLFKFIIDTTFGQPWFILY
SCSNDLKSLKFYVEFDTCYFCSCGEMAICLMRDGNYKCRNCYGGMLISKLVNCKYLDVQKERVKLQDAHDAICQQFHGDS
HEALCDAVMTKCLYLASYEAAFKDTIHVKYKDLCLEIQYKITSSFVRYDSVHKRYLYRDHGAMYYFRTPRSPMQNVYKYE
VGSHAEYSINICTSYEGCQSFGKTCTKCIHIHCIVEQFMADERFKEFILVSVVKSDYVEQALSPAAKALMLTVTKVEDKS
FYISNGVRYDLYDYDLSKSVMRVVNSNVKPLPLYSVIVGLGINCTVGCVLPNVPMKLKDELLITDVPLSTLRLDLQTWYY
ISWPTLSNKNSRWKLAGAQVYDCSVHIYIEATGEQPLYYLQQGKGESLFELPDTLFSTGRLYNLDHDAAQNFNVKQLAIE
TMPNNHHVFSGDFTEVGTDIGGVHHVVALNGYKGSIIPNYVKPIATGLINVGRAVKRTTLVDVCANQLYEKVKQQLEGVK
VSKVIFVNIDFQDVQFMVFANGEDDIQTFYPQKDFVRSYYEWPNILPQIESHYDLKNYGQNPTFMPQPVNFAKYTQICTF
IQDHVKVARNALVWHLGAAGVDGCSPGDIVLSSFFKECLVYSWDIKDYSTLLDKHSYDCNFRPNLIVSDIYNVSSNVSEV
LDDCVHRLALGGTIVFKTTESSRPDIQLSQFTKYFSAVQFFTAGVNTSSSEVFVVLKYKLYSEPIGEELCSPNILSRIAA
YRNKLCIVPNFKVFSTSFSYKYSGVKFVQKCFYVSVPRQFCASGLIQEVPMLCQMEH
>P0C6V8 ~~~rep~~~Replicase polyprotein 1ab~~~
MSKTSRELTNETELHLCSSTLDLISKSQLLAQCLGTPQNLVSLSKMVPSILESPTLEPRYTSTHSSSLQSLQLLALNTSS
TLYKWTTGSISKLRGHLERELCRGLVPLNDFIPKGNYVELSLMIPSVLTGQGTSTTTTLQEMCSDMVQSCIKSMETDLLK
GVLALKDQTSCQEYFLSANYQSLIPPQPLVNAMRMSSVVDLSPLILENTRLLLKLSPFHGGTSVSYTSMIREFVDCSRRD
EKCLKRRLTKKQKRQEEGSFDANKVITLGGKMYRYRVVILKCSDEVDDLIGFDGKVGEFDYNFENVPHCWRDLVKRRCLI
RAKATWNLAGGVDENLDHVYIDESQXDFRCADGSSDSPSACVEDPHLEERIFSRVWLKQTSRFFGTKIQQVSELFKSIGL
PELETTYCGVNPVKVGNKWLSFRDQGRSRVFFVYTDSNVYLATTRQKVCCDYILTKFKSVKWIGNKPDQCRVVKVLAWLI
SVNKVKNCTRVITPMLTVQGKISHRRVDYLDISVLDSYVSDTAGLNCVQKVKKFLSMYYNCGADLGLLDNFLTPIECGTK
QLVFERCNCPNHQFYVAQFDNHVVLGLGRPTGVVYPEEIPSCANIYAVGFATQKRVVEVHYYSEMDRHQLPQDYYYFAYD
QEFQHVGGDDYVNHHLDDVEDQPFPPVLFDDVYDSGDSLDDGGSDLDCFDVGYDFFWPEAPIPVPSPYGYYQGQRLRDLC
VAGGDFGCDCPRCDGTFIYHPFRPRHYHSFDEVGPFIQMCEFTLTYSGQNYNLFYGLEPKVCLQDLVEASDKLLQLLVRG
QLENISLPNDILACLSSLKLGANIHPFLWPAPFFNANGEWVDIFGGGDFTVFGEDFCLKAKSMVESVYFLVENFFSVDCP
IGNLYCNLHLDGDVKKMLWSTIHMKYIYLALIHSEKVFNIILNSRQLSHQELVKLVIIGTFDVSIVAPCACSGDCNHGKV
YNWTNLLSSVYRFVTLDQLVGLSYCEKRSLVLRKVQQYLEVEEGYQRPVQLLMAPFYGFNDNAEPDEQPLTGVFHQQVMQ
MFDTCVMLDVICGLKRPRASVYNLFGVLADYFRRPFTFRYYQVAEFSGSESTQVFTDVTSALTSKDPCSNRPYIYHDYAV
CRVVEPRTAAVTTRGAIYPPEVIEMIRSYLPIEFDVGVMNYVDGNCDFKYCNLEFCLSGRGLVKLDTGELLDYKTNLFVV
RYKTLPLLYVTSNPIYLSDFSLDNAVCLTGDFKLSFDVEPGSTLFGLYFTNGRCYRDVWETLPRFGLGTLSPPKCHSKCE
PFENLAEVFFFKRRVQLVPLVNNYTPVFRHRPDIPKVLTVELMPYYSSIGYQGFVAPKCVLPGCVATQYCKLRHQLDRCV
QVTKLAVAYAFYFKPLNIGSLYHLDPMRGTSYGKPAVVQFEPVGLIKEVNILVYQFGKHVAIHYFPECPTYVAYGHYPSH
SVGVWLGYLPSVEECVIAQRNYRVYVPTCFRLSRTGCYHIQQDEDFERTHITVSYHYARDFDTKSLTPMFQMFSKIFGKS
KQDLICALNSLSEESQSVLTLFCEEFDSAYTLQTISDEVSFETSTSPELVACVLAYAIGYELCLTVKTDGECESLDVGSS
LEQVYVDYDVSKNVWDLSTHLQDDSSDDLELPFNQYYEFKVGRASVVLVQDDFKSVYDFLKSEQGVDYVVNPANNQLKHG
GGIAKVISCMCGPKLTSWSNNYIKQYKKLGVTCAIRSPGFQLGKGVQIIHVVGPKSADSDVVNKLEASWRSVFQNVKPDT
TVLTSMLSTGIFGCSVTDSATTLLSNLVDLDKDVVVFVVTNVSDQYIEALGVVESFQSAHGLPNFGNTCWFNALYQLLKS
FAVKEQIVQDLVNCFDDFYECPTRQCVEWVCDQLGVVFGEQYDAVEMLVKIFDVFKCNVRVGYDCLARLQQVALGSCREV
PADAVLMFFGQDKSGHWVAARKVCGVWYTFDDKVVVKKDPDWSKVVLVLRERGLFKATDFETPRPRRRRVAYRVPRDTIS
QDAIMFLEERQFSSGTMLAHSCVESVESFHVEGVQPSPLQSVDGLDDVADLSCDNHVCDNSDLQEPQVVVSQPSEVLTTS
MSIECPVLENSECSVETDLNPVCEENEQVGESGIKEQDGVTTSDSQQVFSKSLDPIIKQHEVESVEPQDLPVFSQQPQVM
LSMTWRDVLFQQYLGFKSDLLSLTHVNKFKIVVYLMVLWFVLLYCFSDFSLLSRFCLYVFLLWLSHVVLVVKKLDLGLVN
SGGESYVLRILSSVKVPNCIAFNCDGVHWLILKLLFYSFHFYDFFVKTLVVVFQMPQLRCFTWPLLKLGFADTFLSHHIL
AFPTKQVSQSCLPVFGDERKYIYVPYWCKESFRTLVARAKQLTATGRTKTLDNWHYQCCSKTVKPSSCFNVRDFVFDDAC
NNHKHYGFFSALWFYVVFYSGFVSFWLPLMFCYCALFMCTFKNLPVNITRPIRWTVLQQVVDDLLSIITKPLFGRPACPP
LSAYLTATTADEAVRASRSLLGRFCTPVGFQQPIMNVENGVAVSSLGFINPLMWPLFIVVLLDNRFVWFFNVLSYIMLPV
FVIILFYFYLRKICGCVNVKGVVKNCTRHFQNFSKPLVAAGVHGNRTNFTYQPMQENWCDRHSWYCPKEEHYMTPEMAMF
IKNYYNLATSPMADTIWCDYVKSVPNMTWANFKFSLFKSNETVMCGPSSHADSMLLSWYAFLHGIRFAVNPSVIDIPSQT
QPIYVSSDSDDSLDKGCDVSLRPTKNKGKFKKQSVAYFSAGPVDLWYYVMLIIALGAIFVFMYSCFMVGQYVVMPRDKFF
GVNPTGYSYVNAQPYLHASPPVLRNSDGMVLATPLKVPSISYSVYRLLSGHLYFTKLIVAENECTPPFGAXRLSHEFTCN
DFTYILPAHLRIFGRYIMLIHPDQLHMLPFEVEHSTHTRLCYVTGTNIVECLPTFEIISPYVFVVLVAIFTIVFLFLLRM
YIVMYSYFKVFTYVVFKLLFVNTVMVLFVVCLPPLVPGVVFVLALWLCDSVVFLLYLAVLSLFILPWFYVMLFVLIVGGF
VFWWMMKSSDVVHLTPDGLTFNGTFEQVSKCVFPLNPLIVNRLLLDCRMSHSDLVEKSKLKTTEGKLATEMMKVFMTGET
AYYQPSNFSFQSVFSKVVSPFTLHARPPMPMFRLYVYFNGQCVGTTCTGTGFAIDDSTIVTAKHLFECDDLKPTHLSVEL
SCRSYWCTWKEPNVLSWKFEGENAYISVENLRDFYGIDFKYLPFQQIECEFYKRMEAVTIYSIKYGSEFATQAWQTVNGH
FVCCNTEGGDSGAPLVWRDSVIGVHQGLCDSFKTTLASDSKGVMMTEVKGYHVDPPVYYKPIIMSAAYNKFVADSDVSVG
ECTNYHNFVNEDFFSMHDELEKVSFGDKMFRYCQSLPRYLEPLHYFHVPSFWQPFKKQSVSSNVSWVVENLHFIFSVYFL
VCDFVAYWWLDDPFSVVLPLFFVVQLLSTVVLKNVLFWNTSYLVTLAVTFYVHSEVAESMYLLGLFSDQIVNRVGLILVV
SVMCLFVVVRVVVNVKRAIFVVVVSVLLIVVNVVLGVVQFNSLVAVCMFDIYAVFAALLTPQPVVAIMMLILFDTKCLMS
FAFVVIVLSFRVFKNYKFVRVLHNLCNFDFVLTQLSLFRYRHHNQGNNPSHYEALWLFLKELYYGVQDVKYEVFSPQAGT
YNVRFLTDMTEQDQLEAVEQVQRRLQRFSIVQDKNSQRLVLYSKNVDFLRSQIQHQRVLGANPFIITTLTPKDIAIDNVE
VHNPSQFKPEDLQAHMWFYSKSPIFVGQVPIPTNVQTAAVLDTTYNCQDLTADEKNNVAANLQIQNAALTLSLFEECNRF
LESELGDVPTLMWQSEDVVDVKQLEVQIEKLRVVLDGMQLGTSEYKATRKQINILQSQLDKALAFERKLAKFLEKVDQQQ
AITNETAKQLSAFKNLVKQVYESYMSSLKVRVVESNDASCLLTSTDLPRKLVLMRPITGLDNIKIVEKANGCEITAFGDT
FTTGLGSNLAGLAYSSTQPLSAYPFIFNLEGIFKQQANIGYKTVECNMSSDNGSVLYKGKIVAVPSEDNPDFVVCGKGYK
LDCGINVLMIPSIVRYITLNLTDHLQRQSLKPRRRLQYKQQGVRLGGVNLGEHQAFSNELISSVGYTTWVSSTVCTDKSH
KHPWFVQIPSSEKDPEWFMHNTQVKNNQWVVDAKPTHWLVDADTNEQLFALALTDEEYLKAESILAKWSPITQDVECWFK
DLRGYYTVSGLQPLWPVCPKKICSLKIVPIFQSQSVAYADEPTHFLSLPVVNKNFLEAFYELQEGFPGEKQVAPHISLTM
LKLTEEDVAKVEDILDEMVLPNTYATITNPHMMGQYYVFEVEGLQALHDEVVSVLRQHGIACDQTRMWKPHLTIGEIKDG
SVFNKFKDFGITCKLEDCDFVKLGAPKANARYEFIATLPVGDFKLLRDVWCACRHLCFQNGAYQSSRSKHYIDLATEYNA
GIVKVNKSNTHSVEYKGQRFMIKRVKDQHEFALARTAFLPSIIPHHMVHQNGEWFLVRGPTTQWSLGDLVYAIWLGDQAY
LDECGFVFNPSRDEFLDDANQRSYLANLLEPAILSFCEIFHCVKGCQVPYKITLDNLDLKGQLYDFGDYPCPNKVDNQSA
LFVLAEVWSMTRKPFPTKFAQVLAKEMNVPADFQMYFQHTLLSGKYFDKAMCLNNVRPLLRDPANLTTTPFFSQHSGVWT
HFYNPIYGLVECDLEEFSNLPEVLQQLITVQGPITSAMTPAISIGEGVYAANVPPVAAAKQKIPLYDVGLCQELTDAGVD
CGEAFKYFYYLSNPAGALADVCFYDYQGTGFYSPKLLAGVYDFMKRVTTCYKTDERFTYEQAKPRKSSMGINITGYQQDA
VYRALGPENITKLFEYAQKAPLPFCTKIITKFALSAKARARTVSSCSFIASTIFRFAHKPVTSKMVEVAQNSQGFCLIGV
SKYGLKFSKFLKDKYGAIEQFDVFGSDYTKCDRTFPLSFRALTAALLYELGGWEEDSWLYLNEVNSYMLDTMLCDGMLLN
KPGGTSSGDATTAHSNTFYNYMVHYVVAFKTILSDLSDCNKVMRIAAHNAYTTGDYGVFNTLLEEQFQTNYFLNFLSDDS
FIFSKPGALKIFTCENFSNKLQTILHTKVDLTKSWATTGHIEEFCSAHIIKTDGEYHFLPSRGRLLASLLILDKLSDVDI
YYMRFVAILCESAVYSRYQPEFFNGLFQVFLDKVQQFRKDYCCDPCPTQLLDRQFYENLVFTSNTEVGLVDCYLENFKLQ
CEFKQQAGFDRVCFCCPNPAVSVCEECYVPLPLCAYCYYVHVVISSHCKIEDKFKCPCGQDDIRELFIVLNNSICIYQCR
SCVESDRLRISLLSDVDQVVRLPGFKANSASIAKNGVAQLLTPVDNVDVSLDWNHQETVQQNVARIVYHSANMTQMSIEV
VYVNFSLVRNDGSSAILDIPNFKCPDTSYCLFYKPGKTGVLKFTGKGTLTSCYDSNNLTWFKVTCPDFNQPWRLATCFVI
QQHDAVYPPIKSTQYENVTFVMGPPGTGKTTFVYNTYLSKASPSNRFVYCAPTHRLVGDMDEKVDGAVVVSAYNDRTYRN
PVWKKDDSYDVLLCTHNTLPFIKSAVLIVDEVSLIPPHVMIKILSMGFKKVVLLGDPFQLSPVYKNHKVHFKYDTFYLLQ
LATQKRYLTACYRCPPQILSAFSKPYCDVGVDLVSFNNKPGKFDIIVSKQLANMQDFSVLSVLSKEYPGYVILVNYRAAV
DYAMQNGLGDVTTIDSSQGTTAANHLLVLFGASNFSKTINRVIVGCSRSTTHLVVVCCPELYKHFQPILNWPEPVYRYFG
MEKQSDFNIIPEVASLVFCDIEFWHYKADPNSKTRTVYPGQIAVVTSQTLQLYLGVFDDAGYKSALRGLPKDVFVPPNWV
WMRKHYPSFEQHAYNMQRLFKFIIDTTCGQPWFILYSCSNDLKSLKFYVEFGTNYFCSCGELAICLMRDGLYKCRNCYGN
MSISKLVNCKYLDVQKERIKLQDAHDAICQQFHGDSHEALCDAVMTKCLYLASYDAAFKDTIHVKYKDLCLEIQYKITSP
YVRYDGVNKRYLYRDHGAMHYFKTPKSPMQNVYRYEVGAHTEYSINICNSYEGCQSFGKTCTKCIHIHCIVEQFMADDRY
RDFILVSVVKSDFVEQALSPAAKALMLTVTRVEGKSFYTSNGQRYDLYDYDLSKSVMRVVGASVKPLPLYSVVVGLGINC
TVGCVLPNVPMKLKDELLSTDVPLSTLRLDLPTWYYVTWPTLSNRTSRWKLAGAQVYDCSVHIYVEATGEQPLYYLQLGN
GESLRELPETLFSTGRLYNLEHDPSKNFNVQQLAIETIPKNHHVFAGDFTDVGTDIGGVHHVVALNGYKGSIIPNYVKPI
ATGLINVGRSVKRTTLVDVCANQLYEKVKQQIAGVKVSKVIFVNIDFQEVQFMVFAKGEDDIQTFYPQKEFIRSYYEWPT
ILPELESHYDLKNYGQDPQFMPQPVNFAKYTQICTFIQEHVKVARNSLIWHVGAAGIDGCSPGDIVLSSFFKECLVYSWD
VKDYDTLLEKHNYDCNFRPNLIVSDVYNVSSNVSEVLEDCVHRLALGGTIIFKTTESSRPDIQLSKVTKYFAAVHFFTAG
VNTSSSEVFVVLKYKLHSEPIGEELCSPNILRRIAAYRNKLCIVPNFKVFSTSLSYRFSSVKFVQKCFYVSVPRQFCASG
LIQEVPLLCQMKH
>Q08534 ~~~1a-1b~~~Replicase polyprotein 1ab~~~
MAFLNVSAVPSCAFAPAFAPHAGASPIVPDSFPCVPRYSDDISHFRLTLSLDFSVPRPLSLNARVHLSASTDNPLPSLPL
GFHAETFVLELNGSSAPFSIPSRHIDFVVNRPFSVFPTEVLSVSSLRTPSRLFALLCDFFLYCSKPGPCVEIASFSTPPP
CLVSNCVAQIPTHAEMESIRFPTKTLPAGRFLQFHKRKYTKRPETLIIHESGLALKTSALGVTSKPNSRPITVKSASGEK
YEAYEISRKDFERSRRRQQTPRVRSHKPRKINKAVEPFFFPEEPKKDKRKRASLPTEDEGFITFGTLRFPLSETPKEEPR
LPKFREVEIPVVKKHAVPAVVSKPVRTFRPVATTGAEYVNARNQCSRRPRNHPILRSASYTFGFKKMPLQRFMKEKKEYY
VKRSKVVSSCSVTKSPLEALASILKNLPQYSYNSERLKFYDHFIGDDFEIEVHPLRGGKLSVLLILPKGEAYCVVTAATP
QYHAALTIARGDRPRVGELLQYRPGEGLCYLAHAALCCALQKRTFREEDFFVGMYPTKFVFAKRLTEKLGPSALKHPVRG
RQVSRSLFHCDVASAFSSPFYSLPRFIGGVEEEAPEITSSLKHKAIESVYERVSIHKDNLLARSVEKDLIDFKDEIKSLS
KEKRSVTVPFYMGEAVQSGLTRAYPQFNLSFTHSVYSDHPAAAGSRLLENETLASMAKSSFSDIGGCPLFHIKRGSTDYH
VCRPIYDMKDAQRRVSRELQARGLVENLSREQLVEAQARVSVCPHTLGNCNVKSDVLIMVQVYDASLNEIASAMVLKESK
VAYLTMVTPGELLDEREAFAIDALGCDVVVDTRRDMVQYKFGSSCYCHKLSNIKSIMLTPAFTFSGNLFSVEMYENRMGV
NYYKITRSAYSPEIRGVKTLRYRRACTEVVQVKLPRFDKTLKTFLSGYDYIYLDAKFVSRVFDYVVSNCSVVNSKTFEWV
WSYIKSSKSRVVISGKVIHRDVHIDLKHSECFAAVMLAVGVRSRTTTEFLAKNLNYYTGDASCFETIRFLFREWSRRAYA
EINRSFRKLMKSILSAGLDYEFLDLDNSLQHLLEYSEVEVRVSIAQNGEVDCNEENRVLTEIIAEAADRKSIAQGLSGAL
SSVPTQPRGGLRGGSRRSGVSFLYNLVEEVGNLFFSVGDAVRFLVKVFKTFSDSPIFRVVRMFLDLAEAASPFVSVVSLC
AWLREAVSAFSSWVADRTVSESVKTFVNRTVKRFLNFMSAKTLTKKFFRFFLSASALAKTVVRKAKVILEAYWEVWFESI
LSDSGEYSAVEFCSSVVITLLTNSGRLLPGFSPSAIITEVLLDLATKISIEVLLKQISPADSTASSALYRRVLSEILSNF
RTMGEHGIFTKVFLLCGFLPVFVRKCVALCVPGDMATYARFLEYGVDDLFFLGRSVNSIKNYLCVVAAGLVDSIVDSVVL
KLSGVAKERVLGFKSKIIKNFLNVFRKAKVVTRTSSSTDLSEDEYFSCDESKPGLRGGSSRFTLSRLLDIFFNFLKSSKL
VIENACFSAYERIERNMKLYFFPLNSSEEEARRLIRCAGDFDYLSDSAFDEDEMLRQAFEQYYSSDDESVTYDGKPTVLR
SYLNVSRRFLETFCNGPKFFVKVSNYFKALYSRLLRVLPWVDRNLSDSPGLKGGNEKALLAKFFKTCVITACECVSQICC
LRLIRLCWGTPACGLVRLFYITYSSTRVLSRVVVAVAVCPLLVRNELDGLSDGLTNMGVSVFRRLFVALRRALSAYSNSA
LRRKIIEFIFGNIHHPFDVAVIETNEVAPEPLSPEVDIDVDCDFGSDSESVSSDEVASNPRPGLHGGSRRSSNFLTSLVK
VVFKLARRIPRLLFRLRNFVAYFVERRLASKRLKTFIGLARLFDNFSLTSVVYLLQEYDSVLNAFIDVELILLNSGSVNV
LPLVSWVRGSLTKLAEAIVGSGFASFLGRMCCRVSDWCSSSSNAGCNFMSPVRTKGKFVPPSSSGSTASMYERLEALESD
IREHVLSTCRVGSDEEEERPKEVTEPGIEHTSEDVVPIRSHSQPLSGGECSYSEDREENERANLLPHVSKIVSERRGLET
ARRNKRTLHGVSEFLNAINTSNEQPRPIIVDHSPESRALTNSVREFYYLQELALFELSCKLREYYDQLKVANFNRQECLC
DKDEDMFVLRAGQGVVSGRNSRLPLKHFKGHEFCFRSGGLVPYDGTSRVDTIFHTQTNFVSANALLSGYLSYRTFTFTNL
SANVLLYEAPPGGGKTTTLIKVFCETFSKVNSLILTANKSSREEILAKVNRIVLDEGDTPLQTRDRILTIDSYLMNNRGL
TCKVLYLDECFMVHAGAAVACIEFTKCDSAILFGDSRQIRYGRCSELDTAVLSDLNRFVDDESRVYGEVSYRCPWDVCAW
LSTFYPKTVATTNLVSAGQSSMQVREIESVDDVEYSSEFVYLTMLQSEKKDLLKSFGKRSRSSVEKPTVLTVHEAQGETY
RKVNLVRTKFQEDDPFRSENHITVALSRHVESLTYSVLSSKRDDAIAQAIVKAKQLVDAYRVYPTSFGGSTLDVSVNPST
SDRSKCKASSAPYEVINSFLESVVPGTTSVDFGDVSEEMGTQVFESGADNVVIRDSAPVNKSTDHDPQRVSSIRSQAIPK
RKPSLQENLYSYESRNYNFTVCERFSGPQEFGQAMAMVMLERSFDLEKVAKVRSDVIAITEKGVRTWMSKREPSQLRALS
SDLQKPLNLEEEITTFKLMVKRDAKVKLDSSCLVKHPPAQNIMFHRKAVNAIFSPCFDEFKNRVITCTNSNIVFFTEMTN
STLASIAKEMLGSEHVYNVGEIDFSKFDKSQDAFIKSFERTLYSAFGFDEDLLDVWMQGEYTSNATTLDGQLSFSVDNQR
KSGASNTWIGNSIETLGILSMFYYTNRFKALFVSGDDSLIFSESPIRNSADAMCTELGFETKFLTPSVPYFCSKFFVMTG
HDVFFVPDPYKLLVKLGASKDEVDDEFLFEVFTSFRDLTKDLVDERVIELLTHLVHSKYGYESGDTYAALCAIHCIRSNF
SSFKKLYPKVKGWVVHYGKLKFVLRKFANCFREKFDTAFGERTFLLTTKLETVL
>P0C6X1 ~~~rep~~~Replicase polyprotein 1ab~~~
MACNRVTLAVASDSEISANGCSTIAQAVRRYSEAASNGFRACRFVSLDLQDCIVGIADDTYVMGLHGNQTLFCNIMKFSD
RPFMLHGWLVFSNSNYLLEEFDVVFGKRGGGNVTYTDQYLCGADGKPVMSEDLWQFVDHFGENEEIIINGHTYVCAWLTK
RKPLDYKRQNNLAIEEIEYVHGDALHTLRNGSVLEMAKEVKTSSKVVLSDALDKLYKVFGSPVMTNGSNILEAFTKPVFI
SALVQCTCGTKSWSVGDWTGFKSSCCNVISNKLCVVPGNVKPGDAVITTQQAGAGIKYFCGMTLKFVANIEGVSVWRVIA
LQSVDCFVASSTFVEEEHVNRMDTFCFNVRNSVTDECRLAMLGAEMTSNVRRQVASGVIDISTGWFDVYDDIFAESKPWF
VRKAEDIFGPCWSALASALKQLKVTTGELVRFVKSICNSAVAVVGGTIQILASVPEKFLNAFDVFVTAIQTVFDCAVETC
TIAGKAFDKVFDYVLLDNALVKLVTTKLKGVRERGLNKVKYATVVVGSTEEVKSSRVERSTAVLTIANNYSKLFDEGYTV
VIGDVAYFVSDGYFRLMASPNSVLTTAVYKPLFAFNVNVMGTRPEKFPTTVTCENLESAVLFVNDKITEFQLDYSIDVID
NEIIVKPNISLCVPLYVRDYVDKWDDFCRQYSNESWFEDDYRAFISVLDITDAAVKAAESKAFVDTIVPPCPSILKVIDG
GKIWNGVIKNVNSVRDWLKSLKLNLTQQGLLGTCAKRFKRWLGILLEAYNAFLDTVVSTVKIGGLTFKTYAFDKPYIVIR
DIVCKVENKTEAEWIELFPHNDRIKSFSTFESAYMPIADPTHFDIEEVELLDAEFVEPGCGGILAVIDEHVFYKKDGVYY
PSNGTNILPVAFTKAAGGKVSFSDDVEVKDIEPVYRVKLCFEFEDEKLVDVCEKAIGKKIKHEGDWDSFCKTIQSALSVV
SCYVNLPTYYIYDEEGGNDLSLPVMISEWPLSVQQAQQEATLPDIAEDVVDQVEEVNSIFDIETVDVKHDVSPFEMPFEE
LNGLKILKQLDNNCWVNSVMLQIQLTGILDGDYAMQFFKMGRVAKMIERCYTAEQCIRGAMGDVGLCMYRLLKDLHTGFM
VMDYKCSCTSGRLEESGAVLFCTPTKKAFPYGTCLNCNAPRMCTIRQLQGTIIFVQQKPEPVNPVSFVVKPVCSSIFRGA
VSCGHYQTNIYSQNLCVDGFGVNKIQPWTNDALNTICIKDADYNAKVEISVTPIKNTVDTTPKEEFVVKEKLNAFLVHDN
VAFYQGDVDTVVNGVDFDFIVNAANENLAHGGGLAKALDVYTKGKLQRLSKEHIGLAGKVKVGTGVMVECDSLRIFNVVG
PRKGKHERDLLIKAYNTINNEQGTPLTPILSCGIFGIKLETSLEVLLDVCNTKEVKVFVYTDTEVCKVKDFVSGLVNVQK
VEQPKIEPKPVSVIKVAPKPYRVDGKFSYFTEDLLCVADDKPIVLFTDSMLTLDDRGLALDNALSGVLSAAIKDCVDINK
AIPSGNLIKFDIGSVVVYMCVVPSEKDKHLDNNVQRCTRKLNRLMCDIVCTIPADYILPLVLSSLTCNVSFVGELKAAEA
KVITIKVTEDGVNVHDVTVTTDKSFEQQVGVIADKDKDLSGAVPSDLNTSELLTKAIDVDWVEFYGFKDAVTFATVDHSA
FAYESAVVNGIRVLKTSDNNCWVNAVCIALQYSKPHFISQGLDAAWNKFVLGDVEIFVAFVYYVARLMKGDKGDAEDTLT
KLSKYLANEAQVQLEHYSSCVECDAKFKNSVASINSAIVCASVKRDGVQVGYCVHGIKYYSRVRSVRGRAIIVSVEQLEP
CAQSRLLSGVAYTAFSGPVDKGHYTVYDTAKKSMYDGDRFVKHDLSLLSVTSVVMVGGYVAPVNTVKPKPVINQLDEKAQ
KFFDFGDFLIHNFVIFFTWLLSMFTLCKTAVTTGDVKIMAKAPQRTGVVLKRSLKYNLKASAAVLKSKWWLLAKFTKLLL
LIYTLYSVVLLCVRFGPFNFCSETVNGYAKSNFVKDDYCDGSLGCKMCLFGYQELSQFSHLDVVWKHITDPLFSNMQPFI
VMVLLLIFGDNYLRCFLLYFVAQMISTVGVFLGYKETNWFLHFIPFDVICDELLVTVIVIKVISFVRHVLFGCENPDCIA
CSKSARLKRFPVNTIVNGVQRSFYVNANGGSKFCKKHRFFCVDCDSYGYGSTFITPEVSRELGNITKTNVQPTGPAYVMI
DKVEFENGFYRLYSCETFWRYNFDITESKYSCKEVFKNCNVLDDFIVFNNNGTNVTQVKNASVYFSQLLCRPIKLVDSEL
LSTLSVDFNGVLHKAYIDVLRNSFGKDLNANMSLAECKRALGLSISDHEFTSAISNAHRCDVLLSDLSFNNFVSSYAKPE
EKLSAYDLACCMRAGAKVVNANVLTKDQTPIVWHAKDFNSLSAEGRKYIVKTSKAKGLTFLLTINENQAVTQIPATSIVA
KQGAGDAGHSLTWLWLLCGLVCLIQFYLCFFMPYFMYDIVSSFEGYDFKYIENGQLKNFEAPLKCVRNVFENFEDWHYAK
FGFTPLNKQSCPIVVGVSEIVNTVAGIPSNVYLVGKTLIFTLQAAFGNAGVCYDIFGVTTPEKCIFTSACTRLEGLGGNN
VYCYNTALMEGSLPYSSIQANAYYKYDNGNFIKLPEVIAQGFGFRTVRTIATKYCRVGECVESNAGVCFGFDKWFVNDGR
VANGYVCGTGLWNLVFNILSMFSSSFSVAAMSGQILLNCALGAFAIFCCFLVTKFRRMFGDLSVGVCTVVVAVLLNNVSY
IVTQNLVTMIAYAILYFFATRSLRYAWIWCAAYLIAYISFAPWWLCAWYFLAMLTGLLPSLLKLKVSTNLFEGDKFVGTF
ESAAAGTFVIDMRSYEKLANSISPEKLKSYAASYNRYKYYSGNANEADYRCACYAYLAKAMLDFSRDHNDILYTPPTVSY
GSTLQAGLRKMAQPSGFVEKCVVRVCYGNTVLNGLWLGDIVYCPRHVIASNTTSAIDYDHEYSIMRLHNFSIISGTAFLG
VVGATMHGVTLKIKVSQTNMHTPRHSFRTLKSGEGFNILACYDGCAQGVFGVNMRTNWTIRGSFINGACGSPGYNLKNGE
VEFVYMHQIELGSGSHVGSSFDGVMYGGFEDQPNLQVESANQMLTVNVVAFLYAAILNGCTWWLKGEKLFVEHYNEWAQA
NGFTAMNGEDAFSILAAKTGVCVERLLHAIQVLNNGFGGKQILGYSSLNDEFSINEVVKQMFGVNLQSGKTTSMFKSISL
FAGFFVMFWAELFVYTTTIWVNPGFLTPFMILLVALSLCLTFVVKHKVLFLQVFLLPSIIVAAIQNCAWDYHVTKVLAEK
FDYNVSVMQMDIQGFVNIFICLFVALLHTWRFAKERCTHWCTYLFSLIAVLYTALYSYDYVSLLVMLLCAISNEWYIGAI
IFRICRFGVAFLPVEYVSYFDGVKTVLLFYMLLGFVSCMYYGLLYWINRFCKCTLGVYDFCVSPAEFKYMVANGLNAPNG
PFDALFLSFKLMGIGGPRTIKVSTVQSKLTDLKCTNVVLMGILSNMNIASNSKEWAYCVEMHNKINLCDDPETAQELLLA
LLAFFLSKHSDFGLGDLVDSYFENDSILQSVASSFVGMPSFVAYETARQEYENAVANGSSPQIIKQLKKAMNVAKAEFDR
ESSVQKKINRMAEQAAAAMYKEARAVNRKSKVVSAMHSLLFGMLRRLDMSSVDTILNMARNGVVPLSVIPATSAARLVVV
VPDHDSFVKMMVDGFVHYAGVVWTLQEVKDNDGKNVHLKDVTKENQEILVWPLILTCERVVKLQNNEIMPGKMKVKATKG
EGDGGITSEGNALYNNEGGRAFMYAYVTTKPGMKYVKWEHDSGVVTVELEPPCRFVIDTPTGPQIKYLYFVKNLNNLRRG
AVLGYIGATVRLQAGKQTEFVSNSHLLTHCSFAVDPAAAYLDAVKQGAKPVGNCVKMLTNGSGSGQAITCTIDSNTTQDT
YGGASVCIYCRAHVAHPTMDGFCQYKGKWVQVPIGTNDPIRFCLENTVCKVCGCWLNHGCTCDRTAIQSFDNSYLNRVRG
SSAARLEPCNGTDIDYCVRAFDVYNKDASFIGKNLKSNCVRFKNVDKDDAFYIVKRCIKSVMDHEQSMYNLLKGCNAVAK
HDFFTWHEGRTIYGNVSRQDLTKYTMMDLCFALRNFDEKDCEVFKEILVLTGCCSTDYFEMKNWFDPIENEDIHRVYAAL
GKVVANAMLKCVAFCDEMVLKGVVGVLTLDNQDLNGNFYDFGDFVLCPPGMGIPYCTSYYSYMMPVMGMTNCLASECFMK
SDIFGQDFKTFDLLKYDFTEHKEVLFNKYFKYWGQDYHPDCVDCHDEMCILHCSNFNTLFATTIPNTAFGPLCRKVFIDG
VPVVATAGYHFKQLGLVWNKDVNTHSTRLTITELLQFVTDPTLIVASSPALVDKRTVCFSVAALSTGLTSQTVKPGHFNK
EFYDFLRSQGFFDEGSELTLKHFFFTQKGDAAIKDFDYYRYNRPTMLDIGQARVAYQVAARYFDCYEGGCITSREVVVTN
LNKSAGWPLNKFGKAGLYYESISYEEQDAIFSLTKRNILPTMTQLNLKYAISGKERARTVGGVSLLATMTTRQFHQKCLK
SIVATRNATVVIGTTKFYGGWDNMLKNLMADVDDPKLMGWDYPKCDRAMPSMIRMLSAMILGSKHVTCCTASDKFYRLSN
ELAQVLTEVVYSNGGFYFKPGGTTSGDATTAYANSVFNIFQAVSSNINCVLSVNSSNCNNFNVKKLQRQLYDNCYRNSNV
DESFVDDFYGYLQKHFSMMILSDDSVVCYNKTYAGLGYIADISAFKATLYYQNGVFMSTAKCWTEEDLSIGPHEFCSQHT
MQIVDENGKYYLPYPDPSRIISAGVFVDDITKTDAVILLERYVSLAIDAYPLSKHPKPEYRKVFYALLDWVKHLNKTLNE
GVLESFSVTLLDEHESKFWDESFYASMYEKSTVLQAAGLCVVCGSQTVLRCGDCLRRPMLCTKCAYDHVFGTDHKFILAI
TPYVCNTSGCNVNDVTKLYLGGLNYYCVDHKPHLSFPLCSAGNVFGLYKSSALGSMDIDVFNKLSTSDWSDIRDYKLAND
AKESLRLFAAETVKAKEESVKSSYAYATLKEIVGPKELLLLWESGKAKPPLNRNSVFTCFQITKDSKFQVGEFVFEKVDY
GSDTVTYKSTATTKLVPGMLFILTSHNVAPLRAPTMANQEKYSTIYKLHPSFNVSDAYANLVPYYQLIGKQRITTIQGPP
GSGKSHCSIGIGVYYPGARIVFTACSHAAVDSLCAKAVTAYSVDKCTRIIPARARVECYSGFKPNNNSAQYVFSTVNALP
EVNADIVVVDEVSMCTNYDLSVINQRISYKHIVYVGDPQQLPAPRVLISKGVMEPIDYNVVTQRMCAIGPDVFLHKCYRC
PAEIVNTVSELVYENKFVPVKEASKQCFKIFERGSVQVDNGSSINRRQLDVVKRFIHKNSTWSKAVFISPYNSQNYVAAR
LLGLQTQTVDSAQGSEYDYVIFAQTSDTAHACNANRFNVAITRAKKGIFCIMSDRTLFDALKFFEITMTDLQSESSCGLF
KDCARNPIDLPPSHATTYLSLSDRFKTSGDLAVQIGNNNVCTYEHVISYMGFRFDVSMPGSHSLFCTRDFAMRHVRGWLG
MDVEGAHVTGDNVGTNVPLQVGFSNGVDFVAQPEGCVLTNTGSVVKPVRARAPPGEQFTHIVPLLRKGQPWSVLRKRIVQ
MIADFLAGSSDVLVFVLWAGGLELTTMRYFVKIGAVKHCQCGTVATCYNSVSNDYCCFKHALGCDYVYNPYVIDIQQWGY
VGSLSTNHHAICNVHRNEHVASGDAIMTRCLAVYDCFVKNVDWSITYPMIANENAINKGGRTVQSHIMRAAIKLYNPKAI
HDIGNPKGIRCAVTDAKWYCYDKNPINSNVKTLEYDYMTHGQMDGLCLFWNCNVDMYPEFSIVCRFDTRTRSTLNLEGVN
GGSLYVNNHAFHTPAYDKRAMAKLKPAPFFYYDDGSCEVVHDQVNYVPLRATNCITKCNIGGAVCSKHANLYRAYVESYN
IFTQAGFNIWVPTTFDCYNLWQTFTEVNLQGLENIAFNVVNKGSFVGADGELPVAISGDKVFVRDGNTDNLVFVNKTSLP
TNIAFELFAKRKVGLTPPLSILKNLGVVATYKFVLWDYEAERPLTSFTKSVCGYTDFAEDVCTCYDNSIQGSYERFTLST
NAVLFSATAVKTGGKSLPAIKLNFGMLNGNAIATVKSEDGNIKNINWFVYVRKDGKPVDHYDGFYTQGRNLQDFLPRSTM
EEDFLNMDIGVFIQKYGLEDFNFEHVVYGDVSKTTLGGLHLLISQVRLSKMGILKAEEFVAASDITLKCCTVTYLNDPSS
KTVCTYMDLLLDDFVSVLKSLDLTVVSKVHEVIIDNKPWRWMLWCKDNAVATFYPQLQSAEWKCGYSMPGIYKTQRMCLE
PCNLYNYGAGLKLPSGIMFNVVKYTQLCQYFNSTTLCVPHNMRVLHLGAGSDYGVAPGTAVLKRWLPHDAIVVDNDVVDY
VSDADFSVTGDCATVYLEDKFDLLISDMYDGRTKAIDGENVSKEGFFTYINGFICEKLAIGGSIAIKVTEYSWNKKLYEL
VQRFSFWTMFCTSVNTSSSEAFVVGINYLGDFAQGPFIDGNIIHANYVFWRNSTVMSLSYNSVLDLSKFNCKHKATVVVQ
LKDSDINEMVLSLVRSGKLLVRGNGKCLSFSNHLVSTK
>P0C6X5 ~~~rep~~~Replicase polyprotein 1ab~~~
MFYNQVTLAVASDSEISGFGFAIPSVAVRTYSEAAAQGFQACRFVAFGLQDCVTGINDDDYVIALTGTNQLCAKILPFSD
RPLNLRGWLIFSNSNYVLQDFDVVFGHGAGSVVFVDKYMCGFDGKPVLPKNMWEFRDYFNNNTDSIVIGGVTYQLAWDVI
RKDLSYEQQNVLAIESIHYLGTTGHTLKSGCKLTNAKPPKYSSKVVLSGEWNAVYRAFGSPFITNGMSLLDIIVKPVFFN
AFVKCNCGSESWSVGAWDGYLSSCCGTPAKKLCVVPGNVVPGDVIITSTSAGCGVKYYAGLVVKHITNITGVSLWRVTAV
HSDGMFVASSSYDALLHRNSLDPFCFDVNTLLSNQLRLAFLGASVTEDVKFAASTGVIDISAGMFGLYDDILTNNKPWFV
RKASGLFDAIWDAFVAAIKLVPTTTGVLVRFVKSIASTVLTVSNGVIIMCADVPDAFQSVYRTFTQAICAAFDFSLDVFK
IGDVKFKRLGDYVLTENALVRLTTEVVRGVRDARIKKAMFTKVVVGPTTEVKFSVIELATVNLRLVDCAPVVCPKGKIVV
IAGQAFFYSGGFYRFMVDPTTVLNDPVFTGDLFYTIKFSGFKLDGFNHQFVTASSATDAIIAVELLLLDFKTAVFVYTCV
VDGCSVIVRRDATFATHVCFKDCYNVWEQFCIDNCGEPWFLTDYNAILQSNNPQCAIVQASESKVLLERFLPKCPEILLS
IDDGHLWNLFVEKFNFVTDWLKTLKLTLTSNGLLGNCAKRFRRVLVKLLDVYNGFLETVCSVAYTAGVCIKYYAVNVPYV
VISGFVSRVIRRERCDMTFPCVSCVTFFYEFLDTCFGVSKPNAIDVEHLELKETVFVEPKDGGQFFVSGDYLWYVVDDIY
YPASCNGVLPVAFTKLAGGKISFSDDVIVHDVEPTHKVKLIFEFEDDVVTSLCKKSFGKSIIYTGDWEGLHEVLTSAMNV
IGQHIKLPQFYIYDEEGGYDVSKPVMISQWPISNDSNGCVVEASTDFHQLECIVDDSVREEVDIIEQPFEEVEHVLSIKQ
PFSFSFRDELGVRVLDQSDNNCWISTTLVQLQLTKLLDDSIEMQLFKVGKVDSIVQKCYELSHLISGSLGDSGKLLSELL
KEKYTCSITFEMSCDCGKKFDDQVGCLFWIMPYTKLFQKGECCICHKMQTYKLVSMKGTGVFVQDPAPIDIDAFPVKPIC
SSVYLGVKGSGHYQTNLYSFNKAIDGFGVFDIKNSSVNTVCFVDVDFHSVEIEAGEVKPFAVYKNVKFYLGDISHLVNCV
SFDFVVNAANENLLHGGGVARAIDILTEGQLQSLSKDYISSNGPLKVGAGVMLECEKFNVFNVVGPRTGKHEHSLLVEAY
NSILFENGIPLMPLLSCGIFGVRIENSLKALFSCDINKPLQVFVYSSNEEQAVLKFLDGLDLTPVIDDVDVVKPFRVEGN
FSFFDCGVNALDGDIYLLFTNSILMLDKQGQLLDTKLNGILQQAALDYLATVKTVPAGNLVKLFVESCTIYMCVVPSIND
LSFDKNLGRCVRKLNRLKTCVIANVPAIDVLKKLLSSLTLTVKFVVESNVMDVNDCFKNDNVVLKITEDGINVKDVVVES
SKSLGKQLGVVSDGVDSFEGVLPINTDTVLSVAPEVDWVAFYGFEKAALFASLDVKPYGYPNDFVGGFRVLGTTDNNCWV
NATCIILQYLKPTFKSKGLNVLWNKFVTGDVGPFVSFIYFITMSSKGQKGDAEEALSKLSEYLISDSIVTLEQYSTCDIC
KSTVVEVKSAIVCASVLKDGCDVGFCPHRHKLRSRVKFVNGRVVITNVGEPIISQPSKLLNGIAYTTFSGSFDNGHYVVY
DAANNAVYDGARLFSSDLSTLAVTAIVVVGGCVTSNVPTIVSEKISVMDKLDTGAQKFFQFGDFVMNNIVLFLTWLLSMF
SLLRTSIMKHDIKVIAKAPKRTGVILTRSFKYNIRSALFVIKQKWCVIVTLFKFLLLLYAIYALVFMIVQFSPFNSLLCG
DIVSGYEKSTFNKDIYCGNSMVCKMCLFSYQEFNDLDHTSLVWKHIRDPILISLQPFVILVILLIFGNMYLRFGLLYFVA
QFISTFGSFLGFHQKQWFLHFVPFDVLCNEFLATFIVCKIVLFVRHIIVGCNNADCVACSKSARLKRVPLQTIINGMHKS
FYVNANGGTCFCNKHNFFCVNCDSFGPGNTFINGDIARELGNVVKTAVQPTAPAYVIIDKVDFVNGFYRLYSGDTFWRYD
FDITESKYSCKEVLKNCNVLENFIVYNNSGSNITQIKNACVYFSQLLCEPIKLVNSELLSTLSVDFNGVLHKAYVDVLCN
SFFKELTANMSMAECKATLGLTVSDDDFVSAVANAHRYDVLLSDLSFNNFFISYAKPEDKLSVYDIACCMRAGSKVVNHN
VLIKESIPIVWGVKDFNTLSQEGKKYLVKTTKAKGLTFLLTFNDNQAITQVPATSIVAKQGAGFKRTYNFLWYVCLFVVA
LFIGVSFIDYTTTVTSFHGYDFKYIENGQLKVFEAPLHCVRNVFDNFNQWHEAKFGVVTTNSDKCPIVVGVSERINVVPG
VPTNVYLVGKTLVFTLQAAFGNTGVCYDFDGVTTSDKCIFNSACTRLEGLGGDNVYCYNTDLIEGSKPYSTLQPNAYYKY
DAKNYVRFPEILARGFGLRTIRTLATRYCRVGECRDSHKGVCFGFDKWYVNDGRVDDGYICGDGLIDLLVNVLSIFSSSF
SVVAMSGHMLFNFLFAAFITFLCFLVTKFKRVFGDLSYGVFTVVCATLINNISYVVTQNLFFMLLYAILYFVFTRTVRYA
WIWHIAYIVAYFLLIPWWLLTWFSFAAFLELLPNVFKLKISTQLFEGDKFIGTFESAAAGTFVLDMRSYERLINTISPEK
LKNYAASYNKYKYYSGSASEADYRCACYAHLAKAMLDYAKDHNDMLYSPPTISYNSTLQSGLKKMAQPSGCVERCVVRVC
YGSTVLNGVWLGDTVTCPRHVIAPSTTVLIDYDHAYSTMRLHNFSVSHNGVFLGVVGVTMHGSVLRIKVSQSNVHTPKHV
FKTLKPGDSFNILACYEGIASGVFGVNLRTNFTIKGSFINGACGSPGYNVRNDGTVEFCYLHQIELGSGAHVGSDFTGSV
YGNFDDQPSLQVESANLMLSDNVVAFLYAALLNGCRWWLCSTRVNVDGFNEWAMANGYTSVSSVECYSILAAKTGVSVEQ
LLASIQHLHEGFGGKNILGYSSLCDEFTLAEVVKQMYGVNLQSGKVIFGLKTMFLFSVFFTMFWAELFIYTNTIWINPVI
LTPIFCLLLFLSLVLTMFLKHKFLFLQVFLLPTVIATALYNCVLDYYIVKFLADHFNYNVSVLQMDVQGLVNVLVCLFVV
FLHTWRFSKERFTHWFTYVCSLIAVAYTYFYSGDFLSLLVMFLCAISSDWYIGAIVFRLSRLIVFFSPESVFSVFGDVKL
TLVVYLICGYLVCTYWGILYWFNRFFKCTMGVYDFKVSAAEFKYMVANGLHAPHGPFDALWLSFKLLGIGGDRCIKISTV
QSKLTDLKCTNVVLLGCLSSMNIAANSSEWAYCVDLHNKINLCDDPEKAQSMLLALLAFFLSKHSDFGLDGLIDSYFDNS
STLQSVASSFVSMPSYIAYENARQAYEDAIANGSSSQLIKQLKRAMNIAKSEFDHEISVQKKINRMAEQAATQMYKEARS
VNRKSKVISAMHSLLFGMLRRLDMSSVETVLNLARDGVVPLSVIPATSASKLTIVSPDLESYSKIVCDGSVHYAGVVWTL
NDVKDNDGRPVHVKEITKENVETLTWPLILNCERVVKLQNNEIMPGKLKQKPMKAEGDGGVLGDGNALYNTEGGKTFMYA
YISNKADLKFVKWEYEGGCNTIELDSPCRFMVETPNGPQVKYLYFVKNLNTLRRGAVLGFIGATIRLQAGKQTELAVNSG
LLTACAFSVDPATTYLEAVKHGAKPVSNCIKMLSNGAGNGQAITTSVDANTNQDSYGGASICLYCRAHVPHPSMDGYCKF
KGKCVQVPIGCLDPIRFCLENNVCNVCGCWLGHGCACDRTTIQSVDISYLNRARGSSAARLEPCNGTDIDKCVRAFDIYN
KNVSFLGKCLKMNCVRFKNADLKDGYFVIKRCTKSVMEHEQSMYNLLNFSGALAEHDFFTWKDGRVIYGNVSRHNLTKYT
MMDLVYAMRNFDEQNCDVLKEVLVLTGCCDNSYFDSKGWYDPVENEDIHRVYASLGKIVARAMLKCVALCDAMVAKGVVG
VLTLDNQDLNGNFYDFGDFVVSLPNMGVPCCTSYYSYMMPIMGLTNCLASECFVKSDIFGSDFKTFDLLKYDFTEHKENL
FNKYFKHWSFDYHPNCSDCYDDMCVIHCANFNTLFATTIPGTAFGPLCRKVFIDGVPLVTTAGYHFKQLGLVWNKDVNTH
SVRLTITELLQFVTDPSLIIASSPALVDQRTICFSVAALSTGLTNQVVKPGHFNEEFYNFLRLRGFFDEGSELTLKHFFF
AQNGDAAVKDFDFYRYNKPTILDICQARVTYKIVSRYFDIYEGGCIKACEVVVTNLNKSAGWPLNKFGKASLYYESISYE
EQDALFALTKRNVLPTMTQLNLKYAISGKERARTVGGVSLLSTMTTRQYHQKHLKSIVNTRNATVVIGTTKFYGGWNNML
RTLIDGVENPMLMGWDYPKCDRALPNMIRMISAMVLGSKHVNCCTATDRFYRLGNELAQVLTEVVYSNGGFYFKPGGTTS
GDASTAYANSIFNIFQAVSSNINRLLSVPSDSCNNVNVRDLQRRLYDNCYRLTSVEESFIDDYYGYLRKHFSMMILSDDG
VVCYNKDYAELGYIADISAFKATLYYQNNVFMSTSKCWVEEDLTKGPHEFCSQHTMQIVDKDGTYYLPYPDPSRILSAGV
FVDDVVKTDAVVLLERYVSLAIDAYPLSKHPNSEYRKVFYVLLDWVKHLNKNLNEGVLESFSVTLLDNQEDKFWCEDFYA
SMYENSTILQAAGLCVVCGSQTVLRCGDCLRKPMLCTKCAYDHVFGTDHKFILAITPYVCNASGCGVSDVKKLYLGGLNY
YCTNHKPQLSFPLCSAGNIFGLYKNSATGSLDVEVFNRLATSDWTDVRDYKLANDVKDTLRLFAAETIKAKEESVKSSYA
FATLKEVVGPKELLLSWESGKVKPPLNRNSVFTCFQISKDSKFQIGEFIFEKVEYGSDTVTYKSTVTTKLVPGMIFVLTS
HNVQPLRAPTIANQEKYSSIYKLHPAFNVSDAYANLVPYYQLIGKQKITTIQGPPGSGKSHCSIGLGLYYPGARIVFVAC
AHAAVDSLCAKAMTVYSIDKCTRIIPARARVECYSGFKPNNTSAQYIFSTVNALPECNADIVVVDEVSMCTNYDLSVINQ
RLSYKHIVYVGDPQQLPAPRVMITKGVMEPVDYNVVTQRMCAIGPDVFLHKCYRCPAEIVNTVSELVYENKFVPVKPASK
QCFKVFFKGNVQVDNGSSINRKQLEIVKLFLVKNPSWSKAVFISPYNSQNYVASRFLGLQIQTVDSSQGSEYDYVIYAQT
SDTAHACNVNRFNVAITRAKKGIFCVMCDKTLFDSLKFFEIKHADLHSSQVCGLFKNCTRTPLNLPPTHAHTFLSLSDQF
KTTGDLAVQIGSNNVCTYEHVISFMGFRFDISIPGSHSLFCTRDFAIRNVRGWLGMDVESAHVCGDNIGTNVPLQVGFSN
GVNFVVQTEGCVSTNFGDVIKPVCAKSPPGEQFRHLIPLLRKGQPWLIVRRRIVQMISDYLSNLSDILVFVLWAGSLELT
TMRYFVKIGPIKYCYCGNSATCYNSVSNEYCCFKHALGCDYVYNPYAFDIQQWGYVGSLSQNHHTFCNIHRNEHDASGDA
VMTRCLAVHDCFVKNVDWTVTYPFIANEKFINGCGRNVQGHVVRAALKLYKPSVIHDIGNPKGVRCAVTDAKWYCYDKQP
VNSNVKLLDYDYATHGQLDGLCLFWNCNVDMYPEFSIVCRFDTRTRSVFNLEGVNGGSLYVNKHAFHTPAYDKRAFVKLK
PMPFFYFDDSDCDVVQEQVNYVPLRASSCVTRCNIGGAVCSKHANLYQKYVEAYNTFTQAGFNIWVPHSFDVYNLWQIFI
ETNLQSLENIAFNVVKKGCFTGVDGELPVAVVNDKVFVRYGDVDNLVFTNKTTLPTNVAFELFAKRKMGLTPPLSILKNL
GVVATYKFVLWDYEAERPFTSYTKSVCKYTDFNEDVCVCFDNSIQGSYERFTLTTNAVLFSTVVIKNLTPIKLNFGMLNG
MPVSSIKGDKGVEKLVNWYIYVRKNGQFQDHYDGFYTQGRNLSDFTPRSDMEYDFLNMDMGVFINKYGLEDFNFEHVVYG
DVSKTTLGGLHLLISQFRLSKMGVLKADDFVTASDTTLRCCTVTYLNELSSKVVCTYMDLLLDDFVTILKSLDLGVISKV
HEVIIDNKPYRWMLWCKDNHLSTFYPQLQSAEWKCGYAMPQIYKLQRMCLEPCNLYNYGAGIKLPSGIMLNVVKYTQLCQ
YLNSTTMCVPHNMRVLHYGAGSDKGVAPGTTVLKRWLPPDAIIIDNDINDYVSDADFSITGDCATVYLEDKFDLLISDMY
DGRIKFCDGENVSKDGFFTYLNGVIREKLAIGGSVAIKITEYSWNKYLYELIQRFAFWTLFCTSVNTSSSEAFLIGINYL
GDFIQGPFIAGNTVHANYIFWRNSTIMSLSYNSVLDLSKFECKHKATVVVTLKDSDVNDMVLSLIKSGRLLLRNNGRFGG
FSNHLVSTK
>P0C6X6 ~~~rep~~~Replicase polyprotein 1ab~~~
MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDMICSTTAQKLETDGICPENHVMVDCRRLLKQECCVQSSLIREI
VMNASPYHLEVLLQDALQSREAVLVTTPLGMSLEACYVRGCNPKGWTMGLFRRRSVCNTGRCTVNKHVAYQLYMIDPAGV
CLGAGQFVGWVIPLAFMPVQSRKFIVPWVMYLRKRGEKGAYNKDHGCGGFGHVYDFKVEDAYDQVHDEPKGKFSKKAYAL
IRGYRGVKPLLYVDQYGCDYTGSLADGLEAYADKTLQEMKALFPTWSQELPFDVIVAWHVVRDPRYVMRLQSAATICSVA
YVANPTEDLCDGSVVIKEPVHVYADDSIILRQYNLFDIMSHFYMEADTVVNAFYGVALKDCGFVMQFGYIDCEQDSCDFK
GWIPGNMIDGFACTTCGHVYEVGDLIAQSSGVLPVNPVLHTKSAAGYGGFGCKDSFTLYGQTVVYFGGCVYWSPARNIWI
PILKSSVKSYDSLVYTGVLGCKAIVKETNLICKALYLDYVQHKCGNLHQRELLGVSDVWHKQLLINRGVYKPLLENIDYF
NMRRAKFSLETFTVCADGFMPFLLDDLVPRAYYLAVSGQAFCDYADKLCHAVVSKSKELLDVSLDSLGAAIHYLNSKIVD
LAQHFSDFGTSFVSKIVHFFKTFTTSTALAFAWVLFHVLHGAYIVVESDIYFVKNIPRYASAVAQAFQSVAKVVLDSLRV
TFIDGLSCFKIGRRRICLSGRKIYEVERGLLHSSQLPLDVYDLTMPSQVQKAKQKPIYLKGSGSDFSLADSVVEVVTTSL
TPCGYSEPPKVADKICIVDNVYMAKAGDKYYPVVVDDHVGLLDQAWRVPCAGRRVTFKEQPTVKEIISMPKIIKVFYELD
NDFNTILNTACGVFEVDDTVDMEEFYAVVIDAIEEKLSPCKELEGVGAKVSAFLQKLEDNPLFLFDEAGEEVFAPKLYCA
FTAPEDDDFLEESDVEEDDVEGEETDLTITSAGQPCVASEQEESSEVLEDTLDDGPSVETSDSQVEEDVEMSDFVDLESV
IQDYENVCFEFYTTEPEFVKVLGLYVPKATRNNCWLRSVLAVMQKLPCQFKDKNLQDLWVLYKQQYSQLFVDTLVNKIPA
NIVLPQGGYVADFAYWFLTLCDWQCVAYWKCIKCDLALKLKGLDAMFFYGDVVSHICKCGESMVLIDVDVPFTAHFALKD
KLFCAFITKRIVYKAACVVDVNDSHSMAVVDGKQIDDHRITSITSDKFDFIIGHGMSFSMTTFEIAQLYGSCITPNVCFV
KGDIIKVSKLVKAEVVVNPANGHMVHGGGVAKAIAVAAGQQFVKETTNMVKSKGVCATGDCYVSTGGKLCKTVLNVVGPD
ARTQGKQSYVLLERVYKHFNNYDCVVTTLISAGIFSVPSDVSLTYLLGTAKKQVVLVSNNQEDFDLISKCQITAVEGTKK
LAARLSFNVGRSIVYETDANKLILINDVAFVSTFNVLQDVLSLRHDIALDDDARTFVQSNVDVLPEGWRVVNKFYQINGV
RTVKYFECTGGIDICSQDKVFGYVQQGIFNKATVAQIKALFLDKVDILLTVDGVNFTNRFVPVGESFGKSLGNVFCDGVN
VTKHKCDINYKGKVFFQFDNLSSEDLKAVRSSFNFDQKELLAYYNMLVNCFKWQVVVNGKYFTFKQANNNCFVNVSCLML
QSLHLTFKIVQWQEAWLEFRSGRPARFVALVLAKGGFKFGDPADSRDFLRVVFSQVDLTGAICDFEIACKCGVKQEQRTG
LDAVMHFGTLSREDLEIGYTVDCSCGKKLIHCVRFDVPFLICSNTPASVKLPKGVGSANIFIGDNVGHYVHVKCEQSYQL
YDASNVKKVTDVTGKLSDCLYLKNLKQTFKSVLTTYYLDDVKKIEYKPDLSQYYCDGGKYYTQRIIKAQFKTFEKVDGVY
TNFKLIGHTVCDSLNSKLGFDSSKEFVEYKITEWPTATGDVVLANDDLYVKRYERGCITFGKPVIWLSHEKASLNSLTYF
NRPLLVDDNKFDVLKVDDVDDSGDSSESGAKETKEINIIKLSGVKKPFKVEDSVIVNDDTSETKYVKSLSIVDVYDMWLT
GCKYVVRTANALSRAVNVPTIRKFIKFGMTLVSIPIDLLNLREIKPAVNVVKAVRNKTSACFNFIKWLFVLLFGWIKISA
DNKVIYTTEIASKLTCKLVALAFKNAFLTFKWSMVARGACIIATIFLLWFNFIYANVIFSDFYLPKIGFLPTFVGKIAQW
IKNTFSLVTICDLYSIQDVGFKNQYCNGSIACQFCLAGFDMLDNYKAIDVVQYEADRRAFVDYTGVLKIVIELIVSYALY
TAWFYPLFALISIQILTTWLPELFMLSTLHWSFRLLVALANMLPAHVFMRFYIIIASFIKLFSLFKHVAYGCSKSGCLFC
YKRNRSLRVKCSTIVGGMIRYYDVMANGGTGFCSKHQWNCIDCDSYKPGNTFITVEAALDLSKELKRPIQPTDVAYHTVT
DVKQVGCSMRLFYDRDGQRIYDDVNASLFVDYSNLLHSKVKSVPNMHVVVVENDADKANFLNAAVFYAQSLFRPILMVDK
NLITTANTGTSVTETMFDVYVDTFLSMFDVDKKSLNALIATAHSSIKQGTQIYKVLDTFLSCARKSCSIDSDVDTKCLAD
SVMSAVSAGLELTDESCNNLVPTYLKSDNIVAADLGVLIQNSAKHVQGNVAKIAGVSCIWSVDAFNQFSSDFQHKLKKAC
CKTGLKLKLTYNKQMANVSVLTTPFSLKGGAVFSYFVYVCFVLSLVCFIGLWCLMPTYTVHKSDFQLPVYASYKVLDNGV
IRDVSVEDVCFANKFEQFDQWYESTFGLSYYSNSMACPIVVAVIDQDFGSTVFNVPTKVLRYGYHVLHFITHALSADGVQ
CYTPHSQISYSNFYASGCVLSSACTMFTMADGSPQPYCYTDGLMQNASLYSSLVPHVRYNLANAKGFIRFPEVLREGLVR
VVRTRSMSYCRVGLCEEADEGICFNFNGSWVLNNDYYRSLPGTFCGRDVFDLIYQLFKGLAQPVDFLALTASSIAGAILA
VIVVLVFYYLIKLKRAFGDYTSVVFVNVIVWCVNFMMLFVFQVYPTLSCVYAICYFYATLYFPSEISVIMHLQWLVMYGT
IMPLWFCLLYIAVVVSNHAFWVFSYCRKLGTSVRSDGTFEEMALTTFMITKDSYCKLKNSLSDVAFNRYLSLYNKYRYYS
GKMDTAAYREAACSQLAKAMDTFTNNNGSDVLYQPPTASVSTSFLQSGIVKMVNPTSKVEPCVVSVTYGNMTLNGLWLDD
KVYCPRHVICSASDMTNPDYTNLLCRVTSSDFTVLFDRLSLTVMSYQMRGCMLVLTVTLQNSRTPKYTFGVVKPGETFTV
LAAYNGKPQGAFHVTMRSSYTIKGSFLCGSCGSVGYVIMGDCVKFVYMHQLELSTGCHTGTDFNGDFYGPYKDAQVVQLP
IQDYIQSVNFLAWLYAAILNNCNWFIQSDKCSVEDFNVWALSNGFSQVKSDLVIDALASMTGVSLETLLAAIKRLKNGFQ
GRQIMGSCSFEDELTPSDVYQQLAGIKLQSKRTRLFKGTVCWIMASTFLFSCIITAFVKWTMFMYVTTNMFSITFCALCV
ISLAMLLVKHKHLYLTMYITPVLFTLLYNNYLVVYKHTFRGYVYAWLSYYVPSVEYTYTDEVIYGMLLLVGMVFVTLRSI
NHDLFSFIMFVGRLISVFSLWYKGSNLEEEILLMLASLFGTYTWTTVLSMAVAKVIAKWVAVNVLYFTDIPQIKIVLLCY
LFIGYIISCYWGLFSLMNSLFRMPLGVYNYKISVQELRYMNANGLRPPKNSFEALMLNFKLLGIGGVPIIEVSQFQSKLT
DVKCANVVLLNCLQHLHVASNSKLWHYCSTLHNEILATSDLSVAFEKLAQLLIVLFANPAAVDSKCLTSIEEVCDDYAKD
NTVLQALQSEFVNMASFVEYEVAKKNLDEARFSGSANQQQLKQLEKACNIAKSAYERDRAVAKKLERMADLALTNMYKEA
RINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPSLAANTLNIIVPDKSVYDQIVDNIYVTYAGNVW
QIQTIQDSDGTNKQLNEISDDCNWPLVIIANRYNEVSATVLQNNELMPAKLKIQVVNSGPDQTCNTPTQCYYNNSNNGKI
VYAILSDVDGLKYTKILKDDGNFVVLELDPPCKFTVQDAKGLKIKYLYFVKGCNTLARGWVVGTISSTVRLQAGTATEYA
SNSSILSLCAFSVDPKKTYLDFIQQGGTPIANCVKMLCDHAGTGMAITVKPDATTSQDSYGGASVCIYCRARVEHPDVDG
LCKLRGKFVQVPVGIKDPVSYVLTHDVCRVCGFWRDGSCSCVSTDTTVQSKDTNFLNRVRGASVDARLVPCASGLSTDVQ
LRAFDIYNASVAGIGLHLKVNCCRFQRVDENGDKLDQFFVVKRTDLTIYNREMKCYERVKDCKFVAEHDFFTFDVEGSRV
PHIVRKDLTKYTMLDLCYALRHFDRNDCMLLCDILSIYAGCEQSYFTKKDWYDFVENPDIINVYKKLGPIFNRALVSATE
FADKLVEVGLVGVLTLDNQDLNGKWYDFGDYVIAAPGCGVAIADSYYSYIMPMLTMCHALDCELYVNNAYRLFDLVQYDF
TDYKLELFNKYFKHWSMPYHPNTVDCQDDRCIIHCANFNILFSMVLPNTCFGPLVRQIFVDGVPFVVSIGYHYKELGIVM
NMDVDTHRYRLSLKDLLLYAADPALHVASASALYDLRTCCFSVAAITSGVKFQTVKPGNFNQDFYDFVLSKGLLKEGSSV
DLKHFFFTQDGNAAITDYNYYKYNLPTMVDIKQLLFVLEVVYKYFEIYDGGCIPASQVIVNNYDKSAGYPFNKFGKARLY
YEALSFEEQDEIYAYTKRNVLPTLTQMNLKYAISAKNRARTVAGVSILSTMTGRMFHQKCLKSIAATRGVPVVIGTTKFY
GGWDDMLRRLIKDVDNPVLMGWDYPKCDRAMPNILRIVSSLVLARKHETCCSQSDRFYRLANECAQVLSEIVMCGGCYYV
KPGGTSSGDATTAFANSVFNICQAVSANVCALMSCNGNKIEDLSIRALQKRLYSHVYRSDKVDSTFVTEYYEFLNKHFSM
MILSDDGVVCYNSDYASKGYIANISAFQQVLYYQNNVFMSESKCWVEHDINNGPHEFCSQHTMLVKMDGDDVYLPYPNPS
RILGAGCFVDDLLKTDSVLLIERFVSLAIDAYPLVYHENEEYQKVFRVYLAYIKKLYNDLGNQILDSYSVILSTCDGQKF
TDESFYKNMYLRSAVMQSVGACVVCSSQTSLRCGSCIRKPLLCCKCCYDHVMATDHKYVLSVSPYVCNAPGCDVNDVTKL
YLGGMSYYCEDHKPQYSFKLVMNGLVFGLYKQSCTGSPYIDDFNRIASCKWTDVDDYILANECTERLKLFAAETQKATEE
AFKQSYASATIQEIVSERELILSWEIGKVKPPLNKNYVFTGYHFTKNGKTVLGEYVFDKSELTNGVYYRATTTYKLSVGD
VFVLTSHSVANLSAPTLVPQENYSSIRFASVYSVLETFQNNVVNYQHIGMKRYCTVQGPPGTGKSHLAIGLAVFYCTARV
VYTAASHAAVDALCEKAYKFLNINDCTRIVPAKVRVECYDKFKINDTTRKYVFTTINALPEMVTDIVVVDEVSMLTNYEL
SVINARIRAKHYVYIGDPAQLPAPRVLLSKGTLEPKYFNTVTKLMCCLGPDIFLGTCYRCPKEIVDTVSALVYENKLKAK
NESSLLCFKVYYKGVTTHESSSAVNMQQIYLINKFLKANPLWHKAVFISPYNSQNFAAKRVLGLQTQTVDSAQGSEYDYV
IYSQTAETAHSVNVNRFNVAITRAKKGILCVMSNMQLFEALQFTTLTLDKVPQAVETKVQCSTNLFKDCSKSYSGYHPAH
APSFLAVDDKYKATGDLAVCLGIGDSAVTYSRLISLMGFKLDVTLDGYCKLFITKEEAVKRVRAWVGFDAEGAHATLDSI
GTNFPLQLGFSTGIDFVVEATGLFADRDGYSFKKAVAKAPPGEQFKHLIPLMTRGHRWDVVRPRIVQMFADHLIDLSDCV
VLVTWAANFELTCLRYFAKVGREISCNVCTKRATVYNSRTGYYGCWRHSVTCDYLYNPLIVDIQQWGYIGSLSSNHDLYC
SVHKGAHVASSDAIMTRCLAVYDCFCNNINWNVEYPIISNELSINTSCRVLQRVILKAAMLCNRYTLCYDIGNPKGIACV
KDFDFKFYDAQPIVKSVKTLLYSFEAHKDSFKDGLCMFWNCNVDKYPPNAVVCRFDTRVLNNLNLPGCNGGSLYVNKHAF
HTKPFARAAFEHLKPMPFFYYSDTPCVYMDGMDAKQVDYVPLKSATCITRCNLGGAVCLKHAEEYREYLESYNTATTAGF
TFWVYKTFDFYNLWNTFTKLQSLENVVYNLVKTGHYTGQAGEMPCAIINDKVVTKIDKEDVVIFINNTTYPTNVAVELFA
KRSVRHHPELKLFRNLNIDVCWKHVIWDYARESIFCSNTYGVCMYTDLKFIDKLNVLFDGRDNGAFEAFKRSNNGVYIST
TKVKSLSMIRGPPRAELNGVVVDKVGDTDCVFYFAVRKEGQDVIFSQFDSLGVSSNQSPQGNLGSNGKPGNVGGNDALSI
STIFTQSRVISSFTCRTDMEKDFIALDQDVFIQKYGLEDYAFEHIVYGNFNQKIIGGLHLLIGLYRRQQTSNLVVQEFVS
YDSSIHSYFITDEKSGGSKSVCTVIDILLDDFVTLVKSLNLNCVSKVVNVNVDFKDFQFMLWCNDEKVMTFYPRLQAASD
WKPGYSMPVLYKYLNSPMERVSLWNYGKPVTLPTGCMMNVAKYTQLCQYLNTTTLAVPVNMRVLHLGAGSEKGVAPGSAV
LRQWLPAGTILVDNDLYPFVSDSVATYFGDCITLPFDCQWDLIISDMYDPITKNIGEYNVSKDGFFTYICHMIRDKLALG
GSVAIKITEFSWNAELYKLMGYFAFWTVFCTNANASSSEGFLIGINYLCKPKVEIDGNVMHANYLFWRNSTVWNGGAYSL
FDMAKFPLKLAGTAVINLRADQINDMVYSLLEKGKLLIRDTNKEVFVGDSLVNVI
>P0C6X9 ~~~rep~~~Replicase polyprotein 1ab~~~
MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVNCSRLPALECCVQSAIIRDI
FVDEDPQKVEASTMMALQFGSAVLVKPSKRLSIQAWTNLGVLPKTAAMGLFKRVCLCNTRECSCDAHVAFHLFTVQPDGV
CLGNGRFIGWFVPVTAIPEYAKQWLQPWSILLRKGGNKGSVTSGHFRRAVTMPVYDFNVEDACEEVHLNPKGKYSCKAYA
LLKGYRGVKPILFVDQYGCDYTGCLAKGLEDYGDLTLSEMKELFPVWRDSLDSEVLVAWHVDRDPRAAMRLQTLATVRCI
DYVGQPTEDVVDGDVVVREPAHLLAANAIVKRLPRLVETMLYTDSSVTEFCYKTKLCECGFITQFGYVDCCGDTCDFRGW
VAGNMMDGFPCPGCTKNYMPWELEAQSSGVIPEGGVLFTQSTDTVNRESFKLYGHAVVPFGSAVYWSPCPGMWLPVIWSS
VKSYSGLTYTGVVGCKAIVQETDAICRSLYMDYVQHKCGNLEQRAILGLDDVYHRQLLVNRGDYSLLLENVDLFVKRRAE
FACKFATCGDGLVPLLLDGLVPRSYYLIKSGQAFTSMMVNFSHEVTDMCMDMALLFMHDVKVATKYVKKVTGKLAVRFKA
LGVAVVRKITEWFDLAVDIAASAAGWLCYQLVNGLFAVANGVITFVQEVPELVKNFVDKFKAFFKVLIDSMSVSILSGLT
VVKTASNRVCLAGSKVYEVVQKSLSAYVMPVGCSEATCLVGEIEPAVFEDDVVDVVKAPLTYQGCCKPPTSFEKICIVDK
LYMAKCGDQFYPVVVDNDTVGVLDQCWRFPCAGKKVEFNDKPKVRKIPSTRKIKITFALDATFDSVLSKACSEFEVDKDV
TLDELLDVVLDAVESTLSPCKEHDVIGTKVCALLDRLAGDYVYLFDEGGDEVIAPRMYCSFSAPDDEDCVAADVVDADEN
QDDDAEDSAVLVADTQEEDGVAKGQVEADSEICVAHTGSQEELAEPDAVGSQTPIASAEETEVGEASDREGIAEAKATVC
ADAVDACPDQVEAFEIEKVEDSILDELQTELNAPADKTYEDVLAFDAVCSEALSAFYAVPSDETHFKVCGFYSPAIERTN
CWLRSTLIVMQSLPLEFKDLEMQKLWLSYKAGYDQCFVDKLVKSVPKSIILPQGGYVADFAYFFLSQCSFKAYANWRCLE
CDMELKLQGLDAMFFYGDVVSHMCKCGNSMTLLSADIPYTLHFGVRDDKFCAFYTPRKVFRAACAVDVNDCHSMAVVEGK
QIDGKVVTKFIGDKFDFMVGYGMTFSMSPFELAQLYGSCITPNVCFVKGDVIKVVRLVNAEVIVNPANGRMAHGAGVAGA
IAEKAGSAFIKETSDMVKAQGVCQVGECYESAGGKLCKKVLNIVGPDARGHGKQCYSLLERAYQHINKCDNVVTTLISAG
IFSVPTDVSLTYLLGVVTKNVILVSNNQDDFDVIEKCQVTSVAGTKALSLQLAKNLCRDVKFVTNACSSLFSESCFVSSY
DVLQEVEALRHDIQLDDDARVFVQANMDCLPTDWRLVNKFDSVDGVRTIKYFECPGGIFVSSQGKKFGYVQNGSFKEASV
SQIRALLANKVDVLCTVDGVNFRSCCVAEGEVFGKTLGSVFCDGINVTKVRCSAIYKGKVFFQYSDLSEADLVAVKDAFG
FDEPQLLKYYTMLGMCKWPVVVCGNYFAFKQSNNNCYINVACLMLQHLSLKFPKWQWQEAWNEFRSGKPLRFVSLVLAKG
SFKFNEPSDSIDFMRVVLREADLSGATCNLEFVCKCGVKQEQRKGVDAVMHFGTLDKGDLVRGYNIACTCGSKLVHCTQF
NVPFLICSNTPEGRKLPDDVVAANIFTGGSVGHYTHVKCKPKYQLYDACNVNKVSEAKGNFTDCLYLKNLKQTFSSVLTT
FYLDDVKCVEYKPDLSQYYCESGKYYTKPIIKAQFRTFEKVDGVYTNFKLVGHSIAEKLNAKLGFDCNSPFVEYKITEWP
TATGDVVLASDDLYVSRYSSGCITFGKPVVWLGHEEASLKSLTYFNRPSVVCENKFNVLPVDVSEPTDKGPVPAAVLVTG
VPGADASAGAGIAKEQKACASASVEDQVVTEVRQEPSVSAADVKEVKLNGVKKPVKVEGSVVVNDPTSETKVVKSLSIVD
VYDMFLTGCKYVVWTANELSRLVNSPTVREYVKWGMGKIVTPAKLLLLRDEKQEFVAPKVVKAKAIACYCAVKWFLLYCF
SWIKFNTDNKVIYTTEVASKLTFKLCCLAFKNALQTFNWSVVSRGFFLVATVFLLWFNFLYANVILSDFYLPNIGPLPTF
VGQIVAWFKTTFGVSTICDFYQVTDLGYRSSFCNGSMVCELCFSGFDMLDNYDAINVVQHVVDRRLSFDYISLFKLVVEL
VIGYSLYTVCFYPLFVLIGMQLLTTWLPEFFMLETMHWSARLFVFVANMLPAFTLLRFYIVVTAMYKVYCLCRHVMYGCS
KPGCLFCYKRNRSVRVKCSTVVGGSLRYYDVMANGGTGFCTKHQWNCLNCNSWKPGNTFITHEAAADLSKELKRPVNPTD
SAYYSVTEVKQVGCSMRLFYERDGQRVYDDVNASLFVDMNGLLHSKVKGVPETHVVVVENEADKAGFLGAAVFYAQSLYR
PMLMVEKKLITTANTGLSVSRTMFDLYVDSLLNVLDVDRKSLTSFVNAAHNSLKEGVQLEQVMDTFIGCARRKCAIDSDV
ETKSITKSVMSAVNAGVDFTDESCNNLVPTYVKSDTIVAADLGVLIQNNAKHVQANVAKAANVACIWSVDAFNQLSADLQ
HRLRKACSKTGLKIKLTYNKQEANVPILTTPFSLKGGAVFSRMLQWLFVANLICFIVLWALMPTYAVHKSDMQLPLYASF
KVIDNGVLRDVSVTDACFANKFNQFDQWYESTFGLAYYRNSKACPVVVAVIDQDIGHTLFNVPTTVLRYGFHVLHFITHA
FATDSVQCYTPHMQIPYDNFYASGCVLSSLCTMLAHADGTPHPYCYTGGVMHNASLYSSLAPHVRYNLASSNGYIRFPEV
VSEGIVRVVRTRSMTYCRVGLCEEAEEGICFNFNRSWVLNNPYYRAMPGTFCGRNAFDLIHQVLGGLVRPIDFFALTASS
VAGAILAIIVVLAFYYLIKLKRAFGDYTSVVVINVIVWCINFLMLFVFQVYPTLSCLYACFYFYTTLYFPSEISVVMHLQ
WLVMYGAIMPLWFCIIYVAVVVSNHALWLFSYCRKIGTEVRSDGTFEEMALTTFMITKESYCKLKNSVSDVAFNRYLSLY
NKYRYFSGKMDTAAYREAACSQLAKAMETFNHNNGNDVLYQPPTASVTTSFLQSGIVKMVSPTSKVEPCIVSVTYGNMTL
NGLWLDDKVYCPRHVICSSADMTDPDYPNLLCRVTSSDFCVMSGRMSLTVMSYQMQGCQLVLTVTLQNPNTPKYSFGVVK
PGETFTVLAAYNGRPQGAFHVTLRSSHTIKGSFLCGSCGSVGYVLTGDSVRFVYMHQLELSTGCHTGTDFSGNFYGPYRD
AQVVQLPVQDYTQTVNVVAWLYAAIFNRCNWFVQSDSCSLEEFNVWAMTNGFSSIKADLVLDALASMTGVTVEQVLAAIK
RLHSGFQGKQILGSCVLEDETPSDVYQQLAGVKLQSKRTRVIKGTCCWILASTFLFCSIISAFVKWTMFMYVTTHMLGVT
LCALCFVSFAMLLIKHKHLYLTMYIMPVLCTFYTNYLVVYKQSFRGLAYAWLSHFVPAVDYTYMDEVLYGVVLLVAMVFV
TMRSINHDVFSIMFLVGRLVSLVSMWYFGANLEEEVLLFLTSLFGTYTWTTMLSLATAKVIAKWLAVNVLYFTDVPQIKL
VLLSYLCIGYVCCCYWGILSLLNSIFRMPLGVYNYKISVQELRYMNANGLRPPRNSFEALMLNFKLLGIGGVPVIEVSQI
QSRLTDVKCANVVLLNCLQHLHIASNSKLWQYCSTLHNEILATSDLSMAFDKLAQLLVVLFANPAAVDSKCLASIEEVSD
DYVRDNTVLQALQSEFVNMASFVEYELAKKNLDEAKASGSANQQQIKQLEKACNIAKSAYERDRAVARKLERMADLALTN
MYKEARINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPSLTSNTLTIIVPDKQVFDQVVDNVYVTY
AGNVWHIQFIQDADGAVKQLNEIDVNSTWPLVIAANRHNEVSTVVLQNNELMPQKLRTQVVNSGSDMNCNTPTQCYYNTT
GTGKIVYAILSDCDGLKYTKIVKEDGNCVVLELDPPCKFSVQDVKGLKIKYLYFVKGCNTLARGWVVGTLSSTVRLQAGT
ATEYASNSAILSLCAFSVDPKKTYLDYIKQGGVPVTNCVKMLCDHAGTGMAITIKPEATTNQDSYGGASVCIYCRSRVEH
PDVDGLCKLRGKFVQVPLGIKDPVSYVLTHDVCQVCGFWRDGSCSCVGTGSQFQSKDTNFLNRIRGTSVNARLVPCASGL
DTDVQLRAFDICNANRAGIGLYYKVNCCRFQRVDEDGNKLDKFFVVKRTNLEVYNKEKECYELTKECGVVAEHEFFTFDV
EGSRVPHIVRKDLSKFTMLDLCYALRHFDRNDCSTLKEILLTYAECEESYFQKKDWYDFVENPDIINVYKKLGPIFNRAL
LNTAKFADALVEAGLVGVLTLDNQDLYGQWYDFGDFVKTVPGCGVAVADSYYSYMMPMLTMCHALDSELFVNGTYREFDL
VQYDFTDFKLELFTKYFKHWSMTYHPNTCECEDDRCIIHCANFNILFSMVLPKTCFGPLVRQIFVDGVPFVVSIGYHYKE
LGVVMNMDVDTHRYRLSLKDLLLYAADPALHVASASALLDLRTCCFSVAAITSGVKFQTVKPGNFNQDFYEFILSKGLLK
EGSSVDLKHFFFTQDGNAAITDYNYYKYNLPTMVDIKQLLFVLEVVNKYFEIYEGGCIPATQVIVNNYDKSAGYPFNKFG
KARLYYEALSFEEQDEIYAYTKRNVLPTLTQMNLKYAISAKNRARTVAGVSILSTMTGRMFHQKCLKSIAATRGVPVVIG
TTKFYGGWDDMLRRLIKDVDSPVLMGWDYPKCDRAMPNILRIVSSLVLARKHDSCCSHTDRFYRLANECAQVLSEIVMCG
GCYYVKPGGTSSGDATTAFANSVFNICQAVSANVCSLMACNGHKIEDLSIRELQKRLYSNVYRADHVDPAFVSEYYEFLN
KHFSMMILSDDGVVCYNSEFASKGYIANISAFQQVLYYQNNVFMSEAKCWVETDIEKGPHEFCSQHTMLVKMDGDEVYLP
YPDPSRILGAGCFVDDLLKTDSVLLIERFVSLAIDAYPLVYHENPEYQNVFRVYLEYIKKLYNDLGNQILDSYSVILSTC
DGQKFTDETFYKNMYLRSAVLQSVGACVVCSSQTSLRCGSCIRKPLLCCKCAYDHVMSTDHKYVLSVSPYVCNSPGCDVN
DVTKLYLGGMSYYCEDHKPQYSFKLVMNGMVFGLYKQSCTGSPYIEDFNKIASCKWTEVDDYVLANECTERLKLFAAETQ
KATEEAFKQCYASATIREIVSDRELILSWEIGKVRPPLNKNYVFTGYHFTNNGKTVLGEYVFDKSELTNGVYYRATTTYK
LSVGDVFILTSHAVSSLSAPTLVPQENYTSIRFASVYSVPETFQNNVPNYQHIGMKRYCTVQGPPGTGKSHLAIGLAVYY
CTARVVYTAASHAAVDALCEKAHKFLNINDCTRIVPAKVRVDCYDKFKVNDTTRKYVFTTINALPELVTDIIVVDEVSML
TNYELSVINSRVRAKHYVYIGDPAQLPAPRVLLNKGTLEPRYFNSVTKLMCCLGPDIFLGTCYRCPKEIVDTVSALVYNN
KLKAKNDNSSMCFKVYYKGQTTHESSSAVNMQQIHLISKFLKANPSWSNAVFISPYNSQNYVAKRVLGLQTQTVDSAQGS
EYDFVIYSQTAETAHSVNVNRFNVAITRAKKGILCVMSSMQLFESLNFTTLTLDKINNPRLQCTTNLFKDCSRSYVGYHP
AHAPSFLAVDDKYKVGGDLAVCLNVADSAVTYSRLISLMGFKLDLTLDGYCKLFITRDEAIKRVRAWVGFDAEGAHAIRD
SIGTNFPLQLGFSTGIDFVVEATGMFAERDGYVFKKAAARAPPGEQFKHLIPLMSRGQKWDVVRIRIVQMLSDHLVDLAD
SVVLVTWAASFELTCLRYFAKVGREVVCSVCTKRATCFNSRTGYYGCWRHSYSCDYLYNPLIVDIQQWGYTGSLTSNHDP
ICSVHKGAHVASSDAIMTRCLAVHDCFCKSVNWNLEYPIISNEVSVNTSCRLLQRVMFRAAMLCNRYDVCYDIGNPKGLA
CVKGYDFKFYDASPVVKSVKQFVYKYEAHKDQFLDGLCMFWNCNVDKYPANAVVCRFDTRVLNKLNLPGCNGGSLYVNKH
AFHTSPFTRAAFENLKPMPFFYYSDTPCVYMEGMESKQVDYVPLRSATCITRCNLGGAVCLKHAEEYREYLESYNTATTA
GFTFWVYKTFDFYNLWNTFTRLQSLENVVYNLVNAGHFDGRAGELPCAVIGEKVIAKIQNEDVVVFKNNTPFPTNVAVEL
FAKRSIRPHPELKLFRNLNIDVCWSHVLWDYAKDSVFCSSTYKVCKYTDLQCIESLNVLFDGRDNGALEAFKKCRNGVYI
NTTKIKSLSMIKGPQRADLNGVVVEKVGDSDVEFWFAVRKDGDDVIFSRTGSLEPSHYRSPQGNPGGNRVGDLSGNEALA
RGTIFTQSRLLSSFTPRSEMEKDFMDLDDDVFIAKYSLQDYAFEHVVYGSFNQKIIGGLHLLIGLARRQQKSNLVIQEFV
TYDSSIHSYFITDENSGSSKSVCTVIDLLLDDFVDIVKSLNLKCVSKVVNVNVDFKDFQFMLWCNEEKVMTFYPRLQAAA
DWKPGYVMPVLYKYLESPLERVNLWNYGKPITLPTGCMMNVAKYTQLCQYLSTTTLAVPANMRVLHLGAGSDKGVAPGSA
VLRQWLPAGSILVDNDVNPFVSDSVASYYGNCITLPFDCQWDLIISDMYDPLTKNIGEYNVSKDGFFTYLCHLIRDKLAL
GGSVAIKITEFSWNAELYSLMGKFAFWTIFCTNVNASSSEGFLIGINWLNKTRTEIDGKTMHANYLFWRNSTMWNGGAYS
LFDMSKFPLKAAGTAVVSLKPDQINDLVLSLIEKGKLLVRDTRKEVFVGDSLVNVK
>P0C6Y0 ~~~rep~~~Replicase polyprotein 1ab~~~
MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVDCSRLPALECCVQSAIIRDI
FVDEDPQKVEASTMMALQFGSAVLVKPSKRLSVQAWAKLGVLPKTPAMGLFKRFCLCNTRECVCDAHVAFQLFTVQPDGV
CLGNGRFIGWFVPVTAIPEYAKQWLQPWSILLRKGGNKGSVTSGHFRRAVTMPVYDFNVEDACEEVHLNPRGKYSCKAYA
LLRGYRGVKPILFVDQYGCDYTGCLAKGLEDYGDLTLSEMKELSPVWRDSLDNEVVVAWHVDRDPRAVMRLQTLATVRSI
EYVGQPIEDMVDGDVVMREPAHLLAPNAIVKRLPRLVETMLYTDSSVTEFCYKTKLCDCGFITQFGYVDCCGDTCGFRGW
VPGNMMDGFPCPGCCKSYMPWELEAQSSGVIPEGGVLFTQSTDTVNRESFKLYGHAVVPFGGAAYWSPYPGMWLPVIWSS
VKSYSYLTYTGVVGCKAIVQETDAICRFLYMDYVQHKCGNLEQRAILGLDDVYHRQLLVNRGDYSLLLENVDLFVKRRAE
FACKFATCGDGLVPLLLDGLVPRSYYLIKSGQAFTSLMVNFSREVVDMCMDMALLFMHDVKVATKYVKKVTGKVAVRFKA
LGIAVVRKITEWFDLAVDTAASAAGWLCYQLVNGLFAVANGVITFIQEVPELVKNFVDKFKTFFKVLIDSMSVSILSGLT
VVKTASNRVCLAGSKVYEVVQKSLPAYIMPVGCSEATCLVGEIEPAVFEDDVVDVVKAPLTYQGCCKPPSSFEKICIVDK
LYMAKCGDQFYPVVVDNDTVGVLDQCWRFPCAGKKVVFNDKPKVKEVPSTRKIKIIFALDATFDSVLSKACSEFEVDKDV
TLDELLDVVLDAVESTLSPCKEHGVIGTKVCALLERLVDDYVYLFDEGGEEVIASRMYCSFSAPDEDCVATDVVYADENQ
DDDADDPVVLVADTQEEDGVAREQVDSADSEICVAHTGGQEMTEPDVVGSQTPIASAEETEVGEACDREGIAEVKATVCA
DALDACPDQVEAFDIEKVEDSILSELQTELNAPADKTYEDVLAFDAIYSETLSAFYAVPSDETHFKVCGFYSPAIERTNC
WLRSTLIVMQSLPLEFKDLGMQKLWLSYKAGYDQCFVDKLVKSAPKSIILPQGGYVADFAYFFLSQCSFKVHANWRCLKC
GMELKLQGLDAVFFYGDVVSHMCKCGNSMTLLSADIPYTFDFGVRDDKFCAFYTPRKVFRAACAVDVNDCHSMAVVDGKQ
IDGKVVTKFNGDKFDFMVGHGMTFSMSPFEIAQLYGSCITPNVCFVKGDVIKVLRRVGAEVIVNPANGRMAHGAGVAGAI
AKAAGKAFINETADMVKAQGVCQVGGCYESTGGKLCKKVLNIVGPDARGHGNECYSLLERAYQHINKCDNVVTTLISAGI
FSVPTDVSLTYLLGVVTKNVILVSNNQDDFDVIEKCQVTSVAGTKALSFQLAKNLCRDVKFVTNACSSLFSESSFVSSYD
VLQEVEALRHDIQLDDDARVFVQANMDCLPTDWRLVNKFDSVDGVRTIKYFECPGEVFVSSQGKKFGYVQNGSFKEASVS
QIRALLANKVDVLCTVDGVNFRSCCVAEGEVFGKTLGSVFCDGINVTKVRCSAIHKGKVFFQYSGLSAADLAAVKDAFGF
DEPQLLQYYSMLGMCKWPVVVCGNYFAFKQSNNNCYINVACLMLQHLSLKFPKWQWRRPGNEFRSGKPLRFVSLVLAKGS
FKFNEPSDSTDFIRVELREADLSGATCDLEFICKCGVKQEQRKGVDAVMHFGTLDKSGLVKGYNIACTCGDKLVHCTQFN
VPFLICSNTPEGKKLPDDVVAANIFTGGSVGHYTHVKCKPKYQLYDACNVSKVSEAKGNFTDCLYLKNLKQTFSSVLTTY
YLDDVKCVAYKPDLSQYYCESGKYYTKPIIKAQFRTFEKVEGVYTNFKLVGHDIAEKLNAKLGFDCNSPFMEYKITEWPT
ATGDVVLASDDLYVSRYSGGCVTFGKPVIWRGHEEASLKSLTYFNRPSVVCENKFNVLPVDVSEPTDRRPVPSAVLVTGA
ASGADASAISTEPGTAKEQKACASDSVEDQIVMEAQKKSSVTTVAVKEVKLNGVKKPVKWNCSVVVNDPTSETKVVKSLS
IVDVYDMFLTGCRYVVWTANELSRLINSPTVREYVKWGMSKLIIPANLLLLRDEKQEFVAPKVVKAKAIACYGAVKWFLL
YCFSWIKFNTDNKVIYTTEVASKLTFKLCCLAFKNALQTFNWSVVSRGFFLVATVFLLWFNFLYANVILSDFYLPNIGPL
PMFVGQIVAWVKTTFGVLTICDFYQVTDLGYRSSFCNGSMVCELCFSGFDMLDNYESINVVQHVVDRRVSFDYISLFKLV
VELVIGYSLYTVCFYPLFVLVGMQLLTTWLPEFFMLGTMHWSARLFVFVANMLPAFTLLRFYIVVTAMYKVYCLCRHVMY
GCSKPGCLFCYKRNRSVRVKCSTVVGGSLRYYDVMANGGTGFCTKHQWNCLNCNSWKPGNTFITHEAAADLSKELKRPVN
PTDSAYYSVIEVKQVGCSMRLFYERDGQRVYDDVSASLFVDMNGLLHSKVKGVPETHVVVVENEADKAGFLNAAVFYAQS
LYRPMLMVEKKLITTANTGLSVSRTMFDLYVYSLLRHLDVDRKSLTSFVNAAHNSLKEGVQLEQVMDTFVGCARRKCAID
SDVETKSITKSVMAAVNAGVEVTDESCNNLVPTYVKSDTIVAADLGVLIQNNAKHVQSNVAKAANVACIWSVDAFNQLSA
DLQHRLRKACVKTGLKIKLTYNKQEANVPILTTPFSLKGGAVFSRVLQWLFVANLICFIVLWALMPTYAVHKSDMQLPLY
ASFKVIDNGVLRDVSVTDACFANKFNQFDQWYESTFGLVYYRNSKACPVVVAVIDQDIGHTLFNVPTKVLRYGFHVLHFI
THAFATDRVQCYTPHMQIPYDNFYASGCVLSSLCTMLAHADGTPHPYCYTEGVMHNASLYSSLVPHVRYNLASSNGYIRF
PEVVSEGIVRVVRTRSMTYCRVGLCEEAEEGICFNFNSSWVLNNPYYRAMPGTFCGRNAFDLIHQVLGGLVQPIDFFALT
ASSVAGAILAIIVVLAFYYLIKLKRAFGDYTSVVVINVIVWCINFLMLFVFQVYPTLSCLYACFYFYTTLYFPSEISVVM
HLQWLVMYGAIMPLWFCITYVAVVVSNHALWLFSYCRKIGTDVRSDGTFEEMALTTFMITKESYCKLKNSVSDVAFNRYL
SLYNKYRYFSGKMDTATYREAACSQLAKAMETFNHNNGNDVLYQPPTASVTTSFLQSGIVKMVSPTSKVEPCVVSVTYGN
MTLNGLWLDDKVYCPRHVICSSADMTDPDYPNLLCRVTSSDFCVMSDRMSLTVMSYQMQGSLLVLTVTLQNPNTPKYSFG
VVKPGETFTVLAAYNGRPQGAFHVVMRSSHTIKGSFLCGSCGSVGYVLTGDSVRFVYMHQLELSTGCHTGTDFSGNFYGP
YRDAQVVQLPVQDYTQTVNVVAWLYAAILNRCNWFVQSDSCSLEEFNVWAMTNGFSSIKADLVLDALASMTGVTVEQVLA
AIKRLHSGFQGKQILGSCVLEDELTPSDVYQQLAGVKLQSKRTRVIKGTCCWILASTFLFCSIISAFVKWTMFMYVTTHM
LGVTLCALCFVIFAMLLIKHKHLYLTMYIMPVLCTLFYTNYLVVGYKQSFRGLAYAWLSYFVPAVDYTYMDEVLYGVVLL
VAMVFVTMRSINHDVFSTMFLVGRLVSLVSMWYFGANLEEEVLLFLTSLFGTYTWTTMLSLATAKVIAKWLAVNVLYFTD
IPQIKLVLLSYLCIGYVCCCYWGVLSLLNSIFRMPLGVYNYKISVQELRYMNANGLRPPRNSFEALMLNFKLLGIGGVPV
IEVSQIQSRLTDVKCANVVLLNCLQHLHIASNSKLWQYCSTLHNEILATSDLSVAFDKLAQLLVVLFANPAAVDSKCLAS
IEEVSDDYVRDNTVLQALQSEFVNMASFVEYELAKKNLDEAKASGSANQQQIKQLEKACNIAKSAYERDRAVARKLERMA
DLALTNMYKEARINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPPLTSNTLTIIVPDKQVFDQVVD
NVYVTYAPNVWHIQSIQDADGAVKQLNEIDVNSTWPLVISANRHNEVSTVVLQNNELMPQKLRTQVVNSGSDMNCNIPTQ
CYYNTTGTGKIVYAILSDCDGLKYTKIVKEDGNCVVLELDPPCKFSVQDVKGLKIKYLYFVKGCNTLARGWVVGTLSSTV
RLQAGTATEYASNSAILSLCAFSVDPKKTYLDYIQQGGVPVTNCVKMLCDHAGTGMAITIKPEATTNQDSYGGASVCIYC
RSRVEHPDVDGLCKLRGKFVQVPLGIKDPVSYVLTHDVCQVCGFWRDGSCSCVGTGSQFQSKDTNFLNRVRGTSVNARLV
PCASGLDTDVQLRAFDICNANRAGIGLYYKVNCFRFQRVDEEGNKLDKFFVVKRTNLEVYNKEKECYELTKDCGVVAEHE
FFTFDVEGSRVPHIVRKDLSKFTMLDLCYALRHFDRNDCSTLKEILLTYAECDESYFQKKDWYDFVENPDIINVYKKLGP
IFNRALLNTANFADTLVEAGLVGVLTLDNQDLYGQWYDFGDFVKTVPCCGVAVADSYYSYMMPMLTMCHALDSELFVNGT
YREFDLVQYDFTDFKLELFNKYFKHWSMTYHPNTSECEDDRCIIHCANFNILFSMVLPKTCFGPLVRQIFVDGVPFVVSI
GYHYKELGVVMNMDVDTHRYRLSLKDLLLYAADPALHVASASALLDLRTCCFSVAAITSGVKFQTVKPGNFNQDFYEFIL
SKGLLKEGSSVDLKHFFFTQDGNAAITDYNYYKYNLPTMVDIKQLLFVVEVVNKYFEIYEGGCIPATQVIVNNYDKSAGY
PFNKFGKARLYYEALSFEEQDEIYAYTKRNVLPTLTQMNLKYAISAKNRARTVAGVSILSTMTGRMFHQKCLKSIAATRG
VPVVIGTTKFYGGWDDMLRRLIKDVDSPVLMGWDYPKCDRAMPNILRIVSSLVLARKHDSCCSHTDRFYRLANECAQVLG
EIVMCGGCYYVKPGGTSSGDATTAFANSVFNICQAVSANVCSLMACNGHKIEDLSIRELQKRLYSNVYRADHVDPAFVSE
YYEFLNKHFSMIILSDDGVVCYNSEFASKGYIANISDFQQVLYYQNNVFMSEAKCWVETDIEKGPHEFCSQHTMLVKMDG
DEVYLPYPDPSRILGAGCFVDDLLKTDSVLLIERFVSLAIDAYPLVYHENPEYQNVFRVYLEYIKKLYNDLGNQILDSIS
VILSTCDGQKFTDETFYKNMYLRSAVMQSVGACVVCSSQTSLRCGSCIRKPLLCCKCAYDHVMSTDHKYVLSVSPYVCNS
PGCDVNDVTKLYLGGMSYYCEAHKPQYSFKLVMNGMVFGLYKQSCTGSPYIEDFNKIASCKWTEVDDYVLANECTERLKL
FAAETQKATEEAFKQCYASATIREIVSDRELILSWEIGKVRPPLNKNYVFTGYHFTNNGKTVLGEYVFDKSELTNGVYYR
ATTTYKLSVGDVFILTSHAVSSLSAPTLVPQENYTSVRFASAYSVPETFQNNVPNYQHIGIKRYCTVQGPPGTGKSHLAI
GHAVYYCTARVVYTAASHAAVDALCEKAHKFLNINDCARIVPAKLRVDCYDKFNVNDTTRKYVFTTINALPELVTDIIVV
DEVSMLTNYELSVINSRVRAKHYVYIGDPAQLPAPRVLLNKGTLEPRYFNSVTKLMCCLGPDIFLGTCYRCPKEIVDTVS
ALVYNNKLKAKNDNSAMCFKVYYKGQTTHESSSAVNMQQIHLISKLLKANPSWSNAVFISPYNSQNYVAKRVLGLQTQTA
DSAQGSAYDFVIYSQTAQTAHSVNVNRFNVAITRAKKGILCVMSSMQLIGVFNFTTLTLDKINNPRLQCTTNLFKDCSKS
YVGIPPCAFLLAVDDKYKVSGNLAVCLNVADSAVTYSRLISLMGFKLDLTLDGYCKLFITRDEAIKRVRAWVGFDAEGAH
ATRDSIGTNFPLQLGFSTGIDFVVEATGMFAERDGYVFKKAAARAPPGEQFKHLVPLMSRGQKWDVVRIRIVQMLSDHLV
DLADSVVLVTWAASFELTCLRYFAKVGKEVVCSVCNKRATCFNSRTGYYGCWRHSYSCDYLYNPLIVDIQQWGYTGSLTS
NHDPICSVHKGAHVASSDAIMTRCLAVHDCFCKSVNWNLEYPIISNEVSVNTSCRLLQRVMFRAAMLCNRYDVCYDIGNP
KGLACVKGYDFKFYDASPVVKSVKQFVYKYEAHKDQFLDGLCMFWNCNVDKYPANAVVCRFDTRVLSKLNLPGCNGGSLY
VNKHAFHTNPFTRAAFENLKPMPFFYYSDTPCVYMEGMESKQVDYVPLRSATCITRCNLGGAVCLKHAEEYREYLESYNT
ATTAGFTFWVYKTFDFYNLWNTFTRLQSLENVVYNLVNAGHFDGRAGELPCAVIGEKVIAKIQNEDVVVFKNNTPFPTNV
AVELFAERSIRPHPELKLFRSSNIHVCWNHVLWDYAKDSVFCSSTYKVCKYTDLQCIESLNVLFDGRDNGALEAFKKCRN
GVYINTTKIKSLSMIKGPQRADLNGVVVEKVGDSDVEFWFAMRRDGDDVIFSRTGSLEPSHYRSPQGNPGGNRVGDLSGN
EALARGTIFTQSRFLSSFSPRSEMEKDFMDLDEDVFIAKYSLQDYAFEHVVYGSFNQKIIGGLHLLIGLARRPKKSNLVI
QEFVPYDSSIHSYFITDENSGSSESVCTVIDLLLDDFVDIVKSLNLKCVSKVVNVNVDFKDFQFMLWCNEEKVMTFYPRL
QAAADWKPGYVMPVLYKYLESPMERVNLWNYGKPITLPTGCMMNVAKYTQLCQYLSTTTLAVPANMRVLHLGAGSDKGVA
PGSAVLRQWLPSGSILVDNDMNPFVSDSVASYYGNCITLPFDCQWDLIISDMYDPLTKNIGEYNVSKDGFFTYLCHLIRD
KLALGGSVAIKITEFSWNAELYSLMGKFAFWTIFCTNVNASSSEGFLIGINWLNRTRNEIDGKTMHANYLFWRNSTMWNG
GAYSLFDMTKFPLKAAGTAVVSLKPDQINDLVLSLIEKGKLLVRDTRKEVFVGDSLVNVK
>P0C6Y5 ~~~rep~~~Replicase polyprotein 1ab~~~
MSSKQFKILVNEDYQVNVPSLPIRDVLQEIKYCYRNGFEGYVFVPEYCRDLVDCDRKDHYVIGVLGNGVSDLKPVLLTEP
SVMLQGFIVRANCNGVLEDFDLKIARTGRGAIYVDQYMCGADGKPVIEGDFKDYFGDEDIIEFEGEEYHCAWTTVRDEKP
LNQQTLFTIQEIQYNLDIPHKLPNCATRHVAPPVKKNSKIVLSEDYKKLYDIFGSPFMGNGDCLSKCFDTLHFIAATLRC
PCGSESSGVGDWTGFKTACCGLSGKVKGVTLGDIKPGDAVVTSMSAGKGVKFFANCVLQYAGDVEGVSIWKVIKTFTVDE
TVCTPGFEGELNDFIKPESKSLVACSVKRAFITGDIDDAVHDCIITGKLDLSTNLFGNVGLLFKKTPWFVQKCGALFVDA
WKVVEELCGSLTLTYKQIYEVVASLCTSAFTIVNYKPTFVVPDNRVKDLVDKCVKVLVKAFDVFTQIITIAGIEAKCFVL
GAKYLLFNNALVKLVSVKILGKKQKGLECAFFATSLVGATVNVTPKRTETATISLNKVDDVVAPGEGYIVIVGDMAFYKS
GEYYFMMSSPNFVLTNNVFKAVKVPSYDIVYDVDNDTKSKMIAKLGSSFEYDGDIDAAIVKVNELLIEFRQQSLCFRAFK
DDKSIFVEAYFKKYKMPACLAKHIGLWNIIKKDSCKRGFLNLFNHLNELEDIKETNIQAIKNILCPDPLLDLDYGAIWYN
CMPGCSDPSVLGSVQLLIGNGVKVVCDGCKGFANQLSKGYNKLCNAARNDIEIGGIPFSTFKTPTNTFIEMTDAIYSVIE
QGKALSFRDADVPVVDNGTISTADWSEPILLEPAEYVKPKNNGNVIVIAGYTFYKDEDEHFYPYGFGKIVQRMYNKMGGG
DKTVSFSEEVDVQEIAPVTRVKLEFEFDNEIVTGVLERAIGTRYKFTGTTWEEFEESISEELDAIFDTLANQGVELEGYF
IYDTCGGFDIKNPDGIMISQYDINITADEKSEVSASSEEEEVESVEEDPENEIVEASEGAEGTSSQEEVETVEVADITST
EEDVDIVEVSAKDDPWAAAVDVQEAEQFNPSLPPFKTTNLNGKIILKQGDNNCWINACCYQLQAFDFFNNEAWEKFKKGD
VMDFVNLCYAATTLARGHSGDAEYLLELMLNDYSTAKIVLAAKCGCGEKEIVLERAVFKLTPLKESFNYGVCGDCMQVNT
CRFLSVEGSGVFVHDILSKQTPEAMFVVKPVMHAVYTGTTQNGHYMVDDIEHGYCVDGMGIKPLKKRCYTSTLFINANVM
TRAEKPKQEFKVEKVEQQPIVEENKSSIEKEEIQSPKNDDLILPFYKAGKLSFYQGALDVLINFLEPDVIVNAANGDLKH
MGGVARAIDVFTGGKLTERSKDYLKKNKSIAPGNAVFFENVIEHLSVLNAVGPRNGDSRVEAKLCNVYKAIAKCEGKILT
PLISVGIFNVRLETSLQCLLKTVNDRGLNVFVYTDQERQTIENFFSCSIPVNVTEDNVNHERVSVSFDKTYGEQLKGTVV
IKDKDVTNQLPSAFDVGQKVIKAIDIDWQAHYGFRDAAAFSASSHDAYKFEVVTHSNFIVHKQTDNNCWINAICLALQRL
KPQWKFPGVRGLWNEFLERKTQGFVHMLYHISGVKKGEPGDAELMLHKLGDLMDNDCEIIVTHTTACDKCAKVEKFVGPV
VAAPLAIHGTDETCVHGVSVNVKVTQIKGTVAITSLIGPIIGEVLEATGYICYSGSNRNGHYTYYDNRNGLVVDAEKAYH
FNRDLLQVTTAIASNFVVKKPQAEERPKNCAFNKVAASPKIVQEQKLLAIESGANYALTEFGRYADMFFMAGDKILRLLL
EVFKYLLVLFMCLRSTKMPKVKVKPPLAFKDFGAKVRTLNYMRQLNKPSVWRYAKLVLLLIAIYNFFYLFVSIPVVHKLT
CNGAVQAYKNSSFIKSAVCGNSILCKACLASYDELADFQHLQVTWDFKSDPLWNRLVQLSYFAFLAVFGNNYVRCFLMYF
VSQYLNLWLSYFGYVEYSWFLHVVNFESISAEFVIVVIVVKAVLALKHIVFACSNPSCKTCSRTARQTRIPIQVVVNGSM
KTVYVHANGTGKFCKKHNFYCKNCDSYGFENTFICDEIVRDLSNSVKQTVYATDRSHQEVTKVECSDGFYRFYVGDEFTS
YDYDVKHKKYSSQEVLKSMLLLDDFIVYSPSGSALANVRNACVYFSQLIGKPIKIVNSDLLEDLSVDFKGALFNAKKNVI
KNSFNVDVSECKNLDECYRACNLNVSFSTFEMAVNNAHRFGILITDRSFNNFWPSKVKPGSSGVSAMDIGKCMTSDAKIV
NAKVLTQRGKSVVWLSQDFAALSSTAQKVLVKTFVEEGVNFSLTFNAVGSDDDLPYERFTESVSPKSGSGFFDVITQLKQ
IVILVFVFIFICGLCSVYSVATQSYIESAEGYDYMVIKNGIVQPFDDTISCVHNTYKGFGDWFKAKYGFIPTFGKSCPIV
VGTVFDLENMRPIPDVPAYVSIVGRSLVFAINAAFGVTNMCYDHTGNAVSKDSYFDTCVFNTACTTLTGLGGTIVYCAKQ
GLVEGAKLYSDLMPDYYYEHASGNMVKLPAIIRGLGLRFVKTQATTYCRVGECIDSKAGFCFGGDNWFVYDNEFGNGYIC
GNSVLGFFKNVFKLFNSNMSVVATSGAMLVNIIIACLAIAMCYGVLKFKKIFGDCTFLIVMIIVTLVVNNVSYFVTQNTF
FMIIYAIVYYFITRKLAYPGILDAGFIIAYINMAPWYVITAYILVFLYDSLPSLFKLKVSTNLFEGDKFVGNFESAAMGT
FVIDMRSYETIVNSTSIARIKSYANSFNKYKYYTGSMGEADYRMACYAHLGKALMDYSVNRTDMLYTPPTVSVNSTLQSG
LRKMAQPSGLVEPCIVRVSYGNNVLNGLWLGDEVICPRHVIASDTTRVINYENEMSSVRLHNFSVSKNNVFLGVVSARYK
GVNLVLKVNQVNPNTPEHKFKSIKAGESFNILACYEGCPGSVYGVNMRSQGTIKGSFIAGTCGSVGYVLENGILYFVYMH
HLELGNGSHVGSNFEGEMYGGYEDQPSMQLEGTNVMSSDNVVAFLYAALINGERWFVTNTSMSLESYNTWAKTNSFTELS
STDAFSMLAAKTGQSVEKLLDSIVRLNKGFGGRTILSYGSLCDEFTPTEVIRQMYGVNLQAGKVKSFFYPIMTAMTILFA
FWLEFFMYTPFTWINPTFVSIVLAVTTLISTVFVSGIKHKMLFFMSFVLPSVILVTAHNLFWDFSYYESLQSIVENTNTM
FLPVDMQGVMLTVFCFIVFVTYSVRFFTCKQSWFSLAVTTILVIFNMVKIFGTSDEPWTENQIAFCFVNMLTMIVSLTTK
DWMVVIASYRIAYYIVVCVMPSAFVSDFGFMKCISIVYMACGYLFCCYYGILYWVNRFTCMTCGVYQFTVSAAELKYMTA
NNLSAPKNAYDAMILSAKLIGVGGKRNIKISTVQSKLTEMKCTNVVLLGLLSKMHVESNSKEWNYCVGLHNEINLCDDPE
IVLEKLLALIAFFLSKHNTCDLSELIESYFENTTILQSVASAYAALPSWIALEKARADLEEAKKNDVSPQILKQLTKAFN
IAKSDFEREASVQKKLDKMAEQAAASMYKEARAVDRKSKIVSAMHSLLFGMLKKLDMSSVNTIIDQARNGVLPLSIIPAA
SATRLVVITPSLEVFSKIRQENNVHYAGAIWTIVEVKDANGSHVHLKEVTAANELNLTWPLSITCERTTKLQNNEIMPGK
LKERAVRASATLDGEAFGSGKALMASESGKSFMYAFIASDNNLKYVKWESNNDIIPIELEAPLRFYVDGANGPEVKYLYF
VKNLNTLRRGAVLGYIGATVRLQAGKPTEHPSNSSLLTLCAFSPDPAKAYVDAVKRGMQPVNNCVKMLSNGAGNGMAVTN
GVEANTQQDSYGGASVCIYCRCHVEHPAIDGLCRYKGKFVQIPTGTQDPIRFCIENEVCVVCGCWLNNGCMCDRTSMQSF
TVDQSYLNRVRGSSAARLEPCNGTDPDHVSRAFDIYNKDVACIGKFLKTNCSRFRNLDKHDAYYIVKRCTKTVMDHEQVC
YNDLKDSGAVAEHDFFTYKEGRCEFGNVARRNLTKYTMMDLCYAIRNFDEKNCEVLKEILVTVGACTEEFFENKDWFDPV
ENEAIHEVYAKLGPIVANAMLKCVAFCDAIVEKGYIGVITLDNQDLNGNFYDFGDFVKTAPGFGCACVTSYYSYMMPLMG
MTSCLESENFVKSDIYGSDYKQYDLLAYDFTEHKEYLFQKYFKYWDRTYHPNCSDCTSDECIIHCANFNTLFSMTIPMTA
FGPLVRKVHIDGVPVVVTAGYHFKQLGIVWNLDVKLDTMKLSMTDLLRFVTDPTLLVASSPALLDQRTVCFSIAALSTGI
TYQTVKPGHFNKDFYDFITERGFFEEGSELTLKHFFFAQGGEAAMTDFNYYRYNRVTVLDICQAQFVYKIVGKYFECYDG
GCINAREVVVTNYDKSAGYPLNKFGKARLYYETLSYEEQDALFALTKRNVLPTMTQMNLKYAISGKARARTVGGVSLLST
MTTRQYHQKHLKSIAATRNATVVIGSTKFYGGWDNMLKNLMRDVDNGCLMGWDYPKCDRALPNMIRMASAMILGSKHVGC
CTHNDRFYRLSNELAQVLTEVVHCTGGFYFKPGGTTSGDGTTAYANSAFNIFQAVSANVNKLLGVDSNACNNVTVKSIQR
KIYDNCYRSSSIDEEFVVEYFSYLRKHFSMMILSDDGVVCYNKDYADLGYVADINAFKATLYYQNNVFMSTSKCWVEPDL
SVGPHEFCSQHTLQIVGPDGDYYLPYPDPSRILSAGVFVDDIVKTDNVIMLERYVSLAIDAYPLTKHPKPAYQKVFYTLL
DWVKHLQKNLNAGVLDSFSVTMLEEGQDKFWSEEFYASLYEKSTVLQAAGMCVVCGSQTVLRCGDCLRRPLLCTKCAYDH
VMGTKHKFIMSITPYVCSFNGCNVNDVTKLFLGGLSYYCMNHKPQLSFPLCANGNVFGLYKSSAVGSEAVEDFNKLAVSD
WTNVEDYKLANNVKESLKIFAAETVKAKEESVKSEYAYAVLKEVIGPKEIVLQWEASKTKPPLNRNSVFTCFQISKDTKI
QLGEFVFEQSEYGSDSVYYKSTSTYKLTPGMIFVLTSHNVSPLKAPILVNQEKYNTISKLYPVFNIAEAYNTLVPYYQMI
GKQKFTTIQGPPGSGKSHCVIGLGLYYPQARIVYTACSHAAVDALCEKAAKNFNVDRCSRIIPQRIRVDCYTGFKPNNTN
AQYLFCTVNALPEASCDIVVVDEVSMCTNYDLSVINSRLSYKHIVYVGDPQQLPAPRTLINKGVLQPQDYNVVTKRMCTL
GPDVFLHKCYRCPAEIVKTVSALVYENKFVPVNPESKQCFKMFVKGQVQIESNSSINNKQLEVVKAFLAHNPKWRKAVFI
SPYNSQNYVARRLLGLQTQTVDSAQGSEYDYVIYTQTSDTQHATNVNRFNVAITRAKVGILCIMCDRTMYENLDFYELKD
SKIGLQAKPETCGLFKDCSKSEQYIPPAYATTYMSLSDNFKTSDGLAVNIGTKDVKYANVISYMGFRFEANIPGYHTLFC
TRDFAMRNVRAWLGFDVEGAHVCGDNVGTNVPLQLGFSNGVDFVVQTEGCVITEKGNSIEVVKARAPPGEQFAHLIPLMR
KGQPWHIVRRRIVQMVCDYFDGLSDILIFVLWAGGLELTTMRYFVKIGRPQKCECGKSATCYSSSQSVYACFKHALGCDY
LYNPYCIDIQQWGYTGSLSMNHHEVCNIHRNEHVASGDAIMTRCLAIHDCFVKRVDWSIVYPFIDNEEKINKAGRIVQSH
VMKAALKIFNPAAIHDVGNPKGIRCATTPIPWFCYDRDPINNNVRCLDYDYMVHGQMNGLMLFWNCNVDMYPEFSIVCRF
DTRTRSKLSLEGCNGGALYVNNHAFHTPAYDRRAFAKLKPMPFFYYDDSNCELVDGQPNYVPLKSNVCITKCNIGGAVCK
KHAALYRAYVEDYNIFMQAGFTIWCPQNFDTYMLWHGFVNSKALQSLENVAFNVVKKGAFTGLKGDLPTAVIADKIMVRD
GPTDKCIFTNKTSLPTNVAFELYAKRKLGLTPPLTILRNLGVVATYKFVLWDYEAERPFSNFTKQVCSYTDLDSEVVTCF
DNSIAGSFERFTTTRDAVLISNNAVKGLSAIKLQYGLLNDLPVSTVGNKPVTWYIYVRKNGEYVEQIDSYYTQGRTFETF
KPRSTMEEDFLSMDTTLFIQKYGLEDYGFEHVVFGDVSKTTIGGMHLLISQVRLAKMGLFSVQEFMNNSDSTLKSCCITY
ADDPSSKNVCTYMDILLDDFVTIIKSLDLNVVSKVVDVIVDCKAWRWMLWCENSHIKTFYPQLQSAEWNPGYSMPTLYKI
QRMCLERCNLYNYGAQVKLPDGITTNVVKYTQLCQYLNTTTLCVPHKMRVLHLGAAGASGVAPGSTVLRRWLPDDAILVD
NDLRDYVSDADFSVTGDCTSLYIEDKFDLLVSDLYDGSTKSIDGENTSKDGFFTYINGFIKEKLSLGGSVAIKITEFSWN
KDLYELIQRFEYWTVFCTSVNTSSSEGFLIGINYLGPYCDKAIVDGNIMHANYIFWRNSTIMALSHNSVLDTPKFKCRCN
NALIVNLKEKELNEMVIGLLRKGKLLIRNNGKLLNFGNHFVNTP
>Q98VG9 ~~~rep~~~Replicase polyprotein 1ab~~~
MSSKQFKILVNEDYQVNVPSLPFRDALQEIKYCYRNGFDGYVFVPEYRRDLVDCNRKDHYVIGVLGNGISDLKPVLLTEP
SVMLQGFIVRANCNGVLEDFDLKFARTGNGAIYVDQYMCGADGKPVIEGEFKDYFGDEDVIIYEGEEYHCAWLTVRDEKP
LWQQTLLTIREIQYNLDIPHKLPNCAIREVAPPVKKNSKVVLSEEYRKLYDIFGSPFMGNGDSLNTCFDSLHFIAATLKC
PCGAESSGVGDWTGFKTACCGLHGKVKGVTLGAVKPGDAIVTSMSAGKGVKFFANSVLQYAGDVENVSVWKVIKTFTVNE
TVCTTDFEGELNDFIRPESTSPVSCSIKRAFITGEVDDAVHDCIIAGKLDLSTNLFGSANLLFKKMPWFVQKCGAIFADA
WKVVEELLCSLKLTYKQIYDVVASLCTSAFTIMDYKPVFVVSSNSVKDLVDKCVKILVKAFDVFTQTITIAGVEAKCFVL
GSKYLLFNNALVKLVSVKILGKRQKGLDSAFFATNLIGATVNVTPQRTESAYISLNKVDDVVTPGGGHIVIIGDMAFYKS
EEYYFMMASPDSVLVNNVFKAARVPSYNIVYDVNDDTKSKMVVKIGTSFDFDGDLDAAIAKVNDLLIEFRQEKLCFRALK
DGENILVEAYLKKYKMPVCLKNHVGLWDIIRQDSGKKGFLDTFNHLNELEDVKDIKIQTIKNIICPDLLLELDFGAIWYR
CMPACSDKSILGNVKIMLGNGVKVVCDGCHSFANRLTINYNKLCDTARKDIEIGGIPFSTFKTPSSSFIDMKDAIYSVVE
YGEALSFKTASVPVTNSGIITTDDWSDPILLEPADYVEPKDNGDVIVIAGYTFYKDEDDHFYPYGSGMVVQKMYNKMGGG
DKSVSFSDNVNVREIEPVTRVRLEFEFDNEVVTQVLEKVIGTKYKFIGTTWEEFEDSISEKLDKIFDTLAEQGVELEGYF
IYDTCGGFDINNPDGVMISQYDLNTAADDKSDSDASVEDISLISDNEDVEQIEEDNTSTDDAEDVSSVEGETVSVVDVED
FVEQVSLVEENNVLTPAVNPDEQLSSVEKKDEVSAKNDPWAAAVDEQEAEQPKPSLTPFKTTNLNGKIILKQQDNNCWIN
ACCYQLQAFDFFNHDLWDGFKKDDVMPFVDFCYAALTLKQGDSGDAEYLLETMLNDYSTAKVTLSAKCGCGVKEIVLERT
VFKLTPLRNEFKYGVCGDCKQINMCKFASVEGSGVFVHDRIEKQTPVSQFIVTPTMHAVYTGTTQSGHYMIEDCIHDYCV
DGMGIKPRKHKFYTSTLFLNANVMTAKSKTMVEPPVPVEDKCVEDCQSPKDLILPFYKAGKVSFYQGDLDVLINFLEPDV
LVNAANGDLRHVGGVARAIDVFTGGKLTKRSKEYLKSSKAIAPGNAVLFENVLEHLSVLNAVGPRNGDSRVEGKLCNVYK
AIAKCDGKILTPLISVGIFKVKLEVSLQCLLKTVTDRDLNVFVYTDQERVTIENFFNGTIPIKVTEDTVNQKRVSVALDK
TYGEQLKGTVVIKDKDVTNQLPSVSDVGEKVVKALDVDWNAYYGFPNAAAFSASSHDAYEFDVVTHNNFIVHKQTDNNCW
VNAICLALQRLKPTWKFPGVKSLWDAFLTRKTAGFVHMLYHISGLTKGQPGDAELTLHKLVDLMSSDSAVTVTHTTACDK
CAKVETFTGPVVAAPLLVCGTDEICVHGVHVNVKVTSIRGTVAITSLIGPVVGDVIDATGYICYTGLNSRGHYTYYDNRN
GLMVDADKAYHFEKNLLQVTTAIASNFVANTPKKEIMPKTQAKESKAKESNTARVFSEVEENPKNIVRKEKLLAIESGVD
YTITTLGKYADVFFMAGDKILRFLLEVFKYLLVVFMCLRKSKMPKVKVKPPHVFRNLGAKVRTLNYVRQLNKPALWRYIK
LVLLLIALYHFFYLFVSIPVVHKLACSGSVQAYSNSSFVKSEVCGNSILCKACLASYDELADFDHLQVSWDYKSDPLWNR
VIQLSYFIFLAVFGNNYVRCLLMYFVSQYLNLWLSYFGYVKYSWFLHVVNFESISVEFVIIVVVFKAVLALKHIFLPCNN
PSCKTCSKIARQTRIPIQVVVNGSMKTVYVHANGTGKLCKKHNFYCKNCDSYGFDHTFICDEIVRDLSNSIKQTVYATDR
SYQEVTKVECTDGFYRFYVGEEFTAYDYDVKHKKYSSQEVLKTMFLLDDFIVYNPSGSSLASVRNVCVYFSQLIGRPIKI
VNSELLEDLSVDFKGALFNAKKNVIKNSFNVDVSECKNLEECYKLCNLDVTFSTFEMAINNAHRFGILITDRSFNNFWPS
KIKPGSSGVSAMDIGKCMTFDAKIVNAKVLTQRGKSVVWLSQDFSTLSSTAQKVLVKTFVEEGVNFSLTFNAVGSDEDLP
YERFTESVSAKSGSGFFDVLKQLKQLFWCLVLFITLYGLCSVYSVATQSYIDSAEGYDYMVIKNGVVQSFDDSINCVHNT
YKGFAVWFKAKHGFVPTFDKSCPIVLGTVFDLGNMRPIPDVPAYVALVGRSLVFAINAAFGVTNVCYDHTGAAVSKNSYF
DTCVFNSACTTLTGIGGTVVYCAKQGLVEGAKLYSELLPDYYYEHASGNMVKIPAIIRSFGLRFVKTQATTYCRVGECTE
SQAGFCFGGDNWFVYDKEFGDGYICGSSTLGFFKNVFALFNSNMSVVATSGAMLANIVIACLAIAVCYGVLKFKKIFGDC
TLLVVMIIVTLVVNNVSYFVTQNTFFMIVYAIIYYFTTRKLAYPGVLDAGFIIAYLNMAPWYVLVLYIMVFLYDSLPSLF
KLKVTTNLFEGDKFVGSFESAAMGTFVIDMRSYETLVNSTSLDRIKSYANSFNKYKYYTGSMGEADYRMACYAHLGKALM
DYSVSRNDMLYTPPTVSVNSTLQSGLRKMAQPSGVVEPCIVRVAYGNNVLNGLWLGDEVICPRHVIASDTSRVINYENEL
SSVRLHNFSIAKNNAFLGVVSAKYKGVNLVLKVNQVNPNTPEHKFKSVRPGESFNILACYEGCPGSVYGVNMRSQGTIKG
SFIAGTCGSVGYVLENGTLYFVYMHHLELGNGSHVGSNLEGEMYGGYEDQPSMQLEGTNVMSSDNVVAFLYAALINGERW
FVTNTSMTLESYNAWAKTNSFTEIVSTDAFNMLAAKTGYSVEKLLECIVRLNKGFGGRTILSYGSLCDEFTPTEVIRQMY
GVNLQSGKVKSIFYPMMTAIAILFAFWLEFFMYTPFTWINPTFVSVVLAITTLVSVLLVAGIKHKMLFFMSFVMPSVILA
TAHNVVWDMTYYESLQVLVENVNTTFLPVDMQGVMLALFCVVVFVICTIRFFTCKQSWFSLFATTIFVMFNIVKLLGMIG
EPWTDDHFLLCLVNMLTMLISLTTKDWFVVFASYKVAYYIVVYVMQPAFVQDFGFVKCVSIIYMACGYLFCCYYGILYWV
NRFTCMTCGVYQFTVSPAELKYMTANNLSAPKTAYDAMILSFKLMGIGGGRNIKISTVQSKLTEMKCTNVVLLGLLSKMH
VESNSKEWNYCVGLHNEINLCDDPDAVLEKLLALIAFFLSKHNTCDLSDLIESYFENTTILQSVASAYAALPSWIAYEKA
RADLEEAKKNDVSPQLLKQLTKACNIAKSEFEREASVQKKLDKMAEQAAASMYKEARAVDRKSKIVSAMHSLLFGMLKKL
DMSSVNTIIEQARNGVLPLSIIPAASATRLIVVTPNLEVLSKVRQENNVHYAGAIWSIVEVKDANGAQVHLKEVTAANEL
NITWPLSITCERTTKLQNNEILPGKLKEKAVKASATIDGDAYGSGKALMASEGGKSFIYAFIASDSNLKYVKWESNNDVI
PIELEAPLRFYVDGVNGPEVKYLYFVKSLNTLRRGAVLGYIGATVRLQAGKPTEHPSNSGLLTLCAFAPDPAKAYVDAVK
RGMQPVTNCVKMLSNGAGNGMAITNGVESNTQQDSYGGASVCIYCRCHVEHPAIDGLCRFKGKFVQVPTGTQDPIRFCIE
NEVCVVCGCWLTNGCMCDRTSIQGTTIDQSYLNECGVLVQLDLEPCNGTDPDHVSRAFDIYNKDVACIGKFLKTNCSRFR
NLDKHDAYYVVKRCTKSVMDHEQVCYNDLKDSGVVAEHDFFLYKEGRCEFGNVARKDLTKYTMMDLCYAIRNFDEKNCEV
LKEILVTLGACNESFFENKDWFDPVENEAIHEVYARLGPIVANAMLKCVAFCDAIVEKGYIGIITLDNQDLNGNFYDFGD
FVKTTPGFGCACVTSYYSYMMPLMGMTSCLESENFVKSDIYGADYKQYDLLAYDFTDHKEKLFHKYFKHWDRTYHPNCSD
CTSDECIIHCANFNTLFSMTIPSTAFGPLVRKVHIDGVPVVVTAGYHFKQLGIVWNLDVKLDTMKLSMTDLLRFVTDPTL
LVASSPALLDQRTVCFSIAALSTGVTYQTVKPGHFNKDFYDFITERGFFEEGSELTLKHFFFAQGGEAAMTDFNYYRYNR
VTVLDICQAQFVYKIVGKYFECYDGGCINAREVVVTNYDKSAGYPLNKFGKARLYYETLSYEEQDALFALTKRNVLPTMT
QMNLKYAISGKARARTVGGVSLLSTMTTRQYHQKHLKSIAATRNATVVIGSTKFYGGWDNMLKNLMRDVDNGCLMGWDYP
KCDRALPNMIRMASAMILGSKHVGCCTHSDRFYRLSNELAQVLTEVVHCTGGFYFKPGGTTSGDGTTAYANSAFNIFQAV
SANVNKLLGVDSNACNNVTVKSIQRKIYDNCYRSSSIDEEFVVEYFSYLRKHFSMMILSDDGVVCYNKDYADLGYVADIN
AFKATLYYQNNVFMSTSKCWVEPDLSVGPHEFCSQHTLQIVGPDGDYYLPYPDPSRILSAGVFVDDIVKTDNVIMLERYV
SLAIDAYPLTKHPKPAYQKVFYTLLDWVKHLQKNLNAGVLDSFSVTMLEEGQDKFWSEEFYASLYEKSTVLQAAGMCVVC
GSQTVLRCGDCLRRPLLCTKCAYDHVMGTKHKFIMSITPYVCSFNGCNVNDVTKLFLGGLSYYCMDHKPQLSFPLCANGN
VFGLYKSSAVGSEDVEDFNKLAVSDWTNVEDYKLANNVKESLKIFAAETVKAKEESVKSEYAYAILKEVIGPKEIVLQWE
ASKTKPPLNRNSVFTCFQISKDTKIQLGEFVFEQSEYGSDSVYYKSTSTYKLTPGMIFVLTSHNVSPLKATILVNQEKYN
TISKLYPVFNIAEAYNTLVPYYQMIGKQKFTTIQGPPGSGKSHCVIGLGLYYPQARIVYTACSHAAVDALCEKAAKNFNV
DRCSRIIPQRIRVDCYTGFKPNNTNAQYLFCTVNALPEASCDIVVVDEVSMCTNYDLSVINSRLSYKHIVYVGDPQQLPA
PRTLINKGVLQPQDYNVVTQRVCTLGPDVFLHKCYRCPAEIVKTVSALVYENKFVPVNPESKQCFKMFVKGQVQIESNSS
INNKQLEVVKAFLAHNPKWRKAVFISPYNSQNYVARRLLGLQTQTVDSAQGSEYDYVIYTQTSDTQHATNVNRFNVAITR
AKVGILCIMCDRTMYENLDFYELKDSKIGLQAKPETCGLFKDCSKSEQYIPPAYATTYMSLSDNFKTSDGLAVNIGTKDV
KYANVISYMGFRFEANIPGYHTLFCTRDFAMRNVRAWLGFDVEGAHVCGDNVGTNVPLQLGFSNGVDFVVQTEGCVVTEK
GNSIEVVKARAPPGEQFAHLIPLMRKGQPWHIVRRRIVQMVCDYFDGLSDILIFVLWAGGLELTTMRYFVKIGRPQKCEC
GKSATCYSSSQCVYACFKHALGCDYLYNPYCIDIQQWGYTGSLSMNHHEVCNIHRNEHVASGDAIMTRCLAIHDCFVKRV
DWSIVYPFIDNEEKINKAGRIVQSHVMKAALKIFNPAAIHDVGNPKGIRCATTPIPWFCYDRDPINNNVRCLEYDYMVHG
QMNGLMLFWNCNVDMYPEFSIVCRFDTRTRSKLSLEGCNGGALYVNNHAFHTPAYDRRAFAKLKPMPFFYYDDSNCELVD
GQPNYVPLKSNVCITKCNIGGAVCKKHAALYRAYVEDYNMFMQAGFTIWCPQNFDTYMLWHGFVNSKALQSLENVAFNVV
KKGAFTGLKGDLPTAVIADKIMVRDGPTDKCIFTNKTSLPTNVAFELYAKRKLGLTPPLTILRNLGVVATYKFVLWDYEA
ECPFSNFTKQVCSYTDLDSEVVTCFDNSIAGSFERFTTTKDAVLISNNAVKGLSAIKLQYGFLNDLPVSTVGNKPVTWYI
YVRKNGEYVEQIDSYYTHGRTFETFKPRSTMEEDFLSMDTTLFIQKYGLEDYGFEHVVFGDVSKTTIGGMHLLISQVRLA
KMGLFSVQEFMTNSDSTLKSCCITYADDPSSKNVCTYMDILLDDFVTIIKSLDLNVVSKVVDVIVDCKAWRWMLWCENSQ
IKTFYPQLQSAEWNPGYSMPTLYKIQRMCLERCNLYNYGAQVRLPDGITTNVVKYTQLCQYLNTTTVCVPHKMRVLHLGA
AGASGVAPGSTVLRRWLPDDAILVDNDLRDYVSDADFSVTGDCTSLYIEDKFDLLISDLYDGSTKSIDGENTSKDGFFTY
INGFIKEKLSLGGSAAIKITEFSWNKDLYELIQRFEYWTVFCTSVNTSSSEGFLIGINYLGPYCDKAIVDGNIMHANYIF
WRNSTIMALSHNSVLDTPKFKCRCNNALIVNLKEKELNEMVVGLLRKGKLLIRNNGKLLNFGNHLVNVP
>P0C6Y1 ~~~rep~~~Replicase polyprotein 1ab~~~
MASSLKQGVSPKPRDVILVSKDIPEQLCDALFFYTSHNPKDYADAFAVRQKFDRSLQTGKQFKFETVCGLFLLKGVDKIT
PGVPAKVLKATSKLADLEDIFGVSPLARKYRELLKTACQWSLTVEALDVRAQTLDEIFDPTEILWLQVAAKIHVSSMAMR
RLVGEVTAKVMDALGSNLSALFQIVKQQIARIFQKALAIFENVNELPQRIAALKMAFAKCARSITVVVVERTLVVKEFAG
TCLASINGAVAKFFEELPNGFMGSKIFTTLAFFKEAAVRVVENIPNAPRGTKGFEVVGNAKGTQVVVRGMRNDLTLLDQK
ADIPVEPEGWSAILDGHLCYVFRSGDRFYAAPLSGNFALSDVHCCERVVCLSDGVTPEINDGLILAAIYSSFSVSELVTA
LKKGEPFKFLGHKFVYAKDAAVSFTLAKAATIADVLRLFQSARVIAEDVWSSFTEKSFEFWKLAYGKVRNLEEFVKTYVC
KAQMSIVILAAVLGEDIWHLVSQVIYKLGVLFTKVVDFCDKHWKGFCVQLKRAKLIVTETFCVLKGVAQHCFQLLLDAIH
SLYKSFKKCALGRIHGDLLFWKGGVHKIVQDGDEIWFDAIDSVDVEDLGVVQEKSIDFEVCDDVTLPENQPGHMVQIEDD
GKNYMFFRFKKDENIYYTPMSQLGAINVVCKAGGKTVTFGETTVQEIPPPDVVPIKVSIECCGEPWNTIFKKAYKEPIEV
DTDLTVEQLLSVIYEKMCDDLKLFPEAPEPPPFENVALVDKNGKDLDCIKSCHLIYRDYESDDDIEEEDAEECDTDSGEA
EECDTNSECEEEDEDTKVLALIQDPASIKYPLPLDEDYSVYNGCIVHKDALDVVNLPSGEETFVVNNCFEGAVKPLPQKV
VDVLGDWGEAVDAQEQLCQQEPLQHTFEEPVENSTGSSKTMTEQVVVEDQELPVVEQDQDVVVYTPTDLEVAKETAEEVD
EFILIFAVPKEEVVSQKDGAQIKQEPIQVVKPQREKKAKKFKVKPATCEKPKFLEYKTCVGDLTVVIAKALDEFKEFCIV
NAANEHMTHGSGVAKAIADFCGLDFVEYCEDYVKKHGPQQRLVTPSFVKGIQCVNNVVGPRHGDNNLHEKLVAAYKNVLV
DGVVNYVVPVLSLGIFGVDFKMSIDAMREAFEGCTIRVLLFSLSQEHIDYFDVTCKQKTIYLTEDGVKYRSIVLKPGDSL
GQFGQVYAKNKIVFTADDVEDKEILYVPTTDKSILEYYGLDAQKYVIYLQTLAQKWNVQYRDNFLILEWRDGNCWISSAI
VLLQAAKIRFKGFLTEAWAKLLGGDPTDFVAWCYASCTAKVGDFSDANWLLANLAEHFDADYTNAFLKKRVSCNCGIKSY
ELRGLEACIQPVRATNLLHFKTQYSNCPTCGANNTDEVIEASLPYLLLFATDGPATVDCDEDAVGTVVFVGSTNSGHCYT
QAAGQAFDNLAKDRKFGKKSPYITAMYTRFAFKNETSLPVAKQSKGKSKSVKEDVSNLATSSKASFDNLTDFEQWYDSNI
YESLKVQESPDNFDKYVSFTTKEDSKLPLTLKVRGIKSVVDFRSKDGFIYKLTPDTDENSKAPVYYPVLDAISLKAIWVE
GNANFVVGHPNYYSKSLHIPTFWENAENFVKMGDKIGGVTMGLWRAEHLNKPNLERIFNIAKKAIVGSSVVTTQCGKLIG
KAATFIADKVGGGVVRNITDSIKGLCGITRGHFERKMSPQFLKTLMFFLFYFLKASVKSVVASYKTVLCKVVLATLLIVW
FVYTSNPVMFTGIRVLDFLFEGSLCGPYKDYGKDSFDVLRYCADDFICRVCLHDKDSLHLYKHAYSVEQVYKDAASGFIF
NWNWLYLVFLILFVKPVAGFVIICYCVKYLVLNSTVLQTGVCFLDWFVQTVFSHFNFMGAGFYFWLFYKIYIQVHHILYC
KDVTCEVCKRVARSNRQEVSVVVGGRKQIVHVYTNSGYNFCKRHNWYCRNCDDYGHQNTFMSPEVAGELSEKLKRHVKPT
AYAYHVVDEACLVDDFVNLKYKAATPGKDSASSAVKCFSVTDFLKKAVFLKEALKCEQISNDGFIVCNTQSAHALEEAKN
AAIYYAQYLCKPILILDQALYEQLVVEPVSKSVIDKVCSILSSIISVDTAALNYKAGTLRDALLSITKDEEAVDMAIFCH
NHDVDYTGDGFTNVIPSYGIDTGKLTPRDRGFLINADASIANLRVKNAPPVVWKFSELIKLSDSCLKYLISATVKSGVRF
FITKSGAKQVIACHTQKLLVEKKAGGIVSGTFKCFKSYFKWLLIFYILFTACCSGYYYMEVSKSFVHPMYDVNSTLHVEG
FKVIDKGVLREIVPEDTCFSNKFVNFDAFWGRPYDNSRNCPIVTAVIDGDGTVATGVPGFVSWVMDGVMFIHMTQTERKP
WYIPTWFNREIVGYTQDSIITEGSFYTSIALFSARCLYLTASNTPQLYCFNGDNDAPGALPFGSIIPHRVYFQPNGVRLI
VPQQILHTPYVVKFVSDSYCRGSVCEYTRPGYCVSLNPQWVLFNDEYTSKPGVFCGSTVRELMFSMVSTFFTGVNPNIYM
QLATMFLILVVVVLIFAMVIKFQGVFKAYATTVFITMLVWVINAFILCVHSYNSVLAVILLVLYCYASLVTSRNTVIIMH
CWLVFTFGLIVPTWLACCYLGFIIYMYTPLFLWCYGTTKNTRKLYDGNEFVGNYDLAAKSTFVIRGSEFVKLTNEIGDKF
EAYLSAYARLKYYSGTGSEQDYLQACRAWLAYALDQYRNSGVEIVYTPPRYSIGVSRLQSGFKKLVSPSSAVEKCIVSVS
YRGNNLNGLWLGDTIYCPRHVLGKFSGDQWNDVLNLANNHEFEVTTQHGVTLNVVSRRLKGAVLILQTAVANAETPKYKF
IKANCGDSFTIACAYGGTVVGLYPVTMRSNGTIRASFLAGACGSVGFNIEKGVVNFFYMHHLELPNALHTGTDLMGEFYG
GYVDEEVAQRVPPDNLVTNNIVAWLYAAIISVKESSFSLPKWLESTTVSVDDYNKWAGDNGFTPFSTSTAITKLSAITGV
DVCKLLRTIMVKNSQWGGDPILGQYNFEDELTPESVFNQIGGVRLQSSFVRKATSWFWSRCVLACFLFVLCAIVLFTAVP
LKFYVYAAVILLMAVLFISFTVKHVMAYMDTFLLPTLITVIIGVCAEVPFIYNTLISQVVIFLSQWYDPVVFDTMVPWMF
LPLVLYTAFKCVQGCYMNSFNTSLLMLYQFVKLGFVIYTSSNTLTAYTEGNWELFFELVHTTVLANVSSNSLIGLFVFKC
AKWMLYYCNATYLNNYVLMAVMVNCIGWLCTCYFGLYWWVNKVFGLTLGKYNFKVSVDQYRYMCLHKINPPKTVWEVFST
NILIQGIGGDRVLPIATVQAKLSDVKCTTVVLMQLLTKLNVEANSKMHVYLVELHNKILASDDVGECMDNLLGMLITLFC
IDSTIDLSEYCDDILKRSTVLQSVTQEFSHIPSYAEYERAKNLYEKVLVDSKNGGVTQQELAAYRKAANIAKSVFDRDLA
VQKKLDSMAERAMTTMYKEARVTDRRAKLVSSLHALLFSMLKKIDSEKLNVLFDQASSGVVPLATVPIVCSNKLTLVIPD
PETWVKCVEGVHVTYSTVVWNIDTVIDADGTELHPTSTGSGLTYCISGANIAWPLKVNLTRNGHNKVDVVLQNNELMPHG
VKTKACVAGVDQAHCSVESKCYYTNISGNSVVAAITSSNPNLKVASFLNEAGNQIYVDLDPPCKFGMKVGVKVEVVYLYF
IKNTRSIVRGMVLGAISNVVVLQSKGHETEEVDAVGILSLCSFAVDPADTYCKYVAAGNQPLGNCVKMLTVHNGSGFAIT
SKPSPTPDQDSYGGASVCLYCRAHIAHPGSVGNLDGRCQFKGSFVQIPTTEKDPVGFCLRNKVCTVCQCWIGYGCQCDSL
RQPKSSVQSVAGASDFDKNYLNRVRGSSEARLIPLASGCDPDVVKRAFDVCNKESAGMFQNLKRNCARFQELRDTEDGNL
EYLDSYFVVKQTTPSNYEHEKSCYEDLKSEVTADHDFFVFNKNIYNISRQRLTKYTMMDFCYALRHFDPKDCEVLKEILV
TYGCIEDYHPKWFEENKDWYDPIENSKYYVMLAKMGPIVRRALLNAIEFGNLMVEKGYVGVITLDNQDLNGKFYDFGDFQ
KTAPGAGVPVFDTYYSYMMPIIAMTDALAPERYFEYDVHKGYKSYDLLKYDYTEEKQELFQKYFKYWDQEYHPNCRDCSD
DRCLIHCANFNILFSTLIPQTSFGNLCRKVFVDGVPFIATCGYHSKELGVIMNQDNTMSFSKMGLSQLMQFVGDPALLVG
TSNNLVDLRTSCFSVCALTSGITHQTVKPGHFNKDFYDFAEKAGMFKEGSSIPLKHFFYPQTGNAAINDYDYYRYNRPTM
FDICQLLFCLEVTSKYFECYEGGCIPASQVVVNNLDKSAGYPFNKFGKARLYYEMSLEEQDQLFEITKKNVLPTITQMNL
KYAISAKNRARTVAGVSILSTMTNRQFHQKILKSIVNTRNASVVIGTTKFYGGWDNMLRNLIQGVEDPILMGWDYPKCDR
AMPNLLRIAASLVLARKHTNCCSWSERIYRLYNECAQVLSETVLATGGIYVKPGGTSSGDATTAYANSVFNIIQATSANV
ARLLSVITRDIVYDNIKSLQYELYQQVYRRVNFDPAFVEKFYSYLCKNFSLMILSDDGVVCYNNTLAKQGLVADISGFRE
VLYYQNNVFMADSKCWVEPDLEKGPHEFCSQHTMLVEVDGEPKYLPYPDPSRILGACVFVDDVDKTEPVAVMERYIALAI
DAYPLVHHENEEYKKVFFVLLAYIRKLYQELSQNMLMDYSFVMDIDKGSKFWEQEFYENMYRAPTTLQSCGVCVVCNSQT
ILRCGNCIRKPFLCCKCCYDHVMHTDHKNVLSINPYICSQLGCGEADVTKLYLGGMSYFCGNHKPKLSIPLVSNGTVFGI
YRANCAGSENVDDFNQLATTNWSIVEPYILANRCSDSLRRFAAETVKATEELHKQQFASAEVREVFSDRELILSWEPGKT
RPPLNRNYVFTGYHFTRTSKVQLGDFTFEKGEGKDVVYYKATSTAKLSVGDIFVLTSHNVVSLVAPTLCPQQTFSRFVNL
RPNVMVPECFVNNIPLYHLVGKQKRTTVQGPPGSGKSHFAIGLAVYFSSARVVFTACSHAAVDALCEKAFKFLKVDDCTR
IVPQRTTVDCFSKFKANDTGKKYIFSTINALPEVSCDILLVDEVSMLTNYELSFINGKINYQYVVYVGDPAQLPAPRTLL
NGSLSPKDYNVVTNLMVCVKPDIFLAKCYRCPKEIVDTVSTLVYDGKFIANNPESRECFKVIVNNGNSDVGHESGSAYNT
TQLEFVKDFVCRNKQWREAIFISPYNAMNQRAYRMLGLNVQTVDSSQGSEYDYVIFCVTADSQHALNINRFNVALTRAKR
GILVVMRQRDELYSALKFTELDSETSLQGTGLFKICNKEFSGVHPAYAVTTKALAATYKVNDELAALVNVEAGSEITYKH
LISLLGFKMSVNVEGCHNMFITRDEAIRNVRGWVGFDVEATHACGTNIGTNLPFQVGFSTGADFVVTPEGLVDTSIGNNF
EPVNSKAPPGEQFNHLRVLFKSAKPWHVIRPRIVQMLADNLCNVSDCVVFVTWCHGLELTTLRYFVKIGKEQVCSCGSRA
TTFNSHTQAYACWKHCLGFDFVYNPLLVDIQQWGYSGNLQFNHDLHCNVHGHAHVASVDAIMTRCLAINNAFCQDVNWDL
TYPHIANEDEVNSSCRYLQRMYLNACVDALKVNVVYDIGNPKGIKCVRRGDVNFRFYDKNPIVRNVKQFEYDYNQHKDKF
ADGLCMFWNCNVDCYPDNSLVCRYDTRNLSVFNLPGCNGGSLYVNKHAFYTPKFDRISFRNLKAMPFFFYDSSPCETIQV
DGVAQDLVSLATKDCITKCNIGGAVCKKHAQMYAEFVTSYNAAVTAGFTFWVTNKLNPYNLWKSFSALQSIDNIAYNMYK
GGHYDAIAGEMPTVITGDKVFVIDQGVEKAVFVNQTTLPTSVAFELYAKRNIRTLPNNRILKGLGVDVTNGFVIWDYANQ
TPLYRNTVKVCAYTDIEPNGLVVLYDDRYGDYQSFLAADNAVLVSTQCYKRYSYVEIPSNLLVQNGMPLKDGANLYVYKR
VNGAFVTLPNTINTQGRSYETFEPRSDIERDFLAMSEESFVERYGKDLGLQHILYGEVDKPQLGGLHTVIGMYRLLRANK
LNAKSVTNSDSDVMQNYFVLSDNGSYKQVCTVVDLLLDDFLELLRNILKEYGTNKSKVVTVSIDYHSINFMTWFEDGSIK
TCYPQLQSAWTCGYNMPELYKVQNCVMEPCNIPNYGVGITLPSGILMNVAKYTQLCQYLSKTTICVPHNMRVMHFGAGSD
KGVAPGSTVLKQWLPEGTLLVDNDIVDYVSDAHVSVLSDCNKYNTEHKFDLVISDMYTDNDSKRKHEGVIANNGNDDVFI
YLSSFLRNNLALGGSFAVKVTETSWHEVLYDIAQDCAWWTMFCTAVNASSSEAFLIGVNYLGASEKVKVSGKTLHANYIF
WRNCNYLQTSAYSIFDVAKFDLRLKATPVVNLKTEQKTDLVFNLIKCGKLLVRDVGNTSFTSDSFVCTM
>P0C6Y3 ~~~rep~~~Replicase polyprotein 1ab~~~
MASSLKQGVSPKLRDVILVSKDIPEQLCDALFFYTSHNPKDYADAFAVRQKFDRNLQTGKQFKFETVCGLFLLKGVDKIT
PGVPAKVLKATSKLADLEDIFGVSPFARKYRELLKTACQWSLTVETLDARAQTLDEIFDPTEILWLQVAAKIQVSAMAMR
RLVGEVTAKVMDALGSNMSALFQIFKQQIVRIFQKALAIFENVSELPQRIAALKMAFAKCAKSITVVVMERTLVVREFAG
TCLASINGAVAKFFEELPNGFMGAKIFTTLAFFREAAVKIVDNIPNAPRGTKGFEVVGNAKGTQVVVRGMRNDLTLLDQK
AEIPVESEGWSAILGGHLCYVFKSGDRFYAAPLSGNFALHDVHCCERVVCLSDGVTPEINDGLILAAIYSSFSVAELVAA
IKRGEPFKFLGHKFVYAKDAAVSFTLAKAATIADVLKLFQSARVKVEDVWSSLTEKSFEFWRLAYGKVRNLEEFVKTCFC
KAQMAIVILATVLGEGIWHLVSQVIYKVGGLFTKVVDFCEKYWKGFCAQLKRAKLIVTETLCVLKGVAQHCFQLLLDAIQ
FMYKSFKKCALGRIHGDLLFWKGGVHKIIQEGDEIWFDAIDSIDVEDLGVVQEKLIDFDVCDNVTLPENQPGHMVQIEDD
GKNYMFFRFKKDENIYYTPMSQLGAINVVCKAGGKTVTFGETTVQEIPPPDVVFIKVSIECCGEPWNTIFKKAYKEPIEV
ETDLTVEQLLSVVYEKMCDDLKLFPEAPEPPPFENVTLVDKNGKDLDCIKSCHLIYRDYESDDDIEEEDAEECDTDSGDA
EECDTNLECEEEDEDTKVLALIQDPASNKYPLPLDDDYSVYNGCIVHKDALDVVNLPSGEETFVVNNCFEGAVKALPQKV
IDVLGDWGEAVDAQEQLCQQESTRVISEKSVEGFTGSCDAMAEQAIVEEQEIVPVVEQSQDVVVFTPADLEVVKETAEEV
DEFILISAVPKEEVVSQEKEEPQVEQEPTLVVKAQREKKAKKFKVKPATCEKPKFLEYKTCVGDLAVVIAKALDEFKEFC
IVNAANEHMSHGGGVAKAIADFCGPDFVEYCADYVKKHGPQQKLVTPSFVKGIQCVNNVVGPRHGDSNLREKLVAAYKSV
LVGGVVNYVVPVLSSGIFGVDFKISIDAMREAFKGCAIRVLLFSLSQEHIDYFDATCKQKTIYLTEDGVKYRSVVLKPGD
SLGQFGQVFARNKVVFSADDVEDKEILFIPTTDKTILEYYGLDAQKYVTYLQTLAQKWDVQYRDNFVILEWRDGNCWISS
AIVLLQAAKIRFKGFLAEAWAKLLGGDPTDFVAWCYASCNAKVGDFSDANWLLANLAEHFDADYTNALLKKCVSCNCGVK
SYELRGLEACIQPVRAPNLLHFKTQYSNCPTCGASSTDEVIEASLPYLLLFATDGPATVDCDENAVGTVVFIGSTNSGHC
YTQADGKAFDNLAKDRKFGRKSPYITAMYTRFSLRSENPLLVVEHSKGKAKVVKEDVSNLATSSKASFDDLTDFEQWYDS
NIYESLKVQETPDNLDEYVSFTTKEDSKLPLTLKVRGIKSVVDFRSKDGFTYKLTPDTDENSKTPVYYPVLDSISLRAIW
VEGSANFVVGHPNYYSKSLRIPTFWENAESFVKMGYKIDGVTMGLWRAEHLNKPNLERIFNIAKKAIVGSSVVTTQCGKI
LVKAATYVADKVGDGVVRNITDRIKGLCGFTRGHFEKKMSLQFLKTLVFFFFYFLKASSKSLVSSYKIVLCKVVFATLLI
VWFIYTSNPVVFTGIRVLDFLFEGSLCGPYNDYGKDSFDVLRYCAGDFTCRVCLHDRDSLHLYKHAYSVEQIYKDAASGI
NFNWNWLYLVFLILFVKPVAGFVIICYCVKYLVLSSTVLQTGVGFLDWFVKTVFTHFNFMGAGFYFWLFYKIYVQVHHIL
YCKDVTCEVCKRVARSNRQEVSVVVGGRKQIVHVYTNSGYNFCKRHNWYCRNCDDYGHQNTFMSPEVAGELSEKLKRHVK
PTAYAYHVVYEACVVDDFVNLKYKAAIPGKDNASSAVKCFSVTDFLKKAVFLKEALKCEQISNDGFIVCNTQSAHALEEA
KNAAVYYAQYLCKPILILDQALYEQLIVEPVSKSVIDKVCSILSNIISVDTAALNYKAGTLRDALLSITKDEEAVDMAIF
CHNHEVEYTGDGFTNVIPSYGMDTDKLTPRDRGFLINADASIANLRVKNAPPVVWKFSDLIKLSDSCLKYLISATVKSGG
RFFITKSGAKQVISCHTQKLLVEKKAGGVINNTFKWFMSCFKWLFVFYILFTACCLGYYYMEMNKSFVHPMYDVNSTLHV
EGFKVIDKGVIREIVSEDNCFSNKFVNFDAFWGKSYENNKNCPIVTVVIDGDGTVAVGVPGFVSWVMDGVMFVHMTQTDR
RPWYIPTWFNREIVGYTQDSIITEGSFYTSIALFSARCLYLTASNTPQLYCFNGDNDAPGALPFGSIIPHRVYFQPNGVR
LIVPQQILHTPYIVKFVSDSYCRGSVCEYTKPGYCVSLDSQWVLFNDEYISKPGVFCGSTVRELMFNMVSTFFTGVNPNI
YIQLATMFLILVVIVLIFAMVIKFQGVFKAYATIVFTIMLVWVINAFVLCVHSYNSVLAVILLVLYCYASMVTSRNTAII
MHCWLVFTFGLIVPTWLACCYLGFILYMYTPLVFWCYGTTKNTRKLYDGNEFVGNYDLAAKSTFVIRGTEFVKLTNEIGD
KFEAYLSAYARLKYYSGTGSEQDYLQACRAWLAYALDQYRNSGVEVVYTPPRYSIGVSRLQAGFKKLVSPSSAVEKCIVS
VSYRGNNLNGLWLGDSIYCPRHVLGKFSGDQWGDVLNLANNHEFEVVTQNGVTLNVVSRRLKGAVLILQTAVANAETPKY
KFVKANCGDSFTIACSYGGTVIGLYPVTMRSNGTIRASFLAGACGSVGFNIEKGVVNFFYMHHLELPNALHTGTDLMGEF
YGGYVDEEVAQRVPPDNLVTNNIVAWLYAAIISVKESSFSQPKWLESTTVSIEDYNRWASDNGFTPFSTSTAITKLSAIT
GVDVCKLLRTIMVKSAQWGSDPILGQYNFEDELTPESVFNQVGGVRLQSSFVRKATSWFWSRCVLACFLFVLCAIVLFTA
VPLKFYVHAAVILLMAVLFISFTVKHVMAYMDTFLLPTLITVIIGVCAEVPFIYNTLISQVVIFLSQWYDPVVFDTMVPW
MLLPLVLYTAFKCVQGCYMNSFNTSLLMLYQFMKLGFVIYTSSNTLTAYTEGNWELFFELVHTIVLANVSSNSLIGLIVF
KCAKWMLYYCNATYFNNYVLMAVMVNGIGWLCTCYFGLYWWVNKVFGLTLGKYNFKVSVDQYRYMCLHKVNPPKTVWEVF
TTNILIQGIGGDRVLPIATVQSKLSDVKCTTVVLMQLLTKLNVEANSKMHAYLVELHNKILASDDVGECMDNLLGMLITL
FCIDSTIDLGEYCDDILKRSTVLQSVTQEFSHIPSYAEYERAKSIYEKVLADSKNGGVTQQELAAYRKAANIAKSVFDRD
LAVQKKLDSMAERAMTTMYKEARVTDRRAKLVSSLHALLFSMLKKIDSEKLNVLFDQANSGVVPLATVPIVCSNKLTLVI
PDPETWVKCVEGVHVTYSTVVWNIDCVTDADGTELHPTSTGSGLTYCISGDNIAWPLKVNLTRNGHNKVDVALQNNELMP
HGVKTKACVAGVDQAHCSVESKCYYTSISGSSVVAAITSSNPNLKVASFLNEAGNQIYVDLDPPCKFGMKVGDKVEVVYL
YFIKNTRSIVRGMVLGAISNVVVLQSKGHETEEVDAVGILSLCSFAVDPADTYCKYVAAGNQPLGNCVKMLTVHNGSGFA
ITSKPSPTPDQDSYGGASVCLYCRAHIAHPGGAGNLDGRCQFKGSFVQIPTTEKDPVGFCLRNKVCTVCQCWIGYGCQCD
SLRQPKPSVQSVAVASGFDKNYLNRVRGSSEARLIPLANGCDPDVVKRAFDVCNKESAGMFQNLKRNCARFQEVRDTEDG
NLEYCDSYFVVKQTTPSNYEHEKACYEDLKSEVTADHDFFVFNKNIYNISRQRLTKYTMMDFCYALRHFDPKDCEVLKEI
LVTYGCIEDYHPKWFEENKDWYDPIENPKYYAMLAKMGPIVRRALLNAIEFGNLMVEKGYVGVITLDNQDLNGKFYDFGD
FQKTAPGAGVPVFDTYYSYMMPIIAMTDALAPERYFEYDVHKGYKSYDLLKYDYTEEKQDLFQKYFKYWDQEYHPNCRDC
SDDRCLIHCANFNILFSTLVPQTSFGNLCRKVFVDGVPFIATCGYHSKELGVIMNQDNTMSFSKMGLSQLMQFVGDPALL
VGTSNKLVDLRTSCFSVCALASGITHQTVKPGHFNKDFYDFAEKAGMFKEGSSIPLKHFFYPQTGNAAINDYDYYRYNRP
TMFDIRQLLFCLEVTSKYFECYEGGCIPASQVVVNNLDKSAGYPFNKFGKARLYYEMSLEEQDQLFESTKKNVLPTITQM
NLKYAISAKNRARTVAGVSILSTMTNRQFHQKILKSIVNTRNAPVVIGTTKFYGGWDNMLRNLIQGVEDPILMGWDYPKC
DRAMPNLLRIAASLVLARKHTNCCTWSERVYRLYNECAQVLSETVLATGGIYVKPGGTSSGDATTAYANSVFNIIQATSA
NVARLLSVITRDIVYDDIKSLQYELYQQVYRRVNFDPAFVEKFYSYLCKNFSLMILSDDGVVCYNNTLAKQGLVADISGF
REVLYYQNNVFMADSKCWVEPDLEKGPHEFCSQHTMLVEVDGEPRYLPYPDPSRILCACVFVDDLDKTESVAVMERYIAL
AIDAYPLVHHENEEYKKVFFVLLSYIRKLYQELSQNMLMDYSFVMDIDKGSKFWEQEFYENMYRAPTTLQSCGVCVVCNS
QTILRCGNCIRKPFLCCKCCYDHVMHTDHKNVLSINPYICSQPGCGEADVTKLYLGGMSYFCGNHKPKLSIPLVSNGTVF
GIYRANCAGSENVDDFNQLATTNWSTVEPYILANRCVDSLRRFAAETVKATEELHKQQFASAEVREVLSDRELILSWEPG
KTRPPLNRNYVFTGFHFTRTSKVQLGDFTFEKGEGKDVVYYRATSTAKLSVGDIFVLTSHNVVSLIAPTLCPQQTFSRFV
NLRPNVMVPACFVNNIPLYHLVGKQKRTTVQGPPGSGKSHFAIGLAAYFSNARVVFTACSHAAVDALCEKAFKFLKVDDC
TRIVPQRTTIDCFSKFKANDTGKKYIFSTINALPEVSCDILLVDEVSMLTNYELSFINGKINYQYVVYVGDPAQLPAPRT
LLNGSLSPKDYNVVTNLMVCVKPDIFLAKCYRCPKEIVDTVSTLVYDGKFIANNPESRQCFKVIVNNGNSDVGHESGSAY
NITQLEFVKDFVCRNKEWREATFISPYNAMNQRAYRMLGLNVQTVDSSQGSEYDYVIFCVTADSQHALNINRFNVALTRA
KRGILVVMRQRDELYSALKFIELDSVASLQGTGLFKICNKEFSGVHPAYAVTTKALAATYKVNDELAALVNVEAGSEITY
KHLISLLGFKMSVNVEGCHNMFITRDEAIRNVRGWVGFDVEATHACGTNIGTNLPFQVGFSTGADFVVTPEGLVDTSIGN
NFEPVNSKAPPGEQFNHLRALFKSAKPWHVVRPRIVQMLADNLCNVSDCVVFVTWCHGLELTTLRYFVKIGKDQVCSCGS
RATTFNSHTQAYACWKHCLGFDFVYNPLLVDIQQWGYSGNLQFNHDLHCNVHGHAHVASADAIMTRCLAINNAFCQDVNW
DLTYPHIANEDEVNSSCRYLQRMYLNACVDALKVNVVYDIGNPKGIKCVRRGDLNFRFYDKNPIVPNVKQFEYDYNQHKD
KFADGLCMFWNCNVDCYPDNSLVCRYDTRNLSVFNLPGCNGGSLYVNKHAFHTPKFDRTSFRNLKAMPFFFYDSSPCETI
QLDGVAQDLVSLATKDCITKCNIGGAVCKKHAQMYADFVTSYNAAVTAGFTFWVTNNFNPYNLWKSFSALQSIDNIAYNM
YKGGHYDAIAGEMPTIVTGDKVFVIDQGVEKAVFFNQTILPTSVAFELYAKRNIRTLPNNRILKGLGVDVTNGFVIWDYT
NQTPLYRNTVKVCAYTDIEPNGLIVLYDDRYGDYQSFLAADNAVLVSTQCYKRYSYVEIPSNLLVQNGIPLKDGANLYVY
KRVNGAFVTLPNTLNTQGRSYETFEPRSDVERDFLDMSEESFVEKYGKELGLQHILYGEVDKPQLGGLHTVIGMCRLLRA
NKLNAKSVTNSDSDVMQNYFVLADNGSYKQVCTVVDLLLDDFLELLRNILKEYGTNKSKVVTVSIDYHSINFMAWFEDGI
IKTCYPQLQSAWTCGYNMPELYKVQNCVMEPCNIPNYGVGIALPSGIMMNVAKYTQLCQYLSKTTMCVPHNMRVMHFGAG
SDKGVAPGSTVLKQWLPEGTLLVDNDIVDYVSDAHVSVLSDCNKYKTEHKFDLVISDMYTDNDSKRKHEGVIANNGNDDV
FIYLSSFLRNNLALGGSFAVKVTETSWHEVLYDIAQDCAWWTMFCTAVNASSSEAFLVGVNYLGASEKVKVSGKTLHANY
IFWRNCNYLQTSAYSIFDVAKFDLRLKATPVVNLKTEQKTDLVFNLIKCGKLLVRDVGNTSFTSDSFVCTM
>K9N7C7 ~~~rep~~~Replicase polyprotein 1ab~~~
MSFVAGVTAQGARGTYRAALNSEKHQDHVSLTVPLCGSGNLVEKLSPWFMDGENAYEVVKAMLLKKEPLLYVPIRLAGHT
RHLPGPRVYLVERLIACENPFMVNQLAYSSSANGSLVGTTLQGKPIGMFFPYDIELVTGKQNILLRKYGRGGYHYTPFHY
ERDNTSCPEWMDDFEADPKGKYAQNLLKKLIGGDVTPVDQYMCGVDGKPISAYAFLMAKDGITKLADVEADVAARADDEG
FITLKNNLYRLVWHVERKDVPYPKQSIFTINSVVQKDGVENTPPHYFTLGCKILTLTPRNKWSGVSDLSLKQKLLYTFYG
KESLENPTYIYHSAFIECGSCGNDSWLTGNAIQGFACGCGASYTANDVEVQSSGMIKPNALLCATCPFAKGDSCSSNCKH
SVAQLVSYLSERCNVIADSKSFTLIFGGVAYAYFGCEEGTMYFVPRAKSVVSRIGDSIFTGCTGSWNKVTQIANMFLEQT
QHSLNFVGEFVVNDVVLAILSGTTTNVDKIRQLLKGVTLDKLRDYLADYDVAVTAGPFMDNAINVGGTGLQYAAITAPYV
VLTGLGESFKKVATIPYKVCNSVKDTLTYYAHSVLYRVFPYDMDSGVSSFSELLFDCVDLSVASTYFLVRLLQDKTGDFM
STIITSCQTAVSKLLDTCFEATEATFNFLLDLAGLFRIFLRNAYVYTSQGFVVVNGKVSTLVKQVLDLLNKGMQLLHTKV
SWAGSNISAVIYSGRESLIFPSGTYYCVTTKAKSVQQDLDVILPGEFSKKQLGLLQPTDNSTTVSVTVSSNMVETVVGQL
EQTNMHSPDVIVGDYVIISEKLFVRSKEEDGFAFYPACTNGHAVPTLFRLKGGAPVKKVAFGGDQVHEVAAVRSVTVEYN
IHAVLDTLLASSSLRTFVVDKSLSIEEFADVVKEQVSDLLVKLLRGMPIPDFDLDDFIDAPCYCFNAEGDASWSSTMIFS
LHPVECDEECSEVEASDLEEGESECISETSTEQVDVSHEISDDEWAAAVDEAFPLDEAEDVTESVQEEAQPVEVPVEDIA
QVVIADTLQETPVVSDTVEVPPQVVKLPSEPQTIQPEVKEVAPVYEADTEQTQSVTVKPKRLRKKRNVDPLSNFEHKVIT
ECVTIVLGDAIQVAKCYGESVLVNAANTHLKHGGGIAGAINAASKGAVQKESDEYILAKGPLQVGDSVLLQGHSLAKNIL
HVVGPDARAKQDVSLLSKCYKAMNAYPLVVTPLVSAGIFGVKPAVSFDYLIREAKTRVLVVVNSQDVYKSLTIVDIPQSL
TFSYDGLRGAIRKAKDYGFTVFVCTDNSANTKVLRNKGVDYTKKFLTVDGVQYYCYTSKDTLDDILQQANKSVGIISMPL
GYVSHGLDLIQAGSVVRRVNVPYVCLLANKEQEAILMSEDVKLNPSEDFIKHVRTNGGYNSWHLVEGELLVQDLRLNKLL
HWSDQTICYKDSVFYVVKNSTAFPFETLSACRAYLDSRTTQQLTIEVLVTVDGVNFRTVVLNNKNTYRSQLGCVFFNGAD
ISDTIPDEKQNGHSLYLADNLTADETKALKELYGPVDPTFLHRFYSLKAAVHKWKMVVCDKVRSLKLSDNNCYLNAVIMT
LDLLKDIKFVIPALQHAFMKHKGGDSTDFIALIMAYGNCTFGAPDDASRLLHTVLAKAELCCSARMVWREWCNVCGIKDV
VLQGLKACCYVGVQTVEDLRARMTYVCQCGGERHRQIVEHTTPWLLLSGTPNEKLVTTSTAPDFVAFNVFQGIETAVGHY
VHARLKGGLILKFDSGTVSKTSDWKCKVTDVLFPGQKYSSDCNVVRYSLDGNFRTEVDPDLSAFYVKDGKYFTSEPPVTY
SPATILAGSVYTNSCLVSSDGQPGGDAISLSFNNLLGFDSSKPVTKKYTYSFLPKEDGDVLLAEFDTYDPIYKNGAMYKG
KPILWVNKASYDTNLNKFNRASLRQIFDVAPIELENKFTPLSVESTPVEPPTVDVVALQQEMTIVKCKGLNKPFVKDNVS
FVADDSGTPVVEYLSKEDLHTLYVDPKYQVIVLKDNVLSSMLRLHTVESGDINVVAASGSLTRKVKLLFRASFYFKEFAT
RTFTATTAVGSCIKSVVRHLGVTKGILTGCFSFVKMLFMLPLAYFSDSKLGTTEVKVSALKTAGVVTGNVVKQCCTAAVD
LSMDKLRRVDWKSTLRLLLMLCTTMVLLSSVYHLYVFNQVLSSDVMFEDAQGLKKFYKEVRAYLGISSACDGLASAYRAN
SFDVPTFCANRSAMCNWCLISQDSITHYPALKMVQTHLSHYVLNIDWLWFAFETGLAYMLYTSAFNWLLLAGTLHYFFAQ
TSIFVDWRSYNYAVSSAFWLFTHIPMAGLVRMYNLLACLWLLRKFYQHVINGCKDTACLLCYKRNRLTRVEASTVVCGGK
RTFYITANGGISFCRRHNWNCVDCDTAGVGNTFICEEVANDLTTALRRPINATDRSHYYVDSVTVKETVVQFNYRRDGQP
FYERFPLCAFTNLDKLKFKEVCKTTTGIPEYNFIIYDSSDRGQESLARSACVYYSQVLCKSILLVDSSLVTSVGDSSEIA
TKMFDSFVNSFVSLYNVTRDKLEKLISTARDGVRRGDNFHSVLTTFIDAARGPAGVESDVETNEIVDSVQYAHKHDIQIT
NESYNNYVPSYVKPDSVSTSDLGSLIDCNAASVNQIVLRNSNGACIWNAAAYMKLSDALKRQIRIACRKCNLAFRLTTSK
LRANDNILSVRFTANKIVGGAPTWFNALRDFTLKGYVLATIIVFLCAVLMYLCLPTFSMVPVEFYEDRILDFKVLDNGII
RDVNPDDKCFANKHRSFTQWYHEHVGGVYDNSITCPLTVAVIAGVAGARIPDVPTTLAWVNNQIIFFVSRVFANTGSVCY
TPIDEIPYKSFSDSGCILPSECTMFRDAEGRMTPYCHDPTVLPGAFAYSQMRPHVRYDLYDGNMFIKFPEVVFESTLRIT
RTLSTQYCRFGSCEYAQEGVCITTNGSWAIFNDHHLNRPGVYCGSDFIDIVRRLAVSLFQPITYFQLTTSLVLGIGLCAF
LTLLFYYINKVKRAFADYTQCAVIAVVAAVLNSLCICFVASIPLCIVPYTALYYYATFYFTNEPAFIMHVSWYIMFGPIV
PIWMTCVYTVAMCFRHFFWVLAYFSKKHVEVFTDGKLNCSFQDAASNIFVINKDTYAALRNSLTNDAYSRFLGLFNKYKY
FSGAMETAAYREAAACHLAKALQTYSETGSDLLYQPPNCSITSGVLQSGLVKMSHPSGDVEACMVQVTCGSMTLNGLWLD
NTVWCPRHVMCPADQLSDPNYDALLISMTNHSFSVQKHIGAPANLRVVGHAMQGTLLKLTVDVANPSTPAYTFTTVKPGA
AFSVLACYNGRPTGTFTVVMRPNYTIKGSFLCGSCGSVGYTKEGSVINFCYMHQMELANGTHTGSAFDGTMYGAFMDKQV
HQVQLTDKYCSVNVVAWLYAAILNGCAWFVKPNRTSVVSFNEWALANQFTEFVGTQSVDMLAVKTGVAIEQLLYAIQQLY
TGFQGKQILGSTMLEDEFTPEDVNMQIMGVVMQSGVRKVTYGTAHWLFATLVSTYVIILQATKFTLWNYLFETIPTQLFP
LLFVTMAFVMLLVKHKHTFLTLFLLPVAICLTYANIVYEPTTPISSALIAVANWLAPTNAYMRTTHTDIGVYISMSLVLV
IVVKRLYNPSLSNFALALCSGVMWLYTYSIGEASSPIAYLVFVTTLTSDYTITVFVTVNLAKVCTYAIFAYSPQLTLVFP
EVKMILLLYTCLGFMCTCYFGVFSLLNLKLRAPMGVYDFKVSTQEFRFMTANNLTAPRNSWEAMALNFKLIGIGGTPCIK
VAAMQSKLTDLKCTSVVLLSVLQQLHLEANSRAWAFCVKCHNDILAATDPSEAFEKFVSLFATLMTFSGNVDLDALASDI
FDTPSVLQATLSEFSHLATFAELEAAQKAYQEAMDSGDTSPQVLKALQKAVNIAKNAYEKDKAVARKLERMADQAMTSMY
KQARAEDKKAKIVSAMQTMLFGMIKKLDNDVLNGIISNARNGCIPLSVIPLCASNKLRVVIPDFTVWNQVVTYPSLNYAG
ALWDITVINNVDNEIVKSSDVVDSNENLTWPLVLECTRASTSAVKLQNNEIKPSGLKTMVVSAGQEQTNCNTSSLAYYEP
VQGRKMLMALLSDNAYLKWARVEGKDGFVSVELQPPCKFLIAGPKGPEIRYLYFVKNLNNLHRGQVLGHIAATVRLQAGS
NTEFASNSSVLSLVNFTVDPQKAYLDFVNAGGAPLTNCVKMLTPKTGTGIAISVKPESTADQETYGGASVCLYCRAHIEH
PDVSGVCKYKGKFVQIPAQCVRDPVGFCLSNTPCNVCQYWIGYGCNCDSLRQAALPQSKDSNFLKRVRGSIVNARIEPCS
SGLSTDVVFRAFDICNYKAKVAGIGKYYKTNTCRFVELDDQGHHLDSYFVVKRHTMENYELEKHCYDLLRDCDAVAPHDF
FIFDVDKVKTPHIVRQRLTEYTMMDLVYALRHFDQNSEVLKAILVKYGCCDVTYFENKLWFDFVENPSVIGVYHKLGERV
RQAILNTVKFCDHMVKAGLVGVLTLDNQDLNGKWYDFGDFVITQPGSGVAIVDSYYSYLMPVLSMTDCLAAETHRDCDFN
KPLIEWPLTEYDFTDYKVQLFEKYFKYWDQTYHANCVNCTDDRCVLHCANFNVLFAMTMPKTCFGPIVRKIFVDGVPFVV
SCGYHYKELGLVMNMDVSLHRHRLSLKELMMYAADPAMHIASSNAFLDLRTSCFSVAALTTGLTFQTVRPGNFNQDFYDF
VVSKGFFKEGSSVTLKHFFFAQDGNAAITDYNYYSYNLPTMCDIKQMLFCMEVVNKYFEIYDGGCLNASEVVVNNLDKSA
GHPFNKFGKARVYYESMSYQEQDELFAMTKRNVIPTMTQMNLKYAISAKNRARTVAGVSILSTMTNRQYHQKMLKSMAAT
RGATCVIGTTKFYGGWDFMLKTLYKDVDNPHLMGWDYPKCDRAMPNMCRIFASLILARKHGTCCTTRDRFYRLANECAQV
LSEYVLCGGGYYVKPGGTSSGDATTAYANSVFNILQATTANVSALMGANGNKIVDKEVKDMQFDLYVNVYRSTSPDPKFV
DKYYAFLNKHFSMMILSDDGVVCYNSDYAAKGYIAGIQNFKETLYYQNNVFMSEAKCWVETDLKKGPHEFCSQHTLYIKD
GDDGYFLPYPDPSRILSAGCFVDDIVKTDGTLMVERFVSLAIDAYPLTKHEDIEYQNVFWVYLQYIEKLYKDLTGHMLDS
YSVMLCGDNSAKFWEEAFYRDLYSSPTTLQAVGSCVVCHSQTSLRCGTCIRRPFLCCKCCYDHVIATPHKMVLSVSPYVC
NAPGCGVSDVTKLYLGGMSYFCVDHRPVCSFPLCANGLVFGLYKNMCTGSPSIVEFNRLATCDWTESGDYTLANTTTEPL
KLFAAETLRATEEASKQSYAIATIKEIVGERQLLLVWEAGKSKPPLNRNYVFTGYHITKNSKVQLGEYIFERIDYSDAVS
YKSSTTYKLTVGDIFVLTSHSVATLTAPTIVNQERYVKITGLYPTITVPEEFASHVANFQKSGYSKYVTVQGPPGTGKSH
FAIGLAIYYPTARVVYTACSHAAVDALCEKAFKYLNIAKCSRIIPAKARVECYDRFKVNETNSQYLFSTINALPETSADI
LVVDEVSMCTNYDLSIINARIKAKHIVYVGDPAQLPAPRTLLTRGTLEPENFNSVTRLMCNLGPDIFLSMCYRCPKEIVS
TVSALVYNNKLLAKKELSGQCFKILYKGNVTHDASSAINRPQLTFVKNFITANPAWSKAVFISPYNSQNAVARSMLGLTT
QTVDSSQGSEYQYVIFCQTADTAHANNINRFNVAITRAQKGILCVMTSQALFESLEFTELSFTNYKLQSQIVTGLFKDCS
RETSGLSPAYAPTYVSVDDKYKTSDELCVNLNLPANVPYSRVISRMGFKLDATVPGYPKLFITREEAVRQVRSWIGFDVE
GAHASRNACGTNVPLQLGFSTGVNFVVQPVGVVDTEWGNMLTGIAARPPPGEQFKHLVPLMHKGAAWPIVRRRIVQMLSD
TLDKLSDYCTFVCWAHGFELTSASYFCKIGKEQKCCMCNRRAAAYSSPLQSYACWTHSCGYDYVYNPFFVDVQQWGYVGN
LATNHDRYCSVHQGAHVASNDAIMTRCLAIHSCFIERVDWDIEYPYISHEKKLNSCCRIVERNVVRAALLAGSFDKVYDI
GNPKGIPIVDDPVVDWHYFDAQPLTRKVQQLFYTEDMASRFADGLCLFWNCNVPKYPNNAIVCRFDTRVHSEFNLPGCDG
GSLYVNKHAFHTPAYDVSAFRDLKPLPFFYYSTTPCEVHGNGSMIEDIDYVPLKSAVCITACNLGGAVCRKHATEYREYM
EAYNLVSASGFRLWCYKTFDIYNLWSTFTKVQGLENIAFNFVKQGHFIGVEGELPVAVVNDKIFTKSGVNDICMFENKTT
LPTNIAFELYAKRAVRSHPDFKLLHNLQADICYKFVLWDYERSNIYGTATIGVCKYTDIDVNSALNICFDIRDNGSLEKF
MSTPNAIFISDRKIKKYPCMVGPDYAYFNGAIIRDSDVVKQPVKFYLYKKVNNEFIDPTECIYTQSRSCSDFLPLSDMEK
DFLSFDSDVFIKKYGLENYAFEHVVYGDFSHTTLGGLHLLIGLYKKQQEGHIIMEEMLKGSSTIHNYFITETNTAAFKAV
CSVIDLKLDDFVMILKSQDLGVVSKVVKVPIDLTMIEFMLWCKDGQVQTFYPRLQASADWKPGHAMPSLFKVQNVNLERC
ELANYKQSIPMPRGVHMNIAKYMQLCQYLNTCTLAVPANMRVIHFGAGSDKGIAPGTSVLRQWLPTDAIIIDNDLNEFVS
DADITLFGDCVTVRVGQQVDLVISDMYDPTTKNVTGSNESKALFFTYLCNLINNNLALGGSVAIKITEHSWSVELYELMG
KFAWWTVFCTNANASSSEGFLLGINYLGTIKENIDGGAMHANYIFWRNSTPMNLSTYSLFDLSKFQLKLKGTPVLQLKES
QINELVISLLSQGKLLIRDNDTLSVSTDVLVNTYRKLR
>P0C6Y4 ~~~rep~~~Replicase polyprotein 1ab~~~
MASNHVTLAFANDAEISAFGFCTASEAVSYYSEAAASGFMQCRFVSLDLADTVEGLLPEDYVMVVIGTTKLSAYVDTFGS
RPRNICGWLLFSNCNYFLEELELTFGRRGGNIVPVDQYMCGADGKPVLQESEWEYTDFFADSEDGQLNIAGITYVKAWIV
ERSDVSYASQNLTSIKSITYCSTYEHTFLDGTAMKVARTPKIKKNVVLSEPLATIYREIGSPFVDNGSDARSIIRRPVFL
HAFVKCKCGSYHWTVGDWTSYVSTCCGFKCKPVLVASCSAMPGSVVVTRAGAGTGVKYYNNMFLRHVADIDGLAFWRILK
VQSKDDLACSGKFLEHHEEGFTDPCYFLNDSSLATKLKFDILSGKFSDEVKQAIIAGHVVVGSALVDIVDDALGQPWFIR
KLGDLASAPWEQLKAVVRGLGLLSDEVVLFGKRLSCATLSIVNGVFEFLADVPEKLAAAVTVFVNFLNEFFESACDCLKV
GGKTFNKVGSYVLFDNALVKLVKAKARGPRQAGICEVRYTSLVVGSTTKVVSKRVENANVNLVVVDEDVTLNTTGRTVVV
DGLAFFESDGFYRHLADADVVIEHPVYKSACELKPVFECDPIPDFPLPVAASVAELCVQTDLLLKNYNTPYKTYSCVVRG
DKCCITCTLQFKAPSYVEDAVNFVDLCTKNIGTAGFHEFYITAHEQQDLQGFLTTCCTMSGFECFMPTIPQCPAVLEEID
GGSIWRSFITGLNTMWDFCKRLKVSFGLDGIVVTVARKFKRLGALLAEMYNTYLSTVVENLVLAGVSFKYYATSVPKIVL
GGCFHSVKSVFASVFQIPVQAGIEKFKVFLNCVHPVVPRVIETSFVELEETTFKPPALNGGIAIVDGFAFYYDGTLYYPT
DGNSVVPICFKKKGGGDVKFSDEVSVKTIDPVYKVSLEFEFESETIMAVLNKAVGNRIKVTGGWDDVVEYINVAIEVLKD
HVEVPKYYIYDEEGGTDPNLPVMVSQWPLNDDTISQDLLDVEVVTDAPIDSEGDEVDSSAPEKVADVANSEPGDDGLPVA
PETNVESEVEEVAATLSFIKDTPSTVTKDPFAFDFVSYGGLKVLRQSHNNCWVTSTLVQLQLLGIVDDPAMELFSAGRVG
PMVRKCYESQKAILGSLGDVSACLESLTKDLHTLKITCSVVCGCGTGERIYEGCAFRMTPTLEPFPYGACAQCAQVLMHT
FKSIVGTGIFCRDTTALSLDSLVVKPLCAAAFIGKDSGHYVTNFYDAAMAIDGYGRHQIKYDTLNTICVKDVNWTAPLVP
AVDSVVEPVVKPFYSYKNVDFYQGDFSDLVKLPCDFVVNAANEKLSHGGGIAKAIDVYTKGMLQKCSNDYIKAHGPIKVG
RGVMLEALGLKVFNVVGPRKGKHAPELLVKAYKSVFANSGVALTPLISVGIFSVPLEESLSAFLACVGDRHCKCFCYGDK
EREAIIKYMDGLVDAIFKEALVDTTPVQEDVQQVSQKPVLPNFEPFRIEGAHAFYECNPEGLMSLGADKLVLFTNSNLDF
CSVGKCLNDVTSGALLEAINVFKKSNKTVPAGNCVTLDCANMISITMVVLPFDGDANYDKNYARAVVKVSKLKGKLVLAV
DDATLYSKLSHLSVLGFVSTPDDVERFYANKSVVIKVTEDTRSVKAVKVESTATYGQQIGPCLVNDTVVTDNKPVVADVV
AKVVPNANWDSHYGFDKAGEFHMLDHTGFTFPSEVVNGRRVIKTTDNNCWVNVTCLQLQFARFRFKSAGLQAMWESYCTG
DVAMFVHWLYWLTGVDKGQPSDSENALNMLSKYIVPAGSVTIERVTHDGCCCSKRVVTAPVVNASVLKLGVEDGLCPHGL
NYIGKVVVVKGTTIVVNVGKPVVAPSHLFLKGVSYTTFLDNGNGVVGHYTVFDHGTGMVHDGDAFVPGDLNVSPVTNVVV
SEQTAVVIKDPVKKAELDATKLLDTMNYASERFFSFGDFMSRNLITVFLYILSILGLCFRAFRKRDVKVLAGVPQRTGII
LRKSMRYNAKALGVFFKLKLYWFKVLGKFSLGIYALYALLFMTIRFTPIGSPVCDDVVAGYANSSFDKNEYCNSVICKVC
LYGYQELSDFSHTQVVWQHLRDPLIGNVMPFFYLAFLAIFGGVYVKAITLYFIFQYLNSLGVFLGLQQSIWFLQLVPFDV
FGDEIVVFFIVTRVLMFIKHVCLGCDKASCVACSKSARLKRVPVQTIFQGTSKSFYVHANGGSKFCKKHNFFCLNCDSYG
PGCTFINDVIATEVGNVVKLNVQPTGPATILIDKVEFSNGFYYLYSGDTFWKYNFDITDSKYTCKEALKNCSIITDFIVF
NNNGSNVNQVKNACVYFSQMLCKPVKLVDSALLASLSVDFGASLHSAFVSVLSNSFGKDLSSCNDMQDCKSTLGFDDVPL
DTFNAAVAEAHRYDVLLTDMSFNNFTTSYAKPEEKFPVHDIATCMRVGAKIVNHNVLVKDSIPVVWLVRDFIALSEETRK
YIIRTTKVKGITFMLTFNDCRMHTTIPTVCIANKKGAGLPSFSKVKKFFWFLCLFIVAAFFALSFLDFSTQVSSDSDYDF
KYIESGQLKTFDNPLSCVHNVFINFDQWHDAKFGFTPVNNPSCPIVVGVSDEARTVPGIPAGVYLAGKTLVFAINTIFGT
SGLCFDASGVADKGACIFNSACTTLSGLGGTAVYCYKNGLVEGAKLYSELAPHSYYKMVDGNAVSLPEIISRGFGIRTIR
TKAMTYCRVGQCVQSAEGVCFGADRFFVYNAESGSDFVCGTGLFTLLMNVISVFSKTVPVTVLSGQILFNCIIAFVAVAV
CFLFTKFKRMFGDMSVGVFTVGACTLLNNVSYIVTQNTLGMLGYATLYFLCTKGVRYMWIWHLGFLISYILIAPWWVLMV
YAFSAIFEFMPNLFKLKVSTQLFEGDKFVGSFENAAAGTFVLDMHAYERLANSISTEKLRQYASTYNKYKYYSGSASEAD
YRLACFAHLAKAMMDYASNHNDTLYTPPTVSYNSTLQAGLRKMAQPSGVVEKCIVRVCYGNMALNGLWLGDIVMCPRHVI
ASSTTSTIDYDYALSVLRLHNFSISSGNVFLGVVSATMRGALLQIKVNQNNVHTPKYTYRTVRPGESFNILACYDGAAAG
VYGVNMRSNYTIRGSFINGACGSPGYNINNGTVEFCYLHQLELGSGCHVGSDLDGVMYGGYEDQPTLQVEGASSLFTENV
LAFLYAALINGSTWWLSSSRIAVDRFNEWAVHNGMTTVGNTDCFSILAAKTGVDVQRLLASIQSLHKNFGGKQILGHTSL
TDEFTTGEVVRQMYGVNLQGGYVSRACRNVLLVGSFLTFFWSELVSYTKFFWVNPGYVTPMFACLSLLSSLLMFTLKHKT
LFFQVFLIPALIVTSCINLAFDVEVYNYLAEHFDYHVSLMGFNAQGLVNIFVCFVVTILHGTYTWRFFNTPASSVTYVVA
LLTAAYNYFYASDILSCAMTLFASVTGNWFVGAVCYKVAVYMALRFPTFVAIFGDIKSVMFCYLVLGYFTCCFYGILYWF
NRFFKVSVGVYDYTVSAAEFKYMVANGLRAPTGTLDSLLLSAKLIGIGGERNIKISSVQSKLTDIKCSNVVLLGCLSSMN
VSANSTEWAYCVDLHNKINLCNDPEKAQEMLLALLAFFLSKNSAFGLDDLLESYFNDNSMLQSVASTYVGLPSYVIYENA
RQQYEDAVNNGSPPQLVKQLRHAMNVAKSEFDREASTQRKLDRMAEQAAAQMYKEARAVNRKSKVVSAMHSLLFGMLRRL
DMSSVDTILNLAKDGVVPLSVIPAVSATKLNIVTSDIDSYNRIQREGCVHYAGTIWNIIDIKDNDGKVVHVKEVTAQNAE
SLSWPLVLGCERIVKLQNNEIIPGKLKQRSIKAEGDGIVGEGKALYNNEGGRTFMYAFISDKPDLRVVKWEFDGGCNTIE
LEPPRKFLVDSPNGAQIKYLYFVRNLNTLRRGAVLGYIGATVRLQAGKQTEQAINSSLLTLCAFAVDPAKTYIDAVKSGH
KPVGNCVKMLANGSGNGQAVTNGVEASTNQDSYGGASVCLYCRAHVEHPSMDGFCRLKGKYVQVPLGTVDPIRFVLENDV
CKVCGCWLSNGCTCDRSIMQSTDMAYLNRVRGSSAARLEPCNGTDTQHVYRAFDIYNKDVACLGKFLKVNCVRLKNLDKH
DAFYVVKRCTKSAMEHEQSIYSRLEKCGAIAEHDFFTWKDGRAIYGNVCRKDLTEYTMMDLCYALRNFDENNCDVLKSIL
IKVGACEESYFNNKVWFDPVENEDIHRVYALLGTIVARAMLKCVKFCDAMVEQGIVGVVTLDNQDLNGDFYDFGDFTCSI
KGMGVPICTSYYSYMMPVMGMTNCLASECFVKSDIFGEDFKSYDLLEYDFTEHKTALFNKYFKYWGLQYHPNCVDCSDEQ
CIVHCANFNTLFSTTIPITAFGPLCRKCWIDGVPLVTTAGYHFKQLGIVWNNDLNLHSSRLSINELLQFCSDPALLIASS
PALVDQRTVCFSVAALGTGMTNQTVKPGHFNKEFYDFLLEQGFFSEGSELTLKHFFFAQKVDAAVKDFDYYRYNRPTVLD
ICQARVVYQIVQRYFDIYEGGCITAKEVVVTNLNKSAGYPLNKFGKAGLYYESLSYEEQDELYAYTKRNILPTMTQLNLK
YAISGKERARTVGGVSLLSTMTTRQYHQKHLKSIVNTRGASVVIGTTKFYGGWDNMLKNLIDGVENPCLMGWDYPKCDRA
LPNMIRMISAMILGSKHTTCCSSTDRFFRLCNELAQVLTEVVYSNGGFYLKPGGTTSGDATTAYANSVFNIFQAVSANVN
KLLSVDSNVCHNLEVKQLQRKLYECCYRSTIVDDQFVVEYYGYLRKHFSMMILSDDGVVCYNNDYASLGYVADLNAFKAV
LYYQNNVFMSASKCWIEPDINKGPHEFCSQHTMQIVDKEGTYYLPYPDPSRILSAGVFVDDVVKTDAVVLLERYVSLAID
AYPLSKHENPEYKKVFYVLLDWVKHLYKTLNAGVLESFSVTLLEDSTAKFWDESFYANMYEKSAVLQSAGLCVVCGSQTV
LRCGDCLRRPMLCTKCAYDHVIGTTHKFILAITPYVCCASDCGVNDVTKLYLGGLSYWCHEHKPRLAFPLCSAGNVFGLY
KNSATGSPDVEDFNRIATSDWTDVSDYRLANDVKDSLRLFAAETIKAKEESVKSSYACATLHEVVGPKELLLKWEVGRPK
PPLNRNSVFTCYHITKNTKFQIGEFVFEKAEYDNDAVTYKTTATTKLVPGMVFVLTSHNVQPLRAPTIANQERYSTIHKL
HPAFNIPEAYSSLVPYYQLIGKQKITTIQGPPGSGKSHCVIGLGLYYPGARIVFTACSHAAVDSLCVKASTAYSNDKCSR
IIPQRARVECYDGFKSNNTSAQYLFSTVNALPECNADIVVVDEVSMCTNYDLSVINQRISYRHVVYVGDPQQLPAPRVMI
SRGTLEPKDYNVVTQRMCALKPDVFLHKCYRCPAEIVRTVSEMVYENQFIPVHPDSKQCFKIFCKGNVQVDNGSSINRRQ
LDVVRMFLAKNPRWSKAVFISPYNSQNYVASRLLGLQIQTVDSSQGSEYDYVIYAQTSDTAHASNVNRFNVAITRAKKGI
LCIMCDRSLFDLLKFFELKLSDLQANEGCGLFKDCSRGDDLLPPSHANTFMSLADNFKTDQYLAVQIGVNGPIKYEHVIS
FMGFRFDINIPNHHTLFCTRDFAMRNVRGWLGFDVEGAHVVGSNVGTNVPLQLGFSNGVDFVVRPEGCVVTESGDYIKPV
RARAPPGEQFAHLLPLLKRGQPWDVVRKRIVQMCSDYLANLSDILIFVLWAGGLELTTMRYFVKIGPSKSCDCGKVATCY
NSALHTYCCFKHALGCDYLYNPYCIDIQQWGYKGSLSLNHHEHCNVHRNEHVASGDAIMTRCLAIHDCFVKNVDWSITYP
FIGNEAVINKSGRIVQSHTMRSVLKLYNPKAIYDIGNPKGIRCAVTDAKWFCFDKNPTNSNVKTLEYDYITHGQFDGLCL
FWNCNVDMYPEFSVVCRFDTRCRSPLNLEGCNGGSLYVNNHAFHTPAFDKRAFAKLKPMPFFFYDDTECDKLQDSINYVP
LRASNCITKCNVGGAVCSKHCAMYHSYVNAYNTFTSAGFTIWVPTSFDTYNLWQTFSNNLQGLENIAFNVLKKGSFVGDE
GELPVAVVNDKVLVRDGTVDTLVFTNKTSLPTNVAFELYAKRKVGLTPPITILRNLGVVCTSKCVIWDYEAERPLTTFTK
DVCKYTDFEGDVCTLFDNSIVGSLERFSMTQNAVLMSLTAVKKLTGIKLTYGYLNGVPVNTHEDKPFTWYIYTRKNGKFE
DYPDGYFTQGRTTADFSPRSDMEKDFLSMDMGLFINKYGLEDYGFEHVVYGDVSKTTLGGLHLLISQVRLACMGVLKIDE
FVSSNDSTLKSCTVTYADNPSSKMVCTYMDLLLDDFVSILKSLDLSVVSKVHEVMVDCKMWRWMLWCKDHKLQTFYPQLQ
ASEWKCGYSMPSIYKIQRMCLEPCNLYNYGAGVKLPDGIMFNVVKYTQLCQYLNSTTMCVPHHMRVLHLGAGSDKGVAPG
TAVLRRWLPLDAIIVDNDSVDYVSDADYSVTGDCSTLYLSDKFDLVISDMYDGKIKSCDGENVSKEGFFPYINGVITEKL
ALGGTVAIKVTEFSWNKKLYELIQKFEYWTMFCTSVNTSSSEAFLIGVHYLGDFASGAVIDGNTMHANYIFWRNSTIMTM
SYNSVLDLSKFNCKHKATVVVNLKDSSISDVVLGLLKNGKLLVRNNDAICGFSNHLVNVNK
>P0C6X7 ~~~rep~~~Replicase polyprotein 1ab~~~
MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTN
HGHKVVELVAEMDGIQYGRSGITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQN
WNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAW
FTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTL
MKCNHCDEVSWQTCDFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKG
GRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDLLEILSRERVNINIVGDFHLNEEVAIILASF
SASTSAFIDTIKSLDYKSFKTIVESCGNYKVTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFARTLDAANHSIP
DLQRAAVTILDGISEQSLRLVDAMVYTSDLLTNSVIIMAYVTGGLVQQTSQWLSNLLGTTVEKLRPIFEWIEAKLSAGVE
FLKDAWEILKFLITGVFDIVKGQIQVASDNIKDCVKCFIDVVNKALEMCIDQVTIAGAKLRSLNLGEVFIAQSKGLYRQC
IRGKEQLQLLMPLKAPKEVTFLEGDSHDTVLTSEEVVLKNGELEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQYC
ALSPGLLATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERVDKVLNEKCSVYTVESGTEVTEFACVVAEAV
VKTLQPVSDLLTNMGIDLDEWSVATFYLFDDAGEENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHEYGTEDDYQGL
PLEFGASAETVRVEEEEEEDWLDDTTEQSEIEPEPEPTPEEPVNQFTGYLKLTDNVAIKCVDIVKEAQSANPMVIVNAAN
IHLKHGGGVAGALNKATNGAMQKESDDYIKLNGPLTVGGSCLLSGHNLAKKCLHVVGPNLNAGEDIQLLKAAYENFNSQD
ILLAPLLSAGIFGAKPLQSLQVCVQTVRTQVYIAVNDKALYEQVVMDYLDNLKPRVEAPKQEEPPNTEDSKTEEKSVVQK
PVDVKPKIKACIDEVTTTLEETKFLTNKLLLFADINGKLYHDSQNMLRGEDMSFLEKDAPYMVGDVITSGDITCVVIPSK
KAGGTTEMLSRALKKVPVDEYITTYPGQGCAGYTLEEAKTALKKCKSAFYVLPSEAPNAKEEILGTVSWNLREMLAHAEE
TRKLMPICMDVRAIMATIQRKYKGIKIQEGIVDYGVRFFFYTSKEPVASIITKLNSLNEPLVTMPIGYVTHGFNLEEAAR
CMRSLKAPAVVSVSSPDAVTTYNGYLTSSSKTSEEHFVETVSLAGSYRDWSYSGQRTELGVEFLKRGDKIVYHTLESPVE
FHLDGEVLSLDKLKSLLSLREVKTIKVFTTVDNTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPS
DDTLRSEAFEYYHTLDESFLGRYMSALNHTKKWKFPQVGGLTSIKWADNNCYLSSVLLALQQLEVKFNAPALQEAYYRAR
AGDAANFCALILAYSNKTVGELGDVRETMTHLLQHANLESAKRVLNVVCKHCGQKTTTLTGVEAVMYMGTLSYDNLKTGV
SIPCVCGRDATQYLVQQESSFVMMSAPPAEYKLQQGTFLCANEYTGNYQCGHYTHITAKETLYRIDGAHLTKMSEYKGPV
TDVFYKETSYTTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNASFDNFKLTCSNTKFADDLNQ
MTGFTKPASRELSVTFFPDLNGDVVAIDYRHYSASFKKGAKLLHKPIVWHINQATTKTTFKPNTWCLRCLWSTKPVDTSN
SFEVLAVEDTQGMDNLACESQQPTSEEVVENPTIQKEVIECDVKTTEVVGNVILKPSDEGVKVTQELGHEDLMAAYVENT
SITIKKPNELSLALGLKTIATHGIAAINSVPWSKILAYVKPFLGQAAITTSNCAKRLAQRVFNNYMPYVFTLLFQLCTFT
KSTNSRIRASLPTTIAKNSVKSVAKLCLDAGINYVKSPKFSKLFTIAMWLLLLSICLGSLICVTAAFGVLLSNFGAPSYC
NGVRELYLNSSNVTTMDFCEGSFPCSICLSGLDSLDSYPALETIQVTISSYKLDLTILGLAAEWVLAYMLFTKFFYLLGL
SAIMQVFFGYFASHFISNSWLMWFIISIVQMAPVSAMVRMYIFFASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVEC
TTIVNGMKRSFYVYANGGRGFCKTHNWNCLNCDTFCTGSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVAVKNGALHL
YFDKAGQKTYERHPLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYYSQLMCQPILLLDQALVSDVGDS
TEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSELAKGVALDGVLSTFVSAARQGVVDTDVDTKDVIECLKLSHHSDL
EVTGDSCNNFMLTYNKVENMTPRDLGACIDCNARHINAQVAKSHNVSLIWNVKDYMSLSEQLRKQIRSAAKKNNIPFRLT
CATTRQVVNVITTKISLKGGKIVSTCFKLMLKATLLCVLAALVCYIVMPVHTLSIHDGYTNEIIGYKAIQDGVTRDIIST
DDCFANKHAGFDAWFSQRGGSYKNDKSCPVVAAIITREIGFIVPGLPGTVLRAINGDFLHFLPRVFSAVGNICYTPSKLI
EYSDFATSACVLAAECTIFKDAMGKPVPYCYDTNLLEGSISYSELRPDTRYVLMDGSIIQFPNTYLEGSVRVVTTFDAEY
CRHGTCERSEVGICLSTSGRWVLNNEHYRALSGVFCGVDAMNLIANIFTPLVQPVGALDVSASVVAGGIIAILVTCAAYY
FMKFRRVFGEYNHVVAANALLFLMSFTILCLVPAYSFLPGVYSVFYLYLTFYFTNDVSFLAHLQWFAMFSPIVPFWITAI
YVFCISLKHCHWFFNNYLRKRVMFNGVTFSTFEEAALCTFLLNKEMYLKLRSETLLPLTQYNRYLALYNKYKYFSGALDT
TSYREAACCHLAKALNDFSNSGADVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDTVYCPR
HVICTAEDMLNPNYEDLLIRKSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQTFSVLACYNG
SPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTI
TLNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCAALKELLQNGMNGRT
ILGSTILEDEFTPFDVVRQCSGVTFQGKFKKIVKGTHHWMLLTFLTSLLILVQSTQWSLFFFVYENAFLPFTLGIMAIAA
CAMLLVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRIMTWLELADTSLSGYRLKDCVMYASALVLLILMTARTVYDD
AARRVWTLMNVITLVYKVYYGNALDQAISMWALVISVTSNYSGVVTTIMFLARAIVFVCVEYYPLLFITGNTLQCIMLVY
CFLGYCCCCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKSSIDAFKLNIKLLGIGGKPCIKVATVQSKMS
DVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILLAKDTTEAFEKMVSLLSVLLSMQGAVDINRLCEEMLDNRATLQA
IASEFSSLPSYAAYATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRKLEKMADQAMTQMYKQARSEDKRA
KVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAKLMVVVPDYGTYKNTCDGNTFTYASALWEIQQVVD
ADSKIVQLSEINMDNSPNLAWPLIVTALRANSAVKLQNNELSPVALRQMSCAAGTTQTACTDDNALAYYNNSKGGRFVLA
LLSDHQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVLGSLAATVRLQAGNATEVPAN
STVLSFCAFAVDPAKAYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQESFGGASCCLYCRCHIDHPNPKGFC
DLKGKYVQIPTTCANDPVGFTLRNTVCTVCGMWKGYGCSCDQLREPLMQSADASTFLNRVCGVSAARLTPCGTGTSTDVV
YRAFDIYNEKVAGFAKFLKTNCCRFQEKDEEGNLLDSYFVVKRHTMSNYQHEETIYNLVKDCPAVAVHDFFKFRVDGDMV
PHISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKKDWYDFVENPDILRVYANLGERVRQSLLKTVQ
FCDAMRDAGIVGVLTLDNQDLNGNWYDFGDFVQVAPGCGVPIVDSYYSLLMPILTLTRALAAESHMDADLAKPLIKWDLL
KYDFTEERLCLFDRYFKYWDQTYHPNCINCLDDRCILHCANFNVLFSTVFPPTSFGPLVRKIFVDGVPFVVSTGYHFREL
GVVHNQDVNLHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAALTNNVAFQTVKPGNFNKDFYDFAVSKGFFKE
GSSVELKHFFFAQDGNAAISDYDYYRYNLPTMCDIRQLLFVVEVVDKYFDCYDGGCINANQVIVNNLDKSAGFPFNKWGK
ARLYYDSMSYEDQDALFAYTKRNVIPTITQMNLKYAISAKNRARTVAGVSICSTMTNRQFHQKLLKSIAATRGATVVIGT
SKFYGGWHNMLKTVYSDVETPHLMGWDYPKCDRAMPNMLRIMASLVLARKHNTCCNLSHRFYRLANECAQVLSEMVMCGG
SLYVKPGGTSSGDATTAYANSVFNICQAVTANVNALLSTDGNKIADKYVRNLQHRLYECLYRNRDVDHEFVDEFYAYLRK
HFSMMILSDDAVVCYNSNYAAQGLVASIKNFKAVLYYQNNVFMSEAKCWTETDLTKGPHEFCSQHTMLVKQGDDYVYLPY
PDPSRILGAGCFVDDIVKTDGTLMIERFVSLAIDAYPLTKHPNQEYADVFHLYLQYIRKLHDELTGHMLDMYSVMLTNDN
TSRYWEPEFYEAMYTPHTVLQAVGACVLCNSQTSLRCGACIRRPFLCCKCCYDHVISTSHKLVLSVNPYVCNAPGCDVTD
VTQLYLGGMSYYCKSHKPPISFPLCANGQVFGLYKNTCVGSDNVTDFNAIATCDWTNAGDYILANTCTERLKLFAAETLK
ATEETFKLSYGIATVREVLSDRELHLSWEVGKPRPPLNRNYVFTGYRVTKNSKVQIGEYTFEKGDYGDAVVYRGTTTYKL
NVGDYFVLTSHTVMPLSAPTLVPQEHYVRITGLYPTLNISDEFSSNVANYQKVGMQKYSTLQGPPGTGKSHFAIGLALYY
PSARIVYTACSHAAVDALCEKALKYLPIDKCSRIIPARARVECFDKFKVNSTLEQYVFCTVNALPETTADIVVFDEISMA
TNYDLSVVNARLRAKHYVYIGDPAQLPAPRTLLTKGTLEPEYFNSVCRLMKTIGPDMFLGTCRRCPAEIVDTVSALVYDN
KLKAHKDKSAQCFKMFYKGVITHDVSSAINRPQIGVVREFLTRNPAWRKAVFISPYNSQNAVASKILGLPTQTVDSSQGS
EYDYVIFTQTTETAHSCNVNRFNVAITRAKIGILCIMSDRDLYDKLQFTSLEIPRRNVATLQAENVTGLFKDCSKIITGL
HPTQAPTHLSVDIKFKTEGLCVDIPGIPKDMTYRRLISMMGFKMNYQVNGYPNMFITREEAIRHVRAWIGFDVEGCHATR
DAVGTNLPLQLGFSTGVNLVAVPTGYVDTENNTEFTRVNAKPPPGDQFKHLIPLMYKGLPWNVVRIKIVQMLSDTLKGLS
DRVVFVLWAHGFELTSMKYFVKIGPERTCCLCDKRATCFSTSSDTYACWNHSVGFDYVYNPFMIDVQQWGFTGNLQSNHD
QHCQVHGNAHVASCDAIMTRCLAVHECFVKRVDWSVEYPIIGDELRVNSACRKVQHMVVKSALLADKFPVLHDIGNPKAI
KCVPQAEVEWKFYDAQPCSDKAYKIEELFYSYATHHDKFTDGVCLFWNCNVDRYPANAIVCRFDTRVLSNLNLPGCDGGS
LYVNKHAFHTPAFDKSAFTNLKQLPFFYYSDSPCESHGKQVVSDIDYVPLKSATCITRCNLGGAVCRHHANEYRQYLDAY
NMMISAGFSLWIYKQFDTYNLWNTFTRLQSLENVAYNVVNKGHFDGHAGEAPVSIINNAVYTKVDGIDVEIFENKTTLPV
NVAFELWAKRNIKPVPEIKILNNLGVDIAANTVIWDYKREAPAHVSTIGVCTMTDIAKKPTESACSSLTVLFDGRVEGQV
DLFRNARNGVLITEGSVKGLTPSKGPAQASVNGVTLIGESVKTQFNYFKKVDGIIQQLPETYFTQSRDLEDFKPRSQMET
DFLELAMDEFIQRYKLEGYAFEHIVYGDFSHGQLGGLHLMIGLAKRSQDSPLKLEDFIPMDSTVKNYFITDAQTGSSKCV
CSVIDLLLDDFVEIIKSQDLSVISKVVKVTIDYAEISFMLWCKDGHVETFYPKLQASQAWQPGVAMPNLYKMQRMLLEKC
DLQNYGENAVIPKGIMMNVAKYTQLCQYLNTLTLAVPYNMRVIHFGAGSDKGVAPGTAVLRQWLPTGTLLVDSDLNDFVS
DADSTLIGDCATVHTANKWDLIISDMYDPRTKHVTKENDSKEGFFTYLCGFIKQKLALGGSIAVKITEHSWNADLYKLMG
HFSWWTAFVTNVNASSSEAFLIGANYLGKPKEQIDGYTMHANYIFWRNTNPIQLSSYSLFDMSKFPLKLRGTAVMSLKEN
QINDMIYSLLEKGRLIIRENNRVVVSSDILVNN
>P0DTD1 ~~~rep~~~Replicase polyprotein 1ab~~~
MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAP
HGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRKVLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN
WNTKHSSGVTRELMRELNGGAYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLDFIDTKRGVYCCREHEHEIAW
YTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRIRSVYPVASPNECNQMCLSTL
MKCDHCGETSWQTGDFVKATCEFCGTENLTKEGATTCGYLPQNAVVKIYCPACHNSEVGPEHSLAEYHNESGLKTILRKG
GRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVNINIVGDFKLNEEIAIILASF
SASTSAFVETVKGLDYKAFKQIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEAARVVRSIFSRTLETAQNSVR
VLQKAAITILDGISQYSLRLIDAMMFTSDLATNNLVVMAYITGGVVQLTSQWLTNIFGTVYEKLKPVLDWLEEKFKEGVE
FLRDGWEIVKFISTCACEIVGGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKC
VKSREETGLLMPLKAPKEIIFLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVGTPVCINGLMLLEIKDTEKYC
ALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVNITFELDERIDKVLNEKCSAYTVELGTEVNEFACVVADAVI
KTLQPVSELLTPLGIDLDEWSMATYYLFDESGEFKLASHMYCSFYPPDEDEEEGDCEEEEFEPSTQYEYGTEDDYQGKPL
EFGATSAALQPEEEQEEDWLDDDSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTPVVQTIEVNSFSGYLKLTDNVYI
KNADIVEEAKKVKPTVVVNAANVYLKHGGGVAGALNKATNNAMQVESDDYIATNGPLKVGGSCVLSGHNLAKHCLHVVGP
NVNKGEDIQLLKSAYENFNQHEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLEMKSEKQVE
QKIAEIPKEEVKPFITESKPSVEQRKQDDKKIKACVEEVTTTLEETKFLTENLLLYIDINGNLHPDSATLVSDIDITFLK
KDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKVPTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSII
SNEKQEILGTVSWNLREMLAHAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDYGARFYFYTSKTTVASLINTLND
LNETLVTMPLGYVTHGLNLEEAARYMRSLKVPATVSVSSPDAVTAYNGYLTSSSKTPEEHFIETISLAGSYKDWSYSGQS
TQLGIEFLKRGDKSVYYTSNPTTFHLDGEVITFDNLKTLLSLREVRTIKVFTTVDNINLHTQVVDMSMTYGQQFGPTYLD
GADVTKIKPHNSHEGKTFYVLPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKYPQVNGLTSIKWADNNCYLATAL
LTLQQIELKFNPPALQDAYYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQHANLDSCKRVLNVVCKTCGQQQT
TLKGVEAVMYMGTLSYEQFKKGVQIPCTCGKQATKYLVQQESPFVMMSAPPAQYELKHGTFTCASEYTGNYQCGHYKHIT
SKETLYCIDGALLTKSSEYKGPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDNYYKKDNSYFTEQPIDLVPNQPY
PNASFDNFKFVCDNIKFADDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPSFKKGAKLLHKPIVWHVNNATNK
ATYKPNTWCIRCLWSTKPVETSNSFDVLKSEDAQGMDNLACEDLKPVSEEVVENPTIQKDVLECNVKTTEVVGDIILKPA
NNSLKITEEVGHTDLMAAYVDNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTIANYAKPFLNKVVSTTTNIVTRC
LNRVCTNYMPYFFTLLLQLCTFTRSTNSRIKASMPTTIAKNTVKSVGKFCLEASFNYLKSPNFSKLINIIIWFLLLSVCL
GSLIYSTAALGVLMSNLGMPSYCTGYREGYLNSTNVTIATYCTGSIPCSVCLSGLDSLDTYPSLETIQITISSFKWDLTA
FGLVAEWFLAYILFTRFFYVLGLAAIMQLFFSYFAVHFISNSWLMWLIINLVQMAPISAMVRMYIFFASFYYVWKSYVHV
VDGCNSSTCMMCYKRNRATRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTFCAGSTFISDEVARDLSLQFKRP
INPTDQSSYIVDSVTVKNGSIHLYFDKAGQKTYERHSLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCEESSAKSASVY
YSQLMCQPILLLDQALVSDVGDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEAELAKNVSLDNVLSTFISAARQG
FVDSDVETKDVVECLKLSHQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARHINAQVAKSHNIALIWNVKDFMS
LSEQLRKQIRSAAKKNNLPFKLTCATTRQVVNVVTTKIALKGGKIVNNWLKQLIKVTLVFLFVAAIFYLITPVHVMSKHT
DFSSEIIGYKAIDGGVTRDIASTDTCFANKHADFDTWFSQRGGSYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGD
FLHFLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFKDASGKPVPYCYDTNVLEGSVAYESLRPDTRYVLMDGS
IIQFPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVSTSGRWVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGA
LDISASIVAGGIVAIVVTCLAYYFMRFRRAFGEYSHVVAFNTLLFLMSFTVLCLTPVYSFLPGVYSVIYLYLTFYLTNDV
SFLAHIQWMVMFTPLVPFWITIAYIICISTKHFYWFFSNYLKRRVVFNGVSFSTFEEAALCTFLLNKEMYLKLRSDVLLP
LTQYNRYLALYNKYKYFSGAMDTTSYREAACCHLAKALNDFSNSGSDVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCM
VQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPK
TPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDL
EGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQT
GIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCSGVTFQSAVKRTIKGTHHWLLLTILTSLLVLVQSTQW
SLFFFLYENAFLPFAMGIIAMSAFAMMFVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRIMTWLDMVDTSLSGFKLK
DCVMYASAVVLLILMTARTVYDDGARRVWTLMNVLTLVYKVYYGNALDQAISMWALIISVTSNYSGVVTTVMFLARGIVF
MCVEYCPIFFITGNTLQCIMLVYCFLGYFCTCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKNSIDAFKL
NIKLLGVGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILLAKDTTEAFEKMVSLLSVLLS
MQGAVDINKLCEEMLDNRATLQAIASEFSSLPSYAAFATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRK
LEKMADQAMTQMYKQARSEDKRAKVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAKLMVVIPDYNTY
KNTCDGTTFTYASALWEIQQVVDADSKIVQLSEISMDNSPNLAWPLIVTALRANSAVKLQNNELSPVALRQMSCAAGTTQ
TACTDDNALAYYNTTKGGRFVLALLSDLQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRG
MVLGSLAATVRLQAGNATEVPANSTVLSFCAFAVDAAKAYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQES
FGGASCCLYCRCHIDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLKNTVCTVCGMWKGYGCSCDQLREPMLQSADAQSFL
NRVCGVSAARLTPCGTGTSTDVVYRAFDIYNDKVAGFAKFLKTNCCRFQEKDEDDNLIDSYFVVKRHTFSNYQHEETIYN
LLKDCPAVAKHDFFKFRIDGDMVPHISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKKDWYDFVEN
PDILRVYANLGERVRQALLKTVQFCDAMRNAGIVGVLTLDNQDLNGNWYDFGDFIQTTPGSGVPVVDSYYSLLMPILTLT
RALTAESHVDTDLTKPYIKWDLLKYDFTEERLKLFDRYFKYWDQTYHPNCVNCLDDRCILHCANFNVLFSTVFPPTSFGP
LVRKIFVDGVPFVVSTGYHFRELGVVHNQDVNLHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAALTNNVAFQ
TVKPGNFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAISDYDYYRYNLPTMCDIRQLLFVVEVVDKYFDCYDGGCI
NANQVIVNNLDKSAGFPFNKWGKARLYYDSMSYEDQDALFAYTKRNVIPTITQMNLKYAISAKNRARTVAGVSICSTMTN
RQFHQKLLKSIAATRGATVVIGTSKFYGGWHNMLKTVYSDVENPHLMGWDYPKCDRAMPNMLRIMASLVLARKHTTCCSL
SHRFYRLANECAQVLSEMVMCGGSLYVKPGGTSSGDATTAYANSVFNICQAVTANVNALLSTDGNKIADKYVRNLQHRLY
ECLYRNRDVDTDFVNEFYAYLRKHFSMMILSDDAVVCFNSTYASQGLVASIKNFKSVLYYQNNVFMSEAKCWTETDLTKG
PHEFCSQHTMLVKQGDDYVYLPYPDPSRILGAGCFVDDIVKTDGTLMIERFVSLAIDAYPLTKHPNQEYADVFHLYLQYI
RKLHDELTGHMLDMYSVMLTNDNTSRYWEPEFYEAMYTPHTVLQAVGACVLCNSQTSLRCGACIRRPFLCCKCCYDHVIS
TSHKLVLSVNPYVCNAPGCDVTDVTQLYLGGMSYYCKSHKPPISFPLCANGQVFGLYKNTCVGSDNVTDFNAIATCDWTN
AGDYILANTCTERLKLFAAETLKATEETFKLSYGIATVREVLSDRELHLSWEVGKPRPPLNRNYVFTGYRVTKNSKVQIG
EYTFEKGDYGDAVVYRGTTTYKLNVGDYFVLTSHTVMPLSAPTLVPQEHYVRITGLYPTLNISDEFSSNVANYQKVGMQK
YSTLQGPPGTGKSHFAIGLALYYPSARIVYTACSHAAVDALCEKALKYLPIDKCSRIIPARARVECFDKFKVNSTLEQYV
FCTVNALPETTADIVVFDEISMATNYDLSVVNARLRAKHYVYIGDPAQLPAPRTLLTKGTLEPEYFNSVCRLMKTIGPDM
FLGTCRRCPAEIVDTVSALVYDNKLKAHKDKSAQCFKMFYKGVITHDVSSAINRPQIGVVREFLTRNPAWRKAVFISPYN
SQNAVASKILGLPTQTVDSSQGSEYDYVIFTQTTETAHSCNVNRFNVAITRAKVGILCIMSDRDLYDKLQFTSLEIPRRN
VATLQAENVTGLFKDCSKVITGLHPTQAPTHLSVDTKFKTEGLCVDIPGIPKDMTYRRLISMMGFKMNYQVNGYPNMFIT
REEAIRHVRAWIGFDVEGCHATREAVGTNLPLQLGFSTGVNLVAVPTGYVDTPNNTDFSRVSAKPPPGDQFKHLIPLMYK
GLPWNVVRIKIVQMLSDTLKNLSDRVVFVLWAHGFELTSMKYFVKIGPERTCCLCDRRATCFSTASDTYACWHHSIGFDY
VYNPFMIDVQQWGFTGNLQSNHDLYCQVHGNAHVASCDAIMTRCLAVHECFVKRVDWTIEYPIIGDELKINAACRKVQHM
VVKAALLADKFPVLHDIGNPKAIKCVPQADVEWKFYDAQPCSDKAYKIEELFYSYATHSDKFTDGVCLFWNCNVDRYPAN
SIVCRFDTRVLSNLNLPGCDGGSLYVNKHAFHTPAFDKSAFVNLKQLPFFYYSDSPCESHGKQVVSDIDYVPLKSATCIT
RCNLGGAVCRHHANEYRLYLDAYNMMISAGFSLWVYKQFDTYNLWNTFTRLQSLENVAFNVVNKGHFDGQQGEVPVSIIN
NTVYTKVDGVDVELFENKTTLPVNVAFELWAKRNIKPVPEVKILNNLGVDIAANTVIWDYKRDAPAHISTIGVCSMTDIA
KKPTETICAPLTVFFDGRVDGQVDLFRNARNGVLITEGSVKGLQPSVGPKQASLNGVTLIGEAVKTQFNYYKKVDGVVQQ
LPETYFTQSRNLQEFKPRSQMEIDFLELAMDEFIERYKLEGYAFEHIVYGDFSHSQLGGLHLLIGLAKRFKESPFELEDF
IPMDSTVKNYFITDAQTGSSKCVCSVIDLLLDDFVEIIKSQDLSVVSKVVKVTIDYTEISFMLWCKDGHVETFYPKLQSS
QAWQPGVAMPNLYKMQRMLLEKCDLQNYGDSATLPKGIMMNVAKYTQLCQYLNTLTLAVPYNMRVIHFGAGSDKGVAPGT
AVLRQWLPTGTLLVDSDLNDFVSDADSTLIGDCATVHTANKWDLIISDMYDPKTKNVTKENDSKEGFFTYICGFIQQKLA
LGGSVAIKITEHSWNADLYKLMGHFAWWTAFVTNVNASSSEAFLIGCNYLGKPREQIDGYVMHANYIFWRNTNPIQLSSY
SLFDMSKFPLKLRGTAVMSLKEGQINDMILSLLSKGRLIIRENNRVVISSDVLVNN
>Q008X6 ~~~rep~~~Replicase polyprotein 1ab~~~
MSILFGNRQANATKRSDMASVARAVYEVDLISTKYARRTQERLAHNKHAKPSYPSVFFGRRMKAVKEPTFTPSTLFFEEA
TLPKVLASKAKPDTGIKTRRVYVADSLTINGHTYPIVGHFVEMAVSKKEAFPIQPKRVKPKPLMAKPIPNIRRTFLTPEE
RTNTPTTPTTTTTTPFVAGETAGPTIEYTPTSIDLPFAMPTVKQIKENAHTILREQDDCLRFAQTALFKHLGTVTHTTPN
HATTFQVKGRTSLLTFEWRKTTQSPLTDGHFYLQTANNHAELMQPVEGKLTTIFTTTIQQGTTHSLHLIKQESARTLKTR
KPLKLVTYKQETPTTTITPQSLKKTITYIPGSFCINVAEPTLQSVMRRQPLTPTPNDALLQIYHKLGCTTKSPNHASTFE
LFGNTYTWYPVQHTNNLLHKDPNRRFFLHITGQTPQLLIRTERKTFLTLQDEVTYISGKLFVMNHAPIQGEYKTQTAEWV
GSYNMAKTPKAIKPAKTVEYINTTPCHKPATMPAPITYRQCPYTWTLHEPSISKVQRNLFHIPKTATNCLDRIQKALFPE
IVTSNQHFPIGFTIQTDTTMQSYEWAILCKKVTRTYYLAVQNHHATLWYKCAQVYMCLSDDIAPTTELQGSVYILNKVTD
PGFYQNTRQWCGSNDDLHEPKHLIKNLANGDVNNIAHCSLTPWTTTPLVYSTQKNKLTRKLIHYYNATYTVQVPKENPNA
QPSTMKCAYKAYIDLATPTELQIHLDLEIQGKLYSQTAQKKGSKFNKLDIPTFGDILKGTLYVFSSKVLLYEQPKRCTSV
CFHLPNNAQSFFFNTETIQTFEDLFARISSEEVEDNLVLIKGFVPCLGAIYITKDLKFIQPELKEDFKHPTTYYTFTTTV
DPEQVLYSLHPSFTDIVPPHGFPYYTFAKLNNHIDAWNITDDQADTLAEAQPIIFQWPTEEATITTPYKVLHYEHLEGLD
YISLSSFNTVECPEETQSAHDSSSESESEDEELPAHPLSNAPSQASLSSVASTAPPTSPTSSPTPSPTPLQQQVGLKGKE
VPVGGWVLVSEEETPSEEVDSPKLLPNEVPLSFDFDLPIEPITRPISPELQQPILTHYEHPTSPTPSVEIEIDFGSYENL
TLQTEETVTTEVQPEPTPAPTPEPTIVTETVQETPVPTETTQESTPESTPESTPEPTPESTSESTLEPEHVATPSQSPTH
ITVTEITHEPETPDSWSERYDSTSNIPEVFNQLSFGSTDSVKITTPKTETPDEPQQPTVETVSAAQQLLQIVQTATPDIA
QLMSELPPYRLICIGSYCPILAENISKQLPTAVTTPTDADIPTVIFNVSEETMDTVINTVKTKHQANHLTFSLTTIIALD
VPKDKSLPLQQIYDKLTQQDYNTDFIYESHHRQPKESLTHASVLSAYTASYKTTAIKSIADNAVVLDIGYGKGNDGPRYA
VRPLTVTGIDTAARMLAIADQNKPENVTLVKQGFFTHITKTSNTYTHVIAFNSLHYPLASSHPDTLVQRLPTCPANILIP
CHHLLEGIQTPTYSVVKDEDMWCVKVTKNEFIESSYNYDVFVKALESKYHVTIGSLLDCVEKPSTRSITPTLWTAMRNFV
NNDQEMQRILSGYITFNLTPLPPKVEIINDWLDNNATVTINNPFASNEGVTFAVHNIGAITTTEGEFIVNAANKQLNNGT
GVTGAIFAAHDKELKLTQAIKALPTYGASDKLESHQHVVQTIIKNNNSTHAINILHAAAPIKVKCTSKNPEVLLAHNETA
QSELKETYKAIVDYAQLNKLTHIYLPLFGAGAYGHKPLDSLEAFLDAMRNRSPQSTTQYTLLLSDPVKPLDNPSFSYEFL
NLLVTNLNINKQFAQLIVNKYHNTCALQSAIQMNTTTDTHNFLATFIYLLYTMPYSMNTFRQTHTPEEPFTPGKMVTVVD
TTIAFLTTLDILPPCGQPCGYLPPSITKDGEYICACQKTSNWSLPFHFYNARYNKVYHTGLNNILTHKHSAFHKSRNAAH
FIAKTGPSTSSYPVYMAPVPEILAYNASYRDSCQDNAIEEQSDSQASQSPSSPVTIPVSLPTASPASSVKSALRSDIPIT
TDQQSTTSASISTATTASTIPTAPLTSSDSNTSVVTSLYGNMEELTYLDASGTSQDFILSETTPFIAHIYHNNEATFIPP
GYQLLDTNTNDPIEMYITPPRPIDGSPMISLASTASTTPMTYPLLSIRLTTEELTSFFKTKTDKFHLISHKSCLTVHLFD
SPTLNSIAADSTSDAHLYQQHLKDLYTFSDCCSMYTRTEVYNCIEADTPLIRQSEQTKFHPINLDTLIEMVATFPPIVKR
YSQTTTPDFTNLTVYFVSNGDIITTPTGSTSEQPPQLKIFLDYQTSSKFTTLVDLTLHEQTEANTIITYHHGEHQLLKPN
PSAFYIEFQTYSSFFSRFQTFSTNFFWTLFINFLINVRFCITADSAYFHWQGKPIETTNLNIVYSIGRLDFVLSKHTTPW
LTKPTDTLNPLTLIKNTLVQPIAINFHGRIRPLQSTNTRFGATHTPTKLPVHLLNTSLRTHYLSLLSLFQCTFSVFLAYI
ALLYSFSGHGIFTVVAYFTMLFARYYITSFINFCTSQLTATQVTQWFAAIKAKYTGIYESSQDRVLTVNVTGTNVPYIVK
YSTILTVTMYVAFMAFVWTVSTYAAQYTAGERYDRPPYQTVFQKTLNVLGLTETVTYYYPYASLNEACAASTSILCRLGS
PFNFHYPSDYTQVRTVQTDTSSPFWLFIIFMPPSFLFIVLPWLILCTITPTVSIAQLLVPSIILNATIVFIYIRRKFTGH
CCGPHTCIKHADISRSLQFRPTSQIQHSLTFCGTLCAKHNWYCNNSDSPTHTLGIQLAQLIETTYKLQPGTIKPDSSYTH
TTETATLPIMKVSTTSTDFTTSEPNTTVEHLHLQVIAHVTGTRISIESSSNKVQQQNTQHTRLTNKPVTGFMHTTLLQKL
KRQHKDELSSYLCNFVPSDNKKDCILPHSVVHMTLTENQRTFLLKNFTFSTNVTVDPTTTGFIPSSLNISTLPHKHFMIN
VIESAMLAKLPKEVQDTLRTTHLETTTLERQAMSLTTQAILTTFALIMATFVVAFLAFFSTAQVGKTPYAGLNPTMVGNV
NAEPYIQPTTLENSILIPLHGASKVCWRAQNGTLFFTDAIPTTECARAAVPYIGYKSEFTQTCASSNLRYPFTVYLGSIK
VMYLRDGISYLTSTLSHNSNTKKLCVQVGSNAVRCASVLPTGASSNVAALLMASVVVISMVLFYLYLLQIFKFYTNSVIM
SFVIQLLTLLATTVSTPLAVTVQLFVITYGYTNWILLTLSLLNLTVLLSTPVGITFVVIYGLYKAYTLFTSSGQGCVYNE
GGTIRFSGSFEQVANSTFPLTNASCVQLLSDLGITYQQLNVYASSRDRNVRRLAQALLHRQLDSASECILYEGCSGNTIT
RQALQRIRQAVTVVVTPASQNLCKITSNQANGIGLSCTGTFFTSTEIITCAHGIGTSDITAVHKGITYDCKVKSINNDIA
ILITTTVNLSVQNIKLDSSFSQKSDNYQRNFVQFVSFVDQQNSDAVTINNTVMLPSGHFFAIGTEAGESGSPYTLNGNII
GIHYGIDNAGSWMLASRPDGSFYVPATQHGNSAKVTFSTDAFAQQFPAIVTNKTSDQLVNEIAATNNTAFELDTTDVSHL
SNLLKHLKNNNETPKSLSDYLPVAPVTQQTSVTVGQTLTSPVNNMQTALYTLMLVSEVITYILTPNSDLSVLISMFVTSA
FLKFGASKLFYNTEMLRNTITTFVVYRYTTLLIALVFSQYYLHILSYALKLNTLVLTVIALTFLVTPLVLLTIRRVYYYS
QNYLISCVFIISCFASHTYTLYILTDTTVDFQTFIISEPIFSTLLCNLFIGFTLISVVPNPLYVCVVFMYILLDCEALGF
IVTCFMASYLCPKPLRSLTTFLCTDTLVLTAPAYLHWYGAKGTQREYSVIYAVFDSILTPETTIQIPVTIMEGIEQQVKF
IFAVPKSANIQDEEEAYVEYNQNSDIKDLAIKNEEKVCTITGMFRKTRTIKGTLMESFYPRHQPDSYILNQVVRHNYVLA
HDPETIILTTVNPTHLESEPFQTIVKKVRKLHALYQEISEQSSDDTELCHAYILALIKSTVLEAQMPTEKINFVSSQTML
SPAMILIFAEAYQILTESRSFTNHITPQSDIGSMQTTLASLAEMDTDEMTPQERKIHIKRMNVLKQEIAKMESASLKLEK
FLDNMHKAEISKRGKEDILLKVSNMLRLHLNKVANAAHCTIQTPSAGLITLASAFDVHSLCVTQHSESVLIQTPDDDTFL
VYVDGQIYTCYNPTDITGKKLVPINVMSPDNVQFPTYPVVFSLSKQDYAEEITEQNNIGYTERTHNFKLKELASGLAVTL
DGLVVVTETKELATAFKIGQRYFKFLNTNKTPVARNNTHAIIQLLRNNISQQAVVRIGGSRVSNDHIAISQVPVQTIGYL
TYAGISVCRQCATKQDHTCQYAGYFVQIPREHVSNIFNLTDTPPCLHNKFTCTTCQPLQQQSKTQQPPLNLVGQCLGLTL
DTAFCPFQTGEYKPSSREFYINILNNNVASLRKVFKKNTASIPSENGTIMLKDTGTAHEIYVAKQLLAKGLPVLQHARFN
HDGTDYLIRYYTTPYSLGDLVYAYMVGDFKHMLLALDITDETCLDPGNYSSYYNFKEQLRNKLASVIPNVNKILAAELPL
AITLDNIDLNGFLYDFGDYPTNGKVTNYHVVSCMRQIATFCSLDITQFPSPLGYTVDRPKLQQTLITGSYIDKLLAINAL
VASNPETPATASTLFIEASAPTTQTAAISNPILGMHVLDWDLIKANHTGVELDLIQTQDPSIYAKPDVLSVGDTIFYYGR
RRYKHDAYKRPFYDLDLIQRMNSAGLNLSETTGYHYQCGTTTEAVEDFMYYNYNSPKSFDPSYLKSVYTYMRDKFMKIIS
TDEKLNHQSGAPRMSSMGVGVSGFFQKTVWNALPEDFSPRLLDTASKTVMPFSTNIVKKFQRQKKTRVRTLGGSSFITSS
IFRMLHKPVTNKMVQTAQANIGPFLIGISKFNLGFHKYLSAHHPNGIEDCQVMGADYTKCDRSFPVVCRALSAALFYELG
HLEPNNHWFLNEMFAFLLDPSFISGHIFNKPGGTTSGDSTTAFSNSFYNYFVHLYIQYLTFLTTEMPPSYQPLCNLAHQA
FSTGNTETYDLYFSMADDLNSTEYFLHFLSDDSFIISKPTAFPIFTPANFSMKLQNVLGCYVDPAKSWSADGEIHEFCSS
HICKINEKYQYVPDPNNMLAGLICAPEPTPQDKLIWKLVATCAELAVFHFVNPTLFNNIFHLLQSLHAEFVSEHSVNLLP
PKLLEIDFYTDLIDSEDVEQYSFLADTLTEKNIIMQSASQCYFCDNATVSTCSDCTVQYPMCAHCAYEHLMLTDHTPTQV
LPCHVCDQTDPRHLNHTFVMGTVKVACDNHVEGMALPLVDHARKLVKIPLYQKCEQQKTSVSAIKYTKLFDENDQPLDPN
FFFYDHEQSAEHNYLKILNDCYLLDEHTVQTSTTYDFQCLEGNTIQVFHKPAETFGNTAYAEILDSKGRVVLKVTLDPIS
AQNPNHYYITTTKGVLYRKYSKIRRTIHKPRLANRHILNTLKKATFIIGPPGTGKTTYVMKNFIDTASPANKVAYIAPTH
KLVQSMDQAIWDKYNHTVSVSVVKSELNNNKYNYPLNSATKTIMLGTPGAVCTHAGCTLIFDEVTLSQLNTIINAISVVK
PSQVIFLGDPLQLGPVTHMRSLSYSYTNFPLFQFCNDSRVLSICYRCPSNIFNLWVKPYTDSNVRIDPHAAGGDAKIIVS
DQCSNPDAYQYVQKLARNNPDKVLLCNYKKPIIGLENAVTIDSSQGKTYKHTIVVLLGNTNFTQVINRAIVAMSRSTHSI
EVHCSPFIHTKFSELFGWPQNVEKENQITKQLHEYSVTSNITLLPVSELPNHLGSLVVCDLEFFHVRHETTPKVKCTLEV
GEMAIITTSLLKQIIIPRKSAFTAETHHKHTFGVPKGKPDMNWDYMKSHKAISQQINTDRTHKIFSHMAATTLNRVVYVL
YGAGNDLRALTNLNIVGDYTCEKCTKEATFYTIHREIVHTFCNHHAPSQFPLMGTINAQAIDIQHASANKQSLTNTHAEV
CNQQHGDAHTASADTIMTGCLASNFLMKSAIKLDNILTTSAFKPYAPYIQTGSHLTVSSIKFRTLGTGVFSMIDDKLYHF
DILPHHSKLSHFLQHSSNQPHSTIVTEIPAGYPSCVKIKGKGCTFCANTIAVITELYEDLAKLGLTLSRPIIQQAYTQVE
TQILQNITNVSYNQFGEMLIQLKDDTVIPFINDFQTSIHNYSLRSNQPLPNPAIFKNLGIKATLGFSTPWVPVTTTTTEP
HIMSTKILKDNNDYYHLSPVQLKASPHAFSSITAGYYIYTTPMINIPNETPAYYLTHFVHGQSTPLDLGYTSTNRLTTKQ
FIYTPNEDEYKSKLGHHVSVGDTSNVSWTIGGMHVLTAFQNITNYQLVSGPANPIMRINVALERGNKVETTALDTTLQDY
YKIATTNKVTVSKTTFFTLDGGQYRMMNFANPDGTIQTSYPVAQAQSQLITNYRIYPSYITWPTFFTNEATDCWAIPNYN
APPKNQTCNINIQKYDQMCDLFAIDLKIPVKGHIHHLGNAGNKYSPGDVVLRQYFDQAHLTSYDLREVVSDIPVLHPNDE
WKAHFILSDVYAPDTDFTSLALEYMQNHLRLGGSIMWKMTETSILQVNEIVKYFGSWKAVTFAVNYSSSETFLFCAGYTG
VEYNTSIVQNGYMSLLGGYRKDLLFVPFCNDYTGSKAYKDTGRVKVVANHLADKLTPAHYATASIFLNTALH
>P0C6T4 ~~~1a~~~Replicase polyprotein 1a~~~
MLSKASVTTQGARGKYRAELYNEKRSDHVACTVPLCDTDDMACKLTPWFEDGETAFNQVSSILKEKGKILFVPMHMQRAM
KFLPGPRVYLVERLTGGMLSKHFLVNQLAYKDQVGAAMMRTTLNAKPLGMFFPYDSSLETGEYTFLLRKNGLGGQLFRER
PWDRKETPYVEILDDLEADPTGKYSQNLLKKLIGGDCIPIDQYMCGKNGKPIADYAKIVAKEGLTTLADIEVDVKSRMDS
DRFIVLNKKLYRVVWNVTRRNVPYPKQTAFTIVSVVQCDDKDSVPEHTFTIGSQILMVSPLKATNNKNFNLKQRLLYTFY
GKDAVQQPGYIYHSAYVDCNACGRGTWCTGNAIQGFACDCGANYSANDVDLQSSGLVPRNALFLANCPCANNGACSHSAA
QVYNILDGKACVEVGGKSFTLTFGGVVYAYMGCCDGTMYFVPRAKSCVSRIGDAIFTGCTGTWDKVVETANLFLEKAQRS
LNFCQQFALTEVVLAILSGTTSTFEELRDLCHNASYEKVRDHLVNHGFVVTIGDYIRDAINIGANGVCNATINAPFIAFT
GLGESFKKVSAIPWKICSNLKSALDYYSSNIMFRVFPYDIPCDVSNFVELLLDCGKLTVATSYFVLRYLDEKFDTVLGTV
SSACQTALSSFLNACVAASRATAGFINDMFKLFKVLMHKLYVYTSCGYVAVAEHSSKIVQQVLDIMSKAMKLLHTNVSWA
GTKLSAIIYEGREALLFNSGTYFCLSTKAKTLQGQMNLVLPGDYNKKTLGILDPVPNADTIDVNANSTVVDVVHGQLEPT
NEHGPSMIVGNYVLVSDKLFVRTEDEEFYPLCTNGKVVSTLFRLKGGMPSKKVTFGDVNTVEVTAYRSVSITYDIHPVLD
ALLSSSKLATFTVEKDLLVEDFVDVIKDEVLTLLTPLLRGYDIDGFDVEDFIDVPCYVYNQDGDCAWSSNMTFSINPVED
VEEVEEFIEDDYLSDELPIADDEEAWARAVEEVMPLDDILVAEIELEEDPPLETALESVEAEVVETAEAQEPSVESIDST
PSTSTVVGENDLSVKPMSRVAETDDVLELETAVVGGPVSDVTAIVTNDIVSVEQAQQCGVSSLPIQDEASENQVHQVSDL
QGNELLCSETKVEIVQPRQDLKPRRSRKSKVDLSKYKHTVINNSVTLVLGDAIQIASLLPKCILVNAANRHLKHGGGIAG
VINKASGGDVQEESDEYISNNGPLHVGDSVLLKGHGLADAILHVVGPDARNNEDAALLKRCYKAFNKHTIVVTPLISAGI
FSVDPKVSFEYLLANVTTTTYVVVNNEDIYNTLATPSKPDGLVYSFEGWRGTVRTAKNYGFTCFICTEYSANVKFLRTKG
VDTTKKIQTVDGVSYYLYSARDALTDVIAAANGCSGICAMPFGYVTHGLDLAQSGNYVRQVKVPYVCLLASKEQIPIMNS
DVAIQTPETAFINNVTSNGGYHSWHLVSGDLIVKDVCYKKLLHWSGQTICYADNKFYVVKNDVALPFSDLEACRAYLTSR
AAQQVNIEVLVTIDGVNFRTVILNDTTTFRKQLGATFYKGVDISDAFPTVKMGGESLFVADNLSESEKVVLKEYYGTSDV
TFLQRYYSLQPLVQQWKFVVHDGVKSLKLSNYNCYINATIMMIDMLHDIKFVVPALQNAYLRYKGGDPYDFLALIMAYGD
CTFDNPDDEAKLLHTLLAKAELTVSAKMVWREWCTVCGIRDIEYTGMRACVYAGVNSMEELQSVFNETCVCGSVKHRQLV
EHSAPWLLVSGLNEVKVSTSTDPIYRAFNVFQGVETSVGHYVHIRVKDGLFYKYDSGSLTKTSDMKCKMTSVWYPTVRYT
ADCNVVVYDLDGVTKVEVNPDLSNYYMKDGKYYTSKPTIKYSPATILPGSVYSNSCLVGVDGTPGSDTISKFFNDLLGFD
ETKPISKKLTYSLLPNEDGDVLLSEFSNYNPVYKKGVMLKGKPILWVNNGVCDSALNKPNRASLRQLYDVAPIVLDNKYT
VLQDNTSQLVEHNVPVVDDVPITTRKLIEVKCKGLNKPFVKGNFSFVNDPNGVTVVDTLGLTELRALYVDINTRYIVLRD
NNWSSLFKLHTVESGDLQIVAAGGSVTRRARVLLGASSLFASFAKITVTATTAACKTAGRGFCKFVVNYGVLQNMFVFLK
MLFFLPFNYLWPKKQPTVDIGVSGLRTAGIVTTNIVKQCGTAAYYMLLGKFKRVDWKATLRLFLLLCTTILLLSSIYHLV
LFNQVLSSDVMLEDATGILAIYKEVRSYLGIRTLCDGLVVEYRNTSFDVMEFCSNRSVLCQWCLIGQDSLTRYSALQMLQ
THITSYVLNIDWIWFALEFFLAYVLYTSSFNVLLLVVTAQYFFAYTSAFVNWRAYNYIVSGLFFLVTHIPLHGLVRVYNF
LACLWFLRKFYSHVINGCKDTACLLCYKRNRLTRVEASTIVCGTKRTFYIAANGGTSYCCKHNWNCVECDTAGVGNTFIC
TEVANDLTTTLRRLIKPTDQSHYYVDSVVVKDAVVELHYNRDGSSCYERYPLCYFTNLEKLKFKEVCKTPTGIPEHNFLI
YDTNDRGQENLARSACVYYSQVLCKPMLLVDVNLVTTVGDSREIAIKMLDSFINSFISLFSVSRDKLEKLINTARDCVRR
GDDFQNVLKTFTDAARGHAGVESDVETTMVVDALQYAHKNDIQLTTECYNNYVPGYIKPDSINTLDLGCLIDLKAASVNQ
TSMRNANGACVWNSGDYMKLSDSFKRQIRIACRKCNIPFRLTTSKLRAADNILSVKFSATKIVGGAPSWLLRVRDLTVKG
YCILTLFVFTVAVLSWFCLPSYSIATVNFNDDRILTYKVIENGIVRDIAPNDVCFANKYGHFSKWFNENHGGVYRNSMDC
PITIAVIAGVAGARVANVPANLAWVGKQIVLFVSRVFANTNVCFTPINEIPYDTFSDSGCVLSSECTLFRDAEGNLNPFC
YDPTVLPGASSYADMKPHVRYDMYDSDMYIKFPEVIVESTLRITKTLATQYCRFGSCEESAAGVCISTNGSWALYNQNYS
TRPGIYCGDDYFDIVRRLAISLFQPVTYFQLSTSLAMGLVLCVFLTAAFYYINKVKRALADYTQCAVVAVVAALLNSLCL
CFIVANPLLVAPYTAMYYYATFYLTGEPAFIMHISWYVMFGAVVPIWMLASYTVGVMLRHLFWVLAYFSKKHVDVFTDGK
LNCSFQDAASNIFVIGKDTYVALRNAITQDSFVRYLSLFNKYKYYSGAMDTASYREACAAHLCKALQTYSETGSDILYQP
PNCSVTSSVLQSGLVKMSAPSGAVENCIVQVTCGSMTLNGLWLDNTVWCPRHIMCPADQLTDPNYDALLISKTNHSFIVQ
KHIGAQANLRVVAHSMVGVLLKLTVDVANPSTPAYTFSTVKPGASFSVLACYNGKPTGVFTVNLRHNSTIKGSFLCGSCG
SVGYTENGGVINFVYMHQMELSNGTHTGSSFDGVMYGAFEDKQTHQLQLTDKYCTINVVAWLYAAVLNGCKWFVKPTRVG
IVTYNEWALSNQFTEFVGTQSIDMLAHRTGVSVEQMLAAIQSLHAGFQGKTILGQSTLEDEFTPDDVNMQVMGVVMQSGV
KRISYGFIHWLISTFVLAYVSVMQLTKFTMWTYLFETIPTQMTPLLLGFMACVMFTVKHKHTFMSLFLLPVALCLTYANI
VYEPQTLISSTLIAVANWLTPTSVYMRTTHFDFGLYISLSFVLAIIVRRLYRPSMSNLALALCSGVMWFYTYVIGDHSSP
ITYLMFITTLTSDYTITVFATVNLAKFISGLVFFYAPHLGFILPEVKLVLLIYLGLGYMCTMYFGVFSLLNLKLRVPLGV
YDYSVSTQEFRFLTGNGLHAPRNSWEALILNFKLLGIGGTPCIKVATVQSKLTDLKCTSVVLLTVLQQLHLESNSKAWSY
CVKLHNEILAAVDPTEAFERFVCLFATLMSFSANVDLDALANDLFENSSVLQATLTEFSHLATYAELETAQSSYQKALNS
GDASPQVLKALQKAVNVAKNAYEKDKAVARKLERMAEQAMTSMYKQARAEDKKAKIVSAMQTMLFGMIKKLDNDVLNGVI
ANARNGCVPLSIVPLCASNKLRVVIPDISVWNKVVNWPSVSYAGSLWDITVINNVDNEVVKPTDVVETNESLTWPLVIEC
SRSSSSAVKLQNNEIHPKGLKTMVITAGVDQVNCNSSAVAYYEPVQGHRMVMGLLSENAHLKWAKVEGKDGFINIELQPP
CKFLIAGPKGPEIRYLYFVKNLNNLHRGQLLGHIAATVRLQAGANTEFASNSTVLTLVAFAVDPAKAYLDYVGSGGTPLS
NYVKMLAPKTGTGVAISVKPEATADQETYGGASVCLYCRAHIEHPDVSGVCKYKTRFVQIPAHVRDPVGFLLKNVPCNVC
QYWVGYGCNCDALRNNTVPQSKDTNFLNESGVLV
>P0C6F3 ~~~1a~~~Replicase polyprotein 1a~~~
MSTSSSILDIPSKMFRILKNNTRETEQHLSSSTLDLISKSQLLAQCFDTQEIMASLSKTVRSILESQNLEHKSTLTPYNS
SQSLQLLVMNTSCTQFKWTTGSTSSVKALLEKELCRGLVPLNDITPKSNYVELSLLTPSILIGNETSTTTTLPEIPLDME
QSIISCVENTLLKEVQALSGQESCQEYFLSANYQSLIPPQVLLNLMKMSSVVDLSPLTLPNTRLWLKLSPFHGGTSVSYA
TQIKGYANCARREEKCLKNRLTKKQKNQEKGSFDARSVITLGGKMYRYKVVVLRCEDQSDNLSELQFEPQVEYTMDMVPH
CWKELVKKRLIRAKGTWDLSCVEDLDLDHVEVRGDSLLHRSSVVHDLTSIVDDTLQEKLFSRTWLRQSLKYSGNILQRLS
SLFATEGLKKITLVNSDITPVQVGDKWLNFVDFGKSTVFFVKTLNNIHLAMTRQRESCNYIHEKFGRVRWLGAKPEQGAI
VKVFAWCLNKKEFKFRDNQLKQYVCRQGVIKHEPCEYLNVEVLDEFVALNNDLNCVQKIKTYLAAYFGLKKVKLTQKNFM
TPLITKKQELVFQPCNCPNHQFYVAQFDKHVTLGLGRKDGILFAEQVPSYAIILAVGFGTVETQLVTHYYSEMRRVYHPL
DFQSNTFVFDHQGVMLEDISPADYNDVGEEDYQLEYSGGFDQPFQNYHSDDEDQAFPDFEDERHPDEENWARPIISSGES
SVVSSRPSSPLVYSSLVPVASPFGYMNGIRVFDICLADDLDFLQIHGQCPCARCKGLYFYQPIRPRGFTIFENVVEFFSF
VEKCEVFEEIGPFFKMIEYSMLYNEYNIFYGLGKKIYQSDLVLPVKHLDQLWKRAQLDIDVVSEFENFKNSLQNINNVVY
IAPYFNDQGEWNDIFDGYEFNLNDNQFWFQAKPVYDLVCYIYQGFFSDSRPLEKLYQKLCLDYHTSAMLHTQTHLKYCYV
ALLHSERAFQMSINLDSLDNEQLHFLATMGMGDASLVGPTYLSEYHSNFNWYSIMSKACHYVKLEQLVGLTYQEKRLMIL
SRVQEFYEQQHRGPIQLILSPLKVVNLPPITCTEGYCYQPVTRLFDTCVMPDIMKKLSRKRTSVSDVFGILADYFKRTLS
YRCFKVHEFCGIERQQEFSDMTTLKLVTDWCQDTYYFYNEYATMTDVEPKVQVSSDYYLKIPSEVVEHIRQFLPHNVNVG
LMNYVSSNCDFDQCKFEFCLSGKGYVLGNMFFNRCAIQYVKTNLFIVLFKSRPLLYITQESIYLSDFNVLQAQCLTGEFC
LDFEPVQGKTLFGVYFTNGQRYGQQWETLPRFSLKPLNSPRKRVPTQPFEELAEVCIFKQKLKLTQLHNDCSVTPRVCSI
PQTITATFQPYYCLENFYGVKAPKVIVSGHLATHYVKLTHKISKCVLVTKLAVARAFYFTPTSMGSHYHLDPMEGISFGK
RATVQFEPVGLIKDVNLLVYQFGSHVSIQFFPEAPCIVADGHYPSKYSGVWLGYLPSVEECKIAQVNHRVYVPTILRTSK
SAPFHIIQNGDMGRGPITVTYHYAKNFDNKSLTPMFKMFQQVFEKSKDDIFKAFNTMSLEQKKVLSHFCGEFDEAYTLQT
MSDEISFESSAYPDVVACSLAYILGYEMCLTVKVNAKNEKLDIGSQCERVFVDYDVKKNEWTLSPEEGEDSDDNLDLPFE
QYYEFKIGQTNVVLVQDDFKSVFEFLKSEQGVDYVVNPANSQLKHGGGIAKVISCMCGPKLQAWSNNYITKNKTVPVTKA
IKSPGFQLGKKVNIIHAVGPRVSDGDVFQKLDQAWRSVFDLCEDQHTILTSMLSTGIFGCTVNDSFNTFLSNVARLDKSL
VVFVVTNMVEQYNQAFAVIKMYQQYHGLPNFGNTCWFNALYQLLKSFSEKEQCVNDLLNCFDDFYDCPTSQCVEWVCEQL
GVQFGQQQDAVEMLMKVFDVFKCDVRVGFDCLSRLQQVNCGFCVEVPAQAVLMFSGKDQCGHWTAARKIVDKWYTFDDNH
VVQKDPVWQNVVLVLRDRGIFRSADFERKPARRRRVSHRVPRDTLSQDAITYIEDLRFSSGTCLSRYFVESVESFVSGDN
VSEVSDEQTCVEVAIEESDGHVEQICQSSVDCVGMPESFQFTFSMPLQTFVQECDQKCEDDFSQEHVECDQQFEPVEQVG
QGGQQDGQVDQQIKESEQVVEPSAPSGQESPQALLQQVVDEVVYQIEQVKCDQKQDQDSVQCDEIEEINSRGEQTVQQQL
QPILGHDLNENEGPTLSVGAGKLVRCRSLAVTESNLSTSNTIFVWSEVLTHQYIGFKTDLMGLTYNIKFKLICYVLFLWF
GVLCCTSHNTPFYMRLCIYLVLLWLSLMIWNASQINVKTGWNELYVLKLLTSIKLPNIVKFRCELVQWFVLKCLFVSFYV
YDYVVKVCVSIFQMPQLRPFTWPFIKLGFVDTFLSHHILAFPEKVANQSTLPTCGDKRYYVYVPSWCRASFTSLVMRARE
LTSTGRSKTLDNWHYQCCSKTAKPLSCFNVREFVFDQDCKHEAYGFLSSLCVYLLFYSGFLTFWLPLFCYYYVLFMCTFK
NLPVDITKPIKWTVLQQVVNDVLSLVTKPLFGRPVCPPLTTYLTSTTADEAVKVSRSLLGRFCTPLGFQQPVMNVENGVT
VSNFGFFNPLMWPLFVVVLLDNRFIWFFNVLSYVMMPVFVIILFYFYLKKICGCINFKGLSKCCTKHFNQFSKPLVAAGV
HGNRTNFTYQPMQEHWCDRHSWYCPKEEHYMTPDMAVYIKNYYNLACAPTADLVWCDYTKSAPTMTWSNFKYSSYKAKET
VLCAPSSHADSMLMAWYALLHNVRFTVNPNVVDLPPAVNTIYVSSDSEDSVQDKSQPDVKLRPKKPKGNFKKQSVAYFSR
EPVDIWYYTTLVIVMGVLFMFMYSCLMVGQYVVMPRDKFFGVNPTGYSYVNAPPYLHAAPPVLQNSDGMILATQLKVPSI
TYSVYRLLSGHLYFTKLIVSDNECTPPFGAARLSNEFSCNGFTYVLPAHLRFFNRYVMLIHPDQLHMLPFEVEYGSHTRV
CYTTGSNSVECLPTFEIISPYVFVFIVVIFTVIFLILIRLYIVMYSYFKVFTYVVFKLLFVNIIMVLFVVCLPPLVPGVV
FVLALWLCDSVMFLLYLAFLSLFILPWFYVLFFLFMVGGFVFWWMMRSADVVHLTTDGLTFNGTFEQISKCVFPLNPLIV
NRMLLDCQMSHSDLVEKSKLKTTEGKLANEMMKVFMTGETSYYQPSNFSFQSVFSKATSPFTLHARPPMPMFKLYVHFTG
SCVGSTSTGTGFAIDDNTIVTAKHLFEYDDLKPTHVSVEIVTRSHSARSASIIWKEPDVKGWTFKGENAYIQVENLKDFY
IEDFKYLPFQQIEKDFYKRMEPVTIYSVKYGSEFATQAWQTVNGHFVCYNTEGGDSGAPLVCNGRIVGVHQGLCDNFKTT
LASDFEGKMMTEVKGHHVDPPVYYKPIIISAAYNKFVAGEDSSVGDGKNYHKFENEDFACMCKELESVTFGDQLRRYCYN
LPQFLEPLQYFHVPSFWQPFKKQSVSNNVSWVVEHLHFIFSIYFLICDFVAYWWLDDPFSVVLPLFFIVQLLSTVFLKNV
LFWTTSYLITLAVTFYIHSEVAESMFLLGFLSDRVVNRMSLIIVVAIMCLFVVVRVVVNVKRAIFVFVVSVVLIFVHICL
GIVQFNSFVNVVLFDVYAVFTALLTPQPVVAIIMLLLFDTKMLMSFAFIVIVLSFRVFKDYKFVKVLHNFCNFDFVLSQV
SLFRYRHRNQGNDPTHYEALWLFLKELYYGIQDAKYEVFSPQAGSYNVKFLTDMTEQDQLEAVEQVQRRLQRFNIVQDKA
SPRLVLYSKTIEFIKDQIQQQRAVGANPFIITTLTSNDIGLDNVEVHNPANFKPEDLQAHMWFFSKSPVFIGQVPIPTNV
QTAAVLDTTYNCQDLTADEKNNVAATLQIQNAAITLSLFEKCTQFLESELGEVPTLMWQAEDVADIKHLESQIENLRKVL
DGMQFGTTEYKATRKQLNICQSQLDQAKAFERKLAKFLEKVDQQQAITNETAKQLSAFKNLVKQVYESYMSSLKVKVLEA
NDASCLLTSTDLPRKLVLMRPITGVDGIKIVEKANGCEITAFGTTFNTGHGSNLAGLAYSTTQPLSAYPFIFNLEGIFKQ
QANIGYKTVECNMSSHNGSVLYKGKVVAVPSDDNPDFVVCGKGYKLDCGINVLMIPSIVRYITLNLTDHLQKQSLKPRRR
LQYRQQGVRLGGVNLGEHQAFSNELISTVGYTTWVSSTVCRDNTHKHPWFVQIPVNEKDPEWFMHNTQLKDNQWVVDLKP
THWLVNADTGEQLFALSLTDEQALKAEAILQKWSPITQDVECWFKDLKGYYTVSGFQPLWPVCPVNICNVRLDPVFKPQS
IVYADDPTHFLSLPVVNKNFLAAFYDLQEGFPGKKQVAPHISLTMLKLSDEDIEKVEDILDEMVLPNSWVTITNPHMMGK
HYVCDVEGLDSLHDEVVSVLREHGIACDQKRLWKPHLTIGELNDVSFDKFKDFAISCKLEDCDFVKLGAPKANARYEFIT
TLPLGDLNC
>P0C6U2 ~~~1a~~~Replicase polyprotein 1a~~~
MACNRVTLAVASDSEISANGCSTIAQAVRRYSEAASNGFRACRFVSLDLQDCIVGIADDTYVMGLHGNQTLFCNIMKFSD
RPFMLHGWLVFSNSNYLLEEFDVVFGKRGGGNVTYTDQYLCGADGKPVMSEDLWQFVDHFGENEEIIINGHTYVCAWLTK
RKPLDYKRQNNLAIEEIEYVHGDALHTLRNGSVLEMAKEVKTSSKVVLSDALDKLYKVFGSPVMTNGSNILEAFTKPVFI
SALVQCTCGTKSWSVGDWTGFKSSCCNVISNKLCVVPGNVKPGDAVITTQQAGAGIKYFCGMTLKFVANIEGVSVWRVIA
LQSVDCFVASSTFVEEEHVNRMDTFCFNVRNSVTDECRLAMLGAEMTSNVRRQVASGVIDISTGWFDVYDDIFAESKPWF
VRKAEDIFGPCWSALASALKQLKVTTGELVRFVKSICNSAVAVVGGTIQILASVPEKFLNAFDVFVTAIQTVFDCAVETC
TIAGKAFDKVFDYVLLDNALVKLVTTKLKGVRERGLNKVKYATVVVGSTEEVKSSRVERSTAVLTIANNYSKLFDEGYTV
VIGDVAYFVSDGYFRLMASPNSVLTTAVYKPLFAFNVNVMGTRPEKFPTTVTCENLESAVLFVNDKITEFQLDYSIDVID
NEIIVKPNISLCVPLYVRDYVDKWDDFCRQYSNESWFEDDYRAFISVLDITDAAVKAAESKAFVDTIVPPCPSILKVIDG
GKIWNGVIKNVNSVRDWLKSLKLNLTQQGLLGTCAKRFKRWLGILLEAYNAFLDTVVSTVKIGGLTFKTYAFDKPYIVIR
DIVCKVENKTEAEWIELFPHNDRIKSFSTFESAYMPIADPTHFDIEEVELLDAEFVEPGCGGILAVIDEHVFYKKDGVYY
PSNGTNILPVAFTKAAGGKVSFSDDVEVKDIEPVYRVKLCFEFEDEKLVDVCEKAIGKKIKHEGDWDSFCKTIQSALSVV
SCYVNLPTYYIYDEEGGNDLSLPVMISEWPLSVQQAQQEATLPDIAEDVVDQVEEVNSIFDIETVDVKHDVSPFEMPFEE
LNGLKILKQLDNNCWVNSVMLQIQLTGILDGDYAMQFFKMGRVAKMIERCYTAEQCIRGAMGDVGLCMYRLLKDLHTGFM
VMDYKCSCTSGRLEESGAVLFCTPTKKAFPYGTCLNCNAPRMCTIRQLQGTIIFVQQKPEPVNPVSFVVKPVCSSIFRGA
VSCGHYQTNIYSQNLCVDGFGVNKIQPWTNDALNTICIKDADYNAKVEISVTPIKNTVDTTPKEEFVVKEKLNAFLVHDN
VAFYQGDVDTVVNGVDFDFIVNAANENLAHGGGLAKALDVYTKGKLQRLSKEHIGLAGKVKVGTGVMVECDSLRIFNVVG
PRKGKHERDLLIKAYNTINNEQGTPLTPILSCGIFGIKLETSLEVLLDVCNTKEVKVFVYTDTEVCKVKDFVSGLVNVQK
VEQPKIEPKPVSVIKVAPKPYRVDGKFSYFTEDLLCVADDKPIVLFTDSMLTLDDRGLALDNALSGVLSAAIKDCVDINK
AIPSGNLIKFDIGSVVVYMCVVPSEKDKHLDNNVQRCTRKLNRLMCDIVCTIPADYILPLVLSSLTCNVSFVGELKAAEA
KVITIKVTEDGVNVHDVTVTTDKSFEQQVGVIADKDKDLSGAVPSDLNTSELLTKAIDVDWVEFYGFKDAVTFATVDHSA
FAYESAVVNGIRVLKTSDNNCWVNAVCIALQYSKPHFISQGLDAAWNKFVLGDVEIFVAFVYYVARLMKGDKGDAEDTLT
KLSKYLANEAQVQLEHYSSCVECDAKFKNSVASINSAIVCASVKRDGVQVGYCVHGIKYYSRVRSVRGRAIIVSVEQLEP
CAQSRLLSGVAYTAFSGPVDKGHYTVYDTAKKSMYDGDRFVKHDLSLLSVTSVVMVGGYVAPVNTVKPKPVINQLDEKAQ
KFFDFGDFLIHNFVIFFTWLLSMFTLCKTAVTTGDVKIMAKAPQRTGVVLKRSLKYNLKASAAVLKSKWWLLAKFTKLLL
LIYTLYSVVLLCVRFGPFNFCSETVNGYAKSNFVKDDYCDGSLGCKMCLFGYQELSQFSHLDVVWKHITDPLFSNMQPFI
VMVLLLIFGDNYLRCFLLYFVAQMISTVGVFLGYKETNWFLHFIPFDVICDELLVTVIVIKVISFVRHVLFGCENPDCIA
CSKSARLKRFPVNTIVNGVQRSFYVNANGGSKFCKKHRFFCVDCDSYGYGSTFITPEVSRELGNITKTNVQPTGPAYVMI
DKVEFENGFYRLYSCETFWRYNFDITESKYSCKEVFKNCNVLDDFIVFNNNGTNVTQVKNASVYFSQLLCRPIKLVDSEL
LSTLSVDFNGVLHKAYIDVLRNSFGKDLNANMSLAECKRALGLSISDHEFTSAISNAHRCDVLLSDLSFNNFVSSYAKPE
EKLSAYDLACCMRAGAKVVNANVLTKDQTPIVWHAKDFNSLSAEGRKYIVKTSKAKGLTFLLTINENQAVTQIPATSIVA
KQGAGDAGHSLTWLWLLCGLVCLIQFYLCFFMPYFMYDIVSSFEGYDFKYIENGQLKNFEAPLKCVRNVFENFEDWHYAK
FGFTPLNKQSCPIVVGVSEIVNTVAGIPSNVYLVGKTLIFTLQAAFGNAGVCYDIFGVTTPEKCIFTSACTRLEGLGGNN
VYCYNTALMEGSLPYSSIQANAYYKYDNGNFIKLPEVIAQGFGFRTVRTIATKYCRVGECVESNAGVCFGFDKWFVNDGR
VANGYVCGTGLWNLVFNILSMFSSSFSVAAMSGQILLNCALGAFAIFCCFLVTKFRRMFGDLSVGVCTVVVAVLLNNVSY
IVTQNLVTMIAYAILYFFATRSLRYAWIWCAAYLIAYISFAPWWLCAWYFLAMLTGLLPSLLKLKVSTNLFEGDKFVGTF
ESAAAGTFVIDMRSYEKLANSISPEKLKSYAASYNRYKYYSGNANEADYRCACYAYLAKAMLDFSRDHNDILYTPPTVSY
GSTLQAGLRKMAQPSGFVEKCVVRVCYGNTVLNGLWLGDIVYCPRHVIASNTTSAIDYDHEYSIMRLHNFSIISGTAFLG
VVGATMHGVTLKIKVSQTNMHTPRHSFRTLKSGEGFNILACYDGCAQGVFGVNMRTNWTIRGSFINGACGSPGYNLKNGE
VEFVYMHQIELGSGSHVGSSFDGVMYGGFEDQPNLQVESANQMLTVNVVAFLYAAILNGCTWWLKGEKLFVEHYNEWAQA
NGFTAMNGEDAFSILAAKTGVCVERLLHAIQVLNNGFGGKQILGYSSLNDEFSINEVVKQMFGVNLQSGKTTSMFKSISL
FAGFFVMFWAELFVYTTTIWVNPGFLTPFMILLVALSLCLTFVVKHKVLFLQVFLLPSIIVAAIQNCAWDYHVTKVLAEK
FDYNVSVMQMDIQGFVNIFICLFVALLHTWRFAKERCTHWCTYLFSLIAVLYTALYSYDYVSLLVMLLCAISNEWYIGAI
IFRICRFGVAFLPVEYVSYFDGVKTVLLFYMLLGFVSCMYYGLLYWINRFCKCTLGVYDFCVSPAEFKYMVANGLNAPNG
PFDALFLSFKLMGIGGPRTIKVSTVQSKLTDLKCTNVVLMGILSNMNIASNSKEWAYCVEMHNKINLCDDPETAQELLLA
LLAFFLSKHSDFGLGDLVDSYFENDSILQSVASSFVGMPSFVAYETARQEYENAVANGSSPQIIKQLKKAMNVAKAEFDR
ESSVQKKINRMAEQAAAAMYKEARAVNRKSKVVSAMHSLLFGMLRRLDMSSVDTILNMARNGVVPLSVIPATSAARLVVV
VPDHDSFVKMMVDGFVHYAGVVWTLQEVKDNDGKNVHLKDVTKENQEILVWPLILTCERVVKLQNNEIMPGKMKVKATKG
EGDGGITSEGNALYNNEGGRAFMYAYVTTKPGMKYVKWEHDSGVVTVELEPPCRFVIDTPTGPQIKYLYFVKNLNNLRRG
AVLGYIGATVRLQAGKQTEFVSNSHLLTHCSFAVDPAAAYLDAVKQGAKPVGNCVKMLTNGSGSGQAITCTIDSNTTQDT
YGGASVCIYCRAHVAHPTMDGFCQYKGKWVQVPIGTNDPIRFCLENTVCKVCGCWLNHGCTCDRTAIQSFDNSYLNESGA
LVPLD
>P0C6U3 ~~~1a~~~Replicase polyprotein 1a~~~
MIKTSKYGLGFKWAPEFRWLLPDAAEELASPMKSDEGGLCPSTGQAMESVGFVYDNHVKIDCRCILGQEWHVQSNLIRDI
FVHEDLHVVEVLTKTAVKSGTAILIKSPLHSLGGFPKGYVMGLFRSYKTKRYVVHHLSMTTSTTNFGEDFLGWIVPFGFM
PSYVHKWFQFCRLYIEESDLIISNFKFDDYDFSVEDAYAEVHAEPKGKYSQKAYALLRQYRGIKPVLFVDQYGCDYSGKL
ADCLQAYGHYSLQDMRQKQSVWLANCDFDIVVAWHVVRDSRFVMRLQTIATICGIKYVAQPTEDVVDGDVVIREPVHLLS
ADAIVLKLPSLMKVMTHMDDFSIKSIYNVDLCDCGFVMQYGYVDCFNDNCDFYGWVSGNMMDGFSCPLCCTVYDSSEVKA
QSSGVIPENPVLFTNSTDTVNHDSFNLYGYSVTPFGSCIYWSPRPGLWIPIIKSSVKSYDDLVYSGVVGCKSIVKETALI
THALYLDYVQCKCGNLEQNHILGVNNSWCRQLLLNRGDYNMLLKNIDLFVKRRADFACKFAVCGDGFVPFLLDGLIPRSY
YLIQSGIFFTSLMSQFSQEVSDMCLKMCILFMDRVSVATFYIEHYVNRLVTQFKLLGTTLVNKMVNWFNTMLDASAPATG
WLLYQLLNGLFVVSQANFNFVALIPDYAKILVNKFYTFFKLLLECVTVDVLKDMPVLKTINGLVCIVGNKFYNVSTGLIP
GFVLPCNAQEQQIYFFEGVAESVIVEDDVIENVKSSLSSYEYCQPPKSVEKICIIDNMYMGKCGDKFFPIVMNDKNICLL
DQAWRFPCAGRKVNFNEKPVVMEIPSLMTVKVMFDLDSTFDDILGKVCSEFEVEKGVTVDDFVAVVCDAIENALNSCKEH
PVVGYQVRAFLNKLNENVVYLFDEAGDEAMASRMYCTFAIEDVEDVISSEAVEDTIDGVVEDTINDDEDVVTGDNDDEDV
VTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDV
VTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNDDEDVVTGDNNDEEIVTGDNDDQIVVTGDDVDDIESIYDFDTYKA
LLVFNDVYNDALFVSYGSSVETETYFKVNGLWSPTITHTNCWLRSVLLVMQKLPFKFKDLAIENMWLSYKVGYNQSFVDY
LLTTIPKAIVLPQGGFVADFAYWFLNQFDINAYANWCCLKCGFSFDLNGLDALFFYGDIVSHVCKCGHNMTLIAADLPCT
LHFSLFDDNFCAFCTPKKIFIAACAVDVNVCHSVAVIGDEQIDGKFVTKFSGDKFDFIVGYGMSFSMSSFELPQLYGLCI
TPNVCFVKGDIINVARLVKADVIVNPANGHMLHGGGVAKAIAVAAGKKFSKETAAMVKSKGVCQVGDCYVSTGGKLCKTI
LNIVGPDARQDGRQSYVLLARAYKHLNNYDCCLSTLISAGIFSVPADVSLTYLLGVVDKQVILVSNNKEDFDIIQKCQIT
SVVGTKALAVRLTANVGRVIKFETDAYKLFLSGDDCFVSNSSVIQEVLLLRHDIQLNNDVRDYLLSKMTSLPKDWRLINK
FDVINGVKTVKYFECPNSIYICSQGKDFGYVCDGSFYKATVNQVCVLLAKKIDVLLTVDGVNFKSISLTVGEVFGKILGN
VFCDGIDVTKLKCSDFYADKILYQYENLSLADISAVQSSFGFDQQQLLAYYNFLTVCKWSVVVNGPFFSFEQSHNNCYVN
VACLMLQHINLKFNKWQWQEAWYEFRAGRPHRLVALVLAKGHFKFDEPSDATDFIRVVLKQADLSGAICELELICDCGIK
QESRVGVDAVMHFGTLAKTDLFNGYKIGCNCAGRIVHCTKLNVPFLICSNTPLSKDLPDDVVAANMFMGVGVGHYTHLKC
GSPYQHYDACSVKKYTGVSGCLTDCLYLKNLTQTFTSMLTNYFLDDVEMVAYNPDLSQYYCDNGKYYTKPIIKAQFKPFA
KVDGVYTNFKLVGHDICAQLNDKLGFNVDLPFVEYKVTVWPVATGDVVLASDDLYVKRYFKGCETFGKPVIWFCHDEASL
NSLTYFNKPSFKSENRYSVLSVDSVSEESQGNVVTSVMESQISTKEVKLKGVRKTVKIEDAIIVNDENSSIKVVKSLSLV
DVWDMYLTGCDYVVWVANELSRLVKSPTVREYIRYGIKPITIPIDLLCLRDDNQTLLVPKIFKARAIEFYGFLKWLFIYV
FSLLHFTNDKTIFYTTEIASKFTFNLFCLALKNAFQTFRWSIFIKGFLVVATVFLFWFNFLYINVIFSDFYLPNISVFPI
FVGRIVMWIKATFGLVTICDFYSKLGVGFTSHFCNGSFICELCHSGFDMLDTYAAIDFVQYEVDRRVLFDYVSLVKLIVE
LVIGYSLYTVWFYPLFCLIGLQLFTTWLPDLFMLETMHWLIRFIVFVANMLPAFVLLRFYIVVTAMYKVVGFIRHIVYGC
NKAGCLFCYKRNCSVRVKCSTIVGGVIRYYDITANGGTGFCVKHQWNCFNCHSFKPGNTFITVEAAIELSKELKRPVNPT
DASHYVVTDIKQVGCMMRLFYDRDGQRVYDDVDASLFVDINNLLHSKVKVVPNLYVVVVESDADRANFLNAVVFYAQSLY
RPILLVDKKLITTACNGISVTQTMFDVYVDTFMSHFDVDRKSFNNFVNIAHASLREGVQLEKVLDTFVGCVRKCCSIDSD
VETRFITKSMISAVAAGLEFTDENYNNLVPTYLKSDNIVAADLGVLIQNGAKHVQGNVAKAANISCIWFIDAFNQLTADL
QHKLKKACVKTGLKLKLTFNKQEASVPILTTPFSLKGGVVLSNLLYILFFVSLICFILLWALLPTYSVYKSDIHLPAYAS
FKVIDNGVVRDISVNDLCFANKFFQFDQWYESTFGSVYYHNSMDCPIVVAVMDEDIGSTMFNVPTKVLRHGFHVLHFLTY
AFASDSVQCYTPHIQISYNDFYASGCVLSSLCTMFKRGDGTPHPYCYSDGVMKNASLYTSLVPHTRYSLANSNGFIRFPD
VISEGIVRIVRTRSMTYCRVGACEYAEEGICFNFNSSWVLNNDYYRSMPGTFCGRDLFDLFYQFFSSLIRPIDFFSLTAS
SIFGAILAIVVVLVFYYLIKLKRAFGDYTSVVVINVVVWCINFLMLFVFQVYPICACVYACFYFYVTLYFPSEISVIMHL
QWIVMYGAIMPFWFCVTYVAMVIANHVLWLFSYCRKIGVNVCSDSTFEETSLTTFMITKDSYCRLKNSVSDVAYNRYLSL
YNKYRYYSGKMDTAAYREAACSQLAKAMETFNHNNGNDVLYQPPTASVSTSFLQSGIVKMVSPTSKIEPCIVSVTYGSMT
LNGLWLDDKVYCPRHVICSSSNMNEPDYSALLCRVTLGDFTIMSGRMSLTVVSYQMQGCQLVLTVSLQNPYTPKYTFGNV
KPGETFTVLAAYNGRPQGAFHVTMRSSYTIKGSFLCGSCGSVGYVLTGDSVKFVYMHQLELSTGCHTGTDFTGNFYGPYR
DAQVVQLPVKDYVQTVNVIAWLYAAILNNCAWFVQNDVCSTEDFNVWAMANGFSQVKADLVLDALASMTGVSIETLLAAI
KRLYMGFQGRQILGSCTFEDELAPSDVYQQLAGVKLQSKTKRFIKETIYWILISTFLFSCIISAFVKWTIFMYINTHMIG
VTLCVLCFVSFMMLLVKHKHFYLTMYIIPVLCTLFYVNYLVVYKEGFRGFTYVWLSYFVPAVNFTYVYEVFYGCILCVFA
IFITMHSINHDIFSLMFLVGRIVTLISMWYFGSNLEEDVLLFITAFLGTYTWTTILSLAIAKIVANWLSVNIFYFTDVPY
IKLILLSYLFIGYILSCYWGFFSLLNSVFRMPMGVYNYKISVQELRYMNANGLRPPRNSFEAILLNLKLLGIGGVPVIEV
SQIQSKLTDVKCANVVLLNCLQHLHVASNSKLWQYCSVLHNEILSTSDLSVAFDKLAQLLIVLFANPAAVDTKCLASIDE
VSDDYVQDSTVLQALQSEFVNMASFVEYEVAKKNLADAKNSGSVNQQQIKQLEKACNIAKSVYERDKAVARKLERMADLA
LTNMYKEARINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLSAIPALAANTLTIVIPDKQVFDKVVDNVY
VTYAGSVWHIQTVQDADGINKQLTDISVDSNWPLVIIANRYNEVANAVMQNNELMPHKLKIQVVNSGSDMNCNIPTQCYY
NNGSSGRIVYAVLSDVDGLKYTKIMKDDGNCVVLELDPPCKFSIQDVKGLKIKYLYFIKGCNTLARGWVVGTLSSTIRLQ
AGVATEYAANSSILSLCAFSVDPKKTYLDYIQQGGVPIINCVKMLCDHAGTGMAITIKPEATINQDSYGGASVCIYCRAR
VEHPDVDGICKLRGKFVQVPLGIKDPILYVLTHDVCQVCGFWRDGSCSCVGSSVAVQSKDLNFLNGFGVLV
>P0C6U6 ~~~1a~~~Replicase polyprotein 1a~~~
MFYNQVTLAVASDSEISGFGFAIPSVAVRTYSEAAAQGFQACRFVAFGLQDCVTGINDDDYVIALTGTNQLCAKILPFSD
RPLNLRGWLIFSNSNYVLQDFDVVFGHGAGSVVFVDKYMCGFDGKPVLPKNMWEFRDYFNNNTDSIVIGGVTYQLAWDVI
RKDLSYEQQNVLAIESIHYLGTTGHTLKSGCKLTNAKPPKYSSKVVLSGEWNAVYRAFGSPFITNGMSLLDIIVKPVFFN
AFVKCNCGSESWSVGAWDGYLSSCCGTPAKKLCVVPGNVVPGDVIITSTSAGCGVKYYAGLVVKHITNITGVSLWRVTAV
HSDGMFVASSSYDALLHRNSLDPFCFDVNTLLSNQLRLAFLGASVTEDVKFAASTGVIDISAGMFGLYDDILTNNKPWFV
RKASGLFDAIWDAFVAAIKLVPTTTGVLVRFVKSIASTVLTVSNGVIIMCADVPDAFQSVYRTFTQAICAAFDFSLDVFK
IGDVKFKRLGDYVLTENALVRLTTEVVRGVRDARIKKAMFTKVVVGPTTEVKFSVIELATVNLRLVDCAPVVCPKGKIVV
IAGQAFFYSGGFYRFMVDPTTVLNDPVFTGDLFYTIKFSGFKLDGFNHQFVTASSATDAIIAVELLLLDFKTAVFVYTCV
VDGCSVIVRRDATFATHVCFKDCYNVWEQFCIDNCGEPWFLTDYNAILQSNNPQCAIVQASESKVLLERFLPKCPEILLS
IDDGHLWNLFVEKFNFVTDWLKTLKLTLTSNGLLGNCAKRFRRVLVKLLDVYNGFLETVCSVAYTAGVCIKYYAVNVPYV
VISGFVSRVIRRERCDMTFPCVSCVTFFYEFLDTCFGVSKPNAIDVEHLELKETVFVEPKDGGQFFVSGDYLWYVVDDIY
YPASCNGVLPVAFTKLAGGKISFSDDVIVHDVEPTHKVKLIFEFEDDVVTSLCKKSFGKSIIYTGDWEGLHEVLTSAMNV
IGQHIKLPQFYIYDEEGGYDVSKPVMISQWPISNDSNGCVVEASTDFHQLECIVDDSVREEVDIIEQPFEEVEHVLSIKQ
PFSFSFRDELGVRVLDQSDNNCWISTTLVQLQLTKLLDDSIEMQLFKVGKVDSIVQKCYELSHLISGSLGDSGKLLSELL
KEKYTCSITFEMSCDCGKKFDDQVGCLFWIMPYTKLFQKGECCICHKMQTYKLVSMKGTGVFVQDPAPIDIDAFPVKPIC
SSVYLGVKGSGHYQTNLYSFNKAIDGFGVFDIKNSSVNTVCFVDVDFHSVEIEAGEVKPFAVYKNVKFYLGDISHLVNCV
SFDFVVNAANENLLHGGGVARAIDILTEGQLQSLSKDYISSNGPLKVGAGVMLECEKFNVFNVVGPRTGKHEHSLLVEAY
NSILFENGIPLMPLLSCGIFGVRIENSLKALFSCDINKPLQVFVYSSNEEQAVLKFLDGLDLTPVIDDVDVVKPFRVEGN
FSFFDCGVNALDGDIYLLFTNSILMLDKQGQLLDTKLNGILQQAALDYLATVKTVPAGNLVKLFVESCTIYMCVVPSIND
LSFDKNLGRCVRKLNRLKTCVIANVPAIDVLKKLLSSLTLTVKFVVESNVMDVNDCFKNDNVVLKITEDGINVKDVVVES
SKSLGKQLGVVSDGVDSFEGVLPINTDTVLSVAPEVDWVAFYGFEKAALFASLDVKPYGYPNDFVGGFRVLGTTDNNCWV
NATCIILQYLKPTFKSKGLNVLWNKFVTGDVGPFVSFIYFITMSSKGQKGDAEEALSKLSEYLISDSIVTLEQYSTCDIC
KSTVVEVKSAIVCASVLKDGCDVGFCPHRHKLRSRVKFVNGRVVITNVGEPIISQPSKLLNGIAYTTFSGSFDNGHYVVY
DAANNAVYDGARLFSSDLSTLAVTAIVVVGGCVTSNVPTIVSEKISVMDKLDTGAQKFFQFGDFVMNNIVLFLTWLLSMF
SLLRTSIMKHDIKVIAKAPKRTGVILTRSFKYNIRSALFVIKQKWCVIVTLFKFLLLLYAIYALVFMIVQFSPFNSLLCG
DIVSGYEKSTFNKDIYCGNSMVCKMCLFSYQEFNDLDHTSLVWKHIRDPILISLQPFVILVILLIFGNMYLRFGLLYFVA
QFISTFGSFLGFHQKQWFLHFVPFDVLCNEFLATFIVCKIVLFVRHIIVGCNNADCVACSKSARLKRVPLQTIINGMHKS
FYVNANGGTCFCNKHNFFCVNCDSFGPGNTFINGDIARELGNVVKTAVQPTAPAYVIIDKVDFVNGFYRLYSGDTFWRYD
FDITESKYSCKEVLKNCNVLENFIVYNNSGSNITQIKNACVYFSQLLCEPIKLVNSELLSTLSVDFNGVLHKAYVDVLCN
SFFKELTANMSMAECKATLGLTVSDDDFVSAVANAHRYDVLLSDLSFNNFFISYAKPEDKLSVYDIACCMRAGSKVVNHN
VLIKESIPIVWGVKDFNTLSQEGKKYLVKTTKAKGLTFLLTFNDNQAITQVPATSIVAKQGAGFKRTYNFLWYVCLFVVA
LFIGVSFIDYTTTVTSFHGYDFKYIENGQLKVFEAPLHCVRNVFDNFNQWHEAKFGVVTTNSDKCPIVVGVSERINVVPG
VPTNVYLVGKTLVFTLQAAFGNTGVCYDFDGVTTSDKCIFNSACTRLEGLGGDNVYCYNTDLIEGSKPYSTLQPNAYYKY
DAKNYVRFPEILARGFGLRTIRTLATRYCRVGECRDSHKGVCFGFDKWYVNDGRVDDGYICGDGLIDLLVNVLSIFSSSF
SVVAMSGHMLFNFLFAAFITFLCFLVTKFKRVFGDLSYGVFTVVCATLINNISYVVTQNLFFMLLYAILYFVFTRTVRYA
WIWHIAYIVAYFLLIPWWLLTWFSFAAFLELLPNVFKLKISTQLFEGDKFIGTFESAAAGTFVLDMRSYERLINTISPEK
LKNYAASYNKYKYYSGSASEADYRCACYAHLAKAMLDYAKDHNDMLYSPPTISYNSTLQSGLKKMAQPSGCVERCVVRVC
YGSTVLNGVWLGDTVTCPRHVIAPSTTVLIDYDHAYSTMRLHNFSVSHNGVFLGVVGVTMHGSVLRIKVSQSNVHTPKHV
FKTLKPGDSFNILACYEGIASGVFGVNLRTNFTIKGSFINGACGSPGYNVRNDGTVEFCYLHQIELGSGAHVGSDFTGSV
YGNFDDQPSLQVESANLMLSDNVVAFLYAALLNGCRWWLCSTRVNVDGFNEWAMANGYTSVSSVECYSILAAKTGVSVEQ
LLASIQHLHEGFGGKNILGYSSLCDEFTLAEVVKQMYGVNLQSGKVIFGLKTMFLFSVFFTMFWAELFIYTNTIWINPVI
LTPIFCLLLFLSLVLTMFLKHKFLFLQVFLLPTVIATALYNCVLDYYIVKFLADHFNYNVSVLQMDVQGLVNVLVCLFVV
FLHTWRFSKERFTHWFTYVCSLIAVAYTYFYSGDFLSLLVMFLCAISSDWYIGAIVFRLSRLIVFFSPESVFSVFGDVKL
TLVVYLICGYLVCTYWGILYWFNRFFKCTMGVYDFKVSAAEFKYMVANGLHAPHGPFDALWLSFKLLGIGGDRCIKISTV
QSKLTDLKCTNVVLLGCLSSMNIAANSSEWAYCVDLHNKINLCDDPEKAQSMLLALLAFFLSKHSDFGLDGLIDSYFDNS
STLQSVASSFVSMPSYIAYENARQAYEDAIANGSSSQLIKQLKRAMNIAKSEFDHEISVQKKINRMAEQAATQMYKEARS
VNRKSKVISAMHSLLFGMLRRLDMSSVETVLNLARDGVVPLSVIPATSASKLTIVSPDLESYSKIVCDGSVHYAGVVWTL
NDVKDNDGRPVHVKEITKENVETLTWPLILNCERVVKLQNNEIMPGKLKQKPMKAEGDGGVLGDGNALYNTEGGKTFMYA
YISNKADLKFVKWEYEGGCNTIELDSPCRFMVETPNGPQVKYLYFVKNLNTLRRGAVLGFIGATIRLQAGKQTELAVNSG
LLTACAFSVDPATTYLEAVKHGAKPVSNCIKMLSNGAGNGQAITTSVDANTNQDSYGGASICLYCRAHVPHPSMDGYCKF
KGKCVQVPIGCLDPIRFCLENNVCNVCGCWLGHGCACDRTTIQSVDISYLNEQGVLVQLD
>P0C6U7 ~~~1a~~~Replicase polyprotein 1a~~~
MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDMICSTTAQKLETDGICPENHVMVDCRRLLKQECCVQSSLIREI
VMNASPYHLEVLLQDALQSREAVLVTTPLGMSLEACYVRGCNPKGWTMGLFRRRSVCNTGRCTVNKHVAYQLYMIDPAGV
CLGAGQFVGWVIPLAFMPVQSRKFIVPWVMYLRKRGEKGAYNKDHGCGGFGHVYDFKVEDAYDQVHDEPKGKFSKKAYAL
IRGYRGVKPLLYVDQYGCDYTGSLADGLEAYADKTLQEMKALFPTWSQELPFDVIVAWHVVRDPRYVMRLQSAATICSVA
YVANPTEDLCDGSVVIKEPVHVYADDSIILRQYNLFDIMSHFYMEADTVVNAFYGVALKDCGFVMQFGYIDCEQDSCDFK
GWIPGNMIDGFACTTCGHVYEVGDLIAQSSGVLPVNPVLHTKSAAGYGGFGCKDSFTLYGQTVVYFGGCVYWSPARNIWI
PILKSSVKSYDSLVYTGVLGCKAIVKETNLICKALYLDYVQHKCGNLHQRELLGVSDVWHKQLLINRGVYKPLLENIDYF
NMRRAKFSLETFTVCADGFMPFLLDDLVPRAYYLAVSGQAFCDYADKLCHAVVSKSKELLDVSLDSLGAAIHYLNSKIVD
LAQHFSDFGTSFVSKIVHFFKTFTTSTALAFAWVLFHVLHGAYIVVESDIYFVKNIPRYASAVAQAFQSVAKVVLDSLRV
TFIDGLSCFKIGRRRICLSGRKIYEVERGLLHSSQLPLDVYDLTMPSQVQKAKQKPIYLKGSGSDFSLADSVVEVVTTSL
TPCGYSEPPKVADKICIVDNVYMAKAGDKYYPVVVDDHVGLLDQAWRVPCAGRRVTFKEQPTVKEIISMPKIIKVFYELD
NDFNTILNTACGVFEVDDTVDMEEFYAVVIDAIEEKLSPCKELEGVGAKVSAFLQKLEDNPLFLFDEAGEEVFAPKLYCA
FTAPEDDDFLEESDVEEDDVEGEETDLTITSAGQPCVASEQEESSEVLEDTLDDGPSVETSDSQVEEDVEMSDFVDLESV
IQDYENVCFEFYTTEPEFVKVLGLYVPKATRNNCWLRSVLAVMQKLPCQFKDKNLQDLWVLYKQQYSQLFVDTLVNKIPA
NIVLPQGGYVADFAYWFLTLCDWQCVAYWKCIKCDLALKLKGLDAMFFYGDVVSHICKCGESMVLIDVDVPFTAHFALKD
KLFCAFITKRIVYKAACVVDVNDSHSMAVVDGKQIDDHRITSITSDKFDFIIGHGMSFSMTTFEIAQLYGSCITPNVCFV
KGDIIKVSKLVKAEVVVNPANGHMVHGGGVAKAIAVAAGQQFVKETTNMVKSKGVCATGDCYVSTGGKLCKTVLNVVGPD
ARTQGKQSYVLLERVYKHFNNYDCVVTTLISAGIFSVPSDVSLTYLLGTAKKQVVLVSNNQEDFDLISKCQITAVEGTKK
LAARLSFNVGRSIVYETDANKLILINDVAFVSTFNVLQDVLSLRHDIALDDDARTFVQSNVDVLPEGWRVVNKFYQINGV
RTVKYFECTGGIDICSQDKVFGYVQQGIFNKATVAQIKALFLDKVDILLTVDGVNFTNRFVPVGESFGKSLGNVFCDGVN
VTKHKCDINYKGKVFFQFDNLSSEDLKAVRSSFNFDQKELLAYYNMLVNCFKWQVVVNGKYFTFKQANNNCFVNVSCLML
QSLHLTFKIVQWQEAWLEFRSGRPARFVALVLAKGGFKFGDPADSRDFLRVVFSQVDLTGAICDFEIACKCGVKQEQRTG
LDAVMHFGTLSREDLEIGYTVDCSCGKKLIHCVRFDVPFLICSNTPASVKLPKGVGSANIFIGDNVGHYVHVKCEQSYQL
YDASNVKKVTDVTGKLSDCLYLKNLKQTFKSVLTTYYLDDVKKIEYKPDLSQYYCDGGKYYTQRIIKAQFKTFEKVDGVY
TNFKLIGHTVCDSLNSKLGFDSSKEFVEYKITEWPTATGDVVLANDDLYVKRYERGCITFGKPVIWLSHEKASLNSLTYF
NRPLLVDDNKFDVLKVDDVDDSGDSSESGAKETKEINIIKLSGVKKPFKVEDSVIVNDDTSETKYVKSLSIVDVYDMWLT
GCKYVVRTANALSRAVNVPTIRKFIKFGMTLVSIPIDLLNLREIKPAVNVVKAVRNKTSACFNFIKWLFVLLFGWIKISA
DNKVIYTTEIASKLTCKLVALAFKNAFLTFKWSMVARGACIIATIFLLWFNFIYANVIFSDFYLPKIGFLPTFVGKIAQW
IKNTFSLVTICDLYSIQDVGFKNQYCNGSIACQFCLAGFDMLDNYKAIDVVQYEADRRAFVDYTGVLKIVIELIVSYALY
TAWFYPLFALISIQILTTWLPELFMLSTLHWSFRLLVALANMLPAHVFMRFYIIIASFIKLFSLFKHVAYGCSKSGCLFC
YKRNRSLRVKCSTIVGGMIRYYDVMANGGTGFCSKHQWNCIDCDSYKPGNTFITVEAALDLSKELKRPIQPTDVAYHTVT
DVKQVGCSMRLFYDRDGQRIYDDVNASLFVDYSNLLHSKVKSVPNMHVVVVENDADKANFLNAAVFYAQSLFRPILMVDK
NLITTANTGTSVTETMFDVYVDTFLSMFDVDKKSLNALIATAHSSIKQGTQIYKVLDTFLSCARKSCSIDSDVDTKCLAD
SVMSAVSAGLELTDESCNNLVPTYLKSDNIVAADLGVLIQNSAKHVQGNVAKIAGVSCIWSVDAFNQFSSDFQHKLKKAC
CKTGLKLKLTYNKQMANVSVLTTPFSLKGGAVFSYFVYVCFVLSLVCFIGLWCLMPTYTVHKSDFQLPVYASYKVLDNGV
IRDVSVEDVCFANKFEQFDQWYESTFGLSYYSNSMACPIVVAVIDQDFGSTVFNVPTKVLRYGYHVLHFITHALSADGVQ
CYTPHSQISYSNFYASGCVLSSACTMFTMADGSPQPYCYTDGLMQNASLYSSLVPHVRYNLANAKGFIRFPEVLREGLVR
VVRTRSMSYCRVGLCEEADEGICFNFNGSWVLNNDYYRSLPGTFCGRDVFDLIYQLFKGLAQPVDFLALTASSIAGAILA
VIVVLVFYYLIKLKRAFGDYTSVVFVNVIVWCVNFMMLFVFQVYPTLSCVYAICYFYATLYFPSEISVIMHLQWLVMYGT
IMPLWFCLLYIAVVVSNHAFWVFSYCRKLGTSVRSDGTFEEMALTTFMITKDSYCKLKNSLSDVAFNRYLSLYNKYRYYS
GKMDTAAYREAACSQLAKAMDTFTNNNGSDVLYQPPTASVSTSFLQSGIVKMVNPTSKVEPCVVSVTYGNMTLNGLWLDD
KVYCPRHVICSASDMTNPDYTNLLCRVTSSDFTVLFDRLSLTVMSYQMRGCMLVLTVTLQNSRTPKYTFGVVKPGETFTV
LAAYNGKPQGAFHVTMRSSYTIKGSFLCGSCGSVGYVIMGDCVKFVYMHQLELSTGCHTGTDFNGDFYGPYKDAQVVQLP
IQDYIQSVNFLAWLYAAILNNCNWFIQSDKCSVEDFNVWALSNGFSQVKSDLVIDALASMTGVSLETLLAAIKRLKNGFQ
GRQIMGSCSFEDELTPSDVYQQLAGIKLQSKRTRLFKGTVCWIMASTFLFSCIITAFVKWTMFMYVTTNMFSITFCALCV
ISLAMLLVKHKHLYLTMYITPVLFTLLYNNYLVVYKHTFRGYVYAWLSYYVPSVEYTYTDEVIYGMLLLVGMVFVTLRSI
NHDLFSFIMFVGRLISVFSLWYKGSNLEEEILLMLASLFGTYTWTTVLSMAVAKVIAKWVAVNVLYFTDIPQIKIVLLCY
LFIGYIISCYWGLFSLMNSLFRMPLGVYNYKISVQELRYMNANGLRPPKNSFEALMLNFKLLGIGGVPIIEVSQFQSKLT
DVKCANVVLLNCLQHLHVASNSKLWHYCSTLHNEILATSDLSVAFEKLAQLLIVLFANPAAVDSKCLTSIEEVCDDYAKD
NTVLQALQSEFVNMASFVEYEVAKKNLDEARFSGSANQQQLKQLEKACNIAKSAYERDRAVAKKLERMADLALTNMYKEA
RINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPSLAANTLNIIVPDKSVYDQIVDNIYVTYAGNVW
QIQTIQDSDGTNKQLNEISDDCNWPLVIIANRYNEVSATVLQNNELMPAKLKIQVVNSGPDQTCNTPTQCYYNNSNNGKI
VYAILSDVDGLKYTKILKDDGNFVVLELDPPCKFTVQDAKGLKIKYLYFVKGCNTLARGWVVGTISSTVRLQAGTATEYA
SNSSILSLCAFSVDPKKTYLDFIQQGGTPIANCVKMLCDHAGTGMAITVKPDATTSQDSYGGASVCIYCRARVEHPDVDG
LCKLRGKFVQVPVGIKDPVSYVLTHDVCRVCGFWRDGSCSCVSTDTTVQSKDTNFLNGFGVRV
>P0C6V0 ~~~1a~~~Replicase polyprotein 1a~~~
MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVNCSRLPALECCVQSAIIRDI
FVDEDPQKVEASTMMALQFGSAVLVKPSKRLSIQAWTNLGVLPKTAAMGLFKRVCLCNTRECSCDAHVAFHLFTVQPDGV
CLGNGRFIGWFVPVTAIPEYAKQWLQPWSILLRKGGNKGSVTSGHFRRAVTMPVYDFNVEDACEEVHLNPKGKYSCKAYA
LLKGYRGVKPILFVDQYGCDYTGCLAKGLEDYGDLTLSEMKELFPVWRDSLDSEVLVAWHVDRDPRAAMRLQTLATVRCI
DYVGQPTEDVVDGDVVVREPAHLLAANAIVKRLPRLVETMLYTDSSVTEFCYKTKLCECGFITQFGYVDCCGDTCDFRGW
VAGNMMDGFPCPGCTKNYMPWELEAQSSGVIPEGGVLFTQSTDTVNRESFKLYGHAVVPFGSAVYWSPCPGMWLPVIWSS
VKSYSGLTYTGVVGCKAIVQETDAICRSLYMDYVQHKCGNLEQRAILGLDDVYHRQLLVNRGDYSLLLENVDLFVKRRAE
FACKFATCGDGLVPLLLDGLVPRSYYLIKSGQAFTSMMVNFSHEVTDMCMDMALLFMHDVKVATKYVKKVTGKLAVRFKA
LGVAVVRKITEWFDLAVDIAASAAGWLCYQLVNGLFAVANGVITFVQEVPELVKNFVDKFKAFFKVLIDSMSVSILSGLT
VVKTASNRVCLAGSKVYEVVQKSLSAYVMPVGCSEATCLVGEIEPAVFEDDVVDVVKAPLTYQGCCKPPTSFEKICIVDK
LYMAKCGDQFYPVVVDNDTVGVLDQCWRFPCAGKKVEFNDKPKVRKIPSTRKIKITFALDATFDSVLSKACSEFEVDKDV
TLDELLDVVLDAVESTLSPCKEHDVIGTKVCALLDRLAGDYVYLFDEGGDEVIAPRMYCSFSAPDDEDCVAADVVDADEN
QDDDAEDSAVLVADTQEEDGVAKGQVEADSEICVAHTGSQEELAEPDAVGSQTPIASAEETEVGEASDREGIAEAKATVC
ADAVDACPDQVEAFEIEKVEDSILDELQTELNAPADKTYEDVLAFDAVCSEALSAFYAVPSDETHFKVCGFYSPAIERTN
CWLRSTLIVMQSLPLEFKDLEMQKLWLSYKAGYDQCFVDKLVKSVPKSIILPQGGYVADFAYFFLSQCSFKAYANWRCLE
CDMELKLQGLDAMFFYGDVVSHMCKCGNSMTLLSADIPYTLHFGVRDDKFCAFYTPRKVFRAACAVDVNDCHSMAVVEGK
QIDGKVVTKFIGDKFDFMVGYGMTFSMSPFELAQLYGSCITPNVCFVKGDVIKVVRLVNAEVIVNPANGRMAHGAGVAGA
IAEKAGSAFIKETSDMVKAQGVCQVGECYESAGGKLCKKVLNIVGPDARGHGKQCYSLLERAYQHINKCDNVVTTLISAG
IFSVPTDVSLTYLLGVVTKNVILVSNNQDDFDVIEKCQVTSVAGTKALSLQLAKNLCRDVKFVTNACSSLFSESCFVSSY
DVLQEVEALRHDIQLDDDARVFVQANMDCLPTDWRLVNKFDSVDGVRTIKYFECPGGIFVSSQGKKFGYVQNGSFKEASV
SQIRALLANKVDVLCTVDGVNFRSCCVAEGEVFGKTLGSVFCDGINVTKVRCSAIYKGKVFFQYSDLSEADLVAVKDAFG
FDEPQLLKYYTMLGMCKWPVVVCGNYFAFKQSNNNCYINVACLMLQHLSLKFPKWQWQEAWNEFRSGKPLRFVSLVLAKG
SFKFNEPSDSIDFMRVVLREADLSGATCNLEFVCKCGVKQEQRKGVDAVMHFGTLDKGDLVRGYNIACTCGSKLVHCTQF
NVPFLICSNTPEGRKLPDDVVAANIFTGGSVGHYTHVKCKPKYQLYDACNVNKVSEAKGNFTDCLYLKNLKQTFSSVLTT
FYLDDVKCVEYKPDLSQYYCESGKYYTKPIIKAQFRTFEKVDGVYTNFKLVGHSIAEKLNAKLGFDCNSPFVEYKITEWP
TATGDVVLASDDLYVSRYSSGCITFGKPVVWLGHEEASLKSLTYFNRPSVVCENKFNVLPVDVSEPTDKGPVPAAVLVTG
VPGADASAGAGIAKEQKACASASVEDQVVTEVRQEPSVSAADVKEVKLNGVKKPVKVEGSVVVNDPTSETKVVKSLSIVD
VYDMFLTGCKYVVWTANELSRLVNSPTVREYVKWGMGKIVTPAKLLLLRDEKQEFVAPKVVKAKAIACYCAVKWFLLYCF
SWIKFNTDNKVIYTTEVASKLTFKLCCLAFKNALQTFNWSVVSRGFFLVATVFLLWFNFLYANVILSDFYLPNIGPLPTF
VGQIVAWFKTTFGVSTICDFYQVTDLGYRSSFCNGSMVCELCFSGFDMLDNYDAINVVQHVVDRRLSFDYISLFKLVVEL
VIGYSLYTVCFYPLFVLIGMQLLTTWLPEFFMLETMHWSARLFVFVANMLPAFTLLRFYIVVTAMYKVYCLCRHVMYGCS
KPGCLFCYKRNRSVRVKCSTVVGGSLRYYDVMANGGTGFCTKHQWNCLNCNSWKPGNTFITHEAAADLSKELKRPVNPTD
SAYYSVTEVKQVGCSMRLFYERDGQRVYDDVNASLFVDMNGLLHSKVKGVPETHVVVVENEADKAGFLGAAVFYAQSLYR
PMLMVEKKLITTANTGLSVSRTMFDLYVDSLLNVLDVDRKSLTSFVNAAHNSLKEGVQLEQVMDTFIGCARRKCAIDSDV
ETKSITKSVMSAVNAGVDFTDESCNNLVPTYVKSDTIVAADLGVLIQNNAKHVQANVAKAANVACIWSVDAFNQLSADLQ
HRLRKACSKTGLKIKLTYNKQEANVPILTTPFSLKGGAVFSRMLQWLFVANLICFIVLWALMPTYAVHKSDMQLPLYASF
KVIDNGVLRDVSVTDACFANKFNQFDQWYESTFGLAYYRNSKACPVVVAVIDQDIGHTLFNVPTTVLRYGFHVLHFITHA
FATDSVQCYTPHMQIPYDNFYASGCVLSSLCTMLAHADGTPHPYCYTGGVMHNASLYSSLAPHVRYNLASSNGYIRFPEV
VSEGIVRVVRTRSMTYCRVGLCEEAEEGICFNFNRSWVLNNPYYRAMPGTFCGRNAFDLIHQVLGGLVRPIDFFALTASS
VAGAILAIIVVLAFYYLIKLKRAFGDYTSVVVINVIVWCINFLMLFVFQVYPTLSCLYACFYFYTTLYFPSEISVVMHLQ
WLVMYGAIMPLWFCIIYVAVVVSNHALWLFSYCRKIGTEVRSDGTFEEMALTTFMITKESYCKLKNSVSDVAFNRYLSLY
NKYRYFSGKMDTAAYREAACSQLAKAMETFNHNNGNDVLYQPPTASVTTSFLQSGIVKMVSPTSKVEPCIVSVTYGNMTL
NGLWLDDKVYCPRHVICSSADMTDPDYPNLLCRVTSSDFCVMSGRMSLTVMSYQMQGCQLVLTVTLQNPNTPKYSFGVVK
PGETFTVLAAYNGRPQGAFHVTLRSSHTIKGSFLCGSCGSVGYVLTGDSVRFVYMHQLELSTGCHTGTDFSGNFYGPYRD
AQVVQLPVQDYTQTVNVVAWLYAAIFNRCNWFVQSDSCSLEEFNVWAMTNGFSSIKADLVLDALASMTGVTVEQVLAAIK
RLHSGFQGKQILGSCVLEDETPSDVYQQLAGVKLQSKRTRVIKGTCCWILASTFLFCSIISAFVKWTMFMYVTTHMLGVT
LCALCFVSFAMLLIKHKHLYLTMYIMPVLCTFYTNYLVVYKQSFRGLAYAWLSHFVPAVDYTYMDEVLYGVVLLVAMVFV
TMRSINHDVFSIMFLVGRLVSLVSMWYFGANLEEEVLLFLTSLFGTYTWTTMLSLATAKVIAKWLAVNVLYFTDVPQIKL
VLLSYLCIGYVCCCYWGILSLLNSIFRMPLGVYNYKISVQELRYMNANGLRPPRNSFEALMLNFKLLGIGGVPVIEVSQI
QSRLTDVKCANVVLLNCLQHLHIASNSKLWQYCSTLHNEILATSDLSMAFDKLAQLLVVLFANPAAVDSKCLASIEEVSD
DYVRDNTVLQALQSEFVNMASFVEYELAKKNLDEAKASGSANQQQIKQLEKACNIAKSAYERDRAVARKLERMADLALTN
MYKEARINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPSLTSNTLTIIVPDKQVFDQVVDNVYVTY
AGNVWHIQFIQDADGAVKQLNEIDVNSTWPLVIAANRHNEVSTVVLQNNELMPQKLRTQVVNSGSDMNCNTPTQCYYNTT
GTGKIVYAILSDCDGLKYTKIVKEDGNCVVLELDPPCKFSVQDVKGLKIKYLYFVKGCNTLARGWVVGTLSSTVRLQAGT
ATEYASNSAILSLCAFSVDPKKTYLDYIKQGGVPVTNCVKMLCDHAGTGMAITIKPEATTNQDSYGGASVCIYCRSRVEH
PDVDGLCKLRGKFVQVPLGIKDPVSYVLTHDVCQVCGFWRDGSCSCVGTGSQFQSKDTNFLNGFGVQV
>P0C6V1 ~~~1a~~~Replicase polyprotein 1a~~~
MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVDCSRLPALECCVQSAIIRDI
FVDEDPQKVEASTMMALQFGSAVLVKPSKRLSVQAWAKLGVLPKTPAMGLFKRFCLCNTRECVCDAHVAFQLFTVQPDGV
CLGNGRFIGWFVPVTAIPEYAKQWLQPWSILLRKGGNKGSVTSGHFRRAVTMPVYDFNVEDACEEVHLNPRGKYSCKAYA
LLRGYRGVKPILFVDQYGCDYTGCLAKGLEDYGDLTLSEMKELSPVWRDSLDNEVVVAWHVDRDPRAVMRLQTLATVRSI
EYVGQPIEDMVDGDVVMREPAHLLAPNAIVKRLPRLVETMLYTDSSVTEFCYKTKLCDCGFITQFGYVDCCGDTCGFRGW
VPGNMMDGFPCPGCCKSYMPWELEAQSSGVIPEGGVLFTQSTDTVNRESFKLYGHAVVPFGGAAYWSPYPGMWLPVIWSS
VKSYSYLTYTGVVGCKAIVQETDAICRFLYMDYVQHKCGNLEQRAILGLDDVYHRQLLVNRGDYSLLLENVDLFVKRRAE
FACKFATCGDGLVPLLLDGLVPRSYYLIKSGQAFTSLMVNFSREVVDMCMDMALLFMHDVKVATKYVKKVTGKVAVRFKA
LGIAVVRKITEWFDLAVDTAASAAGWLCYQLVNGLFAVANGVITFIQEVPELVKNFVDKFKTFFKVLIDSMSVSILSGLT
VVKTASNRVCLAGSKVYEVVQKSLPAYIMPVGCSEATCLVGEIEPAVFEDDVVDVVKAPLTYQGCCKPPSSFEKICIVDK
LYMAKCGDQFYPVVVDNDTVGVLDQCWRFPCAGKKVVFNDKPKVKEVPSTRKIKIIFALDATFDSVLSKACSEFEVDKDV
TLDELLDVVLDAVESTLSPCKEHGVIGTKVCALLERLVDDYVYLFDEGGEEVIASRMYCSFSAPDEDCVATDVVYADENQ
DDDADDPVVLVADTQEEDGVAREQVDSADSEICVAHTGGQEMTEPDVVGSQTPIASAEETEVGEACDREGIAEVKATVCA
DALDACPDQVEAFDIEKVEDSILSELQTELNAPADKTYEDVLAFDAIYSETLSAFYAVPSDETHFKVCGFYSPAIERTNC
WLRSTLIVMQSLPLEFKDLGMQKLWLSYKAGYDQCFVDKLVKSAPKSIILPQGGYVADFAYFFLSQCSFKVHANWRCLKC
GMELKLQGLDAVFFYGDVVSHMCKCGNSMTLLSADIPYTFDFGVRDDKFCAFYTPRKVFRAACAVDVNDCHSMAVVDGKQ
IDGKVVTKFNGDKFDFMVGHGMTFSMSPFEIAQLYGSCITPNVCFVKGDVIKVLRRVGAEVIVNPANGRMAHGAGVAGAI
AKAAGKAFINETADMVKAQGVCQVGGCYESTGGKLCKKVLNIVGPDARGHGNECYSLLERAYQHINKCDNVVTTLISAGI
FSVPTDVSLTYLLGVVTKNVILVSNNQDDFDVIEKCQVTSVAGTKALSFQLAKNLCRDVKFVTNACSSLFSESSFVSSYD
VLQEVEALRHDIQLDDDARVFVQANMDCLPTDWRLVNKFDSVDGVRTIKYFECPGEVFVSSQGKKFGYVQNGSFKEASVS
QIRALLANKVDVLCTVDGVNFRSCCVAEGEVFGKTLGSVFCDGINVTKVRCSAIHKGKVFFQYSGLSAADLAAVKDAFGF
DEPQLLQYYSMLGMCKWPVVVCGNYFAFKQSNNNCYINVACLMLQHLSLKFPKWQWRRPGNEFRSGKPLRFVSLVLAKGS
FKFNEPSDSTDFIRVELREADLSGATCDLEFICKCGVKQEQRKGVDAVMHFGTLDKSGLVKGYNIACTCGDKLVHCTQFN
VPFLICSNTPEGKKLPDDVVAANIFTGGSVGHYTHVKCKPKYQLYDACNVSKVSEAKGNFTDCLYLKNLKQTFSSVLTTY
YLDDVKCVAYKPDLSQYYCESGKYYTKPIIKAQFRTFEKVEGVYTNFKLVGHDIAEKLNAKLGFDCNSPFMEYKITEWPT
ATGDVVLASDDLYVSRYSGGCVTFGKPVIWRGHEEASLKSLTYFNRPSVVCENKFNVLPVDVSEPTDRRPVPSAVLVTGA
ASGADASAISTEPGTAKEQKACASDSVEDQIVMEAQKKSSVTTVAVKEVKLNGVKKPVKWNCSVVVNDPTSETKVVKSLS
IVDVYDMFLTGCRYVVWTANELSRLINSPTVREYVKWGMSKLIIPANLLLLRDEKQEFVAPKVVKAKAIACYGAVKWFLL
YCFSWIKFNTDNKVIYTTEVASKLTFKLCCLAFKNALQTFNWSVVSRGFFLVATVFLLWFNFLYANVILSDFYLPNIGPL
PMFVGQIVAWVKTTFGVLTICDFYQVTDLGYRSSFCNGSMVCELCFSGFDMLDNYESINVVQHVVDRRVSFDYISLFKLV
VELVIGYSLYTVCFYPLFVLVGMQLLTTWLPEFFMLGTMHWSARLFVFVANMLPAFTLLRFYIVVTAMYKVYCLCRHVMY
GCSKPGCLFCYKRNRSVRVKCSTVVGGSLRYYDVMANGGTGFCTKHQWNCLNCNSWKPGNTFITHEAAADLSKELKRPVN
PTDSAYYSVIEVKQVGCSMRLFYERDGQRVYDDVSASLFVDMNGLLHSKVKGVPETHVVVVENEADKAGFLNAAVFYAQS
LYRPMLMVEKKLITTANTGLSVSRTMFDLYVYSLLRHLDVDRKSLTSFVNAAHNSLKEGVQLEQVMDTFVGCARRKCAID
SDVETKSITKSVMAAVNAGVEVTDESCNNLVPTYVKSDTIVAADLGVLIQNNAKHVQSNVAKAANVACIWSVDAFNQLSA
DLQHRLRKACVKTGLKIKLTYNKQEANVPILTTPFSLKGGAVFSRVLQWLFVANLICFIVLWALMPTYAVHKSDMQLPLY
ASFKVIDNGVLRDVSVTDACFANKFNQFDQWYESTFGLVYYRNSKACPVVVAVIDQDIGHTLFNVPTKVLRYGFHVLHFI
THAFATDRVQCYTPHMQIPYDNFYASGCVLSSLCTMLAHADGTPHPYCYTEGVMHNASLYSSLVPHVRYNLASSNGYIRF
PEVVSEGIVRVVRTRSMTYCRVGLCEEAEEGICFNFNSSWVLNNPYYRAMPGTFCGRNAFDLIHQVLGGLVQPIDFFALT
ASSVAGAILAIIVVLAFYYLIKLKRAFGDYTSVVVINVIVWCINFLMLFVFQVYPTLSCLYACFYFYTTLYFPSEISVVM
HLQWLVMYGAIMPLWFCITYVAVVVSNHALWLFSYCRKIGTDVRSDGTFEEMALTTFMITKESYCKLKNSVSDVAFNRYL
SLYNKYRYFSGKMDTATYREAACSQLAKAMETFNHNNGNDVLYQPPTASVTTSFLQSGIVKMVSPTSKVEPCVVSVTYGN
MTLNGLWLDDKVYCPRHVICSSADMTDPDYPNLLCRVTSSDFCVMSDRMSLTVMSYQMQGSLLVLTVTLQNPNTPKYSFG
VVKPGETFTVLAAYNGRPQGAFHVVMRSSHTIKGSFLCGSCGSVGYVLTGDSVRFVYMHQLELSTGCHTGTDFSGNFYGP
YRDAQVVQLPVQDYTQTVNVVAWLYAAILNRCNWFVQSDSCSLEEFNVWAMTNGFSSIKADLVLDALASMTGVTVEQVLA
AIKRLHSGFQGKQILGSCVLEDELTPSDVYQQLAGVKLQSKRTRVIKGTCCWILASTFLFCSIISAFVKWTMFMYVTTHM
LGVTLCALCFVIFAMLLIKHKHLYLTMYIMPVLCTLFYTNYLVVGYKQSFRGLAYAWLSYFVPAVDYTYMDEVLYGVVLL
VAMVFVTMRSINHDVFSTMFLVGRLVSLVSMWYFGANLEEEVLLFLTSLFGTYTWTTMLSLATAKVIAKWLAVNVLYFTD
IPQIKLVLLSYLCIGYVCCCYWGVLSLLNSIFRMPLGVYNYKISVQELRYMNANGLRPPRNSFEALMLNFKLLGIGGVPV
IEVSQIQSRLTDVKCANVVLLNCLQHLHIASNSKLWQYCSTLHNEILATSDLSVAFDKLAQLLVVLFANPAAVDSKCLAS
IEEVSDDYVRDNTVLQALQSEFVNMASFVEYELAKKNLDEAKASGSANQQQIKQLEKACNIAKSAYERDRAVARKLERMA
DLALTNMYKEARINDKKSKVVSALQTMLFSMVRKLDNQALNSILDNAVKGCVPLNAIPPLTSNTLTIIVPDKQVFDQVVD
NVYVTYAPNVWHIQSIQDADGAVKQLNEIDVNSTWPLVISANRHNEVSTVVLQNNELMPQKLRTQVVNSGSDMNCNIPTQ
CYYNTTGTGKIVYAILSDCDGLKYTKIVKEDGNCVVLELDPPCKFSVQDVKGLKIKYLYFVKGCNTLARGWVVGTLSSTV
RLQAGTATEYASNSAILSLCAFSVDPKKTYLDYIQQGGVPVTNCVKMLCDHAGTGMAITIKPEATTNQDSYGGASVCIYC
RSRVEHPDVDGLCKLRGKFVQVPLGIKDPVSYVLTHDVCQVCGFWRDGSCSCVGTGSQFQSKDTNFLNGFGVQV
>P0C6V2 ~~~1a~~~Replicase polyprotein 1a~~~
MSSKQFKILVNEDYQVNVPSLPIRDVLQEIKYCYRNGFEGYVFVPEYCRDLVDCDRKDHYVIGVLGNGVSDLKPVLLTEP
SVMLQGFIVRANCNGVLEDFDLKIARTGRGAIYVDQYMCGADGKPVIEGDFKDYFGDEDIIEFEGEEYHCAWTTVRDEKP
LNQQTLFTIQEIQYNLDIPHKLPNCATRHVAPPVKKNSKIVLSEDYKKLYDIFGSPFMGNGDCLSKCFDTLHFIAATLRC
PCGSESSGVGDWTGFKTACCGLSGKVKGVTLGDIKPGDAVVTSMSAGKGVKFFANCVLQYAGDVEGVSIWKVIKTFTVDE
TVCTPGFEGELNDFIKPESKSLVACSVKRAFITGDIDDAVHDCIITGKLDLSTNLFGNVGLLFKKTPWFVQKCGALFVDA
WKVVEELCGSLTLTYKQIYEVVASLCTSAFTIVNYKPTFVVPDNRVKDLVDKCVKVLVKAFDVFTQIITIAGIEAKCFVL
GAKYLLFNNALVKLVSVKILGKKQKGLECAFFATSLVGATVNVTPKRTETATISLNKVDDVVAPGEGYIVIVGDMAFYKS
GEYYFMMSSPNFVLTNNVFKAVKVPSYDIVYDVDNDTKSKMIAKLGSSFEYDGDIDAAIVKVNELLIEFRQQSLCFRAFK
DDKSIFVEAYFKKYKMPACLAKHIGLWNIIKKDSCKRGFLNLFNHLNELEDIKETNIQAIKNILCPDPLLDLDYGAIWYN
CMPGCSDPSVLGSVQLLIGNGVKVVCDGCKGFANQLSKGYNKLCNAARNDIEIGGIPFSTFKTPTNTFIEMTDAIYSVIE
QGKALSFRDADVPVVDNGTISTADWSEPILLEPAEYVKPKNNGNVIVIAGYTFYKDEDEHFYPYGFGKIVQRMYNKMGGG
DKTVSFSEEVDVQEIAPVTRVKLEFEFDNEIVTGVLERAIGTRYKFTGTTWEEFEESISEELDAIFDTLANQGVELEGYF
IYDTCGGFDIKNPDGIMISQYDINITADEKSEVSASSEEEEVESVEEDPENEIVEASEGAEGTSSQEEVETVEVADITST
EEDVDIVEVSAKDDPWAAAVDVQEAEQFNPSLPPFKTTNLNGKIILKQGDNNCWINACCYQLQAFDFFNNEAWEKFKKGD
VMDFVNLCYAATTLARGHSGDAEYLLELMLNDYSTAKIVLAAKCGCGEKEIVLERAVFKLTPLKESFNYGVCGDCMQVNT
CRFLSVEGSGVFVHDILSKQTPEAMFVVKPVMHAVYTGTTQNGHYMVDDIEHGYCVDGMGIKPLKKRCYTSTLFINANVM
TRAEKPKQEFKVEKVEQQPIVEENKSSIEKEEIQSPKNDDLILPFYKAGKLSFYQGALDVLINFLEPDVIVNAANGDLKH
MGGVARAIDVFTGGKLTERSKDYLKKNKSIAPGNAVFFENVIEHLSVLNAVGPRNGDSRVEAKLCNVYKAIAKCEGKILT
PLISVGIFNVRLETSLQCLLKTVNDRGLNVFVYTDQERQTIENFFSCSIPVNVTEDNVNHERVSVSFDKTYGEQLKGTVV
IKDKDVTNQLPSAFDVGQKVIKAIDIDWQAHYGFRDAAAFSASSHDAYKFEVVTHSNFIVHKQTDNNCWINAICLALQRL
KPQWKFPGVRGLWNEFLERKTQGFVHMLYHISGVKKGEPGDAELMLHKLGDLMDNDCEIIVTHTTACDKCAKVEKFVGPV
VAAPLAIHGTDETCVHGVSVNVKVTQIKGTVAITSLIGPIIGEVLEATGYICYSGSNRNGHYTYYDNRNGLVVDAEKAYH
FNRDLLQVTTAIASNFVVKKPQAEERPKNCAFNKVAASPKIVQEQKLLAIESGANYALTEFGRYADMFFMAGDKILRLLL
EVFKYLLVLFMCLRSTKMPKVKVKPPLAFKDFGAKVRTLNYMRQLNKPSVWRYAKLVLLLIAIYNFFYLFVSIPVVHKLT
CNGAVQAYKNSSFIKSAVCGNSILCKACLASYDELADFQHLQVTWDFKSDPLWNRLVQLSYFAFLAVFGNNYVRCFLMYF
VSQYLNLWLSYFGYVEYSWFLHVVNFESISAEFVIVVIVVKAVLALKHIVFACSNPSCKTCSRTARQTRIPIQVVVNGSM
KTVYVHANGTGKFCKKHNFYCKNCDSYGFENTFICDEIVRDLSNSVKQTVYATDRSHQEVTKVECSDGFYRFYVGDEFTS
YDYDVKHKKYSSQEVLKSMLLLDDFIVYSPSGSALANVRNACVYFSQLIGKPIKIVNSDLLEDLSVDFKGALFNAKKNVI
KNSFNVDVSECKNLDECYRACNLNVSFSTFEMAVNNAHRFGILITDRSFNNFWPSKVKPGSSGVSAMDIGKCMTSDAKIV
NAKVLTQRGKSVVWLSQDFAALSSTAQKVLVKTFVEEGVNFSLTFNAVGSDDDLPYERFTESVSPKSGSGFFDVITQLKQ
IVILVFVFIFICGLCSVYSVATQSYIESAEGYDYMVIKNGIVQPFDDTISCVHNTYKGFGDWFKAKYGFIPTFGKSCPIV
VGTVFDLENMRPIPDVPAYVSIVGRSLVFAINAAFGVTNMCYDHTGNAVSKDSYFDTCVFNTACTTLTGLGGTIVYCAKQ
GLVEGAKLYSDLMPDYYYEHASGNMVKLPAIIRGLGLRFVKTQATTYCRVGECIDSKAGFCFGGDNWFVYDNEFGNGYIC
GNSVLGFFKNVFKLFNSNMSVVATSGAMLVNIIIACLAIAMCYGVLKFKKIFGDCTFLIVMIIVTLVVNNVSYFVTQNTF
FMIIYAIVYYFITRKLAYPGILDAGFIIAYINMAPWYVITAYILVFLYDSLPSLFKLKVSTNLFEGDKFVGNFESAAMGT
FVIDMRSYETIVNSTSIARIKSYANSFNKYKYYTGSMGEADYRMACYAHLGKALMDYSVNRTDMLYTPPTVSVNSTLQSG
LRKMAQPSGLVEPCIVRVSYGNNVLNGLWLGDEVICPRHVIASDTTRVINYENEMSSVRLHNFSVSKNNVFLGVVSARYK
GVNLVLKVNQVNPNTPEHKFKSIKAGESFNILACYEGCPGSVYGVNMRSQGTIKGSFIAGTCGSVGYVLENGILYFVYMH
HLELGNGSHVGSNFEGEMYGGYEDQPSMQLEGTNVMSSDNVVAFLYAALINGERWFVTNTSMSLESYNTWAKTNSFTELS
STDAFSMLAAKTGQSVEKLLDSIVRLNKGFGGRTILSYGSLCDEFTPTEVIRQMYGVNLQAGKVKSFFYPIMTAMTILFA
FWLEFFMYTPFTWINPTFVSIVLAVTTLISTVFVSGIKHKMLFFMSFVLPSVILVTAHNLFWDFSYYESLQSIVENTNTM
FLPVDMQGVMLTVFCFIVFVTYSVRFFTCKQSWFSLAVTTILVIFNMVKIFGTSDEPWTENQIAFCFVNMLTMIVSLTTK
DWMVVIASYRIAYYIVVCVMPSAFVSDFGFMKCISIVYMACGYLFCCYYGILYWVNRFTCMTCGVYQFTVSAAELKYMTA
NNLSAPKNAYDAMILSAKLIGVGGKRNIKISTVQSKLTEMKCTNVVLLGLLSKMHVESNSKEWNYCVGLHNEINLCDDPE
IVLEKLLALIAFFLSKHNTCDLSELIESYFENTTILQSVASAYAALPSWIALEKARADLEEAKKNDVSPQILKQLTKAFN
IAKSDFEREASVQKKLDKMAEQAAASMYKEARAVDRKSKIVSAMHSLLFGMLKKLDMSSVNTIIDQARNGVLPLSIIPAA
SATRLVVITPSLEVFSKIRQENNVHYAGAIWTIVEVKDANGSHVHLKEVTAANELNLTWPLSITCERTTKLQNNEIMPGK
LKERAVRASATLDGEAFGSGKALMASESGKSFMYAFIASDNNLKYVKWESNNDIIPIELEAPLRFYVDGANGPEVKYLYF
VKNLNTLRRGAVLGYIGATVRLQAGKPTEHPSNSSLLTLCAFSPDPAKAYVDAVKRGMQPVNNCVKMLSNGAGNGMAVTN
GVEANTQQDSYGGASVCIYCRCHVEHPAIDGLCRYKGKFVQIPTGTQDPIRFCIENEVCVVCGCWLNNGCMCDRTSMQSF
TVDQSYLNECGVLVQLD
>P0C6V3 ~~~1a~~~Replicase polyprotein 1a~~~
MASSLKQGVSPKPRDVILVSKDIPEQLCDALFFYTSHNPKDYADAFAVRQKFDRSLQTGKQFKFETVCGLFLLKGVDKIT
PGVPAKVLKATSKLADLEDIFGVSPLARKYRELLKTACQWSLTVEALDVRAQTLDEIFDPTEILWLQVAAKIHVSSMAMR
RLVGEVTAKVMDALGSNLSALFQIVKQQIARIFQKALAIFENVNELPQRIAALKMAFAKCARSITVVVVERTLVVKEFAG
TCLASINGAVAKFFEELPNGFMGSKIFTTLAFFKEAAVRVVENIPNAPRGTKGFEVVGNAKGTQVVVRGMRNDLTLLDQK
ADIPVEPEGWSAILDGHLCYVFRSGDRFYAAPLSGNFALSDVHCCERVVCLSDGVTPEINDGLILAAIYSSFSVSELVTA
LKKGEPFKFLGHKFVYAKDAAVSFTLAKAATIADVLRLFQSARVIAEDVWSSFTEKSFEFWKLAYGKVRNLEEFVKTYVC
KAQMSIVILAAVLGEDIWHLVSQVIYKLGVLFTKVVDFCDKHWKGFCVQLKRAKLIVTETFCVLKGVAQHCFQLLLDAIH
SLYKSFKKCALGRIHGDLLFWKGGVHKIVQDGDEIWFDAIDSVDVEDLGVVQEKSIDFEVCDDVTLPENQPGHMVQIEDD
GKNYMFFRFKKDENIYYTPMSQLGAINVVCKAGGKTVTFGETTVQEIPPPDVVPIKVSIECCGEPWNTIFKKAYKEPIEV
DTDLTVEQLLSVIYEKMCDDLKLFPEAPEPPPFENVALVDKNGKDLDCIKSCHLIYRDYESDDDIEEEDAEECDTDSGEA
EECDTNSECEEEDEDTKVLALIQDPASIKYPLPLDEDYSVYNGCIVHKDALDVVNLPSGEETFVVNNCFEGAVKPLPQKV
VDVLGDWGEAVDAQEQLCQQEPLQHTFEEPVENSTGSSKTMTEQVVVEDQELPVVEQDQDVVVYTPTDLEVAKETAEEVD
EFILIFAVPKEEVVSQKDGAQIKQEPIQVVKPQREKKAKKFKVKPATCEKPKFLEYKTCVGDLTVVIAKALDEFKEFCIV
NAANEHMTHGSGVAKAIADFCGLDFVEYCEDYVKKHGPQQRLVTPSFVKGIQCVNNVVGPRHGDNNLHEKLVAAYKNVLV
DGVVNYVVPVLSLGIFGVDFKMSIDAMREAFEGCTIRVLLFSLSQEHIDYFDVTCKQKTIYLTEDGVKYRSIVLKPGDSL
GQFGQVYAKNKIVFTADDVEDKEILYVPTTDKSILEYYGLDAQKYVIYLQTLAQKWNVQYRDNFLILEWRDGNCWISSAI
VLLQAAKIRFKGFLTEAWAKLLGGDPTDFVAWCYASCTAKVGDFSDANWLLANLAEHFDADYTNAFLKKRVSCNCGIKSY
ELRGLEACIQPVRATNLLHFKTQYSNCPTCGANNTDEVIEASLPYLLLFATDGPATVDCDEDAVGTVVFVGSTNSGHCYT
QAAGQAFDNLAKDRKFGKKSPYITAMYTRFAFKNETSLPVAKQSKGKSKSVKEDVSNLATSSKASFDNLTDFEQWYDSNI
YESLKVQESPDNFDKYVSFTTKEDSKLPLTLKVRGIKSVVDFRSKDGFIYKLTPDTDENSKAPVYYPVLDAISLKAIWVE
GNANFVVGHPNYYSKSLHIPTFWENAENFVKMGDKIGGVTMGLWRAEHLNKPNLERIFNIAKKAIVGSSVVTTQCGKLIG
KAATFIADKVGGGVVRNITDSIKGLCGITRGHFERKMSPQFLKTLMFFLFYFLKASVKSVVASYKTVLCKVVLATLLIVW
FVYTSNPVMFTGIRVLDFLFEGSLCGPYKDYGKDSFDVLRYCADDFICRVCLHDKDSLHLYKHAYSVEQVYKDAASGFIF
NWNWLYLVFLILFVKPVAGFVIICYCVKYLVLNSTVLQTGVCFLDWFVQTVFSHFNFMGAGFYFWLFYKIYIQVHHILYC
KDVTCEVCKRVARSNRQEVSVVVGGRKQIVHVYTNSGYNFCKRHNWYCRNCDDYGHQNTFMSPEVAGELSEKLKRHVKPT
AYAYHVVDEACLVDDFVNLKYKAATPGKDSASSAVKCFSVTDFLKKAVFLKEALKCEQISNDGFIVCNTQSAHALEEAKN
AAIYYAQYLCKPILILDQALYEQLVVEPVSKSVIDKVCSILSSIISVDTAALNYKAGTLRDALLSITKDEEAVDMAIFCH
NHDVDYTGDGFTNVIPSYGIDTGKLTPRDRGFLINADASIANLRVKNAPPVVWKFSELIKLSDSCLKYLISATVKSGVRF
FITKSGAKQVIACHTQKLLVEKKAGGIVSGTFKCFKSYFKWLLIFYILFTACCSGYYYMEVSKSFVHPMYDVNSTLHVEG
FKVIDKGVLREIVPEDTCFSNKFVNFDAFWGRPYDNSRNCPIVTAVIDGDGTVATGVPGFVSWVMDGVMFIHMTQTERKP
WYIPTWFNREIVGYTQDSIITEGSFYTSIALFSARCLYLTASNTPQLYCFNGDNDAPGALPFGSIIPHRVYFQPNGVRLI
VPQQILHTPYVVKFVSDSYCRGSVCEYTRPGYCVSLNPQWVLFNDEYTSKPGVFCGSTVRELMFSMVSTFFTGVNPNIYM
QLATMFLILVVVVLIFAMVIKFQGVFKAYATTVFITMLVWVINAFILCVHSYNSVLAVILLVLYCYASLVTSRNTVIIMH
CWLVFTFGLIVPTWLACCYLGFIIYMYTPLFLWCYGTTKNTRKLYDGNEFVGNYDLAAKSTFVIRGSEFVKLTNEIGDKF
EAYLSAYARLKYYSGTGSEQDYLQACRAWLAYALDQYRNSGVEIVYTPPRYSIGVSRLQSGFKKLVSPSSAVEKCIVSVS
YRGNNLNGLWLGDTIYCPRHVLGKFSGDQWNDVLNLANNHEFEVTTQHGVTLNVVSRRLKGAVLILQTAVANAETPKYKF
IKANCGDSFTIACAYGGTVVGLYPVTMRSNGTIRASFLAGACGSVGFNIEKGVVNFFYMHHLELPNALHTGTDLMGEFYG
GYVDEEVAQRVPPDNLVTNNIVAWLYAAIISVKESSFSLPKWLESTTVSVDDYNKWAGDNGFTPFSTSTAITKLSAITGV
DVCKLLRTIMVKNSQWGGDPILGQYNFEDELTPESVFNQIGGVRLQSSFVRKATSWFWSRCVLACFLFVLCAIVLFTAVP
LKFYVYAAVILLMAVLFISFTVKHVMAYMDTFLLPTLITVIIGVCAEVPFIYNTLISQVVIFLSQWYDPVVFDTMVPWMF
LPLVLYTAFKCVQGCYMNSFNTSLLMLYQFVKLGFVIYTSSNTLTAYTEGNWELFFELVHTTVLANVSSNSLIGLFVFKC
AKWMLYYCNATYLNNYVLMAVMVNCIGWLCTCYFGLYWWVNKVFGLTLGKYNFKVSVDQYRYMCLHKINPPKTVWEVFST
NILIQGIGGDRVLPIATVQAKLSDVKCTTVVLMQLLTKLNVEANSKMHVYLVELHNKILASDDVGECMDNLLGMLITLFC
IDSTIDLSEYCDDILKRSTVLQSVTQEFSHIPSYAEYERAKNLYEKVLVDSKNGGVTQQELAAYRKAANIAKSVFDRDLA
VQKKLDSMAERAMTTMYKEARVTDRRAKLVSSLHALLFSMLKKIDSEKLNVLFDQASSGVVPLATVPIVCSNKLTLVIPD
PETWVKCVEGVHVTYSTVVWNIDTVIDADGTELHPTSTGSGLTYCISGANIAWPLKVNLTRNGHNKVDVVLQNNELMPHG
VKTKACVAGVDQAHCSVESKCYYTNISGNSVVAAITSSNPNLKVASFLNEAGNQIYVDLDPPCKFGMKVGVKVEVVYLYF
IKNTRSIVRGMVLGAISNVVVLQSKGHETEEVDAVGILSLCSFAVDPADTYCKYVAAGNQPLGNCVKMLTVHNGSGFAIT
SKPSPTPDQDSYGGASVCLYCRAHIAHPGSVGNLDGRCQFKGSFVQIPTTEKDPVGFCLRNKVCTVCQCWIGYGCQCDSL
RQPKSSVQSVAGASDFDKNYLNGYGVAVRLG
>P0C6V5 ~~~1a~~~Replicase polyprotein 1a~~~
MASSLKQGVSPKLRDVILVSKDIPEQLCDALFFYTSHNPKDYADAFAVRQKFDRNLQTGKQFKFETVCGLFLLKGVDKIT
PGVPAKVLKATSKLADLEDIFGVSPFARKYRELLKTACQWSLTVETLDARAQTLDEIFDPTEILWLQVAAKIQVSAMAMR
RLVGEVTAKVMDALGSNMSALFQIFKQQIVRIFQKALAIFENVSELPQRIAALKMAFAKCAKSITVVVMERTLVVREFAG
TCLASINGAVAKFFEELPNGFMGAKIFTTLAFFREAAVKIVDNIPNAPRGTKGFEVVGNAKGTQVVVRGMRNDLTLLDQK
AEIPVESEGWSAILGGHLCYVFKSGDRFYAAPLSGNFALHDVHCCERVVCLSDGVTPEINDGLILAAIYSSFSVAELVAA
IKRGEPFKFLGHKFVYAKDAAVSFTLAKAATIADVLKLFQSARVKVEDVWSSLTEKSFEFWRLAYGKVRNLEEFVKTCFC
KAQMAIVILATVLGEGIWHLVSQVIYKVGGLFTKVVDFCEKYWKGFCAQLKRAKLIVTETLCVLKGVAQHCFQLLLDAIQ
FMYKSFKKCALGRIHGDLLFWKGGVHKIIQEGDEIWFDAIDSIDVEDLGVVQEKLIDFDVCDNVTLPENQPGHMVQIEDD
GKNYMFFRFKKDENIYYTPMSQLGAINVVCKAGGKTVTFGETTVQEIPPPDVVFIKVSIECCGEPWNTIFKKAYKEPIEV
ETDLTVEQLLSVVYEKMCDDLKLFPEAPEPPPFENVTLVDKNGKDLDCIKSCHLIYRDYESDDDIEEEDAEECDTDSGDA
EECDTNLECEEEDEDTKVLALIQDPASNKYPLPLDDDYSVYNGCIVHKDALDVVNLPSGEETFVVNNCFEGAVKALPQKV
IDVLGDWGEAVDAQEQLCQQESTRVISEKSVEGFTGSCDAMAEQAIVEEQEIVPVVEQSQDVVVFTPADLEVVKETAEEV
DEFILISAVPKEEVVSQEKEEPQVEQEPTLVVKAQREKKAKKFKVKPATCEKPKFLEYKTCVGDLAVVIAKALDEFKEFC
IVNAANEHMSHGGGVAKAIADFCGPDFVEYCADYVKKHGPQQKLVTPSFVKGIQCVNNVVGPRHGDSNLREKLVAAYKSV
LVGGVVNYVVPVLSSGIFGVDFKISIDAMREAFKGCAIRVLLFSLSQEHIDYFDATCKQKTIYLTEDGVKYRSVVLKPGD
SLGQFGQVFARNKVVFSADDVEDKEILFIPTTDKTILEYYGLDAQKYVTYLQTLAQKWDVQYRDNFVILEWRDGNCWISS
AIVLLQAAKIRFKGFLAEAWAKLLGGDPTDFVAWCYASCNAKVGDFSDANWLLANLAEHFDADYTNALLKKCVSCNCGVK
SYELRGLEACIQPVRAPNLLHFKTQYSNCPTCGASSTDEVIEASLPYLLLFATDGPATVDCDENAVGTVVFIGSTNSGHC
YTQADGKAFDNLAKDRKFGRKSPYITAMYTRFSLRSENPLLVVEHSKGKAKVVKEDVSNLATSSKASFDDLTDFEQWYDS
NIYESLKVQETPDNLDEYVSFTTKEDSKLPLTLKVRGIKSVVDFRSKDGFTYKLTPDTDENSKTPVYYPVLDSISLRAIW
VEGSANFVVGHPNYYSKSLRIPTFWENAESFVKMGYKIDGVTMGLWRAEHLNKPNLERIFNIAKKAIVGSSVVTTQCGKI
LVKAATYVADKVGDGVVRNITDRIKGLCGFTRGHFEKKMSLQFLKTLVFFFFYFLKASSKSLVSSYKIVLCKVVFATLLI
VWFIYTSNPVVFTGIRVLDFLFEGSLCGPYNDYGKDSFDVLRYCAGDFTCRVCLHDRDSLHLYKHAYSVEQIYKDAASGI
NFNWNWLYLVFLILFVKPVAGFVIICYCVKYLVLSSTVLQTGVGFLDWFVKTVFTHFNFMGAGFYFWLFYKIYVQVHHIL
YCKDVTCEVCKRVARSNRQEVSVVVGGRKQIVHVYTNSGYNFCKRHNWYCRNCDDYGHQNTFMSPEVAGELSEKLKRHVK
PTAYAYHVVYEACVVDDFVNLKYKAAIPGKDNASSAVKCFSVTDFLKKAVFLKEALKCEQISNDGFIVCNTQSAHALEEA
KNAAVYYAQYLCKPILILDQALYEQLIVEPVSKSVIDKVCSILSNIISVDTAALNYKAGTLRDALLSITKDEEAVDMAIF
CHNHEVEYTGDGFTNVIPSYGMDTDKLTPRDRGFLINADASIANLRVKNAPPVVWKFSDLIKLSDSCLKYLISATVKSGG
RFFITKSGAKQVISCHTQKLLVEKKAGGVINNTFKWFMSCFKWLFVFYILFTACCLGYYYMEMNKSFVHPMYDVNSTLHV
EGFKVIDKGVIREIVSEDNCFSNKFVNFDAFWGKSYENNKNCPIVTVVIDGDGTVAVGVPGFVSWVMDGVMFVHMTQTDR
RPWYIPTWFNREIVGYTQDSIITEGSFYTSIALFSARCLYLTASNTPQLYCFNGDNDAPGALPFGSIIPHRVYFQPNGVR
LIVPQQILHTPYIVKFVSDSYCRGSVCEYTKPGYCVSLDSQWVLFNDEYISKPGVFCGSTVRELMFNMVSTFFTGVNPNI
YIQLATMFLILVVIVLIFAMVIKFQGVFKAYATIVFTIMLVWVINAFVLCVHSYNSVLAVILLVLYCYASMVTSRNTAII
MHCWLVFTFGLIVPTWLACCYLGFILYMYTPLVFWCYGTTKNTRKLYDGNEFVGNYDLAAKSTFVIRGTEFVKLTNEIGD
KFEAYLSAYARLKYYSGTGSEQDYLQACRAWLAYALDQYRNSGVEVVYTPPRYSIGVSRLQAGFKKLVSPSSAVEKCIVS
VSYRGNNLNGLWLGDSIYCPRHVLGKFSGDQWGDVLNLANNHEFEVVTQNGVTLNVVSRRLKGAVLILQTAVANAETPKY
KFVKANCGDSFTIACSYGGTVIGLYPVTMRSNGTIRASFLAGACGSVGFNIEKGVVNFFYMHHLELPNALHTGTDLMGEF
YGGYVDEEVAQRVPPDNLVTNNIVAWLYAAIISVKESSFSQPKWLESTTVSIEDYNRWASDNGFTPFSTSTAITKLSAIT
GVDVCKLLRTIMVKSAQWGSDPILGQYNFEDELTPESVFNQVGGVRLQSSFVRKATSWFWSRCVLACFLFVLCAIVLFTA
VPLKFYVHAAVILLMAVLFISFTVKHVMAYMDTFLLPTLITVIIGVCAEVPFIYNTLISQVVIFLSQWYDPVVFDTMVPW
MLLPLVLYTAFKCVQGCYMNSFNTSLLMLYQFMKLGFVIYTSSNTLTAYTEGNWELFFELVHTIVLANVSSNSLIGLIVF
KCAKWMLYYCNATYFNNYVLMAVMVNGIGWLCTCYFGLYWWVNKVFGLTLGKYNFKVSVDQYRYMCLHKVNPPKTVWEVF
TTNILIQGIGGDRVLPIATVQSKLSDVKCTTVVLMQLLTKLNVEANSKMHAYLVELHNKILASDDVGECMDNLLGMLITL
FCIDSTIDLGEYCDDILKRSTVLQSVTQEFSHIPSYAEYERAKSIYEKVLADSKNGGVTQQELAAYRKAANIAKSVFDRD
LAVQKKLDSMAERAMTTMYKEARVTDRRAKLVSSLHALLFSMLKKIDSEKLNVLFDQANSGVVPLATVPIVCSNKLTLVI
PDPETWVKCVEGVHVTYSTVVWNIDCVTDADGTELHPTSTGSGLTYCISGDNIAWPLKVNLTRNGHNKVDVALQNNELMP
HGVKTKACVAGVDQAHCSVESKCYYTSISGSSVVAAITSSNPNLKVASFLNEAGNQIYVDLDPPCKFGMKVGDKVEVVYL
YFIKNTRSIVRGMVLGAISNVVVLQSKGHETEEVDAVGILSLCSFAVDPADTYCKYVAAGNQPLGNCVKMLTVHNGSGFA
ITSKPSPTPDQDSYGGASVCLYCRAHIAHPGGAGNLDGRCQFKGSFVQIPTTEKDPVGFCLRNKVCTVCQCWIGYGCQCD
SLRQPKPSVQSVAVASGFDKNYLNGYGVAVRLG
>K9N638 ~~~1a~~~Replicase polyprotein 1a~~~
MSFVAGVTAQGARGTYRAALNSEKHQDHVSLTVPLCGSGNLVEKLSPWFMDGENAYEVVKAMLLKKEPLLYVPIRLAGHT
RHLPGPRVYLVERLIACENPFMVNQLAYSSSANGSLVGTTLQGKPIGMFFPYDIELVTGKQNILLRKYGRGGYHYTPFHY
ERDNTSCPEWMDDFEADPKGKYAQNLLKKLIGGDVTPVDQYMCGVDGKPISAYAFLMAKDGITKLADVEADVAARADDEG
FITLKNNLYRLVWHVERKDVPYPKQSIFTINSVVQKDGVENTPPHYFTLGCKILTLTPRNKWSGVSDLSLKQKLLYTFYG
KESLENPTYIYHSAFIECGSCGNDSWLTGNAIQGFACGCGASYTANDVEVQSSGMIKPNALLCATCPFAKGDSCSSNCKH
SVAQLVSYLSERCNVIADSKSFTLIFGGVAYAYFGCEEGTMYFVPRAKSVVSRIGDSIFTGCTGSWNKVTQIANMFLEQT
QHSLNFVGEFVVNDVVLAILSGTTTNVDKIRQLLKGVTLDKLRDYLADYDVAVTAGPFMDNAINVGGTGLQYAAITAPYV
VLTGLGESFKKVATIPYKVCNSVKDTLTYYAHSVLYRVFPYDMDSGVSSFSELLFDCVDLSVASTYFLVRLLQDKTGDFM
STIITSCQTAVSKLLDTCFEATEATFNFLLDLAGLFRIFLRNAYVYTSQGFVVVNGKVSTLVKQVLDLLNKGMQLLHTKV
SWAGSNISAVIYSGRESLIFPSGTYYCVTTKAKSVQQDLDVILPGEFSKKQLGLLQPTDNSTTVSVTVSSNMVETVVGQL
EQTNMHSPDVIVGDYVIISEKLFVRSKEEDGFAFYPACTNGHAVPTLFRLKGGAPVKKVAFGGDQVHEVAAVRSVTVEYN
IHAVLDTLLASSSLRTFVVDKSLSIEEFADVVKEQVSDLLVKLLRGMPIPDFDLDDFIDAPCYCFNAEGDASWSSTMIFS
LHPVECDEECSEVEASDLEEGESECISETSTEQVDVSHEISDDEWAAAVDEAFPLDEAEDVTESVQEEAQPVEVPVEDIA
QVVIADTLQETPVVSDTVEVPPQVVKLPSEPQTIQPEVKEVAPVYEADTEQTQSVTVKPKRLRKKRNVDPLSNFEHKVIT
ECVTIVLGDAIQVAKCYGESVLVNAANTHLKHGGGIAGAINAASKGAVQKESDEYILAKGPLQVGDSVLLQGHSLAKNIL
HVVGPDARAKQDVSLLSKCYKAMNAYPLVVTPLVSAGIFGVKPAVSFDYLIREAKTRVLVVVNSQDVYKSLTIVDIPQSL
TFSYDGLRGAIRKAKDYGFTVFVCTDNSANTKVLRNKGVDYTKKFLTVDGVQYYCYTSKDTLDDILQQANKSVGIISMPL
GYVSHGLDLIQAGSVVRRVNVPYVCLLANKEQEAILMSEDVKLNPSEDFIKHVRTNGGYNSWHLVEGELLVQDLRLNKLL
HWSDQTICYKDSVFYVVKNSTAFPFETLSACRAYLDSRTTQQLTIEVLVTVDGVNFRTVVLNNKNTYRSQLGCVFFNGAD
ISDTIPDEKQNGHSLYLADNLTADETKALKELYGPVDPTFLHRFYSLKAAVHKWKMVVCDKVRSLKLSDNNCYLNAVIMT
LDLLKDIKFVIPALQHAFMKHKGGDSTDFIALIMAYGNCTFGAPDDASRLLHTVLAKAELCCSARMVWREWCNVCGIKDV
VLQGLKACCYVGVQTVEDLRARMTYVCQCGGERHRQIVEHTTPWLLLSGTPNEKLVTTSTAPDFVAFNVFQGIETAVGHY
VHARLKGGLILKFDSGTVSKTSDWKCKVTDVLFPGQKYSSDCNVVRYSLDGNFRTEVDPDLSAFYVKDGKYFTSEPPVTY
SPATILAGSVYTNSCLVSSDGQPGGDAISLSFNNLLGFDSSKPVTKKYTYSFLPKEDGDVLLAEFDTYDPIYKNGAMYKG
KPILWVNKASYDTNLNKFNRASLRQIFDVAPIELENKFTPLSVESTPVEPPTVDVVALQQEMTIVKCKGLNKPFVKDNVS
FVADDSGTPVVEYLSKEDLHTLYVDPKYQVIVLKDNVLSSMLRLHTVESGDINVVAASGSLTRKVKLLFRASFYFKEFAT
RTFTATTAVGSCIKSVVRHLGVTKGILTGCFSFVKMLFMLPLAYFSDSKLGTTEVKVSALKTAGVVTGNVVKQCCTAAVD
LSMDKLRRVDWKSTLRLLLMLCTTMVLLSSVYHLYVFNQVLSSDVMFEDAQGLKKFYKEVRAYLGISSACDGLASAYRAN
SFDVPTFCANRSAMCNWCLISQDSITHYPALKMVQTHLSHYVLNIDWLWFAFETGLAYMLYTSAFNWLLLAGTLHYFFAQ
TSIFVDWRSYNYAVSSAFWLFTHIPMAGLVRMYNLLACLWLLRKFYQHVINGCKDTACLLCYKRNRLTRVEASTVVCGGK
RTFYITANGGISFCRRHNWNCVDCDTAGVGNTFICEEVANDLTTALRRPINATDRSHYYVDSVTVKETVVQFNYRRDGQP
FYERFPLCAFTNLDKLKFKEVCKTTTGIPEYNFIIYDSSDRGQESLARSACVYYSQVLCKSILLVDSSLVTSVGDSSEIA
TKMFDSFVNSFVSLYNVTRDKLEKLISTARDGVRRGDNFHSVLTTFIDAARGPAGVESDVETNEIVDSVQYAHKHDIQIT
NESYNNYVPSYVKPDSVSTSDLGSLIDCNAASVNQIVLRNSNGACIWNAAAYMKLSDALKRQIRIACRKCNLAFRLTTSK
LRANDNILSVRFTANKIVGGAPTWFNALRDFTLKGYVLATIIVFLCAVLMYLCLPTFSMVPVEFYEDRILDFKVLDNGII
RDVNPDDKCFANKHRSFTQWYHEHVGGVYDNSITCPLTVAVIAGVAGARIPDVPTTLAWVNNQIIFFVSRVFANTGSVCY
TPIDEIPYKSFSDSGCILPSECTMFRDAEGRMTPYCHDPTVLPGAFAYSQMRPHVRYDLYDGNMFIKFPEVVFESTLRIT
RTLSTQYCRFGSCEYAQEGVCITTNGSWAIFNDHHLNRPGVYCGSDFIDIVRRLAVSLFQPITYFQLTTSLVLGIGLCAF
LTLLFYYINKVKRAFADYTQCAVIAVVAAVLNSLCICFVASIPLCIVPYTALYYYATFYFTNEPAFIMHVSWYIMFGPIV
PIWMTCVYTVAMCFRHFFWVLAYFSKKHVEVFTDGKLNCSFQDAASNIFVINKDTYAALRNSLTNDAYSRFLGLFNKYKY
FSGAMETAAYREAAACHLAKALQTYSETGSDLLYQPPNCSITSGVLQSGLVKMSHPSGDVEACMVQVTCGSMTLNGLWLD
NTVWCPRHVMCPADQLSDPNYDALLISMTNHSFSVQKHIGAPANLRVVGHAMQGTLLKLTVDVANPSTPAYTFTTVKPGA
AFSVLACYNGRPTGTFTVVMRPNYTIKGSFLCGSCGSVGYTKEGSVINFCYMHQMELANGTHTGSAFDGTMYGAFMDKQV
HQVQLTDKYCSVNVVAWLYAAILNGCAWFVKPNRTSVVSFNEWALANQFTEFVGTQSVDMLAVKTGVAIEQLLYAIQQLY
TGFQGKQILGSTMLEDEFTPEDVNMQIMGVVMQSGVRKVTYGTAHWLFATLVSTYVIILQATKFTLWNYLFETIPTQLFP
LLFVTMAFVMLLVKHKHTFLTLFLLPVAICLTYANIVYEPTTPISSALIAVANWLAPTNAYMRTTHTDIGVYISMSLVLV
IVVKRLYNPSLSNFALALCSGVMWLYTYSIGEASSPIAYLVFVTTLTSDYTITVFVTVNLAKVCTYAIFAYSPQLTLVFP
EVKMILLLYTCLGFMCTCYFGVFSLLNLKLRAPMGVYDFKVSTQEFRFMTANNLTAPRNSWEAMALNFKLIGIGGTPCIK
VAAMQSKLTDLKCTSVVLLSVLQQLHLEANSRAWAFCVKCHNDILAATDPSEAFEKFVSLFATLMTFSGNVDLDALASDI
FDTPSVLQATLSEFSHLATFAELEAAQKAYQEAMDSGDTSPQVLKALQKAVNIAKNAYEKDKAVARKLERMADQAMTSMY
KQARAEDKKAKIVSAMQTMLFGMIKKLDNDVLNGIISNARNGCIPLSVIPLCASNKLRVVIPDFTVWNQVVTYPSLNYAG
ALWDITVINNVDNEIVKSSDVVDSNENLTWPLVLECTRASTSAVKLQNNEIKPSGLKTMVVSAGQEQTNCNTSSLAYYEP
VQGRKMLMALLSDNAYLKWARVEGKDGFVSVELQPPCKFLIAGPKGPEIRYLYFVKNLNNLHRGQVLGHIAATVRLQAGS
NTEFASNSSVLSLVNFTVDPQKAYLDFVNAGGAPLTNCVKMLTPKTGTGIAISVKPESTADQETYGGASVCLYCRAHIEH
PDVSGVCKYKGKFVQIPAQCVRDPVGFCLSNTPCNVCQYWIGYGCNCDSLRQAALPQSKDSNFLNESGVLL
>P0C6V6 ~~~1a~~~Replicase polyprotein 1a~~~
MASNHVTLAFANDAEISAFGFCTASEAVSYYSEAAASGFMQCRFVSLDLADTVEGLLPEDYVMVVIGTTKLSAYVDTFGS
RPRNICGWLLFSNCNYFLEELELTFGRRGGNIVPVDQYMCGADGKPVLQESEWEYTDFFADSEDGQLNIAGITYVKAWIV
ERSDVSYASQNLTSIKSITYCSTYEHTFLDGTAMKVARTPKIKKNVVLSEPLATIYREIGSPFVDNGSDARSIIRRPVFL
HAFVKCKCGSYHWTVGDWTSYVSTCCGFKCKPVLVASCSAMPGSVVVTRAGAGTGVKYYNNMFLRHVADIDGLAFWRILK
VQSKDDLACSGKFLEHHEEGFTDPCYFLNDSSLATKLKFDILSGKFSDEVKQAIIAGHVVVGSALVDIVDDALGQPWFIR
KLGDLASAPWEQLKAVVRGLGLLSDEVVLFGKRLSCATLSIVNGVFEFLADVPEKLAAAVTVFVNFLNEFFESACDCLKV
GGKTFNKVGSYVLFDNALVKLVKAKARGPRQAGICEVRYTSLVVGSTTKVVSKRVENANVNLVVVDEDVTLNTTGRTVVV
DGLAFFESDGFYRHLADADVVIEHPVYKSACELKPVFECDPIPDFPLPVAASVAELCVQTDLLLKNYNTPYKTYSCVVRG
DKCCITCTLQFKAPSYVEDAVNFVDLCTKNIGTAGFHEFYITAHEQQDLQGFLTTCCTMSGFECFMPTIPQCPAVLEEID
GGSIWRSFITGLNTMWDFCKRLKVSFGLDGIVVTVARKFKRLGALLAEMYNTYLSTVVENLVLAGVSFKYYATSVPKIVL
GGCFHSVKSVFASVFQIPVQAGIEKFKVFLNCVHPVVPRVIETSFVELEETTFKPPALNGGIAIVDGFAFYYDGTLYYPT
DGNSVVPICFKKKGGGDVKFSDEVSVKTIDPVYKVSLEFEFESETIMAVLNKAVGNRIKVTGGWDDVVEYINVAIEVLKD
HVEVPKYYIYDEEGGTDPNLPVMVSQWPLNDDTISQDLLDVEVVTDAPIDSEGDEVDSSAPEKVADVANSEPGDDGLPVA
PETNVESEVEEVAATLSFIKDTPSTVTKDPFAFDFVSYGGLKVLRQSHNNCWVTSTLVQLQLLGIVDDPAMELFSAGRVG
PMVRKCYESQKAILGSLGDVSACLESLTKDLHTLKITCSVVCGCGTGERIYEGCAFRMTPTLEPFPYGACAQCAQVLMHT
FKSIVGTGIFCRDTTALSLDSLVVKPLCAAAFIGKDSGHYVTNFYDAAMAIDGYGRHQIKYDTLNTICVKDVNWTAPLVP
AVDSVVEPVVKPFYSYKNVDFYQGDFSDLVKLPCDFVVNAANEKLSHGGGIAKAIDVYTKGMLQKCSNDYIKAHGPIKVG
RGVMLEALGLKVFNVVGPRKGKHAPELLVKAYKSVFANSGVALTPLISVGIFSVPLEESLSAFLACVGDRHCKCFCYGDK
EREAIIKYMDGLVDAIFKEALVDTTPVQEDVQQVSQKPVLPNFEPFRIEGAHAFYECNPEGLMSLGADKLVLFTNSNLDF
CSVGKCLNDVTSGALLEAINVFKKSNKTVPAGNCVTLDCANMISITMVVLPFDGDANYDKNYARAVVKVSKLKGKLVLAV
DDATLYSKLSHLSVLGFVSTPDDVERFYANKSVVIKVTEDTRSVKAVKVESTATYGQQIGPCLVNDTVVTDNKPVVADVV
AKVVPNANWDSHYGFDKAGEFHMLDHTGFTFPSEVVNGRRVIKTTDNNCWVNVTCLQLQFARFRFKSAGLQAMWESYCTG
DVAMFVHWLYWLTGVDKGQPSDSENALNMLSKYIVPAGSVTIERVTHDGCCCSKRVVTAPVVNASVLKLGVEDGLCPHGL
NYIGKVVVVKGTTIVVNVGKPVVAPSHLFLKGVSYTTFLDNGNGVVGHYTVFDHGTGMVHDGDAFVPGDLNVSPVTNVVV
SEQTAVVIKDPVKKAELDATKLLDTMNYASERFFSFGDFMSRNLITVFLYILSILGLCFRAFRKRDVKVLAGVPQRTGII
LRKSMRYNAKALGVFFKLKLYWFKVLGKFSLGIYALYALLFMTIRFTPIGSPVCDDVVAGYANSSFDKNEYCNSVICKVC
LYGYQELSDFSHTQVVWQHLRDPLIGNVMPFFYLAFLAIFGGVYVKAITLYFIFQYLNSLGVFLGLQQSIWFLQLVPFDV
FGDEIVVFFIVTRVLMFIKHVCLGCDKASCVACSKSARLKRVPVQTIFQGTSKSFYVHANGGSKFCKKHNFFCLNCDSYG
PGCTFINDVIATEVGNVVKLNVQPTGPATILIDKVEFSNGFYYLYSGDTFWKYNFDITDSKYTCKEALKNCSIITDFIVF
NNNGSNVNQVKNACVYFSQMLCKPVKLVDSALLASLSVDFGASLHSAFVSVLSNSFGKDLSSCNDMQDCKSTLGFDDVPL
DTFNAAVAEAHRYDVLLTDMSFNNFTTSYAKPEEKFPVHDIATCMRVGAKIVNHNVLVKDSIPVVWLVRDFIALSEETRK
YIIRTTKVKGITFMLTFNDCRMHTTIPTVCIANKKGAGLPSFSKVKKFFWFLCLFIVAAFFALSFLDFSTQVSSDSDYDF
KYIESGQLKTFDNPLSCVHNVFINFDQWHDAKFGFTPVNNPSCPIVVGVSDEARTVPGIPAGVYLAGKTLVFAINTIFGT
SGLCFDASGVADKGACIFNSACTTLSGLGGTAVYCYKNGLVEGAKLYSELAPHSYYKMVDGNAVSLPEIISRGFGIRTIR
TKAMTYCRVGQCVQSAEGVCFGADRFFVYNAESGSDFVCGTGLFTLLMNVISVFSKTVPVTVLSGQILFNCIIAFVAVAV
CFLFTKFKRMFGDMSVGVFTVGACTLLNNVSYIVTQNTLGMLGYATLYFLCTKGVRYMWIWHLGFLISYILIAPWWVLMV
YAFSAIFEFMPNLFKLKVSTQLFEGDKFVGSFENAAAGTFVLDMHAYERLANSISTEKLRQYASTYNKYKYYSGSASEAD
YRLACFAHLAKAMMDYASNHNDTLYTPPTVSYNSTLQAGLRKMAQPSGVVEKCIVRVCYGNMALNGLWLGDIVMCPRHVI
ASSTTSTIDYDYALSVLRLHNFSISSGNVFLGVVSATMRGALLQIKVNQNNVHTPKYTYRTVRPGESFNILACYDGAAAG
VYGVNMRSNYTIRGSFINGACGSPGYNINNGTVEFCYLHQLELGSGCHVGSDLDGVMYGGYEDQPTLQVEGASSLFTENV
LAFLYAALINGSTWWLSSSRIAVDRFNEWAVHNGMTTVGNTDCFSILAAKTGVDVQRLLASIQSLHKNFGGKQILGHTSL
TDEFTTGEVVRQMYGVNLQGGYVSRACRNVLLVGSFLTFFWSELVSYTKFFWVNPGYVTPMFACLSLLSSLLMFTLKHKT
LFFQVFLIPALIVTSCINLAFDVEVYNYLAEHFDYHVSLMGFNAQGLVNIFVCFVVTILHGTYTWRFFNTPASSVTYVVA
LLTAAYNYFYASDILSCAMTLFASVTGNWFVGAVCYKVAVYMALRFPTFVAIFGDIKSVMFCYLVLGYFTCCFYGILYWF
NRFFKVSVGVYDYTVSAAEFKYMVANGLRAPTGTLDSLLLSAKLIGIGGERNIKISSVQSKLTDIKCSNVVLLGCLSSMN
VSANSTEWAYCVDLHNKINLCNDPEKAQEMLLALLAFFLSKNSAFGLDDLLESYFNDNSMLQSVASTYVGLPSYVIYENA
RQQYEDAVNNGSPPQLVKQLRHAMNVAKSEFDREASTQRKLDRMAEQAAAQMYKEARAVNRKSKVVSAMHSLLFGMLRRL
DMSSVDTILNLAKDGVVPLSVIPAVSATKLNIVTSDIDSYNRIQREGCVHYAGTIWNIIDIKDNDGKVVHVKEVTAQNAE
SLSWPLVLGCERIVKLQNNEIIPGKLKQRSIKAEGDGIVGEGKALYNNEGGRTFMYAFISDKPDLRVVKWEFDGGCNTIE
LEPPRKFLVDSPNGAQIKYLYFVRNLNTLRRGAVLGYIGATVRLQAGKQTEQAINSSLLTLCAFAVDPAKTYIDAVKSGH
KPVGNCVKMLANGSGNGQAVTNGVEASTNQDSYGGASVCLYCRAHVEHPSMDGFCRLKGKYVQVPLGTVDPIRFVLENDV
CKVCGCWLSNGCTCDRSIMQSTDMAYLNEYGALVQLD
>P0C6U8 ~~~1a~~~Replicase polyprotein 1a~~~
MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTN
HGHKVVELVAEMDGIQYGRSGITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQN
WNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAW
FTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTL
MKCNHCDEVSWQTCDFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKG
GRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDLLEILSRERVNINIVGDFHLNEEVAIILASF
SASTSAFIDTIKSLDYKSFKTIVESCGNYKVTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFARTLDAANHSIP
DLQRAAVTILDGISEQSLRLVDAMVYTSDLLTNSVIIMAYVTGGLVQQTSQWLSNLLGTTVEKLRPIFEWIEAKLSAGVE
FLKDAWEILKFLITGVFDIVKGQIQVASDNIKDCVKCFIDVVNKALEMCIDQVTIAGAKLRSLNLGEVFIAQSKGLYRQC
IRGKEQLQLLMPLKAPKEVTFLEGDSHDTVLTSEEVVLKNGELEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQYC
ALSPGLLATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERVDKVLNEKCSVYTVESGTEVTEFACVVAEAV
VKTLQPVSDLLTNMGIDLDEWSVATFYLFDDAGEENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHEYGTEDDYQGL
PLEFGASAETVRVEEEEEEDWLDDTTEQSEIEPEPEPTPEEPVNQFTGYLKLTDNVAIKCVDIVKEAQSANPMVIVNAAN
IHLKHGGGVAGALNKATNGAMQKESDDYIKLNGPLTVGGSCLLSGHNLAKKCLHVVGPNLNAGEDIQLLKAAYENFNSQD
ILLAPLLSAGIFGAKPLQSLQVCVQTVRTQVYIAVNDKALYEQVVMDYLDNLKPRVEAPKQEEPPNTEDSKTEEKSVVQK
PVDVKPKIKACIDEVTTTLEETKFLTNKLLLFADINGKLYHDSQNMLRGEDMSFLEKDAPYMVGDVITSGDITCVVIPSK
KAGGTTEMLSRALKKVPVDEYITTYPGQGCAGYTLEEAKTALKKCKSAFYVLPSEAPNAKEEILGTVSWNLREMLAHAEE
TRKLMPICMDVRAIMATIQRKYKGIKIQEGIVDYGVRFFFYTSKEPVASIITKLNSLNEPLVTMPIGYVTHGFNLEEAAR
CMRSLKAPAVVSVSSPDAVTTYNGYLTSSSKTSEEHFVETVSLAGSYRDWSYSGQRTELGVEFLKRGDKIVYHTLESPVE
FHLDGEVLSLDKLKSLLSLREVKTIKVFTTVDNTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPS
DDTLRSEAFEYYHTLDESFLGRYMSALNHTKKWKFPQVGGLTSIKWADNNCYLSSVLLALQQLEVKFNAPALQEAYYRAR
AGDAANFCALILAYSNKTVGELGDVRETMTHLLQHANLESAKRVLNVVCKHCGQKTTTLTGVEAVMYMGTLSYDNLKTGV
SIPCVCGRDATQYLVQQESSFVMMSAPPAEYKLQQGTFLCANEYTGNYQCGHYTHITAKETLYRIDGAHLTKMSEYKGPV
TDVFYKETSYTTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNASFDNFKLTCSNTKFADDLNQ
MTGFTKPASRELSVTFFPDLNGDVVAIDYRHYSASFKKGAKLLHKPIVWHINQATTKTTFKPNTWCLRCLWSTKPVDTSN
SFEVLAVEDTQGMDNLACESQQPTSEEVVENPTIQKEVIECDVKTTEVVGNVILKPSDEGVKVTQELGHEDLMAAYVENT
SITIKKPNELSLALGLKTIATHGIAAINSVPWSKILAYVKPFLGQAAITTSNCAKRLAQRVFNNYMPYVFTLLFQLCTFT
KSTNSRIRASLPTTIAKNSVKSVAKLCLDAGINYVKSPKFSKLFTIAMWLLLLSICLGSLICVTAAFGVLLSNFGAPSYC
NGVRELYLNSSNVTTMDFCEGSFPCSICLSGLDSLDSYPALETIQVTISSYKLDLTILGLAAEWVLAYMLFTKFFYLLGL
SAIMQVFFGYFASHFISNSWLMWFIISIVQMAPVSAMVRMYIFFASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVEC
TTIVNGMKRSFYVYANGGRGFCKTHNWNCLNCDTFCTGSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVAVKNGALHL
YFDKAGQKTYERHPLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYYSQLMCQPILLLDQALVSDVGDS
TEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSELAKGVALDGVLSTFVSAARQGVVDTDVDTKDVIECLKLSHHSDL
EVTGDSCNNFMLTYNKVENMTPRDLGACIDCNARHINAQVAKSHNVSLIWNVKDYMSLSEQLRKQIRSAAKKNNIPFRLT
CATTRQVVNVITTKISLKGGKIVSTCFKLMLKATLLCVLAALVCYIVMPVHTLSIHDGYTNEIIGYKAIQDGVTRDIIST
DDCFANKHAGFDAWFSQRGGSYKNDKSCPVVAAIITREIGFIVPGLPGTVLRAINGDFLHFLPRVFSAVGNICYTPSKLI
EYSDFATSACVLAAECTIFKDAMGKPVPYCYDTNLLEGSISYSELRPDTRYVLMDGSIIQFPNTYLEGSVRVVTTFDAEY
CRHGTCERSEVGICLSTSGRWVLNNEHYRALSGVFCGVDAMNLIANIFTPLVQPVGALDVSASVVAGGIIAILVTCAAYY
FMKFRRVFGEYNHVVAANALLFLMSFTILCLVPAYSFLPGVYSVFYLYLTFYFTNDVSFLAHLQWFAMFSPIVPFWITAI
YVFCISLKHCHWFFNNYLRKRVMFNGVTFSTFEEAALCTFLLNKEMYLKLRSETLLPLTQYNRYLALYNKYKYFSGALDT
TSYREAACCHLAKALNDFSNSGADVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDTVYCPR
HVICTAEDMLNPNYEDLLIRKSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQTFSVLACYNG
SPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTI
TLNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCAALKELLQNGMNGRT
ILGSTILEDEFTPFDVVRQCSGVTFQGKFKKIVKGTHHWMLLTFLTSLLILVQSTQWSLFFFVYENAFLPFTLGIMAIAA
CAMLLVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRIMTWLELADTSLSGYRLKDCVMYASALVLLILMTARTVYDD
AARRVWTLMNVITLVYKVYYGNALDQAISMWALVISVTSNYSGVVTTIMFLARAIVFVCVEYYPLLFITGNTLQCIMLVY
CFLGYCCCCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKSSIDAFKLNIKLLGIGGKPCIKVATVQSKMS
DVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILLAKDTTEAFEKMVSLLSVLLSMQGAVDINRLCEEMLDNRATLQA
IASEFSSLPSYAAYATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRKLEKMADQAMTQMYKQARSEDKRA
KVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAKLMVVVPDYGTYKNTCDGNTFTYASALWEIQQVVD
ADSKIVQLSEINMDNSPNLAWPLIVTALRANSAVKLQNNELSPVALRQMSCAAGTTQTACTDDNALAYYNNSKGGRFVLA
LLSDHQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVLGSLAATVRLQAGNATEVPAN
STVLSFCAFAVDPAKAYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQESFGGASCCLYCRCHIDHPNPKGFC
DLKGKYVQIPTTCANDPVGFTLRNTVCTVCGMWKGYGCSCDQLREPLMQSADASTFLNGFAV
>P0DTC1 ~~~~~~Replicase polyprotein 1a~~~
MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAP
HGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRKVLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN
WNTKHSSGVTRELMRELNGGAYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLDFIDTKRGVYCCREHEHEIAW
YTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRIRSVYPVASPNECNQMCLSTL
MKCDHCGETSWQTGDFVKATCEFCGTENLTKEGATTCGYLPQNAVVKIYCPACHNSEVGPEHSLAEYHNESGLKTILRKG
GRTIAFGGCVFSYVGCHNKCAYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVNINIVGDFKLNEEIAIILASF
SASTSAFVETVKGLDYKAFKQIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEAARVVRSIFSRTLETAQNSVR
VLQKAAITILDGISQYSLRLIDAMMFTSDLATNNLVVMAYITGGVVQLTSQWLTNIFGTVYEKLKPVLDWLEEKFKEGVE
FLRDGWEIVKFISTCACEIVGGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKC
VKSREETGLLMPLKAPKEIIFLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVGTPVCINGLMLLEIKDTEKYC
ALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVNITFELDERIDKVLNEKCSAYTVELGTEVNEFACVVADAVI
KTLQPVSELLTPLGIDLDEWSMATYYLFDESGEFKLASHMYCSFYPPDEDEEEGDCEEEEFEPSTQYEYGTEDDYQGKPL
EFGATSAALQPEEEQEEDWLDDDSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTPVVQTIEVNSFSGYLKLTDNVYI
KNADIVEEAKKVKPTVVVNAANVYLKHGGGVAGALNKATNNAMQVESDDYIATNGPLKVGGSCVLSGHNLAKHCLHVVGP
NVNKGEDIQLLKSAYENFNQHEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLEMKSEKQVE
QKIAEIPKEEVKPFITESKPSVEQRKQDDKKIKACVEEVTTTLEETKFLTENLLLYIDINGNLHPDSATLVSDIDITFLK
KDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKVPTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSII
SNEKQEILGTVSWNLREMLAHAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDYGARFYFYTSKTTVASLINTLND
LNETLVTMPLGYVTHGLNLEEAARYMRSLKVPATVSVSSPDAVTAYNGYLTSSSKTPEEHFIETISLAGSYKDWSYSGQS
TQLGIEFLKRGDKSVYYTSNPTTFHLDGEVITFDNLKTLLSLREVRTIKVFTTVDNINLHTQVVDMSMTYGQQFGPTYLD
GADVTKIKPHNSHEGKTFYVLPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKYPQVNGLTSIKWADNNCYLATAL
LTLQQIELKFNPPALQDAYYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQHANLDSCKRVLNVVCKTCGQQQT
TLKGVEAVMYMGTLSYEQFKKGVQIPCTCGKQATKYLVQQESPFVMMSAPPAQYELKHGTFTCASEYTGNYQCGHYKHIT
SKETLYCIDGALLTKSSEYKGPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDNYYKKDNSYFTEQPIDLVPNQPY
PNASFDNFKFVCDNIKFADDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPSFKKGAKLLHKPIVWHVNNATNK
ATYKPNTWCIRCLWSTKPVETSNSFDVLKSEDAQGMDNLACEDLKPVSEEVVENPTIQKDVLECNVKTTEVVGDIILKPA
NNSLKITEEVGHTDLMAAYVDNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTIANYAKPFLNKVVSTTTNIVTRC
LNRVCTNYMPYFFTLLLQLCTFTRSTNSRIKASMPTTIAKNTVKSVGKFCLEASFNYLKSPNFSKLINIIIWFLLLSVCL
GSLIYSTAALGVLMSNLGMPSYCTGYREGYLNSTNVTIATYCTGSIPCSVCLSGLDSLDTYPSLETIQITISSFKWDLTA
FGLVAEWFLAYILFTRFFYVLGLAAIMQLFFSYFAVHFISNSWLMWLIINLVQMAPISAMVRMYIFFASFYYVWKSYVHV
VDGCNSSTCMMCYKRNRATRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTFCAGSTFISDEVARDLSLQFKRP
INPTDQSSYIVDSVTVKNGSIHLYFDKAGQKTYERHSLSHFVNLDNLRANNTKGSLPINVIVFDGKSKCEESSAKSASVY
YSQLMCQPILLLDQALVSDVGDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEAELAKNVSLDNVLSTFISAARQG
FVDSDVETKDVVECLKLSHQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARHINAQVAKSHNIALIWNVKDFMS
LSEQLRKQIRSAAKKNNLPFKLTCATTRQVVNVVTTKIALKGGKIVNNWLKQLIKVTLVFLFVAAIFYLITPVHVMSKHT
DFSSEIIGYKAIDGGVTRDIASTDTCFANKHADFDTWFSQRGGSYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGD
FLHFLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFKDASGKPVPYCYDTNVLEGSVAYESLRPDTRYVLMDGS
IIQFPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVSTSGRWVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGA
LDISASIVAGGIVAIVVTCLAYYFMRFRRAFGEYSHVVAFNTLLFLMSFTVLCLTPVYSFLPGVYSVIYLYLTFYLTNDV
SFLAHIQWMVMFTPLVPFWITIAYIICISTKHFYWFFSNYLKRRVVFNGVSFSTFEEAALCTFLLNKEMYLKLRSDVLLP
LTQYNRYLALYNKYKYFSGAMDTTSYREAACCHLAKALNDFSNSGSDVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCM
VQVTCGTTTLNGLWLDDVVYCPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPK
TPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFCYMHHMELPTGVHAGTDL
EGNFYGPFVDRQTAQAAGTDTTITVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQT
GIAVLDMCASLKELLQNGMNGRTILGSALLEDEFTPFDVVRQCSGVTFQSAVKRTIKGTHHWLLLTILTSLLVLVQSTQW
SLFFFLYENAFLPFAMGIIAMSAFAMMFVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRIMTWLDMVDTSLSGFKLK
DCVMYASAVVLLILMTARTVYDDGARRVWTLMNVLTLVYKVYYGNALDQAISMWALIISVTSNYSGVVTTVMFLARGIVF
MCVEYCPIFFITGNTLQCIMLVYCFLGYFCTCYFGLFCLLNRYFRLTLGVYDYLVSTQEFRYMNSQGLLPPKNSIDAFKL
NIKLLGVGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILLAKDTTEAFEKMVSLLSVLLS
MQGAVDINKLCEEMLDNRATLQAIASEFSSLPSYAAFATAQEAYEQAVANGDSEVVLKKLKKSLNVAKSEFDRDAAMQRK
LEKMADQAMTQMYKQARSEDKRAKVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNIIPLTTAAKLMVVIPDYNTY
KNTCDGTTFTYASALWEIQQVVDADSKIVQLSEISMDNSPNLAWPLIVTALRANSAVKLQNNELSPVALRQMSCAAGTTQ
TACTDDNALAYYNTTKGGRFVLALLSDLQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRG
MVLGSLAATVRLQAGNATEVPANSTVLSFCAFAVDAAKAYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQES
FGGASCCLYCRCHIDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLKNTVCTVCGMWKGYGCSCDQLREPMLQSADAQSFL
NGFAV
>Q008X5 ~~~1a~~~Replicase polyprotein 1a~~~
MSILFGNRQANATKRSDMASVARAVYEVDLISTKYARRTQERLAHNKHAKPSYPSVFFGRRMKAVKEPTFTPSTLFFEEA
TLPKVLASKAKPDTGIKTRRVYVADSLTINGHTYPIVGHFVEMAVSKKEAFPIQPKRVKPKPLMAKPIPNIRRTFLTPEE
RTNTPTTPTTTTTTPFVAGETAGPTIEYTPTSIDLPFAMPTVKQIKENAHTILREQDDCLRFAQTALFKHLGTVTHTTPN
HATTFQVKGRTSLLTFEWRKTTQSPLTDGHFYLQTANNHAELMQPVEGKLTTIFTTTIQQGTTHSLHLIKQESARTLKTR
KPLKLVTYKQETPTTTITPQSLKKTITYIPGSFCINVAEPTLQSVMRRQPLTPTPNDALLQIYHKLGCTTKSPNHASTFE
LFGNTYTWYPVQHTNNLLHKDPNRRFFLHITGQTPQLLIRTERKTFLTLQDEVTYISGKLFVMNHAPIQGEYKTQTAEWV
GSYNMAKTPKAIKPAKTVEYINTTPCHKPATMPAPITYRQCPYTWTLHEPSISKVQRNLFHIPKTATNCLDRIQKALFPE
IVTSNQHFPIGFTIQTDTTMQSYEWAILCKKVTRTYYLAVQNHHATLWYKCAQVYMCLSDDIAPTTELQGSVYILNKVTD
PGFYQNTRQWCGSNDDLHEPKHLIKNLANGDVNNIAHCSLTPWTTTPLVYSTQKNKLTRKLIHYYNATYTVQVPKENPNA
QPSTMKCAYKAYIDLATPTELQIHLDLEIQGKLYSQTAQKKGSKFNKLDIPTFGDILKGTLYVFSSKVLLYEQPKRCTSV
CFHLPNNAQSFFFNTETIQTFEDLFARISSEEVEDNLVLIKGFVPCLGAIYITKDLKFIQPELKEDFKHPTTYYTFTTTV
DPEQVLYSLHPSFTDIVPPHGFPYYTFAKLNNHIDAWNITDDQADTLAEAQPIIFQWPTEEATITTPYKVLHYEHLEGLD
YISLSSFNTVECPEETQSAHDSSSESESEDEELPAHPLSNAPSQASLSSVASTAPPTSPTSSPTPSPTPLQQQVGLKGKE
VPVGGWVLVSEEETPSEEVDSPKLLPNEVPLSFDFDLPIEPITRPISPELQQPILTHYEHPTSPTPSVEIEIDFGSYENL
TLQTEETVTTEVQPEPTPAPTPEPTIVTETVQETPVPTETTQESTPESTPESTPEPTPESTSESTLEPEHVATPSQSPTH
ITVTEITHEPETPDSWSERYDSTSNIPEVFNQLSFGSTDSVKITTPKTETPDEPQQPTVETVSAAQQLLQIVQTATPDIA
QLMSELPPYRLICIGSYCPILAENISKQLPTAVTTPTDADIPTVIFNVSEETMDTVINTVKTKHQANHLTFSLTTIIALD
VPKDKSLPLQQIYDKLTQQDYNTDFIYESHHRQPKESLTHASVLSAYTASYKTTAIKSIADNAVVLDIGYGKGNDGPRYA
VRPLTVTGIDTAARMLAIADQNKPENVTLVKQGFFTHITKTSNTYTHVIAFNSLHYPLASSHPDTLVQRLPTCPANILIP
CHHLLEGIQTPTYSVVKDEDMWCVKVTKNEFIESSYNYDVFVKALESKYHVTIGSLLDCVEKPSTRSITPTLWTAMRNFV
NNDQEMQRILSGYITFNLTPLPPKVEIINDWLDNNATVTINNPFASNEGVTFAVHNIGAITTTEGEFIVNAANKQLNNGT
GVTGAIFAAHDKELKLTQAIKALPTYGASDKLESHQHVVQTIIKNNNSTHAINILHAAAPIKVKCTSKNPEVLLAHNETA
QSELKETYKAIVDYAQLNKLTHIYLPLFGAGAYGHKPLDSLEAFLDAMRNRSPQSTTQYTLLLSDPVKPLDNPSFSYEFL
NLLVTNLNINKQFAQLIVNKYHNTCALQSAIQMNTTTDTHNFLATFIYLLYTMPYSMNTFRQTHTPEEPFTPGKMVTVVD
TTIAFLTTLDILPPCGQPCGYLPPSITKDGEYICACQKTSNWSLPFHFYNARYNKVYHTGLNNILTHKHSAFHKSRNAAH
FIAKTGPSTSSYPVYMAPVPEILAYNASYRDSCQDNAIEEQSDSQASQSPSSPVTIPVSLPTASPASSVKSALRSDIPIT
TDQQSTTSASISTATTASTIPTAPLTSSDSNTSVVTSLYGNMEELTYLDASGTSQDFILSETTPFIAHIYHNNEATFIPP
GYQLLDTNTNDPIEMYITPPRPIDGSPMISLASTASTTPMTYPLLSIRLTTEELTSFFKTKTDKFHLISHKSCLTVHLFD
SPTLNSIAADSTSDAHLYQQHLKDLYTFSDCCSMYTRTEVYNCIEADTPLIRQSEQTKFHPINLDTLIEMVATFPPIVKR
YSQTTTPDFTNLTVYFVSNGDIITTPTGSTSEQPPQLKIFLDYQTSSKFTTLVDLTLHEQTEANTIITYHHGEHQLLKPN
PSAFYIEFQTYSSFFSRFQTFSTNFFWTLFINFLINVRFCITADSAYFHWQGKPIETTNLNIVYSIGRLDFVLSKHTTPW
LTKPTDTLNPLTLIKNTLVQPIAINFHGRIRPLQSTNTRFGATHTPTKLPVHLLNTSLRTHYLSLLSLFQCTFSVFLAYI
ALLYSFSGHGIFTVVAYFTMLFARYYITSFINFCTSQLTATQVTQWFAAIKAKYTGIYESSQDRVLTVNVTGTNVPYIVK
YSTILTVTMYVAFMAFVWTVSTYAAQYTAGERYDRPPYQTVFQKTLNVLGLTETVTYYYPYASLNEACAASTSILCRLGS
PFNFHYPSDYTQVRTVQTDTSSPFWLFIIFMPPSFLFIVLPWLILCTITPTVSIAQLLVPSIILNATIVFIYIRRKFTGH
CCGPHTCIKHADISRSLQFRPTSQIQHSLTFCGTLCAKHNWYCNNSDSPTHTLGIQLAQLIETTYKLQPGTIKPDSSYTH
TTETATLPIMKVSTTSTDFTTSEPNTTVEHLHLQVIAHVTGTRISIESSSNKVQQQNTQHTRLTNKPVTGFMHTTLLQKL
KRQHKDELSSYLCNFVPSDNKKDCILPHSVVHMTLTENQRTFLLKNFTFSTNVTVDPTTTGFIPSSLNISTLPHKHFMIN
VIESAMLAKLPKEVQDTLRTTHLETTTLERQAMSLTTQAILTTFALIMATFVVAFLAFFSTAQVGKTPYAGLNPTMVGNV
NAEPYIQPTTLENSILIPLHGASKVCWRAQNGTLFFTDAIPTTECARAAVPYIGYKSEFTQTCASSNLRYPFTVYLGSIK
VMYLRDGISYLTSTLSHNSNTKKLCVQVGSNAVRCASVLPTGASSNVAALLMASVVVISMVLFYLYLLQIFKFYTNSVIM
SFVIQLLTLLATTVSTPLAVTVQLFVITYGYTNWILLTLSLLNLTVLLSTPVGITFVVIYGLYKAYTLFTSSGQGCVYNE
GGTIRFSGSFEQVANSTFPLTNASCVQLLSDLGITYQQLNVYASSRDRNVRRLAQALLHRQLDSASECILYEGCSGNTIT
RQALQRIRQAVTVVVTPASQNLCKITSNQANGIGLSCTGTFFTSTEIITCAHGIGTSDITAVHKGITYDCKVKSINNDIA
ILITTTVNLSVQNIKLDSSFSQKSDNYQRNFVQFVSFVDQQNSDAVTINNTVMLPSGHFFAIGTEAGESGSPYTLNGNII
GIHYGIDNAGSWMLASRPDGSFYVPATQHGNSAKVTFSTDAFAQQFPAIVTNKTSDQLVNEIAATNNTAFELDTTDVSHL
SNLLKHLKNNNETPKSLSDYLPVAPVTQQTSVTVGQTLTSPVNNMQTALYTLMLVSEVITYILTPNSDLSVLISMFVTSA
FLKFGASKLFYNTEMLRNTITTFVVYRYTTLLIALVFSQYYLHILSYALKLNTLVLTVIALTFLVTPLVLLTIRRVYYYS
QNYLISCVFIISCFASHTYTLYILTDTTVDFQTFIISEPIFSTLLCNLFIGFTLISVVPNPLYVCVVFMYILLDCEALGF
IVTCFMASYLCPKPLRSLTTFLCTDTLVLTAPAYLHWYGAKGTQREYSVIYAVFDSILTPETTIQIPVTIMEGIEQQVKF
IFAVPKSANIQDEEEAYVEYNQNSDIKDLAIKNEEKVCTITGMFRKTRTIKGTLMESFYPRHQPDSYILNQVVRHNYVLA
HDPETIILTTVNPTHLESEPFQTIVKKVRKLHALYQEISEQSSDDTELCHAYILALIKSTVLEAQMPTEKINFVSSQTML
SPAMILIFAEAYQILTESRSFTNHITPQSDIGSMQTTLASLAEMDTDEMTPQERKIHIKRMNVLKQEIAKMESASLKLEK
FLDNMHKAEISKRGKEDILLKVSNMLRLHLNKVANAAHCTIQTPSAGLITLASAFDVHSLCVTQHSESVLIQTPDDDTFL
VYVDGQIYTCYNPTDITGKKLVPINVMSPDNVQFPTYPVVFSLSKQDYAEEITEQNNIGYTERTHNFKLKELASGLAVTL
DGLVVVTETKELATAFKIGQRYFKFLNTNKTPVARNNTHAIIQLLRNNISQQAVVRIGGSRVSNDHIAISQVPVQTIGYL
TYAGISVCRQCATKQDHTCQYAGYFVQIPREHVSNIFNLTDTPPCLHNKFTCTTCQPLQQQSKTQQPPLNWWGSV
>Q5UQ27 ~~~~~~Probable Rab-related GTPase~~~
MNSYIKEKMENNGYKIILIGSSGVGKSSIVHQFLFNRKISNVSPTIGAAFASKQVIAKNGKTLKLNIWDTAGQERFRSIT
KMYYTNSLGCLVVFDVTDRESFDDVYYWINDLRINCHTTYYILVVANKIDIDKNNWRVSENEIKKFCRDNDCDYVFASSF
ESDTVNNLFGKMIDKMSEIKINPDSRRNDIIYLSDKSSGIDDFIDKISQNCCYIS
>P03037 ~~~ant~~~Antirepressor protein ant~~~
MNSIAILEAVNTSYVPFNGQHVLTAMVAGVAYVAMKPVVDNIGLSWSSQVQKLLKMKDKFNYVDIDMVAGDMKKRLMGCI
PLKKLNGWLFSINPEKVRADIRDKLIKYQEECFTVLYDYWTKGKAENPRKKTSVDERTPLRDAVNMLVSKKHLMYPEAYA
MIHQRFNVESIEELEASQIPLAVEYIHRVVLEGEFIGKQEKKTNDLSAKEANSLVWLWDYANRSQALFRELYPAMRQIQS
NYSGKCYDYGHEFSYIIGIARDVLINHTRDVDINEPDGPTNLSAWMRLKDKELPPSLHRY
>Q9J589 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MENKESVLLELVPKIKAYIKDDTVKEKSYQDFIEKNKELFICNLYNVNMITDEDIKLLYITIEQNIDIDDKSLVAIFSYI
GYNFEKNIHDDNSSIDLGDRMTGDMNYNMYDTFFSTLDFIIRQKHVNILVNDEGNNDFNINYRSFTTSLSYKEDKYEQVV
NEIPFNMKELLSYVSKNLDQLRFSKKYLDFAYLCRNIGIKISKRKYNVRYIFNYVIDELTIPIVIKDYLDVKYVYLEETN
KAYRNNFDNDNKYFYEWGKVIIPKFKNPRLYSYFFLSNYGLCDLFMELINIKQVTFEPRKNPIEYIYVSELKFWEEGGSV
DFVPCEHEIAIIDAKKVSLEYYENINKFIAKYIYYEDGLAYCNLCGINIQELNLDATDVTKISLINVTYNKSIFMSEPYN
YFSHSQRFIFNTIMSFDTIMKSQMWNMKYNINRLILNFLIDINSKRHEYEKQFATEIKKGIFFLRLSANLFDIQMSSMEL
FYSAKILNIHFIVALVIVLNSGADFIMYYMTNKKEETNYSDLNHIISVIVFDFLKKTRTVDSKQFNTIELFTETYMKIAT
EELIVHYNRIKLEMERLIAIKKDRKTPNYDISIYRQIQRTDEIAFFPSCITSTKLFITYEKVVAENTEIITIKHPVRIKE
GTDEDKEIFEDIMKKTTKVLIRVNDTNAYNASFFTTHIKLEVEKKKIIIPLTSLFVYNVLKYYSSNVDFYVFKFGDPFPF
HYDLISQEHTNHKITGYNMLRQELLPNSNVFTYFSDSLNRQELEFSFYMFLASYVNVTEWIEENSKKIKELYIINFNN
>A0A7H0DN81 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKSYLLDTNISPKSYNDFISRNKNIFVINLYNVSTITEEDIRLLYTTIEQNIDADDQTLVAIFSYI
GYKFEQTVKEEISTSLSFNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNMRYVFLYKIDGLSIPIIIKDFLDIKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIARDEPVFVKRKKLDIIEIDEPEAWKRDVR
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFLIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMKSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSALKFFDTSTIIVKTMAPIHACIKQKIVAPTPSVKPTDA
SLKNFKELTCDEDIKILIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYVFRFGDPMPFEEE
LIDHEHVQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIRELYMINFNN
>O57207 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKAYLLDANISPKSYDDFISRNKNIFVINLYNVSTITEEDIRLLYTTIEQNIDADDQTLVAIFSYI
GYKFEQAVKEEISTSLSFNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNVRYVFLYKIDGLSIPIIIKDFLDVKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIARDEPVFVKRKKLDIIEIDEPEAWKRDVR
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFLIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMKSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSALKFFDTSTIIVKTMAPVHTYIEQKIVAPTPSVEPTDA
SLKNFKELTCDEDIKISIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYVFRFGDPMPFEEE
LIDHEHVQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIKELYMINFNN
>P68439 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKAYLLDANISPKSYDDFISRNKNIFVINLYNVSTITEEDIRLLYTTIEQNIDADDQTLVAIFSYI
GYKFEQAVKEEISTSLSFNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNVRYVFLYKIDGLSIPIIIKDFLDVKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIARDEPVFVKRKKLDIIEIDEPEAWKRDVR
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFLIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMTSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSALKFFDTSTIIVKTMAPVHTCVEQKIVAPTPSVEPTDA
SLKNFKELTCDEDIKILIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYVFRFGDPMPFEEE
LIDHEHVQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIKELYMINFNN
>P68438 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKAYLLDANISPKSYDDFISRNKNIFVINLYNVSTITEEDIRLLYTTIEQNIDADDQTLVAIFSYI
GYKFEQAVKEEISTSLSFNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNVRYVFLYKIDGLSIPIIIKDFLDVKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIARDEPVFVKRKKLDIIEIDEPEAWKRDVR
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFLIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMTSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSALKFFDTSTIIVKTMAPVHTYIEQKIVAPTPSVEPTDA
SLKNFKELTCDEDIKILIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYVFRFGDPMPFEEE
LIDHEHVQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIKELYMINFNN
>P0DSR3 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKAYLLDANISPKSYDDFISRNKNIFVINLYNVSAITEEDIRLLYTTIEQNIDANDQTLVAIFSYI
GYKFEQTVKEEISTSLSLNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNVRYVFLYKIDGLSIPIIIKDFLDVKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIAKDEPVFIKRKKLDIIEIDEPEAWKRDVK
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFFIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMKSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSTLKFFDTSTIIVKTMAPVHTYVEQKIVAPTPSVEPTDA
SLKQFKELTCDEDIKILIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYIFRFGDPMPFEEE
LIDHEHAQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIRELYMINFNN
>P0DSR4 ~~~~~~RNA polymerase-associated transcription-specificity factor RAP94~~~
MDSKETILIEIIPKIKAYLLDANISPKSYDDFISRNKNIFVINLYNVSAITEEDIRLLYTTIEQNIDANDQTLVAIFSYI
GYKFEQTVKEEISTSLSLNDKNTTDEMTYNLYDLFFNTLDMYLRQKKISILVNDDVRGDVIVSYKNSDLVSSFNAELEPE
IKKIPFNMKNLLPYLEKNLDQLRFSKKYLDFAYLCRHIGIPISKKKYNVRYVFLYKIDGLSIPIIIKDFLDVKYVYLENT
GKIYKNSFSEDHNNSLSDWGKVIIPLLKDRHLYSYIFLSSYHLHSYYTDLIAKDEPVFIKRKKLDIIEIDEPEAWKRDVK
VEFAPCEHQIRLKEAMKVDANYFTKINNFANEFIYYEDGVAYCRVCGINIPIFNLDAADVIKNTVIVSTFNKTIFLSEPY
SYFVHSQRFIFNIIMSFDNIMKSQTWVMKYNINRLILNFFIDINSRRQEYEKKFSSEIKRGLFFLRLSANLFESQVSSTE
LFYVSKMLNLNYIVALVIILNSSADFIVSYMKSKNKTVEESTLKYAISVVIYDFLVKTRICEKGSLDTIVLFTDVYTSIM
PEELDLHFQRITLELRKLVSIQRSALEPNYDVESRGEELPLSTLKFFDTSTIIVKTMAPVHTYVEQKIVAPTPSVEPTDA
SLKQFKELTCDEDIKILIRVHDTNATKLVIFPSHLKIEIERKKLIIPLKSLYITNTLKYYYSNSYLYIFRFGDPMPFEEE
LIDHEHAQYKINCYNILRYHLLPDSDVFVYFSNSLNREALEYAFYIFLSKYVNVKQWIDENITRIRELYMINFNN
>P03050 ~~~arc~~~Transcriptional repressor arc~~~
MKGMSKMPQFNLRWPREVLDLVRKVAEENGRSVNSEIYQRVMESFKKEGRIGA
>P01115 3.6.5.2~~~H-RAS~~~Transforming protein p29~~~
MPAARAAPAADEPMRDPVAPVRAPALPRPAPGAVAPASGGARAPGLAAPVEAMTEYKLVVVGARGVGKSALTIQLIQNHF
VDEYDPTIEDSYRKQVVIDGETCLLDILDTTGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDD
VPMVLVGNKCDLAGRTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVL
S
>P01117 3.6.5.2~~~K-RAS~~~GTPase KRas~~~
MTEYKLVVVGASGVGKSALTIQLIQNHFVDEYDPTIQDSYRKQVVIDGETCLLDILDTTGQEEYSAMRDQYMRTGEGFLC
VFAINNTKSFEDIHHYREQLKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQELARSYGIPFIETSAKTRQRVEDAFYTLV
REIRQYRLKKISKEEKTPGCVKIKKCVIM
>P23207 ~~~oad~~~Receptor-binding protein pb5~~~
MSFFAGKLNNKSILSLRRGSGGDTNQHINPDSQTIFHSDMSHVIITETHSTGLRLDQGAGDYYWSEMPSRVTQLHNNDPN
RVVLTEIEFSDGSRHMLSGMSMGVGAKAYGIINPQIMSQGGLKTQITASADLSLDVGYFNTGTSGTIPQKLRDGTGCQHM
FGAFSGRRGFASSAMYLGGAALYKSAWSGSGYVVADAGTLTIPSDYVRHPGARNFGFNAIYVRGRSCNRVLYGMEGPNYT
TGGAVQGASSSGALNFTYNPSNPESPKYSVGFARADPTNYAYWESMGDPNDSANGPIGIYSEHLGIYPSKITWYVTNLVY
NGSGYNIDGGLFNGNDIKLSPREFIIKGVNVNNTSWKFINFIEKNFNVGNRADFRDVGCNLSKDSPSTGISGIATFGLPT
TESNNAPSIKGGNVGGLHANVVSIYNFLPSASWYVSSNPPKIGNNYGDVWSENLLPLRLLGGSGSTILSGNIVFQGNGSV
HVGTVGLDLNSSRNGAIVCTMEFIDDTWLSAGGIGCFNPTEMLSQGAEYGDSRFRIGGNTINKKLHQILSLPAGEYVPFF
TIKGTVVNACKLQAAAYNPTPYWVSGLPGSVGQTGYYTLTYYMRNDGNNNISIWLDSSMSNIIGMKACLPNIKLIIQRLT
>Q71AW2 ~~~rbp~~~Receptor binding protein~~~
MTIKNFTFFSPNSTEFPVGSNNDGKLYMMLTGMDYRTIRRKDWSSPLNTALNVQYTNTSIIAGGRYFELLNETVALKGDS
VNYIHANIDLTQTANPVSLSAETANNSNGVDINNGSGVLKVCFDIVTTSGTGVTSTKPIVQTSTLDSISVNDMTVSGSID
VPVQTLTVEAGNGLQLQLTKKNNDLVIVRFFGSVSNIQKGWNMSGTWVDRPFRPAAVQSLVGHFAGRDTSFHIDINPNGS
ITWWGANIDKTPIATRGNGSYFIK
>M1EBB2 ~~~~~~Receptor-recognizing protein gp38~~~
MAVQGPWVGSSYVAETGQNWASLAANELRVTERPFWISSFIGRSKEEIWEWTGENHSFNKDWLIGELRNRGGTPVVINIR
AHQVSYTPGAPLFEFPGDLPNAYITLNIYADIYGRGGTGGVAYLGGNPGGDCIHNWIGNRLRINNQGWICGGGGGGGGFR
VGHTEAGGGGGRPLGAGGVSSLNLNGDNATLGAPGRGYQLGNDYAGNGGDVGNPGSASSAEMGGGAAGRAVVGTSPQWIN
VGNIAGSWL
>P03036 ~~~CRO~~~Regulatory protein cro~~~
MQTLSERLKKRRIALKMTQTELATKAGVKQQSIQLIEAGVTKRPRFLFEIAMALNCDPVWLQYGTKRGKAA
>P09964 ~~~cro~~~Regulatory protein cro~~~
MYKKDVIDHFGTQRAVAKALGISDAAVSQWKEVIPEKDAYRLEIVTAGALKYQENAYRQAA
>P03040 ~~~cro~~~Regulatory protein cro~~~
MEQRITLKDYAMRFGQTKTAKDLGVYQSAINKAIHAGRKIFLTINADGSVYAEEVKPFPSNKKTTA
>Q9T216 ~~~3~~~Recombination directionality factor~~~
MAKRSIWAGDEDNKPKKRETYADDTVGRFHSGYSETNERGKVVPVALDKWRISTGEQSVADAVAQLFGGTPVENEESTSE
NFIDVFTDRPKVPVIIEADGIHWDMKLWLNGKLKHHCDGFDFVSHADEEMIGQPCGCPKLFDERKAAAKEYDAPNPAITV
TFTLADDPELGRFKFQTGSWTLFKVLHEAEDDVERVGKGGAVLANLELELVEYTPKRGPMRNKLVSYYKPTITVLKSYND
AIAD
>Q8JU61 2.7.7.48~~~S2~~~RNA-directed RNA polymerase VP2~~~
MEELFNALPQPLQQLSLALAGEIPLTDHIFEQAASTWHVQPRSLTYKLLDHIPFSTPVVVPPSIYHSLDWSKCFAVNQDR
VERVPTIDDPDDVYVPNSDIGPLLTSLHTIPDYGFLHPAIENDATTLRAERARCASTFYKIASSQARQVKLDPIRMLGFL
LLVQARPRVPSGLVTDQPTRRDPTQSPALHAIWQVMQYYKVAGVYYAPALVVPSGAIWWIPPPGKRNVVSVQYLLTDLIN
LAILAHMTDMSPTLELTGVLMYLRAASSHSHAYTLLQMKSVFPALSLRSMYRNKGFGGKAPAIEWTEPRSKYKFRWTGVT
QLHDGLRPRSPSMDVPTLEVLTKYELVDIGHIIIRERNAHPRHNHDSVRFVRDVMALTSGMYLVRQPTMSVLREYSQVPD
IKDPIPPSAWTGPIGNVRYLLPSVQGPARHLYDTWRAAARQIAQDPQWHDPLNQAIMRAQYVTARGGSSASLKFALKVTG
IVLPEYDDSKVKKSSKIYQAAQIARIAFMLLIAAIHAEVTMGIRNQVQRRARSIMPLNVIQQAISAPHTLVANYINKHMN
LSTTSGSVVTDKVIPLILYASTPPNTVVNVDIKACDASITYNYFLSVICGAMHEGFEVGNADAAFMGVPSTIVSDRRSSV
APYSRPISGLQTMVQHLADLYAAGFRYSVSDAFSSGNKFSFPTSTFPSGSTATSTEHTANNSTMMEYFLNVHAPSHVKSA
SLKRILTDMTIQRNYVCQGDDGILLLPHEAASKISADDMNELLTCLRDYGQLFGWNYDIDWSDTAEYLKLYALMGCRIPN
TSRHPPVGKEYAAPQTDEIWPSLIDIVIGHHLNGVTDVLNWREWLRFSWAFACYSSRGGYTNPKGQSFSAQYPWWTFVYL
GIPPILLPGQTPFIHSCYMPPGDQGMFSILNGWRDWLISHASTTLPPLRHNHPVWGLSDVPSLLSQFGVYAGYHAAQHYR
RPKPAPETASSDSINQITSDLTEYLFYDSALKARVMKGRYNWERLSSSLSLNVGSRVPSLFDVPGKWVAAGRDAEKPPPS
SVEDMFTSLNRCIRRPTHSFSRLLELYLRVHVTLGESIPLAIDPDVPQVAGADPANDDHWFKYTCLGDIPSATRNYFGES
LFVGRVVSGLDVEAVDATLLRLKILGAPPEAFIAVLNGIGMSDSEAHQIAGRISLANAQLVQIARVVHLSIPSSWMTLNT
GPYIHHHAYDFKPGITQPSAKSRDKSIWMSPILKLLCTSYAMTVAGPVRTSIVTEIDGSAAALSGNLRVWMRDV
>Q65652 ~~~ORF1~~~RNA replication polyprotein~~~
MALTYRSPVEEVLTLFEPTAQSLIASAAVSAFQRHEKDNFEWFRYSVPAFAKEHLSKAGIYLSPYAGFPHSHPVCKTLEN
YILYVVVPSIVNSTFFFVGIKDFKINFLKSRFDKLNMISALNRYVSSADKIRYGNDFVIRAGVEHRALKRHRGLVDSPTL
KALMPNVKSGSKLFLHDELHYWSKEELIGFLEICEPEVLLGTVIYPPELLIGSDCSLNPWCYEYEVKKKKLLFYPDGVRS
EGYEQPLSGGYLLQTSRIKLPNAGIYCVDLLCSRFAHHLFSITRGDLITPDNRSFGPFEAVHSGALAGISRGKPNFYPVS
QHTILRVYRYLRSLKKPDKQSAMAKFSQIVHEPCGRAVKFMEEFSDLIINTGTLRTVINPEQVKLFFGNLGRCMPPCFAS
KLKGTRTVCLDEFISMLRPLSVDVTLETISMHSMTMVVTTWSQEAEEGVDLPKIFEEKWEGKQSLDRTEAPYLGLAPFVD
YKIQWRLQFNIPKFLNQLAELFVNSCSVNGGVRSMSIPAYLRRLATCRSCVGRAMLCCLTEVDIASLRVVVRNRYPYTED
FYRCRRRWFLRIGAQRRPSFYIEDAKHLERLGQFEEEQFQRPMSRRSLYTLASVSMNGTDDPFCSDCFYDPVPVARAKIV
PTPTVIVERALEPLAIDTGTTSDAPCDAPGATCLRGAQAVVCACGLSMAVSAVPYAELKMDFYPDALKGRDAAWYSKEDR
EYKYNGGSHLCRGWPKWLQLWMQANGVDETYDCMLAQRYGAQGKIGFHADNEEIFMRGAPVHTVSMDGNADFGTECAAGR
QYTTLRGNVQFTMPSGFQETHKHAVRNTTAGRVSYTFRRLAKKDESRVIEEVVEVETKDMGFSSSLFGVQIIVDEPCDGV
EETFNVQCVPGDGNCFWHSLGSFTGLTVECMKAGIKNFACGPEGAEKLSRQLEPNVWAEDEALCAACAHLGVDLVIFDED
QGFKMLYRYPGNKREALLRLKGSHFEPLEPKEMCVVKAIAQAVKRSPMDVLRVALKKMGEDFKEQICRGKGVMLDVFMVL
AKIFDVSACVLQGTEQIMINPKGRIKGLFRMTTDHLSYDGVPDKVKHSEVNVYKHDVALQIEDLIELRELSSLVEYTPSF
SRAKLLADCLHDGSTGVMCSELYNDKGHLCPEGRETTRVTIGVLLGTFGCGKSRLFKEILFKLCGKSVCYISPRKALCDS
FDDEIRKARGNMGERGIKHYKSLTFEKAILQASKLHKGSLVIIDEIQLYPPGYLDLLLLLAGPTMKYFALGDPCQSDYDS
EKDRTILGSVRSDVFELLDGIEYKFNILSRRFQSSLFRGRLPCLMYEEDLEAGAPLRLIDGLESIDTSAAYSRCCLVSSF
EEKKIVNAYFGERTKCLTFGESTGMTFDVGCVLITSISAHTSEQRWITALSRFRKDIVFVNAASVAWDTLQSVYANRWLG
RFLNRSARQEDLRRMLPGTPLFVEGFQKNLLGADEGKRECKLEGDPWLKTMVDLLQVEDMEDIEIAKEVLQDEWCKTHLP
QCELESVRARWVHKILAKEFREKRMGCLVSEQFTDQHSKQMGKHLTNSAERFETIYPRHRAADTVTFIMAVRKRLSFSCP
IKESAKLNQALPYGPFLLKEFLKRVPLKPMHDRKMMEQAKFDFEEKKTSKSAATIENHSNRSCRDWLIDVGLVFSKSQLC
TKFDNRFRDAKRAQTIVCFQHAVLCRFAPYMRYIEKKLNEVLPSKYYIHSGKGLEELNRWVIEGRFEGVCTESDYEAFDA
SQDHYIVAFEICLMRYLGLPNDLIEDYKFIKTHLGSKLGNFAIMRFSGEASTFLFNTMANMLFTFLQYDLKGNERICFAG
DDMCANGRLHVSSKHKNFMSKLKLKAKVSNTMNPTFCGWNLSSDGIFKKPQLVLERLCIAKETNNLANCIDNYAIEVSFA
YLMGERAKQRMDEEEVEAFYNCVRIIVKSKHLLKSDVATIYQTARVD
>P03594 2.7.7.48~~~ORF2a~~~RNA-directed RNA polymerase 2a~~~
MSSKTWDDDFVRQVPSFQWIIDQSLEDEVEAASLQVQEPADGVAIDGSLASFKLAIAPLEIGGVFDPPFDRVRWGSICDT
VQQMVQQFTDRPLIPQAEMARMLYLDIPGSFVLEDEIDDWYPEDTSDGYGVSFAADEDHASDLKLASDSSNCEIEEVRVT
GDTPKELTLGDRYMGIDEEFQTTNTDYDITLQIMNPIEHRVSRVIDTHCHPDNPDISTGPIYMERVSLARTEATSHSILP
THAYFDDSYHQALVENGDYSMDFDRIRLKQSDVDWYRDPDKYFQPKMNIGSAQRRVGTQKEVLTALKKRNADVPEMGDAI
NMKDTAKAIAKRFRSTFLNVDGEDCLRASMDVMTKCLEYHKKWGKHMDLQGVNVAAETDLCRYQHMLKSDVKPVVTDTLH
LERAVAATITFHSKGVTSNFSPFFTACFEKLSLALKSRFIVPIGKISSLELKNVRLNNRYFLEADLSKFDKSQGELHLEF
QREILLALGFPAPLTNWWSDFHRDSYLSDPHAKVGMSVSFQRRTGDAFTYFGNTLVTMAMIAYASDLSDCDCAIFSGDDS
LIISKVKPVLDTDMFTSLFNMEIKVMDPSVPYVCSKFLVETEMGNLVSVPDPLREIQRLAKRKILRDEQMLRAHFVSFCD
RMKFINQLDEKMITTLCHFVYLKYGKEKPWIFEEVRAALAAFSLYSENFLRFSDCYCTEGIRVYQMSDPVCKFKRTTEER
KTDGDWFHNWKNPKFPGVLDKVYRTIGIYSSDCSTKELPVKRIGRLHEALERESLKLANDRRTTQRLKKKVDDYATGRGG
LTSVDALLVKSHCETFKPSDLR
>Q65667 ~~~ORF1~~~Replicase~~~
MADSFGFTPMEVLLFGGESVQLLTSDMPIDVQWGFVHSTRCYALWKDDLIHLNPLLKYSQRIAKRWERLVSGFVGPVPLD
KLLSLLAKLMRYCVNMGVSVQEIYLSDAIVSSSYMLHVSRSAGCVSFSWLYAKLSMFASCGKFWVGSSHHTAANMIEGSR
AVNGPDVAISEMVEAFHLEVKSSLVVTVSLTPREKKILERELGFVPLYKQKSRAPRNHPVLAALREVMRQEYSASCNILN
TKLKTLVVGAASREVNCYSSNPSVHYYFANKDSKDLVRTTLELLHSALATKYRNMESGERELMNNLKGCGYIVKRSVENA
VYEVVSDKDVAEVLRYAQTVASTKKEAKKKPNTGKRKMVMSEATRRTIELHELSRIVAEEKKIPNHFHFDESDFASVGNF
TQLVCEDVGYNFSVDAWLHLFEATGAQTAVGYMALPNELLFEHYPISDYYDYWEGVEKHGSLGGITISPLRNGQVVGMPT
GVFQPVHFDKTSAGLGIPGSKMGAAERVICHMSDGLGNGYNHVKSDWQTLLKHPILSSSKYNFAVEVDLTGRYGCLATFR
LTRVTGVKYVARTIKLRPEDRYVRVLDLLHIVRSIRLKGHAGLKEPYQYFPVYKREVDTPVSYCFSIAEKSLTVQNIANF
IRHHIGGVSLVNKELVSAWRLNPQLVPSFAYAVYFYVVNLRGELDGMLQKLMKKGITWADRLKANVSAFLRDMVDPISFL
WTWLFERRLVDQIFQDGTDVFYQMDRACVDEKALRLNDHIKITRDFLPADTLLPEGWSLDDWEKAPDSLKTLSAAASLPV
ECGAVNCVGKSFKSVRTLLPPSVVTSPVEQFFKSGGKFRDDAEFAELLSAHYRWQMDNSFCACQVCAALTGKTGSQVVEC
RWKAESMYTFSMSQTEVDDFRNEIKAQSIEKGNRFGEMLIGVHQKIPTQAFEVSVRLEYVKGGPGTGKSFLIRSLADPIR
DLVVAPFIKLRSDYQNQRVGDELLSWDFHTPHKALDVTGKQIIFVDEFTAYDWRLLAVLAYRNHAHTIYLVGDEQQTGIQ
EGRGEGISILNKVDLSKVSTHVPIMNFRNPVRDVKVLNYLFGSRMVPMSSVEKGFSFGDVKEFSSLSNIPDTKIIHYSDE
TGEHMMPDYVRGVSKTTVRANQGSTYDNVVLPVLPSDLNLINSAELNLVALSRHRNKLTILLDNDGMNIGAVLKGMLEGV
PEELERRDYIVGMYLGLHLPIKKEFFFPESEFAKSFRLMVAKYEAFVPYDSNLPTLVLQGDVVVLDIARVENDINDAFNC
PDFYNLVSRPNNCLVVAISECLGVTLEKLDNLMQANAVTLDKYHAWLSKKSPSTWQDCRMFADALKVSMYVKVLSDKPYD
LTYEVDGAGSSVTLHLVGKESDGHFIAAPLSPSLSTNERESGHDSKKPADDSDTFDAANLFADKGVSSADIEAFCAYLEK
TLMATIMEYDLRLQSWANVVDDTDDFYQINISEFRQSTCFGKLLSALEVLKVDVSRKRFISDWLCKNLENKQFRWRWSSS
VASTSSAGSNVDDDFVNMAGGKTDANVDPADVLRQSFMDYASEFVPILIAESPILMPLVEPEPILSKCMVPEFDAFLLIK
EFDLDNGADEYQCAYLNESVANRVGDKFVSGVLDTDIISPLNLRGHPIAENVKYHSMCVAPAQIYFKRNQWQELQVQQAR
YLFRKVRNSPSSTQDSVARMVAQLFVSDCLVPNVADTFSASNLWRIMDKAMHDMVTKNYQGQMEEEFTRNAKLYRFQLKD
IEKPLKDPETDLAKAGQGILAWSKEAHVKFMVAFRVLNDLLLKSLNSNVVYDNTMSETEFVGKINAAMNIVPDSAINGVI
DAAACDSGQGVFTQLIERHIYAALGISDFFLDWYFSFREKYVMQSRYVRAHMSYVKTSGEPGTLLGNTILMGAMLNAMLR
GTGPFCMAMKGDDGFKRQANLKINDQMLKLIKKETVLDFKLDLNVPITFCGYALSNGHLFPSVSRKLTKIAAHRFREYKH
FCEYQESLRDWIKNLPKDPAVYADFLECNASLSCRNVDDVQRWLDAIISVSRIGREQFMMMFPIREVFMSLPPVEDSLGE
LSSTKVAVSIGDNVSNVVRKVARVDMKKF
>P15965 2.7.7.48~~~~~~RNA-directed RNA polymerase beta chain~~~
MSKSTKKFNSLCIDLSRDLSLEVYQSIASVATGSSDPHSDDFTAIAYLRDELLTKHPNLGDGNDEATRRSLAIAKLLEAN
DRCGQINRDGFLHDATASWDPDVLQTSIRSLIGNLLSGYSSQLFRHCTFSNGASMGHKLQDAAPYKKFAEQATVTPRALK
AAVLVKDQCSPWIRHSHVFPESYTFRLVGGNGVFTVPKNNKIDRAACKEPDMNMYLQKGVGGFIRRRLKTVGIDLNDQTI
NQRLAQQGSRDGSLATIDLSSASDSISDRLVWSFLPPELYSYLDMIRSHYGYVNGKMIRWELFSTMGNGFTFELESMIFW
AIVRATQIHFRNTGTIGIYGDDIICPTEIAPRVLEALSFYGFKPNLRKTFTSGSFRESCGAHYFRGVDVKPFYIKKPITD
LFSLMLILNRIRGWGVVNGIADPRLYEVWEKLSRLVPRYLFGGTDLQADYYVVSPPILKGIYSKMNGRREYAEARTTGFK
LARIARWRKHFSDKHDSGRYIAWFHTGGEITDSMKSAGVRVMRTSEWLQPVPVFPQECGPASSPQ
>P07393 2.7.7.48~~~~~~RNA-directed RNA polymerase beta chain~~~
MFRFREIEKTLCMDRTRDCAVRFHVYLQSLDLGSSDPLSPDFDGLAYLRDECLTKHPSLGDSNSDARRKELAYAKLMDSD
QRCKIQNSNGYDYSHIESGVLSGILKTAQALVANLLTGFESHFLNDCSFSNGASQGFKLRDAAPFKKIAGQATVTAPAYD
IAVAAVKTCAPWYAYMQETYGDETKWFRRVYGNGLFSVPKNNKIDRAACKEPDMNMYLQKGAGSFIRKRLRSVGIDLNDQ
TRNQELARLGSIDGSLATIDLSSASDSISDRLVWDLLPPHVYSYLARIRTSFTMIDGRLHKWGLFSTMGNGFTFELESMI
FWALSKSIMLSMGVTGSLGIYGDDIIVPVECRPTLLKVLSAVNFLPNEEKTFTTGYFRESCGAHFFKDADMKPFYCKRPM
ETLPDVMLLCNRIRGWQTVGGMSDPRLFPIWKEFADMIPPKFKGGCNLDRDTYLVSPDKPGVSLVRIAKVRSGFNHAFPY
GHENGRYVHWLHMGSGEVLETISSARYRCKPNSEWRTQIPLFPQELEACVLS
>P00585 2.7.7.48~~~~~~RNA-directed RNA polymerase beta chain~~~
MSKTTKKFNSLCIDLPRDLSLEIYQSIASVATGSGDPHSDDFTAIAYLRDELLTKHPTLGSGNDEATRRTLAIAKLREAN
GDRGQINREGFLHDKSLSWDPDVLQTSIRSLIGNLLSGYRSSLFGQCTFSNGAPMGHKLQDAAPYKKFAEQATVTPRALR
AALLVRDQCAPWIRHAVRYNESYEFRLVVGNGVFTVPKNNKIDRAACKEPDMNMYLQKGVGAFIRRRLKSVGIDLNDQSI
NQRLAQQGSVDGSLATIDLSSASDSISDRLVWSFLPPELYSYLDRIRSHYGIVDGETIRWELFSTMGNGFTFELESMIFW
AIVKATQIHFGNAGTIGIYGDDIICPSEIAPRVLEALAYYGFKPNLRKTFVSGLFRESCGAHFYRGVDVKPFYIKKPVDN
LFALMLILNRLRGWGVVGGMSDPRLYKVWVRLSSQVPSMFFGGTDLAADYYVVSPPTAVSVYTKTPYGRLLADTRTSGFR
LARIARERKFFSEKHDSGRYIAWFHTGGEITDSMKSAGVRVIRTSEWLTPVPTFPQECGPASSPR
>P11124 2.7.7.48~~~P2~~~RNA-directed RNA polymerase~~~
MPRRAPAFPLSDIKAQMLFANNIKAQQASKRSFKEGAIETYEGLLSVDPRFLSFKNELSRYLTDHFPANVDEYGRVYGNG
VRTNFFGMRHMNGFPMIPATWPLASNLKKRADADLADGPVSERDNLLFRAAVRLMFSDLEPVPLKIRKGSSTCIPYFSND
MGTKIEIAERALEKAEEAGNLMLQGKFDDAYQLHQMGGAYYVVYRAQSTDAITLDPKTGKFVSKDRMVADFEYAVTGGEQ
GSLFAASKDASRLKEQYGIDVPDGFFCERRRTAMGGPFALNAPIMAVAQPVRNKIYSKYAYTFHHTTRLNKEEKVKEWSL
CVATDVSDHDTFWPGWLRDLICDELLNMGYAPWWVKLFETSLKLPVYVGAPAPEQGHTLLGDPSNPDLEVGLSSGQGATD
LMGTLLMSITYLVMQLDHTAPHLNSRIKDMPSACRFLDSYWQGHEEIRQISKSDDAILGWTKGRALVGGHRLFEMLKEGK
VNPSPYMKISYEHGGAFLGDILLYDSRREPGSAIFVGNINSMLNNQFSPEYGVQSGVRDRSKRKRPFPGLAWASMKDTYG
ACPIYSDVLEAIERCWWNAFGESYRAYREDMLKRDTLELSRYVASMARQAGLAELTPIDLEVLADPNKLQYKWTEADVSA
NIHEVLMHGVSVEKTERFLRSVMPR
>P14647 2.7.7.48~~~~~~RNA-directed RNA polymerase subunit beta~~~
MSKTASSRNSLSAQLRRAANTRIEVEGNLALSIANDLLLAYGQSPFNSEAECISFSPRFDGTPDDFRINYLKAEIMSKYD
DFSLGIDTEAVAWEKFLAAEAECALTNARLYRPDYSEDFNFSLGESCIHMARRKIAKLIGDVPSVEGMLRHCRFSGGATT
TNNRSYGHPSFKFALPQACTPRALKYVLALRASTHFDIRISDISPFNKAVTVPKNSKTDRCIAIEPGWNMFFQLGIGGIL
RDRLRCWGIDLNDQTINQRRAHEGSVTNNLATVDLSAASDSISLALCELLLPPGWFEVLMDLRSPKGRLPDGSVVTYEKI
SSMGNGYTFELESLIFASLARSVCEILDLDSSEVTVYGDDIILPSCAVPALREVFKYVGFTTNTKKTFSEGPFRESCGKH
YYSGVDVTPFYIRHRIVSPADLILVLNNLYRWATIDGVWDPRAHSVYLKYRKLLPKQLQRNTIPDGYGDGALVGSVLINP
FAKNRGWIRYVPVITDHTRDRERAELGSYLYDLFSRCLSESNDGLPLRGPSGCDSADLFAIDQLICRSNPTKISRSTGKF
DIQYIACSSRVLAPYGVFQGTKVASLHEA
>P09675 2.7.7.48~~~~~~RNA-directed RNA polymerase subunit beta~~~
MPKTASRRREITQLLGKVDINFEDDIHMSIANDLFEAYGIPKLDSAEECINTAFPSLDQGVDTFRVEYLRAEILSKFDGH
PLGIDTEAAAWEKFLAAEEGCRQTNERLSLVKYHDNSILSWGERVIHTARRKILKLIGESVPFGDVALRCRFSGGATTSV
NRLHGHPSWKHACPQDVTKRAFKYLQAFKRACGDVVDLRVNEVRTSNKAVTVPKNSKTDRCIAIEPGWNMFFQLGVGAVL
RDRLRLWKIDLNDQSTNQRLARDGSLLNHLATIDLSAASDSISLKLVELLMPPEWYDLLTDLRSDEGILPDGRVVTYEKI
SSMGNGYTFELESLIFAAIARSVCELLEIDQSTVSVYGDDIIIDTRAAAPLMDVFEYVGFTPNRKKTFCDGPFRESCGKH
WFQGVDVTPFYIRRPIRCLADMILVLNSIYRWGTVDGIWDPRALTVYEKYLKLLPRNWRRNRIPDGYGDGALVGLATTNP
FVIVKNYSRLYPVLVEVQRDVKRSEEGSYLYALLRDRETRYSPFLRDADRTGFDEAPLATSLRRKTGRYKVAWIQDSAFI
RPPYLITGIPEVKLAS
>Q0PW25 ~~~ORF2A-2B~~~Replicase polyprotein P2AB~~~
MGCSVVGNCKSVMLMSRMSWSKLALLISVAMAAAMTDSPPTLICMGILVSVVLNWIVCAVCEEASELILGVSLETTRPSP
ARVIGEPVFDPRYGYVAPAIYDGKSFDVILPISALSSASTRKETVEMAVENSRLQPLESSQTPKSLVALYSQDLLSGWGS
RIKGPDGQEYLLTALHVWETNISHLCKDGKKVPISGCPIVASSADSDLDFVLVSVPKNAWSVLGVGVARLELLKRRTVVT
VYGGLDSKTTYCATGVAELENPFRIVTKVTTTGGWSGSPLYHKDAIVGLHLGARPSAGVNRACNVAMAFRVVRKFVTVEN
SELYPDQSSGPARELDAETYTERLEQGIAFTEYNISGITVKTSDREWTTAEALRVARYKPLGGGKAWGDSDDEDTQETAI
RPLNLPAGGLPTGQSALGQLIEYAGYVWRDEGIINSNGMPFRSAGKSSCRFREAVCRAVHRDVRAAETEFPELKELAWPS
RGSKAEIGSLLFQAGRFERVEAPANLQLAITNLQAQYPRSRPRSCFRREPWCREDFVAEIEKIAHSGEINLKASPGVPLA
EIGVSNQQVIDVAWPLVCEAVVERLHALASVDPRQHDWSPEELVKRGLCDPVRLFVKQEPHSRQKIEQGRFRLISSVSLV
DQLVERMLFGPQNTTEIALWHSNPSKPGMGLSKASQVALLWEDLARKHQTHPGAMADISGFDWSVQDWELWADVSMRIEL
GSFPALMAKAAISRFYCLMNATFQLTNGELLTQELPGLMKSGSYCTSSSNSRIRCLMAELIGSPWCIAMGDDSVEGWVDD
APRKYSALGHLCKEYEACPVLPNGDLKEVSFCSHLISKGRAELETWPKCLFRYLSGPHDVESLEMELSSSRRWGQIVRYL
RRIGRVSGNDGEERSSNESPATTKTQGSAAAWGPPQEAWPVDGASLSTFEPSSSGWFHLEGW
>Q66136 2.7.7.48~~~ORF2a~~~RNA-directed RNA polymerase 2a~~~
MAFPAPAFSLANLLNGSYGVDTPEDMERLRSEQREEAAAACRNYRPLPAVDVSESVTEDAHSLQTPDGAPAEAVSDEFVT
YGAEDYLEKSDDELLVAFETMVKPMRIGQLWCPAFNKCSFISSIAMARALLLAPRTSHRTMKCFEDLVAAIYTKSDFYYS
EECEADDVQMDISSRDVPGYSFEPWSRTSGFEPPPICEACDMIMYQCPCFDFNALKKPCAERTFADDYVIEGLDGVVDNA
TLLSNLGPFLVPVKCQYEKCPTPTIAIPPNLNRATDRVDINLVQSICDSTLPTHSNYDDSFHQVFVESADYSIDLDHVRL
RQSDLIAKIPDSGHMIPVLNTGSGHKRVGTTKEVLTAIKKRNADVPELGDSVNLSRLSKAVAERFFISYINGNSLASSNF
VNVVSNFHDYMEKWKSSGLSYDDLPDLHAENLQFYDHMIKSDVKPVVSDTLNIDRPVPATITYHKKSITSQFSPLFTALF
ERFQRCLRERIILPVGKISSLEMAGFDVKNKYCLEIDLSKFDKSQGEFHLLIQEHILNGLGCPAPITKWWCDFHRFSYIR
DRRAGVGMPISFQRRTGDAFTYFGNTIVTMAEFAWCYDTDQFEKLLFSGDDSLGFSQLPPVGDSSKFTTLYNMEAKVMEP
SVPYICSKFLLSDEFGNTFSVPDPLREVQRLGTKKIPYSDNDEFLFAHFMQFVDRLKFLDRMSQSCIDQLSIFFELKYKK
SGEEAALMLGAFKKYTANFQSYKELYYSDRRQCELINSFCISEFRVERVNSNKQRKKHGIERRCNDKRRTPTGSYGGGEE
AETKVSQTESTGTRSQKSQRESAFKSQTVPLPTVLSSGRSGTDRVIPPCERGEGTRA
>Q66929 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MTLKVILGEHQITRTELLVGIATVSGCGAVVYCISKFWGYGAIAPYPQSGGNRVTRALQRAVIDKTKTPIETRFYPLDSL
RTVTPKRVADNGHAVSGAVRDAARRLIDESITAVGGSKFEVNPNPNSSTGLRNHFHFAVGDLAQDFRNDTPADDAFIVGV
DVDYYVTEPDVLLEHMRPVVLHTFNPKKVSGFDADSPFTIKNNLVEYKVSGGAAWVHPVWDWCEAGEFIASRVRTSWKEW
FLQLPLRMIGLEKVGYHKIHHCRPWTDCPDRALVYTIPQYVIWRFNWIDTELHVRKLKRIEYQDETKPGWNRLEYVTDKN
ELLVSIGREGEHAQITIEKEKLDMLSGLSATQSVNARLIGMGHKDPQYTSMIVQYYTGKKVVSPISPTVYKPTMPRVHWP
VTSDADVPEVSARQYTLPIVSDCMMMPMIKRWETMSESIERRVTFVANDKKPSDRIAKIAETFVKLMNGPFKDLDPLSIE
ETIERLNKPSQQLQLRAVFEMIGVKPRQLIESFNKNEPGMKSSRIISGFPDILFILKVSRYTLAYSDIVLHAEHNEHWYY
PGRNPTEIADGVCEFVSDCDAEVIETDFSNLDGRVSSWMQRNIAQKAMVQAFRPEYRDEIISFMDTIINCPAKAKRFGFR
YEPGVGVKSGSPTTTPHNTQYNGCVEFTALTFEHPDAEPEDLFRLIGPKCGDDGLSRAIIQKSINRAAKCFGLELKVERY
NPEIGLCFLSRVFVDPLATTTTIQDPLRTLRKLHLTTRDPTIPLADAACDRVEGYLCTDALTPLISDYCKMVLRLYGPTA
STEQVRNQRRSRNKEKPYWLTCDGSWPQHPQDAHLMKQVLIKRTAIDEDQVDALIGRFAAMKDVWEKITHDSEESAAACT
FDEDGVAPNSVDESLPMLNDAKQTRANPGTSRPHSNGGGSSHGNELPRRTEQRAQGPRQPARLPKQGKTNGKSDGNITAG
ETQRGGIPRGKGPRGGKTNTRRTPPKAGAQPQPSNNRK
>Q993M1 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MRRFEFALARMSGAAFCVYTGYRLLTSKWLADRVEDYRQRIIAEKKQILRDAAMIRTQIQREMELVRISVRKGHSHQEAA
TERNSATETMLGVVEKCGYEPYVISPSPREVGYHGSRQFYSLADFRQDYRRDDITDRHIIVMTDVDYYVDMHELIGLGVP
ILLYTFQPSTVSGEVKDGYFTITDDSVHYRVAGGKDVRHRIWNYNHDTMYVCSRPRGFWANLMQILRDITGVTAICSFLY
TKLGIAPFGDPVTMFTVDQFKMGEHRNIVSIVPFATCRSNLLKISEYGAELEYMRYQQRNNIANFNAVTYISENGPLISL
GLEGNFASVQLPLQDFENIRTAYELSKTNNLSDTVRRSGRPCKEAAIIHKCLQAECAVVSEVVHKPGDLARHYQAVGSAY
DTDPAEQGKCYAREYAPGPLTQTAVFPSESRSNELATIDGRIAGPQAKAKSREHITPKMRKVARDFVHHLVPIAGTGRPY
PLTYVEEQQTKPLQRARNDANRYHDEFTMMVKAFQKKEAYNAPNYPRNISTVPHTQNVKLSSYTYAFKASVLQHVPWYMP
THTPAEIADAVQNLAASSTELVETDYSKFDGTFLRFMRECVEFAIYKRWVHLDHLPELTTLLANEIQAPAVTRLGIKYDP
DCSRLSGSALTTDGNSIANAFVSYLAGRMAGMDDDEAWSWIGIVYGDDGLRSGNVSNELLTNTASSLGFDLKIVNRAPRG
SPVTFLSRVYLDPWSSPASVQSPLRTLLKLHTTCDTQSEIDDIGWAKTQAYLVTDSKTPFIGHWCRAYQRNCTARVVQYA
DYADIPFWVKNDDHVGNSWPQSESDDWNDIVANELGVTTAELLKHLALLDAYAGPISGLPRLTTSIDLEPKMSVALDGEI
QAGPSQNKTSKDGTNPTSDRSAPRRARAALPGDDGHARRSRRSDRDPGKRDAHVRDKRPRRSSPPTRPVTPVPTPSSGDR
GTDGDGLGRAAVRQRQRRRTQV
>Q67724 2.1.1.-~~~~~~Methyltransferase/helicase/RNA-directed RNA polymerase~~~
MYAKATDVARVYAAADVAYANVLQQRAVKLDFAPPLKALETLHRLYYPLRFKGGTLPPTQHPILAGHQRVAEEVLHNFAR
GRSTVLEIGPSLHSALKLHGAPNAPVADYHGCTKYGTRDGSRHITALESRSVATGRPEFKADASLLANGIASRTFCVDGV
GSCAFKSRVGIANHSLYDVTLEELANAFENHGLHMVRAFMHMPEELLYMDNVVNAELGYRFHVIEEPMAVKDCAFQGGDL
RLHFPELDFINESQERRIERLAARGSYSRRAVIFSGDDDWGDAYLHDFHTWLAYLLVRNYPTPFGFSLHIEVQRRHGSSI
ELRITRAPPGDRMLAVVPRTSQGLCRIPNIFYYADASGTEHKTILTSQHKVNMLLNFMQTRPEKELVDMTVLMSFARARL
RAIVVASEVTESSWNISPADLVRTVVSLYVLHIIERRRAAVAVKTAKDDVFGETSFWESLKHVLGSCCGLRNLKGTDVVF
TKRVVDKYRVHSLGDIICDVRLSPEQVGFLPSRVPPARVFHDREELEVLREAGCYNERPVPSTPPVEEPQGFDADLWHAT
AASLPEYRATLQAGLNTDVKQLKITLENALKTIDGLTLSPVRGLEMYEGPPGSGKTGTLIAALEAAGGKALYVAPTRELR
EAMDRRIKPPSASRTQHVALAILRRATAEGAPFATVVIDECFMFPLVYVAIVHALSPSSRIVLVGDVHQIGFIDFQGTSA
NMPLVRDVVKQCRRRTFNQTKRCPADVVATTFFQSLYPGCTTTSGCVASISHVAPDYRNSQAQTLCFTQEEKSRHGAEGA
MTVHEAQGRTFASVILHYNGSTAEQKLLAEKSHLLVGITRHTNHLYIRDPTGDIERQLNHSAKAEVFTDIPAPLEITTVK
PSEEVQRNEVMATIPPQSPTPHGAIHLLRKNFGDQPDCGCVALAKTGYEVFGGRAKINVELAEPDATPKPHRAFQEGVQW
VKVTNASNKHQALQTLLSRYTKRSADLPLHEAKEDVKRMLNSLDRHWDWTVTEDARDRAVFETQLKFTQRGGTVEDLLEP
DDPYIRDIDFLMKTQQKVSPKPINTGKVGQGIAAHSKSLNFVLAAWIRILEEILRTGSRTVRYSNGLPDEEEAMLLEAKI
NQVPHATFVSADWTEFDTAHNNTSELLFAALLERIGTPAAAVNLFRERCGKRTLRAKGLGSVEVDGLLDSGAAWTPCRNT
IFSAAVMLTLFRGVKFAAFKGDDSLLCGSHYLRFDASRLHMGERYKTKHLKVEVQKIVPYIGLLVSAEQVVLDPVRSALK
IFGRCYTSELLYSKYVEAVRDITKGWSDARYHSLLCHMSACYYNYAPESAAYIIDAVVRFGRGDFPFEQLRVVRAHVQAP
DAYSSTYPANVRASCLDHVFEPRQAAAPAGFVATCAKPETPSSLTAKAGVSATTSHVATGTAPPESPWDAPAANSFSELL
TPETPSTSSSAVIVFIGLLYIVWKVAQWWRHRKRTEDLNSRKPPSQDRQSRSSECLDRSGERTGSSLTAPTAPSPSFSFS
ERARLATGPTVAAATSPSATPSCATDQVAARTTPDFAPFLGSQSARAVSKPYRPPTTARWKEVTPLHAWKGVTGDRPEVR
EDPETAAVVQALISGRYPQKTKLSSDASKGYSRTKGCSQSTSFPAPSADYQARDCQTVRVCRAAAEMARSCIHEPLASSA
ASADLKRIRSTSDSVPDVKISKSA
>Q50LE4 2.7.7.48~~~Segment-2~~~Potential RNA-dependent RNA polymerase~~~
MQVAPNVWSKYFNIPNPGLRAYFSNVVSGQPEVYRTPFYKGMSLESICDEWYKKLVSIDTQWPTLMEFEDDLRKKVGPMS
VMLPLKERMSDIDSYYDSISKDQVPFDTKAISAAKSEWKGVSRLRLRSEVNTVAVMKKSTNSGSPYFSKRKAVVSKTIPC
DVYMDGRYCVMRQNGREWSGAAVLGWRGQEGGPKPTDVKQRVVWMFPFAVNIRELQVYQPLILTFQRLGLVPAWVSMEAV
DRRITKMFDTKGPRDVVVCTDFSKFDQHFNPTCQSVAKELLADLLTGQEAVDWLERVFPIKYAIPLAYNWGEIRYGIHGM
GSGSGGTNADETLVHRVLQHEAAISHHTTLNPNSQCLGDDGVLTYPGISAEDVMQSYSRHGLDMNLEKQYVSKQDCTYLR
RWHHTDYRVDGMCVGVYSTMRALGRLAMQERYYDPDVWGEKMVTLRYLSIIENVKYHPLKEEFLDFCIKGDKTRLGLGIP
GFLDNIAGEAQKAIDMMPDFLGYTKSLQYDGDLRRNAAAGIENWWVVQALKSRR
>Q3HM40 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGRWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLA
MITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNDSTRKKIEKIRPLLIDGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLLGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFEIKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESV
NNAVMMPAHGPAKNMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFAEIMKICSTIEELRRQK
>P03430 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSERGRWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFETSCLETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMNKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKRKQRLNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTEISFTITGDNTKWNENQNPRMFLA
MITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKIRTQIPAEMLASIDLKYFNDSTRKKIEKIRPLLIDGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRHTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVNRFYRTCKLLGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFEIKKLWEQTHSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVNHKDIESV
NNAVIMPAHGPAKNMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFTEIMKICSTIEELRRQK
>P03431 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGRWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFENSCIETMEVVQQTRVDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMKKEEMGITTHFQRKRRVRDNMTKKMITQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLA
MITYMTRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNDSTRKKIEKIRPLLIEGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLLGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFEIKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESM
NNAVMMPAHGPAKNMEYDAVATTHSWIPKRNRSILNTSQRGVLEDEQMYQRCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFTEIMKICSTIEELRRQK
>Q910D6 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDRLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRVNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLA
MITYITKNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESV
NNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFAEIMKICSTIEELRRQK
>P03432 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDRLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRVNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLA
MITYITKNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFELEKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVCLKWELMDEDYQGRLCNPLNPFVSHKEIESV
NNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFAEIMKICSTIEELRRQK
>Q9Q0V0 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGKWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGIFENSCLETMEVVQQTRVDKLTQGRQTYDWTLKRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMDKGEMEIITHFQRKRRVRDNMTKKMVTQRTIGKKKQRLNKRSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVYFVETLARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTITGDNTKWNENQNPRMFLA
MITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLASIDLKYFNESTRKKIEKIRPLLIDGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIEAGVDRFYRTCKLVGINMTKK
KSYINRTGTCEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMMDNDLGPATAQMALQLFIKDYRYPYR
CHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNPYNIRNLHIPEAGLKWELMDEDYQGRLCNPLNPFVSHKEIESV
NNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCNLFEKFFPSSSYRRPVGISSMVEAMVS
RARIDARIDFESGRIKKEEFAEIMKICSTIEELGRQK
>Q9WLS3 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MDVNPTLLFLKVPAQNAISTTFPYTGDPPYSHGTGTGYTMDTVNRTHQYSEKGRWTTNTETGAPQLNPIDGPLPEDNEPS
GYAQTDCVLEAMAFLEESHPGLFENSCLETMEVVQQTRMDKLTQGRQTYDWTLNRNQPAATALANTIEVFRSNGLTANES
GRLIDFLKDVMESMDKEEMEITTHFQRKRRVRDNMTKKMVTQRTIGKKKQRLTKKSYLIRALTLNTMTKDAERGKLKRRA
IATPGMQIRGFVHFVEALARSICEKLEQSGLPVGGNEKKAKLANVVRKMMTNSQDTELSFTVTGDNTKWNENQNPRIFLA
MITYITRNQPEWFRNVLSIAPIMFSNKMARLGKGYMFESKSMKLRTQIPAEMLANIDLKYFNESTRKKIEKIRPLLVEGT
ASLSPGMMMGMFNMLSTVLGVSILNLGQKRYTKTTYWWDGLQSSDDFALIVNAPNHEGIQAGVDRFYRTCKLVGINMSKK
KSYINRTGTFEFTSFFYRYGFVANFSMELPSFGVSGINESADMSIGVTVIKNNMINNDLGPATAQMALQLFIKDYRYTYR
CHRGDTQIQTRRSFELKKLWEQTRSKAGLLVSDGGPNLYNIRNLHIPEVGLKWELMDEDYQGRLCNPLNPFVSHKEVESV
NNAVVMPAHGPAKSMEYDAVATTHSWIPKRNRSILNTSQRGILEDEQMYQKCCTLFEKFFPSSSYRRPVGISSMMEAMVS
RARIDARIDFESGRIKKEEFAEILKICSTIEELGRQGK
>Q9Q6Q5 2.7.7.48~~~VP1~~~RNA-directed RNA polymerase~~~
MSDIFNSPQARSTISAAFGIKPTAGQDVEELLIPKVWVPPEDPLASPSRLAKFLRENGYKVLQPRSLPENEEYETDQILP
DLAWMRQIEGAVLKPTLSLPIGDQEYFPKYYPTHRPSKEKPNAYPPDIALLKQMIYLFLQVPEANEGLKDEVTLLTQNIR
DKAYGSGTYMGQATRLVAMKEVATGRNPNKDPLKLGYTFESIAQLLDITLPVGPPGEDDKPWVPLTRVPSRMLVLTGDVD
GDFEVEDYLPKINLKSSSGLPYVGRTKGETIGEMIAISNQFLRELSTLLKQGAGTKGSNKKKLLSMLSDYWYLSCGLLFP
KAERYDKSTWLTKTRNIWSAPSPTHLMISMITWPVMSNSPNNVLNIEGCPSLYKFNPFRGGLNRIVEWILAPEEPKALVY
ADNIYIVHSNTWYSIDLEKGEANCTRQHMQAAMYYILTRGWSDNGDPMFNQTWATFAMNIAPALVVDSSCLIMNLQIKTY
GQGSGNAATFINNHLLSTLVLDQWNLMRQPRPDSEEFKSIEDKLGINFKIERSIDDIRGKLRQLVLLAQPGYLSGGVEPE
QSSPTVELDLLGWSATYSKDLGIYVPVLDKERLFCSAAYPKGVENKSLKSKVGIEQAYKVVRYEALRLVGGWNYPLLNKA
CKNNAGAARRHLEAKGFPLDEFLAEWSELSEFGEAFEGFNIKLTVTSESLAELNKPVPPKPPNVNRPVNTGGLKAVSNAL
KTGRYRNEAGLSGLVLLATARSRLQDAVKAKAEAEKLHKSKPDDPDADWFERSETLSDLLEKADIASKVAHSALVETSDA
LEAVQSTSVYTPKYPEVKNPQTASNPVVGLHLPAKRATGVQAALLGAGTSRPMGMEAPTRSKNAVKMAKRRQRQKESRQQ
P
>A7L9Z4 2.7.7.48~~~VP1~~~RNA-directed RNA polymerase~~~
MSDVFNSPQARSTISAAFGIKPTAGQDVEELLIPKVWVPPEDPLASPSRLAKFLRENGYKVLQPRSLPENEEYETDQILP
DLAWMRQIEGAVLKPTLSLPIGDQEYFPKYYPTHRPSKEKPNAYPPDIALLKQMIYLFLQVPEANEGLKDEVTLLTQNIR
DKAYGSGTYMGQANRLVAMKEVATGRNPNKDPLKLGYTFESIAQLLDITLPVGPPGEDDKPWVPLTRVPSRMLVLTGDVD
GDFEVEDYLPKINLKSSSGLPYVGRTKGETIGEMIAISNQFLRELSTLLKQGAGTKGSNKKKLLSMLSDYWYLSCGLLFP
KAERYDKSTWLTKTRNIWSAPSPTHLMISMITWPVMSNSPNNVLNIEGCPSLYKFNPFRGGLNRIVEWILAPEEPKALVY
ADNIYIVHSNTWYSIDLEKGEANCTRQHMQAAMYYILTRGWSDNGDPMFNQTWATFAMNIAPALVVDSSCLIMNLQIKTY
GQGSGNAATFINNHLLSTLVLDQWNLMRQPRPDSEEFKSIEDKLGINFKIERSIDDIRGKLRQLVLLAQPGYLSGGVEPE
QSSPTVELDLLGWSATYSKDLGIYVPVLDKERLFCSAAYPKGVENKSLKSKVGIEQAYKVVRYEALRLVGGWNYPLLNKA
CKNNAGAARRHLEAKGFPLDEFLAEWSELSEFGEAFEGFNIKLTVTSESLAELNKPVPPKPPNVNRPVNTGGLKAVSNAL
KTGRYRNEAGLSGLVLLATARSRLQDAVKAKAEAEKLHKSKPDDPDADWFERSETLSDLLEKADIASKVAHSALVETSDA
LEAVQSTSVYTPKYPEVKNPQTASNPVVGLHLPAKRATGVQAALLGAGTSRPMGMEAPTRSKNAVKMAKRRQRQKESRQ
>P07832 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MNINPYFLFIDVPIQAAISTTFPYTGVPPYSHGTGTGYTIDTVIRTHEYSNKGKQYISDVTGCVMVDPTNGPLPEDNEPS
AYAQLDCVLEALDRMDEEHPGLFQAGSQNAMEALMVTTVDKLTQGRQTFDWTVCRNQPAATALNTTITSFRLNDLNGADK
GGLVPFCQDIIDSLDKPEMIFFTVKNIKKKLPAKNRKGFLIKRIPMKVKDRITRVEYIKRALSLNTMTKDAERGKLKRRA
IATAGIQIRGFVLVVENLAKNICENLEQSGLPVGGNEKKAKLSNAVAKMLSNCPPGGISMTVTGDNTKWNECLNPRIFLA
MTERITRDSPIWFRDFCSIAPVLFSNKIARLGKGFMITSKTKRLKAQIPCPDLFNIPLERYNEETRAKLKKLKPFFNEEG
TASLSPGMMMGMFNMLSTVLGVAALGIKNIGNKEYLWDGLQSSDDFALFVNAKDEETCMEGINDFYRTCKLLGINMSKKK
SYCNETGMFEFTSMFYRDGFVSNFAMELPSFGVAGVNESADMAIGMTIIKNNMINNGMGPATAQTAIQLFIADYRYTYKC
HRGDSKVEGKRMKIIKELWENTKGRDGLLVADGGPNLYNLRNLHIPEIILKYNIMDPEYKGRLLHPQNPFVGHLSIEGIK
EADITPAHGPIKKMDYDAVSGTHSWRTKRNRSILNTDQRNMILEEQCYAKCCNLFEACFNSASYRKPVGQHSMLEAMAHR
LRMDARLDYESGRMSKEDFEKAMAHLGEIGYM
>O36430 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MNINPYFLFIDVPIQAAISTTFPYTGVPPYSHGTGTGHTIDTVIRTHEYSNKGKQYVSDVTGCTMVDPTNGPLPEDNEPS
AYAQLDCVLEALDRMDEEHPGLFQAASQNAMEALMVTTVDKLTQGRQTFDWTVCRNQPAATALNTTITSFRLNDLNGADK
GGLVPFCQDIIDSLDKPEMTFFSVKNIKKKLPAKNRKGFLIKRIPMKVKDRITRVEYIKRALSLNTMTKDAERGKLKRRA
IATAGIQIRGFVLVVENLAKNICENLEQSGLPVGGNEKKAKLSNAVAKMLSNCPPGGISMTVTGDNTKWNECLNPRIFLA
MTERITRDSPIWFRDFCSIAPVLFSNKIARLGKGFMITSKTKRLKAQIPCPDLFSIPLERYNEETRAKLKKLKPFFNEEG
TASLSPGMMMGMFNMLSTVLGVAALGIKNIGNKEYLWDGLQSSDDFALFVNAKDEETCMEGINDFYRTCKLLGINMSKKK
SYCNETGMFEFTSMFYRDGFVSNFAMEIPSFGVAGVNESADMAIGMTIIKNNMINNGMGPATAQTAIQLFIADYRYTYKC
HRGDSKVEGKRMKIIKELWENTKGRDGLLVADGGPNIYNLRNLHIPEIVLKYNLMDPEYKGRLLHPQNPFVGHLSIEGIK
EADITPAHGPVKKMDYDAVSGTHSWRTKRNRSILNTDQRNMILEEQCYAKCCNLFEACFNSASYRKPVGQHSMLEAMAHR
LRMDARLDYESGRMSKDDFEKAMAHLGEIGYI
>Q9IMP4 2.7.7.48~~~PB1~~~RNA-directed RNA polymerase catalytic subunit~~~
MEINPYLMFLNNDVTSLISTTYPYTGPPPMSHGSSTKYTLETIKRTYDYSRTSVEKTSKVFNIPRRKFCNCLEDKDELVK
PTGNVDISSLLGLAEMMEKRMGEGFFKHCVMEAETEILKMHFSRLTEGRQTYDWTSERNMPAATALQLTVDAIKETEGPF
KGTTMLEYCNKMIEMLDWKEIKFKKVKTVVRREKDKRSGKEIKTKVPVMGIDSIKHDEFLIRALTINTMAKDGERGKLQR
RAIATPGMIVRPFSKIVETVAQKICEKLKESGLPVGGNEKKAKLKTTVTSLNARMNSDQFAVNITGDNSKWNECQQPEAY
LALLAYITKDSSDLMKDLCSVAPVLFCNKFVKLGQGIRLSNKRKTKEVIIKAEKMGKYKNLMREEYKNLFEPLEKYIQKD
VCFLPGGMLMGMFNMLSTVLGVSTLCYMDEELKAKGCFWTGLQSSDDFVLFAVASNWSNIHWTIRRFNAVCKLIGINMSL
EKSYGSLPELFEFTSMFFDGEFVSNLAMELPAFTTAGVNEGVDFTAAMSIIKTNMINNSLSPSTALMALRICLQEFRATY
RVHPWDSRVKGGRMKIINEFIKTIENKDGLLIADGGKLMNNISTLHIPEEVLKFEKMDEQYRNRVFNPKNPFTNFDKTID
IFRAHGPIRVEENEAVVSTHSFRTRANRTLLNTDMRAMMAEEKRYQMVCDMFKSVFESADINPPIGAMSIGEAIEEKLLE
RAKMKRDIGAIEDSEYEEIKDIIRDAKKARLESR
>P22173 2.7.7.48~~~VP1~~~RNA-directed RNA polymerase~~~
MSDIFNSPQNKASILTALMKSTTGDVEDVLIPKRFRPAKDPLDSPQAAAQFLKDNKYRILRPRAIPTMVELETDAALPRL
RQMVEDGKLKDTVSVPEGTTAFYPKYYPFHKPDHDEVGTFGAPDITLLKQLTFFLLENDFPTGPETLRQVREAIATLQYG
SGSYSGQLNRLLAMKGVATGRNPNKTPKTVGYTNEQLAKLLEQTLPINTPKHEDPDLRWAPSWLINYTGDLSTDKSYLPH
VTIKSSAGLPYIGKTKGDTTAEALVLADSFIRDLGRAATSADPEAGVKKTITDFWYLSCGLLFPKGERYTQVDWDKKTRN
IWSAPYPTHLLLSMVSTPVMNESKLNITNTQTPSLYGFSPFHGGMDRIMTIIRDSLDNDEDLVMIYADNIYILQDNTWYS
IDLEKGEANCTPQHMQAMMYYLLTRGWTNEDGSPRYNPTWATFAMNVAPSMVVDSSCLLMNLQLKTYGQGSGNAFTFLNN
HLMSTIVVAEWVKAGKPNPMTKEFMDLEEKTGINFKIERELKNLRETIVEAVETAPQDGYLADGSDLPPIRPGKAVELDL
LGWSAIYSRQMEMFVPVLENERLIASAAYPKGLENKALARKPGAEIAYQIVRYEAIRLVGGWNNPLLETAAKHMSLDKRK
RLEVKGIDVTGFLDDWNNMSEFGGDLEGITLSEPLTNQTLVDINTPLDSFDPKARPQTPRSPKKTLDEVTTAITSGTYKD
PKSAVWRLLDQRTKLRVSTLRDQALALKPASSSVDNWAEATEELAQQQQLLMKANNLLKSSLTETREALETIQSDKIIAG
KSNPEKNPGTAANPVVGYGEFSEKIPLTPTQKKNAKRREKQRRNQ
>Q9IMM4 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MLNYETIINGASSALNIVSRALGYRVPLAKSLALVAGSCVVYKIIVHRRTLVAFLVIGPYATVVQHRLPMALQRAIIEYT
REDREISLFPQNSIVSAEHARKADNGHPISGGTRDVARETISLAIRAAGFRHYEISPARQSPAEAASHQHYAAADLVRAA
TEDKIQDGDVVVAIDIDYYLRDIDRYLGRGVPFMAYTFNPVEVAGRDGDSFFRITNNQVTFDVSGGGSWSHEVWDWCAFG
EFIETRDASWLAWFARAVGLTKSQIHKVHYCRPWPQSPHRALVWCLPVASYWRFTFIPTDLHTRTLRRVRYQDTSRPGWN
SIVSTGSEGLNISLGREGADHCVTIPKVHYDMLMGLSSAQSLSSRMIGLKYTDPSVLATVAQYYQGKNVEVADADRIGRA
INPKVHWPAHVEVDEAEVSARVYASPLVSDENMMPMIKRWETLSLSLDRRVTFQRNPKVPGKRLRAYAIEFVDLVVPERG
VGVPYSLEDTAAMLDKPSQTLAIQQVWETVDMPPRRLIEAFVKNEPTMKAGRIISSFADMRFLLRFSSYTLAFRDQVLHA
EHNRHWFCPGLTPEQIATKVVDYVSGVEEPSEGDFSNFDGTVSEWLQRHVMNAVYLRYFNHRAQRDLRSYTDMLVSCPAR
AKRFGFAYDAGVGVKSGSPTTCDLNTVCNGFLQYCSIRMTHPELTPIDAFRLIGLAFGDDSLFERRFAKNYAKVSAEVGM
VLKIERFDPAQGITFLARVYPDPYTSTTSFQDPLRTWRKLHLTTRDPTIPLATAAIDRVEGYLVTDGLSPLTGAYCRMVK
RVYEAGGAEDAAKRRSRKSHSREKPYWLTVGGAWPQDVKDVDLMFQCAAARTGVDLETLRSLDQRLGEITDVWADITINR
DNEPNPYKDTLDLEGPADGRVDDRVFQNDKHVMRLRANQVTSSQAGAAGSGDASNDPNAHDRGSQRQQGSASVLRVPDRA
APAGVSSDEQPAHQTASRSSASRGGAGPGRGGRRRPGPPAKTTAGGARDGNQARAPTSGPSKRQAEGRSRSSRGPAGSRG
RGK
>B3VML1 2.7.7.48~~~~~~RNA-dependent RNA polymerase~~~
MEPRQELLDPERVREEAKRIVRWLCGLVDRTGKLPAGDYQGKILNTVSEICKRSKLPCGELAEALTKGRLTRSLVDYSEL
LISNLVVGYFDVLEIYLGKQSVRLSDIREMACKYTFYAINRRLEDYIKFQTAWLQARAMRDVTPEPEAPSWLNEGFGRCF
SALLNRKVHLRCVLRARPNDVSLAASLYQVKRVAPPLPDDQIEKNLEKSLDRLTKDEEPAGVDEPFLEDLKRECKRTVDE
LVQNARREGWNRKISRDCFPSQSAAFENPISKGGQLGQLVKENNTPRLPVLLGMFEYKGRVTPVYGWADDGDTILSDEEL
GREVPAALKCRRSPVLEPFKVRVITMGPAVQYYRARRVQGCLWDLLKHTRCTHLPNRPVEESDIGFYVRRRGADLFRGEE
VPYVSGDYSAATDNLHPDLSLSVVDRVCDHLLSDDNRPLDPVSPWRVLFHRVLVGHRIYDGNSSRNTEVAAQSWGQLMGS
PLSFPVLCIVNLAVTRYVLEKACGRIVTLEESGILVNGDDILFRCPERTIPFWTRMVTIAGLSPSPGKNFVSYRYCQLNS
ELYDMSGSRAEYLPFIKANLIYGTLARGCERKRAADLCYGDTTTEGGTFGHRARALIKGFGPDMQDRLMSRFLHSIKGFL
EKIPEVSWFIHPRYGGLGLPLTRPVTHNPYHLRIAAYLSCGGEQSQEARCMMQWLSAPTKSFNAATLLRILEVARNCKVP
FRKVPFALLHRAEAAGVDLEALFRKALLRSAPRLGVEYPSNESGDMQRLGDWRRFFRDVGRKAARTRCRSGTADERKGLF
LMSPDNAVKGPQYEYIFDWASHNMGGNIWDPSYKFRADPFSSSESDEPRAKEIGPNRGPE
>P89657 2.1.1.-~~~~~~Replicase large subunit~~~
MAYTQQATNAALASTLRGNNPLVNDLANRRLYESAVEQCNAHDRRPKVNFLRSISEEQTLIATKAYPEFQITFYNTQNAV
HSLAGGLRSLELEYLMMQIPYGSTTYDIGGNFAAHMFKGRDYVHCCMPNMDLRDVMRHNAQKDSIELYLSKLAQKKKVIP
PYQKPCFDKYTDDPQSVVCSKPFQHCEGVSHCTDKVYAVALHSLYDIPADEFGAALLRRNVHVCYAAFHFSENLLLEDSY
VSLDDIGAFFSREGDMLNFSFVAESTLNYTHSYSNVLKYVCKTYFPASSREVYMKEFLVTRVNTWFCKFSRLDTFVLYRG
VYHRGVDKEQFYSAMEDAWHYKKTLAMMNSERILLEDSSSVNYWFPKMKDMVIVPLFDVSLQNEGKRLARKEVMVSKDFV
YTVLNHIRTYQSKALTYANVLSFVESIRSRVIINGVTARSEWDVDKALLQSLSMTFFLQTKLAMLKDDLVVQKFQVHSKS
LTEYVWDEITAAFHNCFPTIKERLINKKLITVSEKALEIKVPDLYVTFHDRLVKEYKSSVEMPVLDVKKSLEEAEVMYNA
LSEISILKDSDKFDVDVFSRMCNTLGVDPLVAAKVMVAVVSNESGLTLTFERPTEANVALALQPTITSKEEGSLKIVSSD
VGESSIKEVVRKSEISMLGLTGNTVSDEFQRSTEIESLQQFHMVSTETIIRKQMHAMVYTGPLKVQQCKNYLDSLVASLS
AAVSNLKKIIKDTAAIDLETKEKFGVYDVCLKKWLVKPLSKGHAWGVVMDSDYKCFVALLTYDGENIVCGETWRRVAVSS
ESLVYSDMGKIRAIRSVLKDGEPHISSAKVTLVDGVPGCGKTKEILSRVNFDEDLVLVPGKQAAEMIRRRANSSGLIVAT
KENVRTVDSFLMNYGRGPCQYKRLFLDEGLMLHPGCVNFLVGMSLCSEAFVYGDTQQIPYINRVATFPYPKHLSQLEVDA
VETRRTTLRCPADITFFLNQKYEGQVMCTSSVTRSVSHEVIQGAAVMNPVSKPLKGKVITFTQSDKSLLLSRGYEDVHTV
HEVQGETFEDVSLVRLTPTPVGIISKQSPHLLVSLSRHTRSIKYYTVVLDAVVSVLRDLECVSSYLLDMYKVDVSTQYQL
QIESVYKGVNLFVAAPKTGDVSDMQYYYDKCLPGNSTILNEYDAVTMQIRENSLNVKDCVLDMSKSVPLPRESETTLKPV
IRTAAEKPRKPGLLENLVAMIKRNFNSPELVGVVDIEDTASLVVDKFFDAYLIKEKKKPKNIPLLSRASLERWIEKQEKS
TIGQLADFDFIDLPAVDQYRHMIKQQPKQRLDLSIQTEYPALQTIVYHSKKINALFGPVFSELTRQLLETIDSSRFMFYT
RKTPTQIEEFFSDLDSNVPMDILELDISKYDKSQNEFHCAVEYEIWKRLGLDDFLAEVWKHGHRKTTLKDYTAGIKTCLW
YQRKSGDVTTFIGNTIIIAACLSSMLPMERLIKGAFCGDDSILYFPKGTDFPDIQQGANLLWNFEAKLFRKRYGYFCGRY
IIHHDRGCIVYYDPLKLISKLGAKHIKNREHLEEFRTSLCDVAGSLNNCAYYTHLDDAVGEVIKTAPPGSFVYRALVKYL
CDKRLFQTLFLE
>P22956 2.7.7.48~~~ORF1~~~RNA-directed RNA polymerase~~~
MGFINLSLFDVDKLMVWVSKFNPGKILSAICNLGIDCWNRFRKWFFGLNFDAHMWAVDAFIPLMPHYTEQMERVVDDFCS
ETPESKLEDCLELDTSVNEFFDEEVYKKDEEGVMKLQRSAARKHIKRVRPGMMQAAIKAVETRIRNRHTIFGDDMGKVDE
AAVRATASDICGEFKINEHHTNALVYAAAYLAMTPDQRSIDSVKLAYNPKSQARRTLVSAIRENKAVAGFKSLEDFLGGP
LSFPVEDAPYPILGIPEIRVAEKRASRVMKSKRVVGLPAVSAGLKVCVHQTSLHNMIVSLERRVFRVKNSAGELVVPPKP
IQNAFDSISYFREEWLRKLSHKGQILKSSLADVVACYSSEKRKLYQKAADSLEKKPVQWRDSKVQAFIKVEKLECDTKDP
VPRTIQPRSKRYNLAIGQYLRLNEKKMLDSIDDVFKEKTVLSGLDNRAQGRAIAHKWRKYQNPIGIGLDASRFDQHCSVD
ALKFEQTFYKACFPGDQQLETLLKWQLSNTGSALLPTGELVRYRTKGCRMSGDINTGLGNKILMCSMVHAFLKETGVRAS
LANNGDDCVLFCEKGDYEQINRNLEQWFLCRGFEMTVEKPVDVLEKVAFCRSQPVCIATQWAMVRQLGSLSRDCFSTQDW
LNPKTFKDAMNALGQCNGIINDGVPIHMAQAKLMHRIGGNRKFNLDALHKQMEYSWRDRLGKRTNLLWSEVEDATRLSYF
RAFGIEPYIQRIVEEYFSQVEITCEGRSTNVLPTHYSRIHKDLIKAR
>Q02119 2.7.7.48~~~~~~RNA-directed RNA polymerase P1~~~
MDLARSDIIPHLLCLFQEIIQANIQKVSEAYDLELKIMNILTLWRNGSDNLISDNDDYMKKGLFFSRSNDPLALQARYAQ
MYDDLFKLNNYKVPDDVVRKHDTKILDIILKESSVPFWYDISDDEAHESMLPEFRLQDIHEFRLNLKRVAVVPDESEEIQ
MDESQSDKRRRKKRMEKSRPVWLSGSENDRRIELNDSLKPSQKFETKLSSYLLNRLMNEMNPHYCGHPLPALFVTLIMLK
AYSLKNKFFSYGIRYMELVCNEIGGPDLNTRTFPVLFGSDGSFVGTRVYSHYPIKLRMILNDLTYLLTYSDLHKFQEFEL
DVNDEVLLHMLHSPNDGRQLKKAVTRLNLYYGLKFNPKTTDCGVVNGMDFTHKHPITKTADFTSPVLPMTNSFNKAEICY
GHSSKILNRAVFTDTVRGYIREDLKNVADLDLPKLHEHVSKLVDMRVNYTIIYDLMFLRVMLNLGGYSRSNQITDFRKTI
DEITKMNEDFLSGADPEKNIDILNAWMAPTMEDCGYRLTKSILFGKFRKAKYPSDLEAKSNIDYYVTARSAGIGNLRISI
ETDKRKYKVRTTSKSAFVNAMGSGILDVNPVSNEPMMLTDYLLTQTPETRANLEAAIDSGSISDSELMRILGQNSIGSRS
TTAWRPVRPIYINVLQAHLAQAFIIGPHINATVNQHESQPTSLWFTGDDLGVGFATLYQNGTADIIAPAIEASSTGKALS
VLADCSSWDQTFLTATIIPYYNGIKKALLEYQQADMRNFYMIDSSRTGVPGMKLSEIVDWFNSFQTKRIFNASYLKERHS
FVVKYMWSGRLDTFFMNSVQNALITRRIAEEVSLKVSNTGLSWFQVAGDDAIMVYDGSSISTTEQVTRVNEITVRNYEES
NHIINPQKTVISHISGEYAKIYYYAGMHFRDPSIQLHESEKDSGASDVTESLRGFGQVIYEYNKRAIGTLRVNALYGRLI
AGLAYSVNVRRYDASKRTYANMKYYPPPTSVIAPAAFKGGLGLSFTGLSLNEVLFIKMHLHEAVSQGLHVISMISFEANE
VVSNSLSAYYLKDQKDLLRDMKLGKHLEKVKGISFKSSDLAFSGSDFSQGLNLKRESIDKVKLEVSRKSIRDLRSSGISV
PSTHAYENLPYASLHQSFKSLKVDRDTSKFTNERLLVSLLEYKSDIPRVSVTSQYPVYDLINISKVDELNVRSGGPVRFI
STPIEGKLLEENIGTRQGVQFKNRGYGGSQEVLHFIRSNGLVITEQALIDLIIKSGVLLMINPQRGLIDLFQSLSGDTAS
SMHLANFFMAEKPHWEDNAISLTIAGSLLENCDSRIENVKNFVSVLATGMQKDLQRMFYYVGFVYYAQRLIWSGGHSSKI
FVSIDEDKLADFLRGSKPITRRRKAMAGTKREPINLSANFSYEISEPDREISEYDPLVLCHPLSMPFFGNWQEKYSVMQS
DEQM
>P0CK31 2.7.7.48~~~L1~~~RNA-directed RNA polymerase lambda-3~~~
MSSMILTQFGPFIESISGITDQSNDVFEDAAKAFSMFTRSDVYKALDEIPFSDDAMLPIPPTIYTKPSHDSYYYIDALNR
VRRKTYQGPDDVYVPNCSIVELLEPHETLTSYGRLSEAIENRAKDGDSQARIATTYGRIAESQARQIKAPLEKFVLALLV
AEAGGSLYDPVLQKYDEIPDLSHNCPLWCFREICRHISGPLPDRAPYLYLSAGVFWLMSPRMTSAIPPLLSDLVNLAILQ
QTAGLDPSLVKLGVQICLHAAASSSYSWFILKTKSIFPQNTLHSMYESLEGGYCPNLEWLEPRSDYKFMYMGVMPLSAKY
ARSAPSNDKKARELGEKYGLSSVVGELRKRTKTYVKHDFASVRYIRDAMACTSGIFLVRTPTETVLQEYTQSPEIKVPIP
QKDWTGPIGEIRILKDTTSSIARYLYRTWYLAAARMAAQPRTWDPLFQAIMRSQYVTARGGSGAALRESLYAINVSLPDF
KGLPVKAATKIFQAAQLANLPFSHTSVAILADTSMGLRNQVQRRPRSIMPLNVPQQQVSAPHTLTADYINYHMNLSPTSG
SAVIEKVIPLGVYASSPPNQSINIDISACDASITWDFFLSVIMAAIHEGVASSSIGKPFMGVPASIVNDESVVGVRAARP
ISGMQNMIQHLSKLYKRGFSYRVNDSFSPGNDFTHMTTTFPSGSTATSTEHTANNSTMMETFLTVWGPEHTDDPDVLRLM
KSLTIQRNYVCQGDDGLMIIDGTTAGKVNSETIQNDLELISKYGEEFGWKYDIAYDGTAEYLKLYFIFGCRIPNLSRHPI
VGKERANSSAEEPWPAILDQIMGVFFNGVHDGLQWQRWIRYSWALCCAFSRQRTMIGESVGYLQYPMWSFVYWGLPLVKA
FGSDPWIFSWYMPTGDLGMYSWISLIRPLMTRWMVANGYVTDRCSTVFGNADYRRCFNELKLYQGYYMAQLPRNPKKSGR
AASREVREQFTQALSDYLMQNPELKSRVLRGRSEWEKYGAGIIHNPPSLFDVPHKWYQGAQEAAIATREELAEMDETLMR
ARRHSYSSFSKLLEAYLLVKWRMCEAREPSVDLRLPLCAGIDPLNSDPFLKMVSVGPMLQSTRKYFAQTLFMAKTVSGLD
VNAIDSALLRLRTLGADKKALTAQLLMVGLQESEADALAGKIMLQDVNTVQLARVVNLAVPDTWMSLDFDSMFKHHVKLL
PKDGRHLNTDIPPRMGWLRAILRFLGAGMVMTATGVAVDIYLEDIHGGGRSLGQRFMTWMRQEGRSA
>P0CK32 2.7.7.48~~~L1~~~RNA-directed RNA polymerase lambda-3~~~
MSSMILTQFRPFIESISGITDQSNDVFEDAAKAFSMFTRSDVYKALDEIPFSDDAMLPIPPTIYTKPSHDSYYYIDALNR
VRRKTYQGPDDVYVPNCSIVELLEPHETLTSYGRLSEAIENRAKDGDSQARIATTYGRIAESQARQIKAPLEKFVLALLV
SEAGGSLYDPVLQKYDEIPDLSHNCPLWCFREICRHISGPLPDRAPYLYLSAGVFWLMSPRMTSAIPPLLSDLVNLAILQ
QTAGLDPSLVKLGVQICLHAAASSSYAWFILKTKSIFPQNTLHSMYESLEGGYCPNLEWLEPRSDYKFMYMGVMPLSTKY
ARSAPSNDKKARELGEKYGLSSVVSELRKRTKTYVKHDFASVRYIRDAMACTSGIFLVRTPTETVLQEYTQSPEIKVPIP
QKDWTGPVGEIRILKDTTSSIARYLYRTWYLAAARMAAQPRTWDPLFQAIMRSQYVTARGGSGAALRESLYAINVSLPDF
KGLPVKAATKIFQAAQLANLPFSHTSVAILADTSMGLRNQVQRRPRSIMPLNVPQQQVSAPHTLTADYINYHMNLSTTSG
SAVIEKVIPLGVYASSPPNQSINIDISACDASITWDFFLSVIMAAIHEGVASGSIGKPFMGVPASIVNDESVVGVRAARP
ISGMQNMIQHLSKLYKRGFSYRVNDSFSPGNDFTHMTTTFPSGSTATSTEHTANNSTMMETFLTVWGPEHTDDPDVLRLM
KSLTIQRNYVCQGDDGLMIIDGNTAGKVNSETIQKMLELISKYGEEFGWKYDIAYDGTAEYLKLYFIFGCRIPNLSRHPI
VGKERANSSAEEPWPAILDQIMGIFFNGVHDGLQWQRWIRYSWALCCAFSRQRTMIGESVGYLQYPMWSFVYWGLPLVKV
FGSDPWIFSWYMPTGDLGMYSWISLIRPLMTRWMVANGYATDRCSPVFGNADYRRCFNEIKLYQGYYMAQLPRNPTKSGR
AAPREVREQFTQALSDYLMQNPELKSRVLRGRSEWEKYGAGIIHNPPSLFDVPHKWYLGAQEAATATREELAEMDETLMR
ARRHSYSSFSKLLEAYLLVKWRMCEAREPSVDLRLPLCAGIDPLNSDPFLKMVSVGPMLQSTRKYFAQTLFMAKTVSGLD
VNAIDSALLRLRTLGADKKALTAQLLMVGLQESEADALAGKIMLQDVSTVQLARVVNLAVPDTWMSLDFDSMFKHHVKLL
PKDGRHLNTDIPPRMGWLRAILRFLGAGMVMTATGVAVDIYLEDIHGGGRALGQRFMTWMRQEGRSA
>A9Q1K7 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MEPENYLAWLARDIVRNLSYTSLVYNNPKVAIVELLDNKEAFFTYEKEQKTPEALINYIDSIVKSSISVEDKIEALLKIR
YISVYVDDKSDKRDIVLQLLNRTIKKIESKTKISNELNDAINAITIESRNWKIQNSESFKPYHYNQLVSDFLKYNEFEIL
EGTDPLKWKSDTLQGLSPNYNHRTHTLISSIIYATSVRFDNYNDEQLQVLLYLFSIIKTNYVNGYLEILPNRKWSHSLAD
LRENKSIMMYSAKIIHASCAMISILHAVPIDYFFLAQIIASFSEIPAHAAKQLSSPMTLYIGIAQLRSNIVVSTKIAAES
VATESPNISRLEESQLREWEQEMNEYPFQSSRMVRMMKKNIFDVSVDVFYAIFNCFSATFHVGHRIDNPQDAIEAQVKVE
YTSDVDKEMYDQYYFLLKRMLTDQLAEYAEEMYFKYNSDVTAESLAAMANSSNGYSRSVTFIDREIKTTKKMLHLDDDLS
KNLNFTNIGEQIKKGIPMGTRNVPARQTRGIFILSWQVAAIQHTIAEFLYKKAKKGGFGATFAEAYVSKAATLTYGILAE
ATSKADQLILYTDVSQWDASQHNTEPYRSAWINAIKEARTKYKINYNQEPVVLGMNVLDKMIEIQEALLNSNLIVESQGS
KRQPLRIKYHGVASGEKTTKIGNSFANVALITTVFNNLTNTMPSIRVNHMRVDGDDNVVTMYTANRIDEVQENIKEKYKR
MNAKVKALASYTGLEMAKRFIICGKIFERGAISIFTAERPYGTDLSVQSTTGSLIYSAAVNAYRGFGDDYLNFMTDVLVP
PSASVKITGRLRSLLSPVTLYSTGPLSFEITPYGLGGRMRLFSLSKENMELYKILTSSLAISIQPDEIKKYSSTPQFKAR
VDRMISSVQIAMKSEAKIITSILRDKEEQKTLGVPNVATAKNRQQIDKARKTLSLPKEILPKVTKYYPEEIFHLILRNST
LTIPKLNTMTKVYMNNSVNITKLQQQIGVRVSSGIQVHKPINTLLKLVEKHSPIKISPSDLILYSKKYDLTNLNGKKQFL
MDLGISGNELRFYLNSKLLFHDLLLSKYDKLYEAPGFGATQLNALPLDLTAAEKVFSIKLNLPNTYYELLMLVLLYEYVN
FVMFTGNTFRAVCIPESQTINAKLVKTIMTMIDNIQLDTVMFSDNIF
>P21615 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFIYNSQSAVQIPIYYSSNSELEDRCIEFHSKCLENSKNGRSLKKLFTEYSDVIENATLLSILSYSYD
KYNAVERKLVKYAKGKPLEAGLTVNELDYENNKITSELFPTAEEYTDSLMDPAILTSLSSNLNAVMFWLEKHENDVAEKL
KIYKRRLDLFTIVASTVNKYGVPRHNAKYRYEYEVMKDKPYYLVTWANSSIEMLMSVFSHEDYLIARELIVLSYSNRSTL
AKLVSSPMSILVALVDINGTFITNEELELEFSNKYVRAIVPDQTFDELKQMLDNMRKAGLTDIPKMIQEWLVDCSIEKFP
LMAKIYSWSFHVGFRKQKMLDAALDQLKTEYTEDVDDEMYREYTMLIRDEVVKMLEEPVKHDDHLLQDSELAGLLSMSSA
SNGESRQLKFGRKTIFSTKKNMHVMDDMANGRYTPGIIPPVNVDKPIPLGRRDVPGRRTRIIFILPYEYFIAQHAVVEKS
VIYAKHTREYAEFYSQSNQLLSYGDVTRFLSNNAMVLYTDVSQWDSSQHNTQPFRKGIIMGLDMLANMTNDARVIQTLNL
YKQTQINLMDSYVQIPDGNVIKKIQYGAVASGEKQTKAANSIANLALIKTVLSRISNKYSFATKIIRVDGDDNYAVLQFN
TEVTKQMVQDVSNDVRETYARMNAKVKALVSTVGIEIAKRYIAGGKIFFRAGINLLNNEKRGQSTQWDQAAVLYSNYIVN
RLRGFETDREFILTKIMQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTDDEVYIQRAFMSLSSQKSGIAD
EIAASSTFKNYVSKLSEQLLFSKNNIVSRGIALTEKAKLNSYAPISLEKRRAQISALLTMLQKPVTFKSSKITINDILRD
IKPFFTVSEAHLPIQYQKFMPTLPDNVQYIIQCIGSRTYQIEDDGSKSAISRLISKYSVYKPSIEELYKVISLHENEIQL
YLISLGIPKIDADTYVGSKIYSQDKYRILESYVYNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
VINYAIKNGSWISLFCNYPKSEMIKLWKKMWNITSLRSPYTNANFFQD
>Q9QNB3 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFVYNSQSAVQIPIYYSSNSELETRCIEFHAKCVDNSKKGLSLKPLFEEYKDVTDNATLLSILSYSYD
KYNAVERKLVSYAKGKPLEADLTANELDYENNKITSELFQSAEEYTDSLMDPAILTSLSSNLNAVMFWLERHSNDIADAN
KIYKRRLDLFTIVASTINKYGVPRHNEKYRYEYEVMKDKPYYLVTWANSSIEMLMSVFSHEDYLIAKELIVLSYSNRSTL
AKLVSSPMSILVALIDINGTFITNEELELEFSDKYVKAIVPDQTFDELQEMINNMKKIGLVDIPRMIQEWLIDCSLEKFT
LMSKIYSWSFHVGFRKQKMIDAALDQLKTEYTKDVDDEMYNEYTMLIRDEIVKMLEIPVKHDDHLLRDSELAGLLSMSSA
SNGESRQIKFGRKTIFSTKKNMHVMDDIAHGKYTPGVIPPVNVDKPIPLGRRDVPGRRTRIIFILPYEYFIAQHAVVEKM
LLYAKHTREYAEFYSQSNQLLSYGDVTRFLSSNSMVLYTDVSQWDSSQHNTQPFRKGIIMGLDMLSNMTNDPKVVQALNL
YKQTQINLMDSYVQIPDGNVIKKNQYGAVASGEKQTKAANSIANLALIKTVLSRIANKYSFITKIIRVSGDDNYAVLQFN
TDLTKQMIQDVSNDVRYIYFRMNAKVKALVSTVGIEIAKRYLAGGKIFFRAGINLLNNEKRGQSTQWDQAAILYSNYIVN
KLRGFETDREFILTKIIQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTDDEVYIQRAFMSLSSQKSGIAD
EIASSQTFKNYVSKLSDQLLVSKNVIVSKGIAVTEKAKLNSYAPIYLEKRRAQISALLTMLQKPVSFKSNKNTINEILRD
IKPFFVTTEDNLPIQYRKFMPTLPDNVQYVIQCIGSRTYQIEDSGSKSSISKLISKYSVYKPSIEELYKVISLREQEIQL
YLVSLGVPLVDASAYVASRIYSQDKYKILESYVYNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
IINYAIKNRAWISVFCNYPKSEMIKLWKKMWSITALRSPYTSANFFQD
>Q85036 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFVYNSQSAVQIPIYYSSNSELEKRCIDFHAKCVDNSKKGLSLSSLFEEYKDVIDNATLLSILSYSYD
KYNAVERKLINYAKGKPLEADLTANELDYENNKITSELFKSAEEYTDSLMDPAILTSISSNLNAVMFWLERHSNDVGDAN
KVYRRRLDLFIIVASTINKYGVPRHNEKYRYEYEVMKDKPYYLVTWANSAIEMLMSVFSHEDYLIAKELIILSYSNRSTL
AKLVSSPMSILVALIDINGTFITNEELELEFSDKYVKAIVPDQTFNELQEMIDNMKKAGLVDIPRMIQEWLVDCSLEKFT
LMSKIYSWSFHVGFRKQKMIDAALDQLKTEYTEDVDNEMYNEYTMLIRDEIVKMLEVPVKHDDHLLRDSELAGLLSMSSA
SNGESRQLKFGRKTIFSTKKNMHVMDDIAHGRYTPGVIPPVNVDRPIPLGRRDVPGRRTRIIFILPYEYNSAQHAVVEKM
LSYAKHTREYAEFYSQSNQLLSYGDVTRFLSSNSMVLYTDVSQWDSSQHNTQPFRKGIIMGLDMLANMTNDPKVVQTLNL
YKQTQINLMDSYVQIPDGNVIKKIQYGAVASGEKQTKAANSIANLALIKTVLSRIANKYSFITKIIRVDGDDNYAVLQFN
TDVTKQMVQEVSNDVRYIYSRMNAKVKALVSTVGIEIAKRYIAGGKIFFRAGINLLNNEKRGQSTQWDQAAILYSNYIVN
KLRGFDTDREFILTKIIQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTDDEVYIQRAFMSLSSQKSGIAD
EIASSQTFKNYVSKLSDQLLVSKNAIVSKGIAVTEKAKLNSYAPVYLEKRRAQISALLTMLQKPVSFKSNKITINDILRD
IKPFFVTTEAKLPIQYRKFMPTLPDNVQYVIQCIGSRTYQIEDSGSKSSISKLISKYSVYKPSIEELYKVISLREQEIQL
YLVSLGVPPVDAGTYVGSRIYSQDKYKILESYVYNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
IINYAIKNKSWISLFCNYPKSEMIKLWKKMWNITALRSPYTSANFFQD
>P17468 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFIYNSQSAVQIPIYYSSNSELENRCIEFHSKCLENSKNGLSLKKLFVEYSDVMENATLLSILSYSYD
KYNAVERKLVKYAKGKPLEADLTVNELDYENNKITSELFPTAEEYTDSLMDPAILTSLSSNLNAVMFWLEKHENDVAEKL
KIYKRRLDLFTIVASTVNKYGVPRHNAKYRYEYEVMKDKPYYLVTWANSSIEMLMSVFSHEDYLIARELIVLSYSNRSTL
AKLVSSPMSILVALVDINGTFITNEELELEFSNKYVRAIVPDQTFDELKQMLDNMRKAGLTDIPKMIQDWLVDCSIEKFP
LMAKIYSWSFHVGFRKQKMLDAALDQLKTEYTEDVDDEMYREYTMLIRDEVVKMLEEPVKHDDHLLQDSELAGLLSMSSA
SNGESRQLKFGRKTIFSTKKNMHVMDDMANGRYTPGIIPPVNVDKPIPLGRRDVPGRRTRIIFILPYEYFIAQHAVVEKM
LIYAKHTREYAEFYSQSNQLLSYGDVTRFLSNNSMVLYTDVSQWDSSQHNTQPFRKGIIMGLDMLANMTNDARVIQTLNL
YKQTQINLMDSYVQIPDGNVIKKIQYGAVASGEKQTKAANSIANLALIKTVLSRISNKYSFATKIIRVDGDDNYAVLQFN
TEVTEQMVQDVSNDVRETYARMNAKVKALVSTVGIEIAKRYIAGGKIFFRAGINLLNNEKRGQSTQWDQAAVLYSNYIVN
RLRGFETDREFILTKIMQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTDDEVYIQRAFMSLSSQKSGIAD
EIAASSTFKNYVSRLSGQLLFSKNNIVSRGIALTEKAKLNSYAPISLEKRRAQISALLTMLQKPVTFKSSKITINDILRD
IKPFFTVNEAHLPIQYQKFMPTLPDNVQYIIQCIGSRTYQIEDDGSKSAISRLISKYSVYKPSIEELYKVISLHENEIQL
YLISLGIPKIDADTYVGSKIYSQDKYRILESYVCNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
VINHAIKNGSWISLFCNYPKSEMIKLWKKMWNITSLRSPYTNANFFQD
>A2T3S0 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFIYNSQSAVQIPIYYSSNSELENRCIEFHSKCLENSKNGLSLRKLFVEYNDVIENATLLSILSYSYD
KYNAVERKLVKYAKGKPLEADLTVNELDYENNKITSELFPTAEEYTDSLMDPAILTSLSSNLNAVMFWLEKHENDVAEKL
KVYKRRLDLFTIVASTINKYGVPRHNAKYRYEYDVMKDKPYYLVTWANSSIEMLMSVFSHDDYLIAKELIVLSYSNRSTL
AKLVSSPMSILVALVDINGTFITNEELELEFSNKYVRAIVPDQTFDELNQMLDNMRKAGLVDIPKMIQDWLVDRSIEKFP
LMAKIYSWSFHVGFRKQKMLDAALDQLKTEYTENVDDEMYREYTMLIRDEVVKMLEEPVKHDDHLLRDSELAGLLSMSSA
SNGESRQLKFGRKTIFSTKKNMHVMDDMANERYTPGIIPPVNVDKPIPLGRRDVPGRRTRIIFILPYEYFIAQHAVVEKM
LIYAKHTREYAEFYSQSNQLLSYGDVTRFLSNNTMVLYTDVSQWDSSQHNTQPFRKGIIMGLDILANMTNDAKVLQTLNL
YKQTQINLMDSYVQIPDGNVIKKIQYGAVASGEKQTKAANSIANLALIKTVLSRISNKHSFATKIIRVDGDDNYAVLQFN
TEVTKQMIQDVSNDVRETYARMNAKVKALVSTVGIEIAKRYIAGGKIFFRAGINLLNNEKRGQSTQWDQAAILYSNYIVN
RLRGFETDREFILTKIMQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTDDEVYIQRAFMSLSSQKSGIAD
EIAASSTFKNYVTRLSEQLLFSKNNIVSRGIALTEKAKLNSYAPISLEKRRAQISALLTMLQKPVTFKSSKITINDILRD
IKPFFTVSDAHLPIQYQKFMPTLPDNVQYIIQCIGSRTYQIEDDGSKSAISRLISKYSVYKPSIEELYKVISLHENEIQL
YLISLGIPKIDADTYVGSKIYSQDKYRILESYVYNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
VINYAIKNGSWISLFCNYPKSEMIKLWKKMWNITSLRSPYTNANFFQD
>O37061 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MGKYNLILSEYLSFIYNSQSAVQIPIYYSSNSELENRCIEFHSKCLENSKNGLSLRKLFVEYNDVIENATLLSILSYSYD
KYNAVERKLVKYAKGKPLEADLTVNELDYENNKMTSELFPTAEEYTDSLMDPAILTSLSSNLNAVMFWLEKHENDVAEKL
KVYKRRLDLFTIVASTINKYGVPRHNAKYRYEYDVMKDKPYYLVTWANSSIEMLMSVFSHDDYLIAKELIVLSYSNRSTL
AKLVSSPMSILVALVDINGTFITNEELELEFSNKYVRAIVPDQTFDELNQMLDNMRKAGLVDIPKMIQDWLVDRSIEKFP
LMAKIYSWSFHVGFRKQKMLDAALDQLKTEYTENVDDEMYREYTMLIRDEVVKMLEEPVKHDDHLLRDSELAGLLSMSSA
SNGESRQLKFGRKTIFSTKKNMHVMDDMANERYTPGIIPPVNVDKPIPLGRRDVPGRRTRIIFILPYEYFIAQHAVVEKM
LIYAKHTREYAEFYSQSNQLLSYGDVTRFLSNNTMVLYTDVSQWDSSQHNTQPFRKGIIMGLDILANMTNDAKVLQTLNL
YKQTQINLMDSYVQIPDGNVIKKIQYGAVASGEKQTKAANSIANLALIKTVLSRISNKHSFATKIIRVDGDDNYAVLQFN
TEVTKQMIQDVSNDVRETYARMNAKVKALVSTVGIEIAKRYIAGGKIFFRAGINLLNNEKRGQSTQWDQAAILYSNYIVN
RLRGFETDREFILTKIMQMTSVAITGSLRLFPSERVLTTNSTFKVFDSEDFIIEYGTTVDEVYIQRAFMSLSSQKSGIAD
EIAASSTFKNYVTRLSEQLLFSKNNIVSRGIALTEKAKLNSYAPISLEKRRAQISALLTMLQKPVTFKSSKITINDILRD
IKPFFTVSDAHLPIQYQKFMPTLPDNVQYIIQCIGSRTYQIEDDGSKSAISRLISKYSVYKPSIEELYKVISLHENEIQL
YLISLGIPKIDADTYVGSKIYSRDKYRILESYVYNLLSINYGCYQLFDFNSPDLEKLIRIPFKGKIPAVTFILHLYAKLE
VINYAIKNGSWISLFCNYPKSEMIKLWKKMWNITSLRSPYTNANFFQE
>O72157 ~~~ORF2A-2B~~~Replicase polyprotein P2AB~~~
MYHPGRSPSFLITLANVICAAILFDIHTGGYQPGSLIPIVAWMTPFVTLLWLSASFATYLYKYVRTRLLPEEKVARVYYT
AQSAPYFDPALGVMMQFAPSHGGASIEVQVNPSWISLLGGSLKINGDDASNESAVLGSFYSSVKPGDEPASLVAIKSGPQ
TIGFGCRTKIDGDDCLFTANHVWNNSMRPTALAKRGKQVAIEDWETPLSCDHKMLDFVVVRVPKHVWSKLGVKATQLVCP
SDKDAVTCYGGSSSDNLLSGTGVCSKVDFSWKLTHSCPTAAGWSGTPIYSSRGVVGMHVGFEDIGKLNRGVNAFYVSNYL
LRSQETLPPELSVIEIPFEDVETRSYEFIEVEIKGRGKAKLGKREFAWIPESGKYWADDDDDSLPPPPKVVDGKMVWSSA
QETVAEPLNLPAGGRVKALAALSQLAGYDFKEGEAASTRGMPLRFVGQSACKFRELCRKDTPDEVLRATRVFPELSDFSW
PERGSKAELHSLLLQAGKFNPTGIPRNLEGACQNLLERYPASKSCYCLRGEAWSFDAVYEEVCKKAQSAEINEKASPGVP
LSRLASTNKDLLKRHLELVALCVTERLFLLSEAEDLLDESPVDLVRRGLCDPVRLFVKQEPHASRKVREGRFRLISSVSL
VDQLVERMLFGPQNQLEIAEWEHIPSKPGMGLSLRQQAKSLFDDLRVKHSRCPAAEADISGFDWSVQDWELWADVEMRIV
LGGFGHKLAKAAQNRFSCFMNSVFQLSDGTLIEQQLPGIMKSGSYCTSSTNSRIRCLMAELIGSPWCIAMGDDSVEGWVD
GAKDKYMRLGHTCKDYKPCATTISGRLYEVEFCSHVIREDRCWLASWPKTLFKYLSEGKWFFEDLERDVSSSPHWPRIRH
YVVGNTPSPHKTNLQNQSPRYGEEVDKTTVNQGYSEHSGSPGHSIEEAQEPEAAPFCCEAASVYPGWGVHGPYCSGDYGS
LT
>P25328 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MKEPVDCRLSTPAGFSGTVPPPGRTKAARPGTIPVRRSRGSASALPGKIYGWSRRQRDRFAMLLSSFDAALAAYSGVVVS
RGTRSLPPSLRLFRAMTRKWLSVTARGNGVEFAIASAKEFSAACRAGWISGTVPDHFFMKWLPEPVRRKSGLWAQLSFIG
RSLPEGGDRHEIEALANHKAALSSSFEVPADVLTSLRNYSEDWARRHLAADPDPSLLCEPCTGNSATFERTRREGGFAQS
ITDLVSSSPTDNLPPLESMPFGPTQGQALPVHVLEVSLSRYHNGSDPKGRVSVVRERGHKVRVVSAMETHELVLGHAARR
RLFKGLRRERRLRDTLKGDFEATTKAFVGCAGTVISSDMKSASDLIPLSVASAIVDGLEASGRLLPVEIAGLRACTGPQH
LVYPDGSEITTRRGILMGLPTTWAILNLMHLWCWDSADRQYRLEGHPFRATVRSDCRVCGDDLIGVGPDSLLRSYDRNLG
LVGMILSPGKHFRSNRRGVFLERLLEFQTRKTVYEHAVIYRKVGHRRVPVDRSHIPVVTRVTVLNTIPLKGLVRASVLGR
DDPPVWWAAAVAESSLLSDYPRKKIFAAARTLRPGLSRQFRRLGIPPFLPRELGGAGLVGPSDRVDAPAFHRKAISSLVW
GSDATAAYSFIRMWQGFEGHPWKTAASQETDTWFADYKVTRPGKMYPDRYGFLDGESLRTKSTMLNSAVYETFLGPDPDA
THYPSLRIVASRLAKVRKDLVNRWPSVKPVGKDLGTILEAFEESKLCTLWVTPYDASGYFDDSLLLMDESVYQRRFRQLV
IAGLMREGRMGDLLFPNWLPPSTVVSGFP
>Q07048 2.7.7.48~~~~~~RNA-directed RNA polymerase~~~
MHHKVNVKTQREVHFPMDLLQACGASAPRPVARVSRATDLDRRYRCVLSLPEERARSVGCKWSSTRAALRRGLEELGSRE
FRRRLRLADDCWRAICAAVCTGRKFPSFSVTDRPARARLAKVYRMGRRLLVGVVCRGESVVSDLKQECADLRRVIFEGST
RIPSSSLWGLVGVLGWTSPERAMQLTFIGRALPYGSPDVERRALASHAATLSIPAECHPNYLVAAEQFAKSWADDNLPRK
FRIYPIAVQESSCMEYSRAQGGLLQSFRKGFVGYDPAAPSADPDDLELAKERGFSRIRASWYSTFRYRGELKSTNQSLEA
RVAVVPERGFKARIVTTHSASRVTFGHQFRRYLLQGIRRHPALVDVIGGDHRRAVETMDGDFGLLRPDGRLLSADLTSAS
DRIPHDLVKAILRGIFSDPDRRPPGTSLADVFDLVLGPYHLHYPDGSEVTVRQGILMGLPTTWPLLCLIHLFWVELSDWA
PARPNHSRGFVLGESFRICGDDLIAWWRPERIALYNQIAVDCGAQFSAGKHLESKTWGIFTEKVFTVKPVKMKVRVRSEP
SLKGYVFSRSSAFSCRMGGKGITGIRAARLYTIGAMPRWSRRIRDVYPGSLEHRTASQRYGEPVTVYRFGRWSSAIPLRW
AVRAPTRTVGNPVQSLPDWFTVGPAASSVAADSNAFGAVSRVLRRMFPGLPRKLASAGIPPYLPRVFGGGGLVKSTGLTT
KIGAVASRRWMSRIGHDLYRSRERKSTLGRVWTLSTSPAYAASLHEVEKFMDRPDIILTRKCRNPMLKHARELGLFEEVF
ESRVGGGILWASLNGKALVESHSPSILQVSRNLRRSLACPSGGFLRPSAPIGKLVQRHTLPRGTVWFLESSATDSARQGG
MGLPPPPPPPLGGGGMAGPPPPPFMGLRPESSVPTSVPFTPSMFSERLAALESLFGRPPPS
>P23172 2.7.7.48~~~gag-pol~~~Probable RNA-directed RNA polymerase~~~
MSSLLNSLLPEYFKPKTNLNINSSRVQYGFNARIDMQYEDDSGTRKGSRPNAFMSNTVAFIGNYEGIIVDDIPILDGLRA
DIFDTHGDLDMGLVEDALSKSTMIRRNVPTYTAYASELLYKRNLTSLFYNMLRLYYIKKWGSIKYEKDAIFYDNGHACLL
NRQLFPKSRDASLESSLSLPEAEIAMLDPGLEFPEEDVPAILWHGRVSSRATCILGQACSEFAPLAPFSIAHYSPQLTRK
LFVNAPAGIEPSSGRYTHEDVKDAITILVSANQAYTDFEAAYLMLAQTLVSPVPRTAEASAWFINAGMVNMPTLSCANGY
YPALTNVNPYHRLDTWKDTLNHWVAYPDMLFYHSVAMIESCYVELGNVARVSDSDAINKYTFTELSVQGRPVMNRGIIVD
LTLVAMRTGREISLPYPVSCGLTRTDALLQGTEIHVPVVVKDIDMPQYYNAIDKDVIEGQETVIKVKQLPPAMYPIYTYG
INTTEFYSDHFEDQVQVEMAPIDNGKAVFNDARKFSKFMSIMRMMGNDVTATDLVTGRKVSNWADNSSGRFLYTDVKYEG
QTAFLVDMDTVKARDHCWVSIVDPNGTMNLSYKMTNFRAAMFSRNKPLYMTGGSVRTIATGNYRDAAERLRAMDETLRLK
PFKITEKLDFSCSSLRDTKFVGQQYAILTPSGTTTDIRSGRGTNQSYRRGRTSTGYRIGVEDDEDLDIGTVKYIVPLYLN
GDNVAQNCLEATHVLIKACSIANRIVDDGEGHCFTQQGLAQQWIFHRGEMIFVKAVRIGQLNAYYVDYKNVTNYSLKTAA
QVGATISNNLRHGFVDNQQDAYTRLVANYSDTRKWIRDNFTYNYNMEKEKYRITQYHHTHVRLKDLFPSRKIVKLEGYEA
LLAMMLDRFNNIESTHVTFFTYLRALPDREKEVFISLVLNYNGLGREWLKSEGVRAKQAQGTVKYDMSKLFELNVLENGV
DEEVDWEKEKRNRSDIKTVNISYAKVLEHCRELFIMARAEGKRPMRMKWQEYWRQRAVIMPGGSVHSQHPVEQDVIRVLP
REIRSKKGVASVMPYKEQKYFTSRRPEIHAYTSTKYEWGKVRALYGCDFSSHTMADFGLLQCEDTFPGFVPTGSYANEDY
VRTRIAGTHSLIPFCYDFDDFNSQHSKEAMQAVIDAWISVYHDKLTDDQIEAAKWTRNSVDRMVAHQPNTGETYDVKGTL
FSGWRLTTFFNTALNYCYLANAGINSLVPTSLHNGDDVFAGIRTIADGISLIKNAAATGVRANTTKMNIGTIAEFLRVDM
RAKNSTGSQYLTRGIATFTHSRVESDAPLTLRNLVSAYKTRYDEILARGASIDNMKPLYRKQLFFARKLFNVEKDIVDNL
ITMDISCGGLQEKGRVSEMVLQEVDIENIDSYRKTRMIAKLIDKGVGDYTAFLKTNFSEIADAITRETRVESVTKAYNVK
KKTVVRAFRDLSAAYHERAVRHAWKGMSGLHIVNRIRMGVSNLVMVVSKINPAKANVLAKSGDPTKWLAVLT
>P89202 2.1.1.-~~~~~~Replicase large subunit~~~
MSTSTLINKAQTNSCGDVGVVDLLKRKVYDDTVKTMQGLDRRAKYRLNQCLGPEQCRTVRGGYPEFQIEFTGASNTSHAM
AAGLRGLELEYLYTLVPYGAVSYDIGGNFPAHMMKGRSYVHCCNPALDARDLARNENYRISIENYLSRFEDKSGDYCQWQ
RKKPKVSKPLPRYQKACFDRYNEDPEHVTCSETFEKCRISPPAERDDIYATSLHSLYDIPYQNLGPALARKRIKVLHAAF
HFSEDLLLGASEGLLTQIGGTFQRNGDVLTFSFLDESSLIYTHSFRNVFEYVTRTFFVACNRYAYMKEFRSRRVDTVFCS
FIRIDTYCLYRSVFKDCDEHVFAAMDDAWEFKKKRVMLEASRPIFNDVAQFNVYFPNAKDKVCLPIFAVKSVSGAPVTTR
HILVEKDFYWTALNHILTYPDGKADFRGVMSFLESIRSRVVINGTTTASQWEVDKSQLKDIALSLLLIAKLEKLKISVIE
KRIKIERQGLVSLLKEFLHGLLDEYTQTMAEWVVEKGWVKSVDQVLQVTIPDLVLNFRDHFRCEFRTSANVSEVNVSEHL
VATNEYYAKVSDLVDRNPTLAFDFEKFQDYCEKLGVDIDTVTELIDAISTGRAGITLDHTDDKEEQLPRTLAGSSSYLEE
EPSDDLVCLSDKAIVNRSTILGELKNNVVIFEGTLPKNSVFVSAPDDPSVTIELSELHARPVSDFLSMQKPVNIVYTGEV
QICQMQNYLDYLSASLVACISNLKKYLQDQWLNPGEKFQKIGVWDNLNNKWIVVPQKKKYAWGLAADVDGNQKTVILNYD
EHGMPILEKSYVRLVVSTDTYLFTVVSMLGYLRHLDQKKPTATITLVDGVPGCGKTQEILSRFDANSDLILVQGREACEM
IRRRANDNVPGSATKENVRTFDSFVMNRKPGKFKTLWVDEGLMVHPGLINFCINISCVSSVYIFGDRKQIPFINRVMNFS
IPDNLAKLYYDEIVSRDTTKRCPLDVTHFLNSVYEKRVMSYSNVQRSLECKMISGKAKINDYRSILAEGKLLTFTQEDKE
YLLKAGFKDVNTVHEAQGETYRDVNLIRVTATPLTIVSAGSPHVTVALSRHTNRFVYYTVVPDVVMTTVQKTQCVSNFLL
DMYAVAYTQKXQLQISPFYTHDIPFVETNKVGQISDLQYFYDSWLPGNSFVQNNHDQWSIISSDINLHSEAVRLDMNKRH
IPRTKGEFLRPLLNTAVEPPRIPGLLENLLALIKRNFNAPDLAGQLDYDFLSRKVCDGFFGKLLPPDVEASELLRLPVDH
MYSVQNFDDWLNKQEPGVVGQLANWDHIGMPAADQYRHMIKRTPKAKLDLSIQSEYPALQTIVYHSKHVNAVFGPIFSCL
TERLLSVVDPLRFKFFTRTTPADLEFFFRDMVVGDMEILELDISKYDKSQNKFHFEVEMRIWEMLGIDKYIEKVWENGHR
KTHLRDYTAGIKTVIEYQRKSGDVTTFIGNTIIIAACLCSILPMEKVFKAGFCGDDSIIYLPRNLLYPDIQSVSNNMWNF
EAKLFKKLHGYFCGRYXLRNGRYLRLLPDPLKIITKLGCKAIKDWDHLEEFRISMFDMACEYKNCFGFDVLESAVKESFP
KAEGCNVAFCAIYKFLSNKYLFRTLFSDV
>O41353 2.7.7.48~~~Segment 2~~~RNA-directed RNA polymerase catalytic subunit~~~
MNLFTPRSEINPTTTQELLYAYTGPAPVAYGTRTRAVLENIIRPYQYFYKEPNVQRALDIKTGCKEPEDINVEGPSSGFH
TASVLKLADNFFRKYRPAMEKLKYWILVKLPKLKYAELSKGRQTYSFIHKRNLPAPIALEETVEFLEQNLRRKIGPTLLS
YCQAIADVMELDETTYEGARDPRPWDIQLEEIDSDEEDPLFRQVGREETYTIKFSREELWDQMRTLNTMCKHLERGRLNR
RTIATPSMLIRGFVKIVEDAAKEILENVPTSGVPVGGEEKLAKLASKQTFHTAVTGELSGDQEKFNECLDPDAMRLMWTV
FLRKLGCPDWIMELFNIPFMVFKSKLADMGEGLVYTKGKLTDRKPLGEMPSEFDDLVRNVVGNSISCRLGMFMGMYNLTS
TLLALISIEREELTGSHVESSDDFIHFFNCKTHEEMFKQAETLRLTLKLVGINMSPSKCILISPAGIGEFNSKFHHRDFV
GNVATELPALVPNGTNPMTDLAMGLNVIKHSVNTGQMNLCTGALAMRIFNHAYKYAYMALGVTRRTRFMEENAITPLLTN
QGASPVHSFSTMHLDEVALRRHLGLLDEETLRRILNPNNPVTQKGDPSMFFRIENKMPQIMEDYSVPSCFKYTLSRNRTI
QDKPHKALLNKEERYQRVTSIINKLFPEVLIQEASAPGTVRESLKRRLELVVERSDLDEERKKRILSRIF
>P18339 2.1.1.-~~~~~~Replicase large subunit~~~
MAHIQSIISNALLESVSGKNTLVNDLARRRMYDTAVEEFNARDRRPKVNFSKTISEEQTLLVSNAYPEFQITFYNTQNAV
HSLAGGLRALELEYLMLQVPYGSPTYDIGGNFAAHLFKGRDYVHCCMPNLDIRDIMRHEGQKDSIEMYLSRLSRSNKVIP
EFQREAFNRYAEAPNEVCCSKTFQDCRIHPPENSGRRYAVALHSLYDIPVHEFGAALISKNIHVCYAASILAEALLLDQT
EVTLNEIGATFKREGDDVSFFFADESTLNYSHKYKNILHYVVKSYFPASSRIVYFKEFLVTRVNTWFCKFTKVDTYILYK
SVRQVGCDSDQFYEAMEDAFAYKKTLAMFNTERAIFRDTASVNFWFPKMKDMVIVPLFEGSITSKKMTRSEVIVNRDFVY
TVLNHIRTYQAKALTYQNVLSFVESIRSRVIINGVTARSEWDVDKAILQPLSMTFFLQTKLAALQDDIVMGKFRCLDKTT
SELIWDEVGKFFGNVFPTIKERLVSRKILDVSENALKIKIPDLYVTWKDRFVAEYTKSEELPHLDIKKDLEEAEQMYDAL
SELSILKGADNFDIAKFKDMCKALDVSPDVAARVIVAVAENRSGLTLTFDKPTEENVAKALKSTASEAVVCLEPTSEEVN
VNKFSIAEKGRLPVCAESHGLTNANLEHQELESLNDFHKACVDSVITKQMASVVYTGSLKVQQMKNYVDSLAASLSATVS
NLCKSLKDEVGYDSDSREKVGVWDVTLKKWLLKPAAKGHSWGVVLDYKGKMFTALLSYEGDRMVTESDWRRVAVSSDTMV
YSDIAKLQNLRKTMRDGEPHEPTAKMVLVDGVPGCGKYKGDFERFDLDEDLILVPGKQAAAMIRRRANSSGLIRATMDNV
RTVDSLLMHPKPRSHKRLFIDEGLMLHTGCVNFLVLISGCDIAYIYGDTQQIPFINRVQNFPYPKHFEKLQVDEVEMRRT
TLRCPGDVNFFLQSKYEGAVTTTSTVQRSVSSEMIGGKGVLNSVSKPLKGKIVTFTQADKFELEEKGYKNVNTVHEIQGE
TFEDVSLVRLTATPLTLISKSSPHVLVALTRHTKSFKYYTVVLDPLVQIISDLSSLSSFLLEMYMVEAGSRXQLQMDAVF
KGHNLFVATPKSGDFPDLQFYYDVCLPGNSTILNKYDAVTMRLRDNSLNVKDCVLDFSKSIPMPKEVKPCLEPVLRTAAE
PPRAAGLLENLVAMIKRNFNAPDLTGTIDIESTASVVVDKFFDSYFIKKEKYTKNIAGVMTKDSMMRWLENRKEVLLDDL
ANYNFTDLPAIDQYKHMIKAQPKQKLDLSIQNEYPALQTIVYHSKQINGILAGFSELTRLLLEAFDSKKFLFFTRKTPEQ
IQEFFSDLDSHVPMDVLELDISKYDKSQNEFHCAVEYEIWKRLGLNEFLAEVWKQGHRKTTLKDYIAGIKTCLWYQRKSG
DVTTFIGNTVIIAACLGSMLPMEKVIKGAFCGDDSVLYFPKGLDFPDIQSCANLMWNFEAKLYRKRYGYFCGRYIIHHDK
GAIVYYDPLKLISKLGAKHIKDYDHLEELRVSLCDVACSLGNWCLGFPQLNAAIKEVHKTAIDGSFAFNCVNKFLCDKFL
FRTLFLNGC
>P03586 2.1.1.-~~~~~~Replicase large subunit~~~
MAYTQTATTSALLDTVRGNNSLVNDLAKRRLYDTAVEEFNARDRRPKVNFSKVISEEQTLIATRAYPEFQITFYNTQNAV
HSLAGGLRSLELEYLMMQIPYGSLTYDIGGNFASHLFKGRAYVHCCMPNLDVRDIMRHEGQKDSIELYLSRLERGGKTVP
NFQKEAFDRYAEIPEDAVCHNTFQTMRHQPMQQSGRVYAIALHSIYDIPADEFGAALLRKNVHTCYAAFHFSENLLLEDS
YVNLDEINACFSRDGDKLTFSFASESTLNYCHSYSNILKYVCKTYFPASNREVYMKEFLVTRVNTWFCKFSRIDTFLLYK
GVAHKSVDSEQFYTAMEDAWHYKKTLAMCNSERILLEDSSSVNYWFPKMRDMVIVPLFDISLETSKRTRKEVLVSKDFVF
TVLNHIRTYQAKALTYANVLSFVESIRSRVIINGVTARSEWDVDKSLLQSLSMTFYLHTKLAVLKDDLLISKFSLGSKTV
CQHVWDEISLAFGNAFPSVKERLLNRKLIRVAGDALEIRVPDLYVTFHDRLVTEYKASVDMPALDIRKKMEETEVMYNAL
SELSVLRESDKFDVDVFSQMCQSLEVDPMTAAKVIVAVMSNESGLTLTFERPTEANVALALQDQEKASEGALVVTSREVE
EPSMKGSMARGELQLAGLAGDHPESSYSKNEEIESLEQFHMATADSLIRKQMSSIVYTGPIKVQQMKNFIDSLVASLSAA
VSNLVKILKDTAAIDLETRQKFGVLDVASRKWLIKPTAKSHAWGVVETHARKYHVALLEYDEQGVVTCDDWRRVAVSSES
VVYSDMAKLRTLRRLLRNGEPHVSSAKVVLVDGVPGCGKTKEILSRVNFDEDLILVPGKQAAEMIRRRANSSGIIVATKD
NVKTVDSFMMNFGKSTRCQFKRLFIDEGLMLHTGCVNFLVAMSLCEIAYVYGDTQQIPYINRVSGFPYPAHFAKLEVDEV
ETRRTTLRCPADVTHYLNRRYEGFVMSTSSVKKSVSQEMVGGAAVINPISKPLHGKILTFTQSDKEALLSRGYSDVHTVH
EVQGETYSDVSLVRLTPTPVSIIAGDSPHVLVALSRHTCSLKYYTVVMDPLVSIIRDLEKLSSYLLDMYKVDAGTQXQLQ
IDSVFKGSNLFVAAPKTGDISDMQFYYDKCLPGNSTMMNNFDAVTMRLTDISLNVKDCILDMSKSVAAPKDQIKPLIPMV
RTAAEMPRQTGLLENLVAMIKRNFNAPELSGIIDIENTASLVVDKFFDSYLLKEKRKPNKNVSLFSRESLNRWLEKQEQV
TIGQLADFDFVDLPAVDQYRHMYKAQPKQKLDTSIQTEYPALQTIVYHSKKINAIFGPLFSELTRQLLDSVDSSRFLFFT
RKTPAQIEDFFGDLDSHVPMDVLELDISKYDKSQNEFHCAVEYEIWRRLGFEDFLGEVWKQGHRKTTLKDYTAGIKTCIW
YQRKSGDVTTFIGNTVIIAACLASMLPMEKIIKGAFCGDDSLLYFPKGCEFPDVQHSANLMWNFEAKLFKKQYGYFCGRY
VIHHDRGCIVYYDPLKLISKLGAKHIKDWEHLEEFRRSLCDVAVSLNNCAYYTQLDDAVWEVHKTAPPGSFVYKSLVKYL
SDKVLFRSLFIDGSSC
>P03587 2.1.1.-~~~~~~Replicase large subunit~~~
MAYTQTATSSALLETVRGNNTLVNDLAKRRLYDTAVDEFNARDRRPKVNFSKVVSEEQTLIATKAYPEFQITFYNTQNAV
HSLAGGLRSLELEYLMMQIPYGSLTYDIGGNFASHLFKGRAYVHCCMPNLDVRDIMRHEGQKDSIELYLSRLERGNKHVP
NFQKEAFDRYAEMPNEVVCHDTFQTCRHSQECYTGRVYAIALHSIYDIPADEFGAALLRKNVHVCYAAFHFSENLLLEDS
HVNLDEINACFQRDGDRLTFSFASESTLNYSHSYSNILKYVCKTYFPASNREVYMKEFLVTRVNTWFCKFSRIDTFLLYK
GVAHKGVDSEQFYKAMEDAWHYKKTLAMCNSERILLEDSSSVNYWFPKMRDMVIVPLFDISLETSKRTRKEVLVSKDFVY
TVLNHIRTYQAKALTYSNVLSFVESIRSRVIINGVTARSEWDVDKSLLQSLSMTFFLHTKLAVLKDDLLISKFALGPKTV
SQHVWDEISLAFGNAFPSIKERLINRKLIKITENALEIRVPDLYVTFHDRLVSEYKMSVDMPVLDIRKKMEETEEMYNAL
SELSVLKNSDKFDVDVFSQMCQSLEVDPMTAAKVIVAVMSNESGLTLTFEQPTEANVALALQDSEKASDGALVVTSRDVE
EPSIKGSMARGELQLAGLSGDVPESSYTRSEEIESLEQFHMATASSLIHKQMCSIVYTGPLKVQQMKNFIDSLVASLSAA
VSNLVKILKDTAAIDLETRQKFGVLDVASKRWLVKPSAKNHAWGVVETHARKYHVALLEHDEFGIITCDNWRRVAVSSES
VVYSDMAKLRTLRRLLKDGEPHVSSAKVVLVDGVPGCGKTKEILSRVNFEEDLILVPGRQAAEMIRRRANASGIIVATKD
NVRTVDSFLMNYGKGARCQFKRLFIDEGLMLHTGCVNFLVEMSLCDIAYVYGDTQQIPYINRVTGFPYPAHFAKLEVDEV
ETRRTTLRCPADVTHFLNQRYEGHVMCTSSEKKSVSQEMVSGAASINPVSKPLKGKILTFTQSDKEALLSRGYADVHTVH
EVQGETYADVSLVRLTPTPVSIIARDSPHVLVSLSRHTKSLKYYTVVMDPLVSIIRDLERVSSYLLDMYKVDAGTQXQLQ
VDSVFKNFNLFVAAPKTGDISDMQFYYDKCLPGNSTLLNNYDAVTMKLTDISLNVKDCILDMSKSVAAPKDVKPTLIPMV
RTAAEMPRQTGLLENLVAMIKRNFNSPELSGVVDIENTASLVVDKFFDSYLLKEKRKPNKNFSLFSRESLNRWIAKQEQV
TIGQLADFDFVDLPAVDQYRHMIKAQPKQKLDLSIQTEYPALQTIVYHSKKINAIFGPLFSELTRQLLDSIDSSRFLFFT
RKTPAQIEDFFGDLDSHVPMDVLELDVSKYDKSQNEFHCAVEYEIWRRLGLEDFLAEVWKQGHRKTTLKDYTAGIKTCLW
YQRKSGDVTTFIGNTVIIASCLASMLPMEKLIKGAFCGDDSLLYFPKGCEYPDIQQAANLMWNFEAKLFKKQYGYFCGRY
VIHHDRGCIVYYDPLKLISKLGAKHIKDWDHLEEFRRSLCDVAESLNNCAYYTQLDDAVGEVHKTAPPGSFVYKSLVKYL
SDKVLFRSLFLDGSSC
>Q66220 2.7.7.48~~~~~~Replicase large subunit~~~
MAQFQQTVNMQTLQAAAGRNSLVNDLASRRVYDNAVEELNARSRRPKVHFSKSVSTEQTLLASNAYPEFEISFTHTQQAV
HSLAGGLRTLELEYLMMQVPFGSLTYDIGGNFAAHLFKGRDYVHCCMPNLDVRDIARHEGHKEAIFSYLSRLDRQRRPVP
EYQRAAFHNYAENPHFVHCDRPFQQCELSTVNRWDTYAIALHSIYDIPADEFGAALLRKNVKICYAAFHFHENMLLDCDS
VTLEDIGATFQRAGDKLNFFFHNESTLNYTHSFSNIIKYVCKTFFPASQRYVYHKEFLVTRVNTWYCKFTRVDTFTLFRG
VYKTSVDSEEFYKAMDDAWEYKKTLAMLNSERTIFKDSAAINFWFPKVRDMVIIPLFDASITTGRMSRREVLVNKDFVYT
VLNHIKTYQAKALTYANVLSFVESIRSRVIINGVTARSEWDTDKAILGPLAMTFFLVTKLSHVQDEIVLKKFQKFDATAK
ELIWSSLCDALKGVIPSVKETLARGGFVKLAEESLEIKIPELYCTFTDRLVLEYKRTEEFQSCDLSKPLEESEKYYNALS
ELSVLENLGSFDLDAFKELCQKKNVDPDVAAKVVVAIMNSELTLPFKKPTEEEVAEALSGEVVQDEGLRLSNKAPFPCVS
NLKEGLVPACGLCPNGANFDRVDMDISEFHLKSVDAVKKGAMMSAVYTGKIKVQQMKNYVDYLSASLSATVSNLCKVLRD
VHGVDPESQEKSGVWDVRRGRWLLKPNAKCHAWGVAEDANHKLVIVLLNWDEGKPVCDETWFRLAVSSDSLVYSDMGKLK
TLTSCCRDGEPPEPTAKLVLVDGVPGCGKTKEILEKVNFSEDLVLVPGKEASKMIIRRANQAGITRADKDNVRTVDSFLM
HPPKRVFKRLFIDEGLMLHTGCVNFLMLLSHCDVAYVYVDTQQIPFICRVANFPYPAHFAKLVVDEKEDRRVTLRCPADV
TYFLNQKYDGSVLCTSSVERSVSAEVVRGKGALNPITLPLEGKILTFTQADKFELLDKGYKDVNTVHEVQGETYEKTAIV
RLTATPLEIISRASPHVLVALTRHTTRCKYYTVVLDPMVNVISELGKLSNFLLEMYKVESGTQXQLQIDTVFKGTNLFVP
TPKSGDWRDMQFYYDTLLPGNSTILNEFDAVTMNLRDISLNVKDCRIDFSKSVQVPKERPVFMKPKLRTAAEMPRTAGLL
ENLVAMIKRNMNAPDLTGTIDIEDTASLVVEKFWDAYVVKEFSGTDGMAMTRESFSRWLSKQESSTVGQLADFNFVDLPA
VDEYKHMIKSQPKQKLDLSIQDEYPALQTIVYHSKKINAIFGPMFSELTRMLLETIDTSKFLFYTRKTPTQIEEFFSDLD
SSQAMEILELDISKYDKSQNEFHCAVEYKIWEKLGIDDWLAEVWRQGHRKTTLKDYTAGIKTCLWYQRKSGDVTTFIGNT
IIIAACLSSMIPMDKVIKAAFCGDDSLIYIPKGLDLPDIQAGANLTWNFEAKLFRKKYGYFCGRYVIHHDRGAIVYYDPL
KLISKLGCKHIRDEVHLEELRRSLCDVTSNLNNCAYFSQLDEAVAEVHKTAVGGAFVYCSIIKYLSDKRLFKDLFFV
>P06956 ~~~cre~~~Recombinase cre~~~
MSNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQA
RGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRN
LAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFC
RVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNI
VMNYIRNLDSETGAMVRLLEDGD
>P17578 ~~~~~~Repeat element protein~~~
MESLETKESKVFTPVYFTSRQILLPLKMISYVSRFLKFEDFRKFIRAMWPNGEANVIFQELLERLSIRKFKAKFYNREEI
EVEYKFDRERSGINWILINFKDLLPILGGIMLPDDEDKFQSIFTLEDFLKRNLKVHRCSGGIHTSCHNLGRDSDSDSEAK
LDICPFDHYHHFCPDHVIAWFKHYLLTAILLREGVYDELVKNANLPNADHLTSGRRRTEQYWLRVARRKKCRFSQ
>P35926 ~~~ref~~~Recombination enhancement function protein~~~
MKTIEQKIEQCRKWQKAARERAIARQREKLADPVWRESQYQKMRDTLDRRIAKQKERPPASKTRKSAVKIKSRGLKGRTP
TAEERRIANALGALPCIACYMHGVISNEVSLHHIAGRTAPGCHKKQLPLCRWHHQHAAPAEVREKYPWLVPVHADGVVGG
KKEFTLLNKSEMELLADAYEMANIMH
>P69628 ~~~regA~~~Translation repressor protein~~~
MIEITLKKPEDFLKVKETLTRMGIANNKDKVLYQSCHILQKKGLYYIVHFKEMLRMDGRQVEMTEEDEVRRDSIAWLLED
WGLIEIVPGQRTFMKDLTNNFRVISFKQKHEWKLVPKYTIGN
>Q01751 ~~~regA~~~Translation repressor protein~~~
MIEIKLKNPEDFLKVKETLTRMGIANNKDKVLYQSCHILQKQGKYYIVHFKEMLRMDGRQVDIDGEDYQRRDSIAQLLED
WGLIVIEDSAREDLFGLTNNFRVISFKQKDDWTLKAKYTIGN
>P69702 ~~~regA~~~Translation repressor protein~~~
MIEITLKKPEDFLKVKETLTRMGIANNKDKVLYQSCHILQKKGLYYIVHFKEMLRMDGRQVEMTEEDEVRRDSIAWLLED
WGLIEIVPGQRTFMKDLTNNFRVISFKQKHEWKLVPKYTIGN
>P69626 ~~~regA~~~Translation repressor protein~~~
MIEITLKKPEDFLKVKETLTRMGIANNKDKVLYQSCHILQKKGLYYIVHFKEMLRMDGRQVEMTEEDEVRRDSIAWLLED
WGLIEIVPGQRTFMKDLTNNFRVISFKQKHEWKLVPKYTIGN
>P13312 3.1.-.-~~~regB~~~Endoribonuclease RegB~~~
MTINTEVFIRRNKLRRHFESEFRQINNEIREASKAAGVSSFHLKYSQHLLDRAIQREIDETYVFELFHKIKDHVLEVNEF
LSMPPRPDIDEDFIDGVEYRPGRLEITDGNLWLGFTVCKPNEKFKDPSLQCRMAIINSRRLPGKASKAVIKTQ
>P04891 ~~~N~~~Probable regulatory protein N~~~
MTVITYGKSTFAGNAKTRRHERRRKLAIERDTICNIIDSIFGCDAPDASQEVKAKRIDRVTKAISLAGTRQKEVEGGSVL
LPGVALYAAGHRKSKQITAR
>P07243 ~~~N~~~Probable regulatory protein N~~~
MVTIVWKESKGTAKSRYKARRAELIAERRSNEALARKIALKLSGCVRADKAASLGSLRCKKAEEVERKQNRIYYSKPRSE
MGVTCVGRQKIKLGSKPLI
>P03045 ~~~N~~~Antitermination protein N~~~
MDAQTRRRERRAEKQAQWKAANPLLVGVSAKPVNRPILSLNRKPKSRVESALNPIDLTVLAEYHKQIESNLQRIERKNQR
TWYSKPGERGITCSGRQKIKGKSIPLI
>P03047 ~~~Q~~~Antitermination protein Q~~~
MRLESVAKFHSPKSPMMSDSPRATASDSLSGTDVMAAMGMAQSQAGFGMAAFCGKHELSQNDKQKAINYLMQFAHKVSGK
YRGVAKLEGNTKAKVLQVLATFAYADYCRSAATPGARCRDCHGTGRAVDIAKTELWGRVVEKECGRCKGVGYSRMPASAA
YRAVTMLIPNLTQPTWSRTVKPLYDALVVQCHKEESIADNILNAVTR
>P03563 ~~~AC3~~~Replication enhancer protein~~~
MDSRTGEPITVPQAENGVYIWEITNPLYFKIISVEDPLYTNTRIYHLQIRFNHNLRRALDLHKAFLNFQVWTTSTTASGR
TYLNRFKYLVMLYLEQLGVICINNVIRAVRFATDRSYITHVLENHSIKYKFY
>P27266 ~~~C3~~~Replication enhancer protein~~~
MDLRTGEYITAHQATSGVYTFGITNPLYFTITRHNQNPFNNKYNTLTFQIRFNHNLRKELGIHKCFLNFHIWTTLQSPTG
HFLRVFKYQVCKYLNNLGVISLNNVVRAVDYVLFHVFERTIDVTENHEIKFNFY
>P27265 ~~~C3~~~Replication enhancer protein~~~
MDSRTGELITAPQAENGVFIWEINNPLYFKITEHSQRPFLMNHDIISIQIRFNHNIRKVMGIHKCFLDFRIWTTLQPQTG
HFLRVFRYEVLKYLDSLGVISINNVIRAVDHVLYDVLENTINVTETHDIKYKFY
>Q66862 2.7.7.-~~~C1~~~Para-Rep C1~~~
MACSNWVFTRNFQGALPLLSFDERVQYAVWQHERGTHDHIQGVIQLKKKARFSTVKEIIGGNPHVEKMKGTIEEASAYVQ
KEETRVAGPWSYGDLLKRGSHRRKTMERYLEDPEEMQLKDPDTALRCNAKRLKEDFMKEKTKLQLRPWQKELHDLILTEP
DDRTIIWVYGPDGGEGKSMFAKELIKYGWFYTAGGKTQDILYMYAQDPERNIAFDVPRCSSEMMNYQAMEMMKNRCFAST
KYRSVDLCCNKNVHLVVFANVAYDPTKISEDRIVIINC
>Q89269 3.6.4.12~~~~~~Protein Rep40~~~
MELVGWLVDKGITSEKQWIQEDQASYISFNAASNSRSQIKAALDNAGKIMSLTKTAPDYLVGQQPVEDISSNRIYKILEL
NGYDPQYAASVFLGWATKKFGKRNTIWLFGPATTGKTNIAEAIAHTVPFYGCVNWTNENFPFNDCVDKMVIWWEEGKMTA
KVVESAKAILGGSKVRVDQKCKSSAQIDPTPVIVTSNTNMCAVIDGNSTTFEHQQPLQDRMFKFELTRRLDHDFGKVTKQ
EVKDFFRWAKDHVVEVEHEFYVKKGGAKKRPAPSDADISEPKRVRESVAQPSTSDAEASINYADRLARGHSL
>Q89270 ~~~~~~Protein Rep52~~~
MELVGWLVDKGITSEKQWIQEDQASYISFNAASNSRSQIKAALDNAGKIMSLTKTAPDYLVGQQPVEDISSNRIYKILEL
NGYDPQYAASVFLGWATKKFGKRNTIWLFGPATTGKTNIAEAIAHTVPFYGCVNWTNENFPFNDCVDKMVIWWEEGKMTA
KVVESAKAILGGSKVRVDQKCKSSAQIDPTPVIVTSNTNMCAVIDGNSTTFEHQQPLQDRMFKFELTRRLDHDFGKVTKQ
EVKDFFRWAKDHVVEVEHEFYVKKGGAKKRPAPSDADISEPKRVRESVAQPSTSDAEASINYADRYQNKCSRHVGMNLML
FPCRQCERMNQNSNICFTHGQKDCLECFPVSESQPVSVVKKAYQKLCYIHHIMGKVPDACTACDLVNVDLDDCIFEQ
>P03132 3.6.4.12~~~~~~Protein Rep68~~~
MPGFYEIVIKVPSDLDGHLPGISDSFVNWVAEKEWELPPDSDMDLNLIEQAPLTVAEKLQRDFLTEWRRVSKAPEALFFV
QFEKGESYFHMHVLVETTGVKSMVLGRFLSQIREKLIQRIYRGIEPTLPNWFAVTKTRNGAGGGNKVVDECYIPNYLLPK
TQPELQWAWTNMEQYLSACLNLTERKRLVAQHLTHVSQTQEQNKENQNPNSDAPVIRSKTSARYMELVGWLVDKGITSEK
QWIQEDQASYISFNAASNSRSQIKAALDNAGKIMSLTKTAPDYLVGQQPVEDISSNRIYKILELNGYDPQYAASVFLGWA
TKKFGKRNTIWLFGPATTGKTNIAEAIAHTVPFYGCVNWTNENFPFNDCVDKMVIWWEEGKMTAKVVESAKAILGGSKVR
VDQKCKSSAQIDPTPVIVTSNTNMCAVIDGNSTTFEHQQPLQDRMFKFELTRRLDHDFGKVTKQEVKDFFRWAKDHVVEV
EHEFYVKKGGAKKRPAPSDADISEPKRVRESVAQPSTSDAEASINYADRLARGHSL
>Q89268 ~~~~~~Protein Rep78~~~
MPGFYEIVIKVPSDLDGHLPGISDSFVNWVAEKEWELPPDSDMDLNLIEQAPLTVAEKLQRDFLTEWRRVSKAPEALFFV
QFEKGESYFHMHVLVETTGVKSMVLGRFLSQIREKLIQRIYRGIEPTLPNWFAVTKTRNGAGGGNKVVDECYIPNYLLPK
TQPELQWAWTNMEQYLSACLNLTERKRLVAQHLTHVSQTQEQNKENQNPNSDAPVIRSKTSARYMELVGWLVDKGITSEK
QWIQEDQASYISFNAASNSRSQIKAALDNAGKIMSLTKTAPDYLVGQQPVEDISSNRIYKILELNGYDPQYAASVFLGWA
TKKFGKRNTIWLFGPATTGKTNIAEAIAHTVPFYGCVNWTNENFPFNDCVDKMVIWWEEGKMTAKVVESAKAILGGSKVR
VDQKCKSSAQIDPTPVIVTSNTNMCAVIDGNSTTFEHQQPLQDRMFKFELTRRLDHDFGKVTKQEVKDFFRWAKDHVVEV
EHEFYVKKGGAKKRPAPSDADISEPKRVRESVAQPSTSDAEASINYADRYQNKCSRHVGMNLMLFPCRQCERMNQNSNIC
FTHGQKDCLECFPVSESQPVSVVKKAYQKLCYIHHIMGKVPDACTACDLVNVDLDDCIFEQ
>O39521 3.1.21.-~~~C1~~~Replication-associated protein A~~~
MPSASKNFRLQSKYVFLTYPKCSSQRDDLFQFLWEKLTPFLIFFLGVASELHQDGTTHYHALIQLDKKPCIRDPSFFDFE
GNHPNIQPARNSKQVLDYISKDGDIKTRGDFRDHKVSPRKSDARWRTIIQTATSKEEYLDMIKEEFPHEWATKLQWLEYS
ANKLFPPQPEQYVSPFTESDLRCHEDLHNWRETHLYHVSIDAYTFIHPVSYDQAQSDLEWMADLTRMREGLGSDTPASTS
ADQLVPERPPGLEVSGDTTTGTGPSTSPTTMNTPPIISSTTSPSSSSHCGSN
>P03631 3.1.21.-~~~A~~~Replication-associated protein A~~~
MVRSYYPSECHADYFDFERIEALKPAIEACGISTLSQSPMLGFHKQMDNRIKLLEEILSFRMQGVEFDNGDMYVDGHKAA
SDVRDEFVSVTEKLMDELAQCYNVLPQLDINNTIDHRPEGDEKWFLENEKTVTQFCRKLAAERPLKDIRDEYNYPKKKGI
KDECSRLLEASTMKSRRGFAIQRLMNAMRQAHADGWFIVFDTLTLADDRLEAFYDNPNALRDYFRDIGRMVLAAEGRKAN
DSHADCYQYFCVPEYGTANGRLHFHAVHFMRTLPTGSVDPNFGRRVRNRRQLNSLQNTWPYGYSMPIAVRYTQDAFSRSG
WLWPVDAKGEPLKATSYMAVGFYVAKYVNKKSDMDLAAKGLGAKEWNNSLKTKLSLLPKKLFRIRMSRNFGMKMLTMTNL
STECLIQLTKLGYDATPFNQILKQNAKREMRLRLGKVTVADVLAAQPVTTNLLKFMRASIKMIGVSNLQSFIASMTQKLT
LSDISDESKNYLDKAGITTACLRIKSKWTAGGK
>P14980 3.1.21.-~~~C1~~~Replication-associated protein A~~~
MASSSSNRQFSHRNANTFLTYPKCPENPEIACQMIWELVVRWIPKYILCAREAHKDGSLHLHALLQTEKPVRISDSRFFD
INGFHPNIQSAKSVNRVRDYILKEPLAVFERGTFIPRKSPFLGKSDSEVKEKKPSKDEIMRDIISHATSKEEYLSMIQKE
LPFDWSTKLQYFEYSANKLFPEIQEEFTNPHPPSSPDLLCNESINDWLQPNIFQVSPEAYMLLQPTCYTLEDAISDLQWM
DSVSSHQMKDQESRASTSSAQQEPENLLGPEA
>P14990 3.1.21.-~~~C1~~~Replication-associated protein A~~~
MASSSSNRQFSHRNANTFLTYPKCPENPEIACQMIWELVVRWIPKYILCAREAHKDGSLHLHALLQTEKPVRISDSRFFD
INGFHPNIQSAKSVNRVRDYILKEPLAVFERGTFIPRKSPFLGKSDSEVKEKKPSKDEIMRDIISHATSKAEYLSMIQKE
LPFDWSTKLQYFEYSANKLFPEIQEEFTNPHPPSSPDLLCNESINDWLQPNIFQVSPEAYMLLQPTCYTLEDAISDLQWM
DSVSSHQMKDQESRASTSSAQQEPENLLGPEA
>P06847 3.1.21.-~~~C1~~~Replication-associated protein A~~~
MASSSAPRFRVYSKYLFLTYPECTLEPQYALDSLRTLLNKYEPLYIAAVRELHEDGSPHLHVLVQNKLRASITNPNALNL
RMDTSPFSIFHPNIQAAKDCNQVRDYITKEVDSDVNTAEWGTFVAVSTPGRKDRDADMKQIIESSSSREEFLSMVCNRFP
FEWSIRLKDFEYTARHLFPDPVATYTPEFPTESLICHETIESWKNEHLYSVSLESYILCTSTPADQAQSDLEWMDDYSRS
HRGGISPSTSAGQPEQERLPGQGL
>P07040 ~~~repc~~~Repressor c protein~~~
MMSFEIKEWFNAKELEGMPGVPKLATNITRKAVAEDWVKRQRHGGKGVAYEYHINSLPEETRRAIKGASLSDKPVHTSIV
HTVDERLIYAMSFLTPDEQAAAVEIIRVAGIKGLMPTIVSKDKALEALGITVEQQKTLQTLQALPPEKVREILSQYEGKE
HNFPVRENDVKKAV
>P06019 ~~~repc~~~Repressor protein c~~~
MKSNFIEKNNTEKSIWCSPQEIMAADGMPGSVAGVHYRANVQGWTKQKKEGVKGGKAVEYDVMSMPTKEREQVIAHLGLS
TPDTGAQANEKQDSSELINKLTTTLINMIEELEPDEARKALKLLSKGGLLALMPLVFNEQKLYSFIGFSQQSIQTLMMLD
ALPEEKRKEILSKYGIHEQESVVVPSQEPQEVKKAV
>Q9YUD3 2.7.7.-~~~Rep~~~Replication-associated protein~~~
MPSKEGSGCRRWCFTLNNPTDGEIEFVRSLGPDEFYYAIVGREKGEQGTPHLQGYFHFKNKKRLSALKKLLPRAHFERAK
GSDADNEKYCSKEGDVILTLGIVARDGHRAFDGAVAAVMSGRKMKEVAREFPEVYVRHGRGLHNLSLLVGSSPRDFKTEV
DVIYGPPGCGKSRWANEQPGTKYYKMRGEWWDGYDGEDVVVLDDFYGWLPYCEMLRLCDRYPHKVPVKGAFVEFTSKRII
ITSNKPPETWYKEDCDPKPLFRRFTRVWWYNVDKLEQVRPDFLAHPINY
>P0CK40 2.7.7.-~~~AC1~~~Replication-associated protein~~~
MPPPQRFRVQSKNYFLTYPRCTIPKEEALSQLQKIHTTTNKKFIKVCEERHDNGEPHLHALIQFEGKFICTNKRLFDLVS
TTRSAHFHPNIQGAKSSSDVKEYIDKDGVTIEWGQFQVDGRSARGGQQSANDSYAKALNADSIESALTILKEEQPKDYVL
QNHNIRSNLERIFFKVPEPWVPPFPLSSFVNIPVVMQDWVDDYFGRGSAARPERPISIIVEGDSRTGKTMWARALGPHNY
LSGHLDFNSRVYSNSVEYNVIDDISPNYLKLKHWKELIGAQKDWQSNCKYGKPVQIKGGIPSIVLCNPGEGSSYKDFLNK
EENRALHNWTIHNAIFVTLTAPLYQSTAQDCQT
>P69546 3.1.21.-~~~II~~~Replication-associated protein G2P~~~
MIDMLVLRLPFIDSLVCSRLSGNDLIAFVDLSKIATLSGMNLSARTVEYHIDGDLTVSGLSHPFESLPTHYSGIAFKIYE
GSKNFYPCVEIKASPAKVLQGHNVFGTTDLALCSEALLLNFANSLPCLYDLLDVNATTISRIDATFSARAPNENIAKQVI
DHLRNVSNGQTKSTRSQNWESTVTWNETSRHRTLVAYLKHVELQHQIQQLSSKPSAKMTSYQKEQLKVLSNPDLLEFASG
LVRFEARIETRYLKSFGLPLNLFDAIRFASDYNSQGKDLIFDLWSFSFSELFKAFEGDSMNIYDDSAVLDAIQSKHFTIT
PSGKTSFAKASRYFGFYRRLVNEGYDSVALTMPRNSFWRYVSALVECGIPKSQLMNLSTCNNVVPLVRFINVDFSSQRPD
WYNEPVLKIA
>Q82676 2.7.7.-~~~AC1~~~Replication-associated protein~~~
MSPPKRFQINAKNYFLTYPRCSLTKEEALSQIRNFQTPTNPKFIKICRELHENGEPHLHVLIQFEGKYKCQNQRFFDLVS
PTRSAHFHPNIQGAKSSSDVKSYIDKDGDTWRWGTFQIDGRSARGGQQSANDAYAAALNSGSKSEALKILRELAPRDYLR
DFHHISSNLDRIFTKPPPPYENPFPLSSFDRVPEELDEWFHENVMGRARPLRPKSIVIEGDSRTGKTMWSRALGPHNYLC
GHLDLSPKVYNNDAWYNVIDDVDPHYLKHFKRIHGGPEDWQSNTKYGKPVQIKGGIPTIFLCNPGPNSSYKEFLDEEKNS
ALKAWALKNATFISLEGPLYSGTNQGPTQSC
>Q9YPS2 2.7.7.-~~~AC1~~~Replication-associated protein~~~
MPRLGRFAINAKNYFLTYPRCPLTKEDVLEQLLALSTPVNKKFIRVCRELHEDGEPHLHVLLQFEGKLQTKNERFFDLVS
PTRSTHYHPNIQAAKSASDVKSYMDKDGDVLDHGSFQVDGRSARGGKQSANDAYAEALNSGSKLQALNILREKAPKDYIL
QFHNLNCNLSRIFADDVPPYVSPYSLSAFDKVPSYISSWASENVRDSCAPERPISIVIEGDSRTGKTMWARALGPHNYLC
GHLDLNSKIYSNDAWYNVIDDVDPHYLKHFKEFMGAQRDWQSNVKYGKPTHIKGGIPTIFLCNPGPKSSYKEYLDEPDNT
ALKLWASKNAEFYTLKEPLFSSVDQGATQGCQEASNSTLSN
>Q8BB16 2.7.7.-~~~Rep~~~Replication-associated protein~~~
MPSKKNGRSGPQPHKRWVFTLNNPSEDERKKIREPPISLFDYFIVGEEGNEEGRTPHLQGFANFVKKQTFNKVKWYLGAR
CHIEKAKGTDQQNKEYCSKEGNLLIECGAPRSQGQRSDLSTAVSTLLESGSLVTVAEQHPVTFVRNFRGLAELLKVSGKM
QKRDWKTNVHVIVGPPGCGKSKWAANFADPETTYWKPPRNKWWDGYHGEEVVVIDDFYGWLPWDDLLRLCDRYPLTVETK
GGTVPFLARSILITSNQTPLEWYSSTAVPAVEALYRRITSLVFWKNATEQSTEEGGQFVTLSPPCPEFPYEINY
>P03567 2.7.7.-~~~AC1~~~Replication-associated protein~~~
MPSHPKRFQINAKNYFLTYPQCSLSKEESLSQLQALNTPINKKFIKICRELHEDGQPHLHVLIQFEGKYCCQNQRFFDLV
SPTRSAHFHPNIQRAKSSSDVKTYIDKDGDTLVWGEFQVDGRSARGGCQTSNDAAAEALNASSKEEALQIIREKIPEKYL
FQFHNLNSNLDRIFDKTPEPWLPPFHVSSFTNVPDEMRQWAENYFGKSSAARPERPISIIIEGDSRTGKTMWARSLGPHN
YLSGHLDLNSRVYSNKVEYNVIDDVTPQYLKLKHWKELIGAQRDWQTNCKYGKPVQIKGGIPSIVLCNPGEGASYKVFLD
KEENTPLKNWTFHNAKFVFLNSPLYQSSTQSS
>P36279 2.7.7.-~~~C1~~~Replication-associated protein~~~
MTRPKSFRINAKNYFLTYPKCSLTKEEALSQLNNLETPTSKKYIKVCRELHENGEPHLHVLIQFEGKFQCKNQRFFDLVS
PTRSAHFHPNIQGAKSSSDVKSYLEKDGDTLEWGEFQIDGRSARGGQQSANDAYAQALNTGSKSEALNVLRELAPKDYVL
QFHNLNSNLDRIFTPPLEVYVSPFLSSSFDRVPEELEEWVAENVKDAAARPLRPISIVIEGESRTGKTVWARSLGPHNYL
CGHLDLSPKVYSNDAWYNVIDDVDPHYLKHFKEFMGAQRDWQSNTKYGKPVQIKGGIPTIFLCNPGPNSSYKEYLDEEKN
SALKAWALKNAEFITLNEPLYSGTYQGPTQNSEEEVHPEEEN
>Q67620 2.7.7.-~~~C1~~~Replication-associated protein~~~
MAQPKRFQINAKHYFLTFPKCSLSKEEALEQLLQLQTPTNKKYIKICRELHEDGQPHLHMLIQFEGKFNCKNNRFFDLVS
PTRSAHFHPNIQGAKSSSDVKSYIDKDGDVLEWGTFQIDGRSARGGQQTANDAYAKAINAGRKSEALDVIKELAPRDYIL
HFHNINSNLNMVFQVPPAPYVSPFLSSSFDQVPDELEHWVSENVMDVAARPWRPVSIVIEGDSRTGKTMWARSLGPHNYL
CGHLDLSQKVYSNNAWYNVIDDVDPHYLKHFKEFMGSQRDWQSNTKYGKPIQIKGGIPTIFLCNPGPQSSFKEYLDEEKN
QTLKNWAIKNAIFVTIHQPLFTNTNQDPTPHRQEETSEA
>P27260 2.7.7.-~~~C1~~~Replication-associated protein~~~
MPRSGRFSIKAKNYFLTYPKCDLTKENALSQITNLQTPTNKLFIKICRELHENGEPHLHILIQFEGKYNCTNQRFFDLVS
PTRSAHFHPNIQGAKSSSDVKSYIDKDGDVLEWGTFQIDGRSARGGQQTANDAYAKAINAGSKSQALDVIKELAPRDYVL
HFHNINSNLDKVFQVPPAPYVSPFLSSSFDQVPDELEHWVSENVMDAAARPWRPVSIVIEGDSRTGKTTWARSLGPHNYL
CGHLDLSQKVYSNNAWYNVIDDVDPHYLKHFKEFMGAQRDWQSNTKYGKPIQIKGGIPTIFLCNPGPQSSFKEYLDEEKN
QALKNWATKNAIFVTIHQPLFADTNQNTTSHRQEEASEA
>P27259 2.7.7.-~~~C1~~~Replication-associated protein~~~
MPRLFKIYAKNYFLTYPNCSLSKEEALSQLKKLETPTNKKYIKVCKELHENGEPHLHVLIQFEGKYQCKNQRFFDLVSPN
RSAHFHPNIQAAKSSTDVKTYVEKDGNFIDFGVSQIDGRSARGGQQSANDAYAEALNSGSKSEALNILKEKAPKDYILQF
HNLSSNLDRIFSPPLEVYVSPFLSSSFNQVPDELEEWVAENVVYSAARPWRPISIVIEGDSRTGKTMWARSLGPHNYLCG
HLDLSPKVYSNDAWYNVIDDVDPHYLKHFKEFMGAQRDWQSNTKYGKPIQIKGGIPTIFLCNPGPTSSYREYLDEEKNIS
LKNWALKNATFVTLYEPLFASINQGPTQDSQEETNKA
>Q67622 2.7.7.-~~~C1/C2~~~Replication-associated protein~~~
MASSSAPRFRVYSKYLFLTYPECTLEPQYALDSLRTLLNKYEPLYIAAVRELHEDGSPHLHVLVQNKLRASITNPNALNL
RMDTSPFSIFHPNIQAAKDCNQVRDYITKEVDSDVNTAEWGTFVAVSTPGRKDRDADMKQIIESSSSREEFLSMVCNRFP
FEWSIRLKDFEYTARHLFPDPVATYTPEFPTESLICHETIESWKNEHLYSESPGRHKSIYICGPTRTGKTSWARSLGTHN
YYNSLVDFTTYDVNAKYNIIDDIPFKFTPNWKCFVGAQRDFTVNPKYGKRKVIRGGIPCIILVNPDEDWLKDMTPEQSDY
MYSNTVVHYMYEGETFINYSFASGEDVTASQ
>P24097 ~~~rev~~~Protein Rev~~~
MDQDLDRAERGERGGGSEELLQEEINEGRLTAREALQTWINNDSPRYVKKLRQGQPELPTSPGGGGGRGHRARKLPGERR
PGFWKSLRELVEQNRRKQERRLSGLDRRIQQLEDLVRHMSLGSPDPSTPSASVLSVNPPAQTPLGHLPPRSYFKLKRVDC
GAGWDLRTTAAPGLPICELDWIQGTK
>P31628 ~~~rev~~~Protein Rev~~~
MDAGARQIRFTGEKNWMEVTMEEEEKGKRKGCIERQQDIQDLKYPNLPAGHSHHGNKSRRRRRQSGFWRWLRGIRRQRDK
PKGDSEKGLGSCVGALAELTLEEAMAEEPADAASPTADDGHLDKWTAWRLPQK
>P33460 ~~~rev~~~Protein Rev~~~
MDAGARYMRLTGKENWVEVTMDGEKERKREGFTAGQQDIQNSKYPDIPTGHSHHGNKSRRRRRKSGFWRWLRGIRQQRNK
RKSDSTESLEPCLGALAELTLEGAMEKGPAEAARPSADDGNLDKWMAWRTPQK
>P11305 ~~~rev~~~Protein Rev~~~
MAESKEARDQEMNLKEESKEEKRRNDWWKIDPQGPLESDQWCRVLRQSLPEEKIPSQTCIARRHLGPGPTQHTPSRRDRW
IRGQILQAEVLQERLEWRIRGVQQAAKELGEVNRGIWRELYFREDQRGDFSAWGGYQRAQERLWGEQSSPRVLRPGDSKR
RRKHL
>P20919 ~~~rev~~~Protein Rev~~~
MAESKEARDQEMNLKEESKEEKRRNDWWKKDPQGPLESDQWCRVLRQSLPEEKIPSQTCIARRHLGPGPTQHTPSRRDRW
IRGQILQTEVLQERLEWRIRGVQQAAKELGEVNRGIWRELYFREDQRGDFSAWGGYQRAQERLWGEQSSPRVLRPGDSKR
RRKHL
>P19032 ~~~rev~~~Probable protein Rev~~~
MAEGFAANRQWIGPEEAEELLDFDKATQMNEEGPLNPGVNPFRVPAVTEADKQEYCKILQPRLQEIRNEIQEVKLEEGNA
GKMKKKRQRRRRKKKAFKKMMTDLEDRFRKLFGSPSKDEYTEIEIEEDPPKKEKRVDWDEYWDPEEIERMLMD
>P04325 ~~~rev~~~Protein Rev~~~
MAGRSGDSDEELLKAVRLIKFLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILSTYLGRSAEPVPLQLPPLER
LTLDCNEDCGTSGTQGVGSPQILVESPTILESGAKE
>P04616 ~~~rev~~~Protein Rev~~~
MAGRSGDSDEDLLKAVRLIKFLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILSTYLGRSAEPVPLQLPPLER
LTLDCNEDCGTSGTQGVGSPQILVESPTVLESGAKE
>P04620 ~~~rev~~~Protein Rev~~~
MAGRSGDSDEDLLKAVRLIKFLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILSTYLGRSAEPVPLQLPPLER
LTLDCNEDCGTSGTQGVGSPQILVESPTVLESGTKE
>P04618 ~~~rev~~~Protein Rev~~~
MAGRSGDSDEELIRTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILGTYLGRSAEPVPLQLPPLER
LTLDCNEDCGTSGTQGVGSPQILVESPTVLESGTKE
>P69718 ~~~rev~~~Protein Rev~~~
MAGRSGDSDEDLLKAVRLIKFLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILSTYLGRSAEPVPLQLPPLER
LTLDCNEDCGTSGTQGVGSPQILVESPTILESGAKE
>P04619 ~~~rev~~~Protein Rev~~~
MAGRSGDRDEDLLKAVRLIKILYQSNPPPSPEGTRQARRNRRRRWRARQRQIHSIGERILSTYLGRSEEPVPLQLPPLER
LNLNCSEDCGTSGTQGVGSPQISVESPTVLESGTEE
>P21280 ~~~rev~~~Protein Rev~~~
MASKESKPSRTTWRDMEPPLRETWNQVLQELVKRQQQEEEEQQGLVSGLQASKADQIYTGNSGDRSTGGIGGKTKKKRGW
YKWLRKLRAREKNIPSQFYPDMESNMVGMENLTLETQLEDNALYNPATHIGDMAMDGREWMEWRESAQKEKRKGGLSGQR
TNAYPGK
>P03759 ~~~rexB~~~Protein rexB~~~
MRNRIMPGVYIVIIPYVIVSICYLLFRHYIPGVSFSAHRDGLGATLSSYAGTMIAILIAALTFLIGSRTRRLAKIREYGY
MTSVVIVYALSFVELGALFFCGLLLLSSISGYMIPTIAIGIASASFIHICILVFQLYNLTREQE
>P0C205 ~~~~~~Protein Rex~~~
MPKTRRRPRRSQRKRPPTPWPTSQGLDRVFFSDTQSTCLETVYKATGAPSLGDYVRPAYIVTPYWPPVQSIRSPGTPSMD
ALSAQLYSSLSLDSPPSPPREPLRPSRSLPRQSLIQPPTFHPPSSRPCANTPPSEMDTWNPPLGSTSQPCLFQTPDSGPK
TCTPSGEAPLSACTSTSFPPPSPGPSCPT
>P0C206 ~~~~~~Protein Rex~~~
MPKTRRRPRRSQRKRPPTPWPTSQGLDRVFFSDTQSTCLETVYKATGAPSLGDYVRPAYIVTPYWPPVQSIRSPGTPSMD
ALSAQLYSSLSLDSPPSPPREPLRPLRSLPRQSLIQPPTFHPPSSRPCANTPPSEMDTWNPPLGSTSQPCLFQTPDSGPK
TCTPSGEAPLSACTSTSFPPPSPGPSCPM
>P0C207 ~~~~~~Protein Rex~~~
MPKTRRRPRRSQRKRPPTPWPTSQGLDRVFFSDTQSTCLETVYRATGAPSLGDYVRPVYIVTPYWPPVQSIRSPGTPSMD
ALSAQLYSSLSLDSPPSPPREPLRPSRSLPRRPPIQPPTFHPPSSRPCANTPPSETDTWNPPLGSTSQPCLFQTPASGPK
TCTPSGEAPLSACTSTSFPPPSPGPSCPT
>P0C208 ~~~~~~Protein Rex~~~
MPKTRRGPRRSQRKRPPTPWPTSQGLDKVFFTDIQSTCLETVYKATGAPSLGDYVRPAYIVTPYWPPVQSIRSPRTPSMD
ALSAQLYSSLSLGSPPSPPREPLKPSRSLPHRPLIQPPTFHPPSSRPYANTPPSEMGAWSPPLGSSSQACPSPTPASGPK
TCTPSGEAPSSACTSISFPPPSPGPSCPR
>Q85601 ~~~~~~Protein Rex~~~
MPKTRRQRTRRARRNRPPTPWPISQDLDRASYMDTPSTCLAIVYRPIGVPSQVVYVPPAYIDMPSWPPVQSTNSPGTPSM
DALSALLSNTLSLASPPSPPREPQGPSRSLPLPPLLSPPRFHLPSFNQCESTPPTEMDAWNQPSGISSPPSPSPNLASVP
KTSTPPGEKP
>A0A385DT91 ~~~~~~Ring protein 1~~~
MVNNINWVKLPVILDRLLRHPLLTDLNLETAIQYTLDFISAMGLPNVYVDKIETIDIKEYRGELPCDLISINQVRLHKNG
IALRAMTDNFNAYPTHDHKEGDWYERGEPSFKTQGRVIFTSIKHEKVDISYKAIMLDDEGLPLIPDNPIFLKTLELYIKK
EWFTILFDMGKISPAVLNNTQQEYAFKAGQCNNEFVIPSVSEMEAITNMWNQLIPRVTEFRRGFKNLGDKEYIRVH
>A0A385DT87 ~~~~~~Ring protein 2~~~
MTYNELIYMVLDELKLSSDDSYYTPDHVIFLLVKYRSFLLKQRYSDIKKQIPDSDYQSICLDLIEVPAISGEPCEGSSYL
RSKNKVPTTMMIGNPRVYPMDFYQGEITYISRDRMRYVGYNKFLRNIIYCSKAPDGYLYFKSWNPQFLHLEKVSFNAIFE
DAKEASEMACPEENGTICKLEDKEFPIEDALVPPLIELVVKELRGPEYSPKDEDNNAKDDLPDAR
>A0A385DV73 ~~~~~~Ring protein 3~~~
MTNKEFSDGFSTLLNSFGITPNITLDEYEKSTFLTNAQEQLIIDIYSGRNIIYGKSFEQTEEIRRYLSNLVETYETSTKV
TGKLGLSKDSVFFEIPQDTWFITYEVAFLKDSRLGCLDGIEASVVPLPQDDLYRAKDNPFRGPSKDRVLRLDIKSDLAEL
ISKYNVDKYLMRYISQPTPIILVDLPDGLSINGVSTESECELNPVVHRAILERAVQLAIISKTQLTGNKE
>A0A385DVC3 ~~~~~~Ring protein 4/5~~~
MNVNEFSNEFDVLYNNIMSNAAPGLNEYEKSVLLTKAQEEIVKNYFEPAGNKYGKGLDDSPKRQIDFSELIKVGEGVLNT
SAPTITFDKRAKVYDLPADLFLVINEAVDTNAGTKQIVPISYSDYTRLMSRPYKEPVKYQAWRIITTSINNISVELIVNS
NETITDYKVRYIRRPAPIITTNLSSEYGDVTINGVSTVSECELNPIIHSEILQRAVELAKAAYQGDLQASVELGQRSE
>Q3V4R7 ~~~~~~RNAP inhibitory protein~~~
MKNMLHPQKYETHVLDDLMEFYEGVIGYPEIDLRLAGEEAWLKGVNPELAEAVKKIIKTIRRYLEGSPYDGSEKPIPRYI
IAEIFSQIAPEVQLLVNALDTEGKYGFLKHIKKLNLNSLAMLSKNYNENDKLWKELENEGYVYLE
>P42491 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
METFFIETLASDVYGKALNVDLDRLSQAQVKYTLQELISYCSALTILHYDYSTLAARLSVYQLHQSTASSFSKAVRLQAA
QSCSRLSPQFVDVVYKYKAIFDSYIDYNRDYKLSLLGIETMKNSYLLKNKDGVIMERPQDAYMRVAIMIYGMGKVVNIKM
ILLTYDLLSRHVITHASPTMFNAGTKKPQLSSCFLLNVNDNLENLYDMVKTAGIISGGGGGIGLCLSGIRAKNSFISGSG
LRSNGIQNYIMLQNASQCYANQGGLRPGAYAVYLELWHQDIFTFLQMPRLKGQMAEQRLNAPNLKYGLWVPDLFMEILED
QIHNRGDGTWYLFSPDQAPNLHKVFDLERSQHENAHREFKKLYYQYVAEKRYTGVTTAKEIIKEWFKTVIQVGNPYIGFK
DAINRKSNLSHVGTITNSNLCIEVTIPCWEGDKAEQGVCNLAAVNLAAFIRENGYDYRGLIEASGNVTENLDNIIDNGYY
PTEATRRSNMRHRPIGIGVFGLADVFASLKMKFGSPEAIAMDEAIHAALYYGAMRRSIELAKEKGSHPSFPGSAASKGLL
QPDLWVRCGDLSSSWEERVAQTTQGVLTRKSWWQLRLAAMQGVRNGYLTALMPTATSSNSTGKNECFEPFTSNLYTRRTL
SGEFIVLNKYLIDDLKEIDLWTEAIQQQLLNAGGSIQHILDIPAEIRDRYKTSREMNQKILTKHAAARNPFVSQSMSLNY
YFYEPELSQVLTVLVLGWKKGLTTGSYYCHFSPGAGTQKKIIRNSEKACNADCEACLL
>P03190 1.17.4.1~~~RIR1~~~Ribonucleoside-diphosphate reductase large subunit~~~
MATTSHVEHELLSKLIDELKVKANSDPEADVLAGRLLHRLKAESVTHTVAEYLEVFSDKFYDEEFFQMHRDELETRVSAF
AQSPAYERIVSSGYLSALRYYDTYLYVGRSGKQESVQHFYMRLAGFCASTTCLYAGLRAALQRARPEIESDMEVFDYYFE
HLTSQTVCCSTPFMRFAGVENSTLASCILTTPDLSSEWDVTQALYRHLGRYLFQRAGVGVGVTGAGQDGKHISLLMRMIN
SHVEYHNYGCKRPVSVAAYMEPWHSQIFKFLETKLPENHERCPGIFTGLFVPELFFKLFRDTPWSDWYLFDPKDAGDLER
LYGEEFEREYYRLVTAGKFCGRVSIKSLMFSIVNCAVKAGSPFILLKEACNAHFWRDLQGEAMNAANLCAEVLQPSRKSV
ATCNLANICLPRCLVNAPLAVRAQRADTQGDELLLALPRLSVTLPGEGAVGDGFSLARLRDATQCATFVVACSILQGSPT
YDSRDMASMGLGVQGLADVFADLGWQYTDPPSRSLNKEIFEHMYFTALCTSSLIGLHTRKIFPGFKQSKYAGGWFHWHDW
AGTDLSIPREIWSRLSERIVRDGLFNSQFIALMPTSGCAQVTGCSDAFYPFYANASTKVTNKEEALRPNRSFWRHVRLDD
REALNLVGGRVSCLPEALRQRYLRFQTAFDYNQEDLIQMSRDRAPFVDQSQSHSLFLREEDAARASTLANLLVRSYELGL
KTIMYYCRIEKAADLGVMECKASAALSVPREEQNERSPAEQMPPRPMEPAQVAGPVDIMSKGPGEGPGGWCVPGGLEVCY
KYRQLFSEDDLLETDGFTERACESCQ
>P08543 1.17.4.1~~~RIR1~~~Ribonucleoside-diphosphate reductase large subunit~~~
MASRPAASSPVEARAPVGGQEAGGPSAATQGEAAGAPLAHGHHVYCQRVNGVMVLSDKTPGSASYRISDNNFVQCGSNCT
MIIDGDVVRGRPQDPGAAASPAPFVAVTNIGAGSDGGTAVVAFGGTPRRSAGTSTGTQTADVPTEALGGPPPPPRFTLGG
GCCSCRDTRRRSAVFGGEGDPVGPAEFVSDDRSSDSDSDDSEDTDSETLSHASSDVSGGATYDDALDSDSSSDDSLQIDG
PVCRPWSNDTAPLDVCPGTPGPGADAGGPSAVDPHAPTPEAGAGLAADPAVARDDAEGLSDPRPRLGTGTAYPVPLELTP
ENAEAVARFLGDAVNREPALMLEYFCRCAREETKRVPPRTFGSPPRLTEDDFGLLNYALVEMQRLCLDVPPVPPNAYMPY
YLREYVTRLVNGFKPLVSRSARLYRILGVLVHLRIRTREASFEEWLRSKEVALDFGLTERLREHEAQLVILAQALDHYDC
LIHSTPHTLVERGLQSALKYEEFYLKRFGGHYMESVFQMYTRIAGFLACRATRGMRHIALGREGSWWEMFKFFFHRLYDH
QIVPSTPAMLNLGTRNYYTSSCYLVNPQATTNKATLRAITSNVSAILARNGGIGLCVQAFNDSGPGTASVMPALKVLDSL
VAAHNKESARPTGACVYLEPWHTDVRAVLRMKGVLAGEEAQRCDNIFSALWMPDLFFKRLIRHLDGEKNVTWTLFDRDTS
MSLADFHGEEFEKLYQHLEVMGFGEQIPIQELAYGIVRSAATTGSPFVMFKDAVNRHYIYDTQGAAIAGSNLCTEIVHPA
SKRSSGVCNLGSVNLARCVSRQTFDFGRLRDAVQACVLMVNIMIDSTLQPTPQCTRGNDNLRSMGIGMQGLHTACLKLGL
DLESAEFQDLNKHIAEVMLLSAMKTSNALCVRGARPFNHFKRSMYRAGRFHWERFPDARPRYEGEWEMLRQSMMKHGLRN
SQFVALMPTAASAQISDVSEGFAPLFTNLFSKVTRDGETLRPNTLLLKELERTFSGKRLLEVMDSLDAKQWSVAQALPCL
EPTHPLRRFKTAFDYDQKLLIDLCADRAPYVDHSQSMTLYVTEKADGTLPASTLVRLLVHAYKRGLKTGMYYCKVRKATN
SGVFGGDDNIVCMSCAL
>P89462 1.17.4.1~~~RIR1~~~Ribonucleoside-diphosphate reductase large subunit~~~
MANRPAASALAGARSPSERQEPREPEVAPPGGDHVFCRKVSGVMVLSSDPPGPAAYRISDSSFVQCGSNCSMIIDGDVAR
GHLRDLEGATSTGAFVAISNVAAGGDGRTAVVALGGTSGPSATTSVGTQTSGEFLHGNPRTPEPQGPQAVPPPPPPPFPW
GHECCARRDARGGAEKDVGAAESWSDGPSSDSETEDSDSSDEDTGSETLSRSSSIWAAGATDDDDSDSDSRSDDSVQPDV
VVRRRWSDGPAPVAFPKPRRPGDSPGNPGLGAGTGPGSATDPRASADSDSAAHAAAPQADVAPVLDSQPTVGTDPGYPVP
LELTPENAEAVARFLGDAVDREPALMLEYFCRCAREESKRVPPRTFGSAPRLTEDDFGLLNYALAEMRRLCLDLPPVPPN
AYTPYHLREYATRLVNGFKPLVRRSARLYRILGVLVHLRIRTREASFEEWMRSKEVDLDFGLTERLREHEAQLMILAQAL
NPYDCLIHSTPNTLVERGLQSALKYEEFYLKRFGGHYMESVFQMYTRIAGFLACRATRGMRHIALGRQGSWWEMFKFFFH
RLYDHQIVPSTPAMLNLGTRNYYTSSCYLVNPQATTNQATLRAITGNVSAILARNGGIGLCMQAFNDASPGTASIMPALK
VLDSLVAAHNKQSTRPTGACVYLEPWHSDVRAVLRMKGVLAGEEAQRCDNIFSALWMPDLFFKRLIRHLDGEKNVTWSLF
DRDTSMSLADFHGEEFEKLYEHLEAMGFGETIPIQDLAYAIVRSAATTGSPFIMFKDAVNRHYIYDTQGAAIAGSNLCTE
IVHPASKRSSGVCNLGSVNLARCVSRQTFDFGRLRDAVQACVLMVNIMIDSTLQPTPQCTRGNDNLRSMGIGMQGLHTAC
LKMGLDLESAEFRDLNTHIAEVMLLAAMKTSNALCVRGARPFSHFKRSMYRAGRFHWERFSNASPRYEGEWEMLRQSMMK
HGLRNSQFIALMPTAASAQISDVSEGFAPLFTNLFSKVTRDGETLRPNTLLLKELERTFGGKRLLDAMDGLEAKQWSVAQ
ALPCLDPAHPLRRFKTAFDYDQELLIDLCADRAPYVDHSQSMTLYVTEKADGTLPASTLVRLLVHAYKRGLKTGMYYCKV
RKATNSGVFAGDDNIVCTSCAL
>P52343 ~~~RIR1~~~Ribonucleoside-diphosphate reductase large subunit-like protein~~~
MKRKERRINKDYGYNRKCVCHYEASQKRFCYSQYSCASVLYERVRDIAKIIDRLDSGLDAWCLRDAIISVLRATHCVPRV
DRMLGRWYLKTSVFYDFCPDDLILSCPNVIMPNVLNFVKKYRDFIRSVLYKVSVSWKNQYMPGVLAASRFLEEISNSLNG
VEESIPCIYLRMCATLTEIVLRIGYLREIYQENPYVMFEELAFSLFTQKWVLPFSCMTNLGLVEKANSTVFDVAIYNTCL
YSLVDFTIVNGEHLFPALLNGSNISMNLTRYQQEAKNIFEILLSQIRVVERDTDKTVQLTVYVEVWHVSALMWLDLYEAL
PQTSRVTFCLIIPGIFMDRYELKRAQWSLFHKNIAFELGKCDEITFSTKYLEFERTTDHAKITMSSFIEKICLCLKGGRM
GLIFRKNVYQYSMIPHVPLYCGGDFLDVLPVRDGINTCMRMLLNVVHFLGDEVSDELTEEIDFVRLQCKFFMFNELRRVV
RKMVLVANAVIDYAVENKDFLCEGIEDGRSLGICVTGLHSVFMTVGLSYAHPDARRLYRMICEHIYYTCVRTSVDCCMKG
AEPCNLFDRSKYALGMLYFDHFDNVECTLPEELWTTLRKDVLMHGVRNIHFTAGTAMQKEFDIINSSESFWPMEDNKILR
RSNIKVVIGKDGLNDVTSVYSSELKSLYIPVYNNLLLNRFNKHQQYLKTVGYRVLNVDTNLFTDKELDDLAVFKDGFSYH
LNDLIEMYKSGLPFLDQGQANVFYFNDTVSLRLLLPLLYKAGFKVAMYKVLCNSEMYKHLDLSNPLPLIGKCSDGVVMHV
KNIL
>Q06A28 ~~~RIR1~~~Ribonucleoside-diphosphate reductase large subunit-like protein~~~
MDRQPKVYSDPDNGFFFLDVPMPDDGQGGQQTATTAAGGAFGVGGGHSVPYVRIMNGVSGIQIGNHNAMSIASCWSPSYT
DRRRRSYPKTATNAAADRVAAAVSAANAAVNAAAAAAAAGGGGGANLLAAAVTCANQRGCCGGNGGHSLPPTRMPKTNAT
AAAAPAVAGASNAKSDNNHANATSGAGSAAATPAATTPAATAVENRRPSPSPSTASTAPCDEGSSPRHHRPSHVSVGTQA
TPSTPIPIPAPRCSTGQQQQQPQAKKLKPAKADPLLYAATMPPPASVTTAAAAAVAPESESSPAASAPPAAAAMATGGDD
EDQSSFSFVSDDVLGEFEDLRIAGLPVRDEMRPPTPTMTVIPVSRPFRAGRDSGRDALFDDAVESVRCYCHGILGNSRFC
ALVNEKCSEPAKERMARIRRYAADVTRCGPLALYTAIVSSANRLIQTDPSCDLDLAECYVETASKRNAVPLSAFYRDCDR
LRDAVAAFFKTYGMVVDAMAQRITERVGPALGRGLYSTVVMMDRCGNSFQGREETPISVFARVAAALAVECEVDGGVSYK
ILSSKPVDAAQAFDAFLSALCSFAIIPSPRVLAYAGFGGSNPIFDAVSYRAQFYSAESTINGTLHDICDMVTNGLSVSVS
AADLGGDIVASLHILGQQCKALRPYARFKTVLRIYFDIWSVDALKIFSFILDVGREYEGLMAFAVNTPRIFWDRYLDSSG
DKMWLMFARREAAALCGLDLKSFRNVYEKMERDGRSAITVSPWWAVCQLDACVARGNTAVVFPHNVKSMIPENIGRPAVC
GPGVSVVSGGFVGCTPIHELCINLENCVLEGAAVESSVDVVLGLGCRFSFKALESLVRDAVVLGNLLIDMTVRTNAYGAG
KLLTLYRDLHIGVVGFHAVMNRLGQKFADMESYDLNQRIAEFIYYTAVRASVDLCMAGADPFPKFPKSLYAAGRFYPDLF
DDDERGPRRMTKEFLEKLREDVVKHGIRNASFITGCSADEAANLAGTTPGFWPRRDNVFLEQTPLMMTPTKDQMLDECVR
SVKIEPHRLHEEDLSCLGENRPVELPVLNSRLRQISKESATVAVRRGRSAPFYDDSDDEDEVACSETGWTVSTDAVIKMC
VDRQPFVDHAQSLPVAIGFGGSSVELARHLRRGNALGLSVGVYKCSMPPSVNYR
>Q76RD8 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
MFVIKRNGYKENVMFDKITSRIRKLCYGLNTDHIDPIKIAMKVIQGIYNGVTTVELDTLAAEIAATCTTQHPDYAILAAR
IAVSNLHKETKKLFSEVMEDLFNYVNPKNGKHSPIISSITMDIVNKYKDKLNSVIIYERDFSYNYFGFKTLEKSYLLKIN
NKIVERPQHMLMRVAVGIHQWDIDSAIETYNLLSEKWFTHASPTLFNAGTSRHQMSSCFLLNMIDDSIEGIYDTLKRCAL
ISKMAGGIGLSISNIRASGSYISGTNGISNGIIPMLRVYNNTARYIDQGGNKRPGVMAIYLEPWHSDIMAFLDLKKNTGN
EEHRTRDLFIALWIPDLFMKRVKDDGEWSLMCPDECPGLDNVWGDEFERLYTLYERERRYKSIIKARVVWKAIIESQIET
GTPFILYKDACNKKSNQQNLGTIKCSNLCTEIIQYADANEVAVCNLASVALNMFVIDGRFDFLKLKDVVKVIVRNLNKII
DINYYPIPEAEISNKRHRPIGIGVQGLADAFILLNYPFDSLEAQDLNKKIFETIYYGALEASCELAEKEGPYDTYVGSYA
SNGILQYDLWNVVPSDLWNWEPLKDKIRTYGLRNSLLVAPMPTASTAQILGNNESVEPYTSNIYTRRVLSGEFQVVNPHL
LRVLTERKLWNDEIKNRIMADGGSIQNTNLPEDIKRVYKTIWEIPQKTIIKMAADRGAFIDQSQSMNIHIADPSYSKLTS
MHFYGWSLGLKTGMYYLRTKPASAPIQFTLDKDKIKPLVVCDSEICTSCSG
>P20503 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
MFVIKRNGYKENVMFDKITSRIRKLCYGLNTDHIDPIKIAMKVIQGIYNGVTTVELDTLAAEIAATCTTQHPDYAILAAR
IAVSNLHKETKKLFSEVMEDLFNYVNPKNGKHSPIISSITMDIVNKYKDKLNSVIIYERDFSYNYFGFKTLEKSYLLKIN
NKIVERPQHMLMRVAVGIHQWDIDSAIETYNLLSEKWFTHASPTLFNAGTSRHQMSSCFLLNMIDDSIEGIYDTLKRCAL
ISKMAGGIGLSISNIRASGSYISGTNGISNGIIPMLRVYNNTARYIDQGGNKRPGVMAIYLEPWHSDIMAFLDLKKNTGN
EEHRTRDLFIALWIPDLFMKRVKDDGEWSLMCPDECPGLDNVWGDEFERLYTLYERERRYKSIIKARVVWKAIIESQIET
GTPFILYKDACNKKSNQQNLGTIKCSNLCTEIIQYADANEVAVCNLASVALNMFVIDGRFDFLKLKDVVKVIVRNLNKII
DINYYPIPEAEISNKRHRPIGIGVQGLADAFILLNYPFDSLEAQDLNKKIFETIYYGALEASCELAEKEGPYDTYVGSYA
SNGILQYDLWNVVPSDLWNWEPLKDKIRTYGLRNSLLVAPMPTASTAQILGNNESVEPYTSNIYTRRVLSGEFQVVNPHL
LRVLTERKLWNDEIKNRIMADGGSIQNTNLPEDIKRVYKTIWEIPQKTIIKMAADRGAFIDQSQSMNIHIADPSYSKLTS
MHFYGWSLGLKTGMYYLRTKPASAPIQFTLDKDKIKPLVVCDSEICTSCSG
>P12848 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
MFVIKRNGYKENVMFDKITSRIRKLCYGLNTDHIDPIKIAMKVIQGIYNGVTTVELDTLAAEIAATCTTQHPDYAILAAR
IAVSNLHKETKKLFSEVMEDLFNYVNPKNGKHSPIISSITMDIVNKYKDKLNSVIIYERDFSYNYFGFKTLEKSYLLKIN
NKIVERPQHMLMRVAVGIHQWDIDSAIETYNLLSEKWFTHASPTLFNAGTSRHQMSSCFLLNMIDDSIEGIYDTLKRCAL
ISKMAGGIGLSISNIRASGSYISGTNGISNGIIPMLRVYNNTARYIDQGGNKRPGVMAIYLEPWHSDIMAFLDLKKNTGN
EEHRTRDLFIALWIPDLFMKRVKDDGEWSLMCPDECPGLDNVWGDEFERLYTLYERERRYKSIIKARVVWKAIIESQIET
GTPFILYKDACNKKSNQQNLGTIKCSNLCTEIIQYADANEVAVCNLASVALNMFVIDGRFDFLKLKDVVKVIVRNLNKII
DINYYPIPEAEISNKRHRPIGIGVQGLADAFILLNYPFDSLEAQDLNKKIFETIYYGALEASCELAEKEGPYDTYVGSYA
SNGILQYDLWNVVPSDLWNWEPLKDKIRTYGLRNSLLVAPMPTASTAQILGNNESVEPYTSNIYTRRVLSGEFQVVNPHL
LRVLTERKLWNDEIKNRIMADGGSIQNTNLPEDIKRVYKTIWEIPQKTIIKMAADRGAFIDQSQSMNIHIADPSYSKLTS
MHFYGWSLGLKTGMYYLRTKPASAPIQFTLDKDKIKPPVVCDSEICTSCSG
>P0DSV1 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
MFVIKRNGYKENVMFDKITSRIRKLCYGLNTDHIDPIKIAMKVIQGIYNGVTTVELDTLTAEIAATCTTQHPDYAILAAR
IAVSNLHKETKKLFSEVMKDLFNYVNPKNGKHSPIISSITMDVVNKYKDKLNSVIIYERDFSYNYFGFKTLEKSYLLKIN
NKIVERPQHMLMRVAVGIHQWDIDSAIETYNLLSEKWFTHASPTLFNAGTSRHQMSSCFLLNMMDDSIEGIYDTLKRCAL
ISKMAGGIGLSISNIRASGSYISGTNGASNGIIPMLRVYNNTARYIDQGGNKRPGVMTIYLEPWHSDIMAFLDLKKNTGN
EEHRTRDLFIALWIPDLFMKRVKDDGEWSLMCPDECPGLDNVWGDEFERLYTLYEREKRYKSIIKARVVWKAIIESQIET
GTPFILYKDACNKKSNQQNLGTIKCSNLCTEIIQYADANEVAVCNLASIALNMFVIDGQFDFLKLKDVVKVIVRNLNKII
DINYYPIPEAEISNKRHRPIGIGVQGLADAFILLNYPFDSLEAQDLNKKIFETIYYGALEASCELAEKEGPYDTYVGSYA
SNGILQYDLWNVVPSDLWNWEPLKDKIRTYGLRNSLLVAPMPTASTAQILGNNESVEPYTSNIYTRRVLSGEFQVVNPHL
LRVLTERKLWNDEIKNRIMVDGGSIQNTNLPEDIKRVYKTIWEIPQKTIIKMAADRGAFIDQSQSMNIHIADPSYSKLTS
MHFYGWSLGLKTGMYYLRTKPASAPIQFTLDKDKIKPLVVCDSEICTSCSG
>P0DSV2 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase large subunit~~~
MFVIKRNGYKENVMFDKITSRIRKLCYGLNTDHIDPIKIAMKVIQGIYNGVTTVELDTLTAEIAATCTTQHPDYAILAAR
IAVSNLHKETKKLFSEVMKDLFNYVNPKNGKHSPIISSITMDVVNKYKDKLNSVIIYERDFSYNYFGFKTLEKSYLLKIN
NKIVERPQHMLMRVAVGIHQWDIDSAIETYNLLSEKWFTHASPTLFNAGTSRHQMSSCFLLNMMDDSIEGIYDTLKRCAL
ISKMAGGIGLSISNIRASGSYISGTNGASNGIIPMLRVYNNTARYIDQGGNKRPGVMTIYLEPWHSDIMAFLDLKKNTGN
EEHRTRDLFIALWIPDLFMKRVKDDGEWSLMCPDECPGLDNVWGDEFERLYTLYEREKRYKSIIKARVVWKAIIESQIET
GTPFILYKDACNKKSNQQNLGTIKCSNLCTEIIQYADANEVAVCNLASIALNMFVIDGQFDFLKLKDVVKVIVRNLNKII
DINYYPIPEAEISNKRHRPIGIGVQGLADAFILLNYPFDSLEAQDLNKKIFETIYYGALEASCELAEKEGPYDTYVGSYA
SNGILQYDLWNVVPSDLWNWEPLKDKIRTYGLRNSLLVAPMPTASTAQILGNNESVEPYTSNIYTRRVLSGEFQVVNPHL
LRVLTERKLWNDEIKNRIMVDGGSIQNTNLPEDIKRVYKTIWEIPQKTIIKMAADRGAFIDQSQSMNIHIADPSYSKLTS
MHFYGWSLGLKTGMYYLRTKPASAPIQFTLDKDKIKPLVVCDSEICTSCSG
>P42492 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MLIFISNMEELLIENSQRFTIFPIQHPECWNWYKKLESLTWTAQEVDMCKDIDDWEAMPKPQREFYKQILAFFVVADEIV
IENLLTNFMREIKVKEVLYFYTMQAAQECVHSEAYSIQVKTLIPDEKEQQRIFSGIEKHPIIKKMAQWVRQWMDPDRNTL
GERLVGFAAVEGILFQNHFVAIQFLKEQNIMPGLVSYNEFISRDEGVHCSFACFLISNYVYNIPEEKIIHKILKEAVELV
DEFINYAFDKARGRVPGFSKEMLFQYIRYFTDNLCFMMQCKSIYKVGNPFPQMTKFFLNEVEKTNFFELRPTQYQNCVKD
DAFAFKLFLNDDDF
>P0CAP6 1.17.4.1~~~RIR2~~~Ribonucleoside-diphosphate reductase small subunit~~~
MSKLLYVRDHEGFACLTVETHRNRWFAAHIVLTKDCGCLKLLNERDLEFYKFLFTFLAMAEKLVNFNIDELVTSFESHDI
DHYYTEQKAMENVHGETYANILNMLFDGDRAAMNAYAEAIMADEALQAKISWLRDKVAAAVTLPEKILVFLLIEGIFFIS
SFYSIALLRVRGLMPGICLANNYISRDELLHTRAASLLYNSMTAKADRPRATWIQELFRTAVEVETAFIEARGEGVTLVD
VRAIKQFLEATADRILGDIGQAPLYGTPPPKDCPLTYMTSIKQTNFFEQESSDYTMLVVDDL
>P0C701 1.17.4.1~~~RIR2~~~Ribonucleoside-diphosphate reductase small subunit~~~
MSKLLYVRDHEGFACLTVETHRNRWFAAHIVLTKDCGCLKLLNERDLEFYKFLFTFLAMAEKLVNFNIDELVTSFESHDI
DHYYTEQKAMENVHGETYANILNMLFDGDRAAMNAYAEAIMADEALQAKISWLRDKVAAAVTLPEKILVFLLIEGIFFIS
SFYSIALLRVRGLMPGICLANNYISRDELLHTRAASLLYNSMTAKADRPRATWIQELFRTAVEVETAFIEARGEGVTLVD
VRAIKQFLEATADRILGDIGQAPLYGTPPPKDCPLTYMTSIKQTNFFEQESSDYTMLVVDDL
>O57175 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAPNPNRFVIFPIQYHDIWNMYKKAEASFWTVEEVDISKDINDWNKLTPDEKYFIKHVLAFFAASDGIVNENLAER
FCTEVQITEARCFYGFQMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSAGYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLHPPSEETVRSIITDAVSIEQEFLTAALP
VKLIGMNCEMMKTYIEFVADRLISELGFKKIYNVTNPFDFMENISLEGKTNFFEKRVGEYQKMGVMSQEDNHFSLDVDF
>P20493 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAPNPNRFVIFPIQYYDIWNMYKKAEASFWTVEEVDISKDINDWNKLTPDEKYFIKHVLAFFAASDGIVNENLAER
FCTEVQITEARCFYGFQMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSAGYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLYPPSEETVRSIITDAVSIEQEFLTAALP
VKLIGMNCEMMKTYIEFVADRLISELGFKKIYNVTNPFDFMENISLEGKTNFFEKRVGEYQKMGVMSQEDNHFSLDVDF
>P29883 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAPNPNRFVIFPIQYHDIWNMYKKAEASFWTVEEVDISKDINDWNKVTPDEKYFIKHVLAFFAASDGIVNENLAER
FCTEVQITEARCFYGFRMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSAGYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLHPPSEETVRSIITDAVSIEQEFLTAALP
VKLIGMNCEMMKTYIEFVADRLISELGFKKIYNVTNPFDFMENISLEGKTNFFEKRVGEYQKMGVMSQKDNHFSLDVDF
>P11158 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAPNPNRFVIFPIQYYDIWNMYKKAEASFWTVEEVDISKDINDWNKLTPDEKYFIKHVLAFFAASDGIVNENLAER
FCTEVQITEARCFYGFQMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSAGYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLHPPSEETVRSIITDAVSIEQEFLTAALP
VKLIGMNCEMMKTYIEFVADRLISELGFKKIYNVTNPFDFMENISLEGKTNFFEKRVGEYQKMGVMSQEDNHFSLDVDF
>P0DSS7 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAKNPNRFVIFPIQYHDIWNMYKKAEASFWTVEEVDISKDINDWNKLTPDEKYFIKHVLAFFAASDGIVNENLAER
FCIEVQITEARCFYGFQMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSAGYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLYPPSEETVRSIITDAVSIEQEFLTVALP
VKLIGMNCEMMKTYMEFVADRLISELGFKRIYNVTNPSDFMENISLEGKTNFFEKRVGEYQKMGVMSQEDNHFSLDVDF
>P0DSS8 1.17.4.1~~~~~~Ribonucleoside-diphosphate reductase small chain~~~
MEPILAKNPNRFVIFPIQYHDIWNMYKKAEASFWTVEEVDISKDINDWNKLTPDEKYFIKHVLAFFAASDGIVNENLAER
FCIEVQITEARCFYGFQMAIENIHSEMYSLLIDTYVKDSNEKNYLFNAIETMPCVKKKADWAQKWIHDSASYGERLIAFA
AVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLLYPPSEETVRSIITDAVSIEQEFLTVALP
VKLIGMNCEMMKTYIKFVADRLISELGFKKIYNVTNPFDFMENISLEGKTNFFEKRVGEYQKMGVMSQEDNHFSLDVDF
>Q4JQV7 1.17.4.1~~~RIR2~~~Ribonucleoside-diphosphate reductase small subunit~~~
MDQKDCSHFFYRPECPDINNLRALSISNRWLESDFIIEDDYQYLDCLTEDELIFYRFIFTFLSAADDLVNVNLGSLTQLF
SQKDIHHYYIEQECIEVVHARVYSQIQLMLFRGDESLRVQYVNVTINNPSIQQKVQWLEEKVRDNPSVAEKYILMILIEG
IFFVSSFAAIAYLRNNGLFVVTCQFNDLISRDEAIHTSASCCIYNNYVPEKPAITRIHQLFSEAVEIECAFLKSHAPKTR
LVNVDAITQYVKFSADRLLSAINVPKLFNTPPPDSDFPLAFMIADKNTNFFERHSTSYAGTVINDL
>Q9T1W9 ~~~~~~Releasin protein~~~
MDLISVLALWPYLLPVVAGGAVWAMRRSFASTERVERLENRMTEMETRYASIPGTEDVHEMRLRIAELSGDIRVLSQRVQ
SFSHQLELLLENAVNRSNS
>P32277 6.5.1.3~~~~~~RNA ligase 2~~~
MFKKYSSLENHYNSKFIEKLYSLGLTGGEWVAREKIHGTNFSLIIERDKVTCAKRTGPILPAEDFFGYEIILKNYADSIK
AVQDIMETSAVVSYQVFGEFAGPGIQKNVDYCDKDFYVFDIIVTTESGDVTYVDDYMMESFCNTFKFKMAPLLGRGKFEE
LIKLPNDLDSVVQDYNFTVDHAGLVDANKCVWNAEAKGEVFTAEGYVLKPCYPSWLRNGNRVAIKCKNSKFSEKKKSDKP
IKAKVELSEADNKLVGILACYVTLNRVNNVISKIGEIGPKDFGKVMGLTVQDILEETSREGITLTQADNPSLIKKELVKM
VQDVLRPAWIELVS
>Q65153 ~~~~~~Putative RNA-ligase~~~
MSNESFPETLENLLSTLQTKQQNAIQSEVIEWLHSFCETFHLKIHCHKQFIPSGEKKRAKIPAQEIQGNTQPSHHVHRVV
LSRAQPVKAQESLLTTMCNGLVLDANTWTCLAIPPPAPFQQATRQVQHFYRNNFYEVVPIQDGTLLTIYYWDDPEHGPSW
CLASTHGYDVSNYCWIGDKTFAELVYELLQQHSTCDVTLEKNKTRGTRLFFNNLNPDYCYTIGIRHHNLQPLIYDPQNIW
AIQSTNLKTLKTVYPEYYGYIGIPGIQSQVPELPQYDLPYLIRSYKTAMNQAKNAIKNGKKDKEYFNYGYLLISRAPAIT
KSTSNVLLKSPLLVFLQKSVYQKKHNISNSQRLEFIILQNYLMQHFRDHFIALFPQYISYYTKYQNMLNMIIHSIATKDK
DHPFAGAVVKKVLEDIENAENIIDHTTIQNYAHQSKYAMLYLSIISHF
>P0DTD9 6.5.1.-~~~~~~RNA ligase~~~
MESMNVKYPVEYLIEHLNSFESPEVAVESLRKEGIMCKNRGDLYMFKYHLGCKFDKIYHLACRGAILRKTDSGWKVLSYP
FDKFFNWGEELQPEIVNYYQTLRYASPLNEKRKAGFMFKLPMKLVEKLDGTCVVLYYDEGWKIHTLGSIDANGSIVKNGM
VTTHMDKTYRELFWETFEKKYPPYLLYHLNSSYCYIFEMVHPDARVVVPYEEPNIILIGVRSVDPEKGYFEVGPSEEAVR
IFNESGGKINLKLPAVLSQEQNYTLFRANRLQELFEEVTPLFKSLRDGYEVVYEGFVAVQEIAPRVYYRTKIKHPVYLEL
HRIKTTITPEKLADLFLENKLDDFVLTPDEQETVMKLKEIYTDMRNQLESSFDTIYKEISEQVSPEENPGEFRKRFALRL
MDYHDKSWFFARLDGDEEKMQKSEKKLLTERIEKGLFK
>P00971 6.5.1.3~~~~~~RNA ligase 1~~~
MQELFNNLMELCKDSQRKFFYSDDVSASGRTYRIFSYNYASYSDWLLPDALECRGIMFEMDGEKPVRIASRPMEKFFNLN
ENPFTMNIDLNDVDYILTKEDGSLVSTYLDGDEILFKSKGSIKSEQALMANGILMNINHHRLRDRLKELAEDGFTANFEF
VAPTNRIVLAYQEMKIILLNVRENETGEYISYDDIYKDATLRPYLVERYEIDSPKWIEEAKNAENIEGYVAVMKDGSHFK
IKSDWYVSLHSTKSSLDNPEKLFKTIIDGASDDLKAMYADDEYSYRKIEAFETTYLKYLDRALFLVLDCHNKHCGKDRKT
YAMEAQGVAKGAGMDHLFGIIMSLYQGYDSQEKVMCEIEQNFLKNYKKFIPEGY
>C0HM52 6.5.1.3~~~RlnA~~~RNA ligase 1~~~
MSSLAPWRTTSWSPLGSPPSLEDALRLARTTRAFAVRRDGEGRALVTYLYGTPELFSLPGARELRGIVYREEDGTVLSRP
FHKFFNFGEPLAPGEEAFKAFRDSMVPLFVAEKVDGYLAQAYLDGGELRFASRHSLNPPLVGALLRKAVDEEAMARLGKL
LAAEGGRWTALLEVVDPEAPVMVPYQEPGVYLLALRSIGEGHYLLPGVHFPLPEALRYVRWEPRMDFDPHRFRGEIRDLQ
GVEGYVVTDGAEFVKFKTGWAFRLARFLMDPEGVFLEAYAEDRLDDLVGALAGREDLLRAVARAQDYLAGLYGEAVGAGD
ALRRMGLPRKEAWARVQEEAGRWGGFAPAYARAAMAAYEGGEAREAFLVELRKRSARKALEALHLFPRVGGELRG
>P03049 ~~~mnt~~~Regulatory protein mnt~~~
MARDDPHFNFRMPMEVREKLKFRAEANGRSMNSELLQIVQDALSKPSPVTGYRNDAERLADEQSELVKKMVFDTLKDLYK
KTT
>P04487 ~~~~~~Accessory factor US11~~~
MSQTQPPAPVGPGDPDVYLKGVPSAGMHPRGVHAPRGHPRMISGPPQRGDNDQAAGQCGDSGLLRVGADTTISKPSEAVR
PPTIPRTPRVPREPRVPRPPREPREPRVPRAPRDPRVPRDPRDPRQPRSPREPRSPREPRSPREPRTPRTPREPRTARGS
V
>P13319 3.1.26.4~~~rnh~~~Ribonuclease H~~~
MDLEMMLDEDYKEGICLIDFSQIALSTALVNFPDKEKINLSMVRHLILNSIKFNVKKAKTLGYTKIVLCIDNAKSGYWRR
DFAYYYKKNRGKAREESTWDWEGYFESSHKVIDELKAYMPYIVMDIDKYEADDHIAVLVKKFSLEGHKILIISSDGDFTQ
LHKYPNVKQWSPMHKKWVKIKSGSAEIDCMTKILKGDKKDNVASVKVRSDFWFTRVEGERTPSMKTSIVEAIANDREQAK
VLLTESEYNRYKENLVLIDFDYIPDNIASNIVNYYNSYKLPPRGKIYSYFVKAGLSKLTNSINEF
>Q9J5A4 2.7.7.6~~~RPO7~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFPLVCSTCGRDLSEARYRLLVEQMELKKVVITYSRKCCRLKLSTQIEPYRNLTVQPSLDIN
>Q98229 2.7.7.6~~~RPO7~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFALVCASCGRDLSEVRYRLLIERQKLSDVLRNLTHMCCRLKLATQIEPYRNLTVQPLLDIN
>Q9Q8P6 2.7.7.6~~~RPO7~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLVCSTCGRDISEERYALLIKKIDLKTVLRGVKNNCCRLKLSTQIEPQRNLTVEPLLDIN
>Q9Q921 2.7.7.6~~~RPO7~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLICSTCGRDISEERFALLIKKIALKTVLKGVKNSCCRLKLSTQIEPQRNLTVEPLLDIN
>P68314 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNECCRLKLSTQIEPQRNLTVQPLLDIN
>P68315 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNECCRLKLSTQIEPQRNLTVQPLLDIN
>P68316 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNECCRLKLSTQIEPQRNLTVQPLLDIN
>P68317 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNECCRLKLSTQIEPQRNLTVQPLLDIN
>P0DSU7 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQVVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNKCCRLKLSTQIEPQRNLTVQPLLDIN
>P0DSU8 2.7.7.6~~~~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQVVCSTCGKDISHERYKLIIRKKSLKDVLVSVKNKCCRLKLSTQIEPQRNLTVQPLLDIN
>Q9DHQ8 2.7.7.6~~~RPO7~~~DNA-directed RNA polymerase 7 kDa subunit~~~
MVFQLICSTCGRDISEERYYLLIKELSLKKVLEGVKNNCCRLKLSTQIEPQRNLTVQPLIDIN
>Q76ZP7 2.7.7.6~~~~~~DNA-directed RNA polymerase 133 kDa polypeptide~~~
MKKNTDSEMDQRLGYKFLVPDPKAGVFYRPLHFQYVSYSNFILHRLHEILTVKRPLLSFKNNTERIMIEISNVKVTPPDY
SPIIASIKGKSYDALATFTVNIFKEVMTKEGISITKISSYEGKDSHLIKIPLLIGYGNKNPLDTAKYLVPNVIGGVFINK
QSVEKVGINLVEKITTWPKFRVVKPNSFTFSFSSVSPPNVLPTRYRHYKISLDISQLEALNISSTKTFITVNIVLLSQYL
SRVSLEFIRRSLSYDMPPEVVYLVNAIIDSAKRITESITDFNIDTYINDLVEAEHIKQKSQLTINEFKYEMLHNFLPHMN
YTPDQLKGFYMISLLRKFLYCIYHTSRYPDRDSMVCHRILTYGKYFETLAHDELENYIGNIRNDIMNNHKNRGTYAVNIH
VLTTPGLNHAFSSLLSGKFKKSDGSYRTHPHYSWMQNISIPRSVGFYPDQVKISKMFSVRKYHPSQYLYFCSSDVPERGP
QVGLVSQLSVLSSITNILTSEYLDLEKKICEYIRSYYKDDISYFETGFPITIENALVASLNPNMICDFVTDFRRRKRMGF
FGNLEVGITLVRDHMNEIRINIGAGRLVRPFLVVDNGELMMDVCPELESRLDDMTFSDIQKEFPHVIEMVDIEQFTFSNV
CESVQKFRMMSKDERKQYDLCDFPAEFRDGYVASSLVGINHNSGPRAILGCAQAKQAISCLSSDIRNKIDNGIHLMYPER
PIVISKALETSKIAANCFGQHVTIALMSYKGINQEDGIIIKKQFIQRGGLDIVTAKKHQVEIPLENFNNKERDRSNAYSK
LESNGLVRLNAFLESGDAMARNISSRTLEDDFARDNQISFDVSEKYTDMYKSRVERVQVELTDKVKVRVLTMKERRPILG
DKFTTRTSQKGTVAYVADETELPYDENGITPDVIINSTSIFSRKTISMLIEVILTAAYSAKPYNNKGENRPVCFPSSNET
SIDTYMQFAKQCYEHSNPKLSDEELSDKIFCEKILYDPETDKPYASKVFFGPIYYLRLRHLTQDKATVRCRGKKTKLIRQ
ANEGRKRGGGIKFGEMERDCLIAHGAANTITEVLKDSEEDYQDVYVCENCGDIAAQIKGINTCLRCSKLNLSPLLTKIDT
THVSKVFLTQMNARGVKVKLDFERRPPSFYKPLDKVDLKPSFLV
>O57204 2.7.7.6~~~~~~DNA-directed RNA polymerase 147 kDa polypeptide~~~
MAVISKVTYSLYDQKEINATDIIISHVKNDDDIGTVKDGRLGAMDGALCKTCGKTELECFGHWGKVSIYKTHIVKPEFIS
EIIRLLNYICIHCGLLRSREPYSDDINLKELSGHALRRLKDKILSKKKSCWNSECMQPYQKITFSKKKVCFVNKLDDINV
PNSLIYQKLISIHEKFWPLLEIHQYPANLFYTDYFPIPPLIIRPAISFWIDSIPKETNELTYLLGMIVKNCNLNADEQVI
QKAVIEYDDIKIISNNTSSINLSYITSGKNNMIRSYIVARRKDQTARSVIGPSTSITVNEVGMPAYIRNTLTEKIFVNAF
TVDKVKQLLASNQVKFYFNKRLNQLTRIRQGKFIKNKIHLLPGDWVEVAVQEYTSIIFGRQPSLHRYNVIASSIRATEGD
TIKISPGIVNSQNADFDGDEEWMILEQNPKAVIEQSILMYPTTLLKHDIHGAPVYGSIQDEIVAAYSLFRIQDLCLDEVL
NILGKYGREFDPKGKCKFSGKDIYTYLIGEKINYPGLLKDGEIIANDVDSNFVVAMRHLSLAGLLSDHKSNVEGINFIIK
SSYVFKRYLSIYGFGVTFKDLRPNSTFTNKLEAINVEKIELIKEAYAKYLNDVRDGKIVPLSKALEADYVESMLSNLTNL
NIREIEEHMRQTLIDDPDNNLLKMAKAGYKVNPTELMYILGTYGQQRIDGEPAETRVLGRVLPYYLPDSKDPEGRGYILN
SLTKGLTGSQYYFSMLVARSQSTDIVCETSRTGTLARKIIKKMEDMVVDGYGQVVIGNTLIKYAANYTKILGSVCKPVDL
IYPDESMTWYLEISALWNKIKQGFVYSQKQKLAKKTLAPFNFLVFVKPTTEDNAIKVKDLYDMIHNVIDDVREKYFFTVS
NIDFMEYIFLTHLNPSRIRITKETAITIFEKFYEKLNYTLGGGTPIGIISAQVLSEKFTQQALSSFHTTEKSGAVKQKLG
FNEFNNLTNLSKNKTEIITLVSDDISKLQSVKINFEFVCLGELNPNITLRKETDRYVVDIIVNRLYIKRAEITELVVEYM
IERFISFSVIVKEWGMETFIEDEDNIRFTVYLNFVEPEELNLSKFMMVLPGAANKGKISKFKIPISDYTGYDDFNQTKKL
NKMTVELMNLKELGSFDLENVNVYPGVWNTYDIFGIEAAREYLCEAMLNTYGEGFDYLYQPCDLLASLLCASYEPESVNK
FKFGAASTLKRATFGDNKALLNAALHKKSEPINDNSSCHFFSKVPNIGTGYYKYFIDLGLLMRMERKLSDKISSQKIKEM
EETEDF
>P21967 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSLFNTNAYLPVVIQPHELNLDLMDNIKKAVINKYLHKETSGFMAKKIQIVEDTPMPLAELVNNEIVVHVTCNIDYKYYK
VGDIVSGILTITDESDISVVCSDLICKIRSDSGTVSYDNSKYCFIKNGKVYANESTVTVMLKEAQSGMESSFVFLGNIIE
K
>Q76ZS0 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSSFVTNGYLPVTLEPHELTLDIKTNIRNAVYKTYLHREISGKMAKKIEIREDVELPLGEIVNNSVVINVPCVITYAYYH
VGDIVRGTLNIEDESNVTIQCGDLICKLSRDSGTVSFSDSKYCFFRNGNAYDNGSEVTAVLMEAQQGIESSFVFLANIVD
S
>P21034 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSSFVTNGYLSVTLEPHELTLDIKTNIRNAVYKTYLHREISGKMAKKIEIREDVELPLGEIVNNSVVINVPCVITYAYYH
VGDIVRGTLNIEDESNVTIQCGDLICKLSRDSGTVSFSDSKYCFFRNGNAYDNGSEVTAVLMEAQQGIESSFVFLANIVD
S
>P04310 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSSFVTNGYLPVTLEPHELTLDIKTNIRNAVYKTYLHREISGKMAKKIEIREDVELPLGEIVNNSVVINVPCVITYAYYH
VGDIVRGTLNIEDESNVTIQCGDLICKLSRDSGTVSFSDSKYCFFRNGNAYDNGSEVTAVLMEAQQGIESSFVFLANIVD
S
>P0DSX9 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSSFVTNGYLPVTLEPHELTLDIKTNIRNAVYKTYLHKEISGKMAKKIEICKDVELPLGEIVNNSVVINVPCVITYAYYH
VGDIVRGTLNIEDESNVTIQCGDLICKLSRDSGTVSFSDSKYCFFRNGNAYDNGSEVSAVLMEAQQGTESSFVFLANIVD
S
>P0DSY0 2.7.7.6~~~~~~DNA-directed RNA polymerase 18 kDa subunit~~~
MSSFVTNGYLPVTLEPHELTLDIKTNIRNAVYKTYLHKEISGKMAKKIEICKDVELPLGEIVNNSVVINVPCVITYAYYH
VGDIVRGTLNIEDESNVTIQCGDLICKLSRDSGTVSFSDSKYCFFRNGNAYDNGSEVSAVLMEAQQGTESSFVFLANIVD
S
>Q05569 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MDNSMDINDILLSDDNDYKSYDEDDDSISDIGETSDDCCTTKQSDSRIESFKFDETTQSPHPKQLSERIKAIKQRYTRRI
SLFEITGILSESYNLLQRGRIPLLNDLTEETFKDSIINIMFKEIEQGNCPIVIQKNGELLSLTDFDKKGVQYHLDYIKTI
WRNQRKL
>A0A7H0DNA3 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MADTDDIIDYESDDLTEYEDDEEDGESLETSDIDPKSSYKIVESTSTHIEDAHSNLKHIGNHISALKRRYTRRISLFEIA
GIIAESYNLLQRGRLPLVSEFSDETMKQNMLHVIIQEIEEGSCPIVIEKNGELLSVNDFDKDGLKFHLDYIIKIWKLQKR
Y
>Q76ZQ8 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MADTDDIIDYESDDLTEYEDDEEEEEDGESLETSDIDPKSSYKIVESASTHIEDAHSNLKHIGNHISALKRRYTRRISLF
EIAGIIAESYNLLQRGRLPLVSEFSDETMKQNMLHVIIQEIEEGSCPIVIEKNGELLSVNDFDKDGLKFHLDYIIKIWKL
QKRY
>P68610 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MADTDDIIDYESDDLTEYEDDEEEEEDGESLETSDIDPKSSYKIVESASTHIEDAHSNLKHIGNHISALKRRYTRRISLF
EIAGIIAESYNLLQRGRLPLVSEFSDETMKQNMLHVIIQEIEEGSCPIVIEKNGELLSVNDFDKDGLKFHLDYIIKIWKL
QKRY
>P68611 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MADTDDIIDYESDDLTEYEDDEEEEEDGESLETSDIDPKSSYKIVESASTHIEDAHSNLKHIGNHISALKRRYTRRISLF
EIAGIIAESYNLLQRGRLPLVSEFSDETMKQNMLHVIIQEIEEGSCPIVIEKNGELLSVNDFDKDGLKFHLDYIIKIWKL
QKRY
>P33813 2.7.7.6~~~~~~DNA-directed RNA polymerase 19 kDa subunit~~~
MADTDDIIDYESDDLTEYEDDDEEEEDGESLETSDIDPKSSYKIVESASTHIEDAHSNLKHIGNHISALKRRYTRRISLF
EIAGIIAESYNLLQRGRLPLVSEFSNETMKQNMLHVIIQEIEEGSCPIVIEKNGELLSVNDFDKDGLKFHLDYIIKIWKL
QKRY
>P68609 2.7.7.6~~~~~~DNA-directed RNA polymerase 22 kDa subunit~~~
MNQYNVKYLAKILCLKTEIARDPYAVINRNVLLRYTTDIEYNDLVTLITVRHKIDSMKTVFQVFNESSINYTPVDDDYGE
PIIITSYLQKGHNKFPVNFLYIDVVISDLFPSFVRLDTTETNIVNSVLQTGDGKKTLRLPKMLETEIVVKILYRPNIPLK
IVRFFRNNMVTGVEIADRSVISVAD
>O57187 2.7.7.6~~~~~~DNA-directed RNA polymerase 30 kDa polypeptide~~~
MENVYISSYSSNEQTSMAVAATDIRELLSQYVDDANLEDLIEWAMEKSSKYYIKNIGNTKSNIEETKFESKNNIGIEYSK
DSRNKLSYRNKPSIATNLEYKTLCDMIKGTSGTEKEFLRYLLFGIKCIKKGVEYNIDKIKDVSYNDYFNVLDEKYNTPCP
NCKSRNTTPMMIQTRAADEPPLVRHACRDCKQHFKPPKFRAFRNLNVTTQSIHENKEITEILPDNNPSPPESPEPASPID
DGLIRSTFDRNDEPPEDDE
>P21603 2.7.7.6~~~~~~DNA-directed RNA polymerase 30 kDa polypeptide~~~
MENVYISSYSSNEQTSMAVTATDIRELLSQYVDDANLEDLIEWAMEKSSKYYIKNIGNTKSNIEETKFESKNNIGIEYSK
DSRNKLSYRNKPSIATNLEYKTLCDMIKGTSGTEKEFLRYLLFGIKCIKKGVEYNIDKIKDVSYNDYFNVLDEKYNTPCP
NCKSRNTTPMMIQTRAADEPPLVRHACRDCKQHFKPPKFRAFRNLNVTTQSIHENKEITEILPDNNPSPPESPEPASPID
DGLIRATFDRNDEPPEDDE
>O57233 2.7.7.6~~~~~~DNA-directed RNA polymerase 35 kDa subunit~~~
MQHPREENSIVVELEPSLATFIKQGFNNLVKWPLLNIGIVLSNTSTAVNEEWLTAVEHIPTMKIFYKHIHKILTREMGFL
VYLKRSQSERDNYITLYDFDYYIIDKDTNSVTMVDKPTELKETLLHVFQEYRLKSSQTIELIAFSSGTVINEDIVSKLTF
LDVEVFNREYNNVKTIIDPDFVFRSPFIVISPMGKLTFFVEVYSWFDFKSCFKDIIDFLEGALIANIHNHMIKVGNCDET
VSSYNPESGMLFVNDLMTMNIVNFFGCNSRLESYHRFDMTKVDVELFIKALSDACKKILSASNRL
>P24757 2.7.7.6~~~~~~DNA-directed RNA polymerase 35 kDa subunit~~~
MQHPREENSIVVELEPSLATFIKQGFNNLVKWPLLNIGIVLSNTSTAVNEEWLTAVEHIPTMKIFYKHIHKILTREMGFL
VYLKRSQSERDNYITLYDFDYYIIDKDTNSVTMVDKPTELKETLLHVFQEYRLKSSQTIELIAFSSGTVINEDIVSKLTF
LDVEVFNREYNNVKTIIDPDFVFRSPFIVISPMGKLTFFVEVYSWFDFKSCLKDIIDFLEGALIANIHNHMIKVGNCDET
VSSYNPESGMLFVNDLMTMNIVNFFGCNSRLESYHRFDMTKVDVELFIKALSDACKKILSASNRL
>P42486 2.7.7.6~~~~~~DNA-directed RNA polymerase RPB1 homolog~~~
MEAGYAEIAAVQFNIAGDNDHKRQGVMEVTISNLFEGTLPAEGGIYDARMGTTDHHYKCITCSHQRKQCMGHPGILQMHA
PVLQPLFIAEIRRWLRVICLNCGAPIVDLKRYEHLIRPKRLIEAASSQTEGKQCYVCKAVHPKIVKDSEDYFTFWVDQQG
KIDKLYPQIIREIFSRVTYDTVVKLGRSKNSHPEKLVLKAIQIPPISIRPGIRLGIGSGPQSFHDINNVIQYLVRKNLLI
PKDLQIVRGQKIPLNIDRNLQTIQQLYYNFLLDSVSTTATQGGTGKRGIVMGARPAPSIMRRLPRKEGRIRKSLLGSQVW
SISRSTICGNSDLHLDEVGYPISFARTLQVAETVQHYNINRLMPYFLNGKRQYPGCSRVYKQITQSVHDIEGLKQDFRLE
VGDILYRDVVTGDVAFFNRQPSLERSSIGVHRIVVLENPKISTFQMNVSACAWYNADFDGDQMNLWVPWSVMSRVEAELL
CSVRNWFISTKSSGPVNGQVQDSTVGSFLLTRTNTPMGKNVMNKLHAMGLFQTTQTDPPCFANYSPTDLLDGKSVVSMLL
KQTPINYQRAPTWYSEVYAPYMHYNKQDISTQIRNGELIEGVLDKKAVGAGSSGGIYHLISRRYGPQQALKMIFATQQLA
LNYVRNAGFTVSTADMLLTPEAHQEVQEIINKLLLESEEINNRLLHGDIMPPIGLTTHDFYEKLQLNALKFPDRILKPIM
NSINPETNGLFQMVATGAKGSNPNMIHIMAGIGQIEINTQRIQPQFSFGRTLVYYPRFALEAQAYGFICNSYIAGLTSPE
FIFGEMNGRFDLINKALSTSSTGYANRKAIFGLQSCIVDYYRRVSIDTRLVQQLYGEDGLDARQLETVRFETIMLSDQEL
EDKFKYTGIQSPLFEEEFSRLKKDRDKYRQIFLNVENFNFSQLLTDVRQVPVNVASIVKNILLSSTSGVLPFDEKSILQK
YAMVKTFCKNLPYVFINNIQERLQTPIPVYLKRAAALMRMLIRIELATVKTLNITCEQMSAILDLIRLQYTQSLINYGEA
VGILAAQSVSEPLTQYMLDSHHRSVAGGTNKSGIVRPQEIFSAKPVEAEQSSEMLLRLKNPEVETNKTYAQEIANSIELI
TFERLILQWHLLYETYSSTKKNVMYPDFASDVEWMTDFLENHPLLQPPEDIANWCIRLELNKTTMILKSISLESIINSLR
AKHPNTYIMHSVENTASGIPIIIRIYLRESAFRRSTNTRMATDEKIAVNVVDKLLNSTIRGIPGIKNANVVKLMRHRVDA
QGKLVRLDNIYAIKTNGTNIFGAMLDDNIDPYTIVSSSIGDTMELYGIEAARQKIISEIRTVMGDKGPNHRHLLMYADLM
TRTGQVTSLEKAGLNAREPSNVLLRMALSSPVQVLTDAAVDSAVNPIYGIAAPTLMGSVPRIGTMYSDIIMDEKYITENY
KSVDSMIDML
>P42487 2.7.7.6~~~~~~DNA-directed RNA polymerase RPB2 homolog~~~
MEPLRPQITYGPIETVDNEELTEADMLSFISAAVNSTGLIGYNIKSFDDLMDNGIPQIVKQMFNVDITYKDQRDHTEIDK
LRESVQIQFNFTDVNIERPQHRNYSQGNKINLLPNKARLCGLSYSGPVNLAAEVILTAHYSNGRQEVKRASIPPFQVSTF
PIMRGSNRCHTHHLSKTAKKEIGEDPNEPGGYFIARGGEWVVDLLENIRFNTLHIHYHTMQQGNNEIIRGEFISQPGGAF
ENSSQIIIRYMTTGAITIEINSTKFSKLRIPWYLIFRMFGMTGDDSIIEQVVFDLESNSPVNTFMIEILEKSIHVLDPIF
QPVQHEPNREKIIQFLSEKVSKFVSNPSAYKSDENAVQYLNERQLTILDKILLPHMGQTADTRVRKLRFLGLLIHKILLV
IMNVFPPTDRDSYRTKRVHGSGVSLAKAFKAIFNTSVIAPIINGFKELLKQTAFEELTQRNIIEAFSAALSKNTASDLNR
SMEQSIISGNKTIMVRQRPIVNRVSTQSLERKNLLNTISALRTVNTHNTTNASKQTERADMMRRVHASYPGYICVAQSAD
TGEKVGMSKQLAITANVCTAGEVLSLKQRLLSDPAIQQLADVSNKDIVRKGLARVFINGEWIGCCTNAFELAQRYRMFRR
EGKIVHPHTTIYWDSMVDEVEFWLDVGRLTRPLLIVDNNIEKYNQACYKAAEARKKGDKDWEKHKIPFIQNTRFTSQMAK
DILAGTLTLEDLVAQGICEFITPEEAENCLVAFSIIELRKHKHDVTRRFTHVDVPQAILGLAALVSPYANCTQPARVTYE
TNQGRQTGGWYCFSWPYRVDMNRFFQFYNEMPLVKTIAHNYVIPNGLNTIVAYMIYGGYNQEDSVIVSQSFIDRGGFAGT
FYREEKVELESDIESFGKPDPLITKNLKPGANYEKLVDGFVPVGTVVKKGDIIIGKVAKIRGEKDELNKYIDRSVMYGFD
EPAVVDAVMRPHGPNDEIFGLMRLRYERNLNIGDKMSSRSGNKGIAALALPTSDMPFTEDGLQPDLIVNPHSHPSRMTNG
QMIETTVGLANALQGVVTDGTAFLPINVQLLSERLAQEGLRFNGCQKMFNGQTGEYFDAAIFIGPTYHQRLQKFVLDDRY
AVASYGPTDALTGQPLDGKRSHGGLRLGEMEHWVLTAQGAMQTIIEKSHDDSDGCISYICRNCGEPAIYNASHPIYKCMN
CDVQADIGMVDSRRSSIVFQHEMRAANVNITSVLSPRVFQPA
>Q65184 ~~~~~~DNA-directed RNA polymerase RPB3-11 homolog~~~
MEKIFQNVEIKPFLIDFSNPFIKNAAKRLFQLEEQLPLVPVNVVMDFKGISRAAVHGLSRVLQDEIPNYMLDIKPGGYKI
EDSTDLFMTEQFIRNRINFIPIYAKNETLVFALRSLNNSCEVKTIYSRDLIQVAGPKLKYPIFNPTFEIGFLQPGKSLII
EDIYIKKGIGRKHAAFNLAVKTHFSHLDIEQYPTDKKEYMALSGYKQSSMTSDPRHHRLGLCFPAVPLPHINQAVRTYLK
NACRIIIGRIQSIQKIYENFEEPQPELVLFSLDEEKTKAIITIKDETHTIGNLLKTCIYEMIPDISFVGYQCVPHKQEMV
LTIIHKASQEDLITLLEKSIQNIIQTFQILEKNVDELIA
>Q89907 ~~~~~~DNA-directed RNA polymerase RPB7 homolog~~~
MIDQKIFETTLNIDDPTNFCTNVEAHLLKELENIYVGKCFKNSFILNITGVIQRSPCFIMRTNNSGRGYMHVRFSAVVSY
LNAFDLIAAVKIIKNDSNIILGESLLTEPVTIVIPSSESQNNVAEVGQIVPVQLANSSVYYIPGRQQASATGSIFIPKHT
FSVYHVQEELTQEQALNLTKLVNIIEMLLESRSKKDFKQICFFEKLYYTYSISSDEILDLKIWKGPKGKEMSRLKPCNVL
SFLYDALKNKSSSLGFWARPPNLLKSSPLAYQQDQNSFNATELPIICSAEVMFVTLLKEIINYLQFMNDLCDTFNNEQLI
KRHENIWMLIEQRKIGHDF
>P07879 ~~~rpbA~~~15 kDa RNA polymerase-binding protein~~~
MTKITVNYTVDVKDIQPKHVRSESNPQNQNKIRRAWVLSLSDNAMEVIQNKIKSAPARHAYYEAIDREVSNKWIELMRKH
TTESLNAGAKFIMTSCGERLEDDYCGNADERLIVAAQIVAETIAADFNR
>P08707 ~~~CI~~~Repressor protein CI~~~
MRIDSLGWSNVDVLDRICEAYGFSQKIQLANHFDIASSSLSNRYTRGAISYDFAAHCALETGANLQWLLTGEGEAFVNNR
ESSDAKRIEGFTLSEEILKSDKQLSVDAQFFTKPLTDGMAIRSEGKIYFVDKQASLSDGLWLVDIEGAISIRELTKLPGR
KLHVAGGKVPFECGIDDIKTLGRVVGVYSEVN
>P13121 ~~~C1~~~Repressor protein C1~~~
MINYVYGEQLYQEFVSFRDLFLKKAVARAQHVDAASDGRPVRPVVVLPFKETDSIQAEIDKWTLMARELEQYPDLNIPKT
ILYPVPNILRGVRKVTTYQTEAVNSVNMTAGRIIHLIDKDIRIQKSAGINEHSAKYIENLEATKELMKQYPEDEKFRMRV
HGFSETMLRVHYISSSPNYNDGKSVSYHVLLCGVFICDETLRDGIIINGEFEKAKFSLYDSIEPIICDRWPQAKIYRLAD
IENVKKQIAITREEKKVKSAASVTRSRKTKKGQPVNDNPESAQ
>P03041 ~~~C1~~~Transcriptional activator protein C1~~~
MELTSTRKKANAITSSILNRIAIRGQRKVADALGINESQISRWKGDFIPKMGMLLAVLEWGVEDEELAELAKKVAHLLTK
EKPQDCGNSFEA
>P14819 ~~~CI~~~Repressor protein CI~~~
MSSISERIKFLLAREGLKQRDLAEALSTSPQTVNNWIKRDALSREAAQQISEKFGYSLDWLLNGEGSPKKDLESNIPPES
EWGTVDAWDKNTPLPDDEVEVPFLKDIEFACGDGRVHDEDHNGFKLRFSKATLRRVGANSDGSGVLCFPASGDSMEPVIP
DGATVAVDTGNKRVIDGELYAINQGDLKRIKQLYRKPGGKILIRSINRDYDDEEADEADVEIIGFVFWYSVLRYRR
>P03034 ~~~cI~~~Repressor protein cI~~~
MSTKKKPLTQEQLEDARRLKAIYEKKKNELGLSQESVADKMGMGQSGVGALFNGINALNAYNAALLAKILKVSVEEFSPS
IAREIYEMYEAVSMQPSLRSEYEYPVFSHVQAGMFSPELRTFTKGDAERWVSTTKKASDSAFWLEVEGNSMTAPTGSKPS
FPDGMLILVDPEQAVEPGDFCIARLGGDEFTFKKLIRDSGQVFLQPLNPQYPMIPCNESCSVVGKVIASQWPEETFG
>P21678 ~~~CII~~~Regulatory protein CII~~~
MFDFQVSKHPHYDEACRAFAQRHNMAKLAERAGMNVQTLRNKLNPEQPHQFTPPELWLLTDLTEDSTLVDGFLAQIHCLP
CVPVNELAKDKLQSYVMRAMSELGELASGAVSDERLTTARKHNMIESVNSGIRMLSLSALALHARLQTNPAMSSVVDTMS
GIGASFGLI
>P69202 ~~~C2~~~Repressor protein C2~~~
MNTQLMGERIRARRKKLKIRQAALGKMVGVSNVAISQWERSETEPNGENLLALSKALQCSPDYLLKGDLSQTNVAYHSRH
EPRGSYPLISWVSAGQWMEAVEPYHKRAIENWHDTTVDCSEDSFWLDVQGDSMTAPAGLSIPEGMIILVDPEVEPRNGKL
VVAKLEGENEATFKKLVMDAGRKFLKPLNPQYPMIEINGNCKIIGVVVDAKLANLP
>P03042 ~~~cII~~~Transcriptional activator II~~~
MVRANKRNEALRIESALLNKIAMLGTEKTAEAVGVDKSQISRWKRDWIPKFSMLLAVLEWGVVDDDMARLARQVAAILTN
KKRPAATERSEQIQMEF
>P18681 ~~~CIII~~~Regulatory protein CIII~~~
MMHFQLAGSGVMSAFYPHESELSRRVKQLIRAAKKQLEALCAMK
>P03044 ~~~cIII~~~Protease inhibitor III~~~
MQYAIAGWPVAGCPSESLLERITRKLRDGWKRLIDILNQPGVPKNGSNTYGYPD
>P15238 ~~~C~~~Repressor protein C~~~
MHKGTFHMSRLTDTLAAKLEEAGITQAELARRVGQSQQAINNLFAGRAASSMVWRELARELGIDEQEMRQMMTEAGRDPE
KVTSLAGLRKYRAVLPSPREPFPIIRQQEHLPRPNATIGEETNMEPRKKKLLPVLGEAVGGEDGEYIFNGSVLDYVDCPP
SLENVPNAYAVYIDGESMVPRFRPGETVWVHPTKPPRRGDDVVIQIHPDNEDDGAPPRGFVKEFVGWTANKLVLQQYNPT
KKIEFTREQVVSVHPIILAGKYW
>P04132 ~~~C~~~Repressor protein C~~~
MSNTISEKIVLMRKSEYLSRQQLADLTGVPYGTLSYYESGRSTPPTDVMMNILQTPQFTKYTLWFMTNQIAPEFGQIAPA
LAHFGQNETTSPHSGQKTG
>P06153 ~~~~~~Immunity repressor protein~~~
MTVGQRIKAIRKERKLTQVQLAEKANLSRSYLADIERDRYNPSLSTLEAVAGALGIQVSAIVGEETLIKEEQAEYNSKEE
KDIAKRMEEIRKDLEKSDGLSFSGEPMSQEAVESLMEAMEHIVRQTQRINKKYTPKKYRNDDQE
>Q7T6X5 2.7.7.6~~~RPO1~~~DNA-directed RNA polymerase subunit 1~~~
MEANKNTYSRLGDTIETVERIEFCINSNESIIRHSAIVDPNGITEAETFNSNNNEPVQGGVIDKRLGVTESHLECSTCGE
TALRCPGHFGHIKFVEPVFHMGYLIYLKHILSCICIRCNKLLVYKNEKEIAALIKNKQGKQRFAEIRSICKKVTHCQKEN
YGCGTPAHKISIDKRNGNIFLLAEPVKRTDEYDETGETRKRPQQILTPQLCYDILKSVSDEDCIIMGFDPAKSRPEDMII
LNFPVPPVQVRPSIRAEILSSPTMDDDLTHKLIDIIKSNENLKNTKGDGSLIKYTSINDDFMLLQLHVATFFANDMAGLA
RSQQKNKKVTKSMSERLRGKEGRIRGNLMGKRVDMSARTVITSDPNIALNEVGVPLIIAKNLTFDEIVTEHNIEYLTQLV
KNGKRVYPGANFVIKHVIDAEGNESGHIYHLKYVDKPISLKPGDIVKRQLIDGDIVIFNRQPSLHKLSMMGHKCHVIPDN
NLLTFRVNVSVTDPYNADFDGDEMNLHVPQSIQTATEILLIANASRRFVSPATSNIAIKAKQDTLMGSYVQTEPDMEIDW
RDAMSILMSTSVKLDNDIPKYQNVSGKFLYSQIIPEGLNITKRKNDKEFQLKIKNGELTDGTLGKSEISSILQRIWFQYG
SKETQEFIDDAQRMILQFLMRYGYTVSIKDTVIGEKVNQYIYDLIETKRKETLAFITEYDNDPYVMTKDAFEIKLQENLK
SVQDEIKNTVMRNFDKNSGIFIAISSGSSGEPMNAGQIAGCIGQVIVEGKRIQIRFNGRTLPMFPKFDDSAFSRGFCRNS
FIEGLGPFEFFFQVMAGREGIINTAIKTADTGYIQRKLVKMLEDIKQEYDGTVRNANGKLISCVYGDNGINTENQVDQKI
DLISANDNKVRNDYVYTEDEIKYLIKNHKTDKRYTTDLNNSLYRKLISMRDQLRRIQRLVNLTSAEFKETYKMPVDIQQF
IFNIINRDVRNNNVVVDPYYVLKMIKDMYYGSDSKIMKYNNRTSRIKKEDEKRIKFLMKIYLYDVLAPKKCTHVYKFSKQ
EFDEIVDYFKKTIMLAKVEGGEMVGFVAAQSIGEPVTQTNLKSFHKSGTGKTVSGGLVRVKELLGISKNIKTPITEIILE
EKYKNDKITASRIASYLKYTTLRDVVEKADVIYDPEPFSKDGLMKKDGVDNIFDQEQGKTGCQTDIKNLPWVLRIMLSKE
KMIERNINMLEIKTSFCRNWGTRNEDKTSKKEFNKVIDKITQCAIVTNYDNSQVPIVHVRFDANNYNLNTLIQFQEMVIN
TYKIKGISNITESNNIIEESYVDFDDEGNVVKKKQYVIIAEGINLSEMSQINGIDLLRTKCNDIVTIYEMYGVEAARTAF
IKEFTAAIESSGGFSNYQHIEILADAITHMGGLIPVNRHGANKLDTDPFSRASFEKTVEQLLAAAVFGESDHMRSVSARI
MVGALINGGTGCFDLLLDHKKIQQSLVESEEVVAPVVPIKKKTVLDDLISKKKSK
>Q7T6X7 2.7.7.6~~~RPO2~~~DNA-directed RNA polymerase subunit 2~~~
MSKKSVEIEDVNNTYDQEAHFALLDLFFEKDKQVLVKHHIDSFNQFIEEIIPNILQGGDNVISEKATENKIIRYRLTFND
LGIKPPTLENEENLLYPLDAIRKQISYSAKYTATVTQWQDIVDIDTKKTETRIIGSPEKDVPIAKIPIMVLSKYCNLTLR
PDIAGKHCKYDAGGYFIVNGSEKVVLSVESMIPRKPVVFTQRDQNSLLYYVRVQSIPASQFVGNPQLFTVKMKRDNSIIL
SIPHFKEVSIFTFIRALGIETDEDIVDSILDVKKEKDLLNLLSICMNSSNTPSVTKEEALEIMANQIKSTKTFTDTNPEV
KAEQRRRYLDKIMTQFVLPHITSGTGDPEIDKIYKAHYICYMIHKLLKCYLRGAREVEEYRGCDDRDSMVNKRIDLTGRL
LGGLFKQFYDKMLNDCNKIFRTKNIDDKKPPNIIPHIKPNSIEQGLRQALSTGNFGSQSRKGLSQMLNRMNHLHSLSYMR
RVITPTVDASTMKMTSPRHLHNTQYGSMCPLESPEGKPKTGLVKNMAMMEGITINMNSQIPIIESYLIGKITTLESANKK
RLHQYVKVFLNGNWLGVTRNIIKIHNDLRAMRFRGELSRMVGLVLNYKTAEFHIYTDGGRLIRPYLTVTDNKLNFKPEML
DEVNSWEEFLAKFPEVIEYVDKEEEQNIMLAVFPQYIQDANRIMSKKPINSRDQLNKINRTNRYDDNVYVRYTHCEIHPC
MILGLISSNIPFPDHNQSPRGIFQYNQARQAMGLYISDYRERTDISYILYHPQIPLVTSRASKYTGTHIFPAGENSIVAI
ASYTGLMNQEDSLVINDSAIQKGYMRAQALKKYMEIIKKNPASSQTSIFMKPDRNKVDNLRDANYDKLSEEGYAKVETVI
RDGDVVIGVVNPKPTAREDEKQYKDASSIYKSLIPGAVDKVITEVNNDGYPIIKMRIRSERIPNVGDKFSSRAGQKGTIG
YKAHRADMLFSKSGLIPDIIINPNCMPKRMTIGQLIECLLGKLCAVKGVYGDATPFTSVDLNAINDELVAAGYEEWGNET
MYNGMNGKKLPVKIFIGPTYYQRLKQMVGDKAHSRARGPTQLLTRQAPEGRSRDGGLRIGFEMERDALCAHGVAQFLKEK
TVDNSDIYTCHVCDSCGQFAHKVPEKKYYTCTGCRNTTSISKIVIPYAFKLLLQELASINILGKIRTSKTIATPRG
>Q5UPX7 2.7.7.6~~~~~~DNA-directed RNA polymerase subunit 5~~~
MEPNKSILQIMKSNSDKVKIILGNIITMLGNRIYIDNNGDSKPLLDPTNAFKNMEERGDNVFIIKADNGDLYAVKVIFQK
ITAISKQSVISEFLDEYETYKKIIVTKDYTGKIDTFILRNSGQIFKEHEFLADLLSNEFQPRFQLLSPSEMESVKTEYNA
NPYTLKKIVRSDPIVRYFALKKNDIIRIIRPSATSGQGIDYRVVV
>P19811 ~~~rep~~~Replicase polyprotein 1ab~~~
MATFSATGFGGSFVRDWSLDLPDACEHGAGLCCEVDGSTLCAECFRGCEGMEQCPGLFMGLLKLASPVPVGHKFLIGWYR
AAKVTGRYNFLELLQHPAFAQLRVVDARLAIEEASVFISTDHASAKRFPGARFALTPVYANAWVVSPAANSLIVTTDQEQ
DGFCWLKLLPPDRREAGLRLYYNHYREQRTGWLSKTGLRLWLGDLGLGINASSGGLKFHIMRGSPQRAWHITTRSCKLKS
YYVCDISEADWSCLPAGNYGGYNPPGDGACGYRCLAFMNGATVVSAGCSSDLWCDDELAYRVFQLSPTFTVTIPGGRVCP
NAKYAMICDKQHWRVKRAKGVGLCLDESCFRGICNCQRMSGPPPAPVSAAVLDHILEAATFGNVRVVTPEGQPRPVPAPR
VRPSANSSGDVKDPAPVPPVPKPRTKLATPNPTQAPIPAPRTRLQGASTQEPLASAGVASDSAPKWRVAKTVYSSAERFR
TELVQRARSVGDVLVQALPLKTPAVQRYTMTLKMMRSRFSWHCDVWYPLAVIACLLPIWPSLALLLSFAIGLIPSVGNNV
VLTALLVSSANYVASMDHQCEGAACLALLEEEHYYRAVRWRPITGALSLVLNLLGQVGYVARSTFDAAYVPCTVFDLCSF
AILYLCRNRCWRCFGRCVRVGPATHVLGSTGQRVSKLALIDLCDHFSKPTIDVVGMATGWSGCYTGTAAMERQCASTVDP
HSFDQKKAGATVYLTPPVNSGSALQCLNVMWKRPIGSTVLGEQTGAVVTAVKSISFSPPCCVSTTLPTRPGVTVVDHALY
NRLTASGVDPALLRVGQGDFLKLNPGFRLIGGWIYGICYFVLVVVSTFTCLPIKCGIGTRDPFCRRVFSVPVTKTQEHCH
AGMCASAEGISLDSLGLTQLQSYWIAAVTSGLVILLVCHRLAISALDLLTLASPLVLLVFPWASVGLLLACSLAGAAVKI
QLLATLFVNLFFPQATLVTMGYWACVAALAVYSLMGLRVKVNVPMCVTPAHFLLLARSAGQSREQMLRVSAAAPTNSLLG
VARDCYVTGTTRLYIPKEGGMVFEGLFRSPKARGNVGFVAGSSYGTGSVWTRNNEVVVLTASHVVGRANMATLKIGDAML
TLTFKKNGDFAEAVTTQSELPGNWPQLHFAQPTTGPASWCTATGDEEGLLSGEVCLAWTTSGDSGSAVVQGDAVVGVHTG
SNTSGVAYVTTPSGKLLGADTVTLSSLSKHFTGPLTSIPKDIPDNIIADVDAVPRSLAMLIDGLSNRESSLSGPQLLLIA
CFMWSYLNQPAYLPYVLGFFAANFFLPKSVGRPVVTGLLWLCCLFTPLSMRLCLFHLVCATVTGNVISLWFYITAAGTSY
LSEMWFGGYPTMLFVPRFLVYQFPGWAIGTVLAVCSITMLAAALGHTLLLDVFSASGRFDRTFMMKYFLEGGVKESVTAS
VTRAYGKPITQESLTATLAALTDDDFQFLSDVLDCRAVRSAMNLRAALTSFQVAQYRNILNASLQVDRDAARSRRLMAKL
ADFAVEQEVTAGDRVVVIDGLDRMAHFKDDLVLVPLTTKVVGGSRCTICDVVKEEANDTPVKPMPSRRRRKGLPKGAQLE
WDRHQEEKRNAGDDDFAVSNDYVKRVPKYWDPSDTRGTTVKIAGTTYQKVVDYSGNVHYVEHQEDLLDYVLGKGSYEGLD
QDKVLDLTNMLKVDPTELSSKDKAKARQLAHLLLDLANPVEAVNQLNLRAPHIFPGDVGRRTFADSKDKGFVALHSRTMF
LAARDFLFNIKFVCDEEFTKTPKDTLLGYVRACPGYWFIFRRTHRSLIDAYWDSMECVYALPTISDFDVSPGDVAVTGER
WDFESPGGGRAKRLTADLVHAFQGFHGASYSYDDKVAAAVSGDPYRSDGVLYNTRWGNIPYSVPTNALEATACYRAGCEA
VTDGTNVIATIGPFPEQQPIPDIPKSVLDNCADISCDAFIAPAAETALCGDLEKYNLSTQGFVLPSVFSMVRAYLKEEIG
DAPPLYLPSTVPSKNSQAGINGAEFPTKSLQSYCLIDDMVSQSMKSNLQTATMATCKRQYCSKYKIRSILGTNNYIGLGL
RACLSGVTAAFQKAGKDGSPIYLGKSKFDPIPAPDKYCLETDLESCDRSTPALVRWFATNLIFELAGQPELVHSYVLNCC
HDLVVAGSVAFTKRGGLSSGDPITSISNTIYSLVLYTQHMLLCGLEGYFPEIAEKYLDGSLELRDMFKYVRVYIYSDDVV
LTTPNQHYAASFDRWVPHLQALLGFKVDPKKTVNTSSPSFLGCRFKQVDGKCYLASLQDRVTRSLLYHIGAKNPSEYYEA
AVSIFKDSIICCDEDWWTDLHRRISGAARTDGVEFPTIEMLTSFRTKQYESAVCTVCGAAPVAKSACGGWFCGNCVPYHA
GHCHTTSLFANCGHDIMYRSTYCTMCEGSPKQMVPKVPHPILDHLLCHIDYGSKEELTLVVADGRTTSPPGRYKVGHKVV
AVVADVGGNIVFGCGPGSHIAVPLQDTLKGVVVNKALKNAAASEYVEGPPGSGKTFHLVKDVLAVVGSATLVVPTHASML
DCINKLKQAGADPYFVVPKYTVLDFPRPGSGNITVRLPQVGTSEGETFVDEVAYFSPVDLARILTQGRVKGYGDLNQLGC
VGPASVPRNLWLRHFVSLEPLRVCHRFGAAVCDLIKGIYPYYEPAPHTTKVVFVPNPDFEKGVVITAYHKDRGLGHRTID
SIQGCTFPVVTLRLPTPQSLTRPRAVVAVTRASQELYIYDPFDQLSGLLKFTKEAEAQDLIHGPPTACHLGQEIDLWSNE
GLEYYKEVNLLYTHVPIKDGVIHSYPNCGPACGWEKQSNKISCLPRVAQNLGYHYSPDLPGFCPIPKELAEHWPVVSNDR
YPNCLQITLQQVCELSKPCSAGYMVGQSVFVQTPGVTSYWLTEWVDGKARALPDSLFSSGRFETNSRAFLDEAEEKFAAA
HPHACLGEINKSTVGGSHFIFSQYLPPLLPADAVALVGASLAGKAAKAACSVVDVYAPSFEPYLHPETLSRVYKIMIDFK
PCRLMVWRNATFYVQEGVDAVTSALAAVSKLIKVPANEPVSFHVASGYRTNALVAPQAKISIGAYAAEWALSTEPPPAGY
AIVRRYIVKRLLSSTEVFLCRRGVVSSTSVQTICALEGCKPLFNFLQIGSVIGPV
>Q83017 ~~~rep~~~Replicase polyprotein 1ab~~~
MQSGFDRCLCTPNARVFWEHGQVYCTRCLAARPLLPLSQQNPRLGALGLFYRPATPLTWEAPITYPTKECRPGGLCWLSG
IYPIARMTSGNHNFQARLNFVASVVYRDGKLTSKHLEEEFEVYSRGCRWYPITGPVPGIALYANAVHVSDEPFPGCTHVL
SNLPLPQQPLRKGLCPFSDARAEVWRYKGNTIFVSEQGYLWTTGSNDSVPEPWGEARRLCEKIIASLPADHLVKIEFSNY
PFDYSFTGGDGAGYVLFPCKKNDTKFSKCWEKVFEDHSSWKVACEEADLADRMGYRTPAGVAGPYLARRLQYRGLRAVVK
PEQNDYVVWALGVPESYIRHISRAGEPVENFFVRVGEFSIVSNCVATPYPKFRFQTRKYYGYSPPGDGACGLHCISAIIN
DIFGDALCTKLTNCSRDSSEWLSDQDMYQLVMTARLPATLGHCPSATYKLDCVNQHWTVTKRKGDRALGGLSPECVRGVC
GGECKFVPTYPREINLELAAKSPISALAFSLGVEPYCDCWNFTNSVLVNDSLAVETARAGEAYRSAMGIPKDDWVLLAEL
MTENCLTRREVLDKLQRGLRLHATSKPGSPASVSPASSIDFSAAGLLLDGTESDKEAVVAVNNDCYTVLGFDKNSATKSE
QELATGLFSELVEPMETSTSKHESRKILEAASRALKSAKPKRKRNKKKKTSSPTPTPPETPTREVPGAIEVVSGDEEAGA
CESATIVPDKAQARPPPRPKRQALKKAEQGFILKDIIWNPTESGVKCLTIVEDVRAFLKSITPPGGALGTRARITAHIVE
QFHVIRESTPELVLAHAEHQAKNMHELLLSEKAKLILGIGEDTLKKLVSSQRSLPRSIGFGAWLSDQQKTADSCGEREFV
EVPLKSGAEPTPSKRDLGVSLGDQLSQDGAPRLSSSTACEIKERVPPIKDSGGGLGQKFMAWLNHQVFLLSSHLLAMWSV
VLGSRQKLNWADYVYTLFCLCCVLLCFHFPAIGFIPLAGCVFGSPWRVRLSVFSVWLCVAVVVFQEVLPEPGSVCSSASA
ECAAALERYSGNGVHRPVNHIGVGLVGTVAGFVARVVGGPRHYWFYFLRLMVVLDLGLVFLAVALRGRCKKCFCKCVRVA
PHEVHLRVFPLTKVARPTLEAVCDMYSAPRVDPILVATGIKGCWQGKVSPHQVTDKPVSYSNLEEKKISNKTVVPPPTDP
QQAVKCLKVLQCGGSIQDVGVPEVKKVSKVPYKAPFFPNVSIDPECYIVVDPVTYSAAMRGGYGVSHLIVGTGDFAEVNG
LRFVSGGHVADFVCLGLYVMLNFLISAWLSSPVSCGRGTNDPWCKNPFSYPVVGQGVMCNSHLCISEDGLTSPMVLSYSL
IDWALMIAVIATVAIFIAKVSLLVDVICVFLCLLMYVFPPLSVIAFAFPFALCKVHLHPVTLVWVQFFLLAVNFWAGVAV
AVILISSWFLARATSSTGLVTPYDVHLVTSTPRGASSLASAPEGTYLAAVRRSALTGRCCMFVPTNFGSVLEGSLRTRGC
AKNVVSVFGSASGSGGVFTIHGNPVVVTATHLLSDGKARVSCVGFSQCLTFKSVGDYAFARVAEWKGDAPKVELSDRRGR
AYCSPQVEWSLVLLGPNTAFCFTKCGDSGSPVVDEDGNLIGVHTGSNKRGSGMITTHNGKTLGMSNVKLSEMCQHYGGSG
VPVSTVRLPKHLIVDVEAVASDLVAVVESLPTPEGALSSVQLLCVFFFLWRLIHVPFVPVIAVAFFFLNEILPVVLARLM
FSFALSLFSVFTGFSVQVLLLRLVIAALNRSAVSFGSFLLGQLFHCCLMPSHLETLGPVPGYFYPSTTEVASKEIFVTLL
AIHVLALLLSLFKRPMLADVLVGNGSFDAAFFLKYFAEGNLRDGVSDSCNMTPEGLTAALAITLSDDDLEFLQRHSEFKC
FVSASNMRNGAKEFIESAYARALRAQLAATDKIKASKSILAKLESFAGGVVTKVEPGDVVVVLGKKIVGDLVEITINDVK
HVIRVIETRVMAGTQFSVGTICGDLENACEDPSGLVKTSKKQRRRQKRTGLGTEVVGTVEIDGVSYNKVWHKATGDVTYE
GFLVSENSRLRTLGTSAIGRFQEFIRKHGSKVKTSVEKYPVGKNKHIEFAVTTYNLDGEEFDVPDHEPLEWTITIGDSDL
EAERLTVDQALRHMGHDSLLTPKEKEKLARIIESLNGLQQSSALNCLTTSGLERCSRGGVTVSKDAVKIVKYHSRTFSIG
DVNLKVMSFDEYRRTMGKPGHLLVAKLTDGVVVMRKHEPSLVDVILTGEDAEFFPRTHGPGNTGIHRFVWDFESPPVDLE
LELSEQIITACSMRRGDAPALDLPYKLHPVRGDPYRHRGVLFNTRFGDITYLIPEKTKEPLHAAACYNKGVPVSDSETLV
ATTLPHGFELYVPTLPPSVLEYLDSRPDTPRMLTKHGCASAAEKDLQKFDLSRQGFVLPGVLYMVRRYLSRLIGVRRRLF
MPSTYPAKNSMAGINGGRFPLTWLQSHPDIDALCKRACEEHWQTVTPCTLKKQYCSKSKTRTILGTNNFVALGLRSALSG
VTQGFMRKGIGTPICLGKNKFTPLPVRIGGRCLEADLASCDRSTPAIIRWFTTNLLFELAGAEEWIPSYVLNCCHDVVST
MSGCFDKRGGLSSGDPVTSISNTVYSLIIYAQHMVLSAFRCGHKIGGLFLQDSLEMEQLFELQPLLVYSDDVVFYNESDE
LPNYHFFVDHLDLMLGFKTDRSKTVITSEPKLPGCRISGGRVLVPQRDRIVAALAYQMKASCVGEYFASAAAILMDACAC
CDHDESWYFDLVCGIAECAGSPWFRFPGPSFFLDMWNRLSAEEKKKCRTCAHCGAPATLVSSCGLNLCDYHGHGHPHCPV
VLPCGHAVGSGVCEQCSSSAMNLNTELDILLMCVPYHPPKVELLSVNDKVSSLPPGAYQARGGVVSVRRDILGNVVDLPD
GDYQVMKVAQTCADISMVSVNSNILRSQFVTGAPGTGKTTYLLSVVRDDDVIYTPTHRTMLDVVKALKVCRFDPPKDTPL
EFPVPGRTGPTVRLIGAGFVPGRVSYLDEAAYCNPLDVLKVLSKTPLVCVGDLNQLPPVGFNGPCFAFSLMPGRQLIEVF
RFGPAVVNSIKKFYKEELVPRGPDTGVKFLKQYQPYGQVLTPYHRDRVDGAITIDSSQGCTYDVVTVYLPTPKSLNSARA
LVALTRARHYVFIYDPYDQLQQYLQVFEHEPADAWAFWCGDQPKMIVGGVVKQLAGHSRTTDLKLQQLMGLEGTASPLPQ
VGHNLGFYYSPDLIQFAKIPPELCKHWPVVTAQNRTEWPDRLVCGMNKMDKNSRAVFCAGYYVGPSIFLGVPGVVSYYLT
KYLKGESVPLPDSIMSTGRIRLNVREYLDENEIEFAKKCPQPFIGEVKGSNVGGCHHVTSRFLPPVLVPGSVVKVGVSCP
GKAAKGLCTVTDVYLPELDSYLHPPSKSMDYKLLVDFQPVKLMVWKDATAYFHEGIRPMEAMSRFLKVPEGEGVFFDLDE
FVTNAKVSKLPCKYSVSAHQFLTEVVLSMTPTSEAPPDYELLFARAYCVPGLDVGTLNAYIYKRGPSTYTTSNFARLVKD
TAVPVGCKGSGYMFPK
>Q9YN02 ~~~rep~~~Replicase polyprotein 1ab~~~
MSGILDRCTCTPNARVFVAEGQVYCTRCLSARSLLPLNLQVSELGVLGLFYRPEEPLRWTLPRAFPTVECSPAGACWLSA
IFPIARMTSGNLNFQQRMVRVAAEIYRAGQLTPAVLKALQVYERGCRWYPIVGPVPGVAVFANSLHVSDKPFPGATHVLT
NLPLPQRPKPEDFCPFECAMATVYDIGHDAVMYVAEGKISWAPRGGDEVKFEAVPGELKLIANRLRTSFPPHHAVDMSKF
AFTAPGCGVSMRVERQHGCLPADTVPEGNCWWSLFDLLPLEVQDKEIRHANQFGYQTKHGVSGKYLQRRLQVNGLRAVTD
SNGPIVVQYFSVKESWIRHLKLAGEPSYSGFEDLLRIRVEPNTSPLANTEGKIFRFGSHKWYGAGKRARKARSCATATVA
GRALSVRETRQAKEHEVAGADKAEHLKHYSPPAEGNCGWHCISAIANRMVNSIFETTLPERVRPPDDWATDDDLANAIQI
LRLPAALDRNGACTSAKYVLKLEGEHWTVTVTPGMSPSLLPLECVQGCCEHKGGLGSPDAIEVSGFDPACLDWLAEVMHL
PSSAIPAALAEMSGDSDRSASPVTTVWTVSQFFARHSGGNHPDQVRLGKIISLCQVIEDCCCSQNKTNRVTPEEVAAKID
LYLRGATNLEECLARLEKARPPRVIDTSFDWDVVLPGVEAATQTNKLPQVNQCRALVPVVTQKSLDNNSVPLTAFSLANY
YYRAQGDEVRHRERLTAVLSKLEEVVREEYGLMPTEPGPRPTLPRGLDELKDQMEEDLLRLANAQATSDMMAWAVEQVDL
KTWVKNYPRWTPPPPPPKVQPRKTKPVKSLPERKPVPAPRRKVGPDCGSPVSLGGDVPNSWEDLAVSSPLDLPTPPEPAT
LSSELVIVSSPQCIFRPATPLSEPAPIPAPRGTVSRPVTPLSEPIPVPAPRRKFQQVKRLSSAAAVPLHQNEPLDLSASS
QTEYEASPSAPPQSGGVLGVEGHEAEETLSEISDMSGNIKPASVSSSSSLSSVEITRPKYSAQAIIDSGGPCSGHLQGVK
ETCLSVMREACDATKLDDPATQEWLSRMWDRVDMLTWRNTSVCQAIRTLDGRLKFLPKMILETPPPYPCEFVMMPHTPAP
SVGAESDLTIGSVATEDVPRILEKIENVGEMANQEPSAFSEDKPVDDQLVNDPRISSRRPDESTAAPSAGTGGAGSFTDL
PSSDGADADGGGPFRTAKRKAERLFDQLSRQVFDLVSHLPVFFSRLFHPGGGYSTGDWGFAAFTLLCLFLCYSYPAFGIA
PLLGVFSGTSRRVRMGVFGCWLAFAVGLFKPVSDPVGAACEFDSPECRNILLSFELLKPWDPVRSLVVGPVGLGLAILGR
LLGGARCIWHFLLRLGIVADCILAGAYVLSQGRCKKCWGSCIRTAPNEVAFNVFPFTRATRSSLIDLCDRFCAPKGMDPI
FLATGWRGCWAGRSPIEQPSEKPIAFAQLDEKKITARTVVAQPYDPNQAVKCLRVLQAGGAMVAEAVPKVVKVSAVPFRA
PFFPTGVKVDPDCRVVVDPDTFTAALRSGYSTTNLVLGVGDFAQLNGLKIRQISKPSGGGPHLMAALHVACSMALHMLTG
IYVTAVGSCGTGTNDPWCANPFAVPGYGPGSLCTSRLCISQHGLTLPLTALVAGFGIQEIALVVLIFVSIGGMAHRLSCK
ADMLCILLAIASYVWVPLTWLLCVFPCWLRCFSLHPLTILWLVFFLISVNMPSGILAMVLLVSLWLLGRYTNVAGLVTPY
DIHHYTSGPRGVAALATAPDGTYLAAVRRAALTGRTMLFTPSQLGSLLEGAFRTRKPSLNTVNVIGSSMGSGGVFTIDGK
VKCVTAAHVLTGNSARVSGVGFNQMLDFDVKGDFAIADCPNWQGAAPKAQFCADGWTGRAYWLTSSGVEPGVIGKGFAFC
FTACGDSGSPVITEAGELVGVHTGSNKQGGGIVTRPSGQFCNVAPIKLSELSEFFAGPKVPLGDVKVGSHIIKDISEVPS
DLCALLAAKPELEGGLSTVQLLCVFFLLWRMMGHAWTPLVAVSFFILNEVLPAVLVRSVFSFGMFVLSWLTPWSAQILMI
RLLTAALNRNRWSLAFFSLGAVTGFVADLAATQGHPLQAVMNLSTYAFLPRMMVVTSPVPVITCGVVHLLAIILYLFKYR
GLHQILVGDGVFSAAFFLRYFAEGKLREGVSQSCGMNHESLTGALAMRLNDEDLDFLMKWTDFKCFVSASNMRNAAGQFI
EAAYAKALRVELAQLVQVDKVRGVLAKLEAFADTVAPQLSPGDIVVALGHTPVGSIFDLKVGSTKHTLQAIETRVLAGSK
MTVARVVDPTPTPPPAPVPIPLPPKVLENGPNAWGDEDRLNKKKRRRMEALGIYVMGGKKYQKFWDKNSGDVFYEEVHNN
TDEWECLRVGDPADFDPEKGTLCGHVTIENKAYHVYISPSGKKFLVPVNPENGRVQWEAAKLSMEQALGMMNVDGELTAK
ELEKLKRIIDKLQGLTKEQCLNCLLAASGLTRCGRGGLVVTETAVKIVKFHNRTFTLGPVNLKVASEVELKDAVEHNQHP
VARPIDGGVVLLRSAVPSLIDVLISGADASPKLLAHHGPGNTGIDGTLWDFESEATKEEVALSAQIIQACDIRRGDAPKI
GLPYKLYPVRGNPERVKGVLQNTRFGDIPYKTPSDTGSPVHAAACLTPNATPVTDGRSVLATTMPPGFELYVPTIPASVL
DYLDSRPDCPKQLTEHGCEDAALKDLSKYDLSTQGFVLPGVLRLVRKYLFAHVGKCPPVHRPSTYPAKNSMAGINGNRFP
TKDIQSVPEIDVLCAQAVRENWQTVTPCTLKKQYCGKKKTRTILGTNNFIALAHRAALSGVTQGFMKKAFNSPIALGKNK
FKELQTSVLGRCLEADLASCDRSTPAIVRWFAANLLYELACAEEHLPSYVLNCCHDLLVTQSGAVTKRGGLSSGDPITSV
SNTIYSLVIYAQHMVLSYFKSGHPHGLLFLQDQLKFEDMLKVQPLIVYSDDLVLYAESPTMPNYHWWVEHLNLMLGFQTD
PKKTAITDSPSFLGCRIINGRQLVPNRDRILAALAYHMKASNVSEYYASAAAILMDSCACLEYDPEWFEELVVGIAQCAR
KDGYSFPGTPFFMSMWEKLRSNYEGKKSRVCGYCGAPAPYATACGLDVCIYHTHFHQHCPVTIWCGHPAGSGSCSECKSP
VGKGTSPLDEVLEQVPYKPPRTVIMHVEQGLTPLDPGRYQTRRGLVSVRRGIRGNEVELPDGDYASTALLPTCKEINMVA
VASNVLRSRFIIGPPGAGKTYWLLQQVQDGDVIYTPTHQTMLDMIRALGTCRFNVPAGTTLQFPVPSRTGPWVRILAGGW
CPGKNSFLDEAAYCNHLDVLRLLSKTTLTCLGDFKQLHPVGFDSHCYVFDIMPQTQLKTIWRFGQNICDAIQPDYRDKLM
SMVNTTRVTYVEKPVRYGQVLTPYHRDREDDAITIDSSQGATFDVVTLHLPTKDSLNRQRALVAITRARHAIFVYDPHRQ
LQGLFDLPAKGTPVNLAVHRDGQLIVLDRNNKECTVAQALGNGDKFRATDKRVVDSLRAICADLEGSSSPLPKVAHNLGF
YFSPDLTQFAKLPVELAPHWPVVTTQNNEKWPDRLVASLRPIHKYSRACIGAGYMVGPSVFLGTPGVVSYYLTKFVKGEA
QLLPETVFSTGRIEVDCREYLDDREREVAASLPHAFIGDVKGTTVGGCHHVTSRYLPRVLPKESVAVVGVSSPGKAAKAL
CTLTDVYLPDLEAYLHPETQSKCWKMMLDFKEVRLMVWRDKTAYFQLEGRYFTWYQLASYASYIRVPVNSTVYLDPCMGP
ALCNRRVVGSTHWGADLAVTPYDYGAKIILSSAYHGEMPPGYKILACAEFSLDDPVRYKHTWGFESDTAYLYEFTGNGED
WEDYNDAFRARQEGKIYKATATSLKFHFPPGPVIEPTLGLN
>Q8B912 ~~~rep~~~Replicase polyprotein 1ab~~~
MSGILDRCTCTPNARVFVAEGQVYCTRCLSARSLLPLNLQVPELGVLGLFYRPEEPLRWTLPRAFPTVECSPTGACWLSA
IFPIARMTSGNLNFQQRMVRVAGEIYRAGQLTPTVLKTIQVYERGCRWYPIVGPVPGVGVYANSLHVSDKPFPGATHVLT
NLPLPQRPKPEDFCPFECAMADVYDIGRGAVMYVAGGKVSWAPRGGDEVKFEPVPKELKLVANRLHTSFPPHHVVDMSKF
TFMTPGSGVSMRVEYQYGCLPADTVPEGNCWWRLFDLLPPEVQNKEIRHANQFGYQTKHGVPGKYLQRRLQVNGLRAVTD
THGPIVIQYFSVKESWIRHLKPVEEPSLPGFEDLLRIRVEPNTSPLAGKNEKIFRFGSHKWYGAGKRARKARSGATTMVA
HRASSAHETRQATKHEGAGANKAEHLKLYSPPAEGNCGWHCISAIVNRMVNSNFETTLPERVRPPDDWATDEDLVNTIQI
LRLPAALDRNGACGGAKYVLKLEGEHWTVSVNPGMSPSLLPLECVQGCCEHKGGLGSPDAVEVSGFDPACLDRLLQVMHL
PSSTIPAALAELSDDSNRPVSPAAATWTVSQSYARHRGGNHHDQVCLGKIISLCQVIEDCCCHQNKTNRATPEEVAAKID
QYLRGATSLEECLAKLERVSPPGAADTSFDWNVVLPGVEAAHQTTEQLHVNPCRTLVPPVTQEPLGKDSVPLTAFSLSNC
YYPAQGNEVRHRERLNSVLSKLEEVVLEEYGLMSTGLGPRPVLPSGLDELKDQMEEDLLKLANTQATSEMMAWAAEQVDL
KAWVKSYPRWTPPPPPPRVQPRKTKSVKSLPEDKPVPAPRRKVRSGCGSPVLMGDNVPNGSEDLTVGGPLNFPTPSEPMT
PMSEPVLTPALQRVPKLMTPLDGSAPVPAPRRTVSRPMTPLSEPIFLSAPRHKFQQVEEANPATTTLTHQNEPLDLSASS
QTEYEASPLASSQNMSILEAGGQEAEEVLSEISDILNDTSPAPVSSSSSLSSVKITRPKYSAQAIIDSGGPCSGHLQKEK
EACLSIMREACDASKLSDPATQEWLSRMWDRVDMLTWRNTSAYQAFRTLNGRFEFLPKMILETPPPHPCGFVMLPHTPAP
SVSAESDLTIGSVATEDVPRILGKIGDTGELLNQGPSAPFKGGPVCDQPAKNSRMSPRESDESIIAPPADTGGAGSFTDL
PSSDSVDANGGGPLRTVKTKAGRLLDQLSCQVFSLVSHLPVFFSHLFKSDSGYSPGDWGFAAFTLFCLFLCYSYPFFGFA
PLLGVFSGSSRRVRMGVFGCWLAFAVGLFKPVSDPVGTACEFDSPECRNVLHSFELLKPWDPVRSLVVGPVGLGLAILGR
LLGGARYVWHFLLRFGIVADCILAGAYVLSQGRCKKCWGSCVRTAPNEIAFNVFPFTRATRSSLIDLCDRFCAPKGMDPI
FLATVWRGCWTGRSPIEQPSEKPIAFAQLDEKRITARTVVAQPYDPNQAVKCLRVLQAGGAMVAEAVPKVVKVSAIPFRA
PFFPAGVKVDPECRIVVDPDTFTTALRSGYSTTNLVLGMGDFAQLNGLKIRQISKPSGGGSHLVAALHVACSMALHMLAG
VYVTAVGSCGTGTNDPWCTNPFAAPGYGPGSLCTSRLCISQHGLTLPLTALVAGFGLQEIALVVLIFVSMGGMAHRLSCK
ADMLCILLAIASYVWVPLTWLLCVFPCWLRWFSLHPLTILWLVFFLISVNIPSGILAVVLLVSLWLLGRYTNIAGLVTPY
DIHHYTSGPRGVAALATAPDGTYLAAVRRAALTGRTMLFTPSQLGSLLEGAFRTQKPSLNTVNVVGSSMGSGGVFTIDGK
IKCVTAAHVLTGNSARVSGVGFNQMLDFDVKGDFAIADCPNWQGAAPKAQFCEDGWTGRAYWLTSSGVEPGVIGNGFAFC
FTACGDSGSPVITEAGELVGVHTGSNKQGGGIVTRPSGQFCNVTPIKLSELSEFFAGPKVPLGDVKIGSHIIKDTCEVPS
DLCALLAAKPELEGGLSTVQLLCVFFLLWRMMGHAWTPLVAVGFFILNEILPAVLVRSVFSFGMFVLSWLTPWSAQVLMI
RLLTAALNRNRLSLGFYSLGAVTSFVADLAVTQGHPLQVVMNLSTYAFLPRMMVVTSPVPVIACGVVHLLAIILYLFKYR
CLHYVLVGDGVFSSAFFLRYFAEGKLREGVSQSCGMSHESLTGALAMRLTDEDLDFLTKWTDFKCFVSASNMRNAAGQFI
EAAYAKALRIELAQLVQVDKVRGTLAKLEAFADTVAPQLSPGDIVVALGHTPVGSIFDLKVGSTKHTLQAIETRVLAGSK
MTVARVVDPTPAPPPVPVPIPLPPKVLENGPNAWGDEDRLNKKKRRRMEAVGIFVMDGKKYQKFWDKNSGDVFYEEVHNS
TDEWECLRAGDPADFDPETGVQCGHITIEDRVYNVFTSPSGRKFLVPANPENRRAQWEAAKLSVEQALGMMNVDGELTAK
ELEKLKGIIDKLQGLTKEQCLNCLLAASGLTRCGRGGLVVTETAVKIVKFHNRTFTLGPVNLKVASEVELKDAVEHNQHP
VARPVDGGVVLLRSAVPSLIDVLISGADASPKLLARHGPGNTGIDGTLWDFEAEATKEEVALSAQIIQACDIRRGDAPEI
GLPYKLYPVRGNPERVKGVLQNTRFGDIPYKTPSDTGSPVHAAACLTPNATPVTDGRSVLATTMPSGFELYVPTIPASVL
DYLDSRPDCPKQLTEHGCEDAALRDLSKYDLVTQGFVLPGVLRLVRKYLFAHVGKCPPVHRPSTYPAKNSMAGINGNRFP
TKDIQSVPEIDVLCAQAVRENWQTVTPCTLKKQYCGKKKTRTILGTNNFIALAHRAALSGVTQGFMKKAFNSPIALGKNK
FKELQTPVLGRCLEADLASCDRSTPAIVRWFAANLLYELACAEEHLPSYVLNCCHDLLVTQSGAVTKRGGLSSGDPITSV
SNTIYSLVIYAQHMVLSYFKSGHPHGLLFLQDQLKFEDMLKVQPLIVYSDDLVLYAESPSMPNYHWWVEHLNLMLGFQTD
PKKTAITDSPTFLGCRIINGRQLVPNRDRILAALAYHMKASNVSEYYASAAAILMDSCACLEYDPEWFEELVVGIAQCAR
KDGYSFPGPPFFLSMWEKLRSNHEGKKSRMCGYCMAPAPYATACGLDVCVYHTHFHQHCPVIIWCGHPAGSGSCGECEPP
LGKGTSPLDEVLEQVPYKPPRTVIMHVEQGLTPLDPGRYQTRRGLVSVRRGIRGNEVDLPDGDYASTALLPTCKEINMVA
VAPNVLRSRFIIGPPGAGKTHWLLQQVQDGDVIYTPTHQTMLDMIRALGTCRFNVPAGTTLQFPAPSRTGPWVRILAGGW
CPGKNSFLDEAAYCNHLDVLRLLSKTTLTCLGDFKQLHPVGFDSHCYVFDIMPQTQLKTIWRFGQNICDAIQPDYRDKLV
SMVNTTRVTYVEKPVRYGQVLTPYHRDREDGAITIDSSQGATFDVVTLHLPTKDSLNRQRALVAITRARHAIFVYDPHRQ
LQSMFDLPAKGTPVNLAVHRDEQLIVLDRNNKEITVAQALGNGDKFRATDKRVVDSLRAICADLEGSSSPLPKVAHNLGF
YFSPDLTQFAKLPAELAPHWPVVTTQNNERWPDRLVASLRPIHKYSRACIGAGYMVGPSVFLGTPGVVSYYLTKFVRGEA
QVLPETVFSTGRIEVDCREYLDDREREVAESLPHAFIGDVKGTTVGGCHHVTSKYLPRFLPKESVAVVGVSSPGEAAKAF
CTLTDVYLPDLEAYLHPETQSKCWKVMLDFKEVRLMVWKGKTAYFQLEGRHFTWYQLASYTSYIRVPVNSTVYLDPCMGP
ALCNRRVVGSTHWGADLAVTPYDYGAKIILSSAYHGEMPPGYKILACAEFSLDDPVRYKHTWGFESDTAYLYEFTGNGED
WEDYNGAFRARQKGKIYKATATSMKFHFPPGPVIEPTLGLN
>Q04561 ~~~rep~~~Replicase polyprotein 1ab~~~
MSGTFSRCMCTPAARVFWNAGQVFCTRCLSARSLLSPELQDTDLGAVGLFYKPRDKLHWKVPIGIPQVECTPSGCCWLSA
VFPLARMTSGNHNFLQRLVKVADVLYRDGCLAPRHLRELQVYERGCNWYPITGPVPGMGLFANSMHVSDQPFPGATHVLT
NSPLPQQACRQPFCPFEEAHSSVYRWKKFVVFTDSSLNGRSRMMWTPESDDSAALEVLPPELERQVEILIRSFPAHHPVD
LADWELTESPENGFSFNTSHSCGHLVQNPDVFDGKCWLSCFLGQSVEVRCHEEHLADAFGYQTKWGVHGKYLQRRLQVRG
IRAVVDPDGPIHVEALSCPQSWIRHLTLDDDVTPGFVRLTSLRIVPNTEPTTSRIFRFGAHKWYGAAGKRARAKRAAKSE
KDSAPTPKVALPVPTCGITTYSPPTDGSCGWHVLAAIMNRMINGDFTSPLTQYNRPEDDWASDYDLVQAIQCLRLPATVV
RNRACPNAKYLIKLNGVHWEVEVRSGMAPRSLSRECVVGVCSEGCVAPPYPADGLPKRALEALASAYRLPSDCVSSGIAD
FLANPPPQEFWTLDKMLTSPSPERSGFSSLYKLLLEVVPQKCGATEGAFIYAVERMLKDCPSSKQAMALLAKIKVPSSKA
PSVSLDECFPTDVLADFEPASQERPQSSGAAVVLCSPDAKEFEEAAPEEVQESGHKAVHSALLAEGPNNEQVQVVAGEQL
KLGGCGLAVGNAHEGALVSAGLINLVGGNLSPSDPMKENMLNSREDEPLDLSQPAPASTTTLVREQTPDNPGSDAGALPV
TVREFVPTGPILCHVEHCGTESGDSSSPLDLSDAQTLDQPLNLSLAAWPVRATASDPGWVHGRREPVFVKPRNAFSDGDS
ALQFGELSESSSVIEFDRTKDAPVVDAPVDLTTSNEALSVVDPFEFAELKRPRFSAQALIDRGGPLADVHAKIKNRVYEQ
CLQACEPGSRATPATREWLDKMWDRVDMKTWRCTSQFQAGRILASLKFLPDMIQDTPPPVPRKNRASDNAGLKQLVAQWD
RKLSVTPPPKPVGPVLDQIVPPPTDIQQEDVTPSDGPPHAPDFPSRVSTGGSWKGLMLSGTRLAGSISQRLMTWVFEVFS
HLPAFMLTLFSPRGSMAPGDWLFAGVVLLALLLCRSYPILGCLPLLGVFSGSLRRVRLGVFGSWMAFAVFLFSTPSNPVG
SSCDHDSPECHAELLALEQRQLWEPVRGLVVGPSGLLCVILGKLLGGSRYLWHVLLRLCMLADLALSLVYVVSQGRCHKC
WGKCIRTAPAEVALNVFPFSRATRVSLVSLCDRFQTPKGVDPVHLATGWRGCWRGESPIHQPHQKPIAYANLDEKKMSAQ
TVVAVPYDPSQAIKCLKVLQAGGAIVDQPTPEVVRVSEIPFSAPFFPKVPVNPDCRVVVDSDTFVAAVRCGYSTAQLVLG
RGNFAKLNQTPPRNSISTKTTGGASYTLAVAQVSAWTLVHFILGLWFTSPQVCGRGTADPWCSNPFSYPTYGPGVVCSSR
LCVSADGVTLPLFSAVAQLSGREVGIFILVLVSLTALAHRMALKADMLVVFSAFCAYAWPMSSWLICFFPILLKWVTLHP
LTMLWVHSFLVFCLPAAGILSLGITGLLWAIGRFTQVAGIITPYDIHQYTSGPRGAAAVATAPEGTYMAAVRRAALTGRT
LIFTPSAVGSLLEGAFRTHKPCLNTVNVVGSSLGSGGVFTIDGRRTVVTAAHVLNGDTARVTGDSYNRMHTFKTNGDYAW
SHADDWQGVAPVVKVAKGYRGRAYWQTSTGVEPGIIGEGFAFCFTNCGDSGSPVISESGDLIGIHTGSNKLGSGLVTTPE
GETCTIKETKLSDLSRHFAGPSVPLGDIKLSPAIIPDVTSIPSDLASLLASVPVVEGGLSTVQLLCVFFLLWRMMGHAWT
PIVAVGFFLLNEILPAVLVRAVFSFALFVLAWATPWSAQVLMIRLLTASLNRNKLSLAFYALGGVVGLAAEIGTFAGRLS
ELSQALSTYCFLPRVLAMTSCVPTIIIGGLHTLGVILWLFKYRCLHNMLVGDGSFSSAFFLRYFAEGNLRKGVSQSCGMN
NESLTAALACKLSQADLDFLSSLTNFKCFVSASNMKNAAGQYIEAAYAKALRQELASLVQIDKMKGVLSKLEAFAETATP
SLDIGDVIVLLGQHPHGSILDINVGTERKTVSVQETRSLGGSKFSVCTVVSNTPVDALTGIPLQTPTPLFENGPRHRSEE
DDLKVERMKKHCVSLGFHNINGKVYCKIWDKSTGDTFYTDDSRYTQDHAFQDRSADYRDRDYEGVQTTPQQGFDPKSETP
VGTVVIGGITYNRYLIKGKEVLVPKPDNCLEAAKLSLEQALAGMGQTCDLTAAEVEKLKRIISQLQGLTTEQALNCLLAA
SGLTRCGRGGLVVTETAVKIIKYHSRTFTLGPLDLKVTSEVEVKKSTEQGHAVVANLCSGVILMRPHPPSLVDVLLKPGL
DTIPGIQPGHGAGNMGVDGSIWDFETAPTKAELELSKQIIQACEVRRGDAPNLQLPYKLYPVRGDPERHKGRLINTRFGD
LPYKTPQDTKSAIHAACCLHPNGAPVSDGKSTLGTTLQHGFELYVPTVPYSVMEYLDSRPDTPFMCTKHGTSKAAAEDLQ
KYDLSTQGFVLPGVLRLVRRFIFGHIGKAPPLFLPSTYPAKNSMAGINGQRFPTKDVQSIPEIDEMCARAVKENWQTVTP
CTLKKQYCSKPKTRTILGTNNFIALAHRSALSGVTQAFMKKAWKSPIALGKNKFKELHCTVAGRCLEADLASCDRSTPAI
VRWFVANLLYELAGCEEYLPSYVLNCCHDLVATQDGAFTKRGGLSSGDPVTSVSNTVYSLVIYAQHMVLSALKMGHEIGL
KFLEEQLKFEDLLEIQPMLVYSDDLVLYAERPTFPNYHWWVEHLDLMLGFRTDPKKTVITDKPSFLGCRIEAGRQLVPNR
DRILAALAYHMKAQNASEYYASAAAILMDSCACIDHDPEWYEDLICGIARCARQDGYSFPGPAFFMSMWEKLRSHNEGKK
FRHCGICDAKADYASACGLDLCLFHSHFHQHCPVTLSCGHHAGSKECSQCQSPVGAGRSPLDAVLKQIPYKPPRTVIMKV
GNKTTALDPGRYQSRRGLVAVKRGIAGNEVDLSDGDYQVVPLLPTCKDINMVKVACNVLLSKFIVGPPGSGKTTWLLSQV
QDDDVIYTPTHQTMFDIVSALKVCRYSIPGASGLPFPPPARSGPWVRLIASGHVPGRVSYLDEAGYCNHLDILRLLSKTP
LVCLGDLQQLHPVGFDSYCYVFDQMPQKQLTTIYRFGPNICAAIQPCYREKLESKARNTRVVFTTRPVAFGQVLTPYHKD
RIGSAITIDSSQGATFDIVTLHLPSPKSLNKSRALVAITRARHGLFIYDPHNQLQEFFNLTPERTDCNLVFSRGDELVVL
NADNAVTTVAKALETGPSRFRVSDPRCKSLLAACSASLEGSCMPLPQVAHNLGFYFSPDSPTFAPLPKELAPHWPVVTHQ
NNRAWPDRLVASMRPIDARYSKPMVGAGYVVGPSTFLGTPGVVSYYLTLYIRGEPQALPETLVSTGRIATDCREYLDAAE
EEAAKELPHAFIGDVKGTTVGGCHHITSKYLPRSLPKDSVAVVGVSSPGRAAKAVCTLTDVYLPELRPYLQPETASKCWK
LKLDFRDVRLMVWKGATAYFQLEGLTWSALPDYARFIQLPKDAVVYIDPCIGPATANRKVVRTTDWRADLAVTPYDYGAQ
NILTTAWFEDLGPQWKILGLQPFRRAFGFENTEDWAILARRMNDGKDYTDYNWNCVRERPHAIYGRARDHTYHFAPGTEL
QVELGKPRLPPGQVP
>Q9WJB2 ~~~rep~~~Replicase polyprotein 1ab~~~
MSGILDRCTCTPNARVFMAEGQVYCTRCLSARSLLPLNLQVSELGVLGLFYRPEEPLRWTLPRAFPTVECSPAGACWLSA
IFPIARMTSGNLNFQQRMVRVAAELYRAGQLTPAVLKALQVYERGCRWYPIVGPVPGVAVFANSLHVSDKPFPGATHVLT
NLPLPQRPKPEDFCPFECAMATVYDIGHDAVMYVAERKVSWAPRGGDEVKFEAVPGELKLIANRLRTSFPPHHTVDMSKF
AFTAPGCGVSMRVERQHGCLPADTVPEGNCWWSLFDLLPLEVQNKEIRHANQFGYQTKHGVSGKYLQRRLQVNGLRAVTD
LNGPIVVQYFSVKESWIRHLKLAGEPSYSGFEDLLRIRVEPNTSPLADKEEKIFRFGSHKWYGAGKRARKARSCATATVA
GRALSVRETRQAKEHEVAGANKAEHLKHYSPPAEGNCGWHCISAIANRMVNSKFETTLPERVRPPDDWATDEDLVNAIQI
LRLPAALDRNGACTSAKYVLKLEGEHWTVTVTPGMSPSLLPLECVQGCCGHKGGLGSPDAVEVSGFDPACLDRLAEVMHL
PSSAIPAALAEMSGDSDRSASPVTTVWTVSQFFARHSGGNHPDQVRLGKIISLCQVIEDCCCSQNKTNRVTPEEVAAKID
LYLRGATNLEECLARLEKARPPRVIDTSFDWDVVLPGVEAATQTIKLPQVNQCRALVPVVTQKSLDNNSVPLTAFSLANY
YYRAQGDEVRHRERLTAVLSKLEKVVREEYGLMPTEPGPRPTLPRGLDELKDQMEEDLLKLANAQTTSDMMAWAVEQVDL
KTWVKNYPRWTPPPPPPKVQPRKTKPVKSLPERKPVPAPRRKVGSDCGSPVSLGGDVPNSWEDLAVSSPFDLPTPPEPAT
PSSELVIVSSPQCIFRPATPLSEPAPIPAPRGTVSRPVTPLSEPIPVPAPRRKFQQVKRLSSAAAIPPYQDEPLDLSASS
QTEYEASPPAPPQSGGVLGVEGHEAEETLSEISDMSGNIKPASVSSSSSLSSVRITRPKYSAQAIIDSGGPCSGHLQEVK
ETCLSVMREACDATKLDDPATQEWLSRMWDRVDMLTWRNTSVYQAICTLDGRLKFLPKMILETPPPYPCEFVMMPHTPAP
SVGAESDLTIGSVATEDVPRILEKIENVGEMANQGPLAFSEDKPVDDQLVNDPRISSRRPDESTSAPSAGTGGAGSFTDL
PPSDGADADGGGPFRTVKRKAERLFDQLSRQVFDLVSHLPVFFSRLFYPGGGYSPGDWGFAAFTLLCLFLCYSYPAFGIA
PLLGVFSGSSRRVRMGVFGCWLAFAVGLFKPVSDPVGAACEFDSPECRNILHSFELLKPWDPVRSLVVGPVGLGLAILGR
LLGGARCIWHFLLRLGIVADCILAGAYVLSQGRCKKCWGSCIRTAPNEVAFNVFPFTRATRSSLIDLCDRFCAPKGMDPI
FLATGWRGCWAGRSPIEQPSEKPIAFAQLDEKKITARTVVAQPYDPNQAVKCLRVLQSGGAMVAKAVPKVVKVSAVPFRA
PFFPTGVKVDPDCRVVVDPDTFTAALRSGYSTTNLVLGVGDFAQLNGLKIRQISKPSGGGPHLMAALHVACSMALHMLAG
IYVTAVGSCGTGTNDPWCANPFAVPGYGPGSLCTSRLCISQHGLTLPLTALVAGFGIQEIALVVLIFVSIGGMAHRLSCK
ADMLCVLLAIASYVWVPLTWLLCVFPCWLRCFSLHPLTILWLVFFLISVNMPSGILAMVLLVSLWLLGRYTNVAGLVTPY
DIHHYTSGPRGVAALATAPDGTYLAAVRRAALTGRTMLFTPSQLGSLLEGAFRTRKPSLNTVNVIGSSMGSGGVFTIDGK
VKCVTAAHVLTGNSARVSGVGFNQMLDFDVKGDFAIADCPNWQGAAPKTQFCTDGWTGRAYWLTSSGVEPGVIGKGFAFC
FTACGDSGSPVITEAGELVGVHTGSNKQGGGIVTRPSGQFCNVAPIKLSELSEFFAGPKVPLGDVKVGSHIIKDISEVPS
DLCALLAAKPELEGGLSTVQLLCVFFLLWRMMGHAWTPLVAVSFFILNEVLPAVLVRSVFSFGMFVLSWLTPWSAQVLMI
RLLTAALNRNRWSLAFFSLGAVTGFVADLAATQGHPLQAVMNLSTYAFLPRMMVVTSPVPVITCGVVHLLAIILYLFKYR
GPHHILVGDGVFSAAFFLRYFAEGKLREGVSQSCGMNHESLTGALAMRLNDEDLDFLMKWTDFKCFVSASNMRNAAGQFI
EAAYAKALRVELAQLVQVDKVRGTLAKLEAFADTVAPQLSPGDIVVALGHTPVGSIFDLKVGSTKHTLQAIETRVLAGSK
MTVARVVDPTPTPPPAPVPIPLPPKVLENGPNAWGDEDRLNKKKRRRMEALGIYVMGGKKYQKFWDKNSGDVFYEEVHNN
TDEWECLRVGDPADFDPEKGTLCGHVTIENKAYHVYTSPSGKKFLVPVNPENGRVQWEAAKLSVEQALGMMNVDGELTAK
ELEKLKRIIDKLQGLTKEQCLNCLAASDLTRCGRGGLVVTETAVKIVKFHNRTFTLGPVNLKVASEVELKDAVEHNQHPV
ARPIDGGVVLLRSAVPSLIDVLISGADASPKLLAHHGPGNTGIDGTLWDFESEATKEEVALSAQIIQACDIRRGDAPEIG
LPYKLYPVRGNPERVKGVLQNTRFGDIPYKTPSDTGSPVHAAACLTPNATPVTDGRSVLATTMPPGFELYVPTIPASVLD
YLDSRPDCPKQLTEHGCEDAALKDLSKYDLSTQGFVLPGVLRLVRKYLFAHVGKCPPVHRPSTYPAKNSMAGINGNRFPT
KDIQSVPEIDVLCAQAVRENWQTVTPCTLKKQYCGKKKTRTILGTNNFIALAHRAVLSGVTQGFMKKAFNSPIALGKNKF
KELQTPVLGRCLEADLASCDRSTPAIVRWFAANLLYELACAEEHLPSYVLNCCHDLLVTQSGAVTKRGGLSSGDPITSVS
NTIYSLVIYAQHMVLSYFKSGHPHGLLFLQDQLKFEDMLKVQPLIVYSDDLVLYAESPTMPNYHWWVEHLNLMLGFQTDP
KKTAITDSPSFLGCRIINGRQLVPNRDRILAALAYHMKASNVSEYYASAAAILMDSCACLEYDPEWFEELVVGIAQCARK
DGYSFPGTPFFMSMWEKLRSNYEGKKSRVCGYCGAPAPYATACGLDVCIYHTHFHQHCPVTIWCGHPAGSGSCSECKSPV
GKGTSPLDEVLEQVPYKPPRTVIMHVEQGLTPLDPGRYQTRRGLVSVRRGIRGNEVGLPDGDYASTALLPTCKEINMVAV
ASNVLRSRFIIGPPGAGKTYWLLQQVQDGDVIYTPTHQTMLDMIRALGTCRFNVPAGTTLQFPVPSRTGPWVRILAGGWC
PGKNSFLDEAAYCNHLDVLRLLSKTTLTCLGDFKQLHPVGFDSHCYVFDIMPQTQLKTIWRFGQNICDAIQPDYRDKLMS
MVNTTRVTYVEKPVRYGQVLTPYHRDREDDAITIDSSQGATFDVVTLHLPTKDSLNRQRALVAITRARHAIFVYDPHRQL
QGLFDLPAKGTPVNLAVHCDGQLIVLDRNNKECTVAQALGNGDKFRATDKRVVDSLRAICADLEGSSSPLPKVAHNLGFY
FSPDLTQFAKLPVELAPHWPVVSTQNNEKWPDRLVASLRPIHKYSRACIGAGYMVGPSVFLGTPGVVSYYLTKFVKGGAQ
VLPETVFSTGRIEVDCREYLDDREREVAASLPHGFIGDVKGTTVGGCHHVTSRYLPRVLPKESVAVVGVSSPGKAAKALC
TLTDVYLPDLEAYLHPETQSKCWKMMLDFKEVRLMVWKDKTAYFQLEGRYFTWYQLASYASYIRVPVNSTVYLDPCMGPA
LCNRRVVGSTHWGADLAVTPYDYGAKIILSSAYHGEMPPGYKILACAEFSLDDPVKYKHTWGFESDTAYLYEFTGNGEDW
EDYNDAFRARQEGKIYKATATSLKFYFPPGPVIEPTLGLN
>A0MD28 ~~~~~~Replicase polyprotein 1ab~~~
MSGTFSRCMCTPAARVFWNAGQVFCTRCLSARPLLSPELQDTDLGVVGLFYKPKDKIHWKVPIGIPQVECTPSGCCWLSA
VFPLARMTSGNHNFLQRLVKVADVLYRDGCLAPRHLRELQVYERGCSWYPITGPVPGMGLFANSMHVSDQPFPGATHVLT
NSPLPQRACRQPFCPFEEAHSDVYRWKKFVIFTDSSPNGRFRMMWTPESDDSAALEVLPPELERQVEILTRSFPAHHPIN
LADWELTESPENGFSFGTSHSCGHIVQNPNVFDGKCWLTCFLGQSAEVCYHEEHLANALGYQTKWGVHGKYLQRRLQVRG
MRAVVDPDGPIHVEALSCSQSWVRHLTLNNDVTPGFVRLTSIRIVSNTEPTAFRIFRFGAHKWYGAAGKRARAKRATKSG
KDSALAPKIAPPVPTCGITTYSPPTDGSCGWHVLAAIVNRMINGDFTSPLPQYNRPEDDWASDYDLAQAIQCLQLPATVV
RNRACPNAKYLIKLNGVHWEVEVRSGMAPRSLSRECVVGVCSEGCVAPPYPADGLPKRALEALASAYRLPSDCVSSGIAD
FLADPPPQEFWTLDKMLTSPSPERSGFSSLYKLLLEVVPQKCGATEGAFVYAVERMLKDCPSPEQAMALLAKIKVPSSKA
PSVSLDECFPAGVPADFEPAFQERPRSPGAAVALCSPDAKGFEGTASEEAQESGHKAVHAVPLAEGPNNEQVQVVAGEQL
ELGGCGLAIGSAQSSSDSKRENMHNSREDEPLDLSHPAPAATTTLVGEQTPDNPGSDASALPIAVRGFVPTGPILRHVEH
CGTESGDSSSPLDLSFAQTLDQPLDLSLAAWPVKATASDPGWVRGRCEPVFLKPRKAFSDGDSALQFGELSESSSVIEFD
QTKDTLVADAPVDLTTSNEALSAVDPSEFVELRRPRHSAQALIDRGGPLADVHAKIKNRVYEQCLQACEPGSRATPATRE
WLDKMWDRVDMKTWRCTSQFQAGRILASLKFLPDMIQDTPPPVPRKNRASDNAGLKQLVARWDKKLSVTPPPKSAGLVLD
QTVPPPTDIQQEDATPSDGLSHASDFSSRVSTSWSWKGLMLSGTRLAGSAGQRLMTWVFEVYSHLPAFILTLFSPRGSMA
PGDWLFAGVVLLALLLCRSYPILGCLPLLGVFSGSLRRVRLGVFGSWMAFAVFLFSTPSNPVGSSCDHDSPECHAELLAL
EQRQLWEPVRGLVVGPSGLLCVILGKLLGGSRHLWHVILRLCMLTDLALSLVYVVSQGRCHKCWGKCIRTAPAEVALNVF
PFSRATRNSLTSLCDRFQTPKGVDPVHLATGWRGCWRGESPIHQPHQKPIAYANLDEKKISAQTVVAVPYDPSQAIKCLK
VLQAGGAIVDQPTPEVVRVSEIPFSAPFFPKVPVNPDCRIVVDSDTFVAAVRCGYSTAQLVLGRGNFAKLNQTPLRDSAS
TKTTGGASYTLAVAQVSVWTLVHFILGLWFTSPQVCGRGTADPWCSNPFSYPAYGPGVVCSSRLCVSADGVTLPLFSAVA
QLSGREVGIFILVLVSLTALAHRLALKADMLVVFSAFCAYAWPMSSWLICFFPILLKWVTLHPLTMLWVHSFLVFCMPAA
GILSLGITGLLWAVGRFTQVAGIITPYDIHQYTSGPRGAAAVATAPEGTYMAAVRRAALTGRTLIFTPSAVGSLLEGAFR
THKPCLNTVNVVGSSLGSGGVFTIDGRKTVVTAAHVLNGDTARVTGDSYNRMHTFKTSGDYAWSHADDWQGVAPVVKVAK
GYRGRAYWQTSTGVEPGVIGEGFAFCFTNCGDSGSPVISESGDLIGIHTGSNKLGSGLVTTPEGETCAIKETKLSDLSRH
FAGPSVPLGDIKLSPAIVPDVTSIPSDLASLLASVPVMEGGLSTVQLLCVFFLLWRMMGHAWTPIVAVGFFLLNEILPAV
LVRAVFSFALFILAWATPWSAQVLMIRLLTASLNRNKLSLAFYALGGVVGLAAEIGAFAGRLPELSQALSTYCFLPRVLA
MASYVPIIIIGGLHALGVILWLFKYRCLHNMLVGDGSFSSAFFLRYFAEGNLRKGVSQSCGMSNESLTAALACKLSQADL
DFLSSLTNFKCFVSASNMKNAAGQYIEAAYAKALRQELASLVQVDKMKGILSKLEAFAETATPSLDAGDVVVLLGQHPHG
SILDINVGTERKTVSVQETRSLGGSKFSVCTVVSNTPVDALTGIPLQTPTPLFENGPRHRGEEDDLRVERMKKHCVSLGF
HNINGKVYCKIWDKSTGDTFYTDDSRYTQDLAFQDRSADYRDRDYEGVQTAPQQGFDPKSETPIGTVVIGGITYNRYLIK
GKEVLVPKPDNCLEAAKLSLEQALAGMGQTCDLTAAEVEKLRRIISQLQGLTTEQALNCLLAASGLTRCGRGGLVVTETA
VKIVKYHSRTFTLGPLDLKVTSEAEVKKSTEQGHAVVANLCSGVILMRPHPPSLVDVLLKPGLDTKPGIQPGHGAGNMGV
DGSTWDFETAPTKAELELSKQIIQACEVRRGDAPNLQLPYKLYPVRGDPERHGGRLINTRFGDLSYKTPQDTKSAIHAAC
CLHPNGAPVSDGKSTLGTTLQHGFELYVPTVPYSVMEYLDSRPDTPFMCTKHGTSKAAAEDLQKYDLSTQGFVLPGVLRL
VRRFIFGHIGKAPPLFLPSTYPAKNSMAGINGQRFPTKDVQSIPEIDEMCARAVKENWQTVTPCTLKKQYCSKPKTRTIL
GTNNFIALAHRSALSGVTQAFMKKAWKSPIALGKNKFKELHCTVAGRCLEADLASCDRSTPAIVRWFVANLLYELAGCEE
YLPSYVLNCCHDLVATQDGAFTKRGGLSSGDPVTSVSNTVYSLIIYAQHMVLSALKMGHEIGLKFLEEQLKFEDLLEIQP
MLVYSDDLVLYAERPTFPNYHWWVEHLDLMLGFRTDPKKTVITDKPSFLGCRIEAGRQLVPNRDRILAALAYHMKAQNAS
EYYASAAAILMDSCACIDHDPEWYEDLICGIARCARQDGYSFPGPAFFMSMWEKLRSHNEGKKFRHCGICDAKADHASAC
GLDLCLFHSHFHQHCPVTLSCGHHAGSRECSQCQSPVGAGRSPLDAVLKQIPYKPPRTVIMKVGNKTTALDPGRYQSRRG
LVAVKRGIAGNEVDLPDGDYQVVPLLPTCKDINMVKVACNVLLSKFIVGPPGSGKTTWLLSQVQDDDVIYTPTHQTMFDI
VSALKVCRYSIPGASGLPFPPPARSGPWVRLVASGHVPGRTSYLDEAGYCNHLDILRLLSKTPLVCLGDLQQLHPVGFDS
YCYVFDQMPQKQLTTIYRFGPNICAAIQPCYREKLESKARNTRVVFTTWPVAFGQVLTPYHKDRIGSAITIDSSQGATFD
IVTLHLPSPKSLNKSRALVAITRARHGLFIYDPHNQLQEFFNLIPERTDCNLVFSRGDDLVVLSADNAVTTVAKALGTGP
SRFRVSDPRCKSLLAACSASLEGSCMPLPQVAHNLGFYFSPDSPAFAPLPKELAPHWPVVTHQNNRAWPDRLVASMRPID
ARYSKPMVGAGYVVGPSTFLGTPGVVSYYLTLYIRGEPQALPETLVSTGRIATDCREYLDAAEEEAAKELPHAFIGDVKG
TTVGGCHHITSKYLPRTLPKDSVAVVGVSSPGRAAKAMCTLTDVYLPELRPYLQPETASKCWKLKLDFRDVRLMVWKGAT
AYFQLEGLTWSALPDYARFIQLPKDAVVYIDPCIGPATANRKVVRTTDWRADLAVTPYDYGAQNILTTAWFEDLGPQWKI
LGLQPFRRAFGFENTEDWAILARRMSDGKDYTDYNWDCVRERPHAIYGRARDHTYHFAPGTELQVELGKPRLPPGREP
>Q68772 ~~~rep~~~Replicase polyprotein 1ab~~~
MFCECPRSNLVVMCSGAFCCVLCGHRRRPRPASESDRAKYGPIVQYVEARVAHVYSGLEGRYCALEMIPITYGNKFPYCK
PLPVSFVIKTLAGVQGDLTRLEETPLPGGYGVIPCWGPHLAAVGYLSPAHVGRDWFEGATHAIVHIGSYGGHERPTTIPF
NTTGGDVYQLGTCTIVETIDHVEWHAGVKPGTAICPLDRIDFAQKVITAFPEGFLANKAWLGDKRGTLKVEADPETAALS
FEHGRCWLKLFPDPACELTTASTFGYQLNCGVQGKYIARRLQTNGLKLVQNQEGKFIAYTFHRGSWLGHIGHADESVPPD
CQIIARFDVLPYNEWSPLPLLKLPGKTYFGGNASSVSWPEWKYDEQLLYADSLTAGFCWLQLFPPLSRKSEAQRAILAQQ
VNNYGVTGTYLEYRLRQYGIVLAECDYGEHYIYAAASDSSIRHISPVPIHDRHHVFVTRLTARFGAFDEGFDLGFGTRYG
RRRGGGKKSGQSSGVRAPGRTTPDLAGDWGKAVDDQEKTASKVTTDKAMSTSEPAVVQVGCETKPVADAAAVPASVNSTG
CALLPVQADPCCTAGVAAKESEPKAVAAPSIPITFGAPAGETLPVAASPLVVKKDKRCISVKLTAKKALPKETFIPPPDG
GCGVHAFAAIQYHINTGHWPEQKPVVNWAYEAWTTNEDIGHMICSTETPAALEPCLHARYVVRLDSDHWVVDHYPNRPMC
FVEACAHGWCSSLLSEPTGEEGEHLVDCSALYDCLGKFRNGTEFADTVLGLSKTAHCCNKRVPTPRKQAIMSLLNRPNCV
PCIAPPSQVRTVDPSQPAAPLPPVPRPRKRKAAAQQVSKVPSEQDPSLAHDPPEKPDSVRPPKLGYLDRAWNNMLARTHK
LHNLQQRVFGLYPQLLSMLLPSGARPSTPRLLGCYFSMAVAMFFLFLGSPLFILCAVLAGVIAPSARYPKILCCCLVVVY
ICTLFADAISSVCDNDDADCRAFLSDLGDRYSTNQPVYITPGPATFFLAVSRNFFVVSVALFPLHLLLLMVDVLLVIGVL
CMDGYCFRCFSRCVRKAPEEVSLLTIPQSRVSRRFLLDICDFYSAPPVDIIRLATGLNGCFRGDYSPIGSSTSVITADKI
DVKKVSCRTVCSFPSCPSEAVKVLHVLSVRGQMCAHNEQKVEKVDALPCKNPLFPYDLSSKKIVPVDSGTYEILSSIGCD
MSHLVIGDGDFFKVMGVPRPSPFTVMRLRACRVVGGGRIFRTALAAAWVLFFVCAGYWVQMSTPCGIGTNDPFCKSSFGV
PTYVNQGVCHGQYCASSKGVSRATSILTVRNPAVAPYIVLAACLVYLASVYVPGIIEVSLLVLNALLPAGPAISALRTLV
MIIAAPHLSMKYIAFFCCTTAFVDFTSVVVVLTALLVGWILARYTGIGGFVTPYDIHDVVKSQRDGVAVANAPPNTYLGA
VRRAALTGKPAFFVANNTGIVLEGLLREKTRASNSVSVYGVTCGSGGLFSDGNNTVCLTATHVCGNNKAVVDYQGTRYEA
VFTTKGDYASAVVPIPGAFPPLKFAPQSYTGRAYWYANTGVETGFVGTTGCLVFSGPGDSGSPIITPDGLIVGVHTGSDS
KGSGAYTTPNGLTVSGPLSLKEMGAHYEGPIVDVPTRLPRNVHNDTKSVPQPLARLLESSINLEGGLGTIQLIIVAVVLW
KYAVDPLSIPFVVAFFLLNEILPKCLIRCFYNYSLFCLAAFSPLASRIFFIRLLTAALNRNPTALICHACFAGIAVLNDF
IILGDIRLALRFTSFYVVGVNHDAIAIAVIGALVCVAACCLELFGLPQMASVIGCHGSFDPTFLSRYVHEGIRQGVSSGF
GTESLSTALACALSEDELNFLAQAVDHKAIVSAIHVHKTLQDYILSKNAKILRASLASVHANHNASKALASLDKFLQGTS
TQLKPGDPVILLGSTSAELVSVFSGDSEYIAEPIRSHPVAGTICTLCVVQAKCEGGLVTQVNGKFSPAKYLAVAGKVLAD
HPDYKLENDGRFPRTREDRVKDSVQVDTVDIGSHTFKKMWNKTTGDVWYDIIMPESAANPLAVHDLDSAVAAIGMSKEIP
EKDMNRLRAIISKLQGLVSSEALNLLTAAGCTSADRSGLVITLDYAKIITHHARTRAFSSIDFKVVSPDEAMRTARLSPS
PQPIIASFSDDKFLLLRRHPPSLLDVLTKGLDATCREPLHSPGDQGIDGYLWDFEAPHSKEAIWLSNQIISACAARRGDA
PGCYPYKLHPVRGDPYRVGNVLKNTRFGDVTYTAVSDSDSPWLKVASINSGGCPVVTDRVLGSTIPVGSEIYLPTLPESV
LDYLDSRPDCPTYYTQHGCEAAALQDLKKFNLSTQGFILPEVLNIVRNYLLGTIGYRPAIYKPSTVPSNDSHAGINGLSF
STKTLQALPDIDELCEKAIAEVWQTVTPVTLKKQFCSKAKTRTILGTNAMASLALRALLSGVTQGFQLAGKNSPICLGKS
KFDPCTFEVKGRCLETDLASCDRSTPAIVRHFATKLLFEMACAERALPLYVVNCCHDLIVTQTSAATKRGGLSSGDPVTS
IANTIYSLVLYVQHMVLTLLENGHPLSLKFLSGKLNFQDLYKLQAFIVYSDDLILLNESDDLPNFERWVPHLELALGFKV
DPKKTVITSNPGFLGCEYRHGWLVPQKQRVLAALAYHVNAKDVHTYYINATAILNDASALSAFEPDWFDDLVIGLADCAR
KDGYSFPGPAAFREFFSRVSGYQFEGKEVQVCSICCSTARTTSLCGMALCDFCAHRHYHPGCHVLSSFCKHVIGSNTCKM
CSIPILKDRTKFAELLASDQYRSVCTVEVTVVDGYTDAAPGRYSYQKKQYMLRKERRGCPLDLPDGKYSMKLLPNSCSGI
CVPKAQENATLSNFVVGPPGSGKTTFISNLLDDDAVVYCPTHVSLIAYSKSLPAARFSVPRGQDPAEYGTPALSGPTLQL
LSAGYVPGAKHYLDEACYANPFDVFKLLSKTPITAIGDPAQLTPVGFDTPLYVFELMKKNALHAIYRFGQNICNAIQPCY
STKLVSQRQGDTEVIFQTKFAPRGKVLTPYHRDRVGAAVTIDSSQGSTYDVVTLYLPTKGSLTLARGLVGITRARERLYV
YDPHHQLAKYFNLQPSSTTIRPHAVVIDGKARVMLSDKCYAAPEDFPGMLCTARPATAADRKILEETCLKLDFLESGSLS
PLPRVCYNLGFYYSPDITKLLPIPSELAKHWPVATNRNNPEWPNRLVVSATRLSPLSHPAVCAGYYVGDSLFVGTPNVTS
YWLTKFLDGRAVPMEDSVYSTGRFEMDIRDYLDSAERDFAAKHPHAFIGDTKGTTVGGCHHITSQYLPHVLPADSVVKVG
VSKPGVAHKALCTVTDIYLPMLGSYTSPPTQSKVYKVNVDHKACKLMVWRDQTMYFQEGFDYHTLVDALRFVRLSSDGVY
RVAPELTPMIGNRRLDLGAKPLRPVDLAITPWDDPKCEFLVTHASPFDMSDEFLLVNAFDFIKEDLLGKSVTPVYFYKRL
SEPLHFDQNLPPHVGAILSKAPRFISLAKVFNFCFTPTACHCKVSVKTATGDHMCKCSLSSDEFLSRFNPTVGTP
>Q859P9 2.7.7.6~~~~~~Virion DNA-directed RNA polymerase~~~
MSVFDRLAGFADSVTNAKQVDVSTATAQKKAEQGVTTPLVSPDAAYQMQAARTGNVGANAFEPGTVQSDFMNLTPMQIMN
KYGVEQGLQLINARADAGNQVFNDSVTTRTPGEELGDIATGVGLGFVNTLGGIGALGAGLLNDDAGAVVAQQLSKFNDAV
HATQSQALQDKRKLFAARNLMNEVESERQYQTDKKEGTNDIVASLSKFGRDFVGSIENAAQTDSIISDGLAEGVGSLLGA
GPVLRGASLLGKAVVPANTLRSAALAGAIDAGTGTQSLARIASTVGRAAPGMVGVGAMEAGGAYQQTADEIMKMSLKDLE
KSPVYQQHIKDGMSPEQARRQTASETGLTAAAIQLPIAAATGPLVSRFEMAPFRAGSLGAVGMNLARETVEEGVQGATGQ
LAQNIAQQQNIDKNQDLLKGVGTQAGLGALYGFGSAGVVQAPAGAARLAGAATAPVLRTTMAGVKAAGSVAGKVVSPIKN
TLVARGERVMKQNEEASPVADDYVAQAAQEAMAQAPEAEVTIRDAVEATDATPEQKVAAHQYVSDLMNATRFNPENYQEA
PEHIRNAVAGSTDQVQVIQKLADLVNTLDESNPQALMEAASYMYDAVSEFEQFINRDPAALDSIPKDSPAIELLNRYTNL
TANIQNTPKVIGALNVINRMINESAQNGSLNVTEESSPQEMQNVALAAEVAPEKLNPESVNVVLKHAADGRIKLNNRQIA
ALQNAAAILKGAREYDAEAARLGLRPQDIVSKQIKTDESRTQEGQYSALQHANRIRSAYNSGNFELASAYLNDFMQFAQH
MQNKVGALNEHLVTGNADKNKSVHYQALTADREWVRSRTGLGVNPYDTKSVKFAQQVALEAKTVADIANALASAYPELKV
SHIKVTPLDSRLNAPAAEVVKAFRQGNRDVASSQPKADSVNQVKETPVTKQEPVTSTVQTKTPVSESVKTEPTTKESSPQ
AIKEPVNQSEKQDVNLTNEDNIKQPTESVKETETSTKESTVTEELKEGIDAVYPSLVGTADSKAEGIKNYFKLSFTLPEE
QKSRTVGSEAPLKDVAQALSSRARYELFTEKETANPAFNGEVIKRYKELMEHGEGIADILRSRLAKFLNTKDVGKRFAQG
TEANRWVGGKLLNIVEQDGDTFKYNEQLLQTAVLAGLQWRLTATSNTAIKDAKDVAAITGIDQALLPEGLVEQFDTGMTL
TEAVSSLAQKIESYWGLSRNPNAPLGYTKGIPTAMAAEILAAFVESTDVVENIVDMSEIDPDNKKTIGLYTITELDSFDP
INSFPTAIEEAVLVNPTEKMFFGDDIPPVANTQLRNPAVRNTPEQKAALKAEQATEFYVHTPMVQFYETLGKDRILELMG
AGTLNKELLNDNHAKSLEGKNRSVEDSYNQLFSVIEQVRAQSEDISTVPIHYAYNMTRVGRMQMLGKYNPQSAKLVREAI
LPTKATLDLSNQNNEDFSAFQLGLAQALDIKVHTMTREVMSDELTKLLEGNLKPAIDMMVEFNTTGSLPENAVDVLNTAL
GDRKSFVALMALMEYSRYLVAEDKSAFVTPLYVEADGVTNGPINAMMLMTGGLFTPDWIRNIAKGGLFIGSPNKTMNEHR
STADNNDLYQASTNALMESLGKLRSNYASNMPIQSQIDSLLSLMDLFLPDINLGENGALELKRGIAKNPLTITIYGSGAR
GIAGKLVSSVTDAIYERMSDVLKARAKDPNISAAMAMFGKQAASEAHAEELLARFLKDMETLTSTVPVKRKGVLELQSTG
TGAKGKINPKTYTIKGEQLKALQENMLHFFVEPLRNGITQTVGESLVYSTEQLQKATQIQSVVLEDMFKQRVQEKLAEKA
KDPTWKKGDFLTQKELNDIQASLNNLAPMIETGSQTFYIAGSENAEVANQVLATNLDDRMRVPMSIYAPAQAGVAGIPFM
TIGTGDGMMMQTLSTMKGAPKNTLKIFDGMNIGLNDITDASRKANEAVYTSWQGNPIKNVYESYAKFMKNVDFSKLSPEA
LEAIGKSALEYDQRENATVDDIANAASLIERNLRNIALGVDIRHKVLDKVNLSIDQMAAVGAPYQNNGKIDLSNMTPEQQ
ADELNKLFREELEARKQKVAKARAEVKEETVSEKEPVNPDFGMVGREHKASGVRILSATAIRNLAKISNLPSTQAATLAE
IQKSLAAKDYKIIYGTPTQVAEYARQKNVTELTSQEMEEAQAGNIYGWTNFDDKTIYLVSPSMETLIHELVHASTFEEVY
SFYQGNEVSPTSKQAIENLEGLMEQFRSLDISKDSPEMREAYADAIATIEGHLSNGFVDPAISKAAALNEFMAWGLANRA
LAAKQKRTSSLVQMVKDVYQAIKKLIWGRKQAPALGEDMFSNLLFNSAILMRSQPTTQAVAKDGTLFHSKAYGNNERLSQ
LNQTFDKLVTDYLRTDPVTEVERRGNVANALMSATRLVRDVQSHGFNMTAQEQSVFQMVTAALATEAAIDPHAMARAQEL
YTHVMKHLTVEHFMADPDSTNPADRYYAQQKYDTISGANLVEVDAKGRTSLLPTFLGLAMVNEELRSIIKEMPVPKADKK
LGNDIDTLLTNAGTQVMESLNRRMAGDQKATNVQDSIDALSETIMAAALKRESFYDAVATPTGNFIDRANQYVTDSIERL
SETVIEKADKVIANPSNIAAKGVAHLAKLTAAIASEKQGEIVAQGVMTAMNQGKVWQPFHDLVNDIVGRTKTNANVYDLI
KLVKSQISQDRQQFREHLPTVIAGKFSRKLTDTEWSAMHTGLGKTDLAVLRETMSMAEIRDLLSSSKKVKDEISTLEKEI
QNQAGRNWNLVQKKSKQLAQYMIMGEVGNNLLRNAHAISRLLGERITNGPVADVAAIDKLITLYSLELMNKSDRDLLSEL
AQSEVEGMEFSIAYMVGQRTEEMRKAKGDNRTLLNHFKGYIPVENQQGVNLIIADDKEFAKLNSQSFTRIGTYQGSTGFR
TGSKGYYFSPVAARAPYSQGILQNVRNTAGGVDIGTGFTLGTMVAGRITDKPTVERITKALAKGERGREPLMPIYNSKGQ
VVAYEQSVDPNMLKHLNQDNHFAKMVGVWRGRQVEEAKAQRFNDILIEQLHAMYEKDIKDSSANKSQYVNLLGKIDDPVL
ADAINLMNIETRHKAEELFGKDELWVRRDMLNDALGYRAASIGDVWTGNSRWSPSTLDTVKKMFLGAFGNKAYHVVMNAE
NTIQNLVKDAKTVIVVKSVVVPAVNFLANIYQMIGRGVPVKDIAVNIPRKTSEINQYIKSRLRQIDAEAELRAAEGNPNL
VRKLKTEIQSITDSHRRMSIWPLIEAGEFSSIADAGISRDDLLVAEGKIHEYMEKLANKLPEKVRNAGRYALIAKDTALF
QGIQKTVEYSDFIAKAIIYDDLVKRKKKSSSEALGQVTEEFINYDRLPGRFRGYMESMGLMWFYNFKIRSIKVAMSMIRN
NPVHSLIATVVPAPTMFGNVGLPIQDNMLTMLAEGRLDYSLGFGQGLRAPTLNPWFNLTH
>P18147 2.7.7.6~~~1~~~DNA-directed RNA polymerase~~~
MNALNIGRNDFSEIELAAIPYNILSEHYGDQAAREQLALEHEAYELGRQRFLKMLERQVKAGEFADNAAAKPLVLTLHPQ
LTKRIDDWKEEQANARGKKPRAYYPIKHGVASELAVSMGAEVLKEKRGVSSEAIALLTIKVVLGNAHRPLKGHNPAVSSQ
LGKALEDEARFGRIREQEAAYFKKNVADQLDKRVGHVYKKAFMQVVEADMISKGMLGGDNWASWKTDEQMHVGTKLLELL
IEGTGLVEMTKNKMADGSDDVTSMQMVQLAPAFVELLSKRAGALAGISPMHQPCVVPPKPWVETVGGGYWSVGRRPLALV
RTHSKKALRRYADVHMPEVYKAVNLAQNTPWKVNKKVLAVVNEIVNWKHCPVGDVPAIEREELPPRPDDIDTNEVARKAW
RKEAAAVYRKDKARQSRRCRCEFMVAQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGSLTLAKGKPIGLDGF
YWLKIHGANCAGVDKVPFPERIKFIEENEGNILASAADPLNNTWWTQQDSPFCFLAFCFEYAGVKHHGLNYNCSLPLAFD
GSCSGIQHFSAMLRDSIGGRAVNLLPSDTVQDIYKIVADKVNEVLHQHAVNGSQTVVEQIADKETGEFHEKVTLGESVLA
AQWLQYGVTRKVTKRSVMTLAYGSKESLVRQQVLEDTIQPAIDNGEGLMFTHPNQAAGYMAKLIWDAVTVTVVAAVEAMN
WLKSAAKLLAAEVKDKKTKEVLRKRCAIHWVTPDGFPVWQEYRKQNQARLKLVFLGQANVKMTYNTGKDSEIDAHKQESG
IAPNFVHSQDGSHLRMTVVHANEVYGIDSFALIHDSSGTIPADAGNLFKAVRETMVKTYEDNDVIADFYDQFADQLHESQ
LDKMPAVPAKGDLNLRDILESDFAFA
>P06221 2.7.7.6~~~~~~DNA-directed RNA polymerase~~~
MQDLHAIQLQLEEEMFNGGIRRFEADQQRQIAAGSESDTAWNRRLLSELIAPMAEGIQAYKEEYEGKKGRAPRALAFLQC
VENEVAAYITMKVVMDMLNTDATLQAIAMSVAERIEDQVRFSKLEGHAAKYFEKVKKSLKASRTKSYRHAHNVAVVAEKS
VAEKDADFDRWEAWPKETQLQIGTTLLEILEGSVFYNGEPVFMRAMRTYGGKTIYYLQTSESVGQWISAFKEHVAQLSPA
YAPCVIPPRPWRTPFNGGFHTEKVASRIRLVKGNREHVRKLTQKQMPKVYKAINALQNTQWQINKDVLAVIEEVIRLDLG
YGVPSFKPLIDKENKPANPVPVEFQHLRGRELKEMLSPEQWQQFINWKGECARLYTAETKRGSKSAAVVRMVGQARKYSA
FESIYFVYAMDSRSRVYVQSSTLSPQSNDLGKALLRFTEGRPVNGVEALKWFCINGANLWGWDKKTFDVRVSNVLDEEFQ
DMCRDIAADPLTFTQWAKADAPYEFLAWCFEYAQYLDLVDEGRADEFRTHLPVHQDGSCSGIQHYSAMLRDEVGAKAVNL
KPSDAPQDIYGAVAQVVIKKNALYMDADDATTFTSGSVTLSGTELRAMASAWDSIGITRSLTKKPVMTLPYGSTRLTCRE
SVIDYIVDLEEKEAQKAVAEGRTANKVHPFEDDRQDYLTPGAAYNYMTALIWPSISEVVKAPIVAMKMIRQLARFAAKRN
EGLMYTLPTGFILEQKIMATEMLRVRTCLMGDIKMSLQVETDIVDEAAMMGAAAPNFVHGHDASHLILTVCELVDKGVTS
IAVIHDSFGTHADNTLTLRVALKGQMVAMYIDGNALQKLLEEHEVRWMVDTGIEVPEQGEFDLNEIMDSEYVFA
>P00573 2.7.7.6~~~1~~~T7 RNA polymerase~~~
MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHESYEMGEARFRKMFERQLKAGEVADNAAAKPLITTLLPK
MIARINDWFEEVKAKRGKRPTAFQFLQEIKPEAVAYITIKTTLACLTSADNTTVQAVASAIGRAIEDEARFGRIRDLEAK
HFKKNVEEQLNKRVGHVYKKAFMQVVEADMLSKGLLGGEAWSSWHKEDSIHVGVRCIEMLIESTGMVSLHRQNAGVVGQD
SETIELAPEYAEAIATRAGALAGISPMFQPCVVPPKPWTGITGGGYWANGRRPLALVRTHSKKALMRYEDVYMPEVYKAI
NIAQNTAWKINKKVLAVANVITKWKHCPVEDIPAIEREELPMKPEDIDMNPEALTAWKRAAAAVYRKDKARKSRRISLEF
MLEQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGYYWLKIHGANCAGVDKVPFPERIK
FIEENHENIMACAKSPLENTWWAEQDSPFCFLAFCFEYAGVQHHGLSYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVN
LLPSETVQDIYGIVAKKVNEILQADAINGTDNEVVTVTDENTGEISEKVKLGTKALAGQWLAYGVTRSVTKRSVMTLAYG
SKEFGFRQQVLEDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWESVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTGEILR
KRCAVHWVTPDGFPVWQEYKKPIQTRLNLMFLGQFRLQPTINTNKDSEIDAHKQESGIAPNFVHSQDGSHLRKTVVWAHE
KYGIESFALIHDSFGTIPADAANLFKAVRETMVDTYESCDVLADFYDQFADQLHESQLDKMPALPAKGNLNLRDILESDF
AFA
>P0DTK7 ~~~~~~Radical SAM Nalpha-GlyT isomerase~~~
MDHAAWLNEETGEAAQEAYKYFMRPDPNQREFLGPIEEEDDPLTGMRAKFRMAKVGMVRNAKDEDKKEVKVYLGFNDVTY
LPHIRIPNAKPLQGWYQDKHNDKRGSRARPCFSEAILTEPYGGYCTVGCAFCYVNSGFRGYRGTGLISVPVNYGEQVRNM
LSKSRTSAAGYFSSFTDPFLPIEDVYHNTQQGAEAFVELGLPIFFLSRLSYPSWAIDLLKRNPYSYAQKSLNTGNDRDWH
KLSPGAISLQDHIDEIAELRRQGIYTSIQVNPVVPGIVTHDDIRHLFERLAAVGNNHVIVKFVEAGYSWAPAMIERLHKR
FGPERTKAFTDLFTENQAGAQKTIAEPYRVEAHQLYRKWATELGMTYATCYEYRRGKPGTGEPAWLSMGREMITADQCHG
QRVPMFTRTDLDQPFQEVKECAPTGCLHCADDNEGKPRCGSELFGAAKALRSPDFKKIVEPTPPEEDGGERKIIPITQID
>Q9J546 3.1.-.-~~~~~~Holliday junction resolvase~~~
MIICSVDIGIKNPAYAIFNYDNTSNTIKLIAIEKSDWTKNWERSVARDLTRYNPDVVILEKQGFKSPNSKIIYFIKGFFY
NSNTKVIVRNPTFKGGSYRNRKKQSIDVFIQKISEYTDYKNDILNKYTKLDDIADSFNLGLSYMESLLKKCKISKD
>Q80HV3 3.1.-.-~~~~~~Resolvase OPG149~~~
METLTSSSQSLISSPMSKKDYSSEIICAFDIGAKNPARTVLEVKDNSVRVLDISKLDWSSDWERRIAKDLSQYEYTTVLL
ERQPRRSPYVKFIYFIKGFLYHTSAAKVICVSPVMSGNSYRDRKKRSVEAFLDWMDTFGLRDSVPDRRKLDDVADSFNLA
MRYVLDKWNTNYTPYNRCKSRNYIKKM
>P33497 2.7.10.1~~~V-RYK~~~Tyrosine-protein kinase transforming protein RYK~~~
TTTVVNYTAKKSYCRRAVELTLGSLGVSSELQQKLQDVVIDRNALSLGKVLGEGEFGSVMEGRLSQPEGTPQKVAVKTMK
LDNFSHREIEEFLSEAACIKDFDHPNVIKLLGVCIELSSQQIPKPMVVLPFMKYGDLHSFLLRSRLEMAPQFVPLQMLLK
FMVDIALGMEYLSSRQFLHRDLAARNCMLRDDMTVCVADFGLSKKIYSGDYYRQGRIAKMPVKWIAIESLADRVYTTKSD
VWAFGVTMWEIATRGMTPYPGVQNHEIYEYLFHGQRLKKPENCLDELYDIMSSCWRAEPADRPTFSQLKVHLEKLLESLP
APRGSKDVIYVNTSLPEESPDSTQDLGLDSVIPQADSDLDPGDIAEPCCSHTKAALVAVDIHDGGSRYVLESEGSPTEDA
YVPQLPHEGSAWTEASTLPVGSSLAAQLPCADGCLEDSEALL
>Q9T1V0 ~~~S~~~Tail fiber protein S~~~
MFYIDNDSGVTVMPPVSAQRSAIVRWFSEGDGNNVITWPGMDWFNIVQAELLNTLEEAGIQPDKTKLNQLALSIKAIMSN
NALLIKNNLSEIKTAGASAQRTARENLDIYDASLNKKGLVQLTSATDSPSETLAATAKAVKIAMDNANARLAKDRNGADI
PNKPLFIQNLGLQETVNRARNAVQKNGDTLSGGLTFENDSILAWIRNTDWAKIGFKNDADSDTDSYMWFETGDNGNEYFK
WRSKQSTTTKDLMNLKWDALSVLVNAIVNGEVISKSANGLRIAYGNYGFFIRNDGSNTYFMLTNSGDNMGTYNGLRPLWI
NNATGAVSMGRGLNVSGDTLSDRFAINSSNGMWIQMRDNNAIFGKNIVNTDSAQALLRQNHADRKFMIGGLGNKQFGIYM
INNSRTANGTDGQAYMDNNGNWLCGAQVIPGNYANFDSRYVRDVRLGTQSLTGGLSRDYKAPSGHVITGFHTNGDWEMQG
GDDKVYIRPVQKNINGTWYNVASA
>P0DJY6 ~~~S'~~~Tail fiber protein S'~~~
MPKSTIIQNLGLQETVNQASGALQQNQNGADIPGKDTFTKNIGACRAYSAWLNIGGDSQVWTTAQFISWLESQGAFNHPY
WMCKGSWAYANNKVITDTGCGNICLAGAVVEVIGTRGAMTIRVTTPSTSSGGGITNAQFTYINHGDAYAPGWRRDYNTKN
QQPAFALGQTGSTVGNDKAVGWNWNSGVYNANIGGASTLILHFNMNTGSCPAVQFRVNYRNGGIFYRSARDGYGFEADWS
EIYTTTRKPSAGDVGAYTQAECNSRFITGIRLGGLSSVQTWNGPGWSDRSGYVVTGSVNGNRDELIDTTQARPIQYCING
TWYNAGSI
>P15240 ~~~sak~~~Staphylokinase~~~
MLKRSLLFLTVLLLLFSFSSITNEVSASSSFDKGKYKKGDDASYFEPTGPYLMVNVTGVDGKRNELLSPRYVEFPIKPGT
TLTKEKIEYYVEWALDATAYKEFRVVELDPSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVI
EKK
>B6UL32 ~~~~~~SaV protein~~~
MNYGTNKHYANEYGMELNEYFKHHFSYEELAGWYTMQVLKYLVRAGKKEGESYDKDRNKALDYAGELANLSNENELTEYT
TDDIMGFAQDIADDFKQWKGERNNFKSEFTKEEIKAIDERYLEFIEEV
>P03633 3.4.-.-~~~B~~~Internal scaffolding protein B~~~
MEQLTKNQAVATSQEAVQNQNEPQLRDENAHNDKSVHGVLNPTYQAGLRRDAVQPDIEAERKKRDEIEAGKSYCSRRFGG
ATCDDKSAQIYARFDKNDWRIQPAEFYRFHDAEVNTFGYF
>P69486 ~~~D~~~External scaffolding protein D~~~
MSQVTEQSVRFQTALASIKLIQASAVLDLTEDDFDFLTSNKVWIATDRSRARRCVEACVYGTLDFVGYPRFPAPVEFIAA
VIAYYVHPVNIQTACLIMEGAEFTENIINGVERPVKAAELFAFTLRVRAGNTDVLTDAEENVRQKLRAEGVM
>Q05222 ~~~~~~Probable capsid assembly scaffolding protein~~~
MSDNPTPESTPEAETPEVEKPMEPQGKVFDEAYVQSLRQEAAAARVAKKDAVEAAEARVKAEYEAKLAERDTAYTELQNQ
LGQAWIELEKVYLSLDAKVPNDKVRAFVEILEGNDRDSIAESVKSRLELVGGFGNKTPSPAFDPSQGRGGKPPIPLNGDP
ILEAIKAAVGIKK
>P25478 ~~~O~~~Capsid assembly scaffolding protein~~~
MAKKVSKFFRIGVEGDTCDGRVISAQDIQEMAETFDPRVYGCRINLEHLRGILPDGIFKRYGDVAELKAEKIDDDSALKG
KWALFAKITPTDDLIAMNKAAQKVYTSMEIQPNFANTGKCYLVGLAVTDDPASLGTEYLEFCRTAKHNPLNRFKLSPENL
ISVATPVELEFEDLPETVFTALTEKVKSIFGRKQASDDARLNDVHEAVTAVAEHVQEKLSATEQRLAEMETAFSALKQEV
TDRADETSQAFTRLKNSLDHTESLTQQRRSKATGGGGDALMTNC
>P13848 ~~~7~~~Capsid assembly scaffolding protein~~~
MPLKPEEHEDILNKLLDPELAQSERTEALQQLRVNYGSFVSEYNDLTKSHEKLAAEKDDLIVSNSKLFRQIGLTDKQEED
HKKADISETITIEDLEAK
>Q38580 ~~~~~~Capsid assembly scaffolding protein~~~
MSLKEQLGEELYGQVLAKLGEGAKLVDISDGSFIPKEKFDAVNSEKKSLEQQLTDRDQQLQELSTKATGHDELSAKIADL
QKANEEAKQAFEAEKQQLKYEHALETALRDSGAKNPKAVKALLDTESIKLDGDKLLGFEDQIKALKEQEDYLFKGTEPNG
GVQGTPPPGKGADLGGLPTKKNPFKQGPDFNLTEQGILFRENPELAKKLQAEAQ
>P04534 ~~~~~~Capsid assembly scaffolding protein~~~
MLKEQLIAEAQKIDASVALDSIFESVNISPEAKETFGTVFEATVKQHAVKLAESHIAKIAEKAEEEVEKNKEEAEEKAEK
KIAEQASKFIDHLAKEWLAENKLAVDKGIKAELFESMLGGLKELFVEHNVVVPEESVDVVAEMEEELQEHKEESPRLFEE
LNMRDAYINYVQREVALSESTKDLTESQKEKVSALVEGMDYSDAFSSKLSAIVEMVKKSNKDESTITESINTPDTEAAGL
NFVTEAVEDKAAQGAEDIVSVYAKVASRF
>P03716 ~~~9~~~Capsid assembly scaffolding protein~~~
MAESNADVYASFGVNSAVMSGGSVEEHEQNMLALDVAARDGDDAIELASDEVETERDLYDNSDPFGQEDDEGRIQVRIGD
GSEPTDVDTGEEGVEGTEGSEEFTPLGETPEELVAASEQLGEHEEGFQEMINIAAERGMSVETIEAIQREYEENEELSAE
SYAKLAEIGYTKAFIDSYIRGQEALVEQYVNSVIEYAGGRERFDALYNHLETHNPEAAQSLDNALTNRDLATVKAIINLA
GESRAKAFGRKPTRSVTNRAIPAKPQATKREGFADRSEMIKAMSDPRYRTDANYRRQVEQKVIDSNF
>Q1HVC7 ~~~BVRF2/BdRF1~~~Capsid scaffolding protein~~~
MVQAPSVYVCGFVERPDAPPKDACLHLDPLTVKSQLPLKKPLPLTVEHLPDAPVGSVFGLYQSRAGLFSAASITSGVFLS
LLDSIYHDCDIAQSQRLPLPREPKLEALHAWLPSLSLASLHPDIPQTTADGGKLSFFDHVSICALGRRRGTTAVYGTDLA
WVLKHFSDLEPSIAAQIENDANAAKRESGCPEDHPLPLTKLIAKAIDAGFLRNRVETLRQDRGVANIPAESYLKASDAPD
LQKPDKALQSPPPASTDPDTMLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH
QAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYFGLPGLFGPPPPVPPYYGSHL
RADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDIAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYT
HPGPYGFQPHQSYEVPRYVPHPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGHH
RGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA
>P03234 ~~~BVRF2/BdRF1~~~Capsid scaffolding protein~~~
MVQAPSVYVCGFVERPDAPPKDACLHLDPLTVKSQLPLKKPLPLTVEHLPDAPVGSVFGLYQSRAGLFSAASITSGDFLS
LLDSIYHDCDIAQSQRLPLPREPKVEALHAWLPSLSLASLHPDIPQTTADGGKLSFFDHVSICALGRRRGTTAVYGTDLA
WVLKHFSDLEPSIAAQIENDANAAKRESGCPEDHPLPLTKLIAKAIDAGFLRNRVETLRQDRGVANIPAESYLKASDAPD
LQKPDKALQSPPPASTDPATMLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSH
QAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYFGLPGLFGPPPPVPPYYGSHL
RADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDIAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYT
HSGPYGFQPHQSYEVPRYVPHPPPPPTSHQAAQAQPPPPGTQAPEAHCVAESTIPEAGAAGNSGPREDTNPQQPTTEGHH
RGKKLVQASASGVAQSKEPTTPKAKSVSAHLKSIFCEELLNKRVA
>P16753 ~~~~~~Capsid scaffolding protein~~~
MTMDEQQSQAVAPVYVGGFLARYDQSPDEAELLLPRDVVEHWLHAQGQGQPSLSVALPLNINHDDTAVVGHVAAMQSVRD
GLFCLGCVTSPRFLEIVRRASEKSELVSRGPVSPLQPDKVVEFLSGSYAGLSLSSRRCDDVEAATSLSGSETTPFKHVAL
CSVGRRRGTLAVYGRDPEWVTQRFPDLTAADRDGLRAQWQRCGSTAVDASGDPFRSDSYGLLGNSVDALYIRERLPKLRY
DKQLVGVTERESYVKASVSPEAACDIKAASAERSGDSRSQAATPAAGARVPSSSPSPPVEPPSPVQPPALPASPSVLPAE
SPPSLSPSEPAEAASMSHPLSAAVPAATAPPGATVAGASPAVSSLAWPHDGVYLPKDAFFSLLGASRSAVPVMYPGAVAA
PPSASPAPLPLPSYPASYGAPVVGYDQLAARHFADYVDPHYPGWGRRYEPAPSLHPSYPVPPPPSPAYYRRRDSPGGMDE
PPSGWERYDGGHRGQSQKQHRHGGSGGHNKRRKETAAASSSSSDEDLSFPGEAEHGRARKRLKSHVNSDGGSGGHAGSNQ
QQQQRYDELRDAIHELKRDLFAARQSSTLLSAALPSAASSSPTTTTVCTPTGELTSGGGETPTALLSGGAKVAERAQAGV
VNASCRLATASGSEAATAGPSTAGSSSCPASVVLAAAAAQAAAASQSPPKDMVDLNRRIFVAALNKLE
>P10210 ~~~~~~Capsid scaffolding protein~~~
MAADAPGDRMEEPLPDRAVPIYVAGFLALYDSGDSGELALDPDTVRAALPPDNPLPINVDHRAGCEVGRVLAVVDDPRGP
FFVGLIACVQLERVLETAASAAIFERRGPPLSREERLLYLITNYLPSVSLATKRLGGEAHPDRTLFAHVALCAIGRRLGT
IVTYDTGLDAAIAPFRHLSPASREGARRLAAEAELALSGRTWAPGVEALTHTLLSTAVNNMMLRDRWSLVAERRRQAGIA
GHTYLQASEKFKMWGAEPVSAPARGYKNGAPESTDIPPGSIAAAPQGDRCPIVRQRGVALSPVLPPMNPVPTSGTPAPAP
PGDGSYLWIPASHYNQLVAGHAAPQPQPHSAFGFPAAAGSVAYGPHGAGLSQHYPPHVAHQYPGVLFSGPSPLEAQIAAL
VGAIAADRQAGGQPAAGDPGVRGSGKRRRYEAGPSESYCDQDEPDADYPYYPGEARGAPRGVDSRRAARHSPGTNETITA
LMGAVTSLQQELAHMRARTSAPYGMYTPVAHYRPQVGEPEPTTTHPALCPPEAVYRPPPHSAPYGPPQGPASHAPTPPYA
PAACPPGPPPPPCPSTQTRAPLPTEPAFPPAATGSQPEASNAEAGALVNASSAAHVDVDTARAADLFVSQMMGAR
>Q2HRB6 ~~~~~~Capsid scaffolding protein~~~
MAQGLYVGGFVDVVSCPKLEQELYLDPDQVTDYLPVTEPLPITIEHLPETEVGWTLGLFQVSHGIFCTGAITSPAFLELA
SRLADTSHVARAPVKNLPKEPLLEILHTWLPGLSLSSIHPRELSQTPSGPVFQHVSLCALGRRRGTVAVYGHDAEWVVSR
FSSVSKSERAHILQHVSSCRLEDLSTPNFVSPLETLMAKAIDASFIRDRLDLLKTDRGVASILSPAYLKASQFPVGIQAV
TPPRPAMNSSGQEDIISIPKSAFLSMLQSSIDGMKTTAAKMSHTLSGPGLMGCGGQMFPTDHHLPSYVSNPAPPYGYAYK
NPYDPWYYSPQLPGYRTGKRKRGAEDDEGHLFPGEEPAYHKDILSMSKNIAEIQSELKEMKLNGWHAGPPPSSSAAAAAV
DPHYRPHANSAAPCQFPTMKEHGGTYVHPPIYVQAPHGQFQQAAPILFAQPHVSHPPVSTGLAVVGAPPAEPTPASSTQS
IQQQAPETTHTPCAAVEKDAPTPNPTSNRVEASSRSSPKSKIRKMFCEELLNKQ
>P03711 3.4.21.-~~~C~~~Capsid assembly protease C~~~
MTAELRNLPHIASMAFNEPLMLEPAYARVFFCALAGQLGISSLTDAVSGDSLTAQEALATLALSGDDDGPRQARSYQVMN
GIAVLPVSGTLVSRTRALQPYSGMTGYNGIIARLQQAASDPMVDGILLDMDTPGGMVAGAFDCADIIARVRDIKPVWALA
NDMNCSAGQLLASAASRRLVTQTARTGSIGVMMAHSNYGAALEKQGVEITLIYSGSHKVDGNPYSHLPDDVRETLQSRMD
ATRQMFAQKVSAYTGLSVQVVLDTEAAVYSGQEAIDAGLADELVNSTDAITVMRDALDARKSRLSGGRMTKETQSTTVSA
TASQADVTDVVPATEGENASAAQPDVNAQITAAVAAENSRIMGILNCEEAHGREEQARVLAETPGMTVKTARRILAAAPQ
SAQARSDTALDRLMQGAPAPLAAGNPASDAVNDLLNTPV
>P16046 ~~~~~~Capsid scaffolding protein~~~
MADPVYVGGFLVRYDEPPGEAELFLPSGVVDRWLRDCRGPLPLNVNHDESATVGYVAGLQNVRAGLFCLGRVTSPKFLDI
VQKASEKSELVSRGPPSESSLRPDGVLEFLSGSYSGLSLSSRRDINAADGAAGDAETACFKHVALCSVGRRRGTLAVYGR
QPDWVMERFPDLTEADREALRNQLSGSGEVAAKESAESSAAAAVDPFQSDSYGLLGNSVDALYIQERLPKLRYDKRLVGV
TARESYVKASVSPAEQETCDIKVEKERPKEPEQSHVPTESMSHPMSAVATPAASTVAPSQAPLALAHDGVYLPKDAFFSL
IGASRPLAEAAGARAAYPAVPPPPAYPVMNYEDPSSRHFDYSAWLRRPAYDAVPPLPPPPVMPMPYRRRDPMMEEAERAA
WERGYAPSAYDHYVNNGSWSRSRSGALKRRRERDASSDEEEDMSFPGEADHGKARKRLKAHHGRDNNNSGSDAKGDRYDD
IREALQELKREMLAVRQIAPAALLAPAQLATPVASPTTTTSHQAEASEPQASTAAAAPSTASSHGSKSAERGVVNASCRV
APPLEAVNPPKDMVDLNRRLFVAALNKME
>Q83417 ~~~~~~Capsid scaffolding protein~~~
MGPVYVSGYLALYDRDGGELALTREIVAAALPPAGPLPINIDHRPRCDIGAVLAVVDDDRGPFFLGVVNCPQLGAVLARA
VGPDFFGDMRLSDEERLLYLLSNYLPSASLSSRRLAPGEAPDETLFAHVALCVIGRRVGTIVVYDASPEAAVAPFRQLSA
RARSELLARAAESPDRERVWHMSEEALTRALLSTAVNNMLLRDRWELVAARRREAGVRGHTYLQATMWAGLLPKSGASPA
PGPSAAMAAPPSAAPGDYIFVPAAQYNQLVVNQRPAPSLESQLGAIVSAAMDRRHRRSPSPEPRPPARKRRYDDYAQDNA
YYPGEAPPPASDLAAVVSSLQREISHLRAQQLRYPTPYYAPAAPPQLLPPGAVVGHPHPHHAAGALYPPMYAPQPGLHAP
PPSPVAHAVPALPGLPGLQGLAAPVAHVPAQVVPQQPVVVQAQPVAVPAAAAAAPAPAPAAAAAAAAPVQAAAPAAPASA
PQPPVQASVSAPADVSAGTIDASSAAVACQRGADIFVSQMMSQR
>P09286 ~~~~~~Capsid scaffolding protein~~~
MAAEADEENCEALYVAGYLALYSKDEGELNITPEIVRSALPPTSKIPINIDHRKDCVVGEVIAIIEDIRGPFFLGIVRCP
QLHAVLFEAAHSNFFGNRDSVLSPLERALYLVTNYLPSVSLSSKRLSPNEIPDGNFFTHVALCVVGRRVGTVVNYDCTPE
SSIEPFRVLSMESKARLLSLVKDYAGLNKVWKVSEDKLAKVLLSTAVNNMLLRDRWDVVAKRRREAGIMGHVYLQASTGY
GLARITNVNGVESKLPNAGVINATFHPGGPIYDLALGVGESNEDCEKTVPHLKVTQLCRNDSDMASVAGNASNISPQPPS
GVPTGGEFVLIPTAYYSQLLTGQTKNPQVSIGAPNNGQYIVGPYGSPHPPAFPPNTGGYGCPPGHFGGPYGFPGYPPPNR
LEMQMSAFMNALAAERGIDLQTPCVNFPDKTDVRRPGKRDFKSMDQRELDSFYSGESQMDGEFPSNIYFPGEPTYITHRR
RRVSPSYWQRRHRVSNGQHEELAGVVAKLQQEVTELKSQNGTQMPLSHHTNIPEGTRDPRISILLKQLQSVSGLCSSQNT
TSTPHTDTVGQDVNAVEASSKAPLIQGSTADDADMFANQMMVGRC
>P14348 ~~~SCP~~~Small capsomere-interacting protein~~~
MARRLPKPTLQGRLEADFPDSPLLPKFQELNQNNLPNDVFREAQRSYLVFLTSQFCYEEYVQRTFGVPRRQRAIDKRQRA
SVAGAGAHAHLGGSSATPVQQAQAAASAGTGALASSAPSTAVAQSATPSVSSSISSLRAATSGATAAASAAAAVDTGSGG
GGQPHDTAPRGARKKQ
>Q7M6N6 ~~~SCP~~~Small capsomere-interacting protein~~~
MSNTAPGPTVANKRDEKHRHVVNVVLELPTEISEATHPVLATMLSKYTRMSSLFNDKCAFKLDLLRMVAVSRTRR
>P10219 ~~~SCP~~~Small capsomere-interacting protein~~~
MAVPQFHRPSTVTTDSVRALGMRGLVLATNNSQFIMDNNHPHPQGTQGAVREFLRGQAAALTDLGLAHANNTFTPQPMFA
GDAPAAWLRPAFGLRRTYSPFVVREPSTPGTP
>P89458 ~~~SCP~~~Small capsomere-interacting protein~~~
MAAPQFHRPSTITADNVRALGMRGLVLATNNAQFIMDNSYPHPHGTQGAVREFLRGQAAALTDLGVTHANNTFAPQPMFA
GDAAAEWLRPSFGLKRTYSPFVVRDPKTPSTP
>Q9WT32 ~~~SCP~~~Small capsomere-interacting protein~~~
MTTIRGDDLSNQITQISGSSSKKEEEKKKQQMLTGVLGLQPTMANHPVLGVFLPKYAKQNGGNVDKTAFRLDLIRMLALH
RLNTKTGSD
>Q2HR63 ~~~SCP~~~Small capsomere-interacting protein~~~
MSNFKVRDPVIQERLDHDYAHHPLVARMNTLDQGNMSQAEYLVQKRHYLVFLIAHHYYEAYLRRMGGIQRRDHLQTLRDQ
KPRERADRVSAASAYDAGTFTVPSRPGPASGTTPGGQDSLGVSGSSITTLSSGPHSLSPASDILTTLSSTTETAAPAVAD
ARKPPSGKKK
>Q8JL80 ~~~~~~Semaphorin-like protein 139~~~
MIPLLFILFYFTNCIEWHKFETSEEIISTYLIDDVLYTGVNGAVYTFSNNELNKTGLTNNNNYITTSIKVEDTLVCGTNN
GNPKCWKIDGSEDPKYRGRGYAPYQNSKVTIISHNECVLSDINISKEGIKRWRRFDGPCGYDLYTADNVIPKDGVRGAFV
DKDGTYDKVYILFTDTIDTKRIVKIPYIAQMCLNDEGGPSSLSSHRWSTFLKVELECDIDGRSYRQIIHSKAIKTDNDTI
LYVFFDSPYSKSALCTYSMNAIKHSFSTSKLGGYTKQLPSPAPGICLPAGKVVPHTTFDIIEQYNELDDIIKPLSQPIFE
GPSGVKWFDIKEKENEHREYRIYFIKENTIYSFDTKSKQTRSAQVDARLFSVMVTSKPLFIADIGIGVGIPRMKKILKM
>P21062 ~~~~~~Semaphorin-like protein A39~~~
MIPLLFILFYFANGIEWHKFETSEEIISTYLLDDVLYTGVNGAVYTFSNNKLNKTGLTNNNYITTSIKVEDADKDTLVCG
TNNGNPKCWKIDGSDDPKHRGRGYAPYQNSKVTIISYNECVLSDINISKEGIKRWRRFDGPCGYDLYTADNVIPKDGLRG
AFVDKDGTYDKVYILFTDTIGSKRIVKIPYIAQMCLNDEGGPSSLSSHRWSTFLKVELECDIDGRSYRQIIHSRTIKTDN
DTILYVFFDSPYSKSALCTYSMNTIKQSFSTSKLEGYTKQLPSPAPGICLPAGKVVSHTTFEVIEKYNVLDDIIKPLSNQ
PIFEGPSGVKWFDIKEKENEHREYRIYFIKENSIYSFDTKSKQTRSSQVDARLFSVMVTSKPLFIADIGIGVGMPQMKKI
LKM
>E1XTK6 ~~~~~~Serinyltransferase~~~
MTHLSKMPTGYTPPAEWKYPIDLSIDYRKPENRMYLLKAWVEALSYTEEHNQQVRLMDYAIEVTEGITQLEKIERKIWMA
FLWGCCYNGIGPWTIYSEFPVPPQSPQEFKRFCDWYNLNFERMRFDTDCRYRKSKMIPCVQSYIDWLAGRTQMDAFRPLL
ETKLQSDQFVKLWDTAMGWKYFGRLSAWNFLEALNMVFGNMYQIDVPGFMLRDRDGSESNRNGAAFLSNRDDWVTKHGKK
KINGCPITDEECDILEADLEQAFKDCVAEFGHITFINRLNFETSGACWLKKFFRLKNTRYIGWDAERTWDEIDYMERIWP
EYSCKALWEARSLWLPDTLLCEKAPAGHVPGVQKWKMPVFFETGVPLHIWHLQQGTRWEPSEVYTNLKMPVRKIDDNPKS
TSVNLMSLLKR
>P24939 ~~~L4~~~Protein 33K~~~
MAPKKKLQLPPPPPTDEEEYWDSQAEEVLDEEEEMMEDWDSLDEASEAEEVSDETPSPSVAFPSPAPQKLATVPSIATTS
APQAPPALPVRRPNRRWDTTGTRAAPTAPAAAAAAATAAVTQKQRRPDSKTLTKPKKSTAAAAAGGGALRLAPNEPVSTR
ELRNRIFPTLYAIFQQSRGQEQELKIKNRSLRSLTRSCLYHKSEDQLRRTLEDAEALFSKYCALTLKD
>P24940 ~~~L4~~~Protein 33K~~~
MAPKKKLQLPPPPTDEEEYWDSQAEEVLDEEEEDMMEDWESLDEEASEVEEVSDETPSPSVAFPSPAPQKSATGSSMATT
SAPQAPPALPVRRPNRRWDTTGTRAAHTAPAAAAAAATAAATQKQRRPDSKTLTKPKKSTAAAAAGGGALRLAPNEPVST
RELRNRIFPTLYAIFQQSRGQEQELKIKNRSLRSLTRSCLYHKSEDQLRRTLEDAEALFSKYCALTLKD
>P11805 ~~~L4~~~Protein 33K~~~
MPPKGNKHPIAQRQSQQKLQKQWDEEETWDDSQAEEVSDEEAEEQMESWDSLDEEDLEDVEEETIASDKAPSFKKPVRSQ
PPKTIPPLPPQPCSLKASRRWDTVSIAGSPTAPAAPTKRLEKTPRVRKTSSAIATRQDSPATQELRKRIFPTLYAIFQQS
RGQQLELKVKNRSLRSLTRSCLYHRSEDQLQRTLEDAEALFNKYCSVSLKD
>P19416 ~~~L4~~~Protein 33K~~~
MPPKGNKQAIADRRSQKQQKLQEQWDEEEESWDDSQAEEVSDEEEMESWESLDEELEDKPPKDEEEEIIASAAAPSSKEP
ARSQPPTGKVGPSPPRPGLLKASRRWDTVSIAGSPPAPVAPTKRSEKTTRPRKEKTSAIATRQDTPVAQELRKRIFPTLY
AIFQQSRGQQLELKVKNRSLRSLTRSCLYHRREDQLQRTLEDAEALFNKYCSVSLKD
>P0DJY3 ~~~~~~Protein suppressor of silencing~~~
MAMTKKFKVSFDVTAKMSSDVQAILEKDMLHLCKQVGSGAIVPNGKQKEMIVQFLTHGMEGLMTFVVRTSFREAIKDMHE
EYADKDSFKQSPATVREVF
>P25989 ~~~~~~Small delta antigen~~~
MSRSERRKDRGGREDILEQWVSGRKKLEELERDLRKLKKKIKKLEEDNPWLGNIKGIIGKKDKDGEGAPPAKKLRMDQME
IDAGPRKRPLRGGFTDKERQDHRRRKALENKRKQLSSGGKSLSREEEEELKRLTEEDEKRERRIAGPSVGGVNPLEGGSR
GAPGGGFVPSMQGVPESPFARTGEGLDIRGSQGFP
>P0C6L3 ~~~~~~Small delta antigen~~~
MSRPEGRKNRGGREEVLEQWVSGRKKLEELERDLRKVKKKIKKLEDEHPWLGNIKGILGKKDKDGEGAPPAKRARTDQME
VDSGPRKRPSRGGFTDKERQDHRRRKALENKRKQLSAGGKNLSKEEEEELRRLTEEDERRERRIAGPQVGGVNPLEGGTR
GAPGGGFVPSMQGVPESPFTRTGEGLDIRGSQGFP
>P06934 ~~~~~~Small delta antigen~~~
MSRSESRKNRGGREEILEQWVAGRKKLEELERDLRKTKKKLKKIEDENPWLGNIKGILGKKDKDGEGAPPAKRARTDQME
VDSGPRKRPLRGGFTDKERQDHRRRKALENKKKQLSAGGKNLSKEEEEELRRLTEEDERRERRVAGPPVGGVIPLEGGSR
GAPGGGFVPSLQGVPESPFSRTGEGLDIRGNRGFP
>P24932 ~~~L4~~~Shutoff protein~~~
MESVEKEDSLTAPFEFATTASTDAANAPTTFPVEAPPLEEEEVIIEQDPGFVSEDDEDRSVPTEDKKQDQDDAEANEEQV
GRGDQRHGDYLDVGDDVLLKHLQRQCAIICDALQERSDVPLAIADVSLAYERHLFSPRVPPKRQENGTCEPNPRLNFYPV
FAVPEVLATYHIFFQNCKIPLSCRANRSRADKQLALRQGAVIPDIASLDEVPKIFEGLGRDEKRAANALQQENSENESHC
GVLVELEGDNARLAVLKRSIEVTHFAYPALNLPPKVMSTVMSELIVRRARPLERDANLQEQTEEGLPAVGDEQLARWLET
REPADLEERRKLMMAAVLVTVELECMQRFFADPEMQRKLEETLHYTFRQGYVRQACKISNVELCNLVSYLGILHENRLGQ
NVLHSTLKGEARRDYVRDCVYLFLCYTWQTAMGVWQQCLEERNLKELQKLLKQNLKDLWTAFNERSVAAHLADIIFPERL
LKTLQQGLPDFTSQSMLQNFRNFILERSGILPATCCALPSDFVPIKYRECPPPLWGHCYLLQLANYLAYHSDIMEDVSGD
GLLECHCRCNLCTPHRSLVCNSQLLSESQIIGTFELQGPSPDEKSAAPGLKLTPGLWTSAYLRKFVPEDYHAHEIRFYED
QSRPPNAELTACVITQGHILGQLQAINKARQEFLLRKGRGVYLDPQSGEELNPIPPPPQPYQQPRALASQDGTQKEAAAA
AAATHGRGGILGQSGRGGFGRGGGDDGRLGQPRRSFRGRRGVRRNTVTLGRIPLAGAPEIGNRSQHRYNLRSSGAAGTAC
SPTQP
>P24933 ~~~L4~~~Shutoff protein~~~
MESVEKKDSLTAPSEFATTASTDAANAPTTFPVEAPPLEEEEVIIEQDPGFVSEDDEDRSVPTEDKKQDQDNAEANEEQV
GRGDERHGDYLDVGDDVLLKHLQRQCAIICDALQERSDVPLAIADVSLAYERHLFSPRVPPKRQENGTCEPNPRLNFYPV
FAVPEVLATYHIFFQNCKIPLSCRANRSRADKQLALRQGAVIPDIASLNEVPKIFEGLGRDEKRAANALQQENSENESHS
GVLVELEGDNARLAVLKRSIEVTHFAYPALNLPPKVMSTVMSELIVRRAQPLERDANLQEQTEEGLPAVGDEQLARWLQT
REPADLEERRKLMMAAVLVTVELECMQRFFADPEMQRKLEETLHYTFRQGYVRQACKISNVELCNLVSYLGILHENRLGQ
NVLHSTLKGEARRDYVRDCVYLFLCYTWQTAMGVWQQCLEECNLKELQKLLKQNLKDLWTAFNERSVAAHLADIIFPERL
LKTLQQGLPDFTSQSMLQNFRNFILERSGILPATCCALPSDFVPIKYRECPPPLWGHCYLLQLANYLAYHSDIMEDVSGD
GLLECHCRCNLCTPHRSLVCNSQLLNESQIIGTFELQGPSPDEKSAAPGLKLTPGLWTSAYLRKFVPEDYHAHEIRFYED
QSRPPNAELTACVITQGHILGQLQAINKARQEFLLRKGRGVYLDPQSGEELNPIPPPPQPYQQQPRALASQDGTQKEAAA
AAATHGRGGILGQSGRGGFGRGGGGHDGRLGEPRRGSFRGRRGVRRNTVTLGRIPLAGAPEIGNRFQHGYNLRSSGAAGT
ARSPTQP
>P10225 3.1.27.-~~~~~~Virion host shutoff protein~~~
MGLFGMMKFAHTHHLVKRRGLGAPAGYFTPIAVDLWNVMYTLVVKYQRRYPSYDREAITLHCLCRLLKVFTQKSLFPIFV
TDRGVNCMEPVVFGAKAILARTTAQCRTDEEASDVDASPPPSPITDSRPSSAFSNMRRRGTSLASGTRGTAGSGAALPSA
APSKPALRLAHLFCIRVLRALGYAYINSGQLEADDACANLYHTNTVAYVYTTDTDLLLMGCDIVLDISACYIPTINCRDI
LKYFKMSYPQFLALFVRCHTDLHPNNTYASVEDVLRECHWTPPSRSQTRRAIRREHTSSRSTETRPPLPPAAGGTETRVS
WTEILTQQIAGGYEDDEDLPLDPRDVTGGHPGPRSSSSEILTPPELVQVPNAQLLEEHRSYVANPRRHVIHDAPESLDWL
PDPMTITELVEHRYIKYVISLIGPKERGPWTLLKRLPIYQDIRDENLARSIVTRHITAPDIADRFLEQLRTQAPPPAFYK
DVLAKFWDE
>Q82171 3.1.27.-~~~~~~Virion host shutoff protein~~~
MGLFGMMKFAHTHHLVKRQGLGAPAGYFTPIAVDLWNVMYTLVVKYQRRYPSYDREAITLHCLCRLLKVFTQKSLFPIFV
TDRGVNCMEPVVFGAKAILARTTAQCRTDEEASDVDASPPPSPITDSRPSSAFSNMRRRGTSLASGTRGTAGSGAALPSA
APSKPALRLAHLFCIRVLRALGYAYINSGQLEADDACANLYHTNTVAYVYTTDTDLLLMGCDIVLDISACYIPTINCRDI
LKYFKMSYPQFLALFVRCHTDLHPNNTYASVEDVLRECHWTPPSRSQTRRAIRREHTSSRSTETRPPLPPAAGGTEMRVS
WTEILTQQIAGGYEDDEDLPLDPRDVTGGHPGPRSSSSEILTPPELVQVPNAQLLEEHRSYVASRRRHVIHDAPESLDWL
PDPMTITELVEHRYIKYVISLIGPKERGPWTLLKRLPIYQDIRDENLARSIVTRHITAPDIADRFLEQLRTQAPPPAFYK
DVLAKFWDE
>Q6WB95 ~~~SH~~~Small hydrophobic protein~~~
MITLDVIKSDGSSKTCTHLKKIIKDHSGKVLIALKLILALLTFFTITITINYIKVENNLQICQSKTESDKEDSPSNTTSV
TTKTTLDHDITQYFKRLIQRYTDSVINKDTCWKISRNQCTNITTYKFLCFKPEDSKINSCDRLTDLCRNKSKSAAEAYHT
VECHCIYTIEWKCYHHSID
>P0DOE4 ~~~SH~~~Small hydrophobic protein~~~
MENTSITIEFSSKFWPYFTLIHMITTIISLLIIISIMIAILNKLCEYNVFHNKTFELPRARVNT
>P0DOE5 ~~~SH~~~Small hydrophobic protein~~~
MENTSITIEFSSKFWPYFTLIHMITTIISLLIIISIMIAILNKLCEYNVFHNKTFELPRARVNT
>P22109 ~~~SH~~~Small hydrophobic protein~~~
MPAIQPPLYLTFLLLILLYLIITLYVWTILTINHKTAVRYAALYQRSCSRWGFDQSL
>P22110 ~~~SH~~~Small hydrophobic protein~~~
MPAIQPPLYLTFLLLTLLYLIITLYVWTILTINHNTAVRYAALYQRSFSRWGFDQSL
>P22112 ~~~SH~~~Small hydrophobic protein~~~
MPAIQPPLYPTFLLLILLSLIITLYVWIISTITYKTAVRHAALHQRSFSRWSLDHSL
>P33496 ~~~SH~~~Small hydrophobic protein~~~
MTSTVNLGSDTASKRTVIKSRCNSCCRILVSCVAVICAILALIFLVATIGLSVKLAFTVQEVHNCKQKLSGASTTTAAIY
TTPSTMIEALQTNQLKLTTNERRSTPPDCLVEKKLCEGEVRYLKTKGCLGAREGEDLNCIDLVVECVGKPCGHNEDYKEC
ICTNNGTATKCCYN
>Q992I2 ~~~ORF3~~~Sigma-C capsid protein~~~
MAGLNPSQRREVVSLILSLTSNVNISHGDLTPIYERLTNLEASTELLHRSISDISTTVSNISANLQDMTHTLDDVTANLD
GLRTTVTALQDSVSILSTNVTDLTNRSSAHAAILSSLQTTVDGNSTAISNLKSDISSNGLAITDLQDRVKSLESTASHGL
SFSPPLSVADGVVSLDMDPYFCSQRVSLTSYSAEAQLMQFRWMARGTNGSSDTIDMTVNAHCHGRRTDYMMSSTGNLTVT
SNVVLLTFDLSDITHIPSDLARLVPSAGFQAASFPVDVSFTRDSATHAYQAYGVYSSSRVFTITFPTGGDGTANIRSLTV
RTGIDT
>P03528 ~~~S1~~~Outer capsid protein sigma-1~~~
MDPRLREEVVRLIIALTSDNGASLSKGLESRVSALEKTSQIHSDTILRITQGLDDANKRIIALEQSRDDLVASVSDAQLA
ISRLESSIGALQTVVNGLDSSVTQLGARVGQLETGLAELRVDHDNLVARVDTAERNIGSLTTELSTLTLRVTSIQADFES
RISTLERTAVTSAGAPLSIRNNRMTMGLNDGLTLSGNNLAIRLPGNTGLNIQNGGLQFRFNTDQFQIVNNNLTLKTTVFD
SINSRIGATEQSYVASAVTPLRLNSSTKVLDMLIDSSTLEINSSGQLTVRSTSPNLRYPIADVSGGIGMSPNYRFRQSMW
IGIVSYSGSGLNWRVQVNSDIFIVDDYIHICLPAFDGFSIADGGDLSLNFVTGLLPPLLTGDTEPAFHNDVVTYGAQTVA
IGLSSGGAPQYMSKNLWVEQWQDGVLRLRVEGGGSITHSNSKWPAMTVSYPRSFT
>P04506 ~~~S1~~~Outer capsid protein sigma-1~~~
MDASLITEIRKIVLQLSVSSNGSQSKEIEEIKKQVQVNVDDIRAANIKLDGLGRQIADISNSISTIESRLGEMDNRLVGI
SSQVTQLSNSVSQNTQSISSLGDRINAVEPRVDSLDTVTSNLTGRTSTLEADVGSLRTELAALTTRVTTEVTRLDGLINS
GQNSIGELSTRLSNVETSMVTTAGRGLQKNGNTLNVIVGNGMWFNSSNQLQLDLSGQSKGVGFVGTGMVVKIDTNYFAYN
SNGEITLVSQINELPSRVSTLESAKIDSVLPPLTVREASGVRTLSFGYDTSDFTIINSVLSLRSRLTLPTYRYPLELDTA
NNRVQVADRFGMRTGTWTGQLQYQHPQLSWRANVTLNLMKVDDWLVLSFSQMTTNSIMADGKFVINFVSGLSSGWQTGDT
EPSSTIDPLSTTFAAVQFLNNGQRIDAFRIMGVSEWTDGELEIKNYGGTYTGHTQVYWAPWTIMYPCNVR
>P11314 ~~~S2~~~Inner capsid protein sigma-2~~~
MARAAFLFKTVGFGGLQNVPINDELSSHLLRAGNSPWQLTQFLDWISLGRGLATSALVPTAGSRYYQMSCLLSGTLQIPF
RPNHRWGDIRFLRLVWSAPTLDGLVVAPPQVLAQPALQAQADRVYDCDDYPFLARDPRFKHRVYQQLSAVTLLNLTGFGP
ISYVRVDEDMWSGDVNQLLMNYFGHTFAEIAYTLCQASANRPWEHDGTYARMTQIILSLFWLSYVGVIHQQNTYRTFYFQ
CNRRGDAAEVWILSCSLNHSAQIRPGNRSLFVMPTSPDWNMDVNLILSSTLTGCLCSGSQLPLIDNNSVPAVSRNIHGWT
GRAGNQLHGFQVRRMVTEFCDRLRRDGVMTQAQQNQIEALADQTQQFKRDKLEAWAREDDQYNQANPNSTMFRTKPFTNA
QWGRGNTGATSAAIAALI
>P03527 ~~~S4~~~Outer capsid protein sigma-3~~~
MEVCLPNGHQVVDLINNAFEGRVSIYSAQEGWDKTISAQPDMMVCGGAVVCMHCLGVVGSLQRKLKHLPHHRCNQQIRHQ
DYVDVQFADRVTAHWKRGMLSFVAQMHEMMNDVSPDDLDRVRTEGGSLVELNWLQVDPNSMFRSIHSSWTDPLQVVDDLD
TKLDQYWTALNLMIDSSDLIPNFMMRDPSHAFNGVKLGGDARQTQFSRTFDSRSSLEWGVMVYDYSELEHDPSKGRAYRK
ELVTPARDFGHFGLSHYSRATTPILGKMPAVFSGMLTGNCKMYPFIKGTAKLKTVRKLVEAVNHAWGVEKIRYALGPGGM
TGWYNRTMQQAPIVLTPAALTMFPDTIKFGDLNYPVMIGDPMILG
>P30211 ~~~S4~~~Outer capsid protein sigma-3~~~
MEVCLPNGHQIVDWINNAFEGRVSIYSAQQGWDKTISAQPDMMVCGGAVVCMHCLGVVGSLQRKLKHLPHHKCNQQLRQQ
DYVDVQFADRVTAHWKRGMLSFVSQMHAIMNDVTPEELERVRTDGGSLAELNWLQVDPGSMFRSIHSSWTDPLQVVEDLD
TQLDRYWTALNLMIDSSDLVPNFMMRDPSHAFNGVKLEGEARQTQFSRTFDSRSNLEWGVMIYDYSELERDPLKGRAYRK
EVVTPARDFGHFGLSHYSRATTPILGKMPAVFSGMLTGNCKMYPFIKGTAKLRTVKKLVDAVNHTWGSEKIRYALGPGGM
TGWYNRTMQQAPIVLTPAALTMFPDMTKFGDLQYPIMIGDPAVLG
>P07939 ~~~S4~~~Outer capsid protein sigma-3~~~
MEVCLPNGHQIVDLINNAFEGRVSIYSAQEGWDKTISAQPDMMVCGGAVVCMHCLGVVGSLQRKLKHLPHHRCNQQIRHQ
DYVDVQFADRVTAHWKRGMLSFVAQMHAMMNDVSPEDLDRVRTEGGSLVELNWLQVDPNSMFRSIHSSWTDPLQVVDDLD
TKLDQYWTALNLMIDSSDLVPNFMMRDPSHAFNGVRLEGDARQTQFSRTFDSRSSLEWGVMVYDYSELEHDPSKGRAYRK
ELVTPARDFGHFGLSHYSRATTPILGKMPAVFSGMLTGNCKMYPFIKGTAKLKTVRKLVDSVNHAWGVEKIRYALGPGGM
TGWYDRTMQQAPIVLTPAALTMFSDTTKFGDLDYPVMIGDPMILG
>P04524 ~~~~~~RNA polymerase sigma-like factor~~~
MSETKPKYNYVNNKELLQAIIDWKTELANNKDPNKVVRQNDTIGLAIMLIAEGLSKRFNFSGYTQSWKQEMIADGIEASI
KGLHNFDETKYKNPHAYITQACFNAFVQRIKKERKEVAKKYSYFVHNVYDSRDDDMVALVDETFIQDIYDKMTHYEESTY
RTPGAEKKSVVDDSPSLDFLYEAND
>P03526 ~~~S3~~~Protein sigma-NS~~~
MASSLRAAISKIKRDDVGQQVCPNYVMLRSSVTTKVVRNVVEYQIRTGGFFSCLAMLRPLQYAKRERLLGQRNLERISTR
DILQTRDLHSLCMPTPDAPMSNHQASTMRELICSYFKVDHADGLKYIPMDERYSPSSLARLFTMGMAGLHITTEPSYKRV
PIMHLAADLDCMTLALPYMITLDGDTVVPVAPTLSAEQLLDDGLKGLACMDISYGCEVDANSRPAGDQSMDSSRCINELY
CEETAEAICVLKTCLVLNCMQFKLEMDDLAHNAAELDKIQMMIPFSERVFRMASSFATIDAQCFRFCVMMKDKNLKIDMR
ETTRLWTRSASDDSVATSSLSISLDRGRWVAADASDARLLVFPIRV
>P07940 ~~~S3~~~Protein sigma-NS~~~
MASSLRAAISKIKRDDVGQQVCPNYVMLRSSVTTKVVRNVVEYQIRTGGFFSCLAMLRPLQYAKRERLLGQRNLERISTR
DILQTRDLHSLCMPTPDAPMSNHQAATMRELICSYFKVDHTDGLKYIPMDERYSPSSLARLFTMGMAGLHITTEPSYKRV
PIMHLAADLDCMTLALPYMITLDGDTVVPVAPTLSAEQLLDDGLKGLACMDISYGCEVDASNRSAGDQSMDSSRCINELY
CEETAEAICVLKTCLVLNCMQFKLEMDDLAHNATELDKIQMMIPFSERVFRMASSFATIDAQCFRFCVMMKDKNLKIDMR
ETMRLWTRSALDDSVVTSSLSISLDRGRWVAADATDARLLVFPIRV
>P17863 ~~~V-SKI~~~Transforming protein Ski~~~
FHLSSMSSLGGPAAFSARWAQEMYKKDNGKDPAEPVLHLPPIQPPPVMPGPFFMPSDRSTERCETILEGETISCFVVGGE
KRLCLPQILNSVLRDFSLQQINSVCDELHIYCSRCTADQLEILKVMGILPFSAPSCGLITKTDAERLCNALLYGGTYPPH
CKKEFSSTIELELTEKSFKVYHECFGKCKGLLVPELYSNPSAACIQCLDCRLMYPPHKFVVHSHKSLENRTCHWGFDSAN
WRSYILLSQDYTGKEEKARLGQLLDEMKEKFDYNNKYKRKAPRNRESPRVQLRRNKMFKTMLWDPAGGSAVLQRQPDGNE
VPSDPPASKKTKIDDSASQSPASTEKEKQSSRLRSLSSSSNKSIGCVHPRQRLSAFRPWSPAVSANEKELSTHLPALIRD
SSFYSYKSFENAVAPNVALAPPAQQKVVSNPPCATVV
>Q7Y5B1 ~~~~~~Small outer capsid protein~~~
MGGYVNIKTFTHPAGEGKEVKGMEVSVPFEIYSNEHRIADAHYQTFPSEKAAYTTVVTDAADWRTKNAAMFTPTPVSG
>P03715 ~~~soc~~~Small outer capsid protein~~~
MASTRGYVNIKTFEQKLDGNKKIEGKEISVAFPLYSDVHKISGAHYQTFPSEKAAYSTVYEENQRTEWIAANEDLWKVTG
>P39230 ~~~sp~~~Protein spackle~~~
MKKFIFATIFALASCAAQPAMAGYDKDLCEWSMTADQTEVETQIEADIMNIVKRDRPEMKAEVQKQLKSGGVMQYNYVLY
CDKNFNNKNIIAEVVGE
>Q9T1X1 ~~~~~~Spanin, inner membrane subunit~~~
MTFASKSLLLAAVFTAVLSGGLWHRLDSTRHDNQTLRRELQTEQQARHTAEWLLHGQEQTMQVFSAIRAANRAARLADET
EHHDAKEKITTAITGDNCSTRPVPAVAADRLRELEKRTRAIGGDPARN
>P03803 ~~~~~~Spanin, inner membrane subunit~~~
MLEFLRKLIPWVLAGMLFGLGWHLGSDSMDAKWKQEVHNEYVKRVEAAKSTQRAIDAVSAKYQEDLAALEGSTDRIISDL
RSDNKRLRVRVKTTGTSDGQCGFEPDGRAELDDRDAKRILAVTQKGDAWIRALQDTIRELQRK
>P00726 ~~~Rz~~~Spanin, inner membrane subunit~~~
MSRVTAIISALVICIIVCLSWAVNHYRDNAITYKAQRDKNARELKLANAAITDMQMRQRDVAALDAKYTKELADAKAEND
ALRDDVAAGRRRLHIKAVCQSVREATTASGVDNAASPRLADTAERDYFTLRERLITMQKQLEGTQKYINEQCR
>P0DTL7 ~~~~~~Spanin, outer lipoprotein subunit~~~
MMQKKKLQPPLLVTIAALVLCLPLLLTGCGNSKNAPVPSVVILPEIDTELTEATPVPPMPQPLTWGASLLWNADLLMALG
QCNRDKASVREQEIRRKEIYERRPEPGGGAAAR
>P03788 ~~~~~~Spanin, outer lipoprotein subunit~~~
MSTLRELRLRRALKEQSMRYLLSIKKTLPRWKGALIGLFLICVATISGCASESKLPEPPMVSVDSSLMVEPNLTTEMLNV
FSQ
>Q37935 ~~~Rz1~~~Spanin, outer lipoprotein subunit~~~
MLKLKMMLCVMMLPLVVVGCTSKQSVSQCVKPPPPPAWIMQPPPDWQTPLNGIISPSERG
>Q6XQ97 ~~~~~~U-spanin~~~
MKLKKTCIAITVAVGVISLSGCSTASALSGLLSDSPDVTAQVGAENTKQLAGVTAKADDKREVKVSDSNIGKIDSSVKKS
VEVSTIQANTVNAESITVTKSGSWYDPVVCWILVFIVLLLFYFLIRKHEKKEA
>P29815 ~~~~~~Spheroidin~~~
MSNVPLATKTIRKLSNRKYEIKIYLKDENTCFERVVDMVVPLYDVCNETSGVTLESCSPNIEVIELDNTHVRIKVHGDTL
KEMCFELLFPCNVNEAQVWKYVSRLLLDNVSHNDVKYKLANFRLTLNGKHLKLKEIDQPLFIYFVDDLGNYGLITKENIQ
NNNLQVNKDASFITIFPQYAYICLGRKVYLNEKVTFDVTTDATNITLDFNKSVNIAVSFLDIYYEVNNNEQKDLLKDLLK
RYGEFEVYNADTGLIYAKNLSIKNYDTVIQVERLPVNLKVRAYTKDENGRNLCLMKITSSTEVDPEYVTSNNALLGTLRV
YKKFDKSHLKIVMHNRGSGNVFPLRSLYLELSNVKGYPVKASDTSRLDVGIYKLNKIYVDNDENKIILEEIEAEYRCGRQ
VFHERVKLNKHQCKYTPKCPFQFVVNSPDTTIHLYGISNVCLKPKVPKNLRLWGWILDCDTSRFIKHMADGSDDLDLDVR
LNRNDICLKQAIKQHYTNVIILEYANTYPNCTLSLGNNRFNNVFDMNDNKTISEYTNFTKSRQDLNNMSCILGINIGNSV
NISSLPGWVTPHEAKILRSGCARVREFCKSFCDLSNKRFYAMARDLVSLLFMCNYVNIEINEAVCEYPGYVILFARAIKV
INDLLLINGVDNLAGYSISLPIHYGSTEKTLPNEKYGGVDKKFKYLFLKNKLKDLMRDADFVQPPLYISTYFRTLLDAPP
TDNYEKYLVDSSVQSQDVLQGLLNTCNTIDTNARVASSVIGYVYEPCGTSEHKIGSEALCKMAKEASRLGNLGLVNRINE
SNYNKCNKYGYRGVYENNKLKTKYYREIFDCNPNNNNELISRYGYRIMDLHKIGEIFANYDESESPCERRCHYLEDRGLL
YGPEYVHHRYQESCTPNTFGNNTNCVTRNGEQHVYENSCGDNATCGRRTGYGRRSRDEWNDYRKPHVYDNCADANSSSSD
SCSDSSSSSESESDSDGCCDTDASLDSDIENCYQNPSKCDAGC
>P12393 ~~~SPI-1~~~Serine proteinase inhibitor 1~~~
MKYLVLVLCLTSCACRDIGLWTFRYVYNESDNVVFSPYGLTSALSVLRIAAGGNTKREIDVPESVVEDSDAFLALRELFV
DASVPLRPEFTAEFSSRFNTSVQRVTFNSENVKDVINSYVKDKTGGDVPRVLDASLDRDTKMLLLSSVRMKTSWRHVFDP
SFTTDQPFYSGNVTYKVRMMNKIDTLKTETFTLRNVGYSVTELPYKRRQTAMLLVVPDDLGEIVRALDLSLVRFWIRNMR
KDVCQVVMPKFSVESVLDLRDALQRLGVRDAFDPSRADFGQASPSNDLYVTKVLQTSKIEADERGTTASSDTAITLIPRN
ALTAIVANKPFMFLIYHKPTTTVLFMGTITKGEKVIYDTEGRDDVVSSV
>P15058 ~~~~~~Serine proteinase inhibitor 1~~~
MDIFKELILKHTDENVLISPVSILSTLSILNHGAAGSTAEQLSKYIENMNENTPDDNNDMDVDIPYCATLATANKIYGSD
SIEFHASFLQKIKDDFQTVNFNNANQTKELINEWVKTMTNGKINSLLTSPLSINTRMTVVSAVHFKAMWKYPFSKHLTYT
DKFYISKNIVTSVDMMVSTENNLQYVHINELFGGFSIIDIPYEGNSSMVIILPDDIEGIYNIEKNITDEKFKKWCGMLST
KSIDLYMPKFKVEMTEPYNLVPILENLGLTNIFGYYADFSKMCNETITVEKFLHTTFIDVNEEYTEASAVTGVFMTNFSM
VYRTKVYINHPFMYMIKDNTGRILFIGKYCYPQ
>P07385 ~~~~~~Serine proteinase inhibitor 2~~~
MDIFREIASSMKGENVFISPPSISSVLTILYYGANGSTAEQLSKYVEKEADKNKDDISFKSMNKVYGRYSAVFKDSFLRK
IGDNFQTVDFTDCRTVDAINKCVDIFTEGKINPLLDEPLSPDTCLLAISAVYFKAKWLMPFEKEFTSDYPFYVSPTEMVD
VSMMSMYGEAFNHASVKESFGNFSIIELPYVGDTSMVVILPDNIDGLESIEQNLTDTNFKKWCDSMDAMFIDVHIPKFKV
TGSYNLVDALVKLGLTEVFGSTGDYSNMCNSDVSVDAMIHKTYIDVNEEYTEAAAATCALVADCASTVTNEFCADHPFIY
VIRHVDGKILFVGRYCSPTTN
>P68565 ~~~SERP2~~~Serine proteinase inhibitor 2~~~
MELFKHFLQSTASDVFVSPVSISAVLAVLLEGAKGRTAAQLRLALEPRYSHLDKVTVASRVYGDWRLDIKPKFMQAVRDR
FELVNFNHSPEKIKDDINRWVAARTNNKILNAVNSISPDTKLLIVAAIYFEVAWRNQFVPDFTIEGEFWVTKDVSKTVRM
MTLSDDFRFVDVRNEGIKMIELPYEYGYSMLVIIPDDLEQVERHLSLMKVISWLKMSTLRYVHLSFPKFKMETSYTLNEA
LATSGVTDIFAHPNFEDMTDDKNVAVSDIFHKAYIEVTEFGTTAASCTYGCVTDFGGTMDPVVLKVNKPFIFIIKHDDTF
SLLFLGRVTSPNY
>P15059 ~~~~~~Serine proteinase inhibitor 2~~~
MDIFREIASSMKGENVFISPASISSVLTILYYGANGSTAEQLSKYVEKEENMDKVSAQNISFKSINKVYGRYSAVFKDSF
LRKIGDKFQTVDFTDCRTIDAINKCVDIFTEGKINPLLDEPLSPDTCLLAISAVYFKAKWLTPFEKEFTSDYPFYVSPTE
MVDVSMMSMYGKAFNHASVKESFGNFSIIELPYVGDTSMMVILPDKIDGLESIEQNLTDTNFKKWCNSLEATFIDVHIPK
FKVTGSYNLVDTLVKSGLTEVFGSTGDYSNMCNSDVSVDAMIHKTYIDVNEEYTEAAAATCALVSDCASTITNEFCVDHP
FIYVIRHVDGKILFVGRYCSPTTNC
>P18047 ~~~~~~Fiber protein 1~~~
MKRARFEDDFNPVYPYEHYNPLDIPFITPPFASSNGLQEKPPGVLSLKYTDPLTTKNGALTLKLGTGLNIDKNGDLSSDA
SVEVSAPITKTNKIVGLNYTKPLALQNNALTLSYNAPFNVVNNNLALNMSQPVTINANNELSLLIDAPLNADTGTLRLRS
DAPLGLVDKTLKVLFSSPLYLDNNFLTLAIERPLALSSNRAVALKYSPPLKIENENLTLSTGGPFTVSGGNLNLATSAPL
SVQNNSLSLGVNPPFLITDSGLAMDLGDGLALGGSKLIINLGPGLQMSNGAITLALDAALPLQYKNNQLQLRIGSASALI
MSGVTQTLNVNANTSKGLAIENNSLVVKLGNGLRFDSWGSIAVSPTTTTPTTLWTTADPSPNATFYESLDAKVWLVLVKC
NGMVNGTISIKAQKGTLLKPTASFISFVMYFYSDGTWRKNYPVFDNEGILANSATWGYRQGQSANTNVSNAVEFMPSSKR
YPNEKGSEVQNMALTYTFLQGDPNMAISFQSIYNHAIEGYSLKFTWRVRNNERFDIPCCSFSYVTEQ
>P14267 ~~~~~~Fiber protein 1~~~
MKRARLEDDFNPVYPYEHYNPLDIPFITPPFASSNGLQEKPPGVLSLKYTDPLTTKNGALTLKLGTGLNIDENGDLSSDA
SVEVSAPITKTNKIVGLNYTKPLALRSNALTLSYNAPLNVVNNNLALNISQPVTVNANNELSLLIDAPLNADTGTLRLQS
AAPLGLVDKTLKVLFSSPLYLDNNFLTLAIERPLALSSSRAVTLKYSPPLKIENENLTLSTGGPFTVSGGNLNLTTSAPL
SVQNNSLSLVITSPLKVINSMLAVGVNPPFTITDSGLAMDLGDGLALGGSKLIINLGPGLQMSNGAITLALDAALPLQYR
DNQLQLRIGSTSGLIMSGVTQTLNVNANTGKGLAVENNSLVVKLGNGLRFDSWGSITVSPTTTTPTTLWTTADPSPNATF
YESLDAKVWLVLVKCNGMVNGTISIKAQKGILLRPTASFISFVMYFYSDGTWRKNYPVFDNEGILANSATWGYRQGQSAN
TNVSNAVEFMPSSKRYPNQKGSEVQNMALTYTFLQGDPNMAISFQSIYNHALEGYSLKFTWRVRNNERFDIPCCSFSYVT
EQ
>Q64761 ~~~~~~Fiber protein 1~~~
MTSPLTLSQRALALKTDSTLTLNTQGQLGVSLTPGDGLVLNTNGLSINADPQTLAFNNSGALEVNLDPDGPWSKTATGID
LRLDPTTLEVDNWELGVKLDPDEAIDSGPDGLCLNLDETLLLATNSTSGKTELGVHLNTSGPITADDQGIDLDVDPNTMQ
VNTGPSGGMLAVKLKSGGGLTADPDGISVTATVAPPSISATAPLTYTSGTIALTTDTQTMQVNSNQLAVKLKTGGGLTAD
ADGISVSVAPTPTISASPPLTYTNGQIGLSIGDQSLQVSSGQLQVKLKSQGGIQQSTQGLGVAVDQTLKIVSNTLEVNTD
PSGPLTSGNNGLSLAAVTPLAVSSAGVTLNYQSPLTVTSNSLGLSIAAPLQAGAQGLTVNTMEPLSASAQGIQLHYGQGF
QVVAGTLQLLTNPPIVVSSRGFTLLYTPAFTVSNNMLGLNVDGTDCVAISSAGLQIRKEAPLYVTSGSTPALALKYSSDF
TITNGALALANSGGGGSSTPEVATYHCGDNLLESYDIFASLPNTNAAKVAAYCRLAAAGGVVSGTIQVTSYAGRWPKVGN
SVTDGIKFAIVVSPPMDKDPRSNLSQWLGATVFPAGATTALFSPNPYGSLNTITTLPSIASDWYVPESNLVTYTKIHFKP
TGSQQLQLASGELVVAAAKSPVQTTKYELIYLGFTLKQNSSGTNFFDPNASSDLSFLTPPIPFTYLGYYQ
>P18048 ~~~~~~Fiber protein 2~~~
MKRTRIEDDFNPVYPYDTSSTPSIPYVAPPFVSSDGLQENPPGVLALKYTDPITTNAKHELTLKLGSNITLQNGLLSATV
PTVSPPLTNSNNSLGLATSAPIAVSANSLTLATAAPLTVSNNQLSINTGRGLVITNNAVAVNPTGALGFNNTGALQLNAA
GGMRVDGANLILHVAYPFEAINQLTLRLENGLEVTNGGKLNVKLGSGLQFDNNGRITISNRIQTRGVTSLTTIWSISPTP
NCSIYETQDANLFLCLTKNGAHVLGTITIKGLKGALREMNDNALSVKLPFDNQGNLLNCALESSTWRYQETNAVASNALT
FMPNSTVYPRNKTADPGNMLIQISPNITFSVVYNEINSGYAFTFKWSAEPGKPFHPPTAVFCYITEQ
>P16883 ~~~~~~Fiber protein 2~~~
MKRTRIEDDFNPVYPYDTFSTPSIPYVAPPFVSSDGLQEKPPGVLALKYTDPITTNAKHELTLKLGSNITLENGLLSATV
PTVSPPLTNSNNSLGLATSAPIAVSANSLTLATAAPLTVSNNQLSINAGRGLVITNNALTVNPTGALGFNNTGALQLNAA
GGMRVDGANLILHVAYPFEAINQLTLRLENGLEVTSGGKLNVKLGSGLQFDSNGRIAISNSNRTRSVPSLTTIWSISPTP
NCSIYETQDANLFLCLTKNGAHVLGTITIKGLKGALREMHDNALSLKLPFDNQGNLLNCALESSTWRYQETNAVASNALT
FMPNSTVYPRNKTAHPGNMLIQISPNITFSVVYNEINSGYAFTFKWSAEPGKPFHPPTAVFCYITEQ
>Q64762 ~~~~~~Fiber protein 2~~~
MADQKRKLADPDAEAPTGKMARAGPGELDLVYPFWYQVAAPTEITPPFLDPNGPLYSTDGLLNVRLTAPLVIIRQSNGNA
IGVKTDGSITVNADGALQIGISTAGPLTTTANGIDLNIDPKTLVVDGSSGKNVLGVLLKGQGALQSSAQGIGVAVDESLQ
IVDNTLEVKVDAAGPLAVTAAGVGLQYDNTQFKVTNGTLQLYQAPTSSVAAFTSGTIGLSSPTGNFVSSSNNPFNGSYFL
QQINTMGMLTTSLYVKVDTTTMGTRPTGAVNENARYFTVWVSSFLTQCNPSNIGQGTLEPSNISMTSFEPARNPISPPVF
NMNQNIPYYASRFGVLESYRPIFTGSLNTGSIDVRMQVTPVLATNNTTYNLIAFTFQCASAGLFNPTVNGTVAIGPVVHT
CPAARAPVTV
>P03275 ~~~L5~~~Fiber protein~~~
MKRARPSEDTFNPVYPYDTETGPPTVPFLTPPFVSPNGFQESPPGVLSLRVSEPLDTSHGMLALKMGSGLTLDKAGNLTS
QNVTTVTQPLKKTKSNISLDTSAPLTITSGALTVATTAPLIVTSGALSVQSQAPLTVQDSKLSIATKGPITVSDGKLALQ
TSAPLSGSDSDTLTVTASPPLTTATGSLGINMEDPIYVNNGKIGIKISGPLQVAQNSDTLTVVTGPGVTVEQNSLRTKVA
GAIGYDSSNNMEIKTGGGMRINNNLLILDVDYPFDAQTKLRLKLGQGPLYINASHNLDINYNRGLYLFNASNNTKKLEVS
IKKSSGLNFDNTAIAINAGKGLEFDTNTSESPDINPIKTKIGSGIDYNENGAMITKLGAGLSFDNSGAITIGNKNDDKLT
LWTTPDPSPNCRIHSDNDCKFTLVLTKCGSQVLATVAALAVSGDLSSMTGTVASVSIFLRFDQNGVLMENSSLKKHYWNF
RNGNSTNANPYTNAVGFMPNLLAYPKTQSQTAKNNIVSQVYLHGDKTKPMILTITLNGTSESTETSEVSTYSMSFTWSWE
SGKYTTETFATNSYTFSYIAQE
>P04501 ~~~L5~~~Fiber protein~~~
MAKRARLSTSFNPVYPYEDESSSQHPFINPGFISPDGFTQSPNGVLSLKCVNPLTTASGSLQLKVGSGLTVDTTDGSLEE
NIKVNTPLTKSNHSINLPIGNGLQIEQNKLCSKLGNGLTFDSSNSIALKNNTLWTGPKPEANCIIEYGKQNPDSKLTLIL
VKNGGIVNGYVTLMGASDYVNTLFKNKNVSINVELYFDATGHILPDSSSLKTDLELKYKQTADFSARGFMPSTTAYPFVL
PNAGTHNENYIFGQCYYKASDGALFPLEVTVMLNKRLPDSRTSYVMTFLWSLNAGLAPETTQATLITSPFTFSYIREDD
>P36844 ~~~L5~~~Fiber protein~~~
MSKSARGWSDGFDPVYPYDADNDRPCPSSTLPSFSSDGFQEKPLGVLSLGPGRPCHTKNGEITLKLGEGVDLDDSGKLIA
NTVNKAIAPLSFFQQHHFPLTWIPLYTPKMENYPYKFLPPLSILKSTILNTLVSAFGSGLGLSGSALAVQLASPLTFDDK
GNIKITLNRGLHVTTGDAIESNISWAKGIKFEDGAIATNIGKGSRFGTSSTETGVNNAYPIQVKLGSGLSFDSTGAIMAG
NKDYDKLTLWTTPDPSPNCQILAENDAKLTLCLTMCDSQILATVSVLVVRSGNLNPITGTVSSAQVFLRFDANGVLLTEH
STSKKYWGYKQGDSIDGTPYTNAVGFMPNSTAYPKTQSSTTKNNIVGQVYMNGDVSKPMLLTITLNGTDDTTSAYSMSFS
YTWTNGSYIGATFGANSYTFSYIAQQ
>P11818 ~~~L5~~~Fiber protein~~~
MKRARPSEDTFNPVYPYDTETGPPTVPFLTPPFVSPNGFQESPPGVLSLRLSEPLVTSNGMLALKMGNGLSLDEAGNLTS
QNVTTVSPPLKKTKSNINLEISAPLTVTSEALTVAAAAPLMVAGNTLTMQSQAPLTVHDSKLSIATQGPLTVSEGKLALQ
TSGPLTTTDSSTLTITASPPLTTATGSLGIDLKEPIYTQNGKLGLKYGAPLHVTDDLNTLTVATGPGVTINNTSLQTKVT
GALGFDSQGNMQLNVAGGLRIDSQNRRLILDVSYPFDAQNQLNLRLGQGPLFINSAHNLDINYNKGLYLFTASNNSKKLE
VNLSTAKGLMFDATAIAINAGDGLEFGSPNAPNTNPLKTKIGHGLEFDSNKAMVPKLGTGLSFDSTGAITVGNKNNDKLT
LWTTPAPSPNCRLNAEKDAKLTLVLTKCGSQILATVSVLAVKGSLAPISGTVQSAHLIIRFDENGVLLNNSFLDPEYWNF
RNGDLTEGTAYTNAVGFMPNLSAYPKSHGKTAKSNIVSQVYLNGDKTKPVTLTITLNGTQETGDTTPSAYSMSFSWDWSG
HNYINEIFATSSYTFSYIAQE
>P15141 ~~~L5~~~Fiber protein~~~
MSNFNSSPVPTIFMSFFQMTKRVRLSDSFNPVYPYEDESTSQHPFINPGFISPNGFTQSPDGVLTLKCLTPLTTTGGSLQ
LKVGGGLTIDDTDGFLKENISAATPLVKTGHSIGLSLGPGLGTNENKLCAKLGEGLTFNSNNICIDDNINTLWTGVNPTT
ANCQIMASSESNDCKLILTLVKTGGLVTAFVYVIGVSNDFNMLTTHKNINFTAELFFDSTGNLLTSLSSLKTPLNHKSGQ
NMATGALTNAKGFMPSTTAYPFNVNSREKENYIYGTCYYTASDHTAFPIDISVMLNQRALNNETSYCIRVTWSWNTGVAP
EVQTSATTLVTSPFTFYYIREDD
>P36845 ~~~L5~~~Fiber protein~~~
MTKRLRAEDDFNPVYPYGYARNQNIPFLTPPFVSSNGFQNFPPGVLSLKLADPITINNQNVSLKVGGGLTLQEETGKLTV
NTEPPLHLTNNKLGIALDAPFDVIDNKLTLLAGHGLSIITKETSTLPGLVNTLVVLTGKGIGTDLSNNGGNICVRVGEGG
GLSFNDNGDLVAFNKKEDKRTLWTTPDTSPNCRIDQDKDSKLSLVLTKCGSQILANVSLIVVAGRYKIINNNTNPALKGF
TIKLLFDKNGVLMESSNLGKSYWNFRNQNSIMSTAYEKAIGFMPNLVAYPKPTTGSKKYARDIVYGNIYLGGKPHQPVTI
KTTFNQETGCEYSITFDFSWAKTYVNVEFETTSFTFSYIAQE
>P68982 ~~~L5~~~Fiber protein~~~
MSKRLRVEDDFNPVYPYGYARNQNIPFLTPPFVSSDGFQNFPPGVLSLKLADPIAIVNGNVSLKVGGGLTLQDGTGKLTV
NADPPLQLTNNKLGIALDAPFDVIDNKLTLLAGHGLSIITKETSTLPGLRNTLVVLTGKGIGTESTDNGGTVCVRVGEGG
GLSFNNDGDLVAFNKKEDKRTLWTTPDTSPNCKIDQDKDSKLTLVLTKCGSQILANVSLIVVDGKYKIINNNTQPALKGF
TIKLLFDENGVLMESSNLGKSYWNFRNENSIMSTAYEKAIGFMPNLVAYPKPTAGSKKYARDIVYGNIYLGGKPDQPVTI
KTTFNQETGCEYSITFDFSWAKTYVNVEFETTSFTFSYIAQE
>P36711 ~~~L5~~~Fiber protein~~~
MKRSRTQYAEETEENDDFNPVYPFDPFDTSDVPFVTPPFTSSNGLQEKPPGVLALNYKDPIVTENGTLTLKLGDGIKLNA
QGQLTASNNINVLEPLTNTSQGLKLSWSAPLAVKASALTLNTRAPLTTTDESLALITAPPITVESSRLGLATIAPLSLDG
GGNLGLNLSAPLDVSNNNLHLTTETPLVVNSSGALSVATADPISVRNNALTLPTADPLMVSSDGLGISVTSPITVINGSL
ALSTTAPLNSTGSTLSLSVANPLTISQDTLTVSTGNGLQVSGSQLVTRIGDGLTFDNGVMKVNVAGGMRTSGGRIILDVN
YPFDASNNLSLRRGLGLIYNQSTNWNLTTDISTEKGLMFSGNQIALNAGQGLTFNNGQLRVKLGAGLIFDSNNNIALGSS
SNTPYDPLTLWTTPDPPPNCSLIQELDAKLTLCLTKNGSIVNGIVSLVGVKGNLLNIQSTTTTVGVHLVFDEQGRLITST
PTALVPQASWGYRQGQSVSTNTVTNGLGFMPNVSAYPRPNASEAKSQMVSLTYLQGDTSKPITMKVAFNGITSLNGYSLT
FMWSGLSNYINQPFSTPSCSFSYITQE
>P36847 ~~~L5~~~Fiber protein~~~
MSKRLRVEDDFNPVYPYGYARNQNIPFLTPPFVSSDGFQNFPPGVLSLKLADPIAIANGNVSLKMGGGLTLQEGTGNLTV
NTEPPLQLTNNRIGIALDAPFDVIGGKLTLLAGHGLSIITEETSPLPGLVNTLVVLTGKGLGTDTTDNGGSIRVRVGEGG
GLSFNEAGDLVAFNKKEDMRTLWTTPDPSPNCKIIEDKDSKLTLILTKCGSQILGSVSLLVVKGKFSNINNTTNPNEADK
QITVKLLFDANGVLKQGSTMDSSYWNYRSDNSNLSQPYKKAVGFMPSKTAYPKQTKPTNKEISQAKNKIVSNVYLGGKID
QPCVIIISFNEEADSDYSIVFYFKWYKTYENVQFDSSSFNFSYIAQE
>P35773 ~~~L5~~~Fiber protein~~~
MTKRVRLSDSFNPVYPYEDESTSQHPFINPGFISPNGFTQSPDGVLTLKCLTPLTTTGGSLQLKVGGGLTVDDTDGTLQE
NIGTTTPLVKTGHSIGLSLGAGLGTDENKLCTKLGKGLTFNSNNICIDDNINTLWTGINPTEANCQMMDSSESNDCKLIL
TLVKTGALVTAFVYVIGVSNNFNMLTTYRNINFTAELFFDSAGNLLTSLSSLKTPLNHKSGQNMATGAITNAKSFMPSTT
AYPFNNNSREKENYIYGTCHYTASDHTAFPIDISVMLNQRAIRADTSYCIRITWSWNTGDAPEGQTSATTLVTSPFTFYY
IREDD
>P68983 ~~~L5~~~Fiber protein~~~
MSKRLRVEDDFNPVYPYGYARNQNIPFLTPPFVSSDGFQNFPPGVLSLKLADPIAIVNGNVSLKVGGGLTLQDGTGKLTV
NADPPLQLTNNKLGIALDAPFDVIDNKLTLLAGHGLSIITKETSTLPGLRNTLVVLTGKGIGTESTDNGGTVCVRVGEGG
GLSFNNDGDLVAFNKKEDKRTLWTTPDTSPNCKIDQDKDSKLTLVLTKCGSQILANVSLIVVDGKYKIINNNTQPALKGF
TIKLLFDENGVLMESSNLGKSYWNFRNENSIMSTAYEKAIGFMPNLVAYPKPTAGSKKYARDIVYGNIYLGGKPDQPVTI
KTTFNQETGCEYSITFDFSWAKTYVNVEFETTSFTFSYIAQE
>P35774 ~~~L5~~~Fiber protein~~~
MTKRVRLSDSFNPVYPYEDESTSQHPFINPGFISPNGFTQSPNGVLTLKCLTPLTTTGGSLQLKVGGGLTVDDTNGFLKE
NISATTPLVKTGHSIGLPLGAGLGTNENKLCIKLGQGLTFNSNNICIDDNINTLWTGVNPTEANCQIMNSSESNDCKLIL
TLVKTGALVTAFVYVIGVSNNFNMLTTHRNINFTAELFFDSTGNLLTRLSSLKTPLNHKSGQNMATGAITNAKGFMPSTT
AYPFNDNSREKENYIYGTCYYTASDRTAFPIDISVMLNRRAINDETSYCIRITWSWNTGDAPEVQTSATTLVTSPFTFYY
IREDD
>P36848 ~~~L5~~~Fiber protein~~~
MKRSRTQYAEEPEENDDFNPVYPFDPYDTAHVPFVTPPFTSSNAFQEKPPGVLSLNYKDPIVTENGSLTLKLGNGIKLNS
QGQLTTTNTKVLEPLPHTSQGLTLSWSAPLSVKASALTLNTMAPFTTTNESLSLVTAPPITVEASQLGLASCSTSKLRGG
GNLGFHLPAPFVVPSSNALTLSASDPLTVNSNSLGLNITSPITLINGSLALATSPPLDTTGSTLNLSVAAPLSVSQNALT
VSTGNGLQVSGSQLVTRIGDGLRFDNGVIKAHVAGGNETLRGKIILDVNYPFDATTNLSLRRGSGLIYNESTNWNLTTDI
STEKGLTFSGNQIAINAGPCGLTFNNRKLQVKLGAGHTFSSNDNIALNSIATPYDPLTLWTTPDPPPNCTLRQELDAKLT
LCLTKNESIVNGIVSLIGVKGDLLHIQPTTTTVGLHLVFDRQGRLVTTTPTALVPQASWGYKQGQSVSSSAVANALGFMP
NVSAYPRPNAGEAKSQMLSQTYLQGDTTKPITMKVVFNGNATVDGYSLTFMWTGVSNYLNQQFSTPSCSFSYIAQE
>Q03553 ~~~L5~~~Fiber protein~~~
MKRSVPQDFNLVYPYKAKRPNIMPPFFDRNGFVENQEATLAMLVEKPLTFDKEGALTLGVGRGIRINPAGLLETNDLASA
VFPPLASDEAGNVTLNMSDGLYTKDNKLAVKVGPGLSLDSNNALQVHTGDGLTVTDDKVSLNTQAPLSTTSAGLSLLLGP
SLHLGEEERLTVNTGAGLQISNNALAVKVGSGITVDAQNQLAASLGDGLESRDNKTVVKAGPGLTITNQALTVATGNGLQ
VNPEGQLQLNITAGQGLNFANNSLAVELGSGLHFPPGQNQVSLYPGDGIDIRDNRVTVPAGPGLRMLNHQLAVASGDGLE
VHSDTLRLKLSHGLTFENGAVRAKLGPGLGTDDSGRSVVRTGRGLRVANGQVQIFSGRGTAIGTDSSLTLNIRAPLQFSG
PALTASLQGSGPITYNSNNGTFGLSIGPGMWVDQNRLQVNPGAGLVFQGNNLVPNLADPLAISDSKISLSLGPGLTQASN
ALTLSLGNGLEFSNQAVAIKAGRGLRFESSSQALESSLTVGNGLTLTDTVIRPNLGDGLEVRDNKIIVKLGANLRFENGA
VTAGTVNPSAPEAPPTLTAEPPLRASNSHLQLSLSEGLVVHNNALALQLGDGMEVNQHGLTLRVGSGLQMRDGILTVTPS
GTPIEPRLTAPLTQTENGIGLALGAGLELDESALQVKVGPGMRLNPVEKYVTLLLGPGLSFGQPANRTNYDVRVSVEPPM
VFGQRGQLTFLVGHGLHIQNSKLQLNLGQGLRTDPVTNQLEVPLGQGLEIADESQVRVKLGDGLQFDSQARITTAPNMVT
ETLWTGTGSNANVTWRGYTAPGSKLFLSLTRFSTGLVLGNMTIDSNASFGQYINAGHEQIECFILLDNQGNLKEGSNLQG
TWEVKNNPSASKAAFLPSTALYPILNESRGSLPGKNLVGMQAILGGGGTCTVIATLNGRRSNNYPAGQSIIFVWQEFNTI
ARQPLNHSTLTFSYWT
>Q65961 ~~~L5~~~Fiber protein~~~
MKRTRRSLPANFDPVYPYDAPKPSTQPPFFNDRKGLTESSPGTLAVNISPPLTFSNLGAIKLSTGAGLILKEGKLEANIG
PGLTTNQEGQITVEKDSDGLTFTSPLHKIENTVSLSIGEGLEDESGTLKVNFPSPPPPLLFSPPLAEAGGTVSLPLQESM
QVTEGKLGVKPTTYSPPLQKTDQQVSLRVGPGLTVLNGQLQAVQPPATTYKEPLLETENSVSLKVGAGLAVQDGALVATP
PNVTFSAPLEKNGNAVSVRVGAGLSIQGNALVATTSPTLTFAYPLIKNNNHITLSAGSGLRVSGGSLTVATGPGLSHING
TIAAVIGAGLKFENNAILAKLGNGLTIRDGAIEAVAPQPSFTPVTLWTGPDPNVNASINGTPVIRSFISLTRDSNLVTVN
ASFTGEGSYQSVSPTQSQFSLILEFNQFGQLMSTGNLNSTTTWGEKPWGNNTVQVQPSHTWKLCMPNREVYSTPAATLTS
CGLNSIAHDGAPNRSIDCMLIINKLRGAATYTLTFRFLNFNKLSSSTVFKTDVLTFTYVGENQ
>P22230 ~~~L5~~~Fiber protein~~~
MKRTRSALPANFDPVYPYDAPKPSTQPPFFNDRKGLTESSPGTLAVNISPPLTFSNLGAIKLSTGAGLILKEGKLEANIG
PGLTTNQEGQITVEKDSDGLTFTSPLHKIENTVSLSIGEGLEDESGTLKVNFPSPPPPLLFSPPLAEAGGTVSLPLQESM
QVTEGKLGVKPTTYSPPLQKTDQQVSLRVGPGLTVLNGQLQAVQPPATTYKEPLLETENSVSLKVGAGLAVQDGALVATP
PNVTFSAPLEKNGNAVSVRVGAGLSIQGNALVATTSPTLTFAYPLIKNNNHITLSAGSGLRVSGGSLTVATGPGLSHING
TIAAVIGAGLKFENNAILAKLGNGLTIRDGAIEAVAPQPSFTPVTLWTGPDPNVNASINGTPVIRSFISLTRDSNLVTVN
ASFTGEGSYQSVSPTQSQFSLILEFNQFGQLMSTGNLNSTTTWGEKPWGNNTVQVQPSHTWKLCMPNREVYSTPAATLTS
CGLNSIAHDGAPNRSIDCMLIINKLRGAATYTLTFRFLNFNKLSSSTVFKTDVLTFTYVGENQ
>Q96689 ~~~L5~~~Fiber protein~~~
MKRTRSALPANFDPVYPYDAPKPSTQPPFFNDRKGLTESSPGTLAVNISPPLTFSNLGAIKLSTGAGLILKEGKLEANIG
PGLTTNQEGQITVEKDSDGLTFTSPLHKIENTVSLSIGEGLEDESGTLKVNFPSPPPPLLFSPPLAEAGGTVSLPLQESM
QVTEGKLGVKPTTYSPPLQKTDQQVSLRVGPGLTVLNGQLQAVQPPATTYKEPLLETENSVSLKVGAGLAVQDGALVATP
PNVTFSAPLEKNGNAVSVRVGAGLSIQGNALVATTSPTLTFAYPLIKNNNHITLSAGSGLRVSGGSLTVATGPGLSHING
TIAAVIGAGLKFENNAILAKLGNGLTIRDGAIEAVAPQPSFTPVTLWTGPDPNVNTSINGTPVIRSFISLTRDSNLVTVN
ASFTGEGSYQSVSPTQSQFSLILEFNQFGQLMSTGNLNSTTTWGEKPWGNNTVQVQPSHTWKLCMPNREVYSTPAATLTS
CGLNSIAHDGAPNRSIDCMLIINKLAGAATYTLTFRFLNFNKLSSSTVFKTDVLTFTYVGENQ
>Q65914 ~~~L5~~~Fiber protein~~~
MKRTRRALPANYDPVYPYDAPGSSTQPPFFNNKQGLTESPPGTLAVNVSPPLTFSTLGAIKLSTGPGLTLNEGKLQASLG
PGLITNTEGQITVENVNKVLSFTSPLHKNENTVSLALGDGLEDENGTLKVTFPTPPPPLQFSPPLTKTGGTVSLPLQDSM
QVTNGKLGVKPTTYAPPLKKTDQQVSLQVGSGLTVINEQLQAVQPPATTYNEPLSKTDNSVSLQVGAGLAVQSGALVATP
PPPLTFTSPLEKNENTVSLQVGAGLSVQNNALVATPPPPLTFAYPLVKNDNHVALSAGSGLRISGGSLTVATGPGLSHQN
GTIGAVVGAGLKFENNAILAKLGNGLTIRDGAIEATQPPAAPITLWTGPGPSINGFINDTPVIRCFICLTRDSNLVTVNA
SFVGEGGYRIVSPTQSQFSLIMEFDQFGQLMSTGNINSTTTWGEKPWGNNTVQPRPSHTWKLCMPNREVYSTPAATISRC
GLDSIAVDGAPSRSIDCMLIINKPKGVATYTLTFRFLNFNRLSGGTLFKTDVLTFTYVGENQ
>P19721 ~~~L5~~~Fiber protein~~~
MVEALNAVYPYDLALLPEDYEKTTAPDAVQAANAARPFLNPVYPYQQPVAGDFGFPIVMPPFFNSYDFTSIHGNTLSLRL
NKPLKRTAKGLQLLLGSGLSVNADGQLESSEGISEADAPLQINDGVLQLSFGEGLSVNDHGELESKGKVEAVTLPLALQD
HVMSLSFGQGLQVNDQGQLEALAMVHSTSAPLKVTNNNLELALGRGLIVDDQGQLRLAPNLLWPESPLAIEQGTNHLILF
YNQSLDVEDGKLTLPEPFDPLTLDGGRLRMQLAPNSGLAVTEKGSLGINWGEGIQVKEQKITLKVTPANGLAVTEQGGLN
INWGNGIKVDEQKVTLKTSNEFALTENGLYLTSPLNPIEVNQHGQLGIALGYGFHAHRGYLELTPQTLWTGLPIGNNGTF
HTKQDCKIFLSLTRLGPMVHGTFMLQAPQYELTTNGMREITFSFNSTGGLEQPAPVTYWGALDPPPTAKAAEIENQKRVK
KRAAPDPPVEPPPKRRGDLAVLFAKVAEQAMELAKEQAVQAQPPEHVNTDWADHMNLLRFMPNTLVYPTAATIAANLQFH
DTRLSLRRATLKIRLNGSPDSAYQLGFMLELVGTQSASIVTDTISFWYYAEDY
>Q83457 ~~~L5~~~Fiber protein~~~
MGPKKQKRELPEDFDPVYPYDVPQLQINPPFVSGDGFNQSVDGVLSLHIAPPLVFDNTRALTLAFGGGLQLSGKQLVVAT
EGSGLTTNPDGKLVLKVKSPITLTAEGISLSLGPGLSNSETGLSLQVTAPLQFQGNALTLPLAAGLQNTDGGMGVKLGSG
LTTDNSQAVTVQVGNGLQLNGEGQLTVPATAPLVSGSAGISFNYSSNDFVLDNDSLSLRPKAISVTPPLQSTEDTISLNY
SNDFSVDNGALTLAPTFKPYTLWTGASPTANVILTNTTTPNGTFFLCLTRVGGLVLGSFALKSSIDLTSMTKKVNFIFDG
AGRLQSDSTYKGRFGFRSNDSVIEPTAAGLSPAWLMPSTFIYPRNTSGSSLTSFVYINQTYVHVDIKVNTLSTNGYSLEF
NFQNMSFSAPFSTSYGTFCYVPRRTTHRPRHGPFSLRERRHLFQLLQQ
>A9CB96 ~~~L5~~~Fiber protein~~~
MKKIKRSAADPDPVYPFGDEVPIPLPPFLVPGGGLTTDGLSLAVQTVDPLNVTLGGVGLKIGDGLSVVDGKLTSEAKIVA
DPPLQQSGDTLSLSTDSSMMVLPSGQLTINNLPSISVTSSGVGLVSPNAPLQLMSNGALQLSVGGGLTVGAQGSLQISTG
VGVNVNAAGVLESYPLPPLVWDYSSKSLTLDIGPGLTVVNGKLQVIGATFSNQMSRMAPAPRADLQSNSIEPLPSPPSKT
SLDIAEELQNDKGVSFAFQAREEELGAFTKRTLFAYSGDGLTGPFKAPASAELSSFLTAHPKGRWLIAFPLGTGIVSVDE
GILTLEISRSLPEVGSGSSSTSLKVISIYFMDLFFPVPFIDRASHPAPRRSNNSRQLFHSKQRLFLKVKDFKKRSWYSSL
FTLINLNIQECPELS
>A3EX94 ~~~S~~~Spike glycoprotein~~~
MTLLMCLLMSLLIFVRGCDSQFVDMSPASNTSECLESQVDAAAFSKLMWPYPIDPSKVDGIIYPLGRTYSNITLAYTGLF
PLQGDLGSQYLYSVSHAVGHDGDPTKAYISNYSLLVNDFDNGFVVRIGAAANSTGTIVISPSVNTKIKKAYPAFILGSSL
TNTSAGQPLYANYSLTIIPDGCGTVLHAFYCILKPRTVNRCPSGTGYVSYFIYETVHNDCQSTINRNASLNSFKSFFDLV
NCTFFNSWDITADETKEWFGITQDTQGVHLYSSRKGDLYGGNMFRFATLPVYEGIKYYTVIPRSFRSKANKREAWAAFYV
YKLHQLTYLLDFSVDGYIRRAIDCGHDDLSQLHCSYTSFEVDTGVYSVSSYEASATGTFIEQPNATECDFSPMLTGVAPQ
VYNFKRLVFSNCNYNLTKLLSLFAVDEFSCNGISPDSIARGCYSTLTVDYFAYPLSMKSYIRPGSAGNIPLYNYKQSFAN
PTCRVMASVLANVTITKPHAYGYISKCSRLTGANQDVETPLYINPGEYSICRDFSPGGFSEDGQVFKRTLTQFEGGGLLI
GVGTRVPMTDNLQMSFIISVQYGTGTDSVCPMLDLGDSLTITNRLGKCVDYSLYGVTGRGVFQNCTAVGVKQQRFVYDSF
DNLVGYYSDDGNYYCVRPCVSVPVSVIYDKSTNLHATLFGSVACEHVTTMMSQFSRLTQSNLRRRDSNIPLQTAVGCVIG
LSNNSLVVSDCKLPLGQSLCAVPPVSTFRSYSASQFQLAVLNYTSPIVVTPINSSGFTAAIPTNFSFSVTQEYIETSIQK
VTVDCKQYVCNGFTRCEKLLVEYGQFCSKINQALHGANLRQDESVYSLYSNIKTTSTQTLEYGLNGDFNLTLLQVPQIGG
SSSSYRSAIEDLLFDKVTIADPGYMQGYDDCMKQGPQSARDLICAQYVSGYKVLPPLYDPNMEAAYTSSLLGSIAGAGWT
AGLSSFAAIPFAQSMFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTSNLAFSKVQDAVNANAQALSKLASELS
NTFGAISSSISDILARLDTVEQDAQIDRLINGRLISLNAFVSQQLVRSETAARSAQLASDKVNECVKSQSKRNGFCGSGT
HIVSFVVNAPNGFYFFHVGYVPTNYTNVTAAYGLCNNNNPPLCIAPIDGYFITNQTTTYSVDTEWYYTGSSFYKPEPITQ
ANSRYVSSDVKFDKLENNLPPPLLENSTDVDFKDELEEFFKNVTSHGPNFAEISKINTTLLDLSDEMAMLQEVVKQLNDS
YIDLKELGNYTYYNKWPWYVWLGFIAGLVALLLCVFFLLCCTGCGTSCLGKMKCKNCCDSYEEYDVEKIHVH
>A3EXD0 ~~~S~~~Spike glycoprotein~~~
MIRSVLVLMCSLTFIGNLTRGQSVDMGHNGTGSCLDSQVQPDYFESVHTTWPMPIDTSKAEGVIYPNGKSYSNITLTYTG
LYPKANDLGKQYLFSDGHSAPGRLNNLFVSNYSSQVESFDDGFVVRIGAAANKTGTTVISQSTFKPIKKIYPAFLLGHSV
GNYTPSNRTGRYLNHTLVILPDGCGTILHAFYCVLHPRTQQNCAGETNFKSLSLWDTPASDCVSGSYNQEATLGAFKVYF
DLINCTFRYNYTITEDENAEWFGITQDTQGVHLYSSRKENVFRNNMFHFATLPVYQKILYYTVIPRSIRSPFNDRKAWAA
FYIYKLHPLTYLLNFDVEGYITKAVDCGYDDLAQLQCSYESFEVETGVYSVSSFEASPRGEFIEQATTQECDFTPMLTGT
PPPIYNFKRLVFTNCNYNLTKLLSLFQVSEFSCHQVSPSSLATGCYSSLTVDYFAYSTDMSSYLQPGSAGAIVQFNYKQD
FSNPTCRVLATVPQNLTTITKPSNYAYLTECYKTSAYGKNYLYNAPGAYTPCLSLASRGFSTKYQSHSDGELTTTGYIYP
VTGNLQMAFIISVQYGTDTNSVCPMQALRNDTSIEDKLDVCVEYSLHGITGRGVFHNCTSVGLRNQRFVYDTFDNLVGYH
SDNGNYYCVRPCVSVPVSVIYDKASNSHATLFGSVACSHVTTMMSQFSRMTKTNLLARTTPGPLQTTVGCAMGFINSSMV
VDECQLPLGQSLCAIPPTTSSRVRRATSGASDVFQIATLNFTSPLTLAPINSTGFVVAVPTNFTFGVTQEFIETTIQKIT
VDCKQYVCNGFKKCEDLLKEYGQFCSKINQALHGANLRQDESIANLFSSIKTQNTQPLQAGLNGDFNLTMLQIPQVTTGE
RKYRSTIEDLLFNKVTIADPGYMQGYDECMQQGPQSARDLICAQYVAGYKVLPPLYDPYMEAAYTSSLLGSIAGASWTAG
LSSFAAIPFAQSIFYRLNGVGITQQVLSENQKIIANKFNQALGAMQTGFTTTNLAFNKVQDAVNANAMALSKLAAELSNT
FGAISSSISDILARLDTVEQEAQIDRLINGRLTSLNAFVAQQLVRTEAAARSAQLAQDKVNECVKSQSKRNGFCGTGTHI
VSFAINAPNGLYFFHVGYQPTSHVNATAAYGLCNTENPQKCIAPIDGYFVLNQTTSTVADSDQQWYYTGSSFFHPEPITE
ANSKYVSMDVKFENLTNRLPPPLLSNSTDLDFKEELEEFFKNVSSQGPNFQEISKINTTLLNLNTELMVLSEVVKQLNES
YIDLKELGNYTFYQKWPWYIWLGFIAGLVALALCVFFILCCTGCGTSCLGKLKCNRCCDSYDEYEVEKIHVH
>A3EXG6 ~~~S~~~Spike glycoprotein~~~
MLLILVLGVSLAAASRPECFNPRFTLTPLNHTLNYTSIKAKVSNVLLPDPYIAYSGQTLRQNLFMADMSNTILYPVTPPA
NGANGGFIYNTSIIPVSAGLFVNTWMYRQPASSRAYCQEPFGVAFGDTFENDRIAILIMAPDNLGSWSAVAPRNQTNIYL
LVCSNATLCINPGFNRWGPAGSFIAPDALVDHSNSCFVNNTFSVNISTSRISLAFLFKDGDLLIYHSGWLPTSNFEHGFS
RGSHPMTYFMSLPVGGNLPRAQFFQSIVRSNAIDKGDGMCTNFDVNLHVAHLINRDLLVSYFNNGSVANAADCADSAAEE
LYCVTGSFDPPTGVYPLSRYRAQVAGFVRVTQRGSYCTPPYSVLQDPPQPVVWRRYMLYDCVFDFTVVVDSLPTHQLQCY
GVSPRRLASMCYGSVTLDVMRINETHLNNLFNRVPDTFSLYNYALPDNFYGCLHAFYLNSTAPYAVANRFPIKPGGRQSN
SAFIDTVINAAHYSPFSYVYGLAVITLKPAAGSKLVCPVANDTVVITDRCVQYNLYGYTGTGVLSKNTSLVIPDGKVFTA
SSTGTIIGVSINSTTYSIMPCVTVPVSVGYHPNFERALLFNGLSCSQRSRAVTEPVSVLWSASATAQDAFDTPSGCVVNV
ELRNTTIVNTCAMPIGNSLCFINGSIATANADSLPRLQLVNYDPLYDNSTATPMTPVYWVKVPTNFTLSATEEYIQTTAP
KITIDCARYLCGDSSRCLNVLLHYGTFCNDINKALSRVSTILDSALLSLVKELSINTRDEVTTFSFDGDYNFTGLMGCLG
PNCGATTYRSAFSDLLYDKVRITDPGFMQSYQKCIDSQWGGSIRDLLCTQTYNGIAVLPPIVSPAMQALYTSLLVGAVAS
SGYTFGITSAGVIPFATQLQFRLNGIGVTTQVLVENQKLIASSFNNALVNIQKGFTETSIALSKMQDVINQHAAQLHTLV
VQLGNSFGAISSSINEIFSRLEGLAANAEVDRLINGRMMVLNTYVTQLLIQASEAKAQNALAAQKISECVKAQSLRNDFC
GNGTHVLSIPQLAPNGVLFIHYAYTPTEYAFVQTSAGLCHNGTGYAPRQGMFVLPNNTNMWHFTTMQFYNPVNISASNTQ
VLTSCSVNYTSVNYTVLEPSVPGDYDFQKEFDKFYKNLSTIFNNTFNPNDFNFSTVDVTAQIKSLHDVVNQLNQSFIDLK
KLNVYEKTIKWPWYVWLAMIAGIVGLVLAVIMLMCMTNCCSCFKGMCDCRRCCGSYDSYDDVYPAVRVNKKRTV
>Q3I5J5 ~~~S~~~Spike glycoprotein~~~
MKILILAFLASLAKAQEGCGIISRKPQPKMAQVSSSRRGVYYNDDIFRSNVLHLTQDYFLPFDSNLTQYFSLNVDSDRFT
YFDNPILDFGDGVYFAATEKSNVIRGWIFGSTFDNTTQSAVIVNNSTHIIIRVCNFNLCKEPMYTVSRGAQQSSWVYQSA
FNCTYDRVEKSFQLDTAPKTGNFKDLREYVFKNRDGFLSVYQTYTAVNLPRGLPIGFSVLRPILKLPFGINITSYRVVMA
MFSQTTSNFLPESAAYYVGNLKYTTFMLSFNENGTITNAIDCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVIRF
PNITNRCPFDKVFNATRFPNVYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
VAPGETGVIADYNYKLPDDFTGCVIAWNTAKQDQGQYYYRSHRKTKLKPFERDLSSDENGVRTLSTYDFYPSVPVAYQAT
RVVVLSFELLNAPATVCGPKLSTQLVKNQCVNFNFNGLKGTGVLTESSKRFQSFQQFGRDTSDFTDSVRDPQTLEILDIS
PCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPAAIHADQLTPAWRVYSTGTNVFQTQAGCLIGAEHVNASYECDIPIGA
GICASYHTASTLRSVGQKSIVAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVMPVSMAKTSVDCTMYICGDSLECSNL
LLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQMYKTPAIKDFGGFNFSQILPDPSKPTKRSFIEDLLFNKVTLADAG
FMKQYGECLGDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQMAYRFNGIGVT
QNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQ
IDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQE
RNFTTAPAICHEGKAYFPREGVFVSNGTSWFITQRNFYSPQIITTDNTFVAGSCDVVIGIINNTVYDPLQPELDSFKEEL
DKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTI
LLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
>P23052 ~~~S~~~Spike glycoprotein~~~
MFLCFCAATVLCFWINSGGADVVPNGTLIFSEPVPYPFSLDVLRSFSQHVVLRNKRAVTTISWSYSYQITTSSLSVNSWY
VTFTAPLGWNYYTGQSFGTVLNQNAMMRASQSTFTYDVISYVGQRPNLDCQVNSLVNGGLDGWYSTVRVDNCFNAPCHVG
GRPGCSIGLPYMSNGVCTRVLSTTQSPGLQYEIYSGQQFAVYQITPYTQYTITMPSGILGYCQQTPLYVDCGTWTPFRVH
SYGCDKVTQNCKYTLTSNWVVAFQNKATAVILPSELIVPVAQKVTRRLGVNTPDYFWLVKQAYHYLSQANLSPNYALFSA
LCNSLYQQSATLSTLCFGSPFFVAQECYNNALYLPDAVFTTLFSTLFSWDYQINYPLNQVLTQNETFLQLPATNYQGQTL
SQGRMLNLFKDAIVFLDFFDTKFYRTNDAPSSDIFVVVARQAQLIRYGNFRIEQINGYFQVKCSSNIISTLEPHPAGVIM
IARHHSMWSVAARNSTSFYCVTHSLTTFGKLDISTSWFFHTLALPSGPVSQVSMPLLSTAAVGVYMHPMIEHWIPLLTLA
QSQYQPSFFNIGINKTITLTTQLQAYAQVYTAWFLSVIYVRLPEARRLTLGVQLVPFIQALLSIKQADLDATDVDSVARY
NVLSLMWGRKYAVVNYNQLPEWTYPLFKGEIGESMWFRKKIMPTTEGCQTSAHFSSITGYLQFSDYVYIPKYNKVSCPIS
TLAPSVLQVYEVQSLFVILIQCVSGSYDWYPGLSGGTAFVYKSYKLGTVCVLLPSDVLSTGPNIGFYSGTALSIVTVQTT
NDVLPNCIGLVQDNIFTPCHPSGCPVRNSYDNYIVCFDSSTYTFKNYHRTTPPVMNVPIQEVPLQMEIPTVILQSYELKH
TESVLLQDIEGGIIVDHNTGSIWYPDGQAYDVSFYVSVIIRYAPPKLELPSTLANFTSCLDYICFGNYQCRTEAQTFCTS
MDYFEQVFNKSLISLKTALQDLHYVLKLVLPETTLELTELTRRRRRAVYEFDDTISLLSESFERFMSPASQAYMANMLWW
DDAFDGFSLPQRTGSILSRSPSLSSVSSWNSYTSRTPLISNVKTPKTTFNVKLSMPKLPKASTLSKIGSVLSSGLSIASL
GLSIFSIVEDRRVTELTQQQIMALEDQITILTDYTEKNFKEIQSSLNTLGQQVQDFSQQVTMSLQQLSNGLEQITQQLDK
SIYYVTATQQYATYMSSLINHLTELAAAVYKTQDMYVTCIHSLQSGVLSPNCITPSQIFQLYQVARNLSGQCQPIFSERE
VSRFYSLPLVTDAMVHNDTYWFSWSIPITCSNIQGSVYKVQPGYIVNPTHPTSLQYDLPSHVVTSNAGALRFDDHYCDRY
NQVYLCTKSAFDLQPSNYLTMLYSNISENVSLTFHPEPRPDPCVYLSSSALYCYYSDQCNQCVVAVGNCSNQTVTYRNYT
YPIMDPQCRGFDQITISSPIDIGVDFTALPSRPPLPLHLSYVNVTFNVTIPHGLNWTDLVLDYSFKDKIYEISKNITDLH
QQILQVSSWASGWFQRIRDFLYNLLPTWITWLTLGFSLFSIVISGINIILFFEMNGKVKKS
>P31340 ~~~V~~~Spike protein~~~
MNTLANIQELARALRNMIRTGIIVETDLNAGRCRVQTGGMCTDWLQWLTHRAGRSRTWWAPSVGEQVLILAVGGELDTAF
VLPGIYSGDNPSPSVSADALHIRFPDGAVIEYEPETSALTVSGIKTASVTASGSVTATVPVVMVKASTRVTLDTPEVVCT
NRLITGTLEVQKGGTMRGNIEHTGGELSSNGKVLHTHKHPGDSGGTTGSPL
>Q9XJR3 ~~~I~~~Spike protein P1~~~
MIVKKKLAAGEFAETFKNGNNITIIKAVGELVLRAYGADGGEGLRTIVRQGVSIKGMNYTSVMLHTEYAQEIEYWVGDLD
YSFQEQTTKSRDVNSFQIPLRDGVRELLPEDASRNRASIKSPVDIWIGGENMTALNGIVDGGRKFEAGQEFQINTFGSVN
YWVSDEEIRVFKEYSARAKYAQNEGRTALEANNVPFFDIDVPPELDGVPFSLKARVRHKSKGVDGLGDYTSISVKPAFYI
TEGDETTDTLIKYTSYGSTGSHSGYDFDDNTLDVMVTLSAGVHRVFPVETELDYDAVQEVQHDWYDESFTTFIEVYSDDP
LLTVKGYAQILMERT
>P25191 ~~~S~~~Spike glycoprotein~~~
MFLILLISLPMAFAVIGDLKCTTVSINDVDTGAPSISTDIVDVTNGLGTYYVLDRVYLNTTLLLNGYYPTSGSTYRNMAL
KGTLLLSRLWFKPPFLSDFINGIFAKVKNTKVIKKGVMYSEFPAITIGSTFVNTSYSVVVQPHTTNLDNKLQGLLEISVC
QYTMCEYPHTICHPNLGNKRVELWHWDTGVVSCLYKRNFTYDVNADYLYFHFYQEGGTFYAYFTDTGVVTKFLFNVYLGT
VLSHYYVLPLTCNSAMTLEYWVTPLTSKQYLLAFNQDGVIFNAVDCKSDFMSEIKCKTLSIAPSTGVYELNGYTVQPIAD
VYRRIPNLPDCNIEAWLNDKSVPSPLNWERKTFSNCNFNMSCLMSFIQADSFTCNNIDAAKIYGMCFSSITIDKFAIPNG
RKVDLQLGNLGYLQSFNYRIDTTATSCQLYYNLPAANVSVSRFNPSTWNRRFGFTEQSVFKPQPVGVFTHHDVVYAQHCF
KAPTNFCPCKLDGSLCVGNGPGIDAGYKNSGIGTCPAGTNYLTCHNAAQCDCLCTPDPITSKSTGPYKCPQTKYLVGIGE
HCSGLAIKSDYCGGNPCTCQPQAFLGWSVDSCLQGDRCNIFANFILHDVNSGTTCSTDLQKSNTDIILGVCVNYDLYGIT
GQGIFVEVNAPYYNSWQNLLYDSNGNLYGFRDYLTNRTFMIRSCYSGRVSAAFHANSSEPALLFRNIKCNYVFNNTLSRQ
LQPINYFDSYLGCVVNADNSTSSVVQTCDLTVGSGYCVDYSTKRRSRRAITTGYRFTNFEPFTVNSVNDSLEPVGGLYEI
QIPSEFTIGNMEEFIQTSSPKVTIDCSAFVCGDYAACKSQLVEYGSFCDNINAILTEVNELLDTTQLQVANSLMNGVTLS
TKLKDGVNFNVDDINFSPVLGCLGSDCNKVSSRSAIEDLLFSKVKLSDVGFVEAYNNCTGGAEIRDLICVQSYNGIKVLP
PLLSVNQISGYTLAATSASLFPPWSAAAGVPFYLNVQYRINGIGVTMDVLSQNQKLIANAFNNALDAIQEGFDATNSALV
KIQAVVNANAEALNNLLQQLSNRFGAISSSLQEILSRLDALEAQAQIDRLINGRLTALNAYVSQQLSDSTLVKFSAAQAM
EKVNECVKSQSSRINFCGNGNHIISLVQNAPYGLYFIHFSYVPTKYVTAKVSPGLCIAGDRGIAPKSGYFVNVNNTWMFT
GSGYYYPEPITGNNVVVMSTCAANYTKAPDVMLNISTPNLHDFKEELDQWFKNQTSVAPDLSLDYINVTFLDLQDEMNRL
QEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGFAGVAMLVLLFFICCCTGCGTSCFKICGGCCDDYTGHQELVIKTS
HDD
>P36300 ~~~S~~~Spike glycoprotein~~~
MIVLTLCLFLFLYSSVSCTSNNDCVQVNVTQLPGNENIIKDFLFQNFKEEGSLVVGGYYPTEVWYNCSTTQQTTAYKYFS
NIHAFYFDMEAMENSTGNARGKPLLVHVHGNPVSIIVYISAYRDDVQFRPLLKHGLLCITKNDTVDYNSFTINQWRDICL
GDDRKIPFSVVPTDNGTKLFGLEWNDDYVTAYISDESHRLNINNNWFNNVTLLYSRTSTATWQHSAAYVYQGVSNFTYYK
LNKTAGLKSYELCEDYEYCTGYATNVFAPTSGGYIPDGFSFNNWFMLTNSSTFVSGRFVTNQPLLVNCLWPVPSFGVAAQ
EFCFEGAQFSQCNGVSLNNTVDVIRFNLNFTTDVQSGMGATVFSLNTTGGVILEISCYNDTVSESSFYSYGEIPFGVTDG
PRYCYVLYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNFFSTFPIDCIAFNLTTGASGAFWTIAYTSYTEALVQVENT
AIKKVTYCNSHINNIKCSQLTANLQNGFYPVASSEVGLVNKSVVLLPSFYSHTSVNITIDLGMKRSVTVTIASPLSNITL
PMQDNNIDVYCIRSNQFSVYVHSTCKSSLWDNNFNSACTDVLDATAVIKTGTCPFSFDKLNNYLTFNKFCLSLNPVGANC
KLDVAARTRTNEQVFGSLYVIYEEGDNIVGVPSDNSGLHDLSVLHLDSCTDYNIYGRTGVGIIRKTNSTLLSGLYYTSLS
GDLLGFKNVSDGVVYSVTPCDVSAQAAVIDGAIVGAMTSINSELLGLTHWTTTPNFYYYSIYNYTNVMNRGTAIDNDIDC
EPIITYSNIGVCKNGALVFINVTHSDGDVQPISTGNVTIPTNFTISVQVEYIQVYTTPVSIDCARYVCNGNPRCNKLLTQ
YVSACQTIEQALAMGARLENMEIDSMLFVSENALKLASVEAFNSTENLDPIYKEWPNIGGSWLGGLKDILPSHNSKRKYR
SAIEDLLFDKVVTSGLGTVDEDYKRSAGGYDIADLVCARYYNGIMVLPGVANDDKMTMYTASLTGGITLGALSGGAVAIP
FAVAVQARLNYVALQTDVLNKNQQILANAFNQAIGNITQAFGKVNDAIHQTSKGLATVAKALAKVQDVVNTQGQALSHLT
VQLQNNFQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKVNECVRSQSQRFGFC
GNGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICASDGSRTFGLVVEDVQLTLFRNLDEKFYLTPRTMYQPRVAT
SSDFVQIEGCDVLFVNGTVIELPSIIPDYIDINQTVQDILENFRPNWTVPELPLDIFHATYLNLTGEINDLEFRSEKLHN
TTVELAILIDNINNTLVNLEWLNRIETYVKWPWYVWLLIGLVVIFCIPILLFCCCSTGCCGCIGCLGSCCHSICSRGQFE
SYEPIEKVHVH
>Q65984 ~~~S~~~Spike glycoprotein~~~
MIVLILCLLLFSYNSVICTSNNDCVQGNVTQLPGNENIIKDFLFHTFKEEPSVVVGGYYPTEVWYNCSRSATTTAYKDFS
NIHAFYFDMEAMENSTGNARGKPLLVHVHGDPVSIIIYISAYRDDVQPRPLLKHGLLCITKNKIIDYNTFTSAQWSAICL
GDDRKIPFSVIPTDNGTKIFGLEWNDDYVTAYISDRSHHLNINNNWFNNVTILYSRSSSATWQKSAAYVYQGVSNFTYYK
LNNTNGLKSYELCEDYEYCTGYATNVFAPTVGGYIPHGFSFNNWFMRTNSSTFVSGRFVTNQPLLVNCLWPVPSFGVAAQ
QFCFEGAQFSQCNGVSLNNTVDVIRFNLNFTALVQSGMGATVFSLNTTGGVILEISCYNDTVSESSFYSYGEISFGVTDG
PRYCFALYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNFFSTFPIDCISFNLTTGDSGAFWTIAYTSYTDALVQVENT
AIKKVTYCNSHINNIKCSQLTANLQNGFYPVASSEVGLVNKSVVLLPSFYSHTSVNITIDLGMKRSGYGQPIASTLSNIT
LPMQDNNTDVYCIRSNRFSVYFHSTCKSSLWDDVFNSDCTDVLYATAVIKTGTCPFSFDKLNNYLTFNKFCLSLNPVGAN
CKFDVAARTRTNEQVVRSLYVIYEEGDNIVGVPSDNSGLHDLSVLHLDSCTDYNIYGITGVGIIRQTNSTLLSGLYYTSL
SGDLLGFKNVSDGVIYSVTPCDVSAHAAVIDGAIVGAMTSINSELLGLTHWTTTPNFYYYSIYNYTNERTRGTAIDSNDV
DCEPIITYSNIGVCKNGALVFINVTHSDGDVQPISTGNVTIPTNFTISVQVEYIQVYTTPVSIDCSRYVCNGNPRCNKLL
TQYVSACQTIEQALAMGARLENMEIDSMLFVSENALKLASVEAFNSTETLDPIYKEWPNIGGSWLGGLKDILPSHNSKRK
YRSAIEDLLFDKVVTSGLGTVDEDYKRCTGGYDIADLVCAQYYNGIMVLPGVANDDKMAMYTASLAGGITLGSLGGGAVS
IPFAIAVQARLNYVALQTDVLNKNQQILANAFNQAIGNITQAFGKVNDAIHQTSQGLATVAKVLAKVQDVVNTQGQALSH
LTLQLQNNFQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKVNECVRSQSQRFG
FCGNGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICASDGDRTFGLVVKDVQLTLFRNLDDKFYLTPRTMYQPIV
ATSSDFVQIEGCDVLFVNATVIDLPSIIPDYIDINQTVQDILENFRPNWTVPELPLDIFNATYLNLTGEINDLEFRSEKL
HNTTVELAILIDNINNTLVNLEWLNRIETYVKWPWYVWLLIGLVVIFCIPILLFCCCSTGCCGCIGCLGSCCHSICSRRQ
FESYEPIEKVHVH
>P15423 ~~~S~~~Spike glycoprotein~~~
MFVLLVAYALLHIAGCQTTNGLNTSYSVCNGCVGYSENVFAVESGGYIPSDFAFNNWFLLTNTSSVVDGVVRSFQPLLLN
CLWSVSGLRFTTGFVYFNGTGRGDCKGFSSDVLSDVIRYNLNFEENLRRGTILFKTSYGVVVFYCTNNTLVSGDAHIPFG
TVLGNFYCFVNTTIGNETTSAFVGALPKTVREFVISRTGHFYINGYRYFTLGNVEAVNFNVTTAETTDFCTVALASYADV
LVNVSQTSIANIIYCNSVINRLRCDQLSFDVPDGFYSTSPIQSVELPVSIVSLPVYHKHTFIVLYVDFKPQSGGGKCFNC
YPAGVNITLANFNETKGPLCVDTSHFTTKYVAVYANVGRWSASINTGNCPFSFGKVNNFVKFGSVCFSLKDIPGGCAMPI
VANWAYSKYYTIGSLYVSWSDGDGITGVPQPVEGVSSFMNVTLDKCTKYNIYDVSGVGVIRVSNDTFLNGITYTSTSGNL
LGFKDVTKGTIYSITPCNPPDQLVVYQQAVVGAMLSENFTSYGFSNVVELPKFFYASNGTYNCTDAVLTYSSFGVCADGS
IIAVQPRNVSYDSVSAIVTANLSIPSNWTTSVQVEYLQITSTPIVVDCSTYVCNGNVRCVELLKQYTSACKTIEDALRNS
ARLESADVSEMLTFDKKAFTLANVSSFGDYNLSSVIPSLPTSGSRVAGRSAIEDILFSKLVTSGLGTVDADYKKCTKGLS
IADLACAQYYNGIMVLPGVADAERMAMYTGSLIGGIALGGLTSAVSIPFSLAIQARLNYVALQTDVLQENQKILAASFNK
AMTNIVDAFTGVNDAITQTSQALQTVATALNKIQDVVNQQGNSLNHLTSQLRQNFQAISSSIQAIYDRLDTIQADQQVDR
LITGRLAALNVFVSHTLTKYTEVRASRQLAQQKVNECVKSQSKRYGFCGNGTHIFSIVNAAPEGLVFLHTVLLPTQYKDV
EAWSGLCVDGTNGYVLRQPNLALYKEGNYYRITSRIMFEPRIPTMADFVQIENCNVTFVNISRSELQTIVPEYIDVNKTL
QELSYKLPNYTVPDLVVEQYNQTILNLTSEISTLENKSAELNYTVQKLQTLIDNINSTLVDLKWLNRVETYIKWPWWVWL
CISVVLIFVVSMLLLCCCSTGCCGFFSCFASSIRGCCESTKLPYYDVEKIHIQ
>Q5MQD0 ~~~S~~~Spike glycoprotein~~~
MLLIIFILPTTLAVIGDFNCTNFAINDLNTTVPRISEYVVDVSYGLGTYYILDRVYLNTTILFTGYFPKSGANFRDLSLK
GTTYLSTLWYQKPFLSDFNNGIFSRVKNTKLYVNKTLYSEFSTIVIGSVFINNSYTIVVQPHNGVLEITACQYTMCEYPH
TICKSKGSSRNESWHFDKSEPLCLFKKNFTYNVSTDWLYFHFYQERGTFYAYYADSGMPTTFLFSLYLGTLLSHYYVLPL
TCNAISSNTDNETLQYWVTPLSKRQYLLKFDNRGVITNAVDCSSSFFSEIQCKTKSLLPNTGVYDLSGFTVKPVATVHRR
IPDLPDCDIDKWLNNFNVPSPLNWERKIFSNCNFNLSTLLRLVHTDSFSCNNFDESKIYGSCFKSIVLDKFAIPNSRRSD
LQLGSSGFLQSSNYKIDTTSSSCQLYYSLPAINVTINNYNPSSWNRRYGFNNFNLSSHSVVYSRYCFSVNNTFCPCAKPS
FASSCKSHKPPSASCPIGTNYRSCESTTVLDHTDWCRCSCLPDPITAYDPRSCSQKKSLVGVGEHCAGFGVDEEKCGVLD
GSYNVSCLCSTDAFLGWSYDTCVSNNRCNIFSNFILNGINSGTTCSNDLLQPNTEVFTDVCVDYDLYGITGQGIFKEVSA
VYYNSWQNLLYDSNGNIIGFKDFVTNKTYNIFPCYAGRVSAAFHQNASSLALLYRNLKCSYVLNNISLTTQPYFDSYLGC
VFNADNLTDYSVSSCALRMGSGFCVDYNSPSSSSSRRKRRSISASYRFVTFEPFNVSFVNDSIESVGGLYEIKIPTNFTI
VGQEEFIQTNSPKVTIDCSLFVCSNYAACHDLLSEYGTFCDNINSILDEVNGLLDTTQLHVADTLMQGVTLSSNLNTNLH
FDVDNINFKSLVGCLGPHCGSSSRSFFEDLLFDKVKLSDVGFVEAYNNCTGGSEIRDLLCVQSFNGIKVLPPILSESQIS
GYTTAATVAAMFPPWSAAAGIPFSLNVQYRINGLGVTMDVLNKNQKLIATAFNNALLSIQNGFSATNSALAKIQSVVNSN
AQALNSLLQQLFNKFGAISSSLQEILSRLDALEAQVQIDRLINGRLTALNAYVSQQLSDISLVKFGAALAMEKVNECVKS
QSPRINFCGNGNHILSLVQNAPYGLLFMHFSYKPISFKTVLVSPGLCISGDVGIAPKQGYFIKHNDHWMFTGSSYYYPEP
ISDKNVVFMNTCSVNFTKAPLVYLNHSVPKLSDFESELSHWFKNQTSIAPNLTLNLHTINATFLDLYYEMNLIQESIKSL
NNSYINLKDIGTYEMYVKWPWYVWLLISFSFIIFLVLLFFICCCTGCGSACFSKCHNCCDEYGGHHDFVIKTSHDD
>Q0ZME7 ~~~S~~~Spike glycoprotein~~~
MFLIIFILPTTLAVIGDFNCTNSFINDYNKTIPRISEDVVDVSLGLGTYYVLNRVYLNTTLLFTGYFPKSGANFRDLALK
GSIYLSTLWYKPPFLSDFNNGIFSKVKNTKLYVNNTLYSEFSTIVIGSVFVNTSYTIVVQPHNGILEITACQYTMCEYPH
TVCKSKGSIRNESWHIDSSEPLCLFKKNFTYNVSADWLYFHFYQERGVFYAYYADVGMPTTFLFSLYLGTILSHYYVMPL
TCNAISSNTDNETLEYWVTPLSRRQYLLNFDEHGVITNAVDCSSSFLSEIQCKTQSFAPNTGVYDLSGFTVKPVATVYRR
IPNLPDCDIDNWLNNVSVPSPLNWERRIFSNCNFNLSTLLRLVHVDSFSCNNLDKSKIFGSCFNSITVDKFAIPNRRRDD
LQLGSSGFLQSSNYKIDISSSSCQLYYSLPLVNVTINNFNPSSWNRRYGFGSFNLSSYDVVYSDHCFSVNSDFCPCADPS
VVNSCAKSKPPSAICPAGTKYRHCDLDTTLYVKNWCRCSCLPDPISTYSPNTCPQKKVVVGIGEHCPGLGINEEKCGTQL
NHSSCFCSPDAFLGWSFDSCISNNRCNIFSNFIFNGINSGTTCSNDLLYSNTEISTGVCVNYDLYGITGQGIFKEVSAAY
YNNWQNLLYDSNGNIIGFKDFLTNKTYTILPCYSGRVSAAFYQNSSSPALLYRNLKCSYVLNNISFISQPFYFDSYLGCV
LNAVNLTSYSVSSCDLRMGSGFCIDYALPSSRRKRRGISSPYRFVTFEPFNVSFVNDSVETVGGLFEIQIPTNFTIAGHE
EFIQTSSPKVTIDCSAFVCSNYAACHDLLSEYGTFCDNINSILNEVNDLLDITQLQVANALMQGVTLSSNLNTNLHSDVD
NIDFKSLLGCLGSQCGSSSRSLLEDLLFNKVKLSDVGFVEAYNNCTGGSEIRDLLCVQSFNGIKVLPPILSETQISGYTT
AATVAAMFPPWSAAAGVPFSLNVQYRINGLGVTMDVLNKNQKLIANAFNKALLSIQNGFTATNSALAKIQSVVNANAQAL
NSLLQQLFNKFGAISSSLQEILSRLDNLEAQVQIDRLINGRLTALNAYVSQQLSDITLIKAGASRAIEKVNECVKSQSPR
INFCGNGNHILSLVQNAPYGLLFIHFSYKPTSFKTVLVSPGLCLSGDRGIAPKQGYFIKQNDSWMFTGSSYYYPEPISDK
NVVFMNSCSVNFTKAPFIYLNNSIPNLSDFEAELSLWFKNHTSIAPNLTFNSHINATFLDLYYEMNVIQESIKSLNSSFI
NLKEIGTYEMYVKWPWYIWLLIVILFIIFLMILFFICCCTGCGSACFSKCHNCCDEYGGHNDFVIKASHDD
>Q6Q1S2 ~~~S~~~Spike glycoprotein~~~
MKLFLILLVLPLASCFFTCNSNANLSMLQLGVPDNSSTIVTGLLPTHWFCANQSTSVYSANGFFYIDVGNHRSAFALHTG
YYDANQYYIYVTNEIGLNASVTLKICKFSRNTTFDFLSNASSSFDCIVNLLFTEQLGAPLGITISGETVRLHLYNVTRTF
YVPAAYKLTKLSVKCYFNYSCVFSVVNATVTVNVTTHNGRVVNYTVCDDCNGYTDNIFSVQQDGRIPNGFPFNNWFLLTN
GSTLVDGVSRLYQPLRLTCLWPVPGLKSSTGFVYFNATGSDVNCNGYQHNSVVDVMRYNLNFSANSLDNLKSGVIVFKTL
QYDVLFYCSNSSSGVLDTTIPFGPSSQPYYCFINSTINTTHVSTFVGILPPTVREIVVARTGQFYINGFKYFDLGFIEAV
NFNVTTASATDFWTVAFATFVDVLVNVSATNIQNLLYCDSPFEKLQCEHLQFGLQDGFYSANFLDDNVLPETYVALPIYY
QHTDINFTATASFGGSCYVCKPHQVNISLNGNTSVCVRTSHFSIRYIYNRVKSGSPGDSSWHIYLKSGTCPFSFSKLNNF
QKFKTICFSTVEVPGSCNFPLEATWHYTSYTIVGALYVTWSEGNSITGVPYPVSGIREFSNLVLNNCTKYNIYDYVGTGI
IRSSNQSLAGGITYVSNSGNLLGFKNVSTGNIFIVTPCNQPDQVAVYQQSIIGAMTAVNESRYGLQNLLQLPNFYYVSNG
GNNCTTAVMTYSNFGICADGSLIPVRPRNSSDNGISAIITANLSIPSNWTTSVQVEYLQITSTPIVVDCATYVCNGNPRC
KNLLKQYTSACKTIEDALRLSAHLETNDVSSMLTFDSNAFSLANVTSFGDYNLSSVLPQRNIRSSRIAGRSALEDLLFSK
VVTSGLGTVDVDYKSCTKGLSIADLACAQYYNGIMVLPGVADAERMAMYTGSLIGGMVLGGLTSAAAIPFSLALQARLNY
VALQTDVLQENQKILAASFNKAINNIVASFSSVNDAITQTAEAIHTVTIALNKIQDVVNQQGSALNHLTSQLRHNFQAIS
NSIQAIYDRLDSIQADQQVDRLITGRLAALNAFVSQVLNKYTEVRGSRRLAQQKINECVKSQSNRYGFCGNGTHIFSIVN
SAPDGLLFLHTVLLPTDYKNVKAWSGICVDGIYGYVLRQPNLVLYSDNGVFRVTSRVMFQPRLPVLSDFVQIYNCNVTFV
NISRVELHTVIPDYVDVNKTLQEFAQNLPKYVKPNFDLTPFNLTYLNLSSELKQLEAKTASLFQTTVELQGLIDQINSTY
VDLKLLNRFENYIKWPWWVWLIISVVFVVLLSLLVFCCLSTGCCGCCNCLTSSMRGCCDCGSTKLPYYEFEKVHVQ
>P36334 ~~~S~~~Spike glycoprotein~~~
MFLILLISLPTAFAVIGDLKCTSDNINDKDTGPPPISTDTVDVTNGLGTYYVLDRVYLNTTLFLNGYYPTSGSTYRNMAL
KGSVLLSRLWFKPPFLSDFINGIFAKVKNTKVIKDRVMYSEFPAITIGSTFVNTSYSVVVQPRTINSTQDGDNKLQGLLE
VSVCQYNMCEYPQTICHPNLGNHRKELWHLDTGVVSCLYKRNFTYDVNADYLYFHFYQEGGTFYAYFTDTGVVTKFLFNV
YLGMALSHYYVMPLTCNSKLTLEYWVTPLTSRQYLLAFNQDGIIFNAEDCMSDFMSEIKCKTQSIAPPTGVYELNGYTVQ
PIADVYRRKPNLPNCNIEAWLNDKSVPSPLNWERKTFSNCNFNMSSLMSFIQADSFTCNNIDAAKIYGMCFSSITIDKFA
IPNGRKVDLQLGNLGYLQSFNYRIDTTATSCQLYYNLPAANVSVSRFNPSTWNKRFGFIEDSVFKPRPAGVLTNHDVVYA
QHCFKAPKNFCPCKLNGSCVGSGPGKNNGIGTCPAGTNYLTCDNLCTPDPITFTGTYKCPQTKSLVGIGEHCSGLAVKSD
YCGGNSCTCRPQAFLGWSADSCLQGDKCNIFANFILHDVNSGLTCSTDLQKANTDIILGVCVNYDLYGILGQGIFVEVNA
TYYNSWQNLLYDSNGNLYGFRDYIINRTFMIRSCYSGRVSAAFHANSSEPALLFRNIKCNYVFNNSLTRQLQPINYFDSY
LGCVVNAYNSTAISVQTCDLTVGSGYCVDYSKNRRSRGAITTGYRFTNFEPFTVNSVNDSLEPVGGLYEIQIPSEFTIGN
MVEFIQTSSPKVTIDCAAFVCGDYAACKSQLVEYGSFCDNINAILTEVNELLDTTQLQVANSLMNGVTLSTKLKDGVNFN
VDDINFSPVLGCLGSECSKASSRSAIEDLLFDKVKLSDVGFVEAYNNCTGGAEIRDLICVQSYKGIKVLPPLLSENQISG
YTLAATSASLFPPWTAAAGVPFYLNVQYRINGLGVTMDVLSQNQKLIANAFNNALYAIQEGFDATNSALVKIQAVVNANA
EALNNLLQQLSNRFGAISASLQEILSRLDALEAEAQIDRLINGRLTALNAYVSQQLSDSTLVKFSAAQAMEKVNECVKSQ
SSRINFCGNGNHIISLVQNAPYGLYFIHFSYVPTKYVTARVSPGLCIAGDRGIAPKSGYFVNVNNTWMYTGSGYYYPEPI
TENNVVVMSTCAVNYTKAPYVMLNTSIPNLPDFKEELDQWFKNQTSVAPDLSLDYINVTFLDLQVEMNRLQEAIKVLNQS
YINLKDIGTYEYYVKWPWYVWLLICLAGVAMLVLLFFICCCTGCGTSCFKKCGGCCDDYTGYQELVIKTSHDD
>P11224 ~~~S~~~Spike glycoprotein~~~
MLFVFILFLPSCLGYIGDFRCIQLVNSNGANVSAPSISTETVEVSQGLGTYYVLDRVYLNATLLLTGYYPVDGSKFRNLA
LRGTNSVSLSWFQPPYLNQFNDGIFAKVQNLKTSTPSGATAYFPTIVIGSLFGYTSYTVVIEPYNGVIMASVCQYTICQL
PYTDCKPNTNGNKLIGFWHTDVKPPICVLKRNFTLNVNADAFYFHFYQHGGTFYAYYADKPSATTFLFSVYIGDILTQYY
VLPFICNPTAGSTFAPRYWVTPLVKRQYLFNFNQKGVITSAVDCASSYTSEIKCKTQSMLPSTGVYELSGYTVQPVGVVY
RRVANLPACNIEEWLTARSVPSPLNWERKTFQNCNFNLSSLLRYVQAESLFCNNIDASKVYGRCFGSISVDKFAVPRSRQ
VDLQLGNSGFLQTANYKIDTAATSCQLHYTLPKNNVTINNHNPSSWNRRYGFNDAGVFGKNQHDVVYAQQCFTVRSSYCP
CAQPDIVSPCTTQTKPKSAFVNVGDHCEGLGVLEDNCGNADPHKGCICANNSFIGWSHDTCLVNDRCQIFANILLNGINS
GTTCSTDLQLPNTEVVTGICVKYDLYGITGQGVFKEVKADYYNSWQTLLYDVNGNLNGFRDLTTNKTYTIRSCYSGRVSA
AFHKDAPEPALLYRNINCSYVFSNNISREENPLNYFDSYLGCVVNADNRTDEALPNCDLRMGAGLCVDYSKSRRAHRSVS
TGYRLTTFEPYTPMLVNDSVQSVDGLYEMQIPTNFTIGHHEEFIQTRSPKVTIDCAAFVCGDNTACRQQLVEYGSFCVNV
NAILNEVNNLLDNMQLQVASALMQGVTISSRLPDGISGPIDDINFSPLLGCIGSTCAEDGNGPSAIRGRSAIEDLLFDKV
KLSDVGFVEAYNNCTGGQEVRDLLCVQSFNGIKVLPPVLSESQISGYTTGATAAAMFPPWSAAAGVPFSLSVQYRINGLG
VTMNVLSENQKMIASAFNNALGAIQDGFDATNSALGKIQSVVNANAEALNNLLNQLSNRFGAISASLQEILTRLEAVEAK
AQIDRLINGRLTALNAYISKQLSDSTLIKVSAAQAIEKVNECVKSQTTRINFCGNGNHILSLVQNAPYGLYFIHFSYVPI
SFTTANVSPGLCISGDRGLAPKAGYFVQDDGEWKFTGSSYYYPEPITDKNSVIMSSCAVNYTKAPEVFLNTSIPNPPDFK
EELDKWFKNQTSIAPDLSLDFEKLNVTLLDLTYEMNRIQDAIKKLNESYINLKEVGTYEMYVKWPWYVWLLIGLAGVAVC
VLLFFICCCTGCGSCCFKKCGNCCDEYGGHQDSIVIHNISSHED
>Q02385 ~~~S~~~Spike glycoprotein~~~
MLFVFILFLPSCLGYIGDFRCIQTVNYNGNNASAPSISTEAVDVSKGLGTYYVLDRVYLNATLLLTGYYPVDGSNYRNLA
LTGTNTLSLTWFKPPFLSEFNDGIFAKVQNLKTNTPTGATSYFPTIVIGSLFGNTSYTVVLEPYNNIIMASVCTYTICQL
PYTPCKPNTNGNRVIGFWHTDVKPPICLLKRNFTFNVNAPWLYFHFYQQGGTFYAYYADKPSATTFLFSVYIGDILTQYF
VLPFICTPTAGSTLLPLYWVTPLLKRQYLFNFNEKGVITSAVDCASSYISEIKCKTQSLLPSTGVYDLSGYTVQPVGVVY
RRVPNLPDCKIEEWLTAKSVPSPLNWERRTFQNCNFNLSSLLRYVQAESLSCNNIDASKVYGMCFGSVSVDKFAIPRSRQ
IDLQIGNSGFLQTANYKIDTAATSCQLYYSLPKNNVTINNYNPSSWNRRYGFNDAGVFGKSKHDVAYAQQCFIVRPSYCP
CAQPDIVSACTSQTKPMSAYCPTGTIHRECSLWNGPHLRSARVGSGTYTCECTCKPNPFDTYDLRCGQIKTIVNVGDHCE
GLGVLEDKCGNSDPHKGCSCAHDSFIGWSHDTCLVNDHSQIFANILLNGINSGTTCSTDLQLPNTEVATGVCVRYDLYGI
TGQGVFKEVKADYYNSWQALLYDVNGNLNGFRDLTTNKTYTIRSCYSGRVSAAYHKEAPEPALLYRNINCSYVFTNNISR
EENPLNYFDSYLGCVVNADNRPDEALPNCDLRMGAGLCVDYSKSRRARRSVSTGYRLTTFEPYMPMLVNDSVQSVGGLYE
MQIPTNFTIGHHEEFIQIRAPKVTIDCAAFVCGDNAACRQQLVEYGSFCDNVNAILNEVNNLLDNMQLQVASALMQGVTI
SSRLPDGISGPIDDINFSPLLGCIGSTCAEDGNGPSAMRGRSAIEDLLFDKVKLSDVGFVEAYNNCTGGQEVRDLLCVQS
FNGIKVLPPVLSESQISGYTAGATAAAMFPPWTAAAGVPFSLNVQYRINGLGVTMNVLSENQKMIASAFNNALGAIQEGF
DATNSALGKIQSVVNANAEALNNLLNQLSNRFGAISASLQEILTRLDRVEAKAQIDRLINGRLTALNAYISKQLSDSTLI
KFSAAQAIEKVNECVKSQTTRINFCGNGNHILSLVQNAPYGLCFIHFSYVPTSFKTANVSPGLCISGDRGLAPKAGYFVQ
DNGEWKFTGSNYYYPEPITDKNSVVMISCAVNYTKAPEVFLNNSIPNLPDFKEELDKWFKNQTSIAPDLSLDFEKLNVTF
LDLTYEMNRIQDAIKKLNESYINLKEVGTYEMYVKWPWYVWLLIGLAGVAVCVLLFFICCCTGCGSCCFRKCGSCCDEYG
GHQDSIVIYNISAHED
>P11225 ~~~S~~~Spike glycoprotein~~~
MLFVFILLLPSCLGYIGDFRCIQTVNYNGNNASAPSISTEAVDVSKGRGTYYVLDRVYLNATLLLTGYYPVDGSNYRNLA
LTGTNTLSLTWFKPPFLSEFNDGIFAKVQNLKTNTPTGATSYFPTIVIGSLFGNTSYTVVLEPYNNIIMASVCTYTICQL
PYTPCKPNTNGNRVIGFWHTDVKPPICLLKRNFTFNVNAPWLYFHFYQQGGTFYAYYADKPSATTFLFSVYIGDILTQYF
VLPFICTPTAGSTLAPLYWVTPLLKRQYLFNFNEKGVITSAVDCASSYISEIKCKTQSLLPSTGVYDLSGYTVQPVGVVY
RRVPNLPDCKIEEWLTAKSVPSPLNWERRTFQNCNFNLSSLLRYVQAESLSCNNIDASKVYGMCFGSVSVDKFAIPRSRQ
IDLQIGNSGFLQTANYKIDTAATSCQLYYSLPKNNVTINNYNPSSWNRRYGFKVNDRCQIFANILLNGINSGTTCSTDLQ
LPNTEVATGVCVRYDLYGITGQGVFKEVKADYYNSWQALLYDVNGNLNGFRDLTTNKTYTIRSCYSGRVSAAYHKEAPEP
ALLYRNINCSYVFTNNISREENPLNYFDSYLGCVVNADNRTDEALPNCNLRMGAGLCVDYSKSRRARRSVSTGYRLTTFE
PYMPMLVNDSVQSVGGLYEMQIPTNFTIGHHEEFIQIRAPKVTIDCAAFVCGDNAACRQQLVEYGSFCDNVNAILNEVNN
LLDNMQLQVASALMQGVTISSRLPDGISGPIDDINFSPLLGCIGSTCAEDGNGPSAIRGRSAIEDLLFDKVKLSDVGFVE
AYNNCTGGQEVRDLLCVQSFNGIKVLPPVLSESQISGYTAGATAAAMFPPWTAAAGVPFSLNVQYRINGLGVTMNVLSEN
QKMIASAFNNALGAIQEGFDATNSALGKIQSVVNANAEALNNLLNQLSNRFGAISASLQEILTRLDAVEAKAQIDRLING
RLTALNAYISKQLSDSTLIKFSAAQAIEKVNECVKSQTTRINFCGNGNHILSLVQNAPYGLCFIHFSYVPTSFKTANVSP
GLCISGDRGLAPKAGYFVQDNGEWKFTGSNYYYPEPITDKNSVAMISCAVNYTKAPEVFLNNSIPNLPDFKEELDKWFKN
QTSIAPDLSLDFEKLNVTFLDLTYEMNRIQDAIKKLNESYINLKEVGTYEMYVKWPWYVWLLIGLAGVAVCVLLFFICCC
TGCGSCCFRKCGSCCDEYGGHQDSIVIHNISAHED
>P07946 ~~~S~~~Spike glycoprotein~~~
MKKLFVVLVVMPLIYGDNFPCSKLTNRTIGNQWNLIETFLLNYSSRLPPNSDVVLGDYFPTVQPWFNCIRNNSNDLYVTL
ENLKALYWDYATENITWNHRQRLNVVVNGYPYSITVTTTRNFNSAEGAIICICKGSPPTTTTESSLTCNWGSECRLNHKF
PICPSNSEANCGNMLYGLQWFADEVVAYLHGASYRISFENQWSGTVTFGDMRATTLEVAGTLVDLWWFNPVYDVSYYRVN
NKNGTTVVSNCTDQCASYVANVFTTQPGGFIPSDFSFNNWFLLTNSSTLVSGKLVTKQPLLVNCLWPVPSFEEAASTFCF
EGAGFDQCNGAVLNNTVDVIRFNLNFTTNVQSGKGATVFSLNTTGGVTLEISCYTVSDSSFFSYGEIPFGVTDGPRYCYV
HYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNFFSTFPIDCISFNLTTGDSDVFWTIAYTSYTEALVQVENTAITKVT
YCNSHVNNIKCSQITANLNNGFYPVSSSEVGLVNKSVVLLPSFYTHTIVNITIGLGMKRSGYGQPIASTLSNITLPMQDH
NTDVYCIRSDQFSVYVHSTCKSALWDNIFKRNCTDVLDATAVIKTGTCPFSFDKLNNYLTFNKFCLSLSPVGANCKFDVA
ARTRTNEQVVRSLYVIYEEGDNIVGVPSDNSGVHDLSVLHLDSCTDYNIYGRTGVGIIRQTNRTLLSGLYYTSLSGDLLG
FKNVSDGVIYSVTPCDVSAQAAVIDGTIVGAITSINSELLGLTHWTTTPNFYYYSIYNYTNDRTRGTAIDSNDVDCEPVI
TYSNIGVCKNGAFVFINVTHSDGDVQPISTGNVTIPTNFTISVQVEYIQVYTTPVSIDCSRYVCNGNPRCNKLLTQYVSA
CQTIEQALAMGARLENMEVDSMLFVSENALKLASVEAFNSSETLDPIYKEWPNIGGSWLEGLKYILPSHNSKRKYRSAIE
DLLFDKVVTSGLGTVDEDYKRCTGGYDIADLVCAQYYNGIMVLPGVANADKMTMYTASLAGGITLGALGGGAVAIPFAVA
VQARLNYVALQTDVLNKNQQILASAFNQAIGNITQSFGKVNDAIHQTSRGLATVAKALAKVQDVVNIQGQALSHLTVQLQ
NNFQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKVNECVRSQSQRFGFCGNGT
HLFSLANAAPNGMIFFHTVLLPTAYETVTAWPGICASDGDRTFGLVVKDVQLTLFRNLDDKFYLTPRTMYQPRVATSSDF
VQIEGCDVLFVNATVSDLPSIIPDYIDINQTVQDILENFRPNWTVPELTFDIFNATYLNLTGEIDDLEFRSEKLHNTTVE
LAILIDNINNTLVNLEWLNRIETYVKWPWYVWLLIGLVVIFCIPLLLFCCCSTGCCGCIGCLGSCCHSICSRRQFENYEP
IEKVHVH
>P24413 ~~~S~~~Spike glycoprotein~~~
MKKLFVVLVVMPLIYGDKFPTSVVSNCTDQCASYVANVFTILPGGFIPSDFSFNNWFLLTNSSTLVNGKLVTKQPLLVNC
LWPVPSFEEVASTFCFEGADFDQCNGAVLNNTVDVIRFNLNFTTNVQSGKGATVFSLNTTGGVTLEISCYNDTVSDSSFS
SYGEIPFGVTNGPRYCYVLYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNFFSTFPIDCISFNLTTGDSDVFWTIAYT
SYTEALVQVENTAITNVTYCNSYVNNIKCSQLTANLNNGFYPVSSSEVGSVNKSVVLLPSFLTHTIVNITIGLGMKRSGY
GQPIASTLSNITLPMQDNNNDVYCVRSDQFSVYVHSTCKSVLWDNVFKRNCTDVLDATAVIKTGTCPFSFDKLNNYLTFN
KFCLSLSPVGANCKFDVAARTRTNDQVVRSLYVIYEEGDSIVGVPSDNSGLHDLSVLHLDSCTDYNIYGRTGVGIIRQTN
RTILSGLYYTSLSGDLLGFTNVSDGVIYSVTPCDVSAQAAIIDGTIVGAITSINSELLGLTHWTTTPNFYYYSIYNYTND
KTRGTPIGSNDVDCEPVITYSNIGVCKNGALVFINVTHSDGDVQPISTGNVTIPTNFTISVQVEYIQVYTTPVSIDCSRY
VCNGNPRCNKLLTQYVSACQTIEQALAMGARLENMEVDSMLFVSENALKLASVEAFNSSETLDPIYKEWPNIGGFWLEGL
KYILPSDNSKRKYRSAIEDLLFSKVVTSGLGTVDEDYKRCTGGYDIADLVCAQYYNGIMVLPGVANADKMTMYTASLAGG
ITLGALGGGAVAIPFAVAVQARLNYVALQTDVLNKNQQILASAFNQAIGNITQSFGKVNDAIHQTSRGLTTVAKALAKVQ
DVVNTQGQALRHLTVQLQNNFQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKV
NECVRSQSQRFGFCGNGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICALDVDRTFGLVVKDVQLTLFRNLDDKF
YLTPRTMYQPRVATSSDFVQIEGCDVLFVNTTVSDLPSIIPDYIDINQTVQDILENFRPNWTVPELTLDVFNATYLNLTG
EIDDLEFRSEKLHNTTVELAILIDNINNTVVNLEWLNRIETYVKWPWYVWLLIGLVVIFCIPLLLFCCCSTGCCGCIGCL
GSCCHSIFSRRQFENYEPIEKVHVH
>P10033 ~~~S~~~Spike glycoprotein~~~
MIVLVTCLLLLCSYHTVLSTTNNECIQVNVTQLAGNENLIRDFLFSNFKEEGSVVVGGYYPTEVWYNCSRTARTTAFQYF
NNIHAFYFVMEAMENSTGNARGKPLLFHVHGEPVSVIISAYRDDVQQRPLLKHGLVCITKNRHINYEQFTSNQWNSTCTG
ADRKIPFSVIPTDNGTKIYGLEWNDDFVTAYISGRSYHLNINTNWFNNVTLLYSRSSTATWEYSAAYAYQGVSNFTYYKL
NNTNGLKTYELCEDYEHCTGYATNVFAPTSGGYIPDGFSFNNWFLLTNSSTFVSGRFVTNQPLLINCLWPVPSFGVAAQE
FCFEGAQFSQCNGVSLNNTVDVIRFNLNFTADVQSGMGATVFSLNTTGGVILEISCYSDTVSESSSYSYGEIPFGITDGP
RYCYVLYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNFFSTFPIGCISFNLTTGVSGAFWTIAYTSYTEALVQVENTA
IKNVTYCNSHINNIKCSQLTANLNNGFYPVASSEVGFVNKSVVLLPSFFTYTAVNITIDLGMKLSGYGQPIASTLSNITL
PMQDNNTDVYCIRSNQFSVYVHSTCKSSLWDNIFNQDCTDVLEATAVIKTGTCPFSFDKLNNYLTFNKFCLSLSPVGANC
KFDVAARTRTNEQVVRSLYVIYEEGDNIVGVPSDNSGLHDLSVLHLDSCTDYNIYGRTGVGIIRRTNSTLLSGLYYTSLS
GDLLGFKNVSDGVIYSVTPCDVSAQAAVIDGAIVGAMTSINSELLGLTHWTTTPNFYYYSIYNYTSERTRGTAIDSNDVD
CEPVITYSNIGVCKNGALVFINVTHSDGDVQPISTGNVTIPTNFTISVQVEYMQVYTTPVSIDCARYVCNGNPRCNKLLT
QYVSACQTIEQALAMGARLENMEVDSMLFVSENALKLASVEAFNSTENLDPIYKEWPSIGGSWLGGLKDILPSHNSKRKY
GSAIEDLLFDKVVTSGLGTVDEDYKRCTGGYDIADLVCAQYYNGIMVLPGVANADKMTMYTASLAGGITLGALGGGAVAI
PFAVAVQARLNYVALQTDVLNKNQQILANAFNQAIGNITQAFGKVNDAIHQTSQGLATVAKALAKVQDVVNTQGQALSHL
TVQLQNNFQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKVNECVRSQSQRFGF
CGNGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICASDGDRTFGLVVKDVQLTLFRNLDDKFYLTPRTMYQPRVA
TSSDFVQIEGCDVLFVNATVIDLPSIIPDYIDINQTVQDILENYRPNWTVPEFTLDIFNATYLNLTGEIDDLEFRSEKLH
NTTVELAILIDNINNTLVNLEWLNRIETYVKWPWYVWLLIGLVVVFCIPLLLFCCFSTGCCGCIGCLGSCCHSICSRRQF
ENYEPIEKVHVH
>P11223 ~~~S~~~Spike glycoprotein~~~
MLVTPLLLVTLLCALCSAVLYDSSSYVYYYQSAFRPPSGWHLQGGAYAVVNISSEFNNAGSSSGCTVGIIHGGRVVNASS
IAMTAPSSGMAWSSSQFCTAHCNFSDTTVFVTHCYKHGGCPLTGMLQQNLIRVSAMKNGQLFYNLTVSVAKYPTFRSFQC
VNNLTSVYLNGDLVYTSNETIDVTSAGVYFKAGGPITYKVMREVKALAYFVNGTAQDVILCDGSPRGLLACQYNTGNFSD
GFYPFTNSSLVKQKFIVYRENSVNTTCTLHNFIFHNETGANPNPSGVQNIQTYQTKTAQSGYYNFNFSFLSSFVYKESNF
MYGSYHPSCKFRLETINNGLWFNSLSVSIAYGPLQGGCKQSVFKGRATCCYAYSYGGPSLCKGVYSGELDHNFECGLLVY
VTKSGGSRIQTATEPPVITQNNYNNITLNTCVDYNIYGRTGQGFITNVTDSAVSYNYLADAGLAILDTSGSIDIFVVQGE
YGLNYYKVNPCEDVNQQFVVSGGKLVGILTSRNETGSQLLENQFYIKITNGTRRFRRSITENVANCPYVSYGKFCIKPDG
SIATIVPKQLEQFVAPLFNVTENVLIPNSFNLTVTDEYIQTRMDKVQINCLQYVCGSSLDCRKLFQQYGPVCDNILSVVN
SVGQKEDMELLNFYSSTKPAGFNTPVLSNVSTGEFNISLLLTNPSSRRKRSLIEDLLFTSVESVGLPTNDAYKNCTAGPL
GFFKDLACAREYNGLLVLPPIITAEMQALYTSSLVASMAFGGITAAGAIPFATQLQARINHLGITQSLLLKNQEKIAASF
NKAIGHMQEGFRSTSLALQQIQDVVSKQSAILTETMASLNKNFGAISSVIQEIYQQFDAIQANAQVDRLITGRLSSLSVL
ASAKQAEYIRVSQQRELATQKINECVKSQSIRYSFCGNGRHVLTIPQNAPNGIVFIHFSYTPDSFVNVTAIVGFCVKPAN
ASQYAIVPANGRGIFIQVNGSYYITARDMYMPRAITAGDVVTLTSCQANYVSVNKTVITTFVDNDDFDFNDELSKWWNDT
KHELPDFDKFNYTVPILDIDSEIDRIQGVIQGLNDSLIDLEKLSILKTYIKWPWYVWLAIAFATIIFILILGWVFFMTGC
CGCCCGCFGIMPLMSKCGKKSSYYTTFDNDVVTEQYRPKKSV
>K9N5Q8 ~~~S~~~Spike glycoprotein~~~
MIHSVFLLMFLLTPTESYVDVGPDSVKSACIEVDIQQTFFDKTWPRPIDVSKADGIIYPQGRTYSNITITYQGLFPYQGD
HGDMYVYSAGHATGTTPQKLFVANYSQDVKQFANGFVVRIGAAANSTGTVIISPSTSATIRKIYPAFMLGSSVGNFSDGK
MGRFFNHTLVLLPDGCGTLLRAFYCILEPRSGNHCPAGNSYTSFATYHTPATDCSDGNYNRNASLNSFKEYFNLRNCTFM
YTYNITEDEILEWFGITQTAQGVHLFSSRYVDLYGGNMFQFATLPVYDTIKYYSIIPHSIRSIQSDRKAWAAFYVYKLQP
LTFLLDFSVDGYIRRAIDCGFNDLSQLHCSYESFDVESGVYSVSSFEAKPSGSVVEQAEGVECDFSPLLSGTPPQVYNFK
RLVFTNCNYNLTKLLSLFSVNDFTCSQISPAAIASNCYSSLILDYFSYPLSMKSDLSVSSAGPISQFNYKQSFSNPTCLI
LATVPHNLTTITKPLKYSYINKCSRFLSDDRTEVPQLVNANQYSPCVSIVPSTVWEDGDYYRKQLSPLEGGGWLVASGST
VAMTEQLQMGFGITVQYGTDTNSVCPKLEFANDTKIASQLGNCVEYSLYGVSGRGVFQNCTAVGVRQQRFVYDAYQNLVG
YYSDDGNYYCLRACVSVPVSVIYDKETKTHATLFGSVACEHISSTMSQYSRSTRSMLKRRDSTYGPLQTPVGCVLGLVNS
SLFVEDCKLPLGQSLCALPDTPSTLTPRSVRSVPGEMRLASIAFNHPIQVDQLNSSYFKLSIPTNFSFGVTQEYIQTTIQ
KVTVDCKQYVCNGFQKCEQLLREYGQFCSKINQALHGANLRQDDSVRNLFASVKSSQSSPIIPGFGGDFNLTLLEPVSIS
TGSRSARSAIEDLLFDKVTIADPGYMQGYDDCMQQGPASARDLICAQYVAGYKVLPPLMDVNMEAAYTSSLLGSIAGVGW
TAGLSSFAAIPFAQSIFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTTNEAFHKVQDAVNNNAQALSKLASEL
SNTFGAISASIGDIIQRLDVLEQDAQIDRLINGRLTTLNAFVAQQLVRSESAALSAQLAKDKVNECVKAQSKRSGFCGQG
THIVSFVVNAPNGLYFMHVGYYPSNHIEVVSAYGLCDAANPTNCIAPVNGYFIKTNNTRIVDEWSYTGSSFYAPEPITSL
NTKYVAPQVTYQNISTNLPPPLLGNSTGIDFQDELDEFFKNVSTSIPNFGSLTQINTTLLDLTYEMLSLQQVVKALNESY
IDLKELGNYTYYNKWPWYIWLGFIAGLVALALCVFFILCCTGCGTNCMGKLKCNRCCDRYEEYDLEPHKVHVH
>Q91AV1 ~~~S~~~Spike glycoprotein~~~
MRSLIYFWLLLPVLPTLSLPQDVTRCQSTTNFRRFFSKFNVQAPAVVVLGGYLPSMNSSSWYCGTGIETASGVHGIFLSY
IDSGQGFEIGISQEPFDPSGYQLYLHKATNGNTNAIARLRICQFPDNKTLGPTVNDVTTGRNCLFNKAIPAYMRDGKDIV
VGITWDNDRVTVFADKIYHFYLKNDWSRVATRCYNRRSCAMQYVYTPTYYMLNVTSAGEDGIYYEPCTANCTGYAANVFA
TDSNGHIPEGFSFNNWFLLSNDSTLLHGKVVSNQPLLVNCLLAIPKIYGLGQFFSFNHTMDGVCNGAAVDRAPEALRFNI
NDTSVILAEGSIVLHTALGTNLSFVCSNSSDPHLAIFAIPLGATEVPYYCFLKVDTYNSTVYKFLAVLPPTVREIVITKY
GDVYVNGFGYLHLGLLDAVTINFTGHGTDDDVSGFWTIASTNFVDALIEVQGTSIQRILYCDDPVSQLKCSQVAFDLDDG
FYPISSRNLLSHEQPISFVTLPSFNDHSFVNITVSAAFGGLSSANLVASDTTINGFSSFCVDTRQFTITLFYNVTNSYGY
VSKSQDSNCPFTLQSVNDYLSFSKFCVSTSLLAGACTIDLFGYPAFGSGVKLTSLYFQFTKGELITGTPKPLEGITDVSF
MTLDVCTKYTIYGFKGEGIITLTNSSILAGVYYTSDSGQLLAFKNVTSGAVYSVTPCSFSEQAAYVNDDIVGVISSLSNS
TFNNTRELPGFFYHSNDGSNCTEPVLVYSNIGVCKSGSIGYVPSQYGQVKIAPTVTGNISIPTNFSMSIRTEYLQLYNTP
VSVDCATYVCNGNSRCKQLLTQYTAACKTIESALQLSARLESVEVNSMLTISEEALQLATISSFNGDGYNFTNVLGASVY
DPASGRVVQKRSVIEDLLFNKVVTNGLGTVDEDYKRCSNGRSVADLVCAQYYSGVMVLPGVVDAEKLHMYSASLIGGMAL
GGITAAAALPFSYAVQARLNYLALQTDVLQRNQQLLAESFNSAIGNITSAFESVKEAISQTSKGLNTVAHALTKVQEVVN
SQGSALNQLTVQLQHNFQAISSSIDDIYSRLDILSADVQVDRLITGRLSALNAFVAQTLTKYTEVQASRKLAQQKVNECV
KSQSQRYGFCGGDGEHIFSLVQAAPQGLLFLHTVLVPGDFVNVLAIAGLCVNGEIALTLREPGLVLFTHELQTYTATEYF
VSSRRMFEPRKPTVSDFVQIESCVVTYVNLTSDQLPDVIPDYIDVNKTLDEILASLPNRTGPSLPLDVFNATYLNLTGEI
ADLEQRSESLRNTTEELRSLINNINNTLVDLEWLNRVETYIKWPWWVWLIIVIVLIFVVSLLVFCCISTGCCGCCGCCGA
CFSGCCRGPRLQPYEAFEKVHVQ
>P59594 ~~~S~~~Spike glycoprotein~~~
MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHTFGNPV
IPFKDGIYFAATEKSNVVRGWVFGSTMNNKSQSVIIINNSTNVVIRACNFELCDNPFFAVSKPMGTQTHTMIFDNAFNCT
FEYISDAFSLDVSEKSGNFKHLREFVFKNKDGFLYVYKGYQPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSP
AQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNIT
NLCPFGEVFNATKFPSVYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPG
QTGVIADYNYKLPDDFMGCVLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFERDISNVPFSPDGKPCTPPALNCYWPLND
YGFYTTTGIGYQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFTD
SVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAE
HVDTSYECDIPIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVMPVSMAKTSVDC
NMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQMYKTPTLKYFGGFNFSQILPDPLKPTKRSFI
EDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPF
AMQMAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLN
DILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPH
GVVFLHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGNCDVVIGIINNTVY
DPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWL
GFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
>P0DTC2 ~~~S~~~Spike glycoprotein~~~
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFD
NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY
SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT
LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV
QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF
VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC
NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGS
NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI
SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF
NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAG
TITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN
TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRV
DFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL
QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
>P23061 ~~~SPH~~~Spindolin~~~
MNKLILISLIASLYQVEVDAHGYMTFPIARQRRCSAAGGNWYPVGGGGIQDPMCRAAYQNVFNKVLNSNGGDVIDASEAA
NYMYTQDNEYAALAGPDYTNICHIQQRVVPSYLCAAGASDWSIRPFGDKSGMDLPGSWTPTIIQLSDNQQSNVVMELEFC
PTAVHDPSYYEVYITNPSFNVYTDNVVWANLDLIYNNTVTLRPKLPESTCAANSMVYRFEVSIPVRPSQFVLYVRWQRID
PVGEGFYNCVDMKFKYSEGPDEEDIIEPEYEVDNEAECFAYRTNSGNVNVNPLQENKYMAYANKAIRNINTHSNGCSRNR
NNKNNYNKYYSKTYNYNQNRK
>Q05894 ~~~~~~Spindolin~~~
MNKFYYICIYINILYVCVSGHGYMTFPIARQRRCSVRGGQWWPPNGDGITDTMCRAAYQNVYNKVLNQYNDPQEAATAAQ
YMFQQDNEYAALAGPDYTNLCNLQQNVVPNNLCAAGADDWDVVPFGDKSGMDLPGNWVPTVIPLDSNHQSSVALELEFCP
TAVHDPSYYEVYITNSGFNVHTDNVVWGNLELIFNDTVPLRPKSSTSTCNANPNVYRFTVSIPVRPAQFVLYVRWQRIDP
VGEGFYNCVDMAFDYAAGPSEEDVIYPDYEAPGQNAYTCHANRNKYGGNYENTIDEDKYQAQLDESIKSRYDKYSRHKGG
KFGQKQCNGNKHHYNKYTKYYNQNYKNNKNY
>P23058 ~~~SLP~~~Spheroidin-like protein~~~
MIALLIALFAAIHAPAVRSHGYLSVPTARQYKCFKDGNFYWPDNGDNIPDAACRNAYKSVYYKYRALDLESGAAASTAQY
MFQQYMEYAAVAGPNYDDFDLIKQRVVPHTLCGAGSNDRNSVFGDKSGMDEPFNNWRPNTLYLNRYQPVYQMNVHFCPTA
IHEPSYFEVFITKSNWDRRNPITWNELEYIGGNDSNLIPNPGDSLCDNSLVYSIPVVIPYRSNQFVMYVRWQRIDPVGEG
FYNCADLVFETLDDECRYAQMAKVVRSQLQKHKLDARIDHNDEESCWRARKSNYSSFFNPGF
>Q65328 ~~~SLP~~~Spheroidin-like protein~~~
MYKLCAVLFALAVPAVRPHGYLSTPVARQYKCFADGNFYWPDNGDGVPDEACRNAYKKVFHRYRAVGAPPGEAAAAAQYM
FQQYAEYAAVAGPNYRDLELVKREVLPHTLCGAAANDRHALFGDKSGMDEPFHNWRPDVLYVNRYQRAHSFNVHFCPTAV
HEPSYFEVYVTKFTWDRRSPVTWNELEYIGGNGSGLVPNPGDAFCASGQLYSIPVSVPYRPGPFVMYVRWQRIDPVGEGF
YNCADLVFGTENDECRYARAAKAVRDQLRQQNLCNDCVEAGPQESCAPTRPQRRAHNYLRRGGAHDQQADGASVRETIDE
L
>P00525 2.7.10.2~~~V-SRC~~~Tyrosine-protein kinase transforming protein Src~~~
MGSSKSKPKDPSQRRCSLEPPDSTHHGGFPASQTPNKTAAPDTHRTPSRSFGTVATEPKLFGGFNTSDTVTSPQRAGALA
GGVTTFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLTTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESER
LLLNPENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFSSLQQLVAYYSKHADGLCHR
LTNVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHE
KLVQLYAVVSEEPIYIVTEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVC
KVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMGNGEVLDRVERGYR
MPCPPECPESLHDLMCQCWRRDPEERPTFEYLQAQLLPACVLEVAE
>P25020 2.7.10.2~~~V-SRC~~~Tyrosine-protein kinase transforming protein Src~~~
MGSSKSKPKDPSQRRRSLEPPDSTHHGGFPASQTPDETAAPDAHRNPSRSFGTVATEPKLFWGFNTSDTVTSPQRAGALA
GGVTTFVALYDYESWTETDLSFKKGERLQIVNNTEGDWWLAHSLTTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESER
LLLNPENPRGTFLVRKSETAKGAYCLSVSDFDNAKGPNVKHYKICKLYSGGFYITSRTQFGSLQQLVAYYSKHADGLCHR
LTNVCPTSKPQTQGLAKDAWEIPRESLRLEAKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHE
KLVQLYAVVSEEPIYIVIEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVC
KVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERAYR
MPCPPECPESLHDLMCQCWRKDPEERPTFKYLQAQLLPACVLEVAE
>P00526 2.7.10.2~~~V-SRC~~~Tyrosine-protein kinase transforming protein Src~~~
MGSSKSKPKDPSQRRHSLEPPDSTHHGGFPASQTPDETAAPDAHRNPSRSFGTVATEPKLFWGFNTSDTVTSPQRAGALA
GGVTTFVALYDYESWTETDLSFKKGERLQIVNNTEGDWWLAHSLTTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESER
LLLNPENPRGTFLVRKSETAKGAYCLSVSDFDNAKGPNVKHYKIYKLYSGGFYITSRTQFGSLQQLVAYYSKHADGLCHR
LANVCPTSKPQTQGLAKDAWEIPRESLRLEAKLGQGCFGEVWMGTWNDTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHE
KLVQLYAVVSEEPIYIVIEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVC
KVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYR
MPCPPECPESLHDLMCQCWRKDPEERPTFKYLQAQLLPACVLEVAE
>P00524 2.7.10.2~~~V-SRC~~~Tyrosine-protein kinase transforming protein Src~~~
MGSSKSKPKDPSQRRRSLEPPDSTHHGGFPASQTPNKTAAPDTHRTPSRSFGTVATEPKLFGGFNTSDTVTSPQRAGALA
GGVTTFVALYDYESWIETDLSFKKGERLQIVNNTEGNWWLAHSLTTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESER
LLLNPENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFSSLQQLVAYYSKHADGLCHR
LTNVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHE
KLVQLYAVVSEEPIYIVIEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVC
KVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMGNGEVLDRVERGYR
MPCPPECPESLHDLMCQCWRRDPEERPTFEYLQAQLLPACVLEVAE
>P63185 2.7.10.2~~~V-SRC~~~Tyrosine-protein kinase transforming protein Src~~~
MGSSKSKPKGPSQRRRSLEPPDSTHHGGFPASQTPNKTAAPDTHRTPSRSFGTVATEPKLFGDFNTSDTVTSPQRAGALA
GGVTTFVALYDYESWIETDLSFKKGERLQIVNNTEGNWWLAHSVTTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESER
LLLNPENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFSSLQQLVAYYSKHADGLCHR
LTNVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHK
KLVQLYAVVSEEPIYIVIEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVC
KVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMGNGEVLDRVERGYR
MPCPPECPESLHDLMSQCWRRDPEERPTFEYLQAQLLPACVLEVAE
>Q9MCD0 ~~~gene 5~~~Single-stranded DNA-binding protein~~~
MSNELKQVEQTEEAVVVSETKDYIKVYENGKYRRKAKYQQLNSMSHRELTDEEEINIFNLLNGAEGSAVEMKRAVGSKVT
IVDFITVPYTKIDEDTGVEENGVLTYLINENGEAIATSSKAVYFTLNRLLIQCGKHADGTWKRPIVEIISVKQTNGDGMD
LKLVGFDKKK
>Q6JM09 ~~~~~~SSB protein~~~
MAIITVTAQANEKNTRTVSTAKGDKKIISVPLFEKEKGSNVKVAYGSAFLPDFIQLGDTVTVSGRVQAKESGEYVNYNFV
FPTVEKVFITNDNSSQSQAKQDLFGGSEPIEVNSEDLPF
>Q9MCD1 ~~~5~~~Single-stranded DNA-binding protein~~~
MTNEIKATFDVTTLEGRMKILNAKNAGGASLKTCEDGAIIEAVGIAQYQQESDTYGDMKEETVTAIFTADGNVISAISKT
VAEAASEIIDLVKEFNLDSFKVKVSKQKSSKGNEFFSLLLVG
>Q38504 ~~~5~~~Single-stranded DNA-binding protein~~~
MENTNIVKATFDTETLEGQIKIFNAQTGGGQSFKNLPDGTIIEANAIAQYKQVSDTYGDAKEETVTTIFAADGSLYSAIS
KTVAEAASDLIDLVTRHKLETFKVKVVQGTSSKGNVFFSLQLSL
>P09035 ~~~~~~Single-stranded DNA-binding protein~~~
MFKRKSTAELAAQMAKLAGNKGGFSSEDKGEWKLKLDNAGNGQAVIRFLPSKNDEQAPFAILVNHGFKKNGKWYIENCSS
THGDYDSCPVCQYISKNDLYNTDNKEYGLVKRKTSYWANILVVKDPAAPENEGKVFKYRFGKKIWDKINAMIAVDVEMGE
TPVDVTCPWEGANFVLKVKQVSGFSNYDESKFLNQSAIPNIDDESFQKELFEQMVDLSEMTSKDKFKSFEELSTKFSQVM
GTAAMGGAAATAAKKADKVADDLDAFNVDDFNTKTEDDFMSSSSGSSSSADDTDLDDLLNDL
>P03695 ~~~~~~Single-stranded DNA-binding protein~~~
MFKRKSTAELAAQMAKLNGNKGFSSEDKGEWKLKLDNAGNGQAVIRFLPSKNDEQAPFAILVNHGFKKNGKWYIETCSST
HGDYDSCPVCQYISKNDLYNTDNKEYSLVKRKTSYWANILVVKDPAAPENEGKVFKYRFGKKIWDKINAMIAVDVEMGET
PVDVTCPWEGANFVLKVKQVSGFSNYDESKFLNQSAIPNIDDESFQKELFEQMVDLSEMTSKDKFKSFEELNTKFGQVMG
TAVMGGAAATAAKKADKVADDLDAFNVDDFNTKTEDDFMSSSSGSSSSADDTDLDDLLNDL
>P09797 ~~~~~~Single-stranded DNA-binding protein~~~
MFKRKSTAELAAQMAKLAGNKGGFSSEDKGEWKLKLDNAGNGQAVIRFLPSKNDEQAPFAILVNHGFKKNGKWYIETCSS
THGDYDSCPVCQYISKNDLYNTDNKEYSLVKRKTSYWANILVVKDPAAPENEGKVFKYRFGKKIWDKINAMIAVDVEMGE
TPVDVTCPWEGANFVLKVKQVSGFSNYDESKFLNQSAIPNIDDESFQKELFEQMVDLSEMTSKDKFKSFEELSTKFSQVM
GTAAMGGAAATAAKKADKVADDLDAFNVDDFNTKTEDDFMSSSSGSSSSADDTDLDDLLNDL
>P03696 ~~~~~~Single-stranded DNA-binding protein~~~
MAKKIFTSALGTAEPYAYIAKPDYGNEERGFGNPRGVYKVDLTIPNKDPRCQRMVDEIVKCHEEAYAAAVEEYEANPPAV
ARGKKPLKPYEGDMPFFDNGDGTTTFKFKCYASFQDKKTKETKHINLVVVDSKGKKMEDVPIIGGGSKLKVKYSLVPYKW
NTAVGASVKLQLESVMLVELATFGGGEDDWADEVEENGYVASGSAKASKPRDEESWDEDDEESEEADEDGDF
>P20376 ~~~~~~Probable ssDNA-binding protein~~~
MAKSWGETTGGSNDKIEFLKFNNGITRVRIVSGVLPRYVYWLTNKEGSVAPFECLRFNRDKESFVRGKADPVHELGFFEK
ELDKDGNRVPLKPKKNYIAFVIDRSDNKLKVMEVKATILKGIQSIMKQLNLATPFDIDISIEKKGKGFDTEYDVQQIAAM
QFQIKLQDPNSAESKQYAADVDLIGEAMCDEDGDIIKFEKVPSLEQTYPVPTYEEQKEAIQAFMEGRENKDDDAKSGNSN
AGSQKGIDQEAASDLDD
>P85502 ~~~~~~Structural protein 2~~~
MAYDTSAFPLGSKDPRVLYNNAENMDVAMNSVEQERWMDRGPQRPPLPRWTYWGMEQNYNRFISNSAWELPPLVYVDGSP
LTVERSSQVIERDGNLYSVKLPASFPVELSGTWSADEPLLVFRSDQSLRQELAEQNGGTLVGWKRTQLSASIDTIQQLAD
SIPIRVWEFAELVSDKPSPDPATWNWTPAFQAMVDTAESYMQSSGAKQITCYAGPGTFLIDSIVWRSGVHMYFGGAELKA
HPDSIDGNSLINATLKLSDIGFYGPGIVNGDKDSFDPEHRQHGIHCVAKKVKVKDLFIENIGSSSVFSLGDGVIFRPTIP
EGDFQCEDCEVSGCTFSNIERQCITVESGFNIRILSNGFYNSTYAALDIENAGYTMGDVDGVIFQGNYIDGCLYGVTAVT
YQPVDAQRNIVCGGNIYKNVMDAYHFRGCSNVKVGYGDIAEVSRYGAYIYSDGATTVSNIEISDFTTSGGTYGVYAQTTS
GGSFNRIKLSTLKITGTSTSPITVQSTSGLRIEEVDVLINTGAGVVIQNCASPIIRNLKMVGAVTLSVPAVSFIGTTTNP
RVGGLDIAGFTVGVSVTTSATTTIHSLSNNVFAGVATPWSVNPGNYIKGQFSGTFTMNAAASMNVNSVGMNVTSSVVRLI
PTNAAAATLQAGSKMAWVVNSASSNNVSFRVQTADGTAAAGTETFAFVIENL
>P85503 ~~~~~~Structural protein 3~~~
MAKNNVVKAQGTDLYFIDPDTHVVMNAGCITSLSGIDTSIDQIETTCLNETARSYVAGLATPGTATFSINTNPQDPVHIR
LLELKNAGVSLDWAVGWSDGTSAPTAVLDSSGEYDFDVPADRSWLLFEGYMNSFSFEFAQNAVVTSSIGIQVSGEPVLIP
KSTS
>P09385 3.2.2.22~~~stxA2~~~Shiga-like toxin 2 subunit A~~~
MKCILFKWVLCLLLGFSSVSYSREFTIDFSTQQSYVSSLNSIRTEISTPLEHISQGTTSVSVINHTPPGSYFAVDIRGLD
VYQARFDHLRLIIEQNNLYVAGFVNTATNTFYRFSDFTHISVPGVTTVSMTTDSSYTTLQRVAALERSGMQISRHSLVSS
YLALMEFSGNTMTRDASRAVLRFVTVTAEALRFRQIQREFRQALSETAPVYTMTPGDVDLTLNWGRISNVLPEYRGEDGV
RVGRISFNNISAILGTVAVILNCHHQGARSVRAVNEESQPECQITGDRPVIKINNTLWESNTAAAFLNRKSQFLYTTGK
>P09386 ~~~stxB2~~~Shiga-like toxin 2 subunit B~~~
MKKMFMAVLFALASVNAMAADCAKGKIEFSKYNEDDTFTVKVDGKEYWTSRWNLQPLLQSAQLTGMTVTIKSSTCESGSG
FAEVQFNND
>P69179 ~~~stxB~~~Shiga-like toxin 1 subunit B~~~
MKKTLLIAASLSFFSASALATPDCVTGKVEYTKYNDDDTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIKTNACHNG
GGFSEVIFR
>P69178 ~~~stxB~~~Shiga-like toxin 1 subunit B~~~
MKKTLLIAASLSFFSASALATPDCVTGKVEYTKYNDDDTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIKTNACHNG
GGFSEVIFR
>P24852 ~~~~~~Small t antigen~~~
MELTSEEYEELRGLLGTPDIGNADTLKKAFLKACKVHHPDKGGNEEAMKRLLYLYNKAKIAASATTSQVWYFLIIGYISL
KNKNIYLPKIFWLRFQNMAPHSGNSGGKNSIKALMSKICIVMRN
>P03083 ~~~~~~Small t antigen~~~
MDKVLNREESMELMDLLGLDRSAWGNIPVMRKAYLKKCKELHPDKGGDEDKMKRMNFLYKKMEQGVKVAHQPDFGTWNSS
EVGCDFPPNSDTLYCKEWPNCATNPSVHCPCLMCMLKLRHRNRKFLRSSPLVWIDCYCFDCFRQWFGCDLTQEALHCWEK
VLGDTPYRDLKL
>P03081 ~~~~~~Small t antigen~~~
MDKVLNREESLQLMDLLGLERSAWGNIPLMRKAYLKKCKEFHPDKGGDEEKMKKMNTLYKKMEDGVKYAHQPDFGGFWDA
TEVFASSLNPGVDAMYCKQWPECAKKMSANCICLLCLLRMKHENRKLYRKDPLVWVDCYCFDCFRMWFGLDLCEGTLLLW
CDIIGQTTYRDLKL
>Q5UR82 6.1.1.10~~~MARS~~~Methionine--tRNA ligase~~~
MQKFFVTSALPYPNNSSPHLGNLVGALLSGDVYARFKRNQGHEVIYLCGTDEYGTTTMIRARKEGVTCRELCDKYFELHK
KVYDWFNIEFDVFGRTSTTKQTEITWEIFNGLYNNGYIEEKTTVQAFCEKCDMYLADTYLKGYCYHDGCRENRVISNGDQ
CEICQKMIDVNKLINPFCSICLTPPIQKSTDHLYLSLDKLTPLVQQYLDRVEFDSRIMAISKAWLEIGLNPRCITRDLEW
GTPIPINLDPKLEKYADKVFYVWFDAPIGYYSILANERDDWREWLNSGVTWVSTQAKDNVPFHSIVFPASVIGSNIELPL
IDRICGTDYLLYEGQKFSKSQGVGLFGDKVAEISPKLGINEDYWRFYLMKIRPETQDSSFNLEEFVRIVKTDLVNNIGNF
INRVFSLLEKTPYRDLNYQISPEYIEFIKKYEVSMDEFKFRDGLKICLEMSSRGNKFVQSTKPWTMIKDGLDTQEIMTEA
VGICWILLNLLKPIIPKSACDMLSNLDTDNQNIFCLIGGSNINIRILNIIKLPFKNIDLKQLREFIEGKN
>Q5UPJ7 6.1.1.1~~~YARS~~~Tyrosine--tRNA ligase~~~
MENTDHTNNEHRLTQLLSIAEECETLDRLKQLVDSGRIFTAYNGFEPSGRIHIAQALITVMNTNNIIECGGQMIIYIADW
FAKMNLKMNGDINKIRELGRYFIEVFKACGINLDGTRFIWASEFIASNPSYIERMLDIAEFSTISRVKRCCQIMGRNESD
CLKASQIFYPCMQAADVFELVPEGIDICQLGIDQRKVNMLAIEYANDRGLKIPISLSHHMLMSLSGPKKKMSKSDPQGAI
FMDDTEQEVSEKISRAYCTDETFDNPIFEYIKYLLLRWFGTLNLCGKIYTDIESIQEDFSSMNKRELKTDVANYINTIID
LVREHFKKPELSELLSNVKSYQQPSK
>P52283 3.1.21.4~~~CVJIR~~~Type II restriction enzyme CviJI~~~
MDIRRKRFTIEGAKRIILEKKRLEEKKRIAEEKKRIALIEKQRIAEEKKRIAEEKKRFALEEKKRIAEEKKRIAEEKKRI
VEEKKRLALIEKQRIAEEKIASGRKIRKRISTNATKHEREFVKVINSMFVGPATFVFVDIKGNKSREIHNVVRFRQLQGS
KAKSPTAYVDREYNKPKADIAAVDITGKDVAWISHKASEGYQQYLKISGKNLKFTGKELEEVLSFKRKVVSMAPVSKIWP
ANKTVWSPIKSNLIKNQAIFGFDYGKKPGRDNVDIIGQGRPIITKRGSILYLTFTGFSALNGHLENFTGKHEPVFYVRTE
RSSSGRSITTVVNGVTYKNLRFFIHPYNFVSSKTQRIM
>P31117 3.1.21.4~~~CVIAIIR~~~Type II restriction enzyme CviAII~~~
MTQKILNPVTGRFVKVDGSTGKKIKTGNIYDVNNILSSKLTKKIFDKIRRNDLAKEERETQQLTKKISKGIFDKIRSENK
KYEKNHQKLTKKMVLDIFSKIYKEDSRKSKTQCVVSEKKDNGGVLLTEDLGKIFEKSICMLYDTPYIGPYKYGNEKPMLL
KTRLHKLLDFFPELTHTAAGGALHDFTTKNSRYLSAKTSKKKDGKVAPQKIGQPTKKKFLEFFNLPPDTSNDDIKLFIKK
NIVRILDEYFKYTFDDTIIYYNEVNNIIMLVKTLKKVKFDPNLIEFGCNKPGKSWKESTILFYNNKRLGEFQIHTSRSCI
KFRWFFENILLLFPDNFEVTIL
>P08763 2.1.1.72~~~mod~~~Type III restriction-modification enzyme EcoPI Mod subunit~~~
MKKETIFSEVETANSKQLAVLKANFPQCFDKNGAFIQEKLLEIIRASEVELSKESYSLNWLGKSYARLLANLPPKTLLAE
DKTHNQQEENKNSQNLLIKGDNLEVLKHMVNAYAEKVNMIYIDPPYNTGKDGFVYNDDRKFTPEQLSELAGIELDEANRI
LEFTTKGSSSHSAWLTFIYPRLYIARELLKEDGVIFISIDDNEDKQLGLLCDEVFGQGNFVAKLPTIMNLKGNHDNFGFS
DTHEYIYVYAKNKDVCSLGQFDIDESEVEKEWDEDEYGLFKRADTLKRTGQDASRKSRPKGWFPVFINSENKVYVTDDDK
PLNEDDYVLYPVSPTGEELSWSWGKKKINDEFYNLIVIDIKDGKNIYKKQRPALGELPTKKPKSIWYKPEYSTSTATTEL
KNLLGAKLFEGPKPVPLITDLVKIGTKKDSLVLDFFAGSGTTAEAVAYLNEKDSGCRNFICIQKDEVINKTKNAYSLGYR
SIFEITKKRIQEVFKKSTTTSDNAAKIGFKVIHTIDDFRAKVESELTLTNHTFFDDAVLTPEQYDALLTTWCVYDGSLLT
TPIEDVDLSGYTAHFCNGRLYLIAPNFTSEALKALLQKLDSDEDFAPNKVVFYGCNFESAKQRELNEALKSYANKKSIEL
DLVVRN
>P08764 3.1.21.5~~~res~~~Type III restriction-modification enzyme EcoPI Res subunit~~~
MSKGFTFEKNLPHQKAGVDAVMNVFVSATSHQEDNVSIRLLVNPELRLTEQQYYKNIKKVQELNGIEHVKNNYDARSNVI
DVSMETGTGKTYTYTKTIFDLNKSFGINKFIIIVPTLSIKAGTVNFLKSDALKEHFRDDYEREIKTYVVESQKNAGKSTK
SYMPQAIHDFVEASNFNKKYIHVLVINTGMIHSKNLNSTYDVGLLDNHFDSPFSALGAVKPFIIIDEPHKFPTGKKTWEN
IEKFNAQYIIRYGATFSEGYKNLVYRLTAVDAFNEDLVKGIDAYIEDIVGDGDANLKFIKSDGEEVTFELNENNKKTLFK
LTKGESLSKTHSAIHDLTLDALGKNTVVLSNGIELKIGCSINPYSYDQTLADSMMRKAIKEHFKLEKEFLTQRPRIKPLT
LFFIDDIEGYRDGNNIAGSLKAKFEEYVLAEANELLKIEKDEFYSNYLEKTVKDISSVHGGYFSKDNSDKDDKIEKEINE
ILHDKELLLSLDNPRRFIFSKWTLREGWDNPNVFQICKLRSSGSTTSKLQEVGRGLRLPVNEYMCRVKDRNFTLKYYVDF
TEKDFVDSLVKEVNESSFKERVPSKFTQELKEQIRAQYPELSSRALMNELFNDEIIDDNDNFKDSDAYSRLKSKYPAAFP
IGVKPGKIKKATDGKRRTKMRVGKFSELKELWELINQKAVIEYKINSENEFLSIFKSFMLEETERFTKSGVHTRIDKIYI
HNDMAMSKSIVSDDDDFAKLNTMSYREFLDNLSQTIFVKHDTLHKVFCDIKDTINITEYLNIQTIRKIKSGFSKYLLNNS
FNKFSLGYNLISGSIHPTKFTNADGKPLDEVLSSDLGVLQDNSKAPLDTYLFEEVFYDSELERRNITDREIQSVVVFSKI
PKNSIKIPVAGGYTYSPDFAYVVKTAEGDYLNFIIETKNVDSKDSLRLEEKKKIEHAQALFNQISQSVKVEFKTQFANDD
IYQLIKSALP
>P85988 ~~~~~~Major tail tube protein~~~
MAIPKKLRLFTLYVDGTNHIGKIPSVTLPKVTRKTEDYQGGGMQGAVAVDLGLDGGALDASMVVGGVVEELILKYGGDID
EMRLRFVGEIYSGGTSSLMEVEMRGRITEIDPGEAKQGDDTNHTYAIKNTYYKLSVDDKALLEIDLLNFIYKRDGKNLYP
DRIVSALGLGG
>A9CRB8 ~~~~~~Putative tail protein~~~
MAQDKYIVALQIADKDLAKKLTIEEATLLGSLAEGGHTISNDLAEIIQGGKKDYSRNSVEEEIKLTLDVVPGDKGQLALK
ESVKQFKQLRVWIWETKKRDGKHHGVFAYVVIEEHEWSFDDEDNKIEITAKVKFNSADGTINDLPKEWLNPSALAPVVEF
EDMNAYEDSYENRTKKTTAGSSDLSM
>B2ZYZ1 ~~~~~~Putative tail protein~~~
MANMKNSNDRIILFRKAGEKVDATKMLFLTEYGLSHEADTDTEDTMDGSYNTGGSVESTMSGTAKMFYGDDFADEIEDAV
VDRVLYEAWEVESRIPGKNGDATKFKAKYFQGFHNKFELKAEANGIDEYEYEYGVNGRFQRGFATLPEAVTKKLKATGYR
FHDTTKADALTGEDLTAIPQPKVDSSTVTPGEV
>P0DJY4 ~~~~~~Probable tail assembly protein FS-gp41~~~
MDEMNLGPEAQELHDSIVAEIQSGVLKLKDGLPFGTGDETEMQYDVTLRELTAGDMIDAQAAAEKLVMSKEGPVLVSSPS
RMGLEMLRRQIASVGCIKGPLSMALIRKLSVDDFQRLSLATEMYDMAVAASLTQERGGEWLRCRNDIEKAATAIGVILKS
GPEWALSLPLSRFFRHCQQAKTLSQYHR
>P79680 ~~~~~~Probable tail assembly protein gp41~~~
MDEMNLGPEAQELHDSIVAEIQSGVLKLKDGLPFGTGDETEMQYDVTLRELTAGDMIDAQAAAEKLVMSKEGPVLVSSPS
RMGLEMLRRQIASVGCIKGPLSMALIRKLSVDDFQRLSLATEMYDMAVAASLTQERGRVAAVPE
>P13771 3.6.1.-~~~B~~~ATP-dependent target DNA activator B~~~
MNISDIRAGLRTLVENEETTFKQIALESGLSTGTISSFINDKYNGDNERVSQMLQRWLEKYHAVAELPEPPRFVETQTVK
QIWTSMRFASLTESIAVVCGNPGVGKTEAAREYRRTNNNVWMITITPSCASVLECLTELAFELGMNDAPRRKGPLSRALR
RRLEGTQGLVIIDEADHLGAEVLEELRLLQESTRTGLVLMGNHRVYSNMTGGNRTVEFARLFSRIAKRTAINKTKKADVK
AIADAWQINGENELELLQQIAQKPGALRILNHSLRLAAMTAHGKGERVNEDYLRQAFRELDLDVDISTLLRN
>P03763 3.6.1.-~~~B~~~ATP-dependent target DNA activator B~~~
MNISDIRAGLRTLVENEETTFKQIALESGLSTGTISSFINDKYNGDNERVSQMLQRWLEKYHAVAELPEPPRFVETQTVK
QIWTSMRFASLTESIAVVCGNPGVGKTEAAREYRRTNNNVWMITITPSCASVLECLTELAFELGMNDAPRRKGPLSRALR
RRLEGTQGLVIIDEADHLGAEVLEELRLLQESTRIGLVLMGNHRVYSNMTGGNRTVEFARLFSRIAKRTAINKTKKADVK
AIADAWQINGEKELELLQQIAQKPGALRILNHSLRLAAMTAHGKGERVNEDYLRQAFRELDLDVDISTLLRN
>P19564 ~~~tat~~~Protein Tat~~~
MPGPWVAMIMLPQPKESFGGKPIGWLFWNTCKGPRRDCPHCCCPICSWHCQLCFLQKNLGINYGSGPRRRGTRGKGRRIR
RTASGGDQRREADSQRSFTNMDQ
>P32544 ~~~tat~~~Protein Tat~~~
MADRRIPGTAEENLQKSSGGVPGQNTGGQEARPNYHCQLCFLRSLGIDYLDASLRKKNKQRLKAIQQGRQPQYLL
>P20920 ~~~tat~~~Protein Tat~~~
MADRRIPGTAEENLQKSSGGVPGQNTGGQEARPNYHCQLCFLRSLGIDYLDASLRKKNKQRLKAIQQGRQPQYLL
>P04326 ~~~tat~~~Protein Tat~~~
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRAPQGSQTHQVSLSKQPTSQSRGD
PTGPKE
>P69697 ~~~tat~~~Protein Tat~~~
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSRGD
PTGPKE
>P04610 ~~~tat~~~Protein Tat~~~
MEPVDPRLEPWKHPGSQPKTACTTCYCKKCCFHCQVCFTTKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQPRGD
PTGPKE
>P04608 ~~~tat~~~Protein Tat~~~
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRAHQNSQTHQASLSKQPTSQPRGD
PTGPKE
>P69698 ~~~tat~~~Protein Tat~~~
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRPPQGSQTHQVSLSKQPTSQSRGD
PTGPKE
>P04613 ~~~tat~~~Protein Tat~~~
MDPVDPNLEPWNHPGSQPRTPCNKCYCKKCCYHCQMCFITKGLGISYGRKKRRQRRRPPQGNQAHQDPLPEQPSSQHRGD
HPTGPKE
>P12506 ~~~tat~~~Protein Tat~~~
MDPVDPNIEPWNHPGSQPKTACNRCHCKKCCYHCQVCFITKGLGISYGRKKRRQRRRPSQGGQTHQDPIPKQPSSQPRGD
PTGPKE
>P04605 ~~~tat~~~Protein Tat~~~
METPLKAPESSLKSCNEPFSRTSEQDVATQELARQGEEILSQLYRPLETCNNSCYCKRCCYHCQMCFLNKGLGICYERKG
RRRRTPKKTKTHPSPTPDKSISTRTGDSQPTKKQKKTVEATVETDTGPGR
>P20880 ~~~tat~~~Protein Tat~~~
METPLKAPEGSLGSYNEPSSCTSEQDAAAQGLVSPGDEILYQLYQPLEACDNKCYCKKCCYHCQMCFLNKGLGIWYERKG
RRRRTPKKTKAHSSSASDKSISTRTGNSQPEKKQKKTLETALETIGGPGR
>Q82854 ~~~tat~~~Protein Tat~~~
MPGPWATTLTFPGHNGGFGGGPKCWLFWNTCAGPRRVCPKCSCPICVWHCQLCFLQKGLGIRHDGRRKKRGTRGKGRKIH
YARSITESGGQRAPNCASSSASCQTWALKHGINC
>P36340 ~~~tat~~~Protein Tat~~~
MEPSGKEDHNCLPQDLGQEEIDYKQLLEEYYQPLQACENKCWCKKCCFHCMLCFHKKGLGIRYHVYRKRGPGTNKKIPGG
GEEAIRRAIDLCFFNRTCSRTHTANGQTTEKKKATA
>P22384 ~~~tat~~~Protein Tat~~~
MEPSGKEDHNCPPQDSGQEEIDYKQLLEEYYQPLQACENKCWCKKCCFHCMLCFQKKGLGIRYHVYRKRVPGTNKKIPGS
GEEAIRRAIDLSFHRTASRTYTANGQTTEKKKATA
>P03409 ~~~Tax~~~Protein Tax-1~~~
MAHFPGFGQSLLFGYPVYVFGDCVQGDWCPISGGLCSARLHRHALLATCPEHQITWDPIDGRVIGSALQFLIPRLPSFPT
QRTSKTLKVLTPPITHTTPNIPPSFLQAMRKYSPFRNGYMEPTLGQHLPTLSFPDPGLRPQNLYTLWGGSVVCMYLYQLS
PPITWPLLPHVIFCHPGQLGAFLTNVPYKRIEELLYKISLTTGALIILPEDCLPTTLFQPARAPVTLTAWQNGLLPFHST
LTTPGLIWTFTDGTPMISGPCPKDGQPSLVLQSSSFIFHKFQTKAYHPSFLLSHGLIQYSSFHSLHLLFEEYTNIPISLL
FNEKEADDNDHEPQISPGGLEPPSEKHFRETEV
>P14079 ~~~tax~~~Protein Tax-1~~~
MAHFPGFGQSLLFGYPVYVFGDCVQGDWCPISGGLCSARLHRHALLATCPEHQITWDPIDGRVIGSALQFLIPRLPSFPT
QRTSKTLKVLTPPITHTTPNIPPSFLQAMRKYSPFRNGYMEPTLGQHLPTLSFPDPGLRPQNLYTLWGGSVVCMYLYQLS
PPITWPLLPHVIFCHPGQLGAFLTNVPYKRIEKLLYKISLTTGALIILPEDCLPTTLFQPARAPVTLTAWQNGLLPFHST
LTTPGLIWTFTDGTPMISGPCPKDGQPSLVLQSSSFIFHKFQTKAYHPSFLLSHGLIQYSSFHNLHLLFEEYTNIPISLL
FNEKEADDNDHEPQISPGGLEPLSEKHFRETEV
>P0C213 ~~~tax~~~Protein Tax-1~~~
MAHFPGFGQSLLFGYPVYVFGDCVQGDWCPISGGLCSARLHRHALLATCPEHQITWDPIDGRVIGSALQFLIPRLPSFPT
QRTSKTLKVLTPPTTHTTPNIPPSFLQAMRKYSPFRNGYMEPTLGQHLPTLSFPDPGLRPQNLYTLWGGSVVCMYLYQLS
PPITWPLLPHVIFDHPDQLGAFLTNVPYKRMEELLYKISLTTGALIILPEDCLPTTLFQPARAPVTLTAWQSGLLPFHST
LTTPGLIWAFTDGTPMISGPCPKDGRPSLVLQSSSFIFHKFQTKAYHPSFLLSHGLIQYSSFHNLHLLFEEYTNIPISLL
FNEKEADDTDHEPQVSPGGLEPPSEKHFRETEV
>P0C222 ~~~tax~~~Protein Tax-1~~~
MAHFPGFGQSLLYGYPVYVFGDCVQGDWCPISGGLCSARLHRHALLATCPEHQITDPIDGRVIGSALQFLIPRLPSFPTQ
RTSKTLKVLTPPTTHTTPNIPPSFFQAVRQHSPFRNGCMEPTLGQQLPSLSFPDPGLRPQNLYTLWGSSVVCMYLYQLSP
PITWPLLPQVIFCHPGQLGAFLTNVPYKRMEELLYKIFLNTGALIILPEGCLPTTLFQPIRAPATLTAWQNGLLPFQSTL
TTPGLIWTFSDGTPMISGPCPKDGQPSLVLQSSSFIFHKFQTKAYHPSVLLSHGLIQYSSFHSLHLPFEEYTNIPISLLF
NKREADDTDYGPRIPPGGLEPPSEKHFHETEV
>Q4U0X7 ~~~tax~~~Protein Tax-3~~~
MAHFPGFGQSLLYGYPVYVFGDCVQADWCPISGGLCSARLHRHALLATCPEHQITWDPIDGRVVSSALQYLIPRLPSFPT
QRTTRTLKVLTPPTTATTPKVPPSFFHAVKKHTPFRNNCLELTLGEQLPAMSFPDPGLRPQNVYTIWGCSVVCLYLYQLS
PPMTWPLIPHVIFCHPEQLGAFLTRVPTKRLEELLYKIFLSTGAIIILPENCFPTTLFQPTRAPAIQAPWHTGLLPCQKE
IVTPGLIWTFTDGSPMISGPCPKEGQPSLVVQSSTFIFQQFQTKASHPAFLLSHKLIQYSSFHSLHLLFEEYSTVPFSLL
FNEKGANVSDDEPRGGPQPPTGGQIAESSV
>P03410 ~~~tax~~~Protein Tax-2~~~
MAHFPGFGQSLLYGYPVYVFGDCVQADWCPVSGGLCSTRLHRHALLATCPEHQLTWDPIDGRVVSSPLQYLIPRLPSFPT
QRTSRTLKVLTPPTTPVSPKVPPAFFQSMRKHTPYRNGCLEPTLGDQLPSLAFPEPGLRPQNIYTTWGKTVVCLYLYQLS
PPMTWPLIPHVIFCHPRQLGAFLTKVPLKRLEELLYKMFLHTGTVIVLPEDDLPTTMFQPVRAPCIQTAWCTGLLPYHSI
LTTPGLIWTFNDGSPMISGPYPKAGQPSLVVQSSLLIFEKFETKAFHPSYLLSHQLIQYSSFHNLHLLFDEYTNIPVSIL
FNKEEADDNGD
>Q65175 ~~~~~~Putative TATA-binding protein pB263R~~~
MEDETELCFRSNKVTRLEMFVCTYGGKITSLACSHMELIKMLQIAEPVKALNCNFGHQCLPGYESLIKTPKKTKNMLRRP
RKTEGDGTCFNSAIEASILFKDKMYKLKCFPSTGEIQVPGVIFPDFEDGKNIIQQWVDFLQHQPIEKKIQIIEFKTIMIN
FKFQINPVSPRVIIHLKKFAALLEHIPTPYPIREIKPPLEDSKVSAKFMVSPGKKVRINVFLKGKINILGCNTKESAEII
YTFLKDLISVHWQEILCVLPVPD
>P10230 ~~~~~~Tegument protein UL46~~~
MQRRTRGASSLRLARCLTPANLIRGDNAGVPERRIFGGCLLPTPEGLLSAAVGALRQRSDDAQPAFLTCTDRSVRLAARQ
HNTVPESLIVDGLASDPHYEYIRHYASAATQALGEVELPGGQLSRAILTQYWKYLQTVVPSGLDVPEDPVGDCDPSLHVL
LRPTLAPKLLARTPFKSGAVAAKYAATVAGLRDALHRIQQYMFFMRPADPSRPSTDTALRLNELLAYVSVLYRWASWMLW
TTDKHVCHRLSPSNRRFLPLGGSPEAPAETFARHLDRGPSGTTGSMQCMALRAAVSDVLGHLTRLANLWQTGKRSGGTYG
TVDTVVSTVEVLSIVHHHAQYIINATLTGYGVWATDSLNNEYLRAAVDSQERFCRTTAPLFPTMTAPSWARMELSIKAWF
GAALAADLLRNGAPSLHYESILRLVASRRTTWSAGPPPDDMASGPGGHRAGGGTCREKIQRARRDNEPPPLPRPRLHSTP
ASTRRFRRRRADGAGPPLPDANDPVAEPPAAATQPATYYTHMGEVPPRLPARNVAGPDRRPPAATCPLLVRRASLGSLDR
PRVWGPAPEGEPDQMEATYLTADDDDDDARRKATHAASARERHAPYEDDESIYETVSEDGGRVYEEIPWMRVYENVCVNT
ANAAPASPYIEAENPLYDWGGSALFSPPGRTGPPPPPLSPSPVLARHRANALTNDGPTNVAALSALLTKLKREGRRSR
>P89466 ~~~~~~Tegument protein UL46~~~
MQRRARGASSLRLARCLTPANLIRGANAGVPERRIFAGCLLPTPEGLLSAAVGVLRQRADDLQPAFLTGADRSVRLAARH
HNTVPESLIVDGLASDPHYDYIRHYASAAKQALGEVELSGGQLSRAILAQYWKYLQTVVPSGLDIPDDPAGDCDPSLHVL
LRPTLLPKLLVRAPFKSGAAAAKYAAAVAGLRDAAHRLQQYMFFMRPADPSRPSTDTALRLSELLAYVSVLYHWASWMLW
TADKYVCRRLGPADRRFVALSGSLEAPAETFARHLDRGPSGTTGSMQCMALRAAVSDVLGHLTRLAHLWETGKRSGGTYG
IVDAIVSTVEVLSIVHHHAQYIINATLTGYVVWASDSLNNEYLTAAVDSQERFCRTAAPLFPTMTAPSWARMELSIKSWF
GAALAPDLLRSGTPSPHYESILRLAASGPPGGRGAVGGSCRDKIQRTRRDNAPPPLPRARPHSTPAAPRRCRRHREDLPE
PPHVDAADRGPEPCAGRPATYYTHMAGAPPRLPPRNPAPPEQRPAAAARPLAAQREAAGVYDAVRTWGPDAEAEPDQMEN
TYLLPDDDAAMPAGVGLGATPAADTTAAAAWPAESHAPRAPSEDADSIYESVGEDGGRVYEEIPWVRVYENICPRRRLAG
GAALPGDAPDSPYIEAENPLYDWGGSALFSPRRATRAPDPGLSLSPMPARPRTNALANDGPTNVAALSALLTKLKRGRHQ
SH
>P04291 ~~~~~~Tegument protein UL14~~~
MDRDAAHAALRRRLAETHLRAEIYKDQTLQLHREGVSTQDPRFVGAFMAAKAAHLELEARLKSRARLEMMRQRATCVKIR
VEEQAARRDFLTAHRRYLDPALGERLDAVDDRLADQEEQLEEAATNASLWGDGDLAEGWMSPADSDLLVMWQLTSAPKVH
ANGPSRIGSHPTYTPTPTGPPGAPAAPLSRTPPSPAPPTGPATDPASASGFARDYPDGE
>P10205 ~~~~~~Tegument protein UL21~~~
MELSYATTMHYRDVVFYVTTDRNRAYFVCGGCVYSVGRPCASQPGEIAKFGLVVRGTGPDDRVVANYVRSELRQRGLQDV
RPIGEDEVFLDSVCLLNPNVSSELDVINTNDVEVLDECLAEYCTSLRTSPGVLISGLRVRAQDRIIELFEHPTIVNVSSH
FVYTPSPYVFALAQAHLPRLPSSLEALVSGLFDGIPAPRQPLDAHNPRTDVVITGRRAPRPIAGSGAGSGGAGAKRATVS
EFVQVKHIDRVGPAGVSPAPPPNNTDSSSLVPGAQDSAPPGPTLRELWWVFYAADRALEEPRADSGLTREEVRAVRGFRE
QAWKLFGSAGAPRAFIGAALGLSPLQKLAVYYYIIHRERRLSPFPALVRLVGRYTQRHGLYVPRPDDPVLADAINGLFRD
ALAAGTTAEQLLMFDLLPPKDVPVGSDVQADSTALLRFIESQRLAVPGGVISPEHVAYLGAFLSVLYAGRGRMSAATHTA
RLTGVTSLVLAVGDVDRLSAFDRGAAGAASRTRAAGYLDVLLTVRLARSQHGQSV
>Q00703 ~~~~~~Tegument protein UL21~~~
MEFEYQSTIVHQGVLFYVADGGDRAYFVHGGCIVSVHRRSREIGKFGLTLRGNAPGNRVVANYVRTELARLGRAWAAPQG
SDDVFVDALGLLLPLTELDLCGRAELDVYDPYLVECMVSLPASALSLTLVHDRQQDRVLELLAEPAIVHPSSGFVYAVNE
ACFALVQAYLSELPSSLQVLTEGLFDGIPGVRPPLSGETRPTAVVVKGGRAAPTLSVRPRRYAERALRATVVSDFVQVRY
IPATRRIWATRGGSLSLQMLCDLVAGADAILRRAAGASDDASAAVVEAVSAVAADPFFGTGSTSLTGAQRFALYQFILAR
WHLPSCYAALEGMLDRLDERPGAGAGDDDDGGEGGGGGGHGGSRAASAVAHAVNRVLREATVFGEVMRMLVNAAVVHAPA
IADPAGASPPTTKHAREDAATGLELAVMMSDAETNALDADACELVEAAGARVLDGLYAGRGLVAATAPVGRALRPTSAVC
AEAALLTAFGDSPAALRGAQYLFQLFRARLTRANISIVLNKNR
>P36338 ~~~~~~Tegument protein UL47~~~
MDAARDGRPERRRAVSGTYRTHPFQRPSARRSAGRPARCGRRGRGAPRVRRPRPYFQRPPDEDTSEDENVYDYIDGDSSD
SADDYDSDYFTANRGPNHGAGDAMDTDAPPERAPEGGAPQDYLTAHLRAIEVLPESAPHRSLLERTARTVYAQQFPPRDL
SAGSKAPAQRARRSLRGFPRGGGGGQEPGPDDEGDDAADLREDLVPDEAYAHLERDERLSEGPPLLNMEAAAAAAGERSV
VEELFTYAPAQPQVEVPLPRILEGRVRPSAFFAQMSLDALCRTPPNDQRVARERRAWEMAGTPHGLLITTWSTVDPEFSI
GGMYVGAPEGTRPRLVWRRAMKQAMALQYRLGVGGLCRAVDGAACRPLRRCSFWRDALLRECATAIFCRGRGARAAPRRL
PRPAVGLLAATQFTPPDASPHATLFRGSMGSLIYWHELRVMLTAVPALCARYAGAGLQSAELYLLALRHSEAPGYTANER
YALSAYLTLFVALAERGLRWLYLAGAHLLGPHPTAAAFREVRAKIPYERLPLGSATLHDAEVETVDSATFQEALAFSALA
HVYGEAYVAVRTATTLLMAEYAVHAERRDVRQMTAAFLGVGLIAQRLMGSLNLLLNCVAGAAVYGGRRVTVREGTLARYS
LLADAALPLVRPVFLVEFREARDGVMRELRLRPVASPPLAGKRRVMELYLSLDSIEALVGREPLGSRPVLGPLVDIAEAL
ADHPHLVTGDGRGPRLGGR
>P30021 ~~~~~~Tegument protein UL47~~~
MDAARDGRPERRPRRSGTYRTHPFQRPSARRSLLDALRAADAEAAERPRVRRPRPDFQRPPDEDTSEDENVYDYIDGDSS
DSADDYDSDYFTANRGPNHGAGDAMDTDAPPERAPEGGAPQDYLTAHLRAIEALPESAPHRSLLERTARTVYAHEFPPRD
LSAGSRAPAQRARRSLRGFPRGGGGGQEPGPDDEGDDAADLREDLVPDEAYAHLERDERLSEGPPLLNMEAAAAAAGERS
VVEELFTYAPAQPQVEVPLPRILEGRVRPSAFFAQMPLDALCRTPPNDQRVVRERRAWDMAGTPHGLLITTWSTVDPEFS
IGGMYVGAPEGTRPRLVWRRAMKQAMALQYRLGVGGLCRAVDGARMPPTEALLFLAARAAARSAQLPFFVAAGARGRRRA
APARGGGWAAGSHAVHATGRVPHATLFRGSMGSLIYWHELRVMLTAVPALCARYAGAGLQSAELYLLALRHSEAPGYTAN
ERYALSAYLTLFVALAERAVRWLYLAGAHLLGPHPTAAAFREVRAKIPYERLPLGSATLHDAEVETVDSATFQEALAFSA
LAHVYGEAYVAVRTATTLLMAEYAAHAERRDVREMTAAFLGVGLIAQRLMGSLEPAAELRSRRSGVRGPACPTVREGTLA
RYSLLADAALPLVRPVSLVEFWEARDGVMRELRLRPVASPPLAGKRRVMELYLSLDSIEALVGREPLGSRPVLGPLVDIA
EALADHPHLVTGDGRGPRLGGR
>P10231 ~~~~~~Tegument protein UL47~~~
MSAREPAGRRRRASTRPRASPVADEPAGDGVGFMGYLRAVFRGDDDSELEALEEMAGDEPPVRRRREGPRARRRRASEAP
PTSHRRASRQRPGPDAARSQSVRGRLDDDDEVPRGPPQARQGGYLGPVDARAILGRVGGSRVAPSPLFLEELQYEEDDYP
EAVGPEDGGGARSPPKVEVLEGRVPGPELRAAFPLDRLAPQVAVWDESVRSALALGHPAGFYPCPDSAFGLSRVGVMHFA
SPDNPAVFFRQTLQQGEALAWYITGDGILDLTDRRTKTSPAQAMSFLADAVVRLAINGWVCGTRLHAEARGSDLDDRAAE
LRRQFASLTALRPVGAAAVPLLSAGGLVSPQSGPDAAVFRSSLGSLLYWPGVRALLDRDCRVAARYAGRMTYLATGALLA
RFNPDAVRCVLTREAAFLGRVLDVLAVMAEQTVQWLSVVVGARLHPHVHHPAFADVAREELFRALPLGSPAVVGAEHEAL
GDTAARRLLANSGLNAVLGAAVYALHTALATVTLKYARACGDAHRRRDDAAATRAILAAGLVLQRLLGFADTVVACVTLA
AFDGGFTAPEVGTYTPLRYACVLRATQPLYARTTPAKFWADVRAAAEHVDLRPASSAPRAPVSGTADPAFLLKDLEPFPP
APVSGGSVLGPRVRVVDIMSQFRKLLMGDEGAAALRAHVSGRRATGLGGPPRP
>P09263 ~~~~~~Tegument protein UL47 homolog~~~
MQSGHYNRRQSRRQRISSNTTDSPRHTHGTRYRSTNWYTHPPQILSNSETLVAVQELLNSEMDQDSSSDASDDFPGYALH
HSTYNGSEQNTSTSRHENRIFKLTEREANEEININTDAIDDEGEAEEGEAEEDAIDDEGEAEEGEAEEDAIDDEGEAEEG
EAEEDAIDDEGEAEEGEAEEGEAEEGEAEEDAIDDEGEAEEDAAEEDAIDDEGEAEEDYFSVSQVCSRDADEVYFTLDPE
ISYSTDLRIAKVMEPAVSKELNVSKRCVEPVTLTGSMLAHNGFDESWFAMRECTRREYITVQGLYDPIHLRYQFDTSRMT
PPQILRTIPALPNMTLGELLLIFPIEFMAQPISIERILVEDVFLDRRASSKTHKYGPRWNSVYALPYNAGKMYVQHIPGF
YDVSLRAVGQGTAIWHHMILSTAACAISNRISHGDGLGFLLDAAIRISANCIFLGRNDNFGVGDPCWLEDHLAGLPREAV
PDVLQVTQLVLPNRGPTVAIMRGFFGALAYWPELRIAISEPSTSLVRYATGHMELAEWFLFSRTHSLKPQFTPTEREMLA
SFFTLYVTLGGGMLNWICRATAMYLAAPYHSRSAYIAVCESLPYYYIPVNSDLLCDLEVLLLGEVDLPTVCESYATIAHE
LTGYEAVRTAATNFMIEFADCYKESETDLMVSAYLGAVLLLQRVLGHANLLLLLLSGAALYGGCSIYIPRGILDAYNTLM
LAASPLYAHQTLTSFWKDRDDAMQTLGIRPTTDVLPKEQDRIVQASPIEMNFRFVGLETIYPREQPIPSVDLAENLMQYR
NEILGLDWKSVAMHLLRKY
>P0CK49 ~~~BSRF1~~~Tegument protein UL51 homolog~~~
MAFYLPDWSCCGLWLFGRPRNRYSQLPEEPETFECPDRWRAEIDLGLPPGVQVGDLLRNEQTMGSLRQVYLLAVQANSIT
DHLKRFDAVRVPESCRGVVEAQVAKLEAVRSVIWNTMISLAVSGIEMDENGLKALLDKQAGDSLALMEMEKVATALKMDE
TGAWAQEISAVVSSVTAPSASAPFINSAFEPEVPTPVLAPPPVVRQPEHSGPTELALT
>P16823 ~~~~~~Tegument protein UL51 homolog~~~
MQLAQRLCELLMCRRKAAPVADYVLLQPSEDVELRELQAFLDENFKQLEITPADLRTFSRDTDVVNHLLKLLPLYRQCQS
KCAFLKGYLSEGCLPHTRPAAEVECKKSQRILEALDILILKLVVGEFAMSEADSLEMLLDKFSTDQASLVEVQRVMGLVD
MDCEKSAYMLEAGAAATVAPLTPPAVVQGESGVREDGETVAAVSAFACPSVSDSLIPEETGVTRPMMSLAHINTVSCPTV
MRFDQRLLEEGDEEDEVTVMSPSPEPVQQQPPVEPVQQQPQGRGSHRRRYKESAPQETLPTNHEREILDLMRHSPDVPRE
AVMSPTMVTIPPPQIPFVGSARELRGVKKKKPTAAALLSSA
>P10235 ~~~~~~Tegument protein UL51~~~
MASLLGAICGWGARPEEQYEMIRAAVPPSEAEPRLQEALAVVNALLPAPITLDDALGSLDDTRRLVKARALARTYHACMV
NLERLARHHPGFEAPTIDGAVAAHQDKMRRLADTCMATILQMYMSVGAADKSADVLVSQAIRSMAESDVVMEDVAIAERA
LGLSAFGVAGGTRSGGIGVTEAPSLGHPHTPPPEVTLAPAARNGDALPDPKPESCPRVSVPRPTASPTAPRPGPSRAAPC
VLGQ
>F5H9W9 ~~~~~~Tegument protein ORF55~~~
MSSPWYTWTCCGINLFGRGNHAYKRLGDPLEGCPERWRQEIDLGLPPGVCLGDVVQSNLGTTALHQTYLLAVQSNKITDY
LKRFDVAKIPAGCQETVKTQVKKLQSIQNVVWNTMLALAVGEITVDDSALQSLLNKRAGECVSLMEMEKLATAMASDDSV
IWASEISHSLSEPTSVLPLTPAVTRQPEATLPKPPTEDPSVSAMHSSIPPRPSSTLEETTESAIGST
>Q4JQW8 ~~~ORF7~~~Tegument protein UL51 homolog~~~
MQTVCASLCGYARIPTEEPSYEEVRVNTHPQGAALLRLQEALTAVNGLLPAPLTLEDVVASADNTRRLVRAQALARTYAA
CSRNIECLKQHHFTEDNPGLNAVVRSHMENSKRLADMCLAAITHLYLSVGAVDVTTDDIVDQTLRMTAESEVVMSDVVLL
EKTLGVVAKPQASFDVSHNHELSIAKGENVGLKTSPIKSEATQLSEIKPPLIEVSDNNTSNLTKKTYPTETLQPVLTPKQ
TQDVQRTTPAIKKSHVMLV
>A0A1L4BKS3 ~~~~~~Terminase, large subunit~~~
MKRLRPSDKFFELLGYKPHHVQLAIHRSTAKRRVACLGRQSGKSEAASVEAVFELFARPGSQGWIIAPTYDQAEIIFGRV
VEKVERLSEVFPTTEVQLQRRRLRLLVHHYDRPVNAPGAKRVATSEFRGKSADRPDNLRGATLDFVILDEAAMIPFSVWS
EAIEPTLSVRDGWALIISTPKGLNWFYEFFLMGWRGGLKEGIPNSGINQTHPDFESFHAASWDVWPERREWYMERRLYIP
DLEFRQEYGAEFVSHSNSVFSGLDMLILLPYERRGTRLVVEDYRPDHIYCIGADFGKNQDYSVFSVLDLDTGAIACLERM
NGATWSDQVARLKALSEDYGHAYVVADTWGVGDAIAEELDAQGINYTPLPVKSSSVKEQLISNLALLMEKGQVAVPNDKT
ILDELRNFRYYRTASGNQVMRAYGRGHDDIVMSLALAYSQYEGKDGYKFELAEERPSKLKHEESVMSLVEDDFTDLELAN
RAFSA
>Q9T1W6 3.1.-.-~~~~~~Probable terminase, large subunit gp28~~~
MNTRENNLKALHAPRKINLREEAGLLGVDIVTDIGEAQPRNEPVFLGYQRRWFEDESQICIAEKSRRTGLTWAEAGRNVM
TAAKPKRRGGRNVFYVGSRQEMALEYIAACALFARAFNQLAKADVWEQTFWDSDKKEEILTYMIRFPNSGFKIQALSSRP
SNLRGLQGDVVIDEAAFHEALDELLKAAFALNMWGASVRIISTHNGVDNLFNQYIQDAREGRKDYSVHRITLDDAIADGL
YRRICYVTNQPWSPEAEKAWRDGLYRNAPNKESADEEYGCIPKKSGGAYLSRVLIEAAMTPARDIPVLRFEAPDDFESLT
PQMRHGIVQDWCEQELLPLLDALSPLNKHVLGEDFARRGDLTVFVPLAITPDLRKRECFRVELRNVTYDQQRQILLFILS
RLPRFTGAAFDATGNGGYLAEAARLIYGPEMIDCISLTPAWYQEWMPKLKGEFEAQNITIARHQTTLDDLLHIKVDKGIP
QIDKGRTKDEGGKGRRHGDFAVALCMAVRASYMNGFVIDEDSIQALPPRHRGDDVDNDDFDDYHQFERGGW
>P26745 ~~~2~~~Terminase, large subunit~~~
MELDAILDNLSDEEQIELLELLEEEENYRNTHLLYEFAPYSKQREFIDAGHDYPERCFMAGNQLGKSFTGAAEVAFHLTG
RYPGTKGYPADGKYGGEWKGKRFYEPVVFWIGGETNETVTKTTQRILCGRIEENDEPGYGSIPKEDIISWKKSPFFPNLV
DHLLVKHHTADGVEDGISICYFKPYSQGRARWQGDTIHGVWFDEEPPYSIYGEGLTRTNKYGQFSILTFTPLMGMSDVVT
KFLKNPSKSQKVVNMTIYDAEHYTDEQKEQIIASYPEHEREARARGIPTMGSGRIFQIPEETIKCQPFECPDHFYVIDAQ
DFGWNHPQAHIQLWWDKDADVFYLARVWKKSENTAVQAWGAVKSWANKIPVAWPHDGHQHEKGGGEQLKTQYADAGFSML
PDHATFPDGGNSVESGISELRDLMLEGRFKVFNTCEPFFEEFRLYHRDENGKIVKTNDDVLDATRYGYMMRRFARMMRDI
RKPKEKKIPAPIRPVRRGR
>P54308 ~~~2~~~Terminase, large subunit~~~
MKKVRLSEKFTPHFLEVWRTVKAAQHLKYVLKGGRGSAKSTHIAMWIILLMMMMPITFLVIRRVYNTVEQSVFEQLKEAI
DMLEVGHLWKVSKSPLRLTYIPRGNSIIFRGGDDVQKIKSIKASKFPVAGMWIEELAEFKTEEEVSVIEKSVLRAELPPG
CRYIFFYSYNPPKRKQSWVNKVFNSSFLPANTFVDHSTYLQNPFLSKAFIEEAEEVKRRNELKYRHEYLGEALGSGVVPF
ENLQIEEGIITDAEVARFDNIRQGLDFGYGPDPLAFVRWHYDKRKNRIYAIDELVDHKVSLKRTADFVRKNKYESARIIA
DSSEPRSIDALKLEHGINRIEGAKKGPDSVEHGERWLDELDAIVIDPLRTPNIAREFENIDYQTDKNGDPIPRLEDKDNH
TIDATRYAFERDMKKGGVSLWG
>P10310 ~~~~~~Terminase, large subunit~~~
MSTQSNRNALVVAQLKGDFVAFLFVLWKALNLPVPTKCQIDMAKVLANGDNKKFILQAFRGIGKSFITCAFVVWTLWRDP
QLKILIVSASKERADLNSIFIKNIIDLLPFLDELKPSPGQRDSVISFDVGPAKPDHSPSVKSVGITGQLTGSRADIIIAD
DVEIPSNSATQGAREKLWTLVQEFRALLKPLPTSRVIYLGTPQTEMTLYKELEDNRGYTTIIWPALYPRSREEDLYYGER
LAPMLREEFNDGFEMLQGQPTDPVRFDMEDLRERELEYGKAGFTLQFMLNPNLSDAEKYPLRLRDAIVCGLDFEKAPMHY
QWLPNRQNRNEELPNVGLKGDDIHSYHSCSQNTGQYQQRILVIDPSGRGKDETGYAVLFTLNGYIYLMEAGGFPDGYSDK
TLESLAKKANEWKVQTVVFESNFGDGMFGKVFSPVLLKHHAAALEEIRARGMKELRICDTLEPVLSTHRLVIRDEVIRED
YQTARDADGKHDVRYSLFYQLTRMAREKGAVAHDDRLDAFRLGVEFLRSTMELDAVKVEAEVLEAFLEEHMEHPIHSAGH
VVTAMVDGMELYWEDDDVNGDRFINW
>P17312 ~~~~~~Terminase, large subunit~~~
MEQPINVLNDFHPLNEAGKILIKHPSLAERKDEDGIHWIKSQWDGKWYPEKFSDYLRLHKIVKIPNNSDKPELFQTYKDK
NNKRSRYMGLPNLKRANIKTQWTREMVEEWKKCRDDIVYFAETYCAITHIDYGVIKVQLRDYQRDMLKIMSSKRMTVCNL
SRQLGKTTVVAIFLAHFVCFNKDKAVGILAHKGSMSAEVLDRTKQAIELLPDFLQPGIVEWNKGSIELDNGSSIGAYASS
PDAVRGNSFAMIYIDECAFIPNFHDSWLAIQPVISSGRRSKIIITTTPNGLNHFYDIWTAAVEGKSGFEPYTAIWNSVKE
RLYNDEDIFDDGWQWSIQTINGSSLAQFRQEHTAAFEGTSGTLISGMKLAVMDFIEVTPDDHGFHQFKKPEPDRKYIATL
DCSEGRGQDYHALHIIDVTDDVWEQVGVLHSNTISHLILPDIVMRYLVEYNECPVYIELNSTGVSVAKSLYMDLEYEGVI
CDSYTDLGMKQTKRTKAVGCSTLKDLIEKDKLIIHHRATIQEFRTFSEKGVSWAAEEGYHDDLVMSLVIFGWLSTQSKFI
DYADKDDMRLASEVFSKELQDMSDDYAPVIFVDSVHSAEYVPVSHGMSMV
>Q6QGD2 ~~~~~~Terminase, large subunit~~~
MEVSRPYVNTVDVIDFGIDKRFFRLPVSGILAQEGITPNGPQIAIINALEDPRHRFVTACVSRRVGKSFIAYTLGFLKLL
EPNVKVLVVAPNYSLANIGWSQIRGLIKKYGLQTERENAKDKEIELANGSLFKLASAAQADSAVGRSYDFIIFDEAAISD
VGGDAFRVQLRPTLDKPNSKALFISTPRGGNWFKEFYAYGFDDTLPNWVSIHGTYRDNPRADLNDIEEARRTVSKNYFRQ
EYEADFSVFEGQIFDTFNAIDHVKDLKGMRHFFKDDEAFETLLGIDVGYRDPTAVLTIKYHYDTDTYYVLEEYQQAEKTT
AQHAAYIQHCIDRYKVDRIFVDSAAAQFRQDLAYEHEIASAPAKKSVLDGLACLQALFQQGKIIVDASCSSLIHALQNYK
WDFQEGEEKLSREKPRHDANSHLCDALRYGIYSISRGK
>P03694 ~~~~~~Terminase, large subunit~~~
MSTQSNRNALVVAQLKGDFVAFLFVLWKALNLPVPTKCQIDMAKVLANGDNKKFILQAFRGIGKSFITCAFVVWSLWRDP
QLKILIVSASKERADANSIFIKNIIDLLPFLSELKPRPGQRDSVISFDVGPANPDHSPSVKSVGITGQLTGSRADIIIAD
DVEIPSNSATMGAREKLWTLVQEFAALLKPLPSSRVIYLGTPQTEMTLYKELEDNRGYTTIIWPALYPRTREENLYYSQR
LAPMLRAEYDENPEALAGTPTDPVRFDRDDLRERELEYGKAGFTLQFMLNPNLSDAEKYPLRLRDAIVAALDLEKAPMHY
QWLPNRQNIIEDLPNVGLKGDDLHTYHDCSNNSGQYQQKILVIDPSGRGKDETGYAVLYTLNGYIYLMEAGGFRDGYSDK
TLELLAKKAKQWGVQTVVYESNFGDGMFGKVFSPILLKHHNCAMEEIRARGMKEMRICDTLEPVMQTHRLVIRDEVIRAD
YQSARDVDGKHDVKYSLFYQMTRITREKGALAHDDRLDALALGIEYLRESMQLDSVKVEGEVLADFLEEHMMRPTVAATH
IIEMSVGGVDVYSEDDEGYGTSFIEW
>P03708 ~~~A~~~Terminase, large subunit~~~
MNISNSQVNRLRHFVRAGLRSLFRPEPQTAVEWADANYYLPKESAYQEGRWETLPFQRAIMNAMGSDYIREVNVVKSARV
GYSKMLLGVYAYFIEHKQRNTLIWLPTDGDAENFMKTHVEPTIRDIPSLLALAPWYGKKHRDNTLTMKRFTNGRGFWCLG
GKAAKNYREKSVDVAGYDELAAFDDDIEQEGSPTFLGDKRIEGSVWPKSIRGSTPKVRGTCQIERAASESPHFMRFHVAC
PHCGEEQYLKFGDKETPFGLKWTPDDPSSVFYLCEHNACVIRQQELDFTDARYICEKTGIWTRDGILWFSSSGEEIEPPD
SVTFHIWTAYSPFTTWVQIVKDWMKTKGDTGKRKTFVNTTLGETWEAKIGERPDAEVMAERKEHYSAPVPDRVAYLTAGI
DSQLDRYEMRVWGWGPGEESWLIDRQIIMGRHDDEQTLLRVDEAINKTYTRRNGAEMSISRICWDTGGIDPTIVYERSKK
HGLFRVIPIKGASVYGKPVASMPRKRNKNGVYLTEIGTDTAKEQIYNRFTLTPEGDEPLPGAVHFPNNPDIFDLTEAQQL
TAEEQVEKWVDGRKKILWDSKKRRNEALDCFVYALAALRISISRWQLDLSALLASLQEEDGAATNKKTLADYARALSGED
E
>P03269 ~~~PTP~~~Preterminal protein~~~
MALSVNDCARLTGQSVPTMEHFLPLRNIWNRVRDFPRASTTAAGITWMSRYIYGYHRLMLEDLAPGAPATLRWPLYRQPP
PHFLVGYQYLVRTCNDYVFDSRAYSRLRYTELSQPGHQTVNWSVMANCTYTINTGAYHRFVDMDDFQSTLTQVQQAILAE
RVVADLALLQPMRGFGVTRMGGRGRHLRPNSAAAVAIDARDAGQEEGEEEVPVERLMQDYYKDLRRCQNEAWGMADRLRI
QQAGPKDMVLLSTIRRLKTAYFNYIISSTSARNNPDRHPLPPATVLSLPCDCDWLDAFLERFSDPVDADSLRSLGGGVPT
QQLLRCIVSAVSLPHGSPPPTHNRDMTGGVFQLRPRENGRAVTETMRRRRGEMIERFVDRLPVRRRRRRVPPPPPPPEEE
EEGEALMEEEIEEEEAPVAFEREVRDTVAELIRLLEEELTVSARNSQFFNFAVDFYEAMERLEALGDINESTLRRWVMYF
FVAEHTATTLNYLFQRLRNYAVFARHVELNLAQVVMRARDAEGGVVYSRVWNEGGLNAFSQLMARISNDLAATVERAGRG
DLQEEEIEQFMAEIAYQDNSGDVQEILRQAAVNDTEIDSVELSFRFKLTGPVVFTQRRQIQEINRRVVAFASNLRAQHQL
LPARGADVPLPPLPAGPEPPLPPGARPRHRF
>P04499 ~~~PTP~~~Preterminal protein~~~
MALSVNDCARLTGQSVPTMEHFLPLRNIWNRVRDFPRASTTAAGITWMSRYIYGYHRLMLEDLAPGAPATLRWPLYRQPP
PHFLVGYQYLVRTCNDYVFDSRAYSRLRYTELSQPGHQTVNWSVMANCTYTINTGAYHRFVDMDDFQSTLTQVQQAILAE
RVVADLALLQPMRGFGVTRMGGRGRHLRPNSAAAAAIDARDAGQEEGEEEVPVERLMQDYYKDLRRCQNEAWGMADRLRI
QQAGPKDMVLLSTIRRLKTAYFNYIISSTSARNNPDRRPLPPATVLSLPCDCDWLDAFLERFSDPVDADSLRSLGGGVPT
QQLLRCIVSAVSLPHGSPPPTHNRDMTGGVFQLRPRENGRAVTETMRRRRGEMIERFVDRLPVRRRRRRVPPPPPPPEEE
EGEALMEEEIEEEEEAPVAFEREVRDTVAELIRLLEEELTVSARNSQFFNFAVDFYEAMERLEALGDINESTLRRWVMYF
FVAEHTATTLNYLFQRLRNYAVFARHVELNLAQVVMRARDAEGGVVYSRVWNEGGLNAFSQLMARISNDLAATVERAGRG
DLQEEEIEQFMAEIAYQDNSGDVQEILRQAAVNDTEIDSVELSFRLKLTGPVVFTQRRQIQEINRRVVAFASNLRAQHQL
LPARGADVPLPPLPAGPEPPLPPGARPRHRF
>Q6X3W5 ~~~4~~~DNA terminal protein~~~
MANKRLKKKLETKRKKSLLVSEGYSKKETKKLKGRELETVYKKKAHNRKNRERAREIANLAKQWGLSPSKYNSWKKLLPE
IERIKKEQDREAPFLLIYYQDFTGETDSKFIYDFKKRNNTRSRSQITESIIGWLQNAHNKLFLGRVAIRIVPKRDVSKTN
TLWRNHGYVKIYEGQGKELSKLLTAIETIMVGVYDVKERDKYLKELVAKLRSLPYEKAKKNAKEIQKIYDTKSYKKESWD
NDDYY
>P03681 ~~~3~~~Primer terminal protein~~~
MARSPRIRIKDNDKAEYARLVKNTKAKIARTKKKYGVDLTAEIDIPDLDSFETRAQFNKWKEQASSFTNRANMRYQFEKN
AYGVVASKAKIAEIERNTKEVQRLVDEKIKAMKDKEYYAGGKPQGTIEQRIAMTSPAHVTGINRPHDFDFSKVRSYSRLR
TLEESMEMRTDPQYYEKKMIQLQLNFIKSVEGSFNSFDAADELIEELKKIPPDDFYELFLRISEISFEEFDSEGNTVENV
EGNVYKILSYLEQYRRGDFDLSLKGF
>P09009 ~~~VIII~~~DNA terminal protein~~~
MAKKKPVEKNGLVYKEFQKQVSNLKKAGLIPKTLDVRKVKPTKHYKGLVSKYKDVATGGAKLAAIPNPAVIETLEARGES
IIKKGGKAYLKARQQINQRGQIVNPFTVRVTKRGEVVRRYRKTTPEGKPVYITQRELPIKFENMEQWLTELKAAGFQLQP
GEQIYFTFNGNYSRRTYTSFDEAFNKFMTYDIIIDAVAGKLKVEDEADLVKSVGFQRISGPEAKAYNRNRIVLPEMQFSQ
AAKKKYKRRQKRGYGSKGV
>A0A1L4BKP6 ~~~~~~Terminase, small subunit~~~
MSVSFRDRVLKLYLLGFDPSEIAQTLSLDVKRKVTEEEVLHVLAEARELLSALPSLEDIRAEVGQALERARIFQKDLLAI
YQNMLRNYNAMMEGLTEHPDGTPVIGVRPADIAAMADRIMKIDQERITALLNSLKVLGHVGSTTAGALPSATELVSVEEL
VAEVVDEAPKT
>Q9T1W7 ~~~~~~Probable terminase, small subunit gp27~~~
MDRKTRGRASKVDLLPENVRKTLHEMLRDKAIPQARILEEINALIEDAGLPDEMKLSRSGLNRYATNVEQVGHNLRQMRE
MTSALTAELGDKPMGETTKLILEMARSQLFKAMMRQIENPESDVDIDLLKNAMLAAQRLESTAMSSHRREKEIRQAFAEE
AANAVSEELRGQDGISEELEQRIRDVLLGKA
>P68654 ~~~1~~~Terminase, small subunit~~~
MKVNKKRLAEIFNVDPRTIERWQSQGLPCASKGSKGIESVFDTAMAIQWYAQRETDIENEKLRKELDDLRAAAESDLQPG
TIDYERYRLTKAQADAQELKNAREDGVVLETELFTFILQRVAQEISGILVRVPLTLQRKYPDISPSHLDVVKTEIAKASN
VAAKAGENVGGWIDDFRRAEGS
>P04893 ~~~3~~~Terminase, small subunit~~~
MAAPKGNRFWEARSSHGRNPKFESPEALWAACCEYFEWVEANPLWEMKAFSYQGEVIQEPIAKMRAMTITGLTLFIDVTL
ETWRTYRLREDLSEVVTRAEQVIYDQKFSGAAADLLNANIIARDLGLKEQSQVEDVTPDKGDRDKRRSRIKELFNRGTGR
DS
>P68928 ~~~1~~~Terminase small subunit~~~
MKEPKLSPKQERFIEEYFINDMNATKAAIAAGYSKNSASAIGAENLQKPAIRARIDARLKEINEKKILQANEVLEHLTRI
ALGQEKEQVLMGIGKGAETKTHVEVSAKDRIKALELLGKAHAVFTDKQKVETNQVIIVDDSGDAE
>P54307 ~~~1~~~Terminase small subunit~~~
MGEVKGKWTPKLERFVDEYFINGMNATKAAIAAGYSKKSASTIAAENMQKPHVRARIEERLAQMDKKRIMQAEEVLEHLT
RIALGQEKEQVLMGIGKGAETKTHVEVSAKDRIKALELLGKAHAVFTDKQKVETNQVIIVDDSGDAE
>P17311 ~~~~~~Terminase, small subunit~~~
MEGLDINKLLDISDLPGIDGEEIKVYEPLQLVEVKSNPQNRTPDLEDDYGVVRRNMHFQQQMLMDAAKIFLETAKNADSP
RHMEVFATLMGQMTTTNREILKLHKDMKDITSEQVGTKGAVPTGQMNIQNATVFMGSPTELMDEIGDAYEAQEAREKVIN
GTTD
>P23208 ~~~~~~Probable terminase, small subunit~~~
MANDVLVPDLMSPEGMDVIEAYLQCGSDVPSAARSLGMSEIAFRDIMNRSEVKNYLNDIFMESGFRNRDRLFGVLDEVIK
RKLEELEETGMGSDQDIMDILWKAHKMKMEEMKMMVELEKVKAAARTPANQTNIQNNIIAGAGDQNYMDLITSLATGGKK
>P03693 ~~~~~~Terminase, small subunit gp18~~~
MEKDKSLITFLEMLDTAMAQRMLADLSDHERRSPQLYNAINKLLDRHKFQIGKLQPDVHILGGLAGALEEYKEKVGDNGL
TDDDIYTLQ
>P03707 3.6.4.-~~~Nu1~~~Terminase small subunit~~~
MEVNKKQLADIFGASIRTIQNWQEQGMPVLRGGGKGNEVLYDSAAVIKWYAERDAEIENEKLRREVEELRQASEADLQPG
TIEYERHRLTRAQADAQELKNARDSAEVVETAFCTFVLSRIAGEIASILDGLPLSVQRRFPELENRHVDFLKRDIIKAMN
KAAALDELIPGLLSEYIEQSG
>P13299 3.1.-.-~~~ITEVIR~~~Intron-associated endonuclease 1~~~
MKSGIYQIKNTLNNKVYVGSAKDFEKRWKRHFKDLEKGCHSSIKLQRSFNKHGNVFECSILEEIPYEKDLIIERENFWIK
ELNSKINGYNIADATFGDTCSTHPLKEEIIKKRSETVKAKMLKLGPDGRKALYSKPGSKNGRWNPETHKFCKCGVRIQTS
AYTCSKCRNRSGENNSFFNHKHSDITKSKISEKMKGKKPSNIKKISCDGVIFDCAADAARHFKISSGLVTYRVKSDKWNW
FYINA
>P04445 ~~~TF1~~~Transcription factor 1~~~
MNKTELIKAIAQDTELTQVSVSKMLASFEKITTETVAKGDKVQLTGFLNIKPVARQARKGFNPQTQEALEIAPSVGVSVK
PGESLKKAAEGLKYEDFAK
>P03682 ~~~4~~~Late genes activator p4~~~
MPKTQRGIYHNLKESEYVASNTDVTFFFSSELYLNKFLDGYQEYRKKFNKKIERVAVTPWNMDMLADITFYSEVEKRGFH
AWLKGDNATWREVHVYALRIMTKPNTLDWSRIQKPRLRERRKSMV
>P03740 ~~~tfa~~~Tail fiber assembly protein~~~
MAFRMSEQPRTIKIYNLLAGTNEFIGEGDAYIPPHTGLPANSTDIAPPDIPAGFVAVFNSDEASWHLVEDHRGKTVYDVA
SGDALFISELGPLPENFTWLSPGGEYQKWNGTAWVKDTEAEKLFRIREAEETKKSLMQVASEHIAPLQDAADLEIATKEE
TSLLEAWKKYRVLLNRVDTSTAPDIEWPAVPVME
>P27948 ~~~~~~Transcription factor TFIIS homolog~~~
MKMHIARDSIVFLLNKHLQNTILTNKIEQECFLQADTPKKYLQYIKPFLINCMTKNITTDLVMKDSKRLEPYIILEMRDI
IQMMFFRTLQKHMFFKEHTDLCTEYAQKIEASCYHYTYQQQEKTFLEEYSTRCGTINHIINCEKKSHQQQDNDALNKLIS
GELKPEAIGSMTFAELCPSAALKEKTEITLRSQQKVAEKTSQLYKCPNCKQRMCTYREVQTRALDEPSTIFCTCKKCGHE
FIG
>Q6QGM6 3.1.21.-~~~hegD~~~H-N-H endonuclease F-TflII~~~
MKLENFKVIPEYPEYLISPYGEVYSTKSNKLLTHHLGSAGYPFVTFYEQGKNVSIVLHRLLARVFKDLPSLESELEVDHK
DRNKLNFSLDNLVVMTKQDHRIKTTVERGHTIGGNKCPYCNKQINSSSKTCFDCKPKSSPDITAEQIEYWVINYSWVKAS
KELGLSDNGLRKRYKSLTGKDPKSIKKKVSQVG
>Q6QGL2 3.1.21.-~~~hegA~~~H-N-H endonuclease F-TflIV~~~
MRTDILDRKEEIAQWIAQGKSKAEIARMLSCSSNTLEAYLGKLGIVVIPTNRQYDNKYKSATEYLYNGSPISSYKLKNKI
LNEGLKPHKCESCGLESWLDKPIPLELEHKDGNHYNNEWDNLALLCPNCHALTPTHAGKNIGRYTERTVNTCAICHCEIS
SRATHCKSCTPKGITINPDITVEQIEYWVSKYSWIRASKELGLSDTGLRKRYKSLTGKDPKSIKKNR
>Q6QGM3 3.1.21.-~~~hegC~~~H-N-H endonuclease F-TflI~~~
MQFVPIKDAPGYLVNEAGDVFSTFTNKVLSRYIVDGYPAVKLQINGKQTSVLIHRIISHVFGDLYNLFDPELEVDHKDRD
RLNLSKDNLQVLSKIEHQRKTNKDNGWSDSRVPCPLCGSLMLQRSVTCTNCKPKPTGRLIKPELSLDDITEKVLLMGWVK
AAKELEVSESTLRRRYTKLTGLSPKVLTEQRKSK
>Q89581 3.6.4.-~~~~~~Termination factor NPH-I homolog~~~
MSCVHNNTSFPVQIEAYLKEVYEKYKELQESKDTSLTARFARALKYYQFLIYTAFSDPKFGIGQGENTRGLLIYHQMGMG
KTILSLSLAISLSHIYNPILIAPKSLHSNFQQSLLKLIKLLYPETTDHSKELQKISRRFRFVSLDAYNMGQQIIKAGGSL
NGCLLIVDEAHNLFRGIINSANDKTNARQLYNNIMQAKNIRILFLTGTPCSKDPFEMVPCFNMLSGRILLPLHYERFYTA
YVNKTTNSPLNADKLLNRLVGMISYAGNQNELNKLFPTELPLIIEKVEMSPEQYRQYLLARDVENAEKHASSGMYEKINA
AALCLPGSEQESGSSYYVRSRMISIFASEMLTVKEDEKLSEAVQQLPKEAFTENSSPKIVRMLKNIKTSPGPVLIYSQFV
ELGLHVVARFLEIEGYQCLQPLKVLEEGHNTILLHKDGKDLMVKNFAEDGPTHTLVLSSKITRFTLITGKILSKERDMIQ
QVWNSPLNIHGEVIKILLVSKTGAEGLDLKYGRQVHILEPYWDKAREDQVKARIIRIGSHDALPPEEKTVQPFLYIAVAN
QKMFYSIPEGSQEQKTIDERFHERGLEKSHLNSAFRDLLKRAAIECAFNGESGCLMCQPTNALLFHENFERDLRLPNPCQ
PLVKAEVKAYSISYEGKQFFYQKNKDVGLGYTFYEYNPIIKAYIEIKPSNPLYIKLIKHVQAGTTA
>P04867 3.6.4.-~~~~~~Movement protein TGB1~~~
MDMTKTVEEKKTNGTDSVKGVFENSTIPKVPTGQEMGGDGSSTSKLKETLKVADQTPLSVDNGAKSKLDSSDRQVPGVAD
QTPLSVDNGAKSKLDSSDRQVPGPELKPNVKKSKKKRIQKPAQPSGPNDLKGGTKGSSQVGENVSENYTGISKEAAKQKQ
KTPKSVKMQSNLADKFKANDTRRSELINKFQQFVHETCLKSDFEYTGRQYFRARSNFFEMIKLASLYDKHLKECMARACT
LERERLKRKLLLVRALKPAVDFLTGIISGVPGSGKSTIVRTLLKGEFPAVCALANPALMNDYSGIEGVYGLDDLLLSAVP
ITSDLLIIDEYTLAESAEILLLQRRLRASMVLLVGDVAQGKATTASSIEYLTLPVIYRSETTYRLGQETASLCSKQGNRM
VSKGGRDTVIITDYDGETDETEKNIAFTVDTVRDVKDCGYDCALAIDVQGKEFDSVTLFLRNEDRKALADKHLRLVALSR
HKSKLIIRADAEIRQAFLTGDIDLSSKASNSHRYSAKPDEDHSWFKAK
>Q9IV54 3.6.4.-~~~~~~Movement protein TGB1~~~
MESGFNGSRPHRVKKDLPDRVNPVNTQGSSGTTGNAFRKNNNNKTQNWKPRSGPGNRNEGDQTKNNKSDLQQPSEVHPEN
QVRPESSTGESVKQQSEPHRVLEDKKQSGKTAGSSVRIPEEGGGGLGSANYLGKRQLDFVAKLCVESGFKSTGKPLKRYP
AEFFKSSGLLEKFVKYLSSRLDKGCNLSQRESEVVLKNLRSKRAEQSFLAGAVTGVPGSGKTTLLRKVQCEGGFNSIVIL
GNPRSKTEFSNLPSCYTAKEILLLGIAIKCEVLLIDEYTLLTSGEILLLQKITNSRIVILFGDRAQGSSNTLCSPEWLQV
PVIFQSLTSRRFGKATANLCRRQGFDFEGGEHEDKVVESPYEGSSPATDINIVFSESTREDLLECGIESTLVSDVQGKEY
NTVTLFIPDEDREYLTNAHLRSVAFSRHKFALEIRCNPELFMQLINGELASKQQPQTDRYGPE
>P17780 ~~~ORF2~~~Movement and silencing protein TGBp1~~~
MDILISSLKSLGYSRTSKSLDSGPLVVHAVAGAGKSTALRKLILRHPTFTVHTLGVPDKVSIRTRGIQKPGPIPEGNFAI
LDEYTLDNTTRNSYQALFADPYQAPEFSLEPHFYLETSFRVPRKVADLIAGCGFDFETNSQEEGHLEITGIFKGPLLGKV
IAIDEESETTLSRHGVEFVKPCQVTGLELKVVTIVSAAPIEEIGQSTAFYNAITRSKGLTYVRAGT
>P04869 ~~~~~~Movement protein TGB2~~~
MKTTVGSRPNKYWPIVAGIGVVGLFAYLIFSNQKHSTESGDNIHKFANGGSYRDGSKSISYNRNHPFAYGNASSPGMLLP
AMLTIIGIISYLWRTRDSVLGDSGGNNSCGEDCQGECLNGHSRRSLLCDIGXSFYHCSMAIVYISKQYTYGDWSLLLSRS
ELCEDLWNRGYEPRSYCGHPPLAEVPFWGISDVGRFNQCFEYSS
>Q9IV53 ~~~~~~Movement protein TGB2~~~
MVRNNEIGARPNKYWPVVAAVVAICLFGFLTVTNQKHATQSGDNIHKFANGGQYRDGSKSIKYNCNNPRAYNGSSSNITF
SQLFLPVLLIGAALYAYLWFTRPDCSVTCRGDCCRSYGG
>P04868 ~~~~~~Movement protein TGB3~~~
MAMPHPLECCCPQCLPSSESFPIYGEQEIPCSETQAETTPVEKTVRANVLTDILDDHYYAILASLFIIALWLLYIYLSSI
PTETGPYFYQDLNSVKIYGIGATNPEVIAAIHHWQKYPFGESPMWGGLISVLSILLKPLTLVFALSFFLLLSSKR
>Q9IV52 ~~~~~~Movement protein TGB3~~~
MDPPVILHSPNCSCQFCSSELPSTHTCGSQDRTVPLHVEATAAGHMEAKNFSLQYVLLVAFVSVLLGFSFCVYLKSMSND
EASDMTYYYQDLNSVEIKLGKNPLDPEVIKAIHSFQEFPYGNIPSIRREAEFDVQNDESSAVVLSGSNNNRRQVASTPCE
NNVLLKLWKDDLSFTIIAVTVLVGAMLARC
>Q5UR12 4.2.1.46~~~~~~Putative dTDP-D-glucose 4,6-dehydratase~~~
MKNILVTGGLGFIGSNFVNHISSKYDNVNIYVYDIGDYCASVENVEWNNRTKLIKGDIRNFDLIMHTLTEHEIDTIVHFA
AHSHVDNSFKNSLAFTETNVFGTHVLLECSRMYGKLKLFFHMSTDEVYGEIDTTDTSREVSLLCPTNPYAATKAGAEHIV
KSYFLSYKLPIIIARCNNVYGRNQYPEKLIPKFICSLLDGKKLHIQGTGNSRRNFIHAIDVADAVDLVINNGVIGETYNI
GVTNEHSVLDVAQILCDIAGVNLENQLEYVPDRLFNDFRYNITNDKIKSLGWEQSRKDFKKELVELFDWYKVNRHRYNIP
GSQ
>A0A385DTH1 ~~~~~~Tail hub protein A~~~
MHFNELRISQDNRFLIIDVSVDNQDYFEDVLLDSIVIDTQDTFVMNGPSDNPLYIYNVEDAYDLTYSLPEQCNCNPVRVE
EDESYCFTYGTQQMKNVRLELNIQDLKVSPCSTMFFVYVKSKGTPSTDTPCGFDKDQILGTVINLQPIYKQTLKYLKEVE
CDCNIPKGFIDMILKLKAIELCVRTGNYPQAIKYWNKFFIKNNCKSPTSNCGCYG
>A0A385DVM6 ~~~~~~Tail hub protein B~~~
MDKMLEISEEAITRYFTTLSQFGYKKYSDVDKIIVLFFMEEMLAGEMSYYVTQDDYRNIVNALYCLAGSTCMIDFPMFES
YDTLVHSNNRTFVPRITEDSILRSTEDDNFRVEA
>O41156 2.1.1.148~~~~~~Probable flavin-dependent thymidylate synthase~~~
MSAKLISVTKPVVEGVNTAEELIAYAARVSNPENQINNKTASGLLKYCIRHKHWSIFETAFMTLELKTSRGIAAQVLRHR
SFHFQEFSQRYASVMETPPPHQARFQDHKNRQNSLDTVPEDDQTWWATEQEKLYAQSMELYNKALEKGIAKECARFILPL
STPTTIYMSGTIRDWIHYIELRTSNGTQREHIDLANACKEIFIKEFPSIAKALDWV
>Q6QGJ5 2.1.1.45~~~thy~~~Probable thymidylate synthase~~~
MQQYLKILTDVILLGEPRNDRTGTGTVSIFDSYAKFDLREGFPAVTTKRLAWKSVVGELLWFLSGSTNLHDLRVFTFGRD
EGQWTIWTPNYEDQAISMGYDKGNLGPVYGKQWRNFGGRDQILELIEGLKNNPHGRRHLVSAWNVAELDKMALPPCHYGF
QCYVSNDGYLDLKWTQRSVDCFLGLPFNIASYALLTHILAKLTGLKPRYLIFSGGDTHIYNDHMEQVEEQVKRKPRPLPT
LVMPEFVDLYDLLENNTAAWSFHLEGYDPHPALKAKMSS
>Q9YJQ8 ~~~~~~Protein tio~~~
MANEPQEHEEGKPFFPPLGDSGEEGPPNIPQDPTPGTPPGPINSKNEDYPPPLENPGPNKSEGPPDGSGNSSPPVTMLVK
NNGDRTKQDVSESGGNNSAPNSVESKHTSSSSSAGNGNETKCPDEQNTQECITTIYIPWEDAKPKLMGLVKLDSSDSEEE
RSPFNKYPKNYKKLRVDMGENWPPGIPPPQLPPRPANLGQKQSATSKNGPQIILREATEVESQQATDGQLNHRVEKVEKK
LTCVICLLIGILVLLILLFMLGFLFLLMK
>P03749 ~~~J~~~Tip attachment protein J~~~
MGKGSSKGHTPREAKDNLKSTQLLSVIDAISEGPIEGPVDGLKSVLLNSTPVLDTEGNTNISGVTVVFRAGEQEQTPPEG
FESSGSETVLGTEVKYDTPITRTITSANIDRLRFTFGVQALVETTSKGDRNPSEVRLLVQIQRNGGWVTEKDITIKGKTT
SQYLASVVMGNLPPRPFNIRMRRMTPDSTTDQLQNKTLWSSYTEIIDVKQCYPNTALVGVQVDSEQFGSQQVSRNYHLRG
RILQVPSNYNPQTRQYSGIWDGTFKPAYSNNMAWCLWDMLTHPRYGMGKRLGAADVDKWALYVIGQYCDQSVPDGFGGTE
PRITCNAYLTTQRKAWDVLSDFCSAMRCMPVWNGQTLTFVQDRPSDKTWTYNRSNVVMPDDGAPFRYSFSALKDRHNAVE
VNWIDPNNGWETATELVEDTQAIARYGRNVTKMDAFGCTSRGQAHRAGLWLIKTELLETQTVDFSVGAEGLRHVPGDVIE
ICDDDYAGISTGGRVLAVNSQTRTLTLDREITLPSSGTALISLVDGSGNPVSVEVQSVTDGVKVKVSRVPDGVAEYSVWE
LKLPTLRQRLFRCVSIRENDDGTYAITAVQHVPEKEAIVDNGAHFDGEQSGTVNGVTPPAVQHLTAEVTADSGEYQVLAR
WDTPKVVKGVSFLLRLTVTADDGSERLVSTARTTETTYRFTQLALGNYRLTVRAVNAWGQQGDPASVSFRIAAPAAPSRI
ELTPGYFQITATPHLAVYDPTVQFEFWFSEKQIADIRQVETSTRYLGTALYWIAASINIKPGHDYYFYIRSVNTVGKSAF
VEAVGRASDDAEGYLDFFKGKITESHLGKELLEKVELTEDNASRLEEFSKEWKDASDKWNAMWAVKIEQTKDGKHYVAGI
GLSMEDTEEGKLSQFLVAANRIAFIDPANGNETPMFVAQGNQIFMNDVFLKRLTAPTITSGGNPPAFSLTPDGKLTAKNA
DISGSVNANSGTLSNVTIAENCTINGTLRAEKIVGDIVKAASAAFPRQRESSVDWPSGTRTVTVTDDHPFDRQIVVLPLT
FRGSKRTVSGRTTYSMCYLKVLMNGAVIYDGAANEAVQVFSRIVDMPAGRGNVILTFTLTSTRHSADIPPYTFASDVQVM
VIKKQALGISVV
>P03738 ~~~L~~~Tail tip protein L~~~
MQDIRQETLNECTRAEQSASVVLWEIDLTEVGGERYFFCNEQNEKGEPVTWQGRQYQPYPIQGSGFELNGKGTSTRPTLT
VSNLYGMVTGMAEDMQSLVGGTVVRRKVYARFLDAVNFVNGNSYADPEQEVISRWRIEQCSELSAVSASFVLSTPTETDG
AVFPGRIMLANTCTWTYRGDECGYSGPAVADEYDQPTSDITKDKCSKCLSGCKFRNNVGNFGGFLSINKLSQ
>P25049 ~~~~~~Tyrosine-protein kinase-interacting protein~~~
MENQREEIELTEIPETEKKRTAEEKLLSCSAETAEEKVSLCSEETTDTSSSSSSEQTPAPIEVNVNIQTSTYLPQNAATN
LNSLYTSFEDARAQGKGLVRHNSDDLKSFLEKYPPDYRKPKRDLSESWDPGMPKPTLPPRPANLGASQASTVRRHVREQN
FKQLRERKANEGKIVKDLKRLEYKVNIILCLVVVILAIILLLTGLSILFIRIKS
>P22575 ~~~~~~Tyrosine-protein kinase-interacting protein~~~
MANEGEEIELTEFPETEKERKDEEKLSSCSEETTNTSSSSGSDHVPVPIEVNVIIQNSSRTEDELQNSKEIELTGFQGKL
SSCSEETTAPSSSYSSKQASVFIEENGDNETSTYRPQNVLTNLNSLYTTFEDARAQGKGMVRHKSEDLQSFLEKYPPDFR
KPKRDLSATWDPGMPTPPLPPRPANLGERQASTVRLHVKESNCKQPRERKANERNIVKDLKRLENKINVIICLVVVILAV
LLLVTVLSILHIGMKS
>P88825 ~~~~~~Tyrosine protein kinase-interacting protein~~~
MANEGEEIELTEFPETEKERKDEEKLSSCSEETTDTSSSSSSDHVPAPIEVNVIIQNSSRTEDELQNSTKFAVANEGKEI
ELTGFQGKLSSCSEETTATSSSYSSKQASVCIEENGDNETSTYRPQNVLTNLNSLYTTFEDARAQGKGMVRYKSEDLQSF
LEKYPPDYRKPKRDLSATWDPGMPTPALPPRPANLGERQASTVRLHVKESNCKQPRERKANERNIVKDLKRLENKVNAII
CLVVVILAVLLLVTVLSILHIGMKS
>Q06691 ~~~~~~Telokin-like protein 20~~~
MASMSNITPDIIVNAQINSEDENVLDFIIEDEYYLKKRGVGAHIIKVASSPQLRLLYKNAYSTVSCGNYGVLCNLVQNGE
YDLNAIMFNCAEIKLNKGQMLFQTKIWRSDNSKTDAAVHTSSPKRTVETENDDDGEAASAAAIDEQEGNADVVGLDFEEN
IDDGDAPTPKKQKLDNAKQD
>D3WAD2 ~~~~~~Probable tape measure protein~~~
MASNATFEVEIYGNTTKFENSLKGVNTAMSGLRGEAKNLREALKLDPANTGKMAQLQKNLQTQLGLSRDKATKLKEELST
VDKGTSAGQKKWLQLTRDLGTVETQANRLEGEIKQVEGAISSGSWNIDAKMDTKGVNSGIDGMKSRFSGLREIAVGVFRQ
IGSSAVSAVGNGLKGWVSDAMDTQKAMISLQNTLKFKGNGQDFDYVSKSMQTLAKDTNANTEDTLKLSTTFIGLGDSAKT
AVGKTEALVKANQAFGGTGEQLKGVVQAYGQMSASGKVSAENINQLTDNNTALGSALKSTVMEMNPALKQYGSFASASEK
GAISVEMLDKAMQKLGGAGGGAVTTIGDAWDSFNETLSLALLPTLDALTPIISSIIDKMAGWGESAGKALDSIVKYVKEL
WGALEKNGALSSLSKIWDGLKSTFGSVLSIIGQLIESFAGIDLKTGESAGSVENVSKTIANLAKGLADVIKKIADFAKKF
SESKGAIDTLKTSLVALTAGFVAFKIGSGIITAISAFKKLQTAIQAGTGVMGAFNAVMAINPFVALGIAIAAIVAGLVYF
FTQTETGKKAWASFVDFLKSAWDGIVSFFSGIGQWFADIWNGAVDGAKGIWQGLVDWFSGIVQGVQNIWNGITTFFTTLW
TTVVTGIQTAWAGVTGFFTGLWDGIVNVVTTVFTTISSLVTGAYNWFVTTFQPLISFYKSIFGLVGSVINLAFQLILAII
RGAYQLVIGAWSGISGFFGVIFNAVSSVVSTVFSAIGSFAGSAWNVLVGVWNAVAGFFGGIFNAVKGVVSSVFSAIGSFA
SSAWGVVSSIWSAVSGFFSGIFNAVSSVVSGVFSALGGFASNAWGAITGIFSGVADFFSGVFDGAKNIVSGVFEAFGNFA
SNAWNAITGVFNGIGSFFSDIFGGVKNTIDSVLGGVTDTINNIKGSIDWVASKVGGLFKGSMVVGLTDVNLSSSGYGLST
NSVSSDNRTYNTFNVQGGAGQDVSNLARAIRREFELGRA
>Q9T1V6 ~~~~~~Probable tape measure protein~~~
MTGKRLKASVIIDLNGNLSRRSRQYSNQINALSRSGQSSLRALRMEVVRVSGAIDRMGSLSTRTFRMLSAGALGIAGVGY
TANKLFIGAAAQREQQIIAMNSLYHGDKVRAQAMMAWAKQNAKDTTWGLSGVLDEIRSSKGFGMTDEQTKQFITMLQDQG
AMHGWDLPTAQGASLQLKQMFARQQITAADANLLTGYGINVYQALADATGTDVKKIRDLGTKGKLGMKSILTVFRTLSEQ
SKGAQASAMNSWDGMFAQMEANLLEFRIKVANSGPFEEIKNEMRRVLNWHDMADKSGELDALAENIGQKFLTTFRTVKIS
AQELWRWLKPGKDALAWVDQNIVSLKKLAAVLVSVWLANKALRAGWAVAKPSWQVASYPFKTGRRMWRWMRNRKRGQAGL
PVPDAMTSETLLQGIGIQRVFVINWPRGFGDYGSGGGRRVRSGGRMAPLLPRQPLLLSGPQPLALPAPRPVLALPPPGVP
VTARPAPLPLPGKSGLLSRLAGSAAGQLVTGTVGKLADAGRAVGGWFSGIGNKLAGSAIGRVVTKGAGALGWMGKGAGRA
LSRLGGPVMGALQLAPVLMDEQASTHEKAGAIGSTAGAWLGGAVGSLAGPLGTVAGATLGSVAGEYLGGFVTDLYQKWTA
TDKEPQEQKVNAEASLRVELGEGLRLTSSRVTEDGMGLNIYAGDNYITGW
>P85501 ~~~~~~Tape measure protein~~~
MATDSLGTLTVDLIANTGGFERGMDAAERRIASTTRAFQRQEQAAERLVGRIDPVAGAINRLVQEQTELERHFRSGIIPA
GEFERLNRILNDQLDAVQRGNREMASGAMSARQYQAALRGVPAQFTDIAVSLASGQQPLTVLLQQGGQLKDMFGGVVPAA
RALGGYIAGLVNPITGLAASVGVLGISFIDAEREAAAFNKAIFAGNNAAGVSGSGLSQIAEQASAVAGSLSSANKAAIAL
ASSGKVAASQLQSLTEATIAIAQFTGKEVDDVAKSLSAMGDSATDAAAKISEQYGLLTYEQYQVIKSIDEQGNSQRALDV
LGEELNRNAQERLKQYRESLSDIERDWIDIKTAITNSYAAVRSEIFPNQNQQIEQIQRILRTRQEGGVLGAVSSAFGFGE
NSTESLQQQLDSLVKQRDAAAKQAEEQAKITKSNQDRVDASREWEKENEKYLSSRVKMEKEISAARELGRKAGLNEIEIE
DRIAQIRKSYEEKPSSRSGSLDAGQRMLDSLRQQYASMQAQLEATEKLGTQAQALVKWEQQLADLKSRGSLSADQKALLA
NADLITAQLKRNAALEDELNTRKEIQKTLDDYKRLNESLRTDAEKQLDLTRQRFEILDKARQAGISDDDYRRTAERIVSS
STTKAPTFSGVDAVVAGPQGELDKLDKAQEDLEAWYEQQLEILNENREKRAELNASWDEQELKLKQEHEDAMAAIEQSRQ
QITLSANEQFFGNLSGLAKTFFGEQSGLYKAAFVAEKSFAIAKTLINVPKTASDAYSAMAGIPVIGPALGIAAAAAAVTA
QLAQVAAVKNVNLSGMAHDGIDAVPETGTWLLQKGERVTTAETSAKLDKTLDDVRSNQSGGGAPTINLIEDRSRAGQVNT
RRQDDQYIIDVVVADLFGDGRTSKAIGSSFGMRRSGT
>P13337 ~~~~~~Tape measure protein~~~
MKKPQEMQTMRRKVISDNKPTQEAAKSASNTLSGLNDISTKLDDAQAASELIAQTVEEKSNEIIGAIDNVESAVSDTSAG
SELIAETVEIGNNINKEIGESLGSKLDKLTSLLEQKIQTAGIQQTGTSLATVESAIPVKVVEDDTAESVGPLLPAPEAVN
NDPDADFFPTPQPVEPKQESPEEKQKKEAFNLKLSQALDKLTKTVDFGFKKSISITDKISSMLFKYTVSAAIEAAKMTAM
ILAVVVGIDLLMIHFKYWSDKFSKAWDLFSTDFTKFSSETGTWGPLLQSIFDSIDKIKQLWEAGDWGGLTVAIVEGLGKV
LFNLGELIQLGMAKLSAAILRVIPGMKDTADEVEGRALENFQNSTGASLNKEDQEKVANYQDKRMNGDLGPIAEGLDKIS
NWKTRASNWIRGVDNKEALTTDEERAAEEEKLKQLSPEERKNALMKANEARAAMIRFEKYADSADMSKDSTVKSVEAAYE
DLKKRMDDPDLNNSPAVKKELAARFSKIDATYQELKKNQPNAKPETSAKSPEAKQVQVIEKNKAQQAPVQQASPSINNTN
NVIKKNTVVHNMTPVTSTTAPGVFDATGVN
>Q6QGE7 ~~~~~~Tape measure protein pb2 precursor~~~
MTDKLIRELLIDVKQKGATRTAKSIENVSDALENAAAASELTNEQLGKMPRTLYSIERAADRAAKSLTKMQASRGMAGIT
KSIDGIGDKLDYLAIQLIEVTDKLEIGFDGVSRSVKAMGNDVAAATEKVQDRLYDTNRALGGTSKGFNDTAGAAGRASRA
LGNTSGSARGATRDFAAMAKIGGRLPIMYAALASNVFVLQTAFESLKVGDQLNRLEQFGTIVGTMTGTPVQTLALSLQNA
TNGAISFEEAMRQASSASAYGFDSEQLEQFGLVARRAAAVLGVDMTDALNRVIKGVSKQEIELLDELGVTIRLNDAYENY
VKQLNATSTGIKYTVDSLTTYQKQQAYANEVIAESTRRFGYLDDALKATSWEQFAANANSALRSLQQSAATYLNPVMDTL
NTFLYQTKSSQMRVSAMARSASAKTTPAENVTALIENAVGAREDLDTYLKESEERVKKAQELKQQLDDLKAKQAATAPIA
NALTAGGIGGDESNKLVVQLTNELARQNKEIEERTKTEKVLRQAVQDTGEALLRNGKLAEQLGAKMKYADTAVPGDKGVF
EVDPNNLKAVSEIQKNFDFLKKSSSDTANNIRMAASSITNAKKASSDLNSVVKAVEDTSKVTGQSADTLVKNLNLGFSSL
DQMKAAQKGLSEYVTAMDKSEQNALEVAKRKDEVYNQTKDKAKAEAAAREVLLRQQQEQLTAAKALLAINPNDPEALKQV
AKIETEILNTKAQGFENAKKTKDYTDKILGVDREIALLNDRTMTSTQYRLAQLRLELQLEQEKTELYSKQADGQAKVEQS
RRAQAQISREIWEAEKQGTASHVSALMDALEVSQTQRNVTGQSQILTERLSILQQQLELSKGNTEEELKYRNEIYKTSAA
LEQLKKQRESQMQQQVGSSVGATYTPTTGLIGEDKDFADMQNRMASYDQAISKLSELNSEATAVAQSMGNLTNAMIQFSQ
GSLDTTSMIASGMQTVASMIQYSTSQQVSAIDQAIAAEQKRDGKSEASKAKLKKLEAEKLKIQQDAAKKQIIIQTAVAVM
QAATAVPYPFSIPLMVAAGLAGALALAQASSASGMSSIADSGADTTQYLTLGERQKNVDVSMQASSGELSYLRGDKGIGN
ANSFVPRAEGGMMYPGVSYQMGEHGTEVVTPMVPMKATPNDQLSDGSKTTSGRPIILNISTMDAASFRDFASNNSTAFRD
AVELALNENGTTLKSLGNS
>P03736 ~~~H~~~Tape measure protein~~~
MAEPVGDLVVDLSLDAARFDEQMARVRRHFSGTESDAKKTAAVVEQSLSRQALAAQKAGISVGQYKAAMRMLPAQFTDVA
TQLAGGQSPWLILLQQGGQVKDSFGGMIPMFRGLAGAITLPMVGATSLAVATGALAYAWYQGNSTLSDFNKTLVLSGNQA
GLTADRMLVLSRAGQAAGLTFNQTSESLSALVKAGVSGEAQIASISQSVARFSSASGVEVDKVAEAFGKLTTDPTSGLTA
MARQFHNVSAEQIAYVAQLQRSGDEAGALQAANEAATKGFDDQTRRLKENMGTLETWADRTARAFKSMWDAVLDIGRPDT
AQEMLIKAEAAYKKADDIWNLRKDDYFVNDEARARYWDDREKARLALEAARKKAEQQTQQDKNAQQQSDTEASRLKYTEE
AQKAYERLQTPLEKYTARQEELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSSVKVSAGDRQEDSAHAALLTLQA
ELRTLEKHAGANEKISQQRRDLWKAESQFAVLEEAAQRRQLSAQEKSLLAHKDETLEYKRQLAALGDKVTYQERLNALAQ
QADKFAQQQRAKRAAIDAKSRGLTDRQAEREATEQRLKEQYGDNPLALNNVMSEQKKTWAAEDQLRGNWMAGLKSGWSEW
EESATDSMSQVKSAATQTFDGIAQNMAAMLTGSEQNWRSFTRSVLSMMTEILLKQAMVGIVGSIGSAIGGAVGGGASASG
GTAIQAAAAKFHFATGGFTGTGGKYEPAGIVHRGEFVFTKEATSRIGVGNLYRLMRGYATGGYVGTPGSMADSRSQASGT
FEQNNHVVINNDGTNGQIGPAALKAVYDMARKGARDEIQTQMRDGGLFSGGGR
>P07636 3.1.22.-~~~A~~~DDE-recombinase A~~~
MELWVSPKECANLPGLPKTSAGVIYVAKKQGWQNRTRAGVKGGKAIEYNANSLPVEAKAALLLRQGEIETSLGYFEIARP
TLEAHDYDREALWSKWDNASDSQRRLAEKWLPAVQAADEMLNQGISTKTAFATVAGHYQVSASTLRDKYYQVQKFAKPDW
AAALVDGRGASRRNVHKSEFDEDAWQFLIADYLRPEKPAFRKCYERLELAAREHGWSIPSRATAFRRIQQLDEAMVVACR
EGEHALMHLIPAQQRTVEHLDAMQWINGDGYLHNVFVRWFNGDVIRPKTWFWQDVKTRKILGWRCDVSENIDSIRLSFMD
VVTRYGIPEDFHITIDNTRGAANKWLTGGAPNRYRFKVKEDDPKGLFLLMGAKMHWTSVVAGKGWGQAKPVERAFGVGGL
EEYVDKHPALAGAYTGPNPQAKPDNYGDRAVDAELFLKTLAEGVAMFNARTGRETEMCGGKLSFDDVFEREYARTIVRKP
TEEQKRMLLLPAEAVNVSRKGEFTLKVGGSLKGAKNVYYNMALMNAGVKKVVVRFDPQQLHSTVYCYTLDGRFICEAECL
APVAFNDAAAGREYRRRQKQLKSATKAAIKAQKQMDALEVAELLPQIAEPAAPESRIVGIFRPSGNTERVKNQERDDEYE
TERDEYLNHSLDILEQNRRKKAI
>Q7T6X9 5.6.2.1~~~TOP1E~~~DNA topoisomerase 1B~~~
MSYDEKHLMEGIYREKSGDKFIYYYFDNNEEVTTKDIERINKLRIPPAWTNVWVARDPNSPIQAIGTDSKGRKQYRYNEI
HIQGAEKEKFKRLYDFIKSIPKLEKAMVRDNNFPFYNKNRVISLMLQMVRDYNMRVGKEVYARQNKSYGISSLRKKHVKI
SPGVITLNFKGKSGQRLNYTIRNDFYIDGIKMLMKLEGDRLFQYISTDEDGNEKIMRVNDRDLNKYIQENMGSEFTIKDF
RTFGANLYFIQALLSETRKRTPKNRKTIKKNIANAFKSTARQLKHTGAVSKKSYVMNYTLELYQNNPEFFIEHKNDDPID
FLLRILKSYRKDVLGE
>Q5UQB5 5.6.2.1~~~TOP1P~~~DNA topoisomerase 1 type prokaryotic~~~
MSILILLESPGKISKISSILGKNYVVKASMGHFRDLDPKKMSIDFDNDFEPVYIVTKPDVVKNLKSAMKNIDLVYLAADE
DREGEAIAQSLYDVLKPSNYKRLRFNAITKDAIMSAIKNAGDIDKNLVDAQKARRVLDRLFGYLISPILQRQIGGKLSAG
RVQSVTVRIIIDKENEIKNFINKNADSSYFKVSGTFNGAKATLHESNDKKPFDLETAYKGKTAQIALINSENPNSKVVNF
MKRCLKSQFFIHSVEDKMTTRSPAPPFTTSTLQQEANRKFGMSIDSTMKTAQKLYEGGYITYMRTDSVEISAEGHRDIKK
IITDQYGADYYQKNLYKNKAANSQEAHEAIRPTHPELLTLEGEIEDAYQIKLYKLIWQRTIASQMKPAKIKVTIIQISIS
KYVEDKLNPFYYFQSQIETVVFPGFMKVYVESIDDPDTDNQITKNFTGKIPTVGSKVTMEEIIARQEYMRPPPRYSEASL
VKKLEELGIGRPSTYVNTIKTIINREYVKITDVPGIKKDITIYSIKSENKKHIMEVYEDTDTILLGKENKKIVPTNLGIT
VNDFLMKYFPEFLDYKFTANMETDLDYVSTGTKNWVDIVQDFYDKLKPIVDELSKQKGLSQSSERLLGEDNDGNEITATK
TKFGPVVRKKIGDKYVYAKIKDPLTLDTIKLSDAIKLLEYPKNLGQYKGFDVLLQKGDYGFYLSYNKENFSLGEIDDPED
INLDTAIKAIEAKKANNIAEFNLTENGKKIKAIVLNGKYGYYVQVTRNRIKKNYPIPKDLDPNNLTEQQILSIISVKKTY
KKSAPKGGSKTIRKPSQTKYSQTKSTKSTKSTKSTNKKFVGKSAKKTTKKTTKK
>A0A7H0DN83 5.6.2.1~~~~~~DNA topoisomerase I~~~
MRALFYKDGKLFTDNNFLNPVSDDNPAYEVLQHVKIPTHLTDVVVYEQTWEEALTRLIFVGSDSKGRRQYFYGKMHIQNR
NAKRDRIFVRVYNVMKRINCFINKNIKKSSTDSNYQLAVFMLMETMFFIRFGKMKYLKENETVGLLTLKNKHIEISPDEI
VIKFVGKDKVSHEFVVHKSNRLYKPLLKLTDDSSPEEFLFNKLSERKVYECIKQFGIRIKDLRTYGVNYTFLYNFWTNVK
SVSPLPSPKKLIALTIKQTAEVVGHTPSISKRAYMATTILEMVKDKNFLDVVSKTTFDEFLSIVVDHVKSSTDG
>Q76ZS7 5.6.2.1~~~TOP1~~~DNA topoisomerase 1B~~~
MRALFYKDGKLFTDNNFLNPVSDDNPAYEVLQHVKIPTHLTDVVVYEQTWEEALTRLIFVGSDSKGRRQYFYGKMHVQNR
NAKRDRIFVRVYNVMKRINCFINKNIKKSSTDSNYQLAVFMLMETMFFIRFGKMKYLKENETVGLLTLKNKHIEISPDEI
VIKFVGKDKVSHEFVVHKSNRLYKPLLKLTDDSSPEEFLFNKLSERKVYECIKQFGIRIKDLRTYGVNYTFLYNFWTNVK
SISPLPSPKKLIALTIKQTAEVVGHTPSISKRAYMATTILEMVKDKNFLDVVSKTTFDEFLSIVVDHVKSSTDG
>P68697 5.6.2.1~~~~~~DNA topoisomerase 1B~~~
MRALFYKDGKLFTDNNFLNPVSDDNPAYEVLQHVKIPTHLTDVVVYEQTWEEALTRLIFVGSDSKGRRQYFYGKMHVQNR
NAKRDRIFVRVYNVMKRINCFINKNIKKSSTDSNYQLAVFMLMETMFFIRFGKMKYLKENETVGLLTLKNKHIEISPDEI
VIKFVGKDKVSHEFVVHKSNRLYKPLLKLTDDSSPEEFLFNKLSERKVYECIKQFGIRIKDLRTYGVNYTFLYNFWTNVK
SISPLPSPKKLIALTIKQTAEVVGHTPSISKRAYMATTILEMVKDKNFLDVVSKTTFDEFLSIVVDHVKSSTDG
>P68698 5.6.2.1~~~~~~DNA topoisomerase 1B~~~
MRALFYKDGKLFTDNNFLNPVSDDNPAYEVLQHVKIPTHLTDVVVYEQTWEEALTRLIFVGSDSKGRRQYFYGKMHVQNR
NAKRDRIFVRVYNVMKRINCFINKNIKKSSTDSNYQLAVFMLMETMFFIRFGKMKYLKENETVGLLTLKNKHIEISPDEI
VIKFVGKDKVSHEFVVHKSNRLYKPLLKLTDDSSPEEFLFNKLSERKVYECIKQFGIRIKDLRTYGVNYTFLYNFWTNVK
SISPLPSPKKLIALTIKQTAEVVGHTPSISKRAYMATTILEMVKDKNFLDVVSKTTFDEFLSIVVDHVKSSTDG
>P32989 5.6.2.1~~~~~~DNA topoisomerase 1~~~
MRALFYKDGKLFTDNNFLNPVSDNNPAYEVLQHVKIPTHLTDVVVYGQTWEEALTRLIFVGSDSKGRRQYFYGKMHVQNR
NAKRDRIFVRVYNVMKRINCFINKNIKKSSTDSNYQLAVFMLMETMFFIRFGKMKYLKENETVGLLTLKNKHIEISPDKI
VIKFVGKDKVSHEFVVHKSNRLYKPLLKLTDDSSPEEFLFNKLSERKVYECIKQFGIRIKDLRTYGVNYTFLYNFWTNVK
SISPLPSPKKLIALTIKQTAEVVGHTPSISKRAYMATTILEMVKDKNFLDVVSKTTFDEFLSIVVDHVKSSTDG
>Q00942 5.6.2.2~~~TOP~~~DNA topoisomerase 2~~~
MEAFEISDFKEHAKKKSMWAGALNKVTISGLMGVFTEDEDLMALPIHRDHCPALLKIFDELIVNATDHERACHSKTKKVT
YIKISFDKGVFSCENDGPGIPIAKHEQASLIAKRDVYVPEVASCFFLAGTNINKAKDCIKGGTNGVGLKLAMVHSQWAIL
TTADGAQKYVQQINQRLDIIEPPTITPSREMFTRIELMPVYQELGYAEPLSETEQADLSAWIYLRACQCAAYVGKGTTIY
YNDKPCRTGSVMALAKMYTLLSAPNSTIHTATIKADAKPYSLHPLQVAAVVSPKFKKFEHVSIINGVNCVKGEHVTFLKK
TINEMVIKKFQQTIKDKNRKTTLRDSCSNIFVVIVGSIPGIEWTGQRKDELSIAENVFKTHYSIPSSFLTSMTRSIVDIL
LQSISKKDNHKQVDVDKYTRARNAGGKRAQDCMLLAAEGDSALSLLRTGLTLGKSNPSGPSFDFCGMISLGGVIMNACKK
VTNITTDSGETIMVRNEQLTNNKVLQGIVQVLGLDFNCHYKTQEERAKLRYGCIVACVDQDLDGCGKILGLLLAYFHLFW
PQLIIHGFVKRLLTPLIRVYEKGKTMPVEFYYEQEFDAWAKKQTSLVNHTVKYYKGLAAHDTHEVKSMFKHFDNMVYTFT
LDDSAKELFHIYFGGESELRKRELCTGVVPLTETQTQSIHSVRRIPCSLHLQVDTKAYKLDAIERQIPNFLDGMTRARRK
ILAGGVKCFASNNRERKVFQFGGYVADHMFYHHGDMSLNTSIIKAAQYYPGSSHLYPVFIGIGSFGSRHLGGKDAGSPRY
ISVQLASEFIKTMFPAEDSWLLPYVFEDGQRAEPEYYVPVLPLAIMEYGANPSEGWKYTTWARQLEDILALVRAYVDKDN
PKHELLHYAIKHKITILPLRPSNYNFKGHLKRFGQYYYSYGTYDISEQRNIITITELPLRVPTVAYIESIKKSSNRMTFI
EEIIDYSSSETIEILVKLKPNSLNRIVEEFKETEEQDSIENFLRLRNCLHSHLNFVKPKGGIIEFNSYYEILYAWLPYRR
ELYQKRLMREHAVLKLRIIMETAIVRYINESAELNLSHYEDEKEASRILSEHGFPPLNHTLIISPEFASIEELNQKALQG
CYTYILSLQARELLIAAKTRRVEKIKKMQARLDKVEQLLQESPFPGASVWLEEIDAVEKAIIKGRNTQWKFH
>P07065 5.6.2.2~~~~~~DNA topoisomerase medium subunit~~~
MQLNNRDLKSIIDNEALAYAMYTVENRAIPNMIDGFKPVQRFVIARALDLARGNKDKFHKLASIAGGVADLGYHHGENSA
QDAGALMANTWNNNFPLLDGQGNFGSRTVQKAAASRYIFARVSKNFYNVYKDTEYAPVHQDKEHIPPAFYLPIIPTVLLN
GVSGIATGYATYILPHSVSSVKKAVLQALQGKKVTKPKVEFPEFRGEVVEIDGQYEIRGTYKFTSRTQMHITEIPYKYDR
ETYVSKILDPLENKGFITWDDACGEHGFGFKVKFRKEYSLSDNEEERHAKIMKDFGLIERRSQNITVINEKGKLQVYDNV
VDLIKDFVEVRKTYVQKRIDNKIKETESAFRLAFAKAHFIKKVISGEIVVQGKTRKELTEELSKIDMYSSYVDKLVGMNI
FHMTSDEAKKLAEEAKAKKEENEYWKTTDVVTEYTKDLEEIK
>P09176 5.6.2.2~~~~~~DNA topoisomerase large subunit~~~
MIKNEIKILSDIEHIKKRSGMYIGSSANETHERFMFGKWESVQYVPGLVKLIDEIIDNSVDEGIRTKFKFANKINVTIKN
NQVTVEDNGRGIPQAMVKTPTGEEIPGPVAAWTIPKAGGNFGDDKERVTGGMNGVGSSLTNIFSVMFVGETGDGQNNIVV
RCSNGMENKSWEDIPGKWKGTRVTFIPDFMSFETNELSQVYLDITLDRLQTLAVVYPDIQFTFNGKKVQGNFKKYARQYD
EHAIVQEQENCSIAVGRSPDGFRQLTYVNNIHTKNGGHHIDCAMDDICEDLIPQIKRKFKIDVTKARVKECLTIVMFVRD
MKNMRLIRQTKERLTSPFGEIRSHIQLDAKKISRDILNNEAILMPIIEAALARKLAAEKAAETKAAKKASKAKVHKHIKA
NLCGKDADTTLFLTEGDSAIGYLIDVRDKELHGGYPLRGKVLNSWGMSYADMLKNKELFDICAITGLVLGEKAFEEKEDG
EWFTFELNGDTIIVNENDEVQINGKWITVGELRKNL
>P23992 5.6.2.2~~~~~~DNA topoisomerase small subunit~~~
MKFVKIDSSSVDMKKYKLQNNVRRSIKSSSMNYANVAIMTDADHDGLGSIYPSLLGFFSNWPELFEQGRIRFVKTPVIIA
QVGKKQEWFYTVAEYESAKDALPKHSIRYIKGLGSLEKSEYREMIQNPVYDVVKLPENWKELFEMLMGDNADLRKEWMSQ
>Q65164 2.5.1.-~~~~~~Trans-prenyltransferase~~~
MLHLIYISIIVVLIIILISYTRKPKYFRITAPRSVALFHGIHPLNPKNYKTFSKEFETILNNAIEDGDFKGQLTEPCSYA
LRGGKYIRPIILMEIVRACQLQHSFGAPIYPAEAALAVEYFHVASLIIDDMPSFDNDVKRRNKDTVWARFGVAKAQMSAL
ALTMQGFQNICRQIDWIKEHCPRFPDPNQLGALLCTFVSHSLNSAGSGQLVDTPEKTIPFFKIAFIMGWVLGTGTIEDIG
MIERAAHCFGHAFQLADDIKDHDTDTGWNYAKIHGKQKTFDDVAQSLQECKKILHGKKIFTSIWNEIFQKVINVALGT
>Q5UR25 ~~~~~~Thioredoxin domain-containing protein R362~~~
MSIVTNGNLNKYFQQIKNLKEEFVKKYNSGESTEEIIGKMRQIHGQAIFEDTRNKFQLAKNNKLNPSDFYNLSVDYCTLA
NMDKKIRHLEILQFERKHKGYTHKGLDTDKYEINRSLIRSNGPLVGTTQPIQSTQANQISMTGKGTQDPVNKVRPEIRKD
DDGVDVISLHNTQTEDVTTRRVNSDVIPTEFNNTMNTQNQSNSNYLNTTEYLTNLSNTEANRLMSDYNNGNYELTDSQKQ
QGGNHIFEVGKPTVVNFYADWCGYSRQFMPNWEKVRDSVKKKYGERIQLSSLNVGQDTDKVNISKNAGVNGYPTVVIFKD
GNTYHKVAGNASADDIVKFIDETMSR
>Q5UQN5 ~~~~~~Thioredoxin domain-containing protein R443~~~
MRKLSWQHIVLIVLAIILILWIISLLLCRKPVRPTYQVPIIQPMQVIQPHQNDIDPAWQTTYSPNNTDNQNQQYVLYYFN
NPSCPHCKNFSSTWDMLKNNFRSINNLSLKEISTDKQENEHLVFYYNIRRVPTIILVTPDKNLEYSGNKSLEDLTQFIRS
NMNQ
>Q96703 ~~~AC2~~~Transcriptional activator protein~~~
MQNSSLLKPPSIKAQHKIAKRRAVRRRRIDLNCGCSIFLHINCADNGFTHRGEHHCSSGREFRFYLGGSKSPIFQDTTRR
GPVVHQNQDLPHPSPVQPQPTESIGSPQSLLQLPSLDDFDESFWADIFK
>P14976 ~~~AC2~~~Transcriptional activator protein~~~
MQSSSPSQNHSTQVPIKVTHRQVKKRAIRRRRVDLVCGCSYYLHINCFNHGFTHRGSHHCSSSNEWRVYLGNKQSPVFHN
HQAPTTTIPAEPGHHNSPGSIQSQPEEGAGDSQMFSQLPDLDNLTASDWSFLKGL
>P03562 ~~~AC2~~~Transcriptional activator protein~~~
MRNSSSSTPPSIKAQHRAAKRRAIRRRRIDLNCGCSIYIHIDCRNNGFTHRGTYHCASSREWRLYLGDNKSPLFQDNQRR
GSPLHQHQDIPLTNQVQPQPEESIGSPQGISQLPSMDDIDDSFWENLFK
>Q06658 ~~~AC2~~~Transcriptional activator protein~~~
MRSSSPSQPPSIKRAHRQAKKRAIRRRRVDLQCGCSIYFHLGCAGHGFTHRGTHHCTSGGEWRVYLGARKSPLFQDTQSR
GPTVYQNEGIPRTDTVQPQPEESVASPQSLPELPSLDDVDDSFWINLFS
>Q9DXE6 ~~~C2~~~Transcriptional activator protein~~~
MRSSSPSTGHSTQVPIKVQHRIAKKTTRRRRVDLPCGCSYFVALGCHNHGFTHRGTTHCSSIREWRVYLDGQKSPVFQDN
QTPRETISEEPRHNHNTSPIQLQPEESVGDTQMFSNLPNLDSFTSSDLAFLKSI
>P27262 ~~~C2~~~Transcriptional activator protein~~~
MQPSSPSTSHCSQVSIKVQHKIAKKKPIRRKRVDLDCGCSYYLHLNCNNHGFTHRGTHHCSSGREWRFYLGDKQSPLFQD
NRTQPEAISNEPRHHFHSDKIQPQHQEGNGDSQMFSRLPNLDDITASDWSFLKSI
>P10212 ~~~TRM1~~~Tripartite terminase subunit 1~~~
MAAPVSEPTVARQKLLALLGQVQTYVFQIELLRRCDPHIGRGKLPQLKLNALQVRALRRRLRPGLEAQAGAFLTPLSVTL
ELLLEYAWREGERLLGSLETFATAGDVAAFFTETMGLARPCPYHQRVRLDTYGGTVHMELCFLHDVENFLKQLNYCHLIT
PSRGATAALERVREFMVGAVGSGLIVPPELSDPSHPCAVCFEELCVTANQGATIARRLADRICNHVTQQAQVRLDANELR
RYLPHAAGLSDADRARALSVLDHALARTAGGDGQPHPSPENDSVRKEADALLEAHDVFQATTPGLYAISELRFWLASGDR
AGQTTMDAFASNLTALARRELQQETAAVAVELALFGRRAEHFDRAFGSHLAALDMVDALIIGGQATSPDDQIEALIRACY
DHHLTTPLLRRLVSPEQCDEEALRRVLARMGAGGAADAPKGGAGPDDDGDRVAVEEGARGLGAPGGGGEDEDRRRGPGGQ
GPETWGDIATQAAADVRERRRLYADRLTKRSLASLGRCVREQRGELEKMLRVSVHGEVLPATFAAVANGFAARARFCALT
AGAGTVIDNRSAPGVFDAHRFMRASLLRHQVDPALLPSITHRFFELVNGPLFDHSTHSFAQPPNTALYYSVENVGLLPHL
KEELARFIMGAGGSGADWAVSEFQRFYCFDGISGITPTQRAAWRYIRELIIATTLFASVYRCGELELRRPDCSRPTSEGR
YRYPPGVYLTYDSDCPLVAIVESAPDGCIGPRSVVVYDRDVFSILYSVLQHLAPRLPDGGHDGPP
>P11871 ~~~TRM1~~~Tripartite terminase subunit 1~~~
MAERRLVAVLGQVQTYVFQLEMLKRCDPAVVRELAPRVKLNALMCRYLARRLPLEAQTTPLTCALRLALAYARAEGDRVL
GALAAAGDDAEAYFERTMGGACRFHARVALDTYGGRVETELQFLHDAENLLKQLNYCHLITPHAVDLSAVDEFLARTIGG
GLVVPPELYDPAQPCAVCFEELCVTANQGEATHRRLLGCVCDHLTRQLAVRVDPEDVAKNLPHVHGLDEARRGRALAALA
AVDAAEAREAEAASTAAAGAEAGDAGETARRRADALLDAHDVFRPASRRLYAVSELQFWLASTNQAVRALDLFTHNLDDL
ERRERRAEVRAAAVELALFGRRPEHFDRARAARELDIIDGLLVGGCAASPDERLEALIRACYDHHMSTPMLRMLDPDRAN
RDALERLLEGGDDADADGGAAGGADAGDGGVGDEDGPGAPPPADAVAWADLPAAALRDAERRRRLYADRLSRRSAASLAQ
CVREQRRELEKTLRVNVYGDALLHTYVAVAAGFRARRAFCEAAARAGTVVDERETGCFDAHSFMKATVQRHPVDAALLPA
VTRKFFELVNGPLFAHDTHAFAQPPNTALYFAVENVGLLPHLKEELARFMVARDWCVSEFRGFYRFQTAGVTATQRQAWR
YIRELVLAVAVFRSVFHCGDVEVLRADRFAGRDGLYLTYEASCPLVAVFGAGPGGIGPGTTAVLASDVFGLLHTTLQLRG
APSR
>P10217 ~~~TRM2~~~Tripartite terminase subunit 2~~~
MAGREGRTRQRTLRDTIPDCALRSQTLESLDARYVSRDGAHDAAVWFEDMTPAELEVVFPTTDAKLNYLSRTQRLASLLT
YAGPIKAPDDAAAPQTPDTACVHGELLAAKRERFAAVINRFLDLHQILRG
>P03219 3.1.-.-~~~TRM3~~~Tripartite terminase subunit 3~~~
MLYASQRGRLTENLRNALQQDSTTQGCLGAETPSIMYTGAKSDRWAHPLVGTIHASNLYCPMLRAYCRHYGPRPVFVASD
ESLPMFGASPALHTPVQVQMCLLPELRDTLQRLLPPPNLEDSEALTEFKTSVSSARAILEDPNFLEMREFVTSLASFLSG
QYKHKPARLEAFQKQVVLHSFYFLISIKSLEITDTMFDIFQSAFGLEEMTLEKLHIFKQKASVFLIPRRHGKTWIVVAII
SLILSNLSNVQIGYVAHQKHVASAVFTEIIDTLTKSFDSKRVEVNKETSTITFRHSGKISSTVMCATCFNKNSIRGQTFH
LLFVDEANFIKKEALPAILGFMLQKDAKIIFISSVNSADQATSFLYKLKDAQERLLNVVSYVCQEHRQDFDMQDSMVSCP
CFRLHIPSYITMDSNIRATTNLFLDGAFSTELMGDTSSLSQGSLSRTVRDDAINQLELCRVDTLNPRVAGRLASSLYVYV
DPAYTNNTSASGTGIAAVTHDRADPNRVIVLGLEHFFLKDLTGDAALQIATCVVALVSSIVTLHPHLEEVKVAVEGNSSQ
DSAVAIASIIGESCPLPCAFVHTKDKTSSLQWPMYLLTNEKSKAFERLIYAVNTASLSASQVTVSNTIQLSFDPVLYLIS
QIRAIKPIPLRDGTYTYTGKQRNLSDDVLVALVMAHFLATTQKHTFKKVH
>P16732 3.1.-.-~~~TRM3~~~Tripartite terminase subunit 3~~~
MLRGDSAAKIQERYAELQKRKSHPTSCISTAFTNVATLCRKRYQMMHPELGLAHSCNEAFLPLMAFCGRHRDYNSPEESQ
RELLFHERLKSALDKLTFRPCSEEQRASYQKLDALTELYRDPQFQQINNFMTDFKKWLDGGFSTAVEGDAKAIRLEPFQK
NLLIHVIFFIAVTKIPVLANRVLQYLIHAFQIDFLSQTSIDIFKQKATVFLVPRRHGKTWFIIPIISFLLKHMIGISIGY
VAHQKHVSQFVLKEVEFRCRHTFARDYVVENKDNVISIDHRGAKSTALFASCYNTNSIRGQNFHLLLVDEAHFIKKEAFN
TILGFLAQNTTKIIFISSTNTTSDSTCFLTRLNNAPFDMLNVVSYVCEEHLHSFTEKGDATACPCYRLHKPTFISLNSQV
RKTANMFMPGAFMDEIIGGTNKISQNTVLITDQSREEFDILRYSTLNTNAYDYFGKTLYVYLDPAFTTNRKASGTGVAAV
GAYRHQFLIYGLEHFFLRDLSESSEVAIAECAAHMIISVLSLHPYLDELRIAVEGNTNQAAAVRIACLIRQSVQSSTLIR
VLFYHTPDQNHIEQPFYLMGRDKALAVEQFISRFNSGYIKASQELVSYTIKLSHDPIEYLLEQIQNLHRVTLAEGTTARY
SAKRQNRISDDLIIAVIMATYLCDDIHAIRFRVS
>P04295 3.1.-.-~~~TRM3~~~Tripartite terminase subunit 3~~~
MFGQQLASDVQQYLERLEKQRQLKVGADEASAGLTMGGDALRVPFLDFATATPKRHQTVVPGVGTLHDCCEHSPLFSAVA
RRLLFNSLVPAQLKGRDFGGDHTAKLEFLAPELVRAVARLRFKECAPADVVPQRNAYYSVLNTFQALHRSEAFRQLVHFV
RDFAQLLKTSFRASSLTETTGPPKKRAKVDVATHGRTYGTLELFQKMILMHATYFLAAVLLGDHAEQVNTFLRLVFEIPL
FSDAAVRHFRQRATVFLVPRRHGKTWFLVPLIALSLASFRGIKIGYTAHIRKATEPVFEEIDACLRGWFGSARVDHVKGE
TISFSFPDGSRSTIVFASSHNTNGIRGQDFNLLFVDEANFIRPDAVQTIMGFLNQANCKIIFVSSTNTGKASTSFLYNLR
GAADELLNVVTYICDDHMPRVVTHTNATACSCYILNKPVFITMDGAVRRTADLFLADSFMQEIIGGQARETGDDRPVLTK
SAGERFLLYRPSTTTNSGLMAPDLYVYVDPAFTANTRASGTGVAVVGRYRDDYIIFALEHFFLRALTGSAPADIARCVVH
SLTQVLALHPGAFRGVRVAVEGNSSQDSAVAIATHVHTEMHRLLASEGADAGSGPELLFYHCEPPGSAVLYPFFLLNKQK
TPAFEHFIKKFNSGGVMASQEIVSATVRLQTDPVEYLLEQLNNLTETVSPNTDVRTYSGKRNGASDDLMVAVIMAIYLAA
QAGPPHTFAPITRVS
>Q9T1V8 ~~~~~~Probable tail terminator protein~~~
MLEETEAALLARVRELFGATLRQVEPLTGTWTNEDVHRLFLAPPSVFLAWMGCGEGRTRREVESRWAFFVVAELLNGEPV
NRPGIYQIVERLIAGVNGQTFGPTTGMRLTQVRNLCDDNRINAGVVLYGVLFSGTTPLPSVVDLDSLDDYERHWQTWKFP
DETPEFAAHINVNQEKDHDAEN
>P09695 ~~~TRS1~~~Protein HHLF1~~~
MAQRNGMSPRPPPLGRGRGAGGPSGVGSSPPSSCVPMGAPSTAGTGASAAATTTPGHGVHRVEPRGPPGAPPSSGNNSNF
WHGPERLLLSQIPVERQALTELEYQAMGAVWRAAFLANSTGRAMRKWSQRDAGTLLPLGRPYGFYARVTPRSQMNGVGAT
DLRQLSPRDAWIVLVATVVHEVDPAADPTLGDKAGHPEGLCAQDGLYLALGAGFRVFVYDLANNTLILAARDADEWFRHG
AGEVVRLYRCNRLGVGTPRATLLPQPALRQTLLRAEEATALGRELRRRWAGTTVALQTPGRRLQPMVLLGAWQELAQYEP
FASAPHPASLLTAVRRHLNQRLCCGWLALGAVLPARWLGCAAGPATGTAAGTTSPPAASGTETEAAGGDAPCAIAGAVGS
AVPVPPQPYGAAGGGAICVPNADAHAVVGADAAAAAAPTVMVGSTAMAGPAASGTVPRAMLVVLLDELGAVFGYCPLDGH
VYPLAAELSHFLRAGVLGALALGRESAPAAEAARRLLPELDREQWERPRWDALHLHPRAALWAREPHGQLAFLLRPGRGE
AEVLTLATKHPAICANVEDYLQDARRRADAQALGLDLATVVMEAGGQMIHKKTKKPKGKEDESLMKGKHSRYTRPTEPPL
TPQASLGRALRRDDEDWKPSRLPGEDSWYDLDETFWVLGSNRKNDVYQRRWKKTVLRCGLEIDRPMPTVPKGCRPQTFTH
EGIQLMGGATQEPLDTGLYAPSHVTSAFVPSVYMPPTVPYPDPAARLCRDMRRVTFSNIATHYHYNAQ
>Q6SVX2 ~~~TRS1~~~Protein TRS1~~~
MAQRNGMSPRPPPLGRGRGAGGPSGVGSSPPSSCVPMGATSTAGTGASAAPTATPGHGVHRVEPRGPPGAPPSSGNNSNF
WHGPERLLLSQIPVERQALTELEYQAMGAVWRAAFLANSTGRAMRKWSQRDAGTLLPLGRPYGFYARVTPRSQMNGVGAT
DLRQLSPRDAWIVLVATVVHEVDPAADPTVGDKAGHPEGLCAQDGLYLALGAGFRVFVYDLANNTLILAARDADEWFRHG
AGEVVRLYRCNRLGVGTPRATLLPQPALRQTLLRAEEATALGRELRRRWAGTTVALQTPGRRLQPMVLLGAWQELAQYEP
FASAPHPASLLTAVRRHLNQRLCCGWLALGAVLPARWLGCAAGPATGTTSPPAASGTETEAAGGDAPCAMAGAVGSAVTI
PPQPYGGAGGSAICVPNADAHAVVGADATAAAAAAAAAPTVMVGPTAMAGPAASGTVPRAMLVVVLDELGAVFGYCPLDG
HVYPLAAELSHFLRAGVLGALALGRESAPAAEAARRLLPELDREQWERPRWDALHLHPRAALWAREPHGQLAFLLRPGRG
EAEVLTLATKHPVICANVEDYLQDARRRADAQALGLDLATVVMEAGGQMIHKKTKKPKGKEDESVMKGKHSRYTRPTEPP
LTPQASLGRALRRDDEDWKPSRVPGEDSWYDLDETFWVLGSNRKNDVYQRRWKKTVLRCGLEIDRPMPTVPKGCRPQTFT
HEGIQLMGGATQEPLDTGLYAPSHVTSAFVPSVYMPPTVPYPDPAARLCRDMRRVTFSNVATHYHYNA
>P03187 ~~~TRX1~~~Triplex capsid protein 1~~~
MKVQGSVDRRRLQRRIAGLLPPPARRLNISRGSEFTRDVRGLVEEHAQASSLSAAAVWRAGLLAPGEVAVAGGGSGGGSF
SWSGWRPPVFGDFLIHASSFNNAEATGTPLFQFKQSDPFSGVDAVFTPLSLFILMNHGRGVAARVEAGGGLTRMANLLYD
SPATLADLVPDFGRLVADRRFHNFITPVGPLVENIKSTYLNKITTVVHGPVVSKAIPRSTVKVTVPQEAFVDLDAWLSGG
AGGGGGVCFVGGLGLQPCPADARLYVALTYEEAGPRFTFFQSSRGHCQIMNILRIYYSPSIMHRYAVVQPLHIEELTFGA
VACLGTFSATDGWRRSAFNYRGSSLPVVEIDSFYSNVSDWEVIL
>P16783 ~~~TRX1~~~Triplex capsid protein 1~~~
MDARAVAKRPRDPADEDNELVTALKAKREVNTISVRYLYHADHQALTARFFVPEGLVEFEAQPGALLIRMETGCDSPRHL
YISLYLLGIRASNVSASTRCLLESVYTASAARAALQWLDLGPHLLHRRLETLGCVKTVSLGITSLLTCVMRGYLYNTLKT
EVFALMIPKDMYLTWEETRGRLQYVYLIIVYDYDGPETRPGIYVLTSSIAHWQTLVDVARGKFARERCSFVNRRITRPRQ
IPLCTGVIQKLGWCLADDIHTSFLVHKELKLSVVRLDNFSVELGDFREFV
>P32888 ~~~TRX1~~~Triplex capsid protein 1~~~
MKTNPLPATPSVWGGSTVELPPTTRDTAGQGLLRRVLRPPISRRDGPGLPRGSGPRRAASTLWLLGLDGTDAPPGALTPN
DDTEQALDKILRGTMRGGAALIGSPRHHLTRQVILTDLCQPNADRAGTLLLALRHPADLPHLAHQRAPPGRQTERLGEAW
GQLMEATALGSGRAESGCTRAGLVSFNFLVAACAASYDARDAADAVRAHVTANYRGTRVGARLDRFSECLRAMVHTHVFP
HEVMRFFGGLVSWVTQDELASVTAVCAGPQEAAHTGHPGRPRSAVILPACAFVDLDAELGLGGPGAAFLYLVFTYRQRRD
QELCCVYVIKSQLPPRGLEPALERLFGRLRITNTIHGTEDMTPPAPNRNPDFPLAGLAANPQTPRCSAGQVTNPQFADRL
YRWQPDLRGRPTARTCTYAAFAELGMMPEDSPRCLHRTERFGAVSVPVVILEGVVWRPGEWRACA
>P22486 ~~~TRX1~~~Triplex capsid protein 1~~~
MKTKPLPTAPMAWAESAVETTTGPRELAGHAPLRRVLRPPIARRDGPVLLGDRAPRRTASTMWLLGIDPAESSPGTRATR
DDTEQAVDKILRGARRAGGLTVPGAPRYHLTRQVTLTDLCQPNAERAGALLLALRHPTDLPHLARHRAPPGRQTERLAEA
WGQLLEASALGSGRAESGCARAGLVSFNFLVAACAAAYDARDAAEAVRAHITTNYGGTRAGARLDRFSECLRAMVHTHVF
PHEVMRFFGGLVSWVTQDELASVTAVCSGPQEATHTGHPGRPRSAVTIPACAFVDLDAELCLGGPGAAFLYLVFTYRQCR
DQELCCVYVVKSQLPPRGLEAALERLFGRLRITNTIHGAEDMTPPPPNRNVDFPLAVLAASSQSPRCSASQVTNPQFVDR
LYRWQPDLRGRPTARTCTYAAFAELGVMPDDSPRCLHRTERFGAVGVPVVILEGVVWRPGGWRACA
>Q9WT35 ~~~TRX1~~~Triplex capsid protein 1~~~
MNSKSSARAAIVDTVEAVKKRKYISIEAGTLNNVVEKERKFLKQFLSGRENLRIAARVFTPCELLAPELENLGMLMYRFE
TDVDNPKILFVGLFFLCSNAFNVSACVRTALTTMYTNSMVDNVLSMINTCKYLEDKVSLFGVTSLVSCGSSCLLSCVMQG
NVYDANKENIHGLTVLKEIFLEPDWEPRQHSTQYVYVVHVYKEVLSKLQYGIYVVLTSFQNEDLVVDILRQYFEKERFLF
LNYLINSNTTLSYFGSVQRIGRCATEDIKSGFLQYRGITLPVIKLENIFVDLSEKKVFV
>F5H8Y5 ~~~TRX1~~~Triplex capsid protein 1~~~
MKVQAENAARLGRQVLGLLPPPTHRVSLTRGPEFARGVRDLLSKYAASTRPTVGSLHEALRQAPFRQPTYGDFLVYSQTF
SPQEPLGTFLFSFKQEDNGSSMDMLLTPTSLFMLSGMEAAKAPQTHKVAGVWYGSGSGLADFIPNLSELMDTGEFHTLLT
PVGPMVQSVHSTFVTKVTSAMKGVGLARDEPRAHVGLTLPCDMLVDLDESCPMVQRREPAGLNVTIYASLVYLRVNQRPS
MALTFFQSGKGFAEVVAMIKDHFTDVIRTKYIQLRHELYINRLVFGAVCTLGTVPFDSHPVHQSLNVKGTSLPVLVFANF
EAACGPWTVFL
>P09276 ~~~TRX1~~~Triplex capsid protein 1~~~
MGSQPTNSHFTLNEQTLCGTNISLLGNNRFIQIGNGLHMTYAPGFFGNWSRDLTIGPRFGGLNKQPIHVPPKRTETASIQ
VTPRSIVINRMNNIQINPTSIGNPQVTIRLPLNNFKSTTQLIQQVSLTDFFRPDIEHAGSIVLILRHPSDMIGEANTLTQ
AGRDPDVLLEGLRNLFNACTAPWTVGEGGGLRAYVTSLSFIAACRAEEYTDKQAADANRTAIVSAYGCSRMETRLIRFSE
CLRAMVQCHVFPHRFISFFGSLLEYTIQDNLCNITAVAKGPQEAARTDKTSTRRVTANIPACVFWDVDKDLHLSADGLKH
VFLVFVYTQRRQREGVRLHLALSQLNEQCFGRGIGFLLGRIRAENAAWGTEGVANTHQPYNTRALPLVQLSNDPTSPRCS
IGEITGVNWNLARQRLYQWTGDFRGLPTQLSCMYAAYTLIGTIPSESVRYTRRMERFGGYNVPTIWLEGVVWGGTNTWNE
CYY
>P25214 ~~~TRX2~~~Triplex capsid protein 2~~~
MDLKVVVSLSSRLYTDEIAKMQQRIGCILPLASTHGTQNVQGLGLGQVYSLETVPDYVSMYNYLSDCTLAVLDEVSVDSL
ILTKIVPGQTYAIKNKYQPFFQWHGTGSLSVMPPVFGREHATVKLESNDVDIVFPMVLPTPIAEEVLQKILLFNVYSRVV
MQAPGNADMLDVHMHLGSVSYLGHHYELALPEVPGPLGLALLDNLSLYFCIMVTLLPRASMRLVRGLIRHEHHDLLNLFQ
EMVPDEIARIDLDDLSVADDLSRMRVMMTYLQSLASLFNLGPRLATAAYSQETLTATCWLR
>P16728 ~~~TRX2~~~Triplex capsid protein 2~~~
MAAMEANIFCTFDHKLSIADVGKLTKLVAAVVPIPQRLHLIKHYQLGLHQFVDHTRGYVRLRGLLRNMTLTLMRRVEGNQ
ILLHVPTHGLLYTVLNTGPVTWEKGDALCVLPPLFHGPLARENLLTLGQWELVLPWIVPMPLALEINQRLLIMGLFSLDR
SYEEVKAAVQQLQTITFRDATFTIPDPVIDQHLLIDMKTACLSMSMVANLASELTMTYVRKLALEDSSMLLVKCQELLMR
LDRERSVGEPRTPARPQHVSPDDEIARLSALFVMLRQLDDLIREQVVFTVCDVSPDNKSATCIFKG
>P10202 ~~~TRX2~~~Triplex capsid protein 2~~~
MLADGFETDIAIPSGISRPDAAALQRCEGRVVFLPTIRRQLTLADVAHESFVSGGVSPDTLGLLLAYRRRFPAVITRVLP
TRIVACPLDVGLTHAGTVNLRNTSPVDLCNGDPISLVPPVFEGQATDVRLDSLDLTLRFPVPLPSPLAREIVARLVARGI
RDLNPSPRNPGGLPDLNVLYYNGSRLSLLADVQQLGPVNAELRSLVLNMVYSITEGTTIILTLIPRLFALSAQDGYVNAL
LQMQSVTREAAQLIHPEAPALMQDGERRLPLYEALVAWLTHAGQLGDTLALAPVVRVCTFDGAAVVRSGDMAPVIRYP
>P89441 ~~~TRX2~~~Triplex capsid protein 2~~~
MITDCFEADIAIPSGISRPDAAALQRCEGRVVFLPTIRRQLALADVAHESFVSGGVSPDTLGLLLAYRRRFPAVITRVLP
TRIVACPVDLGLTHAGTVNLRNTSPVDLCNGDPVSLVPPVFEGQATDVRLESLDLTLRFPVPLPTPLAREIVARLVARGI
RDLNPDPRTPGELPDLNVLYYNGARLSLVADVQQLASVNTELRSLVLNMVYSITEGTTLILTLIPRLLALSAQDGYVNAL
LQMQSVTREAAQLIHPEAPMLMQDGERRLPLYEALVAWLAHAGQLGDILALAPAVRVCTFDGAAVVQSGDMAPVIRYP
>Q9QJ27 ~~~TRX2~~~Triplex capsid protein 2~~~
METVYCTFDHKLSLSDISTLCKLMNIVIPIPAHHHLIGSGNLGLYPIVSSNKDYVHIRNVLRTMVVTILQKVEGNQLVLR
KPMTGQQYAIKNTGPFPWEKGDTLTLIPPLSTHSEEKLLKLGDWELTVPLVVPTAIAAEINIRLLCIGLIAVHREYNEMQ
TIIDELCSIQYRDVLIKLPDIVNDKQSMYSMKTACISLSMITAMAPDIVRTYIDRLTLEDHSMLLIKCQELLSKRTTLST
QRCGQLHATDIKDELKKIKSVLTMIDQINSLTNEKTYFVVCDVSADNRMATCIYKN
>F5HGN8 ~~~TRX2~~~Triplex capsid protein 2~~~
MALDKSIVVNFTSRLFADELAALQSKIGSVLPLGDCHRLQNIQALGLGCVCSRETSPDYIQIMQYLSKCTLAVLEEVRPD
SLRLTRMDPSDNLQIKNVYAPFFQWDSNTQLAVLPPFFSRKDSTIVLESNGFDLVFPMVVPQQLGHAILQQLLVYHIYSK
ISAGAPDDVNMAELDLYTTNVSFMGRTYRLDVDNTDPRTALRVLDDLSMYLCILSALVPRGCLRLLTALVRHDRHPLTEV
FEGVVPDEVTRIDLDQLSVPDDITRMRVMFSYLQSLSSIFNLGPRLHVYAYSAETLAASCWYSPR
>P79678 ~~~L~~~Tail sheath protein~~~
MSDISFNAIPSDVRVPLTYIEFDNSNAVSGTPAPRQRVLMFGQSGSKASAAPNVPVRIRSGSQASAAFGQGSMLALMADA
FLNANRVAELWCIPQGNGTGNAAVGEISLSGTAGENGSLVTYIAGQRLAVSVAAGATGAALADLLVARIKGQPDLPVTAE
VRADSGDDDTHADVVLSAKFTGALSAVDVRWNYYAGETTPYGIITAFKAASGKNGNPDISASIAGMGDLQYKYIVMPYTD
EPNLNLLRTELQERWGPVNQADGFAVTVLSGTYGDISTFGVSRNDHLISCMGIAGAPEPSYLYAATLCAVASQALSIDPA
RPLQTLTLPGRMPPAVGDRFTWSERNALLFDGISTFNVNDGGEMQIERMITMYRTNKYGDSDPSYLNVNTIATLSYLRYS
LRTRITQKFPNYKLASDGTRFATGQAVVTPSVIKTELLALFEEWENAGLVEDFDTFKEELYVARNKDDKDRLDVLCGPNL
INQFRIFAAQVQFIL
>P22501 ~~~FI~~~Tail sheath protein~~~
MSDYHHGVQVLEINEGTRVISTVSTAIVGMVCTASDADAETFPLNKPVLITNVQSAISKAGKKGTLAASLQAIADQSKPV
TVVMRVEDGTGDDEETKLAQTVSNIIGTTDENGQYTGLKAMLAAESVTGVKPRILGVPGLDTKEVAVALASVCQKLRAFG
YISAWGCKTISEVKAYRQNFSQRELMVIWPDFLAWDTVTSTTATAYATARALGLRAKIDQEQGWHKTLSNVGVNGVTGIS
ASVFWDLQESGTDADLLNESGVTTLIRRDGFRFWGNRTCSDDPLFLFENYTRTAQVVADTMAEAHMWAVDKPITATLIRD
IVDGINAKFRELKTNGYIVDATCWFSEESNDAETLKAGKLYIDYDYTPVPPLENLTLRQRITDKYLANLVTSVNSN
>P85993 ~~~~~~Major tail sheath protein~~~
MTDNFFHGARVKENTDLQTAINDVDSTVIGLVAVADDADATTFPLDTPVLITRVISVLGKAGKTGSLYKSLKAISDQVST
RVIVVRVAAAGTEDGAKTQSQLIIGGSQADGSYTGMFALLTAEQKVGYRPRILGVPMYDTQEVTAQLRVIAKQLRAFSYS
YCDGCETIAEAKTYREQFAEREGMLIWPNFIAYNSVSGENEEFPAVAYALGLRAKIDNEQGWHKSLSNVAVSNVLGITKD
VFWALQAEDSDANELNANEVTTLIKRDGFRFWGNRTTDKDEYIFEVYTRTAQILADTIAEAQFTTVDSPLTPANVKDVVS
GINAKLQGLVTAGRLIGAACWFDIVDNPKTSIPQGKAVVRYNYSPVPPLEDLTMIQTFTDQYYEAAFASLGGA
>P13332 ~~~~~~Tail sheath protein~~~
MTLLSPGIELKETTVQSTVVNNSTGTAALAGKFQWGPAFQIKQVTNEVDLVNTFGQPTAETADYFMSAMNFLQYGNDLRV
VRAVDRDTAKNSSPIAGNIDYTISTPGSNYAVGDKITVKYVSDDIETEGKITEVDADGKIKKINIPTGKNYAKAKEVGEY
PTLGSNWTAEISSSSSGLAAVITLGKIITDSGILLAEIENAEAAMTAVDFQANLKKYGIPGVVALYPGELGDKIEIEIVS
KADYAKGASALLPIYPGGGTRASTAKAVFGYGPQTDSQYAIIVRRNDAIVQSVVLSTKRGEKDIYDSNIYIDDFFAKGGS
EYIFATAQNWPEGFSGILTLSGGLSSNAEVTAGDLMEAWDFFADRESVDVQLFIAGSCAGESLETASTVQKHVVSIGDAR
QDCLVLCSPPRETVVGIPVTRAVDNLVNWRTAAGSYTDNNFNISSTYAAIDGNHKYQYDKYNDVNRWVPLAADIAGLCAR
TDNVSQTWMSPAGYNRGQILNVIKLAIETRQAQRDRLYQEAINPVTGTGGDGYVLYGDKTATSVPSPFDRINVRRLFNML
KTNIGRSSKYRLFELNNAFTRSSFRTETAQYLQGNKALGGIYEYRVVCDTTNNTPSVIDRNEFVATFYIQPARSINYITL
NFVATATGADFDELTGLAG
>D3WAC9 ~~~~~~Probable tail terminator protein~~~
MAMNLLNTASIAKEMQTKVTERMGDWFEAEFKAKANSASRRTRLIRSHGHTYTYARYQNTGQLSSNLKQVKKGDKIVIDA
GTRANYTSGYHGMYFLVEKKGMQEVKTTLKKGANYANSMKL
>P13331 ~~~3~~~Tail tube terminator protein~~~
MSQALQQIFNQANTTNFVVSIPHSNTTSAFTLNAQSVPIPGIRIPVTDTVTGPFGLGRAQRPGVTFEYDPLIVRFIVDEE
LKSWIGMYEWMLGTSNYLTGENTAQKTGPEYITLYILDNSKTEIVMSINFYKPWVSDLSEVEFSYTEDSDPALVCTATIP
YTYFQVEKDGKIIAEV
>Q6QGE1 ~~~~~~Tail tube terminator protein p142~~~
MDHRTSIAQAMVDRISKQMDGSQPDEYFNNLYGNVSRQTYKFEEIREFPYVAVHIGTETGQYLPSGQQWMFLELPILVYD
KEKTDIQEQLEKLVADIKTVIDTGGNLEYTVSKPNGSTFPCEATDMIITSVSTDEGLLAPYGLAEINVTVRYQPPRRSLR
R
>P03732 ~~~U~~~Tail tube terminator protein~~~
MKHTELRAAVLDALEKHDTGATFFDGRPAVFDEADFPAVAVYLTGAEYTGEELDSDTWQAELHIEVFLPAQVPDSELDAW
MESRIYPVMSDIPALSDLITSMVASGYDYRRDDDAGLWSSADLTYVITYEM
>P68930 ~~~~~~Proximal tail tube connector protein~~~
MSSYTMQLRTYIEMWSQGETGLSTAEKIEKGRPKLFDFNYPIFDESYRTIFETHFIRNFYMREIGFETEGLFKFHLETWL
MINMPYFNKLFESELIKYDPLENTRVGVKSNTKNDTDRNDNRDVKQDLTSNGTSSTDAKQNDTSKTTGNEKSSGSGSITD
DNFKRDLNADTADDRLQLTTKDGEGVLEYASQIEEHNENKKRDTKTSNTTDTTSNTTGTSTLDSDSKTSNKANTTSNDKL
NSQINSVEDYIEDRVGKIGTQSYARLVMDYREALLRIEQRIFNEMQELFMLVY
>D1L2W4 3.2.1.-~~~~~~Tail tubular protein A~~~
MNMQDAYFGSAAELDAVNEMLAAIGESPVTTLDEDGSADVANARRILNRINRQIQSKGWAFNINESATLTPDVSTGLIPF
RPAYLSILGGQYVNRGGWVYDKSTGTDTFSGPITVTLITLQDYDEMPECFRQWIVTKASRQFNSRFFGAEDVENSLAQEE
MEARMACNEYEMDFGQYNMLDGDAYVQGLIGR
>D1L2Y9 3.2.1.28~~~~~~Tail tubular protein A~~~
MRELDAINLTLEALGESRVMDINTSNPSAGLARSALARNRRGLLSTGFWFNVVEREVTPTADGFIKVPWNQLAVYDAGSD
SKYGVRDGNLYDLMEQNQYFDSPVKLKIVLDLDFEDLPEHAAMWVANYTTAQVYLNDLGGDSNYANYAQEAERYKSMVLR
EHLRNQRFSTSKTRFARRIRRARFMV
>P03746 ~~~~~~Tail tubular protein gp11~~~
MRSYDMNVETAAELSAVNDILASIGEPPVSTLEGDANADAANARRILNKINRQIQSRGWTFNIEEGITLLPDVYSNLIVY
SDDYLSLMSTSGQSIYVNRGGYVYDRTSQSDRFDSGITVNIIRLRDYDEMPECFRYWIVTKASRQFNNRFFGAPEVEGVL
QEEEDEARRLCMEYEMDYGGYNMLDGDAFTSGLLTR
>P03747 ~~~~~~Tail tubular protein gp12~~~
MALISQSIKNLKGGISQQPDILRYPDQGSRQVNGWSSETEGLQKRPPLVFLNTLGDNGALGQAPYIHLINRDEHEQYYAV
FTGSGIRVFDLSGNEKQVRYPNGSNYIKTANPRNDLRMVTVADYTFIVNRNVVAQKNTKSVNLPNYNPNQDGLINVRGGQ
YGRELIVHINGKDVAKYKIPDGSQPEHVNNTDAQWLAEELAKQMRTNLSDWTVNVGQGFIHVTAPSGQQIDSFTTKDGYA
DQLINPVTHYAQSFSKLPPNAPNGYMVKIVGDASKSADQYYVRYDAERKVWTETLGWNTEDQVLWETMPHALVRAADGNF
DFKWLEWSPKSCGDVDTNPWPSFVGSSINDVFFFRNRLGFLSGENIILSRTAKYFNFYPASIANLSDDDPIDVAVSTNRI
AILKYAVPFSEELLIWSDEAQFVLTASGTLTSKSVELNLTTQFDVQDRARPFGIGRNVYFASPRSSFTSIHRYYAVQDVS
SVKNAEDITSHVPNYIPNGVFSICGSGTENFCSVLSHGDPSKIFMYKFLYLNEELRQQSWSHWDFGENVQVLACQSISSD
MYVILRNEFNTFLARISFTKNAIDLQGEPYRAFMDMKIRYTIPSGTYNDDTFTTSIHIPTIYGANFGRGKITVLEPDGKI
TVFEQPTAGWNSDPWLRLSGNLEGRMVYIGFNINFVYEFSKFLIKQTADDGSTSTEDIGRLQLRRAWVNYENSGTFDIYV
ENQSSNWKYTMAGARLGSNTLRAGRLNLGTGQYRFPVVGNAKFNTVYILSDETTPLNIIGCGWEGNYLRRSSGI
>P26596 ~~~~~~Tail tube protein~~~
MKLDYNSREIFFGNEALIVADMSKGINGKPEFTNHKIVAGLVSVGSMEDQAETNSYPADDVPDHGVKKGATLLQGEMVFI
QTDQALKEDILGQQRTENGLGWSPTGNWKTKCVQYLIKGRKRDKVTGEFVDGYRVVVYPNLTPTAEATKESETDSVDGVD
PIQWTLAVQATESDIYLNGGKKVPAIEYEIWGEQAKDFVKKMESGLFIMQPDTVLAGAITLVAPVIPNVTTATKGNNDGT
IVVPDTLKDSKGGTVKVTSVIKDAHGKVATNGQLAPGVYIVTFSADGYEDVTAGVSVTDHS
>Q9MCC1 ~~~~~~Tail tube protein~~~
MKLDYNSREIFFGNEALIVADMTKGSNGKPEFTNHKIVTGLVSVGSMEDQAETNSYPADDVPDHGVKKGATLLQGEMVFI
QTDQALKEDMLGQQRTENGLGWSPTGNWKTKCVQYLIKGRKRDKVTGEFVDGYRVVVYPHLTPTAEATKESETDSVDGVD
PIQWTLAVQATESDIYSNGGKKVPAIEYEIWGEQAKDFAKKMESGLFIMQPDTVLAGAITLVAPVIPNVTTATKGNNDGT
IVVPDTLKDSKGGTVKVTSVIKDAHGKVATNGQLAPGVYIVTFSADGYEDVTAGVSVTDHS
>P79679 ~~~M~~~Tail tube protein~~~
MAGNQRQGVAFIRVNGMELESMEGASFTPSGITREEVTGSRVYGWKGKPRAAKVECKIPGGGPIGLDEIIDWENITVEFQ
ADTGETWMLANAWQADEPKNDGGEISLVLMAKQSKRIA
>P22502 ~~~FII~~~Tail tube protein~~~
MAMPRKLKLMNVFLNGYSYQGVAKSVTLPKLTRKLENYRGAGMNGSAPVDLGLDDDALSMEWSLGGFPDSVIWELYAATG
VDAVPIRFAGSYQRDDTGETVAVEVVMRGRQKEIDTGEGKQGEDTESKISVVCTYFRLTMDGKELVEIDTINMIEKVNGV
DRLEQHRRNIGL
>P85992 ~~~~~~Tail tube protein~~~
MATVNEFRGAMSRGGGVQRQHRWRVTISFPSFAASADQTRDVCLLAVTTNTPTGQLGEILVPWGGRELPFPGDRRFEALP
ITFINVVNNGPYNSMEVWQQYINGSESNRASANPDEYFRDVVLELLDANDNVTKTWTLQGAWPQNLGQLELDMSAMDSYT
QFTCDLRYFQAVSDRSR
>O48449 ~~~~~~Tail tube protein gp17.1*~~~
MPETPIMGQDVKYLFQSIDAATGSAPLFPAYQTDGSVSGERELFDEQTKNGRILGPGSVADSGEVTYYGKRGDAGQKAIE
DAYQNGKQIKFWRVDTVKNENDKYDAQFGFAYIESREYSDGVEGAVEISISLQVIGELKNGEIDTLPEEIVNVSKGGYDF
QQPGQTTGEAPGTVPLPNAPQNLTYTATTDSVTVKWDAVEGADSYNVYRGAEKKLDANVTTTSHTLTGIQPDTQLTVNVA
AVNAGGESPMSQIVTKTLPAESTG
>P13333 ~~~~~~Tail tube protein gp19~~~
MFVDDVTRAFESGDFARPNLFQVEISYLGQNFTFQCKATALPAGIVEKIPVGFMNRKINVAGDRTFDDWTVTVMNDEAHD
ARQKFVDWQSIAAGQGNEITGGKPAEYKKSAIVRQYARDAKTVTKEIEIKGLWPTNVGELQLDWDSNNEIQTFEVTLALD
YWE
>Q6QGE2 ~~~N4~~~Tail tube protein pb6~~~
MSLQLLRNTRIFVSTVKTGHNKTNTQEILVQDDISWGQDSNSTDITVNEAGPRPTRGSKRFNDSLNAAEWSFSTYILPYK
DKNTSKQIVPDYMLWHALSSGRAINLEGTTGAHNNATNFMVNFKDNSYHELAMLHIYILTDKTWSYIDSCQINQAEVNVD
IEDIGRVTWSGNGNQLIPLDEQPFDPDQIGIDDETYMTIQGSYIKNKLTILKIKDMDTNKSYDIPITGGTFTINNNITYL
TPNVMSRVTIPIGSFTGAFELTGSLTAYLNDKSLGSMELYKDLIKTLKVVNRFEIALVLGGEYDDERPAAILVAKQAHVN
IPTIETDDVLGTSVEFKAIPSDLDAGDEGYLGFSSKYTRTTINNLIVNGDGATDAVTAITVKSAGNVTTLNRSATLQMSV
EVTPSSARNKEVTWAITAGDAATINATGLLRADASKTGAVTVEATAKDGSGVKGTKVITVTAGG
>P03733 ~~~V~~~Tail tube protein~~~
MPVPNPTMPVKGAGTTLWVYKGSGDPYANPLSDVDWSRLAKVKDLTPGELTAESYDDSYLDDEDADWTATGQGQKSAGDT
SFTLAWMPGEQGQQALLAWFNEGDTRAYKIRFPNGTVDVFRGWVSSIGKAVTAKEVITRTVKVTNVGRPSMAEDRSTVTA
ATGMTVTPASTSVVKGQSTTLTVAFQPEGVTDKSFRAVSADKTKATVSVSGMTITVNGVAAGKVNIPVVSGNGEFAAVAE
ITVTAS
>Q331T6 ~~~tubR~~~DNA-binding protein TubR~~~
MAVNKNEYKILIMLKENQCTTELKSFTYTKLCNISKLSMSTVRRSIKKFLELQYVKEGCKQGISKTFYITPNGIEKLKSI
M
>Q331T8 ~~~tubY~~~Regulator protein TubY~~~
MNEESTRYVDVNYSDLDKELTYTTSEVAEILNENESTIRYWCDCFSDYIHIEREGRNRKFTKSNIDDLAFTKELLKKERL
TIKQAQKRWEHIKTQPSQNTKVISTTETTSQENVLNEQALLKLEEIKKQFLNDISTQINNTISQQLSTALNAHNEALEQT
KVELKDYISATIEDKLEANISNLKAHIDATTENTNKQIHQIYDKDVELVNDLKKHMEERKQQNEEQNNKKGFFGKLFKR
>Q331T7 3.6.5.-~~~tubZ~~~Tubulin-like protein TubZ~~~
MKNKIVFAPIGQGGGNIVDTLLGICGDYNALFINTSKKDLDSLKHAKHTYHIPYAEGCGKERKKAVGYAQTYYKQIIAQI
MEKFSSCDIVIFVATMAGGTGSGITPPILGLAKQMYPNKHFGFVGVLPKATEDIDEHMNAIACWNDIMRSTNEGKDISIY
LLDNNKREKESDINKEFATLFNDFMNMSESHAEGVVDEDEISKLLTMKKSNVILEFDDKEDIQVALAKSLKESIFAEYTT
NTCEFMGISTTRVVDVEAIKSIVGYPRRTFKGYNSKKNIVVATGIEPQKTTVQMMNEIIEDKMKQRREVTSKSENMIIEP
IALDDEDNKSVISSNEKEISIDNVEKEIDINDFFSKYM
>P41063 ~~~TUM~~~SOS operon TUM protein~~~
MDRELNEHVMIERVEMIARLTAEGTCQERDREIALNLIAEIARGNLMKNNNFSVVFSAPPVGETFAKEGKVKVNITLDKD
QKIGQPVIDAFQCELTKRIQSVFPSTRVTVKKGSMTGVELMGFDKDSDREALDSILQEVWEDESWR
>P00471 2.1.1.45~~~TD~~~Thymidylate synthase~~~
MKQYQDLIKDIFENGYETDDRTGTGTIALFGSKLRWDLTKGFPAVTTKKLAWKACIAELIWFLSGSTNVNDLRLIQHDSL
IQGKTVWDENYENQAKDLGYHSGELGPIYGKQWRDFGGVDQIIEVIDRIKKLPNDRRQIVSAWNPAELKYMALPPCHMFY
QFNVRNGYLDLQWYQRSVDVFLGLPFNIASYATLVHIVAKMCNLIPGDLIFSGGNTHIYMNHVEQCKEILRREPKELCEL
VISGLPYKFRYLSTKEQLKYVLKLRPKDFVLNNYVSHPPIKGKMAV
>P90463 2.1.1.45~~~~~~Thymidylate synthase~~~
MFPFVPLSLYVAKKLFRARGFRFCQKPGVLALAPEVDPCSIQHEVTGAETPHEELQYLRQLREILCRGSDRLDRTGIGTL
SLFGMQARYSLRDHFPLLTTKRVFWRGVVQELLWFLKGSTDSRELSRTGVKIWDKNGSREFLAGRGLAHRREGDLGPVYG
FQWRHFGAAYVDADADYTGQGFDQLSYIVDLIKNNPHDRRIIMCAWNPADLSLMALPPCHLLCQFYVADGELSCQLYQRS
GDMGLGVPFNIASYSLLTYMLAHVTGLRPGEFIHTLGDAHIYKTHIEPLRLQLTRTPRPFPRLEILRSVSSMEEFTPDDF
RLVDYCPHPTIRMEMAV
>Q4JQW2 2.1.1.45~~~~~~Thymidylate synthase~~~
MGDLSCWTKVPGFTLTGELQYLKQVDDILRYGVRKRDRTGIGTLSLFGMQARYNLRNEFPLLTTKRVFWRAVVEELLWFI
RGSTDSKELAAKDIHIWDIYGSSKFLNRNGFHKRHTGDLGPIYGFQWRHFGAEYKDCQSNYLQQGIDQLQTVIDTIKTNP
ESRRMIISSWNPKDIPLMVLPPCHTLCQFYVANGELSCQVYQRSGDMGLGVPFNIAGYALLTYIVAHVTGLKTGDLIHTM
GDAHIYLNHIDALKVQLARSPKPFPCLKIIRNVTDINDFKWDDFQLDGYNPHPPLKMEMAL
>F5HET4 ~~~~~~Protein UL131A~~~
MRLCRVWLSVCLCAVVLGQCQRETAEKNDYYRVPHYWDACSRALPDQTRYKYVEQLVDLTLNYHYDASHGLDNFDVLKRI
NVTEVSLLISDFRRQNRRGGTNKRTTFNAAGSLAPHARSLEFSVRLFAN
>Q9T1U9 ~~~U~~~Tail fiber assembly protein U~~~
MMHLKNIKSENPKTKEQYQLTKNFDVIWLWSEDGKNWYEEVNNFQDDTIKIVYDENNIIVAITKDASTLNPEGFSVVEIP
DITANRRADDSGKWMFKDGAVVKRIYTADEQQQQAESQKAALLSEAESVIQPLERAVRLNMATDEERARLESWERYSVLV
SRVDTANPEWPQKPE
>Q71TD6 ~~~U~~~Tail fiber assembly protein U~~~
MQHLKNIRSGNPKTKEQYQLTKNFDVIWLWSEDGKNWYEEVKNFQPDTIKIVYDENNIIVAITKDASTLNPEGFSVVEVP
DITANRRADDSGKWMFKDGAVVKRIYTADEQQQQAESQKAALLSEAESVIQPLERAVRLNMATDEERTRLEAWERYSVLV
SRVDTANPEWPQKPE
>P60506 ~~~~~~U21 glycoprotein~~~
MWTILLFCVPVIYGELYPDFCPLAVVDFDVNATVDDLLLFDISLSKQCSDDKIRHSAVAAMTDNAFFFGNSETQIETDFG
KYLAFNCYQVFSTLNHFLFKNFKKTKGLMKRYDKLCLDVESYIHIQIICSPFKSFIRLRRMNETGISPRILETTFYLQNK
RNSTWVAIKNYLGEDDPFTYRIWHTLTHAKNFLINSCENDFNQLFFWQRKYLSLAKTFEATFKQGFNPMIEQRNEQRYRT
NNIDCSFSKFRQNGVKVAVCKYTGWGVSGFGSLEVLQKIKSPFGEEWKRVGFNSTGAFTPLYGSDVLWGLIFLRVEMTTY
VCTCTNKNTGTQIQVTLPDVDLDLLDSEKTSSNVFVDMLCYTLIAILFLAFVTAVVLLGVSCLDGVQKVLTWPLQHIQKE
PVSEKIINLTNLMFGQEPLPKKESLKQQCL
>Q69559 ~~~~~~U24 protein~~~
MDPPRTPPPSYSEVLMMDVMCGQVSPHVINDTSFVECIPPPQSRPAWNLWNNRRKTFSFLVLTGLAIAMILFIVFVLYVF
HVNRQRR
>Q9QJ42 ~~~~~~U24 protein~~~
MDRPRTPPPSYSEVLMMDVMYGQVSPHASNDTSFVECLPPPQSSRSAWNLWNKRRKTFAFLVLTGLAIAMILFIAFVIYV
FNVNRRKK
>Q69505 ~~~~~~U24 protein~~~
MTHETPPPSYNDVMLQMFHDHSVFLHQENLSPRTINSTSSSEIKNVRRRGTFIILACLIISVILCLILILHIFNVRYGGT
KP
>P0DJY5 ~~~U'~~~Tail fiber assembly protein U'~~~
MMHLKNITAGNPKTKEQYQLTKQFNIKWLYTEDGKNWYEEQKNFQPDTLKMVYDHNGVIICIEKDVSAINPEGANVVEVP
DITANRRADISGKWMFKDGVVIKRTYTEEEQRQQAENEKQSLLQLVRDKTQLWDSQLRLGIISDENKQKLTEWMLYAQKV
ESTDTSSLPVTFPEQPE
>Q71TD7 ~~~U'~~~Tail fiber assembly protein U'~~~
MMHLRNITAGNPKTKEQYQLTKQFNIKWLYTEDGKNWYEEQKNFQYDTLKMAYDHNGVIICIEKDVSAINPEGASVVELP
DITANRRADISGKWMFKDGVVVKRTYTEEEQRQQAENEKQSLLQLVRDKTQLWDSQLRLGIISAENKQKLTEWMLFAQKV
ESTDTSSLPVTFPEQPE
>Q77PU6 ~~~~~~Protein U90~~~
MESAKDTTSTSMFILGKPSGNNMESNEERMQNYHPDPVVEESIKEILEESLKCDVSFESLLFPELEAFDLFIPESSNDIA
SKNVSYSSNVEEGASDEFKTLVAQSVGNCIQSIGASVKAAMKQEQSNMEDNLINSAGLLTLHRSMLERLVLEQLGQLINI
NLLSSASSQFVSCYAKMLSGKNLDFFNWCEPRFIVFACDKFDGLVKKIASESRDLLMDLKANMNNQFITALKNIFSKAYV
ALDSGKLNMVATSLLLMAHNKEMSNPEISNKEFCKQVNLLKQELLESRNEIIENHVKNMKMFQEFANKQMNQIFMDNCDK
TFLKIHINCKNLITAAKNIGIAVLQSIVLCSNEFSWQYLKPRRHQFKITMMNMITHACECIETIYDDTGLIKPLTSSDIM
EGYIAINKNRESSICDLNIDPSESILLELADFDEHGKYSEESSIESIHEDDDNVDYLKYMEVQSPTDNNIPTPSKNNESP
TRQKLTNIHEKDVGKMYPDTPSPDVPGKSKEAKTFIEYSRQIGKEQTSPNCVCTASVTDLGGPDNFKSITGLESGKHFLI
KKLLETQPDSVVVETGSGQQDILAYSPDKRSQTKEWIQEKGSNSKCTETLPGMTFTNSATPVKSHGAIQDTLNPESKLDK
EMEAVESLVNLCDGFHDNPLISEMITFGYETDHSAPYESESDNNDETDYIADCDSTVRTNNIHMNNTNENTPFSKSLYSP
PEVTPSKEDHKTEKIVAVSQKCKSKKRTAKRKNVPIKPSKSKKIKLDRLPETTNVIVISSESEDEEDGNNIIDKSMLEKT
IKSEPNSESSSESDDCTSEDNYLHLSDYDKVINNGHCQSKGFPSPVFTIPIRSMPGTHDIRNKFVPKKHWLWFMRKTHKV
DNCVIHSSAKMNVKNDSDVTEANHCFINHFVPIKTDDEEYEKENVSYTYSKIQDSKTDLEDITPTKKLITEMVMENFMDL
TDIIKHGIAKHCQDLSSKYTVITHTACEKNLNVANSQNLVTAETQIFDPQGTGNNSPILNIINDTTCQNDENRCTEGTSN
DNEKCTIRSDCNSDKMEVFKLDGYPSDYDPFEENAQIY
>Q77J49 2.3.2.27~~~~~~RING finger containing E3 ubiquitin-protein ligase WSV222~~~
MFTHLTRAFRKMNNLVNRSFIDVHRVVAELSYPEFEEDVKNPESSIYRTPISLFQNKDIVTIVGDYILSPKTDSFQVLYP
IKKVIEHFPVIFHCTHNNAPLWVHLLDERHHRLLQSLLTYEIVNAKYRGIVVIPYYRRPINYQTGKSLLMSKLASVKVLD
ILMRCGSYKFISLMCMINKKNNTNFLHCCASKWGEVGSKMMLHIAEMFFANPTTSQHLSDASSFPDAAAEDDKGKTPAHL
AIQEDNADALLFLISLYGAPWFQDNNSYMKSALELKSNKCVKVLSFAADKYEILPNINNNQLEPDTMCGVCATSVEEDEN
EGKTTSLSWYQMNCKHYIHCECLMGMCAAAGNVQCPMCREDVGDEVLERCPPTIFRWLKLAERSEHNRVLFEAKKQEFYK
QMEAMKPPRVVVPPRRTFLTPARRGERAIRIAREIATNAIAEATAQGDVNSYFPVLIDGSGEEYEEEGEEFFNSEEEALA
FGRPFLEDEEEARQIQMRQFAELSRRGVSVNIINNDNPHRHISTVNIVQPVYGVEKSPAASFIYNMLKNDVFESIRSRDT
RVGGERVPVMNLSNDKRALFHAASSMLCDFATETNSQIVGLDFQAVYDPHHISNYIETFGSPLHAYPGAVTFLDGAQDYY
AESIRYDNDIVSFSEMASELHITEALDVFEGSLLSPLFKKIRTGKSYSNWNDHLRRRNYARDIAEEFVRVCENSLASREH
PPVHVHPFRDGAIPILIEYIVDFIHHCITWSMQVNALHCMRKYIEHENTNVHLLNLRPTDERVEVLRVSQLRWSRLFNEQ
YNTRMSLSTKRLSLMKIFNHDLGVSKFGVYKLLDIIEMYCFTLI
>Q8VAK2 2.3.2.27~~~~~~RING finger containing E3 ubiquitin-protein ligase WSV403~~~
MVASTPCPGPGPVPTQELLSTNFLEAHKLVVELLLPSYSSDVVYCDSETYTKPIPIFGNKSIVSTIGDYVLSNPNEDVSY
QMVSSVLEKFPLLFHCTYKTNEEDKGIPLWKKLYNKRKFKLLNSLLVHNNKNWTPVPAIPFDRENICDASGRSVLMSEIM
STSTFQTICKNNTHYLFDMLNMERGKQGGSFLHFFASRKNSFTNFENEEMDSHVLSNIAKFICNEKEKLDSFIPANGKIP
CPDKTNDEGYIPLEIAIMEDNYPALLYLVCRYGASWANTYGDHNESLKAFAIRNDAKDCLEIIEFISDHYSFNKNVTKEE
FVKEKTVECVGCLYDIEDEKRCYKLPCGHFMHTFCLSNKCSKANFRCVKCFQTFDDTIFRKCPPTIQWKMGINQTTNHKE
MDLFNRAFDTYLDFICSYNVKLDKKSKPKHKPENKKVEEELAKRTAEIEEAMKKKEEELTKRTAEIEEAIKKKEEELAKR
TAEIEEAMKKKEEEELSKYNKIIEKGKRRLNEECVKLRDISTAAINMYKEKVRINGVLLKDSDQELAEAKERLRKILLLE
EETKLDRFLFRPKRVEERIFLTKDDETLAFKLALEKKTEDIIAKKNNQKGSERRDGEYTITSHIEKLPQSTALASVCVLN
E
>P27949 2.3.2.23~~~UBC~~~Ubiquitin-conjugating enzyme E2~~~
MVSRFLIAEYRHLIENPSENFKISVNENNITEWDVILRGPPDTLYEGGLFKAKVAFPPEYPYAPPKLTFTSEMWHPNIYP
DGRLCISILHGDNAEEQGMTWSPAQKIDTILLSVISLLNEPNPDSPANVDAAKSYRKYVYKEDLESYPMEVKKTVKKSLD
ECSPEDIEYFKNAASNVPPIPSDAYEDECEEMEDDTYILTYDDDEEEEDEEMDDE
>P25869 2.3.2.23~~~UBC~~~Ubiquitin-conjugating enzyme E2~~~
MVSSFLLAEYKNLIVNPSEHFKISVNEDNLTEWDVILKGPPDTLYEGGLFKAKIVFPPKYPYEPPRLTFTSEMWHPNIYS
DGKLCISILHGDNAEEQGMTWSPAQKIDTVLLSVISLLNEPNPDSPANVDAAKSYRKYLYKEDLESYPMEVKKTVKKSLD
ECSAEDIEYFKNVPVNVLPVPSDDYEDEEMEDGTYILTYDDEDEEEDEEMDDE
>P16709 ~~~vUbi~~~viral Ubiquitin~~~
MQIFIKTLTGKTITAETEPAETVADLKQKIADKEGVPVDQQRLIFAGKQLEDSKTMADYNIQKESTLHMVLRLRGGY
>Q5UPW7 3.4.19.12~~~~~~Putative ubiquitin carboxyl-terminal hydrolase L293~~~
MCVEMKNKTNTTIPSQPQIPIQPSILPTQQTISPIPQIIPPTQPTLPKLQLDPILGSNGIPIANQEWSGTEYVRTNDPTK
FIVITKTLDPKIARFLPTFKKPVITDIINKTSNSTLDMNKIITTNNFNVENAKALANFGNSCYFGTSMQLLFVMFHVRNF
IVKNSNFTETGISIDLKNAYDSIKNLFITMNTAPKNDPIKSFPDYPNVKKQIMKEPDPVQMLEEDAEEFITQFMSDLDPK
ARNLALIKGNEYIYDVSNAQLNRQQSFSNIFNLIPLDIIKSNPTDTLENILSKTYTMVELREGVNTIQNPITSDYEFTYF
VNQPTILPEYFLVRLNMVDPTSTNKLRHNIQINTTLVLTINGQTTTYFALAIIVHRGNSIRTGHYTCLVFDNQTGSQFQY
IFYDDSLSSLVSIPTNSKIIPSNLYLKNITDSAYIILYGDITKLR
>B2BW43 2.7.1.48~~~~~~Uridine kinase P10~~~
MKPYIIGISGISGSGKSTFAANLKTKCGDFYAGDVVIINLDGFYRSINEDDMCLVKAGEYNFDHPYAIDLDHARRCISAI
ADGHLVAVPIYDFEKKKCVGHYEVNNPRVVIVEGIHALHPKLFPLYDITAFLEVPMSVALSRRAVRDNKERGRTPEDTAE
MFRKFVLPMYKLHVEPNVKKAAILVNGLNTGVHINMLYEHIKPLLIYPRF
>Q90158 2.4.1.-~~~egt~~~Ecdysteroid UDP-glucosyltransferase~~~
MIFILLTTLLAVGGAQTANILAVLPTPAYSHHLVYQAYVQALADKCHNVTVVKPQLLDYAAANKQRCGRIEQIDADMSSQ
QYKKLVASSGAFRKRGVVSDETTVTAENYMGLVEMFRDQFDNAHVRSFLATNRTFDVVVVEAFADYALVFGHLFRPAPVI
QIAPGYGLAENFDAVGAVGRHPVHYPNIWRSSSIGNADGALIEWRLYNEFELLARRSDALLKLQFGPNTPTIRQLRNNVQ
LLLLNLHPVYDNNRPVPPSVQYLGGGLHLTLEPPQRLDIELEKRLNASVNGTVYVSFGSSIDTNSIHAEFLQMLLDTFAK
LDNRTVLWKVDDAVAKSVVLPRNVIAQKWFNQRAVLNHRNVVAFVTQGGLQSSDEALHARVPMVCLPMMGDQFHHSAKLE
QFGVARALNTVTVSAAQLALAVGDVIADRLAYQLRMTNLLNVVAFDEATPADKAIKFTERVIRFGHDITRSECSLKSPSA
NTDYSDYFVRFPL
>Q90157 2.4.1.-~~~EGT~~~Ecdysteroid UDP-glucosyltransferase~~~
MASLLIALTLLAADAQTANILAVLPTPAYSHHAVYKAYVHALAKNCHNVTAVKPRLLDYALLNECGRIEQIDADMSLEQY
QKLMAGSGAFRKRGVVADETTVTADNYMSLIEMFKDQFDNANVRHFLASNRTFDAVVVEASADYELVFGHLFRPATVIQI
APGYGLAENFDAAGAVARHPVHYPNIWRSSFSGEAAGALSEWRLLNEFELLASQRSNELLKQQFGLDTPTIRQLRDNVQL
LLLNLHPVYDNNRPVPPSVQYLGGGLHLSQAPSHKLTAALERRLNESVDGAIYVSFGSSIDTNSIHAEFIQMLLESFVQL
NNYTVLWKVDDTVPASVKLPSNVVTQKWFDQRAVLHHKKVVAFVMQAGLQSSDEALESRVPMVCLPMMGDQFHHARKLQQ
FGVARTLDTAVVSAAQLTLAIGEVIADAEAYRARIDDLRAVLEHDAAPAEKAVKFTERVIIFKHDMTRPARTLKTTSANI
AYSDYFLRFPL
>P16776 ~~~UL5~~~Protein UL5~~~
MFLGYSDCVDPGLAVYRVSRSRLKLVLSFVWLVGLRLHDCAAFESCCYDITEAESNKAISRDKAAFTSSVSTRTPSLAIA
PPPDRSMLLSREEELVPWSRLIITKQFYGGLIFHTTWVTGFVLLGLLTLFASLFRVPQSICRFCIDRLRDIARPLKYRYQ
RLVATV
>P16743 ~~~UL7~~~CEACAM1-like protein UL7~~~
MASDVSSHLLTVTQSRWTIHHMYNKLLILALFTPVILESIIYVSGPQGGNVTLVSNFTSNISARWFRWDGNDSHLICFYK
RGEGLSTPYVGLSLSCAANQITIFNLTLNDSGRYGAEGFTRSGENETFLWYNLTVKPKPLETTTASNVTTIVTTTPTVIG
TKSNVTGNASLAPQLRAVAGFLNQTPRENNTHLALVGVIVFIALIVVCIMGWWKLLCSKPKL
>P16744 ~~~UL8~~~Membrane protein UL8~~~
MASDVSSHLLTVTQSRWTIHHMYNKLLILALFTPVILESIIYVSGPQGGNVTLVSNFTSNISARWFRWDGNDSHLICFYK
RGEGLSTPYVGLSLSCAANQITIFNLTLNDSGRYGAEGFTRSGENETFLWYNLTVKPKPLETTTASNVTTIVTTTPTVIG
TKSNVTGNASLAPQLRAVAGFLNQTPRENNTHLALGEGFVPTMTNPGLYASENYNGNYELTEAANTARTNSSDWVTLGTS
ASLLRSTETAVNPSNATTVTPQPVEYPAGEVQYQRTKTHYSWMLIIAIILIIFIIICLRAPQKVYDRWKDNKQYGQVFMT
DTEL
>P16833 ~~~~~~Uncharacterized protein UL116~~~
MKRRRRWRGWLLFPALCFCLLCEAVETNATTVTSTTAAAATTNTTVATTGTTTTSPNVTSTTSNTVTTPTTVSSVSNLTS
STTSIPISTSTVSGTRNTGNNNTTTIGTNATSPSPSVSILTTVTPAATSTISVDGVVTASDYTPTFDDLENITTTRAPTR
PPAQDLCSHNLSIILYEEESQSSVDIAVDEEEPELEDDDEYDELWFPLYFEAECNRNYTLHVNHSCDYSVRQSSVSFPPW
RDIDSVTFVPRNLSNCSAHGLAVIVAGNQTWYVNPFSLAHLLDAIYNVLGIEDLSANFRRQLAPYRHTLIVPQT
>P16739 ~~~~~~Viral Fc-gamma receptor-like protein UL119~~~
MCSVLAIALVVALLGDMHPGVKSSTTSAVTSPSNTTVTSTTSISTSNNVTSAVTTTVQTSTSSASTSVIATTQKEGHLYT
VNCEASYSHDQVSLNATCKVILLNNTKNPDILSVTCYARTDCKGPFTQVGYLSAFPPDNEGKLHLSYNATAQELLISGLR
PQETTEYTCSFFSWGRHHNATWDLFTYPIYAVYGTRLNATTMRVRVLLQEHEHCLLNGSSLYHPNSTVHLHQGNQLIPPW
NISNVTYNGQRLREFVFYLNGTYTVVRLHVQIAGRSFTTTYVFIKSDPLFEDRLLAYGVLAFLVFMVIILLYVTYMLARR
RDWSYKRLEEPVEEKKHPVPYFKQW
>P16721 ~~~~~~Protein UL11~~~
MLLRYITFHREKVLYLAIACFFGIYISFHDACILVPAKVGTNVTLNAVHVHDGDYVYWSFGGGGANRLMCRYTPRLDEIH
KNTNRSFSCLTNHSLLLINVTEEYTDYYRTMTTFVHQSHNWHNHGNKWTLDTCYYVYVTQNGTLPTTTTKKPTTTTRTTT
TTTTKKTTTTSTTTTTTTTKKTTTSTTHHRHSNPKESTTPKTHVELHVGLGATAAETPLQPSPQYQHVATHALWVLAVVI
VIIIIIIFYFRIPQKLWLLWQHDKHGIVLIPQTDL
>Q6SWB9 ~~~~~~Protein UL11~~~
MLFRYITFHREKVLYLTAACIFGVYISLHDACIPVVGKIGTNVTLNAVDVLPPRDQVRWSYGPGGQGYMLCIFTGTSTTT
FNNTRFNFSCLSNYSLLLINVTTQYSTTYRTMTSLDHWLHQRHNHGSRWTLDTCYNLTVNENGTFPTTTTKKPTTTTRTT
TTTTQRTTTTRTTTTAKKTTISTTHHKHPSPKKSTTPNSHVEHHVGFEATAAETPLQPSPQHQHLATHALWVLAVVIVII
IIIIFYFRIPQKLWLLWQHDKHGIVLIPQTDL
>P16837 ~~~~~~Uncharacterized protein UL128~~~
MSPKDLTPFLTTLWLLLGHSRVPRVRAEECCEFINVNHPPERCYDFKMCNRFTVALRCPDGEVCYSPEKTAEIRGIVTTM
THSLTRQVVHNKLTSCNYNPLYLEADGRIRCGKVNDKAQYLLGAAGSVPYRWINLEYDKITRIVGLDQYLESVKKHKRLD
VCRAKMGYMLQ
>V9LLX6 ~~~~~~Envelope protein UL128~~~
MSPKNLTPFLTALWLLLDHSRVPRVRAEECCEFINVNHPPERCYDFKMCNRFTVALRCPDGEVCYSPEKTAEIRGIVTTM
THSLTRQVVHNKLTSCNYNPLYLEADGRIRCGKVNDKAQYLLGAAGSVPYRWINLEYDKITRIVGLNQYLESVKKHKRLD
VCRAKMGYMLQ
>P16772 ~~~~~~Uncharacterized protein UL130~~~
MLRLLLRHHFHCLLLCAVWATPCLASPWSTLTANQNPSPPWSKLTYSKPHDAATFYCPFLYPSPPRSPLQFSGFQQVSTG
PECRNETLYLLYNREGQTLVERSSTWVKKVIWYLSGRNQTILQRMPQTASKPSDGNVQISVEDAKIFGAHMVPKQTKLLR
FVVNDGTRYQMCVMKLESWAHVFRDYSVSFQVRLTFTEANNQTYTFCTHPNLIV
>F5HCP3 ~~~~~~Envelope glycoprotein UL130~~~
MLRLLLRHHFHCLLLCAVWATPCLASPWSTLTANQNPSPPWSKLTYSKPHDAATFYCPFLYPSPPRSPLQFSGFQRVSTG
PECRNETLYLLYNREGQTLVERSSTWVKKVIWYLSGRNQTILQRMPRTASKPSDGNVQISVEDAKIFGAHMVPKQTKLLR
FVVNDGTRYQMCVMKLESWAHVFRDYSVSFQVRLTFTEANNQTYTFCTHPNLIV
>F5HAQ7 ~~~~~~Protein UL135~~~
MVWLWLGVGLLGGTGLASLVLAISLFTQRRGRKRSDETSSRGRLPGAASDKRGACACCYRIPKEDVVEPLDLELGLMRVA
THPPTPQVPRCTSLYIGEDGLPIDKPEFPPARFEIPDVSTPGTPTSIGRSPSHCSSSSSLSSSASVDTVLHQPPPSWKPP
PPPGRKKRPPTPPVRAPTTRLSSHRPPTPIPAPRKNLSTPPTKKTPPPTKPKPVGWTPPVTPRPFPKTPTPQKPPRNPRL
PRTVGLENLSKVGLSCPCPRPRTPTEPTTLPIVSVSELAPPPRWSDIEELLEKAVQSVMKDAESMQMT
>F5HF35 ~~~~~~Protein UL136~~~
MSVKGVEMPEMTWDLDVGNKWRRRKVLSRIHRFWECRLRVWWLSDAGVRETDPPRPRRRPTWMTAVFHVICAVLLTLMIM
AIGALIAYLRYYHQDSWRDMLHDLFCGCHYPEKCRRHHERQRSRRRAMDVPDPELGDPARRPLNGAMYYGSGCRFDTVEM
VDETRPAPPALSSPETGDDSNDDAVAGGGAGGVTSSATRTTSSNALLPEWMDAVHVAVQAAVQATVQVSGPRENAVSPAT
>F5HGQ8 ~~~~~~Protein UL138~~~
MDDLPLNVGLPIIGVMLVLIVAILCYLAYHWHDTFKLVRMFLSYRWLIRCCELYGEYERRFADLSSLGLGAVRRESDRRY
RFSERPDEILVRWEEVSSQCSYASSRITDRRAGSSSSSSVHVANQRNSVPPPDMAVTAPLTDVDLLKPVTGSATQFTTVA
MVHYHQEYT
>P16755 ~~~~~~Uncharacterized protein UL13~~~
MLWAHCGRFLRYHLLPLLLCRLPFLLLFQRPQWAHGLDIVEEDEWLREIQGATYQLSIVRQAMQHAGFQVRAASVMTRRN
AVDLDRPPLWSGSLPHLPVYDVRSPRPLRPPSSQHHAVSPELPSRDGIRWQYQELQYLVEEQRRRNQSRNAIPRPSFPPP
DPPSQPAEDARDADAERTESPHSAESTVRHDASENAVRRRHERRRYNALTVRSRDSLLLTRIRFSNQRCFGRGRLRHPAG
SGPNTGGPRPGGAGLRQLRQQLTVRWQLFRLRCHGWTQQVSSQIRTRWEESNVVSQTATRVRTWFVERTTFWRRTWVPRQ
NPAAEAQELAVIPPAPTVLRQNEEPRQQLTGEETRNSTHTQREEVEDVSREGAREGNDGSRASGNDERRNNAGRYDDDHE
VQEPQVTYPAGQGELNRRSQEENEEGGPCESPPMTTNTLTVACPPREPPHRALFRLCLGLWVSSYLVRRPMTI
>P04290 2.7.11.1~~~~~~Serine/threonine-protein kinase UL13~~~
MDESRRQRPAGHVAANLSPQGARQRSFKDWLASYVHSNPHGASGRPSGPSLQDAAVSRSSHGSRHRSGLRERLRAGLSRW
RMSRSSHRRASPETPGTAAKLNRPPLRRSQAALTAPPSSPSHILTLTRIRKLCSPVFAINPALHYTTLEIPGARSFGGSG
GYGDVQLIREHKLAVKTIKEKEWFAVELIATLLVGECVLRAGRTHNIRGFIAPLGFSLQQRQIVFPAYDMDLGKYIGQLA
SLRTTNPSVSTALHQCFTELARAVVFLNTTCGISHLDIKCANILVMLRSDAVSLRRAVLADFSLVTLNSNSTIARGQFCL
QEPDLKSPRMFGMPTALTTANFHTLVGHGYNQPPELLVKYLNNERAEFTNHRLKHDVGLAVDLYALGQTLLELVVSVYVA
PSLGVPVTRFPGYQYFNNQLSPDFALALLAYRCVLHPALFVNSAETNTHGLAYDVPEGIRRHLRNPKIRRAFTDRCINYQ
HTHKAILSSVALPPELKPLLVLVSRLCHTNPCARHALS
>P89436 2.7.11.1~~~~~~Serine/threonine-protein kinase UL13~~~
MDESGRQRPASHVAADISPQGAHRRSFKAWLASYIHSLSRRASGRPSGPSPRDGAVSGARPGSRRRSSFRERLRAGLSRW
RVSRSSRRRSSPEAPGPAAKLRRPPLRRSETAMTSPPSPPSHILSLARIHKLCIPVFAVNPALRYTTLEIPGARSFGGSG
GYGEVQLIREHKLAVKTIREKEWFAVELVATLLVGECALRGGRTHDIRGFITPLGFSLQQRQIVFPAYDMDLGKYIGQLA
SLRATTPSVATALHHCFTDLARAVVFLNTRCGISHLDIKCANVLVMLRSDAVSLRPAVLADFSLVTLNSNSTISRGQFCL
QEPDLESPRGFGMPAALTTANFHTLVGHGYNQPPELSVKYLNNERAEFNNRPLKHDVGLAVDLYALGQTLLELLVSVYVA
PSLGVPVTRVPGYQYFNNQLSPDFAVALLAYRCVLHPALFVNSAETNTHGLAYDVPEGIRRHLRNPKIRRAFTEQCINYQ
RTHKAVLSSVSLPPELRPLLVLVSRLCHANPAARHSLS
>P09296 2.7.11.1~~~~~~Serine/threonine-protein kinase UL13 homolog~~~
MDADDTPPNLQISPTAGPLRSHHNTDGHEPNATAADQQERESTNPTHGCVNHPWANPSTATCMESPERSQQTSLFLLKHG
LTRDPIHQRERVDVFPQFNKPPWVFRISKLSRLIVPIFTLNEQLCFSKLQIRDRPRFAGRGTYGRVHIYPSSKIAVKTMD
SRVFNRELINAILASEGSIRAGERLGISSIVCLLGFSLQTKQLLFPAYDMDMDEYIVRLSRRLTIPDHIDRKIAHVFLDL
AQALTFLNRTCGLTHLDVKCGNIFLNVDNFASLEITTAVIGDYSLVTLNTYSLCTRAIFEVGNPSHPEHVLRVPRDASQM
SFRLVLSHGTNQPPEILLDYINGTGLTKYTGTLPQRVGLAIDLYALGQALLEVILLGRLPGQLPISVHRTPHYHYYGHKL
SPDLALDTLAYRCVLAPYILPSDIPGDLNYNPFIHAGELNTRISRNSLRRIFQCHAVRYGVTHSKLFEGIRIPASLYPAT
VVTSLLCHDNSEIRSDHPLLWHDRDWIGST
>Q6RJQ3 ~~~~~~Protein UL141~~~
MCRRESLRTLPWLFWVLLSCPRLLEYSSSSFPFATADIAEKMWAENYETTSPAPVLVAEGEQVTIPCTVMTHSWPMVSIR
ARFCRSHDGSDELILDAVKGHRLMNGLQYRLPYATWNFSQLHLGQIFSLTFNVSTDTAGMYECVLRNYSHGLIMQRFVIL
TQLETLSRPDEPCCTPALGRYSLGDQIWSPTPWRLRNHDCGMYRGFQRNYFYIGRADAEDCWKPACPDEEPDRCWTVIQR
YRLPGDCYRSQPHPPKFLPVTPAPPADIDTGMSPWATRGIAAFLGFWSIFTVCFLCYLCYLQCCGRWCPTPGRGRRGGEG
YRRLPTYDSYPGVKKMKR
>F5HHH2 ~~~~~~Membrane glycoprotein UL142~~~
MRIEWACWLFGYFVSSVGSERSLSYRYHLESNSSANVVCNGNISVFVNGTLGVRYNITVGISSSLLIGHLTIQTLESWFT
PWVQNKSYSKQPLSTTETLYNIDSENIHRVSQYFHTRWIKSLQENHTCDLTNSTPTYTYQANVNNTNYLTLTSSGWQDRL
NYTAINSTHFNLTESNITSIHKYLNTTCIERLRNYTLEPVYTTAVPQNVTPEHAITTLYTTPPNAITIKDTTQSHTVQTP
SFNDTHNVTEHTLNISYVLSQKTNNTTSPWVYAIPMGATATIGAGLYIGKHFTPVKFVYEVWRGQ
>Q68396 ~~~~~~Membrane glycoprotein UL144~~~
MKPLIMLICFAVILLQLGVTKVCQHNEVQLGNECCPPCGSGQRVTKVCTDYTSVTCTPCPNGTYVSGLYNCTDCTQCNVT
QVMIRNCTSTNNTVCAPKNHTYFSTPGVQHHKQRQQNHTAHITVKQGKSGRHTLAWLSLFIFLVGIILLILYLIAAYRSE
RCQQCCSIGKIFYRTL
>F5HF44 ~~~~~~Protein UL145~~~
MYGVLAHYYSFISSPSVMVNFKHHNAVQLLCARTRDGTAGWERLTHHASYHANYGAYAVLMATSQRKSLVLHRYSAVTAV
ALQLMPVEMLRRLDQSDWVRGAWIVSETFPTSDPKGFWSDDDSPMGGSED
>P16757 ~~~~~~Protein UL16~~~
MERRRGTVPLGWVFFVLCLSASSSCAVDLGSKSSNSTCRLNVTELASIHPGETWTLHGMCISICYYENVTEDEIIGVAFT
WQHNESVVDLWLYQNDTVIRNFSDITTNILQDGLKMRTVPVTKLYTSRMVTNLTVGRYDCLRCENGTTKIIERLYVRLGS
LYPRPPGSGLAKHPSVSADEELSATLARDIVLVSAITLFFFLLALRIPQRLCQRLRIRLPHRYQRLRTED
>P08560 ~~~~~~Glycoprotein UL18~~~
MMTMWCLTLFVLWMLRVVGMHVLRYGYTGIFDDTSHMTLTVVGIFDGQHFFTYHVNSSDKASSRANGTISWMANVSAAYP
TYLDGERAKGDLIFNQTEQNLLELEIALGYRSQSVLTWTHECNTTENGSFVAGYEGFGWDGETLMELKDNLTLWTGPNYE
ISWLKQNKTYIDGKIKNISEGDTTIQRNYLKGNCTQWSVIYSGFQPPVTHPVVKGGVRNQNDNRAEAFCTSYGFFPGEIN
ITFIHYGDKVPEDSEPQCNPLLPTLDGTFHQGCYVAIFCNQNYTCRVTHGNWTVEIPISVTSPDDSSSGEVPDHPTANKR
YNTMTISSVLLALLLCALLFAFLHYFTTLKQYLRNLAFAWRYRKVRSS
>P28971 ~~~~~~Protein UL20 homolog~~~
MPQVLMGNTRLHAPLEDGIPLIENDENSSQNEVDLYDYVSMSSYGGDNDFLISSAGGNITPENRPSFSAHVVLFAISALV
IKPVCCFIFLNHYVITGSYDFAVAGGVCTVLYYMRLALTAWFMFRNIQSDMLPLNVWQQFVIGCMALGRTVAFMVVSYTT
LFIRSELFFSMLAPNAGREYITPIIAHKLMPLISVRSAVCLVIISTAVYAADAICDTIGFTLPRMWMCILMRSSSVKRS
>P10204 ~~~~~~Protein UL20~~~
MTMRDDLPLVDRDLVDEAAFGGEEGELPLEEQFSLSSYGTSDFFVSSAYSRLPPHTQPVFSKRVILFLWSFLVLKPLEMV
AAGMYYGLTGRVVAPACILAAIVGYYVTWAVRALLLYVNIKRDRLPLSAPVFWGMSVFLGGTALCALFAAAHETFSPDGL
FHFIATNQMLPPTDPLRTRALGIACAAGASMWVAAADSFAASANFFLARFWTRAILNAPVAF
>Q00702 ~~~~~~Protein UL20~~~
MEDAAADVDAAADAKLTGENDALLSSAFVGARPPRPRFSSHVVSLLALALALRPACCLVLALHGSRATLAALLTALAFYA
RAAVCAVLVARNVARDRMPLSPAQQAALGLLAAARLAFLYVALDAGRHYAPALAGALYGADCVCDALAFLLPRAYARSIM
H
>P09290 ~~~~~~Protein UL20 homolog~~~
MNPPQARVSEQTKDLLSVMVNQHPEEDAKVCKSSDNSPLYNTMVMLSYGGDTDLLLSSACTRTSTVNRSAFTQHSVFYII
STVLIQPICCIFFFFYYKATRCMLLFTAGLLLTILHHFRLIIMLLCVYRNIRSDLLPLSTSQQLLLGIIVVTRTMLFCIT
AYYTLFIDTRVFFLITGHLQSEVIFPDSVSKILPVSWGPSPAVLLVMAAVIYAMDCLVDTVSFIGPRVWVRVMLKTSISF
>P16846 ~~~~~~Tegument protein UL23~~~
MSVIKDCFLNLLDRWRPPKTSRPWKPGQRVALVWPKDRCLVIRRRWRLVRDEGRDAQRLASYLCCPEPLRFVGSICTYNF
LKHKGDHNVPSELYLGASGAMYLWTDHIYSDSLTFVAESITEFLNIGLRRCNFITVPEELPHTASLRALAGCMHIHAFAQ
WRATYRGRLMVMGDYSVIRVSTIRLYDWSEINDWRVMVGSNHVEPLGWLVSPYDVINLFVDDCMRVFAANNQHVCIVADS
LMEFVTRGMTRCHENGIYYGTRSMRKLNKPTCPYGVDHQLFDDA
>P03232 ~~~BXRF1~~~Protein UL24 homolog~~~
MDPTRGLCALSTHDLAKFHSLPPARKAAGKRAHLRCYSKLLSLKSWEQLASFLSLPPGPTFTDFRLFFEVTLGRRIADCV
VVALQPYPRCYIVEFKTAMSNTANPQSVTRKAQRLEGTAQLCDCANFLRTSCPPVLGSQGLEVLAALVFKNQRSLRTLQV
EFPALGQKTLPTSTTGLLNLLSRWQDGALRARLDRPRPTAQGHRPRTHVGPKPSQLTARVPRSARAGRAGGRKGQVGAVG
QVCPGAQK
>P28927 ~~~~~~Protein UL24 homolog~~~
MKRRQRLTARSRLRAGIRCHNRFYNAMVQDLASAKRNGVYGERLAPLFSELVPAETLKTALGVSLAFEVNLGQRRPDCVC
TVQFGKGSDAKGVCILIELKTCRFSKNMNTASKNLQRKGGMRQLHDSCRLLARTLPPGSGEIVLAPVLVFVAQRGMRVLR
VTRLSPQVVYSNAAVLSCTISRLAEYAPPVSAKSTRRRCVAKGTKAKAFSTKAAAEPVPSITPAQPSAAAAVVSLFPAAV
PANTTNAAAVHQPVAVSHVNPLAWAASLFSPK
>Q6S6T4 ~~~~~~Protein UL24 homolog~~~
MKRRQRLTARSRLRAGIRCHNRFYNAMVQDLASAKRNGVYGERLAPLFSELVPAETLKTALGVSLAFEVNLGQRRPDCVC
TVQFGKGSDAKGVCILIELKTCRFSKNMNTASKNLQRKGGMRQLHDSCRLLARTLPPGSGEIVLAPVLVFVAQRGMRVLR
VTRLSPQVVYSNAAVLSCTISRLAEYAPPVSAKSTRRRCVAKGTKAKAFSTKAAAEPVPSITPAQPSAAAAVVSLFPAAV
PANTTNAAAVHQPVAVSHVNPLAWVASLFSPK
>P24432 ~~~~~~Protein UL24 homolog~~~
MKRKQRLTARSRLRAGIRCHNRFYNAMVQDLASAKKNGVYGARLAPLFSELVPAETLKTAMGVSLAFEVNLGQRRPDCVC
TVQFGHGSDAKGVCILIELKTCRFSKNMNTASKNLQRKGGMRQLHDSCRLLAKTLPPGSGEILLAPVLVFVAQRGMRVLR
VTRLSPQVVYSNAAVLSCTISRLAEYSPPISERSTRRRCVTRRTNSKAFRAKTTTGSIQPITQAKPAATAAVASLFSATA
QANTTNAAVGYQPATISLANPLAWVASLFAPK
>Q18LE7 ~~~~~~Protein UL24 homolog~~~
MPVNFVAARKKRDGVNTHIELQKAIYKSRSFTEINNILDGILPDAHRETPFATYFEANLGCRRPDCMIVFDDLPKQIITC
VLVEFKTTSRTAFDKRKKDAVQQYQLHQGEEQVRDAVKILSSITGRGCNLRVWGFLLFYQQSTLRVLHKTIPECAVTLTD
RWAFSALLKKSKNESFHAFLQKSCTASTTGPQKELFGIHKPENSEVETVGATKSTRKGAEKSRLSRRSRKSN
>P10208 ~~~~~~Protein UL24~~~
MAARTRSLVERRRVLMAGVRSHTRFYKALAEEVREFHATKICGTLLTLLSGSLQGRSVFEATRVTLICEVDLGPRRPDCI
CVFEFANDKTLGGVCVIIELKTCKYISSGDTASKREQRATGMKQLRHSLKLLQSLAPPGDKIVYLCPVLVFVAQRTLRVS
RVTRLVPQKVSGNITAVVRMLQSLSTYTVPIEPRTQRARRRRGGAARGSASRPKRSHSGARDPPESAARQLPPADQTPTS
TEGGGVLKRIAALFCVPVATKTKPRAASE
>P89447 ~~~~~~Protein UL24~~~
MARTGRRAAVGRPARTSSLTERRRVLLAGVRSHTRFYKAFAREVREFNATRICGTLLTLMSGSLQGRSLFEATRVTLICE
VDLGPRRPDCICVFEFANDKTLGGVCVILELKTCKSISSGDTASKREQRTTGMKQLRHSLKLLQSLAPPGDKVVYLCPIL
VFVAQRTLRVSRVTRLVPQKISGNITAAVRMLQSLSTYAVPPEPQTRRSRRRVAATARPQRPPSPTRDPEGTAGHPAPPE
SDPPSPGVVGVAAEGGGVLQKIAALFCVPVAAKSRPRTKTE
>P52535 ~~~~~~Protein UL24 homolog~~~
MSLTGLPDIRKKIGQFHHLRIYKQILSFQGNFARLNYFLGDVFPANLRSASVSVFFEVRLGPRIPDCIVLLKSVDVKDEF
AFHCYFFEFKTTLGKSTMQSVHHNCIHQAQYLQGLRQLQQSISFLDQYLIADEVSWNVVPVICFFRQWGLKLDFFKKFSG
KTKRLSFSFIRDLFARSQDGAVQSLLSIPNYTNFRRACQKHTDLYRKRCRKAPKSVLTKTSGENRSRASRQVAKNAPKNR
IRRTAKKDAKRQ
>Q06092 ~~~~~~Protein UL24 homolog~~~
MSLTGLPDIRKKIGQFHHLRIYKQILSLQGNFARLNYFLGDVFPANLRSASVSVFFEVRLGPRIPDCIVLLKSVDAKDEF
AFHCYFFEFKTTLGKSTMQSVHHNCIHQAQYLQGLRQLQQSISFLDQYLIADEVSWNVVPVICFFRQWGLKLDFFKKFSG
KTKRLSFSFIRDLFARSQDGAVQSLLSIPNYTNFRRACQKHTDLYRKRCRKAPKSVLTKTSGENRSRASRQVAKNAPKNR
IRRTAKKDAKRQ
>P52545 ~~~~~~Protein UL24 homolog~~~
MSLTGLPDIRKKIGQFHHLRIYKQILSLQGNFARLNYFLGDVFPANLRSASVSVFFEVRLGPRIPDCIVLLKSVDAKDEF
AFHCYFFEFKTTLGKSTMQSVHHNCIHQAQYLQGLRQLHQSISFLDQYLIADEVLWNVVPVICFFRQWGLKLDFFKKFSG
KTKRLSFSFICDLFARSQDGAVQSLLSIPNYTNFRRACQKHTDLYRRRYQKASKSVLTKTSGENRSRASRQVAKNAPKNR
IRRTAKKDAKRQ
>P52386 ~~~~~~Protein UL24 homolog~~~
MSLEYLPPVRRRIGQYNHLRIYKKILLLKSNFEKLNFFLGNLFPEELHDSKIHVYFEVRLGCRIPDCIIVFRHFGEKLLK
TFHCYFFEFKTTFAKSNLFSIQKNRTQKIQYLQGLRQLRQATDYLQQFVIKNESLCKVNPVICFFRQHGLKLDFVKTFIA
KELQLSSTFLCNLFTKYQNDTVKSILSISNPTNFRRACQKYSNLYRGRYATTPKLGNSKTSKRKRRNSKKQDFKKLVKN
>Q2HRB2 ~~~~~~Protein UL24 homolog~~~
MVRPTEAEVKKSLSRLPAARKRAGNRAHLATYRRLLKYSTLPDLWRFLSSRPQNPPLGHHRLFFEVTLGHRIADCVILVS
GGHQPVCYVVELKTCLSHQLIPTNTVRTSQRAQGLCQLSDSIHYIAHSAPPGTEAWTITPLLIFKNQKTLKTVYSESPGA
FPTPVHTTEGKLCAFLTARENADIRKVLSKVPKKPKMDRGGKILGPTPGKRAVYSQAHHGRNKKGRPWTAQPTRAKSRTK
DKGTPAFPRAGPACSGP
>P23986 ~~~~~~Protein UL24 homolog~~~
MARRRKEVQRGESRSHRSRTRSKTAHHRKFSRRQLRPSLRARLNAGIRCHNRFYRALVRSLEEVFEGGGDGRLAYTIIPQ
CKPAGGKIVVMFEVNLGLRKPDCICLLETQHEMKCIVIELKTCRFSKSLMTESKLRQGYTGTLQLRDSARLLENLAVPGT
EKVKILSLLVFVAQRGMNILAVKTVGETVINVSSELFFVTLATRSQYLKTFCAKLEPRVSHARSKYQQESAKNADLASPA
PSALQTVAALFSSRVGKESATKPASYSTSTEESKNLSEPCFDPDSNL
>Q01005 ~~~~~~Protein UL24 homolog~~~
MLSVIKQRDKEVLAHLPNKRKIAGNKAHLETYKKLAKYTVSASIFKFLSISHPCPLRAKTRLFFEVSLGNRIADCVMLTS
CGETRICYVIELKTCMTSNLDLISDIRKSQRSQGLCQLADTVNFIHNYAPLGRQAWTVLPILIFKSQKTLKTLHIETPKF
PVNLTHTSEEKLSCFLWSRADVEIRKKIHLAPKPKRIFKWDSLLDSTSTEHSAYRQKLIERNKKKCFTLQNQTSKFRDRT
NKKSNDQLRARQANARPCKKKQHNNKRLRNNRKHGGKVSRLTTTTSFSSEAAFSNYPVSTHKL
>P09288 ~~~~~~Protein UL24 homolog~~~
MSASRIRAKCFRLGQRCHTRFYDVLKKDIDNVRRGFADAFNPRLAKLLSPLSHVDVQRAVRISMSFEVNLGRRRPDCVCI
IQTESSGAGKTVCFIVELKSCRFSANIHTPTKYHQFCEGMRQLRDTMALIKETTPTGSDEIMVTPLLVFVSQRGLNLLQV
TRLPPKVIHGNLVMLASHLENVAEYTPPIRSVRERRRLCKKKIHVCSLAKKRAKSCHRSALTKFEENAACGVDLPLRRPS
LGACGGILQSITGMFSHG
>Q4JQU0 ~~~~~~Protein UL24 homolog~~~
MSASRIRAKCFRLGQRCHTRFYDVLKKDIDNVRRGFADAFNPRLAKLLSPLSHVDVQRAVRISMSFEVNLGRRRPDCVCI
IQTESSGAGKTVCFIVELKSCRFSANIHTPTKYHQFCEGMRQLRDTVALIKETTPTGSDEIMVTPLLVFVSQRGLNLLQV
TRLPPKVIHGNLVMLASHLENVAEYTPPIRSVRERRRLCKKKIHVCSLAKKRAKSCHRSALTKFEENAACGVDLPLRRPS
LGACGGILQSITGMFSHG
>Q6SWA4 ~~~~~~Protein UL27~~~
MNPVDQPPPPPLTQQPEEQAKEDHDDGDERLFRDPLTTYEYLDDCRDDEEFCHQFLRAYLTPIRNRQEAVRAGLLCRTPE
DLAAAGGQKRKTPAPKHPKHAMVYIRRSCLVHSACATAHGKYDIRGLTLESDLAVWAALRGVPLPPDPQHFRWLNAGAFR
RLVHEAQYLPEISRAAKRIALAVATGQYVVCTLLDYKTFGTRTHYLRQLCSMTEELYLRLDGTLCLFLEPEERELIGRCL
PAALCRGLPVKYRTHRAAVFFHATFMARAEAALKDLYAAFCECGDGRDDGGNHNGNYGGGDHSSLSPSAVASHHSRLEHA
ELRLERNRHLGAFHLPAIRHLTAGDVARVQDSVSRDLGFADWSQTLVDDYFLLPAGWACANPRRGYAMYLASNAVLALRI
IRLLRASIRHEYTACIRMLSGDVQRLIRLFKGEAALLRKGLAQNPVQRRELSRFRKHVHDLKRIRFTEDTFVETFCDFLE
LVQRIPDYRSVSLRIKRELLCLHVFKLRRGCRAPPTPEAVRVQRLLWHSLRHGDAPQDRTRLPQFSSALSDAELSNHANR
CRRKAPLELGPAVVAAPGPSVRYRAHIQKFERLHVRRFRPHEVGGHAT
>C0H677 ~~~~~~Protein UL29/UL28~~~
MSGRRKGCSAATASSSSSSPPSRLPPLPGHARRPRRKRCLVPEVFCTRDLADLCVRRDYEGLRRYLRRFEGSCVSLGWPS
QCIYVVGGEHSPHSLTEIDLEHCQNDFFGEFRALHLIGTVSHATCRYQVFVDAYGAVFAYDAQEDCLYELASDLAGFFAK
GMIRCDPVHESICARLQPNVPLVHPDHRAELCRRSRASARGRYLRSLLAFRELLACEDTAARCAYVEAHREAQLTLIWPE
KHSLVLRTARDLGLSASMLRRFQRSLYTREPVMPLGEIEGAEDKTFFHRVRILCGDTGTVYAALVGQDKLVRLARDLRGF
VRVGLALLIDDFRYESIGPVDRSSLYEANPELRLPFKKRRLVVGYFDSLSSLYLRGQPKFSSIWRGLRDAWTHKRPKPRE
RASGVHLQRYVRATAGRWLPLCWPPLHGIMLGDTQYFGVVRDHKTYRRFSCLRQAGRLYFIGLVSVYECVPDANTAPEIW
VSGHGHAFAYLPGEDKVYVLGLSFGEFFENGLFAVYSFFERDYVDEIVEGAWFKHTFAGMYELSQILHDRANLLRVCQLH
AGSKIRLGGSPACTFTFGSWNVAEADEANNFVIGVLEQAHFVVIGWMEPVNKAVFMDAHGGIHVLLYGTMLVKLAETLRG
FIRQGSFWFRCPRRFCFSPLDSSATVAAKPVSSHTSPAYDVSEYVFSGRSVLDSVSGTGAS
>P16848 ~~~~~~Protein UL31~~~
MGDKPTLVTLLTVAVSSPPPSSPLPLVSFTELLLPPPSVAAAAVAATATSEVGEKTAEQEVAAADPETGNERRENRENEG
GETRTTGTTAVKRSHDGIPRQLAERLRLCRHMDPEQDYRLPAQDVVTSWIEALRDADRDNYGRCVRHAKIHRSASHLTAY
ESYLVSITEQYNTASNVTEKASYVQGCIFLSFPVIYNNTQGCGYKYDWSNVVTPKAAYAELFFLLCSTSESSVVLQPLIT
KGGLCSSMAVYDEETMRQSQAVQIGFLHTQLVMVPFVPHACPHYAVPFTTPGKPGCGGAPSGVAGLEEAAPFGRVSVTRH
GATLLCRVDHLTWISKRVTTYGHKKITRYLAQFRGTMDDDEAALPGEDEAWIASKNVQYEFMGLIFTVNVDSLCVDAEQR
QLLGTVATSFCHRVSDKITARNMPRAFSFYLLTSAQRGYDLRFSRNPSLFFSGDALNCPLLNEPNVFSLTVHAPYDIHFG
VQPRQTVELDLRYVQITDRCFLVANLPHEDAFYTGLSVWRGGEPLKVTLWTRTRSIVIPQGTPIATLYQITEGDGNVYSY
NHHTVFRQMHAAGTTTFFLGDMQLPADNFLTSPHP
>P03184 ~~~BFLF1~~~Packaging protein UL32 homolog~~~
MAHKVTSANEPNPLTGKRLSSCPLTRSGVTEVAQIAGRTPKMEDFVPWTVDNLKSQFEAVGLLMAHSYLPANAEEGIAYP
PLVHTYESLSPASTCRVCDLLDTLVNHSDAPVAFFEDYALLCYYCLNAPRAWISSLITGMDFLHILIKYFPMAGGLDSLF
MPSRILAIDIQLHFYICRCFLPVSSSDMIRNANLGYYKLEFLKSILTGQSPANFCFKSMWPRTTPTFLTLPGPRTCKDSQ
DVPGDVGRGLYTALCCHLPTRNRVQHPFLRAEKGGLSPEITTKADYCGLLLGTWQGTDLLGGPGHHAIGLNAEYSGDELA
ELALAITRPEAGDHSQGPCLLAPMFGLRHKNASRTICPLCESLGAHPDAKDTLDRFKSLILDSFGNNIKILDRIVFLIKT
QNTLLDVPCPRLRAWLQMCTPQDFHKHLFCDPLCAINHSITNPSVLFGQIYPPSFQAFKAALAAGQNLEQGVCDSLITLV
YIFKSTQVARVGKTILVDVTKELDVVLRIHGLDLVQSYQTSQVYV
>P10216 ~~~~~~Packaging protein UL32~~~
MATSPPGVLASVAVCEESPGSSWKAGAFERAYVAFDPSLLALNEALCAELLTASHVIGVPPVGTLDEDVAADVVTAPSRA
RGGAGDGGGSGGRGGPRNPPPDPCGEGLLDTGPFSAAAIDTFALDRPCLVCRTIELYKQTYRLSPQWVADYAFLCAKCLG
APHCAASIFVAAFEFVYVMDRHFLRTKKATLVGSFARFALTINDIHRHFFLHCCFRTDGGVPGRHAQKQPKPSPSPGAAK
VQYSNYSFLAQSATRALIGTLASGGEEGAGSAAGSGTQPSLTTALMNWKDCARLLDCTEGRRGGGDSCCTRAAARNGEFE
TVAGDREPEESPDTWAYADLVLLLLAGTPAVWESGPQLRAAAEARRATVRQSWEAHRGARTRDVAPRFAQFTEPDAQPDL
DLGPLMATVLKHGRGRGRTGGECLLCNLLLVRAYWLALRRLRASVVRYSENNTSLFDCIVPVVDQLEADPETQPGDGGRF
VSLLRAAGPEAIFKHMFCDPMCAITEMEVDPWVLFGHPPATHPDELLLHKAKLACGNEFEGRVCIALRALIYTFKTYQVF
VPKPTALATFVREAGALLRRHSISLLSLEHTLCTYV
>F5HF47 ~~~~~~Packaging protein UL32 homolog~~~
MFVPWQLGTITRHRDELQKLLAASLLPEHPEESLGNPIMTQIHQSLQPSSPCRVCQLLFSLVRDSSTPMGFFEDYACLCF
FCLYAPHCWTSTMAAAADLCEIMHLHFPEEEATYGLFGPGRLMGIDLQLHFFVQKCFKTTAAEKILGISNLQFLKSEFIR
GMLTGTITCNFCFKTSWPRTDKEEATGPTPCCQITDTTTAPASGIPELARATFCGASRPTKPSLLPALIDIWSTSSELLD
EPRPRLIASDMSELKSVVASHDPFFSPPLQADTSQGPCLMHPTLGLRYKNGTASVCLLCECLAAHPEAPKALQTLQCEVM
GHIENNVKLVDRIAFVLDNPFAMPYVSDPLLRELIRGCTPQEIHKHLFCDPLCALNAKVVSEDVLFRLPREQEYKKLRAS
AAAGQLLDANTLFDCEVVQTLVFLFKGLQNARVGKTTSLDIIRELTAQLKRHRLDLAHPSQTSHLYA
>P16849 ~~~~~~G-protein coupled receptor homolog UL33~~~
MDTIIHNSTRNNTPPHINDTCNMTGPLFAIRTTEAVLNTFIIFVGGPLNAIVLITQLLTNRVLGYSTPTIYMTNLYSTNF
LTLTVLPFIVLSNQWLLPAGVASCKFLSVIYYSSCTVGFATVALIAADRYRVLHKRTYARQSYRSTYMILLLTWLAGLIF
SVPAAVYTTVVMHHDANDTNNTNGHATCVLYFVAEEVHTVLLSWKVLLTMVWGAAPVIMMTWFYAFFYSTVQRTSQKQRS
RTLTFVSVLLISFVALQTPYVSLMIFNSYATTAWPMQCEHLTLRRTIGTLARVVPHLHCLINPILYALLGHDFLQRMRQC
FRGQLLDRRAFLRSQQNQRATAETNLAAGNNSQSVATSLDTNSKNYNQHAKRSVSFNFPSGTWKGGQKTASNDTSTKIPH
RLSQSHHNLSGV
>P16766 ~~~~~~Protein UL35~~~
MAQGSRAPSGPPLPVLPVDDWLNFRVDLFGDEHRRLLLEMLTQGCSNFVGLLNFGVPSPVYALEALVDFQVRNAFMKVKP
VAQEIIRICILANHYRNSRDVLRDLRTQLDVLYSDPLKTRLLRGLIRLCRAAQTGVKPEDISVHLGADDVTFGVLKRALV
RLHRVRDALGLRASPEAEARYPRLTTYNLLFHPPPFTTVEAVDLCAENLSDVTQRRNRPLRCLTSIKRPGSRTLEDALND
MYLLLTLRHLQLRHALELQMMQDWVVERCNRLCDALYFCYTQAPETRQTFVTLVRGLELARQHSSPAFQPMLYNLLQLLT
QLHEANVYLCPGYLHFSAYKLLKKIQSVSDARERGEFGDEDEEQENDGEPREAQLDLEADPTAREGELFFFSKNLYGNGE
VFRVPEQPSRYLRRRMFVERPETLQIFYNFHEGKITTETYHLQRIYSMMIEGASRQTGLTPKRFMELLDRAPLGQESEPE
ITEHRDLFADVFRRPVTDAASSSSASSSSSSASPNSVSLPSARSSSTRTTTPASTYTSAGTSSTTGLLLSSSLSGSHGIS
SADLEQPPRQRRRMVSVTLFSPYSVAYSHHRRHRRRRSPPPAPRGPAHTRFQGPDSMPSTSYGSDVEDPRDDLAENLRHL
>F5HG98 ~~~~~~Apoptosis inhibitor UL38~~~
MTTTTHSTAAIMSLLDEAEWRQTQMDVGGLIQASALGKVALRYAVRKLMKRGARLRHDSGLYVCICDPSYEFLQMNLSKI
SWLERHCPPLDQELIMFGVIEAWEEASVRPTRQLVLFMTPKWDVFAYDSGILFFLAPSMAQFWHGAIVLEYWNALFPVEV
RSHVRQHAHTMDDLVMVFHQLDYEKQVLEARRDKNTEGPRTFAKSVNSYVRAILESERRIREGKIPMTFVDRDSLRANSL
AHIQATGAQPSHAPAQRVLSAPPSLPSPVSEEDPAAAATPPSSAATTPPSSVVPASVESELSSSPPLPPVVVKDVMYTAG
EGDVVQMVVVV
>Q9WT45 ~~~~~~Apoptosis inhibitor U19~~~
MAFTGDELARMLQFKDKMISSAGSALRFEKVVQEAMASGIVLQHITCIKVRICDNSDILSDRQLRCLLINGLYPFEGRMS
MFGVTEEWEGASAAPERQVVFLLSSTGQVLGYEDGVIFYLSPTFSDFWTTAMEFSCQNAILSNFIAQKSRDEYSDQFQKY
FTRMRHTPISLTGVLPRRFQKVESGACVEEDARASMRPIQSDSFGAKGPFCWPTEELLQPSAKKDVGGTVCMALSCQEDN
SARHCTIYGLTKTPGIKIMLSRHTQTDRSEAMCDAATQTEDVVDNSSETLFLGGNLVHQSILETEVQATAKNTFDVSDPR
IDSVYDTTVVGAMATDDVGCKHVQGGASLAQEKPLKGYCIIATPSECKPNIHWLKSPENAVHESAAVLR
>P16780 ~~~~~~Protein UL40~~~
MNKFSNTRIGFTCAVMAPRTLILTVGLLCMRIRSLLCSPAETTVTTAAVTSAHGPLCPLVFQGWAYAVYHQGDMALMTLD
VYCCRQTSNNTVVAFSHHPADNTLLIEVGNNTRRHVDGISCQDHFRAQHQDCPAQTVHVRGVNESAFGLTHLQSCCLNEH
SQLSERVAYHLKLRPATFGLETWAMYTVGILALGSFSSFYSQIARSLGVLPNDHHYALKKA
>P16815 ~~~~~~Protein UL42~~~
MEPTPMLRDRDHDDAPPTYEQAMGLCPTTVSTPPPPPPDCSPPPYRPPYCLVSSPSPRHTFDMDMMEMPATMHPTTGAYF
DNGWKWTFALLVVAILGIIFLAVVFTVVINRDSANITTGTQASSG
>F5HHZ3 ~~~~~~Protein UL42~~~
MEPTPMLRDRDHDDAPPTYEQAMGLCPTTVSTPPPPPPDCSPPPYRPPYCLVSSPSPRHTFDMDMMEMPATMHPTTGAYF
DNGWKWTFALLVVAILGIIFLAVVFTVVINRDSANITTGTQASSG
>D5LX53 ~~~~~~Protein UL42~~~
MEPTPMLRDRDHDDAPPTYEQAMGLCPTTVSTPPPPPPDCSPPPYRPPYCLVSSPSPRHTFDMDMMEMPATMHPTTGAYF
DNGWKWTFALLVVAILGIIFLAVVFTVVINRDNSTATGTSSG
>F5HG20 ~~~~~~Protein ORF66~~~
MALDQRWDRFLVSWFGLDEAQLTAHRVFEGENGVPVEEYVAFVIFGERGFQGNMPSWARHLLDRPSLAQAIAVLRAGSDT
VAKQAQICAAQQLLGAHVWVVVTLSRAQAADHARAIPRHVWAKYLSLPFSKACAQLCKLLALCSRFPLVTCCSKPPPSLP
WLRKKWHGPLPRRPLLEVPSPTRRGVAATEDGNGLGIGAADTGLREALERVAPTVPCGNPFDAMLGSLCFLSLIKSRHVV
LPACEQEGPGLVRNLGRRLLAYNVLSPCVSIPVICSRVARAALAKRARCARAVVCMECGHCLNFGRGKFHTVNFPPTNVF
FSRDRKEKQFTICATTGRIYCSYCGSEHMRVYPLCDITGRGTLARVVIRAVLANNAALAIRDLDQTVSFVVPCLGTPDCE
AALLKHRDVRGLLQLTSQLLEFCCGKCSS
>P10240 ~~~~~~Protein UL56~~~
MASEAAQPDAGLWSAGNAFADPPPPYDSLSGRNEGPFVVIDLDTPTDPPPPYSAGPLLSVPIPPTSSGEGEASERGRSRQ
AAQRAARRARRRAERRAQRRSFGPGGLLATPLFLPETRLVAPPDITRDLLSGLPTYAEAMSDHPPTYATVVAVRSTEQPS
GALAPDDQRRTQNSGAWRPPRVNSRELYRAQRAARGSSDHAPYRRQGCCGVVWRHAVFGVVAIVVVIILVFLWR
>P28282 ~~~~~~Protein UL56~~~
MALGAGHAHACRDDGDDSVIDAPPPYESVAGASAGQFVVIDIDTPTDSPPPYSAGTSPVGLVSPASSGDGEVCERGRSRR
AAWRAARRARRRAERRARRRSFGPGGLFVETPLFLPETMIGAHPGVGGDLPSGLPTYAEATSDRPPTYAMVMAACPTEPP
GGSVGPADQPRVQSSRTWRPPLVNSRELYRAQRAARCASSSDTPQAPGWCGGTCRHAVFGVVAVVVVIILAFLWR
>P16725 ~~~~~~Protein UL76~~~
MPSGRGDDADSTGNALRRLPHVRKRIGKRKHLDIYRRLLRVFPSFVALNRLLGGLFPPELQKYRRRLFIEVRLSRRIPDC
VLVFLPPDSGSRGIVYCYVIEFKTTYSDADDQSVRWHATHSLQYAEGLRQLKGALVDFDFLRLPRGGGQVWSVVPSLVFF
QQKADRPSFYRAFRSGRFDLCTDSVLDYLGRRQDESVAHLLAATRRRLLRTARGKRAALPRARASAVAGGRGGDNARRGL
ARGRAHGPGAQTVSASGAQGSGSQGADLLRGSRRARVRGGGAVEPAVRARRRTVAADAATTTVSSAFFVPRDRRGRSFCR
PTRSL
>Q6SW66 ~~~~~~Protein UL76~~~
MPSGCGDDADSTGNALRRLPHVRKRIGKRKHLDIYRRLLRVFPSFVALNRLLGGLFPPELQKYRRRLFIEVRLSRRIPDC
VLVFLPPDSGSRGIVYCYVIEFKTTYSDADDQSVRWHATHSLQYAEGLRQLKGALVDFDFLRLPRGGGQVWSVVPSLVFF
QQKADRPSFYRAFRSGRFDLCTDSVLDYLGRRQDESVAHLLAATRRRLLRAARGKRAALPRARASAVAGGRGGGNARRGL
ARGRAHGPGAQTVSASGAEGSGSQGTDLLRGSRRARVRGGGAVEPAVRARRRTVAADAATTTVSSAFVVPRDRRGRSFRR
PTRSL
>F5HET1 ~~~~~~Protein UL78~~~
MSPSVEETTSVTESIMFAIVSFKHMGPFEGYSMSADRAASDLLIGMFGSVSLVNLLTIIGCLWVLRVTRPPVSVMIFTWN
LVLSQFFSILATMLSKGIMLRGALNLSLCRLVLFVDDVGLYSTALFFLFLILDRLSAISYGRDLWHHETRENAGVALYAV
AFAWVLSIVAAVPTAATGSLDYRWLGCQIPIQYAAVDLTIKMWFLLGAPMIAVLANVVELAYSDRRDHVWSYVGRVCTFY
VTCLMLFVPYYCFRVLRGVLQPASAAGTGFGIMDYVELATRTLLTMRLGILPLFIIAFFSREPTKDLDDSFDYLVERCQQ
SCHGHFVRRLVQALKRAMYSVELAVCYFSTSVRDVAEAVKKSSSRCYADATSAAVVVTTTTSEKATLVEHAEGMASEMCP
GTTIDVSAESSSVLCTDGENTVASDATVTAL
>Q2HRB4 ~~~~~~Protein UL79 homolog~~~
MLGKYVCETEPLSPGLRRLMWRFLQNKNLNTFHAQELRFIHLVLCKMYNFGLNVYLLREATANAGTYDEVVLGRKVPAEV
WKLVYDGLEEMGVSSEMLLCEAYRDSLWMHLNDKVGLLRGLANYLFHRLGVTHDVRIAPENLVDGNFLFNLGSVLPCRLL
LAAGYCLAFWGSDEHERWVRFFAQKLFICYLIVSGRLMPQRSLLVWASETGYPGPVEAVCRDIRSMYGIRTYAVSGYLPA
PSEAQLAYLGAFNNNAV
>P16727 ~~~~~~Protein UL84~~~
MPRVDPNLRNRARRPRARRGGGGGVGSNSSRHSGKCRRQRRALSAPPLTFLATTTTTTMMGVASTDDDSLLLKTPDELDK
YSGSPQTILTLTDKHDIRQPRVHRGTYHLIQLHLDLRPEELRDPFQILLSTPLQLGEANDESQTAPATLQEEETAASHEP
EKKKEKQEKKEEDEDDRNDDRERGILCVVSNEDSDVRPAFSLFPARPGCHILRSVIDQQLTRMAIVRLSLNLFALRIITP
LLKRLPLRRKAAHHTALHDCLALHLPELTFEPTLDINNVTENAASVADTAESTDADLTPTLTVRVRHALCWHRVEGGISG
PRGLTSRISARLSETTAKTLGPSVFGRLELDPNESPPDLTLSSLTLYQDGILRFNVTCDRTEAPADPVAFRLRLRRETVR
RPFFSDAPLPYFVPPRSGAADEGLEVRVPYELTLKNSHTLRIYRRFYGPYLGVFVPHNRQGLKMPVTVWLPRSWLELTVL
VSDENGATFPRDALLGRLYFISSKHTLNRGCLSAMTHQVKSTLHSRSTSHSPSQQQLSVLGASIALEDLLPMRLASPETE
PQDCKLTENTTEKTSPVTLAMVCGDL
>P29839 ~~~~~~UL84 protein~~~
MPRADPNLRNRARRPRARRGGGGGVGSNSSRHSGKCRRQRRALSAPPLTFLATTTTTTMMGVASTDDDSLLLKTPDELDK
HSGSPQTILTLTDKHDIRQPRVHRGTYHLIQLHLDLRPEELRDPFQILLSTPLQLGEANGESQTAPATSQEEETAASHEL
EKKKEKEEKKEEEDEDDRNDDRERGILCVVSNEDSDVRPAFSLFPARPGCHILRSVIDQQLTRMAIVRLSLNLFALRIIT
PPLKRVPLRRKAAHHTALHDCMALHLPELTFESTLDINNVTENAASVADAAESTDADLTPTLTVRVRHAVCWHRVEGGIS
GPRGLTSRISARLSETTAKTLGPSVFGRLELDPNESPPDLTLSSLTLYQDGMLRFNVTCDRTEAPADPVAFRLRLRRETV
RRPFFSDAPLPYFVPPRSGAADEGLEVRVPYELTLKNSHTLRIYRRFYGPYLGVFVPHNRQGLKMPVTVWLPRSWLELTV
LVSDENGATFPRDALLGRLYFISSKHTLNRGCLSAMTHQVKSTLHSRSTSHSPSQQQLSVLGASIALEDLLPMRLASPET
EPQDCKLTENTTEKTSPVTLAMVCGDL
>P16731 ~~~~~~Protein UL88~~~
MMEAAAAAAAAFRPEERPTPGWHDAALLMDDGTVREHAFRNGPLSQLIRRVLPPPPDAEDDVVFASELCFYCSGRFNRRS
SVFSIYWQKHSDLVYALTGITHCAKLVVECGQLGSSRLRWRDGDASGEERRGDDDSRDELYDVPGIYMIRVNDGGSTGPR
HVIWPGTSVLWAPDVVITTVQRRISAARALVNTFRQYFFLLERRSHEELVLCPPEMEERLAPLLQSATRGDSDMFDGVVA
SAYHRLRMSNIPRSSARLLEHCVGLAGAKKLLLLDVPRLENYFLCQVCLYELDEDEMGEEMLGMLAGKPEDAAVSGASGG
FLLHRKTMKLAACLCLLLNSLHLHQEALEALDPPPPRVEENDLVNVVLRRYYRSHGGVQARTLAAARALLADYAETFSPL
GSFTRLGYDRLVSADAGVSRRHLVALLRA
>Q2HRA1 ~~~~~~Protein ORF31~~~
MKSVASPLCQFHGVFCLYQCRQCLAYHVCDGGAECVLLHTPESVICELTGNCMLGNIQEGQFLGPVPYRTLDNQVDRDAY
HGMLACLKRDIVRYLQTWPDTTVIVQEIALGDGVTDTISAIIDETFGECLPVLGEAQGGYALVCSMYLHVIVSIYSTKTV
YNSMLFKCTKNKKYDCIAKRVRTKWMRMLSTKDT
>P03220 ~~~BGLF3~~~Late gene expression regulator BGLF3~~~
MFNAVKADMPDDPMLARRYGQCLELALEACQDTPEQFKLVETPLKSFLLVSNILPQDNRPWHEARSSGRVAEDDYDFSSL
ALELLPLNPRLPEEWQFGGQGWSSRMEPSQPEMGMGLCFEVFDGDLMRIALAWNKDEVIGQALQILAHSQTWTSLVPEDP
LPWMWALFYGPRSHCEERHCVYAAARGKRGPILLPTAVYTPCANIEAFLAHLTRCVYALYLDVRDWKGEDIAPPFDVSRL
NKMAKQLCLLPQEPFCITRVCLLCLLHKQNLNAQYKRPVDTYDPCLILTGEAERYMVDAVGNYREASTGTTVLYPTYDLG
SIVADMVTYEDE
>Q2HR98 ~~~~~~Protein UL95 homolog~~~
MFALSSLVSEGDPEVTSRYVKGVQLALDLSENTPGQFKLIETPLNSFLLVSNVMPEVQPICSGLPALRPDFSNLHLPRLE
KLQRVLGQGFGAAGEEIALDPSHVETHEKGQVFYNHYATEEWTWALTLNKDALLREAVDGLCDPGTWKGLLPDDPLPLLW
LLFNGPASFCRADCCLYKQHCGYPGPVLLPGHMYAPKRDLLSFVNHALKYTKFLYGDFSGTWAAACRPPFATSRIQRVVS
QMKIIDASDTYISHTCLLCHIYQQNSIIAGQGTHVGGILLLSGKGTQYITGNVQTQRCPTTGDYLIIPSYDIPAIITMIK
ENGLNQL
>P16788 2.7.11.1~~~~~~Serine/threonine protein kinase UL97~~~
MSSALRSRARSASLGTTTQGWDPPPLRRPSRARRRQWMREAAQAAAQAAVQAAQAAAAQVAQAHVDENEVVDLMADEAGG
GVTTLTTLSSVSTTTVLGHATFSACVRSDVMRDGEKEDAASDKENLRRPVVPSTSSRGSAASGDGYHGLRCRETSAMWSF
EYDRDGDVTSVRRALFTGGSDPSDSVSGVRGGRKRPLRPPLVSLARTPLCRRRVGGVDAVLEENDVELRAESQDSAVASG
PGRIPQPLSGSSGEESATAVEADSTSHDDVHCTCSNDQIITTSIRGLTCDPRMFLRLTHPELCELSISYLLVYVPKEDDF
CHKICYAVDMSDESYRLGQGSFGEVWPLDRYRVVKVARKHSETVLTVWMSGLIRTRAAGEQQQPPSLVGTGVHRGLLTAT
GCCLLHNVTVHRRFHTDMFHHDQWKLACIDSYRRAFCTLADAIKFLNHQCRVCHFDITPMNVLIDVNPHNPSEIVRAALC
DYSLSEPYPDYNERCVAVFQETGTARRIPNCSHRLRECYHPAFRPMPLQKLLICDPHARFPVAGLRRYCMSELSALGNVL
GFCLMRLLDRRGLDEVRMGTEALLFKHAGAACRALENGKLTHCSDACLLILAAQMSYGACLLGEHGAALVSHTLRFVEAK
MSSCRVRAFRRFYHECSQTMLHEYVRKNVERLLATSDGLYLYNAFRRTTSIICEEDLDGDCRQLFPE
>D3XDS2 2.7.11.1~~~~~~Serine/threonine protein kinase M97~~~
MSVELTPPRSDGSVGFAPVVVPPAPRKPLRRRAVSDLEKLYKVKRRLVFGADDGAVDNDTSNNNSGSSSTTSRSRRKTAA
DVVSDSPKRTDDSSTAGEDGYTHCVHSCACTPGERHLLCCELVSIGDSVSVARCPLCSLGISTTYLSRGCCRGRSKVTGG
DEDEEDEDEEENSQDEDRDEEEAASASSSGGLEWSDDSNSALSWSDENIIISPFPGLKCYVTTFEDIRQPVLLETGSAYL
PVYVPYDESFCRNRCLERGGDDDDERDATLIGKGSFGQVWRLSDKKTALKAAASESINETLLTVWISGVVRSRAQDAGYR
GELDDSVYCNILVATGSCLRHNLVSFASFDRDLYNYRGWHYAGLASYRRAFSGIADALRFLNLRCGVGHFDVTPMNVLIN
YDRADDRQIARAVICDFSLSQCHTEGTTGHCVVVFQQTKTVRALPKSAYYLTDIYHPAFKPLMLQKLCAIEPRKQFPKPS
ANRFCVSDLCALGHVAAFCLVRVLDERGQLKVRSTSEDALFGVARKTCDALARHSVDEVANFCSLLITRQLAYTATLLGS
DDMREPMARLCDYFETVSDKDAPDRFRSVYKRARREIDGSYMVRLLLAASETEDGRYLLDNIRATCLMVDSEDLDVDPYK
IFP
>P14739 ~~~UGI~~~Uracil-DNA glycosylase inhibitor~~~
MTNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENK
IKML
>P12888 3.2.2.27~~~UNG~~~Uracil-DNA glycosylase~~~
MASRGLDLWLDEHVWKRKQEIGVKGENLLLPDLWLDFLQLSPIFQRKLAAVIACVRRLRTQATVYPEEDMCMAWARFCDP
SDIKVVILGQDPYHGGQANGLAFSVAYGFPVPPSLRNIYAELHRSLPEFSPPDHGCLDAWASQGVLLLNTILTVQKGKPG
SHADIGWAWFTDHVISLLSERLKACVFMLWGAKAGDKASLINSKKHLVLTSQHPSPLAQNSTRKSAQQKFLGNNHFVLAN
NFLREKGLGEIDWRL
>P10186 3.2.2.27~~~UL2~~~Uracil-DNA glycosylase~~~
MKRACSRSPSPRRRPSSPRRTPPRDGTPPQKADADDPTPGASNDASTETRPGSGGEPAACRSSGPAALLAALEAGPAGVT
FSSSAPPDPPMDLTNGGVSPAATSAPLDWTTFRRVFLIDDAWRPLMEPELANPLTAHLLAEYNRRCQTEEVLPPREDVFS
WTRYCTPDEVRVVIIGQDPYHHPGQAHGLAFSVRANVPPPPSLRNVLAAVKNCYPEARMSGHGCLEKWARDGVLLLNTTL
TVKRGAAASHSRIGWDRFVGGVIRRLAARRPGLVFMLWGTHAQNAIRPDPRVHCVLKFSHPSPLSKVPFGTCQHFLVANR
YLETRSISPIDWSV
>P13158 3.2.2.27~~~UL2~~~Uracil-DNA glycosylase~~~
MAMKRNPSRVFCAYSKNGTHRSAAPTTHRCIAGGGRGALDAGAENTQGHPESRCFPGGRPPQTGPSWCLGAAFRRAFLID
DAWRPLLEPELANPLTARLLAEYDRRCQTEEVLPPREDVFSWTRYCTPDDVRVVIIGQDPYHHPGQAHGLAFSVRADVPV
PPSLRNVLAAVKNCYPDARMSGRGCLEKWARDGVLLLNTTLTVKRGAAASTSKLGWDRFVGGVVRRLAARRPGLVFMLWG
AHAQNAIRPDPRQHYVLKFSHPSPLSKVPFGTCQHFLAANRYLETRDIMPITVV
>F5HFA1 3.2.2.27~~~~~~Uracil-DNA glycosylase~~~
MDAWLQQTVFRGTLSISQGVDDRDLLLAPKWISFLSLSSFLKQKLLSLLRQIRELRLTTTVYPPQDKLMWWSHCCDPEDI
KVVILGQDPYHKGQATGLAFSVDPQCQVPPSLRSIFRELEASVPNFSTPSHGCLDSWARQGVLLLNTVLTVEKGRAGSHE
GLGWDWFTSFIISSISSKLEHCVFLLWGRKAIDRTPLINAQKHLVLTAQHPSPLASLGGRHSRWPRFQGCNHFNLANDYL
TRHRRETVDWGLLEQ
>Q5UPT2 3.2.2.-~~~UNG~~~Probable uracil-DNA glycosylase~~~
MSKKNVDPFSDSDSSSEPPSIFSSDNEENSDVDNSVIINDKNTKSDEADIKYMDEDESSDSESESESKKKSKKSKKSKKS
KKSVTKKKNNLLVGNRIITEYILIDANNYHFKSWIECFPDCKVNLKLLLFRPEWFDFFKYVESKTYFPQLESKLSSYLEK
RQRIVPYPELLFNTMNVLPPGKIKVVILGQDPYPGSCISGVPYAMGCSFSVPLNCPVPKSLANIYTNLIKFNHMRKAPKH
GCLASWILQGTFMINSAFTTVLNESGVHARTWESFTADLIDYLTDNYDDLIFVAWGAHAHKLCQRVDPKKHYIITSSHPS
PYSVSNTMTSMSYGPNPKKVTYPSFNSVDHFGKINEHLKSRNKKPIFWDL
>P32941 3.2.2.27~~~UNG~~~Uracil-DNA glycosylase~~~
MRRVFLSHEPYVIEYHEDWENIITRLVDMYNEVAEWILKDDTSPTPDKFFKQLSVSLKDKRVCVCGIDPYPRDATGVPFE
SHNFTKKTIKYIAETVSNITGVRYYKGYNLNNVEGVFPWNYYLSCKIGETKSHALHWKRISKLLLQHITKYVNVLYCLGK
TDFANIRSILETPVTTVIGYHPAAREKQFEKDKGFEIVNVLLEINDKPSIRWEQGFSY
>Q91UM2 3.2.2.27~~~~~~Uracil-DNA glycosylase~~~
MNSVTVSHAPYTITYHDDWEPVMSQLVEFYNEVASWLLRDETSPIPDKFFIQLKQPLRNKRVCVCGIDPYPKDGTGVPFE
SPNFTKKSIKEIASSISRLTGVIDYKGYNLNIIDGVIPWNYYLSCKLGETKSHAIYWDKISKLLLQHITKHVSVLYCLGK
TDFSNIRAKLESPVTTIVGYHPAARDRQFEKDRSFEIINVLLELDNKAPINWAQGFIY
>P20536 3.2.2.27~~~~~~Uracil-DNA glycosylase~~~
MNSVTVSHAPYTITYHDDWEPVMSQLVEFYNEVASWLLRDETSPIPDKFFIQLKQPLRNKRVCVCGIDPYPKDGTGVPFE
SPNFTKKSIKEIASSISRLTGVIDYKGYNLNIIDGVIPWNYYLSCKLGETKSHAIYWDKISKLLLQHITKHVSVLYCLGK
TDFSNIRAKLESPVTTIVGYHPAARDRQFEKDRSFEIINVLLELDNKVPINWAQGFIY
>P04303 3.2.2.27~~~~~~Uracil-DNA glycosylase~~~
MNSVTVSHAPYTITYHDDWEPVMSQLVEFYNEVASWLLRDETSPIPDKFFIQLKQPLRNKRVCVCGIDPYPKDGTGVPFE
SPNFTKKSIKEIASSISRLTGVIDYKGYNLNIIDGVIPWNYYLSCKLGETKSHAIYWDKISKLLLQHITKHVSVLYCLGK
TDYSNIRAKLESPVTTIVGYHPAARDRQFEKDRSFEIINVLLELDNKAPINWAQGFIY
>P09713 ~~~US2~~~Unique short US2 glycoprotein~~~
MNNLWKAWVGLWTSMGPLIRLPDGITKAGEDALRPWKSTAKHPWFQIEDNRCYIDNGKLFARGSIVGNMSRFVFDPKADY
GGVGENLYVHADDVEFVPGESLKWNVRNLDVMPIFETLALRLVLQGDVIWLRCVPELRVDYTSSAYMWNMQYGMVRKSYT
HVAWTIVFYSINITLLVLFIVYVTVDCNLSMMWMRFFVC
>F5HE05 ~~~US2~~~Unique short US2 glycoprotein~~~
MNNLWKAWVGLWTSMGPLIRLPDGITKAGEDALRPWKSTAKHPWFEIEDNRCYIDNGKLFARGSIVGNMSRFVFDPKADY
GGVGENLYVHADDVEFVPGESLKWNVRNLDVMPIFETLALRLVLQGDVIWLRCVPELRVDYTSSAYMWNMQYGMVRKSYT
HVAWTIVFYSINITLLVLFIVYVTVDCNLSMMWMRFFVC
>P06485 ~~~US2~~~Protein US2~~~
MGVVVVNVMTLLDQNNALPRTSVDASPALWSFLLRQCRILASEPLGTPVVVRPANLRRLAEPLMDLPKPTRPIVRTRSCR
CPPNTTTGLFAEDSPLESTEVVDAVACFRLLHRDQPSPPRLYHLWVVGAADLCVPFLEYAQKIRLGVRFIAIKTPDAWVG
EPWAVPTRFLPEWTVAWTPFPAAPNHPLETLLSRYEYQYGVVLPGTNGRERDCMRWLRSLIALHKPHPATPGPLTTSHPV
RRPCCACMGMPEVPDEQPTSPGRGPQETDPLIAVRGERPRLPHICYPVTTL
>P13292 ~~~US2~~~Protein US2~~~
MGVVVVSVVTLLDQRNALPRTSADASPALWSFLLRQCRILASEPLGTPVVVRPANLRRLAEPLMDLPKFTRPIVRTRSCR
CPPNTTTGLFAEDDPLESIEILDAPACFRLLHQERPGPHRLYHLWVVGAADLCVPFFEYAQKTRLGFRFIATKTNDAWVG
EPWPLPDRFLPERTVSWTPFPAAPNHPLENLLSRYEYQYGVVVPGDRERSCLRWLRSLVAPHNKPRPASSRPHPATHPTQ
RPCFTCMGRPEIPDEPSWQTGDDDPQNPGPPLAVGDEWPPSSHVCYPITNL
>P09712 ~~~US3~~~Membrane glycoprotein US3~~~
MKPVLVLAILAVLFLRLADSVPRPLDVVVSEIRSAHFRVEENQCWFHMGMLYFKGRMSGNFTEKHFVNVGIVSQSYMDRL
QVSGEQYHHDERGAYFEWNIGGHPVTHTVDMVDITLSTRWGDPKKYAACVPQVRMDYSSQTINWYLQRSMRDDNWGLLFR
TLLVYLFSLVVLVLLTVGVSARLRFI
>F5HEU0 ~~~US3~~~Membrane glycoprotein US3~~~
MKPVLVLAILAVLFLRLADSVPRPLNVVVSEIKSAHFRVEENQCWFHMGMLYFKGRMSGNFTKKHFVNVGIVSQSYMDRL
QVSGEQYHHDERGAYFEWNIGGYPVSHTVDMVDITLSTRWGDPKKYAACVPQVRMDYSSQTINWYLQRSMRDDNWGLLFR
TLLVYLFSLVVLVLLTVGVSARLRFI
>B9VXD7 ~~~US3~~~Membrane glycoprotein US3~~~
MKPVLVLAILAVLFLRLADSVPRPLDVVVSEIRSAHFRVEENQCWFHMGMLHYKGRMSGNFTEKHFVSVGIVSQSYMDRL
QVSGEQYHHDERGAYFEWNIGGHPVPHTVDMVDITLSTRWGDPKKYAACVPQVRMDYSSQTINWYLQRSIRDDNWGLLFR
TLLVYLFSLVVLVLLTVGVSARLRFI
>P04413 2.7.11.1~~~US3~~~Serine/threonine-protein kinase US3~~~
MACRKFCRVYGGQGRRKEEAVPPETKPSRVFPHGPFYTPAEDACLDSPPPETPKPSHTTPPSEAERLCHLQEILAQMYGN
QDYPIEDDPSADAADDVDEDAPDDVAYPEEYAEELFLPGDATGPLIGANDHIPPPCGASPPGIRRRSRDEIGATGFTAEE
LDAMDREAARAISRGGKPPSTMAKLVTGMGFTIHGALTPGSEGCVFDSSHPDYPQRVIVKAGWYTSTSHEARLLRRLDHP
AILPLLDLHVVSGVTCLVLPKYQADLYTYLSRRLNPLGRPQIAAVSRQLLSAVDYIHRQGIIHRDIKTENIFINTPEDIC
LGDFGAACFVQGSRSSPFPYGIAGTIDTNAPEVLAGDPYTTTVDIWSAGLVIFETAVHNASLFSAPRGPKRGPCDSQITR
IIRQAQVHVDEFSPHPESRLTSRYRSRAAGNNRPPYTRPAWTRYYKMDIDVEYLVCKALTFDGALRPSAAELLCLPLFQQ
K
>P09251 2.7.11.1~~~~~~Serine/threonine-protein kinase US3 homolog~~~
MNDVDATDTFVGQGKFRGAISTSPSHIMQTCGFIQQMFPVEMSPGIESEDDPNYDVNMDIQSFNIFDGVHETEAEASVAL
CAEARVGINKAGFVILKTFTPGAEGFAFACMDSKTCEHVVIKAGQRQGTATEATVLRALTHPSVVQLKGTFTYNKMTCLI
LPRYRTDLYCYLAAKRNLPICDILAIQRSVLRALQYLHNNSIIHRDIKSENIFINHPGDVCVGDFGAACFPVDINANRYY
GWAGTIATNSPELLARDPYGPAVDIWSAGIVLFEMATGQNSLFERDGLDGNCDSERQIKLIIRRSGTHPNEFPINPTSNL
RRQYIGLAKRSSRKPGSRPLWTNLYELPIDLEYLICKMLSFDARHRPSAEVLLNHSVFQTLPDPYPNPMEVGD
>P14334 ~~~US6~~~Unique short US6 glycoprotein~~~
MDLLIRLGFLLMCALPTPGERSSRDPKTLLSLSPRQQACVPRTKSHRPVCYNDTGDCTDADDSWKQLGEDFAHQCLQAAK
KRPKTHKSRPNDRNLEGRLTCQRVRRLLPCDLDIHPSHRLLTLMNNCVCDGAVWNAFRLIERHGFFAVTLYLCCGITLLV
VILALLCSITYESTGRGIRRCGS
>Q6SW00 ~~~US6~~~Unique short US6 glycoprotein~~~
MDLLIRLGFLLMCALPTPGERSSRDPKTLLSLSPRQACVPRTKSHRPVCYNDTGDCTDADDSWKQLGEDFAHQCLQAAKK
RPKTHKSRPNDRNLEGRLTCQRVRRLLPCDLDIHPSHRLLTLMNNCVCDGAVWNAFRLIERHGFFAVTLYLCCGITLLVV
ILALLCSITYESTGRGIRRCGS
>P60528 ~~~US6~~~Unique short US6 glycoprotein~~~
MDLLIRLGFLLMCALPTPGERSSRDPKTLLSLSPRQQACVPRTKSHRPVCYNDTGDCTDADDSWKQLGEDFAHQCLQAAK
KRPKTHKSRPNDRNLEGRLTCQRVRRLLPCDLDIHPSHRLLTLMNNCVCDGAVWNAFRLIERHGFFAVTLYLCCGITLLV
VILALLCSITYESTGRGIRRCGS
>P28965 ~~~~~~Virion protein US10 homolog~~~
MDGAYGHVHNGSPMAVDGEESGAGTGTGAGADGLYPTSTDTAAHAVSLPRSVGDFAAVVRAVSAEAADALRSGAGPPAEA
WPRVYRMFCDMFGRYAASPMPVFHSADPLRRAVGRYLVDLGAAPVETHAELSGRMLFCAYWCCLGHAFACSRPQMYERAC
ARFFETRLGIGETPPADAERYWAALLNMAGAEPELFPRHAAAAAYLRARGRKLPLQLPSAHRTAKTVAVTGQSINF
>P29123 ~~~IR5~~~Virion protein US10 homolog~~~
MDGAYGHVHNGSPMAVDGEESGAGTGTGAGADGLYPTSTDTAAHAVSLPRSVGDFAAVVRAVSAEAADALRSGAGPPAEA
WPRVYRMFCDMFGRYAASPMPVFHSADPLRRAVGLYLVDLGAAPVETHAELSGRMLFCAYWCCLGHAFACSRPQMYERAC
ARFFETRLGIGETPPADAERYWAALLNMAGAEPELFPRHAAAAAYLRARGRKLPLQLPSAHRTAKTVAVTGQSINF
>P84402 ~~~~~~Virion protein US10 homolog~~~
MDGAYGHVHNGSPMAVDGEESGAGTGTGAGADGLYPTSTDTAAHAVSLPRSVGDFAAVVRAVSAEAADALRSGAGPPAEA
WPRVYRMFCDMFGRYAASPMPVFHSADPLRRAVGRYLVDLGAAPVETHAELSGRMLFCAYWCCLGHAFACSRPQMYERAC
ARFFETRLGIGETPPADAERYWAALLNMAGAEPELFPRHAAAAAYLRARGRKLPLQLPSAHRTAKTVAVTGQSINF
>P18349 ~~~~~~Virion protein US10 homolog~~~
MAHAIPRPAEEIPLVPGRARSVRLGSTLPRVMDCAYGSPMAVDGDVRTGGDCGGGEGLYPTSTDTAAHAVSLPRSVGEFA
SAVRAMSADAADALRRGAGPPPEIWPRAYRMFCELFGRYAVSPMPVFHSADPLRRAVGRYLVDLGAAPVETHAELSTRLL
FCAHWCCLGHAFGCSRQAMYERECARFFEARLGIGETPPADSERYWVALLDMAGADPELFPRHAAAAAYLRTRGRKLPLP
LPPQAGSATVSVASQSINF
>Q05107 ~~~~~~Virion protein US10 homolog~~~
MAMWSLRRKSSRSVQLRVDSPKEQSYDILSAGGEHVALLPKSVRSLARTILTAATISQAAMKAGKPPSSRLWGEIFDRMT
VTLNEYDISASPFHPTDPTRKIVGRALRCIERAPLTHEEMDTRFTIMMYWCCLGHAGYCTVSRLYEKNVRLMDIVGSATG
CGISPLPEIESYWKPLCRAVATKGNAAIGDDAELAHYLTNLRESPTGDGESYL
>Q77MP8 ~~~~~~Virion protein US10 homolog~~~
MAMWSLRRKSSRSVQLRVDSPKEQSYDILSAGGEHVALLPKSVRSLARTILTAATISQAAMKAGKPPSSRLWGEIFDRMT
VTLNEYDISASPFHPTDPTRKIVGRALRCIERAPLTHEEMDTRFTIMMYWCCLGHAGYCTVSRLYEKNVRLMDIVGSATG
CGISPLPEIESYWKPLCRAVATKGNAAIGDDAELAHYLTNLRESPTGDGESYL
>P06486 ~~~~~~Virion protein US10~~~
MIKRRGNVEIRVYYESVRTLRSRSHLKPSDRQQSPGHRVFPGSPGFRDHPENLGNPEYRELPETPGYRVTPGIHDNPGLP
GSPGLPGSPGLPGSPGPHAPPANHVRLAGLYSPGKYAPLASPDPFSPQHGAYARARVGIHTAVRVPPTGSPTHTHLRQDP
GDEPTSDDSGLYPLDARALAHLVMLPADHRAFFRTVVEVSRMCAANVRDPPPPATGAMLGRHARLVHTQWLRANQETSPL
WPWRTAAINFITTMAPRVQTHRHMHDLLMACAFWCCLTHASTCSYAGLYSTHCLHLFGAFGCGDPALTPPLC
>D3YPD5 ~~~~~~Virion protein US10~~~
MIKRRGNVEIRVYYESVRTLRSRSHLKPSDRQQSPGHRVFPGSPGFRDHPENLGNPEYRELPETPGYRVTPGIHDNPGSP
GLPGSPGPHAPPANHVRLAGLYSPGKYAPLASPDPFSPQDGAYARARVGLHTAVRVPPTGSPTHTHLRHDPGDEPTSDDS
GLYPLDARALAHLVMLPADHRAFFRTVVEVSRMCAANVRDPPPPATGAMLGRHARLVHTQWLRANQETSPLWPWRTAAIN
FITTMAPRVQTHRHMHDLLMACAFWCCLTHASTCSYAGLYSTHCLHLFGAFGCGDPALTPPLC
>P89478 ~~~~~~Virion protein US10~~~
MIRRRGNVEIRVYYESVRPSRSRSHLKPSDHQEFPGHHVSPGSPGFPESPGNREFHDLPENPGSRAYPGTRDPHDPHGCP
GSLDPHGNPAQPAGLPSPVPYAPLGSPDPSSPRQRTYVLPRVGIHNAPASDTRAPKRANSRHRADRPPESPGSELYPLNA
QALAHLQMLPADHRAFFRTVIEVSRLCALNTHDPPPPLAGARVGQEAQLVHTQWLRANRESSPLWPWRTAAMNFIAAAAP
CVQTHRHMHDLLMACAFWCCLAHASTCSYAGLYSAHCQHLFRAFGCGPPVLTTSRGQGGWCN
>Q6UDG2 ~~~~~~Virion protein US10 homolog~~~
MRPLSRLHFPRGSLPLCPYSGSSAEAEAYQRLRGVRAASELWCELHDLAEHLLPIVSKRGRRPDDEAGRSLMLRNAADRL
RASFVRAVELSKPSADAQDKGRSDGAGDKQKGGDAARDGGEMAGLSQLLFLPSPALHRPPIRVGPTEDCLNSDMTSSDAA
VPHAYGSSSSSSDEAGVPGRRRSRRRRCVHAWRKENVEARARAQSNNLRAAVSAVLLDRWLPIGDARNTCTPPGELEERL
LAAVLAAAHWCCLWHDSPCGAGSLYADIYAEDIFATGAPQR
>P09311 ~~~~~~Virion protein US10 homolog~~~
MNLCGSRGEHPGGEYAGLYCTRHDTPAHQALMNDAERYFAAALCAISTEAYEAFIHSPSERPCASLWGRAKDAFGRMCGE
LAADRQRPPSVPPIRRAVLSLLREQCMPDPQSHLELSERLILMAYWCCLGHAGLPTIGLSPDNKCIRAELYDRPGGICHR
LFDAYLGCGSLGVPRTYERS
>P09727 ~~~~~~Unique short US11 glycoprotein~~~
MNLVMLILALWAPVAGSMPELSLTLFDEPPPLVETEPLPPLSDVSEYRVEYSEARCVLRSGGRLEALWTLRGNLSVPTPT
PRVYYQTLEGYADRVPTPVEDVSESLVAKRYWLRDYRVPQRTKLVLFYFSPCHQCQTYYVECEPRCLVPWVPLWSSLEDI
ERLLFEDRRLMAYYALTIKSAQYTLMMVAVIQVFWGLYVKGWLHRHFPWMFSDQW
>Q6SVZ5 ~~~~~~Membrane glycoprotein US11~~~
MNLVMLILALWAPVAGSMPELSLTLFDEPPPLVETEPLPPLPDVSEYRVEYSEARCVLRSGGRLEALWTLRGNLSVPTPT
PRVYYQTLEGYADRVPTPVEDISESLVAKRYWLRDYRVPQRTKLVLFYFSPCHQCQTYYVECEPRCLVPWVPLWSSLEDI
ERLLFEDRRLMAYYALTIKSAQYTLMMVAVIQVFWGLYVKGWLHRHFPWMFSDQW
>Q8UZK5 ~~~~~~Unique short US11 glycoprotein~~~
MNLIMLILALWAPVAGSMPELSLTLFDEPPPLVETEPLPPLPDVSEYRVEYSEARCVLRSGGRLEALWTLRGNLSVPTPT
PRVYYQTLEGYADRVPTPVEDISESLVAKRYWLRDYRVPQRTKLVLFYFSPCHQCQTYYVECEPRCLVPWVPLWSSLEDI
ERLLFEDRRLMAYYALTIKSAQYTLMMVAVIQVFWGLYVKGWLHRHFPWMFSDQW
>P69336 ~~~~~~Transmembrane protein HWLF4~~~
MLHVVPLEWTVEEVVPYLERLAVWLRASVLVAFQLTATVALSVLSWWLMPPPVAELCERGRDDDPPPLSHLSLVVPVGCL
FLLLRGPSIDRCPRKLPLLLAYCLPHALAFLTLLMCQPSPQAFVGAALLALAVDLSCLGASLLGCDPGASLRRLWLPSVL
SLLCATALGLWLLRAAAPFFLGLHATTLLTVTLMLIHDLSLITCQSSFPESFQPSLRLYVENVALFIGMYHLLRLWLWSP
>F5HAR3 ~~~~~~Transmembrane protein US19~~~
MLHVVPLEWTVEEVVPYLERLAVWLRASVLVAFQLTATVALSVLSWWLMPPPVAELCERGRDDDPPSLSHLSLVVPVGCL
FLLLRGPSIDRCPRKLPLLLAYCLPHALAFLTLLMCQPSPQAFVGAALLALAVDLSCLGASLLGCDPGASLRRLWLPSVL
SLLCATALGLWLLRAAAPFFLGLHATTLLTVTLMLIHDLSLITCQSSFPESFQPSLRLYVENVALFIGMYHLLRLWLWSP
>P69337 ~~~~~~Transmembrane protein HWLF4~~~
MLHVVPLEWTVEEVVPYLERLAVWLRASVLVAFQLTATVALSVLSWWLMPPPVAELCERGRDDDPPPLSHLSLVVPVGCL
FLLLRGPSIDRCPRKLPLLLAYCLPHALAFLTLLMCQPSPQAFVGAALLALAVDLSCLGASLLGCDPGASLRRLWLPSVL
SLLCATALGLWLLRAAAPFFLGLHATTLLTVTLMLIHDLSLITCQSSFPESFQPSLRLYVENVALFIGMYHLLRLWLWSP
>Q03307 ~~~~~~Transmembrane protein HWLF3~~~
MRVISRARSACTWTSCTSLSPCSTSCPPSPAAPTLLRRRSLPQQRRRPSSSPNRRVRGVTTSPCPTRSLVYKRRVGAPQR
LCAETVATMQAQEANALLLSRMEALEWFKKFTVWLRVYAIFIFQLAFSFGLGSVFWLGFPQNRNFCVENYSFFLTVLVPI
VCMFITYTLGNEHPSNATVLFIYLLANSLTAAIFQMCSESRVLVGSYVMTLALFISFTGLAFLGGRDRRRWKCISCVYVV
MLLSFLTLALLSDADWLQKIVVTLCAFSISFFLGILAYDSLMVIFFCPPNQCIRHAVCLYLDSMAIFLTLLLMLSGPRWI
SLSDGVPLDNGTLTAASTTGKS
>P09703 ~~~~~~G-protein coupled receptor homolog US27~~~
MTTSTNNQTLTQVSNMTNHTLNSTEIYQLFEYTRLGVWLMCIVGTFLNVLVITTILYYRRKKKSPSDTYICNLAVADLLI
VVGLPFFLEYAKHHPKLSREVVCSGLNACFYICLFAGVCFLINLSMDRYCVIVWGVELNRVRNNKRATCWVVIFWILAVL
MGMPHYLMYSHTNNECVGEFANETSGWFPVFLNTKVNICGYLAPIALMAYTYNRMVRFIINYVGKWHMQTLHVLLVVVVS
FASFWFPFNLALFLESIRLLAGVYNDTLQNVIIFCLYVGQFLAYVRACLNPGIYILVGTQMRKDMWTTLRVFACCCVKQE
IPYQDIDIELQKDIQRRAKHTKRTHYDRKNAPMESGEEEFLL
>P69332 ~~~~~~G-protein coupled receptor homolog US28~~~
MTPTTTTAELTTEFDYDEDATPCVFTDVLNQSKPVTLFLYGVVFLFGSIGNFLVIFTITWRRRIQCSGDVYFINLAAADL
LFVCTLPLWMQYLLDHNSLASVPCTLLTACFYVAMFASLCFITEIALDRYYAIVYMRYRPVKQACLFSIFWWIFAVIIAI
PHFMVVTKKDNQCMTDYDYLEVSYPIILNVELMLGAFVIPLSVISYCYYRISRIVAVSQSRHKGRIVRVLIAVVLVFIIF
WLPYHLTLFVDTLKLLKWISSSCEFERSLKRALILTESLAFCHCCLNPLLYVFVGTKFRQELHCLLAEFRQRLFSRDVSW
YHSMSFSRRSSPSRRETSSDTLSDEVCRVSQIIP
>P69333 ~~~~~~G-protein coupled receptor homolog US28~~~
MTPTTTTAELTTEFDYDEDATPCVFTDVLNQSKPVTLFLYGVVFLFGSIGNFLVIFTITWRRRIQCSGDVYFINLAAADL
LFVCTLPLWMQYLLDHNSLASVPCTLLTACFYVAMFASLCFITEIALDRYYAIVYMRYRPVKQACLFSIFWWIFAVIIAI
PHFMVVTKKDNQCMTDYDYLEVSYPIILNVELMLGAFVIPLSVISYCYYRISRIVAVSQSRHKGRIVRVLIAVVLVFIIF
WLPYHLTLFVDTLKLLKWISSSCEFERSLKRALILTESLAFCHCCLNPLLYVFVGTKFRQELHCLLAEFRQRLFSRDVSW
YHSMSFSRRSSPSRRETSSDTLSDEVCRVSQIIP
>P09708 ~~~~~~Uncharacterized protein HHRF7~~~
MAMYTSESERDWRRVIHDSHGLWCDCGDWREHLYCVYDSHFQRRPTTRAERRAANWRRQMRRLHRLWCFCQDWKCHALYA
EWDGKESDDESSASSSGEAPEQQVPAWKTVRAFSRAYHHRINRGLRGTPPPRNLPGYEHASEGWRFCSRRERREDDLRTR
AEPDRVVFQLGGVPPRRHRETYV
>O09802 ~~~~~~Protein US8.5~~~
MDPALRSYHQRLRLYTPVAMGINLAASSQPLDPEGPIAVTPRPPIRPSSGKAPHPEAPRRSPNWATAGEVDVGDELIAIS
DERGPPRHDRPPLATSTAPSPHPRPPGYTAVVSPMALQAVDAPSLFVAWLAARWLRGASGLGAVLCGIAWYVTSIARGA
>Q08098 ~~~US9~~~Envelope protein US9~~~
MERSHKASCGCFEGMESPRSVVNENYRGADEADAAPPSPPPEGSIVSIPILELTIEDAPASAEATGTAAAAPAGRTPDAN
AAPGGYVPVPAADADCYYSESDSETAGEFLIRMGRQQRRRHRRRRCMIAAALTCIGLGACAAAAAAGAVLALEVVPRP
>P32513 ~~~~~~Envelope protein US9 homolog~~~
MEKAEAAAVVIPLSVSNPSYRGSGMSDQEVSEEQSAGDAWVSAAMAAAEAVAAAATSTGIDNTNDYTYTAASENGDPGFT
LGDNTYGPNGAASGCPSPPSPEVVGLEMVVVSSLAPEIAAAVPADTISASAAAPATRVDDGNAPLLGPGQAQDYDSESGC
YYSESDNETASMFIRRVGRRQARRHRRRRVALTVAGVILVVVLCAISGIVGAFLARVFP
>P06481 ~~~US9~~~Envelope protein US9~~~
MTSRLSDPNSSARSDMSVPLYPTASPVSVEAYYSESEDEAANDFLVRMGRQQSVLRRRRRRTRCVGMVIACLLVAVLSGG
FGALLMWLLR
>P11313 ~~~~~~Envelope protein US9 homolog~~~
MPTAAPADMDTFDPSAPVPTSVSNPAADVLLAPKGPRSPLRPQDDSDCYYSESDNETPSEFLRRVGRRQAARRRRRRCLM
GVAISAAALVICSLSALIGGIIARHV
>P09312 ~~~~~~Envelope protein US9~~~
MAGQNTMEGEAVALLMEAVVTPRAQPNNTTITAIQPSRSAEKCYYSDSENETADEFLRRIGKYQHKIYHRKKFCYITLII
VFVFAMTGAAFALGYITSQFVG
>Q5UNU2 3.-.-.-~~~~~~Putative UV-damage endonuclease~~~
MNQNLIRLGYACLNSDLRNYDIFTSRKPILKTVKSQGFDYVKETIIRNLRDLFTIIIYNESHGIRFFRISSSIFPHLGNP
LLPDSDYDLSFAKNLIKEIGSYAKINGHRLTMHPGQFVQLGSNNEEVVRRSFVELQNHATLLEMLGYSSLDSSVLIVHGG
GTFGDKETTLERWKSNFRKLPENIRQLICLENDENSYGILDLLPVCEELNVPFCLDIFHNRVSKNRIPLTKKLMKRIINT
WKRRNMTPKMHFSNQEPGLRRGAHSKTINELPEYLFRIPDMFQTSLDIILEVKDKEKSVLKMYFKYFDIETNIDGRNNFI
LKKDYSLKKN
>P20703 3.6.4.12~~~uvsW~~~ATP-dependent DNA helicase uvsW~~~
MDIKVHFHDFSHVRIDCEESTFHELRDFFSFEADGYRFNPRFRYGNWDGRIRLLDYNRLLPFGLVGQIKKFCDNFGYKAW
IDPQINEKEELSRKDFDEWLSKLEIYSGNKRIEPHWYQKDAVFEGLVNRRRILNLPTSAGKSLIQALLARYYLENYEGKI
LIIVPTTALTTQMADDFVDYRLFSHAMIKKIGGGASKDDKYKNDAPVVVGTWQTVVKQPKEWFSQFGMMMNDECHLATGK
SISSIISGLNNCMFKFGLSGSLRDGKANIMQYVGMFGEIFKPVTTSKLMEDGQVTELKINSIFLRYPDEFTTKLKGKTYQ
EEIKIITGLSKRNKWIAKLAIKLAQKDENAFVMFKHVSHGKAIFDLIKNEYDKVYYVSGEVDTETRNIMKTLAENGKGII
IVASYGVFSTGISVKNLHHVVLAHGVKSKIIVLQTIGRVLRKHGSKTIATVWDLIDSAGVKPKSANTKKKYVHLNYLLKH
GIDRIQRYADEKFNYVMKTVNLISFGPLEKKMLLEFKQFLYEASIDEFMGKIASCQTLEGLEELEAYYKKRVKETELKDT
DDISVRDALAGKRAELEDSDDEVEESF
>P04529 ~~~UVSX~~~Recombination and repair protein~~~
MSDLKSRLIKASTSKLTAELTASKFFNEKDVVRTKIPMMNIALSGEITGGMQSGLLILAGPSKSFKSNFGLTMVSSYMRQ
YPDAVCLFYDSEFGITPAYLRSMGVDPERVIHTPVQSLEQLRIDMVNQLDAIERGEKVVVFIDSLGNLASKKETEDALNE
KVVSDMTRAKTMKSLFRIVTPYFSTKNIPCIAINHTYETQEMFSKTVMGGGTGPMYSADTVFIIGKRQIKDGSDLQGYQF
VLNVEKSRTVKEKSKFFIDVKFDGGIDPYSGLLDMALELGFVVKPKNGWYAREFLDEETGEMIREEKSWRAKDTNCTTFW
GPLFKHQPFRDAIKRAYQLGAIDSNEIVEAEVDELINSKVEKFKSPESKSKSAADLETDLEQLSDMEEFNE
>P04537 ~~~uvsY~~~Recombination protein uvsY~~~
MRLEDLQEELKKDVFIDSTKLQYEAANNVMLYSKWLNKHSSIKKEMLRIEAQKKVALKARLDYYSGRGDGDEFSMDRYEK
SEMKTVLSADKDVLKVDTSLQYWGILLDFCSGALDAIKSRGFAIKHIQDMRAFEAGK
>P0DJX2 ~~~~~~U exon protein~~~
MKIVGAEGQEHEEFDIPFKLWRKFAAKRRLRYQSWEEGKEVMLNKLDKDLLTDFKAFAARFSSRPRPSKIFGTSSSEAIS
GEGNGQSGRGAARNHPRARTRCGATSTNHGGRVVPVAVAAASPRAPKKAAEAASRVRGRRRLVTRCAGAAHTQPAAIDLD
GGFGHCVQKEKEAPLSQARAPAIPRGDRGQRGRKRRCGATNGGFQQPTGANQARQGR
>A8W995 ~~~~~~U exon protein~~~
MKIVGADGQEQEETDIPFRLWRKFAARRKLQYQSWEEGKEVLLNKLDRNLLTDFKAFAARFSSRPRPSKIFGTSLSEAIS
GEGNGQSGRGAARNHPRARTRCGATSPNHGGRVVPVPVAAASPGAPKKADEAAYRVRGRGRLITRRAGAAHTQPAAIDLG
GGFGHCAQEEKEAPFSQARAPAITRGNRGQRGRKRRCGATNGGFQQPTGANQAWQRR
>Q6TVP7 ~~~~~~Protein ORFV073~~~
MARRARFSPRLHIPAARAALGPHLHFPRRRLVLRHCGVRAFVGDAIVSKKEMTNPLCAQAIVFGNGFVETYVRSLDPRLL
GAYHALSRPVCERPLFAVRGWRRLFPIVARRLDAVERRTRRVLRSMCRTYTTCMSADRAAAVSHPVMRRRWFGHRATKTR
RARLRRRCRNRSSKRRAERRKRFCNYCP
>B4YNF9 ~~~~~~Structural protein V19~~~
MHHSFFYDSNTEKISLIAQAAYYDRTLTTPIEIYCNVNLFTFFDSIKHIGLGYNTPTGRDILFDVRFLGNNYYQDPETAP
SYPPEFIQMQQEYPTLSNWNAVKTIQLVSNLLPINKESIPSFRNSNVGIINAQGILADFVPLVTNGPEARISIDFVATGP
WRLIDMFGSVPIYMVDLYVYWTDQTGGQYLINIPPGRILTCKLVFIKKSLSKYLVSEK
>P03787 ~~~~~~Fusion protein 5.5/5.7~~~
MAMTKKFKVSFDVTAKMSSDVQAILEKDMLHLCKQVGSGAIVPNGKQKEMIVQFLTHGMEGLMTFVVRTSFREAIKDMHE
EYADKDSFKQSPATVREVFLMSDYLKVLQAIKSCPKTFQSNYVRNNASLVAEAASRGHISCLTTSGRNGGAWEITASGTR
FLKRMGGCV
>B4YNE8 ~~~~~~Structural protein V8~~~
MSVSTLFQQNNNNIYNKSNTLTNTPSNPTGNTNTLWSNSGFNPPHLMYGASDVTAAINNIAFETGTFNLQLSGPWASPIS
HAVSYTKINNLVNLTIPTYQAQATTLASISSIVGALPTNLRPVNNPEIDFEIFVLDNGTRTTNPGLITLLSNGQILIYKD
NNLGQFTVGSGGSGFNPFSITYMI
>P03551 ~~~ORF III~~~Virion-associated protein~~~
MANLNQIQKEVSEILSDQKSMKADIKAILELLGSQNPIKESLETVAAKIVNDLTKLINDCPCNKEILEALGTQPKEQLIE
QPKEKGKGLNLGKYSYPNYGVGNEELGSSGNPKALTWPFKAPAGWPNQF
>P03698 ~~~bet~~~Recombination protein bet~~~
MSTALATLAGKLAERVGMDSVDPQELITTLRQTAFKGDASDAQFIALLIVANQYGLNPWTKEIYAFPDKQNGIVPVVGVD
GWSRIINENQQFDGMDFEQDNESCTCRIYRKDRNHPICVTEWMDECRREPFKTREGREITGPWQSHPKRMLRHKAMIQCA
RLAFGFAGIYDKDEAERIVENTAYTAERQPERDITPVNDETMQEINTLLIALDKTWDDDLLPLCSQIFRRDIRASSELTQ
AEAVKALGFLKQKAAEQKVAA
>Q66677 ~~~~~~CARD domain-containing protein E10~~~
MAEKYPLQAGDPCVTLTEEDIWDVERLCLEELRVLLVSHLKSHKHLDHLRAKKILSREDAEEVSSRATSRSRAGLLVDMC
QDHPRGFQCLKESCKNEVGQEHLVDLLERAFEKHCGDKLTQKWWESGADGRNPRRPPGSEDNSGYTALLPTNPSGGGPSI
GSGHSRPRGRDDSGGIGGGVYPFSHGGARVVGGGWGGWGESGGAGRGGSLLSGGHGGHPPHGGPGGGGRDYYGGGGSGYY
ESIPEPANFPNSGGGGRGGGVRYDAGGDGRLGGLPPDPQEVDDPSLSVQGRGGPAPDPPSPPLRTRRFFCC
>P07695 ~~~cox~~~Regulatory protein cox~~~
MSKQVTLMTDAIPYQEFAKLIGKSTGAVRRMIDKGKLPVIDMTDPQSASGRAGEYWVYLPAWNNGLKLAYESRPKEIRDG
WLMWLGLGEPR
>P68639 ~~~C3L~~~Complement control protein C3~~~
MKVESVTFLTLLGIGCVLSCCTIPSRPINMKFKNSVETDANANYNIGDTIEYLCLPGYRKQKMGPIYAKCTGTGWTLFNQ
CIKRRCPSPRDIDNGQLDIGGVDFGSSITYSCNSGYHLIGESKSYCELGSTGSMVWNPEAPICESVKCQSPPSISNGRHN
GYEDFYTDGSVVTYSCNSGYSLIGNSGVLCSGGEWSDPPTCQIVKCPHPTISNGYLSSGFKRSYSYNDNVDFKCKYGYKL
SGSSSSTCSPGNTWKPELPKCVR
>P68638 ~~~~~~Complement control protein C3~~~
MKVESVTFLTLLGIGCVLSCCTIPSRPINMKFKNSVETDANANYNIGDTIEYLCLPGYRKQKMGPIYAKCTGTGWTLFNQ
CIKRRCPSPRDIDNGQLDIGGVDFGSSITYSCNSGYHLIGESKSYCELGSTGSMVWNPEAPICESVKCQSPPSISNGRHN
GYEDFYTDGSVVTYSCNSGYSLIGNSGVLCSGGEWSDPPTCQIVKCPHPTISNGYLSSGFKRSYSYNDNVDFKCKYGYKL
SGSSSSTCSPGNTWKPELPKCVR
>Q77Q36 ~~~~~~viral cyclin homolog~~~
MATANNPPSGLLDPTLCEDRIFYNILEIEPRFLTSDSVFGTFQQSLTSHMRKLLGTWMFSVCQEYNLEPNVVALALNLLD
RLLLIKQVSKEHFQKTGSACLLVASKLRSLTPISTSSLCYAAADSFSRQELIDQEKELLEKLAWRTEAVLATDVTSFLLL
KLLGGSQHLDFWHHEVNTLITKALVDPKTGSLPASIISAAGCALLVPANVIPQDTHSGGVVPQLASILGCDVSVLQAAVE
QILTSVSDFDLRILDSY
>P21680 ~~~dhr~~~Protein dhr~~~
MSRDELRIVLGAMIPNMEEGFEIKTRDGAILRVDPEWECCKEFKDGLKAEIIKQLKSKPAVVFGYS
>P03116 3.6.4.12~~~E1~~~Replication protein E1~~~
MANDKGSNWDSGLGCSYLLTEAECESDKENEEPGAGVELSVESDRYDSQDEDFVDNASVFQGNHLEVFQALEKKAGEEQI
LNLKRKVLGSSQNSSGSEASETPVKRRKSGAKRRLFAENEANRVLTPLQVQGEGEGRQELNEEQAISHLHLQLVKSKNAT
VFKLGLFKSLFLCSFHDITRLFKNDKTTNQQWVLAVFGLAEVFFEASFELLKKQCSFLQMQKRSHEGGTCAVYLICFNTA
KSRETVRNLMANTLNVREECLMLQPAKIRGLSAALFWFKSSLSPATLKHGALPEWIRAQTTLNESLQTEKFDFGTMVQWA
YDHKYAEESKIAYEYALAAGSDSNARAFLATNSQAKHVKDCATMVRHYLRAETQALSMPAYIKARCKLATGEGSWKSILT
FFNYQNIELITFINALKLWLKGIPKKNCLAFIGPPNTGKSMLCNSLIHFLGGSVLSFANHKSHFWLASLADTRAALVDDA
THACWRYFDTYLRNALDGYPVSIDRKHKAAVQIKAPPLLVTSNIDVQAEDRYLYLHSRVQTFRFEQPCTDESGEQPFNIT
DADWKSFFVRLWGRLDLIDEEEDSEEDGDSMRTFTCSARNTNAVD
>P04014 3.6.4.12~~~E1~~~Replication protein E1~~~
MADDSGTENEGSGCTGWFMVEAIVEHTTGTQISEDEEEEVEDSGYDMVDFIDDRHITQNSVEAQALFNRQEADAHYATVQ
DLKRKYLGSPYVSPISNVANAVESEISPRLDAIKLTTQPKKVKRRLFETRELTDSGYGYSEVEAATQVEKHGDPENGGDG
QERDTGRDIEGEGVEHREAEAVDDSTREHADTSGILELLKCKDIRSTLHGKFKDCFGLSFVDLIRPFKSDRTTCADWVVA
GFGIHHSIADAFQKLIEPLSLYAHIQWLTNAWGMVLLVLIRFKVNKSRCTVARTLGTLLNIPENHMLIEPPKIQSGVRAL
YWFRTGISNASTVIGEAPEWITRQTVIEHSLADSQFKLTEMVQWAYDNDICEESEIAFEYAQRGDFDSNARAFLNSNMQA
KYVKDCAIMCRHYKHAEMKKMSIKQWIKYRGTKVDSVGNWKPIVQFLRHQNIEFIPFLSKLKLWLHGTPKKNCIAIVGPP
DTGKSCFCMSLIKFLGGTVISYVNSCSHFWLQPLTDAKVALLDDATQPCWTYMDTYMRNLLDGNPMSIDRKHRALTLIKC
PPLLVTSNIDISKEEKYKYLHSRVTTFTFPNPFPFDRNGNAVYELSDANWKCFFERLSSSLDIEDSEDEEDGSNSQAFRC
VPGSVVRTL
>P03114 3.6.4.12~~~E1~~~Replication protein E1~~~
MADPAGTNGEEGTGCNGWFYVEAVVEKKTGDAISDDENENDSDTGEDLVDFIVNDNDYLTQAETETAHALFTAQEAKQHR
DAVQVLKRKYLVSPLSDISGCVDNNISPRLKAICIEKQSRAAKRRLFESEDSGYGNTEVETQQMLQVEGRHETETPCSQY
SGGSGGGCSQYSSGSGGEGVSERHTICQTPLTNILNVLKTSNAKAAMLAKFKELYGVSFSELVRPFKSNKSTCCDWCIAA
FGLTPSIADSIKTLLQQYCLYLHIQSLACSWGMVVLLLVRYKCGKNRETIEKLLSKLLCVSPMCMMIEPPKLRSTAAALY
WYKTGISNISEVYGDTPEWIQRQTVLQHSFNDCTFELSQMVQWAYDNDIVDDSEIAYKYAQLADTNSNASAFLKSNSQAK
IVKDCATMCRHYKRAEKKQMSMSQWIKYRCDRVDDGGDWKQIVMFLRYQGVEFMSFLTALKRFLQGIPKKNCILLYGAAN
TGKSLFGMSLMKFLQGSVICFVNSKSHFWLQPLADAKIGMLDDATVPCWNYIDDNLRNALDGNLVSMDVKHRPLVQLKCP
PLLITSNINAGTDSRWPYLHNRLVVFTFPNEFPFDENGNPVYELNDKNWKSFFSRTWSRLSLHEDEDKENDGDSLPTFKC
VSGQNTNTL
>P06789 3.6.4.12~~~E1~~~Replication protein E1~~~
MADPEGTDGEGTGCNGWFYVQAIVDKKTGDVISDDEDENATDTGSDMVDFIDTQGTFCEQAELETAQALFHAQEVHNDAQ
VLHVLKRKFAGGSTENSPLGERLEVDTELSPRLQEISLNSGQKKAKRRLFTISDSGYGCSEVEATQIQVTTNGEHGGNVC
SGGSTEAIDNGGTEGNNSSVDGTSDNSNIENVNPQCTIAQLKDLLKVNNKQGAMLAVFKDTYGLSFTDLVRNFKSDKTTC
TDWVTAIFGVNPTIAEGFKTLIQPFILYAHIQCLDCKWGVLILALLRYKCGKSRLTVAKGLSTLLHVPETCMLIQPPKLR
SSVAALYWYRTGISNISEVMGDTPEWIQRLTIIQHGIDDSNFDLSEMVQWAFDNELTDESDMAFEYALLADSNSNAAAFL
KSNCQAKYLKDCATMCKHYRRAQKRQMNMSQWIRFRCSKIDEGGDWRPIVQFLRYQQIEFITFLGALKSFLKGTPKKNCL
VFCGPANTGKSYFGMSFIHFIQGAVISFVNSTSHFWLEPLTDTKVAMLDDATTTCWTYFDTYMRNALDGNPISIDRKHKP
LIQLKCPPILLTTNIHPAKDNRWPYLESRITVFEFPNAFPFDKNGNPVYEINDKNWKCFFERTWSRLDLHEEEEDADTEG
NPFGTFKLRAGQNHRPL
>P06421 3.6.4.12~~~E1~~~Replication protein E1~~~
MADPEGTNGAGMGCTGWFEVEAVIERRTGDNISEDEDETADDSGTDLLEFIDDSMENSIQADTEAARALFNIQEGEDDLN
AVCALKRKFAACSQSAAEDVVDRAANPCRTSINKNKECTYRKRKIDELEDSGYGNTEVETQQMVQQVESQNGDTNLNDLE
SSGVGDDSEVSCETNVDSCENVTLQEISNVLHSSNTKANILYKFKEAYGISFMELVRPFKSDKTSCTDWCITGYGISPSV
AESLKVLIKQHSLYTHLQCLTCDRGIIILLLIRFRCSKNRLTVAKLMSNLLSIPETCMVIEPPKLRSQTCALYWFRTAMS
NISDVQGTTPEWIDRLTVLQHSFNDNIFDLSEMVQWAYDNELTDDSDIAYYYAQLADSNSNAAAFLKSNSQAKIVKDCGI
MCRHYKKAEKRKMSIGQWIQSRCEKTNDGGNWRPIVQLLRYQNIEFTAFLGAFKKFLKGIPKKSCMLICGPANTGKSYFG
MSLIQFLKGCVISCVNSKSHFWLQPLSDAKIGMIDDVTPISWTYIDDYMRNALDGNEISIDVKHRALVQLKCPPLLLTSN
TNAGTDSRWPYLHSRLTVFEFKNPFPFDENGNPVYAINDENWKSFFSRTWCKLDLIEEEDKENHGGNISTFKCSAGENTR
SLRS
>P03113 3.6.4.12~~~E1~~~Replication protein E1~~~
MADDSGTENEGSGCTGWFMVEAIVQHPTGTQISDDEDEEVEDSGYDMVDFIDDSNITHNSLEAQALFNRQEADTHYATVQ
DLKRKYLGSPYVSPINTIAEAVESEISPRLDAIKLTRQPKKVKRRLFQTRELTDSGYGYSEVEAGTGTQVEKHGVPENGG
DGQEKDTGRDIEGEEHTEAEAPTNSVREHAGTAGILELLKCKDLRAALLGKFKECFGLSFIDLIRPFKSDKTTCLDWVVA
GFGIHHSISEAFQKLIEPLSLYAHIQWLTNAWGMVLLVLLRFKVNKSRSTVARTLATLLNIPENQMLIEPPKIQSGVAAL
YWFRTGISNASTVIGEAPEWITRQTVIEHGLADSQFKLTEMVQWAYDNDICEESEIAFEYAQRGDFDSNARAFLNSNMQA
KYVKDCATMCRHYKHAEMRKMSIKQWIKHRGSKIEGTGNWKPIVQFLRHQNIEFIPFLTKFKLWLHGTPKKNCIAIVGPP
DTGKSYFCMSLISFLGGTVISHVNSSSHFWLQPLVDAKVALLDDATQPCWIYMDTYMRNLLDGNPMSIDRKHKALTLIKC
PPLLVTSNIDITKEDKYKYLHTRVTTFTFPNPFPFDRNGNAVYELSNTNWKCFFERLSSSLDIQDSEDEEDGSNSQAFRC
VPGTVVRTL
>P03122 ~~~E2~~~Regulatory protein E2~~~
METACERLHVAQETQMQLIEKSSDKLQDHILYWTAVRTENTLLYAARKKGVTVLGHCRVPHSVVCQERAKQAIEMQLSLQ
ELSKTEFGDEPWSLLDTSWDRYMSEPKRCFKKGARVVEVEFDGNASNTNWYTVYSNLYMRTEDGWQLAKAGADGTGLYYC
TMAGAGRIYYSRFGDEAARFSTTGHYSVRDQDRVYAGVSSTSSDFRDRPDGVWVASEGPEGDPAGKEAEPAQPVSSLLGS
PACGPIRAGLGWVRDGPRSHPYNFPAGSGGSILRSSSTPVQGTVPVDLASRQEEEEQSPDSTEEEPVTLPRRTTNDGFHL
LKAGGSCFALISGTANQVKCYRFRVKKNHRHRYENCTTTWFTVADNGAERQGQAQILITFGSPSQRQDFLKHVPLPPGMN
ISGFTASLDF
>P06422 ~~~E2~~~Regulatory protein E2~~~
MENLSERFNVLQDQLMNIYEAAEQTLEAQIAHWLLLRKEAVLLYFARQKGITRIGYQPVPPLAVSEAKAKQAIGIMLQLQ
SLQKSEFADEPWTLVDTSIETYKNAPENHFKKGATPVEVIYDKQPDNANVYTMWKHIYYTDADDKWHKTTSGVNQTGIYY
MQGSFRHYYVVFADDARRYSATGEWEVKINKDTVFAPVTSSTPPGSPPGQADTDTAAKTPTTSADSTSRQQRSPAKQPQQ
TETKGRRYGRRPSSRTRPQKEQRRSRSRHRTRSRSRSLSRVRAVGSTTVSRSRSSSLTKAVRPRSRSRSRGRATATSRRR
AGRGSPRRRRSTSRSPSTNTFKRSQRGGGRRGRGRGSRGRRERSSSTSPTPTKRSRGESSRLRGVSPSEVGRSVQSVSAK
HTGRLGRLLDEAIDPPVILVRGEANTLKCFRNRARVRYRGLFKYFSTTWSWVAGDSTERLGRSRMLILFTSAGQRKDFDE
TVKYPKGVDTSYGNLDSL
>P04015 ~~~E2~~~Regulatory protein E2~~~
MEAIAKRLDACQDQLLELYEENSIDIHKHIMHWKCIRLESVLLHKAKQMGLSHIGLQVVPPLTVSETKGHNAIEMQMHLE
SLAKTQYGVEPWTLQDTSYEMWLTPPKRCFKKQGNTVEVKFDGCEDNVMEYVVWTHIYLQDNDSWVKVTSSVDAKGIYYT
CGQFKTYYVNFNKEAQKYGSTNHWEVCYGSTVICSPASVSSTVREVSIAEPTTYTPAQTTAPTVSACTTEDGVSAPPRKR
ARGPSTNNTLCVANIRSVDSTINNIVTDNYNKHQRRNNCHSAATPIVQLQGDSNCLKCFRYRLNDKYKHLFELASSTWHW
ASPEAPHKNAIVTLTYSSEEQRQQFLNSVKIPPTIRHKVGFMSLHLL
>P03120 ~~~E2~~~Regulatory protein E2~~~
METLCQRLNVCQDKILTHYENDSTDLRDHIDYWKHMRLECAIYYKAREMGFKHINHQVVPTLAVSKNKALQAIELQLTLE
TIYNSQYSNEKWTLQDVSLEVYLTAPTGCIKKHGYTVEVQFDGDICNTMHYTNWTHIYICEEASVTVVEGQVDYYGLYYV
HEGIRTYFVQFKDDAEKYSKNKVWEVHAGGQVILCPTSVFSSNEVSSPEIIRQHLANHPAATHTKAVALGTEETQTTIQR
PRSEPDTGNPCHTTKLLHRDSVDSAPILTAFNSSHKGRINCNSNTTPIVHLKGDANTLKCLRYRFKKHCTLYTAVSSTWH
WTGHNVKHKSAIVTLTYDSEWQRDQFLSQVKIPKTITVSTGFMSI
>P06790 ~~~E2~~~Regulatory protein E2~~~
MQTPKETLSERLSCVQDKIIDHYENDSKDIDSQIQYWQLIRWENAIFFAAREHGIQTLNHQVVPAYNISKSKAHKAIELQ
MALQGLAQSAYKTEDWTLQDTCEELWNTEPTHCFKKGGQTVQVYFDGNKDNCMTYVAWDSVYYMTDAGTWDKTATCVSHR
GLYYVKEGYNTFYIEFKSECEKYGNTGTWEVHFGNNVIDCNDSMCSTSDDTVSATQLVKQLQHTPSPYSSTVSVGTAKTY
GQTSAATRPGHCGLAEKQHCGPVNPLLGAATPTGNNKRRKLCSGNTTPIIHLKGDRNSLKCLRYRLRKHSDHYRDISSTW
HWTGAGNEKTGILTVTYHSETQRTKFLNTVAIPDSVQILVGYMTM
>P17383 ~~~E2~~~Regulatory protein E2~~~
METLSQRLNVCQDKILEHYENDSKRLCDHIDYWKHIRLECVLMYKAREMGIHSINHQVVPALSVSKAKALQAIELQMMLE
TLNNTEYKNEDWTMQQTSLELYLTAPTGCLKKHGYTVEVQFDGDVHNTMHYTNWKFIYLCIDGQCTVVEGQVNCKGIYYV
HEGHITYFVNFTEEAKKYGTGKKWEVHAGGQVIVFPESVFSSDEISFAGIVTKLPTANNTTTSNSKTCALGTSEGVRRAT
TSTKRPRTEPEHRNTHHPNKLLRGDSVDSVNCGVISAAACTNQTRAVSCPATTPIIHLKGDANILKCLRYRLSKYKQLYE
QVSSTWHWTCTDGKHKNAIVTLTYISTSQRDDFLNTVKIPNTVSVSTGYMTI
>Q84294 ~~~E2~~~Regulatory protein E2~~~
MEAIAKRLDACQEQLLELYEENSTDLNKHVLHWKCMRHESVLLYKAKQMGLSHIGMQVVPPLKVSEAKGHNAIEMQMHLE
SLLKTEYSMEPWTLQETSYEMWQTPPKRCFKKRGKTVEVKFDGCANNTMDYVVWTDVYVQDTDSWVKVHSMVDAKGIYYT
CGQFKTYYVNFVKEAEKYGSTKQWEVCYGSTVICSPASVSSTTQEVSIPESTTYTPAQTSTPVSSSTQEDAVQTPPRKRA
RGVQQSPCNALCVAHIGPVDSGNHNLITNNHDQHQRRNNSNSSATPIVQFQGESNCLKCFRYRLNDKHRHLFDLISSTWH
WASPKAPHKHAIVTVTYHSEEQRQQFLNVVKIPPTIRHKLGFMSLHLL
>P06922 ~~~E4~~~Protein E4~~~
MADPAAATKYPLLKLLGSTWPTTPPRPIPKPSPWAPKKHRRLSSDQDQSQTPETPATPLSCCTETQWTVLQSSLHLTAHT
KDGLTVIVTLHP
>P0CK45 ~~~E5~~~Protein E5~~~
MPNLWFLLFLGLVAAMQLLLLLFLLLFFLVYWDHFECSCTGLPF
>P06927 ~~~E5~~~Probable protein E5~~~
MTNLDTASTTLLACFLLCFCVLLCVCLLIRPLLLSVSTYTSLIILVLLLWITAASAFRCFIVYIIFVYIPLFLIHTHARF
LIT
>P06931 ~~~E6~~~Protein E6~~~
MDLKPFARTNPFSGLDCLWCREPLTEVDAFRCMVKDFHVVIREGCRYGACTICLENCLATERRLWQGVPVTGEEAELLHG
KTLDRLCIRCCYCGGKLTKNEKHRHVLFNEPFCKTRANIIRGRCYDCCRHGSRSKYP
>P36799 ~~~E6~~~Protein E6~~~
MAVAMSMDANCPKNIFLLCRNTGIGFDDLRLHCIFCTKQLTTTELQAFALRELNVVWRRGAPYGACARCLLVEGIARRLK
YWEYSYYVSGVEEETKQSIDTQQIRCYMCHKPLVKEEKDRHRNEKRRLHKISGHWRGSCQYCWSRCTVRIPR
>P06930 ~~~E6~~~Protein E6~~~
MAEGAEHQQKLTEKDKAELPLSIRDLAEALGIPVIDCLIPCNFCGNFLNYLEACEFDYKRLSLIWKDYCVFACCRVCCGA
TATYEFNQFYEQTVLGRDIELASGLSIFDIDIRCQTCLAFLDIIEKLDCCGRGLPFHKVRNAWKGICRQCKHFYHDW
>P04019 ~~~E6~~~Protein E6~~~
MESKDASTSATSIDQLCKTFNLSLHTLQIQCVFCRNALTTAEIYAYAYKNLKVVWRDNFPFAACACCLELQGKINQYRHF
NYAAYAPTVEEETNEDILKVLIRCYLCHKPLCEIEKLKHILGKARFIKLNNQWKGRCLHCWTTCMEDLLP
>P03126 ~~~E6~~~Protein E6~~~
MHQKRTAMFQDPQERPRKLPQLCTELQTTIHDIILECVYCKQQLLRREVYDFAFRDLCIVYRDGNPYAVCDKCLKFYSKI
SEYRHYCYSLYGTTLEQQYNKPLCDLLIRCINCQKPLCPEEKQRHLDKKQRFHNIRGRWTGRCMSCCRSSRTRRETQL
>P06463 ~~~E6~~~Protein E6~~~
MARFEDPTRRPYKLPDLCTELNTSLQDIEITCVYCKTVLELTEVFEFAFKDLFVVYRDSIPHAACHKCIDFYSRIRELRH
YSDSVYGDTLEKLTNTGLYNLLIRCLRCQKPLNPAEKLRHLNEKRRFHNIAGHYRGQCHSCCNRARQERLQRRRETQV
>P17386 ~~~E6~~~Protein E6~~~
MFKNPAERPRKLHELSSALEIPYDELRLNCVYCKGQLTETEVLDFAFTDLTIVYRDDTPHGVCTKCLRFYSKVSEFRWYR
YSVYGTTLEKLTNKGICDLLIRCITCQRPLCPEEKQRHLDKKKRFHNIGGRWTGRCIACWRRPRTETQV
>P06427 ~~~E6~~~Protein E6~~~
MFQDTEEKPRTLHDLCQALETTIHNIELQCVECKKPLQRSEVYDFAFADLTVVYREGNPFGICKLCLRFLSKISEYRHYN
YSVYGNTLEQTVKKPLNEILIRCIICQRPLCPQEKKRHVDLNKRFHNISGRWAGRCAACWRSRRRETAL
>P27228 ~~~E6~~~Protein E6~~~
MFQDPAERPYKLHDLCNEVEESIHEICLNCVYCKQELQRSEVYDFACYDLCIVYREGQPYGVCMKCLKFYSKISEYRWYR
YSVYGETLEKQCNKQLCHLLIRCITCQKPLCPVEKQRHLEEKKRFHNIGGRWTGRCMSCWKPTRRETEV
>P24835 ~~~E6~~~Protein E6~~~
MARFHNPAERPYKLPDLCTTLDTTLQDITIACVYCRRPLQQTEVYEFAFSDLYVVYRDGEPLAACQSCIKFYAKIRELRY
YSDSVYATTLENITNTKLYNLLIRCMCCLKPLCPAEKLRHLNSKRRFHKIAGSYTGQCRRCWTTKREDRRLTRRETQV
>P21735 ~~~E6~~~Protein E6~~~
MARFDDPKQRPYKLPDLCTELNTSLQDVSIACVYCKATLERTEVYQFAFKDLCIVYRDCIAYAACHKCIDFYSRIRELRY
YSNSVYGETLEKITNTELYNLLIRCLRCQKPLNPAEKRRHLKDKRRFHSIAGQYRGQCNTCCDQARQERLRRRRETQV
>P36813 ~~~E6~~~Protein E6~~~
MARPVKVCELAHHLNIPIWEVLLPCNFCTGFLTYQELLEFDYKDFNLLWKDGFVFGCCAACAYRSAYHEFTNYHQEIVVG
IEIEGRAAANIAEIVVRCLICLKRLDLLEKLDICAQHREFHRVRNRWKGVCRHCRVIE
>P26554 ~~~E6~~~Protein E6~~~
MFEDKRERPRTLHELCEALNVSMHNIQVVCVYCKKELCRADVYNVAFTEIKIVYRDNNPYAVCKQCLLFYSKIREYRRYS
RSVYGTTLEAITKKSLYDLSIRCHRCQRPLGPEEKQKLVDEKKRFHEIAGRWTGQCANCWQRTRQRNETQV
>P54667 ~~~E6~~~Protein E6~~~
MALFHNPEERPYKLPDLCRTLDTTLHDVTIDCVYCRRQLQRTEVYEFAFSDLCVVYRDGVPFAACQSCIKFYAKIRELRY
YSESVYATTLETITNTKLYNLLIRCMSCLKPLCPAEKLRHLTTKRRLHKIAGNFTGQCRHCWTSKREDRRRIRQETQV
>Q84291 ~~~E6~~~Protein E6~~~
MESANASTSATTIDQLCKTFNLSMHTLQINCVFCKNALTTAEIYSYAYKQLKVLFRGGYPYAACACCLEFHGKINQYRHF
DYAGYATTVEEETKQDILDVLIRCYLCHKPLCEVEKVKHILTKARFIKLNCTWKGRCLHCWTTCMEDMLP
>P06462 ~~~E6~~~Protein E6~~~
MESANASTSATTIDQLCKTFNLSMHTLQINCVFCKNALTTAEIYSYAYKHLKVLFRGGYPYAACACCLEFHGKINQYRHF
DYAGYATTVEEETKQDILDVLIRCYLCHKPLCEVEKVKHILTKARFIKLNCTWKGRCLHCWTTCMEDMLP
>P50804 ~~~E6~~~Protein E6~~~
MARFPNPAERPYKLPDLCTALDTTLHDITIDCVYCKTQLQQTEVYEFAFSDLFIVYRNGEPYAACQKCIKFHAKVRELRH
YSNSVYATTLESITNTKLYNLSIRCMSCLKPLCPAEKLRHVNTKRRFHQIAGSYTGQCRHCWTSNREDRRRIRRETQV
>P06933 ~~~E7~~~Protein E7~~~
MVQGPNTHRNLDDSPAGPLLILSPCAGTPTRSPAAPDAPDFRLPCHFGRPTRKRGPTTPPLSSPGKLCATGPRRVYSVTV
CCGNCGKELTFAVKTSSTSLLGFEHLLNSDLDLLCPRCESRERHGKR
>Q07857 ~~~E7~~~Protein E7~~~
MRGAAPTVADLNLELNDLVLPANLLSEEVLQSSDDEYEITEEESVVPFRIDTCCYRCEVAVRITLYAAELGLRTLEQLLV
EGKLTFCCTACARSLNRNGR
>P06465 ~~~E7~~~Protein E7~~~
MVGEMPALKDLVLQLEPSVLDLDLYCYEEVPPDDIEEELVSPQQPYAVVASCAYCEKLVRLTVLADHSAIRQLEELLLRS
LNIVCPLCTLQRQ
>P04020 ~~~E7~~~Protein E7~~~
MHGRLVTLKDIVLDLQPPDPVGLHCYEQLEDSSEDEVDKVDKQDAQPLTQHYQILTCCCGCDSNVRLVVECTDGDIRQLQ
DLLLGTLNIVCPICAPKP
>P03129 ~~~E7~~~Protein E7~~~
MHGDTPTLHEYMLDLQPETTDLYCYEQLNDSSEEEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQSTHVDIRTLE
DLLMGTLGIVCPICSQKP
>P06788 ~~~E7~~~Protein E7~~~
MHGPKATLQDIVLHLEPQNEIPVDLLCHEQLSDSEEENDEIDGVNHQHLPARRAEPQRHTMLCMCCKCEARIKLVVESSA
DDLRAFQQLFLNTLSFVCPWCASQQ
>P17387 ~~~E7~~~Protein E7~~~
MRGETPTLQDYVLDLQPEATDLHCYEQLPDSSDEEDVIDSPAGQAEPDTSNYNIVTFCCQCKSTLRLCVQSTQVDIRILQ
ELLMGSFGIVCPNCSTRL
>Q80901 ~~~E7~~~Protein E7~~~
MIGKEATIPEIVLELQELVQPTADLHCYEELSEEETEEERPHIPYKIVAPCCFCGSKLRLIVVATPIGIRSQEELLLGEV
QLVCPNCRGKLRHD
>P21736 ~~~E7~~~Protein E7~~~
MHGPRETLQEIVLHLEPQNELDPVDLLCYEQLSESEEENDEADGVSHAQLPARRAEPQRHKILCVCCKCDGRIELTVESS
AEDLRTLQQLFLSTLSFVCPWCATNQ
>P26557 ~~~E7~~~Protein E7~~~
MRGNNPTLREYILDLHPEPTDLFCYEQLCDSSDEDEIGLDGPDGQAQPATANYYIVTCCYTCGTTVRLCINSTTTDVRTL
QQLLMGTCTIVCPSCAQQ
>P06464 ~~~E7~~~Protein E7~~~
MHGRHVTLKDIVLDLQPPDPVGLHCYEQLVDSSEDEVDEVDGQDSQPLKQHFQIVTCCCGCDSNVRLVVQCTETDIREVQ
QLLLGTLNIVCPICAPKT
>P0DKA0 ~~~~~~Protein E8^E2C~~~
MAILKWKLSRCYSSNEVSSPEIIRQHLANHPAATHTKAVALGTEETQTTIQRPRSEPDTGNPCHTTKLLHRDSVDSAPIL
TAFNSSHKGRINCNSNTTPIVHLKGDANTLKCLRYRFKKHCTLYTAVSSTWHWTGHNVKHKSAIVTLTYDSEWQRDQFLS
QVKIPKTITVSTGFMSI
>P52584 ~~~A2R~~~Vascular endothelial growth factor homolog~~~
MKLLVGILVAVCLHQYLLNADSNTKGWSEVLKGSECKPRPIVVPVSETHPELTSQRFNPPCVTLMRCGGCCNDESLECVP
TEEVNVSMELLGASGSGSNGMQRLSFVEHKKCDCRPRFTTTPPTTTRPPRRRR
>P0C2R0 ~~~E~~~Envelope small membrane protein~~~
MFNLFLTDTVWYVGQIIFIFAVCLMVTIIVVAFLASIKLCIQLCGLCNTLVLSPSIYLYDRSKQLYKYYNEEMRLPLLEV
DDI
>Q89894 ~~~E~~~Envelope small membrane protein~~~
MNLLNKSLEENGSFLTALYIIVGFLALYLLGRALQAFVQAADACCLFWYTWVVIPGAKGTAFVYKYTYGRKLNNPELEAV
IVNEFPKNGWNNKNPANFQDAQRDKLYS
>K9N5R3 ~~~E~~~Envelope small membrane protein~~~
MLPFVQERIGLFIVNFFIFTVVCAITLLVCMAFLTATRLCVQCMTGFNTLLVQPALYLYNTGRSVYVKFQDSKPPLPPDE
WV
>Q84706 ~~~E~~~Envelope small membrane protein~~~
MLQLVNDNGLVVNVILWLFVLFFLLIISITFVQLVNLCFTCHRLCNSAVYTPIGRLYRVYKSYMRIDPLPSTVIDV
>P59637 ~~~E~~~Envelope small membrane protein~~~
MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVSLVKPTVYVYSRVKNLNSSEGVPDLLV
>P0DTC4 ~~~E~~~Envelope small membrane protein~~~
MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV
>Q65188 ~~~~~~Inner membrane protein H108R~~~
MVNLFPVFTLIVIITILITTRELSTTMLIVSLVTDYIIINTQYTEQQHENNTFSMPQKNSFNESYNKDKKSNTHIPYQWL
APELKEAESKYWWGNYDPHSEPVLAGAS
>Q65233 ~~~~~~Inner membrane protein H108R~~~
MVNLFPVFTLIVIITILITTRELSTTMLIVSLVTDYIIINTQYTEQQHEMNTFSTQQLPQKNSFNESYNKDKKSNTHIPY
QWLAPELKEAENKYWWGNYDPYSEPVLAGAS
>Q65139 ~~~~~~Uncharacterized protein A118R~~~
MHSNAFFNLIACVLFPTPLIPSMVISIPRMINKWVKRVQFLTFLTNLFLYNIVQHYINRIRCYSFIKYLLLYNLYRPIFG
RSLQMAITKIKIISDATAAVLLKSCAAMYDVLIDKKFK
>Q65178 ~~~~~~Uncharacterized protein CP123L~~~
MPSTGTLVIIFAIVLILCIMLLFFYKTVEAEKPGVLPPPIPPPTPPPSKKKYDHNEYMEKTDLEPEVKKNHRKWANEAEH
LISSSVKGLENLDETAFLANHKGHGFRTFEHAKSLFKEFLKKY
>Q65186 ~~~~~~Uncharacterized protein H124R~~~
MNLEYVQVVQKFNQVLLELTKKVCTVVGGSKPTYWYHHIRRVCSECPSMPMSMIGPYLNVYKAQILTRDKNFFMNFDPAH
NEYTFIIQKLKEAARNMPEDELEQYWVKLLFLLKSYIKCKPFIN
>Q65171 ~~~~~~Uncharacterized protein B125R~~~
MAVYAKDLDNNKELNQKLINDQLKIIDTLLLAEKKNFLVYELPAPFDFSSGDPLASQRDIYYAIIKSLEERGFTVKICMK
GDRALLFITWKKIQSIEINKKEEYLRMHFIQDEEKAFYCKFLESR
>Q65154 3.1.4.-~~~~~~Uncharacterized protein C129R~~~
MEHPSTNYTPEQQHEKLKYYVLIPKHLWSYIKYGTHVRYYTTQNVFRVGGFVLQNPYEAVIKNEVKTAIRLQNSFNTKAK
GHVTWAVPYDNISKLYAKPDAIMLTIQENVEKALHALNQNVLTLASKIR
>Q07344 ~~~~~~Structural protein A137R~~~
MEAVLTKLDQEEKKALQNFHRCAWEETKNIINDFLEIPEERCTYKFNSYTKKMELLFTPEFHTAWHEVPECREFILNFLR
LISGHRVVLKGPTFVFTKETKNLGIPSTINVDFQANIENMDDLQKGNLIGKMNIKEG
>Q07385 ~~~~~~Uncharacterized protein K145R~~~
MDHYLKKLQDIYTKLEGHPFLFSPSKTNEKEFITLLNQALASTQLYRSIQQLFLTMYKLDPIGFINYIKTSKQEYLCLLI
NPKLVTKFLKITSFKIYINFRLKTFYISPNKYNNFYTAPSEEKTNHLLKEEKTWAKIVEEGGEES
>Q65197 ~~~~~~Uncharacterized protein E146L~~~
MGGTTDFVLSITIVLVILIIIAFIWYNFTGWSPFKYSKGNTVTFKTPDESSIAYMRFRNCIFTFTDPKGSLHSIDVTEVL
NNMAKGFRDAQNPPSSFTLGGHCQAPLNAFSFVLPGVNDRATVATADDAKKWENCDATLTGLQRII
>Q65140 ~~~~~~Protein A151R~~~
MMALLHKEKLIECIENEVLSGGTVLLLVKNIVVSEISYIDNSYKYFTFNANHDLKSKEDLKGATSNNIAKMIYNWIIKNP
QNNKIWSGEPRTQIYFENDLYHTNYNHECIKDFWDVSTSVGPCIFNDRSIWCTKCTSFYPFTNIMSPNIFQ
>Q65145 ~~~~~~Uncharacterized protein F165R~~~
MANPNKRIMNKKSKQASISSILNFFFFYIMEYFVAVDNETPLGVFTSIEQCEETMKQYPGLHYVVFKYTCPADAENTDVV
YLIPSLTLHTPMFVDHCPNRTKQARHVLKKINLVFEEESIENWKVSVNTVFPHVHNRLSAPKFSIDEANEAVEKFLIQAG
RLMSL
>Q65166 ~~~~~~Transmembrane protein B169L~~~
MNVDFIAGINNLGEKIYTCEPFKTSFQNPFIVALIITAVVLVVFFAICNPPVDKKRKTKTAIYVYICIVALLFLHYYVLN
HQLNDIYNKSNMDVIVSSIHDKYKGGDEIIPPISPPSVSNELEEDQPKKIPAGPKPAGPKPADSKPASSADSKPLVPLQE
VIMPSQYNN
>Q65185 ~~~~~~Uncharacterized protein H171R~~~
MVVYDLLVSLSKESIDVLRFVEANLAAFNQQYIFFNIQRKNSITTPLLITPQQEKISQIVEFLMDEYNKNNRRPSGPPRE
QPMHPLLPYQQSSDEQPMMPYQQPPGNDDQPYEQIYHKKHASQQVNTELNDYYQHILALGDEDKGMDSMLKLPEKAKRGS
DDEDDMFSIKN
>Q65174 ~~~~~~Uncharacterized protein B175L~~~
METNCPNILYLSGITIEECLQTKKTATDTLNTNDDEAEVEKKLPSVFTTVSKWVTHSSFKCWTCHLYFKTVPKFVPTYMR
ENERGEIEMGVLGNFCSFSCAASYVDVHYTEPKRWEARELLNMLYRFFTSQWISYIKPAPSYTMRKEYGGKLSEEAFISE
LHTLEESISSKHIFI
>P27942 ~~~~~~Protein I177L~~~
MWKVNDQGFLNITVTGTKFNLIAITGKLGFYTDPPSHLMIMPLKIFPVHKFSKNEPNKKQKRFIYF
>Q65183 ~~~~~~Uncharacterized protein S183L~~~
MSVVVGGVEYSLNNWARYEIKRRAAELESVNYYPHCEYIMPEDIVVSILGSKPNCPFLEALKRFHDFLKKRRIIFKGEYL
VIPWMGAQDVADMIHHVENRINLDHLEDLAHMLKLITYHKSFDTCINQAFEHLYAFKFPDANIETHELKHIRQLEKKMYG
YILRLEKLQTVLTFYIEFLLKQV
>Q65193 ~~~~~~Uncharacterized protein E184L~~~
MKTFITCTSVKNYFRQHLKTNQRISSELISYVCTILNHICHQYLQNPQAQEEEWFALIKELPIIKDGLSKEERFFSSGVK
HFLHEYKITPENQEKFQKMLNAITEQLMSRLCKVFSIMIQRQGFLKTQTLMYSHLFTILSILMVADNLYGEQDPTEFFSL
IIEQTKTIKKKKKSSSEEEESHEE
>P27943 ~~~~~~Late protein I196L~~~
MLFRYLVWLFRFIEVKNVVSISLLVIGSNYLTTAISNNTSTTISPTTSSNYLLTAISNNTSTTILPTTTSSNYLTSAIPN
IISDKEDDTPFSTDKTVSDGLSPITLYRAIRSTLNDTMTDILTRPYRPTTVIFHSDTPQPVKNATQGNIIKKTYRQVLTF
FIQPNPLFPCFKNHEVFLNLANILNTILCIILIKNV
>Q65198 ~~~~~~Inner membrane protein E199L~~~
MSCMPVSTKCNDIWVDFSCTGPSISELQKKEPKAWAAILRSHTNQQTAEDDNIIGSICDKQGLCSKDEYAYSQYCACVNS
GTLWAECAFAPCNGNKNAYKTTEHRNILTNKQCPSGLTICQNIAEYGGSGNISDLYQNFNCNSVINTFLINVMNHPFLTL
ILIILILIIIYRLMPSSGGKHNDDKLPPPSLIFSNLNNF
>Q65147 ~~~~~~Uncharacterized protein K205R~~~
MVEPREQFFQDLLSAVDQQMDTVKNDIKDIMKEKTSFMVSFENFIERYDTMEKNIQDLQNKYEEMAANLMTVMTDTKIQL
GAIIAQLEILMINGTPLPAKKTTIKEAMPLPSSNTNNDQTSPPASGKTSETPKKNPTNAMFFTRSEWASSKTFREKFLTP
EIQAILDEQFANKTGIERLHAEGLYMWRTQFSDEQKKMVKEMMKK
>Q65189 ~~~~~~Uncharacterized protein H233R~~~
MILIASPFSLAHLEYLHTWHVTIKNIAQQHGLDIKVAIVVSTSHLNNFLPISGALNIECITFPSCGIKEIDLLWARIKLF
QHYCAIGARLLWLVSADIRPPVSAWPAIADSLKKGADAVVIPYPSRWNNLIPTVIKEIVVHQKKCLVAVDARHLDTDTQI
VGAGMGCIVLTLKALMVRLSIGKQPVKILWPDLHGTAEGIPLEGVEVGWFLNAYAHKLNIRCLGADHIAQHLT
>Q65158 ~~~~~~Transmembrane protein C257L~~~
MYSVCDVVRDAVAQSHLCACPNDKLPQCKGVTKAPPKCSVFHVAKLQDTKFKWKYTLDPLKAQKLSQIDKDIEKDAITLK
LIYGIELSPEDLEWWKMQRCLINKKTGAKGGQFANKYLERQDLELLGYSPTPIIGGDFMFTALPDKVLRTIPVAWDRFLN
PAMMIFFLIILLCVILGIFYVLVRNTLRRKQKSKQHQMEIKRFIKEKEQDPYIHTSFESWPADPNKKWKDLIPMYEAQGY
CMADYRKKLGMPPGPNC
>Q65205 ~~~~~~Protein I267L~~~
MLLVLIDVDGFMGQLYNENGTQTILIPREVVIFYWEKNTASKILQLFFHGGIDPIFEKINQRSFSFQSRHIHHFTLDESP
LPNSIALPTDTLQAFKAGKKMIFQHLVKITKDHEQILLLHKGGPEGEWVRSFNIPNATVQNLNDLCCPSVEKLVLKKRDY
ISSSIGCPKHIQGSNHCPVFECHVLFKWIQENTSIVQGVLKRPSLPYEEAVLFIEHRINMVDNHPFKKDSVKQNQKKKNW
IATQFVQHGIYVDNGILSKIYNKYSLF
>Q65196 ~~~~~~Uncharacterized protein E301R~~~
MSEDIRRGPGRPPKKRVVPNFERKGILEKPVRPQSRLEFSYDNPLIFKNLFIYFKNLKSKNILVRCTPTEITFFSRDQSQ
ASFVIATIDGKNVNHYYASDVFWLGINRELVEKMFNSIDRSFLKITIVHRYDKPETLFFIFTDFDIDKECTYQITVSEPE
LDMDLIEMEKSISEERLKNYPLRWEFTSKQLKKTFSDLSNYTELVTIEKLGGDTPLHLYFQKFNSISYHEMYKSSNKINL
TSTIPKSQVFQINVKIAHIKSLASAMVTDKIRILCEENGNLIFQSEMDALMLNTITLNNTI
>Q65180 ~~~~~~Uncharacterized protein CP312R~~~
MLLVKMTTHIFHADDLLQALQQAKAEKNFSSVFSLDWDKLRTAKRNTTVKYVTVNVIVKGKKAPLMFNFQNEKHVGTIPP
STDEEVIRMNAENPKFLVKKRDRDPCLQFNKYKISPPLEDDGLTVKKNEQGEEIYPGDEEKSKLFQIIELLEEAFEDAVQ
KGPEAMKTKHVIKLIQRKISNSAVKNADKPLPNPIARIRIKINPATSILTPILLDKNKPITLQNGKTSFEELKDEDGVKA
NPDNIHKLIESHSMHDGIINARSICISNMGISFPLCLEMGVVKVFEKNNGIDVNSIYGSDDISTLVNQIAIA
>Q65144 ~~~~~~Uncharacterized protein F317L~~~
MVETQMDKLGFLLNHIGKQVTTKVLSNAHITQTMKEIILENHGVDGGAAKNVSKGKSSPKEKKHWTEFESWEQLSKSKRS
FKEYWAERNEIVNTLLLNWDNVRGAIKKFLDDDREWCGRINMINGVPEIVEIIPSPYRAGENIYFGSEAMMPADIYSRVA
NKPAMFVFHTHPNLGSCCGGMPSICDISTTLRYLLMGWTAGHLIISSNQVGMLTVDKRIIVDLWANENPRWLMAQKILDI
FMMLTSRRSLVNPWTLRDLKKILQDYGIEYIIFPSNDFFIYEDERLLMFSKKWTNFFTLHELLDDLETIETKASSTT
>Q65182 ~~~~~~Protein D345L~~~
METFVRLFKDSPQQRSDAWHAIRRTQVGGSDLASVLGLNPYKSYYIILAEKANLFKKNLNRAACSWGTLFERVSKDLLEL
FCQTTVIGDNIHIDGTYLGYPGHSNSPDGFCHLTLGYTQQSWEIKTIFNNVRYEATKRIPVLVEIKSPFNRKIKNSVPSY
YMPQIQSGLALSPPISMGIYVEAMFRVCGIHQLGSNNETNTDIHPPESMLPLAWGIITICSTQEHTEAPQDFGTLDAETF
RQLLETLYQKDQYTIHYSMPYETACPEMPNVVGYFGWKVFIFQIIPVMKHPQFLKDKYPIIQQFLRDLHTIKASPSPMET
YEKICCSEESALSTEDIDNFTDMLT
>Q65168 ~~~~~~Uncharacterized protein B354L~~~
MALTTHSGKLIPELQFKAHHFIDKTTVLYGPSKTGKTVYVKHIMKILQPHIEQILVVAPSEPSNRSYEGFVHPTLIHYRL
WLADKQKKNDNKGAERFLEAIWQRQTMMSSIYSRVNNIDMLKTLYHKLPIDIQQKENKNIAKVECLKAEQTDQKKEEKIT
SLYQQLLKKIIIQNIHMYKNLCLTEDEKFTLNYINLNPRLLLILDDCAAELHPLFTKEIFKKFFYQNRHCFISMIICCQD
DTDLPANLRKNAFVSIFTNASICMSNFSRQSNRYSKQDKEYVEEISHIVFKGYRKLVYIREDENRQHFYHSTVPLPTAFS
FGSKALLKLCKAVYSKEVVIDKSNPYWSKFRLNF
>Q65151 3.1.4.-~~~~~~ERCC4 domain-containing protein EP364R~~~
MYFLVADHREHHVIPFLKTDFHDMQHNPMFTQKQALLEIKQLFTGDYLICKSPTTILACIERKTYKDFAASLKDGRYKNR
QKMLSLREQTNCQLYFFVEGPAFPNPQKKINHVAYASIITAMTHLMVRDHIFVIQTKNEAHSSQKLVQLFYAFSKEMVCV
VPTSLTPTDEELCIKLWSSLSGISGVIGKILANTCSVAHLVSGKLSSQNIDQLKTPSNRPFPKKVKRMLISISKGNKELE
IKLLSGVPNIGKKLAAEILKDHALLFFLNQPVECLANIQIVQKTRTIKLGMKRAEAIHYFLNWCGSAHVTDDSQNITEAS
RPATQPAATQPLHEVSDDATSNASDTSSPIGHQTLSKEMLLNTA
>Q65170 ~~~~~~Zinc finger protein B385R~~~
MDEIINKYQAVEKLFKEIQQGLAAYDQYKTLISEMMHYNNHIKQEYFNFLMIISPYLIRAHSGETLRNKVNNEIKRLILV
ENINTKISKTLVSVNFLLQKKLSTDGVKTKNMWCTNNPMLQVKTAHNLFKQLCDTQSKTQWVQTLKYKECKYCHTDMVFN
TTQFGLQCPNCGCIQELMGTIFDETHFYNHDGQKAKSGIFNPNRHYRFWIEHILGRNPEQELGTKQDPCGTKVLQQLKKI
IKRDNKCIALLTVENIRKMLKEINRTDLNNCVSLILRKLTGVGPPQISESILLRGEYIFTEAIKIREKVCKKGRINRNYY
PYYIYKIFDAILPPNDTTNRRILQYIHLQGNDTLANNDSEWESICMELPEIKWKPTDRTHCVHFF
>Q65173 ~~~~~~Uncharacterized protein B407L~~~
MEDTTFLEGANLAGITTLMNNLHINEQANLEELEKQVMGKQQSFPTDHFDEELNGLAKSLGINFNDPEFSLDSPHSVISK
KPSGRGRDKVHGGIRRDSVCTDSICSDSVCSGSIRSGSIRSGSIRNGSIRSGSVRDDSVRSGKTRRGLACNSSSRNDRGY
SLSTHRKKYAESEASQKTAFSKRDRKNHYAESEYSEKSIKPSTKQVDRLINHLRSNGDPNSFYKKDHDYERKTKLVKLEK
INMLLTYLGNEQISTDDIKIPTIDSSMQEIDDVIEMLTLRNVGIRYSSIAEEILIGLARGLEIVFDGTREIPFLNYRPDY
TGLHNTFMIKLFKMRYETSQVVGNLVQNMSPLSKICLELGPSLLLYPALIRTKHKASEDLYNLLQKGPEDPFTAYNEIHE
TLKKNNK
>Q07384 ~~~~~~Uncharacterized protein K421R~~~
MYTHVDVVGIAEASAALYVQKDRDRYLDVLTTIENFIYQHKCIITGESAHLLFLKKNIYLYEFYSNNVAEHSKALATLLY
KLDPEYLTRYTVLITKIPNHWYVINVDQREFVRLYAIPAVKQHLPIPILPFYCTSALTHQELFCLGPELQLIQIYSKLCN
PNFVEEWPTLLDYEKSMRMLFLEQFPRRLEIAGGKKEEEKHESIIKKIILEMVSTRQRIVVGGYIQKNLYNHVLKNRNRL
QLITSLDIYEEKDIIQQFCDSNGLKIKIRINNPLLPTNPELRRLTIYFNHNNDDDQSYLIVDMYNTGSYELVPTNQINTL
DGSFLIGTPFVQARFLLVEIWVLMLIAQQTKKDTKKIIQFFINQYEMLMNSPWPSMEALFPSSSKRYLGNYVDPNALIKW
AQLKLKRIPPFYPGKPDEESC
>Q65195 ~~~~~~Uncharacterized protein E423R~~~
MLWRNEITEFMDQLSKYSQEILKTFKQLRPSEYKQYNEFLTQVTPLLQKTSEKIPELVDHIFNYLDNVEKICELLVNASS
IIISSKIREQVKHGMSFSYKADLDSLADILSQKQYVLMHLSKNIAAEYFNTCLNQGKSKLDLKAASVFYSSRPRTASSAE
LYRKMLYAYGSPQEINYYTEKARNKTLDVEESDSMAIIERTARHNLSLMHPLEAMGLTFGATNTDADPEDLKDKTVINLT
LPQATESITYHLKSLMQLKKVSTASGLNTNILKAFDNIISTPVKKNKMASKLAPGMDVVFTSDNGKTFFTKNILSKNMLA
GPKERVFAYNNLISNLNNSCFIQNHNDFLRQQDSWPFYDAHNFTNKFLMQPIFSGQTRPRLQGAMEAAHVETHLTAFLQS
IQPSRPQDPSVLASPKLSALILN
>Q65148 2.1.1.-~~~~~~Probable methyltransferase EP424R~~~
MSNYYYYYGGGRYDWLKTVEPTNFLKIGLPYQAHPLHLQHQATTTPPSILEKFKRADILLNEVKAEMDPLMLQPETEKKL
YQILGSIDMFKGLRKKVEFTYNAQIVTNAWLKMYELLNTMNFNNTSQAFCNCELPGGFISAINHFNYTMMHYPTFNWVAS
SLYPSSETDALEDHYGLYQCNPDNWLMQSPLLKKNVDYNDGDVTIASNVKNLALRATQRLTPIHLYTADGGINVGHDYNK
QEELNLKLHFGQALTGLLSLSKGGNMILKHYTLNHAFTLSLICVFSHFFEELYITKPTSSRPTNSETYIVGKNRLRLFTP
KEEQILLKRLEFFNDTPLVDLSLYQNLLESIYFAVETIHLKQQIEFLNFGMKCYRHFYNKIKLLNEYLAPKKKIFQDRWR
VLNKLYVLEKKHKLKLCAPQGSVA
>Q65167 ~~~~~~Uncharacterized protein B475L~~~
MDQEESHVISIFETLGAYFINIFYNFLYKNALYKKHSIVTEYQYQVKGYILGVKQNKKLYEKMLDSFYKYFCNITQINSK
TLNFSNFITTIVDSFIPKEYSQSISLEKKESILELLLCDYISNLGTFITTEKMLPFIIKNRKENYHKVTKEMQDYSLTFL
LKKRMELYNKFLRKQAYVEPETELEETYARLSSYNRSLLHQIEELTSEKKSLLADLSTLRKKYEKRQSEYRRLVQLLYQQ
IQRSSTSKSSYPLTKFIETLPSEHFSNEEYQKETPADQKEVVEMELLRKQELLTSQELTSKSPNNYPVPHSRTIVSKPPD
NYPVPRSRTTTKLDFDNSLQNQELHTKNGFSEKDIVEFGQDKPEEENILAIDQDKPEEENILAIKQDIPEEENILAIDQD
KPEFNQDTPEFKEAVLDTKENILEEENQDEPIVQNPFLENFWKPEQKTFNQSGLFEESSNFSNDWSGGDVTLNFS
>Q65191 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase QP509L~~~
MEAIISFAGIGINYKKLQSKLQHDFGRLLKALTVTARALPGQPKHIAIRQETAFTLQGEYIYFPILLRKQFEMFNMVYTT
RPVSLRALPCVETEFPLFNYQQEMVDKIHKKLLSPYGRFYLHLNTGLGKTRIAISIIQKLLYPTLVIVPTKAIQIQWIDE
LTLLLPHLRVAAYNNAACKKKDMTSKEYDVIVGIINTLRKKPEQFFEPFGLVVLDEAHELHSPENYKIFWKIQLSRILGL
SATPLDRPDGMDKIIIHHLGQPQRTVSPTTTFSGYVREIEYQGHPDFVSPVCINEKVSAIATIDKLLQDPSRIQLVVNEA
KRLYSLHTAEPHKWGTDEPYGIIIFVEFRKLLEIFYQALSKEFKDVQIVVPEVALLCGGVSNTALSQAHSASIILLTYGY
GRRGISFKHMTSIIMATPRRNNMEQILGRITRQGSDEKKVRIVVDIKDTLSPLSSQVYDRHRIYKKKGYPIFKCSASYQQ
PYSSNEVLIWDPYNESCLACTTTPPSPSK
>Q65169 ~~~~~~Protein B602L~~~
MAEFNIDELLKNVLEDPSTEISEETLKQLYQRTNPYKQFKNDSRVAFCSFTNLREQYIRRLIMTSFIGYVFKALQEWMPS
YSKPTHTTKTLLSELITLVDTLKQETNDVPSESVVNTILSIADSCKTQTQKSKEAKTTIDSFLREHFVFDPNLHAQSAYT
CASTCADTNVDTCASTCASTCASTCASTCASTCASTCASTGASTCADTNVDTCASTCADTNVDTCASTCADTNVDTCAST
CADTNVNTCASMCADTNVDTCASTCANTCASTEYTDLADPERIPLHIMQKTLNVPNELQADIDAITQTPQGYRAAAHILQ
NIELHQSIKHMLENPRAFKPILFNTKITRYLSQHIPPQDTFYKWNYYIEDNYEELRAATESIYPEKPDLEFAFIIYDVVD
SSNQQKVDEFYYKYKDQIFSEVSSIQLGNWTLLGSFKANRERYNYFNQNNEIIKRILDRHEEDLKIGKEILRNTIYHKKA
KNIQETGPDAPGLSIYNSTFHTDSGIKGLLSFKELKNLEKASGNIKKAREYDFIDDCEEKIKQLLSKENLTPDEESELIK
TKKQLNNALEMLNVPDDTIRVDMWVNNNNKLEKEILYTKAEL
>Q65156 ~~~~~~Uncharacterized protein C717R~~~
MTKLAQWMFEQYVKDLNLKNRGSPSFRKWLTLQPSLLRYSGVMRANAFDILKYGYPMQQSGYTVATLEIHFKNIRSSFAN
IYWNRDSEEPEYVCCCATYQSHDGEYRYRFVWYQPFIEAYNAIEAALDPLETIILNLIAARDLDFVVHIFPYNKGHEDYL
ASTQLILKIFIATLLMDILRIKDNTLDVHLNSDYIIVMERLWPHIKDAIEHFFEAHKDLLGYLIAFRNGGNFAGSLRPSC
GQKIVPLTIREVLQMNDINLAVWREVFIMQECSDLVINGIAPCFPIFNTWTYLQGINQIFFENTSLQEKFKKDFIARELS
KEIIKGQKTLNDKEFKKLSLHQIQYMESFLLMSDVAIMITTEYVGYTLQSLPGIISRSSYLSPIVKNILMDEDSFMSLLF
DLCYGAYVLHKKENVIHADLHLNNMTYYHFNPTSFTDRNKPGKYTLKVKNPVIAFITGPKVETETYVFKHIDGFGCIIDF
SRAIMGPNHAIKLERQYGLAFVNTFYRNQSEHILKVLRYYFPEMLTNRENEIQGVILSNFNFFFNSITAIDFYAIARNLR
SMLSLDYLHTSEVKRNVEISQTFLDTCQFLEEKAVEFLFKNLHTVLSGKPVEKTAGDVLLPIVFKKFLYPNIPKNILRSF
TVIDVYNYNNIKRYSGKAIQTFPPWAQTKEILTHAEGRTFEDIFPRGELVFKKAYAENNHLDKILQRIREQLANENL
>P27946 ~~~~~~Uncharacterized protein I73R~~~
METQKLISMVKEALEKYQYPLTAKNIKVVIQKEYNVVLPTGSINSILYSNSELFEKIDKTNTIYPPLWIRKTN
>Q07383 ~~~~~~Transmembrane protein EP84R~~~
MPYSRDITKFITATEPEVGLPLLALQHSKSVIGVILLVISLLFIFIGIIILSVSSGHTTAASIFIVLSLILGGGGFFLIY
KDNS
>Q65206 ~~~~~~Uncharacterized protein DP238L~~~
MARGQNIRKRTFSDMDTPSDKNIGIHTNSLPKNNLYRRILFKGKISNYSISKDSLAKDHSSKHSISKNGLIGKKRPAPLD
ISFQSMNSSISSSTQKKTRILDEEIKDQSLSNENDTDSPVIVDITLKPSYMSKTSRITEIIHKMKELNMNRIEDGSSFNK
KRSEHDDKNILLHTMEMEEEDCEIEEDIAIDSPYLNTSLSEDDTDSIVGTDYSEKEKETISETESSSDDESYSLYDSF
>Q65187 ~~~~~~Protein H339R~~~
MAGRVKIKQKELIDSTVKNKNVMNLFHEIIGSKGNINFSVVWPKFKKIKQSVYDYISTLSVLEKASVMQNFEADKKLLEL
FVQKLWAAYEGYFKYPEIEKYEVEGQVNFNLVPQCVLEKFSQLYRIRINSELVTLILNSCAFMSKYNDYILKKDPYILTI
TPGLCFSPIPNFEDLNFKHLYNSDKNSQHDKEFIMFILYKLYTAALGVYNAISIPDIDVEDLENIILSSVSQIKKQIPRC
KDAFNKIESSVHLLRKNFNTYYSDYVGSGYNPTIIMEQYIKDISQDSKNISPRISYQFRTIIKYYRDMIATRHQTMDPQV
LNLVKHVEKKLDMLDREKN
>P15236 ~~~fil~~~Protein fil~~~
MLKSEPSFASLLVKQSPGMHYGHGWIAGKDGKRWHPCRSQSELLKGLKTKSPKSSGFLIIRIVHFVIKGVKHVTR
>F5HEZ4 ~~~~~~Viral FLICE protein~~~
MATYEVLCEVARKLGTDDREVVLFLLNVFIPQPTLAQLIGALRALKEEGRLTFPLLAECLFRAGRRDLLRDLLHLDPRFL
ERHLAGTMSYFSPYQLTVLHVDGELCARDIRSLIFLSKDTIGSRSTPQTFLHWVYCMENLDLLGPTDVDALMSMLRSLSR
VDLQRQVQTLMGLHLSGPSHSQHYRHTP
>Q05278 ~~~6~~~Minor tail protein Gp6~~~
MADLGNPLDLEMLCLVTGRDFRWTIDYPWGPGELFLELETGGEHNALHQVYVTGATGGTYTLNVNGTNTPAIDYNDVSEN
PQGLAGDIQDALDAAVGAGNAVVHPVSLFPAWTLNFNLNASKPLTEQLVNTINKAANDFFDTFDQLLGVDVEMTVTDTLN
FKLKVTSRRSFDEVGVVTFAVDVTSQAVINFFNSVAELTGAVNTVNVDFYWNRTYDIEFTGSLGLQPIPATTADITNLAG
TSKAVSVTVVEPGKKRLTIWPFTVNGETATIKVESEEADKIPNRCRWQLVHMPTGEAAGGDAKQLGRVYRQPR
>P26748 ~~~8~~~Scaffolding protein~~~
MEPTTEIQATEDLTLSGDHAAASADSLVVDNANDNAGQEEGFEIVLKDDETAPKQDPAKNAEFARRRIERKRQRELEQQM
EAVKRGELPESLRVNPDLPPQPDINAYLSEEGLAKYDYDNSRALAAFNAANTEWLMKAQDARSNAVAEQGRKTQEFTQQS
AQYVEAARKHYDAAEKLNIPDYQEKEDAFMQLVPPAVGADIMRLFPEKSAALMYHLGANPEKARQLLAMDGQSALIELTR
LSERLTLKPRGKQISSAPPADQPITGDVSAANKDAIRKQMDAAASKGDVETYRKLKAKLKGIR
>Q38623 ~~~~~~Uncharacterized protein gp20~~~
MYRKFSDECFGPSTLINAIKVIALVVLITISAVVYLSVC
>Q05229 ~~~~~~Major tail protein Gp23~~~
MAENDDAVLTAAVGYVYVGAAGTAAPTPALLKTIDLSKPETWTGATGWTSVGHTSRGTLPEFGFEGGESEVKGSWQKKKL
REITTEDPIDYVTVLLHQFDEQSLGLYYGPNASETPGVFGVKTGQTNEKAVLVVIEDGDMRLGHHAHKAGVRRDDAIELP
IDDLAALPVRFTYLDHEDELPFSWINEDLFNVPEVPEG
>Q05233 ~~~~~~Minor tail protein Gp26~~~
MPNSAGVEVARISVKVSPNTKEFRRELKTELEKIERELKGDVEINGHLDAAQAKADFKRMMMQLKTEAAKGVHVPVDVTV
DKKSKKGGLLGGLLGGSRGLGDLGDDAEKASSQVQHLGKSFLGLTRAAWIGVGIVAVAAPLVGIVAGLLAGLPSLLSAFG
AGAGVVALGMDGIKAAASTLAPTLETVKAAVSSTFQQGLTPVFQQLGPMLTAITPNLQNVASGLVNMAGSITDVITQAPG
LQQIQNILTKTGEFFTGLGPVLATGTQAFLTLSNAGANSFGTLLAPLQEFTNGFNDMVNRVTSNGVFEGAMQGLSQTLGS
VLNLFNRLMESGLQAMGQLGGPLSTFINGFGDLFVSLMPALTSVSGLIGNVLGTLGTQLAPIVTALTPAFQTLASTLGTM
LTGALQALGPILTQVATLIGTTLNTALQALQPMLPSLMQSFQQISDVLVTSLAPHIPALATALGQVAGAVLQLAPTIIST
LVPAFVQLVPKVAELVPTIVNLVQSFANLMPVVLPLAQALVSVAGAVIQVGVSIGGALIGALANLTEIISNVIKKVSEWV
SSFSSGAQQIAAKAAELPGMIQSALANLMAIGLQAGKDLVQGLINGIGGMVSAAVNKAKELASSVAGAVKGFLGIESPSK
LFTEYGQFTAEGFGNGMEAGFKPVIERAKDLAAELSRAMESGTDPSGILAGLDQNELKQMLAALEEERKRLKVEKNGIPK
GDKAGREALQNQLDQIQAQKDILSYQRDRIKNESEYGDMAGEDPLVKAASGLMSAPVDFAKATGKQFLSDIGISGDGFIS
KAITEGIQYIFQIGSVDEALSIKDREESKNALSVVGR
>Q05234 ~~~~~~Minor tail protein Gp27~~~
MITDTIVELEGVNGERFNLTTGDQGVYLATDVEGCFYDPPVKVVVEEPGNYPGARYLSHRALKRDIVFGVVILNDAKQGP
RSWLSRDSEWRKAWAFNRTCKLYVTTPDSGTRYLKLALFESPTVKMDTDPRGKPLEVTVMSCIAYDPFWYEDDKVFSAKT
KTDTRFDPSFWTPPWPWEELPKETLRIKVGREQGGLNPTDQYIFPKWTVPGSTEKVPNFPWPFPPNVPIPWETAPFTQFV
IPDYSFEDEEFRNRRLKTPGLIYGENCVIDTDRREEQIASESGSPVWARMNGVRFRNSIPPYTEEAEFVIDASGCAPGQV
VTLRLTRPWSRCWGLE
>Q0PI71 ~~~~~~Uncharacterized 27 kDa protein~~~
MESNKMKSITKMTVNTLQMDSCKAAAKRFKNATWELISNDFNLDIMNMFCILYYACTSHELKLDELVFQTAHKVIMNEVS
VSIGESRMFIRIYVSFHMLMGDLNELLHHGISKFEGSPIKDMTSYLIYGEEYVRNLFQSINSTRATPVYQLFITKVPRQF
PAITHGDSEYTYMLMVHNYLSTQSIAGVSHTPSNSDNYVIRTAEEYVHHHYAQLDRQASDNKMRPESSDQME
>Q05235 ~~~~~~Minor tail protein Gp28~~~
MSGLTSVREAEDLWQKIQLRRCKREQERLKHPDVELRDGDFRLRGLVAGERVLEWEFIENETGTCTLQLSLSHYLAKWVM
DHRGRAKRNVIINIEKQGARWTGMMDHYRVIKTDAGDAYIEIVFLHDFEQTKHIRVWCNPFLRPELQFPKVWIIFGPAKW
CLLVTLFVNLLRLETSLWTLPDDPTDINEWMGPSFNPANWRNIVKPFPFLADNSPVTMVFSRFGTFYDTAKKILEDHQLT
LTCRRYIKDRDPHPFEDLKGLWGIDPVEDLLQKIPLRDGCVVWDIEDNSGWGTQTAFGGSWLTGFVRGMVQLAGDGQVEG
VDVFTGDYTFPGEYYSPWFMGTSPIAPHVVFEEGPLTGIKSSEFSYYEATDTSFLAGGQSAPGINEGISALVNIGGDLLT
SFINSQLAALGAVGGAIDLPPLGGLLDAVLQPLYSDVFGAFMEVPTLRAMGISLPISGLEDIVTGLGDFHYFENMADGAM
KAFTLSAFAAIASQIHKTRARTTHTLKVSDAAPYIFAPKPYGHCWIGDRVGTSVLGYPVEHQLFVERIRKVKYRIDKDGM
KPLEIEIGYREPKNPALHILEEIKRVNGALGTAGIL
>P17313 ~~~~~~Capsid assembly protein Gp31~~~
MSEVQQLPIRAVGEYVILVSEPAQAGDEEVTESGLIIGKRVQGEVPELCVVHSVGPDVPEGFCEVGDLTSLPVGQIRNVP
HPFVALGLKQPKEIKQKFVTCHYKAIPCLYK
>P17171 ~~~~~~Head formation protein~~~
MNKDDLDLDLEIIDESPSSEGEEERKERLFNESLKIIKSAMENVIQEIVIKLEDGSTHIVYVTKLDWVDGKVVMDFAVLD
QERKAELAPHVEKCITMQLQDAFNKRSKKKFKFF
>Q05286 ~~~~~~Repressor-like immunity protein~~~
MSGKIQHKAVVPAPSRIPLTLSEIEDLRRKGFNQTEIAELYGVTRQAVSWHKKTYGGRLTTRQIVQQNWPWDTRKPHDKS
KAFQRLRDHGEYMRVGSFRTMSEDKKKRLLSWWKMLRDDDLVLEFDPSIEPYEGMAGGGFRYVPRGIEDDDLLIRVNEHT
NLTAEGELLWSWPDDIEELLSEP
>P01136 ~~~~~~Pro-Viral epidermal growth factor~~~
MSMKYLMLLFAAMIIRSFADSGNAIETTSPEITNATTDIPAIRLCGPEGDGYCLHGDCIHARDIDGMYCRCSHGYTGIRC
QHVVLVDYQRSENPNTTTSYIPSPGIMLVLVGIIIITCCLLSVYRFTRRTKLPIQDMVVP
>P03654 ~~~K~~~K protein~~~
MKPKTTLLLQELLLLTYELNRSGLLVENEEIQSQLKKLEVVLLCNLSPSSQRAGKN
>Q8BB27 ~~~G~~~Envelope glycoprotein p57~~~
MQLSMSFLIGFGTLVLALSARTFDLQGLSCNTDSTPGLIDLEIRRLCHTPTENVISCEVRYLNHTTINLPAVHTSCLKYH
CKTYWGFFGSYSADRIINRYTGTVKGCLNNSAPEDPFECNWFYCCSAITTEICRCSITNVTVAVQTFPPFMYCSFADCST
VSQQELESGKAMLSDGSTLTYTPYILQSEVVNKTLNGTILCNSSSKIVSFDEFRRSYSLANGSYQSSSINVTCVNYTSSC
RPRLKRRRRDTQQIEYLVHKLRPTLKDAWEDCEILQSLLLGVFGTGIASASQFLRGWLNHPDIIGYIVNGVGVVWQCHRV
NVTFMAWNESTYYPPVDYNGRKYFLNDEGRLQTNTPEARPGLKRVMWFGRYFLGTVGSGVKPRRIRYNKTSHDYHLEEFE
ASLNMTPQTSIASGHETDPINHAYGTQADLLPYTRSSNITSTDTGSGWVHIGLPSFAFLNPLGWLRDLLAWAAWLGGVLY
LISLCVSLPASFARRRRLGRWQE
>P52638 ~~~G~~~Envelope glycoprotein p57~~~
MQPSMSFLIGFGTLVLVLSARTFDLQGLSCNTDSTPGLIDLEIRRLCHTPTENVISCEVSYLNHTTISLPAVHTSCLKYH
CKTYWGFFGSYSADRIINRYTGTVKGCLNNSAPEDPFECNWFYCCSAITTEICRCSITNVTVAVQTFPPFMYCSFADCST
VSQQELESGKAMLSDGSTLTYTPYILQSEVVNKTLNGTILCNSSSKIVSFDEFRRSYSLTNGSYQSSSINVTCANYTSSC
RPRLKRRRRDTQQIEYLVHKLRPTLKDAWEDCEILQSLLLGVFGTGIASASQFLRSWLNHPDIIGYIVNGVGVVWQCHRV
NVTFMAWNESTYYPPVDYNGRKYFLNDEGRLQTNTPEARPGLKRVMWFGRYFLGTVGSGVKPRRIRYNKTSHDYHLEEFE
ASLNMTPQTSIASGHETDPINHAYGTQADLLPYTRSSNITSTDTGSGWVHIGLPSFAFLNPLGWLRDLLAWAAWLGGVLY
LISLCVSLPASFARRRRLGRWQE
>Q6WB94 ~~~G~~~Major surface glycoprotein G~~~
MEVKVENIRAIDMLKARVKNRVARSKCFKNASLILIGITTLSIALNIYLIINYTIQKTSSESEHHTSSPPTESNKEASTI
STDNPDINPNSQHPTQQSTENPTLNPAASVSPSETEPASTPDTTNRLSSVDRSTAQPSESRTKTKPTVHTRNNPSTASST
QSPPRATTKAIRRATTFRMSSTGKRPTTTSVQSDSSTTTQNHEETGSANPQASVSTMQN
>P33495 ~~~G~~~Major surface glycoprotein G~~~
MGSKLYMVQGTSAYQTAVGFWLDIGRRYILAIVLSAFGLTCTVTIALTVSVIVEQSVLEECRNYNGGDRDWWSTTQEQPT
TAPSATPAGNYGGLQTARTRKSESCLHVQISYGDMYSRSDTVLGGFDCMGLLVLCKSGPICQRDNQVDPTALCHCRVDLS
SVDCCKVNKISTNSSTTSEPQKTNPAWPSQDNTDSDPNPQGITTSTATLLSTSLGLMLTSKTGTHKSGPPQALPGSNTNG
KTTTDREPGPTNQPNSTTNGQHNKHTQRMTPPPSHDNTRTILQHTTPWEKTFSTYKPTHSPTNESDQSLPTTQNSINCEH
FDPQGKEKICYRVGSYNSNITKQCRIDVPLCSTYSTVCMKTYYTEPFNCWRRIWRCLCDDGVGLVEWCCTS
>P16778 ~~~~~~UL37 immediate early glycoprotein~~~
MSPVYVNLLGSVGLLAFWYFSYRWIQRKRLEDPLPPWLRKKKACALTRRSRHRLRRQHGVIDGENSETERSVDLVAALLA
EAGEESVTEDTEREDTEEEREDEEEENEARTPEVNPIDAEGLSGLAREACEALKKALRRHRFLWQRRQRARMLQHNGPQQ
SHHAAVFCRVHGLRGFQVSVWLLLTLLWSTGHGVSVRCTYHGTDVNRTSNTTSMNCHLNCTRNHTQIYNGPCLGTEARLP
LNVTFNQSRRKWHSVMLKFGFQYHLEGWFPLRVLNESREINVTEVHGEVACFRNDTNVTVGQLTLNFTGHSYVLRAIAHT
SPFESYVRWEETNVTDNATSSENTTTVMSTLTKYAESDYIFLQDMCPRFLKRTVKLTRNKTKHNVTVTGNNMTTLPVWTP
ECKGWTYWTTLSVMWRNRRSALLRAKSRALGHWALLSICTVAAGSIALLSLFCILLIGLRRDLLEDFRYICRDEGSSSTK
NDVHRIV
>Q98146 ~~~~~~viral G-protein coupled receptor~~~
MAAEDFLTIFLDDDESWNETLNMSGYDYSGNFSLEVSVCEMTTVVPYTWNVGILSLIFLINVLGNGLVTYIFCKHRSRAG
AIDILLLGICLNSLCLSISLLAEVLMFLFPNIISTGLCRLEIFFYYLYVYLDIFSVVCVSLVRYLLVAYSTRSWPKKQSL
GWVLTSAALLIALVLSGDACRHRSRVVDPVSKQAMCYENAGNMTADWRLHVRTVSVTAGFLLPLALLILFYALTWCVVRR
TKLQARRKVRGVIVAVVLLFFVFCFPYHVLNLLDTLLRRRWIRDSCYTRGLINVGLAVTSLLQALYSAVVPLIYSCLGSL
FRQRMYGLFQSLRQSFMSGATT
>P87671 ~~~GP~~~Envelope glycoprotein~~~
MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVNDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVP
SATKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF
LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLT
YVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTVVSNGAKNISG
QSPARTSSDPGTNTTTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSPQSLTTKPGPDNSTHNTPVYKLDISE
ATQVEQHHRRTDNDSTASDTPSATTAAGPPKAENTNTSKSTDFLDPATTTSPQNHSETAGNNNTHHQDTGEESASSGKLG
LITNTIAGVAGLITGGRRTRREAIVNAQPKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYTEGLMHNQNGLICGLRQ
LANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKIDQIIHDFVDKTLPDQGD
NDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF
>O11457 ~~~GP~~~Envelope glycoprotein~~~
MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVP
SATKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF
LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLT
YVQLESRFTPQFLLQLNETRYTSGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTAVSNRAKNISG
QSPARTSSDPGTNTTTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSLRPPITKPGPDNSTHNTPVYKLDISE
ATQVEQHHRRTDNASTTSDTPPATTAAGPLKAENTNTSKGTDLLDPATTTSPQNHSETAGNNNTHHQDTGEESASSGKLG
LITNTIAGVAGLITGGRRTRREAIVNAQPKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYIEGLMHNQDGLICGLRQ
LANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKIDQIIHDFVDKTLPDQGD
NDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF
>Q66814 ~~~GP~~~Envelope glycoprotein~~~
MEGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQLVCKDHLASTDQLKSVGLNLEGSGVSTDIP
SATKRWGFRSGVPPQVVSYEAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGDYAFHKDGAFF
LYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIREAANYTENTSSYYATSYLEYEIENFGAQHSTTLFKINNNT
FVLLDRPHTPQFLFQLNDTIQLHQQLSNTTGKLIWTLDANINADIGEWAFWENKKNLSEQLRGEELSFETLSLNETEDDD
ATSSRTTKGRISDRATRKYSDLVPKDSPGMVSLHVPEGETTLPSQNSTEGRRVDVNTQETITETTATIIGTNGNNMQIST
IGTGLSSSQILSSSPTMAPSPETQTSTTYTPKLPVMTTEESTTPPRNSPGSTTEAPTLTTPENITTAVKTVWPQESTSNG
LITSTVTGILGSLGLRKRSRRQVNTRATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQNALVCGLRQ
LANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTCRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDN
DDNWWTGWRQWIPAGIGITGIIIAIIALLCVCKLLC
>Q7T9D9 ~~~GP~~~Envelope glycoprotein~~~
MGGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQLVCKDHLASTDQLKSVGLNLEGSGVSTDIP
SATKRWGFRSGVPPKVVSYEAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGDYAFHKDGAFF
LYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIREAVNYTENTSSYYATSYLEYEIENFGAQHSTTLFKIDNNT
FVRLDRPHTPQFLFQLNDTIHLHQQLSNTTGRLIWTLDANINADIGEWAFWENKKNLSEQLRGEELSFEALSLNETEDDD
AASSRITKGRISDRATRKYSDLVPKNSPGMVPLHIPEGETTLPSQNSTEGRRVGVNTQETITETAATIIGTNGNHMQIST
IGIRPSSSQIPSSSPTTAPSPEAQTPTTHTSGPSVMATEEPTTPPGSSPGPTTEAPTLTTPENITTAVKTVLPQESTSNG
LITSTVTGILGSLGLRKRSRRQTNTKATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQNALVCGLRQ
LANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTCRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDN
DDNWWTGWRQWIPAGIGITGIIIAIIALLCVCKLLC
>P87666 ~~~GP~~~Envelope glycoprotein~~~
MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVP
SATKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF
LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLT
YVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTAVSNRAKNISG
QSPARTSSDPGTNTTTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSPQPPTTKPGPDNSTHNTPVYKLDISE
ATQVEQHHRRTDNDSTASDTPPATTAAGPLKAENTNTSKGTDLLDPATTTSPQNHSETAGNNNTHHQDTGEESASSGKLG
LITNTIAGVAGLITGGRRARREAIVNAQPKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYTEGLMHNQDGLICGLRQ
LANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKIDQIIHDFVDKTLPDQGD
NDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF
>Q05320 ~~~GP~~~Envelope glycoprotein~~~
MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVP
SATKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF
LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLT
YVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTVVSNGAKNISG
QSPARTSSDPGTNTTTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSPQSLTTKPGPDNSTHNTPVYKLDISE
ATQVEQHHRRTDNDSTASDTPSATTAAGPPKAENTNTSKSTDFLDPATTTSPQNHSETAGNNNTHHQDTGEESASSGKLG
LITNTIAGVAGLITGGRRTRREAIVNAQPKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYIEGLMHNQDGLICGLRQ
LANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKIDQIIHDFVDKTLPDQGD
NDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF
>P35253 ~~~GP~~~Envelope glycoprotein~~~
MKTTCFLISLILIQGTKNLPILEIASNNQPQNVDSVCSGTLQKTEDVHLMGFTLSGQKVADSPLEASKRWAFRTGVPPKN
VEYTEGEEAKTCYNISVTDPSGKSLLLDPPTNIRDYPKCKTIHHIQGQNPHAQGIALHLWGAFFLYDRIASTTMYRGKVF
TEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTSSNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPPPLPTARP
EIKLTSTPTDATKLNTTDPSSDDEDLATSGSGSGEREPHTTSDAVTKQGLSSTMPPTPSPQPSTPQQGGNNTNHSQDAVT
ELDKNNTTAQPSMPPHNTTTISTNNTSKHNFSTLSAPLQNTTNDNTQSTITENEQTSAPSITTLPPTGNPTTAKSTSSKK
GPATTAPNTTNEHFTSPPPTPSSTAQHLVYFRRKRSILWREGDMFPFLDGLINAPIDFDPVPNTKTIFDESSSSGASAEE
DQHASPNISLTLSYFPNINENTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAVLIKNQNNLVCRLR
RLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGGTCKVLGPDCCIGIEDLSKNISEQIDQIKKDEQKEGTGWGL
GGKWWTSDWGVLTNLGILLLLSIAVLIALSCICRIFTKYIG
>P35254 ~~~GP~~~Envelope glycoprotein~~~
MKTTCLFISLILIQGIKTLPILEIASNNQPQNVDSVCSGTLQKTEDVHLMGFTLSGQKVADSPLEASKRWAFRTGVPPKN
VEYTEGEEAKTCYNISVTDPSGKSLLLDPPTNIRDYPKCKTIHHIQGQNPHAQGIALHLWGAFFLYDRIASTTMYRGRVF
TEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTSNNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPSPLPTARP
EIKPTSTPTDATTLNTTDPNNDDEDLITSGSGSGEQEPYTTSDAVTKQGLSSTMPPTPSPQPSTPQQEGNNTDHSQGTVT
EPNKTNTTAQPSMPPHNTTAISTNNTSKNNFSTLSVSLQNTTNYDTQSTATENEQTSAPSKTTLPPTGNLTTAKSTNNTK
GPTTTAPNMTNGHLTSPSPTPNPTTQHLVYFRKKRSILWREGDMFPFLDGLINAPIDFDPVPNTKTIFDESSSSGASAEE
DQHASPNISLTLSYFPNINENTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIKNQNNLVCRLR
RLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGGTCKVLGPDCCIGIEDLSRNISEQIDQIKKDEQKEGTGWGL
GGKWWTSDWGVLTNLGILLLLSIAVLIALSCICRIFTKYIG
>Q1PDC7 ~~~GP~~~Envelope glycoprotein~~~
MKTIYFLISLILIQSIKTLPVLEIASNSQPQDVDSVCSGTLQKTEDVHLMGFTLSGQKVADSPLEASKRWAFRTGVPPKN
VEYTEGEEAKTCYNISVTDPSGKSLLLDPPSNIRDYPKCKTVHHIQGQNPHAQGIALHLWGAFFLYDRVASTTMYRGKVF
TEGNIAAMIVNKTVHRMIFSRQGQGYRHMNLTSTNKYWTSSNETQRNDTGCFGILQEYNSTNNQTCPPSLKPPSLPTVTP
SIHSTNTQINTAKSGTMNPSSDDEDLMISGSGSGEQGPHTTLNVVTEQKQSSTILSTPSLHPSTSQHEQNSTNPSRHAVT
EHNGTDPTTQPATLLNNTNTTPTYNTLKYNLSTPSPPTRNITNNDTQRELAESEQTNAQLNTTLDPTENPTTGQDTNSTT
NIIMTTSDITSKHPTNSSPDSSPTTRPPIYFRKKRSIFWKEGDIFPFLDGLINTEIDFDPIPNTETIFDESPSFNTSTNE
EQHTPPNISLTFSYFPDKNGDTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIKNQNNLVCRLR
RLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGGTCKVLGPDCCIGIEDLSKNISEQIDKIRKDEQKEETGWGL
GGKWWTSDWGVLTNLGILLLLSIAVLIALSCICRIFTKYIG
>Q66810 ~~~GP~~~Envelope glycoprotein~~~
MGASGILQLPRERFRKTSFFVWVIILFHKVFSIPLGVVHNNTLQVSDIDKFVCRDKLSSTSQLKSVGLNLEGNGVATDVP
TATKRWGFRAGVPPKVVNYEAGEWAENCYNLAIKKVDGSECLPEAPEGVRDFPRCRYVHKVSGTGPCPGGLAFHKEGAFF
LYDRLASTIIYRGTTFAEGVIAFLILPKARKDFFQSPPLHEPANMTTDPSSYYHTTTINYVVDNFGTNTTEFLFQVDHLT
YVQLEARFTPQFLVLLNETIYSDNRRSNTTGKLIWKINPTVDTSMGEWAFWENKKNFTKTLSSEELSFVPVPETQNQVLD
TTATVSPPISAHNHAGEDHKELVSEDSTPVVQMQNIKGKDTMPTTVTGVPTTTPSPFPINARNTDHTKSFIGLEGPQEDH
STTQPAKTTSQPTNSTESTTLNPTSEPSSRGTGPSSPTVPNTTESHAELGKTTPTTLPEQHTAASAIPRAVHPDELSGPG
FLTNTIRGVTNLLTGSRRKRRDVTPNTQPKCNPNLHYWTALDEGAAIGLAWIPYFGPAAEGIYTEGIMENQNGLICGLRQ
LANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPQDWTKNITDKIDQIIHDFVDNNLPNQND
GSNWWTGWKQWVPAGIGITGVIIAIIALLCICKFML
>P68550 ~~~~~~Probable host range protein 2-1~~~
MGVQHKLDIFLVSEGIAIKEANLLKGDSYGCTIKIKLDKEKTFKFVIVLEPEWIDEIKPIYMKVNDESVELELDYKDAIK
RIYSAEVVLCSDSVINLFSDVDVSYTCEYPTIKVNTIKKYYSVQNRGMTYVHIESPINTKDKCWFVEKNGWYEDRTHS
>Q9Q8N4 ~~~~~~Probable host range protein 2-3~~~
MEEGIVHKLDVFLIDENVSIKHVNLFDGDSYGCNIHLKTATCKYITFILVLEPDWENIVEAKPIHMRLNGKKIRVPLVAK
THTSLIYKVVIYVEEDALARFYSDVERSYTDVYPTFLVNTDTRRYYILDSGRTYTYIDPFISDGDKRRWLTREIEEAYDA
STEEEEEDDTEEDMDTVHLYCLEEEDEEKIADTGNDNQKDAED
>P68742 ~~~~~~Viral histone-like protein~~~
MSTKKKPTITKQELYSLVAADTQLNKALIERIFTSQQKIIQNALKHNQEVIIPPGIKFTVVTVKAKPARQGHNPATGEPI
QIKAKPEHKAVKIRALKPVHDMLN
>F5HAY6 ~~~~~~Viral inhibitor of caspase-8-induced apoptosis~~~
MDDLRDTLMAYGCIAIRAGDFNGLNDFLEQECGTRLHVAWPERCFIQLRSRSALGPFVGKMGTVCSQGAYVCCQEYLHPF
GFVEGPGFMRYQLIVLIGQRGGIYCYDDLRDCIYELAPTMKDFLRHGFRHCDHFHTMRDYQRPMVQYDDYWNAVMLYRGD
VESLSAEVTKRGYASYSIDDPFDECPDTHFAFWTHNTEVMKFKETSFSVVRAGGSIQTMELMIRTVPRITCYHQLLGALG
HEVPERKEFLVRQYVLVDTFGVVYGYDPAMDAVYRLAEDVVMFTCVMGKKGHRNHRFSGRREAIVRLEKTPTCQHPKKTP
DPMIMFDEDDDDELSLPRNVMTHEEAESRLYDAITENLMHCVKLVTTDSPLATHLWPQELQALCDSPALSLCTDDVEGVR
QKLRARTGSLHHFELSYRFHDEDPETYMGFLWDIPSCDRCVRRRRFKVCDVGRRHIIPGAANGMPPLTPPHAYMNN
>P13202 ~~~~~~Immediate early protein IE1~~~
MESSAKRKMDPDNPDEGPSSKVPRPETPVTKATTFLQTMLRKEVNSQLSLGDPLFPELAEESLKTFEQVTEDCNENPEKD
VLAELVKQIKVRVDMVRHRIKEHMLKKYTQTEEKFTGAFNMMGGCLQNALDILDKVHEPFEEMKCIGLTMQSMYENYIVP
EDKREMWMACIKELHDVSKGAANKLGGALQAKARAKKDELRRKMMYMCYRNIEFFTKNSAFPKTTNGCSQAMAALQNLPQ
CSPDEIMAYAQKIFKILDEERDKVLTHIDHIFMDILTTCVETMCNEYKVTSDACMMTMYGGISLLSEFCRVLCCYVLEET
SVMLAKRPLITKPEVISVMKRRIEEICMKVFAQYILGADPLRVCSPSVDDLRAIAEESDEEEAIVAYTLATAGVSSSDSL
VSPPESPVPATIPLSSVIVAENSDQEESEQSDEEEEEGAQEEREDTVSVKSEPVSEIEEVAPEEEEDGAEEPTASGGKST
HPMVTRSKADQ
>P03169 ~~~~~~Immediate early protein IE1~~~
MESSAKRKMDPDNPDEGPSSKVPRPETPVTKATTFLQTMLRKEVNSQLSLGDPLFPELAEESLKTFERVTEDCNENPEKD
VLAELVKQIKVRVDMVRHRIKEHMLKKYTQTEEKFTGAFNMMGGCLQNALDILDKVHEPFEEMKCIGLTMQSMYENYIVP
EDKREMWMACIKELHDVSKGAANKLGGALQAKARAKKDELRRKMMYMCYRNIEFFTKNSAFPKTTNGCSQAMAALQNLPQ
CSPDEIMAYAQKIFKILDEERDKVLTHIDHIFMDILTTCVETMCNEYKVTSDACMMTMYGGISLLSEFCRVLSCYVLEET
SVMLAKRPLITKPEVISVMKRRIEEICMKVFAQYILGADPLRVCSPSVDDLRAIAEESDEEEAIVAYTLATRGASSSDSL
VSPPESPVPATIPLSSVIVAENSDQEESEQSDEEEEEGAQEEREDTVSVKSEPVSEIEEVAPEEEEDGAEEPTASGGKST
HPMVTRSKADQ
>P19893 ~~~~~~Viral transcription factor IE2~~~
MESSAKRKMDPDNPDEGPSSKVPRPETPVTKATTFLQTMLRKEVNSQLSLGDPLFPELAEESLKTFEQVTEDCNENPEKD
VLAELGDILAQAVNHAGIDSSSTGPTLTTHSCSVSSAPLNKPTPTSVAVTNTPLPGASATPELSPRKKPRKTTRPFKVII
KPPVPPAPIMLPLIKQEDIKPEPDFTIQYRNKIIDTAGCIVISDSEEEQGEEVETRGATASSPSTGSGTPRVTSPTHPLS
QMNHPPLPDPLGRPDEDSSSSSSSSCSSASDSESESEEMKCSSGGGASVTSSHHGRGGFGGAASSSLLSCGHQSSGGAST
GPRKKKSKRISELDNEKVRNIMKDKNTPFCTPNVQTRRGRVKIDEVSRMFRNTNRSLEYKNLPFTIPSMHQVLDEAIKAC
KTMQVNNKGIQIIYTRNHEVKSEVDAVRCRLGTMCNLALSTPFLMEHTMPVTHPPEVAQRTADACNEGVKAAWSLKELHT
HQLCPRSSDYRNMIIHAATPVDLLGALNLCLPLMQKFPKQVMVRIFSTNQGGFMLPIYETAAKAYAVGQFEQPTETPPED
LDTLSLAIEAAIQDLRNKSQ
>Q6SWP7 ~~~~~~Viral transcription factor IE2~~~
MESSAKRKMDPDNPDEGPSSKVPRPETPVTKATTFLQTMLRKEVNSQLSLGDPLFPELAEESLKTFERVTEDCNENPEKD
VLAELGDILAQAVNHAGIDSSSTGPTLTTHSCSVSSAPLNKPTPTSVAVTNTPLPGASATPELSPRKKPRKTTRPFKVII
KPPVPPAPIMLPLIKQEDIKPEPDFTIQYRNKIIDTAGCIVISDSEEEQGEEVETRGATASSPSTGSGTPRVTSPTHPLS
QMNHPPLPDPLGRPDEDSSSSSSSCSSASDSESESEEMKCSSGGGASVTSSHHGRGGFGGAASSSLLSCGHQSSGGASTG
PRKKKSKRISELDNEKVRNIMKDKNTPFCTPNVQTRRGRVKIDEVSRMFRNTNRSLEYKNLPFTIPSMHQVLDEAIKACK
TMQVNNKGIQIIYTRNHEVKSEVDAVRCRLGTMCNLALSTPFLMEHTMPVTHPPEVAQRTADACNEGVKAAWSLKELHTH
QLCPRSSDYRNMIIHAATPVDLLGALNLCLPLMQKFPKQVMVRIFSTNQGGFMLPIYETAAKAYAVGQFEQPTETPPEDL
DTLSLAIEAAIQDLRNKSQ
>O92503 2.3.2.27~~~IE2~~~E3 ubiquitin-protein ligase IE2~~~
MSRQINAVTPSSSSRRHRLSLSRRRINFTTSPEALPSSSSRSQPSSSSRSQPYSSSRSQPYSSSRRRRRQERSQEQRVSE
DNVQIIGNANEPLTRTYHSQGVTYHVHGQVNISNDDPLLSQEDDTIESVDRASQQYQNSIASETAAQRALQRGLDLESQL
MSEISPRSPAYSPPYPSNDVLSQSPDLFDSPQSPQQHELELEDEDEEEEEEEGEEVEVSCNICFTTLKDTKNVDSSFVTS
IDCNHAVCFKCYVRIIMDNSTYKCFCSASSSDFRVYNKHGYVEFMPLTLIRNRDSIKQHWRELLENNTVNNRIIDLNDVE
RLERERSELRAKNSQVEHKMTMLNCDYAMLKHEHKITELKLKWANRDLEEFTKKTQELQSTVNDLQEQLRKQVAESQAKF
SQFERRNSELVAELYTIEMSKP
>P19563 ~~~vif~~~Virion infectivity factor~~~
MERTLQSVVGRRRGSSNRGRGKNSLISTPSYALHPPPRFRYPRWEFVRQTEYSMTACVRKGKLVLTYQYAIWKRVWTIET
GFTDPSLFMTPAGTHTTEEIGHLDLFWLRYCSCPHEMPPWLDFLRGTLNLRISCRRALQASVLTSTPRHSLQRLAALQLC
TNACLCWYPLGRINDTTPLWLNFSSGKEPTIQQLSGHP
>P69721 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWH
LGQGVSIEWRKKRYSTQVDPELADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
PPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH
>P69723 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWH
LGQGVSIEWRKKRYSTQVDPELADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
PPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH
>Q70623 ~~~vif~~~Truncated virion infectivity factor~~~
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWH
LGQGVSIEWRKKRYSTQVDPELADQLIHLYYFDCFSDSAIRKALLGHIVSPR
>P12504 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRINTWKRLVKHHMYISRKAKDWFYRHHYESTNPKISSEVHIPLGDAKLVITTYWGLHTGERDWH
LGQGVSIEWRKKRYSTQVDPDLADQLIHLHYFDCFSESAIRNTILGRIVSPRCEYQAGHNKVGSLQYLALAALIKPKQIK
PPLPSVRKLTEDRWNKPQKTKGHRGSHTMNGH
>P31820 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWH
LGQGVSIKWRKKRYSTQVDPELADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
PPLPSVTKLTEDRWNKPQKTKGHRRSHTMNGH
>P18805 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRINTWKSLVKYHMYVSKKANRWFYRHHYDSHHPKISSEVHIPLGEARLVVTTYWGLHTGEKEWH
LGQGVSIEWRKRRYSTQVDPGLADQLIHMYYFDCFAESAIRKAILGHIVSPSCEYQAGHNKVGSLQYLALAALIAPKKIK
PPLPSVRKLTEDRWNKPQKTKGRRGSHTMNGH
>P18097 ~~~vif~~~Virion infectivity factor~~~
MEEDRNWIVVPTWRVPGRMEKWHALVKYLKYRTKDLEEVRYVPHHKVGWAWWTCSRVIFPLQGKSHLEIQAYWNLTPEKG
WLSSHAVRLTWYTEKFWTDVTPDCADILIHSTYFSCFTAGEVRRAIRGEKLLSCCNYPQAHKAQVPSLQYLALVVVQQND
RPQRKGTARKQWRRDHWRGLRVAREDHRSLKQGGSEPSAPRAHFPGVAKVLEILA
>P24108 ~~~vif~~~Virion infectivity factor~~~
MEEGKSWIAVPTWRVPGRMEKWHSLVKYLKYKTGDLEQVCYVPHHKVGWAWWTCSRVIFPLRGDSRLEIQAYWNLTPEKG
WLSSYAVRMTWYTEKFWTDVTPDCADTLIHSTYFSCFTAGEVRRAIRGEKLLSCCKYPRAHRSQVPSLQFLALVVVQQND
RPQRDRTTRKQWRRDYRRGLRLARQDSRSYKQRGSESPAPGAYFPGVAKVLEILA
>P17758 ~~~vif~~~Virion infectivity factor~~~
MEEGKNWIVVPTWRVPGRMERWHSLVKHLKYRTKDLEEVRYVPHHKVGWAWWTCSRVIFPLEGESHLEIQAYWNLTPEKG
WLSSHSVRLTWYTEKFWTDVTPDCADSLIHSTYFSCFTAGEVRRAIRGEKLLSCCNYPQAHKAQVPSLQYLALVVVQQNG
RPQRKGAARKQWRRDHWRGLRVARQDYRSLKQGGSEPSAPRAHFPGVAKVLGILA
>P15834 ~~~vif~~~Virion infectivity factor~~~
MEEEKDWIVVPTWRIPGRLERWHSLIKYLKYRTGELQQVSYVPHHKVGWAWWTCSRIIFPLNKGAWLEVQGYWNLTPERG
FLSSYAVRLTWYERNFYTDVTPDVADQLLHGSYFSCFSANEVRRAIRGEKILSYCNYPSAHEGQVPSLQFLALRVVQEGK
NGSQGESATRKQRRRNSRRSIRLARKNNNRAQQGSGQPFAPRTYFPGLAEVLGILA
>Q89753 ~~~vif~~~Virion infectivity factor~~~
MEEEKNWIAVPTWRIPCRLERWHSLIKYLKYRTKDLQQVSYVPHHKVGWAWWTCSRVIFPLKEGAHLEVQGYWNLTPERG
FLSSYAVRLTWYERSFYTDVTPDVADRLLHGSYFSSFTANEVRRAIRGEKILSHCNYPSAHTGQVPSLQFLALRVVQEGK
DGSQGESTTRKQRRRNSRRGIRMARDNIRTSQQSSSQSLAQGTYFPGLAEVLGILA
>P18043 ~~~vif~~~Virion infectivity factor~~~
MEEGKNWIVPTWRVPGRRMERWHSLVKYLKYRTRDLEEVRYVPHHKVGWAWWTCSRVIFPLKGESHLEIQAYWNLTPEKG
WLSSHSVRITWYTERFWTDVTPDYADILIHSTYFSCFTAGEVRRAIRGEKLLSCCNYPQAHKVQVPSLQYLALVVVQQND
RPQRKGTARKQWRRDHWRGLRVARQDYRSLKQRGSEPSAPRAHFPGVAKVLEILA
>Q74121 ~~~vif~~~Virion infectivity factor~~~
MEEGERWIVVPTWRVPGRMEKWHSLVKYLKHRTKDLEGVCYVPHHKVGWAWWTCSRVIFPLQGNSHLEIQAYWNLTPEKG
WLSSYAVRITWYTERFWTDVTPDCADSLIHSTYFSCFTAGEVRRAIRGEKLLSCCNYPQAHRSKVPLLQFLALVVVQQNG
RPQKNSTTRKRWRSNYWRGFRLARKDGRGHKQRGSEPPASGAYFPGVAKVLEILA
>P05901 ~~~vif~~~Virion infectivity factor~~~
MEEGKRWIVVPIWRVPGRMERWHSLVKYLKYRTKDLEKVCYVPHHKVGWAWWTCSRVIFPLKENSHLEIQAYWNLTPEKG
WLSSHSVRITWYTEKFWTDVTPDCADTLIHSTYFSCFTAGEVRRAIRGEKLLSCCKYPRAHRSQVPSLQFLALVVVQQND
RSQGNSATRKQRRGDYRRGLRMARQDSRGYKQRGSESPPTRAHFPGLAEVLEILA
>P04595 ~~~vif~~~Virion infectivity factor~~~
MEEDKRWIVVPTWRVPGRMEKWHSLVKYLKYKTKDLEKVCYVPHHKVGWAWWTCSRVIFPLKGNSHLEIQAYWNLTPEKG
WLSSYSVRITWYTEKFWTDVTPDCADVLIHSTYFPCFTAGEVRRAIRGEKLLSCCNYPRAHRAQVPSLQFLALVVVQQND
RPQRDSTTRKQRRRDYRRGLRLAKQDSRSHKQRSSESPTPRTYFPGVAEVLEILA
>P12452 ~~~vif~~~Virion infectivity factor~~~
MDQGKRWIAVPTWRVPGRMEKWHSLIKYLKYRTKDLEQVRYVPHHKVGWAWWTCSRVIFPLKGNSHLEIQAYWNLTPEKG
WLSSYSVRMTWYSEGFWTDVTPDCADTLIHSTYFSCFTAGEVRRAIRGEKSLSCCNYPQAHKSKVPSLQFLALVVVQQND
KPQRDNTTRKQWRRNYRRGLRLARQDGRSHKQRGSEPPAQGAYFPGVAKVLEILA
>P20878 ~~~vif~~~Virion infectivity factor~~~
MEEGKRWIAVPTWRVPGRMERWHSLIKYLKYRTGDLEKVCYVPHHKVGWAWWTCSRVIFPLKGESHLEIQAYWNLTPEKG
WLSSYSVRLTWYTEKFWTDVTPDCADSLIHSTYFSCFTAGEVRRAIRGEKLLSCCNYPQAHKYQVPSLQFLALVVVQQNG
RPQRDNTTRKQWRRNYRRGLRVARQDGRSHKQRGSEPPAPRAYFPGVAKVLEILA
>Q76635 ~~~vif~~~Virion infectivity factor~~~
MEGEKNWIVVPTWRIPGRLEKWHSLVKYLKHRTKELQQVSYVPHHKVGWAWWTCSRVIFPLKEEAYLEVQGYWNLTPERG
FLSSYAVRLTWYKRSFYTDVTPDVADQLLHGSYFSCFTANEVRRAIRGEKILSYCNYPSAHEGQVPSLQFLALRVIQEGK
DGSQGESATRKQRRRNNRRSIRLARKNNNRAQQGSSQPLAPRTHFPGLAEVLGILA
>P17284 ~~~vif~~~Virion infectivity factor~~~
MENRWQVMIVWQVDRMRIKTWNSLVKYHIYRSKKARGWFYRHHYDHPNPKVASEIHIPFRDYSKLIVTTYWALSPGERAW
HLGHGVSIQWRLGSYVTQVDPFTADRLIHSQYFDCFAETAIRRAILGQLVAPRCEYKEGHRQVGSLQFLALKALISERRH
RPPLPSVAKLTEDRWNKHQRTKVHQENLTRNGH
>Q02841 ~~~vif~~~Virion infectivity factor~~~
MEREKQWIVRVVWRVSERQISRWRGIVTYKIRNKQLPWEYRHHWQVQWQFWTYSQFIIPLSKDDYIEVNIYHNLTPERGW
LSSHGVGLSYYHQKGYKTEVDPGTADRMIHLYYFNCFTDRAIQQAIRGEKYTWCTFKEGHKGQVQSLQLLALVAYTNGIR
KRSKRTFTRMAGNLGSRQGAMGRMATRHAQGSKRRSQKALWNEHANPSMELLCRGGKET
>P22383 ~~~vif~~~Virion infectivity factor~~~
MERVERIVRLTWKVSSQRIEKWHWLVRRQMAWATANNEEGCWWLYPHFMAYNEWYTCSKVVIIINRDIRLIVRSYWHLQI
EVGCLSTYAVSIEAVVRPPPFEKEWCTEITPEVADHLIHLHFYDCFMDSAVMKAIRGEEVLKVCRFPAGHKAQGVLSLQF
LCLRVIYGPEER
>P05903 ~~~vif~~~Virion infectivity factor~~~
MEEEKRWIVVPTWRIPERLERWHSLIKYLKYKTKDLQKACYVPHHKVGWAWWTCSRVIFPLQEGSHLEVQGYWNLTPERG
WLSTYAVRITWYSKDFWTDVTPEYADILLHSTYFPCFTAGEVRRAIRGERLLSCCRFPRAHKHQVPSLQYLALRVVSHVR
SQGENPTWKQWRRDNRRSLRVAKQNSRGDKQRGGKPPTEGANFPGLAKVLGILA
>P05902 ~~~vif~~~Virion infectivity factor~~~
MEEEKRWIAVPTWRIPERLERWHSLIKYLKYKTKDLQKVCYVPHFKVGWAWWTCSRVIFPLQEGSHLEVQGYWHLTPERG
WPSTYAVRITWYSRDLLDRCNTRLCRHFSCIALISLFTAGEVRRAIRGEQLLSCCKFPRAHRYQVPSLQYLALKVVSDVR
SQGENPTWKQWRRDNRRGLRMAKQNSRGDKQRGGKPPTKGADFPGLAKVLGILA
>P12505 ~~~vif~~~Virion infectivity factor~~~
MEEEKNWIVVPTWRIPERLERWHSLIKHLKYNTKDLQMACYVPHHKVGWAWWTCSRVIFPLRDETHLEVQGYWNLAPEKG
WLSTYAVRITWYSRNFWTDVTPDYADTLLHSTYFPCFSEGEVRKAIRGEKLLSCCKFPKAHKNQVPSLQYLALTVVSHVR
SQGEDPTWKQWRRNNRKGLRMAKQNSRRNKQGSSKSPAEGANFPGLAKVLGILA
>P19506 ~~~vif~~~Virion infectivity factor~~~
MEEEKNWIVVPTWRIPGRLEKWHSLIKHLKYNTKDLQKACYVPHHKVGWAWWTCSRVIFPLKDEAHLEVQGYWNLTPEKG
WLSTYAVRITWYSRNFWTDVTPDYADTLLHGTYFPCFSEGEVRRAIRGEKLLSCCKFPKAHKNQVPSLQYLALTVVSHVR
SQGENPTWKQWRRNNRRGLRLARQNSRRNKQGSSESFAEGTNFPGLAKVLGILA
>P89905 ~~~vif~~~Virion infectivity factor~~~
MEREKLWVTRLTWRVSGEHIDKWKGIVKYHMRNRLQDWTYLMHYQCGWAWYTCSRFLIPLGGEGKIVVDCYWHLTPEQGW
LSTYAVAISFENWQNTYKTEVTPDVADHMIHCHYFPCFTDRAIQQAIRGESFLWCTYKEGHVAENHWGQVRSLQFLALTV
YTDFLRNGRRKRFQGKKTRMVRNLGSQQGAVGRMIKRHGSRTQSGSTTPFWERTPLPSMELLSGRRGKEWGTNDRKGL
>Q8AII0 ~~~~~~Virion infectivity factor~~~
MENRWQVQVVWMIDRMRLRTWTSLVKHHIFTTKCCKDWKYRHHYETDTPKRAGEIHIPLTERSKLVVLHYWGLACGERPW
HLGHGIGLEWRQGKYSTQIDPETADQLIHTRYFTCFAAGAVRQAILGERILTFCHFQSGHRQVGTLQFLAFRKVVESQDK
QPKGPRRPLPSVTKLTEDRWNKHRTTTGRRENHTLSGC
>P27974 ~~~vif~~~Virion infectivity factor~~~
MSQEKHWVMRLTWKVQEEVITKWQGIVRYWMNKRNLKWEYKMHYQITWAWYTMSRYVIPLPGSGEIHVDIYWHLAPKQGW
LSTYAVGIQYVSLVNDKYRTELDPNTADSMIHCHYFTCFTDRAIQQALRGNRFIFCQFPGGHKLTGQVPSLQYLALLAHQ
NGLRKRSQRGETRRTRNLGSQQGAVGRMAQRYGRRNQQRSQTAFWPRTPIPSMELLSGGRGETGKTHSGKGI
>P27983 ~~~vif~~~Virion infectivity factor~~~
MNQEKEWVMRVTWKVPEELITKWQGIVRYWMRTRKLDWKYRMHYQITWAWYTMSRYEIPLGQHGSIHVDLYWHLTPEKGW
LSTYAEGIQYLSNRDPWYRTELDPATADSLIHTHYFTCFTERAIRKALLGQRFTFCQFPEGHKKTGQVPSLQYLALLAHQ
NGLRQRSQRSKTGGTRNMGFEQGAVGRMAKRHARRYQSGSQDAFWARAPVPSMELLSGGGRKESHSHARKGL
>P05904 ~~~vif~~~Virion infectivity factor~~~
MNPNKEWVMRVTWKVPGDLITKWQGIVRYWMRQRNLKWNYYMHYQITWAWYTMSRYVIPIGKHGEICVDLYWHLTPEQGW
LSTYAVGIQYVSNLESKYRTELDPATADSIIHGHYFNCFKERAIQQALRGHRFVFCQFPEGHKSTGQVPSLQYLALLAHQ
NGLRERSKRGKTRRSRNLGSKQGAVGQMAKRYVTRSQPGGEAAFWERTPVPSMELLSGGRRKTWYSHDGKGLQIL
>P69717 ~~~vif~~~Virion infectivity factor~~~
MLSSYRHQKKYKKNKAREIGPQLPLWAWKETAFSINQEPYWYSTIRLQGLMWNKRGHKLMFVKENQGYEYWETSGKQWKM
EIRRDLDLIAQINFRNAWQYKSQGEWKTIGVWYESPGDYKGKENQFWFHWRIALCSCNKTRWDIREFMIGKHRWDLCKSC
IQGEIVKNTNPRSLQRLALLHLAKDHVFQVMPLWRARRVTVQKFPWCRSPMGYTIPWSLQECWEMESIFE
>Q2HRC7 ~~~K2~~~Viral interleukin-6 homolog~~~
MRWFKLWSILLVGSLLVSGTRGKLPDAPEFEKDLLIQRLNWMLWVIDECFRDLCYRTGICKGILEPAAIFHLKLPAINDT
DHCGLIGFNETSCLKKLADGFFEFEVLFKFLTTEFGKSVINVDVMELLTKTLGWDIQEELNKLTKTHYSPPKFDRGLLGR
LQGLKYWVRHFASFYVLSAMEKFAGQAVRVLNSIPDVTPDVHDK
>P21442 2.7.7.-~~~int~~~Integrase~~~
MAVRKDTKNGKWLAEVYVNGNASRKWFLTKGDALRFYNQAKEQTTSAVDSVQVLESSDLPALSFYVQEWFDLHGKTLSDG
KARLAKLKNLCSNLGDPPANEFNAKIFADYRKRRLDGEFSVNKNNPPKEATVNREHAYLRAVFNELKSLRKWTTENPLDG
VRLFKERETELAFLYERDIYRLLAECDNSRNPDLGLIVRICLATGARWSEAETLTQSQVMPYKITFTNTKSKKNRTVPIS
KELFDMLPKKRGRLFNDAYESFENAVLRAEIELPKGQLTHVLRHTFASHFMMNGGNILVLKEILGHSTIEMTMRYAHFAP
SHLESAVKFNPLSNPAQ
>P22884 2.7.7.-~~~~~~Integrase~~~
MARRGWGSLKTQRSGRIQASYVNPQDGVRYYALQTYDNKMDAEAWLAGEKRLIEMETWTPPQDRAKKAAASAITLEEYTR
KWLVERDLADGTRDLYSGHAERRIYPVLGEVAVTEMTPALVRAWWAGMGRKHPTARRHAYNVLRAVMNTAVEDKLIAENP
CRIEQKAADERDVEALTPEELDIVAAEIFEHYRIAAYILAWTSLRFGELIELRRKDIVDDGMTMKLRVRRGASRVGNKIV
VGNAKTVRSKRPVTVPPHVAEMIRAHMKDRTKMNKGPEAFLVTTTQGNRLSKSAFTKSLKRGYAKIGRPELRIHDLRAVG
ATFAAQAGATTKELMARLGHTTPRMAMKYQMASEARDEAIAEAMSKLAKTS
>P36932 2.7.7.-~~~int~~~Integrase~~~
MAIKKLDDGRYEVDIRPTGRNGKRIRRKFDKKSEAVAFEKYTLYNHHNKEWLSKPTDKRRLSELTQIWWDLKGKHEEHGK
SNLGKIEIFTKITNDPCAFQITKSLISQYCATRRSQGIKPSSINRDLTCISGMFTALIEAELFFGEHPIRGTKRLKEEKP
ETGYLTQEEIALLLAALDGDNKKIAILCLSTGARWGEAARLKAENIIHNRVTFVKTKTNKPRTVPISEAVAKMIADNKRG
FLFPDADYPRFRRTMKAIKPDLPMGQATHALRHSFATHFMINGGSIITLQRILGHTRIEQTMVYAHFAPEYLQDAISLNP
LRGGTEAESVHTVSTVE
>P03700 2.7.7.-~~~int~~~Integrase~~~
MGRRRSHERRDLPPNLYIRNNGYYCYRDPRTGKEFGLGRDRRIAITEAIQANIELFSGHKHKPLTARINSDNSVTLHSWL
DRYEKILASRGIKQKTLINYMSKIKAIRRGLPDAPLEDITTKEIAAMLNGYIDEGKAASAKLIRSTLSDAFREAIAEGHI
TTNHVAATRAAKSEVRRSRLTADEYLKIYQAAESSPCWLRLAMELAVVTGQRVGDLCEMKWSDIVDGYLYVEQSKTGVKI
AIPTALHIDALGISMKETLDKCKEILGGETIIASTRREPLSSGTVSRYFMRARKASGLSFEGDPPTFHELRSLSARLYEK
QISDKFAQHLLGHKSDTMASQYRDDRGREWDKIEIK
>P20214 2.7.7.-~~~~~~Integrase~~~
MTKDKTRYKYGDYILRERKGRYYVYKLEYENGEVKERYVGPLADVVESYLKMKLGVVGDTPLQADPPGFEPGTSGSGGGK
EGTERRKIALVANLRQYATDGNIKAFYDYLMNERGISEKTAKDYINAISKPYKETRDAQKAYRLFARFLASRNIIHDEFA
DKILKAVKVKKANADIYIPTLEEIKRTLQLAKDYSENVYFIYRIALESGVRLSEILKVLKEPERDICGNDVCYYPLSWTR
GYKGVFYVFHITPLKRVEVTKWAIADFERRHKDAIAIKYFRKFVASKMAELSVPLDIIDFIQGRKPTRVLTQHYVSLFGI
AKEQYKKYAEWLKGV
>F5HF68 ~~~vIRF-1~~~VIRF-1~~~
MDPGQRPNPFGAPGAIPKKPCLSQGSPGTSGSGAPCDEPSRSESPGEGPSGTGGSAAAGDITRQAVVAAITEWSRTRQLR
ISTGASEGKASIKDWIVCQVNSGKFPGVEWEDEERTRFRIPVTPLADPCFEWRRDGELGVVYIRERGNMPVDASFKGTRG
RRRMLAALRRTRGLQEIGKGISQDGHHFLVFRVRKPEEEQCVECGVVAGAVHDFNNMARLLQEGFFSPGQCLPGEIVTPV
PSCTTAEGQEAVIDWGRLFIRMYYNGEQVHELLTTSQSGCRISSALRRDPAVHYCAVGSPGQVWLPNVPNLACEIAKREL
CDTLDACAKGILLTSSCNGIFCVCYHNGPVHFIGNTVPPDSGPLLLPQGKPTRIFNPNTFLVGLANSPLPAPSHVTCPLV
KLWLGKPVAVGKLEPHAPSPRDFAARCSNFSDACVVLEIMPKPLWDAMQ
>Q2HR71 ~~~vIRF-2~~~Viral IRF2-like protein~~~
MPRYTESEWLTDFIIDALDSGRFWGVGWLDEQKRIFTVPGRNRRERMPEGFDDFYEAFLEERRRHGLPEIPETETGLGCF
GRLLRTANRARQERPFTIYKGKMKLNRWIMTPRPYKGCEGCLVYLTQEPAMKNMLKALFGIYPHDDKHREKALRRSLRKK
AQREAARKQAAAVATPTTSSAAEVSSRSQSEDTESSDSENELWVGAQGFVGRDMHSLFFEEPEPSGFGSSGQSSSLLAPD
SPRPSTSQVQGPLHVHTPTDLCLPTGGLPSPVIFPHETQGLLAPPAGQSQTPFSPEGPVPSHVSGLDDCLPMVDHIEGCL
LDLLSDVGQELPDLGDLGELLCETASPQGPMQSEGGEEGSTESVSVLPATHPLESSAPGASVMGSGQELPDLGDLSELLC
ETASPQGPMQSEGGEEGSTESVSVLPATHPLESSAPGASVMGSSFQASDNVDDFIDCIPPLCRDDRDVEDQEKADQTFYW
YGSDMRPKVLTATQSVAAYLSKKQAIYKVGDKLVPLVVEVYYFGEKVKTHFDLTGGIVICSQVPEASPEHICQTVPPYKC
LLPRTAHCSVDANRTLEQTLDRFSMGVVAIGTNMGIFLKGLLEYPAYFVGNASRRRIGKCRPLSHRHEIQQAFDVERHNR
EPEGSRYASLFLGRRPSPEYDWDHYPVILHIYLAPFYHRD
>F5HIC6 ~~~vIRF-3~~~Viral IRF3-like protein~~~
MAGRRLTWISEFIVGALDSDKYPLVKWLDRSTGTFLAPAARNDVIPLDSLQFFIDFKRECLSKGLHPRDLLGSPITAFGK
ICTTSRRLRRLPGEEYEVVQGINCRRWRLLCAEVKECWWCVHARTHLHSGSSLWEILYQHSVRLEKHRRRPRPFVGENSD
SSEEDHPAFCDVPVTQTGAESEDSGDEGPSTRHSASGVQPVDDANADSPGSGDEGPSTRHSDSQPPPADETTVHTDNVED
DLTLLDKESACALMYHVGQEMDMLMRAMCDEDLFDLLGIPEDVIATSQPGGDTDASGVVTEGSIAASAVGAGVEDVYLAG
ALEAQNVAGEYVLEISDEEVDDGAGLPPASRRRPVVGEFLWDDGPRRHERPTTRRIRHRKLRSAYYRVARPPVMITDRLG
VEVFYFGRPAMSLEVERKVFILCSQNPLADISHSCLHSRKGLRVLLPKPDDNNTGPGDVNLLAAVLRSFASGLVIVSLRS
GIYVKNLCKSTVLYHGNNPPKKFGVICGLSSRAVLDVFNVAQYRIQGHEHIKKTTVFIGGDPTSAEQFDMVPLVIKLRLR
SVTCDD
>Q2HR73 ~~~vIRF-4~~~Viral IRF4-like protein~~~
MPKAGGSEWATLWIIDALENNKFPYFSWFDRNNLLFAAPAPLPAGSDIPPGWYSVYHAFDEECDRVYGPSPVVGQTVYGR
FGRLLRGTRRAVVRNDLRYSDTFGGSYVVWQLVRTPFKNCTYCYGAAYGPEKLQRFIQCLLSPPMQTTATRRSDTREQSY
EEAGAAAPAPPKAPSGLRGRPRKSNRYYNVGDITTEQKAACSVWIPVNEGASTSGMGSSGTRQVTQASSFTWRVPGDPPA
PSTLTGPSDPHSSGAGLPGTAPPKPQHETRLAGTVSGVSGVAQTPGDTGQLAPPMRDGSRLPSTSPWIPACFPWGDLPVT
GWWPQGASGLPEKVHPPTTGQFDPLSPRWTYTGIPSSQLNPAAPSWIPPHAQAGTFVGEFSQGAPLAPQGLLPQSGQCAS
AWLPRRETGAEGACGASTEGRAPQGAASERVYPFEPQPPSAPAPGYAKPSCYNWSPLAEPPATRPIRAPVWHPPVGHAVV
PEVRTPLWIPWSSGGAPNQGLSHTQGGASATPSAGAPPTPEVAERQEPSSSGIPYVCQGDNMATGYRRVTTSSGALEVEI
IDLTGDSDTPSTTVASTPLPVSGPRVFQPTVLYSAPEPAVNPEVSHLPTELERRECVCPGSGERPRVPLVSTYAGDRYAV
GGYGPEQSLVPPPLGLPLTLSNLQGEDICTWEEGLGNILSELQEEPSSSTRQATDRRRPRSRSPHGRRTPVSHSGPEKPP
SKMFFDPPDSQRVSFVVEIFVYGNLRGTLRREGDAGEAMLCSWPVGDTLGHLCQSFVPELLRIPRLTVPSPEQMEILNRV
FEGLGHGFPIFCSMSGIYSRNATQVEGWWFGNPNSRYERILRSFSPRVPQQLFNTARYLATTAAIPQTPLSVNPVTCGTV
FFGASPASTENFQNVPLTVKIFIGSIWDSLH
>A0A7H0DNC2 ~~~~~~Intermediate transcription factor 3 large subunit~~~
MDNLFTFLHEIEDRYARTIFNFHLISCDEIGDIYGLMKERISSEDMFDNIVYNKDIHPAIKKLVYCDIQLTKHIINQNTY
PVFNDSSQVKCCHYFDINSNNSNISSRTVEIFESEKSSLVSYIKTTNKKRKVNYGEIKKTVHGGTNANYFSGKKSDEYLS
TTVRSNINQPWIKTISKRMRVDIINHSIVTRGKSSILQTIEIIFTNRTCVKIFKDSTMHIILSKDKDEKGCINMIDKLFY
VYYNLFLLFEDIIQNDYFKEVANVVNHVLMATALDEKLFLIKKMAEHDVYGVSNFKIGMFNLTFIKSLDHTVFPSLLDED
SKIKFFKGKKLNIVALRSLEDCTNYVTKSENMIEMMKERSTILNSIDIETESVDRLKELLLK
>P20998 ~~~~~~Intermediate transcription factor 3 large subunit~~~
MDNLFTFLHEIEDRYARTIFNFHLISCDEIGDIYGLMKERISSEDMFDNIVYNKDIHHAIKKLVYCDIQLTKHIINQNTY
PVFNDSSQVKCCHYFDINSDNSNISSRTVEIFEREKSSLVSYIKTTNKKRKVNYGEIKKTVHGGTNANYFSGKKSDEYLS
TTVRSNINQPWIKTISKRMRVDIINHSIVTRGKSSILQTIEIIFTNRTCVKIFKDSTMHIILSKDKDEKGCIHMIDKLFY
VYYNLFLLFEDIIQNEYFKEVANVVNHVLTATALDEKLFLIKKMAEHDVYGVSNFKIGMFNLTFIKSLDHTVFPSLLDED
SKIKFFKGKKLNIVALRSLEDCINYVTKSENMIEMMKERSTILNSIDIETESVDRLKELLLK
>Q80HV2 ~~~~~~Intermediate transcription factor 3 large subunit~~~
MDNLFTFLHEIEDRYARTIFNFHLISCDEIGDIYGLMKERISSEDMFDNIVYNKDIHPAIKKLVYCDIQLTKHIINQNTY
PVFNDSSQVKCCHYFDINSDNSNISSRTVEIFEREKSSLVSYIKTTNKKRKVNYGEIKKTVHGGTNANYFSGKKSDEYLS
TTVRSNINQPWIKTISKRMRVDIINHSIVTRGKSSILQTIEIIFTNRTCVKIFKDSTMHIILSKDKDEKGCIHMIDKLFY
VYYNLFLLFEDIIQNEYFKEVANVVNHVLTATALDEKLFLIKKMAEHDVYGVSNFKIGMFNLTFIKSLDHTVFPSLLDED
SKIKFFKGKKLNIVALRSLEDCINYVTKSENMIEMMKERSTILNSIDIETESVDRLKELLLK
>P0DSR9 ~~~~~~Intermediate transcription factor 3 large subunit~~~
MDNLFTFLHEIEDRYTRTIFNFHLISCDEIGDIYGLMKERISSEDMFDNIVYNKDIHPAIKKLVYCDIQLTKHIINQNTY
PVFNDSSQVKCCHYFDINSDNSNISSRTVEIFEREKSSLVSYIKTTNKKRKVNYGEIKKTVHGGTNANYFSGKKSDEYLS
TTVRSNINQPWIKTISKRMRVNIINHSIVTRGKSSILQTIEIIFTNRTCVKIFKDSTMHIILSKDKDEKGCIHMIDKLFY
VYYNLFLLFEDIIQNEYFKEVANVVNHVLTATALDEKLFLIKKMAKHDVYGVSNFKIGMFNLTFIKSLDHTVFPSLLDED
SKIKFFKGKKLNIVALRSLEDCINYVTKSENMIEMMKERSTILNSIDIETESVDRLKDLLLK
>P0DSS0 ~~~~~~Intermediate transcription factor 3 large subunit~~~
MDNLFTFLHEIEDRYTRTIFNFHLISCDEIGDIYGLMKERISSEDMFDNIVYNKDIHPAIKKLVYCDIQLTKHIINQNTY
PVFNDSSQVKCCHYFDINSDNSNISSRTVEIFEREKSSLVSYIKTTNKKRKVNYGEIKKTVHGGTNANYFSGKKSDEYLS
TTVRSNINQPWIKTISKRMRVNIINHSIVTRGKSSILQTIEIIFTNRTCVKIFKDSTMHIILSKDKDEKGCIHMIDKLFY
VYYNLFLLFEDIIQNEYFKEVANVVNHVLTATALDEKLFLIKKMAKHDVYGVSNFKIGMFNLTFIKSLDHTVFPSLLDED
SKIKFFKGKKLNIVALRSLEDCINYVTKSENMIEMMKERSTILNSIDIETESVDRLKDLLLK
>P03103 ~~~L1~~~Major capsid protein L1~~~
MALWQQGQKLYLPPTPVSKVLCSETYVQRKSIFYHAETERLLTIGHPYYPVSIGAKTVPKVSANQYRVFKIQLPDPNQFA
LPDRTVHNPSKERLVWAVIGVQVSRGQPLGGTVTGHPTFNALLDAENVNRKVTTQTTDDRKQTGLDAKQQQILLLGCTPA
EGEYWTTARPCVTDRLENGACPPLELKNKHIEDGDMMEIGFGAANFKEINASKSDLPLDIQNEICLYPDYLKMAEDAAGN
SMFFFARKEQVYVRHIWTRGGSEKEAPTTDFYLKNNKGDATLKIPSVHFGSPSGSLVSTDNQIFNRPYWLFRAQGMNNGI
AWNNLLFLTVGDNTRGTNLTISVASDGTPLTEYDSSKFNVYHRHMEEYKLAFILELCSVEITAQTVSHLQGLMPSVLENW
EIGVQPPTSSILEDTYRYIESPATKCASNVIPAKEDPYAGFKFWNIDLKEKLSLDLDQFPLGRRFLAQQGAGCSTVRKRR
ISQKTSSKPAKKKKK
>P04012 ~~~L1~~~Major capsid protein L1~~~
MWRPSDSTVYVPPPNPVSKVVATDAYVKRTNIFYHASSSRLLAVGHPYYSIKKVNKTVVPKVSGYQYRVFKVVLPDPNKF
ALPDSSLFDPTTQRLVWACTGLEVGRGQPLGVGVSGHPLLNKYDDVENSGGYGGNPGQDNRVNVGMDYKQTQLCMVGCAP
PLGEHWGKGTQCSNTSVQNGDCPPLELITSVIQDGDMVDTGFGAMNFADLQTNKSDVPLDICGTVCKYPDYLQMAADPYG
DRLFFYLRKEQMFARHFFNRAGTVGEPVPDDLLVKGGNNRSSVASSIYVHTPSGSLVSSEAQLFNKPYWLQKAQGHNNGI
CWGNHLFVTVVDTTRSTNMTLCASVSKSATYTNSDYKEYMRHVEEFDLQFIFQLCSITLSAEVMAYIHTMNPSVLEDWNF
GLSPPPNGTLEDTYRYVQSQAITCQKPTPEKEKQDPYKDMSFWEVNLKEKFSSELDQFPLGRKFLLQSGYRGRTSARTGI
KRPAVSKPSTAPKRKRTKTKK
>P03101 ~~~L1~~~Major capsid protein L1~~~
MSLWLPSEATVYLPPVPVSKVVSTDEYVARTNIYYHAGTSRLLAVGHPYFPIKKPNNNKILVPKVSGLQYRVFRIHLPDP
NKFGFPDTSFYNPDTQRLVWACVGVEVGRGQPLGVGISGHPLLNKLDDTENASAYAANAGVDNRECISMDYKQTQLCLIG
CKPPIGEHWGKGSPCTNVAVNPGDCPPLELINTVIQDGDMVDTGFGAMDFTTLQANKSEVPLDICTSICKYPDYIKMVSE
PYGDSLFFYLRREQMFVRHLFNRAGAVGENVPDDLYIKGSGSTANLASSNYFPTPSGSMVTSDAQIFNKPYWLQRAQGHN
NGICWGNQLFVTVVDTTRSTNMSLCAAISTSETTYKNTNFKEYLRHGEEYDLQFIFQLCKITLTADVMTYIHSMNSTILE
DWNFGLQPPPGGTLEDTYRFVTSQAIACQKHTPPAPKEDPLKKYTFWEVNLKEKFSADLDQFPLGRKFLLQAGLKAKPKF
TLGKRKATPTTSSTSTTAKRKKRKL
>P06794 ~~~L1~~~Major capsid protein L1~~~
MCLYTRVLILHYHLLPLYGPLYHPRPLPLHSILVYMVHIIICGHYIILFLRNVNVFPIFLQMALWRPSDNTVYLPPPSVA
RVVNTDDYVTPTSIFYHAGSSRLLTVGNPYFRVPAGGGNKQDIPKVSAYQYRVFRVQLPDPNKFGLPDTSIYNPETQRLV
WACAGVEIGRGQPLGVGLSGHPFYNKLDDTESSHAATSNVSEDVRDNVSVDYKQTQLCILGCAPAIGEHWAKGTACKSRP
LSQGDCPPLELKNTVLEDGDMVDTGYGAMDFSTLQDTKCEVPLDICQSICKYPDYLQMSADPYGDSMFFCLRREQLFARH
FWNRAGTMGDTVPQSLYIKGTGMPASPGSCVYSPSPSGSIVTSDSQLFNKPYWLHKAQGHNNGVCWHNQLFVTVVDTTPS
TNLTICASTQSPVPGQYDATKFKQYSRHVEEYDLQFIFQLCTITLTADVMSYIHSMNSSILEDWNFGVPPPPTTSLVDTY
RFVQSVAITCQKDAAPAENKDPYDKLKFWNVDLKEKFSLDLDQYPLGRKFLVQAGLRRKPTIGPRKRSAPSATTSSKPAK
RVRVRARK
>P06416 ~~~L1~~~Major capsid protein L1~~~
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLAVGHPYFSIKNPTNAKKLLVPKVSGLQYRVFRVRLPD
PNKFGFPDTSFYNPDTQRLVWACVGLEIGRGQPLGVGISGHPLLNKFDDTETGNKYPGQPGADNRECLSMDYKQTQLCLL
GCKPPTGEHWGKGVACTNAAPANDCPPLELINTIIEDGDMVDTGFGCMDFKTLQANKSDVPIDICGSTCKYPDYLKMTSE
PYGDSLFFFLRREQMFVRHFFNRAGTLGEAVPDDLYIKGSGTTASIQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHN
NGICWGNQVFVTVVDTTRSTNMTLCTQVTSDSTYKNENFKEYIRHVEEYDLQFVFQLCKVTLTAEVMTYIHAMNPDILED
WQFGLTPPPSASLQDTYRFVTSQAITCQKTVPPKEKEDPLGKYTFWEVDLKEKFSADLDQFPLGRKFLLQAGLKAKPKLK
RAAPTSTRTSSAKRKKVKK
>P27232 ~~~L1~~~Major capsid protein L1~~~
MSLWRSNEATVYLPPVSVSKVVSTDEYVTRTNIYYHAGSSRLLAVGHPYYAIKKQDSNKIAVPKVSGLQYRVFRVKLPDP
NKFGFPDTSFYDPASQRLVWACTGVEVGRGQPLGVGISGHPLLNKLDDTENSNKYVGNSGTDNRECISMDYKQTQLCLIG
CRPPIGEHWGKGTPCNANQVKAGECPPLELLNTVLQDGDMVDTGFGAMDFTTLQANKSDVPLDICSSICKYPDYLKMVSE
PYGDMLFFYLRREQMFVRHLFNRAGTVGETVPADLYIKGTTGTLPSTSYFPTPSGSMVTSDAQIFNKPYWLQRAQGHNNG
ICWSNQLFVTVVDTTRSTNMSVCSAVSSSDSTYKNDNFKEYLRHGEEYDLQFIFQLCKITLTADVMTYIHSMNPSILEDW
NFGLTPPPSGTLEDTYRYVTSQAVTCQKPSAPKPKDDPLKNYTFWEVDLKEKFSADLDQFPLGRKFLLQAGLKARPNFRL
GKRAAPASTSKKSSTKRRKVKS
>Q05138 ~~~L1~~~Major capsid protein L1~~~
MVQILFYILVIFYYVAGVNVFHIFLQMSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKN
TSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTET
SNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFN
TLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSS
AFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEF
DLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKDYMFWEVD
LKEKFSADLDQFPLGRKFLLQAGLQARPKLKRPASSAPRTSTKKKKVKR
>P26535 ~~~L1~~~Major capsid protein L1~~~
MVLILCCTLAILFCVADVNVFHIFLQMSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLAVGNPYFSIKS
PNNNKKVLVPKVSGLQYRVFRVRLPDPNKFGFPDTSFYNPDTQRLVWACVGLEIGRGQPLGVGVSGHPYLNKFDDTETSN
RYPAQPGSDNRECLSMDYKQTQLCLIGCKPPTGEHWGKGVACNNNAAATDCPPLELFNSIIEDGDMVDTGFGCMDFGTLQ
ANKSDVPIDICNSTCKYPDYLKMASEPYGDSLFFFLRREQMFVRHFFNRAGKLGEAVPDDLYIKGSGNTAVIQSSAFFPT
PSGSIVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCTEVTKEGTYKNDNFKEYVRHVEEYDLQFV
FQLCKITLTAEIMTYIHTMDSNILEDWQFGLTPPPSASLQDTYRFVTSQAITCQKTAPPKEKEDPLNKYTFWEVNLKEKF
SADLDQFPLGRKFLLQSGLKAKPRLKRSAPTTRAPSTKRKKVKK
>P69898 ~~~L1~~~Major capsid protein L1~~~
MWRPSDSTVYVPPPNPVSKVVATDAYVTRTNIFYHASSSRLLAVGHPYFSIKRANKTVVPKVSGYQYRVFKVVLPDPNKF
ALPDSSLFDPTTQRLVWACTGLEVGRGQPLGVGVSGHPFLNKYDDVENSGSGGNPGQDNRVNVGMDYKQTQLCMVGCAPP
LGEHWGKGKQCTNTPVQAGDCPPLELITSVIQDGDMVDTGFGAMNFADLQTNKSDVPIDICGTTCKYPDYLQMAADPYGD
RLFFFLRKEQMFARHFFNRAGEVGEPVPDTLIIKGSGNRTSVGSSIYVNTPSGSLVSSEAQLFNKPYWLQKAQGHNNGIC
WGNQLFVTVVDTTRSTNMTLCASVTTSSTYTNSDYKEYMRHVEEYDLQFIFQLCSITLSAEVMAYIHTMNPSVLEDWNFG
LSPPPNGTLEDTYRYVQSQAITCQKPTPEKEKPDPYKNLSFWEVNLKEKFSSELDQYPLGRKFLLQSGYRGRSSIRTGVK
RPAVSKASAAPKRKRAKTKR
>P04013 ~~~L2~~~Minor capsid protein L2~~~
MKPRARRRKRASATQLYQTCKATGTCPPDVIPKVEHTTIADQILKWGSLGVFFGGLGIGTGAGSGGRAGYIPLGSSPKPA
ITGGPAARPPVLVEPVAPSDPSIVSLIEESAIINAGAPEVVPPTQGGFTITSSESTTPAILDVSVTNHTTTSVFQNPLFT
EPSVIQPQPPVEASGHILISAPTITSQHVEDIPLDTFVVSSSDSGPTSSTPLPRAFPRPRVGLYSRALQQVQVTDPAFLS
TPQRLVTYDNPVYEGEDVSLQFTHESIHNAPDEAFMDIIRLHRPAITSRRGLVRFSRIGQRGSMYTRSGQHIGARIHYFQ
DISPVTQAAEEIELHPLVAAENDTFDIYAEPFDPIPDPVQHSVTQSYLTSTPNTLSQSWGNTTVPLSIPSDWFVQSGPDI
TFPTASMGTPFSPVTPALPTGPVFITGSDFYLHPTWYFARRRRKRIPLFFTDVAA
>P03107 ~~~L2~~~Minor capsid protein L2~~~
MRHKRSAKRTKRASATQLYKTCKQAGTCPPDIIPKVEGKTIAEQILQYGSMGVFFGGLGIGTGSGTGGRTGYIPLGTRPP
TATDTLAPVRPPLTVDPVGPSDPSIVSLVEETSFIDAGAPTSVPSIPPDVSGFSITTSTDTTPAILDINNTVTTVTTHNN
PTFTDPSVLQPPTPAETGGHFTLSSSTISTHNYEEIPMDTFIVSTNPNTVTSSTPIPGSRPVARLGLYSRTTQQVKVVDP
AFVTTPTKLITYDNPAYEGIDVDNTLYFSSNDNSINIAPDPDFLDIVALHRPALTSRRTGIRYSRIGNKQTLRTRSGKSI
GAKVHYYYDLSTIDPAEEIELQTITPSTYTTTSHAASPTSINNGLYDIYADDFITDTSTTPVPSVPSTSLSGYIPANTTI
PFGGAYNIPLVSGPDIPINITDQAPSLIPIVPGSPQYTIIADAGDFYLHPSYYMLRKRRKRLPYFFSDVSLAA
>Q06687 ~~~VLF-1~~~Very late expression factor 1~~~
MNGFNVRNENNFNSWKIKIQSAPRFESVFDLATDRQRCTPDEVKNNSLWSKYMFPKPFAPTTLKSYKSRFIKIVYCSVDD
VHLEDMSYSLDKEFDSIENQTLLIDPQELCRRMLELRSVTKETLQLTINFYTNMMNLPEYKIPRMVMLPRDKELKNIREK
EKNLMLKNVIDTILNFINDKIKMLNSDYVHDRGLIRGAIVFCIMLGTGMRINEARQLSVDDLNVLIKRGKLHSDTINLKR
KRSRNNTLNNIKMKPLELAREIYSRNPTILQISKNTSTPFKDFRRLLEESGVEMERPRSNMIRHYLSSNLYNSGVPLQKV
AKLMNHESSASTKHYLNKYNIGLDETSSEEENNNDDDDAQHNRNSSGSSGESLLYYRNE
>P03701 ~~~lom~~~Outer membrane protein lom~~~
MRNVCIAVAVFAALAVTVTPARAEGGHGTFTVGYFQVKPGTLPSLSGGDTGVSHLKGINVKYRYELTDSVGVMASLGFAA
SKKSSTVMTGEDTFHYESLRGRYVSVMAGPVLQISKQVSAYAMAGVAHSRWSGSTMDYRKTEITPGYMKETTTARDESAM
RHTSVAWSAGIQINPAASVVVDIAYEGSGSGDWRTDGFIVGVGYKF
>P15908 ~~~VLTF1~~~Late transcription factor 1~~~
MSLRIKIDKLRQLVTYFSEFSEEVSINIDVKSNVLYIFATLGGSINIWTIVPLNSNVFYNGVENTVFNLPVLKVKNCLCS
FHNDAVVSITADHDNNTVTLSSHYTVSIDCNNEQIPHSTGTSISLGIDQKKSYIFNFHKYEEKCCGRTVFHLDMLLGFIK
CISQYQYLNICFDDKKLLLKTPGTRDTFVRSYSMTEWSPTLQNYSFKIAIFSLNKLRGFKKRVLVFESKIVMDTEGNILG
LLFRDRIGTYKVNVFMAFQD
>O57198 ~~~~~~Late transcription factor 1~~~
MSIRIKIDKLRQIVAYFSEFSEEVSINVDSTDELMYIFAALGGSVNIWAIIPLSASVFYRGAENIVFNLPVSKVKSCLCS
FHNDAIIDIEPDLENNLVKLSSYHVVSVDCNKELMPIRTDTTICLSIDQKKSYVFNFHKYEEKCCGRTVIHLEWLLGFIK
CISQHQHLAIMFKDDNIIMKTPGNTDAFSREYSMTECSQELQKFSFKIAISSLNKLRGFKKRVNVFETRIVMDNDDNILG
MLFSDRVQSFKINIFMTFLD
>P68612 ~~~~~~Late transcription factor 1~~~
MSIRIKIDKLRQIVAYFSEFSEEVSINVDSTDELMYIFAALGGSVNIWAIIPLSASVFYRGAENIVFNLPVSKVKSCLCS
FHNDAIIDIEPDLENNLVKLSSYHVVSVDCNKELMPIRTDTTICLSIDQKKSYVFNFHKYEEKCCGRTVIHLEWLLGFIK
CISQHQHLAIMFKDDNIIMKTPGNTDAFSREYSMTECSQELQKFSFKIAISSLNKLRGFKKRVNVFETRIVMDNDDNILG
MLFSDRVQSFKINIFMAFLD
>P68613 ~~~~~~Late transcription factor 1~~~
MSIRIKIDKLRQIVAYFSEFSEEVSINVDSTDELMYIFAALGGSVNIWAIIPLSASVFYRGAENIVFNLPVSKVKSCLCS
FHNDAIIDIEPDLENNLVKLSSYHVVSVDCNKELMPIRTDTTICLSIDQKKSYVFNFHKYEEKCCGRTVIHLEWLLGFIK
CISQHQHLAIMFKDDNIIMKTPGNTDAFSREYSMTECSQELQKFSFKIAISSLNKLRGFKKRVNVFETRIVMDNDDNILG
MLFSDRVQSFKINIFMAFLD
>P32990 ~~~~~~Late transcription factor 1~~~
MSIRIKIDKLRQIVAYFSEFSEEVSINVDSTDELMYIFAALGGSVNIWAIIPLSASVFYRGAENIVFNLPVSKVKSCLCS
FHNDAIIDIEPDLENNLVKLSSYHVVSVDCNKELMPIRTDTTICLSIDQKKSYVFNFHKYEEKCCGRTVIHLEWLLGFIK
CISQHQHLTIMFKDDNIIMKTPGNTDAFSREYSMTECSQELQKFSFKIAISSLNKLRGFKKRVNVFETRIVMDNDDNILG
MLFSDRVQSFKINIFMAFLD
>O72910 ~~~VLTF2~~~Viral late gene transcription factor 2~~~
MYKKVNLSGIVISEPKSVKKFKTKDSIVNVLPEYYHTIADKRLEIRKDKDNCWFCKQDMNTYNPYFIETLYGDHIGVFCS
KICRDSFANMIKSVIALREEPKISLLPLELYEKPEEVLEVINDLRHKEGIYGSCILESDKNIIKLTLRCHCNTN
>O57216 ~~~VLTF2~~~Viral late gene transcription factor 2~~~
MAKRVSLPDVVISAPKAVFKPAKEEALACILPKYYKSMADVSIKTNSVIDKCWFCNQDLVFRPISIETFKGGEVGYFCSK
ICRDSLASMVKSHVALREEPKISLLPLVFYEDKEKVINTINLLRDKDGVYGSCYFKENSQIIDISLRSLL
>P20982 ~~~~~~Viral late gene transcription factor 2~~~
MAKRVSLPDVVISAPKAVFKPAKEEALACILPKYYKSMADVSIKTNSVIDKCWFCNQDLVFRPISIETYKGGEVGYFCSK
ICRDSLASMVKSHVALREEPKISLLPLVFYEDKEKVINTINLLRDKDGVYGSCYFKENSQIIDISLRSLL
>P07610 ~~~~~~Viral late gene transcription factor 2~~~
MAKRVSLPDVVISAPKAVFKPAKEEALACILPKYYKSMADVSIKTNSVIDKCWFCNQDLVFKPISIETFKGGEVGYFCSK
ICRDSLASMVKSHVALREEPKISLLPLVFYEDKEKVINTINLLRDKDGVYGSCYFKENSQIIDISLRSLL
>P0DSV3 ~~~~~~Viral late gene transcription factor 2~~~
MAKRVSLPDVVISAPKAVFKPAKEEALACILPKYYKSMADMSIKTNSVIDKCWFCNQDLVFRPISIETFKGGEVGYFCSK
ICRDSLASMVKSHVALREEPKISLLPLVFYEDKEKVINTINLLRDKDGVYGSCYFKENSQIIDISLRSLL
>P0DSV4 ~~~~~~Viral late gene transcription factor 2~~~
MAKRVSLPDVVISAPKAVFKPAKEEALACILPKYYKSMADMSIKTNSVIDKCWFCNQDLVFRPISIETFKGGEVGYFCSK
ICRDSLASMVKSHVALREEPKISLLPLVFYEDKEKVINTINLLRDKDGVYGSCYFKENSQIIDISLRSLL
>Q9J566 ~~~VLTF3~~~Viral late gene transcription factor 3~~~
MSNLKHCSNCKHNGLITESNHEFCIFCQSIFQLSNKVSKKSNFHVSNKLIHLRNVLRRLLSNQCSSDVIVELKSVMTKNN
ISSTDIDANFVSSFLKANEKINKKDYKLVFEIINHIKEEKLNLDTSKINEVIEIFKHLVFFCQENTPSKTINYSFFLDKI
FSLTSVTNNLKPQTVKNYTKNNSNQLVWENFLDYMKKKKINTSVYDYGHEYVFVDYGFTTCSLEV
>A0A7H0DN99 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSCCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>Q76ZR2 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSGCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>P68318 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSGCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>P68319 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSGCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>P0DOT5 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSGCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>P0DOT6 ~~~~~~Viral late gene transcription factor 3~~~
MNLRLCSGCRHNGIVSEQGYEYCIFCESVFQKCTKVQKKSNFHVSNKLIHLRNVLRRLLSHQCSGEIISELLDIMEKNQI
STDDVDANFVSSFLKANERINKKDYKLVFEIINQVKDEKLNLSTEKINEVVEIFKHLVFFCQENTPSKTINYSFFLDKIF
DITSVTKNLKPQTVKNYTKNNSNQLVWENFLAHMRSKKRVTMVEDYGHEYVFVDERFSTCSLEV
>Q38008 ~~~CPH1~~~Holin~~~
MLYNIMLEVAKGDYITILFALILFDFITGFLKAWKWKVTDSWTGLKGVIKHTLTFIFYYFVAVFLTYIHAMAVGQILLVI
INLYYALSIMENLAVMGVFIPKFMTARVQEELQKYTAQLDAGKDLLEEFKGEKK
>A3EXD6 ~~~M~~~Membrane protein~~~
MASSNVTLSNDEVLRLVKDWNFTWSVVFLLITIVLQYGYPSRSMFVYVIKMFVLWLLWPASMALSIFCAVYPIDLASQII
SGILAATSCAMWISYFVQSIRLFMRTGSWWSFNPESNCLLNVPIGGTTVVRPLVEDSTSVTAVVTDGYLKMAGMHFGACD
FQRLPSEVTVAKPNVLIALKMIKRQAYGTNSGVAIYHRYKAGNYRRPPIIQDQELALLRA
>P27904 ~~~M~~~Membrane protein~~~
MFETNYWPFPDQAPNPFTAQIEQLTATENVYIFLTTLFGILQLVYVMFKLLCTMFPSLHFSPIWRGLENFWLFLSLASLA
IAYWWLPSMTFTGYWALTIIATILVFILLIMMFVKFVNFVKLFYRTGSFAIAIRGPIVLVALDVTIKLHCTPFAILVKEI
GNIFYLSEYCNKPLTAAQIAALRICVNGQWFAYTRSSTTSAARVAAANSTAKYHLFVLQGVAEYTQLSSVKFE
>P69704 ~~~M~~~Membrane protein~~~
MSSVTTPAPVYTWTADEAIKFLKEWNFSLGIILLFITIILQFGYTSRSMFVYVIKMIILWLMWPLTIILTIFNCVYALNN
VYLGFSIVFTIVAIIMWIVYFVNSIRLFIRTGSWWSFNPETNNLMCIDMKGRMYVRPIIEDYHTLTVTIIRGHLYMQGIK
LGTGYSLSDLPAYVTVAKVSHLLTYKRGFLDKIGDTSGFAVYVKSKVGNYRLPSTQKGSGMDTALLRNNI
>Q01455 ~~~M~~~Membrane protein~~~
MSSKTTPAPVYIWTADEAIKFLKEWNFSLGIILLFITIILQFGYTSRSMFVYVIKMIILWLMWPLTIILTIFNCVYALNN
VYLGLSIVFTIVAIIMWIVYFVNSIRLFIRTGSFWSFNPETNNLMCIDMKGTMYVRPIIEDYHTLTVTIIRGHLYIQGIK
LGTGYSWADLPAYMTVAKVTHLCTYKRGFLDRISDTSGFAVYVKSKVGNYRLPSTQKGSGMDTALLRNNI
>Q9JEB4 ~~~M~~~Membrane protein~~~
MNSTTQAPQPVYQWTADEAIRFLKEWNFSLGIILLFVTIILQFGYTSRSMFVYVVKMILLWLMWPLTIVLCIFNCVYALN
NVYLGFSIVFTIVSIIMWIMYFVNSIRLFIRTGSWWSFNPETNNLMCTDMKGTVYVRPIIEDYHTLTATIIRGHLYMQGV
KLGTGFSLSDLPAYVTVAKVSHLCTYKRAFLDKVDGVSGFAVYVKSKVGNYRLPSNKPSGMDTALLRI
>P03415 ~~~M~~~Membrane protein~~~
MSSTTQAPEPVYQWTADEAVQFLKEWNFSLGIILLFITIILQFGYTSRSMFIYVVKMIILWLMWPLTIVLCIFNCVYALN
NVYLGFSIVFTIVSIVIWIMYFVNSIRLFIRTGSWWSFNPETNNLMCIDMKGTVYVRPIIEDYHTLTATIIRGHLYMQGV
KLGTGFSLSDLPAYVTVAKVSHLCTYKRAFLDKVDGVSGFAVYVKSKVGNYRLPSNKPSGADTALLRI
>P08549 ~~~M~~~Membrane protein~~~
MSSTTQAPGPVYQWTADEAVQFLKEWNFSLGIILLFITIILQFGYTSRSMFIYVVKMIILWLMWPLIIVLCMFNCVYALN
NVYLGFSIVFTIVSVVMWIMYFVNSIRLFIRTGSWWSFNPETNNLMCIDMKGTVYVRPIIEDYHTLTATIIRGHFYMQGV
KLGTGFSLSDLPAYVTVAKVSHLCTYKRAFLDKVDGVSGFAVYVKSKVGNYRLPSNKPSGADTVLLRI
>P09175 ~~~M~~~Membrane protein~~~
MKLLLILACVIACACGERYCAMKDDTGLSCRNSTASACESCFNGGDLIWHLANWNFSWSIILIIFITVLQYGRPQFSWFV
YGIKMLIMWLLWPIVLALTIFNAYSEYQVSRYVMFGVSIAGAIVTFVLWIMYFVRSIQLYRRTKSWWSFNPEINAILCVS
ALGRSYVLPLEGVPTGVTLTLLSGNLYAEGFKIAGGMNIDNLPKYVMVALPSRTIVYTLVGKKLKASSATGWAYYVKSKA
GDYSTDARTDNLSEQEKLLHMV
>P04135 ~~~M~~~Membrane protein~~~
MKILLILACVIACACGERYCAMKSDTDLSCRNSTASDCESCFNGGDLIWHLANWNFSWSIILIVFITVLQYGRPQFSWFV
YGIKMLIMWLLWPVVLALTIFNAYSEYQVSRYVMFGFSIAGAIVTFVLWIMYFVRSIQLYRRTKSWWSFNPETKAILCVS
ALGRSYVLPLEGVPTGVTLTLLSGNLYAEGFKIAGGMNIDNLPKYVMVALPSRTIVYTLVGKKLKASSATGWAYYVKSKA
GDYSTEARTDNLSEQEKLLHMV
>P25878 ~~~M~~~Membrane protein~~~
MKYILLILACIIACVYGERYCAMQDSGLQCINGTNSRCQTCFERGDLIWHLANWNFSWSVILIVFITVLQYGRPQFSWLV
YGIKMLIMWLLWPIVLALTIFNAYSEYQVSRYVMFGFSVAGAVVTFALWMMYFVRSVQLYRRTKSWWSFNPETNAILCVN
ALGRSYVLPLDGTPTGVTLTLLSGNLYAEGFKMAGGLTIEHLPKYVMIATPSRTIVYTLVGKQLKATTATGWAYYVKSKA
GDYSTEARTDNLSEHEKLLHMV
>P11222 ~~~M~~~Membrane protein~~~
MSNETNCTLDFEQSVQLFKEYNLFITAFLLFLTIILQYGYATRSKVIYTLKMIVLWCFWPLNIAVGVISCIYPPNTGGLV
AAIILTVFACLSFVGYWIQSIRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVLSPIIKNGVLYCEGQWLAK
CEPDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRFATFVYAKQSVDTGELESVATGGSSLYT
>Q910E2 ~~~M~~~Membrane protein~~~
MAENCTLDSEQAVLLFKEYNLFITAFLLFLTILLQYGYATRSRTIYILKMIVLWCFWPLNIAVGVISCIYPPNTGGLVAA
IILTVFACLSFVGYWIQSCRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVLAPIIKNGVLYCEGQWLAKCE
PDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRFATFVYAKQSVDTGELESVATGGSSLYT
>P69607 ~~~M~~~Membrane protein~~~
MSNETNCTLDFEQSVELFKEYNLFITAFLLFLTIILQYGYATRSKFIYILKMIVLWCFWPLNIAVGVISCIYPPNTGGLV
AAIILTVFACLSFVGYWIQSIRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVLSPIIKNGVLYCEGQWLAK
CEPDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRFATFVYAKQSVDTGELESVATGGSSLYT
>P69606 ~~~M~~~Membrane protein~~~
MSNETNCTLDFEQSVELFKEYNLFITAFLLFLTIILQYGYATRSKFIYILKMIVLWCFWPLNIAVGVISCIYPPNTGGLV
AAIILTVFACLSFVGYWIQSIRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVLSPIIKNGVLYCEGQWLAK
CEPDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRFATFVYAKQSVDTGELESVATGGSSLYT
>P59596 ~~~M~~~Membrane protein~~~
MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVTLACFVLAAVYRINWVTGGIA
IAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLNVPLRGTIVTRPLMESELVIGAVIIRGHLRMAGHSLGRCDI
KDLPKEITVATSRTLSYYKLGASQRVGTDSGFAAYNRYRIGNYKLNTDHAGSNDNIALLVQ
>P0DTC5 ~~~M~~~Membrane protein~~~
MADSNGTITVEELKKLLEQWNLVIGFLFLTWICLLQFAYANRNRFLYIIKLIFLWLLWPVTLACFVLAAVYRINWITGGI
AIAMACLVGLMWLSYFIASFRLFARTRSMWSFNPETNILLNVPLHGTILTRPLLESELVIGAVILRGHLRIAGHHLGRCD
IKDLPKEITVATSRTLSYYKLGASQRVAGDSGFAAYSRYRIGNYKLNTDHSSSSDNIALLVQ
>Q98157 ~~~ORF K4~~~Viral macrophage inflammatory protein 2~~~
MDTKGILLVAVLTALLCLQSGDTLGASWHRPDKCCLGYQKRPLPQVLLSSWYPTSQLCSKPGVIFLTKRGRQVCADKSKD
WVKKLMQQLPVTAR
>P06817 ~~~NB~~~Glycoprotein NB~~~
MNNATFNCTNINPITHIRGSIIITICVSLIVILIVFGCIAKIFINKNNCTNNVIRVHKRIKCPDCEPFCNKRDDISTPRA
GVDIPSFILPGLNLSEGTPN
>P87505 ~~~Segment-5~~~Non-structural protein NS1~~~
MDRFLTYFQVRGERANAVRLFGEISEQIDCSHLKRDCFVNGICARQHFKECCNIATDNGSRTNADKLVALALRALLDRQT
IWTCVIKNADYVSQYADEQMEEEVNKLYDVYLQSGTREEFEGFRQRNRPSRVVMDDSCSMLSYFYIPMNQGNPAPVAKLS
RWGQFGICYYDRTNVDGLIPYDEIGLAQAIDGLKDLIEGRLPVCPYTGANGRINAVLHLPLEMEVIMAVQENATQLMRRA
AQDFKFITHAGWRLYPRLLRQRFAIEDATEGVIHHVMLGHLRYYDEDTSIVKYRFLNDGSLDWRTWTIPLHLMRTARLGH
LQPESILVFMHKSLHVRYALWLTSLCLTQSRWLIQKLPELTGGTDVLYTRAYVHADNHKVPNVRDLMMNEVFRKIDDHWV
IQKCHTTKEAITVTAIQIQRSIRGDGQWDTPMFHQSMALLTRLIVYWLTDVTERSAIFRLTCFAIFGCKPTARGRYIDWD
DLGTFMKNVLDGRDLTVLEDETCFISMMRMAMLHVQRSKVVCATVLEAPLEIQQVGQIVEVPFDFMHN
>P07131 ~~~Segment-5~~~Non-structural protein NS1~~~
MERFLRKYNISGDYANATRTFLAISPQWTCSHLKRNCLFNGMCVKQHFERAMIAATDAEEPAKAYKLVELAKEAMYDRET
VWLQCFKSFSQPYEEDVEGKMKRCGAQLLEDYRKSGMMDEAVKQSALVNSERIRLDDSLSAMPYIYVPINDGQIVNPTFI
SRYRQIAYYFYNPDAADDWIDPNLFGIRGQHNQIKREVERQINTCPYTGYRGRVFQVMFLPIQLINFLRMDDFAKHFNRY
ASMAIQQYLRVGYAEEIRYVQQLFGRVPTGEFPLHQMMLMRRDLPTRDRSIVEARVRRSGDESWQSWLLPMIIIREGLDH
QDRWEWFIDYMDRKHTCQLCYLKHSKQIPTCSVIDVRASELTGCSPFKMVKIEEHVGNDSVFKTKLVRDEQIGRIGDHYY
TTNCYTGAEALITTAIHIHHWIRGSGIWNDEGWQEGIFMLGRVLLRWELTKAQRSALLRLFCFVCYGYAPRADGTIPDWN
NLGNFLDIILKGPELSEDEDERAYATMFEMVRCIITLCYAEKVHFAGFAAPACEGGEVINLAAAMSQMWMEY
>P23065 ~~~Segment-8~~~Non-structural protein NS2~~~
MEQKQRRFTKNIFVLDVTAKTLCGAIAKLSSQPYCQIKIGRVVAFKPVKNPEPKGYVLNVPGPGAYRIQDGQDIISLMLT
PHGVEATTERWEEWKFEGVSVTPMATRVQYNGVMVDAEIKYCKGMGIVQPYMRNDFDRNEMPDLPGVMRSNYDIRELRQK
IKNERESAPRLQVHSVAPREESRWMDDDEAKVDDEAKEIVPGTSGLEKLREARSNVFKEVEAVINWNLDERDEGDRDERG
DEEQVKTLSDDDDQGEDASDDEHPKTHITKEYIEKVAKQIKLKDERFMSLSSAMPQASGGFDRMIVTKKLKWQNVPLYCF
DESLKRYELQCVGACERVAFVSKDMSLIICRSAFRRL
>P18683 ~~~nun~~~Transcription termination factor nun~~~
MLMVKKTIYVNPDSGQNRKVSDRGLTSRDRRRIARWEKRIAYALKNGVTPGFNAIDDGPEYKINEDPMDKVDKALATPFP
RDVEKIEDEKYEDVMHRVVNHAHQRNPNKKWS
>P13520 3.1.11.3~~~old~~~Old nuclease~~~
MTVRLASVSISNFRSCKSTSAILRPFTALVGYNNAGKSNIILAIKWLLDGSLISESDVYDPTHPVSVEGVIQGITDDTLS
LLTEENQQKIAPFIIDGTLTFARRQEFNKETGKAKKSLDVYDGTTWKKNPGGIDGAISNIFPEPIHIPAMSDAVEDSTKC
KNTTTIGKILSAIVSEIKQEHEEKFSKNISEIGKYLSHNGENRLESLNKIDSGVNKKVNQFFPDVSVKLHFPTPTLDEIF
KSGTLKVFESREDEPVMRDISRFGHGTQRSIQMALIQYLAEIKKENSESKKSNTLIFIDEPELYLHPSAINSVRESLVTL
SESGYQVIISTHSASMLSAKHAANAIQVCKDSNGTIARKTISEKIEELYKSSSPQLHSAFTLSNSSYLLFSEEVLLVEGK
TETNVLYALYKKINGHELNPSKICIVAVDGKGSLFKMSQIINAIGIKTRILADCDFLSNILLTEHKDLLSTECDNLLTAL
IESINSGELSLNTKVTTFESFKSISSKDFIKICNHEKTQKHIHEIHQKLKDNGIYIWKSGDIEAVYGFGKKQTEWDSLLD
CLCDESKDVRAVIKKYDEMEDFIKWI
>P27382 ~~~XI~~~Infectivity protein P11~~~
MEKVKAWLIKYKWWIVAAIGGLAAFLLLKNRGGGSGGGGEYMVGSGPVYQQAGSGAVDNTMALAALQANTQLSAQNAQLQ
AQMDASRLQLETQLNIETLAADNAHYSTQSQLQLGMAQVDLSKYLGDLQSTTSTALAGMQSDTAKYQSNIQLQAENIRAN
TSLAEIDAQKYIVGKQADIAKYQAKTERRGQDYGFALGLLNFGGKFF
>Q25BI4 ~~~~~~Structural protein 11~~~
MPGAVELDENTINPVSDIAYKDDNGNWRAGPDAKKGYEQGQFIDSDYAEKATSIRRTTNLIKSIQRARDVSEGEARELKD
DMVTELEKAETKEERRDIWQKYGSP
>P27392 ~~~XVI~~~Protein P16~~~
MDKKKLLYWVGGGLVLILIWLWFRNRPAAQVASNWEGPPYMTYNQPQAGSVTLPVAGYTSPSPTLPNRNRSCGCNPAVSA
AMAQGADLASKLTDSITSQLNDYASSLNDYLASQAGV
>P06492 ~~~~~~Tegument protein VP16~~~
MDLLVDELFADMNADGASPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTM
LDTWNEDLFSALPTNADLYRECKFLSTLPSDVVEWGDAYVPERTQIDIRAHGDVAFPTLPATRDGLGLYYEALSRFFHAE
LRAREESYRTVLANFCSALYRYLRASVRQLHRQAHMRGRDRDLGEMLRATIADRYYRETARLARVLFLHLYLFLTREILW
AAYAEQMMRPDLFDCLCCDLESWRQLAGLFQPFMFVNGALTVRGVPIEARRLRELNHIREHLNLPLVRSAATEEPGAPLT
TPPTLHGNQARASGYFMVLIRAKLDSYSSFTTSPSEAVMREHAYSRARTKNNYGSTIEGLLDLPDDDAPEEAGLAAPRLS
FLPAGHTRRLSTAPPTDVSLGDELHLDGEDVAMAHADALDDFDLDMLGDGDSPGPGFTPHDSAPYGALDMADFEFEQMFT
DALGIDEYGG
>P04486 ~~~~~~Tegument protein VP16~~~
MDLLVDELFADMDADGASPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTM
LDTWNEDLFSALPTNADLYRECKFLSTLPSDVVEWGDAYVPERAQIDIRAHGDVAFPTLPATRDGLGLYYEALSRFFHAE
LRAREESYRTVLANFCSALYRYLRASVRQLHRQAHMRGRDRDLGEMLRATIADRYYRETARLARVLFLHLYLFLTREILW
AAYAEQMMRPDLFDCLCCDLESWRQLAGLFQPFMFVNGALTVRGVPIEARRLRELNHIREHLNLPLVRSAATEEPGAPLT
TPPTLHGNQARASGYFMVLIRAKLDSYSSFTTSPSEAVMREHAYSRARTKNNYGSTIEGLLDLPDDDAPEEAGLAAPRLS
FLPAGHTRRLSTAPPTDVSLGDELHLDGEDVAMAHADALDDFDLDMLGDGDSPGPGFTPHDSAPYGALDMADFEFEQMFT
DALGIDEYGG
>P68336 ~~~~~~Tegument protein VP16~~~
MDLLVDDLFADADGVSPPPPRPAGGPKNTPAAPPLYATGRLSQAQLMPSPPMPVPPAALFNRLLDDLGFSAGPALCTMLD
TWNEDLFSGFPTNADMYRECKFLSTLPSDVIDWGDAHVPERSPIDIRAHGDVAFPTLPATRDELPSYYEAMAQFFRGELR
AREESYRTVLANFCSALYRYLRASVRQLHRQAHMRGRNRDLREMLRTTIADRYYRETARLARVLFLHLYLFLSREILWAA
YAEQMMRPDLFDGLCCDLESWRQLACLFQPLMFINGSLTVRGVPVEARRLRELNHIREHLNLPLVRSAAAEEPGAPLTTP
PVLQGNQARSSGYFMLLIRAKLDSYSSVATSEGESVMREHAYSRGRTRNNYGSTIEGLLDLPDDDDAPAEAGLVAPRMSF
LSAGQRPRRLSTTAPITDVSLGDELRLDGEEVDMTPADALDDFDLEMLGDVESPSPGMTHDPVSYGALDVDDFEFEQMFT
DAMGIDDFGG
>P09265 ~~~~~~Tegument protein VP16 homolog~~~
MECNLGTEHPSTDTWNRSKTEQAVVDAFDESLFGDVASDIGFETSLYSHAVKTAPSPPWVASPKILYQQLIRDLDFSEGP
RLLSCLETWNEDLFSCFPINEDLYSDMMVLSPDPDDVISTVSTKDHVEMFNLTTRGSVRLPSPPKQPTGLPAYVQEVQDS
FTVELRAREEAYTKLLVTYCKSIIRYLQGTAKRTTIGLNIQNPDQKAYTQLRQSILLRYYREVASLARLLYLHLYLTVTR
EFSWRLYASQSAHPDVFAALKFTWTERRQFTCAFHPVLCNHGIVLLEGKPLTASALREINYRRRELGLPLVRCGLVEENK
SPLVQQPSFSVHLPRSVGFLTHHIKRKLDAYAVKHPQEPRHVRADHPYAKVVENRNYGSSIEAMILAPPSPSEILPGDPP
RPPTCGFLTR
>P27383 ~~~XVII~~~Protein P17~~~
MMGKGFEMMVASAIRAAGINPDELMEKANTLVHNLNYQLDRFGQRLDSIDSRLSVIEKALDISPAEKPDNQPELTGITFE
GDNNDQ
>Q8JU62 ~~~S1~~~Outer capsid protein VP1~~~
MAAVFGIQLVPKLNTSTTRRTFLPLRFDLLLDRLQSTNLHGVLYRALDFNPVDRSATVIQTYPPLNAWSPHPAFIENPLD
YRDWTEFIHDRALAFVGVLTQRYPLTQNAQRYTNPLVLGAAFGDFLNARSIDIFLDRLFYGPTQESPITSITKFPYQWTI
DFNVTADSVRTPAGCKYITLYGYDPSRPSTPATYGKHRPTYATVFYYSTLPARSRLLANLAAGPTVLEHFDSPTYGPHLL
LPQTGDVLGYSSSLISQAALLMVESVMDALRDNANASASTAVTRLDQSYHPVTSFDPSTFNTLLQRATNLALLAVQGVQS
ESAIPAIPTMSDVRSFVARLMAEGDPQQWFPYRVDQILYWPESPFVPPIGPFYAPFRPVNFPFTTGSYTVVPDASRPLRL
LPQYRNATITVQQADDAYEDTALSPLITTHGFCVTGGVSTSIYDISGDPTAYPPAQLVDTPNDYFDRERMARRDLFRRLR
APADRSAIKDRAVFDFLASLVNPTTANPVLDTSFSMAYLGASSAHANADEPVILADIRSGSIPGLPIPRRIVQFGYDVVH
GSLLDLSRAVPTGTFGLVYADLDQVEDAGTDMPAANRAAIAMLGTALQMTTAGGVSVLKVNFPTRAFWTQVFNLYATHAT
TLHLVKPTIVNSSEVFLVFGGRQSNGALRSTTALQRALLSLYARNAAIDRAVTHIPFFGVPDDGTSDLGIDAVRLFDPMF
SDAVANLPSNALASLVSRVVPSSIMFTRVPSNGPVSTTIYGKRTFLSNRRRARLRDVPMLITTTLVHQRRFTTPPTFTLF
SSEAVPVTTLVAAGYNSFISEQTRNPNLAHLLDLGTGPECRILSLIPPTLQVTMSDARPCAELMASFDPALTAYVQGDYS
TAAFWNGIRCDSATAIFTLGAAAAAAGTDLIAFVQQLIPRIVAAGGTRMWLQLNTPLYEVSSLPDLIDIDLRDRVYRFNG
GERVEPYADPVPLQQAIAALLPAAALSWHTLSPTCDWLPYIIGVGSPLNLSDINTAISYSRLTPILHIDTTTPPLRVNPV
PTPLNQQCAIRITSLDPAAVLSVQHNGVEVIGGTPGNVISVAGAAALQYILANQEFLLQFTPTLPGIFDVFLTTLGQPPV
PRGSFTITPPPTTVVLNMPPPGQLDFTDVGNDARITCDPYYQLAVCIFKDGQYVRVNPEKASVVTNAPNRDLHFVLDLAD
NHVLLYLCDVTPSGLGDRIAFPIVDIYRIAFPRNTPVRASLPYTGGGAHLTSGGNPFMSLTTPPAVLPAGVALAALSTSV
ATQYPTYTLPAGVYEYVIE
>P13891 ~~~~~~Major capsid protein VP1~~~
MSQKGKGSCPRPQQVPRLLVKGGIEVLDVKSGPDSITTIEAYLQPRPGQKNGYSTVITVQAEGYQDAPHSTEVPCYSCAR
IPLPTINDDITCPTLLMWEAVSVKTEVVGVSSILNMHSGAFRAFNGYGGGFTICGPRIHFFSVGGEPLDLQACMQNSKTV
YPAPLIGPGEGERRETAQVLDTGYKARLDKDGLYPIECWCPDPAKNENTRYYGNLTGGPETPPVLAFTNTTTTILLDENG
VGPLCKGDGLFLSAADVAGTYVDQRGRQYWRGLPRYFSIQLRKRNVRNPYPVSGLLNSLFNDLMPRMTGQSMQGSDAQVE
EVRVYEGMEGLAPEIDMPPKAPR
>P03088 ~~~~~~Major capsid protein VP1~~~
MAPTKRKGECPGAAPKKPKEPVQVPKLLIKGGVEVLEVKTGVDAITEVECFLNPEMGDPDENLRGFSLKLSAENDFSSDS
PERKMLPCYSTARIPLPNLNEDLTCGNLLMWEAVTVQTEVIGITSMLNLHAGSQKVHEHGGGKPIQGSNFHFFAVGGEPL
EMQGVLMNYRSKYPDGTITPKNPTAQSQVMNTDHKAYLDKNNAYPVECWVPDPSRNENARYFGTFTGGENVPPVLHVTNT
ATTVLLDEQGVGPLCKADSLYVSAADICGLFTNSSGTQQWRGLARYFKIRLRKRSVKNPYPISFLLSDLINRRTQRVDGQ
PMYGMESQVEEVRVFDGTERLPGDPDMIRYIDKQGQLQTKML
>P03089 ~~~~~~Major capsid protein VP1~~~
MAPTKRKGERKDPVQVPKLLIRGGVEVLEVKTGVDSITEVECFLTPEMGDPDEHLRGFSKSISISDTFESDSPNRDMLPC
YSVARIPLPNLNEDLTCGNILMWEAVTLKTEVIGVTSLMNVHSNGQATHDNGAGKPVQGTSFHFFSVGGEALELQGVLFN
YRTKYPDGTIFPKNATVQSQVMNTEHKAYLDKNKAYPVECWVPDPTRNENTRYFGTLTGGENVPPVLHITNTATTVLLDE
FGVGPLCKGDNLYLSAVDVCGMFTNRSGSQQWRGLSRYFKVQLRKRRVKNPYPISFLLTDLINRRTPRVDGQPMYGMDAQ
VEEVRVFEGTEELPGDPDMMRYVDKYGQLQTKML
>P0DOI3 ~~~~~~Major capsid protein VP1~~~
MSCTPCRPQKRLTRPRSQVPRVQTLATEVKKGGVEVLAAVPLSEETEFKVELFVKPVIGNTTAAQDGREPTPHYWSISSA
IHDKESGSSIKVEETPDADTTVCYSLAEIAPPDIPNQVSECDMKVWELYRMETELLVVPLVNALGNTNGVVHGLAGTQLY
FWAVGGQPLDVVGVTPTDKYKGPTTYTINPPGDPRTLHVYNSNTPKAKVTSERYSVESWAPDPSRNDNCRYFGRVVGGAA
TPPVVSYGNNSTIPLLDENGIGILCLQGRLYITCADMLGTANSRIHTPMARFFRLHFRQRRVKNPFTMNVLYKQVFNRPT
ETVDAQVGVTEVTMVEEIGPLPPSIQTTLPTSVNLTQLPRTVTLQSQAPLLNTQQNSK
>P04010 ~~~~~~Major capsid protein VP1~~~
MAPQRKRQDGACKKTCPIPAPVPRLLVKGGVEVLEVRTGPDAITQIEAYLNPRMGNNIPSEDLYGYSNSINTAFSKASDT
PNKDTLPCYSVAVIKLPLLNEDMTCDTILMWEAVSVKTEVVGISSLVNLHQGGKYIYGSSSGCVPVQGTTYHMFAVGGEP
LELQGLVASSTATYPDDVVAIKNMKPGNQGLDPKAKALLDKDGKYPVEVWCPDPSKNENTRYYGSFTGGATTPPVMQFTN
SVTTVLLDENGVGPLCKGDKLFLSCADIAGVHTNYSETQVWRGLPRYFNVTLRKRIVKNPYPVSSLLNSFFSGLMPQIQG
QPMEGVSGQVEEVRIFEGTEGLPGDPDLNRYVDKFCQHQTVLPVSNDM
>P49302 ~~~~~~Capsid protein VP1~~~
MAPKRKSGVSKCETKCTKACPRPAPVPKLLIKGGMEVLDLVTGPDSVTEIEAFLNPRMGQPPTPESLTEGGQYYGWSRGI
NLATSDTEDSPGNNTLPTWSMAKLQLPMLNEDLTCDTLQMWEAVSVKTEVVGSGSLLDVHGFNKPTDTVNTKGISTPVEG
SQYHVFAVGGEPLDLQGLVTDARTKYKEEGVVTIKTITKKDMVNKDQVLNPISKAKLDKDGMYPVEIWHPDPAKNENTRY
FGNYTGGTTTPPVLQFTNTLTTVLLDENGVGPLCKGEGLYLSCVDIMGWRVTRNYDVHHWRGLPRYFKITLRKRWVKNPY
PMASLISSLFNNMLPQVQGQPMEGENTQVEEVRVYDGTEPVPGDPDMTRYVDRFGKTKTVFPGN
>A5HBD5 ~~~~~~Major capsid protein VP1~~~
MACTAKPACTAKPGRSPRSQPTRVQSLPKQVRKGGVDVLAAVPLSEETEFKVELFVKPVIGNAEGTTPHYWSISSPLKTA
EAANVTPDADTTVCYSLSQVAPPDIPNQVSECDMLIWELYRMETEVLVLPVLNAGILTTGGVGGIAGPQLYFWAVGGQPL
DVLGLAPTEKYKGPAQYTVNPKTNGTVPHVYSSSETPRARVTNEKYSIESWVADPSRNDNCRYFGRMVGGAATPPVVSFS
NNSTIPLLDENGIGILCLQGRLYITCADLLGVNKNRVHTGLSRFFRLHFRQRRVRNPYTINLLYKQVFNKPADDISGQLQ
VTEVTMTEETGPLPPTVEGNVGVPTTSNLSHLPATVTLQATGPILNTQG
>P20223 ~~~VP1~~~Structural protein VP1~~~
MRLLRSLARKIASLKEAKVALKVASDPRKYFNEEQMTEAYRIFWQTWDGDIIRSARRFVEVAKANPKLTKGEATNIGVLL
GLFIFILIGIVLLPVIVSQVNNLTSGTSPQVTGTNATLLNLVPLFYILVLIIVPAVVAYKIYKD
>P03087 ~~~~~~Major capsid protein VP1~~~
MAPTKRKGSCPGAAPKKPKEPVQVPKLVIKGGIEVLGVKTGVDSFTEVECFLNPQMGNPDEHQKGLSKSLAAEKQFTDDS
PDKEQLPCYSVARIPLPNINEDLTCGNILMWEAVTVKTEVIGVTAMLNLHSGTQKTHENGAGKPIQGSNFHFFAVGGEPL
ELQGVLANYRTKYPAQTVTPKNATVDSQQMNTDHKAVLDKDNAYPVECWVPDPSKNENTRYFGTYTGGENVPPVLHITNT
ATTVLLDEQGVGPLCKADSLYVSAVDICGLFTNTSGTQQWKGLPRYFKITLRKRSVKNPYPISFLLSDLINRRTQRVDGQ
PMIGMSSQVEEVRVYEDTEELPGDPDMIRYIDEFGQTTTRMQ
>P30022 ~~~~~~Tegument protein VP22~~~
MARFHRPSEDEDDYEYSDLWVRENSLYDYESGSDDHVYEELRAATSGPEPSGRRASVRACASAAAVQPAARGRDRAAAAG
TTVAAPAAAPARRSSSRASSRPPRAAADPPVLRPATRGSSGGAGAVAVGPPRPRAPPGANAVASGRPLAFSAAPKTPKAP
WCGPTHAYNRTIFCEAVALVAAEYARQAAASVWDSDPPKSNERLDRMLKSAAIRILVCEGSGLLAAANDILAARAQRPAA
RGSTSGGESRLRGERARP
>P10233 ~~~~~~Tegument protein VP22~~~
MTSRRSVKSGPREVPRDEYEDLYYTPSSGMASPDSPPDTSRRGALQTRSRQRGEVRFVQYDESDYALYGGSSSEDDEHPE
VPRTRRPVSGAVLSGPGPARAPPPPAGSGGAGRTPTTAPRAPRTQRVATKAPAAPAAETTRGRKSAQPESAALPDAPAST
APTRSKTPAQGLARKLHFSTAPPNPDAPWTPRVAGFNKRVFCAAVGRLAAMHARMAAVQLWDMSRPRTDEDLNELLGITT
IRVTVCEGKNLLQRANELVNPDVVQDVDAATATRGRSAASRPTERPRAPARSASRPRRPVE
>Q4JQW6 ~~~ORF9~~~Tegument protein VP22~~~
MASSDGDRLCRSNAVRRKTTPSYSGQYRTARRSVVVGPPDDSDDSLGYITTVGADSPSPVYADLYFEHKNTTPRVHQPND
SSGSEDDFEDIDEVVAAFREARLRHELVEDAVYENPLSVEKPSRSFTKNAAVKPKLEDSPKRAPPGAGAIASGRPISFST
APKTATSSWCGPTPSYNKRVFCEAVRRVAAMQAQKAAEAAWNSNPPRNNAELDRLLTGAVIRITVHEGLNLIQAANEADL
GEGASVSKRGHNRKTGDLQGGMGNEPMYAQVRKPKSRTDTQTTGRITNRSRARSASRTDARK
>O11459 ~~~~~~Membrane-associated protein VP24~~~
MAKATGRYNLISPKKDLEKGVVLSDLCNFLVSQTIQGWKVYWAGIEFDVTHKGMALLHRLKTNDFAPAWSMTRNLFPHLF
QNPNSTIESPLWALRVILAAGIQDQLIDQSLIEPLAGALGLISDWLLTTNTNHFNMRTQRVKEQLSLKMLSLIRSNILKF
INKLDALHVVNYNGLLSSIEIGTQNHTIIITRTNMGFLVELQEPDKSAMNRKKPGPAKFSLLHESTLKAFTQGSSTRMQS
LILEFNSSLAI
>Q91DD5 ~~~~~~Membrane-associated protein VP24~~~
MAKATGRYNLVPPKKDMEKGVIFSDLCNFLITQTLQGWKVYWAGIEFDVSQKGMALLTRLKTNDFAPAWAMTRNLFPHLF
QNPNSVIQSPIWALRVILAAGLQDQLLDHSLVEPLTGALGLISDWLLTTTSTHFNLRTRSVKDQLSLRMLSLIRSNILQF
INKLDALHVVNYNGLLSSIEIGTSTHTIIITRTNMGFLVEVQEPDKSAMNSKRPGPVKFSLLHESAFKPFTRVPQSGMQS
LIMEFNSLLAI
>Q5XX02 ~~~~~~Membrane-associated protein VP24~~~
MAKATGRYNLVTPKRELEQGVVFSDLCNFLVTPTVQGWKVYWAGLEFDVNQKGITLLNRLKVNDFAPAWAMTRNLFPHLF
KNQQSEVQTPIWALRVILAAGILDQLMDHSLIEPLSGALNLIADWLLTTSTNHFNMRTQRVKDQLSMRMLSLIRSNIINF
INKLETLHVVNYKGLLSSVEIGTPSYAIIITRTNMGYLVEVQEPDKSAMDIRHPGPVKFSLLHESTLKPVATPKPSSITS
LIMEFNSSLAI
>Q05322 ~~~~~~Membrane-associated protein VP24~~~
MAKATGRYNLISPKKDLEKGVVLSDLCNFLVSQTIQGWKVYWAGIEFDVTHKGMALLHRLKTNDFAPAWSMTRNLFPHLF
QNPNSTIESPLWALRVILAAGIQDQLIDQSLIEPLAGALGLISDWLLTTNTNHFNMRTQRVKEQLSLKMLSLIRSNILKF
INKLDALHVVNYNGLLSSIEIGTQNHTIIITRTNMGFLVELQEPDKSAMNRMKPGPAKFSLLHESTLKAFTQGSSTRMQS
LILEFNSSLAI
>P35256 ~~~~~~Membrane-associated protein VP24~~~
MAELSTRYNLPANVTENSINLDLNSTARWIKEPSVGGWTVKWGNFVFHIPNTGMTLLHHLKSNFVVPEWQQTRNLFSHLF
KNPKSTIIEPFLALRILLGVALKDQELQQSLIPGFRSIVHMLSEWLLLEVTSAIHISPNLLGIYLTSDMFKILMAGVKNF
FNKMFTLHVVNDHGKPSSIEIKLTGQQIIITRVNMGFLVEVRRIDIEPCCGETVLSESVVFGLVAEAVLREHSQMEKGQP
LNLTQYMNSKIAI
>Q06906 ~~~~~~Occlusion-derived virus envelope protein E25~~~
MWGALILLILLVFLFYLWYNGKLNLNSLTESSPSLAQSSDSVQVDPQTEQLNVKLGNNKMTYMRVAHGDNKVSQVYVAEK
PMSMDDIEKQGNARVGANSLFIGTVYDQGVRSPNAPGASNDVTVTRTTANFDVKEYKNMFIVVKGLPPAKMTKEDNMLCF
TVDGLHVCLVDANAAPLSERVFARLPPSACTLVYTRNSAAQQLLLENGFTVVNAEHTAFLKNHKSYREL
>Q25BG9 ~~~~~~Structural protein 26~~~
MNFTKTDFVISMGMTIAVIFMSFTFPALGMTGDSVQENEIPEFNITKGSADFAREQPEYPARPSEGTLTYKNNSANWADG
RQTYLQKGDTEYLVSFFDNSDNQNPPEWKLNLIKFNSSGSYSESTIITEGESKTLTSADGSYEIGFNNLEIEDSAVGNET
ASVDWKVLEQPSDSTWIGRLPVVGGLISGANQLASVVGWIGAIIWHFVVQILVTIGNTLLIFYNIFVYFLEFMYWITTAY
SGVVAGAPTAWASVIVAIPGILLGFEFVKLVALAISLLPLT
>Q25BG8 ~~~~~~Structural protein 27~~~
MTEKTTKSKNSIATKNIVRVSLICFLLVFSVTVPFVFSPVSNASGQTTTLVDGFEDGTLSPWQTFQSFSVNTNNPYKGDY
SAVANSDDNTIFVTTNNDQYTAATVAVNIKDSNNNARILFQDRPDSNNADLLANIYIESGNVGFYGGNGQDTGIDISYSE
WVVFEIKNIDYSNQKYDIEVYDKSGNSLGSFSGADFYDSVSSMNGVDIYKVDSGSRVDHFTTGEFVSKSTVSGNVTDLEG
NPMANATVTADSVSTTTDDNGSYSIKLADGTYDITANKKNYKPQTKQIEVNGSAKTVDFSLGKIEKQLSIEGPNFVRPNQ
TIPYKVEYTNETGTYDVSNYSNITSANTTLLSIDETNKTLLAGGQNATVKVTAKYNTTEVTTNVTKQYYVSYLKLENIDT
VPPAKWMQAFLGFDDGYAENKNMKGIGSDIQWLLFTVIIMSTIAKLFDNPWAGIGSGVITGVLLWVLEYIGLGLLLSMVF
FGIFIGLILVRVRRDGGNEVTINES
>O71024 ~~~Segment-2~~~Outer capsid protein VP2~~~
MASEFGILICDKLKENTLEKTNCDVIITGVGKVGVHEEDGVLGYEWEETNYRLGLCEIENTMSISDFVYKQIRCEGAYPI
LPHYVTDVIKYGMVIHRNDHQIRVDRDEKSIGKIQIQPYFGDMYFSPEYYPATFIKRESLPISVDTIRGYIGARMRGIEA
RAGRIREGDGNLLECARRWEKAAYERIENEKALRCVAHETDPTYQILKKQRFGFVYPHYYVLNTNYNPTTVTRTSRINDW
LLKEKTQGVVKTAEAFSDNAELKTLAERMEEEELTEDIIRAVIRYGAKYATRSGMREDTLSLQELDRYCDSLTTFVHKKK
KDEGDDETARTIIRNQWIKGMPRMDFKKEMKITRGPIANWSFFMSIDAFKRNNKVDINPNHQTWKDHIKEVTDQMNRAQQ
GNNNKPLKVQIDGVSILTSEKYGTVGHWVDWVVDLIMLAQVKMLIKEYKFKRLNSQNLMSGMNKLVGALRCYAYCLILAL
YDYYGQDIEGFKKGSNSSAILETVIQMFPNFKQEIQANFGINLNIKDKKTIAIRRATMHSDFSSNEEYGYKFVFGWAARG
EEVLSNYGDVLSDEVEELFTKLRKKEHWDKVVEDPESYFIDELYQKNPAEVFYSAGYDTDQNVVIDGKMTEGVTYFSKRF
VSYWYRVEKITTKHLEFLTEENRKVAQFDFEDYKPMAIGEMGIHASTYKYESLLLGKNRGQKVNDSIALCNYDLALTNFG
VSRRQDCCWISSCSAIELSMRANIIIAIFRRIEDKRYENFAKILSGLTQQQDLYFPTYKHYYLFVLQKVLRDERRIDLNR
ICTELFDTQRRRGILLSFTALRFWNDSEFLGDALMMNFLHRVVFEMENVDVDYGKKWHPLLVSSEKGLRVIAVDVFNSMM
GVSTSGWLPYVERICSESDMRRRLNADELELKRWFFDYYATLPLERRGEPRLSFKYEGLTTWIGSNCGGVRDYVVQLLPM
RKSKPGLLCIAYGDDVNVQWVEHELRDFLMHEGSLGLVVISGKMLVNKSKLRVRNLKIYNRGTLDSLFLISGGNYTFGNK
FLLSKLMAKAE
>P32508 ~~~Segment-2~~~Outer capsid protein VP2~~~
MDELGIPVYKRGFPEHLLRGYEFIIDVGTKIESVGGRHDVTKIPEMNAYDIKQESIRTALWYNPIRNDGFVLPRVLDITL
RGYDERRAVVESTRHKSFHTNDQWVQWMMKDSMDAQPLKVGLDTQVWNVAHSLHNSVVEIDSKKADTMAYHVEPIEDASK
GCLHTRTMMWNHLVRIETFHAAQEVHILFKPTYDIVVHAERRDRSQPFRPGDQTLINFGRGQKVAMNHNSYDKMVEGLTH
LVIRGKTPEVIRDDIASLDEICNRWIQSRHDPGEIKAYELCKYLSTIGRKSLDREKEPEDEANLSIRFQEAIDNKFRQHD
PERLKIFEHRNQRRDEDRFYILLMIAGSDTFNTRVWWSNPYPCLRRKLIASETKLGDVYSMMRSWYDWSVRPTYAPYEKT
REQEKYIYGRVNLFDFVAEPGIKIIHWEYKLNHSTREITYAQGNPCDYYPEDDDVIVTKFDDVAYGQMINEMINGGWNQE
QFKMHKILKSEGNVLTIDFEKDAKLTTNEGVTMPEYFNKWIIAPMFNAKLRIKHEEIAQRQSDDPMVKRTLSPIAADPIV
LQRLTLARFYDIRPALIGQGLSRQQAQSTYDEEISKQAGYAEILKRRGIVQIPKKPCPTVTAQYTLELYSLSLINILQQH
VARDCDEEAIYEHPKADYELEIFGESIVDISQVIVLVFDLIFERRRRVRDVYESRYIITRIRRMRGKERLNVIAEFFPTY
GSLLNGLNSAYVVQDIMYLNFLPLYFLAGDNMIYSHRQWSIPLLLYTHEVMVIPLEVGSYNDRCGLIAYLEYMVFFPSKA
IRLSKLNEAHAKIAREMLKYYANTTVYDGGDNSNVVTTKQLLYETYLASLCGGFLDGIVWYLPITHPKKCIVAIEVSDER
VPASIRAGRIRLRFPLSSRHLKGSAIIQIDLVGRFTVYSEGIVSFLVCKKNLLKYKCEIILLKFSGHVFGNDEMLTKLLN
V
>P54092 3.1.3.16~~~VP2~~~Dual specificity protein phosphatase VP2~~~
MHGNGGQPAAGGSESALSREGQPGPSGAAQGQVISNERSPRRYSTRTINGVQATNKFTAVGNPSLQRDPDWYRWNYNHSI
AVWLRECSRSHAKICNCGQFRKHWFQECAGLEDRSTQASLEEAILRPLRVQGKRAKRKLDYHYSQPTPNRKKVYKTVRWQ
DELADREADFTPSEEDGGTTSSDFDGDINFDIGGDSGIVDELLGRPFTTPAPVRIV
>P54093 3.1.3.16~~~VP2~~~Dual specificity protein phosphatase VP2~~~
MHGNGGQPAAGGSESALSREGQPGPSGAAQGQVISNERSPRRYSTRTINGVQATNKFTAVGNPSLQRDPDWYRWNYNHSI
AVWLRECSRSHAKICNCGQFRKHWFQECAGLEDRSTQASLEEAILRPLRVQGKRAKRKLDYHYSQPTPNRKKVYKTVRWK
DELADREADFTPSEEDGGTTSSDFDEDINFDIGGDSGIVDELLGRPFTTPAPVRIV
>P69484 3.1.3.16~~~VP2~~~Dual specificity protein phosphatase VP2~~~
MHGNGGQPAAGGSESALSREGQPGPSGAAQGQVISNERSPRRYSTRTINGVQATNKFTAVGNPSLQRDPDWYRWNYNHSI
AVWLRECSRSHAKICNCGQFRKHWFQECAGLEDRSTQASLEEAILRPLRVQGKRAKRKLDYHYSQPTPNRKKVYKTVRWQ
DELADREADFTPSEEDGGTTSSDFDEDINFDIGGDSGIVDELLGRPFTTPAPVRIV
>Q9IZU7 3.1.3.16~~~VP2~~~Dual specificity protein phosphatase VP2~~~
MHGNGGQPAAGGSESALSREGQPGPSGAAQGQVISNERSPRRYSTRTINGVQATNKFTAVGNPSLQRDPDWYRWNYSHSI
AVWLRECSRSHAKICNCGQFRKHWFQECAGLEDRSTQASLEEAILRPLRVQGKRAKRKLDYHYSQPTPNRKKVYKTVRWQ
DELADREADFTPSEEDGGTTSSDFDEDINFDIGGDSGIVDELLGRPFTTPAPVRIV
>P69485 3.1.3.16~~~VP2~~~Dual specificity protein phosphatase VP2~~~
MHGNGGQPAAGGSESALSREGQPGPSGAAQGQVISNERSPRRYSTRTINGVQATNKFTAVGNPSLQRDPDWYRWNYNHSI
AVWLRECSRSHAKICNCGQFRKHWFQECAGLEDRSTQASLEEAILRPLRVQGKRAKRKLDYHYSQPTPNRKKVYKTVRWQ
DELADREADFTPSEEDGGTTSSDFDEDINFDIGGDSGIVDELLGRPFTTPAPVRIV
>P28711 ~~~ORF3~~~Protein VP2~~~
MNSILGLIDTVTNTIGKAQQIELDKAALGQQRELALQRIGLDRQALNNQVEQFNKILEQRVQGPIQSVRLARAAGFRVDP
YSYTNQNFYDDQLNAIRLSYKNLFKI
>Q66916 ~~~ORF3~~~Protein VP2~~~
MNSILGLIDTVTNTIGKAQQIELDKAALGQQRELALQRMNLDRQALNNQVEQFNKILEQRVQGPLQSVRLARAAGFRVDP
YSYTNQNFYDDQLNAIRLSYRNLFKN
>P03094 ~~~~~~Minor capsid protein VP2~~~
MGAALALLGDLVASVSEAAAATGFSVAEIAAGEAAAAIEVQIASLATVEGITSTSEAIAAIGLTPQTYAVIAGAPGAIAG
FAALIQTVSGISSLAQVGYRFFSDWDHKVSTVGLYQQSGMALELFNPDEYYDILFPGVNTFVNNIQYLDPRHWGPSLFAT
ISQALWHVIRDDIPSITSQELQRRTERFFRDSLARFLEETTWTIVNAPINFYNYIQQYYSDLSPIRPSMVRQVAEREGTR
VHFGHTYSIDDADSIEEVTQRMDLRNQQSVHSGEFIEKTIAPGGANQRTAPQWMLPLLLGLYGTVTPALEAYEDGPNQKK
RRVSRGSSQKAKGTRASAKTTNKRRSRSSRS
>P03096 ~~~~~~Minor capsid protein VP2~~~
MGAALTILVDLIEGLAEVSTLTGLSAEAILSGEALAALDGEITALTLEGVMSSETALATMGISEEVYGFVSTVPVFVSRT
AGAIWLMQTVQGASTISLGIQRYLHNEEVPTVNRNMALIPWRDPALLDIYFPGVNQFAHALNVVHDWGHGLLHSVGRYVW
QMVVQETQHRLEGAVRELTVRQTHTFLDGLARLLENTRWVVSNAPQSAIDAINRGASSASSGYSSLSDYYRQLGLNPPQR
RALFNRIEGSMGNGGPTPAAHIQDESGEVIKFYQAQVVSHQRVTPDWMLPLILGLYGDITPTWATVIEEDGPQKKKRRL
>P12908 ~~~~~~Minor capsid protein VP2~~~
MGAALTILVDLIEGLAEVSTLTGLSAEAILSGEALAALDGEITALTLEGVMSSETALATMGISEEVYGFVSTVPVFVNRT
AGAIWLMQTVQGASTISLGIQRYLHNEEVPTVNRNMALIPWRDPALLDIYFPGVNQFAHALNVVHDWGHGLLHSVGRYVW
QMVVQETQHRLEGAVRELTVRQTHTFLDGLARLLENTRWVVSNAPQSAIDAINRGASSASSGYSSLSDYYRQLGLNPPQR
RALFNRIEGSMGNGGPTPAAHIQDESGEVIKFYQAQVVSHQRVTPDWMLPLILGLYGDITPTWATVIEEDGPQKKKRRL
>A9Q1K8 ~~~~~~Inner capsid protein VP2~~~
MEVIEKLETIRETLNDTKDKKDFQKIHDELVQYLDGVDSLTIDDDKWNEILKLFKTIIIKLKSSTIKTTPLENELLQHEK
KRTVKEVEKVEEDIKTDVTNDTSKPTLMDKVLQVSNQPNNPYSADVLQIRTILSKTLFVDTDSEAYSLYVPESQKLDVSP
ITIELTTIEKYQPKVNILKQAVIVPSQNPLLADTYGAPEILFSTDFFDDITSNSSEGLQLYFFDKAYKLKKELPNLPFLS
SLDKDVNPLNPLNSVCKSFGQEKYYDMVMDRTDRGLDARRAAMQFDNVIVDAQNRTVQFNVRMHPFDLQLLRISQQFAEP
MQDLAPVVREYMMLGADGYVLTQKIRLDRDQQLIANRRSVVFDRMCELSGPLYRSRIIHSMRMMSKLWRTNVFRTSLEDE
ITKIYAAAEVSMISIDATTSALSTINIASAEQTLNALLNMSFFRCELDLIGSQSSFGAAMSAMIALMILPTDQENMDDEV
FDVLCNLVYNELIAWAADRPVFVRRAGATNAFRQFVNAGLNRDITNYMRFVLLRRPWLPLYNSRDVRRNAHVLVPNVDLA
NINDQVYVAINSFLNGIIEASRRNPNPNKTISANSFRKLMKNMRDICVNRLMPVIRLIRYNVERIGMILHMLPYSADIFD
INRNLRDERLRIKIPMSGFLSLVMGITKAPDAFDWSQILNFADDVRKMDYAEAISIEDSASVAIMRNDANRATSKKEIFI
SEVRPPTPTVASIQKIPSATLTAIFSDRQLINLIRDTHSFRVIREIAVALQAAFDNSPTSQHGVGKGAVLHPVPQNFGRS
SQFVRRDNILLQRPAGIQQFTIEDLKQGRYFQGLMAQIRARQPIIVNGPIPLRISDAAEIEQVTLAFLTMNSPYDAYIDP
RDLKQQKLLTDREVDLFIDQSPARPNDEFDNVMARTSVFIIDAPRAIVPINPQRLNFPYHDIMVTDSVTKFIEFTVALTP
DLQLFNGLLVFEQ
>P17462 ~~~~~~Inner capsid protein VP2~~~
MAYRKRGATVEADINNNDRMQEKDDEKQDQNNRMQLSDKVLSKKEEVVTDSQEEIKIRDEVKKSTKEESKQLLEVLKTKE
EHQKEIQYEILQKTIPTFEPKESILKKLEDIKPEQAKKQTKLFRIFEPRQLPIYRANGEKELRNRWYWKLKKDTLPDGDY
DVREYFLNLYDQVLTEMPDYLLLKDMAVENKNSRDAGKVVDSETASICDAIFQDEETEGAVRRFIAEMRQRVQADRNVVN
YPSILHPIDYAFNEYFLQHQLVEPLNNDIIFNYIPERIRNDVNYILNMDRNLPSTARYIRPNLLQDRLNLHDNFESLWDT
ITTSNYILARSVVPDLKELVSTEAQIQKMSQDLQLEALTIQSETQFLTGINSQAANDCFKTLIAAMLSQRTMSLDFVTTN
YMSLISGMWLLTVVPNDMFIRESLVACQLAIVNTIIYPAFGMQRMHYRNGDPQTPFQIAEQQIRKFSGSGIGWHFVNNNQ
FRQVVIDGVLNQVLNDNIRNVHVIKQLMQALMQLSRQQFPTMPVDYKRSIQRGILLLSNRLGQLVDLTRLLAYNYETLMA
CVTMNMQHVQTLTTEKLQLTSVTSLCMLIGNATVIPSPQTLFHYYNVNVNFHSNYNERINDAVAIITAANRLNLYQKKMK
AIVEDFLKRLHIFDVARVPDDQMYRLRDRLRLLPVEVRRLDIFNLILMNMDQIERASDKIAQGVIIAYRDMQLERDEMYG
YVNIARNLDGFQQINLEELMRTGDYAQITNMLLNNQPVALVGALPFVTDSSVISLIAKLDATVFAQIVKLRKVDTLKPIL
YKINSDSNDFYLVANYDWVPTSTTKVYKQVPQQFDFRNSMHMLTSNLTFTVYSDLLAFVSADTVEPINAVAFDNMRIMNE
L
>Q9QNB2 ~~~~~~Inner capsid protein VP2~~~
MAYRKRGAKREDLSQQHERLQEKEIENNTDVTMENKNNNNNRKQRLSDKVLSQKEEIITDVQDDIKIADEVKKSSKEESK
QLLEILKTKEEHQKEVQYEILQKTIPTFEPKESILKKLEDIRPEQAKKQMKLFRIFEPRQLPIYRTNGEKELRNRWYWKL
KKDTLPDGDYDVREYFLNLYDQILIEMPDYLLLKDMAVENKNSRDAGKVVDSETANICDAIFQDEETEGVIRRFIADMRQ
QVQSDRNIVNYPSILHPIDHAFNEYFLNHQLVEPLNNEIIFNYIPERIRNDVNYILNMDMNLPSTARYIRPNLLQDRLNL
HDNFESLWDTITTSNYVLARSVVPDLKEKELVSTEAQIQKMSQDLQLEALTIQSETQFLAGINSQAANDCFKTLIAAMLS
QRTMSLDFVTTNYMSLISGMWLLTVIPNDMFLRESLVACELAIINTIVYPAFGMQRMHYRNGDPQTPFQIAEQQIQNFQV
ANWLHFINNNRFRQVVIDGVLNQTLNDNIRNGQVINQLMEALMQLSRQQFPTMPVDYKRSIQRGILLLSNRLGQLVDLTR
LVSYNYETLMACITMNMQHVQTLTTEKLQLTSVTSLCMLIGNTTVIPSPQTLFHYYNVSVNFHSNYNERINDAVAIITAA
NRLNLYQKKMKSIVEDFLKRLQIFDVPRVPDDQMYRLRDRLRLLPVERRRLDIFNLILMNMEQIERASDKIAQGVLIAYR
DMQLERDEMYGFVNIARNLDGYQQINLEELMRTGDYGQITNMLLNNQPVALVGALPFVTDSSVISLIAKLDATVFAQIVK
LRKVDTLKPILYKINSDSNDFYLVANYDWIPTSTTKVYKQVPQPFDFRASMHMLTSNLTFTVYSDLLSFVSADTVEPINA
IAFDNMRIMNEL
>P12472 ~~~~~~Inner capsid protein VP2~~~
MAYRKRGARREANINNNDRMQEKDDEKQDQNNRMQLSDKVLSKKEEVVTDSQEEIKIADEVKKSTKEESKQLLEVLKTKE
EHQKEIQYEILQKTIPTFEPKESILKKLEDIKPEQAKKQTKLFRIFEPRQLPIYRANGEKELRNRWYWKLKKDTLPDGDY
DVREYFLNLYDQVLTEMPDYLLLKDMAVENKNSRDAGKVVDSETASICDAIFQDEETEGAVRRFIAEMRQRVQADRNVVN
YPSILHPIDYAFNEYFLQHQLVEPLNNDIIFNYIPERIRNDVNYILNMDRNLPSTARYIRPNLLQDRLNLHDNFESLWDT
ITTSNYILARSVVPDLKELVSTEAQIQKMSQDLQLEALTIQSETQFLTGINSQAANDCFKTLIAAMLSQRTMSLDFVTTN
YMSLISGMWLLTVVPNDMFIRESLVACQLAIVNTIIYPAFGMQRMHYRNGDPQRPFQIAEQQIQNFQVANWLHFVNNNQF
RQVVIDGVLNQVLNDNIRNGHVINQLMEALMQLSRQQFPTMPVDYKRSIQRGILLLSNRLGQLVDLTRLLAYNYETLMAC
VTMNMQHVQTLTTEKLQLTSVTSLCMLIGNATVIPSPQTLFHYYNVNVNFHSNYNERINDAVAIITGANRLNLYQKKMKA
IVEDFLKRLHIFDVARVPDDQMYRLRDRLRLLPVEVRRLDIFNLILMNMDQIERASDKIAQGVIIAYRDMQLERDEMYGY
VNIARNLDGFQQINLEELMRTGDYAQITNMLLNNQPVALVGALPFVTDSSVISLIANVDATVFAQIVKLRKVDTLKPILY
KINSDSNDFYLVANYDWVPTSTTKVYKQVPQQFDFRNSMHMLTSNLTFTVYSDLLAFVSADTVEPINAVAFDNMRIMNEL
>A2T3R5 ~~~~~~Inner capsid protein VP2~~~
MAYRKRGARRETNLKQDERMQEKEDSKNINNDSPKSQLSEKVLSKKEEIITDNQEEVKISDEVKKSNKEESKQLLEVLKT
KEEHQKEVQYEILQKTIPTFEPKESILKKLEDIKPEQAKKQTKLFRIFEPKQLPIYRANGERELRNRWYWKLKRDTLPDG
DYDVREYFLNLYDQVLMEMPDYLLLKDMAVENKNSRDAGKVVDSETAAICDAIFQDEETEGAVRRFIAEMRQRVQADRNV
VNYPSILHPIDHAFNEYFLQHQLVEPLNNDIIFNYIPERIRNDVNYILNMDRNLPSTARYIRPNLLQDRLNLHDNFESLW
DTITTSNYILARSVVPDLKELVSTEAQIQKMSQDLQLEALTIQSETQFLTGINSQAANDCFKTLIAAMLSQRTMSLDFVT
TNYMSLISGMWLLTVIPNDMFIRESLVACQLAIINTIVYPAFGMQRMHYRNGDPQTPFQIAEQQIQNFQVANWLHFVNYN
QFRQVVIDGVLNQVLNDNIRNGHVVNQLMEALMQLSRQQFPTMPVDYKRSIQRGILLLSNRLGQLVDLTRLLSYNYETLM
ACITMNMQHVQTLTTEKLQLTSVTSLCMLIGNATVIPSPQTLFHYYNVNVNFHSNYNERINDAVAIITAANRLNLYQKKM
KSIVEDFLKRLQIFDVARVPDDQMYRLRDRLRLLPVEIRRLDIFNLIAMNMEQIERASDKIAQGVIIAYRDMQLERDEMY
GYVNIARNLDGFQQINLEELMRSGDYAQITNMLLNNQPVALVGALPFITDSSVISLIAKLDATVFAQIVKLRKVDTLKPI
LYKINSDSNDFYLVANYDWIPTSTTKVYKQVPQQFDFRASMHMLTSNLTFTVYSDLLAFVSADTVEPINAVAFDNMRIMN
EL
>Q86218 ~~~~~~Inner capsid protein VP2~~~
MAYRKRGARRETNLKQDERMQEKEDSKNINNDSPKSQLSEKVLSKKEEIITDNQEEVKISDEVKKSNKEESKQLLEVLKT
KEEHQKEVQYEILQKTIPTFEPKESILKKLEDIKPEQAKKQTKLFRIFEPKQLPIYRANGERELRNRWYWKLKRDTLPDG
DYDVREYFLNLYDQVLMEMPDYLLLKDMAVENKNSRDAGKVVDSETAAICDAIFQDEETEGAVRRFIAEMRQRVQADRNV
VNYPSILHPIDHAFNGYFLQHQLVEPLNNDIIFNYIPERIRNDVNYILNMDRNLPSTARYIRPNLLQDRLNLHDNFESLW
DTITTSNYILARSVVPDLKELVSTEAQIQKMSQDLQLEALTIQSETQFLTGINSQAANDCFKTLIAAMLSQRTMSLDFVT
TNYMSLISGMWLLTVIPNDMFIRESLVACQLAIINTIVYPAFGMQRMHYRNGDPQTPFQIAEQQIQNFQVANWLHFVNYN
QFRQVVIDGVLNQVLNDNIRNGHVVNQLMEALMQLSRQQFPTMPVDYKRSIQRGILLLSNRLGQLVDLTRLLSYNYETLM
ACITMNMQHVQTLTTEKLQLTSVTSLCMLIGNATVIPSPQTLFHYYNVNVNFHSNYNERINDAVAIITAANRLNLYQKKM
KSIVEDFLKRLQIFDVARVPDDQMYRLRDRLRLLPVEIRRLDIFNLIAMNMEQIERASDKIAQGVIIAYRDMQLERDEMY
GYVNIARNLDGFQQINLEELMRSGDYAQITNMLLNNQPVALVGALPFITDSSVISLIAKLDATVFAQIVKLRKVDTLKPI
LYKINSDSNDFYLVANYDWIPTSTTKVYKQVPQQFDFRASMHMLTSNLTFTVYSDLLAFVSADTVEPINAVAFDNMRIMN
EL
>P20224 ~~~VP2~~~Capsid protein VP2~~~
MKWVQKAIKRPGRVHRYLMRLYGKRAFTKDGDIKASYLDKAIKHVKKAKIPKEKKRSLLSALLLAKRLKRMHRK
>P03093 ~~~~~~Minor capsid protein VP2~~~
MGAALTLLGDLIATVSEAAAATGFSVAEIAAGEAAAAIEVQLASVATVEGLTTSEAIAAIGLIPQAYAVISGAPAAIAGF
AALLQTVTGVSAVAQVGYRFFSDWDHKVSTVGLYQQPGMAVDLYRPDDYYDILFPGVQTFVHSVQYLDPRHWGPTLFNAI
SQAFWRVIQNDIPRLTSQELERRTQRYLRDSLARFLEETTWTVINAPVNWYNSLQDYYSTLSPIRPTMVRQVANREGLQI
SFGHTYDNIDEADSIQQVTERWEAQSQSPNVQSGEFIEKFEAPGGANQRTAPQWMLPLLLGLYGSVTSALKAYEDGPNKK
KRKLSRGSSQKTKGTSASAKARHKRRNRSSRS
>Q6XDK7 ~~~ORF2~~~Protein VP2~~~
MSWFTGASLAAGSLVDMAGTISSIVAQQRQIDLMAEANRIQADWVRRQEALQIRGQDISRDLAVNGTAQRVESLVNAGFT
PVDARRLAGGTETVSYGLLDRPILQRGILSGITETRHLQAMQGALSAFKNGASYGAPPAPSGFVNPNYQPSPPRLKLGPR
PPSTNV
>P27391 ~~~XXX~~~Minor capsid protein P30~~~
MALINPQFPYAGPVPIPGPAPTETMPLLNYRVEGRIAGIQQARQFMPFLQGPHRAVAEQTYHAIGTGIQMGQTFNQPLIN
TQEG
>Q8JPX6 ~~~~~~Transcriptional activator VP30~~~
MEHSRERGRSSNMRHNSREPYENPSRSRSLSRDPNQVDRRQPRSASQIRVPNLFHRKKTDALIVPPAPKDICPTLKKGFL
CDSKFCKKDHQLDSLNDHELLLLIARRTCGIIESNSQITSPKDMRLANPTAEDFSQGNSPKLTLAVLLQIAEHWATRDLR
QIEDSKLRALLTLCAVLTRKFSKSQLGLLCETHLRHEGLGQDQADSVLEVYQRLHSDKGGNFEAALWQQWDRQSLIMFIS
AFLNIALQIPCESSSVVVSGLATLYPAQDNSTPSEATNDTTWSSTVE
>Q77DJ5 ~~~~~~Transcriptional activator VP30~~~
MEASYERGRPRAARQHSRDGHDHHVRARSSSRENYRGEYRQSRSASQVRVPTVFHKKRVEPLTVPPAPKDICPTLKKGFL
CDSSFCKKDHQLESLTDRELLLLIARKTCGSVEQQLNITAPKDSRLANPTADDFQQEEGPKITLLTLIKTAEHWARQDIR
TIEDSKLRALLTLCAVMTRKFSKSQLSLLCETHLRREGLGQDQAEPVLEVYQRLHSDKGGSFEAALWQQWDRQSLIMFIT
AFLNIALQLPCESSAVVVSGLRTLVPQSDNEEASTNPGTCSWSDEGTP
>Q05323 ~~~~~~Transcriptional activator VP30~~~
MEASYERGRPRAARQHSRDGHDHHVRARSSSRENYRGEYRQSRSASQVRVPTVFHKKRVEPLTVPPAPKDICPTLKKGFL
CDSSFCKKDHQLESLTDRELLLLIARKTCGSVEQQLNITAPKDSRLANPTADDFQQEEGPKITLLTLIKTAEHWARQDIR
TIEDSKLRALLTLCAVMTRKFSKSQLSLLCETHLRREGLGQDQAEPVLEVYQRLHSDKGGSFEAALWQQWDRQSLIMFIT
AFLNIALQLPCESSAVVVSGLRTLVPQSDNEEASTNPGTCSWSDEGTP
>P35258 ~~~~~~Transcriptional activator VP30~~~
MQQPRGRSRTRNHQVTPTIYHETQLPSKPHYTNYHPRARSMSSTRSSAESSPTNHIPRARPPSTFNLSKPPPPPKDMCRN
MKIGLPCADPTCNRDHDLDNLTNRELLLLMARKMLPNTDKTFRSPQDCGSPSLSKGLSKDKQEQTKDVLTLENLGHILSY
LHRSEIGKLDETSLRAALSLTCAGIRKTNRSLINTMTELHMNHENLPQDQNGVIKQTYTGIHLDKGGQFEAALWQGWDKR
SISLFVQAALYVMNNIPCESSISVQASYDHFILPQSQGKGQ
>P27384 ~~~XXXI~~~Penton protein P31~~~
MNVNNPNQMTVTPVYNGCDSGEGPQSVRGYFDAVAGENVKYDLTYLADTQGFTGVQCIYIDNAENDGAFEIDVEETGQRI
KCPAGKQGYFPLLVPGRAKFVARHLGSGKKSVPLFFLNFTIAQGVW
>P27390 ~~~XXXIV~~~Protein P34~~~
MNDFVGPIVTVLTAIIGVAILAVLVSRNSNTAGVIKAGSGGFSSMLGTALSPVTGGTGFAMTNNYSGF
>Q8JPY0 ~~~~~~Polymerase cofactor VP35~~~
MYNNKLKVCSGPETTGWISEQLMTGKIPVTDIFIDIDNKPDQMEVRLKPSSRSSTRTCTSSSQTEVNYVPLLKKVEDTLT
MLVNATSRQNAAIEALENRLSTLESSLKPIQDMGKVISSLNRSCAEMVAKYDLLVMTTGRATSTAAAVDAYWKEHKQPPP
GPALYEENALKGKIDDPNSYVPDAVQEAYKNLDSTSTLTEENFGKPYISAKDLKEIMYDHLPGFGTAFHQLVQVICKIGK
DNNLLDTIHAEFQASLADGDSPQCALIQITKRVPIFQDVPPPIIHIRSRGDIPRACQKSLRPAPPSPKIDRGWVCLFKMQ
DGKTLGLKI
>Q5XX07 ~~~~~~Polymerase cofactor VP35~~~
MQQDRTYRHHGPEVSGWFSEQLMTGKIPLTEVFVDVENKPSPAPITIISKNPKTTRKSDKQVQTDDASSLLTEEVKAAIN
SVISAVRRQTNAIESLEGRVTTLEASLKPVQDMAKTISSLNRSCAEMVAKYDLLVMTTGRATATAAATEAYWNEHGQAPP
GPSLYEDDAIKAKLKDPNGKVPESVKQAYINLDSTSALNEENFGRPYISAKDLKEIIYDHLPGFGTAFHQLVQVICKIGK
DNNILDIIHAEFQASLAEGDSPQCALIQITKRIPAFQDASPPIVHIKSRGDIPKACQKSLRPVPPSPKIDRGWVCIFQFQ
DGKALGLKI
>Q6V1Q9 ~~~~~~Polymerase cofactor VP35~~~
MTTRTKGRGHTAATTQNDRMPGPELSGWISEQLMTGRIPVSDIFCDIENNPGLCYASQMQQTKPNPKTRNSQTQTDPICN
HSFEEVVQTLASLATVVQQQTIASESLEQRITSLENGLKPVYDMAKTISSLNRVCAEMVAKYDLLVMTTGRATATAAATE
AYWAEHGQPPPGPSLYEESAIRGKIESRDETVPQSVREAFNNLDSTTSLTEENFGKPDISAKDLRNIMYDHLPGFGTAFH
QLVQVICKLGKDSNSLDIIHAEFQASLAEGDSPQCALIQITKRVPIFQDAAPPVIHIRSRGDIPRACQKSLRPVPPSPKI
DRGWVCVFQLQDGKTLGLKI
>Q05127 ~~~~~~Polymerase cofactor VP35~~~
MTTRTKGRGHTAATTQNDRMPGPELSGWISEQLMTGRIPVSDIFCDIENNPGLCYASQMQQTKPNPKTRNSQTQTDPICN
HSFEEVVQTLASLATVVQQQTIASESLEQRITSLENGLKPVYDMAKTISSLNRVCAEMVAKYDLLVMTTGRATATAAATE
AYWAEHGQPPPGPSLYEESAIRGKIESRDETVPQSVREAFNNLNSTTSLTEENFGKPDISAKDLRNIMYDHLPGFGTAFH
QLVQVICKLGKDSNSLDIIHAEFQASLAEGDSPQCALIQITKRVPIFQDAAPPVIHIRSRGDIPRACQKSLRPVPPSPKI
DRGWVCVFQLQDGKTLGLKI
>P35259 ~~~~~~Polymerase cofactor VP35~~~
MWDSSYMQQVSEGLMTGKVPIDQVFGANPLEKLYKRRKPKGTVGLQCSPCLMSKATSTDDIIWDQLIVKRTLADLLIPIN
RQISDIQSTLSEVTTRVHEIERQLHEITPVLKMGRTLEAISKGMSEMLAKYDHLVISTGRTTAPAAAFDAYLNEHGVPPP
QPAIFKDLGVAQQACSKGTMVKNATTDAADKMSKVLELSEETFSKPNLSAKDLALLLFTHLPGNNTPFHILAQVLSKIAY
KSGKSGAFLDAFHQILSEGENAQAALTRLSRTFDAFLGVVPPVIRVKNFQTVPRPSQKSLRAVPPNPTIDKGWVCVYSSE
QGETRALKI
>Q6UY68 ~~~~~~Polymerase cofactor VP35~~~
MWDSSYMQQVSEGLMTGKVPIDQVFGANPLEKLYKRRKPKGTVGLQCSPCLMSKATSTDDIIWDQLIVKKTLADLLIPIN
RQISDIQSTLSEVTTRVHEIERQLHEITPVLKMGRTLEAISKGMSEMLAKYDHLVISTGRTTAPAAAFDAYLNEHGVPPP
QPAIFKDLGVAQQACSKGTMVKNATTDAADKMSKVLELSEETFSKPNLSAKDLALLLFTHLPGNNTPFHILAQVLSKIAY
KSGKSGAFLDAFHQILSEGENAQAALTRLSRTFDAFLGVVPPVIRVKNFQTVPRPCQKSLRAVPPNPTIDKGWVCVYSSE
QGETRALKI
>Q03039 ~~~~~~Polymerase cofactor VP35~~~
MWDSSYMQQVSEGLMTGKVPIDQVFGANPSEKLHKRRKPKGTVGLQCSPCLMSKATSTDDIVWDQLIVKKTLADLLIPIN
RQISDIQSTLNEVTTRVHEIERQLHEITPVLKMGRTLEAISKGMSEMLAKYDHLVISTGRTTAPAAAFDAYLNEHGVPPP
QPAIFKDLGVAQQACSKGTMVKNETTDAADKMSKVLELSEETFSKPNLSAKDLALLLFTHLPGNNTPFHILAQVLSKIAY
KSGKSGAFLDAFHQILSEGENAQAALTRLSRTFDAFLGVVPPVIRVKNFQTVPRPCQKSLRAVPPNPTIDKGWVCVYSSE
QGETRALKI
>P68348 ~~~~~~38 kDa phosphoprotein~~~
MEFEAEHEGLTASWVAPAPQGGKGAEGRAGVADEAGHGKTEAECAEDGEKCGDAEMSALDRVQRDRWRFSSPPPHSGVTG
KGAIPIKGDGKAIECQELTGEGEWLSQWEELPPEPRRSGNEHLDESRYAKQTERGSSTGKEEGDGMKQMGELAQQCEGGT
YADLLVEAEQAVVHSVRALMLAERQNPNILGEHLNKKRVLVQRPRTILSVESENATMRSYMLVTLICSAKSLLLGSCMSF
FAGMLVGRTADVKTPLWDTVCLLMAFCAGIVVGGVDSGEVESGETKSESN
>O71025 ~~~Segment-3~~~Core protein VP3~~~
MQGNERIQDKNEKEKAYAPYLDGASVSTDNGPILSVFALQEIMQKIRQNQSDMAAHAPDVDGAIPEVMTIISGIKGLLEE
KDYKVINAPPNSFRTIPMQSMEYVLQVNTFYERMSEIGGPVDETDPIGFYALILEKLKFLKSEGAFILQGIATKDYRGAE
IADPEIIGVSFQNALSHLAAIDRQIIQDTLNGMIIENGLVADRNVDVFRAAMSDPIYRIRNVLQGYIEGIQYGELRESVN
WLMRLGLRKRIEFANDFLTDFRRADTIWIISQRLPINANVIWNVPRCHIANLNTNVALCLPTGEYLMPNPRINSITITQR
ITQTNPFSIISGLTPTAVQMNDVRKIYLALMFPNQIILDIKPDSSHAVDPVLRMVAGVLGHVMFTYGPIMTNITPTMAEL
LDAALSDYLLYMYNNRIPINYGPTGQPLDFRIGARNQYDCNAFRADPQTGRGYNGWGVVDVQRVQPSPYDHVQRVIRYCD
IDSREIIDPRTYGMNMTYPIFREMLRMLVAAGKDQEAAYLRQMLPFHMIRFARINQIINEDLLSAFSLPDQNFDVVLHNL
IQGNFGETDPVILEVSWASIWFAFVRRFEPIARSDLLEAAPLIEARYAAELSTMQMDVQQLRMMRARVPDTVINATPSQC
WKAVLKNAPEPIKNLMNLSHSFSFVNVRDIVRWSQQRDIQESLAYVLNREAWAIANDFEDLMLVDHVYIQRTMLPEPRLD
DINEFRRQGFFHTNMIDGAPPIGDVTHYTYAIANLQANMGQFRAAIRRTLDDNGWIQFGGMLRNIKIKFFDSRPPDEILT
AMPYVYTEEERDGVRMVAFKYATTATAYFLLYNVEYSNTPDTLITVNPTFTMTKIHMRKKIVRRVRAPDVLSQVNKRLVA
YKGKMRLMDVTKCLKTGVQLARPTI
>Q9INI4 ~~~Segment-3~~~Viral guanylyltransferase VP3~~~
MELFSDSGSIVENFKERINKLVFDYSLNHGGFRKTYKIQRDRVYYMLGDAHHANLSGKCLMLYNSEKDIFEGLGFKVKGS
RINVSKTQILRNYEINFETIIGVEPNGLKTISTAKDVKKLYDIYSYKSSLHPFDDFMAHCINRWGMSIPASLERIIKSEI
IKVRSGVLNRNSELYNYIPTVDASFSEMSRGPANVILTDGKLVPDGTCFGPILSKSVEDPRLKNEFRSKGLIMVHDYFIL
IGESPGPHYKKYTKMTKDNTIFWDPRRTDHKFNNVVSYFKKENIRDVVEYTTDALNRGLKPLVLIDIRKDKPKNLNTPEG
AIEWERMVHDDNNLIIDMVNALDKRVTVCAKLRPAFMQVGSMRKLLRPVRILPLPYLRRSTAEFNMFVPNEALMNGNEIY
DVTYDDLVRMSSEVFVLKNIIGGLYNMYLKDMHLNLGVVNKSVSLSDGSSAIWSLSNINNERISNFNFNNFLYAAPYSDF
ATSSVKRHFKGRNYSDWCLNILDEVNLKDGVYLVPLYAIVGGGQITSHDFVNAIITDQEQLIDFTQSERALSTQVVKLVS
FILKDSFTAKGLNWTEIDNEIRNRRLSSLSGVGFTVTKMLDGKVLVDGKVVTVSGHMLYILLGSILGLPYGIKKYLKEIE
LNILKPGSSYERGVGGRVWHGLISHYLAVDCVIDVIDEYMVCTYEDRSKLNVVLRYVKSKLLELGSKYDVYLSVDERLVL
>P56582 ~~~Segment-3~~~Core protein VP3~~~
MAAQNEQRPERIKTTPYLEGDVLSSDSGPLLSVFALQEIMQKVRQVQADYMTATREVDFTVPDVQQILDDIKALAAAQVY
KIVKVPSTSFRHIVTQSRDRVLRVDTYYEEMSQVGDVITEDEPEKFYSTIIKKVRFIRGKGSFILHDIPARDHRGMEVAE
PEVLGVEFKNVLPVLTAEHRAMIQNALDGSIIENGNVATRDVDVFIGACSEPFYRIYNRFQGYIEAVQLQELRNSIGWLE
RLGQRKRITYSQEVLTDFRRQDMIWVLALQLPVNPQVVWDVPRSSIANLIMNIATCLPTGEYIAPNPRISSITLTQRITT
TGPFAILTGSTPTAQQLNDVRKIYLALMFPGQIILDLKIDPGERMDPAVRMVAGVVGHLLFTAGGRFTNLTQNMARQLDI
ALNDYLLYMYNTRVQVNYGPTGEPLDFQIGRNQYDCNVFRADFATGTGYNGWATIDVEYRDPAPYVHAQRYIRYCGIDSR
ELINPTTYGIGMTYHCYNEMLRMLVAAGKDSEAAYFRSMLPFHMVRFARINQIINEDLHSVFSLPDDMFNALLPDLIAGA
HQNADPVVLDVSWISLWFAFNRSFEPTHRNEMLEIAPLIESVYASELSVMKVDMRHLSLMQRRFPDVLIQARPSHFWKAV
LNDSPEAVKAVMNLSHSHNFINIRDMMRWVLLPSLQPSLKLVLEEEAWAAANDFEDLMLTDQVYMHRDMLPEPRLDDIER
FRQEGFYYTNMLEAPPEIDRVVQYTYEIARLQANMGQFRAALRRIMDDDDWIGFGGVLRTVRVKFFDARPPDDILQGLPF
SYDTNEKGGLSYATIKYATETTIFYLIYNVEFSNTPDSLVLINPTYTMTKVFINKRIVERVRVGQILAVLNRRFVAYKGK
MRIMDITQSLKMGTKLAAPTV
>P54095 ~~~VP3~~~Apoptin~~~
MNALQEDTPPGPSTVFRPPTSSRPLETPHCEIRIGIAGITITLSLCGCANARAPTLRSATADSSESTGFKNAPDLRTDQP
KPPSKKRSCDPSEYRVSELKENLITTTPSRPRTARRCIRL
>P54096 ~~~VP3~~~Apoptin~~~
MNALQEDTPPGPSTVFRPPTSSRPLETPHCREIRIGIAGITITLSLCGCANARAPTLRSATADNSESTGFKNVPDLRTDQ
PKPPSKKRSCDPSEYRVSELKESLTTTTPSRPRTARRCIRL
>Q99152 ~~~VP3~~~Apoptin~~~
MNALQEDTPPGPSTVFRPPTSSRPLETPHCREIRIGIAGITITLSLCGCANARAPTLRSATADNSESTGFKNVPDLRTDQ
PKPPSKKRSCDPSEYRVSELKESLITTTPSRPRTAKRRIRL
>Q9IZU6 ~~~VP3~~~Apoptin~~~
MNAHQEDTPPGPSTVFRPPTSSRPLETPHCREIRIGIAGITVTLSLCGCANARVPTLRSATADNSENTGFKNVPDLRTDQ
PKPPSKKRSCDPSEYRVSELKESLITTTPSRPRTARRCIRL
>P54094 ~~~VP3~~~Apoptin~~~
MNALQEDTPPGPSTVFRPPTSSRPLETPHCREIRIGIAGITITLSLCGCANARAPTLRSATADNSESTGFKNVPDLRTDQ
PKPPSKKRSCDPSEYRVSELKESLITTTPSRPRTARRCIRL
>Q7TDB4 ~~~S3~~~Outer capsid protein VP3~~~
MFDRQYPTVHDLYIPFPVFQSRLEQPFDTTVTSIRELRTISSQSTVYGYDLTVNDPLYYDLEPLLGNSISLTLDPKLTDS
ERLDAVYLDINNRLANCHGDLLRKFSATSYSIDTSVIPYVFLPMYRYLLHIMTGSAFNSLFRQMIVNVDANCANADESLL
TSAQHLFALLNKINPSRQLPAPLRHILINATIADVPYDMQGKFVPYNVVFLPTSNESLRDATIARIREPAGYHPRPSIVV
PHYFVFRSTTDALCRFMYLAKRTFLHVNDKTATHTSVRRCELLRLNFPLDQSFAQLSLLVQLQLPLSTLSIQRLPHLSTT
VNQLITLASSSYSEQAIINLLRVNWNVIGYIELSTLGEPSLPAIRVYDFTSSMNTRSVTQGPNVQIRTRSNAIDVHVREF
IRFGRYLPLEIPKCRVPRLVSLQVINYSLNHLLSVTPWPDQYDVTRHRPERIIENSVKRTIQYQEYDPSVGTWATSSDMT
NYTHIPSDSYYHQFIVTCLRSFCGLRDLPRENSARYPYVVLLYGLALGHEIAPSRMGLTYAMTSHMISYVLSTITTGIDV
APSEIISRFKLFLIDVPFADTIIHDLRKVTPNVNVHSTSIFTSNERGDARILTGWVVIRIAVSRFEQQKRSFEYMSYFND
ILRFCDGGIIHFDIPDATFLMHVVTSLQCTPNRRVKVLSYFASQSPFSLTLHFYRDTTDPLLPVANIGHWVTRHQMKRYA
YTDRDSTIPLRHEVIPALSTVMSRMTAEYSFVCQKSDLPVCLSALSTISNYARVATWTDYRGIAHWSGSAVIDPLRLLDS
SRTGVLATNVPIEPLIAPSHGVPRLERSTYRVVDAFHLCSLIGPIFIQREFNIWTASRTSERTRHVIDVGGRDGAFRGLF
PHAMYTVIDPAPAPQHMISNYISEPWDFNDFQGSLDRIMDTLGIIDPQDVFLVFSHVFISALNRPAAHVNALEQLGALQC
SSVVSTQTSGSSASTLYSSYVNHNPFLEIRMENAAYLTRTYPSPYPLPTRAEMNEAILNNARSRLHQTSAAEILDLAMRF
GYAPSYEAIVTLPALCDQHVVYAIQ
>A9Q1K9 ~~~~~~Protein VP3~~~
MAKLIIINSEKGEKVETHEDIFKLSNLQQREIYAITNERTKSILLNQTFYTILDIENEPKDRVAFDSYNSLFPTSIFSYN
RQDRLFGTCNHVLDNNIHYSFALFDSMVDNLSTYLPNDWNIIKIPDSIDYPIGNDLLFYVFDNLVHMTIDQFVNSEEKQM
NTVPKCKESQDRIKEVFTDIMSHLYMPAIDYDPQSYNYRISRREIGNLVRDQVFSLVKGHIHLIGPEMESLRNIIMFLHA
GNSITFHTIDTSKKSNYIKELEFNKKTKLTMANVLINQRKNMNNFFKGLIKHYMTYGIPNKVYYIGAYPSYWLELITWVP
FNIITYDPKYRHVDNDKIIWHDKLFDRNDIETIESKSYIYIDIRTDIRKLDMTKKQRIFKEEDDMIVEIATKLASKQCTV
MFKRKIFPGNNMSFGDPLFHPKLTQLGREYYNCITTIVSPSIYKESELYSLLLSARSNNVSNYVYGGSKFDQSSIVNYNS
TVIALYSLSNTVNSLETIEHAIKFNHIITFPHRTDRGDWRNIEELNNLSPFQNKKRQLEFEDWSIDPKNYAMKFGCEIVS
ESVFLQLGHSRALIPDLYNHIISIRMEMPLFYPDRFFSHIGIRQPSIFKRDSYMTSRLSAYISRQLTHSIDLSVLKKNHF
EGYSGHLIAIETSFSSLVFTMSPYRWLIRAKKSLTKSKIRDKFKIGDGQPHTREEFENTYDYLKINRLVNSTFHSLLLG
>Q6WAT6 ~~~~~~Protein VP3~~~
MKVLALRHSVAQVYADTQIYIHDETKDDYENAFFISNLTTHNILYLNYSVKTLQILNKSGIAAVEIQKMDKLFTLIRCNF
TYDYIDDVVYLHDYSYYTNNEIRTDQHWVTKTNIEDYLLPGWKLTYVGYNGNDTRGHYNFSFKCQNAATDDDAIIEYIYS
NELDFQNFILKKIKERMTTSLPIARLSNRVFRDKLFKTLVSDHSKIVNVGPRNESMFTFLDHPSIKQFSNGPYLVKDTIK
LKQERWLGKRLSQFDIGQYKNYVKCINNLISIYDMYHEKPIIYMLGSAPSYWIHDVKQYSNLKFETWDPLDTPYSDLHHK
ELFYISDVTKLKDNSILYIDIRTDRENADWKTWRKIVEEQTVNNLNIAYKYLSTGKAKVCCVKMTAMDLELPISAKLLHH
PTTEIRSEFYLIMDIWDSKNIKRFIPKGVLYSYINNIITENVFIQQPFKLKTLRNEYVVALYALSNDFNNREDVIKLINN
QKNALITVRINNTFKDEPKVGFKDIYDWTFLPTDFETNESIITSYDGCLGVFGLSISLASKPTGNNHLFMLSGTNKYFNM
DQFANHMSISRRSHQIRFSESATSYSGYIFRDLSNNNFNLIGTNVENSVSGHVYNALIYYRYNYSFDLKRWIYLHSTNKA
SIEGGRYYEHAPIELIYACRSAREFAKLQDDLTVLRYSNEIENYINKVYSITYADDPNYFIGIKFKNIPYEYDVKVPHLT
FGVLNISDSMVPDVVVILKKFKSELFRMDVTTSYTYMLSDEIYVANVSGVLSTYFKLYNAFYKEQITFGQSRMFIPHITL
SFSNKKVVRIDSTRLNIDFIYLRKIKGDTVFDMAE
>Q82041 ~~~~~~Protein VP3~~~
MRVLGLFERGNNLNFADTYVYTWNQQYSYHENAFLISNQVATTIILYLDGININEVNKAFELLNSNGIPALIIKPDHIGI
FTSSNFTYDWQYKIVYFHEYTYYKNNEFIVSDEFWLYTNINELLPYKILYYERGMRELYAGREYTLYNTATDDDILYKYI
YEKDSIMNGTDYKKLYDTNSVKNFVHFMRLLRMRFAVPFDQLSNRITRSRVFSKSRIHIGLRNESIPQALDNIHSQWINY
SANGIVISELKGLGSYSEKKISEFGIGQFKNYMNFLTLMFYIKNMKKKPSCTIIGAAPGYWISSMKKYFTIVTYDNKEVD
STEHHNRYFTDDDIVNVKTNGVYIDVRSEFKTNDWRQRRKLIEEETIKWLEISYKLLENKRVEAILLKMTAMDGEIPDGY
CVHSPTTYRKSEYYLLIDKHIIKRQKIKVTKSLMYNAINTIYSDNVFISGKYSLRGKTEGVLALYCLSNTINQKEKVIQY
ANSFSGTCMTVRLNNTYEVDKIIDFKTNSDHTFLPSDFTCSLNTILTSYRGYAGIFGYAITKDLKSNGNNHIYIIPNARD
ENNFDTFGSHLGLSRYSHSKRFSESATTMSGYIFRDMVSGKENMQDTDKDNYASGHVFNAIAHYRFDYTYDIVGWLRLHK
TGQFKVKSDIYKEHTDSEIRNAIESAYVYYLLDGDKVGEKYSKKMMEIWEVQV
>Q9QNB1 ~~~~~~Protein VP3~~~
MKVLALRHSVAQVYADTQTYLHDDSKDEYENAFLISNLTAHNILYLNYSLKTLKILNKSGIAAVETQSPDELFALIRCNF
TYDYENNIVYLHDYSYYTNNEIRTDQHWITKTDIIDYLLPGWKLTYVGYNGKNTRGHYNFSFSCQNAATDDDIIVEYIYS
NELDFQNFMLRKIKERMTTSLPIARLSNRVFRDKLFPSIANMHKRVINVGPRNESMFTFLNFPTIKQFSNGAYIVKHTIK
LKQEKWLGKRVSQFDIGQYKNMLNVVTTIYYYYNLYHSKPIIYMLGSAPSYWIYNVKQYSDFTFETWDPVDTPYSTTHHK
ELFFDKDVMKLKDDSVLYIDIRTDRKNMDWKEWRKVVEQQTVSNLNIAYNYLSTGKAKVCCVKLTAMDLELPITAKLLHH
PTTEVRSEFYAILDVWDIITIKRFIPKGVFYAFINNITTENVFIQPPFKLKASPTDYIVALYALSNDFNSRQDVINLINK
QKQSLITVRINNTFKDEPKVNFKNIYDWTFLPTDFELKDSVITSYDGCLGMFGLSLSLSSKPTGNNHLFIINGTDKYYKL
DQYANHMGISRRSHQIRFSESATSYSGYIFRDLSNNNFNLIGTNVENSVSGHVYNALIYYRYNYAFDLKRWIYLHSIGKV
AVEGGRYYEHAPIELIYACRSAREFAILQDDLTVLRYANEIEGYINKVYSITYADDPNYFIGITFNNIPYEYDVKVPHLT
LGVLFISDNMIDEVVAVLKEMKTELFKTEISTSYNYMLFDNVYVANASGVLSTYFKLYNMFYRNHITFGQSRMFIPHITL
SFSNKRTIRIESTRLKINSIYLRKIRGETVFDMSE
>Q8BB04 ~~~~~~Protein VP3~~~
MKVLALRHSVAQVYADTQIYTHDETKDDYENAFLISNLTTHNILYLNYSVKTLQILNKSGIAAIEIQKNDELFTLIRCNF
TYDYIDDIVYLHDYSYYTNNEIRTDQHWVTKTNIEDYLLPGWKLMYVGYNGNDTRGHYNFSFKCQNAATDDDAIIEYIYS
NELDFQNFILKKIKERMTTSLPIARLSNRVFRDKLFKTLVSDHSRVVNVGPRNESMFTFLDHPSIKQFSNGPYLVKDTIK
LKQERWLGKRLSQFDIGQYKNMLNVLTTLYQYYDMYHEKPIIYMVGSAPSYWIHDVKQYSDLKFETWDPLDTPYSDLHHK
ELFYASDVTKLKDNSILYVDIRTDRENADWKTWRKIVEEQTANNLNIAYKYLSTGKAKVCCVKMTAMDLELPISAKLLHH
PTTEIRSEFYLIMDIWDSKNTKRFIPKGVLYSYINNTITENVFIQQPFKLRTLRNEYVVALYALSNDFNNREDVVKLVNN
QKNALITVRINNTFKDEPKVGFKDIYDWTFLPTDFETNESIITSYDGCLGMFGLSISLASKPTGNNHLFILSGTNKYFKL
DQFANHMSISRRSHQIRFSESATSYSGYIFRDLSNNNFNLIGTNVENSVSGHVYNALIYYRYNYSFDLKRWIYLHSTNKA
SIEGGRYYEHAPIELIYACRSAREFAKLQDDLTVLRYSNEIENYINKVYSITYADDPNYFIGIKFKNIPYEYDVKVPHLT
FGVLNISDSMVPDVVAILKKFKNELFRMDVTTSYTYMLSDEIYVANVSGVLSTYFKLYNAFYKEQITFGQSRMFIPHITL
SFSNKRVVRIGSTRLNIDFIYLRKIKGDTVFDMTE
>A2T3S5 ~~~~~~Protein VP3~~~
MKVLALRHSVAQVYADTQVYVHDDTKDSYENAFLISNLTTHNILYLNYSIKTLEILNKSGIAAIALQSLEELFTLIRCNF
TYDYELDIIYLHDYSYYTNNEIRTDQHWITKTNIEEYLLPGWKLTYVGYNGSETRGHYNFSFKCQNAATDDDLIIEYIYS
EALDFQNFMLKKIKERMTTSLPIARLSNRVFRDKLFPSLLKEHKNVVNVGPRNESMFTFLNYPTIKQFSNGAYLVKDTIK
LKQERWLGKRISQFDIGQYKNMLNVLTAIYYYYNLYKSKPIIYMIGSAPSYWIYDVRHYSDFFFETWDPLDTPYSSIHHK
ELFFINDVKKLKDNSILYIDIRTDRGNADWKKWRKTVEEQTINNLDIAYEYLRTGKAKVCCVKMTAMDLELPISAKLLHH
PTTEIRSEFYLLLDTWDLTNIRRFIPKGVLYSFINNIITENVFIQQPFKVKVLNDSYIVALYALSNDFNNRSEVIKLINN
QKQSLITVRINNTFKDEPKVGFKNIYDWTFLPTDFDTKEAIITSYDGCLGLFGLSISLASKPTGNNHLFILSGTDKYYKL
DQFANHTSISRRSHQIRFSESATSYSGYIFRDLSNNNFNLIGTNIENSVSGHVYNALIYYRYNYSFDLKRWIYLHSIDKV
DIEGGKYYELAPIELIYACRSAKEFATLQDDLTVLRYSNEIENYINTVYSITYADDPNYFIGIQFRNIPYKYDVKIPHLT
FGVLHISDNMVPDVIDILKIMKNELFKMDITTSYTYMLSDGIYVANVSGVLSTYFKIYNVFYKNQITFGQSRMFIPHITL
SFNNMRTVRIETTKLQIKSIYLRKIKGDTVFDMVE
>P20225 ~~~VP3~~~Structural protein VP3~~~
MEISLKPIIFLVVFIIVGIALFGPINSVVNNVTTSGTYTTIVSGTVTTSSFVSNPQYVGSNNATIVALVPLFYILVLIIV
PAVVAYKLYKEE
>Q2PDK5 ~~~~~~Matrix protein VP40~~~
MRRVILPTAPPEYMEAIYPVRSNSTIARGGNSNTGFLTPESVNGDTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVN
VISGPKVLMKQIPIWLPLGVADQKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQE
FVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILLPNKSGKKGNSADLTFPEKIQAI
MTSLQDLKIVPIDPTKNIMGIEVPETLVHKLTGKKVTSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASL
PAVIEK
>Q5XX06 ~~~~~~Matrix protein VP40~~~
MRRVTVPTAPPAYADIGYPMSMLPIKSSRAVSGIQQKQEVLPGMDTPSNSMRPVADDNIDHTSHTPNGVASAFILEATVN
VISGPKVLMKQIPIWLPLGIADQKTYSFDSTTAAIMLASYTITHFGKANNPLVRVNRLGQGIPDHPLRLLRMGNQAFLQE
FVLPPVQLPQYFTFDLTALKLVTQPLPAATWTDETPSNLSGALRPGLSFHPKLRPVLLPGKTGKKGHVSDLTAPDKIQTI
VNLMQDFKIVPIDPAKSIIGIEVPELLVHKLTGKKMSQKNGQPIIPVLLPKYIGLDPISPGDLTMVITPDYDDCHSPASC
SYLSEK
>Q77DJ6 ~~~~~~Matrix protein VP40~~~
MRRVILPTAPPEYMEAIYPVRSNSTIARGGNSNTGFLTPESVNGDTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVN
VISGPKVLMKQIPIWLPLGVADQKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQE
FVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILLPNKSGKKGNSADLTSPEKIQAI
MTSLQDFKIVPIDPTKNIMGIEVPETLVHKLTGKKVTSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASL
PAVIEK
>Q05128 ~~~~~~Matrix protein VP40~~~
MRRVILPTAPPEYMEAIYPVRSNSTIARGGNSNTGFLTPESVNGDTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVN
VISGPKVLMKQIPIWLPLGVADQKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQE
FVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILLPNKSGKKGNSADLTSPEKIQAI
MTSLQDFKIVPIDPTKNIMGIEVPETLVHKLTGKKVTSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASL
PAVIEK
>P35260 ~~~~~~Matrix protein VP40~~~
MASSSNYNTYMQYLNPPPYADHGANQLIPADQLSNQQGITPNYVGDLNLDDQFKGNVCHAFTLEAIIDISAYNERTVKGV
PAWLPLGIMSNFEYPLAHTVAALLTGSYTITQFTHNGQKFVRVNRLGTGIPAHPLRMLREGNQAFIQNMVIPRNFSTNQF
TYNLTNLVLSVQKLPDDAWRPSKDKLIGNTMHPAVSIHPNLPPIVLPTVKKQAYRQHKNPNNGPLLAISGILHQLRVEKV
PEKTSLFRISLPADMFSVKEGMMKKRGENSPVVYFQAPENFPLNGFNNRQVVLAYANPTLSAV
>P33246 ~~~~~~Structural glycoprotein p40~~~
MTDERGNFYYNTPPPPLRYPSNPATAIFTNAQTYNNAPGYVPPATRDNKMDTSRSNSTNSVAIAPYNKSKEPTLDAGESI
WYNKCVDFVQKIIRYYRCNDMSELSPLMIHFINTIRDMCIDTNPINVNVVKRFESEETMIRHLIRLQKELGQGNAAESLP
SDSNIFQASFVLNSLPAYAQKFYNGGADMLGKDALAEAAKQLSLAVQYMVAESVTCNIPIPLPFNQQLANNYMTLLLKHA
TLPPNIQSAVESRRFPHINMINDLINAVIDDLFAGGGDYYYYVLNEKNRARIMSLKENVAFLAPLSASANIFNYMAELAT
RAGKQPSMFQNATFLTSAPTRSIRLPLI
>P36345 ~~~~~~Structural glycoprotein p40~~~
MSLPHAVTTALQHQQHQKQLQESSSDAWTNKCVDYVERIIRFYRTNDMSHLTPQMIMLINTIRDLCVESHPISVNVVKRF
DSDENLIKHYSRLRKELGGSEVAENIFQPSFVYNVLPSYAQKFYNKGAENVSGDSVSEAAHELGEALQYQIAEAVASNTP
IPLPVRHQLVNTYITLLLQRANIPPNVQDAVSSRKYPTLNIINDLINNVIDDVFTGVYGNYYYYVLNEKNRARIVTLKEN
IGFLAPLSASTDIFQYIANLATRAGKRPSLFQGATFLNAPSSNGSNVEQNRTSCQQSLTELAFQNEALRRYIFQKLSYKQ
NY
>O10333 ~~~~~~Structural glycoprotein gp41~~~
MNERDGFYLTPSQSGHPFAPTSATLTSSQSGYPTAVSTTLSRADSRSNTALAKAGDVGEAIWYNKCTDYVHKIIRYYRCN
DMSELTPLMIQFINTIRDMCIDSNPVSANIIKRAQSDDDIVRHLIGLQKELRQNSVAEAIGSDFNIFQPSFVLNSLPAYA
QKFYNGGANTLGKDALNEAAKQLSLAVQYMVSEAVTCSIPIPLPFDQQLANNYVTLLLKRATLPDNMQEAVKSRSFVHIN
MINDLINAVIEDLFAGGGTYYYYVLNEKNRARVVGLKENVGFLAPISASADIFNYIAQLATQQGKRPDMFENAAFLTAAA
HAINSPAAHLTQSACQKSLSQLAAQCETLTRFVFMIMNHNTVPTTRG
>P34051 ~~~~~~DNA-directed RNA polymerase subunit p47~~~
MFVTRLEHTTQFLPQACKLEDEVLVLYAIYLNGFDYTLPRFVKREVQVNADGFVRFDYNIKMFDFARFTDMTQTTPEDID
DYINLTRINSLSDHDLKMLQLVCRDRWYKGDVARLRRILQQKDVDDLVKFVCNVMWERAYEDHYTLGQQLSIRITTKLIQ
SGLDFKHQPDTTAPVSVRGWEDATFEKYLQSITSISEVIKRHVFSKKYICLEVAASYWSDAVESLQQENFRIILNSKTPY
VLLIEMDDDKNSMVYLRKLAHLLENKIVNLLFVTDVEFYFKNGNFMFYLYNSLKFYYYCLKNKFAFENVDKEIFFLLYTI
VALEWFNGGHLNSFTLEKSALYNPLELSTRRLNSIKRAAQHNRVINCDSEINMDYIRGKRVRTGAHYGKRVVNFDLNQSL
H
>Q9E7N7 ~~~P~~~Phosphoprotein~~~
MDSESLDFSSADTVILRSPNAGTNPDGHPDTVECPDFDTDIPKTSDDSSKMDNKGSSSSSKAVKDLLELAAKSQGIVVTD
VMQNTAIALHHNLGLDASSLDWFVAGITFANNSMIMEKMVSAIKELQIEVRNIQVASSGIKGTSEELVSKMKANKNDIVK
ELVKTRDSVLSAMGGILSAPEIEQQPVKTVTIGASQGRRKSTVVPPIEINPELESPVLSKTVSTATPEERIRHEKEKLLA
DLDWEIGEIAQYTPLIVDFLVPDDILAMAADGLTPELKEKIQNEIIENHIALMALEEYSS
>B2BNE4 ~~~S6~~~Outer capsid protein VP4~~~
MGNVQTSTNVYNIDGNGNTFAPSSQMASTASPAIDLKPGVLNPTGKLWQTMGTGAPSADSLVLVVDNKGEYTYLSENMRE
TLNKAVTDVNMWQPLFQATKSGCGPVVLANFTTISTGYVGATADDAFSNGLVSNGPFLATMHIMELQKTIAARMRDVAIW
QKHLDTAMTLMTPDVSAGDVTCKWRSLLEFAQDILPLDNLCRSYPNEFYTVAAQRYPAIRPGQPDTQVALPQPHPLGEVA
GSFNAPTSEVGSLVGAGAALSDAISTLASKDLDLVEADTPLPVSVFTPSLAPRTYRPAFIDPQDAAWIAQWNGDANIRII
TTYQSTDYTVQLGPGPTRVIDMNAMIDAKLTLDVSGTILPFQENNDLSSAIPAFVLIQTKVPLHSVTQASDVEGITVVSA
AESSAINLSVNVRGDPRFDMLHLHAMFERETIAGIPYIYGIGTFLIPSITSSSSFCNPTLMDGELTVTPLLLRETTYKGA
VVDTVTPSEVMANQTSEEVASALANDAVLLVSGQLERLATVVGDVIPIASGEDDAATSAIVGRLAIEATMRARHGGDTRA
LPNFGQLWKRAKRAASMFASNPALALQVGVPVLADSGILSALTSGVSTAIRTGSLGKGVSDASSKLNARQSLTLARKTFF
KKVEELWPSQ
>P07132 ~~~Segment-4~~~Core protein VP4~~~
MPEPHAVLYVTNELSHIVKDGFLPIWKLTGDESLNDLWLENGKYATDVYAYGDVSKWTIRQLRGHGFIFISTHKNVQLAD
IIKTVDVRIPREVARSHDMKAFENEIGRRRIRMRKGFGDALRNYAFKMAIEFHGSEAETLNDANPRLHKIYGMPEIPPLY
MEYAEIGTRFDDEPTDEKLVSMLDYIVYSAEEVHYIGCGDLRTLMQFKKRSPGRFRRVLWHVYDPIAPECSDPNVIVHNI
MVDSKKDILKHMNFLKRVERLFIWDVSSDRSQMNDHEWETTRFAEDRLGEEIAYEMGGAFSSALIKHRIPNSKDEYHCIS
TYLFPQPGADADMYELRNFMRLRGYSHVDRHMHPDASVTKVVSRDVRKMVELYHGRDRGRFLKKRLFEHLHIVRKNGLLH
ESDEPRADLFYLTNRCNMGLEPSIYEVMKKSVIATAWVGRAPLYDYDDFALPRSTVMLNGSYRDIRILDGNGAILFLMWR
YPDIVKKDLTYDPAWAMNFAVSLKEPIPDPPVPDISLCRFIGLRVESSVLRVRNPTLHETADELKRMGLDLSGHLYVTLM
SGAYVTDLFWWFKMILDWSAQNREQKLRDLKRSAAEVIEWKEQMAERPWHVRNDLIAALREYKRKMGMREGASIDSWLEL
LRHL
>P33428 ~~~Segment-4~~~Core protein VP4~~~
MPEPHAVLYVTNELSHIVKNGFLPIWKLTGDESLNDLWLENGKYATDVYAYGDVSKWTIRQLRGHGFIFISTHKNVQLAD
IIKTVDVRIPREVARSHDMKAFENEIGRRRIRMRKGFGDALRNYAFKMAIEFHGSEAETLNDANPRLHKIYGMPEMPPLY
MEYAEIGTRFDDEPTDEKLVSMLDYIVYSAEEVHYVGCGDLRTLMQFKKRSPGRFRRVLWHVYDPIAPECSDPNVIVHNI
MVDSKKNILKHMNFLKRVERLFIWDVSSDRSQMNDHEWETTRFAEDRLGEEIAYEMGGAFSSALIKHRIPNSKDEYHCIS
TYLFPQPGADADMYELRNFMRLRGYSHVDRHMHPDASVTKVVSRDVRKMVELYHGRDRGRFLKKRLFEHLHIVRKNGLLH
ESDEPRADLFYLTNRCNMGLEPSIYEVMKKSVIATAWVGRAPLYDYDDFALPRSTVMLNGSYRDIRILDGNGAILFLMWR
YPDIVKKDLTYDPAWAMNFAVSLKEPIPDPPVPDISLCRFIGLRVESSVLRVRNPTLHETADELKRMGLDLSGHLYVTLM
SGAYVTDLFWWFKMILDWSAQNKEQKLRDLKRSAAEVIEWKEQMAERPWHVRNDLIRALREYKRKMGMREGASIDSWLEL
LRHL
>P33429 ~~~Segment-4~~~Core protein VP4~~~
MPEPHAVLYVTNELSHIVKNGFLPIWKLTGDESLNDLWLENGKYATDVYAYGDVSKWTIRQLRGHGFIFISTHKNVQLAD
IIKTVDVRIPREVARSHDMKAFENEIGRRRIRMRKGFGDALRNYAFKMAIEFHGSEAETLNDANPRLHKVYGMPEIPPLY
MEYAEIGARFDDEPTDEKLVSMLDYIVYSAEEVHYVGCGDLRTLMQFKKRSPGRFRRVLWHVYDPIAPECSDPNVIVHNI
MVDSKKDILKYMNFLKRVERLFIWDVSSDRSQMNDHEWETTRFAEDRLGEEIAYEMGGAFSSALIKHRIPNSKDEYHCIS
TYLFPQPGADADMYELRNFMRLRGYSHVDRHMHPDASVTKVVSRDVRKMVELYHGRDRGTFLKKRLFEHLHIVRKNGLLH
ESDEPRADLFYLTNRCNMGLEPSIYEVMKKSVIATAWVGRAPLYDYDDFALPRSTVMLNGSYRDISILDGNGAILYLMWR
YPDIVKKDLTYDPAWAMNFAVSLKEPIPDPPVPDISLCRFIGLRVESSVLRVRNPTLHETADELKRMGLDLSGHLYVTLM
SGAYVTDLFWWFNIILDWSAQNKEQKLRDLKRSAAEVIEWKEQMAERPWHVRNDLIRALREYKRKMGMREGASIDSWLEL
LRHL
>P33427 ~~~Segment-4~~~Core protein VP4~~~
MPEPHAVLYVTNELSHIVKSGFLPIWRLTGVESLNVLWLENGKYATDVYAYGDVSKWTIRQLRGHGFIFISTHKNIQLAD
IIKTVDVRIPREVAKSQDMKAFENEIGRRRIRMRKGFGDALRNKLFKMAIEFHGSEAETLNDANPRLHKIYGMPEMPPLY
IEYAEIGTRFDDEPTDEKLVSMLDYIVYSAEEVHYVGCGDLRTIMQFKKRSPGRFRRVLWHVYHPIAPESSDPNVIVHNV
MVDSKKDILKHMNFLKRVERLFIWDVSSDRSQMDDDEWESTRFAEDRLGEEIAYEMGGAFSSALIKHRIPNSRDEYHCIS
TYLLPQPGADADMYELRNFMRLKGYSHVDRHMHPDASVMKVVSRDVRKMVELYHGRDRGRFVKNRLFEHLHIVRKNGLLH
ESDEPRADLFYLTNRCNMGLEPSIYEVMKKSVIATAWVGRAPLYDYDDFALPRSTVMLNGSYHDIRILDGNGAILFLMWK
YPDIVKKDLTYDHAWAMNFAVSLKEPIPDPPVPDISLCRFIGLRVESSVLRVRNPTLHETADELKRMGLDLSGHLYVTLM
SGAYVTDLFWWFKMILDWSAQSKEQKLRDLKRSAAEVIEWKEQMAERPWHVRNSLIAALREYKRKMGIREGASIDSWLEL
LRHL
>Q9E780 ~~~~~~Outer capsid protein VP4~~~
MGSFIYKQLLTNSYTVELSDEIDAIGSEKTQNVTINPGPFAQTGYAPVEWGAGETNDSTTIEPVLDGPYQPTRFNPEIGY
WILLAPETQGIVLETTNTTNKWFATILIEQDVVAESRTYTIFGKTESIQAENTSQTEWKFIDIIKTTQDGTYSQYGPLVL
STKLYGVMKYGGRLYAYIGHTPNATPGHYTIANYDTMEMSIFCEFYIMPRSQEAQCTEYINSGLPPIQNTRNIVPLSLSS
RSIKYQKAQVNEDIIISKTSLWKEMQYNIDIIIRFKFNNSIIKSGGLGYKWLEIAFKPANYQYNYIRDGENITAHTTCSV
NGVNEFSYNGGSLPTDFAISRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNTVICNGGDYSFQVPVGQWPVMSGG
AVSLQSAGVTLSTQFTDFVSLNSLRFRFSLAVESPPFSITRTRVSNLYGLPAANPNGGRDFYEILGRFSLISLVPSNDDY
QTPIMNSVTVRQDLDRQLGELRDEFNALSQQIAMSQLIDLALLPLDMFSMFSGIKGSIDVARSMATKVMKKFRNSKLASS
VSTLTDSLSDAASSLSRTSTIRSIGSSASAWTNISSQVDDVISSTSEISTQTSTISRRLRVKEIATQTEGMNFDDISAAV
LKAKIDRSTQIDSNTLPDIVTEASEKFIPNRAYRVMDGDEVLEASTDGKFFAYKVETFDEVPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGITKAQAFNLLRSDPRVLREFINQENPIIRNRIEQLILQCKL
>Q8JNB1 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTTDLSDEIEEIGSSKSQDVTINPGPFAQTGYAPVDWGLGETNDSTTVAPVLDGPYQPITFTPPIEY
WALFAPNDKGVVAELTNNADMWLVIILVEPNVPQELRLYTLFGQQVNLTIENTSQTKWKFIDFRKRSQNDTYILENALLS
ETKLQAAMKYGGKLFTFTGDTPNAAPQEWGYTTNNYSAISIKSLCDFYIVPRLPRETCRNYINQGLPPMQNTRNVVSVAL
SARDVISQRVSINEDIVVSKASLWKEMQYNRDITIRFKFANQIIKSGGLGYKWSEISFKPANYQYTYTRNGEEITAHTTC
SVNGVNNFSYNGGSLPTDFVISRYEVIKENSYVYIDYWDDSQAFRNMVYVRSLAADLNSVTCSGGSYSFALPLGNFPVMS
GGAVSLHPSGVTLSTQFTDFVSLNSLRFRFRLAVEEPPFSITRTRVSRLYGLPAVNPNNAKDFYEIAGRFSLISLIPPND
DYQTPIMNSVTVRQDLERQLGELRNEFNALSQQIAMSQLIDLALLPLDMFSMFSGIKGTIDIAKSMATNVMKKFRKSNLA
NSVSALTESLSDAASSISRRSTIRTIGSSASAWTEVSTAVIDTTTATNSISTQTATITKRLRLKEMAIQTDGMNFDDISA
AVLKTKIDKSTQIAPNTLPDIVTEASEKFIPNRAYRVMDNDEVFEAGTDGKFFAYRVETFEEIPFDVQKFADLITDSPVI
SAIIDFKTLKNLNDNYGITKQQAYNLLRSDPRVLREFINQENPIIRNRIENLIMQCRL
>A9Q1L0 ~~~~~~Outer capsid protein VP4~~~
MSLRSLLITTEAVGETTQTSDHQTSFSTRTYNEINDRPSLRVEKDGEKAYCFKNLDPVRYDTRMGEYPFDYGGQSTENNQ
LQFDLFTKDLMADTDIGLSDDVRDDLKRQIKEYYQQGYRAIFLIRPQNQEQQYIASYSSTNLNFTSQLSVGVNLSVLNKI
QENKLHIYSTQPHIPSVGCEMITKIFRTDVDNENSLINYSVPVTVTISVTKATFEDTFVWNQNNDYPNMNYKDLIPAVTK
NSIYHDVKRITKIHEYINSKKKKNGVGKIGGIQIAESKDGFWKILTKNYQIKLKFGIEGYGVMGGTFGNWLIDSGFKTVE
TNYEYQRNGKTINATTVASVKPSRKCGTRSPVFGQLQFSGEMMVLSHNDILTVFYTEREWALSNAIYAKNFATDFKRQFE
VTAQSDELLVRTNVVPHTIKNTPGKALMEYSHGGFGQIDTSDYTGMALTFRFRCVSEDLPEGYYDKDKALTFANVGLTSF
QDRQETNGTYWVYNTSTVGFGSCYPKKEFEYDINVTYTTLLPSDPEFTTGGTNYAQSVTAVLEESFINLQNQVNEMLTRM
NISDLTSGVMSVFSVATSFPQILDGISDLLKAASSAFKKVKGKVGNVAKRLRGKRYVRLFDEDISIEETPRFLDSIRSSR
RPSILSNMFNDDETFTALHTLASRTNSVASDVTYIQPIITTRIANSTPPVIAPASSVTYAKLKDISKIINAEIDPKSIME
FNQVSNTISILDSTKKLAQYAVDPDVIDGILNKMVGGHARSLFSLKVRKHLLDAVEKDAFVKYNYHDLMGKLLNDRELLD
ITNNLSSQKQFELAKEFRDLLINALA
>Q08010 ~~~~~~Outer capsid protein VP4~~~
MASLVYRQLLANSYTSDLQDTIDDISAQKTENVTVNPGPFAQTGYALVEWTHGDITTDETVQQTLDGPYAPSSVIIQPQY
WVLMNPETADVIAEADATNKKYACVMLAPNTEEGDKQYTILGRQITINLGNTDQNRYKFFDLASENGETYSKIQELLTPN
RLNAFMKDQGRLYVYHGTVPNISTGYYTLDDIANVQTNIKCNYYIVPKSQTQQLEDFLKNGLPPIQESRYIMPVERSVQN
IYRAKPNEDIVISKTSLWKEMQYNRDIVIRFKFGNTIIKSGGLGYKWSEISYKPMNYEYTYERDGETVVAHTTCSVAGVN
DFGYNSGSLPTDFVVSKYEVLKGNSYVYIDYWDDSQAFKNMVYVRSLSAEFNAINCTGGTYDFQLPVGQWPQMRGGNVTL
NSDAVTLSTQYTDFVSLNSLRFRFKPAIGEPFFEITRTRETRLYGLPASNPMGGNEYYETAGRFSLISLVPSNDDYQTPI
QNSTTVRQDLEQQISDLREEFNQLSSEIAMSQLIDLALLPLDMFSMFSGIKSTIDAVKSVTTSVMKKMKTSTLAKSVSTI
TEELSDAATSVSRASSIRSNASVWNNLVDTRTQTSVATNDIATQTSRIASKLRVKEFATQTEGGLSFNDISAAVLKTKID
KIETVQPKILPTIITESVDKFIPTRQYRIIDKDIAYEISNSGKYFAYRVDTFEEVIFDVEKFADLVTDSPVISAIIDFKT
LKNLNDNFGITKEQAYNLLRSDPRVLKDFINQNNPIIRNRIEQLILQCRI
>P35746 ~~~~~~Outer capsid protein VP4~~~
MRSLIYRQLLYNSYSVDLSDEITNIGAEKKENVTVQLGEFAQSQYAPVSWGSGETLSGNVEEQTLDGPYAPDSSNLPSNC
WYLVNPSNDGVVFSVTDNSTFWMFTYLVLPNTAQTNVTVNVMNETVNISIDNSGSTYRFVDYIKTSSTQAYGSRNYLNTA
HRLQAYRRDGDGNISNYWGADTQGDLRVGTYSNPVPNAVINLNADFYVIPDSQQETCTEYIRGGLPAMQTTTYVTPISYA
IRSQRIARPNEDIIISKASLWKEVQYNRDIVIRFVFANNIIKAGGLGYKWSEISYKANNYQYTYMRDGVEVVAHTTVSVN
GVSVYNYNTGPLPTDFMIRNYDVLKESSFVYVDYWDDSQAFRNMVYVRSLNAELNQVRCEGGHYSFALPVGSWPVMQGGS
VILTFDGVTLSTQFTDYVSLNSLRFRFRCAVSEPSFRVTGTRISNLYGLPAANPMGDQQYYEAAGRFSLILLVPSNDDYQ
TPIANSVTVRQDLERQLDEMRREFNELSANIALSQLIDLALLPLDMFSMFSGIQSTVEAAKTFATSVMKKFRKSDLAKSV
NSLTDAITDAAGSISRSSTLRSVNSAASVWTDISDIVDSTDNVVAATATAAAKKFRVKEFTTEFNGVSFDDISAAVVKTK
MSKLNVVDEEILPQIITEASEKFIPNRAYRLIDGEKVYEVTTEGKYFAYLTETFEEVVFDAERFAELVTDSPVISAIIDF
KTIKNLNDNYGITREQALNMLRSDPKVLRSFINQNNPIIKNRIEQLILQCRI
>P08713 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVELSDEIQEIGSTKTQNVTVNPGPFAQTNYASVNWGPGETNDSTTVEPVLDGPYQPTTFNPPVSY
WMLLAPTNAGVVDQGTNNTNRWLATILIKPNVQQVERTYTLFGQQVQVTVSNDSQTKWKFVDLSKQTQDGNYSQHGPLLS
TPKLYGVMKHGGKIYTYNGETPNATTGYYSTTNFDTVNMTAYCDFYIIPLAQEAKCTEYINNGLPPIQNTRNIVPVSIVS
RNIVYTRAQPNQDIVVSKTSLWKEMQYNRDIVIRFKFANSIIKSGGLGYKWSEVSFKPANYQYTYTRDGEEVTAHTTCSV
NGINDFNYNGGSLPTDFVISKYEVIKENSFVYIDYWDDSQAFRNMVYVRSLAADLNSVMCTGGDYSFAIPVGNYPVMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLSVEEPPFSILRTRVSGLYGLPAAKPNNSQEYYEIAGRFSLISLVPSNDDY
QTPIINSVTVRQDLERQLGELRDEFNNLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKKSSLANS
VSTLTDSLSDAASSISRSASVRSVSSTASAWTEVSNITSDINVTTSSISTQTSTISRRLRLKEMATQTDGMNFDDISAAV
LKTKIDKSTQLNTNTLPEIVTEASEKFIPNRAYRVIKDDEVLEASTDGKYFAYKVETILKRFHSMYKFADLVTDSPVISA
IIDFKTLKNLNDNYGISRQQALNLLRSDPRVLREFINQDNPIIRNRIESLIMQCRL
>Q96642 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLANSYAVDLSDEIQSVGSEKNQRVTVNPGPFAQTGYAPVNWGPGEVNDSTVVQPVLDGPYQPASFDLPVGN
WMLLAPTGPGVVVEGTDNSGRWLSVILIEPGVTSETRTYTMFGSSKQVLVSNASDTKWKFVEMMKTAVDGDYAEWGTLLS
DTKLYGMMKYGERLFIYEGETPNATTKGYIVTNYASVEVRPYSDFYIISRSQESACTEYINNGLPPIQNTRNVVPVAISS
RSIEPRRVQANEDIVVSKTSLWKEMQYNRDIIIRFRFDNSIIKSGGLAYKWAEISFKAANYQYNYMKDGEEVTAHTTCSV
NGVNDFSFNGGSLPTDFAISRYEVIKENSYVYVDYWDDSQTFRNMVYVRSLAANLNDVMCSGGDYSFALPAGQWPVMKGG
AATLHTAGVTLSTQFTDYVSLNSLRFRFRLAVEEPSFTITRTRVSKLYGLPAANPNGGREYYEVAGRFSLISLVPSNDDY
QAPIMNSVTVRQDLERRLNELREEFNNLSQEIAVSQLIDLAMLPLDMFSMFSGIEGTVNAPQSMATNVMRKFKSSKLASS
VSMLTDSLSDAASSIARSTSIRSIGSAASAWANISEQTQDAVNEVATISSQVSQISGKLRLKEITTPTEGMNFDDISAAV
LKAKIDRSIQVDPNALPDVITEASEKFIRNRAYRVIDGDESFEAGTGGRFFANKVETLEEMPFNIEKFADLVTHSPVISA
IIDFKTLKNLNDNYGITREQAFNLLRSNPKVLRGFIDPNNPIIKNRIEQLIMQCRL
>Q98637 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSTKTQNITINPGPFAQTGYAPVNWGPGETNDSTTIEPVLDGPYQPTSFNPPVGY
WMLLSPTAPGVVVEGTNNTDRWLATILIEPNVTSQQRTYTIFGVQEQITVENTSQTQWRFVDVSKTTQNGSYSQYSPLLS
TPKLYAVMKYGGRIHTYSGQTPNATTGYYSATNYDSVNMTTFCDFYIIPRSEESKCTEYINNGLPPIQNTRNIIPLALSA
RNVRSLKAQSNEDIVVSKTSLWKEMQYNRDITIRFKFANSIVKSGGLGYKWSEISFKPANYQYTYMRDGEEVTAHTTCSV
NGMNDFSFNGGSLPTDLLISRYEVIKENSYVYIDYWDDSQAFRNMVYVRSLAANLNSVICAGGHYNFALPVGQWPYMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPSFAIMRTRVSGLYGLPAANPNNGREYYEIAGRFSLISLIPSNDNY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSIATNVMKKFKKSSLASS
VSTLTDSLSDAASSLSRGSSIRSVGSSVSAWTDVSTQITDVSSSVSSISTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSIQISPNTLPDIVTEASEKFIPNRAYRVINNDEVFEASTDGRFFAYRVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGIGKQQAFNLLRSDPKVLREFINQNNPIIRNRIEQLIMQCRL
>Q06894 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSTKTQNTTINPGPFAQTGYAPVNWGPGETNDSTTIEPVLDGPYQPTSFNPPVGY
WMLLSPTAAGVIVEGTNNTDRWLATILIEPNVTSQQRTYTIFGVQEQITVENTSQTQWRFVDVSKTTQNGNYSQHGPLLS
TPKLYAVMKYGGRIHTYSGQTPNATTGYYSATNYDSVNMTTFCDFYIIPRSEESKCTEYINNRLPPIQNTRNIVPLALSA
RNVISLKAQSNEDIVVSKTSLWKEMQYNRDITIRFKFANSIVKSGGLGYKWSEISFKPANYQYTYMRDGEEVTAHTTCSV
NGMNDFSFNGGSLPTDFVISRYEVIKENSYVYIDYWDDSQAFRNMVYVRSLAANLNSVICTGGDYNFALPVGQWPYMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPSFAIMRTRVSGLYGLPAANPNNGREYYEIAGRFSLISLVPSNDNY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSIATNVMKKFKKSSLASS
VSTLTDSLSDAASSVSRGSSIRSVGSSVSAWTDVSTQITDVSSSVSSISTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSIQISPNTLPDIVTEASEKFIPNRAYRVINNDEVLEAGTDGKFFAYRVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGIGKQQAFNLLRSDPRVLREFINQNNPIIRNRIEQLIMQCRL
>Q98635 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLANSYTVDLSDEIENIGYAKSKNVTINPGPFAQTGYAPVNWGPGEVNDSTTVEPVLDGPYQPTNFNPPVNY
WMLLSPLNAGVVVEGTNSIDRWLATVLVEPNVTTTVRTYTLFGVQEQISVENNSTTKWKFINLIKTTPPGNFTLYSTLLS
EPKLHGIMKHGGQLWVYNGETQTLLLQDYVTSNYDSLTMTSFCDFYIIPRNQESTCTEYINNGLPPIQNTRNVVSVSISS
RNIIHNRAQVNEDIVISKTSLWKGVQYNRDIINRFRFANAIIKSGGLGYKWSEISFKPANYQYTYTRDGEEITAHTTCSV
NGVNDFSFNGGSLPTDFVISRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVLCTGGDYSFALPVGQWPVMTGG
AVMLHAAGVTLSTQFTDFVSLNSLRFRFSLSVEEPYFSITRTRVTRLYGLPAVNPNNDRDYYEIAGRFSLISLVPSNDDY
QTPIMNSVTVRQDLERQLGELREEFNALSQEIAISQLIDLALLPLDMFSMFSGIQSSIDAAKSMATNVMKKFKKSKLASS
VSTLTNSLSDAASSVSRSSSIRSVSSSVSAWTDVSNQFTDISNSVNSISTQTSTISRRLRLKEIATQTEGINFDDISAAV
LKTKIDKSTQIAANNIPDVITEASEKFIPNRAYRVISNDNVFEASTDGRFFAYKVGTFEEIPFDVQKLADLVTDSPVISA
IIDFKTLKNLNDNYGITREQAFNLLRSDPRVLREFINQDNPIIKNRIEQLILQCRL
>Q02945 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLANSYTVDLSDEIENIGYAKSKNVTINPGPFAQTGYTPVNWGPGEVNDSTTVEPILDGPYQPTNFNPPVNY
WMLLSPLNAGVVVEGTNSIDRWLATVLVEPNVLTTVRTYTLFGVQEQISVENNSTTKWKFINLIKTTLSGNFTLYSTLLS
EPKLHGIMKHGGQLWVYNGETQTLLLQDYVTSNYDSLTMTSFCDFYIIPRSQESTCTEYINNGLPPIQNTRNVVSVSISS
RNIILNRAQVNKDIVISKTSLWKEVQYNRDITIRFRFANAIIKSGGLGYKWSEISFKPANYQYSYTRDGEEITAHTTCSV
NGVNDFSFNGGSNPTDFLISRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVLCTGGDYTFALPVGQWPVMTGG
AVMLHAAGVTLSTQFTDFVSLNSLRFRFSLSVEEPYFSITRTRVTRLYGLPAVNPNNNRDYYEIAGRFSLISLVPSNDDY
QTPIMNSVTVRQDLERQLGELREEFNTLSQEIAVSQLIDLALLPLDMFSMVSGIKSSIDAAKSMASNVMKKFKKSKLASS
ISTLTNSLSDASSSVSRNSSIRSVSSSVSAWTDVSNQLTDISNSVNSISTQTSTISRRLRLKEIATQTEGMNFDDISAAV
LKTKIDKSTQIAANNIPDIITEASEKFIPNRAYRVISNDNVFEASTDGRFFAYKVGTFEGIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGITREQAFNLLRSDPRVLREFINQDNPIIKNRIEQLILQCRL
>Q98636 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLGNSYAVDLSDETQEIGASRNQNVTVNPGPFAQTNYAPVSWGPGEVRDSTTVEPLLDGPYQPTTFNPPVDY
WMLLAPTDRGVVVEGTNNTNRWLAIILVEPDVPTEERTYTLFGQQAQITVANDSQLKWKFIVVSKQTLDGAYAQYGPLLS
ATKLYAVMKHSGRIYTYSGETPNATTAYYSTTNYDTVNMKAYCHFYIIPRTQESKCTEYINTGLPPIQNTRNVIPVSITS
RDIQYTRAQVNEDILISKASLWKEMQYNRDIIIRFQIANSIVKSGGLGYKWSEISFKPANYQYSYIRDDEEVTSATTCSV
NGVNEFSYSGGSLPTDFAVSKYEVIKENSFVYVDYWDDSQAFRNMVYVRSLAANLNSVMCTGGDFSFALPVGHYPVMTGG
AVTLHSAGVTLSTQFTDFVSLNSLRFRFSLSVEEPPFSIIRTRVSGLYGLPATKPNNSQEYYEIAGRFSLISLVPSNDDY
QTPIMNSVTVRQDLERQLSELRDEFNSLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMVTNVMKKFKKSSLANS
VSTLTNSLSDAASSVSRSSSIRSIGSTASAWTDVSITASDVSTATNSIATQTSTISKRLRLKEMATQTDGMNFDDISAEM
LKTKIDKSTQITADTLPEMITEASEKFIPNRTYRIINNDEVFETSIDGKYFAYRVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKKLNDNYGITKEQAFNLLRSDPKVLREFINQNNPIIKNRIENLIMQCRL
>P39034 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLSNSYVTNISDEVNEIGTKKTTNVTVNPGPFAQTGYAPVDWGHGELPDSTLVQPTLDGPYQPTSLNLPVDY
WMLIAPTREGRVAEGTNTTDRWFACVLVEPNVQNTQRQYVLDGQNVQLQVSNDSSTSWKFILFIKLTPDGTYTQYSTLST
PHKLCSWMKRDNRVYWYQGSSPNASESYYLTINNDNSNVSSDAEFYLIPQSQTAMCTQYINNGLPPIQNTRNIVPVNIAS
RQIKDIRAQMNEDIVISKTSLWKEMQYNRDIIIRFKFANSIIKSGGLGYKWSEISFKPMNYQYTYTRDGEEVTAHTTCSV
NGVNDFNYNGGTLPTDFAISRFEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVVCSGGSYSFALPVGNHPVMSGG
AVTLTSAGVTLSTQYTDYVSLNSLRFRFRLAVSEPSFSISRTRMSGIYGLPAVNPNNNAEYYEIAGRFSLISLVPTNDDY
QTPIANSVTVRQDLERQLGELREEFNSLSQEIAVSQLIDLATLPLDMFSMFSGIKSTVEAVKSMTTNVMKRFKTSSLANA
ISDLTSNMSEAASSVRLTSVRSIGTVTLPRARVSLQVSDDLRSMQDVSTQVSNVSRNLRLKEFTTQTDTLSFDDISAAVL
KTKLDKSTQISQQTMPDIIAESSEKFIPKRSYRIVDEDTAFETGIDGTFYAYKVDTFNEIPFDMERFNKLITDSPVLSAI
IDFKTLKNLNDNYGITKKQAMELLHSNPKTLKEFINNNNPIIRNRIENLISQCRL
>Q07416 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSTKTQNTTINPGPFAQTGYAPVNWGPGETNDSTTIEPVLDGPYQPTSFNPPVGY
WMLLSPTTAGVIVEGTNNTDRWLATILIEPNVTSQQRTYTIFGVQEQITVENTSQTQWRFVDVSKTTQNGSYSQYGPLLS
TPKLYAVMKYGGRIHTYSGQTPNATTGYYSATNYDSVNMTTFCDFYIIPRSEESKCTEYINNGLPPIQNTRNIVPLALSA
RNVISLKAQSNEDIVVSKTSLWKEMQYNRDITIRFKFANSIVKSGGLGYKWSEISFKPANYQYTYMRDGEEVTAHTTCSV
NGMNDFSFNGGSLPTDFVISRYEVIKENSYVYIDYWDDSQAFRNMVYVRSLAANLNSVTCTGGDYNFALPVGQWPYMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPSFAIMRTRVSGLYGLPAANPNNGREYYEIAGRFSLISLVPSNDNY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSIATNVMKKFKRSSLASS
VSILTDSLSDAASSVSRGSSIRSVGSSVSAWTDVSTQITDVSSSVSSISTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSVQISPTTLPDIVTEASEKFIPNRAYRVINNDEVFEAGTDGRFFAYRVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGIGKQQAFNLLRSDPRVLREFINQNNPIIRNRIEQLIMQCRL
>Q04916 ~~~~~~Outer capsid protein VP4~~~
MLTYLRREWQSFGETVTIKNTFNAQEDNNQSGRKTDNRPVKTEGRYCYKADVNRSKYYHDVQGFSLGQSDLHIDPTQFIM
YSGTISNGISYVNQAPSCVQLSLKFTPGNSSLIEDLHIEPYKVEVLKIEHVGNVSRATLLSDIVSLSIAQKKLLLYGFTQ
LGIQGLTGDVVSVETKRIPTPTQTNLLTIEDSMQCFTWDMNCANVRSTKQDSRLIIYEQEDGFWKIVTETLSIKVKPYFK
AYGTMGGAFKNWLVDSGFEKYQHDLAYVRDGVTVNAHTITYVNPSGKAGLQQDWRPATDYNGQITVLQPGDGFSVWYYED
KWQINQAIYAKNFQSDTRAQGYLENVGTLKFKMNYIPAFAEIRNKPGKVNYAYLNGGFAQVDASGYTGMSIILNFVCTGE
RFYASDNNSRVDNKITPFISYIGDYYTLSGGDFYRQGCCAGFAAGYDDVSPEHGITVSYTVMKPSDPDFITGGENYGESI
TSDLEVSIRNLQDQINSIIAEMNIQQVTSAVFTAITNLGELPGLFSNITKVFSKTKEALSKLKSRKKTSPMPIAATSIID
KTTVDVPNLTIVNKMPEEYELGIIYNSMRTKKLIEQKKHDFSTFTVATEVKLPYISKATNFSDQFMTSISSRGITIGKSD
IIQYDPMNNILSAMNRKNAQIINYKIDPDLAHEVLSQMSTNATRSLFSLNVRKQLHINNSFDTPTYGQLVERILDDGQLL
DILGKLNPNSVEELFSEFLHRIQHQLREY
>P39033 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLSNSYVTNISDEVNEIGTKKTTNVTVNPGPFAQTGYAPVDWGHGELPDSTLVQPTLDGPYQPTSLNLPVDY
WMLIAPTREGKVAEGTNTTDRWFACVLVEPNVQNTQRQYVLDGQNVQLQVSNDSSTSWKFILFIKLTPDGTYTQYSTLST
PHKLCAWMKRDNRVYWYQGATPNASESYYLTINNDNSNVSSDAEFYLIPQSQTAMCTQYINNGLPPIQNTRNIVPVNITS
RQIKDIRAQINEDIVISKTSLWKEMQYNRDIIIRFKFANSIIKSGGLGYKWSEISFKPMNYQYTYTRDGEEVTAHTTCSV
NGVNDFNYNGGTLPTDFAISRFEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVVCSGGSYSFALPVGNHPVMSGG
AVTLTSAGVTLSTQYTDYVSLNSLRFRFRLAVSEPSFSISRTRMSGIYGLPAVNPNNNAEYYEIAGRFSLISLVLTNDDY
QTPIANSVTVRQDLERQLGELREEFNSLSQEIAVSQLIDLATLPLDMFSMFSGIKSTVEAVKSMTTNVMKRFKTSSLANA
ISDLTSNMSEAASSVRLTSVRSVGTVTLPRARVSLQVSDDLRSMQDVSTQVSNVSRNLRLKEFTTQTDTLSFDDISAAVL
KTKLDKSTQISQQTMPDIIAESSEKFIPKRSYRIVDEDTAFETGIDGTFYAYKVDTFNEIPFDMERFNKLVTDSPVLSAI
IDFKTLKNLNDNYGITKKQAMELLHSNPKTLKEFINNNNPIIRNRIENLISQCRL
>Q82040 ~~~~~~Outer capsid protein VP4~~~
MASSLYAQLISQNYYSLGNEILSDQQTNKVVSDYVDAGNYTYAQLPPTTWGSGSILKSAFSTPEITGPHTNTVIEWSNLI
NTNTWLLYQKPLNSVRLLKHGPDTYNSNLAAFELWYGKSGTTITSVYYNTINNQNKTHDANSDCLILFWNEGSTQLEKQV
VTFNWNVGGILIKPINSSRMRICMSGMENFNNDSFNWENWNHEFPRSNPGININMYTEYFLASSDPYTYLKNLQQPTAKT
VDMKMMKKMNDNSKLGDGPINVSNIISKDSLWQEVQYVRDITLQCKILSEIVKGGGWGYDYTSVTFKTVNHTYSYTRAGE
NVNAHVTISFNNVKERAYGGSLPTDFKIGRFDILDTDSYVYIDYWDDSEIFKNMVYVRDVRADIGGFQYSYSSEMSYYFQ
IPVGSYPGLHSSRLQLVYDRCLLSQQFTDYAALNSLRFVFRVVSTSGWFITTGDINTRRVASGTGFAYSDGHVANTVGTI
SFISLIPSNPNYQTPIASSSTVRMDLERKINDLRDDFNALASSVALSDILSLAMSPLTFSNLLESVPAITSSVKDVAASV
MKKFRSTKMFKKAAKQNYREFVIGDLLEDVTNVARNNNSLNYSDITSAMMVSTTNRLQITDVDTFSEIVSRSADNFISNR
SYRMIENNTVHEITPTRRFSYDIKTLQQRNFDIDKFSKLASQSPVISAIVDFATIKAIRDTYGISDDIIYKLVASDAPTI
LSFINQNNPLIRNRITNLINQCKL
>P11196 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYSVDLHDEIEQIGSEKTQSVTVNPGPFAQTRYAPVNWGSWEINDSTTVEPVLDGPYQPTTFKPPNDY
WLLISSNTNGVVYESTNNNDFWTAVSSVEPHVSQTNRQYILFGENKQFNVENNSDKWKFFETFTGSSQGNFSNRRTLTSS
NRLVGMLKYGGRVWTFHGETPRATTDSSNTADLNNISIIIHSEFYIIPRSQESKCNEYINNGLPPIQNTRNVVPLSLSSR
SIQYKRAQVNEDITISKTSLWKEMQYNRDIIIRFKFGNSIIKLGGLGYKWSEISYKAANYQYSYSRDGEQVTAHTTCSVN
GVNNFSYNGGSLPTDFSISRYEVIKENSYVYIDYWDDSKAFRNMVYVRSLAANLNSVKCTGGSYNFRLPVGKWPIMNGGA
VSLHFAGVTLSTQFTDFVSLNSLRFRFSLTVDEPSFSIIRTRTINLYGLPAANPNNGNEYYEMSGRFSLISLVQTNDDYQ
TPIMNSVTVRQDLERQLNDLREEFNSLSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDLTKSMATSVMKKFRKSKLATSI
SEMTNSLSDAASSASRSASIRSNLSTISNWTNTSKSVSNVTDSVNDVSTQTSTISKKLRLREMITQTEGLSFDDISAAVL
KTKIDMSTQIGKNTLPDIVTEASEKFIPKRSYRVLKDDEVMEINTEGKFFAYKVDTLNEIPFDINKFAELVTDSPVISAI
IDFKTLKNLNDNYGITRIEAFNLIKSNPNVLRNFINQNNPIIRNRIEQLILQCKL
>Q01641 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLSNSYVTNISDEVNEIGTKKTTNVTVNPGPFAQTGYAPVDWGHGELPDSTLVQPTLDGPYQPTSLNLPVDY
WMLIAPTREGKVAEGTNTTDRWFACVLVEPNVQNTQRQYVLDGQNVQLHVSNDSSTSWKFILFIKLTPYGTYTQYSTLST
PHKLCAWMKRDNRVYWYQGATPNASESYYLTINNDNSNVSSDAEFYLIPQSQTAMCTQYINNGLPPIQNTRNIVPVNITS
RQIKDVRAQMNEDIVISKTSLWKEMQYNRDIIIRFKFANSIIKSGGLGYKWSEISFKPMNYQYTYTRDEEEVTAHTTCSV
NGVNDFNYNGGTLPTDFAISRFEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVVCSGGSYSFALPVGNHPVMSGG
AVTLTSAGVTLSTQYTDYVSLNSLQFRFRLAVSEPSFSISRTRMSGIYGLPAVNPNNSAEYYEIAGRFSLISLVPTNDDY
QTPIANSVTVRQDLERQLGELREEFNSLSQEIAVSQLIDLATLPLDMFSMFSGIKSTVEAVKSMTTNVMKRFKTSSLANA
ISDLTSNMSEAASSVRLTSVRSVGTITLPRARVSLQVGDDLRSMQDVSTQVSNVSRNLRLKEFTTQTDTLSFDDISAAVL
KTKLDKSTQISQQTMPDIIAESSEKFIPKRSYRIVDEDIRFETGIDGTFYAYKVDTFNEIPFDMERFNKLITDSPVLSAI
IDFKTLKNLNDNYGITKKQAMELLHSNPKTLKEFINNNNPIIRNRIENLISQCRL
>P11197 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYSVELSDEINTIGSEKTQNVTINPGPFAQTNYAPVVLESWEVNDSTTIEPVLDGPYQPTSFKPPSDY
WILLNPTDQQVVLEGTNKTDIWIALLLVEPNVTNQSRQYTLFGETKQITVENNTNKWKFFEMFRKNVSAEFQHKRTLTSD
TKLAGFLKHYNSVWTFHGETPHATTDYSSTSNLSEVETVIHVEFYIIPRSQESKCVEYINTGLPPMQNTRNIVPVALSSR
SVTYQRAQVNEDIIISKTSLWKEMQCNRDIIIRFKFNNSIVKLGGLGYKWSEISFKAANYQYNYLRDGEQVTAHTTCSVN
GVNNFSYNGGSLPTDFSVSRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNSVKCSGGNYNFQLPVGAWPVMSGGA
VSLHFAGVTLSTQFTDFVSLNSLRFRFSLTVEEPPFSILRTRVSGLYGLPAFNPNSGHEYYEIAGRFSLISLVPSNDDYQ
TPIMNSVTVRQDLERQLGDLREEFNSLSQEIAMTQLIDLALLPLDMFSMFSGIKSTIDAAKSMATKVMKKFKRSGLATSI
SELTGSLSNAASSISRSSSIRSNISSISVWTDVSEQIAGSSDSVSNISTQMSAISRRLRLREITTQTEGMNFDDISAAVL
KTKIDRSTHISPDTLPDIMTESSKKFIPKRAYRVLKDDEVMEADVDGKFFAYKVDTFEEVPFDVDKFVDLVTDSPVISAI
IDFKTLKNLNDNYGITRSQALDLIRSDPRVLRDFINQNNPIIKNRIEQLILQCRL
>Q08778 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLSNSYVTNISDEVSEIGARKTANVTVNPGPFAQTGYAPVNWGHGELSDSTLVQPTLDGPYQPTTFNLPIDY
WMLIAPTQIGRVAEGTNTTNRWFACVLVELNVQNTQREYVLDGQTVQLQVSNDSSTLWKFILFIKLEKNGTYTQYSTLST
SNKLCAWMKREGRVYSYAGVTPNASESYYLTINNDDSNVSSDAEFYLIPQSQTELCTQYINNGLPPIQNTRNVVPVSLTS
REIRHSRAQMNEDIVVSKTSLWKEMQYNRDITIRFKFANSIVKSGGLGYKWSEISFKPMNYQYTYTRDGEEITAHTTCSV
NGVNDFTYNGGPLPTDFAISRFEVIKENSYVYIDYWDDSQAFRNMVYVRSLAADLNDVVCSGGDYSFALPVGAYPIMSGG
AVTLSPAGVTLSTQFTDYVSLNSLRFRFRLAVSEPSFSISRTRLSGIYGLPAANPNNSVEYYEIAGRFSLISLVPTNDDY
QTPIANSVTVRQDLERQLGELREEFNSLSQEIALSQLIDLATLPLDMFSMFSGIKSTVEAVKSMTTNIMKKFKTSNLANA
ISDLTNSMSDAASSISRSASVRSIGSNTTMRISTAIQTGEDLRTMTDASTQISNVSRSLRLREFTTQTDNLSFDDISAAV
LKTKLDKSTQISQTTIPDIISESSEKFIPMRTYRVMDNDTGFETGIDGTFYAYRIDTFDEIPFDVEKFNRLITDSPVLSA
IIDFKTLKNLNDNYGITKTQAMELLQSNPRTLKEFINSNNPIIRNRIENLIAQCRL
>Q09113 ~~~~~~Outer capsid protein VP4~~~
MRSLIYRQLLYNSYSVDLSDEITNIGAEKKENVTVQIGEFAQSQYAPVSWGSGETLSGNVEEQPLDGPYTPDKSNLPSNY
WYLINPSNDGVVFSVTDNSTLWMFTYLVLPNTAQTSVVVNVMNETVNISIDNSGSAYKFVDYFKTSSAQAYRSRNFLITA
HRLQAYKRDGDGNISNYWGSDAYGDLRVGTYFNPVPNAVINLNADFYVIPDSQQEMCTEYIRRGLPAIQTTTYVTPISYA
VRSQRIARPNEDITISKASLWKEVQYNRDIVIRFVFANNIIKAGGLGYKWSEISYKANNYQYTYMRDGIEVVAHTTVSVN
GVSVYDYNTGSLPTDFTIRNYDVLKESSFVYVDYWDDSQAFRNMVYVRSLNAELNQVQCVGGHYSFALPVGSWPVMQGGS
VVLTFDGVTLSTQFTDYVSLNSLRFRFRCAVSEPPFRVTGTRISNLYGLPAANPMGDQQYYEASGRFSLISLVPSNDDYQ
TPIANSVTVRQDLERQLDEMRREFNELSANIALSQLIDLALLPLDMFSMFSGIRSTIEAAKNFATSVMKKFRKSNLAKSV
NSLTDAITDAAGSISRSSTLRSANSAVSVWTDISDIVDSTDNVVTATATAAAKKFRVKEFTTEFNGVSFDDISAAVVKTK
MNKLNVVDEEMLPQIITEASEKFIPNRAYRLIDGDKVYEVTTEGKYFAYLTETFEEVMFDAERFAELVTYSPVISAIIDF
KTIKNLNDNYGITREQALNMLRSDPKVLRSFINQNNPIIKNRIEQLILQCRI
>P11193 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYSVDLHDEIEQIGSEKTQNVTINPSPFAQTRYAPVNWGHGEINDSTTVEPILDGPYQPTTFTPPNDY
WILINSNTNGVVYESTNNSDFWTAVVAIEPHVNPVDRQYTIFGESKQFNVSNDSNKWKFLEMFRSSSQNEFYNRRTLTSD
TRFVGILKYGGRVWTFHGETPRATTDSSSTANLNNISITIHSEFYIIPRSQESKCNEYINNGLPPIQNTRNVVPLPLSSR
SIQYKRAQVNEDIIVSKTSLWKEMQYNRDIIIRFKFGNSIVKMGGLGYKWSEISYKAANYQYNYLRDGEQVTAHTTCSVN
GVNNFSYNGGSLPTDFGISRYEVIKENSYVYVDYWDDSKAFRNMVYVRSLAANLNSVKCTGGSYNFSIPVGAWPVMNGGA
VSLHFAGVTLSTQFTDFVSLNSLRFRFSLTVDEPPFSILRTRTVNLYGLPAANPNNGNEYYEISGRFSLIYLVPTNDDYQ
TPIMNSVTVRQDLERQLTDLREEFNSLSQEIAMAQLIDLALLPLDMFSMFSGIKSTIDLTKSMATSVMKKFRKSKLATSI
SEMTNSLSDAASSASRNVSIRSNLSAISNWTNVSNDVSNVTNSLNDISTQTSTISKKFRLKEMITQTEGMSFDDISAAVL
KTKIDMSTQIGKNTLPDIVTEASEKFIPKRSYRILKDDEVMEINTEGKFFAYKINTFDEVPFDVNKFAELVTDSPVISAI
IDFKTLKNLNDNYGITRTEALNLIKSNPNMLRNFINQNNPIIRNRIEQLILQCKL
>Q06895 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSTKTQNTTINPGPFAQTGYAPVNWGPGETNDSTTIEPVLDGPYQPTSFNPPVGY
WMLLSPTTAGVIVEGTNNTDRWLATILIEPNVTSQQRTYTIFGVQEQITIENTSQTQWRFVDVSKTTQNGSYSQYGPLLS
TPKLYAVMKYGGRIHTYSGQTPNATTGYYSATNYDSVNMTTFCDFYIIPRSEESKCTEYINNGLPPIQNTRNIVPLALSA
RNVIPLKAQSNEDIVVSKTSLWKEMQYNRDITIRFKFANSIVKSGGLGYKWSEISFKPANYQYTYMRDGEEVTAHTTCSV
NGMNDFSFNGGSLPTDFVISRYEVIKENSYVYIDYWDDSQAFRNMVYVRSLAANLNSVTCAGGDYNFALPVGQWPYMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPSFAIMRTRVSGLYGLPAANPNNGREYYEIAGRFSLISLVPSNDNY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSIATNVMKKFKRSSLASS
VSTLTDSLSDAASSVSRGSSIRSVGSSVSAWTDVSIQITDVSSSVSSISTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSIQISPTTLPDIVTEASEKFIPNRAYRVINNDEVFEAGTDGRFFAYRVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGIGKQQAFNLLRSDPRVLREFINQNNPIIRNRIEQLIMQCRL
>P0C6Y8 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSAKSKNVTINPGPFAQTGYAPVNWGAGETNDSTTVEPLLDGPYRPTTFNPPTSY
WVLLAPTVEGVVIQGTNNIDRWLATILIEPNVQTTNRIYNLFGQQVTLSVENTSQTQWKFIDVSKTTPTGSYTQHGPLFS
TPKLYAVMKFSGRIYTYNGTTPNATTGYYSTTNYDTVNMTLFCDFYIIPRNQEEKCTEYINHGLPPIQNTRNVVPVSLSA
REVVHTRAQVNEDIVVSKTSLWKEMQYNRDITIRFKFDRTIIKAGGLGYKWSEISFKPITYQYTYTRDGEQITAHTTCSV
NGVNNFSYNGGSLPTDFAISRYEVIKENSFVYIDYWDDSQAFRNMVYVRSLAANLNTVTCTGGSYTFALPLGNYPVMTGG
TVSLHPAGVTLSTQFTDFVSLNSLRFRFRLTVGEPSFSITRTRVSRLYGLPAANPNNQREYYEISGRFSLISLVPSNDDY
QTPIMNSVTVRQDLERQLGELRDEFNSLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKRSNLASS
VSTLTDAMSDAASSVSRSSSIRSIGSSVSAWTEVSTSITDISTTVDTVSTQTATIAKRLRLKEIATQTDGMNFDDISAAV
LKTKIDKSVQITPNTLPDIVTEASEKFIPNRTYRVINNDEVFEAGMDGKFFAYRVDTFDEIPFDVQKFADLVTDSPVISA
IIDLKTLKNLKDNYGISKQQAFDLLRSDPKVLREFINQNNPIIRNRIENLIMQCRL
>P11114 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVNLSDEIQEIGSAKSQDVTINPGPFAQTGYAPVNWGAGETNDSTTVEPLLDGPYQPTTFNPPTSY
WVLLAPTVEGVIIQGTNNTDRWLATILIEPNVQTTNRIYNLFGQQVTLSVENTSQTQWKFIDVSTTTPTGSYTQHGPLFS
TPKLYAVMKFSGRIYTYNGTTPNATTGYYSATNYDTVNMTSFCDFYIIPRNQEEKCTEYINHGLPPIQNTRNVVPVSLSA
REIVHTRAQVNEDIVVSKTSLWKEMQCNRDITIRFKFDRTIIKAGGLGYKWSEISFKPITYQYTYARDGEQITAHTTCSV
NGVNNFSYNGGSLPTDFAISRYEVIKENSFVYIDYWDDSQAFRNMVYVRSLAANLNTVTCTGGSYTFALPLGHYPVMTGG
TVSLHPAGVTLSTQFTDFVSLNSLRFRFRLTVGEPSFSITRTRVSRLYGLPAANPNNQREYYEISGRFSLISLVPSNDDY
QTPIMNSVTVRQDLERQLGELRDEFNSLSQQIAISQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKRSNLASS
VSTLTDAMSDAASSISRSSSIRSIGSSASAWTEVSNSIADVSTTVDTVSTQTATIAKRLRLKEIATQTDGMNFDDISAAV
LKTKIDKSVQITPNTLPEIVTEASEKFIPNRTYRVINNDEVFEAGMDGKFFAYRVDTFDEIPFDVQKFADLVTDSPVISA
IIDLKTLKNLKDNYGISKQQAFDLLRSDPRVLREFINQNNPIIRNRIENLIMQCRL
>Q98167 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLSNSYVTNISDEVSEIGARKTTNVTVNSGPFAQTGYAPVNWGHGELSDSTLVQPTLDGPYQPTTFNLPINY
WMLIAPTQAGRVAEGTNTTNRWFACVLVEPSVQSTRREYVLDGQTVQLQVSNDSSTLWKFILFIKLEKNGTYSQYSTLST
SNKLCAWMKREGRVYWYTGTTPNASESYYLTINNDNSHVSCDAEFYLLPRSQTDLCDQYINNGLPPVQNTRNVVPVSITS
REIRYTKAQMNEDIVVSKTSLWKEMQYNRDIIIRFKFANSIVKSGGLGYKLSEISFKPMNYQYTYTRDGEEINSHTTCSV
NGVNDFSYNGGTLPTDFSISRFEVIKENSKVYIDYWDDSQAFRNMVYVRSLSANLNDAVCSGGDYTFALPVGAWPVMSGG
AVTLSSEGVTLSTQFTDYLSLNSLRFRFRLTVSEPSFSISRTRLSGIYGLPAANPNNNVEYYEIAGRFSLISLVLTNDDY
QTPIANSVTVRQDLERQLGELREEFNALSQEIALSQLIDLATLPLDMFSMFSGIKSTVETVKSMTTNIMKKFKTSNLANA
ISDLTNSMSDAASSVSRSVSVRSIGGNATSRISTAIQAGDDLRTVADASTQISSVSRSLRLREFTTQADNLSFDDISAAV
LKTKLDKSTQISQSTIPDIISESSEKFIPMRTYRVIDNDTAFETGIDGHFYAYRVDTFDEVPFDVERFNKLITDSPVLSA
IIDFKTLKNLNDNYGITKTQAMELLQSNPKTLKEFINNNNPIIRNRIENLIAQCRL
>Q96802 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVELSDEIQEIGSTKTQDVTVNPGPFAQTNYAPVNWGPGETNDSTTVEPVLDGPYQPTTFNPPVSY
WMLLAPTNAGVVVEGTNNTNRWLATILIEPNVQQVERTYTLFGQQVQVTVSNNSQTKWKFVDLSKQTQDGNYSQHGSLLS
TPKLYGVMKHGGKIYTYNGETPNATTDYYSTTNFDTVNMTAYCDFYIIPLAQEAKCTKYINNGLPPIQNTRNIVPVSIVS
RNIVYTRAQPNQDIVVSKTSLWKEMQYNRDIVIRFKFANSIIKSGGLGYKWSEVSFKPANYQYTYTRDGEEVTAHTTCSV
NGINNFNYNGGSLPTDFVISKYEVIKENSFVYIDYWDDSQAFRNMVNVRSLAADLNSVMCTGGDYSFALPLGHYPVMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLSVEEPPFSILRTRVSGLYGLPAARPNNSQEYYEIAGRFSLISLVPSNDDY
QTPIINSVTVRQDLERQLGELRDEFNNLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKKSSLANS
VSTLTDSLSDAASSISRNASVRSVSSTASAWTEVSNITSDINVTTSSISTQTSTISRRLRLKEMATQTDGMNFDDISAAV
LKTKIDKSTQLNTNTLPEIVTEASEKFIPNRAYRVIKDDEVLEASTDGKYFAYKVETFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGISRQQALNLLRSDPRVLREFINQDNPIIRNRIESLIMQCRL
>P12473 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVDLSDEIQEIGSTKTQNVTINLGPFAQTGYAPVNWGPGETNDSTTVEPVLDGPYQPTSFNPPVDY
WMLLAPTAAGVVVEGTNNTDRWLATILVEPNVTSETRSYTLFGTQEQITIANASQTQWKFIDVVKTTQNGSYSQYGPLQS
TPKLYAVMKHNGKIYTYNGETPNVTTKYYSTTNYDSVNMTAFCDFYIIPREEESTCTEYINNGLPPIQNTRNIVPLALSA
RNIISHRAQANEDIVVSKTSLWKEMQYNRDITIRFKFASSIVKSGGLGYKWSEISFKPANYQYTYTRDGEDVTAHTTCSV
NGMNDFNFNGGSLPTDFIISRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNSVICTGGDYSFALPVGQWPVMTGG
AVSLHSAGVTLSTQFTDFVSFNSLRFRFRLTVEEPSFSITRTRVGGLYGLPAAYPNNGKEYYEVAGRLSLISLVPSNDDY
QTPITNSVTVRQDLERQLGELREEFNALSQEIAMSQLIYLALLPLDMFSMFSGIKSTIDAAKSMATSVMKKFKKSGLANS
VSTLTDSLSDAASSISRGASIRSVGSSASAWTDVSTQITDVSSSVSSISTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDRSTQISPNTLPDIVTEASEKFIPNRAYRVINNDEVFEAGTDGRYFAYRVETFDEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGISRQQAFNLLRSDPRVLREFINQDNPIIRNRIEQLIMQCRL
>P12976 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVDLSDEIQEIGSTKSQNVTINPGPFAQTGYAPVNWGPGEINDSTTVEPLLDGPYQPTTFNPPVDY
WMLLAPTTPGVIVEGTNNTDRWLATILIEPNVQSENRTYTIFGIQEQLTVSNTSQDQWKFIDVVKTTANGSIGQYGPLLS
SPKLYAVMKHNEKLYTYEGQTPNARTAHYSTTNYDSVNMTAFCDFYIIPRSEESKCTEYINNGLPPIQNTRNVVPLSLTA
RDVIHYRAQANEDIVISKTSLWKEMQYNRDITIRFKFANTIIKSGGLGYKWSEISFKPANYQYTYTRDGEEVTAHTTCSV
NGVNDFSFNGGYLPTDFVVSKFEVIKENSYVYIDYWDDSQAFRNVVYVRSLAANLNSVMCTGGSYNFSLPVGQWPVLTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPHFKLTRTRLDRLYGLPAADPNNGKEYYEIAGRFSLISLVPSNDDY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKKFKKSGLANS
VSTLTDSLSDAASSISRGSSIRSIGSSASAWTDVSTQITDISSSVSSVSTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSTQISPNTIPDIVTEASEKFIPNRAYRVINNDDVFEAGIDGKFFAYKVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGITKQQAFNLLRSDPRVLREFINQDNPIIRNRIEQLIMQCRL
>P0C6Y9 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVELSDEIQEIGSTKTQNVTVNPGPFAQTNYAPVNWGPGETNDSTTVEPVLDGPYQPTTFNPPVSY
WMLLAPTNAGVVVEGTNNTNRWLATILIEPNVQQVERTYTLFGQQVQVTVSNDSQTKWKFVDLSKQTQDGNYSQHGSLLS
TPKLYGVMKHGGKIYTYNGETPNANTGYYSTTNFDTVNMTAYCDFYIIPLAQEAKCTEYINNGLPPIQNTRNIVPVSIVS
RNIVYTRAQPNQDIVVSKTSLWKEMQYNRDIVIRFKFANSIIKSGGLGYKWSEVSFKPANYQYTYTRDGEEVTAHTTCSV
NGVNDFNYNGGSLPTDFVISKYEVIKENSFVYIDYWDDSQAFRNMVYVRSLAADLNSVMCTGGDYSFALPVGNYPVMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLSVEEPPFSILRTRVSGLYGLPAAKPNNSQEYYEIAGRFSLISLVPLNDDY
QTPIMNSVTVRQDLERQLGELRDEFNNLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKKSSLANS
VSTLTDSLSDAASSISRSASVRSVSSTASAWTEVSNIASDINVTTSSISTQTSTISRRLRLKEMATQTDGMNFDDISAAV
LKTKIDKSTQLNTNTLPEIVTEASEKFIPNRAYRVIKDDEVLEASTDGKYFAYKVETFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGISRQQALNLLRSDPRVLREFINQDNPIIRNRIESLIMQCRL
>P17463 ~~~~~~Outer capsid protein VP4~~~
MAALIYRQLLTNSYTVELSDEIQEIGSTKTQNVTVNPGPFAQTNYAPVNWGPGETNDSTTVEPVLDGPYQPTTFNPPVSY
WMLLAPTNAGVVVEGTNNTNRWLATILIEPNVQQVERTYTLFGQQVQVTVSNDSQTKWKFVDLSKQTQDGNYSQHGSLLS
TPKLYGVMKHGGKIYTYNGETPNANTGYYSTTNFDTVNMTAYCDFYIIPLAQEAKCTEYINNGLPPIQNTRNIVPVSIVS
RNIVYTRAQPNQDIVVSKTSLWKEMQYNRDIVIRFKFANSIIKSGGLGYKWSEVSFKPANYQYTYTRDGEEVTAHTTCSV
NGVNDFNYNGGSLPTDFVISKYEVIKENSFVYIDYWDDSQAFRNMVYVRSLAADLNSVMCTGGDYSFALPVGNYPVMTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLSVEEPPFSILRTRVSGLYGLPAAKPNNSQEYYEIAGRFSLISLVPLNDDY
QTPIMNSVTVRQDLERQLGELRDEFNNLSQQIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKRFKKSSLANS
VSTLTDSLSDAASSISRSASVRSVSSTASAWTEVSNIASDINVTTSSISTQTSTISRRLRLKEMATQTDGMNFDDISAAV
LKTKIDKSTQLNTNTLPEIVTEASEKFIPNRAYRVIKDDEVLEASIDGKYFAYKVETFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGISRQQALNLLRSDPRVLREFINQDNPIIRNRIESLIMQCRL
>A2T3T2 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLTNSYTVDLSDEIQEIGSTKSQNVTINPGPFAQTGYAPVNWGPGEINDSTTVEPLLDGPYQPTTFNPPVDY
WMLLAPTTPGVIVEGTNNTDRWLATILIEPNVQSENRTYTIFGIQEQLTVSNTSQDQWKFIDVVKTTANGSIGQYGPLLS
SPKLYAVMKHNEKLYTYEGQTPNARTAHYSTTNYDSVNMTAFCDFYIIPRSEESKCTEYINNGLPPIQNTRNVVPLSLTA
RDVIHYRAQANEDIVISKTSLWKEMQYNRDITIRFKFANTIIKSGGLGYKWSEISFKPANYQYTYTRDGEEVTAHTTCSV
NGVNDFSFNGGYLPTDFVVSKFEVIKENSYVYIDYWDDSQAFRNVVYVRSLAANLNSVMCTGGSYNFSLPVGQWPVLTGG
AVSLHSAGVTLSTQFTDFVSLNSLRFRFRLAVEEPHFKLTRTRLDRLYGLPAADPNNGKEYYEIAGRFSLISLVPSNDDY
QTPIANSVTVRQDLERQLGELREEFNALSQEIAMSQLIDLALLPLDMFSMFSGIKSTIDAAKSMATNVMKKFKKSGLANS
VSTLTDSLSDAASSISRGSSIRSIGSSASAWTDVSTQITDISSSVSSVSTQTSTISRRLRLKEMATQTEGMNFDDISAAV
LKTKIDKSTQISPNTIPDIVTEASEKFIPNRAYRVINNDDVFEAGIDGKFFAYKVDTFEEIPFDVQKFADLVTDSPVISA
IIDFKTLKNLNDNYGITKQQAFNLLRSDPRVLREFINQDNPIIRNRIEQLIMQCRL
>Q8JNB4 ~~~~~~Outer capsid protein VP4~~~
MASLIYRQLLANSYAVDLSDEIQSVGSEKNQRVTVNPGPFAQTVYAPVNWGPGEVNDSTVVQPVLDGPYQPASFDLPVGN
WMLLAPTGPGVVVEGTDNSGRWLSVILIEPGVTSETRTYTMFGSSKQVLVSNASDTKWKFVEMMKTAIDGDYAEWGTLLS
DTKLYGMMKYGKRLFIYEGETPNATTKRYIVTNYASVEVRPYSDFYIISRSQESACTEYINNGLPPIQNTRNVVPVAIVS
RSIKPREVQANEDIVVSKTSLWKEMQYNRDIIIRFKFDNSIIKSGGLGYKWAEISFKAANYQYNYMRDGEDVTAHTTCSV
NGVNDFSFNGGSLPTDFAISRYEVIKENSYVYVDYWDDSQAFRNMVYVRSLAANLNDVMCSGGHYSFALPAGQWPVMKGG
AVTLHTAGVTLSTQFTDYVSLNSLRFRFRLAAEEPSFTITRTRVSKLYGIPAANPNGGREYYEVAGRFSLISLVPSNDDY
QTPIMNSVTVRQYLERHLNELREEFNNLSQEIAVSQLIDLAMLPLDMFSMFSGIESTVNAAKSMATNVMRKFKSSKLASS
VSMLRDSLSDGASSIARSTSIRSIGSTASAWANISERTQDAVNEVATISSQVSQISGKLRLKEITTQTEGMNFDDVSGAV
LKAKIDRSIQVDQNALPDVITEASEKFIRNRAYRVIDGDEAFEAGTDGRFFAYKVETLEEMPFNIEKFADLVTNSPVISA
IIDFKTLKNLNDNYGITREQAFNLLRSNPKVLRGFIDQNNPIIKNRIEQLIMQCRL
>O71026 ~~~Segment-6~~~Outer capsid protein VP5~~~
MGKFTSFLKRAGSATKNALTSDAAKRMYKMAGKTLQKVVESEVGSSAIDGVMQGTIQSIIQGENLGDSIRQAVILNVAGT
LESAPDPLSPGEQLLYNKVSEIERAEKEDRVIETHNKKIIEKYGEDLLEIRKIMKGEAEAEQLEGKEMEYVEKALKGMLK
IGKDQSERITRLYRALQTEEDLRTSDETRMISEYREKFDALKQAIELEQQATHEEAMQEMLDLSAEVIETAAEEVPIFGA
GQANVVATTRAIQGGLKLKEIIDKLTGIDLSHLKVADIHPHIIEKAMLKDRIPDNELAMAIKSKVEVIDEMNTETEHVIE
SIIPLVKKEYEKHDNKYHVNIPSALKIHSEHTPKVHIYTTPWDSDKVFICRCIAPHHQQRSFMIGFDLEVEFVFYEDTSV
EGHIMHGGAVSIEGRGFRQAYSEFMNSLVYPSTPELHKRRLQRSLGSHPIYMGSMVITVSYEQLVSNAMKLVYDTDLQMH
CLRGPLKFQRRTLMNALLFGVKVA
>Q8JU58 ~~~S5~~~Microtubule-associated protein VP5~~~
MITIVVIPTAHFSWTDTNFLNSVDYRLTSQPKIRDRFAVYAPGWLRRQLDEFSASLTASELLQALQTIPIPVKARCLLLP
KPKRFAQWLLDVPSANIWHIPVTTLRATVASKHPSSDVYNYIPDHVPPSAEFDTVTRRVAAGRDIYVRSTKVLGAPLCLA
APAKYYAGYLSTHQLDGVYPDNWAPDNFHKREFCLTILPSLLGPRTFLLDVDADRDASYPLSVLWPQLRVLALKSRLLLP
PVALLRRVVDPGLKPTWSADSDAAFRALRLSRPSSASKPTGFDFSALPVVDIICLFESEPDDHGRVAPGTRLTIHSVPTD
LLTSLSIQEGVRYPLRQESGMFVPWVLLALLMSDDVTISGTRRSVKLETAHASARPFVHITVERCASARVVDVRGSPAMY
ANAVCLTLPKGSYKSTIIDTLPAMFSDLSILEQAAVIDSDALGDSLRPSFETQFLERLENLDPKLLDRAVASILSPASDT
SDDAVTTVLDVFNALYREVMTPAQRSRLPLLTQQGRVLAFAHSDYELLSANIPIQVVRGSIPIDHVVNLLARRNRVGGTA
LQVLLDYCYRTQASPLAPTPAGRLYKQLFGPWLMVPRLSDPLIKLRLVASAPAKVLRAAGWTIDGDPPLEVSCLCAYVTD
RAMAAALIERRLDSRALVNVGGDQLMFVEYAPPLPLVSIPRTFLLPVTYVVHWVSPQRVLLNGGNVSFTSGLEWTFDDDP
QVVTSTGV
>Q8JU54 ~~~S8~~~Clamp protein VP6~~~
MAQRQFFGLTYNFYGQPAPLFDLNDLQELAGCYARPWTSRFSHLAISTGSLPVWSARYPSVASRNIIVNTLLGAHLNPFA
GGQVTSHQGITWRDPVLSSLAPVPAIQPPPVWAVAENVPLDSNNYPTYVLNLSSMWPINQDVHIMTMWALSDQGPIYHLE
VPVDPMPAATTAALMAYIGVPIAHLAQTAYRFAGQLPQSPDSTMVSTIRWLSAIWFGSLTGRLNRSRTCNGFYFEFAKPA
LNPDQAVLKWNDGARAAPPAAAQSSYMRCISPHWQHQIVEVAGALMSQSVTAVTGLPALIDEATLPAWSQGVANLTGNGQ
GVVPCLDYNPVPMAAARHLQWRQDGLITAAQEAQLNNDYTAYALTIERHLTAMLVANPIAAGRMPIQPFNAADFGQAGQT
AAAVALAQAMFV
>B2BNE7 ~~~S8~~~Clamp protein VP6~~~
MALRRFLRTSPITSTAQPAPLYDNQDIQDIVGAYARPWQSRFGDITITTTSAPVWSGRYPSVAARNIIVNTILGAHLNAF
SGGVIAQYRGLTWRDNIMSSLAPPSQNPPPPAWVPAENVQLDSDNYPQYALNLSKMWSVNLDVHIMTMWALSDYGPLYEI
SVPTAPMPAMTTAALMAYIGCSITQLAMTAYQYAGQLPQTAAATMTTTLRWLAAIWFGSLCGVVHRNHTVNGFYFDFGKP
GFNPDHAVLKWNDGNRAAPPAAARFRVYRVRSPHWQQMTSEVAGAILAQSVTAVAGLTAMFNNRGLPVWAQNIPHFTGAA
AGTRVSRTYNPVTMAAARHQNWQAAGLITAVKQAELDQQYTDYAQAIEVHLTAQLAANPVANGRMPIQPFLPADFAAAGG
TNQVVADARLMFP
>Q86341 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLAKTLKDARARIVEGTLYTNVADIVQQVNQVINSINGSTFQTGGIGNLPIRNWTFDFGTLGTTLLNLDANYVEN
ARTTIDYFIDFVDSVCVDEIVRESQRNGIAPQSDSLRQLSNAKYKRINYDNESEYIENWNLQNRRQRTGYLLHKPNILPY
NNSFTLTRSQPAHDNVCGTIWLNNGSEIEIAGFDSECALNAPGNIQEFEHVVPMRRVLNNATVSLLPYAPRLTQRAVIPT
ADGLNTWLFNPIILRPNNVQVEFLLNGQVITNYQARYGILAARNFDSIRISFQLVRPPNMTPGVAALFPVAAPFPNHATV
GLTLKIESASCESVLSDANEPYLSIVTGLRQEYAIPVGPVFPAGMNWTELLNNYSASREDNLQRIFTAASIRSMIIK
>Q91N56 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGIKFKRINFDATSEIIENWNLQNRRQRTGFIFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVITS
ADGATTWYFNPVILRPANVEVEFLLNGQVVNTYQAKFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEFHATV
GLTLRIESAVCESVLADANETMLANVTAVRQEYAIPIGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>A7J3A1 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGLKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHHATV
GLTLRIESAVCESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>Q00734 ~~~~~~Intermediate capsid protein VP6~~~
MDVLFSIAKTVSELKKRVVVGTIYTNVEDIIQQTNELIRTLNGSTFHTGGIGTQPQKDWVVQLPQLGTTLLNLDDNYVQS
ARGIIDYLASFIEAVCDDEMVREASRNGMQPQSPTLIALASSKFKTINFNNSSQSIKNWSAQSRRENPVYEYKNPMVFEY
RNSYILHRADQQFGNAMGLRYYTTSNTCQIAAFDSTMAENAPNNTQRFIYHGRLKRPISNVLMKVERGAPNVNNPTILPD
PTNQTTWLFNPVQVMNGTFTIEFYNNGQLVDMVRNMGIATVRTFDSYRITIDMIRPAAMTQYVQQLFPVGGPYSHQAAYM
LTLSVLDATTESVLCDSHSVDYSIVANTRRDSAMPAGTVFQPGFPWEQTLSNYTVAQEDNLERLLLVASVKRMVM
>P18610 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHQATV
GLTLRIESAVCESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>P16592 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSISKTLKDARDKIVEGTLYSNVSDIIQQFNQIIVTMNGNEFQTGGIGTLPIRNWTFDFGLLGTTLLNLDANYVET
ARTTIEYFIDFIDNVCMDEMTRESQRNGIAPQSDALRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFVFHKPNIFPY
SASFTLNRSQPLHNDLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRALTTATITILPDAERFSFPRVINS
ADGATTWFFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVNALFPQAQPFQHHATV
GLTLRIDSAVCESVLADSNETMLANVTAVRQEYAVPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>P16488 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMVITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRKGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGTTTWYFNPVIFRPNNVEIEFLLNGQIINNYQARFGTIIARNFDTIRLSFQLMRPPPQNMTPAVAALFPNAPPFEHHA
TVGLTLRIESAICESVLADASETMLANVTSVRQEYAVPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLIK
>Q01754 ~~~~~~Intermediate capsid protein VP6~~~
MDLIETVNACVGLQKRVLKLAPNTNLNTAGQSVLNDYNALASRVNGRTYALLDQTAVYTPYTVNAPIISLAVRISTDDYD
DMRSGIDSILDILAAAIRTEGSRPTRVIERRVIEPNVKQLVEDLKLKSLTSEISIANMAAVDTALIQPEIIETENPLFAD
IIEQVIHRPNASMTGGNIRATLGRWSGNKGIVTCMSGMDSEHRFTVDFKTRTCGIINVVYAPTAGVIMIPMPTGRNREGH
LIDVSAEMMAENFAIDFMDDDDIIQTETGVGVFSFPMCNRIRFRINPWDMQKHNDNLWTVNLANWPQGTSPRQPAISFLF
ETRRTFTEGDYQHLSRCAPKVQYMMDTIFPETAFTNRPVVDWNVQSLLTSSSQKTWCQKIAMLIAAYAAKI
>P69481 ~~~~~~Intermediate capsid protein VP6~~~
MDVLFSIAKTVSDLKKKVVVGTIYTNVEDVVQQTNELIRTLNGNIFHTGGIGTQPQKEWNFQLPQLGTTLLNLDDNYVQS
TRGIIDFLSSFIEAVCDDEIVREASRNGMQPQSPALILLSSSKFKTINFNNSSQSIKNWNAQSRRENPVYEYKNPMLFEY
KNSYILQRANPQFGSVMGLRYYTTSNTCQIAAFDSTLAENAPNNTQRFVYNGRLKRPISNVLMKIEAGAPNISNPTILPD
PNNQTTWLFNPVQLMNGTFTIEFYNNGQLIDMVRNMGIVTVRTFDSYRITIDMIRPAAMTQYVQRIFPQGGPYHFQATYM
LTLSILDATTESVLCDSHSVEYSIVANVRRDSAMPAGTVFQPGFPWEHTLSNYTVAQEDNLERLLLIASVKRMVM
>Q9QNB0 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNDFQTGGIGNLPIRNWTFDFGLLGTTLLNLDANYVEN
ARTTIEYFIDFIDNVCMDEMAREAQRNGVAPQSEALGKLAGIKFKRINFDNSSEYIENWNLQNRRQRTGFVFHKPNIFPY
SASFTLNRSQPMHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANIQQFEHIVQLRRALTTATITLLPDAERFSFPRVINS
ADGATTWFFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRSLFQLMRPPNMTPAVNALFPQAQPFQHHATV
GLTLRIESAVCESVLADANETLLANVTAVRQEYAIPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>P08035 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPTRNWSFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSESLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIHVAGFDYSCAINAPANIQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIVARNFDTIRLSFQLMRPPNMTPSVAALFPNAQPFEHHATV
GLTLRIESAICESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLHRVFTVASIRSMLVK
>P87723 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNDFQTGGIGNLPIRNWTFDFGLLGTTLLNLDANYVEN
ARTTIEYFIDFIDNVCIDEMSRESQRNGVAPQSEALRKLAGIKFKRINFNNSSEYIENWNLQNRRQRTGFVFHKPNIFPY
SASFTLNRSQPMHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANIQQFEHIVQLRRALTTATITLLPDAERFSFPRVINS
ADGATTWLFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVNALFPQAQPFQHHATV
GLTLRIESAVCESVLADANETLLANVTAVRQEYAIPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>Q86219 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSESLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHHATV
GLTLRIESAVCESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLIK
>P89043 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNDFQTGGIGNSPVRNWNFDFGLLGTTLLNLDANYVEN
ARTTIEYFVDFIDNVCMDEMTRESQRSGIAPQSEALRKQSGIKFKRINFDNSSDYIENWNLQNRRQRTGFVFHKPNILPY
SASFTLNRSQPAHDNLMGTMWINAGSEIQVAGFDYSCAFNAPANIQQFEHVVPLRRALTTATITLLPDAERFSFPRVINS
ADGTTTWYFNPVILRPSNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLVRPPNMTPAVANLFPQAPPFIFHATV
GLTLRIESAVCESVLADASETLLANVTSVRQEYAIPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>Q91N61 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNEFQTGGIGNLPIRNWTFDFGLLGTTLLNLDANYVEN
ARTTIEYFIDFIDNVCMDEIARESQRNGIAPQSEALRKLSGIKFKRINFDNSSDYIENWNLQNRRQRTGFVFHKPNILPY
SASFTLNRSQPAHDNLMGTMWINAGSEIQVAGFDYSCALNAPANIQQFEHVVPLRRALTTATITLLPDAERFSFPRVINS
ADGTTTWYFNPVILRPSNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLVRPPNMTPAVANLFPQAPPFIFHATV
GLTLRIESAVCESVLADASETLLANVTSVRQEYAIPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>P14162 ~~~~~~Intermediate capsid protein VP6~~~
MDVLFSIAKTVSDLKKKVVVGTIYTNVEDIIQQTNELIRTLNGNTFHTGGIGTQPQKEWNFQLPQLGTTLLNLDDNYVQA
TRSVIDYLASFIEAVCDDEIVREASRNGMQPQSPTLIALASSKFKTINFNNSSQSIKNWSAQSRRENPVYEYKNPMVFEY
RNSYILQRANPQYGNVMGLRYYTASNTCQLAAFDSTLAENAPNNTQRFIYNGRLKRPISNVLMKIEAGAPNINNLTILPD
PTNQTTWLYNPDQLMNGTFTIEFYNNGQLVDMVRNMGVVTVRTFDSYRITIDMIRPAAMTQYVQRLFPQGGPYPYQAAYM
LTLSILDATTESVLCDSHSVDYSIVANVRRDSAMPAGTVFQPGFPWEQTLSNYTVAQEDNLERLLLVASVKRMVM
>Q06386 ~~~~~~Intermediate capsid protein VP6~~~
MEVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIVTMNGNDFQTGGIGNLPIRNWTFDFGLLGTTLLNLDANYVEN
ARTTIEYFIDFIDNVCMDEIARESQRNGIAPQSEALRKLSGIKFKRINFDNSSDYIENWNLQNRRHGTGFVFHKPNILPY
SASFTLNRSQPAHDNLMGTMWINAGSEIQVAGFDYSCAFNAPANIQQFEHVVPLRRALTTATITLLPDAERFSFPRVINS
ADGTTTWYFNPVILRPSNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLVRPPNMTPAVANLFPQAPPFIFHATV
GLTLRIESAVCESVLADASETLLANVTSVRQEYAIPVGPVFPPGMNWTELITNYSPSREDNLQRVFTVASIRSMLIK
>P04509 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLIKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHIVQLRRVLTTATITLLPDAERFSFPRVITS
ADGATTWYFNPVILRPNNVEIEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHHATV
GLTLRIESAVCESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>B2BN53 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANIQQFEHIVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHHATV
GLTLRIESAVCESVLADASKTMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>A2T3T0 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFNFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSAIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANIQQFEHIVPLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWFFNPVILRPNNVEVEFLLNGQIINTYQARFGTIVARNFDTIRLSFQLMRPPNMTPAVAVLFPNAQPFEHHATV
GLTLRIESAVCESVLADASETLLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLIK
>Q8JTI5 ~~~~~~Intermediate capsid protein VP6~~~
MDVLYSLSKTLKDARDKIVEGTLYSNVSDLIQQFNQMIITMNGNEFQTGGIGNLPIRNWNFDFGLLGTTLLNLDANYVET
ARNTIDYFVDFVDNVCMDEMVRESQRNGIAPQSDSLRKLSGIKFKRINFDNSSEYIENWNLQNRRQRTGFTFHKPNIFPY
SASFTLNRSQPAHDNLMGTMWLNAGSEIQVAGFDYSCAINAPANTQQFEHVVQLRRVLTTATITLLPDAERFSFPRVINS
ADGATTWYFNPVILRPNNVEVEFLLNGQIINTYQARFGTIIARNFDTIRLSFQLMRPPNMTPAVAALFPNAQPFEHHATV
GLTLRIESAVCESVLADASETMLANVTSVRQEYAIPVGPVFPPGMNWTDLITNYSPSREDNLQRVFTVASIRSMLVK
>O70791 ~~~6~~~Protein 6~~~
MSSQQETNDKSNTQGHPETDPEGKTGTDTGNTEDSPPDTDNVPITDDAIMDDVMDEDVKEEDIDYSWIEDMRDEDVDAEW
LFELIDECNGWPD
>P36325 ~~~Segment-7~~~Core protein VP7~~~
MDAIAARALSVVRACVTVTDARVSLDPGVMETLGIAINRYNGLTNHSVSMRPQTQAERNEMFFMCTDMVLAALNVQIGNI
SPDYDQALATVGALATTEIPYNVQAMNDIVRITGQMQTFGPSKVQTGPYAGAVEVQQSGRYYVPQGRTRGGYINSNIAEV
CMDAGAAGQVNALLAPRRGDAVMIYFVWRPLRIFCDPQGASLESAPGTFVTVDGVNVAAGDVVAWNTIAPVNVGNPGARR
SILQFEVLWYTSLDRSLDTVPELAPTLTRCYAYVSPTWHALRAVIFQQMNMQPINPPIFPPTERNEIVAYLLLVASLADV
YAALRPDFRMNGVVAPVGQINRALVLAAYH
>O71027 ~~~Segment-7~~~Core protein VP7~~~
MDAIAARALSVVRACVTVTDARVSLDPGVMETLGIAINRYNGLTNHSVSMRPQTQAERNEMFFMCTDMVLAALNVQIGNI
SPDYDQALATVGALATTEIPYNVQAMNDIVRITGQMQTFGPSKVQTGPYAGAVEVQQSGRYYVPQGRTRGGYINSNIAEV
CMDAGAAGQVNALLAPRRGDAVMIYFVWRPLRIFCDPQGASLESAPGTFVTVDGVNVAAGDVVAWNTIAPVNVGNPGARR
SILQFEVLWYTSLDRSLDTVPELAPTLTRCYAYVSPTWHALRAVIFQQMNMQPINPPIFPPTERNEIVAYLLVASLADVY
AALRPDFRMNGVVAPVGQINRALVLAAYH
>B2BNE9 ~~~~~~Outer capsid protein VP7~~~
MPAHMIPQVARAVVQSTYSGSLSAIDDNLEPTDDIDQAAYITTGRYVVCALCLATVSDSPTQLSRWVFHHCSDDRRPLIR
SMLLASSRHAHALRESREVDMRRISRLVHQADEEDELDAPRQARRIGYVDLHSCDLQNPTPELATRQLCNDPTRTHSTHP
HLARSHPYMLPTAALDIDPPEPVTMFATMSRTDGVPMLFNMTHRNVEVLASPAARASLMYALLKLANAKLTPKQRSIMYG
PSANDMVAACTKACAATTFRHVGRYAARVVIEE
>P69361 ~~~Segment-7~~~Core protein VP7~~~
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNEMFFMCLDMMLSAAGINVGPI
SPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWGPARQPYGFFLETEETFQPGRWFMRAAQAVTAVVCGPDMIQ
VSLNAGARGDVQQIFQGRNDPMMIYLVWRRIENFAMAQGNSQQTQAGVTVSVGGVDMRAGRIIAWDGQAALHVHNPTQQN
AMVQIQVVFYISMDKTLNQYPALTAEIFNVYSFRDHTWHGLRTAILNRTTLPNMLPPIFPPNDRDSILTLLLLSTLADVY
TVLRPEFAIHGVNPMPGPLTRAIARAAYV
>P18259 ~~~Segment-7~~~Core protein VP7~~~
MDTIAARALTVMRACATLQEARIVLEANVMEILGIAINRYNGLTLRGVTMRPTSLAQRNEMFFMCLDMMLSAAGINVGPI
SPDYTQHMATIGVLATPEIPFTTEAANEIARVTGETSTWGPARQPYGFFLETEETFQPGRWFMRAAQAATAVVCGPDMIQ
VSLNAGARGDVQQIFQGRNDPMMIYLVWRRIENFAMAQGNSQQTQAGVTVSVGGVDMRAGRIIAWDGQAALHVRNPTQQN
AMVQIQVVFYISMDKTLNQYPALTAEIFNVYSFRDHTWHGLRTAIRNRTTLPNMLPPIFPPNDRDSILTLLLLSTLADVY
TVLRPEFAMHGVNPMPWPLTAAIARAAYV
>Q9E779 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILNSLILVVLLNYILKSVTRVMDYIIYRFLFVIVIVSLCAKAQNYGINLPITGSMDGAYTNTTDDKPFLTST
LCIYYPTIASNDLADPDWKNTVSQLFLTKGWPMGSVYFNEYVNIAEFSINPQLFCDYNIVLMKYESDLEMDMSELADLLL
NEWLCNPMDVTLYYYQQTDEANKWISMGTSCTIKVCPLNTQTLGIGCLTTDTSSFETVAVNEKLVITDVVDGVSHKLDVT
NVTCTIRNCKKLGPRENVAVVQIGGANILDITADPTTAPQTERMMRVNWKKWWQVFYTIVDYINQIVKVMSKRSRSLNSA
AFYYRV
>Q3ZK60 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISFILLNYILKSLTRMMDFVIYRFLFVIVVLSPLLKAQNYGINLPITGSMDTAYANSTQEETFLTST
LCLYYPTEAATEINDNSWKDTLSQLFLTKGWPTGSIYFREYTDIVSFSVDPQLYCDYNVVLMKYDAALQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCLTTDTATFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGSDVLDITADPTTAPQTERMMRINWKKWWQVFYTVVDYVNQIIQLMSKRSRSLNSA
AFYYRV
>Q85031 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISLVFVNYILKSVTRAMDFIIYRFLLVIILLAPFIKTQNYGINLPITGSMDTPYMNSTTSETFLTST
LCLYYPNEAATEIADTKWTETLSQLFLTKGWPTGSVYFKGYADIASFSVEPQLYCDYNIVLMKYDGNLQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGTSCTIKVCPLNTQTLGIGCSTTDINSFEIVANAEKLAITDVVDGVNHKLDVT
TNTCTIRNCKKLGPRENVAVIQVGGPNILDITADPTTAPQTERMMRINWKRWWQVFYTIVDYVNQIVQVMSKRSRSLNSA
AFYYRV
>P36357 ~~~~~~Outer capsid glycoprotein VP7~~~
MYSTECTILLIEIIFYFLAAIILYDMLHKMANSPLLCIAVLTVTLAVTSKCYAQNYGINVPITGSMDVAVPNKTDDQIGL
SSTLCIYYPKEAATQMNDAEWKSTVTQLLLAKGWPTTSVYLNEYADLQSFSNDPQLNCDYNIILAKYDQNETLDMSELAE
LLLYEWLCNPMDVTLYYYQQTSESNKWIAMGSDCTIKVCPLNTQTLGIGCKTTDVSTFEELTTTEKLAIIDVVDGVNHKA
NYTISTCTIKNCIRLDPRENVAIIQVGGPEIIDISEDPMVVPHVQRATRINWKKWWQIFYTVVDYINTIIQAMSKRSRSL
NTSAYYFRV
>Q00252 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTTLIFLILFVLLNYILKSITRVMDYILYRFLLFIVIVTPFVNSQNYGINLPITGSMDTNYQNVSTSKPFLTST
LCLYYPTEAETEIADSSWKDTLSQLFLTKGWPTGSVYLKSYTDIATFSINPQLYCDYNIVLMKYNANAELDMSELAALIL
NEWLCNPMDITLYYYQQTDEANKWISMGDSCTIKVCPLNTQTLGIGCLTTDTTTFEEVATAEKLAITDVVDGVNYKINVT
TATCTIRNCKKLGPRENVAVIQVGGSNILDITADPTTAPQTERMMRVNWKKWWQVFYTIVDYVNQIIQAMSKRSRSLDSA
AFYYRI
>Q00253 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTFLIYLISIILFNYILKSITRMMDYIIYKFLLIVTIASIVVNAQNYGINLPITGSMDASYVNATKDKPFLTST
LCLYYPTEARTEINDNEWTSTLSQLFLTKGWPTGSVYFKEYDDIATFSVDPQLYCDYNIVLMRYNSSLELDMSELANLIL
NEWLCNPMDITLYYYQQTDEANKWIAMGQSCTIKVCPLNTQTLGIGCQTTNARTFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGADILNITSDPTTAPQTERMMRINWKKWWQVFYTIVDYVNQIVQAMSKMSGSLNSA
AFYYRV
>P17700 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTFLIYLISIILLNYILKSITRMMDYIIYKFLLIVTITSIVVNAQNYGINLPITGSMDTSYVNATKDEPFLTST
LCLYYPTEARTEINDNEWTSTLSQLFLTKGWPTGSVYFKEYDDIATFSVDPQLYCDYNIVLMRYNSSLKLDMSELANLIL
NEWLSNPMDITLYYYQQTDEANKWIAMGQSCTIKVCPLNTQTLGIGCQTTNTGTFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGADILDITSDPTTNPQTERMMRINWKKWWQVFYTIVDYVNQIVQAMSKRSRSLNSA
AFYYRV
>P31632 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLVSIILINYILKSITRIMDYIIYRFLFVVVLMAIVTSAQNYGVNLPITGSMDTAYANSTQNEPFLTST
LCLYYPIEASNEIADTEWRNTLSQLFLTKGWPTGSVYFKEYADIAAFSVEPQLYCDYNIVLMKYDSTLELDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCSTTNPDTFETVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGANILDITADPTTAPQTERMMRVNWKKWWQVFYTVVDYVNQIIQAMSRRSRSLNSA
AFYYRV
>Q00254 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTFLIYLISIILFNYILKSITRMMDYIIYKFLLIVTITSIVVNAQNYGVNLPITGSMDTSYVNATKGKPFLTST
LCLYYPTEARTEINDNEWTNTLSQLFLTKGWPTGSVYFKEYDDIATFSVDPQLYCDYNIVLMRYNSNLELDMSELANLIL
NEWLCNPMDITLYYYQQTDEANKWIAMGQSCTIKVCPLNTQTLGIGCQTTNTRTFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGADILDITSDPTTNPQTERIMRINWKKWWQVFYTIVDYVNQIVQAMSKRSRSLNSA
AFYYRV
>P03534 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLTSITLLNYILKSITRIMDYIIYRFLLIVVVLATMINAQNYGVNLPITGSMDTAYANSTQSEPFLTST
LCLYYPVEASNEIADTEWKDTLSQLFLTKGWPTGSVYFKEYTDIAAFSVEPQLYCDYNLVLMKYDSTQELDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTVKVCPLNTQTLGIGCLITNPDTFETVATTEKLVITDVVDGVNHKLNVT
TATCTIRNCKKLGPRENVAIIQVGGANVLDITADPTTAPQTERMMRINWKKWWQVFYTVVDYVNQIIQTMSKRSRSLNSS
AFYYRV
>Q96643 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILISLTSITLLNYILKSITRMMDYIIYRFLLIAVILATMINAQNYGVNLPITGSMDTAYANSTQNEPFWTST
LCLYYPVEASNEMADTEWKDTLSQLFLTKGWPTGSVYFKEYTDIAAFSVEPQLYCDYNLVLMKYDSTQELDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTVQVCPLNTQTLGIGCLITNPDTFETVATPEKLVITDVVDGVNHKLNVT
TATCTIRNCEKLGPRENVAVIQVGGANILDITADPTTTPQTERMMRINWKKWWQVFYTVVDYVNQIIQTMSKRSRSLNSS
AFYYRV
>P29821 ~~~~~~Outer capsid glycoprotein VP7~~~
MYSTKCTNFFLEIIFYVIFCTLFLLVLEKMSKLLSWIVIVCLFVFAISSKCSAQNYGINVPITGSMDVVLANSTQDQIGL
TSTLCIYYPKAADTEIADPEWKATVTQLLLTKGWPTTSVYLNEYQDLVTFSNDPKLYCDYNIVLAHYTNDVALDISELAE
FLLYEWLCNPMDVTLYYYQQTSEPNKWIAMGTNCTIKVCPLNTQTLGIGCQTTNTDTFEILTMSEKLAIIDVVDGVNHKV
DYTVATCKINNCIRLNPRENVAIIQVGGPEVLDISENPMVIPKVSRMTRMNWKKWWQVFYTIVDYINTIITTMSKRSRSL
DVSSYYYRV
>Q03874 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIECTTILTFLISLILLNYILQLLTRIMDFIIYRFLFIIVFLSPFLKAQNYGINLPISGSMDTAYVNSTQENIFLTST
LCLYYPTEAATQIDDSSWKDTISQLFLTKGWPAGSVYLKEYTDITSFSIDPQLYCDYNVVLMKYDEALQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCLTTNVATFEEVATSEKLVIKDVVDGVDHKVECT
TTTCTIRNCKKLGPRENVAIIQVGGSDILDITADPTTAPQIARMMRINWKKWWQVFYTVVDYINQIVQVMSKRSRSLDSA
AFYYRI
>P04328 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILTILISIILLNYILKTITNTMDYIIFRFLLLIALISPFVRTQNYGMYLPITGSLDAVYTNSTSGEPFLTST
LCLYYPAEAKNEISDDEWENTLSQLFLTKGWPIGSVYFKDYNDINTFSVNPQLYCDYNVVLMRYDNTSELDASELADLIL
NEWLCNPMDISLYYYQQSSESNKWISMGTDCTVKVCPLNTQTLGIGCKTTDVNTFEIVASSEKLVITDVVNGVNHNINIS
INTCTIRNCNKLGPRENVAIIQVGGPNALDITADPTTVPQVQRIMRINWKKWWQVFYTVVDYINQVIQVMSKRSRSLDAA
AFYYRI
>P30216 ~~~~~~Outer capsid glycoprotein VP7~~~
MVCTTLYTVCAILFILFIYILLFRKMFHLITDTLIVILILSNCVEWSQGQMFIDDIYYNGNVETIINSTDPFNVESLCIY
FPNAIVGSQGPGKSDGHLNDGNYAQTIATLFETKGFPKGSIILKTYTQTSDFINSVEMTCSYNIVIIPDSPNDSESIEQI
AEWILNVWRCDDMNLEIYTYEQIGINNLWAAFGSDCDISVCPLDTTSNGIGCSPASTETYEVVSNDTQLALINVVDNVRH
RIQMNTAQCKLKNCIKGEARLNTALIRISTSSSFDNSLSPLNNGQTTRSFKINAKKWWTIFYTIIDYINTIVQAMTPRHR
AIYPEGWMLRYA
>Q86207 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLISIILLNYILKSVTRIMDYIIYRFLLITIALFALTRAQNYGLNLPITGSMDTVYTNSTQEEVFLTST
LCLYYPNEASTQINDGDWKDSLSQMFLTKGWPTGSVYFKEYSSIVDFSVDPQLYCDYNLVLMKYDQSLELDMSELADLIL
NEWLCNPMDITLYYYQQSGESNKWISMGSSCTVKVCPLNTQTLGIGCQTTNVDSFEMVAENEKLAIVDVVDGINHKINLT
TTTCTIRNCKKLGPRENVAVIQVGGSNVLDITADPTTNPQTERMMRVNWKKWWQVFYTIVDYINQIVQVMSKRSRSLNSA
AFYYRV
>Q76WK3 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLISIILLNYILKSVTRIMDYIIYRFLLITVALFALTRAQNYGLNLPITGSMDTVYTNSTQEEVFLTST
LCLYYPTEASTQINDGDWKDSLSQMFLTKGWPTGSVYFKEYSSIVDFSVDPQLYCDYNLVLMKYDQSLELDMSELADLIL
NEWLCNPMDITLYYYQQSGESNKWISMGSSCTVKVCPLNTQTLGIGCQTTNVDSFEMVAENEKLAIVDVVDGINHKINLT
TTTCTIRNCKKLGPRENVAVIQVGGSNVLDITADPTTNPQTERMMRVNWKKWWQVFYTIVDYINQIVQVMSKRSRSLNSA
AFYYRV
>Q07156 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISIILLNYILKSVTSAMDFIIYRFLLIIVVVSPFVKTQNYGINVPITGSMDTAYTNSSQQETFLTST
LCLYYPIEASTQIGDTEWKGTLSQLFLTKGWPTGSVYFKEYTDIASFSIDPQFYCDYNVVLVKYNSTLELDMSELADLIL
NEWLCNPMDIALYYYQQTNEANKWISMGQSCTIKVCPLNTQTLGIGCTTTNTATFEEVATNEKLVITDVVDGVNHKLDVT
TNTCTIRNCRKLGPRENVAKLQVGGSEVLDITADPTTTPQTERMMQINWKKWWQVFYTVVDYINQIVQVMSKRSRSFNSA
AFYYRI
>P30217 ~~~~~~Outer capsid glycoprotein VP7~~~
MVCTTLYTVCVILCILFIYMLLFRKMFHLIADALVITLIISNCIGWSQGQMFIDDIHYNGNVETIVNATDPFDVRSLCIY
FPNAVVGSQGPGKTDGYLNDGNYAQTIAALFETKGFPRGSIVLKTYTKVSDFVDSVEMTCSYNIVIIPDSPTNSESIERI
AEWILNVWRCDDMNLDIYTYEQTGIDNLWAAFGSDCDVSVCPLDTTMNGIGCSPASTETYEVLSNDTQLALLNVVDNVKH
RIQMNTASCKLKNCIKGEARLNTALIRISTSSSFDNSLSPLNDGQTTRSFKINAKKWWTIFYTIIDYINTIIQTMTPRHR
AIYPEGWMLRYA
>P27423 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLFYLISFILVSYILKTITKIMDYLIYRITFIIVVLSMLSNAQNYGINLPITGSMDIAYANSTQNDNFLSST
LCLYYPTEASTQISDNEWKDTLSQLFLTKGWPTGSVYFNEYSNVLEFSIDPKLYCDYNIVLIRFTSGEELDISELADLIL
NEWLCNPMDITLYYYQQTGEANKWISMGSSCTVKVCPLNTQTLGIGCQTTNATTFETVADREKLAIVDVVDGVNHKLDVT
STTCTIRNCNKLGPRENVAIIQVGGSNILDITADPTTSPQTERMMRVNWKKWWQVFYTVVDYINQIVQVMSKRSRSLDSS
SFYYRV
>P17466 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILTXLISLIFITYILKSITRTMDFIIYRFLFVIVVLAPFVKTQNYGINLPITGSMDTPYMNSTMSETFLTST
LCLYYPHEAATQIADDKWKDTLSQLFLTKGWSTGSVYFKEYTDIASFSVDPQLYCDYNIVLMKYDGNSQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGNSCTIKVCPLNTQTLGIGCLTTDPTTFEEVASAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGSNILDITADPTTAPQTERMMRINWKKWWQVFYTIVDYVNQIVQVMSKRSRSLNSA
AFYYRI
>Q86515 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLTSITLLNYMLKSITRIMDYIIYRFLLIVVILATIIKAQNYGVNLPITGSMDTAYADSTQSEPFLTST
LCLYYPVEASNEIADTEWKDTLSQLFLTKGWPTGSVYLKEYADIAAFSVEPQLYCDYNLVLMKYDSTQELDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISTGSSCTVKVCPLNTQTLGIGCLITNPDTFETVATTEKLVITDVVDGVNHKLNVT
TATCTIRNCKKLGPRENVAVIQVGGANILDITADPTTTPQTERMMRINWKKWWQVFYTVVDYVNQIIQTMSKRSRSLNSS
AFYYRV
>P12476 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISLILLNYILKSLTRMMDFIIYRFLFIVVILSPLLKAQNYGINLPITGSMDTAYANSTQEETFLTST
LCLYYPTEAATEINDNSWKDTLSQLFLTKGWPTGSVYFKEYTDIASFSVDPQLYCDYNVVLMKYDATLQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCLTTDTATFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGSDVLDITADPTTAPQTERMMRINWKKWWQVFYTVVDYVNQIIQAMSKRSRSLNSA
AFYYRI
>P03533 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISIILLNYILKSLTRIMDCIIYRLLFIIVILSPFLRAQNYGINLPITGSMDTAYANSTQEETFLTST
LCLYYPTEAATEINDNSWKDTLSQLFLTKGWPTGSVYFKEYTNIASFSVDPQLYCDYNVVLMKYDATLQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCLTTDATTFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGSDILDITADPTTAPQTERMMRINWKKWWQVFYTVVDYVDQIIQVMSKRSRSLNSA
AFYYRV
>A2T3P5 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTVLTFLISIILLNYILKSLTRIMDFIIYRFLFIIVILSPFLRAQNYGINLPITGSMDTAYANSTQEETFLTST
LCLYYPTEAATEINDNSWKDTLSQLFLTKGWPTGSVYFKEYTNIASFSVDPQLYCDYNVVLMKYDATLQLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCTIKVCPLNTQTLGIGCLTTDATTFEEVATAEKLVITDVVDGVNHKLDVT
TATCTIRNCKKLGPRENVAVIQVGGSDILDITADPTTAPQTERMMRINWKKWWQVFYTVVDYVDQIIQVMSKRSRSLNSA
AFYYRV
>Q8JNB3 ~~~~~~Outer capsid glycoprotein VP7~~~
MYGIEYTTILIFLTSITLLNYILKSITRIMDYIIYRFLLMVVILATIINAQNYGVNLPITGSMDTAYANSTQSEPFLTST
LCLYYPVEASNEIADTEWKDTLSQLFLTKGWPTGSVYFKEYADIAAFSVEPQLYCDYNLVLMKYDSTQKLDMSELADLIL
NEWLCNPMDITLYYYQQTDEANKWISMGSSCNVKVCPLNTQTLGIGCLITNPDTFETVATTEKLVITDVVDGVNHKLNVT
TATCTIRNCKKLGPRENVAVIQVGGANILDITADPTTTPQTERMMRINWKKWWQCFYTVVDYVNQIIQTVSKRSRSLNSS
AFYYRV
>Q00733 ~~~~~~Capsid-associated protein VP80~~~
MNDSNSLLITRLAAQILSRNMQTVDVIVDDKTLSLEEKIDTLTSMVLAVNSPPQSPPRVTSSDLAASIIKNNSKMVGNDF
EMRYNVLRMAVVFVKHYPKYYNETTAGLVAEIESNLLQYQNYVNQGNYQNIEGYDSLLNKAEECYVKIDRLFKESIKKIM
DDTEAFEREQEAERLRAEQTAANALLERRAQTSADDVVNRADANIPTAFSDPLPGPSAPRYMYESSESDTYMETARRTAE
HYTDQDKDYNAAYTADEYNSLVKTVLLRLIEKALATLKNRLHITTIDQLKKFRDYLNSDADAGEFQIFLNQEDCVILKNL
SNLASKFFNVRCVADTLEVMLEALRNNIELVQPESDAVRRIVIKMTQEIKDSSTPLYNIAMYKSDYDAIKNKNIKTLFDL
YNDRLPINFLDTSATSPVRKTSGKRSAEDDLLPTRSSKRANRPEINVISSEDEQEDDDVEDVDYEKESKRRKLEDEDFLK
LKALEFSKDIVNEKLQKIIVVTDGMKRLYEYCNCKNSLETLPSAANYGSLLKRLNLYNLDHIEMNVNFYELLFPLTLYND
NDNSDKTLSHQLVNYIFLASNYFQNCAKNFNYMRETFNVFGPFKQIDFMVMFVIKFNFLCDMRNFAKLIDELVPNKQPNM
RIHSVLVMRDKIVKLAFSNLQFQTFSKKDKSRNTKHLQRLIMLMNANYNVI
>P08363 ~~~~~~Non-structural protein P8~~~
MLSGLIQRFEEEKMKHNQDRVEELSLVRVDDTISQPPRYAPSAPMPSSMPTVALEILDKAMSNTTGATQTQKAEKAAFAS
YAEAFRDDVRLRQIKRHVNEQILPKLKSDLSELKKKRAIIHTTLLVAAVVALLTSVCTLSSDMSVAFKINGTKTEVPSWF
KSLNPMLGVVNLGATFLMMVCAKSERALNQQIDMIKKEVMKKQSYNDAVRMSFTEFSSIPLDGFEMPLT
>P15913 ~~~~~~Core protein VP8~~~
MNDLLLENLFGEKALCAQVTRDQLLEIIAAGARSKFPKSLLSMYRVTPRVMTRYPLKLITNESITGVVITTVYNLKKNLN
IPQNNKLTTQDIERYYLDKSVEVINLMVGNTSLGDLACGRPRRTKSSKKKDPVIFLGISAPLILVMNSKKSINTYIQDKK
SDPSSDYVNINPGIGVLENYGNTYLLDIHNPSSVLTISTIYGLDNNMELKKLSTASEIDAYQDVNIGKSVDLKKFNEIFN
TMKKHLSLSNFSI
>P20981 ~~~~~~Core protein VP8~~~
MSLLLENLIEEDTIFFAGSISEYDDLQMVIAGAKSKFPRSMLSIFNIVPRTMSKYELELIHNENITGAMFTTMYNIRNNL
GLGDDKLTIEAIENYFLDPNNEVMPLIINNTDMTAVIPKKSGRRKNKNMVIFRQGSSPILCIFETRKKINIYKENMESAS
TEYTPIGDNKALISKYAGINVLNVYSPSTSMRLNAIYGFTNKNKLEKLSTNKELESYSSSPLQEPIRLNDFLGLLECVKK
NIPLTDIPTKD
>P03295 ~~~~~~Core protein VP8~~~
MSLLLENLIEEDTIFFAGSISEYDDLQMVIAGAKSKFPRSMLSIFNIVPRTMSKYELELIHNENITGAMFTTMYNIRNNL
GLGDDKLTIEAIENYFLDPNNEVMPLIINNTDMTAVIPKKSGRRKNKNMVIFRQGSSPILCIFETRKKINIYKENMESAS
TEYTPIGDNKALISKYAGINILNVYSPSTSIRLNAIYGFTNKNKLEKLSTNKELESYSSSPLQEPIRLNDFLGLLECVKK
NIPLTDIPTKD
>P0DSW9 ~~~~~~Core protein VP8~~~
MSLLLENLIEEDTIFFAGSISEYDDLQMVIAGAKSKFPRSMLSIFNIVPRTMSKYELELIHNENITGAMFTTMYNIRNNL
GLGDDKLTIEAIENYFLDPNNEVMPLIINNTDMTAVIPKKSGRRKNKNMVIFRQGSSPILCIFETRKKINIYKENMESAS
TEYTPIGDNKALISKYAGINVLNVYSPSTSMRLNAIYGFTNKNKLEKLSTNKELELYSSSPLQEPIRLNDFLGLLECVKK
NIPLTDIPTKD
>P0DSX0 ~~~~~~Core protein VP8~~~
MSLLLENLIEEDTIFFAGSISEYDDLQMVIAGAKSKFPRSMLSIFNIVPRTMSKYELELIHNENITGAMFTTMYNIRNNL
GLGDDKLTIEAIENYFLDPNNEVMPLIINNTDMTAVIPKKSGRRKNKNMVIFRQGSSPILCIFETRKKINIYKENMESAS
TEYTPIGDNKALISKYAGINVLNVYSPSTSMRLNAIYGFTNKNKLEKLSTNKELELYSSSPLQEPIRLNDFLGLLECVKK
NIPLTDIPTKD
>P16790 ~~~~~~DNA polymerase processivity factor~~~
MDRKTRLSEPPTLALRLKPYKTAIQQLRSVIRALKENTTVTFLPTPSLILQTVRSHCVSKITFNSSCLYITDKSFQPKTI
NNSTPLLGNFMYLTSSKDLTKFYVQDISDLSAKISMCAPDFNMEFSSACVHGQDIVRESENSAVHVDLDFGVVADLLKWI
GPHTRVKRNVKKAPCPTGTVQILVHAGPPAIKFILTNGSELEFTANNRVSFHGVKNMRINVQLKNFYQTLLNCAVTKLPC
TLRIVTEHDTLLYVASRNGLFAVENFLTEEPFQRGDPFDKNYVGNSGKSRGGGGGGGSLSSLANAGGLHDDGPGLDNDLM
NEPMGLGGLGGGGGGGGKKHDRGGGGGSGTRKMSSGGGGGDHDHGLSSKEKYEQHKITSYLTSKGGSGGGGGGGGGGLDR
NSGNYFNDAKEESDSEDSVTFEFVPNTKKQKCG
>Q06419 3.1.-.-~~~A~~~Replication gene A protein~~~
MAVKASGRFVPPSAFAAGTGKMFTGAYAWNAPREAVGRERPLTRDEMRQMQGVLSTINRLPYFLRSLFTSRYDYIRRNKS
PVHGFYFLTSTFQRRLWPRIERVNQRHEMNTDASLLFLAERDHYARLPGMNDKELKKFAARISSQLFMMYEELSDAWVDA
HGEKESLFTDEAQAHLYGHVAGAARAFNISPLYWKKYRKGQMTTRQAYSAIARLFNDEWWTHQLKGQRMRWHETLLIAVG
EVNKDRSPYASKHAIRDVRARRQANLEFLKSCDLENRETGERIDLISKVMGSISNPEIRRMELMNTIAGIERYAAAEGDV
GMFITLTAPSKYHPTRQVGKGESKTVQLNHGWNDEAFNPKDAQRYLCHIWSLMRTAFKDNDLQVYGLRVVEPHHDGTPHW
HMMLFCNPRQRNQIIEIMRRYALKEDGDERGAARNRFQAKHLNQGGAAGYIAKYISKNIDGYALDGQLDNDTGRPLKDTA
AAVTAWASTWRIPQFKTVGLPTMGAYRELRKLPRGVSIADEFDERVEAARAAADSGDFALYISAQGGANVPRDCQTVRVA
RSPSDEVNEYEEEVERVVGIYAPHLGARHIHITRTTDWRIVPKVPVVEPLTLKSGIAAPRSPVNNCGKLTGGDTSLPAPT
PSEHAAAVLNLVDDGVIEWNEPEVVRALRGALKYDMRTPNRQQRNGSPLKPHEIAPSARLTRSERLQITRIRVDLAQNGI
RPQRWELEALARGATVNYDGKKFTYPVADEWPGFSTVMEWT
>P07696 ~~~B~~~Replication gene B protein~~~
MTVMTLNLVEKQPAAMRRIIGKHLAVPRWQDTCDYYNQMMERERLTVCFHAQLKQRHATMCFEEMNDVERERLVCAIDEL
RGAFSKRRQVGASEYAYISFLTVSQRRTLFMHAGLTEKEFNQPYWRINEESCYWRDALFRALRELFSLFEYAPTILTSVK
PEQYLH
>P24728 ~~~~~~Polyhedral envelope-associated phosphoprotein~~~
MKPTNNVMFDDASVLWIDTDYIYQNLKMPLQAFQQLLFTIPSKHRKMINDAGGSCHNTVKYMVDIYGAAVLVLRTPCSFA
DQLLSTFIANNYLCYFYRRRRSRSRSRSRSRSRSPHCRPRSRSPHCRPRSRSRSRSRSRSRSSSPRRGRRQIFDALEKIR
HQNDMLMSNVNQINLNQTNQFLELSNMMTGVRNQNVQLLAALETAKDVILTRLNTLLAEITDSLPDLTSMLDKLAEQLLD
AINTVQQTCATS
>Q89121 2.7.11.1~~~~~~Serine/threonine-protein kinase 2~~~
MGVANDSSPEYQWMSPHRLSDTVILGDCLYFNNIMSQLDLHQNWAPSVRLLNYFKNFNKETLLKIEENDYINSSFFQQKD
KRFYPINDDFYHISTGGYGIVFKIDNYVVKFVFEATKLYSPMETTAEFTVPKFLYNNLKGDEKKLIVCAWAMGLNYKLTF
LHTLYKRVLHMLLLLIQTMDGQELSLRYSSKVFLKAFNERKDSIKFVKLLSHFYPAVINSNINVINYFNRMFHFFEHEKR
TNYEYERGNIIIFPLALYSADKVDTELAIKLGFKSLVQYIKFIFLQMALLYIKIYELPCCDNFLHADLKPDNILLFDSNE
PIIIHLKDKKFVFNERIKSALNDFDFSQVAGIINKKIKNNFKVEHNWYYDFHFFVHTLLKTYPEIEKDIEFSTALEEFIM
CTKTDCDKYRLKVSILHPISFLEKFIMRDIFSDWINGGN
>F5HGH5 2.7.11.1~~~vPK~~~Viral protein kinase~~~
MRWKRMERRPPLTPLRRSRTQSSGGGLTICPRCALKLPKATRISERPWASTWQLNQHIQVSKTKKATAYLKAPREWGQCT
HQDPDWSKRLGRGAFGIIVPISEDLCVKQFDSRREFFYEAIANDLMQATRERYPMHSGGSRLLGFVQPCIPCRSIVYPRM
KCNLLQLDWSQVNLSVMAAEFTGLMAAVSFLNRYCGMVHCDVSPDNILATGDLTPMNPGRLVLTDFGSVALHSGSKWTNL
VVTSNLGFKQHCYDFRVPPKLICKHLYKPSCVLFQCYLSSLGKMHAQVLDQPYPISPNMGLTIDMSSLGYTLLTCLELYL
DLPLNNPLKFLGSATRDGRPEPMYYLGFMIPRVVMTQILSAVWTMTLDLGLDCTGKAQAIPMRQEHQLAFQKQCYLYKAN
QKAESLANCSDKLNCPMLKSLVRKLLERDFFNHGGHPHTRGLVF
>P25475 ~~~L~~~Head completion/stabilization protein~~~
MMTLIIPRKEAPVSGEGTVVIPQPAGDEPVIKNTFFFPDIDPKRVRERMRLEQTVAPARLREAIKSGMAETNAELYEYRE
QKIAAGFTRLADVPADDIDGESIKVFYYERAVCAMATASLYERYRGVDASAKGDKKADSIDSTIDELWRDMRWAVARIQG
KPRCIVSQI
>P85225 ~~~~~~Putative tail sheath protein~~~
MAVEQFPRKKVSRPHTEITVDTSGIGGSSSSSDKTLMLVGSAKGGKPDTVYRFRNYQQAKQVLRSGDLLDAIELAWNASD
VNTASAGDILAVRVEDAKNATLTKGGLTFASTIYGVDANEIQVALEDNNLTHTKRLTVAFSKDGYKKVFDNLGKIFSIQY
KGSEAQANFTIAQDSISKKATTLTLNVGSEPESTTEVMKYELGQGVYSETNVLVSAINSLPDWEAKFFPIGDKNLPTDAL
EAVTKVDVKTEAVFVGALAGDIAKQLEYNDYVTVAVDATKPVEDFELTNLTGGSDGTAPESWANKFPLLANEGGYYLVPL
TDKQAVHSEALAFVKDRTDNGDPMRIIVGGGTNETVEESITRATNLRDPRASLVGFSGTRKMDDGRLLKLPGYMMASQIA
GIASGLEVGEAITFKHFNVTSVDRVFESSQLDMLNESGVISIEFVRNRTLTAFRVVQDVTTYNDKSDPVKNEMSVGEAND
FLVSELKIELDNNFIGTKVIDTSASLIKNFIQSFLDNKKRAREIQDYTPEEVQVVLEGDVASISMTVMPIRSLNKITVQL
VYKQQILTA
>P85226 ~~~~~~Putative major capsid protein~~~
MTEKKNTERQLTSVQEEVIKGFTTGYGITPESQTDAAALRREFLDDQITMLTWADGDLSFYRDITKRPATSTVAKYDVYL
AHGRVGHTRFTREIGVAPISDPNLRQKTVNMKYVSDTKNMSIATGLVNNIEDPMRILTDDAISVVAKTIEWASFYGDSDL
SENPDAGSGLEFDGLAKLIDKHNVLDAKGASLTEALLNQASVLVGKGYGTPTDAYMPIGVQADFVNQQLDRQVQVISDNG
QNATMGFNVKGFNSARGFIRLHGSTVMELEQILDENRMQLPNAPQKATVKATLEAGTKGKFRDEDLTIDTEYKVVVVSDD
AESAPSDVASVVIDDKKKQVKLEITINNMYQARPQYVAIYRKGLETGLFYQIARVPASKAVEGVITFIDVNDEIPETADV
FVGELTPSVVHLFELLPMMRLPLAQVNASVTFAVLWYGALALRAPKKWARIKNVKYIATGNVFN
>P85228 ~~~~~~Virion protein 4~~~
MAKEILNIEDLLKPETLEVAIDGKYLIVPTLSDGFTGTVAGGYAYAVTKKGTDYTVNELIYNQKDNTFKPSDEPIIITDD
NEIFFITRTLEDPYNYPVVATEKLKTKDVKEKQVLQAFLAFADDRFKLGVYNVFLADEPFVYGDKTE
>P85229 ~~~~~~Virion protein 5~~~
MASVGNQTVHTGNTVYLMIGNKIIGRAQSASGERQYGTQGIYEIGSIMPQEHVYLKYEGTITLERMRMKKEDLASLGITA
LGEDILQRDIIDIVMMDNLTKEIVVAYRGCSAISYSESFTANEVTSESTQFTYLTSAKVK
>P85230 ~~~~~~Virion protein 6~~~
MNYPKREKVVEVSLASGTYSVFPRRLGVTTNDAMSIVNGAMKGAELPMIPVHKLADRDSELTYVNAFQIQTATENIVDVP
ERITSLYTKPEDETPEDEEVRLGTINNYFSLR
>A8E283 ~~~~~~Tail fiber protein~~~
MNKLVKRRFQAGLGSEIKRVYKEGQQINTLLLAQVIQVNYKYNTVDLLALQHKEVFQNSYANEGRFSARLPMEFGGRNIV
GQPYGQVNPIAVGTVVLVGFINSDKDMPIVISVYNNNDVSKQLSRTQFSNSDPKDLELIGDMHQKFSLYPSLTYDSVDGE
GGRVVTFSGKSFIAFDTKEVANSSTTDAGYGTKYEDLETSYYNNGDLIEPMKGRAPNVLFKHQGVLDDDGKPDLHDLLIH
INPDGTYRTSMMNKEEDWRTLFEMTPDGRVKLRKQDSINIDGGIEISELGINNEGFVYLRNGDMDLEVRKDGIYSQGKLF
TADVDLSDVYDKLNGLSIQIKETNGQLEIIANGVEEQNGKISELSTEITIVAGKVESKVTKTEVQDMIDSSFVDMSDAIK
KAQEDADKANKVIADMSSDNRLTPSEKIDLLKEWDIIKNEYPSYLEQAETYEVDSKDYTAKYNSLELFVTPILADMESTS
SVDGATLRKTFNSYYTARIALLNSISKKLKDGITEAMKKASQASLDATQAMADASQAKIDADNANKLISDIASDNKLTPS
EKYQLKKEWDVIVKEYPTTIAQAEKYAVDTAEYTAKYKALELFVEPLFKDMDETSIVDGERLRATFSDYYASKIALLKEV
TDSAKTELDAYGNKISVMETNITQTSEAITLLATRVQTVEDGVQSNKAQIEIQAEQISQKVTASEVKGIVDDSINNLTLG
GTNLFVIKTQTAGLLNENDGTVGTAVDNSVVSDYIKVNQKTPYIATLYGNTGTNMIITDWYDKNRTFISGEAVADSGDFS
KKYVSPENAVYARVSYKKANSVNIKFEAGTKATDYSPSWEDIKGDQTALEEYIKKVEEQAKKAQQDAENAKNDAENANNA
IADMSNDNMLAPNEKKQILLQWEQIKTEYPINLDQATKFGVSSQQYTTAYNALDEYLKPILADMTTTSVVVGSTLRNTFN
NYYDKRTTLLNRISDVAKNVADKAQETADTINDNLQNIGGYNYVGFSSGDNMLPRLMIKNVGYYTLGSSTTEFIDSMVAV
KGDATTQPFDYTVGTSDKEIAGGGLADYRMKEVKEGQWLTASANVQVIDGGSARLAIYTLEGDNWVGSNSTPIQVSDGLK
RVVAQRKVTGLTKGVLIRIESADTNVKEFRFGNVQLEVGIIPTPWKKSDIDIQEDINNVVQNIKTYTAWANDLQGLDFTR
EKVEGKTYMYVGTSMKDSDNYSDYTWRLTDEHIEGQINGKEGAWIYSPTAPTNPSQGLIWVDLSKVPNQPKRWVDSETGW
VALTPEEVKDLPWGEDGTNLADWVAQAEQRISSDSIINTVLGSEDFTSVFDTKANTTDLDNLATYEDLDSIKEDYNRLIK
EGINGIDFTPYVTNSELQQLKDSFNFSVQQAGGVNMLKNSLGFSGLDFWDGTVGKNLLPNSTWNLGFGRWGGASIASFEI
LPPEDDKPTSHILGSIGSRSSTKEIGNRPHPLKVNSGETYTISFDYKEEALAYDKDRPILVVRNYPDKDTDQWMEYSIEG
WAVMANGSTTDLTVWRRFTKTFTIGTSGYLDILPKTIVESWTHRSFWRELKIEEGSQATTWVPNKEDGAFTGGIVETTQT
EELANLGFGSGFVSSKRPSSSLTQSVELPEIGANLEYSLSFYMKVTTDNPVADFKCGIRVYEGDTLTYTLGIEDATQPIP
LGFQQYKLVFTPTSTSTKIEMFVENGQEASVIISGIMYNIGNIPLKWQPYPSEIYNTNVKIDINGVTVKNNQTDGYTMIT
PQEFSGYSRIDGNIERIFTLNGQVTEVKMLKAEKRITMEPVSVFAMNTVTDTKRIRGWAFVPSFE
>Q00946 3.4.22.-~~~~~~Cysteine protease S273R~~~
MSILEKITSSPSECAEHLTNKDSCLSKKIQKELTSFLEKKETLGCDSESCVITHPAVKAYAQQKGLDLSKELETRFKAPG
PRNNTGLLTNFNIDETLQRWAIKYTKFFNCPFSMMDFERVHYKFNQVDMVKVYKGEELQYVEGKVVKRPCNTFGCVLNTD
FSTGTGKHWVAIFVDMRGDCWSIEYFNSAGNSPPGPVIRWMERVKQQLLKIHHTVKTLAVTNIRHQRSQTECGPYSLFYI
RARLDNVSYAHFISARITDEDMYKFRTHLFRIA
>P0C9B9 3.4.22.-~~~~~~SUMO-1 cysteine protease S273R~~~
MSILEKITSSPSECAEHITNKDSCLSKKIQKELTSFLQKKETLGCDSESCVITHPAVKAYAQQKGLDLSKELETRFKAPG
PRNNTGLLTNFNIDETLQRWAIKYTKFFNCPFSMMDFERIHYKFNQVDMVKVYKGEELQYVEGKAVKRPCNTFGCVLNTD
FSTGTGKHWVAIFVDMRGDCWSIEYFNSAGNSPPGPVIRWMERVKQQLLKIHHTVKTLAVTNIRHQRSQTECGPYSLFYI
RARLDNVSYTHFISTRITDEEMYKFRTHLFRIA
>Q73369 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPYNDWTLELLEELKNEAVRHFPRIWLHSLGQHIYETYGDTWTGVEALIRILQQLLFIHFRIGCRHSR
IGIIQQRRTRNGASKS
>P05928 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHIYETYGDTWAGVEAIIRILQQLLFIHFRIGCRHSR
IGVTQQRRARNGASRS
>P69726 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHIYETYGDTWAGVEAIIRILQQLLFIHFRIGCRHSR
IGVTRQRRARNGASRS
>P12520 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPYNEWTLELLEELKSEAVRHFPRIWLHNLGQHIYETYGDTWAGVEAIIRILQQLLFIHFRIGCRHSR
IGVTRQRRARNGASRS
>P05954 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPYNEWTLELLEELKSEAVRHFPRLWLHSLGQHIYETYGDTWAGVEAIIRILQQLLFIHFRIGCQHSR
IGITRQRRARNGASRS
>P35967 ~~~vpr~~~Protein Vpr~~~
MEQAPEDQGPQREPHNEWTLELLEELKREAVRHFPRPWLHGLGQHIYETYGDTWAGVEAIIRILQQLLFIHFRIGCQHSR
IGIIQQRRARRNGASRS
>P19509 ~~~vpr~~~Protein Vpr~~~
MTERPPEDEAPQREPWDEWVVEVLEEIKEEALNHFDPRLLTALGNYIYDRHGDTLEGAGELIRILQRALFIHFRGGCRHS
RIGQSGGGNPLSTIPPSRGVL
>P05460 ~~~psu~~~Polarity suppression protein~~~
MESTALQQAFDTCQNNKAAWLQRKNELAAAEQEYLRLLSGEGRNVSRLDELRNIIEVRKWQVNQAAGRYIRSHEAVQHIS
IRDRLNDFMQQHGTALAAALAPELMGYSELTAIARNCAIQRATDALREALLSWLAKGEKINYSAQDSDILTTIGFRPDVA
SVDDSREKFTPAQNMIFSRKSAQLASRQSV
>P69699 ~~~vpu~~~Protein Vpu~~~
MQPIQIAIVALVVAIIIAIVVWSIVIIEYRKILRQRKIDRLIDRLIERAEDSGNESEGEISALVEMGVEMGHHAPWDVDD
L
>P05923 ~~~vpu~~~Protein Vpu~~~
MQPIQIAIAALVVAIIIAIVVWSIVIIEYRKILRQRKIDRLIDRLIERAEDSGNESEGEISALVEMGVEMGHHAPWDIDD
L
>P05919 ~~~vpu~~~Protein Vpu~~~
MQPIPIVAIVALVVAIIIAIVVWSIVIIEYRKILRQRKIDRLIDRLIERAEDSGNESEGEISALVEMGVEMGHHAPWDVD
DL
>Q70625 ~~~vpu~~~Protein Vpu~~~
MQPIQIAIVALVVAIIIAIVVWSIVIIEYRKILRQRKIDRLIDRLIERAEDSGNESEGEISALAEMGVEMGHHAPWDVDD
L
>P19554 ~~~vpu~~~Protein Vpu~~~
MQPLQILAIVALVVAAIIAIVVWTIVYIEYRKILRQRKIDRLIDRITERAEDSGNESEGDQEELSALVERGHLAPWDVDD
L
>P18099 ~~~vpx~~~Protein Vpx~~~
MTDPRERVPPGNSGEETIGEAFEWLERTIEALNREAVNHLPRELIFQVWQRSWRYWHDEQGMSASYTKYRYLCLMQKAIF
THFKRGCTCWGEDMGREGLEDQGPPPPPPPGLV
>P18045 ~~~vpx~~~Protein Vpx~~~
MTDPRERVPPGNSGEETIGEAFEWLDRTIEALNREAVNHLPRELIFQVWQRSWRYWHDDQGMSPSYTKYRYLCLMQKAVF
IHFKRGCTCLGGGHGPGGWRSGPPPPPPPGLV
>P06939 ~~~vpx~~~Protein Vpx~~~
MTDPRETVPPGNSGEETIGEAFAWLNRTVEAINREAVNHLPRELIFQVWQRSWRYWHDEQGMSESYTKYRYLCIIQKAVY
MHVRKGCTCLGRGHGPGGWRPGPPPPPPPGLV
>P12454 ~~~vpx~~~Protein Vpx~~~
MTNPRETIPPGNSGEETIEEAFDWLDRTVEAINREAVNHLPRELIFQVWQRSWRYWHDEQGMSRSYTKYRYLCLMQKAVF
MHFKKGCTCRGEGHGPGGWRSGPPPPPPPGLV
>P19508 ~~~vpx~~~Protein Vpx~~~
MSDPRERIPPGNSGEETIGEAFDWLDRTVEEINRAAVNHLPRELIFQVWRRSWEYWHDEMGMSVSYTKYRYLCLIQKAMF
MHCKKGCRCLGGEHGAGGWRPGPPPPPPPGLA
>P03690 ~~~rIIA~~~Protein rIIA~~~
MIITTEKETILGNGSKSKAFSITASPKVFKILSSDLYTNKIRAVVRELITNMIDAHALNGNPEKFIIQVPGRLDPRFVCR
DFGPGMSDFDIQGDDNSPGLYNSYFSSSKAESNDFIGGFGLGSKSPFSYTDTFSITSYHKGEIRGYVAYMDGDGPQIKPT
FVKEMGPDDKTGIEIVVPVEEKDFRNFAYEVSYIMRPFKDLAIINGLDREIDYFPDFDDYYGVNPERYWPDRGGLYAIYG
GIVYPIDGVIRDRNWLSIRNEVNYIKFPMGSLDIAPSREALSLDDRTRKNIIERVKELSEKAFNEDVKRFKESTSPRHTY
RELMKMGYSARDYMISNSVKFTTKNLSYKKMQSMFEPDSKLCNAGVVYEVNLDPRLKRIKQSHETSAVASSYRLFGINTT
KINIVIDNIKNRVNIVRGLARALDDSEFNNTLNIHHNERLLFINPEVESQIDLLPDIMAMFESDEVNIHYLSEIEALVKS
YIPKVVKSKAPRPKAATAFKFEIKDGRWEKRNYLRLTSEADEITGYVAYMHRSDIFSMDGTTSLCHPSMNILIRMANLIG
INEFYVIRPLLQKKVKELGQCQCIFEALRDLYVDAFDDVDYDKYVGYSSSAKRYIDKIIKYPELDFMMKYFSIDEVSEEY
TRLANMVSSLQGVYFNGGKDTIGHDIWTVTNLFDVLSNNASKNSDKMVAEFTKKFRIVSDFIGYRNSLSDDEVSQIAKTM
KALAA
>P03704 ~~~2~~~Bacterial RNA polymerase inhibitor~~~
MSNVNTGSLSVDNKKFWATVESSEHSFEVPIYAETLDEALELAEWQYVPAGFEVTRVRPCVAPK
>P03689 ~~~P~~~Replication protein P~~~
MKNIAAQMVNFDREQMRRIANNMPEQYDEKPQVQQVAQIINGVFSQLLATFPASLANRDQNEVNEIRRQWVLAFRENGIT
TMEQVNAGMRVARRQNRPFLPSPGQFVAWCREEASVTAGLPNVSELVDMVYEYCRKRGLYPDAESYPWKSNAHYWLVTNL
YQNMRANALTDAELRRKAADELVHMTARINRGEAIPEPVKQLPVMGGRPLNRAQALAKIAEIKAKFGLKGASV
>P60170 ~~~GP~~~Pre-small/secreted glycoprotein~~~
MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVP
SATKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF
LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLT
YVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKTSLEKFAVKSCLSQLYQTEPKTSVV
RVRRELLPTQGPTQQLKTTKSWLQKIPLQWFKCTVKEGKLQCRI
>P05461 ~~~sid~~~Glycoprotein 3~~~
MSDHTIPEYLQPALAQLEKARAAHLENARLMDETVTAIERAEQEKNALAQADGNDADDWRTAFRAAGGVLSDELKQRHIE
RVARRELVQEYDNLAVVLNFERERLKGACDSTATAYRKAHHHLLSLYAEHELEHALNETCEALVRAMHLSILVQENPLAN
TTGHQGYVAPEKAVMQQVKSSLEQKIKQMQISLTGEPVLRLTGLSAATLPHMDYEVAGTPAQRKVWQDKIDQQGAELKAR
GLLS
>Q80874 ~~~~~~Suppressor of RNA silencing~~~
MMATFSCVCCGTSTTSTYCGKRCERKHVYSETRNKRLELYKKYLLEPQKCALNGIVGHSCGMPCSIAEEACDQLPIVSRF
CGQKHADLYDSLLKRSEQELLLEFLQKKMQELKLSHIVKMAKLESEVNAIRKSVASSFEDSVGCDDSSSVSKL
>Q85447 ~~~~~~Suppressor of RNA-mediated gene silencing~~~
MEVDTATFVRLHHELLCAHEGPSIISKFDAIKKVKLGTLANQSGGVNNITEAFLAKLRNFERKSEAYLASDLAERELTRD
THKAIVFVTKSVLLGGKSLKDLLPYGVIVCAFIFIPETASVLDNVPVMIGNQKRPLTVALIKYIAKSLNCDLVGDSYDTF
YYCNSSAYGKNLISVSDNDFSNPQRALLSVGDLCYQAARSLHVAAANYIRIFDRMPPGFQPSKHLFRIIGVLDMETLKTM
VTSNIAREPGMFCHDNVKDVLHRIGVYSPNHHFSAVILWKGWASTYAYMFNQEQLNMLSGTSGLAGDFGKYKLTYGSTFD
EGVIHVQYQFVTPEVVRKRNIYPDLSALKGGGS
>Q85434 ~~~~~~Suppressor of RNA-mediated gene silencing~~~
MEVDTATFVRLHHELLCAHEGPSIISKFDAIKKVKLGTLANQSGGANNITEAFFDKLRNFERKSEAYLASDLAERELTRD
THKAIVFVTKSVLLGGKSLKDLLPYGVIVCAFIFIPETASVLDNVRVMIGNQKRPLTVALIKYMAKSLNCDLVGDSYDTF
YYCNSSAYGKNLISVSENDFSNPQRALLSVGDLCYQAARSIHVAAANYIRIFDRMPPGFQPSKHLFRIIGVLDMETLKTM
VTSNIAREPGMFSHDNVKDVLHRTGVFSPNHHFSAVILWRGWASTYAYMFNQEQLNMLSGTSGLAGDFGKYKLTYGSTFD
EGVIHVQSQFVTPEAVRKRNIYPDLSALKGGSS
>Q67897 ~~~p3~~~Suppressor of RNA silencing p3~~~
MNVSLYYSGTPVSSHSLLSKNGLSNIVLTCKDLPIPIDLLSLFFDILNERHPSFDEHMFLQMIRKPDDPENLSVFLKSAI
WMLSHKRDLPGHYRLPLTCLVSTYSEYFVELKPRQPSTKCWFCKIAKDGLPFRVEGVHGFPSEAELYIVPSKEHAIESFE
VLSGKKLYRSPSKKKHGYLIASNKPPLTSKYVEYDPSKPDTKP
>P26658 ~~~p3~~~Suppressor of RNA silencing p3~~~
MNVFTSSVGSVEFDHPLLLENDLTSLSINCDDVHCSSRALCYIYDIHSSRHPSIDEHQFLRLLHGPDDAVTLGSFLKTLI
WILSHDKNLPEEYRLPTIMMSSSYVKFFTEVKPRPPSTNCWTCRMSKDNLPFTVPSVKGFPPDAELYIVPISDHDGKPVK
FDNRKTLYRSPSKKRHKYVISSDKPPLSARYVKYVDSSALESLPGSSPAVL
>P13310 ~~~vs~~~Valyl--tRNA ligase modifier~~~
MTKILVLCIGLISFSASASADTSYTEIREYVNRTAADYCGKNKACQAEFAQKLIYAYKDGERDKSSRYKNDTLLKRYAKK
WNTLECSVAEEKDKAACHSMVDRLVDSYNRGLSTR
>A0A7H0DNA6 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENSGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQSGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLAIENDSQFTC
SLAVESELPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKHFTYDWS
>P68719 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENSGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQFGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLVIENDSQFTC
SLAVESKLPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKYFT
>P20986 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENSGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQFGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLVIENDSQFTC
SLAVESELPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKYFT
>P68720 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENSGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQFGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLVIENDSQFTC
SLAVESKLPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKYFT
>P0DSP9 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENIGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQSGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLVIENDSQFIC
SLAVESELPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKYFT
>P0DSQ0 ~~~~~~Intermediate transcription factor 3 small subunit~~~
MFEPVPDLNLEASVELGEVNIDQTTPMIKENIGFISRSRRLFAHRSKDDERKLALRFFLQRLYFLDHREIHYLFRCVDAV
KDVTITKKNNIIVAPYIALLTIASKGCKLTETMIEAFFPELYNEHSKKFKFNSQVSIIQEKLGYQSGNYHVYDFEPYYST
VALAIRDEHSSGIFNIRQESYLVSSLSEITYRFYLINLKSDLVQWSASTGAVINQMVNTVLITVYEKLQLVIENDSQFIC
SLAVESELPIKLLKDRNELFTKFINELKKTSSFKISKRDKDTLLKYFT
>P52443 ~~~~~~Protein U26~~~
MRRLTDSFILGLAKGAVIPGLYTFRMTEGRSPLGQIGVLITVAISFLLTFKRFDPRFYKPIGDFKIVFLSLMAPKLPSLL
SAVVMICLIFSEMRLRMILSRCVMIMPSYSPAVFTGIMVSLFFKSQMFDDYSVLITAASLLPITVRYGWMIRSSGFLLGL
QKYRPILKSTSFREVDLKCLVKFTVEFLLLFTILWIGKMFLSMPKSNHLFFLTVVNNVFFKLNVFKAAACAVVAILSGLM
MNVCLYRIIFEAFVGLGFSSIMLNLSSDLKDRSFYAGDLLNGFFCLVVCCMYFGV
>K7XWG4 ~~~vXCL1~~~Viral Lymphotactin~~~
MRLLTILALCCVAIWVVESIGIEVLHETICVSLRTQRIPIQKIKTYTIKEGAMRAVIFVTKRGLRICADPDAGWTKAAIT
TLDKKNKKNKQKFNTTTVIPTQVPVSTNETTTVYG
>P68927 ~~~xis~~~Excisionase~~~
MYLTLQEWNARQRRPRSLETVRRWVRECRIFPPPVKDGREYLFHESAVKVDLNRPVTGSLLKRIRNGKKAKS
>P04889 ~~~xis~~~Excisionase~~~
MESHSLTLDEACAFLKISRPTATNWIRTGRLQATRKDPTKPKSPYLTTRQACIAALQSPLHTVQVSAGDDITEELKCHYS
AEVKPGTPVSHCRTAKDLSSLLGQRTKGRPQSFMTS
>P03699 ~~~xis~~~Excisionase~~~
MYLTLQEWNARQRRPRSLETVRRWVRECRIFPPPVKDGREYLFHESAVKVDLNRPVTGGLLKRIRNGKKAKS
>O55777 ~~~P/V/C~~~Non-structural protein V~~~
MDKLDLVNDGLDIIDFIQKNQKEIQKTYGRSSIQQPSTKDRTRAWEDFLQSTSGEHEQAEGGMPKNDGGTEGRNVEDLSS
VTSSDGTIGQRVSNTRAWAEDPDDIQLDPMVTDVVYHDHGGECTGHGPSSSPERGWSYHMSGTHDGNVRAVPDTKVLPNA
PKTTVPEEVREIDLIGLEDKFASAGLNPAAVPFVPKNQSTPTEEPPVIPEYYYGSGRRGDLSKSPPRGNVNLDSIKIYTS
DDEDENQLEYEDEFAKSSSEVVIDTTPEDNDSINQEEVVGDPSDQGLEHPFPLGKFPEKEETPDVRRKDSLMQDSCKRGG
VPKRLPMLSEEFECSGSDDPIIQELEREGSHPGGSLRLREPPQSSGNSRNQPDRQLKTGDAASPGGVQRPGTPMPKSRIM
PIKKGHRREVSICWDGRRAWVEEWCNPVCSRITPQPRKQECYCGECPTECSQCCHEE
>P0C774 ~~~P/V~~~Non-structural protein V~~~
MAEEQARHVKNGLECIRALKAEPIGSLAVEEAMAAWSEISDNPGQDRATCKEEEAGSSGLSKPCLSAIGSTEGGAPRIRG
QGSGESDDDAETLGIPSRNLQASSTGLQCYHVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSGGDDESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLKLQSRGNNFPKLGKTLNVPPPPNPSRASTSETPIKKGHRREIGLIW
NGDRVFIDRWCNPMCSKVTLGTIRARCTCGECPRVCEQCRTDTGVDTRIWYHNLPEIPE
>Q9EMA9 ~~~P/V~~~Non-structural protein V~~~
MAEEQARHVKNGLECIRALKAEPIGSLAIEEAMAAWSEISDNPGQERATCREEKAGSSGLSKPCLSAIGSTEGGAPRIRG
QGPGESDDDAETLGIPPRNLQASSTGLQCYYVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSGGDNESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLRLQSRGNNFPKLGKTLNVPPPPDPGRASTSETPIKKGHRREISLIW
NGDRVFIDRWCNPMCSKVTLGTIRARCTCGECPRVCEQCRTDTGVDTRIWYHNLPEIPE
>P60168 ~~~P/V~~~Non-structural protein V~~~
MAEEQARHVKNGLECIRALKAEPIGSLAIGEAMAAWSEISDNPGQERATYKEEKAGGSGLSKPCLSAIGSTEGGAPRIRG
QGSGESDDDTETLGIPSRNLQASSTGLQCHYVYDHSGEAVKGIQDADSIMVQSGLDGDSTLSEGDNESENSDVDIGEPDT
EGYAITDRGSAPISMGFRASDVETAEGGEIHELLRLQSRGNNFPKLGKTLNVPPPPDPGRASTSETPIKKGHRREISLIW
DGDRVFIDRWCNPMCSKVTLGTIRARCTCGECPRVCEQCRTDTGVDTRIWYHNLPEIPE
>P30927 ~~~P/V~~~Non-structural protein V~~~
MDQFIKQDETGDLIETGMNVANHFLSAPIQGTNSLSKASIIPGVAPVLIGNPEQKNIQHPTASHQGSKSKGRGSGVRSII
VPPSEAGNGGTQIPEPLFAQTGQGGIVTTVYQDPTIQPTGSYRSVELTKIGKERMINRFVEKPRISTPVTEFKRGAGSGC
SRPDNPRGGHRREWSLSWVQGEVRVFEWCNPICSPITAAARFHSCKCGNCPAKCDQCERDYGPP
>P30928 ~~~P/V~~~Non-structural protein V~~~
MDQFIKQDETGDLIETGMNVANHFLSAPIQGTNSLSKATIIPGVAPVLIGNPEQKNIQYPTTSHQGSKSKGRGSGARPII
VSSSEGGTGGTQVPEPLFAQTGQGGIVTTVYQDPTIQPTGSYRSVELAKIGKERMINRFVEKPRTSTPVTEFKRGAGSGC
SRPDNPRGGHRREWSLSWVQGEVRVFEWCNPICSPITAAARFHSCKCGNCPAKCDQCERDYGPP
>P0C765 ~~~V~~~Protein V~~~
MATFTDAEIDELFETSGTVIDNIITAQGKPAETVGRSAIPQGKTKVLSAAWEKHGSIQPPASQDNPDRQDRSDKQPSTPE
QTTPHDSPPATSADQPPTQATDEAVDTQLRTGASNSLLLMLDKLSNKSSNAKKGPMVEPPRGESPTSDSTAGESTQSRKQ
SGKTAEPSQGRPWKPGHRREHSISWTMGGVTTISWCNPSWSPIKAEPKQYPCFCGSFPPTCRLCASDDVYYGGDFPKSK
>Q997F2 ~~~P/V/C~~~Non-structural protein V~~~
MDKLELVNDGLNIIDFIQKNQKEIQKTYGRSSIQQPSIKDQTKAWEDFLQCTSGESEQVEGGMSKDDGDVERRNLEDLSS
TSPTDGTIGKRVSNTRDWAEGSDDIQLDPVVTDVVYHDHGGECTGYGFTSSPERGWSDYTSGANNGNVCLVSDAKMLSYA
PEIAVSKEDRETDLVHLENKLSTTGLNPTAVPFTLRNLSDPAKDSPVIAEHYYGLGVKEQNVGPQTSRNVNLDSIKLYTS
DDEEADQLEFEDEFAGSSSEVIVGISPEDEEPSSVGGKPNESIGRTIEGQSIRDNLQAKDNKSTDVPGAGPKDSAVKEEP
PQKRLPMLAEEFECSGSEDPIIRELLKENSLINCQQGKDAQPPYHWSIERSISPDKTEIVNGAVQTADRQRPGTPMPKSR
GIPIKKGHRREISICWDGKRAWVEEWCNPACSRITPLPRRQECQCGECPTECFHCG
>P35941 ~~~P/V~~~Non-structural protein V~~~
MAEEQAYHVSKGLECIKALRENPPNMEEIQEVSNIRDQTYKSSKESGTTGVQEEEITQNIDESHTPTKRSNSVSDVLQED
QRGREDNTAPVEAKDRIEEDTQTGPAVRRYYVYDHCGEKVKGIEDADSLMVPAGPPSNRGFEGREGSLDDSIEDSSEDYS
EGNASSNWGYTFGLNPDRAADVSMLMEEELTALLGTGHNAGGQKRDGRTLQFPNSPEGSIGNQACEPIKKGHRREVSLTW
NDDRCWIDKWCNPICTQVNWGVIRAKCICGECPPVCDDCKDDPEMQNRIWYETPTRETK
>P19847 ~~~P/V~~~Non-structural protein V~~~
MAEEPTYTTEQVDELIHAGLGTVDFFLSRPIDAQSSLGKGSIPPGVTAVLTSAAEAKSKPVAAGPVKPRRKKVISNTTPY
TIADNIPPEKLPINTPIPNPLLPLARPHGKMTDIDIVTGNITEGSYKGVELAKLGKQTLLTRFTSNEPVSSAGSAQDPNF
KRGGANRERARGNHRREWSIAWVGDQVKVFEWCNPRCAPVTASARKFTCTCGSCPSICGECEGDH
>P11207 ~~~P/V~~~Non-structural protein V~~~
MDPTDLSFSPDEINKLIETGLNTVEYFTSQQVTGTSSLGKNTIPPGVTGLLTNAAEAKIQESTNHQKGSVGGGAKPKKPR
PKIAIVPADDKTVPGKPIPNPLLGLDSTPSTQTVLDLSGKTLPSGSYKGVKLAKFGKENLMTRFIEEPRENPIATSSPID
FKRGRDTGGFHRREYSIGWVGDEVKVTEWCNPSCSPITAAARRFECTCHQCPVTCSECERDT
>P60169 ~~~P/V~~~Non-structural protein V~~~
MAEEQAYHVNKGLECIKALRARPLDPLVVEEALAAWVETSEGQTLDRMSSDEAEADHQDISKPCFPAAGPGKSSMSRCHD
QGLGGSNSCDEELGAFIGDSSMHSTEVQHYHVYDHSGEKVEGVEDADSILVQSGADDGVEVWGGDEESENSDVDSGEPDP
EGSAPADWGSSPISPATRASDVETVEGDEIQKLLEDQSRIRKMTKAGKTLVVPPIPSQERPTASEKPIKKGHRREIDLIW
NDGRVFIDRWCNPTCSKVTVGTVRAKCICGECPRVCEQCITDSGIENRIWYHNLADIPE
>P69280 ~~~P/V/C~~~Protein V~~~
MDQDAFILKEDSEVEREAPGGRESLSDVIGFLDAVLSSEPTDIGGDRSWLHNTINTPQGPGSAHRAKSEGEGEVSTPSTQ
DNRSGEESRVSGRTSKPEAEAHAGNLDKQNIHRAFGGRTGTNSVSQDLGDGGDSGILENPPNERGYPRSGIEDENREMAA
HPDKRGEDQAEGLPEEVRGGTSLPDEGEGGASNNGRSMEPGSSHSARVTGVLVIPSPELEEAVLRRNKRRPTNSGSKPLT
PATVPGTRSPPLNRYNSTGSPPGKPPSTQDEHINSGDTPAVRVKDRKPPIGTRSVSDCPANGRPIHPGLESDSTKKGHRR
EHIIYERDGYIVDESWCNPVCSRIRIIPRRELCVCKTCPKVCKLCRDDIQCMRPDPFCREIFRS
>P69287 ~~~P/V/C~~~Protein V~~~
MDQDALISKEDSEVEREASGGRESLSDVIGFLDAVLSSEPTDIGGDRSWLHNTINTLQRPGSTHRVKGEGEGEVSTSSTQ
DNRSGEESRVSGGTSEPEAEAHARNVDKQNIHWATGRGASTDSVPQDLGNGRDSGILEDPPNEGGYPRSGAEDENREMAA
NPDKRGEDQAEGLPEEIRRSAPLPDEREGRADNNGRGVEPGSPHSARVTGVLVIPSPELEEAVLQRNKRRPANSGSRSLT
PVVVPSTRSPPPDHDNSTRSPPRKPPTTQDEHTNPRNTPAVRIKDRRPPTGTRSAPDRPTDGYPTHPSPETDATKKGHRR
EHIIYERDGYIVNESWCNPVCSRIRVVPRRELCVCKACPKICKLCRDGI
>P69282 ~~~P/V/C~~~Protein V~~~
MDQDAFILKEDSEVEREAPGGRESLSDVIGFLDAVLSSEPTDIGGDRSWLHNTINTPQGPGSAHRAKSEGEGEVSTPSTQ
DNRSGEESRVSGRTSKPEAEAHAGNLDKQNIHRAFGGRTGTNSVSQDLGDGGDSGILENPPNERGYPRSGIEDENREMAA
HPDKRGEDQAEGLPEEVRGSTSLPDEGEGGASNNGRSMEPGSSHSARVTGVLVIPSPELEEAVLRRNKRRPTNSGSKPLT
PATVPGTRSPPLNRYNSTGSPPGKPPSTQDEHINSGDTPAVRVKDRKPPIGTRSVSDCPANGRSIHPGLETDSTKKGHRR
EHIIYERDGYIVDESWCNPVCSRIRIIPRRELCVCKTCPKVCKLCRDDIQCMRPDPFCREIFRS
>P10104 ~~~wac~~~Fibritin~~~
MTDIVLNDLPFVDGPPAEGQSRISWIKNGEEILGADTQYGSEGSMNRPTVSVLRNVEVLDKNIGILKTSLETANSDIKTI
QGILDVSGDIEALAQIGINKKDISDLKTLTSEHTEILNGTNNTVDSILADIGPFNAEANSVYRTIRNDLLWIKRELGQYT
GQDINGLPVVGNPSSGMKHRIINNTDVITSQGIRLSELETKFIESDVGSLTIEVGNLREELGPKPPSFSQNVYSRLNEID
TKQTTVESDISAIKTSIGYPGNNSIITSVNTNTDNIASINLELNQSGGIKQRLTVIETSIGSDDIPSSIKGQIKDNTTSI
ESLNGIVGENTSSGLRANVSWLNQIVGTDSSGGQPSPPGSLLNRVSTIETSVSGLNNAVQNLQVEIGNNSAGIKGQVVAL
NTLVNGTNPNGSTVEERGLTNSIKANETNIASVTQEVNTAKGNISSLQGDVQALQEAGYIPEAPRDGQAYVRKDGEWVFL
STFLSPA
>Q9ZX29 ~~~whiBTM4~~~Probable transcriptional regulator WhiBTM4~~~
MHMHMGGDPSAICAQTDPELWFPDKGQSTRDAKRMCMRCPLLDECRALALRDPHLVGVWGGLSAQERRRIRKGASA
>P0C1C6 ~~~P/V/C~~~Protein W~~~
MDKLDLVNDGLDIIDFIQKNQKEIQKTYGRSSIQQPSTKDRTRAWEDFLQSTSGEHEQAEGGMPKNDGGTEGRNVEDLSS
VTSSDGTIGQRVSNTRAWAEDPDDIQLDPMVTDVVYHDHGGECTGHGPSSSPERGWSYHMSGTHDGNVRAVPDTKVLPNA
PKTTVPEEVREIDLIGLEDKFASAGLNPAAVPFVPKNQSTPTEEPPVIPEYYYGSGRRGDLSKSPPRGNVNLDSIKIYTS
DDEDENQLEYEDEFAKSSSEVVIDTTPEDNDSINQEEVVGDPSDQGLEHPFPLGKFPEKEETPDVRRKDSLMQDSCKRGG
VPKRLPMLSEEFECSGSDDPIIQELEREGSHPGGSLRLREPPQSSGNSRNQPDRQLKTGDAASPGGVQRPGTPMPKSRIM
PIKKGAQTRSLNMLGRKTCLGRRVVQPGMFADYPPTKKARVLLRRMSN
>P0C1C7 ~~~P/V/C~~~Protein W~~~
MDKLELVNDGLNIIDFIQKNQKEIQKTYGRSSIQQPSIKDQTKAWEDFLQCTSGESEQVEGGMSKDDGDVERRNLEDLSS
TSPTDGTIGKRVSNTRDWAEGSDDIQLDPVVTDVVYHDHGGECTGYGFTSSPERGWSDYTSGANNGNVCLVSDAKMLSYA
PEIAVSKEDRETDLVHLENKLSTTGLNPTAVPFTLRNLSDPAKDSPVIAEHYYGLGVKEQNVGPQTSRNVNLDSIKLYTS
DDEEADQLEFEDEFAGSSSEVIVGISPEDEEPSSVGGKPNESIGRTIEGQSIRDNLQAKDNKSTDVPGAGPKDSAVKEEP
PQKRLPMLAEEFECSGSEDPIIRELLKENSLINCQQGKDAQPPYHWSIERSISPDKTEIVNGAVQTADRQRPGTPMPKSR
GIPIKKGAQTRNIHLLGRKTCLGRRVVQPGMFEDHPPTKKARVSMRRMSN
>P0DOF0 ~~~P/X~~~X protein~~~
MSSDLRLTLLELVRRLNGNATIESGRLPGGRRRSPDTTTGTTGVTKTTEGPKECIDPTSRPAPEGPQEEPLHDLRPRPAN
RKGAAVE
>P69713 ~~~X~~~Protein X~~~
MAARLYCQLDPSRDVLCLRPVGAESRGRPLSGPLGTLSSPSPSAVPADHGAHLSLRGLPVCAFSSAGPCALRFTSARCME
TTVNAHQILPKVLHKRTLGLPAMSTTDLEAYFKDCVFKDWEELGEEIRLKVFVLGGCRHKLVCAPAPCNFFTSA
>P0C686 ~~~X~~~Protein X~~~
MAARVCCQLDPARDVLCLRPVGAESRGRPVSGPFGPLPSPSSSAVPADHGARLSLRGLPVCAFSSAGPCALRFTSARRME
TTVNAHQVLPKVLHKRTLGLSAMSTTDLEAYFKDCLFKDWEELGEEIRLMVFVLGGCRHKLVCSPAPCNFFTSA
>Q69027 ~~~X~~~Protein X~~~
MAARLCCQLDPARDVLCLRPVGAESRGRPVSGPFGPLPSPSSSAVPADHGAHLSLRGLPVCAFSSAGPCALRFTSARSME
TTVNAHQVLPKVLHKRTLGLSAMSTTDLEAYFKDCLFKDWEELGEEIRLKVFVLGGCRHKLVCSPAPCNFFPSA
>P03165 ~~~X~~~Protein X~~~
MAARLCCQLDPARDVLCLRPVGAESRGRPFSGSLGTLSSPSPSAVPTDHGAHLSLRGLPVCAFSSAGPCALRFTSARRME
TTVNAHQILPKVLHKRTLGLSAMSTTDLEAYFKDCLFKDWEELGEEIRLKVFVLGGCRHKLVCAPAPCNFFTSA
>Q80IU5 ~~~X~~~Protein X~~~
MAARLCCQLDPARDVLCLRPVGAESCGRPVSGSLGGLSSPSPSAVPADHGAHLSLRGLPVCAFSSAGPCALRFTSARRME
TTVNAHQILPKVLHKRTLGLSAMSTTDLEAYFKDCLFKDWEELGEEIRLKVFVLGGCRHKLVCVPAPCNFFTSA
>Q99HR6 ~~~X~~~Protein X~~~
MAARMCCQLDPARDVLCLRPVGAESRGRPLPGPLGALPPSSASAVPADHGSHLSLRGLPVCSFSSAGPCALRFTSARRME
TTVNAPWSLPTVLHKRTIGLSGRSMTWIEEYIKDCVFKDWEELGEEIRLKVFVLGGCRHKLVCSPAPCNFFTSA
>P03167 ~~~X~~~Protein X~~~
MAARLCCQLDPARDVLLLRPFGSQSSGPPFPRPSAGSAASPASSLSASDESDLPLGRLPACFASASGPCCLVVTCAELRT
MDSTVNFVSWHANRQLGMPSKDLWTPYIRDQLLTKWEEGSIDPRLSIFVLGGCRHKCMRLP
>P11294 ~~~X~~~Protein X~~~
MAARLCCQLDSARDVLLLRPIGPQSSGPPFPRPAAGSAASSASSPSPSDESDLPLGRLPACFASASGPCCLVFTCADLRT
MDSTVNFVSWHAKRQLGMPSKDLWTPYIKDQLLTKWEEGSIDPRLSIFVLGGCRHKCMRLL
>P39487 ~~~~~~Uncharacterized 14.7 kDa protein in Gp60-mobA intergenic region~~~
MTCFKNEKGEVFRLHVNDPRIKTEKLVGVGNTVAATAKAAELEKAKPWYNKSATNPEAVKLIPNLYEWYVTKYDPDHYKR
TGVAKWKSVNNITVNSKLFGRAFNEFKRGWIPDEKFYEVYNEICKN
>Q914L6 ~~~~~~Uncharacterized protein 14~~~
MEKQDLEKIESDIINDWTEADDLDDALDFLFMEKVSEFKIKFKDPLKVTEEEYRELLGNYDSSNSVSSNGITIDQYTYDE
DDDIMYKLEFTYRKEDNKIYIYEVQGWREKKK
>Q06VN8 ~~~~~~Structural protein ORF43~~~
MTIKRNGYNEYISMSSNSSPLLLYDIISLTNGSNDNEIIINFRSHVLNSNIIELLSGAAPHYFDVRNILVRNMESQKIGA
SFTINSITFVSEYMQYFVLNGNENLPNVYIDERGVVAYRKVWRRFTLGSPMFVFDYDGKLVKFTIAVERAKEGITVSSIL
HTRVRILDANSEDDIAFISEAIVTMMIYYKNSELEISQYYAENLYDELPRVRVLRDSKRIRNKTARRIRTTSINDIRTVI
PVNNYARQCNRLPMLVDSIDDVPTELRGNTIKFPLHGEGGIDKPKYVYCRYSDAPYPGMMVNNLAGASKKFPYLPCCYKT
SQEVKDVNLAYSRDVMPEVSSVAPNRVQIYFNTQRLLPCNSFGRSPMCIEKFITGINYFDDRLHDDDDNDSVSEYYPTIS
KVTDDVNASTYYCIRGGVQKGPNSIIEAVLRSLQYLDGRDPRTLSITDSILNSERLKLRDCINVSAQELYDIPFDKRLDI
LSSTQTYLDGFRFVKALQYQYDVNIYIYSRNCGPVTRFIYNHNEISLKFEDNDNINKQQQQQRERNDDDDDDDDSTLVLP
YHNKSSAYIDTKIHRKSIVIYIHHGTEATTNATGYPHVEYIMFRKHKKVVEKLWDVYKAMLPVNITNVYGLSYYEADNDD
INETISKRVAENCRKKLYVSTGKLLKLVGENNDGDDDDSFIPKLQVLDRLGKCVQLDEYRISSLSDFIEPLPLPIVNENM
FYNNPSCLTQCYGFDAAFKVDRLSRLLLYITLYEMTLTGSITSVYFTINRQRFSELICDSGCVGKLLNNGCVTLQKYTHH
LVSRTTTTTTTISPSISQEDHENNIDDSNITYILVDSSDTAKRLKYNATLLIKRMTAKEIDNLKLSTHVLPLLRYERDFA
LSNTSHDNMVVVTSMDKFTTKSFDHRYIYKIPSTIIPNIGESLVLITNESKEQTELVTVTSIENFTRIHDHYENSDAIQF
KCVVNVWSGASTWKKSV
>Q70LE5 ~~~~~~Putative zinc finger protein ORF59a~~~
MIEVSSMERVYQCLRCGLTFRTKKQLIRHLVNTEKVNPLSIDYYYQSFSVSLKDVNKII
>P13314 ~~~~~~Uncharacterized 10.2 kDa protein in regB-denV intergenic region~~~
MIEDIKGYKPHTEEKIGKVNAIKDAEVRLGLIFDALYDEFWEALDNCEDCEFAKNYAESLDQLTIAKTKLKEASMWACRA
VFQPEEKY
>Q70LE8 ~~~~~~Uncharacterized protein ORF99~~~
MDTHEFHKLLIKVVDLFLEDRIKEFELKLNTTLDELEFEELIGKPDSSNSAENNGIFIDEYSYDASENAIKKLFVEYVRQ
PEFKYTVLSIKGVNDWVRE
>Q70LC3 ~~~~~~Uncharacterized protein ORF102~~~
MIVDKNKIVIPMSEFLDSMFLVIEKLGVHAEKKGSMIFLSSERVKLADWKQLGAMCSDCYHCKLPLSSFIEIVTRKAKDK
FLVMYNEKEVTLVARGVQTIQK
>Q8QL27 ~~~~~~Uncharacterized protein 114~~~
MNKVYLANAFSINMLTKFPTKVVIDKIDRLEFCENIDNEDIINSIGHDSTIQLINSLCGTTFQKNRVEIKLEKEDKLYVV
QISQRLEEGKILTLEEILKLYESGKVQFFEIIVD
>Q3V4Q3 ~~~~~~Structural protein ORF131~~~
MAKYEPKKGDYAGGAVKILDMFENGQLGYPEVTLKLAGEEANARRAGDERTKEAIHAIVKMISDAMKPYRNKGSGFQSQP
IPGEVIAQVTSNPEYQQAKAFLASPATQVRNIEREEVLSKGAKKLAQAMAS
>Q8QL44 ~~~~~~Uncharacterized protein 131~~~
MASLKEIIDELGKQAKEQNKIASRILKIKGIKRIVVQLNAVPQDGKIRYSLTIHSQNNFRKQIGITPQDAEDLKLIAEFL
EKYSDFLNEYVKFTPRNNNAIQEEEIDMEQQEEKEEKPREKGKKKSVEEEF
>Q06VD9 ~~~~~~Structural protein ORF142~~~
MNQNHTLDNERNDDDEHSNNHVDTNDMKNFISCIKSKSNINDDKEDQQRPVRYLSRTVINSMTRGQLLELVKSMYSDRSD
IYHMNSIQLLTLINKTIGHNGNYVQVVTKDYINDLNVTQLRIFCKELQLDTTHDVDRMNKPMLEILLKSYFKNNYPTSTN
DDNDNENRSDDDDDDDDYRNDREEVEDSDINKNVNISSYNNEDIEEVDERTKSLKIDRRLLKMHQDNAKLVDKIDMLKNT
IEDLQNRLLTAETTRDDILRRYSILENRENIRDDDDNRNSTTISNLQNLLNNANIAMTRSRINVADLTASLNERTKQLED
CLTRGREKDASIDKLNLKVTELEKLLQIQRDTHQDATLKISNIRMNDNNATRIQISSLNDELHRCQDQIKSTSITVYNRI
YSDLYERLWS
>Q06VD6 ~~~~~~Structural protein ORF147~~~
MLHINMHIIDIGINTKMKPTPRHIHMIVKSNYTNMELERNTFSSNMNVMKDELNEMRRKLELHEAEKRAEITKGSQKKSN
SSSSENLEKDNAIKSLRSQLNSVKVVNEQLKFNIRELQNSHSITIKKLEQIHRDNLTSVEISFANNIEKYKSRIDELTRV
NGELTKINATLSVAGPSASPLRKSGDDQRSYNRYTTNDINKLNRDRNQWLRELNIIRSENGELKRQIDSLNKLLEKSRNE
IKQIRSKYNSLETQKDYENGDLFAGMEENLDAAVSRCKELFDEKLRLENDIQRLMEEGAMNRDKIHRQNYDIEKLKNTIQ
ENLDKQFDLINNGYRYDDDDDDNCSGGVAVLKSRITNLESQIIELSADIEYRDGNIASLEKERDDLKQKYNNEKRNSRNA
SDGVAKLKINMKKSELERDKFREKCLKSDNIIMELKRQLEINNVDLERIRDERNEYQEKVKFLDAEILTNRNKFEEQTNA
LNNMLRQYSRRSINNSNPSLNDTRKLHMQLQLQIETLNNQLDTVKSERKIVLEEVIRLNHQIDTLKQIYQCSMQNTTILC
RNYVKNSDSKLEYLLNSHRDELKEYQDMLNISEAKLIQSEKQYQQLMADNSSAATEMRALEKRLASEMNRYQEELSQKDA
IISQLKSVIEPMQREIENLNTSRRNLQYNINNTTNDSELTTSSSSSNNMLLYDKIKQLENELNRYKESERAWRNERDLLV
SKMRNTVGANDLSLQSRIRTLEMDNKNLRENLQANNSTIANERANYIQQLDKLNEEIDMNNREMGKLQVEIASLNNQLNA
SNVLNESQIEKINRQNEELNSNLTEINALRDELNKRESDILVANRELNVLRRRVEKYKSSTTPSSEEKLTTPKRMQYVKT
IKTLRKEIADNKRQIKNLIRDRSQNDDHINQLRVNDQLLNVINSLKNQIRESNGEIKMYERKLYESNRQNVKLNDNMKRV
VGENESLNNRIKSIYNNYDKQIAEIGEEVAKKNELIKILTLSVENIESNNEDVRGDGVNTTNFFKNLLVTTIRERNDLDR
KVQELQSEVFINKSRLSECQATMVQLNNDLTMRVEENKNLERVIVELRERISNNESKLQELQLEDGGRMTRIPEYDDENI
RNYRLRLQQAQNIIKNNQNELSRLKNVDNDLRELRKKLLSQSGKEDNDNHISQELQNAMTQLDYSKNIIEEHVEQIQQLT
AKNEDTERALNELQGDMRALEVDRSNILTENASLNDELTSMKDDIDIMQRKFTQLQNDYYSTSRELELEKEERFKCNATN
VEIKQELRTLKSELLRLQKQCGGIKKCATKLQESIISGDEKLSSTVSSSSSSSPKTTTKRKRNNDDNTNDDYNNRKKVHL
SKSTSRSPKRNLKRKEINNDNDASTSKNRRSSSSRTTITPPIPRSRALRLSASSIEDLKIDSKRQRLLNAGIVQDTELTD
EEYDENFDRNLLSDDDTEEDIMRKRILSNPKYEFLNNTYNG
>Q7TLC7 ~~~~~~Uncharacterized protein 14~~~
MLPPCYNFLKEQHCQKASTQREAEAAVKPLLAPHHVVAVIQEIQLLAAVGEILLLEWLAEVVKLPSRYCC
>Q70LE6 ~~~~~~Uncharacterized protein ORF157~~~
MSEMSVVEYEVVSKNLTSKMSHELLFSVKKRWFVKPFRHDRQLGKLHYKLLPGNYIKFGLYVLKNQDYARFEIAWVHVDK
DGKIEERTVYSIETYWHIFIDIENDLNCPYVLAKFIEMRPEFHKTAWVEESNYSIAEDDIQMVESIKRYLERKIASD
>Q3V4T6 ~~~~~~Structural protein ORF273~~~
MGEKITEEREFQSISEIPEEEIDATNDEEKLADIVENEIEKEIRKSKTRKCKTIENFYYYILRDGKIYPASDYDIEVEKG
KRSANDIYAFVETDVTRDFDEFLFDIDYGLPSISDILKFYLEKAGFRIANEVPTPNLKYYIHAVVEFGEDRPQYLAVNIY
DIDSLARALRIPQIVEQKLGNKPRTITADEFNDIERIVAEEQPILAGYTYDEALRIPYHYYVDHNNSFKDDALKIAHAYL
QLFPTPYQVCYEWKARWFNKIDCLKLERLKPSS
>Q8QL46 ~~~~~~Uncharacterized protein 56B~~~
MQTQEQSQKKKQKAVFGIYMDKDLKTRLKVYCAKNNLQLTQAIEEAIKEYLQKRNG
>Q8QHM9 ~~~~~~Uncharacterized protein 56~~~
MKKEIQVQGVRYYVESEDDLVSVAHELAKMGYTVQQIANALGVSERKVRRYLESC
>Q5UPC4 ~~~~~~Uncharacterized protein L48~~~
MDIFLCPVCQSGYRYFYTLYLPNNNIDVILFCDECECVWIDPEYIDYQDAVSNDFLVDKYKVTSCKILFNKEISGWSTNK
DIKNSRWDNFIENYEQFVFQNIYHLDKNKRYPFLYLYATN
>Q5UPE0 ~~~~~~Uncharacterized protein L65~~~
MQKKVLFNDIVFVCFPITDNGSIIISDIGYSDDGYNRPTGRQGTIENGDPYRITVPAGYSYTNKNVQNGNMIRLVRVNDS
KNGCWYNGSGDGWLEIRDYVGPDWRADWTVQIINPANNDNSLYYGQHFRLMNRAQVENPTFQGPSDFASIALWGSNNTNN
SVMMLLNGPLDTAKLECCKDNPIFTQPDYCANYRGTTCSGQCDDILSNYCAQVTTTDPKCGCLLPASFYTQNSAIGPPEC
IDDRCVDTNSYRKSTQCHPNCQIVDCDININDFNGTNINKIVYEQECGSKSTPNGPNGPTPTPSNGPNGPTPVPGIPPAN
GSSTSFFSRYGLWIIIAIILLIVIISAVGIYFYLR
>Q5UPL1 ~~~~~~Uncharacterized protein L136~~~
MGLEKLTWVSEKKPDWSNVQKLIAACEATNQYTNIGPIISQLESFIRDSFLIEESKAVIVTSNGTSALHALVGGINRQLG
RELKFVTQSFTFPSSNQGPLKDSIIVDIDEDGGLDLNAVKNIEYDGIIVTNIHGNVVDINKYVDFCMNHNKLLIFDNAAT
GYTFYLGKNSCNYGHASIISFHHTKPFGFGEGGCIIVDRLYENNIRIGLNFGLDNSLGEKSQYSNQASNYRMCDLNAAFI
LSYLQNNYKKIINRHSEIYEIYKNNLPKRFKLFPNHSKKNPVCSSICLLFDKPFRLDKIPFLSRKYYKPLDLSSPVSLDF
YQRILCIPCNIDLTDRQIYEIIGVLNEFADKN
>Q5UQ21 ~~~~~~Uncharacterized protein L208~~~
MIFCEKCRYSFNITKDVKSVQVGGKVNIALNNLFGKFNKNQQIVESDLTKLKVTDVLYDERFENMTKKDKKKMMSLIKSV
NKSFFQEVGGQGELNTNKAYFICKYCKNYRPIEAGTTIYTKNYDTTDNSDIENYSFYIHDHTLQRTKNYICKNKDCKTHQ
NDNLKEAVLAKNSADQLVYVCTACTTHWINSI
>Q5UPU7 ~~~~~~Uncharacterized WD repeat-containing protein L264~~~
MNRNIRRVRREVDSYIPSNNVNVETEGYLIRRIPKKCKPKVTSTIAQRVSQLENEVAEINVALAEHVNELNSQEKRIDKL
EKTVKKKKSNCSDDSECSECSECSERSCCSKCNSNKCCCNNTCGVFPNSVEAFSYYGPISGNCGPFPGGNCGPFPGGPCG
PFAGGNCGPFSGGSCGPFPGGPCGPIPGGNCGPIPGGPCGPISGGPCGPISGGSCGPALPYSAAVDGYEYFQNGPMMSCP
PIGPMGPNGFVEEQFVGGNGPFIGGGGFIGPNNGFVEESWGNCNDCRRGKCKKHKNRRSKSDNSDLSEYSSSNSDDSECT
DSDGSSCSTDGSPDCTESENTESHRSHGKKKHRFVNKKRNTCNNSDNKDNSRINMNICDLLKLFGDKCPIKIDPNEECKQ
VPNEFIQIKCCQEQICCQNSCCANKNSCCTDNCCPKTKCCEKDDTNSYTIEWQKSPDCCKEIDYCEKPFSCETNYCGNSY
ETDYCPIYIKPDSINSNNYTNSSEKYQYNHYDSDNCSDNYSDNYSDNDSEKVYNINLDATNINTDPNYNHNFNNNHNFNN
NFTNNRMTDYSTDYISHNFNPGNNQYSQQISHEQNINQQQSQNSENLLDNNEYGDYENYYNMLRGDQNIFPVNNSSNFNQ
YESTDISNNQNFNSQIVDSIIDTNHNKDSIQVEFSNNNDKTHHEKNECHCHEHSQPCKTTSIVPFGTPIVDCNNCRNEEC
ITIIQTDSSCSSNKVPIIQPIEPETKTMSIIDTAIANIDTCTDLTLINQPKGISTDAAILWAAKIGNLNMNQGMKVTTDS
NNNIIVAGFYRADVTPEPSNDDVPPTVIYDSKGCGVKNITASGIEEIFIVKYDMYGNFLWFAKIESTFTDFTVSIAVDTE
DNVIVTGSFTDSAVNIYDSTSQLVKHIPQPIPEPSAEITQSFIVKYSPAGIYQWTAVLFTSGIAVIKSVTTDPNNNIYIT
GYYGGQTLTFQNSDGTDSYSLGANSLTNVFVAAYNYLGFVLWVTMCGNIGNTVSEQGYNEGLDIKYSPDQTIVVSGYYNT
NPLIIYDGPDGLTPSGISLTNVNNTNINITGNDINTTPDIFLIKFRLNGTALWATKISGTITQFNTSVWASEFSGISNQF
YTTIAIDPDANIIITGTYNQGPVQIFNTPSGTILSTINLIITGTISTYIIKYGPRGNAIWATRISGALSQVSNGIATDSD
SNIIVSGYFSAPVTVFYSSDGTTPFTLENVSEISAFTVKYDRCGNALWAVKQENNGITQALNVAVDNNDSVVIVGTFNQA
PINFYNSNKQMAKCIINDSSYDGYVAKYADFVQSLVLLPGCKQKDIAINESCYKRANTLVTYKAGTISNSVSNCLRGFLM
TRANSSIKLLPNGNNWLVDYSNNILFIYP
>Q5UPU8 ~~~~~~Uncharacterized protein L274~~~
MKNLTIGAIFLIFFAVSAFASAPCPKKLADTTCPREPYSFEGMKRQEPIKGRNVGWFQCQATMTVHDPVTTAVLVTQPTF
TFNLSTLYHNSCGCWKTVTTIPGDAVKMEYGNGLPGQAMYGSCFYATWADLAGNPVETVGFARNVVGVHRDLIYVSRATN
ELVVHQSIYPLGRDQVGDEYVLVRHFDPVTKGIDWVQAVYCTKIQDQLEPLP
>Q5UPY0 ~~~~~~Uncharacterized protein L294~~~
MNIAYKYIGYIYTTIMSNKITVNKDYTQATLNPSNTVDIGPRISNYQADYLFIHFDEIMRREIDRNPDIIYDNNFQEFMQ
YVSEYKRILVESRKHKLVPGLFGSTSEQFSCSTGSHIFTPEDRDKINEAIILGQLNPRKYLTENDLSLIGFNNHYIQECG
NRLSHNFFEHVALPTQRVWNNYLLSKSIEPNASITINTYQKLHEDFQKSMADINDQVINGDYATRYNIIEQKISDTVNFL
ENLISASNTEENIINNTLDLSNYYERLVFGEPNVDPSLVIPKDTIDLVERDALKKQIKASINRYSKMSTSSVLASTYNVA
SEVQNIRNVGSLATNLLDTLNNANNIINPPQELYDLMSTKLYIDDNGNQVAKYLNDTASNKLIQNEIDKQNSYLEFIRTN
ETITSRGKLIPVNSANDPLSRLNRYSHKYTYVPIAQGGNNSKSTKYQFNKKSTQNITISQSGGRDIDVEKLLDSSKQASN
VINQTISDIDQSRNFYTKQTELYFNNNNFVDIDNLNQRNQLLIEMINVLGIYNFLIGERSNNRSVFLSELDKRTKSFKQI
LDNYIKSVNSRIPPITDDKTIGQILANGFLSVSEIQNLITTSGLNINVEQIPEDAILNNATTAKLISTLIDKYSLSNMKS
VVSDLRKILNVAPLGITYQTNPEDSYKQLSSFYSILKSLNTSLLSEYQNLSTNTQNFQSIRDLTGKILESNISTFNTFTK
QLIESVRNFGSEIDGIQSIRKQILQSKRRLADIEFFTRESTYNLSPKSQSVSQQDADLLQSKNLGVAVMNHDLSSNEITD
FYQNYVTNYLDKIAGASVNVVKHYSYDKDIKTINDNKNMFNNTYDNYNTLLETLQTSANTTRTLRNIVSNNTTINSDASF
NSLARRIWDVINYFNGTFMPLNSDIMNKINIPGDLSEYLSKYLIVNPEIINNYLANSVNFDSTGKIQPTDKLSRKSQIDL
LVKTIPFYIKGAQLSDTYTTSQYSPGIPNQTFIGEVQDRLNQLIANETDDNAYNLYNYVTTSRNIILPEARIVGEISDQL
EDVYNSRFSTPTNRSEFQNSILSQLDTIIIGIYRQFYETLMYLYQLANNYILINNPHNPMYSSIQKDQHNLAALQQDIRI
MNLSNNFAVNGRDIPDIVPKSTGGPINCRNPLINKRFNKTIHRRINCLGLMIIDYNNNIGSDLSTITQQYNTFYQNSVPL
LNLLRDQTISGEQINAILDVNINALNLFVANPKLLPNSFTYGTRIQESNMITPNIINTFDINNLDPNKSVFNDYETTLSA
YIPSFVNDIIDINGQYIEYTNVRYHLISVGSISLPNVKIQAGDTISIRNIDDIQMLIDIGNNVVDFTNRLYKSSKQISNL
VVETVDTWGSSVLTDSDEFNLRRSKKFRLLQTLNTLRHIEDSIRSNRNITRLNQFLLEKEDPTNAVLIMSAISIVNNINL
TDRLFNENVDKIWIFARKLVNVWYTIFSQILSLLVVNGVVPATTNLIRSLDPFMRRSNLFNIDNVIDNTPNTNNYVDAIN
NEFNTKISQINNDKNKYPINLDPSKSTGIPDQDKYIIDNPNIISFNHYYNNISDKSSRVYILMAIMNDEGYRNINKYLES
ISDHTEYLINLVDFSENYLDNNFIQKKLDDLPIIDPKITNLVNAQTNFKQAFMLELQLELQALGLDQLYGPIESLVNNGT
NTIQTGLSNPKTITPVAILDPNRSIEIDNQVSLDIISSPRIFDPSEYLYLANEFITSRQSGRFFRLYLSDIYRVIFRSIT
NELDRTMNSINQDISATNVADLEEFRETYLNYITNKKFSTTLYLNPTDLSDSITAVPNDSNSPNIKAVLLLPNDYQTNSD
RINTINDIFRTVSETIKPNIYELFETEISNYNKLIQADYTVRKLITDYKSDVQNKLDKLRNIRNVISQVMFDNYFKQYAF
ISNGNLMTLLENTIENYETIWQLIEQKIYSIVNKNNYHILTMSQINNYQAFKSAINKLINNQAIIKKFYKRMSFGLIEYY
YDIMDSLVVCLESKNFEDMSDIEAYIYQYHYVQLKRCHALFRWIRQEYQRNKQAQDDVNSRNITPGTKYNRILDYKIELL
KTTGDVNSVFLEFQGLRRYLDEYSAIAMDKVQLHLRINDFVSNSYNNELNTLSNGRDISYMLDTDPNSNEYKNRWDNKQL
MFINQGNSNNLKINFDLLQKIYQFNNPSSPPRDFEAYYTATYRRMKNNQGIDFQRIYNTNVFPESDVISNYMSIAPNILN
NKGTVIMTYGYSGVGKSASLFGRKMDLSRGVDKPSNGILQATLDQLTNVEIYFRVFEIYGLGTQYNYYWNPTENNNYQCY
PDFYQCIIHHVLDTTDSTTLKTKDHLVFTNRHDMLAYIMDLQNPKNGTQFTINNKNDPNLSNKTTFFNTIGQMVKSTYSK
ITEDHYRNFTDFVDDIDRVRSDGIHIKKVFDHIVKQIKGTINNPISSRSILVYDFEINLDPGSSNPIFIPFLIYDLPGKE
DISRTYVDTSITPAIQGSSIEKINLRRRVFKDIDPPSTTPGVHNKERKSTYVLNPLLIPVFDNNIQIITDILGEISSKNT
MISRLDVNFEATIVTDILNFVVDNFGVDNKGENVPSVQYPMNSFYKNPGGITTLVQLLSDNELIDTYKSTRYADVLPLII
GKGILSVNIIAGNRTYNPNENIKEIKILIGVVIIGHLIKYRLFDVVVEIINRIVEGPGNPNQNDDGGWSVSKIYAFFEAY
YINENVVGLLQYLITNVLNKSSNSSGILEQVSTINNNNIKDTISQSYRTANAYSIIKSQLRIYPKTPLPDDYNIKVNSNL
LVSSDVLKTLEIKEFMDNNDIQPDGQFFTPKVTPITELARRMDNVISFENRSDYDNNRIFRSGSSNFNCYDSNDVNDKIL
INPRRAIFNTAPSTMTETNRPLLQDFIEPYEQKISFYYIFYVVSNSQSRNKAEEQVKLLNNSMPFIDKMDPVSKKKQCV
>Q5UPZ5 ~~~~~~Uncharacterized protein L309~~~
MISNNTITIILVIAIVAIVFYIYFQRNKSRKDSHKNPGNYQNFGTLNNSSDAKIMKKSVKKSTKKKSTKKLLKKSNQNNQ
QNNQQNNQQNNQQKKVRFDKTVKYNIYKQQSPEISSHNSNNSPFDVDTILNSIYSDQSDISEDSNCSELSEMSELSEESN
HSELSDTNNNLCCNIIPSNLEDNTNNYWDSSFGLPLATDEEKNKFAKQIKKNHKNYEKALGNFTKHKTDDNVIIKTDTTI
EMFKSPYTSDEDETELQRKPRTVKDIYDNKVAGPKAKPKKIKYKTANMVMYENENEMNGGFIKGTNIHGFDGNAGFKSAD
ICDEF
>Q5UQS3 ~~~~~~Uncharacterized protein L330~~~
MAFNNSTIIIIIVIAFAFFLIYSQNNQPKIIQQPVPQISQFKSQLNQPQNSQHNGHLNPSIISPQLCPKCDKENCSLEQI
SPSRSKSPTPQITNVHIEHESDPYSDPIKKQDIYGMMDPLTFPQQRLPREVLQKYQEYYDKNGSYPPFGQNTQPLFDNPV
LAGILIKQVDENEPFTDNVPSSIPLFKVKSNKNSNRFFYYIIDQRYFSKLELKIPLDSIRVNGVRYNNAEFYGIPELFDG
DVIDNIALYPSNRFSVKLYKIYSFP
>Q5UQU0 ~~~~~~Uncharacterized protein L352~~~
MNNYFDAGNFNTVFNNPNQMSVHGYDPKSGFNNHGFMNRNNLLSNNLCNNLLDEEITEYSVLIDSKDRNYQVYPDPFKYT
VKFNPLRRTTEIIDGEKVVNEEPMPVINDNFKNVKYIRLESIILPFYTKIRFVDEDIDGDIVQRAKVNTSKPLTDNLYVL
MRIEEFKGVNYKSSNDVLAESFAVIYFDNKISNTHYEGKCNGGIKIFPSDKLGTINSLKISFVSPYGEEINCDHLMKEIK
SNMICNCDDPEGDEYTDCFKHNLFHPLNPIFQHHVHFKIGVVTPRLNKLNFN
>Q5UQW1 2.7.7.6~~~~~~Putative DNA-directed RNA polymerase subunit L376~~~
MAQQSLYFQTKLEDKVSLLPSQMVGNMENYLLENLEAKVKDKVTEHGIVLKVNRIIEYDYGIISKNNFSGTAIYRVKYEC
LICSPVKNLSIICLVENIVKGYIIAKNGPVIVAIPFNNIDSDKFQLTNGNIVYKNNSNNIQKGDYVKVSIINIKTNLNEK
KITTIAKLLDMATNDEIKSYDNDQLLIVNGDVDDEQEFI
>Q5UQW0 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase L377~~~
MKIMEDQINPINIRKQKDLDDNLEEITSENSEIAPADIEKKMLENHAYPSPTQEDFQRAIYVKRDFYIHSIPERKVLNTY
DEIKEFRDNKCAGNFKLTESQTLLSNFINPNTPYRGLLMFWGTGVGKSCGAIAIAEKFKHMVEKYGTKIHVLVPGPINKQ
NFLNEIIKCTGETYTKMFQDKTIVINEAEKNRIRKNALNVVNQYYRIMSYRSFYKKVLGEKIRDKVVTGNKVKLTSRKTE
TGEFERDISIDRIYSLDNTLLIVDEAHNITGNGEGDAVKKIIDVSKNLKVVFLSATPMKNLADSIVELINYLRPKNYQME
RDKIFTSQRGSEMDFKPGGRDYLRKMVRGYVSYLRGADPLTFAERVDIGEIPPGLDFTKVTRCFMLPFQLGVYDNVIATQ
DDSLDRNSEAVANFVFPGLSKDRNSKNIEGYYGIKGMNEIRNQILNNSETLNRRIASTILSEYEIEDPSNLMYLTDNNSV
ISGNIFNEKYLKHFSIKFYSALQKINETVYGKRNSGLIFIYLNLVRVGISIFQEVLLMNGYLEYQENTNNYNLKRDTRCY
FCDHKYGDHYNLPDDIPKHDFYPATFITVTGKSEEDIEQIPEEKHRILNNVFNNVNNREGKYLKIVIGSRVMNEGITLRN
IKEIYILDVHFNLGKVDQAIGRGIRFCTHYGITNEKDPFPKVEVNKYVVSVKNGLSTEEQLYKKAESKYKLIKQVERILQ
EEAIDCPLNRNGNIFPEEMKRYANCGTKDNPCPAICGYMPCEFKCGDKLLNAKYYDPDRAVYKKITKSELDYSTYNNALA
SDEIDYSKAKIKEMYKLDFIYTLKDILRYVKKSYPVEKREMFDDFYVYQALNDLIPITGNDFNNFHDTIADKYNRPGYLI
YINTYYIFQPFDENENIPMYYRRIFTPPTINKINVKDYIKNTPEYRQHKNLYQLDEEGPIDREYDFDSVQDYYDSRDEFD
YVGIIDRESSKRKNGTNSSGDEFKIRRKRPKILSKKRETGIPSFLGAVCSTSKDKKYLASIMKKLNLDENKSDSRMDICD
RIKNKLFDLEKYSTNNMTYLIIPSNHPHIPFPLNLKDRVQYIIDQIKRETRSSINPEIKTIKTTGEFNDIDYIYYELYYD
SSMDKYQDILTLYGAKKINNDWIIIIK
>Q5UQ45 ~~~~~~Uncharacterized protein L389~~~
MKRTIKNFLTGSITYFDTEKYFSNPNLYYYRIYPVEDNKTIKIRVSTTDNTIIHPEINKTKIILKIKILEDSIVRSLRKE
IITNRNARKQLHSKGNLRYVPTTSRNPSNTDTYSSSIDISSSSSSINTSDDSSGKTSSNDLSDMSSKSLNESLTDYIESN
PFELDKPSTTTNRISKTKYISKIIGVYKLNTKSFKTILNNLPNKHILDLSEFIQGDFLLSIYNNYENIFYLRNDFMLMNK
LHTTIPLEINRNRLFFSFLFLLFYLHQNNYNTVNVELFNIGLSHNNSDKLISLKTIKGRRTVPLTINYEHGGYIEPILID
YVAMSNGQQNVFKLSNFMPFDNSVLYKNITDDIKYPYLPIDLIKKTFGTISSYVMENLCWIAKSTTNTEISIIDRQILYL
LDMYDLFNGVPLSNNNITNVANYNMVQTNIIMNCHGDVSKCFQYNTNNLINDLDNLEVTILPPDNTSPTKLRMIPDDLKK
LYQIYNNQYGIKIFGSIKSTNDIDIKMETFTRRVCLSRYWYSQLKNSYLVSRTILEDNYNFFEAFLDEQIDKLMKLIQQE
LYTIDGKFQVEIFVYHFLSLSTDYILPSINNFDYSNYLGESYVTMIYDFVTSIISNGNIRNKLFVFLALSKIIHLSMIRK
LFMVFYIKENSNNNSELLQHGINNNKITMDKFDENIPFEFDANIYCQQRGLNINLIEYRYTILKELVPLKQNDIGTNNLG
YYLFRLLDNNIKHTYVLYMYEDFKYCILGRIRFDKNKVPFKDYTYIPKTQDILYPKMDEYEFYKKNNEYVAFLDRGMDTQ
RMSVLDHDNIFIESFRQGEPVLSGASGHTADILLCAGYLEPSDSDRFVNKMKLMTILCVSVMFPRKDHSIFEMYRALQLF
NKPMKTNEFTCPVENINTGSCFKWLLRDIVYNVGDNQLIGNQIYNTLTNRLNRSFEYYLSPKYYRIIEKIIEQFYKSTFQ
NININKFANDIRNIISDQQFNNINDELFIIMTIIRREGNKIWMNDAEYAYNIENKHISMRNFIDGNYSLPDFFTDDDIFL
YEDNVITCTRSIYDFVKPVNSFDNTCLILQKIFWAASKPQCIW
>Q5UQJ7 ~~~~~~Uncharacterized protein L399~~~
MDLGSEFSDTGNFSGFKNNDLYGGDKSSNTFERNFHNIFEEAKKYRQRIYDAERKMNGGADDNNSLKPKRQINKPLRLML
DISKILRDSGNYKDLKTGDYMGIAKLISDRAKEKVGTTEISDEVRSEALKMARDPDEFVTKYRSKKQSSSSSSSNSRGNS
RGNSRNRSRRNNDWDDSDDSDDSDNWDQRGGNNDGDNIKTDVWNDPQQTSNFTGGCGCGMKGGNNWDNMRVFNDTKNLSN
NNSTNTLNNNNTIISNLNNATNTLNNNNTTTSNLNNSTIISGLNNATNTSNLNNAVSTLDNNNTNISNFNNATNTRNNNN
ATNNLNRNTYAFNNGNKLNNTAWSNFRKSTIY
>Q5UQL0 ~~~~~~Putative core protein L410~~~
MADNKGRRDTFDVSGDTNTNATSNKRSIEDKIEADLDKIMKKGSFTPADALELYNKYGDENEYLVDKILKMNSKKNLKIR
KQARLLAEKIYQRYNNGTKPLHEILEKMLKYKKENKWSDKEYDEFRKELTNLLTGNRALEIDYNQNLTSYKSRINRALGG
IKNEEVIRQADSGLRIKDSEKGVLQDILNMYDRTANLHRTVFMHSLMYEDCSLVAITGEFDRKRHLATNYIDPILACMFI
PKIDIFEIHMLYANFGSIIKARSERKQITNEPDLLLYYDITSDPNDVVCEINSPIADLRNRYRVQIALWGIVMKLRNGNY
YDSEAMDDFSKTLDACRNNLYDNADLAYNNDEGAIMRRLLSVFSLRPTIIVTKPVYSIASFAMGQLMPQLIMGANGVPSP
FGQSMNPFNNQPIYTITSIPMITLQIPPITDNAEPIDLRSATTQTLWVNENKTIIPKEQSIIYSKEVLIFYINRRIQRIQ
LKTFSNPLSFSQLPLTMSSFERLNGYPVHVPDRLVLDRGDEVYNLRSVVSVTETKIGGLSTSLVEGLPSNSILGQNSQST
GIITGRTGLITKPRNFETGVFEPEYYLYDPFGASLPVRHPDQSGYFTNKPISHIPPLYSPSVDLTGGIENPSFFDRASRT
GTIFIYAKPQGYNPNQPLSLF
>Q5UQL4 ~~~~~~Uncharacterized protein L417~~~
MAMNSTLFLEIKNEYTEHLVDVLTPYIYEGLLAIYSRACEIADESKKSNKILSIFQKLLQSIDEWNQSRIDDETNRIKMT
GQTQEYLDDLVKAVIRSNIILLSCNNNISQAICQNFYNGLSTSALIHRCYAECGKDAHNNPYLFFHGVEPLDYKRNQVII
QDHVRMGITRAIRKILPISMILKEYLVNSMNLFPDPNRTNPLFPAQSINNAINNPFQLPSPMQMPGQPTQIVQPVQAVQS
MQQFQPFQIEQPMGSQMMGFGTNITAPETKNQIEQEIFNVIKSENVGQNPQKIKAIMNIEKIITSAEPNQIADFAAKTTS
ARDPMNLPQANPIGFQNNQFNQQKKLYQHPQHNQYNQIGSQGQNRDYRQNNSNHSNNSNHSNHVDRTDRTNQSTYTNQTN
LTNQSNRSNYSNNKNYYNSKSRTGDYSEINKNSKQGHFEQSQNYSKDTESMFITPANGQNKLGTGGNIDLEKIDFNNINL
IEDYGVKHSHE
>Q5UQN6 ~~~~~~Uncharacterized protein L442~~~
MRGNNLRGGVGEYKINDVLGTRFQNFMDLVDQQKLTPFLVRNLDASLASATGSPYGLDDLHPDFAEYVMAVSAALRQIES
PELTKVQVTGKYGNALESIVNDLFSGTTSGLIAQPTQPFGFNQLPFHNNHPNIKHLIPGFHLFPYTSAVYPTVNQNNNAL
INAILYLYHYISLLDLDSGVDARNIVSFLKKDNLVFNRYVNDIVTTMTNNNFFSQSLGFPTGQPTDEIVRDSVLQGLTGV
AYQIVNRLRNVQASNQTFANLGAVAGQDISAKDAFAKFVSSVATPGYPGVNADKLFDAFSKITGGFYDQGPDVVNIADVD
NDQAKQYRTNGLLNPVYILSPSIAKTVNADDYDKLNQAGGKRNSSMNNSTQNNNSSRSNNSARNNNSVWNNNNSAWKNNN
SAWNDNSSWKNNNRFGQAGGITFGDIGAAPAGSVLSLPFLYGPALPDGSHKLITEDEQGDQNDAKLVDIDSVKEIEDLTD
PNNITGIIPEIVDIGNDPNYNYARAQLVTLLYQLIVNEATFDQASGLPNLDNEIRNAAQNYNKIVVRLRAIPTSNFGSSM
ADLRDSFLSEFVNTSYRQFQVEKQTGNLITGQQTIANKGSNFANPDIKSFYDNTLVNNADFYKTYFNLVKLGPNGVVVDA
DVKDITEAKGKSDAELQNYRLNVRKNTGYTRFTGAQVGGLLGDIVFIDRIPAFPQDGSIRNVWLTRAIALTPVTLNAYNV
EALRRIAREVYNSPVGQSTVPVYGQPVDLTLIAQSAARMNFPISNVAFRDTFNNLLQNALNQAATGTAVTPGFIEQEDKL
VEHLLRVSSRWERDGNTFIFKDLSGNPVQTDPADNCLLIDTSTRECLSVLTQCIADPGTKLSDTCARLMEFNFKVNPPLN
LLKDEISKMNPGVAFEILRKFGFGSYLAEDKDDSGSVIRRYKVQSVGSWIRELMGESARCAPGQAPVVNQGPCRTISLRE
ELGQANADKILNMAKDSAPFLRYLEVLVHWVNANPQVLNPEETKDQSTLCPTSYPKVNDSFNTYSYLNPYKDVVYRLRNT
TCDLERLKCSIMGNYLGSGSRKIFTDLATIPHDTNMPFTRIGFTSTVPLLNKVPMFGGDGGIYNLQNQLNNLNNPVGYNM
FHQIYKDLLNTMGNIGDSRCIRLSSNTQARCEDKLESFKNAEIELNKCLNRLIERNKIYQATRGRIDLNRVPPENVAAVL
EKHSNLLNMNSAYNKKAVNLIDIFQTIAKAIINKVEEGAPKQTVERPLTMGFHNPSYNF
>Q5UQQ0 ~~~~~~Uncharacterized protein L452~~~
MQSTTNNNTNKKNHARSNRSDNLSTVVSNRSAYRSASKSASRSNNLSTPGQSRVISYDDDITSDYNTYTINDYIDDPTTD
QNTVNDDEGNIYGGNFFDGFIPTSNLFADNYEIKKLDDVIARTNNIMYGGRIPEDQIISTESNNIFGIKDFRVTNTEDTN
DDNSNSQSVNSRTDSDNLSARNTSISNSLLTSGRNSASIRNANPLLVRNATSLGNSERNSPDRPSTQGDSSIRGEADNFS
GRNASARNSASNKNSASRSSVSNKNSASRSSASRSSVSRNSESIKSSSRNSESRNLESNKNSTSRNLESNKNSASRNSAS
RNSTSIKSDSKNSDSRNSQTNKSKNQRGGLSEIKGIPVTDKFIGQNEIPTSSLGTHQYEPNYSLTSPGSIISTSSKLNNE
AIRGINNRPIAGKNQVYPDTITDTNDLSSFFANTESNTIDQFGGQTSDKNNSTKSNTKYNKSSRKISEISYGTSKRSHNR
SSNTSNLKSETDYDIFTANSEL
>Q5UQP8 ~~~~~~Uncharacterized protein L454~~~
MNFSNKPNKSRKKSNRKNKKSNKSNTQKFFNENIVVDCDLFGVSENITNNAKSMNQVFDNESGISKRVQKPSVKNINKKN
LKSKQKNRPAYWAQFDAQRFDSIGDPSAPNELYQSCDKSKLADLERQLSYQGGWTQFNSDSSMSYGIVSDDKLTHNNMMP
YFSAKSGYGSNDLLNTSVMNYKNSLFTGNLKDTWKHKQEIKPHFKPVADSSYIHGTPIYSDDVRQRYESQASKLRQGEKL
FDSIQVTPGLNLRHDEKGTHGYHSMYRPLEKTVDELRVRPKITYEGRIIEGMRGQERAAQAPVITYKPDTYKTTTKDDLL
PTSDIKKAPKVIDNFIMKDTARHDQHIEYTGGAYTGSNHVGRNVPEYMQEKYKESTRQNFLAPKPMQKHSKTDTKFNPNL
KSYELPFTLKDQNIHNKHPGITSSIFGNTTYSNLTDSAKFTNKQLIAEKSVTNTNIGTNNMRGTVHCMDIANPTIKEISV
ENRLNPNINGFSTIHRTYNPEIAKATIKETVIDNLEPSNVGQNIGSYANLTHNLNETIKETTVEIPQNNFVVPVGQYQRT
PGLQDTTNPTIRETTVGINRNSNILPIGQYQRAPDMQDIARTTMNETIITTPWNNHITPINQQQSAPHPQDLFRSTLKET
TIDLNRNSNILPIGQYQRAPNLQDNFRTSTRETTIEIPRHSQIIPTGQYQRAPNLQDNTNPTIRETTVGISRNSQIIPIN
QYQRAPNLQDNLKTSLKEITNSISRNSQVLPVGQYQRTPNLQDLPNTNLKEITNSMTRNTQINPVGQYQRAPDLQDITNP
TLREITGNIHRNSFVVPVNQQQRAPDLQDIMNPTLRETTVGIHRNNIIIPVSQQQGPTNLQNNNNTTLKEISMSKHRNSI
ITPINQHQRAPNNQDIARSTLKETSIGIHRNTNIVPIGQQQRTPNPQDILNPTLRETIKIPYNTNINAVGQQQGRTNLTD
SARSTIKETVVSIPQNYVTTAVGQQQGKTIQDSLRQTLKELNLENNHNSNIGTRENNLGAAHSFNRQPLRSTTKESIVDV
PQNTHMTAVGQLKHKAPLLDTVRTTMKENIIQIPYNSVVTAVNQSQGSSSSFNREPLNTTIKEMVTDNKHIGQANHDGYG
RGYGYLTEKKEAPNTNKQFTCQEVYIAPLEGHQKPKSYEAAYNAETNERKEALFKYRSPTDSGVNMGPDPDMINLKLRND
NHESRDPNTGYSFNNSLDRFNPQISSKISNSIDSCRNIDPMLVGQLDSNPFNIKYNI
>Q5UQF0 ~~~~~~Putative ankyrin repeat protein L484~~~
MSRYQPRIILNHPEKVVPDDIMEQFFLTIKTGDIDKIRNFVAQNKNKFNIIEKSSKPGGPNKTPIHAVLELDDRIADQET
KLTIIKYLDKMGAPMDLPDSDNVWPIHLAAADQDEDIIDYMLKNKVSIDRKDSSNNTPLHYAVYGKQVPCFDKVKVGSIV
PPQDIDKLPLNKTLTDTKDYIVELLNNNPQIKNNLIHTVNTIMNIPSMYQATKTEKDLETDIVDIFTEIASNPSYPSQNY
ANSMSVQQNKLDQLINRMYILINDDLLRGLTNPLKISPNNTGWGPVTPSSSGQPSSIDRILEREHDYVFNELNNKYSSAR
NTVIDINLASTNKLSRETIPNIMRNINTNYIDRLIFCPDCSNSEYGEKITLTKMLYLLVWSNYKLNYIPDLVRRIMDNMK
IMSTPIHSQIVNSNYTAFPLNSNLLVDDDSSLIGYIFSNLLRNKLNPQSQIDLALANVMDQVDPTINAITGGINSCITNQ
LVPLFTNSDDPNGAVNLFDGVLNSPINNLWSIPEFRSLREDINDLRPAYRDGKISWFQMLFNLIQEIQPNLVTGNTNDVT
NNIFIRGRRTPRYILPLTPLPNSTNYGPPPGPPHGTLNNAYTYSEAFRVMDALVQYITSGQINATRYPRVFDQNINDWIP
FIDNEFQSDALLQQYPELTFLYKILAVHTQRAIYSVIFNCISILLRNLPVSPEADIVRNYLELIDDAYMYYLLLPSEPNP
AEFTQTGTDDTLADLKQHKWDPDNDLVNWFTKYIQRIPTEFIEQLYQLIIQNIDDFNYGNLENIRNHIESSMGDLSNFIT
PIINNSEFRNSIKKYFGTFNYPASYNGTRIASMQVIPRFNSLIDDIDFEYRFDDYLTLLRNGAITGLTFLTEVYGYFFVD
TKKRLQLIDESLVDINAIVADIIANINNETYYYIPQVILPALVKQLITVITNIYAIKDVLSKFTPVKVEFDSLITNSIPE
HNQIINLGNDFVSYVQDQLKIIYTNTIDIIKYHNNVIEFLNTHSASQLINATNTATGNNITTTRVFNRNLIPIQIFPSTL
SDNPNFTEIETVLRSYSIPEITYYADATDNTDILIDIFDIGSTLNLYHGTISFDRTGIISNSPNVTDNLQINIENDGTVK
DIPDPIAGQWLSFDVNHPRGTSYFNAFIAYITRNFQFDKLNGMPSSVFKFLPEYLSFMKQQIIEQVVQFIVDNKDTNAKK
IYDELTNLGTQTSYQAIPDVKNYIIIGKLTDDILNKLFEFAIRQSVSSWIYNITSNDPRYRSIIDPLNQTISIINKKDYL
RLSLNQINRETISDLLSTNSPYVDYGLAQVETNPNNLSYTNYQNLNKFIYYLYNINYSSDNFQDGNCYYINPTIVSKLID
SYTLEAKNSDGNTPLHLAISMTNPDIVEILLKHGANPFTFRNIRNESPYDLFNDIIRSHLKYDTGNTVSKSIENFYKPFN
DLLVSRLLDDKFKNNIIKNITYGIPIELIMYNHMFHLYLENYRYDFTIEMRDSIRRIFQKYFNITDTVYPTDLFEINRDE
DLSKILESEFPQNRVKTSVQLLNPKLRDYQKQLAELNNQIINLTKEKRSTINSQQISFINNLIANLEARKSIVQTNISDL
EFKQPTSTVDSAMISSFKSSVNSIQNRIGDRSLNLIDFYRLAFGRIGYSQDLYLNIWRLYLEKDILNARSMIFPLLDQVI
NNHVSLVKENKMTRENKMTRENKQELAVIVDFYSTVKKYIESKKSLPNNLDDNPILQEEHDHIVYLINLILTPTIRNILI
SQIYQGLAESDLTNTLIADRNVAIRDILNSEYNGHSLDSFIKDILPNLSVKYFTTIYGNQDPNKFITSASDVFQPIIETV
KLNRIIQVSDDSILIQNLRDYVIPFFINTYQNFIHHIELSIYAYEKYILNTYQLTKILQIMSNLQIPK
>Q5UQF4 ~~~~~~Uncharacterized protein L485~~~
MNPKHSSCISGQNNSMYIREECPNCHFSYTKGFSNSVENKCPSCKHIYKSGDRINHGPGCSICSSRTCGANNSLYKNGCP
AIMSDGRFITNYNSTNELTEEMRKLNKIKSPNEFRLFLQNNGDKFMEAERNHLLRNNSCAPKIACSQGWYDLWTQNNGNW
SNFDKLDCNN
>Q5UQF8 ~~~~~~Uncharacterized protein L488~~~
MLTKNTNNFSNSSNDPLNLYLNKNYAALANSDLTKKIDSDGNTVVHKMAKNLDHDAFDSILKHNPTAFKYNVINTANKRS
ELPIHKAMETLQSGGDPDHGFIDYLINGLGANPNVPDASGRTITQVNPTTYNPTNPTGSTNVPGITHLPSNAVNDQSVKQ
LNDQVIKNIRNLAKTAEDNINKISPKLGQSLRDKGVTPDSITNAAKNVVDRMPMIDKFFGKQTANQVGTVSQVTGGDNNV
EFLRQLTNHYSTLRGGRSTSDRSDMNRYDSFVAKNKNSILNNYDRQYNDTFSSTGGKANKKLGDINTEDINNLFTDDAQN
ENSDNSMNTWGGAKKTYGNKSRSNDNSDNNDDSDDSERDIIDRRQTTKYQNMFSSQERPRERNTKVDEIYRSFVKKIMDL
LGVDEETAKLYRSAIKIDIGNKNPELRKWENDELKIQEMEKIFNNKQSLQNALDKIDMDQIKSEMSRRRDESNKRRDEKR
KDREEKRRQKRSQRSDTRKQGIDSVTSDEATSDQTQSTDSNNTTQTASKKRTRKSTTSQSRVGPNNYLRSEDIIISTEN
>Q5UQG1 ~~~~~~Uncharacterized protein L492~~~
MSKQNTYRLINPYIEGTTDTVVHSSNSFKAGKKLYGGISGYFTNHLDNFHMTIQNVQTGGLTHFRILEQRKNDGTVDYKL
EKIDGEFSPDVDNKLLSSVAKLEQQKGGGSNDSSDSSDSETECFKFPLQPINRFVYFYLPYHKLNLVGLSPVDISRIFMP
TFGFPFNPTIEIRFDLYKY
>Q5UQG2 1.-.-.-~~~~~~Probable zinc-type alcohol dehydrogenase-like protein L498~~~
MSLEEKLNKYSNKLSMTNNLNKIDVYNKKINKYKERQIYSNLKKPQPIDESVISLYSLKNEYQKSPDKMTALGFGVTDVG
KPVELLVFDRKKPTNNEVSIEIYYTGICHSDWHFIVGEWKADFPLIPGHELIGRVIDIGPNVDKYSIGDIVCVSPVIDSC
GHCKMCTHHIEQHCMNGATEIYNQKTRLPGDIKPSGPITYGGYSNIVIIKQHFVYKFPKNLDIERCAPLMCAGATTYSPL
RQAKVGPGMKVGIVGIGGLGHIAVKIAKAMGAHVVAITRTEWKFKDSVNNLGANESILSTNVWQMNQHKGSFDFILSTIP
MAHDIVPYIELLKYKATICTVGELFPTVINGMDLAQHPCFLQSSLIAGSDEIKEMLAFCSEHNIMPDVQIIKADKINDTR
QKLLESKAKYRYVIDIRASLNK
>Q5UQ80 ~~~~~~Uncharacterized protein L515~~~
MNPTNLNFGKILKQVEMVDVNEDSLDLKFPKRTIVDISTDTFIKIIFEKDDNYYLFLNYVNALDLYLNYYTNINLVSSNK
QLLNQDEHIDVYFKGGNVMNYHFRTMVSDPRLKELFSAYFKKSDFDFSVSIHTDTDNRFNQLKTYVYPKIIDYLITTSNL
FNEYLQEIINGDINKSTIKIDPKFLHNFKQDNADLKYYTIRDTIKDIIGLPRFNFVKEIIDNTRNTNTKNFMHIRQIDNI
GRYIRVKFHDNSIIQFIPSDKSFYELKYTPYIINNFNQVIDYYNEYVLNGLIPVYHNFYENSKYHACLLYPYYKYLVEPI
GQHEMEYSDLIDKIIKYNFSLLEKSEFYTKEKINSMLQQISTSLNELKDTYYEKNSDNPPDKTEVTNSNAFIRYTVNKNR
SDNPHIELAPTNNFLVYNDFEKSDPLSIINFDDNQTIKTTNNVHYVSGNMLIKNILGNRQILDFDLFRIKFNLVAVNYIF
ENEKLLREFKIPSEFIDVSVTIIDSNVYNEDHQTFIMPIKLDNTIIPDIPVKSHSYTYFIGDLIRILFTDFNFFPWMKGK
YEKRIKRLLLLLYLYDQQHQTNYLDTLYNMATNIKYNLTNPNKTQKNMDKYALSKVHLNSYKDYSNLFDLVYIDNKYGPI
KEPMKMLLIVSEILDKNNALDIINHFRKYLKLAPLTNISNLKTEFGKFLDEIISTYNDINPQNIQAKSSINTLVSRNNYS
MIKNNRRNY
>Q5UQ79 ~~~~~~Uncharacterized protein L516~~~
MIRLQTPIIKETIHPKKTPTALNWLYPEKIIHNISSKALIDNIYSENYQHILVGLISLIEPWVYLHTNLKLIDNGHLPLI
DDETIILISSGEYMIDYYYRIEVNKYGKQTDLFDKLLQSVNNSNKDFFDKYKNYFKISGIDFDLNINTSNTNRFEIIEKY
VIESLVEILDKITDGLDAIYNYDNQRDVLLPIFNNNYNNDNVESIDIDYDLDLLKKVKTIMANPTFKFDKDLMFHPTKSL
IMEARETNYRVFNDLGNQIDAIMKNSNNIMIFPYAYFLIKNVHGKNDYPNSSYKKFDNIIINQLNQYFIELSNGEIYSNY
HKNAFFVDSINGFRSINHDLYLKRFNDFVPEENNNNAFYDTIKFPNITQVNNMEIIPSDSFYIYNQGKEVIDSIVEKKHN
IVVNQSKAKATRSNTYIIDKDTIELNLSVVFKGLLFNNKPVVYPISSNIVTFEIERINSTNYVDNVNVPYQVLQIKNPKV
TIKTYDRKTLINKLMKKLFDEDHFLPWYIPNYNQNIVKLLFLINLDKHEYIDFLKKLLQTQNKKELVDYSLYYCFDYQSY
SFYNLIWIDPMTVDRYYEIQYLIKFIIIMDNILLLPDVSLRKILIDFNSSYGWYDPNVDLSSVKVSYQNFKNNLLNTLNE
LDDIYHNKSK
>Q5UQ90 1.14.14.-~~~~~~Cytochrome P450-like protein L532~~~
MVLSDILFSIYEHREKSPVFSWFAYLLRILDWIIQFLSFGLIPSIGGDLYDLVDNGLFKFVLDRNIQKKQNQLYDKFRLG
TVKMCLVFDGELTKKLLLDNSIRRGGLYNLLTKFFGKGIFTSNIHSRWMKQRKAIFKLFSPQNLIQITPELTTSMFEELD
RLITIKKDLDLVTVLSLIGLVGFCKVIFGVDVTDMSEELIEPLNDLLIYINGAVEPVLITADPSYRRFITNKKFVHNWMR
KLIDKARKSENCFEIMRQQLDDIGSDDETELIEFILSVVLGGHETTARLMLGIIYSVCHNKEIIEKLNNETDEYPKGDYI
NLKKRPYLNNIIKEGTRLFPPVWLLSREAKNDTTIDNHFFKKGTQFLISPLIILRDYNVWGSNAEKFDPERFSNMDPKSK
ASKLYIPFIVGSEDCPGKKFAILESAIVVSKLFKEYEITVLKHKLNPMSAGTFRLSDKLPVSIKKLKN
>Q5UQA3 ~~~~~~Uncharacterized protein L533~~~
MELLDKAVTDFQLFYQKILDLYESDFSPADKWTEIATHIHDNQQIPIGIYKLLRQNDTVNQIDIDYKSDLEQLNSKISLE
SDFIVKSSLIIASMHFIIYDMITREGNYSSIMYGSEEMNIPSVHNMIYYVYISTKNQQNIYFHAYILLFGLESIFNKKFY
VGMDFEFTNKKIQLSQLNFEHNVSTKSIIMIVSPNELEDIMMNNFIKIIICNTNIKKILHGSDSLDIPYMYTHRLDGDPD
KIIKFTRTLIDTRFLCEYYKLNSDIVSDNRCSIYDEDPSRSAVYFFKVISDEQQNKLAELLESMPAPHDIQWNIHKMPIS
QARYAAYDVLYLKVFYYRIIHVATEEDSSDIGKKNIIELYKHLLNEITRFVYLEQNGITLLMAKCKEEVDVVNNYFIRNS
EGITKMIDIFNQVSLGLETSDPKVNVDSFSKVNHFKRPITTIIKRIVYGFISYRCRIYKDKSTIWTDKLDNQFIFDFLAK
MKFNYLNKMFKELSKTLESRIVAICSVR
>Q5UQ98 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase L538~~~
MDSIILKNHQKKPIEFMKNNRGVILYHSTGAGKTLTAIYSVYQFDYPIIIIGPKSSKKAFTDNIEKAGMDISRFTFYTYT
KIKMILESDITIFKNMSVIVDEAHSLRNENMYNLYISSALMLASKIILLTATPVVNYFNDLAVLVNIVRGEDSLPTERAL
FDQMFYDEETMTLINAPILFNKLLNTISYYKIIDTINYPTSESHIKQVEMDHLQIDEYKYYIKQILYSNENVPDNVDIFN
INYGLLPSKKRNFFLNVTRQLSNVAKIADTSPKIEDIMKYIISGPYPIVIYSNFLKSGIYTLAVRLEKENISYKIISGFV
SQDKLNMIVNNYNNGLFKVLLISSAGSESLDLKNTHQVHIMEPHWNESKIIQVIGRSIRYGSHISLPQNERKVDIYRWIS
IFPNQYRNISADEYLTTLSQRKMELWNKYNQIVIDASIENNYFAK
>Q5UQ96 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase L540~~~
MASNLTFEDKIGILDPKGLRPNPLNNEPYSDDYKKLAMVWSTYPAYSAADRVLSALENYQLVFVTSSTGSGKSVLIPKLA
LHYTNYNGRVVMTLPKRIITLSAAIFSAKVSDVKLGESIGYAYKGSDKSMYNDQNKIIYVTDGIFVMEYVRDPLLSKFNV
VIIDEAHERRIQIDLILLFLRTLLQSGNRPDLKVIIMSATIDTDKYQKYFNSVDSTVIDIAGQPNHPIETHFMDKPVTSY
MKEGLELIEDLIHQQIKKDMLFFITTSNEALQLCRSIRPQYPRVYCVEVYSDMDKNLKQYAESRDKYLELGNYDQKLVMA
TNVAESSLTIDGLVYVIDSGYELSSRFDPECYGQILEKKFVSKAQALQRRGRVGRTEPGVCYHLLTKQQFDGLADYPTPD
ILRQDITMDLIKIIQVSPNKTYAEGINMMNQLMDPPLRSHINATRNLFDLYNVVDDNGILTQVGIVATQFSSLPLNRILF
LIYAFELQCAREASIIVAMTEFLNGRVTNLFYKSDTICESNCEKQAANLLLEKLIQKRGDHFTYLKIYQEFSKSTDQKSW
ARKYGVRLDTINNIERTANQYFYRILNLLRKPRLPNNKNTLIDTSIDTPMDIQSRISSTDTKTNLLNALKKSHQHLTASK
LKPTYSKENITGKISRDSVLNQIYKKNEISKKKIIYDELSNINGKWEFRTVTIIS
>Q5UQA5 ~~~~~~Uncharacterized protein L544~~~
MRMLIFTYKLERYIKNKILPKILVVPDRDKYQIKGSFRRRIPYITDIDIVNNVHPEYDDTNIYQRIVDLINSFTNDNQIK
LIYVICGTDDRFLLTEYSDEEIEKIKILLNPTELVELNNVLSKYQDDLNKKVFYINEIIWDLYKLRWTSSEVLAGKKILR
GGIEVSFQDVVKNNSILLLQYFVKIEYYPIGFDIAVRYKPINLITAYQNAAFYQLKLANYSKEYYFMLFPLRFYFKNDPT
ISKQLEYIIETKFGLYKQLLVRIDSYRTIYESGNLDLDTAKSIIISIIKDIRKLNGIDMNIIDKIQEVSNNSAGQDKIIA
WNTLLTQLYTNINKSVNKQSKKYFTRYINIIPKEDRKLCCLEEEHVLQSGGINFESTNFLTKKKLIY
>Q5UR27 ~~~~~~Uncharacterized protein L550~~~
MAYFGCGPCGGSAGYNNCYNNGYGNGYGSNHCRDGFGACQNNVSGRNREVYYEKDNCFNANNNNFYGGRENDSCGGWNAG
CNNGNCGGWNSGGCGLPFYGGGCGIGGGCGIGGGCGPIGDCGPIGGGCGPIGGGCGPVGGWRKGYKGGYGGAPSSKLCRG
LGYKKRN
>Q5UR50 ~~~~~~Uncharacterized protein L567~~~
MNHYDQYQKYKKKYLDLKNQLNNSSQYGGNCGNYGNNQFNNQFNSQATNRYQTGGAEFDLKDEVAFWGRQMMEHLLLLHL
GLDEEELKNSALQNHMDWKRYLTENFFSKGVNPGPDQAFLTTNELEKIGLLNKNVVNGLIDQTIQYKSKLVKTLNSGQWV
GWIYPAMAQHMLEEAEYFKRKVNGPDYTPEQETKFVVHHHSTEMGATTQLLDPTEKDNIKIAKSYADICMSKLSGRNKKP
FPNQWTSQEEAILRGQDPVDLATLMRISLKYSRELTQFAKETGQKIDSKQLKSIIHPVLAHHIFREFYRFTKRLEQLGAQ
>Q5UP53 ~~~~~~Uncharacterized protein L585~~~
MDQTVYSNIPRYTYDPYYTDNPYYGGWAYGTLDPYRRCGGYGGYGAFAIPPWRGLNSYEGFGKVYRPWAGRSLLVAPAIY
RAGSGPVGSTPWGSGRQVNAARPIGGR
>Q5UP59 ~~~~~~Uncharacterized protein L591~~~
MNITNINTTRIPAGTRLYLTKKKYNKFYIQPDKTLVNDNLFVAYDVKIGGVIAIPQGTRVLGNWVAESSPSIAVQLQLTK
IFLYGSGQNISADSDLVETLVDYNSDEIDCAPYLYKIHHFKSPAGIYRRIVNTKCRSKILTDNNRNSIYLEVNTKEISVV
LNEDIIPMPDLSKMPVPQSPSVPTMPSVNNDQPTQKPNKITRNYKPKHQHNYKPNYQPNYQPNYGQNCSNSHSNDYFSDY
PTEHSGDSSSYSTYEDDDF
>Q5UP75 ~~~~~~Uncharacterized protein L612~~~
MISNQNQVYVVNVREETLKNPYYRRVIDTSPHQQLVLMSLKPKEDIKFEIHPDNDQHIRIEKGTAIALTGKNKDTKHVLT
EGMCIVIPAGTWHQIINSSNTDYLKLYTIYSPADHPPGEIDVRRPQSGGSGSKTGDFKNKYYKYKYKYFKLRSSVPRELN
>Q5UR10 ~~~~~~Uncharacterized protein L647~~~
MYKMPFDRAGCLNNVGGFGSCGSNYNRINDAKLLKRELHFTETLLDTINRECASGFNNYGYGGGNCGFNDCGFGGCGPAP
CGPGGFGGGFGGGFGGGFGGPFNGGAGVEVEGWGYGPQPFQGGFGGGYFDKDDDKKKKKKDDKKDDPCNPFCKPCYKPVC
NNPKEKVCKCKECKRASRKIYEDY
>Q5UNV2 ~~~~~~Uncharacterized protein L688~~~
MYLFYQLILEIFQIMSCQNYQSYGCGTYPCVTYGNCYTTCASPCLPYPTNCVQVCTSSQPCPSPCPIPVPCPVTIVEYIT
TAPTATTIESSPTGMALTPIPVGSTSIPSGTVTVITGYAATPVRSIGGITLNSALGQFTVPLAGSYLITGYIGFSYNAVG
IREVYVYKVDGATSVITLISTDSRNTTATNPTYISYSAMDYFNAGDRIFIAAAQNSGSTITTTADNRIAITRMNRQ
>Q5UNV3 ~~~~~~Uncharacterized protein L690~~~
MQFQSVGNCAPCKDPCPLKSFCDPGLSNTCNIQPFVNPFGPRRAVATWKVNYLISNRNVLAAHCDPDLINPWGIVIYNNQ
LWVVCNNTDSITNYDLFGNKLLGTISIRNASHNPSYPTGIAINCGGGFSISTGNISRGGLMLTCSEHGTCHSFNPVVDPL
HSFIVLNQQITGEVSVYRGLCIANNTLYLADFFQRHIDVFDQNFNRLIGYPFVDNDGSDPIPLNYGPSNIVNIGCFLYIL
WAKKDPNITLYAVDEPGAGYISVFNLDGSFVRRFTSRGVLNNPWGMIPAPAECGFPPGSFLVGNHGDGRINIFDCNGRYV
GPVLAMTGLPLVIDGIRGLAPHYTDFSEIYFSAACDEMTDGLVGSLVKDQVIYF
>Q5UNV5 ~~~~~~Uncharacterized protein L701~~~
MSTLSPIYSRCATQGNSCPVSTIPEAMAYADPNGTGTIYYRNSEANKAFTCNNASFGNQTNSTAYQCYNGNLPTDFRTAG
SSFYENGIPKGWTKCSDENETCDPKVNSDVDILFGADGSYVYSSAKSVPCNINIFGDPKQGVKKACYWRSPLIPINHTPS
TPVTPTTPSGTQTTGHKWWVYLLLFGIPLLILIFLIIFFIAKK
>Q5UNW7 ~~~~~~Endonuclease 8-like L720~~~
MVEAPRIRITYEKIRHTKNHRIVSISGPSYKRMNVDLIDYIIRKWWFAGKYIYLMLISSNKPTYVIRTHMMMHGRILVGN
QDSPTKRAFMIIQLDNDIVLRWYRSQITLLDPNCLAEIKTNYTICTTRQAIMDSIKLMKYDLSNNRFDYNLFQSHLKNGI
NIHSSEIITDFLLDQEYFPGVGNILQQEALYDCKILPLKKVQDIDEPMFDCLCNSLKKIIDLLYESYKFRESGKEFGPIL
RIYRKSLCPLGHKTIRKKIGLRNRMTTWCPVCQL
>Q5UNX6 ~~~~~~Uncharacterized protein L724~~~
MKICFEKNFECCNPCQPKCCPPPCPPKCCPPPCPPKCESICIDKCPNTCDPCCPPLVDDCLAKKLECLWRQCFCDARLIP
EFGVPCQSDGVAVITHTLGRGLCNLKINGLKSQSILANNSFYSVEVSGCKWLNLYEIQLPDVPGKNGCKSSGEIYTEALV
KLGISVEGDGYRWKGSQPYCLNIHSKAIGMHPIEFSKKQIAAIKAVLDYFFCDNKCCC
>Q5UNX5 ~~~~~~Uncharacterized protein L725~~~
MANNLVQLIFDQFIEILEDLAAKDEWCFDFNKCDFDFRVRELVNHRLLDVKYTIKDECGRPRDVIQEIDITGICYEDLTT
CKWVDYLTKLAVEYINNICPPRYIIIKEEPKKCRPQLPEWNPFPCKRTTTIYRRQKPVEKKPECEVIFEKGCECLPSCER
EVPVPKEQIFIKYEPVPAKCCERTVLVRSPEQNRHSFGVHKGNIDYNNHVWPKCCQSKKCNCAH
>Q5UPR2 ~~~~~~Uncharacterized protein L778~~~
MSAIRYGDNVFITLPRLSTPMIFNGLVPHYTKPNQYEYVPILSSGSIANGDSYIIEPININNSNTALNPQSVFRLKQVSQ
NKYLYDNNGIVYLGNDTDNKANWSLKPVNLNATTIDYNQEFRLVNQGTGNNAVFSTINNVTMITSKYNDTTNNSIFKFLK
GPFTYAQSQCCQGNILYTRPNMCGIYKQGSSVCHTIPSSQSNYPSYTTSMVGSTQSTTPVGSNPPTHRSIDKWYIIGGIF
WVIVLIILVIFIIWKLK
>Q5UPS5 ~~~~~~Uncharacterized protein L780~~~
MKWLIFGNKGWIGSMVSKILEQQGEQVVGAQSRADDESAVEREISEIKPDRVMSFIGRTHGPGYSTIDYLEQSGKLVENV
KDNLYGPLCLAFICQKYNIHLTYLGTGCIFEGQNNFSADEKGFTENDKPNFFGSSYSVVKGFTDRLMHFFDNDVLNLRIR
MPITIEQNPRSFITKILSYSRICSIPNSMTILDQMIPVMIDMARNKTTGTFNFTNPGLVSHNEILSLIRDIHKPNLTWEN
MSREQQLAILKADRSNNLLNTDKLQSLYPDVPDILTGIREVVSKMKFQQ
>Q5UQI1 ~~~~~~Uncharacterized protein L829~~~
MSNRFDSKPKCRCVAKIDDNYENNCQSKYISKCEIPRNICQRKNIDFFYDFRLKYSDADFDYVFGNDGVVTQNFTGLTVN
SVPFTQTVPIGNEHPKWLKFYKDAFPLYNDREVIFETEMSGVQVIDGNSIPEKMKPRIRNVDDDLRLASGALNVIDPNTW
MVFDFFVTNTAIYAFYERLPFGKTSSTPSNTTSQFGNKSFHDKFTHNGSIHNGSIHNGSIHNGSHCNPNPDVPTDLGNYA
AFSNAIWVARRSADDPLSQFSKLAIGIHKGKGLVTWYIDDIPVFTWDRIGYRMHDEYRMVDHGGIEGIVSPDSMRLGFGT
FSLFDMNLPNDYDRGYVDPVVVLPDGPHREIARSALIQLDFAANYRETFPDPYTGLERPLADPAITFAYTLGETPDDNRA
IKLFGQGAIIKLKYLRVYTRSPNAKPEFSRVNH
>Q5UP30 ~~~~~~Uncharacterized protein L851~~~
MNSYIREKVDEYRERYDLPDLKVTRSDKEGKRLKAVYTDKDGHRKKIYFGQEGAYTYADGAPDYVRNAYHARASGQYTKK
GKQAISIPGSAASLSYNILW
>Q5UP40 ~~~~~~Uncharacterized protein L872~~~
MEYKLNKYIHKINTDPKSIYLNKIAKYYDSSIPIQVGGLRLNDFDSLVKFLKQAYDKAPTETSNKYLVILYGPPASGKSI
SRYIASYWIQELFKETESIENIYKSFIDTGIDEITYDIETPTGKRIIDLLKENIDNKLGNDKSIENAKKNISLLASSSWD
IYRTNRPDYVSELLYYFAIFLNKNIFLETTGSSIPYLERIINLLSFYGYIPIVVYPFINDVSILYNRSIQRGLKEGRFLR
CDTSFGLASQMQISLANYPKIKNIVSQYKNYLIYQYNSNFSNEITKNIYSFNFSSLGDYMLEFKCKIETIDDKITQNNII
DITSNNYDKALNLNLNCGEN
>Q5UQZ2 ~~~~~~Putative truncated GMC-type inactive oxidoreductase L893~~~
MKVVAVNAGFNVTLQMAYPPNDLLVELHNGLNTYGINWWHYFVPSLVNDDTPAGKLFASTLSKLSYYPRSGAHLDSHQSC
SCSIGGTVDTELKVIGVENVRVTDLSAAPHPPGGNTWCTAAMIGARATDLILGKPLVANLPPEDVPVFTTS
>Q5UQZ1 ~~~~~~Putative truncated GMC-type inactive oxidoreductase L894~~~
MYVFLLFSRYKIFYVYIKKMAHRSRCNCNDTSNSNGSQHGINLPLRKIDTYDPCVNCRVKPHLCPKPHPCPKPENLEADI
VIIGAGAAGCVLAYYLTKFSDLKIILLEAGHTHFNDPVVTDPMGFFGKYNPPNENIRMSQNPSYAWQPALEPDTGAYSMR
NVVAHGLAVGGSTAINQLNYIVGGRTVFDNDWPTGWKYDDIKKYFRRVLADISPIRDGTKVNLTNTILESMRVLADQQVS
SGVPVDFLINKATGGLPNIEQTYQGAPIVNLNDYEGINSVCGFKSYYVGVNQLSDGSYIRKYAGNTYLNSYYVDSNGFGI
GKFSNLRVISDAVVDRIHFEGQRAVSVTYIDKKGNLHSVKVHKEVEICSGSFFTPTILQRSGIGDFSYLSSIGVPDLVYN
NPLVGQGLRNHYSPITQVSVTGPDAAAFLSNTAAGPTI
>Q5UQZ3 ~~~~~~Uncharacterized protein L899~~~
MNSKISVSNLDSNVIDIITRILKSQSNTDVDNATDIIIGAISKNILTLQDDRDLSSIKQIFQSINDSECAFIGRQIDNEI
VFTVQDIAMYLARDTFNPLNNDVNTFIKYSWSSYSNKQGTNEKRDTINIVIKNQNVETYTYTKIDLPYLLVHVARYLSIF
N
>Q5UPL2 1.-.-.-~~~~~~Putative GMC-type oxidoreductase R135~~~
MKNKECCKCYNPCEKICVNYSTTDVAFERPNPCKPIPCKPTPIPCDPCHNTKDNLTGDIVIIGAGAAGSLLAHYLARFSN
MKIILLEAGHSHFNDPVVTDPMGFFGKYNPPNENISMSQNPSYSWQGAQEPNTGAYGNRPIIAHGMGFGGSTMINRLNLV
VGGRTVFDNDWPVGWKYDDVKNYFRRVLVDINPVRDNTKASITSVALDALRIIAEQQIASGEPVDFLLNKATGNVPNVEK
TTPDAVPLNLNDYEGVNSVVAFSSFYMGVNQLSDGNYIRKYAGNTYLNRNYVDENGRGIGKFSGLRVVSDAVVDRIIFKG
NRAVGVNYIDREGIMHYVKVNKEVVVTSGAFYTPTILQRSGIGDFTYLSSIGVKNLVYNNPLVGTGLKNHYSPVTITRVH
GEPSEVSRFLSNMAANPTNMGFKGLAELGFHRLDPNKPANANTVTYRKYQLMMTAGVGIPAEQQYLSGLSPSSNNLFTLI
ADDIRFAPEGYIKIGTPNIPRDVPKIFFNTFVTYTPTSAPADQQWPIAQKTLAPLISALLGYDIIYQTLMSMNQTARDSG
FQVSLEMVYPLNDLIYKLHNGLATYGANWWHYFVPTLVGDDTPAGREFADTLSKLSYYPRVGAHLDSHQGCSCSIGRTVD
SNLKVIGTQNVRVADLSAAAFPPGGNTWATASMIGARAVDLILGFPYLRDLPVNDVPILNVN
>Q5UPM2 ~~~~~~Uncharacterized protein R160~~~
MSISKTLDKNIKQVNGPINVVRLQGKIGSVNKVVYLFMDRHLAVEYQTECDNIFAKDVHMFFADSFKNIGSTGKTYDFFL
EKDAEEITPKIPEELNTSKTKYISEVGKMFKKIFKYDPKANRALTSDIFQNTRFHYIDFRGYLYMDIFEPIDMANYVMEN
IWNKRDLKSEDLNRIAGQLNIVSQQCQLILQIMDSYSNNKNRDKNNSERQYGVRKITPLKYTTVSESYQKTYKQQKQIQI
DYIRYFINKIYTKYNDNTIKSKLINRLEYFKNGIRNLNTEVNNIIVDINNIMTEMTSSSGKLVQYNEKWYYGMPYEKEIM
YIADLYNKLRNLNGDSVSYIARLIDIYFLRRFLDKDYITNAILYSGANHSLTDIDILVKDFDFKITHVAYSKYPINQITN
SVKKAEFGMVNIQPLFDNNEQCSDVTYFPDNFL
>Q7T6X0 ~~~~~~Uncharacterized protein R161~~~
MTTLDKNIVQINGPVNIARLEGQINGFNKVVYLFMDHHIPVQFQTECDNIFARDVQTFLAESFRNIGETGLMYDFFIEER
PESIINEDSKTTNSRKEGYIWEVVKMFNKVFKFDPKENKVMSSNVFENVRFHYADIREYIYLDTLNLYNDIRNLLQEMQE
NNFMDPYILEHIVNILSNIFEKNKSVIDIMNSFEKNSEANSVNKITPLKYPKVYEDTNIPLTNKEPNKDASTLEKQKNLR
TDYLKYFLNKLYNSYKDKNIKSKLLLELRQLKNNITNLQNKITKTMDNVKKIIEEIEQSKNKVTLTDYSFFMGISPLETR
IYIANLINQISILATANIIENIGFMDIFFLRRFLDKDYITNAIIYSGSLHSANYMKILVKEFDFKVTHISKSQYPIDKLN
DSIKERNNLAEIVYLAGSYTQCSDITNFPKNFQ
>Q5URB2 ~~~~~~Uncharacterized protein R188~~~
MRSSFSKCKVNTCNPSNCLEVDILIVGGGASGIYSAWRLSQTYPNKKVLVIEEKSYLGGRLESVYFGNEKIYAEVGGMRT
FPSIDLYVTAVIKKLKLQSIPVPYIEPDNIAYVKNTRLTVEATSIGPSSNPEKQKLIQLYKIPPDEQNISTNDIIYAAAV
RAAPTFPEDWRTVYDYPELNNETFSEMFSEQGVSANTQQVFEVFSGYSFFISQRLAASTGIRENISISGENNQHFVVGGY
SSIVFGMTDETFCNPNYQLFLNTSIKKITPSNSPNGLHTSILVDTLTGLPITIKSESIILSVPKDSLNRIVAPITPNTIE
TMNSLTDWRAFKAFLLVDQTTYQLISMNGHMKGRGISDLPARQVWAYSGNPPCVLIYCDNAYADFWKKYINDEINNCFPK
FHDPCINKPLVSELKRQIGIIYSVDPSKIVVNKILYKYWYAGAYFSKPSDIPKLFEEVRTPLGPEYSVYLVGSDISVSQG
WVDGALNTADNLLVKYYGVTSILDENRLY
>Q5UPU1 ~~~~~~Uncharacterized protein R253~~~
MSKPNTETISVNIPESEGVPLPDEQSVAERSIVNQSPNNSTVVTVNSEGDVSKIITRIKNHLASFNYAASLDFNKDKDHD
KVLTVGEYKISRECLLHYLSGNPDFLKSSAGECSKAIQTFSNESGNLDLESLLTLSPNKDFIHDTNFYKNLYGFNVSIAD
FIANNNEFKKANYNTQIRILQNYHEFLKQSIEYFNKYMNQYKVIDDNLISRSYNLMYLLNVLTFRRANVGRNINELLDSY
NKLNQAIATNLAIYDSINKSKLEIAPEARSSIIDKGVQDLVNSLKQRTNILKKQGETLKKNVEDINKDTSNLKRHATGDI
IGIADSLRKEVDSVATSFVSTEKKK
>Q5UPZ2 2.7.11.1~~~~~~Putative serine/threonine-protein kinase R301~~~
MNFNKFYNKFHDEPIDCENFPDIGQIKSTSVGSGGSDNIVLIVVQDNIKYAVKIIPTLFYPKYKEQPNDDQMEIKFYQFF
TKRYILTDRTPHIVGIYKCKTCDNIRDFLLKIKPKKSCPTFEEKLLKKVQYTQFENQLCNLLLYGENKLMDSKFIMALLE
YCDFDFSRYLRDLLQNVYQNNFGNNIGEFFYELTRILFQIIFTLAIIQDDYPGFQHSDLFIRNILISITDKYTDNEYVAY
YYKQKIFYLPANGIYAKINDFGTSIIVNELESNTYIYDKQTNKIFHKNPFNHKNDIYNLLIDIYFAFDEYIIENKLDESK
INPIINFMDKFIDIETINKIFEINYYQLKDTWYIDGISVLENTIKTPHDYIMSDVFEVFQDLPANAKIIRHFNSPRL
>Q5UPZ7 ~~~~~~PP2C-like domain-containing protein R307~~~
MNESKRENIQVINIGDCRAVLCKNGLAIPLNKDHKPIWPDEKRRIDRVNEKYETNEKIHFDAGDWRIGDLSVSRSFGDLD
NTPYVTHVPDLFDYQLQSDDEFIIMACDGVWDVLENHEAINFVRDHRNDNHTEFYSIPGKYPNREAFESDNISRKLASYA
IARGSTDNVSVIIIFFSKE
>Q5UQR5 ~~~~~~Uncharacterized protein R326~~~
MEDPIKIIHKYKNNNGRIQYHINIFIGDIVDENCMRILRKIKNLDLYTSLTSLETREIDILEKNYGEYWYEKFFNSYHIN
NTKELTLKNSVRMRELRSLYTEEWVNRHFVNYKKRLETTIFNYEYVVKEDRERRSVKRHIRRQQDDVEELLDYRTTGHPL
PPALNDEQYVMSRIESSVETDNWCADNLLEEQLTESELNRELKKLVSKQNLDYSTDKPEDSESEDIELEDSESEDSESED
IDQHGGQGPDDDEFNANFDDPQFDEFDFGNTEDEQNVKLFEMEVEQDLEDVDLLFNDIDETDKNSKLTTREIKEAISNEQ
YDRIGKKIVDFDESRDNSMFDENLKDVITKTYITNQYLFKDDTIKTIRDKICCGFKNSNKFGENTYIIPSHQYLWSEYTY
QGKIDRVMIGQKWIIRNDILKLDVEPNTNTSVKKNSGEIYVF
>Q5UQR4 ~~~~~~Uncharacterized protein R327~~~
MVDVYNELGLNYDPNFEEQRNLIDVYLRIYFPKIRPEEFISILDFLNESASDAKKGIEKNKIRTVYDTIKNNLILENEPM
RDIEITKKKYEKEYVKLFKENYVTQSFIRAYLLDRGKKIDLFRIFDNFILNENYPFIQYQPPDGTPRSRYNEKYLLENER
KEIIMKWFENTPWGISFKVRVSDKSDYKYMAINLSDNGRIDYKIQWKEEDMQTVDDIDKTYSFVKDLIRKINRENERFGI
KLKIPSDDQFKFAFINTIQKFELPDNFAINHNDLSEFSRYFFPYVALVIEPRKRQSKTKTTERDERSKFGTYLRYKRVSK
YDNKTKIEHRIVFFMRNYEYNDQSLSNEISKEFNITEEQALEEINAVRERYPNIKKSRKVLKKLENIPKYKPPGIGVDIQ
GKTRNNYKMRIAGARDRAQLNRIITFMNILIYLYAETYLYKRPDRQRMKDLLQKLTKIARRRNKVDEIVNHETPIKSVKQ
MTNIDRKRLTSKTEDDQNQWTRDCQNSGEDKKRRPQQFLNVEELQKLGYVWNPKLGDINFGHYERKIMVDSNGRTDSNKK
KSEVVLRAVMLPLDDSGNNYVYYTCGPEENGKHMYIGFLKSKNPYGEAKPCCFIKDQLYSKNNDKRNLFLKSIGLIQNDE
SEVNKIVGDQLYILQSSNKIQEGRFAFLPKYLDIFLNAMLNNERVIKNHYLVSTTTGYYFKYGTKQDEYRYLNAVGSVLD
LSIEDLRNKLSSSLTKDKNQLLFTSLNNGDIRAQFGSMESYLTYINTNKYLEYPLLNDLICSPGVINKYGLNIIIFQRKI
RIIRKSFEREKIREHYYIVCQNHENIEDLIDPDRETILIVKEGKNYYPIILVKKEDENTKEVSITKTFQYNSNPENIVSH
IFKYYEVNCQQEFKLLIKEKSNSNLNAKETNKILISLGIKEYVPKYQMIDARFKCRYLITNAGYIIPVIPSGIVKNVNII
STVNNYLKDYTTTQKYLTDLSKLTKNKLKIKPIGVFYKDKRQKSYVISAIMTEGYDAVPIIERSMTSEYIKKEKLLTQGR
PNDEDIDRDISRGKSSIVVDKRVYEVSKNKYETETYQLFRYHLSYFLNNTTEGGKFKKEIENIINSEDINKRERKLELKN
VLYRMTNADLSKTFNQLISRLNKQFGGQDNVSDQSENQSENQSLESETSPIVRELNPLTQANSPISIIQEPDTIQIIPED
FVPGRRNNLRESFVNTDVDEFDYSEQINPLDEPFDYATNKEPMAESVPFPKNEKTWLSIMPDSKIIDYPSYILKNNREYC
YINKNKDACNINKHCAWNNSKNLCLFNVKRIQLIDFINQVTEELIQNELRASEILRRGEYFVSNIVDYNVFTERPGERIV
MASNSNLEKILSELFGKENIPRIGKKRYKFDNTQTYEQLNFDNPLKETSIWYIQNIIDNNNTVFRAFANTYYWLVHPYDE
VSMRNIGYYSPLQTTLSNIYKSQVINWLLQQENQDMIQKISSYIRYNKVEDFVTKLSMDVNTISSGIVELCILSILYETI
IYVNDEYFNVIYVLHPTQGIVYDYKKSKNKFTNEKYQSYKKVIDIRFRYSSNSNYPDYIDALYPKKNN
>Q5UQT5 ~~~~~~Uncharacterized protein R345~~~
MLSNVGHSIQKMINNIPVAITANETSTAAVTSGGNVYQTGLIGGKIHYSFNEVITNENIVGKVIDVKATEDRLYLLNNSG
SVFEYAYNAGSCSPVVREVYSPAACGGDKAVQIEAGRAHILIKTEAGKIWGAGSNAQYQLVPQGQVRYDTAVEIIVTDTN
LHDNECPGSFTGVYNELECPVIPNCEKDKCDISCVKENLCDVHLGYINVSHLALNPPCETGTLSVPVYGDINYVGFLCVD
NKGCATGSVTYTINHLYIKCGCLLGKFTHKDKCGCHVRELNLSSTCQVNIFQANPCPSANTDLCALTGSAPITGTTQISG
KCGSCVIANVDIPIDFPLPSVSFEATCNTIVLEYNDCKTSITVLCDGTLCDYCECATTLDLDFDVPLKCEAPKPIKPQIE
LPQPCWAGIYAGYDISVLVDNCNRIYVYGSLHNIRSNKDLLKRSCLEELLKGTNASISFPADQLNCANHTARNDNCKCPK
CRDKAFKTDLNKFGIHLSFPNSEDECSQKNMSVCDFLQNLKNCNDAQGCEPTCEPCDGYIYLNVAGDCGCPCGAPASAPI
GSITLFNKKSICKLVSQNCPDISEVAIDVSTIVEFDLNKYCIDTRDIALDKVVKLQFCNDGPNVNVYIDIDKPGGIKFTS
NGKKCNVEFTVSASTQNHQYILNFGSILDPVELTNLKYALSLDCYYPCPKYKNPFDTKITNTYIRGGDHIKFVVSNPKNI
RQAVTADIPTVFRLNRRVIDVGVGYNNLTVLVGGLACPNEIYVIGSNCHGELGLGTNETIVSWRQLNRCIFDCQVNRVFS
GRYVTFYITQSNNVYAAGQWKCFINSTTPQIVKSICPAWRISDMAITLNQIILLGADGCIFGLGDNHLGELGLCHTDCVT
KPTPISFFYKLNNSAIKQFNDSLAHPMERNNRKCNRPFNPCEFGNFGGPCAPGPCGPFGPFGPSGPGPNQGPGDDYNNFK
STKYPRNGYNKYQPNNRIHSRNRY
>Q5UQT3 ~~~~~~Uncharacterized protein R347~~~
MSTVADTMQTLSNDIQLTFEELIKLLLRKTLKPILNPLGNNNIVYNSIKGKYERGLVNSANNDNINRYTYTMTKESLGWT
DEILAHFRINAKFIKSSVPQRFPSNTKYHNMMKPNAKISFSDKETKNIINQCGENAICQYKNLLSMLNTRNVTGILANKK
ATFNYSSDIPYTYTFKELEDIGKDIIDQTDPSIPGCDSARMSTAMLRVRLYNLEQNLAKLSVDKPDIFDAYIDMLRIING
VPRKKIVWVIKLIALLTYQRWEDLDKQKHQKYYEILGDDFVNSFNSNELITILLQSNSSEYIIKASLLFPLLKMLFIVFG
YSKLVPVISDINAEAQAAEHERLRKFYDKYKGATIRRQSDDQNKFYETECAENNFNADHAIKQIYGKADPEPIDQCVYDF
IRLFGDLYPTLIGLMGGTEKFPPESIEINCNEIPHSYEDLTIIPEFLWRFNDFSYCRYLEYICSKQLHNVLNYKVNEKTR
YLQSI
>Q5UQU2 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase R350~~~
MNRRNRSNDLNPEPSIENPNNQIAEEFPGNNSVYKSDGYVDLKNNGRLFPIWILKNFKQYKLPEIIRKENEDPCNVQVKL
ELRKYQEFVGQYLNPQGPYTSILLYHGLGSGKTASAINLMNILYNYDNGTNFIVLIKASLHNDPWMQDLKEWLGRDPSEQ
NVDNVTKLDRYKNIHFVHYDSPFADSSFMSVIKTLDLSKPTMYIIDEAHNFIRNVYSNINSKLGKRAKVIYEYIMKDKRE
NKNTRIVLISATPAINTPFELALMFNLLRPGIFPSSELDFNRTFVTESSYPILNPMKKNMFERRILGLVSYYIGATPDLY
ARQELKYINLPMSAYQYDIYRIFEKLEAEIQERARRRGKQSQLYRTYTRQACNFVFPYVNMNVNGELRPRPGKFRLSEKL
ADDFSKGKNLDVPDTEKEILNKYTKAIENYLNETERYFQNINKKDAENGRTIINDLDEFKKGFGTKFNSFLQYYQSEGPR
SSLLTEMYNCSPKMLAIAFMTYISPGKVMIYSNYVVMEGIDVMKIYFRLIGFNDFTIAREYMGYCEYHGRIDPKDRVRIK
NMFNDKNNVYGNKCKVIMLSPSATEGIQLLDIRQEHIMEPYWTEVRIQQVIGRGVRQCSHRDLPMSERIVDIYRYKVIKP
ENLDPDDTVRQSTDEYVEDQAKSKANLIESFLGAMKEAAVDCELFKEHNMMSQSYYCFKFPESAVTKTNVGPAYREDIKD
DVKYDSGLNSKNSIVERIRVVKVNAVYQINTDNNNPVYSSPTKYWYNKKTGMVYDFETHYPVGQVEFIDNLPNKLDKDTY
IMRIDVIIPSITGSVNT
>Q5UQV1 ~~~~~~Uncharacterized protein R354~~~
MTDISYYNNEIDKILWNILGDDYFTQDEFDDLVNSVANTIYQYDNEVSIDKLKVIIEFVILNKFKICYIYDNDSILNQVK
YEKKSVGSKTIGKNSTNDDEDDDEDIAVIKLSDIEAGENWFKKSPKISSKQFQSVDKVEVATYEDLISHKHDYPKEIYKE
SHYIRRNTRLDVIKKIPQFEQKSKEWLKQRTESLTATAISVVFDEDPYKHPIVILLDKCGRGLPFVENKFVHHGNKYEQI
GTMFYSFRNNVEVGEYGLLQHSGHKFIAASPDGICSKKANTGGLSKLVGRLLEIKFPFSREINNSGDLDGDICPHYYFLQ
VQTQLYVTEMDECDFLQCKIDEYDSWEDFVKDSNPIVPGLSKTTNLEKGCLIQLSDKNLIGSDDKEKCLYNSKYIYPPKL
HMTNEEIEKWISSEIMNYHNNDLSENYMIDRVIYWRLSQVTCNLIKLNKEAFEEKIPLLQQFWDYVLFYRQHSDKLDKLI
KFVEKVKEDNSAEIFSYINEDFLSLNKDSKYEPLYQEETEWRKKYNQIKAKKAQMYKNKSYNKYTKFSN
>Q5UQV0 3.4.22.-~~~~~~Putative thiol protease R355~~~
MNICGPQRYDKENNTCFNVDQLVEMAKAYNRYLSKTKLNPSRNYHFGDADLINIKSDKKYLLKQFKDRFGKICGSDEICL
THQAFMGELVGEMKDDILFGTFRSEGPSKSTEWLSTIDINQIMVPYENIYPNFKFIGAVPADCDQVSVCPLYNINYDKLM
DEGINYIATIFNHDRYGQPGSHWVAMFVDINNGKLYYCDSNGKEPTKYIENSIEKFAQFYKRKTGNDIIYKYNKNSYQKD
GSECGVYSCNFIIRMLSGEPFDNIVSNSLSFQEINSCRNVYFRNQPSKFKPHKLCDPTNSGK
>Q5UQX0 ~~~~~~Uncharacterized protein R383~~~
MKPIIFSLPRVINNESIPPPINKSKIYFDDNPSPKLIKYGFNNISEKMDLNILTSDSHYKAGLNIDFTRDDKNSFVSKTA
EIFGNQYDPAFYQAWEILNIFDLINKSESIYTNIPETLLEVTNSHKKLFKTNKQYNITNDINKAINIQLIYSRYSDIDID
ENALIQLIYNDLSSLFQMQIQGSNMILQLFNVQTQVTVQLIYLLSSYYTEAYLYKPESSSDLSDNKYLILIGLRNKSTID
LPKFPSNRYLLSLGVNDIPNNFTSVIQCMNSFVMPNKYETYLKIINYLNTKVYEGATYQDLIKEQNKFTLDWINIFTEPD
KIKTILDDSINFVDKNCANSSKLDELFS
>Q5UQW6 ~~~~~~Uncharacterized protein R387~~~
MDKKTWVYIIIAIIIILLLVWYFRNHMSDQKGVNVNNQTYNMLQQQISSLNQQILFLKQQISNLHVPAPTSTVNSLRQTV
SDINQQVSTINNQISSLNPYLPRNQQLELASVLSIFNRNALDLNNISRSVINRDINYFNAGQHGSQVPQNSHTVQNPNVA
DNELNVLQQKVDNLNGVVSNIRQHLAQFGSGIPESFRDEAEKAASYLNDRIDDINKNLPNLVQRLNPNQRNNLNRILSEL
NNDLSSLKNSLGSAVRNRINSVNIH
>Q5UQJ8 ~~~~~~Uncharacterized protein R398~~~
MIFFDMMHNKASKHGGAVYSLLGNHELMNTQGNFDYVSYENYHNFDYDSPSGEKYTGSLGRQNVFKPGSNFVKKMACNRL
SVLVIGSTMFTHAGVLPVLARKLDKLDLDSNKKLEYLNMIVRKWLLNKLSGKQDEEYKSLFINDTKISPFWNRIYGMIPN
NTSIDSDQCFNSVKKTLQVFKIGKIVVGHTPQLFTNKDGINGTCYERGEDNKLYRIDGGFADAFNAFNKKHVVQVLEITD
DKYFRIITSKKN
>Q5UQJ6 2.7.11.1~~~~~~Putative serine/threonine-protein kinase R400~~~
MMNKKNNKYKSDSLDSKEDKVDLYQDAMISNLISVNNKSTSDSDAKKPTENRIELLKNAYKGGTLKPMIDFDDHNTETFM
DKRITKNLLDARSLFLSMGVKLIYIKSGTTGHTFKAISRSNKNVVFAVKVCAYPKDDYGGIKSSSRPENVEIRMLKILSY
FVVNRLTPHLVLPIGTFHTDIEKFINIPEGVIDLKDEKNDMYKKFIERYHDGEFEKFVSVLISEWCNGGDLLDYIRKNYD
SMTLETWTVVIFQLLFTLALIHEKFPAFRHNDMKANNILVEKTDNKHEGPDKWYRYSLGSHVFIIPGIGIQIKIWDFDFA
SIDGIVENKKVNADWTKKINISKKKNMYYDMHYFFNTLISKRFFPQFYEGGVPQEIVDFVHRIVPEEFRNGSDNINKKGR
ILVDVEYTTPFKVIMTDPLFEKYRYNQYYFHPQRNMAPKKSILFQQGNGSKQPVPKKSTGQKPTKKV
>Q5UQK6 ~~~~~~Uncharacterized protein R402~~~
MSSNNDNSTNNKNNVNINTDNRLYQLDELISDIYKAVYIKDNDYTIKFIDVKDINLPRFKLSDYDLDIIFDKIKYAGSFP
TGIVIDGTSTNEIWFKRRGETEMSTIRIVPYQNKEAVDDITDPINVNQIMKTLLSELVVSEKTNNILLPVINVDVLGSDL
TTYGKISPYINSSDDTYYSVQVTEKYYSLKTLDQFFKDYVIEARTIKSIIYQAIDVLYQISVQYPKFKYNQLFPETIDCY
LKQDNNLIIPEIKLSNFYLSSIDDLVKNSYLDSNDFTVEQIADQYGDLYQLVNYMWNNLQSSIQNFPDVIKIFDIVLPKK
IRSKELYLTSELWNLLSEDEKFELKIKNLRNNHIFTSKDSLSNTTFVKSKDQPIDFSGGSEDNELSVEDFEASEDLNDID
ESIPVVKNSSKIKYPHKDIGIMANNKTISDRKLTDNLSNKSSNDNTSETTSDKSYRSSNSRNSDNSKSKTTRSKTQSSDS
SKSSRIPRSSESKRSTNSVVSVGSTGSDVYSDMERTEYPSRSTYKSRTINRSDSESSPVSSRTSSPVDDSRLKQSRISED
KPRKNKAYRGRRVIGQNNTASLLAALNDDNYNQGQNNVTDINSIGSMLGVSVNELASKNSNPNYSQIMQQIASQMNGQQA
SPNSLFGQAGNINQQLNPQQLSALLGQSYNPNTQFNQLNQLGQLGQLNSMNSINQMAQLGQMGQLGQTGQVNPMSQMSQM
NTMNPMGQPNQSYNSQNDTDLLYRYMATLNQGQSGQQMDPNAIATLMQQNSTGFPSYAQLGGNVNNNNNMNRNPFFFQ
>Q5UQK5 ~~~~~~Uncharacterized protein R403~~~
MNKGQNQVVPPMSQFGGQNPPQLSSIPPIVNPVVVQNRTSPGTPFITNKAKEIYNRRQQEEISSDSEEEESPIEPAKSKY
SRDSRDSRDTRDSRKPKKDSRNMLGSLQKTGTEIAYEPAVKFEMILPKVVKPTRPETPLGIYEQYTPILPPVNNNRFNPA
AFQHLFAPSSALLYGQSVKMPMQNVYNINLPGPTGGHVAMDKIYENILPGKNVKCGFSTLGERIQMLDYIRQILVSSCDG
ENISIDSSGGNSLLSYIKLMELNPNFYSTITNNPYKGLPYGLLIYRSCFPIRFEPTNQSVVCAKNSIGLNIRLYALSYAE
YYSYKLNNNIYLEYDVWRELMYYEYVKNRIIKSKQSPNFPILYAYFFCPNRNINFFQLKTNCLTRKELLSEEYKKFRQIH
EAISNTGTNKIIRPMSMSGQNDKCQLPDEVDPLFRLYSGTTLILITEAPHHNLYQWASRTYETDGIVQRMTSSGFYDEKI
WYDILFQIISALYVMQIHGIYIRNMTIEDNVYIKDLKISGKSCGYWKWIIDGIPYYVPNYGYLVMIDSNFKDTRSESSLL
DRDCCEREYKIYANNIYSQKYDSNVLNENIFENYRRIINTNAFTKEHTKNNVMRPPESIMRLIELMSNDPEKNLGTVLHK
YFRKYLNNRIGTLLRRDTEIPNVRDVTRQFNNGEMAVEVIGDQMYKWCMVSKTNSDGTVEIITRSDSNLDDYIIKDVRIE
TLKQYSPYENIDQNLDQEVKLDDNPLETYTISSN
>Q5UQK1 2.1.1.-~~~~~~Putative RNA methyltransferase R407~~~
MYSDIIRKQSDCYDSIKNFCTSESQLYPIKFYPQNEYYANKFGLDLNNLTQIDNNTISQKVLLEIAKIWANFIMDSEYPV
FDRKTGKGFWNNVSFKKNNDNQLMVKILVVGQWIDKINSDYFDQLIELVYNKCRIDTHCIYLQYSQNKSSPDKQENIQLF
AGSNGIQENILGINFWITPFTFSQANPNTCEMLYETLHQFINIDHDKRSIICYGRNVGHICLTLKSTKNVWAYNPCPVVD
SDLKRTLTSNIISVNIQLFLDTKCELISDKLNLLDQYDKSNYLLIVSPGRNGLKNNVIQAIKQSRSIIKEFYYVSCYSKS
LIRDLKNIENCHIEKAQAIDLFPSTEFCETIVKIVL
>Q5UQP5 ~~~~~~Uncharacterized protein R457~~~
MAGKFTNMRYDNQAYNEEIRRSTDPLLYKLDSNYSVNCSPCFAAHGPIGGHNNSVAIGNQIDVDSVLRGVGRINSKSNQQ
QAPESLNQYTMYTPRECSPSLESHHSRFSHPAHDIRGLNVPDMRLGYPLHDPQCQIFEDFGVNTRLQAKDNHRAVWQQPM
DQKSVYPKARPGREKNCTVSVNCTYAPYS
>Q5UQD0 ~~~~~~Uncharacterized protein R459~~~
MAGSFTRKMYDNCATQQTTKQSTDPLELLLDVNKYVNCNNICKPLAQRYPSSAQLVDVESSLWGIDKLASRCDSSKHPFC
AKNGCLLTNDPRIAPHITPYACEWGHTGDNSVVTTNMKMPSHPGYTLPNPNICKDQTNGYYHNAVKSPNMTGPQHQVIPQ
HQVVPQHQTVPQHQSVIKPKADQLRYQNIQNRPY
>Q5UQD8 ~~~~~~Uncharacterized protein R463~~~
MNQSINFPNNLQNNSNINNALKYLTKIIPNNKYPQQNQSHNNTTPVVTNAINPIEKLNVEEITYLQKYLENIKNKKLNLN
KQSNNQTNNQTNNQTNNQTNNQTNNIRPQINNNKQPISKIQTNRTNEIYDPLKREMPIDWRILPANSLNNFRNNAFDANV
FEPGSRGATSTRIGKKAQFNNPYDYGSKQNSFENVFQKPCNDPYVYDNNMLNQLNINEVPNNLRPNDLRNVDVESSLLQR
ESVHLPGQRNISEREFNRWNMLPFDPQDHRHIVWEDNMPRGGYATRAERLDDN
>Q5UQD9 2.7.7.6~~~~~~Putative DNA directed RNA polymerase subunit R470~~~
MSKSKSSRSTDIIDIYAKPDIKLKVLEPRQDRELRVELEGRSINHAIVNAVRRSVMLYVPIYGFHRSNIHIELNKFKNMY
NFDLMYNIFETLPIFDVPNFMDLIDPDVYLPVELSKNLFGRFVQEQYTEQSDQEDKLVDATKKLFKIELTLNYKNNTADD
KYISSHDCVIKIDGKTSDGYLKRRPICLFVLKPSEEISLRAEANLGIAKNFAAYEATTNAIHEEKNPNKYVITYKTLEQL
NKYVILNKACTIIYKKLENLQDYLLRTYTEDRDPTEQIDIELYGEDHTLGTILENVLQQCEYVEKAGYCMPHLLIDKILV
SYKLYDDSEIGPIKVLNDCITYLIKLYKQLADLVPKK
>Q5UQE4 ~~~~~~Uncharacterized protein R472~~~
MYVNQIDDIIDGILNKLYFEGLSNDESFNSIVNSNKINFVEYREQINKFIDDFVKSIDLTEIRKIINNKDNLNRIIDIIK
RYVAYYYFLSIAYNYTGSLKDFRNNMIQYSKLQESSTFIIRNFFDTENNYQLIKYFKIIKDTSKILLMTDLQRKTLNPLD
VKDTIDFLNGLGKEYINNYLLMITVVDNEDTVNINPHNLIKTIVFGELYKNQERNLVFEILNEIEEDKEEYTYIDIVVSS
DETNDLNNFRQIFLGEDDAEARARDLFELANETNRVTTIETVETKNNDLIALKIITPIVDDFLRYHRDTERLDAESGPVN
IPIVSNNNSKNVQLALLYQQRKKKENTRAQLVINKLDAILDYYSPNVKNNPEFISEIKKYFQNPLSYRKAVLHNYLDEVN
VIDKIRKQGKKAMENNEYFLELMQIITHAYFNFKDFKNFGTSINLFNTKPVNMLRYSNIEFQNQMSNLEVDVHTGIEGQS
VNLVGLAIGPFNDEPVSCTKKSDLLDIRKIQITYMKNDQPVTRSTDNGYKAFLKIIKHFYINTLEIREEPEFSIYNNFDD
IRKLNPDIFGKMIYWTYNTELDTFEMDTYENIKSNSVQDIIRFMNAMIYDKIMDFLDKKLVLLIETHTNLSLSKIESLIQ
IFSNMNQLSIDQEERRDLVISNFLQKKSQETTIVPKKNLEIIPLPEYQPLNLKKPFVIGISMINPLNPQPYIKLEAYSRT
TKDRGIIQATHGKCKHESEWNEINKVKNQNLNKYNSLVTAFIEKYHLETTQLDYVCKVCGQILPLKRYVQDGSFNNNTQQ
FVTAYVPIDIPLEEIKEYRKYVLAIRYIDALINRVSLITNTNMLVGTNTNVRQRRKGLVKNIIDIILKHNSVNMRKNISD
VERSEYLAKKYNINKDLNEVYFFELDDSIFNFTPTASNTEITLNKLKFNNILLYFILIFITELNGPQITMMATDKIAGNI
YVFLKYGQKLFGDLLIKTNINSNETAPITQYPVLCYLLYLLSYYLVKYKLWYQPGENTKVYNPYYSKVIINSFVDLFNGI
SSDAGRITDDYVYKLTVSKMYTQLNTTYRNNEIINILRRNQSKYDTRSSGIDTTQVTTENEIPTYPIANPINIPVKPRAI
PDFKKSSGIIFDREDKILYPIQLTNTDITNCPIGSYHAWVSDGHDIRCTICGEKGSEVTGSVIRLDANYYYSLNNIANRR
CIRGTLHDFIDKNGKLVCSICGHTPNETYDKPDLDKLMDNINKIDDQNAENLLRNIHNQQIKYENQQKVVEDFIREIKTD
YAKDSNNKLYGRLGPICDKLITIFETYLGSNVNLDIDKYPVYLRNNIYIIDHSYNGTPLDKPVIFSQNENRILFRENHQF
FKTDVYYYTDNRLQIDVFYHAVTLKLLGYKEKHKEYTRVAKTNSYLKINYSIMERLLMLGYKTKYIDIEDNFVRNSSRIK
DVNTNYFQIIDNLIKDHINKIKKIIDKFSSTIYKIKNYQNQLNEEQEPIYLQSSQDIDKLISKYFNIIKIFNIGEDDKAF
DNWNYLRTNFEYQEINWLDTNVRPSENMYVNSELVNYYDISSSEMMYYLVDQLISIIDSNPEKITRSNLCQMMVEIIMYI
YNIYNIDEYKNILEFKRFDYIINGSSVMVDMLRRGQGLEQSKELEQHLDDTEPDIMQDMDGEPQEADELEDLKEEAESLD
IEGDYFAEEDEDYAQEDFIE
>Q5UQG0 ~~~~~~Putative PAN domain-containing protein R486~~~
MSQTAIIIWIVVIIILLVLGGLGAYFFYSRYRHRKNIPPTPINPPSSITPIQPINPPSSITPIQPSGPPSGGNHPIPASC
PAYQLVNNKAITLVPLPADTSNVKNAQDCQNLCTQNPDCYFYNYVGLFDGCSLMQGTVDNNVMTGFAIRGSEDGCPKWAR
YNTSIQGFNTGNPSNVESEEKCQQLCQQNSSCDWYTYDIGKKTCTLNKAIDFNTSTLGIKMPH
>Q5UQF7 ~~~~~~Uncharacterized protein R489~~~
MAKFNNNILLIILIIVILFIIFYFLNKNNQSNTNNNYPVSHFSSNVSRTLPVNTNVTVPTVNDCDELSENIVESLLSKYD
SSMMSDRSPFHNLVQQKQQTLRSYDGPYDNDESDDRDFTYKKNKFTRRTPNDLNDLFDVNKMLPQETEEDWFDDLHMKNA
KHINNTHMIHPKKHRGLDTIGSTHKNATHDLRGDIPNPKMSVSPWGNSTIEPDVFARGLCG
>Q5UQ83 3.-.-.-~~~~~~Putative alpha/beta hydrolase R526~~~
MTDYQSKYSLYKRKYLSLKQKQNGGNNTADNTADNIDPIVKKFVDSIKDAKPVYEVTPEEARKNLNSIQSDQSYKTTVDM
ENVVVNDKNVNATIIRPKGNRDRLPVVFYVHGAGWVMGGLQTHGRFVSEIVNKANVTVIFVNYSLAPEKKFPTQIVECYD
ALVYFYSNAQRYNLDFNNIIVVGDSVGGNMATVLAMLTREKTGPRFKYQILLYPVISAAMNTQSYQTFENGPWLSKKSME
WFYEQYTEPNQNLMIPSISPINATDRSIQYLPPTLLVVDENDVLRDEGEAYAHRLSNLGVPTKSVRVLGTIHDFMLLNPL
VKSPATKLTLEIVVNEIKRITTPNKN
>Q5UQ94 3.1.11.-~~~~~~Putative 5'-3' exonuclease R528~~~
MSEEYKKRILIQYENYLLQQDNYVYLATKNSVNWSRNKITPGTAFMNKLVNYLKSDQIQSLLNTNRQDMNIIITDMYEVG
EGEKKIVNYVHKYLHNTSDTVMVYSPDADVILLCMLMPVSNLYMLRHNQETSKKFKRNIYDLINIKMLKNNISYYINNNP
DFSRENFIVDNINYDLVCISTLFGNDFVPKIETINVKKGFQNIMDAYLKTLIELKERNTYLVRKVNGKFNLSLTFLRRVI
KNLLPEENDYIKHNKLYNTYVTAGQIKNVFSYMEINSENIVSVYNEFMREYGDLKNLIKNNGNLTYFETNDQFMNSLKKS
ICIIMDDQCVNTSYLSNKDTIKLLRNYYIKTREFPRVNINLNTWSHSTDDRRHRNIIRENNYNKYQIEIYKFDKMLDEYY
VKFNAQPLDLSRNKIEQFYETYFGIILLDKNKNLTEEANEIMRDYTEGLLWVFDYYFNDKTYVNRWYYQHEKAPLLTHLS
MYLDGINHDYFNDLLSGLKKYRVKNIKNFFNPVEQLIYVSPMIPGIIKLLPSNYRSYITSDHLDPFLKTYFIDVEEIVDQ
LWDQKISDEIDCRSIPYLNKCIIKSIEKPSSSEDKLFLTAIRKVKPTATSHKRSKSIEPDF
>Q5UR35 ~~~~~~Uncharacterized protein R553~~~
MDYKQEYLKLKQLLLNQRGSGKMNPLLLNNLSSMVFIVGNNILGDDVKAVIISLIEEFNRIASQDLRHNIELSQYYAVGR
GLMTREGTIQCENCELVLKYTHSSVCLSNGRIVRKTSGDNEVNCQVSRGMVKFYIKQKTGDQYRYTRIRIARVQQKGSLA
PFTDALYLNNDELDKLHEDTGAAISLSHSDLAKQSASRGDLKVKSPSDVAIATKNINKNVDQINRAATESQLGLMNRSRV
VSPVPTQTSVNVANTVARNMSIGDKNSAIMVPNNPISLDSSDDVQPVIGAVDERGIELQHVGNGGPVSQSNTLNSKIPVT
SAQIPIDANIAVVPSAGLGTSETITGPSKKSTTSVYNKLSQSVSNAFHTDKAQDKVTDKISKETLKQDETGIASSLGKET
KNFFGNLWNKAGDTADKMTNWLRNLGSKSTVVPVSKDLSVVTPLPPQKAGFVPVPTNDDSNTNYLQNKVYTNNPPIDMQY
SQNAISTVRNGKTNLSLVEHTQDRQSSTRLNPRYNVMY
>Q5UR31 ~~~~~~Uncharacterized protein R557~~~
MSQVTGFDQSNVAQYSSFDPLGSLEHAGSNLGNFIQRNNPFPSLSQSASHTFDDVRSDSGKIFDELKSEADKFYDDAKHG
LSDIDYRDFYASDPGNTVLRASMQSPYLNSYMDINNAPNVIPLQAPPIVTNRNSKDYNILFVVVILLLLFVAWRCYVNKR
>Q7T6X3 3.6.4.13~~~~~~Putative ATP-dependent RNA helicase R563~~~
MSSRLSYVNVDDPNFYEFIDEKYSKYKIPPKQKTFKQFCFPSKYEFQIPQQFLAEYINPKTPYKGLLIYHRIGAGKTCTA
IKIAENFKNKSKIMIVVPASLKGNFRSELRSLCADDHYLTSKERSELKILHPSSDEYKSIIKVSDDRIDKYYTIYSYNKF
VDLIKQNKINLTNTLLIIDEVHNMISETGTYYESLYETIHSAPDNMRLVIMTATPIFDKPNEIALTMNLLVRNKQLPVGP
DFVSTFMDIRYNSKGPVYHVKNMDLFKEFVKGYVSYYRGAPPYVFPKSELFFVRTKMSDLQKTVYQKITGKEVKQTKVRD
YVNENISNNFFIGTRMISNIVYPNEKVGLKGYNSLTDEDLTIAKIREYSPKFLKILRKIKRCNGTVFVYSNFKEYGGIRV
FARLLEFHRFKNYEFNGSGPRRFAIWSGDQDPIYKEEVKAVFNNKDNEFGSKIKVILGSSSIKEGVSFLRVQEVHIMEPY
WNFSRMEQIIGRAIRFCSHKDVELDRQLVKVYIYLAVHPDIKMSIDERMMKMALDKKMINSAFEKALKEAAIDCELFKNA
NVYPGEQDIQCEQ
>Q5UR52 2.3.1.-~~~~~~Uncharacterized N-acetyltransferase R584~~~
MPKSSSNNPCPPGEILREGYYRHGYERRGFKRSSGTYIPPTEVSEAYVPPTCIIDRGKPGKGPRILPKPGNDLHLSWYGY
AVHEPERDRHRALVEASQENDPLEVLRRLNLLRNYQAYPDVKDIMSQDVKFMSDFYAKYKERNQEANRFGRSNSRSRSNS
RSKSSRSRSNNRSKSSRSSSTQSKSNNRSNSRSNSKTSRRLKKLPENDLDLDQDGGNRYGGDPLTSITSDTTGNQILETN
IKLSKETECSEGKCQVFNKVYESHTINGKKIVFETLDQNDSENVLSLDKLYLDSDQTIDRVRQNLSTNKGYIIGIKSDNK
LEGYCWYKEISNNEVKIFWFCANKGYGTALYTFMEKYFKLNNYSRIVIDVSMEGSYAVRRINFWNHQGFKTYQVKTDNHK
IHMEKDI
>Q5UP54 1.8.3.2~~~~~~Probable FAD-linked sulfhydryl oxidase R596~~~
MSLSKQVVPTHRVEIAPNSESTCKMDHSNYQHNGLITKIWGTAGWTFNHAVTFGYPLNPTSDDKRRYKNYFISLGDVLPC
RLCRESYKKFITTGKTALTNEVLRNRHTLTKWFYDVHNAVNNKLEVDYGLSYEDVVNKYESFRAKCGKPVPTVKGCVTPL
DHKAFSFKKLYYMDAPIVSLDKVENFVRIARMRGISDKYFCFLELATVLNGDFNELKKQSSWEYRNKYCQKKIRHMRENA
IPSIEEQGYWKGTPTIDELKLLLFLCSNLNRTEVNDAINNVERLESTHYIEN
>Q7T6X1 ~~~~~~Uncharacterized protein R610~~~
MTSCNYQPSEWPGDESWARLNNNQYDKVGFLEGVYDEYTKNKEPVELKFAGVYSLGSPNNFFRVTPSPVPVPTPMSVPRN
VPRNVPTPAAPTPVTLTYRVPVTHSVPVTTEVPVTHTIATQPVMQTVPVMTQTVPVVTVHDSSPVAHVQVPNVIEGFEPL
EISGRGGNTRPNFGYDTRPTTRPTRPDFGSDIGFGTRPHPRPPSPRPPSPRPPHPRPPSPRPPHPRPPSPRPRPRPQPDR
NWWSNRPRPHPWGPGPVIRRSWPNVYGFPIWFNQPMNSFPLGWDIITAREIYPNIRVVKADGIDVATTMDYQPERLNVET
VNNIIIRSYGFY
>Q5UP43 ~~~~~~Uncharacterized protein R646~~~
MTFRNRVGLTSAGGCCCPTLMTINQNVINNDINSPCGATSPYNFDAVVANRASVNPLGFGTRNISVISPASPTYTPTIDD
YTGVYLRYNGNPITRPRNYLGNNFLNGGAENIADRTFNNNCNLASRFTNNDLRGYRAVDPNGRMRMYPPVFCDRPCDPCN
PYSQSDIPYASELPYGATIPTQLDYPYGAGSIYSRGSQCNIAPVPSIYNYGPPTTFDPYNGNVQAIPIGPYGTTEPNICN
PGAYRPDLFYPDRYRMDPNRLDNPRTYWNRLYC
>Q5UR09 ~~~~~~Uncharacterized protein R648~~~
MMDNMQCGYYNSGSPYNFPSVLGPGAAIGPGPVPLGYCKEPCPTRSLCDPSYIGACGIQPLINPFGPRRAVTTWKINYLV
SNRTNQAAHTDPDLINPWGIAIFGNQLWIANGQTDTITNYDLFGNKLLGSITVRNIAQNSSYPTGIAINCTGNFATTNGT
LTKSGLFLTCSEHGTVHSYNPQVDPLVSFLVLNEQLTGEIHVFRGLAVAGDVLYLADFFQSKIMVFDSNYNRLLGFPFVD
GDTSDPIPISYGPTNIVNIGCYLYVVYARKDPNVPLQAITGAGFGYISIFNLDGTFVRRFTSRGVLNDPWAIIPAPVECG
FPPGSLLVSNHGDGRINAFDCNGRYVGPMLNQSGLPVIIDGLRGLAPHYTDFNEIFFTAEVDENIDGLVGSICKDQVIYF
>Q5UQ64 ~~~~~~Uncharacterized protein R653~~~
MDSYEPSQLIWARLLLYEISKRNGETEYQHTNGPVWWGPNDNKTVYLSITDCSGFVNALLRKSFELSQKDMVEWFDSQRP
LAVDYYNTIYHQNGFTRIKNIYKLEPGDFIAIKFPNHHPSLDDTGHIMMINSYPEIMASKNLPDEYQNYNNILQFRVNVI
DQTATPHGRYDTRYSPDDNQNGLGSGYIKLYTNIDGTIIGYSWSLSKKSRYIDKTVHPIVVGRLDIY
>Q5UR17 ~~~~~~Uncharacterized protein R658~~~
MDKTHTEYIDLTREISELSDRQKKFDLESVTDSLDDLDYSDDSNSPENELEILPEDVHTKKIIIDWTRHAESCSNLDSNN
VHDTDEYPLRKTGYDNLNSHDKYISEKSNTNAIMMARKMTSKVKALAMYHPNISYIGCQQAVLLGSYLTKKEYQYDAVFA
SPTVRSIMTALMICRGLKVTIYVVPYINEHTNLAGSKDNQNTPLNSTLLKKQILYIKDWLENNWINNFDDIYVMNVLGDL
RKEMTNMVNNPYSEDIIKKIDYIFNCKPNLKKDGGNVNDYGQKCAIFEEINNIIKILDQKFSNAKKSNLIERGVFDEDLE
SIHDTLKKITDKKFIRGPSVNFTILEMLEKIESEKIPNDDKFFIHQNLRKSCLNKFYTKILPISFNLNFINRNKITIAIM
CVSHGGAMKKYFKSKYPSKNIPDHVLNTQIFREAIFINDNAIEAYSIDFNYYVPRKIRATYSNFETLNIDICRLGSLKGI
LNYPLYSPEWESKIKPKLSFKTLPPVNYATPDSKFYFENKDKYQESITDYEIMGGFNFIHN
>Q5UNT7 2.7.11.1~~~~~~Putative serine/threonine-protein kinase R679~~~
MSTLIDNHIEHLSKLSDKKICRILRIKPSKDIDKNDLITKLILDHYYGKPRLGNIIVIKSLQNGGVRNWEFNLEEIDSNN
RQLDSTIPWNNDYSLREIQNTLNHRYNNINQDLVDLLSVDDMNNSLNTDFLKKFYTYFYIGNYNRDKLCQTMKTFAKSTK
KVTSNGITKNKTVGKGAAGIAFLAETNSGSFEFVIKAMNNVKQYRNKSLDIGTILYKSELPLRSNVKNNHLEYLATDVMR
TVELRYPGYSGYNAFISNEGLLYLNSANDNFTNQTIMHIVLNRILTQYDNDHFIYQFDAFFCENRSGLKRGTSTLTNKIT
LGKTNSTNVKQTDGYNIMEFANAGSLDAILDDWSKSINIDSNYETLLFMFNDIFVQILKTLKILQQPKFAFVHGDLKTKN
IFVKTDGQINLPNGQVFPRYIYKIADYDKSSITWNGIRFHNSGNLGTNIIGKLYDNLNTLDLTSTVDSNYYYLTNICPFI
ESCTSIINGIELESIPIRYLPIPFYSSIDVYSFVTSMLCHKIFHNFVDYCLVRSIDNEITNILKHLFTETDLNIVMDHIN
NTFNSNKKLDLTKYGKIISIIKNNHIGLRKNINKIYDIYGIKLHMKETRTVVPNIILSADQNICLDKCKLNTCNIIQSTR
YNRLSDQCYWKSTNSETQELHISDSDQIDREIDSDEQKQIIDNLFNDIKQQSKK
>Q5UNV1 ~~~~~~Uncharacterized protein R691~~~
MSRRKTATKIIDYLEKRINCENIMRYTDEYLDNNSPDKSLVIDEVPVSFTIIPYNANENYSRQLVSTEGDREDKPDFNDI
KHMFDVFDFMTEANNTGFEHFPYIYGVLDCLNDIDSTVYVYYEKFDGTLPLLIDNIEHPSDWYDIVFQIVLIIMYIKYVG
KMSFKAAPERFLYKKITKPYYKEYSVGDTTLNINHKYLIVCWDTNTTDFEGIQNSESSSENKLPIIDLDFLTEYINVNKD
SLKIQPSNRIIKLIQEIKNQPDNIPKILVQYYGPQ
>Q5UNV0 ~~~~~~Uncharacterized protein R692~~~
MIKIYRMKFKKFVFKFIDHDKRNFTVVCVNVYANKATHEFAHDNDFSKKLEWKIKHFKHAHALERRIHQLVKETYFREST
GSLDQFADFKSVKVCVKDKIVKINLGENQEGNPVYKQVKSVSKHYHVFVRGTKPLNRREKGAYTHSMKVHDIHLTGNLDQ
GLEFAELCNFSIPESGIHSVQSQSSVTQSLNGQNVNPGAVVTGGDNWLSATNNANWNSTANTNAAWNSMNRNSVAQNSAS
KNANNWNSAANSAVKSSQNNNLSAMNNSLYNNNKAVNTNTINSTNNRNVSSQNNANRNASMATTYNNSVNSANSINTANT
RSQTGGQDEEDFEKKYKKYKNKYAKLKNQKTSNF
>Q5UNW0 ~~~~~~Uncharacterized protein R695~~~
MATNTSSNLLTNPYTEREKKLIDEAYYRDLKYNLTSKSRWKFIGDVSETLSQICVGTSSVLAFASGFFEDIDILAFVAGT
VGVGSLVLLQFSSYAMKESSERTQQVNVILTKLGLETIPDIVVEPSIIKARLQGELGEQENDVVIEV
>Q5UNW3 ~~~~~~Uncharacterized protein R705~~~
MSRNGRNDYDYDDDYDNDQMGGRLARDRERDSEGYLYTKTGERNRRSGPEARERNRENALNRSRNDGRFAPERGGRGYGD
DDVKYTKSGRVNGRTTREARERNSRTARSEERDELGRFIGKRGSSRATRGTYTEGANDAAQLLLGGKQGVAENPETDERH
SDAFRMEHRRLAEERERDEFGRFLPTEDGEDGRRGRRSNSRRRSSNARSTSSRSSGSRSSGSRSSGSRSSGSRSTGSKTS
SRSSGSKTSRSSGSSRSRSGSSGSKSGRSRNSRSGTSGRSSNSRSGSRSGSSSRSSNSRSGSRSSSRSGSSRR
>Q5UNW2 ~~~~~~Uncharacterized protein R706~~~
MSNINKILDPNSRMVSGPLNVVRLEGNFYGIKKVLYLFMDYHADVSDQTQCENIFSEDVQKYFAKTFYKLNDSNKIYDFF
LEVFPTEIAQDIYSDDVPEIDHKEMYIEEVVKLFKKLFRYDPKKNRVLMNNITKNIRLHYIDIRDYYKNNVHNRTADMNI
IARNFMVKGNIKVSQLNKIIKLMKIIRNHLEEIVEILEYSPKNNKSQRSKIIKTRDYDDLDIQTIEYLARKIKSTYKYQD
VKKIMNMLLQQSIDNFKSTIKNIDESIKIFTNYAKQISESTDKLVRDPNTSYVYVYGLSPYTIRNMIVDIANRVDKIMDE
QLIEFFARFTDIFFLRRFLDKDYITNGIIYTGALHSNTYINVLVKYFDFKVTHFSYSKISNPQLLTSEIKKRSLMEIQEL
ILPNSFGQCSDMNSFPKEFQ
>Q5UQ56 ~~~~~~Uncharacterized protein R710~~~
MSWHTGSNQDNKLFPKGKLSGSYAPLDIAFENSPAMNEFENRLCHNNPIISERSMSPAVSASYSNPEATSCGCMQTQTQP
QHQTLSQHLPQTHHTDAHDQQKLSGIFYNRTTDAQNQFSETINPPPSYTVHNTDIRIPLNRQQQYPANHLGSELLEGYNN
VGTEPCMGFWEILLLIILIAVLVYGIYWLYKSEK
>Q5UNX9 ~~~~~~Uncharacterized protein R721~~~
MYTSTKPDKQNKLKKLIYNEKYHDNCHDYLKTTYLEKYAEPRYKKLLYKIREKIPKVGICEDTVFDDYRNVIVCDQHAVI
FGHYNDVFPILATYALNACVGLVMYVPKHKIGALAHIDGLPGYSQESAKEDGLELDFSPVYENIEIMIRYLKQLSGSDES
LEITYYLIGGIYGLSEVMVHDILEAINKIQNDKLKFNFMGRNLLGPGNQSRNICIDMATGKITYFDYTINSEYYGKNRKD
NVPMNIIRAPRKSEAYLDITYVPISIDDSQ
>Q5UNX8 ~~~~~~Uncharacterized protein R722~~~
MSKQNNRIIPTNRDKKICRKCGYHVCSDHYVANLNKHHEIIKNHNDKINRVNMNDIINIKLLFHILLPRDSFNKDKVISR
THDIVCSLNDDFNNYSSNTNTMNNSKYKNIIKQIFGANTIKQGIYLSSDYQKIIPEKSSNIVFELGEIYYYPVKHKLNLS
KYDDISDVESQRHEIEQYVRSSEAGAYEPKKFVNIWIIDMIDTSILGFSNFPWEVVNSCHGIIANRRCFFPEEYGESNYS
SFKTFTHHMGHHLGLLHVYNPNHCQSKSCDSSGGSKSIVLDFIIDPLDKINNKKLHIDKEYNPYFTNFMDFTCDKYVSNF
TVRQIQEMRFMINKFKPKLNSLLNESQCPIPKYNPETDTISATINFNQRFSNREGSAVPSYEKADNPRMSASQGMINPEI
FMGIADPQQPFLKPTVVPTNKNINDLIPNLCGTTLPTRKSGNTQDQIIENIQNVLPSSYMNSEQPKDAYADFKKKYNIIY
SEDSYIINHPHNPYLLQQHHQDISAMQQQILEEKNQLRRATIDVPVCGNPYNAGQTVGHYVPPINPEEIDRFNAKRQFAE
SFNPEMFRPNDPLMYQSPQVDPSCCQKPMDPRLYRPAVPSDPRMGQYMTDPRLFQNGNQTQYQSNPQSNPQFNHQAYSQP
NPQAYPQSNPQMYPQQTTFPPRYDPRMNNPYASRATSNGLSPNNVVQQYQSYYDNPSNQQSNQQSNQQSNQQPNQQPNQQ
PNQQPNQQPNQQSVQQCNQVNQEALNDLRSRNIKASPTVSPGDLINKMNRVNEQLQNIKSSLQQDPDTSTTTRVNTAVGQ
RSFGVPRVSNDQSKPKFNKFGQPITNSPFVSGNVASTAMSGKITKENISVNPRDASKAPKSRFQRTKPPQAV
>Q5UNY5 ~~~~~~Uncharacterized protein R727~~~
MNNMDYDPRSMGSYGPNYNNFNPNFGKIRRHKNTSYPGYNGFDPSDGPMNGPMGGPMGGPMNGPMNGQMGGPMNGSMNGP
MNGPMNGPMNGQMGGPMNGPMGGPINGPMNGPRGRQMNGPNNGPMGGPMNGPMNGPNNNQFNGPMNGPNEYYSPEDSDGS
DYSDSNPNEFDSDDDDIDLSYFQKRQYDKPTYHKFEVTDEYLADYLLRKFGLDKDKVVRSIQKNKKDEFKAMILNYFYKE
NPKIKHQSINNQYKLFRKWNKSGLKIKIDDLETYYENKVKPHMH
>Q5URA7 ~~~~~~Putative lipocalin R877~~~
MWIIILIVIIVIITIIFSKRDVVSQSSLDIQRYMGTWYEIARLPTSFQKGCVNSTANYQLLEPNKIQVTNNCEINGRINS
VTGTAIPAANTRIVSGFLTPASLMVNFGYGFSPYNVIFIDENYQYAIVSGGNDTLWILSRFKNINQSTYNQLVTIVYNQG
YDVNNLIRN
>Q6UY71 ~~~Z~~~RING finger protein Z~~~
MGNSKSKSNPSSSSESQKGAPTVTEFRRTAIHSLYGRYNCKCCWFADKNLIKCSDHYLCLRCLNVMLKNSDLCNICWEQL
PTCITVPEEPSAPPE
>Q6IVU5 ~~~Z~~~RING finger protein Z~~~
MGNCNGASKSNQPDSSRVTQPAAEFRRVAHSSLYGRYNCKCCWFADTNLITCNDHYLCLRCHQVMLRNSDLCNICWKPLP
TTITVPVEPTAPPP
>O73557 ~~~Z~~~RING finger protein Z~~~
MGNKQAKAPESKDSPRASLIPDATHLGPQFCKSCWFENKGLVECNNHYLCLNCLTLLLSVSNRCPICKMPLPTKLRPSAA
PTAPPTGAADSIRPPPYSP
>P18541 ~~~Z~~~RING finger protein Z~~~
MGQGKSREEKGTNSTNRAEILPDTTYLGPLSCKSCWQKFDSLVRCHDHYLCRHCLNLLLSVSDRCPLCKYPLPTRLKIST
APSSPPPYEE
>Q6IUF9 ~~~Z~~~RING finger protein Z~~~
MGNCNKPPKRPPNTQTSAAQPSAEFRRTALPSLYGRYNCKCCWFADTNLITCNDHYLCLRCHQTMLRNSELCHICWKPLP
TSITVPVEPSAPPP
>Q6UY62 ~~~Z~~~RING finger protein Z~~~
MGNSKSKSKLSANQYEQQTVNSTKQVAILKRQAEPSLYGRHNCRCCWFANTNLIKCSDHYICLKCLNIMLGKSSFCDICG
EELPTSIVVPIEPSAPPPED
>Q88470 ~~~Z~~~RING finger protein Z~~~
MGNCNRTQKPSSSSNNLEKPPQAAEFRRTAEPSLYGRYNCKCCWFADKNLITCSDHYLCLRCHQIMLRNSELCNICWKPL
PTSIRVPLEASAPDL
