HUE025604T2 - Optimized cellulase enzymes - Google Patents
Optimized cellulase enzymes
- Publication number
- HUE025604T2, HUE10153355A
- Authority
- HU
- Hungary
- Prior art keywords
- gly
- thr
- ser
- asp
- asn
Links
- 108010059892 Cellulase Proteins 0.000 title claims description 13
- 229920001184 polypeptide Polymers 0.000 claims description 126
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 116
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 111
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 54
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 42
- 102000004190 Enzymes Human genes 0.000 claims description 33
- 108090000790 Enzymes Proteins 0.000 claims description 33
- 125000000539 amino acid group Chemical group 0.000 claims description 32
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 31
- 150000001413 amino acids Chemical class 0.000 claims description 31
- 238000012217 deletion Methods 0.000 claims description 27
- 230000037430 deletion Effects 0.000 claims description 27
- 229920002678 cellulose Polymers 0.000 claims description 25
- 239000001913 cellulose Substances 0.000 claims description 25
- 150000007523 nucleic acids Chemical class 0.000 claims description 21
- 238000006243 chemical reaction Methods 0.000 claims description 20
- 239000000203 mixture Substances 0.000 claims description 20
- 108020004707 nucleic acids Proteins 0.000 claims description 20
- 102000039446 nucleic acids Human genes 0.000 claims description 20
- 239000000758 substrate Substances 0.000 claims description 18
- 241000235648 Pichia Species 0.000 claims description 14
- 241000223259 Trichoderma Species 0.000 claims description 14
- 230000035772 mutation Effects 0.000 claims description 14
- 108010084185 Cellulases Proteins 0.000 claims description 12
- 102000005575 Cellulases Human genes 0.000 claims description 12
- 239000006228 supernatant Substances 0.000 claims description 12
- 239000013598 vector Substances 0.000 claims description 12
- 238000012545 processing Methods 0.000 claims description 10
- 241000228212 Aspergillus Species 0.000 claims description 7
- 241000222120 Candida <Saccharomycetales> Species 0.000 claims description 6
- 241000235070 Saccharomyces Species 0.000 claims description 6
- 238000003780 insertion Methods 0.000 claims description 6
- 230000037431 insertion Effects 0.000 claims description 6
- 101710121765 Endo-1,4-beta-xylanase Proteins 0.000 claims description 5
- 241000235649 Kluyveromyces Species 0.000 claims description 5
- 241000235346 Schizosaccharomyces Species 0.000 claims description 5
- 230000007515 enzymatic degradation Effects 0.000 claims description 5
- 239000004753 textile Substances 0.000 claims description 5
- 239000003599 detergent Substances 0.000 claims description 4
- 235000013305 food Nutrition 0.000 claims description 4
- 125000003118 aryl group Chemical group 0.000 claims description 2
- 239000002028 Biomass Substances 0.000 claims 1
- 241000475481 Nebula Species 0.000 claims 1
- 241000228143 Penicillium Species 0.000 claims 1
- 230000004913 activation Effects 0.000 claims 1
- 108010008885 Cellulose 1,4-beta-Cellobiosidase Proteins 0.000 description 147
- 108090000623 proteins and genes Proteins 0.000 description 55
- 241000499912 Trichoderma reesei Species 0.000 description 42
- 230000014509 gene expression Effects 0.000 description 41
- 108020004414 DNA Proteins 0.000 description 40
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 40
- 241000959173 Rasamsonia emersonii Species 0.000 description 35
- 230000000694 effects Effects 0.000 description 33
- 229940088598 enzyme Drugs 0.000 description 32
- 238000000034 method Methods 0.000 description 32
- 235000001014 amino acid Nutrition 0.000 description 31
- 238000006467 substitution reaction Methods 0.000 description 31
- 102000004169 proteins and genes Human genes 0.000 description 30
- 230000004927 fusion Effects 0.000 description 29
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 27
- 241000235058 Komagataella pastoris Species 0.000 description 27
- 235000018102 proteins Nutrition 0.000 description 26
- 108010076504 Protein Sorting Signals Proteins 0.000 description 22
- 210000004027 cell Anatomy 0.000 description 19
- 238000000224 chemical solution deposition Methods 0.000 description 19
- 108010089804 glycyl-threonine Proteins 0.000 description 17
- 239000000463 material Substances 0.000 description 17
- 108010037850 glycylvaline Proteins 0.000 description 16
- 108091026890 Coding region Proteins 0.000 description 15
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 15
- 230000027455 binding Effects 0.000 description 13
- 238000010367 cloning Methods 0.000 description 13
- 102220125025 rs773325406 Human genes 0.000 description 13
- 241000233866 Fungi Species 0.000 description 12
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 12
- 238000004519 manufacturing process Methods 0.000 description 12
- 108010061238 threonyl-glycine Proteins 0.000 description 12
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 11
- 238000012512 characterization method Methods 0.000 description 11
- 108010077515 glycylproline Proteins 0.000 description 11
- 238000006460 hydrolysis reaction Methods 0.000 description 11
- 229940081969 saccharomyces cerevisiae Drugs 0.000 description 11
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Chemical group OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 10
- 241000223199 Humicola grisea Species 0.000 description 10
- 239000012634 fragment Substances 0.000 description 10
- 230000002538 fungal effect Effects 0.000 description 10
- 230000007062 hydrolysis Effects 0.000 description 10
- 241000588724 Escherichia coli Species 0.000 description 9
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 9
- 241000228182 Thermoascus aurantiacus Species 0.000 description 9
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 9
- -1 W40 Chemical compound 0.000 description 9
- 108010005233 alanylglutamic acid Proteins 0.000 description 9
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 8
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 8
- 102000008300 Mutant Proteins Human genes 0.000 description 8
- 108010021466 Mutant Proteins Proteins 0.000 description 8
- 230000003197 catalytic effect Effects 0.000 description 8
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 7
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 7
- ZKJZBRHRWKLVSJ-ZDLURKLDSA-N Gly-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O ZKJZBRHRWKLVSJ-ZDLURKLDSA-N 0.000 description 7
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 7
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 7
- IXKSXJFAGXLQOQ-XISFHERQSA-N WHWLQLKPGQPMY Chemical group C([C@@H](C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CNC=N1 IXKSXJFAGXLQOQ-XISFHERQSA-N 0.000 description 7
- 230000001461 cytolytic effect Effects 0.000 description 7
- 238000000855 fermentation Methods 0.000 description 7
- 230000004151 fermentation Effects 0.000 description 7
- 108010049041 glutamylalanine Proteins 0.000 description 7
- 239000013612 plasmid Substances 0.000 description 7
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 7
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 7
- CZIXHXIJJZLYRJ-SRVKXCTJSA-N Asn-Cys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CZIXHXIJJZLYRJ-SRVKXCTJSA-N 0.000 description 6
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 6
- DZLQXIFVQFTFJY-BYPYZUCNSA-N Cys-Gly-Gly Chemical compound SC[C@H](N)C(=O)NCC(=O)NCC(O)=O DZLQXIFVQFTFJY-BYPYZUCNSA-N 0.000 description 6
- KJJASVYBTKRYSN-FXQIFTODSA-N Cys-Pro-Asp Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CC(=O)O)C(=O)O KJJASVYBTKRYSN-FXQIFTODSA-N 0.000 description 6
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 6
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 6
- GZBZACMXFIPIDX-WHFBIAKZSA-N Gly-Cys-Asp Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)CN)C(=O)O GZBZACMXFIPIDX-WHFBIAKZSA-N 0.000 description 6
- WRFOZIJRODPLIA-QWRGUYRKSA-N Gly-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O WRFOZIJRODPLIA-QWRGUYRKSA-N 0.000 description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- HBGKOLSGLYMWSW-DCAQKATOSA-N His-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CS)C(=O)O HBGKOLSGLYMWSW-DCAQKATOSA-N 0.000 description 6
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 6
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 6
- QAYSODICXVZUIA-WLTAIBSBSA-N Tyr-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QAYSODICXVZUIA-WLTAIBSBSA-N 0.000 description 6
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 6
- 108010093581 aspartyl-proline Proteins 0.000 description 6
- 101150052795 cbh-1 gene Proteins 0.000 description 6
- 239000008103 glucose Substances 0.000 description 6
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 6
- 239000004615 ingredient Substances 0.000 description 6
- 229920000642 polymer Polymers 0.000 description 6
- 108010029020 prolylglycine Proteins 0.000 description 6
- 102220005490 rs33986902 Human genes 0.000 description 6
- 235000000346 sugar Nutrition 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 5
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 5
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 5
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 5
- KESWRFKUZRUTAH-FXQIFTODSA-N Asp-Pro-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O KESWRFKUZRUTAH-FXQIFTODSA-N 0.000 description 5
- TXGDWPBLUFQODU-XGEHTFHBSA-N Cys-Pro-Thr Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O TXGDWPBLUFQODU-XGEHTFHBSA-N 0.000 description 5
- 101710098247 Exoglucanase 1 Proteins 0.000 description 5
- ZWMYUDZLXAQHCK-CIUDSAMLSA-N Glu-Met-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O ZWMYUDZLXAQHCK-CIUDSAMLSA-N 0.000 description 5
- LWYUQLZOIORFFJ-XKBZYTNZSA-N Glu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O LWYUQLZOIORFFJ-XKBZYTNZSA-N 0.000 description 5
- NEDQVOQDDBCRGG-UHFFFAOYSA-N Gly Gly Thr Tyr Chemical compound NCC(=O)NCC(=O)NC(C(O)C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 NEDQVOQDDBCRGG-UHFFFAOYSA-N 0.000 description 5
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 5
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 5
- QQXJROOJCMIHIV-AVGNSLFASA-N Leu-Val-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O QQXJROOJCMIHIV-AVGNSLFASA-N 0.000 description 5
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 5
- OCRSGGIJBDUXHU-WDSOQIARSA-N Met-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 OCRSGGIJBDUXHU-WDSOQIARSA-N 0.000 description 5
- 108091034117 Oligonucleotide Proteins 0.000 description 5
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 5
- OOKCGAYXSNJBGQ-ZLUOBGJFSA-N Ser-Asn-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OOKCGAYXSNJBGQ-ZLUOBGJFSA-N 0.000 description 5
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 5
- 101150006914 TRP1 gene Proteins 0.000 description 5
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 5
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 5
- 235000004279 alanine Nutrition 0.000 description 5
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 239000012228 culture supernatant Substances 0.000 description 5
- 108010016616 cysteinylglycine Proteins 0.000 description 5
- 239000013613 expression plasmid Substances 0.000 description 5
- 102000037865 fusion proteins Human genes 0.000 description 5
- 108020001507 fusion proteins Proteins 0.000 description 5
- 108010081985 glycyl-cystinyl-aspartic acid Proteins 0.000 description 5
- 230000003301 hydrolyzing effect Effects 0.000 description 5
- 108010005942 methionylglycine Proteins 0.000 description 5
- 238000012986 modification Methods 0.000 description 5
- 230000004048 modification Effects 0.000 description 5
- 108010051242 phenylalanylserine Proteins 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 108010026333 seryl-proline Proteins 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 239000010902 straw Substances 0.000 description 5
- 108010036320 valylleucine Proteins 0.000 description 5
- MBWYUTNBYSSUIQ-HERUPUMHSA-N Ala-Asn-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N MBWYUTNBYSSUIQ-HERUPUMHSA-N 0.000 description 4
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 4
- VHEVVUZDDUCAKU-FXQIFTODSA-N Ala-Met-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O VHEVVUZDDUCAKU-FXQIFTODSA-N 0.000 description 4
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 4
- AENHOIXXHKNIQL-AUTRQRHGSA-N Ala-Tyr-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H]([NH3+])C)CC1=CC=C(O)C=C1 AENHOIXXHKNIQL-AUTRQRHGSA-N 0.000 description 4
- NVPHRWNWTKYIST-BPNCWPANSA-N Arg-Tyr-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 NVPHRWNWTKYIST-BPNCWPANSA-N 0.000 description 4
- XXAOXVBAWLMTDR-ZLUOBGJFSA-N Asn-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)N)N XXAOXVBAWLMTDR-ZLUOBGJFSA-N 0.000 description 4
- COUZKSSMBFADSB-AVGNSLFASA-N Asn-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N COUZKSSMBFADSB-AVGNSLFASA-N 0.000 description 4
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 4
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 4
- IWLZBRTUIVXZJD-OLHMAJIHSA-N Asp-Thr-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O IWLZBRTUIVXZJD-OLHMAJIHSA-N 0.000 description 4
- PDIYGFYAMZZFCW-JIOCBJNQSA-N Asp-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N)O PDIYGFYAMZZFCW-JIOCBJNQSA-N 0.000 description 4
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 4
- LBOLGUYQEPZSKM-YUMQZZPRSA-N Cys-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N LBOLGUYQEPZSKM-YUMQZZPRSA-N 0.000 description 4
- GGJOGFJIPPGNRK-JSGCOSHPSA-N Glu-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 GGJOGFJIPPGNRK-JSGCOSHPSA-N 0.000 description 4
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 4
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 4
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 4
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 4
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 4
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 4
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 4
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical group OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 4
- MDVZJYGNAGLPGJ-KKUMJFAQSA-N Leu-Asn-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MDVZJYGNAGLPGJ-KKUMJFAQSA-N 0.000 description 4
- WGXOKDLDIWSOCV-MELADBBJSA-N Phe-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O WGXOKDLDIWSOCV-MELADBBJSA-N 0.000 description 4
- NPLGQVKZFGJWAI-QWHCGFSZSA-N Phe-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O NPLGQVKZFGJWAI-QWHCGFSZSA-N 0.000 description 4
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 4
- GZNYIXWOIUFLGO-ZJDVBMNYSA-N Pro-Thr-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZNYIXWOIUFLGO-ZJDVBMNYSA-N 0.000 description 4
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 4
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 4
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 4
- RHAPJNVNWDBFQI-BQBZGAKWSA-N Ser-Pro-Gly Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O RHAPJNVNWDBFQI-BQBZGAKWSA-N 0.000 description 4
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 4
- CDBYLPFSWZWCQE-UHFFFAOYSA-L Sodium Carbonate Chemical compound [Na+].[Na+].[O-]C([O-])=O CDBYLPFSWZWCQE-UHFFFAOYSA-L 0.000 description 4
- OWQKBXKXZFRRQL-XGEHTFHBSA-N Thr-Met-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CS)C(=O)O)N)O OWQKBXKXZFRRQL-XGEHTFHBSA-N 0.000 description 4
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 4
- TZQWJCGVCIJDMU-HEIBUPTGSA-N Thr-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)O)N)O TZQWJCGVCIJDMU-HEIBUPTGSA-N 0.000 description 4
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 4
- KHTIUAKJRUIEMA-HOUAVDHOSA-N Thr-Trp-Asp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 KHTIUAKJRUIEMA-HOUAVDHOSA-N 0.000 description 4
- 241000223261 Trichoderma viride Species 0.000 description 4
- CDPXXGFRDZVVGF-OYDLWJJNSA-N Trp-Arg-Trp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CDPXXGFRDZVVGF-OYDLWJJNSA-N 0.000 description 4
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 4
- 108010045350 alanyl-tyrosyl-alanine Proteins 0.000 description 4
- 108010041407 alanylaspartic acid Proteins 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010077245 asparaginyl-proline Proteins 0.000 description 4
- 108010004073 cysteinylcysteine Proteins 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 4
- 108010050848 glycylleucine Proteins 0.000 description 4
- 108010091871 leucylmethionine Proteins 0.000 description 4
- 239000002029 lignocellulosic biomass Substances 0.000 description 4
- 238000002703 mutagenesis Methods 0.000 description 4
- 231100000350 mutagenesis Toxicity 0.000 description 4
- 239000002773 nucleotide Substances 0.000 description 4
- 125000003729 nucleotide group Chemical group 0.000 description 4
- 108010084572 phenylalanyl-valine Proteins 0.000 description 4
- 108010031719 prolyl-serine Proteins 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 230000028327 secretion Effects 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 4
- 108010020532 tyrosyl-proline Proteins 0.000 description 4
- 102220574283 5-hydroxytryptamine receptor 2A_S86T_mutation Human genes 0.000 description 3
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 3
- FXKNPWNXPQZLES-ZLUOBGJFSA-N Ala-Asn-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FXKNPWNXPQZLES-ZLUOBGJFSA-N 0.000 description 3
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 3
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 3
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 3
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 3
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 3
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 3
- UIUXXFIKWQVMEX-UFYCRDLUSA-N Arg-Phe-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UIUXXFIKWQVMEX-UFYCRDLUSA-N 0.000 description 3
- ZCSHHTFOZULVLN-SZMVWBNQSA-N Arg-Trp-Val Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 ZCSHHTFOZULVLN-SZMVWBNQSA-N 0.000 description 3
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 3
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 3
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 3
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 3
- VCJCPARXDBEGNE-GUBZILKMSA-N Asn-Pro-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 VCJCPARXDBEGNE-GUBZILKMSA-N 0.000 description 3
- REQUGIWGOGSOEZ-ZLUOBGJFSA-N Asn-Ser-Asn Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)C(=O)N REQUGIWGOGSOEZ-ZLUOBGJFSA-N 0.000 description 3
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 3
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 3
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 3
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 3
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 3
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 3
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 3
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 3
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 3
- 229920003043 Cellulose fiber Polymers 0.000 description 3
- 241001248634 Chaetomium thermophilum Species 0.000 description 3
- NXTYATMDWQYLGJ-BQBZGAKWSA-N Cys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CS NXTYATMDWQYLGJ-BQBZGAKWSA-N 0.000 description 3
- ZOKPRHVIFAUJPV-GUBZILKMSA-N Cys-Pro-Arg Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O ZOKPRHVIFAUJPV-GUBZILKMSA-N 0.000 description 3
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 3
- GUBGYTABKSRVRQ-CUHNMECISA-N D-Cellobiose Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)OC(O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-CUHNMECISA-N 0.000 description 3
- 235000014466 Douglas bleu Nutrition 0.000 description 3
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 3
- UENPHLAAKDPZQY-XKBZYTNZSA-N Glu-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N)O UENPHLAAKDPZQY-XKBZYTNZSA-N 0.000 description 3
- ITVBKCZZLJUUHI-HTUGSXCWSA-N Glu-Phe-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ITVBKCZZLJUUHI-HTUGSXCWSA-N 0.000 description 3
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 3
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 3
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 3
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 3
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 3
- DNAZKGFYFRGZIH-QWRGUYRKSA-N Gly-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 DNAZKGFYFRGZIH-QWRGUYRKSA-N 0.000 description 3
- JFFAPRNXXLRINI-NHCYSSNCSA-N His-Asp-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JFFAPRNXXLRINI-NHCYSSNCSA-N 0.000 description 3
- UAQSZXGJGLHMNV-XEGUGMAKSA-N Ile-Gly-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N UAQSZXGJGLHMNV-XEGUGMAKSA-N 0.000 description 3
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 3
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 3
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 3
- LSLUTXRANSUGFY-XIRDDKMYSA-N Leu-Trp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O LSLUTXRANSUGFY-XIRDDKMYSA-N 0.000 description 3
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 3
- RDFIVFHPOSOXMW-ACRUOGEOSA-N Leu-Tyr-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RDFIVFHPOSOXMW-ACRUOGEOSA-N 0.000 description 3
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 3
- LECIJRIRMVOFMH-ULQDDVLXSA-N Lys-Pro-Phe Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LECIJRIRMVOFMH-ULQDDVLXSA-N 0.000 description 3
- 101710089743 Mating factor alpha Proteins 0.000 description 3
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 3
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 3
- FRPVPGRXUKFEQE-YDHLFZDLSA-N Phe-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FRPVPGRXUKFEQE-YDHLFZDLSA-N 0.000 description 3
- PSBJZLMFFTULDX-IXOXFDKPSA-N Phe-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=CC=C1)N)O PSBJZLMFFTULDX-IXOXFDKPSA-N 0.000 description 3
- CMHTUJQZQXFNTQ-OEAJRASXSA-N Phe-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O CMHTUJQZQXFNTQ-OEAJRASXSA-N 0.000 description 3
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 3
- GCFNFKNPCMBHNT-IRXDYDNUSA-N Phe-Tyr-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)NCC(=O)O)N GCFNFKNPCMBHNT-IRXDYDNUSA-N 0.000 description 3
- SSSFPISOZOLQNP-GUBZILKMSA-N Pro-Arg-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSFPISOZOLQNP-GUBZILKMSA-N 0.000 description 3
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 3
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 3
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 3
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 3
- AFWBWPCXSWUCLB-WDSKDSINSA-N Pro-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H]1CCC[NH2+]1 AFWBWPCXSWUCLB-WDSKDSINSA-N 0.000 description 3
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 3
- VVAWNPIOYXAMAL-KJEVXHAQSA-N Pro-Thr-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VVAWNPIOYXAMAL-KJEVXHAQSA-N 0.000 description 3
- 240000001416 Pseudotsuga menziesii Species 0.000 description 3
- 235000005386 Pseudotsuga menziesii var menziesii Nutrition 0.000 description 3
- 101900208676 Saccharomyces cerevisiae Mating factor alpha Proteins 0.000 description 3
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 3
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 3
- MPPHJZYXDVDGOF-BWBBJGPYSA-N Ser-Cys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CO MPPHJZYXDVDGOF-BWBBJGPYSA-N 0.000 description 3
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 3
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 3
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 3
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 3
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 3
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 3
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 3
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 3
- WBAXJMCUFIXCNI-WDSKDSINSA-N Ser-Pro Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O WBAXJMCUFIXCNI-WDSKDSINSA-N 0.000 description 3
- KIEIJCFVGZCUAS-MELADBBJSA-N Ser-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N)C(=O)O KIEIJCFVGZCUAS-MELADBBJSA-N 0.000 description 3
- 241000228178 Thermoascus Species 0.000 description 3
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 3
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 3
- WCRFXRIWBFRZBR-GGVZMXCHSA-N Thr-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WCRFXRIWBFRZBR-GGVZMXCHSA-N 0.000 description 3
- AXEJRUGTOJPZKG-XGEHTFHBSA-N Thr-Val-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O)N)O AXEJRUGTOJPZKG-XGEHTFHBSA-N 0.000 description 3
- LVTKHGUGBGNBPL-UHFFFAOYSA-N Trp-P-1 Chemical compound N1C2=CC=CC=C2C2=C1C(C)=C(N)N=C2C LVTKHGUGBGNBPL-UHFFFAOYSA-N 0.000 description 3
- JBBYKPZAPOLCPK-JYJNAYRXSA-N Tyr-Arg-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O JBBYKPZAPOLCPK-JYJNAYRXSA-N 0.000 description 3
- FGVFBDZSGQTYQX-UFYCRDLUSA-N Tyr-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O FGVFBDZSGQTYQX-UFYCRDLUSA-N 0.000 description 3
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 3
- CWSIBTLMMQLPPZ-FXQIFTODSA-N Val-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C(C)C)N CWSIBTLMMQLPPZ-FXQIFTODSA-N 0.000 description 3
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 3
- JZWZACGUZVCQPS-RNJOBUHISA-N Val-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N JZWZACGUZVCQPS-RNJOBUHISA-N 0.000 description 3
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 3
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 3
- DVLWZWNAQUBZBC-ZNSHCXBVSA-N Val-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N)O DVLWZWNAQUBZBC-ZNSHCXBVSA-N 0.000 description 3
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 3
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 3
- YLBNZCJFSVJDRJ-KJEVXHAQSA-N Val-Thr-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O YLBNZCJFSVJDRJ-KJEVXHAQSA-N 0.000 description 3
- UFCHCOKFAGOQSF-BQFCYCMXSA-N Val-Trp-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N UFCHCOKFAGOQSF-BQFCYCMXSA-N 0.000 description 3
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 description 3
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 3
- 108010047495 alanylglycine Proteins 0.000 description 3
- 108010027234 aspartyl-glycyl-glutamyl-alanine Proteins 0.000 description 3
- 108010047754 beta-Glucosidase Proteins 0.000 description 3
- 102000006995 beta-Glucosidase Human genes 0.000 description 3
- 238000005119 centrifugation Methods 0.000 description 3
- 230000000052 comparative effect Effects 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 230000002255 enzymatic effect Effects 0.000 description 3
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 3
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 3
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 3
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 3
- 108010036413 histidylglycine Proteins 0.000 description 3
- 230000006872 improvement Effects 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 108010012058 leucyltyrosine Proteins 0.000 description 3
- 108010038320 lysylphenylalanine Proteins 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 238000012807 shake-flask culturing Methods 0.000 description 3
- 108010071097 threonyl-lysyl-proline Proteins 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 108010003137 tyrosyltyrosine Proteins 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- LIWOHUSRWUWRSX-ZJZGAYNASA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-phenylpropanoyl]amino]-5-(diaminomethylideneamino)pentanoyl]amino]-3-methylbutanoyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 LIWOHUSRWUWRSX-ZJZGAYNASA-N 0.000 description 2
- HSHNITRMYYLLCV-UHFFFAOYSA-N 4-methylumbelliferone Chemical compound C1=C(O)C=CC2=C1OC(=O)C=C2C HSHNITRMYYLLCV-UHFFFAOYSA-N 0.000 description 2
- PRTGXBPFDYMIJH-KSFLKEQHSA-N 7-[(2s,3r,4r,5s,6r)-3,4-dihydroxy-6-(hydroxymethyl)-5-[(2s,3r,4s,5r,6r)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxyoxan-2-yl]oxy-4-methylchromen-2-one Chemical compound O([C@@H]1[C@@H](CO)O[C@H]([C@@H]([C@H]1O)O)OC1=CC=2OC(=O)C=C(C=2C=C1)C)[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O PRTGXBPFDYMIJH-KSFLKEQHSA-N 0.000 description 2
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 2
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 2
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 2
- 108010076441 Ala-His-His Proteins 0.000 description 2
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 2
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 2
- RGQCNKIDEQJEBT-CQDKDKBSSA-N Ala-Leu-Tyr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 RGQCNKIDEQJEBT-CQDKDKBSSA-N 0.000 description 2
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 2
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 2
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 2
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 2
- 241000710929 Alphavirus Species 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 2
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- VVJTWSRNMJNDPN-IUCAKERBSA-N Arg-Met-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O VVJTWSRNMJNDPN-IUCAKERBSA-N 0.000 description 2
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 2
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 2
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 2
- VLIJAPRTSXSGFY-STQMWFEESA-N Arg-Tyr-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 VLIJAPRTSXSGFY-STQMWFEESA-N 0.000 description 2
- DQTIWTULBGLJBL-DCAQKATOSA-N Asn-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N DQTIWTULBGLJBL-DCAQKATOSA-N 0.000 description 2
- XVVOVPFMILMHPX-ZLUOBGJFSA-N Asn-Asp-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XVVOVPFMILMHPX-ZLUOBGJFSA-N 0.000 description 2
- TWXZVVXRRRRSLT-IMJSIDKUSA-N Asn-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(O)=O TWXZVVXRRRRSLT-IMJSIDKUSA-N 0.000 description 2
- RRVBEKYEFMCDIF-WHFBIAKZSA-N Asn-Cys-Gly Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)NCC(=O)O)N)C(=O)N RRVBEKYEFMCDIF-WHFBIAKZSA-N 0.000 description 2
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 2
- OOWSBIOUKIUWLO-RCOVLWMOSA-N Asn-Gly-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O OOWSBIOUKIUWLO-RCOVLWMOSA-N 0.000 description 2
- OLISTMZJGQUOGS-GMOBBJLQSA-N Asn-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OLISTMZJGQUOGS-GMOBBJLQSA-N 0.000 description 2
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 2
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 2
- XFJKRRCWLTZIQA-XIRDDKMYSA-N Asn-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N XFJKRRCWLTZIQA-XIRDDKMYSA-N 0.000 description 2
- NTWOPSIUJBMNRI-KKUMJFAQSA-N Asn-Lys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTWOPSIUJBMNRI-KKUMJFAQSA-N 0.000 description 2
- AEZCCDMZZJOGII-DCAQKATOSA-N Asn-Met-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O AEZCCDMZZJOGII-DCAQKATOSA-N 0.000 description 2
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 2
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 2
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 2
- QHAJMRDEWNAIBQ-FXQIFTODSA-N Asp-Arg-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O QHAJMRDEWNAIBQ-FXQIFTODSA-N 0.000 description 2
- JGDBHIVECJGXJA-FXQIFTODSA-N Asp-Asp-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JGDBHIVECJGXJA-FXQIFTODSA-N 0.000 description 2
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 2
- JHFNSBBHKSZXKB-VKHMYHEASA-N Asp-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(O)=O JHFNSBBHKSZXKB-VKHMYHEASA-N 0.000 description 2
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 2
- BIVYLQMZPHDUIH-WHFBIAKZSA-N Asp-Gly-Cys Chemical compound C([C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)C(=O)O BIVYLQMZPHDUIH-WHFBIAKZSA-N 0.000 description 2
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 2
- GYNUXDMCDILYIQ-QRTARXTBSA-N Asp-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)O)N GYNUXDMCDILYIQ-QRTARXTBSA-N 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- AMRLSQGGERHDHJ-FXQIFTODSA-N Cys-Ala-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMRLSQGGERHDHJ-FXQIFTODSA-N 0.000 description 2
- VNLYIYOYUNGURO-ZLUOBGJFSA-N Cys-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N VNLYIYOYUNGURO-ZLUOBGJFSA-N 0.000 description 2
- XRTISHJEPHMBJG-SRVKXCTJSA-N Cys-Asp-Tyr Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XRTISHJEPHMBJG-SRVKXCTJSA-N 0.000 description 2
- ZIKWRNJXFIQECJ-CIUDSAMLSA-N Cys-Cys-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O ZIKWRNJXFIQECJ-CIUDSAMLSA-N 0.000 description 2
- ATPDEYTYWVMINF-ZLUOBGJFSA-N Cys-Cys-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O ATPDEYTYWVMINF-ZLUOBGJFSA-N 0.000 description 2
- BDWIZLQVVWQMTB-XKBZYTNZSA-N Cys-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N)O BDWIZLQVVWQMTB-XKBZYTNZSA-N 0.000 description 2
- KXUKWRVYDYIPSQ-CIUDSAMLSA-N Cys-Leu-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUKWRVYDYIPSQ-CIUDSAMLSA-N 0.000 description 2
- DQGIAOGALAQBGK-BWBBJGPYSA-N Cys-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N)O DQGIAOGALAQBGK-BWBBJGPYSA-N 0.000 description 2
- 239000004249 Erythorbin acid Substances 0.000 description 2
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 2
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 2
- CVPXINNKRTZBMO-CIUDSAMLSA-N Glu-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)CN=C(N)N CVPXINNKRTZBMO-CIUDSAMLSA-N 0.000 description 2
- SVZIKUHLRKVZIF-GUBZILKMSA-N Glu-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N SVZIKUHLRKVZIF-GUBZILKMSA-N 0.000 description 2
- ZZIFPJZQHRJERU-WDSKDSINSA-N Glu-Cys-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ZZIFPJZQHRJERU-WDSKDSINSA-N 0.000 description 2
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 2
- YRMZCZIRHYCNHX-RYUDHWBXSA-N Glu-Phe-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O YRMZCZIRHYCNHX-RYUDHWBXSA-N 0.000 description 2
- BIYNPVYAZOUVFQ-CIUDSAMLSA-N Glu-Pro-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O BIYNPVYAZOUVFQ-CIUDSAMLSA-N 0.000 description 2
- ALMBZBOCGSVSAI-ACZMJKKPSA-N Glu-Ser-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ALMBZBOCGSVSAI-ACZMJKKPSA-N 0.000 description 2
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 2
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 2
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 2
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 2
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 2
- GQGAFTPXAPKSCF-WHFBIAKZSA-N Gly-Ala-Cys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O GQGAFTPXAPKSCF-WHFBIAKZSA-N 0.000 description 2
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 2
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 2
- DJTXYXZNNDDEOU-WHFBIAKZSA-N Gly-Asn-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)C(=O)N DJTXYXZNNDDEOU-WHFBIAKZSA-N 0.000 description 2
- LLXVQPKEQQCISF-YUMQZZPRSA-N Gly-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN LLXVQPKEQQCISF-YUMQZZPRSA-N 0.000 description 2
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 2
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 2
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 2
- ADZGCWWDPFDHCY-ZETCQYMHSA-N Gly-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 ADZGCWWDPFDHCY-ZETCQYMHSA-N 0.000 description 2
- IUZGUFAJDBHQQV-YUMQZZPRSA-N Gly-Leu-Asn Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IUZGUFAJDBHQQV-YUMQZZPRSA-N 0.000 description 2
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 2
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 2
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 2
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- LBDXVCBAJJNJNN-WHFBIAKZSA-N Gly-Ser-Cys Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O LBDXVCBAJJNJNN-WHFBIAKZSA-N 0.000 description 2
- FXTUGWXZTFMTIV-GJZGRUSLSA-N Gly-Trp-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)CN FXTUGWXZTFMTIV-GJZGRUSLSA-N 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- 229920002488 Hemicellulose Polymers 0.000 description 2
- FDQYIRHBVVUTJF-ZETCQYMHSA-N His-Gly-Gly Chemical compound [O-]C(=O)CNC(=O)CNC(=O)[C@@H]([NH3+])CC1=CN=CN1 FDQYIRHBVVUTJF-ZETCQYMHSA-N 0.000 description 2
- LNDVNHOSZQPJGI-AVGNSLFASA-N His-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNDVNHOSZQPJGI-AVGNSLFASA-N 0.000 description 2
- CWSZWFILCNSNEX-CIUDSAMLSA-N His-Ser-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CWSZWFILCNSNEX-CIUDSAMLSA-N 0.000 description 2
- 101150017040 I gene Proteins 0.000 description 2
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 2
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 2
- HDODQNPMSHDXJT-GHCJXIJMSA-N Ile-Asn-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O HDODQNPMSHDXJT-GHCJXIJMSA-N 0.000 description 2
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 2
- ODPKZZLRDNXTJZ-WHOFXGATSA-N Ile-Gly-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ODPKZZLRDNXTJZ-WHOFXGATSA-N 0.000 description 2
- RWHRUZORDWZESH-ZQINRCPSSA-N Ile-Trp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RWHRUZORDWZESH-ZQINRCPSSA-N 0.000 description 2
- 241000421270 Komagataella phaffii CBS 7435 Species 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 2
- RRSLQOLASISYTB-CIUDSAMLSA-N Leu-Cys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O RRSLQOLASISYTB-CIUDSAMLSA-N 0.000 description 2
- ARRIJPQRBWRNLT-DCAQKATOSA-N Leu-Met-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ARRIJPQRBWRNLT-DCAQKATOSA-N 0.000 description 2
- RRVCZCNFXIFGRA-DCAQKATOSA-N Leu-Pro-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RRVCZCNFXIFGRA-DCAQKATOSA-N 0.000 description 2
- HGUUMQWGYCVPKG-DCAQKATOSA-N Leu-Pro-Cys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N HGUUMQWGYCVPKG-DCAQKATOSA-N 0.000 description 2
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 2
- RNYLNYTYMXACRI-VFAJRCTISA-N Leu-Thr-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O RNYLNYTYMXACRI-VFAJRCTISA-N 0.000 description 2
- ZTPWXNOOKAXPPE-DCAQKATOSA-N Lys-Arg-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N ZTPWXNOOKAXPPE-DCAQKATOSA-N 0.000 description 2
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 2
- BYEBKXRNDLTGFW-CIUDSAMLSA-N Lys-Cys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O BYEBKXRNDLTGFW-CIUDSAMLSA-N 0.000 description 2
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 2
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 2
- CSNNHWWHGAXBCP-UHFFFAOYSA-L Magnesium sulfate Chemical compound [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 2
- LMKSBGIUPVRHEH-FXQIFTODSA-N Met-Ala-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(N)=O LMKSBGIUPVRHEH-FXQIFTODSA-N 0.000 description 2
- LQTGGXSOMDSWTQ-UNQGMJICSA-N Met-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCSC)N)O LQTGGXSOMDSWTQ-UNQGMJICSA-N 0.000 description 2
- 229920001340 Microbial cellulose Polymers 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 2
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 2
- 241000320412 Ogataea angusta Species 0.000 description 2
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 2
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 2
- MJQFZGOIVBDIMZ-WHOFXGATSA-N Phe-Ile-Gly Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O MJQFZGOIVBDIMZ-WHOFXGATSA-N 0.000 description 2
- GRVMHFCZUIYNKQ-UFYCRDLUSA-N Phe-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O GRVMHFCZUIYNKQ-UFYCRDLUSA-N 0.000 description 2
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 2
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-N Phosphoric acid Chemical compound OP(O)(O)=O NBIIXXVUZAFLBC-UHFFFAOYSA-N 0.000 description 2
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 2
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 2
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- PEYNRYREGPAOAK-LSJOCFKGSA-N Pro-His-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 PEYNRYREGPAOAK-LSJOCFKGSA-N 0.000 description 2
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 2
- 101100010928 Saccharolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2) tuf gene Proteins 0.000 description 2
- SSJMZMUVNKEENT-IMJSIDKUSA-N Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CO SSJMZMUVNKEENT-IMJSIDKUSA-N 0.000 description 2
- BKOKTRCZXRIQPX-ZLUOBGJFSA-N Ser-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N BKOKTRCZXRIQPX-ZLUOBGJFSA-N 0.000 description 2
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 2
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 2
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 2
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 2
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 2
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 2
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 2
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 2
- KJKQUQXDEKMPDK-FXQIFTODSA-N Ser-Met-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O KJKQUQXDEKMPDK-FXQIFTODSA-N 0.000 description 2
- FBLNYDYPCLFTSP-IXOXFDKPSA-N Ser-Phe-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FBLNYDYPCLFTSP-IXOXFDKPSA-N 0.000 description 2
- JLKWJWPDXPKKHI-FXQIFTODSA-N Ser-Pro-Asn Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CC(=O)N)C(=O)O JLKWJWPDXPKKHI-FXQIFTODSA-N 0.000 description 2
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 2
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 2
- ZSDXEKUKQAKZFE-XAVMHZPKSA-N Ser-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N)O ZSDXEKUKQAKZFE-XAVMHZPKSA-N 0.000 description 2
- 101150001810 TEAD1 gene Proteins 0.000 description 2
- 101150074253 TEF1 gene Proteins 0.000 description 2
- 241000228341 Talaromyces Species 0.000 description 2
- CTONFVDJYCAMQM-IUKAMOBKSA-N Thr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H]([C@@H](C)O)N CTONFVDJYCAMQM-IUKAMOBKSA-N 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- UZJDBCHMIQXLOQ-HEIBUPTGSA-N Thr-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O UZJDBCHMIQXLOQ-HEIBUPTGSA-N 0.000 description 2
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 2
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 2
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 2
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 2
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 2
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 2
- IJKNKFJZOJCKRR-GBALPHGKSA-N Thr-Trp-Ser Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 IJKNKFJZOJCKRR-GBALPHGKSA-N 0.000 description 2
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 2
- CKHWEVXPLJBEOZ-VQVTYTSYSA-N Thr-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O CKHWEVXPLJBEOZ-VQVTYTSYSA-N 0.000 description 2
- 102100029898 Transcriptional enhancer factor TEF-1 Human genes 0.000 description 2
- 241000209140 Triticum Species 0.000 description 2
- 235000021307 Triticum Nutrition 0.000 description 2
- VEYXZZGMIBKXCN-UBHSHLNASA-N Trp-Asp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VEYXZZGMIBKXCN-UBHSHLNASA-N 0.000 description 2
- OGZRZMJASKKMJZ-XIRDDKMYSA-N Trp-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N OGZRZMJASKKMJZ-XIRDDKMYSA-N 0.000 description 2
- WMIUTJPFHMMUGY-ZFWWWQNUSA-N Trp-Pro-Gly Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)NCC(=O)O WMIUTJPFHMMUGY-ZFWWWQNUSA-N 0.000 description 2
- XGEUYEOEZYFHRL-KKXDTOCCSA-N Tyr-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XGEUYEOEZYFHRL-KKXDTOCCSA-N 0.000 description 2
- ADBDQGBDNUTRDB-ULQDDVLXSA-N Tyr-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O ADBDQGBDNUTRDB-ULQDDVLXSA-N 0.000 description 2
- XKDOQXAXKFQWQJ-SRVKXCTJSA-N Tyr-Cys-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O XKDOQXAXKFQWQJ-SRVKXCTJSA-N 0.000 description 2
- UMXSDHPSMROQRB-YJRXYDGGSA-N Tyr-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O UMXSDHPSMROQRB-YJRXYDGGSA-N 0.000 description 2
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 2
- JKUZFODWJGEQAP-KBPBESRZSA-N Tyr-Gly-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O JKUZFODWJGEQAP-KBPBESRZSA-N 0.000 description 2
- ZPFLBLFITJCBTP-QWRGUYRKSA-N Tyr-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O ZPFLBLFITJCBTP-QWRGUYRKSA-N 0.000 description 2
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 2
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 2
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 2
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 2
- YDVDTCJGBBJGRT-GUBZILKMSA-N Val-Met-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N YDVDTCJGBBJGRT-GUBZILKMSA-N 0.000 description 2
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 2
- WHNSHJJNWNSTSU-BZSNNMDCSA-N Val-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 WHNSHJJNWNSTSU-BZSNNMDCSA-N 0.000 description 2
- 230000002378 acidificating effect Effects 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 2
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 239000002551 biofuel Substances 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 239000012152 Bradford reagent Substances 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 210000004899 C-terminal region Anatomy 0.000 description 2
- 229940106157 cellulase Drugs 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 108010009297 diglycyl-histidine Proteins 0.000 description 2
- 230000007071 enzymatic hydrolysis Effects 0.000 description 2
- 238000006047 enzymatic hydrolysis reaction Methods 0.000 description 2
- 238000006911 enzymatic reaction Methods 0.000 description 2
- 239000000446 fuel Substances 0.000 description 2
- 239000000499 gel Substances 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Chemical group 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 102000006602 glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 2
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 2
- 108010084389 glycyltryptophan Proteins 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 229920005610 lignin Polymers 0.000 description 2
- 108010017391 lysylvaline Proteins 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- 238000012269 metabolic engineering Methods 0.000 description 2
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 2
- 239000007003 mineral medium Substances 0.000 description 2
- 231100000219 mutagenic Toxicity 0.000 description 2
- 230000003505 mutagenic effect Effects 0.000 description 2
- 238000007857 nested PCR Methods 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 239000002245 particle Substances 0.000 description 2
- 108010030237 phenylalanyl-arginyl-valyl-phenylalanine Proteins 0.000 description 2
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 239000005014 poly(hydroxyalkanoate) Substances 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 238000001273 protein sequence alignment Methods 0.000 description 2
- 210000001938 protoplast Anatomy 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000002708 random mutagenesis Methods 0.000 description 2
- 238000002741 site-directed mutagenesis Methods 0.000 description 2
- 239000007974 sodium acetate buffer Substances 0.000 description 2
- 229910000029 sodium carbonate Inorganic materials 0.000 description 2
- 230000002195 synergetic effect Effects 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 238000011282 treatment Methods 0.000 description 2
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 2
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 1
- IGXNPQWXIRIGBF-KEOOTSPTSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IGXNPQWXIRIGBF-KEOOTSPTSA-N 0.000 description 1
- PKYCWFICOKSIHZ-UHFFFAOYSA-N 1-(3,7-dihydroxyphenoxazin-10-yl)ethanone Chemical compound OC1=CC=C2N(C(=O)C)C3=CC=C(O)C=C3OC2=C1 PKYCWFICOKSIHZ-UHFFFAOYSA-N 0.000 description 1
- BTJIUGUIPKRLHP-UHFFFAOYSA-N 4-nitrophenol Chemical compound OC1=CC=C([N+]([O-])=O)C=C1 BTJIUGUIPKRLHP-UHFFFAOYSA-N 0.000 description 1
- IAYJZWFYUSNIPN-MUKCROHVSA-N 4-nitrophenyl beta-lactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)O[C@@H](OC=2C=CC(=CC=2)[N+]([O-])=O)[C@H](O)[C@H]1O IAYJZWFYUSNIPN-MUKCROHVSA-N 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 241000222518 Agaricus Species 0.000 description 1
- JYEBJTDTPNKQJG-FXQIFTODSA-N Ala-Asn-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N JYEBJTDTPNKQJG-FXQIFTODSA-N 0.000 description 1
- GWFSQQNGMPGBEF-GHCJXIJMSA-N Ala-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N GWFSQQNGMPGBEF-GHCJXIJMSA-N 0.000 description 1
- YSMPVONNIWLJML-FXQIFTODSA-N Ala-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 1
- ATAKEVCGTRZKLI-UWJYBYFXSA-N Ala-His-His Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 ATAKEVCGTRZKLI-UWJYBYFXSA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 1
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 1
- DWYROCSXOOMOEU-CIUDSAMLSA-N Ala-Met-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DWYROCSXOOMOEU-CIUDSAMLSA-N 0.000 description 1
- SYIFFFHSXBNPMC-UWJYBYFXSA-N Ala-Ser-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N SYIFFFHSXBNPMC-UWJYBYFXSA-N 0.000 description 1
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 1
- 108010025188 Alcohol oxidase Proteins 0.000 description 1
- 101710194180 Alcohol oxidase 1 Proteins 0.000 description 1
- 244000144725 Amygdalus communis Species 0.000 description 1
- DQNLFLGFZAUIOW-FXQIFTODSA-N Arg-Cys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O DQNLFLGFZAUIOW-FXQIFTODSA-N 0.000 description 1
- KRQSPVKUISQQFS-FJXKBIBVSA-N Arg-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N KRQSPVKUISQQFS-FJXKBIBVSA-N 0.000 description 1
- JQFZHHSQMKZLRU-IUCAKERBSA-N Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N JQFZHHSQMKZLRU-IUCAKERBSA-N 0.000 description 1
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 1
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 1
- YHZQOSXDTFRZKU-WDSOQIARSA-N Arg-Trp-Leu Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N)=CNC2=C1 YHZQOSXDTFRZKU-WDSOQIARSA-N 0.000 description 1
- 241000235349 Ascomycota Species 0.000 description 1
- DNYRZPOWBTYFAF-IHRRRGAJSA-N Asn-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N)O DNYRZPOWBTYFAF-IHRRRGAJSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- SGAUXNZEFIEAAI-GARJFASQSA-N Asn-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)C(=O)O SGAUXNZEFIEAAI-GARJFASQSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- RCFGLXMZDYNRSC-CIUDSAMLSA-N Asn-Lys-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O RCFGLXMZDYNRSC-CIUDSAMLSA-N 0.000 description 1
- JTXVXGXTRXMOFJ-FXQIFTODSA-N Asn-Pro-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O JTXVXGXTRXMOFJ-FXQIFTODSA-N 0.000 description 1
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 1
- HNXWVVHIGTZTBO-LKXGYXEUSA-N Asn-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O HNXWVVHIGTZTBO-LKXGYXEUSA-N 0.000 description 1
- AKPLMZMNJGNUKT-ZLUOBGJFSA-N Asp-Asp-Cys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CS)C(O)=O AKPLMZMNJGNUKT-ZLUOBGJFSA-N 0.000 description 1
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 1
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 1
- ILQCHXURSRRIRY-YUMQZZPRSA-N Asp-His-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)O)N ILQCHXURSRRIRY-YUMQZZPRSA-N 0.000 description 1
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 1
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- ALMIMUZAWTUNIO-BZSNNMDCSA-N Asp-Tyr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ALMIMUZAWTUNIO-BZSNNMDCSA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 241000221198 Basidiomycota Species 0.000 description 1
- 238000009010 Bradford assay Methods 0.000 description 1
- 108090000565 Capsid Proteins Proteins 0.000 description 1
- 102100023321 Ceruloplasmin Human genes 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 241000123346 Chrysosporium Species 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 206010011224 Cough Diseases 0.000 description 1
- GRNOCLDFUNCIDW-ACZMJKKPSA-N Cys-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N GRNOCLDFUNCIDW-ACZMJKKPSA-N 0.000 description 1
- NOCCABSVTRONIN-CIUDSAMLSA-N Cys-Ala-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CS)N NOCCABSVTRONIN-CIUDSAMLSA-N 0.000 description 1
- XTHUKRLJRUVVBF-WHFBIAKZSA-N Cys-Gly-Ser Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O XTHUKRLJRUVVBF-WHFBIAKZSA-N 0.000 description 1
- WYVKPHCYMTWUCW-YUPRTTJUSA-N Cys-Thr Chemical compound C[C@@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)N)O WYVKPHCYMTWUCW-YUPRTTJUSA-N 0.000 description 1
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 1
- 125000002353 D-glucosyl group Chemical group C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 101710088194 Dehydrogenase Proteins 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 1
- 241001645360 Fusicoccum Species 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 1
- YVYVMJNUENBOOL-KBIXCLLPSA-N Glu-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N YVYVMJNUENBOOL-KBIXCLLPSA-N 0.000 description 1
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 1
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 1
- DUYYPIRFTLOAJQ-YUMQZZPRSA-N Gly-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN DUYYPIRFTLOAJQ-YUMQZZPRSA-N 0.000 description 1
- LXXLEUBUOMCAMR-NKWVEPMBSA-N Gly-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)CN)C(=O)O LXXLEUBUOMCAMR-NKWVEPMBSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- QPCVIQJVRGXUSA-LURJTMIESA-N Gly-Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QPCVIQJVRGXUSA-LURJTMIESA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 1
- DBJYVKDPGIFXFO-BQBZGAKWSA-N Gly-Met-Ala Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O DBJYVKDPGIFXFO-BQBZGAKWSA-N 0.000 description 1
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- POJJAZJHBGXEGM-YUMQZZPRSA-N Gly-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN POJJAZJHBGXEGM-YUMQZZPRSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 1
- CUVBTVWFVIIDOC-YEPSODPASA-N Gly-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CN CUVBTVWFVIIDOC-YEPSODPASA-N 0.000 description 1
- GWCJMBNBFYBQCV-XPUUQOCRSA-N Gly-Val-Ala Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O GWCJMBNBFYBQCV-XPUUQOCRSA-N 0.000 description 1
- 244000286779 Hansenula anomala Species 0.000 description 1
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 1
- HIAHVKLTHNOENC-HGNGGELXSA-N His-Glu-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HIAHVKLTHNOENC-HGNGGELXSA-N 0.000 description 1
- SOYCWSKCUVDLMC-AVGNSLFASA-N His-Pro-Arg Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCNC(=N)N)C(=O)O SOYCWSKCUVDLMC-AVGNSLFASA-N 0.000 description 1
- PZUZIHRPOVVHOT-KBPBESRZSA-N His-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CN=CN1 PZUZIHRPOVVHOT-KBPBESRZSA-N 0.000 description 1
- 241000223198 Humicola Species 0.000 description 1
- 101001026677 Hypocrea jecorina Exoglucanase 1 Proteins 0.000 description 1
- LDRALPZEVHVXEK-KBIXCLLPSA-N Ile-Cys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N LDRALPZEVHVXEK-KBIXCLLPSA-N 0.000 description 1
- CZWANIQKACCEKW-CYDGBPFRSA-N Ile-Pro-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)O)N CZWANIQKACCEKW-CYDGBPFRSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- ZGKVPOSSTGHJAF-HJPIBITLSA-N Ile-Tyr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CO)C(=O)O)N ZGKVPOSSTGHJAF-HJPIBITLSA-N 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 241000222342 Irpex Species 0.000 description 1
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- BTNXKBVLWJBTNR-SRVKXCTJSA-N Leu-His-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O BTNXKBVLWJBTNR-SRVKXCTJSA-N 0.000 description 1
- FPFOYSCDUWTZBF-IHPCNDPISA-N Leu-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]([NH3+])CC(C)C)C(=O)N[C@@H](CC(C)C)C([O-])=O)=CNC2=C1 FPFOYSCDUWTZBF-IHPCNDPISA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 241001490312 Lithops pseudotruncatella Species 0.000 description 1
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 1
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 1
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 1
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 1
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- YQAIUOWPSUOINN-IUCAKERBSA-N Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCCCN YQAIUOWPSUOINN-IUCAKERBSA-N 0.000 description 1
- PWHULOQIROXLJO-UHFFFAOYSA-N Manganese Chemical compound [Mn] PWHULOQIROXLJO-UHFFFAOYSA-N 0.000 description 1
- FVKRQMQQFGBXHV-QXEWZRGKSA-N Met-Asp-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FVKRQMQQFGBXHV-QXEWZRGKSA-N 0.000 description 1
- SMVTWPOATVIXTN-NAKRPEOUSA-N Met-Ser-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SMVTWPOATVIXTN-NAKRPEOUSA-N 0.000 description 1
- XLTSAUGGDYRFLS-UMPQAUOISA-N Met-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCSC)N)O XLTSAUGGDYRFLS-UMPQAUOISA-N 0.000 description 1
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 1
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 1
- MDSUKZSLOATHMH-UHFFFAOYSA-N N-L-leucyl-L-valine Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(O)=O MDSUKZSLOATHMH-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- 241001452677 Ogataea methanolica Species 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 241000235652 Pachysolen Species 0.000 description 1
- 241000235647 Pachysolen tannophilus Species 0.000 description 1
- 241000222385 Phanerochaete Species 0.000 description 1
- 241000222393 Phanerochaete chrysosporium Species 0.000 description 1
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 1
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 1
- 102000011755 Phosphoglycerate Kinase Human genes 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 235000015696 Portulacaria afra Nutrition 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- AUYKOPJPKUCYHE-SRVKXCTJSA-N Pro-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@@H]1CCCN1 AUYKOPJPKUCYHE-SRVKXCTJSA-N 0.000 description 1
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- OWQXAJQZLWHPBH-FXQIFTODSA-N Pro-Ser-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O OWQXAJQZLWHPBH-FXQIFTODSA-N 0.000 description 1
- WVXQQUWOKUZIEG-VEVYYDQMSA-N Pro-Thr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O WVXQQUWOKUZIEG-VEVYYDQMSA-N 0.000 description 1
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 241000235342 Saccharomycetes Species 0.000 description 1
- 241000235060 Scheffersomyces stipitis Species 0.000 description 1
- DWUIECHTAMYEFL-XVYDVKMFSA-N Ser-Ala-His Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DWUIECHTAMYEFL-XVYDVKMFSA-N 0.000 description 1
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 1
- OBXVZEAMXFSGPU-FXQIFTODSA-N Ser-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)CN=C(N)N OBXVZEAMXFSGPU-FXQIFTODSA-N 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- SNNSYBWPPVAXQW-ZLUOBGJFSA-N Ser-Cys-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)O)N)O SNNSYBWPPVAXQW-ZLUOBGJFSA-N 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- YZMPDHTZJJCGEI-BQBZGAKWSA-N Ser-His Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 YZMPDHTZJJCGEI-BQBZGAKWSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 1
- LDEBVRIURYMKQS-WISUUJSJSA-N Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO LDEBVRIURYMKQS-WISUUJSJSA-N 0.000 description 1
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 1
- HAYADTTXNZFUDM-IHRRRGAJSA-N Ser-Tyr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HAYADTTXNZFUDM-IHRRRGAJSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- VMHLLURERBWHNL-UHFFFAOYSA-M Sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 1
- 241001080024 Telles Species 0.000 description 1
- 101001099217 Thermotoga maritima (strain ATCC 43589 / DSM 3109 / JCM 10099 / NBRC 100826 / MSB8) Triosephosphate isomerase Proteins 0.000 description 1
- XYEXCEPTALHNEV-RCWTZXSCSA-N Thr-Arg-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XYEXCEPTALHNEV-RCWTZXSCSA-N 0.000 description 1
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 1
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 1
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 1
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 1
- GXDLGHLJTHMDII-WISUUJSJSA-N Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(O)=O GXDLGHLJTHMDII-WISUUJSJSA-N 0.000 description 1
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 1
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 1
- QNTBGBCOEYNAPV-CWRNSKLLSA-N Trp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O QNTBGBCOEYNAPV-CWRNSKLLSA-N 0.000 description 1
- WACMTVIJWRNVSO-CWRNSKLLSA-N Trp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O WACMTVIJWRNVSO-CWRNSKLLSA-N 0.000 description 1
- NXJZCPKZIKTYLX-XEGUGMAKSA-N Trp-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NXJZCPKZIKTYLX-XEGUGMAKSA-N 0.000 description 1
- 244000177175 Typha elephantina Species 0.000 description 1
- 235000018747 Typha elephantina Nutrition 0.000 description 1
- HPYDSVWYXXKHRD-VIFPVBQESA-N Tyr-Gly Chemical compound [O-]C(=O)CNC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 HPYDSVWYXXKHRD-VIFPVBQESA-N 0.000 description 1
- WPXKRJVHBXYLDT-JUKXBJQTSA-N Tyr-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N WPXKRJVHBXYLDT-JUKXBJQTSA-N 0.000 description 1
- VNYDHJARLHNEGA-RYUDHWBXSA-N Tyr-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=C(O)C=C1 VNYDHJARLHNEGA-RYUDHWBXSA-N 0.000 description 1
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 1
- JAQGKXUEKGKTKX-HOTGVXAUSA-N Tyr-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 JAQGKXUEKGKTKX-HOTGVXAUSA-N 0.000 description 1
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 1
- SMKXLHVZIFKQRB-GUBZILKMSA-N Val-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](C(C)C)N SMKXLHVZIFKQRB-GUBZILKMSA-N 0.000 description 1
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 1
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 1
- ZIGZPYJXIWLQFC-QTKMDUPCSA-N Val-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C(C)C)N)O ZIGZPYJXIWLQFC-QTKMDUPCSA-N 0.000 description 1
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 1
- KANQPJDDXIYZJS-AVGNSLFASA-N Val-Leu-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N KANQPJDDXIYZJS-AVGNSLFASA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- GVRKWABULJAONN-VQVTYTSYSA-N Val-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVRKWABULJAONN-VQVTYTSYSA-N 0.000 description 1
- BZDGLJPROOOUOZ-XGEHTFHBSA-N Val-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N)O BZDGLJPROOOUOZ-XGEHTFHBSA-N 0.000 description 1
- 208000010115 WHIM syndrome Diseases 0.000 description 1
- 208000033355 WHIM syndrome 1 Diseases 0.000 description 1
- 241000235017 Zygosaccharomyces Species 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 239000008351 acetate buffer Substances 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 239000008272 agar Substances 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 150000001294 alanine derivatives Chemical group 0.000 description 1
- 150000001298 alcohols Chemical class 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 229910000147 aluminium phosphate Inorganic materials 0.000 description 1
- 229910021529 ammonia Inorganic materials 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 229920001222 biopolymer Polymers 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 101150114858 cbh2 gene Proteins 0.000 description 1
- FYGDTMLNYKFZSV-ZWSAEMDYSA-N cellotriose Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)O[C@@H](O[C@@H]2[C@H](OC(O)[C@H](O)[C@H]2O)CO)[C@H](O)[C@H]1O FYGDTMLNYKFZSV-ZWSAEMDYSA-N 0.000 description 1
- 235000013339 cereals Nutrition 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000012993 chemical processing Methods 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000010924 continuous production Methods 0.000 description 1
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 1
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 150000002009 diols Chemical class 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- CDMADVZSLOHIFP-UHFFFAOYSA-N disodium;3,7-dioxido-2,4,6,8,9-pentaoxa-1,3,5,7-tetraborabicyclo[3.3.1]nonane;decahydrate Chemical compound O.O.O.O.O.O.O.O.O.O.[Na+].[Na+].O1B([O-])OB2OB([O-])OB1O2 CDMADVZSLOHIFP-UHFFFAOYSA-N 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000005284 excitation Effects 0.000 description 1
- 238000004880 explosion Methods 0.000 description 1
- 239000012847 fine chemical Substances 0.000 description 1
- 102000054767 gene variant Human genes 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 125000002791 glucosyl group Chemical group C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 1
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 description 1
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 1
- YMAWOPBAYDPSLA-UHFFFAOYSA-N glycylglycine Chemical compound [NH3+]CC(=O)NCC([O-])=O YMAWOPBAYDPSLA-UHFFFAOYSA-N 0.000 description 1
- 108010015792 glycyllysine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 108010002430 hemicellulase Proteins 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 230000028644 hyphal growth Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000011081 inoculation Methods 0.000 description 1
- 239000012978 lignocellulosic material Substances 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L magnesium chloride Substances [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 235000019341 magnesium sulphate Nutrition 0.000 description 1
- 229910052748 manganese Inorganic materials 0.000 description 1
- 239000011572 manganese Substances 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 108010056582 methionylglutamic acid Proteins 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000003801 milling Methods 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 150000002482 oligosaccharides Polymers 0.000 description 1
- 150000007524 organic acids Chemical class 0.000 description 1
- 235000005985 organic acids Nutrition 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 1
- CWCMIVBLVUHDHK-ZSNHEYEWSA-N phleomycin D1 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC[C@@H](N=1)C=1SC=C(N=1)C(=O)NCCCCNC(N)=N)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C CWCMIVBLVUHDHK-ZSNHEYEWSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 239000008363 phosphate buffer Substances 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000001766 physiological effect Effects 0.000 description 1
- 238000013492 plasmid preparation Methods 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 229920005862 polyol Polymers 0.000 description 1
- 150000003077 polyols Chemical class 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 230000001323 posttranslational effect Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 239000002994 raw material Substances 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 102200017779 rs10132601 Human genes 0.000 description 1
- 102200161322 rs11739136 Human genes 0.000 description 1
- 102220342577 rs1344845304 Human genes 0.000 description 1
- 102220082618 rs138865666 Human genes 0.000 description 1
- 102220005514 rs3180281 Human genes 0.000 description 1
- 102220005433 rs35628685 Human genes 0.000 description 1
- 102200037746 rs5931 Human genes 0.000 description 1
- 108010038196 saccharide-binding proteins Proteins 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 239000001632 sodium acetate Substances 0.000 description 1
- 235000017281 sodium acetate Nutrition 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 238000010186 staining Methods 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 239000011573 trace mineral Substances 0.000 description 1
- 235000013619 trace mineral Nutrition 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 235000017103 tryptophane Nutrition 0.000 description 1
- 150000003654 tryptophanes Chemical class 0.000 description 1
- 108010021889 valylvaline Proteins 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
- 238000009279 wet oxidation reaction Methods 0.000 description 1
- 239000002023 wood Substances 0.000 description 1
- 239000007222 ypd medium Substances 0.000 description 1
Landscapes
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Description
Field of invention [0001] The invention discloses cellulase enzymes with optimized properties for processing of cellulose- and lignocellulose-containing substrates. In particular, cellobiohydrolase enzymes with preferred characteristics are disclosed. The present disclosure provides fusion, insertion, deletion and/or substitution variants of such enzymes. Enzyme variants have enhanced thermostability, proteolytic stability, specific activity and/or stability at extreme pH. Nucleic acid molecules encoding said enzymes, a composition comprising said enzymes, a method for preparation, and the use for cellulose processing and/or for the production of biofuels are disclosed.
Background of the invention [0002] The development of production processes based on renewable resources is highly desired, for example for the generation of ethanol from cellulosic and lignocellulosic materials.
[0003] Cellulose material in pure form or in combination with hemicellulose and/or lignin is a valuable and readily available raw material for the production of chemicals and fuels. A key step in processing cellulose and lignocellulose is the hydrolysis of the beta-1,4-linked glucose polymer cellulose and the subsequent release of glucose monomers and short glucose oligomers such as cellobiose, cellotriose, etc. Enzymes that catalyze this reaction are found in various organisms, especially filamentous fungi and bacteria, that are capable of degrading and hydrolysing cellulose.
[0004] Continuous processes for converting solid lignocellulosic biomass into combustible fuel products are known. Treatment to make cellulosic substrates more susceptible to enzymatic degradation comprises milling, chemical processing and/or hydrothermal processing. Examples are wet oxidation and/or steam explosion. Such treatments increase the accessibility of cellulose fibers and separate them from hemicellulose and lignin, as required for the enzymatic degradation of cellulose polymers. Among the enzymes involved, cellobiohydrolase (CBH) enzymes, and more specifically cellobiohydrolase I (CBHI) enzymes, play a key role in the hydrolysis step as they provide the most processive enzymatic activity. CBHI enzymes catalyze the progressive hydrolytic release of cellobiose from the reducing end of the cellulose polymers. (Lynd LR, Weimer PJ, van Zyl WH, Pretorius IS. Microbial cellulose utilization: fundamentals and biotechnology. Microbiol Mol Biol Rev. 2002 Sep;66(3):506-77).
[0005] Hydrolyzed cellulosic materials contain several valuable carbohydrate molecules which can be isolated from the mixtures. Sugar-containing hydrolysates of cellulosic materials can be used for microbial production of a variety of fine chemicals or biopolymers, such as organic acids, ethanol or higher alcohols (also diols or polyols) or polyhydroxyalkanoates (PHAs). One of the major uses of the sugar hydrolysates is in the production of biofuels.
[0006] Kurabi et al. (2005) describe preparations of cellulases from Trichoderma reesei and other fungi, such as Penicillium sp. The performance has been analysed on steam-exploded and ethanol organosolv-pretreated Douglas-fir. Better performance of enzyme mixtures appears to be a result of improved properties of single component enzymes as well as the effect of each compound in the mixture, especially the presence of beta-glucosidase. (Kurabi A, Berlin A, Gilkes N, Kilburn D, Bura R, Robinson J, Markov A, Skomarovsky A, Gusakov A, Okunev O, Sinitsyn A, Gregg D, Xie D, Saddler J. (2005) Enzymatic hydrolysis of steam-exploded and ethanol organosolv-pretreated Douglas-Fir by novel and commercial fungal cellulases. Appl Biochem Biotechnol. 121-124: 219-30).
[0007] Cellobiohydrolase sequences of the glucohydrolase class 7 (cel7) are known to the art from several fungal sources. The Talaromyces emersonii Cel7 cellobiohydrolase is known, and expression was reported in Escherichia coli (Grassick A, Murray PG, Thompson R, Collins CM, Byrnes L, Birrane G, Higgins TM, Tuohy MG. Three-dimensional structure of a thermostable native cellobiohydrolase, CBH IB, and molecular characterization of the cel7 gene from the filamentous fungus, Talaromyces emersonii. Eur J Biochem. 2004 Nov;271(22):4495-4506) and Saccharomyces cerevisiae (Voutilainen SP, Murray PG, Tuohy MG, Koivula A. Expression of Talaromyces emersonii cellobiohydrolase Cel7A in Saccharomyces cerevisiae and rational mutagenesis to improve its thermostability and activity. Protein Eng Des Sel. 2010 Feb;23(2):69-79); however, the protein was either produced in inactive form or at rather low yields (less than or equal to 5 mg/l). Grassick et al. present a report on the purification and 3D structural determination of a native core CBH protein, and on the cloning and over-expression of the corresponding gene, from a thermophilic fungal source. CBH IB was found to be extremely thermostable, with a temperature optimum of 68 °C at pH 5.0 and a half-life (t1/2) of 68.0 min at 80 °C and pH 5.0. Hypocrea jecorina cellobiohydrolase I can be produced from wild type or engineered strains of the genus Hypocrea or Trichoderma at high yields. Improved sequences of Hypocrea jecorina Cel7A are disclosed by US7459299B2, US7452707B2, WO2005/030926, WO01/04284A1 or US2009/0162916 A1.
[0008] Positions leading to improvements were deduced from alignments with sequences from reported thermostable enzymes, suggested from structural information and shuffling of identified positions followed by limited screenings. Screening of larger libraries in transformable organisms such as Saccharomyces cerevisiae was reported by application of very sensitive fluorescent substrates, which resemble native substrates in a very restricted way. (Percival Zhang YH, Himmel ME, Mielenz JR. Outlook for cellulase improvement: screening and selection strategies. Biotechnol Adv. 2006 Sep-Oct;24(5):452-81).
[0009] The production of cellobiohydrolases from other fungal systems such as Thermoascus aurantiacus, Chrysosporium lucknowense or Phanerochaete chrysosporium was reported. Expression of Cel7 cellobiohydrolase from yeasts was reported, but enzymatic yields or enzyme properties remain unsatisfactory. (Penttilä ME, André L, Lehtovaara P, Bailey M, Teeri TT, Knowles JK. Efficient secretion of two fungal cellobiohydrolases by Saccharomyces cerevisiae. Gene. 1988;63(1):103-12).
[0010] WO03/000941 discloses a number of CBHs and their corresponding gene sequences. Physiological properties and applications, however, were not disclosed. The fusion of cellulose binding domains to catalytic subunits of cellobiohydrolases is reported to improve the hydrolytic properties of proteins without a native domain.
[0011] US 2009042266 (A1) discloses fusions of Thermoascus aurantiacus Cel7A with cellulose binding domains from cellobiohydrolase I from Chaetomium thermophilum and Hypocrea jecorina.
[0012] US5686593 reports the fusion of specially designed linker regions and binding domains to cellobiohydrolases.
[0013] Hong et al. (2003) describe the production of Thermoascus aurantiacus CBHI in yeast and its characterization. (Hong J, Tamaki H, Yamamoto K, Kumagai H. Cloning of a gene encoding thermostable cellobiohydrolase from Thermoascus aurantiacus and its expression in yeast. Appl Microbiol Biotechnol. 2003 Nov;63(1):42-50).
[0014] Tuohy et al. (2002) report the expression and characterization of Talaromyces emersonii CBH. (Tuohy MG, Walsh DJ, Murray PG, Claeyssens M, Cuffe MM, Savage AV, Coughlan MP. Kinetic parameters and mode of action of the cellobiohydrolases produced by Talaromyces emersonii. Biochim Biophys Acta. 2002 Apr 29;1596(2):366-80).
[0015] Nevoigt (2008) reports on the expression of cellulolytic enzymes in yeasts. (Nevoigt E. Progress in metabolic engineering of Saccharomyces cerevisiae. Microbiol Mol Biol Rev. 2008 Sep;72(3):379-412).
[0016] Fujita et al. (2004) report on a Saccharomyces cerevisiae strain expressing a combination of an endoglucanase, a beta-glucosidase and a CBH II displayed on the cell surface. Cellobiohydrolase I (Cel7) was not used in this setup. (Fujita Y, Ito J, Ueda M, Fukuda H, Kondo A. Synergistic saccharification, and direct fermentation to ethanol, of amorphous cellulose by use of an engineered yeast strain codisplaying three types of cellulolytic enzyme. Appl Environ Microbiol. 2004 Feb;70(2):1207-12).
[0017] Boer et al. (2000) describe the expression of GH7 classified enzymes in different yeast hosts, but expressed protein levels were low. (Boer H, Teeri TT, Koivula A. Characterization of Trichoderma reesei cellobiohydrolase Cel7A secreted from Pichia pastoris using two different promoters. Biotechnol Bioeng. 2000 Sep 5;69(5):486-94).
[0018] Godbole et al. (1999) and Hong et al. (2003) found that proteins of this enzyme class expressed from yeast were often misfolded and hyperglycosylated, and that their hydrolytic capabilities were decreased compared to the protein expressed from the homologous host. (Godbole S, Decker SR, Nieves RA, Adney WS, Vinzant TB, Baker JO, Thomas SR, Himmel ME. Cloning and expression of Trichoderma reesei cellobiohydrolase I in Pichia pastoris. Biotechnol Prog. 1999 Sep-Oct;15(5):828-33).
[0019] Kanokratana et al. (2008), Li et al. (2009) as well as CN01757710 describe the efficient expression of Cel7 CBH I enzymes; however, these proteins are lacking cellulose binding domains required for efficient substrate processing. (Kanokratana P, Chantasingh D, Champreda V, Tanapongpipat S, Pootanakit K, Eurwilaichitr L. Identification and expression of cellobiohydrolase (CBHI) gene from an endophytic fungus, Fusicoccum sp. (BCC4124) in Pichia pastoris. Protein Expr Purif. 2008 Mar;58(1):148-53. Epub 2007 Sep 19; Li YL, Li H, Li AN, Li DC. Cloning of a gene encoding thermostable cellobiohydrolase from the thermophilic fungus Chaetomium thermophilum and its expression in Pichia pastoris. J Appl Microbiol. 2009 Jun;106(6):1867-75).
[0020] Voutilainen (2008) and Viikari (2007) disclose Cel7 enzymes comprising thermostable cellobiohydrolases, however with only low to moderate expression levels from Trichoderma reesei. (Voutilainen SP, Puranen T, Siika-Aho M, Lappalainen A, Alapuranen M, Kallio J, Hooman S, Viikari L, Vehmaanperä J, Koivula A. Cloning, expression, and characterization of novel thermostable family 7 cellobiohydrolases. Biotechnol Bioeng. 2008 Oct 15;101(3):515-28. PubMed PMID: 18512263; Viikari L, Alapuranen M, Puranen T, Vehmaanperä J, Siika-Aho M. Thermostable enzymes in lignocellulose hydrolysis. Adv Biochem Eng Biotechnol. 2007;108:121-45).
[0021] Grassick et al. (2004) disclose unfolded expression of Cellobiohydrolase I from Talaromyces emersonii in Escherichia coli but not in yeast. (Grassick A, Murray PG, Thompson R, Collins CM, Byrnes L, Birrane G, Higgins TM, Tuohy MG. Three-dimensional structure of a thermostable native cellobiohydrolase, CBH IB, and molecular characterization of the cel7 gene from the filamentous fungus, Talaromyces emersonii. Eur J Biochem. 2004 Nov;271(22):4495-506).
[0022] WO 2009/138877 describes a method for heterologous expression of polypeptides encoded by wild-type and codon-optimized variants of cbh1 and/or cbh2 from the fungal organisms Talaromyces emersonii (T. emersonii), Humicola grisea (H. grisea), Thermoascus aurantiacus (T. aurantiacus), and Trichoderma reesei (T. reesei) in host cells, such as the yeast Saccharomyces cerevisiae. The expression in such host cells of the corresponding genes, and variants and combinations thereof, was found to result in improved specific activity of the expressed cellobiohydrolases.
[0023] WO 2009/139839 describes a methods and composition for a large capacity alphavirus vector and particle. In some aspects methods for providing alphavirus particles comprising a modified capsid protein are described.
[0024] Therefore, there is a need for cellulase enzymes with improved characteristics for the use in technical processes for cellulose hydrolysis. In particular there is a need for CBH enzymes with higher catalytic activity and/or higher stability under process conditions. Moreover there is a need for CBH enzymes with higher productivity in fungal and/or yeast expression and secretion systems.
Summary of the invention [0025] The present invention provides a polypeptide having cellobiohydrolase activity. This polypeptide comprises an amino acid sequence having at least 85 % sequence identity to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion.
[0026] Furthermore, the present invention discloses a nucleic acid encoding the polypeptide of the present invention, preferably having at least 95 % identity to SEQ ID NO: 1, a vector comprising this nucleic acid and a host transformed with said vector.
[0027] The present application further discloses a method of producing a cellobiohydrolase protein encoded by a vector of the present invention, a method for identifying polypeptides having cellobiohydrolase activity, and a method of preparing such polypeptides having cellobiohydrolase activity.
[0028] The present invention also provides a polypeptide having cellobiohydrolase activity, wherein the polypeptide comprises an amino acid sequence having at least 85 % sequence identity to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion, wherein one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 2 are modified by substitution or deletion: G4, A6, T15, Q28, W40, D64, E65, A72, S86, K92, V130, V152, Y155, K159, D181, E183, N194, D202, P224, T243, Y244, I277, K304, N310, S311, N318, D320, T335, T344, D346, Q349, A358, Y374, A375, T392, T393, D410, Y422, P442, N445, R446, T456, S460, P462, G463, H468 and/or V482 of amino acids 1 to 500 of SEQ ID NO: 2.
[0029] Moreover, the present application discloses a polypeptide having cellobiohydrolase activity, which is obtainable by the method of preparing a polypeptide having cellobiohydrolase activity according to the present application, and a polypeptide having cellobiohydrolase activity, wherein the polypeptide comprises an amino acid sequence having at least 80 % sequence identity to SEQ ID NO: 5, wherein one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 5 are modified by substitution or deletion: Q1, G4, A6, T15, Q28, W40, D64, E65, A72, S86, K92, V130, V152, Y155, K159, D181, E183, N194, D202, P224, T243, Y244, I277, K304, N310, S311, N318, D320, T335, T344, D346, Q349, A358, Y374, A375, T392, T393, D410 and/or Y422 of amino acids 1 to 440 of SEQ ID NO: 5.
[0030] The present application furthermore discloses a polypeptide having cellobiohydrolase activity comprising an amino acid sequence having at least 85 % sequence identity to SEQ ID NO: 12 wherein one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 12 are modified by substitution or deletion: Q1, T15, Q28, W40, C72, V133, V155, Y158, T162, Y247, N307, G308, E317, S341, D345, Y370, T389, Q406, N441, R442, T452, S456, P458, G459, H464 and/or V478.
[0031] The present invention furthermore discloses a composition comprising a polypeptide of the present invention and one or more endoglucanases and/or one or more beta-glucosidases and/or one or more further cellobiohydrolases and/or one or more xylanases.
[0032] The present invention further provides the use of a polypeptide or the composition of the present invention for the enzymatic degradation of lignocellulosic biomass, and/or for textiles processing and/or as ingredient in detergents and/or as ingredient in food or feed compositions.
Brief description of the figures [0033]
Figure 1: Restriction map of pV1 for constitutive expression of proteins in Pichia pastoris: pUC19 - ori: Origin of replication in E. coli; KanR: Kanamycin/G418 resistance with TEF1 and EM7 promoter sequences for selection in Pichia pastoris and E. coli, respectively; 5’-GAP: glyceraldehyde-3-phosphate dehydrogenase promoter region; 3’-GAP: terminator region; SP MFalpha: Saccharomyces cerevisiae mating factor alpha signal sequence; MCS: multiple cloning site.
Figure 2: Coomassie-stained SDS-PAGE of 10-fold concentrated supernatants of shake-flask cultures of Pichia pastoris CBS 7435 containing expression plasmids with coding sequences for the mature CBH I proteins of Trichoderma viride (CBH-f; lane 2), Humicola grisea (CBH-d; lane 3), Talaromyces emersonii (CBH-b; lane 4), Thermoascus aurantiacus (CBH-e; lane 5), as well as the Talaromyces emersonii CBHI-CBD fusion (CBH-a; lane 6) and the Humicola grisea-CBD fusion (CBH-g; lane 7) in N-terminal fusion to the signal peptide of the Saccharomyces cerevisiae mating factor alpha under control of the Pichia pastoris glyceraldehyde-3-phosphate dehydrogenase promoter.
Figure 3: Map of the pV3 expression plasmid for protein expression in Pichia pastoris. Replicons: pUC19 - ori: Origin of replication in E. coli; ZeoR: Zeocin resistance gene with TEF1 and EM7 promoter sequences for expression in Pichia pastoris and E. coli, respectively; AOX I promoter: Promoter region of the Pichia pastoris alcohol oxidase I gene; AOX 1 transcriptional terminator: terminator region; SP MFalpha: Saccharomyces cerevisiae mating factor alpha signal sequence; MCS: multiple cloning site.
Figure 4: SDS-PAGE analysis of culture supernatant samples taken from the fermentation of a Pichia pastoris strain with a genomic integration of an AOXI-expression cassette, expressing the Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion peptide (CBH-a) in a 7 l bioreactor during methanol induction. Samples P1 - P7 are taken at the beginning of the methanol induction and after 20, 45, 119.5, 142.5, 145.5 and 167 hours, respectively.
Figure 5: Map of pV4 expression plasmid for the expression of the Talaromyces emersonii CBH I / Trichoderma reesei-CBD fusion peptide (CBH-ah) in Trichoderma reesei. Replicon: pUC19 for replication in E. coli. cbh1 5’: 5’ promoter region of the Trichoderma CBH I gene; cbh1 signal peptide: Coding sequence for the Trichoderma reesei CBH I leader peptide; CBH-a: Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion peptide: coding region for SEQ ID NO: 18; cbh1 Terminator: 3’ termination region of the Trichoderma reesei CBHI locus; hygromycin resistance: coding region of the hygromycin phosphotransferase under control of a Trichoderma reesei phosphoglycerate kinase promoter; cbh1 3’: homology sequence to the termination region of the Trichoderma reesei CBH I locus for double crossover events.
Figure 6: SDS-PAGE of Trichoderma reesei culture supernatants. Lane 1 shows the expression pattern of a replacement strain carrying a Talaromyces emersonii CBH I / Trichoderma reesei-CBD fusion (CBH-ah) in place of the native CBHI gene. In comparison, lane 2 shows the pattern for the unmodified strain under the same conditions. M: molecular size marker.
Figure 7: Determination of IT50 values from Substrate Conversion Capacity vs. temperature graphs after normalization. For the normalization step the maximum and the minimum fluorescence values for the selected temperature are correlated to 1 or 0, respectively. Linear interpolation to F’(T)=0.5 between the nearest two temperature points with normalized values next to 0.5 results in the defined IT50 temperature.
Figure 8: Normalized Conversion Capacity vs. temperature graphs of "wt" Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusions (CBH-ah: SEQ ID NO: 18 = SEQ ID NO: 2 + 6x His-Tag) and mutants based on 4-Methylumbelliferyl-β-D-lactoside hydrolysis results evaluated at various temperatures. The fluorescence values were normalized according to Figure 7 over the temperature range from 55 °C to 75 °C. A: wt; B: G4C,A72C; C: G4C,A72C,Q349K; D: G4C,A72C,D181N,Q349K; E: Q1L,G4C,A72C,D181N,E183K,Q349R; F: Q1L,G4C,A72C,S86T,D181N,E183K,D320V,Q349R; G: G4C,A72C,E183K,D202Y,N310D,Q349R; H: Q1L,G4C,A72C,A145T,H203R,Q349K,T403K.
Figure 9: Glucose yields of hydrolysis of pretreated straw with wt and mutated Talaromyces emersonii CBHI / Trichoderma reesei-CBD (CBH-ah) fusion protein after hydrolysis for 48 hours in the presence of a β-glycosidase. The variants are characterized by the following mutations with respect to SEQ ID NO: 18 and were expressed from Pichia pastoris in shake flask cultures and isolated from the supernatant by affinity chromatography using Ni-NTA.
A: wt
B: G4C,A72C
C: G4C,A72C,Q349K
D: G4C,A72C,D181N,Q349K
E: Q1L,G4C,A72C,D181N,E183K,Q349R
F: Q1L,G4C,A72C,S86T,D181N,E183K,D320V,Q349R
G: G4C,A72C,E183K,D202Y,N310D,Q349R
Figure 10: Alignment of SEQ ID NO: 2 with the Trichoderma reesei CBHI. The alignment matrix blosum62mt2 with gap opening penalty of 10 and gap extension penalty of 0.1 was used to create the alignment.
Detailed description of the invention [0034] The present invention discloses a polypeptide having cellobiohydrolase activity, which comprises an amino acid sequence with at least 85 % sequence identity to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion. "Cellobiohydrolase" or "CBH" refers to enzymes that cleave cellulose from the end of the glucose chain and produce cellobiose as the main product. Alternative names are 1,4-beta-D-glucan cellobiohydrolases or cellulose 1,4-beta-cellobiosidases. CBHs hydrolyze the 1,4-beta-D-glucosidic linkages from the reducing or non-reducing ends of a polymer containing said linkages. "Cellobiohydrolase I" or "CBH I" acts from the reducing end of the cellulose fiber. "Cellobiohydrolase II" or "CBH II" acts from the non-reducing end of the cellulose fiber. Cellobiohydrolases typically have a structure consisting of a catalytic domain and one or more "cellulose-binding domains" or "CBD". Such domains can be located either at the N- or C-terminus of the catalytic domain. CBDs have carbohydrate-binding activity and mediate the binding of the cellulase to crystalline cellulose. The presence or absence of binding domains is known to have a major impact on the processivity of an enzyme, especially on polymeric substrates.
[0035] The parental sequence is given in SEQ ID NO: 2. The sequence derives from the C-terminal fusion of the linker domain and cellulose binding domain of Trichoderma reesei CBHI (SEQ ID NO: 4) to the catalytic domain of Talaromyces emersonii CBHI (SEQ ID NO: 5).
[0036] In a preferred aspect, the invention discloses protein variants that show a high activity at high temperature over an extended period of time. Preferably, the polypeptide of the present invention maintains 50 % of its maximum substrate conversion capacity when the conversion is done for 60 minutes at a temperature of 60 °C or higher. The respective temperature is also referred to as the IT50 value. In other words, the IT50 value is preferably 60 °C or higher. "Substrate Conversion Capacity" of an enzyme is herein defined as the degree of substrate conversion catalyzed by an amount of enzyme within a certain time period under defined conditions (Substrate concentration, pH value and buffer concentration, temperature), as can be determined by end-point assaying of the enzymatic reaction under said conditions. "Maximum Substrate Conversion Capacity" of an enzyme is herein defined as the maximum in Substrate Conversion Capacity found for the enzyme within a number of measurements performed as described before, where only one parameter, e.g. the temperature, was varied within a defined range. According to the present invention, the assay described in Example 8 is used to determine these parameters.
[0037] Furthermore, the disclosed polypeptides have preferably an IT50 value in the range of 62 to 70 °C, more preferably 65 to 70 °C.
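The normalization and interpolation described for Figure 7 can be sketched as follows. This is only a minimal illustration, assuming end-point fluorescence readings taken at a series of temperatures; the function name `it50` and the example readings are hypothetical and are not taken from the examples of this document. The sketch looks for the crossing of 0.5 on the high-temperature side of the maximum, which is an assumption about how the graph is read.

```python
def it50(temps, signal):
    """Estimate the IT50 from end-point readings (hypothetical helper).

    temps  : temperatures in ascending order (degrees C)
    signal : fluorescence reading at each temperature
    """
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("flat signal: cannot normalize")
    # Normalize so that the maximum maps to 1 and the minimum to 0.
    norm = [(s - lo) / (hi - lo) for s in signal]
    # Assumption: search from the maximum towards higher temperatures.
    start = signal.index(hi)
    for i in range(start, len(temps) - 1):
        a, b = norm[i], norm[i + 1]
        if a >= 0.5 > b:
            # Linear interpolation between the two points bracketing 0.5.
            frac = (a - 0.5) / (a - b)
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("normalized signal never drops below 0.5")


# Hypothetical readings, for illustration only.
value = it50([55, 60, 65, 70, 75], [1000, 980, 800, 300, 100])
print(round(value, 2))
```

With these illustrative readings the normalized curve crosses 0.5 between 65 °C and 70 °C, so the estimate falls inside that interval.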
[0038] The polypeptide of the present invention preferably comprises an amino acid sequence having at least 90 %, preferably at least 95 %, more preferably at least 99 % sequence identity to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion. Furthermore, it is particularly preferred that the amino acid sequence of the polypeptide has the sequence as defined by SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion, or a sequence as defined by SEQ ID NO: 2 wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion, wherein 1 to 75, more preferably 1 to 35 amino acid residues are substituted, deleted, or inserted.
[0039] Particularly preferred are variants of the protein of SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion. "Protein variants" are polypeptides whose amino acid sequence differs in one or more positions from this parental protein, whereby differences might be replacements of one amino acid by another, deletions of single or several amino acids, or insertion of additional amino acids or stretches of amino acids into the parental sequence. Per definition variants of the parental polypeptide shall be distinguished from other polypeptides by comparison of sequence identity (alignments) using the ClustalW Algorithm (Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J. and Higgins D.G. (2007) ClustalW and ClustalX version 2. Bioinformatics 2007 23(21): 2947-2948). Methods for the generation of such protein variants include random or site directed mutagenesis, site-saturation mutagenesis, PCR-based fragment assembly, DNA shuffling, homologous recombination in-vitro or in-vivo, and methods of gene-synthesis.
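As an illustration of how a sequence identity percentage may be read off an alignment, the following minimal sketch counts identical columns in two sequences that have already been aligned (for example by ClustalW) and padded with "-" gap characters. The function name `percent_identity` is hypothetical, and dividing by the number of alignment columns is an assumption; other conventions for the denominator exist and this document does not prescribe one.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences (hypothetical helper).

    Assumption: identity = identical non-gap columns / all alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy six-residue fragments, for illustration only.
print(percent_identity("QQAGTA", "LQAGTA"))
```

A variant would then be compared against a threshold such as 85 % by a simple numerical comparison of the returned value.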
[0040] The nomenclature of amino acids, peptides, nucleotides and nucleic acids is done according to the suggestions of IUPAC. Generally amino acids are named within this document according to the one letter code.
[0041] Exchanges of single amino acids are described by naming the single letter code of the original amino acid followed by its position number and the single letter code of the replacing amino acid, e.g. the change of glutamine at position one to a leucine at this position is described as "Q1L". For deletions of single positions from the sequence the symbol of the replacing amino acid is substituted by the three letter abbreviation "del"; thus the deletion of alanine at position 3 would be referred to as "A3del". Inserted additional amino acids receive the number of the preceding position extended by a small letter in alphabetical order relative to their distance to their point of insertion. Thus, the insertion of two tryptophans after position 3 is referred to as "3aW, 3bW". Introduction of untranslated codons TAA, TGA and TAG into the nucleic acid sequence is indicated as "*" in the amino acid sequence; thus the introduction of a terminating codon at position 4 of the amino acid sequence is referred to as "G4*".
[0042] Multiple mutations are separated by a plus sign or a slash or a comma. For example, two mutations in positions 20 and 21 replacing alanine and glutamic acid with glycine and serine, respectively, are indicated as "A20G+E21S" or "A20G/E21S" or "A20G,E21S".
[0043] When an amino acid residue at a given position is substituted with two or more alternative amino acid residues these residues are separated by a comma or a slash. For example, substitution of alanine at position 20 with either glycine or glutamic acid is indicated as "A20G,E" or "A20G/E", or "A20G, A20E".
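The nomenclature described above lends itself to mechanical parsing. The following is a minimal sketch, not a complete grammar: the function names `parse_mutation` and `parse_variant` and the dictionary layout are hypothetical. It covers substitutions ("Q1L"), deletions ("A3del"), insertions ("3aW") and terminating codons ("G4*"), with mutations separated by plus sign, slash or comma. The shorthand for alternative residues at one position (such as "A20G,E") is deliberately not handled and raises an error.

```python
import re

# One pattern per notation form; deletion is tested before substitution.
_PATTERNS = [
    ("deletion", re.compile(r"^([A-Z])(\d+)del$")),
    ("stop", re.compile(r"^([A-Z])(\d+)\*$")),
    ("substitution", re.compile(r"^([A-Z])(\d+)([A-Z])$")),
    ("insertion", re.compile(r"^(\d+)([a-z])([A-Z])$")),
]


def parse_mutation(token):
    """Parse one mutation token such as 'Q1L', 'A3del', '3aW' or 'G4*'."""
    token = token.strip()
    for kind, pattern in _PATTERNS:
        m = pattern.match(token)
        if m is None:
            continue
        if kind == "insertion":
            pos, order, new = m.groups()
            return {"kind": kind, "position": int(pos),
                    "order": order, "new": new}
        if kind == "substitution":
            old, pos, new = m.groups()
            return {"kind": kind, "old": old,
                    "position": int(pos), "new": new}
        old, pos = m.groups()
        return {"kind": kind, "old": old, "position": int(pos)}
    raise ValueError(f"unrecognized mutation: {token!r}")


def parse_variant(text):
    """Split a variant description on '+', '/' or ',' and parse each part."""
    return [parse_mutation(t) for t in re.split(r"[+/,]", text) if t.strip()]


for item in parse_variant("Q1L,G4C,A72C,D181N,E183K,Q349R"):
    print(item)
```

Because all three separators are treated alike, "A20G+E21S", "A20G/E21S" and "A20G,E21S" parse to the same list.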
[0044] When a position suitable for modification is identified herein without any specific modification being suggested, it is to be understood that any amino acid residue may be substituted for the amino acid residue present in the position. Thus, for instance, when a modification of an alanine in position 20 is mentioned but not specified, it is to be understood that the alanine may be deleted or replaced by any other amino acid residue (i.e. any one of R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V).
[0045] The terms "similar mutation" or "similar substitution" refer to an amino acid mutation that a person skilled in the art would consider similar to a first mutation. Similar in this context means an amino acid that has similar chemical characteristics. If, for example, a mutation at a specific position leads to a substitution of a non-aliphatic amino acid residue (e.g. Ser) with an aliphatic amino acid residue (e.g. Leu), then a substitution at the same position with a different aliphatic amino acid (e.g. Ile or Val) is referred to as a similar mutation. Further amino acid characteristics include size of the residue, hydrophobicity, polarity, charge, pK-value, and other amino acid characteristics known in the art. Accordingly, a similar mutation may include substitution such as basic for basic, acidic for acidic, polar for polar etc. The sets of amino acids thus derived are likely to be conserved for structural reasons. These sets can be described in the form of a Venn diagram (Livingstone CD, and Barton GJ. (1993) "Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation" Comput. Appl. Biosci. 9: 745-756; Taylor W. R. (1986) "The classification of amino acid conservation" J. Theor. Biol. 119: 205-218). Similar substitutions may be made, for example, according to the following grouping of amino acids: Hydrophobic: F W Y H K M I L V A G; Aromatic: F W Y H; Aliphatic: I L V; Polar: W Y H K R E D C S T N; Charged: H K R E D; Positively charged: H K R; Negatively charged: E D.
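The grouping listed above can be expressed as a small lookup. The sketch below is illustrative only: the name `shared_groups` is hypothetical, and reporting the groups that two residues have in common is merely one possible way of making "similar" operational; which groups count as decisive in a given case remains a matter of judgment.

```python
# Residue groups as listed in the grouping above (one-letter codes).
GROUPS = {
    "Hydrophobic": set("FWYHKMILVAG"),
    "Aromatic": set("FWYH"),
    "Aliphatic": set("ILV"),
    "Polar": set("WYHKREDCSTN"),
    "Charged": set("HKRED"),
    "Positively charged": set("HKR"),
    "Negatively charged": set("ED"),
}


def shared_groups(res_a, res_b):
    """Return the names of all groups that contain both residues."""
    return {
        name
        for name, members in GROUPS.items()
        if res_a in members and res_b in members
    }


# Leu and Ile share the aliphatic group; Leu and Ser share none.
print(sorted(shared_groups("L", "I")))
print(sorted(shared_groups("L", "S")))
```

This mirrors the example in the text: a replacement by Leu and a replacement by Ile or Val at the same position fall into a common aliphatic group, whereas Ser and Leu do not share any of the listed groups.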
[0046] As convention for numbering of amino acids and designation of protein variants for the description of protein variants the first glutamine (Q) of the amino acid sequence QQAGTA within the parental protein sequence given in SEQ ID NO: 2 is referred to as position number 1 or Q1 or glutamine 1. The numbering of all amino acids will be according to their position in the parental sequence given in SEQ ID NO: 2 relative to this position number 1.
[0047] The present invention furthermore discloses variants of the polypeptides of the present invention with changes of their sequence at one or more of the positions: G4, A6, T15, Q28, W40, D64, E65, A72, S86, K92, V130, V152, Y155, K159, D181, E183, N194, D202, P224, T243, Y244, I277, K304, N310, S311, N318, D320, T335, T344, D346, Q349, A358, Y374, A375, T392, T393, D410, Y422, P442, N445, R446, T456, S460, P462, G463, H468 and/or V482 of amino acids 1 to 500 of SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion.
[0048] In a preferred embodiment, the variant of the polypeptides of the present invention comprises one or more specific changes of their sequence at the following positions (preferred exchange) or a similar mutation.
[0049] Even more preferably, the variant of the polypeptides of the present invention comprises an amino acid sequence selected from the sequences with mutations with respect to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion, optionally fused with a C-terminal 6x-His Tag, listed in the following Table.
[0050] In a further aspect, the present invention discloses a nucleic acid encoding the polypeptide of the present invention. The nucleic acid is a polynucleotide sequence (DNA or RNA) which is, when set under control of an appropriate promoter and transferred into a suitable biological host or chemical environment, processed to the encoded polypeptide, whereby the process also includes all post-translational and post-transcriptional steps necessary. The coding sequence can be easily adapted by variation of degenerated base-triplets, alteration of signal sequences, or by introduction of introns, without affecting the molecular properties of the encoded protein. The nucleic acid of the present invention has preferably at least 95 %, more preferably at least 97 %, and most preferably 100 % identity to SEQ ID NO: 1. The present invention also provides a vector comprising this nucleic acid and a host transformed with said vector.
[0051] The present application also discloses methods for the production of polypeptides of the present invention and variants thereof in various host cells, including yeast and fungal hosts. It also discloses the use of the resulting strains for the improvement of protein properties by variation of the sequence. Furthermore, the present application discloses methods for the application of such polypeptides in the hydrolysis of cellulose.
[0052] A further aspect of the application discloses vectors and methods for the production of protein variants of SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion, expressing them in yeast and testing their activity on cellulosic material by measuring the released mono- and/or oligomeric sugar molecules.
[0053] The present application further discloses a method of producing a cellobiohydrolase protein, comprising the steps: a. obtaining a host cell, which has been transformed with a vector comprising the nucleic acid of the present invention; b. cultivation of the host cell under conditions under which the cellobiohydrolase protein is expressed; and c. recovery of the cellobiohydrolase protein.
[0054] In a preferred embodiment, the host cell is derived from the group consisting of Saccharomyces, Schizosaccharomyces, Kluyveromyces, Pichia, Hansenula, Aspergillus, Trichoderma, Penicillium, Candida and Yarrowia. The host cell is preferably capable of producing ethanol, wherein most preferred yeasts include Saccharomyces cerevisiae, Pichia stipitis, Pachysolen tannophilus, or a methylotrophic yeast, preferably derived from the group of host cells comprising Pichia methanolica, Pichia pastoris, Pichia angusta, Hansenula polymorpha.
[0055] It has surprisingly been found that the polypeptide according to the present invention and variants thereof can be expressed from yeast at high levels. "Yeast" shall herein refer to all lower eukaryotic organisms showing a unicellular vegetative state in their life cycle. This especially includes organisms of the class Saccharomycetes, in particular of the genera Saccharomyces, Pachysolen, Pichia, Candida, Yarrowia, Debaryomyces, Kluyveromyces, Zygosaccharomyces.
[0056] Thus, one aspect of the application discloses the expression of the claimed polypeptide and variants thereof in yeast. The efficient expression of this fusion protein (SEQ ID NO: 2) and derivative protein variants of SEQ ID NO: 2 from yeast can be achieved by insertion of the nucleic acid molecule of SEQ ID NO: 1 starting from nucleotide position 1 into an expression vector under control of at least one appropriate promoter sequence and fusion of the nucleotide molecule to an appropriate signal peptide, for example to the signal peptide of the mating factor alpha of Saccharomyces cerevisiae.
[0057] In a preferred embodiment, the polypeptide of the present invention and variants thereof are expressed and secreted at a level of more than 100 mg/l, more preferably of more than 200 mg/l, particularly preferably of more than 500 mg/l, or most preferably of more than 1 g/l into the supernatant after introduction of a nucleic acid encoding a polypeptide having an amino acid sequence with at least 85% sequence identity to the SEQ ID NO: 2 into a yeast. To determine the level of expression in yeast, the cultivation and isolation of the supernatant can be carried out as described in Example 3.
[0058] A further aspect of the application discloses methods for the production of a polypeptide according to the present invention in a filamentous fungus, preferably in a fungus of the genus Aspergillus or Trichoderma, more preferably in a fungus of the genus Trichoderma, most preferably in Trichoderma reesei. "Filamentous fungi" or "fungi" shall herein refer to all lower eukaryotic organisms showing hyphal growth during at least one state in their life cycle. This especially includes organisms of the phylum Ascomycota and Basidiomycota, in particular of the genera Trichoderma, Talaromyces, Aspergillus, Penicillium, Chrysosporium, Phanerochaete, Thermoascus, Agaricus, Pleurotus, Irpex. The polypeptide is expressed by fusion of the coding region of a compatible signal sequence to the nucleic acid molecule starting with nucleotide position 52 of SEQ ID NO: 3, as it was done in SEQ ID NO: 3 with the signal sequence of the Trichoderma reesei CBHI, and the positioning of the fusion peptide under control of a sufficiently strong promoter followed by transfer of the genetic construct to the host cell. Examples for such promoters and signal sequences as well as techniques for an efficient transfer have been described in the art.
[0059] In a further aspect the present application further discloses a method for identifying polypeptides having cellobiohydrolase activity, comprising the steps of: a. Generating a library of mutant genes encoding mutant proteins by mutagenesis of a nucleic acid according to claim 9 or a nucleic acid having the sequence defined by SEQ ID NO: 6 (encoding SEQ ID NO: 5), preferably having the sequence defined by SEQ ID NO: 1; b. Inserting each mutant gene into an expression vector; c. Transforming yeast cells with each expression vector to provide a library of yeast transformants; d. Cultivation of each yeast transformant under conditions under which the mutant protein is expressed and secreted; e. Incubating the expressed mutant protein with a substrate; f. Determining the catalytic activity of the mutant protein; g. Selecting a mutant protein according to the determined catalytic activity.
[0060] Specifically, step d. may be performed by utilizing a well-plate format. This format preferably allows the high-throughput performance of the method for identifying polypeptides having cellobiohydrolase activity.
[0061] Preferably, the steps e. to g. of the method for identifying polypeptides having cellobiohydrolase activity are performed as follows: e. Incubating the expressed mutant protein with cellulosic material; f. Determining the amount of released sugar; g. Selecting a mutant protein according to the amount of released sugar.
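Steps e. to g. amount to ranking the mutant proteins by the amount of released sugar. The following minimal sketch shows one conceivable selection rule, assuming that each well of a plate yields one released-sugar reading and that a parental ("wt") control is measured alongside; the function name `select_variants`, the ratio threshold and the readings are hypothetical and serve illustration only.

```python
def select_variants(released_sugar, reference="wt", min_ratio=1.0):
    """Rank variants whose released sugar exceeds the reference.

    released_sugar : mapping of variant name -> reading (arbitrary units)
    reference      : key of the parental control in the mapping
    min_ratio      : assumed cut-off for reading / reference reading
    """
    ref = released_sugar[reference]
    if ref <= 0:
        raise ValueError("reference reading must be positive")
    hits = [
        (name, value / ref)
        for name, value in released_sugar.items()
        if name != reference and value / ref > min_ratio
    ]
    # Best performers first.
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical plate readings, for illustration only.
readings = {"wt": 1.0, "m1": 1.5, "m2": 0.8, "m3": 1.2}
print(select_variants(readings, min_ratio=1.1))
```

In practice the cut-off and the number of replicates per variant would be chosen according to the variability of the assay.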
[0062] In another embodiment, the method for identifying polypeptides having cellobiohydrolase activity comprises the additional steps of: h. Sequencing the selected mutant gene or protein; i. Identifying the amino acid modification(s) by comparing the sequence of the selected mutant protein with the amino acid sequence of SEQ ID NO: 2.
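Step i. is a position-by-position comparison of the selected mutant with the parental sequence. A minimal sketch, assuming that both sequences have the same length so that only substitutions occur (insertions and deletions would first require an alignment); the function name `list_substitutions` is hypothetical, and the six-residue fragments are toy inputs built on the N-terminal sequence QQAGTA.

```python
def list_substitutions(parent, mutant):
    """List substitutions in the 'Q1L' style for equal-length sequences."""
    if len(parent) != len(mutant):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [
        f"{p}{i}{m}"
        for i, (p, m) in enumerate(zip(parent, mutant), start=1)
        if p != m
    ]


# Toy fragments, for illustration only.
print(list_substitutions("QQAGTA", "LQACTA"))
```

Numbering starts at 1 for the first glutamine, in line with the numbering convention used for SEQ ID NO: 2.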
[0063] The present application further discloses a method of preparing a polypeptide having cellobiohydrolase activity, comprising the steps: a. Providing a polypeptide having cellobiohydrolase activity comprising an amino acid sequence having at least 70 % sequence identity to the catalytic domain of SEQ ID NO: 2 (SEQ ID NO: 5); b. Identifying the amino acids of this polypeptide which correspond to the amino acids which are modified with respect to the amino acid sequence of SEQ ID NO: 2, as identified in step i. of the method for identifying polypeptides having cellobiohydrolase activity; and c. Preparing a mutant polypeptide of the polypeptide provided in step a. by carrying out the amino acid modification(s) identified in step b. through site-directed mutagenesis.
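Step b. requires translating positions of the parental sequence into the corresponding positions of the polypeptide provided in step a. One way to sketch this, assuming a pairwise alignment of both sequences is already available as two gapped strings of equal length, is shown below; the function name `map_positions` and the toy alignment are hypothetical.

```python
def map_positions(aligned_parent, aligned_target, gap="-"):
    """Map 1-based parent positions to (target position, target residue).

    Parent positions aligned to a gap in the target are left out.
    """
    if len(aligned_parent) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    p = t = 0
    for a, b in zip(aligned_parent, aligned_target):
        if a != gap:
            p += 1  # advance the parent numbering
        if b != gap:
            t += 1  # advance the target numbering
        if a != gap and b != gap:
            mapping[p] = (t, b)
    return mapping


# Toy alignment, for illustration only.
mapping = map_positions("QQA-GTA", "QSACGT-")
print(mapping[4])
```

In this toy alignment the glycine at parent position 4 corresponds to target position 5, because the target carries one additional residue upstream; a modification found at G4 of the parent would therefore be introduced at position 5 of the target.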
[0064] Preferably, the polypeptide provided in step a. of the method of preparing a polypeptide having cellobiohydrolase activity is a wild type cellobiohydrolase derived from Trichoderma reesei.
[0065] The present application further discloses polypeptides having cellobiohydrolase activity, which are obtainable by the method of preparing a polypeptide having cellobiohydrolase activity according to the present application.
[0066] Furthermore, the present invention provides a composition comprising a polypeptide and/or variants thereof of the present invention and one or more cellulases, e.g. one or more endoglucanases and/or one or more beta-glucosidases and/or one or more further cellobiohydrolases and/or one or more xylanases. "Cellulases" or "Cellulolytic enzymes" are defined as enzymes capable of hydrolysing cellulosic substrates or derivatives or mixed feedstocks comprising cellulosic polymers. Such enzymes are referred to as having "cellulolytic activity", thus being able to hydrolyze cellulose molecules from such material into smaller oligo- or monosaccharides. Cellulolytic enzymes include cellulases and hemicellulases, in particular they include cellobiohydrolases (CBHs), endoglucanases (EGs) and beta-glucosidases (BGLs).
[0067] The present application further discloses a polypeptide having cellobiohydrolase activity, wherein the polypeptide comprises an amino acid sequence having at least 80 %, preferably at least 95 %, more preferably at least 98 %, even more preferably at least 99 %, and most preferably 100 % sequence identity to SEQ ID NO: 5, wherein one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 5 are modified by substitution or deletion: Q1, G4, A6, T15, Q28, W40, D64, E65, A72, S86, K92, V130, V152, Y155, K159, D181, E183, N194, D202, P224, T243, Y244, I277, K304, N310, S311, N318, D320, T335, T344, D346, Q349, A358, Y374, A375, T392, T393, D410 and/or Y422 of amino acids 1 to 440 of SEQ ID NO: 5.
[0068] In a preferred embodiment, the polypeptide having cellobiohydrolase activity with an amino acid sequence having at least 80 % sequence identity to SEQ ID NO: 5 comprises one or more of the following modified amino acid residues of the sequence defined by SEQ ID NO: 5: Q1L, G4, A6G/V, T15S, Q28Q/R, W40R, D64N, E65K/V, A72V, S86T, K92K/R, V130I/V, V152A/E, Y155C, K159E, D181N, E183V/K, N194C/R/Y/D/K/I/L/G/Q/S/V, D202Y/N/G, P224L, T243I/R/Y/A/F/Q/P/D/V/W/L/M, Y244F/H, I277V, K304R, N310D, S311G/N, N318Y, D320V/E/N, T335I, T344M, D346G/A/E/V, Q349R/K, A358E, Y374C/P/R/H/S/A, A375D/N/Y/R/Q/L/V/E/G/T/M, T392C/D/K, T393A, D410G, Y422F.
[0069] More preferably, the polypeptide having cellobiohydrolase activity comprises one or more modified amino acid residues of the sequence defined by SEQ ID NO: 5 as indicated in the following Table:
[0070] Furthermore, the present application discloses a polypeptide having cellobiohydrolase activity comprising an amino acid sequence having at least 85%, preferably at least 95%, more preferably at least 98%, even more preferably at least 99%, and most preferably 100% sequence identity to SEQ ID NO: 12 wherein one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 12 are modified by substitution or deletion: Q1, T15, Q28, W40, C72, V133, V155, Y158, T162, Y247, N307, G308, E317, S341, D345, Y370, T389, Q406, N441, R442, T452, S456, P458, G459, H464 and/or V478.
[0071] In a preferred embodiment, the polypeptide having cellobiohydrolase activity comprising an amino acid sequence having at least 85 % sequence identity to SEQ ID NO: 12 comprises one or more of the following modified amino acid residues of the sequence defined by SEQ ID NO: 12:
[0072] Another aspect of the disclosure relates to the application of the isolated polypeptides and variants thereof of the present invention for the complete or partial hydrolysis of cellulosic material. The cellulosic material can be of natural, processed or artificial nature. "Cellulosic material" herein shall be defined as all sorts of pure, non-pure, mixed, blended or otherwise composed material containing at least a fraction of β-1,4-linked D-glucosyl polymers of at least 7 consecutive subunits. Prominent examples of cellulosic materials are all sorts of cellulose-containing plant materials like wood (soft and hard), straw, grains, elephant grass, hay, leaves, cotton and materials processed therefrom or waste streams derived from such processes. Cellulosic material used in an enzymatic reaction is herein also referred to as cellulosic substrate.
[0073] The hydrolysis of the cellulose material can be a sequential process following cellobiohydrolase production or contemporary to the production in the yeast cell (consolidated bioprocess). The expression of cellulolytic enzymes in yeast is of special interest due to the ability of many yeasts to ferment the released sugars (C6 or C5) to ethanol or other metabolites of interest.
[0074] A further aspect of the application thus relates to the application of whole cells expressing the polypeptide or variant thereof according to the present invention for the processing of cellulosic materials.
[0075] In a particular aspect, the present application discloses the use of a polypeptide and variants thereof or the composition of the present invention for the enzymatic degradation of cellulosic material, preferably lignocellulosic biomass, and/or for textile processing and/or as an ingredient in detergents and/or as an ingredient in food or feed compositions.
Examples
Example 1: Preparation of Pichia pastoris expression plasmid
[0076] Expression plasmids for the constitutive expression of protein from transformed Pichia pastoris hosts are prepared by assembly of an expression cassette consisting of a Pichia pastoris glyceraldehyde phosphate dehydrogenase (GAP) promoter, a Saccharomyces cerevisiae SPa (mating factor alpha signal peptide), a multiple cloning site (MCS) and the 3’-GAP-terminator sequence. For selection purposes, a kanamycin resistance gene is used under control of the EM7 or TEF promoter for bacterial or yeast selection, respectively. The resulting plasmid vectors are designated pV1 (Figure 1) and pV2 (alternative MCS). Transformation and expression cultivation are done essentially as described by Waterham, H. R., Digan, M. E., Koutz, P. J., Lair, S. V., Cregg, J. M. (1997). Isolation of the Pichia pastoris glyceraldehyde-3-phosphate dehydrogenase gene and regulation and use of its promoter. Gene, 186, 37-44 and Cregg, J.M.: Pichia Protocols in Methods in Molecular Biology, Second Edition, Humana Press, Totowa New Jersey 2007.
Comparative example 1: Construction of Pichia pastoris expression constructs for CBHI sequences
[0077] CBHI genes of Trichoderma viride (CBH-f), Humicola grisea (CBH-d), Thermoascus aurantiacus (CBH-e), Talaromyces emersonii (CBH-b), and fusions of the cellulose binding domain of Trichoderma reesei CBHI with the Talaromyces emersonii CBHI (CBH-a) or the Humicola grisea CBHI (CBH-g) are amplified using the oligonucleotide pairs and templates (obtained by gene synthesis) as given in Table 1. The fusion gene encoding SEQ ID NO: 2 is generated by overlap extension PCR using the PCR fragments generated from SEQ ID NOs: 5 and 11. Phusion DNA polymerase (Finnzymes) is used for the amplification PCR.
Table 1 : Primers and templates for the amplification of CBH-a, CBH-b, CBH-d, CBH-e, CBH-f and CBH-g
[0078] PCR fragments of the expected length are purified from agarose gels after electrophoresis using the Promega® SV PCR and Gel Purification kit. Concentrations of DNA fragments are measured on a spectrophotometer, and 0.2 pmol of fragments are treated with 9 U of T4 DNA polymerase in the presence of 2.5 mM dATP for 37.5 min at 22.5°C. The treated fragments are annealed with T4 DNA polymerase/dTTP-treated, SmaI-linearized pV1 plasmid DNA and afterwards transformed into chemically competent Escherichia coli Top10 cells. Deviating from the described procedure, the product generated by the primer pair according to lane 11 of the table, encoding the Humicola grisea fusion protein, is cloned via the introduced SphI and SalI sites into pV2. Transformants are verified by sequencing of isolated plasmid DNA.
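Because the fragment amount above is given in pmol while spectrophotometric readings are usually converted to mass, the following sketch shows the conversion. It assumes an average molar mass of about 650 g/mol per base pair of double-stranded DNA, a common textbook approximation and not a value from this document, and it uses the 1509 bp length of SEQ ID NO: 1 only as an illustrative fragment length (a real PCR product also carries primer overhangs).

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA; approximation (assumption)

def pmol_to_ng(pmol, length_bp, bp_mass=AVG_BP_MASS):
    """Mass in ng of `pmol` picomoles of a double-stranded fragment of `length_bp` bp."""
    # pmol * g/mol gives picograms; divide by 1000 to obtain nanograms.
    return pmol * length_bp * bp_mass / 1000.0

def ng_to_pmol(ng, length_bp, bp_mass=AVG_BP_MASS):
    """Inverse conversion: picomoles contained in `ng` nanograms of the fragment."""
    return ng * 1000.0 / (length_bp * bp_mass)
```

Under these assumptions, 0.2 pmol of a 1509 bp fragment corresponds to roughly 196 ng.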
Comparative example 2: Expression of CBHI Genes in Pichia pastoris
[0079] Plasmids of Example 2 are transformed into electro-competent Pichia pastoris CBS 7435 cells and transformants are used to inoculate cultures in YPD medium containing 200mg/l, which are incubated for 5 days at 27°C in a rotary shaker at 250 rpm. Culture supernatants were separated by centrifugation at 5000xg for 30 minutes in a Sorvall Avant centrifuge. Supernatants were concentrated on spin columns with a cut-off size of 10 kDa. The protein pattern of such concentrated supernatants was analyzed by SDS-PAGE (Laemmli et al.) and gels were stained with colloidal Coomassie blue stain. Enzymatic activity was determined by incubation of the supernatant with 2 mM solutions of p-nitrophenyl-β-D-lactoside or 200 μM solutions of 4-methylumbelliferyl-β-D-lactoside in 50 mM sodium acetate buffer (pH 5) for 1 hour. The reaction was stopped by addition of equal volumes of 1 M sodium carbonate solution, and the released p-nitrophenol or 4-methylumbelliferone was determined by measuring the absorbance at 405 nm or the fluorescence at 360 nm/450 nm (excitation/emission).
Comparative example 3: Genome integration and expression of the Talaromyces emersonii CBHI / T. reesei CBHI-CBD fusion sequence in Pichia pastoris
[0080] The DNA fragment of the fusion gene is generated by 2-step overlap extension PCR using the oligonucleotide pairs and synthetic templates as indicated in the table (of Example 2). The T4 DNA polymerase-treated full-length fragment was annealed with the linear pV3 vector fragment by slowly reducing the temperature from 75°C to 4°C. The pV3 plasmid contains a fusion of the mating factor alpha signal peptide to a multiple cloning site, situated downstream of a Pichia pastoris AOXI promoter. Transformation of the annealed solution into chemically competent E. coli cells yields transformants, which are selected by their Zeocin resistance and checked for the expected construct by restriction analysis and sequencing. pV3-CBH-a plasmid preparations are linearized with SacI, and approximately 1 μg of linear DNA fragments is transformed into electrocompetent Pichia pastoris cells. 94 transformants from YPD-Zeocin plates are afterwards checked for expression by cultivation in 500 μl cultures in 96-deepwell plates in BMMY medium containing 1% methanol; 0.5% methanol was fed every 24 h for 5 days (350 rpm/27°C; humidified orbital shaker with 2.5 cm amplitude). Supernatants are tested for activity on 4-MUL, and the clones with the highest expression levels are selected and evaluated again under the same conditions.
[0081] For fermentation in an Infors Multifors bioreactor, the strain producing the highest enzyme concentration is selected. A YPD-Zeocin (100 g/l) pre-culture is used to inoculate mineral medium consisting of phosphate buffer, magnesium sulphate and chloride, trace elements/biotin and glycerol, with pH adjustment using ammonia and phosphoric acid. After metabolism of the batch glycerol (2%), an additional glycerol feed is maintained for 1 day before the feed is changed to methanol to shift to inducing conditions for the AOXI promoter. The fermentation is kept under these conditions for 5 days. Cells are separated from the fermentation liquid by centrifugation at 5000xg for 30 minutes. Supernatants are analyzed for total protein using Bradford reagent and BSA standards (Biorad). SDS-PAGE with Coomassie Brilliant Blue staining is used to analyze the protein pattern.
Example 2: Trichoderma reesei expression vector construct
[0082] SbfI/SwaI-digested pSCMB100 plasmid DNA was transformed into Trichoderma reesei SCF41 essentially as described by Penttilä et al. 1997. 10 μg of linear DNA was used for the transformation of 10⁷ protoplasts. Selection of transformants was done by growth of the protoplasts on Mandels-Andreotti media plates with overlay agar containing hygromycin as selective agent (100 mg/l). Transformants were further purified by passage over sporulation media plates and re-selection of spores on hygromycin media. Genomic DNA was isolated from re-grown mycelia and the replacement event verified by PCR. Transformants verified as true replacement strains were further tested for secretion of recombinant protein.
Example 3: Expression of Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion (CBH-ah) from Trichoderma reesei
[0083] Expression of recombinant CBHI replacement strains of the Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion with 6x His-Tag in Trichoderma reesei QM6a (ATCC 13631) was done in shake-flask cultures containing 40 ml mineral medium with 2% Avicel in 300 ml flasks, with cultivation at 30°C/250 rpm for 6 days. Supernatants were recovered by centrifugation and further analyzed by SDS-PAGE and Bradford protein assays.
Example 4: Screening thermostability variants
[0084] Random mutagenesis libraries of the Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion (with 6x His-Tag) gene were generated using error-prone PCR, applying manganese-containing buffers and imbalanced dNTP concentrations in the Taq DNA polymerase reaction mixture used for PCR amplification, essentially as described by Cadwell and Joyce (R. Craig Cadwell and G. F. Joyce, 1995. Mutagenic PCR, in PCR Primer: a laboratory manual, ed. C. W. Dieffenbach and G. S. Dveksler, Cold Spring Harbor Press, Cold Spring Harbor, NY, 583-589). The wild type fusion gene (SEQ ID NO: 17) or mutants thereof were chosen as template. Mutated PCR fragments were cloned into the pPKGMe plasmid using SphI and HindIII endonucleases and T4 DNA ligase.
[0085] Libraries of the Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion (with 6x His-Tag) gene variants were distributed in 1536-well plates with a well occupation number close to 1. Enzyme was expressed over 7 days in a volume of 4 μl YPG-G418 medium. For evaluation of the properties of the variants, 2 μl samples of culture supernatants were transferred to plates containing a suspension of milled straw, acetate buffer and beta-glucosidase. After incubation of the sealed reaction plates for 48 hours at defined temperatures, the glucose concentration was determined using Amplex Red in the presence of GOX and HRP by analyzing the fluorescence level. The best-performing hits were re-cultivated and re-evaluated. Plasmids of confirmed CBH-ah variants were recovered (Pierce DNAzol Yeast genomic DNA Kit) and sequenced using oligonucleotides alpha-f (5’-TACTATTGCCAGCATTGCTGC-3’) and oli740 (5’-TCAGCTATTTCACATACAAATCG-3’).
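A well occupation number close to 1 has a practical consequence that a short calculation makes visible. The sketch below assumes, as a simplification not stated in the application, that the number of cells landing in a well follows a Poisson distribution with the given mean. The function names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well for a Poisson-distributed count."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def expected_wells(mean_occupancy=1.0, wells=1536):
    """Expected number of empty, singly occupied and multiply occupied wells."""
    empty = poisson_pmf(0, mean_occupancy)
    single = poisson_pmf(1, mean_occupancy)
    multiple = 1.0 - empty - single  # everything with two or more cells
    return {
        "empty": wells * empty,
        "single": wells * single,
        "multiple": wells * multiple,
    }
```

Under this assumption, a mean occupation of 1 leaves about 37% of the wells empty, about 37% with a single clone and about 26% with more than one clone, which is one reason why re-cultivation and re-evaluation of hits is useful.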
Example 5: Determination of Substrate Conversion Capacity at different temperatures for indication of the thermostability of CBH-ah-Variants using 4-methylumbelliferyl-β-D-lactoside (4-MUL)
[0086] For precise comparison of the thermal stability, culture supernatants containing the secreted cellobiohydrolase variants were diluted tenfold in sodium acetate buffer (50 mM, pH 5), and 10 μl samples were incubated with 90 μl of 200 μM 4-MUL (in buffer) in the temperature gradient of an Eppendorf Gradient Thermocycler. A temperature gradient of 20°C, ranging from 55°C to 75°C, was applied to 12 reaction mixtures for each sample for one hour. The temperature profile was recorded after addition of 100 μl of 1 M sodium carbonate solution to each reaction by measuring the fluorescence intensity at 360 nm/454 nm in a Tecan Infinite M200 plate reader. For comparison of the thermostability, the values were normalized between 1 and 0 for the maximum and minimum fluorescence count (Figure 7).
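The normalization described above can be sketched as follows. The sketch makes two assumptions that are not taken from this document: the 12 gradient temperatures are treated as evenly spaced between 55°C and 75°C (a real gradient block may deviate), and the half-maximal temperature is read off by linear interpolation on the falling side of the profile. Whether the IT50 values of Table 2 are defined in exactly this way is not stated here, so the last function is only an illustration.

```python
def gradient_temperatures(t_min=55.0, t_max=75.0, n=12):
    """Evenly spaced temperatures across the gradient (assumed spacing)."""
    step = (t_max - t_min) / (n - 1)
    return [t_min + i * step for i in range(n)]

def normalize(values):
    """Scale fluorescence counts so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to normalize")
    return [(v - lo) / (hi - lo) for v in values]

def half_max_temperature(temps, norm):
    """Temperature above the optimum at which the normalized signal crosses 0.5.

    Uses linear interpolation between neighbouring points; returns None if the
    profile never falls to 0.5 within the measured range.
    """
    peak = norm.index(max(norm))
    for i in range(peak, len(norm) - 1):
        a, b = norm[i], norm[i + 1]
        if a >= 0.5 >= b and a != b:
            return temps[i] + (a - 0.5) * (temps[i + 1] - temps[i]) / (a - b)
    return None
```

For example, counts of 100, 200, 150, 50 and 0 measured at 55, 60, 65, 70 and 75°C normalize to 0.5, 1.0, 0.75, 0.25 and 0, and the interpolated half-maximal temperature is 67.5°C.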
Table 2: Listing of Mutants of SEQ ID NO: 18 with improved IT50 values.
Example 6: Characterization of Variants of the Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion (with 6x His-Tag)
[0087] 80 ml of fermentation broth were concentrated to a final volume of approx. 1 ml. After determination of the protein concentration (Bradford reagent, Biorad, Germany; the standard is BSA from Sigma-Aldrich, Germany), 1.2 mg of protein were purified with the Ni-NTA Spin kit (Qiagen, Germany). The purified CBH1 fraction was subsequently assayed by performing a hydrolysis reaction on pretreated (acid pretreatment) wheat straw. 12.5 mg (dry mass) of pretreated wheat straw is mixed with 0.0125 mg of purified CBH1 and 40 CBU Novo188 (Novozymes, Denmark) per mg of CBH1. 50 mM sodium acetate (Sigma-Aldrich, Germany) is added up to 500 μl. The assay is kept at temperatures ranging from 50°C to 65°C for 48 hours and analysed by HPLC to determine the temperature-dependent glucose content.
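The quantities in the hydrolysis assay translate into a few derived figures that are easy to check. The sketch below only rearranges the numbers given in the paragraph; the function name and the keys of the returned dictionary are hypothetical.

```python
def assay_setup(straw_mg=12.5, cbh_mg=0.0125, cbu_per_mg_cbh=40.0, volume_ul=500.0):
    """Derive loading figures from the assay quantities given in the text."""
    return {
        # enzyme mass relative to substrate dry mass, in percent (w/w)
        "enzyme_loading_pct": 100.0 * cbh_mg / straw_mg,
        # total beta-glucosidase activity added, in CBU
        "beta_glucosidase_cbu": cbu_per_mg_cbh * cbh_mg,
        # substrate dry mass per assay volume, in percent (w/v); mg/ul equals g/ml
        "solids_pct_w_v": 100.0 * straw_mg / volume_ul,
    }
```

With the default values, the enzyme loading is 0.1% (w/w) of the straw dry mass, 0.5 CBU of beta-glucosidase is added, and the solids content is 2.5% (w/v).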
SEQUENCE LISTING
[0088] <110> Süd-Chemie AG <120> optimized Cellulase Enzymes <130> 139 202 <160> 41 <170 Patentin version 3.4 <210 1 <211 > 1509 <212> DNA <213> Artificial <220> <223> Coding Sequence for Talaromyces emersonii CBHI /Trichoderma reesei -CBD fusion (mature CBH-a) <400> 1 cagcaggccg gcacggcgac ggcagagaac cäcccgcccc tgacatggca ggaatgcacc 60 gcccctggga gctgcaccac ccagaacggg gcggtcgttc ttgatgcgaa ctggcgttgg 120 gtgcacgatg tgaacggata caccaactgc tacacgggca atacctggga ccccacgtac 180 tgccctgacg acgaaacctg cgcccagaac tgtgcgctgg acggcgcgga ttacgagggc 240 acctacggcg tgacttcgtc gggcagctcc ttgáaactca atttcgtcac cgggtcgaac 300 gtcggatccc gtctctacct gctgcaggac gactcgacct atcagatctt caagctcctg 360 aaçcgcgagt tcagctttga cgtcgatgtc tccaatcttc cgtgcggatt gaacggcgct 420 ctgtactttg tcgccatgga cgccgacggc ggcgtgtcca agtacccgaa caacaaggct 480 ggtgccaagt acggaaccgg gtattgcgac tcccaatgcc cacgggacct caagttcatc 540 gacggcgagg ccaacgtcga gggctggcag ccgtcttcga acaacgccaa caccggaatt 600 ggcgaccacg gctcctgctg tgcggagatg gatgtctggg aagcaaacag catctccaat 660 gcggtcactc cgcacccgtg cgacacgcca ggccagacga tgtggtctgg agatgactgc 720 ggtggcacat actctaacga tcgctacgcg ggaacctgcg atcctgacgg ctgtgacttc 780 aacccttacc gcatgggcaa cacttctttc tacgggcctg geaagatcat cgataccacc 840 aagçccttca ctgtcgtgac gcagttcctc actgatgatg gtacggatac tggaactctc 900 agcgagatca agcgcttcta catccagaac agcaacgtca ttccgcagcc caactcggac 960 atcagtggcg tgaccggcaa ctcgatcacg acggagttct gcactgctca gaagcaggcc 1020 tttggcgaca cggacgactt ctctcagcac ggtggcctgg ccaagatggg agcggccatg 1080 cagcagggta tggtcctggt gatgagtttg tgggacgact acgccgcgca gatgctgtgg 1140 ttggattccg actacccgac ggatgcggac cccacgaccc ctggtattgc ccgtggaacg 1200 tgtccgacgg actcgggcgt cccatcggat gtcgagtcgc agagccccaa ctcctacgtg 1260 acctactcga acattaagtt tggtccgatc ggtagcacag gtaatccttc aggtggtaat 1320 cctccaggtg gaaacagagg aacaacgaca actagaagac cagctactäc aactggttca 1380 agtccaggtc caactcaatc acactacggt caatgtggtg gtataggtta ctctggtccc 1440 actgtttgtg cttctggtac tacttgccaa 
gttctgaacc cttactactc acagtgtcta 1500
taatgataa 1509
<210> 2 <211> 500 <212> PRT <213> Artificial <220> <223> Mature sequence of Talaromyces emersonii CBHI / Trichoderma reesei-CBD (mature CBH-a) <400> 2
Gln Gln Ala Gly Thr Ala Thr Ala Glu Asn His Pro Pro Leu Thr Trp 1 5 10 15
Gln Glu Cys Thr Ala Pro Gly Ser Cys Thr Thr Gln Asn Gly Ala Val 20 25 30
Val Leu Asp Ala Asn Trp Arg Trp Val His Asp Val Asn Gly Tyr Thr 35 40 45
Asn Cys Tyr Thr Gly Asn Thr Trp Asp Pro Thr Tyr Cys Pro Asp Asp 50 55 60
Glu Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Glu Gly 65 70 75 80
Thr Tyr Gly Val Thr Ser Ser Gly Ser Ser Leu Lys Leu Asn Phe Val 85 90 95
Thr Gly Ser Asn Val Gly Ser Arg Leu Tyr Leu Leu Gln Asp Asp Ser 100 105 110
Thr Tyr Gln Ile Phe Lys Leu Leu Asn Arg Glu Phe Ser Phe Asp Val 115 120 125
Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val 130 135 140
Ala Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Asn Asn Lys Ala 145 150 155 160
Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp 165 170 175
Leu Lys Phe Ile Asp Gly Glu Ala Asn Val Glu Gly Trp Gln Pro Ser 180 185 190
Ser Asn Asn Ala Asn Thr Gly Ile Gly Asp His Gly Ser Cys Cys Ala 195 200 205
Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Asn Ala Val Thr Pro 210 215 220
His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Ser Gly Asp Asp Cys 225 230 235 240
Gly Gly Thr Tyr Ser Asn Asp Arg Tyr Ala Gly Thr Cys Asp Pro Asp 245 250 255
Gly Cys Asp Phe Asn Pro Tyr Arg Met Gly Asn Thr Ser Phe Tyr Gly 260 265 270
Pro Gly Lys Ile Ile Asp Thr Thr Lys Pro Phe Thr Val Val Thr Gln 275 280 285
Phe Leu Thr Asp Asp Gly Thr Asp Thr Gly Thr Leu Ser Glu Ile Lys 290 295 300
Arg Phe Tyr Ile Gln Asn Ser Asn Val Ile Pro Gln Pro Asn Ser Asp 305 310 315 320
Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu Phe Cys Thr Ala 325 330 335
Gln Lys Gln Ala Phe Gly Asp Thr Asp Asp Phe Ser Gln His Gly Gly 340 345 350
Leu Ala Lys Met Gly Ala Ala Met Gln Gln Gly Met Val Leu Val Met 355 360 365
Ser Leu Trp Asp Asp Tyr Ala Ala Gln Met Leu Trp Leu Asp Ser Asp 370 375 380
Tyr Pro Thr Asp Ala Asp Pro Thr Thr Pro Gly Ile Ala Arg Gly Thr 385 390 395 400
Cys Pro Thr Asp Ser Gly Val Pro Ser Asp Val Glu Ser Gln Ser Pro 405 410 415
Asn Ser Tyr Val Thr Tyr Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser 420 425 430
Thr Gly Asn Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Arg Gly Thr 435 440 445
Thr Thr Thr Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro 450 455 460
Thr Gln Ser His Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro 465 470 475 480
Thr val cys Ala ser Gly Thr Thr cys Gin val Leu Asn Pro Tyr Tyr 485 490 495 ser. Gin cys Leu 500 <210>3 <211> 1581 <212> DNA <213> Artificial <220> <223> Coding sequence of the fusion of CBH-a with Trichoderma reesei CBH Signal peptide <400 3 atgtatcgga agttggccgt catctcggcc ttcttggcca cagctcgtgc tcagcaggcc 60 ggcacggcga cggcagagaa ccacccgccc ctgacatggc aggaatgcac cgcccctggg 120 agctgcacca cccagaacgg ggcggtcgtt cttgatgcga aetggcgttg ggtgcacgat 180 gtgaacggat acaccaactg ctacacgggc aatacctggg accccacgta ctgccctgac 240 gâcgaaacct gcgcccagaa ctgtgcgctg gacggcgcgg attacgaggg cacctacggc 300 gtgacttcgt cgggcagctc cttgaaactc aatttcgtca ccgggtcgaa cgtcggatcc 360 cgtctctacc tgctgcagga cgactcgacc tatcagatct tcaagctcct gaaccgcgag 420 ttcagctttg acgtcgatgt ctccaatctt ccgtgcggat tgaacggcgc tctgtacttt 480 gtcgccatgg acgccgacgg cggcgtgtcc aagtacccga acaacaaggc tggtgccaag 540 tacggaaccg ggtattgcga ctcccaatgc ccacgggacc tcaagttcat cgacggcgag 600 gccaacgtcg agggctggca gccgtcttcg aacaacgcca acaccggaat tggcgaccac 660 ggctcctgct gtgcggagat ggatgtctgg gaagcaaaca gcatctccaa tgcggtcact 720 ccgcacccgt gcgacacgcc.aggccagacg atgtgctctg gagatgactg cggtggcaca 780 tactctaacg atcgctacgc gggaacctgc gatcctgacg gctgtgactt caacccttac 840 cgcatgggca acacttcttt ctacgggcct ggcaagatca tcgataccac caagcccttc 900
actgtcgtga cgcagttcct cactgatgat ggtacggata ctggaactct cagcgagatc 96Q aagcgcttct aeatccagaa cagcaacgtc attccgcagc ccaactcgga catcagtggc 1020 gtgaccggca actcgatcac gacggagttc tgcactgctc agâagcaggc ctttggcgac 1080 acggacgact tctctcagca cggtggcctg gccaagatgg gagcggccat gcagcagggt 1140 atggtcctgg tgatgagttt gtgggacgac tacgccgcgc agatgctgtg gttggattcc 1200 gactacccga cggatgcgga ccccacgacc cctggtattg cccgtggaac gtgtccgacg 1260 gactcgggcg tcccatcgga tgtcgagtcg cagagcccca actcctacgt gacctactcg 1320 aacattaagt ttggtccgat cggtagcaca ggtaatcctt caggtggtaa tcctccaggt 1380 ggaaacagag gaacaacgac aactagaaga ccagctacta caactggttc aagtccaggt 1440 ccaactcaat cacactacgg tcaatgtggt ggtataggtt actctggtcc cactgtttgt 1500 gcttctggta ctacttgcca agttctgaac ccttactact cacagtgtct agcttctgca 1560 cátcatcacc accaccatta a 1581 <210>4 <211> 70 <212> PRT <213> Artificial <220> <223> Trichoderma reesei CBHI cellulose binding domain and linker sequence <400> 4
Gly Ser Thr Gly Asn Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Arg 1 5 10 15
Gly Thr Thr Thr Thr Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro 20 25 30
Gly Pro Thr Gln Ser His Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser 35 40 45
Gly Pro Thr Val Cys Ala Ser Gly Thr Thr Cys Gln Val Leu Asn Pro 50 55 60
Tyr Tyr Ser Gln Cys Leu 65 70
<210> 5 <211> 437 <212> PRT <213> Artificial <220> <223> Talaromyces emersonii CBHI sequence (CBH-b) <400> 5
Gln Gln Ala Gly Thr Ala Thr Ala Glu Asn His Pro Pro Leu Thr Trp 1 5 10 15
Gln Glu Cys Thr Ala Pro Gly Ser Cys Thr Thr Gln Asn Gly Ala Val 20 25 30
Val Leu Asp Ala Asn Trp Arg Trp Val His Asp Val Asn Gly Tyr Thr 35 40 45
Asn Cys Tyr Thr Gly Asn Thr Trp Asp Pro Thr Tyr Cys Pro Asp Asp 50 55 60
Glu Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Glu Gly 65 70 75 80
Thr Tyr Gly Val Thr Ser Ser Gly Ser Ser Leu Lys Leu Asn Phe Val 85 90 95
Thr Gly Ser Asn Val Gly Ser Arg Leu Tyr Leu Leu Gln Asp Asp Ser 100 105 110
Thr Tyr Gln Ile Phe Lys Leu Leu Asn Arg Glu Phe Ser Phe Asp Val 115 120 125
Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val 130 135 140
Ala Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Asn Asn Lys Ala 145 150 155 160
Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp 165 170 175
Leu Lys Phe Ile Asp Gly Glu Ala Asn Val Glu Gly Trp Gln Pro Ser 180 185 190
Ser Asn Asn Ala Asn Thr Gly Ile Gly Asp His Gly Ser Cys Cys Ala 195 200 205
Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Asn Ala Val Thr Pro 210 215 220
His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Ser Gly Asp Asp Cys 225 230 235 240
Gly Gly Thr Tyr Ser Asn Asp Arg Tyr Ala Gly Thr Cys Asp Pro Asp 245 250 255
Gly Cys Asp Phe Asn Pro Tyr Arg Met Gly Asn Thr Ser Phe Tyr Gly 260 265 270
Pro Gly Lys Ile Ile Asp Thr Thr Lys Pro Phe Thr Val Val Thr Gln 275 280 285
Phe Leu Thr Asp Asp Gly Thr Asp Thr Gly Thr Leu Ser Glu Ile Lys 290 295 300
Arg Phe Tyr Ile Gln Asn Ser Asn Val Ile Pro Gln Pro Asn Ser Asp 305 310 315 320
Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu Phe Cys Thr Ala 325 330 335
Gln Lys Gln Ala Phe Gly Asp Thr Asp Asp Phe Ser Gln His Gly Gly 340 345 350
Leu Ala Lys Met Gly Ala Ala Met Gln Gln Gly Met Val Leu Val Met 355 360 365
Ser Leu Trp Asp Asp Tyr Ala Ala Gln Met Leu Trp Leu Asp Ser Asp 370 375 380
Tyr Pro Thr Asp Ala Asp Pro Thr Thr Pro Gly Ile Ala Arg Gly Thr 385 390 395 400
Cys Pro Thr Asp Ser Gly Val Pro Ser Asp Val Glu Ser Gln Ser Pro 405 410 415
Asn Ser Tyr Val Thr Tyr Ser Asn Ile Lys Phe Gly Pro Ile Asn Ser 420 425 430
Thr Phe Thr Ala Ser 435 <210 6 <211 > 1590 <212> DNA <213> Artificial <220> <223> Coding sequence of Talaromyces emersonii CBHI fused to the alpha factor signal peptide <400> 6 atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60 ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120 tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180 aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240 tctttggata aacgtgaggc ggaagcaccc tctcagcagg ccggcacggc gacggcagag 300 aaccacccgc ccctgacatg gcaggaatgc accgcccctg ggagctgcac cacccagaac 360 ggggcggtcg ttcttgatgc gaactggcgt tgggtgcacg atgtgaacgg atacaccaac 420 tgctacacgg gcaatacctg ggaccccacg tactgccctg acgacgaaac ctgcgcccag 480 aactgtgcgc tggacggcgc ggattacgag ggcacctacg gcgtgacttc gtcgggcagc 540 tccttgàaac tcaatttcgt caccgggtcg aacgtcggat cccgtctcta cctgctgcag 600 gacgactcga cctatcagat cttcaagctt ctgaaccgcg agttcagctt tgacgtcgat 660 gtctccaatc ttccgtgcgg attgaacggc gctctgtact ttgtcgccat ggacgccgac 720 ggcggcgtgt ccaagtaccc gaacaacaag gctggtgcca agtacggaac cgggtattgc 780 gactcccaat gcccacggga cctcaagttc atcgacggcg aggccaacgt cgagggctgg 840 cagccgtctt cgaacaacgc caacaccgga attggcgacc acggctcctg ctgtgcggag 900 atggatgtct gggaagcaaa cagcatctcc aatgçggtca ctccgcaccc gtgcgacacg 960 ccaggccaga cgatgtgctc tggagatgac tgcggtggca catactctaa cgatcgctac 1020 gcgggaacct gcgatcctga cggctgtgac ttcaaccctt accgcatggg caacacttct 1080 ttctacgggc ctggcaagat catcgatacc accaagccct tcactgtcgt gacgcagttc 1140 ctcactgatg atggtacgga tactggaact ctcagcgaga tcaagcgctt ctacatccag 1200 aacagcaacg tcattccgca gcccaactcg gacatcagtg gcgtgaccgg caactcgatc 1260 acgacggagt tctgcactgc tcagaagcag gcctttggcg acacggacga cttctctcag 1320 cäcggtggcc tggccaagat gggagcggcc atgcagcagg gtatggtcct ggtgatgagt 1380 ttgtgggacg actacgccgc gcagatgctg tggttggatt ccgactaccc gacggatgcg 1440 gaccccacga cccctggtat tgcccgtgga acgtgtccga cggactcggg cgtcccatcg 1500 gàtgtcgagt cgcagagccc caactcctac gtgacctact cgaacattaa 
gtttggtccg 1560
atcaactcga ccttcaccgc ttcgtgataa 1590
<210> 7 <211> 429 <212> PRT <213> Artificial <220> <223> Humicola grisea CBHI (CBH-d) <400> 7
Gln Gln Ala Gly Thr Ile Thr Ala Glu Asn His Pro Arg Met Thr Trp 1 5 10 15
Lys Arg Cys Ser Gly Pro Gly Asn Cys Gln Thr Val Gln Gly Glu Val 20 25 30
Val Ile Asp Ala Asn Trp Arg Trp Leu His Asn Asn Gly Gln Asn Cys 35 40 45
Tyr Glu Gly Asn Lys Trp Thr Ser Gln Cys Ser Ser Ala Thr Asp Cys 50 55 60
Ala Gln Arg Cys Ala Leu Asp Gly Ala Asn Tyr Gln Ser Thr Tyr Gly 65 70 75 80
Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu Lys Phe Val Thr Lys His 85 90 95
Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe Tyr Leu Met Ala Asn Gln 100 105 110
Asn Lys Tyr Gln Met Phe Thr Leu Met Asn Asn Glu Phe Ala Phe Asp 115 120 125
Val Asp Leu Ser Lys Val Glu Cys Gly Ile Asn Ser Ala Leu Tyr Phe 130 135 140
Val Ala Met Glu Glu Asp Gly Gly Met Ala Ser Tyr Pro Ser Asn Arg 145 150 155 160
Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys Ala Arg 165 170 175
Asp Leu Lys Phe Ile Gly Gly Lys Ala Asn Ile Glu Gly Trp Arg Pro 180 185 190
Ser Thr Asn Asp Pro Asn Ala Gly Val Gly Pro Met Gly Ala Cys Cys 195 200 205
Ala Glu Ile Asp Val Trp Glu Ser Asn Ala Tyr Ala Tyr Ala Phe Thr 210 215 220
Pro His Ala Cys Gly Ser Lys Asn Arg Tyr His Ile Cys Glu Thr Asn 225 230 235 240
Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe Ala Gly Tyr Cys Asp 245 250 255
Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly Asn Lys Asp Phe 260 265 270
Tyr Gly Lys Gly Lys Thr Val Asp Thr Asn Arg Lys Phe Thr Val Val 275 280 285
Ser Arg Phe Glu Arg Asn Arg Leu Ser Gln Phe Phe Val Gln Asp Gly 290 295 300
Arg Lys Ile Glu Val Pro Pro Pro Thr Trp Pro Gly Leu Pro Asn Ser 305 310 315 320
Ala Asp Ile Thr Pro Glu Leu Cys Asp Ala Gln Phe Arg Val Phe Asp 325 330 335
Asp Arg Asn Arg Phe Ala Glu Thr Gly Gly Phe Asp Ala Leu Asn Glu 340 345 350
Ala Leu Thr Ile Pro Met Val Leu Val Met Ser Ile Trp Asp Asp His 355 360 365
His Ser Asn Met Leu Trp Leu Asp Ser Ser Tyr Pro Pro Glu Lys Ala 370 375 380
Gly Leu Pro Gly Gly Asp Arg Gly Pro Cys Pro Thr Thr Ser Gly Val 385 390 395 400
Pro Ala Glu Val Glu Ala Gln Tyr Pro Asp Ala Gln Val Val Trp Ser 405 410 415
Asn Ile Arg Phe Gly Pro lie Gly Ser Thr val Asn val 420 425 <210 8 <211 > 1563 <212> DNA <213> Artificial <220 <223> Coding sequence of Humicola grisea CBHI fused to the alpha factor signal peptide <400 8 atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60 ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120 tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180 aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240 tctttggata aäcgtgaggc ggaagcaccc tctcagcagg ctggtactat tactgctgag 300 aaccacccaa gaatgacctg gaagagatgc tctggtccag gaaactgtca gactgttcag 360 ggcgaggttg tgattgacgc taattggaga tggttgcaca acaacggcca gaactgttac 420 gagggtaaca agtggacctc tcagtgttct tctgctaccg actgtgctca gagatgtgct 480 ttggacggtg ccaactacca gtctacctac ggtgcttcta cctctggtga ctctctgacc 540 ctgaagttcg ttaccaagca cgagtacgga accaacateg gctctagatt ctacctgatg 600 gccaaccaga acaagtacca gatgttcacc ctgatgaaca acgagttcgc ctttgacgtt 660 gacctgtcta aggtggagtg cggtatcaac tctgccctgt acttcgttgc tatggaagag 720 gacggtggaa tggcttctta cccatctaac agagccggtg ctaagtácgg tactggttac 780 tgtgacgccc agtgtgctag agacctgaag ttcatcggtg gaaaggccaa cattgagggt 840 tggagaccat ctaccaacga cccaaacgct ggtgttggtc caatgggagc ttgttgtgcc 900 gagattgatg tgtgggagtc taacgcttac gcctacgctt ttaccccaca cgcttgcggt 960 tctaagaaca gataccacat ctgcgagacc aacaactgtg gtggaaccta ctctgacgac 1020 agattcgctg gàtactgcga cgctaacggt tgtgactaca acccatacag aatgggcaac 1080 aaggacttct acggcaaggg aaagaccgtt gacaccaaca gaaagttcac cgtggtgtcg 1140 agattcgaga gaaacagact gtcgcagttc tttgtgcagg acggcagaaa gattgaggtc 1200 ccaccaccaa cttggccagg attgccaaac tctgccgaca ttaccccaga gttgtgtgac 1260 gctcägttca gagtgttcga cgacagaaac agatttgctg agaccggtgg ttttgacgct 1320 ttgaacgagg ctctgaccat tccaatggtg ctggtgatgt ctatttggga cgaccaccac 1380 tctaâcatgt tgtggctgga ctcttcttac ccaccagaga aggctggatt gccaggtggt 1440 gacagaggac catgtccaac tacttcgggt gttccagctg aggttgaggc tcagtaccca 1500 gacgctcagg ttgtgtggtc gaacatcaga 
ttcggcccaa tcggttctac cgtgaacgtg 1560
taa 1563
<210> 9 <211> 439 <212> PRT <213> Artificial <220> <223> Thermoascus aurantiacus CBHI (CBH-e) <400> 9
His Glu Ala Gly Thr Val Thr Ala Glu Asn His Pro Ser Leu Thr Trp 1 5 10 15
Gln Gln Cys Ser Ser Gly Gly Ser Cys Thr Thr Gln Asn Gly Lys Val 20 25 30
Val Ile Asp Ala Asn Trp Arg Trp Val His Thr Thr Ser Gly Tyr Thr 35 40 45
Asn Cys Tyr Thr Gly Asn Thr Trp Asp Thr Ser Ile Cys Pro Asp Asp 50 55 60
Val Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Ser Gly 65 70 75 80
Thr Tyr Gly Val Thr Thr Ser Gly Asn Ala Leu Arg Leu Asn Phe Val 85 90 95
Thr Gln Ser Ser Gly Lys Asn Ile Gly Ser Arg Leu Tyr Leu Leu Gln 100 105 110
Asp Asp Thr Thr Tyr Gln Ile Phe Lys Leu Leu Gly Gln Glu Phe Thr 115 120 125
Phe Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu 130 135 140
Tyr Phe Val Ala Met Asp Ala Asp Gly Asn Leu Ser Lys Tyr Pro Gly 145 150 155 160
Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys 165 170 175
Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly Trp 180 185 190
Gln Pro Ser Ala Asn Asp Pro Asn Ala Gly Val Gly Asn His Gly Ser 195 200 205
Ser Cys Ala Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Thr Ala 210 215 220
Val Thr Pro His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Gln Gly 225 230 235 240
Asp Asp Cys Gly Gly Thr Tyr Ser Ser Thr Arg Tyr Ala Gly Thr Cys 245 250 255
Asp Thr Asp Gly Cys Asp Phe Asn Pro Tyr Gln Pro Gly Asn His Ser 260 265 270
Phe Tyr Gly Pro Gly Lys Ile Val Asp Thr Ser Ser Lys Phe Thr Val 275 280 285
Val Thr Gln Phe Ile Thr Asp Asp Gly Thr Pro Ser Gly Thr Leu Thr 290 295 300
Glu Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Gln Ser 305 310 315 320
Glu Ser Thr Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu Tyr 325 330 335
Cys Thr Ala Gln Lys Ala Ala Phe Asp Asn Thr Gly Phe Phe Thr His 340 345 350
Gly Gly Leu Gln Lys Ile Ser Gln Ala Leu Ala Gln Gly Met Val Leu 355 360 365
Val Met Ser Leu Trp Asp Asp His Ala Ala Asn Met Leu Trp Leu Asp 370 375 380
Ser Thr Tyr Pro Thr Asp Ala Asp Pro Asp Thr Pro Gly Val Ala Arg 385 390 395 400
Gly Thr Cys Pro Thr Thr Ser Gly Val Pro Ala Asp Val Glu Ser Gln 405 410 415
Asn Pro Asn Ser Tyr Val Ile Tyr Ser Asn Ile Lys Val Gly Pro Ile 420 425 430
Asn Ser Thr Phe Thr Ala Asn 435 <210 10 <211 > 1593 <212> DNA <213> Artificial <220> <223> coding sequence of Thermoascus auratiacus CBHI fused to the alpha factor signal peptide <400> 10 atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60 ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120 tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180 aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240 tctttggata aacgtgaggc ggaagcaccc tctcacgagg ccggtaccgt aaccgcagag 300 aatcaccctt ccctgacctg gcagcaatgc tccagcggcg gtagttgtac cacgcagaat 360 ggaaaagtcg ttatcgatgc gaactggcgt tgggtccata ccacctctgg atacaccaac 420 tgctacacgg gcaatacgtg ggacaccagt atctgtcccg acgacgtgac ctgcgctcag 480 aattgtgcct tggatggage ggattacagt ggcacctatg gtgttacgac cagtggcaac 540 gccctgagac tgaactttgt cacccaaagc tcagggaaga acattggctc gcgcctgtac 600 ctgctgcagg acgacaccac ttatcagatc ttcaagctgc tgggtcagga gtttaccttc 660 gatgtcgacg tctccaatct cccttgcggg ctgaacggcg ccctctactt tgtggccatg 720 gacgccgacg gcaatttgtc caaataccct ggcaacaagg caggcgctaa gtatggcact 780 ggttactgcg actctcagtg ccctcgggat ctcaagttca tcaacggtca ggccaacgtt 840 gaaggctggc agccgtctgc caacgaccca aatgccggcg ttggtaacca cggttcctcg 900 tgcgctgaga tggatgtctg ggaagccaac agcatctcta ctgcggtgac gcctcaccca 960 tgcgacaccc ccggccagac catgtgccag ggagacgact gtggtggaac ctactcctcc 1020 actcgatatg ctggtacctg cgacactgat ggctgcgàct tcaatcctta ccagccaggc 1080 aaccactcgt tctacggccc cgggaagatc gtcgacacta gctccaaatt caccgtcgtc 1140 acccagttca tcaccgacga cgggacaccc tccggcaccc tgacggagat caaacgcttc 1200 tacgtccaga acggcaaggt gatcccccag tcggagtcga cgatcagcgg cgtcaccggc 1260 aactcaatca ccaccgagta ttgcacggcc cagaaggcag ccttcgacaa caccggcttc 1320 tt'cacgcacg gcgggcttca gaagatcagt eaggctctgg ctcagggcat ggtcctcgtc 1380 atgagcctgt gggacgatca cgccgccaac atgctctggc tggacagcac ctacccgact 1440 gatgcggacc cggacacccc tggcgtcgcg cgcggtacct gccccacgac ctccggcgtc 1500 ccggccgacg tggagtcgca gaaccccaat tcatatgtta 
tctactccaa catcaaggtc 1560
ggacccatca actcgacctt caccgccaac taa 1593
<210> 11 <211> 1794 <212> DNA <213> Artificial <220> <223> Coding sequence for Trichoderma reesei CBHI (CBH-c), including the alpha factor signal peptide and a 6x His Tag <400> 11
atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60
ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120
tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180
aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240
tctttggata aacgtgaggc ggaagcaccc tcttcagctt gtacactgca atccgagact 300
catccacctt taacgtggca aaagtgtagt tctggcggaa cttgtactca acagactggt 360
agtgtcgtga tagatgctaa ctggagatgg acacatgcaa cgaactcctc aactaactgc 420
tacgatggta acacctggtc ttctacattg tgtcctgaca acgaaacctg cgctaagaac 480
tgttgtcttg atggagcagc ttacgcaagt acatatggtg tgactacctc tggtaacagc 540
ctttccattg gttttgtaac ccagtcggct cagaagaatg ttggtgctag attgtacctg 600
atggcttcag acaccacata ccaggagttt accttgttgg gaaacgagtt ctctttcgac 660
gtagatgtgt ctcagctacc atgtggattg aatggagcct tgtactttgt ctcaatggat 720
gcagacggag gtgtttcaaa gtaccctact aacacagctg gtgctaagta tggaactgga 780
tactgcgatt ctcaatgccc aagagacctg aagttcatca acggacaagc taacgttgaa 840
ggttgggaac cttctagcaa caacgcaaac actggaattg gtggtcatgg ttcttgctgt 900
tcagagatgg acatttggga agccaactcc atcagtgaag ctttgactcc acatccatgc 960
acaactgttg ggcaagaaat ttgcgaaggt gatggttgtg gtggcactta ctctgataac 1020
agatacggcg gaacatgtga tccagatgga tgtgattgga acccatacag actgggtaac 1080
acttcgtttt acggaccagg ttcttccttc actctagaca ctacgaagaa gttgactgtg 1140
gtcacccaat ttgagacttc tggtgccatt aaccgatact acgtgcagaa cggagttact 1200
ttccaacagc caaacgctga attgggtagt tactcaggca acgagcttaa cgatgactac 1260
tgcactgctg aagaagcaga atttggtgga tcttcctttt cggataaggg tggattgacg 1320
cagttcaaga aagctacctc tggtggaatg gttctagtca tgagtctgtg ggacgattac 1380
tacgctaaca tgctttggct ggactctact taccctacaa acgagacatc ttctactcct 1440
ggtgctgtaa gaggtagctg ttctacatct tctggagttc cagcccaagt tgagagtcaa 1500
agtccaaatg ccaaggtcac cttctccaac atcaagttcg gaccaattgg tagcacaggt 1560
aatccttcag gtggtaatcc tccaggtgga aacagaggaa caacgacaac tagaagacca 1620
gctactacaa ctggttcaag tccaggtcca actcaatcac actacggtca atgtggtggt 1680
ataggttact ctggtcccac tgtttgtgct tctggtacta cttgccaagt tctgaaccct 1740
tactactcac agtgtctagc ttctgcacac catcatcatc atcattaatg ataa 1794
<210> 12 <211> 496 <212> PRT <213> Artificial <220> <223> Trichoderma reesei CBHI (CBH-c) <400> 12
Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr Trp 1 5 10 15
Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser Val 20 25 30
Val Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser Thr 35 40 45
Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp Asn 50 55 60
Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala Ser 65 70 75 80
Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe Val 85 90 95
Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr Leu Met Ala 100 105 110
Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe Ser 115 120 125
Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala Leu 130 135 140
Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Thr 145 150 155 160
Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys 165 170 175
Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly Trp 180 185 190
Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly Ser 195 200 205
Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu Ala 210 215 220
Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu Gly 225 230 235 240
Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr Cys 245 250 255
Asp Pro Asp Gly Cys Asp Trp Asn Pro Tyr Arg Leu Gly Asn Thr Ser 260 265 270
Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys Leu 275 280 285
Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr Tyr 290 295 300
Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly Ser 305 310 315 320
Tyr Ser Gly Asn Glu Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu Ala 325 330 335
Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln Phe 340 345 350
Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp Asp 355 360 365
Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr Asn 370 375 380
Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr Ser 385 390 395 400
Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys Val 405 410 415
Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asn Pro 420 425 430
Ser Gly Gly Asn Pro Pro Gly Gly Asn Arg Gly Thr Thr Thr Thr Arg 435 440 445
Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro Thr Gln Ser His 450 455 460
Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro Thr Val Cys Ala 465 470 475 480
Ser Gly Thr Thr Cys Gln Val Leu Asn Pro Tyr Tyr Ser Gln Cys Leu 485 490 495
<210> 13 <211> 1767 <212> DNA <213> Artificial <220> <223> Coding sequence for Trichoderma viride CBHI, including the alpha factor signal peptide <400> 13
atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60
ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120
tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180
aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240
tctttggata aacgtgaggc ggaagcaccc tctcaatctg cttgcacctt gcagtctgaa 300
actcacccac cattgacctg gcagaagtgt tcttctggcg gtacttgtac tcagcagacc 360
ggttctgttg ttatcgacgc caactggaga tggactcacg ctaccaactc ttctaccaac 420
tgctacgacg gtaacacttg gtcgtctacc ttgtgtccag acaacgagac ctgtgccaag 480
aactgttgtt tggacggtgc tgcttacgct tctacctacg gtgttaccac ctctggtaac 540
tcgctgtcta tcggtttcgt tacccagtct gcccagaaaa atgttggtgc cagactgtac 600
ttgatggctt ctgacaccac ctaccaagag tttaccctgc tgggtaacga gttctctttc 660
gacgtggacg tttctcaact gccatgtgga ctgaacggtg ccctgtactt cgtttctatg 720
gacgctgacg gtggtgtttc taagtaccca accaacaccg ctggtgctaa atacggaacc 780
ggttactgcg attctcagtg cccaagagac ctgaagttca tcaacggaca ggctaacgtt 840
gaaggatggg agccatcttc taacaacgcc aacaccggta ttggtggtca cggttcttgc 900
tgttctgaga tggacatctg ggaggccaac tctatttctg aggctttgac cccacaccca 960
tgtactactg tgggtcaaga gatctgtgag ggtgatggtt gtggtggtac ttactcggac 1020
aacagatacg gtggtacttg tgacccagac ggttgtgatt gggacccata cagactgggt 1080
aacacctctt tctacggtcc aggatcttct tttaccctgg acaccaccaa gaagttgacc 1140
gttgttaccc agtttgagac ctctggtgcc atcaacagat actacgtgca gaacggtgtt 1200
actttccagc agccaaacgc tgaactggga tcttactctg gtaacggact gaacgacgac 1260
tactgtactg ctgaggaagc tgagttcggt ggttcttctt tctctgacaa gggtggactg 1320
acccagttta agaaggctac ctctggcgga atggtgctgg ttatgtcttt gtgggacgac 1380
tactacgcta acatgctgtg gcttgactct acctacccaa ctaacgagac ctcttctacc 1440
ccaggtgctg ttagaggatc ttgctctacc tcttctggtg ttccagctca ggttgagtct 1500
cagtctccaa acgccaaggt gaccttctct aacatcaagt tcggtccaat cggttctact 1560
ggtgacccat ctggtggtaa cccaccaggt ggaaacccac ctggtactac cactaccaga 1620
agaccagcta ccaccactgg ttcttctcca ggtccaaccc aatctcacta cggtcagtgt 1680
ggtggtattg gttactctgg tccaaccgtt tgtgcttctg gaaccacctg tcaggttctg 1740
aacccatact actcgcagtg cctgtaa 1767
<210> 14 <211> 497 <212> PRT <213> Artificial <220> <223> Trichoderma viride CBHI (CBH-f) <400> 14
Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr Trp 1 5 10 15
Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser Val 20 25 30
Val Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser Thr 35 40 45
Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp Asn 50 55 60
Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala Ser 65 70 75 80
Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe Val 85 90 95
Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr Leu Met Ala 100 105 110
Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe Ser 115 120 125
Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala Leu 130 135 140
Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Thr 145 150 155 160
Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys 165 170 175
Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly Trp 180 185 190
Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly Ser 195 200 205
Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu Ala 210 215 220
Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu Gly 225 230 235 240
Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr Cys 245 250 255
Asp Pro Asp Gly Cys Asp Trp Asp Pro Tyr Arg Leu Gly Asn Thr Ser 260 265 270
Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys Leu 275 280 285
Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr Tyr 290 295 300
Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly Ser 305 310 315 320
Tyr Ser Gly Asn Gly Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu Ala 325 330 335
Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln Phe 340 345 350
Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp Asp 355 360 365
Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr Asn 370 375 380
Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr Ser 385 390 395 400
Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys Val 405 410 415
Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asp Pro 420 425 430
Ser Gly Gly Asn Pro Pro Gly Gly Asn Pro Pro Gly Thr Thr Thr Thr 435 440 445
Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro Thr Gln Ser 450 455 460
His Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro Thr Val Cys 465 470 475 480
Ala Ser Gly Thr Thr Cys Gln Val Leu Asn Pro Tyr Tyr Ser Gln Cys 485 490 495
Leu
<210> 15 <211> 1785 <212> DNA <213> Artificial <220> <223> Coding sequence for Humicola grisea CBHI-Trichoderma reesei CBHI cellulose binding domain fusion protein including the alpha factor signal peptide and a 6x His Tag <400> 15
atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60
ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120
tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180
aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240
tctttggata aacgtgaggc ggaagcatgc tcgcagcagg ctggtacaat tactgctgag 300
aaccatccaa gaatgacgtg gaagagatgt agtggtccag gaaactgtca gactgttcag 360
ggtgaggtcg tgatagatgc taactggaga tggttgcata acaacggcca gaactgctac 420
gagggtaaca agtggacctc tcagtgttct tctgctaccg actgcgctca gagatgtgct 480
cttgatggag caaactacca gagtacatat ggtgcttcta cctctggtga cagccttacc 540
ctgaagtttg taaccaagca cgagtacgga accaatatcg gttctagatt ctacctgatg 600
gctaaccaga acaagtacca gatgtttacc ttgatgaaca acgagttcgc cttcgacgta 660
gatctgtcta aggtggagtg tggaatcaat tctgccttgt actttgtcgc tatggaagag 720
gacggaggta tggcttctta cccttctaac agagctggtg ctaagtatgg aactggatac 780
tgcgatgccc aatgcgctag agacctgaag ttcatcggtg gaaaggctaa cattgaaggt 840
tggagacctt ctaccaacga cccaaacgct ggagttggtc caatgggtgc ttgctgtgcc 900
gagattgacg tgtgggaatc taacgcttac gcctacgctt ttactccaca tgcttgcggt 960
tctaagaaca gataccacat ttgcgaaacc aacaactgtg gtggcactta ctctgatgac 1020
agattcgctg gatactgtga tgctaacgga tgtgattaca acccatacag aatgggtaac 1080
aaggactttt acggaaaggg taagactgtt gacactaaca gaaagttcac tgtggtctcg 1140
agatttgaga gaaacagact gtcgcagttc tttgtgcagg acggaagaaa gattgaggtc 1200
ccaccaccaa cttggccagg attgccaaac tctgccgaca ttaccccaga gttgtgcgac 1260
gctcagttca gagtgtttga cgacagaaac agatttgctg agaccggtgg atttgacgct 1320
ttgaacgagg ctctgaccat tccaatggtt ctagtcatga gtatttggga cgatcaccac 1380
tctaacatgc tttggctgga ctcttcttac cctccagaga aggctggatt gcctggtggt 1440
gacagaggtc catgtccaac aacttctgga gttccagccg aggttgaggc tcaataccca 1500
gacgcccagg tcgtgtggtc caacatcaga ttcggaccaa ttggaagctt aacaggtaat 1560
ccttcaggtg gtaatcctcc aggtggaaac agaggaacaa cgacaactag aagaccagct 1620
actacaactg gttcaagtcc aggtccaact caatcacact acggtcaatg tggtggtata 1680
ggttactctg gtcccactgt ttgtgcttct ggtactactt gccaagttct gaacccttac 1740
tactcacagt gtctagcttc tgcacaccat catcatcatc attaa 1785
<210> 16 <211> 503 <212> PRT <213> Artificial <220> <223> Humicola grisea CBHI-Trichoderma reesei CBHI cellulose binding domain fusion protein including a 6x His Tag (CBH-g) <400> 16
Gln Gln Ala Gly Thr Ile Thr Ala Glu Asn His Pro Arg Met Thr Trp 1 5 10 15
Lys Arg Cys Ser Gly Pro Gly Asn Cys Gln Thr Val Gln Gly Glu Val 20 25 30
Val Ile Asp Ala Asn Trp Arg Trp Leu His Asn Asn Gly Gln Asn Cys 35 40 45
Tyr Glu Gly Asn Lys Trp Thr Ser Gln Cys Ser Ser Ala Thr Asp Cys 50 55 60
Ala Gln Arg Cys Ala Leu Asp Gly Ala Asn Tyr Gln Ser Thr Tyr Gly 65 70 75 80
Ala Ser Thr Ser Gly Asp Ser Leu Thr Leu Lys Phe Val Thr Lys His 85 90 95
Glu Tyr Gly Thr Asn Ile Gly Ser Arg Phe Tyr Leu Met Ala Asn Gln 100 105 110
Asn Lys Tyr Gln Met Phe Thr Leu Met Asn Asn Glu Phe Ala Phe Asp 115 120 125
Val Asp Leu Ser Lys Val Glu Cys Gly Ile Asn Ser Ala Leu Tyr Phe 130 135 140
Val Ala Met Glu Glu Asp Gly Gly Met Ala Ser Tyr Pro Ser Asn Arg 145 150 155 160
Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ala Gln Cys Ala Arg 165 170 175
Asp Leu Lys Phe Ile Gly Gly Lys Ala Asn Ile Glu Gly Trp Arg Pro 180 185 190
Ser Thr Asn Asp Pro Asn Ala Gly Val Gly Pro Met Gly Ala Cys Cys 195 200 205
Ala Glu Ile Asp Val Trp Glu Ser Asn Ala Tyr Ala Tyr Ala Phe Thr 210 215 220
Pro His Ala Cys Gly Ser Lys Asn Arg Tyr His Ile Cys Glu Thr Asn 225 230 235 240
Asn Cys Gly Gly Thr Tyr Ser Asp Asp Arg Phe Ala Gly Tyr Cys Asp 245 250 255
Ala Asn Gly Cys Asp Tyr Asn Pro Tyr Arg Met Gly Asn Lys Asp Phe 260 265 270
Tyr Gly Lys Gly Lys Thr Val Asp Thr Asn Arg Lys Phe Thr Val Val 275 280 285
Ser Arg Phe Glu Arg Asn Arg Leu Ser Gln Phe Phe Val Gln Asp Gly 290 295 300
Arg Lys Ile Glu Val Pro Pro Pro Thr Trp Pro Gly Leu Pro Asn Ser 305 310 315 320
Ala Asp Ile Thr Pro Glu Leu Cys Asp Ala Gln Phe Arg Val Phe Asp 325 330 335
Asp Arg Asn Arg Phe Ala Glu Thr Gly Gly Phe Asp Ala Leu Asn Glu 340 345 350
Ala Leu Thr Ile Pro Met Val Leu Val Met Ser Ile Trp Asp Asp His 355 360 365
His Ser Asn Met Leu Trp Leu Asp Ser Ser Tyr Pro Pro Glu Lys Ala 370 375 380
Gly Leu Pro Gly Gly Asp Arg Gly Pro Cys Pro Thr Thr Ser Gly Val 385 390 395 400
Pro Ala Glu Val Glu Ala Gln Tyr Pro Asp Ala Gln Val Val Trp Ser 405 410 415
Asn Ile Arg Phe Gly Pro Ile Gly Ser Leu Thr Gly Asn Pro Ser Gly 420 425 430
Gly Asn Pro Pro Gly Gly Asn Arg Gly Thr Thr Thr Thr Arg Arg Pro 435 440 445
Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro Thr Gln Ser His Tyr Gly 450 455 460
Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro Thr Val Cys Ala Ser Gly 465 470 475 480
Thr Thr Cys Gln Val Leu Asn Pro Tyr Tyr Ser Gln Cys Leu Ala Ser 485 490 495
Ala His His His His His His 500
<210> 17 <211> 1809 <212> DNA <213> Artificial <220> <223> Coding sequence for Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion including the alpha factor signal peptide and a 6x His Tag <400> 17
atgagatttc cttcaatttt tactgcagtt ttattcgcag catcctccgc attagctgct 60
ccagtcaaca ctacaacaga agatgaaacg gcacaaattc cggctgaagc tgtcatcggt 120
tacttagatt tagaagggga tttcgatgtt gctgttttgc cattttccaa cagcacaaat 180
aacgggttat tgtttataaa tactactatt gccagcattg ctgctaaaga agaaggggta 240
tctttggata aacgtgaggc ggaagcatgc tcgcagcagg ccggcacggc gacggcagag 300
aaccacccgc ccctgacatg gcaggaatgc accgcccctg ggagctgcac cacccagaac 360
ggggcggtcg ttcttgatgc gaactggcgt tgggtgcacg atgtgaacgg atacaccaac 420
tgctacacgg gcaatacctg ggaccccacg tactgccctg acgacgaaac ctgcgcccag 480
aactgtgcgc tggacggcgc ggattacgag ggcacctacg gcgtgacttc gtcgggcagc 540
tccttgaaac tcaatttcgt caccgggtcg aacgtcggat cccgtctcta cctgctgcag 600
gacgactcga cctatcagat cttcaagctc ctgaaccgcg agttcagctt tgacgtcgat 660
gtctccaatc ttccgtgcgg attgaacggc gctctgtact ttgtcgccat ggacgccgac 720
ggcggcgtgt ccaagtaccc gaacaacaag gctggtgcca agtacggaac cgggtattgc 780
gactcccaat gcccacggga cctcaagttc atcgacggcg aggccaacgt cgagggctgg 840
cagccgtctt cgaacaacgc caacaccgga attggcgacc acggctcctg ctgtgcggag 900
atggatgtct gggaagcaaa cagcatctcc aatgcggtca ctccgcaccc gtgcgacacg 960
ccaggccaga cgatgtgctc tggagatgac tgcggtggca catactctaa cgatcgctac 1020
gcgggaacct gcgatcctga cggctgtgac ttcaaccctt accgcatggg caacacttct 1080
ttctacgggc ctggcaagat catcgatacc accaagccct tcactgtcgt gacgcagttc 1140
ctcactgatg atggtacgga tactggaact ctcagcgaga tcaagcgctt ctacatccag 1200
aacagcaacg tcattccgca gcccaactcg gacatcagtg gcgtgaccgg caactcgatc 1260
acgacggagt tctgcactgc tcagaagcag gcctttggcg acacggacga cttctctcag 1320
cacggtggcc tggccaagat gggagcggcc atgcagcagg gtatggtcct ggtgatgagt 1380
ttgtgggacg actacgccgc gcagatgctg tggttggatt ccgactaccc gacggatgcg 1440
gaccccacga cccctggtat tgcccgtgga acgtgtccga cggactcggg cgtcccatcg 1500
gatgtcgagt cgcagagccc caactcctac gtgacctact cgaacattaa gtttggtccg 1560
atcggtagca caggtaatcc ttcaggtggt aatcctccag gtggaaacag aggaacaacg 1620
acaactagaa gaccagctac tacaactggt tcaagtccag gtccaactca atcacactac 1680
ggtcaatgtg gtggtatagg ttactctggt cccactgttt gtgcttctgg tactacttgc 1740
caagttctga acccttacta ctcacagtgt ctagcttctg cacatcatca ccaccaccat 1800
taatgataa 1809
<210> 18 <211> 509 <212> PRT <213> Artificial <220> <223> Mature sequence of Talaromyces emersonii CBHI / Trichoderma reesei-CBD fusion with 6x His tag (CBH-ah) <400> 18
Gln Gln Ala Gly Thr Ala Thr Ala Glu Asn His Pro Pro Leu Thr Trp 1 5 10 15
Gln Glu Cys Thr Ala Pro Gly Ser Cys Thr Thr Gln Asn Gly Ala Val 20 25 30
Val Leu Asp Ala Asn Trp Arg Trp Val His Asp Val Asn Gly Tyr Thr 35 40 45
Asn Cys Tyr Thr Gly Asn Thr Trp Asp Pro Thr Tyr Cys Pro Asp Asp 50 55 60
Glu Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr Glu Gly 65 70 75 80
Thr Tyr Gly Val Thr Ser Ser Gly Ser Ser Leu Lys Leu Asn Phe Val 85 90 95
Thr Gly Ser Asn Val Gly Ser Arg Leu Tyr Leu Leu Gln Asp Asp Ser 100 105 110
Thr Tyr Gln Ile Phe Lys Leu Leu Asn Arg Glu Phe Ser Phe Asp Val 115 120 125
Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val 130 135 140
Ala Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Asn Asn Lys Ala 145 150 155 160
Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp 165 170 175
Leu Lys Phe Ile Asp Gly Glu Ala Asn Val Glu Gly Trp Gln Pro Ser 180 185 190
Ser Asn Asn Ala Asn Thr Gly Ile Gly Asp His Gly Ser Cys Cys Ala 195 200 205
Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Asn Ala Val Thr Pro 210 215 220
His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Ser Gly Asp Asp Cys 225 230 235 240
Gly Gly Thr Tyr Ser Asn Asp Arg Tyr Ala Gly Thr Cys Asp Pro Asp 245 250 255
Gly Cys Asp Phe Asn Pro Tyr Arg Met Gly Asn Thr Ser Phe Tyr Gly 260 265 270
Pro Gly Lys Ile Ile Asp Thr Thr Lys Pro Phe Thr Val Val Thr Gln 275 280 285
Phe Leu Thr Asp Asp Gly Thr Asp Thr Gly Thr Leu Ser Glu Ile Lys 290 295 300
Arg Phe Tyr Ile Gln Asn Ser Asn Val Ile Pro Gln Pro Asn Ser Asp 305 310 315 320
Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu Phe Cys Thr Ala 325 330 335
Gln Lys Gln Ala Phe Gly Asp Thr Asp Asp Phe Ser Gln His Gly Gly 340 345 350
Leu Ala Lys Met Gly Ala Ala Met Gln Gln Gly Met Val Leu Val Met 355 360 365
Ser Leu Trp Asp Asp Tyr Ala Ala Gln Met Leu Trp Leu Asp Ser Asp 370 375 380
Tyr Pro Thr Asp Ala Asp Pro Thr Thr Pro Gly Ile Ala Arg Gly Thr 385 390 395 400
Cys Pro Thr Asp Ser Gly Val Pro Ser Asp Val Glu Ser Gln Ser Pro 405 410 415
Asn Ser Tyr Val Thr Tyr Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser 420 425 430
Thr Gly Asn Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Arg Gly Thr 435 440 445
Thr Thr Thr Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro 450 455 460
Thr Gln Ser His Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro 465 470 475 480
Thr Val Cys Ala Ser Gly Thr Thr Cys Gln Val Leu Asn Pro Tyr Tyr 485 490 495
Ser Gln Cys Leu Ala Ser Ala His His His His His His 500 505
<210> 19 <211> 1335 <212> DNA <213> Artificial <220> <223> Alternative coding sequence of Humicola grisea CBHI with signal sequence <400> 19
atggccagcg atctggcaca gcaggctggt acaattactg ctgagaacca tccaagaatg 60
acgtggaaga gatgtagtgg tccaggaaac tgtcagactg ttcagggtga ggtcgtgata 120
gatgctaact ggagatggtt gcataacaac ggccagaact gctacgaggg taacaagtgg 180
acctctcagt gttcttctgc taccgactgc gctcagagat gtgctcttga tggagcaaac 240
taccagagta catatggtgc ttctacctct ggtgacagcc ttaccctgaa gtttgtaacc 300
aagcacgagt acggaaccaa tatcggttct agattctacc tgatggctaa ccagaacaag 360
taccagatgt ttaccttgat gaacaacgag ttcgccttcg acgtagatct gtctaaggtg 420
gagtgtggaa tcaattctgc cttgtacttt gtcgctatgg aagaggacgg aggtatggct 480
tcttaccctt ctaacagagc tggtgctaag tatggaactg gatactgcga tgcccaatgc 540
gctagagacc tgaagttcat cggtggaaag gctaacattg aaggttggag accttctacc 600
aacgacccaa acgctggagt tggtccaatg ggtgcttgct gtgccgagat tgacgtgtgg 660
gaatctaacg cttacgccta cgcttttact ccacatgctt gcggttctaa gaacagatac 720
cacatttgcg aaaccaacaa ctgtggtggc acttactctg atgacagatt cgctggatac 780
tgtgatgcta acggatgtga ttacaaccca tacagaatgg gtaacaagga cttttacgga 840
aagggtaaga ctgttgacac taacagaaag ttcactgtgg tctcgagatt tgagagaaac 900
agactgtcgc agttctttgt gcaggacgga agaaagattg aggtcccacc accaacttgg 960
ccaggattgc caaactctgc cgacattacc ccagagttgt gcgacgctca gttcagagtg 1020
tttgacgaca gaaacagatt tgctgagacc ggtggatttg acgctttgaa cgaggctctg 1080
accattccaa tggttctagt catgagtatt tgggacgatc accactctaa catgctttgg 1140
ctggactctt cttaccctcc agagaaggct ggattgcctg gtggtgacag aggtccatgt 1200
ccaacaactt ctggagttcc agccgaggtt gaggctcaat acccagacgc ccaggtcgtg 1260
tggtccaaca tcagattcgg accaattggt agcacagtga atgtggcttc tgcacaccat 1320
catcatcatc attga 1335
<210> 20 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 20
gaggcggaag caccctctca atctgcttgc accttgcagt c 41
<210> 21 <211> 38 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 21
ggagacgcag agcccttatt acaggcactg cgagtagt 38
<210> 22 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 22
gaggcggaag caccctctca gcaggctggt actattactg c 41
<210> 23 <211> 44 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 23
ggagacgcag agcccttaca cgttcacggt agaaccgatt gggc 44
<210> 24 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 24
gaggcggaag caccctctca cgaggccggt accgtaaccg c 41
<210> 25 <211> 41 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 25
ggagacgcag agcccttatt agttggcggt gaaggtcgag t 41
<210> 26 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 26
gaggcggaag caccctctca gcaggccggc acggcgacgg c 41
<210> 27 <211> 41 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 27
ggagacgcag agcccttatc acgaagcggt gaaggtcgag t 41
<210> 28 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 28
gaggcggaag caccctctca gcaggccggc acggcgacgg c 41
<210> 29 <211> 38 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 29
attacctgtg ctaccgatcg gaccaaactt aatgttcg 38
<210> 30 <211> 38 <212> DNA <213> Artificial <220> <223> Primer forward <400> 30
aagtttggtc cgatcggtag cacaggtaat ccttcagg 38
<210> 31 <211> 44 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 31
ggagacgcag agcccttatt atagacactg tgagtagtaa gggt 44
<210> 32 <211> 41 <212> DNA <213> Artificial <220> <223> Primer forward <400> 32
gaggcggaag caccctctca gcaggccggc acggcgacgg c 41
<210> 33 <211> 44 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 33
ggagacgcag agcccttatc attaatggtg gtggtgatga tgag 44
<210> 34 <211> 40 <212> DNA <213> Artificial <220> <223> Primer forward <400> 34
aggcggaagc atgctcgcag caggctggta caattactgc 40
<210> 35 <211> 41 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 35
ggattacctg ttaagcttcc aattggtccg aatctgatgt t 41
<210> 36 <211> 42 <212> DNA <213> Artificial <220> <223> Primer forward <400> 36
accaattgga agcttaacag gtaatccttc aggtggtaat cc 42
<210> 37 <211> 46 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 37
atcttgcagg tcgacttatc attaatgatg atgatgatgg tgtgca 46
<210> 38 <211> 40 <212> DNA <213> Artificial <220> <223> Primer forward <400> 38
aggcggaagc atgctcgcag caggctggta caattactgc 40
<210> 39 <211> 46 <212> DNA <213> Artificial <220> <223> Primer reverse <400> 39
atcttgcagg tcgacttatc attaatgatg atgatgatgg tgtgca 46
<210> 40 <211> 21 <212> DNA <213> Artificial <220> <223> Oligonucleotide alpha-f <400> 40
tactattgcc agcattgctg c 21
<210> 41 <211> 23 <212> DNA <213> Artificial <220> <223> Oligonucleotide oli740 <400> 41
tcagctattt cacatacaaa tcg 23
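Each short entry of the listing ends with a running count that should equal the number of nucleotides printed before it. The following Python fragment is a minimal sketch of how such an entry could be checked and how the reverse complement of an oligonucleotide could be derived. It is not part of the patent; the function names are illustrative assumptions, and only the oligonucleotide of SEQ ID NO: 40 is taken from the listing.

```python
# Minimal sketch: consistency checks for short entries of the sequence listing.
# All function names are illustrative assumptions, not part of the patent.

COMPLEMENT = str.maketrans("acgt", "tgca")


def split_entry(entry: str) -> tuple[str, int]:
    """Split 'block block ... N' into (sequence, declared length N)."""
    *blocks, declared = entry.split()
    return "".join(blocks).lower(), int(declared)


def entry_is_consistent(entry: str) -> bool:
    """True if only a/c/g/t occur and the length equals the trailing count."""
    seq, declared = split_entry(entry)
    return set(seq) <= set("acgt") and len(seq) == declared


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 40 (oligonucleotide alpha-f) as printed in the listing.
alpha_f, declared = split_entry("tactattgcc agcattgctg c 21")
print(entry_is_consistent("tactattgcc agcattgctg c 21"))
print(reverse_complement(alpha_f))
```

A check of this kind flags both a wrong length and a character outside a, c, g and t, which is a typical symptom of scanning damage in a printed listing.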
Claims
1. A polypeptide having cellobiohydrolase activity, wherein the polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 2, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 is modified by substitution or deletion.
2. The polypeptide according to claim 1, wherein the polypeptide maintains 50% of its maximum substrate conversion capacity when the conversion is done for 60 minutes at a temperature of 60°C or higher.
3. The polypeptide according to claim 1 or 2, wherein the polypeptide comprises an amino acid sequence having at least 90%, preferably at least 95%, more preferably at least 99% sequence identity to SEQ ID NO: 2.
4. The polypeptide according to one or more of the preceding claims, wherein the amino acid sequence of the polypeptide has the sequence as defined by SEQ ID NO: 2, or a sequence as defined by SEQ ID NO: 2 wherein 1 to 75 amino acid residues, more preferably 1 to 35 amino acid residues, are substituted, deleted, or inserted.
5. The polypeptide according to claim 4, wherein additionally one or more of the following amino acid residues of the sequence defined by SEQ ID NO: 2 are modified by substitution or deletion: positions G4, A6, T15, Q28, W40, D64, E65, A72, S86, K92, V130, V152, Y155, K159, D181, E183, N194, D202, P224, T243, Y244, I277, K304, N310, S311, N318, D320, T335, T344, D346, Q349, A358, Y374, A375, T392, T393, D410, Y422, P442, N445, R446, T456, S460, P462, G463, H468 and/or V482 of amino acids 1 to 500 of SEQ ID NO: 2.
6. The polypeptide according to claim 5, wherein the polypeptide comprises one or more of the following preferred exchanges with respect to the sequence defined by SEQ ID NO: 2:
7. The polypeptide according to one or more of the preceding claims, wherein the polypeptide has an amino acid sequence selected from the list of the following mutations of SEQ ID NO: 2:
8. The polypeptide according to one or more of the preceding claims, which is expressed and secreted at a level of more than 100 mg/l, more preferably of more than 200 mg/l, particularly preferably of more than 500 mg/l, and most preferably of more than 1 g/l into the supernatant after introduction of a nucleic acid encoding a polypeptide having an amino acid sequence with at least 85% sequence identity to SEQ ID NO: 2 into a yeast, wherein the amino acid residue at position Q1 of SEQ ID NO: 2 must be modified by substitution or deletion.
9. A nucleic acid encoding the polypeptide of one or more of claims 1 to 8, preferably having at least 95% identity to SEQ ID NO: 1.
10. A vector comprising the nucleic acid of claim 9.
11. A host cell transformed with a vector of claim 10.
12. The host cell of claim 11, wherein the host cell is derived from the group consisting of Saccharomyces, Schizosaccharomyces, Kluyveromyces, Pichia, Hansenula, Aspergillus, Trichoderma, Penicillium, Candida and Yarrowina.
13. Composition comprising the polypeptide of one or more of claims 1 to 9 and one or more endoglucanases and/or one or more beta-glucosidases and/or one or more further cellobiohydrolases and/or one or more xylanases.
14. Use of the polypeptide according to one or more of claims 1 to 9 or of the composition of claim 13 for the enzymatic degradation of lignocellulosic biomass, and/or for textiles processing and/or as ingredient in detergents and/or as ingredient in food or feed compositions.
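The claims identify residues by labels such as Q1, G4 or T15, that is, a one-letter amino acid code followed by a 1-based position in SEQ ID NO: 2. The Python fragment below is a minimal sketch of how such labels could be parsed and checked against a sequence. It is not part of the patent: the function names are hypothetical, and the 40-residue prefix is an assumption transcribed from the start of the SeqID_NO.2 row of Figure 10, which matches positions 1 to 40 of SEQ ID NO: 18 in the listing.

```python
import re

# Assumption: first 40 residues of SEQ ID NO: 2 in one-letter code, transcribed
# from the SeqID_NO.2 row of Figure 10 (same as SEQ ID NO: 18, positions 1-40).
SEQ2_PREFIX = "QQAGTATAENHPPLTWQECTAPGSCTTQNGAVVLDANWRW"

LABEL = re.compile(r"^([A-Z])(\d+)$")


def parse_label(label: str) -> tuple[str, int]:
    """Turn a label such as 'G4' into ('G', 4); positions are 1-based."""
    match = LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def label_matches(label: str, sequence: str) -> bool:
    """True if the sequence carries the named residue at the named position."""
    residue, position = parse_label(label)
    return 1 <= position <= len(sequence) and sequence[position - 1] == residue


# Labels from claims 1 and 5 that fall inside the 40-residue prefix.
for label in ["Q1", "G4", "A6", "T15", "Q28", "W40"]:
    print(label, label_matches(label, SEQ2_PREFIX))
```

A label that does not match the reference sequence usually points to an off-by-one numbering problem or to a character misread during scanning, for example the digit 1 in place of the letter I.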
Figure 10
1 50
T._reesei_CBHI QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC
SeqID_NO.2 QQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC
51 100
T._reesei_CBHI YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIGFVTQSA
SeqID_NO.2 YTGNTWDPTY CPDDETCAQK CALDGADYEG TYGVTSSGSS LKLNFVTG..
101 150
T._reesei_CBHI QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD
SeqID_NO.2 .SNVGSRLYL LQDDSTYQIF KLLNREFSFD VDVSNLPCGL NGALYFVAMD
151 200
T._reesei_CBHI ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN
SeqID_NO.2 ADGGVSKYPN NKAGAKYGTG YCDSQCPRDL KFIDGEANVE GWQPSSNNAN
201 250
T._reesei_CBHI TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN
SeqID_NO.2 TGIGDHGSCC AEMDVWEANS ISNAVTPHPC DTPGQTMCSG DDCGGTYSND
251 300
T._reesei_CBHI RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSG..
SeqID_NO.2 RYAGTCDPDG CDFNPYRMGN TSFYGPGK.. IIDTTKPFTV VTQFLTDDGT
301 350
T._reesei_CBHI ......AINR YYVQNGVTFQ QPNAELGSYS GNELNDDYCT AEEAEFGGSS
SeqID_NO.2 DTGTLSEIKR FYIQNSNVIP QPNSDISGVT GNSITTEFCT AQKQAPGDTD
351 400
T._reesei_CBHI .FSDKGGLTQ FKKATSGGMV LVMSLWDDYY ANMLWLDSTY PTNETSSTPG
SeqID_NO.2 DFSQHGGLAK MGAAMQQGMV LVMSLWDDYA AQMLWLDSDY PTDADPTTPG
401 450
T._reesei_CBHI AVRGSCSTSS GVPAQVESQS PNAKVTFSNI KFGPIGSTGN PSGGNPPGGN
SeqID_NO.2 IARGTCPTDS GVPSDVESQS PNSYVTYSNI KFGPIGSTGN PSGGNPPGGN
451 500
T._reesei_CBHI RGTTTTRRPA TTTGSSPGPT QSHYGQCGGI GYSGPTVCAS GTTCQVLNPY
SeqID_NO.2 RGTTTTRRPA TTTGSSPGPT QSHYGQCGGI GYSGPTVCAS GTTCQVLNPY
501
T._reesei_CBHI YSQCL
SeqID_NO.2 YSQCL
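The claims state sequence-identity thresholds (at least 85%, 90%, 95%, 99%) relative to SEQ ID NO: 2. As a rough illustration of how such a percentage can be derived from aligned rows like those in Figure 10, the Python sketch below counts identical residues over the aligned columns 351 to 400 of the figure. It rests on assumptions that are not taken from the patent: the function name `identity_counts` is hypothetical, `.` is treated as the gap symbol, and columns containing a gap are excluded from the denominator. Other conventions exist, and the identity values in the claims are defined by the method given in the patent description, not by this sketch.

```python
def identity_counts(row_a: str, row_b: str, gap: str = ".") -> tuple[int, int]:
    """Return (identical, compared) for two aligned rows.

    Spaces between 10-residue groups are ignored. Columns in which
    either row has a gap are skipped (an assumed convention).
    """
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same number of columns")
    identical = 0
    compared = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gap column: not counted
        compared += 1
        if x == y:
            identical += 1
    return identical, compared


# Alignment columns 351-400 as printed in Figure 10.
ROW_TR = ".FSDKGGLTQ FKKATSGGMV LVMSLWDDYY ANMLWLDSTY PTNETSSTPG"
ROW_SEQ2 = "DFSQHGGLAK MGAAMQQGMV LVMSLWDDYA AQMLWLDSDY PTDADPTTPG"

identical, compared = identity_counts(ROW_TR, ROW_SEQ2)
print(f"{identical}/{compared} identical = {100 * identical / compared:.1f}%")
```

A figure computed over a 50-column window says nothing about the full-length identity; the same function can be applied to the complete concatenated rows once they are available in clean form.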
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| HUE10153355A HUE025604T2 (en) | 2010-02-11 | 2010-02-11 | Optimized cellulase enzymes |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| HUE10153355A HUE025604T2 (en) | 2010-02-11 | 2010-02-11 | Optimized cellulase enzymes |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| HUE025604T2 (en) | 2016-03-29 |
Family
ID=57835772
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| HUE10153355A HUE025604T2 (en) | 2010-02-11 | 2010-02-11 | Optimized cellulase enzymes |
Country Status (1)
| Country | Link |
|---|---|
| HU (1) | HUE025604T2 (en) |
- 2010-02-11: HU application HUE10153355A, patent HUE025604T2 (en), status unknown
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| DK2357227T3 (en) | | Optimized cellulase enzymes |
| US9885028B2 (en) | | Carbohydrate binding modules with reduced binding to lignin |
| CN101970652B (en) | | Cellulase variants with reduced inhibition by glucose |
| CA2763836A1 (en) | | Novel beta-glucosidase enzymes |
| EP2401370A1 (en) | | Novel lignin-resistant cellulase enzymes |
| MX2013004203A (en) | | Thermostable Trichoderma cellulase |
| AU2009276270A1 (en) | | Family 6 cellulase with decreased inactivation by lignin |
| MX2011000552A (en) | | Modified family 6 glycosidases with altered substrate specificity |
| CN108368496A (en) | | Mutant β-glucosidase variants with increased thermostability |
| EA034175B1 (en) | | Protein having endoglucanase activity and use thereof for saccharification of lignocellulose |
| HUE025604T2 (en) | | Optimized cellulase enzymes |
| US20160251642A1 (en) | | Stable fungal cel6 enzyme variants |
| US20130084619A1 (en) | | Modified cellulases with enhanced thermostability |
| Galanopoulou | | Bacterial hydrolases of thermophilic origin and their application in plant biomass valorisation |