WO2019169027A2 - Microbial production of triterpenoids including mogrosides - Google Patents

Publication number
WO2019169027A2
Authority
WO
WIPO (PCT)
Prior art keywords: amino acid, seq, acid sequence, mog, ugt
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2019/019886
Other languages
French (fr)
Other versions
WO2019169027A3 (en
Inventor
Ryan PHILIPPE
Ajikumar Parayil KUMARAN
Christine Nicole S. SANTOS
Michelle N. GOETTGE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Manus Bio Inc
Original Assignee
Manus Bio Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020207027034A (KR20200136911A)
Priority to US16/971,740 (US12351849B2)
Priority to EP19760267.5A (EP3759230A4)
Priority to CN201980028158.0A (CN112041457A)
Priority to BR112020017490-4A (BR112020017490A2)
Priority to JP2020544815A (JP7382946B2)
Application filed by Manus Bio Inc
Priority to MX2020008922A (MX2020008922A)
Publication of WO2019169027A2
Publication of WO2019169027A3
Priority to JP2023189958A (JP2024020310A)
Current legal status: Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P5/00 Preparation of hydrocarbons or halogenated hydrocarbons
    • C12P5/007 Preparation of hydrocarbons or halogenated hydrocarbons containing one or more isoprene units, i.e. terpenes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P5/00 Preparation of hydrocarbons or halogenated hydrocarbons
    • C12P5/02 Preparation of hydrocarbons or halogenated hydrocarbons acyclic
    • C12P5/026 Unsaturated compounds, i.e. alkenes, alkynes or allenes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/44 Preparation of O-glycosides, e.g. glucosides
    • C12P19/60 Preparation of O-glycosides, e.g. glucosides having an oxygen of the saccharide radical directly bound to a non-saccharide heterocyclic ring or a condensed ring system containing a non-saccharide heterocyclic ring, e.g. coumermycin, novobiocin
    • A HUMAN NECESSITIES
    • A23 FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23L FOODS, FOODSTUFFS OR NON-ALCOHOLIC BEVERAGES, NOT OTHERWISE PROVIDED FOR; PREPARATION OR TREATMENT THEREOF
    • A23L2/00 Non-alcoholic beverages; Dry compositions or concentrates therefor; Preparation or treatment thereof
    • A23L2/52 Adding ingredients
    • A23L2/60 Sweeteners
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0012 Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7)
    • C12N9/0014 Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7) acting on the CH-NH2 group of donors (1.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0071 Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
    • C12N9/0083 Miscellaneous (1.14.99)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1048 Glycosyltransferases (2.4)
    • C12N9/1051 Hexosyltransferases (2.4.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1085 Transferases (2.) transferring alkyl or aryl groups other than methyl groups (2.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/18 Preparation of compounds containing saccharide radicals produced by the action of a glycosyl transferase, e.g. alpha-, beta- or gamma-cyclodextrins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • C12P33/20 Preparation of steroids containing heterocyclic rings
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y204/00 Glycosyltransferases (2.4)
    • C12Y204/01 Hexosyltransferases (2.4.1)
    • C12Y204/01017 Glucuronosyltransferase (2.4.1.17)
    • A HUMAN NECESSITIES
    • A23 FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23L FOODS, FOODSTUFFS OR NON-ALCOHOLIC BEVERAGES, NOT OTHERWISE PROVIDED FOR; PREPARATION OR TREATMENT THEREOF
    • A23L27/00 Spices; Flavouring agents or condiments; Artificial sweetening agents; Table salts; Dietetic salt substitutes; Preparation or treatment thereof
    • A23L27/30 Artificial sweetening agents
    • A23L27/33 Artificial sweetening agents containing sugars or derivatives
    • A23L27/36 Terpene glycosides
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90 Isomerases (5.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y114/00 Oxidoreductases acting on paired donors, with incorporation or reduction of molecular oxygen (1.14)
    • C12Y114/99 Miscellaneous (1.14.99)
    • C12Y114/99007 Squalene monooxygenase (1.14.99.7)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y204/00 Glycosyltransferases (2.4)
    • C12Y204/01 Hexosyltransferases (2.4.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y205/00 Transferases transferring alkyl or aryl groups, other than methyl groups (2.5)
    • C12Y205/01 Transferases transferring alkyl or aryl groups, other than methyl groups (2.5) transferring alkyl or aryl groups, other than methyl groups (2.5.1)
    • C12Y205/01021 Squalene synthase (2.5.1.21), i.e. farnesyl-disphosphate farnesyltransferase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y303/00 Hydrolases acting on ether bonds (3.3)
    • C12Y303/02 Ether hydrolases (3.3.2)
    • C12Y303/02003 Epoxide hydrolase (3.3.2.3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y504/00 Intramolecular transferases (5.4)
    • C12Y504/99 Intramolecular transferases (5.4) transferring other groups (5.4.99)
    • C12Y504/99033 Cucurbitadienol synthase (5.4.99.33)

Definitions

  • Mogrosides are triterpene-derived specialized secondary metabolites found in the fruit of the Cucurbitaceae family plant Siraitia grosvenorii (a/k/a monkfruit or Luo Han Guo). Their biosynthesis in fruit involves a number of consecutive glycosylations of the aglycone mogrol to the final sweet product, Mogroside V (Mog. V).
  • The food industry is increasing its use of mogroside fruit extract as a natural non-sugar food sweetener. For example, Mog. V has a sweetening capacity that is ~250 times that of sucrose (Kasai et al., Agric Biol Chem (1989)).
  • additional health benefits of mogrosides have been revealed in recent studies (Li et al, Chin J Nat Med (2014)).
  • Mog. V has been approved as a high-intensity sweetening agent in Japan (Jakinovich et al., Journal of Natural Products (1990)) and the extract has gained GRAS status in the USA as a non-nutritive sweetener and flavor enhancer (GRAS 522). Extraction of mogrosides from the fruit can yield a product of varying degrees of purity, often accompanied by an undesirable aftertaste. In addition, yields of mogroside from cultivated fruit are limited due to low plant yields and particular cultivation requirements of the plant. Mogrosides are present at about 1% in the fresh fruit and about 4% in the dried fruit (Li HB, et al., 2006). Mog. V is the main component, with a content of 0.5% to 1.4% in the dried fruit. Moreover, purification difficulties limit purity for Mog. V, with commercial products from plant extracts being standardized to about 50% Mog. V. It is highly likely that a pure Mog. V product will achieve greater commercial success than the blend, since it is less likely to have off flavors, will be easier to formulate into products, and has good solubility potential. It is therefore advantageous to be able to produce sweet mogroside compounds via biotechnological processes.
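To give a sense of scale for the plant-extraction limits described above, the following is a rough arithmetic sketch in Python. It uses only the 0.5% to 1.4% Mog. V content of dried fruit quoted in the text. The function name and the default of complete (100%) recovery are illustrative assumptions, not values from the source, so the results are lower bounds on the fruit required.

```python
# Back-of-the-envelope estimate from the mass fractions quoted in the text.
# Assumption (not from the source): extraction recovery defaults to 100%.

def dried_fruit_per_kg_mog_v(mog_v_fraction: float, recovery: float = 1.0) -> float:
    """Return the kg of dried fruit needed to obtain 1 kg of Mog. V."""
    if not 0 < mog_v_fraction <= 1 or not 0 < recovery <= 1:
        raise ValueError("fractions must be in the interval (0, 1]")
    return 1.0 / (mog_v_fraction * recovery)

# 0.5% and 1.4% Mog. V by mass in dried fruit, as stated in the text.
low_content, high_content = 0.005, 0.014

worst_case = dried_fruit_per_kg_mog_v(low_content)
best_case = dried_fruit_per_kg_mog_v(high_content)
print(f"{best_case:.0f} to {worst_case:.0f} kg dried fruit per kg Mog. V")
```

Under these assumptions, roughly 71 to 200 kg of dried fruit would be needed per kilogram of Mog. V. Any real recovery below 100% raises both figures.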
  • the present invention provides a method for making mogrol glycosides, as well as other triterpenoid compounds, using recombinant microbial processes.
  • the invention provides methods for making products, including foods, beverages, and sweeteners (among others), by incorporating the mogrol glycosides produced according to the methods described herein.
  • the invention provides a method for making a triterpenoid compound. The method comprises providing a recombinant microbial host cell expressing a heterologous enzyme pathway catalyzing the conversion of isopentenyl pyrophosphate (IPP) and/or dimethylallyl pyrophosphate (DMAPP) to one or more triterpenoid compounds.
  • the heterologous enzyme pathway comprises a farnesyl diphosphate synthase (FPPS) and a squalene synthase (SQS), which are recombinantly expressed.
  • the SQS comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 2 to 16, 166, and 167.
  • the host cell is cultured under conditions for producing the triterpenoid.
  • the microbial host cell in various embodiments may be prokaryotic or eukaryotic.
  • the microbial host cell is a bacterium such as Escherichia coli, or the microbial cell may be a yeast cell.
  • the host cell is a bacterial or yeast host cell engineered to increase production of IPP and DMAPP from glucose.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11).
  • AaSQS has high activity in E. coli.
  • Other SQS enzymes that are active in E. coli include Siraitia grosvenorii SQS (SEQ ID NO: 2), Euphorbia lathyris SQS (SEQ ID NO: 14), Eleutherococcus senticosus SQS (SEQ ID NO: 16), Flavobacteriales bacterium SQS (SEQ ID NO: 166), and Bacteroidetes bacterium SQS (SEQ ID NO: 167).
  • the heterologous enzyme pathway produces squalene, which is optionally an intermediate that acts as a substrate for additional downstream pathway enzymes.
  • squalene is recovered from the culture, and may be recovered from the microbial cells, and/or may be recovered from the media and/or an organic layer.
  • the host cell expresses one or more enzymes that produce mogrol from squalene.
  • the host cell may express one or more of squalene epoxidase (SQE), cucurbitadienol synthase (CDS), epoxide hydrolase (EPH), cytochrome P450 oxidases (CYP450), non-heme iron-dependent oxygenases, and cytochrome P450 reductases (CPR).
  • the heterologous enzyme pathway further comprises a squalene epoxidase (SQE).
  • the heterologous enzyme pathway may comprise an SQE that produces 2,3-oxidosqualene.
  • Exemplary squalene epoxidases may comprise an amino acid sequence that is at least 70% identical to any one of SEQ ID NOS: 17 to 39, 168, 169, and 170.
  • the squalene epoxidase may comprise an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39).
  • MlSQE has high activity in E. coli.
  • SQE enzymes in accordance with the disclosure include Bathymodiolus azoricus endosymbiont squalene epoxidase (SEQ ID NO: 168), Methyloprofundus sedimenti squalene epoxidase (SEQ ID NO: 169), Methylomicrobium buryatense squalene epoxidase (SEQ ID NO: 170), and engineered derivatives thereof.
  • the heterologous enzyme pathway further comprises a triterpene cyclase.
  • the microbial cell coexpresses FPPS, SQS, SQE, and the triterpene cyclase.
  • the microbial cell produces cucurbitadienol.
  • the cucurbitadienol may be the substrate for downstream enzymes in the heterologous pathway, or is alternatively recovered from the culture (either from microbial cells, or the culture media or organic layer).
  • the triterpene cyclase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 40 to 55.
  • the triterpene cyclase has cucurbitadienol synthase (CDS) activity.
  • the CDS in various embodiments comprises an amino acid sequence that is at least 70% identical to the amino acid sequence of SEQ ID NO: 40 (Siraitia grosvenorii).
  • the heterologous enzyme pathway further comprises an epoxide hydrolase (EPH).
  • Exemplary EPH enzymes comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 56 to 72.
  • the EPH may employ as a substrate 24,25-epoxycucurbitadienol, for production of 24,25-dihydroxycucurbitadienol.
  • the heterologous pathway further comprises one or more oxidases.
  • the one or more oxidases may be active on cucurbitadienol or oxygenated products thereof as a substrate, adding (collectively) hydroxylations at C11, C24, and C25, thereby producing mogrol.
  • Exemplary oxidase enzymes are described herein.
  • the heterologous enzyme pathway produces mogrol, which may be an intermediate for downstream enzymes in the heterologous pathway, or in some embodiments is recovered from the culture.
  • Mogrol may be recovered from host cells in some embodiments, or in some embodiments, can be recovered from the culture media or organic layer.
  • the heterologous enzyme pathway further comprises one or more uridine diphosphate-dependent glycosyltransferase (UGT) enzymes, thereby producing one or more mogrol glycosides (or “mogrosides”).
  • the mogrol glycoside may be pentaglycosylated, or hexaglycosylated in some embodiments. In other embodiments, the mogrol glycoside has two, three, or four glucosylations.
  • the one or more mogrol glycosides may be selected from Mog. II-E, Mog. III-A-2, Mog. III-E, Mog. IIIx, Mog. IV-A, Mog. IV-E, Siamenoside, Isomog. IV, and Mog. V.
  • the mogroside is a pentaglucosylated or hexaglucosylated mogroside.
  • the host cell expresses a UGT enzyme that catalyzes the primary glycosylation of mogrol at C24 and/or C3 hydroxyl groups.
  • the UGT enzyme catalyzes beta 1,2 and/or beta 1,6 branching glycosylations of mogrol glycosides at the primary C3 and C24 glucosyl groups.
  • Exemplary UGT enzymes are disclosed herein (SEQ ID NOS: 116 to 165).
  • the microbial cell expresses at least four UGT enzymes, resulting in glucosylation of mogrol at the C3 hydroxyl group, the C24 hydroxyl group, as well as a further 1,6 glucosylation at the C3 glucosyl group, and a further 1,6 glucosylation and a further 1,2 glucosylation at the C24 glucosyl group.
  • the product of such glucosylation reactions is Mog. V.
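The glucosylation pattern just described can be tallied to check that it yields a pentaglucosylated product. The Python sketch below encodes only what the text states: a primary glucose at C3 with one further 1,6 glucosylation, and a primary glucose at C24 with one further 1,6 and one further 1,2 glucosylation. The dictionary layout and the names `MOG_V_PATTERN`, `count_glucose_units`, and `PREFIXES` are illustrative assumptions.

```python
# Glucose units attached at each mogrol hydroxyl position for Mog. V,
# as described in the text. The data layout is illustrative only.
MOG_V_PATTERN = {
    "C3": ["primary", "beta-1,6"],
    "C24": ["primary", "beta-1,6", "beta-1,2"],
}

def count_glucose_units(pattern: dict) -> int:
    """Count all glucose units across every attachment position."""
    return sum(len(units) for units in pattern.values())

# Naming prefixes used for the degree of glucosylation.
PREFIXES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}

n_units = count_glucose_units(MOG_V_PATTERN)
print(n_units, PREFIXES[n_units] + "glucosylated")
```

The tally gives five glucose units, which agrees with the description of Mog. V as a pentaglucosylated mogroside.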
  • At least one UGT enzyme expressed by the microbial cell may comprise an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165).
  • UGT85C1, and derivatives thereof, provide for glucosylation of the C3 hydroxyl of mogrol or Mog. I-A.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146).
  • UGT85C2, and derivatives thereof, provide for glucosylation of the C24 hydroxyl of mogrol or Mog. I-E.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (CaUGT_1,6) (SEQ ID NO: 164).
  • CaUGT_1,6, and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glycosyl groups.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117).
  • UGT94-289-3 (“Sg94_3”), and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glucosyl groups, as well as beta 1,2 glucosylation at the C24 glucosyl group.
  • the microbial cell expresses at least one UGT enzyme capable of catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group (e.g., of Mog. IV A, see FIG. 4).
  • Exemplary UGT enzymes in accordance with these embodiments include Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117), Stevia rebaudiana UGT91D1 (SEQ ID NO: 147), Stevia rebaudiana UGT91D2 (SEQ ID NO: 148), Stevia rebaudiana UGT91D2e (SEQ ID NO: 149), OsUGT1-2 (SEQ ID NO: 150), or MbUGT1-2 (SEQ ID NO: 163), or derivatives thereof.
  • At least one UGT enzyme is a circular permutant of a wild-type UGT enzyme, optionally having amino acid substitutions, deletions, and/or insertions with respect to the corresponding position of the wild-type enzyme. Circular permutants can provide novel and desirable substrate specificities, product profiles, and reaction kinetics over the wild-type enzymes.
  • at least one UGT enzyme is a circular permutant of SEQ ID NO: 146, SEQ ID NO: 164, SEQ ID NO: 165, SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, or SEQ ID NO: 163, or a derivative thereof.
  • Mogrol glycosides can be recovered from the microbial culture.
  • mogrol glycosides may be recovered from microbial cells, or in some embodiments, are predominately transported into the extracellular media, where they may be recovered or sequestered.
  • the invention provides a method for making a pentaglycosylated or hexaglycosylated mogroside, such as Mog. V.
  • the invention comprises reacting a mogrol glycoside with a plurality of uridine diphosphate dependent glycosyltransferase (UGT) enzymes.
  • one UGT enzyme comprises an amino acid sequence that is at least 70% identical to SEQ ID NO: 164 (or circular permutant thereof), where the UGT enzyme catalyzes beta 1,6 addition of a glucose.
  • Other UGT enzymes as described herein will be coexpressed to glycosylate the desired substrate to Mog. V.
  • the mogrol is reacted with about four UGT enzymes.
  • a first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or a circular permutant thereof.
  • a second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or a circular permutant thereof.
  • a third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or a circular permutant thereof.
  • a fourth UGT enzyme is capable of catalyzing beta 1,2 addition of a glucose molecule, such as SgUGT94_289_3 (SEQ ID NO: 117) or a derivative or circular permutant thereof.
  • the mogrol glycoside can be recovered and/or purified from the reaction or culture.
  • the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
  • the reaction is performed in a microbial cell, and UGT enzymes are recombinantly expressed in the cell.
  • mogrol is produced in the cell by a heterologous mogrol synthesis pathway, as described herein.
  • mogrol or mogrol glycosides are fed to the cells for glycosylation.
  • the reaction is performed in vitro using purified UGT enzyme, partially purified UGT enzyme, or recombinant cell lysates.
  • the invention provides a method for making a product comprising a mogrol glycoside. The method comprises producing a mogrol glycoside in accordance with this disclosure, and incorporating the mogrol glycoside into a product.
  • the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
  • the product is a sweetener composition, flavoring composition, food, beverage, chewing gum, texturant, pharmaceutical composition, tobacco product, nutraceutical composition, or oral hygiene composition.
  • the product may be a sweetener composition comprising a blend of artificial and/or natural sweeteners.
  • the composition may further comprise one or more of a steviol glycoside, aspartame, and neotame.
  • exemplary steviol glycosides comprise one or more of RebM, RebB, RebD, RebA, RebE, and RebI.
  • FIG. 1 shows the chemical structures of Mog. V, Mog. VI, and Isomog. V.
  • the type of glycosylation reaction is shown within each glucose moiety (e.g., C3 or C24 core glycosylation and the 1-2, 1-4, or 1-6 glycosylation additions).
  • FIG. 2 shows routes to mogroside V production in vivo. The enzymatic transformation required for each step is indicated, along with the type of enzyme required. Numbers in parentheses correspond to the chemical structures in FIG. 3. Abbreviations: FPP, farnesyl pyrophosphate; SQS, squalene synthase; SQE, squalene epoxidase; TTC, triterpene cyclase; EPH, epoxide hydrolase; CYP450, cytochrome P450 with reductase partner; UGTs, uridine diphosphate glycosyltransferases.
  • FIG. 3 depicts chemical structures of metabolites involved in mogroside V biosynthesis: (1) farnesyl pyrophosphate; (2) squalene; (3) 2,3-oxidosqualene; (4) 2,3;22,23-dioxidosqualene; (5) 24,25-epoxycucurbitadienol; (6) 24,25-dihydroxycucurbitadienol; (7) mogrol; (8) mogroside V; (9) cucurbitadienol.
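The upstream part of the pathway can be read as a linear chain of enzyme steps, which the Python sketch below models. It covers only the four steps the text states explicitly (FPPS to farnesyl pyrophosphate, SQS to squalene, SQE to 2,3-oxidosqualene, and the triterpene cyclase to cucurbitadienol). Treating the chain as strictly linear is a simplifying assumption: it omits the 2,3;22,23-dioxidosqualene branch and all downstream EPH, CYP450, and UGT steps. The names `UPSTREAM_STEPS` and `furthest_product` are hypothetical.

```python
# Simplified linear model of the upstream pathway steps named in the text.
# Each tuple is (enzyme abbreviation, product formed by that enzyme).
UPSTREAM_STEPS = [
    ("FPPS", "farnesyl pyrophosphate"),
    ("SQS", "squalene"),
    ("SQE", "2,3-oxidosqualene"),
    ("TTC", "cucurbitadienol"),
]

def furthest_product(expressed: set, start: str = "IPP/DMAPP") -> str:
    """Return the last product reachable given the set of expressed enzymes.

    The walk stops at the first missing enzyme, since every later step
    depends on the product of the step before it.
    """
    product = start
    for enzyme, next_product in UPSTREAM_STEPS:
        if enzyme not in expressed:
            break
        product = next_product
    return product

print(furthest_product({"FPPS", "SQS"}))
print(furthest_product({"FPPS", "SQS", "SQE", "TTC"}))
```

With only FPPS and SQS expressed the model stops at squalene, while coexpressing all four enzymes reaches cucurbitadienol, matching the coexpression statement in the text.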
  • FIG. 4 illustrates glycosylation routes to mogroside V, and in vitro biotransformation activity observed for various UGT enzymes.
  • Bubble structures represent different mogrosides.
  • White tetra-cyclic core represents mogrol.
  • the numbers below each structure indicate the particular glycosylated mogroside, while the notation with the arrows indicates the enzymes observed to exhibit the glycosylation activity.
  • Black circles represent C3 or C24 glucosylations. Dark grey vertical circles represent 1,6-glucosylations. Light grey horizontal circles represent 1,2-glucosylations.
  • FIG. 5 shows results for in vivo production of squalene in E. coli using different squalene synthases.
  • the asterisk denotes a different plasmid construct and experiment run on a different day from the others shown.
  • FIG. 6 shows results for in vivo production of squalene, 2,3-oxidosqualene, and 2,3;22,23-dioxidosqualene using different squalene epoxidases.
  • Abbreviations: SQS, squalene synthase; SQE, squalene epoxidase; Sg, Siraitia grosvenorii; Aa, Artemisia annua; BaE, Bathymodiolus azoricus endosymbiont; Ms, Methyloprofundus sedimenti; Mb, Methylomicrobium buryatense; Ml, Methylomonas lenta.
  • FIG. 7 shows results for in vivo production of the cyclized triterpene product. Reactions involve an increasing number of enzymes expressed in an E. coli cell line having an overexpression of MEP pathway enzymes. The asterisks represent fermentation experiments incubated for a quarter of the time of the other experiments. As shown, co-expression of AaSQS, MlSQE, and SgTTC resulted in high production of the triterpenoid product, cucurbitadienol.
  • FIG. 8 shows Mogroside V production using a combination of different enzymes.
  • Penta-glycosylated products are observed when 85C1, 85C2, and Sg94_3 or CaUGT_1,6 are incubated together with mogrol as a substrate.
  • Mogroside substrates were incubated in Tris buffer containing magnesium chloride, beta-mercaptoethanol, UDP-glucose, a single UGT, and a phosphatase.
  • (B) Extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogroside II-E.
  • FIG. 9 shows in vitro assays demonstrating the conversion of mogroside substrates to more glycosylated products.
  • Mogroside substrates were incubated in Tris buffer containing magnesium chloride, beta-mercaptoethanol, UDP-glucose, a single UGT, and a phosphatase.
  • the panels correspond to the use of different substrates: (A) mogrol; (B) mogroside I-A; (C) mogroside I-E; (D) mogroside II-E; (E) mogroside III; (F) mogroside IV-A; (G) mogroside IV; (H) siamenoside.
  • FIG. 10 is an amino acid alignment of CaUGT_1,6 and SgUGT94_289_3 using Clustal Omega (Version CLUSTAL O (1.2.4)). These sequences share 54% amino acid identity.
  • FIG. 11 is an amino acid alignment of Homo sapiens squalene synthase (HsSQS) (NCBI accession NP_004453.3) and AaSQS (SEQ ID NO: 11) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQS has a published crystal structure (PDB entry: 1EZF). These sequences share 42% amino acid identity.
  • FIG. 12 is an amino acid alignment of Homo sapiens squalene epoxidase (HsSQE) (NCBI accession XP_011515548) and MlSQE (SEQ ID NO: 39) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQE has a published crystal structure (PDB entry: 6C6N). These sequences share 35% amino acid identity.
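Percent identity figures like those quoted for FIGS. 10 to 12 are derived from a pairwise alignment. The Python sketch below shows one common way to compute such a value from two already-aligned sequences. It assumes identity is counted over columns where both sequences have a residue; other conventions (for example, dividing by full alignment length) exist, and the source does not say which one was used. The toy sequences are invented for illustration and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Invented toy alignment: 7 ungapped columns, 6 of them identical.
toy_a = "MKT-LLVAG"
toy_b = "MRTALL-AG"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity; meets 70% threshold: {identity >= 70.0}")
```

The same function could be applied to a candidate enzyme aligned against a reference such as SEQ ID NO: 11 to check an "at least 70% identical" criterion, subject to the denominator convention noted above.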
  • the present invention, in various aspects and embodiments, provides a method for making mogrol glycosides, as well as other triterpenoid compounds, using recombinant microbial processes.
  • the invention provides methods for making products, including foods, beverages, and sweeteners (among others), by incorporating the mogrol glycosides produced according to the methods described herein.
  • the terms “terpene” or “triterpene” are used interchangeably with the terms “terpenoid” or “triterpenoid,” respectively.
  • the invention provides a method for making a triterpenoid compound.
  • the method comprises providing a recombinant microbial host cell expressing a heterologous enzyme pathway catalyzing the conversion of isopentenyl pyrophosphate (IPP) and/or dimethylallyl pyrophosphate (DMAPP) to one or more triterpenoid compounds.
  • the heterologous enzyme pathway comprises a farnesyl diphosphate synthase (FPPS) and a squalene synthase (SQS), which are recombinantly expressed.
  • the SQS comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 2 to 16, 166, and 167.
  • the host cell is cultured under conditions for producing the triterpenoid.
  • the FPPS may be Saccharomyces cerevisiae farnesyl pyrophosphate synthase (ScFPPS) (SEQ ID NO: 1), or modified variants thereof. Modified variants may comprise an amino acid sequence that is at least 70% identical to SEQ ID NO: 1. For example, the FPPS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least
  • the FPPS comprises an amino acid sequence having from 1 to 20 amino acid modifications or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 1, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
  • Numerous other FPPS enzymes are known in the art, and may be employed for conversion of IPP and/or DMAPP to farnesyl diphosphate in accordance with this aspect.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 11.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 11, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, AaSQS has high activity in E. coli.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii SQS (SEQ ID NO: 2).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 2.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 2, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, SgSQS has high activity in E. coli.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Euphorbia lathyris SQS (SEQ ID NO: 14).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 14.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 14, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, ElSQS was active in E. coli.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Eleutherococcus senticosus SQS (SEQ ID NO: 16).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 16.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 16, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, EsSQS was active in E. coli.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Flavobacteriales bacterium SQS (SEQ ID NO: 166).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 166.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 166, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, FbSQS was active in E. coli.
  • the SQS comprises an amino acid sequence that is at least 70% identical to Bacteroidetes bacterium SQS (SEQ ID NO: 167).
  • the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 167.
  • the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 167, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, BbSQS was active in E. coli.
  • Amino acid modifications to the SQS enzyme can be guided by available enzyme structures and homology models, including those described in Aminfar and Tohidfar, In silico analysis of squalene synthase in Fabaceae family using bioinformatics tools. J. Genetic Engineer and Biotech. 16 (2016) 739-747.
  • the publicly available crystal structure for HsSQE (PDB entry: 6C6N) may be used to inform amino acid modifications.
  • An alignment between AaSQS and HsSQS is shown in FIG. 11.
  • the enzymes have 42% amino acid identity.
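  • Identity figures such as the 42% above, and the "at least 70% identical" thresholds used throughout, can be computed from a pairwise alignment. The following is a minimal sketch under stated assumptions: the two rows come from an existing alignment, gaps are written as '-', and identity is taken as identical columns divided by all columns that contain at least one residue. Other conventions for the denominator exist, and this section does not say which one applies; the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-').

    Assumption: the denominator is the number of alignment columns holding
    at least one residue; columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

  • Raising the threshold argument to 80, 85, 90, 95, 98, or 99 gives the progressively narrower identity ranges recited for each enzyme.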
  • the heterologous enzyme pathway produces squalene, which is optionally an intermediate that acts as a substrate for additional downstream pathway enzymes.
  • squalene is recovered from the culture, and may be recovered from the microbial cells, and/or may be recovered from the media and/or an organic layer.
  • the microbial host cell in various embodiments may be prokaryotic or eukaryotic.
  • the microbial host cell is a bacterium selected from Escherichia spp., Bacillus spp., Corynebacterium spp., Rhodobacter spp., Zymomonas spp., Vibrio spp., and Pseudomonas spp.
  • the bacterial host cell is a species selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida.
  • the bacterial host cell is E. coli.
  • the microbial cell may be a yeast cell, such as but not limited to a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica.
  • the microbial cell will produce MEP or MVA products, which act as substrates for the heterologous enzyme pathway.
  • the MEP (2-C-methyl-D-erythritol 4-phosphate) pathway, also called the MEP/DOXP (2-C-methyl-D-erythritol 4-phosphate/1-deoxy-D-xylulose 5-phosphate) pathway, the non-mevalonate pathway, or the mevalonic acid-independent pathway, refers to the pathway that converts glyceraldehyde-3-phosphate and pyruvate to IPP and DMAPP.
  • the pathway typically involves action of the following enzymes: 1-deoxy-D-xylulose-5-phosphate synthase (Dxs), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE), 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF), 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG), and isopentenyl diphosphate isomerase (IspH).
  • genes that make up the MEP pathway include dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, and ispA.
  • the host cell expresses or overexpresses one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA, or modified variants thereof, which results in the increased production of IPP and DMAPP.
  • the triterpenoid (e.g., squalene, mogrol, or other intermediate described herein) is produced at least in part by metabolic flux through an MEP pathway, and wherein the host cell has at least one additional gene copy of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA, or modified variants thereof.
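  • The "at least one additional gene copy" condition above can be expressed as a simple check over a strain description. This is an illustrative sketch only: the gene list is the one recited above, while the copy-number dictionary, the baseline of one native copy, and the function names are assumptions made for the example.

```python
# MEP pathway genes as listed in the text above.
MEP_GENES = ("dxs", "ispC", "ispD", "ispE", "ispF", "ispG", "ispH", "idi", "ispA")


def genes_with_extra_copies(copy_numbers: dict, baseline: int = 1) -> list:
    """Return the MEP genes present at more than `baseline` copies.

    `copy_numbers` maps gene name -> copies in the engineered host
    (hypothetical input format). Genes not listed are assumed to be
    present at the baseline copy number.
    """
    return [g for g in MEP_GENES if copy_numbers.get(g, baseline) > baseline]


def has_additional_mep_copy(copy_numbers: dict, baseline: int = 1) -> bool:
    """True if at least one MEP gene has an additional copy."""
    return bool(genes_with_extra_copies(copy_numbers, baseline))
```

  • Non-MEP entries in the dictionary are ignored, so a strain carrying extra copies of unrelated genes alone does not satisfy the check.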
  • the MVA pathway refers to the biosynthetic pathway that converts acetyl-CoA to IPP.
  • the mevalonate pathway typically comprises enzymes that catalyze the following steps: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA (e.g., by action of acetoacetyl-CoA thiolase); (b) condensing acetoacetyl-CoA with acetyl-CoA to form hydroxymethylglutaryl-CoenzymeA (HMG- CoA) (e.g., by action of HMG-CoA synthase (HMGS)); (c) converting HMG-CoA to mevalonate (e.g., by action of HMG-CoA reductase (HMGR)); (d) phosphorylating mevalonate to mevalonate 5-phosphate (e.g., by action of mevalonate kina
  • the host cell expresses or overexpresses one or more of acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, and MPD or modified variants thereof, which results in the increased production of IPP and DMAPP.
  • the triterpenoid is produced at least in part by metabolic flux through an MVA pathway, and wherein the host cell has at least one additional gene copy of one or more of acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, or modified variants thereof.
  • the host cell is a bacterial host cell engineered to increase production of IPP and DMAPP from glucose as described in US 2018/0245103 and US 2018/0216137, the contents of which are hereby incorporated by reference in their entireties.
  • the host cell overexpresses MEP pathway enzymes, with balanced expression to push/pull carbon flux to IPP and DMAPP.
  • the host cell is engineered to increase the availability or activity of Fe-S cluster proteins, so as to support higher activity of IspG and IspH, which are Fe-S enzymes.
  • the host cell is engineered to overexpress IspG and IspH, so as to provide increased carbon flux to 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate (HMBPP) intermediate, but with balanced expression to prevent accumulation of HMBPP at an amount that reduces cell growth or viability, or at an amount that inhibits MEP pathway flux and/or terpenoid production.
  • the host cell exhibits higher activity of IspH relative to IspG.
  • the host cell is engineered to downregulate the ubiquinone biosynthesis pathway, e.g., by reducing the expression or activity of IspB, which uses IPP and FPP substrate.
  • the host cell expresses one or more enzymes that produce mogrol from squalene.
  • the host cell may express one or more of squalene epoxidase (SQE), cucurbitadienol synthase (CDS), epoxide hydrolase (EPH), cytochrome P450 oxidases (CYP450), non-heme iron-dependent oxygenases, and cytochrome P450 reductases (CPR).
  • the pathway proceeds through cucurbitadienol, and in some embodiments, does not involve a further epoxidation step.
  • one or more of SQE, CDS, EPH, CYP450, non-heme iron-dependent oxygenases, flavodoxin reductases (FPR), ferredoxin reductases (FDXR), and CPR enzymes are engineered to increase flux to mogrol.
  • the heterologous enzyme pathway further comprises a squalene epoxidase (SQE).
  • the heterologous enzyme pathway may comprise an SQE that produces 2,3-oxidosqualene (intermediate (3) in FIG. 2).
  • the SQE will produce 22,23-dioxidosqualene (intermediate (4) in FIG. 2).
  • the squalene epoxidase may comprise an amino acid sequence that is at least 70% identical to any one of SEQ ID NOS: 17 to 39, 168-170.
  • the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39).
  • the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 39.
  • the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 39, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG.
  • MlSQE had good activity in E. coli. Further, when coexpressed with AaSQS, high levels of the single epoxylated product (2,3-oxidosqualene) were observed. Accordingly, coexpression of AaSQS (or an engineered derivative) with MlSQE (or an engineered derivative) has good potential for bioengineering of the mogrol pathway. Amino acid modifications may be made to increase expression or stability of the SQE enzyme in the microbial cell, or to increase productivity of the enzyme.
  • the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Bathymodiolus azoricus Endosymbiont squalene epoxidase (SEQ ID NO: 168).
  • the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 168.
  • the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 168, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, BaESQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
  • the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methyloprofundus sedimenti squalene epoxidase (SEQ ID NO: 169).
  • the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 169.
  • the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 169, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, MsSQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
  • the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methylomicrobium buryatense squalene epoxidase (SEQ ID NO: 170).
  • the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 170.
  • the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 170, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, MbSQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
  • FIG. 12 shows an alignment of HsSQE and MlSQE, which is useful for guiding engineering of the enzymes for expression, stability, and productivity in microbial host cells. The two enzymes have 35% identity.
  • the heterologous enzyme pathway further comprises a triterpene cyclase.
  • the microbial cell coexpresses FPPS, SQS, SQE, and the triterpene cyclase.
  • the microbial cell produces cucurbitadienol (compound (9) in FIG. 2).
  • the cucurbitadienol may be the substrate for downstream enzymes in the heterologous pathway, or is alternatively recovered from the culture (either from microbial cells, or the culture media or organic layer).
  • the triterpene cyclase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 40 to 55. In some embodiments, the triterpene cyclase has cucurbitadienol synthase (CDS) activity.
  • the CDS in various embodiments comprises an amino acid sequence that is at least 70% identical to the amino acid sequence of SEQ ID NO: 40, and may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 40.
  • the CDS may comprise an amino acid sequence having from 1 to 20 amino acid modifications or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 40, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
  • Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
  • Amino acid modifications can be guided by available enzyme structures and homology models, including those described in Itkin M., et al., The biosynthetic pathway of the nonsugar high-intensity sweetener mogroside V from Siraitia grosvenorii. PNAS (2016) Vol 113(47): E7619-E7628.
  • the CDS may be modeled using the structure of human lanosterol synthase (oxidosqualene cyclase) (PDB 1W6K).
  • the heterologous enzyme pathway further comprises an epoxide hydrolase (EPH).
  • the EPH may comprise an amino acid sequence that is at least 70% identical to amino acid sequence selected from SEQ ID NOS: 56 to 72.
  • the EPH may employ as a substrate 24,25-epoxycucurbitadienol (intermediate (5) of FIG. 2), for production of 24,25-dihydroxycucurbitadienol (intermediate (6) of FIG. 2).
  • the EPH comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 56 to 72.
  • Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
  • the heterologous pathway further comprises one or more oxidases.
  • the one or more oxidases may be active on cucurbitadienol or oxygenated products thereof as a substrate, adding (collectively) hydroxylations at C11, C24, and C25, thereby producing mogrol (see FIG. 2).
  • At least one oxidase is a cytochrome P450 enzyme.
  • Exemplary cytochrome P450 enzymes comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 73 to 91.
  • at least one P450 enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 73 to 91.
  • the CYP450 and/or CPR is modified as described in US 2018/0251738, the contents of which are hereby incorporated by reference in their entireties.
  • the CYP450 enzyme has a deletion of all or part of the wild type P450 N-terminal transmembrane region, and the addition of a transmembrane domain derived from an E. coli or bacterial inner membrane, cytoplasmic C-terminus protein.
  • the transmembrane domain is a single-pass transmembrane domain.
  • the transmembrane domain is a multi-pass (e.g., 2, 3, or more transmembrane helices) transmembrane domain.
  • At least one oxidase is a non-heme iron oxidase.
  • Exemplary non-heme iron oxidases comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 100 to 115.
  • the non-heme iron oxidase comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 100 to 115.
  • the microbial host cell expresses one or more electron transfer proteins selected from a cytochrome P450 reductase (CPR), flavodoxin reductase (FPR) and ferredoxin reductase (FDXR) sufficient to regenerate the one or more oxidases.
  • the heterologous enzyme pathway produces mogrol, which may be an intermediate for downstream enzymes in the heterologous pathway, or in some embodiments is recovered from the culture.
  • Mogrol may be recovered from host cells in some embodiments, or in some embodiments, can be recovered from the culture media or organic layer.
  • the heterologous enzyme pathway further comprises one or more uridine diphosphate-dependent glycosyltransferase (UGT) enzymes, thereby producing one or more mogrol glycosides (or “mogrosides”).
  • the mogrol glycoside may be pentaglycosylated, or hexaglycosylated in some embodiments. In other embodiments, the mogrol glycoside has two, three, or four glucosylations.
  • the one or more mogrol glycosides may be selected from Mog. II-E, Mog. III-A-2, Mog. III-E, Mog. IIIx, Mog. IV-A, Mog. IV-E, Siamenoside, Isomog. IV, and Mog. V.
  • the mogroside is a pentaglucosylated or hexaglucosylated mogroside.
  • the one or more mogrol glycosides include Mog. VI, Isomog. V, and Mog. V.
  • the host cell produces Mog. V.
  • the host cell expresses a UGT enzyme that catalyzes the primary glycosylation of mogrol at C24 and/or C3 hydroxyl groups.
  • the UGT enzyme catalyzes beta 1,2 and/or beta 1,6 branching glycosylations of mogrol glycosides at the primary C3 and C24 glucosyl groups.
  • the UGT enzyme catalyzes beta 1,2 glucosylation of Mog. IV-A, beta 1,6 glucosylation of Mog. IV, and/or beta 1,6 glucosylation of Siamenoside to Mog. V.
  • the UGT enzyme catalyzes the beta 1,6 glucosylation of Mog. V to Mog. VI.
  • the UGT enzyme catalyzes the beta 1,4 glucosylation of Siamenoside and/or the beta 1,6 glucosylation of Isomog. IV to Isomog. V.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 116 to 165.
  • the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 116 to 165.
  • the microbial cell expresses at least four UGT enzymes, resulting in glucosylation of mogrol at the C3 hydroxyl group, the C24 hydroxyl group, as well as a further 1,6 glucosylation at the C3 glucosyl group, and a further 1,6 glucosylation and a further 1,2 glucosylation at the C24 glucosyl group.
  • the product of such glucosylation reactions is Mog. V (FIG. 4).
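  • The five glucosylations recited above (primary glucosylation at C3 and C24, a further 1,6 glucosylation at the C3 glucosyl group, and a further 1,6 and a further 1,2 glucosylation at the C24 glucosyl group) can be tallied to confirm that the product is pentaglucosylated. The sketch below is a bookkeeping illustration only; the step list mirrors the text, while the data layout, the ordering of steps, and the function name are assumptions and do not assert which enzyme acts first.

```python
from collections import Counter

# (site, linkage) pairs taken from the description of Mog. V above.
MOG_V_STEPS = [
    ("C3", "primary"),
    ("C24", "primary"),
    ("C3", "beta-1,6"),
    ("C24", "beta-1,6"),
    ("C24", "beta-1,2"),
]


def glucose_units_per_site(steps) -> dict:
    """Count glucose units attached at each mogrol hydroxyl position.

    Branching (non-primary) glucosylations are only accepted once the
    primary glucose at the same site has been added.
    """
    primed = set()
    counts = Counter()
    for site, linkage in steps:
        if linkage == "primary":
            if site in primed:
                raise ValueError(f"{site} already carries a primary glucose")
            primed.add(site)
        elif site not in primed:
            raise ValueError(f"{linkage} at {site} requires a primary glucose")
        counts[site] += 1
    return dict(counts)
```

  • Summing the per-site counts for MOG_V_STEPS gives the total number of glucose units on the mogrol core.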
  • At least one UGT enzyme expressed by the microbial cell may comprise an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165).
  • UGT85C1, and derivatives thereof, provide for glucosylation of the C3 hydroxyl of mogrol or Mog. IA.
  • Other glucosyltransferase reactions detected for UGT85C1 are shown in FIG. 4.
  • at least one UGT enzyme may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165.
  • the UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146).
  • UGT85C2, and derivatives thereof, provide for glucosylation of the C24 hydroxyl of mogrol or Mog. IE.
  • Other glucosyltransferase reactions detected for UGT85C2 are shown in FIG. 4.
  • at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146.
  • At least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (CaUGT_1,6) (SEQ ID NO: 164).
  • CaUGT_1,6, and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glycosyl groups. Glycosyltransferase reactions observed for CaUGT_1,6 are shown in FIG. 4.
  • at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164.
  • At least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
  • At least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117).
  • UGT94-289-3 (“Sg94_3”), and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glucosyl groups, as well as beta 1,2 glucosylation at the C24 glucosyl group. Glycosyltransferase reactions observed for Sg94_3 are shown in FIG. 4.
  • at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 117.
  • at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 117, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
  • the microbial cell expresses at least one UGT enzyme capable of catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group (e.g., of Mog. IV-A, see FIG. 4).
  • Exemplary UGT enzymes in accordance with these embodiments include Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117), Stevia rebaudiana UGT91D1 (SEQ ID NO: 147), Stevia rebaudiana UGT91D2 (SEQ ID NO: 148), Stevia rebaudiana UGT91D2e (SEQ ID NO: 149), OsUGT1-2 (SEQ ID NO: 150), or MbUGT1-2 (SEQ ID NO: 163), or derivatives thereof.
  • Derivatives include enzymes comprising amino acid sequences that are at least 70% identical to one or more of SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, and SEQ ID NO: 163.
  • the UGT enzyme catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one or more of SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, and SEQ ID NO: 163.
  • At least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, and SEQ ID NO: 163, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
  • At least one UGT enzyme is a circular permutant of a wild-type UGT enzyme, optionally having amino acid substitutions, deletions, and/or insertions with respect to the corresponding position of the wild-type enzyme.
  • Circular permutants can provide novel and desirable substrate specificities, product profiles, and reaction kinetics over the wild-type enzymes.
  • a circular permutant retains the same basic fold of the parent enzyme, but has a different position of the N-terminus (e.g., “cut-site”), with the original N- and C-termini connected, optionally by a linking sequence.
  • the N-terminal Methionine is positioned at a site in the protein other than the natural N-terminus.
  • UGT circular permutants are described in US 2017/0332673, which is hereby incorporated by reference in its entirety.
  • at least one UGT enzyme is a circular permutant of SEQ ID NO: 146, SEQ ID NO: 164, SEQ ID NO: 165, SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, or SEQ ID NO: 163.
  • the circular permutant further has one or more amino acid modifications (e.g., amino acid substitutions, deletions, and/or insertions) with respect to the parent UGT enzyme.
  • the circular permutant will have at least about 70%, or at least about 80%, or at least about 90%, or at least about 95%, or at least about 98% identity to the parent enzyme, when the corresponding amino acid sequences are aligned (i.e., without regard to the new N-terminus of the circular permutant).
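  • The circular permutation described above (a new N-terminus at a cut-site, with the original N- and C-termini joined, optionally through a linker) can be sketched as a string operation on the parent sequence. This is a minimal illustration under stated assumptions: the cut-site is given as a 0-based index of the residue that becomes the new first residue, the optional start methionine is simply prepended when requested, and the function names and toy sequences are hypothetical rather than drawn from any SEQ ID NO.

```python
def circular_permutant(parent: str, cut_site: int, linker: str = "",
                       add_start_met: bool = False) -> str:
    """Build a circular permutant of `parent`.

    The residue at `cut_site` (0-based) becomes the new N-terminus; the
    original C-terminus is joined to the original N-terminus through
    `linker` (empty string for a direct join).
    """
    if not 0 < cut_site < len(parent):
        raise ValueError("cut_site must fall strictly inside the parent sequence")
    permuted = parent[cut_site:] + linker + parent[:cut_site]
    if add_start_met and not permuted.startswith("M"):
        permuted = "M" + permuted  # assumption: prepend Met when not already first
    return permuted


def is_rotation_of(parent: str, candidate: str) -> bool:
    """True if `candidate` is a pure rotation of `parent` (no linker, no edits)."""
    return len(parent) == len(candidate) and candidate in parent + parent
```

  • Comparing a permutant to its parent "without regard to the new N-terminus" corresponds to undoing the rotation before computing identity; is_rotation_of covers only the simplest case with no linker and no further modifications.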
  • the heterologous enzyme pathway comprises three or four UGT enzymes.
  • a first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165) (or derivative thereof as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 165 or derivative thereof (as described above).
  • a second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146) (or derivative as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 146 (or derivative as described above).
  • a third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117) (or derivative or circular permutant as described above).
  • UGT94-289-3 is replaced with another UGT enzyme capable of beta 1,2 glucosyltransferase activity (as described above), together with a fourth UGT enzyme.
  • the fourth UGT enzyme comprises an amino acid sequence that is at least 70% identical to CaUGT_1,6 (SEQ ID NO: 164) (or derivative as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 164 (or derivative as described above). Expression of these enzymes in the host cell converts mogrol to predominantly tetra- and pentaglycosylated products, including Mog. V. See FIG. 4, FIG. 8, FIG. 9.
  • the microbial host cell has one or more genetic modifications that increase the production of UDP-glucose, the co-factor employed by UGT enzymes.
  • These genetic modifications may include one or more, or two or more (or all) of ΔgalE, ΔgalT.
  • Mogrol glycosides can be recovered from the microbial culture.
  • mogrol glycosides may be recovered from microbial cells, or in some embodiments, are predominately transported into the extracellular media, where they may be recovered or sequestered.
  • the invention provides a method for making a pentaglycosylated or hexaglycosylated mogroside.
  • the mogroside is Mog V.
  • the invention comprises reacting a mogrol glycoside with a plurality of uridine diphosphate dependent glycosyltransferase (UGT) enzymes.
  • one UGT enzyme comprises an amino acid sequence that is at least 70% identical to SEQ ID NO: 164, where the UGT enzyme catalyzes beta 1,6 addition of a glucose.
  • the UGT enzyme comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 164 or a derivative thereof (described above).
  • the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164.
  • the UGT enzyme may comprise an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
  • the UGT enzyme is a circular permutant of SEQ ID NO: 164, or derivative thereof. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular mogroside substrates, such as Mog. IV or Siamenoside. Other UGT enzymes will be coexpressed to glycosylate the desired substrate to Mog. V.
  • the mogrol glycoside substrate comprises Mog. IIE.
  • the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C1 (SEQ ID NO: 165), or that is a circular permutant of SEQ ID NO: 165, including derivatives of UGT85C1 or its circular permutants as described.
  • the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165.
  • the UGT enzyme may comprise an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 165.
  • the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IA or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C2 (SEQ ID NO: 146), or a derivative or circular permutant of UGT85C2 as described herein.
  • the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146.
  • the UGT enzyme comprises an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 146.
  • the mogrol is reacted with about four UGT enzymes.
  • a first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or a derivative or circular permutant as described.
  • a second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or a derivative or circular permutant as described.
  • a third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or a derivative or circular permutant as described.
  • a fourth UGT enzyme is capable of catalyzing beta 1,2 addition of a glucose molecule, such as SgUGT94_289_3 (SEQ ID NO: 117) or a derivative or circular permutant as described.
  • the mogrol glycoside can be recovered and/or purified from the reaction or culture.
  • the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
  • the reaction is performed in a microbial cell, and UGT enzymes are recombinantly expressed in the cell.
  • mogrol is produced in the cell by a heterologous mogrol synthesis pathway, as described herein.
  • mogrol or mogrol glycosides are fed to the cells for glycosylation.
  • the reaction is performed in vitro using purified UGT enzyme, partially purified UGT enzyme, or recombinant cell lysates.
  • the microbial host cell can be prokaryotic or eukaryotic, and is optionally a bacterium selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida.
  • the microbial cell is a yeast selected from a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica.
  • the microbial host cell is E. coli.
  • the bacterial host cell is cultured to produce the triterpenoid product (e.g., mogroside).
  • carbon substrates such as C1, C2, C3, C4, C5, and/or C6 carbon substrates are employed for the production phase.
  • the carbon source is glucose, sucrose, fructose, xylose, and/or glycerol.
  • Culture conditions are generally selected from aerobic, microaerobic, and anaerobic.
  • the bacterial host cell may be cultured at a temperature between 22° C and 37° C. While commercial biosynthesis in bacteria such as E. coli can be limited by the temperature at which overexpressed and/or foreign enzymes (e.g., enzymes derived from plants) are stable, recombinant enzymes may be engineered to allow for cultures to be maintained at higher temperatures, resulting in higher yields and higher overall productivity.
  • the culturing is conducted at about 22° C or greater, about 23° C or greater, about 24° C or greater, about 25° C or greater, about 26° C or greater, about 27° C or greater, about 28° C or greater, about 29° C or greater, about 30° C or greater, about 31° C or greater, about 32° C or greater, about 33° C or greater, about 34° C or greater, about 35° C or greater, about 36° C or greater, or about 37° C.
  • the bacterial host cells are further suitable for commercial production, at commercial scale.
  • the size of the culture is at least about 100 L, at least about 200 L, at least about 500 L, at least about 1,000 L, or at least about 10,000 L, or at least about 100,000 L, or at least about 500,000 L, or at least about 600,000 L.
  • the culturing may be conducted in batch culture, continuous culture, or semi-continuous culture.
  • methods further include recovering the product from the cell culture or from cell lysates.
  • the culture produces at least about 100 mg/L, or at least about 200 mg/L, or at least about 500 mg/L, or at least about 1 g/L, or at least about 2 g/L, or at least about 5 g/L, or at least about 10 g/L, or at least about 20 g/L, or at least about 30 g/L, or at least about 40 g/L of the terpenoid or terpenoid glycoside product.
  • the production of indole (including prenylated indole) is used as a surrogate marker for terpenoid production, and/or the accumulation of indole in the culture is controlled to increase production.
  • accumulation of indole in the culture is controlled to below about 100 mg/L, or below about 75 mg/L, or below about 50 mg/L, or below about 25 mg/L, or below about 10 mg/L.
  • the accumulation of indole can be controlled by balancing protein expression and activity using the multivariate modular approach as described in U.S. Pat. No. 8,927,241 (which is hereby incorporated by reference), and/or is controlled by chemical means.
  • markers for efficient production of terpene and terpenoids include accumulation of DOX or ME in the culture media.
  • the bacterial strains may be engineered to accumulate less of these chemical species, which accumulate in the culture at less than about 5 g/L, or less than about 4 g/L, or less than about 3 g/L, or less than about 2 g/L, or less than about 1 g/L, or less than about 500 mg/L, or less than about 100 mg/L.
  • terpene or terpenoid production by manipulation of MEP pathway genes is not expected to be a simple linear or additive process. Rather, through combinatorial analysis, optimization is achieved through balancing components of the MEP pathway, as well as upstream and downstream pathways.
  • Indole (including prenylated indole) accumulation and MEP metabolite accumulation (e.g., DOX, ME, MEcPP, and/or farnesol) in the culture can be used as surrogate markers to guide this process.
  • the bacterial strain has at least one additional copy of dxs and idi expressed as an operon/module; or dxs, ispD, ispF, and idi expressed as an operon or module (either on a plasmid or integrated into the genome), with additional MEP pathway complementation described herein to improve MEP carbon.
  • the bacterial strain may have a further copy of dxr, and ispG and/or ispH, optionally with a further copy of ispE and/or idi, with expressions of these genes tuned to increase MEP carbon and/or improve terpene or terpenoid titer.
  • the bacterial strain has a further copy of at least dxr, ispE, ispG and ispH, optionally with a further copy of idi, with expressions of these genes tuned to increase MEP carbon and/or improve terpene or terpenoid titer.
  • Manipulation of the expression of genes and/or proteins, including gene modules, can be achieved through various methods.
  • expression of the genes or operons can be regulated through selection of promoters, such as inducible or constitutive promoters, with different strengths (e.g., strong, intermediate, or weak).
  • promoters of different strengths include Trc, T5 and T7.
  • expression of genes or operons can be regulated through manipulation of the copy number of the gene or operon in the cell.
  • expression of genes or operons can be regulated through manipulating the order of the genes within a module, where the genes transcribed first are generally expressed at a higher level.
  • expression of genes or operons is regulated through integration of one or more genes or operons into the chromosome.
  • optimization of protein expression can also be achieved through selection of appropriate promoters and ribosomal binding sites. In some embodiments, this may include the selection of high-copy number plasmids, or single-, low- or medium-copy number plasmids.
  • the step of transcription termination can also be targeted for regulation of gene expression, through the introduction or elimination of structures such as stem-loops.
  • Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction into the cells of heterologous DNA. The heterologous DNA is placed under operable control of transcriptional elements to permit the expression of the heterologous DNA in the host cell.
  • endogenous genes are edited, as opposed to gene complementation. Editing can modify endogenous promoters, ribosomal binding sequences, or other expression control sequences, and/or in some embodiments modifies trans-acting and/or cis-acting factors in gene regulation. Genome editing can take place using CRISPR/Cas genome editing techniques, or similar techniques employing zinc finger nucleases and TALENs. In some embodiments, the endogenous genes are replaced by homologous recombination. In some embodiments, genes are overexpressed at least in part by controlling gene copy number. While gene copy number can be conveniently controlled using plasmids with varying copy number, gene duplication and chromosomal integration can also be employed. For example, a process for genetically stable tandem gene duplication is described in US 2011/0236927, which is hereby incorporated by reference in its entirety.
  • the terpene or terpenoid product can be recovered by any suitable process, including partitioning the desired product into an organic phase or hydrophobic phase. Alternatively, the aqueous phase can be recovered, and/or the whole cell biomass can be recovered, for further processing.
  • the production of the desired product can be determined and/or quantified, for example, by gas chromatography (e.g., GC-MS).
  • the desired product can be produced in batch or continuous bioreactor systems. Production of product, recovery, and/or analysis of the product can be done as described in US 2012/0246767, which is hereby incorporated by reference in its entirety.
  • product oil is extracted from aqueous reaction medium using an organic solvent, such as an alkane (e.g., heptane or dodecane) or vegetable oil (e.g., safflower oil), followed by fractional distillation.
  • product oil is extracted from aqueous reaction medium using a hydrophobic phase, such as a vegetable oil, followed by organic solvent extraction and fractional distillation.
  • Terpene and terpenoid components of fractions may be measured quantitatively by GC/MS, followed by blending of fractions to generate a desired product profile.
  • sequence alignments can be carried out with several art-known algorithms, such as with the mathematical algorithm of Karlin and Altschul (Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90: 5873-5877), with hmmalign (HMMER package, http://hmmer.wustl.edu/) or with the CLUSTAL algorithm (Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673-80).
  • the grade of sequence identity may be calculated using, e.g., BLAST, BLAT or BlastZ (or BlastX).
  • BLASTN and BLASTP programs of Altschul et al (1990) J. Mol. Biol. 215: 403-410.
  • Gapped BLAST is utilized as described in Altschul et al (1997) Nucleic Acids Res. 25: 3389-3402.
  • Sequence matching analysis may be supplemented by established homology mapping techniques like Shuffle-LAGAN (Brudno M., Bioinformatics 2003b, 19 Suppl 1 : 154-162) or Markov random fields.
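The identity thresholds used throughout (at least 70%, 80%, and so on, relative to a reference SEQ ID NO) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by one of the tools named above and are supplied as equal-length strings with `-` marking gaps. The function names and the choice to count only ungapped columns are assumptions made for illustration; alignment tools differ in the denominator they report (for example, some divide by the full alignment length including gaps), so values may not match a given program's output.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are rows of a pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `meets_identity_threshold(row_a, row_b, 90.0)` would test the "at least 90% identical" tier for two aligned rows.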
  • Conservative substitutions may be made, for instance, on the basis of similarity in polarity, charge, size, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the amino acid residues involved.
  • the 20 naturally occurring amino acids can be grouped into the following six standard amino acid groups:
  • “conservative substitutions” are defined as exchanges of an amino acid by another amino acid listed within the same group of the six standard amino acid groups shown above. For example, the exchange of Asp by Glu retains one negative charge in the so modified polypeptide.
  • glycine and proline may be substituted for one another based on their ability to disrupt α-helices.
  • Some preferred conservative substitutions within the above six groups are exchanges within the following sub-groups: (i) Ala, Val, Leu and Ile; (ii) Ser and Thr; (iii) Asn and Gln; (iv) Lys and Arg; and (v) Tyr and Phe.
  • non-conservative substitutions are defined as exchanges of an amino acid by another amino acid listed in a different group of the six standard amino acid groups (1) to (6) shown above.
  • Modifications of enzymes as described herein can include conservative and/or non-conservative mutations.
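The conservative versus non-conservative distinction above can be sketched as a simple group lookup. The six-group table itself is not reproduced in this text, so the grouping below is an assumption: it is a commonly used six-group classification that is consistent with the examples given here (Asp/Glu, Gly/Pro, Lys/Arg, Tyr/Phe, Ser/Thr, Asn/Gln). The names `ASSUMED_GROUPS` and `is_conservative` are hypothetical.

```python
# Assumed six-group classification (not reproduced in the text above).
ASSUMED_GROUPS = {
    1: {"Met", "Ala", "Val", "Leu", "Ile"},   # hydrophobic
    2: {"Cys", "Ser", "Thr", "Asn", "Gln"},   # neutral hydrophilic
    3: {"Asp", "Glu"},                         # acidic
    4: {"His", "Lys", "Arg"},                  # basic
    5: {"Gly", "Pro"},                         # influence chain orientation
    6: {"Trp", "Tyr", "Phe"},                  # aromatic
}

# Reverse lookup: residue -> group number.
GROUP_OF = {res: num for num, members in ASSUMED_GROUPS.items() for res in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same assumed group."""
    try:
        return GROUP_OF[original] == GROUP_OF[replacement]
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None
```

Under this assumed table, `is_conservative("Asp", "Glu")` is true and `is_conservative("Asp", "Lys")` is false.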
  • “rational design” is involved in constructing specific mutations in enzymes.
  • Rational design refers to incorporating knowledge of the enzyme, or related enzymes, such as its reaction thermodynamics and kinetics, its three dimensional structure, its active site(s), its substrate(s) and/or the interaction between the enzyme and substrate, into the design of the specific mutation. Based on a rational design approach, mutations can be created in an enzyme which can then be screened for increased production of a terpene or terpenoid relative to control levels. In some embodiments, mutations can be rationally designed based on homology modeling. As used herein, “homology modeling” refers to the process of constructing an atomic resolution model of one protein from its amino acid sequence and a three-dimensional structure of a related homologous protein.
  • the invention provides a method for making a product comprising a mogrol glycoside.
  • the method comprises producing a mogrol glycoside in accordance with this disclosure, and incorporating the mogrol glycoside into a product.
  • the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
  • the product is a sweetener composition, flavoring composition, food, beverage, chewing gum, texturant, pharmaceutical composition, tobacco product, nutraceutical composition, or oral hygiene composition.
  • the product may be a sweetener composition comprising a blend of artificial and/or natural sweeteners.
  • the composition may further comprise one or more of a steviol glycoside, aspartame, and neotame.
  • exemplary steviol glycosides comprise one or more of RebM, RebB, RebD, RebA, RebE, and RebI.
  • Non-limiting examples of flavors for which the products can be used in combination include lime, lemon, orange, fruit, banana, grape, pear, pineapple, mango, bitter almond, cola, cinnamon, sugar, cotton candy and vanilla flavors.
  • Non-limiting examples of other food ingredients include flavors, acidulants, and amino acids, coloring agents, bulking agents, modified starches, gums, texturizers, preservatives, antioxidants, emulsifiers, stabilizers, thickeners and gelling agents.
  • Mogrol glycosides obtained according to this invention may be incorporated as a high intensity natural sweetener in foodstuffs, beverages, pharmaceutical compositions, cosmetics, chewing gums, table top products, cereals, dairy products, toothpastes and other oral cavity compositions, etc.
  • Mogrol glycosides obtained according to this invention can be used in combination with various physiologically active substances or functional ingredients.
  • Functional ingredients generally are classified into categories such as carotenoids, dietary fiber, fatty acids, saponins, antioxidants, nutraceuticals, flavonoids, isothiocyanates, phenols, plant sterols and stanols (phytosterols and phytostanols); polyols; prebiotics, probiotics; phytoestrogens; soy protein; sulfides/thiols; amino acids; proteins; vitamins; and minerals.
  • Functional ingredients also may be classified based on their health benefits, such as cardiovascular, cholesterol-reducing, and anti-inflammatory.
  • Mogrol glycosides obtained according to this invention may be applied as a high intensity sweetener to produce zero calorie, reduced calorie or diabetic beverages and food products with improved taste characteristics. It may also be used in drinks, foodstuffs, pharmaceuticals, and other products in which sugar cannot be used.
  • highly purified target mogrol glycoside(s), particularly Mog. V, Mog. VI, or Isomog. V, can be used as a sweetener not only for drinks, foodstuffs, and other products dedicated for human consumption, but also in animal feed and fodder with improved characteristics.
  • Examples of products in which mogrol glycoside(s) may be used as a sweetening compound include, but are not limited to, alcoholic beverages such as vodka, wine, beer, liquor, and sake, etc.; natural juices; refreshing drinks; carbonated soft drinks; diet drinks; zero calorie drinks; reduced calorie drinks and foods; yogurt drinks; instant juices; instant coffee; powdered types of instant beverages; canned products; syrups; fermented soybean paste; soy sauce; vinegar; dressings; mayonnaise; ketchups; curry; soup; instant bouillon; powdered soy sauce; powdered vinegar; types of biscuits; rice biscuit; crackers; bread; chocolates; caramel; candy; chewing gum; jelly; pudding; preserved fruits and vegetables; fresh cream; jam; marmalade; flower paste; powdered milk; ice cream; sorbet; vegetables and fruits packed in bottles; canned and boiled beans; meat and foods boiled in sweetened sauce; agricultural vegetable food products; seafood; ham; sausage; fish ham; fish sausage; fish paste; deep
  • the conventional methods such as mixing, kneading, dissolution, pickling, permeation, percolation, sprinkling, atomizing, infusing and other methods may be used.
  • mogroside V has a sweetening capacity that is about 250 times that of sucrose (Kasai et al, Agric Biol Chem (1989)). Mogrosides are reported to have health benefits as well (Li et al., Chin J Nat Med (2014)).
  • Mog. V has been approved as a high-intensity sweetening agent in Japan (Jakinovich et al., Journal of Natural Products (1990)) and the extract has gained GRAS status in the USA as a non-nutritive sweetener and flavor enhancer (GRAS 522). Extraction of mogrosides from the fruit can yield a product of varying degrees of purity, often accompanied by undesirable aftertaste. In addition, yields of mogroside from cultivated fruit are limited due to low plant yields and particular cultivation requirements of the plant. Mogrosides are present at ~1% in the fresh fruit and ~4% in the dried fruit. Mog. V is the main component, with a content of 0.5%-1.4% in the dried fruit.
  • FIG. 1 shows the chemical structures of Mog. V, Mog. VI, and Isomog. V. Mog. V has five glucosylations with respect to the mogrol core, including glucosylations at the C3 and C24 hydroxyl groups, followed by 1-2, 1-4, and 1-6 glucosyl additions. These glucosylation reactions are catalyzed by uridine diphosphate-dependent glycosyltransferase enzymes (UGTs).
  • FIG. 2 shows routes to Mog. V production in vivo. The enzymatic transformation required for each step is indicated, along with the type of enzyme required. Numbers in parentheses correspond to the chemical structures in FIG.
  • mogrosides can be produced by biosynthetic fermentation processes, using microbial strains that produce high levels of MEP pathway products, along with heterologous expression of mogrol biosynthesis enzymes and UGT enzymes that direct glucosylation reactions to Mog. V, or other desired mogroside compound.
  • IPP isopentenyl pyrophosphate
  • DMAPP dimethylallyl pyrophosphate
  • FPP farnesyl diphosphate
  • FPPS recombinant farnesyl diphosphate synthase
  • FPP is converted to squalene (2) by a condensation reaction catalyzed by squalene synthase (SQS).
  • Squalene is converted to 2,3-oxidosqualene (3) by an epoxidation reaction catalyzed by a squalene epoxidase (SQE).
  • the pathway can proceed to 22,23-dioxidosqualene (4) by further epoxidation followed by cyclization to 24,25-epoxycucurbitadienol (5) by a triterpene cyclase, and then hydration of the remaining epoxy group to 24,25-dihydroxycucurbitadienol (6) by an epoxide hydrolase.
  • a further hydroxylation catalyzed by a P450 oxidase produces mogrol (7).
  • the pathway can alternatively proceed by cyclization of (3) to produce cucurbitadienol (9), followed by epoxidation to (5), or multiple hydroxylations of cucurbitadienol to (6), or mogrol (7).
  • FIG. 4 illustrates glucosylation routes to Mog. V, and indicates in vitro biotransformation activity observed for different enzymes.
  • Glucosylation of the C3 hydroxyl produces Mog. I-E
  • glucosylation of the C24 hydroxyl produces Mog. I-A1
  • Glucosylation of Mog. I-A1 at C3 or glucosylation of Mog. I-E1 at C24 produces Mog. II-E.
  • Further 1-6 glucosylation of Mog. II-E at C3 produces Mog. III-A2.
  • Further 1-6 glucosylation at C24 of Mog. II-E produces Mog. III. 1-2 glucosylation of Mog. III-A2 at C24 produces Mog. IV, and then to Mog.
  • glucosylations may proceed through Mog. III, with a 1-6 glucosylation at C3 and a 1-2 glucosylation at C24, or through Siamenoside or Mog. IV with 1-6 glucosylations.
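The glucosylation steps listed above form a small directed graph, which can be sketched as follows. Only the steps stated explicitly in the text are encoded, ending at Mog. III and Mog. IV; the later steps toward Mog. V are left out because they are not fully spelled out here. Treating "Mog. I-E" and "Mog. I-E1" as the same compound is an assumption, and the names `ROUTES` and `enumerate_routes` are hypothetical.

```python
# Each entry maps a substrate to (product, step description) pairs,
# taken from the steps stated in the text above.
ROUTES = {
    "mogrol": [("Mog. I-E", "glucosylation at C3"),
               ("Mog. I-A1", "glucosylation at C24")],
    "Mog. I-A1": [("Mog. II-E", "glucosylation at C3")],
    "Mog. I-E": [("Mog. II-E", "glucosylation at C24")],  # assumes I-E == I-E1
    "Mog. II-E": [("Mog. III-A2", "1-6 glucosylation at C3"),
                  ("Mog. III", "1-6 glucosylation at C24")],
    "Mog. III-A2": [("Mog. IV", "1-2 glucosylation at C24")],
}


def enumerate_routes(start: str, target: str, graph=ROUTES):
    """Return every route from start to target as a list of (step, product) pairs."""
    routes = []

    def walk(node, path):
        if node == target:
            routes.append(path)
            return
        for product, step in graph.get(node, []):
            walk(product, path + [(step, product)])

    walk(start, [])
    return routes


if __name__ == "__main__":
    for route in enumerate_routes("mogrol", "Mog. IV"):
        print(" -> ".join(product for _, product in route))
```

Enumerating routes this way makes it easy to see which UGT activities (C3, C24, 1-2, 1-6) a coexpressed enzyme set must cover to reach a chosen intermediate.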
  • biosynthetic enzymes from monkfruit Siraitia grosvenorii
  • many of these enzymes lack the productivity or physical properties desired for overexpression in microbial hosts, particularly for fermentation approaches that operate at higher temperatures than the natural climate of the plant. Accordingly, alternative enzymes are desired to improve production of mogrol using microbial fermentation, with mogrol acting as the substrate for glucosylation to produce Mog. V.
  • FIG. 7 shows coexpression of SQS, SQE, and TTC enzymes.
  • Siraitia grosvenorii CDS or triterpene cyclase, or “TTC”
  • AaSQS and M1SQE resulted in high production of the triterpenoid product, cucurbitadienol (Product 3).
  • Mogrol was used as a substrate for in vitro glucosylation reactions with candidate UGT enzymes, to identify candidate enzymes that provide efficient glucosylation of mogrol to Mog. V.
  • Reactions were carried out in 50 mM Tris-HCl buffer (pH 7.0) containing beta-mercaptoethanol (5 mM), magnesium chloride (400 uM), substrate (200 uM), UDP-glucose (5 mM), and a phosphatase (1 U). Results are shown in FIG. 8A.
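As a worked illustration of setting up such a reaction, the sketch below computes the volume of each stock solution from C1·V1 = C2·V2. The final concentrations are those stated above; the stock concentrations, the 100 µL reaction volume, and all names in the code are assumptions chosen only for the example.

```python
# Final concentrations from the text, in mM (400 uM = 0.4 mM, 200 uM = 0.2 mM).
TARGETS_MM = {
    "Tris-HCl, pH 7.0": 50.0,
    "beta-mercaptoethanol": 5.0,
    "magnesium chloride": 0.4,
    "substrate": 0.2,
    "UDP-glucose": 5.0,
}

# Hypothetical stock concentrations in mM (assumptions, not from the text).
ASSUMED_STOCKS_MM = {
    "Tris-HCl, pH 7.0": 1000.0,
    "beta-mercaptoethanol": 100.0,
    "magnesium chloride": 10.0,
    "substrate": 10.0,
    "UDP-glucose": 100.0,
}


def stock_volumes_ul(final_volume_ul: float,
                     targets=TARGETS_MM,
                     stocks=ASSUMED_STOCKS_MM) -> dict:
    """Volume of each stock (in uL) via V_stock = C_final * V_final / C_stock."""
    volumes = {name: targets[name] * final_volume_ul / stocks[name]
               for name in targets}
    remainder = final_volume_ul - sum(volumes.values())
    if remainder < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    # Remaining volume is available for enzyme, phosphatase, and water.
    volumes["enzyme, phosphatase, and water"] = remainder
    return volumes


if __name__ == "__main__":
    for name, vol in stock_volumes_ul(100.0).items():
        print(f"{name}: {vol:.1f} uL")
```

With these assumed stocks, the listed components take up 21 µL of a 100 µL reaction, leaving 79 µL for the enzyme preparation, phosphatase, and water.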
  • Mog. V product is observed when the UGT enzymes 85C1 (S. rebaudiana), 85C2 (S. rebaudiana), and UGTSg94_3 are incubated together.
  • FIG. 8B shows the extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogroside II-E.
  • FIG. 4 and FIG. 9 show additional glycosyltransferase activities observed on particular substrates. Coexpression of UGT enzymes can be selected to move product to any desired mogroside product.
  • FIG. 10 is an amino acid alignment of CaUGT_1,6 and SgUGT94_289_3 using Clustal Omega (Version CLUSTAL O (1.2.4)). These sequences share 54% amino acid identity.
  • Coffea arabica UGT_1,6 is predicted to be a beta-D-glucosyl crocetin beta 1,6-glucosyltransferase-like (XP_027096357.1).
  • CaUGT_1,6 can be further engineered for microbial expression and activity, including engineering of a circular permutant.
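A circular permutant can be pictured as joining the original N- and C-termini (optionally through a linker) and opening the chain at a new position. The following is a minimal sketch of that generic idea at the sequence level only. The function name, the 0-based cut index, and the optional linker are assumptions for illustration; the specific cut sites, linkers, and terminal trimming used for the enzymes described here are not given in this section, and a practical design would also need a start codon at the new N-terminus.

```python
def circular_permutant(sequence: str, new_start: int, linker: str = "") -> str:
    """Return a circularly permuted sequence.

    new_start: 0-based index of the residue that becomes the new N-terminus.
    linker:    optional residues joining the original C- and N-termini.
    """
    if not 0 < new_start < len(sequence):
        raise ValueError("new_start must fall strictly inside the sequence")
    # Original C-terminal segment first, then linker, then original N-terminal segment.
    return sequence[new_start:] + linker + sequence[:new_start]


if __name__ == "__main__":
    toy = "ABCDEFGH"  # toy sequence, not a real protein
    print(circular_permutant(toy, 3))          # DEFGHABC
    print(circular_permutant(toy, 3, "GSG"))   # DEFGHGSGABC
```

Without a linker the permutant has the same length and composition as the parent, so percent identity to the parent must be assessed after accounting for the rearranged order.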
  • FIG. 11 is an amino acid alignment of Homo sapiens squalene synthase (HsSQS) (NCBI accession NP_004453.3) and AaSQS (SEQ ID NO: 11) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQS has a published crystal structure (PDB entry: 1EZF). These sequences share 42% amino acid identity.
  • HsSQE Homo sapiens squalene epoxidase
  • M1SQE SEQ ID NO: 39
  • Clustal Omega Version CLUSTAL O (1.2.4)
  • Saccharomyces cerevisiae FPPS (SEQ ID NO: 1)
  • Cucumis sativus (SEQ ID NO: 4)
  • Cucurbita moschata (SEQ ID NO: 7)
  • Sechium edule (SEQ ID NO: 8)
  • Panax quinquefolius (SEQ ID NO: 9)
  • Diospyros kaki (SEQ ID NO: 13) MGSLAAMLRHPDDVYPLVKLKMAARHAEKQIPPEPHWAFCYTMLHKVSRSFGLVIQQLGTELRN AVCIFYLVLRALDTVEDDTSIATEVKVPILLAFHHHIYDRDWHFSCGTREYKVLMDEFHHVSTA FLELGKGYQEAIEDITMRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGLEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVNKLEDLKYEKNSVKSVQCL NDMVTNALIHVDDCLKYMSALRDPAIFRFCAIPQIMAIGTLALCYNNIEVFRGWKMRRGLTAK VIDQTKTISDVYGAFFDFSCMLKSKVEKNDPNSTKTLSRIEAIQKTCRESGTLSKRKSYILRSK RTHNSTLIFVLFI ILAILFAYLSANRPPINM
  • Eleutherococcus senticosus SEQ ID NO: 16
  • Flavobacteriales bacterium (SEQ ID NO: 166) MLNNSLFSRLEEIPALLKLKLGSKDYYKNNNSETLTCDNLRYCFDTLNKVSRSFATVIKQLPNE LGNNVCVFYLILRALDSIEDDMNLPKELKIKLLREFHKKNYESGWNISGVGDKKEHVELLENYD KVIQSFLAIDQKNQLI ITDICRKVGAGMANFVKAEIESVEDYNLYCHHVAGLVGIGLSRMFISS GLENDDFLNQDEISNSMGLFLQKTNIVRDYREDLDEGRMFWPKDIWHVYGSKINDFAINPTHDQ SVLCLNHMLNNALTHATDCLAYLKHLRNENIFKFCAIPQVMAMATLCKIYSNPDVFIKNVKIRK GLAAKLILNTTSMDEVIKVYKDMLLVIESKISSDNNPVSAETIQLLKQIREYFNDETLIVRKIA
  • Juglans regia (JrSQE1) (SEQ ID NO: 28) MVDPYALGWSFASVLMGLVALYILVDKKNRSRVSSEARSEGVESVTTTTSGECRLTDGDADVI I
  • JrSQE2 Juglans regia (JrSQE2) (SEQ ID NO: 31)
  • Theobroma cacao (SEQ ID NO: 32)
  • Sorghum bicolor SEQ ID NO: 36
  • Medicago sativa (SEQ ID NO: 38) MDLYNIGWILSSVLSLFALYNLIFSGKRNYHDVNDKVKDSVTSTDAGDIQSEKLNGDADVI IVG AGIAGAALAHTLGKDGRRVHI IERDLSEPDRIVGELLQPGGYLKLVELGLQDCVDNIDAQRVFG YALFKDGKHTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNVNMEQGTVISLLEEKGTI KGVQYKNKDGQALTAYAPLTIVCDGCFSNLRRSLCNPKVDNPSCFVGLILENCELPCANHGHVI LGDPSPILFYPISSTEIRCLVDVPGTKVPSISNGDMTKYLKTTVAPQVPPELYDAFIAAVDKGN IRTMPNRSMPADPRPTPGAVLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPMRDLNDAPTLC KYLESFYTLRKPVASTINTLAGALYKVFSASPDEARKEMR
  • Methyloprofundus sediment SEQ ID NO: 169
  • CDS Cucurbitadienol Synthase
  • TTP Triterpene Synthase
  • CcCDS2 Citrullus colocynthis (CcCDS2) (SEQ ID NO: 44)
  • Citrullus lanatus subsp. vulgaris (SEQ ID NO: 48)
  • Theobroma cacao (SEQ ID NO: 49)
  • Prunus avium (SEQ ID NO: 51)
  • Trigonella foenum-graecum SEQ ID NO: 54
  • Ricinus communis (SEQ ID NO: 55)
  • EPH3 Siraitia grosvenorii EPH3 (SgEPH3) (SEQ ID NO: 58) MDQIEHITINTNGIKMHIASVGTGPWLLLHGFPELWYSWRHQLLYLSSVGYRAIAPDLRGYGD TDSPASPTSYTALHIVGDLVGALDELGIEKVFLVGHDWGAI IAWYFCLFRPDRIKALVNLSVQF IPRNPAIPFIEGFRTAFGDDFYMCRFQVPGEAEEDFASIDTAQLFKTSLCNRSSAPPCLPKEIG FRAIPPPENLPSWLTEEDINYYAAKFKQTGFTGALNYYRAFDLTWELTAPWTGAQIQVPVKFIV GDSDLTYHFPGAKEYIHNGGFKKDVPLLEEWWKDACHFINQERPQEINAHIHDFINKF
  • Prunus persica (SEQ ID NO: 62)
  • Ricinus communis (SEQ ID NO: 64)
  • Camelina sativa (SEQ ID NO: 68) MEKIEHTTVSTNGINMHVASIGSGPVILFLHGFPDLWYSWRHQLLSFAALGYRAIAPDLRGYGD SDAPPSPESYTILHIVGDLVGLLDSLGVDRVFLVGHDWGAIVAWWLCMIRPDRVKALVNTSWF NPRNPSVKPVDKFRDLFGDDYYVCRFQETGEIEEDFAQVDTKKLITRFFVSRNPRPPCIPKSVG FRGLPDPPSLPAWLTEQDVSFYGDKFSQKGFTGGLNYYRAMNLSWELTAPWAGLQIKVPVKFIV GDLDITYNIPGTKEYIHGGGLKKHVPFLQEVWMEGVGHFLQQEKPDEVTDHIYGFFEKFRTRE
  • Punica granatum SEQ ID NO: 70
  • Arabidopsis lyrata subsp. lyrata (SEQ ID NO: 71) MEKIEHTTVSTNGINMHVASIGSGPVILFLHGFPDLWYSWRHQLLSFAALGYRAIAPDLRGYGD SDAPPSRESYTILHIVGDLVGLLNSLGVDRVFLVGHDWGAIVAWWLCMIRPDRVNALVNTSWF NPRNPSVKPVDAFRALFGDDYYICRFQEPGEIEEDFAQVDTKKLITRFFISRNPRPPCIPKSVG FRGLPDPPSLPAWLTEEDVSFYGDKFSQKGFTGGLNYYRALNLSWELTAPWAGLQIKVPVKFIV GDLDITYNIPGTKEYIHEGGLKKHVPFLQEVWLEGVGHFLHQEKPDEITDHIYGFFKKFRTRE TASL
  • Rhinolophus sinicus SEQ ID NO: 72
  • Cucumis sativus (SEQ ID NO: 76) MWTILLGLATLAIAYYIHWVNKWKDSKFNGVLPPGTMGLPLIGETIQLSRPSDSLDVHPFIQRK VKRYGPIFKTCLAGRPVWSTDAEFNHYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGLIHK YIRSITLNHFGAESLRERFLPRIEESARETLHYWSTQTSVEVKESAAAMVFRTSIVKMFSEDSS KLLTEGLTKKFTGLLGGFLTLPLNLPGTTYHKCIKDMKQIQKKLKDILEERLAKGVKIDEDFLG
  • Cucurbita moschata (SEQ ID NO: 77)
  • Prunus avium (SEQ ID NO: 78)
  • Populus trichocarpa (SEQ ID NO: 79)
  • Prunus persica (SEQ ID NO: 80) MWTLVGLSLVGLLVIYFTHWI IKWRNPKCNGVLPPGSMGLPFIGETLNLIIPSYSLDLHPFIKK
  • MWKVGLCWGVIWWFTR WINKWRNPKCNGILPPGSMGPPLIGESLQLI IPSYSLDLHPFIKKR VQRYGPIFRTSWGQPMWSTDVEFNHYLAKQEGRLVHFWYLDSFAEIFNLEDENAISAVGLIH KYGRSIVLNHFGTDSLKKTLLSQIEEIVNKTLQTWSSLPSVEVKHAASVMAFDLTAKQCFGYDV ENSAVKMSEKFLYTLDSLISFPFNIPGTVYHKCLKDKKEVLNMLRNIVKERMNSPEKYRGDFLD QITADMNKESFLTQDFIVYLLYGLLFASFESISASLSLTLKLLAEHPAVLQQLTAEHEAILKNR DNPNSSLTWDEYKSMTFTFQVINEALRLGNVAPGLLRRALKDIEFKGYTIPAGWTIMLANSAIQ LNPNTYEDPLAFNPWRWQDLDPQIVSKNFMPFGGGIRQCAGAEYSKTFLATFLHVLVTKYRWTK
  • Jatropha curcas JcP450.1 (SEQ ID NO: 85)
  • Jatropha curcas JcP450.2 (SEQ ID NO: 87)
  • Arabidopsis thaliana CPR1 (AtCPR1) (SEQ ID NO: 93)
  • Arabidopsis thaliana (AtCPR3) (SEQ ID NO: 95)
  • AaCPR Artemisia annua CPR (SEQ ID NO: 98)
  • CPR PgCPR (SEQ ID NO: 99) MAQSSSGSMSPFDFMTAI IKGKMEPSNASLGAAGEVTAMILDNRELVMILTTSIAVLIGCVWF IWRRSSSQTPTAVQPLKPLLAKETESEVDDGKQKVTIFFGTQTGTAEGFAKALADEAKARYDKV TFKWDLDDYAADDEEYEEKLKKETLAFFFLATYGDGEPTDNAARFYKWFLEGKERGEWLQNLK FGVFGLGNRQYEHFNKIAIWDEILAEQGGKRLISVGLGDDDQCIEDDFTAWRESLWPELDQLL RDEDDTTVSTPYTAAVLEYRWFHDPADAPTLEKSYSNANGHSWDAQHPLRANVAVRRELHTP ASDRSCTHLEFDISGTGIAYETGDHVGVYCENLAETVEEALELLGLSPDTYFSVHADKEDGTPL SGSSLPPPFPPCTLRTALTLHADLLSSPKKSALLALAAHASDPTEADRLR
  • Acetobacter pasteurianus subsp. ascendens (ApGA2ox) (SEQ ID NO: 100)
  • Dendrobium catenatum (DcGA3ox) (SEQ ID NO: 102) MPSLSKEHFDLYSAFHVPETHAWSSSHLHDHPIAGDGATIPVIDISDPDAASMVGGACRSWGVF YATSHGIPADLLHQVESHARRLFSLPLHRKLQTAPRDGSLSGYGRPPISAFFPKLMWSEGFTLA GHDDHLAVTSQLSPFDSLSFCEVMEAYRKEMKKLAGRLFRLLILSLGLEEEEMGQVGPLKELSQ AADAIQLNSYPTCPEPERAIGMAAHTDSAFLTVLHQTDGAGGLQVLRDQDESGSARWVDVLPRP DCLWNVGDLLHILSNGRFKSVRHRAWNRADHRISAAYFIGPPAHMKVGSITKLVDMRTGPMY
  • Arabidopsis thaliana (AtF3H) (SEQ ID NO: 106)
  • DsH6H Datura stramonium
  • Arabidopsis thaliana (SEQ ID NO: 109)
  • Catharanthus roseus (CrD4Hlike) (SEQ ID NO: 112)
  • HvIDS2 Hordeum vulgare subsp. vulgare
  • HvIDS3 Hordeum vulgare subsp. vulgare (HvIDS3) (SEQ ID NO: 115)
  • Uridine diphosphate dependent glycosyltransferase (UGT)
  • UGT720-269-1 Siraitia grosvenorii UGT720-269-1 (SEQ ID NO: 116) MEDRNAMDMSRIKYRPQPLRPASMVQPRVLLFPFPALGHVKPFLSLAELLSDAGIDWFLSTEY NHRRISNTEALASRFPTLHFETIPDGLPPNESRALADGPLYFSMREGTKPRFRQLIQSLNDGRW PITCI ITDIMLSSPIEVAEEFGIPVIAFCPCSARYLSIHFFIPKLVEEGQIPYADDDPIGEIQG VPLFEGLLRRNHLPGSWSDKSADISFSHGLINQTLAAGRASALILNTFDELEAPFLTHLSSIFN KIYTIGPLHALSKSRLGDSSSSASALSGFWKEDRACMSWLDCQPPRSWFVSFGSTMKMKADEL REFWYGLVSSGKPFLCVLRSDWSGGEAAELIEQMAEEEGAGGKLGMWEWAAQEKVLSHPAVG GFLTHCGWNSTVESIAAGVPMMCWPILGDQPSNATWI
  • Cucurbita moschata 1 (CmoUGT1) (SEQ ID NO: 132)
  • Cucurbita moschata 2 (CmoUGT2) (SEQ ID NO: 133)
  • Theobroma cacao (SEQ ID NO: 136)
  • Corchorus capsularis (SEQ ID NO: 137) MDSKQKKMSVLMFPWLAYGHISPFLELAKKLSKRNFHTFFFSTPINLNSIKSKLSPKYAQSIQF VELHLPSLPDLPPHYHTTNGLPPHLMNTLKKAFDMSSLQFSKILKTLNPDLLVYDFIQPWAPLL ALSNKIPAVHFLCTSAAMSSFSVHAFKKPCEDFPFPNIYVHGNFMNAKFNNMENCSSDDSISDQ DRVLQCFERSTKI ILVKTFEELEGKFMDYLSVLLNKKIVPTGPLTQDPNEDEGDDDERTKLLLE WLNKKSKSSTVFVSFGSEYFLSKEEREEIAYGLELSKVNFIWVIRFPLGENKTNLEEALPQGFL QRVSERGLWENWAPQAKILQHSSIGGFVSHCGWSSVMESLKFGVPI IAIPMHLDQPLNARLW DVGVGLEVIRNHGSLEREEIAKLIKEWLGNGNDGEIVRR
  • Ziziphus jujube (SEQ ID NO: 138)
  • Vitis vinifera (SEQ ID NO: 139)
  • FILDNDPQDERISNLPTHGPLAGMRIPI INEHGADELRRELELLMLASEEDEEVSCLITDALWY FAQSVADSLNLRRLVLMTSSLFNFHAHVSLPQFDELGYLDPDDKTRLEEQASGFPMLKVKDIKS AYSNWQILKEILGKMIKQTKASSGVIWNSFKELEESELETVIREIPAPSFLIPLPKHLTASSSS LLDHDRTVFQWLDQQPPSSVLYVSFGSTSEVDEKDFLEIARGLVDSKQSFLWWRPGFVKGSTW VEPLPDGFLGERGRIVKWVPQQEVLAHGAIGAFWTHSGWNSTLESVCEGVPMIFSDFGLDQPLN ARYMSDVLKVGVYLENGWERGEIANAIRRVMVDEEGEYIRQNARVLKQKADVSLMKGGSSYESL ESLVSYISSL
  • Arabidopsis thaliana AAN72025.1 (SEQ ID NO: 151)
  • Arabidopsis thaliana AAF87256.1 (SEQ ID NO: 152)
  • Neisseria gonorrhoeae Q5F735 (SEQ ID NO: 155)
  • Rhizobium meliloti strain 1021
  • ExoM P33695 SEQ ID NO: 156
  • Rhizobium radiobacter Q44418 SEQ ID NO: 157
  • Streptococcus agalactiae cpsl 087183 (SEQ ID NO: 158)
  • Streptococcus pneumoniae cps3S Q54611 (SEQ ID NO: 159)
  • MENKTETTVRRRRRI ILFPVPFQGHINPILQLANVLYSKGFSITIFHTNFNKPKTSNYPHFTFR FILDNDPQDERISNLPTHGPLAGMRIPI INEHGADELRRELELLMLASEEDEEVSCLITDALWY FAQSVADSLNLRRLVLMTSSLFNFHAHVSLPQFDELGYLDPDDKTRLEEQASGFPMLKVKDIKS AYSNWQILKEILGKMIKQTKASSGVIWNSFKELEESELETVIREIPAPSFLIPLPKHLTASSSS LLDHDRTVFQWLDQQPPSSVLYVSFGSTSEVDEKDFLEIARGLVDSKQSFLWWRPGFVKGSTW VEPLPDGFLGERGRIVKWVPQQEVLAHGAIGAFWTHSGWNSTLESVCEGVPMIFSDFGLDQPLN ARYMSDVLKVGVYLENGWERGEIANAIRRVMVDEEGEYIRQNARVLKQKADVSLMKGGSSYESL

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biotechnology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Nutrition Science (AREA)
  • Food Science & Technology (AREA)
  • Polymers & Plastics (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Coloring Foods And Improving Nutritive Qualities (AREA)
  • Medicinal Preparation (AREA)
  • Steroid Compounds (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

The present invention provides host cells and methods for making mogrol glycosides, including Mogroside V (Mog. V), Mogroside VI (Mog. VI), Iso-Mogroside V (Isomog. V), and glycosylation products that are minor products in Siraitia grosvenorii. The invention provides engineered enzymes and engineered host cells for producing mogrol glycosylation products, such as Mog. V, Mog. VI, and Isomog. V, at high purity and/or yield. The present technology further provides methods of making products containing mogrol glycosides, such as Mog. V, Mog. VI, and Isomog. V, including food products, beverages, oral care products, sweeteners, and flavoring products.

Description

MICROBIAL PRODUCTION OF TRITERPENOIDS INCLUDING
MOGROSIDES
BACKGROUND
Mogrosides are triterpene-derived specialized secondary metabolites found in the fruit of the Cucurbitaceae family plant Siraitia grosvenorii (a/k/a monkfruit or Luo Han Guo). Their biosynthesis in fruit involves a number of consecutive glycosylations of the aglycone mogrol to the final sweet product Mogroside V (Mog. V). The food industry is increasing its use of mogroside fruit extract as a natural non-sugar food sweetener. For example, Mog. V has a sweetening capacity that is about 250 times that of sucrose (Kasai et al., Agric Biol Chem (1989)). Moreover, additional health benefits of mogrosides have been revealed in recent studies (Li et al., Chin J Nat Med (2014)).
A variety of factors are promoting a surge in interest in research and commercialization of the mogrosides and monkfruit in general, including, for example, the explosion in popularity of and demand for natural sweeteners; the difficulties in scalable sourcing of the current lead natural sweetener, rebaudioside M (RebM), from the Stevia plant; the superior taste performance of mogroside V relative to other natural and artificial sweetener products on the market; and the medicinal potential of the plant and fruit.
Purified Mog. V has been approved as a high-intensity sweetening agent in Japan (Jakinovich et al. , Journal of Natural Products (1990)) and the extract has gained GRAS status in the USA as a non-nutritive sweetener and flavor enhancer (GRAS 522). Extraction of mogrosides from the fruit can yield a product of varying degrees of purity, often accompanied by undesirable aftertaste. In addition, yields of mogroside from cultivated fruit are limited due to low plant yields and particular cultivation requirements of the plant. Mogrosides are present at about 1% in the fresh fruit and about 4% in the dried fruit (Li HB, et al, 2006). Mog. V is the main component, with a content of 0.5% to 1.4% in the dried fruit. Moreover, purification difficulties limit purity for Mog. V, with commercial products from plant extracts being standardized to about 50% Mog. V. It is highly likely that a pure Mog. V product will achieve greater commercial success than the blend, since it is less likely to have off flavors, will be easier to formulate into products, and has good solubility potential. It is therefore advantageous to be able to produce sweet mogroside compounds via biotechnological processes.
SUMMARY
The present invention, in various aspects and embodiments, provides a method for making mogrol glycosides, as well as other triterpenoid compounds, using recombinant microbial processes. In other aspects, the invention provides methods for making products, including foods, beverages, and sweeteners (among others), by incorporating the mogrol glycosides produced according to the methods described herein. In one aspect, the invention provides a method for making a triterpenoid compound. The method comprises providing a recombinant microbial host cell expressing a heterologous enzyme pathway catalyzing the conversion of isopentenyl pyrophosphate (IPP) and/or dimethylallyl pyrophosphate (DMAPP) to one or more triterpenoid compounds. The heterologous enzyme pathway comprises a farnesyl diphosphate synthase (FPPS) and a squalene synthase (SQS), which are recombinantly expressed. In various embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 2 to 16, 166, and 167. The host cell is cultured under conditions for producing the triterpenoid.
The microbial host cell in various embodiments may be prokaryotic or eukaryotic. In some embodiments, the microbial host cell is a bacterium such as Escherichia coli, or the microbial cell may be a yeast cell. In some embodiments, the host cell is a bacterial or yeast host cell engineered to increase production of IPP and DMAPP from glucose.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11). AaSQS has high activity in E. coli. Other SQS enzymes that are active in E. coli (including with 37° C culture conditions) include Siraitia grosvenorii SQS (SEQ ID NO: 2), Euphorbia lathyris SQS (SEQ ID NO: 14), Eleutherococcus senticosus SQS (SEQ ID NO: 16), Flavobacteriales bacterium SQS (SEQ ID NO: 166), and Bacteroidetes bacterium SQS (SEQ ID NO: 167). In various embodiments, the heterologous enzyme pathway produces squalene, which is optionally an intermediate that acts as a substrate for additional downstream pathway enzymes. In some embodiments, squalene is recovered from the culture, and may be recovered from the microbial cells, and/or may be recovered from the media and/or an organic layer.
In various embodiments, the host cell expresses one or more enzymes that produce mogrol from squalene. For example, the host cell may express one or more of squalene epoxidase (SQE), cucurbitadienol synthase (CDS), epoxide hydrolase (EPH), cytochrome P450 oxidases (CYP450), non-heme iron-dependent oxygenases, and cytochrome P450 reductases (CPR).
In some embodiments, the heterologous enzyme pathway further comprises a squalene epoxidase (SQE). For example, the heterologous enzyme pathway may comprise an SQE that produces 2,3-oxidosqualene. Exemplary squalene epoxidases may comprise an amino acid sequence that is at least 70% identical to any one of SEQ ID NOS: 17 to 39, 168, 169, and 170. For example, the squalene epoxidase may comprise an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39). MlSQE has high activity in E. coli. Further, when coexpressed with AaSQS, high titer of the single epoxylated product (2,3-oxidosqualene) was observed. Accordingly, coexpression of AaSQS (or an engineered derivative) with MlSQE (or an engineered derivative) has a good potential for bioengineering of the mogrol pathway. Alternative SQE enzymes in accordance with the disclosure include Bathymodiolus azoricus Endosymbiont squalene epoxidase (SEQ ID NO: 168), Methyloprofundus sedimenti squalene epoxidase (SEQ ID NO: 169), Methylomicrobium buryatense squalene epoxidase (SEQ ID NO: 170), and engineered derivatives thereof.
In various embodiments, the heterologous enzyme pathway further comprises a triterpene cyclase. In some embodiments, where the microbial cell coexpresses FPPS, SQS, SQE, and the triterpene cyclase, the microbial cell produces cucurbitadienol. The cucurbitadienol may be the substrate for downstream enzymes in the heterologous pathway, or is alternatively recovered from the culture (either from microbial cells, or the culture media or organic layer). In some embodiments, the triterpene cyclase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 40 to 55. In some embodiments, the triterpene cyclase has cucurbitadienol synthase (CDS) activity. The CDS in various embodiments comprises an amino acid sequence that is at least 70% identical to the amino acid sequence of SEQ ID NO: 40 (Siraitia grosvenorii).
In some embodiments, the heterologous enzyme pathway further comprises an epoxide hydrolase (EPH). Exemplary EPH enzymes comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 56 to 72. In some embodiments, the EPH may employ as a substrate 24,25-epoxycucurbitadienol, for production of 24,25-dihydroxycucurbitadienol.
In some embodiments, the heterologous pathway further comprises one or more oxidases. The one or more oxidases may be active on cucurbitadienol or oxygenated products thereof as a substrate, adding (collectively) hydroxylations at C11, C24, and C25, thereby producing mogrol. Exemplary oxidase enzymes are described herein.
In various embodiments, the heterologous enzyme pathway produces mogrol, which may be an intermediate for downstream enzymes in the heterologous pathway, or in some embodiments is recovered from the culture. Mogrol may be recovered from host cells in some embodiments, or in some embodiments, can be recovered from the culture media or organic layer.
In some embodiments, the heterologous enzyme pathway further comprises one or more uridine diphosphate-dependent glycosyltransferase (UGT) enzymes, thereby producing one or more mogrol glycosides (or “mogrosides”). The mogrol glycoside may be pentaglycosylated, or hexaglycosylated in some embodiments. In other embodiments, the mogrol glycoside has two, three, or four glucosylations. The one or more mogrol glycosides may be selected from Mog. II-E, Mog. III-A-2, Mog. III-E, Mog. IIIx, Mog. IV-A, Mog. IV-E, Siamenoside, Isomog. IV, and Mog. V. In some embodiments, the mogroside is a pentaglucosylated or hexaglucosylated mogroside.
In some embodiments, the host cell expresses a UGT enzyme that catalyzes the primary glycosylation of mogrol at C24 and/or C3 hydroxyl groups. In some embodiments, the UGT enzyme catalyzes beta 1,2 and/or beta 1,6 branching glycosylations of mogrol glycosides at the primary C3 and C24 glucosyl groups. Exemplary UGT enzymes are disclosed herein (SEQ ID NOS: 116 to 165). For example, in some embodiments, the microbial cell expresses at least four UGT enzymes, resulting in glucosylation of mogrol at the C3 hydroxyl group, the C24 hydroxyl group, as well as a further 1,6 glucosylation at the C3 glucosyl group, and a further 1,6 glucosylation and a further 1,2 glucosylation at the C24 glucosyl group. The product of such glucosylation reactions is Mog. V.
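The glucosylation pattern just described for Mog. V can be written down as a small data structure to make the count explicit. The following is only an illustrative sketch: the names MOG_V_GLUCOSYLATIONS, glucose_count, and by_position are hypothetical and are not part of the disclosure; the five entries are taken directly from the paragraph above.

```python
from collections import defaultdict

# Each entry is (attachment position on mogrol, type of glucosylation),
# transcribed from the description of Mog. V above. Names are hypothetical.
MOG_V_GLUCOSYLATIONS = [
    ("C3", "primary"),
    ("C3", "beta-1,6"),
    ("C24", "primary"),
    ("C24", "beta-1,6"),
    ("C24", "beta-1,2"),
]


def glucose_count(pattern):
    """Return the total number of glucose units in a glucosylation pattern."""
    return len(pattern)


def by_position(pattern):
    """Group glucosylation types by the mogrol position they attach to."""
    grouped = defaultdict(list)
    for position, kind in pattern:
        grouped[position].append(kind)
    return dict(grouped)


if __name__ == "__main__":
    print(glucose_count(MOG_V_GLUCOSYLATIONS))
    print(by_position(MOG_V_GLUCOSYLATIONS))
```

Counting the entries gives five glucose units, two on the C3 branch and three on the C24 branch, which is consistent with Mog. V being a pentaglucosylated mogroside.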
For example, at least one UGT enzyme expressed by the microbial cell may comprise an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165). UGT85C1, and derivatives thereof, provide for glucosylation of the C3 hydroxyl of mogrol or Mog. IA.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146). UGT85C2, and derivatives thereof, provide for glucosylation of the C24 hydroxyl of mogrol or Mog. IE.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (CaUGT_1,6) (SEQ ID NO: 164). CaUGT_1,6, and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glycosyl groups.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117). UGT94-289-3 (“Sg94_3”), and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glucosyl groups, as well as beta 1,2 glucosylation at the C24 glucosyl group.
In some embodiments, the microbial cell expresses at least one UGT enzyme capable of catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group (e.g., of Mog. IV A, see FIG. 4). Exemplary UGT enzymes in accordance with these embodiments include Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117), Stevia rebaudiana UGT91D1 (SEQ ID NO: 147), Stevia rebaudiana UGT91D2 (SEQ ID NO: 148), Stevia rebaudiana UGT91D2e (SEQ ID NO: 149), OsUGT1-2 (SEQ ID NO: 150), or MbUGT1-2 (SEQ ID NO: 163), or derivatives thereof.
In some embodiments, at least one UGT enzyme is a circular permutant of a wild-type UGT enzyme, optionally having amino acid substitutions, deletions, and/or insertions with respect to the corresponding position of the wild-type enzyme. Circular permutants can provide novel and desirable substrate specificities, product profiles, and reaction kinetics over the wild-type enzymes. In some embodiments, at least one UGT enzyme is a circular permutant of SEQ ID NO: 146, SEQ ID NO: 164, SEQ ID NO: 165, SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, or SEQ ID NO: 163, or a derivative thereof.
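To illustrate what a circular permutant is at the sequence level, the sketch below rotates a sequence so that a chosen internal residue becomes the new N-terminus and the original termini are joined. This is a minimal illustration under stated assumptions, not the design procedure of the disclosure: the function name circular_permutant and the toy sequence are hypothetical, and any linker between the original termini or added start codon residue is deliberately left out.

```python
def circular_permutant(seq: str, new_start: int) -> str:
    """Rotate `seq` so that the residue at 0-based index `new_start`
    becomes the first residue; the original N- and C-termini are joined
    directly (no linker is modeled in this sketch)."""
    if not 0 <= new_start < len(seq):
        raise ValueError("new_start must be a valid index into seq")
    return seq[new_start:] + seq[:new_start]


if __name__ == "__main__":
    toy = "ABCDEFGH"  # toy sequence, not a real enzyme
    print(circular_permutant(toy, 3))
```

The permutant contains exactly the same residues as the parent sequence in a rotated order; substitutions, deletions, or insertions relative to the parent would be applied as separate modifications.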
Mogrol glycosides can be recovered from the microbial culture. For example, mogrol glycosides may be recovered from microbial cells, or in some embodiments, are predominately transported into the extracellular media, where they may be recovered or sequestered.
In some aspects, the invention provides a method for making a pentaglycosylated or hexaglycosylated mogroside, such as Mog. V. In various embodiments, the invention comprises reacting a mogrol glycoside with a plurality of uridine diphosphate dependent glycosyltransferase (UGT) enzymes. For example, in some embodiments, one UGT enzyme comprises an amino acid sequence that is at least 70% identical to SEQ ID NO: 164 (or circular permutant thereof), where the UGT enzyme catalyzes beta 1,6 addition of a glucose. Other UGT enzymes as described herein will be coexpressed to glycosylate the desired substrate to Mog. V.
In some embodiments, the mogrol is reacted with about four UGT enzymes. A first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or a circular permutant thereof. A second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or a circular permutant thereof. A third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or a circular permutant thereof. A fourth UGT enzyme is capable of catalyzing beta 1,2 addition of a glucose molecule, such as SgUGT94_289_3 (SEQ ID NO: 117) or a derivative or circular permutant thereof.
The mogrol glycoside can be recovered and/or purified from the reaction or culture. In some embodiments, the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
In various embodiments, the reaction is performed in a microbial cell, and UGT enzymes are recombinantly expressed in the cell. In some embodiments, mogrol is produced in the cell by a heterologous mogrol synthesis pathway, as described herein. In other embodiments, mogrol or mogrol glycosides are fed to the cells for glycosylation. In still other embodiments, the reaction is performed in vitro using purified UGT enzyme, partially purified UGT enzyme, or recombinant cell lysates.
In other aspects, the invention provides a method for making a product comprising a mogrol glycoside. The method comprises producing a mogrol glycoside in accordance with this disclosure, and incorporating the mogrol glycoside into a product. In some embodiments, the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V. In some embodiments, the product is a sweetener composition, flavoring composition, food, beverage, chewing gum, texturant, pharmaceutical composition, tobacco product, nutraceutical composition, or oral hygiene composition.
The product may be a sweetener composition comprising a blend of artificial and/or natural sweeteners. For example, the composition may further comprise one or more of a steviol glycoside, aspartame, and neotame. Exemplary steviol glycosides comprise one or more of RebM, RebB, RebD, RebA, RebE, and RebI.
Other aspects and embodiments of the invention will be apparent from the following detailed disclosure.
DESCRIPTION OF THE FIGURES
FIG. 1 shows the chemical structures of Mog. V, Mog. VI, and Isomog. V. The type of glycosylation reaction is shown within each glucose moiety (e.g., C3 or C24 core glycosylation and the 1-2, 1-4, or 1-6 glycosylation additions).
FIG. 2 shows routes to mogroside V production in vivo. The enzymatic transformation required for each step is indicated, along with the type of enzyme required. Numbers in parentheses correspond to the chemical structures in FIG. 3. Abbreviations: FPP, farnesyl pyrophosphate; SQS, squalene synthase; SQE, squalene epoxidase; TTC, triterpene cyclase; EPH, epoxide hydrolase; CYP450, cytochrome P450 with reductase partner; UGTs, uridine diphosphate glycosyltransferases.
FIG. 3 depicts chemical structures of metabolites involved in mogroside V biosynthesis: (1) farnesyl pyrophosphate; (2) squalene; (3) 2,3-oxidosqualene; (4) 2,3;22,23-dioxidosqualene; (5) 24,25-epoxycucurbitadienol; (6) 24,25-dihydroxycucurbitadienol; (7) mogrol; (8) mogroside V; (9) cucurbitadienol.
FIG. 4 illustrates glycosylation routes to mogroside V, and in vitro biotransformation activity observed for various UGT enzymes. Bubble structures represent different mogrosides. White tetracyclic core represents mogrol. The numbers below each structure indicate the particular glycosylated mogroside, while the notation with the arrows indicates the enzymes observed to exhibit the glycosylation activity. Black circles represent C3 or C24 glucosylations. Dark grey vertical circles represent 1,6-glucosylations. Light grey horizontal circles represent 1,2-glucosylations. Abbreviations: Mog, mogrol; sia, siamenoside.
FIG. 5 shows results for in vivo production of squalene in E. coli using different squalene synthases. The asterisk denotes a different plasmid construct and experiment run on a different day from the others shown. Abbreviations: SQS, squalene synthase; Sg, Siraitia grosvenorii; Aa, Artemisia annua; Es, Eleutherococcus senticosus; El, Euphorbia lathyris; Fb, Flavobacteriales bacterium; Bb, Bacteroidetes bacterium.
FIG. 6 shows results for in vivo production of squalene, 2,3-oxidosqualene, and 2,3;22,23-dioxidosqualene using different squalene epoxidases. Abbreviations: SQS, squalene synthase; SQE, squalene epoxidase; Sg, Siraitia grosvenorii; Aa, Artemisia annua; BaE, Bathymodiolus azoricus endosymbiont; Ms, Methyloprofundus sedimenti; Mb, Methylomicrobium buryatense; Ml, Methylomonas lenta.
FIG. 7 shows results for in vivo production of the cyclized triterpene product. Reactions involve an increasing number of enzymes expressed in an E. coli cell line having an overexpression of MEP pathway enzymes. The asterisks represent fermentation experiments incubated for a quarter of the time of the other experiments. As shown, co-expression of AaSQS, MlSQE, and SgTTC resulted in high production of the triterpenoid product, cucurbitadienol. Abbreviations: SQS, squalene synthase; SQE, squalene epoxidase; TTC, triterpene cyclase; Sg, Siraitia grosvenorii; Aa, Artemisia annua; Ml, Methylomonas lenta.
FIG. 8 shows Mogroside V production using a combination of different enzymes. (A) Penta-glycosylated products are observed when 85C1, 85C2, and Sg94_3 or CaUGT_1,6 are incubated together with mogrol as a substrate. Mogroside substrates were incubated in Tris buffer containing magnesium chloride, beta-mercaptoethanol, UDP-glucose, single UGT, and a phosphatase. (B) Extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogroside II-E. (C) Extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogrol. Abbreviation: MogV, mogroside V.
FIG. 9 shows in vitro assays showing the conversion of mogroside substrates to more glycosylated products. Mogroside substrates were incubated in Tris buffer containing magnesium chloride, beta-mercaptoethanol, UDP-glucose, single UGT, and a phosphatase. The panels correspond to the use of different substrates: (A) mogrol; (B) mogroside I-A; (C) mogroside I-E; (D) mogroside II-E; (E) mogroside III; (F) mogroside IV-A; (G) mogroside IV; (H) siamenoside.
FIG. 10 is an amino acid alignment of CaUGT_1,6 and SgUGT94_289_3 using Clustal Omega (Version CLUSTAL O (1.2.4)). These sequences share 54% amino acid identity.
FIG. 11 is an amino acid alignment of Homo sapiens squalene synthase (HsSQS) (NCBI accession NP_004453.3) and AaSQS (SEQ ID NO: 11) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQS has a published crystal structure (PDB entry: 1EZF). These sequences share 42% amino acid identity.
FIG. 12 is an amino acid alignment of Homo sapiens squalene epoxidase (HsSQE) (NCBI accession XP_011515548) and MlSQE (SEQ ID NO: 39) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQE has a published crystal structure (PDB entry: 6C6N). These sequences share 35% amino acid identity.
DETAILED DESCRIPTION OF THE INVENTION
The present invention, in various aspects and embodiments, provides a method for making mogrol glycosides, as well as other triterpenoid compounds, using recombinant microbial processes. In other aspects, the invention provides methods for making products, including foods, beverages, and sweeteners (among others), by incorporating the mogrol glycosides produced according to the methods described herein. As used herein, the terms “terpene” or “triterpene” are used interchangeably with the terms “terpenoid” or “triterpenoid,” respectively.
In one aspect, the invention provides a method for making a triterpenoid compound. The method comprises providing a recombinant microbial host cell expressing a heterologous enzyme pathway catalyzing the conversion of isopentenyl pyrophosphate (IPP) and/or dimethylallyl pyrophosphate (DMAPP) to one or more triterpenoid compounds. The heterologous enzyme pathway comprises a farnesyl diphosphate synthase (FPPS) and a squalene synthase (SQS), which are recombinantly expressed. In various embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 2 to 16, 166, and 167. The host cell is cultured under conditions for producing the triterpenoid.
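The "at least 70% identical" criterion can be illustrated with a short calculation on a pairwise alignment. The sketch below assumes one common convention, namely identical residues divided by the number of aligned columns in which neither sequence has a gap; the disclosure in this section does not specify the convention, and alignment tools differ on it. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: only columns where both sequences carry a residue are
    counted in the denominator."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned peptides, not sequences from the disclosure.
    print(percent_identity("MKT-AYLL", "MKTQAYIL"))
    print(meets_identity_threshold("MKT-AYLL", "MKTQAYIL"))
```

In the toy pair, seven columns are compared and six match, so the pair clears a 70% threshold but not a 90% threshold. A different denominator (for example, full alignment length) would give a lower figure for the same pair.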
By way of non-limiting example, the FPPS may be Saccharomyces cerevisiae farnesyl pyrophosphate synthase (ScFPPS) (SEQ ID NO: 1), or modified variants thereof. Modified variants may comprise an amino acid sequence that is at least 70% identical to SEQ ID NO: 1. For example, the FPPS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 1. In some embodiments, the FPPS comprises an amino acid sequence having from 1 to 20 amino acid modifications or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 1, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Numerous other FPPS enzymes are known in the art, and may be employed for conversion of IPP and/or DMAPP to farnesyl diphosphate in accordance with this aspect.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 11. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 11, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, AaSQS has high activity in E. coli.
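The "from 1 to 20 amino acid modifications" language can be made concrete by tallying substitutions, deletions, and insertions from a pairwise alignment of a variant against its reference. The sketch below is illustrative only: the function names and toy sequences are hypothetical, and it assumes that each gap column counts as one modification, which is one possible reading rather than a rule stated in the disclosure.

```python
from collections import Counter


def count_modifications(aligned_ref: str, aligned_var: str) -> Counter:
    """Tally modifications of a variant relative to a reference.

    Both inputs are pre-aligned strings of equal length, with '-' as the
    gap character. Assumption: every gap column is one modification."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter(substitutions=0, deletions=0, insertions=0)
    for ref_res, var_res in zip(aligned_ref, aligned_var):
        if ref_res == "-" and var_res == "-":
            continue  # empty column, nothing to compare
        if ref_res == "-":
            counts["insertions"] += 1  # residue present only in the variant
        elif var_res == "-":
            counts["deletions"] += 1  # residue missing from the variant
        elif ref_res != var_res:
            counts["substitutions"] += 1
    return counts


def within_modification_range(counts: Counter, low: int = 1,
                              high: int = 20) -> bool:
    """True if the total number of modifications lies in [low, high]."""
    return low <= sum(counts.values()) <= high


if __name__ == "__main__":
    # Toy aligned peptides, not sequences from the disclosure.
    tally = count_modifications("MKTAYL-L", "MRT-YLQL")
    print(dict(tally), within_modification_range(tally))
```

The toy variant carries one substitution, one deletion, and one insertion, for three modifications in total, which falls inside both the 1 to 20 and the 1 to 10 ranges.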
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii SQS (SEQ ID NO: 2). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 2, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, SgSQS has high activity in E. coli.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Euphorbia lathyris SQS (SEQ ID NO: 14). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 14. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 14, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, ElSQS was active in E. coli.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Eleutherococcus senticosus SQS (SEQ ID NO: 16). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 16. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 16, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, EsSQS was active in E. coli.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Flavobacteriales bacterium SQS (SEQ ID NO: 166). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 166. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 166, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, FbSQS was active in E. coli.
In some embodiments, the SQS comprises an amino acid sequence that is at least 70% identical to Bacteroidetes bacterium SQS (SEQ ID NO: 167). For example, the SQS may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 167. In some embodiments, the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 167, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme. As shown in FIG. 5, BbSQS was active in E. coli.
Amino acid modifications to the SQS enzyme can be guided by available enzyme structures and homology models, including those described in Aminfar and Tohidfar, In silico analysis of squalene synthase in Fabaceae family using bioinformatics tools. J. Genetic Engineer and Biotech. 16 (2018) 739-747. The publicly available crystal structure for HsSQS (PDB entry: 1EZF) may be used to inform amino acid modifications. An alignment between AaSQS and HsSQS is shown in FIG. 11. The enzymes have 42% amino acid identity.
In various embodiments, the heterologous enzyme pathway produces squalene, which is optionally an intermediate that acts as a substrate for additional downstream pathway enzymes. In some embodiments, squalene is recovered from the culture, and may be recovered from the microbial cells, and/or may be recovered from the media and/or an organic layer.
The microbial host cell in various embodiments may be prokaryotic or eukaryotic. In some embodiments, the microbial host cell is a bacteria selected from Escherichia spp., Bacillus spp., Corynebacterium spp., Rhodobacter spp., Zymomonas spp., Vibrio spp., and Pseudomonas spp. For example, in some embodiments, the bacterial host cell is a species selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida. In some embodiments, the bacterial host cell is E. coli. Alternatively, the microbial cell may be a yeast cell, such as but not limited to a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica.
The microbial cell will produce MEP or MVA products, which act as substrates for the heterologous enzyme pathway. The MEP (2-C-methyl-D-erythritol 4-phosphate) pathway, also called the MEP/DOXP (2-C-methyl-D-erythritol 4-phosphate/1-deoxy-D-xylulose 5-phosphate) pathway or the non-mevalonate pathway or the mevalonic acid-independent pathway, refers to the pathway that converts glyceraldehyde-3-phosphate and pyruvate to IPP and DMAPP. The pathway, which is present in bacteria, typically involves action of the following enzymes: 1-deoxy-D-xylulose-5-phosphate synthase (Dxs), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (IspC), 4-diphosphocytidyl-2-C-methyl-D-erythritol synthase (IspD), 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (IspE), 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (IspF), 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase (IspG), and isopentenyl diphosphate isomerase (IspH). The MEP pathway, and the genes and enzymes that make up the MEP pathway, are described in US 8,512,988, which is hereby incorporated by reference in its entirety. For example, genes that make up the MEP pathway include dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, and ispA. In some embodiments, the host cell expresses or overexpresses one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA, or modified variants thereof, which results in the increased production of IPP and DMAPP. In some embodiments, the triterpenoid (e.g., squalene, mogrol, or other intermediate described herein) is produced at least in part by metabolic flux through an MEP pathway, and wherein the host cell has at least one additional gene copy of one or more of dxs, ispC, ispD, ispE, ispF, ispG, ispH, idi, ispA, or modified variants thereof.
The MVA pathway refers to the biosynthetic pathway that converts acetyl-CoA to IPP. The mevalonate pathway, which will be present in yeast, typically comprises enzymes that catalyze the following steps: (a) condensing two molecules of acetyl-CoA to acetoacetyl-CoA (e.g., by action of acetoacetyl-CoA thiolase); (b) condensing acetoacetyl-CoA with acetyl-CoA to form hydroxymethylglutaryl-CoenzymeA (HMG-CoA) (e.g., by action of HMG-CoA synthase (HMGS)); (c) converting HMG-CoA to mevalonate (e.g., by action of HMG-CoA reductase (HMGR)); (d) phosphorylating mevalonate to mevalonate 5-phosphate (e.g., by action of mevalonate kinase (MK)); (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate (e.g., by action of phosphomevalonate kinase (PMK)); and (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate (e.g., by action of mevalonate pyrophosphate decarboxylase (MPD)). The MVA pathway, and the genes and enzymes that make up the MVA pathway, are described in US 7,667,017, which is hereby incorporated by reference in its entirety. In some embodiments, the host cell expresses or overexpresses one or more of acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, and MPD or modified variants thereof, which results in the increased production of IPP and DMAPP. In some embodiments, the triterpenoid (e.g., mogrol or squalene) is produced at least in part by metabolic flux through an MVA pathway, and wherein the host cell has at least one additional gene copy of one or more of acetoacetyl-CoA thiolase, HMGS, HMGR, MK, PMK, MPD, or modified variants thereof.
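Steps (a) through (f) of the MVA pathway described above form a linear chain from acetyl-CoA to IPP, which can be written down as a small lookup table. The Python sketch below is only an illustrative data structure. The metabolite and enzyme labels follow the text, while the table name, the function name, and the choice to represent each step as a (substrate, product, enzyme) tuple are assumptions made for this example.

```python
# Each entry: (substrate, product, enzyme abbreviation as used in the text).
# Step (b) also consumes a further acetyl-CoA, which this linear view omits.
MVA_STEPS = [
    ("acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    ("acetoacetyl-CoA", "HMG-CoA", "HMGS"),
    ("HMG-CoA", "mevalonate", "HMGR"),
    ("mevalonate", "mevalonate 5-phosphate", "MK"),
    ("mevalonate 5-phosphate", "mevalonate 5-pyrophosphate", "PMK"),
    ("mevalonate 5-pyrophosphate", "IPP", "MPD"),
]


def enzymes_between(start: str, end: str, steps=MVA_STEPS) -> list:
    """List the enzymes needed to convert `start` into `end` along the chain."""
    by_substrate = {substrate: (product, enzyme)
                    for substrate, product, enzyme in steps}
    enzymes = []
    current = start
    while current != end:
        if current not in by_substrate or len(enzymes) > len(steps):
            raise ValueError(f"no route from {start!r} to {end!r}")
        current, enzyme = by_substrate[current]
        enzymes.append(enzyme)
    return enzymes


if __name__ == "__main__":
    print(enzymes_between("acetyl-CoA", "IPP"))
```

A table like this could be used, for instance, to list which genes are candidates for an additional gene copy when flux between two given intermediates is limiting.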
In some embodiments, the host cell is a bacterial host cell engineered to increase production of IPP and DMAPP from glucose as described in US 2018/0245103 and US 2018/0216137, the contents of which are hereby incorporated by reference in their entireties. For example, in some embodiments the host cell overexpresses MEP pathway enzymes, with balanced expression to push/pull carbon flux to IPP and DMAPP. In some embodiments, the host cell is engineered to increase the availability or activity of Fe-S cluster proteins, so as to support higher activity of IspG and IspH, which are Fe-S enzymes. In some embodiments, the host cell is engineered to overexpress IspG and IspH, so as to provide increased carbon flux to 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate (HMBPP) intermediate, but with balanced expression to prevent accumulation of HMBPP at an amount that reduces cell growth or viability, or at an amount that inhibits MEP pathway flux and/or terpenoid production. In some embodiments, the host cell exhibits higher activity of IspH relative to IspG. In some embodiments, the host cell is engineered to downregulate the ubiquinone biosynthesis pathway, e.g., by reducing the expression or activity of IspB, which uses IPP and FPP substrate.
In some embodiments, the host cell expresses one or more enzymes that produce mogrol from squalene. For example, the host cell may express one or more of squalene epoxidase (SQE), cucurbitadienol synthase (CDS), epoxide hydrolase (EPH), cytochrome P450 oxidases (CYP450), non-heme iron-dependent oxygenases, and cytochrome P450 reductases (CPR). As shown in FIG. 2, the heterologous pathway can proceed through several routes to mogrol, which may involve one or two epoxidations of the core substrate. In some embodiments, the pathway proceeds through cucurbitadienol, and in some embodiments, does not involve a further epoxidation step. In some embodiments, one or more of SQE, CDS, EPH, CYP450, non-heme iron-dependent oxygenases, flavodoxin reductases (FPR), ferredoxin reductases (FDXR), and CPR enzymes are engineered to increase flux to mogrol.
In some embodiments, the heterologous enzyme pathway further comprises a squalene epoxidase (SQE). For example, the heterologous enzyme pathway may comprise an SQE that produces 2,3-oxidosqualene (intermediate (3) in FIG. 2). In some embodiments, the SQE will produce 22,23-dioxidosqualene (intermediate (4) in FIG. 2). For example, the squalene epoxidase may comprise an amino acid sequence that is at least 70% identical to any one of SEQ ID NOS: 17 to 39 and 168 to 170.
In some embodiments, the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39). For example, the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 39. In various embodiments, the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 39, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, MlSQE had good activity in E. coli. Further, when coexpressed with AaSQS, high levels of the single epoxylated product (2,3-oxidosqualene) were observed. Accordingly, coexpression of AaSQS (or an engineered derivative) with MlSQE (or an engineered derivative) has good potential for bioengineering of the mogrol pathway. Amino acid modifications may be made to increase expression or stability of the SQE enzyme in the microbial cell, or to increase productivity of the enzyme.
In some embodiments, the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Bathymodiolus azoricus Endosymbiont squalene epoxidase (SEQ ID NO: 168). For example, the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 168. In various embodiments, the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 168, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, BaESQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
In some embodiments, the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methyloprofundus sedimenti squalene epoxidase (SEQ ID NO: 169). For example, the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 169. In various embodiments, the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 169, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, MsSQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
In some embodiments, the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methylomicrobium buryatense squalene epoxidase (SEQ ID NO: 170). For example, the SQE may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 170. In various embodiments, the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 170, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. As shown in FIG. 6, MbSQE had good activity in E. coli. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
Other SQE enzymes tested showed no activity in E. coli.
Amino acid modifications can be guided by available enzyme structures and homology models, including those described in Padyana AK, et al., Structure and inhibition mechanism of the catalytic domain of human squalene epoxidase. Nat. Comm. (2019) Vol. 10(97): 1-10; or Ruckenstuhl et al., Structure-Function Correlations of Two Highly Conserved Motifs in Saccharomyces cerevisiae Squalene Epoxidase. Antimicrob. Agents and Chemo. (2008) Vol. 52(4): 1496-1499. FIG. 12 shows an alignment of HsSQE and MlSQE, which is useful for guiding engineering of the enzymes for expression, stability, and productivity in microbial host cells. The two enzymes have 35% identity.
In various embodiments, the heterologous enzyme pathway further comprises a triterpene cyclase. In some embodiments, where the microbial cell coexpresses FPPS, SQS, SQE, and the triterpene cyclase, the microbial cell produces cucurbitadienol (compound (9) in FIG. 2). The cucurbitadienol may be the substrate for downstream enzymes in the heterologous pathway, or is alternatively recovered from the culture (either from microbial cells, or the culture media or organic layer).
In some embodiments, the triterpene cyclase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 40 to 55. In some embodiments, the triterpene cyclase has cucurbitadienol synthase (CDS) activity. The CDS in various embodiments comprises an amino acid sequence that is at least 70% identical to the amino acid sequence of SEQ ID NO: 40, and may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 40. For example, the CDS may comprise an amino acid sequence having from 1 to 20 amino acid modifications or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 40, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
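The "from 1 to 20" and "from 1 to 10" amino acid modification ranges recited above count substitutions, deletions, and insertions relative to a reference sequence. One way to estimate such a count, offered here only as a hedged sketch and not as the counting method of this disclosure, is the Levenshtein edit distance, which gives the minimum number of single-residue substitutions, deletions, and insertions that convert one sequence into the other. The function names and example sequences below are hypothetical.

```python
def modification_count(reference: str, variant: str) -> int:
    """Minimum number of substitutions, deletions, and insertions
    (Levenshtein distance) needed to turn `reference` into `variant`."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j residues of `variant`.
    prev = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        curr = [i]
        for j, var_res in enumerate(variant, start=1):
            substitution_cost = 0 if ref_res == var_res else 1
            curr.append(min(
                prev[j] + 1,                      # deletion from reference
                curr[j - 1] + 1,                  # insertion into reference
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def within_modification_range(reference: str, variant: str,
                              low: int = 1, high: int = 20) -> bool:
    """True if the variant differs from the reference by low..high edits."""
    return low <= modification_count(reference, variant) <= high


if __name__ == "__main__":
    # Toy sequences: one substitution (T->S) and one deletion (A).
    print(modification_count("MKTAYIAK", "MKSAYIK"))
    print(within_modification_range("MKTAYIAK", "MKSAYIK", 1, 10))
```

Because the edit distance is a minimum, it can undercount the modifications actually introduced during engineering when several changes partly cancel each other.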
Amino acid modifications can be guided by available enzyme structures and homology models, including those described in Itkin M., et al., The biosynthetic pathway of the nonsugar high-intensity sweetener mogroside V from Siraitia grosvenorii. PNAS (2016) Vol. 113(47): E7619-E7628. For example, the CDS may be modeled using the structure of human lanosterol synthase (oxidosqualene cyclase) (PDB 1W6K).
In some embodiments, the heterologous enzyme pathway further comprises an epoxide hydrolase (EPH). The EPH may comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 56 to 72. In some embodiments, the EPH may employ as a substrate 24,25-epoxycucurbitadienol (intermediate (5) of FIG. 2), for production of 24,25-dihydroxycucurbitadienol (intermediate (6) of FIG. 2). In some embodiments, the EPH comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 56 to 72. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme.
In some embodiments, the heterologous pathway further comprises one or more oxidases. The one or more oxidases may be active on cucurbitadienol or oxygenated products thereof as a substrate, adding (collectively) hydroxylations at C11, C24, and C25, thereby producing mogrol (see FIG. 2).
In some embodiments, at least one oxidase is a cytochrome P450 enzyme. Exemplary cytochrome P450 enzymes comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 73 to 91. In some embodiments, at least one P450 enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 73 to 91. In some embodiments, particularly in embodiments in which the microbial cell is a bacterium, the CYP450 and/or CPR is modified as described in US 2018/0251738, the contents of which are hereby incorporated by reference in their entireties. For example, in some embodiments, the CYP450 enzyme has a deletion of all or part of the wild type P450 N-terminal transmembrane region, and the addition of a transmembrane domain derived from an E. coli or bacterial inner membrane, cytoplasmic C-terminus protein. In some embodiments, the transmembrane domain is a single-pass transmembrane domain. In some embodiments, the transmembrane domain is a multi-pass (e.g., 2, 3, or more transmembrane helices) transmembrane domain.
In some embodiments, at least one oxidase is a non-heme iron oxidase. Exemplary non-heme iron oxidases comprise an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 100 to 115. In some embodiments, the non-heme iron oxidase comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 100 to 115.
In various embodiments, the microbial host cell expresses one or more electron transfer proteins selected from a cytochrome P450 reductase (CPR), flavodoxin reductase (FPR) and ferredoxin reductase (FDXR) sufficient to regenerate the one or more oxidases. Exemplary CPR proteins are provided herein as SEQ ID NOS: 92 to 99.
In various embodiments, the heterologous enzyme pathway produces mogrol, which may be an intermediate for downstream enzymes in the heterologous pathway, or in some embodiments is recovered from the culture. Mogrol may be recovered from host cells in some embodiments, or in some embodiments, can be recovered from the culture media or organic layer.
In some embodiments, the heterologous enzyme pathway further comprises one or more uridine diphosphate-dependent glycosyltransferase (UGT) enzymes, thereby producing one or more mogrol glycosides (or “mogrosides”). The mogrol glycoside may be pentaglycosylated, or hexaglycosylated in some embodiments. In other embodiments, the mogrol glycoside has two, three, or four glucosylations. The one or more mogrol glycosides may be selected from Mog. II-E, Mog. III-A-2, Mog. III-E, Mog. IIIx, Mog. IV-A, Mog. IV-E, Siamenoside, Isomog. IV, and Mog. V. In some embodiments, the mogroside is a pentaglucosylated or hexaglucosylated mogroside. In some embodiments, the one or more mogrol glycosides include Mog. VI, Isomog. V, and Mog. V. In some embodiments, the host cell produces Mog. V.
In some embodiments, the host cell expresses a UGT enzyme that catalyzes the primary glycosylation of mogrol at C24 and/or C3 hydroxyl groups. In some embodiments, the UGT enzyme catalyzes beta 1,2 and/or beta 1,6 branching glycosylations of mogrol glycosides at the primary C3 and C24 glucosyl groups. In some embodiments, the UGT enzyme catalyzes beta 1,2 glucosylation of Mog. IV-A, beta 1,6 glucosylation of Mog. IV, and/or beta 1,6 glucosylation of Siamenoside to Mog. V. In some embodiments, the UGT enzyme catalyzes the beta 1,6 glucosylation of Mog. V to Mog. VI. In some embodiments, the UGT enzyme catalyzes the beta 1,4 glucosylation of Siamenoside and/or the beta 1,6 glucosylation of Isomog. IV to Isomog. V.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 116 to 165. For example, in some embodiments, the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one of SEQ ID NOS: 116 to 165. For example, in some embodiments, the microbial cell expresses at least four UGT enzymes, resulting in glucosylation of mogrol at the C3 hydroxyl group, the C24 hydroxyl group, as well as a further 1,6 glucosylation at the C3 glucosyl group, and a further 1,6 glucosylation and a further 1,2 glucosylation at the C24 glucosyl group. The product of such glucosylation reactions is Mog. V (FIG. 4).
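The glucosylation pattern described above for Mog. V (a primary glucose at C3 and at C24, a further 1,6 glucose at the C3 glucosyl group, and a further 1,6 and a further 1,2 glucose at the C24 glucosyl group) adds up to five glucose units. The Python sketch below simply tracks that bookkeeping. It is a hypothetical model, not part of the disclosure: the class and function names are invented for illustration, the order of the steps is an assumption (FIG. 4 shows several possible routes), and no enzyme specificity is modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MogrolGlycoside:
    """Glucose units attached at the C3 and C24 positions of mogrol."""
    c3: tuple = ()
    c24: tuple = ()

    def glucose_count(self) -> int:
        return len(self.c3) + len(self.c24)


def glucosylate(substrate: MogrolGlycoside, position: str,
                linkage: str) -> MogrolGlycoside:
    """Return a new glycoside with one glucose added.

    `linkage` is "primary" (onto the hydroxyl group) or a branching
    linkage such as "beta-1,6" or "beta-1,2" (onto the primary glucose).
    """
    if position not in ("c3", "c24"):
        raise ValueError("position must be 'c3' or 'c24'")
    chain = getattr(substrate, position)
    if linkage == "primary":
        if chain:
            raise ValueError("position already carries a primary glucose")
    else:
        if "primary" not in chain:
            raise ValueError("branching requires a primary glucose")
        if linkage in chain:
            raise ValueError("linkage already present at this position")
    return replace(substrate, **{position: chain + (linkage,)})


# Assumed order of additions leading from mogrol to Mog. V.
MOG_V_STEPS = [
    ("c3", "primary"),
    ("c24", "primary"),
    ("c3", "beta-1,6"),
    ("c24", "beta-1,6"),
    ("c24", "beta-1,2"),
]

if __name__ == "__main__":
    product = MogrolGlycoside()  # mogrol, no glucose attached
    for position, linkage in MOG_V_STEPS:
        product = glucosylate(product, position, linkage)
    print(product, product.glucose_count())
```

Intermediates along the way correspond to partially glucosylated mogrosides, which is why the order of additions matters for which named intermediate accumulates.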
For example, at least one UGT enzyme expressed by the microbial cell may comprise an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165). UGT85C1, and derivatives thereof, provide for glucosylation of the C3 hydroxyl of mogrol or Mog. IA. Other glucosyltransferase reactions detected for UGT85C1 are shown in FIG. 4. In some embodiments, at least one UGT enzyme may comprise an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165. In some embodiments, the UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146). UGT85C2, and derivatives thereof, provide for glucosylation of the C24 hydroxyl of mogrol or Mog. IE. Other glucosyltransferase reactions detected for UGT85C2 are shown in FIG. 4. In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146. In some embodiments, at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (CaUGT_1,6) (SEQ ID NO: 164). CaUGT_1,6, and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glycosyl groups. Glycosyltransferase reactions observed for CaUGT_1,6 are shown in FIG. 4. In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164. In some embodiments, at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117). UGT94-289-3 (“Sg94_3”), and derivatives thereof, provide for further beta 1,6 glucosylation at C24 and C3 glucosyl groups, as well as beta 1,2 glucosylation at the C24 glucosyl group. Glycosyltransferase reactions observed for Sg94_3 are shown in FIG. 4. In some embodiments, at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 117. In some embodiments, at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 117, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
In some embodiments, the microbial cell expresses at least one UGT enzyme capable of catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group (e.g., of Mog. IV-A, see FIG. 4). Exemplary UGT enzymes in accordance with these embodiments include Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117), Stevia rebaudiana UGT91D1 (SEQ ID NO: 147), Stevia rebaudiana UGT91D2 (SEQ ID NO: 148), Stevia rebaudiana UGT91D2e (SEQ ID NO: 149), OsUGT1-2 (SEQ ID NO: 150), or MbUGT1-2 (SEQ ID NO: 163), or derivatives thereof. Derivatives include enzymes comprising amino acid sequences that are at least 70% identical to one or more of SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, and SEQ ID NO: 163. In some embodiments, the UGT enzyme catalyzing beta 1,2 addition of a glucose molecule to at least the C24 glucosyl group comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to one or more of SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, and SEQ ID NO: 163. In some embodiments, at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 or having from 1 to 10 amino acid modifications with respect to SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, or SEQ ID NO: 163, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular substrates.
In some embodiments, at least one UGT enzyme is a circular permutant of a wild-type UGT enzyme, optionally having amino acid substitutions, deletions, and/or insertions with respect to the corresponding position of the wild-type enzyme. Circular permutants can provide novel and desirable substrate specificities, product profiles, and reaction kinetics over the wild-type enzymes. A circular permutant retains the same basic fold of the parent enzyme, but has a different position of the N-terminus (e.g., “cut-site”), with the original N- and C-termini connected, optionally by a linking sequence. For example, in the circular permutants, the N-terminal Methionine is positioned at a site in the protein other than the natural N-terminus. UGT circular permutants are described in US 2017/0332673, which is hereby incorporated by reference in its entirety. In some embodiments, at least one UGT enzyme is a circular permutant of SEQ ID NO: 146, SEQ ID NO: 164, SEQ ID NO: 165, SEQ ID NO: 117, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, or SEQ ID NO: 163. In some embodiments, the circular permutant further has one or more amino acid modifications (e.g., amino acid substitutions, deletions, and/or insertions) with respect to the parent UGT enzyme. In these embodiments, the circular permutant will have at least about 70%, or at least about 80%, or at least about 90%, or at least about 95%, or at least about 98% identity to the parent enzyme, when the corresponding amino acid sequences are aligned (i.e., without regard to the new N-terminus of the circular permutant).
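The construction of a circular permutant described above (a new N-terminus at the cut-site, with the original N- and C-termini joined, optionally through a linker) can be illustrated at the sequence level. The Python sketch below is a simplified assumption-laden example and does not reproduce the designs of US 2017/0332673. The function name, the zero-based cut-site convention, the example linker, and the choice to prepend a methionine when the new N-terminal residue is not already methionine are all hypothetical.

```python
def circular_permutant(parent: str, cut_site: int, linker: str = "") -> str:
    """Build a circular permutant of `parent`.

    `cut_site` is the zero-based index of the parent residue that becomes
    the new N-terminus. The original C-terminus is joined to the original
    N-terminus through `linker` (which may be empty).
    """
    if not 0 < cut_site < len(parent):
        raise ValueError("cut_site must lie strictly inside the sequence")
    new_n_terminal_part = parent[cut_site:]  # starts at the cut-site
    old_n_terminal_part = parent[:cut_site]  # now follows the old C-terminus
    permuted = new_n_terminal_part + linker + old_n_terminal_part
    # Assumption: add an initiator Met if the new N-terminus lacks one.
    if not permuted.startswith("M"):
        permuted = "M" + permuted
    return permuted


if __name__ == "__main__":
    # Toy parent sequence and linker, not taken from the sequence listing.
    print(circular_permutant("MKTAYIAK", cut_site=4, linker="GS"))
```

Identity between such a permutant and its parent would then be assessed on the aligned corresponding segments, without regard to the new N-terminus, as the paragraph above states.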
In some embodiments, the heterologous enzyme pathway comprises three or four UGT enzymes. A first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165) (or derivative thereof as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 165 or derivative thereof (as described above). A second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146) (or derivative as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 146 (or derivative as described above). A third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117) (or derivative or circular permutant as described above). In some embodiments, UGT94-289-3 is replaced with another UGT enzyme capable of beta 1,2 glucosyltransferase activity (as described above), together with a fourth UGT enzyme. The fourth UGT enzyme comprises an amino acid sequence that is at least 70% identical to CaUGT_1,6 (SEQ ID NO: 164) (or derivative as described above), or comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 164 (or derivative as described above). Expression of these enzymes in the host cell converts mogrol to predominantly tetra- and pentaglycosylated products, including Mog. V. See FIG. 4, FIG. 8, FIG. 9.
In some embodiments, the microbial host cell has one or more genetic modifications that increase the production of UDP-glucose, the co-factor employed by UGT enzymes. These genetic modifications may include one or more, or two or more (or all) of ΔgalE, ΔgalT, ΔgalK, ΔgalM, ΔushA, Δagp, Δpgm, duplication of E. coli GALU, expression of Bacillus subtilis UGPA, and expression of Bifidobacterium adolescentis SPL.
Mogrol glycosides can be recovered from the microbial culture. For example, mogrol glycosides may be recovered from microbial cells, or in some embodiments, are predominately transported into the extracellular media, where they may be recovered or sequestered.
In some aspects, the invention provides a method for making a pentaglycosylated or hexaglycosylated mogroside. In some embodiments, the mogroside is Mog. V. In various embodiments, the method comprises reacting a mogrol glycoside with a plurality of uridine diphosphate dependent glycosyltransferase (UGT) enzymes. For example, in some embodiments, one UGT enzyme comprises an amino acid sequence that is at least 70% identical to SEQ ID NO: 164, where the UGT enzyme catalyzes beta 1,6 addition of a glucose. Alternatively, the UGT enzyme comprises an amino acid sequence that is a circular permutant of SEQ ID NO: 164 or a derivative thereof (described above).
In some embodiments, the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164. For example, the UGT enzyme may comprise an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions. In some embodiments, the UGT enzyme is a circular permutant of SEQ ID NO: 164, or derivative thereof. Amino acid modifications may be made to increase expression or stability of the enzyme in the microbial cell, or to increase productivity of the enzyme for particular mogroside substrates, such as Mog. IV or Siamenoside. Other UGT enzymes will be coexpressed to glycosylate the desired substrate to Mog. V.
In some embodiments, the mogrol glycoside substrate comprises Mog. IIE. In some embodiments, the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C1 (SEQ ID NO: 165), or a circular permutant comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 165, including derivatives of UGT85C1 or circular permutants as described. In some embodiments, the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165. For example, the UGT enzyme may comprise an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 165.
In some embodiments, the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IA or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C2 (SEQ ID NO: 146), or a derivative or circular permutant of UGT85C2 as described herein. In some embodiments, the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146. For example, the UGT enzyme comprises an amino acid sequence having from 1 to 20 or from 1 to 10 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 146.
In some embodiments, the mogrol is reacted with about four UGT enzymes. A first UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or a derivative or circular permutant as described. A second UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or a derivative or circular permutant as described. A third UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or a derivative or circular permutant as described. A fourth UGT enzyme is capable of catalyzing beta 1,2 addition of a glucose molecule, such as SgUGT94_289_3 (SEQ ID NO: 117) or a derivative or circular permutant as described.
The mogrol glycoside can be recovered and/or purified from the reaction or culture. In some embodiments, the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
In various embodiments, the reaction is performed in a microbial cell, and UGT enzymes are recombinantly expressed in the cell. In some embodiments, mogrol is produced in the cell by a heterologous mogrol synthesis pathway, as described herein. In other embodiments, mogrol or mogrol glycosides are fed to the cells for glycosylation. In still other embodiments, the reaction is performed in vitro using purified UGT enzyme, partially purified UGT enzyme, or recombinant cell lysates.
As described herein, the microbial host cell can be prokaryotic or eukaryotic, and is optionally a bacterium selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida. In some embodiments, the microbial cell is a yeast selected from a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica. In some embodiments, the microbial host cell is E. coli.
The bacterial host cell is cultured to produce the triterpenoid product (e.g., mogroside). In some embodiments, carbon substrates such as C1, C2, C3, C4, C5, and/or C6 carbon substrates are employed for the production phase. In exemplary embodiments, the carbon source is glucose, sucrose, fructose, xylose, and/or glycerol. Culture conditions are generally selected from aerobic, microaerobic, and anaerobic.
In various embodiments, the bacterial host cell may be cultured at a temperature between 22° C and 37° C. While commercial biosynthesis in bacteria such as E. coli can be limited by the temperature at which overexpressed and/or foreign enzymes (e.g., enzymes derived from plants) are stable, recombinant enzymes may be engineered to allow for cultures to be maintained at higher temperatures, resulting in higher yields and higher overall productivity. In some embodiments, the culturing is conducted at about 22° C or greater, about 23° C or greater, about 24° C or greater, about 25° C or greater, about 26° C or greater, about 27° C or greater, about 28° C or greater, about 29° C or greater, about 30° C or greater, about 31° C or greater, about 32° C or greater, about 33° C or greater, about 34° C or greater, about 35° C or greater, about 36° C or greater, or about 37° C.
In some embodiments, the bacterial host cells are further suitable for commercial production, at commercial scale. In some embodiments, the size of the culture is at least about 100 L, at least about 200 L, at least about 500 L, at least about 1,000 L, or at least about 10,000 L, or at least about 100,000 L, or at least about 500,000 L, or at least about 600,000 L. In an embodiment, the culturing may be conducted in batch culture, continuous culture, or semi-continuous culture.
In various embodiments, methods further include recovering the product from the cell culture or from cell lysates. In some embodiments, the culture produces at least about 100 mg/L, or at least about 200 mg/L, or at least about 500 mg/L, or at least about 1 g/L, or at least about 2 g/L, or at least about 5 g/L, or at least about 10 g/L, or at least about 20 g/L, or at least about 30 g/L, or at least about 40 g/L of the terpenoid or terpenoid glycoside product.
In some embodiments, the production of indole (including prenylated indole) is used as a surrogate marker for terpenoid production, and/or the accumulation of indole in the culture is controlled to increase production. For example, in various embodiments, accumulation of indole in the culture is controlled to below about 100 mg/L, or below about 75 mg/L, or below about 50 mg/L, or below about 25 mg/L, or below about 10 mg/L. The accumulation of indole can be controlled by balancing protein expression and activity using the multivariate modular approach as described in U.S. Pat. No. 8,927,241 (which is hereby incorporated by reference), and/or is controlled by chemical means.
Other markers for efficient production of terpene and terpenoids include accumulation of DOX or ME in the culture media. Generally, the bacterial strains may be engineered to accumulate less of these chemical species, which accumulate in the culture at less than about 5 g/L, or less than about 4 g/L, or less than about 3 g/L, or less than about 2 g/L, or less than about 1 g/L, or less than about 500 mg/L, or less than about 100 mg/L.
The optimization of terpene or terpenoid production by manipulation of MEP pathway genes, as well as manipulation of the upstream and downstream pathways, is not expected to be a simple linear or additive process. Rather, through combinatorial analysis, optimization is achieved through balancing components of the MEP pathway, as well as upstream and downstream pathways. Indole (including prenylated indole) accumulation and MEP metabolite accumulation (e.g., DOX, ME, MEcPP, and/or farnesol) in the culture can be used as surrogate markers to guide this process.
For example, in some embodiments, the bacterial strain has at least one additional copy of dxs and idi expressed as an operon/module; or dxs, ispD, ispF, and idi expressed as an operon or module (either on a plasmid or integrated into the genome), with additional MEP pathway complementation described herein to improve MEP carbon. For example, the bacterial strain may have a further copy of dxr, and ispG and/or ispH, optionally with a further copy of ispE and/or idi, with expressions of these genes tuned to increase MEP carbon and/or improve terpene or terpenoid titer. In various embodiments, the bacterial strain has a further copy of at least dxr, ispE, ispG and ispH, optionally with a further copy of idi, with expressions of these genes tuned to increase MEP carbon and/or improve terpene or terpenoid titer.
Manipulation of the expression of genes and/or proteins, including gene modules, can be achieved through various methods. For example, expression of the genes or operons can be regulated through selection of promoters, such as inducible or constitutive promoters, with different strengths (e.g., strong, intermediate, or weak). Several non-limiting examples of promoters of different strengths include Trc, T5 and T7. Additionally, expression of genes or operons can be regulated through manipulation of the copy number of the gene or operon in the cell. In some embodiments, expression of genes or operons can be regulated through manipulating the order of the genes within a module, where the genes transcribed first are generally expressed at a higher level. In some embodiments, expression of genes or operons is regulated through integration of one or more genes or operons into the chromosome.
Optimization of protein expression can also be achieved through selection of appropriate promoters and ribosomal binding sites. In some embodiments, this may include the selection of high-copy number plasmids, or single-, low- or medium-copy number plasmids. The step of transcription termination can also be targeted for regulation of gene expression, through the introduction or elimination of structures such as stem-loops. Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction into the cells of heterologous DNA. The heterologous DNA is placed under operable control of transcriptional elements to permit the expression of the heterologous DNA in the host cell.
In some embodiments, endogenous genes are edited, as opposed to gene complementation. Editing can modify endogenous promoters, ribosomal binding sequences, or other expression control sequences, and/or in some embodiments modifies trans-acting and/or cis-acting factors in gene regulation. Genome editing can take place using CRISPR/Cas genome editing techniques, or similar techniques employing zinc finger nucleases and TALENs. In some embodiments, the endogenous genes are replaced by homologous recombination. In some embodiments, genes are overexpressed at least in part by controlling gene copy number. While gene copy number can be conveniently controlled using plasmids with varying copy number, gene duplication and chromosomal integration can also be employed. For example, a process for genetically stable tandem gene duplication is described in US 2011/0236927, which is hereby incorporated by reference in its entirety.
The terpene or terpenoid product can be recovered by any suitable process, including partitioning the desired product into an organic phase or hydrophobic phase. Alternatively, the aqueous phase can be recovered, and/or the whole cell biomass can be recovered, for further processing. The production of the desired product can be determined and/or quantified, for example, by gas chromatography (e.g., GC-MS). The desired product can be produced in batch or continuous bioreactor systems. Production of product, recovery, and/or analysis of the product can be done as described in US 2012/0246767, which is hereby incorporated by reference in its entirety. For example, in some embodiments, product oil is extracted from aqueous reaction medium using an organic solvent, such as an alkane such as heptane or dodecane, or vegetable oil (e.g., safflower oil) followed by fractional distillation. In other embodiments, product oil is extracted from aqueous reaction medium using a hydrophobic phase, such as a vegetable oil, followed by organic solvent extraction and fractional distillation. Terpene and terpenoid components of fractions may be measured quantitatively by GC/MS, followed by blending of fractions to generate a desired product profile.
The similarity of nucleotide and amino acid sequences, i.e. the percentage of sequence identity, can be determined via sequence alignments. Such alignments can be carried out with several art-known algorithms, such as with the mathematical algorithm of Karlin and Altschul (Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90: 5873-5877), with hmmalign (HMMER package, http://hmmer.wustl.edu/) or with the CLUSTAL algorithm (Thompson, J. D., Higgins, D. G. & Gibson, T. J. (1994) Nucleic Acids Res. 22, 4673-80). The grade of sequence identity (sequence matching) may be calculated using e.g. BLAST, BLAT or BlastZ (or BlastX). A similar algorithm is incorporated into the BLASTN and BLASTP programs of Altschul et al (1990) J. Mol. Biol. 215: 403-410. BLAST polynucleotide searches can be performed with the BLASTN program, score=100, word length=12. BLAST protein searches may be performed with the BLASTP program, score=50, word length=3. To obtain gapped alignments for comparative purposes, Gapped BLAST is utilized as described in Altschul et al (1997) Nucleic Acids Res. 25: 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs are used. Sequence matching analysis may be supplemented by established homology mapping techniques like Shuffle-LAGAN (Brudno M., Bioinformatics 2003b, 19 Suppl 1 : 154-162) or Markov random fields.
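As an illustration only, the percent-identity comparison described above can be sketched for two sequences that have already been aligned by one of the named tools. This sketch does not reimplement BLAST or CLUSTAL. The function names, the gap character, and the choice to count only columns in which neither sequence has a gap are assumptions made for the example; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs are assumed to be rows taken from an existing pairwise alignment,
    so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the stated identity threshold (e.g., 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned fragments, not full sequences from the listing.
    print(percent_identity("MGSLGA", "MGALGA"))
    print(meets_identity_threshold("MGSLGA", "MGALGA", threshold=70.0))
```

In practice the aligned rows would come from the output of an alignment program, and the threshold would be whichever of the recited values (70%, 80%, 90%, and so on) is being tested.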
"Conservative substitutions" may be made, for instance, on the basis of similarity in polarity, charge, size, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the amino acid residues involved. The 20 naturally occurring amino acids can be grouped into the following six standard amino acid groups:
(1) hydrophobic: Met, Ala, Val, Leu, Ile;
(2) neutral hydrophilic: Cys, Ser, Thr, Asn, Gln;
(3) acidic: Asp, Glu;
(4) basic: His, Lys, Arg;
(5) residues that influence chain orientation: Gly, Pro; and
(6) aromatic: Trp, Tyr, Phe.
As used herein, “conservative substitutions” are defined as exchanges of an amino acid by another amino acid listed within the same group of the six standard amino acid groups shown above. For example, the exchange of Asp by Glu retains one negative charge in the so modified polypeptide. In addition, glycine and proline may be substituted for one another based on their ability to disrupt α-helices. Some preferred conservative substitutions within the above six groups are exchanges within the following sub-groups: (i) Ala, Val, Leu and Ile; (ii) Ser and Thr; (iii) Asn and Gln; (iv) Lys and Arg; and (v) Tyr and Phe.
As used herein, “non-conservative substitutions” are defined as exchanges of an amino acid by another amino acid listed in a different group of the six standard amino acid groups (1) to (6) shown above.
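The two definitions above amount to a group-membership test, which can be sketched as follows. The group contents are taken from the six groups listed above; the dictionary layout, the group labels, and the function name are illustrative assumptions, and the sketch does not cover the preferred sub-groups.

```python
# The six standard amino acid groups as listed in the text (three-letter codes).
GROUPS = {
    "hydrophobic": {"Met", "Ala", "Val", "Leu", "Ile"},
    "neutral hydrophilic": {"Cys", "Ser", "Thr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"His", "Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe"},
}

# Reverse lookup: residue -> name of the group it belongs to.
RESIDUE_TO_GROUP = {
    residue: name for name, members in GROUPS.items() for residue in members
}


def classify_substitution(original: str, replacement: str) -> str:
    """Classify an exchange as 'identical', 'conservative', or 'non-conservative'."""
    for residue in (original, replacement):
        if residue not in RESIDUE_TO_GROUP:
            raise ValueError(f"unknown residue: {residue}")
    if original == replacement:
        return "identical"
    if RESIDUE_TO_GROUP[original] == RESIDUE_TO_GROUP[replacement]:
        return "conservative"
    return "non-conservative"


if __name__ == "__main__":
    print(classify_substitution("Asp", "Glu"))  # same group (acidic)
    print(classify_substitution("Lys", "Glu"))  # different groups
```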
Modifications of enzymes as described herein can include conservative and/or non-conservative mutations.
In some embodiments, “rational design” is involved in constructing specific mutations in enzymes. Rational design refers to incorporating knowledge of the enzyme, or related enzymes, such as its reaction thermodynamics and kinetics, its three dimensional structure, its active site(s), its substrate(s) and/or the interaction between the enzyme and substrate, into the design of the specific mutation. Based on a rational design approach, mutations can be created in an enzyme which can then be screened for increased production of a terpene or terpenoid relative to control levels. In some embodiments, mutations can be rationally designed based on homology modeling. As used herein, “homology modeling” refers to the process of constructing an atomic resolution model of one protein from its amino acid sequence and a three-dimensional structure of a related homologous protein.
In other aspects, the invention provides a method for making a product comprising a mogrol glycoside. The method comprises producing a mogrol glycoside in accordance with this disclosure, and incorporating the mogrol glycoside into a product. In some embodiments, the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V. In some embodiments, the product is a sweetener composition, flavoring composition, food, beverage, chewing gum, texturant, pharmaceutical composition, tobacco product, nutraceutical composition, or oral hygiene composition.
The product may be a sweetener composition comprising a blend of artificial and/or natural sweeteners. For example, the composition may further comprise one or more of a steviol glycoside, aspartame, and neotame. Exemplary steviol glycosides comprise one or more of RebM, RebB, RebD, RebA, RebE, and RebI.
Non-limiting examples of flavors for which the products can be used in combination include lime, lemon, orange, fruit, banana, grape, pear, pineapple, mango, bitter almond, cola, cinnamon, sugar, cotton candy and vanilla flavors. Non-limiting examples of other food ingredients include flavors, acidulants, and amino acids, coloring agents, bulking agents, modified starches, gums, texturizers, preservatives, antioxidants, emulsifiers, stabilizers, thickeners and gelling agents.
Mogrol glycosides obtained according to this invention may be incorporated as a high intensity natural sweetener in foodstuffs, beverages, pharmaceutical compositions, cosmetics, chewing gums, table top products, cereals, dairy products, toothpastes and other oral cavity compositions, etc.
Mogrol glycosides obtained according to this invention can be used in combination with various physiologically active substances or functional ingredients. Functional ingredients generally are classified into categories such as carotenoids, dietary fiber, fatty acids, saponins, antioxidants, nutraceuticals, flavonoids, isothiocyanates, phenols, plant sterols and stanols (phytosterols and phytostanols); polyols; prebiotics, probiotics; phytoestrogens; soy protein; sulfides/thiols; amino acids; proteins; vitamins; and minerals. Functional ingredients also may be classified based on their health benefits, such as cardiovascular, cholesterol-reducing, and anti-inflammatory.
Mogrol glycosides obtained according to this invention may be applied as a high intensity sweetener to produce zero calorie, reduced calorie or diabetic beverages and food products with improved taste characteristics. It may also be used in drinks, foodstuffs, pharmaceuticals, and other products in which sugar cannot be used. In addition, highly purified target mogrol glycoside(s), particularly, Mog. V, Mog. VI, or Isomog. V, can be used as a sweetener not only for drinks, foodstuffs, and other products dedicated for human consumption, but also in animal feed and fodder with improved characteristics.
Examples of products in which mogrol glycoside(s) may be used as a sweetening compound include, but are not limited to, alcoholic beverages such as vodka, wine, beer, liquor, and sake, etc.; natural juices; refreshing drinks; carbonated soft drinks; diet drinks; zero calorie drinks; reduced calorie drinks and foods; yogurt drinks; instant juices; instant coffee; powdered types of instant beverages; canned products; syrups; fermented soybean paste; soy sauce; vinegar; dressings; mayonnaise; ketchups; curry; soup; instant bouillon; powdered soy sauce; powdered vinegar; types of biscuits; rice biscuit; crackers; bread; chocolates; caramel; candy; chewing gum; jelly; pudding; preserved fruits and vegetables; fresh cream; jam; marmalade; flower paste; powdered milk; ice cream; sorbet; vegetables and fruits packed in bottles; canned and boiled beans; meat and foods boiled in sweetened sauce; agricultural vegetable food products; seafood; ham; sausage; fish ham; fish sausage; fish paste; deep fried fish products; dried seafood products; frozen food products; preserved seaweed; preserved meat; tobacco; medicinal products; and many others.
During the manufacturing of products such as foodstuffs, drinks, pharmaceuticals, cosmetics, table top products, and chewing gum, the conventional methods such as mixing, kneading, dissolution, pickling, permeation, percolation, sprinkling, atomizing, infusing and other methods may be used.
As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. For example, reference to “a cell” includes a combination of two or more cells, and the like. As used herein, the term “about” in reference to a number is generally taken to include numbers that fall within a range of 10% in either direction (greater than or less than) of the number.
EXAMPLES
The biosynthesis of mogrosides in fruit involves a number of consecutive glycosylations of the aglycone mogrol to the final sweet products, including mogroside V (Mog. V). Mog. V has a sweetening capacity that is about 250 times that of sucrose (Kasai et al, Agric Biol Chem (1989)). Mogrosides are reported to have health benefits as well (Li et al., Chin J Nat Med (2014)).
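For illustration, the definition of “about” given above can be written as a simple numeric range. The function names are hypothetical, and treating the two end points as included in the range is an assumption, since the text does not say whether the bounds are inclusive.

```python
def about_range(value: float, tolerance: float = 0.10) -> tuple:
    """Return (low, high) spanning 10% in either direction of the stated number."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(observed: float, stated: float, tolerance: float = 0.10) -> bool:
    """True if `observed` falls within the 'about' range of `stated`.

    End points are treated as inclusive (an assumption made for this sketch).
    """
    low, high = about_range(stated, tolerance)
    return low <= observed <= high


if __name__ == "__main__":
    # Example: a culture temperature of 35 compared against "about 37".
    print(about_range(37.0))
    print(is_about(35.0, 37.0))
```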
A variety of factors are promoting a surge in interest in mogrosides and monkfruit in general, including an explosion in demand for natural sweeteners, difficulties in scalable sourcing of the current lead natural sweetener, rebaudioside M (RebM) from the Stevia plant, the superior taste performance of mogroside V relative to other natural and artificial sweetener products on the market, and the medicinal potential of the plant and fruit.
Purified Mog. V has been approved as a high-intensity sweetening agent in Japan (Jakinovich et al., Journal of Natural Products (1990)) and the extract has gained GRAS status in the USA as a non-nutritive sweetener and flavor enhancer (GRAS 522). Extraction of mogrosides from the fruit can yield a product of varying degrees of purity, often accompanied by undesirable aftertaste. In addition, yields of mogroside from cultivated fruit are limited due to low plant yields and particular cultivation requirements of the plant. Mogrosides are present at ~1% in the fresh fruit and ~4% in the dried fruit. Mog. V is the main component, with a content of 0.5%-1.4% in the dried fruit. Moreover, purification difficulties limit purity for Mog. V, with commercial products from plant extracts being standardized to ~50% Mog. V. A pure Mog. V product is desirable to avoid off flavors, and will be easier to formulate into products, since Mog. V has good solubility potential. It is therefore advantageous to produce sweet mogroside compounds, such as Mog. V, via biotechnological processes.
FIG. 1 shows the chemical structures of Mog. V, Mog. VI, and Isomog. V. Mog. V has five glucosylations with respect to the mogrol core, including glucosylations at the C3 and C24 hydroxyl groups, followed by 1-2, 1-4, and 1-6 glucosyl additions. These glucosylation reactions are catalyzed by uridine diphosphate-dependent glycosyltransferase enzymes (UGTs).
FIG. 2 shows routes to Mog. V production in vivo. The enzymatic transformation required for each step is indicated, along with the type of enzyme required. Numbers in parentheses correspond to the chemical structures in FIG. 3, namely: (1) farnesyl pyrophosphate; (2) squalene; (3) 2,3-oxidosqualene; (4) 2,3;22,23-dioxidosqualene; (5) 24,25-epoxycucurbitadienol; (6) 24,25-dihydroxycucurbitadienol; (7) mogrol; (8) mogroside V; (9) cucurbitadienol.
As illustrated in FIG. 2, mogrosides can be produced by biosynthetic fermentation processes, using microbial strains that produce high levels of MEP pathway products, along with heterologous expression of mogrol biosynthesis enzymes and UGT enzymes that direct glucosylation reactions to Mog. V, or other desired mogroside compound. For example, in bacteria such as E. coli, isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) are produced from glucose, and are converted to farnesyl diphosphate (FPP) (1) by recombinant farnesyl diphosphate synthase (FPPS). FPP is converted to squalene (2) by a condensation reaction catalyzed by squalene synthase (SQS). Squalene is converted to 2,3-oxidosqualene (3) by an epoxidation reaction catalyzed by a squalene epoxidase (SQE). The pathway can proceed to 22,23-dioxidosqualene (4) by further epoxidation followed by cyclization to 24,25-epoxycucurbitadienol (5) by a triterpene cyclase, and then hydration of the remaining epoxy group to 24,25-dihydroxycucurbitadienol (6) by an epoxide hydrolase. A further hydroxylation catalyzed by a P450 oxidase produces mogrol (7).
The pathway can alternatively proceed by cyclization of (3) to produce cucurbitadienol (9), followed by epoxidation to (5), or multiple hydroxylations of cucurbitadienol to (6), or mogrol (7).
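The routes from FPP (1) to mogrol (7) described in the two preceding paragraphs can be summarized, purely as an illustrative sketch, as a small directed graph. The step list below is transcribed from the text; the data layout and function names are assumptions made for the example, and where the text names only a transformation rather than an enzyme, the label records the transformation.

```python
# (substrate, product, transformation or enzyme as named in the text)
STEPS = [
    ("1 FPP", "2 squalene", "squalene synthase (SQS)"),
    ("2 squalene", "3 2,3-oxidosqualene", "squalene epoxidase (SQE)"),
    ("3 2,3-oxidosqualene", "4 dioxidosqualene", "further epoxidation"),
    ("4 dioxidosqualene", "5 24,25-epoxycucurbitadienol", "triterpene cyclase"),
    ("5 24,25-epoxycucurbitadienol", "6 24,25-dihydroxycucurbitadienol", "epoxide hydrolase"),
    ("6 24,25-dihydroxycucurbitadienol", "7 mogrol", "P450 oxidase"),
    # Alternative branch through cucurbitadienol (9)
    ("3 2,3-oxidosqualene", "9 cucurbitadienol", "cyclization"),
    ("9 cucurbitadienol", "5 24,25-epoxycucurbitadienol", "epoxidation"),
    ("9 cucurbitadienol", "6 24,25-dihydroxycucurbitadienol", "multiple hydroxylations"),
    ("9 cucurbitadienol", "7 mogrol", "multiple hydroxylations"),
]


def build_graph(steps):
    """Map each substrate to a list of (product, transformation) pairs."""
    graph = {}
    for substrate, product, transformation in steps:
        graph.setdefault(substrate, []).append((product, transformation))
    return graph


def enumerate_routes(graph, start, goal, path=None):
    """Depth-first enumeration of all routes from start to goal without revisiting a node."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    routes = []
    for product, _transformation in graph.get(start, []):
        if product not in path:
            routes.extend(enumerate_routes(graph, product, goal, path))
    return routes


if __name__ == "__main__":
    graph = build_graph(STEPS)
    for route in enumerate_routes(graph, "1 FPP", "7 mogrol"):
        print(" -> ".join(route))
```

Listing the routes this way makes it easy to see which enzyme activities each route requires when choosing candidates for coexpression.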
FIG. 4 illustrates glucosylation routes to Mog. V, and indicates in vitro biotransformation activity observed for different enzymes. Glucosylation of the C3 hydroxyl produces Mog. I-E, or glucosylation of the C24 hydroxyl produces Mog. I-A1. Glucosylation of Mog. I-A1 at C3 or glucosylation of Mog. I-E1 at C24 produces Mog. II-E. Further 1-6 glucosylation of Mog. II-E at C3 produces Mog. III-A2. Further 1-6 glucosylation at C24 of Mog. II-E produces Mog. III. 1-2 glucosylation of Mog. III-A2 at C24 produces Mog. IV, and then to Mog. V with a further 1-6 glucosylation at C24. Alternatively, glucosylations may proceed through Mog. III, with a 1-6 glucosylation at C3 and a 1-2 glucosylation at C24, or through Siamenoside or Mog. IV with 1-6 glucosylations.
While biosynthetic enzymes from monkfruit (Siraitia grosvenorii) have been identified for production of mogrol (See, WO 2016/038617 and US 2015/0322473, which are hereby incorporated by reference in their entireties), many of these enzymes lack the productivity or physical properties desired for overexpression in microbial hosts, particularly for fermentation approaches that operate at higher temperatures than the natural climate of the plant. Accordingly, alternative enzymes are desired to improve production of mogrol using microbial fermentation, with mogrol acting as the substrate for glucosylation to produce Mog. V.
Using an E. coli strain that produces high levels of the MEP pathway products IPP and DMAPP (see US 2018/0245103 and US 2018/0216137, which are hereby incorporated by reference), and with overexpression of ScFPPS, enzymes were screened for their ability to convert FPP to squalene (SQS activity), as well as epoxidation of squalene to produce 2,3-oxidosqualene (SQE activity). The 2,3-oxidosqualene intermediate can be cyclized by a triterpene cyclase, such as CDS from Siraitia grosvenorii. As demonstrated in FIG. 5, several enzymes were identified with good activity in E. coli. These include AaSQS, SgSQS, EsSQS, BbSQS, ElSQS, and FbSQS. In particular, AaSQS showed high activity in E. coli at 37° C culture conditions.
As shown in FIG. 6, co-expression of Artemisia annua SQS and Methylomonas lenta MlSQE in E. coli provided a substantial gain in titer of the 2,3-oxidosqualene intermediate. Other SQE enzymes were active in E. coli, including BaESQE, MsSQE, and MbSQE.
FIG. 7 shows coexpression of SQS, SQE, and TTC enzymes. Siraitia grosvenorii CDS (or triterpene cyclase, or “TTC”), when coexpressed with AaSQS and MlSQE, resulted in high production of the triterpenoid product, cucurbitadienol (Product 3). These fermentation experiments were performed at 37° C for 48 to 120 hours.
Mogrol was used as a substrate for in vitro glucosylation reactions with candidate UGT enzymes, to identify candidate enzymes that provide efficient glucosylation of mogrol to Mog. V. Reactions were carried out in 50 mM Tris-HCl buffer (pH 7.0) containing beta-mercaptoethanol (5 mM), magnesium chloride (400 µM), substrate (200 µM), UDP-glucose (5 mM), and a phosphatase (1 U). Results are shown in FIG. 8A. Mog. V product is observed when the UGT enzymes 85C1 (S. rebaudiana), 85C2 (S. rebaudiana), and UGTSg94_3 are incubated together. A penta-glycosylated product is formed when the UGT enzymes 85C1 (S. rebaudiana), 85C2 (S. rebaudiana), and CaUGT_1,6 are incubated together. FIG. 8B, Extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogroside II-E. FIG. 8C, Extracted ion chromatogram (EIC) for 1285.4 Da (mogroside V+H) of reactions containing 85C1 + 85C2 and either Sg94_3 (solid dark grey line) or CaUGT_1,6 (light grey line) when incubated with mogrol. Abbreviation: MogV, mogroside V. FIG. 4 and FIG. 9 show additional glycosyltransferase activities observed on particular substrates. Coexpression of UGT enzymes can be selected to move product to any desired mogroside product.
FIG. 10 is an amino acid alignment of CaUGT_1,6 and SgUGT94_289_3 using Clustal Omega (Version CLUSTAL O (1.2.4)). These sequences share 54% amino acid identity. Coffea arabica UGT_1,6 is predicted to be a beta-D-glucosyl crocetin beta 1,6-glucosyltransferase-like (XP_027096357.1). Together with known UGT structures and primary sequences, CaUGT_1,6 can be further engineered for microbial expression and activity, including engineering of a circular permutant.
Biosynthesis enzymes can be further engineered for expression and activity in microbial cells, using known structures and primary sequences. FIG. 11 is an amino acid alignment of Homo sapiens squalene synthase (HsSQS) (NCBI accession NP_004453.3) and AaSQS (SEQ ID NO: 11) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQS has a published crystal structure (PDB entry: 1EZF). These sequences share 42% amino acid identity. FIG. 12 is an amino acid alignment of Homo sapiens squalene epoxidase (HsSQE) (NCBI accession XP_011515548) and MlSQE (SEQ ID NO: 39) using Clustal Omega (Version CLUSTAL O (1.2.4)). HsSQE has a published crystal structure (PDB entry: 6C6N). These sequences share 35% amino acid identity.
SEQUENCES
Farnesyl Pyrophosphate Synthase (FPPS)
Saccharomyces cerevisiae FPPS (SEQ ID NO: 1)
MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTPGGKLNRGLSWDTYA
ILSNKTVEQLGQEEYEKVAILGWCIELLQAYFLVADDMMDKSITRRGQPCWYKVPEVGEIAIND
AFMLEAAIYKLLKSHFRNEKYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSF
IVTFKTAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQDDYLDCFGTPEQIGKIG
TDIQDNKCSWVINKALELASAEQRKTLDENYGKKDSVAEAKCKKIFNDLKIEQLYHEYEESIAK
DLKAKISQVDESRGFKADVLTAFLNKVYKRSK
Squalene Synthase (SQS)
Siraitia grosvenorii SQSa (SEQ ID NO: 2)
MGSLGAILRHPDDFYPLLKLKMAARHAEKQIPPEPHWGFCYTMLHKVSRSFALVIQQLAPELRN AICIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDQFHHVSTA FLELGKGYQEAIEDITKRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHASDLEDL APDSLSNSMGLLLQKTNI IRDYLEDINEIPKSRMFWPREIWGKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCLKYMSNLRDLSIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTQTMADVYGAFFDFSVMLKAKVNSSDPNATKTLSRIEAIQKTCEQSGLLNKRKLYAVKSE PMFNPTLIVILFSLLCI ILAYLSAKRLPANQPV
Siraitia grosvenorii SQSb (SEQ ID NO: 3)
MGSLGAILRHPDDFYPLLKLKMAARHAEKQIPPEPHWGFCYTMLHKVSRSFALVIQQLAPELRN AICIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDQFHHVSTA FLELGKGYQEAIEDITKRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHASDLEDL APDSLSNSMGLLLQKTNI IRDYLEDINEIPKSRMFWPREIWGKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCLKYMSNLRDLSIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTQTMADVYGAFFDFSVMLKAKVNNSDPNATKTLSRIEAIQKTCEQSGLLNKRKLYAVKSE PMFNPTLIVILFSLLCI ILAYLSAKRLPANQPV
Cucumis sativus (SEQ ID NO: 4)
MGSLGAILKHPDDFYPLLKLKIAARHAEKQIPPEPHWGFCYTMLHKVSRSFALVIQQLKPELRN AVCIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDEFHHVSTA FLELGKGYQEAIEDITKRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHAAELEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWGKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCLKYMSNLRDLSIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTKTMADVYGAFFDFSVMLKAKVNSNDPNASKTLSRIEAIQKTCKQSGILNRRKLYWRSE PMFNPAVIVILFSLLCI ILAYLSAKRLPANQSV
Cucumis melo (SEQ ID NO: 5)
MGSLGAILKHPDDFYPLLKLKMAARHAEKQIPPESHWGFCYTMLHKVSRSFALVIQQLKPELRN AVCIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDEFHHVSTA FLELGKGYQEAIEDITKRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHAAELEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWGKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCLKYMSNLRDLSIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTKTMADVYGAFFDFSVMLKAKVNSNDPNASKTLSRIEAIQQTCQQSGLMNKRKLYWRSE PMYNPAVIVILFSLLCI ILAYLSAKRLPANQSV
Cucumis melo (SEQ ID NO: 6)
MGSLGAILKHPDDFYPLLKLKMAARHAEKQIPPESHWGFCYTMLHKVSRSFALVIQQLKPELRN AVCIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDEFHHVSTA FLELGKGYQEAIEDITKRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHAAELEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWGKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCPKYMSNLRDLSIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTKTMADVYGAFFDFSVMLKAKVNSNDPNASKTLSRIEAIQQTCQQSGLMNKRKLYWRSE PMYNPAVIVILFSLLCI ILAYLSAKRLPANQSV
Cucurbita moschata (SEQ ID NO: 7)
MGSLGAILRHPDDIYPLLKLKMAARHAEKQIPPESHWGFCYTMLHKVSRSFALVIQQLKPELRN AVCIFYLVLRALDTVEDDTSIQTDIKVPILKAFHCHIYNRDWHFSCGTKDYKVLMDEFHHVSTA FLELGRGYQEAIEDITKRMGAGMAKFICKEVETVEDYDEYCHYVAGLVGLGLSKLFHASKSENL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWSKYADKLEDFKYEKNSVKAVQCL NDLVTNALTHVEDCLEYMSNLKDLSIFRFCAIPQIMAIGTLALCYNNVDVFRGWKMRRGLTAK VIYRTKTMADVYGAFFDFSVMLKAKVNSSDPNASKTLTRIEAIQKTCKQSGLLNKRELYAVRSE PMCNPAAIWLFSLLCI ILAYLSAKLLPANQPV
Sechium edule (SEQ ID NO: 8)
MGSLGAILSHPDDLYPLLKLKMAAKHAEKQIPPDPHWGFCFSMLHKVSRSFALVIQQLKPELRN AVCIFYLVLRALDTVEDDTGIHPDIKVPILQAFHCHIYNRDWHFSCGTKHYKVLMDEFHHVSTA FLELGKGYQEAIEDVTERMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFHAAELEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWNKYADKLEDFKYEENSVKAVQCL NDLVTNALNHVEDCLKYMSNLKDLSTFRFCAIPQIMAIGTLALCYDNVEVFRGWKMRRGLTAK I IDRTKKIADVYGAFFDFSVMLKAKVNSSDPNAAKTLSRIEAIEKTCKESGLLNKRKLYVIRSE PLFNPAVLVILFSLICILLAYLSAKRLPANQPV
Panax quinquefolius (SEQ ID NO: 9)
MGSLGAILKHPDDFYPLLKLKFAARHAEKQIPPEPHWAFCYSMLHKVSRSFGLVIQQLGPQLRD AVCIFYLVLRALDTVEDDTSIPTEVKVPILMAFHRHIYDKDWHFSCGTKEYKVLMDEFHHVSNA FLELGSGYQEAIEDITMRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGAEDL ATDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVDKLEDLKYEENSAKAVQCL NDMVTDALVHAEDCLKYMSDLRDPAIFRFCAIPQIMAIGTLALCFNNTQVFRGWKMRRGLTAK VIDRTKTMSDVYGAFFDFSCLLKSKVDNNDPNATKTLSRLEAIQKTCKESGTLSKRKSYI IESE SGHNSALIAI IFI ILAILYAYLSSNLLLNKQ
Malus domestica (SEQ ID NO: 10)
MGALSTMLKHPDDIYPLLKLKIASRQIEKQIPAEPHWAFCYTMLQKVSRSFALVIQQLGTELRN AVCLFYLVLRALDTVEDDTSVATDVKVPILLAFHRHIYDPDWHFACGTNNYKVLMDEFHHVSTA FLELGTGYQEAIEDITKRMGAGMAKFILKEVETIDDYDEYCHYVAGLVGLGLSKLFHAAGKEDL ASDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVNKLEDLKYEENSEKAVQCL NDMVTNALIHMEDCLKYMAALRDPAIFKFCAIPQIMAIGTLALCYNNIEVFRGWKMRRGLTAK VIDRTKSMDDVYGAFFDFSSILKSKVDKNDPNATKTLSRVEAVQKLCRDSGALSKRKSYIANRE QSYNSTLIVALFI ILAI IYAYLSASPRI
Artemisia annua (SEQ ID NO: 11)
MSSLKAVLKHPDDFYPLLKLKMAAKKAEKQIPSQPHWAFSYSMLHKVSRSFALVIQQLNPQLRD AVCIFYLVLRALDTVEDDTSIAADIKVPILIAFHKHIYNRDWHFACGTKEYKVLMDQFHHVSTA FLELKRGYQEAIEDITMRMGAGMAKFICKEVETVDDYDEYCHYVAGLVGIGLSKLFHSSGTEIL FSDSISNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPREIWSKYVNKLEDLKYEENSEKAVQCL NDMVTNALIHIEDCLKYMSQLKDPAIFRFCAIPQIMAIGTLALCYNNIEVFRGWKLRRGLTAK VIDRTKTMADVYQAFSDFSDMLKSKVDMHDPNAQTTITRLEAAQKICKDSGTLSNRKSYIVKRE SSYSAALLALLFTILAILYAYLSANRPNKIKFTL
Glycine soja (SEQ ID NO: 12)
MDQRSEDEFYPLLKLKIVARNAEKQIPPEPHWAFCYTMLHKVSRSFALVIQQLGIELRNAVCIF YLVLRALDTVEDDTSIETDVKVPILIAFHRHIYDRDWHFSCGTKEYKVLMGQFHHVSTAFLELG KNYQEAIEDITKRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGSEDLAPDDL SNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSEYVNKLEDLKYEENSVKAVQCLNDMVT NALMHAEDCLTYMAALRDPPIFRFCAIPQIMAIGTLALCYNNIEVFRGWKMRRGLTAKVIDRT KTMADVYGAFFDFASMLEPKVDKNDPNATKTLSRLEAIQKTCRESGLLSKRKSYIVNDESGYGS TMIVILVIMVSI IFAYLSANHHNS
Diospyros kaki (SEQ ID NO: 13)
MGSLAAMLRHPDDVYPLVKLKMAARHAEKQIPPEPHWAFCYTMLHKVSRSFGLVIQQLGTELRN AVCIFYLVLRALDTVEDDTSIATEVKVPILLAFHHHIYDRDWHFSCGTREYKVLMDEFHHVSTA FLELGKGYQEAIEDITMRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGLEDL APDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVNKLEDLKYEKNSVKSVQCL NDMVTNALIHVDDCLKYMSALRDPAIFRFCAIPQIMAIGTLALCYNNIEVFRGWKMRRGLTAK VIDQTKTISDVYGAFFDFSCMLKSKVEKNDPNSTKTLSRIEAIQKTCRESGTLSKRKSYILRSK RTHNSTLIFVLFI ILAILFAYLSANRPPINM
Euphorbia lathyris (SEQ ID NO: 14)
MGSLGAILKHPDDFYPLLKLKMAAKHAEKQIPAQPHWGFCYSMLHKVSRSFSLVIQQLGTELRD AVCIFYLVLRALDTVEDDTSIPTDVKVPILIAFHKHIYDPEWHFSCGTKEYKVLMDQIHHLSTA FLELGKSYQEAIEDITKKMGAGMAKFICKEVETVDDYDEYCHYVAGLVGLGLSKLFDASGFEDL APDDLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVNKLEDLKYEENSVKAVQCL NDMVTNALIHMDDCLKYMSALRDPAIFRFCAIPQIMAIGTLALCYNNVEVFRGWKMRRGLTAK VIDRTRTMADVYRAFFDFSCMMKSKVDRNDPNAEKTLNRLEAVQKTCKESGLLNKRRSYINESK PYNSTMVILLMIVLAI ILAYLSKRAN
Camellia oleifera (SEQ ID NO: 15)
MGSLGAILKHPDDFYPLMKLKMAARRAEKNIPPEPHWGFCYSMLHKVSRSFALVIQQLDTELRN AVCIFYLVLRALDTVEDDTSIATEVKVPILMAFHRHIYDRDWHFSCGTKEYKVLMDEFHHVSTA FSELGRGYQEAIEDITMRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGSEDL ASDSLSNSMGLFLQVFLLTCIKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVNKLEDLKDKENS VKAVECLNDMVTNALIHVEDCLTYMSALRDPSIFRFCAIPQIMAIGTLALCYNNIEVFRGWKM RRGLTAKVIDRTKTMSDVYGGFFDFSCMLKSKVNKSDPNAMKALSRLEAIQKICRESGTLNKRK SYIIKSEPRYNSTLVFVLFI ILAILFAYL
Eleutherococcus senticosus (SEQ ID NO: 16)
MGSLGAILKHPDDFYPLLKLKFAARHAEKQIPPEPHWAFCYSMLHKVSRSFGLVIQQLDAQLRD AVCIFYLVLRALDTVEDDTSIPTEVKVPILMAFHRHIYDKDWHFSCGTKEYKVLMDEFHHVSNA FLELGSGFQEAIEDITMRMGAGMAKFICKEVETIDDYDEYCHYVAGLVGLGLSKLFHASGAEDL ATDSLSNSMGLFLQKTNI IRDYLEDINEIPKSRMFWPRQIWSKYVDKLENLKYEENSAKAVQCL NDMVTNALLHAEDCLKYMSNLRDPAIFRFCAIPQIMAIGTLALCFNNIQVFRGWKMRRGLTAK VIDRTKTMSDVYGAFFDFSCLLKSKVDNNDPNATKTLSRLEAIQKTCKESGTLSKRKSYI IESK SAHNSALIAI IFI ILAILYAYLSSNLPNNQ
Flavobacteriales bacterium (SEQ ID NO: 166)
MLNNSLFSRLEEIPALLKLKLGSKDYYKNNNSETLTCDNLRYCFDTLNKVSRSFATVIKQLPNE LGNNVCVFYLILRALDSIEDDMNLPKELKIKLLREFHKKNYESGWNISGVGDKKEHVELLENYD KVIQSFLAIDQKNQLI ITDICRKVGAGMANFVKAEIESVEDYNLYCHHVAGLVGIGLSRMFISS GLENDDFLNQDEISNSMGLFLQKTNIVRDYREDLDEGRMFWPKDIWHVYGSKINDFAINPTHDQ SVLCLNHMLNNALTHATDCLAYLKHLRNENIFKFCAIPQVMAMATLCKIYSNPDVFIKNVKIRK GLAAKLILNTTSMDEVIKVYKDMLLVIESKISSDNNPVSAETIQLLKQIREYFNDETLIVRKIA
Bacteroidetes bacterium (SEQ ID NO: 167)
MLNSSLFSRLEEIPALLKLKLGSINNYKNNNSENLTSKNLRYCFDTLNKVSRSFASVIKQLPNE LMVNVCLFYLILRALDSIEDDMNLPKDFKINLLREFLDKNYEPGWKISGVGDKKEYVELLENYD KVIQVFLDIDPKNQLI ITDICRKMGAGMAHFVEAEINSVKDYNLYCYHVAGLVGIGLSKMFLAS GLENCDYLNQEEISSSMGLFLQKTNIVRDYKEDMEENRIFWPKEIWRTYASKFSDFSINPQHET SISCLNHMVNDALGHVIDCLEYLRHLRNENIFKFCAIPQVMAMATLCKVYNNPDVFIKTVKIRK
GLAAKLILNTTSMDEVIKVYKGLLLDIENKIPLHNPTSDETLRLIKNIRSYCNNETMWSKTA
Squalene Epoxidase
Siraitia grosvenorii SQE1 (SEQ ID NO: 17)
MVDQCALGWILASALGLVIALCFFVAPRRNHRGVDSKERDECVQSAATTKGECRFNDRDVDVIV VGAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQRV YGYALFKDGKNTRLSYPLENFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEEKG TIKGVQYKSKNGEEKTAYAPLTIVCDGCFSNLRRSLCNPMVDVPSYFVGLVLENCELPFANHGH VILGDPSPILFYQISRTEIRCLVDVPGQKVPSIANGEMEKYLKTWAPQVPPQIYDSFIAAIDK GNIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLSDAST LCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLLS GLNPRPLSLVLHFFAVAIYGVGRLLLPFPSVKGIWIGARLIYSASGI IFPI IRAEGVRQMFFPA TVPAYYRSPPVFKPIV
Siraitia grosvenorii SQE2 (SEQ ID NO: 18)
MVDQCALGWILASVLGAAALYFLFGRKNGGVSNERRHESIKNIATTNGEYKSSNSDGDI I IVGA GVAGSALAYTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLTELGLEDCVDDIDAQRVYGY ALFKDGKDTRLSYPLEKFHSDVAGRSFHNGRFIQRMREKAASLPKVSLEQGTVTSLLEENGI IK GVQYKTKTGQEMTAYAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLVLENCDLPYANHGHVIL ADPSPILFYRISSTEIRCLVDVPGQKVPSISNGEMANYLKNWAPQIPSQLYDSFVAAIDKGNI RTMPNRSMPADPYPTPGALLMGDAFNMRHPLTGGGMTVALSDVWLRDLLKPLRDLNDAPTLSK YLEAFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLLSGLN PRPISLVLHFFAVAIYGVGRLLIPFPSPKRVWIGARI ISGASAI IFPI IKAEGVRQMFFPATVA AYYRAPRWKGR
Momordica charantia (SEQ ID NO: 19)
MVDECALGWILAAALGAVIALCLFVAPKTNNQDGGVDSKATPECVQTTNGECRSDGDSDVI IVG AGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLADCVEEIDAQRVYG YALFKDGKNTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKADSLPNVRLEQGTVTSLLEEKGTI KGVQYKSKDGKEKTAYAPLTIVCDGCFSNLRRSLCNPMVDVPSCFVGLVLENCQLPFANHGHW LGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMEKYLKTWAPQVPPQIYDAFIAAIDKGN IRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLHDAPTLC KYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGMFSNGPVSLLSGL NPRPLSLVLHFFAVAIYGVGRLLFPFPSPKGIWIGARLIYSASGI IFPI IKAEGVRQMFFPATV PAYYRSPPALKPVA
Cucurbita maxima (SEQ ID NO: 20)
MVDYCAFGWILAAVLGLAIALSFFVSPRRNRRGGADSTPRSEGVRSSSTTNGECRSVDGDADVI IVGAGVAGSALAHTLGKDGRLVHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQK VYGYALFKDGKNTQLSYPLEKFQSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEEK GTIKGVQYKSKNGEEKTAYAPLTIVCDGCFSNLRRSLCKPMVDVPSCFVGLVLENCQLPFANHG HWLGDPSPILFYPISSTEIRCLVDVPGQKIPSISNGEMEKYLKTIVAPQVPPQIHDAFIAAID KGNIRTMPNRSMPAAPQPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLNDAP TLCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLL SGLNPRPLSLVLHFFAVAIYGVGRLLLPFPSPKGIWIGARLVYSASGI IFPI IKAEGVRQMFFP ATVPAYYRSPPVHKSIA
Cucurbita moschata (SEQ ID NO: 21)
MVDYCAFGWILAAVLGLAIALSFFVSPRRNRRGGADSTPRSEGVRSSSTTNGECRSVDCDADVI
IVGAGVAGSALAHTLGKDGRLVHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQK
VYGYALFKDGKNTQLSYPLEKFQSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEEK
GTIKGVQYKSKNGEEKTAHAPLTIVCDGCFSNLRRSLCKPMVDVPSCFVGLVLENCQLPFANHG
HWLGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMEKYLKTIVAPQVPPQIHDAFIAAID KGNIRTMPNRSMPAAPQPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLNDAP TLCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLL SGLNPRPLSLVLHFFAVAIYGVGRLLLPFPSPKGIWIGARLVYSASGI IFPI IKAEGVRQMFFP ATVPAYYRSPPVLKTIA
Cucurbita moschata (SEQ ID NO: 22)
MMVDHCAFAWILDWLGLWAVTFFVAAPRRNRRGGTDSTASKDCVISTAIANGECKPDDADAE VI IVGAGVAGSALAYTLGKDGRRVHVIERDLTEPDRIVGEFLQPGGYLKLIELGLGDCVEEIDA QKLYGYALFKDGKNTRVSYPLGNFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLE TKGTIKGVQYKSKNGEEKTAYAPLTIVCDGCFSNLRRSLCKPMVDVPSCFVGLVLENCQLPFAN HGHWLGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGDMEKYLKTWAPQVPPQIHDAFIAA IEKGNVRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLND ASTLCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGVFSNGPIS LLSGLNPRPSSLVLHFFAVAIYGVGRLLLPFPSLKGIWIGARLIYSASGIILPI IKAEGVRQMF FPATVPAYYRSPPVHKPIT
Cucumis sativus (SEQ ID NO: 23)
MVDHCTFGWIFSAFLAFVIAFSFFLSPRKNRRGRGTNSTPRRDCLSSSATTNGECRSVDGDADV I IVGAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQ KVYGYALFKDGKSTRLSYPLENFQSDVSGRSFHNGRFIQRMREKAAFLPNVRLEQGTVTSLLEE KGTITGVQYKSKNGEQKTAYAPLTIVCDGCFSNLRRSLCNPMVDVPSCFVGLVLENCQLPYANL GHWLGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMEKYLKTWAPQVPPQIHDAFIAAI EKGNIRTMPNRSMPAAPQPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLNDA PTLCKYLESFYTLRKPVASTINTLAGALYKVFCASSDQARKEMRQACFDYLSLGGIFSNGPVSL LSGLNPRPLSLVLHFFAVAIYGVGRLLLPFPSPKGIWIGARLVYSASGI IFPI IKAEGVRQMFF PATVPAYYRTPPVFNS
Cucumis melo (SEQ ID NO: 24)
MVDHCAFGWIFSALLAFPIALSLFLSPWRNRRVRGTDSTPRSASVSSSATTNGECRSVDGDADV
VIVGAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQ
KVYGYALFKDGKNTRLSYPLENFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEE
KGTITGVQYKSKNGEQKTAYAPLTIVCDGCFSNLRRSLCTPMVDVPSYFVGLVLENCQLPYANL
GHWLGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMEKYLKTWAPQVPPQIHDAFIAAI
EKGNIRTMPNRSMPAAPQPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLNDA
PTLCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSL LSGLNPRPLSLVLHFFAVAIYGVGRLLLPFPSLKGIWIGARLVYSASGI IFPI IKAEGVRQMFF PATVPAYYRTPPVLNS
Cucurbita maxima (SEQ ID NO: 25)
MMVEHCAYGWILAAVLGLWAVTFFVAVPRRNRRGGTDSTASKDCVISPAIANGECEPEDADAD ADVI IVGAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGEFLQPGGHLKLIELGLGDCVEEI
DAQKLYGYALFKDGKNTRVSYPLGNFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSL LEKKGTIKGVQYKSKNGEEKTAYAPLTIVCDGCFSNLRRSLCKPMVDVPSCFVGLVLENCRLPF ANHGHWLGDPSPILFYPISSTEIRCLVDVPGQKVPSIPNGDMEKYLKTWAPQVPPQIHDAFI AAIEKGNIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDL NDAPTLCKYLESYYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGVFSNGP ISLLSGLNPRPSCLVLHFFAVAIYGVGRLLLPFPSLKGIWIGARLIYSASGI ILPI IKAEGVRQ MFFPATVPAYYRSPPVHKPIT
Ziziphus jujuba (SEQ ID NO: 26)
MLDQCPLGWILASVLGLFVLCNLIVKNRNSKASLEKRSECVKSIATTNGECRSKSDDVDVI IVG AGVAGSALAHTLGKDGRRLHVIERDLTEPDRIVGELLQPGGYLKLIELGLQDCVEEIDAQRVFG YALFKDGKDTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKSASLPNVRLEQGTVTSLLEEKGTI KGVQYKTKTGQELTAFAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLVLENCELPYANHGHVI LADPSPILFYPISSTEVRCLVDVPGQKVPSISNGEMAKYLKSWAPQIPPQIYDAFIAAVDKGN IRTMPNRSMPASPFPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLGDLNDAATLC KYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSTGPVSLLSGL NPRPLSLVLHFFAVAIYGVGRLLLPFPSPKRIWIGARLISGASGI IFPI IKAEGVRQMFFPATV PAYYRAAPVE
Morus alba (SEQ ID NO: 27)
MADPYTMGWILASLLGLFALYYLFVNNKNHREASLQESGSECVKSVAPVKGECRSKNGDADVI I VGAGVAGSALAHTLGKDGRRVHVIERDLAEPDRIVGELLQPGGYLKLIELGLQDCVEEIDSQRV YGYALFKDGKDTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNVQLEQGTVTSLLEENG TIKGVQYKTKTGQELTAYAPLTIVCDGCFSNLRRSLCIPKVDVPSCFVGLVLENCNLPYANHGH WLADPSPILFYPISSTEVRCLVDVPGQKVPSISNGEMAKYLKTWASQIPPQIYDSFVAAVDK GNIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLRDLNDSVT LCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMREACFDYLSLGGVFSEGPVSLLS GLNPRPLSLVCHFFAVAIYGVGRLLLPFPSPKRLWIGARLISGASGI IFPI IRAEGVRQMFFPA TIPAYYRAPRPN
Juglans regia (JrSQE1) (SEQ ID NO: 28)
MVDPYALGWSFASVLMGLVALYILVDKKNRSRVSSEARSEGVESVTTTTSGECRLTDGDADVI I
VGAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLEDCVEDIDAQRV FGYALFKDGKNTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLLNVRLEQGTVTSLLEENG TVKGVQYKTKDGNELTAHAPLTIVCDGCFSNLRRSLCNPQVDVPSSFVGLVLENCELPYANHGH VILADPSPILFYPISSTEVRCLVDVPGKKVPSIANGEMEKYLKNMVAPQLPPEIYDSFVAAVDR GNIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLRDLNDAPT LCKYLESFYTLRKPVASTINTLAGALYKVFCASPDRARKEMRQACFDYLSLGGVFSMGPVSLLS GLNPRPLSLVLHFFAVAVYGVGRLLVPFPSPSRIWIGARLISGASAI IFPI IKAEGVRQMFFPA TVPAYYRAPPVKRDH
Cucumis melo (SEQ ID NO: 29)
MVDQCALGWILASVLGASALYLLFGKKNCGVLNERRRESLKNIATTNGECKSSNSDGDI I IVGA GVAGSALAYTLAKDGRQVHVIERDLSEPDRIVGELLQPGGYLKLTELGLEDCVDDIDAQRVYGY ALFKDGKDTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEENGTIK GVQYKNKSGQEMTAYAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLILENCDLPYANHGHVIL ADPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMANYLKNWAPQIPPQLYNSFIAAIDKGNI RTMPNRSMPADPYPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLRDLNDAPTLCK YLEAFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLLSGLN PRPLSLVLHFFAVAIYGVGRLLIPFPSPKRVWIGARLISGASAI IFPI IKAEGVRQMFFPKTVA AYYRAPPWRER
Cucumis sativus (SEQ ID NO: 30)
MVDQCALGWILASVLGASALYLLFGKKNCGVSNERRRESLKNIATTNGECKSSNSDGDI I IVGA GVAGSALAYTLAKDGRQVHVIERDLSEPDRIVGELLQPGGYLKLTELGLEDCVDEIDAQRVYGY ALFKDGKDTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEENGTIR GVQYKNKSGQEMTAYAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLILENCDLPHANHGHVIL ADPSPILFYPISSTEIRCLVDVPGQKVPSISNGEMANYLKNWAPQIPPQLYNSFIAAIDKGNI RTMPNRSMPADPYPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLRDLNDAPTLCK YLEAFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGIFSNGPVSLLSGLN PRPLSLVLHFFAVAIYGVGRLLIPFPSPKRVWIGARLISGASAI IFPI IKAEGVRQMFFPKTVA AYYRAPPIVRER
Juglans regia (JrSQE2) (SEQ ID NO: 31)
MVDQYALGLILASVLGFWLYNLMAKKNRIRVSSEARTEGVQTVITTTNGECRSIEGDVDVI IV GAGVAGSALAHTLGKDGRKVHVIERDLSEPDRIVGELLQPGGYLKLVELGLQDSVEDIDAQRVF GYALFKDGKNTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNIRLEQGTVTSLLEENGT IKGVQYKTKDGKELAAHAPLTIVCDGCFSNLRRSLCNPQVDVPSSFVGLVLENCELPYANHGHV VLADPSPILFYPISSTEVRCLVDVPGQKVPSISNGEMAKYLKTMVAPQVPPEIYDSFVAAVDRG NIRTMPNRSMPAAPQPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLRPLRDLNDAPTL CKYLESFYTLRKPVASTINTLAGALYKVFCASPDRARNEMRQACFDYLSLGGVFSTGPVSLLSG LNPRPLSLVLHFFAVAVYGVGRLLVPFPSPSRMWIGARLISGASAI IFPIIKAEGVRQMFFPAT VPAYYRAPPVNCQARSLKPDALKGL
Theobroma cacao (SEQ ID NO: 32)
MADSYVWGWILGSVMTLVALCGWLKRRKGSGISATRTESVKCVSSINGKCRSADGSDADVI IV GAGVAGSALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLEDCVEEIDAQQVF GYALFKDGKHTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKSASLPNVRLEQGTVTSLLEEKGT IRGVQYKTKDGRELTAFAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLVLENCNLPYSNHGHV ILADPSPILFYPISSTEVRCLVDVPGQKVPSIANGEMANYLKTIVAPQVPPEIYNSFVAAVDKG NIRTMPNRSMPAAPYPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLRPLRDLNDAPTL CKYLESFYTLRKPIASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGVFSTGPISLLSG LNPRPVSLVLHFFAVAIYGVGRLLLPFPSPKRIWIGARLISGASGI IFPIIKAEGVRQMFFPAT VPAYYRAPPVE
Cucurbita moschata (SEQ ID NO: 33)
MMVDHCAFAWILDWLGLWAVTFFVAAPRRNRRGGTDSTASKDCVISTAIANGECKPDDADAE VI IVGAGVAGSALAYTLGKDGRRVHVIERDLTEPDRIVGEFLQPGGYLKLIELGLGDCVEEIDA QKLYGYALFKDGKNTRVSYPLGNFHSDVSGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLE TKGTIKGVQYKSKNGEEKTAYAPLTIVCDGCFSNLRRSLCKPMVDVPSCFVGLVLENCQLPFAN HGHWLGDPSPILFYPISSTEIRCLVDVPGQKVPSISNGDMEKYLKTWAPQVPPQIHDAFIAA IEKGNVRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLKDLND ASTLCKYLESFYTLRKPVASTINTLAGALYKVFCASPDQARKEMRQACFDYLSLGGVFSNGPIS LLSGLNPRPSSLVLHFFAVAIYGVGRLLLPFPSLKGIWIGARLIYSASGIILPI IKAEGVRQMF FPATVPAYYRSPPVHKPIT
Phaseolus vulgaris (SEQ ID NO: 34)
MLDTYVFGWI ICAALSVFVIRNFVFAGKKCCASSETDASMCAENITTAAGECRSSMRDGEFDVL IVGAGVAGSALAYTLGKDGRQVLVIERDLSEPDRIVGELLQPGGYLKLIELGLEDCVDKIDAQQ VFGYALFKDGKHIRLSYPLEKFHSDVAGRSFHNGRFIQRMREKAASLPNVRLEQGTVTSLLEEK GVIKGVQYKTKDSQELSVCAPFTIVCDGCFSNLRRSLCDPKVDVPSCFVGLVLENCELPCANHG HVILGEPSPVLFYPISSTEIRCLVDVPGQKVPSISNGEMAKYLKTVIAPQVPHELHNAFIAAVD
KGSIRTMPNRSMPAAPYPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLRPLRDLNDAP SLCKYLESFYTLRKPVASTINTLAGALYKVFCASSDPARKEMRQACFDYLSLGGQFSEGPISLL SGLNPRPLTLVLHFFAVATYGVGRLLLPFPSPKRMWIGLRLISSASGI IMPI IKAEGVRQMFFP ATVPAYYRNPPAA
Hevea brasiliensis (SEQ ID NO: 35)
MKMADHYLLGWILASVMGLFAFYYIVYLLVKPEEDNNRRSLPQPRSDFVKTMTATNGECRSDDD SDVDVI IVGAGVAGAALAHTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLEDCVE EIDAQRVFGYALFKDGKHTQLAYPLEKFHSEVAGRSFHNGRFIQRMREKAASLPSVKLEQGTVT SLLEEKGTIKGVLYKTKTGEELTAFAPLTIVCDGCFSNLRRSLCNPKVDVPSCFVGLVLENCRL PYANNGHVILADPSPILFYPISSTEVRSLVDVPGQKVPSVSSGEMANYLKNWAPQVPPEIYDS FVAAVDKGNIRTMPNRSMPASPYPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRDLLKPLR DLHDAPTLCRYLESFYTLRKPVASTINTLAGALYKVFCASPDEARKEMRQACFDYLSLGGVFST GPVSLLSGLNPRPLSLVLHFFAVAIYGVGRLLLPFPSPHRIWVGARLISGASGI IFPIIKAEGV RQMFFPATVPAYYRAPPIKCN
Sorghum bicolor (SEQ ID NO: 36)
MAAAAAAASGVGFQLIGAAAATLLAAVLVAAVLGRRRRRARPQAPLVEAKPAPEGGCAVGDGRT DVIIVGAGVAGSALAYTLGKDGRRVHVIERDLTEPDRIVGELLQPGGYLKLIELGLEDCVEEID AQRVLGYALFKDGRNTKLAYPLEKFHSDVAGRSFHNGRFIQRMRQKAASLPNVQLEQGTVTSLL EENGTVKGVQYKTKSGEELKAYAPLTIVCDGCFSNLRRALCSPKVDVPSCFVGLVLENCQLPHP NHGHVILANPSPILFYPISSTEVRCLVDVPGQKVPSIASGEMANYLKTWAPQIPPEIYDSFIA AIDKGSIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPLHNLH DASSLCKYLESFYTLRKPVASTINTLAGALYKVFSASPDQARNEMRQACFDYLSLGGVFSNGPI ALLSGLNPRPLSLVAHFFAVAIYGVGRLMLPLPSPKRMWIGARLISGACGI ILPI IKAEGVRQM FFPATVPAYYRAAPMGE
Zea mays (SEQ ID NO: 37)
MRKNLEEAGCAVSDGGTDVI IVGAGVAGSALAYTLGKDGRRVHVIERDLTEPDRIVGELLQPGG YLKLIELGLQDCVEEIDAQRVLGYALFKDGRNTKLAYPLEKFHSDVAGRSFHNGRFIQRMRQKA ASLPNVQLEQGTVTSLLEENGTVKGVQYKTKSGEELKAYAPLTIVCDGCFSNLRRALCSPKVDV PSCFVGLVLENCQLPHPNHGHVILANPSPILFYPISSTEVRCLVDVPGQKVPSIATGEMANYLK TWAPQIPPEIYDSFIAAIDKGSIRTMPNRSMPAAPHPTPGALLMGDAFNMRHPLTGGGMTVAL SDIWLRNLLKPLRNLHDASSLCKYLESFYTLRKPVASTINTLAGALYKVFSASPDQARNEMRQ ACFDYLSLGGVFSNGPIALLSGLNPRPLSLVAHFFAVAIYGVGRLMLPLPSPKRMWIGARLISG ACGI ILPI IKAEGVRQMFFPATVPAYYRAAPTGEKA
Medicago sativa (SEQ ID NO: 38)
MDLYNIGWILSSVLSLFALYNLIFSGKRNYHDVNDKVKDSVTSTDAGDIQSEKLNGDADVI IVG AGIAGAALAHTLGKDGRRVHI IERDLSEPDRIVGELLQPGGYLKLVELGLQDCVDNIDAQRVFG YALFKDGKHTRLSYPLEKFHSDVSGRSFHNGRFIQRMREKAASLPNVNMEQGTVISLLEEKGTI KGVQYKNKDGQALTAYAPLTIVCDGCFSNLRRSLCNPKVDNPSCFVGLILENCELPCANHGHVI LGDPSPILFYPISSTEIRCLVDVPGTKVPSISNGDMTKYLKTTVAPQVPPELYDAFIAAVDKGN IRTMPNRSMPADPRPTPGAVLMGDAFNMRHPLTGGGMTVALSDIWLRNLLKPMRDLNDAPTLC KYLESFYTLRKPVASTINTLAGALYKVFSASPDEARKEMRQACFDYLSLGGLFSEGPISLLSGL NPRPLSLVLHFFAVAVFGVGRLLLPFPSPKRVWIGARLLSGASGI ILPI IKAEGIRQMFFPATV PAYYRAPPVNAF
Methylomonas lenta (SEQ ID NO: 39)
MKEEFDICI IGAGMAGATISAYLAPKGIKIALIDHCYKEKKRIVGELLQPGAVLSLEQMGLSHL LDGFEAQTVKGYALLQGNEKTTIPYPSQHEGIGLHNGRFLQQIRASALENSSVTQIHGKALQLL ENERNEI IGVSYRESITSQIKSIYAPLTITSDGFFSNFRAHLSNNQKTVTSYFIGLILKDCEMP FPKHGHVFLSGPTPFICYPISDNEVRLLIDFPGEQLPRKNLLQEHLDTNVTPYIPECMRSSYAQ AIQEGGFKVMPNHYMAAKPIVRKGAVMLGDALNMRHPLTGGGLTAVFSDIQILSAHLLAMPDFK NTDLIHEKIEAYYRDRKRANANLNILANALYAVMSNDLLKTAVFKYLQCGGANAQESIAVLAGL NRKHFSLIKQFCFLAVFGACNLLQQSISNIPKALKILKDAFVI IKPLIKNELS
Bathymodiolus azoricus Endosymbiont (SEQ ID NO: 168)
MHTTSEHNDLFDICIVGAGMAGATIATYLAPRGIKIALIDRDYAEKRRIVGELLQPGAVQTLKK MGLEHLLEGFDAQPIYGYALFNKDCEFSIEYNQDKSTNYRGVGLHNGRFLQKIREDALKQPSIT QIHGTVSELIEDENHWTGVKYKEKYTRELKTVNAKLTITSDGFFSSFRKDLTNNVKTVTSFFV GIILKDCELPYPHHGHVFLSAPTPFICYPISSTESRLLIDFPGDQAPKKEAVKHHIENNVIPFL PKEFRLCLDQALRENDYKIMPNHYMPAKPVLKKGWLLGDALNMRHPITGGGLTAVFNDVYLLS THLLAMPDFNDTKLIHEKVNLYYNDRYHANTNVNIMANALYGVMSNDLLKQSVFEYLRKGGDNS GGPISLLAGLNRNPTILIKHFFSVALLCLRNLFKAHKMSLTNAFYVIKDAFCI IVPLAINELRP SSFLKKNIHN
Methyloprofundus sediment (SEQ ID NO: 169)
MNTSPEHNDLFDICIVGVGMAGATIAAYLAPRGLKIALIDREYTEKRRIVGELLQPGAVQTLKK
MGLEHLLEGFDAQPIYGYALFNNDKEFSISYNSDDSTEYHGVGLHNGRFLQKIREDVFKNETVT
QIHGTVSELIEDKKGWKGVTYREKHTREYKTVKAKLTVTSDGFFSNFRKDLSNNVKTVTSFFI
GLVLNDCNLPFPNHGHVFLSAPTPFICYPISSTETRLLIDYPGDKAPKKDEIREHILNKVAPFL
PEEFKECFANAMEDDDFKVMPNHYMPAKPVLKEGAVLLGDALNMRHPLTGGGLTAVFNDVYLLS
THLLAMPDFNDPKLLHEKLELYYQDRYHANTNVNIMANALYGVMSNDLLKQGVFEYLRKGGDNS GGPITLLAGLNRNPTLLIKHFFSVAFLCICNLSGNNKMNFTNVFRVMKDAFCI IKPLAVNELRP
SSFYKKNIQL
Methylomicrobium buryatense (SEQ ID NO: 170)
MESNFDICI IGAGMAGATIAAYLAPKGINIALIDHCYKEKKRIVGELLQPGAVLSLEQLGLGHL LDGIDAQPVEGYALLQGNEQTTIPYPSPNHGMGLHNGRFLQQIRASALQNSSVTQIQGKALSLL ENEQNEI IGVNYRDSVSNEIKSIYAPLTITSDGFFSNFRELLSNNEKTVTSYFIGLILKDCEIP VPKHGHVFLSGPTPFICYPISSNEVRLLIDFPGGQFPRKAFLQAHLETNVTPYIPEGMQTSYRH ALQEDRLKVMPNHYMAAKPKIRKGAVMLGDALNMRHPLTGGGLTAVFSDIEILSGHLLAMPDFN NNDLIYQKIEAYYRDRQYANANLNILANALYGVMSNELLKNSVFKYLQRGGVNAKESIAILAGL NKNHYSLMKQFFFVALFGAYTLVRENITNLPKATKILSDALTI IKPLAKNELSLVGIFSDYFKR
Cucurbitadienol Synthase (CDS), Triterpene Synthase (TTP)
Siraitia grosvenorii CDS (SEQ ID NO: 40)
MWRLKVGAESVGENDEKWLKSISNHLGRQVWEFCPDAGTQQQLLQVHKARKAFHDDRFHRKQSS
DLFITIQYGKEVENGGKTAGVKLKEGEEVRKEAVESSLERALSFYSSIQTSDGNWASDLGGPMF
LLPGLVIALYVTGVLNSVLSKHHRQEMCRYVYNHQNEDGGWGLHIEGPSTMFGSALNYVALRLL
GEDANAGAMPKARAWILDHGGATGITSWGKLWLSVLGVYEWSGNNPLPPEFWLFPYFLPFHPGR
MWCHCRMVYLPMSYLYGKRFVGPITPIVLSLRKELYAVPYHEIDWNKSRNTCAKEDLYYPHPKM
QDILWGSLHHVYEPLFTRWPAKRLREKALQTAMQHIHYEDENTRYICLGPVNKVLNLLCCWVED
PYSDAFKLHLQRVHDYLWVAEDGMKMQGYNGSQLWDTAFSIQAIVSTKLVDNYGPTLRKAHDFV
KSSQIQQDCPGDPNVWYRHIHKGAWPFSTRDHGWLISDCTAEGLKAALMLSKLPSETVGESLER
NRLCDAVNVLLSLQNDNGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTSATMEALTLF
KKLHPGHRTKEIDTAIVRAANFLENMQRTDGSWYGCWGVCFTYAGWFGIKGLVAAGRTYNNCLA
IRKACDFLLSKELPGGGWGESYLSCQNKVYTNLEGNRPHLVNTAWVLMALIEAGQAERDPTPLH
RAARLLINSQLENGDFPQQEIMGVFNKNCMITYAAYRNIFPIWALGEYCHRVLTE
Momordica charantia (SEQ ID NO: 41)
MWRLKVGAESVGENDEKWVKSISNHLGRQVWEFCPDAGTPQQLLQIEKARKAFQDNRFHRKQTS DLLVSIQCEKGTTNGARVPGTKLKEGEEVRKEAVKSTLERALSFYSSIQTSDGNWASDLGGPMF LLPGLVIALCVTGALNSVLSKHHRQEMCRYLYNHQNEDGGWGLHIESPSTMFGSALNYVALRLL GEDADGGEGRAMTKARAWILGHGGATAITSWGKLWLSVLGVYEWSGNNPLPPEFWLLPYFLPFH PGRMWCHCRMVYLPMSYLYGKRFVGPITPWLSLRKELYTVPYHEIDWNKSRNTCAKEDLYYPH SKMQDILWGSIHHMYEPLFTHWPAKRLREKALKTAMQHIHYEDENTRYICLGPVNKVLNMLCCW VEDPYSEAFKLHLQRVHDYLWVAEDGMKMQGYNGSQLWDTAFSVQAI ISTKLVDNYGPTLRKAH DYVKNSQIQQDCPGEPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLMLSKLPSETVGEP LERNRLCDAVNVLLSLQNDNGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTSATMEAL
ALFKKLHPGHRTKEIDTAIARAADFLENMQRTDGSWYGCWGVCFTYAGWFGIKGLVAAGRAYSN
CLAIRKACDFLLSKELPGGGWGESYLSCQNKVYTNLEGNRPHLVNTAWVLMALIEAGQGERDPA
PLHRAARLLINSQLENGDFPQEEIMGVFNKNCMITYAAYRNIFPIWALGEYCHRVLTE
Cucurbita maxima (SEQ ID NO: 42)
MWRLKVGAESVGEKDEKWVKSVSNHLGRQVWEFCADAAADTPHQLLQIQNARNHFHHNRFHRKQ
SSDLFLAIQYEKEIAKGAKGGAVKVKEGEEVGKEAVKSTLERALGFYSAVQTSDGNWASDLGGP
MFLLPGLVIALHVTGVLNSVLSKHHRVEMCRYLYNHQNEDGGWGLHIEGTSTMFGSALNYVALR
LLGEDADGGDGGAMTKARAWILERGGATAITSWGKLWLSVLGVYEWSGNNPLPPEFWLLPYSLP
FHPGRMWCHCRMVYLPMSYLYGKRFVGPITPKVLSLRQELYTIPYHEIDWNKSRNTCAKEDLYY
PHPKMQDILWGSIYHVYEPLFTRWPGKRLREKALQAAMKHIHYEDENSRYICLGPVNKVLNMLC
CWVEDPYSDAFKLHLQRVHDYLWVAEDGMRMQGYNGSQLWDTAFSIQAIVATKLVDSYAPTLRK
AHDFVKDSQIQEDCPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLMLSKLPSTMVG
EPLEKNRLCDAVNVLLSLQNDNGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTAATME
ALTLFKKLHPGHRTKEIDTAIGKAANFLEKMQRADGSWYGCWGVCFTYAGWFGIKGLVAAGRTY
NSCLAIRKACEFLLSKELPGGGWGESYLSCQNKVYTNLEGNKPHLVNTAWVLMALIEAGQGERD
PAPLHRAARLLMNSQLENGDFVQQEIMGVFNKNCMITYAAYRNIFPIWALGEYCHRVLTE
Citrullus colocynthis (CcCDS1) (SEQ ID NO: 43)
MWRLKVGAESVGEKEEKWLKSISNHLGRQVWEFCADQPTASPNHLQQIDNARKHFRNNRFHRKQ SSDLFLAIQNEKEIANGTKGGGIKVKEEEDVRKETVKNTVERALSFYSAIQTNDGNWASDLGGP MFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYLYNHQNEDGGWGLHIEGTSTMFGSALNYVALR LLGEDADGGEGGAMTKARGWILDRGGATAITSWGKLWLSVLGVYEWSGNNPLPPEFWLLPYCLP FHPGRMWCHCRMVYLPMSYLYGKRFVGPITPIVLSLRKELYTIPYHEIDWNKSRNTCAKEDLYY PHPKMQDILWGSIYHLYEPLFTRWPGKRLREKALQMAMKHIHYEDENSRYICLGPVNKVLNMLC CWVEDPYSDAFKFHLQRVPDYLWIAEDGMRMQGYNGSQLWDTAFSVQAI ISTKLIDSFGTTLKK AHDFVKDSQIQQDFPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLMLSKLPSKIVG EPLEKSRLCDAVNVLLSLQNENGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTSATME ALTLFKKLHPGHRTKEIDTAVAKAANFLENMQRTDGSWYGCWGVCFTYAGWFGIKGLVAAGRTY STCVAIRKACDFLLSKELPGGGWGESYLSCQNKVYTNLEGNRPHLVNTAWVLMALIEAGQAERD PAPLHRAARLLINSQLENGDFPQEEIMGVFNKNCMITYAAYRNIFPIWALGEYFHRVLTE
Citrullus colocynthis (CcCDS2) (SEQ ID NO: 44)
MWRLKVGAESVGEKEEKWLKSISNHLGRQVWEFCAHQPTASPNHLQQIDNARNHFRNNRFHRKQ
SSDLFLAIQNEKEIANVTKGGGIKVKEEEDVRKETVKNTVERALSFYSAIQTNDGNWASDLGGP MFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYLYNHQNEDGGWGLHIEGTSTMFGSALNYVALR LLGEDADGGEGGAMTKARSWILDRGGATAITSWGKLWLSVLGVYEWSGNNPLPPEFWLLPYCLP FHPGRMWCHCRMVYLPMSYLYGKRFVGPITPIVLSLRKELYTIPYHEIDWNRSRNTCAKEDLYY PHPKMQDILWGSIYHLYEPLFTRWPGKRLREKALQMAMKHIHYEDENSRYICLGPVNKVLNMLC CWVEDPYSDAFKFHLQRVPDYLWVAEDGMRMQGYNGSQLWDTAFSVQAI ISTKLIDSFGTTLKK AHDFVKDSQIQQDCPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLMLSKLPSKIVG EPLEKSRLCDAVNVLLSLQNENGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTSATME ALTLFKKLHPGHRTKEIDIAVARAANFLENMQRTDGSWYGCWGVCFTYAGWFGIKGLVAAGRTY NSCVAIRKACDFLLSKELPGGGWGESYLSCQNKVYTNLEGNRPHLVNTAWVLMALIEAGQAERD PAPLHRAARLLINSQLENGDFPQEEIMGVFNKNCMITYAAYRNIFPIWALGEYFHRVLTE
Cucurbita moschata (SEQ ID NO: 45)
MWRLKVGAESVGEKDEKWVKSVSNHLGRQVWEFCADAAAAATPRQLLQIQNARNHFHRNRFHRK
QSSDLFLAIQYEKEIAEGGKGGAVKVKEEEEVGKEAVKSTLERALSFYSAVQTSDGNWASDLGG
PMFLLPGLVIALYVTGVLNSVLSKHHRVEMCRYLYNHQNEDGGWGLHIEGTSTMFGSALNYVAL
RLLGEDADGGDDGAMTKARAWILERGGATAITSWGKLWLSVLGVYEWSGNNPLPPEFWLLPYSL
PFHPGRMWCHCRMVYLPMSYLYGKRFVGPITPKVLSLRQELYTVPYHEIDWNKSRNTCAKEDLY
YPHPKMQDILWGSIYHVYEPLFTRWPGKRLREKALQTAMKHIHYEDENSRYICLGPVNKVLNML
CCWVEDPYSDAFKLHLQRVHDYLWVAEDGMRMQGYNGSQLWDTAFSIQAIVATKLVDSFAPTLR
KAHDFVKDSQIQEDCPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLMLSKLPSTMV
GEPLEKNRLCDAVNVLLSLQNDNGGFASYELTRSYPWLELINPAETFGDIVIDYPYVECTAATM
EALTLFKKLHPGHRTKEIDTAVGKAANFLEKMQRADGSWYGCWGVCFTYAGWFGIKGLVAAGRT
YNSCLAIRKACEFLLSKELPGGGWGESYLSCQNKVYTNLEGNKPHLVNTAWVLMALIEAGQGER
DPAPLHRAARLLMNSQLENGDFVQQEIMGVFNKNCMITYAAYRNIFPIWALGEYCHRVLTE
Cucumis sativus (SEQ ID NO: 46)
MWRLKVGKESVGEKEEKWIKSISNHLGRQVWEFCAENDDDDDDEAVIHWANSSKHLLQQQRRQ
SSFENARKQFRNNRFHRKQSSDLFLTIQYEKEIARNGAKNGGNTKVKEGEDVKKEAVNNTLERA
LSFYSAIQTSDGNWASDLGGPMFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYIYNHQNEDGGW
GLHIEGSSTMFGSALNYVALRLLGEDANGGECGAMTKARSWILERGGATAITSWGKLWLSVLGV
YEWSGNNPLPPEFWLLPYSLPFHPGRMWCHCRMVYLPMSYLYGKRFVGPITHMVLSLRKELYTI
PYHEIDWNRSRNTCAQEDLYYPHPKMQDILWGSIYHVYEPLFNGWPGRRLREKAMKIAMEHIHY
EDENSRYIYLGPVNKVLNMLCCWVEDPYSDAFKFHLQRIPDYLWLAEDGMRMQGYNGSQLWDTA
FSIQAILSTKLIDTFGSTLRKAHHFVKHSQIQEDCPGDPNVWFRHIHKGAWPFSTRDHGWLISD
CTAEGLKASLMLSKLPSKIVGEPLEKNRLCDAVNVLLSLQNENGGFASYELTRSYPWLELINPA
ETFGDIVIDYSYVECTSATMEALALFKKLHPGHRTKEIDAALAKAANFLENMQRTDGSWYGCWG
VCFTYAGWFGIKGLVAAGRTYNNCVAIRKACHFLLSKELPGGGWGESYLSCQNKVYTNLEGNRP HLVNTAWVLMALIEAGQGERDPAPLHRAARLLINSQLENGDFPQQEIMGVFNKNCMITYAAYRN
IFPIWALGEYSHRVLTE
Cucumis melo (SEQ ID NO: 47)
MWRLKVGKESVGEKEEKWIKSISNHLGRQVWEFCSGENENDDDEAIAVANNSASKFENARNHFR NNRFHRKQSSDLFLAIQCEKEI IRNGAKNEGTTKVKEGEDVKKEAVKNTLERALSFYSAVQTSD GNWASDLGGPMFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYIYNHQNEDGGWGLHIEGSSTMF GSALNYVALRLLGEAADGGEHGAMTKARSWILERGGATAITSWGKLWLSVLGVYEWSGNNPLPP EFWLLPYSLPFHPGRMWCHCRMVYLPMSYLYGKRFVGPITPIVLSLRKELYTIPYHEIDWNRSR NTCAKEDLYYPHPKMQDILWGSIYHVYEPLFSGWPGKRLREKAMKIAMEHIHYEDENSRYICLG PVNKVLNMLCCWVEDPYSDAFKFHLQRIPDYLWLAEDGMRMQGYNGSQLWDTAFSIQAI ISTKL IDTFGPTLRKAHHFVKHSQIQEDCPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASLM LSKLPSKIVGEPLEKNRLCDAVNVLLSLQNENGGFASYELTRSYPWLELINPAETFGDIVIDYS YVECTSATMEALALFKKLHPGHRTKEIDAAIAKAANFLENMQKTDGSWYGCWGVCFTYAGWFGI KGLVAAGRTYNNCVAIRKACNFLLSKELPGGGWGESYLSCQNKVYTNLEGNKPHLVNTAWVMMA LIEAGQGERDPAPLHRAARLLINSQLESGDFPQQEIMGVFNKNCMITYAAYRNIFPIWALGEYS HRVLDM
Citrullus lanatus subsp. vulgaris (SEQ ID NO: 48)
DGNWASDLGGPMFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYLYNHQNEDGGWGLHIEGTSTM FGSALNYVALRLLGEDADGGEGGAMTKARSWILDRGGATAITSWGKLWLSVLGVYEWSGNNPLP PEFWLLPYCLPFHPGRMWCHCRMVYLPMSYLYGKRFVGPITPIVLSLRKELYTIPYHEIDWNRS RNTCAKEDLYYPHPKMQDILWGSIYHLYEPLFTRWPGKRLREKALQMAMKHIHYEDENSRYICL GPVNKVLNMLCCWVEDPYSDAFKFHLQRVPDYLWVAEDGMRMQGYNGSQLWDTAFSVQAI ISTK LIDSFGTTLKKAHDFVKDSQIQQDCPGDPNVWFRHIHKGAWPFSTRDHGWLISDCTAEGLKASL MLSKLPSEIVGEPLEKSRLCDAVNVLLSLQNENGGFASYELTRSYPWLELINPAETFGDIVIDY PYVECTSATMEALTLFKKLHPGRRTKEIDIAVARAANFLENMQRTDGSWYGCWGVCFTYAGWFG IKGLVAAGRTYNSCVAIRKACDFLLSKELPGGGWGESYLSCQNKVYTNLEGNRPHLVNTAWVLM ALIEAGQAERDPAPLHRAARLLINSQLENGDFPQEEIMGVFNKNCMITYAAYRNIFPIWALGEY FHRVLTE
Theobroma cacao (SEQ ID NO: 49)
MWRLKIGKESVGDNGAWLRSSNDHVGRQVWEFCPESGTPEELSKVEMARQSFSTDRLLKKHSSD LLMRIQYAKENQFVTNFPQVKLKEFEDVKEEATLTTLRRALNFYSTIQADDGHWPGDYGGPMFL LPGLVITLSVTGALNAVLSKEHQYEMCRYLYNHQNRDGGWGLHIEGPSTMFGTVLNYVTLRLLG EGPEGGQGAVEKACEWILEHGSATAITSWGKMWLSVLGAYEWSGNNPLPPEVWLCPYFLPIHPG RMWCHCRMVYLPMSYLYGKRFVGPITPI ILSLRKELYAVPYHEVDWNKARNTCAKEDLYYPHPL
VQDILWASLHYLYEPIFTRWPCKSLREKALRTVMQHIHYEDENTRYICIGPVNKVLNMLSCWVE DPYSESFKLHLPRILDYLWIAEDGMKMQGYNGSQLWDTAFAVQAI ISTGLADEYGPILRKAHDF IKYSQVLEDCPGDLNFWYRHISKGAWPFSTVDHGWPISDCTSEGLKAVLLLSTLPSESVGEPLH MMRLYDAVNVILSLQNVDGGFPTYELTRSYQWLELINPAETFGDIVIDYPYVECTSAAIQALIS FKKLFPEHRMEEIENCIGRAVEFIEKIQAADGSWYGSWGVCFTYAGWFGIKGLSAAGRTYNNSS NIRKACDFLLSKELATGGWGESYLSCQNKVYTNLEGARPHIVNTSWALLALIEAGQAERDPTPL HRAARILINSQMEDGDFPQEEIMGVFNKNCMISYSAYRNIFPIWALGEYTCRVLRAP
Ziziphus jujuba (SEQ ID NO: 50)
MWKLKIGAETVGEGGSDGWLRSVNSHLGRQVWEFHPELGTPEELRQIQDARDAFFNHRFHKQHS
SDLLMRIQFAKENPCVANPPQVKVKDTDEVTEESVTTTLRRAINFYSTIQAHDGHWAGDYGGPM
FLLPGLVITLSVTGALNAVLSKEHQCEMCRYIYNHQNEDGGWGLHIEGPSTMFGTVLNYVSLRL
LGEGAEDGLGTIENARKWILDHGGATAITSWGKMWLSVLGVYEWSGNNPLPPEVWLCPYTLPFH
PGRMWCHCRMVYLPMSYLYGKRFVGPITPTIRSLRKELYTAPYHEIDWNRARNECAKEDLYYPH
PLVQDVLWASLHYVYEPIFMRWPAKKLREKALSTVMQHIHYEDENTRYICIGPVNKVLNMLCCW
VEDPNSEAFKLHLPRISDYLWIAEDGMKMQGYNGSQLWDTAFAVQAIVSTDLAEEYGPTIRKAH
EYIKNSQVLEDCPGDLNFWYRHISKGAWPFSTADHGWPISDCTAEGLKAVLLLSQLSSETVGDS
LDVKRLFNAVNVILSLQNGDGGFATYELTRSYQWLELINPAETFGDIVIDYPYVECTSAALEAL
TLFKKSYPGHRREEVENCITNAAMFIENIQAKDGSWYGSWGVCFTYAGWFGIKGLVASGRTYEN
CPSIRKACDFLLSKELPSGGWGESYLSCQNKVYTNLKDNKPHIVNTAWAMLALIVARQAERDPM
PLHRAARILIKSQMHDGDFPQEEIMGVFNKNCMISYAAYRNIFPIWALGEYRLHVLRSL
Prunus avium (SEQ ID NO: 51)
MWKLKIGAETVGEGGYQWLKSVNNHLGRQVWEFNPELGSPEELQRIEDARKAFWDNRFERRHSS DLLMRIQFEKENQCVTNLPQLKVKYEEEVTEEWKTTLRRAISFYSTIQAHDGHWPGDYGGPMF LLPGLVITLSITGALNDVLSKEHQHEMCRYLYNHQNKDGGWGLHIEGPSTMFGTALNYVTLRLF GEGADDGEGAMELARKWILDHGGVTKITSWGKMWLSVLGTYEWSGNNPLPPEVWLCPYSLPFHP GRMWCHCRMVYLPMSYLYGKRFVGPITPTIRSLRKELYGVPYHEVDWNQARNLCAKEDLYYPHP MVQDILWASLHYVYEPVFTRWPAKKLRENALQTVMQHIHYEDENTRYICIGPVNKVLNMLCCWA EDPNSDAFKLHLPRIPDYLWVAEDGMKMQGYNGSQSWDTSFAVQAI ISTNLAEEFGPTLRKAHE YIKDSQVLEDCPGDLNFWYRHISKGAWPFSTADHGWPISDCTAEGLKAVLLLSKLPTGTVGESL DMKQLYDAVNVMLSLQNEDGGFATYELTRSYQWLELINPAETFGDIVIDYPYVECTSAAIQALT MFRKLYPGHRREEIESCIARAAKFIEKIQATDGSWYGSWGVCFTYAGWFGIKGLAAAGRTYKDC SSIRKACDFLLSKELPSGGWGESYLSCQNKVYTNLKDNRPHIVHTAWAMLALIGAGQAKRDPTP LHRAARVLINSQMENGDFPQKEIMGVFNKNCMISYSAYRNIFPIWALGEYRCQVLEAL
Brassica napus (SEQ ID NO: 52)
MWKLKIAEGGSPWLRTTNNHVGRQFWEFDPNLGTPEELAAVEEARKSFRENRFAKKHSSDLLMR LQFSRESLSRPVLPQVNIKDGDDVTEKMVETTLKRGVDFYSTIQASDGHWAGDYGGPMFLLPGL I ITLSITGALNTVLSEQHKAEMRRYLHNHQNEDGGWGLHIEGPSTMFGSVLNYVTLRLLGEGPN DGDGAMEKGRDWILNHGGATNITSWGKMWLSVLGAFEWSGNNPLPPEIWLLPYILPIHPGRMWC HCRMVYLPMSYLYGKRFVGPITSTVLSLRKELFTVPYHEVDWNEARNLCAKEDLYYPHPLVQDI LWASLHKIVEPVLTRWPGSNLREKALRTTLEHIHYEDENTRYICIGPVNKVLNMLCCWVEDPNS EAFKLHLPRIHDYLWVAEDGMKMQGYNGSQLWDTSFAVQAVLATNFVEEYGPVLKKAHSYVKNS QVSEDCPGDLSYWYRHISKGAWPFSTADHGWPISDCTAEGLKAALLLSKVPKEIVGEPVDTKRL YDAVNVI ISLQNADGGFATYELTRSYPWLELINPAETFGDIVIDYPYVECTSAAIQALIAFRKL YPGHRKKEVDECIEKAVKFIESIQESDGSWYGSWAVCFTYGTWFGVKGLEAAGKTLKNSPTVAK ACEFLLSKQLPSGGWGESYLSCQDKVYSNLDGNRSHWNTAWALLSLIGAGQVEVDQKPLHRAA RYLINAQMESGDFPQQEIMGVFNRNCMITYAAYRNIFPIWALGEYRSKVLLQQGE
Spinacia oleracea (SEQ ID NO: 53)
MWKLKIAEGGSPWLRTTNNHVGRQIWEFDPNLGTPEQIREVEEARENFWKNRFEQKHSSDLLMR
MQFAQENSSNWLPQVKVKDEDEITEETVATTLRRALSYQSTIQAHDGHWPGDYGGPMFLMPGL
VIALSVTGALNAVLSKEHQKEMCRYLYNHQNKDGGWGLHIEGHSTMFGTVLTYVTLRLLGEGVD
DGDGAMERGRKWTLEHGSATAITSWGKMWLSVLGVFEWAGNNPMPPETWLLPYILPVHPGRMWC
HCRMVYLPMSYLYGKRFVGPITPTVLSLRRELFDVPYHEIDWDRARNECAKEDLYYPHPLVQDI
LWASLHKAVEPILMRWPGKKLREKALSTVMEHIHYEDENTRYICIGPVNKVLNMLCCWVEDPNS
EAFKLHLPRIPDFLWIAEDGMKMQGYNGSQLWDTTFMVQAILATNLGEEYGGTLRKAHNFIKDS
QVREDCPGDLSYWYRHISKGAWPFSTADHGWPISDCTAEGLKAALLLSKVPSDIVGEPLEVKRL
YDSVNVLLSLQNGDGGFATYELTRSYPWLELINPAETFGDIVIDYPYVECTSAAIQALVSFKRL
YPGHRREEIENCIKKAAKFIEDIQAADGSWYGSWAVCFTYATWFGIKGLVAAGKNYDNCPAIRK
ACDFLLSKQLSNGGWGESYLSCQNKVYSNIEGNKAHWNTGWAMLALIGAGQAKRDPMPLHRAA
KVLINSQMPNGDFPQQEIMGVFNRNCMITYAAYRNIFPTWALGEYRTQVLQK
Trigonella foenum-graecum (SEQ ID NO: 54)
MWKLKVAEGGSPWLRTVNNYVGRQVWEFDPNSGSPQELDQIESVRQNFHNNRFSHKHSDDLLMR IQLAKENPMGEVIPKVRVKDVEDVNEESVTTTLRRALNFYSTLQSRDGHWPGDYGGPMFLMPGL VIALSITGALNAVLTDEHQKEMRRYLYNHQNKDGGWGLHIEGPSTMFGSVLCYVTLRLLGEGPN DGEGEMEKARDWILEHGGATYITSWGKMWLSVLGVFEWSGNNPLPPEIWLLPYMLPIHPGRMWC HCRMVYLPMSYLYGKRFVGPITPTVLSLRKELFTVPYHDIDWNQARNLCAKEDLYYPHPLVQDI LWASLHKFVEPIFMNWPGKKLREKAVETVMEHVHYEDENTRYICIGPVNKVLNMLCCWVEDPNS EAFKLHLPRIHDFLWIAEDGMKMQGYNGSQLWDTAFAVQAXISTNLIDEFAPTLRKAHTFIKNS QVLEDCPGDLSKWYRHISKGAWPFSTADHGWPISDCTAEGLKAVLLLSKIGPEIVGEPLDAKGF YDAVNVI ISLQNEDGGLATYELTRSYKWLEI INPAETFGDIVIDYTYVECTSAAIQALSTFRKL YPGHRREEIQHCIEKAAAFIEKIQASDGSWYGSWGVCFTYGTWFGVKGLIAAGKSFSNCLSIRK ACDFLLSKQLPSGGWGESYLSCQNKVYSNLESNRSHWNTGWAMLALIEAEQAKRDPTPLHHAA VCLINSQMENGDFPQEEIMGVFNKNCMITYAAYRNIFPIWALGEYRRHVLQA
Ricinus communis (SEQ ID NO: 55)
MWKLRIAEGSGNPWLRTTNDHIGRQVWEFDSSKIGSPEELSQIENARQNFTKNRFIHKHSSDLL MRIQFSKENPICEVLPQVKVKESEQVTEEKVKITLRRALNYYSSIQADDGHWPGDYGGPMFLMP GLIIALSITGALNAILSEEHKREMCRYLYNHQNRDGGWGLHIEGPSTMFGSVLCYVSLRLLGEG PNEGEGAVERGRNWILKHGGATAITSWGKMWLSVLGAYEWSGNNPLPPEMWLLPYILPVHPGRM WCHCRMVYLPMSYLYGKRFVGPITPTVLSLRKELYTVPYHEIDWNQARNQCAKEDLYYPHPMLQ DVLWATLHKFVEPILMHWPGKRLREKAIQTAIEHIHYEDENTRYICIGPVNKVLNMLCCWVEDP NSEAFKLHLPRLYDYLWLAEDGMKMQGYNGSQLWDTAFAVQAIVSTNLIEEYGPTLKKAHSFIK KMQVLENCPGDLNFWYRHISKGAWPFSTADHGWPISDCTAEGIKALMLLSKIPSEIVGEGLNAN RLYDAVNWLSLQNGDGGFPTYELSRSYSWLEFINPAETFGDIVIDYPYVECTSAAIQALTSFR KSYPEHQREEIECCIKKAAKFMEKIQISDGSWYGSWGVCFTYGTWFGIKGLVAAGKSFGNCSSI RKACDFLLSKQCPSGGWGESYLSCQKKVYSNLEGDRSHWNTAWAMLSLIDAGQAERDPTPLHR AARYLINAQMENGDFPQQEIMGVFNRNCMITYAAYRDIFPIWALGEYRCRVLKAS
Epoxide Hydrolase
Siraitia grosvenorii EPH1 (SgEPH1) (SEQ ID NO: 56)
MEKIEHSTIATNGINMHVASAGSGPAVLFLHGFPELWYSWRHQLLYLSSLGYRAIAPDLRGFGD TDAPPSPSSYTAHHIVGDLVGLLDQLGVDQVFLVGDWGAMMAWYFCLFRPDRVKALVNLSVHFT PRNPAISPLDGFRLMLGDDFYVCKFQEPGVAEADFGSVDTATMFKKFLTMRDPRPPI IPNGFRS LATPEALPSWLTEEDIDYFAAKFAKTGFTGGFNYYRAIDLTWELTAPWSGSEIKVPTKFIVGDL DLVYHFPGVKEYIHGGGFKKDVPFLEEVWMEGAAHFINQEKADEINSLIYDFIKQF
Siraitia grosvenorii EPH2 (SgEPH2) (SEQ ID NO: 57)
MEKIEHTTISTNGINMHVASIGSGPAVLFLHGFPELWYSWRHQLLFLSSMGYRAIAPDLRGFGD TDAPPSPSSYTAHHIVGDLVGLLDQLGIDQVFLVGHDWGAMMAWYFCLFRPDRVKALVNLSVHF LRRHPSIKFVDGFRALLGDDFYFCQFQEPGVAEADFGSVDVATMLKKFLTMRDPRPPMIPKEKG FRALETPDPLPAWLTEEDIDYFAGKFRKTGFTGGFNYYRAFNLTWELTAPWSGSEIKVAAKFIV GDLDLVYHFPGAKEYIHGGGFKKDVPLLEEWWDGAAHFINQERPAEISSLIYDFIKKF
Siraitia grosvenorii EPH3 (SgEPH3) (SEQ ID NO: 58)
MDQIEHITINTNGIKMHIASVGTGPWLLLHGFPELWYSWRHQLLYLSSVGYRAIAPDLRGYGD TDSPASPTSYTALHIVGDLVGALDELGIEKVFLVGHDWGAI IAWYFCLFRPDRIKALVNLSVQF IPRNPAIPFIEGFRTAFGDDFYMCRFQVPGEAEEDFASIDTAQLFKTSLCNRSSAPPCLPKEIG FRAIPPPENLPSWLTEEDINYYAAKFKQTGFTGALNYYRAFDLTWELTAPWTGAQIQVPVKFIV GDSDLTYHFPGAKEYIHNGGFKKDVPLLEEWWKDACHFINQERPQEINAHIHDFINKF
Momordica charantia (SEQ ID NO: 59)
MEKIEHSTIAANGITIHVASVGSGPAVLLLHGFPELWYSWRHQLLFLASKGYRAIAPDLRGFGD
SDAPPSPSSYTPLHIVGDLVALLDHLGIDLVFLVGHDWGAMMAWHFCLLRPDRVKALVNLSVHF
MPRNPAMSPLDGMRLLLGDDFYVCRFQEPGAAEADFGSVDTATMMKKFLTMRDPRPPIIPNGFR
SLETPQALPPWLTEEDIDYFAAKFAKTGFTGGFNYYRAIGRTWELTAPWTGSKIKVPAKFIVGD
LDMVYHLPDAKEYIHGGGFKEDVPLLEEVWIEGAAHFINQEKPDEISSLIYDFIKKF
Cucurbita moschata (SEQ ID NO: 60)
MEKIEHSTIATNGINMHVASIGSGPPVLFLHGFPELWYSWRHQLLFLASKGFRAIAPDLRGFGD SDVPPSPSSYTPFHI IGDLIGLLDHLGIEQVFLVGHDWGAMMAWYFCLFRPDRVKALVNLSVHY NPRNPAISPLSRTRQFLGDDFYICKFQTPGVAEADFGSVDTATMMKKFLTIRDPSPPIIPNGFK TLKTPETLPSWLTEEDIDYFASKFTKTGFTGGFNYYRAIEQTWELTGPWSGAKIKVPTKYWGD VDMVYHLPGAKQYIHGGGFKKDVPLLEEVWMEGAAHFINQEKADEISAHIYDFIIKF
Cucurbita maxima (SEQ ID NO: 61)
MENIEHTIVPTNGINMHIASIGSGPAVLFLHGFPELWYSWRHQLLFLASNGFRAIAPDLRGFGD
TDVPPSPSSYTAHHIVGDLIGLLDHLGIDRVFLVGHDWGAMMAWYFCLFRPDRVRALVNLSVHY
LHRHPSIKFVDGFRAFLGDDFYFCQFQEPGVAEADFGSVDTATMLKKFLTMRDPRPPMIPKEKG
FRALETPDPLPSWLTEEDVDYFASKFSKTGFTGGFNYYRAFDLSWELTAPWSGSQVKVPAKFIV
GDLDLVYHFPGAKEYIHGGRFKEDVPFLEEVWIEGAAHFINQERADEISSLIYEFINKF
Prunus persica (SEQ ID NO: 62)
MEKIEHTTVSTNGINMHIASIGTGPWLFLHGFPELWYSWRHQLLSLSSLGYRCIAPDLRGFGD
TDAPPSPASYSALHIVGDLIGLLDHLGIDQVFLVGHDWGAVIAWWFCLFRPDRVKALVNMSVAF
SPRNPKRKPVDGFRALFGDDYYICRFQEPGEIEKEFAGYDTTSIMKKFLTGRSPKPPCLPKELG
LRAWKTPETLPPWLSEEDLNYFASKFSKTGFVGGLNYYRALNLTWELTGPWTGLQVKVPVKFIV
GDLDITYHIPGVKNYIHNGGFKRDVPFLQEVWIEDGAHFINQERPDEISRHVYDFIQKF
Morus notabilis (SEQ ID NO: 63)
MEKIEHSTVHTNGINMHVASVGTGPAILFLHGFPELWYSWRHQMISLSSLGYRCIAPDLRGYGD TDAPPSPTSYTSLHIVGDLVGLIDHLVIEKLFLVGHDWGAMIAWYFCLFRPDRIKALVNLSVPF FPRNPKINFVDGFRAELGDDFYICRFQEPGESEADFSSDTVAVFRRILANRDPKPPLIPKEIGF RGVYEDPVALPSWLTEDDINHFANKFNETGFTGGLNYYRALNLTWELTAAWTGARVQVPTKFIM GDLDLVYYFPGMKEYILNGGFKRDVPLLQELVI IEGAAHFINQEKPDEISSHIHHFIQKF
Ricinus communis (SEQ ID NO: 64)
MEKIEHTTVATNGINMHVAAIGTGPEILFLHGFPELWYSWRHQLLSLSSRGYRCIAPDLRGYGD
TDAPESLTGYTALHIVGDLIGLLDSMGIEQVFLVGHDWGAMMAWYLCMFRPDRIKALVNTSVAY
MSRNPQLKSLELFRTVYGDDYYVCRFQEPGGAEEDFAQVDTAKLIRSVFTSRDPNPPIVPKEIG
FRSLPDPPSLPSWLSEEDVNYYADKFNKKGFTGGLNYYRNIDQNWELTAPWDGLQIKVPVKFVI
GDLDLTYHFPGIKDYIHNGGFKQWPLLQEVWMEGVAHFINQEKPEEISEHIYDFIKKF
Citrus unshiu (SEQ ID NO: 65)
MEKIEHTTVGTNGINMHVASIGTGPWLFIHGFPELWYSWRNQLLYLSSRGYRAIAPDLRGYGD
TDAPPSVTSYTALHLVGDLIGLLDKLGIHQVFLVGHDWGALIAWYFCLFRPDRVKALVNMSVPF
PPRNPAVRPLNNFRAVYGDDYYICRFQEPGEIEEEFAQIDTARLMKKFLCLRIAKPLCIPKDTG
LSTVPDPSALPSWLSEEDVNYYASKFNQKGFTGPVNYYRCSDLNWELMAPWTGVQLEVPVKFIV
GDQDLVYNNKGMKEYIHNGGFKKYVPYLQEVWMEGVAHFINQEKAEEVGAHIYEFIKKF
Hevea brasiliensis (SEQ ID NO: 66)
MEKIEHITVFTNGINMHIASIGTGPEILFLHGFPELWYSWRHQLLSLSSLGYRCIAPDLRGYGD TDAPQSVNQYTVLHIVGDLVGLLDSLGIQQVFLVGHDWGAFIAWYFCIFRPDRIKALVNTSVAF MPRNPQVKPLDGLRSMFGDDYYICQFQKPGKAEEDFAQVNTAKLIKLLFTSRDPRPPHFLKEVG LKALQDPPSQQSWLTEEDVNFYAAKFNQKGFRGGLNYYQNINMNWELAAAWTGVQIKVPVKFI I GDLDLTYHFPGIKEYIHNGGFKKDVPLLQDVWMEGVAHFLNQEKPEEVSKHIYDFIKKF
Handroanthus impetiginosus (SEQ ID NO: 67)
MDKIQHKI IQTNGINIHVAEIGDGPAVLFLHGFPELWYSWRHQMLFLSSRGYRAIAPDLRGYGD SDAPPCATSYTAFHI IGDLVGLLDAMGLDRVFLVGHDWGAVMAWYFCLLRPDRIKALVNLSWF QPRNPKRKPVESMRAKLGDDYYICRFQEPGEAEEEFARVDTARLIKKLLTTRNPAPPRLPKEVG FGCLPHKPITMPSWLSEEDVQYYAAKFNQKGFTGGLNYYRAMDLSWELAAPWTGVQIKVPVKFI VGDLDITYNTPGVKEYIHKGRFKQHVPFLQELVILEGVAHFLNQEKPDEINQHIYDFIHKF
Camelina sativa (SEQ ID NO: 68)
MEKIEHTTVSTNGINMHVASIGSGPVILFLHGFPDLWYSWRHQLLSFAALGYRAIAPDLRGYGD SDAPPSPESYTILHIVGDLVGLLDSLGVDRVFLVGHDWGAIVAWWLCMIRPDRVKALVNTSWF NPRNPSVKPVDKFRDLFGDDYYVCRFQETGEIEEDFAQVDTKKLITRFFVSRNPRPPCIPKSVG FRGLPDPPSLPAWLTEQDVSFYGDKFSQKGFTGGLNYYRAMNLSWELTAPWAGLQIKVPVKFIV GDLDITYNIPGTKEYIHGGGLKKHVPFLQEVWMEGVGHFLQQEKPDEVTDHIYGFFEKFRTRE TSSL
Coffea canephora (SEQ ID NO: 69)
MDKIQHRQVPVNGINLHVAEIGDGPAILFLHGFPELWYSWRHQLLSLSAKGYRALAPDLRGYGD SDAPPSPSNYTALHIVGDLVGLLDSLGLDRVFLVGHDWGAVMAWYFCLLRPDRIKALVNMSWF TPRNPKRKPLEAMRARFGDDYYICRFQEPGEAEEEFARVDTARI IKKFLTSRRPGPLCVPKEVG FGGSPHNPIQLPSWLSEDDVNYFASKFSQKGFTGGLNYYRAMDLNWELTAPWTGLQIKVPVKFI VGDLDVTFTTPGVKEYIQKGGFKRDVPFLQELWMEGVAHFVNQEKPEEVSAHIYDFIQKF
Punica granatum (SEQ ID NO: 70)
MEKIQHTTVRTNGINMHVATAGSGPDSILFVHGFPELWYTWRHQMVSLAALGYRTIAPDLRGYG DTDAPPSHESYTAFHIVGDLVGLLDSMGIEKVFLVGHDWGAAIAWYFCLFRPDRIKALVNMSW FHPRNPNRKPVDGLRAILGDDYYICRFQAPGEIEEDFARADTANI IKFFLVSRNPRPPQIPKEG FSCLANSRQMDLPSWLSEEDINYYASKFSEKGFTGGLNYYRVMNLNWELTAPFTGLQIKVPAKF MVGDLDITYNTPGTKEFIHNGGLKKHVPFLQEVWMEGVAHFINQEKPEEVTAHIYDFIKKF
Arabidopsis lyrata subsp. lyrata (SEQ ID NO: 71)
MEKIEHTTVSTNGINMHVASIGSGPVILFLHGFPDLWYSWRHQLLSFAALGYRAIAPDLRGYGD SDAPPSRESYTILHIVGDLVGLLNSLGVDRVFLVGHDWGAIVAWWLCMIRPDRVNALVNTSWF NPRNPSVKPVDAFRALFGDDYYICRFQEPGEIEEDFAQVDTKKLITRFFISRNPRPPCIPKSVG FRGLPDPPSLPAWLTEEDVSFYGDKFSQKGFTGGLNYYRALNLSWELTAPWAGLQIKVPVKFIV GDLDITYNIPGTKEYIHEGGLKKHVPFLQEVWLEGVGHFLHQEKPDEITDHIYGFFKKFRTRE TASL
Rhinolophus sinicus (SEQ ID NO: 72)
MDKIEHTTVSTNGINMHVASIGSGPVILFLHGFPDLWYSWRHQLLSFAGLGYRAIAPDLRGYGD SDSPPSHESYTILHIVGDLVGLLDSLGVDRVFLVGHDWGAWAWWLCMIRPDRVNALVNTSWF NPRNPSVKPVDAFKALFGEDYYVCRFQEPGEIEEDFAQVDTKKLINRFFTSRNPRPPCIPKTLG FRGLPDPPALPAWLTEQDVSFYADKFSQKGFTGGLNYYRAMNLSWELTAPWAGLQIKVPVKFIV GDLDITYNIPGTKEYIHEGGLKKHVPFLQEVWMEGVGHFLHQEKPDEVTDHIYGFFKKF
Cytochrome P450
Siraitia grosvenorii CYP87D18 (SEQ ID NO: 73)
MWTWLGLATLFVAYYIHWINKWRDSKFNGVLPPGTMGLPLIGETIQLSRPSDSLDVHPFIQKK VERYGPIFKTCLAGRPVWSADAEFNNYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGLIHK YIRSITLNHFGAEALRERFLPFIEASSMEALHSWSTQPSVEVKNASALMVFRTSVNKMFGEDAK KLSGNIPGKFTKLLGGFLSLPLNFPGTTYHKCLKDMKEIQKKLREWDDRLANVGPDVEDFLGQ AFKDKESEKFISEEFI IQLLFSISFASFESISTTLTLILKLLDEHPEWKELEVEHEAIRKARA DPDGPITWEEYKSMTFTLQVINETLRLGSVTPALLRKTVKDLQVKGKI IPEGWTIMLVTASRHR DPKVYKDPHIFNPWRWKDLDSITIQKNFMPFGGGLRHCAGAEYSKVYLCTFLHILCTKYRWTKL GGGTIARAHILSFEDGLHVKFTPKE
Cucumis melo (SEQ ID NO: 74)
MWTILLGLATLAIAYYIHWVNKWKDSKFNGVLPPGTMGLPLIGETIQLSRPSDSLDVHPFIQSK VKRYGPIFKTCLAGRPVWSTDAEFNHYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGLIHK YIRSITLNHFGAESLRERFLPRIEESARETLHYWSTQPSVEVKESAAAMVFRTSIVKMFSEDSS KLLTAGLTKKFTGLLGGFLTLPLNVPGTTYHKCIKDMKEIQKKLKDILEERLAKGVSIDEDFLG QAIKDKESQQFISEEFIIQLLFSISFASFESISTTLTLILNFLADHPDVAKELEAEHEAIRKAR ADPDGPITWEEYKSMNFTLNVICETLRLGSVTPALLRKTTKEIQIKGYTIPEGWTVMLVTASRH RDPEVYKDPDTFNPWRWKELDSITIQRNFMPFGGGLRHCAGAEYSKVYLCTFLHILFTKYRWRK LKGGKIARAHILRFEDGLYVNFTPKE
Cucurbita maxima (SEQ ID NO: 75)
MWTIWGLATLAVAYYIHWINKWKDSKFNGVLPPGTMGLPLIGETLQLSRPSDSLDVHPFIKKK VKRYGSIFKTCLAGRPVWSTDAEFNNYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGFIHK YIRSITLNHFGAESLRERFLPRIEESAKETLCYWATQPSVEVKDSAAVMVFRTSMVKMVSKDSS KLLTGGLTKKFTGLLGGFLTLPINVPGTTYNKCMKDMKEIQKKLREILEGRLASGAGSDEDFLG QAVKDKGSQKFISDDFIIQLLFSISFASFESISTTLTLILNYLADHPDWKELEAEHEAIRNAR ADPDGPITWEEYKSMTFTLHVIFETLRLGSVTPALLRKTTKELQINGYTIPEGWTVMLVTASRH RDPAVYKDPHTFNPWRWKELDSITIQKNFMPFGGGLRHCAGAEYSKVYLCTFLHILFTKYRWTK LKGGKVARAHILSFEDGLHMKFTPKE
Cucumis sativus (SEQ ID NO: 76)
MWTILLGLATLAIAYYIHWVNKWKDSKFNGVLPPGTMGLPLIGETIQLSRPSDSLDVHPFIQRK VKRYGPIFKTCLAGRPVWSTDAEFNHYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGLIHK YIRSITLNHFGAESLRERFLPRIEESARETLHYWSTQTSVEVKESAAAMVFRTSIVKMFSEDSS KLLTEGLTKKFTGLLGGFLTLPLNLPGTTYHKCIKDMKQIQKKLKDILEERLAKGVKIDEDFLG
QAIKDKESQQFISEEFIIQLLFSISFASFESISTTLTLILNFLADHPDWKELEAEHEAIRKAR
ADPDGPITWEEYKSMNFTLNVICETLRLGSVTPALLRKTTKEIQIKGYTIPEGWTVMLVTASRH
RDPEVYKDPDTFNPWRWKELDSITIQKNFMPFGGGLRHCAGAEYSKVYLCTFLHILFTKYRWRK
LKGGKIARAHILRFEDGLYVNFTPKE
Cucurbita moschata (SEQ ID NO: 77)
MWAIWGLATLAVAYYIHWINKWKDSKFNGVLPPGTMGLPLVGETLQLARPSDSLDVHPFIKKK
VKRYGSIFKTCLAGRPVWSTDAEFNNYIMLQEGRAVEMWYLDTLSKFFGLDTEWLKALGFIHK
YIRSITLNHFGAESLRERFLPRIEESAKETLRYWATQPSVEVKDSAAVMVFRTSMVKMVSEDSS
KLLTGGLTKKFTGLLGGFLTLPINVPGTTYNKCMKDMKEIQKKLREILEGRLASGAGSDEDFLG
QAIKDKGSQQFISDDFIIQLLFSISFASFESISTTLTLVLNYLADHPDWKELEAEHEAIRNAR
ADPDGPITWEEYKSMTFTLHVIFETLRLGSVTPALLRKTTKELQINGYTIPEGWTVMLVTASRH
RDPAVYKDPHTFNPWRWKELDSITIQKNFMPFGGGLRHCAGAEYSKVYLCTFLHILFTKYRWTK
LKGGKVARAHILSFEDGLHVKFTPKE
Prunus avium (SEQ ID NO: 78)
MWTLVGLSLVALLVIYFTHWI IKWRNPKCNGVLPPGSMGLPLIGETLNLIIPSYSLDLHPFIKK RLQRYGPIFRTSLAGRPVWTADPEFNNYIFQQEGRMVELWYLDTFSKIFVHEGDSKTNAIGMV HKYVRSIFLNHFGAERLKEKLLPQIEEFVNKSLCAWSSKASVEVKHAGSVMVFNFSAKQMISYD AEKSSDDLSEKYTKI IDGLMSFPLNIPGTAYYNCSKHQKNVTTMLRDMLKERRISPETRRGDFL DQLSIDMEKEKFLSEDFSVQLVFGGLFATFESISAVIALAFSLLADHPSWEELTAEHEAILKN RENPNSSITWDEYKSMTFTLQVINEILRLGNVAPGLLRRALKDIPVKGFTIPEGWTIMWTSAL QLSPNTFEDPLEFNPWRWKDLDSYAVSKNFMPFGGGMRQCAGAEYSRVFLATFLHVLVTKYRWT TIKAARIARNPILGFGDGIHIKFEEKKT
Populus trichocarpa (SEQ ID NO: 79)
MWAIGLVWALWIYYTHMIFKWRSPKIEGVLPPGSMGWPLIGETLQFISPGKSLDLHPFVKKR MEKYGPIFKTSLVGRPI IVSTDYEMNKYILQHEGTLVELWYLDSFAKFFALEGETRVNAIGTVH KYLRSITLNHFGVESLKESLLPKIEDMLHTNLAKWASQGPVDVKQVISVMVFNFTANKIFGYDA ENSKEKLSENYTKILNSFISLPLNIPGTSFHKCMQDREKMLKMLKDTLMERLNDPSKRRGDFLD QAIDDMKTEKFLTEDFIPQLMFGILFASFESMSTTLTLTFKFLTENPRWEELRAEHEAIVKKR ENPNSRLTWEEYRSMTFTQMWNETLRISNIPPGLFRKALKDFQVKGYTVPAGWTVMLVTPATQ LNPDTFKDPVTFNPWRWQELDQVTISKNFMPFGGGTRQCAGAEYSKLVLSTFLHILVTNYSFTK IRGGDVSRTPIISFGDGIHIKFTARA
Prunus persica (SEQ ID NO: 80)
MWTLVGLSLVGLLVIYFTHWI IKWRNPKCNGVLPPGSMGLPFIGETLNLIIPSYSLDLHPFIKK
RLQRYGPIFRTSLAGRQVWTADPEFNNYLFQQEGRMVELWYLDTFSKIFVHEGESKTNAVGMV HKYVRSIFLNHFGAERLKEKLLPQIEEFVNKSLCAWSSKASVEVKHAGSVMVFNFSAKQMISYD AEKSSDDLSEKYTKI IDGLMSFPLNIPGTAYYNCLKHQKNVTTMLRDMLKERQISPETRRGDFL DQISIDMEKEKFLSEDFSVQLVFGGLFATFESISAVLALAFSLLAEHPSWEELTAEHEAILKN RENLNSSLTWDEYKSMTFTLQVINEILRLGNVAPGLLRRALKDIPVKGFTIPEGWTIMWTSAL QLSPNTFEDPLEFNPWRWKDLDSYAVSKNFMPFGGGMRQCAGAEYSRVFLATFLHVLVTKYRWT TIKAARIARNPILGFGDGIHIKFEEKKT
Populus euphratica (SEQ ID NO: 81)
MWTFVLCWAVLWYYTHWINKWRNPTCNGVLPPGSMGLPI IGETLELI IPSYSLDLHPFIKKR IQRYGPIFRTNILGRPAWSADPEINSYIFQNEGKLVEMWYMDTFSKLFAQSGESRTNAFGI IH KYARSLTLTHFGSESLKERLLPQVENIVSKSLQMWSSDASVDVKPAVSIMVCDFTAKQLFGYDA ENSSDKISEKFTKVIDAFMSLPLNIPGTTYHKCLKDKDSTLSILRNTLKERMNSPAESRGGDFL DQIIADMDKEKFLTEDFTVNLIFGILFASFESISAALTLSLKLIGDHPSVLEELTVEHEAILKN RENPDSPLTWAEYNSMTFSLQVINETLRLGNVAPGLLRRALQDMQVKGYTIPAGWVIMWNSAL HLNPATFKDPLEFNPWRWKDFDSYAVSKNLMPFGGGRRQCAGSEFTKLFMAIFLHKLVTKYRWN I IKQGNIGRNPILGFGDGIHISFSPKDI
Juglans regia (SEQ ID NO: 82)
MWKVGLCWGVIWWFTRWINKWRNPKCNGILPPGSMGPPLIGESLQLI IPSYSLDLHPFIKKR VQRYGPIFRTSWGQPMWSTDVEFNHYLAKQEGRLVHFWYLDSFAEIFNLEDENAISAVGLIH KYGRSIVLNHFGTDSLKKTLLSQIEEIVNKTLQTWSSLPSVEVKHAASVMAFDLTAKQCFGYDV ENSAVKMSEKFLYTLDSLISFPFNIPGTVYHKCLKDKKEVLNMLRNIVKERMNSPEKYRGDFLD QITADMNKESFLTQDFIVYLLYGLLFASFESISASLSLTLKLLAEHPAVLQQLTAEHEAILKNR DNPNSSLTWDEYKSMTFTFQVINEALRLGNVAPGLLRRALKDIEFKGYTIPAGWTIMLANSAIQ LNPNTYEDPLAFNPWRWQDLDPQIVSKNFMPFGGGIRQCAGAEYSKTFLATFLHVLVTKYRWTK VKGGKMARNPILWFADGIHINFALKHN
Pyrus x bretschneideri (SEQ ID NO: 83)
MWDWGLSFVALLVIYLTYWITQWKNPKCNGVLPPGSMGLPLIGETLNLLIPSYSLDLHPFIRK
RLERYGPIFRTSLAGKPVLVSADPEFNNYVLKQEGRMVEFWYLDTFSKIFMQEGGNGTNQIGVI
HKYARSIFLNHFGAECIKEKLLTQIEGSINKHLRAWSNQESVEVKKAGSIMALNFCAEHMIGYD
AETATENLGEIYHRVFQGLISFPLNVPGTAYHNCLKIHKKATTMLRAMLRERRSSPEKRRGDFL
DQIIDDLDQEKFLSEDFCIHLIFGGLFAIFESISTVLTLFFSLLADHPAVLQELTAEHEALLKN
REDPNSALTWDEYKSMTFTLQVINETLRLVNTAPGLLRRALKDIPVKGYTIPAGWTILLVTPAL HLTSNTFKDHLEFNPWRWKDLDSLVISKNFMPFGSGLRQCAGAEFSRAYLSTFLHVLVTKYRWT
TIKGARISRRPMLTFGDGAHIKFSEKKN
Morus notabilis (SEQ ID NO: 84)
MWNTICLSWGLWIWISNWIRRWRNPKCNGVLPPGSMGFPLIGETLPLIIPTYSLDLHPFIKN RLQRYGSIFRTSIVGRPWISADPEFNNFLFQQEGSLVELYYLDTFSKIFVHEGVSRTNEFGW HKYIRSIFLNHFGAERLKEKLLPEIEQMVNKTLSAWSTQASVEVKHAASVLVLDFSAKQI ISYD AKKSSESLSETYTRI IQGFMSFPLNIPGTAYNQCVKDQKKI IAMLRDMLKERRASPETNRGDFL DQISKDMDKEKFLSEDFWQLIFGGLFATFESVSAVLALGFMLLSEHPSVLEEMIAEHETILKN REHPNSLLAWGEYKSMTFTLQVINETLRLGNVAPGLLRKALKDIRVKGFTIPKGWAIMMVTSAL QLSPSTFKNPLEFNPWRWKDLDSLVISKNFMPFGRGMRQCAGAEYSRAFMATFFHVLLTKYRWT TIKVGNVSRNPILRFGNGIHIKFSKKN
Jatropha curcas (JcP450.1) (SEQ ID NO: 85)
MWIIGLCFASLLVIYCTHFFYKWRNPKCKGVLPPGSMGLPI IGETLQLI IPSYSLDHHPFIQKR IQRYGPIFRTNLVGRPVIVSADPEVNQYIFQQEGNSVEMWYLDAYAKIFQLDGESRLSAVGRVH KYIRSITLNNFGIENLKENLLPQIQDLVNQSLQKWSNKASVDVKQAASVMVFNLTAKQMFSYGV EKNSSEEMTEKFTGIFNSLMSLPLNIPGTTYHKCLKDREAMLKMLRDTLKQRLSSPDTHRGDFL DQAIDDMDTEKFLTGDCIPQLIFGILLAGFETTATTLTLAFKFLAEHPLVLEELTAEHEKILSK RENLESPLTWDEYKSMTFTHHVINETLRLANFLPGLLRKALKDIQVKNYTIPAGWTIMWKSAM QLNPEIYKDPLAFNPWRWKDLDSYTVSKNFMPFGGGSRQCAGADYSKLFMTIFLHVLVTKYRWR KIKGGDIARNPILGFGDGLHIEVSAKN
Hevea brasiliensis (SEQ ID NO: 86)
MLTWLLLVGFFI IYYTYWISKWRNPNCNGVLPPGSMGFPLIGETLQLLIPSYSLDLHPFIKKR IHRYGPIFRSNLAGRPVIVSADPEFNYYILSQEGRSVEIWYLDTFSKLFRQQGESRTNVAGYVH KYLRGAFLSQIGSENLREKLLLHIQDMVNRTLCSWSNQESVEVKHSASLAVCDFTAKVLFGYDA EKSPDNLSETFTRFVEGLISFPLNIPRTAYRQCLQDRQKALSILKNVLTDRRNSVENYRGDVLD LLLNDMGKEKFLTEDFICLIMLGGLFASFESISTITTLLLKLFSAHPEWQELEAEHEKILVSR HGSDSLSITWDEYKSMTFTHQVINETLRLGNVAPGLLRRAIKDVQFKGYTIPSGWTIMMVTSAQ QVNPEVYKDPLVFNPWRWKDFDSITVSKNFTPFGGGTRQCVGAEYSRLTLSLFIHLLVTKYRWT KIKEGEIRRAPMLGFGDGIHFKFSEKE
Jatropha curcas (JcP450.2) (SEQ ID NO: 87)
MKRAIYICLARITKQGLSLIEMLMTELLFGAFFI IFLTYWINRWRNPKCNGVLPPGSMGLPLLG ETLQLLIPRYSLDLHPFIRKRIQRYGPIFRSNVAGRPIVFTADPELNHYIFIQERRLVELWYMD TFSNLFVLDGESRPTGATGYIHKYMRGLFLTHFGAERLKDKLLHQIQELIHTTLQSWCKQPTIE VKHAASAVICDFSAKFLFGYEAEKSPFNMSERFAKFAESLVSFPLNIPGTAYHQSLEDREKVMK LLKNVLRERRNSTKKSEEDVLKQILDDMEKENFITDDFI IQILFGALFAISESIPMTIALLVKF LSAQPSWEELTAEHEEILKNKKEKGLDSSITWEDYKSMTFTLQVINETLRIANVAPGLLRRTL RDIHYKGYTIPAGWTIMVLTSSRHMNPEIYKDPVEFNPWRWKDLDSQTISKNFTPFGGGTRQCA GAEYSRAFISMFLHVLVTKYRWKNVKEGKICRGPILRIEDGIHIKLYEKH
Chenopodium quinoa (SEQ ID NO: 88)
MWPTMGLYVATIVAICFILLELKRRNSREKQWLPPGSKGFPLIGETLQLLVPSYSLDLPSFIR TRIQRYGPIFKTRLVGRPWMSADPGFNRYIVQQEGKSVEMWYLDTFSKLFAQDGEARTTAAGL VHKYLRNLTLSHFGSESLRVNLLPHLESLVRNTLLGWSSKDTIDVKESALTMTIEFVAKQLFGY DSDKSKEKIGEKFGNISQGLFSLPLNIPGTTYHSCLKSQREVMDMMRTALKDRLTTPESYRGDF LDHALKDLSTEKFLSEEFILQIMFGLLFASSESTSMTLTLVLKLLSENPHVLKELEAEHERI IK NKESPDSPLTWAEVKSMTFTLQVINESLRLGNVSLGILRRTLKDIEINGYTIPAGWTIMLVTSA CQYNSDIYKDPLTFNPWRWKEMQPDVIAKNFMPFGGGTRQCAGAEFAKVLMTIFLHNLVTNYRW EKIKGGEIVRTPILGFRNALRVKLTKKN
Spinacia oleracea (SEQ ID NO: 89)
MVLLPGSKGFPFIGETLQLLLPSYSLDLPSFIRTRIQRYGPIFQTRLVGRPVWSADPGFNRYI VQQEGKMVEMWYLDTFSKIFAQQGEGRTNAAGLVHKYLRNITFTHFGSQTLRDKLLPHLEILVR KTLHGWTSQESIDVKEAALTMTIEFVAKQLFGYDSDKSKERIGDKFANISQGLLSFPLNIPGTT YHSCLKSQREVMDMMRKTLKERLASPDTCQGDFLDHALKDLNTDKFLTEDFILQIMFGLLFASS ESTSITLTLILKFLSENPHVLEELEVEHERILKNRESPDSPLTWAEVKSMTFTLQVINESLRLG NVSLGLLRRTLKDIEINGYTIPAGWTIMLVTSACQYNSDVYKDPLTFNPWRWKEMQPDVIAKNF MPFGGGTRQCAGAEFAKVLMTIFLHVLVTTYRWEKIKGGEI IRTPILGFRNGLHVKLIKKARLS
Manihot esculenta (SEQ ID NO: 90)
MEMWSVWLYI ISLI I I IATHWIYRWRNPKCNGKLPPGSMGIPFIGETIQFLIPSKSLDVPNFIK KRMNKYGPLFRTNLVGRPVIVSSDPDFNYYLLQREGKLVERWYMDSFSKLLHHDVTQII IKHGS IHKYLRNLVLGHFGPEPLKDKLLPQLESAISQRLQDWSKQPSIEAKSASSAMIFDFTAKILFSY EPEKSGENIGEIFSNFLQGLMSIPLNIPGTAFHRCLKNQKRAIQMITEILKERRSNPEIHKGDF LDQIVEDMKKDSFWTEEFAIYMMFGLLLASFETISSTLALAIIFLTDNPPWQKLTEEHEAILK ARENRDSGLSWKEYKSLSYTHQWNESLRLASVAPGILRRAITDIQVDGYTIPKGWTIMWPAA VQLNPNTFEDPLVFNPSRWEDMGAVAMAKNFIAFGGGSRSCAGAEFSRVLMSVFVHVFVTNYRW TKIKGGDMVRSPALGFGNGFHIRVSEKQL
Olea europaea var. sylvestris (SEQ ID NO: 91) MAALDLSTVGYLIVGLLTVYITHWIYKWRNPKCNGVLPPGSMGLPLIGETIQLVIPNASLDLPP FIKKRMKRYGPIFRTNVAGRPVI ITADPEFNHFLLRQDGKLVDTWSMDTFAEVFDQASQSSRKY TRHLTLNHFGVEALREKLLPQMEDMVRTTLSNWSSQESVEVKSASVTMAIDYAARQIYSGNLEN APLKISDLFRDLVDGLMSFPINIPGTAHHRCLQTHKKVREMMKDIVKTRLEEPERQYGDMLDHM IEDMKKESFLDEDFIVQLMFGLFFVTSDSISTTLALAFKLLAEHPLVLEELTAEHEAILKKREK SESHLTWNDYKSMTFTLQVINEVLRLGNIAPGFFRRALQDIPVNGYTIPSGWVIMIATAGLHLN SNQFEDPLKFNPWRWKVCKVSSVIAKCFMPFGSGMKQCAGAEYSRVLLATFIHVLTTKYRWAIV KGGKIVRSPI IRFPDGFHYKI IEKTN
Cytochrome P450 Reductase
Stevia rebaudiana (SrCPR1) (SEQ ID NO: 92)
MAQSDSVKVSPFDLVSAAMNGKAMEKLNASESEDPTTLPALKMLVENRELLTLFTTSFAVLIGC
LVFLMWRRSSSKKLVQDPVPQVIWKKKEKESEVDDGKKKVSIFYGTQTGTAEGFAKALVEEAK
VRYEKTSFKVIDLDDYAADDDEYEEKLKKESLAFFFLATYGDGEPTDNAANFYKWFTEGDDKGE
WLKKLQYGVFGLGNRQYEHFNKIAIWDDKLTEMGAKRLVPVGLGDDDQCIEDDFTAWKELVWP
ELDQLLRDEDDTSVTTPYTAAVLEYRWYHDKPADSYAEDQTHTNGHWHDAQHPSRSNVAFKK
ELHTSQSDRSCTHLEFDISHTGLSYETGDHVGVYSENLSEWDEALKLLGLSPDTYFSVHADKE
DGTPIGGASLPPPFPPCTLRDALTRYADVLSSPKKVALLALAAHASDPSEADRLKFLASPAGKD
EYAQWIVANQRSLLEVMQSFPSAKPPLGVFFAAVAPRLQPRYYSISSSPKMSPNRIHVTCALVY
ETTPAGRIHRGLCSTWMKNAVPLTESPDCSQASIFVRTSNFRLPVDPKVPVIMIGPGTGLAPFR
GFLQERLALKESGTELGSSIFFFGCRNRKVDFIYEDELNNFVETGALSELIVAFSREGTAKEYV
QHKMSQKASDIWKLLSEGAYLYVCGDAKGMAKDVHRTLHTIVQEQGSLDSSKAELYVKNLQMSG
RYLRDVW
Arabidopsis thaliana CPR1 (AtCPR1) (SEQ ID NO: 93)
MATSALYASDLFKQLKSIMGTDSLSDDWLVIATTSLALVAGFWLLWKKTTADRSGELKPLMI
PKSLMAKDEDDDLDLGSGKTRVSIFFGTQTGTAEGFAKALSEEIKARYEKAAVKVIDLDDYAAD
DDQYEEKLKKETLAFFCVATYGDGEPTDNAARFYKWFTEENERDIKLQQLAYGVFALGNRQYEH
FNKIGIVLDEELCKKGAKRLIEVGLGDDDQSIEDDFNAWKESLWSELDKLLKDEDDKSVATPYT
AVIPEYRWTHDPRFTTQKSMESNVANGNTTIDIHHPCRVDVAVQKELHTHESDRSCIHLEFDI
SRTGITYETGDHVGVYAENHVEIVEEAGKLLGHSLDLVFSIHADKEDGSPLESAVPPPFPGPCT
LGTGLARYADLLNPPRKSALVALAAYATEPSEAEKLKHLTSPDGKDEYSQWIVASQRSLLEVMA
AFPSAKPPLGVFFAAIAPRLQPRYYSISSSPRLAPSRVHVTSALVYGPTPTGRIHKGVCSTWMK
NAVPAEKSHECSGAPIFIRASNFKLPSNPSTPIVMVGPGTGLAPFRGFLQERMALKEDGEELGS
SLLFFGCRNRQMDFIYEDELNNFVDQGVISELIMAFSREGAQKEYVQHKMMEKAAQVWDLIKEE
GYLYVCGDAKGMARDVHRTLHTIVQEQEGVSSSEAEAIVKKLQTEGRYLRDVW
Arabidopsis thaliana CPR2 (AtCPR2) (SEQ ID NO: 94)
MASSSSSSSTSMIDLMAAIIKGEPVIVSDPANASAYESVAAELSSMLIENRQFAMIVTTSIAVL IGCIVMLVWRRSGSGNSKRVEPLKPLVIKPREEEIDDGRKKVTIFFGTQTGTAEGFAKALGEEA KARYEKTRFKIVDLDDYAADDDEYEEKLKKEDVAFFFLATYGDGEPTDNAARFYKWFTEGNDRG EWLKNLKYGVFGLGNRQYEHFNKVAKWDDILVEQGAQRLVQVGLGDDDQCIEDDFTAWREALW PELDTILREEGDTAVATPYTAAVLEYRVSIHDSEDAKFNDINMANGNGYTVFDAQHPYKANVAV KRELHTPESDRSCIHLEFDIAGSGLTYETGDHVGVLCDNLSETVDEALRLLDMSPDTYFSLHAE KEDGTPISSSLPPPFPPCNLRTALTRYACLLSSPKKSALVALAAHASDPTEAERLKHLASPAGK DEYSKWWESQRSLLEVMAEFPSAKPPLGVFFAGVAPRLQPRFYSISSSPKIAETRIHVTCALV YEKMPTGRIHKGVCSTWMKNAVPYEKSENCSSAPIFVRQSNFKLPSDSKVPI IMIGPGTGLAPF RGFLQERLALVESGVELGPSVLFFGCRNRRMDFIYEEELQRFVESGALAELSVAFSREGPTKEY VQHKMMDKASDIWNMISQGAYLYVCGDAKGMARDVHRSLHTIAQEQGSMDSTKAEGFVKNLQTS GRYLRDVW
Arabidopsis thaliana (AtCPR3) (SEQ ID NO: 95)
MASSSSSSSTSMIDLMAAIIKGEPVIVSDPANASAYESVAAELSSMLIENRQFAMIVTTSIAVL IGCIVMLVWRRSGSGNSKRVEPLKPLVIKPREEEIDDGRKKVTIFFGTQTGTAEGFAKALGEEA KARYEKTRFKIVDLDDYAADDDEYEEKLKKEDVAFFFLATYGDGEPTDNAARFYKWFTEGNDRG EWLKNLKYGVFGLGNRQYEHFNKVAKWDDILVEQGAQRLVQVGLGDDDQCIEDDFTAWREALW PELDTILREEGDTAVATPYTAAVLEYRVSIHDSEDAKFNDITLANGNGYTVFDAQHPYKANVAV KRELHTPESDRSCIHLEFDIAGSGLTMKLGDHVGVLCDNLSETVDEALRLLDMSPDTYFSLHAE KEDGTPISSSLPPPFPPCNLRTALTRYACLLSSPKKSALVALAAHASDPTEAERLKHLASPAGK DEYSKWWESQRSLLEVMAEFPSAKPPLGVFFAGVAPRLQPRFYSISSSPKIAETRIHVTCALV YEKMPTGRIHKGVCSTWMKNAVPYEKSEKLFLGRPIFVRQSNFKLPSDSKVPI IMIGPGTGLAP FRGFLQERLALVESGVELGPSVLFFGCRNRRMDFIYEEELQRFVESGALAELSVAFSREGPTKE YVQHKMMDKASDIWNMISQGAYLYVCGDAKGMARDVHRSLHTIAQEQGSMDSTKAEGFVKNLQT SGRYLRDVW
Stevia rebaudiana CPR2 (SrCPR2) (SEQ ID NO: 96)
MAQSESVEASTIDLMTAVLKDTVIDTANASDNGDSKMPPALAMMFEIRDLLLILTTSVAVLVGC
FWLVWKRSSGKKSGKELEPPKIWPKRRLEQEVDDGKKKVTIFFGTQTGTAEGFAKALFEEAK
ARYEKAAFKVIDLDDYAADLDEYAEKLKKETYAFFFLATYGDGEPTDNAAKFYKWFTEGDEKGV
WLQKLQYGVFGLGNRQYEHFNKIGIWDDGLTEQGAKRIVPVGLGDDDQSIEDDFSAWKELVWP
ELDLLLRDEDDKAAATPYTAAIPEYRWFHDKPDAFSDDHTQTNGHAVHDAQHPCRSNVAVKKE
LHTPESDRSCTHLEFDISHTGLSYETGDHVGVYCENLIEWEEAGKLLGLSTDTYFSLHIDNED
GSPLGGPSLQPPFPPCTLRKALTNYADLLSSPKKSTLLALAAHASDPTEADRLRFLASREGKDE YAEWWANQRSLLEVMEAFPSARPPLGVFFAAVAPRLQPRYYSISSSPKMEPNRIHVTCALVYE
KTPAGRIHKGICSTWMKNAVPLTESQDCSWAPIFVRTSNFRLPIDPKVPVIMIGPGTGLAPFRG
FLQERLALKESGTELGSSILFFGCRNRKVDYIYENELNNFVENGALSELDVAFSRDGPTKEYVQ
HKMTQKASEIWNMLSEGAYLYVCGDAKGMAKDVHRTLHTIVQEQGSLDSSKAELYVKNLQMSGR
YLRDVW
Stevia rebaudiana CPR3 (SrCPR3) (SEQ ID NO: 97)
MAQSNSVKISPLDLVTALFSGKVLDTSNASESGESAMLPTIAMIMENRELLMILTTSVAVLIGC
VWLVWRRSSTKKSALEPPVIWPKRVQEEEVDDGKKKVTVFFGTQTGTAEGFAKALVEEAKAR
YEKAVFKVIDLDDYAADDDEYEEKLKKESLAFFFLATYGDGEPTDNAARFYKWFTEGDAKGEWL
NKLQYGVFGLGNRQYEHFNKIAKWDDGLVEQGAKRLVPVGLGDDDQCIEDDFTAWKELVWPEL
DQLLRDEDDTTVATPYTAAVAEYRWFHEKPDALSEDYSYTNGHAVHDAQHPCRSNVAVKKELH
SPESDRSCTHLEFDISNTGLSYETGDHVGVYCENLSEWNDAERLVGLPPDTYFSIHTDSEDGS
PLGGASLPPPFPPCTLRKALTCYADVLSSPKKSALLALAAHATDPSEADRLKFLASPAGKDEYS
QWIVASQRSLLEVMEAFPSAKPSLGVFFASVAPRLQPRYYSISSSPKMAPDRIHVTCALVYEKT
PAGRIHKGVCSTWMKNAVPMTESQDCSWAPIYVRTSNFRLPSDPKVPVIMIGPGTGLAPFRGFL
QERLALKEAGTDLGLSILFFGCRNRKVDFIYENELNNFVETGALSELIVAFSREGPTKEYVQHK
MSEKASDIWNLLSEGAYLYVCGDAKGMAKDVHRTLHTIVQEQGSLDSSKAELYVKNLQMSGRYL
RDW
Artemisia annua CPR (AaCPR) (SEQ ID NO: 98)
MAQSTTSVKLSPFDLMTALLNGKVSFDTSNTSDTNIPLAVFMENRELLMILTTSVAVLIGCVW
LVWRRSSSAAKKAAESPVIWPKKVTEDEVDDGRKKVTVFFGTQTGTAEGFAKALVEEAKARYE
KAVFKVIDLDDYAAEDDEYEEKLKKESLAFFFLATYGDGEPTDNAARFYKWFTEGEEKGEWLDK
LQYAVFGLGNRQYEHFNKIAKWDEKLVEQGAKRLVPVGMGDDDQCIEDDFTAWKELWPELDQ
LLRDEDDTSVATPYTAAVAEYRWFHDKPETYDQDQLTNGHAVHDAQHPCRSNVAVKKELHSPL
SDRSCTHLEFDISNTGLSYETGDHVGVYVENLSEWDEAEKLIGLPPHTYFSVHADNEDGTPLG
GASLPPPFPPCTLRKALASYADVLSSPKKSALLALAAHATDSTEADRLKFLASPAGKDEYAQWI
VASHRSLLEVMEAFPSAKPPLGVFFASVAPRLQPRYYSISSSPRFAPNRIHVTCALVYEQTPSG
RVHKGVCSTWMKNAVPMTESQDCSWAPIYVRTSNFRLPSDPKVPVIMIGPGTGLAPFRGFLQER
LAQKEAGTELGTAILFFGCRNRKVDFIYEDELNNFVETGALSELVTAFSREGATKEYVQHKMTQ
KASDIWNLLSEGAYLYVCGDAKGMAKDVHRTLHTIVQEQGSLDSSKAELYVKNLQMAGRYLRDV
W
CPR (PgCPR) (SEQ ID NO: 99)
MAQSSSGSMSPFDFMTAI IKGKMEPSNASLGAAGEVTAMILDNRELVMILTTSIAVLIGCVWF IWRRSSSQTPTAVQPLKPLLAKETESEVDDGKQKVTIFFGTQTGTAEGFAKALADEAKARYDKV TFKWDLDDYAADDEEYEEKLKKETLAFFFLATYGDGEPTDNAARFYKWFLEGKERGEWLQNLK FGVFGLGNRQYEHFNKIAIWDEILAEQGGKRLISVGLGDDDQCIEDDFTAWRESLWPELDQLL RDEDDTTVSTPYTAAVLEYRWFHDPADAPTLEKSYSNANGHSWDAQHPLRANVAVRRELHTP ASDRSCTHLEFDISGTGIAYETGDHVGVYCENLAETVEEALELLGLSPDTYFSVHADKEDGTPL SGSSLPPPFPPCTLRTALTLHADLLSSPKKSALLALAAHASDPTEADRLRHLASPAGKDEYAQW IVASQRSLLEVMAEFPSAKPPLGVFFASVAPRLQPRYYSISSSPRIAPSRIHVTCALVYEKTPT GRVHKGVCSTWMKNSVPSEKSDECSWAPIFVRQSNFKLPADAKVPI IMIGPGTGLAPFRGFLQE RLALKEAGTELGPSILFFGCRNSKMDYIYEDELDNFVQNGALSELVLAFSREGPTKEYVQHKMM EKASDIWNLISQGAYLYVCGDAKGMARDVHRTLHTIAQEQGSLDSSKAESMVKNLQMSGRYLRD VW
Non-heme iron oxidase
Acetobacter pasteurianus subsp. ascendens (ApGA2ox) (SEQ ID NO: 100)
MSVSKTTETFTSIPVIDISKLYSSDLAERKAVAEKLGDAARNIGFLYISGHNVSADLIEGVRKA ARDFFAEPFEKKMEYYIGTSATHKGFVPEGEEVYSAGRPDHKEAFDIGYEVPANHPLVQAGTPL LGPNNWPDIPGFRSAAEAYYRTVFDLGRTLFRGFALALGLNESYFDTVANFPPSKLRMIHYPYD ADAQDAPGIGAHTDYECFTILLADKPGLEVMNGNGDWIDAPPIPGAFWNIGDMLEVMTAGEFV ATAHRVRKVSEERYSFPLFYACDYHTQIRPLPAFAKKIDASYETITIGEHMWAQALQTYQYLVK KVEKGELKLPKGARKTATFGHFKRNSAA
Cucurbita maxima (CmGA2ox) (SEQ ID NO: 101)
MAAASSFSAAFYSGIPLIDLSAPDAKQLIVKACEELGFFKWKHGVPMELISSLESESTKFFSL PLSEKQRAGPPSPFGYGNKQIGRNGDVGWVEYLLLNTHLESNSDGFLSMFGQDPQKLRSAVNDY ISAVRNMAGEILELMAEGLKIQQRNVFSKLVMDEQSDSVFRVNHYPPCPDLQALKGTNMIGFGE HTDPQI ISVLRSNNTSGFQISLADGNWISVPPDHSSFFINVGDSLQVMTNGRFKSVKHRVLTNS SKSRVSMIYFGGPPLSEKIAPLASLMQGEERSLYKEFTWFEYKRSAYNSRLADNRLVPFERIAA S
Dendrobium catenatum (DcGA3ox) (SEQ ID NO: 102)
MPSLSKEHFDLYSAFHVPETHAWSSSHLHDHPIAGDGATIPVIDISDPDAASMVGGACRSWGVF YATSHGIPADLLHQVESHARRLFSLPLHRKLQTAPRDGSLSGYGRPPISAFFPKLMWSEGFTLA GHDDHLAVTSQLSPFDSLSFCEVMEAYRKEMKKLAGRLFRLLILSLGLEEEEMGQVGPLKELSQ AADAIQLNSYPTCPEPERAIGMAAHTDSAFLTVLHQTDGAGGLQVLRDQDESGSARWVDVLPRP DCLWNVGDLLHILSNGRFKSVRHRAWNRADHRISAAYFIGPPAHMKVGSITKLVDMRTGPMY
RPVTWPEYLGIRTRLFDKALDSVKFQEKELEKD
Cucurbita maxima (CmGA3ox) (SEQ ID NO: 103)
MATTIADVFKSFPVHIPAHKNLDFDSLHELPDSYAWIQPDSFPSPTHKHHNSILDSDSDSVPLI
DLSLPNAAALIGNAFRSWGAFQVINHGVPISLLQSIESSADTLFSLPPSHKLKAARTPDGISGY
GLVRISSFFPKRMWSEGFTIVGSPLDHFRQLWPHDYHKHCEIVEEYDREMRSLCGRLMWLGLGE
LGITRDDMKWAGPDGDFKTSPAATQFNSYPVCPDPDRAMGLGPHTDTSLLTIVYQSNTRGLQVL
REGKRWVTVEPVAGGLWQVGDLLHILTNGLYPSALHQAWNRTRKRLSVAYVFGPPESAEISP
LKKLLGPTQPPLYRPVTWTEYLGKKAEHFNNALSTVRLCAPITGLLDVNDHSRVKVG
Cucurbita maxima (CmGA20ox) (SEQ ID NO: 104)
MHWTSTPEARHDGAPLVFDASVLRHQHNIPKQFIWPDEEKPAATCPELEVPLIDLSGFLSGEK
DAAAEAVRLVGEACEKHGFFLWNHGVDRKLIGEAHKYMDEFFELPLSQKQSAQRKAGEHCGYA
SSFTGRFSSKLPWKETLSFRFAADESLNNLVLHYLNDKLGDQFAKFGRVYQDYCEAMSGLSLGI
MELLGKSLGVEEQCFKNFFKDNDSIMRLNFYPPCQKPHLTLGTGPHCDPTSLTILHQDQVGGLQ
VFVDNQWRLITPNFDAFWNIGDTFMALSNGRYKSCLHRAWNSERTRKSLAFFLCPRNDKWR
PPRELVDTQNPRRYPDFTWSMLLRFTQTHYRADMKTLEAFSAWLQQEQQEQQEQQFNI
Agapanthus praecox subsp. orientalis (ApoGA20ox) (SEQ ID NO: 105)
MVLQPFVFDAALLRDEHNIPTQFIWPEEDKPSPDASEELILPFIDLKAFLSGDPDSPFQVSKQV
GEACESLGAFQVTNHGIDFDLLEEAHSCIQKFFSMPLCEKQRALRKAGESYGYASSFTGRFCSK
LPWKETLSFRYSSSSSDIVQNYFVRTLGEEFRHFGEVYQKYCESMSKLSLMIMEVLGLSLGVGR
MHFREFFEGNDSTMRLNYYPPCKKPDLTLGTGPHCDPTSLTILHQDDVSGLQVFTGGKWLTVRP
KTDAFWNIGDTFTALSNGRYKSCLHRAWNSKTARKSLAFFLCPAMNKIVRPPRELVDIDHPR
AYPDFTWSALLEFTQKHYRADMQTLNEFSKYILQAQGTLHK
Arabidopsis thaliana (AtF3H) (SEQ ID NO: 106)
MAPGTLTELAGESKLNSKFVRDEDERPKVAYNVFSDEIPVISLAGIDDVDGKRGEICRQIVEAC
ENWGIFQWDHGVDTNLVADMTRLARDFFALPPEDKLRFDMSGGKKGGFIVSSHLQGEAVQDWR
EIVTYFSYPVRNRDYSRWPDKPEGWVKVTEEYSERLMSLACKLLEVLSEAMGLEKESLTNACVD
MDQKIWNYYPKCPQPDLTLGLKRHTDPGTITLLLQDQVGGLQATRDNGKTWITVQPVEGAFW
NLGDHGHFLSNGRFKNADHQAWNSNSSRLSIATFQNPAPDATVYPLKVREGEKAILEEPITFA
EMYKRKMGRDLELARLKKLAKEERDHKEVDKPVDQIFA
Chrysosplenium americanum (CaF6H) (SEQ ID NO: 107)
QEKTLNSRFVARDEDSLERPKVSAIYNGSFDEIPVLISLAGIDMTGAGTDAAARRSEICRKIVE ACEDWGIFGEIDDDHGKRAEICDKIVKACEDWGVFQPDEKLESVMSAAKKGDFWDHGVDAEVI SQWTTFAKPTSHTQFETETTRDFPNKPEGWKATTEQYSRTLMGLACKLLGVISEAMGLEKEALT KACVDMDQKVWNYYPKCPQPDLTLGLKRHTDPGTITLLLQDQVGGLQATRDGGKTWITVQPVK DNGWILLHIGDSNGHRHGHFLSNGRFKSHQAYRYRRPTRGSPTFGTKVSNYPPCPEQSLVRPPA GRPYGRALNALDAKKLASAKQQLESAAILLISELAVAYI ILAILPSSEI IAEEGYL
Datura stramonium (DsH6H) (SEQ ID NO: 108)
MATFVSNWSTNNVSESFIAPLEKRAEKDVALGNDVPI IDLQQDHLLIVQQITKACQDFGLFQVI NHGVPEKLMVEAMEVYKEFFALPAEEKEKFQPKGEPAKFELPLEQKAKLYVEGERRCNEEFLYW KDTLAHGCYPLHEELLNSWPEKPPTYRDVIAKYSVEVRKLTMRILDYICEGLGLKLGYFDNELT QIQMLLANYYPSCPDPSSTIGSGGHYDGNLITLLQQDLVGLQQLIVKDDKWIAVEPIPTAFWN LGLTLKVMSNEKFEGSIHRWTHPTRNRISIGTLIGPDYSCTIEPIKELLSQENPPLYKPYPYA KFAEIYLSDKSDYDAGVKPYKINQFPN
Arabidopsis thaliana (AtH6DH) (SEQ ID NO: 109)
MENHTTMKVSSLNCIDLANDDLNHSWSLKQACLDCGFFYVINHGISEEFMDDVFEQSKKLFAL
PLEEKMKVLRNEKHRGYTPVLDELLDPKNQINGDHKEGYYIGIEVPKDDPHWDKPFYGPNPWPD
ADVLPGWRETMEKYHQEALRVSMAIARLLALALDLDVGYFDRTEMLGKPIATMRLLRYQGISDP
SKGIYACGAHSDFGMMTLLATDGVMGLQICKDKNAMPQKWEYVPPIKGAFIVNLGDMLERWSNG
FFKSTLHRVLGNGQERYSIPFFVEPNHDCLVECLPTCKSESELPKYPPIKCSTYLTQRYEETHA
NLSIYHQQT
Solanum lycopersicum (SlF35H) (SEQ ID NO: 110)
MALRINELFVAAI IYI IVHI I ISKLITTVRERGRRLPLPPGPTGWPVIGALPLLGSMPHVALAK MAKKYGPIMYLKVGTCGMWASTPNAAKAFLKTLDINFSNRPPNAGATHLAYNAQDMVFAPYGP RWKLLRKLSNLHMLGGKALENWANVRANELGHMLKSMFDASQDGECWIADVLTFAMANMIGQV MLSKRVFVEKGVEVNEFKNMWELMTVAGYFNIGDFIPKLAWMDIQGIEKGMKNLHKKFDDLLT KMFDEHEATSNERKENPDFLDWMANRDNSEGERLSTTNIKALLLNLFTAGTDTSSSVIEWALA EMMKNPKIFEKAQQEMDQVIGKNRRLIESDIPNLPYLRAICKETFRKHPSTPLNLPRVSSEPCT VDGYYIPKNTRLSVNIWAIGRDPDVWENPLEFTPERFLSGKNAKIEPRGNDFELIPFGAGRRIC AGTRMGIVMVEYILGTLVHSFDWKLPNNVIDINMEESFGLALQKAVPLEAMVTPRLSLDVYRC
D4H (SEQ ID NO: 111)
MPKSWPIVISSHSFCFLPNSEQERKMKDLNFHAATLSEEESLRELKAFDETKAGVKGIVDTGIT
KIPRIFIDQPKNLDRISVCRGKSDIKIPVINLNGLSSNSEIRREIVEKIGEASEKYGFFQIVNH GIPQDVMDKMVDGVRKFHEQDDQIKRQYYSRDRFNKNFLYSSNYVLIPGIACNWRDTMECIMNS
NQPDPQEFPDVCRDILMKYSNYVRNLGLILFELLSEALGLKPNHLEEMDCAEGLILLGHYYPAC
PQPELTFGTSKHSDSGFLTILMQDQIGGLQILLENQWIDVPFIPGALVINIADLLQLITNDKFK
SVEHRVLANKVGPRISVAVAFGIKTQTQEGVSPRLYGPIKELISEENPPIYKEVTVKDFITIRF
AKRFDDSSSLSPFRLNN
Catharanthus roseus (CrD4Hlike) (SEQ ID NO: 112)
MKELNNSEEELKAFDDTKAGVKALVDSGITEIPRIFLDHPTNLDQISSKDREPKFKKNIPVIDL DGISTNSEIRREIVEKIREASEKWGFFQIVNHGIPQEVMDDMIVGIRRFHEQDNEIKKQFYTRD RTKSFRYTSNFVLNPKIACNWRDTFECTMAPHQPNPQDLPDICRDIMMKYISYTRNLGLTLFEL LSEALGLKSNRLKDMHCDEGVELVGHYYPACPQPELTLGTSKHTDTGFLTMLQQDQIGGLQVLY ENHQWVDVPFIPGALI INIGDFLQI ISNDKFKSAPHRVLANKNGPRISTASVFMPNFLESAEVR LYGPIKELLSEENPPIYEQITAKDYVTVQFSRGLDGDSFLSPFMLNKDNMEK
Zea mays (ZmBX6) (SEQ ID NO: 113)
MAPTTATKDDSGYGDERRRELQAFDDTKLGVKGLVDSGVKSIPSIFHHPPEALSDI ISPAPLPS SPPSGAAIPWDLSVTRREDLVEQVRHAAGTVGFFWLVNHGVAEELMGGMLRGVRQFNEGPVEA KQALYSRDLARNLRFASNFDLFKAAAADWRDTLFCEVAPNPPPREELPEPLRNVMLEYGAAVTK LARFVFELLSESLGMPSDHLYEMECMQNLNWCQYYPPCPEPHRTVGVKRHTDPGFFTILLQDG MGGLQVRLGNNGQSGGCWVDIAPRPGALMVNIGDLLQLVTNDRFRSVEHRVPANKSSDTARVSV ASFFNTDVRRSERMYGPIPDPSKPPLYRSVRARDFIAKFNTIGLDGRALDHFRL
Hordeum vulgare subsp. vulgare (HvIDS2) (SEQ ID NO: 114)
MAKVMNLTPVHASSIPDSFLLPADRLHPATTDVSLPI IDMSRGRDEVRQAILDSGKEYGFIQW NHGISEPMLHEMYAVCHEFFDMPAEDKAEFFSEDRSERNKLFCGSAFETLGEKYWIDVLELLYP LPSGDTKDWPHKPQMLREWGNYTSLARGVAMEILRLLCEGLGLRPDFFVGDISGGRVWDINY YPPSPNPSRTLGLPPHCDRDLMTVLLPGAVPGLEIAYKGGWIKVQPVPNSLVINFGLQLEWTN GYLKAVEHRAATNFAEPRLSVASFIVPADDCWGPAEEFVSEDNPPRYRTLTVGEFKRKHNWN LDSSINQI ININNNQKGI
Hordeum vulgare subsp. vulgare (HvIDS3) (SEQ ID NO: 115)
MENILHATPAPVSLPESFVFASDKVPPATKAWSLPI IDLSCGRDEVRRSILEAGKELGFFQW NHGVSKQVMRDMEGMCEQFFHLPAADKASLYSEERHKPNRLFSGATYDTGGEKYWRDCLRLACP FPVDDSINEWPDTPKGLRDVIEKFTSQTRDVGKELLRLLCEGMGIRADYFEGDLSGGNVILNIN HYPSCPNPDKALGQPPHCDRNLITLLLPGAVNGLEVSYKGDWIKVDPAPNAFWNFGQQLEWT NGLLKSIEHRAMTNSALARTSVATFIMPTQECLIGPAKEFLSKENPPCYRTTMFRDFMRIYNW
KLGSSLNLTTNLKNVQKEI
Uridine diphosphate dependent glycosyltransferase (UGT)
Siraitia grosvenorii UGT720-269-1 (SEQ ID NO: 116)
MEDRNAMDMSRIKYRPQPLRPASMVQPRVLLFPFPALGHVKPFLSLAELLSDAGIDWFLSTEY NHRRISNTEALASRFPTLHFETIPDGLPPNESRALADGPLYFSMREGTKPRFRQLIQSLNDGRW PITCI ITDIMLSSPIEVAEEFGIPVIAFCPCSARYLSIHFFIPKLVEEGQIPYADDDPIGEIQG VPLFEGLLRRNHLPGSWSDKSADISFSHGLINQTLAAGRASALILNTFDELEAPFLTHLSSIFN KIYTIGPLHALSKSRLGDSSSSASALSGFWKEDRACMSWLDCQPPRSWFVSFGSTMKMKADEL REFWYGLVSSGKPFLCVLRSDWSGGEAAELIEQMAEEEGAGGKLGMWEWAAQEKVLSHPAVG GFLTHCGWNSTVESIAAGVPMMCWPILGDQPSNATWIDRVWKIGVERNNREWDRLTVEKMVRAL MEGQKRVEIQRSMEKLSKLANEKWRGINLHPTISLKKDTPTTSEHPRHEFENMRGMNYEMLVG NAIKSPTLTKK
Siraitia grosvenorii UGT94-289-3 (SEQ ID NO: 117)
MTIFFSVEILVLGIAEFAAIAMDAAQQGDTTTILMLPWLGYGHLSAFLELAKSLSRRNFHIYFC STSVNLDAIKPKLPSSFSDSIQFVELHLPSSPEFPPHLHTTNGLPPTLMPALHQAFSMAAQHFE SILQTLAPHLLIYDSLQPWAPRVASSLKIPAINFNTTGVFVISQGLHPIHYPHSKFPFSEFVLH NHWKAMYSTADGASTERTRKRGEAFLYCLHASCSVILINSFRELEGKYMDYLSVLLNKKWPVG PLVYEPNQDGEDEGYSSIKNWLDKKEPSSTVFVSFGSEYFPSKEEMEEIAHGLEASEVNFIWW RFPQGDNTSGIEDALPKGFLERAGERGMWKGWAPQAKILKHWSTGGFVSHCGWNSVMESMMFG VPIIGVPMHVDQPFNAGLVEEAGVGVEAKRDPDGKIQRDEVAKLIKEVWEKTREDVRKKAREM SEILRSKGEEKFDEMVAEISLLLKI
Siraitia grosvenorii UGT74-345-2 (SEQ ID NO: 118)
MDETTVNGGRRASDVWFAFPRHGHMSPMLQFSKRLVSKGLRVTFLITTSATESLRLNLPPSSS LDLQVISDVPESNDIATLEGYLRSFKATVSKTLADFIDGIGNPPKFIVYDSVMPWVQEVARGRG LDAAPFFTQSSAVNHILNHVYGGSLSIPAPENTAVSLPSMPVLQAEDLPAFPDDPEWMNFMTS QFSNFQDAKWIFFNTFDQLECKKQSQWNWMADRWPIKTVGPTIPSAYLDDGRLEDDRAFGLNL LKPEDGKNTRQWQWLDSKDTASVLYISFGSLAILQEEQVKELAYFLKDTNLSFLWVLRDSELQK LPHNFVQETSHRGLWNWCSQLQVLSHRAVSCFVTHCGWNSTLEALSLGVPMVAIPQWVDQTTN AKFVADVWRVGVRVKKKDERIVTKEELEASIRQWQGEGRNEFKHNAIKWKKLAKEAVDEGGSS DKNIEEFVKTIA
Siraitia grosvenorii UGT75-281-2 (SEQ ID NO: 119)
MGDNGDGGEKKELKENVKKGKELGRQAIGEGYINPSLQLARRLISLGVNVTFATTVLAGRRMKN KTHQTATTPGLSFATFSDGFDDETLKPNGDLTHYFSELRRCGSESLTHLITSAANEGRPITFVI YSLLLSWAADIASTYDIPSALFFAQPATVLALYFYYFHGYGDTICSKLQDPSSYIELPGLPLLT SQDMPSFFSPSGPHAFILPPMREQAEFLGRQSQPKVLVNTFDALEADALRAIDKLKMLAIGPLI PSALLGGNDSSDASFCGDLFQVSSEDYIEWLNSKPDSSWYISVGSICVLSDEQEDELVHALLN SGHTFLWVKRSKENNEGVKQETDEEKLKKLEEQGKMVSWCRQVEVLKHPALGCFLTHCGWNSTI ESLVSGLPWAFPQQIDQATNAKLIEDVWKTGVRVKANTEGIVEREEIRRCLDLVMGSRDGQKE EIERNAKKWKELARQAIGEGGSSDSNLKTFLWEIDLEI
Siraitia grosvenorii UGT720-269-4 (SEQ ID NO: 120)
MAEQAHDLLHVLLFPFPAEGHIKPFLCLAELLCNAGFHVTFLNTDYNHRRLHNLHLLAARFPSL
HFESISDGLPPDQPRDILDPKFFISICQVTKPLFRELLLSYKRISSVQTGRPPITCVITDVIFR
FPIDVAEELDIPVFSFCTFSARFMFLYFWIPKLIEDGQLPYPNGNINQKLYGVAPEAEGLLRCK
DLPGHWAFADELKDDQLNFVDQTTASSRSSGLILNTFDDLEAPFLGRLSTIFKKIYAVGPIHSL
LNSHHCGLWKEDHSCLAWLDSRAAKSWFVSFGSLVKITSRQLMEFWHGLLNSGKSFLFVLRSD
WEGDDEKQWKEIYETKAEGKWLWGWAPQEKVLAHEAVGGFLTHSGWNSILESIAAGVPMIS
CPKIGDQSSNCTWISKVWKIGLEMEDRYDRVSVETMVRSIMEQEGEKMQKTIAELAKQAKYKVS
KDGTSYQNLECLIQDIKKLNQIEGFINNPNFSDLLRV
Siraitia grosvenorii UGT94-289-2 (SEQ ID NO: 121)
MDAQQGHTTTILMLPWVGYGHLLPFLELAKSLSRRKLFHIYFCSTSVSLDAIKPKLPPSISSDD SIQLVELRLPSSPELPPHLHTTNGLPSHLMPALHQAFVMAAQHFQVILQTLAPHLLIYDILQPW APQVASSLNIPAINFSTTGASMLSRTLHPTHYPSSKFPISEFVLHNHWRAMYTTADGALTEEGH KIEETLANCLHTSCGWLVNSFRELETKYIDYLSVLLNKKWPVGPLVYEPNQEGEDEGYSSIK NWLDKKEPSSTVFVSFGTEYFPSKEEMEEIAYGLELSEVNFIWVLRFPQGDSTSTIEDALPKGF LERAGERAMWKGWAPQAKILKHWSTGGLVSHCGWNSMMEGMMFGVPI IAVPMHLDQPFNAGLV EEAGVGVEAKRDSDGKIQREEVAKSIKEWIEKTREDVRKKAREMDTKHGPTYFSRSKVSSFGR LYKINRPTTLTVGRFWSKQIKMKRE
Siraitia grosvenorii UGT94-289-1 (SEQ ID NO: 122)
MDAQRGHTTTILMFPWLGYGHLSAFLELAKSLSRRNFHIYFCSTSVNLDAIKPKLPSSSSSDSI QLVELCLPSSPDQLPPHLHTTNALPPHLMPTLHQAFSMAAQHFAAILHTLAPHLLIYDSFQPWA PQLASSLNIPAINFNTTGASVLTRMLHATHYPSSKFPISEFVLHDYWKAMYSAAGGAVTKKDHK IGETLANCLHASCSVILINSFRELEEKYMDYLSVLLNKKWPVGPLVYEPNQDGEDEGYSSIKN WLDKKEPSSTVFVSFGSEYFPSKEEMEEIAHGLEASEVHFIWWRFPQGDNTSAIEDALPKGFL ERVGERGMWKGWAPQAKILKHWSTGGFVSHCGWNSVMESMMFGVPI IGVPMHLDQPFNAGLAE EAGVGVEAKRDPDGKIQRDEVAKLIKEVWEKTREDVRKKAREMSEILRSKGEEKMDEMVAAIS
LFLKI
Momordica charantia 1 (McUGT1) (SEQ ID NO: 123)
MAQPQTQARVLVFPYPTVGHIKPFLSLAELLADGGLDWFLSTEYNHRRIPNLEALASRFPTLH FDTIPDGLPIDKPRVI IGGELYTSMRDGVKQRLRQVLQSYNDGSSPITCVICDVMLSGPIEAAE ELGIPWTFCPYSARYLCAHFVMPKLIEEGQIPFTDGNLAGEIQGVPLFGGLLRRDHLPGFWFV KSLSDEVWSHAFLNQTLAVGRTSALI INTLDELEAPFLAHLSSTFDKIYPIGPLDALSKSRLGD SSSSSTVLTAFWKEDQACMSWLDSQPPKSVIFVSFGSTMRMTADKLVEFWHGLVNSGTRFLCVL RSDIVEGGGAADLIKQVGETGNGIWEWAAQEKVLAHRAVGGFLTHCGWNSTMESIAAGVPMMC WQIYGDQMINATWIGKVWKIGIERDDKWDRSTVEKMIKELMEGEKGAEIQRSMEKFSKLANDKV VKGGTSFENLELIVEYLKKLKPSN
Momordica charantia 2 (McUGT2) (SEQ ID NO: 124)
MAQPRVLLFPFPAMGHVKPFLSLAELLSDAGVEWFLSTEYNHRRIPDIGALAARFPTLHFETI PDGLPPDQPRVLADGHLYFSMLDGTKPRFRQLIQSLNGNPRPITCI INDVMLSSPIEVAEEFGI PVIAFCPCSARFLSVHFFMPNFIEEAQIPYTDENPMGKIEEATVFEGLLRRKDLPGLWCAKSSN ISFSHRFINQTIAAGRASALILNTFDELESPFLNHLSSIFPKIYCIGPLNALSRSRLGKSSSSS SALAGFWKEDQAYMSWLESQPPRSVIFVSFGSTMKMEAWKLAEFWYGLVNSGSPFLFVFRPDCV INSGDAAEVMEGRGRGMWEWASQEKVLAHPAVGGFLTHCGWNSTVESIVAGVPMMCCPIVADQ LSNATWIHKVWKIGIEGDEKWDRSTVEMMIKELMESQKGTEIRTSIEMLSKLANEKWKGGTSL NNFELLVEDIKTLRRPYT
Momordica charantia 3 (McUGT3) (SEQ ID NO: 125)
MEQSDSNSDDHQHHVLLFPFPAKGHIKPFLCLAQLLCGAGLQVTFLNTDHNHRRIDDRHRRLLA TQFPMLHFKSISDGLPPDHPRDLLDGKLIASMRRVTESLFRQLLLSYNGYGNGTNNVSNSGRRP PISCVITDVIFSFPVEVAEELGIPVFSFATFSARFLFLYFWIPKLIQEGQLPFPDGKTNQELYG VPGAEGI IRCKDLPGSWSVEAVAKNDPMNFVKQTLASSRSSGLILNTFEDLEAPFVTHLSNTFD KIYTIGPIHSLLGTSHCGLWKEDYACLAWLDARPRKSWFVSFGSLVKTTSRELMELWHGLVSS GKSFLLVLRSDWEGEDEEQWKEILESNGEGKWLWGWAPQEEVLAHEAIGGFLTHSGWNSTM ESIAAGVPMVCWPKIGDQPSNCTWVSRVWKVGLEMEERYDRSTVARMARSMMEQEGKEMERRIA ELAKRVKYRVGKDGESYRNLESLIRDIKITKSSN
Momordica charantia 4 (McUGT4) (SEQ ID NO: 126)
MDAHQQAEHTTTILMLPWVGYGHLTAYLELAKALSRRNFHIYYCSTPVNIESIKPKLTIPCSSI
QFVELHLPSSDDLPPNLHTTNGLPSHLMPTLHQAFSAAAPLFEEILQTLCPHLLIYDSLQPWAP KIASSLKIPALNFNTSGVSVIAQALHAIHHPDSKFPLSDFILHNYWKSTYTTADGGASEKTRRA REAFLYCLNSSGNAILINTFRELEGEYIDYLSLLLNKKVIPIGPLVYEPNQDEDQDEEYRSIKN WLDKKEPCSTVFVSFGSEYFPSNEEMEEIAPGLEESGANFIWWRFPKLENRNGI IEEGLLERA GERGMVIKEWAPQARILRHGSIGGFVSHCGWNSVMESIICGVPVIGVPMRVDQPYNAGLVEEAG VGVEAKRDPDGKIQRHEVSKLIKQVWEKTRDDVRKKVAQMSEILRRKGDEKIDEMVALISLLP KG
Momordica charantia 5 (McUGT5) (SEQ ID NO: 127)
MDARQQAEHTTTILMLPWVGYGHLSAYLELAKALSRRNFHIYYCSTPVNIESIKPKLTIPCSSI QFVELHLPFSDDLPPNLHTTNGLPSHLMPALHQAFSAAAPLFEAILQTLCPHLLIYDSLQPWAP QIASSLKIPALNFNTTGVSVIARALHTIHHPDSKFPLSEIVLHNYWKATHATADGANPEKFRRD LEALLCCLHSSCNAILINTFRELEGEYIDYLSLLLNKKVTPIGPLVYEPNQDEEQDEEYRSIKN WLDKKEPYSTIFVSFGSEYFPSNEEMEEIARGLEESGANFIWWRFHKLENGNGITEEGLLERA GERGMVIQGWAPQARILRHGSIGGFVSHCGWNSVMESI ICGVPVIGVPMGLDQPYNAGLVEEAG VGVEAKRDPDGKIQRHEVSKLIKQVWEKTRDDVRKKVAQMSEILRRKGDEKIDEMVALISLLL KG
Cucumis sativus (SEQ ID NO: 128)
MGLSPTDHVLLFPFPAKGHIKPFFCLAHLLCNAGLRVTFLSTEHHHQKLHNLTHLAAQIPSLHF
QSISDGLSLDHPRNLLDGQLFKSMPQVTKPLFRQLLLSYKDGTSPITCVITDLILRFPMDVAQE
LDIPVFCFSTFSARFLFLYFSIPKLLEDGQIPYPEGNSNQVLHGIPGAEGLLRCKDLPGYWSVE
AVANYNPMNFVNQTIATSKSHGLILNTFDELEVPFITNLSKIYKKVYTIGPIHSLLKKSVQTQY
EFWKEDHSCLAWLDSQPPRSVMFVSFGSIVKLKSSQLKEFWNGLVDSGKAFLLVLRSDALVEET
GEEDEKQKELVIKEIMETKEEGRWVIVNWAPQEKVLEHKAIGGFLTHSGWNSTLESVAVGVPMV
SWPQIGDQPSNATWLSKVWKIGVEMEDSYDRSTVESKVRSIMEHEDKKMENAIVELAKRVDDRV
SKEGTSYQNLQRLIEDIEGFKLN
Cucurbita maxima 1 (CmaUGT1) (SEQ ID NO: 129)
MELSHTHHVLLFPFPAKGHIKPFFSLAQLLCNAGLRVTFLNTDHHHRRIHDLNRLAAQLPTLHF DSVSDGLPPDEPRNVFDGKLYESIRQVTSSLFRELLVSYNNGTSSGRPPITCVITDVMFRFPID IAEELGIPVFTFSTFSARFLFLIFWIPKLLEDGQLRYPEQELHGVPGAEGLIRWKDLPGFWSVE DVADWDPMNFVNQTLATSRSSGLILNTFDELEAPFLTSLSKIYKKIYSLGPINSLLKNFQSQPQ YNLWKEDHSCMAWLDSQPRKSWFVSFGSWKLTSRQLMEFWNGLVNSGMPFLLVLRSDVIEAG EEWREIMERKAEGRWVIVSWAPQEEVLAHDAVGGFLTHSGWNSTLESLAAGVPMISWPQIGDQ TSNSTWISKVWRIGLQLEDGFDSSTIETMVRSIMDQTMEKTVAELAERAKNRASKNGTSYRNFQ TLIQDITNI IETHI
Cucurbita maxima 2 (CmaUGT2) (SEQ ID NO: 130)
MDAQKAVDTPPTTVLMLPWIGYGHLSAYLELAKALSRRNFHVYFCSTPVNLDSIKPNLIPPPSS IQFVDLHLPSSPELPPHLHTTNGLPSHLKPTLHQAFSAAAQHFEAILQTLSPHLLIYDSLQPWA PRIASSLNIPAINFNTTAVSI IAHALHSVHYPDSKFPFSDFVLHDYWKAKYTTADGATSEKIRR GAEAFLYCLNASCDWLVNSFRELEGEYMDYLSVLLKKKWSVGPLVYEPSEGEEDEEYWRIKK WLDEKEALSTVLVSFGSEYFPSKEEMEEIAHGLEESEANFIWWRFPKGEESCRGIEEALPKGF VERAGERAMWKKWAPQGKILKHGSIGGFVSHCGWNSVLESIRFGVPVIGVPMHLDQPYNAGLL EEAGIGVEAKRDADGKIQRDQVASLIKRVWEKTREDIWKTVREMREVLRRRDDDMIDEMVAEI SWLKI
Cucurbita maxima 3 (CmaUGT3) (SEQ ID NO: 131)
MSSNLFLKISIPFGRLRDSALNCSVFHCKLHLAIAIAMDAQQAANKSPTATTIFMLPWAGYGHL SAYLELAKALSTRNFHIYFCSTPVSLASIKPRLIPSCSSIQFVELHLPSSDEFPPHLHTTNGLP SRLVPTFHQAFSEAAQTFEAFLQTLRPHLLIYDSLQPWAPRIASSLNIPAINFFTAGAFAVSHV LRAFHYPDSQFPSSDFVLHSRWKIKNTTAESPTQAKLPKIGEAIGYCLNASRGVILTNSFRELE GKYIDYLSVILKKRVFPIGPLVYQPNQDEEDEDYSRIKNWLDRKEASSTVLVSFGSEFFLSKEE TEAIAHGLEQSEANFIWGIRFPKGAKKNAIEEALPEGFLERAGGRAMWEEWVPQGKILKHGSI GGFVSHCGWNSAMESIVCGVPI IGIPMQVDQPFNAGILEEAGVGVEAKRDSDGKIQRDEVAKLI KEVWERTREDIRNKLEKINEILRSRREEKLDELATEISLLSRN
Cucurbita moschata 1 (CmoUGT1) (SEQ ID NO: 132)
MELSPTHHLLLFPFPAKGHIKPFFSLAQLLCNAGARVTFLNTDHHHRRIHDLDRLAAQLPTLHF DSVSDGLPPDESRNVFDGKLYESIRQVTSSLFRELLVSYNNGTSSGRPPITCVITDCMFRFPID IAEELGIPVFTFSTFSARFLFLFFWIPKLLEDGQLRYPEQELHGVPGAEGLIRCKDLPGFLSDE DVAHWKPINFVNQILATSRSSGLILNTFDELEAPFLTSLSKIYKKIYSLGPINSLLKNFQSQPQ YNLWKEDHSCMAWLDSQPPKSWFVSFGSWKLTNRQLVEFWNGLVNSGKPFLLVLRSDVIEAG EEWRENMERKAEGRWMIVSWAPQEEVLAHDAVGGFLTHSGWNSTLESLAAGVPMISWTQIGDQ TSNSTWVSKVWRIGLQLEDGFDSFTIETMVRSVMDQTMEKTVAELAERAKNRASKNGTSYRNFQ TLIQDITNI IETHI
Cucurbita moschata 2 (CmoUGT2) (SEQ ID NO: 133)
MDAQKAVDTPPTTVLMLPWIGYGHLSAYLELAKALSRRNFHVYFCSTPVNLDSIKPNLIPPPPS IQFVDLHLPSSPELPPHLHTTNGLPSHLKPTLHQAFSAAAQHFEAILQTLSPHLLIYDSLQPWA PRIASSLNIPAINFNTTAVSI IAHALHSVHYPDSKFPFSDFVLHDYWKAKYTTADGATSEKTRR GVEAFLYCLNASCDWLVNSFRELEGEYMDYLSVLLKKKWSVGPLVYEPSEGEEDEEYWRIKK
WLDEKEALSTVLVSFGSEYFPPKEEMEEIAHGLEESEANFIWWRFPKGEESSSRGIEEALPKG FVERAGERAMWKKWAPQGKILKHGSIGGFVSHCGWNSVLESIRFGVPVIGAPMHLDQPYNAGL
LEEAGIGVEAKRDADGKIQRDQVASLIKQVWEKTREDIWKKVREMREVLRRRDDDDMMIDEMV
AVISWLKI
Cucurbita moschata 3 (CmoUGT3) (SEQ ID NO: 134)
MDAQQAANKSPTASTIFMLPWVGYGHLSAYLELAKALSTRNFHVYFCSTPVSLASIKPRLIPSC
SSIQFVELHLPSSDEFPPHLHTTNGLPAHLVPTIHQAFAAAAQTFEAFLQTLRPHLLIYDSLQP
WAPRIASSLNIPAINFFTAGAFAVSHVLRAFHYPDSQFPSSDFVLHSRWKIKNTTAESPTQVKI
PKIGEAIGYCLNASRGVILTNSFRELEGKYIDYLSVILKKRVLPIGPLVYQPNQDEEDEDYSRI
KNWLDRKEASSTVLVSFGSEFFLSKEETEAIAHGLEQSEANFIWGIRFPKGAKKNAIEEALPEG
FLERVGGRAMWEEWVPQGKILKHGNIGGFVSHCGWNSAMESIMCGVPVIGIPMQVDQPFNAGI
LEEAGVGVEAKRDSDGKIQRDEVAKLIKEVWERTREDIRNKLEEINEILRTRREEKLDELATE
ISLLCKN
Prunus persica (SEQ ID NO: 135)
MAMKQPHVI IFPFPLQGHMKPLLCLAELLCHAGLHVTYVNTHHNHQRLANRQALSTHFPTLHFE SISDGLPEDDPRTLNSQLLIALKTSIRPHFRELLKTISLKAESNDTLVPPPSCIMTDGLVTFAF DVAEELGLPILSFNVPCPRYLWTCLCLPKLIENGQLPFQDDDMNVEITGVPGMEGLLHRQDLPG FCRVKQADHPSLQFAINETQTLKRASALILDTVYELDAPCISHMALMFPKIYTLGPLHALLNSQ IGDMSRGLASHGSLWKSDLNCMTWLDSQPSKSI IYVSFGTLVHLTRAQVIEFWYGLVNSGHPFL WVMRSDITSGDHQIPAELENGTKERGCIVDWVSQEEVLAHKSVGGFLTHSGWNSTLESIVAGLP MICWPKLGDHYI ISSTVCRQWKIGLQLNENCDRSNIESMVQTLMGSKREEIQSSMDAISKLSRD SVAEGGSSHNNLEQLIEYIRNLQHQN
Theobroma cacao (SEQ ID NO: 136)
MRQPHVLVLPFPAQGHIKPMLCLAELLCQAGLRVTFLNTHHSHRRLNNLQDLSTRFPTLHFESV SDGLPEDHPRNLVHFMHLVHSIKNVTKPLLRDLLTSLSLKTDIPPVSCI IADGILSFAIDVAEE LQIKVI IFRTISSCCLWSYLCVPKLIQQGELQFSDSDMGQKVSSVPEMKGSLRLHDRPYSFGLK QLEDPNFQFFVSETQAMTRASAVIFNTFDSLEAPVLSQMIPLLPKVYTIGPLHALRKARLGDLS QHSSFNGNLREADHNCITWLDSQPLRSWYVSFGSHWLTSEELLEFWHGLVNSGKRFLWVLRP DI IAGEKDHNQI IAREPDLGTKEKGLLVDWAPQEEVLAHPSVGGFLTHCGWNSTLESMVAGVPM LCWPKLPDQLVNSSCVSEWKIGLDLKDMCDRSTVEKMVRALMEDRREEVMRSVDGISKLARES VSHGGSSSSNLEMLIQELET
Corchorus capsularis (SEQ ID NO: 137)
MDSKQKKMSVLMFPWLAYGHISPFLELAKKLSKRNFHTFFFSTPINLNSIKSKLSPKYAQSIQF VELHLPSLPDLPPHYHTTNGLPPHLMNTLKKAFDMSSLQFSKILKTLNPDLLVYDFIQPWAPLL ALSNKIPAVHFLCTSAAMSSFSVHAFKKPCEDFPFPNIYVHGNFMNAKFNNMENCSSDDSISDQ DRVLQCFERSTKI ILVKTFEELEGKFMDYLSVLLNKKIVPTGPLTQDPNEDEGDDDERTKLLLE WLNKKSKSSTVFVSFGSEYFLSKEEREEIAYGLELSKVNFIWVIRFPLGENKTNLEEALPQGFL QRVSERGLWENWAPQAKILQHSSIGGFVSHCGWSSVMESLKFGVPI IAIPMHLDQPLNARLW DVGVGLEVIRNHGSLEREEIAKLIKEWLGNGNDGEIVRRKAREMSNHIKKKGEKDMDELVEEL MLICKMKPNSCHLS
Ziziphus jujuba (SEQ ID NO: 138)
MMERQRSIKVLMFPWLAHGHISPFLELAKRLTDRNFQIYFCSTPVNLTSVKPKLSQKYSSSIKL
VELHLPSLPDLPPHYHTTNGLALNLIPTLKKAFDMSSSSFSTILSTIKPDLLIYDFLQPWAPQL
ASCMNIPAVNFLSAGASMVSFVLHSIKYNGDDHDDEFLTTELHLSDSMEAKFAEMTESSPDEHI
DRAVTCLERSNSLILIKSFRELEGKYLDYLSLSFAKKWPIGPLVAQDTNPEDDSMDIINWLDK
KEKSSTVFVSFGSEYYLTNEEMEEIAYGLELSKVNFIWWRFPLGQKMAVEEALPKGFLERVGE
KGMWEDWAPQMKILGHSSIGGFVSHCGWSSLMESLKLGVPIIAMPMQLDQPINAKLVERSGVG
LEVKRDKNGRIEREYLAKVIREIWEKARQDIEKKAREMSNIITEKGEEEIDNWEELAKLCGM
Vitis vinifera (SEQ ID NO: 139)
MDARQSDGISVLMFPWLAHGHISPFLQLAKKLSKRNFSIYFCSTPVNLDPIKGKLSESYSLSIQ LVKLHLPSLPELPPQYHTTNGLPPHLMPTLKMAFDMASPNFSNILKTLHPDLLIYDFLQPWAPA AASSLNIPAVQFLSTGATLQSFLAHRHRKPGIEFPFQEIHLPDYEIGRLNRFLEPSAGRISDRD RANQCLERSSRFSLIKTFREIEAKYLDYVSDLTKKKMVTVGPLLQDPEDEDEATDIVEWLNKKC EASAVFVSFGSEYFVSKEEMEEIAHGLELSNVDFIWWRFPMGEKIRLEDALPPGFLHRLGDRG MWEGWAPQRKILGHSSIGGFVSHCGWSSVMEGMKFGVPI IAMPMHLDQPINAKLVEAVGVGRE VKRDENRKLEREEIAKVIKEWGEKNGENVRRKARELSETLRKKGDEEIDVWEELKQLCSY
Juglans regia (SEQ ID NO: 140)
MDTARKRIRWMLPWLAHGHISPFLELSKKLAKRNFHIYFCSTPVNLSSIKPKLSGKYSRSIQL
VELHLPSLPELPPQYHTTKGLPPHLNATLKRAFDMAGPHFSNILKTLSPDLLIYDFLQPWAPAI
AASQNIPAINFLSTGAAMTSFVLHAMKKPGDEFPFPEIHLDECMKTRFVDLPEDHSPSDDHNHI
SDKDRALKCFERSSGFVMMKTFEELEGKYINFLSHLMQKKIVPVGPLVQNPVRGDHEKAKTLEW
LDKRKQSSAVFVSFGTEYFLSKEEMEEIAYGLELSNVNFIWWRFPEGEKVKLEEALPEGFLQR
VGEKGMWEGWAPQAKILMHPSIGGFVSHCGWSSVMESIDFGVPIVAIPMQLDQPVNAKWEQA
GVGVEVKRDRDGKLEREEVATVIREWMGNIGESVRKKEREMRDNIRKKGEEKMDGVAQELVQL
YGNGIKNV
Hevea brasiliensis (SEQ ID NO: 141)
METLQRRKISVLMFPWLAHGHLSPFLELSKKLNKRNFHVYFCSTPVNLDSIKPKLSAEYSFSIQ LVELHLPSSPELPLHYHTTNGLPPHLMKNLKNAFDMASSSFFNILKTLKPDLLIYDFIQPWAPA LASSLNIPAVNFLCTSMAMSCFGLHLNNQEAKFPFPGIYPRDYMRMKVFGALESSSNDIKDGER AGRCMDQSFHLILAKTFRELEGKYIDYLSVKLMKKIVPVGPLVQDPIFEDDEKIMDHHQVIKWL EKKERLSTVFVSFGTEYFLSTEEMEEIAYGLELSKAHFIWWRFPTGEKINLEESLPKRYLERV QERGKIVEGWAPQQKILRHSSIGGFVSHCGWSSIMESMKFGVPI IAMPMNLDQPVNSRIVEDAG VGIEVRRNKSGELEREEIAKTIRKVWEKDGKNVSRKAREMSDTIRKKGEEEIDGWDELLQLC DVKTNYLQ
Manihot esculenta (SEQ ID NO: 142)
MATAQTRKISVLMFPWLAHGHLSPFLELSKKLANRNFHVYFCSTPVNLDSIKPKLSPEYHFSIQ FVELHLPSSPELPSHYHTTNGLPPHLMKTLKKAFDMASSSFFNILKTLNPDLLIYDFLQPWAPA LASSLNIPAVNFLCSSMAMSCFGLNLNKNKEIKFLFPEIYPRDYMEMKLFRVFESSSNQIKDGE RAGRCIDQSFHVILAKTFRELEGKYIDYVSVKCNKKIVPVGPLVEDTIHEDDEKTMDHHHHHHD EVIKWLEKKERSTTVFVSFGSEYFLSKEEMEEIAHGLELSKVNFIWWRFPKGEKINLEESLPE GYLERIQERGKIVEGWAPQRKILGHSSIGGFVSHCGWSSIMESMKLGVPIIAMPMNLDQPINSR IVEAAGVGIEVSRNQSGELEREEMAKTIRKVWEREGVYVRRKAREMSDVLRKKGEEEIDGWD ELVQLCDMKTNYL
Cephalotus follicularis (SEQ ID NO: 143)
MDLKRRSIRVLMLPWLAHGHISPFLELAKKLTNRNFLIYFCSTPINLNSIKPKLSSKYSFSIQL VELHLPSLPELPPHYHTTNGLPLHLMNTLKTAFDMASPSFLNILKTLKPDLLICDHLQPWAPSL ASSLNIPAI IFPTNSAIMMAFSLHHAKNPGEEFPFPSININDDMVKSINFLHSASNGLTDMDRV LQCLERSSNTMLLKTFRQLEAKYVDYSSALLKKKIVLAGPLVQVPDNEDEKIEI IKWLDSRGQS STVFVSFGSEYFLSKEEREDIAHGLELSKVNFIWWRFPVGEKVKLEEALPNGFAERIGERGLV VEGWAPQAMILSHSSIGGFVSHCGWSSMMESMKFGVPI IAMPMHIDQPLNARLVEDVGVGLEIK RNKDGRFEREELARVIKEVLVYKNGDAVRSKAREMSEHIKKNGDQEIDGVADALVKLCEMKTNS LNQD
Stevia rebaudiana UGT74G1 (SEQ ID NO: 144)
MAEQQKIKKSPHVLLIPFPLQGHINPFIQFGKRLISKGVKTTLVTTIHTLNSTLNHSNTTTTSI
EIQAISDGCDEGGFMSAGESYLETFKQVGSKSLADLIKKLQSEGTTIDAIIYDSMTEWVLDVAI
EFGIDGGSFFTQACWNSLYYHVHKGLISLPLGETVSVPGFPVLQRWETPLILQNHEQIQSPWS
QMLFGQFANIDQARWVFTNSFYKLEEEVIEWTRKIWNLKVIGPTLPSMYLDKRLDDDKDNGFNL
YKANHHECMNWLDDKPKESWYVAFGSLVKHGPEQVEEITRALIDSDVNFLWVIKHKEEGKLPE NLSEVIKTGKGLIVAWCKQLDVLAHESVGCFVTHCGFNSTLEAISLGVPWAMPQFSDQTTNAK LLDEILGVGVRVKADENGIVRRGNLASCIKMIMEEERGVI IRKNAVKWKDLAKVAVHEGGSSDN DIVEFVSELIKA
Stevia rebaudiana UGT76G1 (SEQ ID NO: 145)
MENKTETTVRRRRRI ILFPVPFQGHINPILQLANVLYSKGFSITIFHTNFNKPKTSNYPHFTFR
FILDNDPQDERISNLPTHGPLAGMRIPI INEHGADELRRELELLMLASEEDEEVSCLITDALWY FAQSVADSLNLRRLVLMTSSLFNFHAHVSLPQFDELGYLDPDDKTRLEEQASGFPMLKVKDIKS AYSNWQILKEILGKMIKQTKASSGVIWNSFKELEESELETVIREIPAPSFLIPLPKHLTASSSS LLDHDRTVFQWLDQQPPSSVLYVSFGSTSEVDEKDFLEIARGLVDSKQSFLWWRPGFVKGSTW VEPLPDGFLGERGRIVKWVPQQEVLAHGAIGAFWTHSGWNSTLESVCEGVPMIFSDFGLDQPLN ARYMSDVLKVGVYLENGWERGEIANAIRRVMVDEEGEYIRQNARVLKQKADVSLMKGGSSYESL ESLVSYISSL
Stevia rebaudiana UGT85C2 (SEQ ID NO: 146)
MDAMATTEKKPHVIFIPFPAQSHIKAMLKLAQLLHHKGLQITFVNTDFIHNQFLESSGPHCLDG APGFRFETIPDGVSHSPEASIPIRESLLRSIETNFLDRFIDLVTKLPDPPTCIISDGFLSVFTI DAAKKLGIPVMMYWTLAACGFMGFYHIHSLIEKGFAPLKDASYLTNGYLDTVIDWVPGMEGIRL KDFPLDWSTDLNDKVLMFTTEAPQRSHKVSHHIFHTFDELEPSI IKTLSLRYNHIYTIGPLQLL LDQIPEEKKQTGITSLHGYSLVKEEPECFQWLQSKEPNSWYVNFGSTTVMSLEDMTEFGWGLA NSNHYFLWI IRSNLVIGENAVLPPELEEHIKKRGFIASWCSQEKVLKHPSVGGFLTHCGWGSTI ESLSAGVPMICWPYSWDQLTNCRYICKEWEVGLEMGTKVKRDEVKRLVQELMGEGGHKMRNKAK
DWKEKARIAIAPNGSSSLNIDKMVKEITVLARN
Stevia rebaudiana UGT91D1 (SEQ ID NO: 147)
MYNVTYHQNSKAMATSDSIVDDRKQLHVATFPWLAFGHILPFLQLSKLIAEKGHKVSFLSTTRN IQRLSSHISPLINWQLTLPRVQELPEDAEATTDVHPEDIQYLKKAVDGLQPEVTRFLEQHSPD WIIYDFTHYWLPSIAASLGISRAYFCVITPWTIAYLAPSSDAMINDSDGRTTVEDLTTPPKWFP FPTKVCWRKHDLARMEPYEAPGISDGYRMGMVFKGSDCLLFKCYHEFGTQWLPLLETLHQVPW PVGLLPPEIPGDEKDETWVSIKKWLDGKQKGSWYVALGSEALVSQTEWELALGLELSGLPFV WAYRKPKGPAKSDSVELPDGFVERTRDRGLVWTSWAPQLRILSHESVCGFLTHCGSGSIVEGLM FGHPLIMLPIFCDQPLNARLLEDKQVGIEIPRNEEDGCLTKESVARSLRSVWENEGEIYKANA RALSKIYNDTKVEKEYVSQFVDYLEKNARAVAIDHES
Stevia rebaudiana UGT91D2 (SEQ ID NO: 148)
MATSDSIVDDRKQLHVATFPWLAFGHILPYLQLSKLIAEKGHKVSFLSTTRNIQRLSSHISPLI NWQLTLPRVQELPEDAEATTDVHPEDIPYLKKASDGLQPEVTRFLEQHSPDWI IYDYTHYWLP SIAASLGISRAHFSVTTPWAIAYMGPSADAMINGSDGRTTVEDLTTPPKWFPFPTKVCWRKHDL ARLVPYKAPGISDGYRMGLVLKGSDCLLSKCYHEFGTQWLPLLETLHQVPWPVGLLPPEVPGD EKDETWVSIKKWLDGKQKGSWYVALGSEVLVSQTEWELALGLELSGLPFVWAYRKPKGPAKS DSVELPDGFVERTRDRGLWTSWAPQLRILSHESVCGFLTHCGSGSIVEGLMFGHPLIMLPIFG DQPLNARLLEDKQVGIEIPRNEEDGCLTKESVARSLRSVWEKEGEIYKANARELSKIYNDTKV EKEYVSQFVDYLEKNTRAVAIDHES
Stevia rebaudiana UGT91D2e (SEQ ID NO: 149)
MATSDSIVDDRKQLHVATFPWLAFGHILPYLQLSKLIAEKGHKVSFLSTTRNIQRLSSHISPLI NWQLTLPRVQELPEDAEATTDVHPEDIPYLKKASDGLQPEVTRFLEQHSPDWI IYDYTHYWLP SIAASLGISRAHFSVTTPWAIAYMGPSADAMINGSDGRTTVEDLTTPPKWFPFPTKVCWRKHDL ARLVPYKAPGISDGYRMGLVLKGSDCLLSKCYHEFGTQWLPLLETLHQVPWPVGLLPPEIPGD EKDETWVSIKKWLDGKQKGSWYVALGSEVLVSQTEWELALGLELSGLPFVWAYRKPKGPAKS DSVELPDGFVERTRDRGLWTSWAPQLRILSHESVCGFLTHCGSGSIVEGLMFGHPLIMLPIFG DQPLNARLLEDKQVGIEIPRNEEDGCLTKESVARSLRSVWEKEGEIYKANARELSKIYNDTKV EKEYVSQFVDYLEKNARAVAIDHES
OsUGT1-2 (SEQ ID NO: 150)
MDSGYSSSYAAAAGMHWICPWLAFGHLLPCLDLAQRLASRGHRVSFVSTPRNISRLPPVRPAL
APLVAFVALPLPRVEGLPDGAESTNDVPHDRPDMVELHRRAFDGLAAPFSEFLGTACADWVIVD
VFHHWAAAAALEHKVPCAMMLLGSAHMIASIADRRLERAETESPAAAGQGRPAAAPTFEVARMK
LIRTKGSSGMSLAERFSLTLSRSSLWGRSCVEFEPETVPLLSTLRGKPITFLGLMPPLHEGRR
EDGEDATVRWLDAQPAKSWYVALGSEVPLGVEKVHELALGLELAGTRFLWALRKPTGVSDADL
LPAGFEERTRGRGWATRWVPQMSILAHAAVGAFLTHCGWNSTIEGLMFGHPLIMLPIFGDQGP
NARLIEAKNAGLQVARNDGDGSFDREGVAAAIRAVAVEEESSKVFQAKAKKLQEIVADMACHER
YIDGFIQQLRSYKD
Arabidopsis thaliana AAN72025.1 (SEQ ID NO: 151)
MGSISEMVFETCPSPNPIHVMLVSFQGQGHVNPLLRLGKLIASKGLLVTFVTTELWGKKMRQAN
KIVDGELKPVGSGSIRFEFFDEEWAEDDDRRADFSLYIAHLESVGIREVSKLVRRYEEANEPVS
CLINNPFIPWVCHVAEEFNIPCAVLWVQSCACFSAYYHYQDGSVSFPTETEPELDVKLPCVPVL
KNDEIPSFLHPSSRFTGFRQAILGQFKNLSKSFCVLIDSFDSLEREVIDYMSSLCPVKTVGPLF
KVARTVTSDVSGDICKSTDKCLEWLDSRPKSSWYISFGTVAYLKQEQIEEIAHGVLKSGLSFL
WVIRPPPHDLKVETHVLPQELKESSAKGKGMIVDWCPQEQVLSHPSVACFVTHCGWNSTMESLS SGVPWCCPQWGDQVTDAVYLIDVFKTGVRLGRGATEERWPREEVAEKLLEATVGEKAEELRK
NALKWKAEAEAAVAPGGSSDKNFREFVEKLGAGVTKTKDNGY
Arabidopsis thaliana AAF87256.1 (SEQ ID NO: 152)
MGSHVAQKQHWCVPYPAQGHINPMMKVAKLLYAKGFHITFVNTVYNHNRLLRSRGPNAVDGLP SFRFESIPDGLPETDVDVTQDIPTLCESTMKHCLAPFKELLRQINARDDVPPVSCIVSDGCMSF TLDAAEELGVPEVLFWTTSACGFLAYLYYYRFIEKGLSPIKDESYLTKEHLDTKIDWIPSMKNL RLKDIPSFIRTTNPDDIMLNFI IREADRAKRASAIILNTFDDLEHDVIQSMKSIVPPVYSIGPL HLLEKQESGEYSEIGRTGSNLWREETECLDWLNTKARNSWYVNFGSITVLSAKQLVEFAWGLA ATGKEFLWVIRPDLVAGDEAMVPPEFLTATADRRMLASWCPQEKVLSHPAIGGFLTHCGWNSTL ESLCGGVPMVCWPFFAEQQTNCKFSRDEWEVGIEIGGDVKREEVEAWRELMDEEKGKNMREKA EEWRRLANEATEHKHGSSKLNFEMLVNKVLLGE
Columba livia C1UGT1 (SEQ ID NO: 153)
MIHCGKKHICAFVTCILISASILMYSWKDPQLQNNITRKIFQATSALPASQLCRGKPAQNVITA LEDNRTFI ISPYFDDRESKVTRVIGIVHHEDVKQLYCWFCCQPDGKIYVARAKIDVHSDRFGFP YGAADIVCLEPENCNPTHVSIHQSPHANIDQLPSFKIKNRKSETFSVDFTVCISAMFGNYNNVL QFIQSVEMYKILGVQKWIYKNNCSQLMEKVLKFYMEEGTVEI IPWPINSHLKVSTKWHFSMDA KDIGYYGQITALNDCIYRNMQRSKFWLNDADEI ILPLKHLDWKAMMSSLQEQNPGAGIFLFEN HIFPKTVSTPVFNISSWNRVPGVNILQHVHREPDRKEVFNPKKMI IDPRQWQTSVHSVLRAYG NSVNVPADVALVYHCRVPLQEELPRESLIRDTALWRYNSSLITNVNKVLHQTVL
Haemophilus ducreyi LgtF Q9L875 (SEQ ID NO: 154)
MPTLTVAMIVKNEAQDLAECLKTVDGWVDEIVIVDSGSTDDTLKIATQFNAKVYVNSDWQGFGP
QRQFAQQYVTSDYVLWLDADERVTPELKASILQAVQHNQKNTVYKVSRLSEIFGKEIRYSGWYP
DYWRLYPTYLAKYGDELVHEKVHYPADSRVEKLQGDLLHFTYKNIHHYLVKSASYAKAWAMQR
AKAGKKASLLDGVTHAIACFLKMYLFKAGFLDGKQGFLLAVLSAHSTFVKYADLWDRTRS
Neisseria gonorrhoeae Q5F735 (SEQ ID NO: 155)
MKKVSVLIVAKNEANHIRECIESCRFDKEVIVIDDHSADNTAEIAEGLGAKVFRRHLNGDFGAQ KTFAIEQAGGEWVFLIDADERCTPELSDEISKIVRTGDYAAYFVERRNLFPNHPATHGAMRPDS VCRLMPKKGGSVQGKVHETVQTPYPERRLKHFMYHYTYDNWEQYFNKFNKYTSISAEKYREQGK PVSFVRDI ILRPIWGFFKIYILNKGFLDGKMGWIMSVNHSYYTMIKYVKLYYLYKSGGKF
Rhizobium meliloti (strain 1021) ExoM P33695 (SEQ ID NO: 156)
MPNETLHIDIGVCTYRRPELAETLRSLAAMNVPERARLRVIVADNDAEPSARALVEGLRPEMPF
DILYVHCPHSNISIARNCCLDNSTGDFLAFLDDDETVSGDWLTRLLETARTTGAAAVLGPVRAH
YGPTAPRWMRSGDFHSTLPVWAKGEIRTGYTCNALLRRDAASLLGRRFKLSLGKSGGEDTDFFT
GMHCAGGTIAFSPEAWVHEPVPENRASLAWLAKRRFRSGQTHGRLLAEKAHGLRQAWNIALAGA
KSGFCATAAVLCFPSAARRNRFALRAVLHAGVISGLLGLKEIEQYGAREVTSA
Rhizobium radiobacter Q44418 (SEQ ID NO: 157)
MCRCGRAVRSRPVCRPGQLWRRSPRPRSRNHSRCRPLRLSVFPRPHRRVRHHCQRDLRWEPGR WIAVRWKAARSHRRFRRCPFPRQLVWPVRERHRDAGDRRNQRERRRRDAYHEISEPKFRTRKRT ESFWMNKAITVIVWLLVSLCVLAI ITMPVSLQTHLVATAISLILLATIKSFNGQGAWRLVALGF GTAIVLRYVYWRTTSTLPPVNQLENFIPGFLLYLAEMYSWMLGLSLVIVSMPLPSRKTRPGSP DYRPTVDVFVPSYNEDAELLANTLAAAKNMDYPADRFTVWLLDDGGSVQKRNAANIVEAQAAQR RHEELKKLCEDLDVRYLTRERNVHAKAGNLNNGLAHSTGELVTVFDADHAPARDFLLETVGYFD EDPRLFLVQTPHFFVNPDPIERNLRTFETMPSENEMFYGI IQRGLDKWNGAFFCGSAAVLRREA LQDSDGFSGVSITEDCETALALHSRGWNSVYVDKPLIAGLQPATFASFIGQRSRWAQGMMQILI FRQPLFKRGLSFTQRLCYMSSTLFWLFPFPRTIFLFAPLFYLFFDLQIFVASGGEFLAYTAAYM LVNLMMQNYLYGSFRWPWISELYEYVQTVHLLPAWSVIFNPGKPTFKVTAKDESIAEARLSEI SRPFFVIFALLLVAMAFAWRIYSEPYKADVTLWGGWNLLNLIFAGCALGWSERGDKSASRR ITVKRRCEVQLGGSDTWVPASIDNVSVHGLLINIFDSATNIEKGATAIVKVKPHSEGVPETMPL NWRTVRGEGFVSIGCTFSPQRAVDHRLIADLIFA SEQWSEFQRVRRKKPGLIRGTAIFLAIA LFQTQRGLYYLVRARRPAPKSAKPVGAVK
Streptococcus agalactiae cpsI O87183 (SEQ ID NO: 158)
MIKKIEKDLISVIVPIYNVEDYLVECIESLIVQTYRNIEILLINDGSTDNCATIAKEFSERDCR VIYIEKSNGGLSEARNYGIYHSKGKYLTFVDSDDKVSSDYIANLYNAIQKHDSSIAIGGYLEFY ERHNSIRNYEYLDKVIPVEEALLNMYDIKTYGSIFITAWGKLFHKSIFNDLEFALNKYHEDEFF NYKAYLKANSITYIDKPLYHYRIRVGSIMNNSDNVI IARKKLDVLSALDERIKLITSLRKYSVF LQKTEIFYVNQYFRTKKFLKQQSVMFKEDNYIDAYRMYGRLLRKVKLVDKLKLIKNRFF
Streptococcus pneumoniae cps3S Q54611 (SEQ ID NO: 159)
MYTFILMLLDFFQNHDFHFFMLFFVFILIRWAVIYFHAVRYKSYSCSVSDEKLFSSVIIPWDE PLNLFESVLNRISRHKPSEI IWINGPKNERLVKLCHDFNEKLENNMTPIQCYYTPVPGKRNAI RVGLEHVDSQSDITVLVDSDTVWTPRTLSELLKPFVCDKKIGGVTTRQKILDPERNLVTMFANL LEEIRAEGTMKAMSVTGKVGCLPGRTIAFRNIVERVYTKFIEETFMGFHKEVSDDRSLTNLTLK KGYKTVMQDTSWYTDAPTSWKKFIRQQLRWAEGSQYNNLKMTPWMIRNAPLMFFIYFTDMILP MLLISFGVNIFLLKILNITTIVYTASWWEI ILYVLLGMIFSFGGRNFKAMSRMKWYYVFLIPVF
I IVLSI IMCPIRLLGLMRCSDDLGWGTRNLTE
MbUGTc13 (SEQ ID NO: 160)
MADAMATTEKKPHVIFIPFPAQSHIKAMLKLAQLLHHKGLQITFVNTDFIHNQFLESSGPHCLD GAPGFRFETIPDGVSHSPEASIPIRESLLRSIETNFLDRFIDLVTKLPDPPTCIISDGFLSVFT IDAAKKLGIPVMMYWTLAACGFMGFYHIHSLIEKGFAPLKDASYLTNGYLDTVIDWVPGMEGIR LKDFPLDWSTDLNDKVLMFTTEATQRSHKVSHHIFHTFDELEPSI IKTLSLRYNHIYTIGPLQL LLDQIPEEKKQTGITSLHGYSLVKEEPECFQWLQSKEPNSWYVNFGSTTVMSLEDMTEFGWGL ANSNHYFLWI IRSNLVIGENAVLPPELEEHIKKRGFIASWCSQEKVLKHPSVGGFLTHCGWGST IESLSAGVPMICWPYSWDQLTNCRYICKEWEVGLEMGTKVKRDEVKRLVQELMGEGGHKMRNKA KDWKEKARIAIAPNGSSSLNIDKMVKEITVLARN
MbUGTc19 (SEQ ID NO: 161)
MANHHECMNWLDDKPKESWYVAFGSLVKHGPEQVEEITRALIDSDVNFLWVIKHKEEGKLPEN LSEVIKTGKGLIVAWCKQLDVLAHESVGCFVTHCGFNSTLEAISLGVPWAMPQFSDQTTNAKL LDEILGVGVRVKADENGIVRRGNLASCIKMIMEEERGVI IRKNAVKWKDLAKVAVHEGGSSDND IVEFVSELIKAGSGEQQKIKKSPHVLLIPFPLQGHINPFIQFGKRLISKGVKTTLVTTIHTLNS TLNHSNTTTTSIEIQAISDGCDEGGFMSAGESYLETFKQVGSKSLADLIKKLQSEGTTIDAI IY DSMTEWVLDVAIEFGIDGGSFFTQACWNSLYYHVHKGLISLPLGETVSVPGFPVLQRWETPLI LQNHEQIQSPWSQMLFGQFANIDQARWVFTNSFYKLEEEVIEWTRKIWNLKVIGPTLPSMYLDK RLDDDKDNGFNLYKA
MbUGT1-3 (SEQ ID NO: 162)
MENKTETTVRRRRRI ILFPVPFQGHINPILQLANVLYSKGFSITIFHTNFNKPKTSNYPHFTFR FILDNDPQDERISNLPTHGPLAGMRIPI INEHGADELRRELELLMLASEEDEEVSCLITDALWY FAQSVADSLNLRRLVLMTSSLFNFHAHVSLPQFDELGYLDPDDKTRLEEQASGFPMLKVKDIKS AYSNWQILKEILGKMIKQTKASSGVIWNSFKELEESELETVIREIPAPSFLIPLPKHLTASSSS LLDHDRTVFQWLDQQPPSSVLYVSFGSTSEVDEKDFLEIARGLVDSKQSFLWWRPGFVKGSTW VEPLPDGFLGERGRIVKWVPQQEVLAHGAIGAFWTHSGWNSTLESVCEGVPMIFSDFGLDQPLN ARYMSDVLKVGVYLENGWERGEIANAIRRVMVDEEGEYIRQNARVLKQKADVSLMKGGSSYESL ESLVSYISSL
MbUGT1-2 (SEQ ID NO: 163)
MATKGSSGMSLAERFWLTLSRSSLWGRSCVEFEPETVPLLSTLRGKPITFLGLMPPLHEGRRE
DGEDATVRWLDAQPAKSWYVALGSEVPLGVEKVHELALGLELAGTRFLWALRKPTGVSDADLL PAGFEERTRGRGWATRWVPQMSILAHAAVGAFLTHCGWNSTIEGLMFGHPLIMLPIFGDQGPN ARLIEAKNAGLQVARNDGDGSFDREGVAAAIRAVAVEEESSKVFQAKAKKLQEIVADMACHERY IDGFIQQLRSYKDDSGYSSSYAAAAGMHWICPWLAFGHLLPCLDLAQRLASRGHRVSFVSTPR NISRLPPVRPALAPLVAFVALPLPRVEGLPDGAESTNDVPHDRPDMVELHRRAFDGLAAPFSEF LGTACADWVIVDVFHHWAAAAALEHKVPCAMMLLGSAEMIASIADERLEHAETESPAAAGQGRP AAAPTFEVARMKLIR
Coffea arabica (SEQ ID NO: 164)
MENHATFNVLMLPWLAHGHVSPYLELAKKLTARNFNVYLCSSPATLSSVRSKLTEKFSQSIHLV ELHLPKLPELPAEYHTTNGLPPHLMPTLKDAFDMAKPNFCNVLKSLKPDLLIYDLLQPWAPEAA SAFNIPAWFISSSATMTSFGLHFFKNPGTKYPYGNAIFYRDYESVFVENLTRRDRDTYRVINC MERSSKI ILIKGFNEIEGKYFDYFSCLTGKKWPVGPLVQDPVLDDEDCRIMQWLNKKEKGSTV FVSFGSEYFLSKKDMEEIAHGLEVSNVDFIWWRFPKGENIVIEETLPKGFFERVGERGLWNG WAPQAKILTHPNVGGFVSHCGWNSVMESMKFGLPIIAMPMHLDQPINARLIEEVGAGVEVLRDS KGKLHRERMAETINKVMKEASGESVRKKARELQEKLELKGDEEIDDWKELVQLCATKNKRNGL HYY
Stevia rebaudiana UGT85C1 (SEQ ID NO: 165)
MDQMAKIDEKKPHWFIPFPAQSHIKCMLKLARILHQKGLYITFINTDTNHERLVASGGTQWLE NAPGFWFKTVPDGFGSAKDDGVKPTDALRELMDYLKTNFFDLFLDLVLKLEVPATCI ICDGCMT FANTIRAAEKLNIPVILFWTMAACGFMAFYQAKVLKEKEIVPVKDETYLTNGYLDMEIDWIPGM KRIRLRDLPEFILATKQNYFAFEFLFETAQLADKVSHMI IHTFEELEASLVSEIKSIFPNVYTI GPLQLLLNKITQKETNNDSYSLWKEEPECVEWLNSKEPNSWYVNFGSLAVMSLQDLVEFGWGL VNSNHYFLWI IRANLIDGKPAVMPQELKEAMNEKGFVGSWCSQEEVLNHPAVGGFLTHCGWGSI IESLSAGVPMLGWPSIGDQRANCRQMCKEWEVGMEIGKNVKRDEVEKLVRMLMEGLEGERMRKK
ALEWKKSATLATCCNGSSSLDVEKLANEIKKLSRN
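The claims refer to circular permutants of UGT enzymes. As an illustration only, the following Python sketch shows one way such a sequence can be written down: the polypeptide is cut at a chosen position and the C-terminal segment is placed ahead of the N-terminal segment, with a short linker between them. The function name, the toy input sequence, the GSG linker, and the added initiator methionine are all assumptions made for this sketch and are not taken from the specification. As listed, SEQ ID NO: 161 appears consistent with this kind of layout relative to SEQ ID NO: 144, but the sketch makes no statement about how any listed sequence was actually designed.

```python
def circular_permutant(seq: str, cut: int, linker: str = "GSG",
                       add_start_met: bool = True) -> str:
    """Return a circularly permuted version of `seq` (hypothetical helper).

    The residues from `cut` onward become the new N-terminal segment,
    followed by `linker`, followed by the original residues before `cut`.
    """
    # Remove whitespace left over from line wrapping in a sequence listing.
    seq = "".join(seq.split()).upper()
    if not 0 < cut < len(seq):
        raise ValueError("cut must fall strictly inside the sequence")
    permuted = seq[cut:] + linker + seq[:cut]
    # Optionally prepend a start methionine if the new N-terminus lacks one.
    if add_start_met and not permuted.startswith("M"):
        permuted = "M" + permuted
    return permuted


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # toy sequence, not from the listing
    print(circular_permutant(toy, 6))  # MAKQRGSGMKTAYI
```

The cut position and linker are the main design choices in such a rearrangement; the defaults here are placeholders rather than recommended values.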

Claims

1. A method for making a triterpenoid, comprising:
providing a recombinant microbial host cell expressing a heterologous enzyme pathway catalyzing the conversion of isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) to one or more triterpenoids, the pathway comprising a farnesyl diphosphate synthase (FPPS) and a squalene synthase (SQS), wherein the SQS comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 2 to 16, 166, and 167; and
culturing the host cell under conditions for producing the triterpenoid.
2. The method of claim 1, wherein the SQS comprises an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11).
3. The method of claim 2, wherein the SQS comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 11.
4. The method of claim 2, wherein the SQS comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 11, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
5. The method of claim 2, wherein the SQS comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 2, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 166, or SEQ ID NO: 167.
6. The method of any one of claims 1 to 5, wherein the triterpenoid is squalene.
7. The method of claim 6, wherein the microbial host cell is prokaryotic or eukaryotic, and is optionally a bacterium selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida, or is optionally a yeast selected from a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica.
8. The method of claim 7, wherein the microbial host cell is E. coli.
9. The method of claim 8, wherein the E. coli produces increased MEP pathway products, and has an overexpression of one or more MEP pathway enzymes.
10. The method of claim 6, wherein the heterologous enzyme pathway further comprises a squalene epoxidase (SQE).
11. The method of claim 10, wherein the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to any one of SEQ ID NOS: 17 to 39, 168, 169, or 170.
12. The method of claim 11, wherein the squalene epoxidase comprises an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39).
13. The method of claim 12, wherein the SQE comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 39.
14. The method of claim 12, wherein the SQE comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 39, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
15. The method of claim 1, wherein the host cell is a bacterium that coexpresses an SQS enzyme comprising an amino acid sequence that is at least 70% identical to Artemisia annua SQS (SEQ ID NO: 11), and a squalene epoxidase comprising an amino acid sequence that is at least 70% identical to Methylomonas lenta squalene epoxidase (SEQ ID NO: 39).
16. The method of any one of claims 1 to 15, wherein the heterologous enzyme pathway further comprises a triterpene cyclase.
17. The method of claim 16, wherein the triterpene cyclase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 40 to 55.
18. The method of claim 17, wherein the triterpene cyclase is a cucurbitadienol synthase (CDS).
19. The method of claim 18, wherein the CDS comprises an amino acid sequence that is at least 70% identical to the amino acid sequence of SEQ ID NO: 40.
20. The method of claim 19, wherein the CDS comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 40.
21. The method of claim 19, wherein the CDS comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 40, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
22. The method of any one of claims 1 to 21, wherein the heterologous enzyme pathway further comprises an epoxide hydrolase (EPH).
23. The method of claim 22, wherein the EPH comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 56 to 72.
24. The method of any one of claims 1 to 23, wherein the heterologous pathway further comprises one or more oxidases.
25. The method of claim 24, wherein at least one oxidase is a cytochrome P450 enzyme.
26. The method of claim 25, wherein at least one cytochrome P450 enzyme comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 73 to 91.
27. The method of claim 24, wherein at least one oxidase is a non-heme iron oxidase.
28. The method of claim 27, wherein the non-heme iron oxidase comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 100 to 115.
29. The method of any one of claims 24 to 28, wherein the microbial host cell expresses one or more electron transfer proteins selected from a cytochrome P450 reductase (CPR), flavodoxin reductase (FPR) and ferredoxin reductase (FDXR) sufficient to regenerate the one or more oxidases.
30. The method of any one of claims 1 to 29, wherein the heterologous enzyme pathway produces mogrol.
31. The method of claim 30, wherein the heterologous enzyme pathway further comprises one or more uridine diphosphate-dependent glycosyltransferase (UGT) enzymes, thereby producing one or more mogrol glycosides.
32. The method of claim 31, wherein the one or more mogrol glycosides are selected from Mog. II-E, Mog. III-A-2, Mog. III-E, Mog. IIIx, Mog. IV-A, Mog. IV-E, Siamenoside, Isomog. IV, and Mog. V.
33. The method of claim 32, wherein the one or more mogrol glycosides include Mog. VI, Isomog. V, and Mog. V.
34. The method of claim 33, wherein the host cell produces Mog. V.
35. The method of any one of claims 31 to 34, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to an amino acid sequence selected from SEQ ID NOS: 116 to 165.
36. The method of claim 35, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165).
37. The method of claim 35 or 36, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165.
38. The method of claim 37, wherein at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
39. The method of claim 35, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146).
40. The method of claim 39, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146.
41. The method of claim 40, wherein at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
42. The method of claim 35, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164).
43. The method of claim 42, wherein at least one UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164.
44. The method of claim 43, wherein at least one UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
45. The method of claim 31, wherein at least one UGT enzyme is a circular permutant of a wild-type UGT enzyme, or a derivative thereof.
46. The method of claim 45, wherein at least one UGT enzyme is a circular permutant of SEQ ID NO: 146, SEQ ID NO: 164, or SEQ ID NO: 165, or a derivative thereof.
47. The method of any one of claims 31 to 45, comprising at least one UGT enzyme capable of catalyzing beta 1,2 addition of a glucose molecule.
48. The method of claim 47, wherein the UGT enzyme comprises the amino acid sequence of SEQ ID NO: 117, or a circular permutant thereof.
49. The method of claim 47 or 48, wherein the heterologous enzyme pathway comprises four UGT enzymes:
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 165 or a derivative thereof;
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 146 or a derivative thereof;
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 164 or derivative thereof; and
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT (SEQ ID NO: 117), or comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 117 or derivative thereof.
50. The method of any one of claims 31 to 49, wherein the microbial host cell has one or more genetic modifications that increase the production or availability of UDP-glucose.
51. The method of claim 50, wherein the one or more genetic modifications include one or more of ΔgalE, ΔgalT, ΔgalK, ΔgalM, ΔushA, Δagp, Δpgm, duplication or overexpression of E. coli GALU, expression of Bacillus subtilis UGPA, and expression of Bifidobacterium adolescentis SPL.
52. A method for making Mog. V, comprising:
reacting a mogrol glycoside with a uridine diphosphate dependent glycosyltransferase (UGT) comprising an amino acid sequence that is at least 70% identical to SEQ ID NO: 164, or comprising an amino acid sequence that is a circular permutant of SEQ ID NO: 164 optionally having from 1 to 20 amino acid substitutions, deletions, and/or insertions with respect to the corresponding position of SEQ ID NO: 164.
53. The method of claim 52, wherein the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 164 or a circular permutant thereof.
54. The method of claim 52, wherein the UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 164, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions.
55. The method of any one of claims 52 to 64, wherein the mogrol glycoside substrate comprises Mog. IIE, Mog. III, Mog. IV or Siamenoside.
56. The method of claim 55, wherein the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C1 (SEQ ID NO: 165), or a circular permutant thereof.
57. The method of claim 56, wherein the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 165, or a circular permutant thereof.
58. The method of claim 56, wherein the UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 165, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 165.
59. The method of any one of claims 55 to 58, wherein the Mog. IIE is the glycosyltransferase product of a reaction of mogrol or Mog. IA or Mog. IE with a UGT enzyme comprising an amino acid sequence that has at least 70% identity to UGT85C2 (SEQ ID NO: 146), or a circular permutant thereof.
60. The method of claim 59, wherein the UGT enzyme comprises an amino acid sequence that is at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 98%, or at least 99% identical to SEQ ID NO: 146, or a circular permutant thereof.
61. The method of claim 59, wherein the UGT enzyme comprises an amino acid sequence having from 1 to 20 amino acid modifications with respect to SEQ ID NO: 146, the amino acid modifications being independently selected from amino acid substitutions, deletions, and insertions with respect to corresponding positions in SEQ ID NO: 146.
62. The method of any one of claims 52 to 61, wherein mogrol is reacted with:
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C1 (SEQ ID NO: 165), or a circular permutant thereof;
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Stevia rebaudiana UGT85C2 (SEQ ID NO: 146), or a circular permutant thereof;
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Coffea arabica UGT (SEQ ID NO: 164), or a circular permutant thereof; and
a UGT enzyme comprising an amino acid sequence that is at least 70% identical to Siraitia grosvenorii UGT (SEQ ID NO: 117), or a circular permutant thereof.
63. The method of any one of claims 52 to 62, further comprising recovering and/or purifying the mogrol glycoside.
64. The method of claim 63, wherein the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
65. The method of any one of claims 52 to 64, wherein the reaction is performed in a microbial cell, and UGT enzymes are recombinantly expressed in the cell.
66. The method of claim 65, wherein mogrol is produced in the cell by a heterologous mogrol synthesis pathway.
67. The method of claim 65, wherein mogrol or mogrol glycosides are fed to the cells for glycosylation.
68. The method of any one of claims 52 to 64, wherein the reaction is performed in vitro using purified UGT enzyme, partially purified UGT enzyme, or recombinant cell lysates.
69. The method of any one of claims 64 to 68, wherein the microbial host cell is prokaryotic or eukaryotic, and is optionally a bacterium selected from Escherichia coli, Bacillus subtilis, Corynebacterium glutamicum, Rhodobacter capsulatus, Rhodobacter sphaeroides, Zymomonas mobilis, Vibrio natriegens, or Pseudomonas putida, or is optionally a yeast selected from a species of Saccharomyces, Pichia, or Yarrowia, including Saccharomyces cerevisiae, Pichia pastoris, and Yarrowia lipolytica.
70. The method of claim 69, wherein the microbial host cell is E. coli.
71. The method of claim 69 or 70, wherein the mogrol glycoside products are recovered from the extracellular media.
72. A method for making a product comprising a mogrol glycoside, comprising: producing a mogrol glycoside in accordance with any one of claims 1 to 71, and incorporating the mogrol glycoside into a product.
73. The method of claim 72, wherein the mogrol glycoside is Mog. V, Mog. VI, or Isomog. V.
74. The method of claim 72 or 73, wherein the product is a sweetener composition, flavoring composition, food, beverage, chewing gum, texturant, pharmaceutical composition, tobacco product, nutraceutical composition, or oral hygiene composition.
75. The method of any one of claims 72 to 74, wherein the product further comprises one or more of a steviol glycoside, aspartame, and neotame.
76. The method of claim 75, wherein the steviol glycoside comprises one or more of RebM, RebB, RebD, RebA, RebE, and RebI.
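
The claims above repeatedly rely on three sequence relationships: a circular permutant of a reference sequence, a percent identity threshold (for example, at least 70%), and a count of 1 to 20 amino acid substitutions, deletions, and/or insertions. The following Python sketch illustrates one way each could be computed. It is an illustration only and is not taken from the claims: the function names, the optional linker argument, and the toy peptide strings are hypothetical; identity is assumed to be counted over the columns of an alignment that has already been computed by some external tool; and the modification count is assumed to be the minimal edit (Levenshtein) distance.

```python
def circular_permutant(seq: str, cut: int, linker: str = "") -> str:
    """Join the original C- and N-termini (optionally through a linker)
    and open new termini at 0-based position `cut`."""
    if not 0 < cut < len(seq):
        raise ValueError("cut must fall strictly inside the sequence")
    return seq[cut:] + linker + seq[:cut]


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a precomputed pairwise alignment.

    Assumption: columns where both sequences have a gap are ignored, and a
    gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def modification_count(reference: str, variant: str) -> int:
    """Minimal number of substitutions, deletions, and insertions that turn
    `reference` into `variant` (Levenshtein distance, row-by-row DP)."""
    prev = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        cur = [i]
        for j, var_res in enumerate(variant, start=1):
            cur.append(min(prev[j] + 1,                          # deletion
                           cur[j - 1] + 1,                       # insertion
                           prev[j - 1] + (ref_res != var_res)))  # substitution
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    toy = "MKTAYIAK"  # hypothetical peptide, not a sequence from the listing
    print(circular_permutant(toy, 3))            # AYIAKMKT
    print(percent_identity(toy, "MKTAYLAK"))     # 87.5
    print(modification_count(toy, "MKAYIAKG"))   # 2
```

In practice the alignment passed to `percent_identity` would come from a dedicated alignment program, and the resulting percentage depends on that program's scoring parameters, so the numbers from this sketch are only indicative.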
PCT/US2019/019886 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides Ceased WO2019169027A2 (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
US16/971,740 US12351849B2 (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides
EP19760267.5A EP3759230A4 (en) 2018-02-27 2019-02-27 MICROBIAL PRODUCTION OF TRITERPENOIDS WITH MOGROSIDES
CN201980028158.0A CN112041457A (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids, including mogrosides
BR112020017490-4A BR112020017490A2 (en) 2018-02-27 2019-02-27 MICROBIAL PRODUCTION OF TRITERPENOIDS INCLUDING MOGROSIDS
JP2020544815A JP7382946B2 (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides
KR1020207027034A KR20200136911A (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides
MX2020008922A MX2020008922A (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides.
JP2023189958A JP2024020310A (en) 2018-02-27 2023-11-07 Microbial production of triterpenoids including mogrosides

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862635751P 2018-02-27 2018-02-27
US62/635,751 2018-02-27

Publications (2)

Publication Number Publication Date
WO2019169027A2 true WO2019169027A2 (en) 2019-09-06
WO2019169027A3 WO2019169027A3 (en) 2019-10-03

Family

ID=66542535

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/019886 Ceased WO2019169027A2 (en) 2018-02-27 2019-02-27 Microbial production of triterpenoids including mogrosides

Country Status (8)

Country Link
US (1) US12351849B2 (en)
EP (1) EP3759230A4 (en)
JP (2) JP7382946B2 (en)
KR (1) KR20200136911A (en)
CN (1) CN112041457A (en)
BR (1) BR112020017490A2 (en)
MX (1) MX2020008922A (en)
WO (1) WO2019169027A2 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201805576D0 (en) * 2018-04-04 2018-05-16 Optibiotix Ltd Sweeteners and methods of production thereof
CN112063647B (en) * 2020-09-17 2023-05-02 云南农业大学 Construction method of saccharomyces cerevisiae recombinant Cuol01, saccharomyces cerevisiae recombinant Cuol02 and application
CN112877355A (en) * 2021-01-22 2021-06-01 杜云龙 Method for expressing notoginsenoside by using tobacco
CN114107332B (en) * 2022-01-27 2022-12-06 中国中医科学院中药研究所 Co-expressed nucleic acids and uses thereof
CN114774503B (en) * 2022-06-20 2022-10-14 中国中医科学院中药研究所 Squalene epoxidase and its encoding gene and application
WO2025101625A1 (en) * 2023-11-06 2025-05-15 Manus Bio Inc. Enzymes, host cells, and methods for producing mogrosides
WO2025209017A1 (en) * 2024-04-02 2025-10-09 四川盈嘉合生科技有限公司 Engineered yeast capable of producing mogrol, siamenoside i, mogroside iiix, mogroside iva, and/or mogroside v and use thereof
CN121825910A (en) * 2024-12-18 2026-04-10 苏州一兮生物技术有限公司 A squalene cyclooxygenase mutant and its application
CN119753063A (en) * 2024-12-26 2025-04-04 诸城市浩天药业有限公司 Extraction method of mogroside V
CN119751616B (en) * 2025-01-02 2025-10-28 南京农业大学 BnACO1 gene, protein and application thereof in regulation of brassica napus plant height
CN121674622B (en) * 2026-02-05 2026-04-21 湖北省农业科学院经济作物研究所 Method for identifying flooding resistance of plants

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8367395B2 (en) * 2006-09-28 2013-02-05 Dsm Ip Assets B.V. Production of sterols in oleaginous yeast and fungi
WO2013061341A1 (en) * 2011-10-28 2013-05-02 Sujoy Kumar Guha An improved intra-uterine contraceptive device
CN104017797B (en) * 2014-06-04 2016-03-16 中国医学科学院药用植物研究所 Mutant of a kind of Grosvenor Momordica SgCAS gene and uses thereof
SG11201701278RA (en) * 2014-08-21 2017-03-30 Manus Biosynthesis Inc Methods for production of oxygenated terpenes

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7667017B2 (en) 2001-12-06 2010-02-23 The Regents Of The University Of California Isolated mevalonate pathway enzyme nucleic acids
US20110236927A1 (en) 2008-08-27 2011-09-29 Massachusetts Institute Of Technology Genetically stabilized tandem gene duplication
US8512988B2 (en) 2009-11-10 2013-08-20 Massachusetts Institute Of Technology Microbial engineering for the production of chemical and pharmaceutical products from the isoprenoid pathway
US8927241B2 (en) 2009-11-10 2015-01-06 Massachusetts Institute Of Technology Microbial engineering for the production of chemical and pharmaceutical products from the isoprenoid pathway
US20120246767A1 (en) 2010-10-29 2012-09-27 Jean Davin Amick Modified valencene synthase polypeptides, encoding nucleic acid molecules and uses thereof
US20150322473A1 (en) 2012-12-04 2015-11-12 Evolva Sa Methods and materials for Biosynthesis of Mogroside Compounds
WO2016038617A1 (en) 2014-09-11 2016-03-17 The State Of Israel, Ministry Of Agriculture & Rural Development, Agricultural Research Organization (Aro) (Volcani Center) Methods of producing mogrosides and compositions comprising same and uses thereof
US20170283844A1 (en) 2014-09-11 2017-10-05 The State of Israel, Ministry of Agriculture & Rural Development, Argricultural Research Organiza Methods of producing mogrosides and compositions comprising same and uses thereof
US20170332673A1 (en) 2014-11-05 2017-11-23 Manus Biosynthesis, Inc. Microbial production of steviol glycosides
US20180251738A1 (en) 2015-08-21 2018-09-06 Manus Bio, Inc. Increasing productivity of e. coli host cells that functionally express p450 enzymes
US20180216137A1 (en) 2017-01-26 2018-08-02 Manus Bio, Inc. Metabolic engineering for microbial production of terpenoid products
US20180245103A1 (en) 2017-02-03 2018-08-30 Manus Bio, Inc. Metabolic engineering for microbial production of terpenoid products

Non-Patent Citations (16)

* Cited by examiner, † Cited by third party
Title
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402
AMINFAR, TOHIDFAR: "In silico analysis of squalene synthase in Fabaceae family using bioinformatics tools", J. GENETIC ENGINEER. AND BIOTECH., vol. 16, 2018, pages 739 - 747
BRUDNO M., BIOINFORMATICS, vol. 19, 2003, pages 154 - 162
GHIMIRE ET AL., APPL. ENVIRON. MICROBIOL., vol. 75, no. 22, 2009, pages 7291 - 7293
ITKIN M. ET AL.: "The biosynthetic pathway of the nonsugar, high-intensity sweetener mogroside V from Siraitia grosvenorii", PNAS, vol. 113, no. 47, 2016, pages E7619 - E7628, XP055578320, DOI: 10.1073/pnas.1604828113
JAKINOVICH ET AL., JOURNAL OF NATURAL PRODUCTS, 1990
KARLIN, ALTSCHUL, PROC. NATL. ACAD. SCI. USA, vol. 90, 1993, pages 5873 - 5877
KASAI ET AL., AGRIC BIOL CHEM, 1989
LI ET AL., CHIN J NAT MED, 2014
PADYANA AK ET AL.: "Structure and inhibition mechanism of the catalytic domain of human squalene epoxidase", NAT. COMM., vol. 10, no. 97, 2019, pages 1 - 10
RUCKENSTUHL ET AL.: "Structure-Function Correlations of Two Highly Conserved Motifs in Saccharomyces cerevisiae Squalene Epoxidase", ANTIMICROB. AGENTS AND CHEMO., vol. 52, no. 4, 2008, pages 1496 - 1499
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
See also references of EP3759230A4
THOMPSON, J. D., HIGGINS, D. G., GIBSON, T. J., NUCLEIC ACIDS RES., vol. 22, 1994, pages 4673 - 80

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12428662B2 (en) 2017-05-03 2025-09-30 Firmenich Incorporated Methods for making high intensity sweeteners
US12338476B2 (en) 2018-11-07 2025-06-24 Firmenich Incorporated Methods for making high intensity sweeteners
US12286661B2 (en) 2018-11-07 2025-04-29 Firmenich Incorporated Methods for making high intensity sweeteners
US12234464B2 (en) 2018-11-09 2025-02-25 Ginkgo Bioworks, Inc. Biosynthesis of mogrosides
CN110592040A (en) * 2019-09-18 2019-12-20 江苏施宇甜生物科技有限公司 Process for producing recombinant bacillus subtilis of UPD (ultra-high Performance) glycosyltransferase
JP2023506242A (en) * 2019-12-16 2023-02-15 マナス バイオ インコーポレイテッド Microbial production of mogrol and mogrosides
EP4077649A4 (en) * 2019-12-16 2024-07-10 Manus Bio Inc. Microbial production of mogrol and mogrosides
WO2021126960A1 (en) * 2019-12-16 2021-06-24 Manus Bio, Inc. Microbial production of mogrol and mogrosides
US12480146B2 (en) 2019-12-16 2025-11-25 Manus Bio Inc. Microbial production of mogrol and mogrosides
JP7785675B2 (en) 2019-12-16 2025-12-15 マナス バイオ インコーポレイテッド Microbial production of mogrol and mogroside
CN115605081A (en) * 2020-03-17 2023-01-13 可口可乐公司(Us) Novel mogroside production system and method
WO2021188703A1 (en) * 2020-03-17 2021-09-23 The Coca-Cola Company Novel mogroside production system and methods
WO2021231728A1 (en) * 2020-05-13 2021-11-18 Ginkgo Bioworks, Inc. Biosynthesis of mogrosides
CN111518817B (en) * 2020-05-14 2022-08-23 云南农业大学 Hemsleya amabilis triterpene synthetase HcOSC6 gene, engineering bacterium thereof and application thereof in preparation of cucurbitadienol
CN111518817A (en) * 2020-05-14 2020-08-11 云南农业大学 Hemsleya amabilis triterpene synthetase HcOSC6 gene, engineering bacteria thereof and application thereof in preparation of cucurbitadienol
WO2022099123A1 (en) * 2020-11-06 2022-05-12 The Medical College Of Wisconsin, Inc. Peptide inhibitors of human mitochondrial fission protein 1 and methods of use
WO2023278976A3 (en) * 2021-06-29 2023-03-02 Firmenich Incorporated Methods for making high intensity sweeteners

Also Published As

Publication number Publication date
JP2021513867A (en) 2021-06-03
JP7382946B2 (en) 2023-11-17
EP3759230A2 (en) 2021-01-06
EP3759230A4 (en) 2022-05-25
WO2019169027A3 (en) 2019-10-03
BR112020017490A2 (en) 2020-12-22
US20210032669A1 (en) 2021-02-04
MX2020008922A (en) 2021-01-08
US12351849B2 (en) 2025-07-08
JP2024020310A (en) 2024-02-14
CN112041457A (en) 2020-12-04
KR20200136911A (en) 2020-12-08

Similar Documents

Publication Publication Date Title
US12351849B2 (en) Microbial production of triterpenoids including mogrosides
US12478082B2 (en) Uridine diphosphate-dependent glycosyltransferase enzyme
JP7061145B2 (en) Improved production method of rebaudioside D and rebaudioside M
US12480146B2 (en) Microbial production of mogrol and mogrosides
US9284570B2 (en) Microbial production of natural sweeteners, diterpenoid steviol glycosides
AU2020302789B2 (en) Uridine diphosphate-dependent glycosyltransferase enzyme
US20240392340A1 (en) Enzymes, host cells, and methods for biosynthesis of dammarenediol and derivatives
WO2025101625A1 (en) Enzymes, host cells, and methods for producing mogrosides
BR112015018872B1 (en) METHOD FOR PRODUCING A STEVIOL GLYCOSIDE COMPOSITION, RECOMBINANT HOST CELL, METHODS FOR PRODUCING REBAUDIOSIDE M, CELL CULTURE, CELL CULTURE LYSATE AND REACTION MIX

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 19760267; Country of ref document: EP; Kind code of ref document: A2
ENP Entry into the national phase. Ref document number: 2020544815; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 20207027034; Country of ref document: KR; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 2019760267; Country of ref document: EP; Effective date: 20200928
REG Reference to national code. Ref country code: BR; Ref legal event code: B01A; Ref document number: 112020017490; Country of ref document: BR
ENP Entry into the national phase. Ref document number: 112020017490; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20200827
WWG Wipo information: grant in national office. Ref document number: 16971740; Country of ref document: US