WO2024125637A1 - UTRs for improving the translation efficiency and/or stability of RNA molecules, and applications thereof - Google Patents

UTRs for improving the translation efficiency and/or stability of RNA molecules, and applications thereof

Info

Publication number
WO2024125637A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
polynucleotide
sequence
variant
nos
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/139184
Other languages
English (en)
French (fr)
Inventor
黄慧
李林鲜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shenxin Biotechnology Co Ltd
Original Assignee
Shenzhen Shenxin Biotechnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shenxin Biotechnology Co Ltd
Priority to JP2025534775A (JP2026501176A)
Priority to KR1020257023528A (KR20250113527A)
Priority to EP23902830.1A (EP4636090A1)
Priority to AU2023396545A (AU2023396545A1)
Priority to CN202380086599.2A (CN120435560A)
Publication of WO2024125637A1
Legal status: Ceased (current)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67: General methods for enhancing the expression
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/88: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation using microencapsulation, e.g. using amphiphile liposome vesicle
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00: Medicinal preparations containing organic active ingredients
    • A61K31/70: Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088: Compounds having three or more nucleosides or nucleotides
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00: Antineoplastic agents
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/10: Plasmid DNA
    • C12N2800/106: Plasmid DNA for vertebrates
    • C12N2800/107: Plasmid DNA for vertebrates for mammalian
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00: Vector systems having a special element relevant for transcription
    • C12N2830/50: Vector systems having a special element relevant for transcription regulating RNA stability, not being an intron, e.g. poly A signal
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00: Vector systems having a special element relevant for transcription
    • C12N2830/80: Vector systems having a special element relevant for transcription from vertebrates
    • C12N2830/85: Vector systems having a special element relevant for transcription from vertebrates mammalian
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2840/00: Vectors comprising a special translation-regulating system
    • C12N2840/10: Vectors comprising a special translation-regulating system regulates levels of translation
    • C12N2840/105: Vectors comprising a special translation-regulating system regulates levels of translation enhancing translation

Definitions

  • the present invention belongs to the field of biotechnology and relates to a 5' and/or 3' untranslated region (UTR) for improving the translation efficiency and/or stability of RNA molecules, a nucleic acid molecule comprising the UTR, a vector comprising the UTR or the nucleic acid molecule, a pharmaceutical composition comprising the UTR, the nucleic acid molecule or the vector, and the application of the UTR, the nucleic acid molecule, the vector and the pharmaceutical composition.
  • UTR: 5' and/or 3' untranslated region
  • mRNA: messenger RNA
  • mRNA vaccines have been clinically used in the field of infectious disease prevention. Compared with recombinant protein subunit vaccines, inactivated vaccines or DNA vaccines, mRNA vaccines have several obvious advantages. First, since mRNA does not infect the body or integrate into genomic DNA, the safety of mRNA is greatly improved. Secondly, the use of modified bases can reduce the inherent immunogenicity of mRNA molecules and reduce their degradation by the body, further improving the safety and stability of mRNA vaccines. In addition, in vitro, the yield of mRNA synthesis is very high, so mRNA vaccines have the potential to be efficient, rapidly developed, low-cost manufactured, and easy to manage safely.
  • the untranslated region (UTR) of mRNA controls the translation, degradation and localization of the transcript; it contains regulatory elements including stem-loop structures, upstream start codons, upstream open reading frames, internal ribosome entry sites and various cis-acting elements that bind RNA-binding proteins.
  • UTRs play a crucial role in post-transcriptional regulation of gene expression, including regulation of mRNA nuclear export and translation efficiency, subcellular localization, and stability. UTRs may also play other roles, such as the specific incorporation of the modified amino acid selenocysteine at the UGA codon of mRNA encoding selenoproteins, a process mediated by a conserved stem-loop structure in the 3'-UTR.
  • the translation efficiency and/or stability of mRNA molecules directly affects the dosage and dosing interval of mRNA drugs (especially mRNA vaccines), which ultimately affects the bioavailability of mRNA drugs and determines their clinical application value.
  • the inventors have identified a variety of 5'-UTRs and/or 3'-UTRs that can improve the translation efficiency and/or stability of mRNA molecules.
  • the above UTRs are universal core elements that give mRNA molecules containing the UTRs enhanced translation efficiency, significantly improve the expression of target genes and/or mRNA stability, and have broad application value in the industrialization of mRNA drugs.
  • the present invention provides a recombinant RNA molecule comprising: (1) a first nucleotide sequence encoding a polypeptide and/or protein of interest; and (2) a second nucleotide sequence containing a 5'-untranslated region (5'-UTR); the 5'-UTR comprising at least one of the following polynucleotides: (a): a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a fragment of the 5'-UTR described in (a); (c): a variant of the 5'-UTR described in (a); and (d): a variant of the fragment described in (b).
  • the gene is a human gene.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f): a fragment of the 5'-UTR described in (e); (g): a variant of the 5'-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21; preferably, the sequence as shown in SEQ ID NO: 1 Variants of RNA encoded by the polynucleotides as shown in at least one of SEQ ID NOs: 1 to 21, fragments of RNA encoded by the polynucleotides as shown in at least one of
  • the second nucleotide sequence comprises at least one of: 5'-UTRs of at least two of the genes described in (a), fragments of 5'-UTRs of at least two of the genes described in (a), variants of 5'-UTRs of at least two of the genes described in (a), and variants of fragments of 5'-UTRs of at least two of the genes described in (a).
  • the second nucleotide sequence comprises at least one of: at least two copies of the 5'-UTR in (a), at least two copies of a fragment of the 5'-UTR in (a), at least two copies of a variant of the 5'-UTR in (a), and at least two copies of a variant of a fragment of the 5'-UTR in (a).
  • the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 3'-UTR and a poly(A) tail.
  • the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR and a poly(A) tail.
  • the 5'-cap structure includes at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1, and m7Gppp(m2'-O)N1.
  • the 3'-UTR comprises: i) a 3'-UTR derived from at least one of an albumin gene, an α-globin gene, a β-globin gene, a tyrosine hydroxylase gene, a lipoxygenase gene, and a collagen α gene; ii) a variant of the 3'-UTR in i); iii) at least one of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2, and GH1, a fragment thereof, a variant thereof, and a variant of a fragment thereof
  • the nucleotides constituting the poly(A) tail include at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail include at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in succession. In some embodiments, the nucleotides constituting the poly(A) tail include one or more nucleotides other than A nucleotides; alternatively, the nucleotides constituting the poly(A) include two or more nucleotides other than A nucleotides in succession.
  • the present invention also provides another recombinant RNA molecule, which comprises: (1) a first nucleotide sequence encoding a polypeptide and/or protein of interest; and (2) a second nucleotide sequence containing a 3'-untranslated region (3'-UTR); the 3'-UTR comprises at least one of the following polynucleotides: (a): a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; (b): a fragment of the 3'-UTR described in (a); (c): a variant of
  • the gene is a human gene.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): 3’-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, and PGLYRP1; (f): a fragment of the 3’-UTR described in (e); (g): a variant of the 3’-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48; preferably, the sequence is as shown in SEQ ID NOs: 22 to 48, fragments of RNA encoded by a polynucleotide whose sequence is as shown in at least one of SEQ ID NOs: 22 to 48, and variants of fragments of RNA encoded by a polynu
  • the second nucleotide sequence comprises at least one of: 3’-UTRs of at least two of the genes described in (a), fragments of 3’-UTRs of at least two of the genes described in (a), variants of 3’-UTRs of at least two of the genes described in (a), and variants of fragments of 3’-UTRs of at least two of the genes described in (a).
  • the second nucleotide sequence comprises at least one of: at least two copies of the 3’-UTR in (a), at least two copies of a fragment of the 3’-UTR in (a), at least two copies of a variant of the 3’-UTR in (a), and at least two copies of a variant of a fragment of the 3’-UTR in (a).
  • the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 5'-UTR and a poly(A) tail.
  • the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR and a poly(A) tail.
  • the 5'-cap structure includes at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1, or m7Gppp(m2'-O)N1.
  • the 5'-UTR comprises: i) at least one of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7, a fragment thereof, a variant thereof, and a variant of a fragment thereof; preferably, an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21,
  • the nucleotides constituting the poly(A) tail include at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail include at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in a row. In some embodiments, the nucleotides constituting the poly(A) tail include one or more nucleotides other than A nucleotides.
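As a rough illustration of how the elements named in the embodiments above (5'-cap, 5'-UTR, coding sequence, 3'-UTR, poly(A) tail) are ordered in a recombinant RNA molecule, the following Python sketch concatenates them from 5' to 3'. The class name, field names and every sequence are hypothetical placeholders chosen for illustration; none of them is a SEQ ID NO from this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecombinantRNA:
    """Ordered elements of a recombinant RNA (illustrative model only)."""
    cap: str            # cap structure kept as a label, e.g. "m7GpppG"
    utr5: str           # 5'-UTR sequence (RNA alphabet)
    cds: str            # coding sequence for the polypeptide of interest
    utr3: str           # 3'-UTR sequence (RNA alphabet)
    poly_a_length: int  # number of A nucleotides in the tail

    def sequence(self) -> str:
        """Return the templated sequence 5'->3'.

        The cap is excluded because it is a chemical structure
        rather than a templated nucleotide.
        """
        return self.utr5 + self.cds + self.utr3 + "A" * self.poly_a_length


# Placeholder sequences only; NOT taken from the sequence listing.
example = RecombinantRNA(
    cap="m7GpppG",
    utr5="GGGAAAUAAG",
    cds="AUGGCCUAA",
    utr3="UGAUAAUAG",
    poly_a_length=120,
)
print(len(example.sequence()))
```

Modelling the cap as a label rather than as sequence is a design choice of this sketch; a real construct description would track it separately from the transcribed nucleotides.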
  • the present invention provides a DNA molecule encoding the recombinant RNA molecule of the present invention.
  • the present invention provides a vector comprising the recombinant RNA molecule or DNA molecule of the present invention.
  • the present invention provides a host cell comprising the recombinant RNA molecule, DNA molecule or vector of the present invention.
  • the present invention provides a lipid nanoparticle comprising the recombinant RNA molecule of the present invention.
  • the present invention provides a pharmaceutical composition comprising the recombinant RNA molecule of the present invention, the DNA molecule of the present invention, the vector of the present invention, the host cell of the present invention or the lipid nanoparticle of the present invention, and a pharmaceutically acceptable carrier.
  • the present invention provides a vector comprising a first nucleotide sequence encoding a 5'-UTR and/or a second nucleotide sequence encoding a 3'-UTR, wherein:
  • the first nucleotide sequence comprises at least one of the following polynucleotides: (a): a polynucleotide encoding a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a polynucleotide encoding a fragment of the 5'-UTR described in (a); (c): a polynucleotide encoding a variant of the 5'-UTR described in (a); and (d): a polynucleotide encoding a variant of the fragment described in (b);
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a polynucleotide encoding a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (f): a polynucleotide encoding a fragment of the 3'-UTR described in (e); (g): a polynucleotide encoding a variant of the 3'-UTR described in (e); and (h): a polynucleotide encoding a variant of the fragment described in (f).
  • the gene is a human gene.
  • the first nucleotide sequence comprises a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB, or a variant thereof.
  • the second nucleotide sequence comprises a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, and PGLYRP1, or a variant thereof.
  • the vector comprises the first nucleotide sequence and the second nucleotide sequence.
  • the first nucleotide sequence comprises: i) a polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of a polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of a polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21, and a variant of a fragment of a polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21; preferably, the variant of the polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21, the variant of the polynucleotide sequence as shown in at least one of SEQ ID NOs: 1 to 21, the variant of the polynucleotide sequence as shown in Fragments of a polynucleotide as shown in at least one of SEQ ID NOs: 1 to 21 and variants of a fragment of a polyn
  • the second nucleotide sequence comprises: (1): a polynucleotide encoding a 3'-UTR derived from at least one of the albumin gene, α-globin gene, β-globin gene, tyrosine hydroxylase gene, lipoxygenase gene, and collagen α gene; (2): a polynucleotide encoding a variant of the 3'-UTR in (1); (3): at least one of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, a fragment of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, a variant of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48, and a variant of a fragment of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48
  • the vector further comprises a polynucleotide encoding a poly(A) tail.
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in a row.
  • the nucleotides constituting the poly(A) tail comprise one or more nucleotides other than A nucleotides.
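The poly(A) embodiments above distinguish between a tail that contains at least a given number of A nucleotides in total and one that contains that many A nucleotides in a row, possibly interrupted by other nucleotides. A minimal Python sketch of that distinction follows; the function names and the example tail (two runs of 60 A separated by an arbitrary linker) are assumptions made for illustration and do not come from the application.

```python
from typing import Tuple


def poly_a_stats(tail: str) -> Tuple[int, int]:
    """Return (total A count, longest run of consecutive A) for a tail."""
    tail = tail.upper()
    total = tail.count("A")
    longest = current = 0
    for base in tail:
        # Extend the current run on A, reset it on any other nucleotide.
        current = current + 1 if base == "A" else 0
        longest = max(longest, current)
    return total, longest


def meets_threshold(tail: str, minimum: int, consecutive: bool = False) -> bool:
    """Check a tail against a minimum A count, total or consecutive."""
    total, longest = poly_a_stats(tail)
    return (longest if consecutive else total) >= minimum


# Hypothetical segmented tail: 60 A, a 10-nt linker, then 60 A.
segmented_tail = "A" * 60 + "GCAUAUGACU" + "A" * 60
print(poly_a_stats(segmented_tail))
```

For this example tail, the total-count criterion of 120 is satisfied while the consecutive criterion of 120 is not, which is the case the "in a row" wording separates out.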
  • the present invention provides a use of a recombinant RNA molecule, a DNA molecule, a vector, a host cell, a lipid nanoparticle or a pharmaceutical composition of the present invention in the preparation of a drug; preferably, the drug is used for gene therapy, gene vaccination or protein replacement therapy.
  • the drug is a nucleic acid drug, wherein the nucleic acid includes at least one of the following: RNA, messenger RNA (mRNA), DNA, plasmid, ribosomal RNA (rRNA), single guide RNA (sgRNA) and Cas9 mRNA.
  • the drug is used for the treatment and/or prevention of a disease; preferably, the disease is selected from the group consisting of: rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renal vascular diseases, and metabolic diseases; preferably, the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer or prostate cancer; the genetic disease includes one or more of hemophilia, thalassemia, and Gaucher's disease.
  • Figure 1A shows a flow chart for the construction of plasmid D;
  • Figure 1B shows a schematic diagram of the construction and transformation of plasmid D;
  • Figure 2 shows a map of plasmid luciferase-pcDNA3;
  • Figure 3 shows a map of plasmid B;
  • Figure 4 shows a map of plasmid C;
  • Figure 5 shows a map of plasmid D;
  • Figure 6 shows the expression of luciferase after HEK293 cells were transfected with mRNA containing different 5'-UTRs in Example 5;
  • Figure 7 shows the expression of luciferase after HEK293 cells were transfected with mRNA containing different 3'-UTRs in Example 6;
  • Figure 8 shows the expression in mice of mRNAs containing the same 5'-UTR but different 3'-UTRs;
  • Figure 9 shows the expression in mice of mRNAs containing the same 3'-UTR but different 5'-UTRs.
  • the expressions “comprises,” “comprising,” “containing,” and “having” are open ended, meaning the inclusion of the listed elements, steps, or components but not the exclusion of other unlisted elements, steps, or components.
  • the expression “consisting of” excludes any element, step, or component not specified.
  • the expression “consisting essentially of” means that the scope is limited to the specified elements, steps, or components, plus optional elements, steps, or components that do not significantly affect the basic and novel properties of the claimed subject matter. It should be understood that the expressions “consisting essentially of” and “consisting of” are encompassed within the meaning of the expression “comprising.”
  • the numerical ranges described herein should be understood to include any and all subranges contained therein.
  • the range “1 to 10” should be understood to include not only the explicitly stated values of 1 and 10, but also any single value (e.g., 2, 3, 4, 5, 6, 7, 8, and 9) and subranges (e.g., 1 to 2, 1.5 to 2.5, 1 to 3, 1.5 to 3.5, 2.5 to 4, 3 to 4.5, etc.) within the range of 1 to 10.
  • This principle also applies to ranges with only one value as the minimum or maximum value.
  • “fragment” or “fragment of a nucleic acid” refers to a portion of a nucleic acid, for example a nucleic acid shortened at the 5' and/or 3' end. A fragment of a nucleic acid comprises at least 50%, 60%, 70% or 80% of the nucleotide residues of the nucleic acid, preferably at least 70% or 80%, and more preferably at least 90%, 95%, 96%, 97%, 98% or 99%. Generally, a fragment is a shorter portion of the full-length nucleic acid.
  • “variant nucleic acid” or “nucleic acid variant” refers to a nucleic acid in which at least one nucleotide differs from a reference (or “parent”) nucleic acid.
  • a variant nucleic acid includes single or multiple nucleotide deletions, additions, mutations and/or insertions, wherein: a deletion includes removing one or more nucleotides from a reference nucleic acid; an addition includes attaching one or more nucleotides (e.g., 1, 2, 3, 5, 10, 20, 30, 50 or more) to a reference nucleic acid.
  • the term “nucleic acid variant” used herein includes naturally occurring variants and engineered variants. Therefore, the “nucleic acid variant” defined herein can be derived from, isolated from, related to, based on or homologous to a reference nucleic acid sequence.
  • Nucleic acid variants optionally have at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity with the corresponding naturally occurring (wild type) nucleic acid or its homologues, fragments or derivatives; preferably at least 70%, more preferably at least 80%, even more preferably at least 85%, even more preferably at least 90%, most preferably at least 95% or even 97%.
  • nucleic acid molecules include degenerate nucleic acid sequences, wherein the degenerate nucleic acid sequences according to the present invention are nucleic acids that differ from the reference nucleic acid in the codon sequence due to the degeneracy of the genetic code.
  • the nucleic acid variant is a variant of a 5'-UTR, a variant of a 3'-UTR or a variant of a poly(A) tail.
  • the mutation introduced into the nucleic acid variant can prevent the nucleic acid variant from being recognized by nucleases and being cleaved, avoid binding to microRNA, or avoid generating complex secondary structures such as hairpin structures or G-quadruplexes.
  • the nucleic acid variant is a variant of the 3'-UTR derived from the ALB gene (GenBank accession number is NM_000477.7), and the variant mutates one "A” into a "C” based on the 3'-UTR of the ALB gene to avoid being recognized by nucleases and being cleaved.
  • % identity refers to the percentage of identical nucleotides or amino acids in the optimal alignment between the sequences to be compared, and the differences between the two sequences can be distributed over a local region (segment) or over the entire length of the sequences to be compared.
  • identity between the two sequences is determined after the optimal alignment of the segment or "comparison window".
  • the optimal alignment can be performed manually or with the aid of algorithms known in the art. Algorithms known in the art include, but are not limited to, the local homology algorithm described by Smith and Waterman, 1981, Adv. Appl. Math. 2, 482 and the alignment algorithm described by Needleman and Wunsch, 1970, J. Mol. Biol.
  • % identity or % homology can be obtained by determining the number of identical positions corresponding to the sequences to be compared, dividing this number by the number of positions compared (e.g., the number of positions in the reference sequence), and multiplying this result by 100 to obtain % homology.
  • the degree of homology is given for a region of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or about 100%. In some embodiments, the degree of homology is given for the entire length of the reference sequence.
  • Alignment to determine sequence identity can be performed using tools known in the art, preferably using optimal sequence alignment, for example, using Align, using standard settings, preferably EMBOSS::needle, Matrix:Blosum62, Gap Open 10.0, Gap Extend 0.5.
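The percent-identity calculation described above (identical positions divided by positions compared, multiplied by 100) can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by an external tool and padded with "-" gap characters to equal length; the function name, the optional `reference_length` parameter and the example sequences are hypothetical and not part of the application.

```python
from typing import Optional


def percent_identity(aligned_a: str, aligned_b: str,
                     reference_length: Optional[int] = None) -> float:
    """Percent identity of two pre-aligned, gap-padded sequences.

    The denominator is the number of aligned positions by default, or
    `reference_length` when the reference sequence length is preferred.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    positions = reference_length if reference_length is not None else len(aligned_a)
    if positions <= 0:
        raise ValueError("number of compared positions must be positive")
    # A position counts as identical only if both residues match and
    # neither is a gap.
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")
    return 100.0 * identical / positions


# Hypothetical alignment: one gap and one mismatch over 9 positions.
identity = percent_identity("ACGU-ACGU", "ACGUUACGA")
print(round(identity, 2))
```

Which denominator to use (alignment length, reference length, or a comparison window) changes the result, so the choice should be stated alongside any reported value.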
  • a fragment or variant of a specific nucleic acid or a nucleic acid having a specific degree of identity with a specific nucleic acid preferably has at least one functional property of the specific nucleic acid and is preferably functionally equivalent to the specific nucleic acid, e.g. a nucleic acid exhibiting properties that are identical or similar to those of the specific nucleic acid.
  • nucleotide includes ribonucleotide, deoxyribonucleotide, ribonucleotide derivative, and deoxyribonucleotide derivative.
  • ribonucleotide is a constituent substance of ribonucleic acid (RNA), composed of one molecule of base, one molecule of pentose and one molecule of phosphoric acid, which refers to a nucleotide with a hydroxyl group at the 2' position of the β-D-ribofuranosyl group
  • deoxyribonucleotide is a constituent substance of deoxyribonucleic acid (DNA), also composed of one molecule of base, one molecule of pentose and one molecule of phosphoric acid, which refers to a nucleotide in which the hydroxyl group at the 2' position of the β-D-ribofuranosyl group is replaced by hydrogen, and is the main chemical component of
  • Nucleotide is usually referred to by a single letter representing the base: “A” or “A nucleotide” refers to adenine deoxyribonucleotide or adenine ribonucleotide containing adenine, “C” or “C nucleotide” refers to cytosine deoxyribonucleotide or cytosine ribonucleotide containing cytosine, “G” or “G nucleotide” refers to guanine deoxyribonucleotide or guanine ribonucleotide containing guanine, “U” or “U nucleotide” refers to uracil ribonucleotide containing uracil, and “T” or “T nucleotide” refers to thymine deoxyribonucleotide containing thymine.
  • nucleic acid generally refers to a polymer containing deoxyribonucleotides (deoxyribonucleic acid, referred to as DNA) or a polymer containing ribonucleotides (ribonucleic acid, referred to as RNA) or any compound of a combination thereof.
  • nucleic acids herein also include derivatives of nucleic acids.
  • derivatives of nucleic acids includes chemical derivatization of nucleic acids on the bases, sugars or phosphates of nucleotides, as well as nucleic acids containing non-natural nucleotides and nucleotide analogs.
  • nucleic acids can be in the form of single-stranded or double-stranded linear or covalently closed circular molecules.
  • DNA-encoded RNA refers to the RNA corresponding to the DNA, that is, a polynucleotide after all the T nucleotides in the DNA are replaced by U nucleotides.
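  • The definition of DNA-encoded RNA above amounts to a single substitution, which can be sketched as follows. The function name `dna_to_rna` and the example sequence are hypothetical and used only for illustration.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA corresponding to a DNA sequence: every T is replaced by U."""
    return dna.upper().replace("T", "U")


# Toy DNA sequence for illustration; it is not a sequence from this document.
rna = dna_to_rna("ATGCTT")
```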
  • the polynucleotide may comprise one segment or multiple segments (nucleic acid fragments) (e.g., 1, 2, 3, 4, 5, 6, 7, 8 segments).
  • the polynucleotide may comprise a segment encoding a polypeptide of interest (e.g., a polypeptide and polypeptide antigen described herein).
  • the polynucleotide may comprise a segment encoding a polypeptide of interest and a regulatory segment (including but not limited to a segment for transcriptional regulation and translational regulation).
  • the regulatory segment comprises a polynucleotide corresponding to one or more of the following regulatory elements: a promoter, a 5' untranslated region (5'-UTR), a 3' untranslated region (3'-UTR), and a poly (A) tail.
  • promoter refers to a polynucleotide located upstream of the 5' end of the coding region of a gene, which contains a conserved sequence required for specific binding of RNA polymerase and transcription initiation, can activate RNA polymerase, enable RNA polymerase to accurately bind to template DNA and have the specificity of transcription initiation. Promoters can be derived from viruses, bacteria, fungi, plants, insects and animals.
  • promoters include bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter, and CMV IE promoter.
  • the term "5' untranslated region" or "5'-UTR” can be an RNA sequence in mRNA that is located upstream of the coding sequence and is not translated into protein. The 5'-UTR in a gene usually starts from the transcription start site and ends at the nucleotide upstream of the translation start codon of the coding sequence.
  • the 5'-UTR may contain elements that control gene expression, such as ribosome binding sites, 5'-terminal oligopyrimidine tracts, and translation initiation signals such as Kozak sequences.
  • mRNA can be post-transcriptionally modified by adding a 5' cap. Therefore, the 5'-UTR in mature mRNA can also refer to the RNA sequence between the 5' cap and the start codon.
  • the term "3' untranslated region" or "3'-UTR" may be located downstream of the coding sequence in mRNA and is not translated into protein.
  • the 3'-UTR in mRNA is located between the stop codon and the poly (A) sequence of the coding sequence, for example, starting from the nucleotides downstream of the stop codon and ending with the nucleotides upstream of the poly (A) sequence.
  • 5’ or 3’-UTR derived from gene A refers to 5’ or 3’-UTR of mRNA from gene A.
  • the 5’ or 3’-UTR derived from gene A may be the entire 5’ or 3’-UTR of mRNA of gene A, or may be a partial 5’ or 3’-UTR of mRNA of gene A.
  • As used herein, the terms "poly (A) acid", "poly (A) sequence" and "poly (A) tail" are used interchangeably, and the naturally occurring poly (A) sequence is usually composed of adenine ribonucleotides.
  • modified poly (A) sequence refers to a poly (A) sequence comprising nucleotides or nucleotide segments other than adenine ribonucleotides.
  • the poly (A) sequence is usually located at the 3' end of the mRNA, such as the 3' end (downstream) of the 3'-UTR.
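  • The relative positions of the elements defined above (5'-UTR upstream of the coding sequence, 3'-UTR between the stop codon and the poly (A) sequence, poly (A) sequence at the 3' end) can be illustrated with a small Python sketch. The class `MRNADesign`, its field names, the default tail length of 100 and the toy sequences are all assumptions made for illustration; the 5' cap is not modeled because it is not part of the linear nucleotide string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MRNADesign:
    """Hypothetical container for the linear segments of an mRNA (5' to 3')."""
    five_utr: str
    coding_sequence: str   # start codon through stop codon
    three_utr: str
    poly_a_length: int = 100  # assumed default, not a value from the document

    def sequence(self) -> str:
        # Order: 5'-UTR, coding sequence, 3'-UTR, poly(A) tail.
        return (self.five_utr + self.coding_sequence
                + self.three_utr + "A" * self.poly_a_length)

    def three_utr_span(self) -> tuple[int, int]:
        # Half-open [start, end) coordinates of the 3'-UTR in the full sequence:
        # it starts right after the stop codon and ends before the poly(A) tail.
        start = len(self.five_utr) + len(self.coding_sequence)
        return start, start + len(self.three_utr)


# Toy segments for illustration only.
design = MRNADesign(five_utr="GGGAAA", coding_sequence="AUGGCCUAA",
                    three_utr="CCCUUU", poly_a_length=5)
full = design.sequence()
```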
  • the term "5'-cap structure": the 5'-cap structure is usually located at the 5' end of the mature mRNA. In some embodiments, the 5'-cap structure is connected to the 5'-end of the mRNA through a 5'-5'-triphosphate bond.
  • the 5'-cap structure is usually formed by a modified (e.g., methylated) ribonucleotide (especially a guanine nucleotide derivative).
  • m7GpppN (cap 0 or "cap0”, is a cap structure formed by the 5' phosphate group of hnRNA reacting with the 5'-phosphate group of m7GTP under the action of guanylyl transferase to form a 5', 5'-phosphodiester bond), wherein N is the terminal 5' nucleotide of the nucleic acid carrying the 5'-cap structure.
  • the 5'-cap structure includes but is not limited to cap 0, cap 1 (a cap structure formed by further methylating the first nucleotide sugar group 2'-OH of hnRNA on the basis of cap 0, or "cap1”), cap 2 (a cap structure formed by further methylating the second nucleotide sugar group 2'-OH of hnRNA on the basis of cap 1, or "cap2”), cap 4, cap 0 analogs, cap 1 analogs, cap 2 analogs, or cap 4 analogs.
  • the term "expression” includes transcription and/or translation of a nucleotide sequence. Thus, expression may involve the production of transcripts and/or polypeptides.
  • transcription refers to the process of transcribing the genetic code in a DNA sequence into RNA (transcript).
  • in vitro transcription refers to the in vitro synthesis of RNA, particularly mRNA, in a cell-free system (e.g., in an appropriate cell extract) (see, e.g., Pardi N., Muramatsu H., Weissman D., Karikó K. (2013). In: Rabinovich P. (eds) Synthetic Messenger RNA and Cell Metabolism Modulation.
  • a vector that can be used to produce a transcript is also referred to as a "transcription vector,” which contains regulatory sequences required for transcription.
  • transcription encompasses "in vitro transcription.”
  • polypeptide refers to a polymer comprising two or more amino acids covalently linked by peptide bonds.
  • a “protein” may comprise one or more polypeptides, wherein the polypeptides interact with each other by covalent or non-covalent means.
  • the term "host cell” refers to a cell for receiving, maintaining, replicating, expressing a polynucleotide or a vector.
  • the term "host cell" includes prokaryotic cells (e.g., Escherichia coli) or eukaryotic cells (e.g., yeast cells and insect cells), for example, cells from humans, mice, hamsters, pigs, goats, and primates.
  • the cell can be derived from a variety of tissue types and includes primary cells and cell lines. Some specific examples include keratinocytes, peripheral blood leukocytes, bone marrow stem cells, and embryonic stem cells.
  • the host cell is an antigen presenting cell, particularly a dendritic cell, a monocyte, or a macrophage.
  • the nucleic acid can be present in a host cell in a single copy or in several copies.
  • the host cell can be a cell expressing a polypeptide of the present invention therein.
  • recombinant or “recombinant” means "produced by genetic engineering”.
  • "recombinant material” such as recombinant RNA molecules, is non-naturally occurring.
  • naturally occurring or “naturally occurring” as used herein refers to the fact that a material can be found in nature. For example, a peptide or nucleic acid that is present in an organism (including a virus) and can be isolated from a source in nature and has not been intentionally modified by man in an experiment is naturally occurring.
  • the term "plasmid" generally refers to a circular DNA molecule, but the term can also encompass linearized DNA molecules. Specifically, the term "plasmid" also encompasses molecules obtained by linearizing a circular plasmid, for example by digesting it with a restriction enzyme, thereby converting the circular plasmid molecule into a linear molecule. Plasmids can replicate independently of the genetic information stored as chromosomal DNA in a cell, and can be used for cloning, i.e., for amplifying genetic information in bacterial cells.
  • the DNA plasmid is a medium copy or high copy plasmid, more preferably a high copy plasmid.
  • high copy plasmids include, for example, pUC and pTZ plasmids or any other plasmids (e.g., pMB1, pColE1) comprising a replication origin that supports high copies of the plasmid.
  • treatment and the like are used herein to generally mean obtaining a desired pharmacological and/or physiological effect.
  • treatment according to the invention may relate to the treatment of a disease state, but may also relate to prophylactic treatment in terms of complete or partial prevention of a disease or its symptoms.
  • treatment is to be understood as being therapeutic in terms of partial or complete cure of a disease and/or adverse effects and/or symptoms attributable to the disease.
  • Treatment may also be prophylactic or preventive treatment, i.e. measures taken to prevent a disease, for example, to prevent the onset of infection and/or disease.
  • the 5'-UTR of the recombinant nucleic acid molecule includes a polynucleotide with a sequence as shown in SEQ ID NO:1
  • the 3'-UTR of the recombinant nucleic acid molecule includes a polynucleotide with a sequence as shown in SEQ ID NO:22
  • the following scheme is also an embodiment of the present invention: the 5'-UTR of the recombinant nucleic acid molecule includes a polynucleotide with a sequence as shown in SEQ ID NO:1
  • the 3'-UTR of the recombinant nucleic acid molecule includes a polynucleotide with a sequence as shown in SEQ ID NO:22.
  • the inventors unexpectedly discovered that incorporating the 5'-UTR of gene PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 or CDK7 into mRNA can improve the translation efficiency of the coding sequence.
  • PPIA peptidylprolyl isomerase A
  • HPX hemopexin
  • CDK5RAP3 CDK5 regulatory subunit associated protein 3;
  • HSPA8 heat shock protein family A (Hsp70)member 8;
  • HBA1 hemoglobin subunit alpha 1
  • HBB hemoglobin subunit beta
  • MYSM1 Myb like, SWIRM and MPN domains 1;
  • LENG1 leukocyte receptor cluster member 1
  • TMSB4X thymosin beta 4 X-linked
  • CASP4 caspase 4
  • IFNA1 interferon alpha 1
  • PGLYRP1 peptidoglycan recognition protein 1;
  • UCHL1 ubiquitin C-terminal hydrolase L1;
  • TTR transthyretin
  • APOA2 apolipoprotein A2
  • DTYMK deoxythymidylate kinase
  • APOC2 apolipoprotein C2
  • CDK7 cyclin-dependent kinase 7.
  • the present invention provides a 5'-UTR comprising at least one selected from the following polynucleotides: (a): a 5'-UTR derived from at least one gene among PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a fragment of the 5'-UTR described in (a); (c): a variant of the 5'-UTR described in (a); and (d): a variant of the fragment described in (b).
  • the gene is a eukaryotic gene.
  • the gene is a chordate gene.
  • the gene is a vertebrate gene.
  • the gene is a mammalian gene.
  • the gene is a primate gene.
  • each of the genes is independently a gene of humans (Homo sapiens), a gene of bonobos (Pan paniscus), a gene of chimpanzees (Pan troglodytes), or a gene of western lowland gorillas (Gorilla gorilla gorilla).
  • the gene is a human gene.
  • the gene is human PPIA gene, human HPX gene, bonobo HPX gene, human FTCD gene, human CDK5RAP3 gene, western lowland gorilla CDK5RAP3 gene, human HSPA8 gene, human HBA1 gene, human HBB gene, human MYSM1 gene, bonobo MYSM1 gene, human LENG1 gene, human TMSB4X gene, human CASP4 gene, human IFNA1 gene, human PGLYRP1 gene, western lowland gorilla PGLYRP1 gene, human UCHL1 gene, human CPAMD8 gene, chimpanzee CPAMD8 gene, human TTR gene, western lowland gorilla TTR gene, human APOA2 gene, human GH1 gene, human DTYMK gene, bonobo DTYMK gene, human APOC2 gene, chimpanzee APOC2 gene, and human CDK7 gene.
  • the GenBank accession number of the PPIA gene is BC137057.1; the GenBank accession number of the HPX gene is AH002827.2; the GenBank accession number of the FTCD gene is NM_006657.3; the GenBank accession number of the CDK5RAP3 gene is AK223387.1; the GenBank accession number of the HSPA8 gene is NM_006597.6; the GenBank accession number of the HBA1 gene is NM_000558.5; the GenBank accession number of the HBB gene is NM_000518.5; the GenBank accession number of the MYSM1 gene is NM_001085487.2; the GenBank accession number of the LENG1 gene is NM_024316.3; the GenBank accession number of the TMSB4X gene is NM_021109.4; the GenBank accession number of the CASP4 gene is NM_001225.3; the GenBank accession number of the IFNA1 gene is NM_02
  • the polynucleotides encoding the 5'-UTR of human genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7 are as shown in Table 1.
  • the 5'-UTR comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to an RNA encoded by a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • a fragment, variant, or variant of a fragment of A is an abbreviation for a fragment of A, a variant of A, or a variant of a fragment of A.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21 refers to a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, or a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21.
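  • One way to quantify how many nucleotides are inserted, added, deleted or substituted in a variant relative to a reference RNA is the minimal edit (Levenshtein) distance. The sketch below is an illustrative assumption about how such a count could be made, not a method stated in this document; the function name `edit_distance` and the toy sequences are hypothetical.

```python
def edit_distance(reference: str, candidate: str) -> int:
    """Minimal number of single-nucleotide insertions, deletions or
    substitutions that turn `reference` into `candidate`."""
    prev = list(range(len(candidate) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, c in enumerate(candidate, start=1):
            cost = 0 if r == c else 1
            curr.append(min(prev[j] + 1,          # deletion from reference
                            curr[j - 1] + 1,      # insertion into reference
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# Toy sequences for illustration only.
substitution = edit_distance("AUGGCC", "AUGGCA")
deletion = edit_distance("AUGGCC", "AUGCC")
addition = edit_distance("AUGGCC", "AUGGCCA")
```

  • The edit distance is a lower bound on the number of changes: a variant produced by more changes than strictly necessary would still be reported with the minimal count.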
  • the gene is selected from at least one of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the 5'-UTR comprises at least one of the following polynucleotides: 1): RNA encoded by a polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2): a fragment of the RNA in 1); 3): a variant of the RNA in 1); and 4): a variant of the fragment in 2).
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the 5'-UTR of the present invention may comprise two or more tandem 5'-UTRs from the above-mentioned genes, fragments of the 5'-UTRs of the above-mentioned genes, variants of the 5'-UTRs of the above-mentioned genes, or variants of the fragments of the 5'-UTRs of the above-mentioned genes.
  • the 5'-UTR comprises at least one of the following polynucleotides: (a): 5'-UTRs derived from at least two genes of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): fragments of the 5'-UTRs of at least two of the genes in (a); (c): variants of the 5'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the 5'-UTR comprises at least one of the following polynucleotides: (e): a 5'-UTR derived from at least two genes among genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f): a fragment of the 5'-UTR described in (e); (g): a variant of the 5'-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the 5'-UTR comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21.
  • the sequence is one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide shown in one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide shown in one of SEQ ID NOs: 1 to 21.
  • the 5'-UTR includes at least one of the following polynucleotides: RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a fragment of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a variant of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, and a variant of a fragment of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the 5'-UTR comprises at least one of the following polynucleotides: a): at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7; b): at least two copies of a fragment of a 5'-UTR of at least one of the genes in a); c): at least two copies of a variant of a 5'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of a 5'-UTR of at least one of the genes in a).
  • the 5'-UTR comprises at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
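  • The tandem arrangements described above (two or more different UTR elements in series, or two to nine copies of one element) can be illustrated with simple string concatenation. This sketch assumes the elements are joined directly with no linker, which the text does not specify; the function names `tandem_utrs` and `tandem_copies` and the toy sequences are hypothetical.

```python
def tandem_utrs(utrs: list[str]) -> str:
    """Join two or more UTR elements 5' to 3' (assumes no linker between them)."""
    if len(utrs) < 2:
        raise ValueError("a tandem arrangement needs at least two elements")
    return "".join(utrs)


def tandem_copies(utr: str, copies: int) -> str:
    """Repeat a single UTR element; the text lists two to nine copies."""
    if not 2 <= copies <= 9:
        raise ValueError("copies must be between 2 and 9")
    return utr * copies


# Toy elements for illustration only; they are not sequences from this document.
mixed = tandem_utrs(["GGAC", "UUCA"])
repeated = tandem_copies("GGAC", 3)
```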
  • the 5'-UTR comprises at least one of the following polynucleotides: at least two copies of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a variant of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, and at least two copies of a variant of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21.
  • the 5'-UTR comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, at least two copies of an RNA fragment encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, at least two copies of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6.
  • the fragment, variant, or variant of the fragment of RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6 has an insertion, addition, deletion, or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared to the RNA encoded by the polynucleotide having a sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1, or 6.
  • the inventors unexpectedly discovered that incorporation of a 3'-UTR from gene MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 or NFKB2 into mRNA can increase the translation efficiency of the coding sequence.
  • MPND MPN domain containing
  • PGLYRP1 peptidoglycan recognition protein 1;
  • HPX hemopexin
  • CDK7 cyclin dependent kinase 7
  • APOC2 apolipoprotein C2
  • PFN1 profilin 1
  • RBP4 retinol binding protein 4
  • ALB albumin
  • GSDMD gasdermin D
  • ORM1 orosomucoid 1;
  • CASP4 caspase 4
  • CHMP2A charged multivesicular body protein 2A
  • LENG1 leukocyte receptor cluster member 1
  • MYCBPAP MYCBP associated protein
  • APOC1 apolipoprotein C1
  • GAPDH glyceraldehyde-3-phosphate dehydrogenase
  • HSPA8 heat shock protein family A (Hsp70)member 8;
  • APOA2 apolipoprotein A2
  • UCHL1 ubiquitin C-terminal hydrolase L1;
  • TSG101 tumor susceptibility 101
  • NFKB2 nuclear factor kappa B subunit 2.
  • the present invention provides a 3'-UTR comprising at least one selected from the following polynucleotides: (a): a 3'-UTR derived from at least one gene among MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b): a fragment of the 3'-UTR described in (a); (c): a variant of the 3'-UTR described in (a); and (d): a variant of the fragment described in (b).
  • the gene is a eukaryotic gene.
  • the gene is a chordate gene.
  • the gene is a vertebrate gene.
  • the gene is a mammalian gene.
  • the gene is a primate gene.
  • each of the genes is independently a gene of humans (Homo sapiens), a gene of bonobos (Pan paniscus), a gene of chimpanzees (Pan troglodytes), or a gene of western lowland gorillas (Gorilla gorilla gorilla).
  • the gene is a human gene.
  • the gene is at least one of a human MPND gene, a bonobo MPND gene, a human FBXW10 gene, a human FBXW12 gene, a bonobo FBXW12 gene, a human PGLYRP1 gene, a human HPX gene, a human CDK7 gene, a human APOC2 gene, a human PFN1 gene, a human RBP4 gene, a human FTCD gene, a human NAAA gene, a human ALB gene, a human GSDMD gene, a human FBXL8 gene, a bonobo FBXL8 gene, a western lowland gorilla FBXL8 gene, a human ORM1 gene, a human CASP4 gene, a human CHMP2A gene, a bonobo CHMP2A gene, a human LENG1 gene, a bonobo LENG1 gene, a human MYCBPAP gene, a human APOC1 gene,
  • the GenBank accession number of the MPND gene is NM_001300862.1; the GenBank accession number of the FBXW10 gene is NM_001267586.2; the GenBank accession number of the FBXW12 gene is NM_001159929.1; the GenBank accession number of the PGLYRP1 gene is NM_005091.3; the GenBank accession number of the HPX gene is AH002827.2; and the GenBank accession number of the CDK7 gene is AY130859.
  • the GenBank accession number of the APOC2 gene is NM_000483.5; the GenBank accession number of the PFN1 gene is NM_005022.4; the GenBank accession number of the RBP4 gene is NM_006744.4; the GenBank accession number of the FTCD gene is NM_006657.3; the GenBank accession number of the NAAA gene is NM_001363719.2; the GenBank accession number of the ALB gene is NM_000477.7; the GenBank accession number of the GSDMD gene is NM_024736.7; the GenBank accession number of the FBXL8 gene is NM_018378.2; the GenBank accession number of the ORM1 gene is NM_000607.4; the GenBank accession number of the CASP4 gene is NM_001225.3; the GenBank accession number of the CHMP2A gene is NM_198426.3; the GenBank accession number of the LENG1 gene is NM_024316.3; the GenBank accession number of MY
  • the polynucleotides encoding the 3'-UTRs of human genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2 are as shown in Table 2.
  • the 3'-UTR comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in at least one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 48.
  • the gene is selected from at least one of MPND, FBXW10, FBXW12 and PGLYRP1.
  • the 3'-UTR comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a variant of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, and a variant of a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotide insertions, additions, deletions or substitutions compared to the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the 3'-UTR sequence of the present invention may comprise two or more tandemly linked 3'-UTRs from the above-mentioned genes, fragments of the 3'-UTRs of the above-mentioned genes, variants of the 3'-UTRs of the above-mentioned genes, or variants of the 3'-UTR fragments of the above-mentioned genes.
  • the 3'-UTR comprises at least one of the following polynucleotides: (a): 3'-UTRs derived from at least two genes of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; (b): fragments of the 3'-UTRs of at least two of the genes in (a); (c): variants of the 3'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the 3’-UTR comprises at least one of the following polynucleotides: (e): a 3’-UTR derived from at least two genes of the genes MPND, FBXW10, FBXW12, and PGLYRP1; (f): a fragment of the 3’-UTR described in (e); (g): a variant of the 3’-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the 3'-UTR comprises at least one of the following polynucleotides: RNA encoded by at least two of the polynucleotides in SEQ ID NOs: 22 to 48, fragments of RNA encoded by at least two of the polynucleotides in SEQ ID NOs: 22 to 48, variants of RNA encoded by at least two of the polynucleotides in SEQ ID NOs: 22 to 48, and variants of fragments of RNA encoded by at least two of the polynucleotides in SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by one of the polynucleotides shown in SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by one of the polynucleotides shown in SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by one of the polynucleotides shown in SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by one of the polynucleotides shown in SEQ ID NOs: 22 to 48.
  • the 3’-UTR comprises at least one of the following polynucleotides: RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, variants of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, and variants of fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence SEQ ID NO: 24, 22, 23, or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by the polynucleotide of sequence SEQ ID NO: 24, 22, 23, or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence SEQ ID NO: 24, 22, 23, or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide of sequence SEQ ID NO: 24, 22, 23, or 25.
  • the 3'-UTR comprises at least one of the following polynucleotides: a): at least two copies of a 3'-UTR derived from one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; b): at least two copies of a fragment of the 3'-UTR of at least one of the genes in a); c): at least two copies of a variant of the 3'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of the 3'-UTR of at least one of the genes in a).
  • the 3'-UTR comprises at least one of the following polynucleotides: e): at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, and PGLYRP1; f): at least two copies of a fragment of a 3'-UTR derived from at least one of the genes in e); g): at least two copies of a variant of the 3'-UTR of at least one of the genes in e); and h): at least two copies of a variant of the fragment described in f).
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the 3'-UTR sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 48.
  • the 3'-UTR comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
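The "at least 1, 2, 3, ... nucleotides inserted, added, deleted or substituted" criterion used above can be illustrated by counting the minimum number of single-nucleotide edits between a variant and its reference. The text does not say how such differences are to be counted, so the sketch below simply assumes the classic Levenshtein distance; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimum number of single-nucleotide insertions, deletions or
    substitutions needed to turn `reference` into `variant` (Levenshtein)."""
    # previous[j] is the distance between the reference prefix processed
    # so far (one row earlier) and variant[:j].
    previous = list(range(len(variant) + 1))
    for i, ref_nt in enumerate(reference, start=1):
        current = [i]  # turning reference[:i] into an empty string costs i deletions
        for j, var_nt in enumerate(variant, start=1):
            cost = 0 if ref_nt == var_nt else 1
            current.append(min(
                previous[j] + 1,         # delete a reference nucleotide
                current[j - 1] + 1,      # insert a variant nucleotide
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


# Toy sequences (assumptions for illustration only).
reference_utr = "GCCACCAUGG"
variant_utr = "GCCUCCAUG"  # one A->U substitution plus one 3'-terminal deletion
print(edit_distance(reference_utr, variant_utr))
```

A variant could then be screened against any of the listed thresholds by comparing the returned count with the chosen number of edits.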
  • RNA molecules comprising the 5' and/or 3'-UTR of the present invention
  • the present invention provides a recombinant RNA molecule comprising a 5' and/or 3'-UTR identified in the present invention that improves the translation of a coding sequence.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest and a second nucleotide sequence containing a 5'-UTR, wherein the second nucleotide sequence comprises at least one selected from the following polynucleotides: (a): a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a fragment of the 5'-UTR described in (a); (c): a variant of the 5'-UTR described in (a); and (d): a variant of the fragment described in (b), wherein the first nucleotide sequence and the second nucleot
  • the gene is a human gene.
  • the first nucleotide sequence encodes at least one polypeptide of interest. For example, one, two, three, four, five, six, seven, eight, nine, or ten polypeptides of interest.
  • the first nucleotide sequence encodes at least one protein of interest. For example, one, two, three, four, five, six, seven, eight, nine, or ten proteins of interest.
  • the first nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest and one, two, three, four, five, six, seven, eight, nine or ten proteins of interest.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide with a sequence as shown in one of SEQ ID NOs: 1 to 21.
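The percent-identity thresholds above (40% up to 99%) presuppose some way of aligning a candidate sequence to its reference. The text does not name an alignment method or an identity definition, so the following is only a minimal sketch that assumes a global Needleman-Wunsch alignment with hypothetical unit scores and defines identity as identical columns divided by total alignment columns. Other tools use other denominators and may give different percentages.

```python
def percent_identity(reference: str, candidate: str,
                     match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a global alignment (assumed definition:
    identical columns / all alignment columns, gaps included)."""
    if not reference or not candidate:
        raise ValueError("both sequences must be non-empty")
    rows, cols = len(reference) + 1, len(candidate) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            same = reference[i - 1] == candidate[j - 1]
            diagonal = score[i - 1][j - 1] + (match if same else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace one optimal path back from the bottom-right corner,
    # counting alignment columns and identical columns.
    i, j, columns, identical = len(reference), len(candidate), 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = reference[i - 1] == candidate[j - 1]
            if score[i][j] == score[i - 1][j - 1] + (match if same else mismatch):
                identical += same
                i, j, columns = i - 1, j - 1, columns + 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in the candidate
        else:
            j -= 1  # gap in the reference
        columns += 1
    return 100.0 * identical / columns


# Toy sequences (not from the sequence listing): one substitution in ten positions.
print(percent_identity("AUGGCCAUUG", "AUGGCGAUUG"))
```

Under these assumptions, a candidate would satisfy, say, the 90% embodiment when the returned value is at least 90.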
  • the gene is selected from at least one of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: 1): RNA encoded by a polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2): a fragment of the RNA in 1); 3): a variant of the RNA in 1); and 4): a variant of the fragment in 2).
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the second nucleotide sequence may comprise two or more tandem 5'-UTRs from the above-mentioned gene, fragments of the 5'-UTR of the above-mentioned gene, variants of the 5'-UTR of the above-mentioned gene, or variants of the fragments of the 5'-UTR of the above-mentioned gene.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (a): 5'-UTRs derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): fragments of the 5'-UTRs of at least two of the genes in (a); (c): variants of the 5'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a 5'-UTR derived from at least two genes among genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f): a fragment of the 5'-UTR described in (e); (g): a variant of the 5'-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises RNA encoded by a polynucleotide as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide as shown in at least two of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide as shown in at least two of SEQ ID NOs: 1 to 21, or a variant of a fragment of an RNA encoded by a polynucleotide as shown in at least two of SEQ ID NOs: 1 to 21.
  • a fragment, a variant, or a variant of a fragment of an RNA encoded by a polynucleotide as shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with an RNA encoded by a polynucleotide as shown in one of SEQ ID NOs: 1 to 21.
  • a fragment, a variant, or a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a fragment of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a variant of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, and a variant of a fragment of RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6.
  • the fragment, variant or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide with a sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide with a sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7; b): at least two copies of a fragment of the 5'-UTR of at least one of the genes in a); c): at least two copies of a variant of the 5'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of the 5'-UTR of at least one of the genes in a).
  • the second nucleotide sequence comprises at least two copies of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7.
  • the 5'-UTR sequence comprises at least one of the following polynucleotides: a): at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB; b): at least two copies of a fragment of a 5'-UTR of at least one of the genes in a); c): at least two copies of a variant of a 5'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of a 5'-UTR of at least one of the genes in a).
  • the 5'-UTR sequence comprises at least two copies of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
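The tandem arrangements described above (several copies of one UTR, or UTRs from two or more different genes linked in series) can be pictured as plain string concatenation. The sketch below is a hypothetical illustration only: the function names and toy sequences are assumptions, and it joins the units directly, whereas the text does not say whether spacer nucleotides sit between tandem units.

```python
from typing import Sequence


def tandem_copies(unit: str, copies: int) -> str:
    """Head-to-tail repeat of a single UTR element, at least two copies."""
    if copies < 2:
        raise ValueError("a tandem repeat needs at least two copies")
    return unit * copies


def tandem_mixed(units: Sequence[str]) -> str:
    """Head-to-tail fusion of UTR elements taken from at least two sources."""
    if len(units) < 2:
        raise ValueError("at least two UTR elements are required")
    return "".join(units)


# Toy placeholders, not sequences from the sequence listing.
utr_a = "GGGAAAUAAG"
utr_b = "ACAUUUGCUU"
print(tandem_copies(utr_a, 3))
print(tandem_mixed([utr_a, utr_b]))
```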
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a variant of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, and at least two copies of a variant of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies, or more.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NO:9, 7, 18, 12, 8, 1 and 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6.
  • a fragment, a variant, or a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO:9, 7, 18, 12, 8, 1 and 6 has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared to the RNA encoded by the polynucleotide having a sequence as shown in SEQ ID NO:9, 7, 18, 12, 8, 1 or 6.
  • the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 3'-UTR, and a poly(A) tail. In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR, and a poly(A) tail.
  • the 5'-cap structure includes but is not limited to at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2'-O)N1.
  • m7G represents 7-methylguanosine cap nucleoside
  • ppp represents a triphosphate bond between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript
  • N1 is the 5'-most nucleotide
  • G represents guanosine nucleoside
  • m7 represents a methyl group at the 7-position of guanine
  • m2'-O represents a methyl group at the 2'-O position of the nucleotide.
  • the 3'-UTR comprises: i) a nucleotide sequence derived from the 3'-UTR of at least one of the albumin gene, the α-globin gene, the β-globin gene, the tyrosine hydroxylase gene, the lipoxygenase gene, and the collagen α gene, or a variant thereof; ii) a variant of the 3'-UTR in i); iii) at least one of the 3'-UTR, fragments, variants, and variants of fragments thereof, derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2,
  • the fragment, variant, or variant of the fragment of RNA encoded by the polynucleotides as shown in SEQ ID NO:22-49 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotides as shown in SEQ ID NO:22-49; iv) at least two copies of one of the polynucleotide sequences i), ii) or iii); or v) at least two polynucleotides in the group consisting of the polynucleotides in i) to iii).
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies, or nine copies.
  • the poly(A) tail comprises at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; in some embodiments, the poly(A) tail comprises at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the poly(A) tail includes one or more nucleotides other than A nucleotides.
  • the poly(A) tail includes two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides in the sequence having two or more consecutive nucleotides are nucleotides other than A nucleotides.
  • the poly(A) tail is truncated (interrupted), i.e., a run of m consecutive A nucleotides and a run of n consecutive A nucleotides are joined by a linker sequence consisting of p non-A nucleotides, wherein m, n and p are positive integers.
  • m is 30, n is 70, and p is 10.
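The interrupted poly(A) layout just described (m A nucleotides, a linker of p non-A nucleotides, then n A nucleotides) can be sketched as follows. The text gives only the lengths m = 30, n = 70 and p = 10, so the linker sequence used here is a made-up placeholder, and the function name is likewise hypothetical.

```python
def segmented_poly_a(m: int, n: int, linker: str) -> str:
    """Return m A's, then the linker, then n A's; the linker must contain
    only non-A RNA nucleotides (C, G, U)."""
    if m <= 0 or n <= 0 or not linker:
        raise ValueError("m, n and the linker length p must all be positive")
    linker = linker.upper()
    if set(linker) - set("CGU"):
        raise ValueError("linker must consist only of C, G and U")
    return "A" * m + linker + "A" * n


# Hypothetical 10-nt linker: only its length (p = 10) comes from the text.
example_linker = "GCUCUGCUCG"
tail = segmented_poly_a(30, 70, example_linker)
print(len(tail), tail.count("A"))
```

Because every linker nucleotide is non-A, the first and last linker positions are non-A as well, which matches the preceding embodiment about runs of consecutive non-A nucleotides.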
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest and a second nucleotide sequence comprising a 3'-UTR, wherein the second nucleotide sequence comprises at least one of the following polynucleotides: (a): a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b): a fragment of the 3'-UTR described in (a); (c): a variant of the 3'-UTR described in (a); and (d):
  • the gene is a human gene.
  • the first nucleotide sequence encodes at least one polypeptide of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest. In some embodiments, the first nucleotide sequence encodes at least one protein of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten proteins of interest. In some embodiments, the first nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest and one, two, three, four, five, six, seven, eight, nine or ten proteins of interest.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 22 to 48.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the gene is selected from at least one of MPND, FBXW10, FBXW12 and PGLYRP1.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a variant of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, and a variant of a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25 is at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide having a sequence as shown in SEQ ID NO: 24, 22, 23, or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to an RNA encoded by a polynucleotide having a sequence as shown in SEQ ID NO: 24, 22, 23, or 25.
  • the 3'-UTR in the second nucleotide sequence in the recombinant RNA molecule of the present invention may comprise two or more tandem 3'-UTRs from the above-mentioned gene, fragments of the 3'-UTR of the above-mentioned gene, variants of the 3'-UTR of the above-mentioned gene, or variants of fragments of the 3'-UTR of the above-mentioned gene.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (a): 3'-UTRs derived from at least two of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; (b): fragments of the 3'-UTRs of at least two of the genes in (a); (c): variants of the 3'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a 3’-UTR derived from at least two genes among the genes MPND, FBXW10, FBXW12 and PGLYRP1; (f): a fragment of the 3’-UTR described in (e); (g): a variant of the 3’-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 22 to 48, a fragment of an RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 22 to 48, a variant of an RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide sequence as shown in at least two of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide sequence as shown in one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of an RNA encoded by a polynucleotide whose sequence is shown as one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by a polynucleotide whose sequence is shown as one of SEQ ID NOs: 22 to 48.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, variants of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, and variants of fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; b): at least two copies of a fragment of the 3'-UTR of at least one of the genes in a); c): at least two copies of a variant of the 3'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of the 3'-UTR of at least one of the genes in a).
  • the second nucleotide sequence comprises at least two copies of a nucleotide sequence derived from the 3'-UTR of at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 48.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
  • the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 5'-UTR and a poly (A) tail.
  • the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 5'-UTR and a poly(A) tail.
  • the 5'-cap structure includes but is not limited to at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 or m7Gppp(m2'-O)N1.
  • m7G represents 7-methylguanosine cap nucleoside
  • ppp represents a triphosphate bond between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript
  • N1 is the 5'-most nucleotide
  • G represents guanosine nucleoside
  • m7 represents a methyl group at the 7-position of guanine
  • m2'-O represents a methyl group at the 2'-O position of the nucleotide.
  • the 5'-UTR comprises: i) a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7, a fragment, a variant or a variant of a fragment thereof; preferably, at least one of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, wherein the variants, fragments, and variants of fragments of the RNA encoded by the polynucleotides of at least one of SEQ ID NOs: 1 to 21 have at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% homology with the RNA encoded by the polynucleotides of at least one of SEQ ID NOs: 1 to 21; ii) at least two copies of at least one polynucleotide of i); or iii) at least two polynucleotides of i).
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in succession.
  • the poly(A) tail comprises one or more nucleotides other than A nucleotides.
  • the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides in the sequence having two or more consecutive nucleotides are nucleotides other than A nucleotides.
  • the poly(A) tail is a truncated poly(A), i.e., m consecutive A nucleotides and n consecutive A nucleotides are connected by a linker sequence consisting of p non-A nucleotides, wherein m, n and p are positive integers.
  • m is 30, n is 70, and p is 10.
  • the DNA sequence corresponding to the poly(A) tail is shown as SEQ ID NO:53.
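The m/linker/n layout of the poly(A) tail described above can be sketched in a few lines. This is only an illustration: the helper name `segmented_poly_a` and the 10-nt linker below are hypothetical placeholders, and the placeholder is not the sequence of SEQ ID NO:53, which is not reproduced in this passage.

```python
def segmented_poly_a(m: int, n: int, linker: str) -> str:
    """Return m A's, then a non-A linker of length p, then n A's (RNA letters)."""
    linker = linker.upper()
    if m <= 0 or n <= 0 or len(linker) == 0:
        raise ValueError("m, n and the linker length p must be positive")
    if set(linker) - set("CGU"):
        # Rejects 'A' as well as any non-RNA letter.
        raise ValueError("the linker must consist of non-A RNA nucleotides")
    return "A" * m + linker + "A" * n


# Hypothetical 10-nt linker chosen only for illustration.
PLACEHOLDER_LINKER = "GCUGCUGCUG"

# m = 30, n = 70, p = 10, as in the embodiment above.
tail = segmented_poly_a(30, 70, PLACEHOLDER_LINKER)
```

With these values the tail is 110 nt long and contains 100 A nucleotides split into two runs of 30 and 70.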
  • the recombinant RNA molecule is no more than 50000nt.
  • the recombinant RNA molecule is no more than 40000nt, 30000nt, 20000nt, 10000nt, 9000nt, 8000nt or 6000nt.
  • the recombinant RNA molecule is 500nt to 50000nt.
  • the recombinant RNA molecule is 1000nt to 40000nt, 1000nt to 30000nt, 1500nt to 10000nt or 1500nt to 8000nt.
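The length limits above can be checked mechanically once the sequence elements are joined. The following is a minimal sketch assuming the elements are concatenated in the order 5'-UTR, coding sequence, 3'-UTR, poly(A) tail; the 5'-cap is not a templated sequence element and is left out of the length count here, which is an assumption. The class and function names are hypothetical and the sequences are toy placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RnaParts:
    """Sequence elements of a recombinant RNA (RNA letters, 5' to 3')."""
    five_utr: str
    coding: str
    three_utr: str
    poly_a: str = ""

    def assemble(self) -> str:
        # Assumed order: 5'-UTR, coding sequence, 3'-UTR, poly(A) tail.
        return self.five_utr + self.coding + self.three_utr + self.poly_a


def length_ok(rna: str, min_nt: int = 500, max_nt: int = 50000) -> bool:
    """True if the molecule length falls inside the stated nt range."""
    return min_nt <= len(rna) <= max_nt


# Toy placeholder sequences, not taken from the sequence listing.
parts = RnaParts(
    five_utr="G" * 50,
    coding="AUG" + "GCU" * 300 + "UAA",   # 906 nt
    three_utr="C" * 100,
    poly_a="A" * 100,
)
rna = parts.assemble()                     # 1156 nt in total
in_broad_range = length_ok(rna)            # 500 nt to 50000 nt
in_narrow_range = length_ok(rna, 1500, 8000)
```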
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a second nucleotide sequence comprising a 5'-UTR, and a third nucleotide sequence comprising a 3'-UTR, wherein the second nucleotide sequence comprises at least one of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7, a fragment thereof, a variant thereof, and a variant of a fragment thereof, wherein the third nucleotide sequence comprises at least one of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2 and GH1, a fragment thereof, a variant thereof, and a variant of a fragment thereof, and wherein the first nucleotide sequence does not naturally occur in the same RNA molecule with at least one of the second nucleotide sequence and the third nucleotide sequence.
  • a combination of a 5'-UTR, fragment, variant, or fragment variant derived from at least one of the above genes and a 3'-UTR, fragment, variant, or fragment variant derived from at least one of the above genes can further improve the translation efficiency and/or stability of the mRNA molecule, and improve the expression level of the polypeptide and/or protein of interest, for example, by 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 times, etc.
  • A, a fragment, a variant or a variant of a fragment is an abbreviation of A, a fragment of A, a variant of A or a variant of a fragment of A.
  • 5'-UTR derived from at least one of genes PPIA, HPX and FTCD, a fragment, a variant or a variant of a fragment refers to 5'-UTR derived from at least one of genes PPIA, HPX and FTCD, a fragment derived from 5'-UTR of at least one of genes PPIA, HPX and FTCD, a variant derived from 5'-UTR of at least one of genes PPIA, HPX and FTCD, or a variant of a fragment derived from 5'-UTR of at least one of genes PPIA, HPX and FTCD.
  • "At least one of A, a fragment, a variant and a variant of a fragment" is an abbreviation of "at least one of A, a fragment of A, a variant of A and a variant of a fragment of A".
  • "at least one of the 5'-UTR, fragments, variants and variants of fragments derived from at least one of the genes PPIA, HPX and FTCD" means at least one of: a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a fragment of that 5'-UTR, a variant of that 5'-UTR, and a variant of a fragment of that 5'-UTR.
  • the gene is a human gene.
  • the first nucleotide sequence encodes at least one polypeptide of interest, for example one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest. In some embodiments, the first nucleotide sequence encodes at least one protein of interest, for example one, two, three, four, five, six, seven, eight, nine or ten proteins of interest. In some embodiments, the first nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest, for example one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest and one, two, three, four, five, six, seven, eight, nine or ten proteins of interest.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21.
  • the second nucleotide sequence comprises one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 1 to 21.
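The embodiments above refer to "an RNA encoded by a polynucleotide" whose sequence is given in the sequence listing. As a minimal sketch, assuming the listed DNA sequence is the sense strand written 5' to 3', the encoded RNA has the same sequence with T replaced by U. The function name `encoded_rna` is hypothetical and the input used below is a toy string, not an entry of the sequence listing.

```python
def encoded_rna(dna: str) -> str:
    """Return the RNA with the same sequence as a sense-strand DNA (T -> U).

    Assumes the input is the sense strand, 5' to 3'; whitespace and case
    are normalised first, and letters outside A, C, G, T are rejected.
    """
    dna = "".join(dna.split()).upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected DNA letters: {sorted(unexpected)}")
    return dna.replace("T", "U")


# Toy input only.
example = encoded_rna("acgt TTaa")
```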
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of at least one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of at least one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide of one of SEQ ID NOs: 1 to 21.
  • the gene is selected from at least one of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the second nucleotide sequence comprises at least one of the following nucleotides: 1): RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2): a fragment of the RNA in 1); 3): a variant of the RNA in 1); and 4): a variant of the fragment in 2).
  • the second nucleotide sequence comprises one of the following nucleotides: 1): RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2): a fragment of the RNA in 1); 3): a variant of the RNA in 1); and 4): a variant of the fragment in 2).
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of the sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
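The number of nucleotides inserted, deleted or substituted relative to a reference can be estimated with an edit distance. The sketch below uses the standard Levenshtein distance, which gives the minimum number of single-nucleotide insertions, deletions and substitutions that turn one sequence into the other; treating that minimum as the count referred to above is an assumption, and the function name `edit_distance` is hypothetical.

```python
def edit_distance(reference: str, candidate: str) -> int:
    """Levenshtein distance: minimum single-nucleotide edits between two sequences."""
    # prev[j] holds the distance between reference[:i-1] and candidate[:j].
    prev = list(range(len(candidate) + 1))
    for i, ref_nt in enumerate(reference, start=1):
        cur = [i]  # distance from reference[:i] to the empty string
        for j, cand_nt in enumerate(candidate, start=1):
            cur.append(min(
                prev[j] + 1,                         # delete ref_nt
                cur[j - 1] + 1,                      # insert cand_nt
                prev[j - 1] + (ref_nt != cand_nt),   # substitute or match
            ))
        prev = cur
    return prev[-1]
```

For example, a toy variant that differs from its reference by one deleted nucleotide has distance 1, and one that differs by two substitutions has distance 2.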
  • the 5'-UTR in the second nucleotide sequence in the recombinant RNA molecule of the present invention may comprise two or more tandem sequences selected from the 5'-UTRs of the above-mentioned genes, fragments of the 5'-UTRs of the above-mentioned genes, variants of the 5'-UTRs of the above-mentioned genes, and variants of the fragments of the 5'-UTRs of the above-mentioned genes.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (a): 5'-UTRs derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): fragments of the 5'-UTRs of at least two of the genes in (a); (c): variants of the 5'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the second nucleotide sequence comprises one of the following polynucleotides: (a): 5'-UTRs derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): fragments of the 5'-UTRs of at least two of the genes in (a); (c): variants of the 5'-UTRs of at least two of the genes in (a); and (d): variants of the fragments in (b).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a 5'-UTR derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB; (f): a fragment of the 5'-UTR described in (e); (g): a variant of the 5'-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises one of the following polynucleotides: (e): a 5'-UTR derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB; (f): a fragment of the 5'-UTR described in (e); (g): a variant of the 5'-UTR described in (e); and (h): a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, or a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by the polynucleotide of the sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in at least two of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide of the sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7; b): at least two copies of a fragment of the 5'-UTR of at least one of the genes in a); c): at least two copies of a variant of the 5'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of the 5'-UTR of at least one of the genes in a).
  • the second nucleotide sequence comprises at least two copies of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7.
  • the 5'-UTR sequence comprises at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB, at least two copies of a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB, at least two copies of a variant of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB, or at least two copies of a variant of a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB.
  • the 5'-UTR sequence comprises at least two copies of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, at least two copies of a fragment of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, at least two copies of a variant of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, or at least two copies of a variant of a fragment of a 5'-UTR derived from one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
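The "at least two copies" embodiments can be pictured as the same UTR element repeated head to tail. The sketch below assumes the copies are joined directly with no spacer between them, which this passage does not state, and it limits the copy number to the two to nine copies enumerated above. The function name `tandem_copies` is hypothetical and the element used in the example is a toy sequence.

```python
def tandem_copies(element: str, copies: int) -> str:
    """Concatenate `copies` identical copies of a UTR element, head to tail."""
    if not element:
        raise ValueError("the UTR element must not be empty")
    if not 2 <= copies <= 9:
        # Range taken from the enumerated options (two to nine copies).
        raise ValueError("copies must be between 2 and 9")
    return element * copies


# Toy element only; not an entry of the sequence listing.
three_copies = tandem_copies("GCU", 3)
```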
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, at least two copies of a variant of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21, and at least two copies of a variant of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NO: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NO: 1 to 21.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NO: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NO: 1 to 21.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of the sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6 has an insertion, addition, deletion, or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared to the RNA encoded by the polynucleotide having a sequence as shown in SEQ ID NOs: 9, 7, 18, 12, 8, 1, or 6.
  • the third nucleotide sequence comprises: at least one of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 49, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 49, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 49, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 49.
  • the third nucleotide sequence comprises: at least one of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least one of SEQ ID NO: 22 to 48.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49.
  • a fragment, variant, or variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49.
  • the gene is selected from at least one of MPND, FBXW10, FBXW12 and PGLYRP1. In some embodiments, the gene is selected from one of MPND, FBXW10, FBXW12 and PGLYRP1.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a variant of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, and a variant of a fragment of RNA encoded by a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 24, 22, 23 or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 24, 22, 23 or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the 3'-UTR in the third nucleotide sequence of the recombinant molecule of the present invention may comprise two or more tandem 3'-UTRs from the above-mentioned gene, fragments of the 3'-UTR of the above-mentioned gene, variants of the 3'-UTR of the above-mentioned gene, or variants of the 3'-UTR fragments of the above-mentioned gene.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: (a): 3'-UTRs derived from at least two of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2, and GH1.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: (e): 3'-UTRs derived from at least two genes of the genes MPND, FBXW10, FBXW12, and PGLYRP1; (f): a fragment of the 3'-UTR of at least two genes of the genes described in (e); (g): a variant of the 3'-UTR of at least two genes of the genes described in (e); and (h): a variant of the fragment of the 3'-UTR of at least two genes of the genes described in (f).
  • the third nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 49, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 49, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 49, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 49.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 48, a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 48, a variant of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 48, and a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in at least two of SEQ ID NO: 22 to 48.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49.
  • the third nucleotide sequence contains at least one of the following polynucleotides: RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, variants of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25, and variants of fragments of RNA encoded by at least two of the polynucleotides shown in sequences SEQ ID NO: 24, 22, 23 and 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide shown in the sequence SEQ ID NO: 24, 22, 23 or 25.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2, and GH1; b): at least two copies of a fragment of the 3'-UTR of at least one of the genes in a); c): at least two copies of a variant of the 3'-UTR of at least one of the genes in a); and d): at least two copies of a variant of a fragment of the 3'-UTR of at least one of the genes in a).
  • the third nucleotide sequence comprises at least two copies of the 3'-UTR of at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of the RNA encoded by the polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, at least two copies of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, at least two copies of a variant of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, and at least two copies of a variant of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, at least two copies of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, at least two copies of a variant of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49, and at least two copies of a variant of a fragment of RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 49.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 49 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 49.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 49 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide of one of SEQ ID NOs: 22 to 49.
  • the third nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 24, 22, 23 and 25, at least two copies of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 24, 22, 23 and 25, at least two copies of a variant of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 24, 22, 23 and 25, and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide having a sequence as shown in one of SEQ ID NO: 24, 22, 23 and 25.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the RNA encoded by the polynucleotide whose sequence is shown as one of SEQ ID NO: 24, 22, 23 and 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the RNA encoded by the polynucleotide whose sequence is shown as SEQ ID NO: 24, 22, 23 or 25.
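The identity thresholds and the counts of inserted, deleted or substituted nucleotides recited above can be illustrated with a small calculation. The following is a minimal sketch only: the text does not specify an alignment method, so the sketch assumes a plain Levenshtein edit distance and approximates identity as one minus the distance divided by the longer sequence length. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimum number of single-nucleotide insertions, deletions or
    substitutions needed to turn `reference` into `variant`."""
    previous = list(range(len(variant) + 1))
    for i, ref_nt in enumerate(reference, start=1):
        current = [i]
        for j, var_nt in enumerate(variant, start=1):
            substitution_cost = 0 if ref_nt == var_nt else 1
            current.append(min(
                previous[j] + 1,                       # deletion from reference
                current[j - 1] + 1,                    # insertion into reference
                previous[j - 1] + substitution_cost,   # match or substitution
            ))
        previous = current
    return previous[-1]


def approximate_identity(reference: str, variant: str) -> float:
    """Assumed identity measure: 1 - edit_distance / longer length.
    Real sequence-identity tools may define identity differently."""
    longest = max(len(reference), len(variant))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(reference, variant) / longest


# Hypothetical 10-nt reference and a variant with one substitution.
reference_rna = "AUGGCUAUGG"
variant_rna = "AUGGCUAUGC"
n_changes = edit_distance(reference_rna, variant_rna)
identity = approximate_identity(reference_rna, variant_rna)
```

Under this assumed measure, a variant would satisfy, for example, the "at least 90% identity" embodiment when `approximate_identity` returns 0.9 or more.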
  • nucleotide sequence of the 5’-UTR is as shown in SEQ ID NO:55.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3’-UTR of any of the above embodiments, and a 5’-UTR with a nucleotide sequence as shown in SEQ ID NO:55.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3’-UTR with a sequence as shown in one of SEQ ID NOs: 22 to 48, and a 5’-UTR with a nucleotide sequence as shown in SEQ ID NO: 55.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3’-UTR derived from at least one of genes FBXW10, FBXW12, MPND and PGLYRP1, and a 5’-UTR having a nucleotide sequence as shown in SEQ ID NO:55.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3’-UTR with a sequence as shown in one of SEQ ID NOs: 22 to 25, and a 5’-UTR with a nucleotide sequence as shown in SEQ ID NO: 55.
  • the 3’-UTR is a 3’-UTR derived from the gene COP1 (CARD only protein).
  • nucleotide sequence of the 3’-UTR is as shown in SEQ ID NO:56.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR according to any of the above embodiments, and a 3'-UTR having a nucleotide sequence as shown in SEQ ID NO:56.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5’-UTR as shown in one of SEQ ID NOs: 1 to 21, and a 3’-UTR of the gene COP1 (CARD only protein).
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5’-UTR with a sequence as shown in one of SEQ ID NOs: 1 to 21, and a 3’-UTR with a nucleotide sequence as shown in SEQ ID NO: 56.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, and HBB, and a 3'-UTR derived from the gene COP1.
  • the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5’-UTR with a sequence as shown in one of SEQ ID NOs: 1, 6 to 9, 12 and 18, and a 3’-UTR with a nucleotide sequence as shown in SEQ ID NO: 56.
  • the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure and a poly (A) tail.
  • the 5'-cap structure includes but is not limited to at least one of m7GpppG, m27,3'-OGpppG, m7Gppp(5')N1 or m7Gppp(m2'-O)N1, wherein:
  • m7G represents the 7-methylguanosine cap nucleoside;
  • ppp represents the triphosphate bond between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript;
  • N1 is the 5'-most nucleotide;
  • G represents a guanosine nucleoside;
  • m7 represents a methyl group at the 7-position of guanine; and
  • m2'-O represents a methyl group at the 2'-O position of the nucleotide.
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides.
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in succession.
  • the poly(A) tail comprises one or more nucleotides other than A nucleotides.
  • the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides in the sequence having two or more consecutive nucleotides are nucleotides other than A nucleotides.
  • the poly(A) tail is segmented, i.e., a run of m consecutive A nucleotides and a run of n consecutive A nucleotides are joined by a linker sequence consisting of p non-A nucleotides, wherein m, n and p are positive integers.
  • m is 30, n is 70, and p is 10.
  • the DNA sequence corresponding to the poly (A) tail is shown as SEQ ID NO:53.
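The linker-interrupted poly(A) tail described above (m consecutive A nucleotides, a linker of p non-A nucleotides, then n consecutive A nucleotides) can be sketched as a simple string construction. This is an illustrative sketch under stated assumptions: the function name is hypothetical, and the 10-nt linker below is a placeholder chosen only because it contains no A; it is not the linker of SEQ ID NO:53, which is not reproduced in this section.

```python
def build_segmented_poly_a(m: int, n: int, linker: str) -> str:
    """Return m A's, then the linker, then n A's (RNA alphabet).

    The linker must be non-empty and must not contain A, matching the
    'p non-A nucleotides' wording of the embodiment.
    """
    linker = linker.upper()
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    if not linker:
        raise ValueError("linker must contain at least one nucleotide (p >= 1)")
    if "A" in linker:
        raise ValueError("linker must consist of non-A nucleotides only")
    return "A" * m + linker + "A" * n


# Placeholder linker (assumption, p = 10); m = 30 and n = 70 follow the text.
HYPOTHETICAL_LINKER = "GCUCGUCUGC"
tail = build_segmented_poly_a(30, 70, HYPOTHETICAL_LINKER)
tail_length = len(tail)  # 30 + 10 + 70 nucleotides
```

With m = 30, n = 70 and p = 10, the resulting tail is 110 nt long and contains 100 A nucleotides in total.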
  • the recombinant RNA molecule is no more than 50000nt.
  • the recombinant RNA molecule is no more than 40000nt, 30000nt, 20000nt, 10000nt, 9000nt, 8000nt or 6000nt.
  • the recombinant RNA molecule is 500nt to 50000nt.
  • the recombinant RNA molecule is 1000nt to 40000nt, 1000nt to 30000nt, 1500nt to 10000nt or 1500nt to 8000nt.
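As an illustration of how the elements recited above fit together, the sketch below concatenates a 5'-UTR, a coding sequence, a 3'-UTR and a poly(A) tail and checks the total length against the 500 nt to 50000 nt range. The class name, field names and all sequences are hypothetical placeholders; none of them is taken from SEQ ID NOs: 1 to 56, and the 5'-cap is not modelled because it is a chemical structure rather than a templated sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecombinantRNA:
    """Hypothetical container for the sequence elements of a recombinant RNA."""
    five_utr: str
    coding_sequence: str
    three_utr: str
    poly_a_tail: str = ""

    def sequence(self) -> str:
        # Elements are joined in 5' to 3' order.
        return (self.five_utr + self.coding_sequence
                + self.three_utr + self.poly_a_tail)

    def length_in_range(self, min_nt: int = 500, max_nt: int = 50000) -> bool:
        # Default bounds follow the "500nt to 50000nt" embodiment.
        return min_nt <= len(self.sequence()) <= max_nt


# Placeholder sequences (assumptions for illustration only).
example = RecombinantRNA(
    five_utr="G" * 50,
    coding_sequence="AUG" + "GCU" * 300 + "UAA",  # start codon, 300 codons, stop codon
    three_utr="C" * 100,
    poly_a_tail="A" * 100,
)
total_length = len(example.sequence())
within_default_range = example.length_in_range()
within_narrow_range = example.length_in_range(min_nt=1500, max_nt=8000)
```

The narrower ranges listed above (for example 1500 nt to 8000 nt) can be checked by passing different bounds to `length_in_range`.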
  • the recombinant RNA molecule of any of the above embodiments comprises modified nucleosides.
  • the recombinant RNA molecule comprises at least one of modified uridine, modified cytidine, modified adenosine, and modified guanosine.
  • the modified nucleoside is a modified uridine.
  • 0.1% to 100% of the uridine in the recombinant RNA molecule is modified.
  • 80% to 100% of the uridine is modified.
  • 100% of the uridine is modified.
  • Exemplary modified uridines include pseudouridine (ψ), N1-methyl-pseudouridine, pyridin-4-one ribonucleoside, 5-aza-uridine, 6-aza-uridine, 2-thio-5-aza-uridine, 2-thio-uridine (s2U), 4-thio-uridine (s4U), 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxy-uridine (ho5U), 5-aminoallyl-uridine, 5-halo-uridine (e.g., 5-iodo-uridine or 5-bromo-uridine), 3-methyl-uridine (m3U), 5-methoxy-uridine (mo5U), uridine-5-oxyacetic acid (cmo5U), uridine-5-oxyacetic acid methyl ester (mcmo5U), 5-carboxymethyl-uridine (cm5U), 1-carboxymethyl-pseudouridine, 5-carboxyhydroxymethyl-uridine (ch
  • the modified nucleoside is a modified cytidine.
  • 0.1% to 100% of the cytidines in the recombinant RNA molecule are modified.
  • Preferably, 80% to 100% of the cytidines are modified.
  • 100% of the cytidines are modified.
  • Exemplary modified cytidines include 5-aza-cytidine, 6-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine (m3C), N4-acetyl-cytidine (ac4C), 5-formyl-cytidine (f5C), N4-methyl-cytidine (m4C), 5-methyl-cytidine (m5C), 5-halo-cytidine (e.g., 5-iodo-cytidine), 5-hydroxymethyl-cytidine (hm5C), 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine (s2C), 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-d
  • 5-Methoxy-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, 4-methoxy-1-methyl-pseudoisocytidine, lysidine (k2C), α-thio-cytidine, 2'-O-methyl-cytidine (Cm), 5,2'-O-dimethyl-cytidine (m5Cm), N4,N4,2'-O-trimethyl-cytidine (m42Cm), 1-thio-cytidine, 2'-F-ara-cytidine, 2'-F-cytidine and 2'-OH-ara-cytidine.
  • the modified nucleoside is a modified adenosine.
  • 0.1% to 100% of the adenosine in the recombinant RNA molecule is modified.
  • 80% to 100% of the adenosine is modified.
  • 100% of the adenosine is modified.
  • Exemplary modified adenosines include 2-amino-purine, 2,6-diaminopurine, 2-amino-6-halo-purine (e.g., 2-amino-6-chloro-purine), 6-halo-purine (e.g., 6-chloro-purine), 2-amino-6-methyl-purine, 8-azido-adenosine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-amino-purine, 7-deaza-8-aza-2-amino-purine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyl-adenosine (m1A), 2-methyl-adenine (m2A), N6-methyl-adenosine (m6A), 2-methylthio-N6-methyl-adenosine (ms2m6A), N6-
  • the modified nucleoside is a modified guanosine.
  • 0.1% to 100% of the guanosine in the recombinant RNA molecule is modified.
  • 80% to 100% of the guanosine is modified.
  • 100% of the guanosine is modified.
  • Exemplary modified guanosines include inosine (I), 1-methyl-inosine (m1I), wyosine (imG), methyl wyosine (mimG), 4-demethyl-wyosine (imG-14), iso-wyosine (imG2), wybutosine (yW), peroxywybutosine (o2yW), hydroxywybutosine (OHyW), undermodified hydroxywybutosine (OHyW*), 7-deaza-guanosine, queuosine (Q), epoxy-queuosine (oQ), galactosyl-queuosine (galQ), mannosyl-queuosine (manQ), 7-cyano-7-deaza-guanosine (preQ0), 7-aminomethyl-7-deaza-guanosine (preQ1), archaeosine (G+), 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine
  • the polypeptide and/or protein of interest refers to a therapeutically or pharmaceutically active polypeptide or protein having a therapeutic or preventive effect, whose function in or near a cell is necessary or beneficial; for example, a protein whose absence or defective form causes a disease and whose provision can moderate or prevent the disease, or a protein that is beneficial to the body in or near a cell.
  • the polypeptide or protein may comprise a complete protein or a functional variant thereof.
  • the nucleotide sequence encoding the polypeptide and/or protein of interest or the expressed peptide and/or protein comprises or is one or more of the following: (a) an antigen; (b) a therapeutic protein or polypeptide, or a fragment or variant thereof; and (c) other polypeptides or proteins.
  • the peptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is an antigen.
  • the antigen expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest is derived from one or more of the following: (1) pathogenic antigens, fragments, variants or variants of fragments thereof, (2) tumor antigens, fragments, variants or variants of fragments thereof, (3) allergic antigens, fragments, variants or variants of fragments thereof, (4) autoimmune self-antigens, fragments, variants or variants of fragments thereof.
  • pathogenic antigens are derived from pathogenic organisms that can cause an immune response in a subject (e.g., a mammalian subject, further e.g., a human).
  • pathogenic organisms include or are one or more of the following: bacteria, viruses, fungi, and protozoa (e.g., unicellular organisms, multicellular organisms).
  • the pathogenic antigen comprises or is a surface antigen, a fragment, a variant, or a variant of a fragment thereof, such as a protein located on the surface of a virus, a bacterium, or a protozoan, a fragment thereof (e.g., an external portion of a surface antigen), a variant thereof, or a variant of the fragment.
  • the pathogenic antigen comprises or is derived from a polypeptide or protein from a pathogen associated with an infectious disease.
  • the pathogenic antigen is selected from but not limited to the group consisting of the antigens derived from pathogens recorded on pages 21 to 35 of WO2018/078053A1, the antigens derived from pathogens recorded on page 57, paragraph 3 to page 63, paragraph 2 of WO2019/077001A1, the antigens derived from pathogens recorded on page 32, line 26 to page 34, line 27 of WO2013/120628A1, and the antigens recorded on page 34, line 29 to page 59, line 5 of WO2013/120628A1.
  • the pathogen of the pathogenic antigen is selected from but not limited to one or more of the following: Sarcoptes scabiei, Babesia, Leishmania, Gnathostoma, Ancylostoma braziliense, Ancylostoma duodenale, Strongyloides stercoralis, Trichuris trichiura, Toxocara canis, Toxocara cati, Toxoplasma gondii, Trypanosoma brucei, Trypanosoma cruzi, Brugia malayi, Onchocerca volvulus, Wuchereria bancrofti, Tapeworm, Taenia solium, Echinococcus, Ascaris lumbricoides, Dientamoeba fragilis, Naegleria fowleri, Necator americanus, Paragonimus (e.g., Paragonimus westermani), Clonorchis sinensis, Plasmodium
  • the pathogenic antigen includes or is one or more of the following:
  • (1) one or more of the following proteins of SARS coronavirus 2 (SARS-CoV-2), SARS-CoV-2019 coronavirus or SARS coronavirus: spike protein (S), envelope protein (E), membrane protein (M) and nucleocapsid protein (N);
  • (2) one or more of the following proteins of MERS coronavirus: spike protein (S), spike S1 fragment (S1), envelope protein (E), membrane protein (M) and nucleocapsid protein (N);
  • (3) one or more of the following proteins of human papillomavirus (HPV) (e.g., HPV16): replication protein E1, regulatory protein E2, protein E3, protein E4, protein E5, protein E6, protein E7, protein E8, major capsid protein L1 and minor capsid protein L2;
  • (4) one or more of the following proteins of human parainfluenza virus (HPIV/PIV) (e.g., hPIV-1, hPIV-2, hPIV-3 or hPIV-4 serotypes): fusion protein (F), hemagglutinin neuraminidase, glycoprotein (G), matrix protein (M), phosphoprotein (P), nucleocapsid protein, fusion glycoprotein F0, F1 or F2, recombinant PIV3/PIV1 fusion glycoprotein, C protein, D protein, viral replicase (L) and non-structural V protein;
  • (5) one or more of the following proteins of human metapneumovirus (hMPV): fusion (F) glycoprotein, glycoprotein (G), phosphoprotein (P), and nucleocapsid protein;
  • (6) one or more of the following proteins of influenza virus: hemagglutinin (HA), neuraminidase (NA), nucleoprotein (NP), M1 protein, M2 protein, NS1 protein, NS2 protein (NEP protein: nuclear export protein), PA protein, PB1 protein (polymerase
  • the tumor antigen is selected from but not limited to the group consisting of the tumor antigens described in WO2018/078053A1, pages 47-51.
  • the antigens expressed by the nucleotide sequences encoding the polypeptides and/or proteins of interest include or are allergic antigens and autoimmune self-antigens.
  • allergic antigens and autoimmune self-antigens are derived from or selected from, but not limited to, the antigen groups described on pages 59 to 73 of WO2018/078053A1.
  • the antigens expressed by the nucleotide sequences encoding the polypeptides and/or proteins of interest are listed on pages 48 to 51 of WO2018/078053A1.
  • the polypeptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is a therapeutic protein or polypeptide.
  • the therapeutic protein or polypeptide includes or is one or more of the following:
  • (1) Therapeutic proteins or polypeptides for enzyme replacement therapy for the treatment of metabolic, endocrine or amino acid disorders, or for replacing missing, defective or mutated proteins; (2) Therapeutic proteins or polypeptides for the treatment of blood diseases, circulatory system diseases, respiratory system diseases, infectious diseases or immune deficiencies; (3) Therapeutic proteins or polypeptides for the treatment of cancer or tumor diseases; (4) Therapeutic proteins or polypeptides for hormone replacement therapy; (5) Therapeutic proteins or polypeptides for reprogramming somatic cells into pluripotent stem cells or totipotent stem cells; (6) Therapeutic proteins or polypeptides used as adjuvants or immunostimulants; (7) Therapeutic proteins or polypeptides as therapeutic antibodies; (8) Therapeutic proteins or polypeptides as gene editing agents; (9) Therapeutic proteins or polypeptides for the treatment or prevention of liver diseases selected from the group consisting of liver fibrosis, cirrhosis and liver cancer; and (10) Therapeutic proteins or polypeptides for the treatment or prevention of rare diseases.
  • the therapeutic proteins or polypeptides for enzyme replacement therapy for the treatment of metabolic, endocrine or amino acid disorders, or for replacing missing, defective or mutated proteins, include or are one or more of the following: acid sphingomyelinase, adipotide, agalsidase beta, alglucosidase, α-galactosidase A, α-glucosidase, α-L-iduronidase, α-N-acetylglucosaminidase, amphiregulin, angiopoietin (Ang1, Ang2, Ang3, Ang4, ANGPTL2, ANGPTL3, ANGPTL4, ANGPTL5, ANGPTL6, ANGPTL7), ATPase, Cu(2+)-transporting β polypeptide (ATP7B), argininosuccinate synthetase (ASS1), betacellulin, beta-glucuronidase, bone morphogen
  • the therapeutic protein or polypeptide for treating metabolic or endocrine diseases is selected from the proteins or polypeptides described in Table A (in combination with Table C) of WO2017/191274.
  • the therapeutic protein or polypeptide for treating a blood disorder, a circulatory system disease, a respiratory system disease, a cancer or tumor disease, an infectious disease, or an immune deficiency comprises or is one or more of the following: alteplase (tissue plasminogen activator; tPA), anistreplase, antithrombin III (AT-III), bivalirudin, darbepoetin-α, drotrecogin-α (activated protein C), erythropoietin, epoetin-α, factor IX, factor VIIa, factor VIII, recombinant hirudin, protein C concentrate, reteplase (tPA deletion mutein), streptokinase, tenecteplase, urokinase, angiostatin, anti-CD22 immunotoxin, denileukin, immunocyanine,
  • the therapeutic protein or polypeptide for treating cancer or tumor disease includes or is one or more of the following: cytokines, chemokines, suicide gene products, immunogenic proteins or peptides, apoptosis inducers, angiogenesis inhibitors, heat shock proteins, tumor antigens, β-catenin inhibitors, STING pathway activators, checkpoint regulators, innate immune activators, antibodies, dominant negative receptors and decoy receptors, myeloid-derived suppressor cells (MDSCs) inhibitors, IDO pathway inhibitors, and proteins or peptides that bind to apoptosis inhibitors.
  • the hormones in the therapeutic protein or polypeptide for hormone replacement therapy include one or more of the following: estrogen, progesterone, progestin, and testosterone.
  • therapeutic proteins for reprogramming somatic cells into pluripotent or totipotent stem cells include one or more of the following: Oct-3/4, Sox gene family (e.g., Sox1, Sox2, Sox3, and Sox15), Klf family (e.g., Klf1, Klf2, Klf4, and Klf5), Myc family (e.g., c-Myc, L-Myc, and N-Myc), Nanog, and LIN28.
  • the therapeutic protein or polypeptide used as an adjuvant or immunostimulatory protein includes or is one or more of the following: human adjuvant proteins, in particular pattern recognition receptors TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7, TLR8, TLR9, TLR10, TLR11; NOD1, NOD2, NOD3, NOD4, NOD5, NALP1, NALP2, NALP3, NALP4, NALP5, NALP6, NALP7, NALP8, NALP9, NALP10, NALP11, NALP12, NALP13, NALP14, IPAF, NAIP, CIITA, RIG-I, MDA5 and LGP2, signal transducers of TLR signaling (including adaptor proteins (such as Trif and Cardif), components of small-GTPase signaling (such as RhoA, Ras, Rac1, Cdc42, Rab, etc.), components of PIP signaling (such as PI3K, Src kinases, etc.), components of MyD88-dependent signaling (such as
  • NF-kB, c-Fos, c-Jun, c-Myc, CREB, AP-1, Elk-1, ATF2, IRF-3, IRF-7; heat shock proteins (HSP10, HSP60, HSP65, HSP70, HSP75 and HSP90); gp96; fibrinogen; the type III repeat extra domain of fibronectin, etc.
  • components of the complement system e.g.
  • the human adjuvant protein includes one or more of the following: trif, flt-3 ligand, Gp96 or fibronectin, cytokines that induce or enhance innate immune responses (e.g., IL-1α, IL-1R1, IL-1β, IL-2, IL-6, IL-7, IL-8, IL-9, IL-12, IL-13, IL-15, IL-16, IL-17, IL-18, IL-21, IL-23, TNFα, IFNα, IFNβ, IFNγ, GM-CSF, G-CSF, M-CSF), chemokines (e.g., IL-8, IP-10, MCP-1, MIP-1α, RANTES, Eotaxin, CCL21), cytokines released by macrophages (e.g., IL-1, IL-6, IL-8, IL-12, TNF-α, etc.).
  • therapeutic proteins or polypeptides used as adjuvants or immunostimulators include one or more of the following: bacterial (adjuvant) proteins, protozoan (adjuvant) proteins, viral (adjuvant) proteins, fungal (adjuvant) proteins, and animal-derived proteins.
  • the bacterial (adjuvant) protein includes one or more of the following: bacterial heat shock proteins or chaperones (including Hsp60, Hsp70, Hsp90, Hsp100); Gram-negative bacterial OmpA (outer membrane protein); OspA; bacterial porins (e.g., OmpF); bacterial toxins (e.g., pertussis toxin (PT) of Bordetella pertussis, pertussis adenylate cyclase toxin CyaA and CyaC of Bordetella pertussis, pertussis toxin PT-9K/129G mutant, tetanus toxin, cholera toxin (CT), cholera toxin B subunit, cholera toxin CTK63 mutant, CTE112K mutant of CT
  • LTK63, LTR72; phenol-soluble modulin; Helicobacter pylori neutrophil-activating protein (HP-NAP); surfactant protein D; Borrelia burgdorferi outer surface protein A lipoprotein; Ag38 (38 kDa antigen) of Mycobacterium tuberculosis; bacterial fimbrial proteins (e.g., pilin from the pili of Gram-negative bacteria); surfactant protein A; and bacterial flagellar proteins.
  • the protozoan (adjuvant) proteins include one or more of the following: Tc52 from Trypanosoma cruzi, PFTG from Trypanosoma gondii, protozoan heat shock proteins, LeIF from Leishmania, and profilin-like protein from Toxoplasma gondii.
  • the viral (adjuvant) proteins include one or more of the following: respiratory syncytial virus fusion glycoprotein (F protein), MMT virus envelope protein, mouse leukemia virus protein, and wild-type measles virus hemagglutinin protein.
  • the fungal (adjuvant) protein comprises a fungal immunomodulatory protein (FIP, e.g., LZ-8).
  • the animal-derived protein includes keyhole limpet hemocyanin (KLH).
  • the polypeptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is a therapeutic protein or polypeptide as a therapeutic antibody, such as one or more of the cytokines, chemokines, suicide enzymes and gene products, apoptosis inducers, endogenous angiogenesis inhibitors, heat shock proteins, tumor antigens, innate immune activators, and antibodies against proteins associated with tumor or cancer development described in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10, Table 11 and Table 12 of WO2016/170176A1.
  • V. DNA molecules or vectors encoding the 5'-UTR and/or 3'-UTR or the recombinant RNA molecules of the present invention
  • the present invention provides a DNA molecule encoding the 5'-UTR, 3'-UTR and/or recombinant RNA molecule of the present invention.
  • the present invention also provides a vector comprising the recombinant RNA molecule or DNA molecule of the present invention and a host cell comprising the recombinant RNA molecule, DNA molecule and/or vector of the present invention.
  • the present invention also provides a vector comprising a first nucleotide sequence encoding a 5'-UTR and/or a second nucleotide sequence encoding a 3'-UTR, wherein: the first nucleotide sequence comprises at least one of the following polynucleotides: (a): a polynucleotide encoding a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a polynucleotide encoding a fragment of the 5'-UTR described in (a); (c): a polynucleotide encoding a variant of the 5'-UTR described in (a
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a polynucleotide encoding a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2; (f): a polynucleotide encoding a fragment of the 3'-UTR in (e); (g): a polynucleotide encoding a variant of the 3'-UTR in (e); and (h): a polynucleotide encoding a variant of the fragment in (f).
  • the gene is a human gene.
  • the first nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a fragment of a polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21, a variant of a polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21, and a variant of a fragment of a polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, or 60% similarity with the polynucleotide with a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • a fragment, variant, or variant of a fragment of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the gene is selected from at least one of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB.
  • the first nucleotide sequence comprises at least one of the following polynucleotides: 1): a polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2): a fragment of the polynucleotide shown in 1); 3): a variant of the polynucleotide shown in 1); and 4): a variant of the fragment shown in 2).
  • the fragment, variant, or variant of the fragment of the polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • a fragment, variant, or variant of a fragment of a polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to a polynucleotide shown in sequence SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
  • the first nucleotide sequence may comprise two or more tandemly linked polynucleotides encoding the 5'-UTR from the above-mentioned gene, polynucleotides encoding fragments of the 5'-UTR from the above-mentioned gene, polynucleotides encoding variants of the 5'-UTR from the above-mentioned gene, and polynucleotides encoding variants of fragments of the 5'-UTR from the above-mentioned gene.
  • the first nucleotide sequence comprises at least one of the following polynucleotides: (a): a polynucleotide encoding a 5'-UTR derived from at least two genes of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a polynucleotide encoding a fragment of the 5'-UTR of at least two of the genes described in (a); (c): a polynucleotide encoding a variant of the 5'-UTR of at least two of the genes described in (a); and (d): a polynucleotide encoding a variant of the fragment of the 5'-UTR of at least two of
  • the first nucleotide sequence comprises at least one of the following polynucleotides: (e): a polynucleotide encoding a 5'-UTR derived from at least two genes of genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f): a polynucleotide encoding a fragment of the 5'-UTR described in (e); (g): a polynucleotide encoding a variant of the 5'-UTR described in (e); and (h): a polynucleotide encoding a variant of the fragment described in (f).
  • the first nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a fragment of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, a variant of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21, and a variant of a fragment of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 1 to 21.
  • a fragment, a variant, or a variant of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with a polynucleotide having a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • a fragment, variant, or variant of a fragment of a polynucleotide with a sequence as shown in one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to a polynucleotide with a sequence as shown in one of SEQ ID NOs: 1 to 21.
  • the first nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, a fragment of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, a variant of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6, and a variant of a fragment of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6.
  • a fragment, variant, or variant of a fragment of a polynucleotide with a sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1, or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with a polynucleotide with a sequence as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1, or 6.
  • the fragment, variant, or variant of a fragment of a polynucleotide as shown in SEQ ID NO: 9, 7, 18, 12, 8, 1, or 6 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more
  • the first nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a polynucleotide encoding a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2, and CDK7; b): at least two copies of a fragment of a polynucleotide encoding a 5'-UTR of at least one of the genes in a); c): at least two copies of a polynucleotide encoding a variant of a 5'-UTR of at least one of the genes in a); and d): at least two copies of a polynucleotide en
  • the first nucleotide sequence comprises at least one of the following polynucleotides: e): at least two copies of a polynucleotide encoding a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; f) at least two copies of a polynucleotide encoding a fragment of a 5'-UTR derived from at least one of the genes in e); g) at least two copies of a polynucleotide encoding a variant of a 5'-UTR derived from at least one of the genes in e); h) at least two copies of a polynucleotide encoding a variant of a fragment of a 5'-UTR derived from at least one of the genes in e).
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies
  • the first nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, at least two copies of a fragment of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, at least two copies of a variant of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21, and at least two copies of a variant of a fragment of a polynucleotide having a sequence as shown in at least one of SEQ ID NOs: 1 to 21.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21.
  • the fragment, variant, or variant of the fragment of the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the polynucleotide whose sequence is shown in one of SEQ ID NOs: 1 to 21.
  • the first nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, at least two copies of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, at least two copies of a variant of a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6, and at least two copies of a variant of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NO: 9, 7, 18, 12, 8, 1 and 6.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of a polynucleotide with a sequence as shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1, and 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the polynucleotide with a sequence as shown in SEQ ID NOs: 9, 7, 18, 12, 8, 1, or 6.
  • the homolog or variant has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides inserted, added, deleted, or substituted compared to SEQ ID NOs: 9, 7, 18, 12, 8, 1, or 6.
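The identity thresholds and nucleotide-change counts recited above can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned to equal length with `-` marking gaps, and it divides matches by the alignment length; the text does not state which alignment method or denominator is intended, so both choices are assumptions. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def count_changes(aligned_a: str, aligned_b: str) -> int:
    """Number of aligned columns that differ (substitution, insertion, or deletion)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


# Hypothetical toy sequences, NOT any SEQ ID NO from the listing.
reference = "GGGAAATAAGAGAGAAAAGA"
variant = "GGGAAATACGAGAG-AAAGA"  # one substitution and one deletion

identity = percent_identity(reference, variant)
changes = count_changes(reference, variant)
print(f"identity = {identity:.1f}%, changed positions = {changes}")
```

Under these assumptions the toy variant would satisfy an "at least 85% identity" threshold but not an "at least 91% identity" threshold.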
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide as set forth in at least one of SEQ ID NOs: 22 to 48, a fragment of a polynucleotide as set forth in at least one of SEQ ID NOs: 22 to 48, a variant of a polynucleotide as set forth in at least one of SEQ ID NOs: 22 to 48, and a variant of a fragment of a polynucleotide as set forth in at least one of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of a fragment of a polynucleotide as set forth in at least one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to a polynucleotide as set forth in one of SEQ ID NOs: 22 to 48.
  • a fragment, variant, or variant of a fragment of a polynucleotide with a sequence as shown in at least one of SEQ ID NOs: 22 to 48 has an insertion, addition, deletion, or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared to a polynucleotide with a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the gene is selected from at least one of MPND, FBXW10, FBXW12 and PGLYRP1.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a fragment of a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, a variant of a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25, and a variant of a fragment of a polynucleotide with a sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the polynucleotide shown in SEQ ID NO: 24, 22, 23, or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the polynucleotide shown in SEQ ID NO: 24, 22, 23, or 25.
  • the fragment, variant, or variant of the fragment of the polynucleotide shown in SEQ ID NO: 24, 22, 23, or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to the polynucleotide shown in SEQ ID NO: 24, 22, 23, or 25.
  • the second nucleotide sequence may comprise two or more tandemly linked polynucleotides encoding the 3'-UTR from the above gene, polynucleotides encoding variants of the 3'-UTR from the above gene, or polynucleotides encoding variants of fragments of the 3'-UTR from the above gene.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (a): a polynucleotide encoding a 3'-UTR derived from at least two of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b): a polynucleotide encoding a fragment of the 3'-UTR of at least two of the genes described in (a); (c): a polynucleotide encoding a variant of the 3'-UTR of at least two of the genes described in (a); and (d): a polynucleotides
  • the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a polynucleotide encoding a 3’-UTR derived from at least two genes of the genes MPND, FBXW10, FBXW12, and PGLYRP1; (f): a polynucleotide encoding a fragment of the 3’-UTR described in (e); (g): a polynucleotide encoding a variant of the 3’-UTR described in (e); and (h): a polynucleotide encoding a variant of the fragment described in (f).
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 22 to 48, a fragment of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 22 to 48, a variant of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 22 to 48, and a variant of a fragment of a polynucleotide having a sequence as shown in at least two of SEQ ID NOs: 22 to 48.
  • a fragment, a variant, or a variant of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the fragment, variant or variant of the fragment of the polynucleotide with a sequence as shown in one of SEQ ID NO: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the polynucleotide with a sequence as shown in one of SEQ ID NO: 22 to 48.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 24, 22, 23, and 25, a fragment of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 24, 22, 23, and 25, a variant of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 24, 22, 23, and 25, and a variant of a fragment of a polynucleotide with a sequence as shown in at least two of SEQ ID NOs: 24, 22, 23, and 25.
  • the fragments, variants, or variants of the fragments of the polynucleotides as shown in SEQ ID NOs: 24, 22, 23, or 25 have at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with the polynucleotides as shown in SEQ ID NOs: 24, 22, 23, or 25.
  • the fragment, variant, or variant of the fragment of the polynucleotide shown in SEQ ID NO: 24, 22, 23 or 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the polynucleotide shown in SEQ ID NO: 24, 22, 23 or 25.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: a): at least two copies of a polynucleotide encoding a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, and NFKB2; b): at least two copies of a polynucleotide encoding a fragment of the 3'-UTR of at least one of the genes in a); c): at least two copies of a polynucleotide encoding a variant of the 3'-UTR of at least one of the genes in
  • the second nucleotide sequence comprises at least one of the following polynucleotides: e): at least 2 copies of a polynucleotide encoding a 3'-UTR derived from one of the genes MPND, FBXW10, FBXW12 and PGLYRP1; f): at least two copies of a polynucleotide encoding a fragment of the 3'-UTR of at least one of the genes in e); g): at least two copies of a polynucleotide encoding a variant of the 3'-UTR of at least one of the genes in e); and h): at least two copies of a polynucleotide encoding a variant of the fragment of the 3'-UTR of at least one of the genes in e).
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, at least two copies of a variant of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48, and at least two copies of a variant of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity with a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the fragment, variant, or variant of the fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted, or substituted compared to a polynucleotide having a sequence as shown in one of SEQ ID NOs: 22 to 48.
  • the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a variant of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25, and at least two copies of a variant of a fragment of a polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25.
  • the at least two copies are two copies, three copies, four copies, five copies, six copies, seven copies, eight copies or nine copies.
  • the fragment, variant, or variant of the fragment of the polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the polynucleotide having a sequence as shown in SEQ ID NO: 24, 22, 23 or 25.
  • the fragment, variant, or variant of the fragment of the polynucleotide having a sequence as shown in one of SEQ ID NOs: 24, 22, 23 and 25 has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides inserted, added, deleted or substituted compared to the polynucleotide whose sequence is shown in SEQ ID NO: 24, 22, 23 or 25.
  • the vector is used to produce recombinant RNA molecules such as mRNA molecules.
  • the vector comprises the first nucleotide sequence and the second nucleotide sequence.
  • the vector also comprises the third nucleotide sequence encoding the polypeptide and/or protein of interest between the first nucleotide sequence and the second nucleotide sequence.
  • the third nucleotide sequence encodes at least one polypeptide of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest.
  • the third nucleotide sequence encodes at least one protein of interest.
  • the third nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest. For example, one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest and one, two, three, four, five, six, seven, eight, nine or ten proteins of interest.
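The arrangement described above (the first nucleotide sequence for the 5'-UTR, then the third nucleotide sequence encoding the polypeptide of interest, then the second nucleotide sequence for the 3'-UTR, each UTR optionally present as tandem copies) can be sketched as simple string concatenation. This is a minimal illustration, not the patent's construction method: the function name, the validation rules (start codon, in-frame length, stop codon), and all three placeholder sequences are assumptions introduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_cassette(utr5: str, cds: str, utr3: str,
                      utr5_copies: int = 1, utr3_copies: int = 1) -> str:
    """Concatenate 5'-UTR copies, coding sequence, and 3'-UTR copies (DNA, 5'->3')."""
    if utr5_copies < 1 or utr3_copies < 1:
        raise ValueError("copy numbers must be at least 1")
    cds = cds.upper()
    # Sanity checks on the coding sequence; these rules are assumptions of this sketch.
    if len(cds) % 3 != 0 or not cds.startswith("ATG") or cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS must start with ATG, be in frame, and end with a stop codon")
    return utr5.upper() * utr5_copies + cds + utr3.upper() * utr3_copies


# Hypothetical placeholder sequences, NOT any SEQ ID NO from the listing.
UTR5 = "GGGAGACC"
CDS = "ATGGCCTAA"
UTR3 = "GCTCGCTT"

# Two tandem copies of the 5'-UTR element and three of the 3'-UTR element.
cassette = assemble_cassette(UTR5, CDS, UTR3, utr5_copies=2, utr3_copies=3)
print(cassette, len(cassette))
```

Repeating the element with `*` models copies joined directly in tandem; any spacer between copies would need an extra parameter, which the text above does not specify.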
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides.
  • the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides in succession.
  • the poly(A) tail comprises one or more nucleotides other than A nucleotides.
  • the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides in the sequence having two or more consecutive nucleotides are nucleotides other than A nucleotides.
  • the poly(A) tail is truncated, i.e., m consecutive A nucleotides and n consecutive A nucleotides are reconnected by a linker sequence consisting of p non-A nucleotides, wherein m, n and p are positive integers.
  • m is 30, n is 70, and p is 10.
  • the DNA sequence corresponding to the poly (A) tail is shown as SEQ ID NO:53.
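The segmented poly(A) layout described above (m consecutive A nucleotides, a linker of p non-A nucleotides, then n consecutive A nucleotides, with m = 30, n = 70 and p = 10) can be sketched as follows. The function name and the 10-nt linker are hypothetical; the actual linker is the one contained in SEQ ID NO:53, which is not reproduced in this section.

```python
def segmented_poly_a(m: int, n: int, linker: str) -> str:
    """Build a DNA-sense poly(A) tail: m A's, a non-A linker, then n A's."""
    linker = linker.upper()
    if m <= 0 or n <= 0 or not linker:
        raise ValueError("m, n and the linker length p must be positive")
    if "A" in linker:
        raise ValueError("the linker must consist of non-A nucleotides only")
    return "A" * m + linker + "A" * n


# Hypothetical 10-nt linker used only for illustration (p = 10).
LINKER = "GTCTGTCTGT"

tail = segmented_poly_a(m=30, n=70, linker=LINKER)
print(len(tail))  # m + p + n nucleotides in total
```

With the values given in the text the tail is 110 nucleotides long, of which 100 are A nucleotides.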
  • the vector may also include expression control elements for correct expression of the vector in the host.
  • control elements are known to those skilled in the art and may include promoters, splicing cassettes, translation initiation codons, translation and insertion sites for introducing inserts into the vector.
  • the vector of the present invention may be, for example, a plasmid, a cosmid, a virus, a phage or another vector conventionally used in genetic engineering, and may contain additional genes, such as a marker gene that allows selection of the vector in a suitable host cell and under suitable conditions.
  • RNA molecules and vectors of the present invention can be introduced into cells directly or via liposomes, viral vectors (e.g., adenovirus, retrovirus), electroporation, ballistics (e.g., gene gun) or other delivery systems.
  • the present invention also provides a method for preparing mRNA, comprising contacting the vector of the present invention with RNA polymerase.
  • the method of the present invention further comprises the step of linearizing the plasmid. In some embodiments, the supercoil rate of the plasmid is at least about 90% prior to linearization. In some embodiments, the method of the present invention further comprises the step of purifying the linearized plasmid. In some embodiments, the method of the present invention further comprises the step of purifying mRNA.
  • the method of the present invention further comprises the steps of capping and optionally purifying the capped product.
  • the cap is a Cap1 cap.
  • the capping reaction is as follows:
      pppN1(pN)x-OH(3') → ppN1(pN)x-OH(3') + Pi
      ppN1(pN)x-OH(3') + GTP → G(5')ppp(5')N1(pN)x-OH(3') + PPi
      G(5')ppp(5')N1(pN)x-OH(3') + AdoMet → m7G(5')ppp(5')N1(pN)x-OH(3') + AdoHcy
      m7GpppN1(pN)x-OH(3') + AdoMet → m7Gppp[m2'-O]N1(pN)x-OH(3') + AdoHcy
  • the present invention also provides a host cell comprising the recombinant RNA molecule, DNA molecule and/or vector of the present invention, such as a bacterial cell.
  • the vector of the present invention such as a plasmid, is stored and/or amplified in the host cell.
  • Host cells of the present invention can be prepared by transforming competent host cells with the vector of the present invention.
  • Competent host cells are cells with the ability to absorb free extracellular genetic material (such as DNA plasmids) independently of the sequence.
  • Various bacterial cells known to those skilled in the art are naturally able to absorb exogenous DNA from the environment, and therefore can serve as bacterial host cells according to the present invention.
  • competent bacterial host cells can be obtained from naturally non-competent bacterial cells using, for example, electroporation or chemical treatment (such as calcium ion treatment accompanied by exposure to elevated temperature). After uptake, the DNA plasmid is preferably neither degraded nor integrated into the genome of the bacterial host cell.
  • Bacterial host cells include Escherichia coli (E.coli) cells well known to those skilled in the art.
  • Lipid nanoparticles containing the recombinant RNA molecules comprising the 5'- and/or 3'-UTR of the present invention, pharmaceutical compositions containing the recombinant RNA molecules comprising the 5'- and/or 3'-UTR of the present invention, and treatment/prevention of diseases
  • the present invention provides a lipid nanoparticle comprising the recombinant RNA molecule, DNA molecule or vector of the present invention.
  • the lipid nanoparticles contain protonatable cationic lipids.
  • the lipid nanoparticles further contain one or more of helper lipids, structural lipids and PEG-lipids (polyethylene glycol-lipids). In some further embodiments, the lipid nanoparticles further contain the helper lipids, the structural lipids and the PEG-lipids.
  • the auxiliary lipid is a phospholipid.
  • the phospholipid is usually semi-synthetic, or it may be of natural origin or chemically modified.
  • the phospholipid includes, but is not limited to, DSPC (distearylphosphatidylcholine), DOPE (dioleoylphosphatidylethanolamine), DOPC (dioleoylphosphatidylcholine), DOPS (dioleoylphosphatidylserine), DSPG (1,2-dioctadecanoyl-sn-glycerol-3-phospho-(1'-rac-glycerol)), DPPG (dipalmitoylphosphatidylglycerol), DPPC (dipalmitoylphosphatidylcholine), DGTS (1,2-dipalmitoyl-sn-glycerol-3-O-4'-(N,N,N-trimethyl)homoserine), lysophospho
  • the structural lipid is a sterol substance, including but not limited to cholesterol, cholesterol ester, steroid hormones, steroid vitamins, bile acid, cholesterol, ergosterol, β-sitosterol and oxidized cholesterol derivatives.
  • the structural lipid is at least one selected from cholesterol, cholesterol ester, steroid hormones, steroid vitamins and bile acid.
  • the structural lipid is cholesterol, preferably high-purity cholesterol, especially injection-grade high-purity cholesterol, such as CHO-HP (produced by AVT).
  • the term PEG-lipid is a conjugate of polyethylene glycol and a lipid structure.
  • the PEG-lipid is selected from PEG-DMG and PEG-distearoylphosphatidylethanolamine (PEG-DSPE), preferably PEG-DMG.
  • the PEG-DMG is a polyethylene glycol (PEG) derivative of 1,2-dimyristoyl glycerol.
  • the average molecular weight of the PEG is about 2000 to 5000, preferably about 2000.
  • the lipid nanoparticle further contains the helper lipid, the structural lipid and the PEG-lipid.
  • the lipid nanoparticle comprises the following amount (molar percentage) of the protonatable cationic lipid, based on the total amount of the protonatable cationic lipid, the auxiliary lipid, the structural lipid and the PEG-lipid: about 25.0%-75.0%, such as about 25.0%-28.0%, 28.0%-32.0%, 32.0%-35.0%, 35.0%-40.0%, 40.0%-42.0%, 42.0%-45.0%, 45.0%-46.3%, 46.3%-48.0%, 48.0%-49.5%, 49.5%-50.0%, 50.0%-55.0%, 55.0%-60.0%, 60.0%-65.0%, or 65.0%-75.0%.
  • the present invention provides a cationic liposome comprising the recombinant RNA molecule, DNA molecule or vector of the present invention.
  • the present invention provides a cationic protein comprising the recombinant RNA molecule, DNA molecule or vector of the present invention.
  • the present invention provides a cationic polymer comprising the recombinant RNA molecule, DNA molecule or vector of the present invention.
  • the present invention provides a pharmaceutical composition, which comprises the recombinant RNA molecule, DNA molecule, vector, host cell, cationic liposome, cationic protein, cationic polymer or lipid nanoparticle of the present invention, and a pharmaceutically acceptable carrier, diluent or excipient.
  • the present invention also provides a use of the recombinant RNA molecule, DNA molecule, vector, lipid nanoparticle or pharmaceutical composition of the present invention in preparing a drug.
  • the medicament is for gene therapy, genetic vaccination, protein replacement therapy, antisense therapy, or treatment by interfering RNA.
  • the drug is a nucleic acid drug, wherein the nucleic acid comprises at least one of the following: RNA, messenger RNA (mRNA), antisense oligonucleotide, DNA, plasmid, ribosomal RNA (rRNA), microRNA (miRNA), transfer RNA (tRNA), small inhibitory RNA (siRNA), small nuclear RNA (snRNA), small hairpin RNA (shRNA), tRNA, single-stranded guide RNA (sgRNA) and Cas9 mRNA.
  • the drug is used for the treatment and/or prevention of a disease.
  • the disease is selected from the group consisting of rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renal vascular diseases, and metabolic diseases;
  • the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer, or prostate cancer;
  • the genetic disease includes one or more of hemophilia, thalassemia, and Gaucher's disease.
  • the medicament is a vaccine.
  • the recombinant molecule is used to produce an antigen of a pathogen or a portion thereof.
  • the drug is a gene therapy agent.
  • the recombinant molecule is used to produce a protein associated with a genetic disease.
  • the recombinant molecules are used to generate antibodies, such as scFvs or nanobodies.
  • the present invention also provides a method for preventing or treating a disease, comprising administering the recombinant RNA molecule or pharmaceutical composition of the present invention to a subject in need thereof.
  • the disease or condition is selected from the group consisting of rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renal vascular diseases, and metabolic diseases.
  • the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer or prostate cancer;
  • the genetic disease includes one or more of hemophilia, thalassemia, and Gaucher's disease.
  • the recombinant RNA molecule or pharmaceutical composition is used as a vaccine to prevent a disease. In some embodiments, the recombinant RNA molecule is used to produce an antigen or part thereof of a pathogen.
  • the recombinant RNA molecule is used to produce the protein associated with the genetic disease.
  • the recombinant RNA molecules are used to produce antibodies, such as scFvs or nanobodies.
  • the present invention obtains optimized UTRs through extensive screening. These UTRs can improve the translation efficiency and/or stability of mRNA and increase the expression level of polypeptides and/or proteins, and they have important application value for the research and development of mRNA vaccines.
  • the construction method is as follows:
  • the Luciferase-pcDNA3 plasmid was purchased from Addgene (plasmid #18964);
  • new restriction sites HindIII and BamHI were added upstream of the Kozak sequence;
  • new restriction sites KpnI and ApaI were added after the stop codon of luciferase to obtain plasmid B.
  • the nucleotide sequence of the Luciferase-pcDNA3 plasmid is shown in SEQ ID NO:50, and the plasmid map of Luciferase-pcDNA3 is shown in Figure 2.
  • the nucleotide sequence of plasmid B is shown in SEQ ID NO:51, and the plasmid map of plasmid B is shown in Figure 3.
  • the Amp (ampicillin) resistance gene in plasmid B was replaced with the Kana (kanamycin) resistance gene, and the neo/KanR sequence from 3746 to 4540 was removed to obtain plasmid C.
  • the plasmid map of plasmid C is shown in Figure 4, and the nucleotide sequence of plasmid C is shown in SEQ ID NO:52.
  • A poly(A) tail as shown in SEQ ID NO:53 was inserted into plasmid C. Specifically, plasmid C was digested with ApaI and the product was purified; the purified single-digestion product and the poly(A) tail as shown in SEQ ID NO:53 were joined by homologous recombination; the recombination product was transformed into DH5α; clones were screened on an LB plate containing 50 μg/mL kanamycin; and plasmids were extracted from sequence-verified clones to obtain plasmid D.
  • the plasmid map of plasmid D is shown in Figure 5, and the nucleotide sequence of plasmid D is shown in SEQ ID NO:54.
  • The construction flowchart of plasmid D is shown in Figure 1A, and the schematic diagram of the construction and transformation is shown in Figure 1B.
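The cloning scheme above relies on HindIII and BamHI sites upstream of the Kozak sequence and KpnI and ApaI sites after the stop codon. As a hedged illustration, the sketch below scans a linear DNA string for the recognition sequences of these four enzymes (AAGCTT, GGATCC, GGTACC and GGGCCC, all palindromic, so one strand is sufficient). The helper names and the toy insert are hypothetical; the real plasmids (SEQ ID NOs: 50 to 54) are circular and are not reproduced here, so a site spanning the origin would need extra handling.

```python
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "KpnI": "GGTACC",
    "ApaI": "GGGCCC",
}


def find_sites(seq: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of `site` in a linear sequence."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)  # step by 1 to catch overlapping hits
    return positions


def site_map(seq: str) -> dict[str, list[int]]:
    """Positions of each enzyme's recognition sequence in `seq`."""
    return {name: find_sites(seq, site) for name, site in RECOGNITION_SITES.items()}


# Hypothetical toy insert: HindIII, BamHI, Kozak-like context, mini ORF, KpnI, ApaI.
toy = ("TT" + "AAGCTT" + "CC" + "GGATCC" + "GCCACC"
       + "ATGGAAGACTAA" + "GGTACC" + "TT" + "GGGCCC" + "AA")

sites = site_map(toy)
unique_cutters = [name for name, hits in sites.items() if len(hits) == 1]
print(sites)
print(unique_cutters)
```

A check of this kind is useful before a double digest because an enzyme that also cuts elsewhere in the backbone would not yield a single clean fragment.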
  • the 5’-UTR sequence was synthesized by Shanghai Sangon Biotechnology Co., Ltd. According to the company’s quality analysis report, the 5’-UTR sequence was consistent with the theoretical design sequence.
  • Plasmid D was double-digested with HindIII and BamHI, and fragment A, about 6 kb in size, was recovered by gel extraction (Axygen); homology arm sequences were added to the different 5'-UTR sequences by PCR, and homologous recombination with fragment A was carried out using Takara's homologous recombination enzyme system at 50°C for 15 minutes; the reaction product was transformed into DH5α competent cells (Takara) and plated on LB plates containing 50 μg/mL kanamycin; after culturing for 16 hours, 3-4 single clones were picked and cultured in medium containing 50 μg/mL kanamycin for 8 hours, and the plasmids were extracted and sequenced by Sangon Biotech (Shanghai) Co., Ltd. to obtain plasmids containing different 5'-UTRs.
  • the 3'-UTR sequence was synthesized by commissioning Sangon Biotechnology (Shanghai) Co., Ltd. According to the company's quality analysis report, the 3'-UTR sequence was consistent with the theoretical design sequence.
  • Plasmid D was double-digested with KpnI and ApaI, and fragment A, about 6 kb in size, was recovered by gel extraction (Axygen); homology arm sequences were added to the different 3'-UTR sequences by PCR, and homologous recombination with fragment A was carried out using Takara's homologous recombination enzyme system at 50°C for 15 minutes; the reaction product was transformed into DH5α competent cells (Takara) and plated on LB plates containing 50 μg/mL kanamycin; after culturing for 16 hours, 3-4 single clones were picked and cultured in medium containing 50 μg/mL kanamycin for 8 hours, and the plasmids were extracted and sequenced by Sangon Biotech (Shanghai) Co., Ltd. to obtain plasmids containing different 3'-UTRs.
  • Enzymatic linearization: take 20 μg of plasmid and linearize it with the corresponding enzyme (BsaI).
  • the reaction conditions are 37°C for 2h. Note: The reaction system can also be scaled up according to the required production volume.
  • The Cap1 cap structure and reaction principle are as follows:
      pppN1(pN)x-OH(3') → ppN1(pN)x-OH(3') + Pi
      ppN1(pN)x-OH(3') + GTP → G(5')ppp(5')N1(pN)x-OH(3') + PPi
      G(5')ppp(5')N1(pN)x-OH(3') + AdoMet → m7G(5')ppp(5')N1(pN)x-OH(3') + AdoHcy
      m7GpppN1(pN)x-OH(3') + AdoMet → m7Gppp[m2'-O]N1(pN)x-OH(3') + AdoHcy
  • the amount of mRNA capped per reaction should not exceed 60 μg.
  • the pre-heated mRNA was mixed with the above system and incubated at 37°C for 1 h.
  • Characterization data: determine the concentration using OneDrop or Qubit.
  • HEK293 cells were transfected with mRNA containing different 5'-UTRs. Specifically, HEK293 cells were added to a 96-well plate at 40,000 cells/well one day in advance, and cell transfection was performed when the cell confluence reached 70% to 90% the next day.
  • mRNAs containing different 5'-UTRs were transfected into HEK293 cells, with 100 ng of mRNA per well. After incubation in a 37°C CO2 incubator for 16 h, luciferase reporter assay was performed (following the instructions of the Promega kit).
  • HEK293 cells were transfected with mRNA containing different 3'-UTRs. Specifically, HEK293 cells were plated on a 96-well plate at 40,000 cells/well one day in advance, and cell transfection was performed when the cell confluence reached 70% to 90% the next day.
  • mRNAs containing different 3'-UTRs were transfected into HEK293 cells, with 100 ng of mRNA per well. After incubation in a 37°C CO2 incubator for 16 h, luciferase reporter assay was performed (following the instructions of the Promega kit).
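The plating and transfection quantities above (40,000 cells per well, 100 ng of mRNA per well) scale linearly with the number of wells. The short sketch below works out the totals for a screen; the number of replicates per construct is an assumption of this sketch, as the text does not state it, and the function name is hypothetical. The 21 constructs in the usage line correspond to the 21 5'-UTR sequences (SEQ ID NOs: 1 to 21).

```python
CELLS_PER_WELL = 40_000   # from the protocol above
MRNA_NG_PER_WELL = 100    # from the protocol above


def plate_requirements(n_constructs: int, replicates: int) -> dict[str, int]:
    """Total wells, cells and mRNA needed to test each construct in replicate wells."""
    if n_constructs < 1 or replicates < 1:
        raise ValueError("n_constructs and replicates must be at least 1")
    wells = n_constructs * replicates
    return {
        "wells": wells,
        "cells_total": wells * CELLS_PER_WELL,
        "mrna_ng_per_construct": replicates * MRNA_NG_PER_WELL,
        "mrna_ng_total": wells * MRNA_NG_PER_WELL,
    }


# Assumed design: 21 constructs, 3 replicate wells each (replicate count is hypothetical).
plan = plate_requirements(n_constructs=21, replicates=3)
print(plan)
```

With these assumed numbers the screen occupies 63 wells, which fits on a single 96-well plate with room left for controls.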
  • Step 2: Using the series of plasmids from step 1, a series of mRNAs each containing 5UTR-NO54 and one of the different 3'-UTRs was prepared with reference to Example 4, and the respective LNP-mRNA preparations were then prepared. The steps for preparing the LNP-mRNA preparations included:
  • LNP encapsulation: the volume of the preparation solution (the total volume of the aqueous phase and the alcohol phase of each system) is 1.5 mL, the mRNA:lipid mass ratio is 1:10, and the final concentration of HAc-NaOAc buffer (0.2 M stock, pH 5.0) in the aqueous phase is 0.025 M.
  • SM-102 lipids dissolved in ethanol
  • chemical structure of SM-102 is as follows:
  • SM-102 is commercially available or can be prepared according to techniques known in the art.
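The encapsulation parameters given above (1.5 mL total volume, mRNA:lipid mass ratio of 1:10, and HAc-NaOAc buffer diluted from 0.2 M to 0.025 M in the aqueous phase) can be turned into volumes and masses with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the aqueous:alcohol volume ratio of 3:1 and the 100 μg mRNA input are hypothetical values chosen for illustration, since neither is given in the text, and the function name is likewise introduced here.

```python
def lnp_plan(mrna_ug: float,
             total_ml: float = 1.5,        # from the text
             aq_to_alc: float = 3.0,       # ASSUMED aqueous:alcohol volume ratio
             stock_m: float = 0.2,         # HAc-NaOAc stock, from the text
             final_m: float = 0.025,       # target in the aqueous phase, from the text
             lipid_per_mrna: float = 10.0  # mass ratio mRNA:lipid = 1:10, from the text
             ) -> dict[str, float]:
    """Phase volumes, buffer stock volume (C1*V1 = C2*V2) and total lipid mass."""
    if mrna_ug <= 0 or total_ml <= 0 or aq_to_alc <= 0:
        raise ValueError("inputs must be positive")
    aqueous_ml = total_ml * aq_to_alc / (aq_to_alc + 1.0)
    alcohol_ml = total_ml - aqueous_ml
    buffer_stock_ml = aqueous_ml * final_m / stock_m
    return {
        "aqueous_ml": aqueous_ml,
        "alcohol_ml": alcohol_ml,
        "buffer_stock_ml": buffer_stock_ml,
        "lipid_ug": mrna_ug * lipid_per_mrna,
    }


plan = lnp_plan(mrna_ug=100.0)  # hypothetical mRNA input
for key, value in plan.items():
    print(f"{key}: {value:.4f}")
```

The dilution factor is fixed by the text at 0.025/0.2 = 1/8, so the buffer stock always makes up one eighth of the aqueous phase regardless of the assumed phase ratio.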
  • 1× PBS + 8% (m/V) sucrose solution: dissolve 2 packets of 1× PBS pre-mixed powder in 2 L of DEPC-treated water in a beaker and mix, then add 160 g of sucrose and mix to obtain the 1× PBS + 8% (m/V) sucrose solution.
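As a quick check of the recipe above, the mass/volume percentage follows directly from the stated quantities, taking the nominal volume as 2 L (the small volume increase from dissolving the sucrose is ignored, which is an assumption of this worked example):

```latex
\[
\frac{160\ \text{g}}{2000\ \text{mL}} \times 100\% = 8\%\ (\text{m/V}),
\qquad\text{i.e. } 8\ \text{g of sucrose per } 100\ \text{mL}.
\]
```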
  • The mice used in the experiment were female BALB/c mice, 6 weeks old, with 3 mice in each group; each mouse was injected with 12 μg of LNP-mRNA preparation via the tail vein. After 12 hours, the mice were anesthetized by isoflurane inhalation and injected with the luciferase substrate D-luciferin (150 mg/kg). The animals were then placed in a supine position, and the distribution and intensity of the luciferase signal in the mice were observed using the IVIS in vivo imaging system.
  • Step 2: Using the series of plasmids from step 1, a series of mRNAs each containing 3UTR-NO3 and one of the different 5'-UTRs was prepared as described above, and the corresponding LNP-mRNA preparations were then prepared.
  • The mice used in the experiment were female BALB/c mice, 6 weeks old, with 3 mice in each group; each mouse was injected with 12 μg of LNP-mRNA preparation via the tail vein. After 12 hours, the mice were anesthetized by isoflurane inhalation and injected with the luciferase substrate D-luciferin (150 mg/kg). The animals were then placed in a supine position, and the distribution and intensity of the luciferase signal in the mice were observed using the IVIS in vivo imaging system.
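The dosing figures in the in vivo protocol above (12 μg of LNP-mRNA preparation per mouse, 3 mice per group, D-luciferin at 150 mg/kg) can be converted into per-animal and per-study amounts as sketched below. The 20 g body weight, the 15 mg/mL D-luciferin stock concentration, and the number of groups are hypothetical values for illustration only; none of them is given in the text, and the function names are introduced here.

```python
LUCIFERIN_MG_PER_KG = 150.0  # from the protocol above
DOSE_UG_PER_MOUSE = 12.0     # LNP-mRNA preparation per mouse, from the protocol above
MICE_PER_GROUP = 3           # from the protocol above


def luciferin_dose_mg(body_weight_g: float) -> float:
    """D-luciferin dose for one mouse, scaled by body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return LUCIFERIN_MG_PER_KG * body_weight_g / 1000.0


def injection_volume_ml(dose_mg: float, stock_mg_per_ml: float) -> float:
    """Volume of substrate stock that delivers `dose_mg`."""
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return dose_mg / stock_mg_per_ml


def study_dose_ug(n_groups: int) -> float:
    """Total LNP-mRNA dose needed for `n_groups` groups, without overage."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    return n_groups * MICE_PER_GROUP * DOSE_UG_PER_MOUSE


dose = luciferin_dose_mg(body_weight_g=20.0)             # assumed 20 g mouse
volume = injection_volume_ml(dose, stock_mg_per_ml=15.0)  # assumed 15 mg/mL stock
total = study_dose_ug(n_groups=4)                         # assumed 4 groups
print(f"{dose:.2f} mg luciferin, {volume:.2f} mL injected, {total:.0f} ug preparation")
```

In practice each animal would be weighed individually and an overage added for dead volume; both are left out of this sketch.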

Abstract

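Provided is a recombinant RNA molecule comprising a nucleotide sequence encoding a polypeptide of interest and a 5'-UTR and/or a 3'-UTR. The RNA molecule has improved translation efficiency and/or stability, which increases the expression level of the polypeptide and/or protein. Also provided are a vector encoding the recombinant RNA molecule, a pharmaceutical composition comprising the recombinant RNA molecule, and a method of treating or preventing a disease using the recombinant RNA molecule.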

Description

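UTRs that improve the translation efficiency and/or stability of RNA molecules, and uses thereof
Technical field:
The present invention belongs to the field of biotechnology and relates to a 5' and/or 3' untranslated region (UTR) that improves the translation efficiency and/or stability of an RNA molecule; a nucleic acid molecule comprising the UTR; a vector comprising the UTR or the nucleic acid molecule; a pharmaceutical composition comprising the UTR, the nucleic acid molecule or the vector; and uses of the UTR, the nucleic acid molecule, the vector and the pharmaceutical composition.
Background: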
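mRNA (messenger RNA) is a single-stranded RNA molecule. It is synthesized in the nucleus using genomic DNA as a template, exported to the cytoplasm, and translated on ribosomes into a specific protein that exerts a biological effect. mRNA has the potential to direct the synthesis of any protein. Because it is economical, safe, fast and flexible, mRNA drugs have great potential in the prevention of infectious diseases and in the treatment of many diseases, including cancer and rare diseases.
mRNA vaccines are already in clinical use for the prevention of infectious diseases. Compared with recombinant protein subunit vaccines, inactivated vaccines or DNA vaccines, mRNA vaccines have several clear advantages. First, mRNA is non-infectious and does not integrate into genomic DNA, which greatly improves its safety. Second, modified bases can reduce the innate immunogenicity of the mRNA molecule and slow its degradation in the body, further improving the safety and stability of mRNA vaccines. In addition, in vitro mRNA synthesis gives high yields, so mRNA vaccines have the potential to be efficient, rapidly developed, manufactured at low cost and easy to manage safely.
The untranslated regions (UTRs) of an mRNA control the translation, degradation and localization of the transcript. They include stem-loop structures, upstream start codons, upstream open reading frames, internal ribosome entry sites and various cis-acting elements that are bound by RNA-binding proteins.
UTRs play a crucial role in the post-transcriptional regulation of gene expression, including regulation of mRNA nuclear export, translation efficiency, subcellular localization and stability. UTRs may also serve other functions, such as the specific incorporation of the modified amino acid selenocysteine at UGA codons of mRNAs encoding selenoproteins, a process mediated by a conserved stem-loop structure in the 3'-UTR.
Those skilled in the art know that the translation efficiency of an mRNA molecule directly affects the dose and dosing interval of an mRNA drug (especially an mRNA vaccine), and ultimately affects its bioavailability and determines its clinical value. Although technical solutions for increasing the translation efficiency and stability of mRNA molecules already exist, for example adding untranslated regions (UTRs), there remains a need for further improvement in translation efficiency.
Therefore, there is a need to identify or design UTRs that achieve higher mRNA translation efficiency and/or stability.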
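Summary of the invention
To solve the above problem, the inventors identified multiple 5'-UTRs and/or 3'-UTRs that improve the translation efficiency and/or stability of mRNA molecules. These UTRs are general-purpose core elements: they confer enhanced translation efficiency on mRNA molecules that contain them, significantly increase target gene expression and/or mRNA stability, and have broad application value in the industrialization of mRNA drugs.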
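In a first aspect, the present invention provides a recombinant RNA molecule comprising: (1) a first nucleotide sequence encoding a polypeptide and/or protein of interest; and (2) a second nucleotide sequence containing a 5'-untranslated region (5'-UTR); wherein the 5'-UTR comprises at least one polynucleotide selected from: (a) a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b) a fragment of the 5'-UTR of (a); (c) a variant of the 5'-UTR of (a); and (d) a variant of the fragment of (b); and wherein the first nucleotide sequence and the second nucleotide sequence do not naturally occur in the same RNA molecule.
In some embodiments, the gene is a human gene.
In some embodiments, the second nucleotide sequence comprises at least one of the following polynucleotides: (e) a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f) a fragment of the 5'-UTR of (e); (g) a variant of the 5'-UTR of (e); and (h) a variant of the fragment of (f).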
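In some embodiments, the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of that RNA, a variant of that RNA, and a variant of a fragment of that RNA; preferably, the variant, the fragment and the variant of the fragment have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21.
In some embodiments, the second nucleotide sequence comprises at least one of: the 5'-UTRs of at least two of the genes listed in (a), fragments of the 5'-UTRs of at least two of those genes, variants of the 5'-UTRs of at least two of those genes, and variants of fragments of the 5'-UTRs of at least two of those genes.
In some embodiments, the second nucleotide sequence comprises at least one of: at least two copies of a 5'-UTR of (a), at least two copies of a fragment of a 5'-UTR of (a), at least two copies of a variant of a 5'-UTR of (a), and at least two copies of a variant of a fragment of a 5'-UTR of (a).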
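In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 3'-UTR and a poly(A) tail.
In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR and a poly(A) tail.
In some embodiments, the 5'-cap structure includes at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2'-O)N1. In some embodiments, the 3'-UTR comprises: i) a 3'-UTR derived from at least one of the albumin gene, the α-globin gene, the β-globin gene, the tyrosine hydroxylase gene, the lipoxygenase gene and the collagen α gene; ii) a variant of the 3'-UTR of i); iii) at least one of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2 and GH1, a fragment thereof, a variant thereof and a variant of the fragment; preferably, at least one of an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 49, a fragment of that RNA, a variant of that RNA and a variant of a fragment of that RNA; preferably, the variant, the fragment and the variant of the fragment have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 49; iv) at least two copies of one polynucleotide of i), ii) or iii); or v) at least two polynucleotides from the group consisting of the polynucleotides of i) to iii).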
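In some embodiments, the nucleotides making up the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the nucleotides making up the poly(A) tail include one or more nucleotides other than A nucleotides; optionally, they comprise two or more consecutive nucleotides other than A nucleotides.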
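
The poly(A) criteria in the preceding paragraph (total A count, longest consecutive A run, presence of non-A nucleotides) can be checked mechanically. The following is a minimal sketch; the function names and the toy tail with a two-nucleotide "GC" interruption are hypothetical and are not sequences from the source.

```python
import re


def poly_a_stats(tail):
    """Summarize a candidate poly(A) tail given as an RNA string."""
    tail = tail.upper()
    total_a = tail.count("A")
    # Lengths of all maximal runs of consecutive A nucleotides.
    runs = [len(run) for run in re.findall(r"A+", tail)]
    return {
        "total_a": total_a,
        "longest_run": max(runs, default=0),
        "non_a": len(tail) - total_a,
    }


def meets_threshold(tail, min_a, consecutive=False):
    """True if the tail has at least min_a A nucleotides (optionally consecutive)."""
    stats = poly_a_stats(tail)
    key = "longest_run" if consecutive else "total_a"
    return stats[key] >= min_a


if __name__ == "__main__":
    toy_tail = "A" * 30 + "GC" + "A" * 70  # hypothetical interrupted tail
    print(poly_a_stats(toy_tail))
    print(meets_threshold(toy_tail, 100))
    print(meets_threshold(toy_tail, 100, consecutive=True))
```

The toy tail satisfies "at least 100 A nucleotides" but not "at least 100 consecutive A nucleotides", which is the distinction the paragraph draws between the general and the preferred embodiment.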
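In a second aspect, the present invention further provides another recombinant RNA molecule comprising: (1) a first nucleotide sequence encoding a polypeptide and/or protein of interest; and (2) a second nucleotide sequence containing a 3'-untranslated region (3'-UTR); wherein the 3'-UTR comprises at least one polynucleotide selected from: (a) a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b) a fragment of the 3'-UTR of (a); (c) a variant of the 3'-UTR of (a); and (d) a variant of the fragment of (b); and wherein the first nucleotide sequence and the second nucleotide sequence do not naturally occur in the same RNA molecule.
In some embodiments, the gene is a human gene.
In some embodiments, the second nucleotide sequence comprises at least one of the following polynucleotides: (e) a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1; (f) a fragment of the 3'-UTR of (e); (g) a variant of the 3'-UTR of (e); and (h) a variant of the fragment of (f). In some embodiments, the second nucleotide sequence comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, a fragment of that RNA, a variant of that RNA, and a variant of a fragment of that RNA; preferably, the variant, the fragment and the variant of the fragment have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48.
In some embodiments, the second nucleotide sequence comprises at least one of: the 3'-UTRs of at least two of the genes listed in (a), fragments of the 3'-UTRs of at least two of those genes, variants of the 3'-UTRs of at least two of those genes, and variants of fragments of the 3'-UTRs of at least two of those genes.
In some embodiments, the second nucleotide sequence comprises at least one of: at least two copies of a 3'-UTR of (a), at least two copies of a fragment of a 3'-UTR of (a), at least two copies of a variant of a 3'-UTR of (a), and at least two copies of a variant of a fragment of a 3'-UTR of (a).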
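In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 5'-UTR and a poly(A) tail.
In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR and a poly(A) tail.
In some embodiments, the 5'-cap structure includes at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 or m7Gppp(m2'-O)N1.
In some embodiments, the 5'-UTR comprises: i) at least one of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7, a fragment thereof, a variant thereof and a variant of the fragment; preferably, an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of that RNA, a variant of that RNA and a variant of a fragment of that RNA; preferably, the variant, the fragment and the variant of the fragment have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21; ii) at least two copies of one of the polynucleotides of i); or iii) at least two of the polynucleotides of i).
In some embodiments, the nucleotides making up the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the nucleotides making up the poly(A) tail comprise one or more nucleotides other than A nucleotides.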
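In a third aspect, the present invention provides a DNA molecule encoding the recombinant RNA molecule of the present invention.
In a fourth aspect, the present invention provides a vector comprising the recombinant RNA molecule or the DNA molecule of the present invention.
In a fifth aspect, the present invention provides a host cell comprising the recombinant RNA molecule, the DNA molecule or the vector of the present invention.
In a sixth aspect, the present invention provides a lipid nanoparticle comprising the recombinant RNA molecule of the present invention.
In a seventh aspect, the present invention provides a pharmaceutical composition comprising the recombinant RNA molecule, the DNA molecule, the vector, the host cell or the lipid nanoparticle of the present invention, and a pharmaceutically acceptable carrier.
In an eighth aspect, the present invention provides a vector comprising a first nucleotide sequence encoding a 5'-UTR and/or a second nucleotide sequence encoding a 3'-UTR, wherein: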
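the first nucleotide sequence comprises at least one of the following polynucleotides: (a) a polynucleotide encoding a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b) a polynucleotide encoding a fragment of the 5'-UTR of (a); (c) a polynucleotide encoding a variant of the 5'-UTR of (a); and (d) a polynucleotide encoding a variant of the fragment of (b);
the second nucleotide sequence comprises at least one of the following polynucleotides: (e) a polynucleotide encoding a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (f) a polynucleotide encoding a fragment of the 3'-UTR of (e); (g) a polynucleotide encoding a variant of the 3'-UTR of (e); and (h) a polynucleotide encoding a variant of the fragment of (f).
In some embodiments, the gene is a human gene.
In some embodiments, the first nucleotide sequence comprises a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, or a variant thereof.
In some embodiments, the second nucleotide sequence comprises a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1, or a variant thereof.
In some embodiments, the vector comprises both the first nucleotide sequence and the second nucleotide sequence.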
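In some embodiments, the first nucleotide sequence comprises: i) a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of that polynucleotide, a variant of that polynucleotide and a variant of a fragment of that polynucleotide; preferably, the variant, the fragment and the variant of the fragment have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% homology to the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21; ii) at least two copies of one polynucleotide of i); or iii) at least two of the polynucleotides of i).
In some embodiments, the second nucleotide sequence comprises: (1) a polynucleotide encoding a 3'-UTR derived from at least one of the albumin gene, the α-globin gene, the β-globin gene, the tyrosine hydroxylase gene, the lipoxygenase gene and the collagen α gene; (2) a polynucleotide encoding a variant of the 3'-UTR of (1); (3) at least one of a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, a fragment of that polynucleotide, a variant of that polynucleotide and a variant of a fragment of that polynucleotide; preferably, the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, the encoded fragment and the variant of the encoded fragment have 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48; (4) at least two copies of one polynucleotide of (1), (2) or (3); or (5) at least two polynucleotides from the group consisting of the polynucleotides of (1) to (3).
In some embodiments, the vector further comprises a polynucleotide encoding a poly(A) tail. In some embodiments, the nucleotides making up the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the nucleotides making up the poly(A) tail comprise one or more nucleotides other than A nucleotides.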
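In a ninth aspect, the present invention provides use of the recombinant RNA molecule, DNA molecule, vector, host cell, lipid nanoparticle or pharmaceutical composition of the present invention in the manufacture of a medicament; preferably, the medicament is for gene therapy, genetic vaccination or protein replacement therapy.
In some embodiments, the medicament is a nucleic acid drug, wherein the nucleic acid includes at least one of: RNA, messenger RNA (mRNA), DNA, plasmid, ribosomal RNA (rRNA), single guide RNA (sgRNA) and Cas9 mRNA.
In some embodiments, the medicament is for the treatment and/or prevention of a disease; preferably, the disease is selected from the group consisting of rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renovascular diseases and metabolic diseases; preferably, the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer and prostate cancer; the genetic disease includes one or more of hemophilia, thalassemia and Gaucher disease.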
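Brief description of the drawings
Figure 1A shows the construction flow chart of plasmid D;
Figure 1B shows a schematic of the construction and modification of plasmid D;
Figure 2 shows the map of plasmid luciferase-pcDNA3;
Figure 3 shows the map of plasmid B;
Figure 4 shows the map of plasmid C;
Figure 5 shows the map of plasmid D;
Figure 6 shows luciferase expression in Example 5 after HEK293 cells were transfected with mRNAs containing different 5'-UTRs;
Figure 7 shows luciferase expression in Example 6 after HEK293 cells were transfected with mRNAs containing different 3'-UTRs;
Figure 8 shows in vivo expression in mice of mRNAs with the same 5'-UTR but different 3'-UTRs;
Figure 9 shows in vivo expression in mice of mRNAs with the same 3'-UTR but different 5'-UTRs.
Detailed description of the invention
I. Definitions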
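All patents, patent applications, scientific publications, manufacturers' instructions and guidelines cited herein, whether above or below, are incorporated herein by reference in their entirety. Nothing herein is to be construed as an admission that the present disclosure is not entitled to antedate such disclosure.
Unless otherwise stated, scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. The terms used herein relating to protein and nucleic acid chemistry, molecular biology, cell and tissue culture, and microbiology are terms widely used in the respective fields (see, for example, Molecular Cloning: A Laboratory Manual, 2nd Edition, J. Sambrook et al. eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor 1989). For a better understanding of the present invention, definitions and explanations of relevant terms are provided below.
As used herein, the expressions "including", "comprising", "containing" and "having" are open-ended: they include the listed elements, steps or components without excluding other, unlisted elements, steps or components. The expression "consisting of" excludes any element, step or component not specified. The expression "consisting essentially of" means that the scope is limited to the specified elements, steps or components, plus optional elements, steps or components that do not materially affect the basic and novel characteristics of the claimed subject matter. It should be understood that the expressions "consisting essentially of" and "consisting of" fall within the meaning of the expression "comprising".
As used herein, unless the context indicates otherwise, the singular forms "a", "an" and "the" and similar references used in the context of describing the present invention (especially in the context of the claims) are to be construed as covering both the singular and the plural. The term "one or more" or "at least one" covers 1, 2, 3, 4, 5, 6, 7, 8, 9 or more.
Numerical ranges recited herein should be understood to cover any and all subranges subsumed therein. For example, the range "1 to 10" should be understood to include not only the explicitly recited values 1 and 10, but also any single value within the range 1 to 10 (for example 2, 3, 4, 5, 6, 7, 8 and 9) and any subrange (for example 1 to 2, 1.5 to 2.5, 1 to 3, 1.5 to 3.5, 2.5 to 4, 3 to 4.5 and so on). This principle also applies to ranges that use only one value as a minimum or maximum.
Unless otherwise stated, all methods described herein can be performed in any suitable order.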
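As used herein, the term "wild-type" means that a sequence is naturally occurring and has not been artificially modified, including naturally occurring mutants. The term "fragment" or "fragment of a nucleic acid" refers to a part of a nucleic acid, for example a nucleic acid shortened at the 5' and/or 3' end. A fragment of a nucleic acid comprises at least 50%, 60%, 70% or 80% of the nucleotide residues of that nucleic acid, preferably at least 70% or 80%, and more preferably at least 90%, 95%, 96%, 97%, 98% or 99%. A fragment is generally a shorter portion of the full-length nucleic acid.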
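
The fragment definition above amounts to two checks: the fragment is a contiguous part of the parent, and it retains at least a stated fraction of the parent's residues. A minimal sketch follows, assuming sequences are plain strings; the function names and the toy sequence are hypothetical.

```python
def fragment_fraction(fragment, parent):
    """Fraction of the parent's residues retained by a contiguous fragment.

    Returns None when the fragment is empty or is not a contiguous
    substring of the parent.
    """
    if not parent or not fragment or fragment not in parent:
        return None
    return len(fragment) / len(parent)


def is_fragment(fragment, parent, min_fraction=0.5):
    """True if `fragment` is a contiguous part covering >= min_fraction of `parent`."""
    fraction = fragment_fraction(fragment, parent)
    return fraction is not None and fraction >= min_fraction


if __name__ == "__main__":
    parent = "ACGUACGUAC"       # toy 10-nt sequence, not from the source
    truncated = parent[1:9]     # one residue removed at each end
    print(fragment_fraction(truncated, parent))
    print(is_fragment(truncated, parent, min_fraction=0.7))
```

The sketch covers only truncation-type fragments; it does not attempt to judge whether a fragment keeps the function of the parent sequence.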
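With respect to nucleic acids, the term "variant" refers to a nucleic acid variant that differs from a reference nucleic acid (also called the "parent") in at least one nucleotide. Compared with the reference nucleic acid, a variant nucleic acid includes single or multiple nucleotide deletions, additions, mutations and/or insertions, wherein: a deletion is the removal of one or more nucleotides from the reference nucleic acid; an addition is the fusion of one or more nucleotides (for example 1, 2, 3, 5, 10, 20, 30, 50 or more nucleotides) to the 5' and/or 3' end of the reference nucleic acid; a mutation may include, but is not limited to, a substitution (for example, at least one nucleotide is removed and another nucleotide is inserted in its place, such as a transversion or a transition); and an insertion is the addition of at least one nucleotide. The term "nucleic acid variant" as used herein includes naturally occurring variants and engineered variants. A "nucleic acid variant" as defined herein may therefore be derived from, isolated from, related to, based on or homologous to the reference nucleic acid sequence. A "nucleic acid variant" may optionally have at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, preferably at least 70%, more preferably at least 80%, even more preferably at least 85%, even more preferably at least 90%, and most preferably at least 95% or even 97% sequence identity to the corresponding naturally occurring (wild-type) nucleic acid or a homolog, fragment or derivative thereof. It is understood that, for nucleic acid molecules, the term "variant" includes degenerate nucleic acid sequences, where a degenerate nucleic acid sequence according to the present invention is a nucleic acid that differs from the reference nucleic acid in codon sequence because of the degeneracy of the genetic code.
In some embodiments, the nucleic acid variant is a variant of a 5'-UTR, a variant of a 3'-UTR or a variant of a poly(A). In some embodiments, the mutation introduced into the nucleic acid variant allows the variant to avoid cleavage following recognition by a nuclease, to avoid binding to microRNA, or to avoid forming complex secondary structures such as hairpins or G-quadruplexes. For example, the nucleic acid variant is a variant of the 3'-UTR derived from the ALB gene (GenBank accession number NM_000477.7) in which one "A" of the ALB 3'-UTR is mutated to a "C" to avoid recognition and cleavage by a nuclease.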
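As used herein, the term "% identity" refers to the percentage of nucleotides or amino acids that are identical in an optimal alignment of the sequences being compared. Differences between two sequences may be distributed over local regions (segments) or over the entire length of the sequences being compared. Identity between two sequences is usually determined after optimal alignment of a segment or "comparison window". Optimal alignment can be performed manually or with algorithms known in the art. Such algorithms include, but are not limited to, the homology alignment algorithms described by Smith and Waterman, 1981, Adv. Appl. Math. 2, 482 and Needleman and Wunsch, 1970, J. Mol. Biol. 48, 443, the similarity search method described by Pearson and Lipman, 1988, Proc. Natl Acad. Sci. USA 85, 2444, or computer programs such as GAP, BESTFIT, FASTA, BLAST P, BLAST N and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive, Madison, Wis. For example, the percent identity of two sequences can be determined using the BLASTN or BLASTP algorithm publicly available on the website of the US National Center for Biotechnology Information (NCBI).
"% identity" can be obtained by determining the number of identical positions between the sequences being compared, dividing that number by the number of positions compared (for example, the number of positions in the reference sequence), and multiplying the result by 100. In some embodiments, the degree of identity is given for a region of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95% or about 100% of the reference sequence. In some embodiments, the degree of identity is given for the entire length of the reference sequence. Alignments for determining sequence identity can be performed with tools known in the art, preferably using the best sequence alignment, for example Align with standard settings, preferably EMBOSS::needle, Matrix: Blosum62, Gap Open 10.0, Gap Extend 0.5.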
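
The counting rule described above (identical positions divided by positions compared, times 100) can be illustrated on a pair of sequences that has already been aligned. This is a minimal sketch, not a replacement for an alignment tool: it assumes the alignment is supplied with "-" as the gap character, and the function name, the `denominator` parameter and the toy sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_ref, denominator="alignment"):
    """Percent identity of two pre-aligned, equal-length sequences.

    denominator="alignment": divide by the number of alignment columns.
    denominator="reference": divide by the number of reference residues
    (columns where the reference is not a gap).
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(
        1 for q, r in zip(aligned_query, aligned_ref) if q == r and q != "-"
    )
    if denominator == "alignment":
        positions = len(aligned_ref)
    elif denominator == "reference":
        positions = sum(1 for r in aligned_ref if r != "-")
    else:
        raise ValueError("denominator must be 'alignment' or 'reference'")
    if positions == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / positions


if __name__ == "__main__":
    query = "ACGUACGU"      # toy sequences, not from the source
    reference = "ACG-ACGA"
    print(percent_identity(query, reference))
    print(percent_identity(query, reference, denominator="reference"))
```

The two denominators give different values for the same alignment, which is why a stated identity threshold is only meaningful together with the alignment settings and the choice of compared positions.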
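Preferably, a fragment or variant of a specific nucleic acid, or a nucleic acid having a specified degree of identity to a specific nucleic acid, has at least one functional property of that specific nucleic acid and is preferably functionally equivalent to it, for example a nucleic acid exhibiting properties identical or similar to those of the specific nucleic acid.
Herein, "nucleotide" includes ribonucleotides, deoxyribonucleotides, deoxyribonucleotide derivatives and ribonucleotide derivatives. As used herein, a "ribonucleotide" is a building block of ribonucleic acid (RNA) consisting of one base, one pentose and one phosphate; it is a nucleotide having a hydroxyl group at the 2' position of the β-D-ribofuranosyl group. A "deoxyribonucleotide" is a building block of deoxyribonucleic acid (DNA), likewise consisting of one base, one pentose and one phosphate; it is a nucleotide in which the hydroxyl group at the 2' position of the β-D-ribofuranosyl group is replaced by hydrogen, and it is the main chemical component of chromosomes.
A "nucleotide" is usually referred to by the single letter representing its base: "A" or "A nucleotide" refers to an adenine deoxyribonucleotide or adenine ribonucleotide, which contains adenine; "C" or "C nucleotide" refers to a cytosine deoxyribonucleotide or cytosine ribonucleotide, which contains cytosine; "G" or "G nucleotide" refers to a guanine deoxyribonucleotide or guanine ribonucleotide, which contains guanine; "U" or "U nucleotide" refers to a uracil ribonucleotide, which contains uracil; and "T" or "T nucleotide" refers to a thymine deoxyribonucleotide, which contains thymine.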
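As used herein, the term "nucleic acid" generally refers to any compound comprising a polymer of deoxyribonucleotides (deoxyribonucleic acid, DNA), a polymer of ribonucleotides (ribonucleic acid, RNA) or a combination thereof. Nucleic acids herein also include derivatives of nucleic acids. The term "derivative of a nucleic acid" includes chemical derivatization of a nucleic acid on the base, the sugar or the phosphate of a nucleotide, as well as nucleic acids containing non-natural nucleotides and nucleotide analogs. Furthermore, a nucleic acid herein may be in the form of a single-stranded or double-stranded, linear or covalently closed circular molecule.
"Polynucleotide sequence", "nucleic acid sequence" and "nucleotide sequence" are used interchangeably to denote the order of nucleotides in a polynucleotide. Those skilled in the art will understand that a DNA coding strand (sense strand) and the RNA it encodes can be regarded as having the same nucleotide sequence, with each deoxythymidylate in the DNA coding strand corresponding to a uridylate in the encoded RNA. The RNA encoded by a DNA is the RNA corresponding to that DNA, that is, the polynucleotide obtained by replacing every T nucleotide in the DNA with a U nucleotide.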
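
The convention stated above, that the RNA encoded by a DNA coding strand is obtained by replacing every T with U, is simple enough to express directly. A minimal sketch follows; the function name and the toy sequence are hypothetical, and the input is assumed to be the coding (sense) strand written 5' to 3'.

```python
def dna_coding_to_rna(dna):
    """Return the RNA corresponding to a DNA coding-strand sequence (T -> U)."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return dna.replace("T", "U")


if __name__ == "__main__":
    print(dna_coding_to_rna("ATGGCTTAA"))  # toy sequence, not from the source
```

Under this convention a SEQ ID NO given as DNA and the "RNA encoded by" it differ only in the T/U alphabet, so length and position numbering are unchanged.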
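A polynucleotide may comprise one segment or multiple segments (nucleic acid fragments), for example 1, 2, 3, 4, 5, 6, 7 or 8 segments. For example, a polynucleotide may comprise a segment encoding a polypeptide of interest (such as the polypeptides and polypeptide antigens described herein). In particular embodiments, a polynucleotide may comprise a segment encoding a polypeptide of interest together with regulatory segments (including, but not limited to, segments for transcriptional and translational regulation). In one embodiment, the regulatory segments comprise polynucleotides corresponding to one or more of the following regulatory elements: a promoter, a 5' untranslated region (5'-UTR), a 3' untranslated region (3'-UTR) and a poly(A) tail.
The term "promoter" refers to a polynucleotide located upstream of the 5' end of the coding region of a gene. It contains conserved sequences required for specific binding of RNA polymerase and for transcription initiation, and it activates RNA polymerase so that the polymerase binds the template DNA accurately and initiates transcription specifically. Promoters may be derived from sources including viruses, bacteria, fungi, plants, insects and animals. Representative examples include the phage T7 promoter, phage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter and CMV IE promoter. As used herein, the term "5' untranslated region" or "5'-UTR" may be an RNA sequence in an mRNA that is located upstream of the coding sequence and is not translated into protein. The 5'-UTR of a gene usually starts at the transcription start site and ends at the nucleotide immediately upstream of the translation start codon of the coding sequence. The 5'-UTR may contain elements that control gene expression, such as ribosome binding sites, 5'-terminal oligopyrimidine tracts and translation initiation signals such as the Kozak sequence. An mRNA can be post-transcriptionally modified by addition of a 5' cap, so the 5'-UTR in a mature mRNA may also refer to the RNA sequence between the 5' cap and the start codon. As used herein, the term "3' untranslated region" or "3'-UTR" may be an RNA sequence in an mRNA that is located downstream of the coding sequence and is not translated into protein. The 3'-UTR of an mRNA lies between the stop codon of the coding sequence and the poly(A) sequence, for example starting at the nucleotide downstream of the stop codon and ending at the nucleotide upstream of the poly(A) sequence.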
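
The positional definitions above (5'-UTR upstream of the start codon, 3'-UTR between the stop codon and the poly(A) sequence) can be illustrated by slicing an mRNA string at known coordinates. This is a minimal sketch under stated assumptions: the coding-sequence coordinates and the poly(A) length are supplied by the caller rather than detected, the 5' cap is not modeled, and the names and toy sequence are hypothetical.

```python
from typing import NamedTuple


class MrnaRegions(NamedTuple):
    utr5: str
    cds: str
    utr3: str
    poly_a: str


def split_mrna(mrna, cds_start, cds_end, poly_a_len=0):
    """Split an mRNA string into 5'-UTR, CDS, 3'-UTR and poly(A) tail.

    cds_start: 0-based index of the first nucleotide of the start codon.
    cds_end:   0-based index just past the last nucleotide of the stop codon.
    poly_a_len: number of 3'-terminal nucleotides treated as the poly(A) tail.
    """
    tail_start = len(mrna) - poly_a_len
    if not (0 <= cds_start < cds_end <= tail_start):
        raise ValueError("inconsistent coordinates")
    return MrnaRegions(
        utr5=mrna[:cds_start],
        cds=mrna[cds_start:cds_end],
        utr3=mrna[cds_end:tail_start],
        poly_a=mrna[tail_start:],
    )


if __name__ == "__main__":
    # Toy transcript: 6-nt 5'-UTR, 9-nt CDS (AUG ... UAA), 6-nt 3'-UTR, 20-nt tail.
    toy = "GGGAAA" + "AUGGCUUAA" + "CCCUUU" + "A" * 20
    print(split_mrna(toy, cds_start=6, cds_end=15, poly_a_len=20))
```

Passing the tail length explicitly avoids guessing where a 3'-UTR that ends in A nucleotides stops and the poly(A) sequence begins.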
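As used herein, "a 5'- or 3'-UTR derived from gene A" refers to a 5'- or 3'-UTR from the mRNA of gene A. It may be the entire 5'- or 3'-UTR of the mRNA of gene A, or a part of that 5'- or 3'-UTR.
As used herein, the terms "polyadenylate", "poly(A) sequence" and "poly(A) tail" are used interchangeably. A naturally occurring poly(A) sequence usually consists of adenine ribonucleotides. According to the present invention, the term "modified poly(A) sequence" refers to a poly(A) sequence comprising nucleotides or nucleotide segments other than adenine ribonucleotides. The poly(A) sequence is usually located at the 3' end of the mRNA, for example at the 3' end of (downstream of) the 3'-UTR. As used herein, the term "5'-cap structure" refers to a structure usually located at the 5' end of a mature mRNA. In some embodiments, the 5'-cap structure is linked to the 5' end of the mRNA through a 5'-5' triphosphate linkage. The 5'-cap structure is usually formed from a modified (for example methylated) ribonucleotide, in particular a guanine nucleotide derivative. An example is m7GpppN (cap 0, or "cap0", the cap structure formed when the 5' phosphate group of the hnRNA reacts with the 5' phosphate group of m7GTP under the action of guanylyltransferase to form a 5',5'-triphosphate linkage), where N is the terminal 5' nucleotide of the nucleic acid carrying the 5'-cap structure. In some embodiments, the 5'-cap structure includes, but is not limited to, cap 0, cap 1 (also "cap1", the cap structure formed from cap 0 by further methylation of the 2'-OH of the ribose of the first nucleotide of the hnRNA), cap 2 (also "cap2", the cap structure formed from cap 1 by further methylation of the 2'-OH of the ribose of the second nucleotide of the hnRNA), cap 4, a cap 0 analog, a cap 1 analog, a cap 2 analog or a cap 4 analog.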
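As used herein, the term "expression" includes transcription and/or translation of a nucleotide sequence. Expression may therefore involve the production of a transcript and/or a polypeptide. The term "transcription" relates to the process by which the genetic code in a DNA sequence is transcribed into RNA (the transcript). The term "in vitro transcription" refers to the in vitro synthesis of RNA, in particular mRNA, in a cell-free system (for example in a suitable cell extract) (see, for example, Pardi N., Muramatsu H., Weissman D., Karikó K. (2013). In: Rabinovich P. (eds) Synthetic Messenger RNA and Cell Metabolism Modulation. Methods in Molecular Biology (Methods and Protocols), vol 969. Humana Press, Totowa, NJ.). A vector that can be used to produce transcripts is also called a "transcription vector" and contains the regulatory sequences required for transcription. The term "transcription" encompasses "in vitro transcription".
As used herein, the term "polypeptide" refers to a polymer comprising two or more amino acids covalently linked by peptide bonds. A "protein" may comprise one or more polypeptides, where the polypeptides interact with one another covalently or non-covalently.
As used herein, the term "host cell" refers to a cell used to receive, maintain, replicate or express a polynucleotide or vector. The term "host cell" includes prokaryotic cells (for example Escherichia coli) and eukaryotic cells (for example yeast cells and insect cells), as well as cells from, for example, humans, mice, hamsters, pigs, goats and primates. The cells may be derived from many tissue types and include primary cells and cell lines. Specific examples include keratinocytes, peripheral blood leukocytes, bone marrow stem cells and embryonic stem cells. In other embodiments, the host cell is an antigen-presenting cell, in particular a dendritic cell, monocyte or macrophage. A nucleic acid may be present in the host cell as a single copy or as several copies. In some embodiments, the host cell may be a cell in which the polypeptide of the present invention is expressed.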
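Herein, the term "recombinant" means "produced by genetic engineering". Preferably, in the context of the present invention, a "recombinant substance" such as a recombinant RNA molecule is not naturally occurring. The term "naturally occurring" as used herein refers to the fact that a substance can be found in nature. For example, a peptide or nucleic acid that is present in an organism (including viruses), can be isolated from a natural source, and has not been intentionally modified by humans in the laboratory is naturally occurring.
In the context of the present invention, the term "plasmid" generally refers to a circular DNA molecule, but the term may also cover linearized DNA molecules. In particular, the term "plasmid" also covers molecules obtained by linearizing a circular plasmid, for example by digesting it with a restriction enzyme so that the circular plasmid molecule is converted into a linear molecule. A plasmid can replicate, that is, be amplified in a cell independently of the genetic information stored as chromosomal DNA, and can be used for cloning, that is, for amplifying genetic information in bacterial cells. Preferably, the DNA plasmid is a medium-copy or high-copy plasmid, more preferably a high-copy plasmid. Examples of such high-copy plasmids include pUC and pTZ plasmids, or any other plasmid comprising an origin of replication that supports high plasmid copy number (for example pMB1, pColE1).
The term "treatment" and the like are used herein to mean, in general, obtaining a desired pharmacological and/or physiological effect. Treatment according to the present invention may therefore relate to the treatment of a disease state, but may also relate to prophylactic treatment in terms of completely or partially preventing a disease or its symptoms. Preferably, the term "treatment" is understood to be therapeutic in terms of partially or completely curing a disease and/or the adverse effects and/or symptoms attributable to it. Treatment may also be prophylactic or preventive, that is, a measure taken to prevent disease, for example to prevent infection and/or the onset of disease.
Some elements of the present invention are described herein. These elements are listed together with specific embodiments, but it should be understood that they can be combined in any manner and in any number to produce additional embodiments. The variously described examples and preferred embodiments should not be construed as limiting the present invention to the explicitly described embodiments only. This description should be understood to support and encompass embodiments that combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, unless the context indicates otherwise, any permutation and combination of all elements described in the present invention should be regarded as disclosed by this description. For example, if in one embodiment the 5'-UTR of the recombinant nucleic acid molecule includes a polynucleotide having the sequence shown in SEQ ID NO: 1, and in another embodiment the 3'-UTR of the recombinant nucleic acid molecule includes a polynucleotide having the sequence shown in SEQ ID NO: 22, then the following is also an embodiment for which protection is sought: a recombinant nucleic acid molecule whose 5'-UTR includes a polynucleotide having the sequence shown in SEQ ID NO: 1 and whose 3'-UTR includes a polynucleotide having the sequence shown in SEQ ID NO: 22.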
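II. 5'-UTRs that improve translation
The inventors unexpectedly found that incorporating the 5'-UTR of the gene PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 or CDK7 into an mRNA can improve the translation efficiency of the coding sequence.
The full names of the above genes are as follows.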
PPIA:peptidylprolyl isomerase A;
HPX:hemopexin;
FTCD:formimidoyltransferase cyclodeaminase;
CDK5RAP3:CDK5 regulatory subunit associated protein 3;
HSPA8:heat shock protein family A(Hsp70)member 8;
HBA1:hemoglobin subunit alpha 1;
HBB:hemoglobin subunit beta;
MYSM1:Myb like,SWIRM and MPN domains 1;
LENG1:leukocyte receptor cluster member 1;
TMSB4X:thymosin beta 4 X-linked;
CASP4:caspase 4;
IFNA1:interferon alpha 1;
PGLYRP1:peptidoglycan recognition protein 1;
UCHL1:ubiquitin C-terminal hydrolase L1;
CPAMD8:C3 and PZP like,alpha-2-macroglobulin domain containing 8;
TTR:transthyretin;
APOA2:apolipoprotein A2;
GH1:growth hormone 1;
DTYMK:deoxythymidylate kinase;
APOC2:apolipoprotein C2;
CDK7:cyclin-dependent kinase 7.
Accordingly, the present invention provides a 5'-UTR comprising at least one polynucleotide selected from: (a) a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b) a fragment of the 5'-UTR of (a); (c) a variant of the 5'-UTR of (a); and (d) a variant of the fragment of (b).
In some embodiments, the gene is a eukaryotic gene.
In some embodiments, the gene is a chordate gene.
In some embodiments, the gene is a vertebrate gene.
In some embodiments, the gene is a mammalian gene.
In some embodiments, the gene is a primate gene.
In some embodiments, each of the genes is independently a gene of human (Homo sapiens), bonobo (Pan paniscus), chimpanzee (Pan troglodytes) or western lowland gorilla (Gorilla gorilla gorilla).
In some embodiments, the gene is a human gene.
In some embodiments, the gene is selected from the human PPIA gene, human HPX gene, bonobo HPX gene, human FTCD gene, human CDK5RAP3 gene, western lowland gorilla CDK5RAP3 gene, human HSPA8 gene, human HBA1 gene, human HBB gene, human MYSM1 gene, bonobo MYSM1 gene, human LENG1 gene, human TMSB4X gene, human CASP4 gene, human IFNA1 gene, human PGLYRP1 gene, western lowland gorilla PGLYRP1 gene, human UCHL1 gene, human CPAMD8 gene, chimpanzee CPAMD8 gene, human TTR gene, western lowland gorilla TTR gene, human APOA2 gene, human GH1 gene, human DTYMK gene, bonobo DTYMK gene, human APOC2 gene, chimpanzee APOC2 gene and human CDK7 gene.
In some embodiments, the GenBank accession number of the PPIA gene is BC137057.1; HPX, AH002827.2; FTCD, NM_006657.3; CDK5RAP3, AK223387.1; HSPA8, NM_006597.6; HBA1, NM_000558.5; HBB, NM_000518.5; MYSM1, NM_001085487.2; LENG1, NM_024316.3; TMSB4X, NM_021109.4; CASP4, NM_001225.3; IFNA1, NM_024013.3; PGLYRP1, NM_005091.2; UCHL1, NG_012931.1; CPAMD8, NG_054892.1; TTR, NM_000371.3; APOA2, NM_001643.2; GH1, NM_000515.5; DTYMK, NM_001165031.1; APOC2, NM_000483.4; and CDK7, AY130859.1.
In some embodiments, the polynucleotides encoding the 5'-UTRs of the human genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7 are shown in Table 1.
Table 1. Identified 5'-UTRs that improve translation:

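Accordingly, in some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of that RNA, a variant of that RNA, and a variant of a fragment of that RNA. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 1 to 21. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with the RNA encoded by one of SEQ ID NOs: 1 to 21. Note that herein "a fragment, variant or variant of a fragment of A" is shorthand for "a fragment of A, a variant of A or a variant of a fragment of A". For example, "a fragment, variant or variant of a fragment of the RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21" means a fragment of that RNA, a variant of that RNA, or a variant of a fragment of that RNA.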
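
The paragraph above characterizes variants by the number of nucleotide insertions, additions, deletions or substitutions relative to a reference. One way to quantify this, offered here only as an illustrative assumption and not as the method used in the source, is the Levenshtein edit distance, which gives the minimum number of single-nucleotide insertions, deletions and substitutions needed to turn one sequence into the other. The function name and toy sequences are hypothetical.

```python
def edit_distance(reference, variant):
    """Minimum number of single-nucleotide insertions, deletions and
    substitutions converting `reference` into `variant` (Levenshtein)."""
    # previous[j] holds the distance between reference[:i-1] and variant[:j].
    previous = list(range(len(variant) + 1))
    for i, ref_nt in enumerate(reference, start=1):
        current = [i]
        for j, var_nt in enumerate(variant, start=1):
            current.append(
                min(
                    previous[j] + 1,                         # deletion from reference
                    current[j - 1] + 1,                      # insertion into reference
                    previous[j - 1] + (ref_nt != var_nt),    # match or substitution
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    ref = "ACGUACGU"                      # toy sequences, not from the source
    print(edit_distance(ref, "ACGAACGU")) # one substitution
    print(edit_distance(ref, "CGUACG"))   # one residue removed at each end
```

Because the distance is a minimum, it is a lower bound on the number of changes actually made when constructing a variant; additions at the 5' or 3' end are counted as insertions.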
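In preferred embodiments, the gene is selected from at least one of PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB. In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: 1) an RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6; 2) a fragment of the RNA of 1); 3) a variant of the RNA of 1); and 4) a variant of the fragment of 2). In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.
The 5'-UTR of the present invention may comprise two or more tandemly arranged 5'-UTRs from the above genes, fragments of the 5'-UTRs of the above genes, variants of the 5'-UTRs of the above genes, or variants of fragments of the 5'-UTRs of the above genes.
In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: (a) 5'-UTRs derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b) fragments of the 5'-UTRs of at least two of the genes of (a); (c) variants of the 5'-UTRs of at least two of the genes of (a); and (d) variants of the fragments of (b). In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: (e) 5'-UTRs derived from at least two of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB; (f) fragments of the 5'-UTRs of (e); (g) variants of the 5'-UTRs of (e); and (h) variants of the fragments of (f).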
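In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: RNAs encoded by polynucleotides having the sequences shown in at least two of SEQ ID NOs: 1 to 21, fragments of those RNAs, variants of those RNAs, and variants of fragments of those RNAs. In some embodiments, the fragment, variant or variant of a fragment of the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 1 to 21 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to that RNA. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 1 to 21.
In some embodiments, the 5'-UTR includes at least one of the following polynucleotides: RNAs encoded by polynucleotides having the sequences shown in at least two of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, fragments of those RNAs, variants of those RNAs, and variants of fragments of those RNAs. In some embodiments, the fragment, variant or variant of a fragment of the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to that RNA. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.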
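In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: a) at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; b) at least two copies of a fragment of the 5'-UTR of at least one of the genes of a); c) at least two copies of a variant of the 5'-UTR of at least one of the genes of a); and d) at least two copies of a variant of a fragment of the 5'-UTR of at least one of the genes of a). In some embodiments, the 5'-UTR comprises at least two copies of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies.
In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 1 to 21, at least two copies of a fragment of that RNA, at least two copies of a variant of that RNA, and at least two copies of a variant of a fragment of that RNA. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 1 to 21. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.
In some embodiments, the 5'-UTR comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6, at least two copies of a fragment of that RNA, at least two copies of a variant of that RNA, and at least two copies of a variant of a fragment of that RNA. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.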
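III. 3'-UTRs that improve translation
The inventors unexpectedly found that incorporating a 3'-UTR derived from the gene MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 or NFKB2 into an mRNA can improve the translation efficiency of the coding sequence.
The full names of the above genes are as follows.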
MPND:MPN domain containing;
FBXW10:F-box and WD repeat domain containing 10;
FBXW12:F-box and WD repeat domain containing 12;
PGLYRP1:peptidoglycan recognition protein 1;
HPX:hemopexin;
CDK7:cyclin dependent kinase 7;
APOC2:apolipoprotein C2;
PFN1:profilin 1;
RBP4:retinol binding protein 4;
FTCD:formimidoyltransferase cyclodeaminase;
NAAA:N-acylethanolamine acid amidase;
ALB:albumin;
GSDMD:gasdermin D;
FBXL8:F-box and leucine rich repeat protein 8;
ORM1:orosomucoid 1;
CASP4:caspase 4;
CHMP2A:charged multivesicular body protein 2A;
LENG1:leukocyte receptor cluster member 1;
MYCBPAP:MYCBP associated protein;
APOC1:apolipoprotein C1;
GAPDH:glyceraldehyde-3-phosphate dehydrogenase;
HSPA8:heat shock protein family A(Hsp70)member 8;
APOA2:apolipoprotein A2;
UCHL1:ubiquitin C-terminal hydrolase L1;
TSG101:tumor susceptibility 101;
NAE1:NEDD8 activating enzyme E1 subunit 1;
NFKB2:nuclear factor kappa B subunit 2.
Accordingly, the present invention provides a 3'-UTR comprising at least one polynucleotide selected from: (a) a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b) a fragment of the 3'-UTR of (a); (c) a variant of the 3'-UTR of (a); and (d) a variant of the fragment of (b).
In some embodiments, the gene is a eukaryotic gene.
In some embodiments, the gene is a chordate gene.
In some embodiments, the gene is a vertebrate gene.
In some embodiments, the gene is a mammalian gene.
In some embodiments, the gene is a primate gene.
In some embodiments, each of the genes is independently a gene of human (Homo sapiens), bonobo (Pan paniscus), chimpanzee (Pan troglodytes) or western lowland gorilla (Gorilla gorilla gorilla).
In some embodiments, the gene is a human gene.
In some embodiments, the gene is at least one of the human MPND gene, bonobo MPND gene, human FBXW10 gene, human FBXW12 gene, bonobo FBXW12 gene, human PGLYRP1 gene, human HPX gene, human CDK7 gene, human APOC2 gene, human PFN1 gene, human RBP4 gene, human FTCD gene, human NAAA gene, human ALB gene, human GSDMD gene, human FBXL8 gene, bonobo FBXL8 gene, western lowland gorilla FBXL8 gene, human ORM1 gene, human CASP4 gene, human CHMP2A gene, bonobo CHMP2A gene, human LENG1 gene, bonobo LENG1 gene, human MYCBPAP gene, human APOC1 gene, human GAPDH gene, human HSPA8 gene, human APOA2 gene, human UCHL1 gene, human TSG101 gene, human NAE1 gene, bonobo NAE1 gene, human NFKB2 gene and bonobo NFKB2 gene.
In some embodiments, the GenBank accession number of the MPND gene is NM_001300862.1; FBXW10, NM_001267586.2; FBXW12, NM_001159929.1; PGLYRP1, NM_005091.3; HPX, AH002827.2; CDK7, AY130859.1; APOC2, NM_000483.5; PFN1, NM_005022.4; RBP4, NM_006744.4; FTCD, NM_006657.3; NAAA, NM_001363719.2; ALB, NM_000477.7; GSDMD, NM_024736.7; FBXL8, NM_018378.2; ORM1, NM_000607.4; CASP4, NM_001225.3; CHMP2A, NM_198426.3; LENG1, NM_024316.3; MYCBPAP, NM_032133.6; APOC1, NG_012859.1; GAPDH, NM_002046.7; HSPA8, NM_006597.6; APOA2, NM_001643.2; UCHL1, NG_012931.1; TSG101, NM_006292.4; NAE1, NM_003905.4; and NFKB2, NM_001261403.3.
In some embodiments, the polynucleotides encoding the 3'-UTRs of the human genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2 are shown in Table 2.
Table 2. Identified 3'-UTRs that improve translation



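Accordingly, in some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, a fragment of that RNA, a variant of that RNA, and a variant of a fragment of that RNA. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 22 to 48. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.
In preferred embodiments, the gene is selected from at least one of MPND, FBXW10, FBXW12 and PGLYRP1. In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: an RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 24, 22, 23 or 25, a fragment of that RNA, a variant of that RNA, and a variant of a fragment of that RNA. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 24, 22, 23 or 25. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.
The 3'-UTR sequence of the present invention may comprise two or more tandemly arranged 3'-UTRs from the above genes, fragments of the 3'-UTRs of the above genes, variants of the 3'-UTRs of the above genes, or variants of fragments of the 3'-UTRs of the above genes.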
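In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: (a) 3'-UTRs derived from at least two of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (b) fragments of the 3'-UTRs of at least two of the genes of (a); (c) variants of the 3'-UTRs of at least two of the genes of (a); and (d) variants of the fragments of (b). In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: (e) 3'-UTRs derived from at least two of the genes MPND, FBXW10, FBXW12 and PGLYRP1; (f) fragments of the 3'-UTRs of (e); (g) variants of the 3'-UTRs of (e); and (h) variants of the fragments of (f).
In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: RNAs encoded by polynucleotides having the sequences shown in at least two of SEQ ID NOs: 22 to 48, fragments of those RNAs, variants of those RNAs, and variants of fragments of those RNAs. In some embodiments, the fragment, variant or variant of a fragment of the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 22 to 48 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to that RNA. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.
In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: RNAs encoded by polynucleotides having the sequences shown in at least two of SEQ ID NOs: 24, 22, 23 and 25, fragments of those RNAs, variants of those RNAs, and variants of fragments of those RNAs. In some embodiments, the fragment, variant or variant of a fragment of the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 24, 22, 23 or 25 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to that RNA. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.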
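In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: a) at least two copies of a 3'-UTR derived from one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; b) at least two copies of a fragment of the 3'-UTR of at least one of the genes of a); c) at least two copies of a variant of the 3'-UTR of at least one of the genes of a); and d) at least two copies of a variant of a fragment of the 3'-UTR of at least one of the genes of a). In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: e) at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1; f) at least two copies of a fragment of the 3'-UTR of at least one of the genes of e); g) at least two copies of a variant of the 3'-UTR of at least one of the genes of e); and h) at least two copies of a variant of the fragment of f). Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies.
In some embodiments, the 3'-UTR sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 22 to 48, two copies of a fragment of that RNA, two copies of a variant of that RNA, and two copies of a variant of a fragment of that RNA. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 22 to 48. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.
In some embodiments, the 3'-UTR comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide having the sequence shown in one of SEQ ID NOs: 24, 22, 23 and 25, at least two copies of a fragment of that RNA, at least two copies of a variant of that RNA, and at least two copies of a variant of a fragment of that RNA. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide having the sequence shown in SEQ ID NO: 24, 22, 23 or 25. In some embodiments, the fragment, variant or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with that RNA.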
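IV. Recombinant RNA molecules comprising the 5'- and/or 3'-UTRs of the present invention
The present invention provides a recombinant RNA molecule comprising a 5'- and/or 3'-UTR identified by the present invention as improving translation of the coding sequence.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest and a second nucleotide sequence containing a 5'-UTR, the second nucleotide sequence comprising at least one polynucleotide selected from: (a) a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b) a fragment of the 5'-UTR of (a); (c) a variant of the 5'-UTR of (a); and (d) a variant of the fragment of (b), wherein the first nucleotide sequence and the second nucleotide sequence do not naturally occur in the same RNA molecule.
In some embodiments, the gene is a human gene. In some embodiments, the first nucleotide sequence encodes at least one polypeptide of interest, for example one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest. In some embodiments, the first nucleotide sequence encodes at least one protein of interest, for example one, two, three, four, five, six, seven, eight, nine or ten proteins of interest. In some embodiments, the first nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest, for example one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest and one, two, three, four, five, six, seven, eight, nine or ten proteins of interest.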
在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的至少一种:序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在优选的实施方案中,所述基因选自PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一种。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:1):序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA;2):1)中所述RNA的片段;3):1)中所述RNA的变体;和4):2)中所述片段的变体。在一些实施方案中,所述序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
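The second nucleotide sequence may comprise two or more tandemly arranged 5'-UTRs from the above genes, fragments of the 5'-UTRs of the above genes, variants of the 5'-UTRs of the above genes, or variants of fragments of the 5'-UTRs of the above genes.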
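As an illustration of the tandem arrangement, the sketch below builds either repeated copies of one UTR or a head-to-tail concatenation of different UTRs. It assumes direct concatenation with no spacer between units, which the text does not specify, and the function names and example sequences are hypothetical.

```python
def tandem_copies(utr: str, copies: int, max_copies: int = 9) -> str:
    """Return `copies` head-to-tail repeats of one UTR.

    The lower bound of 2 follows "at least two copies"; the default upper
    bound of 9 mirrors the preferred range of two to nine copies.
    """
    if not 2 <= copies <= max_copies:
        raise ValueError(f"copies must be between 2 and {max_copies}")
    return utr * copies


def tandem_join(*utrs: str) -> str:
    """Concatenate two or more (possibly different) UTRs in the given order."""
    if len(utrs) < 2:
        raise ValueError("at least two UTR sequences are required")
    return "".join(utrs)


# Hypothetical UTR units for illustration only.
utr_a = "GGGAAAUAAGAG"
utr_b = "CCUCGGUGGC"

three_copies = tandem_copies(utr_a, 3)
mixed = tandem_join(utr_a, utr_b)
```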
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(a):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少两个基因的5’-UTR;(b):(a)中所述基因中的至少两个基因的5’-UTR的片段;(c):(a)中所述基因中的至少两个基因的5’-UTR的变体;和(d):(b)中所述片段的变体。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(e):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少两个基因的5’-UTR;(f):(e)中所述5’-UTR的片段;(g):(e)中所述5’-UTR的变体;和(h):(f)中所述片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸的至少一种:序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述第二核苷酸序列包含序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID  NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的变体、或序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体,与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体或片段的变体,与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体、或片段的变体,与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:a):至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少一个基因的5’-UTR;b):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段;c):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的变体;和d):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段的变体。在一些实施方案中,所述第二核苷酸序列包含至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的一个的5’-UTR。在一些实施方案中,所述5’-UTR序列包含下述多核苷酸中的至少一种:a):至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR;b):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段;c):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的变体;和d):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段的变体。在一些实施方案中,所述5’-UTR序列包含至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的一个基因的5’-UTR。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA、至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的变体和至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个 拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA、两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA的片段、两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA的变体、和两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
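In some embodiments, the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 3'-UTR and a poly(A) tail. In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 3'-UTR and a poly(A) tail.
In some embodiments, the 5'-cap structure includes, but is not limited to, at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2'-O)N1. "m7G" denotes the 7-methylguanosine cap nucleoside; "ppp" denotes the triphosphate linkage between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript; N1 is the 5'-most nucleotide; "G" denotes guanosine; "m7" denotes a methyl group at the 7-position of guanine; and "m2'-O" denotes a methyl group at the 2'-O position of the nucleotide.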
在一些实施方案中,所述3’-UTR包含:i)源自白蛋白基因、α-珠蛋白基因、β-珠蛋白基因、酪氨酸羟化酶基因、脂加氧酶基因、和胶原蛋白α基因中至少一种基因的3’-UTR或其变体的核苷酸序列;ii)所述i)中的3’-UTR的变体;iii)源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1、NFKB2和GH1中至少一种基因的3’-UTR、其片段、变体和片段的变体中的至少一种;优选地,序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA的片段的变体中的至少一种。在一些实施方案中,所述序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:22~49所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性;iv)至少两个拷贝的i)、ii)或iii)中的一种多核苷酸序列;或v)由i)~iii)中的多核苷酸所构成的组中的至少两种多核苷酸。优选地,所述至少两个拷贝为两 个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
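In some embodiments, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the nucleotides constituting the poly(A) tail comprise one or more nucleotides other than A nucleotides. In some embodiments, the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides of that stretch of two or more consecutive nucleotides are nucleotides other than A. Preferably, the poly(A) tail is of the interrupted type, that is, a run of m consecutive A nucleotides and a run of n consecutive A nucleotides are joined by a linker sequence consisting of p non-A nucleotides, where m, n and p are positive integers. Preferably, m is 30, n is 70 and p is 10.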
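The interrupted poly(A) layout (m consecutive A, a p-nucleotide non-A linker, n consecutive A) can be sketched as follows. This is an illustrative sketch only: the 10-nt linker used as the default is a hypothetical placeholder chosen to contain no A, and it is not the linker of SEQ ID NO: 53, whose sequence is not reproduced in this passage.

```python
import re


def build_interrupted_poly_a(m: int = 30, n: int = 70,
                             linker: str = "GCUCUGCUGC") -> str:
    """Build A(m) + linker + A(n); the linker must consist of non-A nucleotides."""
    linker = linker.upper()
    if m < 1 or n < 1 or len(linker) < 1:
        raise ValueError("m, n and the linker length p must be positive")
    if not re.fullmatch(r"[CGU]+", linker):
        raise ValueError("linker must contain only C, G or U")
    return "A" * m + linker + "A" * n


def describe_tail(tail: str):
    """Return (m, p, n) for an A(m)-linker(p)-A(n) tail, or None otherwise."""
    match = re.fullmatch(r"(A+)([CGU]+)(A+)", tail)
    if match is None:
        return None
    return tuple(len(group) for group in match.groups())


tail = build_interrupted_poly_a()          # preferred m = 30, p = 10, n = 70
layout = describe_tail(tail)
print(len(tail), layout)
```

With the preferred values the tail is 110 nt long, of which 100 are A nucleotides; an uninterrupted run of A only, such as `"A" * 100`, is reported as `None` by `describe_tail` because it has no linker.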
在一些实施方案中,所述重组RNA分子包含编码感兴趣的多肽和/或蛋白的第一核苷酸序列和含有3’-UTR的第二核苷酸序列,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(a):源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1和NFKB2中的至少一个基因的3’-UTR;(b):(a)中所述3’-UTR的片段;(c):(a)中所述3’-UTR的变体;和(d):(b)中所述片段的变体;其中所述第一和第二核苷酸序列不天然出现于同一RNA分子。
在一些实施方案中,所述基因是人基因。
在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的多肽。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的多肽。在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的蛋白。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的蛋白。在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的多肽和至少一种感兴趣的蛋白。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的多肽和一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的蛋白。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的变体和序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在优选的实施方案中,所述基因选自MPND、FBXW10、FBXW12和PGLYRP1中的至少一种。在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的至少一种:序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA、序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相 同性。在一些实施方案中,序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
本发明的重组RNA分子中的第二核苷酸序列中的3’-UTR可以包含两个或多个串联的来自上述基因的3’-UTR、上述基因的3’-UTR的片段、上述基因的3’-UTR的变体、或上述基因的3’-UTR的片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(a):源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1和NFKB2中的至少两个基因的3’-UTR;(b):(a)中所述基因中的至少两个基因的3’-UTR的片段;(c):(a)中所述基因中的至少两个基因的3’-UTR的变体;和(d):(b)中所述片段的变体。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(e):源自基因MPND、FBXW10、FBXW12和PGLYRP1中的至少两个基因的3’-UTR;(f):(e)中所述3’-UTR的片段;(g):(e)中所述3’-UTR的变体;和(h):(f)中所述片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸编码的RNA、序列SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸编码的RNA的片段、序列SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸编码的RNA的变体、和序列SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:a):至少两个拷贝的源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1和NFKB2中的至少一个基因的3’-UTR;b):至少两个拷贝的a)中所述基因中的至少一个基因的3’-UTR的片段;c):至少两个拷贝的a)中所述基因中的至少一个基因的3’-UTR的变体;和d):至少两个拷贝的a)中所述基因中的至少一个基因的3’-UTR的片段 的变体。在一些实施方案中,所述第二核苷酸序列包含至少两个拷贝的源自基因MPND、FBXW10、FBXW12和PGLYRP1中的至少一个基因的3’-UTR的核苷酸序列。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA、至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段、至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的变体、和至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~48之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如与SEQ ID NO:22~48之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的变体、和至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
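In some embodiments, the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a promoter, a 5'-cap structure, a 5'-UTR and a poly(A) tail.
In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure, a 5'-UTR and a poly(A) tail.
In some embodiments, the 5'-cap structure includes, but is not limited to, at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2'-O)N1. "m7G" denotes the 7-methylguanosine cap nucleoside; "ppp" denotes the triphosphate linkage between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript; N1 is the 5'-most nucleotide; "G" denotes guanosine; "m7" denotes a methyl group at the 7-position of guanine; and "m2'-O" denotes a methyl group at the 2'-O position of the nucleotide.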
在一些实施方案中,所述5’-UTR包含:i)源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中至少一个基因的5’-UTR、其片段、变体或片段的变体;优选地,序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体和序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体中的至少一 种;所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体、所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段和所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体,与所述序列如SEQ ID NO:1~21中的至少一个所示的多核苷酸编码的RNA有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的同源性;ii)至少两个拷贝的i)中的至少一种多核苷酸;或iii)至少两种i)中的多核苷酸。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
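In some embodiments, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the poly(A) tail comprises one or more nucleotides other than A nucleotides. In some embodiments, the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides of that stretch of two or more consecutive nucleotides are nucleotides other than A. Preferably, the poly(A) tail is an interrupted poly(A), that is, a run of m consecutive A nucleotides and a run of n consecutive A nucleotides are joined by a linker sequence consisting of p non-A nucleotides, where m, n and p are positive integers. Preferably, m is 30, n is 70 and p is 10.
In some embodiments, the DNA sequence corresponding to the poly(A) tail is shown in SEQ ID NO: 53.
In some embodiments, the recombinant RNA molecule is no longer than 50,000 nt. Preferably, the recombinant RNA molecule is no longer than 40,000 nt, 30,000 nt, 20,000 nt, 10,000 nt, 9,000 nt, 8,000 nt or 6,000 nt. In some embodiments, the recombinant RNA molecule is 500 nt to 50,000 nt long. In some embodiments, the recombinant RNA molecule is 1,000 nt to 40,000 nt, 1,000 nt to 30,000 nt, 1,500 nt to 10,000 nt or 1,500 nt to 8,000 nt long.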
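The element order and the length limits described above can be checked with a short script. The sketch below is an assumption-laden illustration: it treats each element as a DNA coding-strand sequence that is converted to RNA by replacing T with U, joins the elements in the order 5'-UTR, coding sequence, 3'-UTR, poly(A) tail, and ignores the 5'-cap (which is not a templated sequence). All sequences are hypothetical placeholders, not SEQ ID NOs from this application.

```python
def transcribe(dna: str) -> str:
    """Return the RNA encoded by a DNA coding-strand sequence (T -> U)."""
    return dna.upper().replace("T", "U")


def assemble_rna(five_utr: str, cds: str, three_utr: str, poly_a: str,
                 min_len: int = 500, max_len: int = 50_000) -> str:
    """Join the elements 5' to 3' and enforce the 500 nt to 50,000 nt range."""
    rna = "".join(transcribe(part)
                  for part in (five_utr, cds, three_utr, poly_a))
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    if not min_len <= len(rna) <= max_len:
        raise ValueError(f"length {len(rna)} nt is outside "
                         f"{min_len} to {max_len} nt")
    return rna


# Hypothetical placeholder elements (DNA form), for illustration only.
five_utr = "GGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACC"
cds = "ATG" + "GCT" * 200 + "TAA"            # start codon, 200 codons, stop
three_utr = "GCTGCC" * 10
poly_a = "A" * 30 + "GCTCTGCTGC" + "A" * 70  # interrupted tail, m=30, p=10, n=70

mrna = assemble_rna(five_utr, cds, three_utr, poly_a)
print(len(mrna))
```

A narrower range from the text, such as 1,500 nt to 8,000 nt, can be enforced by passing `min_len=1500, max_len=8000`.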
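In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a second nucleotide sequence containing a 5'-UTR, and a third nucleotide sequence containing a 3'-UTR, wherein the second nucleotide sequence comprises at least one of a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7, a fragment thereof, a variant thereof, and a variant of a fragment thereof; wherein the third nucleotide sequence comprises at least one of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2 and GH1, a fragment thereof, a variant thereof, and a variant of a fragment thereof; and wherein the first nucleotide sequence does not naturally occur in the same RNA molecule as at least one of the second nucleotide sequence and the third nucleotide sequence.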
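Compared with a 5'-UTR or 3'-UTR derived from at least one of the above genes, or a fragment, variant, or variant of a fragment thereof, used alone, combining a 5'-UTR derived from at least one of the above genes, or a fragment, variant, or variant of a fragment thereof, with a 3'-UTR derived from at least one of the above genes, or a fragment, variant, or variant of a fragment thereof, can further improve the translation efficiency and/or stability of the mRNA molecule and increase the expression level of the polypeptide and/or protein of interest, for example by 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold or 10-fold.
It should be noted that, herein, "A, a fragment, variant, or variant of a fragment thereof" is shorthand for A, a fragment of A, a variant of A, or a variant of a fragment of A. For example, "a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, or a fragment, variant, or variant of a fragment thereof" means a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a variant of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, or a variant of a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD. Herein, "at least one of A, a fragment thereof, a variant thereof, and a variant of a fragment thereof" is shorthand for at least one member of the group consisting of A, a fragment of A, a variant of A, and a variant of a fragment of A. For example, "at least one of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a fragment thereof, a variant thereof, and a variant of a fragment thereof" means at least one of: a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, a variant of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD, and a variant of a fragment of a 5'-UTR derived from at least one of the genes PPIA, HPX and FTCD.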
在一些实施方案中,所述基因是人基因。
在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的多肽。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的多肽。在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的蛋白。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的蛋白。在一些实施方案中,所述第一核苷酸序列编码至少一种感兴趣的多肽和至少一种感兴趣的蛋白。例如一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的多肽和一种、两种、三种、四种、五种、六种、七种、八种、九种或十种感兴趣的蛋白。
在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的至少一种:序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的一种:序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在优选的实施方案中,所述基因选自PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一种。在一些实施方案中,所述第二核苷酸序列包含如下核苷酸中的至少一种:1):序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA;2):1)中所述RNA的片段;3):1)中所述RNA的变体;和4):2)中所述片段的变体。在一些实施方案中,所述第二核苷酸序列包含如下核苷酸中的一种:1):序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA;2):1)中所述RNA的片段;3):1)中所述RNA的变体;和4):2)中所述片段的变体。在一些实施方案中,所述序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
本发明的重组RNA分子中的第二核苷酸序列中的5’-UTR可以包含两个或多个串 联的来自上述基因的5’-UTR、上述基因的5’-UTR的片段、上述基因的5’-UTR的变体、或上述基因的5’-UTR的片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(a):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少两个基因的5’-UTR;(b):(a)中所述基因中的至少两个基因的5’-UTR的片段;(c):(a)中所述基因中的至少两个基因的5’-UTR的变体;和(d):(b)中所述片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的一种:(a):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少两个的5’-UTR;(b):(a)中所述基因中的至少两个基因的5’-UTR的片段;(c):(a)中所述基因中的至少两个基因的5’-UTR的变体;和(d):(b)中所述片段的变体。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(e):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少两个的5’-UTR;(f):(e)中所述5’-UTR的片段;(g):(e)中所述5’-UTR的变体;和(h):(f)中所述片段的变体。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的一种:(e):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少两个的5’-UTR;(f):(e)中所述5’-UTR的片段;(g):(e)中所述5’-UTR的变体;和(h):(f)中所述片段的变体。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸的至少一种:序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述第二核苷酸序列包含序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的变体、或序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸编码的RNA的片段、变体、或片段的变体,与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1或 6所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:a):至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少一个基因的5’-UTR;b):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段;c):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的变体;和d):至少两个拷贝的a)中所述基因中的至少一个基因的5’-UTR的片段的变体。
在一些实施方案中,所述第二核苷酸序列包含至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的一个的5’-UTR。在一些实施方案中,所述5’-UTR序列包含至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR、至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR的片段、至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR的变体或至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR的片段的变体。在一些实施方案中,所述5’-UTR序列包含至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的一个基因的5’-UTR、至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的一个基因的5’-UTR的片段、至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的一个基因的5’-UTR的变体、或至少两个拷贝的源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的一个基因的5’-UTR的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA、至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的变体和至少两个拷贝的序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
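In some embodiments, the second nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6; two copies of a fragment of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6; two copies of a variant of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6; and two copies of a variant of a fragment of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant, or variant of a fragment of the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide whose sequence is shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6. In some embodiments, the fragment, variant, or variant of a fragment of the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 9, 7, 18, 12, 8, 1 and 6 has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides relative to the RNA encoded by a polynucleotide whose sequence is shown in SEQ ID NO: 9, 7, 18, 12, 8, 1 or 6.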
在一些实施方案中,所述第三核苷酸序列包含:序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段的变体中的至少一种。在一些实施方案中,所述第三核苷酸序列包含:序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段的变体中的至少一种。在一些实施方案中,序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在优选的实施方案中,所述基因选自MPND、FBXW10、FBXW12和PGLYRP1中的至少一种。在一些实施方案中,所述基因选自MPND、FBXW10、FBXW12和PGLYRP1中的一种。在一些实施方案中,所述第三核苷酸序列包含下述多核苷酸中的至少一种:序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA、序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
本发明的重组分子的第三核苷酸序列中的3’-UTR可以包含两个或多个串联的来自上述基因的3’-UTR、上述基因的3’-UTR的片段、上述基因的3’-UTR的变体、或上述基因的3’-UTR片段的变体。
在一些实施方案中,所述第三核苷酸序列包含下多核苷酸中的至少一种:(a):源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1、NFKB2和GH1中的 至少两个基因的3’-UTR;(b):(a)中所述基因中的至少两个基因的3’-UTR的片段;(c):(a)中所述基因中的至少两个基因的3’-UTR的变体;和(d):(a)中所述基因中的至少两个基因的3’-UTR的片段的变体。在一些实施方案中,所述第三核苷酸序列包含如下多核苷酸中的至少一种:(e):源自基因MPND、FBXW10、FBXW12和PGLYRP1中的至少两个基因的3’-UTR;(f):(e)中所述3’-UTR的片段;(g):(e)中所述3’-UTR的变体;和(h):(f)中所述片段的变体。
在一些实施方案中,所述第三核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:22~49中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~49中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~49中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~49中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述第三核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的变体、和序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:22~49之一所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第三核苷酸序列包含如下多核苷酸中的至少一种:序列SEQ ID NO:24、22、23和25中至少两个所示的多核苷酸编码的RNA、序列SEQ ID NO:24、22、23和25中至少两个所示的多核苷酸编码的RNA的片段、序列SEQ ID NO:24、22、23和25中至少两个所示的多核苷酸编码的RNA的变体、和序列SEQ ID NO:24、22、23和25中至少两个所示的多核苷酸编码的RNA的片段的变体。在一些实施方案中,所述序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA的片段、变体或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
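In some embodiments, the third nucleotide sequence comprises at least one of the following polynucleotides: a) at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1, NFKB2 and GH1; b) at least two copies of a fragment of the 3'-UTR of at least one of the genes in a); c) at least two copies of a variant of the 3'-UTR of at least one of the genes in a); and d) at least two copies of a variant of a fragment of the 3'-UTR of at least one of the genes in a). In some embodiments, the third nucleotide sequence comprises at least two copies of a 3'-UTR derived from at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the third nucleotide sequence comprises at least one of the following polynucleotides: at least two copies of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49; at least two copies of a fragment of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49; at least two copies of a variant of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49; and at least two copies of a variant of a fragment of an RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49. Preferably, the at least two copies are two, three, four, five, six, seven, eight or nine copies. In some embodiments, the fragment, variant, or variant of a fragment of the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49 has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49. In some embodiments, the fragment, variant, or variant of a fragment of the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49 has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides relative to the RNA encoded by a polynucleotide whose sequence is shown in one of SEQ ID NOs: 22 to 49.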
在一些实施方案中,所述第三核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的变体、和至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸编码的RNA的片段、变体、或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸编码的RNA相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
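In some embodiments, the nucleotide sequence of the 5'-UTR is shown in SEQ ID NO: 55.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, the 3'-UTR of any of the above embodiments, and a 5'-UTR whose nucleotide sequence is shown in SEQ ID NO: 55.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3'-UTR whose sequence is shown in one of SEQ ID NOs: 22 to 48, and a 5'-UTR whose nucleotide sequence is shown in SEQ ID NO: 55.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3'-UTR derived from at least one of the genes FBXW10, FBXW12, MPND and PGLYRP1, and a 5'-UTR whose nucleotide sequence is shown in SEQ ID NO: 55.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 3'-UTR whose sequence is shown in one of SEQ ID NOs: 22 to 25, and a 5'-UTR whose nucleotide sequence is shown in SEQ ID NO: 55.
In some embodiments, the 3'-UTR is a 3'-UTR derived from the gene COP1 (CARD only protein).
In some embodiments, the nucleotide sequence of the 3'-UTR is shown in SEQ ID NO: 56.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, the 5'-UTR of any of the above embodiments, and a 3'-UTR whose nucleotide sequence is shown in SEQ ID NO: 56.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR whose sequence is shown in one of SEQ ID NOs: 1 to 21, and the 3'-UTR of the gene COP1 (CARD only protein).
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR whose sequence is shown in one of SEQ ID NOs: 1 to 21, and a 3'-UTR whose nucleotide sequence is shown in SEQ ID NO: 56.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, and a 3'-UTR derived from the gene COP1.
In some embodiments, the recombinant RNA molecule comprises a first nucleotide sequence encoding a polypeptide and/or protein of interest, a 5'-UTR whose sequence is shown in one of SEQ ID NOs: 1, 6 to 9, 12 and 18, and a 3'-UTR whose nucleotide sequence is shown in SEQ ID NO: 56.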
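In some embodiments, the recombinant RNA molecule is an mRNA molecule. In some embodiments, the recombinant RNA molecule further comprises at least one of a 5'-cap structure and a poly(A) tail.
In some embodiments, the 5'-cap structure includes, but is not limited to, at least one of m7GpppG, m2(7,3'-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2'-O)N1. "m7G" denotes the 7-methylguanosine cap nucleoside; "ppp" denotes the triphosphate linkage between the 5' carbon of the cap nucleoside and the first nucleotide of the primary RNA transcript; N1 is the 5'-most nucleotide; "G" denotes guanosine; "m7" denotes a methyl group at the 7-position of guanine; and "m2'-O" denotes a methyl group at the 2'-O position of the nucleotide.
In some embodiments, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides. Preferably, they comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the poly(A) tail comprises one or more nucleotides other than A nucleotides. In some embodiments, the poly(A) tail comprises two or more consecutive nucleotides other than A nucleotides, wherein the first and last nucleotides of that stretch of two or more consecutive nucleotides are nucleotides other than A. Preferably, the poly(A) tail is of the interrupted type, that is, a run of m consecutive A nucleotides and a run of n consecutive A nucleotides are joined by a linker sequence consisting of p non-A nucleotides, where m, n and p are positive integers. Preferably, m is 30, n is 70 and p is 10.
In some embodiments, the DNA sequence corresponding to the poly(A) tail is shown in SEQ ID NO: 53.
In some embodiments, the recombinant RNA molecule is no longer than 50,000 nt. Preferably, the recombinant RNA molecule is no longer than 40,000 nt, 30,000 nt, 20,000 nt, 10,000 nt, 9,000 nt, 8,000 nt or 6,000 nt. In some embodiments, the recombinant RNA molecule is 500 nt to 50,000 nt long. In some embodiments, the recombinant RNA molecule is 1,000 nt to 40,000 nt, 1,000 nt to 30,000 nt, 1,500 nt to 10,000 nt or 1,500 nt to 8,000 nt long.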
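In some embodiments, the recombinant RNA molecule of any of the above embodiments comprises modified nucleosides. In some embodiments, the recombinant RNA molecule comprises at least one of modified uridine, modified cytidine, modified adenosine and modified guanosine.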
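In some embodiments, the modified nucleoside is a modified uridine. In some embodiments, 0.1% to 100% of the uridines in the recombinant RNA molecule are modified. Preferably, 80% to 100% of the uridines are modified. Preferably, 100% of the uridines are modified. Exemplary modified uridines include pseudouridine (ψ), N1-methylpseudouridine, pyridin-4-one ribonucleoside, 5-aza-uridine, 6-aza-uridine, 2-thio-5-aza-uridine, 2-thio-uridine (s2U), 4-thio-uridine (s4U), 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxy-uridine (ho5U), 5-aminoallyl-uridine, 5-halo-uridine (e.g., 5-iodo-uridine or 5-bromo-uridine), 3-methyl-uridine (m3U), 5-methoxy-uridine (mo5U), uridine 5-oxyacetic acid (cmo5U), uridine 5-oxyacetic acid methyl ester (mcmo5U), 5-carboxymethyl-uridine (cm5U), 1-carboxymethyl-pseudouridine, 5-carboxyhydroxymethyl-uridine (chm5U), 5-carboxyhydroxymethyl-uridine methyl ester (mchm5U), 5-methoxycarbonylmethyl-uridine (mcm5U), 5-methoxycarbonylmethyl-2-thio-uridine (mcm5s2U), 5-aminomethyl-2-thio-uridine (nm5s2U), 5-methylaminomethyl-uridine (mnm5U), 5-methylaminomethyl-2-thio-uridine (mnm5s2U), 5-methylaminomethyl-2-seleno-uridine (mnm5se2U), 5-carbamoylmethyl-uridine (ncm5U), 5-carboxymethylaminomethyl-uridine (cmnm5U), 5-carboxymethylaminomethyl-2-thio-uridine (cmnm5s2U), 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyl-uridine (τm5U), 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine (τm5s2U), 1-taurinomethyl-4-thio-pseudouridine, 5-methyl-uridine (m5U, i.e., having the nucleobase deoxythymine), 1-methyl-pseudouridine (m1ψ), 5-methyl-2-thio-uridine (m5s2U), 1-methyl-4-thio-pseudouridine (m1s4ψ), 4-thio-1-methyl-pseudouridine, 3-methyl-pseudouridine (m3ψ), 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine (D), dihydropseudouridine, 5,6-dihydrouridine, 5-methyl-dihydrouridine (m5D), 2-thio-dihydrouridine, 2-thio-dihydropseudouridine, 2-methoxy-uridine, 2-methoxy-4-thio-uridine, 4-methoxy-pseudouridine, 4-methoxy-2-thio-pseudouridine, N1-methyl-pseudouridine, 3-(3-amino-3-carboxypropyl)uridine (acp3U), 1-methyl-3-(3-amino-3-carboxypropyl)pseudouridine (acp3ψ), 5-(isopentenylaminomethyl)uridine (inm5U), 5-(isopentenylaminomethyl)-2-thio-uridine (inm5s2U), α-thio-uridine, 2'-O-methyl-uridine (Um), 5,2'-O-dimethyl-uridine (m5Um), 2'-O-methyl-pseudouridine (ψm), 2-thio-2'-O-methyl-uridine (s2Um), 5-methoxycarbonylmethyl-2'-O-methyl-uridine (mcm5Um), 5-carbamoylmethyl-2'-O-methyl-uridine (ncm5Um), 5-carboxymethylaminomethyl-2'-O-methyl-uridine (cmnm5Um), 3,2'-O-dimethyl-uridine (m3Um), 5-(isopentenylaminomethyl)-2'-O-methyl-uridine (inm5Um), 1-thio-uridine, deoxythymidine, 2'-F-ara-uridine, 2'-F-uridine, 2'-OH-ara-uridine, 5-(2-carbomethoxyvinyl)uridine and 5-[3-(1-E-propenylamino)]uridine.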
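In some embodiments, the modified nucleoside is a modified cytidine. In some embodiments, 0.1% to 100% of the cytidines in the recombinant RNA molecule are modified. Preferably, 80% to 100% of the cytidines are modified. Preferably, 100% of the cytidines are modified. Exemplary modified cytidines include 5-aza-cytidine, 6-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine (m3C), N4-acetyl-cytidine (ac4C), 5-formyl-cytidine (f5C), N4-methyl-cytidine (m4C), 5-methyl-cytidine (m5C), 5-halo-cytidine (e.g., 5-iodo-cytidine), 5-hydroxymethyl-cytidine (hm5C), 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine (s2C), 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, 4-methoxy-1-methyl-pseudoisocytidine, lysidine (k2C), α-thio-cytidine, 2'-O-methyl-cytidine (Cm), 5,2'-O-dimethyl-cytidine (m5Cm), N4-acetyl-2'-O-methyl-cytidine (ac4Cm), N4,2'-O-dimethyl-cytidine (m4Cm), 5-formyl-2'-O-methyl-cytidine (f5Cm), N4,N4,2'-O-trimethyl-cytidine (m42Cm), 1-thio-cytidine, 2'-F-ara-cytidine, 2'-F-cytidine and 2'-OH-ara-cytidine.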
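In some embodiments, the modified nucleoside is a modified adenosine. In some embodiments, 0.1% to 100% of the adenosines in the recombinant RNA molecule are modified. Preferably, 80% to 100% of the adenosines are modified. Preferably, 100% of the adenosines are modified. Exemplary modified adenosines include 2-amino-purine, 2,6-diaminopurine, 2-amino-6-halo-purine (e.g., 2-amino-6-chloro-purine), 6-halo-purine (e.g., 6-chloro-purine), 2-amino-6-methyl-purine, 8-azido-adenosine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-amino-purine, 7-deaza-8-aza-2-amino-purine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyl-adenosine (m1A), 2-methyl-adenine (m2A), N6-methyl-adenosine (m6A), 2-methylthio-N6-methyl-adenosine (ms2m6A), N6-isopentenyl-adenosine (i6A), 2-methylthio-N6-isopentenyl-adenosine (ms2i6A), N6-(cis-hydroxyisopentenyl)adenosine (io6A), 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine (ms2io6A), N6-glycinylcarbamoyl-adenosine (g6A), N6-threonylcarbamoyl-adenosine (t6A), N6-methyl-N6-threonylcarbamoyl-adenosine (m6t6A), 2-methylthio-N6-threonylcarbamoyl-adenosine (ms2t6A), N6,N6-dimethyl-adenosine (m62A), N6-hydroxynorvalylcarbamoyl-adenosine (hn6A), 2-methylthio-N6-hydroxynorvalylcarbamoyl-adenosine (ms2hn6A), N6-acetyl-adenosine (ac6A), 7-methyl-adenine, 2-methylthio-adenine, 2-methoxy-adenine, α-thio-adenosine, 2'-O-methyl-adenosine (Am), N6,2'-O-dimethyl-adenosine (m6Am), N6,N6,2'-O-trimethyl-adenosine (m62Am), 1,2'-O-dimethyl-adenosine (m1Am), 2'-O-ribosyladenosine (phosphate) (Ar(p)), 2-amino-N6-methyl-purine, 1-thio-adenosine, 2'-F-ara-adenosine, 2'-F-adenosine, 2'-OH-ara-adenosine and N6-(19-amino-pentaoxanonadecyl)-adenosine.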
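In some embodiments, the modified nucleoside is a modified guanosine. In some embodiments, 0.1% to 100% of the guanosines in the recombinant RNA molecule are modified. Preferably, 80% to 100% of the guanosines are modified. Preferably, 100% of the guanosines are modified. Exemplary modified guanosines include inosine (I), 1-methyl-inosine (m1I), wyosine (imG), methylwyosine (mimG), 4-demethyl-wyosine (imG-14), isowyosine (imG2), wybutosine (yW), peroxywybutosine (o2yW), hydroxywybutosine (OHyW), undermodified hydroxywybutosine (OHyW*), 7-deaza-guanosine, queuosine (Q), epoxyqueuosine (oQ), galactosyl-queuosine (galQ), mannosyl-queuosine (manQ), 7-cyano-7-deaza-guanosine (preQ0), 7-aminomethyl-7-deaza-guanosine (preQ1), archaeosine (G+), 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine (m7G), 6-thio-7-methyl-guanosine, 7-methyl-inosine, 6-methoxy-guanosine, 1-methyl-guanosine (m1G), N2-methyl-guanosine (m2G), N2,N2-dimethyl-guanosine (m22G), N2,7-dimethyl-guanosine (m2,7G), N2,N2,7-trimethyl-guanosine (m2,2,7G), 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, N2,N2-dimethyl-6-thio-guanosine, α-thio-guanosine, 2'-O-methyl-guanosine (Gm), N2-methyl-2'-O-methyl-guanosine (m2Gm), N2,N2-dimethyl-2'-O-methyl-guanosine (m22Gm), 1-methyl-2'-O-methyl-guanosine (m1Gm), N2,7-dimethyl-2'-O-methyl-guanosine (m2,7Gm), 2'-O-methyl-inosine (Im), 1,2'-O-dimethyl-inosine (m1Im), 2'-O-ribosylguanosine (phosphate) (Gr(p)), 1-thio-guanosine, O6-methyl-guanosine, 2'-F-ara-guanosine and 2'-F-guanosine.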
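The modification percentages given for each nucleoside (0.1% to 100%, preferably 80% to 100%) can be made concrete with a small bookkeeping example. This is only a sketch under stated assumptions: modified positions are represented as a set of 0-based indices into the unmodified sequence, which is a hypothetical representation chosen for illustration, and the 9-nt sequence is not from this application.

```python
def modified_fraction(sequence: str, modified_positions: set[int],
                      nucleoside: str = "U") -> float:
    """Fraction of a given nucleoside that carries a modification.

    `modified_positions` holds 0-based indices; indices that do not point
    at the chosen nucleoside are ignored.
    """
    total = sequence.count(nucleoside)
    if total == 0:
        raise ValueError(f"sequence contains no {nucleoside}")
    modified = sum(1 for i in modified_positions
                   if 0 <= i < len(sequence) and sequence[i] == nucleoside)
    return modified / total


def in_preferred_range(fraction: float) -> bool:
    """True when 80% to 100% of the nucleoside is modified."""
    return 0.8 <= fraction <= 1.0


# Hypothetical 9-nt sequence with uridines at indices 1, 3, 4, 5 and 8.
sequence = "AUGUUUCCU"
fraction = modified_fraction(sequence, {1, 3, 4, 5})
print(fraction, in_preferred_range(fraction))
```

In practice a fully substituted transcript (100%) is what results when a modified nucleoside triphosphate completely replaces the unmodified one during in vitro transcription, whereas intermediate percentages describe an average over positions.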
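In some embodiments, in any of the above embodiments, the polypeptide and/or protein of interest refers to a therapeutically or pharmaceutically active polypeptide or protein having a therapeutic or prophylactic effect, whose function in or near a cell is needed or beneficial; for example, a protein whose absence or defective form leads to disease and whose provision can modulate or prevent the disease, or a protein whose presence in or near a cell is beneficial to the body. The polypeptide or protein may comprise the complete protein or a functional variant thereof.
In any of the above embodiments, the peptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is one or more of the following: (a) an antigen; (b) a therapeutic protein or polypeptide, or a fragment, variant, or variant of a fragment thereof; and (c) another polypeptide or protein.
In some embodiments, the peptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is an antigen.
In some embodiments, the antigen expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest is derived from one or more of the following: (1) a pathogenic antigen, or a fragment, variant, or variant of a fragment thereof; (2) a tumor antigen, or a fragment, variant, or variant of a fragment thereof; (3) an allergenic antigen, or a fragment, variant, or variant of a fragment thereof; and (4) an autoimmune self-antigen, or a fragment, variant, or variant of a fragment thereof.
In some embodiments, the pathogenic antigen is derived from a pathogenic organism and is capable of eliciting an immune response in a subject (for example a mammalian subject, and further for example a human). In some embodiments, the pathogenic organism comprises or is one or more of the following: bacteria, viruses, fungi, and protozoa (for example unicellular organisms, multicellular organisms).
In some embodiments, the pathogenic antigen comprises or is a surface antigen, or a fragment, variant, or variant of a fragment thereof, for example a protein located on the surface of a virus, bacterium or protozoan, or a fragment (for example the external portion of a surface antigen), variant, or variant of a fragment thereof.
In some embodiments, the pathogenic antigen comprises or is a polypeptide or protein derived from a pathogen associated with an infectious disease.
In some embodiments, the pathogenic antigen is selected from, but not limited to, the group consisting of the pathogen-derived antigens described on pages 21 to 35 of WO2018/078053A1, the pathogen-derived antigens described from page 57, paragraph 3 to page 63, paragraph 2 of WO2019/077001A1, the pathogen-derived antigens described from page 32, line 26 to page 34, line 27 of WO2013/120628A1, and the antigens described from page 34, line 29 to page 59, line 5 of WO2013/120628A1.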
在一些实施方案中,致病性抗原的病原体选自但不限于下述中的一种或多种:疥螨、巴倍虫、利什曼原虫、颚口线虫、巴西钩口线虫、十二指肠钩口线虫、粪类圆线虫、毛首鞭形线虫、犬弓首线虫、猫弓首线虫、弓形虫、布氏锥虫、克鲁兹锥虫、马来丝虫、旋盘尾丝虫、班氏丝虫、绦虫、猪带绦虫、棘球绦虫、人蛔虫、脆弱双核阿米巴、福氏耐格里阿米巴原虫(Naegleria fowleri)、美洲钩虫、并殖吸虫(例如卫氏并殖吸虫)、华支睾吸虫、疟原虫(例如恶性疟原虫、间日疟原虫、三日疟原虫或卵形疟原虫)、耶氏肺孢子虫、横川后殖吸虫、旋毛虫、阴道毛滴虫、肠贾第虫、血吸虫、白地霉、人酵母菌、汉赛巴尔通体、韦尼克黑酵母(Hortaea Wernicke)、曲霉、马拉色菌、霍乱弧菌、鲍曼不动杆菌、巴西副球孢子菌、淋病奈瑟菌、脑膜炎奈瑟菌、巴氏杆菌、星状诺卡氏菌、诺卡氏菌、申克孢子丝菌、葡萄球菌、无乳链球菌、肺炎链球菌、化脓性链球菌、毛癣菌、小肠结肠炎耶尔森菌、鼠疫耶尔森菌、假结核耶尔森菌、解脲脲原体、、沙门氏菌、图拉氏弗朗西斯菌、梭杆菌、麻风分枝杆菌、弥漫型麻风分枝杆菌、结核分枝杆菌、溃疡分枝杆菌、志贺氏菌、、溶血隐秘杆菌、炭疽杆菌、蜡样芽胞杆菌、荚膜组织胞浆菌、皮炎芽生菌、百日咳博德特氏菌、包柔氏螺旋体菌、包柔体、布鲁氏菌、伯克霍尔德菌(例如洋葱伯克霍尔德菌、鼻疽伯克霍尔德菌、类鼻疽伯克霍尔德菌)、弯曲菌、念珠菌(例如白念珠菌)、肺炎嗜衣原菌、白喉棒状杆菌、伯氏考克斯氏体、梭菌(例如肉毒梭菌、艰难梭菌、产气荚膜梭菌、产气荚膜梭菌)、破伤风梭菌、球孢菌、恰菲埃里希氏体、埃翁氏埃里希氏体(Ehrlichia ewingii)、埃里希氏体、肠外致病性大肠杆菌、金氏菌、肉芽肿克雷伯菌、无形体(例如嗜吞噬细胞无形体)、钩端螺旋体属、博氏疏螺旋体、梅毒螺旋体、立克次氏体(例如普氏立克次氏体、立氏立克次氏体、伤寒立克次氏体)、鹦鹉热嗜衣原体、沙眼衣原体、库鲁病朊病毒、拉沙病毒(LASV)、嗜肺军团菌、单核细胞增生李斯特菌、肠球菌、表面癣菌、大肠杆菌O157:H7和O104:H4、肝片吸虫和巨大片吸虫、肠道病毒(例如柯萨奇A病毒、肠道病毒71(EV71))、FFI朊病毒、CJD朊病毒、爱泼斯坦-巴尔病毒(EBV)、猫免疫缺陷病毒(FIV)、黄病毒、GSS朊病毒、瓜那利托病毒、杜克雷嗜血杆菌、流感嗜血杆菌、幽门螺杆菌、布尼亚病毒科、杯状病毒科、星状病毒科、冠状病毒、刚果出血热病毒、新生隐球菌、隐孢子虫属、巨细胞病毒(CMV)、BK病毒、登革热病毒、埃博拉病毒(EBOV)、单纯疱疹病毒(HSV)、人类免疫缺陷病毒(HIV)、人乳头瘤病毒(HPV)、流感病毒、狂犬病病毒、诺如病毒、尼帕病毒、亨利帕病毒(亨克莱病毒-尼帕病毒)、甲型肝炎病毒、乙型肝炎病毒(HBV)、丙型肝炎病毒(HCV)、丁型肝炎病毒、戊型肝炎病毒、、人博卡病毒(HBoV)、人类偏肺病毒(hMPV)、人类副流感病毒(HPIV)、日本脑炎病毒、JC病毒、鸠宁(Junin)病毒、黄热病病毒、MERS冠状病毒、淋巴细胞性脉络丛脑膜炎病毒(LCMV)、马丘波病毒、马尔堡病毒、麻疹病毒、人传染性软疣病毒(MCV)、腮腺炎病毒、细小病毒B19、肺炎支原体正粘病毒、脊髓灰质炎病毒、鼻病毒、裂谷热 病毒、轮状病毒、风疹病毒、萨比亚病毒、SARS冠状病毒(例如SARS-CoV-2)、nCoV-2019冠状病毒、辛诺柏病毒、汉坦病毒、痘苗病毒、呼吸道合胞病毒(RSV)、蜱传脑炎病毒(TBEV)、水痘带状疱疹病毒(VZV)、委内瑞拉马脑炎病毒、西尼罗河病毒、西部马脑炎病毒、和寨卡病毒。
在一些实施方案中,致病性抗原包括或为下述的一种或多种:
(1)SARS冠状病毒2(SARS-CoV-2)、nCoV-2019冠状病毒或SARS冠状病毒(SARS-CoV))的下述蛋白中的一种或多种:棘突蛋白(S)、包膜蛋白(E)、膜蛋白(M)或核衣壳蛋白(N);(2)MERS冠状病毒的下述蛋白中的一种或多种棘突蛋白(S)、棘突S1片段(S1)、包膜蛋白(E)、膜蛋白(M)或核衣壳蛋白(N);(3)人乳头瘤病毒(例如HPV16)的下述蛋白中的一种或多种:复制蛋白E1、调节蛋白E2、蛋白E3、蛋白E4、蛋白E5、蛋白E6、蛋白E7、蛋白E8、主要衣壳蛋白L1和次要衣壳蛋白L2;(4)人副流感病毒(HPIV/PIV)(例如hPIV-1、hPIV-2、hPIV-3或hPIV-4血清型)的下述蛋白中的一种或多种:融合蛋白(F)、血凝素神经氨酸酶、糖蛋白(G)、基质蛋白(M)、磷蛋白(P)、核衣壳蛋白、融合糖蛋白F0、F1或F2、重组PIV3/PIV1融合糖蛋白、C蛋白、D蛋白、病毒复制酶(L)和非结构V蛋白;(5)人类偏肺病毒(hMPV)的下述蛋白中的一种或多种:融合(F)糖蛋白、糖蛋白(G)、磷蛋白(P)、和核衣壳蛋白;(6)流感病毒的下述蛋白中的一种或多种:血凝素(HA)、神经氨酸酶(NA)、核蛋白(NP)、M1蛋白、M2蛋白、NS1蛋白、NS2蛋白(NEP蛋白:核输出蛋白)、PA蛋白、PB1蛋白(聚合酶碱性1蛋白)、PB1-F2蛋白和PB2蛋白;(7)狂犬病病毒的下述蛋白中的一种或多种:核蛋白(N)、大结构蛋白(L)、磷蛋白(P)、基质蛋白(M)和糖蛋白(G);(8)人类免疫缺陷病毒的下述蛋白中的一种或多种:HIV p24抗原、HIV包膜蛋白(Gp120、Gp41、Gp160)、多蛋白GAG、负因子蛋白Nef、转录反式激活剂Tat和Brec1;(9)沙眼衣原体的下述蛋白中的一种或多种:主要外膜蛋白MOMP、可能的外膜蛋白PMPC、外膜复合蛋白B OmcB、热休克蛋白Hsp60 HSP10、蛋白IncA、III型分泌系统蛋白、核糖核苷酸还原酶小链蛋白NrdB、质粒蛋白Pgp3、衣原体外蛋白N CopN、抗原CT521、抗原CT425、抗原CT043、抗原TC0052、抗原TC0189、抗原TC0582、抗原TC0660、抗原TC0726、抗原TC0816、抗原TC0828;(10)巨细胞病毒(CMV/HCMV)的下述蛋白中的一种或多种:pp65抗原、膜蛋白pp15、衣壳近端皮层(tegument)蛋白pp150、蛋白M45、DNA聚合酶UL54、螺旋酶UL105、糖蛋白gM、糖蛋白gN、糖蛋白H、糖蛋白B gB、蛋白UL83、蛋白UL94、蛋白UL99、HCMV糖蛋白(选自gH-gL、gB、gO、gN和gM)、HCMV蛋白(选自UL83、UL123、UL128、UL130和UL131A)、表皮蛋白pp150(pp150)、皮层蛋白pp65/下基质磷蛋白(pp65)、包膜糖蛋白M(UL100)、调节蛋白IE1(UL123)、包膜蛋白(UL128)、包膜糖蛋白(130)、包膜蛋白(UL131A)、包膜糖蛋白B(UL55)、结构糖蛋白N gpUL73(UL73)、结构糖蛋白O gpUL74(UL74);(11)登革热病毒的下述蛋白中的一种或多种:衣壳蛋白C、膜前蛋白prM、膜蛋白M、包膜蛋白E(结构域I、结构域II、结构域II)、蛋白NS1、蛋白NS2A、蛋白NS2B、蛋白NS3、蛋白NS4A、蛋白2K、蛋白NS4B、蛋白NS5;(12)EBOV病毒的下述蛋白中的一种或多种:EBOV糖蛋白(GP)、表面EBOV GP、野生型EBOV pro GP、成熟EBOV GP、分泌型野生型EBOV pro GP、分泌型成熟EBOV GP、EBOV核蛋白(NP)、RNA聚合酶L和EBOV基质蛋白(选自VP35、VP40、VP24和VP30);(13)乙型肝炎病毒(HBV)的下述蛋白中的一种或多种:乙型肝炎表面抗原HBsAg、乙型肝炎核心抗原HbcAg、聚合酶、蛋白Hbx、前S2中表面蛋白、表 面蛋白L、大S蛋白、病毒蛋白VP1、病毒蛋白VP2、病毒蛋白VP3、和病毒蛋白VP4;(14)呼吸道合胞病毒(RSV)的下述蛋白中的一种或多种:融合蛋白F、F蛋白、核蛋白N、基质蛋白M、基质蛋白M2-1、基质蛋白M2-2、磷蛋白P、小疏水蛋白SH、主要表面糖蛋白G、聚合酶L、非结构蛋白1NS1、非结构蛋白2NS2、RSV附着蛋白(G)(糖蛋白G)、融合(F)糖蛋白(糖蛋白F)、核蛋白(N)、磷蛋白(P)、大聚合酶蛋白(L)、基质蛋白(M,M2)、小疏水蛋白(SH)、非结构蛋白1(NS1)、非结构蛋白2(NS2)、膜结合RSV F蛋白、膜结合DS Cavl(稳定的融合前RSV 
F蛋白);(15)结核分枝杆菌的下述蛋白中的一种或多种:分泌抗原SssA(葡萄球菌属,葡萄球菌食物中毒);分泌抗原SssA(葡萄球菌属,如金黄色葡萄球菌、葡萄球菌感染);分子伴侣DnaK、细胞表面脂蛋白Mpt83、脂蛋白P23、磷酸转运系统渗透蛋白pstA、14kDa抗原、纤维连接蛋白结合蛋白C FbpC1、丙氨酸脱氢酶TB43、谷氨酰胺合成酶1、ESX-1蛋白、蛋白CFP10,TB10.4蛋白质、蛋白MPT83、蛋白MTB12、蛋白MTB8、Rpf样蛋白质、蛋白MTB32、蛋白MTB39、晶体蛋白、热休克蛋白HSP65、和蛋白PST-S;(16)黄热病病毒的下述蛋白中的一种或多种:基因组多蛋白、蛋白E、蛋白M、衣壳蛋白C、蛋白酶NS3、蛋白NS1、蛋白NS2A、蛋白AS2B、蛋白NS4A、蛋白NS4B、蛋白NS5;(17)环孢子蛋白;和(18)寨卡病毒的下述蛋白中的一种或多种:寨卡病毒衣壳蛋白(c)、寨卡病毒膜前蛋白(prM)、寨卡病毒pr蛋白(pr)、寨卡病毒膜蛋白(M)、寨卡病毒包膜蛋白(E)、寨卡病毒非结构蛋白、寨卡病毒prME抗原、寨卡病毒衣壳蛋白、膜前/膜蛋白、ZIKV包膜蛋白、ZIKV非结构蛋白1、ZIKV非结构蛋白2A、ZIKV非结构蛋白2B、ZIKV非结构蛋白3、ZIKV非结构蛋白4A、ZIKV非结构蛋白4B、ZIKV非结构蛋白5、和寨卡病毒包膜蛋白(e)。
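In some embodiments, the tumor antigen is selected from, but not limited to, the group consisting of the tumor antigens described on pages 47 to 51 of WO2018/078053A1.
In some embodiments, the antigen expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is an allergenic antigen and an autoimmune self-antigen. In some embodiments, the allergenic antigen and the autoimmune self-antigen are derived or selected from, but not limited to, the group of antigens described on pages 59 to 73 of WO2018/078053A1.
In some embodiments, the antigen expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest is listed on pages 48 to 51 of WO2018/078053A1.
In some embodiments, the polypeptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is a therapeutic protein or polypeptide.
In some embodiments, the therapeutic protein or polypeptide comprises or is one or more of the following:
(1) a therapeutic protein or polypeptide for enzyme replacement therapy of metabolic, endocrine or amino acid disorders, or for replacing a missing, defective or mutated protein; (2) a therapeutic protein or polypeptide for treating blood diseases, circulatory diseases, respiratory diseases, infectious diseases or immunodeficiency; (3) a therapeutic protein or polypeptide for treating cancer or tumor diseases; (4) a therapeutic protein or polypeptide for hormone replacement therapy; (5) a therapeutic protein or polypeptide for reprogramming somatic cells into pluripotent or totipotent stem cells; (6) a therapeutic protein or polypeptide used as an adjuvant or immunostimulant; (7) a therapeutic protein or polypeptide that is a therapeutic antibody; (8) a therapeutic protein or polypeptide that is a gene editing agent; (9) a therapeutic protein or polypeptide for treating or preventing a liver disease selected from the group consisting of liver fibrosis, liver cirrhosis and liver cancer; and (10) a therapeutic protein or polypeptide for treating or preventing a rare disease.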
在一些实施方案中,用于治疗代谢、内分泌或氨基酸紊乱的酶替代疗法或用于替代缺失、缺失或突变蛋白质的治疗性蛋白或多肽,包括或为下述的一种或多种:酸性鞘磷 脂酶、脂肪酸、无糖苷酶β、白葡萄糖苷酶、α-半乳糖苷酶A、α-葡萄糖苷酶、α-L-艾杜糖苷酸酶、α-N-乙酰氨基葡萄糖苷酶、双调蛋白、血管生成素(Ang1、Ang2、Ang3、Ang4、ANGPTL2、ANGPTL3、ANGPTL4、ANGPTL5、ANGPTL6、ANGPTL7)、ATP酶、Cu(2+)-转运β多肽(ATP7B)、精氨琥珀酸合成酶(ASS1)、β细胞素、β-葡萄糖醛酸酶、骨形态发生蛋白BMP(BMP1、BMP2、BMP3、BMP4、BMP5、BMP6、BMP7、BMP8a、BMP8b、BMP10、BMP15)、CLN6蛋白、表皮生长因子(EGF)、表观蛋白、表观调节素、成纤维细胞生长因子(FGF、FGF-1、FGF-2、FGF-3、FGF-4、FGF-5、FGF-6、FGF-7、FGF-8、FGF-9、FGF-10、FGF-11、FGF-12、FGF-13、FGF-14、FGF-16、FGF-17、FGF-17、FGF-18、FGF-19、FGF-20、FGF-21、FGF-22、FGF-23)、延胡索酰乙酰乙酸水解酶(FAH)、加硫酯酶、生长激素释放肽、葡萄糖脑苷酶、GM-CSF、肝素结合EGF样生长因子(HB-EGF)、肝细胞生长因子HGF、肝细胞生成素、人白蛋白、白蛋白丢失增加、艾度硫酸酯酶(艾度糖-2-硫酸酯酶)、整联蛋白αVβ3、αVβ5和α5β1、Iuduronate硫酸酯酶、拉罗尼酶、N-乙酰半乳糖胺-4-硫酸酯酶(rhASB;加硫酶、芳香基硫酸酯酶A(ARSA)、芳香基硫酸酯酶B(ARSB))、N-乙酰氨基葡萄糖-6-硫酸酯酶、神经生长因子(NGF,脑源性神经营养因子(BDNF)、神经营养素-3(NT-3)和神经营养素4/5(NT-4/5)、神经调节蛋白(NRG1、NRG2、NRG3、NRG4)、神经纤毛蛋白(NRP-1、NRP-2)、肥胖抑制素、苯丙氨酸羟化酶(PAH),苯丙氨酸氨解水解酶(PAL)、血小板源生长因子(PDGF(PDFF-A、PDGF-B、PDGF-C、PDGF-D))、TGFβ受体(内皮素、TGFβ1受体、TGFβ2受体、TGFβ3受体)、血小板生成素(THPO)(巨核细胞生长和发育因子(MGDF))、转化生长因子(TGF(TGF-a、TGF-β(TGFβ1、TGFβ2和TGFβ3)))、VEGF(VEGF-A、VEGF-B、VEGF-C、VEGF-D、VEGF-E、VEGF-F和PIGF)、奈西立肽、胰蛋白酶、促肾上腺皮质激素(ACTH)、心钠肽(ANP)、胆囊收缩素、胃泌素、瘦素、催产素、生长抑素、加压素(抗利尿激素)、降钙素、艾塞那肽、生长激素(GH)、生长激素、胰岛素、胰岛素样生长因子1IGF-1、美卡西芬酯、IGF-1类似物、培维索孟、普兰林肽、特立帕肽(人甲状旁腺激素残基1-34)、贝卡普勒明、Dibotermin-α(骨形态生成蛋白2)、醋酸组氨瑞林(促性腺激素释放激素;GnRH)、奥曲肽、肝细胞核因子4α(HNF4A)、CCAAT/增强子结合蛋白α(CEBPA)、成纤维细胞生长因子21(FGF21)、细胞外基质蛋白酶或人胶原酶MMP1、肝细胞生长因子(HGF)、TNF相关凋亡诱导配体(TRAIL)、阿片生长因子受体样1(OGFRL1)、梭菌II型胶原酶、松弛素1(RLN1)、松弛素2(RLN2)、松弛素3(RLN3)和帕利夫明(角质形成细胞生长因子;KGF)。
在一些实施方案中,用于治疗代谢或内分泌疾病的治疗蛋白或多肽选自WO2017/191274表A(结合表C)中记载的蛋白或多肽。
在一些实施方案中,用于治疗血液疾病、循环系统疾病、呼吸系统疾病、癌症或肿瘤疾病、传染病或免疫缺陷的治疗性蛋白或多肽,包括或为下述的一种或多种:阿替普酶(组织纤溶酶原激活剂;tPA)、阿尼普酶、抗凝血酶III(AT-III)、比伐卢定、达贝泊汀-α、Drotrecogin-α(活化蛋白C)、促红细胞生成素、阿法依泊汀-α、红细胞生成素、erthropoyetin、因子IX、因子VIIa、因子VIII、重组水蛭素、蛋白C浓缩物、瑞替普酶(tPA的缺失突变蛋白)、链激酶、替奈普酶、尿激酶、血管增生抑制素、抗-CD22免疫毒素、地尼白介素、免疫花青、MPS(锌指蛋白)、阿柏西普、内皮他丁、胶原酶、人脱氧核糖核酸酶I、脱氧核糖核酸酶、透明质酸酶、木瓜蛋白酶、L-天冬酰胺酶、Peg-天冬酰胺酶、拉布立酶、人绒毛膜促性腺激素(HCG)、人卵泡刺激素(FSH)、促黄体素-α、催乳素、α-1-蛋白酶抑制因子、乳糖酶、胰酶(脂肪酶、淀粉酶、蛋白酶)、腺苷脱氨酶(牛培格脱氨酶,PEG-ADA)、 阿贝西普、阿法赛特、阿那白滞素、依那西普、白细胞介素-1(IL-1受体拮抗剂)、阿那白滞素、胸腺九肽、TNF-α拮抗剂、恩夫韦地和胸腺肽α1。
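In some embodiments, the therapeutic protein or polypeptide for treating cancer or tumor diseases comprises or is one or more of the following: cytokines, chemokines, suicide gene products, immunogenic proteins or peptides, apoptosis inducers, angiogenesis inhibitors, heat shock proteins, tumor antigens, β-catenin inhibitors, STING pathway activators, checkpoint modulators, innate immune activators, antibodies, dominant negative receptors and decoy receptors, inhibitors of myeloid-derived suppressor cells (MDSCs), IDO pathway inhibitors, and proteins or peptides that bind inhibitors of apoptosis.
In some embodiments, the hormone in the therapeutic protein or polypeptide for hormone replacement therapy includes one or more of the following: estrogen, progesterone and testosterone.
In some embodiments, the therapeutic protein for reprogramming somatic cells into pluripotent or totipotent stem cells includes one or more of the following: Oct-3/4, the Sox gene family (for example Sox1, Sox2, Sox3 and Sox15), the Klf family (for example Klf1, Klf2, Klf4 and Klf5), the Myc family (for example c-Myc, L-Myc and N-Myc), Nanog and LIN28.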
在一些实施方案中,用作佐剂或免疫刺激蛋白的治疗性蛋白或多肽包括或为下述的一种或多种:人辅助蛋白(human adjuvant proteins),特别是模式识别受体TLR1、TLR2、TLR3、TLR4、TLR5、TLR6、TLR7、TLR8、TLR9、TLR10、TLR11;NOD1、NOD2、NOD3、NOD4、NOD5、NALP1、NALP2、NALP3、NALP4、NALP5、NALP6、NALP6、NALP7、NALP7、NALP8、NALP9、NALP10、NALP11、NALP12、NALP13、NALP14J IPAF、NAIP、CIITA、RIG-I、MDA5和LGP2、TLR信号的信号转导子(包括衔接蛋白(例如Trif和Cardif)、小GTPases信号的成分(例如RhoA、Ras、Rac1、Cdc42、Rab等)、PIP信号的成分(例如PI3K、Src激酶等)、MyD88依赖信号的成分(例如MyD88、IRAK1、IRAK2、IRAK4、TIRAP、TRAF6等)、MyD88独立信号的成分(例如TICAM1、TICAM2、TRAF6、TBK1、IRF3、TAK1、IRAK1等)等;活化的激酶(例如Akt、MEKK1、MKK1、MKK3、MKK4、MKK6、MKK7、ERK1、ERK2、GSK3、PKC激酶、PKD激酶、GSK3激酶、JNK、p38MAPK、TAK1、IKK、TAK1等);活化的转录因子(例如NF-kB、c-Fos、c-Jun、c-Myc、CREB、AP-1、Elk-1、ATF2、IRF-3、IRF-7、热休克蛋白(例如HSP10、HSP60、HSP65、HSP70、HSP75和HSP90)、gp96、纤维蛋白原、纤维连接蛋白的III型重复额外结构域等);补体系统的成分(例如C1q、MBL、C1r、C1s、C2b、Bb、D、MASP-1、MASP-2、C4b、C3b、C5a、C3a、C4a、C5b、C6、C7、C8、C9、CR1、CR2、CR3、CR4、C1qR、C1INH、C4bp、MCP、DAF、H、I、P、CD59等);诱导靶基因的细胞表面蛋白(例如β-防御素)。在一些实施方案中,人辅助蛋白包括下述的一种或多种:trif、flt-3配体、Gp96或纤维连接蛋白、诱导或增强先天免疫应答的细胞因子(例如IL-1α、IL-1R1、IL1β、IL-2、IL-6、IL-7、IL-8、IL-9、IL-12、IL-13、IL-15、IL-16、IL-17、IL-18、IL-21、IL-23、TNFα、IFNα、IFNβ、IFNγ、GM-CSF、G-CSF、M-CSF),趋化因子(例如IL-8、IP-10、MCP-1、MIP-1α、RANTES、Eotaxin、CCL21),巨噬细胞释放的细胞因子(例如IL-1、IL-6、IL-8、IL-12、TNF-α等)。
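In some embodiments, the therapeutic protein or polypeptide used as an adjuvant or immunostimulant includes one or more of the following: bacterial (adjuvant) proteins, protozoan (adjuvant) proteins, viral (adjuvant) proteins, fungal (adjuvant) proteins, and animal-derived proteins.
In some embodiments, the bacterial (adjuvant) protein includes one or more of the following: bacterial heat shock proteins or chaperones (including Hsp60, Hsp70, Hsp90 and Hsp100); OmpA (outer membrane protein) of Gram-negative bacteria; OspA; bacterial porins (for example OmpF); bacterial toxins (for example pertussis toxin (PT) of Bordetella pertussis, the pertussis adenylate cyclase toxins CyaA and CyaC of Bordetella pertussis, the PT-9K/129G mutant of pertussis toxin, tetanus toxin, cholera toxin (CT), cholera toxin B subunit, the CTK63 mutant of cholera toxin, the CTE112K mutant of CT, Escherichia coli heat-labile enterotoxin (LT), the B subunit of heat-labile enterotoxin (LTB), and E. coli heat-labile enterotoxin mutants with reduced toxicity (for example LTK63 and LTR72)); phenol-soluble modulins; neutrophil-activating protein of Helicobacter pylori (HP-NAP); surfactant protein D; outer surface protein A lipoprotein of Borrelia burgdorferi; Ag38 (38 kDa antigen) of Mycobacterium tuberculosis; bacterial fimbrial proteins (for example the pilin of Gram-negative pili); surfactant protein A; and bacterial flagellin.
In some embodiments, the protozoan (adjuvant) protein includes one or more of the following: Tc52 of Trypanosoma cruzi, PFTG of Trypanosoma gondii, protozoan heat shock proteins, LeIF of Leishmania, and the profilin-like protein of Toxoplasma gondii.
In some embodiments, the viral (adjuvant) protein includes one or more of the following: respiratory syncytial virus fusion glycoprotein (F protein), MMT virus envelope protein, mouse leukemia virus protein, and wild-type measles virus hemagglutinin protein.
In some embodiments, the fungal (adjuvant) protein includes fungal immunomodulatory protein (FIP, for example LZ-8).
In some embodiments, the animal-derived protein includes keyhole limpet hemocyanin (KLH).
In some embodiments, the polypeptide and/or protein expressed by the nucleotide sequence encoding the polypeptide and/or protein of interest comprises or is a therapeutic protein or polypeptide that is a therapeutic antibody, for example one or more of the cytokines, chemokines, suicide enzymes and gene products, apoptosis inducers, endogenous angiogenesis inhibitors, heat shock proteins, tumor antigens, innate immune activators, and antibodies against proteins associated with tumor or cancer development described in Tables 1 to 12 of WO2016/170176A1.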
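V. DNA molecules or vectors encoding the recombinant RNA molecule comprising the 5’- and/or 3’-UTR of the invention
The invention provides a DNA molecule encoding the 5’-UTR, 3’-UTR and/or recombinant RNA molecule of the invention. The invention also provides a vector comprising the recombinant RNA molecule or DNA molecule of the invention, and a host cell comprising the recombinant RNA molecule, DNA molecule and/or vector of the invention.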
本发明还提供一种载体,其包含编码5’-UTR的第一核苷酸序列和/或编码3’-UTR的第二核苷酸序列,其中:所述第一核苷酸序列包含如下多核苷酸中的至少一种:(a):编码源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少一个基因的5’-UTR的多核苷酸;(b):编码(a)中所述5’-UTR的片段的多核苷酸;(c):编码(a)中所述5’-UTR的变体的多核苷酸;及(d):编码(b)中所述片段的变体的多核苷酸;所述第二核苷酸序列包含如下多核苷酸中的至少一种:(e):编码源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1、NFKB2中的至少一个基因的3’-UTR的多核苷酸;(f):编码(e)中所述3’-UTR的片段的多核苷酸;(g):编码(e)中所述3’-UTR的变体的多核苷酸;及(h):编码(f)中所述片段的变体的多核苷酸。
在一些实施方案中,所述基因是人基因。
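In some embodiments, the first nucleotide sequence comprises at least one of the following polynucleotides: a polynucleotide having a sequence as set forth in at least one of SEQ ID NO:1~21; a fragment of a polynucleotide having a sequence as set forth in at least one of SEQ ID NO:1~21; a variant of a polynucleotide having a sequence as set forth in at least one of SEQ ID NO:1~21; and a variant of a fragment of a polynucleotide having a sequence as set forth in at least one of SEQ ID NO:1~21. In some embodiments, the fragment, variant, or variant of a fragment has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to a polynucleotide having a sequence as set forth in one of SEQ ID NO:1~21. In some embodiments, the fragment, variant, or variant of a fragment has an insertion, addition, deletion or substitution of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides compared with a polynucleotide having a sequence as set forth in one of SEQ ID NO:1~21.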
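The variant definitions above rest on two measurable quantities: the number of nucleotide insertions, deletions or substitutions relative to a reference, and a percent identity. The specification does not state which alignment method is used to compute identity, so the following is only a minimal sketch under the assumption that identity is approximated as one minus the edit distance divided by the length of the longer sequence. The function names and the example sequences are hypothetical and are not taken from SEQ ID NO:1~21.

```python
def edit_distance(ref: str, var: str) -> int:
    """Minimum number of single-nucleotide insertions, deletions or
    substitutions needed to turn `ref` into `var` (Levenshtein distance)."""
    prev = list(range(len(var) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, v in enumerate(var, start=1):
            cost = 0 if r == v else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def approx_identity(ref: str, var: str) -> float:
    """Assumed identity measure: 1 - edits / length of the longer sequence."""
    longest = max(len(ref), len(var))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(ref, var) / longest


# Hypothetical sequences for illustration only.
reference = "GCCACCATGGCTAGC"
variant = "GCCACCATGCTAGC"  # one nucleotide deleted
print(edit_distance(reference, variant), round(approx_identity(reference, variant), 3))
```

A dedicated alignment tool would normally be used for real sequences; the sketch only makes the two thresholds in the paragraph concrete.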
在优选的实施方案中,所述基因选自PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一种。在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:1):序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸;2):1)中所述多核苷酸的片段;3):1)中所述多核苷酸的变体;和4):2)中所述片段的变体。在一些实施方案中,所述序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
所述第一核苷酸序列可以包含两个或多个串联的编码来自上述基因的5’-UTR的多核苷酸、编码来自上述基因的5’-UTR的片段的多核苷酸、编码来自上述基因的5’-UTR的变体的多核苷酸、编码来自上述基因的5’-UTR的片段的变体的多核苷酸。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:(a):编码源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少两个基因的5’-UTR的多核苷酸;(b):编码(a)中所述基因中的至少两个基因的5’-UTR的片段的多核苷酸;(c):编码(a)中所述基因中的至少两个基因的5’-UTR的变体的多核苷酸;和(d):编码(a)中所述基因中的至少两个基因的5’-UTR的片段的变体的多核苷酸。在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:(e):编码源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少两个基因的5’-UTR的多核苷酸;(f):编码(e)中所述5’-UTR的片段的多核苷酸;(g):编码(e)中所述5’-UTR的变体的多核苷酸;和(h):编码(f)中所述片段的变体的多核苷酸。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸的至少一种:序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸的片段、序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸的变体和序列如SEQ ID NO:1~21中的至少两个所示的多核苷酸的片段的变体。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:9、7、18、12、8、1和6中至少两个所示的多核苷酸、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸的片段、序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸的变体和序列如SEQ ID NO:9、7、18、12、8、1和6中的至少两个所示的多核苷酸的片段的变体。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1或6个所示的多核苷酸的片段、变体或片段的变体与SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸具有至少40%、50%、60%、 70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸的片段、变体或片段的变体与SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:a):至少两个拷贝的编码源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少一个基因的5’-UTR的多核苷酸;b):至少两个拷贝的编码a)中所述基因中的至少一个基因的5’-UTR的多核苷酸的片段;c):至少两个拷贝的编码a)中所述基因中的至少一个基因的5’-UTR的变体的多核苷酸;和d):至少两个拷贝的编码a)中所述基因中的至少一个基因的5’-UTR的片段的变体的多核苷酸。在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一个:e):至少两个拷贝的编码源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR的多核苷酸;f)至少两个拷贝的编码源自e)所述基因中的至少一个基因的5’-UTR的片段的多核苷酸;g)至少两个拷贝的编码源自e)所述基因中的至少一个基因的5’-UTR的变体的多核苷酸;h)至少两个拷贝的编码源自e)所述基因中的至少一个基因的5’-UTR的片段的变体的多核苷酸。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:1~21中至少一个所示的多核苷酸、至少两个拷贝的序列如SEQ ID NO:1~21中至少一个所示的多核苷酸的片段、至少两个拷贝的序列如SEQ ID NO:1~21中至少一个所示的多核苷酸的变体和至少两个拷贝的序列如SEQ ID NO:1~21中至少一个所示的多核苷酸的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:1~21之一所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:1~21之一所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:1~21之一所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第一核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸、至少两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸的片段、至少两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸的变体、和至少两个拷贝的序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:9、7、18、12、8、1和6之一所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:9、7、18、12、8、1或6所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述同源物或变体与SEQ ID NO:9、7、18、12、8、1或6相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸的至少一种:序列如 SEQ ID NO:22~48中的至少一种所示的多核苷酸、序列如SEQ ID NO:22~48中的至少一种所示的多核苷酸的片段、序列如SEQ ID NO:22~48中的至少一种所示的多核苷酸的变体和序列如SEQ ID NO:22~48中的至少一种所示的多核苷酸的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:22~48中的至少一种所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~48中的至少一种所示的多核苷酸的片段、变体、或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在优选的实施方案中,所述基因选自MPND、FBXW10、FBXW12和PGLYRP1中的至少一种。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:24、22、23或25所示的多核苷酸、序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段、序列如SEQ ID NO:24、22、23或25所示的多核苷酸的变体和序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段的变体。在一些实施方案中,所述序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段、变体或片段的变体与SEQ ID NO:24、22、23或25所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:24、22、23或25所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
所述第二核苷酸序列可以包含两个或多个串联的编码来自上述基因的3’-UTR的多核苷酸、编码来自上述基因的3’-UTR的变体的多核苷酸、编码来自上述基因的3’-UTR的片段的变体的多核苷酸。
在一些实施方案中,所述第二核苷酸序列包含下述多核苷酸中的至少一种:(a):编码源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1和NFKB2中的至少两个的3’-UTR的多核苷酸;(b):编码(a)中所述基因中的至少两个基因的3’-UTR的片段的多核苷酸;(c):编码(a)中所述基因中的至少两个基因的3’-UTR的变体的多核苷酸;和(d):编码(a)中所述基因中的至少两个基因的3’-UTR的片段的变体的多核苷酸。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:(e):编码源自基因MPND、FBXW10、FBXW12和PGLYRP1中的至少两个基因的3’-UTR的多核苷酸;(f):编码(e)中所述3’-UTR的片段的多核苷酸;(g):编码(e)中所述3’-UTR的变体的多核苷酸;和(h):编码(f)中所述片段的变体的多核苷酸。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸的片段、序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸的变体和序列如SEQ ID NO:22~48中的至少两个所示的多核苷酸的片段的变体。在一些实施方案中,序列如SEQ ID NO:22~48之一所示的多核苷酸的片段、变体或片段的变体与序列如22~48之一所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述序列如SEQ ID NO:22~48之一所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:序列如SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸、序列如SEQ ID NO:24、 22、23和25中的至少两个所示的多核苷酸的片段、序列如SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸的变体和序列如SEQ ID NO:24、22、23和25中的至少两个所示的多核苷酸的片段的变体。在一些实施方案中,序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段、变体、或片段的变体与序列SEQ ID NO:24、22、23或25所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,所述序列如SEQ ID NO:24、22、23或25所示的多核苷酸的片段、变体、或片段的变体与序列SEQ ID NO:24、22、23或25所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:a):至少两个拷贝的编码源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1和NFKB2中的至少一个基因的3’-UTR的多核苷酸;b):至少两个拷贝的编码a)中所述基因中的至少一个基因的3’-UTR的片段的多核苷酸;c):至少两个拷贝的编码a)中所述基因中的至少一个基因的3’-UTR的变体的多核苷酸;和d):至少两个拷贝的编码a)中所述基因中的至少一个基因的3’-UTR的片段的变体的多核苷酸。在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:e):至少2个拷贝的编码源自基因MPND、FBXW10、FBXW12和PGLYRP1中的一个的3’-UTR的多核苷酸;f):至少两个拷贝的编码e)中所述基因中的至少一个基因的3’-UTR的片段的多核苷酸;g):至少两个拷贝的编码e)中所述基因中的至少一个基因的3’-UTR的变体的多核苷酸;和h):至少两个拷贝的编码e)中所述基因中的至少一个基因的3’-UTR的片段的变体的多核苷酸。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸、至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸的片段、至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸的变体和至少两个拷贝的序列如SEQ ID NO:22~48之一所示的多核苷酸的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:22~48之一所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:22~48之一所示的多核苷酸的片段、变体或片段的变体与序列如SEQ ID NO:22~48之一的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
在一些实施方案中,所述第二核苷酸序列包含如下多核苷酸中的至少一种:至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸的片段、至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸的变体和至少两个拷贝的序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸的片段的变体。优选地,所述至少两个拷贝为两个拷贝、三个拷贝、四个拷贝、五个拷贝、六个拷贝、七个拷贝、八个拷贝或九个拷贝。在一些实施方案中,所述序列如SEQ ID NO:24、22、23和25之一所示的多核苷酸与序列如SEQ ID NO:24、22、23或25所示的多核苷酸具有至少40%、50%、60%、70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。在一些实施方案中,序列如SEQ ID NO:24、22、23和25之一所示 的多核苷酸与序列如SEQ ID NO:24、22、23或25所示的多核苷酸相比具有至少1、2、3、4、5、6、7、8、9、10或更多个核苷酸的插入、添加、删除或取代。
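In some embodiments, the vector is used to produce a recombinant RNA molecule, for example an mRNA molecule. In some embodiments, the vector comprises the first nucleotide sequence and the second nucleotide sequence. In some embodiments, the vector further comprises, between the first nucleotide sequence and the second nucleotide sequence, a third nucleotide sequence encoding the polypeptide and/or protein of interest. In some embodiments, the third nucleotide sequence encodes at least one polypeptide of interest, for example one, two, three, four, five, six, seven, eight, nine or ten polypeptides of interest. In some embodiments, the third nucleotide sequence encodes at least one protein of interest, for example one, two, three, four, five, six, seven, eight, nine or ten proteins of interest. In some embodiments, the third nucleotide sequence encodes at least one polypeptide of interest and at least one protein of interest, for example one to ten polypeptides of interest and one to ten proteins of interest.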
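In some embodiments, the poly(A) tail comprises at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides. Preferably, the poly(A) tail comprises at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides. In some embodiments, the poly(A) tail comprises one or more nucleotides other than A nucleotides. In some embodiments, the poly(A) tail comprises a stretch of two or more consecutive nucleotides in which the first and the last nucleotide are nucleotides other than A. Preferably, the poly(A) tail is segmented, that is, a stretch of m consecutive A nucleotides is joined to a stretch of n consecutive A nucleotides through a linker sequence consisting of p non-A nucleotides, where m, n and p are positive integers. Preferably, m is 30, n is 70, and p is 10.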
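The segmented tail described above (m A nucleotides, a linker of p non-A nucleotides, then n A nucleotides) can be sketched as a small string-building helper. This is an illustration only: the linker sequence below is a hypothetical placeholder, not the linker of SEQ ID NO:53, and the function name is an assumption.

```python
def segmented_poly_a(m: int, n: int, linker: str) -> str:
    """Build a segmented poly(A) tail: m A's + linker of non-A nucleotides + n A's."""
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    linker = linker.upper()
    if not linker:
        raise ValueError("linker must contain at least one nucleotide")
    if "A" in linker:
        raise ValueError("linker must consist of non-A nucleotides only")
    return "A" * m + linker + "A" * n


# Preferred values from the text: m = 30, n = 70, p = 10.
# The 10-nt linker below is a made-up placeholder.
tail = segmented_poly_a(30, 70, "GCTCTGCTGC")
print(len(tail))
```

Rejecting any A in the linker mirrors the requirement that the linker consist of non-A nucleotides, so that both A stretches keep exactly the stated lengths.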
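In some embodiments, in the vector, the DNA sequence corresponding to the poly(A) tail is as shown in SEQ ID NO:53.
In addition, the vector may further comprise expression control elements for correct expression of the vector in a host. Such control elements are known to those skilled in the art and may include a promoter, a splice cassette, a translation initiation codon, and translation and insertion sites for introducing an insert into the vector.
In some embodiments, the vector of the invention may be, for example, a plasmid, a cosmid, a virus, a bacteriophage or another vector conventionally used in genetic engineering, and may comprise additional genes, such as marker genes that allow selection of the vector in a suitable host cell under suitable conditions.
The recombinant RNA molecule and vector of the invention can be introduced into cells directly or by means of liposomes, viral vectors (for example adenovirus or retrovirus), electroporation, ballistic methods (for example a gene gun) or other delivery systems.
In addition, the invention provides a method for preparing mRNA, comprising contacting the vector of the invention with an RNA polymerase.
In some embodiments, the method of the invention further comprises a step of linearizing the plasmid. In some embodiments, before linearization, the supercoiled fraction of the plasmid is at least about 90%. In some embodiments, the method of the invention further comprises a step of purifying the linearized plasmid. In some embodiments, the method of the invention further comprises a step of purifying the mRNA.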
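In some embodiments, the method of the invention further comprises a capping step and, optionally, a step of purifying the capped product. In some embodiments, the cap is a Cap1-type cap. The Cap1-type cap structure is shown in formula (I):
cap G1G2 = m7G-5'-ppp-5'-Gm2'-3'-p- [m7 = 7-CH3; m2' = 2'-O-CH3; -ppp- = -(PO2H-O-PO2H-O-PO2H)-; -p- = -PO2H-].
The capping reactions are as follows:
pppN1(pN)x-OH(3') → ppN1(pN)x-OH(3') + Pi
ppN1(pN)x-OH(3') + GTP → G(5')ppp(5')N1(pN)x-OH(3') + PPi
G(5')ppp(5')N1(pN)x-OH(3') + AdoMet → m7G(5')ppp(5')N1(pN)x-OH(3') + AdoHcy
m7GpppN1(pN)x-OH(3') + AdoMet → m7Gppp[m2'-O]N1(pN)x-OH(3') + AdoHcy.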
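In addition, the invention provides a host cell comprising the recombinant RNA molecule, DNA molecule and/or vector of the invention, for example a bacterial cell. The vector of the invention, for example a plasmid, is stored and/or amplified in the host cell.
The host cell of the invention can be prepared by transforming a competent host cell with the vector of the invention. A competent host cell is a cell that is able to take up free extracellular genetic material (such as a DNA plasmid) independently of its sequence. Many bacterial cells known to those skilled in the art are naturally able to take up exogenous DNA from the environment and can therefore serve as bacterial host cells according to the invention. In addition, it is known to those skilled in the art that competent bacterial host cells can be obtained from bacterial cells that are not naturally competent, for example by electroporation or by chemical treatment (such as treatment with calcium ions accompanied by exposure to elevated temperature). After uptake, the DNA plasmid is preferably neither degraded nor integrated into the genomic information of the bacterial host cell. Bacterial host cells include Escherichia coli (E. coli) cells, which are well known to those skilled in the art.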
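VI. Lipid nanoparticles comprising the recombinant RNA molecule containing the 5’- and/or 3’-UTR of the invention, pharmaceutical compositions containing the recombinant RNA molecule containing the 5’- and/or 3’-UTR of the invention, and treatment/prevention of disease
The invention provides a lipid nanoparticle comprising the recombinant RNA molecule, DNA molecule or vector of the invention.
In some embodiments, the lipid nanoparticle contains a protonatable cationic lipid.
In some embodiments, the lipid nanoparticle further contains one or more of a helper lipid, a structural lipid and a PEG-lipid (polyethylene glycol-lipid). In some further embodiments, the lipid nanoparticle further contains the helper lipid, the structural lipid and the PEG-lipid.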
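In some embodiments as described above, the helper lipid is a phospholipid. The phospholipid is usually semi-synthetic, but may also be of natural origin or chemically modified. The phospholipids include, but are not limited to, DSPC (distearoylphosphatidylcholine), DOPE (dioleoylphosphatidylethanolamine), DOPC (dioleoylphosphatidylcholine), DOPS (dioleoylphosphatidylserine), DSPG (1,2-dioctadecanoyl-sn-glycero-3-phospho-(1’-rac-glycerol)), DPPG (dipalmitoylphosphatidylglycerol), DPPC (dipalmitoylphosphatidylcholine), DGTS (1,2-dipalmitoyl-sn-glycero-3-O-4'-(N,N,N-trimethyl)homoserine), lysophospholipids and the like. Preferably, the helper lipid is one or more selected from DSPC, DOPE, DOPC and DOPS. In some embodiments, the helper lipid is DSPC and/or DOPE.
In some embodiments, the structural lipid is a sterol, including but not limited to cholesterol, cholesterol esters, steroid hormones, steroid vitamins, bile acids, cholesterin, ergosterol, β-sitosterol, oxidized cholesterol derivatives and the like. Preferably, the structural lipid is at least one selected from cholesterol, cholesterol esters, steroid hormones, steroid vitamins and bile acids. In some embodiments, the structural lipid is cholesterol, preferably high-purity cholesterol, in particular injection-grade high-purity cholesterol, for example CHO-HP (produced by AVT).
As used herein, the term PEG-lipid (polyethylene glycol-lipid) refers to a conjugate of polyethylene glycol and a lipid structure. Preferably, the PEG-lipid is selected from PEG-DMG and PEG-distearoylphosphatidylethanolamine (PEG-DSPE), and is preferably PEG-DMG. Preferably, the PEG-DMG is a polyethylene glycol (PEG) derivative of 1,2-dimyristoyl glycerol. Preferably, the PEG has an average molecular weight of about 2000 to 5000, preferably about 2000.
In some further embodiments, the lipid nanoparticle further contains the helper lipid, the structural lipid and the PEG-lipid. In some embodiments, based on the total amount of the protonatable cationic lipid, the helper lipid, the structural lipid and the PEG-lipid, the lipid nanoparticle comprises the protonatable cationic lipid in the following amount (mole percent): about 25.0%-75.0%, for example about 25.0%-28.0%, 28.0%-32.0%, 32.0%-35.0%, 35.0%-40.0%, 40.0%-42.0%, 42.0%-45.0%, 45.0%-46.3%, 46.3%-48.0%, 48.0%-49.5%, 49.5%-50.0%, 50.0%-55.0%, 55.0%-60.0%, 60.0%-65.0%, or 65.0%-75.0%.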
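The invention provides a cationic liposome comprising the recombinant RNA molecule, DNA molecule or vector of the invention.
The invention provides a cationic protein comprising the recombinant RNA molecule, DNA molecule or vector of the invention.
The invention provides a cationic polymer comprising the recombinant RNA molecule, DNA molecule or vector of the invention.
The invention provides a pharmaceutical composition comprising the recombinant RNA molecule, DNA molecule, vector, host cell, cationic liposome, cationic protein, cationic polymer or lipid nanoparticle of the invention, together with a pharmaceutically acceptable carrier, diluent or excipient.
In addition, the invention provides use of the recombinant RNA molecule, DNA molecule, vector, lipid nanoparticle or pharmaceutical composition of the invention in the preparation of a medicament.
In some embodiments, the medicament is for gene therapy, genetic vaccination, protein replacement therapy, antisense therapy, or therapy by interfering RNA.
In some embodiments, the medicament is a nucleic acid drug, wherein the nucleic acid includes at least one of the following: RNA, messenger RNA (mRNA), antisense oligonucleotide, DNA, plasmid, ribosomal RNA (rRNA), microRNA (miRNA), transfer RNA (tRNA), small inhibitory RNA (siRNA), small nuclear RNA (snRNA), small hairpin RNA (shRNA), single guide RNA (sgRNA) and Cas9 mRNA.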
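In some embodiments, the medicament is for the treatment and/or prevention of a disease. Preferably, the disease is selected from the group consisting of rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renovascular diseases, and metabolic diseases. Preferably, the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer and prostate cancer; the genetic disease includes one or more of hemophilia, thalassemia and Gaucher disease.
In some embodiments, the medicament is a vaccine. In some embodiments, the recombinant molecule is used to produce an antigen of a pathogen or a part thereof.
In some embodiments, the medicament is a gene therapy agent. In some embodiments, the recombinant molecule is used to produce a protein associated with a genetic disease.
In some embodiments, the recombinant molecule is used to produce an antibody, for example an scFv or a nanobody.
In addition, the invention provides a method for preventing or treating a disease, comprising administering the recombinant RNA molecule or pharmaceutical composition of the invention to a subject in need thereof.
In some embodiments, the disease or condition is selected from the group consisting of rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renovascular diseases, and metabolic diseases. Preferably, the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer and prostate cancer; the genetic disease includes one or more of hemophilia, thalassemia and Gaucher disease.
In some embodiments, the recombinant RNA molecule or pharmaceutical composition is used as a vaccine for preventing a disease. In some embodiments, the recombinant RNA molecule is used to produce an antigen of a pathogen or a part thereof.
In some embodiments, the recombinant RNA molecule is used to produce a protein associated with the genetic disease.
In some embodiments, the recombinant RNA molecule is used to produce an antibody, for example an scFv or a nanobody.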
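VII. Beneficial effects of the invention:
Through extensive screening, the invention provides optimized UTRs. These UTRs can improve the translation efficiency and/or stability of mRNA and raise the expression level of polypeptides and/or proteins, and are therefore of considerable value for the development of mRNA vaccines.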
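Examples
Example 1: Construction of plasmid D
The construction method is as follows:
Construction of plasmid B:
Using the Luciferase-pcDNA3 plasmid (purchased from Addgene, plasmid #18964) as the backbone, new restriction sites HindIII and BamHI were added upstream of the Kozak sequence, and new restriction sites KpnI and ApaI were added downstream of the luciferase stop codon, giving plasmid B. The nucleotide sequence of the Luciferase-pcDNA3 plasmid is shown in SEQ ID NO:50, and its plasmid map is shown in Figure 2. The nucleotide sequence of plasmid B is shown in SEQ ID NO:51, and its plasmid map is shown in Figure 3.
Construction of plasmid C:
The Amp (ampicillin) resistance gene in plasmid B was replaced with a Kana (kanamycin) resistance gene, and the neo/KanR sequence between positions 3746 and 4540 was removed, giving plasmid C. The plasmid map of plasmid C is shown in Figure 4, and its nucleotide sequence is shown in SEQ ID NO:52.
Construction of plasmid D:
The poly(A) tail shown in SEQ ID NO:53 was inserted into plasmid C. Specifically, plasmid C was digested with ApaI alone and the product was purified; the purified single-digestion product and the poly(A) tail shown in SEQ ID NO:53 were joined by homologous recombination; the recombination product was transformed into DH5α; clones were selected on LB plates containing 50 μg/mL kanamycin; and plasmid was extracted from clones confirmed by sequencing, giving plasmid D. The plasmid map of plasmid D is shown in Figure 5, and its nucleotide sequence is shown in SEQ ID NO:54.
The construction workflow for plasmid D is shown in Figure 1A, and a schematic of the modifications is shown in Figure 1B.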
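Example 2: Construction of plasmids containing different 5’-UTRs
The 5’-UTRs shown in Table 1 were designed and synthesized:
The 5’-UTR sequences were synthesized by Sangon Biotech (Shanghai) Co., Ltd. According to the company's quality analysis report, the 5’-UTR sequences matched the designed sequences.
Construction of plasmids containing different 5’-UTRs:
Plasmid D was double-digested with HindIII and BamHI, and fragment A of approximately 6 kb was recovered by gel extraction (Axygen). Homology arm sequences were added to each 5’-UTR sequence by PCR, and the product was joined to fragment A by homologous recombination using Takara's homologous recombination enzyme system at 50 °C for 15 minutes. The reaction mixture was transformed into DH5α (Takara) competent cells, which were plated on LB plates containing 50 μg/mL kanamycin. After 16 hours of culture, 3 to 4 single colonies were picked and cultured for 8 hours in medium containing 50 μg/mL kanamycin; plasmids were then extracted and sequenced by Sangon Biotech (Shanghai) Co., Ltd., giving plasmids containing the different 5’-UTRs.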
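The PCR step that adds homology arms to each UTR can be sketched as primer design: each primer carries a 5' tail matching the end of the digested vector, followed by a region that anneals to the insert. The sketch below is illustrative only. The arm length (15 nt), the annealing length (18 nt), the function names and all sequences are assumptions and are not taken from the specification or from the Takara kit instructions.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(insert: str, upstream: str, downstream: str,
                   arm_len: int = 15, anneal_len: int = 18):
    """Return (forward primer, reverse primer, expected amplicon).

    upstream / downstream are the vector sequences flanking the cut site;
    the last / first arm_len nucleotides become the homology arms.
    """
    insert, upstream, downstream = insert.upper(), upstream.upper(), downstream.upper()
    if len(upstream) < arm_len or len(downstream) < arm_len:
        raise ValueError("flanking sequences are shorter than the homology arm")
    if len(insert) < anneal_len:
        raise ValueError("insert is shorter than the annealing region")
    left_arm, right_arm = upstream[-arm_len:], downstream[:arm_len]
    forward = left_arm + insert[:anneal_len]
    reverse = reverse_complement(insert[-anneal_len:] + right_arm)
    amplicon = left_arm + insert + right_arm
    return forward, reverse, amplicon


# Hypothetical vector flanks and UTR insert, for illustration only.
fwd, rev, amp = design_primers(
    insert="GGGAAACCCTTTGGGAAACCC",
    upstream="TTTTTAAGCTTACGTACGTACG",
    downstream="GGATCCAAAAACCCCCGGGGG",
)
print(fwd, rev, len(amp))
```

The same helper would apply to the 3’-UTR inserts, with the flanks taken from the KpnI and ApaI ends of the digested plasmid.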
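Example 3: Construction of plasmids containing different 3’-UTRs
The 3’-UTRs shown in Table 2 were designed and synthesized:
The 3’-UTR sequences were synthesized by Sangon Biotech (Shanghai) Co., Ltd. According to the company's quality analysis report, the 3’-UTR sequences matched the designed sequences.
Construction of plasmids containing different 3’-UTR sequences:
Plasmid D was double-digested with KpnI and ApaI, and fragment A of approximately 6 kb was recovered by gel extraction (Axygen). Homology arm sequences were added to each 3’-UTR sequence by PCR, and the product was joined to fragment A by homologous recombination using Takara's homologous recombination enzyme system at 50 °C for 15 minutes. The reaction mixture was transformed into DH5α (Takara) competent cells, which were plated on LB plates containing 50 μg/mL kanamycin. After 16 hours of culture, 3 to 4 single colonies were picked and cultured for 8 hours in medium containing 50 μg/mL kanamycin; plasmids were then extracted and sequenced by Sangon Biotech (Shanghai) Co., Ltd., giving plasmids containing the different 3’-UTRs.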
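Example 4: Synthesis of mRNA
1. Extract the plasmid template, ensuring a supercoiled fraction of at least 90%.
2. Plasmid linearization
(1) Restriction digestion: linearize 20 μg of plasmid by digestion with the appropriate enzyme (BsaI).
Note: adjust the reaction according to actual needs; because of the capacity of the purification kit column, use no more than 20 μg of plasmid per reaction.
(2) Set up the reaction in a 0.2 mL tube, mix, and digest in a 37 °C incubator overnight or for 2 h.
(3) Run 1 μL of the digested product alongside the undigested plasmid on a gel to check whether digestion is complete.
3. Purification of the linearized plasmid: recover using the Takara recovery kit (Cat. No. 9761)
(1) Add 3 volumes of Buffer DC to the PCR reaction (or other enzymatic reaction) and mix well (if the required volume of Buffer DC is less than 100 μL, add 100 μL).
(2) Place the Spin Column from the kit on a Collection Tube.
(3) Transfer the solution from step (1) to the Spin Column, centrifuge at 12,000 rpm for 1 minute at room temperature, and discard the filtrate.
(4) Add 700 μL of Buffer WB to the Spin Column, centrifuge at 12,000 rpm for 30 seconds at room temperature, and discard the filtrate.
4. Phenol-chloroform extraction (to remove RNases, proteins, etc.)
(1) Dilute the plasmid to be extracted to 300 μL or 500 μL, add an equal volume of phenol-chloroform, and centrifuge at 12,000 rpm for 10 min.
(2) After centrifugation, collect as much of the upper phase as possible, add an equal volume of phenol-chloroform again, and centrifuge at 12,000 rpm for 10 min.
(3) Again collect as much of the upper phase as possible, add 0.1 volume of 5 M NaCl and 0.7 volume of isopropanol, incubate on ice for 15 min, centrifuge at 12,000 rpm for 10 min, and discard the supernatant.
(4) Wash once with 200 μL of 70% ethanol (pre-chilled on ice), centrifuge at 12,000 rpm for 1 min, discard the supernatant, centrifuge again at 12,000 rpm for 1 min, and remove all remaining liquid with a small pipette tip.
(5) Air-dry at room temperature, then add an appropriate volume of UltraPure DNase/RNase-Free Distilled Water and mix.
(6) Measure the concentration of the purified sample with a OneDrop.
Verification: run 100 ng of the purified plasmid alongside the undigested plasmid on a gel; once linearization is confirmed to be complete with no extra bands, the plasmid can be used for mRNA synthesis.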
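5. In vitro transcription of mRNA (following the Vazyme IVT reaction kit)
(1) Clean the work area: first sterilize the biosafety cabinet with UV for 0.5 h, then wipe the cabinet with alcohol cotton balls and spray it with RNase inhibitor. After 5 min, wipe off the inhibitor with alcohol cotton balls and begin the RNA work.
(2) Preparing the reaction: before setting up the reaction, take out each reagent, vortex to mix, centrifuge, and keep in an ice box until use.
Note: work on ice throughout while setting up the reaction, and prepare the plasmid template and the IVT mixture separately.
(3) IVT mixture: add every component except the plasmid template below the liquid surface, add the T7 enzyme last, and vortex to mix; uridine (U) is replaced with N1-methylpseudouridine.
(4) Transfer the required amount of plasmid to a new PCR tube and incubate it alongside the IVT mixture for 5 min; then combine the two, vortex to mix, centrifuge, and incubate at 37 °C for 2 h.
(5) Set up the reaction in a 0.2 mL flat-cap thin-walled tube, mix, and run it in a PCR instrument at 37 °C for 2 h. Note: the reaction can also be scaled up in proportion to the amount required.
(6) Remove the linearized plasmid template with DNase: add 1 μL of DNase per 20 μL reaction (5 μL per 100 μL reaction) and incubate at 37 °C for 15 min.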
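6. Purification (following the Thermo MEGAclear Kit)
Column purification
(1) Transfer each reaction to a 1.5 mL EP tube (if the volume is less than 100 μL, bring it to 100 μL with Elution Solution) and mix gently.
(2) Add 350 μL of Binding Solution Concentrate and mix gently by pipetting.
(3) Add 250 μL of absolute ethanol and mix gently by pipetting.
(4) Insert the filter cartridge supplied with the kit into a collection tube, load the 700 μL mixture onto the cartridge, allow binding for 10 min at room temperature, centrifuge at 12,000 g for 1 min, discard the filtrate, and reuse the collection tube for the RNA washes.
(5) Add 500 μL of Wash Solution and pass it through the cartridge (centrifuge at 12,000 g for 1 min).
(6) Repeat step (5).
(7) Discard the Wash Solution and centrifuge again at maximum speed for 1 min to remove residual Wash Solution.
(8) Add an appropriate volume (80-100 μL) of pre-warmed RNase-Free Distilled Water to the cartridge, close the cap, let stand at 70 °C for 10 min, and centrifuge at 12,000 g for 1 min; the elution can be repeated as needed.
(9) Measure the concentration with a OneDrop or Qubit (after a 10-fold dilution).
7. Verification: mix 5 μL of the diluted purified sample with NorthernMax™ Formaldehyde Load Dye, incubate at 75 °C for 10 min, and then keep on ice. Run a 1% agarose gel to confirm that the mRNA is the correct size with no smeared bands before proceeding to the capping reaction.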
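8. Capping reaction (following the CELLSCRIPT kit)
Cap1-type capping of mRNA:
The Cap1-type cap structure and the reaction scheme are as follows:
pppN1(pN)x-OH(3') → ppN1(pN)x-OH(3') + Pi
ppN1(pN)x-OH(3') + GTP → G(5')ppp(5')N1(pN)x-OH(3') + PPi
G(5')ppp(5')N1(pN)x-OH(3') + AdoMet → m7G(5')ppp(5')N1(pN)x-OH(3') + AdoHcy
m7GpppN1(pN)x-OH(3') + AdoMet → m7Gppp[m2'-O]N1(pN)x-OH(3') + AdoHcy
5’-Cap1-type cap structure:
cap G1G2 = m7G+-5'-ppp-5'-Gm2'-3'-p- [m7 = 7-CH3; m2' = 2'-O-CH3; -ppp- = -(PO2H-O-PO2H-O-PO2H)-; -p- = -PO2H-].
Pre-heat the mRNA at 37 °C for 5 min or at 65 °C for 5 min (CELLSCRIPT; the reaction system is shown in Table 3 below).
Table 3: Cap1-type capping reaction system
Note: because of the constraints of the reaction system, no more than 60 μg of mRNA should be capped per 100 μL reaction.
The 100 μL capping reaction (following the CELLSCRIPT capping reaction kit) is shown in Table 4 below.
Table 4: 100 μL capping system
Mix the pre-heated mRNA with the above system and incubate at 37 °C for 1 h.
9. Purification: same as the purification of the product after in vitro transcription.
Characterization: measure the concentration with a OneDrop or Qubit.
Result: final mRNA products bearing a Cap1-type cap and different UTRs were obtained, in which all uridine (U) nucleosides are replaced with N1-methylpseudouridine.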
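Several steps in Example 4 involve simple scaling arithmetic: the DNase volume is stated as 1 μL per 20 μL of IVT reaction, the concentration is read after a 10-fold dilution, and at most 60 μg of mRNA is capped per 100 μL reaction. The helpers below are a minimal sketch of that arithmetic only. The function names and the example readings are hypothetical and do not come from the specification.

```python
import math


def dnase_volume_ul(ivt_volume_ul: float) -> float:
    """DNase volume at the stated ratio of 1 uL per 20 uL of IVT reaction."""
    return ivt_volume_ul / 20.0


def stock_concentration(reading_ng_per_ul: float, dilution_factor: float = 10.0) -> float:
    """Back-calculate the undiluted concentration from a diluted reading."""
    return reading_ng_per_ul * dilution_factor


def capping_reactions_needed(mrna_ug: float, max_ug_per_reaction: float = 60.0) -> int:
    """Number of 100 uL capping reactions needed at <= 60 ug mRNA each."""
    if mrna_ug <= 0:
        raise ValueError("mRNA amount must be positive")
    return math.ceil(mrna_ug / max_ug_per_reaction)


# Hypothetical example: 100 uL IVT reaction, diluted reading of 85 ng/uL,
# 90 uL eluate.
conc = stock_concentration(85.0)   # ng/uL in the undiluted eluate
yield_ug = conc * 90.0 / 1000.0    # total mRNA in ug
print(dnase_volume_ul(100.0), conc, yield_ug, capping_reactions_needed(yield_ug))
```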
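Example 5: Screening of 5’-UTRs
The effect of each 5’-UTR on luciferase expression was compared using a cell-based luciferase reporter assay.
HEK293 cells were transfected with mRNAs containing different 5’-UTRs. Specifically, HEK293 cells were seeded in a 96-well plate at 40,000 cells per well one day in advance, and transfection was carried out the next day when the cells had reached 70% to 90% confluence.
Using the transfection reagent lipoMAX (Invitrogen; used according to the lipoMAX instructions), the mRNAs containing different 5'-UTRs were each transfected into HEK293 cells at 100 ng of mRNA per well. After incubation for 16 h in a 37 °C CO2 incubator, luciferase reporter activity was measured (according to the Promega kit instructions).
The results are shown in Figure 6. Compared with mRNA lacking a UTR, mRNAs carrying 5UTR-NO2, 5UTR-NO4, 5UTR-NO12, 5UTR-NO13, 5UTR-NO14, 5UTR-NO17, 5UTR-NO20, 5UTR-NO22, 5UTR-NO26, 5UTR-NO29, 5UTR-NO34 and 5UTR-NO37 markedly increased protein expression.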
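The comparison against the no-UTR control can be expressed as a fold change of mean luminescence. The specification does not describe how the readings were processed, so the following is only a sketch of one common approach; the sample names and all luminescence values are made-up placeholders, not data from Figure 6.

```python
from statistics import mean


def fold_over_control(readings: dict, control_key: str) -> dict:
    """Mean signal of each sample divided by the mean signal of the control."""
    control_mean = mean(readings[control_key])
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return {name: mean(values) / control_mean
            for name, values in readings.items()
            if name != control_key}


# Placeholder replicate readings (relative light units), invented for illustration.
example = {
    "no_UTR": [1000.0, 1200.0, 1100.0],
    "UTR_X": [2200.0, 2300.0, 2100.0],
}
print(fold_over_control(example, "no_UTR"))
```

A fold change above 1 would correspond to the "increased expression" wording used for the listed UTRs; statistical testing is outside this sketch.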
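Example 6: Screening of 3’-UTRs
The effect of each 3’-UTR on luciferase expression was compared using a cell-based luciferase reporter assay.
HEK293 cells were transfected with mRNAs containing different 3’-UTRs. Specifically, HEK293 cells were seeded in a 96-well plate at 40,000 cells per well one day in advance, and transfection was carried out the next day when the cells had reached 70% to 90% confluence.
Using the transfection reagent lipoMAX (Invitrogen; used according to the lipoMAX instructions), the mRNAs containing different 3'-UTRs were each transfected into HEK293 cells at 100 ng of mRNA per well. After incubation for 16 h in a 37 °C CO2 incubator, luciferase reporter activity was measured (according to the Promega kit instructions).
The results are shown in Figure 7. Compared with mRNA lacking a UTR, mRNAs containing 3UTR-NO4, 3UTR-NO5, 3UTR-NO6, 3UTR-NO8, 3UTR-NO10, 3UTR-NO11, 3UTR-NO13, 3UTR-NO14, 3UTR-NO16, 3UTR-NO22, 3UTR-NO23, 3UTR-NO24, 3UTR-NO25, 3UTR-NO32, 3UTR-NO36, 3UTR-NO41, 3UTR-NO46, 3UTR-NO48 and 3UTR-NO50 increased protein expression.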
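Example 7: Screening of combinations of 5’-UTR and 3’-UTR
I. In vivo imaging (IVIS) was used to compare the expression in mice of a series of mRNAs each containing 5UTR-NO54 and one of several different 3’-UTRs. The procedure was as follows:
1. 5UTR-NO54 (DNA sequence shown in SEQ ID NO:55; synthesized by Sangon Biotech (Shanghai) Co., Ltd.) was inserted between the HindIII and BamHI sites of blank plasmid D (for the construction of blank plasmid D, see Example 1). With the 5’-UTR sequence fixed, 3UTR-NO6 and 3UTR-NO8 were each inserted between the ApaI and KpnI sites of the plasmid containing 5UTR-NO54, giving a series of plasmids each containing 5UTR-NO54 and one of the different 3’-UTRs.
2. Using the plasmids from step 1, a series of mRNAs each containing 5UTR-NO54 and one of the different 3’-UTRs were prepared as described in Example 4, and each was then formulated as an LNP-mRNA preparation by the following steps:
(1) LNP encapsulation: the formulation volume (the combined volume of the aqueous phase and the ethanol phase of each system) was 1.5 mL, with an mRNA:lipid mass ratio of 1:10 and a final HAc-NaOAc buffer (0.2 M, pH 5.0) concentration of 0.025 M in the aqueous phase. Each sample was prepared on a microfluidic device (MPE-L2) with a chip (SN.000035) at aqueous phase:ethanol phase = 9 mL/min:3 mL/min, the two phases being injected into the microfluidic chip through the designated ports for mixing. The aqueous phase contained 0.3 mg of mRNA in a total volume of 1125 μL; the ethanol phase contained the lipids dissolved in ethanol, (SM-102):DSPC:CHO-HP:DMG-PEG2000 (mol%) = 50:10:38.5:1.5, in a total volume of 375 μL. The chemical structure of SM-102 is shown below:
SM-102 is commercially available or can be prepared by methods known in the art.
(2) Dialysis and buffer exchange to prepare the LNP-mRNA preparation:
a. Preparation of the dialysis buffer:
1×PBS + 8% (m/V) sucrose solution: add 2 packs of 1×PBS premixed powder to a beaker, dissolve and mix in 2 L of DEPC-treated water, then add 160 g of sucrose and mix to give the 1×PBS + 8% (m/V) sucrose solution.
b. Dialysis: load each formulation into a 100 kDa dialysis bag, immerse it in a beaker containing 1 L of dialysis buffer, wrap the beaker in aluminum foil, and dialyze at room temperature at 100 rpm for 1 h; then replace the dialysis buffer and dialyze for a further 1 hour, giving the different LNP-mRNA preparations.
3. The LNP-mRNA preparations obtained in step 2 were each tested by in vivo imaging. The mice used were female Balb/c mice, 6 weeks old, with 3 mice per group; each mouse received 12 μg of LNP-mRNA preparation by tail vein injection. After 12 h, the mice were anesthetized by isoflurane inhalation and injected with the luciferase substrate D-Luciferin (150 mg/kg). The animals were then placed in the supine position, and the distribution and intensity of the luciferin signal in the mice were observed with an IVIS in vivo imaging system.
The luciferase expression of each LNP-mRNA preparation in mice is shown in Figure 8, in which the unit of average photon count is p/s/cm²/sr. Control 1 in Figure 8 differs from the other groups only in that it contains no 3’-UTR; the LNP-mRNA preparation of control 1 was prepared as described above. The results show that, compared with control 1, mRNAs containing 5UTR-NO54 and a 3’-UTR selected from 3UTR-NO6 and 3UTR-NO8 achieved higher luciferase expression in mice.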
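The encapsulation step in part I fixes the phase volumes (a 9:3 flow ratio over 1.5 mL), the mRNA mass (0.3 mg), the mRNA:lipid mass ratio (1:10) and the lipid molar ratio (50:10:38.5:1.5). The sketch below shows how per-lipid masses could be derived from those numbers. It rests on two assumptions that are not stated in the specification: the 1:10 mass ratio is taken to refer to total lipid, and the molecular weights are approximate literature values supplied for illustration.

```python
# Molar ratio from the text (mol%).
MOL_PERCENT = {"SM-102": 50.0, "DSPC": 10.0, "CHO-HP": 38.5, "DMG-PEG2000": 1.5}

# Approximate molecular weights in g/mol: assumed values, not from the specification.
MW_G_PER_MOL = {"SM-102": 710.2, "DSPC": 790.1, "CHO-HP": 386.7, "DMG-PEG2000": 2509.2}


def split_volumes(total_ul: float, aqueous_parts: float, ethanol_parts: float):
    """Split a total volume by flow-rate ratio into (aqueous, ethanol) volumes."""
    whole = aqueous_parts + ethanol_parts
    return total_ul * aqueous_parts / whole, total_ul * ethanol_parts / whole


def lipid_masses_mg(total_lipid_mg: float, mol_percent: dict, mw: dict) -> dict:
    """Distribute a total lipid mass according to molar ratio and molecular weight."""
    weighted = {name: mol_percent[name] * mw[name] for name in mol_percent}
    total = sum(weighted.values())
    return {name: total_lipid_mg * w / total for name, w in weighted.items()}


aqueous_ul, ethanol_ul = split_volumes(1500.0, 9.0, 3.0)
total_lipid_mg = 0.3 * 10.0  # assumes mRNA:total lipid = 1:10 by mass
masses = lipid_masses_mg(total_lipid_mg, MOL_PERCENT, MW_G_PER_MOL)
print(aqueous_ul, ethanol_ul, {k: round(v, 3) for k, v in masses.items()})
```

The volume split reproduces the 1125 μL and 375 μL figures given in the text; the per-lipid masses depend on the assumed molecular weights and should be recomputed from supplier data.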
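II. In vivo imaging was used to compare the expression in mice of a series of mRNAs each containing 3UTR-NO3 and one of several different 5’-UTRs. The procedure was as follows:
1. 3UTR-NO3 (DNA sequence shown in SEQ ID NO:56; synthesized by Sangon Biotech (Shanghai) Co., Ltd.) was inserted between the ApaI and KpnI sites of blank plasmid D (for the construction of blank plasmid D, see Example 1). With the 3’-UTR sequence fixed, 5UTR-NO12, 5UTR-NO2, 5UTR-NO11, 5UTR-NO13, 5UTR-NO14, 5UTR-NO18 and 5UTR-NO29 were each inserted between the HindIII and BamHI sites of the plasmid containing 3UTR-NO3, giving a series of plasmids each containing 3UTR-NO3 and one of the different 5’-UTRs.
2. Using the plasmids from step 1, a series of mRNAs each containing 3UTR-NO3 and one of the different 5’-UTRs were prepared as described above, and the corresponding LNP-mRNA preparations were then prepared.
3. The LNP-mRNA preparations obtained in step 2 were each tested by mouse imaging. The mice used were female Balb/c mice, 6 weeks old, with 3 mice per group; each mouse received 12 μg of LNP-mRNA preparation by tail vein injection. After 12 h, the mice were anesthetized by isoflurane inhalation and injected with the luciferase substrate D-Luciferin (150 mg/kg). The animals were then placed in the supine position, and the distribution and intensity of the luciferin signal in the mice were observed with an IVIS in vivo imaging system.
The luciferase expression of each LNP-mRNA preparation in mice is shown in Figure 9, in which the unit of average photon count is p/s/cm²/sr. Control 2 in Figure 9 differs from the other groups only in that it contains no 5’-UTR; the LNP-mRNA preparation of control 2 was prepared as described above. The results show that, compared with control 2, mRNAs containing 3UTR-NO3 and a 5’-UTR selected from 5UTR-NO12, 5UTR-NO2, 5UTR-NO11, 5UTR-NO13, 5UTR-NO14, 5UTR-NO18 and 5UTR-NO29 achieved higher luciferase expression in mice.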
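The D-Luciferin dose in both imaging protocols is given per body weight (150 mg/kg), so the amount per animal depends on the weight of each mouse. The sketch below shows the conversion. The body weight and the stock concentration used in the example are hypothetical values chosen for illustration; neither is stated in the specification.

```python
def luciferin_dose_mg(body_weight_g: float, dose_mg_per_kg: float = 150.0) -> float:
    """Substrate amount in mg for one animal at the stated 150 mg/kg dose."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_g / 1000.0


def injection_volume_ul(dose_mg: float, stock_mg_per_ml: float) -> float:
    """Injection volume in uL for a given dose and stock concentration."""
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return dose_mg / stock_mg_per_ml * 1000.0


# Hypothetical 20 g mouse and 15 mg/mL stock solution.
dose = luciferin_dose_mg(20.0)
print(dose, injection_volume_ul(dose, 15.0))
```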

Claims (40)

  1. 一种重组RNA分子,其包含:
    (1)编码感兴趣的多肽和/或蛋白的第一核苷酸序列;和
    (2)含有5’-非翻译区(5’-UTR)的第二核苷酸序列;
    所述5’-UTR包含选自以下多核苷酸中的至少一种:
    (a):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1、HBB、MYSM1、LENG1、TMSB4X、CASP4、IFNA1、PGLYRP1、UCHL1、CPAMD8、TTR、APOA2、GH1、DTYMK、APOC2和CDK7中的至少一个基因的5’-UTR;
    (b):(a)中所述5’-UTR的片段;
    (c):(a)中所述5’-UTR的变体;及
    (d):(b)中所述片段的变体;
    所述第一核苷酸序列与所述第二核苷酸序列不天然出现于同一RNA分子。
  2. 权利要求1的重组RNA分子,其中所述基因是人基因。
  3. 权利要求1或2的重组RNA分子,其中所述第二核苷酸序列包含以下多核苷酸中的至少一种:
    (a):源自基因PPIA、HPX、FTCD、CDK5RAP3、HSPA8、HBA1和HBB中的至少一个基因的5’-UTR;
    (b):(a)中所述5’-UTR的片段;
    (c):(a)中所述5’-UTR的变体;及
    (d):(b)中所述片段的变体。
  4. 权利要求1~3任一项的重组RNA分子,其中所述第二核苷酸序列包含下述多核苷酸中的至少一种:
    序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体和序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体;
    优选地,所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的变体、所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段和所述序列如SEQ ID NO:1~21中至少一个所示的多核苷酸编码的RNA的片段的变体,与所述序列如SEQ ID NO:1~21中的至少一个所示的多核苷酸编码的RNA具有至少70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。
  5. 权利要求1~4任一项的重组RNA分子,其中所述第二核苷酸序列包含:
    (a)中所述基因中的至少两个基因的5’-UTR、(a)中所述基因中的至少两个基因的5’-UTR的片段、(a)中所述基因中的至少两个基因的5’-UTR的变体和(a)中所述基因中的至少两个基因的5’-UTR的片段的变体中的至少一种。
  6. 权利要求1~4任一项的重组RNA分子,其中所述第二核苷酸序列包含:至少两个拷贝的(a)中的5’-UTR、至少两个拷贝的(a)中的5’-UTR的片段、至少两个拷贝的(a)中的5’-UTR的变体和至少两个拷贝的(a)中的5’-UTR的片段的变体中的至少一种。
  7. 权利要求1~6任一项的重组RNA分子,其进一步还包含启动子、5’-帽子结构、3’-UTR和poly(A)尾中的至少一种。
  8. The recombinant RNA molecule of claim 7, wherein the 5’-cap structure comprises at least one of m7GpppG, m2^(7,3′-O)GpppG, m7Gppp(5')N1 and m7Gppp(m2′-O)N1.
  9. 权利要求7或8的重组RNA分子,其中所述3’-UTR包含:
    i)源自白蛋白基因、α-珠蛋白基因、β-珠蛋白基因、酪氨酸羟化酶基因、脂加氧酶基因和胶原蛋白α基因中的至少一种基因的3’-UTR;
    ii)所述i)中的所述3’-UTR的变体;
    iii)源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1、NFKB2和GH1中至少一种基因的3’-UTR、其片段、变体和片段的变体中的至少一种;优选地,序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA、序列SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段、序列SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的变体和序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段的变体中的至少一种;
    优选地,所述序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的变体、所述序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段、和所述序列如SEQ ID NO:22~49中至少一个所示的多核苷酸编码的RNA的片段的变体与所述序列如SEQ ID NO:22~49中的至少一个所示的多核苷酸编码的RNA有至少70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性;
    iv)至少两个拷贝的i)、ii)或iii)中的一种多核苷酸;或
    v)由i)~iii)中的多核苷酸所构成的组中的至少两种多核苷酸。
  10. 权利要求7~9任一项的重组RNA分子,其中构成所述poly(A)尾的核苷酸包含至少20个、至少40个、至少80个、至少100个或至少120个A核苷酸;优选地,构成所述poly(A)尾的核苷酸包含连续地至少20个、至少40个、至少80个、至少100个或至少120个A核苷酸。
  11. 权利要求7~10任一项的重组RNA分子,其中构成所述poly(A)尾的核苷酸包括一个或多个除A核苷酸之外的其他核苷酸;
    可选地,构成所述poly(A)的核苷酸包含连续两个或两个以上的除A核苷酸外的其他核苷酸。
  12. 一种重组RNA分子,其包含:
    (1)编码感兴趣的多肽和/或蛋白的第一核苷酸序列;和
    (2)含有3’-非翻译区(3’-UTR)的第二核苷酸序列;
    所述3’-UTR包含选自以下多核苷酸中的至少一种:
    (a):源自基因MPND、FBXW10、FBXW12、PGLYRP1、HPX、CDK7、APOC2、PFN1、RBP4、FTCD、NAAA、ALB、GSDMD、FBXL8、ORM1、CASP4、CHMP2A、LENG1、MYCBPAP、APOC1、GAPDH、HSPA8、APOA2、UCHL1、TSG101、NAE1、和NFKB2中的至少一个基因的3’-UTR;
    (b):(a)中所述3’-UTR的片段;
    (c):(a)中所述3’-UTR的变体;及
    (d):(b)中所述片段的变体;
    所述第一核苷酸序列和所述第二核苷酸序列不天然出现于同一RNA分子。
  13. 权利要求12的重组RNA分子,其中所述基因是人基因。
  14. 权利要求12或13的重组RNA分子,其中所述第二核苷酸序列包含以下多核苷酸中的至少一种:
    (a):源自基因MPND、FBXW10、FBXW12和PGLYRP1中的至少一个基因的3’-UTR;
    (b):(a)中所述3’-UTR的片段;
    (c):(a)中所述3’-UTR的变体;及
    (d):(b)中所述片段的变体。
  15. 权利要求12~14任一项的重组RNA分子,其中所述第二核苷酸序列包含下述多核苷酸中的至少一种:
    序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段、序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的变体和序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段的变体;
    优选地,所述序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的变体、所述序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段和所述序列如SEQ ID NO:22~48中至少一个所示的多核苷酸编码的RNA的片段的变体,与所述序列如SEQ ID NO:22~48中的至少一个所示的多核苷酸编码的RNA具有至少70%、80%、85%、90%、91%、92%、93%、94%、95%、96%、97%、98%或99%的相同性。
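  16. The recombinant RNA molecule of any one of claims 12 to 15, wherein the second nucleotide sequence comprises at least one of: the 3’-UTRs of at least two of the genes of (a), fragments of the 3’-UTRs of at least two of the genes of (a), variants of the 3’-UTRs of at least two of the genes of (a), and variants of fragments of the 3’-UTRs of at least two of the genes of (a).
  17. The recombinant RNA molecule of any one of claims 12 to 15, wherein the second nucleotide sequence comprises at least one of: at least two copies of the 3’-UTR of (a), at least two copies of a fragment of the 3’-UTR of (a), at least two copies of a variant of the 3’-UTR of (a), and at least two copies of a variant of a fragment of the 3’-UTR of (a).
  18. The recombinant RNA molecule of any one of claims 12 to 17, further comprising at least one of a promoter, a 5’-cap structure, a 5’-UTR and a poly(A) tail.
  19. The recombinant RNA molecule of claim 17, wherein the 5’-cap structure comprises at least one of m7GpppG, m2 7,3′-OGpppG, m7Gppp(5')N1 or m7Gppp(m2′-O)N1.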
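  20. The recombinant RNA molecule of claim 18 or 19, wherein the 5’-UTR comprises:
    i) at least one of a 5’-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7, a fragment thereof, a variant thereof, and a variant of the fragment; preferably, an RNA encoded by a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of said RNA, a variant of said RNA, and a variant of a fragment of said RNA;
    preferably, the variant, the fragment, and the variant of the fragment of the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21 have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21;
    ii) at least two copies of one of the polynucleotides of i); or
    iii) at least two of the polynucleotides of i).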
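  21. The recombinant RNA molecule of any one of claims 18 to 20, wherein the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides.
  22. The recombinant RNA molecule of any one of claims 18 to 21, wherein the nucleotides constituting the poly(A) tail comprise one or more nucleotides other than A nucleotides.
  23. A DNA molecule encoding the recombinant RNA molecule of any one of claims 1 to 22.
  24. A vector comprising the DNA molecule of claim 23.
  25. A host cell comprising the recombinant RNA molecule of any one of claims 1 to 22, the DNA molecule of claim 23 or the vector of claim 24.
  26. A lipid nanoparticle comprising the recombinant RNA molecule of any one of claims 1 to 22.
  27. A pharmaceutical composition comprising the recombinant RNA molecule of any one of claims 1 to 22, the DNA molecule of claim 23, the vector of claim 24, the host cell of claim 25 or the lipid nanoparticle of claim 26, and a pharmaceutically acceptable carrier.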
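  28. A vector comprising a first nucleotide sequence encoding a 5’-UTR and/or a second nucleotide sequence encoding a 3’-UTR, wherein:
    the first nucleotide sequence comprises at least one of the following polynucleotides: (a): a polynucleotide encoding a 5’-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1, HBB, MYSM1, LENG1, TMSB4X, CASP4, IFNA1, PGLYRP1, UCHL1, CPAMD8, TTR, APOA2, GH1, DTYMK, APOC2 and CDK7; (b): a polynucleotide encoding a fragment of the 5’-UTR of (a); (c): a polynucleotide encoding a variant of the 5’-UTR of (a); and (d): a polynucleotide encoding a variant of the fragment of (b);
    the second nucleotide sequence comprises at least one of the following polynucleotides: (e): a polynucleotide encoding a 3’-UTR derived from at least one of the genes MPND, FBXW10, FBXW12, PGLYRP1, HPX, CDK7, APOC2, PFN1, RBP4, FTCD, NAAA, ALB, GSDMD, FBXL8, ORM1, CASP4, CHMP2A, LENG1, MYCBPAP, APOC1, GAPDH, HSPA8, APOA2, UCHL1, TSG101, NAE1 and NFKB2; (f): a polynucleotide encoding a fragment of the 3’-UTR of (e); (g): a polynucleotide encoding a variant of the 3’-UTR of (e); and (h): a polynucleotide encoding a variant of the fragment of (f).
  29. The vector of claim 28, wherein the gene is a human gene.
  30. The vector of claim 28 or 29, wherein the first nucleotide sequence comprises a 5’-UTR derived from at least one of the genes PPIA, HPX, FTCD, CDK5RAP3, HSPA8, HBA1 and HBB, or a variant thereof.
  31. The vector of any one of claims 28 to 30, wherein the second nucleotide sequence comprises a 3’-UTR derived from at least one of the genes MPND, FBXW10, FBXW12 and PGLYRP1, or a variant thereof.
  32. The vector of any one of claims 28 to 31, comprising the first nucleotide sequence and the second nucleotide sequence.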
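  33. The vector of any one of claims 28 to 32, wherein the first nucleotide sequence comprises:
    i) a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21, a fragment of said polynucleotide, a variant of said polynucleotide, and a variant of a fragment of said polynucleotide;
    preferably, the variant, the fragment, and the variant of the fragment of the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21 have at least 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% homology to the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 1 to 21;
    ii) at least two copies of one of the polynucleotides of i); or
    iii) at least two of the polynucleotides of i).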
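  34. The vector of any one of claims 28 to 33, wherein the second nucleotide sequence comprises:
    (1): a polynucleotide encoding a 3’-UTR derived from at least one of an albumin gene, an α-globin gene, a β-globin gene, a tyrosine hydroxylase gene, a lipoxygenase gene and a collagen α gene;
    (2): a polynucleotide encoding a variant of the 3’-UTR of (1);
    (3): at least one of a polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, a fragment of said polynucleotide, a variant of said polynucleotide, and a variant of a fragment of said polynucleotide;
    preferably, the RNA encoded by the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48, the fragment encoded by said polynucleotide, and the variant of the fragment encoded by said polynucleotide have 70%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the polynucleotide having the sequence shown in at least one of SEQ ID NOs: 22 to 48;
    (4): at least two copies of one of the polynucleotides of (1), (2) or (3); or
    (5): at least two polynucleotides from the group consisting of the polynucleotides of (1) to (3).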
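  35. The vector of any one of claims 28 to 34, further comprising a polynucleotide encoding a poly(A) tail.
  36. The vector of claim 35, wherein the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 A nucleotides; preferably, the nucleotides constituting the poly(A) tail comprise at least 20, at least 40, at least 80, at least 100 or at least 120 consecutive A nucleotides.
  37. The vector of claim 35, wherein the nucleotides constituting the poly(A) tail comprise one or more nucleotides other than A nucleotides.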
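  38. Use of the recombinant RNA molecule of any one of claims 1 to 22, the DNA molecule of claim 23, the vector of claim 24 or 28, the host cell of claim 25, the lipid nanoparticle of claim 26 or the pharmaceutical composition of claim 27 in the preparation of a medicament;
    preferably, the medicament is for gene therapy, genetic vaccination or protein replacement therapy.
  39. The use of claim 38, wherein the medicament is a nucleic acid drug, the nucleic acid including at least one of: RNA, messenger RNA (mRNA), DNA, a plasmid, ribosomal RNA (rRNA), single guide RNA (sgRNA) and Cas9 mRNA.
  40. The use of any one of claims 38 to 39, wherein the medicament is for the treatment and/or prevention of a disease;
    preferably, the disease is selected from the group consisting of: rare diseases, infectious diseases, cancer, genetic diseases, autoimmune diseases, diabetes, neurodegenerative diseases, cardiovascular diseases, renal vascular diseases, and metabolic diseases;
    preferably, the cancer includes one or more of lung cancer, gastric cancer, liver cancer, esophageal cancer, colon cancer, pancreatic cancer, brain cancer, lymphoma, blood cancer or prostate cancer; and the genetic disease includes one or more of hemophilia, thalassemia and Gaucher disease.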
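
Illustrative note (not part of the claims): two quantitative features recur in the claims above, namely the number of A nucleotides in the poly(A) tail (total versus consecutive) and percent identity against a reference sequence at thresholds from 70% to 99%. The Python sketch below shows one way such values might be computed. All function names are hypothetical. The claims do not specify how identity is calculated, so the sketch assumes a Needleman-Wunsch global alignment with simple match/mismatch/gap scores and defines identity as identical columns divided by alignment length; other definitions would give different numbers.

```python
from itertools import groupby

# Thresholds recited in the claims (percent identity).
THRESHOLDS = [70, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
# Poly(A) length tiers recited in the claims.
POLY_A_TIERS = [20, 40, 80, 100, 120]


def poly_a_summary(tail: str) -> dict:
    """Count A nucleotides in a tail: total, longest consecutive run, and non-A."""
    tail = tail.upper()
    runs = [len(list(g)) for base, g in groupby(tail) if base == "A"]
    total = tail.count("A")
    return {"total_A": total,
            "longest_A_run": max(runs, default=0),
            "non_A": len(tail) - total}


def tiers_met(count: int) -> list:
    """Return the recited 'at least N' tiers satisfied by a given A count."""
    return [t for t in POLY_A_TIERS if count >= t]


def global_identity(a: str, b: str, match=1, mismatch=-1, gap=-1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment.

    Assumption: identity = identical aligned columns / alignment length * 100.
    The scoring values are illustrative defaults, not taken from the patent.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal alignment, counting columns and identical columns.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        cols += 1
    return 100.0 * same / cols if cols else 100.0


def highest_threshold_met(pct: float):
    """Highest recited identity threshold that pct reaches, or None."""
    met = [t for t in THRESHOLDS if pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    tail = "A" * 30 + "GC" + "A" * 70   # made-up tail interrupted by two non-A bases
    info = poly_a_summary(tail)
    print(info, tiers_met(info["total_A"]), tiers_met(info["longest_A_run"]))
    print(highest_threshold_met(global_identity("AAAA", "AAAU")))
```

The example tail is invented for illustration. It contains 100 A nucleotides in total but only 70 consecutively, which shows why the "at least N" and "at least N consecutive" limitations can be satisfied at different tiers.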
PCT/CN2023/139184 2022-12-16 2023-12-15 UTR for improving translation efficiency and/or stability of RNA molecule and use thereof Ceased WO2024125637A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2025534775A JP2026501176A (ja) 2022-12-16 2023-12-15 Rna分子の翻訳効率及び/又は安定性を向上させるためのutr及びその使用
KR1020257023528A KR20250113527A (ko) 2022-12-16 2023-12-15 Rna 분자의 번역 효율 및/또는 안정성을 개선하기 위한 utr 및 이의 용도
EP23902830.1A EP4636090A1 (en) 2022-12-16 2023-12-15 Utr for improving translation efficiency and/or stability of rna molecule and use thereof
AU2023396545A AU2023396545A1 (en) 2022-12-16 2023-12-15 Utr for improving translation efficiency and/or stability of rna molecule and use thereof
CN202380086599.2A CN120435560A (zh) 2022-12-16 2023-12-15 提高rna分子翻译效率和/或稳定性的utr及其应用

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNPCT/CN2022/139593 2022-12-16
CN2022139593 2022-12-16

Publications (1)

Publication Number Publication Date
WO2024125637A1 true WO2024125637A1 (zh) 2024-06-20

Family

ID=91484402

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/139184 Ceased WO2024125637A1 (zh) 2022-12-16 2023-12-15 UTR for improving translation efficiency and/or stability of RNA molecule and use thereof

Country Status (6)

Country Link
EP (1) EP4636090A1 (zh)
JP (1) JP2026501176A (zh)
KR (1) KR20250113527A (zh)
CN (1) CN120435560A (zh)
AU (1) AU2023396545A1 (zh)
WO (1) WO2024125637A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2025228326A1 (zh) * 2024-04-29 2025-11-06 深圳深信生物科技有限公司 重组rna分子及其应用

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013120628A1 (en) 2012-02-15 2013-08-22 Curevac Gmbh Nucleic acid comprising or coding for a histone stem-loop and a poly(a) sequence or a polyadenylation signal for increasing the expression of an encoded pathogenic antigen
CN104321432A (zh) * 2012-03-27 2015-01-28 库瑞瓦格有限责任公司 包含5′top utr的人工核酸分子
WO2016170176A1 (en) 2015-04-22 2016-10-27 Curevac Ag Rna containing composition for treatment of tumor diseases
WO2017191274A2 (en) 2016-05-04 2017-11-09 Curevac Ag Rna encoding a therapeutic protein
WO2018078053A1 (en) 2016-10-26 2018-05-03 Curevac Ag Lipid nanoparticle mrna vaccines
US20180195077A1 (en) * 2015-07-16 2018-07-12 Cornell University Methods of enhancing translation ability of rna molecules, treatments, and kits
WO2019077001A1 (en) 2017-10-19 2019-04-25 Curevac Ag NEW ARTIFICIAL NUCLEIC ACID MOLECULES
CN111405912A (zh) * 2017-09-29 2020-07-10 因特利亚治疗公司 用于基因组编辑的多核苷酸、组合物及方法
CN112656954A (zh) * 2013-10-22 2021-04-16 夏尔人类遗传性治疗公司 用于递送信使rna的脂质制剂
CN113521269A (zh) * 2020-04-22 2021-10-22 生物技术Rna制药有限公司 冠状病毒疫苗
CN114717230A (zh) * 2021-01-05 2022-07-08 麦塞拿治疗(香港)有限公司 成纤维细胞生长因子mRNA的无细胞和无载体体外RNA转录方法和核酸分子

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
"Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
NEEDLEMAN, WUNSCH, J. MOL. BIOL., vol. 48, 1970, pages 443
PARDI N., MURAMATSU H., WEISSMAN D., KARIKÓ K.: "Synthetic Messenger RNA and Cell Metabolism Modulation. Methods in Molecular Biology (Methods and Protocols)", vol. 969, 2013, HUMANA PRESS
PEARSON, LIPMAN, PROC. NATL ACAD. SCI. USA, vol. 85, 1988, pages 2444
See also references of EP4636090A1
SMITH, WATERMAN, ADV. APPL. MATH., vol. 2, 1981, pages 482

Also Published As

Publication number Publication date
EP4636090A1 (en) 2025-10-22
KR20250113527A (ko) 2025-07-25
CN120435560A (zh) 2025-08-05
JP2026501176A (ja) 2026-01-14
AU2023396545A1 (en) 2025-07-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23902830

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2025534775

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2025534775

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 202380086599.2

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: AU2023396545

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 1020257023528

Country of ref document: KR

Free format text: ST27 STATUS EVENT CODE: A-0-1-A10-A15-NAP-PA0105 (AS PROVIDED BY THE NATIONAL OFFICE)

WWE Wipo information: entry into national phase

Ref document number: 1020257023528

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2023902830

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2023396545

Country of ref document: AU

Date of ref document: 20231215

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWP Wipo information: published in national office

Ref document number: 1020257023528

Country of ref document: KR

ENP Entry into the national phase

Ref document number: 2023902830

Country of ref document: EP

Effective date: 20250716

WWP Wipo information: published in national office

Ref document number: 202380086599.2

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2023902830

Country of ref document: EP