WO2024041653A1 - A CRISPR-Cas13 system and its application - Google Patents

A CRISPR-Cas13 system and its application

Info

Publication number
WO2024041653A1
WO2024041653A1 PCT/CN2023/115093 CN2023115093W WO2024041653A1 WO 2024041653 A1 WO2024041653 A1 WO 2024041653A1 CN 2023115093 W CN2023115093 W CN 2023115093W WO 2024041653 A1 WO2024041653 A1 WO 2024041653A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
cas13
sequence
seq
rna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/115093
Other languages
English (en)
French (fr)
Inventor
梁峻彬
梁兴祥
孙阳
徐辉
司凯威
李秋婷
彭志琴
皇甫德胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Synsorbio Technology Co Ltd
Guangzhou Reforgene Medicine Co Ltd
Original Assignee
Zhejiang Synsorbio Technology Co Ltd
Guangzhou Reforgene Medicine Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020257009723A (patent KR20250053925A)
Priority to AU2023328197A (patent AU2023328197A1)
Priority to IL319213A (patent IL319213A)
Priority to EP23856738.2A (patent EP4578945A4)
Priority to JP2024550764A (patent JP2025511466A)
Application filed by Zhejiang Synsorbio Technology Co Ltd, Guangzhou Reforgene Medicine Co Ltd
Priority to CN202380014099.8A (patent CN118159650A)
Publication of WO2024041653A1
Priority to US18/755,750 (patent US12297450B2)
Priority to ZA2024/07040A (patent ZA202407040B)
Priority to MX2025002272A (patent MX2025002272A)
Anticipated expiration
Priority to US19/194,365 (patent US20250250590A1)
Ceased (current legal status)

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/16Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00Medicinal preparations containing organic active ingredients
    • A61K31/70Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088Compounds having three or more nucleosides or nucleotides
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102Mutagenizing nucleic acids
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86Viral vectors
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/87Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90Stable introduction of foreign DNA into chromosome
    • C12N15/902Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00Structure or type of the nucleic acid
    • C12N2310/10Type of nucleic acid
    • C12N2310/20Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00Structure or type of the nucleic acid
    • C12N2310/50Physical structure
    • C12N2310/53Physical structure partially self-complementary or closed
    • C12N2310/531Stem-loop; Hairpin
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00ssDNA viruses
    • C12N2750/00011Details
    • C12N2750/14011Parvoviridae
    • C12N2750/14111Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Definitions

  • This disclosure relates to the field of CRISPR gene editing, specifically a CRISPR-Cas13 system and its application.
  • CRISPR-Cas13 is an RNA targeting and editing system based on the bacterial immune system that protects bacteria from viruses.
  • the CRISPR-Cas13 system is similar to the CRISPR-Cas9 system, but unlike the Cas9 protein, which targets DNA, the Cas13 protein targets RNA.
  • CRISPR-Cas13 belongs to the Type VI CRISPR-Cas systems, which contain a single effector protein, Cas13.
  • CRISPR-Cas13 can be divided into multiple subtypes (such as Cas13a, Cas13b, Cas13c and Cas13d) based on phylogeny.
  • Cas13a, Cas13b, Cas13c and Cas13d a subtype of Cas13a, Cas13b, Cas13c and Cas13d
  • cytotoxicity e.g., parasitic RNA degradation caused by
  • the present disclosure provides a CRISPR-Cas13 system that targets a target RNA, and its application.
  • One aspect of the present disclosure relates to a Cas13 protein whose amino acid sequence has at least 90% sequence identity as compared to SEQ ID NO: 1.
  • the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide comprising a direct repeat sequence linked to a guide sequence engineered to hybridize to a target RNA.
  • the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide, and the CRISPR complex is capable of specifically binding to a target RNA sequence.
  • the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide comprising a direct repeat sequence linked to a guide sequence engineered to guide sequence-specific binding of the CRISPR complex to the target RNA.
  • the amino acid sequence of the Cas13 protein has at least 95% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 96% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 97% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 98% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 99% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 99.5% sequence identity as compared to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein is as shown in SEQ ID NO: 1.
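The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (equal length, `-` for gaps) and that identity is counted as identical columns divided by aligned columns; other conventions exist, and the text here does not state which one is used. The function names and the toy peptides are hypothetical and are not SEQ ID NO: 1.

```python
def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: identity = identical columns / aligned columns.
    Both inputs must already be aligned (same length, '-' for gaps).
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(q, r) for q, r in zip(aligned_query, aligned_ref)
               if not (q == "-" and r == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column with a gap on one side never counts as identical.
    identical = sum(1 for q, r in columns if q == r)
    return 100.0 * identical / len(columns)


def meets_identity_threshold(aligned_query: str, aligned_ref: str,
                             threshold: float = 90.0) -> bool:
    """True if the alignment reaches the given percent-identity threshold."""
    return percent_identity(aligned_query, aligned_ref) >= threshold


# Toy 10-residue peptides (hypothetical, NOT SEQ ID NO: 1).
reference = "MKVTAAGLRK"
variant = "MKVTAAGLRR"  # differs from the reference at one position
print(percent_identity(variant, reference))
print(meets_identity_threshold(variant, reference, threshold=95.0))
```

In practice the alignment itself would come from a dedicated alignment tool; only the counting step is sketched here.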
  • the disclosed Cas13 protein i.e., C13-2 protein
  • the sequence SEQ ID NO: 1 was identified based on bioinformatics analysis of prokaryotic genomes and metagenomes in the CNGB database (China National Gene Bank) and subsequent activity verification.
  • the Cas13 protein of the present disclosure is from a species comprising a genome with an average nucleotide identity (ANI) of ≥ 95% to the genome represented by the genome number CNA0009596 in the CNGB database.
  • ANI Average nucleotide identity
  • this disclosure is defined by the above threshold: species whose genomes have ANI values ≥ 95% relative to the reference genome are considered the same species, and the Cas13 proteins therein are homologous to, and have functions similar to, the protein claimed in this disclosure and fall within the scope of this disclosure.
  • ANI analysis tools include FastANI, JSpecies and other programs.
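As a rough illustration of applying the ANI threshold, the sketch below reads one result row in the tab-separated layout that FastANI is generally documented to write (query genome, reference genome, ANI value, mapped fragments, total query fragments) and compares the ANI to a cutoff, read here as "at least 95%". The function names, file names and numbers are hypothetical; the column layout is an assumption that should be checked against the FastANI version in use.

```python
def ani_from_fastani_line(line: str) -> float:
    """Extract the ANI value (third tab-separated field) from one result row.

    Assumed layout: query, reference, ANI, mapped fragments, total fragments.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least 3 tab-separated fields")
    return float(fields[2])


def is_same_species(line: str, threshold: float = 95.0) -> bool:
    """True if the reported ANI is at or above the species-level threshold."""
    return ani_from_fastani_line(line) >= threshold


# Hypothetical result row; the file names and values are made up.
row = "candidate_genome.fa\tCNA0009596.fa\t97.42\t1510\t1604"
print(is_same_species(row))
```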
  • the amino acid sequence of the Cas13 protein contains one, two, three, four, five, six, seven or more mutations as compared to SEQ ID NO: 1, e.g., single amino acid insertions, single amino acid deletions, single amino acid substitutions, or combinations thereof.
  • the Cas13 protein contains one or more mutations in the catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein contains a mutation in the catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein contains one or more mutations in one or both HEPN domains and substantially lacks RNA cleavage activity. In some embodiments, the Cas13 protein contains a mutation in either HEPN domain and substantially lacks RNA cleavage activity.
  • substantially lacking RNA cleavage activity refers to retaining only ≤50%, ≤40%, ≤30%, ≤20%, ≤10%, ≤5%, or ≤1% of the RNA cleavage activity of the wild-type Cas13 protein, or having no detectable RNA cleavage activity.
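The definition above amounts to comparing a mutant's cleavage activity with that of the wild-type protein. A minimal sketch follows, assuming both activities are measured in the same assay and units, and treating the cutoff (50% by default here) as a parameter chosen from the listed percentages. The function names and numeric values are hypothetical.

```python
def retained_activity_percent(mutant_activity: float,
                              wild_type_activity: float) -> float:
    """Mutant cleavage activity as a percentage of wild-type activity."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * mutant_activity / wild_type_activity


def substantially_lacks_cleavage(mutant_activity: float,
                                 wild_type_activity: float,
                                 cutoff_percent: float = 50.0) -> bool:
    """True if the mutant retains no more than cutoff_percent of wild-type activity."""
    retained = retained_activity_percent(mutant_activity, wild_type_activity)
    return retained <= cutoff_percent


# Made-up readings in arbitrary units from the same hypothetical assay.
print(retained_activity_percent(2.0, 40.0))
print(substantially_lacks_cleavage(2.0, 40.0, cutoff_percent=10.0))
```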
  • the RxxxxH motif of the Cas13 protein (x represents any amino acid, RxxxxH may also be denoted as Rx4H or R4xH) contains one or more mutations and substantially lacks RNA cleavage activity.
  • the Cas13 protein contains a mutation at a position corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and/or the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein comprises a mutation at a position corresponding to the RxxxxH motif at positions 750-755 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 750-755 of the reference protein as set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 750-755 and the RxxxxH motif at positions 785-790 of the reference protein as set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the RxxxxH motif is mutated to AxxxxH, RxxxxA, or AxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxH. In some embodiments, the RxxxxH motif is mutated to RxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxA.
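To make the motif notation concrete, the following sketch locates RxxxxH motifs (an arginine, any four residues, then a histidine, spanning six positions as in 210-215) and rewrites one of them as AxxxxH, RxxxxA or AxxxxA. The helper names and the 18-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 1.

```python
import re


def find_rx4h_motifs(protein: str) -> list:
    """Return 1-based (start, end) spans of every RxxxxH motif."""
    # A lookahead is used so that overlapping motifs are also reported.
    return [(m.start() + 1, m.start() + 6)
            for m in re.finditer(r"(?=R.{4}H)", protein)]


def mutate_motif(protein: str, start: int, pattern: str = "AxxxxA") -> str:
    """Mutate the RxxxxH motif beginning at 1-based position `start`."""
    if pattern not in ("AxxxxH", "RxxxxA", "AxxxxA"):
        raise ValueError("pattern must be AxxxxH, RxxxxA or AxxxxA")
    i = start - 1
    if i < 0 or i + 5 >= len(protein):
        raise ValueError("motif span falls outside the sequence")
    if protein[i] != "R" or protein[i + 5] != "H":
        raise ValueError("no RxxxxH motif at the given position")
    residues = list(protein)
    if pattern[0] == "A":
        residues[i] = "A"      # R -> A
    if pattern[-1] == "A":
        residues[i + 5] = "A"  # H -> A
    return "".join(residues)


# Hypothetical toy sequence with motifs at positions 3-8 and 11-16.
toy = "MGRNKLEHSSRPTVAHLL"
print(find_rx4h_motifs(toy))
print(mutate_motif(toy, 11, "AxxxxA"))
```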
  • the Cas13 protein contains 1, 2, 3, 4, 5 or 6 mutations. In some embodiments, the Cas13 protein is mutated to A (alanine) at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and/or H790 of the reference protein shown in SEQ ID NO: 1.
  • the amino acid sequence of the Cas13 protein contains mutations at positions corresponding to amino acid residues R210 and H215 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein contains mutations at positions corresponding to amino acid residues R750 and H755 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein contains mutations at positions corresponding to amino acid residues R785 and H790 of the reference protein shown in SEQ ID NO: 1.
  • the amino acid sequence of the Cas13 protein contains mutations at positions corresponding to amino acid residues R210, H215, R750 and H755 of the reference protein as set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R750, H755, R785 and H790 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R210, H215, R785 and/or H790 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein comprises mutations at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and H790 of the reference protein set forth in SEQ ID NO: 1.
  • the corresponding position of R210, R750 or R785 is mutated to A. In some embodiments, the corresponding position of H215, H755 or H790 is mutated to A. In some embodiments, the corresponding positions of R210, H215, R750, H755, R785 and H790 are all mutated to A.
  • the Cas13 protein is obtained by introducing mutations in the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and/or the RxxxxH motif at positions 785-790 of the sequence shown in SEQ ID NO: 1.
  • the Cas13 protein is obtained by introducing 1, 2, 3, 4, 5 or 6 mutations at the R210, H215, R750, H755, R785 and/or H790 positions of the sequence shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein is obtained by mutating R210, H215, R750, H755, R785 and/or H790 of the sequence shown in SEQ ID NO: 1 to A (alanine).
  • the Cas13 protein is obtained by mutating R210, H215, R785 and H790 of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating the R210, H215, R750 and H755 positions of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating R750, H755, R785 and H790 of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating R210, H215, R750, H755, R785 and H790 of the sequence shown in SEQ ID NO: 1 to A.
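Residue-level substitutions such as R210A can be sketched as a small helper that checks the expected wild-type residue before replacing it. The notation "R3A" (wild-type residue, 1-based position, new residue), the function name and the toy sequence are assumptions made for illustration; the toy sequence is not SEQ ID NO: 1, so its positions do not match those named above.

```python
import re


def apply_substitutions(protein: str, substitutions: list) -> str:
    """Apply point substitutions written like 'R3A' to a protein sequence.

    Each substitution is verified against the original sequence so that a
    wrong position or wrong wild-type residue raises instead of passing silently.
    """
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution: {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {sub!r}")
        if protein[position - 1] != wild_type:
            raise ValueError(f"residue {position} is not {wild_type} in {sub!r}")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence; both of its R/H pairs are mutated to alanine.
toy = "MGRNKLEHSSRPTVAHLL"
print(apply_substitutions(toy, ["R3A", "H8A", "R11A", "H16A"]))
```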
  • the Cas13 protein contains at least one mutation at positions corresponding to amino acid residues 40-91, 146-153, 158-176, 182-209, 216-253, 271-287, 341-353, 379-424, 456-477, 521-557, 575-588, 609-625, 700-721, 724-783, 796-815, 828-852, or 880-893.
  • the Cas13 protein contains mutations at positions corresponding to any one or more of the following amino acid residues of the reference protein set forth in SEQ ID NO: 1: R11, N34, R35, R47, R58, R63, R64, N68, N87, N265, N274, R276, R290, R294, N299, N303, R308, R314, R320, R328, N332, R341, N346, R358, N372, N383, N390, N394, R47+R290, R47+R314, R290+R314, R47+R290+R314, R308+N68, N394+N68, N87+N68, R308+N265, N394+N265, N87+N265, R308+N68+N265, N87+N68+N265, T7, A16, S260, A263, M266, N274, F288, M302, N303
  • the Cas13 protein, compared to the reference protein set forth in SEQ ID NO: 1, contains mutations at any one or more of the following positions: R11, N34, R35, R47, R58, R63, R64, N68, N87, N265, N274, R276, R290, R294, N299, N303, R308, R314.
  • the Cas13 protein contains mutations at positions corresponding to any one or more of the following amino acid residues of the reference protein set forth in SEQ ID NO: 1: R47+R290, R47+R314, R290+R314, R47+R290+R314, N394+N265, N87+N265, A263, M266, N274, F288, V305, I311, D313, H324, T360, E365, A373, M380, D402, D411, S418.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue R11 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N34 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R35 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R47 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R58 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue R63 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R64 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N68 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N87 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N265 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N274 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue R276 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R290 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R294 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N299 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N303 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R308 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R314 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R320 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R328 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N332 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R341 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N346 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue R358 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N372 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N383 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N390 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N394 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R47 and R290 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R47 and R314 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R290 and R314 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R47, R290 and R314 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R308 and N68 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues N394 and N68 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues N87 and N68 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R308 and N265 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues N394 and N265 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues N87 and N265 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R308, N68 and N265 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues N87, N68 and N265 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue T7 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue A16 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue S260 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue A263 of the reference protein shown in SEQ ID NO:1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue M266 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N274 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue F288 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue M302 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue N303 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue L304 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue V305 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue I311 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue D313 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue H324 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue P326 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue H327 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N332 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue N346 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue T353 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue T360 of the reference protein shown in SEQ ID NO:1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue E365 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue A373 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue M380 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue S382 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue K395 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises a mutation at a position corresponding to amino acid residue Y396 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue D402 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue D411 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a mutation at a position corresponding to amino acid residue S418 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein is mutated, at positions corresponding to the mutation sites of the reference protein shown in SEQ ID NO: 1 that are listed in Table 24 or Table 29, to the same amino acid residues.
  • the Cas13 protein contains the same mutations at positions corresponding to the mutation sites of the reference protein shown in SEQ ID NO: 1 that are listed in Table 24 or Table 29.
  • the Cas13 protein is obtained by introducing any one or more mutations at the following positions to the sequence shown in SEQ ID NO: 1: R11, N34, R35, R47, R58, R63, R64, N68, N87, N265, N274, R276, R290, R294, N299, N303, R308, R314, R320, R328, N332, R341, N346, R358, N372, N383, N390, N394, R47+R290, R47+R314, R290+R314, R47+R290+R314, R308+N68, N394+N68, N87+N68, R308+N265, N394+N265, N87+N265, R308+N68+N265, N87+N68+N265, T7, A16, S260, A263, M266, N274, F288, M302, N303, L304, V305, I311, D313, H324, P326, H
  • the Cas13 protein is obtained by introducing any one or more mutations at the following positions to the sequence shown in SEQ ID NO: 1: N34, R64, N68, N265, R276, R294, N299, R314, R47+R290, R47+R314, R290+R314, R47+R290+R314, N394+N265, N87+N265, A263, M266, N274, F288, V305, I311, D313, H324, T360, E365, A373, M380, D402 and D411.
  • the Cas13 protein is obtained by introducing any one or more mutations in Table 24 or Table 29 to the sequence shown in SEQ ID NO: 1.
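The site notation used in the lists above (a one-letter residue code followed by a 1-based position, with `+` joining the sites of a combination such as R47+R290+R314) can be handled mechanically. The following is a minimal sketch, assuming that reading of the notation; the function names and the toy reference string are hypothetical and are not taken from the disclosure, and the toy string is not SEQ ID NO: 1.

```python
import re

# One site = one-letter amino acid code + 1-based position, e.g. "R47".
SITE = re.compile(r"^([A-Z])(\d+)$")

def parse_sites(entry):
    """Split an entry such as 'R47+R290+R314' into [(residue, position), ...]."""
    sites = []
    for token in entry.split("+"):
        match = SITE.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized site: {token!r}")
        sites.append((match.group(1), int(match.group(2))))
    return sites

def check_sites(reference, sites):
    """Return True if every (residue, position) agrees with the reference sequence."""
    return all(
        1 <= pos <= len(reference) and reference[pos - 1] == res
        for res, pos in sites
    )

# Hypothetical toy sequence for illustration only (NOT SEQ ID NO: 1).
toy_reference = "MKRTNA"
print(parse_sites("R47+R290+R314"))
print(check_sites(toy_reference, parse_sites("R3+N5")))
```

Checking each site against the reference before introducing a mutation catches off-by-one numbering errors, which are easy to make when positions are defined relative to a reference protein.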
  • the Cas13 protein is obtained by sequence deletion at positions corresponding to amino acid residues 91-120, 141-180, 211-240, 331-360, 351-400, 431-460, 461-500, 511-550, 611-640, 631-660, 661-690, 691-760, 821-860 or 861-890 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a deletion of one or more amino acids at positions corresponding to amino acid residues 348-350, 521-556, or 883-893 of the reference protein shown in SEQ ID NO: 1.
  • the sequence deletion is ≤300, ≤200, ≤150, ≤100, ≤90, ≤80, ≤70, ≤60, ≤50, ≤40, ≤30, ≤20 or ≤10 amino acid residues.
  • the Cas13 protein is obtained from the sequence shown in SEQ ID NO: 1 by sequence deletion at positions 91-120, 141-180, 211-240, 331-360, 351-400, 431-460, 461-500, 511-550, 611-640, 631-660, 661-690, 691-760, 821-860 or 861-890.
  • sequence deletion is the deletion of 1 or more amino acid residues.
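As an illustration of the deletion embodiments above, the sketch below removes a 1-based, inclusive residue range from a sequence and reports how many residues a given range spans. It assumes the ranges are inclusive at both ends, which the text does not state explicitly; the helper names and the toy sequence are hypothetical.

```python
def deletion_size(start, end):
    """Number of residues in the 1-based inclusive range start..end."""
    if start < 1 or end < start:
        raise ValueError("invalid range")
    return end - start + 1

def delete_range(sequence, start, end):
    """Return the sequence with residues start..end (1-based, inclusive) removed."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    return sequence[: start - 1] + sequence[end:]

# Hypothetical toy sequence for illustration only (NOT SEQ ID NO: 1).
toy = "ABCDEFGHIJ"
print(delete_range(toy, 3, 5))

# Sizes of the three ranges named for the reference protein.
for start, end in [(348, 350), (521, 556), (883, 893)]:
    print(start, end, deletion_size(start, end))
```

Under the inclusive reading, the ranges 348-350, 521-556 and 883-893 span 3, 36 and 11 residues respectively.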
  • Another aspect of the present disclosure relates to fusion proteins.
  • the fusion protein comprises a Cas13 protein or a functional fragment thereof as described herein, and any one or more of the following fused to the Cas13 protein or functional fragment thereof: a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibition domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, a reporter domain, an affinity domain, a subcellular localization signal, a reporter tag, and an affinity tag.
  • the fusion protein comprises a Cas13 protein or a functional fragment thereof as described herein, and any one or more of the following fused to the Cas13 protein or functional fragment thereof: a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation repression domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, a subcellular localization signal, a reporter tag, and an affinity tag.
  • the fusion protein comprises a Cas13 protein described herein, or a functional fragment thereof, fused to a homologous or heterologous protein domain and/or a polypeptide tag. In some embodiments, the fusion does not change the original function of the Cas13 protein and/or its functional fragment.
  • "not changing the original function of the Cas13 protein or its functional fragment" means that the fusion protein retains the ability to recognize, bind and/or cleave the target RNA when used in combination with gRNA.
  • the ability of the fusion protein to recognize, bind or cleave target RNA when used in combination with gRNA may be increased or decreased relative to that of the unfused Cas13 protein; provided that the fusion protein, when used in combination with gRNA, can still effectively recognize, bind or cleave the target RNA, the fusion is considered one that "does not change the original function of the Cas13 protein".
  • the fusion protein comprises a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or polypeptide tag.
  • the fusion protein comprises a functional fragment of a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag.
  • the functional fragment is a fragment obtained by deleting a partial sequence of the nuclease domain of the Cas13 protein.
  • the functional fragment is a fragment obtained by deleting the sequence corresponding to HEPN-1_I, HEPN-1_II, HEPN-2, NTD, Helical-1 and/or Helical-2 of C13-2 in the Cas13 protein.
  • the functional fragment is a fragment obtained by deleting the sequence corresponding to HEPN-1_I, HEPN-1_II and/or HEPN-2 of C13-2 in the Cas13 protein. In some embodiments, the functional fragment is a fragment obtained by deleting the sequence corresponding to HEPN-1_I of C13-2 in the Cas13 protein. In some embodiments, the functional fragment is a fragment obtained by deleting the sequence corresponding to HEPN-1_II of C13-2 in the Cas13 protein. In some embodiments, the functional fragment is a fragment obtained by deleting the sequence corresponding to HEPN-2 of C13-2 in the Cas13 protein. In some embodiments, the functional fragment is a fragment obtained by deleting the sequences corresponding to the NTD and HEPN-1_I of C13-2 in the Cas13 protein.
  • the Cas13 protein or functional fragment thereof is fused to any one or more selected from the following: a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibitory domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, a reporter domain, an affinity domain, a subcellular localization signal, a reporter tag and an affinity tag.
  • the Cas13 protein or functional fragment thereof is fused to a subcellular localization signal.
  • the subcellular localization signal is optionally selected from a nuclear localization signal (NLS), a nuclear export signal (NES), a chloroplast localization signal, or a mitochondrial localization signal.
  • the Cas13 protein or functional fragment thereof is fused to a homologous or heterologous nuclear localization signal (NLS). In some embodiments, the Cas13 protein or functional fragment thereof is fused to a homologous or heterologous nuclear export signal (NES).
  • the Cas13 protein or functional fragment thereof is covalently linked to a protein domain and/or a polypeptide tag. In some embodiments, the Cas13 protein or functional fragment thereof is directly covalently linked to a protein domain and/or a polypeptide tag. In some embodiments, the Cas13 protein or its functional fragment is covalently connected to the protein domain and/or polypeptide tag through a linking sequence; further, in some embodiments, the linking sequence is an amino acid sequence.
  • the Cas13 protein or functional fragment thereof of the fusion protein is connected to the homologous or heterologous protein domain and/or polypeptide tag through a rigid connecting peptide sequence.
  • the Cas13 protein portion of the fusion protein is connected to the homologous or heterologous protein domain and/or polypeptide tag through a flexible linker peptide sequence.
  • the rigid linker peptide sequence is A(EAAAK)3A (SEQ ID NO: 279).
  • the flexible linker peptide sequence is (GGGGS)3 (SEQ ID NO: 280).
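The repeat notation used for the two linker peptides can be expanded into plain amino acid strings. The sketch below assumes that a parenthesized unit followed by a number means that many tandem copies of the unit; the function name is hypothetical, and the expanded strings are derived from the notation only and have not been checked against the sequence listing.

```python
import re

def expand_linker(notation):
    """Expand repeat notation such as 'A(EAAAK)3A' into a plain residue string."""
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        notation,
    )

rigid = expand_linker("A(EAAAK)3A")
flexible = expand_linker("(GGGGS)3")
print(rigid, len(rigid))        # AEAAAKEAAAKEAAAKA 17
print(flexible, len(flexible))  # GGGGSGGGGSGGGGS 15
```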
  • the fusion protein is capable of forming a CRISPR complex with the guide polynucleotide, and the CRISPR complex is capable of specifically binding to the target RNA sequence.
  • the fusion protein comprises a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the CRISPR complex can specifically bind to the target RNA sequence.
  • the fusion protein comprises a functional fragment of a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the CRISPR complex can specifically bind to the target RNA sequence.
  • the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide comprising a direct repeat sequence linked to a guide sequence engineered to direct sequence-specific binding of the CRISPR complex to a target RNA.
  • the fusion protein comprises a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the guide polynucleotide comprises a direct repeat sequence linked to a guide sequence engineered to direct sequence-specific binding of the CRISPR complex to a target RNA.
  • the fusion protein comprises a functional fragment of a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the guide polynucleotide comprises a direct repeat sequence linked to a guide sequence engineered to direct sequence-specific binding of the CRISPR complex to a target RNA.
  • the fusion protein is capable of forming a CRISPR complex with the guide polynucleotide, and the CRISPR complex is capable of sequence-specific binding and cleavage of the target RNA.
  • the fusion protein comprises a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the CRISPR complex can specifically bind to the target RNA sequence.
  • the fusion protein comprises a functional fragment of a Cas13 protein described herein fused to a homologous or heterologous protein domain and/or a polypeptide tag, the fusion protein is capable of forming a CRISPR complex with a guide polynucleotide, and the CRISPR complex can specifically bind to the target RNA sequence.
  • the structure of the fusion protein is NLS-Cas13 protein-SV40NLS-nucleoplasmin NLS.
  • a guide polynucleotide comprising (i) a direct repeat sequence having at least 50% sequence identity to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87, linked to (ii) a homologous or heterologous guide sequence engineered to hybridize to the target RNA, the guide polynucleotide being capable of forming a CRISPR complex with the Cas13 protein and directing sequence-specific binding of the CRISPR complex to the target RNA.
  • the Cas13 protein is Cas13a, Cas13b, Cas13c, or Cas13d.
  • the amino acid sequence of the Cas13 protein has a sequence identity of at least 90%, at least 95%, at least 98%, or at least 99% compared to SEQ ID NO: 1.
  • the direct repeat sequence has at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% sequence identity. In some embodiments, the direct repeat sequence has at least 80% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has at least 85% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has at least 90% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87.
  • the direct repeat sequence has at least 95% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has 100% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87.
  • the direct repeat sequence has at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% sequence identity.
  • the direct repeat sequence has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% sequence identity compared to any one of SEQ ID NO: 3 and 87.
  • the direct repeat sequence has an A at the position corresponding to base 26 of SEQ ID NO: 3.
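The identity thresholds in these embodiments are percentages of matching positions between two sequences. This passage does not say how the identity is calculated, so the following is only a simplified sketch that assumes the two sequences are already aligned and of equal length (no gaps); in practice, sequence identity is usually computed with an alignment program. The function name and the two toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Position-wise identity (%) of two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Hypothetical toy sequences differing at one of ten positions.
print(percent_identity("GGAAGATAAC", "GGAAGATGAC"))  # 90.0
```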
  • the direct repeat sequence is GGAAGAT-N1-ACTCTACAAACCTGTAG-N2-G-N3-N4-N5-N6-N7-N8-N9-N10-N11 (SEQ ID NO: 277); wherein N1 and N3 to N11 are each optionally selected from A, C, G and T, and N2 is optionally selected from A and G.
  • the direct repeat sequence is GGAAGAT-N12-ACTCTACAAACCTGTAG-N13-G-N14-N15-N16-N17-N18-N19-N20-N21-N22 (SEQ ID NO: 278); wherein N12, N13, N19 and N21 are each optionally selected from A and G, N14 is optionally selected from A and T, N15 and N16 are each optionally selected from C and T, N17 and N18 are each optionally selected from G and T, and N20 and N22 are each optionally selected from C and G.
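The degenerate direct repeat of SEQ ID NO: 278 can be written as a regular expression, which gives a quick way to test whether a candidate sequence falls within the consensus and to count how many distinct sequences it covers. The sketch below transcribes the fixed segments and the allowed bases for N12 to N22 from the text; the variable and function names are hypothetical, and sequences are written with T as in the text.

```python
import re

# Allowed bases at each variable position of SEQ ID NO: 278, as listed in the text.
CHOICES = {
    "N12": "AG", "N13": "AG", "N14": "AT", "N15": "CT", "N16": "CT",
    "N17": "GT", "N18": "GT", "N19": "AG", "N20": "CG", "N21": "AG", "N22": "CG",
}
# Fixed segments and variable positions in 5' to 3' order.
TEMPLATE = ["GGAAGAT", "N12", "ACTCTACAAACCTGTAG", "N13", "G",
            "N14", "N15", "N16", "N17", "N18", "N19", "N20", "N21", "N22"]

def consensus_regex():
    """Compile a regex that matches exactly the sequences covered by the consensus."""
    parts = [f"[{CHOICES[p]}]" if p in CHOICES else p for p in TEMPLATE]
    return re.compile("^" + "".join(parts) + "$")

def count_variants():
    """Number of distinct sequences covered by the consensus."""
    total = 1
    for part in TEMPLATE:
        if part in CHOICES:
            total *= len(CHOICES[part])
    return total

# One member of the family: take the first allowed base at every variable position.
example = "".join(CHOICES[p][0] if p in CHOICES else p for p in TEMPLATE)
print(example, len(example))
print(bool(consensus_regex().match(example)))
print(count_variants())
```

With eleven variable positions of two options each, the consensus covers 2^11 = 2048 sequences, each 36 nucleotides long.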
  • the guide sequence is located 3' to the direct repeat sequence. In some embodiments, the guide sequence is located 5' to the direct repeat sequence.
  • the guide sequence contains 15-35 nucleotides. In some embodiments, the guide sequence hybridizes to the target RNA with no more than one nucleotide mismatch. In some embodiments, the direct repeat sequence contains 25 to 40 nucleotides.
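The length and mismatch limits above can be checked programmatically for a candidate guide and its intended target site. The sketch below assumes strict Watson-Crick pairing (G-U wobble pairs are counted as mismatches, which is a simplification), that both sequences are RNA written 5' to 3', and that the target site has the same length as the guide; the function names and the toy target site are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def mismatches(guide, target_site):
    """Count positions where the guide fails to pair with the target site."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    perfect_guide = reverse_complement(target_site)
    return sum(g != p for g, p in zip(guide, perfect_guide))

def guide_ok(guide, target_site, min_len=15, max_len=35, max_mismatch=1):
    """Apply the 15-35 nt length window and the one-mismatch limit."""
    return (min_len <= len(guide) <= max_len
            and mismatches(guide, target_site) <= max_mismatch)

# Hypothetical 20-nt target site for illustration only.
site = "AUGGCUAGCUAGGCUUAACG"
guide = reverse_complement(site)
print(guide, mismatches(guide, site), guide_ok(guide, site))
```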
  • the guide polynucleotide further comprises an aptamer sequence.
  • the aptamer sequence is inserted into a loop of the guide polynucleotide.
  • the aptamer sequence includes an MS2 aptamer sequence, a PP7 aptamer sequence, or a Qβ aptamer sequence.
  • the guide polynucleotide comprises modified nucleotides.
  • the modification comprises 2'-O-methyl, 2'-O-methyl-3'-phosphorothioate, or 2'-O-methyl-3'-thioPACE.
  • the target RNA of the guide polynucleotide is located in the nucleus of a eukaryotic cell.
  • the target RNA is optionally selected from TTR RNA, SOD1 RNA, PCSK9 RNA, VEGFA RNA, VEGFR1 RNA, PTBP1 RNA, AQp1 RNA, or ANGPTL3 RNA.
  • the target RNA is optionally selected from VEGFA RNA, PTBP1 RNA, AQp1 RNA or ANGPTL3 RNA.
  • the guide sequence is optionally selected from the sequences shown in SEQ ID NO: 5, SEQ ID NO: 6, and SEQ ID NO: 42-49 (for targeting AQp1 RNA, PTBP1 RNA, and ANGPTL3 RNA respectively).
  • the guide sequence is selected from the sequences shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:43, and SEQ ID NO:45-47.
  • the Cas13 protein is Cas13a, Cas13b, Cas13c, or Cas13d. In some embodiments, the Cas13 protein is Cas13d. In some embodiments, the Cas13 protein has a sequence identity of at least 90%, at least 95%, at least 98%, or at least 99% compared to SEQ ID NO: 1.
  • a CRISPR-Cas13 system comprising: a Cas13 protein or fusion protein described herein, or a nucleic acid encoding the Cas13 protein or fusion protein; and a guide polynucleotide, or a nucleic acid encoding the guide polynucleotide; wherein the guide polynucleotide comprises a direct repeat sequence linked to a guide sequence engineered to hybridize with a target RNA, and the guide polynucleotide is capable of forming a CRISPR complex with the Cas13 protein or the fusion protein and directing sequence-specific binding of the CRISPR complex to the target RNA.
  • the target RNA is optionally selected from TTR RNA, SOD1 RNA, PCSK9 RNA, VEGFA RNA, VEGFR1 RNA, PTBP1 RNA, AQp1 RNA, or ANGPTL3 RNA.
  • the target RNA is optionally selected from VEGFA RNA, PTBP1 RNA, AQp1 RNA or ANGPTL3 RNA.
  • the guide sequence is optionally selected from the sequences shown in SEQ ID NO: 5, SEQ ID NO: 6, and SEQ ID NO: 42-49 (for targeting AQp1 RNA, PTBP1 RNA, and ANGPTL3 RNA respectively).
  • the guide sequence is optionally selected from the sequences shown in SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 43, and SEQ ID NO: 45-47.
  • the direct repeat sequence has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% sequence identity.
  • the fusion protein comprises a Cas13 protein described herein, or a functional fragment thereof, fused to a homologous or heterologous protein domain and/or a polypeptide tag.
  • the Cas13 protein or functional fragment portion thereof of the fusion protein is fused to a homologous or heterologous nuclear localization signal (NLS). In some embodiments, the Cas13 protein or functional fragment portion thereof of the fusion protein is fused to a homologous or heterologous nuclear export signal (NES).
  • the Cas13 protein contains a mutation in the catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein contains mutations in one or both HEPN domains and substantially lacks RNA cleavage activity. In some embodiments, "substantially lacking RNA cleavage activity" means retaining only ≤50%, ≤40%, ≤30%, ≤20%, ≤10%, ≤5% or ≤1% of the RNA cleavage activity of the wild-type Cas13 protein, or having no detectable RNA cleavage activity.
  • the Cas13 protein or functional fragment portion thereof of the fusion protein is directly connected to the homologous or heterologous protein domain and/or polypeptide tag through covalent bonds.
  • the Cas13 protein portion of the fusion protein is linked to a homologous or heterologous protein domain and/or a polypeptide tag through a peptide sequence.
  • the protein domain comprises a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibition domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain or a splicing factor domain.
  • the Cas13 protein is covalently linked to an affinity tag or reporter tag.
  • the Cas13 protein has at least 95% sequence identity to SEQ ID NO: 1. In some embodiments, the Cas13 protein has at least 97% sequence identity to SEQ ID NO: 1. In some embodiments, the Cas13 protein has at least 98% sequence identity to SEQ ID NO: 1. In some embodiments, the Cas13 protein has at least 99% sequence identity to SEQ ID NO: 1. In some embodiments, the Cas13 protein has at least 99.5% sequence identity to SEQ ID NO: 1. In some embodiments, the Cas13 protein comprises the sequence set forth in SEQ ID NO: 1.
  • the Cas13 protein is from a species comprising a genome with an average nucleotide identity (ANI) ≥95% to the genome represented by the genome number CNA0009596 in the CNGB database.
  • the Cas13 protein is used for RNA cleavage without protospacer flanking sequence (PFS) requirements.
  • the guide sequence is located 3' to the direct repeat sequence. In some embodiments, the guide sequence is located 5' to the direct repeat sequence.
  • the guide sequence contains 15-35 nucleotides. In some embodiments, the guide sequence hybridizes to the target RNA with no more than one nucleotide mismatch.
  • the direct repeat sequence contains 25 to 40 nucleotides.
  • the direct repeat sequence has at least 80% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has at least 90% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has at least 95% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence has 100% sequence identity as compared to any one of SEQ ID NO: 3 and SEQ ID NO: 80-87. In some embodiments, the direct repeat sequence is optionally selected from SEQ ID NO: 3 and SEQ ID NO: 80-87.
  • the guide polynucleotide further comprises an aptamer sequence.
  • the aptamer sequence is inserted into a loop of a guide polynucleotide.
  • the aptamer sequence includes an MS2 aptamer sequence, a PP7 aptamer sequence, or a Qβ aptamer sequence.
  • the CRISPR-Cas13 system includes a fusion protein comprising an adapter protein and a homologous or heterologous protein domain, or a nucleic acid encoding the fusion protein, the adapter protein capable of binding the aptamer sequence.
  • the adapter protein includes MS2 phage coat protein, PP7 phage coat protein, or Qβ phage coat protein.
  • the protein domain includes a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibition domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, a reporter domain, an affinity domain, a reporter tag or an affinity tag.
  • the guide polynucleotide comprises modified nucleotides.
  • the modified nucleotides comprise 2'-O-methyl, 2'-O-methyl-3'-phosphorothioate, or 2'-O-methyl-3'-thioPACE.
  • the Cas13 protein or fusion protein and the guide polynucleotide do not naturally co-occur.
  • Another aspect of the present disclosure relates to a CRISPR-Cas13 system, characterized in that it includes any Cas13 protein, its fusion protein or a nucleic acid encoding the same, and a guide polynucleotide or a nucleic acid encoding the same as described herein.
  • Another aspect of the present disclosure relates to a vector system comprising a CRISPR-Cas13 system described herein, said vector system comprising one or more vectors comprising a polynucleotide sequence encoding a Cas13 protein or fusion protein described herein and a polynucleotide sequence encoding a guide polynucleotide.
  • an AAV vector comprising the CRISPR-Cas13 system described herein, wherein the AAV vector comprises a DNA sequence encoding a Cas13 protein or fusion protein described herein and the guide polynucleotide.
  • lipid nanoparticles comprising a CRISPR-Cas13 system described herein, the lipid nanoparticles comprising a guide polynucleotide described herein and an mRNA encoding a Cas13 protein or fusion protein described herein.
  • a lentiviral vector comprising a CRISPR-Cas13 system described herein, the lentiviral vector comprising a guide polynucleotide described herein and an mRNA encoding a Cas13 protein or fusion protein described herein.
  • the lentiviral vector is pseudotyped with homologous or heterologous envelope proteins, such as VSV-G.
  • the mRNA encoding the Cas13 protein or fusion protein is linked to an aptamer sequence.
  • a ribonucleoprotein complex comprising a CRISPR-Cas13 system described herein, wherein the ribonucleoprotein complex is formed from a guide polynucleotide described herein and a Cas13 protein or fusion protein described herein.
  • a virus-like particle comprising a CRISPR-Cas13 system described herein, the virus-like particle comprising a ribonucleoprotein complex formed from a guide polynucleotide described herein and a Cas13 protein or fusion protein described herein.
  • the Cas13 protein or fusion protein is fused to a gag protein.
  • eukaryotic cells comprising the CRISPR-Cas13 system described herein.
  • the eukaryotic cell is a mammalian cell. In some embodiments, the eukaryotic cell is a human cell.
  • Another aspect of the present disclosure relates to a pharmaceutical composition comprising the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vector described herein, the ribonucleoprotein complex described herein, the virus-like particles described herein or the eukaryotic cells described herein.
  • Another aspect of the present disclosure relates to a pharmaceutical composition comprising the CRISPR-Cas13 system described herein.
  • Another aspect of the present disclosure relates to an in vitro composition comprising the CRISPR-Cas13 system described herein, and a labeled detector that is unable to hybridize to or be targeted by a guide polynucleotide described herein.
  • Another aspect of the present disclosure relates to isolated nucleic acids encoding Cas13 proteins or fusion proteins described herein.
  • Another aspect of the present disclosure relates to an isolated nucleic acid encoding a guide polynucleotide described herein.
  • Another aspect of the present disclosure relates to a CRISPR-Cas13 system comprising any Cas13 protein or nucleic acid encoding the same, and a guide polynucleotide or an isolated nucleic acid encoding the same as described herein.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein for detecting target RNA in a nucleic acid sample suspected of containing target RNA or preparing a reagent for detecting target RNA in a nucleic acid sample suspected of containing target RNA.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in any one of the following, or in preparing a reagent for achieving any one of the following: cleaving or nicking one or more target RNA molecules; activating or upregulating one or more target RNA molecules; activating or inhibiting the translation of one or more target RNA molecules; inactivating one or more target RNA molecules; visualizing, labeling, or detecting one or more target RNA molecules; binding one or more target RNA molecules; transporting one or more target RNA molecules; and masking one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in cleaving one or more target RNA molecules, or in preparing a reagent for cleaving one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in binding one or more target RNA molecules.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein for preparing reagents that bind or cleave one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in cleaving or editing target RNA in mammalian cells, the editing being base editing.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein for preparing reagents for cleaving or editing target RNA in mammalian cells, the editing being base editing.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in activating or upregulating a
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in inhibiting the translation of one or more target RNA molecules, or in preparing a reagent for inhibiting the translation of one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in inactivating one or more target RNA molecules, or in preparing a reagent for inactivating one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in visualizing, labeling, or detecting one or more target RNA molecules, or in preparing a reagent for visualizing, labeling or detecting one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in transporting one or more target RNA molecules, or in preparing a reagent for transporting one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in masking one or more target RNA molecules, or in preparing a reagent for masking one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in diagnosing, treating or preventing diseases or conditions associated with target RNA.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein for diagnosing, treating or preventing diseases or conditions associated with target RNA.
  • Another aspect of the present disclosure relates to a method of diagnosing, treating, or preventing a disease or condition associated with target RNA by: administering to a subject in need thereof a sample according to the invention herein;
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in the preparation of a medicament for diagnosing, treating or preventing diseases or conditions associated with target RNA.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein in the preparation of a medicament for the diagnosis, treatment or prevention of a disease or disorder associated with a target RNA.
  • the above preferred conditions can be combined arbitrarily to obtain preferred examples of the present disclosure.
  • the reagents and raw materials used in this disclosure are commercially available.
  • Figure 1 shows the CRISPR locus of the CRISPR-Cas13 system, including the CRISPR array and the coding sequence of the C13-2 protein.
  • Figure 2 shows the structure of the C13-2 guide polynucleotide, which consists of a direct repeat sequence and a guide sequence. The backbone sequence is the same as the direct repeat sequence; the guide sequence consists of a variable number of nucleotides, and N in the figure represents any nucleotide. A stem-loop structure can be seen, and the stem region contains multiple complementary base pairs.
  • Figure 3 shows expression and purification of recombinant C13-2 protein.
  • Figure 4 shows that C13-2 is highly active in downregulating PTBP1 (polypyrimidine tract binding protein 1) RNA in 293T cells.
  • Figure 5 shows that C13-2 has higher activity in downregulating AQP1 (aquaporin 1) RNA in 293T cells compared to CasRx and shRNA.
  • Figure 6 shows that C13-2 has higher activity in downregulating PTBP1 RNA in 293T cells compared with C13-113 and C13-114.
  • Figure 7 shows that C13-2 downregulates ANGPTL3 (angiopoietin-like 3) RNA in 293T cells.
  • Figure 8 shows the C13-2 domain structure predicted by a computational method; in the figure, positions 1-95 are the NTD domain, positions 96-255 are the HEPN-1_I domain, positions 256-417 are the Helical-1 domain, positions 418-504 are the HEPN-1_II domain, positions 505-651 are the Helical-2 domain, and positions 652-893 are the HEPN-2 domain.
  • Figure 9 shows the editing effects of targeted VEGFA RNA when using different DRs.
  • Figure 10 shows the editing effects of targeting PTBP1 RNA when using different DRs.
  • Figure 11 shows a comparison of the effectiveness of C13-2 and known Cas13 tools targeting VEGFA RNA.
  • Figure 12 shows a comparison of the effectiveness of C13-2 and known Cas13 tools targeting PTBP1 RNA.
  • Figure 13 shows the sequencing chromatogram after single-base editing with dC13-2.
  • Figure 14 shows VEGFA RNA levels, measured by qPCR, after targeted editing by the mutants.
  • Figure 15 shows VEGFA RNA levels after editing, measured by RNA-seq.
  • Figure 16 shows the efficiency with which the mutants edit AR RNA, measured by qPCR.
  • Figure 17 shows post-edited AR RNA levels measured by RNAseq.
  • Figure 18 shows the sequence alignment results of the direct repeat sequences DRrc, DR-hf2, and DR2rc.
  • Figure 19 shows the RNA secondary structure of the direct repeat sequence DR-hf2 (SEQ ID NO:87) predicted by RNAfold.
  • As used herein, the term "sequence identity" (identity or percent identity) refers to the degree of sequence matching between two polypeptides or between two nucleic acids. When a position in both sequences being compared is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the molecules are identical at that position. "Percent sequence identity" between two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions compared × 100%. For example, if 6 out of 10 positions of two sequences match, the two sequences have 60% sequence identity.
  • Comparisons are made with the two sequences aligned to give maximum sequence identity. Such comparisons can be made using published and commercially available alignment algorithms and programs, such as but not limited to Clustal Ω, MAFFT, Probcons, T-Coffee, Probalign and BLAST, which can be reasonably selected by those of ordinary skill in the art. Those skilled in the art can determine appropriate parameters for aligning sequences, including, for example, any algorithms needed to achieve a better or optimal alignment over the entire length of the compared sequences, as well as any algorithms needed to achieve a better or optimal alignment over parts of the compared sequences.
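  • The percent identity calculation defined above can be sketched in a few lines of Python. This is only an illustration: it assumes the two sequences have already been aligned by one of the tools named above and are passed in as equal-length, ungapped strings, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matching positions divided by positions compared, times 100.

    Assumption: the inputs are already aligned, have equal length and
    contain no gap characters. Alignment itself is not performed here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("no positions to compare")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    return 100.0 * matches / len(aligned_a)


# Toy sequences (not from the disclosure): 6 of 10 positions match.
print(percent_identity("ACGTACGTAC", "ACGTACTGCA"))  # 60.0
```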
  • As used herein, the term "guide polynucleotide" refers to the molecule in the CRISPR-Cas system that forms a CRISPR complex with the Cas protein and guides the CRISPR complex to the target sequence.
  • A guide polynucleotide contains a backbone sequence linked to a guide sequence that can hybridize to a target sequence.
  • The backbone sequence usually contains direct repeats and sometimes tracrRNA sequences.
  • The guide polynucleotide includes a guide sequence and a direct repeat sequence. In that case, the guide polynucleotide may also be called a crRNA.
  • Class 2 CRISPR-Cas systems endow microorganisms with multiple adaptive immune mechanisms.
  • Provided herein is an analysis of prokaryotic genomes and metagenomes that identifies a previously uncharacterized RNA-guided, RNA-targeting CRISPR-Cas13 system containing C13-2 (also known as CasRfg.4), which is classified as a type VI system.
  • the engineered CRISPR-Cas13 system based on C13-2 is highly active in human cells.
  • C13-2 can also be flexibly packaged into AAV vectors.
  • Our results demonstrate C13-2 as a programmable RNA-binding module for efficient targeting of cellular RNA, thereby providing a versatile platform for transcriptome engineering as well as therapeutic and diagnostic approaches.
  • the CRISPR-Cas13 system containing C13-2 was identified, and subsequent experiments verified the targeted RNA cleavage activity in human cells.
  • the CRISPR-Cas13 system containing C13-2 is a type VI CRISPR-Cas system.
  • Figure 1 shows the CRISPR locus of the CRISPR-Cas13 system containing C13-2.
  • the protein sequence of wild-type C13-2 is SEQ ID NO:1.
  • the wild-type DNA coding sequence of C13-2 is SEQ ID NO:9.
  • Figure 8 shows the computationally predicted domain structure of C13-2, which includes NTD, HEPN-1_I, HEPN-1_II and HEPN-2. NTD is the N-terminal domain; the Helical-1 domain lies between HEPN-1_I and HEPN-1_II, and the Helical-2 domain lies between HEPN-1_II and HEPN-2.
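  • As an illustrative aid, the domain boundaries listed for Figure 8 can be held in a small lookup table so that any residue position of SEQ ID NO: 1 can be mapped to its predicted domain. The sketch below uses only the coordinates stated in the figure description; the names `C13_2_DOMAINS` and `domain_at` are hypothetical.

```python
# Predicted domain boundaries of C13-2 (1-based, inclusive),
# taken from the description of Figure 8.
C13_2_DOMAINS = [
    ("NTD", 1, 95),
    ("HEPN-1_I", 96, 255),
    ("Helical-1", 256, 417),
    ("HEPN-1_II", 418, 504),
    ("Helical-2", 505, 651),
    ("HEPN-2", 652, 893),
]


def domain_at(position: int) -> str:
    """Return the predicted domain containing a 1-based residue position."""
    for name, start, end in C13_2_DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside the range 1-893")


# Map the first residue of each RxxxxH motif named in the disclosure.
for residue in (210, 750, 785):
    print(residue, domain_at(residue))
```

  • With these coordinates, position 210 falls in HEPN-1_I, and positions 750 and 785 both fall in HEPN-2.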
  • the direct repeat sequence of the C13-2 guide polynucleotide is SEQ ID NO:3.
  • Figure 2 shows the RNA secondary structure of the direct repeat sequence of the C13-2 guide polynucleotide predicted by RNAfold.
  • the engineered CRISPR-Cas13 system described here can efficiently knock down endogenous target RNAs in human cells, paving the way for RNA targeting applications as part of the transcriptome engineering toolbox.
  • C13-2-mediated knockdown across multiple endogenous transcripts can achieve higher efficiency and/or specificity than knockdown mediated by CasRx, PspCas13b, Cas13X.1 and/or Cas13Y.1.
  • One aspect of the present disclosure relates to a CRISPR-Cas13 system, composition or kit comprising: a Cas13 protein or fusion protein having at least 90% sequence identity as compared to SEQ ID NO: 1, or a nucleic acid encoding said Cas13 protein; and a guide polynucleotide or a nucleic acid encoding said guide polynucleotide; said guide polynucleotide comprising a direct repeat sequence linked to a guide sequence engineered to hybridize to a target RNA, wherein the guide polynucleotide can form a CRISPR complex with the Cas13 protein and guide the sequence-specific binding of the CRISPR complex to the target RNA.
  • The guide polynucleotide comprises a direct repeat sequence linked to a guide sequence engineered to hybridize to a target RNA; the guide polynucleotide is capable of forming a CRISPR complex with the Cas13 protein and directing the CRISPR complex to sequence-specifically bind and cleave the target RNA.
  • the polynucleotide sequence encoding a Cas13 protein or fusion protein and/or the polynucleotide sequence encoding a guide polynucleotide is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding a Cas13 protein or fusion protein is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding a guide polynucleotide is operably linked to a regulatory sequence.
  • the regulatory sequences of the polynucleotide sequence encoding a Cas13 protein or fusion protein are the same as or different from the regulatory sequences of the polynucleotide sequence encoding a guide polynucleotide.
  • the Cas13 protein described herein has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% sequence identity. In some embodiments, the Cas13 protein described herein has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity.
  • When the CRISPR-Cas13 system includes a fusion protein comprising the Cas13 protein and a protein domain and/or a polypeptide tag, the percent sequence identity is calculated between the Cas13 portion of the fusion protein and the reference sequence.
  • The Cas13 protein described herein is from a species whose genome has an average nucleotide identity (ANI) ≥ 95% to the genome deposited in the CNGB database under accession number CNA0009596.
  • The Cas13 proteins described herein comprise one or more (e.g., 1 or 2) native HEPN domains, each native HEPN domain comprising the RX₄H amino acid motif (where X represents any amino acid, and the subscript "4" represents 4 consecutive amino acids).
  • The first catalytic RX₄H motif is located at amino acid positions 210-215 of SEQ ID NO: 1, the second catalytic RX₄H motif is located at amino acid positions 785-790 of SEQ ID NO: 1, and the third RX₄H motif is located at amino acid positions 750-755 of SEQ ID NO: 1.
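  • A motif of this form (R, any four residues, then H) can be located in a protein sequence with a regular expression. The sketch below is illustrative only: the toy sequence is invented for the example and is not SEQ ID NO: 1, and the function name `find_rx4h` is hypothetical.

```python
import re

# R, then any four residues, then H. The lookahead makes the match
# zero-width so that overlapping occurrences are also reported.
RX4H = re.compile(r"(?=(R.{4}H))")


def find_rx4h(protein: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) tuples with 1-based inclusive coordinates."""
    return [
        (m.start() + 1, m.start() + 6, m.group(1))
        for m in RX4H.finditer(protein)
    ]


# Hypothetical toy sequence, used only to exercise the function.
toy = "MKRAAAAHGGSRLLNEHTT"
for start, end, motif in find_rx4h(toy):
    print(f"{start}-{end}: {motif}")
```

  • Applied to the full-length protein, a scan of this kind would be expected to report the six-residue windows quoted above (210-215, 750-755 and 785-790), possibly alongside other non-catalytic matches.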
  • the Cas13 proteins described herein comprise one or more mutated HEPN domains.
  • the mutant Cas13 protein can process its guide polynucleotide but cannot cleave the target RNA.
  • the Cas13 protein described herein does not require a protospacer flanking sequence (PFS) for RNA cleavage.
  • The CRISPR-Cas13 system described herein can be introduced into cells (or cell-free systems) in a variety of non-limiting ways: (i) as Cas13 mRNA and guide polynucleotide, (ii) as part of a single vector or plasmid, or divided among multiple vectors or plasmids, (iii) as individual Cas13 proteins and guide polynucleotides, or (iv) as RNP complexes of Cas13 proteins and guide polynucleotides.
  • the CRISPR-Cas13 system, composition or kit comprises a nucleic acid molecule encoding the Cas13 protein, wherein the coding sequence is codon optimized for expression in eukaryotic cells. In some embodiments, the CRISPR-Cas13 system, composition or kit comprises a nucleic acid molecule encoding the Cas13 protein, wherein the coding sequence is codon optimized for expression in mammalian cells. In some embodiments, the CRISPR-Cas13 system, composition or kit comprises a nucleic acid molecule encoding the Cas13 protein, wherein the coding sequence is codon-optimized for expression in human cells.
  • the nucleic acid molecule encoding the Cas13 protein is a plasmid. In some embodiments, the nucleic acid molecule encoding the Cas13 protein is part of a viral vector genome, such as the DNA genome of an AAV vector flanked by ITRs. In some embodiments, the nucleic acid molecule encoding the Cas13 protein is mRNA.
  • the guide polynucleotide of the CRISPR-Cas13 system is guide RNA.
  • the guide polynucleotide is a chemically modified guide polynucleotide.
  • the guide polynucleotide comprises at least one chemically modified nucleotide.
  • the guide polynucleotide is a hybrid RNA-DNA guide.
  • the guide polynucleotide is a hybrid RNA-LNA (locked nucleic acid) guide.
  • the guide polynucleotide comprises at least one guide sequence (also called a spacer sequence) linked to at least one direct repeat (DR).
  • the guide sequence is located 3' to the direct repeat sequence. In some embodiments, the guide sequence is located 5' to the direct repeat sequence.
  • The guide sequence comprises at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, at least 25 nucleotides, at least 26 nucleotides, at least 27 nucleotides, at least 28 nucleotides, at least 29 nucleotides, or at least 30 nucleotides.
  • The guide sequence comprises no more than 60 nucleotides, no more than 55 nucleotides, no more than 50 nucleotides, no more than 45 nucleotides, no more than 40 nucleotides, no more than 35 nucleotides, or no more than 30 nucleotides. In some embodiments, the guide sequence comprises 15-20 nucleotides, 20-25 nucleotides, 25-30 nucleotides, 30-35 nucleotides, or 35-40 nucleotides.
  • The guide sequence has sufficient complementarity to the target RNA sequence to hybridize to the target RNA and direct sequence-specific binding of the CRISPR-Cas13 complex to the target RNA. In some embodiments, the guide sequence has 100% complementarity to the target RNA (or the region of the RNA to be targeted), but the guide sequence may also have less than 100% complementarity to the target RNA, for example at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% complementarity.
  • the guide sequence is engineered to hybridize to the target RNA with no more than two nucleotide mismatches. In some embodiments, the guide sequence is engineered to hybridize to the target RNA with no more than one nucleotide mismatch. In some embodiments, the guide sequence is engineered to hybridize to the target RNA, with or without mismatches.
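  • The complementarity percentages and mismatch counts discussed above can be computed for a candidate guide as sketched below. This is a simplified illustration under stated assumptions: guide and target site are equal-length RNA strings written 5' to 3', only Watson-Crick pairs are counted (G-U wobble pairs are treated as mismatches), and the sequences and function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def guide_complementarity(guide: str, target_site: str) -> tuple[int, float]:
    """Return (mismatch count, percent complementarity).

    Assumption: both strings are RNA, 5'->3', and of equal length.
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    perfect_guide = reverse_complement(target_site)
    mismatches = sum(1 for g, p in zip(guide, perfect_guide) if g != p)
    percent = 100.0 * (len(guide) - mismatches) / len(guide)
    return mismatches, percent


# Hypothetical 20-nt target site, not taken from the disclosure.
target = "AUGGCUAGCUAGGAUCCGAU"
perfect = reverse_complement(target)
two_mismatches = "GG" + perfect[2:]  # alter the first two guide bases

print(guide_complementarity(perfect, target))
print(guide_complementarity(two_mismatches, target))
```

  • A guide with no more than two mismatches over 20 nucleotides, as in the second call, corresponds to at least 90% complementarity.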
  • The direct repeat sequence comprises at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, at least 25 nucleotides, at least 26 nucleotides, at least 27 nucleotides, at least 28 nucleotides, at least 29 nucleotides, at least 30 nucleotides, at least 31 nucleotides, at least 32 nucleotides, at least 33 nucleotides, at least 34 nucleotides, at least 35 nucleotides, or at least 36 nucleotides.
  • The direct repeats comprise no more than 60 nucleotides, no more than 55 nucleotides, no more than 50 nucleotides, no more than 45 nucleotides, no more than 40 nucleotides or no more than 35 nucleotides.
  • the direct repeat sequence includes 20-25 nucleotides, 25-30 nucleotides, 30-35 nucleotides, or 35-40 nucleotides.
  • the direct repeat sequence is modified to replace at least one complementary base pair in the stem region shown in Figure 2 with a different complementary base pair.
  • the direct repeats are modified to alter the number of complementary base pairs in the stem region shown in Figure 2.
  • the direct repeats are modified to alter the number of nucleotides in the loop region shown in Figure 2 (e.g., 3, 4, or 5 nucleotides in the loop).
  • the direct repeats are modified to alter the nucleotide sequence in the loop region.
  • aptamer sequences are inserted or appended to the ends of direct repeat sequences.
  • the direct repeat sequence has at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity.
  • the CRISPR-Cas13 system, composition, or kit includes at least 2, at least 3, at least 4, at least 5, at least 10, or at least 20 different guide polynucleotides.
  • The guide polynucleotides target at least 2, at least 3, at least 4, at least 5, at least 10, or at least 20 different target RNA molecules, or target at least 2, at least 3, at least 4, at least 5, at least 10 or at least 20 different regions of one or more target RNA molecules.
  • the guide polynucleotide includes a constant direct repeat sequence located upstream of a variable guide sequence.
  • the plurality of guide polynucleotides are part of an array (which may be part of a vector, such as a viral vector or a plasmid).
  • A guide array including the sequence DR-spacer-DR-spacer-DR-spacer may include three unique unprocessed guide polynucleotides (one for each DR-spacer sequence).
  • The array is processed by the Cas13 protein into three separate mature guide polynucleotides. This allows for multiplexing, e.g., delivering multiple guide polynucleotides to a cell or system to target multiple target RNAs or multiple regions within a single target RNA.
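  • The DR-spacer-DR-spacer-DR-spacer layout can be illustrated at the string level as follows. This is a sketch only: the direct repeat and spacers are invented placeholders (the actual direct repeat is SEQ ID NO: 3, which is not reproduced here), the function names are hypothetical, and splitting at each repeat is a bookkeeping device, not a model of where the Cas13 protein actually cleaves.

```python
def build_array(direct_repeat: str, spacers: list[str]) -> str:
    """Concatenate DR-spacer units, with the DR upstream of each spacer."""
    return "".join(direct_repeat + spacer for spacer in spacers)


def split_array(array: str, direct_repeat: str) -> list[str]:
    """Recover the individual DR-spacer units from an array.

    Assumption: the direct repeat does not occur inside any spacer.
    """
    parts = array.split(direct_repeat)
    # parts[0] is whatever precedes the first DR (empty for a clean array).
    return [direct_repeat + spacer for spacer in parts[1:] if spacer]


# Placeholder sequences, not taken from the disclosure.
DR = "GGAUCACGAAAGUGAUCC"
SPACERS = ["AAAAACCCCC", "UUUUUGGGGG", "ACACACACAC"]

array = build_array(DR, SPACERS)
units = split_array(array, DR)
print(len(units))
for unit in units:
    print(unit)
```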
  • the ability of the guide polynucleotide to direct sequence-specific binding of the CRISPR complex to the target RNA can be assessed by any suitable assay.
  • Components of the CRISPR system sufficient to form a CRISPR complex, including the guide polynucleotide to be tested, can be provided to a host cell with the corresponding target RNA molecule, such as by transfection of a vector encoding the components of the CRISPR complex, and preferential cleavage within the target sequence is then evaluated.
  • Cleavage of a target RNA sequence can be assessed in a test tube by providing the target RNA, components of the CRISPR complex, including a guide polynucleotide to be tested, and a control guide polynucleotide that is different from the test guide polynucleotide, and comparing the ability of the test and control guide polynucleotides to bind the target RNA or the rate at which they cleave the target RNA.
  • the Cas13 proteins provided herein comprise one or more mutations, such as single amino acid insertions, single amino acid deletions, single amino acid substitutions, or their combination.
  • the Cas13 protein compared to wild-type C13-2 protein (SEQ ID NO: 1), includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, or 90 amino acid changes (e.g.
  • The Cas13 protein, compared to wild-type C13-2 protein (SEQ ID NO: 1), includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, or 90 amino acid changes (e.g., insertions, deletions, or substitutions) but retains the ability to bind target RNA molecules complementary to the guide sequence of the guide polynucleotide.
  • The Cas13 protein, compared to wild-type C13-2 protein (SEQ ID NO: 1), includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 amino acid changes (e.g., insertions, deletions, or substitutions) but retains the ability to bind target RNA molecules complementary to the guide sequence of the guide polynucleotide and/or retains the ability to process the guide array RNA transcript into guide polynucleotide molecules.
  • the Cas13 protein contains one or more mutations in the catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein contains a mutation in the catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein contains one or more mutations in one or both HEPN domains and substantially lacks RNA cleavage activity. In some embodiments, the Cas13 protein contains mutations in one or both HEPN domains and substantially lacks RNA cleavage activity.
  • "Substantially lacking RNA cleavage activity" means retaining only ≤50%, ≤40%, ≤30%, ≤20%, ≤10%, ≤5% or ≤1% of the RNA cleavage activity of the wild-type Cas13 protein, or having no detectable RNA cleavage activity.
  • The Cas13 protein has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity compared to SEQ ID NO: 1. In some embodiments, the Cas13 protein has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity compared to the protein sequence encoded by SEQ ID NO: 9.
  • When the CRISPR-Cas13 system includes a fusion protein of the Cas13 protein with a protein domain and/or a polypeptide tag, the percent sequence identity is calculated between the Cas13 portion of the fusion protein and the reference sequence.
  • the Cas13 protein can form a CRISPR complex with the guide polynucleotide, and the CRISPR complex can specifically bind to the target RNA sequence.
  • The Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide comprising a direct repeat sequence linked to a guide sequence, the guide sequence being engineered to direct sequence-specific binding of the CRISPR complex to a target RNA.
  • One type of modification or mutation involves the substitution of an amino acid residue with a residue having similar biochemical properties, that is, a conservative substitution (e.g., conservative substitutions of 1-4, 1-8, 1-10, or 1-20 amino acids). Typically, conservative substitutions have little or no effect on the activity of the resulting protein or peptide.
  • A conservative substitution is an amino acid substitution in the Cas13 protein that does not substantially affect the binding of the Cas13 protein to a target RNA molecule complementary to the guide sequence of the gRNA molecule, and/or the processing of the guide array RNA transcript into gRNA molecules. Alanine scanning can be used to identify which amino acid residues in the Cas13 protein tolerate amino acid substitutions.
  • The ability of the variant Cas13 protein to modify gene expression in the CRISPR-Cas system is altered by no more than 25%, such as no more than 20%, such as no more than 10%.
  • Examples of amino acid substitutions considered conservative include: Ser for Ala; Lys for Arg; Gln or His for Asn; Glu for Asp; Ser for Cys; Asn for Gln; Asp for Glu; Pro for Gly; Asn or Gln for His; Leu or Val for Ile; Ile or Val for Leu; Arg or Gln for Lys; Leu or Ile for Met; Met, Leu or Tyr for Phe; Thr for Ser; Ser for Thr; Tyr for Trp; Trp or Phe for Tyr; and Ile or Leu for Val.
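  • The conservative substitutions listed above can be encoded as a lookup table keyed by the original residue. The sketch below is illustrative; it reads the Gly entry as Pro replacing Gly, which is an assumption about the intended wording, and the names `CONSERVATIVE` and `is_listed_conservative` are hypothetical.

```python
# Original residue -> replacements listed as conservative in the text.
# Assumption: the Gly entry is read as "Pro replaces Gly".
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_listed_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` appears in the list."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_listed_conservative("Lys", "Arg"))  # True
print(is_listed_conservative("Arg", "Gln"))  # False
```

  • The list is directional: Gln is listed as a replacement for Lys, but not for Arg, so the table is keyed by the residue being replaced.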
  • More substantial changes can be made by using less conservative substitutions, e.g., by selecting residues that differ more significantly in their effect on maintaining: (a) the structure of the polypeptide backbone in the region of the substitution, e.g., as a helical or sheet conformation; (b) the charge or hydrophobicity of the region that interacts with the target site; or (c) the bulk of the side chain.
  • Substitutions generally expected to produce the largest changes in polypeptide function are: (a) substitutions between hydrophilic residues (e.g., serine or threonine) and hydrophobic residues (e.g., leucine, isoleucine, phenylalanine or valine); (b) substitutions between cysteine or proline and any other residue; (c) substitutions between residues with positively charged side chains (e.g., lysine, arginine or histidine) and negatively charged residues (e.g., glutamic acid or aspartic acid); or (d) substitutions between residues with bulky side chains (e.g., phenylalanine) and residues without side chains (e.g., glycine).
  • The Cas13 protein comprises one or more mutations at amino acid positions 40-91 of SEQ ID NO: 1 (i.e., the region from amino acid 40 to amino acid 91 in the sequence of SEQ ID NO: 1, inclusive of amino acid 40 and amino acid 91).
  • the Cas13 protein comprises one or more mutations at amino acid positions 146-153 of SEQ ID NO:1.
  • the Cas13 protein comprises one or more mutations at amino acid positions 158-176 of SEQ ID NO:1.
  • the Cas13 protein comprises one or more mutations at amino acid positions 182-209 of SEQ ID NO:1.
  • the Cas13 protein comprises one or more mutations at amino acid positions 216-253 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 271-287 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 341-353 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 379-424 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 456-477 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 521-557 of SEQ ID NO:1.
  • The Cas13 protein comprises one or more mutations at amino acid positions 575-588 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 609-625 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 700-721 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 724-783 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 796-815 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 828-852 of SEQ ID NO:1. In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 880-893 of SEQ ID NO:1.
  • The Cas13 protein comprises a deletion of one or more amino acids at amino acid positions 348-350 of SEQ ID NO: 1 (i.e., amino acid 348, amino acid 349, and amino acid 350 in the sequence of SEQ ID NO: 1). In some embodiments, the Cas13 protein comprises a deletion of one or more amino acids at amino acid positions 521-556 of SEQ ID NO: 1. In some embodiments, the Cas13 protein comprises a deletion of one or more amino acids at amino acid positions 883-893 of SEQ ID NO: 1.
  • the RxxxxH motif of the Cas13 protein (x represents any amino acid, RxxxxH may also be denoted as Rx4H or R4xH) contains one or more mutations and substantially lacks RNA cleavage activity.
  • The Cas13 protein contains a mutation at a position corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and/or the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 750-755 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 750-755 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 750-755 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • The Cas13 protein contains mutations at positions corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
  • the RxxxxH motif is mutated to AxxxxH, RxxxxA, or AxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxH. In some embodiments, the RxxxxH motif is mutated to RxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxA.
  • The Cas13 protein contains 1, 2, 3, 4, 5 or 6 mutations at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and/or H790 of the reference protein shown in SEQ ID NO: 1.
  • The Cas13 protein is mutated to A (alanine) at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and/or H790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R210 and H215 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein contains mutations at positions corresponding to amino acid residues R750 and H755 of the reference protein set forth in SEQ ID NO: 1. In some embodiments, the Cas13 protein contains mutations at positions corresponding to amino acid residues R785 and H790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R210, H215, R750 and H755 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R750, H755, R785 and H790 of the reference protein set forth in SEQ ID NO: 1.
  • the Cas13 protein contains mutations at positions corresponding to amino acid residues R210, H215, R785 and/or H790 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein comprises mutations at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and H790 of the reference protein set forth in SEQ ID NO: 1.
  • the corresponding position of R210, R750 or R785 is mutated to A. In some embodiments, the corresponding position of H215, H755 or H790 is mutated to A. In some embodiments, the corresponding positions of R210, H215, R750, H755, R785 and H790 are all mutated to A.
  • the Cas13 protein is obtained by introducing mutations into the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755, and/or the RxxxxH motif at positions 785-790 of the sequence shown in SEQ ID NO: 1.
  • the Cas13 protein is obtained by introducing 1, 2, 3, 4, 5 or 6 mutations at the R210, H215, R750, H755, R785 and/or H790 positions of the sequence shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein is obtained by mutating R210, H215, R750, H755, R785 and/or H790 of the sequence shown in SEQ ID NO: 1 to A (alanine).
  • the Cas13 protein is obtained by mutating the R210, H215, R785 and H790 positions of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating the R210, H215, R750 and H755 positions of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating R750, H755, R785 and H790 of the sequence shown in SEQ ID NO: 1 to A. In some embodiments, the Cas13 protein is obtained by mutating R210, H215, R750, H755, R785 and H790 of the sequence shown in SEQ ID NO: 1 to A.
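  • As a hedged illustration of how substitutions written in the form R210A could be applied to a sequence, the sketch below parses each label, checks that the expected wild-type residue is present at the stated 1-based position, and replaces it. The helper name and the toy sequence are assumptions for illustration; SEQ ID NO: 1 itself is not reproduced here.

```python
import re


def apply_substitutions(seq, mutations):
    """Apply substitutions such as 'R210A' (1-based positions) to seq."""
    residues = list(seq)
    for mut in mutations:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mut)
        if m is None:
            raise ValueError(f"cannot parse mutation {mut!r}")
        old, pos, new = m.group(1), int(m.group(2)), m.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{mut}: position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            # Guard against numbering that does not match the reference.
            raise ValueError(
                f"{mut}: expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical toy sequence (NOT SEQ ID NO: 1) with one RxxxxH motif at 3-8.
toy = "MKRNAASHGG"
dead = apply_substitutions(toy, ["R3A", "H8A"])
```

  • With the real reference sequence, the same call would take a list such as `["R210A", "H215A", "R750A", "H755A", "R785A", "H790A"]`; the residue check fails loudly if the numbering does not line up.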
  • the Cas13 protein contains at least one mutation at positions corresponding to amino acid residues 40-91, 146-153, 158-176, 182-209, 216-253, 271-287, 341-353, 379-424, 456-477, 521-557, 575-588, 609-625, 700-721, 724-783, 796-815, 828-852 or 880-893 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains a deletion of one or more amino acids at positions corresponding to amino acid residues 348-350, 521-556, or 883-893 of the reference protein shown in SEQ ID NO: 1.
  • the Cas13 protein contains any one or more at the corresponding position of the following amino acid residues of the reference protein shown in SEQ ID NO: 1
  • the Cas13 protein is mutated to the same amino acid residues at positions corresponding to the mutation sites of the reference protein shown in SEQ ID NO: 1 listed in Table 24. In some embodiments, compared with the reference protein shown in SEQ ID NO: 1, the Cas13 protein contains the same mutations at positions corresponding to the mutation sites of the reference protein shown in SEQ ID NO: 1 listed in Table 24.
  • the Cas13 protein is obtained by introducing any one or more mutations into the sequence shown in SEQ ID NO: 1 at the following positions: R11, N34, R35, R47, R58, R63, R64, N68, N87, N265, N274, R276, R290, R294, N299, N303, R308, R314, R320, R328, N332, R341, N346, R358, N372, N383, N390, N394, R47+R290, R47+R314, R290+ R314, R47+R290+R314, R308+N68, N394+N68, N87+N68, R308+N265, N394+N265, N87+N265, R308+N68+N265, N87+N68+N265, T7, A16, S260, A263, M266, N274, F288, M302, N303, L304, V305, I311, D313, H324, P326, H
  • the Cas13 protein is obtained by introducing any one or more mutations in Table 24 from the sequence shown in SEQ ID NO:1.
  • the Cas13 protein is obtained by sequence deletion at positions corresponding to amino acid residues 91-120, 141-180, 211-240, 331-360, 351-400, 431-460, 461-500, 511-550, 611-640, 631-660, 661-690, 691-760, 821-860 or 861-890 of the reference protein shown in SEQ ID NO: 1.
  • the sequence deletion is ≤300, ≤200, ≤150, ≤100, ≤90, ≤80, ≤70, ≤60, ≤50, ≤40, ≤30, ≤20 or ≤10 amino acid residues.
  • the Cas13 protein is obtained by sequence deletion at positions 91-120, 141-180, 211-240, 331-360, 351-400, 431-460, 461-500, 511-550, 611-640, 631-660, 661-690, 691-760, 821-860 or 861-890 of the sequence shown in SEQ ID NO: 1.
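  • The deletion variants described above can be sketched as follows, assuming 1-based inclusive residue ranges. The helper name and the toy sequence are hypothetical; the sketch removes whole ranges, whereas the text also covers deleting only part of a listed interval.

```python
def delete_ranges(seq, ranges):
    """Remove the given 1-based inclusive (start, end) ranges from seq."""
    drop = set()
    for start, end in ranges:
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"range {start}-{end} does not fit the sequence")
        drop.update(range(start, end + 1))
    # Keep every residue whose 1-based position is not marked for deletion.
    return "".join(aa for i, aa in enumerate(seq, start=1) if i not in drop)


# Hypothetical 20-residue toy sequence (NOT SEQ ID NO: 1).
toy = "ACDEFGHIKLMNPQRSTVWY"
truncated = delete_ranges(toy, [(3, 5), (11, 12)])
removed = len(toy) - len(truncated)
```

  • `removed` gives the number of deleted residues, which is the quantity a size limit on the deletion would be checked against.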
  • Subcellular localization signal (or localization signal)
  • the Cas13 protein or functional fragment thereof is fused to at least one homologous or heterologous subcellular localization signal.
  • exemplary subcellular localization signals include organelle localization signals, such as nuclear localization signals (NLS), nuclear export signals (NES), or mitochondrial localization signals.
  • the Cas13 protein or functional fragment thereof is fused to at least 1 homologous or heterologous NLS. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least 2 NLS. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least 3 NLS. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least 1 N-terminal NLS and at least 1 C-terminal NLS. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least 2 C-terminal NLS. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least 2 N-terminal NLS.
  • the NLS is independently selected from SPKKKRKVEAS (SEQ ID NO:53), GPKKKRKVAAA (SEQ ID NO:54), PKKKRKV (SEQ ID NO:55), KRPAATKKAGQAKKKK (SEQ ID NO:56), PAAKRVKLD (SEQ ID NO:57), RQRRNELKRSP (SEQ ID NO:58), NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO:59), RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO:60), VSRKRPRP (SEQ ID NO:61), PPKKARED (SEQ ID NO:62), POPKKKPL (SEQ ID NO:63), SALIKKKKKMAP (SEQ ID NO:64), DRLRR (SEQ ID NO:65), PKQKKRK (SEQ ID NO:66), RKLKKKIKKL (SEQ ID NO
  • the Cas13 protein or functional fragment thereof is fused to a homologous or heterologous NES. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least two NESs. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least three NESs. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least one N-terminal NES and at least one C-terminal NES. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least two C-terminal NESs. In some embodiments, the Cas13 protein or functional fragment thereof is fused to at least two N-terminal NESs.
  • the NES is independently selected from the group consisting of adenovirus type 5 E1B NES, HIV Rev NES, MAPK NES, and PTK2 NES.
  • the Cas13 protein or functional fragment thereof is fused to homologous or heterologous NLS and NES, and there is a cleavable linker between the NLS and the NES.
  • the NES facilitates the production of delivery particles (eg, virus-like particles) comprising the Cas13 protein or functional fragment thereof in a production cell line.
  • cleavage of the linker in a target cell can expose the NLS and promote nuclear localization of the Cas13 protein or functional fragment thereof in the target cell.
  • the Cas13 protein or functional fragment thereof is covalently linked or fused to homologous or heterologous protein domains and/or polypeptide tags. In some embodiments, the Cas13 protein or functional fragment thereof is fused to a homologous or heterologous protein domain and/or a polypeptide tag.
  • the protein domain and polypeptide tag are optionally selected from: cytosine deaminase domain, adenosine deaminase domain, translation activation domain, translation inhibition domain, RNA methylation domain, RNA demethylation domain, nuclease domain, splicing factor domain, reporter domain, affinity domain, subcellular localization signal, reporter tag and affinity tag.
  • the protein domain includes a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibition domain, an RNA methylation domain, an RNA demethylation domain, a ribonuclease domain, a splicing factor domain, a reporter domain and an affinity domain.
  • the polypeptide tag includes a reporter tag and an affinity tag.
  • the length of the amino acid sequence of the protein domain is ≥40 amino acids, ≥50 amino acids, ≥60 amino acids, ≥70 amino acids, ≥80 amino acids, ≥90 amino acids, ≥100 amino acids, ≥150 amino acids, ≥200 amino acids, ≥250 amino acids, ≥300 amino acids, ≥350 amino acids, or ≥400 amino acids.
  • Exemplary protein domains include domains that can cleave RNA (e.g., PIN endonuclease domain, NYN domain, SMR domain from SOT1, or RNase domain from staphylococcal nuclease), domains that can affect RNA stability (e.g., tristetraprolin (TTP) or domains from UPF1, EXOSC5, and STAU1), domains that can edit nucleotides or ribonucleotides (e.g., cytidine deaminases, PPR proteins, adenosine deaminases, ADAR family proteins, or APOBEC family proteins), domains that can activate translation (e.g., eIF4E and other translation initiation factors, yeast poly(A)-binding proteins, or domains of GLD2), domains that can inhibit translation (e.g., Pumilio or FBF PUF proteins, deadenylase, CAF1, Argonaute proteins), domains that can methyl
  • the protein domain comprises an adenosine deaminase domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to an adenosine deaminase domain to direct A-to-I deaminase activity on RNA transcripts in mammalian cells, as described in Cox et al., Science 358(6366):1019-1027 (2017), which is incorporated herein by reference in its entirety.
  • the adenosine deaminase domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended within the guide polynucleotide, thereby allowing the adenosine deaminase domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises a cytosine deaminase domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a cytosine deaminase domain to direct C-to-U deaminase activity on RNA transcripts in mammalian cells. A cytosine deaminase domain evolved from ADAR2 for targeted C-to-U RNA editing is described in Abudayyeh et al., Science 365(6451):382-386 (2019), which is incorporated herein by reference in its entirety.
  • the cytosine deaminase domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended to the guide polynucleotide, thereby allowing the cytosine deaminase domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises a splicing factor domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a splicing factor domain to direct alternative splicing of target RNA in mammalian cells. Splicing factor domains for targeting alternative splicing are described in Konermann et al., Cell 173(3):665-676 (2018), which is incorporated herein by reference in its entirety.
  • Non-limiting examples of splicing factor domains include the RS-rich domain of SRSF1, the Gly-rich domain of hnRNPA1, the alanine-rich motif of RBM4, or the proline-rich motif of DAZAP1.
  • the splicing factor domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended to the guide polynucleotide, thereby allowing the splicing factor domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises a translation activation domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a translation activation domain to activate or increase expression of target RNA.
  • Non-limiting examples of translation activation domains include domains of eIF4E and other translation initiation factors, yeast poly(A) binding protein, or GLD2.
  • the translation activation domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended to the guide polynucleotide, thereby allowing the translation activation domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises a translation inhibition domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a translation inhibition domain to inhibit or reduce the expression of a target RNA.
  • translation inhibition domains include Pumilio or FBF PUF proteins, deadenylase, CAF1, Argonaute proteins.
  • the translation inhibition domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended within the guide polynucleotide, thereby allowing the translation inhibition domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises an RNA methylation domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to an RNA methylation domain for methylation of target RNA.
  • RNA methylation domains include m6A domains, such as METTL14, METTL3 or WTAP.
  • the RNA methylation domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended to the guide polynucleotide, thereby allowing the RNA methylation domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises an RNA demethylation domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to an RNA demethylation domain for demethylation of target RNA.
  • RNA demethylation domains include human alkylation repair homolog 5 (ALKBH5).
  • the RNA demethylation domain is covalently linked or fused to an adapter protein capable of binding an aptamer sequence inserted or appended to the guide polynucleotide, thereby allowing the RNA demethylation domain to be non-covalently linked to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.
  • the protein domain comprises a ribonuclease domain.
  • the Cas13 protein with a mutated HEPN domain, a catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a ribonuclease domain to cleave target RNA.
  • ribonuclease domains include PIN endonuclease domains, NYN domains, SMR domains from SOT1, or RNase domains from staphylococcal nucleases.
  • the protein domain comprises an affinity domain (affinity domain) and/or a reporter domain (reporter domain).
  • the Cas13 protein or functional fragment thereof is covalently linked or fused to a reporter domain, such as a fluorescent protein.
  • reporter domains include GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP.
  • the Cas13 protein is covalently linked or fused to a polypeptide tag.
  • examples of the polypeptide tags are small polypeptide sequences.
  • the length of the amino acid sequence of the polypeptide tag is ≤50 amino acids, ≤40 amino acids, ≤30 amino acids, ≤25 amino acids, ≤20 amino acids, ≤15 amino acids, ≤10 amino acids or ≤5 amino acids.
  • the Cas13 protein is covalently linked or fused to an affinity tag, such as a purification tag.
  • affinity tags include HA-tag, His-tag (e.g., 6-His), Myc-tag, E-tag, S-tag, calmodulin tag, FLAG-tag, GST-tag, MBP-tag, Halo-tag or biotin.
  • the C13-2 inactivating mutant (R210A+H215A+R750A+H755A+R785A+H790A) is fused to a protein domain and/or a polypeptide tag.
  • the Cas13 protein is fused to an ADAR.
  • the C13-2 inactivating mutant (R210A+H215A+R750A+H755A+R785A+H790A) is fused to cytosine deaminase or adenine deaminase.
  • the C13-2 inactivating mutant (R210A+H215A+R750A+H755A+R785A+H790A) is covalently linked to cytosine deaminase or adenine deaminase directly, via the rigid linker peptide sequence A(EAAAK)3A, or via the flexible linker peptide sequence (GGGGS)3.
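  • The linker notation used above can be expanded into explicit amino acid sequences, as in the following minimal sketch. The subscript 3 is read as three tandem repeats. The `fuse` helper, the placeholder sequences, and the order of the fused parts (Cas13 at the N-terminus) are assumptions for illustration; this section does not specify the orientation.

```python
# A(EAAAK)3A: an alanine, three EAAAK repeats, and a closing alanine.
RIGID_LINKER = "A" + "EAAAK" * 3 + "A"
# (GGGGS)3: three GGGGS repeats.
FLEXIBLE_LINKER = "GGGGS" * 3


def fuse(n_terminal_part, c_terminal_part, linker=""):
    """Concatenate two protein sequences, optionally separated by a linker."""
    return n_terminal_part + linker + c_terminal_part


# Hypothetical placeholder sequences, not real Cas13 or deaminase sequences.
cas13_toy = "MKANAASAGG"
deaminase_toy = "MDEFHIK"

direct = fuse(cas13_toy, deaminase_toy)
rigid = fuse(cas13_toy, deaminase_toy, RIGID_LINKER)
flexible = fuse(cas13_toy, deaminase_toy, FLEXIBLE_LINKER)
```

  • The three results correspond to the three linkage options named in the text: direct fusion, the rigid linker, and the flexible linker.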
  • the guide polynucleotide further comprises an aptamer sequence.
  • the aptamer sequence is inserted into a loop of a guide polynucleotide.
  • the aptamer sequence is inserted into a tetra loop of a guide polynucleotide.
  • An exemplary tetra loop of a guide polynucleotide is shown in Figure 2.
  • the aptamer sequence is appended to the terminus of the guide polynucleotide.
  • the aptamer sequence includes an MS2 aptamer sequence, a PP7 aptamer sequence, or a Qβ aptamer sequence.
  • the CRISPR-Cas13 system further comprises a fusion protein containing an adapter protein and a homologous or heterologous protein domain and/or a polypeptide tag, or a nucleic acid encoding the fusion protein, wherein the adapter protein is capable of binding aptamer sequence.
  • the adapter protein includes MS2 phage coat protein (MCP), PP7 phage coat protein (PCP), or Qβ phage coat protein (QCP).
  • the protein domain includes a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation inhibition domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, an affinity domain or a reporter domain.
  • the guide polynucleotide comprises modified nucleotides.
  • the modified nucleotides comprise 2'-O-methyl, 2'-O-methyl-3'-phosphorothioate, or 2'-O-methyl-3'-thioPACE.
  • the guide polynucleotide is a chemically modified guide polynucleotide. Chemically modified guide polynucleotides are described in Hendel et al., Nat. Biotechnol. 33(9):985-989 (2015), which is incorporated herein by reference in its entirety.
  • the guide polynucleotide is a hybrid RNA-DNA guide, a hybrid RNA-LNA (locked nucleic acid) guide, a hybrid DNA-LNA guide, or a hybrid DNA-RNA-LNA guide.
  • the direct repeat sequence contains one or more ribonucleotides substituted by a corresponding deoxyribonucleotide.
  • the guide sequence contains one or more ribonucleotides substituted by corresponding deoxyribonucleotides.
  • Hybrid RNA-DNA guide polynucleotides are described in WO2016/123230, which is incorporated herein by reference in its entirety.
  • Another aspect of the present disclosure relates to a vector system comprising a CRISPR-Cas13 system described herein, said vector system comprising one or more vectors comprising a polynucleotide sequence encoding said Cas13 protein or fusion protein and a polynucleotide sequence encoding the guide polynucleotide.
  • the vector system includes at least one plasmid or viral vector (eg, retrovirus, lentivirus, adenovirus, adeno-associated virus, or herpes simplex virus).
  • the polynucleotide sequence encoding a Cas13 protein or fusion protein and the polynucleotide sequence encoding a guide polynucleotide are located on the same vector.
  • the polynucleotide sequence encoding a Cas13 protein or fusion protein and the polynucleotide sequence encoding a guide polynucleotide are located on multiple vectors.
  • the polynucleotide sequence encoding a Cas13 protein or fusion protein and/or the polynucleotide sequence encoding a guide polynucleotide is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding a Cas13 protein or fusion protein is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding a guide polynucleotide is operably linked to a regulatory sequence.
  • the regulatory sequences of the polynucleotide sequence encoding a Cas13 protein or fusion protein are the same as or different from the regulatory sequences of the polynucleotide sequence encoding a guide polynucleotide.
  • the regulatory sequences are optionally selected from promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signal and poly-U sequence).
  • control sequences include control sequences that cause the nucleotide sequence to be constitutively expressed in many types of host cells, as well as control sequences that cause the nucleotide sequence to be expressed only in certain host cells (e.g., tissue-specific regulatory sequences).
  • tissue-specific regulatory sequences can be expressed primarily in the desired tissue of interest, such as muscle, neurons, bone, skin, blood, specific organs (e.g., liver, pancreas), or specific cell types (e.g., lymphocytes).
  • Regulatory sequences may also direct expression in a time-dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, which may or may not be tissue or cell type specific.
  • the regulatory sequence is an enhancer element, such as a WPRE, the CMV enhancer, the R-U5 segment in the LTR of HTLV-1, the SV40 enhancer, or the intronic sequence between exons 2 and 3 of rabbit β-globin.
  • the vector comprises a pol III promoter (e.g., U6 and H1 promoters), a pol II promoter (e.g., a retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with an RSV enhancer), cytomegalovirus (CMV) promoter (optionally with CMV enhancer), SV40 promoter, dihydrofolate reductase promoter, β-actin promoter, phosphoglycerate kinase (PGK) promoter, or EF1α promoter), or both a pol III promoter and a pol II promoter.
  • the promoter is a constitutive promoter that is continuously active and not regulated by external signals or molecules. Suitable constitutive promoters include, but are not limited to, CMV, RSV, SV40, EF1α, CAG and β-actin promoters. In some embodiments, the promoter is an inducible promoter that is regulated by external signals or molecules (e.g., transcription factors).
  • the promoter is a tissue-specific promoter that can be used to drive tissue-specific expression of the Cas13 protein or fusion protein.
  • Suitable muscle-specific promoters include, but are not limited to, CK8, MHCK7, myoglobin promoter (Mb), desmin promoter, muscle creatine kinase promoter (MCK) and variants thereof, and the SPc5-12 synthetic promoter.
  • Suitable immune cell-specific promoters include, but are not limited to, B29 promoter (B cells), CD14 promoter (monocytes), CD43 promoter (leukocytes and platelets), CD68 promoter (macrophages), and SV40/CD43 promoter (leukocytes and platelets).
  • Suitable blood cell-specific promoters include, but are not limited to, CD43 promoter (leukocytes and platelets), CD45 promoter (hematopoietic cells), INF-β promoter (hematopoietic cells), WASP promoter (hematopoietic cells), SV40/CD43 promoter (leukocytes and platelets), and the SV40/CD45 promoter (hematopoietic cells).
  • Suitable pancreas-specific promoters include, but are not limited to, the elastase-1 promoter.
  • Suitable endothelial cell-specific promoters include, but are not limited to, the Flt-1 promoter and the ICAM-2 promoter.
  • Suitable neuronal tissue/cell specific promoters include, but are not limited to, GFAP promoter (astrocytes), SYN1 promoter (neurons), and NSE/RU5' (mature neurons).
  • Suitable kidney-specific promoters include, but are not limited to, the NphsI promoter (podocyte).
  • Suitable bone-specific promoters include, but are not limited to, the OG-2 promoter (osteoblast, odontoblast).
  • Suitable lung-specific promoters include, but are not limited to, SP-B promoter (lung).
  • Suitable liver-specific promoters include, but are not limited to, the SV40/Alb promoter.
  • Suitable cardiac-specific promoters include, but are not limited to, alpha-MHC.
  • the adeno-associated viral (AAV) vector comprises DNA encoding a Cas13 protein or fusion protein described herein and a guide polynucleotide.
  • the AAV vector comprises an ssDNA genome comprising coding sequences for a Cas13 protein or fusion protein and a guide polynucleotide, flanked by ITRs.
  • the CRISPR-Cas13 systems described herein are packaged in AAV vectors, such as AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, and AAVrh74.
  • the CRISPR-Cas13 systems described herein are packaged in AAV vectors containing engineered capsids with tissue tropism, such as engineered muscle tropism capsids.
  • the lipid nanoparticles (LNPs) comprise a guide polynucleotide described herein and mRNA encoding a Cas13 protein or fusion protein described herein.
  • lipid nanoparticles comprise four components: cationic or ionizable lipids, cholesterol, helper lipids, and PEG-lipids.
  • the cationic or ionizable lipids include cKK-E12, C12-200, ALC-0315, DLin-MC3-DMA, DLin-KC2-DMA, FTT5, Moderna SM-102, and Intellia LP01.
  • the PEG-lipid comprises PEG-2000-C-DMG, PEG-2000-DMG, or ALC-0159.
  • the helper lipid includes DSPC. The components of LNP are described in Paunovska et al., Nature Reviews Genetics 23:265-280 (2022), which is incorporated by reference in its entirety.
  • a lentiviral vector comprising the CRISPR-Cas13 system described herein, wherein the lentiviral vector comprises a guide polynucleotide described herein and mRNA encoding a Cas13 protein or fusion protein described herein.
  • the lentiviral vector is pseudotyped with homologous or heterologous envelope proteins, such as VSV-G.
  • the mRNA encoding a Cas13 protein or fusion protein is linked to an aptamer sequence.
  • a ribonucleoprotein complex comprising a CRISPR-Cas13 system described herein, wherein the ribonucleoprotein complex is formed from a guide polynucleotide described herein and a Cas13 protein or fusion protein.
  • the ribonucleoprotein complex can be delivered to eukaryotic, mammalian or human cells by microinjection or electroporation.
  • the ribonucleoprotein complex can be packaged in virus-like particles and delivered to a mammalian or human subject in vivo.
  • a virus-like particle comprising a CRISPR-Cas13 system described herein, wherein the virus-like particle comprises: a guide polynucleotide described herein and a Cas13 protein or fusion protein; or a ribonucleoprotein complex formed from the guide polynucleotide and the Cas13 protein or fusion protein.
  • engineered virus-like particles are pseudotyped with homologous or heterologous envelope proteins, such as VSV-G.
  • the Cas13 protein is fused to a gag protein (eg, MLVgag) through a cleavable linker, wherein cleavage of the linker in the target cell exposes the NLS between the linker and the Cas13 protein.
  • the fusion protein comprises (e.g., from 5' to 3') a gag protein (e.g., MLVgag), one or more NES, a cleavable linker, one or more NLS, and Cas13, as described in Banskota et al., Cell 185(2):250-265 (2022).
  • the Cas13 protein is fused to a first dimerization domain capable of dimerizing or heterodimerizing with a second dimerization domain fused to a membrane protein, wherein the presence of a ligand promotes dimerization and enriches the Cas13 protein or fusion protein into VLPs, as described in Campbell et al., Molecular Therapy 27:151-163 (2019).
  • Cells can be eukaryotic or prokaryotic.
  • Examples of such cells include, but are not limited to, bacterial, archaeal, plant, fungal, yeast, insect, and mammalian cells, such as Lactobacillus, Lactococcus, Bacillus (e.g., Bacillus subtilis), Escherichia (e.g., Escherichia coli), Clostridium, Saccharomyces or Pichia (e.g., Saccharomyces cerevisiae or Pichia pastoris), Kluyveromyces lactis, Salmonella typhimurium, Drosophila cells, Caenorhabditis elegans cells, Xenopus cells, SF9 cells, C129 cells, 293 cells, Neurospora, and immortalized mammalian cell lines (e.g., HeLa cells
  • the cell is a prokaryotic cell, such as a bacterial cell, such as E. coli.
  • the cell is a eukaryotic cell, such as a mammalian cell or a human cell.
  • the cells are primary eukaryotic cells, stem cells, tumor/cancer cells, circulating tumor cells (CTCs), blood cells (e.g., T cells, B cells, NK cells, Tregs, etc.), hematopoietic stem cells, specialized immune cells (such as tumor-infiltrating lymphocytes or tumor suppressor lymphocytes), stromal cells in the tumor microenvironment (such as cancer-associated fibroblasts, etc.).
  • the cells are brain or neuronal cells of the central or peripheral nervous system (eg, neurons, astrocytes, microglia, retinal ganglion cells, rods/cones, etc.).
  • the CRISPR-Cas13 system described herein may be used to target one or more target RNA molecules, such as those present in biological samples, environmental samples (e.g., soil, air or water samples), etc.
  • the target RNA is a coding RNA, such as pre-mRNA or mature mRNA.
  • the target RNA is nuclear RNA.
  • the target RNA is an RNA transcript located in the nucleus of a eukaryotic cell.
  • the target RNA is a non-coding RNA, such as a functional RNA, siRNA, microRNA, snRNA, snoRNA, piRNA, scaRNA, tRNA, rRNA, lncRNA, or lincRNA.
  • a CRISPR-Cas13 system, composition or kit described herein performs one or more of the following functions on the target RNA: cleaving one or more target RNA molecules or nicking one or more target RNA molecules, activating or upregulating one or more target RNA molecules, activating or inhibiting the translation of one or more target RNA molecules, inactivating one or more target RNA molecules, visualizing, labeling or detecting one or more target RNA molecules, binding one or more target RNA molecules, editing one or more target RNA molecules, transporting one or more target RNA molecules, and masking one or more target RNA molecules.
  • the CRISPR-Cas13 systems, compositions, or kits described herein modify one or more target RNA molecules, and the modification of one or more target RNA molecules includes one or more of the following: RNA base substitution, RNA base deletion, RNA base insertion, fragmentation of target RNA, RNA methylation and RNA demethylation.
  • a CRISPR-Cas13 system, composition, or kit described herein can target one or more target RNA molecules.
  • a CRISPR-Cas13 system, composition, or kit described herein can bind one or more target RNA molecules.
  • a CRISPR-Cas13 system, composition, or kit described herein can cleave one or more target RNA molecules.
  • a CRISPR-Cas13 system, composition, or kit described herein can activate translation of one or more target RNA molecules. In some embodiments, a CRISPR-Cas13 system, composition, or kit described herein can inhibit translation of one or more target RNA molecules. In some embodiments, a CRISPR-Cas13 system, composition or kit described herein can detect one or more target RNA molecules. In some embodiments, a CRISPR-Cas13 system, composition, or kit described herein can edit one or more target RNA molecules.
  • the target RNA is AQp1 RNA. Knocking down AQp1 RNA levels using the CRISPR-Cas13 system described herein can reduce the production of aqueous humor and lower intraocular pressure, which can be used to treat diseases such as glaucoma.
  • the target RNA is AQp1 RNA and the guide sequence of the guide polynucleotide is SEQ ID NO: 5.
  • the target RNA is PTBP1 RNA. Knocking down PTBP1 RNA levels using the CRISPR-Cas13 system described here can promote the transdifferentiation of brain astrocytes into neurons, which could be used to treat diseases such as Parkinson's disease.
  • the target RNA is PTBP1 RNA and the guide sequence of the guide polynucleotide is SEQ ID NO: 6.
  • the target RNA is VEGFA RNA. Reducing VEGFA RNA levels using the CRISPR-Cas13 system described here prevents choroidal neovascularization, which could be used to treat diseases such as age-related macular degeneration.
  • the target RNA is ANGPTL3 RNA. Knocking down ANGPTL3 RNA levels using the CRISPR-Cas13 system described herein can reduce low-density lipoprotein cholesterol (LDL-C) and other blood lipids, and can be used to treat atherosclerotic cardiovascular diseases such as hyperlipidemia and familial hypercholesterolemia.
  • the target RNA is ANGPTL3 RNA and the guide sequence of the guide polynucleotide is selected from any one or more of SEQ ID NO: 42-49.
  • compositions comprising the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vector described herein, the ribonucleoprotein complex described herein, the virus-like particles described herein, or the eukaryotic cells described herein.
  • the pharmaceutical composition may comprise, for example, an AAV vector encoding a Cas13 protein or fusion protein described herein and a guide polynucleotide.
  • the pharmaceutical composition may comprise, for example, lipid nanoparticles comprising a guide polynucleotide described herein and an mRNA encoding the Cas13 protein or fusion protein.
  • the pharmaceutical composition may comprise, for example, a lentiviral vector comprising a guide polynucleotide described herein and an mRNA encoding the Cas13 protein or fusion protein.
  • the pharmaceutical composition may comprise, for example: a virus-like particle comprising a guide polynucleotide described herein and a Cas13 protein or fusion protein; or a ribonucleoprotein complex formed by the guide polynucleotide and a Cas13 protein or fusion protein.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein for cleaving or editing target RNA in mammalian cells.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein for any one of the following purposes: cleaving or nicking one or more target RNA molecules, activating or upregulating one or more target RNA molecules, activating or inhibiting translation of one or more target RNA molecules, inactivating one or more target RNA molecules, visualizing, labeling or detecting one or more target RNA molecules, binding one or more target RNA molecules, transporting one or more target RNA molecules, and masking one or more target RNA molecules.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein for modifying one or more target RNA molecules in mammalian cells, the modification of one or more target RNA molecules including one or more of the following: RNA base substitution, RNA base deletion, RNA base insertion, target RNA fragmentation, RNA methylation and RNA demethylation.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in diagnosing, treating, or preventing diseases or conditions associated with target RNA.
  • the disease or disorder is Parkinson's disease.
  • the disease or disorder is Parkinson's disease and the target RNA is PTBP1 RNA.
  • the disease or condition is glaucoma. In some embodiments, the disease or condition is glaucoma and the target RNA is AQp1 RNA. In some embodiments, the disease or disorder is amyotrophic lateral sclerosis. In some embodiments, the disease or disorder is amyotrophic lateral sclerosis and the target RNA is superoxide dismutase 1 (SOD1) RNA. In some embodiments, the disease or disorder is age-related macular degeneration and the target RNA is VEGFA RNA. In some embodiments, the disease or condition is age-related macular degeneration, and the target RNA is VEGFA RNA or VEGFR1 RNA. In some embodiments, the disease or condition is elevated plasma LDL cholesterol levels. In some embodiments, the disease or disorder is elevated plasma LDL cholesterol levels, and the target RNA is PCSK9 RNA or ANGPTL3 RNA.
  • Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticles described herein, the lentiviral vectors described herein, the ribonucleoprotein complexes described herein, the virus-like particles described herein, or the eukaryotic cells described herein in the preparation of a medicament for diagnosing, treating, or preventing diseases or conditions related to target RNA.
  • the disease or disorder is Parkinson's disease.
  • the disease or condition is glaucoma.
  • the disease or disorder is amyotrophic lateral sclerosis. In some embodiments, the disease or condition is age-related macular degeneration. In some embodiments, the disease or condition is elevated plasma LDL cholesterol levels. In some embodiments, the disease or disorder is Parkinson's disease and the target RNA is PTBP1 RNA. In some embodiments, the disease or condition is glaucoma and the target RNA is AQp1 RNA. In some embodiments, the disease or disorder is amyotrophic lateral sclerosis and the target RNA is superoxide dismutase 1 (SOD1) RNA. In some embodiments, the disease or condition is age-related macular degeneration and the target RNA is VEGFA RNA or VEGFR1 RNA. In some embodiments, the disease or disorder is elevated plasma LDL cholesterol levels and the target RNA is PCSK9 RNA or ANGPTL3 RNA.
  • the pharmaceutical composition is delivered to a human subject in vivo.
  • the pharmaceutical composition can be delivered by any effective route.
  • routes of administration include, but are not limited to, intravenous infusion, intravenous injection, intraperitoneal injection, intramuscular injection, intratumoral injection, subcutaneous injection, intradermal injection, intraventricular injection, intravascular injection, intracerebellar injection, ophthalmic injection, intraocular injection, subretinal injection, intravitreal injection, intracameral injection, intratympanic injection, intranasal administration and inhalation.
  • methods of targeting RNA result in editing the sequence of the target RNA.
  • the target RNA can be cleaved or nicked at a precise location (e.g., when the target RNA exists as a double-stranded nucleic acid molecule, either single strand is cleaved).
  • this method is used to reduce the expression of target RNA, which will reduce the translation of the corresponding protein. This approach can be used in cells where increased RNA expression is not desired.
  • the target RNA is associated with a disease such as cystic fibrosis, Huntington's disease, Tay-Sachs disease, fragile X syndrome, fragile X-associated tremor/ataxia syndrome, muscular dystrophy, myotonic dystrophy, spinal muscular atrophy, spinocerebellar ataxia, age-related macular degeneration, or familial ALS.
  • the target RNA is associated with cancer, such as lung, breast, colon, liver, pancreatic, prostate, bone, brain, skin (e.g., melanoma), or kidney cancer.
  • target RNAs include, but are not limited to, those associated with cancer (e.g., PD-L1, BCR-ABL, Ras, Raf, p53, BRCA1, BRCA2, CXCR4, β-catenin, HER2, and CDK4). Editing such target RNA can produce therapeutic effects.
  • the RNA is expressed in immune cells.
  • the target RNA may encode a protein that inhibits a desired immune response, such as tumor infiltration. Knocking down such an RNA (e.g., PD1, CTLA4, LAG3, TIM3) can promote the desired immune response.
  • the target RNA encodes a protein that activates an undesired immune response, for example in the context of autoimmune diseases such as multiple sclerosis, Crohn's disease, lupus or rheumatoid arthritis.
  • Another aspect of the present disclosure relates to an in vitro composition comprising a CRISPR-Cas13 system described herein and a labeled detector RNA that is unable to hybridize to a guide polynucleotide described herein.
  • Another aspect of the present disclosure relates to the use of the CRISPR-Cas13 system described herein for detecting target RNA in a nucleic acid sample suspected of containing the target RNA.
  • methods of detecting target RNA include a Cas13 protein or fusion protein fused to a fluorescent protein or other detectable label and a guide polynucleotide comprising a guide sequence specific for the target RNA. Binding of Cas13 proteins or fusion proteins to target RNA can be visualized by microscopy or other imaging methods.
  • RNA aptamer sequences, such as MS2, PP7, Qβ and other aptamers, can be appended to or inserted into guide polynucleotides.
  • Introducing proteins that specifically bind to these aptamers can be used to detect target RNA because the Cas13-guide-target RNA complex will be labeled through the aptamer interaction.
  • methods of detecting target RNA in a cell-free system result in the production of a detectable label or enzymatic activity.
  • the target RNA will be recognized by Cas13. Binding of Cas13 to target RNA triggers its RNase activity, which results in cleavage of the target RNA as well as a detectable label.
  • the detectable label is RNA linked to a fluorescent probe and a quencher.
  • Intact detectable RNA is linked to a fluorescent probe and quencher to suppress fluorescence. After the detectable RNA is cleaved by Cas13, the fluorescent probe is released from the quencher and displays fluorescent activity.
  • This method can be used to determine whether target RNA is present in lysed cell samples, lysed tissue samples, blood samples, saliva samples, environmental samples (e.g., water, soil, or air samples), or other lysed cellular or cell-free samples. This method can also be used to detect pathogens, such as viruses or bacteria, or to diagnose disease states, such as cancer.
  • detection of target RNA aids in diagnosing disease and/or pathological conditions, or the presence of viral or bacterial infections.
  • Cas13-mediated detection of non-coding RNAs such as PCA3 can be used to diagnose prostate cancer if detected in patient urine.
  • Cas13-mediated detection of lncRNA-AA174084, a gastric cancer biomarker, can be used to diagnose gastric cancer.
  • Protein sequences within 10 kb upstream and downstream of each CRISPR array are compared with known Cas13 proteins, and proteins with an E-value greater than 1e-5 are filtered out. The remaining proteins are then compared with NCBI's NR database and EBI's patent database to filter out proteins with high similarity, and candidate proteins are selected. Through experimental verification, the C13-2 protein (SEQ ID NO: 1, 893 aa) was finally obtained.
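  • The E-value screen described above can be illustrated with a minimal sketch. It assumes, hypothetically, that the comparison against known Cas13 proteins was exported as BLAST tabular text (output format 6, in which the eleventh column is the E-value); the function name, the sample rows, and all identifiers are illustrative and do not come from the source.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def keep_significant_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, evalue) for hits whose E-value is <= cutoff.

    Assumes BLAST tabular output (format 6): column 11 holds the E-value.
    """
    kept = []
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical rows, for illustration only.
example_rows = [
    "cand_1\tknown_Cas13_a\t31.2\t850\t520\t12\t1\t840\t5\t860\t2e-40\t160",
    "cand_2\tknown_Cas13_a\t24.0\t90\t60\t3\t10\t99\t200\t290\t0.8\t30.1",
]
hits = keep_significant_hits(example_rows)
print([query for query, _, _ in hits])
```

  • In this sketch a hit with an E-value above the cutoff (the second hypothetical row) is dropped, which mirrors the "filtered out" wording; the later similarity screen against the NR and patent databases is not modeled here.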
  • the C13-2 protein is also known as CasRfg.4 protein.
  • the source of the genome sequence of C13-2 protein is shown in Table 1.
  • the native (wild-type) DNA coding sequence of C13-2 protein is SEQ ID NO:9.
  • the locus structure of the C13-2 protein is shown in Figure 1, including the CRISPR array and C13-2 coding sequence.
  • the direct repeat (DR) sequence or scaffold sequence of the gRNA used in combination with C13-2 can be:
  • RNA secondary structure of the above direct repeat sequence was predicted using RNAfold, as shown in Figure 2.
  • the constructed recombinant vector is named C13-2-pET28a (SEQ ID NO:10). This recombinant vector is used to express C13-2 recombinant protein (SEQ ID NO:11).
  • the C13-2 recombinant protein structure is His tag-NLS-Cas13-SV40 NLS-nucleoplasmin NLS.
  • the sequence of the synthetic exogenous EGFP expression vector is shown in SEQ ID NO:13, and the plasmid structure is CMV-EGFP.
  • A verification vector plasmid for C13-2 protein targeting EGFP was synthesized; its full-length sequence is shown in SEQ ID NO:14, and the plasmid structure is CMV-C13-2-U6-gRNA.
  • EGFP is used as an exogenous reporter gene, and its nucleic acid sequence (720bp) is shown in SEQ ID NO:12.
  • the guide sequence targeting EGFP is ugccguucuucugcuugucggccaugauau (SEQ ID NO: 4).
  • the exogenous EGFP expression vector and the verification vector plasmid for C13-2 protein targeting EGFP were transfected into 293T cells in a 24-well plate at a ratio of 1:2 (166 ng:334 ng).
  • the transfection method is as follows:
  • Use Opti-MEM I (Thermo, 11058021) reduced serum medium to dilute the aforementioned 500 ng of plasmid DNA, and mix gently;
  • 48 hours after transfection, the cells were digested with trypsin (Trypsin 0.25%, EDTA, Thermo) and centrifuged at 300g for 5 minutes, the supernatant was removed, and the cells in each well were resuspended in 500 μL of PBS.
  • EGFP fluorescence expression was detected by flow cytometry. After removing cell debris through FSC-A and SSC-A gates, data were collected by flow cytometry.
  • The expression vector C13-2-BsaI plasmid was synthesized; its sequence is shown in SEQ ID NO:15.
  • the target nucleic acids selected in the experiment are AQp1 (Aquaporin 1) and PTBP1 (Polypyrimidine Tract Binding Protein 1).
  • the 293T cell line that highly expresses AQp1 was used to verify AQp1, and the 293T cell line was used to verify PTBP1.
  • Method for constructing the 293T cell line that highly expresses AQp1 (293T-AQp1 cells): construct the vector Lv-AQp1-T2a-GFP (SEQ ID NO: 16), which overexpresses the AQp1 gene and the EGFP gene and in which AQp1 and EGFP are separated by a 2A peptide. Lentivirus packaged from the Lv-AQp1-T2a-GFP plasmid was used to transduce 293T cells to form a cell line stably overexpressing the AQp1 gene.
  • the gRNA guide sequence targeting AQp1 is:
  • the gRNA guide sequence targeting PTBP1 is:
  • Use primer annealing to obtain fragments targeting the target site.
  • the primers are as follows:
  • Upstream primer 5’-AGACGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’ (SEQ ID NO: 22)
  • Upstream primer 5’-AGACAGGGCAGAACCGATGCTGATGAAGAC-3’ (SEQ ID NO: 24)
  • the primer annealing reaction system is as shown in the table below. Incubate in the PCR machine at 95°C for 5 minutes, then immediately take it out and incubate on ice for 5 minutes so that the primers anneal to each other to form double-stranded DNA with sticky ends.
  • the C13-2 vector structure is CMV-C13-2-U6-gRNA, which can be used to express C13-2 protein and gRNA targeting AQp1 or PTBP1.
  • control vectors were prepared using conventional methods:
  • The sequence of the CasRx-AQp1 plasmid (positive control vector for CasRx targeting AQp1) is shown in SEQ ID NO:17, and the plasmid structure is CMV-CasRx-U6-gRNA. It contains the sequence encoding CasRx (the amino acid sequence is shown in SEQ ID NO: 2).
  • The sequence of the shRNA-AQp1 plasmid (positive control vector for shRNA targeting AQp1) is shown in SEQ ID NO: 19; the plasmid is used to express shRNA molecules.
  • the shRNA molecule sequence is CCACGACCCUCUUUGUCUUCA CUCGAG UGAAGACAAAGAGGGUCGUGG (SEQ ID NO: 7).
  • The sequence of the shRNA-PTBP1 plasmid (positive control vector for shRNA targeting PTBP1) is shown in SEQ ID NO:20. The shRNA molecule sequence is CAGCCCAUCUACAUCCAGUUC CUCGAG GAACUGGAUGUAGAUGGGCUG (SEQ ID NO:8).
  • The sequence of the CasRx-blank plasmid (blank control vector, which can express CasRx and gRNA, but the gRNA targets neither AQp1 nor PTBP1) is shown in SEQ ID NO: 21, and the plasmid structure is CMV-CasRx-U6-gRNA.
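  • Both shRNA molecules listed above (SEQ ID NO: 7 and SEQ ID NO: 8) read as a 21-nt sense strand, a CUCGAG loop, and the reverse complement of the sense strand. The following sketch checks that layout; the helper names are illustrative, and treating CUCGAG as the loop is an interpretation of the printed spacing rather than a statement from the source.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return seq.translate(RNA_COMPLEMENT)[::-1]


def build_shrna(sense, loop="CUCGAG"):
    """Assemble sense + loop + antisense (illustrative hairpin layout)."""
    return sense + loop + reverse_complement_rna(sense)


# Sequences as printed in the text, with the display spaces removed.
SHRNA_AQP1 = "CCACGACCCUCUUUGUCUUCA CUCGAG UGAAGACAAAGAGGGUCGUGG".replace(" ", "")
SHRNA_PTBP1 = "CAGCCCAUCUACAUCCAGUUCCUCGAG GAACUGGAUGUAGAUGGGCUG".replace(" ", "")

aqp1_matches = build_shrna("CCACGACCCUCUUUGUCUUCA") == SHRNA_AQP1
ptbp1_matches = build_shrna("CAGCCCAUCUACAUCCAGUUC") == SHRNA_PTBP1
print(aqp1_matches, ptbp1_matches)  # True True
```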
  • the vector to be verified is transfected into 293T cells and 293T-AQp1 cells.
  • the control plasmid or AQp1-targeting vector plasmid was transfected into 293T-AQp1 cells in a 24-well plate at 500 ng.
  • the control plasmid or PTBP1-targeting vector plasmid was transfected into 293T cells in a 24-well plate at 500 ng.
  • the transfection method is as follows:
  • RNA was extracted using the SteadyPure Universal RNA Extraction Kit (AG21017), and the RNA concentration was measured using an ultra-micro spectrophotometer.
  • the RNA product was reverse transcribed using the Evo M-MLV Mix Kit with gDNA Clean for qPCR reverse transcription kit, and the reverse transcription product was detected using the SYBR Green Premix Pro Taq HS qPCR Kit.
  • the primers used in qPCR are as follows:
  • Upstream primer 5’-ATTGTCCCAGATATAGCCGTTG-3’(SEQ ID NO:26)
  • Upstream primer 5’-GCTCTTCTGGAGGGCAGTGG-3’ (SEQ ID NO: 28)
  • Downstream primer 5’-CAGGTTGACAGCCGGGTTGAG-3’ (SEQ ID NO: 29)
  • Upstream primer 5’-CCATGGGGAAGGTGAAGGTC-3’(SEQ ID NO:30)
  • This experiment uses the relative quantification method, that is, the 2^(-ΔΔCt) method, to calculate the amount of target RNA. It is calculated as follows:
  • ΔCt = Ct(AQp1) - Ct(GAPDH) or Ct(PTBP1) - Ct(GAPDH)
  • ΔΔCt = ΔCt(sample to be verified, such as C13-2) - ΔCt(CasRx-blank or C13-2-BsaI)
  • Calculate the RNA amounts of AQp1 and PTBP1 according to the above calculation method.
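  • The relative quantification described above can be sketched as a short function. The Ct values in the usage line are hypothetical placeholders, not measurements from this application, and the function and parameter names are illustrative.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative target RNA level by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene, e.g. GAPDH)
    ddCt = dCt(sample) - dCt(control group)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: the target appears 2 cycles later in the sample
# than in the control after normalization, so ddCt = 2.
level = relative_expression(26.0, 18.0, 24.0, 18.0)
print(level)  # 0.25
```

  • Under these assumed values the target RNA in the sample is at 25% of the control level; a sample identical to the control gives 1.0.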
  • the results are shown in Table 4 and Figure 4.
  • three independent biological replicate experiments were performed (the same batch of 293T cells were used for the transfection operation), and the average results of the three times were obtained, as shown in Table 5 and Figure 5.
  • C13-2-BsaI and CasRx-blank are control groups, which target neither AQp1 nor PTBP1.
  • Experimental results show that when combined with the gRNA in this experimental example, C13-2 showed obvious editing effects, with high editing activity targeting PTBP1 and higher editing activity targeting AQp1 than CasRx.
  • the patent with announcement number US10476825B2 discloses the Cas13 protein from BMZ-11B_GL0037771, which is called C13-113 in this experimental example (the amino acid sequence of this protein is shown in SEQ ID NO: 32 herein).
  • the direct repeat sequence corresponding to C13-113 used in this experimental example is shown in SEQ ID NO:33.
  • GenBank has disclosed the Cas13 protein MBR0191107.1, which is referred to as C13-114 in this application (the amino acid sequence of this protein is shown in the sequence SEQ ID NO: 34 herein).
  • the direct repeat sequence corresponding to C13-114 used in this experimental example is shown in SEQ ID NO: 35.
  • The C13-113 and C13-114 expression vectors, C13-113-BsaI (SEQ ID NO:36) and C13-114-BsaI (SEQ ID NO:37), respectively, were synthesized by a reagent company.
  • primers are as follows:
  • Upstream primer 5’-CAACGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’(SEQ ID NO:38)
  • Upstream primer 5’-atctGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’(SEQ ID NO:39)
  • the synthesized C13-113-BsaI and C13-114-BsaI plasmids were digested with BsaI endonuclease and then ligated with the annealed product using T4 ligase. After transformation into E. coli, positive clones were selected and the plasmids were extracted. Detection was performed 72 hours after transfection of 293T cells (293T cells of a different batch from Experimental Example 4). The blank control group was transfected with C13-2-BsaI of Experimental Example 4.
  • the amount of PTBP1 RNA was calculated using the 2^(-ΔΔCt) method. Its calculation method is as follows:
  • the data in the table shows that when combined with the gRNA of this experimental example to target PTBP1, a higher editing efficiency of C13-2 was observed, and its editing efficiency was better than that of C13-113 and C13-114. No significant editing was observed in the C13-113 group.
  • the EMBOSS-water program and the NCBI-Blast program are used to predict potential off-target sites in the whole genome and cDNA sequences of the target species (Homo sapiens).
  • Both the forward and reverse strands of the gRNA guide sequence are used for comparison, and the prediction results are filtered: a predicted target is retained when its length differs from that of the gRNA guide sequence by no more than four bases and mismatches plus gaps total no more than four bases.
  • the potential off-target information obtained is shown in Table 7.
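  • The two filtering criteria above (length difference of at most four bases, mismatches plus gaps of at most four bases) can be written as a small predicate. This is a sketch under stated assumptions: the record type, the gene names, and the guide length of 30 are hypothetical, and parsing of EMBOSS-water or BLAST output is not shown.

```python
from dataclasses import dataclass

MAX_LENGTH_DIFF = 4        # bases, from the text
MAX_MISMATCH_PLUS_GAP = 4  # bases, from the text


@dataclass
class PredictedSite:
    gene: str            # hypothetical gene label
    target_length: int   # length of the aligned predicted target
    mismatches: int
    gaps: int


def is_potential_off_target(site, guide_length):
    """Apply the two thresholds described in the text."""
    length_ok = abs(site.target_length - guide_length) <= MAX_LENGTH_DIFF
    edits_ok = site.mismatches + site.gaps <= MAX_MISMATCH_PLUS_GAP
    return length_ok and edits_ok


guide_length = 30  # assumed guide length, for illustration only
sites = [
    PredictedSite("GENE_A", 29, 2, 1),  # passes both criteria
    PredictedSite("GENE_B", 24, 1, 0),  # length differs by 6
    PredictedSite("GENE_C", 30, 4, 1),  # mismatch + gap = 5
]
kept = [s.gene for s in sites if is_potential_off_target(s, guide_length)]
print(kept)
```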
  • the C13-2 targeting PTBP1 vector plasmid, the shRNA-PTBP1 plasmid (the shRNA it expresses is called "shRNA2" in this experimental example) and the CasRx-blank plasmid of Experimental Example 4 were used.
  • the CasRx-PTBP1 plasmid (positive control vector for CasRx targeting PTBP1) was prepared using conventional methods. The sequence is shown in SEQ ID NO: 18, and the plasmid structure is CMV-CasRx-U6-gRNA.
  • a plasmid expressing shRNA1 was constructed according to the method of Experimental Example 4 (the only difference from the shRNA-PTBP1 plasmid is that the guide sequence of the encoded shRNA is different, but it also targets PTBP1, and no additional g is added after the U6 promoter).
  • the above plasmids were transfected into 293T cells respectively.
  • RNA-Seq sequencing was performed on the extracted RNA samples.
  • the fastq files obtained by sequencing are compared with the reference genome of the target species through HISAT or STAR software to obtain the aligned BAM files.
  • Use kallisto, RSEM or HTSeq to detect transcripts and the expression of each gene.
  • Up means the expression is up-regulated
  • Down means the expression is down-regulated
  • Sig represents a significant difference in gene expression from the control group (CasRx-blank group).
  • Isec represents the number of intersections between DEG and "potential off-target genes" predicted by the program.
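  • The four summary columns defined above can be computed with plain set operations. In this sketch the input is assumed to be a mapping from each significantly differentially expressed gene (DEG) to its direction relative to the control group; the gene names and the function name are hypothetical.

```python
def summarize_deg(deg_direction, predicted_off_targets):
    """Count Up, Down, Sig and Isec for one experimental group.

    deg_direction: dict of significant DEG -> "up" or "down" (assumed input).
    predicted_off_targets: genes predicted as potential off-targets.
    """
    up = {gene for gene, d in deg_direction.items() if d == "up"}
    down = {gene for gene, d in deg_direction.items() if d == "down"}
    sig = up | down
    isec = sig & set(predicted_off_targets)  # DEGs also predicted as off-targets
    return {"Up": len(up), "Down": len(down), "Sig": len(sig), "Isec": len(isec)}


# Hypothetical example data.
summary = summarize_deg(
    {"GENE_A": "down", "GENE_B": "up", "GENE_C": "down"},
    {"GENE_C", "GENE_D"},
)
print(summary)  # {'Up': 1, 'Down': 2, 'Sig': 3, 'Isec': 1}
```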
  • C13-2 has almost no off-target sites, while shRNA1 and shRNA2 have a large number of off-target sites.
  • C13-2 is superior to shRNA1 and shRNA2, and is comparable to CasRx.
  • the size of C13-2 is only 893 aa, 74 aa smaller than the 967 aa of CasRx, so it is easier to package into AAV together with gRNA for delivery.
  • This experimental example refers to the method of Experimental Example 4.
  • The vector Lv-ANGPTL3-T2a-GFP (SEQ ID NO: 52), which overexpresses the ANGPTL3 gene and the EGFP gene, was constructed.
  • ANGPTL3 and EGFP are separated by a 2A peptide. Lentivirus packaged from the Lv-ANGPTL3-T2a-GFP plasmid was used to transduce 293T cells, and a 293T cell line stably overexpressing the ANGPTL3 gene was obtained (called 293T-ANGPTL3 cells).
  • The C13-2-BsaI vector (SEQ ID NO: 15) was used as the backbone.
  • The annealed product and the backbone purified and recovered after digestion were ligated with T4 ligase; after transformation into E. coli, positive clones were selected and the plasmids were extracted.
  • The resulting C13-2 targeting ANGPTL3 plasmid expresses C13-2 protein, and its U6 promoter drives gRNA expression.
  • The DR sequence of the gRNA is SEQ ID NO:3.
  • The gRNA guide sequences are shown in Table 9.
  • The plasmid is transfected into 293T-ANGPTL3 cells.
  • the negative control group was transfected with C13-2-BsaI plasmid.
  • the primers used in qPCR are as follows:
  • Upstream primer 5’-CCAGAACACCCAGAAGTAACT-3’ (SEQ ID NO:50)
  • Upstream primer 5’-CCATGGGGAAGGTGAAGGTC-3’(SEQ ID NO:30)
  • the C13-2-VEGFA vector (SEQ ID NO:72) was constructed, which can express C13-2 protein and gRNA targeting VEGFA.
  • the gRNA guide sequence is TGGGTGCAGCCTGGGACCACTTGGCATGG (SEQ ID NO:73).
  • the R4xH mutant verification vector of C13-2 was then constructed from the C13-2-VEGFA vector using conventional homologous recombination methods, as shown in Table 11.
  • the mutant vector introduces the following mutations relative to the C13-2 coding sequence of the C13-2-VEGFA vector:
  • R210A+H215A: AGAAACGCCACCGCCCAC (SEQ ID NO:74) → GCAAACGCCACCGCCGCC (SEQ ID NO:75);
  • R750A+H755A: AGAAAGACCAAGAGACAC (SEQ ID NO:76) → GCAAAGACCAAGAGAGCC (SEQ ID NO:77); and/or
  • R785A+H790A: AGAAACGACGTGGAGCAC (SEQ ID NO:78) → GAAAACGACGTGGAGGCC (SEQ ID NO:79).
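  • Each 18-nt segment above spans six codons of an R4xH motif (R, four residues, H). A minimal translation sketch using the standard genetic code shows the amino acid change for the first two mutation pairs; the helper names are illustrative, and the reading frame is assumed to start at the first base of each printed segment.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding segment, frame assumed to start at base 1."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


motif_210_wt = translate("AGAAACGCCACCGCCCAC")   # SEQ ID NO:74
motif_210_mut = translate("GCAAACGCCACCGCCGCC")  # SEQ ID NO:75
motif_750_wt = translate("AGAAAGACCAAGAGACAC")   # SEQ ID NO:76
motif_750_mut = translate("GCAAAGACCAAGAGAGCC")  # SEQ ID NO:77
print(motif_210_wt, motif_210_mut)  # RNATAH ANATAA
print(motif_750_wt, motif_750_mut)  # RKTKRH AKTKRA
```

  • Applied to the R785A+H790A pair exactly as printed, the wild-type segment translates to RNDVEH and the mutant segment to ENDVEA (its first codon, GAA, encodes E), so that entry may be worth checking against the sequence listing.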
  • the vector was transfected into the 293T cell line. Transfection was performed according to Lipofectamine 2000 (Thermo) instructions. After 72 hours, use SteadyPure Universal RNA Extraction Kit to extract RNA. The RNA extracted from the three batches of experiments was sent to a sequencing company for RNAseq sequencing. The amount of VEGFA RNA was detected as shown in Table 12 below:
  • the experimental data in Table 12 show that editing activity remains high after the R750A+H755A mutation is introduced, that editing activity is retained after the R210A+H215A and/or R785A+H790A mutations are introduced, and that simultaneous introduction of the R210A+H215A, R750A+H755A and R785A+H790A mutations results in almost complete loss of editing activity.
  • the verification vector for each truncation shown in Table 13 was constructed from the C13-2-VEGFA plasmid (SEQ ID NO:72) (the only difference from the C13-2-VEGFA vector is that the coding sequence of C13-2 is truncated); each vector can express the corresponding truncated protein as well as a gRNA targeting VEGFA.
  • the verification vector and control vector were transfected into the 293T cell line. After 72 hours, RNA was extracted using the SteadyPure Universal RNA Extraction Kit and reverse transcribed using the Evo M-MLV Mix Kit with gDNA Clean for qPCR reverse transcription kit. The reaction system was configured according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit, and detection was performed using the QuantStudio™ 5 Real-Time PCR System.
  • the primers used in qPCR are as follows:
  • the C13-2 truncation in this experimental example retains certain RNA editing activity.
  • Construct verification vectors (shown in Table 16) targeting the endogenous genes VEGFA and PTBP1 encoding different DR sequences (shown in Table 15).
  • Starting from the C13-2-VEGFA vector of Experimental Example 8 and the C13-2 targeting PTBP1 vector plasmid of Experimental Example 4, validation vectors expressing the various crRNA sequences (5'-guide-DR-3') in Table 16 were constructed by conventional methods (only the crRNA expression cassette sequence was replaced).
  • the verification vector and control vector were transfected into 293T cells according to the experimental method of the aforementioned experimental example. After 72 hours, RNA was extracted, reverse transcribed, and detected with a qPCR kit.
  • the primers used in qPCR are as follows:
  • Detect PTBP1 ATTGTCCCAGATATAGCCGTTG (SEQ ID NO:26)
  • RNAfold predicts the RNA secondary structure of the direct repeat sequence DR-hf2 as shown in Figure 19.
  • All verification vectors and control vectors (Table 19) in this experimental example use the same backbone sequence as C13-2-BsaI (SEQ ID NO: 15). Only the Cas13 coding sequence and crRNA coding sequence are different. The NLS sequence and its linking sequence are the same in each vector. The structure of all vectors is CMV-NLS-Cas13-2×NLS-U6-crRNA. All Cas13 proteins have 1 NLS at the N-terminus and 2 NLS at the C-terminus, for a total structure of 3×NLS.
  • the guide sequence targeting GFP is tgccgttcttctgcttgtcggccatgatat (SEQ ID NO:90).
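  • The guide in SEQ ID NO:90 is written in the DNA alphabet, whereas the EGFP guide given earlier as SEQ ID NO: 4 is written in the RNA alphabet. A short sketch (the function name is illustrative) converts one notation to the other and shows that the two printed guides agree base for base.

```python
def dna_to_rna(seq):
    """Rewrite a DNA-alphabet sequence in the RNA alphabet (t -> u)."""
    return seq.lower().replace("t", "u")


guide_dna = "tgccgttcttctgcttgtcggccatgatat"      # SEQ ID NO:90, as printed
guide_rna_seq4 = "ugccguucuucugcuugucggccaugauau"  # SEQ ID NO: 4, as printed

guide_rna = dna_to_rna(guide_dna)
print(guide_rna == guide_rna_seq4, len(guide_rna))  # True 30
```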
  • the guide sequence targeting PTBP1 is:
  • the guide sequence for targeting VEGFA is:
  • the verification vector and control vector were transfected into 293T cells.
  • 293T that was not transfected with plasmid was used as a blank control.
  • the primers used in qPCR are as follows:
  • Detect PTBP1 ATTGTCCCAGATATAGCCGTTG (SEQ ID NO:26)
  • the amount of target RNA was calculated using the 2^(-ΔΔCt) method, and each Cas13 protein used its respective GFP-targeting group as a negative control. Multiple batches of experiments were performed, and the results were averaged, as shown in Table 20, Table 21, Figure 11, and Figure 12.
  • In the comparison of VEGFA target editing effects, the editing effect of C13-2 is very good and better than that of the current mainstream Cas13 editing tools.
  • the editing efficiency is C13-2>PspCas13b>Cas13X.1>Cas13Y.1.
  • C13-2 has a very good editing effect, with editing efficiency values of C13-2>CasRx>PspCas13b>Cas13X.1>Cas13Y.1.
  • the editing efficiency of C13-2 was significantly better than PspCas13b, Cas13X.1 and Cas13Y.1 (P<0.05).
  • the target used to verify the single base editing effect is EGFP
  • the guide sequence used to target the EGFP is tgccgttcttctgcttgtcggccatgatatagacgttgtggctgttgtagttgtactccagcttgtgccc (SEQ ID NO: 95).
  • the dC13-2 vectors involved in this experimental example express C13-2 inactive mutants containing the R210A+H215A, R750A+H755A and R785A+H790A mutations.
  • the dC13-2-BsaI and dC13-2-EGFP vectors were constructed using conventional methods.
  • the single base editing verification vector was obtained through homologous recombination from the dC13-2-BsaI plasmid, dC13-2-EGFP plasmid and pC0055-CMV-dPspCas13b-GS-ADAR2DD plasmid.
  • dC13-2-ADAR-EGFP dC13-2-A(EAAAK)3A-ADAR-EGFP
  • dC13-2-(GGGGS)3-ADAR-EGFP dC13-2-(GGGGS)3-ADAR-EGFP
  • pC0055-CMV-dPspCas13b-GS-ADAR2DD was used as a positive control. Since this plasmid does not contain a gRNA expression cassette, the gRNA expression vector PLKO-PURO-PspGRNA-EGFP was synthesized by an outsourcing company.
  • RNA product is reverse transcribed using the Evo M-MLV RT-PCR Universal Reverse Transcriptase Kit.
  • the reverse transcription product is subjected to PCR using identification primers, and the PCR product is sent to a sequencing company for sequencing.
  • the identification primer sequence is as follows (product length 704bp):
  • No base conversion occurred in the negative control dC13-2-EGFP group.
  • The positive control dPspCas13b-ADAR group induced base conversion.
  • Base conversion was induced when dC13-2 and ADAR were joined without a linker peptide, with the rigid linker peptide A(EAAAK)3A, or with the flexible linker peptide (GGGGS)3.
  • the first round of mutations was validated using the VEGFA target.
  • The C13-2-VEGFA vector of the aforementioned experimental example was used as the wild-type C13-2 encoding vector and was modified with the Vazyme site-directed mutagenesis kit Mut Express II Fast Mutagenesis Kit V2 to obtain an expression construct for each mutant (the verification vectors), each expressing a C13-2 mutant and the VEGFA-targeting gRNA.
  • the primers used are shown in Table 25 below.
  • the verification vector and control vector were transfected into 293T cells according to Lipofectamine 2000 (Thermo) instructions.
  • the C13-2-BsaI control group was transfected with the C13-2-BsaI vector of the aforementioned experimental example, and the WT control group was transfected with the C13-2-VEGFA vector, both expressing wild-type C13-2.
  • a control group of 293T cells was set up without transfection of any plasmid.
  • RNA was extracted from cells 48 hours after transfection using the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was detected using an ultramicro-volume spectrophotometer.
  • the RNA product was reverse transcribed using the Evo M-MLV Mix Kit with gDNA Clean for qPCR reverse transcription kit, and the reverse transcription product was detected using the SYBR Green Premix Pro Taq HS qPCR Kit (Low Rox Plus) kit.
  • the primers used in qPCR are SEQ ID NO:88, 89, 30, and 31.
  • Target RNA levels after editing were calculated using the 2^(-ΔΔCt) method. The experiment was repeated three times and the results were averaged, as shown in Table 26 and Figure 14.
  • The total RNA samples extracted after editing were subjected to RNA-seq.
  • The library type was a strand-specific lncRNA library, the sequencing data volume was 16 Gb, and the sequencing strategy was PE150 (paired-end, 150 bp).
  • Kallisto software was used to quantify the expression levels of genes, and then sleuth software was used to perform differential expression analysis. Genes with
  • the significantly down-regulated differentially expressed genes were intersected with the predicted potential off-target genes, and after eliminating the on-target VEGFA gene, the off-target gene set was obtained.
  • RNAseq result data are basically consistent with the qPCR results.
  • the M04, M09, M17, M22, M25, M27 and M28 groups were less than the WT group.
  • the number of off-target genes in groups M01 to M28 is all 0, that is, there is no off-target gene.
  • The second round of mutations was tested on the human AR (androgen receptor) target, for which the C13-2-AR-h3 plasmid vector was synthesized (SEQ ID NO: 161; it expresses wild-type C13-2 and the AR-targeting h3 gRNA, whose guide sequence is SEQ ID NO: 162, namely ATAACATTTCCGAAGACGACAAGAT).
  • The C13-2-AR-h3 plasmid vector was modified using the Vazyme site-directed mutagenesis kit Mut Express MultiS Fast Mutagenesis Kit V2 to obtain an expression construct for each mutant (the verification vectors), each expressing a C13-2 mutant and the h3 gRNA.
  • the primers used are shown in Table 30 below.
  • the verification vector and control vector were transfected into 293T cells according to Lipofectamine 2000 (Thermo) instructions.
  • the C13-2-BsaI control group was transfected with the C13-2-BsaI vector of the aforementioned experimental example, and the WT control group was transfected with the C13-2-AR-h3 vector, both expressing wild-type C13-2.
  • a control group of 293T cells was set up without transfection of any plasmid.
  • RNA was extracted from cells 48 hours after transfection using the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was detected using an ultramicro-volume spectrophotometer.
  • the RNA product was reverse transcribed using the Evo M-MLV Mix Kit with gDNA Clean for qPCR reverse transcription kit, and the reverse transcription product was detected using the SYBR Green Premix Pro Taq HS qPCR Kit (Low Rox Plus) kit.
  • the primers used in qPCR are as follows:
  • Target RNA levels after editing were calculated using the 2^(-ΔΔCt) method. The experiment was repeated three times and the results were averaged, as shown in Table 31 and Figure 16.
  • The total RNA samples extracted after editing were subjected to RNA-seq.
  • The library type was a strand-specific lncRNA library, the sequencing data volume was 16 Gb, and the sequencing strategy was PE150 (paired-end, 150 bp).
  • Kallisto software was used to quantify the expression levels of genes, and then sleuth software was used to perform differential expression analysis. Genes with
  • RNAseq result data are basically consistent with the qPCR results.
  • The editing activities of the M2-24, M2-25, M2-26, M2-32, M2-33, M2-34, M2-35, M2-39, M2-40 and M2-41 variants were higher than that of wild-type C13-2.
  • the M2-37, M2-38, M2-39, M2-40 and M2-41 groups were less than the WT group.
  • M2-1, M2-6, M2-7, M2-14, M2-15, M2-22, M2-31, M2-38 and M2-39 showed no off-target effects.
  • The C13-2-VEGFA vector and the negative control vector C13-2-BsaI of the aforementioned experimental examples were transfected into 293T cells (Lipofectamine 2000, Thermo Fisher), and the supernatant was collected after culturing at 37°C for 72 hours.
  • The level of VEGFA protein was measured using Elabscience's Human VEGF-A (Vascular Endothelial Cell Growth Factor A) ELISA Kit; VEGFA protein expression was reduced by 97.4% compared with the negative control group.
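
The relative-quantification step used in the qPCR experiments above can be illustrated with a short Python sketch. It assumes the usual form of the 2^(-ΔΔCt) calculation, in which the target Ct is first normalized to a reference gene and then to the negative control group; the function name and all Ct values below are hypothetical and are not data from this document.

```python
from statistics import mean


def relative_rna_level(ct_target, ct_reference, ct_target_control, ct_reference_control):
    """Relative target RNA level by the 2^(-ddCt) method.

    ct_target / ct_reference: Ct values of the target and reference gene
    in the edited sample; the *_control values come from the negative
    control group.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for three repeats (illustration only).
repeats = [
    relative_rna_level(26.0, 18.0, 24.0, 18.0),
    relative_rna_level(26.2, 18.1, 24.1, 18.0),
    relative_rna_level(25.9, 17.9, 24.0, 18.1),
]
average_level = mean(repeats)  # averaged across repeats, as in the text
```

A value of 1.0 means the target RNA level equals that of the negative control; values below 1.0 indicate knockdown.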


Abstract

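Disclosed are a CRISPR-Cas13 system and applications thereof, as well as a Cas13 protein, a fusion protein and a guide polynucleotide. The Cas13 protein has at least 90% sequence identity to SEQ ID NO: 1. The fusion protein comprises the Cas13 protein fused to a protein domain and/or a polypeptide tag. The guide polynucleotide comprises a direct repeat sequence and a guide sequence engineered to hybridize with a target RNA, the direct repeat sequence having at least 70% sequence identity to any one of SEQ ID NOs: 3 and 80-87. The CRISPR-Cas13 system comprises a Cas13 protein having at least 90% sequence identity to SEQ ID NO: 1, or a nucleic acid encoding it, and a guide polynucleotide, or a nucleic acid encoding it.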

Description

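A CRISPR-Cas13 System and Applications Thereof
Technical Field
The present disclosure relates to the field of CRISPR gene editing, and in particular to a CRISPR-Cas13 system and applications thereof.
Background
CRISPR-Cas13 is an RNA-targeting and RNA-editing system based on a bacterial immune system that protects bacteria from viruses. The CRISPR-Cas13 system resembles the CRISPR-Cas9 system, but unlike the Cas9 protein, which targets DNA, the Cas13 protein targets RNA.
CRISPR-Cas13 belongs to the Type VI CRISPR-Cas systems and comprises a single effector protein, Cas13. At present, CRISPR-Cas13 can be divided by phylogeny into several subtypes (such as Cas13a, Cas13b, Cas13c and Cas13d). However, there remains an urgent need for new Cas13 systems that are compact in size (for example, suitable for AAV delivery), have high editing efficiency in mammalian cells (for example, RNA targeting/cleavage activity), and/or have low cytotoxicity (for example, cell dormancy and apoptosis caused by collateral RNA degradation).
Summary of the Invention
To solve the technical problem that the prior art lacks Cas13 systems that are compact in size, highly efficient at editing in mammalian cells and/or low in cytotoxicity, the present disclosure provides a CRISPR-Cas13 system that targets a target RNA, and applications thereof.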
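One aspect of the present disclosure relates to a Cas13 protein whose amino acid sequence has at least 90% sequence identity to SEQ ID NO: 1.
In some embodiments, the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide, the guide polynucleotide comprising a direct repeat sequence linked to a guide sequence, the guide sequence being engineered to hybridize with a target RNA.
In some embodiments, the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide, the CRISPR complex being capable of sequence-specific binding to a target RNA.
In some embodiments, the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide, the guide polynucleotide comprising a direct repeat sequence linked to a guide sequence, the guide sequence being engineered to direct sequence-specific binding of the CRISPR complex to the target RNA.
In some embodiments, the amino acid sequence of the Cas13 protein has at least 95% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 96% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 97% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 98% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 99% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein has at least 99.5% sequence identity to SEQ ID NO: 1. In some embodiments, the amino acid sequence of the Cas13 protein is as shown in SEQ ID NO: 1.
The Cas13 protein of the present disclosure having the sequence of SEQ ID NO: 1 (i.e., the C13-2 protein) was identified through bioinformatic analysis of prokaryotic genomes and metagenomes in the CNGB database (China National GeneBank), followed by activity validation. In some embodiments, the Cas13 protein described herein is from a species comprising a genome having an average nucleotide identity (ANI) of ≥95% with the genome numbered CNA0009596 in the CNGB database.
Average nucleotide identity (ANI) is a metric that evaluates, at the nucleic acid level, the similarity of all orthologous protein-coding genes between two genomes. For bacteria and archaea, a threshold of ANI = 95% is generally used to decide whether two genomes belong to the same species (Richter M, Rosselló-Móra R. Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci U S A. 2009 Nov 10;106(45):19126-31). The present disclosure therefore applies this threshold: species with an ANI of ≥95% to the reference genome are considered the same species, and the Cas13 proteins therein are homologous to and functionally similar to the protein claimed in the present disclosure and fall within its scope. ANI analysis tools include programs such as FastANI and JSpecies.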
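In some embodiments, compared with SEQ ID NO: 1, the amino acid sequence of the Cas13 protein comprises one, two, three, four, five, six, seven or more mutations, such as a single amino acid insertion, a single amino acid deletion, a single amino acid substitution, or a combination thereof.
In some embodiments, the Cas13 protein comprises one or more mutations in a catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein comprises one mutation in a catalytic domain and has reduced RNA cleavage activity. In some embodiments, the Cas13 protein comprises one or more mutations in one or two HEPN domains and substantially lacks RNA cleavage activity. In some embodiments, the Cas13 protein comprises one mutation in any one HEPN domain and substantially lacks RNA cleavage activity.
"Substantially lacks RNA cleavage activity" means retaining only ≤50%, ≤40%, ≤30%, ≤20%, ≤10%, ≤5% or ≤1% of the RNA cleavage activity of the wild-type Cas13 protein, or having no detectable RNA cleavage activity.
In some embodiments, the RxxxxH motif of the Cas13 protein (x denotes any amino acid; RxxxxH may also be written Rx4H or R4xH) comprises one or more mutations, and the protein substantially lacks RNA cleavage activity.
In some embodiments, the Cas13 protein comprises a mutation at a position corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755 and/or the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the Cas13 protein comprises a mutation at a position corresponding to the RxxxxH motif at positions 210-215 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein comprises a mutation at a position corresponding to the RxxxxH motif at positions 750-755 of the reference protein shown in SEQ ID NO: 1. In some embodiments, the Cas13 protein comprises a mutation at a position corresponding to the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the Cas13 protein comprises mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 750-755 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the Cas13 protein comprises mutations at positions corresponding to the RxxxxH motif at positions 210-215 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the Cas13 protein comprises mutations at positions corresponding to the RxxxxH motif at positions 750-755 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the Cas13 protein comprises mutations at positions corresponding to the RxxxxH motif at positions 210-215, the RxxxxH motif at positions 750-755 and the RxxxxH motif at positions 785-790 of the reference protein shown in SEQ ID NO: 1.
In some embodiments, the RxxxxH motif is mutated to AxxxxH, RxxxxA or AxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxH. In some embodiments, the RxxxxH motif is mutated to RxxxxA. In some embodiments, the RxxxxH motif is mutated to AxxxxA.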
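
As an illustration of the RxxxxH motif and its AxxxxH, RxxxxA and AxxxxA variants, the following Python sketch locates every R-x-x-x-x-H motif in a protein sequence and applies the alanine substitutions. It is only a sketch under stated assumptions: the helper names are hypothetical, and the short toy sequence stands in for SEQ ID NO: 1, which is not reproduced in this section.

```python
import re


def find_rx4h_motifs(protein):
    """Return 1-based (start, end) positions of every R-x-x-x-x-H motif.

    A lookahead is used so that overlapping motifs are also reported.
    """
    return [(m.start() + 1, m.start() + 6)
            for m in re.finditer(r"(?=R.{4}H)", protein)]


def mutate_motif(protein, start, mode="AxxxxA"):
    """Mutate the motif beginning at 1-based position `start`.

    mode is one of "AxxxxH", "RxxxxA" or "AxxxxA".
    """
    if mode not in ("AxxxxH", "RxxxxA", "AxxxxA"):
        raise ValueError("unknown mode: " + mode)
    residues = list(protein)
    r_index, h_index = start - 1, start + 4
    if residues[r_index] != "R" or residues[h_index] != "H":
        raise ValueError("no RxxxxH motif starts at position %d" % start)
    if mode in ("AxxxxH", "AxxxxA"):
        residues[r_index] = "A"  # arginine -> alanine
    if mode in ("RxxxxA", "AxxxxA"):
        residues[h_index] = "A"  # histidine -> alanine
    return "".join(residues)


# Toy sequence (hypothetical, NOT SEQ ID NO: 1) containing two motifs.
toy = "MKRAAAAHGGRLLLLHK"
motifs = find_rx4h_motifs(toy)
dead_toy = mutate_motif(toy, motifs[0][0], mode="AxxxxA")
```

With the real reference sequence, the same search would be expected to report the motifs at positions 210-215, 750-755 and 785-790 named in the text.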
在一些实施方案中,所述Cas13蛋白在与如SEQ ID NO:1所示参比蛋白的氨基酸残基R210、H215、R750、H755、R785和/或H790的对应位置包含1个、2个、3个、4个、5个或6个突变。在一些实施方案中,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基R210、H215、R750、H755、R785和/或H790的对应位置突变为A(丙氨酸)。
在一些实施方案中,所述Cas13蛋白的氨基酸序列在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基R210和H215的对应位置包含突变。在一些实施方案中,所述Cas13蛋白的氨基酸序列在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基R750和H755的对应位置包含突变。在一些实施方案中,所述Cas13蛋白的氨基酸序列在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基R785和H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白的氨基酸序列在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基R210、H215、R750和H755的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R750、H755、R785和H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基 R210、H215、R785和/或H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R210、H215、R750、H755、R785和H790的对应位置包含突变。
在一些实施方案中,所述R210、R750或R785的对应位置突变为A。在一些实施方案中,所述H215、H755或H790的对应位置突变为A。在一些实施方案中,所述R210、H215、R750、H755、R785和H790的对应位置都突变为A。
在一些实施方案中,所述Cas13蛋白通过在如SEQ ID NO:1所示序列的210-215位RxxxxH基序、750-755位RxxxxH基序和/或785-790位RxxxxH基序引入突变而得到。
在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和/或H790位置引入1个、2个、3个、4个、5个或6个突变而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和/或H790位置突变为A(丙氨酸)而得到。
在一些实施方案中,所述Cas13蛋白通过在如SEQ ID NO:1所示序列的R210、H215、R785和H790位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750和H755位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在如SEQ ID NO:1所示序列的R750、H755、R785和H790位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在如SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和H790位置突变为A而得到。
在一些实施方案中,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基40-91位、146-153位、158-176位、182-209位、216-253位、271-287位、341-353位、379-424位、456-477位、521-557位、575-588位、609-625位、700-721位、724-783位、796-815位、828-852位或880-893位的对应位置包含至少一个突变。
在一些实施方案中,与如SEQ ID NO:1所示的参比蛋白相比,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的以下氨基酸残基的对应位置包含任意一种或更多种突变:R11、N34、R35、R47、R58、R63、R64、N68、N87、N265、N274、R276、R290、R294、N299、N303、R308、R314、R320、R328、N332、R341、N346、R358、N372、N383、N390、N394、R47+R290、R47+R314、R290+R314、R47+R290+R314、R308+N68、N394+N68、N87+N68、R308+N265、N394+N265、N87+N265、R308+N68+N265、N87+N68+N265、T7、A16、S260、A263、M266、N274、F288、M302、N303、L304、V305、I311、D313、H324、P326、H327、N332、N346、T353、T360、E365、A373、M380、S382、K395、Y396、D402、D411、S418。
在一些实施方案中,与如SEQ ID NO:1所示的参比蛋白相比,所述Cas13蛋白在与 如SEQ ID NO:1所示的参比蛋白的以下氨基酸残基的对应位置包含任意一种或更多种突变:R11、N34、R35、R47、R58、R63、R64、N68、N87、N265、N274、R276、R290、R294、N299、N303、R308、R314。
在一些实施方案中,与如SEQ ID NO:1所示的参比蛋白相比,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的以下氨基酸残基的对应位置包含任意一种或更多种突变:R47+R290、R47+R314、R290+R314、R47+R290+R314、N394+N265、N87+N265、A263、M266、N274、F288、V305、I311、D313、H324、T360、E365、A373、M380、D402、D411、S418。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R11的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N34的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R35的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R47的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R58的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R63的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R64的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N68的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N87的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N274的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R276的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R290的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R294的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N299的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N303的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R308的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R314的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R320的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R328的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N332的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R341的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N346的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R358的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N372的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N383的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N390的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N394的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R47和R290的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R47和R314的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R290和R314的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R47、R290和R314的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R308和N68的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N394和N68的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N87和N68的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R308和N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N394和N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N87和N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R308、N68和N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N87、N68和N265的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基T7的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基A16的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基S260的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基A263的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基M266的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N274的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基F288的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基M302的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N303的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基L304的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基V305的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基I311的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基D313的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基H324的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基P326的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基H327的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N332的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基N346的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基T353的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基T360的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基E365的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基A373的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基M380的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基S382的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基K395的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基Y396的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基D402的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基D411的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基S418的对应位置包含突变。
在一些实施方案中,与SEQ ID NO:1所示参比蛋白相比,所述Cas13蛋白在表24或表29中SEQ ID NO:1所示参比蛋白的突变位点的对应位置突变为相同的氨基酸残基。
在一些实施方案中,与如SEQ ID NO:1所示的参比蛋白相比,所述Cas13蛋白在表24或表29中SEQ ID NO:1所示参比蛋白的突变位点的对应位置包含相同的突变。
在一些实施方案中,所述Cas13蛋白由如SEQ ID NO:1所示的序列在以下位置引入任意一种或更多种突变而得到:R11、N34、R35、R47、R58、R63、R64、N68、N87、N265、N274、R276、R290、R294、N299、N303、R308、R314、R320、R328、N332、R341、N346、R358、N372、N383、N390、N394、R47+R290、R47+R314、R290+R314、R47+R290+R314、R308+N68、N394+N68、N87+N68、R308+N265、N394+N265、N87+N265、R308+N68+N265、N87+N68+N265、T7、A16、S260、A263、M266、N274、F288、M302、N303、L304、V305、I311、D313、H324、P326、H327、N332、N346、T353、T360、E365、A373、M380、S382、K395、Y396、D402、D411和S418。
在一些实施方案中,所述Cas13蛋白由如SEQ ID NO:1所示的序列在以下位置引入任意一种或更多种突变而得到:N34、R64、N68、N265、R276、R294、N299、R314、 R47+R290、R47+R314、R290+R314、R47+R290+R314、N394+N265、N87+N265、A263、M266、N274、F288、V305、I311、D313、H324、T360、E365、A373、M380、D402和D411。
在一些实施方案中,所述Cas13蛋白由如SEQ ID NO:1所示的序列引入表24或表29中任意一种或更多种突变而得到。
在一些实施方案中,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基91-120位、141-180位、211-240位、331-360位、351-400位、431-460位、461-500位、511-550位、611-640位、631-660位、661-690位、691-760位、821-860位或861-890位的对应位置发生序列缺失而得到。
在一些实施方案中,所述Cas13蛋白在与如SEQ ID NO:1所示的参比蛋白的氨基酸残基348-350位、521-556位或883-893位的对应位置包含一个或多个氨基酸的缺失。
在一些实施方案中,所示序列缺失中缺失≤300个、≤200个、≤150个、≤100个、≤90个、≤80个、≤70个、≤60个、≤50个、≤40个、≤30个、≤20个或≤10个氨基酸残基。
在一些实施方案中,所述Cas13蛋白由如SEQ ID NO:1所示的序列在91-120位、141-180位、211-240位、331-360位、351-400位、431-460位、461-500位、511-550位、611-640位、631-660位、661-690位、691-760位、821-860位或861-890位发生序列缺失而得到。
所述序列缺失为1个或更多个氨基酸残基的缺失。
本披露的另一方面涉及融合蛋白。
在本披露的一些实施方案中,所述融合蛋白包含本文所述的Cas13蛋白或其功能片段,以及与所述Cas13蛋白或其功能片段融合的以下任意一种或更多种:胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、报告域、亲和域、亚细胞定位信号、报告标签和亲和标签。
在本披露的一些实施方案中,所述融合蛋白包含本文所述的Cas13蛋白或其功能片段,以及与所述Cas13蛋白或其功能片段融合的以下任意一种或更多种:胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、亚细胞定位信号、报告标签和亲和标签。
在本披露的一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白或其功能片段。在一些实施方案中,所述融合不改变 所述的Cas13蛋白和/或其功能片段的原有功能。
所述不改变Cas13蛋白或其功能片段的原有功能是指融合后的蛋白在与gRNA组合使用时仍具有识别、结合和/或切割靶RNA的能力。融合后的蛋白与gRNA组合使用时识别、结合或切割靶RNA的能力相比所述的Cas13蛋白与gRNA组合使用时识别、结合或切割靶RNA的能力可能有所提高或降低,但只要所述融合后的蛋白与gRNA组合使用时可有效地识别、结合或切割靶RNA,则都属于“不改变Cas13蛋白的原有功能”的情况。
在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白的功能片段。例如,在一些实施方式中,所述功能片段为所述Cas13蛋白缺失核酸酶结构域的部分序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的HEPN-1_Ⅰ、HEPN-1_Ⅱ、HEPN-2、NTD、Helical-1和/或Helical-2对应的序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的HEPN-1_Ⅰ、HEPN-1_Ⅱ和/或HEPN-2对应的序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的HEPN-1_Ⅰ对应的序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的HEPN-1_Ⅱ对应的序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的HEPN-2对应的序列后所得的片段。在一些实施方式中,所述功能片段为所述Cas13蛋白缺失与C13-2的NTD和HEPN-1_Ⅰ对应的序列后所得的片段。
在一些实施方案中,所述Cas13蛋白或其功能片段融合至选自以下的任意一种或更多种:胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、报告域、亲和域、亚细胞定位信号、报告标签和亲和标签。
在一些实施方案中,所述Cas13蛋白或其功能片段融合亚细胞定位信号。在一些实施方案中,所述亚细胞定位信号任选自核定位信号(NLS)、核输出信号(NES)、叶绿体定位信号或线粒体定位信号。
在一些实施方案中,所述Cas13蛋白或其功能片段融合至选自以下的任意一种或更多种:胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、报告域、亲和域、亚细胞定位信号、报告标签和亲和标签。
在一些实施方案中,所述Cas13蛋白或其功能片段融合亚细胞定位信号。在一些实 施方案中,所述亚细胞定位信号任选自核定位信号(NLS)、核输出信号(NES)、叶绿体定位信号或线粒体定位信号。
在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源核定位信号(NLS)融合。在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源核输出信号(NES)融合。
在一些实施方案中,所述Cas13蛋白或其功能片段与蛋白结构域和/或多肽标签共价连接。在一些实施方案中,所述Cas13蛋白或其功能片段与蛋白结构域和/或多肽标签直接共价连接。在一些实施方案中,所述Cas13蛋白或其功能片段与蛋白结构域和/或多肽标签通过连接序列共价连接;进一步地,在一些实施方案中,所述连接序列为氨基酸序列。
在一些实施方案中,所述融合蛋白的Cas13蛋白或其功能片段与同源或异源蛋白结构域和/或多肽标签通过刚性连接肽序列连接。在一些实施方案中,所述融合蛋白的Cas13蛋白部分与同源或异源蛋白结构域和/或多肽标签通过柔性连接肽序列连接。在一些实施方案中,所述刚性连接肽序列为A(EAAAK)3A(SEQ ID NO:279)。在一些实施方案中,所述柔性连接肽序列为(GGGGS)3(SEQ ID NO:280)。
在一些实施方案中,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能与靶RNA序列特异性结合。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白,所述融合蛋白能与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能与靶RNA序列特异性结合。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白的功能片段,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能与靶RNA序列特异性结合。
在一些实施方案中,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述指导多核苷酸包含与指导序列连接的同向重复序列,所述指导序列被工程化以指导所述CRISPR复合物与靶RNA的序列特异性结合。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述指导多核苷酸包含与指导序列连接的同向重复序列,所述指导序列被工程化以指导所述CRISPR复合物与靶RNA的序列特异性结合。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白的功能片段,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述指导多核苷酸包含与指导序列连接的同向重复序列,所述指导序列被工程化以指导所述CRISPR复合物与靶RNA的序列特异性结合。
在一些实施方案中,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能序列特异性结合并切割靶RNA。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能够与靶RNA序列特异性结合。在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白的功能片段,所述融合蛋白能够与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能与靶RNA序列特异性结合。
在一些实施方案中,所述融合蛋白的结构为NLS-Cas13蛋白-SV40NLS-nucleoplasmin(核质素)NLS。
本披露的另一方面涉及指导多核苷酸,所述指导多核苷酸包含(i)与SEQ ID NO:3和SEQ ID NO:80-87中任一项具有至少50%的序列同一性的同向重复序列,该同向重复序列连接至(ii)工程化以与靶RNA杂交的同源或异源指导序列,所述指导多核苷酸能够与Cas13蛋白形成CRISPR复合物并指导所述CRISPR复合物与所述靶RNA的序列特异性结合。
在一些实施方案中,所述Cas13蛋白为Cas13a、Cas13b、Cas13c或Cas13d。在一些实施方案中,所述Cas13蛋白的氨基酸序列具有与SEQ ID NO:1相比至少90%、至少95%、至少98%或至少99%的序列同一性。
在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少55%、至少60%、至少65%、至少70%、至少75%、至少80%、至少85%、至少90%、至少95%或100%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少80%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少85%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少90%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少95%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有100%的序列同一性。
在一些实施方案中,所述同向重复序列与SEQ ID NO:3、81、82、84和87中任一项相比具有至少50%、至少55%、至少60%、至少65%、至少70%、至少75%、至少80%、至少85%、至少90%、至少95%或100%的序列同一性。
在一些实施方案中,所述同向重复序列与SEQ ID NO:3和87中任一项相比具有至少70%、至少75%、至少80%、至少85%、至少90%、至少95%或100%的序列同一 性。
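In some embodiments, the base of the direct repeat sequence at the position corresponding to position 26 of SEQ ID NO: 3 is A.
In some embodiments, the direct repeat sequence is GGAAGATN1ACTCTACAAACCTGTAGN2GN3N4N5N6N7N8N9N10N11 (SEQ ID NO: 277), wherein N1 and N3-N11 are each selected from A, C, G and T, and N2 is selected from A and G.
In some embodiments, the direct repeat sequence is GGAAGATN12ACTCTACAAACCTGTAGN13GN14N15N16N17N18N19N20N21N22 (SEQ ID NO: 278), wherein N12, N13, N19 and N21 are each selected from A and G, N14 is selected from A and T, N15 and N16 are each selected from C and T, N17 and N18 are each selected from G and T, and N20 and N22 are each selected from C and G.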
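
The degenerate direct repeat of SEQ ID NO: 278 can be expressed as a regular expression, which makes it easy to check whether a candidate sequence fits the consensus. The Python sketch below is illustrative only; the function and variable names are hypothetical, and accepting RNA input by converting U to T is an assumption made for convenience.

```python
import re

# Allowed bases at each variable position of SEQ ID NO: 278, as listed above.
VARIABLE_POSITIONS = {
    "N12": "AG", "N13": "AG", "N14": "AT", "N15": "CT", "N16": "CT",
    "N17": "GT", "N18": "GT", "N19": "AG", "N20": "CG", "N21": "AG",
    "N22": "CG",
}

CONSENSUS = ("GGAAGAT{N12}ACTCTACAAACCTGTAG{N13}G"
             "{N14}{N15}{N16}{N17}{N18}{N19}{N20}{N21}{N22}")

# Each variable position becomes a character class such as [AG].
DR_PATTERN = re.compile(CONSENSUS.format(
    **{name: "[" + bases + "]" for name, bases in VARIABLE_POSITIONS.items()}))


def matches_dr_consensus(sequence):
    """True if the whole sequence fits the SEQ ID NO: 278 consensus."""
    dna = sequence.upper().replace("U", "T")  # accept RNA or DNA spelling
    return DR_PATTERN.fullmatch(dna) is not None


# A hypothetical 36-nt candidate built to satisfy every constraint.
candidate = "GGAAGAT" + "A" + "ACTCTACAAACCTGTAG" + "A" + "G" + "ACCGGACAC"
```

In this consensus, N13 falls at position 26 of the repeat, the position that some embodiments above require to be A.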
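In some embodiments, the guide sequence is located at the 3' end of the direct repeat sequence. In some embodiments, the guide sequence is located at the 5' end of the direct repeat sequence.
In some embodiments, the guide sequence comprises 15-35 nucleotides. In some embodiments, the guide sequence hybridizes with the target RNA with no more than one nucleotide mismatch. In some embodiments, the direct repeat sequence comprises 25 to 40 nucleotides.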
在一些实施方案中,所述指导多核苷酸进一步包含适体序列。在一些实施方案中,所述适体序列被插入到所述指导多核苷酸的环(loop)中。在一些实施方案中,所述适体序列包括MS2适体序列、PP7适体序列或Qβ适体序列。
在一些实施方案中,所述指导多核苷酸包含修饰的核苷酸。在一些实施方案中,所述修饰包含2'-O-甲基、2'-O-甲基-3'-硫代磷酸酯或2'-O-甲基-3'-硫代PACE。
在一些实施方案中,所述的指导多核苷酸的靶RNA位于真核细胞的细胞核中。
在一些实施方案中,所述靶RNA任选自TTR RNA、SOD1 RNA、PCSK9 RNA、VEGFA RNA、VEGFR1 RNA、PTBP1 RNA、AQp1 RNA或ANGPTL3 RNA。可选地,在一些实施方案中,所述靶RNA任选自VEGFA RNA、PTBP1 RNA、AQp1 RNA或ANGPTL3 RNA。进一步地,所述指导序列任选自SEQ ID NO:5、SEQ ID NO:6、SEQ ID NO:42-49所示序列(分别用于靶向AQp1 RNA、PTBP1 RNA、ANGPTL3 RNA)。优选地,所述指导序列任选自SEQ ID NO:5、SEQ ID NO:6、SEQ ID NO:43、SEQ ID NO:45-47所示序列。
在一些实施方案中,所述Cas13蛋白为Cas13a、Cas13b、Cas13c或Cas13d。在一些实施方案中,所述Cas13蛋白为Cas13d。在一些实施方案中,所述Cas13蛋白具有与SEQ ID NO:1相比至少90%、至少95%、至少98%或至少99%的序列同一性。
本披露的另一方面涉及一种CRISPR-Cas13系统,其包含:本文所述的Cas13蛋白或融合蛋白,或编码所述Cas13蛋白或融合蛋白的核酸;以及指导多核苷酸或编码所述指 导多核苷酸的核酸;所述指导多核苷酸包含连接至指导序列的同向重复序列,所述指导序列被工程化以与靶RNA杂交;所述指导多核苷酸能够与所述Cas13蛋白或融合蛋白形成CRISPR复合物并指导所述CRISPR复合物与靶RNA的序列特异性结合。
在一些实施方案中,所述靶RNA任选自TTR RNA、SOD1 RNA、PCSK9 RNA、VEGFA RNA、VEGFR1 RNA、PTBP1 RNA、AQp1 RNA或ANGPTL3 RNA。可选地,在一些实施方案中,所述靶RNA任选自VEGFA RNA、PTBP1 RNA、AQp1 RNA或ANGPTL3 RNA。在一些实施方案中,所述指导序列任选自SEQ ID NO:5、SEQ ID NO:6、SEQ ID NO:42-49所示序列(分别用于靶向AQp1 RNA、PTBP1 RNA、ANGPTL3 RNA)。在一些实施方案中,所述指导序列任选自SEQ ID NO:5、SEQ ID NO:6、SEQ ID NO:43、SEQ ID NO:45-47所示序列。
在一些实施方案中,所述同向重复序列与SEQ ID NO:3、81、82、84和87中任一项相比具有至少70%、至少75%、至少80%、至少85%、至少90%、至少95%或100%的序列同一性。
在一些实施方案中,所述融合蛋白包含融合至同源或异源蛋白结构域和/或多肽标签的本文所述的Cas13蛋白或其功能片段。
在一些实施方案中,所述融合蛋白的Cas13蛋白或其功能片段部分与同源或异源核定位信号(NLS)融合。在一些实施方案中,所述融合蛋白的Cas13蛋白或其功能片段部分与同源或异源核输出信号(NES)融合。
在一些实施方案中,所述Cas13蛋白在催化结构域中包含突变并且具有降低的RNA切割活性。在一些实施方案中,所述Cas13蛋白在一个或两个HEPN结构域中包含突变并且基本上缺乏RNA切割活性。在一些实施方案中,所述“基本上缺乏RNA切割活性”是指与野生型Cas13蛋白相比仅保留≤50%、≤40%、≤30%、≤20%、≤10%、≤5%或≤1%的RNA切割活性,或无可检测的RNA切割活性。
在一些实施方案中,所述融合蛋白的Cas13蛋白或其功能片段部分与同源或异源蛋白结构域和/或多肽标签直接通过共价键连接。在一些实施方案中,所述融合蛋白的Cas13蛋白部分与同源或异源蛋白结构域和/或多肽标签通过肽序列连接。
在一些实施方案中,所述蛋白质结构域包含胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域或剪接因子结构域。在一些实施方案中,所述Cas13蛋白与亲和标签或报告标签共价连接。
在一些实施方案中,所述Cas13蛋白与SEQ ID NO:1相比具有至少95%的序列同一性。在一些实施方案中,所述Cas13蛋白与SEQ ID NO:1相比具有至少97%的序列同一 性。在一些实施方案中,所述Cas13蛋白与SEQ ID NO:1相比具有至少98%的序列同一性。在一些实施方案中,所述Cas13蛋白与SEQ ID NO:1相比具有至少99%的序列同一性。在一些实施方案中,所述Cas13蛋白与SEQ ID NO:1相比具有至少99.5%的序列同一性。在一些实施方案中,所述Cas13蛋白包含SEQ ID NO:1所示序列。
在一些实施方案中,所述Cas13蛋白来自包含与CNGB数据库中编号为CNA0009596所示基因组的平均核苷酸同一性(ANI)≥95%的基因组的物种(species)。
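In some embodiments, the Cas13 protein has no protospacer flanking sequence (PFS) requirement when used for RNA cleavage.
In some embodiments, the guide sequence is located at the 3' end of the direct repeat sequence. In some embodiments, the guide sequence is located at the 5' end of the direct repeat sequence.
In some embodiments, the guide sequence comprises 15-35 nucleotides. In some embodiments, the guide sequence hybridizes with the target RNA with no more than one nucleotide mismatch.
In some embodiments, the direct repeat sequence comprises 25 to 40 nucleotides.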
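
The guide-sequence constraints stated above (15-35 nucleotides, hybridization with at most one mismatch) can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: only Watson-Crick pairs are counted as matches (a G-U wobble pair is treated as a mismatch), the guide and the target site are taken to be the same length, and all names and the example target site are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def to_rna(sequence):
    """Upper-case the sequence and spell it as RNA (T -> U)."""
    return sequence.upper().replace("T", "U")


def reverse_complement(sequence):
    """Reverse complement of a sequence, returned as RNA."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(to_rna(sequence)))


def count_mismatches(guide, target_site):
    """Number of positions where the guide fails to pair with the target."""
    guide = to_rna(guide)
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    perfect_guide = reverse_complement(target_site)
    return sum(g != p for g, p in zip(guide, perfect_guide))


def guide_meets_constraints(guide, target_site,
                            min_len=15, max_len=35, max_mismatches=1):
    """Length within 15-35 nt and no more than one mismatch."""
    return (min_len <= len(guide) <= max_len
            and count_mismatches(guide, target_site) <= max_mismatches)


# Hypothetical 21-nt target site and its perfectly complementary guide.
target_site = "AUGGCUAGCUAGGAUCCGAUC"
perfect_guide = reverse_complement(target_site)
```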
在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少80%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少90%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有至少95%的序列同一性。在一些实施方案中,所述同向重复序列与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比具有100%的序列同一性。在一些实施方案中,所述同向重复序列任选自SEQ ID NO:3和SEQ ID NO:80-87。
在一些实施方案中,所述指导多核苷酸进一步包含适体序列。在一些实施方案中,所述适体序列被插入到指导多核苷酸的环中。在一些实施方案中,所述适体序列包括MS2适体序列、PP7适体序列或Qβ适体序列。
在一些实施方案中,所述CRISPR-Cas13系统包括包含接头蛋白(Adaptor protein)和同源或异源蛋白结构域的融合蛋白,或编码所述融合蛋白的核酸,所述接头蛋白能够结合所述适体序列。
在一些实施方案中,所述接头蛋白包括MS2噬菌体外壳蛋白、PP7噬菌体外壳蛋白或Qβ噬菌体外壳蛋白。在一些实施方案中,所述蛋白结构域包含胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、报告域、亲和域、报告标签和亲和标签。
在一些实施方案中,所述指导多核苷酸包含修饰的核苷酸。在一些实施方案中,所述修饰的核苷酸包含2'-O-甲基、2'-O-甲基-3'-硫代磷酸酯或2'-O-甲基-3'-硫代PACE。
在一些实施方案中,所述Cas13蛋白或融合蛋白和所述指导多核苷酸并非天然共同存在。
本披露的另一方面涉及一种CRISPR-Cas13系统,其特征在于,其包含任何一种Cas13蛋白、其融合蛋白或编码其的核酸,以及本文所述的指导多核苷酸或编码其的核酸。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的载体系统,所述载体系统包含一个或多个载体,所述载体包含编码本文所述Cas13蛋白或融合蛋白的多核苷酸序列和编码指导多核苷酸的多核苷酸序列。
本披露的另一方面涉及包含本文所述CRISPR-Cas13系统的腺相关病毒(AAV)载体,其中所述AAV载体包含编码本文所述Cas13蛋白或融合蛋白和所述指导多核苷酸的DNA序列。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的脂质纳米粒,所述脂质纳米粒包含本文所述的指导多核苷酸和编码本文所述的Cas13蛋白或融合蛋白的mRNA。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的慢病毒载体,所述慢病毒载体包含本文所述的指导多核苷酸和编码本文所述的Cas13蛋白或融合蛋白的mRNA。在一些实施方案中,所述慢病毒载体是用同源或异源包膜蛋白如VSV-G假型化的。在一些实施方案中,编码所述Cas13蛋白或融合蛋白的mRNA与适体序列连接。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的核糖核蛋白复合物,其中所述核糖核蛋白复合物由本文所述的指导多核苷酸和本文所述的Cas13蛋白或融合蛋白形成。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的病毒样颗粒,所述病毒样颗粒包含由本文所述的指导多核苷酸和本文所述的Cas13蛋白或融合蛋白形成的核糖核蛋白复合物。在一些实施方案中,所述Cas13蛋白或融合蛋白与gag蛋白融合。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的真核细胞。在一些实施方案中,所述真核细胞是哺乳动物细胞。在一些实施方案中,所述真核细胞是人细胞。
本披露的另一方面涉及一种药物组合物,其包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞。
本披露的另一方面涉及一种药物组合物,其包含本文所述的CRISPR-Cas13系统。
本披露的另一方面涉及一种体外组合物,其包含本文所述的CRISPR-Cas13系统,以及不能与本文所述的指导多核苷酸杂交或被所述指导多核苷酸靶向的标记的detector  RNA。
本披露的另一方面涉及编码本文所述的Cas13蛋白或融合蛋白的分离的核酸。
本披露的另一方面涉及编码本文所述的指导多核苷酸的分离的核酸。
本披露的另一方面涉及一种CRISPR-Cas13系统,其包含任何一种Cas13蛋白或编码其的核酸,以及本文所述的指导多核苷酸或编码其的分离的核酸。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在检测疑似包含靶RNA的核酸样品中的靶RNA或制备检测疑似包含靶RNA的核酸样品中的靶RNA的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在以下任一项或制备实现以下任一项方案的试剂的用途:
切割一种或多种靶RNA分子或使一种或多种靶RNA分子产生切口(nicking),激活或上调一种或多种靶RNA分子,激活或抑制一种或多种靶RNA分子的翻译,使一种或多种靶RNA分子失活,可视化、标记或检测一种或多种靶RNA分子,结合一种或多种靶RNA分子,运输一种或多种靶RNA分子,以及掩蔽一种或多种靶RNA分子。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在切割一种或多种靶RNA分子或制备切割一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在结合一种或多种靶RNA分子中的用途。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在制备结合或切割一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在切割或编辑哺乳动物细胞的靶RNA中 的用途,所述编辑为碱基编辑。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在制备切割或编辑哺乳动物细胞的靶RNA的试剂中的用途,所述编辑为碱基编辑。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在激活或上调一种或多种靶RNA分子或制备激活或上调一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在抑制一种或多种靶RNA分子的翻译或制备抑制一种或多种靶RNA分子的翻译的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在使一种或多种靶RNA分子失活或制备使一种或多种靶RNA分子失活的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在可视化、标记或检测一种或多种靶RNA分子或制备可视化、标记或检测一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在运输一种或多种靶RNA分子或制备运输一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在掩蔽一种或多种靶RNA分子或制备掩 蔽一种或多种靶RNA分子的试剂中的用途。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在诊断、治疗或预防与靶RNA相关的疾病或病症中的用途。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在诊断、治疗或预防与靶RNA相关的疾病或病症中的用途。
本披露的另一方面涉及一种诊断、治疗或预防与靶RNA相关的疾病或病症的方法,所述方法为:在有需要的受试者的样品或向有需要的受试者施用根据本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的CRISPR-Cas13系统或本文所述的分离的核酸。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒或本文所述的真核细胞在制备用于诊断、治疗或预防与靶RNA相关的疾病或病症的药物中的用途。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在制备用于诊断、治疗或预防与靶RNA相关的疾病或病症的药物中的用途。在符合本领域常识的基础上,上述各优选条件,可任意组合,即得本披露各较佳实例。
本披露所用试剂和原料均市售可得。
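Brief Description of the Drawings
Figure 1 shows the CRISPR locus of the CRISPR-Cas13 system, including the CRISPR array and the coding sequence of the C13-2 protein.
Figure 2 shows the structure of the C13-2 guide polynucleotide, which consists of a direct repeat sequence and a guide sequence. In this figure the scaffold sequence is identical to the direct repeat sequence; the guide sequence consists of a variable number of nucleotides, and N in the figure denotes any nucleotide. The secondary structure of the direct repeat sequence (SEQ ID NO: 3), as predicted with RNAfold, is shown in the figure; a stem-loop structure is visible, and the stem region contains multiple complementary base pairs.
Figure 3 shows the expression and purification of recombinant C13-2 protein.
Figure 4 shows that C13-2 is highly active in down-regulating PTBP1 (polypyrimidine tract-binding protein 1) RNA in 293T cells.
Figure 5 shows that C13-2 has higher activity than CasRx and shRNA in down-regulating AQp1 (aquaporin 1) RNA in 293T cells.
Figure 6 shows that C13-2 has higher activity than C13-113 and C13-114 in down-regulating PTBP1 RNA in 293T cells.
Figure 7 shows that C13-2 can down-regulate ANGPTL3 (angiopoietin-like 3) RNA in 293T cells.
Figure 8 shows the C13-2 domains predicted by computational methods: positions 1-95 form the NTD domain, positions 96-255 the HEPN-1_I domain, positions 256-417 the Helical-1 domain, positions 418-504 the HEPN-1_II domain, positions 505-651 the Helical-2 domain, and positions 652-893 the HEPN-2 domain.
Figure 9 shows the editing effect on VEGFA RNA when different DRs are used.
Figure 10 shows the editing effect on PTBP1 RNA when different DRs are used.
Figure 11 compares the effect of C13-2 and known Cas13 tools in targeting VEGFA RNA.
Figure 12 compares the effect of C13-2 and known Cas13 tools in targeting PTBP1 RNA.
Figure 13 shows the sequencing chromatograms after dC13-2 single-base editing.
Figure 14 shows the VEGFA RNA levels, measured by qPCR, after targeted editing by the mutants.
Figure 15 shows the VEGFA RNA levels after editing, measured by RNA-seq.
Figure 16 shows the editing efficiency of the mutants on AR RNA, measured by qPCR.
Figure 17 shows the AR RNA levels after editing, measured by RNA-seq.
Figure 18 shows the sequence alignment of the direct repeat sequences DRrc, DR-hf2 and DR2rc.
Figure 19 shows the RNAfold-predicted RNA secondary structure of the direct repeat sequence DR-hf2 (SEQ ID NO: 87).
Detailed Description
The present disclosure is further illustrated below by way of examples, which do not limit the disclosure to the scope of the examples described. Experimental methods in the following examples for which specific conditions are not indicated were carried out according to conventional methods and conditions, or selected according to the product instructions.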
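As used herein, the term "sequence identity" (identity or percent identity) refers to the degree of sequence matching between two polypeptides or between two nucleic acids. When a position in both compared sequences is occupied by the same base or amino acid monomer subunit (for example, a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the molecules are identical at that position. The "percent sequence identity" (percent identity) between two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions compared × 100%. For example, if 6 of 10 positions in two sequences match, the two sequences have 60% sequence identity. Typically, the comparison is made when the two sequences are aligned to give maximum sequence identity. Such an alignment can be performed using published and commercially available alignment algorithms and programs, such as but not limited to Clustal Omega, MAFFT, Probcons, T-Coffee, Probalign and BLAST, which a person of ordinary skill in the art can reasonably select. Those skilled in the art can determine suitable parameters for aligning sequences, including any algorithm needed to achieve a good or optimal alignment over the full length of the compared sequences, and any algorithm needed to achieve a good or optimal alignment over local regions of the compared sequences.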
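
The percent-identity definition above, including its worked example of 6 matches in 10 positions, can be reproduced with the following Python sketch. It assumes the two sequences have already been aligned by one of the listed tools and are of equal length; skipping columns where both sequences carry a gap and counting a gap opposite a residue as a compared, non-matching position are assumptions of this sketch, not rules stated in the text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity = matching positions / compared positions x 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * matches / compared


# Hypothetical 10-residue peptides that agree at 6 of 10 positions.
identity = percent_identity("ACDEFGHIKL", "ACDEFGYYYY")
```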
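As used herein, the term "guide polynucleotide" refers to a molecule in a CRISPR-Cas system that forms a CRISPR complex with a Cas protein and directs the CRISPR complex to a target sequence. Typically, a guide polynucleotide comprises a scaffold sequence linked to a guide sequence, and the guide sequence can hybridize with the target sequence. The scaffold sequence usually comprises a direct repeat sequence and may sometimes also comprise a tracrRNA sequence. When the scaffold sequence does not comprise a tracrRNA sequence, the guide polynucleotide comprises the guide sequence and the direct repeat sequence, and in that case may also be called a crRNA.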
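CRISPR-Cas13 System
Class 2 CRISPR-Cas systems confer diverse adaptive immune mechanisms on microorganisms. Provided herein is an analysis of prokaryotic genomes and metagenomes that identifies a previously uncharacterized RNA-guided, RNA-targeting CRISPR-Cas13 system comprising C13-2 (also called CasRfg.4), which is classified as a Type VI system. The engineered C13-2-based CRISPR-Cas13 system has robust activity in human cells. As a compact single-effector Cas13 enzyme, C13-2 can also be flexibly packaged into AAV vectors. The results herein demonstrate C13-2 as a programmable RNA-binding module for efficient targeting of cellular RNA, thereby providing a general platform for transcriptome engineering and for therapeutic and diagnostic methods.
As described in Experimental Example 1, a CRISPR-Cas13 system comprising C13-2 was identified by bioinformatic analysis of prokaryotic genomes and metagenomes in the NCBI GenBank and CNGB databases, and its targeted RNA cleavage activity in human cells was subsequently verified experimentally. The CRISPR-Cas13 system comprising C13-2 is a Type VI CRISPR-Cas system. Figure 1 shows the CRISPR locus of this system. The protein sequence of wild-type C13-2 is SEQ ID NO: 1. The wild-type DNA coding sequence of C13-2 is SEQ ID NO: 9. Figure 8 shows the computationally predicted domains of C13-2, which include NTD (the N-terminal domain), HEPN-1_I, HEPN-1_II and HEPN-2; the Helical-1 domain lies between HEPN-1_I and HEPN-1_II, and the Helical-2 domain lies between HEPN-1_II and HEPN-2. The direct repeat sequence of the C13-2 guide polynucleotide is SEQ ID NO: 3. Figure 2 shows the RNAfold-predicted RNA secondary structure of this direct repeat sequence. The engineered CRISPR-Cas13 system described herein can efficiently knock down endogenous target RNAs in human cells, paving the way for RNA-targeting applications as part of a transcriptome engineering toolbox. In some embodiments, C13-2-mediated knockdown across multiple endogenous transcripts achieves higher efficiency and/or specificity than knockdown mediated by CasRx, PspCas13b, Cas13X.1 and/or Cas13Y.1.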
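Accordingly, one aspect of the present disclosure relates to a CRISPR-Cas13 system, composition or kit comprising: a Cas13 protein or fusion protein having at least 90% sequence identity to SEQ ID NO:1, or a nucleic acid encoding the Cas13 protein; and a guide polynucleotide or a nucleic acid encoding the guide polynucleotide; wherein the guide polynucleotide comprises a direct repeat sequence linked to a guide sequence, the guide sequence is engineered to hybridize with a target RNA, and the guide polynucleotide is capable of forming a CRISPR complex with the Cas13 protein and directing sequence-specific binding of the CRISPR complex to the target RNA.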
在一些实施方案中,所述指导多核苷酸包含与指导序列连接的同向重复序列,所述指导序列被工程化以与靶RNA杂交,所述指导多核苷酸能够与所述Cas13蛋白形成CRISPR复合物并指导所述CRISPR复合物序列特异性地结合并切割所述靶RNA。
在一些实施方案中,所述编码Cas13蛋白或融合蛋白的多核苷酸序列和/或所述编码指导多核苷酸的多核苷酸序列可操作地连接至调控序列。在一些实施方案中,所述编码Cas13蛋白或融合蛋白的多核苷酸序列可操作地连接至调控序列。在一些实施方案中,所述编码指导多核苷酸的多核苷酸序列可操作地连接至调控序列。在一些实施方案中,所述编码Cas13蛋白或融合蛋白的多核苷酸序列的调控序列与所述编码指导多核苷酸的多核苷酸序列的调控序列相同或不同。
在一些实施方案中,本文所述的Cas13蛋白具有与SEQ ID NO:1相比至少90%、至少91%、至少92%、至少93%、至少94%、至少95%、至少96%、至少97%、至少98%、至少99%或至少99.5%的序列同一性。在一些实施方案中,本文所述的Cas13蛋白具有与SEQ ID NO:9编码的蛋白序列相比至少90%、至少91%、至少92%、至少93%、至少94%、至少95%、至少96%、至少97%、至少98%或至少99%的序列同一性。当所述CRISPR-Cas13系统包括包含所述Cas13蛋白与蛋白结构域和/或多肽标签的融合蛋白时,计算融合蛋白的Cas13部分与参考序列之间的序列同一性百分比。
在一些实施方案中,本文所述Cas13蛋白来自包含与CNGB数据库中编号为CNA0009596所示基因组的平均核苷酸同一性(ANI)≥95%的基因组的物种(species)。
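In some embodiments, the Cas13 protein described herein comprises one or more (for example, 1 or 2) native HEPN domains, each of which comprises an RX4H amino acid motif (where X denotes any amino acid and the "4", a subscript in the original notation, denotes four consecutive amino acids). In some embodiments, the first catalytic RX4H motif is located at amino acid positions 210-215 of SEQ ID NO:1, the second catalytic RX4H motif at amino acid positions 785-790 of SEQ ID NO:1, and a third RX4H motif at amino acid positions 750-755 of SEQ ID NO:1. In some embodiments, the Cas13 protein described herein comprises one or more mutated HEPN domains. In some embodiments, the mutated Cas13 protein can process its guide polynucleotide but cannot cleave the target RNA.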
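As an illustration of the RX4H notation (R, four arbitrary residues, H, six positions in total, consistent with ranges such as 210-215), the following Python sketch scans a protein string for the motif. The helper name `find_rx4h` and the toy sequence are hypothetical; SEQ ID NO:1 itself is not reproduced here, so the sketch does not attempt to confirm the positions stated above.

```python
import re

# R, any four residues, then H. The lookahead lets overlapping motifs be reported too.
RX4H = re.compile(r"(?=(R.{4}H))")


def find_rx4h(protein: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) tuples using 1-based, inclusive positions."""
    return [(m.start() + 1, m.start() + 6, m.group(1)) for m in RX4H.finditer(protein)]


toy_protein = "MKRAAAAHGGRNLTQHKK"  # hypothetical sequence, not SEQ ID NO:1
print(find_rx4h(toy_protein))  # [(3, 8, 'RAAAAH'), (11, 16, 'RNLTQH')]
```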
在一些实施方案中,本文所述的Cas13蛋白用于RNA切割时没有原间隔区侧翼序列(Protospacer Flanking Sequence,PFS)的要求。
本文所述的CRISPR-Cas13系统可以以多种非限制性方式引入细胞(或无细胞系统):(i)作为Cas13mRNA和指导多核苷酸,(ii)作为单个载体或质粒的一部分,或分为多个载体或质粒,(iii)作为单独的Cas13蛋白和指导多核苷酸,或(iv)作为Cas13蛋白和指导多核苷酸的RNP复合物。
在一些实施方案中,所述CRISPR-Cas13系统、组合物或试剂盒包含编码所述Cas13蛋白的核酸分子,其中编码序列被密码子优化以在真核细胞中表达。在一些实施方案中,所述CRISPR-Cas13系统、组合物或试剂盒包含编码所述Cas13蛋白的核酸分子,其中编码序列被密码子优化以在哺乳动物细胞中表达。在一些实施方案中,所述CRISPR-Cas13系统、组合物或试剂盒包含编码所述Cas13蛋白的核酸分子,其中编码序列被密码子优化以在人细胞中表达。
在一些实施方案中,编码所述Cas13蛋白的核酸分子是质粒。在一些实施方案中,编码所述Cas13蛋白的核酸分子是病毒载体基因组的一部分,例如侧翼为ITR的AAV载体的DNA基因组。在一些实施方案中,编码所述Cas13蛋白的核酸分子是mRNA。
指导多核苷酸
在一些实施方案中,所述CRISPR-Cas13系统的所述指导多核苷酸是指导RNA。在一些实施方案中,所述指导多核苷酸是化学修饰的指导多核苷酸。在一些实施方案中,所述指导多核苷酸包含至少一个化学修饰的核苷酸。在一些实施方案中,所述指导多核苷酸是杂合RNA-DNA指导。在一些实施方案中,所述指导多核苷酸是杂合RNA-LNA(锁核酸)指导。
在一些实施方案中,所述指导多核苷酸包含与至少一个同向重复序列(direct repeat,DR)连接的至少一个指导序列(guide sequence,也称为间隔序列spacer sequence)。在一些实施方案中,所述指导序列位于同向重复序列的3'端。在一些实施方案中,所述指导序列位于同向重复序列的5'端。
在一些实施方案中,所述指导序列包含至少15个核苷酸、至少16个核苷酸、至少17个核苷酸、至少18个核苷酸、至少19个核苷酸、至少20个核苷酸、至少21个核苷酸、至少22个核苷酸、至少23个核苷酸、至少24个核苷酸、至少25个核苷酸、至少26个核苷酸、至少27个核苷酸、至少28个核苷酸、至少29个核苷酸、或至少30个核苷酸。在一些实施方案中,所述指导序列包含不超过60个核苷酸、不超过55个核苷酸、不超过50个核苷酸、不超过45个核苷酸、不超过40个核苷酸、不超过35个核苷酸、或不超过30个核苷酸。在一些实施方案中,所述指导序列包含15-20个核苷酸、20-25个核苷酸、25-30个核苷酸、30-35个核苷酸或35-40个核苷酸。
在一些实施方案中,所述指导序列与所述靶RNA序列具有足够的互补性以与所述靶RNA杂交并指导所述CRISPR-Cas13复合物与所述靶RNA的序列特异性结合。在一些实施方案中,所述指导序列与所述靶RNA(或要靶向的RNA的区域)具有100%的互补性,但所述指导序列可以与所述靶RNA具有小于100%的互补性,例如至少80%、至少85%、至少90%、至少95%、至少98%或至少99%的互补性。
在一些实施方案中,所述指导序列被工程化以与所述靶RNA杂交,错配不超过两个核苷酸。在一些实施方案中,所述指导序列被工程化以与所述靶RNA杂交,且错配不超过一个核苷酸。在一些实施方案中,所述指导序列被工程化以与所述靶RNA杂交,有或没有错配。
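The complementarity and mismatch limits described above can be checked with a short Python sketch. It assumes RNA sequences written 5' to 3', strict Watson-Crick pairing (G·U wobble pairs are counted as mismatches, which is a simplification of this sketch), and a target site of the same length as the guide. The function names are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def count_mismatches(guide: str, target_site: str) -> int:
    """Number of guide positions that do not Watson-Crick pair with the target site."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have the same length")
    perfect_guide = reverse_complement(target_site)
    return sum(1 for g, p in zip(guide.upper(), perfect_guide) if g != p)


def percent_complementarity(guide: str, target_site: str) -> float:
    return 100.0 * (len(guide) - count_mismatches(guide, target_site)) / len(guide)


guide = "GUGGUUGGAGAACUGGAUGUAGAUGGGCUG"  # PTBP1 guide sequence, SEQ ID NO:6
site = reverse_complement(guide)          # a perfectly complementary site
print(count_mismatches(guide, site))      # 0
```

A guide would satisfy the "no more than two mismatches" embodiment when `count_mismatches` returns 0, 1 or 2.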
在一些实施方案中,所述同向重复序列包含至少20个核苷酸、至少21个核苷酸、至少22个核苷酸、至少23个核苷酸、至少24个核苷酸、至少25个核苷酸,至少26个核苷酸,至少27个核苷酸,至少28个核苷酸,至少29个核苷酸,至少30个核苷酸,至少31个核苷酸,至少32个核苷酸,至少33个核苷酸、至少34个核苷酸、至少35个核苷酸或至少36个核苷酸。在一些实施方案中,所述同向重复序列包含不超过60个核苷酸、不超过55个核苷酸、不超过50个核苷酸、不超过45个核苷酸、不超过40个核苷酸或不超过35个核苷酸。在一些实施方案中,所述同向重复序列包含20-25个核苷酸、25-30个核苷酸、30-35个核苷酸或35-40个核苷酸。
在一些实施方案中,所述同向重复序列被修饰,以用不同的互补碱基对替换图2所示的茎区(stem region)中的至少一个互补碱基对。在一些实施方案中,所述同向重复序列被修饰以改变图2所示的茎区中互补碱基对的数目。在一些实施方案中,所述同向重复序列被修饰以改变图2所示的环区(loop region)中的核苷酸数目(例如,环中3、4或5个核苷酸)。在一些实施方案中,所述同向重复序列被修饰以改变环区中的核苷酸序列。在一些实施方案中,适体(aptamer)序列插入或附加到同向重复序列的末端。在一些实施方案中,所述同向重复序列具有与SEQ ID NO:3和SEQ ID NO:80-87中任一项相比至少60%、至少70%、至少80%、至少85%、至少90%、至少91%、至少92%、至少93%、至少94%、至少95%、至少96%、至少97%、至少98%、至少99%或100%的序列同一性。
在一些实施方案中,所述CRISPR-Cas13系统、组合物或试剂盒包含至少2个、至少3个、至少4个、至少5个、至少10个或至少20个不同的指导多核苷酸。在一些实施方案中,所述指导多核苷酸靶向至少2个、至少3个、至少4个、至少5个、至少10个或至少20个不同的靶RNA分子,或靶向一个或多个靶RNA分子的至少2个、至少3个、至少4个、至少5个、至少10个或至少20个不同区域。
在一些实施方案中,所述指导多核苷酸包括位于可变指导序列上游的恒定同向重复序列。在一些实施方案中,多个指导多核苷酸是阵列的一部分(其可以是载体的一部分,例如病毒载体或质粒)。例如,包括序列DR-间隔区-DR-间隔区-DR-间隔区的指导阵列可以包括三个独特的未加工指导多核苷酸(每个DR-间隔区序列一个)。一旦被引入细胞或无细胞系统,阵列就会被所述Cas13蛋白加工成三个单独的成熟指导多核苷酸。这允许多路复用,例如将多个指导多核苷酸递送至细胞或系统以靶向多个靶RNA或单个靶RNA内的多个区域。
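The DR-spacer-DR-spacer-DR-spacer layout described above can be sketched in Python as follows. The direct repeat is SEQ ID NO:3 and the spacers are the guide sequences SEQ ID NO:5, 6 and 4 given elsewhere in this document; combining them in one array is purely illustrative. The sketch splits the array at direct repeat boundaries only. The exact position at which the Cas13 protein cleaves within the direct repeat is not stated here, so the returned units represent unprocessed DR-spacer units, not mature guides.

```python
DR = "GGAAGAUAACUCUACAAACCUGUAGGGUUCUGAGAC"  # direct repeat, SEQ ID NO:3


def build_array(spacers: list[str], dr: str = DR) -> str:
    """Concatenate DR-spacer units into one guide array transcript."""
    return "".join(dr + spacer.upper() for spacer in spacers)


def split_array(array: str, dr: str = DR) -> list[str]:
    """Split an array into its DR-spacer units (assumes no spacer contains the DR)."""
    if not array.startswith(dr):
        raise ValueError("array must start with the direct repeat")
    return [dr + spacer for spacer in array.split(dr)[1:]]


spacers = [
    "AGGGCAGAACCGAUGCUGAUGAAGAC",      # AQP1 guide, SEQ ID NO:5
    "GUGGUUGGAGAACUGGAUGUAGAUGGGCUG",  # PTBP1 guide, SEQ ID NO:6
    "ugccguucuucugcuugucggccaugauau",  # EGFP guide, SEQ ID NO:4
]
units = split_array(build_array(spacers))
print(len(units))  # 3
```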
指导多核苷酸指导CRISPR复合物与靶RNA的序列特异性结合的能力可以通过任何合适的测定来评估。例如,可以将足以形成CRISPR复合物的CRISPR系统的组分,包括待测试的指导多核苷酸,提供给具有相应靶RNA分子的宿主细胞,例如通过编码CRISPR复合物的组分的载体的转染,然后评估靶序列内的优先切割。类似地,可以在试管中评估靶RNA序列的切割,方法是提供靶RNA、CRISPR复合物的组分,包括待测试的指导多核苷酸和不同于测试指导多核苷酸的对照指导多核苷酸,并比较待测试和对照指导多核苷酸之间结合靶RNA的能力或切割靶RNA的速率。
Cas13突变体
In some embodiments, compared with the wild-type C13-2 protein (SEQ ID NO:1), the Cas13 protein provided herein comprises one or more mutations, such as a single amino acid insertion, a single amino acid deletion, a single amino acid substitution, or a combination thereof. In some examples, compared with the wild-type C13-2 protein (SEQ ID NO:1), the Cas13 protein comprises any whole number from 1 to 90 (that is, 1, 2, 3, ..., 89 or 90) of amino acid changes (for example insertions, deletions or substitutions) but retains the ability to bind a target RNA molecule complementary to the guide sequence of the guide polynucleotide, and/or retains the ability to process a guide array RNA transcript into guide polynucleotide molecules. In some examples, compared with the wild-type C13-2 protein (SEQ ID NO:1), the Cas13 protein comprises any whole number from 1 to 90 of amino acid changes (for example insertions, deletions or substitutions) but retains the ability to bind a target RNA molecule complementary to the guide sequence of the guide polynucleotide. In some examples, compared with the wild-type C13-2 protein (SEQ ID NO:1), the Cas13 protein comprises any whole number from 1 to 30 (that is, 1, 2, 3, ..., 29 or 30) of amino acid changes (for example insertions, deletions or substitutions) but retains the ability to bind a target RNA molecule complementary to the guide sequence of the guide polynucleotide, and/or retains the ability to process a guide array RNA transcript into guide polynucleotide molecules.
在一些实施方案中,所述Cas13蛋白在催化结构域中包含一个或多个突变并且具有降低的RNA切割活性。在一些实施方案中,所述Cas13蛋白在催化结构域中包含一个突变并且具有降低的RNA切割活性。在一些实施方案中,所述Cas13蛋白在一个或两个HEPN结构域中包含一个或多个突变并且基本上缺乏RNA切割活性。在一些实施方案中,所述Cas13蛋白在一个或两个HEPN结构域中包含突变并且基本上缺乏RNA切割活性。在一些实施方案中,所述“基本上缺乏RNA切割活性”是指与野生型Cas13蛋白相比仅保留≤50%、≤40%、≤30%、≤20%、≤10%、≤5%或≤1%的RNA切割活性,或无可检测的RNA切割活性。
在一些实施方案中,所述Cas13蛋白具有与SEQ ID NO:1相比至少90%、至少91%、至少92%、至少93%、至少94%、至少95%、至少96%、至少97%、至少98%或至少99%的序列同一性。在一些实施方案中,所述Cas13蛋白具有与SEQ ID NO:9编码的蛋白质序列相比至少90%、至少91%、至少92%、至少93%、至少94%、至少95%、至少96%、至少97%、至少98%或至少99%的序列同一性。当所述CRISPR-Cas13系统包含所述Cas13与蛋白结构域和/或多肽标签的融合蛋白时,计算融合蛋白的Cas13部分与参考序列之间的序列同一性百分比。
在一些实施方案中,所述Cas13蛋白能与指导多核苷酸形成CRISPR复合物,所述CRISPR复合物能与靶RNA序列特异性结合。
在一些实施方案中,所述Cas13蛋白能与指导多核苷酸形成CRISPR复合物,所述指导多核苷酸包含与指导序列连接的同向重复序列,所述指导序列被工程化以指导所述CRISPR复合物与靶RNA的序列特异性结合。
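One type of modification or mutation is the replacement of an amino acid residue with an amino acid having similar biochemical properties, that is, a conservative substitution (for example, conservative substitution of 1-4, 1-8, 1-10 or 1-20 amino acids). Generally, conservative substitutions have little or no effect on the activity of the resulting protein or peptide. For example, a conservative substitution is an amino acid substitution in the Cas13 protein that does not substantially affect the binding of the Cas13 protein to a target RNA molecule complementary to the guide sequence of a gRNA molecule, and/or the processing of a guide array RNA transcript into gRNA molecules. Alanine scanning can be used to identify which amino acid residues in the Cas13 protein tolerate amino acid substitution. In one example, when 1-4, 1-8, 1-10 or 1-20 native amino acids are replaced with alanine or another conservative amino acid, the ability of the variant Cas13 protein to modify gene expression in a CRISPR-Cas system changes by no more than 25%, for example no more than 20%, for example no more than 10%. Examples of substitutions that are considered conservative include: Ser for Ala; Lys for Arg; Gln or His for Asn; Glu for Asp; Ser for Cys; Asn for Gln; Asp for Glu; Pro for Gly; Asn or Gln for His; Leu or Val for Ile; Ile or Val for Leu; Arg or Gln for Lys; Leu or Ile for Met; Met, Leu or Tyr for Phe; Thr for Ser; Ser for Thr; Tyr for Trp; Trp or Phe for Tyr; and Ile or Leu for Val.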
可以通过使用保守性较低的取代来进行更实质性的改变,例如,选择在维持以下效果方面差异更大的残基:(a)取代发生区域中多肽骨架的结构,例如,作为一个螺旋或折叠构象;(b)与靶位点相互作用的区域的电荷或疏水性;或(c)侧链的体积。通常预期会在多肽功能中产生最大变化的取代是(a):亲水残基(例如丝氨酸或苏氨酸)与疏水残基(例如亮氨酸、异亮氨酸、苯丙氨酸、缬氨酸或丙氨酸)之间的取代;(b)半胱氨酸或脯氨酸与任何其他残基之间的取代;(c)带正电侧链的残基(例如赖氨酸、精氨酸或组氨酸)与带负电残基(例如谷氨酸或天冬氨酸)之间的取代;或(d)具有庞大侧链的残基(例如苯丙氨酸)与不具有侧链的残基(例如甘氨酸)之间的取代。
In some embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 40-91 of SEQ ID NO:1 (that is, the region from amino acid 40 to amino acid 91 of SEQ ID NO:1, including amino acids 40 and 91). In other embodiments, the Cas13 protein comprises one or more mutations at amino acid positions 146-153, 158-176, 182-209, 216-253, 271-287, 341-353, 379-424, 456-477, 521-557, 575-588, 609-625, 700-721, 724-783, 796-815, 828-852 or 880-893 of SEQ ID NO:1, each of these ranges being a separate embodiment and inclusive of its end positions.
在一些实施方案中,所述Cas13蛋白包含在SEQ ID NO:1的氨基酸位置348-350处(即SEQ ID NO:1序列中第348个氨基酸、第349个氨基酸和第350个氨基酸)的一个或多个氨基酸的缺失。在一些实施方案中,所述Cas13蛋白包含在SEQ ID NO:1的氨基酸位置521-556处的一个或多个氨基酸的缺失。在一些实施方案中,所述Cas13蛋白包含在SEQ ID NO:1的氨基酸位置883-893处的一个或多个氨基酸的缺失。
在一些实施方案中,所述Cas13蛋白的RxxxxH基序(x表示任意氨基酸,RxxxxH也可记为Rx4H或R4xH)包含一个或多个突变并且基本上缺乏RNA切割活性。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的210-215位RxxxxH基序、750-755位RxxxxH基序和/或785-790位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的210-215位RxxxxH基序的对应位置包含突变。在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的750-755位RxxxxH基序的对应位置包含突变。在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的785-790位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的210-215位RxxxxH基序和750-755位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的210-215位RxxxxH基序和785-790位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的750-755位RxxxxH基序和785-790位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的210-215位RxxxxH基序、750-755位RxxxxH基序和785-790位RxxxxH基序的对应位置包含突变。
在一些实施方案中,所述RxxxxH基序突变为AxxxxH、RxxxxA或AxxxxA。在一些实施方案中,所述RxxxxH基序突变为AxxxxH。在一些实施方案中,所述RxxxxH基序突变为RxxxxA。在一些实施方案中,所述RxxxxH基序突变为AxxxxA。
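In some embodiments, the Cas13 protein comprises 1, 2, 3, 4, 5 or 6 mutations at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and/or H790 of the reference protein shown in SEQ ID NO:1. In some embodiments, the Cas13 protein is mutated to A (alanine) at positions corresponding to amino acid residues R210, H215, R750, H755, R785 and/or H790 of the reference protein shown in SEQ ID NO:1.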
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R210和H215的对应位置包含突变。在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R750和H755的对应位置包含突变。在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R785和H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R210、H215、R750和H755的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R750、H755、R785和H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R210、H215、R785和/或H790的对应位置包含突变。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基R210、H215、R750、H755、R785和H790的对应位置包含突变。
在一些实施方案中,所述R210、R750或R785的对应位置突变为A。在一些实施方案中,所述H215、H755或H790的对应位置突变为A。在一些实施方案中,所述R210、H215、R750、H755、R785和H790的对应位置都突变为A。
在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的210-215位RxxxxH基序、750-755位RxxxxH基序和/或785-790位RxxxxH基序引入突变而得到。
在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和/或H790位置引入1个、2个、3个、4个、5个或6个突变而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和/或H790位置突变为A(丙氨酸)而得到。
在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R785和H790位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750和H755位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R750、H755、R785和H790位置突变为A而得到。在一些实施方案中,所述Cas13蛋白通过在SEQ ID NO:1所示序列的R210、H215、R750、H755、R785和H790位置突变为A而得到。
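Substitution labels of the form R210A (wild-type residue, 1-based position, new residue) can be applied programmatically. The Python sketch below is a hypothetical helper, not part of the disclosed method; it checks that the stated wild-type residue is actually present before substituting. It is shown on a toy sequence because SEQ ID NO:1 is not reproduced in this section.

```python
import re

_MUTATION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(protein: str, mutations: list[str]) -> str:
    """Apply labels such as 'R210A' to a protein sequence (1-based positions)."""
    residues = list(protein)
    for label in mutations:
        match = _MUTATION.fullmatch(label)
        if match is None:
            raise ValueError(f"unrecognized mutation label: {label}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: found {residues[position - 1]} at that position")
        residues[position - 1] = new
    return "".join(residues)


# Toy RxxxxH -> AxxxxA example on a hypothetical 10-residue sequence.
print(apply_substitutions("MKRAAAAHGG", ["R3A", "H8A"]))  # MKAAAAAAGG
```

For the inactivated mutant named in this document, the label list would be `["R210A", "H215A", "R750A", "H755A", "R785A", "H790A"]` applied to SEQ ID NO:1.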
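In some embodiments, the Cas13 protein comprises at least one mutation at positions corresponding to amino acid residues 40-91, 146-153, 158-176, 182-209, 216-253, 271-287, 341-353, 379-424, 456-477, 521-557, 575-588, 609-625, 700-721, 724-783, 796-815, 828-852 or 880-893 of the reference protein shown in SEQ ID NO:1. In some embodiments, the Cas13 protein comprises a deletion of one or more amino acids at positions corresponding to amino acid residues 348-350, 521-556 or 883-893 of the reference protein shown in SEQ ID NO:1.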
在一些实施方案中,与SEQ ID NO:1所示参比蛋白相比,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的以下氨基酸残基的对应位置包含任意一种或更多种突变:R11、N34、R35、R47、R58、R63、R64、N68、N87、N265、N274、R276、R290、R294、N299、N303、R308、R314、R320、R328、N332、R341、N346、R358、N372、N383、N390、N394、R47+R290、R47+R314、R290+R314、R47+R290+R314、R308+N68、N394+N68、N87+N68、R308+N265、N394+N265、N87+N265、R308+N68+N265、N87+N68+N265、T7、A16、S260、A263、M266、N274、F288、M302、N303、L304、V305、I311、D313、H324、P326、H327、N332、N346、T353、T360、E365、A373、M380、S382、K395、Y396、D402、D411、S418。
在一些实施方案中,与SEQ ID NO:1所示参比蛋白相比,所述Cas13蛋白在表24中SEQ ID NO:1所示参比蛋白的突变位点的对应位置突变为相同的氨基酸残基。在一些实施方案中,与SEQ ID NO:1所示参比蛋白相比,所述Cas13蛋白在表24中SEQ ID NO:1所示参比蛋白的突变位点的对应位置包含相同的突变。
在一些实施方案中,所述Cas13蛋白由SEQ ID NO:1所示序列在以下位置引入任意一种或更多种突变而得到:R11、N34、R35、R47、R58、R63、R64、N68、N87、N265、N274、R276、R290、R294、N299、N303、R308、R314、R320、R328、N332、R341、N346、R358、N372、N383、N390、N394、R47+R290、R47+R314、R290+R314、R47+R290+R314、R308+N68、N394+N68、N87+N68、R308+N265、N394+N265、N87+N265、R308+N68+N265、N87+N68+N265、T7、A16、S260、A263、M266、N274、F288、M302、N303、L304、V305、I311、D313、H324、P326、H327、N332、N346、T353、T360、E365、A373、M380、S382、K395、Y396、D402、D411、S418。
在一些实施方案中,所述Cas13蛋白由SEQ ID NO:1所示序列引入表24中任意一种或更多种突变而得到。
在一些实施方案中,所述Cas13蛋白在与SEQ ID NO:1所示参比蛋白的氨基酸残基91-120位、141-180位、211-240位、331-360位、351-400位、431-460位、461-500位、511-550位、611-640位、631-660位、661-690位、691-760位、821-860位或861-890位的对应位置发生序列缺失而得到。
In some embodiments, the sequence deletion removes ≤300, ≤200, ≤150, ≤100, ≤90, ≤80, ≤70, ≤60, ≤50, ≤40, ≤30, ≤20 or ≤10 amino acid residues.
在一些实施方案中,所述Cas13蛋白由SEQ ID NO:1所示序列在91-120位、141-180位、211-240位、331-360位、351-400位、431-460位、461-500位、511-550位、611-640位、631-660位、661-690位、691-760位、821-860位或861-890位发生序列缺失而得到。
亚细胞定位信号(或称定位信号)
在一些实施方案中,所述Cas13蛋白或其功能片段与至少一种同源或异源亚细胞定位信号融合。示例性的亚细胞定位信号包括细胞器定位信号,例如核定位信号(NLS)、核输出信号(NES)或线粒体定位信号。
在一些实施方案中,所述Cas13蛋白或其功能片段与至少1个同源或异源NLS融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少2个NLS融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少3个NLS融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少1个N-末端NLS和至少1个C-末端NLS融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少2个C-末端NLS融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少2个N-末端NLS融合。
In some embodiments, each NLS is independently selected from SPKKKRKVEAS (SEQ ID NO:53), GPKKKRKVAAA (SEQ ID NO:54), PKKKRKV (SEQ ID NO:55), KRPAATKKAGQAKKKK (SEQ ID NO:56), PAAKRVKLD (SEQ ID NO:57), RQRRNELKRSP (SEQ ID NO:58), NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO:59), RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO:60), VSRKRPRP (SEQ ID NO:61), PPKKARED (SEQ ID NO:62), POPKKKPL (SEQ ID NO:63), SALIKKKKKMAP (SEQ ID NO:64), DRLRR (SEQ ID NO:65), PKQKKRK (SEQ ID NO:66), RKLKKKIKKL (SEQ ID NO:67), REKKKFLKRR (SEQ ID NO:68), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:69), RKCLQAGMNLEARKTKK (SEQ ID NO:70) and PAAKKKKLD (SEQ ID NO:71).
在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源NES融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少两个NES融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少三个NES融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少一个N-末端NES和至少一个C-末端NES融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少两个C-末端NES融合。在一些实施方案中,所述Cas13蛋白或其功能片段与至少两个N-末端NES融合。
In some embodiments, each NES is independently selected from the adenovirus type 5 E1B NES, the HIV Rev NES, the MAPK NES and the PTK2 NES.
在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源的NLS和NES融合,在所述NLS和所述NES之间存在一个可切割接头。在一些实施方案中,在生产细胞系中所述NES促进产生包含所述Cas13蛋白或其功能片段的递送颗粒(例如,病毒样颗粒)。在一些实施方案中,靶细胞中所述接头的切割可以暴露所述NLS并促进靶细胞中所述Cas13蛋白或其功能片段的核定位。
蛋白结构域及多肽标签
在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源蛋白结构域和/或多肽标签共价连接或融合。在一些实施方案中,所述Cas13蛋白或其功能片段与同源或异源蛋白结构域和/或多肽标签融合。
在一些实施方案中,所述蛋白结构域和多肽标签任选自:胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、报告域、亲和域、亚细胞定位信号、报告标签和亲和标签。
在一些实施方案中,所述蛋白结构域包含胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核糖核酸酶结构域、剪接因子结构域、报告域和亲和域。在一些实施方案中,所述多肽标签包含报告标签和亲和标签。
在一些实施方案中,所述蛋白结构域的氨基酸序列的长度为≥40个氨基酸、≥50个氨基酸、≥60个氨基酸、≥70个氨基酸、≥80个氨基酸、≥90个氨基酸、≥100个氨基酸、≥150个氨基酸、≥200个氨基酸、≥250个氨基酸、≥300个氨基酸、≥350个氨基酸或≥400个氨基酸。
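Exemplary protein domains include domains that can cleave RNA (for example, a PIN endonuclease domain, an NYN domain, the SMR domain from SOT1, or the RNase domain from staphylococcal nuclease); domains that can affect RNA stability (for example, tristetraprolin (TTP) or domains from UPF1, EXOSC5 and STAU1); domains that can edit nucleotides or ribonucleotides (for example, cytidine deaminases, PPR proteins, adenosine deaminases, ADAR family proteins or APOBEC family proteins); domains that can activate translation (for example, eIF4E and other translation initiation factors, or domains of yeast poly(A)-binding protein or GLD2); domains that can repress translation (for example, Pumilio or FBF PUF proteins, deadenylases, CAF1 or Argonaute proteins); domains that can methylate RNA (for example, domains from m6A methyltransferase factors such as METTL14, METTL3 or WTAP); domains that can demethylate RNA (for example, human alkylation repair homolog 5); domains that can affect splicing (for example, the RS-rich domain of SRSF1, the Gly-rich domain of hnRNP A1, the alanine-rich motif of RBM4, or the proline-rich motif of DAZAP1); domains that enable affinity purification or immunoprecipitation; and domains that enable proximity-based protein labeling and identification (for example, a biotin ligase such as BirA or a peroxidase such as APEX2, so that proteins interacting with the target DNA are biotinylated).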
在一些实施方案中,所述蛋白结构域包含腺苷脱氨酶结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与腺苷脱氨酶结构域共价连接或融合以指导哺乳动物细胞中RNA转录物的A-to-I脱氨酶活性。Cox et al.,Science 358(6366):1019-1027(2017)中描述了基于ADAR2工程化的用于靶向A-to-I RNA编辑的腺苷脱氨酶结构域,其通过引用整体并入本文。在其他实施方案中,腺苷脱氨酶结构域与接头蛋白共价连接或融合,该接头蛋白能够结合插入或附加到所述指导多核苷酸内的适体序列,从而允许所述腺苷脱氨酶结构域非共价连接到与所述指导多核苷酸复合的Cas13蛋白或其功能片段上。
在一些实施方案中,所述蛋白结构域包含胞嘧啶脱氨酶结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与胞嘧啶脱氨酶结构域共价连接或融合以指导哺乳动物细胞中RNA转录物的C-to-U脱氨酶活性。Abudayyeh et al.,Science 365(6451):382-386(2019)中描述了从ADAR2进化而来的用于靶向C-to-U RNA编辑的胞嘧啶脱氨酶结构域,该文献通过引用整体并入本文。在其他实施方案中,所述胞嘧啶脱氨酶结构域与接头蛋白共价连接或融合,该接头蛋白能够结合插入或附加至所述指导多核苷酸的适体序列,从而允许所述胞嘧啶脱氨酶结构域非共价连接到与所述指导多核苷酸复合的所述Cas13蛋白或其功能片段上。
在一些实施方案中,所述蛋白结构域包含剪接因子结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与剪接因子结构域共价连接或融合以指导哺乳动物细胞中靶RNA的可变剪接。Konermann et al.,Cell 173(3):665-676(2018)中描述了用于靶向选择性剪接的剪接因子结构域,其通过引用整体并入本文。剪接因子结构域的非限制性实例包括SRSF1的富含RS的结构域、hnRNPA1的富含Gly的结构域、RBM4的富含丙氨酸的基序或DAZAP1的富含脯氨酸的基序。在其他实施方案中,所述剪接因子结构域与接头蛋白共价连接或融合,该接头蛋白能够结合插入或附加至所述指导多核苷酸的适体序列,从而允许所述剪接因子结构域非共价连接到与所述指导多核苷酸复合的所述Cas13蛋白或其功能片段上。
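In some embodiments, the protein domain comprises a translation activation domain. In some embodiments, the Cas13 protein with mutated HEPN domains, the catalytically inactive Cas13 protein, or a functional fragment thereof is covalently linked or fused to a translation activation domain to activate or increase expression of the target RNA. Non-limiting examples of translation activation domains include eIF4E and other translation initiation factors, and domains of yeast poly(A)-binding protein or GLD2. In other embodiments, the translation activation domain is covalently linked or fused to an adaptor protein that is capable of binding an aptamer sequence inserted into or appended to the guide polynucleotide, thereby allowing the translation activation domain to be non-covalently attached to the Cas13 protein or functional fragment thereof complexed with the guide polynucleotide.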
在一些实施方案中,所述蛋白结构域包含翻译抑制结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与翻译抑制结构域共价连接或融合以抑制或降低靶RNA的表达。翻译抑制结构域的非限制性实例包括Pumilio或FBF PUF蛋白、去腺苷酶、CAF1、Argonaute蛋白。在其他实施方案中,所述翻译抑制结构域与接头蛋白共价连接或融合,该接头蛋白能够结合插入或附加到所述指导多核苷酸内的适体序列,从而允许所述翻译抑制结构域非共价连接到与所述指导多核苷酸复合的所述Cas13蛋白或其功能片段上。
在一些实施方案中,所述蛋白结构域包含RNA甲基化结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与RNA甲基化结构域共价连接或融合以用于靶RNA的甲基化。RNA甲基化结构域的非限制性例子包括m6A结构域,例如METTL14、METTL3或WTAP。在其他实施方案中,所述RNA甲基化结构域与接头蛋白共价连接或融合,该接头蛋白能够结合插入或附加至所述指导多核苷酸的适体序列,从而允许所述RNA甲基化结构域非共价连接到与所述指导多核苷酸复合的所述Cas13蛋白或其功能片段上。
在一些实施方案中,所述蛋白结构域包含RNA去甲基化结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白或其功能片段与RNA去甲基化结构域共价连接或融合以用于靶RNA的去甲基化。RNA去甲基化结构域的非限制性实例包括人烷基化修复同源物5或ALKBH5。在其他实施方案中,所述RNA去甲基化结构域共价连接或融合至接头蛋白,该接头蛋白能够结合插入或附加至所述指导多核苷酸的适体序列,从而允许所述RNA去甲基化结构域非共价连接到与所述指导多核苷酸复合的所述Cas13蛋白或其功能片段上。
在一些实施方案中,所述蛋白结构域包含核糖核酸酶结构域。在一些实施方案中,具有突变HEPN结构域的所述Cas13蛋白、催化失活的所述Cas13蛋白、或其功能片段与核糖核酸酶结构域共价连接或融合以切割靶RNA。核糖核酸酶结构域的非限制性实例包括PIN内切核酸酶结构域、NYN结构域、来自SOT1的SMR结构域或来自葡萄球菌核酸酶的RNase结构域。
在一些实施方案中,所述蛋白结构域包含亲和结构域(亲和域)和/或报告结构域(报告域)。在一些实施方案中,所述Cas13蛋白或其功能片段共价连接或融合至报告结构域,例如荧光蛋白。报告域的非限制性例子包括GST、HRP、CAT、GFP、HcRed、DsRed、CFP、YFP、BFP。
在一些实施方案中,所述Cas13蛋白与多肽标签共价连接或融合。在一些实施方案中,所述多肽标签的实例是小多肽序列。在一些实施方案中,所述多肽标签的氨基酸序列的长度为≤50个氨基酸、≤40个氨基酸、≤30个氨基酸、≤25个氨基酸、≤20个氨基酸、≤15个氨基酸、≤10个氨基酸或≤5个氨基酸。在一些实施方案中,Cas13蛋白与亲和标签如纯化标签共价连接或融合。亲和标签的非限制性实例包括HA-标签、His-标签(例如6-His)、Myc-标签、E-标签、S-标签、钙调蛋白标签、FLAG-标签、GST-标签、MBP-标签、Halo标签或生物素。
在一些实施方案中,C13-2失活突变体(R210A+H215A+R750A+H755A+R785A+H790A)与蛋白结构域和/或多肽标签融合。
在一些实施方案中,Cas13蛋白与ADAR融合。
在一些实施方案中,C13-2失活突变体(R210A+H215A+R750A+H755A+R785A+H790A)与胞嘧啶脱氨酶或腺嘌呤脱氨酶融合。
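In some embodiments, the inactivated C13-2 mutant (R210A+H215A+R750A+H755A+R785A+H790A) is directly covalently linked to a cytosine deaminase or adenine deaminase, linked to it through the rigid linker peptide A(EAAAK)3A (an alanine, three EAAAK repeats, and a final alanine), or linked to it through the flexible linker peptide (GGGGS)3 (three GGGGS repeats).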
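The two linker notations can be expanded into explicit amino acid strings with a small Python sketch. The function names, the toy domain sequences, and the N-to-C order of the two fused parts are assumptions made for illustration only; the text above does not state which part sits at the N-terminus.

```python
def rigid_linker(repeats: int = 3) -> str:
    """A(EAAAK)nA rigid helical linker."""
    return "A" + "EAAAK" * repeats + "A"


def flexible_linker(repeats: int = 3) -> str:
    """(GGGGS)n flexible linker."""
    return "GGGGS" * repeats


def fuse(n_terminal_part: str, linker: str, c_terminal_part: str) -> str:
    """Join two protein parts through a linker; an empty linker models direct fusion."""
    return n_terminal_part + linker + c_terminal_part


print(rigid_linker())     # AEAAAKEAAAKEAAAKA
print(flexible_linker())  # GGGGSGGGGSGGGGS
# Hypothetical placeholder sequences standing in for dC13-2 and a deaminase domain.
print(fuse("MKTLLD", flexible_linker(), "MSEVEF"))
```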
适体序列(Aptamer Sequence)
在一些实施方案中,所述指导多核苷酸进一步包含适体(aptamer)序列。在一些实施方案中,所述适体序列被插入到指导多核苷酸的环(loop)中。在一些实施方案中,所述适体序列被插入到指导多核苷酸的tetra loop中。指导多核苷酸的示例性tetra loop显示在图2中。在一些实施方案中,所述适体序列附加到所述指导多核苷酸的末端。
Konermann et al.,Nature 517:583–588(2015)中描述了将适体序列插入到CRISPR-Cas系统的指导多核苷酸上,该文献通过引用整体并入本文。在一些实施方案中,所述适体序列包括MS2适体序列、PP7适体序列或Qβ适体序列。
接头蛋白(Adaptor protein)
在一些实施方案中,所述CRISPR-Cas13系统进一步包含含有接头蛋白以及同源或异源蛋白结构域和/或多肽标签的融合蛋白,或编码该融合蛋白的核酸,其中所述接头蛋白能够结合适体序列。
Konermann et al.,Nature 517:583-588(2015)中描述了接头蛋白和蛋白结构域的融合蛋白,该文献通过引用整体并入本文。在一些实施方案中,所述接头蛋白包括MS2噬菌体外壳蛋白(MCP)、PP7噬菌体外壳蛋白(PCP)或Qβ噬菌体外壳蛋白(QCP)。在一些实施方案中,所述蛋白结构域包含胞嘧啶脱氨酶结构域、腺苷脱氨酶结构域、翻译激活结构域、翻译抑制结构域、RNA甲基化结构域、RNA去甲基化结构域、核酸酶结构域、剪接因子结构域、亲和域或报告域。
修饰的指导多核苷酸
在一些实施方案中,所述指导多核苷酸包含修饰的核苷酸。在一些实施方案中,修饰的核苷酸包含2'-O-甲基、2'-O-甲基-3'-硫代磷酸酯或2'-O-甲基-3'-硫代PACE。在一些实施方案中,所述指导多核苷酸是化学修饰的指导多核苷酸。Hendel et al.,Nat.Biotechnol.33(9):985-989(2015)中描述了化学修饰的指导多核苷酸,其全文以引用方式并入本文。
在一些实施方案中,所述指导多核苷酸是杂合RNA-DNA指导、杂合RNA-LNA(锁核酸)指导、杂合DNA-LNA指导或杂合DNA-RNA-LNA指导。在一些实施方案中,所述同向重复序列包含一个或多个被相应脱氧核糖核苷酸取代的核糖核苷酸。在一些实施方案中,所述指导序列包含一种或多种被相应脱氧核糖核苷酸取代的核糖核苷酸。杂合RNA-DNA指导多核苷酸在WO2016/123230中进行了描述,其通过引用整体并入本文。
载体系统
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的载体系统,所述载体系统包含一个或多个载体,所述载体包含编码所述Cas13蛋白或融合蛋白的多核苷酸序列和编码所述指导多核苷酸的多核苷酸序列。
在一些实施方案中,所述载体系统包含至少一个质粒或病毒载体(例如,逆转录病毒、慢病毒、腺病毒、腺相关病毒或单纯疱疹病毒)。在一些实施方案中,所述编码Cas13蛋白或融合蛋白的多核苷酸序列和所述编码指导多核苷酸的多核苷酸序列位于同一载体上。在一些实施方案中,所述编码Cas13蛋白或融合蛋白的多核苷酸序列和所述编码指导多核苷酸的多核苷酸序列位于多个载体上。
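In some embodiments, the polynucleotide sequence encoding the Cas13 protein or fusion protein and/or the polynucleotide sequence encoding the guide polynucleotide is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding the Cas13 protein or fusion protein is operably linked to a regulatory sequence. In some embodiments, the polynucleotide sequence encoding the guide polynucleotide is operably linked to a regulatory sequence. In some embodiments, the regulatory sequence of the polynucleotide sequence encoding the Cas13 protein or fusion protein is the same as or different from the regulatory sequence of the polynucleotide sequence encoding the guide polynucleotide. In some embodiments, the regulatory sequence is selected from promoters, enhancers, internal ribosome entry sites (IRES) and other expression control elements (for example, transcription termination signals such as polyadenylation signals and poly-U sequences). In some embodiments, regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell, and those that direct expression of a nucleotide sequence only in certain host cells (for example, tissue-specific regulatory sequences). A tissue-specific promoter can direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, a specific organ (for example liver or pancreas), or a particular cell type (for example lymphocytes). Regulatory sequences can also direct expression in a time-dependent manner, for example in a cell cycle-dependent or developmental stage-dependent manner, which may or may not also be tissue- or cell type-specific. In some embodiments, the regulatory sequence is an enhancer element, such as WPRE, a CMV enhancer, the R-U5 segment in the LTR of HTLV-1, the SV40 enhancer, or the intron sequence between exons 2 and 3 of rabbit β-globin.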
在一些实施方案中,所述载体包含pol III启动子(例如,U6和H1启动子)、pol II启动子(例如,逆转录病毒Rous肉瘤病毒(RSV)LTR启动子(任选地带有RSV增强子)、巨细胞病毒(CMV)启动子(任选地带有CMV增强子)、SV40启动子、二氢叶酸还原酶启动子、β-肌动蛋白启动子、磷酸甘油激酶(PGK)启动子、或EF1α启动子),或pol III启动子和pol II启动子。
在一些实施方案中,所述启动子是组成型启动子,其是连续活性的并且不受外部信号或分子的调节。合适的组成型启动子包括但不限于CMV、RSV、SV40、EF1α、CAG和β-肌动蛋白启动子。在一些实施方案中,所述启动子是受外部信号或分子(例如,转录因子)调节的诱导型启动子。
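In some embodiments, the promoter is a tissue-specific promoter, which can be used to drive tissue-specific expression of the Cas13 protein or fusion protein. Suitable muscle-specific promoters include, but are not limited to, CK8, MHCK7, the myoglobin promoter (Mb), the desmin promoter, the muscle creatine kinase promoter (MCK) and variants thereof, and the SPc5-12 synthetic promoter. Suitable immune cell-specific promoters include, but are not limited to, the B29 promoter (B cells), the CD14 promoter (monocytes), the CD43 promoter (leukocytes and platelets), CD68 (macrophages) and the SV40/CD43 promoter (leukocytes and platelets). Suitable blood cell-specific promoters include, but are not limited to, the CD43 promoter (leukocytes and platelets), the CD45 promoter (hematopoietic cells), IFN-β (hematopoietic cells), the WASP promoter (hematopoietic cells), the SV40/CD43 promoter (leukocytes and platelets), and the SV40/CD45 promoter (hematopoietic cells). Suitable pancreas-specific promoters include, but are not limited to, the elastase-1 promoter. Suitable endothelial cell-specific promoters include, but are not limited to, the Flt-1 promoter and the ICAM-2 promoter. Suitable neuronal tissue/cell-specific promoters include, but are not limited to, the GFAP promoter (astrocytes), the SYN1 promoter (neurons) and NSE/RU5' (mature neurons). Suitable kidney-specific promoters include, but are not limited to, the Nphs1 promoter (podocytes). Suitable bone-specific promoters include, but are not limited to, the OG-2 promoter (osteoblasts, odontoblasts). Suitable lung-specific promoters include, but are not limited to, the SP-B promoter (lung). Suitable liver-specific promoters include, but are not limited to, the SV40/Alb promoter. Suitable heart-specific promoters include, but are not limited to, α-MHC.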
AAV载体
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的腺相关病毒(AAV)载体,其中所述腺相关病毒(AAV)载体包含编码本文所述的Cas13蛋白或融合蛋白以及指导多核苷酸的DNA。
通过AAV载体递送CRISPR-Cas系统在Maeder et al.,Nature Medicine 25:229-233(2019)中进行了描述,其通过引用整体并入本文。在一些实施方案中,所述AAV载体包含ssDNA基因组,该基因组包含Cas13蛋白或融合蛋白以及侧接ITR的指导多核苷酸的编码序列。
在一些实施方案中,本文所述的CRISPR-Cas13系统被包装在AAV载体中,例如AAV1、AAV2、AAV3、AAV4、AAV5、AAV6、AAV7、AAV8、AAV9和AAVrh74。在一些实施方案中,本文所述的CRISPR-Cas13系统被包装在AAV载体中,该AAV载体包含具有组织嗜性的工程化衣壳,例如工程化肌肉嗜性衣壳。Tabebordbar et al.,Cell 184:4919-4938(2021)中描述了通过定向进化对具有组织趋向性的AAV衣壳进行工程改造,该文献通过引用整体并入本文。
脂质纳米粒
本披露的另一方面涉及脂质纳米粒(LNP),其包含本文所述的CRISPR-Cas13系统,其中所述LNP包含本文所述的指导多核苷酸,以及编码本文所述的Cas13蛋白或融合蛋白的mRNA。
Gillmore et al.,N.Engl.J.Med.,385:493-502(2021)中描述了CRISPR-Cas系统的LNP递送,其全文以引用方式并入本文。在一些实施方案中,除了RNA有效负载(Cas13 mRNA和指导多核苷酸)之外,脂质纳米粒(LNP)还包含四种组分:阳离子或可电离脂质、胆固醇、辅助脂质和PEG-脂质。在一些实施方案中,所述阳离子或可电离脂质包括cKK-E12、C12-200、ALC-0315、DLin-MC3-DMA、DLin-KC2-DMA、FTT5、Moderna SM-102和Intellia LP01。在一些实施方案中,所述PEG-脂质包含PEG-2000-C-DMG、PEG-2000-DMG或ALC-0159。在一些实施方案中,所述辅助脂质包括DSPC。LNP的组分在Paunovska et al.,Nature Reviews Genetics 23:265-280(2022)中进行了描述,该文献通过引用整体并入本文。
慢病毒载体
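Another aspect of the present disclosure relates to a lentiviral vector comprising the CRISPR-Cas13 system described herein, wherein the lentiviral vector comprises the guide polynucleotide described herein and an mRNA encoding the Cas13 protein or fusion protein described herein. In some embodiments, the lentiviral vector is pseudotyped with a homologous or heterologous envelope protein such as VSV-G. In some embodiments, the mRNA encoding the Cas13 protein or fusion protein is linked to an aptamer sequence.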
RNP复合物
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的核糖核蛋白复合物,其中所述核糖核蛋白复合物由本文所述的指导多核苷酸以及Cas13蛋白或融合蛋白形成。在一些实施方案中,可以通过显微注射或电穿孔将所述核糖核蛋白复合物递送至真核细胞、哺乳动物细胞或人类细胞。在一些实施方案中,所述核糖核蛋白复合物可以包装在病毒样颗粒中并在体内递送至哺乳动物或人类受试者。
病毒样颗粒
本披露的另一方面涉及包含本文所述的CRISPR-Cas13系统的病毒样颗粒(VLP),其中所述病毒样颗粒包含:本文所述的指导多核苷酸,以及Cas13蛋白或融合蛋白;或由所述指导多核苷酸以及Cas13蛋白或融合蛋白组成的核糖核蛋白复合物。
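Engineered VLPs are described in Banskota et al., Cell 185(2):250-265 (2022), Mangeot et al., Nature Communications 10(1):1-15 (2019), Campbell et al., Molecular Therapy 27:151-163 (2019), and Mangeot et al., Molecular Therapy 19(9):1656-1666 (2011), each of which is incorporated herein by reference in its entirety. In some embodiments, the engineered virus-like particle (VLP) is pseudotyped with a homologous or heterologous envelope protein such as VSV-G. In some embodiments, the Cas13 protein is fused to a gag protein (for example, MLVgag) through a cleavable linker, wherein cleavage of the linker in the target cell exposes an NLS located between the linker and the Cas13 protein. In some embodiments, the fusion protein comprises (for example, from 5' to 3') a gag protein (for example, MLVgag), one or more NES, a cleavable linker, one or more NLS, and Cas13, as described in Banskota et al., Cell 185(2):250-265 (2022). In some embodiments, the Cas13 protein is fused to a first dimerization domain that is capable of dimerizing or heterodimerizing with a second dimerization domain fused to a membrane protein, wherein the presence of a ligand promotes the dimerization and enriches the Cas13 protein or fusion protein in the VLP, as described in Campbell et al., Molecular Therapy 27:151-163 (2019).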
细胞
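Another aspect of the present disclosure relates to a cell comprising the CRISPR-Cas13 system described herein. The cell (which can, for example, be used to produce a cell-free system) can be eukaryotic or prokaryotic. Examples of such cells include, but are not limited to, bacterial, archaeal, plant, fungal, yeast, insect and mammalian cells, for example Lactobacillus, Lactococcus, Bacillus (such as Bacillus subtilis), Escherichia (such as Escherichia coli), Clostridium, Saccharomyces or Pichia (such as Saccharomyces cerevisiae or Pichia pastoris), Kluyveromyces lactis, Salmonella typhimurium, Drosophila cells, Caenorhabditis elegans cells, Xenopus cells, SF9 cells, C129 cells, 293 cells, Neurospora, and immortalized mammalian cell lines (for example, HeLa cells, myeloid cell lines and lymphoid cell lines).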
在一些实施方案中,所述细胞是原核细胞,例如细菌细胞,例如大肠杆菌。在一些实施方案中,细胞是真核细胞,例如哺乳动物细胞或人类细胞。在一些实施方案中,细胞是原代真核细胞、干细胞、肿瘤/癌细胞、循环肿瘤细胞(CTC)、血细胞(例如,T细胞、B细胞、NK细胞、Tregs等)、造血干细胞、特化免疫细胞(如肿瘤浸润淋巴细胞或肿瘤抑制淋巴细胞)、肿瘤微环境中的基质细胞(如癌症相关成纤维细胞等)。在一些实施方案中,细胞是中枢或外周神经系统的脑或神经元细胞(例如,神经元、星形胶质细胞、小胶质细胞、视网膜神经节细胞、视杆/视锥细胞等)。
靶RNA分子
本文所述的CRISPR-Cas13系统、组合物或试剂盒可用于靶向一种或多种靶RNA分子,例如存在于生物样品、环境样品(例如土壤、空气或水样品)等中的靶RNA分子。在一些实施方案中,所述靶RNA是编码RNA,例如pre-mRNA或成熟mRNA。在一些实施方案中,所述靶RNA是核RNA。在一些实施方案中,所述靶RNA是位于真核细胞核中的RNA转录物。在一些实施方案中,所述靶RNA是非编码RNA,例如功能性RNA、siRNA、microRNA、snRNA、snoRNA、piRNA、scaRNA、tRNA、rRNA、lncRNA或lincRNA。
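In some embodiments, in addition to targeting a target RNA molecule, the CRISPR-Cas13 system, composition or kit described herein performs one or more of the following functions on the target RNA: cleaving or nicking one or more target RNA molecules; activating or upregulating one or more target RNA molecules; activating or repressing translation of one or more target RNA molecules; inactivating one or more target RNA molecules; visualizing, labeling or detecting one or more target RNA molecules; binding one or more target RNA molecules; editing one or more target RNA molecules; transporting one or more target RNA molecules; and masking one or more target RNA molecules. In some examples, the CRISPR-Cas13 system, composition or kit described herein modifies one or more target RNA molecules, the modification comprising one or more of the following: RNA base substitution, RNA base deletion, RNA base insertion, breakage of the target RNA, RNA methylation and RNA demethylation. In some embodiments, the CRISPR-Cas13 system, composition or kit described herein can target one or more target RNA molecules. In some embodiments, it can bind one or more target RNA molecules. In some embodiments, it can cleave one or more target RNA molecules. In some embodiments, it can activate translation of one or more target RNA molecules. In some embodiments, it can repress translation of one or more target RNA molecules. In some embodiments, it can detect one or more target RNA molecules. In some embodiments, it can edit one or more target RNA molecules.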
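In some embodiments, the target RNA is AQP1 RNA. Knocking down AQP1 RNA levels with the CRISPR-Cas13 system described herein can reduce aqueous humor production and lower intraocular pressure, which can be used to treat diseases such as glaucoma. In some embodiments, the target RNA is AQP1 RNA and the guide sequence of the guide polynucleotide is SEQ ID NO:5.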
在一些实施方案中,靶RNA是PTBP1 RNA。使用本文所述的CRISPR-Cas13系统敲低PTBP1 RNA的水平可以促进脑星形胶质细胞转分化为神经元,这可用于治疗帕金森病等疾病。在一些实施方案中,靶RNA是PTBP1 RNA,指导多核苷酸的指导序列为SEQ ID NO:6。
在一些实施方案中,靶RNA是VEGFA RNA。使用本文所述的CRISPR-Cas13系统降低VEGFA RNA的水平可以防止脉络膜新生血管形成,这可以用于治疗诸如年龄相关性黄斑变性等疾病。
在一些实施方案中,靶RNA是ANGPTL3 RNA。使用本文所述的CRISPR-Cas13系统敲低ANGPTL3 RNA水平可以降低低密度脂蛋白胆固醇(LDL-C)等血脂,可用于治疗高脂血症、家族性高胆固醇血症等动脉粥样硬化性心血管疾病。在一些实施方案中,靶RNA是ANGPTL3 RNA,指导多核苷酸的指导序列选自SEQ ID NO:42-49中的任一种或几种。
治疗应用
本披露的另一方面涉及一种药物组合物,其包含本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒、或本文所述的真核细胞。所述药物组合物可以包含例如编码本文所述的Cas13蛋白或融合蛋白以及指导多核苷酸的AAV载体。所述药物组合物可以包含例如脂质纳米粒,该脂质纳米粒包含本文所述的指导多核苷酸和编码所述Cas13蛋白或融合蛋白的mRNA。所述药物组合物可以包含例如包含本文所述的指导多核苷酸和编码所述Cas13蛋白或融合蛋白的mRNA的慢病毒载体。所述药物组合物可以包含例如:包含本文所述的指导多核苷酸以及Cas13蛋白或融合蛋白的病毒样颗粒;或由所述指导多核苷酸以及Cas13蛋白或融合蛋白形成的核糖核蛋白复合物。
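Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticle described herein, the lentiviral vector described herein, the ribonucleoprotein complex described herein, the virus-like particle described herein, or the eukaryotic cell described herein for cleaving or editing a target RNA in a mammalian cell.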
本披露的另一方面涉及本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒、或本文所述的真核细胞在以下任一项的用途:切割一种或多种靶RNA分子或使一种或多种靶RNA分子产生切口(nicking),激活或上调一种或多种靶RNA分子,激活或抑制一种或多种靶RNA分子的翻译,使一种或多种靶RNA分子失活,可视化、标记或检测一种或多种靶RNA分子,结合一种或多种靶RNA分子,运输一种或多种靶RNA分子,以及掩蔽一种或多种靶RNA分子。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒、或本文所述的真核细胞在哺乳动物细胞中修饰一种或多种靶RNA分子的用途,所述修饰一种或多种靶RNA分子包括以下中的一种或多种:RNA碱基取代、RNA碱基缺失、RNA碱基插入、靶RNA的断裂、RNA甲基化和RNA去甲基化。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统、本文所述的Cas13蛋白、本文所述的融合蛋白、本文所述的指导多核苷酸、本文所述的核酸、本文所述的载体系统、本文所述的脂质纳米粒、本文所述的慢病毒载体、本文所述的核糖核蛋白复合物、本文所述的病毒样颗粒、或本文所述的真核细胞在诊断、治疗或预防与靶RNA相关的疾病或病症中的用途。在一些实施方案中,所述疾病或病症是帕金森病。在一些实施方案中,所述疾病或病症是帕金森病,并且靶RNA是PTBP1 RNA。在一些实施方案中,所述疾病或病症是青光眼。在一些实施方案中,所述疾病或病症是青光眼,并且靶RNA是AQp1 RNA。在一些实施方案中,所述疾病或病症是肌萎缩侧索硬化症。在一些实施方案中,所述疾病或病症是肌萎缩侧索硬化症,并且靶RNA是超氧化物歧化酶1(SOD1)RNA。在一些实施方案中,所述疾病或病症是年龄相关性黄斑变性,并且靶RNA是VEGFA RNA。在一些实施方案中,所述疾病或病症是年龄相关性黄斑变性,并且靶RNA是VEGFA RNA或VEGFR1 RNA。在一些实施方案中,所述疾病或病症是血浆LDL胆固醇水平升高。在一些实施方案中,所述疾病或病症是血浆LDL胆固醇水平升高,并且靶RNA是PCSK9 RNA或ANGPTL3 RNA。
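Another aspect of the present disclosure relates to use of the CRISPR-Cas13 system described herein, the Cas13 protein described herein, the fusion protein described herein, the guide polynucleotide described herein, the nucleic acid described herein, the vector system described herein, the lipid nanoparticle described herein, the lentiviral vector described herein, the ribonucleoprotein complex described herein, the virus-like particle described herein, or the eukaryotic cell described herein in the manufacture of a medicament for diagnosing, treating or preventing a disease or condition associated with a target RNA. In some embodiments, the disease or condition is Parkinson's disease. In some embodiments, the disease or condition is glaucoma. In some embodiments, the disease or condition is amyotrophic lateral sclerosis. In some embodiments, the disease or condition is age-related macular degeneration. In some embodiments, the disease or condition is elevated plasma LDL cholesterol. In some embodiments, the disease or condition is Parkinson's disease and the target RNA is PTBP1 RNA. In some embodiments, the disease or condition is glaucoma and the target RNA is AQP1 RNA. In some embodiments, the disease or condition is amyotrophic lateral sclerosis and the target RNA is superoxide dismutase 1 (SOD1) RNA. In some embodiments, the disease or condition is age-related macular degeneration and the target RNA is VEGFA RNA or VEGFR1 RNA. In some embodiments, the disease or condition is elevated plasma LDL cholesterol and the target RNA is PCSK9 RNA or ANGPTL3 RNA.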
在一些实施方案中,将药物组合物体内递送至人类受试者。所述药物组合物可以通过任何有效途径递送。示例性给药途径包括但不限于静脉内输注、静脉内注射、腹膜内注射、肌肉内注射、瘤内注射、皮下注射、皮内注射、心室内注射、血管内注射、小脑内注射、眼内注射、视网膜下注射、玻璃体内注射、前房内注射、鼓室内注射、鼻内给药和吸入。
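In some embodiments, the method of targeting RNA results in editing of the sequence of the target RNA. For example, by using a Cas13 protein or fusion protein with non-mutated HEPN domains and a guide polynucleotide comprising a guide sequence specific for the target RNA, the target RNA can be cleaved or nicked at a precise position (a nick being, for example, cleavage of either single strand when the target RNA is present as a double-stranded nucleic acid molecule). In some examples, this method is used to reduce expression of the target RNA, which in turn reduces translation of the corresponding protein. This method can be used in cells in which increased expression of the RNA is undesirable. In one example, the RNA is associated with a disease such as cystic fibrosis, Huntington's disease, Tay-Sachs disease, fragile X syndrome, fragile X-associated tremor/ataxia syndrome, muscular dystrophy, myotonic dystrophy, spinal muscular atrophy, spinocerebellar ataxia, age-related macular degeneration, or familial ALS. In another example, the RNA is associated with cancer (for example lung cancer, breast cancer, colon cancer, liver cancer, pancreatic cancer, prostate cancer, bone cancer, brain cancer, skin cancer (such as melanoma) or kidney cancer). Examples of target RNAs include, but are not limited to, those associated with cancer (for example, PD-L1, BCR-ABL, Ras, Raf, p53, BRCA1, BRCA2, CXCR4, β-catenin, HER2 and CDK4). Editing such target RNAs can produce a therapeutic effect.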
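In some embodiments, the RNA is expressed in immune cells. For example, the target RNA may encode a protein that suppresses a desired immune response (such as tumor infiltration); knocking down such an RNA (for example, PD1, CTLA4, LAG3 or TIM3) can promote that desired immune response. In another example, the target RNA encodes a protein that activates an undesired immune response, for example in autoimmune diseases such as multiple sclerosis, Crohn's disease, lupus or rheumatoid arthritis.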
诊断应用
本披露的另一方面涉及一种体外组合物,其包含本文所述的CRISPR-Cas13系统和不能与本文所述的指导多核苷酸杂交的标记的detector RNA。
本披露的另一方面涉及本文所述的CRISPR-Cas13系统在检测疑似包含靶RNA的核酸样品中的靶RNA的用途。
在一些实施方案中,检测靶RNA的方法包括与荧光蛋白或其他可检测标记融合的Cas13蛋白或融合蛋白以及包含对靶RNA特异的指导序列的指导多核苷酸。Cas13蛋白或融合蛋白与靶RNA的结合可以通过显微镜或其他成像方法进行可视化。在另一个例子中,RNA适体序列可以附加到或插入到指导多核苷酸,例如MS2、PP7、Qβ和其他适体。引入与这些适体特异性结合的蛋白质,例如与荧光蛋白或其他可检测标记融合的MS2噬菌体外壳蛋白可用于检测靶RNA,因为Cas13-指导-靶RNA复合物将通过适体相互作用而被标记。
在一些实施方案中,在无细胞系统中检测靶RNA的方法导致产生可检测的标记或酶活性。例如,通过使用Cas13蛋白、包含对靶RNA特异的指导序列的指导多核苷酸、和可检测标记,靶RNA将被Cas13识别。Cas13与靶RNA的结合会触发其RNase活性,这会导致靶RNA以及可检测标记的切割。
在一些实施方案中,可检测标记是与荧光探针和淬灭剂连接的RNA。完整的可检测RNA连接荧光探针和淬灭剂,抑制荧光。在可检测RNA被Cas13切割后,荧光探针从淬灭剂中释放出来并显示出荧光活性。这种方法可用于确定靶RNA是否存在于裂解的细胞样品、裂解的组织样品、血液样品、唾液样品、环境样品(例如水、土壤或空气样品)、或其他裂解的细胞或无细胞样品中。这种方法还可用于检测病原体,例如病毒或细菌,或诊断疾病状态,例如癌症。
在一些实施方案中,靶RNA的检测有助于诊断疾病和/或病理状态,或病毒或细菌感染的存在。例如,Cas13介导的非编码RNA如PCA3的检测,如果在患者尿液中检测到则可用于诊断前列腺癌。在另一个例子中,Cas13介导的lncRNA-AA174084的检测,其是一种胃癌的生物标志物,可用于诊断胃癌。
实施例
实验例1:C13-2蛋白的筛选
1、CRISPR和基因的注释
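Software was used to predict the proteins encoded across whole microbial genomes from the NCBI GenBank and CNGB (China National GeneBank) databases, and then to predict the CRISPR arrays on those genomes.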
2、蛋白的初步筛选
用聚类去除冗余的蛋白,同时过滤掉氨基酸序列长度小于800aa(氨基酸)或者大于1400aa的蛋白。
3、CRISPR相关蛋白的获得
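Protein sequences encoded within 10 kb upstream or downstream of a CRISPR array were aligned against known Cas13 proteins, and proteins with an E-value greater than 1×10⁻⁵ were filtered out. The remaining proteins were then compared against the NCBI NR database and the EBI patent database, highly similar proteins were filtered out, and candidate proteins were selected from the rest. After experimental validation, the C13-2 protein (SEQ ID NO:1, 893 aa) was finally obtained. The C13-2 protein is also referred to as the CasRfg.4 protein.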
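The numeric filters in steps 2 and 3 (length 800-1400 aa, within 10 kb of a CRISPR array, E-value no greater than 1×10⁻⁵ against known Cas13 proteins) can be sketched in Python as below. The record type, field names and example candidates are all hypothetical; the actual prediction, clustering and alignment software is not named in this document, so the sketch assumes its outputs are already available as plain numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    length_aa: int             # protein length in amino acids
    distance_to_array_bp: int  # distance to the nearest CRISPR array
    evalue: float              # best E-value against known Cas13 proteins


def passes_filters(c: Candidate, min_len: int = 800, max_len: int = 1400,
                   max_distance_bp: int = 10_000, max_evalue: float = 1e-5) -> bool:
    """Keep proteins of 800-1400 aa, within 10 kb of an array, with E-value <= 1e-5."""
    return (min_len <= c.length_aa <= max_len
            and c.distance_to_array_bp <= max_distance_bp
            and c.evalue <= max_evalue)


candidates = [  # hypothetical records for illustration only
    Candidate("hypothetical_A", 893, 1_200, 1e-30),
    Candidate("hypothetical_B", 650, 500, 1e-40),       # too short
    Candidate("hypothetical_C", 1_000, 25_000, 1e-20),  # too far from the array
    Candidate("hypothetical_D", 950, 3_000, 1e-3),      # weak similarity to Cas13
]
print([c.name for c in candidates if passes_filters(c)])  # ['hypothetical_A']
```

The later steps (removing proteins similar to NR or patent database entries, manual selection, experimental validation) are not modeled here.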
C13-2蛋白的基因组序列来源如表1所示。
表1.C13-2蛋白的基因组序列的来源.
C13-2蛋白的天然(野生型)DNA编码序列为SEQ ID NO:9。
C13-2蛋白的基因座结构如图1所示,包含CRISPR array和C13-2编码序列。
与C13-2联合使用的gRNA的同向重复(direct repeat,DR)序列或骨架序列(scaffold sequence)可以为:
5’-GGAAGAUAACUCUACAAACCUGUAGGGUUCUGAGAC-3’(SEQ ID NO:3)。
使用RNAfold预测得到上述同向重复序列的RNA二级结构如图2所示。
实验例2:C13-2蛋白的制备、分离和纯化
(一)载体构建
1、取pET28a载体质粒,经BamHI和XhoI双酶切后,琼脂糖凝胶电泳切胶回收线性化的载体,将包含蛋白编码序列(可编码C13-2蛋白以及核定位信号)的DNA片段通过同源重组的方式插入到载体pET28a的克隆区,反应液转化Stbl3感受态,涂布硫酸卡那霉素抗性的LB平板,37℃过夜培养后,挑取克隆测序鉴定。
构建好的重组载体命名为C13-2-pET28a(SEQ ID NO:10),该重组载体用于表达C13-2重组蛋白(SEQ ID NO:11),该C13-2重组蛋白架构为His tag-NLS-Cas13-SV40 NLS-nucleoplasmin NLS。
2、序列正确的阳性克隆过夜培养,提取质粒后转化表达菌株RIPL-BL21(DE3),涂布硫酸卡那霉素抗性的LB平板,37℃过夜培养。
(二)蛋白表达
1、挑取单克隆接种至5ml硫酸卡那霉素抗性的LB培养液,37℃过夜培养。
2、以1:100体积比转接种500ml硫酸卡那霉素抗性的TB培养液中,以220rpm的转速,37℃培养至OD值为0.6,加IPTG至终浓度0.2mM,16℃诱导24h。
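3. Collect the cells by centrifugation, rinse them with 15 ml PBS and centrifuge again, add lysis buffer and disrupt the cells by sonication, then centrifuge at 10,000 g for 30 min to obtain the supernatant containing the recombinant protein. After filtration through a 0.45 μm membrane, the supernatant is ready for column purification.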
(三)蛋白纯化
The protein was purified by IMAC (Ni Sepharose 6 Fast Flow, Cytiva) and HiTrap Heparin HP (Cytiva). The purified C13-2 recombinant protein ran as a single band on SDS-PAGE (as shown in Figure 3).
实验例3:外源基因编辑效率验证
1.合成靶向EGFP的待验证载体及对照载体
合成外源EGFP表达载体的序列如SEQ ID NO:13所示,质粒结构为CMV-EGFP。合成C13-2蛋白靶向EGFP的验证载体质粒,其全长序列如SEQ ID NO:14所示,质粒结构为CMV-C13-2-U6-gRNA。
使用EGFP作为外源的报告基因,其核酸序列(720bp)如SEQ ID NO:12所示。
靶向EGFP的指导序列为ugccguucuucugcuugucggccaugauau(SEQ ID NO:4)。
2.待验证载体转染293T细胞
将外源EGFP表达载体与C13-2蛋白靶向EGFP的验证载体质粒按照1:2(166ng:334ng)在24孔板中转染293T细胞。
转染方法如下所示:
1. Digest the 293T cells with trypsin (Trypsin 0.25%, EDTA, Thermo, 25200056), count the cells, and seed a 24-well plate with 2×10⁵ cells in 500 μL per well.
2、对于每个转染样品,请按照以下步骤准备复合物:
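a. For each well of the 24-well plate seeded with cells, dilute the 500 ng of plasmid DNA described above in 50 μL of serum-free Opti-MEM I Reduced Serum Medium (Thermo, 11058021) and mix gently;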
b、在使用前轻轻混合Lipofectamine 2000(Thermo,11668019),然后在每个孔中,即50μL的Opti-MEM I培养基中稀释1μL的Lipofectamine 2000。在室温下孵育5分钟。注意:在25分钟内继续执行步骤c;
c、孵育5分钟后,将稀释的DNA与稀释的Lipofectamine 2000合并。轻轻混合并在室温下孵育20分钟(溶液可能看起来混浊)。注意:复合物在室温下稳定6小时。将复合物加入293T细胞中并混合,48h后使用流式细胞仪进行检测。
3.流式细胞仪检测下调EGFP表达的效果
使用的细胞以及质粒说明如表2所示:
表2.实验分组
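Cells were digested with trypsin (Trypsin 0.25%, EDTA, Thermo) 48 h after transfection and centrifuged at 300 g for 5 min to remove the supernatant. The cells from each well were resuspended in 500 μL PBS, and EGFP fluorescence was measured by flow cytometry; cell debris was excluded by gating on FSC-A and SSC-A before the data were collected.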
收集记录FITC通道Mean-FITC-A结果,并按下述计算公式计算下调幅度:
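Let a be the GFP fluorescence of the EGFP group and x the GFP fluorescence of any other group. Knockdown (%) = (a - x) ÷ a × 100. The blank cell control group is excluded from the comparison. The knockdown results are shown in Table 3; when targeting EGFP, C13-2 reduced its expression by 64.49%.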
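The knockdown formula above translates directly into a short Python function. The function name and the fluorescence values in the example are hypothetical and are not the measurements behind Table 3.

```python
def knockdown_percent(control_mean: float, sample_mean: float) -> float:
    """Knockdown (%) = (a - x) / a * 100, where a is the EGFP-only control
    Mean-FITC-A and x is the Mean-FITC-A of the group being compared."""
    if control_mean <= 0:
        raise ValueError("control fluorescence must be positive")
    return (control_mean - sample_mean) / control_mean * 100


# Hypothetical readings: control a = 200, treated x = 50.
print(knockdown_percent(200.0, 50.0))  # 75.0
```

A negative result would indicate that the compared group fluoresces more strongly than the EGFP-only control.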
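Table 3. GFP fluorescence measured by flow cytometry
Experimental Example 4: Verification of editing efficiency on endogenous genes
1. Construction of vectors targeting AQP1 and PTBP1
The expression vector plasmid C13-2-BsaI was synthesized; its sequence is shown in SEQ ID NO:15.
The target nucleic acids selected were AQP1 (Aquaporin 1) and PTBP1 (Polypyrimidine Tract Binding Protein 1). AQP1 was tested in a 293T cell line overexpressing AQP1, and PTBP1 was tested in the 293T cell line.
The 293T cell line overexpressing AQP1 (293T-AQp1 cells) was constructed as follows: the vector Lv-AQp1-T2a-GFP (SEQ ID NO:16), which overexpresses the AQP1 gene and the EGFP gene, was constructed, with AQP1 and EGFP separated by a 2A peptide. The Lv-AQp1-T2a-GFP plasmid was packaged into lentivirus and used to transduce 293T cells, giving a cell line that stably overexpresses the AQP1 gene.
The gRNA guide sequence targeting AQP1 is:
AGGGCAGAACCGAUGCUGAUGAAGAC (SEQ ID NO:5).
The gRNA guide sequence targeting PTBP1 is:
GUGGUUGGAGAACUGGAUGUAGAUGGGCUG (SEQ ID NO:6).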
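Fragments targeting the target sites were obtained by primer annealing, using the following primers:
Targeting PTBP1
Forward primer: 5’-AGACGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’ (SEQ ID NO:22)
Reverse primer: 5’-AAAACAGCCCATCTACATCCAGTTCTCCAACCAC-3’ (SEQ ID NO:23)
Targeting AQP1
Forward primer: 5’-AGACAGGGCAGAACCGATGCTGATGAAGAC-3’ (SEQ ID NO:24)
Reverse primer: 5’-AAAAGTCTTCATCAGCATCGGTTCTGCCCT-3’ (SEQ ID NO:25)
The primer annealing reaction system is shown in the table below. The mixture was incubated at 95°C for 5 minutes in a PCR instrument and then immediately transferred to ice for 5 minutes, so that the primers annealed to each other to form double-stranded DNA with sticky ends.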
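The primer pairs listed above follow a regular pattern: the forward primer is a 4-nt prefix (AGAC) followed by the DNA version of the guide, and the reverse primer is AAAA followed by the reverse complement of the guide. The sketch below reproduces SEQ ID NO:22-25 from the two guide sequences. The function names are illustrative, and reading the 4-nt prefixes as the overhangs that pair with the Bsa I-digested backbone is an inference from the "sticky ends" wording rather than something the text states.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def annealing_oligos(guide_rna: str,
                     top_prefix: str = "AGAC",
                     bottom_prefix: str = "AAAA") -> tuple[str, str]:
    """Build the forward and reverse annealing primers for a guide sequence.

    The default prefixes are the ones seen in the C13-2 primers of this
    example; other backbones may need different prefixes.
    """
    guide_dna = guide_rna.upper().replace("U", "T")
    forward = top_prefix + guide_dna
    reverse = bottom_prefix + reverse_complement(guide_dna)
    return forward, reverse


ptbp1_guide = "GUGGUUGGAGAACUGGAUGUAGAUGGGCUG"  # SEQ ID NO:6
aqp1_guide = "AGGGCAGAACCGAUGCUGAUGAAGAC"       # SEQ ID NO:5

for name, guide in (("PTBP1", ptbp1_guide), ("AQP1", aqp1_guide)):
    forward, reverse = annealing_oligos(guide)
    print(name, forward, reverse)
```

Passing a different `top_prefix` covers backbones whose forward primer starts with another 4-nt sequence.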
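The synthesized C13-2-BsaI vector plasmid was digested with the Bsa I endonuclease, and the annealed product described above was ligated to the purified, recovered backbone with T4 DNA ligase. After transformation into E. coli, positive clones were selected and the plasmids were extracted to give the C13-2 vectors (vector plasmids for C13-2 targeting AQP1 or PTBP1), which were used in the C13-2 experimental groups below. The C13-2 vector has the structure CMV-C13-2-U6-gRNA and expresses the C13-2 protein together with a gRNA targeting AQP1 or PTBP1.
The following control vectors were prepared by conventional methods:
The CasRx-AQp1 plasmid (positive control vector for CasRx targeting AQP1) has the sequence shown in SEQ ID NO:17 and the plasmid structure CMV-CasRx-U6-gRNA. It contains a sequence encoding CasRx (amino acid sequence shown in SEQ ID NO:2).
The shRNA-AQp1 plasmid (positive control vector for shRNA targeting AQP1) has the sequence shown in SEQ ID NO:19 and expresses an shRNA molecule with the sequence CCACGACCCUCUUUGUCUUCACUCGAGUGAAGACAAAGAGGGUCGUGG (SEQ ID NO:7).
The shRNA-PTBP1 plasmid (positive control vector for shRNA targeting PTBP1) has the sequence shown in SEQ ID NO:20 and expresses an shRNA molecule with the sequence CAGCCCAUCUACAUCCAGUUCCUCGAGGAACUGGAUGUAGAUGGGCUG (SEQ ID NO:8).
The CasRx-blank plasmid (blank control vector, which expresses CasRx and a gRNA that targets neither AQP1 nor PTBP1) has the sequence shown in SEQ ID NO:21 and the plasmid structure CMV-CasRx-U6-gRNA.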
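2. Transfection of 293T cells and 293T-AQp1 cells with the test vectors
The control plasmid or the vector plasmid targeting AQP1 was transfected at 500 ng into 293T-AQp1 cells in a 24-well plate. The control plasmid or the vector plasmid targeting PTBP1 was transfected at 500 ng into 293T cells in a 24-well plate.
The transfection procedure was as follows:
1. The cells were digested with trypsin (Trypsin 0.25%, EDTA, Thermo) and counted, and 2×10^5 cells were seeded per well of a 24-well plate in 500 μL.
2. For each transfection sample, the complexes were prepared as follows:
a. For each well of the seeded 24-well plate, the plasmid DNA described above was diluted in 50 μL of serum-free Opti-MEM I reduced-serum medium (Thermo) and mixed gently;
b. Lipofectamine 2000 (Thermo, 11668019) was mixed gently before use, and for each well 1.8 μL of Lipofectamine 2000 was diluted in 50 μL of Opti-MEM I medium and incubated at room temperature for 5 minutes. Note: proceed to step c within 25 minutes;
c. After the 5-minute incubation, the diluted DNA was combined with the diluted Lipofectamine 2000, mixed gently, and incubated at room temperature for 20 minutes (the solution may appear cloudy). Note: the complexes are stable at room temperature for 6 hours.
The complexes were added to the cells and mixed.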
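3. qPCR measurement of target gene RNA
At 48 h after transfection, RNA was extracted from the cells with the SteadyPure Universal RNA Extraction Kit (AG21017), and the RNA concentration was measured with an ultra-micro spectrophotometer. The RNA was reverse-transcribed with the Evo M-MLV Mix Kit with gDNA Clean for qPCR, and the reverse transcription product was analyzed with the SYBR Green Premix Pro Taq HS qPCR Kit.
The qPCR primers were as follows:
Detecting PTBP1:
Forward primer: 5’-ATTGTCCCAGATATAGCCGTTG-3’ (SEQ ID NO:26)
Reverse primer: 5’-GCTGTCATTTCCGTTTGCTG-3’ (SEQ ID NO:27)
Detecting AQP1:
Forward primer: 5’-GCTCTTCTGGAGGGCAGTGG-3’ (SEQ ID NO:28)
Reverse primer: 5’-CAGTGTGACAGCCGGGTTGAG-3’ (SEQ ID NO:29)
Detecting the internal reference GAPDH:
Forward primer: 5’-CCATGGGGAAGGTGAAGGTC-3’ (SEQ ID NO:30)
Reverse primer: 5’-GAAGGGGTCATTGATGGCAAC-3’ (SEQ ID NO:31)
The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit, and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
Target RNA was quantified by the relative quantification method, that is, the 2^(-ΔΔCt) method, calculated as follows:
ΔCt = Ct(AQP1) - Ct(GAPDH), or Ct(PTBP1) - Ct(GAPDH)
ΔΔCt = ΔCt(test sample, e.g. C13-2) - ΔCt(CasRx-blank or C13-2-BsaI)
Relative RNA level = 2^(-ΔΔCt)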
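The three-step calculation above can be sketched as a small Python function. The Ct values used here are hypothetical and only illustrate the arithmetic; they are not measurements from Tables 4 or 5.

```python
def relative_rna_level(ct_target_sample: float, ct_gapdh_sample: float,
                       ct_target_control: float, ct_gapdh_control: float) -> float:
    """Relative target RNA level by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(GAPDH), computed separately for sample and control
    ddCt = dCt(sample) - dCt(control)
    """
    delta_ct_sample = ct_target_sample - ct_gapdh_sample
    delta_ct_control = ct_target_control - ct_gapdh_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles later in the
# edited sample than in the control, with an identical GAPDH Ct.
level = relative_rna_level(ct_target_sample=26.0, ct_gapdh_sample=18.0,
                           ct_target_control=24.0, ct_gapdh_control=18.0)
print(level)              # fraction of target RNA remaining relative to control
print((1 - level) * 100)  # corresponding knockdown in percent
```

A result of 1.0 means no change relative to the control group; values below 1.0 indicate knockdown.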
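The RNA levels of AQP1 and PTBP1 were calculated as described above. The results for PTBP1 targeting are shown in Table 4 and Figure 4. For AQP1 targeting, three independent biological replicates were performed (the transfections used the same batch of 293T cells), and the averaged results are shown in Table 5 and Figure 5.
Table 4. Knockdown results for PTBP1 RNA
Table 5. Knockdown results for AQP1 RNA
Note: "-" indicates not tested.
C13-2-BsaI and CasRx-blank are control groups that target neither AQP1 nor PTBP1. The results show that, combined with the gRNAs of this experimental example, C13-2 produced a clear editing effect: its editing activity against PTBP1 was high, and its editing activity against AQP1 was higher than that of CasRx.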
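Experimental Example 5: Comparison of editing efficiency with published Cas13 proteins
1. Construction of editing vectors targeting the endogenous gene PTBP1
The patent with publication number US10476825B2 discloses a Cas13 protein from BMZ-11B_GL0037771, referred to in this experimental example as C13-113 (its amino acid sequence is shown in SEQ ID NO:32 herein). The direct repeat sequence used for C13-113 in this experimental example is shown in SEQ ID NO:33.
GenBank discloses the Cas13 protein MBR0191107.1, referred to in this application as C13-114 (its amino acid sequence is shown in SEQ ID NO:34 herein). The direct repeat sequence used for C13-114 in this experimental example is shown in SEQ ID NO:35.
The C13-113 and C13-114 expression vectors, C13-113-BsaI (SEQ ID NO:36) and C13-114-BsaI (SEQ ID NO:37), were synthesized by a reagent company.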
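Following the method of Experimental Example 4, fragments targeting the target site were obtained by primer annealing, using the following primers:
Targeting PTBP1
C13-113:
Forward primer: 5’-CAACGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’ (SEQ ID NO:38)
Reverse primer: 5’-AAAACAGCCCATCTACATCCAGTTCTCCAACCAC-3’ (SEQ ID NO:23)
C13-114:
Forward primer: 5’-atctGTGGTTGGAGAACTGGATGTAGATGGGCTG-3’ (SEQ ID NO:39)
Reverse primer: 5’-AAAACAGCCCATCTACATCCAGTTCTCCAACCAC-3’ (SEQ ID NO:23)
Following the method of Experimental Example 4, the synthesized C13-113-BsaI and C13-114-BsaI plasmids were digested with the Bsa I endonuclease and ligated to the annealed products with T4 ligase. After transformation into E. coli, positive clones were selected and the plasmids were extracted. The plasmids were transfected into 293T cells (a different batch from that used in Experimental Example 4), and the cells were analyzed 72 h later. The blank control group was transfected with the C13-2-BsaI plasmid of Experimental Example 4.
At 72 h after transfection, RNA extraction, reverse transcription and qPCR (with the same primers as in Experimental Example 4) were carried out following the method of Experimental Example 4.
The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit, and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
The amount of PTBP1 RNA was calculated by the 2^(-ΔΔCt) method, as follows:
ΔCt = Ct(PTBP1) - Ct(GAPDH)
ΔΔCt = ΔCt(test sample) - ΔCt(C13-2-BsaI)
Relative RNA level = 2^(-ΔΔCt)
The results are shown in Table 6 and Figure 6.
Table 6. Comparison of knockdown results for PTBP1 RNA

The data in the table show that, combined with the gRNA of this experimental example targeting PTBP1, C13-2 achieved a high editing efficiency that exceeded those of C13-113 and C13-114. No appreciable editing was observed in the C13-113 group.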
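Experimental Example 6: Testing the off-target effects of C13-2
Potential off-targets were predicted with the EMBOSS water program and NCBI BLAST against the whole genome and all cDNA sequences of the target species (Homo sapiens), aligning both the forward and reverse strands of the gRNA guide sequence. The predictions were filtered to keep hits whose length differed from that of the gRNA guide sequence by no more than four bases and whose mismatches plus gaps totaled no more than four bases. The resulting potential off-target information is shown in Table 7.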
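The two filtering criteria above can be sketched as a simple predicate over alignment hits. The `Hit` record and its field names are hypothetical; in practice the values would be parsed from the EMBOSS water or BLAST output, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment hit of the guide against a genome or cDNA sequence.
    The field names are illustrative, not those of any tool's output."""
    gene: str
    target_length: int  # length of the predicted target, in bases
    mismatches: int
    gaps: int


def is_potential_off_target(hit: Hit, guide_length: int,
                            max_length_diff: int = 4,
                            max_mismatch_plus_gap: int = 4) -> bool:
    """Keep hits whose length is within 4 bases of the guide length and
    whose mismatches plus gaps total at most 4 bases."""
    length_ok = abs(guide_length - hit.target_length) <= max_length_diff
    edits_ok = hit.mismatches + hit.gaps <= max_mismatch_plus_gap
    return length_ok and edits_ok


# Hypothetical hits for a 30-nt guide.
hits = [
    Hit("GENE_A", target_length=29, mismatches=2, gaps=1),  # kept
    Hit("GENE_B", target_length=24, mismatches=1, gaps=0),  # too short
    Hit("GENE_C", target_length=30, mismatches=4, gaps=1),  # too many edits
]
kept = sorted({h.gene for h in hits if is_potential_off_target(h, guide_length=30)})
print(kept)
```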
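Table 7. Number of predicted potential off-target genes
The vector plasmid for C13-2 targeting PTBP1, the shRNA-PTBP1 plasmid (whose shRNA is referred to as "shRNA2" in this experimental example) and the CasRx-blank plasmid from Experimental Example 4 were used.
The CasRx-PTBP1 plasmid (positive control vector for CasRx targeting PTBP1) was prepared by conventional methods; its sequence is shown in SEQ ID NO:18 and its plasmid structure is CMV-CasRx-U6-gRNA.
A plasmid expressing shRNA1 was additionally constructed following the method of Experimental Example 4 (it differs from the shRNA-PTBP1 plasmid only in the guide sequence of the encoded shRNA, which also targets PTBP1; no additional g was added after the U6 promoter).
Following the method of Experimental Example 4, the above plasmids were each transfected into 293T cells in 24-well plates according to the Lipofectamine 2000 (Thermo) instructions. After 72 h, RNA was extracted with the SteadyPure Universal RNA Extraction Kit (AG21017).
The extracted RNA samples were subjected to PE150 RNA-Seq. The resulting fastq files were aligned to the reference genome of the target species with HISAT or STAR to give BAM files, and the expression levels of transcripts and genes were obtained with kallisto, RSEM or HTSeq.
Differential expression of each group relative to the CasRx-blank group was analyzed with DESeq2, limma-voom and edgeR. Genes meeting p.adj<0.05, |log2FoldChange|>=0.5 and baseMean>2.5 were treated as differentially expressed genes (DEGs). The DEG information obtained is as follows:
Table 8. Number of differentially expressed genes and their intersection with the predicted potential off-target genes

Note: Up indicates upregulated expression; Down indicates downregulated expression.
Sig indicates a significant difference in gene expression relative to the control group (the CasRx-blank group).
Isec is the number of genes in the intersection of the DEGs and the program-predicted "potential off-target genes".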
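The DEG thresholds and the Up, Down and Isec counts defined above can be sketched as follows. The input layout (a dictionary mapping each gene to its adjusted p-value, log2 fold change and base mean) is an assumption made for illustration, and the gene names and numbers are hypothetical rather than results from Table 8.

```python
def is_deg(padj: float, log2_fold_change: float, base_mean: float) -> bool:
    """Apply the thresholds stated in the text:
    p.adj < 0.05, |log2FoldChange| >= 0.5, baseMean > 2.5."""
    return padj < 0.05 and abs(log2_fold_change) >= 0.5 and base_mean > 2.5


def summarize_degs(results: dict[str, tuple[float, float, float]],
                   predicted_off_targets: set[str]) -> dict[str, int]:
    """Count up- and down-regulated DEGs and their overlap (Isec) with the
    predicted potential off-target genes."""
    up = {g for g, (padj, lfc, bm) in results.items()
          if is_deg(padj, lfc, bm) and lfc > 0}
    down = {g for g, (padj, lfc, bm) in results.items()
            if is_deg(padj, lfc, bm) and lfc < 0}
    isec = (up | down) & predicted_off_targets
    return {"Up": len(up), "Down": len(down), "Isec": len(isec)}


# Hypothetical differential-expression results: gene -> (padj, log2FC, baseMean)
results = {
    "GENE_A": (0.01, -1.2, 50.0),   # DEG, down
    "GENE_B": (0.20, -2.0, 80.0),   # fails the p.adj threshold
    "GENE_C": (0.001, 0.8, 10.0),   # DEG, up
    "GENE_D": (0.01, -0.7, 1.0),    # fails the baseMean threshold
}
summary = summarize_degs(results, predicted_off_targets={"GENE_A", "GENE_D"})
print(summary)
```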
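From the data above, the number of off-target sites in the intersection of transcriptome sequencing and off-target prediction was 0 for both CasRx and C13-2: C13-2 showed almost no off-target activity, whereas shRNA1 and shRNA2 had large numbers of off-target sites. In terms of off-target safety, C13-2 is superior to shRNA1 and shRNA2 and comparable to CasRx. In addition, C13-2 is only 893 aa long, much smaller than CasRx at 967 aa (74 aa shorter), which makes it easier to package into AAV together with the gRNA for delivery.
Experimental Example 7: Editing targeting ANGPTL3
This experimental example follows the method of Experimental Example 4.
The vector Lv-ANGPTL3-T2a-GFP (SEQ ID NO:52), which overexpresses the ANGPTL3 gene and the EGFP gene, was constructed, with ANGPTL3 and EGFP separated by a 2A peptide. The Lv-ANGPTL3-T2a-GFP plasmid was packaged into lentivirus and used to transduce 293T cells, giving a 293T cell line that stably overexpresses the ANGPTL3 gene (referred to as 293T-ANGPTL3 cells).
Fragments targeting the target sites were obtained by primer annealing.
The C13-2-BsaI vector (SEQ ID NO:15) was digested with the Bsa I endonuclease, and the annealed products were ligated to the purified, recovered backbone with T4 ligase. After transformation into E. coli, positive clones were selected and extracted to give the plasmids for C13-2 targeting ANGPTL3 (CMV drives expression of the C13-2 protein and the U6 promoter drives expression of the gRNA; the DR sequence of the gRNA is SEQ ID NO:3, and the gRNA guide sequences are listed in Table 9). The plasmids were transfected into 293T-ANGPTL3 cells. The negative control group was transfected with the C13-2-BsaI plasmid.
Table 9. gRNA guide sequences targeting ANGPTL3 RNA

At 72 h after transfection, RNA was extracted and reverse-transcribed, and the reverse transcription product was analyzed by qPCR.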
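The qPCR primers were as follows:
Detecting ANGPTL3:
Forward primer: 5’-CCAGAACACCCAGAAGTAACT-3’ (SEQ ID NO:50)
Reverse primer: 5’-TCTGTGGGTTCTTGAATACTAGTC-3’ (SEQ ID NO:51)
Detecting the internal reference GAPDH:
Forward primer: 5’-CCATGGGGAAGGTGAAGGTC-3’ (SEQ ID NO:30)
Reverse primer: 5’-GAAGGGGTCATTGATGGCAAC-3’ (SEQ ID NO:31)
The amount of target RNA was calculated by the relative quantification method, that is, the 2^(-ΔΔCt) method. Transfection of the recombinant plasmids, RNA extraction, reverse transcription and qPCR were all performed as three independent biological replicates, and the averaged results are shown in Table 10 and Figure 7.
Table 10. Knockdown results for ANGPTL3 RNA
The data show that C13-2 can effectively knock down ANGPTL3 RNA, with gRNA2, gRNA4, gRNA5 and gRNA6 giving pronounced editing effects.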
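Experimental Example 8: Editing with inactivating mutants of C13-2 (dC13-2)
The C13-2-VEGFA vector (SEQ ID NO:72) was constructed; it expresses the C13-2 protein and a gRNA targeting VEGFA whose guide sequence is TGGGTGCAGCCTGGGACCACTTGGCATGG (SEQ ID NO:73).
Verification vectors for R4xH mutants of C13-2 were then constructed from the C13-2-VEGFA vector by conventional homologous recombination, as listed in Table 11.
Relative to the C13-2 coding sequence of the C13-2-VEGFA vector, the mutant vectors carry the following mutations:
R210A+H215A: AGAAACGCCACCGCCCAC (SEQ ID NO:74) → GCAAACGCCACCGCCGCC (SEQ ID NO:75);
R750A+H755A: AGAAAGACCAAGAGACAC (SEQ ID NO:76) → GCAAAGACCAAGAGAGCC (SEQ ID NO:77); and/or
R785A+H790A: AGAAACGACGTGGAGCAC (SEQ ID NO:78) → GCAAACGACGTGGAGGCC (SEQ ID NO:79).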
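As a consistency check, the three nucleotide substitutions above can be translated with the standard genetic code to confirm that each one replaces the arginine and histidine of an RxxxxH motif with alanine. The sketch below builds the standard codon table from its conventional TCAG ordering; the helper names are illustrative.

```python
# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Wild-type and mutant 18-nt windows copied from SEQ ID NO:74-79.
windows = {
    "R210A+H215A": ("AGAAACGCCACCGCCCAC", "GCAAACGCCACCGCCGCC"),
    "R750A+H755A": ("AGAAAGACCAAGAGACAC", "GCAAAGACCAAGAGAGCC"),
    "R785A+H790A": ("AGAAACGACGTGGAGCAC", "GCAAACGACGTGGAGGCC"),
}
for name, (wild_type, mutant) in windows.items():
    print(name, translate(wild_type), "->", translate(mutant))
```

In each window only the first and last codons change (AGA to GCA, CAC to GCC), so the four residues between R and H are untouched.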
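Table 11. Vectors used to test inactivating mutations of C13-2
The vectors were transfected into the 293T cell line according to the Lipofectamine 2000 (Thermo) instructions. After 72 h, RNA was extracted with the SteadyPure Universal RNA Extraction Kit. RNA from three batches of experiments was sent to a sequencing company for RNA-seq, and the measured VEGFA RNA levels are shown in Table 12 below:
Table 12. VEGFA RNA levels determined by RNA-seq

The data in Table 12 show that editing activity remained high after the R750A+H755A mutations were introduced; weak editing activity was retained after the R210A+H215A and/or R785A+H790A mutations were introduced; and editing activity was almost completely lost only when R210A+H215A, R750A+H755A and R785A+H790A were introduced together.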
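Experimental Example 9: Testing of C13-2 truncations
Construction of verification vectors for C13-2 truncations targeting the endogenous gene VEGFA
Using three-fragment homologous recombination, the verification vectors for the truncations listed in Table 13 were constructed from the C13-2-VEGFA plasmid (SEQ ID NO:72). They differ from the C13-2-VEGFA vector only in that the C13-2 coding sequence is truncated, and they express the respective truncated proteins together with the gRNA targeting VEGFA.
Table 13. Verification vectors constructed for the C13-2 truncations

The verification vectors and control vectors were transfected into the 293T cell line according to the Lipofectamine 2000 (Thermo) instructions. After 72 h, RNA was extracted with the SteadyPure Universal RNA Extraction Kit and reverse-transcribed with the Evo M-MLV Mix Kit with gDNA Clean for qPCR. The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit, and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
The qPCR primers were as follows:
Detecting VEGFA: ACCTCCACCATGCCAAGTGG (SEQ ID NO:88)
CAGGGTCTCGATTGGATGGC (SEQ ID NO:89)
Detecting the internal reference GAPDH: CCATGGGGAAGGTGAAGGTC (SEQ ID NO:30)
GAAGGGGTCATTGATGGCAAC (SEQ ID NO:31)
Target RNA was quantified by the relative quantification method, that is, the 2^(-ΔΔCt) method. The experiment was repeated over multiple batches and the results were averaged, as shown in Table 14.
Table 14. Relative VEGFA RNA levels after editing with the truncations

The C13-2 truncations of this experimental example retain a degree of RNA editing activity.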
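Experimental Example 10: Testing of different DR (direct repeat) sequences
Verification vectors (listed in Table 16) were constructed that encode different DR sequences (listed in Table 15) and target the endogenous genes VEGFA and PTBP1.
Table 15. The different DR sequences designed
Table 16. Vector plasmids constructed to express the various gRNAs
The verification vectors in Table 16, which express the various crRNA sequences (5’-guide-DR-3’), were constructed by conventional methods from the C13-2-VEGFA vector of Experimental Example 8 and the vector plasmid for C13-2 targeting PTBP1 of Experimental Example 4; only the sequence of the crRNA expression cassette was replaced.
Following the methods of the preceding experimental examples, the verification vectors and control vectors were transfected into 293T cells, and 72 h later RNA was extracted, reverse-transcribed and analyzed with the qPCR kit. The qPCR primers were as follows:
Detecting VEGFA: ACCTCCACCATGCCAAGTGG (SEQ ID NO:88)
CAGGGTCTCGATTGGATGGC (SEQ ID NO:89)
Detecting PTBP1: ATTGTCCCAGATATAGCCGTTG (SEQ ID NO:26)
GCTGTCATTTCCGTTTGCTG (SEQ ID NO:27)
Detecting the internal reference GAPDH: CCATGGGGAAGGTGAAGGTC (SEQ ID NO:30)
GAAGGGGTCATTGATGGCAAC (SEQ ID NO:31)
The change in target RNA was calculated by the 2^(-ΔΔCt) method. The experiment was repeated over multiple batches and the results were averaged, as shown in Table 17, Table 18, Figure 9 and Figure 10.
Table 17. Relative VEGFA RNA levels after editing with the different gRNAs
Table 18. Relative PTBP1 RNA levels after editing with the different gRNAs
The results in Tables 17 and 18 show that editing efficiency was highest with DRrc or DR-hf2, which outperformed the other DR sequences (P<0.05). DR2rc, DR3 and DR4 also gave relatively high editing efficiency, better than DR2, DR3rc, DR4rc and DR-hf1.
The sequence alignment of the direct repeat sequences DRrc, DR-hf2 and DR2rc is shown in Figure 18.
The RNA secondary structure of the direct repeat sequence DR-hf2 predicted by RNAfold is shown in Figure 19.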
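Experimental Example 11: Comparison of C13-2 with mainstream Cas13 tools
All verification vectors and control vectors in this experimental example (Table 19) use the same backbone sequence as C13-2-BsaI (SEQ ID NO:15) and differ only in the Cas13 coding sequence and the crRNA coding sequence. The NLS sequences and their linker sequences are the same in every vector. All vectors have the structure CMV-NLS-Cas13-2×NLS-U6-crRNA: every Cas13 carries one NLS at the N-terminus and two NLSs at the C-terminus, for a total of 3×NLS.
Because 293T cells do not contain an EGFP sequence, vectors targeting GFP were used as negative controls.
The guide sequence targeting GFP is tgccgttcttctgcttgtcggccatgatat (SEQ ID NO:90).
The guide sequence targeting PTBP1 is:
GTGGTTGGAGAACTGGATGTAGATGGGCTG (SEQ ID NO:6).
The guide sequence targeting VEGFA is:
TGGGTGCAGCCTGGGACCACTTGGCATGG (SEQ ID NO:73).
Table 19. Verification vectors and control vectors constructed for comparison with mainstream Cas13 tools
To save space, the sequences of four of the vectors in Table 19 are given as examples (SEQ ID NO:91-94).
The verification vectors and control vectors were transfected into 293T cells. Untransfected 293T cells served as the blank control.
Transfection was performed according to the Lipofectamine 2000 (Thermo) instructions. After 48 h, RNA was extracted with the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was measured with an ultra-micro spectrophotometer. The RNA was reverse-transcribed with the Evo M-MLV Mix Kit with gDNA Clean for qPCR. The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit, and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
The qPCR primers were as follows:
Detecting VEGFA: ACCTCCACCATGCCAAGTGG (SEQ ID NO:88)
CAGGGTCTCGATTGGATGGC (SEQ ID NO:89)
Detecting PTBP1: ATTGTCCCAGATATAGCCGTTG (SEQ ID NO:26)
GCTGTCATTTCCGTTTGCTG (SEQ ID NO:27)
Detecting the internal reference GAPDH: CCATGGGGAAGGTGAAGGTC (SEQ ID NO:30)
GAAGGGGTCATTGATGGCAAC (SEQ ID NO:31)
The amount of target RNA was calculated by the 2^(-ΔΔCt) method, with each Cas13 protein's own GFP-targeting group as its negative control. The experiment was run over multiple batches and the results were averaged, as shown in Table 20, Table 21, Figure 11 and Figure 12.
Table 20. C13-2 and known Cas13 tools targeting VEGFA RNA
In the comparison at the VEGFA target, C13-2 edited very well and outperformed the current mainstream Cas13 editing tools, with editing efficiency ranked C13-2 > PspCas13b > Cas13X.1 > Cas13Y.1.
Table 21. C13-2 and known Cas13 tools targeting PTBP1 RNA

In the comparison at the PTBP1 target, C13-2 edited very well, with editing efficiency values ranked C13-2 > CasRx > PspCas13b > Cas13X.1 > Cas13Y.1. The editing efficiency of C13-2 was significantly better than that of PspCas13b, Cas13X.1 and Cas13Y.1 (P<0.05).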
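Experimental Example 12: Construction of a single-base editor from dC13-2
The target used to verify single-base editing in this experimental example is EGFP, and the guide sequence targeting it is tgccgttcttctgcttgtcggccatgatatagacgttgtggctgttgtagttgtactccagcttgtgccc (SEQ ID NO:95).
All dC13-2 vectors in this experimental example express the inactivated C13-2 mutant carrying the R210A+H215A, R750A+H755A and R785A+H790A mutations.
Table 22. Plasmid vectors related to single-base editing and their descriptions

The dC13-2-BsaI and dC13-2-EGFP vectors were constructed by conventional methods.
The single-base editing verification vectors dC13-2-ADAR-EGFP, dC13-2-A(EAAAK)3A-ADAR-EGFP and dC13-2-(GGGGS)3-ADAR-EGFP were obtained by homologous recombination from the dC13-2-BsaI plasmid, the dC13-2-EGFP plasmid and the pC0055-CMV-dPspCas13b-GS-ADAR2DD plasmid.
pC0055-CMV-dPspCas13b-GS-ADAR2DD was used as the positive control. Because this plasmid contains no gRNA expression cassette, the gRNA expression vector PLKO-PURO-PspGRNA-EGFP was synthesized by an outsourcing company.
Transfection was performed according to the Lipofectamine 2000 (Thermo) instructions. The vectors to be verified were transfected together with the EGFP reporter vector pAAV-CMV-EGFP at a 4:1 ratio, following the scheme in Table 23 below. Because pC0055-CMV-dPspCas13b-GS-ADAR2DD and PLKO-PURO-PspGRNA-EGFP-AD express the PspCas13b-ADAR protein and the gRNA separately, 200 ng of each was co-transfected.
Table 23. Transfection scheme for the single-base editing verification vectors and the reporter vector
At 48 h after transfection, RNA was extracted with the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was measured with an ultra-micro spectrophotometer. The RNA was reverse-transcribed with the Evo M-MLV RT-PCR universal reverse transcriptase kit, the reverse transcription product was amplified by PCR with the identification primers, and the PCR product was sent to a sequencing company for sequencing.
The identification primer sequences are as follows (product length 704 bp):
agggcgaggagctgtt (SEQ ID NO:103),
gtacagctcgtccatgccg (SEQ ID NO:104).
The sequencing results are shown in Figure 13. An A→G conversion occurred at the position on the target RNA corresponding to base 48 of the guide sequence, indicating that the dC13-2 editor constructed by the inventors successfully induced single-base editing.
No base conversion occurred in the negative control dC13-2-EGFP group, whereas the positive control dPspCas13b-ADAR group induced base conversion. Base conversion was induced whether dC13-2 and ADAR were joined without a linker peptide, with the rigid linker peptide A(EAAAK)3A, or with the flexible linker peptide (GGGGS)3.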
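Experimental Example 13: First round of C13-2 mutation
Rationale of the first round of mutation: the structure of C13-2 was predicted with AlphaFold v2.1, and N→A and R→A mutations were made within aa 1-89 and aa 263-417, which belong to the REC lobe.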
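Enumerating candidate N→A and R→A substitutions within given residue ranges can be sketched as below. The toy protein string and the ranges in the demonstration call are hypothetical; the actual design would scan SEQ ID NO:1 over aa 1-89 and aa 263-417, and the text does not say that every N or R in those ranges was mutated.

```python
def alanine_scan_candidates(protein: str,
                            regions: list[tuple[int, int]],
                            residues: tuple[str, ...] = ("N", "R")) -> list[str]:
    """List single-site alanine substitutions (e.g. 'R47A') for the chosen
    residue types inside 1-based, inclusive residue ranges."""
    names = []
    for start, end in regions:
        # Clamp the range to the protein length so short inputs do not fail.
        for position in range(start, min(end, len(protein)) + 1):
            amino_acid = protein[position - 1]
            if amino_acid in residues:
                names.append(f"{amino_acid}{position}A")
    return names


# Toy sequence and ranges, for illustration only.
toy_protein = "MRNAKRLN"
print(alanine_scan_candidates(toy_protein, regions=[(1, 4), (6, 8)]))
```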
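The mutants designed are listed in Table 24 below.
Table 24. First round of mutations
Construction of verification vectors
The first round of mutations was verified at the VEGFA target. The C13-2-VEGFA vector of the preceding experimental examples served as the vector encoding wild-type C13-2 and was modified with the Vazyme site-directed mutagenesis kit Mut Express II Fast Mutagenesis Kit V2 to give expression constructs (verification vectors) for each mutant, which express the C13-2 mutant and the gRNA targeting VEGFA. The primers used are listed in Table 25 below.
Table 25. Primer sequences used for the verification vectors


Transfection
The verification vectors and control vectors were transfected into 293T cells according to the Lipofectamine 2000 (Thermo) instructions. The C13-2-BsaI control group was transfected with the C13-2-BsaI vector of the preceding experimental examples and the WT control group with the C13-2-VEGFA vector; both express wild-type C13-2. A 293T cell control group that received no plasmid was also included.
qPCR measurement of target RNA levels
At 48 h after transfection, RNA was extracted from the cells with the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was measured with an ultra-micro spectrophotometer. The RNA was reverse-transcribed with the Evo M-MLV Mix Kit with gDNA Clean for qPCR, and the reverse transcription product was analyzed with the SYBR Green Premix Pro Taq HS qPCR Kit (Low Rox Plus).
The qPCR primers were SEQ ID NO:88, 89, 30 and 31.
The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit (Rox Plus), and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
Target RNA levels after editing were calculated by the 2^(-ΔΔCt) method. The experiment was repeated three times and the results were averaged, as shown in Table 26 and Figure 14.
Table 26. VEGFA RNA levels after targeted editing by the mutants, measured by qPCR

The qPCR results show that all point mutants retained high editing activity.
RNA-seq
The total RNA samples extracted after editing were subjected to RNA-seq. The library type was an lncRNA strand-specific library, the sequencing data volume was 16 G, and the sequencing strategy was PE150.
RNA-seq analysis workflow:
Data were quality-controlled with fastqc and multiqc, and low-quality reads were removed with fastp;
Reads aligning to human rRNA sequences were removed, and the remaining reads were aligned to the hg38 reference genome with HISAT2;
After alignment, gene expression was quantified with Kallisto and differential expression was analyzed with sleuth; genes with |b|>0.5, qval<0.05 and mean_obs>2 were treated as differentially expressed genes, with the 293T cell control group as the baseline;
The guide sequence was aligned to the reference cDNA with EMBOSS water; transcripts with >=18 aligned bases, <=6 mismatched bases and a minimum of >=8 consecutive paired bases were treated as predicted off-target transcripts, and the corresponding genes as predicted potential off-target genes;
The significantly downregulated differentially expressed genes were intersected with the predicted potential off-target genes, and the on-target VEGFA gene was removed to give the off-target gene set.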
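The last two steps of the workflow above can be sketched as follows. The sketch assumes each alignment is available as a pair of equal-length aligned strings with "-" for gaps, reads "aligned bases" as non-gap columns, and counts only substitutions as mismatches; these readings, the function names, and the example sequences and gene names are assumptions made for illustration.

```python
def alignment_metrics(aligned_guide: str, aligned_target: str) -> tuple[int, int, int]:
    """Return (aligned bases, mismatched bases, longest run of consecutive
    matches) for a pairwise alignment given as two gapped strings."""
    if len(aligned_guide) != len(aligned_target):
        raise ValueError("aligned strings must have equal length")
    aligned = mismatches = longest = run = 0
    for g, t in zip(aligned_guide.upper(), aligned_target.upper()):
        if g == "-" or t == "-":
            run = 0  # a gap interrupts a run of consecutive pairs
            continue
        aligned += 1
        if g == t:
            run += 1
            longest = max(longest, run)
        else:
            mismatches += 1
            run = 0
    return aligned, mismatches, longest


def is_predicted_off_target(aligned_guide: str, aligned_target: str) -> bool:
    """Thresholds from the text: >=18 aligned bases, <=6 mismatches,
    >=8 consecutive paired bases."""
    aligned, mismatches, longest = alignment_metrics(aligned_guide, aligned_target)
    return aligned >= 18 and mismatches <= 6 and longest >= 8


def off_target_gene_set(downregulated_degs: set[str],
                        predicted_off_targets: set[str],
                        on_target_gene: str) -> set[str]:
    """Intersect downregulated DEGs with predicted off-targets, then drop
    the on-target gene."""
    return (downregulated_degs & predicted_off_targets) - {on_target_gene}


# Hypothetical 20-nt alignment with two mismatches and no gaps.
print(alignment_metrics("ACGTACGTACGTACGTACGT", "ACGTACGTACTTACGTACGA"))
print(off_target_gene_set({"VEGFA", "GENE_X", "GENE_Y"}, {"VEGFA", "GENE_X"}, "VEGFA"))
```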
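Analysis of RNA-seq results
With the 293T cell control group as the baseline, the VEGFA expression level of each group was analyzed; the results are shown in Table 27 and Figure 15.
Table 27. VEGFA editing efficiency measured by RNA-seq

The RNA-seq results were largely consistent with the qPCR results.
Combined with the gRNA of this experimental example, the editing activity of variants M02, M07, M08, M10, M12, M14, M15 and M18 was slightly higher than that of wild-type C13-2.
The numbers of differentially expressed genes and off-target genes are shown in Table 28.
The number of differentially expressed genes downregulated in the cells after editing was lower in the M04, M09, M17, M22, M25, M27 and M28 groups than in the WT group.
The number of off-target genes determined after taking the intersection was 0 for groups M01 to M28, that is, no off-targets.
Table 28. Differentially expressed genes and off-target genes for the different mutants targeting VEGFA
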
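Experimental Example 14: Second round of C13-2 mutation
Mutant design
Five low-off-target mutation sites (M09 N87A, M17 R308A, M28 N394A, M04 R47A, M13 R290A) were identified from the results of Experimental Example 13. These five sites were combined with one another or with other mutation sites. Conservative mutations at conserved positions were also designed. The mutants are listed in Table 29.
Table 29. Mutants designed in the second round

Construction of verification vectors
The second round of mutations was tested at the human AR (androgen receptor) target. The C13-2-AR-h3 plasmid vector was synthesized (SEQ ID NO:161; it expresses wild-type C13-2 and the h3 gRNA targeting AR, whose guide sequence is SEQ ID NO:162, that is, ATAACATTTCCGAAGACGACAAGAT).
The C13-2-AR-h3 plasmid vector was modified with the Vazyme site-directed mutagenesis kit Mut Express MultiS Fast Mutagenesis Kit V2 to give expression constructs (verification vectors) for each mutant, which express the C13-2 mutant and the h3 gRNA. The primers used are listed in Table 30 below.
Table 30. Primers used to construct the mutant expression vectors


Transfection
The verification vectors and control vectors were transfected into 293T cells according to the Lipofectamine 2000 (Thermo) instructions. The C13-2-BsaI control group was transfected with the C13-2-BsaI vector of the preceding experimental examples and the WT control group with the C13-2-AR-h3 vector; both express wild-type C13-2. A 293T cell control group that received no plasmid was also included.
qPCR measurement of target RNA levels
At 48 h after transfection, RNA was extracted from the cells with the SteadyPure Universal RNA Extraction Kit, and the RNA concentration was measured with an ultra-micro spectrophotometer. The RNA was reverse-transcribed with the Evo M-MLV Mix Kit with gDNA Clean for qPCR, and the reverse transcription product was analyzed with the SYBR Green Premix Pro Taq HS qPCR Kit (Low Rox Plus).
The qPCR primers were as follows:
Detecting AR: CCAGGGACCATGTTTTGCC (SEQ ID NO:275)
CGAAGACGACAAGATGGACAA (SEQ ID NO:276)
Detecting the internal reference GAPDH: SEQ ID NO:30 and 31.
The reaction system was prepared according to the instructions of the SYBR Green Premix Pro Taq HS qPCR Kit (Rox Plus), and detection was performed on a QuantStudio™ 5 Real-Time PCR System.
Target RNA levels after editing were calculated by the 2^(-ΔΔCt) method. The experiment was repeated three times and the results were averaged, as shown in Table 31 and Figure 16.
Table 31. AR RNA levels after targeted editing by the mutants, measured by qPCR
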
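RNA-seq
The total RNA samples extracted after editing were subjected to RNA-seq. The library type was an lncRNA strand-specific library, the sequencing data volume was 16 G, and the sequencing strategy was PE150.
RNA-seq analysis workflow:
Data were quality-controlled with fastqc and multiqc, and low-quality reads were removed with fastp;
Reads aligning to human rRNA sequences were removed, and the remaining reads were aligned to the hg38 reference genome with HISAT2;
After alignment, gene expression was quantified with Kallisto and differential expression was analyzed with sleuth; genes with |b|>0.5, qval<0.05 and mean_obs>2 were treated as differentially expressed genes;
The guide sequence was aligned to the reference cDNA with EMBOSS water; transcripts with >=18 aligned bases, <=6 mismatched bases and a minimum of >=8 consecutive paired bases were treated as predicted off-target transcripts, and the corresponding genes as predicted off-target genes;
The significantly downregulated differentially expressed genes were intersected with the predicted off-target genes, and the on-target AR gene was removed to give the off-target gene set.
Analysis of RNA-seq results
With the 293T cell control group as the baseline, the AR expression level and editing efficiency (mean values) of each group were analyzed; the results are shown in Table 32 and Figure 17.
Table 32. AR editing efficiency measured by RNA-seq

The RNA-seq results were largely consistent with the qPCR results.
Combined with the gRNA of this experimental example, the editing activity of variants M2-1, M2-2, M2-3, M2-4, M2-9, M2-10, M2-16, M2-17, M2-18, M2-19, M2-23, M2-24, M2-25, M2-26, M2-32, M2-33, M2-34, M2-35, M2-39, M2-40 and M2-41 was higher than that of wild-type C13-2.
The numbers of differentially expressed genes and off-target genes are shown in Table 33.
The number of differentially expressed genes downregulated in the cells after editing was lower than in the WT group for groups M2-1, M2-2, M2-3, M2-4, M2-5, M2-6, M2-7, M2-8, M2-10, M2-13, M2-14, M2-15, M2-16, M2-21, M2-22, M2-23, M2-24, M2-25, M2-26, M2-27, M2-28, M2-29, M2-30, M2-31, M2-32, M2-33, M2-34, M2-35, M2-36, M2-37, M2-38, M2-39, M2-40 and M2-41.
The number of off-target genes determined after taking the intersection was lower than in the WT group for groups M2-1, M2-2, M2-3, M2-4, M2-5, M2-6, M2-7, M2-8, M2-10, M2-13, M2-14, M2-15, M2-16, M2-21, M2-22, M2-23, M2-24, M2-25, M2-26, M2-27, M2-28, M2-29, M2-30, M2-31, M2-32, M2-33, M2-35, M2-36, M2-37, M2-38, M2-39, M2-40 and M2-41. Of these, M2-1, M2-6, M2-7, M2-14, M2-15, M2-22, M2-31, M2-38 and M2-39 had no off-targets.
Table 33. Differentially expressed genes and off-target genes for the mutants targeting AR
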
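Experimental Example 15: Testing of target protein levels
The C13-2-VEGFA vector of the preceding experimental examples and the negative control vector C13-2-BsaI were transfected into 293T cells (Lipofectamine 2000, Thermo Fisher), and the supernatant was collected after 72 h of culture at 37°C. VEGFA protein was measured with the Human VEGF-A (Vascular Endothelial Cell Growth Factor A) ELISA Kit from Elabscience, which showed that VEGFA protein expression was downregulated by 97.4% relative to the negative control group.
In 293T cells edited with a vector expressing CasRx and a gRNA containing the same guide sequence, VEGFA protein expression was downregulated by 75.7% relative to the CasRx negative control group.
Sequences: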

Claims (48)

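  1. A Cas13 protein, characterized in that its amino acid sequence has at least 90% sequence identity to SEQ ID NO:1.
  2. The Cas13 protein according to claim 1, characterized in that the Cas13 protein is capable of forming a CRISPR complex with a guide polynucleotide, the guide polynucleotide comprising a direct repeat sequence linked to a guide sequence, and the guide sequence being engineered to direct sequence-specific binding of the CRISPR complex to a target RNA.
  3. The Cas13 protein according to claim 1, characterized in that the Cas13 protein comprises one or more mutations in a catalytic domain and has reduced RNA cleavage activity.
  4. The Cas13 protein according to claim 1, characterized in that the Cas13 protein comprises one or more mutations in one or both HEPN domains and substantially lacks RNA cleavage activity.
  5. The Cas13 protein according to claim 1, characterized in that it has at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to SEQ ID NO:1.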
  6. The Cas13 protein according to claim 1, characterized in that the Cas13 protein comprises at least one mutation at a position corresponding to amino acid residues 40-91, 146-153, 158-176, 182-209, 216-253, 271-287, 341-353, 379-424, 456-477, 521-557, 575-588, 609-625, 700-721, 724-783, 796-815, 828-852 or 880-893 of the reference protein whose amino acid sequence is shown in SEQ ID NO:1; and/or,
    the Cas13 protein comprises any one or more mutations at positions corresponding to the following amino acid residues of the reference protein whose amino acid sequence is shown in SEQ ID NO:1: R11, N34, R35, R47, R58, R63, R64, N68, N87, N265, N274, R276, R290, R294, N299, N303, R308, R314, R320, R328, N332, R341, N346, R358, N372, N383, N390, N394, R47+R290, R47+R314, R290+R314, R47+R290+R314, R308+N68, N394+N68, N87+N68, R308+N265, N394+N265, N87+N265, R308+N68+N265, N87+N68+N265, T7, A16, S260, A263, M266, N274, F288, M302, N303, L304, V305, I311, D313, H324, P326, H327, N332, N346, T353, T360, E365, A373, M380, S382, K395, Y396, D402, D411 and S418;
    preferably, the mutation is R11A, N34A, R35A, R47A, R58A, R63A, R64A, N68A, N87A, N265A, N274A, R276A, R290A, R294A, N299A, N303A, R308A, R314A, R320A, R328A, N332A, R341A, N346A, R358A, N372A, N383A, N390A, N394A, R47A+R290A, R47A+R314A, R290A+R314A, R47A+R290A+R314A, R308A+N68A, N394A+N68A, N87A+N68A, R308A+N265A, N394A+N265A, N87A+N265A, R308A+N68A+N265A, N87A+N68A+N265A, T7S, A16S, S260E, A263K, M266I, N274K, F288Y, M302F, N303S, L304I, V305K, I311M, D313E, H324Y, P326S, H327V, N332Y, N346D, T353L, T360S, E365D, A373E, M380K, S382R, K395G, Y396D, D402L, D411E and S418K.
  7. The Cas13 protein according to claim 1, characterized in that the Cas13 protein comprises any one or more mutations at positions corresponding to the following amino acid residues of the reference protein whose amino acid sequence is shown in SEQ ID NO:1: N34, R64, N68, N265, R276, R294, N299, R314, R47+R290, R47+R314, R290+R314, R47+R290+R314, N394+N265, N87+N265, A263, M266, N274, F288, V305, I311, D313, H324, T360, E365, A373, M380, D402 and D411;
    preferably, the mutation is N34A, R64A, N68A, N265A, R276A, R294A, N299A, R314A, R47A+R290A, R47A+R314A, R290A+R314A, R47A+R290A+R314A, N394A+N265A, N87A+N265A, A263K, M266I, N274K, F288Y, V305K, I311M, D313E, H324Y, T360S, E365D, A373E, M380K, D402L and D411E.
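  8. The Cas13 protein according to any one of claims 1-7, characterized in that the Cas13 protein is derived from a species comprising a genome having an average nucleotide identity (ANI) of ≥95% with the genome under accession number CNA0009596 in the CNGB database (China National GeneBank).
  9. A fusion protein, characterized in that it comprises the Cas13 protein according to any one of claims 1-8, or a functional fragment thereof, fused to a protein domain and/or a polypeptide tag; optionally, the fusion does not alter the original function of the Cas13 protein and/or functional fragment.
  10. The fusion protein according to claim 9, characterized in that the Cas13 protein or functional fragment thereof is fused to a nuclear localization signal (NLS).
  11. The fusion protein according to claim 9, characterized in that the Cas13 protein or functional fragment thereof is fused to a nuclear export signal (NES).
  12. The fusion protein according to claim 9, characterized in that the Cas13 protein or functional fragment thereof is covalently linked to the protein domain.
  13. The fusion protein according to claim 9, characterized in that the Cas13 protein or functional fragment thereof is fused to any one or more protein domains and/or polypeptide tags selected from the following: a cytosine deaminase domain, an adenosine deaminase domain, a translation activation domain, a translation repression domain, an RNA methylation domain, an RNA demethylation domain, a nuclease domain, a splicing factor domain, a reporter domain, an affinity domain, a subcellular localization signal, a reporter tag and an affinity tag.
  14. The fusion protein according to any one of claims 9-13, characterized in that the structure of the fusion protein is NLS-Cas13 protein-SV40 NLS-nucleoplasmin NLS.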
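  15. A guide polynucleotide, characterized in that it comprises (i) a direct repeat sequence having at least 50% sequence identity to any one of SEQ ID NO:3 and SEQ ID NO:80-87, the direct repeat sequence being linked to (ii) a guide sequence engineered to hybridize with a target RNA, wherein the guide polynucleotide is capable of forming a CRISPR complex with a Cas13 protein and directing sequence-specific binding of the CRISPR complex to the target RNA; preferably, the Cas13 protein is Cas13a, Cas13b, Cas13c or Cas13d; more preferably, the Cas13 protein has at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence shown in SEQ ID NO:1.
    Optionally, the base of the direct repeat sequence at the position corresponding to position 26 of SEQ ID NO:3 is A;
    optionally, the direct repeat sequence is GGAAGATN1ACTCTACAAACCTGTAGN2GN3N4N5N6N7N8N9N10N11, wherein N1 and N3-N11 are each selected from A, C, G and T, and N2 is selected from A and G;
    further optionally, the direct repeat sequence is GGAAGATN12ACTCTACAAACCTGTAGN13GN14N15N16N17N18N19N20N21N22, wherein N12, N13, N19 and N21 are each selected from A and G, N14 is selected from A and T, N15 and N16 are each selected from C and T, N17 and N18 are each selected from G and T, and N20 and N22 are each selected from C and G.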
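  16. The guide polynucleotide according to claim 15, characterized in that the direct repeat sequence has at least 80%, at least 90% or at least 95% sequence identity to any one of SEQ ID NO:3, 81, 82, 84 and 87.
  17. The guide polynucleotide according to claim 15, characterized in that the guide sequence is located at the 3' end of the direct repeat sequence.
  18. The guide polynucleotide according to claim 15, characterized in that the guide sequence comprises 15-35 nucleotides.
  19. The guide polynucleotide according to claim 15, characterized in that the guide sequence hybridizes with the target RNA with no more than one nucleotide mismatch.
  20. The guide polynucleotide according to claim 15, characterized in that the direct repeat sequence comprises 25 to 40 nucleotides.
  21. The guide polynucleotide according to claim 15, characterized in that the guide polynucleotide further comprises an aptamer sequence.
  22. The guide polynucleotide according to claim 21, characterized in that the aptamer sequence is inserted into a loop of the guide polynucleotide.
  23. The guide polynucleotide according to claim 21, characterized in that the aptamer sequence comprises an MS2 aptamer sequence, a PP7 aptamer sequence or a Qβ aptamer sequence.
  24. The guide polynucleotide according to claim 15, characterized in that the guide polynucleotide comprises modified nucleotides.
  25. The guide polynucleotide according to claim 24, characterized in that the modification comprises a 2'-O-methyl, 2'-O-methyl-3'-phosphorothioate or 2'-O-methyl-3'-thioPACE modification.
  26. The guide polynucleotide according to any one of claims 15-25, characterized in that the target RNA is located in the nucleus of a eukaryotic cell.
  27. The guide polynucleotide according to any one of claims 15-26, characterized in that the target RNA is selected from TTR RNA, SOD1 RNA, PCSK9 RNA, VEGFA RNA, VEGFR1 RNA, PTBP1 RNA, AQP1 RNA or ANGPTL3 RNA; preferably, the target RNA is selected from PTBP1 RNA, AQP1 RNA or ANGPTL3 RNA.
  28. The guide polynucleotide according to claim 27, characterized in that the guide sequence is selected from the sequences shown in SEQ ID NO:5-6 and SEQ ID NO:42-49; preferably, the guide sequence is selected from the sequences shown in SEQ ID NO:5-6, SEQ ID NO:43 and SEQ ID NO:45-47.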
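  29. A CRISPR-Cas13 system, characterized in that it comprises:
    the Cas13 protein according to any one of claims 1-8 or the fusion protein according to any one of claims 9-14, or a nucleic acid encoding the Cas13 protein or fusion protein; and
    a guide polynucleotide or a nucleic acid encoding the guide polynucleotide, the guide polynucleotide comprising a direct repeat sequence linked to a guide sequence, and the guide sequence being engineered to hybridize with a target RNA;
    wherein the guide polynucleotide is capable of forming a CRISPR complex with the Cas13 protein or fusion protein and directing sequence-specific binding of the CRISPR complex to the target RNA.
  30. The CRISPR-Cas13 system according to claim 29, characterized in that the target RNA is selected from TTR RNA, SOD1 RNA, PCSK9 RNA, VEGFA RNA, VEGFR1 RNA, PTBP1 RNA, AQP1 RNA or ANGPTL3 RNA; optionally, the target RNA is selected from PTBP1 RNA, AQP1 RNA or ANGPTL3 RNA.
  31. The CRISPR-Cas13 system according to claim 30, characterized in that the guide sequence is selected from the sequences shown in SEQ ID NO:5-6 and SEQ ID NO:42-49; preferably, the guide sequence is selected from the sequences shown in SEQ ID NO:5-6, SEQ ID NO:43 and SEQ ID NO:45-47.
  32. The CRISPR-Cas13 system according to any one of claims 29-31, characterized in that the direct repeat sequence has at least 70% sequence identity to any one of SEQ ID NO:3 and SEQ ID NO:80-87.
  33. A CRISPR-Cas13 system, characterized in that it comprises the guide polynucleotide according to any one of claims 15-28 or a nucleic acid encoding it, and a Cas13 protein or a nucleic acid encoding it.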
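  34. A vector system comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the vector system comprises one or more vectors, and the vectors comprise a polynucleotide sequence encoding the Cas13 protein or fusion protein and a polynucleotide sequence encoding the guide polynucleotide.
  35. An adeno-associated virus vector comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the adeno-associated virus vector comprises DNA encoding the Cas13 protein or fusion protein and the guide polynucleotide.
  36. A lipid nanoparticle comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the lipid nanoparticle comprises the guide polynucleotide and mRNA encoding the Cas13 protein or fusion protein.
  37. A lentiviral vector comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the lentiviral vector comprises the guide polynucleotide and mRNA encoding the Cas13 protein or fusion protein; preferably, the lentiviral vector is pseudotyped with an envelope protein; optionally, the mRNA encoding the Cas13 protein or fusion protein is linked to an aptamer sequence.
  38. A ribonucleoprotein complex comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the ribonucleoprotein complex is formed by the guide polynucleotide and the Cas13 protein or fusion protein.
  39. A virus-like particle comprising the CRISPR-Cas13 system according to any one of claims 29-33, wherein the virus-like particle comprises a ribonucleoprotein complex formed by the guide polynucleotide and the Cas13 protein or fusion protein; optionally, the Cas13 protein or fusion protein is fused to a gag protein.
  40. A eukaryotic cell comprising the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, or the CRISPR-Cas13 system according to any one of claims 29-33; optionally, the eukaryotic cell is a mammalian cell.
  41. A pharmaceutical composition, characterized in that it comprises the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, or the CRISPR-Cas13 system according to any one of claims 29-33.
  42. An in vitro composition, characterized in that it comprises the CRISPR-Cas13 system according to any one of claims 29-32 and a labeled detector RNA that cannot hybridize with the guide polynucleotide.
  43. An isolated nucleic acid, characterized in that it encodes the Cas13 protein according to any one of claims 1-8 or the fusion protein according to any one of claims 9-14.
  44. An isolated nucleic acid, characterized in that it encodes the guide polynucleotide according to any one of claims 15-28.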
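  45. Use of the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, the CRISPR-Cas13 system according to any one of claims 29-33, or the isolated nucleic acid according to claim 42 or claim 43, in detecting a target RNA in a nucleic acid sample suspected of containing the target RNA, or in the preparation of a reagent for detecting a target RNA in a nucleic acid sample suspected of containing the target RNA.
  46. Use of the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, the CRISPR-Cas13 system according to any one of claims 29-33, or the isolated nucleic acid according to claim 42 or claim 43, in any one of the following, or in the preparation of a reagent for achieving any one of the following: cleaving one or more target RNA molecules or nicking one or more target RNA molecules; activating or upregulating one or more target RNA molecules; activating or inhibiting translation of one or more target RNA molecules; inactivating one or more target RNA molecules; visualizing, labeling or detecting one or more target RNA molecules; binding one or more target RNA molecules; transporting one or more target RNA molecules; and masking one or more target RNA molecules.
  47. A method for diagnosing, treating or preventing a disease or condition associated with a target RNA, characterized in that the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, the CRISPR-Cas13 system according to any one of claims 29-33, or the isolated nucleic acid according to claim 42 or claim 43 is administered to a sample from a subject in need thereof or to a subject in need thereof.
  48. Use of the Cas13 protein according to any one of claims 1-8, the fusion protein according to any one of claims 9-14, the guide polynucleotide according to any one of claims 15-28, the CRISPR-Cas13 system according to any one of claims 29-33, or the isolated nucleic acid according to claim 42 or claim 43, in the preparation of a medicament for diagnosing, treating or preventing a disease or condition associated with a target RNA.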
PCT/CN2023/115093 2022-08-26 2023-08-25 CRISPR-Cas13 system and use thereof Ceased WO2024041653A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
CN202380014099.8A CN118159650A (zh) 2022-08-26 2023-08-25 CRISPR-Cas13 system and use thereof
AU2023328197A AU2023328197A1 (en) 2022-08-26 2023-08-25 Crispr-cas13 system and use thereof
IL319213A IL319213A (en) 2022-08-26 2023-08-25 The CRISPR-CAS13 system and its use
EP23856738.2A EP4578945A4 (en) 2022-08-26 2023-08-25 CRISPR-CAS13 SYSTEM AND ITS USE
JP2024550764A JP2025511466A (ja) 2022-08-26 2023-08-25 CRISPR-Cas13システム及びその使用
KR1020257009723A KR20250053925A (ko) 2022-08-26 2023-08-25 CRISPR-Cas13 시스템 및 이의 용도
US18/755,750 US12297450B2 (en) 2022-08-26 2024-06-27 CRISPR-Cas13 system and use thereof
ZA2024/07040A ZA202407040B (en) 2022-08-26 2024-09-12 Crispr-cas13 system and use thereof
MX2025002272A MX2025002272A (es) 2022-08-26 2025-02-25 Sistema de crispr-cas13 y uso del mismo
US19/194,365 US20250250590A1 (en) 2022-08-26 2025-04-30 Crispr-cas13 system and use thereof

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202211035342.8 2022-08-26
CN202211035342 2022-08-26
CN202310457880 2023-04-24
CN202310457880.4 2023-04-24

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/755,750 Continuation US12297450B2 (en) 2022-08-26 2024-06-27 CRISPR-Cas13 system and use thereof

Publications (1)

Publication Number Publication Date
WO2024041653A1 true WO2024041653A1 (zh) 2024-02-29

Family

ID=90012578

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/115093 Ceased WO2024041653A1 (zh) 2022-08-26 2023-08-25 CRISPR-Cas13 system and use thereof

Country Status (10)

Country Link
US (2) US12297450B2 (zh)
EP (1) EP4578945A4 (zh)
JP (1) JP2025511466A (zh)
KR (1) KR20250053925A (zh)
CN (1) CN118159650A (zh)
AU (1) AU2023328197A1 (zh)
IL (1) IL319213A (zh)
MX (1) MX2025002272A (zh)
WO (1) WO2024041653A1 (zh)
ZA (1) ZA202407040B (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119301245A (zh) * 2024-06-17 2025-01-10 北大荒垦丰种业股份有限公司 效应核酸酶及其应用
CN118939854A (zh) * 2024-10-15 2024-11-12 上海焕一生物科技有限公司 多识别码系统的数据库融合方法、系统、介质及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016123230A1 (en) 2015-01-28 2016-08-04 Pioneer Hi-Bred International, Inc. Crispr hybrid dna/rna polynucleotides and methods of use
US20190062724A1 (en) * 2017-08-22 2019-02-28 Salk Institute For Biological Studies Rna targeting methods and compositions
WO2020160150A1 (en) * 2019-01-29 2020-08-06 The Regents Of The University Of California Rna-targeting cas enzymes
CN113234702A (zh) * 2021-03-26 2021-08-10 珠海舒桐医疗科技有限公司 一种Lt1Cas13d蛋白及基因编辑系统
CN113544267A (zh) * 2019-01-14 2021-10-22 罗切斯特大学 使用CRISPR-Cas进行靶向核RNA裂解和聚腺苷酸化

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11104937B2 (en) * 2017-03-15 2021-08-31 The Broad Institute, Inc. CRISPR effector system based diagnostics
US12553063B2 (en) * 2019-04-12 2026-02-17 University Of Massachusetts CAS13 family AAV vectors and uses thereof

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016123230A1 (en) 2015-01-28 2016-08-04 Pioneer Hi-Bred International, Inc. Crispr hybrid dna/rna polynucleotides and methods of use
US20190062724A1 (en) * 2017-08-22 2019-02-28 Salk Institute For Biological Studies Rna targeting methods and compositions
US10476825B2 (en) 2017-08-22 2019-11-12 Salk Institue for Biological Studies RNA targeting methods and compositions
CN113544267A (zh) * 2019-01-14 2021-10-22 罗切斯特大学 使用CRISPR-Cas进行靶向核RNA裂解和聚腺苷酸化
WO2020160150A1 (en) * 2019-01-29 2020-08-06 The Regents Of The University Of California Rna-targeting cas enzymes
CN113234702A (zh) * 2021-03-26 2021-08-10 珠海舒桐医疗科技有限公司 一种Lt1Cas13d蛋白及基因编辑系统

Non-Patent Citations (18)

* Cited by examiner, † Cited by third party
Title
"Master’s Dissertations", 18 July 2020, FUJIAN NORMAL UNIVERSITY, China, article YE, YANGMIAO: "Structural Insights into CRISPR-Cas13d Effector", XP009552797, DOI: 10.27019/d.cnki.gfjsu.2020.002005 *
ABUDAYYEH ET AL., SCIENCE, vol. 365, no. 6451, 2019, pages 382 - 386
BANSKOTA ET AL., CELL, vol. 185, no. 2, 2022, pages 250 - 265
CAMPBELL ET AL., MOLECULAR THERAPY, vol. 27, 2019, pages 151 - 163
COX ET AL., SCIENCE, vol. 358, no. 6366, 2017, pages 1019 - 1027
DATABASE Protein 21 April 2021 (2021-04-21), ANONYMOUS : "MAG: type VI-D CRISPR-associated RNA-guided ribonuclease Cas13d [Thermoguttaceae bacterium] ", XP093142112, retrieved from NCBI Database accession no. MBR0191107.1 *
GILLMORE ET AL., N. ENGL. J. MED., vol. 385, 2021, pages 493 - 502
GUPTA RAHUL, GHOSH ARIJIT, CHAKRAVARTI RUDRA, SINGH RAJVEER, RAVICHANDIRAN VELAYUTHAM, SWARNAKAR SNEHASIKTA, GHOSH DIPANJAN: "Cas13d: A New Molecular Scissor for Transcriptome Engineering", FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY, FRONTIERS MEDIA, CH, vol. 10, 31 March 2022 (2022-03-31), CH , pages 866800, XP093142116, ISSN: 2296-634X, DOI: 10.3389/fcell.2022.866800 *
HENDEL ET AL., NAT. BIOTECHNOL., vol. 33, no. 9, 2015, pages 985 - 989
KONERMANN ET AL., CELL, vol. 173, no. 3, 2018, pages 665 - 676
KONERMANN ET AL., NATURE, vol. 517, 2015, pages 583 - 588
MAEDER ET AL., NATURE MEDICINE, vol. 25, 2019, pages 229 - 233
MANGEOT ET AL., MOLECULAR THERAPY, vol. 19, no. 9, 2011, pages 1656 - 1666
MANGEOT ET AL., NATURE COMMUNICATIONS, vol. 10, no. 1, 2019, pages 1 - 15
PAUNOVSKA ET AL., NATURE REVIEWS GENETICS, vol. 23, 2022, pages 265 - 280
RICHTER MROSSELLÓ-MÓRA R: "Shifting the genetic gold standard for the prokaryotic species definition", PROC NATL ACAD SCI U S A., vol. 106, no. 45, 10 November 2009 (2009-11-10), pages 19126 - 31
See also references of EP4578945A4
TABEBORDBAR ET AL., CELL, vol. 184, 2021, pages 4919 - 4938

Also Published As

Publication number Publication date
EP4578945A4 (en) 2025-12-17
AU2023328197A1 (en) 2025-04-03
CN118159650A (zh) 2024-06-07
MX2025002272A (es) 2025-04-02
US20250250590A1 (en) 2025-08-07
ZA202407040B (en) 2025-05-28
EP4578945A1 (en) 2025-07-02
KR20250053925A (ko) 2025-04-22
US12297450B2 (en) 2025-05-13
JP2025511466A (ja) 2025-04-16
IL319213A (en) 2025-04-01
US20240392323A1 (en) 2024-11-28


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856738

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202380014099.8

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2024550764

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 319213

Country of ref document: IL

Ref document number: MX/A/2025/002272

Country of ref document: MX

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112025003849

Country of ref document: BR

WWE Wipo information: entry into national phase

Ref document number: 819629

Country of ref document: NZ

Ref document number: AU2023328197

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 20257009723

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1020257009723

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2023856738

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWP Wipo information: published in national office

Ref document number: 819629

Country of ref document: NZ

ENP Entry into the national phase

Ref document number: 2023856738

Country of ref document: EP

Effective date: 20250326

WWP Wipo information: published in national office

Ref document number: MX/A/2025/002272

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2023328197

Country of ref document: AU

Date of ref document: 20230825

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 1020257009723

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 11202501302R

Country of ref document: SG

WWP Wipo information: published in national office

Ref document number: 11202501302R

Country of ref document: SG

WWP Wipo information: published in national office

Ref document number: 2023856738

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 112025003849

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20250226