WO2024229449A2 - Base editing enzymes - Google Patents
Base editing enzymes
- Publication number
- WO2024229449A2 (PCT/US2024/027887)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- seq
- nos
- engineered
- nucleic acid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/0008—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
- A61K48/0025—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/0008—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
- A61K48/0025—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid
- A61K48/0041—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition wherein the non-active part clearly interacts with the delivered nucleic acid the non-active part being polymeric
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/11—Antisense
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/30—Chemical structure
- C12N2310/31—Chemical structure of the backbone
- C12N2310/315—Phosphorothioates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/30—Chemical structure
- C12N2310/32—Chemical structure of the sugar
- C12N2310/321—2'-O-R Modification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04004—Adenosine deaminase (3.5.4.4)
Definitions
- Cas enzymes along with their associated Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) guide ribonucleic acids (RNAs) appear to be a pervasive (~45% of bacteria, ~84% of archaea) component of prokaryotic immune systems, serving to protect such microorganisms against non-self nucleic acids, such as infectious viruses and plasmids, by CRISPR-RNA guided nucleic acid cleavage. While the deoxyribonucleic acid (DNA) elements encoding CRISPR RNA elements may be relatively conserved in structure and length, their CRISPR-associated (Cas) proteins are highly diverse, containing a wide variety of nucleic acid-interacting domains.
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
- Although CRISPR DNA elements have been observed as early as 1987, the programmable endonuclease cleavage ability of CRISPR complexes has only been recognized relatively recently, leading to the use of recombinant CRISPR systems in diverse DNA manipulation and gene editing applications.
- engineered base editing systems comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NOs: 1128-1160 and 1363-1415; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- the base editor comprises a sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1654-1703, and 2021-2023.
- the base editor comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- engineered base editing system comprising: a base editor encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- base editor is encoded by a nucleic acid sequence having at least 90% identity to any one of SEQ ID NOs: 1727-1757.
- the base editor is encoded by a nucleic acid sequence having 100% identity to any one of SEQ ID NOs: 1727-1757.
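The identity thresholds recited above (70% through 100%) can be illustrated with a small calculation. The following is only a sketch: it assumes the two sequences have already been aligned by some external tool, and it counts identity over columns where neither sequence has a gap. The excerpt does not state which alignment method or denominator convention applies, so the function names, the `gap` symbol, and the convention itself are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are pre-aligned and of equal length; the denominator
    convention (gap-free columns) is illustrative, not taken from the source.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-residue sequences that differ at one position would satisfy a 90% threshold but not a 95% threshold under this convention.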
- the base editor comprises a deaminase.
- the deaminase binds non-covalently to the endonuclease.
- the deaminase is covalently linked to the endonuclease.
- the deaminase is fused to the endonuclease.
- the engineered guide polynucleotide is a single guide nucleic acid.
- the engineered guide polynucleotide is a dual guide nucleic acid.
- the engineered guide polynucleotide is RNA.
- the endonuclease binds non-covalently to the engineered guide polynucleotide. In some embodiments, the endonuclease is covalently linked to the engineered guide polynucleotide.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the engineered guide polynucleotide comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the engineered guide polynucleotide comprises a sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1890-1976.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1977-2009.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889.
- the base editor comprises a nickase domain.
- the nickase comprises an aspartate to alanine mutation at residue 9 relative to SEQ ID NO: 70, residue 13 relative to SEQ ID NOs: 71, 72, or 74, residue 12 relative to SEQ ID NO: 73, residue 17 relative to SEQ ID NO: 75, residue 23 relative to SEQ ID NO: 76, or residue 10 relative to SEQ ID NO: 597, or any combination thereof.
- the base editor further comprises a uracil DNA glycosylase inhibitor sequence. In some embodiments, the base editor further comprises a FAM72A sequence. In some embodiments, the FAM72A sequence has at least 80% identity to SEQ ID NO: 1121.
- nucleic acids encoding the engineered base editing systems described herein.
- the vector is a plasmid, a minicircle, a CELiD, an adeno-associated virus (AAV) derived virion, a lentivirus, or an adenovirus.
- AAV adeno-associated virus
- the cell is a eukaryotic cell.
- the cell is a mammalian cell.
- the cell is an immortalized cell.
- the cell is an insect cell.
- the cell is a yeast cell.
- the cell is a plant cell.
- the cell is a fungal cell.
- the cell is a prokaryotic cell.
- the cell is an A549, HEK-293, HEK-293T, BHK, CHO, HeLa, MRC5, Sf9, Cos-1, Cos-7, Vero, BSC 1, BSC 40, BMT 10, WI38, HeLa, Saos, C2C12, L cell, HT1080, HepG2, Huh7, K562, primary cell, or a derivative thereof.
- the cell is an engineered cell.
- the cell is a stable cell.
- modifying the target nucleic acid sequence comprises converting an adenine to a guanine in the target nucleic acid sequence. In some embodiments, modifying the target nucleic acid sequence comprises converting a cytosine to a uracil in the target nucleic acid sequence. In some embodiments, the target nucleic acid sequence comprises deoxyribonucleic acid (DNA). In some embodiments, the target nucleic acid sequence comprises ribonucleic acid (RNA). In some embodiments, the target nucleic acid sequence comprises genomic DNA, viral DNA, viral RNA, or bacterial DNA.
- the target nucleic acid sequence is modified in vitro. In some embodiments, the target nucleic acid sequence is modified in vivo. In some embodiments, the target nucleic acid sequence is modified ex vivo. In some embodiments, the target nucleic acid sequence is modified within a cell. In some embodiments, the cell is a prokaryotic cell, a bacterial cell, a eukaryotic cell, a fungal cell, a plant cell, an animal cell, a mammalian cell, a rodent cell, a primate cell, a human cell, or a primary cell.
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding ANGPTL3 comprising contacting the nucleic acid sequence encoding ANGPTL3 with an engineered base editing system, said base editing system comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding BCL11A comprising contacting the nucleic acid sequence encoding BCL11A with an engineered base editing system, said base editing system comprising a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding PAH comprising contacting the nucleic acid sequence encoding PAH with an engineered base editing system, said base editing system comprising a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- FIG. 2 depicts a schematic diagram highlighting the domain architecture of the homodimeric and heterodimeric ABEs.
- Two copies of MG68-4 variants were inserted into the MG3-6/3-8 nuclease chassis allowing for PID swapping to diversify the PAM accessible to these ABE enzymes.
- FIG. 3A - FIG. 3B depict boxplots showing comparison of A→G editing by ABE variants at 31 distinct gRNAs targeting 3 different genes in Hepa1-6 cells.
- FIG. 3A depicts boxplots showing the mean observed editing
- FIGs. 4A - 4B depict boxplots showing comparison of C→G promiscuous editing by ABE variants at 31 distinct gRNAs targeting 3 different genes in Hepa1-6 cells.
- FIG. 4A depicts boxplots showing the mean observed editing
- FIG. 6 depicts a heatmap showing the guide-wise and variant-wise breakdown of maximum A→G editing activity observed across the highly edited gRNAs targeting 3 different genes in Hepa1-6 cells. The guides are ranked according to the observed A→G editing activity in decreasing order from left to right. ABE07-77 is highlighted along with the guides chosen for further in vivo investigation.
- FIG. 10 depicts a graph showing that max C to T editing for 139-52-V2, 139-52-V13, 139-52-V14, 139-52-V17, 139-86v12, and 152-6v13 across all 5 guides is 26.2%, 58.2%, 22.7%, 16.8%, 12.6%, 50.9% respectively.
- 139-52-V17, 139-86v12, and 152-6v13 edited at 159.2%, 102.1%, 353.7%, 137.8%, 76.3%, and 309.1% of the maximum positive control editing respectively.
- FIG. 11 depicts bar plots showing the -1 nucleotide preference of the engineered CDA variants. The nucleotide at the -1 position, immediately 5’ of the cytidine being deaminated, has been shown to be important for binding and for the deamination reaction by the cytidine deaminase.
- the -1 nucleotide preference for each of our 6 engineered CDA variants was measured. To do this, the number of reads of the four highest edited cytidine sites for each guide that edited >1% (and their -1 nucleotide identity) was tabulated. This was done separately for all five guides, after which the preference across all 5 guides was averaged.
- the resulting graph represents the relative -1 nucleotide preference each CDA has for the type of cytidine it prefers to deaminate across the five guides targeting the HEK293 engineered site.
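The tabulation described above (top four edited cytidine sites per guide, read counts grouped by -1 nucleotide, then an average across guides) can be sketched as follows. This is a hedged illustration only: the input layout (dictionaries with `minus1`, `edited_reads`, and `edit_pct` keys), the function names, and the reading of "edited >1%" as "the guide's best site exceeds 1% editing" are all assumptions, not details confirmed by the text.

```python
BASES = "ACGT"


def guide_preference(sites, top_n=4, min_edit_pct=1.0):
    """Relative -1 nucleotide preference for one guide.

    `sites` is a hypothetical list of dicts, one per cytidine site, with keys
    'minus1' (base 5' of the cytidine), 'edited_reads', and 'edit_pct'.
    Returns None when the guide does not exceed the editing threshold.
    """
    if not sites or max(s["edit_pct"] for s in sites) <= min_edit_pct:
        return None
    # Keep only the most highly edited sites for this guide.
    top = sorted(sites, key=lambda s: s["edit_pct"], reverse=True)[:top_n]
    counts = {b: 0 for b in BASES}
    for s in top:
        counts[s["minus1"]] += s["edited_reads"]
    total = sum(counts.values())
    if total == 0:
        return None
    return {b: counts[b] / total for b in BASES}


def average_preference(guides):
    """Average the per-guide preferences over all guides that passed the threshold."""
    prefs = [p for p in (guide_preference(s) for s in guides.values()) if p is not None]
    if not prefs:
        return {b: 0.0 for b in BASES}
    return {b: sum(p[b] for p in prefs) / len(prefs) for b in BASES}
```

Averaging per-guide fractions, rather than pooling reads across guides, keeps a single deeply sequenced guide from dominating the result; whether the original analysis weighted guides this way is not stated.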
- FIG. 12A and FIG. 12B depict graphs showing the comparison of on target A→G editing by oligomeric ABE variants at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- FIG. 17A - FIG. 17C depict graphs showing the preferred sequence context of the edited adenines. Frequency distribution of bases surrounding all editable adenines within the 15 distinct gRNA tested in this study (FIG. 17A), the highly edited (greater than 30% editing activity) adenines by ABE15 (FIG. 17B), and the less edited (less than 30% editing activity) adenines by ABE15 (FIG. 17C).
- FIG. 18A and FIG. 18B depict graphs showing the comparison of on target A→G editing by engineered ABE variants at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- FIG. 20A and FIG. 20B depict graphs showing the comparison of unwanted indel formation by engineered ABE variants at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells and the preferred sequence editing context for engineered ABE23.
- FIG. 19A and FIG. 19B depict graphs showing the comparison of promiscuous C editing by engineered ABE variants at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- Boxplots depict the Mean observed C editing (FIG. 19A) and Max observed C editing (FIG. 19B) by individual ABE variants, with data corresponding
- FIG. 20B shows the frequency distribution of bases surrounding the highly edited (greater than 30% editing activity) adenines by ABE23.
- FIG. 21A and FIG. 21B depict graphs showing the comparison of on target A→G editing by engineered D109Q ABE variants and variants with additional proline mutations designed to relax the sequence preference at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- D109Q mutations designed to curb promiscuous C deamination increased the mean and max A to G editing.
- FIG. 22A and FIG. 22B depict graphs showing the comparison of promiscuous C editing by engineered D109Q ABE variants and variants with additional proline mutations designed to relax the sequence preference at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- FIG. 23 depicts a graph showing the comparison of unwanted indel formation and editing context for engineered D109Q variants and variants with additional proline mutations designed to relax the sequence preference at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- FIG. 24A and FIG. 24B depict graphs showing the comparison of on target A→G editing by top performing ABEs at 15 distinct gRNA targeting 2 different genes in Hepa1-6 cells.
- FIGs. 27A-FIG. 27C depict small base editor fusion constructs, including a schematic of the small base editors indicating the relative position of the deaminase domain with respect to the nickase domain (FIG. 27A), the structural alignment of the computationally predicted structure of MG3-6_3-8 ABE (light grey) with the predicted structure of MG34-29 nuclease (dark grey) (FIG. 27B), and an alternative view in which the MG3-6_3-8 nickase is omitted from the alignment shown in FIG. 27B to uncover a potential loop within the MG34-29 structure prediction to fuse the MG68-4 deaminase (FIG. 27C).
- FIGs. 29A-FIG. 29C depict PAM-interacting domain (PID) engineering that was used to develop a suite of chimeric MG3-6 base editors with extensive genome targetability.
- FIG. 29A shows the percentage of genomic adenines in the hg38 human reference genome that are targetable by SpCas9 ABEs.
- FIG. 29B depicts a schematic diagram of the PID-swappable chimeric nuclease platform highlighting the diversity and range of PAMs accessible through PID-swapping.
- FIG. 29C depicts the percentage of genomic adenines in the hg38 human reference genome that are targetable using the MG3-6 PID-swappable ABEs platform.
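The idea of "percentage of adenines targetable" by a set of PAMs can be made concrete with a toy calculation. The sketch below is an assumption-laden illustration rather than the analysis behind FIG. 29A or FIG. 29C: the spacer length, the editing window positions, and the example PAM strings are hypothetical placeholders (the excerpt does not list them), and only the given strand is scanned with the PAM located 3' of the protospacer.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pam_pattern(pam):
    """Compile a zero-width lookahead so that overlapping PAM sites are all found."""
    body = "".join(IUPAC[c] for c in pam.upper())
    return re.compile("(?=" + body + ")")


def targetable_adenines(seq, pam, spacer_len=22, window=(4, 10)):
    """Indices of adenines inside the editing window of any protospacer with a 3' PAM.

    `window` holds 1-based, inclusive protospacer positions counted from the
    PAM-distal end; `spacer_len` and `window` defaults are hypothetical.
    """
    lo, hi = window
    if not 1 <= lo <= hi <= spacer_len:
        raise ValueError("window must lie within the protospacer")
    seq = seq.upper()
    hits = set()
    for match in pam_pattern(pam).finditer(seq):
        proto_start = match.start() - spacer_len
        if proto_start < 0:
            continue  # not enough sequence upstream for a full protospacer
        for pos in range(lo, hi + 1):
            i = proto_start + pos - 1
            if seq[i] == "A":
                hits.add(i)
    return hits


def fraction_targetable(seq, pams, spacer_len=22, window=(4, 10)):
    """Fraction of all adenines in `seq` reachable by at least one of the PAMs."""
    total = seq.upper().count("A")
    if total == 0:
        return 0.0
    hits = set()
    for pam in pams:
        hits |= targetable_adenines(seq, pam, spacer_len, window)
    return len(hits) / total
```

Because hits from different PAMs are merged into one set, adding a PAM can only keep or raise the fraction, which is the intuition behind widening coverage through PID swapping. A genome-scale version would also scan the reverse complement.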
- FIGs. 30A-FIG. 30D depict a schematic of the experimental design for high-throughput chimeric ABE testing.
- FIG. 30A shows an editing window of two highly active MG3-6_3-8 ABEs. Each dot represents a unique editing event observed across over 15 unique guides by ABE07 (SEQ ID NO: 1411) and ABE33 (SEQ ID NO: 1673).
- the Gaussian curves on the top panel indicate the estimated editing window for MG3-6_3-8 ABEs that was used to design the tiling guides used throughout this experiment.
- FIG. 30B shows a scheme depicting base editing of the splice sites using ABEs, which can lead to intron retention (splice donor disruption) or exon skipping (splice acceptor disruption) and hence be utilized for targeted knockout of genes.
- Plate layouts for (FIG. 30C) the high-throughput pooled screen and (FIG. 30D) the deconvolution screen were used to identify highly active ABEs and guide combinations that yield efficient targeted knockouts.
- FIGs. 31A-FIG. 31B depict targeted knockout of hANGPTL3 using PID-swapped chimeric ABEs.
- FIG. 31A shows a graph of the comparison of theoretical splice site targetability of reference ABEs and MG3-6_3-8 chimeric ABEs.
- FIG. 31B shows the results of pooled screening of compatible chimeric ABEs across the splice sites of hANGPTL3 gene as tested in K562 cells.
- the heatmap values indicate the percentage A→G editing observed at the splice site adenines by the corresponding chimeric ABE variants.
- FIGs. 32A-FIG. 32D show targeted knockout of hANGPTL3 using PID-swapped chimeric ABEs. Barplots of the A→G editing at the splice site adenines used to deconvolute the active guides and chimeric ABE combinations at the hANGPTL3 exon 1 splice donor (FIG. 32A), exon 4 splice acceptor and splice donor (FIG. 32B), exon 5 splice donor (FIG. 32C), and exon 7 splice acceptor (FIG. 32D).
- FIGs. 33A-FIG. 33C depict targeted enhancer site disruption of the GATA1-binding site in hBCL11A using PID-swapped chimeric ABEs.
- FIG. 33A shows a graph comparing the theoretical targetability of the GATA1 enhancer binding site by reference ABEs and MG3-6_3-8 chimeric ABEs.
- FIG. 33B shows the results of pooled screening of compatible chimeric ABEs across the various enhancer factor binding sites of the BCL11A gene tested in K562 cells. The heatmap values indicate the percentage A→G editing observed at the GATA site adenine by the corresponding chimeric ABE variants.
- FIG. 33C shows a barplot of the A→G editing at the GATA1 adenines used to deconvolute the active guides and chimeric ABE combinations at the DHS+58 enhancer binding site.
- FIGs. 34A-FIG. 34D depict targeted correction of hPAH SNVs using PID-swapped chimeric ABEs.
- FIG. 34A shows a table of the most common pathogenic SNVs reported in hPAH and their responsiveness to Sapropterin, adapted from Regier DS, Greene CL. Phenylalanine Hydroxylase Deficiency.
- FIGs. 34B-34C show pooled screening of compatible chimeric ABEs across the SNV sites of the hPAH gene as tested in a modified cell line.
- the heatmap values indicate the percentage A→G editing observed at the target adenines as well as the bystander A’s by the corresponding chimeric ABE variants for the following SNVs: 1222C>T (p.R408W) SNV (FIG. 34B), 1066-11G>A (FIG. 34C), and 1315+1G>A (FIG. 34D).
- SEQ ID NOs: 1-47 show the full-length peptide sequences of MG66 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 48-49 show the full-length peptide sequences of MG67 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 50-51 show the full-length peptide sequences of MG68 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 52-56 show the sequences of uracil DNA glycosylase inhibitors suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 57-66 show the sequences of reference deaminases.
- SEQ ID NO: 67 shows the sequence of a reference uracil DNA glycosylase inhibitor.
- SEQ ID NO: 68 shows the sequence of an adenine base editor.
- SEQ ID NO: 69 shows the sequence of a cytosine base editor.
- SEQ ID NOs: 70-78 show the full-length peptide sequences of MG nickases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 79-87 show the protospacer and PAM used in in vitro nickase assays described herein.
- SEQ ID NOs: 88-96 show the sequences of single guide RNAs used in in vitro nickase assays described herein.
- SEQ ID NOs: 97-156 show the sequences of spacers when targeting E. coli lacZ.
- SEQ ID NOs: 157-176 show the sequences of primers when conducting site directed mutagenesis.
- SEQ ID NOs: 177-178 show the sequences of primers for lacZ sequencing.
- SEQ ID NOs: 179-342 show the sequences of primers used during amplification.
- SEQ ID NOs: 343-345 show the sequences of primers for lacZ sequencing.
- SEQ ID NOs: 346-359 show the sequences of primers used during amplification.
- SEQ ID NOs: 360-368 show protospacer adjacent motifs suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 369-384 show nuclear localization sequences (NLS’s) suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 385-443 show the full-length peptide sequences of MG68 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 444-447 show the full-length peptide sequences of MG121 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 448-475 show the full-length peptide sequences of MG68 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 476 and 477 show sequences of adenine base editors.
- SEQ ID NOs: 478-482 show sequences of cytosine base editors.
- SEQ ID NOs: 483-487 show the sequences of plasmids suitable for encoding the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 488 and 489 show the sgRNA scaffold sequences for MG15-1 and MG34-1.
- SEQ ID NOs: 490-522 show the sequences of spacers used to target genomic loci in E. coli and HEK293T cells.
- SEQ ID NOs: 523-585 show the sequences of primers used during amplification and Sanger sequencing.
- SEQ ID NOs: 584-585 show the sequences of primers used during amplification.
- SEQ ID NO: 586 shows the sequence of an adenine base editor.
- SEQ ID NO: 587 shows the sequence of a cytosine base editor.
- SEQ ID NOs: 588-589 show sequences of adenine base editors.
- SEQ ID NOs: 590-593 show the full-length peptide sequences of linkers suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 594 shows the sequence of a cytosine deaminase.
- SEQ ID NO: 595 shows the sequence of an adenosine deaminase.
- SEQ ID NO: 596 shows the sequence of an MG34 active effector suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 597 shows the sequence of an MG34 nickase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 598 shows the sequence of an MG34 PAM.
- SEQ ID NOs: 599-638 show the full-length peptide sequences of MG138 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 639-659 show the full-length peptide sequences of MG139 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 660-662 show the full-length peptide sequences of MG141 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 663-664 show the full-length peptide sequences of MG142 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 665-675 show the full-length peptide sequences of MG93 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 676-678 show sequences of adenine base editors.
- SEQ ID NOs: 679-680 show the sgRNA scaffold sequences for MG34-1 and SpCas9.
- SEQ ID NOs: 681-689 show spacer sequences used to target genomic loci in guide RNAs.
- SEQ ID NOs: 690-707 show sequences of primers used to amplify genomic targets of adenine base editors (ABE) for next generation sequencing (NGS) analysis.
- SEQ ID NO: 708 shows the sequence of a blasticidin (BSD) resistance cassette.
- SEQ ID NOs: 709-719 show spacer sequences used to target genomic loci in guide RNAs.
- SEQ ID NOs: 720-726 show the sequences of plasmids suitable for encoding the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 728-729 show sequences of adenine base editors.
- SEQ ID NOs: 730-736 show spacer sequences used to target genomic loci in guide RNAs.
- SEQ ID NOs: 737-738 show the sequences of plasmids suitable for encoding the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 739-740 show sequences of cytidine base editors.
- SEQ ID NO: 741 shows the sequence of a plasmid suitable for encoding the A1CF gene.
- SEQ ID NO: 742 shows the sequence of an RNA used to test CDAs for RNA activity.
- SEQ ID NO: 743 shows the sequence of a labelled primer for the poisoned primer extension assay used to test CDAs for RNA activity.
- SEQ ID NOs: 744-827 show the full-length peptide sequences of MG139 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 828 shows the full-length peptide sequence of an MG93 cytidine deaminase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 829 shows the full-length peptide sequence of an MG142 cytidine deaminase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 830-835 show the full-length peptide sequences of MG152 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 836-860 show sequences of adenine base editors.
- SEQ ID NOs: 861-864 show spacer sequences used to target genomic loci in guide RNAs.
- SEQ ID NOs: 865-872 show sequences of primers used to amplify genomic targets of adenine base editors (ABE) for next generation sequencing (NGS) analysis.
- SEQ ID NOs: 873-875 show the sequences of plasmids suitable for encoding the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 876 shows the sgRNA scaffold sequence for MG34-1.
- SEQ ID NOs: 877-916 show sequences of cytosine base editors.
- SEQ ID NOs: 917-931 show the sequences of sgRNAs suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 932-961 show sequences of primers used to amplify genomic targets of adenine base editors (ABE) for next generation sequencing (NGS) analysis.
- SEQ ID NO: 962 shows a site engineered in mammalian cell line with 5 PAMs compatible with Cas9 and MG3-6 editing.
- SEQ ID NOs: 963-967 show the sequences of sgRNAs suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 968-969 show sequences of cytosine base editors.
- SEQ ID NO: 970 shows the full-length peptide sequence of an MG139 cytidine deaminase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 971-977 show the full-length peptide sequences of MG93 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 978-981 show the full-length peptide sequences of MG138 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 982 shows the full-length peptide sequence of an MG142 cytidine deaminase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 983-1014 show the full-length peptide sequences of MG128 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1015-1026 show the full-length peptide sequences of MG129 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1027-1031 show the full-length peptide sequences of MG130 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1032-1040 show the full-length peptide sequences of MG131 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1041-1043 show the full-length peptide sequences of MG132 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1044-1057 show the full-length peptide sequences of MG133 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1058-1061 show the full-length peptide sequences of MG134 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1062-1069 show the full-length peptide sequences of MG135 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1070-1081 show the full-length peptide sequences of MG136 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1082-1098 show the full-length peptide sequences of MG137 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1099-1105 show the sequences of sgRNAs suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1106-1111 show the sequences of MG35 PAMs.
- SEQ ID NO: 1112 shows the DNA sequence of a gene encoding the ABE-MG35-1 adenine base editor.
- SEQ ID NO: 1113 shows the protein sequence of the ABE-MG35-1 adenine base editor.
- SEQ ID NO: 1114 shows the nucleotide sequence of a plasmid encoding a Cas9-based cytosine base editor (CBE).
- SEQ ID NO: 1115 shows the nucleotide sequence of a plasmid encoding Fam72a.
- SEQ ID NOs: 1116-1117 show the sequences of Cas9-CBE target sites.
- SEQ ID NOs: 1118-1119 show the sequences of NGS amplicons.
- SEQ ID NO: 1120 shows the full-length peptide sequence of an MG35 nuclease.
- SEQ ID NO: 1121 shows the full-length peptide sequence of Fam72A.
- SEQ ID NOs: 1122-1127 show the full-length peptide sequences of MG35 nucleases.
- SEQ ID NOs: 1128-1160 show the full-length peptide sequences of MG3-6/3-8 adenine base editors.
- SEQ ID NOs: 1161-1186 show the full-length peptide sequences of MG34-1 adenine base editors.
- SEQ ID NOs: 1187-1195 show the sequences of sgRNAs suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1196-1204 show spacer sequences used to target genomic loci in guide RNAs.
- SEQ ID NO: 1205 shows the nucleotide sequence of a plasmid encoding an MG3-6/3-8 adenine base editor.
- SEQ ID NO: 1206 shows the nucleotide sequence of a plasmid encoding an sgRNA suitable for an MG3-6/3-8 adenine base editor described herein.
- SEQ ID NO: 1207 shows the nucleotide sequence of a plasmid encoding an MG34-1 adenine base editor.
- SEQ ID NOs: 1208-1269 show the full-length peptide sequences of MG93 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1270-1296 show the full-length peptide sequences of MG139 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1297-1311 show the full-length peptide sequences of MG152 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1312-1313 show the full-length peptide sequences of MG138 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1314-1315 show the full-length peptide sequences of MG139 deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1316-1319 show the nucleotide sequences of 5'-FAM-labeled ssDNAs.
- SEQ ID NOs: 1320-1321 show the nucleotide sequences of Cy5.5-labeled ssDNAs.
- SEQ ID NOs: 1322-1355 show sequences of cytidine base editors.
- SEQ ID NOs: 1356-1362 show the full-length peptide sequences of MG34-1 adenine base editors.
- SEQ ID NOs: 1363-1415 show the full-length peptide sequences of MG3-6/3-8 adenine base editors.
- SEQ ID NOs: 1416-1417 show the nucleotide sequences of sgRNAs suitable for use with MG34-1 adenine base editors described herein.
- SEQ ID NO: 1418 shows the nucleotide sequence of an sgRNA suitable for use with MG3-6/3-8 adenine base editors described herein.
- SEQ ID NOs: 1419-1420 show the DNA sequences of target sites suitable for targeting by MG34-1 adenine base editors described herein.
- SEQ ID NO: 1421 shows a DNA sequence of a target site suitable for targeting by MG3-6/3-8 adenine base editors described herein.
- SEQ ID NO: 1422 shows the nucleotide sequence of a plasmid suitable for expression of an MG34-1 adenine base editor described herein.
- SEQ ID NO: 1423 shows the nucleotide sequence of a plasmid suitable for expression of an MG3-6/3-8 adenine base editor described herein.
- SEQ ID NO: 1424 shows the full-length peptide sequence of an MG35-1 adenine base editor.
- SEQ ID NOs: 1425-1426 show the nucleotide sequences of plasmids suitable for expression of MG35-1 adenine base editors and sgRNAs described herein.
- SEQ ID NOs: 1427-1428 show the nucleotide sequences of sgRNAs suitable for use with MG35-1 adenine base editors described herein.
- SEQ ID NOs: 1429-1430 show the DNA sequences of target sites suitable for targeting by MG35-1 adenine base editors described herein.
- SEQ ID NOs: 1431-1454 show the nucleotide sequences of sgRNAs engineered to function with an MG3-6/3-8 adenine base editor in order to target APOA1.
- SEQ ID NOs: 1455-1478 show the DNA sequences of APOA1 target sites.
- SEQ ID NOs: 1479-1483 show the nucleotide sequences of sgRNAs engineered to function with an MG3-6/3-8 adenine base editor in order to target ANGPTL3.
- SEQ ID NOs: 1484-1488 show the DNA sequences of ANGPTL3 target sites.
- SEQ ID NOs: 1489-1490 show the nucleotide sequences of sgRNAs engineered to function with an MG3-6/3-8 adenine base editor in order to target TRAC.
- SEQ ID NOs: 1491-1492 show the DNA sequences of TRAC sites.
- SEQ ID NOs: 1493-1516 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of APOA1.
- SEQ ID NOs: 1517-1521 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of ANGPTL3.
- SEQ ID NOs: 1522-1523 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of TRAC.
- SEQ ID NOs: 1524-1547 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of APOA1.
- SEQ ID NOs: 1548-1552 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of ANGPTL3.
- SEQ ID NOs: 1553-1554 show the nucleotide sequences of NGS primers suitable for use in assessing base editing of TRAC.
- SEQ ID NO: 1555 shows the nucleotide sequence of a plasmid suitable for use in mRNA production.
- SEQ ID NOs: 1556-1562 show the full-length peptide sequences of MG131 adenine deaminase variants.
- SEQ ID NOs: 1563-1566 show the full-length peptide sequences of MG134 adenine deaminase variants.
- SEQ ID NOs: 1567-1574 show the full-length peptide sequences of MG135 adenine deaminase variants.
- SEQ ID NOs: 1575-1589 show the full-length peptide sequences of MG137 adenine deaminase variants.
- SEQ ID NOs: 1590-1599 show the full-length peptide sequences of MG68 adenine deaminase variants.
- SEQ ID NOs: 1600-1602 show the full-length peptide sequences of MG132 adenine deaminase variants.
- SEQ ID NOs: 1603-1616 show the full-length peptide sequences of MG133 adenine deaminase variants.
- SEQ ID NOs: 1617-1624 show the full-length peptide sequences of MG136 adenine deaminase variants.
- SEQ ID NOs: 1625-1633 show the full-length peptide sequences of MG129 adenine deaminase variants.
- SEQ ID NOs: 1634-1638 show the full-length peptide sequences of MG130 adenine deaminase variants.
- SEQ ID NOs: 1639-1644 show the full-length peptide sequences of MG34-1 adenine base editors.
- SEQ ID NOs: 1698-1703 show the full-length peptide sequences of MG34 adenine base editors.
- SEQ ID NOs: 1645-1646 show the nucleotide sequences of ssDNA substrates suitable for testing adenine deaminase activity in vitro.
- SEQ ID NOs: 1647-1653 show linker sequences for deaminase systems described herein.
- SEQ ID NOs: 1654-1658, and 1665-1694 show the full-length peptide sequences of MG3-6/3-8 chimera base editors.
- SEQ ID NOs: 1659 and 1661-1664 show the full-length peptide sequences of MG139 cytidine deaminases suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NO: 1660 shows the full-length peptide sequence of an MG152 cytidine deaminase suitable for the engineered nucleic acid editing systems described herein.
- SEQ ID NOs: 1695-1697 show the full-length peptide sequences of MG102 adenine base editors.
- SEQ ID NO: 1704 shows the full-length nucleotide sequence of an hApoA1_1 guide RNA.
- SEQ ID NOs: 1705-1710 show the full-length nucleotide sequences of MG34 effector chemically synthesized/modified sgRNAs.
- SEQ ID NOs: 1711-1719 show the full-length nucleotide sequences of TRAC target sites.
- SEQ ID NOs: 1720-1725 show the full-length nucleotide sequences of AAVS1 target sites.
- SEQ ID NO: 1726 shows the full-length nucleotide sequence of the hApoA1 target site.
- SEQ ID NOs: 1727-1728 and 1743 show nucleotide sequences of the MG3-6/3-8 adenine base editor.
- SEQ ID NOs: 1729 and 1744 show nucleotide sequences of the MG3-6/3-8/3-4 adenine base editor.
- SEQ ID NOs: 1730 and 1745 show nucleotide sequences of the MG3-6/3-8/3-6 adenine base editor.
- SEQ ID NOs: 1731 and 1746 show nucleotide sequences of the MG3-6/3-8/3-7 adenine base editor.
- SEQ ID NOs: 1732 and 1747 show nucleotide sequences of the MG3-6/3-8/3-22 adenine base editor.
- SEQ ID NOs: 1733 and 1748 show nucleotide sequences of the MG3-6/3-8/3-24 adenine base editor.
- SEQ ID NOs: 1734 and 1749 show nucleotide sequences of the MG3-6/3-8/3-38 adenine base editor.
- SEQ ID NOs: 1735 and 1750 show nucleotide sequences of the MG3-6/3-8/3-89 adenine base editor.
- SEQ ID NOs: 1736 and 1751 show nucleotide sequences of the MG3-6/3-8/3-90 adenine base editor.
- SEQ ID NOs: 1737 and 1752 show nucleotide sequences of the MG3-6/3-8/3-92 adenine base editor.
- SEQ ID NOs: 1738 and 1753 show nucleotide sequences of the MG3-6/3-8/3-93 adenine base editor.
- SEQ ID NOs: 1739 and 1754 show nucleotide sequences of the MG3-6/3-8/3-95 adenine base editor.
- SEQ ID NOs: 1740 and 1755 show nucleotide sequences of the MG3-6/3-8/3-104 adenine base editor.
- SEQ ID NOs: 1741 and 1756 show nucleotide sequences of the MG3-6/3-8/150-2 adenine base editor.
- SEQ ID NOs: 1742 and 1757 show nucleotide sequences of the MG3-6/3-8/150-9 adenine base editor.
- SEQ ID NOs: 1758-1889 show the nucleotide sequences of MG3-6 ABE hANGPTL3 guides.
- SEQ ID NOs: 1890-1976 show the nucleotide sequences of MG3-6 ABE hBCL11A guides.
- SEQ ID NOs: 1977-2009 show the nucleotide sequences of MG3-6 ABE hPAH guides.
- SEQ ID NOs: 2010-2019 show the nucleotide sequences of MG3-6 ABE APOA1 guides.
- SEQ ID NO: 2020 shows the nucleotide sequence of an engineered therapeutic sequence.
- SEQ ID NO: 2021 shows the protein sequence of the MG3-6/3-8/3-4 adenine base editor.
- SEQ ID NO: 2022 shows the protein sequence of the MG3-6/3-8/3-7 adenine base editor.
- SEQ ID NO: 2023 shows the protein sequence of the MG3-6/3-8/3-104 adenine base editor.
- SEQ ID NOs: 2024-2043 show the amino acid sequences of nuclear localization signals (NLS).
- nucleotide refers to a base-sugar-phosphate combination.
- Contemplated nucleotides include naturally occurring nucleotides and synthetic nucleotides.
- Nucleotides are monomeric units of a nucleic acid sequence (e.g., deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)).
- the term nucleotide includes ribonucleoside triphosphates adenosine triphosphate (ATP), uridine triphosphate (UTP), cytosine triphosphate (CTP), guanosine triphosphate (GTP), and deoxyribonucleoside triphosphates such as dATP, dCTP, dITP, dUTP, dGTP, dTTP, or derivatives thereof.
- derivatives include, for example, [αS]dATP, 7-deaza-dGTP and 7-deaza-dATP, and nucleotide derivatives that confer nuclease resistance on the nucleic acid molecule containing them.
- nucleotide as used herein encompasses dideoxyribonucleoside triphosphates (ddNTPs) and their derivatives.
- ddNTPs include, but are not limited to, ddATP, ddCTP, ddGTP, ddITP, and ddTTP.
- a nucleotide may be unlabeled or detectably labeled, such as using moieties comprising optically detectable moieties (e.g., fluorophores) or quantum dots.
- Detectable labels include, for example, radioactive isotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels, and enzyme labels.
- Fluorescent labels of nucleotides include but are not limited to fluorescein, 5-carboxyfluorescein (FAM), 2'7'-dimethoxy-4'5-dichloro-6-carboxyfluorescein (JOE), rhodamine, 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), 6-carboxy-X-rhodamine (ROX), 4-(4'dimethylaminophenylazo) benzoic acid (DABCYL), Cascade Blue, Oregon Green, Texas Red, Cyanine, and 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid.
- Fluorescently labeled nucleotides include [R6G]dUTP, [TAMRA]dUTP, [R110]dCTP, [R6G]dCTP, [TAMRA]dCTP, [JOE]ddATP, [R6G]ddATP, [FAM]ddCTP, [R110]ddCTP, [TAMRA]ddGTP, [ROX]ddTTP, [dR6G]ddATP, [dR110]ddCTP, [dTAMRA]ddGTP, and [dROX]ddTTP, available from Perkin Elmer, Foster City, Calif.
- nucleotide encompasses chemically modified nucleotides.
- An exemplary chemically-modified nucleotide is biotin-dNTP.
- biotinylated dNTPs include biotin-dATP (e.g., bio-N6-ddATP, biotin-14-dATP), biotin-dCTP (e.g., biotin-11-dCTP, biotin-14-dCTP), and biotin-dUTP (e.g., biotin-11-dUTP, biotin-16-dUTP, biotin-20-dUTP).
- polynucleotide, oligonucleotide, and nucleic acid refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof, in single-, double-, or multi-stranded form.
- Contemplated polynucleotides include a gene or fragment thereof.
- Exemplary polynucleotides include, but are not limited to, DNA, RNA, coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, cell-free DNA, cell-free RNA (cfRNA), nucleic acid probes, and primers.
- a T means U (Uracil) in RNA and T (Thymine) in DNA.
- a polynucleotide can be exogenous or endogenous to a cell and/or exist in a cell-free environment.
- the term polynucleotide encompasses modified polynucleotides (e.g., altered backbone, sugar, or nucleobase).
- modifications to the nucleotide structure are imparted before or after assembly of the polymer.
- modifications include: 5-bromouracil, peptide nucleic acid, xeno nucleic acid, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol-containing nucleotides, biotin-linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine.
- the sequence of nucleotides may be interrupted by non-nucleotide components.
- peptide refers to a polymer of at least two amino acid residues joined by peptide bond(s). This term does not connote a specific length of polymer, nor is it intended to imply or distinguish whether the peptide is produced using recombinant techniques, chemical or enzymatic synthesis, or is naturally occurring. The terms apply to naturally occurring amino acid polymers as well as amino acid polymers comprising at least one modified amino acid. In some cases, the polymer is interrupted by non-amino acids. The terms include amino acid chains of any length, including full length proteins, and proteins with or without secondary or tertiary structure (e.g., domains).
- the term peptide also encompasses an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, oxidation, and any other manipulation such as conjugation with a labeling component.
- amino acid and amino acids refer to natural and non-natural amino acids, including, but not limited to, modified amino acids.
- Modified amino acids include amino acids that have been chemically modified to include a group or a chemical moiety not naturally present on the amino acid.
- amino acid includes both D-amino acids and L-amino acids.
- non-native refers to a nucleic acid or polypeptide sequence that is non-naturally occurring.
- Non-native refers to a non-naturally occurring nucleic acid or polypeptide sequence that comprises modifications such as mutations, insertions, or deletions.
- the term non-native encompasses fusion nucleic acids or polypeptides that encode or exhibit an activity (e.g., enzymatic activity, methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitinating activity, etc.) of the nucleic acid or polypeptide sequence to which the non-native sequence is fused.
- a non-native nucleic acid or polypeptide sequence includes those linked to a naturally-occurring nucleic acid or polypeptide sequence (or a variant thereof) by genetic engineering to generate a chimeric nucleic acid or polypeptide sequence encoding a chimeric nucleic acid or polypeptide.
- operably linked refers to an arrangement of genetic elements, e.g., a promoter, an enhancer, a polyadenylation sequence, etc., wherein an operation (e.g., movement or activation) of a first genetic element has some effect on the second genetic element.
- the effect on the second genetic element can be, but need not be, of the same type as operation of the first genetic element.
- two genetic elements are operably linked if movement of the first element causes an activation of the second element.
- a regulatory element which may comprise promoter and/or enhancer sequences, is operatively linked to a coding region if the regulatory element helps initiate transcription of the coding sequence. There may be intervening residues between the regulatory element and coding region so long as this functional relationship is maintained.
- a “functional fragment” of a DNA or protein sequence refers to a fragment that retains a biological activity (either functional or structural) that is substantially similar to a biological activity of the full-length DNA or protein sequence.
- a biological activity of a DNA sequence includes its ability to influence expression in a manner attributed to the full-length sequence.
- “engineered,” “synthetic,” and “artificial” are used interchangeably herein to refer to an object that has been modified by human intervention. For example, the terms refer to a polynucleotide or polypeptide that is non-naturally occurring.
- An engineered peptide has, but does not require, low sequence identity (e.g., less than 50% sequence identity, less than 25% sequence identity, less than 10% sequence identity, less than 5% sequence identity, less than 1% sequence identity) to a naturally occurring human protein.
- VPR and VP64 domains are synthetic transactivation domains.
- Non-limiting examples include the following: a nucleic acid modified by changing its sequence to a sequence that does not occur in nature; a nucleic acid modified by ligating it to a nucleic acid that it does not associate with in nature such that the ligated product possesses a function not present in the original nucleic acid; an engineered nucleic acid synthesized in vitro with a sequence that does not exist in nature; a protein modified by changing its amino acid sequence to a sequence that does not exist in nature; an engineered protein acquiring a new function or property.
- An “engineered” system comprises at least one engineered component.
- tracrRNA or “tracr sequence” means trans-activating CRISPR RNA.
- tracrRNA interacts with the CRISPR (cr) RNA to form guide (g) RNA in type II and subtype V-B CRISPR-Cas systems. If the tracrRNA is engineered, it may have about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% sequence identity and/or sequence similarity to a wild type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes or S. aureus).
- tracrRNA may refer to a modified form of a tracrRNA that can comprise a nucleotide change such as a deletion, insertion, or substitution, variant, mutation, or chimera.
- the term tracrRNA encompasses a nucleic acid that can be at least about 60% identical to a wild type exemplary tracrRNA (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) sequence over a stretch of at least 6 contiguous nucleotides.
- a tracrRNA sequence is at least about 60% identical, at least about 65% identical, at least about 70% identical, at least about 75% identical, at least about 80% identical, at least about 85% identical, at least about 90% identical, at least about 95% identical, at least about 98% identical, at least about 99% identical, or 100% identical to a wild type exemplary tracrRNA (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) sequence over a stretch of at least 6 contiguous nucleotides.
- Type II tracrRNA sequences can be predicted on a genome sequence by identifying regions with complementarity to part of the repeat sequence in an adjacent CRISPR array.
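The prediction approach just described (finding genomic regions complementary to part of a CRISPR repeat) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the seed length `k`, and the toy sequences are hypothetical, and the scan uses exact Watson-Crick matching only, whereas real anti-repeat regions may pair imperfectly (including G:U wobble pairs in RNA).

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_anti_repeat_candidates(region: str, repeat: str, k: int = 8):
    """Scan `region` for k-mers that are perfectly complementary to part of `repeat`.

    Returns a list of (position, k-mer) pairs. `k` is a hypothetical seed
    length chosen for illustration, not a value taken from the source.
    """
    region = region.upper()
    rc_repeat = reverse_complement(repeat)
    # Every k-mer of the reverse-complemented repeat is a candidate seed.
    seeds = {rc_repeat[i:i + k] for i in range(len(rc_repeat) - k + 1)}
    return [
        (i, region[i:i + k])
        for i in range(len(region) - k + 1)
        if region[i:i + k] in seeds
    ]


# Toy example: the flanking region contains the reverse complement of the repeat.
repeat = "GTTTTAGAGCTA"
region = "CCCC" + reverse_complement(repeat) + "GGGG"
hits = find_anti_repeat_candidates(region, repeat)
```

Adjacent hits can then be merged into a single candidate anti-repeat region and inspected further; a production pipeline would more likely use a local alignment tool that tolerates mismatches.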
- a “guide nucleic acid” or “guide polynucleotide” refers to a nucleic acid that may hybridize to a target nucleic acid and thereby directs an associated nuclease to the target nucleic acid.
- a guide nucleic acid is, but is not limited to, RNA (guide RNA or gRNA), DNA, or a mixture of RNA and DNA.
- a guide nucleic acid can include a crRNA or a tracrRNA or a combination of both.
- guide nucleic acid encompasses an engineered guide nucleic acid and a programmable guide nucleic acid to specifically bind to the target nucleic acid.
- a portion of the target nucleic acid may be complementary to a portion of the guide nucleic acid.
- the strand of a double-stranded target polynucleotide that is complementary to and hybridizes with the guide nucleic acid is the complementary strand.
- the strand of the double-stranded target polynucleotide that is complementary to the complementary strand, and therefore is not complementary to the guide nucleic acid, is called the noncomplementary strand.
- a guide nucleic acid having one polynucleotide chain is a “single guide nucleic acid.”
- a guide nucleic acid having two polynucleotide chains is a “double guide nucleic acid.” If not otherwise specified, the term “guide nucleic acid” is inclusive, referring to both single guide nucleic acids and double guide nucleic acids.
- a guide nucleic acid may comprise a segment referred to as a “nucleic acid-targeting segment” or a “nucleic acid-targeting sequence,” or a “spacer.”
- a nucleic acid-targeting segment can include a sub-segment referred to as a “protein binding segment” or “protein binding sequence” or “Cas protein binding segment.”
- sequence identity or “percent identity” in the context of two or more nucleic acids or polypeptide sequences, generally refers to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a local or global comparison window, as measured using a sequence comparison algorithm.
- Suitable sequence comparison algorithms for polypeptide sequences include, e.g., BLASTP using parameters of a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix setting gap costs at existence of 11, extension of 1, and using a conditional compositional score matrix adjustment for polypeptide sequences longer than 30 residues; BLASTP using parameters of a wordlength (W) of 2, an expectation (E) of 1000000, and the PAM30 scoring matrix setting gap costs at 9 to open gaps and 1 to extend gaps for sequences of less than 30 residues (these are the default parameters for BLASTP in the BLAST suite available at https://blast.ncbi.nlm.nih.gov); CLUSTALW with parameters of ; the Smith-Waterman homology search algorithm with parameters of a match of 2, a mismatch of -1, and a gap of -1; MUSCLE with default parameters; MAFFT with parameters retree of 2 and maxiterations of 1000; Novafold with default parameters; HMMER hmmalign
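As an illustration of one of the algorithms listed above, the following is a minimal Python sketch of Smith-Waterman local alignment using the stated scores (match of 2, mismatch of -1, gap of -1). It assumes a linear gap penalty, and the `percent_identity` helper assumes identity is counted over the full length of the local alignment including gap columns; both are illustrative choices, and the function names are hypothetical rather than taken from the source.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; scores are floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical, non-gap residues."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


score, al_a, al_b = smith_waterman("TTACGTTT", "GGACGTGG")
identity = percent_identity(al_a, al_b)
```

Here the shared core `ACGT` is recovered as the local alignment. Reported identity values depend on the denominator convention and on the comparison window, so figures from different tools are not directly interchangeable.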
- RuvC III domain refers to a third discontinuous segment of a RuvC endonuclease domain (the RuvC nuclease domain being comprised of three discontiguous segments, RuvC I, RuvC II, and RuvC III).
- a RuvC domain or segments thereof can generally be identified by alignment to documented domain sequences, structural alignment to proteins with annotated domains, or by comparison to Hidden Markov Models (HMMs) built based on documented domain sequences (e.g., Pfam HMM PF18541 for RuvC III).
- HNH domain refers to an endonuclease domain having characteristic histidine and asparagine residues.
- An HNH domain can generally be identified by alignment to documented domain sequences, structural alignment to proteins with annotated domains, or by comparison to Hidden Markov Models (HMMs) built based on documented domain sequences (e.g., Pfam HMM PF01844 for domain HNH).
- base editor refers to an enzyme that catalyzes the conversion of one target base or base pair into another (e.g., A:T to G:C, C:G to T:A) without requiring the creation and repair of a double-strand break.
- An exemplary base editor is a deaminase.
- the base editor comprises a deaminase and a nuclease that is deficient in nuclease activity.
- the base editor comprises a deaminase and a catalytically inactive nuclease.
- the base editor comprises a fusion of a deaminase and a catalytically inactive nuclease.
- deaminase refers to a protein or enzyme that catalyzes a deamination reaction (i.e., a reaction that removes an amino group).
- Deaminases include adenosine deaminases, which catalyze the hydrolytic deamination of adenine or adenosine (e.g., an engineered adenosine deaminase that deaminates adenosine in DNA), and cytidine (or cytosine) deaminases, which catalyze the hydrolytic deamination of cytidine (or cytosine) or deoxycytidine to uridine (or uracil) or deoxyuridine, respectively.
- the deaminase or deaminase domain can be a naturally-occurring deaminase or deaminase domain from an organism, such as a human, chimpanzee, gorilla, monkey, cow, dog, rat, mouse, or bacterium (e.g., E. coli), a variant of a naturally-occurring deaminase or deaminase domain, or a non-naturally occurring deaminase or deaminase domain.
- optimally aligned in the context of two or more nucleic acids or polypeptide sequences, generally refers to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that have been aligned to maximal correspondence of amino acid residues or nucleotides, for example, as determined by the alignment producing a highest or “optimized” percent identity score.
- the term “complex” refers to a joining of at least two components.
- the two components may each retain the properties/activities they had prior to forming the complex or gain properties as a result of forming the complex.
- the joining includes, but is not limited to, covalent bonding, non-covalent bonding (i.e., hydrogen bonding, ionic interactions, Van der Waals interactions, and hydrophobic bonding), use of a linker, fusion, or any other suitable method.
- Contemplated components of the complex include polynucleotides, polypeptides, or combinations thereof.
- a complex comprises an endonuclease and a guide polynucleotide.
- variants of any of the enzymes described herein can be made with one or more conservative amino acid substitutions in the amino acid sequence of the polypeptide without disrupting the three-dimensional structure or function of the polypeptide.
- Conservative substitutions can be accomplished by substituting amino acids with similar hydrophobicity, polarity, and R chain length for one another. Additionally, or alternatively, by comparing aligned sequences of homologous proteins from different species, conservative substitutions can be identified by locating amino acid residues that have been mutated between species (e.g., non-conserved residues) without altering the basic functions of the encoded proteins.
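As a non-limiting illustration of the first approach described above, a substitution can be classified as conservative when both residues fall in the same physicochemical class. The grouping below is an assumption chosen for the sketch (the source does not define specific classes), and cysteine, glycine, and proline are each treated as their own class.

```python
# Hypothetical residue classes; the patent text does not prescribe a grouping.
RESIDUE_CLASS = {
    **{aa: "aliphatic" for aa in "AVLIM"},
    **{aa: "aromatic" for aa in "FWY"},
    **{aa: "polar_uncharged" for aa in "STNQ"},
    **{aa: "positive" for aa in "KRH"},
    **{aa: "negative" for aa in "DE"},
    "C": "cysteine",
    "G": "glycine",
    "P": "proline",
}


def is_conservative(ref: str, alt: str) -> bool:
    """True if ref -> alt swaps two different residues of the same class."""
    if ref not in RESIDUE_CLASS or alt not in RESIDUE_CLASS:
        raise ValueError("expected one-letter codes for the 20 standard amino acids")
    return ref != alt and RESIDUE_CLASS[ref] == RESIDUE_CLASS[alt]
```

Under this assumed grouping, a glutamate-to-aspartate change counts as conservative while a tryptophan-to-glycine change does not; substitution matrices such as BLOSUM62 offer a more graded alternative.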
- Such conservatively substituted variants may include variants with at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of the endonuclease protein sequences described herein.
- such conservatively substituted variants are functional variants.
- Such functional variants can encompass sequences with substitutions such that the activity of one or more critical active site residues or guide RNA binding residues of the endonuclease are not disrupted.
- a decreased-activity variant of a protein described herein comprises a disrupting substitution of at least one, at least two, or all three catalytic residues.
- any of the endonucleases described herein can comprise a nickase mutation.
- any of the endonucleases described herein can comprise a RuvC domain lacking nuclease activity.
- any of the endonucleases described herein can be configured to cleave one strand of a double-stranded target deoxyribonucleic acid. In some embodiments, any of the endonucleases described herein can be configured to lack endonuclease activity or be catalytically dead.
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
- Metagenomic sequencing from natural environmental niches that represent large numbers of microbial species may offer the potential to drastically increase the number of new CRISPR systems documented and speed the discovery of new oligonucleotide editing functionalities.
- a recent example of the fruitfulness of such an approach is demonstrated by the 2016 discovery of CasX/CasY CRISPR systems from metagenomic analysis of natural microbial communities.
- CRISPR systems are RNA-directed nuclease complexes that have been described to function as an adaptive immune system in microbes.
- CRISPR systems occur in CRISPR (clustered regularly interspaced short palindromic repeats) operons or loci, which generally comprise two parts: (i) an array of short repetitive sequences (30-40bp) separated by equally short spacer sequences, which encode the RNA-based targeting element; and (ii) ORFs encoding the nuclease polypeptide directed by the RNA-based targeting element alongside accessory proteins/enzymes.
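As a non-limiting illustration of the repeat-spacer organization described above, spacers can be recovered from an array by splitting on the repeat. The sketch assumes the repeat is known and perfectly conserved, which real arrays need not satisfy; the sequences used with it would be toy placeholders, not sequences from this disclosure.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    if not repeat:
        raise ValueError("repeat must be non-empty")
    parts = array.split(repeat)
    # parts[0] is the leader before the first repeat and parts[-1] is the
    # trailer after the last repeat; everything in between is a spacer.
    return parts[1:-1]


# Toy example: a 32-nt placeholder repeat flanking two placeholder spacers.
REPEAT = "GTCGCACTCTTCATGGGTGCGTGGATTGAAAT"
SPACER_1 = "ACCTGAACTGATTCAGGCATTACCGATAAGCT"
SPACER_2 = "TTGACCGGATAACGTTCAGGACTTAACCGTAG"
toy_array = "ATATAT" + REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT
spacers = extract_spacers(toy_array, REPEAT)
```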
- Efficient nuclease targeting of a particular target nucleic acid sequence generally requires both (i) complementary hybridization between the first 6-8 nucleic acids of the target (the target seed) and the crRNA guide; and (ii) the presence of a protospacer-adjacent motif (PAM) sequence within a defined vicinity of the target seed (the PAM usually being a sequence not commonly represented within the host genome).
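As a non-limiting illustration of the PAM requirement described above, candidate target sites can be enumerated by scanning a sequence for a protospacer immediately followed by a PAM. The PAM pattern "NGG", its position 3' of the protospacer, and the 20-nt protospacer length are assumptions for the sketch; the PAMs of the enzymes described herein are defined by their own sequence identifiers. Only the given strand is scanned.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def find_targets(dna: str, pam: str = "NGG", spacer_len: int = 20):
    """Return (start, protospacer, pam_match) for each site on the given strand."""
    pam_re = "".join(IUPAC[c] for c in pam.upper())
    # The lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_re}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


hits = find_targets("TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA")
```

A fuller search would also scan the reverse complement and score seed complementarity against the guide.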
- CRISPR systems are commonly organized into 2 classes, 5 types and 16 subtypes based on shared functional characteristics and evolutionary similarity (see FIG. 1).
- Class 1 CRISPR systems have large, multisubunit effector complexes, and comprise Types I, III, and IV.
- Type I CRISPR systems are considered of moderate complexity in terms of components.
- the array of RNA-targeting elements is transcribed as a long precursor crRNA (pre-crRNA) that is processed at repeat elements to liberate short, mature crRNAs that direct the nuclease complex to nucleic acid targets when they are followed by a suitable short consensus sequence called a protospacer-adjacent motif (PAM).
- This processing occurs via an endoribonuclease subunit (Cas6) of a large endonuclease complex called Cascade, which also comprises a nuclease (Cas3) protein component of the crRNA-directed nuclease complex.
- Type I nucleases function primarily as DNA nucleases.
- Type III CRISPR systems may be characterized by the presence of a central nuclease, known as Cas10, alongside a repeat-associated mysterious protein (RAMP) that comprises Csm or Cmr protein subunits.
- the mature crRNA is processed from a pre-crRNA using a Cas6-like enzyme.
- type III systems appear to target and cleave DNA-RNA duplexes (such as DNA strands being used as templates for an RNA polymerase).
- Type IV CRISPR systems possess an effector complex that comprises a highly reduced large subunit nuclease (csf1), two genes for RAMP proteins of the Cas5 (csf3) and Cas7 (csf2) groups, and, in some cases, a gene for a predicted small subunit; such systems are commonly found on endogenous plasmids.
- Class 2 CRISPR systems generally have single-polypeptide multidomain nuclease effectors, and comprise Types II, V and VI.
- Type II CRISPR systems are considered the simplest in terms of components.
- the processing of the CRISPR array into mature crRNAs does not require the presence of a special endonuclease subunit, but rather a small trans-encoded crRNA (tracrRNA) with a region complementary to the array repeat sequence; the tracrRNA interacts with both its corresponding effector nuclease (e.g., Cas9) and the repeat sequence to form a precursor dsRNA structure, which is cleaved by endogenous RNAse III to generate a mature effector enzyme loaded with both tracrRNA and crRNA.
- Type II nucleases are known as DNA nucleases.
- Type II effectors generally exhibit a structure comprising a RuvC-like endonuclease domain that adopts the RNase H fold with an unrelated HNH nuclease domain inserted within the folds of the RuvC-like nuclease domain.
- the HNH domain is responsible for the cleavage of the target (e.g., crRNA complementary) DNA strand, while the RuvC-like domain is responsible for cleavage of the displaced DNA strand.
- Type V CRISPR systems are characterized by a nuclease effector (e.g., Cas12) structure similar to that of Type II effectors, comprising a RuvC-like domain. Similar to Type II, most (but not all) Type V CRISPR systems use a tracrRNA to process pre-crRNAs into mature crRNAs; however, unlike Type II systems, which require RNAse III to cleave the pre-crRNA into multiple crRNAs, Type V systems are capable of using the effector nuclease itself to cleave pre-crRNAs. Like Type II CRISPR systems, Type V CRISPR systems are again known as DNA nucleases.
- some Type V enzymes (e.g., Cas12a) appear to have a robust single-stranded nonspecific deoxyribonuclease activity that is activated by the first crRNA-directed cleavage of a double-stranded target sequence.
- Type VI CRISPR systems have RNA-guided RNA endonucleases. Instead of RuvC-like domains, the single polypeptide effector of Type VI systems (e.g., Cas13) comprises two HEPN ribonuclease domains. Differing from both Type II and V systems, Type VI systems also may not require a tracrRNA in some instances for processing of pre-crRNA into crRNA. Similar to Type V systems, however, some Type VI systems (e.g., C2C2) appear to possess robust single-stranded nonspecific nuclease (ribonuclease) activity activated by the first crRNA-directed cleavage of a target RNA.
- Class 2 CRISPR systems have been most widely adopted for engineering and development as designer nuclease/genome editing applications.
- One of the early adaptations of such a system for in vitro use involved (i) recombinantly expressed, purified full-length Cas9 (e.g., a Class 2, Type II Cas enzyme) isolated from S. pyogenes SF370, (ii) purified mature ~42 nt crRNA bearing a ~20 nt 5’ sequence complementary to the target DNA sequence desired to be cleaved followed by a 3’ tracr-binding sequence (the whole crRNA being in vitro transcribed from a synthetic DNA template carrying a T7 promoter sequence); (iii) purified tracrRNA in vitro transcribed from a synthetic DNA template carrying a T7 promoter sequence, and (iv) Mg2+.
- a later improved, engineered system involved the crRNA of (ii) joined to the 5’ end of (iii) by a linker (e.g., GAAA) to form a single fused synthetic guide RNA (sgRNA) capable of directing Cas9 to a target by itself.
- Such engineered systems can be adapted for use in mammalian cells by providing DNA vectors encoding (i) an ORF encoding codon-optimized Cas9 (e.g., a Class 2, Type II Cas enzyme) under a suitable mammalian promoter with a C-terminal nuclear localization sequence (e.g., SV40 NLS) and a suitable polyadenylation signal (e.g., TK pA signal); and (ii) an ORF encoding an sgRNA (having a 5’ sequence beginning with G followed by 20 nt of a complementary targeting nucleic acid sequence joined to a 3’ tracr-binding sequence, a linker, and the tracrRNA sequence) under a suitable Polymerase III promoter (e.g., the U6 promoter).
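As a non-limiting illustration of the sgRNA layout described above (a 5' G, 20 nt of targeting sequence, a tracr-binding sequence, a linker, and the tracrRNA sequence), the pieces can be concatenated in order. The function name is hypothetical, and the tracr-binding and tracrRNA strings used below are short placeholders rather than functional scaffold sequences; only the GAAA linker comes from the text.

```python
def build_sgrna(spacer: str, tracr_binding: str, tracr: str, linker: str = "GAAA") -> str:
    """Concatenate 5'-G + 20-nt spacer + tracr-binding + linker + tracrRNA as RNA."""
    def to_rna(seq: str) -> str:
        rna = seq.upper().replace("T", "U")
        if set(rna) - set("ACGU"):
            raise ValueError(f"unexpected characters in {seq!r}")
        return rna

    spacer_rna = to_rna(spacer)
    if len(spacer_rna) != 20:
        raise ValueError("this sketch assumes a 20-nt targeting sequence")
    return "G" + spacer_rna + to_rna(tracr_binding) + to_rna(linker) + to_rna(tracr)


# Placeholder scaffold pieces, for illustration only.
sgrna = build_sgrna("ACGTACGTACGTACGTACGT", "GUUUUAGA", "UCUAAAAC")
```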
- Base editing is the conversion of one target base or base pair into another (e.g., A:T to G:C, C:G to T:A) without requiring the creation and repair of a double-strand break.
- the base editing may be achieved with the help of DNA and RNA base editors that allow the introduction of point mutations at specific sites, in either DNA or RNA.
- DNA base editors may comprise a fusion of a catalytically inactive nuclease and a catalytically active base-modification enzyme that acts on single-stranded DNAs (ssDNAs).
- RNA base editors may comprise similar, RNA-specific enzymes. Base editing may increase the efficiency of gene modification, while reducing the off-target and random mutations in the DNA.
- DNA base editors are engineered ribonucleoprotein complexes that act as tools for single base substitution in cells and organisms. They may be created by fusing an engineered base-modification enzyme and a catalytically deficient CRISPR endonuclease variant that cannot cut dsDNA but is able to unfold the dsDNA in a protospacer adjacent motif (PAM) sequence-dependent manner, such that a guide RNA can find its complementary target to indicate a ssDNA scission site. The guide RNA anneals to the complementary DNA, displacing a fragment of ssDNA and directing the CRISPR 'scissors' to the base modification site. The cellular repair machinery will repair the nicked non-edited strand using information from the complementary edited template.
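As a non-limiting illustration of the net outcome of base editing on the displaced single strand, the conversion can be modeled as replacing each target base inside an editing window. The window of protospacer positions 4 to 8 (1-based, inclusive) and the editor labels are assumptions for the sketch; actual editing windows depend on the enzyme, and real outcomes are probabilistic rather than all-or-none.

```python
# Net base change on the edited strand: C->T for a cytosine base editor,
# A->G for an adenine base editor.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def apply_base_edit(protospacer: str, editor: str, window: tuple[int, int] = (4, 8)) -> str:
    """Convert every target base whose 1-based position lies inside `window`."""
    source, product = CONVERSIONS[editor]
    start, end = window
    return "".join(
        product if base == source and start <= pos <= end else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    )


edited_c = apply_base_edit("AAACCCAAACCC", "CBE")
edited_a = apply_base_edit("AAACCCAAACCC", "ABE")
```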
- CBEs cytosine base editors
- ABEs adenine base editors
- engineered systems comprising: (a) a base editor; (b) an endonuclease that is configured to bind the base editor and is deficient in nuclease activity; and (c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- engineered base editing systems comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NOs: 1128-1160 and 1363-1415; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NOs: 1128-1160 and 1363-1415; and an engineered guide polynucleotide which forms a complex with an endonuclease deficient in nuclease activity of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising: a base editor comprising a sequence having at least 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising: a base editor comprising a sequence having at least 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023; and an engineered guide polynucleotide which forms a complex with an endonuclease deficient in nuclease activity of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- the base editor comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644.
- the base editor comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415.
- the base editor comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence having 100% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- the base editor comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- the base editor comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- the base editor comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 86% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- the base editor comprises a sequence having at least about 87% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 88% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 89% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- the base editor comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
- the base editor comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023. In some embodiments, the base editor comprises a sequence having 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023.
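As a non-limiting illustration, the sequence identifier lists and identity thresholds recited above can be handled programmatically by expanding each list into a set of identifiers and comparing a measured percent identity against a chosen threshold. The function names are hypothetical, and the sketch assumes the list text contains only integers and hyphenated integer ranges.

```python
import re


def expand_seq_ids(text: str) -> set[int]:
    """Expand e.g. '1654-1703 and 2021-2023' into the set of integer identifiers."""
    ids: set[int] = set()
    # Each match is a single number or a 'low-high' range (spaces tolerated).
    for low, high in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", text):
        ids.update(range(int(low), int(high or low) + 1))
    return ids


def meets_threshold(percent_identity: float, threshold: float) -> bool:
    """True if the measured identity satisfies an 'at least X%' recitation."""
    return percent_identity >= threshold


listed = expand_seq_ids("1654-1703 and 2021-2023")
```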
- engineered base editing system comprising: a base editor encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing system comprising: a base editor encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor deficient in nuclease activity and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- the base editor is encoded by a nucleic acid having a sequence with at least 70%, at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1727-1757.
- the base editor is encoded by a nucleic acid sequence having at least about 70% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 75% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 80% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 85% identity to any one of SEQ ID NOs: 1727-1757.
- the base editor is encoded by a nucleic acid having a sequence with at least about 90% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 95% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 96% identity to any one of SEQ ID NOs:
- the base editor is encoded by a nucleic acid having a sequence with at least about 97% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 98% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence with at least about 99% identity to any one of SEQ ID NOs: 1727-1757. In some embodiments, the base editor is encoded by a nucleic acid having a sequence having 100% identity to any one of SEQ ID NOs: 1727-1757.
- the base editor comprises a deaminase.
- the deaminase binds non-covalently to the endonuclease.
- the deaminase is covalently linked to the endonuclease.
- the deaminase is fused to the endonuclease.
- the base editor is an adenine deaminase.
- the adenosine deaminase comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703. In some embodiments, the adenosine deaminase comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703. In some embodiments, the adenosine deaminase comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703. In some embodiments, the adenosine deaminase comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703. In some embodiments, the adenosine deaminase comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703. In some embodiments, the adenosine deaminase comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the adenosine deaminase comprises a sequence having 100% identity to any one of SEQ ID NOs: 50-51, 57, 385-443, 448-475, 595, 1356-1415, 1424, 1556-1644, 1654-1658, and 1665-1703.
- the base editor is a cytosine deaminase.
- the cytosine deaminase comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-49, 58-66.
- the cytosine deaminase comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664.
- the cytosine deaminase comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664.
- the cytosine deaminase comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664.
- the cytosine deaminase comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664.
- the cytosine deaminase comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447, 594, 744-835, 970-982, and 1659-1664. In some embodiments, the cytosine deaminase comprises a sequence having 100% identity to any one of SEQ ID NOs: 1-49, 58-66, 444-447.
- the base editor comprises one or more modifications.
- the base editor comprises a substitution at at least one of residues T2, D7, E10, M13, W24, G32, K38, G45, G51, A63, E66, R75, C91, G93, H97, A107, E108, D109, P110, H124, A126, H129, F150, or S165, or any combination thereof relative to SEQ ID NO: 50 when optimally aligned.
- the substitution comprises W24G, G51V, E108D, P110H, F150P, D7G, E10G, or H129N, or any combination thereof, relative to SEQ ID NO: 50 when optimally aligned.
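As a non-limiting illustration of the substitution notation used above (reference residue, 1-based position, replacement residue), a list of substitutions can be parsed and applied to a reference sequence. The sketch assumes the sequence passed in is the reference itself, so no alignment-based position mapping is performed, and the toy sequence below is a placeholder that is not SEQ ID NO: 50.

```python
import re


def apply_substitutions(reference: str, substitutions: list[str]) -> str:
    """Apply substitutions such as 'D7G' after checking the reference residue."""
    residues = list(reference)
    for sub in substitutions:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if m is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        ref, pos, alt = m.group(1), int(m.group(2)), m.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the reference")
        if reference[pos - 1] != ref:
            raise ValueError(f"{sub}: reference has {reference[pos - 1]} at position {pos}")
        residues[pos - 1] = alt
    return "".join(residues)


# Toy 10-residue placeholder with D at position 7 and E at position 10.
toy_reference = "MTAAAADAAE"
variant = apply_substitutions(toy_reference, ["D7G", "E10G"])
```

For a homolog rather than the reference itself, positions would first be mapped through an optimal alignment to the reference.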
- the substitution comprises T2X1, D7X1, E10X1, M13X4, W24X1, G32X1, K38X2, G45X2, G51X5, A63X7.
- the endonuclease comprises a RuvC domain and an HNH domain. In some embodiments, the RuvC domain lacks nuclease activity. In some embodiments, the endonuclease comprises a nickase mutation. In some embodiments, the endonuclease is derived from an uncultivated microorganism. In some embodiments, the endonuclease is a class 2, type II endonuclease. In some embodiments, the endonuclease is configured to cleave one strand of a target nucleic acid (e.g., DNA).
- the endonuclease is not a Cas9 endonuclease, a Cas14 endonuclease, a Cas12a endonuclease, a Cas12b endonuclease, a Cas12c endonuclease, a Cas12d endonuclease, a Cas12e endonuclease, a Cas13a endonuclease, a Cas13b endonuclease, a Cas13c endonuclease, or a Cas13d endonuclease.
- the endonuclease has less than 80% identity to a Cas9 endonuclease.
- the endonuclease comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127.
- the endonuclease comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127.
- the endonuclease comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127.
- the endonuclease comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127.
- the endonuclease comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127. In some embodiments, the endonuclease comprises a sequence having 100% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127.
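The identity thresholds recited above can be made concrete with a small sketch of a percent-identity calculation. This is one common convention only (identical residues divided by alignment columns, with gaps counted as mismatches). This passage does not state which alignment algorithm or denominator is intended, so the function names, the gap character `-`, and the convention itself are assumptions. The inputs are expected to be already aligned to equal length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical non-gap columns divided by all columns
    in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, a 10-residue pair with a single mismatch scores 90%, so it would satisfy an "at least 90%" threshold but not an "at least 95%" one.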
- the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence comprising any one of SEQ ID NOs: 360-368 and 598. In some embodiments, the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence comprising any one of SEQ ID NOs: 360, 362, and 368.
- the engineered system disclosed herein comprises an engineered guide polynucleotide, e.g., a guide ribonucleic acid (gRNA), a single gRNA, or a dual guide RNA.
- the engineered guide polynucleotide (e.g., engineered guide RNA) is configured to form a complex with the engineered endonuclease.
- the engineered guide polynucleotide comprises a spacer sequence.
- the spacer sequence is configured to hybridize to a target nucleic acid sequence.
- the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence.
- the guide polynucleotide comprises a sequence that is encoded by a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%.
- the guide polynucleotide is encoded by a sequence having at least about 80% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680.
- the guide polynucleotide is encoded by a sequence having at least about 85% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having at least about 90% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having at least about 95% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having at least about 96% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having at least about 97% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105.
- the guide polynucleotide is encoded by a sequence having at least about 98% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having at least about 99% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide is encoded by a sequence having 100% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 80% identity to any one of SEQ ID NOs: 88-96, 488-489.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 85% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 90% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 95% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 96% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019. In some embodiments, the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 97% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 98% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having at least about 99% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the guide polynucleotide hybridizes or targets a sequence complementary to a sequence having 100% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
- the engineered guide polynucleotide comprises a sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1431-1454, 1479-1483, 1489-1490, 1705-1710, and 1758-2019.
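Because the SEQ ID NO lists above are given as comma-separated numbers and ranges, a small parser can expand such a list into a set for membership checks. The following is a sketch only. The function name `parse_seq_id_ranges` is hypothetical, and the example string is a short excerpt of the list recited above rather than the full list.

```python
import re


def parse_seq_id_ranges(text: str) -> set[int]:
    """Expand a list such as '88-96, 488-489, 876' into a set of integers.

    A range whose end is smaller than its start raises ValueError, which
    also flags digit groups split by stray spaces (e.g. '1 187-1 195').
    """
    ids: set[int] = set()
    for m in re.finditer(r"(\d+)(?:\s*-\s*(\d+))?", text):
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) is not None else start
        if end < start:
            raise ValueError(f"descending range: {m.group(0)!r}")
        ids.update(range(start, end + 1))
    return ids


guide_ids = parse_seq_id_ranges("88-96, 488-489, 679-680, and 876")
is_listed = 90 in guide_ids
```

The descending-range check is a cheap sanity test for transcription damage in long numeric lists.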
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 85% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019.
- the engineered guide polynucleotide comprises a sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019. In some embodiments, the engineered guide polynucleotide comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 1431-1454, 1704, and 2010-2019.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1890-1976. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 85% sequence identity to any one of SEQ ID NOs: 1890-1976. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1890-1976. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1890-1976. In some embodiments, the engineered guide polynucleotide comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 1890-1976.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1977-2009. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 85% sequence identity to any one of SEQ ID NOs: 1977-2009. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1977-2009. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1977-2009. In some embodiments, the engineered guide polynucleotide comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 1977-2009.
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 85% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889. In some embodiments, the engineered guide polynucleotide comprises a sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889. In some embodiments, the engineered guide polynucleotide comprises a sequence having 100% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889.
- the guide polynucleotide comprises a sequence complementary to a eukaryotic, fungal, plant, mammalian, or human genomic polynucleotide sequence. In some embodiments, the guide polynucleotide comprises a sequence complementary to a eukaryotic genomic polynucleotide sequence. In some embodiments, the guide polynucleotide comprises a sequence complementary to a fungal genomic polynucleotide sequence. In some embodiments, the guide polynucleotide comprises a sequence complementary to a plant genomic polynucleotide sequence.
- the guide polynucleotide comprises a sequence complementary to a mammalian genomic polynucleotide sequence. In some embodiments, the guide polynucleotide comprises a sequence complementary to a human genomic polynucleotide sequence.
- the guide polynucleotide is 30-250 nucleotides in length. In some embodiments, the guide polynucleotide is 42-44 nucleotides in length. In some embodiments, the guide polynucleotide is 42 nucleotides in length. In some embodiments, the guide polynucleotide is 43 nucleotides in length. In some embodiments, the guide polynucleotide is 44 nucleotides in length. In some embodiments, the guide polynucleotide is 85-245 nucleotides in length. In some embodiments, the guide polynucleotide is more than 90 nucleotides in length.
- the guide polynucleotide is less than 245 nucleotides in length. In some embodiments, the guide RNA is 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, or more than 240 nucleotides in length. In some embodiments, the guide RNA is about 30 to about 40, about 30 to about 50, about 30 to about 60, about 30 to about 70, about 30 to about 80, about 30 to about 90, about 30 to about 100, about 30 to about 120, about 30 to about 140, about 30 to about 160, about 30 to about 180, about 30 to about 200, about 30 to about 220, about 30 to about 240, about 50 to about 60, about 50 to about 70, about 50 to about 80, about 50 to about 90, about 50 to about 100, about 50 to about 120, about 50 to about 140, about 50 to about 160, about 50 to about 180, about 50 to about 200, about 50 to about 220, about 50 to about 240, about 100 to about 120, about 100 to about 140, about 100 to about 160, about 100 to about 180, about 100 to about 200, about 100 to about 220, about 100 to about 240, about 160 to about 180, about 160 to about 200, about 160 to about 220, or about 160 to about 240 nucleotides.
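As an illustration of the length embodiments above, the following sketch reports which of three recited length ranges (30-250, 42-44, and 85-245 nucleotides) a given guide sequence falls into. The dictionary `LENGTH_RANGES` and the function name are hypothetical, and only a few of the recited ranges are included, with bounds treated as inclusive.

```python
# A few of the recited length ranges, as inclusive (min, max) bounds.
LENGTH_RANGES = {
    "30-250": (30, 250),
    "42-44": (42, 44),
    "85-245": (85, 245),
}


def matching_length_ranges(guide: str) -> list[str]:
    """Return the names of the ranges that contain the guide's length."""
    n = len(guide)
    return [name for name, (lo, hi) in LENGTH_RANGES.items() if lo <= n <= hi]


# A placeholder 43-nucleotide sequence (not a sequence from the listing).
example_ranges = matching_length_ranges("A" * 43)
```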
- the engineered guide polynucleotide comprises synthetic nucleotides or modified nucleotides. In some embodiments, the engineered guide polynucleotide comprises one or more inter-nucleoside linkers modified from the natural phosphodiester. In some embodiments, all of the inter-nucleoside linkers of the engineered guide polynucleotide, or contiguous nucleotide sequence thereof, are modified.
- the inter-nucleoside linkage comprises sulphur (S), such as a phosphorothioate inter-nucleoside linkage.
- the engineered guide polynucleotide comprises modifications to a ribose sugar or nucleobase.
- the engineered guide polynucleotide comprises one or more nucleosides comprising a modified sugar moiety, wherein the modified sugar moiety is a modification of the sugar moiety when compared to the ribose sugar moiety found in deoxyribose nucleic acid (DNA) and RNA.
- the modification is within the ribose ring structure. Exemplary modifications include, but are not limited to.
- the sugar-modified nucleosides comprise bicyclohexose nucleic acids or tricyclic nucleic acids.
- the modified nucleosides comprise nucleosides where the sugar moiety is replaced with a non-sugar moiety, for example peptide nucleic acids (PNA) or morpholino nucleic acids.
- the engineered guide polynucleotide comprises one or more modified sugars.
- the sugar modifications comprise modifications made by altering the substituent groups on the ribose ring to groups other than hydrogen, or the 2’-OH group naturally found in DNA and RNA nucleosides.
- substituents are introduced at the 2’, 3’, 4’, or 5’ positions, or combinations thereof.
- nucleosides with modified sugar moieties comprise 2’ modified nucleosides, e.g., 2’ substituted nucleosides.
- a 2’ sugar modified nucleoside, in some embodiments, is a nucleoside that has a substituent other than -H or -OH at the 2’ position (2’ substituted nucleoside) or comprises a 2’ linked biradical, and comprises 2’ substituted nucleosides and LNA (2’-4’ biradical bridged) nucleosides.
- Examples of 2’-substituted modified nucleosides comprise, but are not limited to.
- the modification in the ribose group comprises a modification at the 2’ position of the ribose group.
- the modification at the 2’ position of the ribose group is selected from the group consisting of 2’-O-methyl, 2’-fluoro, 2’-deoxy, and 2’-O-(2-methoxyethyl).
- the engineered guide polynucleotide comprises one or more modified sugars. In some embodiments, the engineered guide polynucleotide comprises only modified sugars. In certain embodiments, the engineered guide polynucleotide comprises greater than about 10%, 25%, 50%, 75%, or 90% modified sugars. In some embodiments, the modified sugar is a bicyclic sugar. In some embodiments, the modified sugar comprises a 2’-O-methoxyethyl group. In some embodiments, the engineered guide polynucleotide comprises both inter-nucleoside linker modifications and nucleoside modifications.
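The modified-sugar percentages above can be computed with a short sketch. The `Nucleotide` record and its `sugar_mod` field are a hypothetical representation chosen for illustration, and the percentage is assumed to mean modified nucleotides divided by total nucleotides in the guide.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Nucleotide:
    base: str
    # Name of the sugar modification, e.g. "2'-O-methyl"; None means unmodified ribose.
    sugar_mod: Optional[str] = None


def modified_sugar_fraction(guide: Sequence[Nucleotide]) -> float:
    """Fraction of nucleotides in the guide that carry a sugar modification."""
    if not guide:
        raise ValueError("guide must contain at least one nucleotide")
    modified = sum(1 for nt in guide if nt.sugar_mod is not None)
    return modified / len(guide)


# Toy 4-nucleotide guide with three modified sugars (hypothetical example).
toy_guide = [
    Nucleotide("G", "2'-O-methyl"),
    Nucleotide("A", "2'-fluoro"),
    Nucleotide("C"),
    Nucleotide("U", "2'-O-methyl"),
]
fraction = modified_sugar_fraction(toy_guide)
```

In this toy guide the fraction is 0.75, which exceeds the 50% level but not the 90% level.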
- the engineered guide polynucleotide comprises a hairpin comprising at least 8 base-paired ribonucleotides. In some embodiments, the engineered guide polynucleotide comprises a hairpin comprising at least 9 base-paired ribonucleotides. In some embodiments, the engineered guide polynucleotide comprises a hairpin comprising at least 10 base-paired ribonucleotides. In some embodiments, the engineered guide polynucleotide comprises a hairpin comprising at least 11 base-paired ribonucleotides. In some embodiments, the engineered guide polynucleotide comprises a hairpin comprising at least 12 base-paired ribonucleotides.
- the engineered guide polynucleotide comprises a DNA-targeting segment.
- the DNA-targeting segment comprises a nucleotide sequence that is complementary to a target sequence.
- the target sequence is in a target DNA molecule.
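Complementarity between an RNA targeting segment and a DNA target can be checked with a short sketch. It tests only perfect Watson-Crick complementarity in antiparallel orientation, whereas the text also contemplates hybridization more generally. The function names and the toy sequences are hypothetical.

```python
# Map each DNA (or RNA) base to its RNA complement.
_RNA_COMPLEMENT = str.maketrans("ACGTU", "UGCAA")


def rna_reverse_complement(target: str) -> str:
    """RNA reverse complement of a target strand, written 5' to 3'."""
    return target.upper().translate(_RNA_COMPLEMENT)[::-1]


def is_fully_complementary(segment_rna: str, target_dna: str) -> bool:
    """True if the RNA segment pairs with every base of the DNA target."""
    # Normalise a DNA-style spelling (T) of the segment to RNA (U) first.
    segment = segment_rna.upper().replace("T", "U")
    return segment == rna_reverse_complement(target_dna)
```

For a toy target 5'-ATGC-3', the fully complementary RNA segment is 5'-GCAU-3'.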
- the engineered guide polynucleotide comprises a protein-binding segment.
- the protein-binding segment comprises two complementary stretches of nucleotides.
- the two complementary stretches of nucleotides hybridize to form a double-stranded RNA (dsRNA) duplex.
- the two complementary stretches of nucleotides are covalently linked to one another with intervening nucleotides.
- a T means U (Uracil) in RNA and T (Thymine) in DNA.
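The T/U convention above can be applied mechanically when a sequence written with T is to be read as RNA. The following is a minimal sketch, and the function name `as_rna` is hypothetical.

```python
def as_rna(listed_sequence: str) -> str:
    """Read a sequence written with T as RNA by replacing each T with U."""
    return listed_sequence.upper().replace("T", "U")


def as_dna(listed_sequence: str) -> str:
    """Read the same sequence as DNA; T is kept and any U is written as T."""
    return listed_sequence.upper().replace("U", "T")
```

For example, a toy sequence GATTACA reads as GAUUACA when interpreted as RNA and is unchanged when interpreted as DNA.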
- engineered base editing systems comprising an engineered base editing system comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising an engineered base editing system comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415; and an engineered guide polynucleotide which forms a complex with an endonuclease deficient in nuclease activity of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising an engineered base editing system comprising: a base editor comprising a sequence having at least 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- engineered base editing systems comprising an engineered base editing system comprising: a base editor comprising a sequence having at least 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023; and an engineered guide polynucleotide which forms a complex with an endonuclease deficient in nuclease activity of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-51, 57-
- the engineered system comprises a) a base editor comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-51, 57-
- the engineered system comprises a) a base editor comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-51, 57-
- the engineered system comprises a) a base editor comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-51, 57-
- the engineered system comprises a) a base editor comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-51, 57-
- the engineered system comprises a) a base editor comprising 100% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease; and c) an engineered guide polynucleotide.
- the engineered system comprises a) a base editor comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising 100% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence.
- the engineered system comprises a) a base editor comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418
- the engineered system comprises a) a base editor comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427
- the engineered system comprises a) a base editor comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-14
- the engineered system comprises a) a base editor comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427
- the engineered system comprises a) a base editor comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 14
- the engineered system comprises a) a base editor comprising 100% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising 100% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-14
- the guide polynucleotide hybridizes or targets a sequence complementary to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418.
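The SEQ ID NO lists in these embodiments are written as comma-separated numbers and ranges. As a minimal sketch (not part of the disclosure), the list can be parsed into ranges so that membership can be checked programmatically. The function names `parse_id_ranges` and `is_listed` are hypothetical, and the string below is copied from the base editor list above.

```python
import re

# Base editor SEQ ID NO list as written in the embodiments above.
BASE_EDITOR_IDS = (
    "1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, "
    "1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703"
)


def parse_id_ranges(text: str) -> list[range]:
    """Turn a list such as '1-51, 57-66, and 1424' into Python ranges.

    Whitespace around the hyphen is tolerated so that extraction
    artifacts such as '1128- 1186' still parse.
    """
    ranges = []
    for start_text, end_text in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", text):
        start = int(start_text)
        end = int(end_text) if end_text else start  # single ID, e.g. 1424
        if end < start:
            raise ValueError(f"descending range: {start}-{end}")
        ranges.append(range(start, end + 1))
    return ranges


def is_listed(seq_id: int, ranges: list[range]) -> bool:
    """Return True if seq_id falls inside any of the parsed ranges."""
    return any(seq_id in r for r in ranges)


if __name__ == "__main__":
    listed = parse_id_ranges(BASE_EDITOR_IDS)
    for candidate in (1424, 52, 1704):
        print(candidate, is_listed(candidate, listed))
```

The same parser can be applied to the guide polynucleotide or endonuclease lists by passing a different string.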
- the engineered system comprises a) a base editor comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 88
- the engineered system comprises a) a base editor comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415.
- an endonuclease configured to bind the base editor and comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
- the engineered system comprises a) a base editor comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 80% identity to any one of SEQ ID NOs:
- the engineered system comprises a) a base editor comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, or 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the
- the engineered system comprises a) a base editor comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at
- the engineered system comprises a) a base editor comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415.
- an endonuclease configured to bind the base editor and comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
- the engineered system comprises a) a base editor comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 88
- the engineered system comprises a) a base editor comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 70-78.
- an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
- the engineered system comprises a) a base editor comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415.
- an endonuclease configured to bind the base editor and comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
- the engineered system comprises a) a base editor comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 70-78, 596, 597, 1120, and 1122-1127; and c) an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 88-
- the engineered system comprises a) a base editor comprising 100% identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703; b) an endonuclease configured to bind the base editor and comprising a sequence having 100% identity to any one of SEQ ID NOs: 70-78.
- an engineered guide polynucleotide configured to form a complex with the endonuclease and comprising a spacer sequence configured to hybridize to a target nucleic acid sequence, the engineered guide polynucleotide comprising 100% identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
- the guide polynucleotide hybridizes or targets a sequence complementary to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710 or a sequence having at least 90%, 95%, 97%, 98%, or 99% sequence identity to any one of SEQ ID NOs: 88-96, 488-489, 679-680, 876, 917-931, 963-967, 1099-1105, 1187-1195, 1416-1418, 1427-1428, 1431-1454, 1479-1483, 1489-1490, and 1704-1710.
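The embodiments above step through a ladder of identity thresholds (70% up to 100%). The following is a minimal sketch of how a candidate sequence might be compared against a reference once the two have been aligned. It rests on assumptions that the text does not state: identity is counted over aligned columns (columns that are gaps in both sequences are skipped), and "at least about" is treated as a plain greater-than-or-equal test. The function names are hypothetical and the sequences are toy strings, not sequences from the listing.

```python
# Thresholds named in the embodiments above.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns in
    which at least one sequence has a residue. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list[int]:
    """List every threshold from the ladder that the identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # toy sequence for illustration only
    candidate = "MKTAYIAKQK"  # differs at the last position
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the counting step after alignment.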
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 70;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 88; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 360.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 71;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 89; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 361.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 73;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 91; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 363.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 75, or a variant thereof;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 93; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 365.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 76, or a variant thereof;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 94; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 366.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 77, or a variant thereof;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 95; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 367.
- the endonuclease comprises a sequence at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 78;
- the guide RNA structure comprises a sequence at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 96; and the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 368.
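The embodiments above pair each endonuclease with a guide RNA structure and a PAM by SEQ ID NO. A small lookup table makes those pairings easy to check; the sketch below only restates the pairings listed above, and the `Pairing` type and `pairing_for` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Pairing(NamedTuple):
    endonuclease: int  # SEQ ID NO of the endonuclease
    guide: int         # SEQ ID NO of the guide RNA structure
    pam: int           # SEQ ID NO of the PAM


# Pairings exactly as listed in the embodiments above.
PAIRINGS = (
    Pairing(70, 88, 360),
    Pairing(71, 89, 361),
    Pairing(73, 91, 363),
    Pairing(75, 93, 365),
    Pairing(76, 94, 366),
    Pairing(77, 95, 367),
    Pairing(78, 96, 368),
)

BY_ENDONUCLEASE = {p.endonuclease: p for p in PAIRINGS}


def pairing_for(endonuclease_seq_id: int) -> Pairing:
    """Look up the guide and PAM listed for an endonuclease SEQ ID NO."""
    try:
        return BY_ENDONUCLEASE[endonuclease_seq_id]
    except KeyError:
        raise KeyError(
            f"no pairing listed for SEQ ID NO: {endonuclease_seq_id}"
        ) from None


if __name__ == "__main__":
    print(pairing_for(73))
```

SEQ ID NOs that are not paired in this excerpt (for example 72 and 74) raise a `KeyError` rather than being guessed.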
- the base editor comprises an adenine deaminase. In some embodiments, the adenine deaminase comprises SEQ ID NO: 57. In some embodiments, the base editor comprises a cytosine deaminase. In some embodiments, the cytosine deaminase comprises SEQ ID NO: 58.
- the endonuclease or base editor comprises one or more modifications in a nickase domain.
- the nickase domain comprises an aspartate to alanine mutation at residue 9 relative to SEQ ID NO: 70, residue 13 relative to SEQ ID NOs: 71, 72, or 74, residue 12 relative to SEQ ID NO: 73, residue 17 relative to SEQ ID NO: 75, residue 23 relative to SEQ ID NO: 76, or residue 10 relative to SEQ ID NO: 597, or any combination thereof.
- the endonuclease or base editor comprises a substitution of 109N and at least one other substitution comprising any one of 24R, 37L.
- the endonuclease or base editor comprises at least one substitution of a wild-type amino acid for a non-wild-type amino acid comprising any one of W90A, W90F, W90H, W90Y, Y120F, Y120H, Y121F, Y121H, Y121Q, Y121A, Y121D, Y121W, H122Y, H122F, H122I, H122A, H122W, H122D, Y121T, R33A, R34A, R34K, H122A, R33A, R34A, R52A, N57G, H122A, E123A, E
- H121Q, H121A, H121D, H121W, R33A, K34A, H122A, H121A, R52A, P26R, P26A, N27R, N27A, W44A, W45A, K49G, S50G, R51G, R121A, I122A, N123A, Y88F, Y120F, P22R, P22A, K23A, K41R, K41A, E54A, E54A, E55A, K30A, K30R, M32A, M32K, Y117A, K118A, I119A, I119H, R120A, R121A, P46A, P46R, N29A, R27A, or N50G, or any combination thereof.
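The substitution labels above use the usual shorthand of wild-type residue, position, and replacement residue (for example, W90A replaces tryptophan at position 90 with alanine); labels such as 109N omit the wild-type residue. The following is a minimal sketch of parsing and applying such labels. The function names are hypothetical, positions are assumed to be 1-based, and the sequence used is a toy string rather than any sequence from the listing.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    wild_type: Optional[str]  # None when the label omits it, e.g. "109N"
    position: int             # 1-based residue position (assumption)
    replacement: str


_LABEL = re.compile(r"^([A-Z])?(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse labels such as 'W90A' or '109N'."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return Substitution(wild_type, int(position), replacement)


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of sequence with the substitution applied."""
    sub = parse_substitution(label)
    if not 1 <= sub.position <= len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    current = sequence[sub.position - 1]
    if sub.wild_type is not None and current != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at {sub.position}, found {current}"
        )
    return sequence[: sub.position - 1] + sub.replacement + sequence[sub.position:]


if __name__ == "__main__":
    toy = "M" * 89 + "W" + "K" * 10  # toy sequence with W at position 90
    print(apply_substitution(toy, "W90A")[85:95])
```

Checking the wild-type residue before substituting catches numbering mismatches between a label and the reference sequence it is applied to.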
- the nickase comprises an aspartate to alanine mutation at residue 9 relative to SEQ ID NO: 70, residue 13 relative to SEQ ID NOs: 71, 72, or 74, residue 12 relative to SEQ ID NO: 73, residue 17 relative to SEQ ID NO: 75, residue 23 relative to SEQ ID NO: 76, or residue 10 relative to SEQ ID NO: 597, or any combination thereof.
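The aspartate-to-alanine positions listed above differ by reference sequence, so a per-sequence table is a convenient way to keep them straight. This is a minimal sketch that restates those positions and applies the D-to-A change to a supplied sequence. The names `NICKASE_ASP_POSITION` and `make_nickase` are hypothetical, positions are assumed to be 1-based, and the example input is a toy string, not the actual SEQ ID NO: 597.

```python
# Residue position of the aspartate, keyed by reference SEQ ID NO,
# as listed in the embodiment above.
NICKASE_ASP_POSITION = {
    70: 9,
    71: 13,
    72: 13,
    74: 13,
    73: 12,
    75: 17,
    76: 23,
    597: 10,
}


def make_nickase(seq_id: int, sequence: str) -> str:
    """Replace the listed aspartate (D) with alanine (A).

    Raises ValueError if the residue at the listed position is not D,
    which usually signals that the wrong reference sequence was supplied.
    """
    position = NICKASE_ASP_POSITION[seq_id]  # 1-based (assumption)
    index = position - 1
    if index >= len(sequence):
        raise ValueError(f"sequence is shorter than position {position}")
    if sequence[index] != "D":
        raise ValueError(
            f"expected D at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + "A" + sequence[index + 1:]


if __name__ == "__main__":
    toy = "A" * 9 + "D" + "G" * 5  # toy sequence with D at position 10
    print(make_nickase(597, toy))
```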
- the engineered system further comprises a uracil DNA glycosylase inhibitor.
- the uracil DNA glycosylase inhibitor comprises a sequence with at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 52-56 and SEQ ID NO: 67.
- the uracil DNA glycosylase inhibitor comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 52-56 and 67.
- the uracil DNA glycosylase inhibitor comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 52-56 or 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 52-56 and 67.
- the uracil DNA glycosylase inhibitor comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 52-56 and 67. In some embodiments, the uracil DNA glycosylase inhibitor comprises a sequence having 100% identity to any one of SEQ ID NOs: 52-56 and 67.
- the base editor binds non-covalently to the endonuclease. In some embodiments, the base editor is covalently linked to the endonuclease. In some embodiments, the base editor is fused to the endonuclease at the N-terminus or at the C-terminus. In some embodiments, the base editor is fused to the endonuclease.
- the endonuclease is covalently linked to the base editor directly or covalently linked to the base editor through a linker.
- the linker comprises a sequence having at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%.
- the linker comprises a sequence having at least 80%, at least 81%, at least 82%.
- the system further comprises a source of Mg2+.
- the endonuclease comprises one or more nuclear localization sequences (NLSs) proximal to an N- or C-terminus of the endonuclease.
- the base editor comprises one or more nuclear localization sequences (NLSs) proximal to an N- or C-terminus of the endonuclease.
- the NLS can comprise any of the sequences in Table 2 below, or a combination thereof.
- the NLS comprises a sequence of any one of SEQ ID NOs: 369- 384 and 2024-2053, or a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about
- the NLS comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 91% identity to any one of SEQ ID NOs: 369-384 and 2024-2053.
- the NLS comprises a sequence having at least about 92% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 93% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 94% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 369-384 and 2024-2053.
- the NLS comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 369-384 and 2024-2053. In some embodiments, the NLS comprises a sequence having 100% identity to any one of SEQ ID NOs: 369-384 and 2024-2053.
- Described herein, in certain embodiments, is a cell comprising the systems described herein.
- the cell is a eukaryotic cell (e.g., a plant cell, an animal cell, a protist cell, or a fungal cell), a mammalian cell (a Chinese hamster ovary (CHO) cell, baby hamster kidney (BHK), human embryo kidney (HEK), mouse myeloma (NS0), or human retinal cells), an immortalized cell (e.g., a HeLa cell, a COS cell, a HEK-293T cell, a MDCK cell, a 3T3 cell, a PC12 cell, a Huh7 cell, a HepG2 cell, a K562 cell, a N2a cell, or a SY5Y cell), an insect cell (e.g., a Spodoptera frugiperda cell, a Trichoplusia ni cell, a Drosophila melanogaster cell, a S2 cell, or a Heliothis virescens cell), a yeast cell (e.g., a Saccharomyces cerevisiae cell, a Cryptococcus cell, or a Candida cell), a plant cell (e.g., a parenchyma cell, a collenchyma cell, or a sclerenchyma cell), a fungal cell (e.g., a Saccharomyces cerevisiae cell, a Cryptococcus
- the cell is a eukaryotic cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is an immortalized cell. In some embodiments, the cell is an insect cell. In some embodiments, the cell is a yeast cell. In some embodiments, the cell is a plant cell. In some embodiments, the cell is a fungal cell. In some embodiments, the cell is a prokaryotic cell.
- the cell is an A549, HEK-293, HEK-293T, BHK, CHO, HeLa, MRC5, Sf9, Cos-1, Cos-7, Vero, BSC 1, BSC 40, BMT 10, WI38, HeLa, Saos, C2C12, L cell, HT1080, HepG2, Huh7, K562, a primary cell, or derivative thereof.
- the present disclosure provides a cell (e.g., host cell) comprising a vector described herein.
- the cell expresses the engineered system described herein or components thereof.
- the cell is a human cell.
- the cell is genome edited ex vivo. In some embodiments, the cell is genome edited in vivo.
- host cells comprising an open reading frame encoding a heterologous endonuclease and a heterologous base editor having at least 75% sequence identity to any one of SEQ ID NOs: 1-51, 57-66, 385-443, 444-475, 594-595, 599-675, 744-835, 970-1098, 1128-1186, 1208-1315, 1356-1415, 1424, 1556-1644, and 1654-1703.
- said heterologous base editor comprises a sequence having at least about 75%, at least about 80%.
- the host cell is a bacterial cell.
- the bacterial cell is Bifidobacterium longum, Bifidobacterium lactis, Bifidobacterium animalis, Bifidobacterium breve, Bifidobacterium infantis, Bifidobacterium adolescentis, Lactobacillus acidophilus, Lactobacillus casei, Lactobacillus paracasei, Lactobacillus salivarius, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactobacillus johnsonii, Lactobacillus plantarum, Lactobacillus fermentum.
- the host cell is an E. coli cell.
- the E. coli cell is a λDE3 lysogen or a BL21(DE3) strain.
- the E. coli cell has an ompT lon genotype.
- the cell is within a cochlea. In some embodiments, the cell is within an embryo. In some embodiments, the embryo is a two-cell embryo. In some embodiments, the embryo is a mouse embryo.
- Lipid nanoparticles as described herein can be 4-component lipid nanoparticles.
- Such nanoparticles can be configured for delivery of RNA or other nucleic acids (e.g., synthetic RNA, mRNA, or in vitro-synthesized mRNA) and can be generally formulated as described in WO2012135805A2.
- Such nanoparticles can generally comprise: (a) a cationic lipid (e.g., 98N12-5 (TETA5-LAP), DLin DMA, DLin-K-DMA (2,2-Dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane), DLin-KC2-DMA, DLin-MC3-DMA, or C12-200); (b) a neutral lipid (e.g., DSPC or DOPE); (c) a sterol (e.g., cholesterol or a cholesterol analog); and (d) a PEG-modified lipid (e.g., PEG-DMG).
- Cationic lipid formulations can include particles comprising either 3 or 4 or more components in addition to polynucleotide, primary construct, or RNA (e.g. mRNA).
- formulations with certain cationic lipids include, but are not limited to, 98N12-5, and may contain 42% lipidoid, 48% cholesterol, and 10% PEG (C14 or greater alkyl chain length).
- formulations with certain lipidoids include, but are not limited to, C12-200 and may contain 50% cationic lipid, 10% distearoylphosphatidylcholine, 38.5% cholesterol, and 1.5% PEG-DMG.
- lipid nanoparticles are formulated as described in US10709779B2.
- the cationic lipid nanoparticle comprises a cationic lipid, a PEG-modified lipid, a sterol, and a non-cationic lipid.
- the cationic lipid is selected from the group consisting of 98N12-5 (TETA5-LAP), DLin DMA, DLin-K-DMA (2,2-Dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane), DLin-KC2-DMA, DLin-MC3-DMA, and C12-200.
- the cationic lipid nanoparticle has a molar ratio of about 20-60% cationic lipid, about 5-25% non-cationic lipid, about 25-55% sterol, and about 0.5-15% PEG-modified lipid. In some embodiments, the cationic lipid nanoparticle comprises a molar ratio of about 50% cationic lipid, about 1.5% PEG-modified lipid, about 38.5% cholesterol, and about 10% non-cationic lipid. In some embodiments, the cationic lipid nanoparticle comprises a molar ratio of about 55% cationic lipid, about 2.5% PEG-modified lipid, about 32.5% cholesterol, and about 10% non-cationic lipid.
- the cationic lipid is an ionizable cationic lipid
- the non-cationic lipid is a neutral lipid
- the sterol is a cholesterol
- the cationic lipid nanoparticle has a molar ratio of 50:38.5:10:1.5 of cationic lipid:cholesterol:PEG2000-DMG:DSPC or DMG:DOPE.
- lipid nanoparticles as described herein can comprise cholesterol, 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1,1'-((2-(4-(2-((2-(bis(2-hydroxydodecyl)amino)ethyl)(2-hydroxydodecyl)amino)ethyl)piperazin-1-yl)ethyl)azanediyl)bis(dodecan-2-ol) (C12-200), and DMG-PEG-2000 at molar ratios of 47.5:16:35:1.5.
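The lipid nanoparticle embodiments above give both molar ranges (about 20-60% cationic lipid, about 5-25% non-cationic lipid, about 25-55% sterol, about 0.5-15% PEG-modified lipid) and specific compositions. As a minimal sketch, a composition given as molar parts can be normalized to mol% and compared against those ranges. The function names and dictionary keys are hypothetical, and "about" is treated as a plain inclusive bound, which is an assumption.

```python
# Molar ranges (mol%) stated in the embodiments above.
RANGES = {
    "cationic_lipid": (20.0, 60.0),
    "non_cationic_lipid": (5.0, 25.0),
    "sterol": (25.0, 55.0),
    "peg_lipid": (0.5, 15.0),
}


def to_mol_percent(parts: dict[str, float]) -> dict[str, float]:
    """Normalize molar parts (in any units) so they sum to 100 mol%."""
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("molar parts must sum to a positive number")
    return {name: 100.0 * value / total for name, value in parts.items()}


def within_ranges(parts: dict[str, float]) -> bool:
    """Check every component against the stated mol% range."""
    percent = to_mol_percent(parts)
    return all(low <= percent[name] <= high for name, (low, high) in RANGES.items())


if __name__ == "__main__":
    # C12-200 : DOPE : cholesterol : DMG-PEG-2000 composition from the text,
    # listed there as cholesterol:DOPE:C12-200:PEG = 47.5:16:35:1.5.
    c12_200_formulation = {
        "cationic_lipid": 35.0,
        "non_cationic_lipid": 16.0,
        "sterol": 47.5,
        "peg_lipid": 1.5,
    }
    print(to_mol_percent(c12_200_formulation), within_ranges(c12_200_formulation))
```

Normalizing first means that ratios which do not sum to 100 can still be checked against the percentage ranges.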
- nucleic acids encoding an engineered system described herein comprising a base editor, an endonuclease, and an engineered guide polynucleotide or components thereof (e.g., a base editor, an endonuclease, or an engineered guide polynucleotide).
- the nucleic acid encoding the engineered system or components thereof is a DNA, for example a linear DNA, a plasmid DNA, or a minicircle DNA.
- the nucleic acid encoding the engineered system is an RNA, for example a mRNA.
- the nucleic acid encoding the engineered system or components thereof is delivered by a nucleic acid-based vector.
- the nucleic acid-based vector is a plasmid (e.g., circular DNA molecules that can autonomously replicate inside a cell), cosmid (e.g., pWE or sCos vectors), artificial chromosome, human artificial chromosome (HAC), yeast artificial chromosomes (YAC), bacterial artificial chromosome (BAC), P1-derived artificial chromosomes (PAC), phagemid, phage derivative, bacmid, or virus.
- the nucleic acid-based vector is selected from the list consisting of: pSF-CMV-NEO-NH2-PPT- 3XFLAG, pSF-CMV-NEO-COOH-3XFLAG, pSF-CMV-PURO-NH2-GST-TEV.
- the nucleic acid-based vector comprises a promoter.
- an open reading frame is operably linked to the promoter.
- the promoter is selected from the group consisting of a mini promoter, an inducible promoter, a constitutive promoter, and derivatives thereof.
- the promoter is selected from the group consisting of CMV, CBA, EF1a, CAG, PGK, TRE, U6, UAS, T7, Sp6, lac, araBAD, trp, Ptac, p5, p19, p40, Synapsin, CaMKII, GRK1, and derivatives thereof.
- the promoter is a U6 promoter.
- the promoter is a CAG promoter.
- the open reading frame is operably linked to a T7 promoter sequence, a T7-lac promoter sequence, a lac promoter sequence, a tac promoter sequence, a trc promoter sequence, a ParaBAD promoter sequence, a PrhaBAD promoter sequence, a T5 promoter sequence, a cspA promoter sequence, an araPBAD promoter, a strong leftward promoter from phage lambda (pL promoter), or any combination thereof.
- the open reading frame comprises a sequence encoding an affinity tag linked in-frame to a sequence encoding said base editor.
- the affinity tag is an immobilized metal affinity chromatography (IMAC) tag.
- the IMAC tag is a polyhistidine tag.
- the affinity tag is a myc tag, a human influenza hemagglutinin (HA) tag, a maltose binding protein (MBP) tag, a glutathione S-transferase (GST) tag, a streptavidin tag, a FLAG tag, or any combination thereof.
- the affinity tag is linked in-frame to said sequence encoding said base editor via a linker sequence encoding a protease cleavage site.
- the protease cleavage site is a tobacco etch virus (TEV) protease cleavage site, a PreScission® protease (PSP) cleavage site, a Thrombin cleavage site, a Factor Xa cleavage site, an enterokinase cleavage site, or any combination thereof.
- the open reading frame is codon-optimized for expression in said host cell. In some embodiments, the open reading frame is provided on a vector. In some embodiments, the open reading frame is integrated into a genome of said host cell.
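The embodiments above describe an affinity tag linked in-frame to the base editor coding sequence through a linker that encodes a protease cleavage site. The following is a minimal sketch of what "in-frame" means at the DNA level, using a polyhistidine tag and a TEV site (amino acids ENLYFQG) as the example. The codon choices, the toy cargo sequence, and the function names are illustrative assumptions and are not taken from the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Illustrative coding sequences (assumed codon choices):
HIS6 = "CATCACCATCACCATCAC"         # six histidine codons (H H H H H H)
TEV_SITE = "GAAAACCTGTATTTTCAGGGC"  # encodes E N L Y F Q G


def codons(dna: str) -> list[str]:
    """Split a DNA string into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def is_single_open_reading_frame(dna: str) -> bool:
    """True if dna is ATG ... stop with no internal stop codon and no frameshift."""
    dna = dna.upper()
    if len(dna) < 6 or len(dna) % 3 != 0:
        return False
    triplets = codons(dna)
    return (
        triplets[0] == "ATG"
        and triplets[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in triplets[1:-1])
    )


def build_tagged_orf(cargo_coding: str) -> str:
    """Assemble start codon + tag + cleavage-site linker + cargo + stop codon."""
    return "ATG" + HIS6 + TEV_SITE + cargo_coding.upper() + "TAA"


if __name__ == "__main__":
    toy_cargo = "AGCGAAGTGGAA"  # toy stand-in for a base editor coding sequence
    print(is_single_open_reading_frame(build_tagged_orf(toy_cargo)))
```

A cargo whose length is not a multiple of three shifts the frame, and the check then fails; this is the kind of error that an in-frame fusion design is meant to avoid.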
- the nucleic acid-based vector is a virus.
- the virus is an alphavirus, a parvovirus, an adenovirus, an AAV, a baculovirus, a Dengue virus, a lentivirus, a herpesvirus, a poxvirus, an anellovirus, a bocavirus, a vaccinia virus, or a retrovirus.
- the virus is an alphavirus.
- the virus is a parvovirus.
- the virus is an adenovirus.
- the virus is an AAV.
- the virus is a baculovirus.
- the virus is a Dengue virus. In some embodiments, the virus is a lentivirus. In some embodiments, the virus is a herpesvirus. In some embodiments, the virus is a poxvirus. In some embodiments, the virus is an anellovirus. In some embodiments, the virus is a bocavirus. In some embodiments, the virus is a vaccinia virus. In some embodiments, the virus is a retrovirus.
- the AAV is AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV-rh8.
- the herpesvirus is HSV type 1, HSV-2, VZV, EBV, CMV, HHV-6, HHV-7, or HHV-8.
- the virus is AAV1 or a derivative thereof. In some embodiments, the virus is AAV2 or a derivative thereof. In some embodiments, the virus is AAV3 or a derivative thereof. In some embodiments, the virus is AAV4 or a derivative thereof. In some embodiments, the virus is AAV5 or a derivative thereof. In some embodiments, the virus is
- the virus is AAV7 or a derivative thereof.
- the virus is AAV8 or a derivative thereof. In some embodiments, the virus is AAV9 or a derivative thereof. In some embodiments, the virus is AAV10 or a derivative thereof. In some embodiments, the virus is AAV11 or a derivative thereof. In some embodiments, the virus is AAV12 or a derivative thereof. In some embodiments, the virus is AAV13 or a derivative thereof. In some embodiments, the virus is AAV14 or a derivative thereof. In some embodiments, the virus is AAV15 or a derivative thereof. In some embodiments, the virus is AAV16 or a derivative thereof. In some embodiments, the virus is AAV-rh8 or a derivative thereof.
- the virus is AAV-rh10 or a derivative thereof. In some embodiments, the virus is AAV-rh20 or a derivative thereof. In some embodiments, the virus is AAV-rh39 or a derivative thereof. In some embodiments, the virus is AAV-rh74 or a derivative thereof. In some embodiments, the virus is AAV-rhM4-1 or a derivative thereof. In some embodiments, the virus is AAV-hu37 or a derivative thereof. In some embodiments, the virus is AAV-Anc80 or a derivative thereof. In some embodiments, the virus is AAV-Anc80L65 or a derivative thereof. In some embodiments, the virus is AAV-7m8 or a derivative thereof.
- the virus is AAV-PHP-B or a derivative thereof. In some embodiments, the virus is AAV-PHP-EB or a derivative thereof. In some embodiments, the virus is AAV-2.5 or a derivative thereof. In some embodiments, the virus is AAV-2tYF or a derivative thereof. In some embodiments, the virus is AAV-3B or a derivative thereof. In some embodiments, the virus is AAV-LK03 or a derivative thereof. In some embodiments, the virus is AAV-HSC1 or a derivative thereof. In some embodiments, the virus is AAV-HSC2 or a derivative thereof. In some embodiments, the virus is AAV-HSC3 or a derivative thereof.
- the virus is AAV-HSC4 or a derivative thereof. In some embodiments, the virus is AAV-HSC5 or a derivative thereof. In some embodiments, the virus is AAV-HSC6 or a derivative thereof. In some embodiments, the virus is AAV-HSC7 or a derivative thereof. In some embodiments, the virus is AAV-HSC8 or a derivative thereof. In some embodiments, the virus is AAV-HSC9 or a derivative thereof. In some embodiments, the virus is AAV-HSC10 or a derivative thereof. In some embodiments, the virus is AAV-HSC11 or a derivative thereof. In some embodiments, the virus is AAV-HSC12 or a derivative thereof.
- the virus is AAV-HSC13 or a derivative thereof. In some embodiments, the virus is AAV-HSC14 or a derivative thereof. In some embodiments, the virus is AAV-HSC15 or a derivative thereof. In some embodiments, the virus is AAV-TT or a derivative thereof. In some embodiments, the virus is AAV-DJ/8 or a derivative thereof. In some embodiments, the virus is AAV-Myo or a derivative thereof. In some embodiments, the virus is AAV-NP40 or a derivative thereof. In some embodiments, the virus is AAV-NP59 or a derivative thereof. In some embodiments, the virus is AAV-NP22 or a derivative thereof.
- the virus is AAV-NP66 or a derivative thereof. In some embodiments, the virus is AAV-HSC16 or a derivative thereof. In some embodiments, the virus is HSV-1 or a derivative thereof. In some embodiments, the virus is HSV-2 or a derivative thereof. In some embodiments, the virus is VZV or a derivative thereof. In some embodiments, the virus is EBV or a derivative thereof. In some embodiments, the virus is CMV or a derivative thereof. In some embodiments, the virus is HHV-6 or a derivative thereof. In some embodiments, the virus is HHV-7 or a derivative thereof. In some embodiments, the virus is HHV-8 or a derivative thereof.
- the nucleic acid encoding the engineered system, the endonuclease, or the engineered guide polynucleotide is delivered by a non-nucleic acid-based delivery system (e.g., a non-viral delivery system).
- the non-viral delivery system is a liposome.
- the nucleic acid is associated with a lipid.
- the nucleic acid associated with a lipid, in some embodiments, is encapsulated in the aqueous interior of a liposome, interspersed within the lipid bilayer of a liposome, attached to a liposome via a linking molecule that is associated with both the liposome and the nucleic acid, entrapped in a liposome, complexed with a liposome, dispersed in a solution containing a lipid, mixed with a lipid, combined with a lipid, contained as a suspension in a lipid, contained or complexed with a micelle, or otherwise associated with a lipid.
- the nucleic acid is comprised in a lipid nanoparticle (LNP).
- the engineered system, the endonuclease, or the engineered guide polynucleotide is introduced into the cell in any suitable way, either stably or transiently.
- the engineered system, the endonuclease, or the engineered guide polynucleotide is transfected into the cell.
- the cell is transduced or transfected with a nucleic acid construct that encodes the engineered system, the endonuclease, or the engineered guide polynucleotide.
- a cell is transduced (e.g., with a virus encoding the engineered system, the endonuclease, or the engineered guide polynucleotide), or transfected (e.g., with a plasmid encoding the engineered system, the endonuclease, or the engineered guide polynucleotide) with a nucleic acid that encodes the engineered system, the endonuclease, or the engineered guide polynucleotide, or the translated engineered system or endonuclease.
- the transduction is a stable or transient transduction.
- cells expressing the engineered system, the endonuclease, or the engineered guide polynucleotide are transduced or transfected with one or more gRNA molecules.
- a plasmid expressing the engineered system, the endonuclease, or the engineered guide polynucleotide is introduced into cells through electroporation, transient (e.g., lipofection) and stable genome integration (e.g., piggybac) and viral transduction (for example lentivirus or AAV) or other methods known to those of skill in the art.
- the engineered system, the endonuclease, or the engineered guide polynucleotide is introduced into the cell as one or more polypeptides.
- delivery is achieved through the use of RNP complexes. Delivery methods to cells for polypeptides and/or RNPs are known in the art, for example by electroporation or by cell squeezing.
- Exemplary methods of delivery of nucleic acids include lipofection, nucleofection, electroporation, stable genome integration (e.g., piggybac), microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA.
- lipofection is described in e.g., U.S. Pat. Nos.
- lipofection reagents are sold commercially (e.g., Transfectam™, Lipofectin™, and SF Cell Line 4D-Nucleofector X Kit™ (Lonza)).
- Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of WO 91/17424 and WO 91/16024.
- the delivery is to cells (e.g., in vitro or ex vivo administration) or target tissues (e.g., in vivo administration).
- the nucleic acid is comprised in a liposome or a nanoparticle that specifically targets a host cell.
- the engineered system comprises an adenine deaminase base editor, the nucleotide is an adenine, and modifying the target nucleic acid locus comprises converting the adenine to a guanine.
- the engineered system comprises a cytidine deaminase base editor and a uracil DNA glycosylase inhibitor, the nucleotide is a cytosine, and modifying the target nucleic acid locus comprises converting the cytosine to a uracil.
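The two conversions described above (adenine to guanine for an adenine deaminase base editor, cytosine to uracil for a cytidine deaminase base editor) can be pictured with a minimal sketch. The function name, the editor labels, and the plain-string model of a nucleic acid are assumptions made for illustration only; they are not part of the disclosure and do not model guide targeting or any enzyme behavior.

```python
# Illustrative sketch only. The names below are hypothetical and the sequence is
# modelled as a plain string; nothing here is taken from the disclosure itself.
CONVERSIONS = {
    "adenine_deaminase": ("A", "G"),   # adenine -> guanine
    "cytidine_deaminase": ("C", "U"),  # cytosine -> uracil
}


def apply_base_edit(sequence: str, position: int, editor: str) -> str:
    """Return a copy of `sequence` with the base at 0-based `position` converted.

    Raises ValueError if the base at `position` is not the substrate of `editor`.
    """
    substrate, product = CONVERSIONS[editor]
    if sequence[position] != substrate:
        raise ValueError(
            f"{editor} expects {substrate!r} at position {position}, "
            f"found {sequence[position]!r}"
        )
    return sequence[:position] + product + sequence[position + 1:]
```

For example, `apply_base_edit("GATTACA", 1, "adenine_deaminase")` converts the adenine at index 1 to a guanine and leaves every other base unchanged.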
- the methods are used to introduce a modification in the genome of a cell.
- the target nucleic acid is modified in vitro.
- the target nucleic acid sequence is modified in vivo.
- the target nucleic acid sequence is modified ex vivo.
- the target nucleic acid comprises genomic DNA, viral DNA, or bacterial DNA. In some embodiments, the target nucleic acid is within a cell. In some embodiments, the cell is a prokaryotic cell, a bacterial cell, a eukaryotic cell, a fungal cell, a plant cell, an animal cell, a mammalian cell, a rodent cell, a primate cell, or a human cell. In some embodiments, the cell is within an animal.
- the target nucleic acid comprises DNA.
- the DNA comprises a first strand comprising a sequence complementary to a sequence of the engineered guide polynucleotide and a second strand comprising a PAM.
- the PAM is directly adjacent to the 3' end of the sequence complementary to the sequence of the engineered guide polynucleotide.
- the PAM comprises a sequence selected from the group consisting of SEQ ID NOs: 360-368 or 598.
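The PAM adjacency described above can be sketched as a simple scan. This is a minimal illustration under stated assumptions: the scan runs along the PAM-containing strand, where the protospacer reads like the guide spacer and is immediately followed on its 3' side by the PAM. The function name is hypothetical, and the placeholder PAM `NGG` used in the usage note is the well-known SpCas9 motif, chosen only for illustration; the PAMs of SEQ ID NOs: 360-368 and 598 are not reproduced here.

```python
import re

# Minimal IUPAC subset for writing degenerate PAMs as regular expressions.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def find_pam_adjacent_sites(strand: str, spacer_len: int, pam: str):
    """List (start, protospacer, pam_match) for every window of `spacer_len`
    bases on `strand` that is directly followed, on its 3' side, by `pam`.

    `strand` is the PAM-containing strand written 5'->3' (an assumption of this
    sketch); `pam` may use the IUPAC codes defined above.
    """
    pam_regex = "".join(IUPAC[base] for base in pam)
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(strand)]
```

With the placeholder PAM, `find_pam_adjacent_sites("ACGTAGGTT", 4, "NGG")` reports a single site starting at index 0, with protospacer `ACGT` followed by `AGG`.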
- the present disclosure provides a method of modifying a target nucleic acid (e.g., gene) locus.
- the method comprises delivering to the target nucleic acid locus the engineered system described herein.
- the endonuclease is configured to form a complex with the engineered guide polynucleotide.
- the complex is configured such that upon binding of the complex to the target nucleic acid locus, the complex modifies the target nucleic acid locus.
- delivery of the engineered system to the target nucleic acid locus comprises delivering the nucleic acid described herein or the vector described herein. In some embodiments, delivery of the engineered system to the target nucleic acid locus comprises delivering a nucleic acid comprising an open reading frame encoding the base editor and the endonuclease. In some embodiments, the nucleic acid comprises a promoter. In some embodiments, the open reading frame encoding the base editor and the endonuclease is operably linked to the promoter.
- delivery of the engineered system to the target nucleic acid locus comprises delivering a capped mRNA containing the open reading frame encoding the base editor and the endonuclease. In some embodiments, delivery of the engineered system to the target nucleic acid locus comprises delivering a translated polypeptide. In some embodiments, delivery of the engineered system to the target nucleic acid locus comprises delivering a deoxyribonucleic acid (DNA) encoding the engineered guide RNA operably linked to a ribonucleic acid (RNA) pol III promoter.
- the target gene is TRAC.
- the gRNA comprises a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490.
- the gRNA comprises a sequence having at least about 70% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 75% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 80% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 85% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 90% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490.
- the gRNA comprises a sequence having at least about 91% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 92% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 93% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 94% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 95% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490.
- the gRNA comprises a sequence having at least about 96% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 97% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 98% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having at least about 99% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490. In some embodiments, the gRNA comprises a sequence having 100% identity to SEQ ID NO: 1489 or SEQ ID NO: 1490.
- the gRNA hybridizes to a TRAC sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about
- the gRNA hybridizes to a TRAC sequence having at least about 70% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 75% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 80% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719.
- the gRNA hybridizes to a TRAC sequence having at least about 85% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 90% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 91% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 92% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719.
- the gRNA hybridizes to a TRAC sequence having at least about 93% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 94% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 95% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 96% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719.
- the gRNA hybridizes to a TRAC sequence having at least about 97% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 98% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having at least about 99% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719. In some embodiments, the gRNA hybridizes to a TRAC sequence having 100% identity to any one of SEQ ID NOs: 1491-1492 and 1711-1719.
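The percent-identity thresholds recited above can be made concrete with a small sketch. It assumes the simplest possible definition: two sequences of equal length compared position by position with no gaps. In practice, identity is usually computed from a pairwise alignment (for example with a BLAST- or Needleman-Wunsch-style tool), so this is a simplification for illustration only. The function names and the example sequences are hypothetical and are not the sequences of any SEQ ID NO.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two equal-length sequences carry the same base.

    Ungapped, position-by-position comparison (a simplifying assumption);
    comparison is case-insensitive.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares sequences of equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold_percent: float) -> bool:
    """True if the ungapped identity of the two sequences is at least the threshold."""
    return percent_identity(seq_a, seq_b) >= threshold_percent
```

Two hypothetical 10-nucleotide sequences that differ at one position have 90% identity under this definition, so they would satisfy an "at least about 90%" threshold but not an "at least about 95%" threshold.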
- the target gene is AAVS1.
- the gRNA comprises a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NOs: 1705-1710.
- the gRNA comprises a sequence having at least about 70% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 75% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 80% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 85% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 90% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 91% identity to
- the gRNA comprises a sequence having at least about 92% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 93% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 94% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 95% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 96% identity to SEQ ID NOs: 1705-1710.
- the gRNA comprises a sequence having at least about 97% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 98% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having at least about 99% identity to SEQ ID NOs: 1705-1710. In some embodiments, the gRNA comprises a sequence having 100% identity to SEQ ID NOs: 1705-1710.
- the gRNA hybridizes to an AAVS1 sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about
- the gRNA hybridizes to an AAVS1 sequence having at least about 70% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 75% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 80% identity to any one of SEQ ID NOs: 1720-1725.
- the gRNA hybridizes to an AAVS1 sequence having at least about 85% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 90% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 91% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 92% identity to any one of SEQ ID NOs: 1720-1725.
- the gRNA hybridizes to an AAVS1 sequence having at least about 93% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 94% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 95% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 96% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 97% identity to any one of SEQ ID
- the gRNA hybridizes to an AAVS1 sequence having at least about 98% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having at least about 99% identity to any one of SEQ ID NOs: 1720-1725. In some embodiments, the gRNA hybridizes to an AAVS1 sequence having 100% identity to any one of SEQ ID NOs: 1720-1725.
- the target gene is hApoA1.
- the gRNA comprises a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1704.
- the gRNA comprises a sequence having at least about 70% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 75% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 80% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 85% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 90% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 91% identity to SEQ ID NO: 1704.
- the gRNA comprises a sequence having at least about 92% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 93% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 94% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 95% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 96% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 97% identity to SEQ ID NO: 1704.
- the gRNA comprises a sequence having at least about 98% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having at least about 99% identity to SEQ ID NO: 1704. In some embodiments, the gRNA comprises a sequence having 100% identity to SEQ ID NO: 1704.
- the gRNA hybridizes to a hApoA1 sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about
- the gRNA hybridizes to a hApoA1 sequence having at least about 70% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 75% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 80% identity to SEQ ID NO: 1726.
- the gRNA hybridizes to a hApoA1 sequence having at least about 85% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 90% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 91% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 92% identity to SEQ ID NO: 1726.
- the gRNA hybridizes to a hApoA1 sequence having at least about 93% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 94% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 95% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 96% identity to SEQ ID NO: 1726.
- the gRNA hybridizes to a hApoA1 sequence having at least about 97% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 98% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having at least about 99% identity to SEQ ID NO: 1726. In some embodiments, the gRNA hybridizes to a hApoA1 sequence having 100% identity to SEQ ID NO: 1726.
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding ANGPTL3 comprising contacting the nucleic acid sequence encoding ANGPTL3 with an engineered base editing system, said base editing system comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757; and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid
- the engineered guide polynucleotide comprises a sequence having at least 80% sequence identity to any one of SEQ ID NOs: 1479-1483 and 1758-1889.
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding APOA1 comprising contacting the nucleic acid sequence encoding APOA1 with an engineered base editing system, said base editing system comprising: a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757; and an engineered
- Described herein, in certain embodiments, are methods of modifying a nucleic acid encoding BCL11A comprising contacting the nucleic acid sequence encoding BCL11A with an engineered base editing system, said base editing system comprising a base editor comprising a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1654-1703 and 2021-2023, wherein the sequence does not comprise any one of the sequences selected from SEQ ID NO: 1128-1160 and 1363-1415, or encoded by a nucleic acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% identity to any one of SEQ ID NOs: 1727-1757, and an engineered guide polynucleotide which forms a complex with an endonuclease of the base editor and comprises a spacer sequence that hybridizes to a target nucleic acid sequence
- the endonuclease induces a single-stranded break or a double-stranded break at or proximal to the target locus. In some embodiments, the endonuclease induces a staggered single-stranded break within or 5' to said target locus. In some embodiments, the endonuclease does not induce a break at or proximal to the target locus.
- the present disclosure provides methods of manufacturing or producing a base editor.
- the method comprises cultivating the cell.
- the methods of producing a base editor comprise cultivating the host cell described herein in a compatible growth medium.
- the methods further comprise inducing expression of said base editor by addition of an additional chemical agent or an increased amount of a nutrient.
- the chemical agent is isopropyl β-D-1-thiogalactopyranoside (IPTG).
- the nutrient is lactose.
- the methods further comprise isolating said host cell after said cultivation and lysing said host cell to produce a protein extract.
- the methods further comprise subjecting said protein extract to IMAC, or ion-affinity chromatography. In some embodiments, the methods further comprise cleaving said IMAC affinity tag by contacting a protease corresponding to said protease cleavage site to said base editor. In some embodiments, the methods further comprise performing subtractive IMAC affinity chromatography to remove said affinity tag from a composition comprising said base editor.
- Systems of the present disclosure may be used for various applications, such as, for example, nucleic acid editing (e.g., gene editing) or binding to a nucleic acid molecule (e.g., sequence-specific binding).
- Such systems may be used, for example, for addressing (e.g., removing or replacing) a genetically inherited mutation that may cause a disease in a subject, inactivating a gene in order to ascertain its function in a cell, as a diagnostic tool to detect disease-causing genetic elements (e.g., via cleavage of reverse-transcribed viral RNA or an amplified DNA sequence encoding a disease-causing mutation), as deactivated enzymes in combination with a probe to target and detect a specific nucleotide sequence (e.g., sequence encoding antibiotic resistance in bacteria), to render viruses inactive or incapable of infecting host cells by targeting viral genomes, to add genes or amend metabolic pathways to engineer organisms to produce valuable small molecules, macromolecules, or secondary metabolites, to establish a gene drive
- kits comprising one or more nucleic acid constructs encoding the various components of the engineered system described herein, e.g., comprising a nucleotide sequence encoding the components of the engineered editing system capable of modifying a target DNA sequence.
- the nucleotide sequence comprises a heterologous promoter that drives expression of the engineered system components.
- any of the engineered editing systems disclosed herein is assembled into a pharmaceutical, diagnostic, or research kit to facilitate its use in therapeutic, diagnostic, or research applications.
- a kit may include one or more containers housing any of the vectors disclosed herein and instructions for use.
- the kit may be designed to facilitate use of the methods described herein by researchers and can take many forms.
- Each of the compositions of the kit may be provided in liquid form (e.g., in solution), or in solid form, (e.g., a dry powder).
- some of the compositions may be constitutable or otherwise processable (e.g. to an active form), for example, by the addition of a suitable solvent or other species (for example, water or a cell culture medium), which may or may not be provided with the kit.
- Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, for example, audiovisual (e.g., videotape, DVD, etc.), Internet, and/or web-based communications, etc.
- the written instructions, in some embodiments, are in a form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which instructions can also reflect approval by the agency of manufacture, use, or sale for animal administration.
- Example 1 Base editor ABE07 targets APOA1 and ANGPTL3 genes in primary mouse hepatocytes across a range of doses
- Hypercholesterolemia is a metabolic disorder characterized by elevated blood plasma levels of low-density lipoprotein (LDL) which can lead to atherosclerosis, heart attack, and stroke. Proteins that regulate plasma lipoprotein levels are primarily expressed in hepatocytes in the liver and are encoded by the APOA1 and ANGPTL3 genes. Knockdown of these genes can enable treatment of human lipoprotein metabolism disorders such as hypercholesterolemia and can be achieved by precise introduction of mutations in their coding sequence through base editing.
- NGS analysis of genomic DNA harvested from primary mouse hepatocytes at three days post-transfection revealed base editing activity in eleven APOA1 guides (A5, A8, C5, D7, D11, E1, F4, and F12) and one ANGPTL3 guide (C12).
- Maximum A-to-G conversion of 70.3% was achieved with APOA1 guide F12 transfected with the lowest mRNA dose of 0.21 µg ABE07 at spacer position A11 (FIGs. 1A-1K).
- The second lowest mRNA dose of 0.42 µg resulted in maximum A-to-G conversion of 64.4%, also with APOA1 guide F12 at spacer position A11.
- The highest two mRNA doses of 0.625 and 1.25 µg resulted in a maximum of 55.0% and 35.9% A-to-G conversion, respectively, in APOA1 guide F4 at spacer position A5.
- Example 2 Engineering and optimizing ABE variants through extensive guide screening in Hepa1-6 cells
- In the homodimer, two copies of the mutant were fused with a linker, while in the heterodimer the wild-type deaminase was fused via a linker to the beneficial mutant (FIG. 2). Further details of these variants are summarized in Table 4.
- These ABE variants were screened alongside the homodimeric D109N variant, ABE07 (SEQ ID NO: 1411), and the heterodimeric D109N variant, ABE07-74 (SEQ ID NO: 1654), over 31 pre-characterized genetic loci in the Hepa1-6 cell line (Table 5; SEQ ID NOs: 1455-1478 and 1484-1488).
- ABE variants listed in Table 4 were produced. Following mRNA production, these ABE variants were nucleofected into Hepa1-6 cells along with chemically synthesized sgRNA targeting the loci listed in Table 5. Amplicons were sequenced and analyzed to measure gene editing.
- Table 5 Mean A-to-G editing activity at 31 genomic loci targeted in the Hepa1-6 screening experiment.
- Table 6 Max A-to-G editing activity at 31 genomic loci targeted in the Hepa1-6 screening experiment.
- ABE07-78 introduced C-to-G edits across the different guides that were tested. This promiscuous editing by the ABE07-78 variant averaged at 5% and ranged up to 27.78% at one of the sites (FIGs. 4A-4B). Additionally, ABE07-78 also introduced indels (insertions and deletions) at a rate much higher (up to 7% at certain sites) than the other ABE variants (FIG. 5).
- Example 3 In vivo gene editing in liver of mice by the ABE07-77 delivered by systemic administration of lipid nanoparticles
- ABE07-77 was subjected to further testing in vivo through the lipid nanoparticle delivery of an mRNA encoding the protein and sgRNAs that target the highly edited loci identified in the Hepa1-6 screening experiment and primary mouse hepatocytes (Table 7).
- its parent nuclease that is MG3-6/3-8
- dose-escalation experiments were performed to assess the dose-dependence of editing activity of ABE07-77.
- chemical modifications of the native RNA structure were incorporated into these sgRNAs (Table 7). These chemical modifications were selected based on their ability to improve the stability of the sgRNA in vitro when incubated in extracts from mammalian cells without negatively impacting editing activity.
- the mRNA encoding the ABE07-77 was generated by in vitro transcription of a linearized plasmid template using T7 RNA polymerase, nucleotides, and enzymes.
- the DNA sequence that was transcribed into RNA comprised the following elements in order from 5' to 3': the T7 RNA polymerase promoter, a 5' untranslated region (5' UTR), a nuclear localization signal, a short linker, the coding sequence for the ABE07-77, a short linker, a nuclear localization signal, a 3' untranslated region, and an approximately 100 nucleotide polyA tail.
- The protein encoded by the synthetic mRNA in this ABE07-77 cassette comprised the following elements, listed in the 5' to 3' order of the coding sequence: the nuclear localization signal from SV40, a five amino acid linker (GGGGS), the protein coding sequence of the ABE07-77 from which the initiating methionine codon was removed, a three amino acid linker (SGG), and the nuclear localization signal from nucleoplasmin.
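The domain order described above can be sketched as a simple concatenation. This is only an illustration under stated assumptions: the two NLS peptides below are the commonly cited SV40 large T antigen and nucleoplasmin bipartite motifs, which the text does not spell out, and `assemble_fusion` and the toy editor sequence are hypothetical names and data.

```python
# Commonly cited NLS motifs (assumed; the source does not list the exact peptides).
SV40_NLS = "PKKKRKV"
NUCLEOPLASMIN_NLS = "KRPAATKKAGQAKKKK"
N_LINKER = "GGGGS"  # five amino acid linker named in the text
C_LINKER = "SGG"    # three amino acid linker named in the text


def assemble_fusion(editor_protein: str) -> str:
    """Join NLS, linker, editor (initiating Met removed), linker, NLS in order."""
    editor = editor_protein.upper()
    if editor.startswith("M"):
        editor = editor[1:]  # drop the initiating methionine, as described
    return SV40_NLS + N_LINKER + editor + C_LINKER + NUCLEOPLASMIN_NLS


toy_editor = "MSEVEFSHEY"  # placeholder, not the ABE07-77 sequence
print(assemble_fusion(toy_editor))
```

The sketch does not model the start codon that a translated construct would need upstream of the N-terminal NLS.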
- the DNA sequence of the protein coding region of this cassette was modified to reflect the codon usage in humans using a commercially available algorithm.
- An approximately 100-nucleotide polyA tail was encoded in the plasmid used for in vitro transcription and the mRNA was co-transcriptionally capped. Uridine in the mRNA was replaced with N1-methylpseudouridine.
- a similar procedure was performed to prepare the mRNA encoding the MG3-6/3-8 nuclease.
- the lipid nanoparticle (LNP) formulation used to deliver the ABE07-77 mRNA and the guide RNA is based on LNP formulations described in the literature including Kauffman et al. (Nano Lett. 2015, 15, 11, 7300-7306).
- the four lipid components were dissolved in ethanol and mixed in an appropriate molar ratio to make the lipid working mix.
- The mRNA and the guide RNA were mixed prior to formulation at a 1:1 mass ratio.
- RNA was diluted in 100 mM Sodium Acetate (pH 4.0) to make the RNA working stock.
- The lipid working stock and the RNA working stock were mixed in a microfluidics device at a flow rate ratio of 1:3 and a flow rate of 12 mL/min.
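As a worked example of the mixing step, the per-stream flow rates can be derived from the ratio. The sketch assumes that 12 mL/min is the combined flow rate and that the 1:3 ratio is lipid:RNA in the order the two stocks are named; neither assumption is stated explicitly in the text. `split_flow` is a hypothetical helper.

```python
def split_flow(total_ml_per_min: float, lipid_parts: float, rna_parts: float):
    """Split a combined flow rate into lipid and RNA streams by ratio."""
    parts = lipid_parts + rna_parts
    lipid = total_ml_per_min * lipid_parts / parts
    rna = total_ml_per_min * rna_parts / parts
    return lipid, rna


# Assumed reading: 12 mL/min total, lipid:RNA = 1:3
lipid_rate, rna_rate = split_flow(12.0, 1, 3)
print(f"lipid stream: {lipid_rate} mL/min, RNA stream: {rna_rate} mL/min")
```

Under these assumptions the ethanol (lipid) stream runs at 3 mL/min and the aqueous (RNA) stream at 9 mL/min.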
- The LNP were dialyzed against phosphate buffered saline (PBS) for 2 hours and then concentrated until the reduced volume was achieved.
- the concentration of RNA in the LNP formulation was measured using the Ribogreen reagent.
- The diameter and polydispersity (PDI) of the LNP were determined by dynamic light scattering. Representative LNP diameters ranged from 65 nm to 120 nm with PDI of 0.05 t
- Mouse dosing and harvesting
- LNP for mRNA and sgRNA were mixed at a 1:1 mass ratio and injected intravenously into 7-week-old C57BL/6 wild type mice via the tail vein (0.1 mL per mouse) at a total RNA dose of either 1.5 mg or 1 mg RNA per kg body weight (Table 7). Seven days post-dosing, all mice in each group were sacrificed. The left liver lobe was collected and flash frozen.
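The dosing arithmetic above can be made concrete with a short calculation. The 20 g body weight is an assumed example value, not a figure from the text, and `per_mouse_rna` is a hypothetical helper; the 1:1 mass split and the 0.1 mL injection volume come from the description.

```python
def per_mouse_rna(dose_mg_per_kg: float, body_weight_g: float,
                  injection_ml: float = 0.1) -> dict:
    """Return per-mouse RNA amounts for a weight-based dose."""
    total_ug = dose_mg_per_kg * body_weight_g  # (mg/kg) * g equals micrograms
    return {
        "total_ug": total_ug,
        "mrna_ug": total_ug / 2,   # 1:1 mass ratio of mRNA to sgRNA
        "sgrna_ug": total_ug / 2,
        "conc_ug_per_ml": total_ug / injection_ml,
    }


# Assumed 20 g mouse at the 1.5 mg/kg dose level
print(per_mouse_rna(1.5, 20.0))
```

For this assumed weight, the 1.5 mg/kg dose corresponds to 30 µg of total RNA per mouse, or roughly 300 µg/mL in the injected volume.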
- Table 7 sgRNAs and targeting spacer sequences used in mouse study.
- Genomic DNA preparation and editing analysis by next-generation sequencing
- The left lateral lobe of the liver (100 mg) was homogenized in the digestion buffer. Genomic DNA was purified from the resulting homogenate and quantified by measuring the absorbance at 260 nm. Genomic DNA purified from mice injected with PBS buffer alone was used as a control. The region of the APOA1 and ANGPTL3 genes targeted by each specific sgRNA was PCR amplified with DNA polymerase and gene-specific primers with adapters complementary to the barcoded primers used for next-generation sequencing (NGS) for a total of 29 cycles.
- the product of this first PCR reaction was PCR amplified using the barcoded primers for NGS for a total of 10 cycles.
- the resulting product was subjected to NGS, and the results were processed to generate the percentage of sequencing reads that contain insertions or deletions (indels) at the targeted site in the APOA1 and ANGPTL3 genes.
- An engineered cell line was devised with 5 consecutive PAMs compatible with MG3-6 and Cas9. This cell line allows for gRNA tiling to test editing efficiency and find the -1 nt preferences of candidate CDAs.
- CDAs were cloned in a plasmid backbone containing MG3-6 and an MG uracil glycosylase inhibitor.
- The CDAs were cloned in the N termini (SEQ ID NOs: 1659-1664). Once the cloning of variant CDAs was confirmed, they were transiently transfected into the engineered HEK293T cells using Lipofectamine 2000 with associated guides.
- A total of 6 engineered variants (139-52-V2 (SEQ ID NO: 1271), 139-52-V13 (SEQ ID NO: 1282), 139-52-V14 (SEQ ID NO: 1283), 139-52-V17 (SEQ ID NO: 1314), 139-86v12 (SEQ ID NO: 1296), and 152-6v13 (SEQ ID NO: 1309)) were tested in the gRNA tiling experiment described above. All 6 engineered CDAs showed editing activity higher than 9% (FIG. 10).
- The -1 nt in vitro preference was plotted by calculating the sum of percentage cleavages (percent cleavage measures percent deamination) per -1 nt preference and then calculating the ratio per -1 nucleotide.
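The calculation described above can be sketched as follows. The interpretation is an assumption: percent cleavage values are summed per -1 nucleotide and each sum is divided by the grand total to give a ratio. The function name and the example measurements are hypothetical and are not data from the source.

```python
from collections import defaultdict


def minus1_preference(measurements):
    """measurements: iterable of (minus1_nt, percent_cleavage) pairs.

    Returns the share of summed percent cleavage attributed to each -1 nt.
    """
    sums = defaultdict(float)
    for nt, pct in measurements:
        sums[nt.upper()] += pct
    total = sum(sums.values())
    if total == 0:
        return {nt: 0.0 for nt in sums}
    return {nt: s / total for nt, s in sums.items()}


# Hypothetical substrates, each labelled by its -1 nucleotide
example = [("T", 40.0), ("T", 20.0), ("C", 20.0), ("A", 10.0), ("G", 10.0)]
print(minus1_preference(example))
```

In this made-up data set the candidate would show a T preference at the -1 position, since T accounts for the largest share of the summed cleavage.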
- the mammalian cell and in vitro -1 nt preference is shown in FIG. 11.
- the candidates have different -1 nt preferences: 152-6 WT (SEQ ID NO: 1322) prefers T in the -1 position, whereas 139-52 (WT, SEQ ID NO: 1325; and engineered variants) has a strong preference for C at the -1 position.
- Candidates with strong -1 nt preferences are preferable, since a tighter nucleotide preference is expected to reduce off-target activity.
- Candidates with different and strong -1 nt preferences allow different loci to be targeted without risking high off-target activity.
- Candidates with purine preferences were also identified: 139-86 prefers G and/or A.
- A homodimeric (ABE07) or heterodimeric (ABE07-77, hereafter referred to as ABE15) form of architecture (SEQ ID NOs: 1411 and 1657) was utilized.
- These dimeric states were chosen to emulate the native oligomerization state of the deaminase domain.
- Variants listed in Table 9, which were either monomeric (ABE01, ABE02, ABE55), homodimeric (ABE54, ABE58), or heterodimeric (ABE51, ABE52, ABE57), were constructed.
- N-terminal or C-terminal deaminase domains were selectively deactivated through the disruption of the catalytic site with an E60A mutation (ABE51, ABE52, ABE54, and ABE57) to assess which of the two deaminase copies is critical for the DNA editing capabilities of the ABE.
- These variants were tested on a panel of pre-characterized target sites spanning the mouse APOA1 and ANGPTL3 genes (Table 8) using the protocols described below.
- Table 8 sgRNAs and targeting spacer sequences used in primary mouse hepatocytes
- Hepa1-6 cells were obtained from ATCC. Hepa1-6 cells were nucleofected with 500 ng of mRNA and 150 pmol of chemically synthesized sgRNA by an electroporator. Each nucleofection reaction had 100,000 cells. Immediately after transfection, cells were cultured in DMEM supplemented with 10% FBS and 1X MEM containing non-essential amino acids at 37 °C with 5% CO2 for three days. Genomic DNA was then harvested. Targeted sequences were amplified with NGS primers. Amplicons around 250 bp long were checked by gel, sequenced by NGS, and computationally analyzed to measure gene editing outcomes.
- As MG68-4 deaminase (SEQ ID NO: 386) is a putative tRNA adenosine deaminase and natively functions as an obligate dimer, the dimeric constructs outperform their monomeric versions (FIGs. 12A and 12B).
- the homodimeric ABE07 (SEQ ID NO: 1411) outperformed the monomeric ABE01 (SEQ ID NO: 1410) which has the MG68-4 (D109N) variant inlaid into the RuvC-III domain as well as ABE02 which has the MG68-4 (D109N) variant inlaid into the REC domain (SEQ ID NO: 1665).
- ABE33 demonstrated a mean A:T to G:C editing of 32.97% across the 15 tested guides, which is 1.5-fold higher than ABE15 which averages around 22.8% mean A:T to G:C editing activity (FIGs. 14A - 14B). Moreover, ABE33 showed lower promiscuous C deamination compared to ABE15 (FIGs. 15A - 15B) while having low indel formation activity (FIG. 16).
- ABE23 demonstrated a mean A:T to G:C editing of 30.8% as compared to 21.4% by ABE15, with an analogous increase in the mean of the max observed A:T to G:C editing: 82.1% for ABE23 compared to 73.5% for ABE15 across all 15 guides tested in this study. Overall, this increased A:T to G:C editing activity of ABE23 is manifested in the relaxation of the stringent +1 C context when compared to ABE15 (FIG. 20B).
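The fold improvements implied by the mean editing values quoted above can be checked with a short calculation. The percentages are taken from the text; `fold_change` is a hypothetical helper, and the comparison ignores any per-guide variation, which the excerpt does not report.

```python
def fold_change(value: float, baseline: float) -> float:
    """Ratio of a variant's mean editing to the baseline editor's mean editing."""
    return value / baseline


# Mean A:T to G:C editing (%), as quoted in the text
abe33_vs_abe15 = fold_change(32.97, 22.8)
abe23_vs_abe15 = fold_change(30.8, 21.4)
print(round(abe33_vs_abe15, 2), round(abe23_vs_abe15, 2))
```

Both ratios come out at a little under 1.5, in line with the roughly 1.5-fold improvement stated for ABE33.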
- D109Q mutations were included in top variants ABE12, ABE13, ABE14, ABE15, and ABE16 (SEQ ID NOs: 1654-1658) to yield ABE18, ABE19, ABE20, ABE21, and ABE22, respectively (SEQ ID NOs: 1686-1690). These mutations were also included in the proline variants tested in Example 7 to yield ABE28, ABE29, ABE30, and ABE31 (SEQ ID NOs: 1691-1694).
- Table 12 Designs of engineered ABE variants tested in this study.
- Example 9 Novel small base editors display high activity in human cells
- Sequences for small ABEs were codon optimized for human expression and each cloned into an expression vector with a T7 promoter and capping initiation sequence, 5’ and 3’ UTRs, and a polyA tail.
- the coding sequence contained an N-terminal SV40 nuclear localization signal and a C-terminal nucleoplasmin nuclear localization signal.
- The expression vector was midiprepped, linearized with SpeI, cleaned, and used for in vitro transcription with Hi-T7 polymerase. In vitro transcription reactions contained N1-methylpseudouridine in place of uridine and had an added capping reagent.
- The resulting mRNA was cleaned, checked for product size and purity by spectrophotometry and gel electrophoresis, and diluted to 250 ng/µL in sterile water for use in nucleofection.
- ABE33 (SEQ ID NO: 1672) was used as a positive control with a hAPOA1 target guide (Table 14).
- K562 cells (ATCC CCL-243) were cultured in IMDM + L-alanyl-L-glutamine dipeptide media and 10% FBS for 1-2 passages prior to nucleofection. On the day of nucleofection, cells were harvested, counted, washed in 1X PBS, and resuspended in nucleofection buffer according to manufacturer instructions. 120,000 cells were distributed per well and nucleofected with 500 ng of mRNA and 200 pmol of sgRNA using a nucleofection kit. For some experiments, the amount of guide added varied from 100 to 400 pmol. Cells were added to recovery media and grown for 72 hours before genomic DNA was harvested.
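As a small worked example of the nucleofection quantities above, the mRNA volume per well and the mRNA mass per cell follow directly from the stated numbers. The sketch assumes the 250 ng/µL stock is pipetted without further dilution; both helper names are hypothetical.

```python
def volume_ul(mass_ng: float, stock_ng_per_ul: float) -> float:
    """Volume of stock needed to deliver a given mass."""
    return mass_ng / stock_ng_per_ul


def mrna_pg_per_cell(mass_ng: float, cells: int) -> float:
    """Average mRNA mass offered per cell, in picograms."""
    return mass_ng * 1000.0 / cells


print(volume_ul(500, 250))             # microlitres of mRNA stock per well
print(mrna_pg_per_cell(500, 120_000))  # picograms of mRNA per cell
```

With 500 ng of mRNA and 120,000 cells per well, this works out to 2 µL of stock and about 4.2 pg of mRNA per cell.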
- Resulting gDNA was diluted 1:3 and used as a template for NGS PCR.
- Targeted sequences were amplified with NGS primers (IDT). Amplicons around 250 bp long were checked by gel electrophoresis, sequenced by an NGS sequencing machine, and computationally analyzed to measure gene editing outcomes.
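To illustrate what "computationally analyzed to measure gene editing outcomes" can mean for an adenine base editor, the toy function below computes the A-to-G conversion percentage at each adenine of a reference amplicon. It is a simplified sketch, not the pipeline used in the source: it assumes reads are already aligned, the same length as the reference, and free of indels, and `a_to_g_percent` is a hypothetical name.

```python
def a_to_g_percent(reference: str, reads: list) -> dict:
    """Percent of reads carrying G at each reference A (1-based positions)."""
    reference = reference.upper()
    # Keep only reads that can be compared position by position
    usable = [r.upper() for r in reads if len(r) == len(reference)]
    if not usable:
        return {}
    result = {}
    for pos, base in enumerate(reference, start=1):
        if base != "A":
            continue
        g_count = sum(1 for r in usable if r[pos - 1] == "G")
        result[pos] = 100.0 * g_count / len(usable)
    return result


ref = "CACAT"
reads = ["CGCAT", "CACAT", "CGCGT", "CACAT"]
print(a_to_g_percent(ref, reads))
```

Real amplicon data would also need quality filtering, alignment, and separate handling of indel-containing reads before a calculation like this is meaningful.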
- ABE87, which contains the MG68-4 (D109N, T112R, A155R) deaminase inlaid within the MG34-29 nickase (FIG. 27C), demonstrated the highest editing of all the tested small ABEs, with a mean editing efficiency of 74.34 ± 0.02% at the AAVS1 E7 locus and 32.41 ± 3.77% at the AAVS1 C7 locus (FIG. 28B).
- The inlaid construct preference has also been observed with other nickases such as MG3-6_3-8 ABEs.
- ABE75 outperformed its C-terminal and N-terminal variants, with the highest performance of 7.10 ± 1.60% observed at the TRAC C11 locus (FIG. 28D).
- Table 14 Spacer and sgRNA sequences used in testing the small ABEs.
- Example 10 PAM-interacting domain swaps increase the targetability of ABEs
- Adenine base editors can install precise A:T → G:C edits in a programmable manner, and hence have the capability to correct more than half of the known pathogenic single-nucleotide polymorphisms and efficiently knock out proteins through splice site disruption.
- Due to their programmable nature, the utility of ABEs is highly dependent on the presence of an appropriate PAM downstream of the desired target adenine.
- SpCas9-based ABEs can merely access 18% of all the adenines present in the human reference genome due to the lack of NGG in the vicinity of target A (FIG. 29A).
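The PAM constraint described above can be illustrated with a small scan that asks which adenines on one strand sit inside an editing window of a protospacer followed by a matching PAM. Everything beyond the NGG PAM is an assumption made for illustration: the 20 nt protospacer, the editing window at protospacer positions 4 to 8, the single-strand scan, and the function names. The sketch does not attempt to reproduce the 18% figure.

```python
import re


def targetable_adenines(seq, pam_regex="[ACGT]GG", pam_len=3,
                        spacer_len=20, window=(4, 8)):
    """Return sorted 0-based indices of A's that fall in the editing window
    of at least one protospacer immediately followed by a matching PAM."""
    seq = seq.upper()
    pam = re.compile(pam_regex)
    hits = set()
    for start in range(len(seq) - spacer_len - pam_len + 1):
        pam_start = start + spacer_len
        if not pam.fullmatch(seq[pam_start:pam_start + pam_len]):
            continue
        # Window positions are 1-based, counted from the PAM-distal end
        for pos in range(window[0], window[1] + 1):
            idx = start + pos - 1
            if seq[idx] == "A":
                hits.add(idx)
    return sorted(hits)


def targetable_fraction(seq, **kwargs):
    """Fraction of A's on this strand reachable under the given PAM rule."""
    total = seq.upper().count("A")
    return len(targetable_adenines(seq, **kwargs)) / total if total else 0.0


in_window = "CCCCA" + "C" * 15 + "TGG"  # A at protospacer position 5
print(targetable_adenines(in_window), targetable_fraction(in_window))
```

Passing a different `pam_regex` and `pam_len` shows how a relaxed PAM requirement enlarges the set of reachable adenines, which is the motivation for the PAM-interacting domain swaps discussed in this example.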
- The ABEs were codon optimized for human expression and subsequently cloned into an expression vector with a CleanCap T7 promoter, 5’ and 3’ UTRs, and a polyA tail.
- the coding sequence contained an N-terminal SV40 nuclear localization signal and a C-terminal nucleoplasmin nuclear localization signal.
- the expression vectors were midi-prepped, mRNA templates were amplified from plasmids with Q5 High-Fidelity 2X Master Mix, cleaned with HighPrep PCR, and used for in vitro transcription with Hi-T7.
- In vitro transcription reactions contained N1-methylpseudouridine in place of uridine and had added CleanCap reagent.
- The resulting mRNA was cleaned with RNeasy, checked for product size and purity by NanoDrop and TapeStation, and diluted to 250 ng/µL in sterile water for use in nucleofection.
- Targeted sequences were amplified with NGS primers. Amplicons around 250 bp long were checked in an Agilent TapeStation D1000 gel, sequenced on a MiSeq machine, and analyzed with CRISPResso2 to measure gene editing outcomes.
- K562 cell lines were engineered with ‘disease’ state nucleotides using a lentiviral vector encoding a combination of therapeutically relevant targets (about 4.5 kb, SEQ ID NO: 2020) at a low MOI.
- This 4.5 kb sequence contained the PAH gene with the relevant SNVs 1222C>T (p.R408W), 1066-11G>A, and 1315+1G>A, with about 250 nt on either side of the therapeutically relevant target nucleotide.
- the mutation corresponding to the gene of interest would then be changed to the diseased state, allowing for the correction of a therapeutically relevant edit.
- MG3-6 chimeric ABEs offer more options for guides to target genes of interest.
- ANGPTL3 is an endogenous inhibitor of lipoprotein lipase (LPL) that is expressed predominantly in the liver and is associated with cholesterol metabolism. Knockdown of ANGPTL3 has been shown to increase levels of LPL, the main enzyme involved in hydrolysis of triglycerides. Hence, knockout of ANGPTL3 is considered a potential permanent treatment for coronary artery diseases.
- ABE112 (SEQ ID NO: 1755) showed the best editing at ANGPTL3 as it demonstrated 21.9% editing at exon 4 splice acceptor site, 61.5% editing at exon 4 splice donor site, and 25% editing at exon 7 splice acceptor site.
- BCL11A has three well-characterized human BCL11A composite enhancer DHS sites (namely DHS+55, DHS+58, and DHS+62). These enhancer sites have a consensus GATA motif that binds the GATA1 and TAL1 enhancers, upregulating the expression of BCL11A. Mutating these GATA sites has the potential to suppress BCL11A levels, which subsequently leads to an increase in γ-globin levels and rescues the β-thalassemia disease state.
- Phenylalanine hydroxylase (PAH) deficiency results in intolerance to the dietary intake of the essential amino acid phenylalanine and produces a spectrum of disorders. The risk of adverse outcomes varies based on the degree of PAH deficiency. Over 500 mutations have been reported in the coding sequence as well as in the intervening sequence of the PAH gene (Regier and Greene, 2000). The current best therapeutic to treat PAH deficiency is an oral medication, sapropterin, that serves as a cofactor of the PAH protein and can improve the activity of some mutant forms of PAH. However, it is not effective against the most commonly reported SNVs in the PAH gene: 1222C>T (p.R408W), 1066-11G>A, and 1315+1G>A (FIG. 34A).
- ABE101 corrected the 1222C>T (p.R408W) SNV with a pooled 15% efficiency (FIG. 34B) in the high-throughput pooled screening.
- For the 1222C>T (p.R408W) SNV, the potential for bystander editing outcomes due to editing of neighboring As in the editing window of the base editors was observed.
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Biochemistry (AREA)
- Medicinal Chemistry (AREA)
- Microbiology (AREA)
- Veterinary Medicine (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Public Health (AREA)
- Animal Behavior & Ethology (AREA)
- Epidemiology (AREA)
- Pharmacology & Pharmacy (AREA)
- Mycology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480037660.9A CN121263525A (zh) | 2023-05-03 | 2024-05-03 | Base editing enzymes |
| EP24800721.3A EP4705464A2 (fr) | 2023-05-03 | 2024-05-03 | Base editing enzymes |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363499912P | 2023-05-03 | 2023-05-03 | |
| US63/499,912 | 2023-05-03 | ||
| US202363519790P | 2023-08-15 | 2023-08-15 | |
| US63/519,790 | 2023-08-15 | ||
| US202363611049P | 2023-12-15 | 2023-12-15 | |
| US63/611,049 | 2023-12-15 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2024229449A2 true WO2024229449A2 (fr) | 2024-11-07 |
| WO2024229449A3 WO2024229449A3 (fr) | 2025-04-03 |
Family
ID=93333505
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2024/027887 Ceased WO2024229449A2 (fr) | 2023-05-03 | 2024-05-03 | Base editing enzymes |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4705464A2 (fr) |
| CN (1) | CN121263525A (fr) |
| WO (1) | WO2024229449A2 (fr) |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| IL310721B2 (en) * | 2015-10-23 | 2025-11-01 | Harvard College | Nucleobase editors and uses thereof |
| US10913941B2 (en) * | 2019-02-14 | 2021-02-09 | Metagenomi Ip Technologies, Llc | Enzymes with RuvC domains |
| BR112023024983A2 (pt) * | 2021-06-02 | 2024-04-30 | Metagenomi Inc | Sistemas crispr classe ii, tipo v |
-
2024
- 2024-05-03 CN CN202480037660.9A patent/CN121263525A/zh active Pending
- 2024-05-03 WO PCT/US2024/027887 patent/WO2024229449A2/fr not_active Ceased
- 2024-05-03 EP EP24800721.3A patent/EP4705464A2/fr active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN121263525A (zh) | 2026-01-02 |
| WO2024229449A3 (fr) | 2025-04-03 |
| EP4705464A2 (fr) | 2026-03-11 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12123014B2 (en) | Class II, type V CRISPR systems | |
| AU2023314925A1 (en) | Class ii, type v crispr systems | |
| US20250002886A1 (en) | Engineered and chimeric nucleases | |
| EP4677092A2 (fr) | Class 2, type V CRISPR systems | |
| WO2024233984A2 (fr) | Systems and methods for transposing cargo nucleotide sequences | |
| WO2024229449A2 (fr) | Base editing enzymes | |
| WO2026080408A1 (fr) | Base editing enzymes | |
| US20260115319A1 (en) | Supplementation of liver enzyme expression | |
| US20250059568A1 (en) | Class ii, type v crispr systems | |
| AU2024339645A1 (en) | Engineered and chimeric nucleases | |
| WO2026035770A1 (fr) | Systems and methods for transposing cargo nucleotide sequences | |
| WO2024055013A1 (fr) | Systems and methods for transposing cargo nucleotide sequences | |
| WO2024187119A2 (fr) | Systems and methods for transposing cargo nucleotide sequences | |
| WO2026044118A1 (fr) | Endonuclease systems | |
| JP2026510081A (ja) | Enzymes with RuvC domains | |
| WO2024055012A1 (fr) | Systems and methods for transposing cargo nucleotide sequences | |
| EP4716752A2 (fr) | Endonuclease systems | |
| WO2024124204A2 (fr) | Retrotransposon compositions and methods of use |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 24800721 Country of ref document: EP Kind code of ref document: A2 |
|
| ENP | Entry into the national phase |
Ref document number: 2025563862 Country of ref document: JP Kind code of ref document: A |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 2025563862 Country of ref document: JP |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 2024800721 Country of ref document: EP |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| ENP | Entry into the national phase |
Ref document number: 2024800721 Country of ref document: EP Effective date: 20251203 |
|
| WWP | Wipo information: published in national office |
Ref document number: 2024800721 Country of ref document: EP |