WO2020041172A1 - Methods and compositions for recruiting DNA repair proteins - Google Patents

Methods and compositions for recruiting DNA repair proteins

Info

Publication number
WO2020041172A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
sequence
dna repair
domain
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2019/047021
Other languages
English (en)
Inventor
Albert Cheng
Nathaniel JILLETTE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jackson Laboratory
Original Assignee
Jackson Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jackson Laboratory
Publication of WO2020041172A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/19 Omega peptidases (3.4.19)
    • C12Y304/19012 Ubiquitinyl hydrolase 1 (3.4.19.12)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y306/00 Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/04 Hydrolases acting on acid anhydrides (3.6) acting on acid anhydrides; involved in cellular and subcellular movement (3.6.4)
    • C12Y306/04012 DNA helicase (3.6.4.12)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y306/00 Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/04 Hydrolases acting on acid anhydrides (3.6) acting on acid anhydrides; involved in cellular and subcellular movement (3.6.4)
    • C12Y306/04013 RNA helicase (3.6.4.13)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/85 Fusion polypeptide containing an RNA binding domain
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/35 Nature of the modification
    • C12N2310/351 Conjugate
    • C12N2310/3519 Fusion with another nucleic acid

Definitions

  • programmable nucleases introduce a double stranded break (DSB) at a desired site in the genome whereupon the DSB may be repaired by one of a variety of cellular DNA repair mechanisms.
  • the most efficient repair pathway is error-prone non-homologous end joining (NHEJ), which has been widely exploited to disrupt gene function through the introduction of random insertions/deletions (indels).
  • HDR homology-directed repair
  • NHEJ and HDR represent competing pathways that contain overlapping yet distinct protein components.
  • NHEJ is the dominant outcome even in the presence of a repair template. Attempts to block NHEJ genetically or with protein and chemical inhibitors shift the balance in favor of HDR, but may risk unwanted genome instability.
  • the alternative approach is to stimulate the HDR pathway. Indeed, modest improvement in HDR efficiency has come from the overexpression/activation of proteins specific to the HDR pathway such as RAD51 16 or the fusion of DNA repair proteins with Cas9 nuclease.
  • Some aspects of the present disclosure provide methods that comprise delivering to a cell comprising a target nucleic acid (a) a ribonucleic acid (RNA)-guided nuclease or a nucleic acid encoding a RNA-guided nuclease, (b) a guide RNA (gRNA) or a nucleic acid encoding a gRNA that comprises (i) a DNA-targeting sequence that can bind specifically to a target nucleic acid sequence, (ii) a RNA-guided nuclease-binding sequence, and (iii) a Pumilio-FBF (PUF) domain binding sequence (PBS), and (c) a DNA repair protein or a nucleic acid encoding a DNA repair protein (e.g., conjugate) that comprises a DNA repair domain linked to a PUF domain that binds to the PBS.
  • a target nucleic acid e.g., RNA-guided nuclease or
  • a target nucleic acid sequence is in the genome of a cell.
  • a target nucleic acid may be referred to as a genomic target nucleic acid.
  • a target nucleic acid sequence is within a gene and/or within a transcriptional regulatory sequence.
  • the methods comprise delivering to the cell a ribonucleoprotein complex comprising the RNA-guided nuclease (e.g., a Cas9 nuclease or a Cas9 nickase) bound to the gRNA.
  • RNA-guided nuclease e.g., a Cas9 nuclease or a Cas9 nickase
  • the methods further comprise delivering to the cell a donor nucleic acid comprising a sequence of interest.
  • a donor nucleic acid may be single-stranded or double-stranded, for example.
  • the methods further comprise maintaining a cell under conditions that result in cleavage of a target nucleic acid sequence. In some embodiments, the methods further comprise maintaining a cell under conditions that result in the production of a cellular nucleic acid comprising a sequence of interest.
  • a RNA-guided DNA nuclease is a Cas9 nuclease or a Cas9 nickase.
  • a gRNA comprises at least two PUF domain binding sequences.
  • a gRNA may comprise 2-50 PUF domain binding sequences.
  • a PBS has a length of at least 8 nucleotides. In some embodiments, a PBS comprises the nucleotide sequence of SEQ ID NO: 83.
  • a PUF domain comprises a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain.
  • a PUF domain comprises a PUFa domain that comprises the amino acid sequence of SEQ ID NO: 27.
  • a PUF domain comprises a C-terminus and a N-terminus, and the DNA repair domain is linked to the C-terminus or the N-terminus of the PUF domain.
  • a DNA repair domain comprises an enzymatic activity selected from the group consisting of ligase activity, polymerase activity, topoisomerase activity, helicase activity, and nuclease activity.
  • a DNA repair domain comprises a ligase, a polymerase, a topoisomerase, a helicase, or a nuclease.
  • a DNA repair domain may comprise a protein selected from the group consisting of: Replication Protein A1 (RPA1);
  • RPA2 Replication Protein A2
  • FANCM Fanconi Anemia Complementation Group M
  • RAD51 Recombinase
  • RAD52 Homolog, DNA Repair Protein
  • RAD51C RAD51 Paralog C
  • RAD18 E3 Ubiquitin Protein Ligase
  • RBBP8/CTIP
  • TP53BP1 Tumor Protein P53 Binding Protein 1
  • BRCA1 BRCA1 DNA Repair Associated
  • RAD54L RAD54-like
  • PALB2
  • XRCC3 X-Ray Repair Cross Complementing 3
  • MRE11A MRE11 Homolog, Double Strand Break Repair Nuclease
  • FEN1 Flap Structure-Specific Nuclease 1
  • RECQ5 RecQ Like Helicase 5
  • FANCB FA Complementation Group B
  • USP1 Ubiquitin Specific Peptidase 1
  • FANCF FA Complementation Group F
  • FANCG FA Complementation Group G
  • a gRNA, a RNA-guided nuclease, a DNA repair protein, and/or a donor nucleic acid are encoded on independent vectors or on the same vector.
  • a vector may be a plasmid or a viral vector.
  • DNA repair proteins e.g., conjugates
  • PUF Pumilio-FBF
  • nucleic acids encoding DNA repair proteins (e.g., conjugates) of the present disclosure.
  • Still other aspects provide expression vectors comprising a promoter operably linked to nucleic acids encoding DNA repair proteins (e.g., conjugates) of the present disclosure.
  • kits comprising DNA repair proteins (e.g., conjugates), nucleic acids, or expression vectors described herein.
  • the kits further comprise a RNA-guided nuclease or a nucleic acid encoding a RNA-guided nuclease.
  • kits further comprise a gRNA or a nucleic acid encoding a gRNA that comprises (i) a DNA-targeting sequence that is complementary to a target nucleic acid sequence, (ii) a RNA-guided nuclease-binding sequence, and (iii) a PBS, wherein the PUF domain of the DNA repair protein (e.g., conjugate) can bind to the PBS.
  • the kits further comprise a donor nucleic acid that comprises a sequence of interest.
  • a cell comprising a DNA repair protein (e.g., conjugate), a nucleic acid, or an expression vector of the present disclosure.
  • a cell further comprises a RNA-guided nuclease or a nucleic acid encoding a RNA-guided nuclease (e.g., Cas9 nuclease or Cas9 nickase).
  • a cell further comprises a gRNA or a nucleic acid encoding a gRNA that comprises (i) a DNA-targeting sequence that is complementary to a target nucleic acid sequence, (ii) a RNA-guided nuclease-binding sequence, and (iii) a PBS to which the PUF domain of the DNA repair protein (e.g., conjugate) can bind.
  • methods comprising delivering to a cell a programmable nuclease-based gene editing system that comprises a programmable nuclease linked to a DNA repair domain, wherein the programmable nuclease cleaves a target nucleic acid sequence, and the DNA repair domain is selected from the group consisting of: RPA1; RPA2; FANCM; BRCA1; RAD54L; PALB2; XRCC3; FEN1; RECQ5; FANCB; USP1; FANCF; and FANCG.
  • the methods further comprise delivering to the cell a donor nucleic acid comprising a sequence of interest.
  • a programmable nuclease comprises a RNA-guided nuclease, such as Cas9 nuclease or Cas9 nickase.
  • a method further comprises delivering to the cell a gRNA or a nucleic acid encoding a gRNA that specifically binds to a target nucleic acid sequence.
  • a programmable nuclease comprises a zinc finger nuclease (ZFN). In other embodiments, a programmable nuclease comprises a transcription activator-like effector nuclease (TALEN).
  • ZFN zinc finger nuclease
  • TALEN transcription activator-like effector nuclease
  • a programmable nuclease may be directly linked to (e.g., fused to) the DNA repair domain or indirectly linked (e.g., via at least one linker molecule) to the DNA repair domain.
  • DNA repair proteins comprising a programmable nuclease linked to DNA repair domain selected from the group consisting of: RPA1; RPA2; FANCM; BRCA1; RAD54L; PALB2; XRCC3; FEN1; RECQ5; FANCB; USP1; FANCF; and FANCG.
  • a programmable nuclease comprises a RNA-guided nuclease (e.g., Cas9 nuclease or Cas9 nickase), a ZFN, or a TALEN.
  • FIGS. 1A-1B The figures show HDR/NHEJ reporters in HEK293T.
  • Fig. 1A A constitutively expressed BFP inserted at the AAVS1 locus serves as a gene editing reporter that can be targeted with CRISPR/Cas9 and a BFP->GFP donor to produce non-fluorescent (NHEJ) or green-fluorescent (HDR) cells.
  • Fig. 1B Flow cytometry plots of non-targeted HEK293T/BFP cells, or those targeted with Cas9 + sgRNA + donor.
  • FIG. 2 The figure shows recruitment of DNA repair protein by direct fusion to Cas9.
  • NHEJ is the predominant choice of repair pathway after Cas9 mediated DSB;
  • DRP DNA repair protein
  • FIG. 3 The figure shows fusion of BRCA1 to Cas9 biases editing outcome towards HDR but with a decrease in total editing.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Unfused Cas9; (B) BRCA1-Cas9; (C) Cas9-BRCA1. Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion).
  • FIG. 4 shows recruitment of DNA repair protein to the CRISPR/Cas9 complex via the Casilio methodology. Fusions of DNA repair proteins with Pumilio/FBF (PUF) RNA binding domains can be recruited to the CRISPR/Cas9 complex via binding to the multiple (N) copies of Pumilio binding sites (PBS) inserted on the single-guide RNA with PBS sites (sgRNA-PBS). Recruitment of particular DRPs (DNA Repair Proteins) can bias the editing outcome towards HDR.
  • PBS pumilio binding sites
  • sgRNA-PBS single-guide RNA with PBS sites
  • FIG. 5 The figures show recruitment of BRCA1 by Casilio strategy enhances HDR without compromising total editing.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Unfused Cas9; (B) BRCA1-Cas9; (C) Cas9-BRCA1; (D) Cas9+BRCA1-PUFa; and (E) Cas9+PUFa-BRCA1.
  • FIG. 6 The figures show recruitment of RAD54L enhances HDR. BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different
  • FIG. 7. The figure shows recruitment of multiple DNA repair proteins to the
  • a Casilio complex with Cas9 enzyme can recruit multiple different protein factors (P1, P2, and Pi) to enhance HDR.
  • FIG. 8. The figures show recruitment of CtIP(T847E)-PALB2(KR)-BRCA1 complex enhances HDR.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9; (B) Cas9+CtIP(T847E)-PUFa+PALB2(KR)-PUFa+BRCA1-PUFa and (C) Cas9+PUFa-CtIP(T847E)+PUFa-PALB2(KR)+PUFa-BRCA1. Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion). Numbers above column indicate HDR/NHEJ ratios.
  • FIG. 9 The figures show recruitment of RAD51 enhances HDR at nick mediated by Cas9Nickase (Cas9n; Cas9 D10A nickase).
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9Nickase; (B) Cas9Nickase+RAD51-PUFa or (C) Cas9Nickase+PUFa-RAD51. Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion).
  • FIG. 10 shows local recruitment of CtIP(T847E)-PALB2(KR)-BRCA1 complex enhances HDR at nick mediated by Cas9Nickase (Cas9n; Cas9 D10A nickase).
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9Nickase; (B) Cas9Nickase+CtIP(T847E)-PUFa+PALB2(KR)-PUFa+BRCA1-PUFa; or (C) Cas9Nickase+PUFa-CtIP(T847E)+PUFa-PALB2(KR)+PUFa-BRCA1.
  • FIG. 11 The figure shows recruitment of XRCC3 by Casilio strategy enhances HDR.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9; (B) Cas9+XRCC3-PUFa; and (C) Cas9+PUFa-XRCC3.
  • Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion).
  • FIG. 12 The figure shows recruitment of RECQ5 by Casilio strategy enhances HDR.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9; (B) Cas9+RECQ5-PUFa; and (C) Cas9+PUFa-RECQ5. Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion).
  • FIG. 13 The figure shows recruitment of FEN1 by Casilio strategy enhances HDR.
  • BFP->GFP editing experiments were conducted on the HEK293T/BFP reporter cell line using different CRISPR/Cas9 complexes formed by a BFP->GFP ssODN donor, BFP-targeting sgRNA-5xPBSa plus one of (A) Cas9; (B) Cas9+FEN1-PUFa; and (C) Cas9+PUFa-FEN1. Stacked columns show the percentage of GFP+ cells, indicative of HDR (green lower portion) and percentage of double negative (Dbl-) cells, indicative of NHEJ (patterned upper portion).
  • FIG. 14 The figure shows recruitment of Fanconi Anemia (FA) pathway proteins by Casilio strategy enhances HDR. BFP->GFP editing experiments were conducted on the
  • FIG. 15 The figure shows more examples of factors that enhance HDR when recruited to site of Cas9Nickase (Cas9n; Cas9 D10A nickase)-mediated DNA nick.
  • the terms "nucleic acid" and "polynucleotide" may be used interchangeably herein.
  • Nucleic acids, including nucleic acids with a phosphothioate backbone, can include one or more reactive moieties.
  • the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide, through covalent, non-covalent or other interactions.
  • the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.
  • Nucleic acids may include nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
  • Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages.
  • Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g. phosphorodiamidate morpholino oligos or locked nucleic acids (LNA)), including those described in U.S. Patent Nos.
  • nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
  • the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
  • the range of values provided includes the specified value. As recognized by a person of ordinary skill in the art such specified value would reasonably include a standard deviation using measurements generally acceptable in the art. In some embodiments, the standard deviation includes a range extending to +/- 10% of the specified value.
  • polypeptide refers to a polymer of amino acid residues, wherein the polymer may be linked to (e.g., conjugated to) a moiety that does not include amino acids.
  • the terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
  • the terms apply to macrocyclic peptides, peptides that have been modified with non-peptide functionality, peptidomimetics, polyamides, and macrolactams.
  • a protein conjugate may include two or more protein domains directly or indirectly linked to each other.
  • a fusion protein is an example of a protein conjugate.
  • a "fusion protein” refers to a chimeric protein encoding two or more separate protein sequences (e.g., domains) that are recombinantly expressed as a single moiety.
  • The terms "peptidyl" and "peptidyl moiety" mean a monovalent peptide.
  • The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
  • Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
  • Amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
  • Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
  • the terms "non-naturally occurring amino acid” and "unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature.
  • Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
  • An amino acid or nucleotide base "position" is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5'-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence. For example, in a case where a variant has a deletion relative to an aligned reference sequence, there will be no amino acid in the variant that corresponds to a position in the reference sequence at the site of deletion.
  • corresponding to when used in the context of the numbering of a given amino acid or nucleic acid sequence, refers to the numbering of the residues of a specified reference sequence when the given amino acid or nucleic acid sequence is compared to the reference sequence.
  • "Conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide.
  • Such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid.
  • each codon in a nucleic acid except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan
  • As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing
  • the following eight groups each contain amino acids that are conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and
  • the term "functional fragment" refers to a protein, peptide, peptidyl moiety or nucleic acid that is comparable in function to another protein, peptide or nucleic acid (i.e., a so-called "reference" protein, peptide or nucleic acid), but differs in composition (e.g., has a similar but not identical amino acid sequence, nucleotide sequence or lengths thereof) and differs in structure or origin to a reference protein, peptide or nucleic acid.
  • the term "functional fragment" includes any recombinant or naturally-occurring form of a protein or nucleic acid sequence, variants thereof that maintain protein or nucleic acid sequence activity (e.g.
  • a functional fragment of a protein or nucleic acid may include individual substitutions, deletions or additions to a protein or nucleic acid, which alters, adds or deletes a single amino acid or nucleotide.
  • a “ribonucleoprotein complex” as provided herein refers to a complex including a nucleoprotein and a ribonucleic acid.
  • a "nucleoprotein" as provided herein refers to a protein capable of binding a nucleic acid (e.g., RNA, DNA). Where the nucleoprotein binds a ribonucleic acid it is referred to as "ribonucleoprotein." The interaction between the ribonucleoprotein and the ribonucleic acid may be direct, e.g., by covalent bond, or indirect, e.g., by non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g. dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like).
  • the ribonucleoprotein includes an RNA-binding motif non-covalently bound to the ribonucleic acid.
  • positively charged aromatic amino acid residues in the RNA-binding motif may form electrostatic interactions with the negative nucleic acid phosphate backbones of the RNA, thereby forming a ribonucleoprotein complex.
  • ribonucleoproteins include ribosomes, telomerase, RNase P, hnRNP, CRISPR associated protein 9 (Cas9) and small nuclear RNPs (snRNPs).
  • the ribonucleoprotein may be an enzyme.
  • the ribonucleoprotein is an endonuclease.
  • Percentage of sequence identity is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the nucleic acid or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
  • nucleic acids or polypeptide sequences refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., 60% identity, optionally 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identity over a specified region, e.g., of the entire polypeptide sequences of the present disclosure or individual domains of the polypeptides of the present disclosure), when compared and aligned for maximum
  • the one or more homologous donor sequences form part of the donor nucleic acid and may be substantially identical to the DNA targeting sequence.
  • the homologous donor sequences e.g., a first and/or a second homologous donor sequence
  • the homologous donor sequences are 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to the DNA-targeting sequence.
  • the homologous donor sequences are 60% identical to the DNA-targeting sequence.
  • the homologous donor sequences are 65% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 70% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 75% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 80% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 85% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 90% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 95% identical to the DNA-targeting sequence.
  • the homologous donor sequences are 98% identical to the DNA-targeting sequence. In some embodiments, the homologous donor sequences (e.g., a first and/or a second homologous donor sequence) are 99% identical to the DNA-targeting sequence.
  • identity exists over a region that is at least 50 nucleotides in length, or more preferably over a region that is 100 to 500 or 1000 or more nucleotides in length.
  • sequence comparison typically one sequence acts as a reference sequence, to which test sequences are compared.
  • test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated.
  • sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
  • a “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of, e.g., a full length sequence or from 20 to 600, 50 to 200, or 100 to 150 amino acids or nucleotides in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
  • Methods of alignment of sequences for comparison are well known in the art.
  • Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
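To make the local homology approach concrete, the following is a minimal sketch of Smith-Waterman scoring with a linear gap penalty. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the function name are illustrative assumptions, not values taken from the disclosure, and the sketch returns only the best local score rather than the alignment itself.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Assumption: scoring parameters are illustrative defaults only.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores are floored at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Keeping only two rows of the table is enough when just the score is needed; recovering the aligned residues would require storing the full table and tracing back from the best cell.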
  • Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, and Altschul et al. (1990) J. Mol. Biol. 215:403-410, respectively.
  • Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
  • This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score.
  • Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
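The seed extension and its three halting conditions can be sketched as follows for nucleotide sequences. This is a simplified, ungapped illustration rather than the BLAST implementation: it assumes the seed is an exact word match of length W, and the parameter values `M=1`, `N=-3`, `X=5` and all function names are hypothetical choices for the example.

```python
def _extend(query, subject, q, s, step, start_score, M, N, X):
    """Extend in one direction; return (best score, residues kept)."""
    score = best = start_score
    length = best_len = 0
    # The loop ends on its own when the end of either sequence is reached.
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += M if query[q] == subject[s] else N
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt when the score drops by X from its maximum, or reaches zero.
        if best - score >= X or score <= 0:
            break
        q += step
        s += step
    return best, best_len


def extend_word_hit(query, subject, q_pos, s_pos, W, M=1, N=-3, X=5):
    """Grow an exact word hit of length W into an ungapped HSP."""
    seed_score = W * M  # assumption: every seed position is a match
    best, right = _extend(query, subject, q_pos + W, s_pos + W, 1,
                          seed_score, M, N, X)
    best, left = _extend(query, subject, q_pos - 1, s_pos - 1, -1,
                         best, M, N, X)
    return {
        "query_start": q_pos - left,
        "query_end": q_pos + W + right,      # end index is exclusive
        "subject_start": s_pos - left,
        "subject_end": s_pos + W + right,
        "score": best,
    }


if __name__ == "__main__":
    print(extend_word_hit("GGGACGTACGTAAA", "CCCACGTACGTTTT", 3, 3, 4))
```

In this example the 4-residue seed `ACGT` grows rightward into `ACGTACGT`, and extension stops in both directions once two consecutive mismatches pull the score 6 below its maximum of 8, which exceeds X.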
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
  • the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877).
  • One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.
  • a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than 0.2, more preferably less than 0.01, and most preferably less than 0.001.
  • An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below.
  • a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions.
  • Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below.
  • Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.
  • “Biological sample” refers to materials obtained from or derived from a subject or patient.
  • a biological sample includes sections of tissues such as biopsy and autopsy samples, and frozen sections taken for histological purposes.
  • samples include bodily fluids such as blood and blood fractions or products (e.g., serum, plasma, platelets, red blood cells, and the like), sputum, tissue, cultured cells (e.g., primary cultures, explants, and transformed cells) stool, urine, synovial fluid, joint tissue, synovial tissue, synoviocytes, fibroblast-like
  • a biological sample is typically obtained from a eukaryotic organism, such as a mammal (e.g., a primate such as a chimpanzee or human; a cow; a dog; a cat; a rodent such as a guinea pig, rat, or mouse; or a rabbit), a bird, a reptile, or a fish.
  • a "cell” as used herein, refers to a cell carrying out metabolic or other function sufficient to preserve or replicate its genomic DNA.
  • a cell can be identified by methods that include, for example, presence of an intact membrane, staining by a particular dye, ability to produce progeny or, in the case of a gamete, ability to combine with a second gamete to produce a viable offspring.
  • Cells may include prokaryotic and eukaryotic cells.
  • Prokaryotic cells include but are not limited to bacteria.
  • Eukaryotic cells include but are not limited to yeast cells and cells derived from plants and animals, for example mammalian, insect (e.g., spodoptera) and human cells.
  • the word "expression” or “expressed” as used herein in reference to a gene means the transcriptional and/or translational product of that gene.
  • the level of expression of a DNA molecule in a cell may be determined on the basis of either the amount of corresponding mRNA that is present within the cell or the amount of protein encoded by that DNA produced by the cell (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1-18.88).
  • Expression of a transfected gene can occur transiently or stably in a cell.
  • During transient expression, the transfected gene is not transferred to the daughter cell during cell division. Since its expression is restricted to the transfected cell, expression of the gene is lost over time.
  • stable expression of a transfected gene can occur when the gene is co-transfected with another gene that confers a selection advantage to the transfected cell.
  • a selection advantage may be a resistance towards a certain toxin that is presented to the cell.
  • exogenous refers to a molecule or substance (e.g., nucleic acid or protein) that originates from outside a given cell or organism.
  • endogenous refers to a molecule or substance that is native to, or originates within, a given cell or organism.
  • The terms “transfection” and “transduction” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule and/or a protein to a cell.
  • Nucleic acids may be introduced to a cell using non-viral or viral-based methods.
  • the nucleic acid molecule can be a sequence encoding complete proteins or functional portions thereof.
  • a nucleic acid vector comprising the elements necessary for protein expression (e.g., a promoter, transcription start site, etc.).
  • Non-viral methods of transfection include any appropriate method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell.
  • non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation, transfection through heat shock, magnetofection and electroporation.
  • any useful viral vector can be used in the methods described herein.
  • viral vectors include, but are not limited to retroviral, adenoviral, lentiviral and adeno-associated viral vectors.
  • the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures.
  • the terms "transfection” or "transduction” also refer to introducing proteins into a cell from the external environment. Typically, transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest. See, e.g., Ford et al. (2001) Gene Therapy 8:1-4 and
  • a nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence.
  • DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide;
  • a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
  • operably linked means that the DNA sequences being linked are near each other, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
  • gene means the segment of DNA involved in producing a protein; it includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).
  • the leader, the trailer as well as the introns include regulatory elements that are necessary during the transcription and the translation of a gene.
  • a “protein gene product” is a protein expressed from a particular gene.
  • the named protein includes any of the protein’s naturally occurring forms, or variants or homologs that maintain the protein transcription factor activity (e.g., within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to the native protein).
  • variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g. a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring form.
  • the protein is the protein as identified by its NCBI sequence reference. In other embodiments, the protein is the protein as identified by its NCBI sequence reference or functional fragment or homolog thereof.
  • a "Cas9 nuclease” or “Cas9” protein as referred to herein includes any of the recombinant or naturally-occurring forms of the CRISPR-associated protein 9 (Cas9) or variants or homologs thereof that maintain Cas9 enzyme activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to Cas9).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g. a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring Cas9 protein.
  • the Cas9 protein is substantially identical to the protein identified by the UniProt reference number Q99ZW2 or a variant or homolog having substantial identity thereto.
  • Patient refers to a living organism suffering from or prone to a disease or condition that can be treated by administration of a composition or pharmaceutical composition as provided herein.
  • Non-limiting examples include humans, other mammals, bovines, rats, mice, dogs, monkeys, goat, sheep, cows, deer, and other non-mammalian animals.
  • a patient is human.
  • compositions provided herein e.g., the ribonucleoprotein complex, the DNA repair protein (e.g., conjugate) or nucleic acids encoding the same
  • a disease e.g., cancer
  • the terms “disease” or “condition” refer to a state of being or health status of a patient or subject capable of being treated with a compound, pharmaceutical composition, or method provided herein.
  • the disease is cancer (e.g., lung cancer, ovarian cancer, osteosarcoma, bladder cancer, cervical cancer, liver cancer, kidney cancer, skin cancer (e.g., Merkel cell carcinoma), testicular cancer, leukemia, lymphoma (e.g., mantle cell lymphoma), head and neck cancer, colorectal cancer, prostate cancer, pancreatic cancer, melanoma, breast cancer, neuroblastoma).
  • treatment or “treating,” or “palliating” or “ameliorating” are used interchangeably herein. These terms refer to an approach for obtaining beneficial or desired results including but not limited to therapeutic benefit and/or a prophylactic benefit.
  • therapeutic benefit is meant eradication or amelioration of the underlying disorder being treated.
  • a therapeutic benefit is achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder such that an improvement is observed in the patient, notwithstanding that the patient may still be afflicted with the underlying disorder.
  • the compositions may be administered to a patient at risk of developing a particular disease, or to a patient reporting one or more of the physiological symptoms of a disease, even though a diagnosis of this disease may not have been made.
  • Treatment includes preventing the disease, that is, causing the clinical symptoms of the disease not to develop by administration of a protective composition prior to the induction of the disease; suppressing the disease, that is, causing the clinical symptoms of the disease not to develop by administration of a protective composition after the inductive event but prior to the clinical appearance or reappearance of the disease; inhibiting the disease, that is, arresting the
  • an “effective amount” is an amount sufficient to accomplish a stated purpose (e.g.
  • an "effective amount” is an amount sufficient to contribute to the treatment, prevention, or reduction of a symptom or symptoms of a disease, which could also be referred to as a "therapeutically effective amount.”
  • a “reduction” of a symptom or symptoms means decreasing of the severity or frequency of the symptom(s), or elimination of the symptom(s).
  • a “prophylactically effective amount” of a drug is an amount of a drug that, when administered to a subject, will have the intended prophylactic effect, e.g., preventing or delaying the onset (or reoccurrence) of an injury, disease, pathology or condition, or reducing the likelihood of the onset (or reoccurrence) of an injury, disease, pathology, or condition, or their symptoms.
  • the full prophylactic effect does not necessarily occur by administration of one dose, and may occur only after administration of a series of doses. Thus, a prophylactically effective amount may be administered in one or more administrations.
  • An “activity decreasing amount,” as used herein, refers to an amount of antagonist required to decrease the activity of an enzyme or protein relative to the absence of the antagonist.
  • a “function disrupting amount,” as used herein, refers to the amount of antagonist required to disrupt the function of an enzyme or protein relative to the absence of the antagonist.
  • Guidance can be found in the literature for appropriate dosages for given classes of pharmaceutical products. For example, for the given parameter, an effective amount will show an increase or decrease of at least 5%, 10%, 15%, 20%, 25%, 40%, 50%, 60%, 75%, 80%, 90%, or at least 100%. Efficacy can also be expressed as a “-fold” increase or decrease. For example, a therapeutically effective amount can have at least a 1.2-fold, 1.5-fold, 2-fold, 5-fold, or more effect over a control.
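The two ways of expressing efficacy mentioned above, percent change and fold change relative to a control, are related by simple arithmetic. The sketch below is illustrative only; the function names and the example measurements are hypothetical and do not come from the disclosure.

```python
def percent_change(treated: float, control: float) -> float:
    """Percent increase (positive) or decrease (negative) versus control."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return 100.0 * (treated - control) / control


def fold_change(treated: float, control: float) -> float:
    """Ratio of the treated measurement to the control measurement."""
    if control == 0:
        raise ValueError("control value must be non-zero")
    return treated / control


if __name__ == "__main__":
    # Hypothetical readings: a control of 100 units and a treated value of 150.
    print(percent_change(150.0, 100.0), fold_change(150.0, 100.0))
```

Under these definitions a 1.5-fold effect over a control corresponds to a 50% increase, and a 2-fold effect to a 100% increase.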
  • the compounds of the present disclosure can be administered alone or can be co-administered to the patient. Co-administration is meant to include simultaneous or sequential administration of the compounds individually or in combination (more than one compound).
  • the preparations can also be combined, when desired, with other active substances (e.g. to reduce metabolic degradation).
  • the combined administration contemplates co-administration, using separate
  • “Pharmaceutically acceptable excipient” and “pharmaceutically acceptable carrier” refer to a substance that aids the administration of an active agent to and absorption by a subject and can be included in the compositions of the present disclosure without causing a significant adverse toxicological effect on the patient.
  • a target nucleic acid sequence may be any nucleic acid sequence modified as provided herein (e.g., to which a programmable nuclease and/or DNA repair protein is localized).
  • a target nucleic acid sequence may include a site that is hydrolyzed (cleaved) by a programmable nuclease (e.g., a RNA-guided nuclease, such as Cas9, a ZFN, or a TALEN).
  • a target nucleic acid sequence includes a nuclease cleavage site.
  • a target nucleic acid sequence is an exogenous nucleic acid sequence. In some embodiments, a target nucleic acid sequence is an endogenous nucleic acid sequence. In some embodiments, a target nucleic acid sequence forms part of a cellular, e.g., genomic, gene. In some embodiments, a target nucleic acid sequence is part of a transcriptional regulatory sequence. In some embodiments, a target nucleic acid sequence is part of a promoter, enhancer or silencer.
  • a target sequence is a DNA. In other embodiments, a target sequence is an RNA.
  • a target nucleic acid sequence is at, near, or within a promoter sequence. In some embodiments, a target nucleic acid sequence is at, near, or within a gene. In some embodiments, a target nucleic acid sequence is known to be associated with a disease or condition characterized by a (one or more) nucleotide mutation (e.g., substitution), insertion or deletion. In some embodiments, a target nucleic acid sequence is within a tumor suppressor gene or an oncogene, such as within a transcriptional regulatory sequence/element or coding region of the tumor suppressor gene or oncogene.
  • a target nucleic acid sequence is immediately 3’ to a protospacer adjacent motif (PAM) sequence.
  • a PAM sequence of a target nucleic acid sequence is 5'-CCN-3', wherein N is any DNA nucleotide.
  • a PAM sequence of a target nucleic acid sequence matches the Cas9 endonuclease binding site or Cas9 nickase binding site or homologs or orthologs to be used.
  • the target nucleic acid sequence in the genomic DNA should be complementary to a guide RNA sequence and immediately followed by a correct PAM sequence.
  • a PAM sequence is present in the target nucleic acid sequence but not in the guide RNA sequence. Any DNA sequence with the correct target nucleic acid sequence followed by a PAM sequence should be bound by Cas9
  • a PAM sequence may be any of the PAM sequences disclosed in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference. Other PAM sequences are known and may be used herein.
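As a rough illustration of the positional relationship described above, the sketch below scans one DNA strand, written 5' to 3', for a 5'-CCN-3' PAM and reports the stretch lying immediately 3' to it. The 20-nucleotide target length, the single-strand scan, and the function name `find_targets_after_pam` are assumptions made for the example; other PAM sequences and target lengths would need different settings.

```python
def find_targets_after_pam(dna: str, target_len: int = 20):
    """Return (pam_start, target) pairs for a strand written 5' to 3'.

    Assumptions: the PAM is 5'-CCN-3' and the target sequence is the
    `target_len` nucleotides immediately 3' to the PAM on the same strand.
    """
    dna = dna.upper()
    hits = []
    # Stop early enough that a full-length target fits after the 3-nt PAM.
    for i in range(len(dna) - 3 - target_len + 1):
        if dna[i:i + 2] == "CC" and dna[i + 2] in "ACGT":
            hits.append((i, dna[i + 3:i + 3 + target_len]))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: 2 nt of flank, a CCA PAM, a 20-nt target, 1 nt of flank.
    example = "TT" + "CCA" + "GATTACAGATTACAGATTAC" + "G"
    for pam_start, target in find_targets_after_pam(example):
        print(pam_start, target)
```

A complete search would also scan the reverse complement of the input so that sites on both strands are reported.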
  • the DNA repair complexes provided herein are, inter alia, useful for editing genome sequences by introducing precise changes in a target site in the presence of a donor sequence.
  • RNA-guided DNA endonuclease provided herein including embodiments thereof (e.g., Cas9 nuclease or Cas9 nickase) is capable of introducing a strand break (double- or single-strand break) at a target site in the genome of a cell (e.g., gene or transcriptional regulatory sequence) and the break is then predominantly repaired through the mechanism of HDR.
  • the compositions and methods provided herein meet the long-felt need of site directed, highly accurate genome editing.
  • the compositions provided herein including embodiments thereof are therefore widely useful as therapeutics and research tools.
  • Effective doses of the RNA-guided DNA endonuclease, the nucleic acid (e.g., guide RNA), the DNA repair protein (e.g., conjugate) and the donor nucleic acid provided herein as well as nucleic acids encoding the same may be administered to a subject in need thereof for treating and preventing a disease (e.g., cancer).
  • the DNA repair complexes provided herein including embodiments thereof are based on a three-component hybrid system (also known as Casilio system).
  • the Casilio system includes CRISPR/Cas9, guide RNA including PBS and PUF domain coupled with the DNA repair protein and Pumilio proteins.
  • the three-component hybrid system that includes CRISPR/Cas9 and Pumilio proteins may also be referred to interchangeably as the Casilio system.
  • the Casilio system is used for the targeted delivery of DNA repair protein domains (e.g., DNA ligases, nucleases, helicases) to a specific site in the genome.
  • the DNA repair protein domain is linked (e.g., fused) to the N-terminus or the C-terminus of Pumilio proteins or functional fragments thereof (PUF domains) that bind PBS in the Casilio system, thus bringing such domains to the vicinity of any target locus of interest that is specifically recognized by the Casilio system.
  • compositions and methods provided herein including embodiments thereof are advantageous over previous attempts to edit a target gene sequence in a cell using programmable nuclease.
  • the present disclosure permits the precise editing at specific locations in the genome, for example, by increasing HDR at a target site.
  • a DNA repair complex includes: (a) a ribonucleoprotein complex including: (i) an RNA-guided DNA endonuclease; and
  • a nucleic acid including: (1) a DNA-targeting sequence that is complementary to a target nucleic acid sequence; (2) a binding sequence for the RNA-guided DNA endonuclease; and (3) one or more PUF binding site (PBS) sequences, wherein the RNA-guided DNA endonuclease is bound to the nucleic acid via the binding sequence; and (b) a DNA repair protein (e.g., conjugate) including: (i) a PUF domain, the PUF domain having a C-terminus and an N-terminus; and (ii) a DNA repair domain, the DNA repair domain linked to the PUF domain to form a DNA repair protein, wherein the DNA repair protein binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a DNA repair complex, and wherein when the RNA-guided DNA endonuclease introduces a strand break at a target nucleic acid sequence present in a
  • the ribonucleoprotein complexes provided herein including embodiments thereof include an RNA-guided DNA endonuclease bound through a binding sequence to a nucleic acid (e.g., guide RNA).
  • the nucleic acid further includes a DNA targeting sequence, which is complementary to a target nucleic acid sequence in the genome, and one or more PUF binding site (PBS) sequences.
  • For the compositions and methods provided herein, any RNA-guided DNA endonuclease may be used.
  • An “RNA-guided DNA endonuclease” as provided herein refers to an endonuclease that can be recruited to a target sequence in the genome by a guide RNA and which is capable of introducing a strand break at a target sequence.
  • the DNA nuclease binds the guide RNA and the guide RNA is capable of hybridizing to a target sequence.
  • the RNA-guided DNA endonuclease provided herein may introduce a single strand break or a double strand break at a target nucleic acid sequence present in a genome.
  • Non-limiting examples of RNA-guided DNA endonucleases include Cas9 nuclease and Cas9 nickase.
  • the RNA-guided DNA endonuclease is a Cas9 nuclease.
  • the endonuclease may introduce a double-strand break at a target nucleic acid sequence (i.e., a break at the sense strand and a break at the antisense strand).
  • the RNA-guided DNA endonuclease is a Cas9 nickase.
  • the endonuclease may introduce a single-strand break at a target nucleic acid sequence (i.e., a break at the sense strand or a break at the antisense strand).
  • the RNA-guided DNA endonuclease is a Cas9 nickase. In some embodiments, the RNA-guided DNA endonuclease includes an alanine at a position
  • the RNA-guided DNA endonuclease is a Cas9 D10A nickase.
  • the RNA-guided DNA endonuclease includes an alanine corresponding to amino acid position 840 of SEQ ID NO: 25.
  • the RNA-guided DNA endonuclease is a Cas9 H840A nickase.
  • the RNA-guided DNA endonuclease is SpCas9 from S.
  • the RNA-guided DNA endonuclease is Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (Cpf1).
  • the Cas9 endonuclease includes the sequence of SEQ ID NO: 25.
  • the RNA-guided DNA endonuclease is the sequence of SEQ ID NO: 25.
  • the RNA-guided DNA endonuclease includes the sequence of SEQ ID NO: 26.
  • the RNA-guided DNA endonuclease is the sequence of SEQ ID NO: 26.
  • the RNA-guided DNA endonuclease includes the sequence of SEQ ID NO: 89. In some embodiments, the RNA-guided DNA endonuclease is the sequence of SEQ ID NO: 89. Any of the endonucleases described in Kleinstiver et al. (Nature, vol. 529, pages 490-495 (28 January 2016)) and Hu et al. (Nature, vol. 556, pages 57-63 (05 April 2018)), which are hereby incorporated by reference in their entirety and for all purposes, may be used for the compositions and methods provided herein.
  • a nucleic acid provided herein includes (1) a DNA-targeting sequence that is complementary to a target nucleic acid sequence, (2) a binding sequence for the RNA-guided DNA endonuclease (e.g., Cas9 nuclease, Cas9 nickase, Cas9 H840A nickase, Cas9 D10A nickase), and (3) one or more PUF binding site (PBS) sequences.
  • the complex includes Cas9 nuclease bound to the nucleic acid thereby forming a
  • the complex includes Cas9 nickase bound to the nucleic acid thereby forming a ribonucleoprotein complex.
  • the nucleic acid is a ribonucleic acid.
  • the nucleic acid is a guide RNA.
  • a “guide RNA” or “gRNA” as provided herein refers to a ribonucleotide sequence capable of binding a nucleoprotein, thereby forming a ribonucleoprotein complex.
  • the nucleic acid of the present disclosure can be a single RNA molecule (single RNA nucleic acid), which may include a “single-guide RNA” (abbreviated to “sgRNA” or “gRNA”). In another
  • the nucleic acid of the present disclosure includes two RNA molecules (e.g., joined together via hybridization at the binding sequence (e.g., Cas9 nuclease-binding sequence)).
  • the subject nucleic acid is inclusive, referring both to two-molecule nucleic acids and to single molecule nucleic acids (e.g., sgRNAs).
  • the nucleic acid is a single-stranded ribonucleic acid.
  • the nucleic acid e.g., gRNA
  • the nucleic acid is 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more nucleic acid residues in length.
  • the nucleic acid e.g., gRNA
  • the nucleic acid is from 10 to 30 nucleic acid residues in length.
  • the nucleic acid (e.g., gRNA) is 20 nucleic acid residues in length.
  • the length of the nucleic acid e.g., gRNA
  • the nucleic acid (e.g., gRNA) is from 5 to 50, 10 to 50, 15 to 50, 20 to 50, 25 to 50, 30 to 50, 35 to 50, 40 to 50, 45 to 50, 5 to 75, 10 to 75, 15 to 75, 20 to 75, 25 to 75, 30 to 75, 35 to 75, 40 to 75, 45 to 75, 50 to 75, 55 to 75, 60 to 75, 65 to 75, 70 to 75, 5 to 100, 10 to 100, 15 to 100, 20 to 100, 25 to 100, 30 to 100, 35 to 100, 40 to 100, 45 to 100, 50 to 100, 55 to 100, 60 to 100, 65 to 100, 70 to 100, 75 to 100, 80 to 100, 85 to 100, 90 to 100, 95 to 100, or more residues in length.
  • the nucleic acid (e.g., gRNA) is from 10 to 15, 10 to 20, 10 to 30, 10 to 40, or 10 to 50 residues in length.
  • transcription of the nucleic acid is under the control of a constitutive promoter, such as a CMV promoter or a Ubc promoter, or an inducible promoter, such as a tetracycline-responsive promoter or a steroid-responsive promoter.
  • the nucleic acid is a vector. In some embodiments, transcription of the nucleic acid is under the control of an RNA promoter. In some embodiments, the RNA promoter is a U6 promoter. In some embodiments, the RNA promoter is an H1 promoter.
  • the vector encoding the nucleic acid (for use in the methods of the present disclosure) is active in a cell from a mammal (a human; a non-human primate; a non-human mammal; a rodent); an insect, a worm, a yeast, or a bacterium.
  • the vector is a plasmid, a viral vector (such as adenoviral, retroviral, or lentiviral vector, or AAV vector), or a transposon (such as piggyBac transposon).
  • the vector can be transiently transfected into a host cell, or be integrated into a host genome by infection or transposition.
  • a nucleic acid comprising a nucleotide sequence encoding a gRNA.
  • a nucleic acid also comprises a nucleotide sequence encoding an RNA-guided DNA endonuclease (Cas9 protein) and/or a DNA repair protein (e.g., conjugate).
  • the nucleic acid includes a nucleotide sequence complementary to a target site (e.g., target nucleic acid sequence), which is referred to herein as "DNA-targeting sequence."
  • the DNA-targeting sequence may mediate binding of the ribonucleoprotein complex to a target site (e.g., target nucleic acid sequence).
  • the nucleic acid binds a target nucleic acid sequence.
  • the complement of the nucleic acid has a sequence identity of 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% to a target nucleic acid sequence.
  • the complement of the DNA-targeting sequence has a sequence identity of 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% to a target nucleic acid sequence.
  • the DNA-targeting sequence may or may not be 100% complementary to a target nucleic acid sequence.
  • the DNA-targeting sequence is complementary to a target nucleic acid sequence over 8-25 nucleotides (nts), 12-22 nucleotides, 14-20 nts, 16-20 nts, 18-20 nts, or 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nts.
  • the complementary region includes a continuous stretch of 12-22 nts, preferably at the 3’ end of the DNA-targeting sequence.
  • the 5’ end of the DNA-targeting sequence has up to 8 nucleotide mismatches with a target nucleic acid sequence.
  • the DNA-binding sequence is 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% complementary to a target nucleic acid sequence.
  • the RNA-guided DNA endonuclease in the complex is a wildtype Cas9 protein.
  • the RNA-guided DNA endonuclease is a Cas9 nickase. In some embodiments, the RNA-guided DNA endonuclease is a Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (Cpf1).
  • the DNA-targeting sequence is functionally similar or equivalent to the crRNA or guide RNA or gRNA of the CRISPR/Cas complex / system.
  • the DNA-targeting sequence may not originate from any particular crRNA or gRNA, but can be arbitrarily designed based on the sequence of a target nucleic acid sequence.
  • the DNA-targeting sequence includes a nucleotide sequence that is complementary to a specific sequence within a target DNA (or the complementary strand of a target DNA).
  • the DNA-targeting sequence interacts with a target nucleic acid sequence of a target DNA in a sequence-specific manner via hybridization (i.e., base pairing).
  • the nucleotide sequence of the DNA-targeting sequence may vary, and it determines the location within a target DNA that the subject nucleic acid and a target DNA will interact.
  • the DNA-targeting sequence can be modified or designed (e.g., by genetic engineering) to hybridize to any desired sequence within a target DNA.
  • a target nucleic acid sequence is immediately 3' to a PAM (protospacer adjacent motif) sequence of the complementary strand, which can be 5'-CCN-3', wherein N is any DNA nucleotide. That is, in this embodiment, the complementary strand of a target nucleic acid sequence is immediately 5' to a PAM sequence that is 5'-NGG-3', wherein N is any DNA nucleotide.
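The PAM relationship described above can be illustrated with a short Python sketch. It scans one DNA strand for 5'-NGG-3' motifs and reports the stretch lying immediately 5' of each motif, together with the reverse complement of that stretch (the strand a DNA-targeting sequence would hybridize to). The function names, the 20 nt default length, and the example sequence are assumptions made for illustration only and are not taken from the disclosure.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (5'->3')."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_ngg_sites(strand: str, length: int = 20):
    """List candidate sites on `strand` that sit immediately 5' of an NGG PAM.

    `length` (hypothetical default of 20 nt) is the size of the stretch
    reported upstream of each PAM.
    """
    strand = strand.upper()
    sites = []
    # A lookahead is used so that overlapping NGG motifs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", strand):
        pam_start = match.start()
        start = pam_start - length
        if start < 0:
            continue  # not enough sequence upstream of this PAM
        upstream = strand[start:pam_start]
        sites.append(
            {
                "start": start,
                "pam": match.group(1),
                "pam_proximal_strand": upstream,
                # Strand that would base-pair with the guide.
                "target_strand": reverse_complement(upstream),
            }
        )
    return sites


example = "AAA" + "GATTACAGATTACAGATTAC" + "TGG" + "AA"
sites = find_ngg_sites(example)
```

Only the strand passed in is scanned; a fuller search would presumably also scan the reverse complement of the input.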
  • the DNA-targeting sequence can have a length of from 12 nucleotides to 100 nucleotides.
  • the DNA-targeting sequence can have a length of from 12 nucleotides (nt) to 80 nt, from 12 nt to 50 nt, from 12 nt to 40 nt, from 12 nt to 30 nt, from 12 nt to 25 nt, from 12 nt to 20 nt, or from 12 nt to 19 nt.
  • the DNA-targeting sequence can have a length of from 19 nt to 20 nt, from 19 nt to 25 nt, from 19 nt to 30 nt, from 19 nt to 35 nt, from
  • the nucleotide sequence of the DNA-targeting sequence that is complementary to a target nucleic acid sequence of a target DNA can have a length of at least 12 nt.
  • the DNA-targeting sequence that is complementary to a target nucleic acid sequence of a target DNA can have a length at least 12 nt, at least 15 nt, at least 18 nt, at least 19 nt, at least 20 nt, at least 25 nt, at least 30 nt, at least 35 nt or at least 40 nt.
  • the DNA-targeting sequence that is complementary to a target nucleic acid sequence of a target DNA can have a length of from 12 nucleotides (nt) to 80 nt, from 12 nt to 50 nt, from 12 nt to 45 nt, from 12 nt to 40 nt, from 12 nt to 35 nt, from 12 nt to 30 nt, from 12 nt to 25 nt, from 12 nt to 20 nt, from 12 nt to 19 nt, from 19 nt to 20 nt, from 19 nt to 25 nt, from 19 nt to 30 nt, from 19 nt to 35 nt, from 19 nt to 40 nt, from 19 nt to 45 nt, from 19 nt to 50 nt, from 19 nt to 60 nt, or from 20 nt to 25 nt.
  • the DNA-targeting sequence that is complementary to a target nucleic acid sequence of a target DNA is 20 nucleotides in length. In some cases, the DNA-targeting sequence that is complementary to a target nucleic acid sequence of a target DNA is 19 nucleotides in length.
  • the percent complementarity between the DNA-targeting sequence and a target nucleic acid sequence can be at least 50% (e.g., at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%). In some cases, the percent complementarity between the DNA-targeting sequence and a target nucleic acid sequence is 100% over the seven or eight contiguous 5'-most nucleotides of a target nucleic acid sequence. In some embodiments, the percent complementarity between a DNA-targeting sequence and a target nucleic acid sequence is at least 60% over 20 contiguous nucleotides.
  • the percent complementarity between the DNA-targeting sequence and a target nucleic acid sequence is 100% over the 7, 8, 9, 10, 11, 12, 13, or 14 contiguous 5'-most nucleotides of a target nucleic acid sequence (i.e., the 7, 8, 9, 10, 11, 12, 13, or 14 contiguous 3'-most nucleotides of the DNA-targeting sequence), and as low as 0% over the remainder.
  • the DNA-targeting sequence can be considered to be 7, 8, 9, 10, 11, 12, 13, or 14 nucleotides in length, respectively.
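To make the percent-complementarity language above concrete, the following Python sketch computes complementarity between an RNA DNA-targeting sequence and a DNA target strand of equal length, either over the full length or over a window at the 3' end of the DNA-targeting sequence (which pairs with the 5'-most nucleotides of the target). It is a minimal sketch that assumes antiparallel Watson-Crick pairing only (no G-U wobble, no gaps); the function name and example sequences are hypothetical.

```python
# RNA base (guide) paired with DNA base (target); Watson-Crick pairs only.
_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide_rna: str, target_dna: str, window=None) -> float:
    """Percent of positions that base-pair when the strands align antiparallel.

    Both sequences are written 5'->3' and must have the same length.
    If `window` is given, only the `window` 3'-most guide nucleotides
    (i.e., the 5'-most target nucleotides) are scored.
    """
    guide = guide_rna.upper()
    target = target_dna.upper()
    if len(guide) != len(target) or not guide:
        raise ValueError("sequences must be non-empty and of equal length")
    # Guide position i (from its 5' end) faces target position len-1-i.
    pairs = list(zip(guide, reversed(target)))
    if window is not None:
        if not 0 < window <= len(pairs):
            raise ValueError("window must be between 1 and the sequence length")
        pairs = pairs[-window:]
    matches = sum(pair in _PAIRS for pair in pairs)
    return 100.0 * matches / len(pairs)


guide = "ACGUACGUAC"
perfect_target = "GTACGTACGT"      # reverse complement of the guide, as DNA
mismatched_target = "GTACGTACGA"   # mismatch opposite the guide's 5'-most base

full = percent_complementarity(guide, mismatched_target)
seed = percent_complementarity(guide, mismatched_target, window=7)
```

In this example the single mismatch lies opposite the 5' end of the DNA-targeting sequence, so overall complementarity falls below 100% while the seven 3'-most nucleotides remain fully complementary.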
  • the nucleic acid is 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% complementary to the sequence of a cellular gene.
  • the complex includes a Cas9 endonuclease or a Cas9 nickase bound to the nucleic acid through binding a binding sequence of the nucleic acid and thereby forming a ribonucleoprotein complex.
  • the binding sequence forms a hairpin structure.
  • the binding sequence is 30-100 nt, 35-50 nt, 37-47 nt, or 42 nt in length.
  • the binding sequence is the sequence of SEQ ID NO: 89.
  • the binding sequence is the sequence of SEQ ID NO: 90.
  • the binding sequence includes the sequence of SEQ ID NO: 89.
  • the binding sequence includes the sequence of SEQ ID NO:90.
  • the binding sequence (protein-binding segment or protein-binding sequence) of the subject nucleic acid binds to an RNA-guided DNA endonuclease (e.g., Cas9 endonuclease or a Cas9 nickase).
  • the binding sequence interacts with or is bound by an RNA-guided DNA endonuclease (e.g., Cas9 endonuclease or a Cas9 nickase), and together they bind to a target nucleic acid sequence recognized by the DNA-targeting sequence.
  • the binding sequence includes two complementary stretches of nucleotides that hybridize to one another to form a double stranded RNA duplex (a dsRNA duplex).
  • the two complementary stretches of nucleotides may be covalently linked by intervening nucleotides known as linkers or linker nucleotides (e.g., in the case of a single-molecule nucleic acid), and hybridize to form the double stranded RNA duplex (dsRNA duplex, or "Cas9-binding hairpin") of the binding sequence (Cas9-binding sequence), thus resulting in a stem-loop structure.
  • the two complementary stretches of nucleotides may not be covalently linked, but instead are held together by hybridization between complementary sequences (e.g., in the case of a two-molecule nucleic acid of the present disclosure).
  • the binding sequence can have a length of from 10 nucleotides to 100 nucleotides, e.g., from 10 nucleotides (nt) to 20 nt, from 20 nt to 30 nt, from 30 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt.
  • the Cas9-binding sequence can have a length of from 15 nucleotides (nt) to 80 nt, from 15 nt to 50 nt, from 15 nt to 40 nt, from 15 nt to 30 nt, from 37 nt to 47 nt (e.g., 42 nt), or from 15 nt to 25 nt.
  • the dsRNA duplex of the binding sequence can have a length from 6 base pairs (bp) to 50 bp.
  • the dsRNA duplex of the binding sequence can have a length from 6 bp to 40 bp, from 6 bp to 30 bp, from 6 bp to 25 bp, from 6 bp to 20 bp, from 6 bp to 15 bp, from 8 bp to 40 bp, from 8 bp to 30 bp, from 8 bp to 25 bp, from 8 bp to 20 bp or from 8 bp to 15 bp.
  • the dsRNA duplex of the binding sequence can have a length from 8 bp to 10 bp, from 10 bp to 15 bp, from 15 bp to 18 bp, from 18 bp to 20 bp, from 20 bp to 25 bp, from 25 bp to 30 bp, from 30 bp to 35 bp, from 35 bp to 40 bp, or from 40 bp to 50 bp.
  • the dsRNA duplex of the binding sequence (Cas9-binding sequence) has a length of 36 base pairs.
  • the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence can be at least 60%.
  • the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence can be at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99%.
  • the percent complementarity between the nucleotide sequences that hybridize to form the dsRNA duplex of the binding sequence is 100%.
  • the nucleic acid further includes a linker sequence linking the DNA-targeting sequence to the binding sequence (Cas9-binding sequence).
  • the linker can have a length of from 3 nucleotides to 100 nucleotides.
  • the linker can have a length of 3 nucleotides (nt) to 90 nt, from 3 nucleotides (nt) to 80 nt, from 3 nucleotides (nt) to 70 nt, from 3 nucleotides (nt) to 60 nt, from 3 nucleotides (nt) to 50 nt, from 3 nucleotides (nt) to 40 nt, from 3 nucleotides (nt) to 30 nt, from 3 nucleotides (nt) to 20 nt or from 3 nucleotides (nt) to 10 nt.
  • the linker can have a length of from 3 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt.
  • the linker is 4 nt.
  • Non-limiting examples of nucleotide sequences that can be included in a suitable binding sequence are set forth in SEQ ID NOs: 563-682 of WO 2013/176772 (see, for example, FIGs. 8 and 9 of WO 2013/176772), which is hereby incorporated by reference in its entirety and for all purposes.
  • the binding sequence includes a nucleotide sequence that differs by 1, 2, 3, 4, or 5 nucleotides from any one of the above-listed sequences. In some embodiments, the binding sequence (Cas9-binding sequence) includes a nucleotide sequence that has 98%, 97%, 96% or 95% sequence identity to any one of the above-listed sequences.
  • "PBS sequences" or "PUF binding site" sequences as provided herein refer to a site that is bound by a Pumilio/fem-3 mRNA binding factor (PUF).
  • a PUF binding site may form part of a guide RNA and provide for the binding of a PUF protein or PUF domain as provided herein (e.g., PUFa, PUFb, PUFc or functional fragments thereof) to the guide RNA.
  • the PUF binding site includes a nucleic acid sequence (i.e., a PBS sequence or PUF binding site sequence) which is characteristic of the PBS and may be bound directly by the PUF protein.
  • the nucleic acid further includes one or more PUF binding site (PBS) sequences.
  • the one or more PBS sequences are 8 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 9 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 10 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 11 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 12 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 13 nucleotides in length.
  • the one or more PBS sequences are at least 14 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 15 nucleotides in length. In some embodiments, the one or more PBS sequences are at least 16 nucleotides in length. Any of the PBS sequences disclosed in Katarzyna et al. (PNAS May 10, 2016. 113 (19) E2579-E2588) and Zhao et al. (Nucleic Acids Research, Volume 46, Issue 9, 18 May 2018, Pages 4771-4782) may be used as provided herein including embodiments thereof.
  • the one or more PBS sequences are identical.
  • the nucleic acid includes 1 to 50 PBS sequences. Any one of the PBS sequences disclosed in international application PCT/US2016/021491 and published as WO2016148994 A8, which is hereby incorporated by reference in its entirety and for all purposes, are
  • the nucleic acid of the present disclosure may have more than one copy of the PBS sequences.
  • the nucleic acid comprises 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 46, 47, 48, 49, or 50 copies of PBS sequences, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 copies of PBS sequences.
  • the range of the PBS sequence copy number is L to H, wherein L is any one of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, or 40, and wherein H is any one of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 80, 90, or 100, so long as H is greater than L.
  • Each PBS sequence may be the same or different.
  • a nucleic acid includes 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 46, 47, 48, 49, or 50 copies, or 1-50, 2-45, 3-40, 5-35, 5-10, 10-20 copies of identical or different PBS sequences.
  • the nucleic acid includes 5-15 copies of PBS sequences, or 5-14 copies, 5-13 copies, 5-12 copies, 5-11 copies, 5-10 copies, or 5-9 copies of PBS sequences.
  • the amount of the gRNA-PBS sequences and/or the amount of the DNA repair protein (e.g., conjugate) transfected or expressed is adjusted to maximize PBS/PUF domain binding. For example, this can be achieved by increasing the expression of the PUF domain by a stronger promoter or using an inducible promoter, such as a Dox-inducible promoter.
  • the spacing between PBS sequences and/or spacer sequences is optimized to improve system efficiency.
  • spacing optimization can be subject to particular DNA repair proteins (e.g., conjugates), and can be different between proteins that work as individual proteins and those DNA repair proteins that may need to be positioned close enough to function (e.g., protein complexes).
  • one or more spacer region(s) separate two adjacent PBS sequences.
  • the spacer regions may have a length of from 3 nucleotides to 100 nucleotides.
  • the spacer can have a length of from 3 nucleotides (nt) to 90 nt, from 3 nucleotides (nt) to 80 nt, from 3 nucleotides (nt) to 70 nt, from 3 nucleotides (nt) to 60 nt, from 3 nucleotides (nt) to 50 nt, from 3 nucleotides (nt) to 40 nt, from 3 nucleotides (nt) to 30 nt, from 3 nucleotides (nt) to 20 nt or from 3 nucleotides (nt) to 10 nt.
  • the spacer can have a length of from 3 nt to 5 nt, from 5 nt to 10 nt, from 10 nt to 15 nt, from 15 nt to 20 nt, from 20 nt to 25 nt, from 25 nt to 30 nt, from 30 nt to 35 nt, from 35 nt to 40 nt, from 40 nt to 50 nt, from 50 nt to 60 nt, from 60 nt to 70 nt, from 70 nt to 80 nt, from 80 nt to 90 nt, or from 90 nt to 100 nt.
  • the spacer is 4 nt.
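The arrangement of repeated PBS sequences separated by spacer regions described above can be sketched in Python as follows. The function name, the 8-mer used as the PBS, and the 4 nt spacer sequence are hypothetical placeholders, not sequences from the disclosure; the bounds checked (1 to 50 copies, 3 to 100 nt spacers) are one of the ranges recited above and are applied here only as an assumption.

```python
def build_pbs_array(pbs: str, copies: int, spacer: str) -> str:
    """Concatenate `copies` PBS sequences with `spacer` between adjacent copies."""
    if not 1 <= copies <= 50:
        raise ValueError("copies must be between 1 and 50")
    if not 3 <= len(spacer) <= 100:
        raise ValueError("spacer must be 3 to 100 nt long")
    # Spacers sit only between adjacent PBS copies, not at the ends.
    return spacer.join([pbs] * copies)


# Hypothetical 8 nt PBS and 4 nt spacer, five copies.
array = build_pbs_array("UGUAUAUA", copies=5, spacer="GCCA")
```

With five 8 nt copies and four 4 nt spacers, the resulting array is 56 nt long; varying `spacer` is one simple way to explore the spacing optimization mentioned above.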
  • the one or more PBS sequences are 8 nucleotides in length. In some embodiments, the one or more PBS sequences are identical. In some embodiments, the nucleic acid includes 1 to 50 PBS sequences. In some embodiments, the one or more PBS sequences include the nucleotide sequence of SEQ ID NO: 83.
  • the DNA repair proteins (e.g., conjugates) provided herein are, inter alia, useful for repairing the strand break introduced by the RNA-guided DNA endonuclease provided herein.
  • the DNA repair proteins provided herein include two domains, a PUF domain capable of binding the one or more PBS sequences, and a DNA repair domain, which is linked to the PUF domain and mediates repair of the strand break in a target nucleic acid sequence.
  • the DNA repair proteins provided herein are recruited to a target nucleic acid sequence in the genome of a cell.
  • upon recruitment to a target nucleic acid sequence and introduction of a strand break at a target nucleic acid sequence by the RNA-guided DNA endonuclease, the DNA repair domain subsequently repairs the strand break.
  • PUF proteins (named after Drosophila Pumilio and C. elegans fem-3 binding factor) are involved in mediating mRNA stability and translation. These proteins contain a unique RNA-binding domain known as the PUF domain.
  • the RNA-binding PUF domain, such as that of the human Pumilio 1 protein (referred to here also as PUM), contains 8 repeats (each repeat called a PUF motif or a PUF repeat) that bind consecutive bases in an anti-parallel fashion, with each repeat recognizing a single base, i.e., PUF repeats R1 to R8 recognize nucleotides N8 to N1, respectively.
  • PUM is composed of eight tandem repeats, each repeat consisting of 34 amino acids that folds into tightly packed domains composed of alpha helices.
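The anti-parallel, one-repeat-per-base recognition described above for the eight-repeat case can be written out as a small Python sketch. It takes an 8 nt RNA site written 5' to 3' (N1 to N8) and lists which PUF repeat (R1 to R8) would face each nucleotide. The function name and the example 8-mer are assumptions for illustration; the sketch does not model which amino acids in a repeat confer base specificity.

```python
def map_puf_repeats(pbs: str) -> dict:
    """Map PUF repeats R1..R8 to the nucleotides N8..N1 of an 8 nt RNA site."""
    pbs = pbs.upper()
    if len(pbs) != 8 or set(pbs) - set("ACGU"):
        raise ValueError("expected an 8 nt RNA sequence over A, C, G, U")
    n = len(pbs)
    # Repeat Rr faces nucleotide N(n - r + 1), i.e., index n - r in the string.
    return {f"R{r}": (f"N{n - r + 1}", pbs[n - r]) for r in range(1, n + 1)}


# Hypothetical 8-mer site, written 5'->3'.
mapping = map_puf_repeats("UGUAUAUA")
```

Here `mapping["R1"]` refers to the 3'-most nucleotide (N8) and `mapping["R8"]` to the 5'-most nucleotide (N1), reflecting the anti-parallel arrangement.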
  • the PUF domain binds 8, 9 or 16 nucleotides of the PUF binding site (PBS) sequence.
  • the PUF domain is any of the domains disclosed in Katarzyna et al. (PNAS May 10, 2016. 113 (19) E2579-E2588) or Zhao et al. (Nucleic Acids Research, Volume 46, Issue 9, 18 May 2018, Pages 4771-4782), which are herewith incorporated by reference in their entirety and for all purposes.
  • the DNA repair proteins provided herein including embodiments thereof may be proteins (e.g., conjugates) that include a PUF domain linked to a DNA repair domain.
  • the DNA repair domain may be linked to the N-terminus or the C-terminus of the PUF domain.
  • the term "PUF domain" refers to a wildtype or naturally existing PUF domain, as well as a PUF homologue domain that is based on / derived from a natural or existing PUF domain, such as the prototype human Pumilio 1 PUF domain.
  • the PUF domain of the present disclosure specifically binds to an RNA sequence (e.g., an 8-mer RNA sequence), wherein the overall binding specificity between the PUF domain and the RNA sequence is defined by sequence specific binding between each PUF motif / PUF repeat within the PUF domain and the corresponding single RNA nucleotide.
  • the term "functional variant" as used herein refers to a PUF domain having substantial or significant sequence identity or similarity to a parent PUF domain, which functional variant retains the biological activity of the PUF domain of which it is a variant, e.g., one that retains the ability to recognize target RNA to a similar extent, the same extent, or to a higher extent in terms of binding affinity, and/or with substantially the same or identical binding specificity, as the parent PUF domain.
  • the functional variant PUF domain can, for instance, be at least 30%, 50%, 75%, 80%, 90%, 98% or more identical in amino acid sequence to the parent PUF domain.
  • the functional variant can, for example, comprise the amino acid sequence of the parent PUF domain with at least one conservative amino acid substitution, for example, conservative amino acid substitutions in the scaffold of the PUF domain (i.e., amino acids that do not interact with the RNA).
  • the functional variants can comprise the amino acid sequence of the parent PUF domain with at least one non-conservative amino acid substitution. In this case, it is preferable for the non-conservative amino acid substitution to not interfere with or inhibit the biological activity of the functional variant.
  • the non-conservative amino acid substitution may enhance the biological activity of the functional variant, such that the biological activity of the functional variant is increased as compared to the parent PUF domain, or may alter the stability of the PUF domain to a desired level (e.g., due to substitution of amino acids in the scaffold).
  • the PUF domain can consist essentially of the specified amino acid sequence or sequences described herein, such that other components, e.g., other amino acids, do not materially change the biological activity of the functional variant.
  • the PUF domain is a Pumilio homology domain (PU-HUD).
  • the PU-HUD is a human Pumilio 1 domain.
  • the PUF domain has the sequence of any one of the PUF domains disclosed in international application PCT/US2016/021491, published as WO2016148994 A8, in
  • the PUF domain includes a PUFa domain, a PUFb domain, a PUFc domain, or a PUFw domain.
  • the PUFa domain has the amino acid sequence of SEQ ID NO: 27.
  • the subject nucleic acid includes one or more tandem sequences, each of which can be specifically recognized and bound by a specific PUF domain (infra). Since a PUF domain can be engineered to bind virtually any PBS sequence based on the nucleotide-specific interaction between the individual PUF motifs of the PUF domain and the single RNA nucleotide they recognize, the PBS sequences can be any designed sequence that binds its corresponding PUF domain. Any of the subject PUF domains can be made using, for example, a Golden Gate Assembly kit (see Abil et al, Journal of Biological Engineering 8:7, 2014), which is available at Addgene (Kit # 1000000051).
  • the DNA repair domains provided herein are linked to one or more PUF domains, in some embodiments, forming a DNA repair protein.
  • the strand break introduced by the RNA-guided endonuclease at a target nucleic acid sequence may be repaired at an increased rate, for example, through HDR, relative to the absence of the DNA repair domain.
  • the strand is repaired at a decreased rate, for example, through NHEJ relative to the absence of the DNA repair domain.
  • the relative amount of strand breaks repaired e.g., through HDR is higher than in the absence of the DNA repair domain.
  • the complexes provided herein are capable of increasing HDR activity at a target site.
  • the increase is 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or any percent increase in between 10% and 100% as compared to native or control levels.
  • HDR activity is increased by about 15% to about 95%. In some embodiments, HDR activity is increased by about 20% to about 95%. In some embodiments, HDR activity is increased by about 25% to about 95%. In some embodiments, HDR activity is increased by about 30% to about 95%. In some embodiments, HDR activity is increased by about 35% to about 95%. In some embodiments, HDR activity is increased by about 40% to about 95%. In some embodiments, HDR activity is increased by about 45% to about 95%. In some embodiments, HDR activity is increased by about 50 to about 95%. In some embodiments, HDR activity is increased by about 55% to about 95%. In some embodiments, HDR activity is increased by about 60% to about 95%.
  • HDR activity is increased by about 65% to about 95%. In some embodiments, HDR activity is increased by about 70 to about 95%. In some embodiments, HDR activity is increased by about 75% to about 95%. In some embodiments, HDR activity is increased by about 80% to about 95%. In some embodiments, HDR activity is increased by about 85% to about 95%. In some embodiments, HDR activity is increased by about 90% to about 95%.
  • the increase is expressed as“-fold” increase.
  • the increase in HDR activity is at least 1.2-fold, 1.5-fold, 2-fold, 5-fold, or more over a control.
  • the increase is about 1.2-fold, about 1.3-fold, about 1.4-fold, about 1.5-fold, about 1.6-fold, about 1.7-fold, about 1.8-fold, about 1.9-fold, about 2.0-fold, about 2.2-fold, about 2.3-fold, about 2.4-fold, about 2.5-fold, about 2.6-fold, about 2.7-fold, about 2.8-fold, about 2.9-fold, about 3.0-fold, about 3.2-fold, about 3.3-fold, about 3.4-fold, about 3.5-fold, about 3.6-fold, about 3.7-fold, about 3.8-fold, about 3.9-fold, about 4.0-fold, about 4.2-fold, about 4.3-fold, about 4.4-fold, about 4.5-fold, about 4.6-fold, about 4.7-fold, about
  • A "control" sample or value refers to a sample that serves as a reference, usually a known reference, for comparison to a test sample.
  • a test sample can be a sample including the DNA repair domain provided herein and compared to samples lacking the DNA repair domain, or a known standard sample useful as a negative control.
  • a control value can also be obtained from the same sample, e.g., from an earlier-obtained sample, prior to introducing a DNA repair complex provided herein.
  • controls can be designed for assessment of any number of parameters.
  • One of skill in the art will understand which controls are valuable in a given situation and will be able to analyze data based on comparisons to control values. Controls are also valuable for determining the significance of data. For example, if values for a given parameter are widely variant in controls, variation in test samples will not be considered as significant.
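The percent and fold increases in HDR activity discussed above are both comparisons of a test value against a control value. The Python sketch below shows one common way to compute them, assuming that "fold" means the ratio of test to control and that "percent increase" means the difference expressed as a percentage of the control; the disclosure does not define these formulas explicitly, and the function name and input values are hypothetical.

```python
def hdr_change(test_value: float, control_value: float) -> dict:
    """Return fold change and percent increase of a test value over a control."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    fold = test_value / control_value
    percent_increase = (test_value - control_value) / control_value * 100.0
    return {"fold": fold, "percent_increase": percent_increase}


# Hypothetical HDR frequencies (percent of edited alleles) for illustration.
change = hdr_change(test_value=10.0, control_value=5.0)
```

Under these assumed definitions, a 2-fold change corresponds to a 100% increase, so the two ways of reporting the same measurement are not numerically interchangeable.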
  • the term "repair" as provided herein refers to the processes by which a strand break is identified and corrected.
  • the process of strand repair includes several enzymatic steps the completion of which results in the transformation of a strand break into an intact strand.
  • one or more nucleotides may be replaced by new nucleotides thereby changing the sequence composition at and around the site of the strand break.
  • the process of strand break repair includes, for example, ligation, polymerization, endonucleolytic cleavage, and decoding.
  • the DNA repair domain is or comprises a ligase or ligase activity, a polymerase or polymerase activity, a topoisomerase or topoisomerase activity, a helicase or helicase activity, or an endonuclease or endonuclease activity.
  • the DNA repair domain includes an endonuclease domain, a helicase domain or a ligase domain. In some embodiments, the DNA repair domain is an endonuclease domain, a helicase domain or a ligase domain.
  • the DNA repair domain includes a BRCA1 protein or functional fragment thereof, a RAD54L protein or functional fragment thereof, a CtIP protein or functional fragment thereof, a PALB2 protein or functional fragment thereof, a RAD51A protein or functional fragment thereof, a XRCC3 protein or functional fragment thereof, a RECQ5 protein or functional fragment thereof, a FEN1 protein or functional fragment thereof, a FANCB protein or functional fragment thereof, a FANCF protein or functional fragment thereof, a FANCG protein or functional fragment thereof, a FANCM protein or functional fragment thereof, a MRE11A protein or functional fragment thereof, a USP1 protein or functional fragment thereof, a RPA1 protein or functional fragment thereof, a RPA2 protein or functional fragment thereof, a BRC3 protein or functional fragment thereof, or a BRC4 protein or functional fragment thereof.
  • the DNA repair domain is a BRCA1 protein or functional fragment thereof, a RAD54L protein or functional fragment thereof, a CtIP protein or functional fragment thereof, a PALB2 protein or functional fragment thereof, a RAD51A protein or functional fragment thereof, a XRCC3 protein or functional fragment thereof, a RECQ5 protein or functional fragment thereof, a FEN1 protein or functional fragment thereof, a FANCB protein or functional fragment thereof, a FANCF protein or functional fragment thereof, a FANCG protein or functional fragment thereof, a FANCM protein or functional fragment thereof, a MRE11A protein or functional fragment thereof, a USP1 protein or functional fragment thereof, a RPA1 protein or functional fragment thereof, a RPA2 protein or functional fragment thereof, a BRC3 protein or functional fragment thereof, or a BRC4 protein or functional fragment thereof.
  • the terms "BRCA1," "BRCA1 protein," "BRCA1 peptide" as referred to herein include any of the recombinant or naturally-occurring forms of the breast cancer type 1 susceptibility protein (BRCA1) protein or variants or homologs thereof that maintain BRCA1 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to BRCA1).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the BRCA1 peptide is substantially identical to the protein identified by the UniProt reference number P38398 or a variant or homolog having substantial identity thereto. In some embodiments, the BRCA1 peptide is substantially identical to the protein identified by the GenBank reference number AAX42696.1, NP_009225.1 or a variant or homolog having substantial identity thereto.
  • the terms "RAD54L," "RAD54L protein," "RAD54L peptide" as referred to herein include any of the recombinant or naturally-occurring forms of the DNA repair and recombination protein RAD54-like (RAD54L) protein or variants or homologs thereof that maintain RAD54L protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to RAD54L).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g. a 10, 20, 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring RAD54L polypeptide.
  • the RAD54L peptide is substantially identical to the protein identified by the UniProt reference number Q92698 or a variant or homolog having substantial identity thereto.
  • the RAD54L peptide is substantially identical to the protein identified by the GenBank reference number CAA66379.1 or a variant or homolog having substantial identity thereto.
  • CtIP C-terminal binding protein Interacting Protein
  • variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the CtIP peptide is substantially identical to the protein identified by the UniProt reference number Q99708 or a variant or homolog having substantial identity thereto. In some embodiments, the CtIP peptide is substantially identical to the protein identified by the GenBank reference number NP_002885.1 or a variant or homolog having substantial identity thereto.
  • the terms “PALB2,” “PALB2 protein,” and “PALB2 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Partner and Localizer of BRCA2 (PALB2) protein or variants or homologs thereof that maintain PALB2 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to PALB2).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the PALB2 peptide is substantially identical to the protein identified by the UniProt reference number Q86YC2 or a variant or homolog having substantial identity thereto. In some embodiments, the PALB2 peptide is substantially identical to the protein identified by the GenBank reference number NP_078951.2 or a variant or homolog having substantial identity thereto.
  • the terms “RAD51A,” “RAD51A protein,” and “RAD51A peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the DNA repair protein RAD51 homolog 1 (RAD51A) protein or variants or homologs thereof that maintain RAD51A protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to RAD51A).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the RAD51A peptide is substantially identical to the protein identified by the UniProt reference number Q06609 or a variant or homolog having substantial identity thereto. In some embodiments, the RAD51A peptide is substantially identical to the protein identified by the GenBank reference number NP_002866.2 or a variant or homolog having substantial identity thereto.
  • the terms “XRCC3,” “XRCC3 protein,” and “XRCC3 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the X-ray Repair Cross Complementing 3 (XRCC3) protein or variants or homologs thereof that maintain XRCC3 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to XRCC3).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the XRCC3 peptide is substantially identical to the protein identified by the UniProt reference number O43542 or a variant or homolog having substantial identity thereto. In some embodiments, the XRCC3 peptide is substantially identical to the protein identified by the GenBank reference number NP_005423.1 or a variant or homolog having substantial identity thereto.
  • the terms “RECQ5,” “RECQ5 protein,” and “RECQ5 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the ATP-dependent DNA Helicase Q5 (RECQ5) protein or variants or homologs thereof that maintain RECQ5 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to RECQ5).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the RECQ5 peptide is substantially identical to the protein identified by the UniProt reference number O94762 or a variant or homolog having substantial identity thereto. In some embodiments, the RECQ5 peptide is substantially identical to the protein identified by the GenBank reference number NP_005423.1 or a variant or homolog having substantial identity thereto.
  • the terms “FEN1,” “FEN1 protein,” and “FEN1 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Flap endonuclease 1 (FEN1) protein or variants or homologs thereof that maintain FEN1 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to FEN1).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the FEN1 peptide is substantially identical to the protein identified by the UniProt reference number P39748 or a variant or homolog having substantial identity thereto. In some embodiments, the FEN1 peptide is substantially identical to the protein identified by the GenBank reference number NP_004102.1 or a variant or homolog having substantial identity thereto.
  • the terms “FANCB,” “FANCB protein,” and “FANCB peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Fanconi anemia group B (FANCB) protein or variants or homologs thereof that maintain FANCB protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to FANCB).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the FANCB peptide is substantially identical to the protein identified by the UniProt reference number Q8NB91 or a variant or homolog having substantial identity thereto. In some embodiments, the FANCB peptide is substantially identical to the protein identified by the GenBank reference number NP_689846.1 or a variant or homolog having substantial identity thereto.
  • the terms “FANCF,” “FANCF protein,” and “FANCF peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Fanconi anemia group F (FANCF) protein or variants or homologs thereof that maintain FANCF protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to FANCF).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the FANCF peptide is substantially identical to the protein identified by the UniProt reference number Q9NPI8 or a variant or homolog having substantial identity thereto. In some embodiments, the FANCF peptide is substantially identical to the protein identified by the GenBank reference number NP_073562.1 or a variant or homolog having substantial identity thereto.
  • the terms “FANCG,” “FANCG protein,” and “FANCG peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Fanconi anemia group G (FANCG) protein or variants or homologs thereof that maintain FANCG protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to FANCG).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the FANCG peptide is substantially identical to the protein identified by the UniProt reference number O15287 or a variant or homolog having substantial identity thereto. In some embodiments, the FANCG peptide is substantially identical to the protein identified by the GenBank reference number NP_004620.1 or a variant or homolog having substantial identity thereto.
  • the terms “FANCM,” “FANCM protein,” and “FANCM peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the Fanconi anemia group M (FANCM) protein or variants or homologs thereof that maintain FANCM protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to FANCM).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the FANCM peptide is substantially identical to the protein identified by the UniProt reference number Q8IYD8 or a variant or homolog having substantial identity thereto. In some embodiments, the FANCM peptide is substantially identical to the protein identified by the GenBank reference number NP_065988.1 or a variant or homolog having substantial identity thereto.
  • the terms “MRE11A,” “MRE11A protein,” and “MRE11A peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the double-strand break repair protein (MRE11A) or variants or homologs thereof that maintain MRE11A protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to MRE11A).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the MRE11A peptide is substantially identical to the protein identified by the UniProt reference number P49959 or a variant or homolog having substantial identity thereto. In some embodiments, the MRE11A peptide is substantially identical to the protein identified by the GenBank reference number NP_005582.1 or a variant or homolog having substantial identity thereto.
  • the terms “USP1,” “USP1 protein,” and “USP1 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the ubiquitin specific peptidase 1 (USP1) or variants or homologs thereof that maintain USP1 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to USP1).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g.
  • the USP1 peptide is substantially identical to the protein identified by the UniProt reference number O94782 or a variant or homolog having substantial identity thereto. In some embodiments, the USP1 peptide is substantially identical to the protein identified by the GenBank reference number NP_003359.3 or a variant or homolog having substantial identity thereto.
  • the terms “RPA1,” “RPA1 protein,” and “RPA1 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the replication protein A1 (RPA1) protein or variants or homologs thereof that maintain RPA1 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to RPA1).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g. a 10, 20, 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring RPA1 polypeptide.
  • the RPA1 peptide is substantially identical to the protein identified by the UniProt reference number P27694 or a variant or homolog having substantial identity thereto. In some embodiments, the RPA1 peptide is substantially identical to the protein identified by the GenBank reference number NP_002936.1 or a variant or homolog having substantial identity thereto.
  • the terms “RPA2,” “RPA2 protein,” and “RPA2 peptide” as referred to herein include any of the recombinant or naturally-occurring forms of the replication protein A2 (RPA2) protein or variants or homologs thereof that maintain RPA2 protein activity (e.g. within at least 50%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or 100% activity compared to RPA2).
  • the variants or homologs have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g. a 10, 20, 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring RPA2 polypeptide.
  • the RPA2 peptide is substantially identical to the protein identified by the UniProt reference number P15927 or a variant or homolog having substantial identity thereto. In some embodiments, the RPA2 peptide is substantially identical to the protein identified by the GenBank reference number NP_002937.1 or a variant or homolog having substantial identity thereto. In some embodiments, the DNA repair domain is linked to the C-terminus of the PUF domain. In some embodiments, the DNA repair domain is linked to the N-terminus of the PUF domain.
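The identity thresholds used in the definitions above (for example, at least 90% amino acid sequence identity across a contiguous portion of 10 to 200 residues) can be illustrated with a short calculation. The following Python sketch is not part of the disclosure. It assumes the two sequences have already been aligned to equal length without gaps, and the function names and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_portion_identity(variant: str, reference: str, portion: int) -> float:
    """Highest percent identity over any contiguous portion of the given length."""
    if len(variant) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 0 < portion <= len(reference):
        raise ValueError("portion length must fit inside the sequences")
    return max(
        percent_identity(variant[i:i + portion], reference[i:i + portion])
        for i in range(len(reference) - portion + 1)
    )


# Toy 20-residue sequences (hypothetical, not taken from any accession).
reference = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTAYIAKQRQISFVKSHFA"  # differs only at the last position

whole = percent_identity(variant, reference)
portion = best_portion_identity(variant, reference, 10)
print(f"whole sequence: {whole:.1f}%, best 10-residue portion: {portion:.1f}%")
```

In practice, identity is usually computed from a gapped alignment produced by a dedicated alignment tool; the ungapped comparison here only shows how a whole-sequence threshold differs from a portion-based one.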
  • the DNA repair protein (e.g., conjugate) includes the sequence of SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, or SEQ ID NO: 80.
  • the DNA repair protein has the sequence of SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 68, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 76, SEQ ID NO: 78, or SEQ ID NO: 80.
  • the DNA repair protein as provided herein may further include a nuclear localization sequence (NLS).
  • the DNA repair protein (e.g., conjugate) includes in N-terminal to C-terminal direction: a DNA repair domain, a first peptide linker including a first, a second, a third and a fourth nuclear localization sequence, a PUFa domain, a second peptide linker including a fifth and a sixth nuclear localization sequence.
  • the DNA repair complex further includes a donor nucleic acid including a donor sequence.
  • the donor nucleic acid is single stranded or double-stranded.
  • the donor nucleic acid forms part of a circular DNA molecule (e.g., plasmid, vector).
  • the donor nucleic acid forms part of a linear DNA molecule (e.g., oligonucleotide).
  • the donor nucleic acid has the sequence of SEQ ID NO: 85.
  • DNA repair proteins comprising a programmable nuclease linked to a DNA repair domain selected from the group consisting of: RPA1; RPA2; FANCM; BRCA1; RAD54L; PALB2; XRCC3; FEN1; RECQ5; FANCB; USP1; FANCF; and FANCG.
  • the programmable nuclease comprises an RNA-guided nuclease.
  • the RNA-guided nuclease may be Cas9 nuclease or Cas9 nickase.
  • the programmable nuclease comprises a ZFN. In other embodiments, the programmable nuclease comprises a TALEN.
Zinc-Finger Nucleases
  • ZFN zinc-finger nuclease
  • ZFNs are endonucleases that can be programmed to cut specific sequences of DNA. ZFNs are composed of a zinc-finger DNA-binding domain and a nuclease domain.
  • the DNA-binding domains of individual ZFNs generally contain 3-6 individual zinc finger repeats that recognize 9-18 nucleotides. For example, if the zinc finger domain perfectly recognizes a 3 base pair sequence, then a 3 zinc finger array can be generated to recognize a 9 base pair target DNA sequence. Because individual zinc fingers recognize relatively short (e.g., 3 base pairs) target DNA sequences, ZFNs with 4, 5, or 6 zinc finger domains are typically used to minimize off-target DNA cutting.
  • Non-limiting examples of zinc finger DNA-binding domains that may be used with methods of the present disclosure include Zif268, Gal4, HIV nucleocapsid protein, MYST family histone acetyltransferases, myelin transcription factor Myt1, and suppressor of tumorigenicity protein 18 (ST18).
  • a ZFN may contain homogeneous DNA binding domains (all from the same source molecule) or a ZFN may contain heterogeneous DNA binding domains (at least one DNA binding domain is from a different source molecule).
  • Zinc finger DNA-binding domains work in concert with a nuclease domain to form zinc finger nucleases (ZFNs) that cut target sequences.
  • the nuclease cuts the DNA in a non-sequence specific manner after being recruited to a target sequence by the zinc finger DNA-binding domains.
  • the nuclease most widely used in ZFNs is the type II restriction enzyme FokI, which forms a heterodimer before producing a double-stranded break in the DNA.
  • ZFNs may be nickases that only cleave one strand of the double-stranded DNA. By cleaving only one strand, the DNA is more likely to be repaired by error-free HR as opposed to error-prone NHEJ (Ramirez, et al., Nucleic Acids Research, 40(7): 5560-5568).
  • nucleases that may be used with methods in this disclosure include FokI and DNaseI.
  • the ZFN in the ZFN-based gene editing system may be expressed as a fusion protein, with the DNA-binding domain and the nuclease domain expressed in the same polypeptide.
  • This fusion may include a linker of amino acids (e.g., 1, 2, 3, 4, 5, 6, or more) between the DNA-binding domain and the nuclease domain.
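The relationship between the number of zinc fingers and the length of the recognized site, described above, is simple arithmetic. The sketch below is illustrative only: it assumes each finger recognizes exactly 3 base pairs, as in the idealized example in the text, and the function names are hypothetical.

```python
def zfn_site_length(n_fingers: int, bp_per_finger: int = 3) -> int:
    """Length in base pairs of the site recognized by an array of zinc fingers."""
    if n_fingers < 1 or bp_per_finger < 1:
        raise ValueError("finger count and base pairs per finger must be positive")
    return n_fingers * bp_per_finger


def possible_sites(site_length: int) -> int:
    """Number of distinct DNA sequences of this length (4 bases per position)."""
    return 4 ** site_length


# Arrays of 3 to 6 fingers, matching the 9-18 nucleotide range in the text.
for n in range(3, 7):
    length = zfn_site_length(n)
    print(f"{n} fingers -> {length} bp site, one of {possible_sites(length):,} possible sequences")
```

The rapid growth in the number of possible sequences with each added finger is one way to see why longer arrays are expected to reduce off-target cutting.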
  • TALENs transcription activator-like effector nucleases
  • TALENs are composed of transcription activator-like effector (TALE) DNA-binding domains, which recognize single target nucleotides in the DNA, and transcription activator-like effector nucleases (TALENs) which cut the DNA at or near a target nucleotide.
  • TALE transcription activator-like effector
  • Transcription activator-like effectors found in bacteria are modular DNA binding domains that include central repeat domains made up of repetitive sequences of residues (Boch J. et al. Annual Review of Phytopathology 2010; 48: 419-36; Boch J. Nature Biotechnology 2011; 29(2): 135-136).
  • the central repeat domains, in some embodiments, contain between 1.5 and 33.5 repeat regions, and each repeat region may be made of 34 amino acids; amino acids 12 and 13 of the repeat region, in some embodiments, determine the nucleotide specificity of the TALE and are known as the repeat variable diresidue (RVD) (Moscou MJ et al.
  • RVD repeat variable diresidue
  • TALE-based sequence detectors can recognize single nucleotides. In some embodiments, combining multiple repeat regions produces sequence-specific synthetic TALEs (Cermak T et al. Nucleic Acids Research 2011; 39 (12): e82).
  • Non-limiting examples of TALEs that may be utilized in the present disclosure include IL2RG, AvrBs, dHax3, and thXoI.
  • a transcription activator-like effector nuclease cleaves the DNA non-specifically after being recruited to a target sequence by the TALE. This non-specific cleavage can lead to off-target DNA cleavage events.
  • the nuclease most widely used in TALENs is the type II restriction enzyme FokI, which forms a heterodimer to produce a double-stranded break in DNA.
  • FokI the type II restriction enzyme
  • two TALEN proteins must bind to opposite strands of DNA to create the FokI heterodimer and form a double-stranded break, reducing off-target DNA cleavage events (Christian M et al. Genetics 2010; 186: 757-761).
  • TALEN nucleases may be nickases, which cut only a single-strand of the DNA, thus promoting repair of the break by HR (Gabsalilow L. et al.
  • Non-limiting examples of TALENs that may be utilized in the present disclosure include FokI, RNase H, and MutH.
  • the TALEN in the TALEN-based gene editing system may be expressed as a fusion protein, with the DNA-binding domain and the nuclease domain expressed in the same polypeptide.
  • This fusion may include a linker of amino acids (e.g., 1, 2, 3, 4, 5, 6, or more) between the DNA-binding domain and the nuclease domain.
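The modular recognition described above, in which residues 12 and 13 of each 34-amino-acid repeat (the RVD) specify one nucleotide, can be sketched in a few lines of Python. The RVD-to-base table below is the commonly cited code and is not given in this text; NN is also reported to recognize adenine, so the mapping is a simplification. The function names and the placeholder repeat are hypothetical, and the terminal half repeat is not modeled.

```python
# Commonly cited RVD code (an assumption here, not stated in this document).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def rvd_of_repeat(repeat: str) -> str:
    """Return residues 12 and 13 (1-based) of a 34-amino-acid repeat."""
    if len(repeat) != 34:
        raise ValueError("a full repeat is expected to be 34 amino acids long")
    return repeat[11:13]


def tale_target(rvds: list[str]) -> str:
    """Predict the DNA target read by an ordered list of RVDs."""
    bases = []
    for rvd in rvds:
        if rvd not in RVD_TO_BASE:
            raise ValueError(f"unknown RVD: {rvd}")
        bases.append(RVD_TO_BASE[rvd])
    return "".join(bases)


def placeholder_repeat(rvd: str) -> str:
    """Build a dummy 34-residue repeat with the RVD at positions 12-13."""
    return "X" * 11 + rvd + "X" * 21


repeats = [placeholder_repeat(r) for r in ("NI", "HD", "NG", "NN")]
rvds = [rvd_of_repeat(r) for r in repeats]
print(rvds, "->", tale_target(rvds))
```

Designing a real TALE array would start from the desired target sequence and choose one repeat per nucleotide, which is the inverse of the lookup shown here.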
  • a donor nucleic acid is a nucleic acid that includes a sequence of interest.
  • a donor nucleic acid comprises a sequence that is partially complementary to a target nucleic acid sequence.
  • a donor nucleic acid comprises a sequence that is homologous to a target nucleic acid sequence.
  • a donor nucleic acid includes one or more homologous donor sequences.
  • a donor nucleic acid includes a first homologous donor sequence and a second homologous donor sequence, wherein the first and the second homologous donor sequences are connected through a non-homologous insert sequence.
  • a donor sequence is about 5 to about 2,500, about 5 to about 2000, about 5 to about 1500, about 5 to about 1000, about 5 to about 500, about 5 to about 250, about 5 to about 100, or about 5 to about 50 nucleotides in length. In some embodiments, a donor sequence is about 10 to about 2,500, about 10 to about 2000, about 10 to about 1500, about 10 to about 1000, about 10 to about 500, about 10 to about 250, about 10 to about 100, or about 10 to about 50 nucleotides in length.
  • a donor sequence is about 20 to about 2,500, about 20 to about 2000, about 20 to about 1500, about 20 to about 1000, about 20 to about 500, about 20 to about 250, about 20 to about 100, or about 20 to about 50 nucleotides in length. In some embodiments, a donor sequence is about 5, 10, 15, 20, or 25 nucleotides in length. In some embodiments, a donor sequence is about 50 nucleotides in length. In some embodiments, a donor sequence is about 100 nucleotides in length. In some embodiments, a donor sequence is about 150 nucleotides in length. In some embodiments, a donor sequence is about 200 nucleotides in length. In some embodiments, a donor sequence is about 500 nucleotides in length. In some embodiments, a donor sequence is about 1,000 nucleotides in length. In some embodiments, a donor sequence is about 2,000 nucleotides in length.
  • a first and a second homologous donor sequence are independently from about 5 to about 2,500 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 5, 10, 15, 20, or 25 nucleotides in length.
  • a first and second homologous donor sequence are independently about 50 nucleotides in length. In some embodiments, the first and second homologous donor sequence are independently about 100 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 150 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 200 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 500 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 1,000 nucleotides in length. In some embodiments, a first and second homologous donor sequence are independently about 2,000 nucleotides in length.
  • the first and second homologous donor sequence have the same or different nucleotide lengths. In some embodiments, the first homologous donor sequence is 36 nucleotides in length. In some embodiments, the second homologous donor sequence is 91 nucleotides in length. In some embodiments, the first homologous donor sequence is 36 nucleotides in length and the second homologous donor sequence is 91 nucleotides in length.
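The donor layout described above, a first homologous donor sequence and a second homologous donor sequence joined by a non-homologous insert, can be modeled as a simple concatenation. The sketch below is illustrative only: the arm contents and the insert are made-up placeholders, only the 36 and 91 nucleotide arm lengths come from the text, and the function name is hypothetical.

```python
VALID_BASES = set("ACGT")
MIN_ARM, MAX_ARM = 5, 2500  # arm length range stated in the text (approximate bounds)


def build_donor(first_arm: str, insert: str, second_arm: str) -> str:
    """Join two homologous donor sequences through a non-homologous insert."""
    for name, seq in (("first arm", first_arm), ("insert", insert), ("second arm", second_arm)):
        if not seq or set(seq) - VALID_BASES:
            raise ValueError(f"{name} must be a non-empty string of A, C, G, T")
    for name, arm in (("first arm", first_arm), ("second arm", second_arm)):
        if not MIN_ARM <= len(arm) <= MAX_ARM:
            raise ValueError(f"{name} length {len(arm)} is outside {MIN_ARM}-{MAX_ARM} nt")
    return first_arm + insert + second_arm


# Placeholder sequences: real arms would match the flanks of the target site.
first_arm = "A" * 36    # 36 nt, as in one described embodiment
second_arm = "C" * 91   # 91 nt, as in one described embodiment
insert = "GATTACA"      # hypothetical non-homologous insert

donor = build_donor(first_arm, insert, second_arm)
print(len(donor), "nt donor; insert spans positions", len(first_arm) + 1, "to", len(first_arm) + len(insert))
```

Keeping the arms and the insert as separate inputs makes it easy to vary arm lengths independently, which the text allows since the two arms may have the same or different lengths.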
  • a cell comprising a DNA repair complex as provided herein including embodiments thereof is provided.
  • the cell is a mammalian cell.
  • Exemplary cell types contemplated as described herein include a bacterial cell; an archaeal cell; a single-celled eukaryotic organism; a plant cell; an algal cell, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens, C.
  • a fungal cell; an animal cell; a cell from an invertebrate animal (e.g., an insect, a cnidarian, an echinoderm, a nematode, etc.); a eukaryotic parasite (e.g., a malarial parasite, e.g., Plasmodium falciparum; a helminth; etc.); a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal); a mammalian cell, e.g., a rodent cell, a human cell, a non-human primate cell.
  • Suitable host cells include naturally occurring cells; genetically modified cells (e.g., cells genetically modified in a laboratory, e.g., by the “hand of man”); and cells manipulated in vitro in any way.
  • a host cell is isolated or cultured.
  • a stem cell, e.g. an embryonic stem (ES) cell, an induced pluripotent stem (iPS) cell, a germ cell; a somatic cell, e.g. a fibroblast, a hematopoietic cell, a neuron, a muscle cell, a bone cell, a hepatocyte, a pancreatic cell; an in vitro or in vivo embryonic cell of an embryo at any stage, e.g., a 1-cell, 2-cell, 4-cell, 8-cell, etc. stage zebrafish embryo; etc.
  • Cells may be from established cell lines or they may be primary cells, where “primary cells,” “primary cell lines,” and “primary cultures” are used interchangeably herein to refer to cells and cell cultures that have been derived from a subject and allowed to grow in vitro for a limited number of passages, i.e. splittings, of the culture.
  • primary cultures include cultures that may have been passaged 0 times, 1 time, 2 times, 4 times, 5 times, 10 times, or 15 times, but not enough times to go through the crisis stage.
  • Primary cell lines can be maintained for fewer than 10 passages in vitro.
  • Target cells are, in many embodiments, unicellular organisms or are grown in culture.
  • the cells may be harvested from an individual by any convenient method.
  • leukocytes may be conveniently harvested by apheresis, leukocytapheresis, density gradient separation, etc., while cells from tissues such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc. are most conveniently harvested by biopsy.
  • An appropriate solution may be used for dispersion or suspension of the harvested cells.
  • Such solution will generally be a balanced salt solution, e.g.
  • fetal calf serum or other naturally occurring factors, in conjunction with an acceptable buffer at low concentration, e.g., from 5-25 mM.
  • Convenient buffers include HEPES, phosphate buffers, lactate buffers, etc.
  • the cells may be used immediately, or they may be stored, frozen, for long periods of time, being thawed and capable of being reused. In such cases, the cells will usually be frozen in 10% dimethyl sulfoxide (DMSO), 50% serum, 40% buffered medium, or other solutions commonly used in the art to preserve cells at such freezing temperatures, and thawed in any suitable manner.
  • DMSO dimethyl sulfoxide
  • the cell is a cancer cell.
  • Another aspect of the present disclosure provides a host cell including any one of the subject vector, nucleic acid, and complex.
  • the RNA-guided DNA endonuclease is encoded by a first nucleic acid.
  • the nucleic acid (i.e., guide RNA) is encoded by a second nucleic acid.
  • the DNA repair protein (e.g., conjugate) is encoded by a third nucleic acid.
  • the donor sequence is encoded by a fourth nucleic acid.
  • the expression of the RNA-guided DNA endonuclease, the nucleic acid (i.e., guide RNA), the DNA repair protein, or the donor sequence can be under the control of a constitutive promoter or an inducible promoter.
  • the cell includes the first nucleic acid, second nucleic acid, the third nucleic acid or the fourth nucleic acid.
  • the first nucleic acid is contained within a first vector. In some embodiments, the second nucleic acid is contained within a second vector.
  • the third nucleic acid is contained within a third vector.
  • the fourth nucleic acid is contained within a fourth vector.
  • either the first, second, third or fourth vector is the same.
  • the first, second, third or fourth vector is a transfection vector.
  • the first, second, third or fourth vector is a viral vector.
  • the cell includes the first, second, third or fourth vector. In some embodiments, the cell includes the first, second, third and fourth vector.
  • sequences that can be encoded by different vectors may be on the same vector.
  • the second vector may be the same as the first vector.
  • the third vector may be the same as the first vector or the second vector.
  • the host cell may be in a live animal, or may be a cultured cell.
  • an RNA-guided DNA endonuclease (e.g., Cas9) is encoded by a first nucleic acid.
  • the nucleic acid (i.e., guide RNA) is encoded by a second nucleic acid.
  • the DNA repair protein (e.g., conjugate) is encoded by a third nucleic acid.
  • the donor sequence is encoded by a fourth nucleic acid.
  • the first nucleic acid is contained within a first vector.
  • the second nucleic acid (i.e., guide RNA) is contained within a second vector.
  • the third nucleic acid is contained within a third vector.
  • the fourth nucleic acid is contained within a fourth vector. In some embodiments, either the first, second, third or fourth vector is the same. In some embodiments, the first, second, third or fourth vector is a transfection vector. In some embodiments, the first, second, third or fourth vector is a viral vector.
  • a nucleic acid, a nucleic acid comprising a nucleotide sequence encoding same, or a nucleic acid comprising a nucleotide sequence encoding the subject RNA-guided DNA endonuclease (Cas9 protein), nucleic acid (i.e., guide RNA) or DNA repair protein (e.g., conjugate), can be introduced into a host cell by any of a variety of methods. Any method can be used to introduce a nucleic acid (e.g., vector or expression construct) into a stem cell or progenitor cell.
  • Examples of these methods include viral or bacteriophage infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al., Adv. Drug Deliv. Rev., pii: S0169-409X(12)00283-9, doi: 10.1016/j.addr.2012.09.023), and the like.
  • PEI polyethyleneimine
  • a method involves introducing into a host cell (or a population of host cells) one or more nucleic acids (e.g., vectors) comprising nucleotide sequences encoding a subject nucleic acid and/or an RNA-guided DNA endonuclease (Cas9 protein) and/or a DNA repair protein (e.g., conjugate).
  • a host cell including a target DNA is in vitro.
  • a host cell including a target DNA is in vivo.
  • Suitable nucleic acids including nucleotide sequences encoding a subject nucleic acid and/or an RNA-guided DNA endonuclease (Cas9 protein) and/or a DNA repair protein include expression vectors, where the expression vectors may be recombinant expression vectors.
  • the recombinant expression vector is a viral construct, e.g., a recombinant adeno-associated virus construct (see, e.g., U.S. Patent No. 7,078,387), a recombinant adenoviral construct, a recombinant lentiviral construct, a recombinant retroviral construct, etc.
  • Suitable expression vectors include, but are not limited to, viral vectors (e.g. viral vectors based on vaccinia virus; poliovirus; adenovirus (see, e.g., Li et al., Invest Ophthalmol. Vis. Sci., 35:2543-2549, 1994; Borras et al., Gene Ther., 6:515-524, 1999; Li and Davidson, Proc. Natl. Acad. Sci. USA, 92:7700-7704, 1995; Sakamoto et al., Hum.
  • a retroviral vector (e.g., Murine Leukemia Virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, a lentivirus, HIV, myeloproliferative sarcoma virus, and mammary tumor virus); and the like.
  • Suitable expression vectors may be used, and many are commercially available.
  • the following vectors are provided by way of example; for eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia).
  • any other vector may be used so long as it is compatible with the host cell.
  • DNA repair domains (e.g., BRCA1, FEN1 enzymes or functional fragments thereof)
  • Delivery of a combination of DNA repair domains to a cell allows for an increase of HDR activity at a targeted gene locus.
  • the present disclosure further provides for the delivery of a plurality of DNA repair domains, wherein the domains may be the same or different.
  • the domains may form part of a plurality of DNA repair proteins (e.g., conjugates), each linked to a PUF domain, and/or they may be directly fused to the RNA-guided DNA endonuclease enzyme (e.g., Cas9).
  • the present disclosure allows for the delivery of DNA repair domains to different target sites in a cell at the same time. Applicants were the first to show that targeted delivery of a DNA repair domain to a strand break (for example, one introduced by Cas9) increases HDR activity at the break relative to the absence of the DNA repair domain.
  • a method of increasing homology directed repair (HDR) in a mammalian cell includes: (a) providing a mammalian cell containing a target nucleic acid requiring homology directed repair; (b) delivering to the mammalian cell a first nucleic acid encoding an RNA-guided DNA endonuclease; (c) delivering to the mammalian cell a second nucleic acid including: (i) a DNA-targeting sequence that is complementary to a target nucleic acid sequence; (ii) a binding sequence for the RNA-guided DNA endonuclease enzyme; and (iii) one or more PUF binding site (PBS) sequences, wherein the RNA-guided DNA endonuclease enzyme is capable of binding to the second nucleic acid via the binding sequence;
  • the DNA repair protein (e.g., conjugate) is bound to the second nucleic acid via binding of the PUF domain to the one or more PBS sequences.
  • the first nucleic acid is contained within a first vector.
  • the third nucleic acid is contained within a third vector.
  • the fourth nucleic acid is contained within a fourth vector.
  • the first, second, third or fourth vector are the same.
  • the delivering is performed by transfection.
  • the delivered DNA repair protein (e.g., conjugate) is capable of decreasing non-homologous end joining (NHEJ) at a target nucleic acid sequence in the cell relative to a standard control.
  • a method of decreasing non-homologous end joining (NHEJ) in a mammalian cell includes: (a) providing a mammalian cell containing a target nucleic acid requiring NHEJ; (b) delivering to the mammalian cell a first nucleic acid encoding an RNA-guided DNA endonuclease; (c) delivering to the mammalian cell a second nucleic acid including: (i) a DNA-targeting sequence that is complementary to a target nucleic acid sequence; (ii) a binding sequence for the RNA-guided DNA endonuclease; and (iii) one or more PUF binding site (PBS) sequences, wherein the RNA-guided DNA endonuclease is capable of binding to the second nucleic acid via the binding sequence; (d) delivering to the mammalian cell a third nucleic acid encoding a DNA repair protein (e.g., conjugate
  • the DNA repair protein is bound to the second nucleic acid via binding of the PUF domain to the one or more PBS sequences.
  • the first nucleic acid is contained within a first vector.
  • the second nucleic acid is contained within a second vector.
  • the third nucleic acid is contained within a third vector.
  • the fourth nucleic acid is contained within a fourth vector.
  • the first, second, third or fourth vector are the same.
  • the delivering is performed by transfection.
  • the delivered DNA repair protein is capable of increasing HDR at a target nucleic acid sequence in the cell relative to a standard control.
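The three-part layout of the second nucleic acid recited in the methods above (a DNA-targeting sequence, a binding sequence for the RNA-guided DNA endonuclease, and one or more PBS sequences) can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class name `GuideConstruct`, the concatenation order, and the placeholder strings are hypothetical and are not sequences or designs taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuideConstruct:
    """Hypothetical model of the 'second nucleic acid' (guide RNA)."""
    targeting: str       # (i) DNA-targeting sequence
    binding: str         # (ii) binding sequence for the endonuclease
    pbs: str             # a single PUF binding site (PBS) sequence
    pbs_copies: int = 1  # (iii) "one or more" PBS sequences

    def __post_init__(self):
        # The methods require at least one PBS sequence.
        if self.pbs_copies < 1:
            raise ValueError("at least one PBS sequence is required")

    def sequence(self) -> str:
        # Assumed order for illustration: targeting, binding, then PBS repeats.
        return self.targeting + self.binding + self.pbs * self.pbs_copies


# Placeholder letters with arbitrary lengths, NOT real sequences.
guide = GuideConstruct(targeting="T" * 20, binding="B" * 10,
                       pbs="P" * 8, pbs_copies=5)
print(len(guide.sequence()))
```

Modeling the PBS count as a separate field mirrors the "one or more PBS sequences" language: the same targeting and binding parts can be reused while the number of recruitment sites is varied.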
  • in another aspect, a kit includes: (i) a ribonucleoprotein complex as provided herein, including embodiments thereof, or a nucleic acid encoding the same; and (ii) a DNA repair protein conjugate as provided herein, including embodiments thereof, or a nucleic acid encoding the same.
  • the kit includes an RNA-guided DNA endonuclease and a DNA repair protein (e.g., conjugate).
  • the RNA-guided DNA endonuclease and the DNA repair protein may be any of the RNA-guided DNA endonucleases and DNA repair proteins provided herein including embodiments thereof (e.g., a Cas9 nickase and a DNA repair protein including a PUF domain linked to a ligase domain).
  • the kit includes a nucleic acid (e.g., a first nucleic acid) encoding the RNA-guided DNA endonuclease and a nucleic acid (e.g., a third nucleic acid) encoding the DNA repair protein.
  • the kit includes an RNA-guided DNA endonuclease, a DNA repair protein, a nucleic acid (i.e., guide RNA) and a donor sequence.
  • the RNA-guided DNA endonuclease, the DNA repair protein, the nucleic acid (i.e., guide RNA) and the donor sequence may be any of the RNA-guided DNA endonucleases, DNA repair proteins, nucleic acids (i.e., guide RNAs) and donor sequences provided herein including embodiments thereof.
  • the kit includes a nucleic acid (e.g., a first nucleic acid) encoding the RNA-guided DNA endonuclease, a nucleic acid (e.g., a third nucleic acid) encoding the DNA repair protein, a nucleic acid (e.g., a second nucleic acid) encoding the nucleic acid (i.e., guide RNA) and a nucleic acid (e.g., a fourth nucleic acid) encoding the donor sequence.
  • the kit includes a first nucleic acid encoding the RNA-guided DNA endonuclease, a second nucleic acid encoding the nucleic acid (i.e., guide RNA), a third nucleic acid encoding the DNA repair protein or a fourth nucleic acid encoding the donor sequence.
  • the kit includes a first nucleic acid encoding the RNA-guided DNA endonuclease, a second nucleic acid encoding the nucleic acid (i.e., guide RNA), a third nucleic acid encoding the DNA repair protein and a fourth nucleic acid encoding the donor sequence.
  • the first, second, third or fourth nucleic acid independently forms part of a vector.
  • the first, second, third or fourth vector is a transfection vector.
  • the kits provided herein, including embodiments thereof, may include nucleic acids (DNA or RNA) encoding the individual components (i.e., the RNA-guided DNA endonuclease, the nucleic acid (i.e., guide RNA), the DNA repair protein and the donor sequence); they may include the RNA-guided DNA endonuclease and/or the DNA repair protein as proteins; or they may include any combination thereof.
  • the nucleic acid is a guide RNA.
  • the kit further includes a transfection agent.
  • the kit further includes a sample collection device for collecting a sample from a patient.
  • a subject kit may include: a) a nucleic acid of the present disclosure, or a nucleic acid (e.g., vector) including a nucleotide sequence encoding the same; optionally, b) a subject Cas9 protein (e.g., Cas9 endonuclease or Cas9 nickase), or a vector encoding the same (including an expressible mRNA encoding the same); and optionally, c) one or more subject DNA repair proteins each including a PUF domain linked to a DNA repair domain that may be the same or different among the different DNA repair proteins, or a vector encoding the same (including an expressible mRNA encoding the same).
  • one or more of a) - c) may be encoded by the same vector.
  • the kit also comprises one or more buffers or reagents that facilitate the introduction of any one of a) - c) into a host cell, such as reagents for
  • a subject kit can further include one or more additional reagents, where such additional reagents can be selected from: a buffer; a wash buffer; a control reagent; a control expression vector or RNA nucleic acid; a reagent for in vitro production of the Cas9 endonuclease or Cas9 nickase from DNA; and the like.
  • Components of a subject kit can be in separate containers; or can be combined in a single container.
  • a subject kit can further include instructions for using the components of the kit to practice the subject methods.
  • the instructions for practicing the subject methods are generally recorded on a suitable recording medium.
  • the instructions may be printed on a substrate, such as paper or plastic, etc.
  • the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or sub-packaging) etc.
  • the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, flash drive, etc.
  • the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided.
  • An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
  • a DNA repair complex comprising:
  • RNA-guided DNA endonuclease is bound to the polynucleotide via the binding sequence
  • the DNA repair protein conjugate binds to the ribonucleoprotein complex via the PUF domain binding to the one or more PBS sequences to form a DNA repair complex, and wherein when the RNA-guided DNA endonuclease introduces a strand break at the target polynucleotide sequence present in a genome, the DNA repair protein conjugate repairs the strand break favoring homology-directed repair (HDR).
  • RNA-guided DNA endonuclease is a Cas9 nuclease.
  • RNA-guided DNA endonuclease is a Cas9 nickase.
  • DNA repair domain is an endonuclease domain, a helicase domain or a ligase domain.
  • RNA-guided DNA endonuclease is encoded by a first polynucleotide.
  • a cell comprising a DNA repair complex of one of embodiments 1-27.
  • a method of increasing homology directed repair (HDR) in a mammalian cell comprising:
  • RNA-guided DNA endonuclease enzyme is capable of binding to the second polynucleotide via the binding sequence; (d) delivering to the mammalian cell a third polynucleotide comprising a DNA repair protein conjugate comprising:
  • the delivered DNA repair protein conjugate increases homology directed repair at the target nucleic acid sequence in the cell relative to a control.
  • a method of decreasing non-homologous end joining (NHEJ) in a mammalian cell comprising:
  • RNA-guided DNA endonuclease is capable of binding to the second polynucleotide via the binding sequence
  • the delivered DNA repair protein conjugate decreases non-homologous end joining (NHEJ) at the target nucleic acid sequence in the cell relative to a control.
  • a kit comprising:
  • kit of embodiment 48 or 49 further comprising a sample collection device for collecting a sample from a patient.
  • Example 1 Local recruitment of DNA repair proteins to enhance precise genome editing. DNA breaks are repaired through competing pathways containing overlapping and yet distinct protein components. Genetic studies in model organisms and human cells show that different templates require different pathways and protein factors. Drugs that perturb repair pathways have been applied to enhance genome editing with limited success, but these agents may also induce unwanted genomic instability. We took a more direct approach, which was to recruit DNA repair proteins locally to target sites. We developed a hybrid system based on CRISPR/Cas9 and the programmable Pumilio RNA-binding protein, termed "Casilio," to recruit effector proteins at genomic targets.
  • Example 2 Reporter cell lines for genome editing outcomes.
  • HDR/NHEJ reporter HEK293T cell line FIG. 1A
  • FACS (fluorescence-activated cell sorting)
  • Editing experiments were done by co-delivering Cas9 and a sgRNA targeting BFP with a repair template containing a H66Y mutation (single-stranded oligonucleotides, ssODN; or plasmid donor) that changes BFP to GFP.
  • This reporter system conveniently reports on the fraction of cells that have undergone HDR (GFP-positive population), NHEJ (GFP-, BFP- double negative) and no modification (BFP-positive) (FIG. 1B).
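The readout described for this reporter can be sketched in Python. This is a minimal illustration, assuming each cell has already been scored as GFP-positive or not and BFP-positive or not (for example, after FACS gating). The function names `classify_cell` and `outcome_fractions` and the sample counts are hypothetical and do not come from the disclosure; treating any GFP-positive cell as HDR regardless of its BFP signal is also an assumption.

```python
from collections import Counter

OUTCOMES = ("HDR", "NHEJ", "unmodified")


def classify_cell(gfp_positive: bool, bfp_positive: bool) -> str:
    """Map a cell's fluorescence calls to an editing outcome."""
    if gfp_positive:
        return "HDR"         # BFP converted to GFP by the repair template
    if bfp_positive:
        return "unmodified"  # BFP still intact
    return "NHEJ"            # GFP-, BFP- double negative


def outcome_fractions(cells):
    """Return the fraction of cells in each outcome class."""
    counts = Counter(classify_cell(gfp, bfp) for gfp, bfp in cells)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells to classify")
    return {name: counts.get(name, 0) / total for name in OUTCOMES}


# Hypothetical population of ten cells as (gfp_positive, bfp_positive) pairs.
cells = [(True, False)] * 2 + [(False, False)] * 3 + [(False, True)] * 5
print(outcome_fractions(cells))
```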
  • Example 3 Recruitment of BRCA1 to site of Cas9-mediated double-stranded break enhances HDR.
  • To test whether we can locally recruit BRCA1 to enhance HDR by direct tethering to Cas9 or through recruitment by Casilio, we fused BRCA1 to Cas9 or PUFa and tested the complexes' abilities to enhance HDR in the context of the BFP->GFP reporter system. Tethering of BRCA1 by direct fusion to the Cas9 N-terminus or C-terminus resulted in small decreases in HDR and larger decreases in NHEJ. The total decrease in editing efficiency (HDR% + NHEJ%) may be due to steric disadvantages imposed on Cas9 by the large BRCA1 protein (FIG. 3).
  • Example 4 Recruitment of RAD54L to site of Cas9-mediated double-stranded break enhances HDR.
  • Such recruitment stimulated HDR by 1.31-fold with similar total editing efficiency (FIG. 6), demonstrating that recruitment of RAD54L by Casilio enhances HDR at the Cas9 cut site without compromising total editing efficiency.
  • Example 5 Recruitment of CtIP(T847E)-PALB2(KR)-BRCA1 to site of Cas9-mediated double-stranded break enhances HDR.
  • The complex formation capabilities of Casilio allow not only the recruitment of individual proteins, but also the assembly of multiprotein complexes (either multiple molecules of a particular protein or combinations of different proteins) at the target site. We thus tested whether we can recruit multiple DNA repair proteins to the site of a Cas9-mediated double-stranded break by the Casilio approach (FIG. 7).
  • Cas9Nickase would enhance HDR while keeping minimal NHEJ.
  • Cas9 Nickase, BFP->GFP ssODN, sgBFP-5xPBSa and RAD51A-PUFa were delivered into HEK293T/BFP reporter cells, and a 3.55-fold stimulation of HDR was observed (FIG. 9).
  • Example 7 Recruitment of CtIP(T847E)-PALB2(KR)-BRCA1 at site of DNA nick mediated by Cas9Nickase (Cas9 D10A nickase) enhances HDR.
  • Example 8 Recruitment of XRCC3, RECQ5 or FEN1 to site of Cas9-mediated double-stranded break enhances HDR.
  • XRCC3, RECQ5, or FEN1 to Cas9-DSB by fusing each of them N- or C-terminally to PUFa to allow local recruitment via Casilio-sgRNA scaffold.
  • Recruitment of XRCC3 stimulated HDR by twofold (XRCC3-PUFa) or 1.84-fold (PUFa-XRCC3) (FIG. 11).
  • Recruitment of RECQ5 stimulated HDR by 1.85-fold (RECQ5-PUFa) or 1.58-fold (PUFa-RECQ5) (FIG. 12).
  • Recruitment of FEN1 stimulated HDR by 1.97-fold (FEN1-PUFa) or 1.86-fold (PUFa-FEN1) (FIG. 13).
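The fold-stimulation figures in these examples, and the total editing efficiency (HDR% + NHEJ%) used in Example 3, can be related with a short Python sketch. It assumes that fold stimulation is the HDR percentage with the recruited repair protein divided by the HDR percentage of a matched control without it; that baseline is an assumption, as are the function names and the input percentages, which are hypothetical and are not data from the disclosure.

```python
def total_editing(hdr_pct: float, nhej_pct: float) -> float:
    """Total editing efficiency, expressed as HDR% + NHEJ%."""
    return hdr_pct + nhej_pct


def fold_stimulation(hdr_pct_recruited: float, hdr_pct_control: float) -> float:
    """HDR with recruitment relative to an assumed matched control."""
    if hdr_pct_control <= 0:
        raise ValueError("control HDR% must be positive")
    return hdr_pct_recruited / hdr_pct_control


# Hypothetical percentages chosen only to illustrate the arithmetic.
control = {"hdr": 10.0, "nhej": 30.0}
recruited = {"hdr": 20.0, "nhej": 20.0}

print(fold_stimulation(recruited["hdr"], control["hdr"]))
print(total_editing(control["hdr"], control["nhej"]))
print(total_editing(recruited["hdr"], recruited["nhej"]))
```

Reporting both numbers together separates a genuine shift toward HDR from a change in overall cutting: in the hypothetical inputs above, HDR doubles while total editing stays the same.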
  • Example 9 Recruitment of Fanconi Anemia (FA) pathway proteins to site of Cas9-mediated double-stranded break enhances HDR.
  • FANCB, FANCF, FANCG, FANCM proteins of the Fanconi Anemia pathway
  • FANCF-PUFa 2.35-fold
  • PANCG-PUFa 2.36-fold
  • FANCG-PUFa 2.07-fold
  • PUFa-FANCG 2.15-fold
  • FANCM-PUFa 2.12-fold
  • PUFa-FANCM 1.79-fold
  • Example 10 More examples of factors that enhance HDR when recruited to site of Cas9Nickase (Cas9 D10A nickase)-mediated DNA nick. We also recruited more factors to Cas9Nickase (Cas9 D10A nickase)-mediated nick via Casilio-sgRNA scaffold (FIG. 15).

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The DNA repair complexes provided herein are, inter alia, useful for editing genomic sequences by introducing precise changes at a target site in the presence of a donor sequence. The RNA-guided DNA endonuclease provided herein, including embodiments thereof (e.g., Cas9 nuclease or Cas9 nickase), is able to introduce a strand break (double- or single-strand break) at a target site in the genome of a cell (e.g., a gene or a transcriptional regulatory sequence), and the break is subsequently repaired mostly through the HDR mechanism. By considerably increasing HDR efficiency and, in some cases, decreasing NHEJ at a target site, the compositions and methods provided herein address the long-felt need for highly precise targeted genome editing.
PCT/US2019/047021 2018-08-21 2019-08-19 Methods and compositions for recruiting DNA repair proteins Ceased WO2020041172A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862720847P 2018-08-21 2018-08-21
US62/720,847 2018-08-21

Publications (1)

Publication Number Publication Date
WO2020041172A1 (fr) 2020-02-27

Family

ID=69591286

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/047021 Ceased WO2020041172A1 (fr) 2018-08-21 2019-08-19 Methods and compositions for recruiting DNA repair proteins

Country Status (1)

Country Link
WO (1) WO2020041172A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017142923A1 (fr) * 2016-02-16 2017-08-24 Emendobio Inc. Compositions and methods for promoting homology directed repair mediated gene editing
US20180094257A1 (en) * 2015-03-13 2018-04-05 The Jackson Laboratory Three-component crispr/cas complex system and uses thereof
US20180230494A1 (en) * 2014-10-01 2018-08-16 The General Hospital Corporation Methods for increasing efficiency of nuclease-induced homology-directed repair
WO2018162702A1 (fr) * 2017-03-10 2018-09-13 Institut National De La Sante Et De La Recherche Medicale (Inserm) Nuclease fusions for enhancing genome editing by homology-directed transgene integration

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BOTHMER ET AL.: "Characterization of the interplay between DNA repair and CRISPR/Cas9-induced DNA lesions at an endogenous locus", NAT COMMUN., vol. 8, no. 13905, 9 January 2017 (2017-01-09), pages 1 - 12, XP055687275 *
CHENG ET AL.: "Casilio: a versatile CRISPR-Cas9-Pumilio hybrid for gene regulation and genomic labeling", CELL RES., vol. 26, no. 2, February 2016 (2016-02-01), pages 254 - 7, XP055278824, [retrieved on 20160115], DOI: 10.1038/cr.2016.3 *
PAULSEN ET AL.: "Ectopic expression of RAD52 and dn53BP1 improves homology-directed repair during CRISPR-Cas9 genome editing", NAT BIOMED ENG., vol. 1, no. 11, November 2017 (2017-11-01), pages 878 - 888, XP036428862, [retrieved on 20171009], DOI: 10.1038/s41551-017-0145-2 *
REES ET AL.: "Development of hRad51-Cas9 nickase fusions that mediate HDR without double-stranded breaks", NAT COMMUN., vol. 10, no. 1 :2212, 17 May 2019 (2019-05-17), pages 1 - 12, XP055687277 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11041172B2 (en) 2019-06-25 2021-06-22 Inari Agriculture, Inc. Homology dependent repair genome editing
EP4118203A4 (fr) * 2020-03-11 2024-03-27 The Broad Institute, Inc. Novel Cas enzymes and methods of profiling specificity and activity
WO2022034374A3 (fr) * 2020-08-11 2022-04-21 University Of Oslo Improved gene editing
US20230304047A1 (en) * 2020-08-11 2023-09-28 University Of Oslo Improved gene editing
WO2023114992A1 (fr) * 2021-12-17 2023-06-22 Massachusetts Institute Of Technology Programmable insertion approaches via reverse transcriptase recruitment
CN114231568A (zh) * 2021-12-20 2022-03-25 安可来(重庆)生物医药科技有限公司 Auxiliary protein for improving DNA repair efficiency, and gene editing vector and application thereof

Similar Documents

Publication Publication Date Title
US20230203540A1 (en) Methods and compositions for nuclease-mediated targeted integration of transgenes into mammalian liver cells
CN113151215B (zh) Engineered Cas12i nuclease, effector protein thereof, and uses thereof
US9757420B2 (en) Gene editing for HIV gene therapy
WO2020041172A1 (fr) Methods and compositions for recruiting DNA repair proteins
US12203110B2 (en) RNA-programmable endonuclease systems and uses thereof
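JP7830129B2 (ja) Targeted gene editing constructs and methods of using the same
JP7085716B2 (ja) RNA-guided gene editing and gene regulation
CN109844123A (zh) Artificially manipulated angiogenesis regulating system
KR20230005865A (ko) Transposition-based therapies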
US20240425830A1 (en) Engineered cas12i nuclease, effector protein and use thereof
KR20230123492A (ko) Programmable transposases and uses thereof
WO2022040909A1 (fr) Split Cas12 systems and methods of use thereof
JP2025081589A (ja) Compositions and methods for improved gene editing
WO2020074729A1 (fr) Selection by means of artificial transactivators
WO2019089623A1 (fr) Fusion proteins for use in improving gene correction via homologous recombination
WO2023138617A1 (fr) Engineered CasX nuclease, effector protein and use thereof
KR20180021135A (ko) Humanized heart muscle
US20230167431A1 (en) Tagged gene editing technology for clinical cell sorting and enrichment
RU2832109C2 (ru) Constructs for targeted gene editing and methods of using the same
Simone Expanding Targeting and Manipulation of the Human Genome towards Regenerative Medicine Applications
JP2026511058A (ja) Methods for producing regulatory T cells (Treg) using genome engineering
HK40109303A (en) Engineered cas12i nuclease, effector protein and use thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19852636

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19852636

Country of ref document: EP

Kind code of ref document: A1