WO2024215712A2 - Methods for identifying epigenetic factors influencing gene editing and for modulating gene editing outcomes by epigenetic modulation - Google Patents

Methods for identifying epigenetic factors influencing gene editing and for modulating gene editing outcomes by epigenetic modulation

Info

Publication number
WO2024215712A2
Authority
WO
WIPO (PCT)
Prior art keywords
editing
prime
cell
gene
cells
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2024/023808
Other languages
English (en)
Other versions
WO2024215712A3 (fr)
Inventor
Xiaoyi Li
Jay Ashok Shendure
Wei Chen
Junhong CHOI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Washington
Original Assignee
University of Washington
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Washington
Publication of WO2024215712A2
Publication of WO2024215712A3
Anticipated expiration
Legal status: Ceased (current)

Links

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/711 Natural deoxyribonucleic acids, i.e. containing only 2'-deoxyriboses attached to adenine, guanine, cytosine or thymine and having 3'-5' phosphodiester links
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/14 Type of nucleic acid interfering nucleic acids [NA]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]

Definitions

  • the XML file is 1,292,777 bytes; was created on March 28, 2024; and is being submitted electronically via Patent Center with the filing of the specification.
  • STATEMENT OF GOVERNMENT LICENSE RIGHTS [0003] This invention was made with government support under Grant Nos. R01HG010632, UM1HG011966, and F32HG011817, awarded by the National Institutes of Health (NIH). The government has certain rights in the invention.
  • BACKGROUND [0004] The development of prime editing has shifted the paradigm of genome manipulation from less controllable site-directed mutagenesis to more precise genome reprogramming. However, the efficiency of prime editing is still suboptimal and varies across genomic contexts, which limits its application in both basic and clinical research.
  • the low efficiency can be attributed to inefficient generation of a single-strand DNA flap containing the edit and efficient removal of the intermediate by the mismatch repair (MMR) pathway.
  • MMR mismatch repair
  • strategies such as improving the robustness of the prime editing core components and transiently dampening mismatch repair (MMR) response in cells have been developed and have shown promising enhancing effects on prime editing outcome.
  • the reporter system comprises a vector comprising a promoter for in vitro transcription.
  • the vector is selected from a retroviral-based vector, a lentiviral-based vector, a transposon-based vector, and a serine integrase-based vector.
  • the transposon-based vector is a piggyBac-based vector.
  • the vector is devoid of any cis-regulatory elements.
  • the promoter is selected from a bacteriophage T7 promoter, a T3 promoter, a Sp6 promoter and variants thereof.
  • the reporter system disclosed herein further comprises a polynucleotide sequence linked to the promoter.
  • the polynucleotide sequence comprises an amplification primer binding domain/site.
  • the polynucleotide sequence further comprises a nucleic acid sequence for a target sequence.
  • the target sequence comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • the polynucleotide sequence further comprises a molecular identifier.
  • the polynucleotide sequence further comprises nucleic acid sequences selected from coding or non-coding transcripts, cis-regulatory elements, molecular identifier barcode, or functional or non-functional genomic elements.
  • the reporter system disclosed herein comprises a vector comprising a promoter; and a polynucleotide sequence linked to the promoter.
  • the polynucleotide sequence comprises: a target sequence; and a molecular identifier or a barcode.
  • the vector is selected from a retroviral-based vector, a lentiviral-based vector, a transposon-based vector, and a serine integrase-based vector.
  • the transposon-based vector is a piggyBac-based vector.
  • the vector is a transposon-based vector (e.g., a piggyBac-based vector) comprising an inverted terminal repeat (ITR) at each vector end.
  • the vector is devoid of any cis-regulatory elements.
  • the promoter is selected from a bacteriophage T7 promoter, a T3 promoter, or a Sp6 promoter and variants thereof.
  • the target sequence comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • the present disclosure pertains to a method of mapping a genomic location of at least one reporter system disclosed herein in a genome of a cell.
  • the cell is selected from a prokaryotic cell or a eukaryotic cell.
  • the cell is a eukaryotic cell, preferably a mammalian cell.
  • the method comprises integrating/inserting the at least one reporter system in the genome of the cell.
  • the method further comprises determining the insertion/integration site for the at least one reporter system within the genome of the cell.
  • the step of integrating the at least one reporter system in the genome of the cell comprises transfecting the cell with the at least one reporter system.
  • the reporter system comprises a vector comprising: a promoter for in vitro transcription; and a polynucleotide sequence linked to the promoter.
  • the polynucleotide sequence comprises: a target sequence; and a molecular identifier or a barcode.
  • the vector is selected from a retroviral-based vector, a lentiviral-based vector, a transposon-based vector, and a serine integrase-based vector.
  • the transposon-based vector is a piggyBac-based vector.
  • the vector is a transposon-based vector (e.g., a piggyBac-based vector) comprising an inverted terminal repeat (ITR) at each vector end.
  • ITR inverted terminal repeat
  • the vector is devoid of any cis-regulatory elements.
  • the promoter is selected from a bacteriophage T7 promoter, a T3 promoter, or a Sp6 promoter and variants thereof.
  • the step of integrating the at least one reporter system in the genome of the cell further comprises co-transfecting the cell with an expression construct capable of expressing a protein or an enzyme for excising a payload of the vector and integrating the payload into the genome.
  • the protein or enzyme comprises a transposase.
  • the payload of the vector comprises a promoter for in vitro transcription, a target sequence, and a molecular identifier.
  • the target sequence comprises an amplification primer binding domain/site.
  • the target sequence further comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • the step of determining the insertion site for the at least one reporter system within the genome of the cell comprises isolating genomic DNA of the cell comprising the at least one reporter system integrated in the genome of the cell.
  • the method further comprises in vitro transcription of the isolated genomic DNA using the promoter of the reporter system to generate chimeric RNAs comprising a portion of the reporter system and of its neighboring genomic DNA.
  • the method comprises generating a cDNA library by reverse transcribing the chimeric RNAs followed by amplification of the cDNAs.
  • the cDNA library thus obtained is sequenced to associate the molecular identifiers with the insertion site to map the genomic location of the at least one reporter system in the genome of the cell.
  • the step of sequencing the library of cDNAs comprises high-throughput short- and long-read sequencing, and Sanger sequencing.
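In one illustrative, non-limiting sketch, the association of molecular identifiers with insertion sites described above may be expressed as follows. The sketch assumes that sequencing reads have already been aligned and reduced to hypothetical `(barcode, chrom, pos, strand)` tuples; the function name `map_barcodes` and the thresholds `min_reads` and `min_fraction` are assumptions introduced for illustration and are not taken from the disclosure.

```python
from collections import Counter, defaultdict


def map_barcodes(reads, min_reads=3, min_fraction=0.8):
    """Assign each barcode to its dominant insertion site.

    reads: iterable of (barcode, chrom, pos, strand) tuples (hypothetical format).
    Returns {barcode: (chrom, pos, strand)} for barcodes passing both thresholds.
    """
    sites_by_barcode = defaultdict(Counter)
    for barcode, chrom, pos, strand in reads:
        sites_by_barcode[barcode][(chrom, pos, strand)] += 1

    mapping = {}
    for barcode, counts in sites_by_barcode.items():
        total = sum(counts.values())
        site, top = counts.most_common(1)[0]
        # Keep a barcode only if it has enough reads and one site dominates.
        if total >= min_reads and top / total >= min_fraction:
            mapping[barcode] = site
    return mapping


# Hypothetical example reads.
example_reads = (
    [("ACGT", "chr1", 1000, "+")] * 9
    + [("ACGT", "chr2", 500, "-")]      # minority read supporting another site
    + [("TTAG", "chr3", 42, "-")] * 2   # too few reads to assign a site
)
barcode_sites = map_barcodes(example_reads)
```

Requiring a dominant site is one simple way to discard barcodes whose reads disagree; other filters could equally be used.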
  • the reporter system and the methods disclosed herein may be used to map any number of genomic features in a genome of a cell including but not limited to expression cassettes of coding or non-coding transcripts, cis-regulatory elements (e.g., enhancers, silencers, and insulators), gene editing target sequences, recombination sites (e.g., loxP and FRT sites), and other functional or non-functional genomic elements.
  • the method comprises generating a reporter pool of cells comprising a plurality of reporter systems, disclosed herein, integrated into genomes of a plurality of cells.
  • the step of generating a reporter pool of cells comprises transfecting the plurality of cells with the plurality of reporter systems.
  • the plurality of cells constitutively express a prime editor (PE).
  • the PE comprises a Cas9 nickase and a reverse transcriptase.
  • each of the plurality of the reporter system comprises a vector comprising a promoter for in vitro transcription; and a polynucleotide sequence linked to the promoter.
  • the polynucleotide sequence comprises: a target sequence; and a molecular identifier or a barcode.
  • the vector is selected from a retroviral-based vector, a lentiviral-based vector, a transposon-based vector (e.g., a piggyBac-based vector), and a serine integrase-based vector.
  • the vector comprises a transposon-based vector (e.g., a piggyBac-based vector) comprising an inverted terminal repeat (ITR) at each vector end.
  • the method further comprises co-transfecting the plurality of cells with a transposase.
  • the vector is devoid of any cis-regulatory elements.
  • the promoter is selected from a bacteriophage T7 promoter, a T3 promoter, or a Sp6 promoter and variants thereof.
  • the target sequence comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • the method further comprises mapping insertion sites and determining prime editing efficiency for the plurality of reporters in the genomes of the plurality of cells by methods disclosed herein.
  • the method comprises transfecting the reporter pool of cells with pegRNAs comprising a sequence for introducing a desired edit at the target sequence of the plurality of reporter systems integrated into the genomes of the plurality of cells, wherein each of the plurality of cells constitutively expresses a prime editor (PE).
  • PE prime editor
  • the method comprises measuring the frequency of the desired edit in the genome of the cell. In some embodiments, the method further comprises correlating the prime editing efficiencies with features of the epigenetic environment in the genome in a region of about 100 bp to about 2 kb centered on the edit sites of the target sequences. In various embodiments, the features of the epigenetic environment are selected from DNA binding proteins, epigenetic modulators, histone modifications, DNase hypersensitive sites, transposase-accessible chromatin (ATAC), and higher-order chromatin structures.
  • the plurality of cells comprises eukaryotic cells, preferably mammalian cells. In some embodiments, the plurality of cells comprises stem cells (e.g., induced pluripotent stem cells (iPSC)).
  • iPSC induced pluripotent stem cells
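In one illustrative, non-limiting sketch, measuring the frequency of the desired edit and correlating it with an epigenetic feature may be expressed as follows. The per-reporter read counts and feature scores are hypothetical, and each score is assumed to summarize one feature over the window around the edit site. Spearman's rank correlation is used here because the figure descriptions refer to it; the function names are assumptions introduced for illustration.

```python
def editing_efficiency(edited_reads, total_reads):
    """Fraction of reads carrying the desired edit at one reporter."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return edited_reads / total_reads


def average_ranks(values):
    """1-based ranks, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical reporters: (edited reads, total reads, chromatin feature score).
reporters = [(5, 100, 0.1), (20, 100, 0.4), (15, 100, 0.3), (60, 100, 0.9)]
efficiencies = [editing_efficiency(e, t) for e, t, _ in reporters]
scores = [s for _, _, s in reporters]
rho = spearman_rho(scores, efficiencies)
```

A rank-based coefficient is insensitive to the scale of the feature score, which is convenient when features are measured in different units.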
  • a method for modulating gene editing efficiency of a target site in a genomic region of a cell comprises modulating chromatin accessibility of the genomic region and/or transcriptional activity of a gene in the genomic region proximal to 3915-P1300WO.UW -5- the gene editing target site.
  • gene editing comprises prime editing.
  • the prime editing target site is selected from an exon, an intron, a promoter, a 3’ UTR, a 5’ UTR, and intergenic sites in the genomic region.
  • the prime editing target site is downstream of a promoter of the gene.
  • the prime editing target site is downstream of a transcriptional start site (TSS) of the gene. In certain embodiments, the prime editing target site is upstream of a promoter of the gene.
  • modulating the prime editing efficiency comprises activating transcription of the gene proximal to the prime editing target site. In some embodiments, the step of activating transcription occurs before or during the step of prime editing of the target site. In some embodiments, modulating the prime editing comprises repressing transcription of the gene proximal to the prime editing target site. In various embodiments, the step of repressing transcription occurs before or during the step of prime editing of the target site. [0017] The methods disclosed herein may be combined with other known methods of modulating prime editing efficiency of the target site.
  • the method further comprises modulating at least one DNA Damage Repair (DDR) gene in the cell.
  • the at least one DDR gene is a DNA mismatch repair (MMR) gene (e.g., an MMR gene selected from PMS2, MLH1, MSH2, MLH3, MSH6, EXO1, and FEN1).
  • MMR DNA mismatch repair
  • the at least one DDR gene is HLTF.
  • the method further comprises use of a more robust or enhanced prime editing ribonucleoprotein (RNP) complex.
  • RNP prime editing ribonucleoprotein
  • the more robust or enhanced RNP complex comprises a prime editor selected from (a) a prime editor comprising an engineered reverse transcriptase (e.g., PE2), (b) a codon- and structure-optimized prime editor (PEmax), (c) a compact and/or specialized prime editor (e.g., a prime editor selected from PE6a, PE6b, PE6c, PE6d, PE6e, PE6f, and PE6g), and (d) a prime editor fused to an RNA-binding protein (e.g., PE7).
  • the more robust or enhanced RNP complex comprises a structurally stabilized pegRNA (epegRNA) and/or a computationally optimized pegRNA.
  • the method further comprises inhibiting DNA mismatch repair (MMR).
  • inhibiting MMR comprises use of a second-site nicking guide RNA targeting the non-edited strand.
  • inhibiting MMR comprises inhibiting an MMR protein selected from the group consisting of MLH1, MSH2, MSH3, MSH6, PMS2, and EXO1.
  • inhibiting MMR comprises expressing a dominant negative MLH1 protein, RNAi-mediated knockdown of MMR protein expression, or targeted degradation of the MMR protein.
  • the cell is a eukaryotic cell, preferably a mammalian cell.
  • the cell is a stem cell (e.g., an induced pluripotent stem cell (iPSC)).
  • FIG. 1C Motif enrichment analysis of 20-bp windows surrounding synHEK3 integration sites.
  • FIG. 1E UpSet plot of genomic annotations of synHEK3 integration sites.
  • FIGs 2A-2J show an example of characterization of synHEK3 insertion sites determined by T7-assisted reporter mapping assay, according to aspects of the disclosure.
  • FIG. 2A Schematic of the synHEK3 reporter construct and the T7-assisted reporter mapping method.
  • FIG. 2B Experimental workflow.
  • FIG. 2C Copy number of synHEK3 reporters was estimated by qPCR using that of SNRPB (3 copies) as a reference.
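In one illustrative, non-limiting sketch, a copy number estimate relative to a reference locus may be computed from qPCR threshold cycles by the delta-Ct method. The source states only that SNRPB (3 copies) served as the reference; the Ct values, the function name `estimate_copy_number`, and the assumption of equal, perfect amplification efficiency (a doubling per cycle) for both amplicons are introduced here for illustration.

```python
def estimate_copy_number(ct_target, ct_reference, reference_copies=3, efficiency=2.0):
    """Relative copy number by the delta-Ct method.

    A lower Ct means more starting template, so the target/reference template
    ratio is efficiency ** (ct_reference - ct_target).
    """
    return reference_copies * efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values: the reporter amplicon crosses threshold 2 cycles
# earlier than the SNRPB amplicon.
copies = estimate_copy_number(ct_target=22.0, ct_reference=24.0)
```

If the two amplicons amplify with different efficiencies, a standard-curve correction would be needed in place of this simplification.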
  • FIG. 2J Density plots of chromatin feature scores of the selected genomic sites.
  • FIGs 3A-3I show chromatin context has an impact on prime editing efficiency, according to aspects of the disclosure.
  • FIG.3A Left: workflow of experiment.
  • FIG.3B Heatmaps of fractions of highly editable (>25%) sites in synHEK3 sites stratified by chromatin feature scores.
  • SynHEK3 reporters are binned into 10 equally sized bins with increasing chromatin feature scores.
  • the chromatin features are ordered left-to-right by their correlation coefficient (Spearman’s ρ) with prime editing efficiency.
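In one illustrative, non-limiting sketch, the stratification described for FIG. 3B (10 equally sized bins of increasing chromatin feature score, with the fraction of sites edited above 25% reported per bin) may be expressed as follows. The input values and the function name `fraction_highly_editable` are hypothetical.

```python
def fraction_highly_editable(scores, efficiencies, n_bins=10, threshold=0.25):
    """Sort reporters by feature score, split them into n_bins near-equal bins,
    and return the fraction in each bin with efficiency above the threshold.

    Empty bins (fewer reporters than bins) are reported as None.
    """
    if len(scores) != len(efficiencies):
        raise ValueError("scores and efficiencies must have the same length")
    ranked = sorted(zip(scores, efficiencies), key=lambda pair: pair[0])
    n = len(ranked)
    fractions = []
    for b in range(n_bins):
        members = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        if not members:
            fractions.append(None)
            continue
        high = sum(1 for _, eff in members if eff > threshold)
        fractions.append(high / len(members))
    return fractions


# Hypothetical data: 20 reporters whose efficiency rises with the feature score.
example_scores = list(range(20))
example_efficiencies = [i / 40 for i in range(20)]
bin_fractions = fraction_highly_editable(example_scores, example_efficiencies)
```

Each returned value corresponds to one cell of a heatmap column for a single chromatin feature.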
  • FIG.3C Scatter plot of observed vs. predicted prime editing efficiencies using reporters in a holdout test set. Points shaded by the number of neighboring points.
  • FIG. 3D Scatter plot of Spearman’s ρ between chromatin feature scores and prime editing efficiencies, calculated separately for intragenic and intergenic reporters.
  • FIG. 3E Prime editing efficiency for gene-proximal reporters. Distance was scaled by gene length and binned. Negative values refer to synHEK3 sites located upstream of the TSS. Values >100% refer to synHEK3 sites located downstream of the transcription termination site (TTS). Points shaded based on expression levels (log10) of the genes. TPM, transcripts per million.
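In one illustrative, non-limiting sketch, the gene-length-scaled distance described for FIG. 3E may be computed as follows, so that 0% corresponds to the TSS, 100% to the TTS, negative values lie upstream of the TSS, and values above 100% lie downstream of the TTS. The coordinate convention (single-base genomic positions with a strand flag) and the function name `scaled_gene_position` are assumptions introduced for illustration.

```python
def scaled_gene_position(site, tss, tts, strand):
    """Position of a site along a gene, as a percentage of gene length.

    For "+" strand genes the TSS has the smaller coordinate; for "-" strand
    genes the TSS has the larger coordinate, so the offset is reversed.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    gene_length = abs(tts - tss)
    if gene_length == 0:
        raise ValueError("gene length must be non-zero")
    offset = site - tss if strand == "+" else tss - site
    return 100.0 * offset / gene_length


# Hypothetical gene on the "+" strand, TSS at 1,000 and TTS at 3,000.
inside = scaled_gene_position(1500, 1000, 3000, "+")
upstream = scaled_gene_position(500, 1000, 3000, "+")
downstream = scaled_gene_position(3500, 1000, 3000, "+")
```

Scaling by gene length lets reporters in genes of very different sizes be pooled into common bins.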
  • FIG. 3F Genome browser views of the 4 most highly editable sites. Sites of integration and measured editing efficiencies are shown as a dot plot at top and aligned with selected epigenetic tracks. For each synHEK3 insertion, editing efficiency, number of reads with edit (numerator), and total number of reads (denominator) are annotated.
  • FIG. 3G Scatter plot of sequence-based prediction (DeepPrime, x-axis) vs. normalized editing rate (y-axis, log10 scale) for epegRNAs designed for prime editing at endogenous genomic sites.
  • FIG. 3H Scatter plot of chromatin-based prediction (our model, x-axis) vs. normalized editing rate (y-axis, log10 scale).
  • FIG. 3I Scatter plot of combined score (x-axis) vs. normalized editing rate (y-axis, log10 scale).
  • FIG. 4A Scatter plots of chromatin feature scores vs. prime editing efficiencies for individual synHEK3 reporters. Points are shaded by the number of neighboring points. The Spearman’s (ρ) and Pearson’s (r) correlation coefficients between the chromatin feature score and prime editing efficiency are annotated.
  • FIG. 4B Boxplots of prime editing efficiency in synHEK3 sites, binned by chromatin feature scores. Q1-Q10 correspond to 10 equally sized bins of synHEK3 reporters with increasing chromatin feature scores.
  • FIG. 4C Scatter plots of Spearman’s ρ of chromatin feature scores vs.
  • FIG. 4D Scatter plots of predicted (beta regression model) vs. observed prime editing efficiencies in two independent pools of synHEK3 reporters in wild-type K562 cells. Points are shaded by the number of neighboring points.
  • FIG. 4F Boxplot of prime editing efficiencies of gene-proximal synHEK3 reporters. Distances were calculated relative to the closest TSS, scaled by gene length and binned. Negative values refer to synHEK3 sites located upstream of the TSS. Values >100% refer to synHEK3 sites located downstream of the TTS.
  • the reporters were binned based on the TPM of the overlapping/nearest genes. Bin 1 contains unexpressed genes, while Bins 2-5 are equally sized in terms of the number of assigned genes (though not necessarily in terms of the number of assigned synHEK3 reporters), sorted by increasing expression levels.
  • FIG. 4G Boxplot of prime editing efficiencies of gene-proximal reporters.
  • FIG. 4H Cumulative distribution function (CDF) plot of prime editing efficiencies of intragenic synHEK3 reporters of different orientations with respect to the direction of transcription.
  • “Opposite” means the synHEK3 reporter is on the strand opposite the coding strand.
  • “Same” means the synHEK3 reporter is on the same strand as the coding strand.
  • Genes with detectable expression (TPM > 3) and synHEK3 reporters within 50 kb from the TSSs were selected for this analysis. P value: two-sided Kolmogorov–Smirnov test.
  • FIG. 4I Genome browser views of 4 poorly editable sites.
  • FIG. 5B Hierarchical clustering of synHEK3 reporters based on prime and Cas9 editing efficiencies.
  • FIG. 5C Density plot of prime and Cas9 editing efficiencies for 14 groups of synHEK3 reporters, ordered by mean prime editing efficiency.
  • FIG. 5D Bar graph of the log2 ratio between number of intragenic vs. intergenic sites in each of the 14 groups.
  • FIGs 5E-5G Comparison of properties of intragenic sites in selected groups. P-value: two-sided Kolmogorov–Smirnov test.
  • FIG. 5E Boxplot of ATAC-seq scores of selected groups.
  • FIG. 5F Expression levels of the overlapping genes in TPM of selected groups (x-axis; log10 scale).
  • FIGs 6A-6J show example comparisons between prime editing and Cas9 editing, leveraging a common set of integrated editing reporters, according to aspects of the disclosure.
  • FIG.6A PCA plots of synHEK3 reporters generated using chromatin scores as features. The first two PCs are plotted (PC1: variance 62%; PC2: variance 9%). Points are shaded by prime editing efficiency at Day 4 and Cas9 indel frequency measured at Day 1, 2 and 4.
  • FIG. 6B Cas9 indel frequency measured at Day 1, 2 and 4 for gene-proximal reporters.
  • FIG. 6C PCA plots of synHEK3 reporters generated using chromatin scores as features. The first two PCs are plotted (PC1: variance 62%; PC2: variance 9%). Points are shaded by the 6 main groups as in FIG.5B.
  • FIG.6D Scatter plot of Cas9 editing (Day 1) and prime editing (Day 4) efficiency. Points are shaded by the 6 main groups as in panel C.
  • FIG.6E Scatter plot of Cas9 editing (Day 1) and prime editing (Day 4) efficiency, shaded by the 14 subgroups.
  • FIG. 6F Boxplot of H3K27me3 and H3K9me3 scores of synHEK3 reporters in Groups 1.0 and 2.0. P-value: two-sided Kolmogorov–Smirnov test.
  • FIG. 6G The MMEJ / (MMEJ + NHEJ) ratio in all synHEK3 sites and in sites overlapping with H3K27me3 and H3K9me3 peaks. Black line: the median MMEJ / (MMEJ + NHEJ) ratio. P-value: two-sided Kolmogorov–Smirnov test.
  • FIG. 6H The MMEJ / (MMEJ + NHEJ) ratio in the 6 groups of synHEK3 sites as in FIG. 6C.
  • Black line: the median MMEJ / (MMEJ + NHEJ) ratio.
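In one illustrative, non-limiting sketch, the MMEJ / (MMEJ + NHEJ) ratio and its median across sites may be computed as follows. How individual indel alleles are classified as MMEJ or NHEJ products is not specified in this section, so the sketch takes already-classified read counts as input; the counts and the function name `mmej_ratio` are hypothetical.

```python
from statistics import median


def mmej_ratio(mmej_reads, nhej_reads):
    """Share of classified repair outcomes attributed to MMEJ at one site.

    Returns None when no classified reads are available.
    """
    total = mmej_reads + nhej_reads
    if total == 0:
        return None
    return mmej_reads / total


# Hypothetical (MMEJ reads, NHEJ reads) per synHEK3 site.
site_counts = [(30, 70), (10, 90), (50, 50), (0, 0)]
ratios = [mmej_ratio(m, n) for m, n in site_counts]
median_ratio = median(r for r in ratios if r is not None)
```

Sites without classified reads are excluded before taking the median rather than being counted as zero.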
  • FIGs 6I-6J Scatter plots of the MMEJ / (MMEJ + NHEJ) ratio or allele frequencies of intragenic synHEK3 reporters and their distance to corresponding TSSs (x-axis; log10 scale). Points are shaded based on expression levels (log10) of the genes. Black line: linear regression line with confidence interval in gray.
  • FIG. 6I The MMEJ / (MMEJ + NHEJ) ratio is plotted.
  • FIGs 7A-7E show an example of dissecting chromatin context-dependent regulation of prime editing using a modified sci-RNA-seq3 workflow, according to aspects of the disclosure. Experimental workflow of the pooled shRNA screen.
  • FIG.7A The two monoclonal K562 lines used in this example stably expressed PE2 and reverse tetracycline transactivator (rtTA), and together contained 50 unique synHEK3 reporters.
  • Cells were transduced with the TRE-shRNA library at a high multiplicity of infection (MOI) and treated with doxycycline. On Day 2, cells were transfected with pegRNAs to introduce random 6-bp insertions at synHEK3 reporters. After 3-4 days, nuclei were extracted and fixed. TRE: tetracycline response element.
  • FIG. 7B Fixed nuclei were subjected to IST with T7 polymerase (gray circle) to produce transcripts from synHEK3 and shRNA constructs.
  • FIG. 7C Nuclei were distributed into 96-well plates for indexed RT. In each well, a cocktail of three indexed RT primers was used: oligo-dT primers, and synHEK3- and shRNA-specific primers.
  • FIG. 7D After RT, nuclei were pooled and redistributed into 96-well plates for indexed hairpin ligation. Then, they were pooled and split to final 96-well plates. After second-strand synthesis, nuclei were lysed and the resulting lysates were split to two plates. One plate underwent Tn5 tagmentation and indexed PCR to generate a transcriptome library. The other plate was used for indexed enrichment PCR targeting the synHEK3 and shRNA transcripts.
  • FIG.7E For each synHEK3 reporter, editing outcomes were computed and compared between cells with vs. without a specific shRNA.
  • FIGs 8A-8D show an example experimental setup of the pooled shRNA screen and its readout via T7 IST-assisted sci-RNA-seq3, according to aspects of the disclosure.
  • FIG.8A Correlation between efficiencies of random 3-bp insertions measured in the monoclonal lines versus efficiencies measured for the same reporters in the original polyclonal population.
  • FIG. 8B List of genes targeted by the shRNA library. Genes are grouped by pathways. Control genes are shown separately.
  • FIG. 8C Schematic of the lentiviral shRNA construct and the synHEK3 reporter, with key features relevant to the sci-RNA-seq3 workflow highlighted.
  • TRE: tetracycline response element.
  • FIG. 8D Schematic structures of the sci-RNA-seq3, shRNA and synHEK3 libraries.
  • UMI: unique molecular identifier;
  • RT PBS: reverse transcription primer binding site.
  • FIGs 9A-9D show example effects of perturbing MMR-related genes on prime editing, according to aspects of the disclosure.
  • FIG. 9A Q-Q plot of statistical significance (-log10) of synHEK3-shRNA pairs in clones 3 (left) and 5 (right).
  • FIG. 9B Plots of adjusted p values (-log10) of all synHEK3-shRNA pairs. Target genes with high statistical significance are annotated. Points shaded by editing efficiency changes caused by corresponding shRNAs.
  • FIG. 9C Effects of shRNAs targeting MMR-related genes in clone 5. Log2 fold-changes of prime editing efficiencies of synHEK3-shRNA pairs are plotted and shaded by their corresponding adjusted p values (-log10).
  • FIG. 9D Effects of shRNAs against MLH1 and PMS2.
  • FIGs 10A-10J show an example pooled shRNA screen with sci-RNA-seq3 and effects of perturbing the MMR pathway on prime editing, according to aspects of the disclosure.
  • FIG. 10A Scatter plot of synHEK3 UMIs detected in single cells in sci-RNA-seq3 data. Cells assigned to the two clones are shaded (upper left grouping: clone 5; lower right grouping: clone 3). Mixed cells are in gray.
  • FIG.10B Histograms of cell count per shRNA (left), number of shRNAs captured per cell (middle), and number of synHEK3 reporters captured per cell (right) in the two clones.
  • FIG.10C Scatter plot of prime editing frequencies of synHEK3 reporters estimated with sci-RNA-seq3 vs. bulk amplicon sequencing.
  • FIG. 10D Scatter plots of the number of cells per synHEK3-shRNA pair and corresponding adjusted p values (-log10).
  • Candidate shRNAs (left) and control shRNAs (right) are plotted separately.
  • FIG.10E Effects of shRNAs targeting MMR-related genes in clone 3.
  • FIGs 10F-10G Effects of shRNAs against FEN1 and EP300. Lines indicate: editing frequencies in cells with individual shRNAs, mean editing frequencies of the gene-targeting shRNAs, control editing frequencies for individual shRNAs (not visible because low variance relative to mean line), and mean control editing frequencies.
  • FIG. 10H Effects of shRNAs against DOT1L and KDM2B. Barcode sequences omitted but in the same order as FIGs 10F and 10G. Shading as per FIGs 10F and 10G.
  • FIG. 10I Volcano plots of gene expression changes in EPZ-5676 treated cells (5 µM for 6 days). Genes with synHEK3 insertions in clone 3 or clone 5 of the single cell screen are shaded in gray.
  • FIG. 10J Scatter plots of log2 fold-change of prime editing efficiency vs. gene expression change (log2 fold; top) and H3K79me2 score (bottom) for a set of intragenic synHEK3 reporters with baseline efficiency > 20%. Black line: linear regression line with confidence interval in gray.
  • FIGs 11A-11D show an example chromatin context-specific response to HLTF inhibition, according to aspects of the disclosure.
  • FIG. 11A Violin plot of fold-changes of editing efficiency of synHEK3 sites in response to inhibition of HLTF, MLH1 and PMS2. Points shaded by shRNA identity.
  • FIG. 11B Heatmap of synHEK3 reporters (row) and their responses to shRNAs against HLTF (left: clone 5; right: clone 3). Leftmost bar annotates the overlapping status of synHEK3 reporters with GRCh38 gene annotations. Second left bar indicates the expression status of the overlapping or nearest gene in TPM. Third left bar indicates distances (bp) to corresponding TSS of gene-overlapping synHEK3 reporters or nearest genes for synHEK3 reporters outside genes.
  • FIG. 11C Bar plot of synHEK3 reporter counts based on their responsiveness to HLTF inhibition and overlapping status with GRCh38 gene annotations. P value: Fisher’s exact test.
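In one illustrative, non-limiting sketch, a two-sided Fisher's exact test on a 2x2 table of reporter counts (for example, responsive versus unresponsive to HLTF inhibition, split by whether the reporter overlaps an annotated gene) may be computed as follows. The counts are hypothetical, and the two-sided p value follows the common convention of summing the probabilities of all tables no more likely than the observed one; the function name `fisher_exact_two_sided` is an assumption introduced for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. The p value sums the probabilities of every table that is
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows = responsive / unresponsive reporters,
# columns = intragenic / intergenic.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

An exact test is a natural choice at this scale because a pool of a few dozen reporters leaves some cells of the table with small counts.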
  • FIG.11D Expression of genes overlapping or proximal (within 10 kb) to a synHEK3 reporter.
  • FIGs 12A-12H show an example chromatin context-specific response to HLTF inhibition, according to aspects of the disclosure.
  • FIG. 12A Effects of shRNAs against HLTF on all synHEK3 reporters in clone 5 and clone 3. For every synHEK3-shRNA pair, editing frequencies in cells with candidate shRNA are plotted and shaded by their statistical significance. Control editing frequencies are shown as horizontal, flat data. Regions of the plot containing synHEK3 reporters that are significantly less responsive to HLTF inhibition are shaded gray.
  • FIG.12B Effects of shRNAs against PMS2 on all synHEK3 reporters in clone 5 and clone 3. Layout as in FIG.12A.
  • FIG.12C Scatter plots of baseline editing efficiencies (x-axes) vs. fold-changes induced by shRNAs against HLTF, MLH1 and PMS2 (y-axes) across 50 synHEK3 reporters. Vertical lines correspond to baseline editing efficiencies of 0.2 and 0.9.
  • FIG.12D Cumulative distribution function (CDF) plots of log2 fold-change of mean editing frequency induced by shRNAs against HLTF. synHEK3 reporters are colored by their responsiveness to HLTF inhibition based on the shRNA screen. P value: two-sided Kolmogorov–Smirnov test.
  • FIG.12E Validation of differential responsiveness to HLTF inhibition in cells transduced with individual shRNAs. CDF plots of log2 fold-change of editing frequency induced by shRNAs against HLTF (shHLTF.2367: used in the shRNA screen; shHLTF.2623: an orthogonal shRNA) or MLH1 (shMLH1.1911).
  • FIG. 12F Scatter plot of RNA log2 fold-changes induced by two shRNAs against HLTF. Genes overlapping or near shHLTF responsive sites are lower; genes overlapping or near shHLTF unresponsive sites are upper; other sites are in gray; HLTF is shown at bottom left corner; -1.5 and -2 log2 fold-change.
  • FIG.12G CDF plot of log2 fold-changes of ATAC-seq peak counts induced by shHLTF.2367.
  • Peaks near synHEK3 sites were defined as those either: 1) within 5 kb of a synHEK3 site; and/or 2) in the promoter or body of a gene overlapping or proximal (within 10 kb) to a synHEK3 insertion.
  • FIG. 12H Western blot analysis of HLTF, PMS2 and MLH1. Clone 5 or a wild-type (WT) K562 line were transduced with rtTA and shRNAs targeting the candidate genes. Protein expression was compared to cells without doxycycline (Dox) or cells transduced with a shRNA targeting a different gene.
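The CDF comparisons in FIGs 12D, 12E and 12G rest on the two-sample Kolmogorov–Smirnov statistic, which is the largest vertical gap between two empirical CDFs. The sketch below computes only that statistic with the standard library; the function name is hypothetical, and obtaining the P value itself would normally be left to a statistics package.

```python
def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap between step functions can only change at observed values.
    for x in sorted(set(a) | set(b)):
        cdf_a = sum(v <= x for v in a) / len(a)
        cdf_b = sum(v <= x for v in b) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Applied to the log2 fold-changes of responsive and unresponsive reporters, a value near 1 would indicate nearly non-overlapping distributions and a value near 0 nearly identical ones.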
  • FIGs 13A-13H show an example of modulating prime editing outcomes by epigenetic conditioning, according to aspects of the disclosure.
  • FIG.13A Workflow of the CRISPRoff experiment.
  • FIG. 13B Scatter plots of mean prime editing efficiencies of synHEK3 reporters in cells transfected with CRISPRoff gRNAs targeting USP7, METTL2A and LRRC8C promoters. SynHEK3 reporters in corresponding target genes are labeled. Error bars correspond to standard deviation of measured editing efficiencies.
  • FIG. 13C Bar plot of prime editing efficiency changes of synHEK3 reporters in CRISPRoff experiment. Control editing efficiencies are predicted efficiencies using linear models trained in synHEK3 reporters that are not in the corresponding CRISPRoff target genes as shown in FIG. 13B.
  • FIG. 13D Workflow of the CRISPRa experiments.
  • FIG. 13E Prime editing efficiency (%) at endogenous gene targets in K562 cells, with or without epigenetic editing via CRISPRa. Dark gray bars show mean prime editing efficiencies in a wild-type K562 cell line which received only PEmax and (e)pegRNA. Light gray bars show mean prime editing efficiencies when control promoters were activated via CRISPRa. Black bars show mean prime editing efficiencies when target gene promoters were activated. Fold-changes are calculated between the CRISPRa and control groups. Inset shows a zoomed view of the first three genes.
  • FIG. 13F Prime editing efficiency (%) at endogenous gene targets in a human iPSC line, with or without epigenetic editing via CRISPRa.
  • FIG. 13G Scatter plots of mean prime editing efficiencies measured in control vs. CRISPRa cells in exons of IL2RB. Points shaded by edit type. x- and y-axes in log10 scale. The dotted lines indicate 2-, 5- and 10-fold differences between the CRISPRa and control groups.
  • FIG.13H Boxplot of prime editing efficiency fold-change for all variants. The dashed line indicates a fold-change of 2.
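The fold-changes between CRISPRa and control groups described for FIGs 13E-13H are ratios of editing efficiencies. A minimal sketch follows, with hypothetical function names and hypothetical input values that are not read from the figures.

```python
from math import log2

def fold_change(treated, control):
    """Ratio of treated to control editing efficiency (same units for both)."""
    if control <= 0:
        raise ValueError("control efficiency must be positive")
    return treated / control

def exceeds_fold(treated, control, threshold=2.0):
    """True when the treated group is at least `threshold`-fold above control."""
    return fold_change(treated, control) >= threshold

# Hypothetical efficiencies (%), for illustration only.
print(log2(fold_change(20.0, 5.0)))  # log2 of a 4-fold increase
```

The default threshold of 2 mirrors the dashed line in FIG. 13H; other cut-offs such as 5 or 10 can be passed explicitly.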
  • FIGs 14A-14H show an example of modulating prime editing outcomes by epigenetic reprogramming, according to aspects of the disclosure.
  • FIG.14A Schematic diagrams of target genes in the CRISPRoff experiment, with the locations of the CRISPRoff gRNAs and synHEK3 reporters annotated.
  • FIG. 14B Scatter plots of mean normalized counts of gene expression in cells transfected with CRISPRoff gRNAs at Day 11 (2 replicates each). x- and y-axes on log10 scale. Insets are bar plots of normalized counts of the CRISPRoff target genes. NTC: non-targeting control.
  • FIG. 14C Prime editing efficiency (%) at endogenous gene targets in K562 cells, with or without epigenetic editing via CRISPRa. Light gray bars show mean prime editing efficiencies when control promoters were activated via CRISPRa.
  • FIG. 14D mRNA fold-change of CRISPRa target genes quantified by qPCR.
  • FIG. 14E Relative expression levels of selected CRISPRa target genes compared to a set of reference genes. Circle: reference genes; triangle: endogenous expression levels of the target genes; square: expression levels of the target genes after CRISPRa. y-axis on log10 scale.
  • FIG.14F Schematic of the IL2RB gene with locations of gRNA (for CRISPRa) or (e)pegRNA targets (for prime editing) annotated.
  • FIG.14G List of epegRNA edit types tested, including all possible single-nucleotide substitutions, small insertions of 1, 3 and 6 bp, insertion of a loxP site (34 bp), and small deletions of 1, 3 and 6 bp.
  • FIG.14H Total editing efficiencies measured at the individual IL2RB exons. Light gray bars show mean prime editing efficiencies in control groups and dark gray bars show mean prime editing efficiencies when the IL2RB promoter was first activated via CRISPRa.
  • Prime editor consisting of a Cas9 nickase and a reverse transcriptase, searches for a genome engineering target and introduces a desired mutation, with both site specificity and edit identity pre-programmed in a prime editing guide RNA (pegRNA).
  • the high programmability of this method has further allowed for repurposing the cellular genome as an information storage device, which serves as a means for deconvoluting regulatory events and lineage trajectories during the development of a highly complex multicellular organism.
  • Prime editing, which involves the generation and repair of a single-strand DNA break, is, like all other DNA damage repair processes, very likely regulated by local chromatin context. Nevertheless, these chromatin environmental factors remain largely uncharacterized.
  • the inventors have developed a reporter system and developed an assay using this reporter system to resolve the complex and heterogeneous influence of chromatin landscape on the efficiency of prime editing. Utilizing this approach, highly variable prime editing efficiency was observed across genomic contexts and epigenetic features predictive of prime editing outcomes were identified.
  • CRISPRoff is a newly developed epigenetic reprogramming tool that has been shown to establish prolonged epigenetic silencing of target genes. Leveraging CRISPRoff, it was demonstrated that prime editing efficiency at intragenic targets can be attenuated by silencing the underlying genes, corroborating the correlation between transcriptional activity and prime editing outcome.
  • a reporter system for mapping a genomic location of at least one modification in a genome of a cell comprises a vector comprising a promoter for in vitro transcription.
  • a vector may comprise polynucleotide sequences needed for RNA in vitro transcription, such as a promoter sequence, e.g., an RNA promoter sequence, preferably T3, T7 or SP6 RNA promoter sequences. Particularly preferred in the context of the present disclosure are vectors comprising promoters for DNA-dependent RNA polymerases such as T3, T7 and Sp6.
  • the vector can include any sequence that can integrate into the genome, whether randomly or by targeted integration.
  • the vector is selected from a transposon-based vector (e.g., a piggyBac-based vector), a retroviral-based vector, a lentiviral-based vector, and a serine integrase-based vector.
  • the reporter system may further comprise a polynucleotide sequence linked to the promoter.
  • the polynucleotide sequence may comprise an amplification primer binding domain/sequence.
  • the polynucleotide sequence may further comprise other nucleic acid sequences.
  • the polynucleotide may comprise nucleic acid sequences for coding or non-coding transcripts, cis-regulatory elements, molecular identifier barcode, or functional or non-functional genomic elements.
  • the polynucleotide sequence comprises a target sequence and a molecular identifier.
  • the target sequence comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • Molecular identifiers contemplated by the reporter systems and methods of this disclosure are well-known in the art and may comprise nucleic acid sequences or sequencing linkers functioning as molecular tags or barcodes.
  • the molecular barcode comprises at least about 5 (such as at least about any one of 10, 15, 20, or 25) preprogrammed, or randomly and/or degenerately designed nucleotides.
  • the molecular identifier is 16 bp.
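As an illustration of why a 16-bp randomly designed identifier can distinguish many integrations, the sketch below draws a random barcode and reports the size of the sequence space. The function names are hypothetical, and uniform sampling of the four bases is an assumption rather than a description of how the disclosed barcodes were synthesized.

```python
import random

def random_barcode(length=16, rng=None):
    """Draw a degenerate barcode, assuming each base is equally likely."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))

def barcode_space(length=16):
    """Number of distinct sequences of the given length over A, C, G, T."""
    return 4 ** length

print(random_barcode(), barcode_space())
```

With 4^16 possible sequences, the theoretical space is far larger than the number of integration sites expected in a typical cell population, although the diversity actually delivered also depends on the library used.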
  • the reporter system comprises a vector comprising a polynucleotide sequence comprising: a promoter for in vitro transcription; a target sequence; and a molecular identifier.
  • the target sequence comprises a target sequence for a Cas9 guide RNA or a variant thereof.
  • the reporter system is used to map a desired edit mediated by prime editing into a genome of a cell and/or determine prime editing efficiency in a genome of a cell.
  • the cell constitutively expresses Prime Editor (PE).
  • the vector is devoid of any cis-regulatory elements.
  • Provided herein also is a method of mapping the genomic location of at least one reporter system disclosed herein in the genome of a cell.
  • the present disclosure provides a method for determining prime editing efficiencies across a genome using the reporter system of the present disclosure.
  • gene editing comprises prime editing.
  • the prime editing target site may be any site in the genome. Exemplary target sites include but are not limited to an exon, an intron, a promoter, 3’ UTR, 5’ UTR, and intergenic sites in the genomic region.
  • the prime editing target site may be downstream or upstream of a promoter of the gene. In a preferred embodiment, the prime editing target site is downstream of a transcriptional start site (TSS) of the gene.
  • modulating the prime editing efficiency comprises activating transcription of the gene proximal to the prime editing target site.
  • transcriptional activation of a gene may be mediated by CRISPRa system.
  • the step of activating transcription may occur before, or during the step of prime editing of the target site in the genome of the cell.
  • the step of activating transcription of the gene proximal to the prime editing target site occurs before the step of prime editing of the target site in the genome of the cell.
  • modulating the prime editing comprises repressing transcription of the gene proximal to the prime editing target site.
  • transcriptional repression of the gene proximal to the prime editing target site may be mediated by CRISPRoff system.
  • the step of repressing transcription of the gene proximal to the prime editing target site occurs before, or during the step of prime editing of the target site.
  • factors impacting prime editing efficiency include 1) the properties of the prime editing ribonucleoprotein (RNP) complex itself; 2) the sequence of the target site and nature of the programmed edit; 3) trans-acting factors such as the endogenous DNA repair proteins involved in the installation of prime edits; and 4) the cis-chromatin context of the target site (as addressed by the present disclosure).
  • an engineered reverse transcriptase (PE2), a codon- and structure-optimized prime editor protein (PEmax), a set of more compact and functionally specialized prime editors (PE6a-g), a prime editor fused to a small RNA-binding protein to functionally interact with and stabilize pegRNAs (PE7), and a structurally stabilized pegRNA (epegRNA) have been developed.
  • a rate-limiter for prime editing is DNA mismatch repair (MMR), which detects the intermediate product of prime editing and efficiently repairs the edited strand.
  • the method further comprises modulating at least one DNA Damage Repair (DDR) gene in the cell.
  • the at least one DDR gene is HLTF.
  • the at least one DDR gene is a DNA mismatch repair (MMR) gene such as, for example, an MMR gene selected from PMS2, MLH1, MSH2, MSH3, MSH6, EXO1, and FEN1.
  • the method further comprises use of a more robust or enhanced prime editing ribonucleoprotein (RNP) complex, such as, for example, an RNP complex comprising (a) a prime editor comprising an engineered reverse transcriptase (e.g., PE2), (b) a codon- and structure-optimized prime editor (PEmax), (c) a compact and/or specialized prime editor (e.g., a prime editor selected from PE6a, PE6b, PE6c, PE6d, PE6e, PE6f, and PE6g), (d) a prime editor fused to an RNA-binding protein (e.g., PE7), and/or (e) a structurally stabilized pegRNA (epegRNA) and/or a computationally optimized pegRNA.
  • the method further comprises inhibiting DNA mismatch repair (MMR).
  • inhibiting MMR comprises use of a second-site nicking guide RNA targeting the non-edited strand.
  • inhibiting MMR comprises inhibiting an MMR protein selected from MLH1, MSH2, MSH3, MSH6, PMS2, and EXO1.
  • inhibiting MMR comprises expressing a dominant negative MLH1 protein, RNAi-mediated knockdown of MMR protein expression, or targeted degradation of the MMR protein.
  • the cell is a eukaryotic cell, preferably a mammalian cell.
  • the cell is a stem cell (e.g., an induced pluripotent stem cell (iPSC)).
  • the target nucleotide site is a DNA sequence in a genome, e.g., a eukaryotic genome.
  • the target site is a nucleotide sequence in a mammalian (e.g., a human) genome.
  • a “vector” is a composition of matter which comprises an isolated nucleic acid and which can be used to deliver the isolated nucleic acid to the interior of a cell.
  • a vector can include any nucleic acid sequence that can integrate into the genome of a cell.
  • a vector comprises a nucleic acid sequence that can integrate into the genome by random integration.
  • a vector comprises a nucleic acid sequence that can integrate into the genome by targeted integration.
  • the “antisense” strand of a segment within double-stranded DNA is the template strand, which is considered to run in the 3′ to 5′ orientation.
  • the “sense” strand is the segment within double-stranded DNA that runs from 5′ to 3′, and which is complementary to the antisense strand of DNA, or template strand, which runs from 3′ to 5′.
  • the sense strand is the strand of DNA that has the same sequence as the mRNA, which takes the antisense strand as its template during transcription, and eventually undergoes (typically, not always) translation into a protein.
  • the antisense strand is thus responsible for the RNA that is later translated to protein, while the sense strand possesses a nearly identical makeup to that of the mRNA. Note that for each segment of dsDNA, there will possibly be two sets of sense and antisense, depending on which direction one reads (since sense and antisense is relative to perspective). It is ultimately the gene product, or mRNA, that dictates which strand of one segment of dsDNA is referred to as sense or antisense.
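The relationship between sense strand, antisense (template) strand and mRNA described above can be sketched in a few lines. The function names are hypothetical, all sequences are written 5′ to 3′, and only unambiguous A/C/G/T bases are handled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def template_strand(sense):
    """Antisense (template) strand of a sense-strand segment, written 5'->3'."""
    return reverse_complement(sense)

def mrna_from_sense(sense):
    """The mRNA matches the sense strand with U in place of T."""
    return sense.upper().replace("T", "U")

print(template_strand("ATGC"), mrna_from_sense("ATGC"))
```

Taking the reverse complement twice returns the original sequence, which reflects the point that sense and antisense are defined relative to the transcript being considered.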
  • “Cas9” or “Cas9 nuclease” refers to an RNA-guided nuclease comprising a Cas9 domain, or a fragment thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9).
  • a “Cas9 domain,” as used herein, is a protein fragment comprising an active or inactive cleavage domain of Cas9 and/or the gRNA binding domain of Cas9.
  • a “Cas9 protein” is a full length Cas9 protein.
  • a Cas9 nuclease is also referred to sometimes as a casn1 nuclease or a CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)- associated nuclease.
  • CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements, and conjugative plasmids).
  • CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids.
  • CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA).
  • In type II CRISPR systems, correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc), and a Cas9 domain. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA.
  • Cas9/crRNA/tracrRNA endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer.
  • the target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3′-5′ exonucleolytically.
  • DNA-binding and cleavage typically requires protein and both RNAs.
  • single guide RNAs (“sgRNA,” or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species.
  • Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self versus non-self.
  • Cas9 nuclease sequences and structures are well known to those of skill in the art (see, e.g., “Complete genome sequence of an M1 strain of Streptococcus pyogenes.“ Ferretti et al., J. J., McShan W. M., Ajdic D. J., Savic D. J., Savic G., Lyon K., Primeaux C., Sezate S., Suvorov A. N., Kenton S., Lai H. S., Lin S. P., Qian Y., Jia H. G., Najar F. Z., Ren Q., Zhu H., Song L., White J., Yuan X., Clifton S.
  • Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems“ (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
  • a Cas9 nuclease comprises one or more mutations that partially impair or inactivate the DNA cleavage domain.
  • “variant” should be taken to mean the exhibition of qualities that have a pattern that deviates from what occurs in nature, e.g., a variant Cas9 is a Cas9 comprising one or more changes in amino acid residues as compared to a wild type Cas9 amino acid sequence.
  • “variant” encompasses homologous proteins having at least 75%, or at least 80%, or at least 85%, or at least 90%, or at least 95%, or at least 99% identity with a reference sequence and having the same or substantially the same functional activity or activities as the reference sequence.
  • proteins comprising Cas9 or fragments thereof are referred to as “Cas9 variants.” A Cas9 variant shares homology to Cas9, or a fragment thereof.
  • a Cas9 variant is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, at least about 99.8% identical, or at least about 99.9% identical to wild type Cas9.
  • Exemplary Cas9 variants include but are not limited to High-fidelity Cas9, eSpCas9, HypaCas9, Sniper-Cas9, and evoCas9, as well as Cas9 variants with broadened PAM compatibilities, including but not limited to VQR, VRER, VRQR, QQR1, SpCas9-NG, and xCas9. Also included are Cas9 from different species of bacteria (SpCas9, SaCas9, ScCas9, NmCas9, FnCas9, CjCas9, St1Cas9) and other DNA-targeting Cas proteins, including Cas12a (formerly Cpf1), Cas12b, and CasX.
  • “guide RNA” is a particular type of guide nucleic acid which is most commonly associated with a Cas protein of a CRISPR-Cas9 system and which associates with Cas9, directing the Cas9 protein to a specific sequence in a DNA molecule that includes complementarity to the spacer sequence of the guide RNA.
  • this term also embraces the equivalent guide nucleic acid molecules that associate with Cas9 equivalents, homologs, orthologs, or paralogs, whether naturally occurring or non-naturally occurring (e.g., engineered or recombinant), and which otherwise program the Cas9 equivalent to localize to a specific target nucleotide sequence.
  • the Cas9 equivalents may include other napDNAbp from any type of CRISPR system (e.g., type II, V, VI), including Cpf1 (a type V CRISPR-Cas system), C2c1 (a type V CRISPR-Cas system), C2c2 (a type VI CRISPR-Cas system) and C2c3 (a type V CRISPR-Cas system).
  • “guide RNA” may also be referred to as a “traditional guide RNA” to contrast it with the modified forms of guide RNA termed “prime editing guide RNAs” (or “pegRNAs”) which have been invented for the prime editing methods and compositions disclosed herein.
  • Guide RNAs or pegRNAs may comprise various structural elements that include, but are not limited to: Spacer sequence - the sequence in the guide RNA or pegRNA (having about 20 nts in length) which binds to the protospacer in the target sequence or DNA.
  • the guide RNAs used in the present disclosure may be 15-1000 nucleotides in length and comprise a sequence of at least 10, at least 15, or at least 20 contiguous nucleotides that is complementary to a target nucleotide sequence.
  • the guide RNA may comprise a sequence of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 contiguous nucleotides that is complementary to a target nucleotide sequence.
  • the guide RNA may be about 100 nucleotides or about ⁇ 100 nucleotides in length.
  • the term “prime editor” refers to the herein described fusion constructs comprising a napDNAbp (e.g., Cas9 nickase) and a reverse transcriptase, which are capable of carrying out prime editing on a target nucleotide sequence in the presence of a PEgRNA (or “extended guide RNA”).
  • the term “prime editor” may refer to the fusion protein or to the fusion protein complexed with a PEgRNA, and/or further complexed with a second-strand nicking sgRNA.
  • the prime editor may also refer to the complex comprising a fusion protein (reverse transcriptase fused to a napDNAbp), a PEgRNA, and a regular guide RNA capable of directing the second-site nicking step of the non-edited strand as described herein.
  • the reverse transcriptase component of the “prime editor” may be provided in trans.
  • the term “more robust or enhanced ribonucleoprotein (RNP) complex” refers to later generations of prime editing systems (e.g., later than the original PE system (PE1)).
  • More robust or enhanced RNP complexes can include, for example, enhanced expression level and stability (for either or both of prime editor protein and pegRNA), specialized performance (e.g., PE6 variants favoring different sequences), a more compact system to enable efficient packaging and in vivo delivery, and/or other derivative use of original PE systems (e.g., Prime-del, TwinPE, PEDAR, etc.).
  • the “CRISPRa system” or “CRISPRa” is a construct containing an appropriate expression cassette capable of targeting and activating the target promoter to enhance the expression of the target gene driven by the target promoter.
  • the “CRISPRoff system” or “CRISPRoff” is a construct containing an appropriate expression cassette capable of targeting, repressing and methylating the target promoter and reducing the expression of the target gene driven by the target promoter.
  • the term “cDNA” refers to a strand of DNA copied from an RNA template. cDNA is complementary to the RNA template.
  • the terms “upstream” and “downstream” are terms of relativity that define the linear position of at least two elements located in a nucleic acid molecule (whether single or double-stranded) that is orientated in a 5′-to-3′ direction.
  • a first element is upstream of a second element in a nucleic acid molecule where the first element is positioned somewhere that is 5′ to the second element.
  • “complementary” or “complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick or other non-traditional types.
  • a percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary).
  • “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence will hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
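The percent complementarity defined above can be computed as in the following sketch. It assumes two equal-length sequences, each written 5′ to 3′, that pair in antiparallel orientation, and it counts only Watson-Crick pairs; non-traditional pairing and hybridization stringency are not modeled. The function name and the pair table are illustrative.

```python
# Watson-Crick pairs only (DNA and RNA); an assumption of this sketch.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}

def percent_complementarity(strand, other):
    """Percentage of residues in `strand` that pair with `other`.

    Both inputs are written 5'->3' and are aligned antiparallel.
    """
    if len(strand) != len(other):
        raise ValueError("sequences must have equal length")
    paired = sum((x, y) in PAIRS
                 for x, y in zip(strand.upper(), reversed(other.upper())))
    return 100.0 * paired / len(strand)

print(percent_complementarity("AAAAAAAAAA", "TTTTTGGGGG"))
```

In this example five of ten residues pair, which corresponds to the 5-out-of-10 case given in the definition.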
  • Prime editing facilitates the precise installation of diverse genetic variants with minimal off-target effects.
  • the prime editor includes a fusion of a Cas9 nickase and reverse transcriptase (RTase), while the prime editing guide RNA (pegRNA) specifies both the target site and desired edit.
  • Prime editing offers advantages over alternatives in the context of therapeutic genome editing, molecular recording and the functional characterization of genetic variants.
  • a counterpoint to this promise is that prime editing’s efficiency is variable and often low.
  • Relevant factors likely include: 1) properties of the prime editing ribonucleoprotein (RNP) complex; 2) the target sequence and edit type; 3) trans-acting factors, e.g., endogenous DNA repair proteins; and 4) the cis-chromatin environment of the target site.
  • Conventional Cas9-mediated genome editing is influenced by chromatin, both through the steric effects of nucleosomes and epigenetic effects on the balance of endogenous DNA repair pathways used to repair Cas9-mediated DSBs.
  • Although prime editing leverages a different set of endogenous DNA repair pathways than conventional Cas9-mediated genome editing, it is proposed herein that prime editing's outcome can also be influenced by chromatin.
  • T7 polymerase linearly amplifies molecules carrying positional information before RT-PCR, which further increases sensitivity.
  • A piggyBac (PB)-based prime editing reporter bearing the T7 promoter was developed; it included the target sequence for a highly efficient pegRNA (HEK3) and a 16-bp degenerate barcode (BC).
  • the final construct was 358 bp and lacked any known regulatory elements that could potentially interfere with the local chromatin environment (FIG. 2A).
  • Subsets of aligned reads defined sharp boundaries when visualized on a genome browser, each corresponding to the precise insertion point of a synHEK3 reporter (FIG. 1B).
  • 10,095 insertion sites were identified (FIG. 2D).
  • Motif analysis of insertion junctions revealed the expected TTAA motif for piggyBac transposition (FIG. 1C). Sites lacking this motif (6.4%) were removed and the genomic coordinates of TTAA motifs assigned as locations of individual reporters. Of note, only 4,273 mapped sites (42.3%) bore a unique barcode (FIG.
  • piggyBac integrations were most strongly enriched near active transcriptional start sites (TSS) and enhancers, and most strongly depleted from quiescent regions, which are mostly constitutive heterochromatin (FIG.2I). Given these biases, it was sought to assess how broad a range of epigenetic environments was sampled by integrated reporters. For this, epigenetic scores for 2-kb windows surrounding synHEK3 integrations were computed for various chromatin features in K562 cells, and these were compared against equivalent scores for randomly selected genomic and TTAA sites.
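The insertion-site clean-up described above (discarding junctions that lack the TTAA motif, then asking how many sites carry a barcode seen only once) can be sketched as follows. The record layout with `motif` and `barcode` keys and the function names are hypothetical; the disclosure does not specify a data format.

```python
from collections import Counter

def filter_ttaa(sites):
    """Keep insertion sites whose junction motif is TTAA."""
    return [s for s in sites if s["motif"].upper() == "TTAA"]

def unique_barcode_fraction(sites):
    """Fraction of sites whose barcode occurs at exactly one site."""
    if not sites:
        return 0.0
    counts = Counter(s["barcode"] for s in sites)
    return sum(counts[s["barcode"]] == 1 for s in sites) / len(sites)

# Toy records for illustration only.
toy_sites = [
    {"motif": "TTAA", "barcode": "AAAA"},
    {"motif": "TTAA", "barcode": "CCCC"},
    {"motif": "TTAA", "barcode": "CCCC"},
    {"motif": "GTAA", "barcode": "GGGG"},
]
kept = filter_ttaa(toy_sites)
print(len(kept), unique_barcode_fraction(kept))
```

As a consistency check on the reported figures, 4,273 of 10,095 sites is 42.3%.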
  • a small fraction (11%) of H3K79me2-high sites (top 10%) had near-zero editing efficiencies (FIG.4A).
  • synHEK3 reporters were stratified by distance from the nearest TSS as well as the mRNA expression level of the overlapping or nearest gene.
  • Prime editing efficiencies were indeed correlated with both expression levels and TSS proximity (FIG. 3E; FIGs 4F, 4G).
  • genome browser views of the top 4 highly editable synHEK3 sites (90-94% PE) and 4 poorly editable synHEK3 sites (<1% PE) were generated (FIG.3F; FIG. 4I).
  • the top sites are intragenic, in highly expressed genes, and within 3.5 kb of the TSS.
  • the poorly editable examples are also intragenic, but within unexpressed genes and in some cases hundreds of kilobases from the TSS.
  • Impact of chromatin environment on prime editing in diverse endogenous target sites
  • To ask whether learnings based on synHEK3 reporters generalize, epegRNAs were designed with DeepPrime to install 3-bp CCT insertions at 121 endogenous genomic target sites and these were tested in K562 cells (predicted efficiencies 65-85%).
  • synHEK3 reporters near highly expressed genes were more frequently edited by Cas9. Reporters immediately downstream of TSSs of highly expressed genes had higher indel frequencies at Day 1, but the differences were negligible at later time points (FIG.6B).
  • synHEK3 reporters were clustered into 6 groups based on measured efficiencies of prime and Cas9 editing (FIG.5B). The reporters in each group clustered in PCA plots of chromatin feature scores, suggesting that the epigenetic environment shapes the editing efficiencies exhibited by members of each group (FIGs 6C, 6D). The larger, more highly edited groups (Groups 3-6) were further clustered into subclusters, resulting in 14 clusters overall (FIG.6E).
  • That Group 4.2 sites are amenable to Cas9 editing but not prime editing suggests that both Cas9 and prime editing benefit from chromatin accessibility, but that more active transcription specifically promotes prime editing.
  • Group 1.0 had substantially lower Cas9 editing (mean: 25.8% vs. 38.3%, p < 2.2 × 10⁻¹⁶). Without wishing to be bound by any particular theory, it was suspected that Group 1.0 corresponds to constitutive heterochromatin marked by higher H3K9me3 that remains silenced throughout the cell cycle and development, and in contrast, Group 2.0 is marked by higher H3K27me3 and might correspond to facultative heterochromatin, which is silenced upon differentiation (FIG.6F). Finally, frequencies of alleles resulting from Cas9 editing inferred to derive from non-homologous end joining (NHEJ) vs.
  • MMEJ microhomology-mediated end joining
  • a DDR-focused genetic screen was therefore designed, in which perturbations are coupled to outcomes at pre-integrated synHEK3 reporters with single cell molecular profiling.
  • IST in situ transcription
  • a library of 304 shRNAs against 76 genes was constructed, including 74 DDR-related genes (10 unexpressed) and 2 luciferase genes.
  • the DDR-related, expressed genes comprised hits found by Repair-seq, genes in other major DDR pathways, and epigenetic factors involved in H3K79me2 metabolism (FIG. 8B).
  • the lentiviral construct was modified to contain a T7 promoter upstream of the shRNA and a RT primer binding site (PBS) (FIG.8C).
  • the most prominent targets include major components of the MMR pathway (PMS2, MLH1), which can influence prime editing outcomes.
  • synHEK3-shRNA pairs targeting each MMR-related factor were consolidated (FIG. 9C; FIG. 10E).
  • the strongest prime editing-promoting effects were observed for shRNAs against PMS2 and MLH1 (homologs of bacterial MutLα; FIG. 9D), which form a heterodimer and coordinate multiple repair steps after mismatch recognition.
  • Knocking down FEN1, a 5’ DNA flap endonuclease, led to strong suppression of prime editing (FIG. 10F).
  • MSH2/MSH6 dimer recognizes 1-2-bp mismatches, while MSH2/MSH3 dimer detects longer indels (>2 bp). Without wishing to be bound by any particular theory, it was speculated that knocking down MSH6 releases sequestered MSH2, allowing for more efficient detection and correction of the 6-bp insertions; the mild effect for MSH2 might suggest a different mechanism used in recognizing 6-bp insertions.
  • The analysis focused on synHEK3 reporters with intermediate baseline editing frequencies (0.2-0.9). In clone 5, 4/21 of these sites showed much weaker increases in editing frequencies compared to reporters with similar starting editing frequencies. In clone 3, reporters were overall less responsive to HLTF inhibition, but 5/14 sites showed stronger and statistically significant upregulation of editing frequencies (FIGs 12A, 12D). Sites that were less responsive to HLTF inhibition in clone 5 showed slightly higher response to shRNAs against PMS2 (FIG. 12B). The selected synHEK3 reporters were grouped based on their responsiveness to HLTF shRNA knockdown, and their overlap with annotated genes was counted.
  • First, H3K79me2 is deposited by DOT1L, which is part of the Pol II transcriptional elongation complex. H3K79me2 is strongly predictive of intragenic prime editing efficiency. At the same time, shRNA-mediated knockdown of DOT1L failed to appreciably alter prime editing efficiencies, a result corroborated by pharmacological inhibition of DOT1L. Second, intragenic prime editing efficiencies were also correlated with transcription levels, proximity to TSS, and, more weakly, transcript orientation. Third, the analyses of the differential outcomes of prime and Cas9 editing on identical synHEK3 reporters suggested that while chromatin accessibility may enhance both prime and Cas9 editing, higher levels of transcription and TSS proximity appear to specifically promote prime editing.
  • a K562 cell line stably expressing the dCas9-VP64 fusion was transfected with a pair of gRNAs (2XMS2) targeting the target gene’s promoter, along with an MCP-p65-Rta fusion protein. Then, 2 days later, the cells were transfected with PEmax and pegRNA to program the desired mutations (FIG. 13D).
  • the epegRNAs targeting exon 1 of IL2RB were the least responsive to CRISPRa, potentially due to the CRISPRa and editing target sites being only 300 bp apart.
  • the installation of small insertions (1, 3, 6 bp) was most responsive to CRISPRa (median fold-changes ranging from 6.6- to 10.2-fold), followed by other edit types (FIGs 13G, 13H).
  • G→A and A→G edits were most responsive to CRISPRa (median fold-changes of 5.3-fold and 4.7-fold, respectively; FIG. 13H). It was concluded that the epigenetic reprogramming strategy is effective for enhancing the prime editing efficiencies of all edit types.
  • The T7 promoter-bearing reporter construct and protocol leverage T7 IVT and IST to enable questions surrounding chromatin position effects to be tackled in either bulk or single-cell format.
  • T7 IVT enabled near-complete mapping of densely integrated reporters and measurement of position-dependent regulatory effects on prime editing.
  • T7 IST enabled co-profiling of non-transcribing genomic constructs.
  • the interaction between trans-acting factors and genome editing could be further stratified by chromatin context.
  • the method is straightforward to adapt based on this disclosure, and can include, e.g., use of a 19-bp T7 promoter with any reporter construct, making it possible to capture precise genomic coordinates in any bulk or single cell assay in which reporter or effector constructs can be randomly integrated. [0116] Prime editing efficiency was measured at over 4,000 genomic locations, and the correlation of 23 chromatin features with those efficiencies was quantified.
  • H3K79me2 is an effective predictor of efficacious prime editing, but without wishing to be bound by any particular theory, multiple lines of evidence suggest that this is due to its strong correlation with active transcriptional elongation rather than a direct effect.
  • This same beta regression model, based entirely on position effects on prime editing of the synHEK3 reporter, successfully predicts the relative editing efficiencies of high-quality epegRNAs targeting endogenous genomic target sites, providing value that is orthogonal to sequence-based prediction tools.
  • HLTF was identified as a context-dependent repressor of prime editing, as knocking it down preferentially enhanced prime editing at sites undergoing active transcription.
  • HLTF represses PE3 but not PE2.
  • prime editing efficiencies of endogenous, intragenic targets can be markedly enhanced by first “conditioning” the locus with CRISPRa.
  • the effectiveness of this approach is validated for multiple loci, at a range of distances from the TSS, in both K562 and iPSC cell lines, and for all edit types.
  • K562 cells (CCL-243) were purchased from ATCC and maintained in RPMI 1640 medium (Gibco) supplemented with 10% FBS (Hyclone) and penicillin/streptomycin (Gibco, 100 U/mL).
  • HEK293T cells were maintained in DMEM medium (Gibco) supplemented with 10% FBS and penicillin/streptomycin.
  • the DHFR-dCas9-VPH (VP48-P65-HSF1) WTC11 (human iPSCs) line was generated by the Martin Kampmann lab.
  • PB-T7-HEK3-BC (synHEK3): First, a minimal piggyBac cargo construct was created by deleting all intervening sequences between the 5’ and 3’ terminal repeats (including core insulators) of the PB-CMV-MCS-EF1α-Puro vector (System Biosciences, PB510B-1).
  • a gBlock (Integrated DNA Technologies, IDT) including a filler sequence and flanking scaffold sequences (from GFP) was inserted to create a shuttle vector.
  • the filler sequence contains two divergent BsmBI recognition sites and can be removed scarlessly.
  • the shuttle vector was digested with BsmBI (New England Biolabs).
  • An 87-bp region around the HEK3 gRNA target site was synthesized by IDT and amplified with a pair of primers that introduce a T7 promoter upstream and a 16-bp barcode downstream of it.
  • the resulting PCR product was inserted into the linearized shuttle vector using NEB HiFi assembly.
  • LT3-GFP-T7-miR-E-CS1-PGK-Neomycin The LT3GEN vector was purchased from Addgene (#111173) and digested with I-SceI (New England Biolabs). A fragment containing a T7 promoter and homologous sequences was ordered from IDT and assembled into the backbone.
  • the LT3-GFP-T7-PGK-Neomycin vector was digested with XhoI and EcoRI-HF (New England Biolabs).
  • An shRNA targeting the Renilla luciferase (Ren713) was inserted along with a Capture Sequence 1 (CS1) after the EcoRI site.
  • shRNAs were ordered from IDT as 97-nt 4 nmole Ultramers (Table 2) or oPool and amplified with primers p1 (5’- ATTACTTCGACTTCTTAACCCAACAGAAGGCTCGAGAAGGTATATTGCTGTTG ACAGTGAGCG-3’; SEQ ID NO:1) and p2 (5’- AATTGCTCTTGCTAGGACCGGCCTTAAAGCGAATTCTAGCCCCTTGAAGTCCG AGGCAGTAGGCA-3’; SEQ ID NO:2).
  • the PCR products were assembled into the backbone using NEB HiFi assembly and transformed into NEB Stable Competent E.
  • Lenti-rtTA-P2A-Blast The Lenti-Cas9-P2A-Blast (Addgene: 52962) vector was digested with XbaI and BamHI (New England Biolabs) to create a backbone. rtTA was amplified from the LT3GEPIR vector (Addgene: 111177) and cloned into the backbone using NEB HiFi assembly.
  • pU6-Sp-pegRNA-HEK3-ins6N and pU6-Sp-pegRNA-HEK3-ins3N the pU6-Sp-pegRNA-HEK3-insCTT vector (Addgene:132778)1 was linearized by PCR with 5’ phosphorylated oligos p3 (5’-TCTGCCATCANNNNNNCGTGCTCAGTCTGT TTTTTTAAGCTTG-3’, ins6N; SEQ ID NO:3) or p4 (5’-TCTGCCATCANNNC GTGCTCAGTCTGTTTTTTTAAGCTTG-3’, ins3N; SEQ ID NO:4) and p5 (5’- GGACCGACTCGGTCCCACTT-3’; SEQ ID NO:5) and ligated with T4 DNA ligase.
  • pU6-Sp-dual-gRNA vectors A pU6-Sp-dual-gRNA scaffold vector was generated by replacing the pegRNA-expressing cassette of the pU6-Sp-pegRNA-HEK3-insCTT vector with a dual U6-gRNA cassette from the PX333 vector (Addgene: 64073). The second gRNA cloning site (two BsaI sites) was modified to two BsmBI sites. Spacer sequences were cloned into this vector sequentially using an oligo annealing method.
  • pU6-Sp-gRNA-2XMS2 vectors A pU6-Sp-gRNA-2XMS2 scaffold vector was generated by modifying the pU6-Sp-pegRNA-HEK3-insCTT backbone.
  • the scaffold sequence is 5’-GTTTAAGAGCTAAGCCAACATGAGGATCACCCATGT CTGCAGGGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGGCCAA CATGAGGATCACCCATGTCTGCAGGGCCAAGTGGCACCGAGTCGGTGCTTTTT TT-3’; SEQ ID NO:1370.
  • Spacer sequences were cloned in between two BsmBI sites using the oligo annealing method.
  • PB-CMV-MCP-XTEN80-p65-Rta-3xNLS-P2A-T2A-mPlum The MCP(N55K) sequence was synthesized as an IDT gBlock and amplified.
  • XTEN80 and 3XNLS-P2A were amplified from TETv4 (Addgene: 167983).
  • p65-Rta was amplified from sadCas9-VPR (Addgene: 188514).
  • mPlum was amplified from mPlum-C1(Addgene: 54839).
  • PB-CMV-PEmax-EF1a-Puro PE2max was amplified from pCMV-PEmax-P2A-hMLH1dn (Addgene: 174828) and cloned into the PB-CMV-MCS-EF1α-Puro vector.
  • PB-UCOE-EF1a-PEmax-P2A-mCherry-PGK-Blast The UCOE-EF1a was amplified from pMH0006 (Addgene: 135448). PE2max was amplified from pCMV-PEmax-P2A-hMLH1dn. The rest of the sequences were synthesized as IDT gBlocks and amplified. All PCR products were cloned into an empty piggyBac transposon vector.
  • (e)pegRNAs (pU6-Sp-(e)pegRNA) used in this example were ordered as 4 nmole Ultramers (IDT) or long primers containing the spacer and 3’ extension sequences and cloned into the backbone of pU6-Sp-pegRNA-HEK3-CTT using NEB HiFi assembly.
  • the epegRNA libraries were ordered as individual IDT eBlocks, pooled and cloned into the same backbone by Golden Gate assembly (New England Biolabs).
  • K562 PEmax(+) cell line Wild-type K562 cells were transfected with PB-UCOE-EF1a-PEmax-P2A-mCherry-PGK-Blast and pCMV-HyPBase at a ratio of 3:1 in a 100 µL nucleofection reaction. Cells were selected with 10 µg/mL blasticidin (Gibco) for 7 days. Monoclonal lines were isolated and the clone with the brightest mCherry fluorescent signal was used for subsequent experiments.
  • the plasmid pools (2 µg) were then transfected into 1 x 10^6 PEmax(+) K562 cells in duplicate. Editing rates were measured at the 121 sites and normalized using the abundance of the HEK3 plasmid and editing efficiency at the HEK3 loci in the individual pools (Table 1). Cas9 RNP editing experiment [0142] Alt-R sgRNAs were ordered from IDT and resuspended in TE buffer to 100 µM.
  • the RNP complex was assembled by combining 2.1 µL PBS, 1.2 µL Alt-R sgRNA (100 µM) and 1.7 µL Alt-R Cas9 Nuclease (61 µM, IDT) and incubating at room temperature for 10-20 min.
  • the RNP mixture was added to 1 x 10^6 cells of the PE2(+) K562 500-cell pool resuspended in nucleofection solution along with 0.5 µg pmax-GFP (Lonza) and 1 µL electroporation enhancer (100 µM, IDT).
  • gDNA was extracted at Days 1, 2 and 4 after transfection for amplicon sequencing of the synHEK3 reporter.
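As a quick check of the RNP assembly described above, the stated volumes and stock concentrations can be converted to molar amounts. This is only an illustrative calculation; the helper name `pmol` is hypothetical and not part of the disclosure.

```python
def pmol(volume_ul: float, conc_um: float) -> float:
    """Amount in pmol: 1 µL of a 1 µM solution contains 1 pmol."""
    return volume_ul * conc_um

# Volumes and stock concentrations as stated in the RNP assembly step.
sgrna_pmol = pmol(1.2, 100.0)        # Alt-R sgRNA, 100 µM stock
cas9_pmol = pmol(1.7, 61.0)          # Alt-R Cas9 Nuclease, 61 µM stock
total_volume_ul = 2.1 + 1.2 + 1.7    # PBS + sgRNA + Cas9

# Molar excess of sgRNA over Cas9 in the assembled complex.
molar_ratio = sgrna_pmol / cas9_pmol

print(f"sgRNA: {sgrna_pmol:.1f} pmol, Cas9: {cas9_pmol:.1f} pmol")
print(f"total volume: {total_volume_ul:.1f} µL, sgRNA:Cas9 = {molar_ratio:.2f}:1")
```

Under these numbers the sgRNA is in slight molar excess over Cas9 in a 5 µL assembly volume.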
  • CRISPRoff experiment [0143] On Day 0, 2 x 10^6 clone 5 cells were transfected with CRISPRoff-v2.1 (3 µg), pU6-Sp-dual-gRNA (1 µg) and pmax-GFP (500 ng, Lonza) using the SF Cell Line 4D-Nucleofector X kit L (Lonza). On Day 2, cells were sorted based on high GFP expression (top 20%) on a flow cytometer and expanded. On Day 11, 4 x 10^5 cells were transfected with the pU6-Sp-pegRNA-HEK3-insCTT plasmid (1 µg). Cells were lysed on Day 15 for amplicon sequencing of the synHEK3 reporter.
  • K562 experiments On Day 0, 4 x 10^5 K562 dCas9-VP64 cells were transfected with PB-CMV-MCP-XTEN80-p65-Rta-3xNLS-P2A-T2A-mPlum (600 ng) and paired pU6-Sp-gRNA-2XMS2 (200 ng each) plasmids targeting the same promoter.
  • iPSCs experiments On Day 0, 2 x 10^5 DHFR-dCas9-VPH WTC11 cells were transfected with PB-CMV-MCP-XTEN80-p65-Rta-3xNLS-P2A-T2A-mPlum (2.1 µg) and paired pU6-Sp-gRNA-2XMS2 (700 ng each) plasmids targeting the same promoter. 20 µM Trimethoprim (Sigma-Aldrich) was added to induce dCas9-VPH.
  • SpCas9 spacers for the selected exons of human IL2RB were generated using GuideScan2, sorted by number of off-targets and cutting efficiency, and filtered for spacers with all of the A, T, C, and G bases within the first 7 base pairs from the cut sites (which allowed for modeling all 12 possible substitutions in close proximity to the cut sites).
  • For substitutions, the first occurrences of the A, T, C, and G bases were each converted to the other 3 bases.
  • For insertions, the reverse complement of the first base after the cut sites was modeled.
  • insertions of CCT and CGTCAT were modeled at the cut sites.
  • the 34-bp loxP sequences were inserted at the cut sites.
  • sequences of 1-, 3- and 6-bp were deleted after the cut sites.
  • epegRNAs for substitutions, and insertions and deletions ≤3 bp were designed using DeepPrime.
  • the resulting epegRNAs for each exon had a median DeepPrime score > 40%.
  • For 6-bp insertions and deletions, and insertions of loxP, the epegRNAs were designed manually with a primer binding site length of 13 bp and a template length (excluding the inserted or deleted sequences) of 10 bp.
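The spacer filter and substitution design described above can be sketched as follows. This is a minimal illustration, assuming the 7-bp window after the cut site is supplied as a string; the function names and the example window `GATTCAG` are hypothetical and are not taken from the disclosure.

```python
BASES = "ACGT"

def has_all_bases(window: str) -> bool:
    """True if A, C, G and T all occur in the window after the cut site."""
    return set(BASES) <= set(window.upper())

def enumerate_substitutions(window: str):
    """For the first occurrence of each base, list conversions to the other 3 bases.

    Returns (offset_from_cut_site, reference_base, new_base) tuples.
    """
    window = window.upper()
    edits = []
    for base in BASES:
        pos = window.find(base)
        if pos == -1:
            continue  # base absent; such a spacer would fail the filter
        for alt in BASES:
            if alt != base:
                edits.append((pos, base, alt))
    return edits

window = "GATTCAG"  # hypothetical 7-bp window, for illustration only
if has_all_bases(window):
    edits = enumerate_substitutions(window)
    print(len(edits), "substitutions, e.g.", edits[:3])
```

A window that contains all four bases always yields 4 x 3 = 12 substitutions, which matches the "all 12 possible substitutions" stated in the text.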
  • Table 1 shows sequences of the epegRNAs.
  • 19 epegRNAs derived from the same spacer were synthesized as a pool (IDT oPool) and cloned into the pU6-Sp-pegRNA-HEK3-CTT backbone as a library.
  • the 7 epegRNA libraries were tested following the CRISPRa experiment procedure in the dCas9-VP64 K562 cells as described above.
  • Quantitative PCR (qPCR) analysis [0148] qPCRs were performed on purified gDNA or cDNA reverse transcribed from total RNAs using SuperScript IV reverse transcriptase (200 U/µL, Invitrogen) following manufacturer’s instructions.
  • qPCRs were performed with Power SYBR Green PCR Master Mix (Invitrogen) or KAPA2G Robust HotStart ReadyMix (Roche) supplied with SYBR Green (Invitrogen).
  • Cq values of synHEK3 were normalized to those of SNRPB (3 copies per genome).
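One common way to turn such a normalization into a copy-number estimate is the ΔCq method. The sketch below assumes perfect amplification efficiency (a doubling per cycle) and uses the stated 3 copies of SNRPB per genome as the reference; the conversion formula and the function name are assumptions for illustration, since the text states only that Cq values were normalized to SNRPB.

```python
def copies_per_genome(cq_target: float, cq_reference: float,
                      reference_copies: float = 3.0,
                      amplification_factor: float = 2.0) -> float:
    """Estimate target copies per genome from Cq values (ΔCq method).

    Assumes both amplicons amplify with the same per-cycle factor
    (2.0 means 100% efficiency); this is an illustrative assumption.
    """
    delta_cq = cq_reference - cq_target
    return reference_copies * amplification_factor ** delta_cq

# Hypothetical Cq values: target crosses threshold one cycle earlier than SNRPB.
print(copies_per_genome(cq_target=19.0, cq_reference=20.0))
```

A target detected one cycle earlier than the reference is estimated at twice the reference copy number under this assumption.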
  • Table 1 shows the list of primers used. T7-assisted reporter mapping [0149] gDNA of K562 cells was purified using the DNeasy Blood & Tissue Kit (QIAGEN) and in vitro transcribed with the HiScribe T7 High Yield RNA Synthesis Kit (New England Biolabs).
  • each reaction (20 µL) contained 0.3-1 µg gDNA, NTPs (10 mM each) and 2 µL T7 RNA Polymerase Mix. The reaction mixture was incubated at 37 °C for 16 hours. Then, gDNA was digested with 2.5 µL DNase (QIAGEN) in a 100 µL reaction at room temperature for 30 min. RNA was extracted with TRIzol LS Reagent (Invitrogen), and the aqueous phase was precipitated with 1 volume of isopropanol and 5 µg Glycogen (Invitrogen) at -80 °C for 1 hour. The RNA pellet was collected by centrifugation at 21,000 x g at 4 °C for 1 hour.
  • RNA was incubated with 0.5 µL 100 µM RT primer p6 (5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNNNNN-3’; SEQ ID NO:6) and 1 µL 10 mM dNTP at 65 °C for 5 min and cooled on ice.
  • cDNA library was amplified with KAPA2G Robust HotStart polymerase, using primers p7 (5’- GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGAAAGG AAGCCCTGCTTCCTCCAGAGGG-3’, 0.5 µM; SEQ ID NO:7) and p8 (5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3’, 0.5 µM; SEQ ID NO:8).
  • PCR reaction was performed as follows: 95 °C 3 min; 95 °C 15 s, 65 °C 15 s, 72 °C 30 s, 16-18 cycles; and 72 °C 1 min.
  • the PCR product was subjected to double-sided size selection (0.5X, 0.9X) and cleaned up with AMPure XP beads. The resulting product ranged from 200 to 1000 bp.
  • To prepare Illumina sequencing libraries, 5-10 ng PCR product was re-amplified with the Nextera P5 and TruSeq P7 library index primers as shown in Table 1 for 5 cycles.
  • the final PCR product underwent another round of double-sided size selection (0.5X, 0.9X) and clean-up with AMPure XP beads.
  • the library was sequenced on an Illumina MiSeq in paired-end mode (Read 1: 254 bp; Read 2: 55 bp).
  • gDNA of K562 cells was purified using the DNeasy Blood & Tissue Kit. Alternatively, cells were lysed in a lysis buffer [10 mM Tris-HCl pH8.0, 0.05% SDS and 0.04 mg/mL proteinase K (Thermo Scientific)], and incubated at 50 °C for 60 min and 80 °C for 30 min. 100-250 ng gDNA or cell lysates were amplified using KAPA2G Robust HotStart ReadyMix with primers p9 (5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT CTACCCCGACCACATGAAGCAGC-3’, 0.5 µM; SEQ ID NO:9) and p10 (5’- TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNNNNNNNNNGACCATGTC ATCGCGCTTCTCGT-3’, 0.5 µM; SEQ ID NO:9
  • PCR reaction was performed as follows: 95 °C 3 min; 95 °C 15 s, 68-N °C 15 s, 72 °C 30 s, for 9 cycles (N was cycle number); 95 °C 15 s, 65 °C 15 s, 72 °C 30 s, for 11 cycles; and 72 °C 1 min.
  • To ensure enough coverage and accurate measurement of editing efficiencies, for the K562 synHEK3 pool, products from at least 16 PCR reactions were pooled.
  • the PCR product was purified with AMPure XP beads. 5 ng PCR product was re-amplified with the Nextera P5 and TruSeq P7 library index primers in Table 1 for 5 cycles.
  • the final libraries were cleaned up with AMPure XP beads and sequenced on an Illumina NextSeq 500 or an Illumina NextSeq 2000 sequencer.
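The touchdown program used for synHEK3 amplicon PCR (annealing at 68-N °C for 9 cycles, then 65 °C for 11 cycles) can be written out cycle by cycle. The sketch below assumes that the cycle number N is counted from 1; the text does not state the indexing, so `first_cycle_number` is exposed as a parameter. The function name is hypothetical.

```python
def touchdown_annealing_temps(start_c: int = 68, touchdown_cycles: int = 9,
                              plateau_c: int = 65, plateau_cycles: int = 11,
                              first_cycle_number: int = 1) -> list:
    """Return the annealing temperature (°C) for every cycle of the program.

    first_cycle_number=1 is an assumption; use 0 if N is counted from zero.
    """
    touchdown = [start_c - n
                 for n in range(first_cycle_number,
                                first_cycle_number + touchdown_cycles)]
    return touchdown + [plateau_c] * plateau_cycles

temps = touchdown_annealing_temps()
for cycle, temp in enumerate(temps, start=1):
    print(f"cycle {cycle:2d}: 95 °C 15 s, {temp} °C 15 s, 72 °C 30 s")
```

With these defaults the program runs 20 amplification cycles in total, bracketed by the initial 95 °C 3 min step and the final 72 °C 1 min step given in the text.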
  • Lentivirus production and transduction [0153] 1.6 x 10^7 HEK293T cells were seeded in a 10 cm^2 dish the day before transfection. 5.6 µg lentiviral vector and 14.4 µg ViraPower Lentiviral Packaging Mix (Invitrogen) were mixed and transfected using Lipofectamine 3000 (Invitrogen) following manufacturer’s instructions. Medium was changed at 24 hours post transfection.
  • Viruses were collected at 48 and 72 hours post transfection, filtered using 45 µm filters and concentrated 100-fold using PEG-it Virus Precipitation Solution (System Biosciences).
  • K562 cells were transduced with lentivirus in the presence of 8 µg/mL polybrene (Millipore). Medium was replaced after 24 hours.
  • Pooled shRNA screen and T7 IST-assisted sci-RNA-seq3 [0155] To prepare for the pooled shRNA screening, two monoclonal lines (clone 3 and clone 5) from the original 500-cell synHEK3 pool were isolated and genotyped. There were 22 and 28 unique synHEK3 reporters in these two clones.
  • the shRNA lentiviral library was first titrated, and mixed with clone 3 and 5 rtTA(+) cells at an MOI of 10 and a >1000X coverage. Transduced cells were selected in 800 µg/mL Geneticin (Invitrogen) for 7 days. 1 x 10^6 cells were treated with 1 µg/mL doxycycline for 2 days to induce shRNA expression.
  • Nuclei were collected by centrifugation at 500 xg, 4 °C for 3 min and resuspended in 1 mL 0.3 M SPBSTM buffer with DEPC. Nuclei were fixed with 4 mL ice-cold methanol for 15 min on ice, swirled occasionally. After rehydration by adding 10 mL SPBSTM, nuclei were collected by centrifugation and washed once with 1 mL 0.3 M SPBSTM.
  • Counted nuclei were resuspended in SPBSTM at 4 x 10^6/mL.
  • 500 µL nuclei were combined with 56 µL 10 mM dNTPs (New England Biolabs) and distributed to a low-bind 96-well plate (5 µL per well). 1 µL indexed oligo-dT, HEK3 and CS1 primers (10 µM) were added to each well. The plate was incubated at 55 °C for 5 min and cooled on ice.
  • RT mixture was prepared by combining 240 µL 5X SuperScript IV Buffer, 60 µL SuperScript IV and 60 µL water. 3 µL RT mixture was added to each well.
  • the RT plate was incubated at 55 °C for 10 min and cooled on ice. 5 µL ice-cold SPBSTM was added to each well and all nuclei were pooled in pre-chilled LoBind tubes. Nuclei were washed once with 1 mL SPBSTM and resuspended in 1,200 µL SPBSTM (per ligation plate). [0159] 11 µL nuclei were distributed to each well of a new 96-well plate and mixed with 2 µL 10 µM ligation primers.
  • For each ligation plate, 195 µL 10X T4 Ligation Buffer was mixed with 65 µL T4 DNA Ligase, and 2 µL of the ligation mixture was added to each well. Ligation was performed at room temperature for 20 min on the bench. For the shRNA screen, nuclei were distributed into 4 ligation plates (384 ligation indices) to increase cell index complexity. The ligation plate was then cooled on ice and 10 µL ice-cold SPBSTM was added to pool nuclei from all wells. Nuclei were washed twice with 1 mL SPBSTM.
  • protease QIAGEN
  • EB buffer QIAGEN
  • Tn5-N7 and mosaic end oligos were resuspended to 100 µM in annealing buffer (50 mM NaCl, 40 mM Tris-HCl pH8.0), mixed at a 1:1 ratio and annealed on a thermocycler using the following program: 95 °C 5 min, cool to 65 °C (0.1 °C/s), 65 °C 5 min, cool to 4 °C (0.1 °C/s).
  • 20 µL Tagmentase (Tn5 transposase - unloaded, Diagenode) was mixed with 20 µL annealed oligos and incubated on a thermomixer at 350 rpm, 23 °C for 30 min.
  • 20 µL glycerol was added to the loaded Tn5 before storage at -20 °C.
  • 13.75 µL N7-loaded Tn5 (Diagenode) was mixed with 550 µL TD buffer and 5 µL of this mixture was added to each well. The plate was incubated at 55 °C for 5 min.
  • 50 µL 1% SDS, 50 µL BSA (New England Biolabs) and 225 µL water were mixed, and 2.6 µL of the mixture was added to each well and incubated at 55 °C for 15 min. Then, SDS was quenched by adding 2 µL 10% Tween-20 to each well.
  • For PCR, 96 indexed P5 primers were used with constant or indexed P7 primers (see Table 1).
  • a PCR master mixture was prepared by combining 2,200 µL NEBNext High-Fidelity 2X PCR Master Mix (New England Biolabs), 22 µL common P7 primer (100 µM) and 352 µL water. 2 µL indexed TruSeq P5 primer and 23.4 µL PCR mixture were added to each well. If using indexed P7 primers, 2 µL of 10 µM primer was added to each well.
  • PCR reaction was performed as follows: 70 °C 3 min; 98 °C 30 s; 98 °C 10 s, 63 °C 30 s, 72 °C 60 s, 16 cycles; and 72 °C 5 min. 3 µL of each well was pooled and cleaned up with 0.8X AMPure XP beads.
  • the library was resolved on a 1% agarose gel and the smear between 300-600 bp was extracted using Monarch DNA Gel Extraction Kit (New England Biolabs).
  • the enrichment PCR master mixture contained 2,200 µL OneTaq 2X Master Mix (New England Biolabs), 16.5 µL synHEK3 P7 enrichment primer (100 µM), 16.5 µL shRNA P7 enrichment primer (100 µM), 44 µL 100X SYBR green (Invitrogen) and 1,353 µL water. 2 µL indexed P5 primer and 33 µL PCR mixture were added to each well.
  • PCR reaction was performed as follows: 95 °C 3 min; 95 °C 15 s, 68-N °C 15 s, 72 °C 30 s, for 9 cycles (N was cycle number); 95 °C 15 s, 65 °C 15 s, 72 °C 30 s, for M cycles (decided by qPCR); and 72 °C 1 min.
  • All PCR products were pooled and concentrated using 0.9X AMPure XP beads.
  • the library was resolved on a 1% agarose gel and the two discrete bands corresponding to HEK3 and shRNA constructs were extracted using Monarch DNA Gel Extraction Kit.
  • Lysates were directly used for PCR with the KAPA2G Robust HotStart ReadyMix with primers (0.5 µM each) designed for the endogenous targets.
  • PCR reaction was performed as follows: 95 °C 3 min; 95 °C 15 s, 66-N °C 15 s, 72 °C 40 s, for 8 cycles (N was cycle number); 95 °C 15 s, 60 °C 15 s, 72 °C 40 s, for M cycles (decided by qPCR); and 72 °C 1 min.
  • the PCR product was purified with AMPure XP beads. 5 ng PCR product was re-amplified with the Nextera P5 and TruSeq P7 library index primers in Table 1 for 5 cycles.
  • the RNA libraries were indexed using the Illumina TruSeq RNA UD Indexes. All libraries were sequenced on an Illumina NextSeq 500 or an Illumina NextSeq 2000 sequencer in a paired end mode. [0167] Sequencing reads were demultiplexed using the bcl2fastq software (Illumina). For K562 PE2(+) and EPZ-5676 treated cells, an average of 34 million 75-bp paired-end reads were obtained.
  • sequencing reads from the unstranded RNA libraries were aligned to the GRCh38 reference genome (Gencode V43) using Salmon (v1.9.0).
  • For the CRISPRoff experiment, on average 55 million 50-bp paired-end reads were obtained per sample.
  • For the HLTF knockdown experiment, 20 million 59-bp paired-end reads were obtained per sample. Reads were aligned to the GRCh38 reference genome and counted against all Ensembl genes using STAR (2.7.6a). Raw counts were analyzed with DESeq2.
  • ATAC sequencing library preparation and processing [0168] 1 x 10^5 cells were collected, washed with PBS, and pelleted by centrifugation at 500 x g, 4 °C for 5 min. Cells were then lysed with 50 µL lysis buffer (10 mM Tris-HCl pH7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40, 0.1% Tween-20, 0.01% Digitonin) on ice for 3 min and neutralized with 250 µL RSB buffer with Tween-20 (10 mM Tris-HCl pH7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20).
  • Nuclei were counted on a hemocytometer and 5 x 10^4 nuclei underwent tagmentation.
  • Each 50 µL tagmentation reaction contained 25 µL 2x TD Buffer, 8.25 µL PBS, 0.5 µL 1% Digitonin, 0.5 µL 10% Tween-20, 2.5 µL Tn5 enzyme (Illumina, 2.5 µM) and 13.25 µL water. The reaction was incubated at 37 °C for 30 min. DNA was purified using the Clean and Concentrate kit (Zymo) and eluted in 10-20 µL EB Buffer (QIAGEN). All or half of the eluted DNA was used for qPCR.
  • the “-Y” option was used to enable soft clipping for supplementary alignments. 4) Reads uniquely (without XA:Z tag) and contiguously mapped near putative piggyBac landing pads (TTAA motifs) were kept using samtools (v1.9) and a custom script. 5) Aligned reads were converted to BED format using the sam2bed function in bedops (v2.4.35). Reads aligned to standard chromosomes were kept. 6) Insertion points were calculated for all reads based on strand of alignment and reads were sorted by the insertion coordinates. 7) The first 8 bp of reads were used as UMIs.
  • a custom script was used to collapse reads on a per-location, per-barcode, per-UMI basis.
  • a barcode-location-UMI count table was generated. 8) synHEK3 barcodes < 3 Levenshtein distances apart at each location were collapsed and barcodes with > 3 UMIs were kept. 9) The count table was converted to a GenomicRanges object in R. Coordinates of the last 4 base pairs of aligned reads were designated as genomic locations of the inserted synHEK3 reporters. 10) “Landing pads” of the mapped synHEK3 reporters were retrieved using the getSeq() function in the BSgenome package.
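The barcode collapsing in step 8 can be illustrated with a small sketch. The disclosure refers to a custom script whose details are not given, so the greedy strategy below (merge each barcode into the most abundant already-kept barcode within the distance threshold) is an assumption, as are the function names, the default `max_distance`, and the example barcodes.

```python
from collections import Counter

def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]

def collapse_barcodes(umi_counts: dict, max_distance: int = 2, umi_cutoff: int = 3) -> dict:
    """Collapse similar barcodes observed at one location, then filter by UMI count.

    Barcodes are visited from most to least abundant; each is merged into the
    first kept barcode within max_distance, otherwise it starts a new group.
    Only groups with more than umi_cutoff UMIs are returned.
    """
    merged = Counter()
    ranked = sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for barcode, n_umis in ranked:
        parent = next((kept for kept in merged
                       if levenshtein(barcode, kept) <= max_distance), None)
        merged[barcode if parent is None else parent] += n_umis
    return {bc: n for bc, n in merged.items() if n > umi_cutoff}

# Hypothetical 16-bp barcodes and UMI counts at a single genomic location.
counts = {
    "ACGTACGTACGTACGT": 50,
    "ACGTACGTACGTACGA": 2,   # one mismatch from the barcode above
    "TTTTCCCCGGGGAAAA": 10,
    "GGGGGGGGGGGGGGGG": 1,   # unrelated, too few UMIs to keep
}
collapsed = collapse_barcodes(counts)
print(collapsed)
```

Processing barcodes in order of abundance means that likely sequencing errors are absorbed into their parent barcode rather than the reverse.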
  • a custom script was used to align sequences to a reference sequence and count mutation frequency for every barcode.
  • For Cas9 mutagenesis analysis, all sequences were aggregated and editing outcomes (alleles) with the highest number of counts were selected. These most frequent alleles were then aligned to reference sequences using needleall with the following parameters: -gapopen 20 -gapextend 0.5 -endopen 20.
  • a custom script was used to annotate the mutational events. MMEJ alleles were selected based on the following criteria: 1) microhomology sequences being at least 2 bp; 2) observed allele frequencies being 6-fold higher than expected frequencies.
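The two MMEJ criteria above can be sketched in code. The disclosure does not describe how microhomology was measured or how expected frequencies were derived, so the approach below is an assumption: microhomology is taken as the number of positions by which a deletion can be shifted without changing the product, and the expected frequency is supplied by the caller. All names and the example sequence are hypothetical.

```python
def microhomology_length(ref: str, start: int, length: int) -> int:
    """Microhomology (bp) at the junction of a deletion of ref[start:start+length].

    Counts how far the deletion can slide right and left while still
    producing the same edited sequence.
    """
    end = start + length
    right = 0
    while end + right < len(ref) and right < length and ref[start + right] == ref[end + right]:
        right += 1
    left = 0
    while start - 1 - left >= 0 and left < length and ref[start - 1 - left] == ref[end - 1 - left]:
        left += 1
    return min(left + right, length)

def is_mmej(mh_length: int, observed_freq: float, expected_freq: float,
            min_mh: int = 2, min_fold: float = 6.0) -> bool:
    """Apply both criteria: microhomology >= 2 bp and >= 6-fold enrichment."""
    return mh_length >= min_mh and observed_freq >= min_fold * expected_freq

# Hypothetical reference: deleting 7 bp between two CGT repeats leaves one copy.
ref = "GGTCGTTTAACGTCC"
mh = microhomology_length(ref, start=3, length=7)
print(mh, is_mmej(mh, observed_freq=0.12, expected_freq=0.01))
```

Because equivalent deletions give the same microhomology length regardless of which alignment position was reported, the measure does not depend on how the aligner placed the gap.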
  • PCR duplicates were collapsed using the RT index, ligation index, UMI sequence and end coordinate of reads. Reads were further demultiplexed based on the combination of the RT, ligation and PCR index and split into files for individual cells. To generate gene expression count matrices, reads were assigned to the exonic and intronic regions of the closest genes with HTseq (v.2.0.2). Reads with ambiguous assignments were discarded. Cells were further filtered based on total UMI (> 100) and mitochondrial reads percentage (< 10%). Cells with the number of features detected between the 10th and 90th percentiles of all cells were kept and considered high-quality cells. The single cell analysis was performed using the Seurat (v4.0.0) package in R.
  • UMIs with < 3 Hamming distances were collapsed. [0176] A series of pre-processing steps were applied to the data. 1) Cell IDs were matched between the sci-RNA-seq3 transcriptome libraries and the capture libraries, and only high-quality cells (number of features between the 10th and 90th percentiles) were kept. 2) For the shRNA library, cells with < 3 or > 200 UMIs or > 20 shRNAs were removed. 3) For the synHEK3 library, cells with < 4 UMIs were first removed. UMIs were counted for synHEK3 reporters belonging to the two clones, and a clone UMI/total UMI ratio was calculated. If the ratio was > 80% for a specific clone, the cell was assigned to the corresponding clone.
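The clone assignment rule in step 3 can be sketched as below. The thresholds follow the text (at least 4 synHEK3 UMIs per cell, and a clone UMI/total UMI ratio above 80%), with the comparison operators read as described above; the function name and the per-cell input dictionary are hypothetical.

```python
from typing import Optional

def assign_clone(umis_by_clone: dict, min_total_umis: int = 4,
                 min_fraction: float = 0.8) -> Optional[str]:
    """Assign a cell to a clone from its synHEK3 UMI counts, or return None.

    umis_by_clone maps a clone name to the number of UMIs from synHEK3
    reporters that belong to that clone.
    """
    total = sum(umis_by_clone.values())
    if total < min_total_umis:
        return None  # too few synHEK3 UMIs; the cell is removed
    clone, n_umis = max(umis_by_clone.items(), key=lambda kv: kv[1])
    return clone if n_umis / total > min_fraction else None

print(assign_clone({"clone3": 9, "clone5": 1}))   # dominated by clone 3
print(assign_clone({"clone3": 3, "clone5": 2}))   # ambiguous, likely a doublet
```

Cells whose UMIs are split between the two clones remain unassigned, which also helps exclude doublets.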
  • Empirical p values of candidate synHEK3-shRNA pairs were calculated as (n_lower + 1)/(N + 1), where n_lower was the number of control tests with a raw p value lower than the candidate test’s raw p value, and N was the total number of control tests. Empirical p values were then Benjamini-Hochberg corrected, and those with eFDR < 5% were reported. Processed screen results for clone 3 and clone 5 are provided in Table 2.
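The empirical p value formula and the Benjamini-Hochberg correction can be sketched in a few lines. The example p values are hypothetical and the function names are illustrative; this is not the script used in the disclosure.

```python
def empirical_p(candidate_p: float, control_ps: list) -> float:
    """(n_lower + 1) / (N + 1), with n_lower = control tests with a lower raw p."""
    n_lower = sum(p < candidate_p for p in control_ps)
    return (n_lower + 1) / (len(control_ps) + 1)

def benjamini_hochberg(pvals: list) -> list:
    """Return BH-adjusted p values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p values for control (non-targeting) tests and candidates.
controls = [0.5, 0.2, 0.005, 0.9]
candidates = [0.01, 0.3, 0.001]
emp = [empirical_p(p, controls) for p in candidates]
efdr = benjamini_hochberg(emp)
print(emp, efdr)
```

Adding 1 to both numerator and denominator keeps the empirical p value above zero even when no control test scores lower than the candidate.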
  • For FIGs 10I, 10J, 12F, 12G, and 14B, differential gene expression or accessibility analyses were performed using the DESeq2 package in R (Wald tests).
  • For FIG. 11C, Fisher’s exact test was used to compare the differential enrichment of gene-overlapping synHEK3 reporters in the two groups.
  • For FIGs 4H, 5E-5G, 6F-6G, 11D, 12D, and 12E, p values were calculated using two-sided Kolmogorov–Smirnov tests.
  • For FIG. 14C, Welch’s two-sample t-tests were used to compare prime editing efficiency of genes before and after CRISPR activation. Table 1: List of oligos and nucleic acid sequences used in this study (part 1 of 6; oligos).
  • Table 1 columns: SEQ ID NO; Name; Note.
  • Embodiment 1. A method for modulating gene editing efficiency of a target site in a genomic region of a cell comprising: modulating chromatin accessibility of the genomic region and/or transcriptional activity of a gene in the genomic region proximal to the gene editing target site.
  • Embodiment 2. The method of Embodiment 1, wherein gene editing comprises prime editing.
  • Embodiment 3. The method of Embodiment 1 or 2, wherein the target site is downstream of a promoter of the gene.
  • Embodiment 4. The method of any one of Embodiments 1 to 3, wherein the target site is downstream of a transcriptional start site (TSS) of the gene.
  • Embodiment 5. The method of Embodiment 1 or 2, wherein the target site is upstream of a promoter of the gene.
  • Embodiment 6. The method of any one of Embodiments 1 to 5, wherein the target site is selected from the group consisting of an exon, an intron, a promoter, a 3’ UTR, a 5’ UTR, and intergenic sites in the genomic region.
  • Embodiment 7. The method of any one of Embodiments 1 to 6, wherein modulating the gene editing efficiency comprises activating transcription of the gene proximal to the gene editing target site.
  • Embodiment 8. The method of Embodiment 7, wherein activating transcription of the gene proximal to the gene editing target site comprises CRISPRa-mediated transcriptional activation.
  • Embodiment 9. The method of Embodiment 7, wherein the step of activating transcription occurs before, or during the step of prime editing of the target site.
  • Embodiment 10. The method of any one of Embodiments 1 to 6, wherein modulating the gene editing efficiency comprises repressing transcription of the gene proximal to the prime editing target site. [0190] Embodiment 11.
  • Embodiment 12. The method of any one of Embodiments 1 to 11, wherein the method further comprises modulating at least one DNA damage repair (DDR) gene in the cell.
  • Embodiment 13. The method of Embodiment 12, wherein the at least one DDR gene is a DNA mismatch repair (MMR) gene.
  • Embodiment 14. The method of Embodiment 13, wherein the MMR gene is selected from the group consisting of PMS2, MLH1, MLH2, MLH3, MLH6, EXO1, and FEN1.
  • Embodiment 15. The method of any one of Embodiments 2 to 14, wherein the method further comprises use of a more robust or enhanced prime editing ribonucleoprotein (RNP) complex.
  • Embodiment 16. The method of Embodiment 15, wherein the more robust or enhanced prime editing RNP complex comprises a prime editor selected from the group consisting of (a) a prime editor comprising an engineered reverse transcriptase, (b) a codon- and structure-optimized prime editor, (c) a compact and/or specialized prime editor, and (d) a prime editor fused to an RNA-binding protein.
  • Embodiment 17. The method of Embodiment 16, wherein (a) the prime editor comprising an engineered reverse transcriptase is PE2, (b) the codon- and structure-optimized prime editor is PEmax, (c) the compact and/or specialized prime editor is selected from PE6a, PE6b, PE6c, PE6d, PE6e, PE6f, and PE6g, or (d) the prime editor fused to an RNA-binding protein is PE7.
  • Embodiment 18. The method of any one of Embodiments 15 to 17, wherein the more robust or enhanced prime editing RNP complex comprises a structurally stabilized pegRNA (epegRNA).
  • Embodiment 19. The method of any one of Embodiments 15 to 17, wherein the more robust or enhanced prime editing RNP complex comprises a structurally stabilized pegRNA (epegRNA).
  • Embodiment 20. The method of any one of Embodiments 1 to 11, wherein the method further comprises inhibiting DNA mismatch repair (MMR).
  • Embodiment 21. The method of Embodiment 20, wherein inhibiting MMR comprises use of a second-site nicking guide RNA targeting the non-edited strand.
  • Embodiment 22. The method of Embodiment 20, wherein inhibiting MMR comprises inhibiting an MMR protein selected from the group consisting of MLH1, MLH2, MLH3, MLH6, PMS2, and EXO1.
  • Embodiment 23. The method of Embodiment 22, wherein the MMR protein is MLH1.
  • Embodiment 24. The method of Embodiment 23, wherein inhibiting MLH1 comprises expressing a dominant negative MLH1 protein.
  • Embodiment 25. The method of Embodiment 22, wherein inhibiting the MMR protein comprises RNAi-mediated knockdown of MMR protein gene expression.
  • Embodiment 26. The method of Embodiment 22, wherein inhibiting the MMR protein comprises targeted degradation of the MMR protein.
  • Embodiment 27. […]
  • Embodiment 28. The method of any one of Embodiments 1 to 27, wherein the cell is a stem cell.
  • Embodiment 29. The method of Embodiment 28, wherein the stem cell is an induced pluripotent stem cell (iPSC).
  • Embodiment 30. The method of Embodiment 29, wherein the iPSC is a human iPSC.
  • Embodiment 31. A method for determining prime editing efficiencies across a genome, comprising: (i) generating a reporter pool of cells comprising a plurality of reporter systems integrated into genomes of a plurality of cells, wherein each of the plurality of reporter systems comprises a vector comprising: (a) a promoter for in vitro transcription; (b) a target sequence for a pegRNA; and (c) a molecular identifier barcode; (ii) mapping insertion sites of the plurality of reporters in the genome of the plurality of cells; (iii) transfecting the reporter pool of cells with pegRNAs comprising a sequence for introducing a desired edit at the target sequence of the plurality of reporter systems integrated into the genome of the plurality of cells, wherein each of the plurality of cells constitutively expresses a prime editor (PE); and (iv) measuring the frequency of the desired edit in the genome of the cell.
  • Embodiment 32. The method of Embodiment 31, wherein the step of generating a reporter pool of cells comprises transfecting the plurality of cells with the plurality of reporter systems.
  • Embodiment 33. The method of Embodiment […], wherein the step of measuring the frequency of the desired edit in the genome of the cell by prime editing comprises: (i) isolating genomic DNA from the plurality of cells; (ii) amplifying reporter systems integrated in the genome of the plurality of cells to obtain a library of amplified DNAs; (iii) sequencing the library of amplified DNAs; (iv) associating the molecular identifier barcode with the edit site in the target sequence of the plurality of reporters; and (v) computationally determining frequency of edits in the target sequence of the plurality of reporter systems associated with the molecular identifier barcode.
  • Embodiment 34. The method of Embodiment 33, wherein the step of sequencing the library of amplified DNAs comprises high-throughput short- and long-read sequencing.
  • Embodiment 35. The method of any one of Embodiments 31 to 34, wherein the prime editor (PE) comprises a Cas9 nickase and a reverse transcriptase.
  • Embodiment 36. The method of Embodiment 31, further comprising correlating the prime editing efficiencies with features of epigenetic environment in the genome in an about 100 bp to about 2 kb region centered on the edit sites of the target sequences.
  • Embodiment 37. The method of Embodiment 36, wherein the features of epigenetic environment are selected from DNA binding proteins, epigenetic modulators, histone modifications, DNase hypersensitive sites, transposase-accessible chromatin (ATAC), and higher order chromatin structures.
  • Embodiment 38. The method of any one of Embodiments 31 to 37, wherein the plurality of cells comprises eukaryotic cells, preferably mammalian cells.
  • Embodiment 39. A method of mapping a genomic location of at least one reporter system in a genome of a cell, comprising: (i) integrating the at least one reporter system in the genome of the cell; and (ii) determining an insertion/integration site for the at least one reporter system within the genome of the cell.
  • Embodiment 40. The method of Embodiment 39, wherein the reporter system comprises a vector comprising a promoter for in vitro transcription.
  • Embodiment 41. The method of Embodiment 40, wherein the vector further comprises a polynucleotide sequence linked to the promoter.
  • Embodiment 42. The method of any one of Embodiments 39 to 41, wherein the polynucleotide sequence comprises an amplification primer binding domain.
  • Embodiment 43. The method of Embodiment 42, wherein the polynucleotide sequence further comprises nucleic acid sequences selected from coding or non-coding transcripts, cis-regulatory elements, molecular identifier barcode, or functional or non-functional genomic elements.
  • Embodiment 44. The method of any one of Embodiments 39 to 43, wherein the polynucleotide sequence comprises a target sequence and a molecular identifier.
  • Embodiment 45. The method of Embodiment 39, wherein the reporter system comprises a vector comprising: (a) a promoter for in vitro transcription; (b) a target sequence; and (c) a molecular identifier.
  • Embodiment 46. The method of Embodiment 45, wherein the vector is devoid of any cis-regulatory elements.
  • Embodiment 47. The method of any one of Embodiments 39 to 46, wherein the vector is selected from a transposon-based vector, a retroviral-based vector, a lentiviral-based vector, and a serine integrase-based vector.
  • Embodiment 48. The method of Embodiment 47, wherein the vector is a transposon-based vector, and wherein the vector comprises an inverted terminal repeat (ITR) at each vector end.
  • Embodiment 49. The method of Embodiment 47 or 48, wherein the transposon-based vector is a piggyBac-based vector.
  • Embodiment 50. The method of Embodiment 48 or 49, wherein the cell is co-transfected with an expression construct capable of expressing transposase.
  • Embodiment 51. The method of any one of Embodiments 40, 41, 45, and 46, wherein the promoter is selected from a bacteriophage T7 promoter, a T3 promoter, a Sp6 promoter, and variants thereof.
  • Embodiment 52. […]
  • Embodiment 53. The method of any one of Embodiments 39 to 52, wherein the step of integrating the at least one reporter system in the genome of the cell comprises transfecting the cell with the at least one reporter system.
  • Embodiment 54. The method of Embodiment […], wherein the step of determining the insertion site for the at least one reporter system within the genome of the cell comprises: (i) isolating genomic DNA of the cell comprising the at least one reporter system integrated in the genome of the cell; (ii) in vitro transcription of the isolated genomic DNA using the promoter of the reporter system to generate chimeric RNAs comprising a portion of the reporter system and of its neighboring genomic DNA; (iii) reverse transcribing the chimeric RNAs; (iv) amplifying cDNAs to generate a cDNA library comprising the reporter system integrated into the genome of the cell; (v) sequencing the cDNA library; and (vi) associating the molecular identifier barcode with the insertion site in the genomic DNA.
  • Embodiment 55. The method of Embodiment 53, wherein the step of sequencing the library of cDNAs comprises high-throughput short- and long-read sequencing, and Sanger sequencing.
  • Embodiment 56. The method of any one of Embodiments 45 to 55, wherein the cell constitutively expresses a prime editor (PE), wherein the prime editor (PE) comprises a Cas9 nickase and a reverse transcriptase.
  • Embodiment 57. The method of Embodiment 56, wherein the method further comprises transfecting the cell with at least one pegRNA.
  • Embodiment 58. The method of Embodiment 57, wherein the at least one pegRNA comprises a sequence complementary to the target sequence of the reporter system and comprises a desired edit for editing the target sequence.
  • Embodiment 59. The method of Embodiment 58, wherein the method determines editing status and frequency of a desired edit in the genome of the cell.
  • Embodiment 60. The method of any one of Embodiments 39 to 59, wherein the insertion site comprises an exon, an intron, a promoter, a 3’ UTR, a 5’ UTR, or an intergenic site in the genomic region.
  • Embodiment 61. The method of Embodiment 12, wherein the at least one DDR gene is HLTF.
  • Embodiment 62. The method of any one of Embodiments 12 to 14 and 61, wherein modulating the at least one DDR gene comprises inhibiting the at least one DDR gene.
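
The computational steps in the measuring embodiment above (associating each molecular identifier barcode with the edit state of its target sequence, then determining per-barcode edit frequency) could be sketched roughly as follows. This is only an illustrative outline, not the applicants' pipeline: it assumes reads have already been parsed into (barcode, target sequence) pairs, and the allele strings, barcode names, function names, and the `min_reads` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical alleles for illustration only; not sequences from the source.
UNEDITED = "ACGTACGT"
EDITED = "ACGTCTTACGT"  # the unedited allele carrying a 3-bp insertion


def classify(target_seq):
    """Label one read's target region as 'edited', 'unedited', or 'other'."""
    if target_seq == EDITED:
        return "edited"
    if target_seq == UNEDITED:
        return "unedited"
    return "other"


def edit_frequency_by_barcode(reads, min_reads=1):
    """Return {barcode: fraction of that barcode's reads carrying the edit}.

    reads: iterable of (barcode, target_seq) pairs, assumed to be already
    extracted from the sequenced amplicon library.
    min_reads: barcodes with fewer reads than this are left out.
    """
    counts = defaultdict(Counter)
    for barcode, target_seq in reads:
        counts[barcode][classify(target_seq)] += 1

    freqs = {}
    for barcode, per_class in counts.items():
        total = sum(per_class.values())
        if total >= min_reads:
            freqs[barcode] = per_class["edited"] / total
    return freqs


# Toy input: two reporters, identified by hypothetical barcodes.
reads = [
    ("BC01", EDITED), ("BC01", UNEDITED), ("BC01", EDITED), ("BC01", UNEDITED),
    ("BC02", UNEDITED), ("BC02", "ACGTACGA"),
]
freqs = edit_frequency_by_barcode(reads)
```

Because each barcode was mapped to one insertion site beforehand, the per-barcode frequencies can be read as per-site editing efficiencies. Reads that match neither allele are counted in the denominator here; whether to exclude them instead is a design choice the source does not specify.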
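
The correlation embodiment above relates editing efficiency to epigenetic features in a window of about 100 bp to about 2 kb centered on each edit site. One possible way to set that up is sketched below, under stated assumptions: all sites lie on a single chromosome, the feature is summarized as base pairs of peak overlap within the window, and Spearman rank correlation is used as the measure of association. The source does not name a correlation statistic, and every coordinate, value, and function name here is hypothetical.

```python
def window(center, size):
    """Half-open [start, end) window of `size` bp centered on `center`."""
    start = max(0, center - size // 2)
    return start, start + size


def overlap_bp(win_start, win_end, intervals):
    """Bases of half-open, non-overlapping `intervals` falling inside the window."""
    return sum(max(0, min(win_end, e) - max(win_start, s)) for s, e in intervals)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical edit sites (position, editing efficiency) and feature peaks.
sites = [(1000, 0.05), (5000, 0.30), (9000, 0.12), (13000, 0.45)]
peaks = [(4800, 5400), (8950, 9050), (12000, 14000)]

WINDOW_BP = 500  # any size within the stated 100 bp to 2 kb range
coverage = [overlap_bp(*window(pos, WINDOW_BP), peaks) for pos, _ in sites]
efficiency = [eff for _, eff in sites]
rho = spearman(coverage, efficiency)
```

Repeating the calculation over several window sizes within the stated range would show how sensitive the association is to the choice of window.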

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biochemistry (AREA)
  • Epidemiology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Medicinal Chemistry (AREA)
  • Plant Pathology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Microbiology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Disclosed are methods for modulating gene editing efficiency of a target site in a genomic region of a cell. Modulating gene editing efficiency can comprise modulating chromatin accessibility of the genomic region and/or modulating transcriptional activity of a gene in the genomic region proximal to the gene editing target site. Also disclosed are methods for determining prime editing efficiencies across a genome, which can be used to identify epigenetic factors influencing gene editing, for example by correlating the prime editing efficiencies with features of epigenetic environment. Further disclosed are reporter systems and corresponding methods for mapping a genomic location of at least one modification in a genome of a cell.
PCT/US2024/023808 2023-04-11 2024-04-10 Methods for identifying epigenetic factors influencing gene editing and for modulating gene editing outcome by epigenetic modulation Ceased WO2024215712A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363495372P 2023-04-11 2023-04-11
US63/495,372 2023-04-11

Publications (2)

Publication Number Publication Date
WO2024215712A2 (fr) 2024-10-17
WO2024215712A3 (fr) 2025-01-02

Family

ID=93059999

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/023808 Ceased WO2024215712A2 (fr) 2023-04-11 2024-04-10 Methods for identifying epigenetic factors influencing gene editing and for modulating gene editing outcome by epigenetic modulation

Country Status (1)

Country Link
WO (1) WO2024215712A2 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020180975A1 (fr) * 2019-03-04 2020-09-10 President And Fellows Of Harvard College Highly multiplexed base editing
WO2020236972A2 (fr) * 2019-05-20 2020-11-26 The Broad Institute, Inc. Non-class I multi-component nucleic acid targeting systems
EP4217490A2 (fr) * 2020-09-24 2023-08-02 The Broad Institute Inc. Prime editing guide RNAs, compositions thereof, and methods of using the same
EP4274894A2 (fr) * 2021-01-11 2023-11-15 The Broad Institute, Inc. Prime editor variants, constructs, and methods for enhancing prime editing efficiency and precision

Also Published As

Publication number Publication date
WO2024215712A3 (fr) 2025-01-02

Similar Documents

Publication Publication Date Title
Li et al. Chromatin context-dependent regulation and epigenetic manipulation of prime editing
US12018272B2 (en) RNA-guided human genome engineering
Esposito et al. Hacking the cancer genome: profiling therapeutically actionable long non-coding RNAs using CRISPR-Cas9 screening
Flasch et al. Genome-wide de novo L1 retrotransposition connects endonuclease activity with replication
Sharon et al. Functional genetic variants revealed by massively parallel precise genome editing
Ke et al. Quantitative evaluation of all hexamers as exonic splicing elements
van Arensbergen et al. Genome-wide mapping of autonomous promoter activity in human cells
Vlaming et al. Screening thousands of transcribed coding and non-coding regions reveals sequence determinants of RNA polymerase II elongation potential
CN113646434B Compositions and methods for efficient gene screening using tagged guide RNA constructs
Lu et al. Transcriptome-wide investigation of circular RNAs in rice
US20220267759A1 (en) Methods and compositions for scalable pooled rna screens with single cell chromatin accessibility profiling
Li et al. Optimization of genome engineering approaches with the CRISPR/Cas9 system
Schmieder et al. Enhanced genome editing tools for multi‐gene deletion knock‐out approaches using paired CRISPR sgRNAs in CHO cells
CN110343724B Method for screening and identifying functional lncRNAs
JP7244885B2 Method for screening and identifying functional lncRNAs
Lemp et al. Cryptic transcripts from a ubiquitous plasmid origin of replication confound tests for cis-regulatory function
Diaz Quiroz et al. Development of a selection assay for small guide RNAs that drive efficient site-directed RNA editing
US11946163B2 (en) Methods for measuring and improving CRISPR reagent function
US20240011055A1 (en) Precise genome deletion and replacement method based on prime editing
US20220315920A1 (en) Type i crispr system as a tool for genome editing
WO2024215712A2 (fr) Methods for identifying epigenetic factors influencing gene editing and for modulating gene editing outcome by epigenetic modulation
Stringer et al. Characterization of primed adaptation in the Escherichia coli type IE CRISPR-cas system
CN111334531A High signal-to-noise ratio negative genetic screening method
Handler et al. The Drosophila OSC Genome: A Resource for Studies of Transposon and piRNA Biology
US12612643B2 (en) RNA-guided human genome engineering

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24789339

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 24789339

Country of ref document: EP

Kind code of ref document: A2