WO2022093846A1

WO2022093846A1 - Safe harbor loci

Info

Publication number: WO2022093846A1
Application number: PCT/US2021/056689
Authority: WO
Inventors: Xinying ZHENG; Brendan GALVIN; Somya Khare; Aaron Cooper; Michelle NGUYEN; Anzhi YAO
Original assignee: Arsenal Biosciences Inc
Current assignee: Arsenal Biosciences Inc
Priority date: 2020-10-26
Filing date: 2021-10-26
Publication date: 2022-05-05
Anticipated expiration: 2023-04-26
Also published as: AU2021369494A1; CO2023006809A2; IL302315A; CA3196269A1; MX2023004822A; AU2021369494A9; DOP2023000080A; JP2023547887A; EP4232049A1; EP4232049A4; ZA202402949B; DOP2024000162A; BR112023007874A2; CL2023001176A1; CR20230220A; KR20230101839A; PE20231514A1

Abstract

Provided herein are safe harbor loci and methods for identifying and using safe harbor loci. The safe harbor loci exhibit increased knock-in efficiency and allow for increased, stable expression of transgenes.

Description

SAFE HARBOR LOCI

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to and the benefit of U.S. Provisional Patent Application No. 63/179,143, filed on April 23, 2021, U.S. Provisional Patent Application No. 63/141,926, filed on January 26, 2021, and U.S. Provisional Patent Application No. 63/105,834, filed on October 26, 2020, the entire contents of which are incorporated by reference herein for all purposes.

SEQUENCE LISTING

[0002] The instant application contains a Sequence Listing which will be submitted via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on October 15, 2021, is named ANB-203WO_SequenceListing and is 623,177 bytes in size.

BACKGROUND

[0003] Cancer continues to present a significant clinical burden despite the substantial research efforts and scientific advances in cancer therapies. Blood and bone marrow cancers are frequently diagnosed cancer types, including multiple myelomas, leukemia, and lymphomas. Current treatment options for these cancers are not effective for all patients and/or can have substantial adverse side effects. Other types of cancer also remain challenging to treat using existing therapeutic options. Cancer immunotherapies are a promising solution because they can be highly specific, allowing for increased therapeutic effectiveness and the mitigation of side effects.

[0004] Genetically engineered immune cell therapy is a growing field with promising applications for the treatment of diseases including, but not limited to, cancer. Through the alteration of coding and/or non-coding genomic regions, researchers are identifying transgenes and insertion sites within cells that facilitate, for example, enhanced cell function, arrested cell growth, induced cell death, and tumor size/volume reduction. The identification of safe harbor sites (SHS) has improved outcomes of genome-engineering therapies. Well known SHS include the AAVS1 adeno-associated virus insertion site on chromosome 19, the human homolog of the murine Rosa26 locus, and the CCR5 chemokine receptor gene, the absence of which confers HIV resistance. (See, for example, Pellenz et al., 2018, the relevant disclosures of which are herein incorporated by reference.) However, there is still a need for improved guidelines for gene editing therapies and additional SHS to address challenges such as poor knock-in (KI) efficiency, insertional oncogenesis, and unstable and/or anomalous expression of transgenes and/or adjacent genes.

SUMMARY

[0005] The present disclosure is directed, inter alia, to safe harbor loci that exhibit high knock-in efficiency and stable expression of their transgenes. These safe harbor loci can be used to alter T cells for immunotherapy. These safe harbor loci are useful for the treatment of various diseases, including cancer.

[0006] In one aspect, the present disclosure provides an engineered cell, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus, the safe harbor locus is at any one or more of the sgRNA target loci provided in Table 4; and wherein expression of the at least one sequence encoding the transgene is operatively linked to an endogenous promoter. In another aspect, the present disclosure provides an engineered cell, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus, the safe harbor locus is at any one or more of the sgRNA target loci provided in Table 4; and wherein expression of the at least one sequence encoding the transgene is operatively linked to an exogenous promoter.

[0007] In some embodiments, the target locus is selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, and chr9:7970000-7980000. In some embodiments, the target locus is selected from: chr10:72290000-72300000, chr11:128340000-128350000, chr15:92830000-92840000, and chr16:11220000-11230000. In some embodiments, the target locus is chr11:128340000-128350000. In some embodiments, the target locus is chr15:92830000-92840000. In some embodiments, the target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.
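By way of illustration only, the target loci recited above are written as chromosome intervals of the form chrN:start-end. The following minimal Python sketch shows one way such interval strings could be parsed and used to test whether a given genomic position falls within a recited locus. The names `Locus`, `parse_locus`, `contains`, `find_locus`, and `TARGET_LOCI` are hypothetical and do not appear in the disclosure. Treating both interval ends as inclusive is an assumption, and the genome assembly to which the coordinates refer is not specified in this sketch.

```python
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    """A genomic interval (hypothetical helper type, not from the disclosure)."""
    chrom: str
    start: int
    end: int


def parse_locus(text: str) -> Locus:
    """Parse a 'chrN:start-end' string, ignoring stray spaces."""
    chrom, span = text.replace(" ", "").split(":")
    start, end = (int(part) for part in span.split("-"))
    if start > end:
        raise ValueError(f"start exceeds end in {text!r}")
    return Locus(chrom, start, end)


def contains(locus: Locus, chrom: str, pos: int) -> bool:
    """True if (chrom, pos) lies inside the locus; both ends inclusive (assumption)."""
    return chrom == locus.chrom and locus.start <= pos <= locus.end


# Four of the target loci recited in the paragraph above.
TARGET_LOCI = [
    parse_locus(s)
    for s in (
        "chr10:72290000-72300000",
        "chr11:128340000-128350000",
        "chr15:92830000-92840000",
        "chr16:11220000-11230000",
    )
]


def find_locus(chrom: str, pos: int) -> Optional[Locus]:
    """Return the first recited locus containing the position, or None."""
    for locus in TARGET_LOCI:
        if contains(locus, chrom, pos):
            return locus
    return None
```

For example, under these assumptions a call such as `find_locus("chr11", 128345000)` would return the chr11:128340000-128350000 interval, while a position outside every listed interval would return `None`.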

[0008] In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS88, GS89, GS90, GS91, GS92, GS93, GS94, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, GS120. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, GS93, GS94, GS95, GS96, GS100, GS101, GS102, GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, and GS96. In some embodiments, the safe harbor locus is the GS94 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, and GS102. In some embodiments, the safe harbor locus is the GS102 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, and GS93.

[0009] In some embodiments, the exogenous promoter is an EF1α promoter. In some embodiments, the engineered cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor. In some embodiments, the transgene encodes a recombinant protein, optionally a therapeutic agent. In some embodiments, the transgene encodes a chimeric antigen receptor (CAR).

[0010] In another aspect, the present disclosure provides a composition comprising the engineered cell as described herein and a pharmaceutical excipient.

[0011] In another aspect, the present disclosure provides a guide ribonucleic acid (gRNA) for editing a cell at a safe harbor locus, wherein the gRNA comprises any one of the sgRNA sequences in Table 4.

[0012] In some embodiments, the gRNA comprises any one of SEQ ID NOS: 1-120. In some embodiments, the gRNA comprises any one of SEQ ID NOS: 91-96 and 100-105. In some embodiments, the gRNA comprises SEQ ID NO: 94 or SEQ ID NO: 102. In some embodiments, the gRNA comprises SEQ ID NO: 94. In some embodiments, the gRNA comprises SEQ ID NO: 102. In some embodiments, the cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

[0013] In another aspect, the present disclosure provides a method of editing a cell having chromosomal DNA, comprising inserting at least one sequence encoding a transgene within a safe harbor locus in the chromosomal DNA of the cell, wherein the safe harbor locus is any one or more of the sgRNA target loci provided in Table 4.

[0014] In some embodiments, the target locus is selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, and chr9:7970000-7980000. In some embodiments, the target locus is selected from: chr10:72290000-72300000, chr11:128340000-128350000, chr15:92830000-92840000, and chr16:11220000-11230000. In some embodiments, the target locus is chr11:128340000-128350000. In some embodiments, the target locus is chr15:92830000-92840000. In some embodiments, the target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[0015] In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS88, GS89, GS90, GS91, GS92, GS93, GS94, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, GS120. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, GS93, GS94, GS95, GS96, GS100, GS101, GS102, GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, and GS96. In some embodiments, the safe harbor locus is the GS94 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, and GS102. In some embodiments, the safe harbor locus is the GS102 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, and GS93.

[0016] In some embodiments, the transgene encodes a recombinant protein, optionally a therapeutic agent. In some embodiments, the transgene encodes a chimeric antigen receptor (CAR). In some embodiments, the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene. In some embodiments, the exogenous promoter is an EF1α promoter. In some embodiments, the cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor. In some embodiments, the at least one sequence is inserted using homology-directed repair. In some embodiments, the at least one sequence is inserted using homology-independent targeted insertion. In some embodiments, the at least one sequence is inserted using one or more guide ribonucleic acids (gRNAs) and one or more Cas9 endonucleases.

[0017] In some embodiments, the one or more gRNAs comprises any one of SEQ ID NOS: 1-120. In some embodiments, the one or more gRNAs comprises any one of SEQ ID NOS: 91-96 and 100-105. In some embodiments, the gRNA comprises SEQ ID NO: 94 or SEQ ID NO: 102. In some embodiments, the gRNA comprises SEQ ID NO: 94. In some embodiments, the gRNA comprises SEQ ID NO: 102.

[0018] In another aspect, the present disclosure provides a method of editing a T cell, comprising contacting a T cell with one or more guide ribonucleic acids (gRNAs), at least one sequence encoding a transgene, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within a safe harbor locus, wherein the safe harbor locus is selected from any one or more of the sgRNA target loci in Table 4.

[0019] In some embodiments, the one or more gRNAs comprises a sequence selected from any one of the sgRNA sequences in Table 4. In some embodiments, the one or more gRNAs comprises any one of SEQ ID NOS: 1-120. In some embodiments, the one or more gRNAs comprises any one of SEQ ID NOS: 91-96 and 100-105. In some embodiments, the gRNA comprises SEQ ID NO: 94 or SEQ ID NO: 102. In some embodiments, the gRNA comprises SEQ ID NO: 94. In some embodiments, the gRNA comprises SEQ ID NO: 102.

[0020] In some embodiments, the target locus is selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, and chr9:7970000-7980000. In some embodiments, the target locus is selected from: chr10:72290000-72300000, chr11:128340000-128350000, chr15:92830000-92840000, and chr16:11220000-11230000. In some embodiments, the target locus is chr11:128340000-128350000. In some embodiments, the target locus is chr15:92830000-92840000. In some embodiments, the target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[0021] In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS88, GS89, GS90, GS91, GS92, GS93, GS94, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, GS120. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, GS93, GS94, GS95, GS96, GS100, GS101, GS102, GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, and GS105. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, and GS96. In some embodiments, the safe harbor locus is the GS94 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, and GS102. In some embodiments, the safe harbor locus is the GS102 integration site in Table 4. In some embodiments, the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, and GS93.

[0022] In another aspect, the present disclosure provides an ex vivo method of obtaining an engineered cell or population thereof, comprising (a) obtaining a cell; and (b) genetically modifying the cell by inserting at least one sequence encoding a transgene within a safe harbor locus, wherein the safe harbor locus is selected from any one of the sgRNA target loci in Table 4.

[0023] In some embodiments, obtaining the cell comprises: (i) collecting a tissue sample from a subject, (ii) isolating the cells from the tissue sample, and (iii) culturing the cells in vitro. In some embodiments, the tissue sample is a blood sample. In some embodiments, the cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or T cell precursor (T cell progenitor). In some embodiments, the at least one sequence is inserted using homology-directed repair. In some embodiments, the at least one sequence is inserted using homology-independent targeted insertion.

[0024] In some embodiments, the genetically modifying in step (b) comprises contacting the cell with one or more guide ribonucleic acids (gRNAs), the at least one sequence, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within the safe harbor locus. In some embodiments, the one or more gRNAs comprises a sequence selected from any one of the sgRNA sequences in Table 4.

[0025] In some embodiments, the transgene encodes a recombinant protein, optionally a therapeutic agent. In some embodiments, the transgene encodes a chimeric antigen receptor (CAR). In some embodiments, the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene. In some embodiments, the exogenous promoter is an EF1α promoter.

[0026] In yet another aspect, the present disclosure provides a method of treating a subject having or at risk of having a disease, comprising administering to the subject an effective amount of an engineered cell as described herein, a population thereof, or a composition as described herein. In some embodiments, the cell, the population thereof, or the composition is administered to the subject by infusion.

[0027] In yet another aspect, the present disclosure provides a method of treating a subject having or at risk of having a disease, comprising (a) conducting any one of the methods described supra, and (b) administering to the subject an effective amount of a composition comprising the cell or a population thereof. In some embodiments, the composition is administered to the subject by infusion. In some embodiments, the disease is cancer. In some embodiments, the disease is blood cancer.

[0028] In another aspect, the present disclosure provides a method of identifying a safe harbor locus, comprising: (a) identifying genes or non-coding regions in a chromosome that are above a threshold level for expression across developmental cell states and/or a threshold level for accessibility of chromatin; (b) generating a linear model that correlates the gene or non-coding region from step (a) with knock-in (KI) efficiency and estimates the KI efficiency of any gene or coding region on the chromosome; and (c) selecting the safe harbor locus based on threshold parameters; wherein the safe harbor locus is selected for insertion of at least one sequence encoding a transgene within a cell.

[0029] In some embodiments, the threshold parameters include one or more of: stable expression of a transgene, knockout of the gene confers benefit to the function of the cell, no known function within the cell, stable transgene expression in vitro with or without CD3/CD28 stimulation, negligible off-target cleavage as detected by iGuide-Seq or CRISPR-Seq, less off-target cleavage relative to other loci as detected by iGuide-Seq or CRISPR-Seq, negligible transgene-independent cytotoxicity, negligible transgene-independent cytokine expression, negligible transgene-independent chimeric antigen receptor expression, negligible deregulation or silencing of nearby genes, and positioned outside of a cancer-related gene. In some embodiments, the stable expression of a transgene at the safe harbor locus is less than or equal to 2-fold expression change over the course of at least 1, 2, 3, 4, 5, 6, or 7 days, and wherein expression change is measured by mean fluorescence intensity of a reporter gene encoded by the at least one sequence. In some embodiments, the accessibility of chromatin is measured using an assay for transposase-accessible chromatin using sequencing (ATAC-seq). In some embodiments, the level of expression across developmental cell states is measured using RNA sequencing (RNA-seq).
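As a non-limiting illustration of the stability criterion above, the following Python sketch checks whether reporter mean fluorescence intensity (MFI) readings taken over a time course stay within a 2-fold change. The function names `max_fold_change` and `is_stable` are hypothetical. The disclosure does not state which time point serves as the reference for the fold change; the sketch assumes the ratio of the highest to the lowest reading across the time course, which is the most conservative reading.

```python
from typing import Mapping


def max_fold_change(mfi_by_day: Mapping[int, float]) -> float:
    """Largest fold change across the time course.

    Assumption: fold change is taken as max(MFI) / min(MFI) over all
    time points, since no reference time point is specified.
    """
    values = list(mfi_by_day.values())
    if not values:
        raise ValueError("no MFI readings supplied")
    if min(values) <= 0:
        raise ValueError("MFI readings must be positive")
    return max(values) / min(values)


def is_stable(mfi_by_day: Mapping[int, float], max_fold: float = 2.0) -> bool:
    """True if expression changes by no more than max_fold (default 2-fold)."""
    return max_fold_change(mfi_by_day) <= max_fold
```

Here the mapping keys are days after editing and the values are MFI readings for the reporter; a time course whose highest reading is at most twice its lowest would be classed as stable under this assumed definition.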

[0030] In some embodiments, the cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or T cell progenitor. In some embodiments, the linear model has a coefficient of determination (R² value) of at least 30%.
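To make the coefficient of determination concrete, the following Python sketch fits an ordinary least-squares line relating a single predictor (for example, a scaled expression or chromatin accessibility score) to KI efficiency and computes R². This is a simplified, single-predictor illustration under stated assumptions and is not represented to be the model of the disclosure, which may combine several predictors. The function names `fit_line`, `r_squared`, and `meets_r2_threshold` are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x: Sequence[float], y: Sequence[float],
              slope: float, intercept: float) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0:
        raise ValueError("response has zero variance")
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / ss_tot


def meets_r2_threshold(r2: float, minimum: float = 0.30) -> bool:
    """True if R^2 is at least the stated 30% threshold."""
    return r2 >= minimum
```

In use, `x` would hold the predictor values for the measured loci and `y` the corresponding measured KI efficiencies; the fitted slope and intercept could then be applied to estimate KI efficiency at loci that were not measured.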

[0031] In yet another aspect, the present disclosure provides an engineered cell, composition, gRNA or method as described herein, wherein insertion within the safe harbor locus increases cell cytotoxicity of diseased cells.

[0032] In yet another aspect, the present disclosure provides an engineered cell, composition, gRNA or method as described herein, wherein knock-in efficiency at the safe harbor locus is increased relative to other locations along the chromosome.

BRIEF DESCRIPTION OF THE DRAWINGS

[0033] These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, and accompanying drawings, where:

[0034] FIG. 1 includes a schematic depicting the methodology used to identify safe harbor loci in the present disclosure, in some embodiments.

[0035] FIGS. 2A-2C include schematics depicting three top safe harbor loci known in the art: AAVS1 (FIG. 2A), CCR5 (FIG. 2B), and Rosa26 (FIG. 2C). The figures were retrieved from Sadelain, M., et al. (2012). Safe harbours for the integration of new DNA in the human genome. Nature reviews Cancer, 12(1), 51-58, the relevant disclosures of which are herein incorporated by reference in their entirety.

[0036] FIG. 3 includes a schematic outlining the data provided in Roth, T. L., et al. 2019. Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561, the relevant disclosures of which are herein incorporated by reference in their entirety.

[0037] FIGS. 4A-4B include heatmaps with samples clustered based on activation status and cell type, generated with processed RNA-seq data. FIG. 4A includes a heatmap of samples generated with processed RNA-seq data. FIG. 4B includes a heatmap sample generated with processed RNA-seq data, showing the clustering of ~20k genes.

[0038] FIGS. 5A-5B include plots generated using processed RNA-seq data. FIG. 5A includes a plot showing direct correlation between transcript expression data from Roth, T.L., et al. 2019 and transcript expression data generated by the inventors of the present disclosure. FIG. 5B includes a heatmap showing clusters of the top 10% expressed genes.

[0039] FIG. 6 includes a heatmap of samples generated with processed ATAC-seq data.

[0040] FIGS. 7A-7B include plots depicting the signal enrichment at transcription start sites (TSS) (FIG. 7A) and peak size distribution (FIG. 7B), generated with processed ATAC-seq data for coding regions.

[0041] FIG. 8 includes a plot depicting the open chromatin regions around the TRAC locus.

[0042] FIG. 9 includes a plot depicting KI (knock-in) efficiency for coding regions/genes with GFP as a reporter.

[0043] FIG. 10 includes a plot depicting KI efficiency for coding regions/genes with tNGFR as a reporter.

[0044] FIG. 11 includes plots showing scaled KI efficiency for 90 genes vs d2 (day-2) RNA-seq data, d4 (day-4) RNA-seq data and ATAC-seq data, utilized for the predictive linear model.

[0045] FIG. 12 includes a plot showing chromatin accessibility for a top candidate noncoding region, measured using ATAC sequencing.

[0046] FIG. 13A includes plots showing the cell counts for GFP controls used for the evaluation of candidate KI loci. FIG. 13B includes pmax (GFP high) readings for GFP controls used for the evaluation of candidate KI loci.

[0047] FIG. 14A includes plots showing the maximum episomal GFP expression (GFP high) readings for non-targeting controls. FIG. 14B includes the cell count for non-targeting controls from donors 1, 2, and 3.

[0048] FIG. 15 includes the cell count and maximum GFP expression (GFP high) readings for WT controls.

[0049] FIG. 16 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA5, which targets a B2M safe harbor locus, and the construct expression was driven by an endogenous promoter.

[0050] FIG. 17 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA5, which targets a B2M safe harbor locus, and the construct expression was driven by an EF1α promoter.

[0051] FIG. 18 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA79 to target a TRAC safe harbor locus, and the construct expression was driven by an endogenous promoter.

[0052] FIG. 19 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA79, which targets a TRAC safe harbor locus, and the construct expression was driven by an exogenous promoter.

[0053] FIG. 20 includes plots showing TCR (T cell receptor) vs. GFP among all donors and time points when using sgRNA79, and the construct is driven by an endogenous promoter.

[0054] FIG. 21 includes plots showing TCR vs. GFP among all donors and time points when using sgRNA79 and the construct is driven by an exogenous promoter.

[0055] FIG. 22 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA83, which targets a TRAC safe harbor locus, and the construct expression was driven by an endogenous promoter.

[0056] FIG. 23 includes plots showing the cell count and maximum GFP expression (GFP high) readings when using sgRNA83, which targets a TRAC safe harbor locus, and the construct expression was driven by an exogenous promoter.

[0057] FIG. 24 includes plots showing TCR vs. GFP among all donors and time points when using sgRNA83 and the construct is driven by an endogenous promoter.

[0058] FIG. 25 includes plots showing TCR vs. GFP among all donors and time points when using sgRNA83 and the construct is driven by an exogenous promoter.

[0059] FIG. 26 includes plots illustrating potential sources of variation (e.g., edge effects and electroporation errors) observed between replicates and donors.

[0060] FIG. 27 includes plots illustrating potential sources of variation observed between replicates and donors, e.g., relating to inherent differences between donors.

[0061] FIG. 28 includes plots illustrating potential sources of variation observed between replicates and donors, e.g., relating to gating errors.

[0062] FIG. 29 includes plots showing the GFP mean fluorescence intensity (GFP MFI) and KI efficiency for top KI loci evaluated with endogenous promoters.

[0063] FIG. 30 includes plots showing the GFP mean fluorescence intensity (GFP MFI) and KI efficiency for top KI loci evaluated with the EF1α promoter.

[0064] FIGS. 31A-31C include plots showing all the significant KI loci (top KI loci) as evaluated with endogenous and EF1α promoters. FIG. 31A includes a plot showing that the expression from the EF1α promoter was approximately 10 times higher than expression from an endogenous promoter. FIG. 31B shows the top KI loci ranked by GFP MFI at week 3. FIG. 31C shows the top integration loci ranked by GFP MFI (at weeks 3 and 4; donors 1-3).

[0065] FIG. 32A includes a plot showing some target loci and their measured transgene expression levels. FIG. 32B includes plots showing the transgene (Prime Receptor (PrimeR)) and TCR expression for the control and insertion at GS94, GS102 and TRAC loci.

[0066] FIG. 33A includes a plot showing the PrimeR levels measured for the indicated integration sites. FIG. 33B includes a schematic showing the GS94 integration site on Chromosome 11.

[0067] FIG. 34A includes plots showing CAR induction and PrimeR expression of engineered T cells after 48 hours of coculturing with K562-CD19 cells. FIG. 34B includes plots showing the cytotoxicity and cytokine secretion levels for engineered T cells after 48 hours of coculturing with K562-CD19/MSLN cells.

[0068] FIG. 35A includes a schematic showing the experimental overview for evaluating the effect of integration site on cytotoxicity. FIG. 35B includes plots showing the measured cytotoxicity for engineered T cells cocultured for 48 hours with the K562 CD19+/MSLN+ or K562 CD19-/MSLN+ cells.

[0069] FIG. 36A includes a schematic showing the experimental overview for evaluating the effect of integration site on cytokine secretion. FIG. 36B includes plots showing the measured cytokine levels for engineered T cells cocultured for 48 hours with K562 CD19+/MSLN+ cells.

[0070] FIG. 37 includes a schematic showing the in vitro experiment conducted to determine the effect of integration site on PrimeR-independent CAR expression. “Flow” refers to flow cytometry and “restim” refers to repetitive CD3/CD28 stimulation of the engineered T cells. “EP” refers to electroporation.

[0071] FIGS. 38A and 38B include plots showing the stability of PrimeR expression over time when using the indicated integration sites. “Flow” refers to flow cytometry and “restim” refers to repetitive CD3/CD28 stimulation of the engineered T cells. “EP” refers to electroporation. In FIG. 38B, the PrimeR expression is normalized to the expression from using the TRAC integration site.

[0072] FIG. 39A includes a schematic showing the iGuide-Seq assay technique. FIG. 39B includes a plot showing the on-target efficiency, using the iGuide-Seq assay, for the indicated integration sites. FIG. 39C includes schematics from the iGuide-Seq analysis showing that GS94 had no reproducible putative off-targets across two donors.

[0073] FIG. 40 includes a schematic showing the iGuide-Seq workflow and data.

[0074] FIG. 41 includes a plot showing rhAmp-seq analysis of putative off-target sites identified by iGUIDE-seq and Elevation prediction.

[0075] FIG. 42 includes plots showing RNA-seq analysis of cells with GS94, GS102 and TRAC knock-in of CD19/MSLN circuits. Each plot is a scatterplot of gene expression in cells with integration at the GS94 locus (y-axis) vs cells with integration at either the TRAC or the GS102 locus (x-axis) in two donors. The yellow dots correspond to ETS1 and FLI1. In blue are the genes that were found to be differentially expressed using edgeR (fold-change > 0, FDR-corrected p-value < 0.01, average counts-per-million across compared conditions at least 2).
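For illustration, the differential-expression cutoffs recited for FIG. 42 can be expressed as a simple filter over per-gene results. The Python sketch below does not call edgeR and is not the analysis code of the disclosure; it only applies the stated thresholds to values that a differential-expression tool would report. The record type `GeneResult`, its field names, and the function `is_differential` are hypothetical. The recited "fold-change > 0" criterion is read here as a non-zero log fold change in either direction, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """Per-gene summary (hypothetical field names, not from the disclosure)."""
    gene: str
    log_fc: float    # log fold change between the compared conditions
    fdr: float       # FDR-corrected p-value
    mean_cpm: float  # average counts-per-million across compared conditions


def is_differential(result: GeneResult,
                    min_abs_log_fc: float = 0.0,
                    max_fdr: float = 0.01,
                    min_mean_cpm: float = 2.0) -> bool:
    """Apply the recited cutoffs.

    Assumption: 'fold-change > 0' is interpreted as |log fold change| > 0.
    """
    return (abs(result.log_fc) > min_abs_log_fc
            and result.fdr < max_fdr
            and result.mean_cpm >= min_mean_cpm)
```

Keeping the cutoffs as keyword parameters makes it straightforward to apply a stricter fold-change requirement without changing the filter logic.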

[0076] FIG. 43 includes plots showing the absence of cytokine-independent growth in cells with CD19/MSLN circuit KI at GS94.

[0077] FIG. 44 shows a diagram of an 8.3 kb cassette that was inserted into the GS94 safe harbor locus.

[0078] FIG. 45 shows the expression of an 8.3 kb transgene circuit comprising a priming receptor and CAR in K562 cells.

[0079] FIG. 46 shows that non-viral editing generated less differentiated T cells.

DETAILED DESCRIPTION

[0080] The present disclosure provides safe harbor loci and methods for identifying safe harbor loci that exhibit high integration efficiency (e.g., high knock-in (KI) efficiency), high and constant levels of transgene expression, and such benefits independent of T cell activation/differentiation state. In some embodiments, the safe harbor loci also exhibit minimal to no disruption to T cell function and/or capacity for product manufacturing. In some embodiments, these loci are useful for effective and safe integration and expression of transgenes in T cells (e.g. in CAR T therapies). In some embodiments, the methods described herein can be used for the identification of safe harbor loci for insertion of transgenes in other types of cells.

[0081] FIG. 1 illustrates the overall approach that the inventors of the present disclosure used to identify safe harbor loci. In some embodiments, the present disclosure provides a method comprising the identification of genes within a genome and non-coding regions with sustained expression in a treatment cell (e.g. T cell) and using a predictive model of KI efficiency as a function of T cell chromatin state and computational analysis to predict candidate integration sites. The method further comprises evaluating the candidate integration sites for actual KI efficiency, sustained levels of expression of a transgene, and minimal disruption to the treatment cell phenotype (e.g., T cell function and/or capacity for treatment product expansion and manufacturing). In some embodiments, the safe harbor loci allow for integration of a transgene driven by an endogenous promoter. In some embodiments, the safe harbor loci allow for integration of a transgene driven by an exogenous promoter (e.g. EF1α promoter).

[0082] To facilitate an understanding of the present disclosure, a number of terms and phrases are defined below. Unless otherwise defined herein, scientific and technical terms used in this application shall have the meanings that are commonly understood by those of ordinary skill in the art. Generally, nomenclature used in connection with, and techniques of, pharmacology, cell and tissue culture, molecular biology, cell and cancer biology, neurobiology, neurochemistry, virology, immunology, microbiology, genetics and protein and nucleic acid chemistry, described herein, are those well-known and commonly used in the art. In case of conflict, the present specification, including definitions, will control.

[0083] Unless otherwise defined, all terms of art, notations and other scientific terminology used herein are intended to have the meanings commonly understood by those of skill in the art. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a difference over what is generally understood in the art. The techniques and procedures described or referenced herein are generally well understood and commonly employed using conventional methodologies by those skilled in the art, such as, for example, the widely utilized molecular cloning methodologies described in Sambrook et al., Molecular Cloning: A Laboratory Manual 4th ed. (2012) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. As appropriate, procedures involving the use of commercially available kits and reagents are generally carried out in accordance with manufacturer-defined protocols and conditions unless otherwise noted.

[0084] As used herein, the singular forms “a,” “an,” and “the” include the plural referents unless the context clearly indicates otherwise. The terms “include,” “such as,” and the like are intended to convey inclusion without limitation, unless otherwise specifically indicated.

[0085] As used herein, the term “comprising” also specifically includes embodiments “consisting of” and “consisting essentially of” the recited elements, unless specifically indicated otherwise.

[0086] The term “about” indicates and encompasses an indicated value and a range above and below that value. In certain embodiments, the term “about” indicates the designated value ± 10%, ± 5%, or ± 1%. In certain embodiments, where applicable, the term “about” indicates the designated value(s) ± one standard deviation of that value(s).
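The percentage-based reading of “about” can be illustrated with simple arithmetic. The following Python sketch is illustrative only; the function name `about_range` and the example value of 50 are hypothetical and do not appear in the disclosure.

```python
def about_range(value: float, percent: float) -> tuple[float, float]:
    """Return the (low, high) bounds for `value` plus or minus `percent` percent.

    Hypothetical helper: it only illustrates the +/- 10%, 5%, or 1%
    readings of "about"; it does not cover the standard-deviation reading.
    """
    delta = abs(value) * percent / 100.0
    return (value - delta, value + delta)


if __name__ == "__main__":
    # Example: the three percentage tolerances applied to a designated value of 50.
    for pct in (10, 5, 1):
        low, high = about_range(50, pct)
        print(f"about 50 (+/- {pct}%): {low} to {high}")
```

Under this reading, a designated value of 50 with a 10% tolerance spans 45 to 55.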

[0087] As used herein, the term “gene” refers to the basic unit of heredity, consisting of a segment of DNA arranged along a chromosome, which codes for a specific protein or segment of protein. A gene typically includes a promoter, a 5' untranslated region, one or more coding sequences (exons), optionally introns, and a 3' untranslated region. The gene may further comprise a terminator, enhancers and/or silencers.

[0088] As used herein, the term “locus” refers to a specific, fixed physical location on a chromosome where a gene or genetic marker is located.

[0089] As used herein, the term “target locus” refers to a locus on a chromosome within which a safe harbor locus can be used for the insertion of a sequence. A target locus can consist of multiple potential safe harbor loci (integration sites). Examples of target loci are provided in Table 4, as sgRNA target loci. The notation used for the sgRNA target loci in Table 4 refers to the genomic region of the target locus, defined by the chromosome of the target locus and the coordinate range for that target locus. For example, chr10:33130000-33140000 refers to a target locus on Chr10 (chromosome 10) starting from coordinate 33130000 and ending with coordinate 33140000.
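The chromosome-and-coordinate notation described above can be read mechanically. The following is a minimal Python sketch of one way to parse such a string. The names `TargetLocus` and `parse_target_locus` are hypothetical and not part of the disclosure, and the accepted chromosome names (1 to 22, X, Y, M) and the tolerance for spaces and thousands separators are assumptions.

```python
import re
from typing import NamedTuple


class TargetLocus(NamedTuple):
    """Hypothetical container for a parsed target locus."""
    chromosome: str
    start: int
    end: int


# Assumed format: "chr" + chromosome name, a colon, then start-end coordinates.
_LOCUS_RE = re.compile(r"^chr(\d{1,2}|[XYM]):([\d,]+)-([\d,]+)$", re.IGNORECASE)


def parse_target_locus(notation: str) -> TargetLocus:
    """Parse a string such as 'chr10:33130000-33140000' into its parts."""
    compact = "".join(notation.split())  # drop any embedded whitespace
    match = _LOCUS_RE.match(compact)
    if match is None:
        raise ValueError(f"unrecognized locus notation: {notation!r}")
    chrom, start, end = match.groups()
    return TargetLocus(
        chromosome="chr" + chrom.upper(),
        start=int(start.replace(",", "")),
        end=int(end.replace(",", "")),
    )


if __name__ == "__main__":
    locus = parse_target_locus("chr10:33130000-33140000")
    print(locus.chromosome, locus.start, locus.end, locus.end - locus.start)
```

For the example given in the text, the parsed start and end coordinates differ by 10,000.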

[0090] The term “safe harbor locus” refers to a locus at which genes or genetic elements can be incorporated without disruption to expression or regulation of adjacent genes. These safe harbor loci are also referred to as safe harbor sites (SHS). As used herein, a safe harbor locus refers to an “integration site” or “knock-in site” at which a sequence encoding a transgene, as defined herein, can be inserted. In some embodiments the insertion occurs with replacement of a sequence that is located at the integration site. In some embodiments, the insertion occurs without replacement of a sequence at the integration site. Examples of integration sites contemplated are provided in Table 4.

[0091] As used herein, the term “insert” refers to a nucleotide sequence that is integrated (inserted) at a safe harbor site. The insert can be used to refer to the genes or genetic elements that are incorporated at the safe harbor site using, for example, homology-directed repair (HDR) CRISPR/Cas9 genome-editing or other methods for inserting nucleotide sequences into a genomic region known to those of ordinary skill in the art.

[0092] The “CRISPR/Cas” system refers to a widespread class of bacterial systems for defense against foreign nucleic acid. CRISPR/Cas systems are found in a wide range of eubacterial and archaeal organisms. CRISPR/Cas systems include type I, II, and III subtypes. Wild-type type II CRISPR/Cas systems utilize an RNA-mediated nuclease, Cas9 in complex with guide and activating RNA to recognize and cleave foreign nucleic acid. Guide RNAs having the activity of both a guide RNA and an activating RNA are also known in the art. In some cases, such dual activity guide RNAs are referred to as a small guide RNA (sgRNA).

[0093] Cas9 homologs are found in a wide variety of eubacteria, including, but not limited to bacteria of the following taxonomic groups: Actinobacteria, Aquificae, Bacteroidetes-Chlorobi, Chlamydiae-Verrucomicrobia, Chloroflexi, Cyanobacteria, Firmicutes, Proteobacteria, Spirochaetes, and Thermotogae. An exemplary Cas9 protein is the Streptococcus pyogenes Cas9 protein. Additional Cas9 proteins and homologs thereof are described in, e.g., Chylinski, et al., RNA Biol. 2013 May 1; 10(5): 726-737; Nat. Rev. Microbiol. 2011 June; 9(6): 467-477; Hou, et al., Proc Natl Acad Sci U S A. 2013 Sep 24;110(39): 15644-9; Sampson et al., Nature. 2013 May 9;497(7448):254-7; and Jinek, et al., Science. 2012 Aug 17;337(6096):816-21. The Cas9 nuclease domain can be optimized for efficient activity or enhanced stability in the host cell.

[0094] As used herein, the term “Cas9” refers to an RNA-mediated nuclease (e.g., of bacterial or archaeal origin, or derived therefrom). Exemplary RNA-mediated nucleases include the foregoing Cas9 proteins and homologs thereof, and include but are not limited to, CPF1 (See, e.g., Zetsche et al., Cell, Volume 163, Issue 3, p759-771, 22 October 2015). Similarly, as used herein, the term “Cas9 ribonucleoprotein” complex and the like refers to a complex between the Cas9 protein, and a crRNA (e.g., guide RNA or small guide RNA), the Cas9 protein and a trans-activating crRNA (tracrRNA), the Cas9 protein and a small guide RNA, or a combination thereof (e.g., a complex containing the Cas9 protein, a tracrRNA, and a crRNA guide RNA).

[0095] As used herein, the terms “T lymphocyte” and “T cell” are used interchangeably and refer to cells that have completed maturation in the thymus, and identify certain foreign antigens in the body. The terms also refer to the major leukocyte types that have various roles in the immune system, including activation and deactivation of other immune cells. The T cell can be any T cell such as a cultured T cell, e.g., a primary T cell, or a T cell derived from a cultured T cell line, e.g., a Jurkat, SupT1, etc., or a T cell obtained from a mammal. The T cell can be a CD3+ cell. The T cell can be any type of T cell, including but not limited to CD4+/CD8+ double positive T cells, CD4+ helper T cells (e.g., Th1 and Th2 cells), CD8+ T cells (e.g., cytotoxic T cells), peripheral blood mononuclear cells (PBMC), peripheral blood leukocytes (PBL), tumor infiltrating lymphocytes (TIL), memory T cells, naive T cells, regulatory T cells, γδ T cells, etc. It can be any T cell at any stage of development. Additional types of helper T cells include Th3 (Treg) cells, Th17 cells, Th9 cells, or Tfh cells. Additional types of memory T cells include cells such as central memory T cells (Tcm cells), effector memory T cells (Tem cells and TEMRA cells). A T cell can also refer to a genetically modified T cell, such as a T cell that has been modified to express a T cell receptor (TCR) or a chimeric antigen receptor (CAR). T cells can also be differentiated from stem cells or progenitor cells (e.g., precursor cells).

[0096] “CD4+ T cells” refers to a subset of T cells that express CD4 on their surface and are associated with a cellular immune response. CD4+ T cells are characterized by a post-stimulation secretion profile that can include secretion of cytokines such as IFN-γ, TNF-α, IL-2, IL-4 and IL-10. “CD4” is a 55 kD glycoprotein originally defined as a differentiation antigen on T lymphocytes, but was also found on other cells including monocytes/macrophages. The CD4 antigen is a member of the immunoglobulin superfamily and has been implicated as an associative recognition element in MHC (major histocompatibility complex) class II restricted immune responses. On T lymphocytes, the CD4 antigen defines a helper/inducer subset.

[0097] “CD8+ T cells” refers to a subset of T cells that express CD8 on their surface, are MHC class I restricted, and function as cytotoxic T cells. The “CD8” molecule is a differentiation antigen present on thymocytes, as well as on cytotoxic and suppressor T lymphocytes. The CD8 antigen is a member of the immunoglobulin superfamily and is an associative recognition element in major histocompatibility complex class I restriction interactions.

[0098] As used herein, the term “ex vivo” generally includes experiments or measurements made in or on living tissue, preferably in an artificial environment outside the organism, preferably with minimal differences from natural conditions.

[0099] As used herein, the term “construct” refers to a complex of molecules, including macromolecules or polynucleotides.

[00100] As used herein, the term “integration” refers to the process of stably inserting one or more nucleotides of a construct into the cell genome, i.e., covalently linking to a nucleic acid sequence in the chromosomal DNA of the cell. It may also refer to nucleotide deletions at a site of integration. Where there is a deletion at the insertion site, “integration” may further include substitution of the endogenous sequence or nucleotide deleted with one or more inserted nucleotides.

[00101] As used herein, the term “exogenous” refers to a molecule or activity that has been introduced into a host cell and is not native to that cell. The molecule can be introduced, for example, by introduction of the encoding nucleic acid into host genetic material, such as by integration into a host chromosome, or as non-chromosomal genetic material, such as a plasmid. Thus, the term, when used in connection with expression of an encoding nucleic acid, refers to the introduction of the encoding nucleic acid into a cell in an expressible form. The term “endogenous” refers to a molecule or activity that is present in a host cell under natural, unedited conditions. Similarly, the term, when used in connection with expression of the encoding nucleic acid, refers to expression of the encoding nucleic acid that is contained within the cell and not introduced exogenously.

[00102] As used herein, a “polynucleotide donor construct” refers to a nucleotide sequence (e.g. DNA sequence) that is genetically inserted into a polynucleotide and is exogenous to that polynucleotide. The polynucleotide donor construct is transcribed into RNA and optionally translated into a polypeptide. The polynucleotide donor construct can include prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and synthetic DNA sequences. For example, the polynucleotide donor construct can be a miRNA, shRNA, natural polypeptide (i.e., a naturally occurring polypeptide) or fragment thereof or a variant polypeptide (e.g. a natural polypeptide having less than 100% sequence identity with the natural polypeptide) or fragments thereof.

[00103] As used herein, the term “transgene” refers to a polynucleotide that has been transferred naturally, or by any of a number of genetic engineering techniques from one organism to another. It is optionally translated into a polypeptide. It is optionally translated into a recombinant protein. A “recombinant protein” is a protein encoded by a gene (recombinant DNA) that has been cloned in a system that supports expression of the gene and translation of messenger RNA (see expression system). The recombinant protein can be a therapeutic agent, e.g. a protein that treats a disease or disorder disclosed herein. As used, transgene can refer to a polynucleotide that encodes a polypeptide. A transgene can also refer to a non-encoding sequence, such as, but not limited to shRNAs, miRNAs, and miRs.

[00104] The terms “protein,” “polypeptide,” and “peptide” are used herein interchangeably.

[00105] As used herein, the term “operably linked” refers to the binding of a nucleic acid sequence to a single nucleic acid fragment such that one function is affected by the other. For example, if a promoter is capable of affecting the expression of a coding sequence or functional RNA (i.e., the coding sequence or functional RNA is under transcriptional control by the promoter), the promoter is operably linked thereto. Coding sequences can be operably linked to control sequences in both sense and antisense orientation.

[00106] As used herein, the term “developmental cell states” refers to, for example, states when the cell is inactive, actively expressing, differentiating, senescent, etc. A developmental cell state may also refer to a cell in a precursor state (e.g., a T cell precursor or T cell progenitor).

[00107] As used herein, the term “encoding” refers to a sequence of nucleic acids which codes for a protein or polypeptide of interest. The nucleic acid sequence may be either a molecule of DNA or RNA. In preferred embodiments, the molecule is a DNA molecule. In other preferred embodiments, the molecule is an RNA molecule. When present as an RNA molecule, it will comprise sequences which direct the ribosomes of the host cell to start translation (e.g., a start codon, ATG) and direct the ribosomes to end translation (e.g., a stop codon). Between the start codon and stop codon is an open reading frame (ORF). Such terms are known to one of ordinary skill in the art.

[00108] The term “inserting” refers to a manipulation of a nucleotide sequence to introduce a non-native sequence. This is done, for example, via the use of restriction enzymes and ligases whereby the DNA sequence of interest, usually encoding the gene of interest, can be incorporated into another nucleic acid molecule by digesting both molecules with appropriate restriction enzymes in order to create compatible overlaps and then using a ligase to join the molecules together. One skilled in the art is very familiar with such manipulations and examples may be found in Sambrook et al. (Sambrook, Fritsch, & Maniatis, “Molecular Cloning: A Laboratory Manual”, 2nd ed., Cold Spring Harbor Laboratory, 1989), which is hereby incorporated by reference in its entirety including any drawings, figures and tables.

[00109] As used herein, the term “subject” refers to a mammalian subject. Exemplary subjects include humans, monkeys, dogs, cats, mice, rats, cows, horses, camels, goats, rabbits, pigs and sheep. In certain embodiments, the subject is a human. In some embodiments the subject has a disease or condition that can be treated with an engineered cell provided herein or population thereof. In some aspects, the disease or condition is a cancer.

[00110] As used herein, the term “promoter” refers to a nucleotide sequence (e.g. DNA sequence) capable of controlling the expression of a coding sequence or functional RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. A promoter can be derived from natural genes in its entirety, can be composed of different elements from different promoters found in nature, and/or may comprise synthetic DNA segments. A promoter, as contemplated herein, can be endogenous to the cell of interest or exogenous to the cell of interest. It is appreciated by those skilled in the art that different promoters can induce gene expression in different tissue or cell types, or at different developmental stages, or in response to different environmental conditions. As is known in the art, a promoter can be selected according to the strength of the promoter and/or the conditions under which the promoter is active, e.g., constitutive promoter, strong promoter, weak promoter, inducible/repressible promoter, tissue specific or developmentally regulated promoters, cell cycle-dependent promoters, and the like.

[00111] A promoter can be an inducible promoter (e.g., a heat shock promoter, tetracycline-regulated promoter, steroid-regulated promoter, metal-regulated promoter, estrogen receptor-regulated promoter, etc.). The promoter can be a constitutive promoter (e.g., CMV promoter, UBC promoter). In some embodiments, the promoter can be a spatially restricted and/or temporally restricted promoter (e.g., a tissue specific promoter, a cell type specific promoter, etc.). See for example US Application No. 15/715,068, the disclosures of which are herein incorporated by reference in their entirety.

[00112] Gene editing, as contemplated herein, may involve a gene (or nucleotide sequence) knock-in or knock-out. As used herein, the term “knock-in” refers to an addition of a DNA sequence, or fragment thereof, into a genome. Such DNA sequences to be knocked in may include an entire gene or genes, may include regulatory sequences associated with a gene, or any portion or fragment of the foregoing. For example, a polynucleotide donor construct encoding a recombinant protein may be inserted into the genome of a cell carrying a mutant gene. In some embodiments, a knock-in strategy involves substitution of an existing sequence with the provided sequence, e.g., substitution of a mutant allele with a wild-type copy. On the other hand, the term “knock-out” refers to the elimination of a gene or the expression of a gene. For example, a gene can be knocked out by either a deletion or an addition of a nucleotide sequence that leads to a disruption of the reading frame. As another example, a gene may be knocked out by replacing a part of the gene with an irrelevant (e.g., non-coding) sequence.

[00113] As used herein, the term “non-homologous end joining” or NHEJ refers to a cellular process in which cut or nicked ends of a DNA strand are directly ligated without the need for a homologous template nucleic acid. NHEJ can lead to the addition, the deletion, substitution, or a combination thereof, of one or more nucleotides at the repair site.

[00114] As used herein, the term “homology directed repair” or HDR refers to a cellular process in which cut or nicked ends of a DNA strand are repaired by polymerization from a homologous template nucleic acid. Thus, the original sequence is replaced with the sequence of the template. The homologous template nucleic acid can be provided by homologous sequences elsewhere in the genome (sister chromatids, homologous chromosomes, or repeated regions on the same or different chromosomes). Alternatively, an exogenous template nucleic acid can be introduced to obtain a specific HDR-induced change of the sequence at the target site. In this way, specific mutations can be introduced at the cut site.

[00115] The terms “vector” and “plasmid” are used interchangeably and as used herein refer to polynucleotide vehicles useful to introduce genetic material into a cell. Vectors can be linear or circular. Vectors can integrate into a target genome of a host cell or replicate independently in a host cell. Vectors can comprise, for example, an origin of replication, a multicloning site, and/or a selectable marker. An expression vector typically comprises an expression cassette. Vectors and plasmids include, but are not limited to, integrating vectors, prokaryotic plasmids, eukaryotic plasmids, plant synthetic chromosomes, episomes, viral vectors, cosmids, and artificial chromosomes.

[00116] As used herein the term “expression cassette” is a polynucleotide construct, generated recombinantly or synthetically, comprising regulatory sequences operably linked to a selected polynucleotide to facilitate expression of the selected polynucleotide in a host cell. For example, the regulatory sequences can facilitate transcription of the selected polynucleotide in a host cell, or transcription and translation of the selected polynucleotide in a host cell. An expression cassette can, for example, be integrated in the genome of a host cell or be present in an expression vector.

[00117] As used herein, the phrase “subject in need thereof” refers to a subject that exhibits and/or is diagnosed with one or more symptoms or signs of a disease or disorder as described herein.

[00118] A “chemotherapeutic agent” refers to a chemical compound useful in the treatment of cancer. Chemotherapeutic agents include “anti-hormonal agents” or “endocrine therapeutics” which act to regulate, reduce, block, or inhibit the effects of hormones that can promote the growth of cancer.

[00119] The term “composition” refers to a mixture that contains, e.g., an engineered cell or protein contemplated herein. In some embodiments, the composition may contain additional components, such as adjuvants, stabilizers, excipients, and the like. The term “composition” or “pharmaceutical composition” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective in treating a subject, and which contains no additional components which are unacceptably toxic to the subject in the amounts provided in the pharmaceutical composition.

[00120] As used herein, the term “effective amount” refers to the amount of a compound (e.g., a composition described herein or cells described herein) sufficient to effect beneficial or desired results. An effective amount can be administered in one or more administrations, applications or dosages and is not intended to be limited to a particular formulation or administration route. As used herein, the term “treating” includes any effect, e.g., lessening, reducing, modulating, ameliorating or eliminating, that results in the improvement of the condition, disease, disorder, and the like, or ameliorating a symptom thereof.

[00121] The terms “modulate” and “modulation” refer to reducing or inhibiting or, alternatively, activating or increasing, a recited variable.

[00122] The terms “increase” and “activate” refer to an increase of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, or greater in a recited variable.

[00123] The terms “reduce” and “inhibit” refer to a decrease of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, or greater in a recited variable.

Safe Harbor Loci

[00124] Gene editing therapies include, for example, viral vector integration and site-specific integration. Site-specific integration is a promising alternative to random integration of viral vectors, as it mitigates the risks of insertional mutagenesis or insertional oncogenesis (Kolb et al. Trends Biotechnol. 2005 23:399-406; Porteus et al. Nat Biotechnol. 2005 23:967-973; Paques et al. Curr Gen Ther. 2007 7:49-66). However, site-specific integration continues to face challenges such as poor knock-in efficiency, risk of insertional oncogenesis, unstable and/or anomalous expression of adjacent genes or the transgene, low accessibility (e.g. within 20 kb of adjacent genes), etc. These challenges can be addressed, in part, through the identification and use of safe harbor loci or safe harbor sites (SHS), which are sites in which genes or genetic elements can be incorporated without disruption to expression or regulation of adjacent genes.

[00125] The most widely used of the putative human safe harbor sites is the AAVS1 site on chromosome 19q, which was initially identified as a site for recurrent adeno-associated virus insertion. Other potential SHS have been identified on the basis of homology, with sites first identified in other species (e.g., the human homolog of the permissive murine Rosa26 locus) or among the growing number of human genes that appear non-essential under some circumstances. One putative SHS of this type is the CCR5 chemokine receptor gene, which, when disrupted, confers resistance to human immunodeficiency virus infection. Additional potential genomic SHS have been identified in human and other cell types on the basis of viral integration site mapping or gene-trap analyses, as was the original murine Rosa26 locus. The three top SHS, AAVS1, CCR5, and Rosa26, are in close proximity to many protein coding genes and regulatory elements. (See FIG. 2 from Sadelain, M., et al. (2012). Safe harbours for the integration of new DNA in the human genome. Nature reviews Cancer, 12(1), 51-58, the relevant disclosures of which are herein incorporated by reference in their entirety).

[00126] The AAVS1 (also known as the PPP1R12C locus) on human chromosome 19 is a known SHS for hosting transgenes (e.g. DNA transgenes) with expected function. It is at position 19q13.42. It has an open chromatin structure and is transcription-competent. The canonical SHS locus for AAVS1 is chr19: 55,625,241-55,629,351. See Pellenz et al. “New Human Chromosomal Sites with "Safe Harbor" Potential for Targeted Transgene Insertion.” Human gene therapy vol. 30,7 (2019): 814-828, the relevant disclosures of which are herein incorporated by reference. An exemplary AAVS1 target gRNA and target sequence are provided below:

• AAVS1 gRNA sequence: ggggccactagggacaggatGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTT

• AAVS1 target sequence: ggggccactagggacaggat
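The two sequences listed above are related: the gRNA sequence begins with the 20-nucleotide target sequence (shown in lowercase), followed by an uppercase remainder. The Python sketch below checks that relationship. It is illustrative only; the function name `split_grna` is hypothetical, and calling the uppercase remainder a "scaffold" is an interpretation not stated in the text.

```python
# Sequences copied from the exemplary AAVS1 entries, with whitespace removed.
AAVS1_TARGET = "ggggccactagggacaggat"
AAVS1_GRNA = (
    "ggggccactagggacaggat"
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTA"
    "GTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTT"
)


def split_grna(grna: str, target: str) -> tuple[str, str]:
    """Split a gRNA sequence into (target portion, remainder).

    Raises ValueError if the gRNA does not begin with the target sequence.
    The comparison ignores letter case and embedded whitespace.
    """
    grna_clean = "".join(grna.split())
    target_clean = "".join(target.split())
    if not grna_clean.lower().startswith(target_clean.lower()):
        raise ValueError("gRNA does not begin with the target sequence")
    cut = len(target_clean)
    return grna_clean[:cut], grna_clean[cut:]


if __name__ == "__main__":
    spacer, remainder = split_grna(AAVS1_GRNA, AAVS1_TARGET)
    print("target portion:", spacer, f"({len(spacer)} nt)")
    print("remainder (assumed scaffold):", remainder)
```

A check of this kind can catch transcription errors, such as a stray space or a dropped base, when guide sequences are copied between documents.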

[00127] CCR5, which is located on chromosome 3 at position 3p21.31, encodes the major co-receptor for HIV-1. Disruption at this site in the CCR5 gene has been beneficial in HIV/AIDS therapy and prompted the development of zinc-finger nucleases that target its third exon. The canonical SHS locus for CCR5 is chr3: 46,414,443-46,414,942. See Pellenz et al. “New Human Chromosomal Sites with "Safe Harbor" Potential for Targeted Transgene Insertion.” Human gene therapy vol. 30,7 (2019): 814-828, the relevant disclosures of which are herein incorporated by reference.

[00128] The mouse Rosa26 locus is particularly useful for genetic modification as it can be targeted with high efficiency and is expressed in most cell types tested. Irion et al. 2007 ("Identification and targeting of the ROSA26 locus in human embryonic stem cells." Nature biotechnology 25.12 (2007): 1477-1482, the relevant disclosures of which are herein incorporated by reference) identified the human homolog, human ROSA26, in chromosome 3 (position 3p25.3). The canonical SHS locus for human Rosa26 (hRosa26) is chr3: 9,415,082-9,414,043. See Pellenz et al. “New Human Chromosomal Sites with "Safe Harbor" Potential for Targeted Transgene Insertion.” Human gene therapy vol. 30,7 (2019): 814-828, the relevant disclosures of which are herein incorporated by reference.

[00129] Additional examples of safe harbor sites are provided in Pellenz et al. “New Human Chromosomal Sites with "Safe Harbor" Potential for Targeted Transgene Insertion.” Human gene therapy vol. 30,7 (2019): 814-828, the relevant disclosures of which are herein incorporated by reference.

[00130] The present disclosure is directed to methods for identifying safe harbor loci with benefits including, but not limited to, high knock-in efficiency and high expression of transgene. An example of applications of the presently disclosed methods is the identification of safe harbor loci for insertion of transgenes (e.g., chimeric antigen receptors (CAR)) into T cells. In some embodiments, the safe harbor loci of the present disclosure are useful for the insertion of a sequence encoding a transgene. In some embodiments, the safe harbor sites allow for high transgene expression (sufficient to allow for transgene functionality or treatment of a disease of interest) and stable expression of the transgene over several days, weeks or months. In some embodiments, knockout of the gene at the safe harbor locus confers benefit to the function of the cell, or the gene at the safe harbor locus has no known function within the cell. In some embodiments the safe harbor locus results in stable transgene expression in vitro with or without CD3/CD28 stimulation, negligible off-target cleavage as detected by iGuide-Seq or CRISPR-Seq, less off-target cleavage relative to other loci as detected by iGuide-Seq or CRISPR-Seq, negligible transgene-independent cytotoxicity, negligible transgene-independent cytokine expression, negligible transgene-independent chimeric antigen receptor expression, and negligible deregulation or silencing of nearby genes, and is positioned outside of a cancer-related gene.

[00131] As used herein, a “nearby gene” can refer to a gene that is within about 100 kb, about 125 kb, about 150 kb, about 175 kb, about 200 kb, about 225 kb, about 250 kb, about 275 kb, about 300 kb, about 325 kb, about 350 kb, about 375 kb, about 400 kb, about 425 kb, about 450 kb, about 475 kb, about 500 kb, about 525 kb, or about 550 kb of the safe harbor locus (integration site).
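The distance-based definition of a nearby gene can be expressed as a simple interval calculation. The Python sketch below is illustrative only. The function names `distance_to_gene` and `is_nearby_gene`, the default 100 kb window, and all coordinates in the example are hypothetical, and treating the integration site as a single coordinate on the same chromosome as the gene is a simplifying assumption.

```python
def distance_to_gene(site: int, gene_start: int, gene_end: int) -> int:
    """Distance in base pairs from an integration site to a gene interval.

    Returns 0 when the site lies within the gene. Coordinates are assumed
    to be on the same chromosome; start/end order is normalized.
    """
    lo, hi = sorted((gene_start, gene_end))
    if site < lo:
        return lo - site
    if site > hi:
        return site - hi
    return 0


def is_nearby_gene(site: int, gene_start: int, gene_end: int,
                   window_bp: int = 100_000) -> bool:
    """True if the gene lies within `window_bp` of the integration site."""
    return distance_to_gene(site, gene_start, gene_end) <= window_bp


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    site = 1_000_000
    print(is_nearby_gene(site, 1_050_000, 1_060_000))            # 50 kb away
    print(is_nearby_gene(site, 1_200_000, 1_260_000))            # 200 kb away
    print(is_nearby_gene(site, 1_200_000, 1_260_000, 300_000))   # wider window
```

The `window_bp` parameter can be set to any of the recited distances (for example, 100,000 through 550,000) to apply the corresponding definition.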

[00132] In some embodiments, the present disclosure contemplates inserts that comprise one or more transgenes. The transgene can encode a therapeutic protein, an antibody, a peptide, a suicide gene, an apoptosis gene or any other gene of interest. The safe harbor loci identified using the method described herein allow for transgene integration that results in, for example, enhanced therapeutic properties. These enhanced therapeutic properties, as used herein, refer to an enhanced therapeutic property of a cell when compared to a typical immune cell of the same normal cell type. For example, an NK cell having “enhanced therapeutic properties” has an enhanced, improved, and/or increased treatment outcome when compared to a typical, unmodified and/or naturally occurring NK cell. The therapeutic properties of immune cells can include, but are not limited to, cell transplantation, transport, homing, viability, self-renewal, persistence, immune response control and regulation, survival, and cytotoxicity. The therapeutic properties of immune cells are also manifested by: antigen-targeted receptor expression; HLA presentation or lack thereof; tolerance to the intratumoral microenvironment; induction of bystander immune cells and immune regulation; improved target specificity with reduction; resistance to treatments such as chemotherapy.

[00133] As used herein, the term “insert size” refers to the length of the nucleotide sequence being integrated (inserted) at the safe harbor site. In some embodiments, the insert size comprises at least about 100, 200, 300, 400 or 500 basepairs. In some embodiments, the insert size comprises about 500 nucleotides or basepairs. In some embodiments, the insert size comprises up to 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 kbp (kilo basepairs) or the sizes in between. In some embodiments, the insert size is greater than 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 kbp or the sizes in between. In some embodiments, the insert size is within the range of 3-15 kbp or is any number in that range. In some embodiments, the insert size is within the range of 1.5-8.3 kbp or is any number in that range. In some embodiments, the insert size is within the range of 1.5-15 kbp or is any number in that range. In some embodiments, the insert size is within the range of 0.5-20 kbp or is any number in that range. In some embodiments, the insert size is 0.5-10, 0.6-10, 0.7-10, 0.8-10, 0.9-10, 1-10, 2-10, 3-10, 4-10, 5-10, 6-10, 7-10, 8-10, or 9-10 kbp. In some embodiments, the insert size is 0.5-11, 0.6-11, 0.7-11, 0.8-11, 0.9-11, 1-11, 2-11, 3-11, 4-11, 5-11, 6-11, 7-11, 8-11, 9-11, or 10-11 kbp. In some embodiments, the insert size is 0.5-12, 0.6-12, 0.7-12, 0.8-12, 0.9-12, 1-12, 2-12, 3-12, 4-12, 5-12, 6-12, 7-12, 8-12, 9-12, 10-12, or 11-12 kbp. In some embodiments, the insert size is 0.5-13, 0.6-13, 0.7-13, 0.8-13, 0.9-13, 1-13, 2-13, 3-13, 4-13, 5-13, 6-13, 7-13, 8-13, 9-13, 10-13, 11-13, or 12-13 kbp. In some embodiments, the insert size is 0.5-14, 0.6-14, 0.7-14, 0.8-14, 0.9-14, 1-14, 2-14, 3-14, 4-14, 5-14, 6-14, 7-14, 8-14, 9-14, 10-14, 11-14, 12-14, or 13-14 kbp. In some embodiments, the insert size is 0.5-15, 0.6-15, 0.7-15, 0.8-15, 0.9-15, 1-15, 2-15, 3-15, 4-15, 5-15, 6-15, 7-15, 8-15, 9-15, 10-15, 11-15, 12-15, 13-15, or 14-15 kbp. In some embodiments, the insert size is 0.5-16, 0.6-16, 0.7-16, 0.8-16, 0.9-16, 1-16, 2-16, 3-16, 4-16, 5-16, 6-16, 7-16, 8-16, 9-16, 10-16, 11-16, 12-16, 13-16, 14-16, or 15-16 kbp. In some embodiments, the insert size is 0.5-17, 0.6-17, 0.7-17, 0.8-17, 0.9-17, 1-17, 2-17, 3-17, 4-17, 5-17, 6-17, 7-17, 8-17, 9-17, 10-17, 11-17, 12-17, 13-17, 14-17, 15-17, or 16-17 kbp. In some embodiments, the insert size is 0.5-18, 0.6-18, 0.7-18, 0.8-18, 0.9-18, 1-18, 2-18, 3-18, 4-18, 5-18, 6-18, 7-18, 8-18, 9-18, 10-18, 11-18, 12-18, 13-18, 14-18, 15-18, 16-18, or 17-18 kbp. In some embodiments, the insert size is 0.5-19, 0.6-19, 0.7-19, 0.8-19, 0.9-19, 1-19, 2-19, 3-19, 4-19, 5-19, 6-19, 7-19, 8-19, 9-19, 10-19, 11-19, 12-19, 13-19, 14-

19, 15-19, 16-19, 17-19, or 18-19 kbp. In some embodiments, the insert size is 0.5-20, 0.6-20, 0.7-20, 0.8-20, 0.9-20, 1-20, 2-20, 3-20, 4-20, 5-20, 6-20, 7-20, 8-20, 9-20, 10-20, 11-20, 12-

20, 13-20, 14-20, 15-20, 16-20, 17-20, 18-20, or 19-20 kbp.
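
By way of illustration only, the following minimal sketch shows how an insert length could be checked against a few of the insert size ranges recited above. The dictionary `INSERT_SIZE_RANGES_KBP`, the function `matching_ranges`, and the treatment of range endpoints as inclusive are assumptions made for this example and are not part of the disclosure.

```python
# Illustrative sketch only. The range table and helper name are hypothetical,
# and endpoints are assumed to be inclusive.
INSERT_SIZE_RANGES_KBP = {
    "0.5-20": (0.5, 20.0),
    "1.5-15": (1.5, 15.0),
    "3-15": (3.0, 15.0),
    "1.5-8.3": (1.5, 8.3),
}


def matching_ranges(insert_bp: int) -> list[str]:
    """Return the labels of every listed range (in kbp) containing the insert."""
    size_kbp = insert_bp / 1000  # convert basepairs to kilo basepairs
    return [
        label
        for label, (low, high) in INSERT_SIZE_RANGES_KBP.items()
        if low <= size_kbp <= high
    ]


if __name__ == "__main__":
    # A 2,000 bp insert (2 kbp) falls in three of the four example ranges.
    print(matching_ranges(2000))
```

In this sketch, an insert shorter than 0.5 kbp matches none of the example ranges and yields an empty list.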

[00134] The inserts of the present disclosure refer to nucleic acid molecules or polynucleotides inserted at a safe harbor site. In some embodiments, the nucleotide sequence is a DNA molecule, e.g., genomic DNA, or comprises deoxy-ribonucleotides. In some embodiments, the insert comprises a smaller fragment of DNA, such as a plastid DNA, mitochondrial DNA, or DNA isolated in the form of a plasmid, a fosmid, a cosmid, a bacterial artificial chromosome (BAC), a yeast artificial chromosome (YAC), and/or any other sub-genome segment of DNA. In some embodiments, the insert is an RNA molecule or comprises ribonucleotides. The nucleotides in the insert are contemplated as naturally occurring nucleotides, non-naturally occurring nucleotides, and modified nucleotides. Nucleotides may be modified chemically or biochemically, or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications. The polynucleotides can be in any topological conformation, including single-stranded, double-stranded, partially duplexed, triplexed, hairpinned, circular conformations, and other three-dimensional conformations contemplated in the art.

[00135] The inserts can have coding and/or non-coding regions. The insert can comprise a non-coding sequence (e.g., control elements, e.g., a promoter sequence). In some embodiments, the insert encodes transcription factors. In some embodiments, the insert encodes antigen binding receptors such as single receptors, T-cell receptors (TCRs), syn-notch, CARs, mAbs, etc. In some embodiments, the inserts are RNAi molecules, including, but not limited to, miRNAs, siRNAs, shRNAs, etc. In some embodiments, the insert is a human sequence. In some embodiments, the insert is chimeric. In some embodiments, the insert is a multi-gene/multi-module therapeutic cassette. A multi-gene/multi-module therapeutic cassette refers to an insert or cassette having one or more than one receptor (e.g., synthetic receptors), other exogenous protein coding sequences, non-coding RNAs, transcriptional regulatory elements, and/or insulator sequences, etc.

[00136] Various cell types are contemplated as having the safe harbor sites in the present disclosure. A cell comprising a safe harbor site and/or a cell comprising an insert at a safe harbor site as described in the present disclosure can be referred to as an engineered cell. The cells can include, but are not limited to, eukaryotic cells, prokaryotic cells, animal cells, plant cells, fungal cells and the like. Optionally, the cell is a mammalian cell, for example, a human cell. In some embodiments, the engineered cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor. Non-limiting examples of immune cells that are contemplated in the present disclosure include T cells, B cells, natural killer (NK) cells, NKT/iNKT cells, macrophages, myeloid cells, and dendritic cells. Non-limiting examples of stem cells that are contemplated in the present disclosure include pluripotent stem cells (PSCs), embryonic stem cells (ESCs), induced pluripotent stem cells (iPSCs), embryo-derived embryonic stem cells obtained by nuclear transfer (ntES; nuclear transfer ES), male germline stem cells (GS cells), embryonic germ cells (EG cells), hematopoietic stem/progenitor stem cells (HSPCs), somatic stem cells (adult stem cells), hemangioblasts, neural stem cells, mesenchymal stem cells and stem cells of other cells (including osteocyte, chondrocyte, myocyte, cardiac myocyte, neuron, tendon cell, adipocyte, pancreocyte, hepatocyte, nephrocyte and follicle cells and so on). In some embodiments, the engineered cell is a T cell, an NK cell, an iPSC, or an HSPC. In some embodiments, the engineered cells used in the present disclosure are human cell lines grown in vitro (e.g. deliberately immortalized cell lines, cancer cell lines, etc.).

[00137] The methods for integrating the inserts at the safe harbor sites can be viral or non-viral delivery techniques.

[00138] In some embodiments, the nucleic acid sequence is inserted into the genome of the engineered cell by introducing a vector, for example, a viral vector, comprising the nucleic acid. Examples of viral vectors include, but are not limited to, adeno-associated viral (AAV) vectors, retroviral vectors or lentiviral vectors. In some embodiments, the lentiviral vector is an integrase-deficient lentiviral vector.

[00139] In some embodiments, the nucleic acid sequence is inserted into the genome of the T cell via non-viral delivery. In non-viral delivery methods, the nucleic acid can be naked DNA, or in a non-viral plasmid or vector. Non-viral delivery techniques can be site-specific integration techniques, as described herein or known to those of ordinary skill in the art. Examples of site-specific techniques for integration into the safe harbor loci include, without limitation, homology-dependent engineering using nucleases and homology independent targeted insertion using Cas9. In some embodiments, the non-viral delivery method comprises electroporation.

[00140] In some embodiments, the insert is integrated at a safe harbor site by introducing into the engineered cell, (a) a targeted nuclease that cleaves a target region in the safe harbor site to create the insertion site; and (b) the nucleic acid sequence (insert), wherein the insert is incorporated at the insertion site by, e.g., HDR. Examples of non-viral delivery techniques that can be used in the methods of the present disclosure are provided in US Application Nos. 16/568,116 and 16/622,843, the relevant disclosures of which are herein incorporated by reference in their entirety.

[00141] The engineered cell can retain its undifferentiated state after insertion of the transgenes. In some embodiments, the engineered cell is undifferentiated. In some embodiments, the engineered cell is undifferentiated after insertion of the transgene. In some embodiments, the engineered cell is CD45RA⁺ and CCR7⁺ after insertion of the transgene. In some embodiments, the engineered cell is CD45RA⁺CCR7⁺CD27⁺ after insertion of the transgene.

CAR T cell Therapy

[00142] Chimeric antigen receptor (CAR) T cells are T cells that have been genetically engineered to produce an artificial T-cell receptor for use in immunotherapy. Chimeric antigen receptors are receptor proteins that have been engineered to confer T cells with the ability to target a specific protein. The genetic modification of lymphocytes (e.g. T cells) by incorporation of, for example, CARs, and administration of the engineered cells to a subject is an example of “adoptive cell therapy”. As used herein, the term “adoptive cell therapy” refers to cell-based immunotherapy for transfusion of autologous or allogeneic lymphocytes, referred to as T cells or B cells. In this CAR therapy approach, cells are expanded and cultured ex vivo and genetically modified, prior to transfusion.

[00143] The expression of CARs allows the engineered T-cells to target and bind specific proteins, for example, tumor antigens. In CAR therapy, T-cells are harvested from a subject; they can be autologous T-cells from the subject's own blood or T-cells from a donor that will not be receiving the CAR therapy. Once isolated, the T-cells are genetically modified with a CAR, expanded ex vivo, and administered to the subject (i.e. patient) by, e.g., infusion.

[00144] The CARs may be introduced into the T-cells using, for example, a viral technique (e.g., retroviral integration) or site-specific technique. With site specific integration of the transgenes (e.g. CARs), the transgenes may be targeted to a safe harbor locus. Examples of site-specific techniques for integration into the safe harbor loci include, without limitation, homology-dependent engineering using nucleases and homology independent targeted insertion using Cas9.

[00145] The engineered CAR T cells have applications in immuno-oncology. The CAR, for example, can be selected to target a specific tumor antigen. Examples of cancers that can be effectively targeted using CAR T cells are blood cancers. In some embodiments, CAR T cell therapy can be used to treat solid tumors.

Gene editing

[00146] The terms “gene editing” or “genome editing”, as used herein, refer to a type of genetic manipulation in which DNA is inserted, replaced, or removed from the genome using artificially manipulated nucleases or “molecular scissors”. It is a useful tool for elucidating the function and effect of sequence-specific genes or proteins or altering cell behavior (e.g. for therapeutic purposes).

[00147] Currently available genome editing tools include zinc finger nucleases (ZFN) and transcription activator-like effector nucleases (TALENs) to incorporate genes at safe harbor loci (e.g., the adeno-associated virus integration site 1 (AAVS1) safe harbor locus). The DICE (dual integrase cassette exchange) system utilizing phiC31 integrase and Bxb1 integrase is a tool for targeted integration. Additionally, clustered regularly interspaced short palindromic repeat/Cas9 (CRISPR/Cas9) techniques can be used for targeted gene insertion.

[00148] Site specific gene editing approaches can include homology dependent mechanisms or homology independent mechanisms.

[00149] All methods known in the art for targeted insertion of gene sequences are contemplated in the methods described herein to insert constructs at safe harbor loci.

CRISPR-Cas Gene editing

[00150] One effective example of gene editing is the CRISPR-Cas approach (e.g., CRISPR-Cas9). This approach incorporates the use of a guide polynucleotide (e.g., guide ribonucleic acid or gRNA) and a Cas endonuclease (e.g., Cas9 endonuclease).

[00151] As used herein, a polypeptide referred to as a “Cas endonuclease” or having “Cas endonuclease activity” refers to a CRISPR-related (Cas) polypeptide encoded by a Cas gene, wherein the Cas polypeptide can cleave a target DNA sequence when operably linked to one or more guide polynucleotides (see, e.g., US Pat. No. 8,697,359). Also included in this definition are variants of Cas endonuclease that retain guide polynucleotide-dependent endonuclease activity. The Cas endonuclease used in the donor DNA insertion method detailed herein is an endonuclease that introduces double-strand breaks into DNA at the target site (e.g., within the target locus or at the safe harbor site).

[00152] As used herein, the term “guide polynucleotide” relates to a polynucleotide sequence capable of complexing with a Cas endonuclease and allowing the Cas endonuclease to recognize and cleave a DNA target site. The guide polynucleotide can be a single molecule or a double molecule. The guide polynucleotide sequence can be an RNA sequence, a DNA sequence, or a combination thereof (RNA-DNA combination sequence). A guide polynucleotide comprising only ribonucleic acid is also referred to as “guide RNA”. In some embodiments, a polynucleotide donor construct is inserted at a safe harbor locus using a guide RNA (gRNA) in combination with a Cas endonuclease (e.g., Cas9 endonuclease).

[00153] The guide polynucleotide includes a first nucleotide sequence domain (also referred to as a variable targeting domain or VT domain) that is complementary to a nucleotide sequence in the target DNA, and a second nucleotide sequence domain (referred to as a Cas endonuclease recognition domain or CER domain) that interacts with a Cas endonuclease polypeptide. The guide polynucleotide can be a double molecule (also referred to as a double-stranded guide polynucleotide). The CER domain of this double molecule guide polynucleotide comprises two separate molecules that hybridize along the complementary region. The two separate molecules can be RNA sequences, DNA sequences and/or RNA-DNA combination sequences.

[00154] Genome editing using CRISPR-Cas approaches relies on the repair of site-specific DNA double-strand breaks (DSBs) induced by the RNA-guided Cas endonuclease (e.g., Cas9 endonuclease). Homology-directed repair (HDR) of these DSBs enables precise editing of the genome by introducing defined genomic changes, including base substitutions, sequence insertions, and deletions. Conventional HDR-based CRISPR/Cas9 genome-editing involves transfecting cells with Cas9, gRNA and donor DNA containing homologous arms matching the genomic locus of interest.
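
By way of illustration only, the following minimal sketch shows how candidate target sites for a guide RNA could be located in a sequence. It assumes a Streptococcus pyogenes Cas9, which recognizes a 5'-NGG-3' protospacer adjacent motif (PAM) immediately 3' of a 20-nucleotide protospacer; the present disclosure is not limited to that endonuclease. The function name `find_spcas9_sites` and the example sequence are hypothetical, and only the forward strand is scanned.

```python
# Illustrative sketch only: assumes SpCas9 with an NGG PAM and a 20-nt
# protospacer. The function name and example sequence are hypothetical.
def find_spcas9_sites(seq: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, pam) for each forward-strand NGG site."""
    seq = seq.upper()
    sites = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(seq) - protospacer_len - 2):
        pam = seq[start + protospacer_len : start + protospacer_len + 3]
        if pam[1:] == "GG":  # "N" may be any base; the next two must be G
            sites.append((start, seq[start : start + protospacer_len], pam))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG" + "CCC"  # one NGG PAM after a 20-nt stretch
    for start, protospacer, pam in find_spcas9_sites(example):
        print(start, protospacer, pam)
```

A practical guide design workflow would also scan the reverse complement and score candidates for off-target activity; those steps are omitted here.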

[00155] HITI (homology independent targeted insertion) uses a non-homologous end joining (NHEJ)-based homology-independent strategy and the method can be more efficient than HDR. Guide RNAs (gRNAs) target the insertion site. For HITI, donor plasmids lack homology arms and DSB repair does not occur through the HDR pathway. The donor polynucleotide construct can be engineered to include Cas9 cleavage site(s) flanking the gene or sequence to be inserted. This results in Cas9 cleavage at both the donor plasmid and the genomic target sequence. Both target and donor have blunt ends and the linearized donor DNA plasmid is used by the NHEJ pathway, resulting in integration into the genomic DSB site. (See, for example, Suzuki, K., et al. (2016). In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration. Nature, 540(7631), 144-149, the relevant disclosures of which are herein incorporated in their entirety).

[00156] Methods for conducting gene editing using CRISPR-Cas approaches are known to those of ordinary skill in the art. (See, for example, US Application Nos. US16/312,676, US 15/303,722, and US 15/628,533, the disclosures of which are herein incorporated by reference in their entirety). Additionally, uses of endonucleases for inserting transgenes into safe harbor loci are described, for example, in US Application No. 13/036,343, the disclosures of which are herein incorporated by reference in their entirety.

[00157] The guide RNAs and/or mRNA (or DNA) encoding an endonuclease can be chemically linked to one or more moieties or conjugates that enhance the activity, cellular distribution, or cellular uptake of the oligonucleotide. Non-limiting examples of such moieties include lipid moieties such as a cholesterol moiety, cholic acid, a thioether, a thiocholesterol, an aliphatic chain (e.g., dodecandiol or undecyl residues), a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate, a polyamine or a polyethylene glycol chain, adamantane acetic acid, a palmityl moiety and an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety. See for example US Application No. 15/715,068, the disclosures of which are herein incorporated by reference in their entirety.

Therapeutic Applications

[00158] For therapeutic applications, the engineered cells, populations thereof, or compositions thereof are administered to a subject, generally a mammal, generally a human, in an effective amount.

[00159] The engineered cells may be administered to a subject by infusion (e.g., continuous infusion over a period of time) or other modes of administration known to those of ordinary skill in the art.

[00160] The engineered cells provided herein can be administered as part of a pharmaceutical composition. In some embodiments, the present disclosure provides compositions comprising a guide RNA of the present disclosure. The pharmaceutical composition may comprise one or more pharmaceutical excipients. Any suitable pharmaceutical excipient may be used, and one of ordinary skill in the art is capable of selecting suitable pharmaceutical excipients. Accordingly, the pharmaceutical excipients provided below are intended to be illustrative, and not limiting. Additional pharmaceutical excipients include, for example, those described in the Handbook of Pharmaceutical Excipients, Rowe et al. (Eds.) 6th Ed. (2009), incorporated by reference in its entirety.

[00161] The engineered cells provided herein not only find use in gene therapy but also in non-pharmaceutical uses such as, e.g., production of animal models and production of recombinant cell lines expressing a protein of interest.

[00162] The engineered cells of the present disclosure can be any cell, generally a mammalian cell, generally a human cell that has been modified by integrating a transgene at a safe harbor locus described herein. In some embodiments, the engineered cells are immune cells. In some embodiments, the engineered cells are lymphocytes. In some embodiments, the engineered cells are T cells or T cell progenitors.

[00163] The engineered cells, compositions and methods of the present disclosure are useful for therapeutic applications such as CAR T cell therapy and TCR T cell therapy. In some embodiments, the insertion of a sequence encoding a transgene within a safe harbor locus maintains the TCR expression relative to instances when there is no insertion and enables transgene expression while maintaining TCR function.

[00164] Various diseases treated using the engineered cells, populations thereof, or compositions thereof are provided herein. Non-limiting examples of such diseases include alopecia areata, autoimmune hemolytic anemia, autoimmune hepatitis, cancer, dermatomyositis, diabetes (type 1), certain juvenile idiopathic arthritis, glomerulonephritis, Graves' disease, Guillain-Barré syndrome, idiopathic thrombocytopenic purpura, myasthenia gravis, certain myocarditis, multiple sclerosis, pemphigus/pemphigoid, pernicious anemia, polyarteritis nodosa, polymyositis, primary biliary cirrhosis, psoriasis, rheumatoid arthritis, scleroderma/systemic sclerosis, Sjogren's syndrome, systemic lupus erythematosus, certain thyroiditis, certain uveitis, vitiligo, multiple vasculitis (Wegener); autoimmune disorders including, but not limited to, granulomatosis; hematopoietic tumors including but not limited to acute and chronic leukemia, lymphoma, multiple myeloma and myelodysplastic syndrome; tumors of the prostate, breast, lung, colon, uterus, skin, liver, bone, pancreas, ovary, testis, bladder, kidney, head, neck, stomach, cervix, rectum, larynx, or esophagus solid tumors; HIV (human immunodeficiency virus) related disorders; RSV (respiratory syncytial virus) related disorders; EBV (Epstein-Barr virus) related disorders; CMV (cytomegalovirus) related disorders; and infectious diseases including, but not limited to, adenovirus-related disorders and BK polyomavirus-related disorders.

[00165] Cancers that can be treated with the engineered cells (e.g., CAR T-cells) of the present disclosure, populations thereof, or compositions thereof include blood cancers. In some embodiments, the cancer treated using the engineered cells (e.g., CAR T-cells) described herein, populations thereof, or compositions thereof is a hematologic malignancy or leukemia. In some embodiments, the engineered cells (e.g., CAR T-cells) described herein, populations thereof, or compositions thereof are used for the treatment of acute lymphoblastic leukemia (ALL) or diffuse large B-cell lymphoma (DLBCL). In some embodiments, the cancer is acute myeloid leukemia (AML), acute lymphoblastic leukemia (ALL), myelodysplasia, myelodysplastic syndromes, acute T-lymphoblastic leukemia, acute promyelocytic leukemia, chronic myelomonocytic leukemia, or myeloid blast crisis of chronic myeloid leukemia. Examples of cancers treatable using the engineered cells (e.g., CAR T-cells) described herein include, without limitation, breast cancer, ovarian cancer, esophageal cancer, bladder or gastric cancer, salivary duct carcinoma, adenocarcinoma of the lung or aggressive forms of uterine cancer, such as uterine serous endometrial carcinoma. In some other embodiments, the cancer is brain cancer, breast cancer, cervical cancer, colon cancer, colorectal cancer, endometrial cancer, esophageal cancer, leukemia, lung cancer, liver cancer, melanoma, ovarian cancer, pancreatic cancer, rectal cancer, renal cancer, stomach cancer, testicular cancer, or uterine cancer.
In yet other embodiments, the cancer is a squamous cell carcinoma, adenocarcinoma, small cell carcinoma, melanoma, neuroblastoma, sarcoma (e.g., an angiosarcoma or chondrosarcoma), larynx cancer, parotid cancer, biliary tract cancer, thyroid cancer, acral lentiginous melanoma, actinic keratoses, acute lymphocytic leukemia, acute myeloid leukemia, adenoid cystic carcinoma, adenomas, adenosarcoma, adenosquamous carcinoma, anal canal cancer, anal cancer, anorectum cancer, astrocytic tumor, Bartholin gland carcinoma, basal cell carcinoma, biliary cancer, bone cancer, bone marrow cancer, bronchial cancer, bronchial gland carcinoma, carcinoid, cholangiocarcinoma, chondrosarcoma, choroid plexus papilloma/carcinoma, chronic lymphocytic leukemia, chronic myeloid leukemia, clear cell carcinoma, connective tissue cancer, cystadenoma, digestive system cancer, duodenum cancer, endocrine system cancer, endodermal sinus tumor, endometrial hyperplasia, endometrial stromal sarcoma, endometrioid adenocarcinoma, endothelial cell cancer, ependymal cancer, epithelial cell cancer, Ewing's sarcoma, eye and orbit cancer, female genital cancer, focal nodular hyperplasia, gallbladder cancer, gastric antrum cancer, gastric fundus cancer, gastrinoma, glioblastoma, glucagonoma, heart cancer, hemangioblastomas, hemangioendothelioma, hemangiomas, hepatic adenoma, hepatic adenomatosis, hepatobiliary cancer, hepatocellular carcinoma, Hodgkin's disease, ileum cancer, insulinoma, intraepithelial neoplasia, interepithelial squamous cell neoplasia, intrahepatic bile duct cancer, invasive squamous cell carcinoma, jejunum cancer, joint cancer, Kaposi's sarcoma, pelvic cancer, large cell carcinoma, large intestine cancer, leiomyosarcoma, lentigo maligna melanomas, lymphoma, male genital cancer, malignant melanoma, malignant mesothelial tumors, medulloblastoma, medulloepithelioma, meningeal cancer, mesothelial cancer, metastatic carcinoma, mouth cancer, mucoepidermoid carcinoma, multiple myeloma, muscle cancer, nasal tract cancer, nervous system cancer, neuroepithelial adenocarcinoma, nodular melanoma, non-epithelial skin cancer, non-Hodgkin's lymphoma, oat cell carcinoma, oligodendroglial cancer, oral cavity cancer, osteosarcoma, papillary serous adenocarcinoma, penile cancer, pharynx cancer, pituitary tumors, plasmacytoma, pseudosarcoma, pulmonary blastoma, rectal cancer, renal cell carcinoma, respiratory system cancer, retinoblastoma, rhabdomyosarcoma, sarcoma, serous carcinoma, sinus cancer, skin cancer, small cell carcinoma, small intestine cancer, smooth muscle cancer, soft tissue cancer, somatostatin-secreting tumor, spine cancer, squamous cell carcinoma, striated muscle cancer, submesothelial cancer, superficial spreading melanoma, T cell leukemia, tongue cancer, undifferentiated carcinoma, ureter cancer, urethra cancer, urinary bladder cancer, urinary system cancer, uterine cervix cancer, uterine corpus cancer, uveal melanoma, vaginal cancer, verrucous carcinoma, VIPoma, vulva cancer, well-differentiated carcinoma, or Wilms tumor.

[00166] In some embodiments, the present disclosure provides methods of treating a subject in need of treatment by administering to the subject a composition comprising any of the engineered cells described herein. As used herein, the terms “treat,” “treatment,” and the like refer generally to obtaining a desired pharmacological and/or physiological effect. That effect can be preventive in terms of complete or partial prevention of the disease and/or therapeutic in terms of partial or complete cure of the disease and/or adverse effects resulting from the disease. The term “treatment”, as used herein, encompasses any treatment of a disease in a subject (e.g., mammal, e.g., human).
Treatment may also refer to the administration of the engineered cells provided herein to a subject that is susceptible to the disease but has not yet been diagnosed as suffering from it, including preventing the disease from occurring; inhibiting disease progression; or reducing the disease (i.e., causing a regression of the disease). Further, treatment may stabilize or reduce undesirable clinical symptoms in subjects (e.g., patients). The cells provided herein, populations thereof, or compositions thereof may be administered before, during or after the occurrence of the disease or injury.

[00167] In certain embodiments, the subject has a disease, condition, and/or injury that can be treated and/or ameliorated by cell therapy. In some embodiments, the subject in need of cell therapy is a subject having an injury, disease, or condition for which cell therapy (e.g., therapy in which cellular material is administered to the subject) can treat, ameliorate and/or reduce the severity of at least one symptom associated with the injury, disease or condition. In certain embodiments, a subject in need of cell therapy includes, but is not limited to, a bone marrow transplant or stem cell transplant candidate, a subject who has received chemotherapy or radiation therapy, a subject having or at risk of developing a hyperproliferative disease or cancer (e.g., a hyperproliferative disease or cancer of the hematopoietic system), a subject having or at risk of developing a tumor (e.g., solid tumor), and a subject suffering from or at risk of suffering from a viral infection or a disease associated with a viral infection.

[00168] In some embodiments, the present disclosure provides a composition of the present disclosure along with instructions for use. The instructions for use can be present in the kits as a package insert, in the labeling of the container of the kit or components thereof, or can be in digital form (e.g. on a CD-ROM, via a link on the internet). A kit can include one or more of a genome-targeting nucleic acid, a polynucleotide encoding a genome-targeting nucleic acid, a site-directed polypeptide, and/or a polynucleotide encoding a site-directed polypeptide. Additional components within the kits are also contemplated, for example, buffer (such as reconstituting buffer, stabilizing buffer, diluting buffer), and/or one or more control vectors.

Combination Therapies

[00169] In some embodiments, an engineered cell of the present disclosure or composition thereof is administered with at least one additional therapeutic agent. Any suitable additional therapeutic agent may be administered with an engineered cell provided herein, populations thereof, or compositions thereof. In some aspects, the additional therapeutic agent is selected from radiation, an ophthalmologic agent, a cytotoxic agent, a chemotherapeutic agent, a cytostatic agent, an anti-hormonal agent, an immunostimulatory agent, an anti-angiogenic agent, and combinations thereof.

[00170] In some embodiments, an engineered cell of the present disclosure or composition thereof is administered with a steroid. The administration of a steroid can prevent or mitigate the risk of a subject receiving the engineered cell(s) or composition thereof having an autoimmune reaction.

[00171] The additional therapeutic agent may be administered by any suitable means. In some embodiments, the engineered cells described herein, populations thereof, or compositions thereof and the additional therapeutic agent are administered in the same pharmaceutical composition, e.g. by infusion. In some embodiments, the engineered cells described herein and additional therapeutic agent are included in different pharmaceutical compositions.

[00172] The pharmaceutical composition may comprise one or more pharmaceutical excipients. Any suitable pharmaceutical excipient may be used, and one of ordinary skill in the art is capable of selecting suitable pharmaceutical excipients. Accordingly, the pharmaceutical excipients provided below are intended to be illustrative, and not limiting. Additional pharmaceutical excipients include, for example, those described in the Handbook of Pharmaceutical Excipients, Rowe et al. (Eds.) 6th Ed. (2009), incorporated by reference in its entirety.

[00173] Various modes of administering the additional therapeutic agents are contemplated herein. In some embodiments, the additional therapeutic agent is administered by any suitable mode of administration. Generally, modes of administration include, without limitation, intravitreal, subretinal, suprachoroidal, intraarterial, intradermal, intramuscular, intraperitoneal, intravenous, nasal, parenteral, topical, pulmonary, and subcutaneous routes.

[00174] In embodiments where the engineered cells provided herein and the additional therapeutic agent are included in different pharmaceutical compositions, administration of the engineered cells provided herein can occur prior to, simultaneously, and/or following, administration of the additional therapeutic agent.

Additional Embodiments

[00175] In some aspects, provided herein are engineered cells, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus; wherein the safe harbor locus is at any one or more of the sgRNA target loci selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, chr9:7970000-7980000, APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[00176] In some embodiments, expression of the at least one sequence encoding the transgene is operatively linked to an endogenous promoter.

[00177] In some embodiments, expression of the at least one sequence encoding the transgene is operatively linked to an exogenous promoter.

[00178] In some embodiments, the target locus is selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, and chr9:7970000-7980000.

[00179] In some embodiments, the target locus is chr11:128340000-128350000 or chr15:92830000-92840000.

[00180] In some embodiments, the target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[00181] In some embodiments, the safe harbor locus is the GS94 or GS102 integration site in Table 4.

[00182] In some embodiments, the exogenous promoter is an EF1a promoter.

[00183] In some embodiments, the engineered cell is a natural killer (NK) cell, an induced pluripotent stem cell (iPSC), a human pluripotent stem cell (HSPC), a T cell or a T cell progenitor.

[00184] In some embodiments, the transgene encodes a recombinant protein, a therapeutic agent, or a chimeric antigen receptor (CAR).

[00185] In some aspects, provided herein are compositions comprising the engineered cell described herein and a pharmaceutical excipient.

[00186] In some aspects, provided herein are guide ribonucleic acids (gRNA) for editing a cell at a safe harbor locus, wherein the gRNA comprises any one of SEQ ID NOS: 1-120.

[00187] In some aspects, provided herein are methods of editing a cell having chromosomal DNA, comprising inserting at least one sequence encoding a transgene within a safe harbor locus in the chromosomal DNA of the cell, wherein the safe harbor locus is at any one or more of the sgRNA target loci selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, chr9:7970000-7980000, APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[00188] In some embodiments, the target locus is selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, and chr9:7970000-7980000.

[00189] In some embodiments, the target locus is chr11:128340000-128350000 or chr15:92830000-92840000.

[00190] In some embodiments, the target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[00191] In some embodiments, the transgene encodes a recombinant protein, a therapeutic agent, or a chimeric antigen receptor (CAR).

[00192] In some embodiments, the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene.

[00193] In some embodiments, the cell is a T cell or T cell progenitor.

[00194] In some embodiments, the at least one sequence is inserted using homology-directed repair or homology-independent targeted insertion.

[00195] In some embodiments, the at least one sequence is inserted using one or more guide ribonucleic acids (gRNAs) and one or more Cas9 endonucleases, wherein the one or more gRNAs comprises any one of SEQ ID NOS: 1-120.

[00196] In some aspects, provided herein are ex vivo methods of obtaining an engineered cell or population thereof, comprising: obtaining a cell; genetically modifying the cell by inserting at least one sequence encoding a transgene within a safe harbor locus, wherein the safe harbor locus is at any one or more of the sgRNA target loci selected from: chr10:33130000-33140000, chr10:72290000-72300000, chr11:128340000-128350000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, chr9:7970000-7980000, APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTL, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, and TRIM28.

[00197] In some embodiments, obtaining the cell comprises: (i) collecting a tissue sample from a subject, (ii) isolating the cells from the tissue sample, and (iii) culturing the cells in vitro.

[00198] In some embodiments, the cell is a stem cell, a natural killer (NK) cell, an induced pluripotent stem cell (iPSC), a human pluripotent stem cell (HSPC), a T cell or a T cell progenitor.

[00199] In some embodiments, the at least one sequence is inserted using homology-directed repair or homology-independent targeted insertion.

[00200] In some embodiments, the genetically modifying in step (b) comprises contacting the cell with one or more guide ribonucleic acids (gRNAs), the at least one sequence, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within the safe harbor locus and wherein the one or more gRNAs comprises any one of SEQ ID NOS: 1-120.

[00201] In some embodiments, the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene.

[00202] In some aspects, provided herein are methods of treating a subject having or at risk of having a disease, comprising administering to the subject an effective amount of the engineered cell described herein.

[00203] In some aspects, provided herein are methods of treating a subject having or at risk of having a disease, comprising: conducting the method described herein; and administering to the subject an effective amount of a composition comprising the cell or a population thereof.

[00204] In some embodiments, the disease is cancer.

EXAMPLES

[00205] The following are examples of methods and compositions of the invention. It is understood that various other embodiments may be practiced, given the general description provided herein.

[00206] Below are examples of specific embodiments for carrying out the present invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.

[00207] The practice of the present invention will employ, unless otherwise indicated, conventional methods of protein chemistry, biochemistry, recombinant DNA techniques and pharmacology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., T.E. Creighton, Proteins: Structures and Molecular Properties (W.H. Freeman and Company, 1993); A.L. Lehninger, Biochemistry (Worth Publishers, Inc., current edition); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press, Inc.); Remington's Pharmaceutical Sciences, 18th Edition (Easton, Pennsylvania: Mack Publishing Company, 1990).

Example 1: Identification of Knock-In Loci in Coding Regions

[00208] The objective of the experiments in this set of examples was to identify T-cell knock-in (KI) loci in coding regions outside of the T Cell Receptor Alpha Constant (TRAC) locus. To select candidate loci, the criteria and requirements shown in Table 1 were used.

Table 1: Criteria used for identification of candidate knock-in loci within coding regions

Criteria: KI efficiency

Detailed Requirements:

• High expression at d2 (day 2)

• Similar expression dynamics as TRAC (stable expression upon activation)

Datasets Considered for identification of loci:

• Knock in efficiency data on ~90 genes derived from Roth, T. L., et al. 2019. (Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561.)

• Bulk RNA-seq data at d0, d2, d3 and d4:

■ (i) publicly available data;

■ (ii) data derived from Roth, T. L., et al. 2019. (Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561.); and

Criteria: Safety

Detailed Requirements:

cytolytic activity or signaling of T cells

Be careful with tumor suppressors and oncogenes

Datasets Considered for identification of loci:

• Achilles’ project on cancer cell lines data

• Annotated oncogenes and tumor suppressors from TCGA

• Annotated functions in immune cells

[00209] Generally, the requirement for candidate KI loci within the coding regions was that the coding gene is not essential for T cell function or that the knock-out of that gene would be beneficial to chimeric antigen receptor T cell (CAR T) functions.

[00210] As shown in Table 1, some of the data utilized, for comparison purposes, to identify candidate KI loci was sourced, in part, from Roth, T. L., et al. 2019 (Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561), the relevant disclosures of which are herein incorporated by reference in their entirety. A summary of data in the Roth, T.L. et al. paper is shown in FIG. 3.

[00211] FIG. 4A and FIG. 4B show the processed RNA-seq data, showing that samples cluster based on activation status and by cell type. Donor 4 (AKI4) was identified as an outlier, as it clustered differently than the others, and was removed from the remaining analysis. The transcript expression data from RNA-seq experiments on the 90 genes cited in the Roth, T.L. et al. paper was correlated to the transcript expression data in the Roth, T.L. et al. paper (see FIG. 5A and FIG. 5B).

[00212] FIG. 6 shows the processed ATAC-seq data, which focused on the 10 kb area around the transcription start site (TSS). Again, donor 4 (AKI4) was identified as an outlier, based on expression profile in both CD4 and CD8 cells, and was removed from the remaining analysis. The results of the remaining ATAC-seq analysis are shown in FIGS. 7A-8. The data in FIG. 7A for TSS enrichment scores also indicated that all libraries were high quality. The data on open chromatin regions around the TRAC locus revealed the highest signal near exon 3 rather than exon 1 (FIG. 8).

Determination of Knock-In Efficiency:

[00213] The KI efficiency results (obtained using the model described infra) based on the 5 donors, 2 gRNAs, and 2 replicates per gRNA from the Roth, T.L. et al. paper are shown in FIG. 9 and FIG. 10.

[00214] A linear model was built to estimate KI efficiency of approximately 90 genes using RNA-seq and ATAC-seq data. The linear model captured 33% of the variation in the data (i.e. R²=0.33). FIG. 11 shows KI efficiency versus day 2 (d2) RNA-seq data, day 4 (d4) RNA-seq data, and d2 ATAC-seq data. The linear model was applied to the remaining candidate genes to estimate their KI efficiency.
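The kind of model described above can be sketched as an ordinary least squares fit with an intercept, followed by an R² calculation and a prediction for a gene outside the training set. This is a minimal illustration only: the specification does not disclose the model's exact form, so the choice of three predictors (d2 RNA-seq, d4 RNA-seq, d2 ATAC-seq), the function names, and every number below are hypothetical.

```python
# Minimal sketch: ordinary least squares with an intercept, plus R^2.
# Feature choice, function names, and all numbers are hypothetical,
# not data or code from the study.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no check is made)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, target):
    """Return [intercept, b1, b2, ...] by solving the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, feature_row):
    """Predicted response for one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


def r_squared(coefs, features, target):
    """Fraction of variance in `target` explained by the fitted model."""
    mean_y = sum(target) / len(target)
    ss_res = sum((y - predict(coefs, f)) ** 2 for f, y in zip(features, target))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


# Hypothetical training rows: (d2 RNA-seq, d4 RNA-seq, d2 ATAC-seq signal)
train_x = [(5.1, 4.8, 2.0), (7.3, 7.0, 3.1), (3.2, 3.9, 1.2),
           (6.4, 6.1, 2.7), (4.0, 4.4, 2.9), (8.1, 7.7, 3.5)]
train_y = [12.0, 25.0, 6.0, 18.0, 15.0, 27.0]  # hypothetical % KI efficiency

coefs = fit_ols(train_x, train_y)
print("R^2 on training genes:", round(r_squared(coefs, train_x, train_y), 3))
print("Predicted KI for a new gene:", round(predict(coefs, (6.0, 5.8, 2.5)), 1))
```

An R² of 0.33, as reported above, would mean that such a fit explains about one third of the variance in measured KI efficiency, so predictions of this kind are best read as a coarse ranking aid rather than precise estimates.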

[00215] The candidate coding loci were selected by first ranking all genes in the pooled data sets by predicted KI efficiency, using RNA-seq expression data from d2 (day 2). Candidate coding loci were required to be stably expressed during T cell activation (e.g., <=2-fold expression change relative to day 0 (d0)). The candidate loci also had to be accessible based on ATAC-seq data. Using that selection process and those requirements, 16 well-characterized coding genes (with known functions) were selected as candidate genes. The knockout of these 16 genes would confer a benefit to the function of CAR T cells (e.g. B2M, CD5, SMAD2, PTPRC, CD3E). An additional 12 coding genes with high predicted KI efficiency and no apparent essential function (inert coding genes) were selected as candidate genes. (See Table 2).
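The selection logic described above (rank by predicted KI efficiency, then require stable expression and accessibility) can be sketched as a simple filter. This is an illustration under stated assumptions: the 2-fold limit is applied symmetrically (up or down) relative to day 0, accessibility is treated as a precomputed yes/no call, expression values are assumed to be positive, and the gene names and values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    predicted_ki: float   # predicted KI efficiency from the model (hypothetical)
    expr_d0: float        # expression at day 0 (assumed > 0)
    expr_by_day: dict     # expression at later days, e.g. {2: ..., 4: ...}
    accessible: bool      # ATAC-seq accessibility call (assumed precomputed)


def is_stably_expressed(gene, max_fold=2.0):
    """True if every later time point is within max_fold of day 0, up or down."""
    for value in gene.expr_by_day.values():
        fold = max(value, gene.expr_d0) / min(value, gene.expr_d0)
        if fold > max_fold:
            return False
    return True


def select_candidates(genes, max_fold=2.0):
    """Rank by predicted KI efficiency, then keep accessible, stable genes."""
    ranked = sorted(genes, key=lambda g: g.predicted_ki, reverse=True)
    return [g.name for g in ranked
            if g.accessible and is_stably_expressed(g, max_fold)]


# Hypothetical genes; names and values are placeholders, not study data.
genes = [
    Gene("GENE_A", 30.0, 100.0, {2: 150.0, 4: 120.0}, True),
    Gene("GENE_B", 40.0, 100.0, {2: 350.0}, True),          # 3.5-fold change
    Gene("GENE_C", 25.0, 80.0, {2: 60.0, 4: 45.0}, True),
    Gene("GENE_D", 35.0, 50.0, {2: 55.0}, False),           # not accessible
]
print(select_candidates(genes))
```

The functional criteria (a knockout that benefits CAR T cells, or no apparent essential function) are not encoded here; they depend on annotation review rather than a numeric threshold.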

Table 2: KI Candidate Loci selected

Example 2: Identification of Knock-In Loci in Gene Deserts

[00216] The criteria for selection of candidate loci in non-coding regions or gene deserts are summarized in Table 3.

Table 3: Criteria used for identification of candidate knock-in loci within non-coding regions

[00217] To select candidate regions within the non-coding regions, the inventors started by looking in highly accessible regions of the genome (10 kb windows). The most accessible regions overlapped with (i) annotated protein coding genes (>50% accessible regions), (ii) pseudogenes and noncoding RNAs (~20% accessible regions), and (iii) enhancer/regulatory regions (~20% accessible regions). The candidate genes were required to be < 10 kb from the coding regions. A few regions that overlapped with long intergenic noncoding RNAs (lincRNAs) but did not have apparent function in T cells were also considered.
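The window-based search described above can be illustrated with a short sketch that sums an accessibility signal over fixed 10 kb windows and ranks the windows. The specification does not state how windows were defined, so non-overlapping windows aligned to multiples of 10,000 are an assumption, as are the function names and the signal values in the example.

```python
from collections import defaultdict

WINDOW = 10_000  # 10 kb windows, as in the text


def window_signal(peaks):
    """Sum signal per (chrom, window_start).

    `peaks` holds (chrom, position, signal) tuples. Windows are assumed to be
    non-overlapping and aligned to multiples of WINDOW."""
    totals = defaultdict(float)
    for chrom, pos, signal in peaks:
        start = (pos // WINDOW) * WINDOW
        totals[(chrom, start)] += signal
    return totals


def top_windows(peaks, n=3):
    """Return the n windows with the highest summed signal."""
    totals = window_signal(peaks)
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(chrom, start, start + WINDOW, total)
            for (chrom, start), total in ranked[:n]]


# Positions fall inside windows named in this document; the signal values
# are hypothetical and chosen only to make the ranking visible.
peaks = [
    ("chr11", 128341500, 4.0),
    ("chr11", 128348000, 3.5),
    ("chr15", 92835000, 5.0),
    ("chr3", 59451000, 1.0),
]
for chrom, start, end, total in top_windows(peaks, n=2):
    print(f"{chrom}:{start}-{end}\t{total}")
```

In practice each top window would then be checked against gene, pseudogene, noncoding RNA, and regulatory annotations before being kept as a candidate.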

[00218] Examples of other criteria that have been used in the art for selection of SHS are described, for example in Pellenz, S., et al. (2019). New human chromosomal sites with “safe harbor” potential for targeted transgene insertion. Human gene therapy, 30(7), 814-828, the relevant disclosures of which are herein incorporated by reference in their entirety.

[00219] FIG. 12 is a plot showing the normalized ATAC-seq data for a top candidate non-coding region.

[00220] Using the above-described selection process/requirement, 11 candidate regions in gene deserts were selected. In total, there were 39 KI candidate loci, including those in coding and non-coding regions that would be evaluated as safe harbor loci (predicted loci).

Example 3: Experimental Evaluation of Predicted Loci

[00221] Materials

• 120 sgRNAs were ordered from Synthego, 3 sgRNAs per region, and editing efficiency was assessed by next-generation sequencing (NGS).

■ 87 targeted coding genes

■ 33 targeted gene deserts

• 189 constructs were synthesized by Genscript

■ Homology arm: 450bp each

■ eGFP was used as the reporter

■ Vector backbone: pUC57-Kan backbone

■ 78 with endogenous promoters

■ 111 with EF1a-HTLV promoter

• Constructs were sequence verified by Genscript and the majority were internally verified by sequencing. The sgRNA sequences are provided in Table 4 and the construct sequences are provided in Table 5.

[00222] Methods

[00223] To evaluate the effectiveness of the predicted loci, the following were used:

[00224] (87 endogenous constructs + 120 exogenous constructs) × 2 = 414 wells + controls = 460

[00225] 3 donors, 2 replicates at 3 time points: wk1 (d6=day 6), wk3 (d21=day 21) and wk4 (d28=day 28)

[00226] Controls

[00227] DNA-only (no ribonucleoprotein (RNP)): episomal expression (n=8)

[00228] DNA + non-targeting guides + RNP: to check if DNA delivery is more efficient with RNP (n=8)

[00229] pMax GFP was measured to check if transfection is ok (n=3).

[00230] WT cells with no electroporation

[00231] WT cells with electroporation

[00232] Controls were spread out across plates.

[00233] Cells were electroporated without RNP/HDR to check if cell counting makes sense.

[00234] 2 more donors were added and changed to 3 replicates at 1 time point: wk1 (d6)

[00235] FACS panel: GFP, TCR, Zombie

[00236] Washed every 3 wells.

[00237] The secondary data collected included: RNA-seq data: at d0, d2, and d4 (to determine stable expression) and ATAC-seq data: at d2 (to identify activated cells). The methods used for conducting the RNA-seq and ATAC-seq experiments are in Roth, T. L., et al. 2019. (Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561), which is herein incorporated by reference in its entirety.

[00238] Automated gating was performed on FACS data, using .fcs files. For constructs with endogenous promoters, all 5 donors and all 3 time points were used. For constructs with the EF1a promoter, 3 donors (donors 1-3) and 2 time points (wk3 and wk4) were used.

[00239] To summarize the data, wells with a %KI efficiency (% GFP high cells) <=10% (likely a gating error, experimental error or minimal integration) were removed. Loci were then ranked by median expression among donors and median %KI efficiency among donors.

[00240] Loci significance was calculated using robust rank aggregation, which was the methodology used for consistent ranking of loci among donors and time points. Methods for conducting robust rank aggregation are known in the art. (See, for example, Kolde, R., et al. (2012). Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics (Oxford, England), 28(4), 573-580, the disclosure of which is herein incorporated by reference in its entirety).
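The summarization steps above can be sketched as follows: drop wells at or below 10% KI efficiency, take the per-donor median for each locus, convert each donor's ordering to normalized ranks, and compute a robust rank aggregation (RRA) score in the manner of Kolde et al. This is a simplified illustration, not the analysis code used in the study: only %KI efficiency is ranked (the text also ranks by median expression), a locus with no surviving wells in a donor is assigned the worst rank (an assumption), and the well records are hypothetical.

```python
from math import comb
from statistics import median


def filter_wells(wells, min_ki=10.0):
    """Drop wells whose %KI efficiency is <= min_ki."""
    return [w for w in wells if w["ki"] > min_ki]


def median_ki_by_donor(wells):
    """Return {donor: {locus: median %KI efficiency}}."""
    grouped = {}
    for w in wells:
        grouped.setdefault(w["donor"], {}).setdefault(w["locus"], []).append(w["ki"])
    return {d: {l: median(v) for l, v in loci.items()} for d, loci in grouped.items()}


def normalized_ranks(scores):
    """Rank loci from best (highest score) to worst, scaled to (0, 1]."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {locus: (i + 1) / len(ordered) for i, locus in enumerate(ordered)}


def rho_score(ranks):
    """RRA rho score for one locus; smaller means more consistently top-ranked.

    For the k-th smallest of n normalized ranks, compute the probability that
    the k-th order statistic of n uniform draws is at most that value, take the
    minimum over k, and apply a Bonferroni-style correction by multiplying by n."""
    r = sorted(ranks)
    n = len(r)
    best = 1.0
    for k, x in enumerate(r, start=1):
        # Beta(k, n-k+1) CDF at x, written as a binomial upper tail.
        p = sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
        best = min(best, p)
    return min(1.0, best * n)


def aggregate(wells):
    """Return {locus: rho score} across donors."""
    per_donor = {d: normalized_ranks(s)
                 for d, s in median_ki_by_donor(filter_wells(wells)).items()}
    loci = set().union(*per_donor.values())
    # Assumption: a locus absent from a donor's list gets the worst rank, 1.0.
    return {l: rho_score([ranks.get(l, 1.0) for ranks in per_donor.values()])
            for l in loci}


# Hypothetical wells; locus names and values are placeholders.
example_wells = [
    {"donor": 1, "locus": "LOCUS_A", "ki": 40.0},
    {"donor": 1, "locus": "LOCUS_A", "ki": 44.0},
    {"donor": 1, "locus": "LOCUS_B", "ki": 30.0},
    {"donor": 1, "locus": "LOCUS_C", "ki": 8.0},   # removed by the filter
    {"donor": 2, "locus": "LOCUS_A", "ki": 38.0},
    {"donor": 2, "locus": "LOCUS_B", "ki": 41.0},
    {"donor": 2, "locus": "LOCUS_C", "ki": 15.0},
]
for locus, rho in sorted(aggregate(example_wells).items(), key=lambda kv: kv[1]):
    print(locus, round(rho, 3))
```

Because the score depends only on ranks, a donor with systematically higher or lower signal does not dominate the aggregate, which matches the stated goal of consistent ranking among donors and time points.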

[00241] Results

[00242] The results from the analysis of controls and experimental groups are shown in FIGS. 13A-25.

[00243] The results of the pMax GFP control experiments revealed that donor 5 had a significantly higher cell count than the other donors. Generally, the GFP high readings decreased over time, as expected (FIGS. 13A and 13B).

[00244] Experiments to analyze non-targeting controls for changes in episomal GFP expression revealed that episomal GFP expression decreased to less than 0.05 over time (FIGS. 14A and 14B). Some of the outliers were attributed to gating errors, which were most severe in donor 3. WT control experiments also exhibited <0.05 GFP high readings, as expected (FIG. 15).

[00245] As shown in FIG. 16 and FIG. 17, there was a more consistent trend among donors when using sgRNA5 to target the B2M safe harbor locus.

[00246] A comparison of GFP expression trends between sgRNA79 and sgRNA83 for TRAC insertion and expression with an endogenous reporter revealed that donor 2 trends were different between sgRNA79 and sgRNA83, while donor 4 trends were the same (FIG. 18 and FIG. 22). Some potential reasons for the observed variation between replicates and donors include:

[00247] (1) Manual handling of a large number of plates (e.g., FIG. 26):

[00248] Edge effects

[00249] Electroporation errors

[00250] (2) Inherent differences among donors (e.g., FIG. 27):

[00251] There were consistent differences with certain wells throughout the assay (spanning multiple weeks)

[00252] There were variable cell numbers between donors

[00253] (3) Gating errors (e.g. FIG. 28):

[00254] 1 peak vs >2 peaks in GFP signals

Example 4: Ranking Loci by Expression and KI Efficiency

[00255] The loci were ranked based on the mean fluorescence intensity (MFI) of the reporter gene, GFP, and KI efficiency.

[00256] FIG. 29 shows GFP MFI and KI efficiency for all significant loci having endogenous promoters. The B2M locus reported the highest GFP MFI. The TRAC locus was among the top 10 MFI. The highest GFP MFI readings were reported in week 1 and reduced slightly in weeks 3 and 4. The results also showed most top loci have more than 1 sgRNA. With regard to KI efficiency, the SOCS1 locus reported the highest KI efficiency. Overall, KI efficiency showed greater variation among donors than GFP MFI.

[00257] FIG. 30 shows GFP MFI and KI efficiency for all significant loci having exogenous promoters (e.g. EF1a promoters). The TRAC locus reported the highest GFP MFI. There was comparable expression between weeks 3 and 4. The results also showed most top loci have more than 1 sgRNA. With regard to KI efficiency, the SOCS1 locus reported the highest KI efficiency. Overall, KI efficiency showed less variation when driven by the exogenous promoters than when driven by endogenous promoters.

[00258] FIGS. 31A, 31B, and 31C show GFP MFI and KI efficiency for all significant loci having endogenous and exogenous (e.g., EF1a) promoters. The results showed that expression driven by EF1a promoters is about 10-fold higher than expression driven by endogenous promoters (FIG. 31A). With regard to KI efficiency, the SOCS1 locus reported the highest KI efficiency. At weeks 3 and 4, KI efficiency was higher when driven by the exogenous promoters than when driven by endogenous promoters.

Example 5: Evaluation of TCR expression in candidate noncoding sites

[00259] This experiment was conducted to identify target loci and integration sites that enable high transgene expression without disrupting the TCR. Some gene deserts (e.g. gene deserts 2, 3, 5, and 6) were identified as having high transgene expression (FIG. 32A).

[00260] To evaluate and identify preferred knock-in sites, TCR expression was measured following circuit cassette knock-in of PrimeR (Prime Receptor (Myc)). Myc denotes an N-terminal Myc epitope tag to facilitate detection of surface-expressed prime receptor. Briefly, CD3-CD28 Dynabead-activated T cells were electroporated with sgRNA/Cas9 RNPs targeting the indicated sites (see FIG. 32B and 33A), as well as HDRTs with homology arms directing HDR-mediated integration into the indicated sites. At day 6 post-electroporation, cells were stained with anti-TCRalpha/beta and anti-myc antibodies and analyzed on an Attune NxT flow cytometer. As shown in FIG. 32B, the TCR expression was maintained at the GS94 and GS102 candidate integration (knock-in) sites. Only 2/3 of the cells that had circuit cassette knock-in of PrimeR at the GS79 (TRAC) locus maintained TCR expression. These results indicated that the GS94 and GS102 sites showed better potential for TCR stimulation.

[00261] The percentage of cells showing effective knock-in (based on measurements of PrimeR) was 36% ± 4% when using the GS94 integration site, as compared to 32% ± 5% when using the TRAC integration site. These results revealed integration sites, including GS94, that supported reproducibly high circuit cassette knock-in rates. See FIG. 33A.

Example 6: Evaluation of GS94 circuit expression and function

[00262] GS94 is a candidate integration site located on the distal q arm of chromosome 11. It is within 180-350 kb of the promoters for ETS1 and FLI1 (FIG. 33B); however, that distance is considered low-risk for integration vector gene therapy. The circuit expression and function potential of the GS94 site was evaluated.

[00263] T cells underwent circuit cassette integration with PrimeR at the GS79 (TRAC) integration site and the GS94 integration site. The cells were cocultured with K562 CD19 cells for 48 hours and then the PrimeR-induced CAR MFI was compared to the PrimeR MFI.

[00264] Briefly, T cells generated as described in the Example above were cocultured with K562 CD19+/MSLN- cells at day 7 post-electroporation. MSLN (mesothelin) is a gene that is overexpressed in human pancreatic cancer. Cells were then stained with anti-FLAG antibody 48 h post initiation of coculture and analyzed on an Attune NxT flow cytometer. The results revealed that GS94 yields superior CAR induction with high PrimeR expression following the 48-hour coculture with the K562 CD19 cells. See FIG. 34A. GS94 resulted in prime antigen-dependent CAR expression that was approximately two-fold higher than the expression in several other candidate integration sites as well as the TRAC integration site. Additionally, on average, the prime receptor surface expression level was no less than 50% of the expression level when using the TRAC integration site.

Cytotoxicity and Cytokine Secretion

[00265] To evaluate the effect of the candidate integration site on cytotoxicity and cytokine secretion, T cells that had undergone circuit cassette integration with PrimeR at the GS79 (TRAC) integration site and the GS94 integration site were cocultured with K562 CD19/MSLN cells for 48 hours. MSLN is a gene that is overexpressed in human pancreatic cancer. The cells were treated at a 1:1 effector:target cell ratio (1:1 E:T). Briefly, T cells generated as described above were cocultured with K562 CD19+/MSLN+ cells at day 7 post-electroporation. At 48 h post initiation of coculture, supernatants were collected and analyzed via Luminex for cytokine levels. The cytokines measured were IL-2, IFNg, and TNF. Cytotoxicity was analyzed by measuring luciferase activity of remaining target cells after 48 h. Each of the data points in FIG. 34B represents two replicates and the lines represent the range of cytotoxicity for the replicates. As shown in FIG. 34B, the GS94 integration site resulted in superior cytotoxic ability and cytokine secretion following the 48-hour coculture with K562 CD19/MSLN cells.

Prime-independent cytotoxicity

[00266] To compare the effect of the candidate integration site on cytotoxicity versus prime-independent cytotoxicity, T cells that had undergone circuit cassette integration with PrimeR at the GS79 (TRAC) integration site and the GS94 integration site were cocultured with K562 CD19+/MSLN+ cells or K562 CD19-/MSLN+ cells (“K562 MSLN”) at day 7 post-electroporation for 48 hours. E:T cell ratios of 0.3, 1.0, and 3.0 were tested. See FIG. 35A and FIG. 35B. At 48 h post initiation of coculture, cytotoxicity was analyzed by measuring luciferase activity of remaining target cells. As shown in FIG. 35B, the GS94 integration site resulted in equivalent cytotoxic potential to the TRAC integration site and there was no prime-independent cytotoxicity.

Prime-independent cytokine secretion

[00267] To compare the effect of the candidate integration site on cytokine secretion versus prime-independent cytokine secretion, T cells that had undergone circuit cassette integration with PrimeR at the GS79 (TRAC) integration site and the GS94 integration site, generated as described above, were cocultured with K562 CD19+/MSLN+ cells. A group that had target cells only (E:T = 0; Targets only) was compared to a group with an E:T cell ratio of 1. Following the 48 h coculture with the K562 CD19+/MSLN+ cells, supernatant was collected and analyzed via Luminex for cytokine levels to measure secretion of IL-2, IFNg and TNF cytokines. See FIG. 36A and FIG. 36B. As shown in FIG. 36B, the GS94 integration site resulted in equivalent cytokine secretion to the TRAC integration site and there was no prime-independent secretion of IL-2, IFNg or TNF.

Prime-independent CAR expression

[00268] To evaluate the effect of the candidate integration site on prime-independent CAR expression, T cells that had undergone circuit cassette integration with PrimeR at the GS79 (TRAC) integration site, the GS94 integration site, and the GS102 integration site were cultured in vitro for 32 days. The cells were treated with repetitive CD3/CD28 stimulation at days 5, 12, 19 and 28 of the experiment. On Day 16, the cells were evaluated for CAR expression using a flow cytometry assay. As shown in FIG. 37, T cell activation through the TCR did not result in PrimeR-independent CAR expression from circuit cassette integration at the candidate integration sites.

[00269] T cells generated as described above were cultured in 96-well plates, with T cell growth medium being exchanged every 2 days. At days 5, 12, 19 and 28, T cells were stimulated with 1 : 1 CD3/CD28 Dynabeads. Cells were analyzed for PrimeR expression by myc epitope tag staining, and for CAR expression by FLAG epitope tag staining at the indicated time points. Flow analysis was performed on an Attune NxT flow cytometer.

Example 7: Evaluating stability of prime receptor expression over several weeks

[00270] To evaluate the effect of the candidate integration sites on stable (sustained) expression of PrimeR, T cells that had undergone circuit cassette integration with PrimeR at the integration sites indicated in FIG. 38 were cultured in vitro for 32 days. Briefly, T cells generated as described above were cultured in 96-well plates, with T cell growth medium being exchanged every 2 days. At days 5, 12, 19 and 28, T cells were repetitively stimulated with 1:1 CD3/CD28 Dynabeads. Flow cytometry assays were run on days 16 and 32 using an Attune NxT flow cytometer. The cells were analyzed for PrimeR expression by myc epitope tag staining. As shown in FIGS. 38A and 38B, the GS94 integration site resulted in stable PrimeR expression over at least a 4-week period.

Example 8: Evaluation of on-target editing efficiency

[00271] To evaluate the on-target editing efficiency of candidate knock-in sites, the iGUIDE-Seq assay was used. The methods used for conducting the iGUIDE-Seq assay are illustrated in FIG. 39A and provided in Nobles et al., Genome Biology (2019), which is hereby incorporated by reference in its entirety. As shown in FIG. 39B, the GS94 integration site had the highest on-target editing efficiency of the evaluated candidate integration sites. As shown in FIG. 39C, GS94 resulted in no putative off-target editing as observed with two donors.

Example 9: Evaluation of GS94 knock-in

Methods

[00272] Elevation prediction: Computational predictions of potential off-target sites for GS94 were performed using Elevation-search (algorithm described in Listgarten et al. 2018. Prediction of off-target activities for the end-to-end design of CRISPR guide RNAs. Nat Biomed Eng 2, 37-48; software obtained from https://github.com/Microsoft/Elevation). All sites identified by Elevation-search were subjected to analysis using rhAmpSeq.

[00273] rhAmpSeq: 49 candidate off-target sites for GS94 identified by iGUIDE or the Elevation prediction algorithm and the GS94 target site were characterized by rhAmpSeq (Integrated DNA Technologies, Inc.). This targeted amplification enables NGS-based quantification of the editing occurring at numerous sites simultaneously. Genomic DNA from T cells from at least 2 donors that had been treated singly with each of the following 7 guides: GS84, GS94, GS95, GS96, GS102, GS108, and GS138 was isolated with the GenFind V3 DNA purification system (Beckman Coulter). Two separate rhAmpSeq amplification pools were used to cover the 50 loci, and the procedure was performed as recommended by Integrated DNA Technologies for each of the samples. The rhAmpSeq libraries were sequenced on a MiniSeq with a Mid Output Kit (300-cycles) (Illumina). The CRISPResso2 algorithm (https://github.com/pinellolab/CRISPResso2) was used to determine the percentage of insertions and deletions at each of the amplified loci. Statistical significance (FDR-adjusted p-value < 0.001) using a chi-squared test was only observed at the GS94 site.

[00274] RNA-seq: To evaluate changes induced by GS94 integration at the transcriptional level, a CD19/MSLN circuit was integrated at the GS79 (TRAC), GS94, and GS102 integration sites. On day 6 post-integration, 1e6 edited cells were sorted using a BD FACSAria based on transgene expression. RNA was isolated from sorted T cells with the RNeasy kit (Qiagen). Purified RNA was converted into an NGS library using the TruSeq RNA Library Prep Kit v2 (Illumina). Libraries were sequenced on either the NovaSeq 6000 or NextSeq 550 instruments (Illumina). The STAR 2.7.3a aligner (Dobin A. et al. Bioinformatics. 2013. 29: 15-21) was used to align the RNA-seq data against the reference human GRCh38 transcriptome and to obtain gene-level read counts. edgeR (Robinson MD et al. Bioinformatics. 2010. 26: 139-140) was used to compute differential expression, combining data across both donors. The only genes within 300 kb of the GS94 site, ETS1 and FLI1, were not differentially expressed in cells with integration at the GS94 integration site compared to cells with integration at either of the other two loci. At an FDR-adjusted p-value cutoff of 0.01, the number of differentially expressed genes was minimal (<100 genes genome-wide).

[00275] Cytokine-independent growth assay: To evaluate the safety of the primary T cells with GS94 locus KI, a cytokine-independent growth assay was performed to evaluate the potential for oncogenic transformation. Briefly, primary human T cells that had undergone CD19/MSLN circuit cassette integration at the GS94 locus were thawed and recovered overnight. 1x10⁶ cells were then seeded in one well of a 24-well G-Rex plate and cultured for 5 days in medium with or without cytokines. Cell number and viability were recorded at days 0, 3 and 5. As a positive control, 1x10⁶ Jurkat cells were cultured in the medium without cytokines in parallel. As shown in FIG. 43, while GS94 KI T cells maintained good viability and total cell count when cultured with cytokine, the viability of GS94 KI T cells drastically decreased over the course of 5 days when cultured without cytokine and there were no viable cells left on day 5. The positive control Jurkat cells maintained good viability and expansion without cytokine throughout the assay. Taken together, these data show that GS94-edited primary human T cells still depend on exogenous cytokine for growth, survival and expansion; therefore, there is no concern for cellular transformation.
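The per-site significance test described above can be sketched as a chi-squared test on indel versus non-indel read counts, followed by FDR adjustment across sites. The specification does not state what the edited sample was compared against or which FDR procedure was used, so the comparison to a control sample and the Benjamini-Hochberg adjustment are assumptions here, and all read counts are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2_pvalue(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction) on [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return erfc(sqrt(chi2 / 2.0))  # survival function of chi-squared with 1 df


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical read counts per site:
# (indel reads edited, total reads edited, indel reads control, total reads control)
sites = {
    "on_target": (4200, 5000, 10, 5000),
    "off_target_1": (12, 5000, 10, 5000),
    "off_target_2": (8, 5000, 9, 5000),
}
names = list(sites)
raw = [chi2_2x2_pvalue(e_indel, e_total - e_indel, c_indel, c_total - c_indel)
       for e_indel, e_total, c_indel, c_total in sites.values()]
for name, q in zip(names, benjamini_hochberg(raw)):
    print(name, "significant" if q < 0.001 else "not significant")
```

With 50 loci tested per guide, adjusting for multiple comparisons matters: a per-site threshold alone would be expected to flag some sites by chance.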

Results

[00276] The specificity of CRISPR reagents (e.g., SpCas9 complexed with sgRNA) targeting candidate loci including GS94 was evaluated by iGUIDE-seq (FIG. 40). GS94-targeting CRISPR RNP showed the highest percentage of iGUIDE-seq oligo cassette trapping events of all candidates evaluated, and the control sgRNA sequences from the iGUIDE-seq paper showed similar specificities to what was reported in the original publication, suggesting that the assay performed as expected.

[00277] Putative off-target sites were taken from the iGUIDE-seq output, which already suggested that the putative sites were spurious. Additional target sites were predicted by a computational approach (Elevation software package). rhAmp-seq was used to prepare high-throughput sequencing libraries for each of the putative off-target sites, and the method was applied to DNA samples from T cells electroporated with CRISPR RNPs targeting the candidate target sites. The resulting NGS data were processed with CRISPResso2 software, and the frequency of insertions and deletions (indels) was taken as an indication of CRISPR cleavage activity, as is common in the field. T cells electroporated with GS94-targeting CRISPR RNP showed no greater frequency of indels at the set of putative off-target sites than T cells treated with CRISPR RNP targeting other sites, consistent with the GS94-targeting CRISPR RNP having no consequential or detectable off-target activity, and therefore being the most specific out of the set evaluated (FIG. 41).
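The comparison of indel frequencies between treated and control samples can be illustrated as follows. This is a sketch under stated assumptions, not the CRISPResso2 pipeline itself: the function names, the site labels, the read counts, and the 0.1% excess threshold (`min_excess`) are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

ReadCounts = Tuple[int, int]  # (reads with indels, total aligned reads)


def indel_frequency(modified_reads: int, total_reads: int) -> float:
    """Fraction of aligned reads carrying an insertion or deletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return modified_reads / total_reads


def sites_with_excess_indels(treated: Dict[str, ReadCounts],
                             control: Dict[str, ReadCounts],
                             min_excess: float = 0.001) -> List[str]:
    """Return sites where the treated indel frequency exceeds the control
    frequency by more than `min_excess` (an assumed, illustrative threshold)."""
    flagged = []
    for site, (mod, tot) in treated.items():
        c_mod, c_tot = control[site]
        excess = indel_frequency(mod, tot) - indel_frequency(c_mod, c_tot)
        if excess > min_excess:
            flagged.append(site)
    return sorted(flagged)


# Hypothetical read counts at two putative off-target sites.
treated_counts = {"OT1": (5, 10_000), "OT2": (300, 10_000)}
control_counts = {"OT1": (4, 10_000), "OT2": (6, 10_000)}
flagged_sites = sites_with_excess_indels(treated_counts, control_counts)
```

An empty result for a given RNP would correspond to the situation described above, in which the treated sample shows no greater indel frequency than the comparison samples at any putative off-target site.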

[00278] Potential effects of transgene integration at the GS94 site on the regulation of the T cell transcriptome were evaluated by knocking in a large cassette to the site, growing T cells for several days, sorting cells expressing the transgene within the cassette, and then collecting RNA from the cells. RNA-seq libraries were prepared and sequenced, and analysis of the resulting Illumina sequencing data revealed no biologically or statistically significant differences in expression of any genes within 300 kb of the GS94 site in cells with integrations at GS94 compared to cells with integration at TRAC or the GS102 sites (FIG. 42). Furthermore, other gene expression differences that reached statistical significance were minimal in number and in effect size, consistent with them being noise in the comparison.

[00279] To assess whether transgene integration at GS94 could confer a transformed phenotype, cells with integrations at the GS94 site were cultured with and without cytokines in vitro. Cells remained alive and viable with cytokine addition, but died without cytokine supplementation and lost their viability (FIG. 43). The positive control Jurkat cells remained viable and proliferated. Overall, this indicates that integration of a transgene at GS94 does not confer capacity for cytokine-independent growth, which is a hallmark of T cell transformation.

Example 10: In vivo Insertion of a CAR expressing cassette

[00280] In vivo efficacy of T cells with a transgene cassette expressing a CAR recognizing a tumor antigen, or a CAR recognizing a tumor antigen under control of a priming receptor recognizing an antigen in the anatomical vicinity of the tumor, is assessed against human tumor cells such as K562 engineered to express the CAR antigen or to express antigens recognized by both the priming receptor and the CAR. Tumor cells (e.g., 1e6) are subcutaneously injected into the flank of NSG mice (Jackson Laboratories). Tumor growth is assessed by dimensional measurement by calipers every 2-4 days. When the tumor volume reaches ~100 cubic mm, mice are intravenously injected with 5e6 T cells with a CAR or prime-CAR circuit cassette integrated at a specific site by CRISPR-mediated insertion, or with T cells engineered with CRISPR RNP alone, or with PBS alone as a sham injection. Tumor growth is monitored and mice are euthanized when tumor volume reaches 2000 cubic mm. Peripheral blood is collected from mice through a retro-orbital procedure, and flow cytometry and/or ddPCR is used to observe engineered T cell expansion over time. At time of sacrifice, spleen, blood, tumor and/or other tissue is analyzed via flow cytometry, ddPCR, and/or immunohistochemistry for the presence of engineered T cells. The results demonstrate that T cells engineered with cassette integration at one of the defined genomic loci lead to tumor regression and clearance in injected mice as compared to T cells without cassette integration, and that engineered T cells are detectable in the peripheral blood and tissues of injected mice.
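The volume thresholds in the protocol above can be expressed as a small decision helper. This is an illustrative sketch only. The text does not state how volume is derived from caliper measurements; the modified ellipsoid formula V = L × W² / 2 used here is a common convention and is an assumption, as are the function names.

```python
def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Estimate tumor volume from two caliper dimensions.

    Assumption: modified ellipsoid formula V = L * W^2 / 2.
    The source text does not specify the formula used.
    """
    return length_mm * width_mm ** 2 / 2


def study_action(volume_mm3: float, already_dosed: bool) -> str:
    """Map a measured volume to the next protocol step described in the text."""
    if volume_mm3 >= 2000:
        return "euthanize"          # endpoint at 2000 cubic mm
    if not already_dosed and volume_mm3 >= 100:
        return "inject"             # dose T cells, RNP-only T cells, or PBS
    return "monitor"                # keep measuring every 2-4 days


example_volume = tumor_volume_mm3(8.0, 5.0)
example_action = study_action(example_volume, already_dosed=False)
```

Treating "reaches ~100 cubic mm" as a greater-than-or-equal comparison is also a simplification; in practice the dosing decision would follow the study's own enrollment criteria.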

Example 11: Evaluation of non-viral insertion of a large 8.3kb expression cassette in GS94

[00281] Next, an 8.3 kb insert was inserted into a T cell at the GS94 safe harbor locus using materials/methods as previously described. A diagram of the cassette is provided in FIG. 44.

[00282] Construct generation

[00283] To generate plasmid constructs for knock-in, synthetic DNA was ordered from Twist, IDT and GENEWIZ and assembled via Gibson Assembly and Golden Gate Assembly. Plasmids contained homology arms of 1.2 kb or 450 bp in length, homologous to the genomic sequences flanking the CRISPR target sites.

[00284] T cell engineering

[00285] T-cells were enriched from peripheral blood mononuclear cells (PBMCs) obtained from normal donor Leukopaks (STEMCELL Technologies) using Lymphoprep (STEMCELL Technologies) and the EasySep Human T-Cell Isolation Kit (STEMCELL Technologies). T-cells were subsequently activated with CD3/CD28 Dynabeads at a 1:1 bead to cell ratio (ThermoFisher, 40203D) in TexMACS medium (Miltenyi 130-197-196) supplemented with 3% human AB serum (Gemini Bio) and 12.5 ng/ml human IL-7 and IL-15 (Miltenyi premium grade) and cultured at 37°C, 5% CO2 for 48 hours before electroporation.

[00286] CRISPR RNP were prepared by combining 120 µM sgRNA (Synthego) targeting DNA sequence GAGCCATGCTTGGCTTACGA (GS94, SEQ ID NO: 94), 62.5 µM sNLS-SpCas9-sNLS (Aldevron) and P3 buffer (Lonza) at a volume ratio of 5:1:3:6, and incubated for 15 minutes at room temperature. An optimized amount of plasmid DNA, determined by dose titration experiments (ranging from 0.5-3 micrograms), was mixed with 3.5 µl of RNP. T-cells were counted, debeaded, centrifuged at 90 x g for 10 minutes and resuspended at 10^6 cells/14.5 µl of P3 with supplement added (Lonza). 14.5 µl of T-cell suspension was added to the DNA/RNP mixture, transferred to a Lonza 384-well nucleocuvette plate, and pulsed in a Lonza HT Nucleofector System with code EH-115. Cells were allowed to rest for 15 minutes at room temperature before transfer to 96-well plates (Sarstedt) in TexMACS medium supplemented with 12.5 ng/ml human IL-7 and IL-15 (Miltenyi premium grade).

[00287] Transgene expression was detected by staining with anti-Myc antibody (Cell Signaling Technology clone 9B11) and anti-Flag antibody (RnD systems, clone 1042E) and analyzed on an Attune NxT Flow Cytometer. Other antibodies used were live/dead Fixable Near-IR (Thermo Fisher), TCRalpha/beta antibody (BioLegend clone IP26), CD4 antibody (BioLegend clone RPA-T4), and CD8 antibody (BioLegend clone SK1).

[00288] Priming receptor induction

[00289] To assess functional activity of the transgene (i.e., the synthetic circuit), edited T cells were co-cultured with a target cell line expressing the priming antigen at a 1:1 E:T ratio and incubated for 24 hours. T cells were harvested and stained with anti-Myc and anti-Flag antibodies to assess Priming Receptor and CAR expression, respectively.

[00290] To assess whether the 8.3 kb transgene integration at GS94 resulted in functional knock-in, cells were cultured with parental K562 cells and K562 cells expressing the cognate priming antigen at a 1:1 E:T cell ratio. Cells were assayed by flow cytometry after 48 hours as previously described. K562 cells with priming antigen induced CAR expression, while control parental K562 cells did not (FIG. 45). Overall, this indicates that the Priming Receptor induced CAR expression after insertion of an 8.3 kb transgene circuit. Thus, insertion of the 8.3 kb transgene circuit resulted in expression of multiple functional genes.

Example 12: T cell differentiation post editing

[00291] Methods

[00292] T cells from two donors, edited as described above with a priming receptor and CAR synthetic circuit, were phenotypically profiled with cell surface T cell subset markers by flow cytometry. Resting T cells were taken from in vitro culture conditions and rinsed with PBS prior to staining with Zombie-Aqua viability dye, CD4, CD8, CD45RA, CCR7 and CD27, with FMOs used as controls for gating, and analyzed with an Attune NxT. In FlowJo, single, viable lymphocytes were selected by SSC and FSC, and subset profiling by a combination of CCR7, CD27, and CD45RA was used to identify naive or stem cell memory (Tn/Tscm: CD45RA+CCR7+CD27+), central memory (Tcm: CD45RA-CCR7+CD27+), effector memory (Tem: CD45RA-CCR7-CD27-), or terminal effector (Tte: CD45RA+CCR7-CD27-) T cells in the CD4+ and CD8+ subpopulations.
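The marker combinations listed above amount to a lookup from three positive/negative calls to a subset label. The sketch below encodes that mapping, using Tcm and Tem as the conventional abbreviations for central memory and effector memory cells. It is illustrative only: the function names are hypothetical, the per-marker gates are assumed to have already been set (for example against FMO controls), and marker combinations not listed in the text are reported as "unclassified".

```python
from collections import Counter
from typing import Iterable, Tuple

# (CD45RA, CCR7, CD27) positivity -> subset label, as listed in the text.
SUBSET_BY_MARKERS = {
    (True, True, True): "Tn/Tscm",    # naive or stem cell memory
    (False, True, True): "Tcm",       # central memory
    (False, False, False): "Tem",     # effector memory
    (True, False, False): "Tte",      # terminal effector
}


def classify_subset(cd45ra: bool, ccr7: bool, cd27: bool) -> str:
    """Return the subset label for one cell's gated marker calls."""
    return SUBSET_BY_MARKERS.get((cd45ra, ccr7, cd27), "unclassified")


def subset_counts(cells: Iterable[Tuple[bool, bool, bool]]) -> Counter:
    """Tally subset labels over an iterable of (CD45RA, CCR7, CD27) calls."""
    return Counter(classify_subset(*cell) for cell in cells)


# Hypothetical gated events for illustration.
example_cells = [
    (True, True, True),
    (True, True, True),
    (False, True, True),
    (True, False, False),
    (False, False, True),   # combination not listed in the text
]
example_counts = subset_counts(example_cells)
```

The same function would be applied separately to the CD4+ and CD8+ populations, since the text profiles subsets within each.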

[00293] Results

[00294] The non-viral editing generated a less differentiated T cell product (FIG. 46). In both donors, the non-viral editing did not contribute to an expansion of terminally differentiated T cells, as the major subset of T cells in both subpopulations retained positive expression of CD45RA and CCR7. This suggests that the edited T cells retain the capacity to expand, survive and persist in vivo.

References

[00295] Eyquem, J., Mansilla-Soto, J., Giavridis, T., van der Stegen, S. J., Hamieh, M., Cunanan, K. M., ... & Sadelain, M. (2017). Targeting a CAR to the TRAC locus with CRISPR/Cas9 enhances tumour rejection. Nature, 543(7643), 113-117.

[00296] Sadelain, M., Papapetrou, E. P., & Bushman, F. D. (2012). Safe harbours for the integration of new DNA in the human genome. Nature reviews Cancer, 12(1), 51-58.

[00297] Irion, S., Luche, H., Gadue, P., Fehling, H. J., Kennedy, M., & Keller, G. (2007). Identification and targeting of the ROSA26 locus in human embryonic stem cells. Nature biotechnology, 25(12), 1477-1482.

[00298] Pellenz, S., Phelps, M., Tang, W., Hovde, B. T., Sinit, R. B., Fu, W., ... & Monnat Jr, R. J. (2019). New human chromosomal sites with “safe harbor” potential for targeted transgene insertion. Human gene therapy, 30(7), 814-828.

[00299] Roth, T. L., Li, P. J., Nies, J. F., Yu, R., Nguyen, M. L., Lee, Y., ... & Nguyen, D. N. (2019). Rapid discovery of synthetic DNA sequences to rewrite endogenous T cell circuits. bioRxiv, 604561.

TABLE 4: SGRNA SEQUENCES USED FOR EVALUATION OF PREDICTED LOCI

TABLE 5: CONSTRUCTS USED FOR EVALUATION OF PREDICTED LOCI

Claims

1. An engineered cell, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus, the safe harbor locus is at any one or more of an sgRNA target loci provided in Table 4; and wherein expression of the at least one sequence encoding the transgene is operatively linked to an endogenous promoter.

2. An engineered cell, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus, the safe harbor locus is at any one or more of an sgRNA target loci provided in Table 4; and wherein expression of the at least one sequence encoding the transgene is operatively linked to an exogenous promoter.

3. The engineered cell of claim 1 or 2, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:33130000-33140000, chr10:72290000-72300000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, or chr9:7970000-7980000.

4. The engineered cell of any one of claims 1-3, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:72290000-72300000, chr15:92830000-92840000, or chr16:11220000-11230000.

5. The engineered cell of any one of claims 1-4, wherein the sgRNA target locus is chr11:128340000-128350000.

6. The engineered cell of any one of claims 1-4, wherein the sgRNA target locus is chr15:92830000-92840000.

7. The engineered cell of claim 1 or 2, wherein the sgRNA target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTP, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, RP14, SRSF9, SUB1, TET2, TIGIT, TRAC, or TRIM28.

8. The engineered cell of any one of claims 1-3, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS88, GS89, GS90, GS91, GS92, GS93, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, or GS120.

9. The engineered cell of any one of claims 1-4, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS91, GS92, GS93, GS95, GS96, GS100, GS101, GS102, GS103, GS104, and GS105.

10. The engineered cell of claim 8 or 9, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, or GS105.

11. The engineered cell of claim 8 or 9, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, or GS96.

12. The engineered cell of claim 8 or 9, wherein the safe harbor locus is a GS94 integration site in Table 4.

13. The engineered cell of claim 8 or 9, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, or GS102.

14. The engineered cell of claim 8 or 9, wherein the safe harbor locus is a GS102 integration site in Table 4.

15. The engineered cell of claim 8 or 9, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, or GS93.

16. The engineered cell of any one of claims 2-15, wherein the exogenous promoter is an EF1α promoter.

17. The engineered cell of any one of claims 1-16, wherein the engineered cell is a stem cell, a human cell, a primary cell, an hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

18. The engineered cell of claim 17, wherein the cell is a T cell or a T cell progenitor.

19. The engineered cell of any one of claims 1-18, wherein the engineered cell is undifferentiated.

20. The engineered cell of any one of claims 1-18, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

21. The engineered cell of any one of claims 1-20, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

22. The engineered cell of any one of claims 1-21, wherein the transgene encodes a chimeric antigen receptor (CAR).

23. A composition comprising the engineered cell of any one of claims 1-22 and a pharmaceutical excipient.

24. A guide ribonucleic acid (gRNA) for editing a cell at a safe harbor locus, wherein the gRNA comprises any one of the sgRNA sequences in Table 4.

25. The gRNA of claim 24, wherein the gRNA comprises any one of SEQ ID NOS: 1-120.

26. The gRNA of claim 24 or 25, wherein the gRNA comprises any one of SEQ ID NOS: 91-96 and 100-105.

27. The gRNA of any one of claims 24-26, wherein the gRNA comprises SEQ ID NO:94 or SEQ ID NO: 102.

28. The gRNA of any one of claims 24-26, wherein the gRNA comprises SEQ ID NO:94.

29. The gRNA of any one of claims 24-26, wherein the gRNA comprises SEQ ID NO: 102.

30. The gRNA of any one of claims 24-29, wherein the cell is a stem cell, a human cell, a primary cell, an hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

31. A method of editing a cell having chromosomal DNA, comprising inserting at least one sequence encoding a transgene within a safe harbor locus in the chromosomal DNA of the cell, wherein the safe harbor locus is any one or more of the sgRNA target loci provided in Table 4.

32. The method of claim 31, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:33130000-33140000, chr10:72290000-72300000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, or chr9:7970000-7980000.

33. The method of claim 31 or 32, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:72290000-72300000, chr15:92830000-92840000, or chr16:11220000-11230000.

34. The method of any one of claims 31-33, wherein the sgRNA target locus is chr11:128340000-128350000.

35. The method of any one of claims 31-33, wherein the sgRNA target locus is chr15:92830000-92840000.

36. The method of claim 31, wherein the sgRNA target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTP, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, RP14, SRSF9, SUB1, TET2, TIGIT, TRAC, or TRIM28.

37. The method of claim 31 or 32, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS88, GS89, GS90, GS91, GS92, GS93, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, or GS120.

38. The method of any one of claims 31-33, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS91, GS92, GS93, GS95, GS96, GS100, GS101, GS102, GS103, GS104, or GS105.

39. The method of any one of claims 31-33, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, or GS105.

40. The method of any one of claims 31-33, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, or GS96.

41. The method of any one of claims 31-33, wherein the safe harbor locus is the GS94 integration site in Table 4.

42. The method of any one of claims 31-33, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, or GS102.

43. The method of any one of claims 31-33, wherein the safe harbor locus is the GS102 integration site in Table 4.

44. The method of any one of claims 31-33, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, or GS93.

45. The method of any one of claims 31-44, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

46. The method of any one of claims 31-45, wherein the transgene encodes a chimeric antigen receptor (CAR).

47. The method of any one of claims 31-46, wherein the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene.

48. The method of claim 47, wherein the exogenous promoter is an EF1α promoter.

49. The method of any one of claims 31-48, wherein the cell is a stem cell, a human cell, a primary cell, an hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or T cell progenitor.

50. The method of claim 49, wherein the cell is a T cell or a T cell progenitor.

51. The method of any one of claims 31-50, wherein the engineered cell is undifferentiated.

52. The method of any one of claims 31-51, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

53. The method of any one of claims 31-52, wherein the at least one sequence is inserted using a homology-directed repair.

54. The method of any one of claims 31-52, wherein the at least one sequence is inserted using a homology independent targeted insertion.

55. The method of any one of claims 31-54, wherein the at least one sequence is inserted using one or more guide ribonucleic acids (gRNAs) and one or more Cas9 endonucleases.

56. The method of claim 55, wherein the one or more gRNAs comprises any one of SEQ ID NOS: 1-120.

57. The method of claim 55 or 56, wherein the one or more gRNAs comprises any one of SEQ ID NOS: 91-96 and 100-105.

58. The method of any one of claims 55-57, wherein the gRNA comprises SEQ ID NO:94 or SEQ ID NO: 102.

59. The method of any one of claims 55-58, wherein the gRNA comprises SEQ ID NO:94.

60. The method of any one of claims 55-58, wherein the gRNA comprises SEQ ID NO: 102.

61. A method of editing a T cell, comprising contacting a T cell with one or more guide ribonucleic acids (gRNAs), at least one sequence encoding a transgene, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within a safe harbor locus, wherein the safe harbor locus is selected from any one or more of an sgRNA target loci in Table 4.

62. The method of claim 61, wherein the one or more gRNAs comprises a sequence selected from any one of the sgRNA sequences in Table 4.

63. The method of claim 61 or 62, wherein the one or more gRNAs comprises any one of SEQ ID NOS: 1-120.

64. The method of any one of claims 61-63, wherein the one or more gRNAs comprises any one of SEQ ID NOS: 91-96 and 100-105.

65. The method of any one of claims 61-64, wherein the gRNA comprises SEQ ID NO:94 or SEQ ID NO: 102.

66. The method of any one of claims 61-65, wherein the gRNA comprises SEQ ID NO:94.

67. The method of any one of claims 61-65, wherein the gRNA comprises SEQ ID NO: 102.

68. The method of any one of claims 61-67, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:33130000-33140000, chr10:72290000-72300000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, or chr9:7970000-7980000.

69. The method of any one of claims 61-68, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:72290000-72300000, chr15:92830000-92840000, or chr16:11220000-11230000.

70. The method of any one of claims 61-69, wherein the sgRNA target locus is chr11:128340000-128350000.

71. The method of any one of claims 61-69, wherein the sgRNA target locus is chr15:92830000-92840000.

72. The method of any one of claims 61-67, wherein the sgRNA target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTP, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, RP14, SRSF9, SUB1, TET2, TIGIT, TRAC, or TRIM28.

73. The method of any one of claims 61-68, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS88, GS89, GS90, GS91, GS92, GS93, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, or GS120.

74. The method of any one of claims 61-68 and 73, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS91, GS92, GS93, GS95, GS96, GS100, GS101, GS102, GS103, GS104, or GS105.

75. The method of any one of claims 61-68 and 73-74, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS103, GS104, or GS105.

76. The method of any one of claims 61-68 and 73-74, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS94, GS95, and GS96.

77. The method of claim 76, wherein the safe harbor locus is the GS94 integration site in Table 4.

78. The method of any one of claims 61-68 and 73-74, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS100, GS101, or GS102.

79. The method of claim 78, wherein the safe harbor locus is the GS102 integration site in Table 4.

80. The method of any one of claims 61-68 and 73-74, wherein the safe harbor locus is selected from any one of the integration sites in Table 4 designated: GS91, GS92, or GS93.

81. The method of any one of claims 61-80, wherein the engineered cell is undifferentiated.

82. The method of any one of claims 61-81, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

83. An ex vivo method of obtaining an engineered cell or population thereof, comprising: a. obtaining a cell; b. genetically modifying the cell by inserting at least one sequence encoding a transgene within a safe harbor locus, wherein the safe harbor locus is selected from any one of an sgRNA target loci in Table 4.

84. The method of claim 83, wherein obtaining the cell comprises: (i) collecting a tissue sample from a subject, (ii) isolating the cells from the tissue samples, and (iii) culturing the cells in vitro.

85. The method of claim 84, wherein the tissue sample is a blood sample.

86. The method of any one of claims 83-85, wherein the cell is a stem cell, a human cell, a primary cell, an hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell, or T cell progenitor.

87. The method of claim 86, wherein the cell is a T cell or a T cell progenitor.

88. The method of any one of claims 83-87, wherein the engineered cell is undifferentiated.

89. The method of any one of claims 83-88, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

90. The method of any one of claims 83-89, wherein the at least one sequence is inserted using a homology-directed repair.

91. The method of any one of claims 83-89, wherein the at least one sequence is inserted using a homology independent targeted insertion.

92. The method of any one of claims 83-91, wherein the genetically modifying in step (b) comprises contacting the cell with one or more guide ribonucleic acids (gRNAs), the at least one sequence, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within the safe harbor locus.

93. The method of claim 92, wherein the one or more gRNAs comprises a sequence selected from any one of the sgRNA sequences in Table 4.

94. The method of any one of claims 83-93, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

95. The method of any one of claims 83-93, wherein the transgene encodes a chimeric antigen receptor (CAR).

96. The method of any one of claims 83-95, wherein the at least one sequence comprises an exogenous promoter and the exogenous promoter is operably linked to the transgene.

97. The method of claim 96, wherein the exogenous promoter is an EF1α promoter.

98. A method of treating a subject having or at risk of having a disease, comprising administering to the subject an effective amount of the cell of any one of claims 1-22, a population thereof, or the composition of claim 23.

99. The method of claim 98, wherein the cell, the population thereof, or the composition is administered to the subject by infusion.

100. A method of treating a subject having or at risk of having a disease, comprising: a. conducting the method of any one of claims 83-97; and b. administering to the subject an effective amount of a composition comprising the cell or a population thereof.

101. The method of claim 100, wherein the composition is administered to the subject by infusion.

102. The method of claim 100 or 101, wherein the disease is cancer.

103. The method of any one of claims 100-102, wherein the disease is blood cancer.

104. A method of identifying a safe harbor locus, comprising: a. identifying genes or non-coding regions in a chromosome that are above a threshold level for expression across developmental cell states and/or a threshold level for accessibility of chromatin; b. generating a linear model that correlates the gene or non-coding region from step (a) with knock-in (KI) efficiency and estimates the KI efficiency of any gene or coding region on the chromosome; and c. selecting the safe harbor locus based on threshold parameters; wherein the safe harbor locus is selected for insertion of at least one sequence encoding a transgene within a cell.

105. The method of claim 104, wherein the threshold parameters include one or more of: stable expression of a transgene, knockout of the gene confers benefit to the function of the cell, no known function within the cell, stable transgene expression in vitro with or without CD3/CD28 stimulation, negligible off-target cleavage as detected by iGuide-Seq or CRISPR- Seq, less off-target cleavage relative to other loci as detected by iGuide-Seq or CRISPR-Seq, negligible transgene-independent cytotoxicity, negligible transgene-independent cytokine expression, negligible transgene-independent chimeric antigen receptor expression, negligible deregulation or silencing of nearby genes, and positioned outside of a cancer-related gene.

106. The method of claim 105, wherein the stable expression of a transgene at the safe harbor locus is less than or equal to 2-fold expression change over the course of at least 1, 2, 3, 4, 5, 6, or 7 days, and wherein expression change is measured by mean fluorescence intensity of a reporter gene encoded by the at least one sequence.

107. The method of any one of claims 104-106, wherein the accessibility of chromatin is measured using an assay for transposase-accessible chromatin using sequencing (ATAC-seq).

108. The method of any one of claims 104-107, wherein the level of expression across developmental cell states is measured using RNA sequencing (RNA-seq).

109. The method of any one of claims 104-108, wherein the cell is a stem cell, a human cell, a primary cell, an hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or T cell progenitor.

110. The method of any one of claims 104-109, wherein the linear model has a coefficient of determination (R² value) of at least 30%.

111. The engineered cell, composition, gRNA or method of any one of the preceding claims, wherein insertion within the safe harbor locus increases cell cytotoxicity of diseased cells.

112. The engineered cell, composition, gRNA or method of any one of the preceding claims wherein knock-in efficiency at the safe harbor locus is increased relative to other locations along the chromosome.

113. An engineered cell, comprising at least one sequence encoding a transgene, wherein the at least one sequence is inserted within a safe harbor locus, wherein the safe harbor locus is at any one or more of an sgRNA target loci; and wherein expression of the at least one sequence encoding the transgene is operatively linked to an endogenous promoter or an exogenous promoter, and wherein the engineered cell is undifferentiated.

114. The engineered cell of claim 113, wherein the safe harbor locus is selected from any one of the integration sites designated: GS94, GS88, GS89, GS90, GS91, GS92, GS93, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, or GS120.

115. The engineered cell of claim 113 or 114, wherein the safe harbor locus is the GS94 integration site.

116. The engineered cell of claim 113, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:33130000-33140000, chr10:72290000-72300000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, or chr9:7970000-7980000.

117. The engineered cell of claim 113, wherein the sgRNA target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTP, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, RP14, SRSF9, SUB1, TET2, TIGIT, TRAC, or TRIM28.

118. The engineered cell of any one of claims 113-117, wherein the one or more gRNAs comprise any one of SEQ ID NOS: 1-120.

119. The engineered cell of any one of claims 113-118, wherein the engineered cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

120. The engineered cell of claim 119, wherein the cell is a T cell or a T cell progenitor.

121. The engineered cell of any one of claims 113-120, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

122. The engineered cell of any one of claims 113-121, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

123. The engineered cell of any one of claims 113-122, wherein the transgene encodes a chimeric antigen receptor (CAR).

124. A composition comprising the engineered cell of any one of claims 113-123 and a pharmaceutical excipient.

125. A method of editing a cell having chromosomal DNA, comprising inserting at least one sequence encoding a transgene within a safe harbor locus in the chromosomal DNA of the cell, wherein the safe harbor locus is at any one or more sgRNA target loci; and wherein the engineered cell is undifferentiated.

126. The method of claim 125, wherein the engineered cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

127. The method of claim 125 or 126, wherein the cell is a T cell or a T cell progenitor.

128. A method of editing a T cell, comprising contacting a T cell with one or more guide ribonucleic acids (gRNAs), at least one sequence encoding a transgene, and one or more Cas9 endonucleases, wherein the one or more gRNAs and Cas9 endonucleases facilitate the insertion of the at least one sequence into chromosomal DNA within a safe harbor locus.

129. The method of any one of claims 125-128, wherein the safe harbor locus is selected from any one of the integration sites designated: GS88, GS89, GS90, GS91, GS92, GS93, GS94, GS95, GS96, GS97, GS98, GS99, GS100, GS101, GS102, GS103, GS104, GS105, GS106, GS107, GS108, GS109, GS110, GS111, GS112, GS113, GS114, GS115, GS116, GS117, GS118, GS119, or GS120.

130. The method of any one of claims 125-128, wherein the safe harbor locus is the GS94 integration site.

131. The method of any one of claims 125-128, wherein the sgRNA target locus is selected from: chr11:128340000-128350000, chr10:33130000-33140000, chr10:72290000-72300000, chr11:65425000-65427000 (NEAT1), chr15:92830000-92840000, chr16:11220000-11230000, chr2:87460000-87470000, chr3:186510000-186520000, chr3:59450000-59460000, chr8:127980000-128000000, or chr9:7970000-7980000.

132. The method of any one of claims 125-128, wherein the sgRNA target locus is a gene selected from: APRT, B2M, CAPNS1, CBLB, CD2, CD3E, CD3G, CD5, EDF1, FTP, PTEN, PTPN2, PTPN6, PTPRC, PTPRCAP, RPS23, RTRAF, SERF2, SLC38A1, SMAD2, SOCS1, SRP14, SRSF9, SUB1, TET2, TIGIT, TRAC, or TRIM28.

133. The method of any one of claims 125-132, wherein the one or more gRNAs comprise any one of SEQ ID NOS: 1-120.

134. The method of any one of claims 125-133, wherein the engineered cell is CD45RA⁺ and CCR7⁺ after insertion of the at least one sequence into the safe harbor locus.

135. The method of any one of claims 125-134, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

136. The method of any one of claims 125-135, wherein the transgene encodes a chimeric antigen receptor (CAR).

137. An ex vivo method of obtaining an undifferentiated engineered cell or population thereof, comprising: a. obtaining a cell; and b. genetically modifying the cell by inserting at least one sequence encoding a transgene within a safe harbor locus, wherein the engineered cell is undifferentiated.

138. The method of claim 137, wherein obtaining the cell comprises: (i) collecting a tissue sample from a subject, (ii) isolating cells from the tissue sample, and (iii) culturing the cells in vitro.

139. The method of claim 138, wherein the tissue sample is a blood sample.

140. The method of any one of claims 137-139, wherein the cell is a stem cell, a human cell, a primary cell, a hematopoietic cell, an adaptive immune cell, an innate immune cell, a T cell or a T cell progenitor.

141. The method of claim 140, wherein the cell is a T cell or a T cell progenitor.

142. The method of any one of claims 137-141, wherein the engineered cell is CD45RA⁺ and CCR7⁺.

143. The method of any one of claims 137-142, wherein the transgene encodes a recombinant protein, optionally a therapeutic agent.

144. The method of any one of claims 137-143, wherein the transgene encodes a chimeric antigen receptor (CAR).

145. A method of treating a subject having or at risk of having a disease, comprising administering to the subject an effective amount of the cell of any one of claims 113-123, a population thereof, or the composition of claim 124.

146. A method of treating a subject having or at risk of having a disease, comprising: a. conducting the method of any one of claims 125-144; and b. administering to the subject an effective amount of a composition comprising the cell or a population thereof.

147. The method of claim 146, wherein the composition is administered to the subject by infusion.

148. The method of claim 146 or 147, wherein the disease is cancer.
